What happens when you fall off the bike? Well, if you were anti-social like me, you got off the bike and quit learning since you had no place to be anyway. But apparently most normal human beings keep trying, so you will relate to Ryan Ogilvie’s post better than I did. He says when it comes to incident management, most organizations will experience issues, learn from them, and do a better job the next time.
A Cycle of Improvement
Ogilvie provides a basic series of steps for conducting a proper post-incident review. It starts with reviewing the initial escalation. The main question to ask here is, “Was the right amount of information delivered in order to escalate to support resources?” If the answer is yes, then you can get on to the more important business of seeking root causes and preventing the incident from recurring. Ogilvie then discusses how to review internal escalations:
Did all the right people get the right information to be able to restore service as quickly as possible? I have been in situations where an issue has occurred and after some looking around it turned out if we had only looped in “support group x” they would have helped cut the resolution time in half. Communication during these issues is crucial so reviewing the mode of communication in the post incident review is equally paramount.
Communication does not just extend to getting the right resources on board to fix the problem; it also pertains to informing clients that problems are being worked on and giving an estimated timeline for when service will be restored. IT looks more competent and enjoys a stronger relationship with the business when everyone is kept up to date.
The last element of the post-incident review is, of course, taking steps to learn from what went wrong. Some especially complex cases may require a lengthier investigation to find root causes, while others will have a more straightforward cause that can be addressed directly. Ultimately, you should determine whether the problem was caused by a change that IT made; if it was, have the change management team present to document the situation.
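If you like to see a process written down as a checklist, here is a minimal sketch of the review questions above in Python. To be clear, this is my own illustration and not something from Ogilvie's post: the class name `PostIncidentReview`, its fields, and the wording of the follow-up actions are all hypothetical, and a real review would likely track far more detail.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PostIncidentReview:
    """Hypothetical record of the review questions summarized in this post."""

    escalation_had_enough_info: bool  # initial escalation review
    right_people_looped_in: bool      # internal escalation review
    clients_kept_informed: bool       # communication with the business
    root_cause_found: bool            # lessons learned
    caused_by_it_change: bool         # was an IT change behind the incident?

    def follow_ups(self) -> List[str]:
        """Return illustrative follow-up actions based on the answers."""
        actions = []
        if not self.escalation_had_enough_info:
            actions.append("Improve the information captured at initial escalation")
        if not self.right_people_looped_in:
            actions.append("Revisit which support groups get looped in during internal escalation")
        if not self.clients_kept_informed:
            actions.append("Send clients status updates with an estimated restoration time")
        if not self.root_cause_found:
            actions.append("Schedule a longer root cause investigation")
        if self.caused_by_it_change:
            actions.append("Invite the change management team to document the situation")
        return actions


# Example answers (made up for illustration only).
review = PostIncidentReview(
    escalation_had_enough_info=True,
    right_people_looped_in=False,
    clients_kept_informed=True,
    root_cause_found=True,
    caused_by_it_change=True,
)

for action in review.follow_ups():
    print("-", action)
```

Each field answers one of the review questions, and each follow-up is raised only when its question exposes a gap, so a review with no gaps and no IT change produces an empty list.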
You can read the original post here: http://servicemanagementjourney.blogspot.com/2015/02/post-incident-reviews-dust-yourself-off.html