Because we expect that “things just work,” incident management is highly visible and can take center stage; as a result, it is often described as a “high-value process.” The challenge lies in viewing value this way. When we take a more objective look, we see that what we really want is to avoid incidents altogether rather than celebrate how good we are at resolving them.
In its simplest description, an incident is a breakdown in something that should be working as designed. This characterization alone should tell us that it is the opposite of adding value. The trouble is that culturally we “love the hero,” and incident managers can be seen as those who restore service when we need it most.
Because service needs to be restored as quickly as possible, many of the support people outside the actual incident become very hands-off, in an effort not to slow things down with too many people trying to help. This led me to the statement:
“Just because you are not an incident manager doesn’t mean that you can’t help improve the process.”
Think about that for a moment: everyone has some part to play in reducing the impact an incident has on our business. Here’s just a sample:
Service Desk Analyst
Depending on how our organization is set up, the service desk analysts may not be managing the incidents themselves. However, they are the face of IT to the business, so what they do during an incident is important. While they are likely capturing the escalations, this is also a good time to capture some knowledge about the service that is impacted. We may know that a capability is unavailable, but does IT truly understand the business impact? Gathering further information from the business allows us as IT to better understand that impact and improve communications. After the incident, details such as these are important in a post-mortem so that, if we need to adjust our responses, we can do so based on the impact the business is reporting.
IT Operations Manager
Infrastructure monitoring is done “by operations for operations” in many organizations, because for one reason or another it has never been tied into service management. It would make sense to correlate these alerts into real-time incidents, so why isn’t this being done? Doing so would allow us to identify issues before the business sees the impact, yet in reality the alerting mechanism is often set up as an afterthought to the incident process. If the ops team streamlined this, they could weed out the garbage alerts they currently get and, in the process, better track what their infrastructure is doing.
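To make the idea of correlating alerts into incidents a little more concrete, here is a minimal sketch. Everything in it is an assumption for illustration: the `Alert` and `Incident` shapes, the ten-minute window, and the `ignore` patterns are hypothetical and not taken from any particular monitoring or service management tool.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta


@dataclass
class Alert:
    service: str   # hypothetical: the business service the alerting component supports
    message: str
    at: datetime


@dataclass
class Incident:
    service: str
    opened_at: datetime
    last_alert_at: datetime
    alerts: list = field(default_factory=list)


def correlate(alerts, window=timedelta(minutes=10), ignore=()):
    """Group alerts for the same service that arrive within `window` of each other.

    Alerts whose message contains any string in `ignore` are treated as
    known noise and dropped before grouping.
    """
    latest = {}      # service -> most recent incident for that service
    incidents = []   # all incidents, in the order they were opened
    for alert in sorted(alerts, key=lambda a: a.at):
        if any(pattern in alert.message for pattern in ignore):
            continue  # weed out the garbage alerts
        current = latest.get(alert.service)
        if current is None or alert.at - current.last_alert_at > window:
            # No recent activity for this service: open a new incident.
            current = Incident(alert.service, alert.at, alert.at)
            latest[alert.service] = current
            incidents.append(current)
        current.alerts.append(alert)
        current.last_alert_at = alert.at
    return incidents
```

The point of the sketch is the design choice rather than the code itself: alerts are keyed by the service the business cares about, not by the server that raised them, and the noise filter is explicit so the ops team can see and maintain what they have chosen to ignore.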
IT Application Manager
We have all been involved in an incident that was escalated to networks because we all know that “this must be a networks issue.” One of the many challenges with application-level incidents is that the symptoms could point to many things. From an application management perspective, having a solid knowledge repository of issues allows the incident manager or even the service desk to ask better questions when an issue occurs. Rather than simply reporting that users are not able to see module x in the application, they would be able to look up previous issues and see that, when an issue with module x arises, we need to check the following three items (for instance) to better determine a cause. Remember that knowledge is power. When we review the incident at the post-mortem, ensure someone from the IT applications team is invited, even if this wasn’t an application issue. They will get a sense of the issue, and they may have better insight into the service we provide as a whole and any potential areas of weakness. Getting input from various angles is important if we want to improve.
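As a rough illustration of the “look up module x and get the items to check” idea, the sketch below keeps a small lookup of diagnostic checks learned from past incidents. The `KnowledgeBase` class and the example checks are hypothetical; a real knowledge repository would live in whatever service management tool the organization already uses.

```python
from collections import defaultdict


class KnowledgeBase:
    """Hypothetical store of diagnostic checks, keyed by application component."""

    def __init__(self):
        self._checks = defaultdict(list)

    def record(self, component, check):
        """Store a check learned from a past incident, skipping duplicates."""
        if check not in self._checks[component]:
            self._checks[component].append(check)

    def checks_for(self, component):
        """Return the known checks for a component, oldest first."""
        return list(self._checks.get(component, []))


kb = KnowledgeBase()
# Illustrative entries only; the real checks come out of post-mortems.
kb.record("module x", "Is the module's background service running?")
kb.record("module x", "Did a change go in for this module in the last 24 hours?")
kb.record("module x", "Can a test account open the module from another site?")

for question in kb.checks_for("module x"):
    print(question)
```

With something like this in place, the service desk can ask the three questions for module x while the escalation is still being logged, instead of defaulting to “this must be a networks issue.”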
Everyone plays a part in incident management, big or small, from dealing with escalations to event management and improving communications. Start to think about what you can do to improve not only your incident management process, but your overall delivery of services.
For more brilliant insights, check out Ryan’s blog: Service Management Journey