A Business Technology Place

Root Cause Analysis Playbook

One of the staples of our Lean journey is a monthly root cause analysis (RCA) effort. The results of the team standard have surpassed my expectations, and I anticipate more positive results as we mature our approach. Our playbook is simple, but it requires disciplined execution and adherence to the standard to produce long-term benefits.

=============

Prerequisite Activities

  1. Train team members on the fundamentals and business reason to use RCA.
  2. Create team standards for documentation and frequency of RCA events.
  3. Establish a place on the visual management board to post active, completed, and future RCA documents.

Execution

  1. At the frequency designated by the team standard, choose the process, procedure, or result that will be the subject of the RCA.
  2. Decide who will be the point person managing the current RCA effort.
  3. Analyze and document:
    1. Define the problem.
    2. Determine why the problem happened.
    3. Determine a solution to prevent the problem from happening again.
  4. Post the results to the management board.

Organizational Adhesive

  1. Review progress of active RCAs and results of completed RCAs during weekly team meetings.
  2. Use managers as both participants and assignment owners.
  3. Audit adherence to department standards and post results on team audit board.
  4. Put placeholders on the management board for RCAs that will happen in the future.

=============

A monthly cadence works well for our environment. It is frequent enough to keep problem solving active, but not so frequent as to disrupt operational activities. We have found that RCAs which require more than a month of work to resolve should be reclassified as projects so we can keep the monthly cadence of RCA events.
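To make the one-month rule concrete, here is a minimal Python sketch of how a team might flag open RCAs for reclassification as projects. Everything in it is an assumption for illustration only: the `Rca` record, the `should_become_project` helper, the 30-day stand-in for "a month", and the sample board entries are all hypothetical.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional

# Assumption: "more than a month" is approximated as 30 days.
MAX_RCA_AGE = timedelta(days=30)


@dataclass
class Rca:
    """Hypothetical record for one RCA posted on the management board."""
    title: str
    owner: str                      # the point person for this RCA
    opened: date
    closed: Optional[date] = None   # None while the RCA is still active


def should_become_project(rca: Rca, today: date) -> bool:
    """Return True when an RCA is still open and older than the cadence window."""
    return rca.closed is None and (today - rca.opened) > MAX_RCA_AGE


# Made-up sample entries, not real RCAs.
board = [
    Rca("Database failure", "point person A", date(2024, 1, 8)),
    Rca("Late batch report", "point person B", date(2024, 2, 12)),
]

today = date(2024, 2, 20)
overdue = [rca.title for rca in board if should_become_project(rca, today)]
print(overdue)
```

A check like this could run before the weekly team meeting, so anything it flags is discussed there and moved off the RCA section of the board.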

The best part is living with the results and preventing problems from repeating. So far, none of the problems we've solved through an RCA has recurred. I guess that's the whole point.

Onward and Upward!

Root cause analysis for team building

Early in my career we used a process that loosely resembled a root cause analysis after a severity 1 production outage. The intent of the process was to determine why the outage occurred and then fix the problem so it didn't happen again. No one liked the process, and the documents we produced were rarely used to influence process improvement. It was a checkbox, a fill-in-the-blanks exercise to say we had completed it. I always thought the name post-mortem was a bit odd as well, and we were certainly dead to the process. Looking back, I see that post-mortem efforts can be valuable if championed and executed correctly. But there is a better way.

Twenty years later, we are learning to build root cause analysis (RCA) into our recurring operational procedures. Like a post-mortem exercise, an RCA is typically done after an event has occurred, with the aim of preventing problems from recurring. If done correctly, this can reduce waste and downtime.

But an RCA is distinct, with its own set of advantages. Our team is using lean A3 problem solving techniques as the backbone for RCAs. It is apparent to me that the RCA process, if supported and executed routinely, can shape a culture of continuous improvement. Here are a few practical ways:

  • The outputs can be used as a proactive measure to predict and prevent future failures. Problem solving examines why events occur and pairs the findings with action items and sustainment activities. This is a great way to identify potential future problems.

In one recent 5-why exercise about a database failure, we identified a few weaknesses in a process in addition to the root cause of the failure. Our corrective action plan addressed multiple weaknesses and has very likely kept some of them from becoming service outages.

  • A systematic approach to RCA involves setting a recurring cadence for problem solving. RCAs require a wide range of knowledge to identify problems, compile documentation, and create sustainment activities. Individuals will struggle, but teams can thrive solving problems like this.

We post our RCAs on our department flow-and-performance board to make them visible, promote discussion, and keep the process top of mind. Our standard is to perform one RCA per month. This reinforces that RCAs are part of the culture of the team.

  • Done correctly, RCA focuses on resolving process deficiencies instead of blaming people. It's not always easy, but we remind ourselves to focus on behaviors and results over individuals.

Onward and Upward!

Photo Credit: ResoluteSupportMedia via creative commons – https://flic.kr/p/88Kdgw