
Alarm Management: Advancing From Failure Cause to Root Cause Analysis

Throughout the history of the pipeline industry, gas transmission operators have focused on the source of failure in pipeline incidents. Failure categories include corrosion, operator error, material failure, external interference, and external forces.

While post-event analysis is important, it is only one aspect of safe operations. Thanks to a renewed focus on culture through Pipeline SMS and the advancement of technology, operators should be shifting their focus to root cause analysis to understand why incidents occur.

This proactive, pre-event analysis helps operators prevent future incidents and moves the industry toward a zero-incident goal. Industry consultant Phil Hopkins put it well in his keynote address at the Pipeline Technology Conference 2019: root cause analysis “allows organizations to learn from past failures and avoid similar incidents in the future.”

The key for operators is understanding the importance of Alarm Management to support the advancement from failure cause analysis to root cause analysis.

How Alarm Management Links to Root Cause Analysis

If failure cause analysis is a building block for root cause analysis, then alarm rationalization is a building block for Alarm Management.

Alarm rationalization is the process of documenting, for each alarm presented to a pipeline controller in the control room, how to verify the alarm, diagnose it, determine its causes, and take the appropriate course of action in response.

The PHMSA Control Room Management Rule (49 CFR Parts 192 and 195) and API 1167 provide guidance on how to manage the alarm rationalization process and assist with controller response through alarm response sheets.

Along the same track, alarm response is a specified course of action for each controller to take when presented with an alarm through the SCADA system on their HMI display. Ideally, the controller will follow the rationalized procedure.

There is a reactive element to alarm rationalization. The proactive piece is Alarm Management, where operators go through the process of ensuring that the alarms are optimally selected, designed, prioritized, and documented in the system.

The result of an effective alarm management program is that each alarm is meaningful.

  • Each alarm indicates an Abnormal Operating Condition (AOC)
  • Each alarm specifies the priority of the alarm
  • Each alarm documents possible causes and related confirmations
  • Each alarm has a documented process for response
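
The four elements above can be sketched as a simple record with a completeness check. This is only an illustrative sketch, not the schema of any particular SCADA or alarm management product: the class name, field names, priority levels, and the example tag are all assumptions made for the illustration.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, List, Optional


class Priority(Enum):
    # Hypothetical levels; use the ones defined in your own alarm philosophy.
    LOW = 1
    MEDIUM = 2
    HIGH = 3


@dataclass
class RationalizedAlarm:
    tag: str                                   # SCADA point name (hypothetical)
    abnormal_condition: str                    # the AOC this alarm indicates
    priority: Optional[Priority] = None
    # Maps each possible cause to the confirmation a controller can check.
    causes: Dict[str, str] = field(default_factory=dict)
    response_steps: List[str] = field(default_factory=list)

    def missing_elements(self) -> List[str]:
        """Return the elements still undocumented for this alarm."""
        missing = []
        if not self.abnormal_condition.strip():
            missing.append("abnormal operating condition")
        if self.priority is None:
            missing.append("priority")
        if not self.causes:
            missing.append("possible causes and confirmations")
        if not self.response_steps:
            missing.append("documented response")
        return missing


# Example: an alarm that still needs a documented response.
alarm = RationalizedAlarm(
    tag="STN12_DISCH_PRESS_HI",
    abnormal_condition="Discharge pressure above the high limit",
    priority=Priority.HIGH,
    causes={"Downstream valve closed": "Check valve status on the HMI"},
)
print(alarm.missing_elements())  # ['documented response']
```

An alarm whose check returns an empty list has all four elements documented; anything else points to a gap to close during rationalization.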

While operators cannot plan for the unforeseen, they can plan for how the control room should respond to each event. This is the shift from reacting to an event to planning the response ahead of time.

The Importance of Lessons Learned in Root Cause Analysis

Another critical step to support Alarm Management is constantly feeding information back into your program.

Lessons learned from other operators, from your personnel, and from the data collected in your SCADA system increase the effectiveness of your program. You should be continually sharpening and shaping the program by incorporating this shared information.

Not only will your operation be prepared and ready to anticipate the latest challenges, but your team will grow in confidence to perform critical tasks in their roles.

What steps can you take to gather the necessary information to feed back into your program?

  • Ask controllers and support personnel what they’re seeing.
  • Review the data gathered in your SCADA system to identify patterns or gaps.
  • Use the answers and data to dig deeper and perform root cause analysis.

Then, take follow-up steps to close any gaps, whether providing training for personnel or re-authenticating alarms in your system.
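
As one way to picture the SCADA data review described above, the sketch below counts how often each alarm appears in an exported log (a pattern) and lists alarms that have no response sheet on file (a gap). The log format, tag names, timestamps, and function names are hypothetical and chosen only for illustration; a real export from your SCADA system will look different.

```python
from collections import Counter

# Hypothetical alarm log: (timestamp, alarm tag) pairs exported from SCADA.
alarm_log = [
    ("2019-06-01T00:05", "STN07_COMM_FAIL"),
    ("2019-06-01T00:17", "STN12_DISCH_PRESS_HI"),
    ("2019-06-01T01:02", "STN07_COMM_FAIL"),
    ("2019-06-01T02:40", "STN03_TANK_LEVEL_LO"),
    ("2019-06-01T03:11", "STN07_COMM_FAIL"),
    ("2019-06-01T04:55", "STN12_DISCH_PRESS_HI"),
]

# Hypothetical set of tags that already have an alarm response sheet.
documented_tags = {"STN12_DISCH_PRESS_HI", "STN03_TANK_LEVEL_LO"}


def frequent_alarms(log, min_count):
    """Pattern: tags that alarmed at least min_count times, most frequent first."""
    counts = Counter(tag for _, tag in log)
    return [(tag, n) for tag, n in counts.most_common() if n >= min_count]


def undocumented_alarms(log, documented):
    """Gap: tags seen in the log that have no response sheet on file."""
    return sorted({tag for _, tag in log} - set(documented))


print(frequent_alarms(alarm_log, min_count=2))
# [('STN07_COMM_FAIL', 3), ('STN12_DISCH_PRESS_HI', 2)]
print(undocumented_alarms(alarm_log, documented_tags))
# ['STN07_COMM_FAIL']
```

In this made-up data, the most frequent alarm is also the one with no documented response, which is the kind of finding that would warrant a deeper root cause analysis and a follow-up step.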

Consider Support to Advance to Root Cause Analysis

Root cause analysis to support alarm management ties back to the overall theme of pipeline integrity management and pipeline safety management systems (Pipeline SMS). This includes the Plan-Do-Check-Act cycle to support the proactive approach of continual improvement.

As pointed out by Phil Hopkins during PTC 2019, the key to successful implementation includes several elements: buy-in at the executive level, ownership at all levels of the operation, effective implementation, enhanced controls, communication, monitoring, and continuous review.

Each operator should strive to optimize each of these areas, but we understand that not every operator is in the same position on their journey to complete optimization.

Your operation may have top-level buy-in while you are still working on the culture through changed behaviors and attitudes, or vice versa.

EnerSys understands the importance of software and support to help each operator on their journey.

Software: Our POEMS CRM Suite includes the ALMgr module, which gives operators the ability to implement Alarm Management effectively. The module provides the tools and documentation to manage the alarm rationalization process, assist controller response with digital alarm response sheets, and produce analytical reports.

Support: Included in our service piece are Compliance and Consulting capabilities. We collaborate with your team to determine how to implement best practices unique to your operation. We also help operators establish procedures and understand how to utilize tools to support safe operations.

To find out more about our software and support capabilities, consider scheduling a consultation with our team. We appreciate the opportunity to help your operation advance from failure cause analysis to root cause analysis to support Alarm Management.