FRACAS – Unleashing the Power of the EAM as a Reliability
Improvement Tool
Bill Keeter,
Allied Reliability
Note: Originally presented at
EAM-2006 The Enterprise Asset Management Summit in Las Vegas
Introduction
A strong Failure Reporting, Analysis, and Corrective Action
System (FRACAS) is the backbone of a good asset performance
improvement effort. The FRACAS provides the business elements
required to close the loop on Root Cause Failure Analysis (RCFA)
and Reliability Centered Maintenance (RCM) efforts. The FRACAS
changes RCFA from what are often one-shot exercises to a managed
program for systematically improving equipment and process
performance. This chapter describes the basics of implementing
the FRACAS and how to use it to ensure implementation of RCFA
recommendations.

1.
Managing the FRACAS
The FRACAS is an important system that requires management
attention just like any other. Purposeful management for
success requires that the FRACAS be driven from the top down
through management policies and procedures to ensure quality of
effort and meaningful results.
1.1.
Policies
The beginning step in the development of the FRACAS is the
establishment of management policies for equipment and process
reliability improvement that include requirements for reporting,
analyzing, and correcting system failures. The policy statement
should include a statement of purpose for the FRACAS, a
statement of personnel responsibilities at all levels, and a
description of the basic elements required in the FRACAS.
1.2.
Procedures
The FRACAS by its nature is a procedure-driven system. It
requires procedures for reporting, analysis, and correction of
system failures. FRACAS procedures will guide how failures are
reported, where information is stored, which specific analysis
methods will be used, when they will be used, and who will use
them.
2.
Basic Elements of the FRACAS
2.1.
Failure Reporting
Failures must be reported in ways that lend themselves to
analysis with Reliability Engineering tools such as Weibull
Analysis, RCM, and Availability Simulation. The best reporting
schemes use individual failure modes as the basis for failure
reporting. Reporting schemes need to follow the hierarchical
structure of the equipment within the process.
2.1.1.
Failure Modes
Failure modes describe the individual failed components of the
maintainable item, including a descriptor for what happened to
the component. Failure modes are the things that occur and
cause the system to lose its ability to produce its desired
outputs.


2.1.2.
Developing Failure Modes
Failure modes are best developed using an orderly system that
includes a functional analysis of the equipment used in the
process. Equipment is generally broken down into a hierarchy
that shows graphically how the facility is put together to
achieve its business output.
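As a minimal sketch, the hierarchy can be modeled as a simple
tree whose leaves are the maintainable items. The class name,
the level names, and the example plant below are hypothetical
and chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    """One level of the equipment hierarchy (names are illustrative)."""
    name: str
    # Excluded from repr/compare to avoid parent <-> child recursion.
    parent: Optional["Node"] = field(default=None, repr=False, compare=False)
    children: List["Node"] = field(default_factory=list)

    def add(self, name: str) -> "Node":
        """Attach a child node and return it."""
        child = Node(name, parent=self)
        self.children.append(child)
        return child

    def path(self) -> str:
        """Location from the top of the hierarchy down to this node."""
        parts = []
        node: Optional["Node"] = self
        while node is not None:
            parts.append(node.name)
            node = node.parent
        return " / ".join(reversed(parts))


def maintainable_items(node: Node) -> List[Node]:
    """Treat the leaves of the tree as the maintainable items."""
    if not node.children:
        return [node]
    items: List[Node] = []
    for child in node.children:
        items.extend(maintainable_items(child))
    return items


# Hypothetical example facility.
plant = Node("Plant A")
cooling = plant.add("Cooling Water System")
pump = cooling.add("Pump P-101")
fan = cooling.add("Cooling Tower Fan F-201")
for item in maintainable_items(plant):
    print(item.path())
```

Storing the full path with each failure report lets failures be
rolled up to any level of the hierarchy during analysis.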

2.1.3.
Failure Modes and Effects Analysis (FMEA)
Failure Modes and Effects Analysis (FMEA) is perhaps the best
way of developing failure modes for inclusion in the FRACAS
reporting system. It is an extremely systematic way of looking
at the functions of maintainable items to determine the most
likely causes of their loss of function. The causes of
functional failure are the equipment's failure modes.
A thorough FMEA that considers all the failure modes present
produces the most exact results, but may be too time-consuming
to be of practical use in the everyday work environment. A
useful group of failure modes can be generated by developing a
list of the most likely failure modes using a functional
breakdown of equipment. Development of the FMEA is best done by
a group of people who work with the equipment day-in and
day-out. What is important is to understand the functions of
the equipment and what things break or fail that cause the
equipment to lose its function.
2.1.3.1.
Maintainable Items
Maintainable items represent the lowest level of the facility
hierarchy that can be further broken down into components.
Maintainable items have specific, well definable functions that
enable the system to produce its desired output. It is the loss
of the function of these items that leads to lost production,
lost quality, safety issues, environmental issues, and
operational issues.
The maintainable item level is where we set maintenance tactics
and strategies to keep system performance at desired levels.
2.1.3.2.
Functions
Functions define the reason for the existence of the
maintainable items. Most maintainable items have one or more
primary functions and one or more secondary functions.
Functions describe what the maintainable item does, not what it
is. Functional statements need to be written in a way that
makes it easy to identify what the functional failure is. The
best functional statements use everyday language that we all can
understand. Local jargon is acceptable as long as everyone who
uses the FMEA will understand what the jargon represents.

2.1.3.3.
Functional Failures
The functional failure statement describes the loss of required
or desired function of the maintainable item. Functional failure
statements usually contain an adjective and the functional noun.
They rarely, if ever, contain a part name.

2.1.3.4.
Failure Modes
Once the functional failures are defined we can apply the
failure modes much as shown in tables one and two. The
important thing to remember is that the failure mode is a
combination of a component name as well as a descriptive word to
tell what happened to the component.
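The chain from maintainable item to failure mode can be sketched
as a small data structure. This is only an illustration under
assumptions: the class names, the code format, and the example
pump values are hypothetical and do not come from the tables
referenced above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FailureMode:
    """Two-part failure mode: a component plus what happened to it."""
    component: str   # the part that failed, e.g. "Impeller"
    descriptor: str  # what happened to the part, e.g. "Worn"

    def code(self) -> str:
        """Two-part code suitable for a pick list (format is assumed)."""
        return f"{self.component} - {self.descriptor}"


@dataclass(frozen=True)
class FmeaRow:
    """One line of a simplified FMEA worksheet."""
    maintainable_item: str
    function: str             # what the item does, not what it is
    functional_failure: str   # adjective + functional noun, no part name
    failure_mode: FailureMode


# Hypothetical example row.
row = FmeaRow(
    maintainable_item="Pump P-101",
    function="Deliver cooling water to the heat exchanger",
    functional_failure="Insufficient flow",
    failure_mode=FailureMode("Impeller", "Worn"),
)
print(row.failure_mode.code())
```

Keeping the component and the descriptor as separate fields
makes it possible to count failures by component, by descriptor,
or by the combination.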
2.2.
Responsibilities
Every member of the organization has roles and responsibilities
related to the reporting of failures. It is important for each
person in the organization to understand his/her roles and
responsibilities.
2.2.1.
Facility Manager
The facility manager is responsible for establishing policies
that require the development of the FRACAS. The facility
manager provides the top-down impetus for ensuring that
everyone in the organization is focused on reporting, analyzing,
and correcting failures.
2.2.2.
Program Champion
The FRACAS program champion is responsible for developing the
written procedures needed to implement the program. The Champion
provides upward and downward communication of program policies,
goals, and results. The Champion has direct responsibility for
ensuring that required training takes place, and that each
individual in the organization understands what his/her roles
and goals are within the FRACAS program.
2.2.3.
Operations and Maintenance Managers
Successful development and use of the FRACAS depends on close
cooperation between the operations and maintenance managers
within the organization. Breakdowns in communication at this
level often lead to significant reductions in the benefits that
can be achieved with a well-implemented FRACAS. The tone of
communication between these two managers usually sets the tone
of communication between their subordinates.
2.2.4.
Operations Supervisors
Operations supervisors play an extremely important role in
developing and sustaining FRACAS efforts. Operations
supervisors are responsible for ensuring that the goals of the
FRACAS are made known to their direct reports, and for ensuring
that initial failure reports are of high quality. Poor initial
reporting will lead to poor final reports, and can make the data
gathered useless for predicting and preventing future failures.
2.2.5.
Maintenance Supervisors
Maintenance supervisors also play an important role in
developing and sustaining FRACAS efforts. They are responsible
for ensuring that their maintenance personnel take the time
needed to record information about failed components correctly
and in line with the failure modes defined within
the FRACAS reporting system. Again, poor quality of information
here will often lead to poor final reports and information that
is not very useful for predicting and preventing future
failures. Good failure reporting requires good communication
between the operations and maintenance supervisors.
2.2.6.
Operators
Operators provide initial failure reports for the FRACAS. They
need to understand the importance of giving meaningful and
accurate reports about the functional failures they observe.
Operators need to have a thorough understanding of the
maintainable items that are present in the system. It is not
reasonable to expect that operators will know or be able to
determine what is causing the functional failure. It is
reasonable to expect that they will be able to describe the
functional failure in enough detail to aid maintainers in the
troubleshooting process, and to provide useful information to
the FRACAS analyst.
2.2.7.
Maintainers
Maintainers are in a position to have the greatest impact on the
outcome of FRACAS efforts. They are usually in the best
position to determine which components failed, and what happened
to them. They may be in a position to determine what caused the
failure mode to occur, but it is not reasonable to expect that
they will be able to determine the cause of every failure mode.
The maintainer has very specific responsibilities that require
enumeration.
2.2.7.1.
Preserving Evidence
The maintainer will usually be the first one on the scene to
have direct contact with the failed components. It is the
maintainer's responsibility to document and record the condition
of the components as found. The maintainer needs to be taught
preservation techniques, and how to record conditions around the
component using words and pictures. In no case should the
maintainer attempt to clean or alter the condition of the failed
components. The maintainer should protect the evidence by
covering it loosely with some protection like plastic bags to
prevent contamination from outside sources.
2.2.7.2.
Recording Conditions
The maintainer should record conditions around the failed
component. The best way is to take digital photos and write
concise notes about what is found.
2.2.7.3.
Identifying Potential Causes or Causal Factors
The maintainer may be able to determine what caused the
component to fail, as well as some causal factors that may have
led up to the failure. It is important to allow the maintainer
to say “I don’t know” at this point. Frequently the maintainer
will not be able to tell what caused the component to fail
during an initial analysis of the scene. In this case saying “I
don’t know” is better than an unfounded guess as to cause.
Determining cause may require further examination by engineering
specialists such as metallurgists and people experienced in
determining causes for the failed components in question.
2.2.8.
Failure Analyst
The failure analyst is responsible for screening initial failure
reports to determine if the reports are complete, and whether or
not further analysis is required. The analyst may order a Root
Cause Failure Analysis (RCFA) depending on whether or not the
consequences of the failure warrant it. The analyst's
determination to order the RCFA should be driven by policy and
guidelines written into the FRACAS. The analyst is also
responsible for ensuring that failure data is analyzed using
available analysis tools on a regular basis to determine whether
there need to be updates to the Preventive and Predictive
Maintenance Program, RCFAs for recurrent failure modes, or
RCFAs for failure modes exhibiting infant failures.
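The screening step can be sketched as two small checks: one for
completeness and one for whether the consequences warrant an
RCFA. The field names and the threshold values below are
hypothetical; in practice they would come from the policy and
guidelines written into the FRACAS.

```python
# Fields an initial report must contain (assumed names).
REQUIRED_FIELDS = ("maintainable_item", "functional_failure", "failure_date")

# Hypothetical policy thresholds; real values belong in the FRACAS procedures.
RCFA_COST_THRESHOLD = 10_000.0   # currency units
RCFA_DOWNTIME_THRESHOLD = 8.0    # hours


def missing_fields(report: dict) -> list:
    """Return the required fields that are absent or empty."""
    return [name for name in REQUIRED_FIELDS if not report.get(name)]


def needs_rcfa(report: dict) -> bool:
    """Apply the consequence thresholds to decide whether to order an RCFA."""
    return bool(
        report.get("safety_incident", False)
        or report.get("cost", 0.0) >= RCFA_COST_THRESHOLD
        or report.get("downtime_hours", 0.0) >= RCFA_DOWNTIME_THRESHOLD
    )


# Hypothetical initial report with the failure date left blank.
report = {
    "maintainable_item": "Pump P-101",
    "functional_failure": "Insufficient flow",
    "failure_date": "",
    "downtime_hours": 12.0,
}
print("Missing:", missing_fields(report))
print("RCFA required:", needs_rcfa(report))
```

Writing the thresholds down as named constants keeps the
analyst's decision traceable to policy rather than to individual
judgment.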
2.3.
Analysis Methods
Well collected failure data allows the analyst to use a variety
of analysis methods to determine how to improve asset
performance. A well trained analyst can use Weibull Analysis,
Reliability Centered Maintenance (RCM), Availability Simulation,
and Root Cause Failure Analysis (RCFA) to analyze the data and
determine solutions to asset performance problems.
2.3.1.
Weibull Analysis
Weibull Analysis, invented in the 1930s by Swedish-born Waloddi
Weibull, has become the statistical analysis method of choice
for examining equipment failures. The low number of data points
required for making reasonable decisions, as well as the ability
to look at time-to-failure distributions to determine potential
maintenance tactics, gives it substantial advantages over other
forms of statistical analysis for making asset management
decisions.
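As an illustration, a two-parameter Weibull distribution can be
fitted to a handful of failure times with median rank
regression. This is a sketch under stated assumptions, not the
method of any particular software package: it assumes complete
data (every item ran to failure, no suspensions) and uses
Bernard's approximation for the median ranks. The function
names, the sample times, and the width of the "random" band
around a shape of 1 are hypothetical.

```python
import math


def fit_weibull(times_to_failure):
    """Estimate Weibull shape (beta) and scale (eta) by median rank regression.

    Assumes complete data: every item ran to failure (no suspensions).
    """
    t = sorted(times_to_failure)
    n = len(t)
    if n < 2 or t[0] <= 0 or t[0] == t[-1]:
        raise ValueError("need at least two distinct, positive failure times")
    xs = [math.log(ti) for ti in t]
    # Bernard's approximation of the median rank of the i-th failure.
    ys = [math.log(-math.log(1.0 - (i - 0.3) / (n + 0.4)))
          for i in range(1, n + 1)]
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    beta = sxy / sxx                         # slope of the fitted line
    eta = math.exp(x_mean - y_mean / beta)   # from the line's intercept
    return beta, eta


def failure_pattern(beta, band=0.1):
    """Interpret the shape parameter; the band width is an assumption."""
    if beta < 1.0 - band:
        return "infant failures (failure rate decreasing)"
    if beta > 1.0 + band:
        return "wear-out (failure rate increasing)"
    return "random failures (failure rate roughly constant)"


# Hypothetical times to failure, in operating hours, for one failure mode.
hours = [150, 340, 560, 800, 1130, 1720]
beta, eta = fit_weibull(hours)
print(f"beta={beta:.2f}, eta={eta:.0f} h, {failure_pattern(beta)}")
```

The shape parameter suggests the maintenance tactic: a shape
well below 1 points toward installation or startup problems, and
a shape well above 1 toward age-related wear.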
2.3.2.
Reliability Centered Maintenance (RCM) and Availability
Simulation
RCM coupled with Availability Simulation allows the analyst to
look at a wide variety of potential maintenance tactics to
determine which set of tactics can be applied to equipment
failures to achieve the best combination of profit, safety
criticality, environmental criticality, and operational
criticality for meeting the goals of the business. Availability
Simulation changes maintenance decision making from a day-to-day
exercise into a strategic planning exercise which can look far
into the future of the assets.
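A very small Monte Carlo model gives the flavor of Availability
Simulation. The sketch below is an assumption-laden
illustration, not a description of any commercial simulation
tool: it models a single item with Weibull-distributed times to
failure and exponentially distributed repair times, and all
parameter values are hypothetical.

```python
import random


def simulate_availability(eta, beta, mean_repair_hours,
                          horizon_hours, runs=50, seed=0):
    """Estimate availability of one item over a planning horizon.

    Times to failure are Weibull(scale=eta, shape=beta); repair
    times are exponential with the given mean. Both are assumptions.
    """
    rng = random.Random(seed)
    total_up = 0.0
    for _ in range(runs):
        clock = 0.0
        up = 0.0
        while clock < horizon_hours:
            # random.weibullvariate takes (scale, shape) in that order.
            time_to_failure = rng.weibullvariate(eta, beta)
            run_time = min(time_to_failure, horizon_hours - clock)
            up += run_time
            clock += run_time
            if clock >= horizon_hours:
                break
            clock += rng.expovariate(1.0 / mean_repair_hours)
        total_up += up
    return total_up / (runs * horizon_hours)


# Hypothetical comparison of two repair scenarios over ten years.
horizon = 10 * 8760
slow = simulate_availability(1000.0, 1.0, 24.0, horizon)
fast = simulate_availability(1000.0, 1.0, 8.0, horizon)
print(f"24 h mean repair: {slow:.3f}, 8 h mean repair: {fast:.3f}")
```

Running the same model with different failure and repair
parameters is what allows tactics to be compared before any
money is spent on them.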
2.3.3.
Root Cause Failure Analysis (RCFA)
RCFA is arguably the most powerful tool available for improving
asset performance. RCFA allows the organization to analyze and
eliminate major failures as well as the small recurring failures
that chip away at company profits each and every day. The
FRACAS database is instrumental in ensuring that good hard data
is used to back up the potential causes for failure given during
RCFA exercises. The most important element in successful RCFA
programs is the reliance on hard facts rather than supposition
by RCFA participants.
It is the strong combination of RCFA and RCM that allows an
organization to make rapid and sustainable improvements in asset
performance.
3.
The FRACAS Database
3.1.
Introduction
The FRACAS database is the repository for all gathered failure
information. It must be developed in a way that allows easy
entry of failure data, and easy retrieval of failure data for
analysis using the various methods previously described. The
database may take several forms depending on the size and
sophistication of the organization.
3.2.
Forms of the Database
The FRACAS database may take the form of a custom-built database
for use in small organizations, an off-the-shelf database for
use across larger organizations, or in some cases it may be
integrated into the facility’s Computerized Maintenance
Management System (CMMS) or Enterprise Asset Management System (EAMS).
3.2.1.
Custom Built Databases
Small companies or facilities may often opt to develop their own
FRACAS database due to the lack of funds and resources required
for purchasing either off-the-shelf packages or CMMS/EAMS
packages. The advantage to this method is low entry cost as
well as development based on the specific needs of the
organization. It is usually maintained by a single dedicated
individual. The major drawback to this type of system is the
inability to share and report data across a larger user base.
3.2.2.
Off The Shelf Solutions
There is a large variety of off-the-shelf FRACAS software
packages available today. They are usually more suitable for
larger organizations. Most systems have some form of analysis
ability already built into them, and offer the ability to attach
external documents and pictures to enhance failure reporting and
analysis. The available systems can be used in LAN and WAN
environments so that they can be a global solution for a large
company. Off-the-shelf systems require either totally separate
data entry, or some combination of separate data entry and
import entry from either a CMMS or an EAMS environment. In most
cases the import data entry is accomplished by exporting data
from the CMMS/EAMS to an office product such as Excel, and then
importing the information into the FRACAS database.
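The import step might look like the following sketch, which
reads a CSV file of the kind a spreadsheet can save. The column
names (WorkOrder, Equipment, FailureTime, FailureCode) and the
timestamp format are assumptions; a real CMMS/EAMS export will
use its own layout.

```python
import csv
import io
from datetime import datetime

# Assumed timestamp layout in the export file.
TIME_FORMAT = "%Y-%m-%d %H:%M"


def load_cmms_export(csv_file):
    """Convert rows of an exported work order list into FRACAS records."""
    records = []
    for row in csv.DictReader(csv_file):
        records.append({
            "work_order": row["WorkOrder"].strip(),
            "maintainable_item": row["Equipment"].strip(),
            "failure_time": datetime.strptime(
                row["FailureTime"].strip(), TIME_FORMAT),
            "failure_mode": row["FailureCode"].strip(),
        })
    return records


# Hypothetical export content; a real import would open a file instead.
sample = io.StringIO(
    "WorkOrder,Equipment,FailureTime,FailureCode\n"
    "1001,Pump P-101,2006-03-14 08:30,Bearing - Seized\n"
    "1002,Fan F-201,2006-03-20 14:05,Belt - Broken\n"
)
records = load_cmms_export(sample)
print(len(records), "records imported")
```

Parsing the timestamp at import time surfaces badly formatted
dates immediately, rather than later during analysis.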
Most providers of FRACAS software are constantly updating and
improving the software, and are open to changing the software
based on direct inputs from their user base.
3.2.3.
CMMS or EAMS Solutions
Very sophisticated organizations with large Information
Technology (IT) or Information Systems (IS) departments may be
able to implement the FRACAS database within their EAMS/CMMS.
The advantage of this solution is that all information is in a
single repository that is accessible from all levels of the
organization. The disadvantages are that it requires a sizable
investment in programming resources, and a programming change by
the EAMS/CMMS supplier may require an extensive rewrite of the
FRACAS module. Most IT/IS departments are unwilling to commit
to providing the follow-on resources that may be required to
support future changes.
3.3.
Minimum Database Requirements
3.3.1.
Introduction
As a minimum the FRACAS database must contain elements that
allow the user to analyze failures using Weibull Analysis, RCM,
Availability Simulation, and RCFA. The following list is meant
to represent the absolute minimum requirements for the
database.
3.3.1.1.
Equipment Hierarchy
The database must contain the equipment hierarchy down to the
maintainable item level.
3.3.1.2.
Failure Modes
Failure Modes as described in Section 2.1 should be in the
database in a tabular format. It is helpful if the failure
modes are contained in failure mode groups to minimize the list
of failure modes to search when assigning the mode to a given
failure report.
3.3.1.3.
Date and Time Stamp
The exact date and time of the report must be saved so that
successful Weibull Analysis can be accomplished. The lack of
specific times will impact the ability of the analyst to
determine exact times to failure for specific failure modes. As
an absolute minimum the date of the failure must be recorded.
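The following sketch shows why the time stamp matters: times
between failures for each failure mode are simply differences
between consecutive stamps. It assumes the item runs
continuously and that repair time is negligible, so calendar
time stands in for operating time. The function name and the
example events are hypothetical.

```python
from collections import defaultdict
from datetime import datetime


def times_between_failures(events):
    """Group failures by (item, failure mode) and return gaps in hours.

    events: iterable of (maintainable_item, failure_mode, timestamp).
    Assumes continuous operation and negligible repair time.
    """
    by_key = defaultdict(list)
    for item, mode, stamp in events:
        by_key[(item, mode)].append(stamp)
    gaps = {}
    for key, stamps in by_key.items():
        stamps.sort()
        gaps[key] = [
            (later - earlier).total_seconds() / 3600.0
            for earlier, later in zip(stamps, stamps[1:])
        ]
    return gaps


# Hypothetical failure history.
events = [
    ("Pump P-101", "Bearing - Seized", datetime(2006, 1, 2, 20, 0)),
    ("Pump P-101", "Bearing - Seized", datetime(2006, 1, 1, 8, 0)),
    ("Pump P-101", "Seal - Leaking", datetime(2006, 1, 5, 9, 0)),
]
gaps = times_between_failures(events)
print(gaps[("Pump P-101", "Bearing - Seized")])
```

If only dates were recorded, the two bearing failures in this
example would appear to be exactly one day apart, and shorter
intervals would collapse to zero.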
3.3.1.4.
Failure description
The failure reporter must have the ability to describe what
happened in his/her own words, including the functional failure
of the maintainable item.
3.3.1.5.
Failure Impact
The database must contain information about the business impact
of the failure in terms of cost, downtime, safety criticality,
environmental criticality, and operational criticality.
3.3.1.6.
Causal Factors
Information about what may have caused the failure, or any
causal factors that may have led up to the failure must be
recorded. This information can be vital when later analysis of
the failures is performed.
3.3.1.7.
RCFA Follow-up
Many organizations that undertake RCFA efforts fail to
capitalize on the power of RCFA because they are unable to close
the loop on following up recommendations. The FRACAS is an
excellent place to keep information about which failures require
an RCFA, and who has organizational responsibility for
completing the implementation of RCFA recommendations.
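The minimum elements listed above could be held in a single
table, sketched here with Python's built-in sqlite3 module. The
table name, column names, and example values are hypothetical; a
production database would normally split the hierarchy and the
failure mode list into their own tables.

```python
import sqlite3

SCHEMA = """
CREATE TABLE failure_report (
    id                     INTEGER PRIMARY KEY,
    hierarchy_path         TEXT NOT NULL,  -- down to the maintainable item
    failure_mode           TEXT,           -- may be unknown at first report
    failed_at              TEXT NOT NULL,  -- ISO 8601 date and time
    description            TEXT NOT NULL,  -- reporter's own words
    cost                   REAL,
    downtime_hours         REAL,
    safety_critical        INTEGER DEFAULT 0,
    environmental_critical INTEGER DEFAULT 0,
    operational_critical   INTEGER DEFAULT 0,
    causal_factors         TEXT,
    rcfa_required          INTEGER DEFAULT 0,
    rcfa_owner             TEXT,
    rcfa_closed            INTEGER DEFAULT 0
);
"""

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)

# Hypothetical failure report that requires an RCFA.
conn.execute(
    "INSERT INTO failure_report "
    "(hierarchy_path, failure_mode, failed_at, description, "
    " downtime_hours, rcfa_required, rcfa_owner) "
    "VALUES (?, ?, ?, ?, ?, ?, ?)",
    ("Plant A / Cooling Water System / Pump P-101", "Bearing - Seized",
     "2006-03-14T08:30", "Pump stopped delivering flow", 12.0, 1,
     "Reliability Engineer"),
)

# RCFA follow-up: which required analyses are still open, and who owns them?
open_rcfas = conn.execute(
    "SELECT hierarchy_path, rcfa_owner FROM failure_report "
    "WHERE rcfa_required = 1 AND rcfa_closed = 0"
).fetchall()
print(open_rcfas)
```

The last query is the loop-closing step: any row it returns is
an RCFA that still has an owner and outstanding work.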
3.3.2.
Reporting Capabilities
The FRACAS database should allow the analyst to produce a
variety of textual and graphical reports to aid in the analysis
of failures. Reporting of Weibull data, failure frequencies for
various failure modes, and database structure are extremely
important.
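A failure frequency report can be as simple as a ranked count of
failure modes. The sketch below is illustrative only; the record
layout and the example data are hypothetical.

```python
from collections import Counter


def failure_frequency(reports):
    """Return (failure mode, count) pairs, most frequent first."""
    return Counter(r["failure_mode"] for r in reports).most_common()


def print_frequency_report(reports):
    """Print a plain-text ranking of failure modes."""
    for mode, count in failure_frequency(reports):
        print(f"{count:4d}  {mode}")


# Hypothetical failure reports.
reports = [
    {"failure_mode": "Bearing - Seized"},
    {"failure_mode": "Seal - Leaking"},
    {"failure_mode": "Bearing - Seized"},
    {"failure_mode": "Belt - Broken"},
    {"failure_mode": "Seal - Leaking"},
    {"failure_mode": "Bearing - Seized"},
]
print_frequency_report(reports)
```

A ranking like this helps the analyst pick out the recurrent
failure modes that are candidates for an RCFA.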
4.
Conclusion
A well-designed Failure Reporting, Analysis, and Corrective
Action System (FRACAS) can be an important part of any
continuous improvement effort. Failure codes developed using
functional analysis of plant equipment greatly improve the
ability of the Reliability Engineer to analyze failures and
initiate changes in equipment design, maintenance strategies,
and operating strategies.
Simple, two-part failure codes for use in the CMMS/EAMS allow
operators and maintainers to better record failure information
for use in the FRACAS.