|
Extracted from Chapter 14 of Reliability-centered
Knowledge
By
Murray Wiseman
Optimal Maintenance Decisions (OMDEC) Inc.
www.omdec.com
Note from Terrence O'Hanlon,
Reliabilityweb.com: This article deals with the generic
term “streamlined” in the context of the sometimes
rigorous application of RCM as defined by a group of
select RCM practitioners. There is also an RCM process
that goes by the registered trade name of
SRCM® (Streamlined
Reliability Centered Maintenance) owned by Erin
Engineering and Research Inc., now a division of SKF.
This article and the reference to "streamlined" within,
is not about
SRCM®, Erin
Engineering or SKF, nor is it intended to confuse the
market in any way. For more information about
SRCM® visit
www.erineng.com or
www.skf.com
This article also assumes and
implies that there is an agreed upon process that goes
by the name “Reliability Centered Maintenance” or RCM.
That assumption is far from the case as RCM is a widely
used term that defies ownership or agreement. What one
man calls RCM is not what another would – which adds
considerably to the confusion in the RCM marketplace.
The Society of Automotive Engineers (SAE) did write an
RCM standard and it is widely acknowledged that the RCM2®
(another trade name – this time owned by Aladon LLC)
process influenced the direction of that standard. I am
neither endorsing nor denouncing the SAE RCM standard. I am
simply stating that RCM as a term and a methodology is
not owned by any single party. There are a huge number
of derivations that lay claim to that exact sequence of
words (Reliability Centered Maintenance) or acronym
(RCM). In addition, the Society of Automotive
Engineers, while a fine organization, has no control or
authority over the use of the term Reliability Centered
Maintenance. They simply wrote a standard that some
accept as a valid definition.
Confused? Sorry about that.
To make up your own mind about the various derivations
of RCM, please think about attending RCM-2005, the
Reliability Centered Maintenance Managers’ Forum, on
March 9-11, 2005 in Clearwater Beach, Florida and again
on March 30, 31 and April 1, 2005 in Surfer’s Paradise,
Australia. For more information, visit
www.maintenanceconference.com
Religious or political zealots
often confront one another not over
the mores of their respective doctrines, but rather over
superficial differences in the details
surrounding each other’s cultural reference
points. Mathematicians take pride in their ability to
adopt a new set of definitions and symbols as
effortlessly as they would don a fresh suit of clothes.
Thus they proceed, unfettered by prior points of view,
to build new theorems upon old. The world of maintenance
has, not dissimilarly, spawned a multitude of cultures
and languages for formulating solutions to real
problems.
In the preceding chapters we conducted RCM on several
diverse item types. We systematically answered each of
the seven RCM questions about the item, in the
order stipulated by the SAE JA1011 standard: 1)
functions?, 2) failures?, 3) failure
modes?, 4) failure effects?, 5)
consequences?, 6) scheduled tasks?, and 7)
default tasks?. We entered the answers to the
questions in an electronic spreadsheet (for example, MS
Excel or a database form) formatted as the RCM Worksheet
illustrated in Figure 11-2 on page 138.
This
chapter explores one of several “streamlined” RCM
software programs.
We begin with an examination of what is meant by
“streamlining”. We illustrate the “streamlined” approach
by describing a popular representative RCM software
package called RCM Turbo.
We set up a cross-reference “dictionary” of terms
describing similar sounding but sometimes differently
applied concepts in the two “languages”. Finally, we
summarize the relative advantages and potential
drawbacks of the “streamlined” and the original RCM
processes. Through this process, we discover how the
juxtaposition of two approaches may enlighten the
proponents of both.
Chapter 11 (page 137) cited the SAE Standard “Evaluation
Criteria for Reliability-Centered Maintenance (RCM)
Processes” that defines RCM as:
“… a
specific process used to identify the policies which
must be implemented to manage the failure modes which
could cause the functional failure of any physical asset
in a given operating context.”
It goes on to define the process by adding:
“…Any
RCM process shall ensure that all the following seven
questions are answered satisfactorily and are answered
in the sequence shown as follows:
a.
What are the functions and associated desired standards
of performance of the asset in its present operating
context (functions)?
b. In
what ways can it fail to fulfill its functions
(functional failures)?
c.
What causes each functional failure (failure modes)?
d.
What happens when each failure occurs (failure effects)?
e. In
what way does each failure matter (failure
consequences)?
f.
What should be done to predict or prevent each failure
(proactive tasks and task intervals)?
g.
What should be done if a suitable proactive task cannot
be found (default actions)?”
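The seven questions map naturally onto a nested record, one level per question. What follows is a minimal Python sketch under stated assumptions: the class names, field names, and the helper `first_unanswered` are hypothetical illustrations, not structures prescribed by the SAE standard or by any RCM software. The nesting mirrors the one-to-many relationships between function, failure, and failure mode, and the helper reports the earliest question still open.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Question labels in the order stipulated by the standard (a to g).
QUESTIONS = [
    "functions", "functional failures", "failure modes", "failure effects",
    "failure consequences", "proactive tasks", "default actions",
]

@dataclass
class FailureMode:                          # question c
    description: str
    effects: Optional[str] = None           # question d
    consequence: Optional[str] = None       # question e
    proactive_task: Optional[str] = None    # question f
    default_action: Optional[str] = None    # question g

@dataclass
class FunctionalFailure:                    # question b
    description: str
    failure_modes: List[FailureMode] = field(default_factory=list)

@dataclass
class Function:                             # question a
    statement: str
    failures: List[FunctionalFailure] = field(default_factory=list)

def first_unanswered(function: Function) -> Optional[str]:
    """Return the earliest question still open for this function, or None."""
    if not function.statement:
        return QUESTIONS[0]
    if not function.failures:
        return QUESTIONS[1]
    if any(not f.failure_modes for f in function.failures):
        return QUESTIONS[2]
    modes = [m for f in function.failures for m in f.failure_modes]
    if any(m.effects is None for m in modes):
        return QUESTIONS[3]
    if any(m.consequence is None for m in modes):
        return QUESTIONS[4]
    # Questions f and g: each mode needs a proactive task or, failing that,
    # a default action.
    if any(m.proactive_task is None and m.default_action is None for m in modes):
        return QUESTIONS[5]
    return None

feed = Function("To feed material 24 hours/day")
feed.failures.append(FunctionalFailure("Does not feed at all"))
print(first_unanswered(feed))  # failure modes
```

A facilitator's worksheet could use a check of this kind to show where the team stands, without preventing the team from returning to an earlier answer.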
Were we to consider the process
(of answering the 7 RCM questions in the sequence
stipulated) unacceptably resource intensive, then,
understandably, we would seek to replace it with a
process that consumes less time and fewer resources, yet
provides a no less responsible
(sufficiently rigorous) analysis. We emphasize that the
SAE JA1011 standard stipulates a minimal
set of criteria for a process to be called “RCM”.
It is therefore to be expected that most commercially
packaged RCM software systems and methodologies will add
a considerable number of features that enhance and
facilitate the experience.
The
original
as well as the various streamlined RCM methods all
demand that the assembled team of analysts (operational,
process, and maintenance specialists) possess,
collectively, the knowledge necessary to make informed
decisions regarding the maintenance characteristics of
the item under scrutiny. The process chosen (either
original or streamlined) must, therefore, encourage the
maximum contribution by each participant so that RCM
decisions will carry the force of all knowledge and
experience available on the team. The success of any
“RCM” methodology, therefore, depends heavily on its
ability to gain true consensus throughout
every stage of the analysis. The group, guided by a
well-trained facilitator, exercises its best judgment when
visualizing the typical worst case scenario
(TWCS) surrounding each functional failure analyzed.
With these objectives in mind,
we compare the two processes by presenting a
comparative lexicon of some of their respective terms of
reference.
Table
14-1
Relationship between RCM and RCM Turbo terminology
|
RCM |
RCM Turbo |
|
Item:
a collection of parts, or systems that is convenient
to analyze as a group. It has been selected at a
high enough level of indenture that its failure may
easily be related to that of the equipment as a
whole, but at a low enough level so that the
analysis is of manageable size (i.e. having a
manageable number of failure modes). |
Maintainable item (MI):
same meaning |
|
No equivalent terminology is specified by the RCM
minimum criteria standard. (Any convenient or
existing equipment hierarchy naming system may be
used.) Operating context is often recorded in a
flexible text structure at the top of the RCM
worksheet. |
Productive unit (PU):
A system that includes several maintainable items. A
convenient place to record the operating context of
the MI. A productive unit belongs to a “Major Unit”,
and a “Plant” is the highest level in the RCM Turbo
hierarchy. |
|
Worksheet:
A document (conveniently an electronic spreadsheet
or simple database application) onto which the
answers to the 7 RCM questions are recorded
during the RCM team session. |
The RCM Turbo software product is not meant to be
populated during the sessions, but afterwards by the
facilitator or other person trained in the use of
the software. A MS Excel form (Figure
14-2) is provided for use during the sessions. |
|
The RCM minimum criteria standard does not specify a
criticality or priority scale with which to schedule
the order of items to be analyzed. Nowlan and Heap
developed a simple priority system for the aviation
industry that has only two criticality ratings:
1) significant item,
and 2) non-significant item. This classification
system has proved useful in a variety of other
industries. For structurally significant items (SSI)
Nowlan and Heap apply a further rating of
one to four in each of five categories:
1) Residual strength after failure, 2) Fatigue life,
3) Crack growth, 4) Corrosion, and 5) Accidental
damage. The minimum rating (across all 5) determines
task frequency. There are two types of SSI: 1)
Damage-tolerant and 2) Safe-life. Rating categories 1
to 5 apply to damage-tolerant items, but only
categories 4 and 5 apply to safe-life items.
(See Example 4
of Chapter 13
on page 178). |
Criticality/Priority:
values used to set priorities for PUs and MIs. It is
derived by question and answer sessions driven by
the program. (Criticality calculations in no way
detract from RCM. They merely add another dimension
to the analysis.) |
|
Failure:
Describes the way in which a specified function no
longer performs as required. It distinguishes (for
example) “full” from “partial” failure of a
function. The RCM Worksheet enforces a
one-to-many integrity constraint between
Function and Failure. |
Failure:
same basic definition. However, RCM Turbo does not
enforce a one-to-many (software) relationship
between Function and Failure. |
|
Failure Mode:
A reasonably likely cause of a specified failure.
Consists of a noun, a verb (active or passive form)
and a phrase such as “due to …”. For example “bolt
cracks due to stress corrosion fatigue”. The number
of failure modes to list and their “depth of
causality” depend on operating context. RCM enforces
a one-to-many integrity constraint between failure
and failure mode. RCM Turbo does not. |
Failure Mode:
A superset of the RCM definition. Structured in 3
parts as follows:
1) a component reference, 2) a “Failure
Mode & Effect” field - a single field that
includes both RCM concepts (Failure Mode and Failure
Effects), and 3) a “Root cause” reference. An
example of a RCM Turbo failure mode is: “Bearings”
+ “wear between rolling elements and racers leading
to increased vibration levels, localized heating and
eventual seizure and total stoppage of process due
to” + “normal wear and tear”. |
|
Failure Mode:
In RCM, the terms “Root Cause”, “Failure Mode”,
“Failure Mechanism”, “Failure Reason”, etc are
synonymous and represented by the term “Failure
Mode”. It is an “event” in the causality chain that
leads to the failed state. The “link” in the
causality chain selected as the “Failure Mode” is
the one that the organization can manage effectively
and practically by whichever means (proactive,
detective, or redesign). |
Root cause:
related to Failure Mode. Same definition. That is,
“Root Cause” in RCM Turbo is equivalent to “Failure
Mode” in RCM. |
|
Failure Effects:
Text answering the following:
• what sequence of events (considering a
TWCS
in the component, in the system, organization wide,
and in the external world)
could be touched off by the failure mode?
• how does the failure make itself known? What
observable events lead up to the failure?
• how is safety or the environment impacted?
(without mentioning the words "safety" or
"environment")
• how is production impacted? (quality, cost,
customer service)
• is there any additional damage caused by the
failure?
• how long will it take and what actions must be
accomplished to correct the failure?
• How does the likelihood of this failure depend on
deeper causes? Has it happened before? How often?
Under what circumstances?
|
Same definition but it is structurally embedded in
the “Failure Mode & Effect” field. In
addition the following “Failure Mode” fields (with
sample data) contribute to the “Effects” narrative:
Unit Output Reduction:
Total stoppage,
PU Downtime Cost:
$11,390 / hour,
MI Downtime Cost:
$11,390 / hour
F/mode&Effects:
Shaft failure-Chemical corrosion, overtorque,
indicated by cracks, increase in vibration leading
to shutdown of Brownstock washer
Characteristic:
Definitive life / wear out characteristics
Measurability:
Moderately easy to monitor
Category:
Normal Operation
Typical Warn Time:
4 Weeks
Root cause:
Normal wear & tear
MTBF:
5 years
Consequence:
Total stoppage
Strategy:
CBM |
|
Hidden Function:
A Function whose failure will not be detected under
normal circumstances. Identified by RCM during
functional analysis when examining each component
(from schematics, P&IDs, photographs, and physical
walkaround) and listing the functions they suggest.
Code phrases (such as “able to”, “in the presence
of”, etc) are used to point out that a function is
hidden or protected by a hidden function. Subsequent
questions address the hidden function. The “hidden”
consequence supplants the other (three) failure
consequences in the RCM logic for determining a
mitigating task. |
Hidden Failure Mode:
Same meaning as RCM’s “hidden function”. It is
structured in the fields: Component, Failure Mode &
Effects, Task Description, Frequency, Duration,
Initiate Date, Job Group ID, Service Period, No. of
Units in Service, No. of failures, and MTBF of the
protective device (calculated). |
|
RCM records this information in the free text answer
to question 4, “Failure Effects”. However the JA1011
standard does not specify an explicit data field or
structure for MTBF. |
MTBF:
related to the Failure Mode. |
|
RCM records this information in the answers to
questions 6 and 7, “Tasks”, when following one of the
four branches (H, S, O, N) in the RCM decision logic
tree. |
Strategy:
related to Failure Mode. Takes one of three possible
values: 1) fixed time maintenance, 2) condition
based maintenance, or 3) operate to failure |
|
Same definition. RCM records this information in the
free text answer to question 4, “Failure effects”. |
P-F Interval:
related to Failure Mode. Estimated interval
(measured in working age units) between the
appearance of a potential failure and a functional
failure. |
|
Potential failure:
An indicator that a failure mode has initiated. |
S/A (secondary action) Indicator:
same meaning as “Potential failure” in RCM. |
|
No equivalent concept in RCM. If a failure mode is
due to design, lubrication, overload, or maintenance
practices, they would each constitute a separate
failure mode, and this information would be included
in the failure mode description itself. The word
“Safety” or “Environment” is not mentioned until the
consequence phase of the RCM logic diagram. |
Category:
related to Failure Mode. Takes one of six possible
values: 1) Design, 2) Lubrication, 3) Normal
Operation, 4) Overload Condition, 5) Maintenance
practices, or 6) Safety |
|
RCM records this information in the free text answer
to question 4, “Failure effects”. However no
explicit data structure is specified by the JA1011
standard. |
Characteristic:
related to Failure Mode. Takes one of three possible
values: 1) Definitive life/wearout, 2) General
degradation, and 3) Random |
|
Consequences:
Question 5. Takes one of four possible values: 1)
Hidden, 2) Safety /Environmental, 3) Operational,
and 4) Non-operational.
RCM records RCM Turbo’s “Consequence” in the free
text answer to question 4 “Failure effects”. |
Consequence:
related to Failure Mode. Takes one of four possible
values: 1) Total stoppage, 2) Partial
stoppage/quality, 3) No immediate effect, or 4) No
effect. This information |
|
RCM records this information both in the free text
answer to Question 4 “Failure effects” and in the
answer to Question 6 “Tasks”. Q6 asks whether there
is an applicable CBM task. Once a (CBM or
other) task is found to be applicable (practical)
RCM then asks whether it will be effective.
That is, will it sufficiently reduce or entirely
avoid the consequences of failure at acceptable
cost? |
Measurability:
related to Failure Mode. Takes one of three possible
values: 1) Easy, 2) Moderate, or 3) Impossible |
|
Redesign:
RCM records this information in the free text
answer to question 7, “Default Tasks”. Differs from
RCM Turbo only in the sequence in which this
question appears (i.e. following a determination
that no proactive or failure finding task adequately
mitigate the consequences of the failure.) |
Design Notes:
related to the Failure Mode. Records
decision/recommendation to “design-out” the failure
mode. (strictly speaking it is presented out of “RCM
sequence”.) |
|
RCM provides no specific field for this
information, leaving its provision up to the
implementer or commercial packager. |
Strategy Notes:
related to Failure Mode. A free text field used to
store comments or notes on the chosen maintenance
strategy. Useful where a second or alternative
strategy has been considered and rejected. |
|
RCM records this information in the free text
answer to question 4, “Failure Effects”. However,
without an explicitly specified structure. |
Breakdown Action:
related to Failure Mode. Describes what must be done
to repair the functional failure. Also has
the specific fields: Work Order No., SOP,
Duration, Downtime, MI Status,
S/A Initiator, Resources (up to six steps),
Assumptions, Materials, Spares. |
|
RCM develops this information in the decision
algorithm of question 6 (Is there an on-condition
maintenance task that is both applicable and
effective?) The RCM standard does not elaborate an
explicitly specified structure for recording this
information. |
Primary Action:
Related to the Failure mode. Describes what should
be done to prevent the failure mode. Also has
the specific fields: Work Order No., SOP,
Duration, Downtime, MI Status,
S/A Initiator, Resources (up to six steps),
Assumptions, Materials, Spares. |
|
RCM records this information in the free text
answer to question 6, “Tasks”. The RCM standard does
not elaborate an explicitly specified structure for
recording this information. |
Secondary Action:
related to Failure Mode. Describes what must be done
following the detection of a potential
failure. Also has the specific fields: Work Order
No., SOP, Duration, Downtime,
MI Status, S/A Initiator, Resources
(up to six steps), Assumptions, Materials,
Spares. |
|
RCM records this information in the free text
answer to question 4, “Failure Effects”. The RCM
standard does not elaborate an explicitly specified
structure for recording this information. |
Overhaul Action:
related to Failure Mode. Records Overhaul
Maintenance actions. For example, where the
Secondary Action was the change-out of a rotable
item which itself requires subsequent overhaul. Also
has the specific fields: Work Order No.,
SOP, Duration, Downtime, MI
Status, O/H Venue, S/A Initiator,
Resources (for up to six steps), Assumptions,
Materials, Spares. |
|
Not called a “library”. However, the records are
accessible (structured as answers to the seven
questions) in the RCM worksheets comprising the
global RCM table. No corporate harmonizing process
need be applied because every record is a “one-off”
development. However, tools, training, supervision
and support are required to validate and maintain
and update the knowledge base with day-to-day
experience. “Templating” of an entire item is,
nonetheless, possible by copying any or all records
of an item after carefully comparing their
respective operating context descriptions. |
Failure Data Library:
a table of “3 part” failure modes referenced by
Machine Type. An administration process is used to
control the quality of data from multiple sites and
harmonize it for the purpose of providing
“templates” where applicable in future analyses of
other MIs or PUs. The focus on “templating”
justifies the appellation “Streamlined” in the case
of RCM Turbo. |
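The cross-reference “dictionary” of Table 14-1 can be held literally as a lookup table. The sketch below is illustrative only: it covers a subset of the rows, the names `RCM_TO_TURBO`, `TURBO_TO_RCM`, and `to_turbo` are hypothetical, and the mapping flattens nuances that the table spells out (for example, RCM Turbo's own “Failure Mode” is a three-part superset of the RCM term).

```python
# Subset of Table 14-1: RCM term -> closest RCM Turbo term.
RCM_TO_TURBO = {
    "Item": "Maintainable item (MI)",
    "Failure": "Failure",
    "Failure Mode": "Root cause",
    "Failure Effects": "Failure Mode & Effect",
    "Hidden Function": "Hidden Failure Mode",
    "Potential failure": "S/A (secondary action) Indicator",
    "Redesign": "Design Notes",
}

# Reverse lookup, valid because each RCM Turbo term appears once above.
TURBO_TO_RCM = {turbo: rcm for rcm, turbo in RCM_TO_TURBO.items()}

def to_turbo(rcm_term: str) -> str:
    """Return the RCM Turbo term for an RCM term, or flag it as unmapped."""
    return RCM_TO_TURBO.get(rcm_term, f"(no mapping recorded for {rcm_term!r})")

print(to_turbo("Potential failure"))  # S/A (secondary action) Indicator
print(TURBO_TO_RCM["Root cause"])     # Failure Mode
```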
We may conclude from
Table 14-1 that, although RCM Turbo refers to
itself as a streamlined process and some
of its terminology differs from that of RCM, it does not
omit any vital knowledge element specified by the
SAE RCM minimum criteria standard. RCM Turbo does
deviate from the sequence stipulated in the
standard. As pointed out in Chapter 11 (page 137),
however, RCM in practice is not a strictly sequential
process. RCM analysts anticipate the answers to subsequent
questions while working the current question.
Furthermore, the RCM process is iterative. That is, the
analysts often return to a previous answer and adjust it
in the light of revelations further on in the process.
The iterative and non-sequential nature of the RCM
process tends to render less important the differences
between the two approaches.
The terminology comparisons of
Table 14-1 show that RCM Turbo expands the
information elements of RCM into greater structural
detail. Such data structuring facilitates the post-RCM
processes (included in the RCM Turbo software package)
of workload smoothing, frequency calculations, and CMMS
integration as well as integration with a spares
optimization (optional) package.
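To make the idea of “greater structural detail” concrete, the following Python sketch models a few of the failure mode fields listed in Table 14-1 as typed values. It is a rough illustration, not RCM Turbo's actual schema: the class and attribute names are hypothetical, only some fields are shown, the component name “Shaft” is inferred from the sample data, and treating “Typical Warn Time” as the P-F interval is an assumption.

```python
from dataclasses import dataclass
from enum import Enum

class Strategy(Enum):
    FIXED_TIME = "fixed time maintenance"
    CONDITION_BASED = "condition based maintenance"
    OPERATE_TO_FAILURE = "operate to failure"

class Characteristic(Enum):
    WEAROUT = "Definitive life/wearout"
    DEGRADATION = "General degradation"
    RANDOM = "Random"

class Consequence(Enum):
    TOTAL_STOPPAGE = "Total stoppage"
    PARTIAL_STOPPAGE = "Partial stoppage/quality"
    NO_IMMEDIATE_EFFECT = "No immediate effect"
    NO_EFFECT = "No effect"

@dataclass
class TurboFailureMode:
    """Hypothetical record holding a subset of the Table 14-1 fields."""
    component: str            # part 1 of the three-part failure mode
    mode_and_effect: str      # part 2: combined failure mode and effects text
    root_cause: str           # part 3
    characteristic: Characteristic
    consequence: Consequence
    strategy: Strategy
    mtbf_years: float
    pf_interval_days: float   # assumed equal to "Typical Warn Time"

    def describe(self) -> str:
        """Join the three parts in the style of the table's bearing example."""
        return f"{self.component}: {self.mode_and_effect} due to {self.root_cause}"

# Sample values taken from the Repulper screw data in Table 14-1.
shaft = TurboFailureMode(
    component="Shaft",
    mode_and_effect="cracks and increased vibration leading to shutdown",
    root_cause="normal wear & tear",
    characteristic=Characteristic.WEAROUT,
    consequence=Consequence.TOTAL_STOPPAGE,
    strategy=Strategy.CONDITION_BASED,
    mtbf_years=5.0,
    pf_interval_days=28.0,
)
print(shaft.describe())
```

Because each coded field is restricted to a fixed set of values, downstream steps such as workload smoothing or CMMS export can rely on the data without parsing free text.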
Figure 14-1 of Example 1 shows how the RCM Worksheet
of Chapter 11 (Figure 11-2 page 138) might be combined
with the extended data fields of RCM Turbo.
|
PU Code:
Repulper, MI Code: Repulper screw |
Consequences and Results of Decision Algorithm
Q5, Q6, Q7 |
Task
|
Interval
|
By
|
|
Function Statement
Q1 |
Failure
Q2 |
Failure mode
Q3 |
Effects
Q4 |
|
To feed material 24 hours/day |
Does not feed at all |
Shaft fails |
Unit Output Reduction:
Total stoppage,
PU Downtime Cost:
$11,390 / hour,
MI Downtime Cost:
$11,390 / hour
F/mode&Effects:
Shaft failure-Chemical corrosion, overtorque,
indicated by cracks, increase in vibration leading
to shutdown of Brownstock washer
Characteristic:
Definitive life / wear out characteristics
Measurability:
Moderately easy to monitor
Category:
Normal Operation
Typical Warn Time:
4 Weeks
Root cause:
Normal wear & tear
MTBF:
5 years
Consequence:
Total stoppage
Strategy:
CBM |
Figure
14-1
RCM Worksheet applied to a RCM Turbo example
In the
RCM worksheet of
Figure 14-1 we note that most of the RCM Turbo
“failure mode” fields (in bold) fall quite
readily into the RCM Effects column, with the possible
exception of the field “Strategy”. The latter appears to
pre-empt the RCM decision logic of Questions 6 and 7.
We view this, nonetheless, as an insignificant departure
(from RCM), given that RCM analysts consider the
mitigating task in the normal course of describing the
effects of failure. It is essential, however,
that the RCM consequences (H, S, O, or N) be determined
and the meticulous decision logic of RCM (on page 171)
be applied immediately following this RCM Turbo step.
RCM
Turbo facilitates data entry with a convenient Visual
Basic MS Excel form illustrated in
Figure 14-2.

Figure
14-2
MS Excel failure mode entry form in RCM Turbo
RCM Turbo then performs a “primary” (i.e. a CBM)
task frequency calculation (Figure
14-3) and displays the result: a recommended task interval
of 14 days (i.e. half the P-F interval).
RCM Turbo calculates the annualized cost of the CBM
program so that it may be justified by comparison with
the annualized economic consequences (based on the MTBF
and the average cost of a failure) avoided by the CBM
program.
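The arithmetic behind this comparison can be sketched in a few lines of Python. This is a simplified illustration and not RCM Turbo's actual calculation: the function names are hypothetical, the inspection cost and repair downtime are invented figures, and the sketch assumes that the CBM program avoids the full cost of every failure. The P-F interval, MTBF, and hourly downtime cost come from the sample data in Figure 14-1, treating “Typical Warn Time” as the P-F interval.

```python
def cbm_interval_days(pf_interval_days: float) -> float:
    """Inspect at half the P-F interval, as in the example in the text."""
    return pf_interval_days / 2.0

def annual_cbm_cost(interval_days: float, cost_per_inspection: float) -> float:
    """Number of inspections per year times the cost of each inspection."""
    return (365.0 / interval_days) * cost_per_inspection

def annual_failure_cost(mtbf_years: float, cost_per_failure: float) -> float:
    """Average cost of a failure spread over the mean time between failures."""
    return cost_per_failure / mtbf_years

pf_days = 28.0                   # "Typical Warn Time: 4 Weeks" (assumed P-F interval)
mtbf_years = 5.0                 # from the sample data
downtime_cost_per_hour = 11390   # from the sample data
repair_downtime_hours = 8.0      # hypothetical
cost_per_inspection = 100.0      # hypothetical

interval = cbm_interval_days(pf_days)
cbm_cost = annual_cbm_cost(interval, cost_per_inspection)
avoided = annual_failure_cost(mtbf_years, downtime_cost_per_hour * repair_downtime_hours)

print(interval)            # 14.0
print(cbm_cost < avoided)  # True when the CBM program costs less than it saves
```

With real inspection and repair figures in place of the hypothetical ones, the same comparison indicates whether the CBM task is economically justified.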

Figure 14-3 CBM
Frequency and Cost optimizing calculation
For scheduled overhaul, discard, and failure finding
tasks RCM Turbo performs analogous calculations by
applying a recorded MTBF, a qualitatively estimated
hazard function, and the recorded average economic
consequences of failure. The complete set of RCM Turbo’s
data fields is given in Appendix 12 on page 236.
1. Table 14-1
illustrates that streamlined RCM (as embodied in
RCM Turbo) is not “streamlined” in the sense of
being “abridged” or “reduced”. Rather, it encompasses
the principles of RCM, adding features that address CMMS
integration, quantitative reliability assessment and
task frequency calculations, spares, workload scheduling
and balancing, and other considerations.
2. RCM Turbo does address the 7 RCM questions, though not
in the sequence stipulated by the RCM Standard. The
software expands the 7 information elements of RCM into
various database fields. For example, MTBF, P-F
Interval, and Repair time are explicit fields related to
a Failure Mode.
3. An RCM Worksheet based on the SAE JA1011 standard will
provide excellent team focus regardless of the software
adopted. If populated (perhaps adapted as in
Figure 14-1) with RCM Turbo's needs in mind, the
worksheet (incorporating the RCM decision algorithm)
will benefit both streamlined and original RCM users.
4. Both RCM and RCM Turbo demand that the persons
(primarily maintainers and operators) directly impacted
by maintenance decisions participate fully in the
process. Indeed, they must drive it. External
consultants can only teach the principles and techniques
of RCM. Regardless of the RCM software chosen, the
organization must select its analysts from among its
most experienced and competent operators and
maintainers. It must choose a facilitator, from within,
who will learn the RCM process fluently, and who will
elicit and faithfully record the technical knowledge of the
analysts. The facilitator must ask the 7 RCM questions
and ensure that consensus has been reached. He or she
must also ensure that each of the questions along the
appropriate branch of the RCM decision tree is
rigorously answered by the team and duly recorded.
5. Finally, we emphasize that reliability-centered
maintenance is not a software-dominated process.
Software records the results of RCM analysis in a
convenient, accessible, and auditable format that traces
every maintenance task back to a failure mode that the
RCM team identified. Software enables integration with
the CMMS and implementation therein of the RCM analysis
results. Just as importantly, software, through regular
feedback from the field and integration with the CMMS,
supports continuous “living” enhancement of the
initial RCM analysis.
Do you have
any comments on this article? If so send them to
murray@omdec.com.
|