Chapter: Mechanical : Maintenance Engineering : Principles and Practices of Maintenance Planning

Reliability

Reliability is defined as the probability that a device will perform its intended function during a specified period of time under stated conditions.

RELIABILITY

Reliability may be defined in several ways:

The idea that an item is fit for a purpose with respect to time.

In the most discrete and practical sense: "Items that do not fail in use are reliable" and "Items that do fail in use are not reliable".

The capacity of a designed, produced or maintained item to perform as required over time.

The capacity of a population of designed, produced or maintained items to perform as required over time.

The resistance to failure of an item over time.

The probability that an item will perform a required function under stated conditions for a specified period of time.

In line with the creation of safety cases, the goal is to provide a robust set of qualitative and quantitative evidence that an item or system will not carry unacceptable risk.

The basic steps are to:

First, thoroughly identify as many reliability hazards as possible (e.g. relevant system failure scenarios, item failure modes, the basic failure mechanisms and root causes) by specific analysis or tests.

Assess the Risk associated with them by analysis and testing.

Propose mitigations by which the risks may be lowered and controlled to an acceptable level.

Select the best mitigations and get agreement on final (accepted) risk levels, possibly based on cost-benefit analysis.

AVAILABILITY

A Reliability Program Plan may also be used to evaluate and improve the availability of a system through a strategy that focuses on increasing testability and maintainability rather than reliability.

Improving maintainability is generally easier than improving reliability. Maintainability estimates (repair rates) are also generally more accurate.

However, because the uncertainties in the reliability estimates are in most cases very large, they are likely to dominate the availability (prediction uncertainty) problem, even when maintainability levels are very high.

When reliability is not under control, more complicated issues may arise, such as manpower (maintainer / customer service capability) shortages, spare part availability, logistic delays, lack of repair facilities, extensive retrofit and complex configuration management costs, and others.

The problem of unreliability may also be increased by the "domino effect" of maintenance-induced failures after repairs.

Focusing only on maintainability is therefore not enough. If failures are prevented, none of the other issues matter, and therefore reliability is generally regarded as the most important part of availability.
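The trade-off described above can be illustrated with a small sketch. It assumes the commonly used steady-state relation availability = MTBF / (MTBF + MTTR), which is not stated in these notes, and all figures are hypothetical. The function names are illustrative only.

```python
def inherent_availability(mtbf_hours, mttr_hours):
    """Steady-state availability = MTBF / (MTBF + MTTR) (assumed relation)."""
    return mtbf_hours / (mtbf_hours + mttr_hours)


def failures_per_year(mtbf_hours, mttr_hours, hours_per_year=8760.0):
    """Average number of failure/repair cycles that fit into one year."""
    return hours_per_year / (mtbf_hours + mttr_hours)


# Hypothetical (MTBF, MTTR) pairs in hours, chosen only for illustration.
options = {
    "baseline": (1000.0, 10.0),
    "faster repair": (1000.0, 5.0),    # maintainability improved
    "fewer failures": (2000.0, 10.0),  # reliability improved
}

for name, (mtbf, mttr) in options.items():
    a = inherent_availability(mtbf, mttr)
    n = failures_per_year(mtbf, mttr)
    print(f"{name:15s} availability={a:.4f} failures/year={n:.1f}")
```

With these numbers both improved options reach the same availability, but the "fewer failures" option has roughly half as many failure events per year, and each avoided failure is one less call on spares, manpower and logistics.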

RELIABILITY THEORY

Reliability is defined as the probability that a device will perform its intended function during a specified period of time under stated conditions. Mathematically, this may be expressed as

$$R(t) = \Pr\{T > t\} = \int_t^{\infty} f(x)\,dx$$

where f(x) is the failure probability density function and t is the length of the period of time (which is assumed to start from time zero).
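As a minimal sketch of this definition, the reliability can be evaluated by integrating the failure density from t onward. The example assumes an exponential failure density with a constant failure rate (an assumption made here for illustration, not something these notes prescribe) and compares a simple trapezoidal integration with the closed form exp(-λt). The failure rate and time values are hypothetical.

```python
import math


def reliability_exponential(t, failure_rate):
    """Closed-form R(t) = exp(-failure_rate * t) for an exponential density."""
    return math.exp(-failure_rate * t)


def reliability_numeric(pdf, t, upper, steps=100_000):
    """Approximate the integral of pdf(x) from t to `upper` (trapezoidal rule).

    `upper` stands in for infinity and must be large enough that the
    remaining tail of the density is negligible.
    """
    h = (upper - t) / steps
    total = 0.5 * (pdf(t) + pdf(upper))
    for i in range(1, steps):
        total += pdf(t + i * h)
    return total * h


failure_rate = 0.001  # hypothetical: failures per hour


def pdf(x):
    """Exponential failure probability density f(x)."""
    return failure_rate * math.exp(-failure_rate * x)


r_closed = reliability_exponential(500.0, failure_rate)
r_numeric = reliability_numeric(pdf, 500.0, upper=50_000.0)
print(f"closed form: {r_closed:.4f}, numeric: {r_numeric:.4f}")
```

The numeric route works for any density that can be evaluated pointwise; the closed form is only available for particular distributions such as the exponential.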

ACCELERATED TESTING

The purpose of accelerated life testing is to induce field failure in the laboratory at a much faster rate by providing a harsher, but nonetheless representative, environment.

In such a test, the product is expected to fail in the lab just as it would have failed in the field—but in much less time.

The main objective of an accelerated test is either of the following:

To discover failure modes.

To predict the normal field life from the high stress lab life.
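The second objective, predicting field life from lab life, needs a model that links stress level to life. One widely used choice for temperature-driven failure mechanisms is the Arrhenius model; it is used below purely as an illustrative assumption, since these notes do not name a specific model. The activation energy, temperatures and lab life are hypothetical values.

```python
import math

BOLTZMANN_EV_PER_K = 8.617333262e-5  # Boltzmann constant in eV/K


def arrhenius_acceleration_factor(activation_energy_ev, use_temp_c, stress_temp_c):
    """Ratio of life at use temperature to life at stress temperature.

    Assumes the Arrhenius model: AF = exp((Ea / k) * (1/T_use - 1/T_stress)),
    with temperatures converted to kelvin.
    """
    use_k = use_temp_c + 273.15
    stress_k = stress_temp_c + 273.15
    exponent = (activation_energy_ev / BOLTZMANN_EV_PER_K) * (1.0 / use_k - 1.0 / stress_k)
    return math.exp(exponent)


def predicted_field_life(lab_life_hours, acceleration_factor):
    """Scale the life observed in the lab by the acceleration factor."""
    return lab_life_hours * acceleration_factor


# Hypothetical test: 0.7 eV activation energy, 40 C in the field, 85 C in the lab.
af = arrhenius_acceleration_factor(0.7, use_temp_c=40.0, stress_temp_c=85.0)
field_life = predicted_field_life(500.0, af)
print(f"acceleration factor ~ {af:.1f}, predicted field life ~ {field_life:.0f} h")
```

The prediction is only as good as the assumption that the harsher environment triggers the same failure mechanism as the field, which is why the stress must remain representative.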

An accelerated testing program can be broken down into the following steps:

Define objective and scope of the test

Collect required information about the product

Identify the stress(es)

Determine level of stress(es)

Conduct the accelerated test and analyze the collected data.

SOFTWARE RELIABILITY

Software reliability is a special aspect of reliability engineering. System reliability, by definition, includes all parts of the system, including hardware, software, supporting infrastructure (including critical external interfaces), operators and procedures. Traditionally, reliability engineering focuses on critical hardware parts of the system. Since the widespread use of digital integrated circuit technology, software has become an increasingly critical part of most electronics and, hence, nearly all present day systems.

Despite this difference in the source of failure between software and hardware, several software reliability models based on statistics have been proposed to quantify what we experience with software: the longer software is run, the higher the probability that it will eventually be used in an untested manner and exhibit a latent defect that results in a failure (Shooman 1987), (Musa 2005), (Denney 2005).

As with hardware, software reliability depends on good requirements, design and implementation. Software reliability engineering relies heavily on a disciplined software engineering process to anticipate and design against unintended consequences. There is more overlap between software quality engineering and software reliability engineering than between hardware quality and reliability. A good software development plan is a key aspect of the software reliability program. The software development plan describes the design and coding standards, peer reviews, unit tests, configuration management, software metrics and software models to be used during software development.

MEAN TIME BETWEEN FAILURES

Mean time between failures (MTBF) is the predicted elapsed time between inherent failures of a system during operation. MTBF can be calculated as the arithmetic mean (average) time between failures of a system.

FORMAL DEFINITION OF MTBF

By referring to the figure above, the MTBF is the sum of the operational periods divided by the number of observed failures.

If the "Down time" (with space) refers to the start of "downtime" (without space) and "up time" (with space) refers to the start of "uptime" (without space), the formula will be:

$$\text{MTBF} = \frac{\sum (\text{start of downtime} - \text{start of uptime})}{\text{number of failures}}$$

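The MTBF is often denoted by the Greek letter θ, or

$$\text{MTBF} = \theta$$

The MTBF can also be defined as the expected value of the time until failure:

$$\text{MTBF} = \int_0^{\infty} t\,f(t)\,dt$$

where f is the density function of time until failure, satisfying the standard requirement of density functions:

$$\int_0^{\infty} f(t)\,dt = 1$$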

The Overview

For each observation, downtime is the instantaneous time it went down, which is after (i.e. greater than) the moment it went up, uptime. The difference (downtime minus uptime) is the amount of time it was operating between these two events.
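The calculation described above can be sketched as follows: sum the operating periods (downtime minus uptime) and divide by the number of observed failures. The function name and the log values are hypothetical, and each observation is assumed to be a pair of timestamps in hours.

```python
def mtbf_from_observations(observations):
    """Return the MTBF for a list of (uptime_start, downtime_start) pairs.

    Each pair is one operating period that ended in a failure, so the
    number of pairs equals the number of observed failures.
    """
    if not observations:
        raise ValueError("need at least one observed failure")
    operating_time = 0.0
    for up_start, down_start in observations:
        if down_start <= up_start:
            raise ValueError("downtime must be later than uptime")
        operating_time += down_start - up_start
    return operating_time / len(observations)


# Hypothetical log in hours: the unit ran 120 h, 175 h and 90 h between failures.
log = [(0.0, 120.0), (125.0, 300.0), (310.0, 400.0)]
mtbf = mtbf_from_observations(log)
print(f"MTBF = {mtbf:.1f} h")
```

Note that the repair intervals (for example from 120 h to 125 h) are excluded, because only operating time counts toward the MTBF.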

MTBF value prediction is an important element in the development of products. Reliability engineers and design engineers often use reliability software to calculate a product's MTBF according to various methods and standards (MIL-HDBK-217F, Telcordia SR332, Siemens Norm, FIDES, UTE 80-810 (RDF2000), etc.).

However, these "prediction" methods are not intended to reflect fielded MTBF, as is commonly believed. The intent of these tools is to focus design efforts on the weak links in the design.

MTTR

MTTR is an abbreviation that has several different expansions, with greatly differing meanings.

It is wise to spell out exactly what is meant by this abbreviation, rather than assuming the reader will know which expansion is intended.

The M can stand for any of minimum, mean or maximum, and the R can stand for any of recovery, repair, respond, or restore.

The most common, mean, is also subject to interpretation, as there are many different ways in which a mean can be calculated.

Mean time to repair

Mean time to recovery/Mean time to restore

Mean time to respond

Mean time to replace

In an engineering context with no explicit definition, the engineering figure of merit, mean time to repair, would be the most probable intent by virtue of seniority of usage.

It is also similar in meaning to the others above (more in the case of recovery, less in the case of respond, the latter being more properly styled mean "response time").
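To show why the abbreviation should be spelled out, the sketch below applies several readings of the "M" to the same set of repair times. The data are hypothetical, and the choice of arithmetic, geometric and harmonic means is only an illustration of how differently a "mean" can be calculated.

```python
import statistics

# Hypothetical repair durations in hours; one long repair skews the data.
repair_hours = [0.5, 1.0, 1.5, 2.0, 20.0]

summary = {
    "minimum": min(repair_hours),
    "harmonic mean": statistics.harmonic_mean(repair_hours),
    "geometric mean": statistics.geometric_mean(repair_hours),
    "arithmetic mean": statistics.mean(repair_hours),
    "maximum": max(repair_hours),
}

for label, value in summary.items():
    print(f"{label:16s} {value:6.2f} h")
```

The same five repairs give figures that differ by more than an order of magnitude between the minimum and the maximum, and the three means also disagree noticeably, so a bare "MTTR" figure can be misleading without a stated definition.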
