Achieving Operational Excellence: A Proven Zero Equipment Failure Strategy with TPM.
- Catherine Converset
- Mar 11
Updated: Mar 25
By Luis Mosqueda – Senior Consultant – TPM Expert at Productivity Latin America

We often encounter claims that eliminating machine failures is impossible.
A. Zero Failure is possible. Look at the Airline Industry:
The airline industry boasts one of the highest reliability rates globally. Air accident statistics reveal that only one accident occurs for every 5.5 million flights, with 78% of these incidents unrelated to equipment failures. Specifically, pilot error accounts for 50%, weather for 12%, traffic control for 7%, and sabotage for 9%. The remaining 22% are linked to equipment failures, often due to defective components, design flaws, or deviations from established maintenance procedures. Importantly, accidents resulting from poor maintenance design are virtually non-existent.
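As a quick check on the figures above, the arithmetic can be worked through in a few lines. This is only a sketch that re-uses the numbers quoted in the paragraph; the variable names are illustrative, and the derived "flights per equipment-related accident" figure is an inference from those numbers, not a statistic taken from a separate source.

```python
# Figures quoted in the paragraph above; no new data is introduced here.
FLIGHTS_PER_ACCIDENT = 5_500_000

# Share of accidents by non-equipment cause, in percent, as quoted.
NON_EQUIPMENT_CAUSES = {
    "pilot error": 50,
    "weather": 12,
    "traffic control": 7,
    "sabotage": 9,
}

# The four causes should add up to the quoted 78%.
non_equipment_share = sum(NON_EQUIPMENT_CAUSES.values())

# What remains is the share linked to equipment failures.
equipment_share = 100 - non_equipment_share

# Derived figure: if only that share of accidents is equipment-related,
# how many flights occur per equipment-related accident?
flights_per_equipment_accident = FLIGHTS_PER_ACCIDENT * 100 / equipment_share

print(f"Non-equipment share: {non_equipment_share}%")
print(f"Equipment share: {equipment_share}%")
print(f"Flights per equipment-related accident: {flights_per_equipment_accident:,.0f}")
```

Taken at face value, the quoted figures imply roughly one equipment-related accident for every 25 million flights.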
The airline industry has been on this journey for over a century, and the proportion of accidents due to technical reasons has significantly decreased compared to those caused by human factors.
Pilots are trained to detect the slightest abnormalities, triggering immediate maintenance interventions. Although such incidents are rare, a comprehensive system exists to report these issues to those responsible for aircraft preventive maintenance. This system ensures that any necessary adjustments are made and communicated to airlines worldwide, resulting in continuous improvements to maintenance programs. The airline industry demonstrates that nearly zero failures are achievable with rigorous preventive maintenance, proper scheduling, and well-trained personnel.
This culture of constant vigilance and high-frequency checks has made near-zero failures a reality.
Similarly, industries can adopt such an approach, focusing on regular maintenance, skilled personnel, and a robust system to achieve reliability.
B. TPM: The Key to Achieving Zero Failure
Developed in Japan during the 1960s, TPM aims to maximize equipment reliability and productivity by engaging all employees, particularly operators, in maintenance activities. TPM gained global recognition in the late 1970s, and its core idea, that equipment failures can be prevented, has captivated many organizations worldwide.
To this end, TPM is organized into what are known as “TPM Pillars.” Each pillar involves a different group of people in the organization, and together they work toward the five TPM goals:
Zero Unplanned Downtime
Zero Speed Losses
Zero Defects
Zero Accidents
Minimum Lifecycle Cost
At the heart of TPM is a set of guiding pillars.

In my work as a TPM consultant, I have found that people see what they want to see and overlook many other ideas and concepts. The part of TPM that most captivates people trying to resolve machine problems is having operators perform maintenance tasks on their own equipment. This concept belongs to one of the five pillars on which TPM was built: Autonomous Maintenance. Because the concept is new to them, they assume that trying something different will finally resolve their problems.
C. Why Focusing Only on Autonomous Maintenance Can Lead to Failure
One of TPM's most misunderstood aspects is the Autonomous Maintenance concept, which encourages operators to perform basic maintenance tasks on their machines. While this is an essential component of TPM, it is not enough to eliminate equipment failures. Unfortunately, many organizations focus exclusively on Autonomous Maintenance, neglecting the foundational work required for success.
My research indicates that the concept of Autonomous Maintenance was born in Japan because, as part of the Total Quality Systems established there in the 1950s, plants placed all responsibility for operation and maintenance on the operators. The operator was responsible for producing, guaranteeing quality, and carrying out maintenance, and had full authority to stop the equipment whenever they considered it necessary to correct the process or the equipment. In this, they had the full support of the company’s management.
In the case of maintenance, it was assumed that if the operator was the person who spent the most time close to the equipment, they should be able to identify when any adjustments or replacement of parts should be made to keep the machine operating. This presupposes that an operator will remain operating the same machine for many years and gain experience to do what is necessary when required.
When these concepts were imported to the Western world, the visible practices were adopted without an understanding of everything behind them. The Japanese system doesn't work if operators change or rotate continually, because they never gain the necessary experience. What happens when the experienced operator leaves the company and a new operator arrives?
Would it take several years for the new operator to gain the same experience as the previous operator? And during this period, would the equipment experience failures again? This system is effective only if the operator can stop the equipment upon detecting a problem and if the company’s management provides adequate support to purchase and replace necessary parts as required.
The concept of Autonomous Maintenance was introduced in Japan under Total Quality Systems in the 1950s. Operators were empowered to stop the machine if they detected a fault. This system works well in environments where operators remain on the same equipment for long periods, gaining expertise. However, in environments with high turnover or rotation, operators may not develop sufficient expertise, leading to failures and a reliance on reactive maintenance.
D. Autonomous Maintenance vs. Preventive Maintenance: Two Sides of the Same Coin
The role of preventive maintenance in equipment reliability cannot be overstated. Preventive maintenance engineers define maintenance schedules based on historical data, such as failure records and operational conditions. This ensures that every component is maintained at the right time, preventing unnecessary failures.
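To make the idea of deriving a schedule from failure records concrete, here is a minimal sketch. It assumes one simple rule of thumb: schedule preventive maintenance at a fixed fraction of the mean time between recorded failures. That rule, the `safety_fraction` parameter, the function names, and the sample dates are all hypothetical illustrations, not the method described in this article; in practice the right interval also depends on operating conditions and on how failures are distributed over time.

```python
from datetime import date
from statistics import mean


def mean_days_between_failures(failure_dates):
    """Average number of days between consecutive recorded failures."""
    ordered = sorted(failure_dates)
    if len(ordered) < 2:
        raise ValueError("need at least two failure records")
    gaps = [(later - earlier).days for earlier, later in zip(ordered, ordered[1:])]
    return mean(gaps)


def suggested_pm_interval_days(failure_dates, safety_fraction=0.5):
    """Hypothetical rule: intervene well before the average failure gap.

    safety_fraction is an assumed tuning value, not an industry standard.
    """
    return mean_days_between_failures(failure_dates) * safety_fraction


# Illustrative failure history for a single component (made-up dates).
history = [date(2024, 1, 10), date(2024, 3, 10), date(2024, 5, 9)]

print(mean_days_between_failures(history))   # average gap in days
print(suggested_pm_interval_days(history))   # suggested PM interval in days
```

With this made-up history the failures are 60 days apart on average, so the sketch suggests an intervention every 30 days. A preventive maintenance engineer would revise that figure as new records and operator observations come in.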
In contrast, Autonomous Maintenance, when executed effectively, focuses on basic operator-driven activities like cleaning, lubrication, and identifying abnormalities. However, these activities should always be supported by a preventive maintenance engineer, who ensures the broader maintenance strategy is adhered to and that abnormalities are analyzed for root causes.
I often encounter companies that are interested in implementing TPM but want to implement only one small part of it, Autonomous Maintenance, without understanding many of its fundamentals and principles.
Consider an analogy: TPM is like a five-floor building in which each floor corresponds to one of the five TPM pillars. The building needs a foundation, which must be laid before the first floor is built; only then can the other floors follow. The first floor cannot be built without the foundation, nor the second floor without the first. In the same way, no TPM pillar can function if the foundation is not there.
In TPM, the first pillar to build is the Maintenance Management System, because it develops a series of work subsystems that are vital not only to this pillar but to all the other pillars of TPM. Once this first pillar has been built, we can build the next one, Autonomous Maintenance. In other words, I can't build the second floor if I don't build the first floor first.
TPM is more than just a set of pillars and tools. It involves a structured, strategic approach that begins with building a robust Maintenance Management System (MMS) as the foundation. Without this, the pillars cannot function effectively.
E. The Solution: A Robust Maintenance Management System
Building a strong Maintenance Management System (MMS), the first pillar of TPM, is the cornerstone of achieving zero equipment failures. This system ensures that periodic maintenance is carried out effectively and at the lowest possible cost.
An MMS goes beyond merely creating a set of procedures; it establishes a work system that supports long-term reliability by defining maintenance schedules, tracking performance, and continuously improving processes. Without an effective MMS, the other pillars of TPM cannot be implemented properly, and equipment failures will continue to undermine productivity.
Perhaps the most significant obstacle to understanding the Maintenance Management System pillar is a lack of clarity about what it means to have a work “system,” so we will begin by establishing a definition and the implications of operating an effective work system.
First, we define a “work system” as the set of functions, methods, and procedures through which a group of specially assigned people performs a series of daily work routines to achieve a specific objective. This objective must have clear parameters or indicators, with corresponding goals, that allow the effectiveness of the work performed to be evaluated.
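The requirement that an objective carry indicators with corresponding goals can be sketched as a small data structure. This is only an illustration: the `Indicator` class, its fields, and the sample indicators and values are hypothetical and are not taken from any particular maintenance system.

```python
from dataclasses import dataclass


@dataclass
class Indicator:
    """One measurable parameter of a work system, with its goal."""
    name: str
    goal: float
    actual: float
    lower_is_better: bool = True

    def goal_met(self) -> bool:
        # Compare in the direction that counts as "good" for this indicator.
        if self.lower_is_better:
            return self.actual <= self.goal
        return self.actual >= self.goal


# Made-up monthly values for two illustrative indicators.
indicators = [
    Indicator("unplanned downtime hours per month", goal=0.0, actual=3.5),
    Indicator(
        "preventive work orders completed on time (%)",
        goal=95.0,
        actual=97.0,
        lower_is_better=False,
    ),
]

# Indicators that missed their goal point to where the work system needs attention.
unmet = [indicator.name for indicator in indicators if not indicator.goal_met()]
print(unmet)
```

Here the downtime indicator misses its goal of zero while the on-time completion indicator meets its goal, which is the kind of evaluation the definition above calls for.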
Over the years, we have had the opportunity to study many work systems in different companies. Some cover certain parts of the maintenance system very effectively but have serious deficiencies in other parts that are equally important. At Productivity, we have conceived a maintenance management system with three groups of subsystems:
The first group of subsystems contains all the elements that constitute the Basic Infrastructure, which allows the system to operate at later stages and feeds information that will be used in other pillars of TPM.
With the second group of subsystems, the Basic Operating System of the Maintenance Management System is built and put into operation.
The third group of subsystems defines the Optimized Maintenance Management System.
Developing the first two groups creates the infrastructure needed so that information received from operators is properly processed to correct abnormalities, incorporate new preventive maintenance activities, or adjust existing ones. Autonomous Maintenance is then no longer expected to eliminate equipment failures on its own; instead, it sustains equipment that is already in good condition and improves preventive maintenance.