What is Pareto Analysis?
Pareto Analysis (also referred to as Pareto Chart or Pareto Diagram) is one of the Seven Basic Quality Tools[1] for process improvement.
These seven basic tools form the fixed set of visual exercises most helpful in troubleshooting issues related to quality. They are called “basic” because they require little formal training in statistics and can effectively address most quality-related problems.
This article covers Pareto Analysis in detail: its origin, how a Pareto Chart is constructed, and its relevance to modern-day problem-solving.
Origin of Pareto Analysis
Pareto Analysis traces its roots to the Pareto Principle, first observed by the Italian sociologist and economist Vilfredo Pareto.[2] While studying the distribution of wealth and income in Italy in 1896, Pareto showed that approximately 80% of the land in the country was owned by 20% of the population.
Pareto’s work was later extended by Joseph Juran[3], an American engineer and well-known quality management advocate.
Juran theorized that losses are never uniformly distributed over the quality characteristics. Rather they are always maldistributed in such a way that a small percentage of the quality characteristics always contributes a high percentage of the quality loss.
This forms the basis of the Pareto Principle, which, in simple words, means “for many outcomes, roughly 80% of consequences come from 20% of causes”.
This principle is also known as the 80/20 Rule (the most common name), the Law of the Vital Few, or the Principle of Factor Sparsity. All three names refer to the same idea.
It is important to note that although many systems tend to follow an approximate 80-20 pattern, this is not an absolute or obligatory rule. The distribution can vary, such as 90-10 or 70-30, while still adhering to the underlying principle that many outcomes are a result of a few causes.
Why use Pareto Analysis?
Organizations have diverse goals and aspirations but, in most cases, are constrained by limited resources (money, manpower, machines, technology, etc.).
Under such limitations, Pareto Analysis can help create maximum impact with the least amount of effort. This enables teams to work more efficiently on specific initiatives. Targets can be achieved faster simply by prioritizing initiatives in the right order.
Other benefits include:
- Setting clear priorities for the organization
- Increased daily productivity
- Ability to portion work into manageable segments
- Focused strategy
Pareto Analysis optimizes the overall organization’s performance by pointing to the highest return activities that can be pursued for maximum benefits.
When to use a Pareto Analysis?
The most compelling use case of a Pareto Analysis is to optimize the utilization of an organization’s resources by focusing them on a few key areas rather than spreading them over many others that have little impact on results.
Pareto Analysis helps identify patterns that highlight the main reasons behind most of the challenges an organization is trying to solve.
But, to perform a Pareto Analysis, the process data must fulfill two criteria:
1) It must be possible to arrange the data into categories
The core part of the analysis involves breaking down complex issues into their constituent root causes so that the repeating causes can be prioritized according to the number of occurrences.
For example, if an automobile assembly manager wants to analyze failures in the production line, it must be possible to classify them into categories such as equipment failures, material shortages, worker absenteeism, and quality defects.
Without this classification, it will not be possible to prioritize one category over another.
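As a rough illustration of this first criterion, the sketch below sorts free-text failure notes into categories and counts them. The keyword rules and the log entries are hypothetical, invented only for this example, and a real production line would need its own classification scheme. A large "Uncategorized" count would be a sign that the data cannot yet be arranged into meaningful categories.

```python
from collections import Counter

# Hypothetical keyword rules mapping free-text failure notes to categories.
CATEGORY_KEYWORDS = {
    "Equipment failures": ("conveyor", "motor", "press"),
    "Material shortages": ("out of stock", "shortage"),
    "Worker absenteeism": ("absent", "no-show"),
    "Quality defects": ("defect", "rework"),
}


def classify(note: str) -> str:
    """Return the first category whose keyword appears in the note."""
    text = note.lower()
    for category, keywords in CATEGORY_KEYWORDS.items():
        if any(keyword in text for keyword in keywords):
            return category
    return "Uncategorized"


def count_by_category(notes: list[str]) -> Counter:
    """Count how many notes fall into each category."""
    return Counter(classify(note) for note in notes)


# Hypothetical log entries, invented for illustration.
notes = [
    "Conveyor motor stalled",
    "Bolt shortage at station 3",
    "Operator absent, line understaffed",
    "Paint defect found, sent for rework",
    "Press jammed",
    "Unexplained stoppage",
]
counts = count_by_category(notes)
for category, count in counts.most_common():
    print(f"{category}: {count}")
```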
2) The ranking of the categories should matter
If the ranking of categories does not matter, the frequency data is no longer relevant for decision-making.
For example, suppose the assembly line manager has already determined that the top two categories, equipment-related issues and material-related issues, must both be addressed to reduce delays. The exact frequency difference between these two categories then becomes less important, because the focus is on resolving both as the primary contributors to the delays.
In this case, the primary goal is to address the critical categories and minimize their impact, regardless of the precise order between them.
A Pareto Analysis will not add value in such a case.
Components of a Pareto Chart
A Pareto Chart is a combination of a bar graph and a line graph. It consists of four major components:
X-Axis | The categories of data are plotted along the X-axis. In the example used in this article, the categories are the causes of delay in an assembly line manufacturing setup. |
Y-Axis | The occurrences of each category are plotted along the Y-axis. This represents the number of recorded cases where a particular cause led to a delay. |
Ranked bars arranged in descending order | The height of the bar represents the frequency of occurrence of a particular category. These bars must be arranged in descending order. |
Cumulative percentage curve | Shows the cumulative percentage (on the y-axis) while traversing the categories from left to right. A secondary Y-axis with a 0 to 100% scale is used to plot the cumulative line graph. |
Constructing a Pareto Chart
To demonstrate the components and process of building a Pareto diagram, consider the example of a company that is facing delays in shipping products due to various problems in its production line.
The company has limited resources to spare and cannot focus on all the root causes. It must judiciously allot resources (manpower, management attention, funds etc.) such that chances of on-time delivery are maximized.
By performing a Pareto Analysis, the resulting Pareto Chart can help the company make the best use of its resources. The process involves the following steps:
Step 1 – Decide on the categories
Categories are the list of causes/events that contribute to a problem being addressed. This could be gathered through feedback from employees, clients, or customers.
It is important that the list of causes identified accurately reflects the issue. One analytical approach to preparing a root cause list is a Five Whys analysis.
In the case of the example considered, the company has identified a total of 12 causes that have led to the delays, which are:
(1) Equipment failures | (7) Labor disputes |
(2) Material shortages | (8) Design flaws |
(3) Worker absenteeism | (9) Supplier issues |
(4) Quality defects | (10) Equipment maintenance |
(5) Power outages | (11) Natural disasters |
(6) Transportation delays | (12) Environmental issues |
Step 2 – Establish a measurement metric
The next step is to identify a measurement metric that is most appropriate to the grouped categories.
These could range from the number of product defects per batch or the frequency of customer complaints to the resources needed to manufacture a product or the time taken to resolve customer complaints.
In the case of the above example, the number of times a particular root cause was responsible for the delay is a good metric to consider.
Step 3 – Choose a timeframe to collect the data
This can be one work cycle, a sprint, one full day, one week, one month etc. In the case of the example, the company has chosen to record one week’s data.
It is important to choose a sufficiently broad timeframe to even out the impact of rare events and aberrations. In the example, if a single day were selected instead of a week, the data might not accurately capture the trends in material shortages or transportation delays. This could lead to misleading results.
Step 4 – Record the data over the selected timeline
Gather data on the number of times each chosen category was responsible for the delay. The measurement metric in this example is the frequency, which was selected in Step 2. A different metric may apply in other cases.
The recorded data must then be organized in a table according to the categories and timelines selected. In the case of the company’s example, the selected timeline is one week with 12 categories. Hence, the data tabulation will look as below:
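For readers working outside a spreadsheet, the same kind of tabulation can be sketched in Python. The snippet below turns a list of recorded (day, category) events into a table with one row per category and one column per day. The event list is hypothetical and uses only three of the twelve categories to stay short. The function name `tabulate` is likewise an assumption made for this illustration.

```python
DAYS = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]
CATEGORIES = ["Equipment failures", "Material shortages", "Worker absenteeism"]


def tabulate(events, categories=CATEGORIES, days=DAYS):
    """Build {category: [count per day]} from (day, category) events."""
    table = {category: [0] * len(days) for category in categories}
    for day, category in events:
        table[category][days.index(day)] += 1
    return table


# Hypothetical recorded delays, invented for illustration.
events = [
    ("Mon", "Equipment failures"),
    ("Mon", "Material shortages"),
    ("Tue", "Equipment failures"),
    ("Wed", "Worker absenteeism"),
    ("Wed", "Equipment failures"),
    ("Fri", "Material shortages"),
]
table = tabulate(events)
for category, daily in table.items():
    print(f"{category:<20}", daily)
```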
Step 5 – Organize the data
The output of this step is a table with categories sorted in descending order as per their occurrence over the selected period.
In the example case considered, the steps would involve aggregating the occurrences by adding totals to each category, constructing a smaller table (or hiding the daily data) and then sorting the data in descending order based on occurrences of each root cause.
The same is shown in the figure below:
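The aggregation and sorting described in this step can also be sketched in a few lines of Python. The daily counts below are hypothetical and cover only four categories. `rank_categories` is an illustrative helper name, not part of any library.

```python
def rank_categories(daily_counts):
    """Sum the daily counts per category and sort by total, largest first."""
    totals = {category: sum(days) for category, days in daily_counts.items()}
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


# Hypothetical one-week record (Mon..Sun), invented for illustration.
daily_counts = {
    "Material shortages": [1, 0, 2, 1, 0, 1, 0],
    "Equipment failures": [2, 3, 1, 2, 1, 0, 1],
    "Quality defects": [0, 1, 0, 0, 1, 0, 0],
    "Worker absenteeism": [1, 0, 0, 1, 0, 1, 0],
}
ranked = rank_categories(daily_counts)
for category, total in ranked:
    print(f"{category}: {total}")
```

Because Python's `sorted` is stable, categories with equal totals keep the order in which they were recorded.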
Step 6 – Calculate cumulative percentages
Cumulative percentages can be calculated using any of the spreadsheet applications. In the figure below, cumulative percentages are calculated for the example case using Google Sheets. The formula used can be seen in the image.
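The underlying calculation is simple: for each category in the sorted list, add its count to the running total of all categories before it, then divide by the grand total. A minimal Python sketch, assuming the counts are already sorted in descending order, might look like this. The counts are hypothetical and the rounding to one decimal place is a presentation choice.

```python
from itertools import accumulate


def cumulative_percentages(counts):
    """Return the running share of the total, in percent, for sorted counts."""
    total = sum(counts)
    if total == 0:
        raise ValueError("counts must contain at least one occurrence")
    return [round(100 * running / total, 1) for running in accumulate(counts)]


# Hypothetical occurrence counts, already sorted in descending order.
counts = [40, 25, 15, 10, 6, 4]
percentages = cumulative_percentages(counts)
print(percentages)
```

The last value is always 100%, which is a quick sanity check on the calculation.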
Step 7 – Construct the graph
This step involves graphing the data. A spreadsheet tool can be conveniently used to plot a bar graph (occurrences) and a line graph (cumulative percentage).
In Google Sheets, this can be performed using a Combo Chart[4], which achieves both objectives with the following steps:
- Select the data in the table
- Go to Insert -> Chart
- From the Chart editor, change the chart type to Combo Chart[4]
- Set X-axis data as a category – In the example case “Reason for Delay”
- Set left Y-axis data as the frequency of occurrence – In the example case, the column of occurrence counts for each reason
- Set the right Y-axis data as the cumulative percentage
With this, the Pareto chart will be ready and will look as shown below:
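Where no charting tool is at hand, the two series of a Pareto chart (ranked bars and cumulative percentage) can be approximated in plain text. The sketch below is only a rough stand-in for the Combo Chart: it assumes the input is already sorted in descending order, uses hypothetical counts, and draws each bar with `#` characters scaled to the largest category.

```python
def render_pareto(ranked, width=30):
    """Render ranked (category, count) pairs as text bars with cumulative %."""
    total = sum(count for _, count in ranked)
    top = ranked[0][1]  # largest count, since input is sorted descending
    label_width = max(len(name) for name, _ in ranked)
    lines = []
    running = 0
    for name, count in ranked:
        running += count
        bar = "#" * round(width * count / top)
        lines.append(
            f"{name:<{label_width}} | {bar:<{width}} | {count:>3} | "
            f"{100 * running / total:5.1f}%"
        )
    return "\n".join(lines)


# Hypothetical counts, already sorted in descending order.
ranked = [
    ("Equipment failures", 40),
    ("Material shortages", 25),
    ("Worker absenteeism", 15),
    ("Quality defects", 10),
    ("Power outages", 6),
    ("Transportation delays", 4),
]
chart = render_pareto(ranked)
print(chart)
```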
Interpreting a Pareto Chart
Once a Pareto Chart is constructed, clarity emerges on which few out of the many occurrences have the most impact on the results.
The cumulative percentage curve makes it easier to visually answer the question – “Which 20% of the causes are responsible for 80% of the results?”
It can be seen from the example case chart above that out of 12 identified causes, just 3 contribute to over 80% of the delays.
With this, the company has a clearer picture of where to focus its efforts and deploy resources.
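The same reading of the chart can be expressed as a short calculation: walk down the ranked list and stop as soon as the cumulative share reaches the chosen threshold. The sketch below assumes a threshold of 80% and uses hypothetical counts, not the data from the article's example. `vital_few` is an illustrative name.

```python
def vital_few(ranked, threshold=80.0):
    """Return the top categories whose cumulative share reaches the threshold."""
    total = sum(count for _, count in ranked)
    selected = []
    running = 0
    for name, count in ranked:
        selected.append(name)
        running += count
        if 100 * running / total >= threshold:
            break
    return selected


# Hypothetical counts, already sorted in descending order.
ranked = [
    ("Equipment failures", 40),
    ("Material shortages", 25),
    ("Worker absenteeism", 15),
    ("Quality defects", 10),
    ("Power outages", 6),
    ("Transportation delays", 4),
]
focus = vital_few(ranked)
print(focus)
```

With these made-up numbers, three of the six categories account for 80% of the occurrences. Real data may split quite differently, as noted in the discussion of the 80/20 rule.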
Limitations of Pareto Analysis
Although Pareto Analysis is a potent visual problem-solving tool, it does have certain limitations, such as:
Data quality: If the data is compromised by errors, inconsistencies, biases, or missing values, results can be misleading or inaccurate, leading to wrong decisions.
Root cause analysis: Pareto Analysis helps to identify the frequency and impact of different causes of a problem, but it does not provide any insight into the underlying reasons or mechanisms behind them.
For example, a Pareto Chart may show that half of all problems occur in shipping and receiving, but it does not explain why that is the case. To find out the root causes, additional tools such as the Five Whys or Fishbone diagrams are needed.
Qualitative aspects: Pareto Analysis focuses on quantitative data, such as the number of occurrences or the percentage of the impact of different causes. However, it does not account for qualitative aspects, such as the severity, urgency, or complexity of the causes.
For example, a Pareto Chart may show that equipment failures are the most frequent cause of delays, but it does not indicate how severe or difficult to fix those failures are. Qualitative aspects may affect the priority and feasibility of addressing different causes.
Future scenarios: Pareto Analysis is based on past or present data, which may not reflect future scenarios or changes in parameters.
For example, a Pareto Chart may show that supplier issues are a minor cause of delays, but that may change if the supplier changes its policies or prices. Pareto Analysis does not account for uncertainty or variability in the data or the environment. Therefore, it should be updated regularly and supplemented with other tools such as scenario analysis or risk analysis.
Varied Applications of Pareto Analysis
Pareto Analysis, with its versatile nature, finds applications across diverse industries and sectors. It aids in identifying critical issues, prioritizing tasks, and allocating resources effectively.
Some example use cases of Pareto Analysis are:
Manufacturing | Identifying major quality issues in production |
Retail | Determining the most frequent customer complaints |
Healthcare | Identifying the main causes of patient safety incidents |
Logistics | Analysing the key factors leading to delivery delays |
Banking | Identifying the most common sources of customer complaints |
Software | Identifying the top software bugs affecting users |
Hospitality | Determining the primary reasons for guest dissatisfaction |
Construction | Analysing the main causes of project delays |
Telecommunications | Identifying the major reasons for service outages |
Energy | Determining the primary sources of equipment failures |
Agriculture | Identifying the main factors affecting crop yield |
Transportation | Analysing the primary causes of accidents or breakdowns |
Education | Identifying the major obstacles to student learning outcomes |
Insurance | Determining the most common causes of claim denials |
Food Industry | Identifying the main sources of product contamination |
Pareto Analysis can even be applied at an individual level to identify the vital few factors that significantly impact personal productivity, thus aiding in effective time management, focused improvements, and better decision-making.
Sources
1. “THE 7 BASIC QUALITY TOOLS FOR PROCESS IMPROVEMENT”. American Society for Quality (ASQ), https://asq.org/quality-resources/seven-basic-quality-tools. Accessed 29 Jun 2023
2. “Vilfredo Pareto”. Britannica, https://www.britannica.com/biography/Vilfredo-Pareto#ref37223. Accessed 01 Jul 2023
3. “Dr. Joseph M. Juran”. Juran.com, https://www.juran.com/about-us/dr-jurans-history/. Accessed 01 Jul 2023
4. “Combo chart”. Google, https://support.google.com/docs/answer/9142593?sjid=5335603720602667500-AP#combo_chart&zippy=%2Ccombo-chart. Accessed 30 Jun 2023