
Process Capability Analysis

Ali Suleiman, Mechanical Design Engineer

18 Lessons (1h 40m)
  • 1. Introduction (1:49)
  • 2. Sec01 Lec01 Data Distributions (4:01)
  • 3. Sec01 Lec02 Normal Distribution (2:28)
  • 4. Sec01 Lec03 Normality Testing (5:38)
  • 5. Sec01 Lec04 Mean & Variance (9:49)
  • 6. Sec01 Lec05 Measures of Performance (2:33)
  • 7. Sec02 Lec01 Process Definition (7:53)
  • 8. Sec02 Lec02 The Sigma Levels (4:44)
  • 9. Sec02 Lec03 Cp and Cpk (12:55)
  • 10. Sec02 Lec04 Cpm (5:59)
  • 11. Sec02 Lec05 Pp and Ppk (6:52)
  • 12. Sec02 Lec06 Statistical Process Control (10:50)
  • 13. Sec03 Lec01 The Optimal Distribution (2:46)
  • 14. Sec03 Lec02 The Loss Function (3:15)
  • 15. Sec03 Lec03 The P Diagram (2:40)
  • 16. Sec03 Lec04 Sources of Variation (1:56)
  • 17. Sec03 Lec05 The Optimization Algorithm (9:36)
  • 18. Sec03 Lec06 Implementation (4:14)

About This Class

A complete course that helps you jump from beginner to expert level in process capability analysis.

The course covers all topics along with a range of examples and applications using either Excel or Minitab.

All capability indices are included: Cp, Cpk, Cpm, Pp, Ppk.

The course also explains data distributions, then brings more focus to the normal distribution.

All six sigma levels are explained in detail.

The course also covers process capability optimization techniques and algorithms.

Transcripts

1. Introduction: If you would like to learn process capability analysis from A to Z, then you are definitely in the right place. This course has all topics covered, and upon completion you will be able to measure, analyze, and optimize any process. The course has no prerequisites, as the first section takes care of that and gets you ready to jump into the main topic. The course is also full of real-life examples on almost every topic discussed. Furthermore, you will learn how to use Excel and Minitab through all the stages of conducting a process capability analysis. As for the course content, Section One will warm you up with some basic topics related to data distributions, especially the normal distribution. It will also explain how to perform normality testing, along with a clear illustration of the mean and variance as measures of performance. Section Two will start by defining the term process and explaining the six different sigma levels of performance. It will then introduce all the capability indices, their definitions, and how to calculate them. The section ends by introducing what is called statistical process control. Section Three will focus on process capability optimization by explaining the loss function, P diagrams, sources of variation, and the optimization algorithm. At the end of the course, we will explain how to implement optimization measures most efficiently and achieve a goal. 2. Sec01 Lec01 Data Distributions: Each time you perform a process capability analysis you will be dealing with sets of data; accordingly, it is very important to understand data distributions. Almost everything in our world can generate data. The entity which generates the data is often called the source. Let's start directly with an example, considering a small garden of cucumber plants. In this case, our source of data is the cucumber. Each source has one or more parameters. In our example, the parameters of a cucumber can be its length, diameter, or others. Each parameter will generate an outcome. In our example, let's say that the length parameter has generated 25 centimeters. However, due to variation factors, the parameter will generate more than one outcome. In our example, variation factors can be fertility or seed quality, etc. Per these variation factors, assume that the length has generated a total of three outcomes: 15, 20, and 25 centimeters. Note that each of these outcomes can be generated more than one time. We represent the number of times an outcome is generated by its frequency. In our example, let's say that the quantity of cucumbers with 15 centimeters length is 30, cucumbers of 20 centimeters are 45, and those of 25 centimeters are 14. Note that outcomes are often represented by ranges in order to reduce their number. In our example, if we consider outcomes from one centimeter up to 35 centimeters, then we will have 35 outcomes. However, if we group them into ranges, say 0 to 5 centimeters, then 5 to 10 centimeters, and so on, we will have only seven outcomes for the same example. Assume we have the following outcomes represented in ranges and that the listed frequencies are assigned to them. The best way to illustrate data distributions is by using the histogram, where the outcomes are listed on the horizontal axis and the vertical axis represents the frequency. The frequency plots in a histogram are illustrated as bars.
If we draw a line connecting the top midpoints of each two successive bars, then the shape of the overall polyline will represent the type of distribution. Having that said, there are many types of data distributions, such as the normal distribution, which is close in shape to the distribution in our example. Another example of a normal distribution can be the heights in a population. We also have the chi-square distribution, which is often used in studies related to confidence levels. There is also the exponential distribution; an example of such a distribution is the life of a battery. 3. Sec01 Lec02 Normal Distribution: In the previous lecture, we mentioned that there are many types of data distributions. The most popular type among them is the normal distribution, which can also be called the Gaussian distribution. The importance of this type lies behind the fact that it applies to a wide range of applications around us, including engineering ones. To understand the popularity of this distribution better, you must first know that any working system which is somehow under control is set to a specific nominal value. Let's take flowers as an example. Assume that the typical day for the flowers to bloom is the 21st of March; then this day, the 21st, is the nominal value. Having that said, you will see that most of the flowers bloom on this day. Due to variation factors, like the distribution of minerals in the soil, you will see that relatively fewer flowers bloom on day 20 or 22. As we are speaking about a controlled system that is set to a nominal value, the probability of having different outcomes becomes lower and lower as we shift away from this value. This leads to a normal distribution of the timing for flowers to bloom. Take the following video as an example of what has been explained: all the falling balls are set to fall through the path in the middle. However, due to the variation factor represented by the random collisions among the balls, they start shifting away from the middle path at relatively lower quantities, thus forming a normal distribution. If we come to give a definition for a typical normal distribution, we would say that it has a bell shape. This shape is symmetrical around a central value that possesses the highest frequency, and other values have gradually lower frequencies. We will come to know more and more about this distribution through the course. 4. Sec01 Lec03 Normality Testing: In this lecture, we will show how to simply check if your data is normally distributed. Previously, people were obliged to go through a series of calculations in order to check the normality of a distribution. However, nowadays things have become much easier with the existence of dedicated software. In this lecture, we will show how to simply check the normality of your data distribution using either Excel or Minitab. Let's first take two groups of data, A and B, as examples to work on. In this video, we will learn how to check if our data is normally distributed or not using Excel. As you can see, I have the data from both groups already recorded in the worksheet. The first thing to be careful about is that the outcomes need to be in ascending order, like those in Group B. In case your outcomes are not in ascending order, like those in Group A, what we need to do is mark the outcomes along with the frequencies, then right-click, go to Sort, then select Sort Smallest to Largest.
The next step is to mark the frequency column along with its title, then go to Insert and choose Clustered Column, so you get the histogram. We will do the same for Group B as well: mark the frequency column along with its title, then go to Insert and choose Clustered Column again to get the second histogram. If we compare the two histograms, we will see that the data in Group A somehow fits the bell-shaped curve of a normal distribution, so we can confirm that the data in Group A is normally distributed. As for Group B, we can clearly see that the data does not fit the bell-shaped curve, thus confirming that the data in this group is not normally distributed. In this video, we will show how to check if the data is normally distributed or not using Minitab. The good thing about Minitab is that you don't need to worry about the outcomes being in ascending order, as the software automatically orders them for you. So for now, I will keep the outcomes in Group A in this random order. The first way to check if your data is normally distributed is by histogram. We go to Graph, choose Histogram, then Histogram With Fit, and press OK. Now in this field we choose Outcome A, then go to Data Options, go to the Frequency tab, choose Frequency A, press OK, and OK again, and you get the histogram. We do the same for Group B as well: go to Histogram, Histogram With Fit, this time choosing Outcome B, go to Data Options, Frequency tab, choose Frequency B, and you get the second histogram. Minitab shows the histograms of the data distributions along with the bell-shaped curves so you can make a direct comparison. In Group A, we can see that the distribution somehow fits the bell-shaped curve, confirming that the data is normally distributed. As for Group B, we can see that the distribution does not fit the bell-shaped curve, so we can say that the data in Group B is not normally distributed. The second way to check if your data is normally distributed or not in Minitab is by performing a normality test. We go to Stat, Basic Statistics, and choose Normality Test. Here we deal only with the frequency columns, so we choose Frequency A, press OK, and you get the probability plot. We do the same for Group B as well: Stat, Basic Statistics, Normality Test, this time choosing Frequency B, so we get the second plot. If we come to Group A, we see that the points are nearly linear, which confirms a normal distribution. However, if we come to Group B, we see that the points are scattered randomly around the straight line, which confirms that the data in Group B is not normally distributed.
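
For readers who also work in Python, the same normality check can be sketched in a few lines; this is an illustration with made-up numbers and uses SciPy's Shapiro-Wilk test rather than the Minitab probability plot shown above, so treat it as a rough equivalent, not part of the course material.

```python
# Sketch: normality check for two hypothetical groups of outcomes.
# Uses SciPy's Shapiro-Wilk test; the data values are illustrative only.
import numpy as np
from scipy import stats

group_a = np.array([4.8, 4.9, 5.0, 5.0, 5.1, 5.0, 5.2, 4.9, 5.1, 5.0])  # roughly bell-shaped
group_b = np.array([4.0, 5.5, 4.1, 5.9, 4.2, 5.8, 4.0, 5.9, 4.1, 5.7])  # clearly not

for name, data in [("Group A", group_a), ("Group B", group_b)]:
    stat, p = stats.shapiro(data)   # H0: the data come from a normal distribution
    verdict = "looks normal" if p > 0.05 else "not normal"
    print(f"{name}: W={stat:.3f}, p={p:.3f} -> {verdict}")
```
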
5. Sec01 Lec04 Mean & Variance: The mean and variance are two basic terms in statistics, yet they are the most important ones, especially when performing a process capability analysis. In this lecture we will define each of these two terms separately and show how to calculate them for any data set manually and using either Excel or Minitab. The mean simply represents the average value in a data set. Assume that you have a parameter X where x_i represents an outcome value and that we have n outcomes. In this case, the formula for the mean is the sum of all outcome values divided by n. Let's recall the example related to the small garden of cucumbers, saying that the length parameter has generated the following outcomes. The mean value in this case is the sum of these outcome values divided by seven, thus equal to 24.8 centimeters. Note that for a perfectly shaped normal distribution, the highest frequency is expected to be at the mean outcome value, and the distribution is centered over it. Sometimes you may encounter a scenario where there is an outcome that happens to be very different from the rest of the outcomes. This outcome is called an outlier. Outliers may exist due to an unusual or unlikely event that happened during the process. When calculating the mean in a data set where an outlier exists, the calculation will lead to an inaccurate result which does not represent the real case. If we recall our example again, assuming that the length parameter has generated the following eight outcomes, we clearly see that the outcome value 52 is far away from the rest of the outcomes. If we do the calculation of the mean including this outlier, as you can see, the mean value is not representative anymore. The good practice is to ignore this outlier and not include it in the calculation, in order to have a mean value that better represents the data set. As a bottom line, it is highly recommended to check for outliers before you calculate the mean. The best way to check for outliers is by deploying a scatter plot. The variance in statistics is a measure of data dispersion from the mean and is designated by sigma squared. In other words, the variance mainly tells you how big the average spread of outcome values is compared to the mean value. Say you have a parameter X where x_i is an outcome value, and take a data set composed of a number of outcomes along with a defined mean value. In that case, we can define the distance between each outcome value and the mean by x_i minus mu. So the distance between the first outcome and the mean will be x_1 minus mu, and so on. As you will always have outcome values which are smaller than the mean, the distance for each of them will be negative. To avoid that, we square all the distances in the data set. If we come to calculate the variance, which represents the average squared distance, it will be the sum of these squared distances divided by the number of outcomes. In order to have the average distance without being squared, we calculate the square root of the variance, which is called the standard deviation. Having that said, the standard deviation is the actual average spread of the outcome values from the mean value. Coming back to our example with these seven outcomes, we calculate the squared distances individually as shown; the variance is the sum divided by seven, thus equal to 75.55. Accordingly, the standard deviation in this case is equal to 8.69. Note that in all our previous calculations we assumed that each outcome happened once; in other words, the frequency for each of those outcomes was equal to one, which is not the usual case. Now we will calculate the mean and variance for the outcomes, but this time including the frequencies in the calculation. Assume that these seven outcomes have the following frequencies; then the mean value will be equal to the sum of these outcomes, each multiplied by its frequency, all divided by the total number of frequencies.
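
As a side note, the frequency-weighted mean and variance described here can be expressed compactly in Python; the outcome and frequency values below are made up for illustration and are not the course's cucumber data.

```python
# Sketch: frequency-weighted mean, variance and standard deviation.
# The outcome/frequency values below are illustrative, not the course's data.
import numpy as np

outcomes    = np.array([15.0, 18.0, 20.0, 24.0, 27.0, 30.0, 34.0])  # hypothetical lengths (cm)
frequencies = np.array([   3,    5,    8,   12,    9,    4,    2])  # how often each occurred

n        = frequencies.sum()
mean     = (outcomes * frequencies).sum() / n
variance = (frequencies * (outcomes - mean) ** 2).sum() / n
std_dev  = np.sqrt(variance)

print(f"mean = {mean:.2f} cm, variance = {variance:.2f}, std dev = {std_dev:.2f}")
```
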
Accordingly, the mean value in our case will be 24.88. If we come to calculate the variance, then each of the squared distances should also be multiplied by its frequency, and the sum divided by the total number of frequencies. In this case, the variance will be equal to 57.95 and the standard deviation will be equal to 7.61. As a bottom line, when the outcomes have frequencies of more than one, the mean and variance are calculated with the following equations. For calculating the mean and the variance in Excel, I have prepared the data in the way that you see here in the worksheet. The first thing we need to do is calculate the mean, so I go to this column where I need to calculate the multiplication of outcomes by frequencies. I type equals outcome multiplied by frequency, then drag it down to the last row. Here, for the mean, I type equals the sum of all the fields in this column over the sum of all frequencies, and here we get the mean. The next thing is to calculate the variance and standard deviation. The first step is to calculate the distance: I type equals outcome minus the mean value, but be sure here to put the dollar symbol, then press Enter and drag it down to the last row. Next I get the distance squared, so I type equals the distance squared and drag it down to the last row as well. Now, in this column I multiply the squared distance by the frequency, so it is equals squared distance multiplied by the frequency, and I drag it down as well. For calculating the variance, I type equals SUM, open parentheses, and mark all the fields in this column, over the sum of frequencies. To calculate the standard deviation, I take the square root of the variance: I type equals SQRT, choose the function, select the variance, and press Enter. As you can see, we have now calculated the mean, the variance, and the standard deviation. In this video, I will be calculating the mean and the standard deviation using Minitab. Just know that such calculations are much easier with Minitab. As you can see, I have the whole data set in the worksheet. I go to Graph, choose Histogram, then select Histogram With Fit and press OK. In this field I choose the outcome column, then go to Data Options, choose the Frequency tab, and in this field I choose the frequency column, press OK, and OK again. As you can see, you have the histogram, and you have the mean value, the standard deviation, and even the total number of frequencies. 6. Sec01 Lec05 Measures of Performance: This lecture will illustrate how the mean and variance are taken as measures of performance. To start, let's take two archers and call them Archer A and Archer B. Each of the two archers is allowed to shoot nine arrows only. At the end of the contest, we had the following results. If we look at Archer A, we see that his arrows are evenly distributed around the target, so we would say that he has accuracy in his shooting. However, his arrows seem to be widely spread around the target, which means that he lacks control, or stability, in his shooting. If we assume that the arrows are normally distributed, then the histogram will look like this, where there is a wide dispersion; however, the distribution is almost symmetrical over the target point. If we look at Archer B, we see that the arrows are not evenly distributed compared to the target point.
So we would say that he has no accuracy in his shooting. However, the spread of arrows is relatively small, which means that he has more control, or stability, in his shooting. Assuming a normal distribution of arrows, the histogram will look like this, where the dispersion is small; however, the central axis of the distribution is relatively far from the target point. From this example, we can conclude that there are two measures of performance: accuracy and stability. For a normal distribution, accuracy is represented by the distance between the target value and the mean. As for stability, it can be represented by the dispersion, thus by the standard deviation. For that reason, both the distance of the mean from the target and the standard deviation are considered measures of performance; in order to have better performance, both measures should be minimized. 7. Sec02 Lec01 Process Definition: By definition, a process is a series of interrelated tasks that together transform inputs into a desired output. The output of a process is usually designated with the letter Y. If we consider the process as the source, then Y here is an output parameter, and due to variation factors this parameter will generate multiple outcomes at variable frequencies. Let's take this metal cutting machine as an example of a process, with a long steel bar as the input. The machine has been set to cut this long bar into smaller pieces, each five centimeters long. The value five centimeters is called the nominal value, which is the target value required by the engineers who are running this process. If the process has been set correctly, the outcomes which are close to this value are expected to have the highest frequency, and other outcomes should have gradually lower frequencies the farther they are from it. Having that said, most manufacturing processes are expected to have a normal distribution of outcomes. Now assume that at the end of this process we have the following outcomes along with the shown frequencies. If we put the data in a histogram, we see that the outcomes of this process are almost normally distributed. As engineers acknowledge the fact that the parts will be produced at variable outcomes, they define a tolerance range in which the produced parts are to be accepted, otherwise to be rejected. The maximum value of this range is called the upper specification limit, or USL, and the minimum value is called the lower specification limit, or LSL. Assume that the engineers have set the acceptable length for the produced parts to be five centimeters with a 0.2 tolerance range. In that case, the nominal value is 5, the USL is 5.2, and the LSL is 4.8. If we put the two specification limits on our histogram, we see that two outcomes are out of the acceptable range. These out-of-range outcomes are often called defects and are eventually scrapped. If we would like to calculate the scrap rate in our process, it will be equal to the sum of frequencies related to defective outcomes divided by the total sum of frequencies. In our case, the scrap rate is equal to 12.5%.
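
Before the Excel and Minitab walkthroughs that follow, here is a minimal Python sketch of the same idea: counting the out-of-spec frequencies to get a scrap rate and marking the specification limits on a simple bar chart. The numbers are illustrative, not the course data.

```python
# Sketch: scrap rate from outcomes, frequencies and specification limits.
# Outcome/frequency values are illustrative only.
import numpy as np
import matplotlib.pyplot as plt

outcomes    = np.array([4.7, 4.8, 4.9, 5.0, 5.1, 5.2, 5.3])   # measured lengths (cm)
frequencies = np.array([  1,   3,   8,  12,   7,   4,   1])   # how often each occurred
LSL, USL = 4.8, 5.2                                            # specification limits

defective  = frequencies[(outcomes < LSL) | (outcomes > USL)].sum()
scrap_rate = defective / frequencies.sum()
print(f"scrap rate = {scrap_rate:.1%}")

plt.bar(outcomes, frequencies, width=0.05)
plt.axvline(LSL, linestyle="--", label="LSL")
plt.axvline(USL, linestyle="--", label="USL")
plt.legend(); plt.show()
```
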
Now we will show how to visualize the specification limits on a histogram through either Excel or Minitab. In Excel, unfortunately, there is no official way to visualize the specification limits on a histogram; however, in this video we will work around that in the following way. The first thing is to make sure that your outcomes are in ascending order and that you deploy the table in the way shown here. What we do first is choose a frequency value which is a bit higher than the highest value in the frequency column; we have 12, so let's choose 15. We put 15 in the LSL column at 4.8, which is the lower specification limit. We do the same for the USL column, putting it at 5.2. The second step is to mark all three columns, then go to Insert and choose Clustered Column. As you can see, we have our histogram in orange and the two specification limits in two different colors. The next step is to change the color of each of them, so I will choose this color. We can also add data labels; I change this one to LSL and the other one to USL. As you can see, we now have our histogram with the two specification limits visualized. In case the outcomes are not shown in the histogram, as you can see here, all we need to do is select the histogram, right-click, choose Select Data, press Edit, then mark the outcomes and press OK. As you can see, the outcomes are now shown in the histogram. Visualizing the specification limits on a histogram in Minitab is very simple. First let's deploy the histogram: we go to Graph and choose Histogram, then Histogram With Fit. Here I choose the outcome column, then go to Data Options, select the Frequency tab, and in this field I choose the frequency column, press OK, and OK again, to get the histogram. Now I need to add the specification limits, so I click on the graph, right-click, go to Add, then Add Reference Lines. Here I need to show the specification limits on the horizontal axis, so I choose this field, then enter the first value, 4.8, a space, then 5.2, and press OK. As you can see, I now have both specification limits. If you would like to change their labels, you can do that as well, putting LSL here and USL there. So, as you can see, I now have the histogram with both specification limits on it. 8. Sec02 Lec02 The Sigma Levels: If we take a typical normal distribution and set each of the two specification limits to be three sigma away from the center, then 99.73% of the distribution will fall within the specification limits and 0.27% will fall outside. Assume that a car manufacturing company is producing 1,000 cars a day, which is a very normal quantity in the automotive industry, and that each car is sold for $50,000. If this company is producing the cars with a three sigma process distribution, then 99.73% of the cars will be released to the market and 0.27% of the cars will be scrapped. In that case, the scrapped cars out of the daily production will be 2.7 cars, which is worth $135,000. That is a big amount of money lost on a daily basis. If you think that this is bad, the worse is yet to come. Our previous calculation was based on a process that was implemented and set at a three sigma level, but that is for the short term, as processes cannot maintain the same performance level forever; their performance is expected to degrade gradually over time. It is widely adopted that processes are expected to shift by 1.5 sigma in the long term. In other words, a process which is at a three sigma level in the short term is expected to be at a 1.5 sigma level in the long term.
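
The short-term and long-term percentages quoted in this lecture can be reproduced with the normal CDF; a minimal Python sketch, assuming the usual 1.5 sigma long-term shift, is shown below.

```python
# Sketch: fraction of a normal distribution inside +/-3 sigma limits,
# short term (centered) and long term (shifted by 1.5 sigma), plus DPMO.
from scipy.stats import norm

sigma_level = 3.0
shift = 1.5  # widely adopted long-term shift

short_term = norm.cdf(sigma_level) - norm.cdf(-sigma_level)
long_term  = norm.cdf(sigma_level - shift) - norm.cdf(-sigma_level - shift)

print(f"short term inside spec: {short_term:.4%}")             # ~99.73%
print(f"long term inside spec:  {long_term:.4%}")              # ~93.32%
print(f"long term DPMO: {(1 - long_term) * 1_000_000:,.0f}")   # ~66,800
```
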
In such a case, 93.32% will fall within the specification limits and 6.68% will fall outside. If we come to our example again, it means that in the long term the scrapped cars out of the daily production will be 66.8 cars, which is equivalent to $3,340,000 in value. As we are speaking about industries which produce big quantities, let's introduce what is called DPMO, which stands for defects per million opportunities. Simply, the term refers to the quantity of defects out of one million produced parts. So if you recall the three sigma distribution again, in the long term its DPMO is equal to 66,800 defective parts. In order for manufacturing industries to reduce the money loss, they tend to improve their processes toward a higher sigma level. Note that sigma levels are mainly measures of quality, and the loss in quality is then translated into loss of money, as we previously showed in the example. On the contrary, if a process is improved toward a higher sigma level, then quality will increase, leading customers to have more trust in the product, thus increasing sales and profit. As a bottom line, almost all successful industries adopt continuous improvement strategies in order to optimize their sigma level; for that reason, they hire Six Sigma practitioners to help them with that optimization. For your reference, in this table I list the six sigma levels along with their percentiles and DPMO for the short and long terms. 9. Sec02 Lec03 Cp and Cpk: In this lecture, we will introduce the two process capability indices Cp and Cpk. Before starting, let's describe explicitly what actually happens to the distribution when its sigma level is improved. For that, let's recall the three sigma distribution which we have shown previously. Note that the position of each of the specification limits is always fixed and does not change with the change of sigma level. In other words, if we designate the distance between each specification limit and the center with small d, then the distance d will always be constant. When a process is improved to a higher sigma level, what changes is only the value of the standard deviation. In that manner, if we assume an improvement from a three sigma level to a four sigma level, the value of the standard deviation will decrease, allowing the distance d to be equal to four sigma. To make things clear, let's assume that we have this metallic bar, which is set to a nominal value of 10 centimeters along with a three-centimeter tolerance range in each direction. In that aspect, the upper specification limit is 13 centimeters and the lower specification limit is 7 centimeters; in this case, the distance between any of the specification limits and the center is three centimeters. Assume that the standard deviation for the produced batch was equal to one; in that accordance, the distance d, which is equal to three, will also be equal to three sigma, confirming the process performance to be at a three sigma level. Assume that after performing some improvements on the process, the standard deviation decreased from 1 to 0.75. In that accordance, the distance d stays the same, equal to three, but this time it will be equal to four sigma. Having that said, we can confirm that by decreasing the standard deviation from 1 to 0.75 the process has been improved from a three sigma level to a four sigma level. We mentioned previously that a typical normal distribution will fall within a three sigma level.
For that reason, we will always take the three sigma distribution as a reference when evaluating any distribution. Let's take this process distribution, which is at a four sigma level, to be under evaluation. In order to perform this evaluation, we introduce the performance index Cp, which is equal to the distance between the specification limits for the process under evaluation divided by that of a three sigma process distribution. Having that said, the Cp index will be equal to 1.33. To recap, the equation for calculating the Cp index, taking a three sigma distribution as a reference, is USL minus LSL divided by six sigma. In our previous example, we assumed that the distribution was perfectly centered over the required nominal value, but that is an ideal case. In such a perfect case, the nominal value and the mean value are equal. However, in reality, the distribution will not be centered over the nominal value; in other words, the nominal value and the mean value will not be equal. In such a case, if we keep using the capability index Cp to evaluate the process performance, our evaluation will be less representative. For that reason, the capability index Cpk has been introduced. The Cpk calculation ignores the nominal value and adopts the distance between the specification limits and the mean value. As we are taking the mean value instead of the nominal one, the tolerance range will not be symmetrical anymore. This means that we cannot perform the calculation the same way as we did with the Cp index, taking the whole range at once and dividing by six sigma. Accordingly, for Cpk we take each region separately and divide each of them by three sigma. In that aspect, Cpk will be equal to the lower of the two regions' values. Recalling our example about the metallic bar again, assume that the standard deviation is still equal to one and that the mean value is equal to 10.7. Performing the calculation for each region, we get two values: 0.76 and 1.23. In that case, we adopt the lower value; thus, the Cpk will be equal to 0.76. To summarize what has been explained, we can say that for a centered distribution the Cp is calculated by dividing the whole range by six sigma; in a centered distribution, Cpk can also be calculated through its formula, but it will yield the same value as Cp. In the case of a non-centered distribution, the Cp index is not representative anymore for evaluating the performance, and Cpk is to be used. As a conclusion, Cpk can be used in both cases, so we don't need to worry about how the distribution is centered; in that accordance, Cpk is the more widely adopted process capability index among industries. For your reference, I list the Cpk values for the six different sigma levels for the short and long terms. Assume that we have the following CNC drilling machine, which we are using to create multiple identical holes in metallic plates. The total number of holes to be drilled on each plate is 24. Our goal here is to check the drilling process performance using the Cpk index. The required nominal diameter for each hole is 20 millimeters and the tolerance range is set to plus/minus 0.5 millimeters. So we took one of the plates and measured each of the 24 holes; the outcomes and their frequencies are listed in the following table. We will show how to perform the capability analysis on this example using either Excel or Minitab.
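
Alongside the Excel and Minitab walkthroughs that follow, here is a minimal Python sketch of the same Cp and Cpk arithmetic; the hole diameters are randomly generated placeholders, not the measurements from the course's table.

```python
# Sketch: Cp and Cpk from a sample of outcomes.
# The diameters below are illustrative, not the hole measurements from the course.
import numpy as np

def cp_cpk(data, lsl, usl):
    mean, sigma = data.mean(), data.std()        # population standard deviation
    cp  = (usl - lsl) / (6 * sigma)
    cpk = min((usl - mean) / (3 * sigma), (mean - lsl) / (3 * sigma))
    return cp, cpk

diameters = np.random.default_rng(0).normal(loc=20.05, scale=0.15, size=24)  # hypothetical
cp, cpk = cp_cpk(diameters, lsl=19.5, usl=20.5)
print(f"Cp = {cp:.2f}, Cpk = {cpk:.2f}")
```
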
As you can see, I have copied the table of outcomes and frequencies into the worksheet. I have also listed the nominal value, 20 millimeters, and the tolerance range, plus/minus 0.5. I have also included the long-term table with the Cpk values and the sigma levels so we can use it for comparison later on. First, I just confirm the USL and LSL values: the USL will be 20.5 and the LSL will be 19.5. Second, I need to list all the outcomes in this column. I go to the first outcome, which is 19 and is mentioned one time, so I list it here. Then I go to the second outcome, 19.5, which is mentioned three times, so I list it three times in the column. The next one, 19.9, is mentioned six times, so I have it six times in the column, and so on for all the outcomes. As you can see, I have all the outcomes listed in the column. What I need to do now is calculate the mean, so I type equals SUM, open parentheses, mark all the outcomes, close the parentheses, and divide by 24. For the standard deviation, I type equals STDEV, choose the first function, mark all the outcomes, and close the parentheses. Now that I have the mean and the standard deviation, I can start calculating the Cpk values. I start with the upper region: I type equals, open parentheses, the USL minus the mean value, close the parentheses, over, open parentheses, three times the standard deviation, which is three sigma, and press Enter. I do the same for the lower region: equals, open parentheses, the mean value minus the LSL, over three sigma, which is three times the standard deviation, and close the parentheses. Now, as you can see, I have both Cpk values: here we have 0.43 and here we have 0.42, and we choose the lower one. That means our Cpk value is 0.42. If we compare this Cpk value to our table, we can see that it is close to the three sigma level, so we can confirm that the performance of our process is at a three sigma level. Now for Minitab: as you can see, I have copied all the outcomes into the Minitab worksheet. We go to Stat, choose Quality Tools, go to Capability Analysis, and choose Normal. In this window, I choose the all-outcomes column to be in this field. Then in the subgroup size I type 24, in the lower spec field I put the LSL, which is 19.5, and in the upper spec field I put the USL, which is 20.5, then press OK, and you have all the information here. As you can see, here is the value of the Cpk: 0.41. It is very close to the value we calculated in Excel, which was 0.42; the small difference is just related to how many digits after the decimal point were taken into the calculations. In Excel we took more digits after the decimal point, especially in the calculation of the standard deviation and the mean value. 10. Sec02 Lec04 Cpm: In this lecture, we will introduce the third capability index, Cpm. As the nominal value is the ideal value required by the engineers who run the process, let's call it the target value, designated with the letter T. We mentioned previously that the Cpk index does not take the target value of the process into consideration; by that, Cpk takes into account only the mean value.
Along with the defined specification limits. However, this is not actually the case. If you focus more on the formula, you will see that the Cpk calculation already assumes that the target value is the midpoint between the specification limits, and for that two symmetrical regions were adopted, each at the same distance of three sigma. Having that said, in case our tolerance range is not symmetrical over the target value, then Cpk will not be representative either. For that comes the Cpm index, which can also be called the Taguchi capability index; this index takes into account the proximity of the process mean to the target value. To calculate the Cpm index, the following equation is adopted. To recap, we have two main cases: the distribution being centered over the target value, and the case where the distribution is not centered on it. In case it is centered, we may use any of the capability indices, as the value of any of them will be equal. In case the distribution is not centered, we need to check whether the tolerance range is symmetrical over the target value or not. In case it is symmetrical, we can use either Cpk or Cpm; in case it is not symmetrical, it is highly recommended to use the Cpm index. Note that the sigma level tables provided previously fit any of the capability indices. Let's now recall the same example in order to show how the Cpm index can be calculated using either Excel or Minitab. Recalling our previous example, we have already calculated the mean and the standard deviation; this time we will calculate the Cpm index. I type equals, then open two parentheses; inside, I choose USL minus LSL. Then I put them over, and again open two parentheses; inside, I put six times SQRT, then choose the square root function. Inside the function I also include two parentheses; inside, I put the standard deviation squared, then a plus, and again open two parentheses and put the mean minus the target value, all squared. Then I close the main parentheses and press Enter. As you can see, we now have the Cpm index, almost 0.43. If we compare it to our long-term table, we can see that it is close to the three sigma level, so we can confirm that our process is at a three sigma level. To calculate the Cpm index in Minitab, we do exactly the same as we did before for Cp and Cpk: I go to Stat, then Quality Tools, then Capability Analysis, and I choose Normal. In this field I choose all outcomes, then in the subgroup size I put 24, for the lower spec 19.5, and for the upper spec 20.5. The only additional thing we need to do here is specify the target value, so I go to Options, and in this field I put the target value, 20. I press OK, and OK again, to get this window with all the information. The Cpm is located here: 0.42. It is close to the value we calculated in Excel, which was 0.43; this small difference is just related to how many digits after the decimal point were adopted in the calculation, especially for the mean and the standard deviation values.
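
For reference, the Cpm formula used in this lecture can also be sketched in Python; the data are again randomly generated placeholders rather than the course's measurements.

```python
# Sketch: Taguchi capability index Cpm, following the formula from this lecture:
# Cpm = (USL - LSL) / (6 * sqrt(sigma^2 + (mean - T)^2)). Data are illustrative.
import numpy as np

def cpm(data, lsl, usl, target):
    mean, sigma = data.mean(), data.std()
    return (usl - lsl) / (6 * np.sqrt(sigma**2 + (mean - target)**2))

diameters = np.random.default_rng(0).normal(loc=20.05, scale=0.15, size=24)  # hypothetical
print(f"Cpm = {cpm(diameters, lsl=19.5, usl=20.5, target=20.0):.2f}")
```
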
11. Sec02 Lec05 Pp and Ppk: In this lecture, we will introduce the two final indices, Pp and Ppk. If we compare Pp and Ppk to Cp and Cpk in regard to the formula, we will find no difference. The only difference is related to the data behind the standard deviation and the mean value. To make things clearer, let's consider the following cutting machine. The machine is required to produce metallic bars at a target value of 20 millimeters with a tolerance range of 0.3 millimeters in each direction. Let's say that the machine is able to produce seven bars every day and that the machine first started working on Monday. Having that said, we measured all the bars which were produced by this machine over the three days, and the following outcomes were recorded. Accordingly, we calculated the mean value and the standard deviation for the bars produced by the end of each day, thus having the following values. If we come to calculate the Cpk of the produced parts for each day, then we should be using the mean value and the standard deviation which are related to that day. Performing the calculations, we have the following Cpk values. Having these values, we now know what the process performance was on each of these days. But what if we would like to check the overall process performance across the three days? For that, we calculate the overall mean value and standard deviation for all 21 produced bars together. After that, we calculate the Cpk using the overall mean and standard deviation; thus, the overall Cpk value is 0.44. This overall Cpk is what is called Ppk. To summarize: the performance indices Pp and Ppk are calculated using the exact same formulas as Cp and Cpk. If the distribution is centered over T, then Pp and Ppk should be equal; if not, Pp and Ppk will be different, and it is better to use Ppk in this case. Pp and Ppk are used to study the process performance for the whole population; if we take a subgroup from that population, then our study on this subgroup will use Cp and Cpk as indices. As the calculation is typically the same between Cpk and Ppk, we will not show how Ppk is calculated using Excel; however, we will show it in Minitab, as there are some points to be explained.
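
Before the Minitab walkthrough, here is a small Python sketch of the per-day Cpk versus overall Ppk comparison described above; the bar lengths are made-up numbers, not the course's three days of measurements.

```python
# Sketch: per-day Cpk versus overall Ppk for three days of production.
# The bar lengths below are made-up numbers, not the course's measurements.
import numpy as np

def cpk(data, lsl, usl):
    mean, sigma = data.mean(), data.std()
    return min((usl - mean) / (3 * sigma), (mean - lsl) / (3 * sigma))

LSL, USL = 19.7, 20.3
rng = np.random.default_rng(1)
days = {d: rng.normal(20.05, 0.12, size=7) for d in ["Monday", "Tuesday", "Wednesday"]}

for day, bars in days.items():
    print(f"{day}: Cpk = {cpk(bars, LSL, USL):.2f}")           # subgroup index

all_bars = np.concatenate(list(days.values()))
print(f"Overall Ppk = {cpk(all_bars, LSL, USL):.2f}")           # same formula, all data
```
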
As you can see, I now have the data from the previous example, representing the produced bars from the three days together, so I have 21 outcomes. If I assume that this column is actually the whole population, then I can simply calculate the Cpk value on it, and this value will represent the Ppk as well. However, in some cases the size of the data will be big, and you may want to study Cpk and Ppk with subgroups without messing with this column each time you do the calculation, maybe adding or deleting values, just so you can work on Cpk or Ppk. There is a way to do that. I add one column here and call it Group, and I assign all the outcomes to the three groups that we have: a group related to Monday, then Tuesday, then Wednesday. So I call this one Monday and apply it to the first seven outcomes, then the second one is Tuesday, also for seven outcomes, and the last seven outcomes refer to Wednesday. You are not obliged to put it this way; you could split the groups by numbers, maybe Group 1, 2, 3, so you can decide on that. Now I go to Stat, then Quality Tools, Capability Analysis, and I choose Normal. Here I choose the outcomes column. However, for the subgroup size we have two options. Either you put the size of the group as a constant: in this case each group has seven outcomes, so I can put seven, and Minitab will automatically split the outcomes in the column seven by seven. Or, to be more precise, you can use the second column that we created, which refers to the group, so I refer to it as C2, which represents the groups. Here I again put the lower spec and the upper spec, so I have 19.7 and the upper spec being 20.3. I press OK and it gives me this window with all the information. As you can see, if you look at the Ppk, it is 0.44. 12. Sec02 Lec06 Statistical Process Control: In this lecture we will provide an illustration of what is called statistical process control, or SPC. Before that, let's first introduce the control chart, which mainly provides a visualization of the variation in outcomes over the production timeline. The vertical axis in this chart represents the outcome value, while the horizontal axis represents the timeline. Three horizontal reference lines should also be on the chart: the USL, the LSL, and the mean value. The outcomes are then plotted in chronological order, along with a polyline connecting them all. Note that the USL and LSL values in the chart may not be the same as those initially defined by the tolerance range: the USL and LSL values in a control chart are equal to the mean value plus or minus three sigma. In that manner, the steps for deploying a control chart are as follows: first, calculate the mean and standard deviation; then calculate the USL and LSL values as being equal to the mean plus or minus three sigma. After that, deploy the USL, LSL, and mean values as horizontal reference lines on the chart. The last step is to plot the outcomes in chronological order by the date of production. The control chart is a very handy tool, as it allows continuous monitoring of the process performance. It also allows you to quickly discover problems and identify their types in order to choose the proper corrective action. Control charts also allow you to make a prediction of the process performance in the near future. We mentioned that control charts can give signals if the process is out of control or tends to be; as a good practice, I will list some tips on how to identify those signals. The first sign can be one or more plotted points crossing the specification limits. The second sign can be two out of three successive points on the same side of the center line and more than two sigma away from it. A third sign can be four out of five successive points on the same side of the center line and more than one sigma away from it. A fourth sign can be having most of the outcomes from the same batch on the same side of the center line. A final sign can be any unusual, consistent pattern in the plot. Processes, when out of control, may have two types of problems. Problems caused by common causes are built into the process; such problems appear consistently on the control chart. Other problems are caused by special causes that just happen during operation, like a fluctuation in the power supply; such problems appear randomly on the chart, or consistently but within a limited period of time. If we would like to confirm numerically whether the process is under statistical control or not, we can directly compare the Cpk and Ppk values.
If we take any of the batches within the whole production, compare its Cpk to the Ppk value, and see that they are approximately equal, then our process is under statistical control. In case any of the batches has a Cpk value which is drastically different from the overall Ppk value, then we can say that the process is not under statistical control. Let's now see how we can build a control chart using either Excel or Minitab. As you can see, I have taken the outcomes from the previous example and prepared the worksheet in Excel in the way that you see. The first thing is to calculate the mean, so I type equals SUM, open parentheses, mark all the outcomes, close the parentheses, and divide by 21. The second thing is to calculate the standard deviation, so I type equals STDEV, choose the first function, mark all the outcomes, close the parentheses, and press Enter. Now that I have the mean value and the standard deviation, I can calculate the USL and the LSL. Let's first copy the mean value: I copy the mean value into this column and pull it over. Then I go to the USL value, which should be equal to the mean plus three sigma, so it is equals the mean value plus, open parentheses, three times the standard deviation; just don't forget to put the dollar sign in this way, and press Enter. I do the same for the LSL as well: equals the mean value minus, open parentheses, three times the standard deviation, close parentheses, and press Enter. What I do now is pull them down; here I forgot to put the dollar sign, so I add it, and now we pull it down. As you can see, I now have all the numerical information ready and can start building the control chart. I go to Insert, then choose the line chart, then go to Select Data and mark the whole table; maybe we mark the table without the timeline, and specify the timeline later. So here I specify the timeline: I mark the timeline and press OK, so that now you have the control chart. Maybe we can make the chart more eye-friendly, so I can change the color and the line type for the reference lines: I choose maybe this one, then change it to a dashed line, do the same for the second one, and for the mean value I just play with the line type in this way. We can also add the legend. So, as you can see, I now have the control chart ready.
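
Before moving to the Minitab walkthrough, here is a minimal Python sketch of the same individuals control chart; the outcomes are randomly generated placeholders, and the reference lines are labeled UCL and LCL, the more common names for the mean plus/minus three sigma limits used in this lecture.

```python
# Sketch: a basic individuals control chart with mean and +/-3 sigma limits.
# The outcome values are illustrative, not the course's data.
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(2)
outcomes = rng.normal(20.0, 0.1, size=21)       # 21 bars in production order
mean, sigma = outcomes.mean(), outcomes.std()
ucl, lcl = mean + 3 * sigma, mean - 3 * sigma   # control limits at mean +/- 3 sigma

plt.plot(range(1, 22), outcomes, marker="o")
plt.axhline(mean, color="green", label="mean")
plt.axhline(ucl, color="red", linestyle="--", label="UCL (mean + 3 sigma)")
plt.axhline(lcl, color="red", linestyle="--", label="LCL (mean - 3 sigma)")
plt.xlabel("production order"); plt.ylabel("outcome"); plt.legend(); plt.show()
```
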
I have copied the same outcomes into the Minitab worksheet. The first thing to do is go to Stat, then Control Charts, then Variables Charts for Individuals, and choose Individuals. Here I just choose the outcomes column and press OK. As you can see, I now have the control chart with the mean value and the USL and LSL reference lines. In case you have different batches and you would like to work with subgroups, we can do it in a different way: we go to Stat, Control Charts again, but choose Variables Charts for Subgroups, and then I choose Xbar-R. Here I again choose the outcomes column; for the subgroup sizes, I need to specify the number of outcomes in each group so that Minitab will automatically split all the outcomes into groups of the specified size. We know that each batch has seven outcomes, so we put the size as seven. Then we press OK. As you can see, Minitab has built two charts: one chart tracking the mean and one chart tracking the range, which is related to the standard deviation. The 1, 2, 3 numbers here refer to the groups, so this is batch one, batch two, and batch three. We can now see from the chart how the mean changes along the batches and also how the standard deviation changes along the batches. 13. Sec03 Lec01 The Optimal Distribution: In this lecture, we will illustrate the importance of the normal distribution compared to other types of distributions, especially when speaking about optimization. In the late 1970s, the well-known television manufacturer Sony experienced a decrease in sales for TVs produced in the USA as compared to those produced in Japan. An investigation report was made addressing the produced televisions from both facilities. The report identified four classes of TVs per color density: those of best color density were referred to as class A, and those with progressively lower color density were referred to as classes B, C, and D; the class D TVs are considered to be crossing the defined specification limits. The investigators found out that the TVs produced in the USA had a uniform distribution along with zero defects. On the other side, the TVs produced in Japan did have a small percentage of defective TVs; however, the overall production was normally distributed. If we interpret these observations, we can say that although the USA production was free from defects, customers were equally receiving TVs from the various classes. This made the TVs from classes B and C more visible in the market, thus creating a variation in quality from the perspective of the customer. On the other side, the production in Japan had the highest quantity of TVs from class A, and the other classes were less visible in the market. This led to a dominance of class A quality in the market and satisfied the majority of the customers. According to what preceded, our conclusion will be: first, it is not enough to have all your produced parts within the specification limits in order to satisfy the customer; the shape of the distribution also matters. Second, a normally distributed production leads to better quality which is visible to the customer, thus achieving their satisfaction. 14. Sec03 Lec02 The Loss Function: When engineers define the target value along with the tolerance range, they assume that if a produced part crosses that range it will be scrapped, as it will totally lose its function. This also means a total loss of the part's value in money. Let's designate the distance between each specification limit and the target value by delta zero. Let's also add a vertical axis representing the money value, and say that each produced part is worth $10. We will also designate the outcome values on the horizontal axis with the letter Y. In that aspect, if the part is located at any of the specification limits, then it has already lost all its financial value; we designate this value, which represents the total loss, with A zero. From a quality perspective, parts start losing financial value from the moment they depart from the target. Having that said, we can say that parts which are not on the target are already losing part of their financial value. If we draw a curve that represents this gradual money loss, this curve represents the loss function. The loss function, designated with the letter L, will be equal to k multiplied by the expected value of Y minus T, all squared. Here k is the loss rate and is equal to A zero over delta zero squared. As Y can only be determined in a statistical manner, the loss function will be as follows. In that accordance, to improve the process performance, the loss function should decrease. As the loss rate k and the target value T are always fixed, the only way to decrease the loss function is by minimizing the distance between the mean and the target value, that is, by modifying the mean value, and also by minimizing the value of the standard deviation. Visualizing what happens to the distribution when optimization is performed, we see that when the standard deviation is decreased, the curve is somehow squeezed toward the mean value, and when the distance between the mean and the target value is minimized, the distribution shifts toward the target, thus having fewer outcomes out of the specification limits. In the coming lectures, we will come to know how we can decrease both of them in order to optimize our process.
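
Since Y is treated statistically, the expected value in the loss function expands to L = k times (sigma squared plus (mean minus T) squared), with k = A0 over delta0 squared; a minimal Python sketch with made-up numbers follows.

```python
# Sketch: average Taguchi loss per part, L = k * (sigma^2 + (mean - T)^2),
# with k = A0 / delta0^2. The numbers are illustrative only.
A0, delta0 = 10.0, 0.2          # $10 total loss at a spec limit 0.2 away from target
k = A0 / delta0**2              # loss rate
T = 5.0                         # target value

def average_loss(mean, sigma):
    return k * (sigma**2 + (mean - T)**2)

print(f"centered, sigma=0.05: ${average_loss(5.00, 0.05):.2f} per part")
print(f"off-target by 0.1:    ${average_loss(5.10, 0.05):.2f} per part")
```
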
15. Sec03 Lec03 The P Diagram: The P diagram is an overall mapping of the process and is composed of the process block in the middle along with four entities around it. The entity to the left is for the inputs, which are designated with the letter M. The entity to the right is for the outputs, which are designated with the letter Y. The entity above represents the factors that affect the process and can be controlled; these factors are designated with the letter X. The entity below represents the factors that affect the process but cannot be controlled; these factors are often called noise, and they are designated with the letter Z. The output Y represents the measure of performance for the process and is directly related to the function required from that process. If we would like to represent the diagram mathematically, we would say that Y is equal to the function of the X factors, plus the function of the Z factors, plus the measurement error. The measurement error is related to any measuring tool used in the process, including the built-in measurement devices and also the measuring tool which you use to measure Y on the produced parts. Let's take the cutting machine as an example. The function required from this process is to have metallic bars cut at a specified length; in that accordance, the output Y is the length of the metallic bar. The input of this process is the raw bar which is inserted inside. The X factors can be the vertical speed, the chain belt rotational speed, or the design of the blade, etc.; all these factors affect the process but can still be controlled. The Z factors can be the wear-out of the blade or the environmental temperature in the facility; these factors are hard to control but still have their effect on the process. The P diagram is a crucial step in process optimization, as it allows you to define the factors which you need to work on in order to optimize the output of the process. 16. Sec03 Lec04 Sources of Variation: Recalling the P diagram from the previous lecture, let's speak about the various sources of variation which we may encounter when studying any process. Note that the variation in the output of any process is actually a combined result of the variations in its factors.
For that reason, if we minimize the variations in those factors, then we will consequently minimize the variation in the output. The first source of variation can be categorized as an external source and is mainly related to usage and environment. Examples of such a source can be temperature, or use, misuse, or abuse by the process operator. Such sources are reflected within the Z factors in the P diagram. The second source can be categorized as a unit-to-unit source, and it addresses the piece-to-piece difference that is caused by variation in manufacturing, where it is represented by the defined tolerances. Such sources fall within the X factors in the P diagram. The third source of variation can be categorized under deterioration, where it represents the wear-out of parts within the process over time. The reason for wearing out can be material fatigue, aging, operation, or others. Such sources may fall in any of the factors, whether in the X factors or the Z factors. In the next lecture, we will understand more about these variations. 17. Sec03 Lec05 The Optimization Algorithm: To understand the optimization algorithm better, we will directly apply it in a real-life example, so let's recall the cutting machine and have it as our example. The first step in the algorithm is to set our goal. Our goal here is to optimize the process by improving its Cpk value in order to achieve better yield and reduce scrap. The second step is to measure and evaluate the current status by defining the current Cpk of the process. This is an important step, as the current status will be taken as a reference to compare with after the optimization is implemented. Note that the optimization algorithm has two stages, where it starts at the system level, then flows down to the level of parameters. In that accordance, our third step is to check the possibility for optimization on the system level. Such optimization may be partial or may even affect the overall design of the process. An example of a system optimization can be a change of technology, where we can replace the blade with a laser cutting head, assuming that the laser technology is more precise, thus having less variation. Another example: assuming that the machine is semi-automatic, where it is to some level controlled by a human operator, we can upgrade the machine to be fully automatic, thus eliminating the variation coming from human interference. Note that such changes are mainly applicable during the development stage of the process, as changing them later on may be costly and time consuming. However, if the Cpk is too bad and the loss of money makes it worth doing, then system optimization may still be a good deal. When completing the system optimization stage, we can start working on the lower level, that is, optimizing the individual parameters within the process. For that, step number four will be to map all parameters through deploying the P diagram. Now assume that we have only two X factors. The first factor is related to the thickness of the blade, which lies in a fixture, where the thickness is 20 plus or minus 3 millimeters as a tolerance. The second X factor is the rotational speed of the chain belt, where it is powered by an electric motor having a defined nominal speed along with a precision of plus or minus five degrees per second. As for the Z factors, assume also that we have two of them: the first Z factor is the wear-out of the blade over time, and the second Z factor is the environmental temperature.
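Before moving to the experiments, here is a rough sketch, not from the course and purely one assumed way of writing it down in code, of what the step-four mapping of the cutting machine could look like, with the factors listed exactly as above.

```python
from dataclasses import dataclass, field

@dataclass
class PDiagram:
    inputs_m: list                                  # M: what enters the process
    output_y: str                                   # Y: the measure of performance
    x_factors: dict = field(default_factory=dict)   # controllable factors
    z_factors: list = field(default_factory=list)   # noise factors

cutting_machine = PDiagram(
    inputs_m=["raw metallic bar"],
    output_y="length of the cut bar (mm)",
    x_factors={
        # nominal value and tolerance as given in the example above
        "blade thickness (mm)": (20.0, 3.0),
        "chain belt rotational speed precision (deg/s)": 5.0,
    },
    z_factors=["blade wear-out over time", "environmental temperature"],
)

print(cutting_machine)
```

A plain table on paper serves the same purpose; the value is in forcing every M, Y, X, and Z entry to be written down before any experimental run is planned.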
We will start our optimization with minimizing the variation which is caused by the Z factors, and that is step number five. Speaking of the wear-out of the blade, we will plan several experimental runs using blades that are at different usage times. After that, we will analyze the standard deviation from each one, then evaluate the variation in the output as compared to the usage time of the blade. From that analysis, we will define the maximum usage time for a blade before replacing it with a new one, and have this specification within the control plan or instructions list of the process. As for the second Z factor, which is the environmental temperature, we would also perform several experimental runs for the process while recording the temperature level at which each one was performed. In that accordance, we will plot the output variation against the temperature levels, then define the best temperature level for the process. Having that defined, we may control the temperature to always stay at that level by introducing a certain air-conditioning system. After that, we go to step six, which is to tighten the tolerance range of the X factors. Currently, we have the blade thickness being at 20 plus or minus 3 millimeters. If we could receive blades with a more controlled thickness dimension, then this variation will be lower, thus minimizing the variation in the output. Note that as the tolerance is tightened, the cost of the blade will increase. In order to define the new tolerance which can improve the performance at the least cost, we can do multiple experimental runs with this dimension being at different tolerance ranges, then compare the output variation accordingly. As for the variation in the second X factor, which is the rotational speed, we can also perform multiple experimental runs using different motors at different precision levels, then compare the outcomes and make a decision on which motor is to be adopted. When we finish minimizing all the preceding variations, we can now go to step seven, where we work on the nominal values of the X factors. Note that the sensitivity of the process in regard to variation in the outcomes is related to the nominal values of the X factors. For example, shifting an X factor from one level to another may reduce the variation in the output. To be able to identify which factor to work on to reduce sensitivity, we will introduce what is so called the signal-to-noise ratio. This ratio is equal to 10 multiplied by the logarithm of the mean squared over the standard deviation squared. What we will do first is to specify two different levels for each of the X factors, where each level represents a specific chosen value for X. These levels will yield four possible combinations, where each combination practically reflects a certain change. For example, the first combination has both factors at level one, thus no change; the second combination has X1 being shifted to level two while X2 is still at level one, so this means a change in X1 only, and so on. We will then perform multiple experimental runs at all these combinations, then calculate the mean and standard deviation of the outcomes of each one. This will allow us to calculate the signal-to-noise ratio at each one. The factors which show a drastically higher signal-to-noise ratio are those which we can change in order to minimize the sensitivity toward the output variation.
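A minimal sketch of that calculation, S/N = 10 × log10(mean² / standard deviation²), for the four two-level combinations. The run measurements below are invented placeholders, not the course's data; they only show the arithmetic.

```python
import numpy as np

# Hypothetical measured outputs (mm) for the four factor-level combinations
runs = {
    "run 1 (X1 level 1, X2 level 1)": [19.8, 20.3, 20.1, 19.7, 20.2],
    "run 2 (X1 level 2, X2 level 1)": [20.0, 20.4, 19.9, 20.3, 19.8],
    "run 3 (X1 level 1, X2 level 2)": [20.1, 20.2, 20.1, 20.0, 20.2],
    "run 4 (X1 level 2, X2 level 2)": [19.9, 20.3, 20.0, 20.2, 20.1],
}

def signal_to_noise(values):
    """S/N = 10 * log10(mean^2 / std^2) for one experimental run."""
    y = np.asarray(values, dtype=float)
    return 10 * np.log10(y.mean() ** 2 / y.std(ddof=1) ** 2)

for name, values in runs.items():
    print(f"{name}: S/N = {signal_to_noise(values):.1f} dB")
```

A run with a small spread relative to its mean gives a high signal-to-noise ratio, which is exactly the property used to pick the factor to shift.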
In our case, the highest signal-to-noise ratio is at the third run, which is related to the change of X2. Accordingly, we take X2 as the factor whose nominal value we modify in order to reduce the sensitivity toward Y. Note that, along with the change of the nominal value of this factor, the mean value may shift further from the target value. So, as a second step, in order to bring the mean value back close to the target, we need again to modify the nominal value of one of the factors. For this purpose, we choose the factor with the least signal-to-noise ratio; in our case, it's X1. After performing all these optimization actions, we make sure to run the process with all of them combined, then compare the new Cpk value to the old one in order to prove our optimization measures. Note that any optimization that we perform may actually cost us money or time or both. For that reason, you should compare these expenses to the benefits that you will get from that optimization, so as to make sure it is worth being done. 18. Sec03 Lec06 Implementation: Capability optimization is usually performed while the machines are already running and producing parts. In that accordance, the sooner we optimize, the more money can be saved. In other words, the time needed for the optimization to be implemented is a very important aspect to be taken into consideration. In this lecture, we will show how you can plan your implementation in the most efficient way possible. We mentioned previously that when optimizing a process, we work first on the variation, which is represented by the standard deviation, then we re-adjust the mean. Let's say that the mean is already in a good position and that we are only working to reduce the variation. Assume that the current mean value which we have is 20.1, where the target value is set to 20 millimeters and our specification limits are three millimeters away from the target. Let's say that the current Cpk value is 1.5 and that our goal is to achieve a higher Cpk value, which is to be equal to two. Using the Cpk formula, we can calculate the current standard deviation, where we will get two values out of the calculation: 0.64 and 0.68. The correct standard deviation is the one consistent with the minimum term in the Cpk formula, that is, the one derived from the specification limit nearer to the mean; since the mean of 20.1 is nearer to the upper limit, the correct value is 0.64. As for the required standard deviation to achieve a Cpk value of two, we will do exactly the same calculation for this Cpk. We will again get two values for the standard deviation, 0.48 and 0.52, and we adopt the one from the nearer limit, which is 0.48. Now, as we have both the current and the required standard deviations, we can simply subtract them to know by how much we should reduce the current standard deviation. In our case, we should decrease the current standard deviation by 0.16. Next, we will list in a table all the optimization measures which we have defined through the optimization algorithm, and list their standard deviations as well. We will then subtract those standard deviations from the current one in order to get how much each measure reduces the current standard deviation. After that, we will specify the time needed for each measure to be implemented in the machine. The table is now ready, so we can choose which measures are to be implemented first in order to improve the process to the desired Cpk in the shortest time possible.
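A quick sketch of the standard-deviation calculation used in this lecture, assuming the standard Cpk formula Cpk = min(USL − mean, mean − LSL) / (3 × sigma); the function name is my own.

```python
mean, target = 20.1, 20.0
lsl, usl = target - 3.0, target + 3.0   # specification limits +/- 3 mm from the target

def sigma_for_cpk(cpk, mean, lsl, usl):
    """Sigma implied by Cpk = min(usl - mean, mean - lsl) / (3 * sigma)."""
    return min(usl - mean, mean - lsl) / (3 * cpk)

sigma_now = sigma_for_cpk(1.5, mean, lsl, usl)    # ~0.64, the current standard deviation
sigma_goal = sigma_for_cpk(2.0, mean, lsl, usl)   # ~0.48, the standard deviation required for Cpk = 2
print(f"current sigma  ~ {sigma_now:.2f}")
print(f"required sigma ~ {sigma_goal:.2f}")
print(f"needed reduction ~ {sigma_now - sigma_goal:.2f}")   # ~0.16
```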
Assume that these measures cannot be done in parallel and that we can only implement them in a successive way, one after another. We see that the third measure alone is enough for us to achieve the goal; however, it needs 60 days to be implemented. If we combine the other three measures together, we see that they will allow us to achieve the desired Cpk, and the total time for them to be implemented is only 22 days. In that accordance, we choose these three measures to be done first.
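As a hedged sketch of this selection step, the snippet below uses made-up reductions and durations for four measures; only the overall pattern (one large 60-day measure versus three smaller ones totalling 22 days) mirrors the lecture. It brute-forces the fastest combination that reaches the required 0.16 reduction, and adding the per-measure reductions linearly follows the simple subtraction used in the lecture's table.

```python
from itertools import combinations

# (name, sigma reduction, implementation days) -- hypothetical values for illustration
measures = [
    ("limit blade usage time",   0.06,  7),
    ("control room temperature", 0.05,  8),
    ("higher-precision motor",   0.16, 60),   # the single measure that is enough on its own
    ("tighter blade tolerance",  0.06,  7),
]
required_reduction = 0.16

best = None
for r in range(1, len(measures) + 1):
    for combo in combinations(measures, r):
        reduction = sum(m[1] for m in combo)
        days = sum(m[2] for m in combo)   # measures run one after another, so times add up
        if reduction >= required_reduction and (best is None or days < best[0]):
            best = (days, combo)

days, combo = best
print(f"fastest plan: {days} days ->", ", ".join(m[0] for m in combo))
```

With these assumed numbers, the brute-force search returns the three smaller measures at 22 days in total, which matches the reasoning above.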