Quality-influencing actions come from informed decisions. You cannot fix what you do not measure.
Statistical process control (SPC) has long been an important tactic for companies looking to ensure high product quality. In modern electronics manufacturing, however, the complexity involved violates SPC’s fundamental assumption of process stability. Combined with the ever-increasing amount of data collected, this makes traditional SPC worthless as a high-level approach to quality management. An approach aligned with the Lean Six Sigma philosophy, with a wider scope than SPC, is superior at identifying and prioritising relevant improvement initiatives.
SPC was introduced in the 1920s and designed to address the manufacturing of that era. One purpose of SPC was early detection of undesired behaviour, allowing for early intervention and improvement. Its limitations were set by the IT available at the time, a landscape completely different from today’s.
Tracing Moore’s Law backwards, it is easy to accept that not only IT, but also product complexity and capability, were very different then. Measurements from manufacturing operations of that era bear no comparison with today’s. Add to this complexity factors such as globalised markets driving up manufacturing volumes, and the result is that the amount of output data produced today is incomprehensible by 1920s standards.
Fundamental limitations
SPC appears to still hold an important position for original equipment manufacturers (OEMs). It is found in continuous manufacturing processes, calculating control limits and attempting to detect out-of-control process parameters. In theory, such control limits help visualise whether things are turning from good to worse. But theory is, well, theory!
A fundamental assumption of SPC is that you can remove or account for common cause variation in the process, so that all remaining variation is special cause. These are the parameters you need to worry about when they start to drift.
An electronics product today contains hundreds of components. It goes through many design modifications due to factors such as component obsolescence. It is tested at various stages during the assembly process, and features multiple firmware revisions, test software versions, test operators, variations in environmental conditions and so on.
Example of high dynamics
An example of high dynamics is Aidon, a manufacturer of smart metering products. According to Aidon’s head of production, Petri Ounila, an average production batch contains 10,000 units, with each unit containing over 350 electronic components, and experiences more than 35 component changes throughout the build process.
This gives Aidon a new product, or process, roughly every 280 units. In addition, there are changes to test processes, fixtures, test programs, instrumentation and more. The result is an estimated new process every ten units, or fewer. In other words, there are around 1,000 different processes in the manufacture of a single batch.
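A rough sanity check of those numbers, as a minimal sketch in Python (the batch size and change counts come from the example above; the variable names and the script itself are mine):

```python
batch_size = 10_000        # units in an average Aidon batch (from the example)
component_changes = 35     # component changes during that batch

units_per_change = batch_size / component_changes
print(round(units_per_change))   # ~286 -> "a new product roughly every 280 units"

# Folding in test process, fixture, program and instrumentation changes,
# the estimate above lands at one process change per ~10 units:
print(batch_size // 10)          # ~1,000 distinct processes in a single batch
```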
How would you even begin to eliminate common cause variations here? And should you even do so?
Even if you managed, how would you go about implementing the alarming system?
A method in SPC, developed by Western Electric Co. back in 1956, is known as the Western Electric Rules, or WECO. It specifies rules whose violation justifies investigation, depending on how far observations fall from the mean in terms of standard deviations. One problematic feature of WECO is that, with all rules applied, it triggers a false alarm roughly every 91.75 unrelated measurements on average.
False alarms everywhere!
Let us say you have an annual production output of 10,000 units. Each gets tested through five different processes. Each process has an average of 25 measurements. Combine these and you get up to 62 false alarms per day on average, assuming 220 working days per year.
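As a back-of-the-envelope check of that figure, here is a minimal sketch in Python (the production figures are the ones quoted above; ~91.75 samples per false alarm is the commonly cited average run length when all WECO rules are applied):

```python
annual_units = 10_000
test_processes = 5
measurements_per_process = 25
working_days = 220
weco_avg_run_length = 91.75   # average samples between false alarms, all WECO rules combined

measurements_per_day = annual_units * test_processes * measurements_per_process / working_days
false_alarms_per_day = measurements_per_day / weco_avg_run_length
print(round(false_alarms_per_day))   # ~62
```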
Let us repeat that. Assume that you, against all odds and reason, manage to remove common cause variation. You would still receive 62 alarms every day. People receiving 62 emails per day from a single source would likely block them, leaving potentially important notifications unacknowledged, with no follow-up.
SPC-savvy users will likely argue that there are ways to reduce this with new and improved analytical methods. They might say, “There are Nelson Rules, we have AIAG, you should definitely use Juran Rules. You need to identify auto-correlation structures to reduce the number of false alarms. What about the ground-breaking state-of-the-art chart developed in the early 2000s? Given it a go yet?”
Even if you manage to reduce the number of false alarms to five per day, could it represent a strategic alarming system? Adding actual process dynamics to the mix, can SPC produce a system that manufacturing managers rely on? One that keeps their concerns and ulcers at bay?
Enter KPIs
What most people do is make assumptions about a limited set of important parameters to monitor, and then track these carefully by plotting them on control charts, X-mR charts or whatever they use to try to separate the wheat from the chaff. These KPIs are often captured and analysed downstream in the manufacturing process, after multiple units have been combined into a system.
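For reference, this is what such an X-mR (individuals and moving range) chart boils down to, as a minimal sketch; the measurement series here is invented purely for illustration:

```python
# Invented measurements for one tracked KPI
measurements = [5.01, 4.98, 5.03, 4.97, 5.02, 5.00, 4.99, 5.04, 4.96, 5.01]

moving_ranges = [abs(b - a) for a, b in zip(measurements, measurements[1:])]
x_bar = sum(measurements) / len(measurements)
mr_bar = sum(moving_ranges) / len(moving_ranges)

# Standard X-mR constants: 2.66 for the individuals limits, 3.267 for the mR upper limit
x_ucl, x_lcl = x_bar + 2.66 * mr_bar, x_bar - 2.66 * mr_bar
mr_ucl = 3.267 * mr_bar

print(f"X chart:  centre={x_bar:.3f}, limits=({x_lcl:.3f}, {x_ucl:.3f})")
print(f"mR chart: centre={mr_bar:.3f}, UCL={mr_ucl:.3f}")
```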
An obvious consequence of this is that problems are not detected where they happen, as they happen. The origin could easily be one of the components or processes upstream, manufactured a month ago in a batch that by now has reached 50,000 units.
A cost-failure relationship known as the 10x rule says that for each step in the manufacturing process a failure is allowed to continue through, the cost of fixing it increases by a factor of 10. A failure found at system level can mean that technicians need to rip the product apart, an act that itself creates opportunities for new defects.
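To make the compounding concrete, here is a toy illustration; the base cost and stage names are assumptions for the example, only the factor of 10 comes from the rule itself:

```python
base_cost = 1.0   # cost of fixing the defect where it occurs, arbitrary units
stages = ["board test", "module test", "system test", "field"]

# Each stage the failure escapes through multiplies the repair cost by 10
for escaped, stage in enumerate(stages):
    print(f"{stage:>12}: {base_cost * 10 ** escaped:>7.0f}")
```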
If a failure is allowed to reach the field, the cost implications can be catastrophic. There are multiple examples from modern times of firms having to declare bankruptcy, or seek protection from it, due to the prospect of massive recalls. A recent example is Takata filing for bankruptcy after a massive recall of airbag components, which may exceed 100 million units.
One of the big inherent flaws of SPC, by the standards of modern approaches such as Lean Six Sigma, is that it makes assumptions about where problems are coming from. This is an obvious consequence of assuming stability in what are highly dynamic factors. Trending and tracking a limited set of KPIs only amplifies this flaw. This in turn kicks off improvement initiatives that are likely to miss the most pressing or cost-efficient issues.
A modern approach
All this is accounted for in modern methods of quality management. In electronics manufacturing, this starts with an honest recognition and monitoring of first pass yield (FPY), or true FPY to be precise. By true, it is meant that every kind of failure must be accounted for, even one as simple as the test operator forgetting to plug in a cable.
Every test after the first represents waste, resources the company could have spent better elsewhere. True FPY is the single most important KPI, yet most OEMs have no real idea what theirs is.
Knowing your FPY, you can break it down in parallel across different products, product families, factories, stations, fixtures, operators and test operations. Having this data available in real time as dashboards gives you a powerful captain’s view. It lets you quickly drill down to understand the real origin of poor performance and make interventions based on economic reasoning.
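As a rough illustration of what computing and breaking down true FPY can look like, here is a minimal sketch in Python using pandas; the file name, column names and PASS/FAIL convention are assumptions, not a reference to any particular system:

```python
import pandas as pd

# One row per test execution: serial number, station, result, timestamp (assumed schema)
tests = pd.read_csv("test_results.csv", parse_dates=["timestamp"])

# A unit counts towards true FPY only if its *first* run at every station passed
first_runs = (tests.sort_values("timestamp")
                   .groupby(["serial", "station"], as_index=False)
                   .first())
passed_first_time = first_runs.groupby("serial")["result"].apply(lambda r: (r == "PASS").all())
print(f"True FPY: {passed_first_time.mean():.1%}")

# The same records break down per station (or fixture, operator, product family, ...)
station_fpy = (first_runs.assign(passed=first_runs["result"] == "PASS")
                         .groupby("station")["passed"].mean())
print(station_fpy.sort_values())
```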
Distributing this insight as live dashboards to all stakeholders involved also contributes to greater accountability for quality. A good rule of thumb for dashboards is that unless the information is brought to you, it will not be acted on. None of us has time to go looking for trouble.
As a next step, it is critical that you can quickly drill down to a Pareto view of your most frequently occurring failures, across any of these dimensions. At this point, the tools of SPC may become relevant for learning more of the details. But now you know you are applying them to something of high relevance, rather than to an educated guess. You suddenly find yourself in a situation where you can prioritise initiatives based on a realistic cost-benefit ratio.
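Continuing the assumed schema from the FPY sketch above, a Pareto view is little more than a sorted count with a cumulative percentage:

```python
# Rank failure codes by frequency (column names are assumptions)
failures = tests[tests["result"] == "FAIL"]

pareto = (failures.groupby("failure_code").size()
                  .sort_values(ascending=False)
                  .to_frame("count"))
pareto["cumulative_%"] = 100 * pareto["count"].cumsum() / pareto["count"].sum()
print(pareto.head(10))   # the handful of failure modes causing most of the pain
```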
Repair data
The presence of repair data in your system is also critical; it cannot be contained exclusively in an MES or an external repair tool. Repair data supplies context that improves root-cause analysis, and it brings other benefits too. From a human resource point of view, it tells you whether products are being blindly retested until normal process variation happens to take the measurement within the pass-fail limits, or whether they are, in fact, taken out of the standard manufacturing line and fixed, as intended. Have no illusions: it is not rare to see products retested more than 10 times within the same hour.
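Spotting that kind of blind retesting is straightforward once the test records are in one place. A minimal sketch, again on the assumed schema from the FPY example, with the thresholds taken from the observation above:

```python
import pandas as pd  # 'tests' is the DataFrame from the earlier FPY sketch

# Units run many times at the same station within a short window
retests = (tests.groupby(["serial", "station"])
                .agg(runs=("timestamp", "size"),
                     span=("timestamp", lambda t: t.max() - t.min())))
suspicious = retests[(retests["runs"] > 10) & (retests["span"] < pd.Timedelta(hours=1))]
print(suspicious)   # candidates for "retest until it passes" behaviour
```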
In short, quality-influencing actions come from informed decisions. Unless you have a data management approach that can give you the full picture, across multiple operational dimensions, you can never optimise your product and process quality or your company’s profits. You cannot fix what you do not measure!
Vidar Gronas is sales director, skyWATS