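When we do SPC analysis we have to collect data, and that raises the question of subgroup size. Sometimes the subgroup size is 1 (individual values), sometimes it is 5 or more. How should the subgroup size be chosen, and how is it used? Can it be picked arbitrarily?
A subgroup is a set of units produced under the same conditions (people, machine, material, method, environment, and measurement). A subgroup represents one "slice" of the process, so the data inside a subgroup must be measured close together in time. Process stability is then assessed by comparing how the characteristic changes from one subgroup to the next.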
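For example:
Take one sample from a batch and measure it 5 times (5 times on the same instrument, or once on each of 5 instruments): the subgroup size is 5.
Take 5 samples from a batch and test each of them once: the subgroup size is 5.
If 5 samples are pulled from the production line every hour, the subgroup size is 5.
A die-cutting machine produces 100 plastic parts per hour. A quality engineer measures five randomly selected parts every hour. Each sample of five parts is one subgroup.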
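Only one sample is taken each time. This case usually calls for the individuals control chart (I-MR chart), which monitors the stability and consistency of the process from the sample-by-sample data.
High inspection cost: taking several samples would waste resources, or each test is expensive, as with destructive testing or costly analysis.
Low process variation: for highly automated, precise processes (for example, high-precision assembly), the variation of individual values reflects the state of the process directly.
Operating data from shop-floor equipment and readings from in-line real-time inspection devices are generally individual values, for example equipment temperature, pressure, and current sampled at fixed intervals, or the weight and transparency of a SKU measured at a fixed in-line inspection point.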
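As a minimal sketch of how an I-MR chart turns individual values into control limits, the snippet below uses the standard textbook constants for a moving range of two consecutive points (2.66 = 3/d2 and 3.267 = D4). The function name `imr_limits` and the temperature readings are made up for illustration.

```python
from statistics import mean


def imr_limits(values):
    """Return (LCL, center line, UCL) for the I chart and the MR chart."""
    if len(values) < 2:
        raise ValueError("need at least two observations")
    # Moving range: absolute difference between consecutive observations.
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    x_bar = mean(values)
    mr_bar = mean(moving_ranges)
    # 2.66 = 3 / d2 and 3.267 = D4, both for a moving range of span 2.
    return {
        "I": (x_bar - 2.66 * mr_bar, x_bar, x_bar + 2.66 * mr_bar),
        "MR": (0.0, mr_bar, 3.267 * mr_bar),
    }


temps = [10.0, 12.0, 11.0, 13.0, 12.0]  # hypothetical equipment temperatures
limits = imr_limits(temps)
print(limits)
```

With these five readings the mean is 11.6 and the average moving range is 1.5, so the I chart limits come out at roughly 7.61 and 15.59. A real chart would normally be based on 20 or more points.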
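Several samples are taken each time. This approach is better suited to detecting and separating random variation from systematic variation in the process. When the subgroup size is greater than 1, the usual chart types are the Xbar-R chart (mean and range) and the Xbar-S chart (mean and standard deviation), both used to monitor and analyze process variation.
Batch production: in manufacturing or assembly steps that turn out large quantities every hour or every batch, a subgroup containing several samples reflects the overall variation of the process more accurately.
Process variation analysis: when systematic problems need to be identified, several samples make it possible to separate special-cause variation (controllable anomalies) from common-cause variation (random fluctuation).
Frequent inspection: in highly automated production environments, inspection equipment can collect several samples quickly, which supports real-time monitoring and process adjustment. For example, on an electronic component line a machine can automatically pull 5 samples for measurement every few minutes.
Low-cost inspection: when testing is relatively cheap, several samples can be collected more often to make the analysis more accurate. For example, for assembled parts that are inexpensive to inspect, 4 to 5 samples can be taken each time to judge process stability.
Process stability and capability assessment: subgroups with several samples help compute process capability indices (such as Cp and Cpk) more accurately, which makes the judgment of whether the process is in control and whether it meets the specification more reliable.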
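Common subgroup sizes are 3 to 5 samples. For example:
Subgroup size of 4 or 5: suited to batch or continuous production; balances inspection cost against data accuracy.
Subgroup size of 10 or more: used in stricter quality-control settings, such as the manufacture of critical parts in the aerospace or medical-device industries.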
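When the subgroup size grows further, for example to 20 or 50, more detailed data becomes available, but problems come with it. An overly large subgroup can cause the following:
Excessive smoothing: when the subgroup is too large, the subgroup mean becomes very smooth and can hide small, short-lived changes that occur inside a subgroup. Such small anomalies become hard to detect, which lowers the sensitivity of the control chart to them.
Delayed detection: a larger subgroup takes longer to accumulate, so when something abnormal happens the chart responds more slowly and the chance to correct the process in time may be missed.
Collecting and testing large subgroups consumes more raw material and inspection budget, which can be a serious economic burden in industries where testing is expensive (such as biopharmaceuticals or precision-instrument manufacturing). Handling large samples can also take more time and labor, which is especially inconvenient on lines that need frequent inspection and may hurt production efficiency.
Breakdown of the within-subgroup homogeneity assumption: SPC assumes that the data inside each subgroup are samples taken under the same conditions. The larger the subgroup, the more likely its samples span a longer period or several batches, so the conditions vary more and the assumption is violated. Variation that should appear between subgroups is then absorbed into the within-subgroup estimate of σ, the control limits become too wide, and the process looks more stable than it really is, masking potential anomalies.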
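To put rough numbers on the statistical side of this trade-off, the sketch below prints the half-width of 3-sigma limits on a chart of subgroup means, which is 3σ/√n. It assumes a known, constant within-subgroup σ and independent observations, which is exactly the assumption that tends to break once a large subgroup mixes conditions. The helper name `xbar_half_width` and the value σ = 1 are illustrative.

```python
import math


def xbar_half_width(sigma, n):
    """Half-width of 3-sigma limits on a chart of subgroup means of size n."""
    if n < 1:
        raise ValueError("subgroup size must be at least 1")
    return 3 * sigma / math.sqrt(n)


sigma = 1.0  # assumed within-subgroup standard deviation
for n in (1, 5, 20, 50):
    print(f"n={n:>2}: center line +/- {xbar_half_width(sigma, n):.3f}")
```

Going from n = 5 to n = 50 multiplies the inspection effort by ten but narrows the limits only by a factor of about 3.2 (the square root of 10), which is one reason small subgroups are usually the better bargain.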
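In general, subgroup sizes of 3 to 5 are the most common and the most recommended, because they balance sensitivity, inspection cost, and data complexity. When a larger subgroup is needed (for example, more than 10), consider grouped sampling or multi-stage control charts to improve the detection power and sensitivity of the chart.
An overly large subgroup therefore generally weakens the effectiveness of SPC while significantly increasing cost and management burden, and is usually not recommended.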
After a rapid series of updates and iterations, we released Simple SPC 2.0 today. Let's take a look at what is new in 2.0!
In the SPC system you can configure the appKey and the alarm user group for WeChat and DingTalk, and alarms are then pushed directly to WeChat and DingTalk. The following figure shows the actual effect of the push.
Analysis of variance (ANOVA) is the statistical method for testing whether the means of the levels are equal when a factor (the independent variable) has multiple levels. We have integrated ANOVA into the SPC analysis report, making it easier for everyone to run an analysis of variance.
As shown below:
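For readers who want to see what the test computes, here is a minimal one-way ANOVA sketch in plain Python. It is not the product's implementation; the function name `one_way_anova_f` and the sample data are made up, and it returns only the F statistic and degrees of freedom (a p-value would need an F-distribution routine from a statistics library).

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA.

    `groups` holds one list of measurements per factor level.
    """
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    if k < 2 or n_total <= k:
        raise ValueError("need at least two levels and spare degrees of freedom")
    grand_mean = mean([x for g in groups for x in g])
    # Between-level sum of squares: how far each level mean is from the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-level sum of squares: scatter of each value around its own level mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    if ss_within == 0:
        raise ValueError("no within-level variation; F is undefined")
    df_between, df_within = k - 1, n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


levels = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [3.0, 4.0, 5.0]]  # hypothetical data
f_stat, df_b, df_w = one_way_anova_f(levels)
print(f_stat, df_b, df_w)
```

A large F relative to the F distribution with (df_between, df_within) degrees of freedom suggests that at least one level mean differs from the others.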
With a URL such as http://xxx.com/access_token=xxxxxxxxxxm, any page of our SPC can be embedded directly in an iframe. The actual effect is as follows:
Spanish
Hindi
The runtime environment has been upgraded to Python 3.12. At the same time, major libraries such as sqlalchemy and pandas have been upgraded to their latest versions. Some code was optimized during the upgrade, which has improved the performance of the product across the board.
We are serious about SPC and we are constantly innovating.
Our philosophy: relentless innovation, a commitment to building the best SPC product in China, and helping the quality of domestic manufacturing improve along the way.
In the realm of quality management, CPK (Process Capability Index) and PPK (Process Performance Index) are common interview questions and indispensable statistical indicators for quality professionals. They seem simple, yet often lead to confusion and debate.
CPK: Process Capability Index. It reflects the capability of a process under controlled conditions and is typically used to measure short-term process capability.
PPK: Process Performance Index. It reflects the actual performance of a process and is typically used to measure long-term process capability.
The calculation formulas for both are similar, but the estimation method for σ (standard deviation) differs:
CPK uses the within-subgroup standard deviation to estimate σ; how the within-subgroup standard deviation is calculated depends on the data type.
PPK considers overall variation and uses the overall standard deviation to estimate σ.
Because CPK ignores variation between subgroups, it might overestimate process capability, while PPK is closer to the true capability.
CPK Calculation: Based on control charts (x̄-R chart or x̄-s chart), σ is estimated as the average range (R-bar) divided by d2, or as the average sample standard deviation (S-bar) divided by c4.
PPK Calculation: All the data in the control chart are included in the calculation, and σ is computed directly as the sample standard deviation, for example with the STDEV() function in Excel.
Cpk reflects within-subgroup variation (short-term fluctuation), while Ppk includes both short-term within-subgroup variation and long-term between-subgroup variation, so it represents the overall quality of the entire production process.
In practical applications, some advocate using Ppk for control during new product trial production and switching to Cpk for control after mass production stabilizes. The reasoning is that quality fluctuates widely during the trial production stage, so Cpk might not be effective for control; only Ppk can provide an understanding of the overall quality.
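The difference between the two σ estimates is easier to see with numbers. The sketch below estimates σ for Cpk as R-bar/d2 and σ for Ppk as the overall sample standard deviation (the same quantity Excel's STDEV() returns), then applies the usual min(USL - mean, mean - LSL)/(3σ) formula. The function name `cpk_ppk`, the data, and the specification limits are invented for illustration; the d2 values are the standard tabulated constants.

```python
from statistics import mean, stdev

D2 = {2: 1.128, 3: 1.693, 4: 2.059, 5: 2.326}  # standard d2 constants by subgroup size


def cpk_ppk(subgroups, lsl, usl):
    """Return (Cpk, Ppk) for equally sized subgroups and two-sided spec limits."""
    n = len(subgroups[0])
    if any(len(sg) != n for sg in subgroups):
        raise ValueError("all subgroups must have the same size")
    all_values = [x for sg in subgroups for x in sg]
    grand_mean = mean(all_values)
    # Within-subgroup sigma: average range divided by d2 (used for Cpk).
    r_bar = mean([max(sg) - min(sg) for sg in subgroups])
    sigma_within = r_bar / D2[n]
    # Overall sigma: sample standard deviation of all data (used for Ppk).
    sigma_overall = stdev(all_values)

    def index(sigma):
        return min(usl - grand_mean, grand_mean - lsl) / (3 * sigma)

    return index(sigma_within), index(sigma_overall)


# Hypothetical process whose mean drifts by 1.0 between subgroups.
data = [
    [9.8, 9.9, 10.0, 10.1, 10.2],
    [10.8, 10.9, 11.0, 11.1, 11.2],
    [11.8, 11.9, 12.0, 12.1, 12.2],
]
cpk, ppk = cpk_ppk(data, lsl=8.0, usl=14.0)
print(f"Cpk = {cpk:.2f}, Ppk = {ppk:.2f}")
```

Here each subgroup is tight but the mean drifts between subgroups, so Cpk (about 5.8) looks far better than Ppk (about 1.2). A gap like this is a sign that between-subgroup variation dominates and the process is not stable.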
However, some people question the value of CPK and PPK. Some believe that Ppk has limited practicality because calculating overall quality means the product has already been produced, and it's impossible to prevent defective products in real-time. Moreover, the data might not come from actual measurements but rather be "fabricated." CPK and PPK seem to have become a "numbers game." Furthermore, there's also debate about whether CPK and PPK represent short-term or long-term capability. Some point out that short-term/long-term capability has nothing to do with CPK/PPK but is solely related to sampling. Short sampling time means short-term capability, and vice versa.
CPK and PPK, as important process capability indicators, play a significant role in quality management. However, we should also recognize their limitations and not blindly pursue indicators while neglecting the control and improvement of the actual process. Sampling plays a crucial role in quality management. The sampling method and sample size will both affect the assessment of process capability. Therefore, when using CPK and PPK, we need to pay attention to the rationality and representativeness of sampling.
CPK, PPK, and sampling are all very important tools in quality management. We need to deeply understand their connotations and limitations and apply them flexibly to truly realize their value and achieve effective quality control.
In Statistical Process Control (SPC), specification limits and control limits are two core concepts. Although they both play important roles in quality management and process monitoring, they differ in their definitions, sources, and applications. This article will explain these two concepts in detail and mainly discuss their potential to be one-sided or two-sided, as well as their impact on SPC metrics.
Specification limits are the acceptable range of product or process characteristics specified by customers, design specifications, or industry standards. They define the quality requirements that a product or service must meet. Specification limits typically include the Upper Specification Limit (USL) and Lower Specification Limit (LSL), used to assess whether the product meets the expected quality standards. If product characteristics exceed the specification limits, they are considered nonconforming.
Specification limits can be one-sided or two-sided:
Control limits are derived from the statistical analysis of process data and are used to monitor process stability. Control limits typically consist of the Upper Control Limit (UCL) and Lower Control Limit (LCL), which bound the range within which most data points fall when the process runs under normal conditions (assuming a normal distribution).
Control limits are usually two-sided because their primary function is to detect instability and abnormalities in the process. They are typically calculated with the 3σ rule under a normal distribution: the limits sit 3σ on either side of the center line (a total width of 6σ), data points within them are treated as normal process variation, and data points beyond them are treated as abnormal signals (under a normal distribution, the probability of a point falling outside the limits is about 0.27%).
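The tail probability behind the 3σ limits can be checked with a few lines of Python. This is a small sketch assuming an exactly normal, in-control process; the helper name `prob_outside` is made up.

```python
import math


def prob_outside(k):
    """P(|Z| > k) for a standard normal variable Z."""
    return math.erfc(k / math.sqrt(2))


p = prob_outside(3)
print(f"P(outside +/-3 sigma) = {p:.5f}")          # about 0.0027
print(f"average points between false alarms = {1 / p:.0f}")  # about 370
```

In other words, a stable process produces a point outside the 3σ limits roughly once in every 370 points purely by chance.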
Regardless of whether the specification limits are one-sided or two-sided, control limits are used to determine process stability and should include both upper and lower control limits to assess stability.
For example, even if the specification limit only requires the characteristic value to be greater than a minimum value (e.g., greater than 100), control limits will still be calculated based on the data (e.g., UCL = 200, LCL = 50):
The role of UCL and LCL: The Upper Control Limit (UCL) and Lower Control Limit (LCL) are used to detect abnormalities in the process. If process data points exceed these control limits, it indicates potential process anomalies that require further investigation. Even if a data point greater than 200 is acceptable based on the specification limit (as long as it is greater than 100), it may indicate process instability according to the control limit (UCL = 200), prompting further investigation into the cause of this anomaly.
Abnormality and Stability: Even if the process characteristics meet the specification limits, significant process variation (e.g., data points exceeding UCL or LCL) may indicate process instability. Control limits help identify this potential instability.
If the specification limit is one-sided (e.g., only LSL), control limits will still be two-sided. Exceeding the control limits will still trigger alarms, but data points exceeding LCL should receive particular attention, while data points exceeding UCL can be analyzed or ignored as needed.
When the specification limit is one-sided, it affects certain SPC metrics, especially those related to capability indices such as Ca, Ppk, and Cpk.
Ca (Capability of Accuracy) measures the degree of deviation between the process mean and the specification center. For one-sided specification limits, Ca cannot be calculated because there is no center value to use as a reference.
Ppk and Cpk measure process performance and capability. For one-sided specification limits, Ppk and Cpk calculations only consider the direction of the existing specification limit. For example:
In these cases, one-sided specification limits only assess process capability in one direction, potentially leading to an incomplete evaluation of the process. Particularly with one-sided specification limits, it is essential to use two-sided control limits to monitor process stability comprehensively.
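The one-sided case can be sketched as follows: with only a USL the index is (USL - mean)/(3σ), with only an LSL it is (mean - LSL)/(3σ), and with both it is the smaller of the two. Pass the within-subgroup σ to get Cpk or the overall σ to get Ppk. The function name `capability_index` and the numbers are illustrative assumptions.

```python
def capability_index(process_mean, sigma, lsl=None, usl=None):
    """Cpk or Ppk (depending on which sigma is passed) for one- or two-sided specs."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    candidates = []
    if usl is not None:
        candidates.append((usl - process_mean) / (3 * sigma))  # upper-side index
    if lsl is not None:
        candidates.append((process_mean - lsl) / (3 * sigma))  # lower-side index
    if not candidates:
        raise ValueError("at least one specification limit is required")
    return min(candidates)


# Hypothetical characteristic that only has to stay above 100.
lower_only = capability_index(130.0, 5.0, lsl=100.0)
# The same process judged against a two-sided spec of 100 to 145.
two_sided = capability_index(130.0, 5.0, lsl=100.0, usl=145.0)
print(lower_only, two_sided)
```

With only the lower limit the index is 2.0; adding an upper limit of 145 drops it to 1.0, which shows how a one-sided specification says nothing about capability in the other direction.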
You may have noticed that when using Minitab to create Xbar-R and Xbar-S control charts, each is composed of a pair of charts:
This article provides the most detailed explanation available on the internet.
To clarify, the Xbar Control Charts in the Xbar-R and Xbar-S Control Charts are not the same. The use of Xbar-R Control Chart and Xbar-S Control Chart is conditional; we should not use Xbar-R and Xbar-S simultaneously on the same set of inspection data. Therefore, we do not need to worry about whether the Xbar Control Charts in Xbar-R and Xbar-S are the same, because we will not be using Xbar-R and Xbar-S at the same time.
we need to use the Xbar-R Control Chart. The control limits for the Xbar Control Chart and R Control Chart are calculated as follows:
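As a sketch of the standard textbook Xbar-R formulas (UCL/LCL of the Xbar chart = grand mean ± A2·R-bar; limits of the R chart = D3·R-bar and D4·R-bar), the snippet below computes both sets of limits. The function name `xbar_r_limits` and the data are made up; the constants are the usual tabulated values for subgroup sizes 2 to 5.

```python
from statistics import mean

# Standard constants by subgroup size n: (A2, D3, D4)
XBAR_R_CONSTANTS = {
    2: (1.880, 0.0, 3.267),
    3: (1.023, 0.0, 2.574),
    4: (0.729, 0.0, 2.282),
    5: (0.577, 0.0, 2.114),
}


def xbar_r_limits(subgroups):
    """Return (LCL, center line, UCL) for the Xbar chart and the R chart."""
    n = len(subgroups[0])
    if any(len(sg) != n for sg in subgroups):
        raise ValueError("all subgroups must have the same size")
    a2, d3, d4 = XBAR_R_CONSTANTS[n]
    x_dbar = mean([mean(sg) for sg in subgroups])          # grand mean
    r_bar = mean([max(sg) - min(sg) for sg in subgroups])  # average range
    return {
        "xbar": (x_dbar - a2 * r_bar, x_dbar, x_dbar + a2 * r_bar),
        "R": (d3 * r_bar, r_bar, d4 * r_bar),
    }


samples = [
    [10.0, 11.0, 12.0, 13.0, 14.0],
    [11.0, 12.0, 13.0, 14.0, 15.0],
]  # hypothetical measurements, two subgroups of five
xr = xbar_r_limits(samples)
print(xr)
```

For this toy data the grand mean is 12.5 and R-bar is 4, giving Xbar limits of about 10.192 and 14.808 and an R chart upper limit of 8.456.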
we need to use the Xbar-S Control Chart. The control limits for the Xbar Control Chart and S Control Chart are calculated as follows:
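The Xbar-S calculation follows the same pattern with the subgroup standard deviation in place of the range: Xbar limits = grand mean ± A3·S-bar, and S chart limits = B3·S-bar and B4·S-bar. The sketch below uses the standard tabulated constants for a few subgroup sizes; the function name `xbar_s_limits` and the data are illustrative.

```python
from statistics import mean, stdev

# Standard constants by subgroup size n: (A3, B3, B4)
XBAR_S_CONSTANTS = {
    5: (1.427, 0.0, 2.089),
    8: (1.099, 0.185, 1.815),
    9: (1.032, 0.239, 1.761),
    10: (0.975, 0.284, 1.716),
}


def xbar_s_limits(subgroups):
    """Return (LCL, center line, UCL) for the Xbar chart and the S chart."""
    n = len(subgroups[0])
    if any(len(sg) != n for sg in subgroups):
        raise ValueError("all subgroups must have the same size")
    a3, b3, b4 = XBAR_S_CONSTANTS[n]
    x_dbar = mean([mean(sg) for sg in subgroups])  # grand mean
    s_bar = mean([stdev(sg) for sg in subgroups])  # average sample std deviation
    return {
        "xbar": (x_dbar - a3 * s_bar, x_dbar, x_dbar + a3 * s_bar),
        "S": (b3 * s_bar, s_bar, b4 * s_bar),
    }


batches = [
    [10.0, 11.0, 12.0, 13.0, 14.0],
    [11.0, 12.0, 13.0, 14.0, 15.0],
]  # hypothetical measurements, two subgroups of five
xs = xbar_s_limits(batches)
print(xs)
```

Because A3·S-bar and A2·R-bar are two different estimates of 3σ/√n, the Xbar limits from this sketch will generally differ slightly from those of an Xbar-R chart on the same data.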
The SPC constants such as A2, D4, A3, B4, etc., used in these formulas are as follows:
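As a cross-check on tabulated values, the S-chart constants can be derived from c4, which has a closed form in terms of the gamma function: c4 = sqrt(2/(n-1))·Γ(n/2)/Γ((n-1)/2), A3 = 3/(c4·√n), B3 = max(0, 1 - 3·sqrt(1 - c4²)/c4), and B4 = 1 + 3·sqrt(1 - c4²)/c4. The sketch below assumes normally distributed data; the function name `s_chart_constants` is made up. The range-based constants (d2, A2, D3, D4) need numerical integration and are usually taken from a table instead.

```python
import math


def s_chart_constants(n):
    """Return c4, A3, B3 and B4 for subgroup size n (normal data assumed)."""
    if n < 2:
        raise ValueError("subgroup size must be at least 2")
    c4 = math.sqrt(2.0 / (n - 1)) * math.gamma(n / 2) / math.gamma((n - 1) / 2)
    spread = 3.0 * math.sqrt(1.0 - c4 ** 2) / c4
    return {
        "c4": c4,
        "A3": 3.0 / (c4 * math.sqrt(n)),
        "B3": max(0.0, 1.0 - spread),  # lower limit is clamped at zero
        "B4": 1.0 + spread,
    }


consts = s_chart_constants(5)
print({name: round(value, 4) for name, value in consts.items()})
```

For n = 5 this reproduces the familiar table entries c4 ≈ 0.9400, A3 ≈ 1.427, B3 = 0 and B4 ≈ 2.089.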