Change point detection in process control with robust individuals control chart

It is crucial to recognize when a process has changed and to what extent. If practitioners can determine the time point of the change, they have a smaller window in which to search for the special cause. As a result, the special cause can be discovered more quickly and the actions necessary to improve quality can be triggered sooner. In this paper, we demonstrate the use of a robust modified individuals control chart for exploratory data analysis that incorporates the M-scale estimator, and we compare it with existing charts. The proposed chart, which uses the M-scale estimator to compute the process standard deviation, offers substantial improvements over the existing median absolute deviation (MAD) framework. When applied to a real data set, the proposed approach appears to perform better than the typical robust control chart and outperforms other conventional charts, particularly in the presence of contamination. For these reasons, the proposed modified robust individuals control chart is preferred, especially when outliers may be present in the data collection process.


Introduction
Control charts are useful devices for detecting unusual behavior in a manufacturing process. Apart from their use in statistical process control, they are also applied as an exploratory tool in data analysis. Monitoring a contaminated process with a conventional charting procedure, such as Shewhart's X-bar or X chart and the range (R) chart, may cause an undesirable number of false alarms [1]. A robust control chart is therefore an alternative to conventional charts when the data may be contaminated, because it downweights the influence of outlying data values [2]. In this study, we present a charting framework that satisfies the requirements of exploratory analysis in the retrospective phase I, where the intention is to detect assignable causes by uncovering signs of variation.
*Corresponding author: khng@utar.edu.my
In essence, statistical process control charts are developed to help practitioners monitor whether a change has occurred [3]. When a change does appear, the chart should react as soon as possible and serve as a tool for distinguishing the special causes of variation from the common causes. As long as the control chart statistics fall within the limits, the chart suggests that only common causes of variation exist. In contrast, when a chart statistic exceeds the limits, the chart indicates that one or more special causes may exist. One important criterion for assessing the performance of a control chart is how quickly it responds to changes. The sooner the chart discovers a process change, the sooner process engineers can start looking for the special cause. The necessary action can then be taken to correct the process once the cause has been diagnosed.
Various groups of researchers have worked on relevant robust control charts. For example, [4] wrote widely about a robust multivariate control chart in relation to a goodness-of-fit test, while [5] investigated a robust multivariate EWMA control chart for sparse mean shifts. [6] discussed a stepwise approach to constructing a robust Shewhart location control chart, and [7] did extensive work on frameworks for automating the phase I process that affect the phase II performance of the individuals control chart.

M-Scale estimate
To fix the idea of a robust M-scale estimate, [8] defined the scale M-estimate by letting $\rho(\cdot)$ be a real function satisfying the following assumptions.
Then Equation (1) possesses a unique solution, and the solution is different from 0. Suppose su = ...
In practice, the M-scale estimate is computed in the following two stages:
which is called a three-part redescending M-estimator, used to safeguard against extreme outlying observations. [13] proposed the following series of steps for the control-charting framework:

De Mast and Roes control-charting procedure
1. Initially, determine the locations of potential shifts and test their significance using the test statistic $RT_n$. Then, partition the data into the corresponding intervals.
2. Once this step is complete, compute the means of the intervals between subsequent shifts, as well as the variance of the in-control measurements, using robust estimators.
3. Using these estimates, set up a pair of control limits for each interval.
Observations are deemed to be outlying values if they lie outside the control limits.
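The three steps and the outlier rule above can be sketched in Python as follows. This is only an illustration under stated assumptions: the shift positions are taken as already known, the interval median and the normalized MAD stand in for the robust location and scale estimators, and the limits use a multiplier k = 3. The function name `interval_limits` and these choices are hypothetical and are not taken from [13] or from Equation (20).

```python
import numpy as np


def interval_limits(x, shift_points, k=3.0):
    """Return per-observation control limits and the indices of outlying values.

    shift_points holds the 0-based index of the first observation of each
    new interval (an assumed convention for this sketch).
    """
    x = np.asarray(x, dtype=float)
    edges = [0, *sorted(shift_points), x.size]

    # Step 2: robust location for each interval (the median is a stand-in).
    centers = np.empty_like(x)
    for start, stop in zip(edges[:-1], edges[1:]):
        centers[start:stop] = np.median(x[start:stop])

    # Step 2: one robust scale from the pooled residuals (normalized MAD stand-in).
    residuals = x - centers
    sigma = 1.4826 * np.median(np.abs(residuals))

    # Step 3: a pair of control limits for each interval.
    lcl = centers - k * sigma
    ucl = centers + k * sigma

    # Observations outside their interval's limits are flagged as outlying.
    outliers = np.flatnonzero((x < lcl) | (x > ucl))
    return lcl, ucl, outliers
```

Because the limits are centered on each interval separately, a sustained shift in mean does not inflate the scale estimate or mask an isolated disturbance within an interval.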
For illustrative purposes, consider the null model of the in-control process for the measurements, Equation (6). To obtain the corresponding estimates, we begin with the way to estimate the shift position, $\tau$, assuming that a single shift occurs; this scenario is the alternative model, Equation (7). Here, $\mu_1$, $\mu_2$, $\sigma$ and $\tau$ are to be estimated. [14] proposed the maximum likelihood (ML) estimator for $\tau$.
The ML estimators for $\mu_1$, $\mu_2$ and $\sigma$, together with $\tau$, are described as

$$(\tilde{\mu}_1, \tilde{\mu}_2, \tilde{\sigma}, \tilde{\tau}) = \arg\min_{\mu_1, \mu_2, \sigma, \tau} \, l(\mu_1, \mu_2, \sigma, \tau).$$

The term $\arg\min_x l(x)$ denotes the value of $x$ at which $l(x)$ is minimized. To determine $\tau$, take the partial derivatives with respect to $\mu_1$, $\mu_2$ and $\sigma$, and solve the resulting equations by equating them to zero. The estimators for fixed $\tau$ follow from these equations.

In addition, it is possible that not all observations stem from the in-control model, Equation (6). Differentiating $\psi$ in Equation (16) gives $\psi'$, and the standard deviation of the error is obtained from the asymptotic variance of the M-estimators of location. [15] called the scale estimator in Equation (15) the A-estimator; it appears to perform satisfactorily and has good properties among robust scale estimators.
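A minimal sketch of the single-shift search described above, assuming normally distributed errors with a common standard deviation. Under that assumption, minimizing $l$ over $\mu_1$, $\mu_2$ and $\sigma$ for a fixed shift position reduces to the two segment means and the pooled residual sum of squares. The function name `ml_single_shift`, and the convention that the shift position counts the observations before the shift, are assumptions of this illustration.

```python
import numpy as np


def ml_single_shift(x):
    """Least-squares (normal ML) estimate of a single shift in mean.

    Returns (tau, mu1, mu2, sigma), where tau is the number of observations
    before the shift, so x[:tau] has mean mu1 and x[tau:] has mean mu2.
    """
    x = np.asarray(x, dtype=float)
    n = x.size
    if n < 2:
        raise ValueError("at least two observations are required")

    best = None
    for tau in range(1, n):
        left, right = x[:tau], x[tau:]
        # For a fixed tau, the ML estimates of the means are the segment means.
        mu1, mu2 = left.mean(), right.mean()
        rss = ((left - mu1) ** 2).sum() + ((right - mu2) ** 2).sum()
        # The tau that minimizes the residual sum of squares maximizes the likelihood.
        if best is None or rss < best[0]:
            best = (rss, tau, mu1, mu2)

    rss, tau, mu1, mu2 = best
    sigma = np.sqrt(rss / n)  # ML (not bias-corrected) standard deviation
    return tau, mu1, mu2, sigma
```

These classical estimates are not robust: a single extreme value can move both the estimated position and the estimated standard deviation, which is the motivation for replacing them with M-estimators.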
The bisquare function is adopted in Equation (15). In a sense, the function implies that measurements further away from the center are gradually downweighted, and those that fall beyond the cut-off receive zero weight. Having detected the shift, the null model, Equation (6), is tested against the alternative model, Equation (7). The distribution of the test statistic $RT_n$ can be approximated by an F-distribution with $n_1$ and $n_2$ degrees of freedom for a particular sample size $n$. Further details on the computation of the critical values, as well as $n_1$ and $n_2$, can be found in [13]. Once the shift has been located, the observations are divided into two groups. We then apply the same framework to both groups in order to verify whether there are more shifts. More succinctly, the framework is depicted as follows:
• Upon the detected shift, the observations are divided into two groups ... given by Equation (14).
A slight weakness of the median absolute deviation (MAD) is that, although it has a high breakdown point (50%), it has low efficiency (37%) [16]. Hence, instead of the MAD estimator, the M-scale estimator is used for $\sigma$; M-scale estimators now appear to dominate the field owing to their generality and their high breakdown point and efficiency [8].
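To make the contrast between the MAD and an M-scale concrete, the sketch below computes an M-scale as the solution $s$ of $\frac{1}{n}\sum_i \rho(r_i/s) = b$ by a simple fixed-point iteration started from the normalized MAD. The bisquare $\rho$ with tuning constant c = 1.547 and b = 0.5 is a common pairing that makes the estimate consistent for $\sigma$ under normality; these constants, the centering at the median, and the names `bisquare_rho` and `m_scale` are assumptions of this illustration, not necessarily the choices made in the paper.

```python
import numpy as np


def bisquare_rho(u, c=1.547):
    """Bounded bisquare rho-function, scaled so that its maximum is 1."""
    v = np.clip(np.abs(u) / c, 0.0, 1.0)
    return 1.0 - (1.0 - v ** 2) ** 3


def normalized_mad(r):
    """MAD about zero, scaled to be consistent for sigma under normality."""
    return 1.4826 * np.median(np.abs(r))


def m_scale(x, b=0.5, c=1.547, tol=1e-9, max_iter=200):
    """M-scale of x about its median: solves mean(rho(r / s)) = b for s."""
    x = np.asarray(x, dtype=float)
    r = x - np.median(x)
    s = normalized_mad(r)          # starting value
    if s == 0.0:
        return 0.0
    for _ in range(max_iter):
        # Fixed-point update: rescale s until the mean of rho equals b.
        s_new = s * np.sqrt(np.mean(bisquare_rho(r / s, c)) / b)
        if abs(s_new - s) <= tol * s:
            return s_new
        s = s_new
    return s
```

Because $\rho$ is bounded, observations far from the center each contribute at most 1 to the average, so a small fraction of extreme values cannot drive the scale estimate upward in the way it inflates the sample standard deviation.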
However, the distribution of the proposed test statistic RT in Equation (18) had to be obtained by simulation under the null model, Equation (6), and the desired percentile values were determined from the generated empirical distribution. Table 1 presents the simulated cut-off values for the test statistic in Equation (18) with sample sizes of 10, 20, 30 and 50. Details about the determination of cut-off values for the MAD framework can be found in [13]. On the basis of the cut-off values obtained, once the shift position has been estimated and tested, the control-charting framework described in Section 3 is applied until the respective pairs of control limits for each interval are determined using Equation (20).
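The simulation of cut-off values can be sketched as below. Since Equation (18) is not reproduced here, the statistic `rss_ratio` is a hypothetical stand-in (the ratio of the no-shift residual sum of squares to the best single-shift residual sum of squares) and is not the RT statistic; only the Monte Carlo scheme, which draws samples from a standard normal null model and reads off an empirical percentile, is meant to be illustrated. The function names, the number of replications and the seed are likewise assumptions.

```python
import numpy as np


def rss_ratio(x):
    """Stand-in shift statistic: RSS without a shift over the best one-shift RSS."""
    n = x.size
    rss0 = ((x - x.mean()) ** 2).sum()
    rss1 = min(
        ((x[:t] - x[:t].mean()) ** 2).sum() + ((x[t:] - x[t:].mean()) ** 2).sum()
        for t in range(1, n)
    )
    return rss0 / rss1


def simulate_cutoff(statistic, n, level=0.95, reps=2000, seed=0):
    """Empirical `level` quantile of `statistic` under an N(0, 1) null model.

    Intended for n >= 3, so that the one-shift RSS is positive.
    """
    rng = np.random.default_rng(seed)
    values = np.array([statistic(rng.standard_normal(n)) for _ in range(reps)])
    return np.quantile(values, level)
```

A call such as `simulate_cutoff(rss_ratio, n=10, level=0.95)` returns one cut-off value; repeating it over the sample sizes and levels of interest yields a table of critical values for whichever statistic is supplied.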

Illustrative example: application to wood moisture content data
For verification, in this section we apply the proposed framework to a real data set. The data were obtained from a furniture company in Malaysia, where the wood moisture content is the quality characteristic of interest. There were 37 subgroups under consideration, and the moisture content was measured at eight different parts of each piece of furniture. The data set therefore consists of 37 subgroups, each of size 8. To obtain individual measurements, the eight measurements in each of the 37 subgroups were averaged. To motivate the approach in the exploratory context, the conventional X-bar ($\bar{X}$) control chart is constructed from the subgroup measurements for comparison. The control limits are based on the within-group estimate obtained from the average range, and the chart is presented in Fig. 2. The chart produces two signals, which is evidence of between-group variation that cannot be explained by the within-group variation. These two signals indicate that the wood moisture content is not completely in a steady state in the sample. They offer some hints, but they are not definite enough to identify the assignable cause. Moreover, the $\bar{X}$ chart locates some large outlying values but does not identify the possible shifts. The standard deviation of the in-control process is estimated to be 1.429, which is quite large. This can be explained by the extra variation introduced by the shifts. In addition, the $\bar{X}$ chart appears to be less robust than the M-estimator applied in the proposed method. The chart hints that the process has shifted, but it does not explicitly indicate the number of shifts or the time points at which they happen.
Without incorporating the located shifts in the analysis, the chart seems less able to detect the remaining assignable causes.
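For reference, the conventional X-bar limits based on the average range can be computed as in the sketch below. The constant $d_2$ is the standard tabulated value for each subgroup size (2.847 for subgroups of size 8); the function name `xbar_limits` and the returned tuple are assumptions of this illustration, and no values from the wood moisture data are reproduced.

```python
import numpy as np

# Standard control chart constant d2 for subgroup sizes 2 to 8.
D2 = {2: 1.128, 3: 1.693, 4: 2.059, 5: 2.326, 6: 2.534, 7: 2.704, 8: 2.847}


def xbar_limits(subgroups):
    """Three-sigma X-bar limits from the average within-subgroup range."""
    data = np.asarray(subgroups, dtype=float)   # shape: (subgroups, subgroup size)
    _, n = data.shape
    means = data.mean(axis=1)
    ranges = data.max(axis=1) - data.min(axis=1)

    sigma = ranges.mean() / D2[n]               # within-group estimate of sigma
    center = means.mean()
    half_width = 3.0 * sigma / np.sqrt(n)
    lcl, ucl = center - half_width, center + half_width

    # Subgroup means outside the limits are the chart's signals.
    signals = np.flatnonzero((means < lcl) | (means > ucl))
    return center, lcl, ucl, sigma, signals
```

Because the limits rest only on within-subgroup variation, sustained shifts between subgroups appear as signals without the chart indicating how many shifts there are or where they begin.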
For comparison, Fig. 3 and Fig. 4 display the wood moisture content measurements charted with the procedure using the existing MAD estimator and the proposed M-scale estimator, respectively. The upper and lower control limits are computed using Equation (20). Both control charts detect several assignable causes: an isolated disturbance at sample number 34 and two shifts in mean. After the outliers are eliminated and the estimates affected by them are corrected, the distribution can be explained satisfactorily by a normal distribution. The standard deviation from the M-scale method (0.6116) is slightly smaller than that from the MAD method (0.6286). This may be because the effects of the outlying data values have been eliminated.
Another key aspect in judging the performance of a chart is how quickly it reacts to a change in the process. In this instance, the control charts in Fig. 3 and Fig. 4 give clear indications of the nature of the changes and the time points at which they occurred. The plots reveal that the content averages shift between samples 5 and 6 and between samples 27 and 28. Apart from these shifts, the content averages appear to be in statistical control, and no assignable causes appear among the remaining observations.
If the investigator does not need to know whether the additional variation is due to between-group variation, but merely wishes to check whether the extra variation is distributed randomly, the average moving range (AMR) chart can be plotted. This is shown in Fig. 5, with the averages of wood moisture content serving as individual observations. A chart of this kind is normally used as an individuals control chart [17]. It is quite apparent that the chart fails to discover the shifts in mean. Hence, on the whole, the procedure using the M-scale estimator demonstrates the most convincing performance in terms of its variation property for this data set.
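The limits of the AMR-based individuals chart can be computed as in the following sketch, which uses the standard constant $d_2 = 1.128$ for moving ranges of two consecutive observations and three-sigma limits. The function name `individuals_limits` is an assumption of this illustration.

```python
import numpy as np


def individuals_limits(x):
    """Three-sigma individuals chart limits from the average moving range."""
    x = np.asarray(x, dtype=float)
    if x.size < 2:
        raise ValueError("at least two observations are required")

    moving_ranges = np.abs(np.diff(x))      # |x[i] - x[i-1]|
    sigma = moving_ranges.mean() / 1.128    # d2 for a moving range of span 2
    center = x.mean()
    lcl, ucl = center - 3.0 * sigma, center + 3.0 * sigma
    return center, lcl, ucl, sigma
```

This chart uses a single center line for the whole series, whereas the robust procedure sets a separate pair of limits for each interval between shifts.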

Conclusion
In this paper, we have demonstrated the use of a robust modified individuals control chart for exploratory data analysis that incorporates the M-scale estimator, and we have compared it with existing charts. The proposed modified chart offers notable advantages over traditional control charts. It provides an alternative robust control chart for detecting process shifts and variability in individual measurements in phase I, together with the table of critical values required for testing the significance of a shift. The proposed chart, which uses the M-scale estimator to compute the process standard deviation $\sigma$, offers substantial improvements over the existing MAD framework. When applied to a real data set, the proposed approach seems to perform marginally better than the MAD method and outperforms other conventional charts in the presence of contamination. For these reasons, the proposed modified robust individuals control chart is preferred, especially when outliers may be present in the data collection process.