Control charting with interval-valued data

Abstract

Several methods have been proposed for process monitoring with interval-valued data. We examine these methods and propose a simulation-based approach that is much easier to implement and has a much clearer interpretation. We consider both Phase I and Phase II implementation of interval data control charts and assess the loss of information from the use of interval data of various widths.
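The abstract does not spell out the simulation details, but the general idea of a simulation-based interval-data chart can be sketched. The following is a minimal Phase I / Phase II illustration only, assuming a normal model for interval midpoints; the function names and modelling choices are ours, not the paper's:

```python
import random
import statistics

def simulated_limits(phase1_midpoints, n_sim=100_000, alpha=0.0027, seed=1):
    """Phase I: fit a normal model to the midpoints of in-control intervals,
    then set control limits at simulated quantiles for false-alarm rate alpha.
    (Illustrative assumption, not the paper's exact procedure.)"""
    mu = statistics.mean(phase1_midpoints)
    sigma = statistics.stdev(phase1_midpoints)
    rng = random.Random(seed)
    sims = sorted(rng.gauss(mu, sigma) for _ in range(n_sim))
    lcl = sims[int(n_sim * alpha / 2)]
    ucl = sims[int(n_sim * (1 - alpha / 2))]
    return lcl, ucl

def phase2_signal(interval, lcl, ucl):
    """Phase II: flag an interval observation whose midpoint falls outside the limits."""
    mid = 0.5 * (interval[0] + interval[1])
    return mid < lcl or mid > ucl
```

Wider intervals blur the midpoint statistic, which is one way to see the "loss of information" question the abstract raises.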

Similar Papers
  • Book Chapter
  • 10.1007/978-3-030-04164-9_3
Maximum Likelihood Estimation from Interval-Valued Data. Application to Fuzzy Clustering
  • Dec 29, 2018
  • Hani Hamdan

Interval-valued data are used in many applications where they represent data imprecision, measurement inaccuracy, or measurand variability. As a result of the increasing use of such data in data mining, many data analysis methods have been extended to interval data over the last decade. The Expectation-Maximization (EM) algorithm has been widely used for maximum likelihood estimation of parameters in statistical models that depend on unobserved latent variables. In our keynote talk, we will extend the EM algorithm to interval-valued data. In this contribution, we provide an original likelihood expression for interval data. We then propose an original method to introduce the imprecision and the variability of the data into the mathematical expectation of the EM algorithm. Maximizing the obtained expectation yields the EM algorithm for interval-valued data. We apply this EM algorithm to mixture models for maximum likelihood estimation of mixture model parameters from interval-valued data. Special attention is paid to the case of Gaussian mixture models. To show the usefulness of our approach, we apply it to real interval-valued data arising from a flaw diagnosis application using acoustic emission.

  • Research Article
  • 10.47191/etj/v9i03.01
Interval-valued Data Group Average Clustering (IUPGMA)
  • Mar 31, 2024
  • Engineering and Technology Journal
  • Sérgio Mário Lins Galdino + 4 more

In this paper, we deal with a particular type of information, namely interval-valued data. We face the problem of clustering data units described by intervals of the real line (interval data). Current clustering methods rely on dissimilarity measures for interval-valued data that use representative-point distances. Group average clustering (UPGMA) is one of the popular algorithms for constructing a phylogenetic tree from the distance matrix of pairwise distances among taxa. A phylogenetic tree presents the evolutionary relationships among biological species of interest based on the similarities in their genetic sequences. Interval-valued Data Group Average Clustering (IUPGMA) extends group average clustering to interval-valued data. Based on the range Euclidean metric, it is a reliable alternative for uncertainty quantification from interval-valued data. Intervals contain more information than point values, and this informational advantage can be exploited to yield more efficient analysis.
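The two ingredients named in the abstract — a range (lower/upper) Euclidean distance between intervals and average-linkage (UPGMA) merging — can be sketched as follows. The function names and the naive merge loop are our own illustration, not the paper's implementation:

```python
import math

def range_euclidean(u, v):
    """Range Euclidean distance between intervals u=(lo, hi) and v=(lo, hi):
    Euclidean distance between the (lower, upper) endpoint pairs."""
    return math.sqrt((u[0] - v[0]) ** 2 + (u[1] - v[1]) ** 2)

def upgma_merge_order(intervals):
    """Naive UPGMA: repeatedly merge the pair of clusters with the smallest
    average pairwise interval distance; returns the merge history as
    (cluster_a, cluster_b, average_distance) triples."""
    clusters = {i: [i] for i in range(len(intervals))}
    history = []
    while len(clusters) > 1:
        best = None
        for a in clusters:
            for b in clusters:
                if a < b:
                    d = sum(range_euclidean(intervals[i], intervals[j])
                            for i in clusters[a] for j in clusters[b])
                    d /= len(clusters[a]) * len(clusters[b])
                    if best is None or d < best[0]:
                        best = (d, a, b)
        d, a, b = best
        clusters[a] = clusters[a] + clusters[b]  # merge b into a
        del clusters[b]
        history.append((a, b, d))
    return history
```

The merge history is the linearised form of the dendrogram a UPGMA implementation would draw.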

  • Conference Article
  • Cited by 1
  • 10.1109/fuzz-ieee.2018.8491483
Fuzzy rule-based modeling for interval-valued time series prediction
  • Jul 1, 2018
  • Leandro Maciel + 1 more

This paper suggests an interval fuzzy inference system (iFIS) modeling approach for interval-valued time series forecasting. Interval-valued data arise quite naturally in many situations in which such data represent uncertainty/variability or when comprehensive ways to summarize large data sets are required. The method comprises a fuzzy rule-based framework with affine consequents which provides a (non)linear framework that processes interval-valued symbolic data. The iFIS antecedents identification uses a fuzzy c-means clustering algorithm for interval-valued data with adaptive distances, whereas parameters of the linear consequents are estimated with a centre-range methodology to fit a linear regression model to symbolic interval data. iFIS forecasting power, measured by accuracy metrics and statistical tests, was evaluated through Monte Carlo experiments using synthetic interval-valued time series with linear and chaotic dynamics. The results indicate a superior performance of iFIS compared to traditional alternative single-valued and interval-valued forecasting models.

  • Research Article
  • Cited by 11
  • 10.1080/03610918.2020.1714662
Functional linear models for interval-valued data
  • Jan 21, 2020
  • Communications in Statistics - Simulation and Computation
  • Ufuk Beyaztas + 2 more

Aggregation of large databases into a specific format is a frequently used process to make data easily manageable. Interval-valued data is one of the data types generated by such an aggregation process. Using traditional methods to analyze interval-valued data results in loss of information, and thus several interval-valued data models have been proposed to gather reliable information from such data. At the same time, recent technological developments have led to high-dimensional and complex data in many application areas, which may not be analyzable by traditional techniques. Functional data analysis is one of the most commonly used techniques for analyzing such complex datasets. While functional extensions of many traditional statistical techniques are available, the functional form of interval-valued data has not been studied well. This article introduces functional forms of some well-known regression models that take interval-valued data. The proposed methods are based on the function-on-function regression model, where both the response and the predictor(s) are functional. Through several Monte Carlo simulations and empirical data analysis, the finite-sample performance of the proposed methods is evaluated and compared with the state-of-the-art.

  • Conference Article
  • Cited by 2
  • 10.1109/ssci50451.2021.9660018
Compositional Linear Regression on Interval-valued Data
  • Dec 5, 2021
  • Direnc Pekaslan + 1 more

Uncertainty is pervasive in data and decision making, manifesting in various forms, from lack of information to vagueness. Interval-valued data provide an efficient and effective way to capture this uncertainty that is not afforded by point values such as numbers. Once interval-valued data have been captured, insights pertaining to their inherent range can be explored and communicated. Approaches to analysis span from summary statistics, such as the average range/uncertainty, to inferential statistics such as regression and more complex machine learning techniques. In recent years, approaches to regression for interval-valued data in particular have attracted increasing interest, with advances affording improved regression model accuracy and increased resilience to parameter flipping, i.e. 'loss of mathematical coherence'. This paper explores the application of a branch of statistics, compositional data analysis, to the computation of inferential statistics, specifically regression, for interval-valued data. We articulate how, why, and when a compositional representation of interval-valued data may be appropriate, and how intervals on a fixed domain can be easily converted to a compositional representation. Finally, we show how regression for compositional (interval) data can be conducted, and how this provides an elegant means of mitigating some of the challenges faced by traditional approaches, including preserving mathematical coherence by virtue of the compositional representation.
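The conversion step mentioned above admits a simple illustration. One plausible mapping — an assumption on our part, not necessarily the paper's exact construction — treats an interval [lo, hi] on a fixed domain as a three-part composition (left gap, width, right gap):

```python
def interval_to_composition(lo, hi, domain=(0.0, 1.0)):
    """Map an interval [lo, hi] inside a fixed domain to a 3-part composition
    (left gap, interval width, right gap), normalised to sum to 1.
    NOTE: this mapping is an illustrative assumption, not the paper's definition."""
    d_lo, d_hi = domain
    if not (d_lo <= lo <= hi <= d_hi):
        raise ValueError("interval must lie inside the domain")
    total = d_hi - d_lo
    return [(lo - d_lo) / total, (hi - lo) / total, (d_hi - hi) / total]
```

Because the three parts are non-negative and sum to one, standard compositional tools (e.g. log-ratio transforms) can then be applied, and any prediction mapped back to an interval automatically keeps lo ≤ hi — the "mathematical coherence" the abstract refers to.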

  • Research Article
  • Cited by 3
  • 10.1080/03610918.2024.2313675
Construction of Six Sigma-based control chart for interval-valued data
  • Feb 17, 2024
  • Communications in Statistics - Simulation and Computation
  • J Ravichandran + 2 more

Construction of control charts is straightforward when the data are real-valued. However, there are situations where data are essentially interval-valued, with each observation represented by a minimum and a maximum value. While there have been a few attempts to develop control charts for interval-valued data, no simple-to-use approaches have been developed for such data. In this article, we propose to use the popular Taguchi orthogonal array experiment to first redefine the data for the experimental setup and then use Six Sigma control limits to determine overall control limits for interval-valued data. The resulting multiple control limits have inner and outer control limits for simultaneously monitoring the minimum and maximum values of the averages of interval-valued data. The proposed control chart is illustrated using a data set, and conclusions are drawn from the results. The proposed Six Sigma-based control chart for interval-valued data is easy to use and performs better in detecting out-of-control situations, which we establish using average run length comparisons with traditional charts.

  • Research Article
  • Cited by 31
  • 10.1016/j.asoc.2011.01.006
Midpoint radius self-organizing maps for interval-valued data with telecommunications application
  • Jan 11, 2011
  • Applied Soft Computing
  • Pierpaolo D’Urso + 1 more


  • Research Article
  • Cited by 1
  • 10.1080/02331888.2020.1811282
A self-consistent estimator for interval-valued data
  • Aug 28, 2020
  • Statistics
  • Hyejeong Choi + 3 more

In interval-valued data, the variable of interest is provided in the form of an interval with lower and upper bounds, not a single value. A univariate representation for the interval is by its nature not unique, in particular when interval-valued data are of the min-max (MM) type. Researchers focus on the marginal histogram distribution, which is well suited to measurement-error (ME) type interval data. Two estimators, the empirical histogram estimator and the nonparametric kernel estimator, have been proposed in the literature for estimating the marginal histogram. In this paper, we define a new univariate representation, named the self-consistent marginal, for interval-valued data, and propose a self-consistent estimator (SCE) to estimate it. We theoretically and numerically investigate the properties of the SCE under various assumptions. We further illustrate the advantages of the SCE over the two existing estimators with empirical examples.

  • Book Chapter
  • Cited by 6
  • 10.1007/978-3-319-40596-4_57
Participatory Learning Fuzzy Clustering for Interval-Valued Data
  • Jan 1, 2016
  • Leandro Maciel + 3 more

This paper suggests an interval participatory learning fuzzy clustering (iPL) method for partitioning interval-valued data. Participatory learning provides a paradigm for learning that emphasizes the pervasive role of what is already known or believed in the learning process. The iPL clustering method uses interval arithmetic and the Hausdorff distance to compute the (dis)similarity between intervals. Computational experiments are reported using synthetic interval data sets with linearly non-separable clusters of different shapes and sizes. Comparisons include traditional hard and fuzzy clustering techniques for interval-valued data as benchmarks, in terms of the corrected Rand (CR) index for comparing two partitions. The results suggest that the interval participatory learning fuzzy clustering algorithm is highly effective for clustering interval-valued data and has performance comparable to alternative hard and fuzzy interval-based approaches.

  • Research Article
  • Cited by 6
  • 10.1007/s10182-016-0274-z
Data generation processes and statistical management of interval data
  • Jun 30, 2016
  • AStA Advances in Statistical Analysis
  • Ángela Blanco-Fernández + 1 more

Statistical methods for dealing with interval data have been developed for some time. Real intervals are the natural extension of real point values, commonly considered to generalize the nature of experimental outcomes from the classical scenario to a more imprecise situation. Interval data have mainly been treated in the context of fuzzy models, as a particular case of increasing the level of imprecision of the data. However, specific methods to deal explicitly with interval data have also been developed. We describe which experimental settings might result in interval-valued data and present some of the major statistical procedures used to deal with interval data. Given the quite different data generation processes resulting in interval data, we discuss which method appears most appropriate for specific types of interval data. Some practical applications demonstrate the link between data generation processes, specific types of interval data, and the statistical methods used for their analysis.

  • Research Article
  • Cited by 8
  • 10.1002/qre.3199
Robust monitoring of multivariate processes with short‐ranged serial data correlation
  • Sep 10, 2022
  • Quality and Reliability Engineering International
  • Xiulin Xie + 1 more

Control charts are commonly used in practice for detecting distributional shifts of sequential processes. Traditional statistical process control (SPC) charts are based on the assumptions that process observations are independent and identically distributed and follow a parametric distribution when the process is in‐control (IC). In practice, these assumptions are rarely valid, and it has been well demonstrated that these traditional control charts are unreliable to use when their model assumptions are invalid. To overcome this limitation, nonparametric SPC has become an active research area, and some nonparametric control charts have been developed. But, most existing nonparametric control charts are based on data ordering and/or data categorization of the original process observations, which would result in information loss in the observed data and consequently reduce the effectiveness of the related control charts. In this paper, we suggest a new multivariate online monitoring scheme, in which process observations are first sequentially decorrelated, the decorrelated data of each quality variable are then transformed using their estimated IC distribution so that the IC distribution of the transformed data would be roughly N(0, 1), and finally the conventional multivariate exponentially weighted moving average (MEWMA) chart is applied to the transformed data of all quality variables for online process monitoring. This chart is self‐starting in the sense that estimates of all related IC quantities are updated recursively over time. It can well accommodate stationary short‐range serial data correlation, and its design is relatively simple since its control limit can be determined in advance by a Monte Carlo simulation. Because information loss due to data ordering and/or data categorization is avoided in this approach, numerical studies show that it is reliable to use and effective for process monitoring in various cases considered.
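The transform-then-chart idea at the core of this scheme can be sketched in one dimension. The paper's full method also handles sequential decorrelation, recursive self-starting updates, and a multivariate MEWMA; the univariate sketch below, with function names of our own choosing, illustrates only the "transform to roughly N(0, 1), then chart" step:

```python
import math
from statistics import NormalDist

def to_standard_normal(x, reference):
    """Probability-integral transform: empirical CDF of an in-control (IC)
    reference sample, followed by the standard normal quantile function."""
    n = len(reference)
    rank = sum(1 for r in reference if r <= x)
    p = (rank + 0.5) / (n + 1)  # keep p strictly inside (0, 1)
    return NormalDist().inv_cdf(p)

def ewma_chart(z_values, lam=0.2, L=3.0):
    """Univariate EWMA on transformed observations (assumed ~ N(0, 1) in control).
    Returns the index (1-based) of the first signal, or None if none occurs."""
    e = 0.0
    for t, z in enumerate(z_values, start=1):
        e = lam * z + (1 - lam) * e
        # exact EWMA standard deviation at time t for N(0, 1) inputs
        sd = math.sqrt(lam / (2 - lam) * (1 - (1 - lam) ** (2 * t)))
        if abs(e) > L * sd:
            return t
    return None
```

Because the chart operates on transformed values rather than ranks or categories of the raw data, the in-control behaviour is approximately distribution-free while the ordering information in each observation is retained.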

  • Research Article
  • Cited by 2
  • 10.1016/j.eswa.2023.122277
Ordinal classification for interval-valued data and interval-valued functional data
  • Oct 29, 2023
  • Expert Systems With Applications
  • Aleix Alcacer + 2 more


  • Conference Article
  • Cited by 19
  • 10.1109/fuzz-ieee.2017.8015466
Efficient modeling and representation of agreement in interval-valued data
  • Jul 1, 2017
  • Timothy C Havens + 2 more

Recently, there has been much research into effective representation and analysis of uncertainty in human responses, with applications in cyber-security, forest and wildlife management, and product development, to name a few. Most of this research has focused on representing the response uncertainty as intervals, e.g., "I give the movie between 2 and 4 stars." In this paper, we extend the model-based interval agreement approach (IAA) for combining interval data into fuzzy sets and propose the efficient IAA (eIAA) algorithm, which enables efficient representation of and operation on the fuzzy sets produced by IAA (and other interval-based approaches, for that matter). We develop methods for efficiently modeling, representing, and aggregating both crisp and uncertain interval data (where the interval endpoints are intervals themselves). These intervals are assumed to be collected from individual or multiple survey respondents over single or repeated surveys, although, without loss of generality, the approaches put forth in this paper could be used for any interval-based data where representation and analysis are desired. The proposed method is designed to minimize loss of information when transferring the interval-based data into fuzzy set models and then when projecting onto a compressed set of basis functions. We provide full details of eIAA and demonstrate it on real-world and synthetic data.

  • Conference Article
  • Cited by 5
  • 10.1109/fuzzy.2011.6007336
Kernel-based fuzzy clustering of interval data
  • Jun 1, 2011
  • Bruno A Pimentel + 2 more

Kernel clustering methods have been very important in applying unsupervised machine learning to real problems. Kernel methods possess many advantages beyond non-linearity, such as modularity, the ability to work with heterogeneous descriptions of data, and the incorporation of prior knowledge. In this paper, we present a clustering method based on kernel functions for partitioning a set of interval-valued data. In addition, this method is compared to a fuzzy partitioning approach for interval data introduced previously. Experiments with real and synthetic symbolic interval-valued data sets are presented. The clustering results furnished by the methods are evaluated by computing an external cluster validity index and the global error rate of classification.

  • Book Chapter
  • Cited by 13
  • 10.1007/978-3-319-49046-5_40
A Convex Combination Method for Linear Regression with Interval Data
  • Jan 1, 2016
  • Somsak Chanaim + 2 more

This paper introduces a new approach to fitting a linear regression model to interval-valued data by relaxing the assumption that the center of the interval must be used. We use a convex combination of the lower and upper values of the interval data, governed by a parameter with value in [0, 1]; the center method thus becomes a special case of this method. For a real application, we use the Capital Asset Pricing Model (CAPM) and an autoregressive model (AR(p)) with interval-valued data to show that this method can provide a better result than the center method, based on the Akaike information criterion (AIC).
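A minimal sketch of the idea, assuming simple univariate regression and a grid search over the combination weight; the paper's estimation details may differ, and `fit_ols_1d` / `convex_combination_fit` are illustrative names of our own:

```python
import math

def fit_ols_1d(x, y):
    """Ordinary least squares for y = a + b*x; returns (a, b, rss)."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = ybar - b * xbar
    rss = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    return a, b, rss

def convex_combination_fit(lo, hi, y, grid=101):
    """Grid-search the weight lam in [0, 1]; the regressor is
    x = lam*lo + (1 - lam)*hi, so lam = 0.5 recovers the center method.
    The best lam minimises AIC = n*log(rss/n) + 2k with k = 2 parameters."""
    n = len(y)
    best = None
    for i in range(grid):
        lam = i / (grid - 1)
        x = [lam * l + (1 - lam) * h for l, h in zip(lo, hi)]
        a, b, rss = fit_ols_1d(x, y)
        aic = n * math.log(max(rss, 1e-12) / n) + 2 * 2  # floor rss for exact fits
        if best is None or aic < best[0]:
            best = (aic, lam, a, b)
    return best  # (aic, lam, intercept, slope)
```

If the response truly depends on an interior point of each interval rather than its midpoint, the selected weight moves away from 0.5, which is exactly the flexibility the abstract describes.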
