Monte Carlo Optimization for Sliding Window Size in Dixon Quality Control of Environmental Monitoring Time Series Data

Zhongya Fan, Fantang Zeng, Wencai Wang, Huiyun Feng, Ni Jiang, Changjin Zhao, Jingang Jiang

doi:10.3390/app10051876

Abstract

Outliers are common in large water quality monitoring time series. Combining the sliding window technique with the Dixon detection criterion allows outliers in such series to be detected automatically, but the approach is limited because the sliding window size is chosen empirically. Determining the optimal window size on a principled basis is therefore a worthwhile research problem. This paper presents a new Monte Carlo Search Method (MCSM), based on random sampling, that optimizes the size of the sliding window by drawing on both computing power and statistics. The MCSM was applied in a case study to automatic monitoring data for water quality factors to test its validity and usefulness. A comparison of its accuracy and efficiency indicates that the method is sound and effective. The experimental results show that, at different sample sizes, the average accuracy is between 58.70% and 75.75%, and the average computation time increase is between 17.09% and 45.53%. In the era of big data in environmental monitoring, the proposed method can meet the required accuracy of outlier detection and improve the efficiency of calculation.
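To make the role of the window size concrete, the following is a minimal Python sketch of a sliding-window Dixon test. It is an illustration under stated assumptions, not the authors' implementation: the function names `dixon_outlier_index` and `sliding_dixon` are hypothetical, the critical values are the commonly tabulated two-sided 95% Dixon Q values for 3 to 10 observations (not taken from the paper), and the Monte Carlo search over window sizes itself is not shown.

```python
# Commonly tabulated two-sided Dixon Q critical values at 95% confidence
# for n = 3..10 (an assumption for this sketch, not taken from the paper).
Q95 = {3: 0.970, 4: 0.829, 5: 0.710, 6: 0.625,
       7: 0.568, 8: 0.526, 9: 0.493, 10: 0.466}


def dixon_outlier_index(window):
    """Return the position in `window` flagged by Dixon's Q test, or None."""
    n = len(window)
    if n not in Q95:
        raise ValueError("this sketch only supports window sizes 3 to 10")
    # Positions of the window values in ascending order of value.
    order = sorted(range(n), key=lambda i: window[i])
    lowest, highest = window[order[0]], window[order[-1]]
    spread = highest - lowest
    if spread == 0:
        return None  # all values equal, nothing to flag
    # Gap between each extreme and its nearest neighbour, relative to range.
    q_low = (window[order[1]] - lowest) / spread
    q_high = (highest - window[order[-2]]) / spread
    q, idx = max((q_low, order[0]), (q_high, order[-1]))
    return idx if q > Q95[n] else None


def sliding_dixon(series, window_size):
    """Slide a fixed-size window over `series`; return flagged indices."""
    flagged = set()
    for start in range(len(series) - window_size + 1):
        idx = dixon_outlier_index(series[start:start + window_size])
        if idx is not None:
            flagged.add(start + idx)
    return sorted(flagged)


# Hypothetical pH-like readings with one spike at index 4.
readings = [7.1, 7.2, 7.0, 7.1, 12.5, 7.2, 7.1, 7.0, 7.2, 7.1]
print(sliding_dixon(readings, window_size=5))
```

Because the critical value and the set of neighbours both depend on `window_size`, the same series can yield different flagged points for different sizes, which is why the choice of window size is the quantity being optimized.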

Highlights

The rapid development of the Internet of Things has promoted the application of smart sensors in the field of the environment, contributing to big data and the multi-dimension characteristics of environmental monitoring [1,2]
In order to scientifically compare the correctness of the new Monte Carlo Search Method (MCSM), Full Time Series Sliding Search Method (FTSSSM) experiments were carried out at the same time
When the sampling scale was 0.8n, the optimal window accuracy of different water quality factors was between 67.5% and 85%

Summary

Introduction

The rapid development of the Internet of Things has promoted the application of smart sensors in the environmental field, contributing to the big data and multi-dimensional characteristics of environmental monitoring [1,2]. Outlier processing is critical in environmental data analysis owing to its significant effect on subsequent analysis and modeling [3,4]. Values from automatic environmental monitoring are typical time series data: they are collected over long periods, and the causes of their outliers are complex. There are many ways to detect outliers in time series, such as outlier detection based on prior rules [7], statistical distribution characteristics [8], the Kalman Filter Model (KLM) and Bayesian model [9], the Generalised Linear Model (GLM)-based algorithm [10], intelligence algorithms [3], etc.

Journal: Applied Sciences	Publication Date: Mar 9, 2020
Citations: 6	License type: CC BY 4.0

Similar Papers

Development of a data-driven ensemble regressor and its applicability for identifying contextual and collective outliers in groundwater level time-series data
Yuhan Kim ... Mijin Kwon
Journal of Hydrology | VOL. 612
01 Sep 2022

A Long Short Term Memory with Peephole Connections and Generative Adversarial Network Based Collaborative Methodology to Identify Outliers in ECG Dataset
M D Anto Praveena ... B Bharathi
Journal of Computational and Theoretical Nanoscience | VOL. 17
01 Aug 2020

Genetic protein sequence analysis based on sequence alignment techniques for time series data
Shengjia Ni
Applied and Computational Engineering | VOL. 41
22 Feb 2024

Outlier detection and quasi-periodicity optimization algorithm: Frequency domain based outlier detection (FOD)
Ekin Can Erkuş ... Vilda Purutçuoğlu
European Journal of Operational Research | VOL. 291
17 Jan 2020
