Outlier Detection for Pandemic-Related Data Using Compositional Functional Data Analysis

Christopher Rieser, Peter Filzmoser

doi:10.1007/978-3-030-78334-1_12

Abstract

With accurate data, governments can make the most informed decisions to keep people safer through pandemics such as COVID-19. In such events, data reliability is crucial, and outlier detection is therefore an important and even unavoidable issue. Outliers are often considered the most interesting observations, because the fact that they differ from the data majority may lead to relevant findings in the subject area. Outlier detection has also been addressed in the context of multivariate functional data, that is, smooth functions of several characteristics, often derived from measurements at different time points (Hubert et al. in Stat Methods Appl 24(2):177–202, 2015b). Here the underlying data are regarded as compositions, with the compositional parts forming the multivariate information, and thus only relative information in terms of log-ratios between these parts is considered relevant for the analysis. The multivariate functional data therefore have to be derived as smooth functions by utilising this relative information. Subsequently, already established multivariate functional outlier detection procedures can be used, but for interpretation purposes the functional data need to be presented in an appropriate space. The methodology is illustrated with publicly available data on the COVID-19 pandemic to find countries displaying outlying trends.
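As a reading aid, the relative information mentioned in the abstract can be written down with the standard centred log-ratio (clr) transformation from compositional data analysis. The abstract does not state which log-ratio coordinates the authors use, so the choice of clr and the symbols x_i(t), D, and g are assumptions made here for illustration only.

```latex
% Assumed notation: x(t) = (x_1(t), ..., x_D(t)) is a composition with
% strictly positive parts observed at time t; g is its geometric mean.
\[
  \operatorname{clr}\bigl(\mathbf{x}(t)\bigr)_i
    = \ln \frac{x_i(t)}{g\bigl(\mathbf{x}(t)\bigr)},
  \qquad
  g\bigl(\mathbf{x}(t)\bigr) = \Bigl(\prod_{j=1}^{D} x_j(t)\Bigr)^{1/D},
  \qquad i = 1, \dots, D.
\]
% Every pairwise log-ratio is a difference of two clr coefficients:
\[
  \ln \frac{x_i(t)}{x_j(t)}
    = \operatorname{clr}\bigl(\mathbf{x}(t)\bigr)_i
    - \operatorname{clr}\bigl(\mathbf{x}(t)\bigr)_j .
\]
```

Under this notation, multiplying all parts by a common positive factor leaves every log-ratio unchanged, which is the sense in which only relative information enters the analysis.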

Highlights

The crisis caused by COVID-19 in almost all areas of life has revealed that accurate data collection is a challenge that cannot be resolved, owing to political or logistical problems
Many countries report the number of cases, deaths, tests, and further parameters related to the COVID-19 pandemic regularly over time, and the data are accessible in public data repositories
The source of information for the analysis would not consist of the number of cases, deaths, tests, etc., for a particular day in a particular country, but of the (log-)ratios between these numbers. This is what is done in compositional data analysis, and outlier detection in this context focuses on atypical behaviour in the multivariate information of such (log-)ratios

Summary

12.1 Introduction

The crisis caused by COVID-19 in almost all areas of life has revealed that accurate data collection is a challenge that cannot be resolved, owing to political or logistical problems. Many countries report the number of cases, deaths, tests, and further parameters (variables) related to the COVID-19 pandemic regularly over time, and the data are accessible in public data repositories. Instead of directly considering the reported numbers (represented by the functions), one could focus on analysing relative information. This can be done by taking (log-)ratios between the variables. The source of information for the analysis would then not consist of the number of cases, deaths, tests, etc., for a particular day in a particular country, but of the (log-)ratios between these numbers. This is what is done in compositional data analysis, and outlier detection in this context focuses on atypical behaviour in the multivariate information of such (log-)ratios. In this paper we consider a new method for the detection of outliers in the compositional functional data setting.
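The idea of working with (log-)ratios rather than raw counts can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `pairwise_log_ratios` and `clr`, and the three daily counts, are hypothetical and chosen only to show that log-ratios are unaffected by a common rescaling of all variables.

```python
import numpy as np


def pairwise_log_ratios(counts):
    """Return the D x D matrix with entries log(x_i / x_j) for one composition."""
    x = np.asarray(counts, dtype=float)
    if np.any(x <= 0):
        raise ValueError("all parts must be strictly positive")
    log_x = np.log(x)
    # Broadcasting: row i minus column j gives log(x_i) - log(x_j).
    return log_x[:, None] - log_x[None, :]


def clr(counts):
    """Centred log-ratio: log of each part minus the mean of the logs."""
    x = np.asarray(counts, dtype=float)
    if np.any(x <= 0):
        raise ValueError("all parts must be strictly positive")
    log_x = np.log(x)
    return log_x - log_x.mean(axis=-1, keepdims=True)


# Hypothetical counts for one country on one day: cases, deaths, tests.
day = np.array([1200.0, 30.0, 45000.0])

ratios = pairwise_log_ratios(day)
# Rescaling every count by the same factor (for example, a change of
# reporting unit) does not change any log-ratio.
ratios_rescaled = pairwise_log_ratios(10.0 * day)

clr_day = clr(day)
clr_rescaled = clr(10.0 * day)
```

Zero counts would need separate treatment before taking logarithms; the sketch simply rejects them.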

12.1.1 Compositional Data Analysis Concepts

12.1.2 Functional Data

12.2 Smoothing for CODA Time Series

12.3 Outlier Detection in Compositional FDA

12.4 Application to COVID-19 Data

12.5 Summary and Conclusions


Publication Date: Oct 24, 2021
Citations: 2
License type: CC BY 4.0

Similar Papers

Multivariate functional outlier detection using the fast massive unsupervised outlier detection indices
Oluwasegun Taiwo Ojo ... Antonio Fernández Anta
Stat | VOL. 12
01 Jan 2023

Multivariate Functional Data Visualization and Outlier Detection
Wenlin Dai ... Marc G Genton
Journal of Computational and Graphical Statistics | VOL. 27
24 Aug 2018

Application of multivariate outlier detection to fluid velocity measurements
John Griffin ... Lawrence S Ukeiley
Experiments in Fluids | VOL. 49
14 Apr 2010

Outlier detection in multivariate functional data through a contaminated mixture model
Martial Amovin-Assagba ... Julien Jacques
Computational Statistics & Data Analysis | VOL. 174
06 Apr 2022
