Abstract

This paper presents an outlier detection technique for univariate normal data sets. Outliers are observations that lie an abnormal distance from the mean. Outlier detection is useful in areas such as fraud detection, financial analysis, health monitoring and statistical modelling. Many recent approaches detect outliers according to reasonable, pre-defined concepts of an outlier. Methods such as the Gaussian method of outlier detection have been widely used for univariate data sets. However, such methods rely on measures of central tendency and dispersion that are themselves affected by outliers, which makes them less robust. The study aimed to provide an alternative method for outlier detection in univariate normal data sets by using measures of central tendency and variation that are least affected by outliers (the median and the geometric measure of variation). The study formulated an outlier detection formula from the median and the geometric measure of variation, applied it to randomly simulated normal data sets containing outliers, and recorded the number of outliers detected in comparison with the two best existing methods of outlier detection. The study then compared the sensitivity of the three methods. The simulation was done in two ways: the first varied the mean while holding the standard deviation constant, and the second held the mean constant while varying the standard deviation. When the mean was varied, the formulated technique performed best, eliminating more of the outliers that needed to be removed than the other two Gaussian outlier detection techniques.
The study also established that the formulated method was stricter when the standard deviation was varied, but it still stood out as the best because an outlier is defined relative to the mean and not the standard deviation. The formulated method was more sensitive than the Gaussian method of outlier detection and performed as well as the best existing outlier detection technique. In conclusion, the formulated method can be employed for outlier detection in univariate normal data sets, as it performed almost the same as the best existing method of outlier detection for univariate data sets.
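The robustness argument above can be illustrated with a minimal Python sketch. It is not the formula from the study: the abstract does not define the geometric measure of variation, so the sketch substitutes the median absolute deviation (MAD) as a stand-in robust spread. The function names, the cutoff k = 3 and the sample values are all assumptions made for illustration.

```python
import statistics


def gaussian_outliers(data, k=3.0):
    """Flag points more than k sample standard deviations from the mean.

    Both the mean and the standard deviation are pulled toward extreme
    values, which is the weakness the abstract describes.
    """
    mean = statistics.fmean(data)
    sd = statistics.stdev(data)
    return [x for x in data if abs(x - mean) > k * sd]


def median_mad_outliers(data, k=3.0):
    """Flag points more than k robust-scale units from the median.

    ASSUMPTION: the MAD is used here only as a stand-in robust spread;
    it is not the study's geometric measure of variation.
    """
    med = statistics.median(data)
    mad = statistics.median([abs(x - med) for x in data])
    # 1.4826 rescales the MAD to match the standard deviation
    # when the data are normally distributed.
    scale = 1.4826 * mad
    if scale == 0:
        # Over half the points equal the median; this sketch gives up.
        return []
    return [x for x in data if abs(x - med) > k * scale]


if __name__ == "__main__":
    # Hypothetical sample: ten values near 10 plus two planted outliers.
    sample = [10, 11, 9, 10, 12, 8, 10, 11, 9, 10, 50, 55]
    print("Gaussian rule:", gaussian_outliers(sample))
    print("Median/MAD rule:", median_mad_outliers(sample))
```

On this sample the two planted values inflate the mean and standard deviation enough that the Gaussian rule flags nothing, while the median-based rule flags both. The paper's own formula may behave differently, since its measure of spread is not the MAD.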
