Abstract
Detecting anomalous readings in sensor data remains a difficult problem in hydrological data monitoring. Sensor devices used for data collection face hardware malfunctions, power issues, limited battery life, and efficiency constraints, and under these conditions they may record inconsistent data. Applying classification or other data-analytic methods to such datasets can produce inaccurate results. To identify anomalous data in a large dataset, this paper proposes a hybrid mechanism built from three unsupervised machine learning techniques. First, Principal Component Analysis (PCA) is used to reduce superfluous data. Isolation Forest (IF) is then used to compute outlier scores. Finally, K-means clustering is used to separate anomalous from regular data, with the cluster assignments represented visually. The hybrid approach was evaluated using cluster assessment indices. The findings suggest that the technique is suitable for identifying anomalous data within each data index of the dataset, depending on the target value.
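The three-stage mechanism described above can be illustrated with a minimal sketch in Python using scikit-learn. This is not the authors' implementation. Several choices here are assumptions that the abstract does not specify: the features are standardized before PCA, two principal components are kept, K-means is run with two clusters on the one-dimensional Isolation Forest scores, the cluster with the higher mean score is treated as anomalous, and the silhouette coefficient stands in for the cluster assessment indices. The function name `detect_anomalies` and the synthetic data are hypothetical.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
from sklearn.ensemble import IsolationForest
from sklearn.metrics import silhouette_score
from sklearn.preprocessing import StandardScaler


def detect_anomalies(X, n_components=2, random_state=0):
    """Hypothetical PCA -> Isolation Forest -> K-means pipeline.

    Returns (is_anomaly, scores, silhouette), where a higher score
    means a more anomalous sample.
    """
    # Step 1: standardize (assumption), then keep the leading components.
    X_scaled = StandardScaler().fit_transform(X)
    X_reduced = PCA(n_components=n_components,
                    random_state=random_state).fit_transform(X_scaled)

    # Step 2: Isolation Forest outlier scores. score_samples returns
    # lower values for outliers, so negate it.
    forest = IsolationForest(n_estimators=100, random_state=random_state)
    forest.fit(X_reduced)
    scores = -forest.score_samples(X_reduced)

    # Step 3: split the scores into two groups with K-means (assumption:
    # clustering is done on the scores, not on the reduced features).
    km = KMeans(n_clusters=2, n_init=10, random_state=random_state)
    labels = km.fit_predict(scores.reshape(-1, 1))
    anomalous_cluster = int(np.argmax(km.cluster_centers_.ravel()))
    is_anomaly = labels == anomalous_cluster

    # One possible cluster assessment index (assumption).
    silhouette = silhouette_score(scores.reshape(-1, 1), labels)
    return is_anomaly, scores, silhouette


# Synthetic stand-in for sensor readings: mostly regular samples plus a
# few shifted ones. These values are illustrative, not hydrological data.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 1.0, size=(500, 5)),
               rng.normal(8.0, 1.0, size=(10, 5))])
is_anomaly, scores, silhouette = detect_anomalies(X)
print("flagged samples:", int(is_anomaly.sum()))
print("silhouette:", round(float(silhouette), 3))
```

Because K-means sets the boundary from the score distribution alone, the flagged group may include borderline regular samples as well as true anomalies. In practice the number of components, the number of clusters, and the choice of assessment index would need to be tuned to the dataset.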