Abstract
Local Outlier Factor (LOF) is an unsupervised anomaly detection algorithm that identifies anomalies by comparing the local density of a data point with that of its neighbors. Anomaly detection is the task of finding data points that deviate markedly from the rest of a dataset. Anomalies in real-time datasets may indicate critical events such as bank fraud, data compromise, or network threats. This paper presents an implementation of the LOF algorithm on the HPCC Systems platform, an open-source distributed computing platform for big data analytics. An improved LOF is also proposed that efficiently detects anomalies in datasets rich in duplicates. The impact of varying hyperparameters on the performance of LOF is examined on HPCC Systems. The performance of LOF is compared with that of other algorithms such as COF, LoOP, and kNN over several datasets on HPCC Systems. Additionally, the efficacy of LOF is evaluated across big-data frameworks such as Spark, Hadoop, and HPCC Systems by comparing their runtime performance.
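To make the density comparison concrete, the following is a minimal single-machine sketch of the standard LOF computation in plain Python. It is not the paper's HPCC Systems implementation, and it does not reproduce the proposed improved LOF. The function name `lof_scores`, the brute-force distance matrix, and the toy data are assumptions made for illustration. The sketch also assumes that no point has k or more exact duplicates, since the reachability sum would then be zero.

```python
import math


def lof_scores(points, k):
    """Return the LOF score of every point (brute force, O(n^2) distances)."""
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]

    k_dist = []      # distance from each point to its k-th nearest neighbor
    neighbors = []   # indices within the k-distance (ties included)
    for i in range(n):
        others = sorted((dist[i][j], j) for j in range(n) if j != i)
        kd = others[k - 1][0]
        k_dist.append(kd)
        neighbors.append([j for d, j in others if d <= kd])

    def local_reachability_density(i):
        # reach-dist(i, j) = max(k-distance(j), d(i, j))
        total = sum(max(k_dist[j], dist[i][j]) for j in neighbors[i])
        # Assumption: total > 0, i.e. fewer than k exact duplicates of point i.
        return len(neighbors[i]) / total

    lrd = [local_reachability_density(i) for i in range(n)]

    # LOF(i) = mean over neighbors j of lrd(j) / lrd(i)
    return [
        sum(lrd[j] for j in neighbors[i]) / (len(neighbors[i]) * lrd[i])
        for i in range(n)
    ]


# Hypothetical toy data: four points on a unit square plus one distant point.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0), (5.0, 5.0)]
scores = lof_scores(points, k=2)
```

A score close to 1 means a point is about as dense as its neighbors, while a score well above 1 marks it as a candidate anomaly. In this toy data the four square corners score 1 and the distant point scores well above 1.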