Abstract
Using large data sets in practice usually requires identifying, annotating, and classifying rare events or anomalies. Although several methods exist, they fall into two classes of algorithms: anomaly detection algorithms and classification algorithms. The two classes have different characteristics, and in this paper we compare them on two cases: data from a neurointensive care unit and data from microwave radio transmissions. We apply the Isolation Forest and Random Forest algorithms to find events that occur in the data with a frequency of ca. 1%. The results show that the classification algorithm (Random Forest) performs better and can achieve up to 100% accuracy, while the anomaly detection algorithm (Isolation Forest) achieves only 73% at best. Based on these results, we conclude that it is better to invest in annotating data a priori and use classification algorithms, despite the lower cost of using anomaly detection algorithms.
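As an illustration of the kind of comparison described above, the following is a minimal sketch, not the setup used in the paper. It assumes scikit-learn and NumPy are available and uses synthetic Gaussian data with roughly 1% rare events in place of the neurointensive care and microwave radio data. The function names `make_data` and `compare`, the feature count, and all parameter values are hypothetical choices made for this example.

```python
import numpy as np
from sklearn.ensemble import IsolationForest, RandomForestClassifier
from sklearn.metrics import precision_score, recall_score
from sklearn.model_selection import train_test_split


def make_data(n=5000, rare_rate=0.01, seed=0):
    """Synthetic stand-in data (assumption): rare events are shifted Gaussians."""
    rng = np.random.default_rng(seed)
    n_rare = int(n * rare_rate)
    normal = rng.normal(0.0, 1.0, size=(n - n_rare, 4))
    rare = rng.normal(3.0, 1.0, size=(n_rare, 4))
    X = np.vstack([normal, rare])
    y = np.concatenate([np.zeros(n - n_rare, dtype=int),
                        np.ones(n_rare, dtype=int)])
    return X, y


def compare(seed=0):
    """Return (precision, recall) on the rare class for both approaches."""
    X, y = make_data(seed=seed)
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, test_size=0.3, stratify=y, random_state=seed)

    # Anomaly detection: fitted without labels; predict() marks outliers as -1.
    iso = IsolationForest(contamination=0.01, random_state=seed).fit(X_tr)
    iso_pred = (iso.predict(X_te) == -1).astype(int)

    # Classification: requires annotated training labels.
    rf = RandomForestClassifier(n_estimators=100, random_state=seed)
    rf_pred = rf.fit(X_tr, y_tr).predict(X_te)

    def scores(pred):
        return (precision_score(y_te, pred, zero_division=0),
                recall_score(y_te, pred, zero_division=0))

    return {"isolation_forest": scores(iso_pred),
            "random_forest": scores(rf_pred)}


if __name__ == "__main__":
    for name, (precision, recall) in compare().items():
        print(f"{name}: precision={precision:.2f} recall={recall:.2f}")
```

The sketch reports precision and recall on the rare class rather than accuracy, because with events at about 1% frequency a model that never flags anything already scores about 99% accuracy. The numbers it prints depend on the synthetic data and should not be read as a reproduction of the results reported above.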