Abstract
Active learning can greatly benefit environments where unlabeled data is plentiful but the cost of labeling it can be prohibitive. Security operations centers (SOCs) are one such environment: they can draw on the human expertise already available to improve machine learning-based detection models through active learning. In the context of SOC operations and IoT botnet detection, our study benchmarks different active learning approaches within the framework of pool-based sampling. The selection of the optimal query instance is evaluated using uncertainty sampling, ranked batch-mode sampling, and query-by-committee strategies. Our results show that all the query strategies tested in our benchmarking setup help to generate better detection models. Leveraging human–machine interaction can produce high-performance models for IoT botnet detection using significantly less data than the passive approaches traditionally used to build machine learning-based detection systems. Additionally, the impact of mislabeled data on the active learning implementation is explored.
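To make the query step in pool-based sampling concrete, the following is a minimal sketch of uncertainty sampling using the least-confidence score. It is an illustration under stated assumptions, not the implementation benchmarked in the study: the function names, the batch size, and the example probabilities are all hypothetical.

```python
from typing import List, Sequence


def least_confidence(probs: Sequence[float]) -> float:
    """Uncertainty of one prediction: 1 minus the highest class probability."""
    return 1.0 - max(probs)


def select_queries(pool_probs: Sequence[Sequence[float]], batch_size: int = 1) -> List[int]:
    """Return the indices of the `batch_size` most uncertain pool instances."""
    ranked = sorted(
        range(len(pool_probs)),
        key=lambda i: least_confidence(pool_probs[i]),
        reverse=True,  # most uncertain first
    )
    return ranked[:batch_size]


# Hypothetical model outputs (P(benign), P(botnet)) for four unlabeled instances.
pool_probs = [(0.95, 0.05), (0.55, 0.45), (0.10, 0.90), (0.70, 0.30)]

# Instances the analyst would be asked to label next.
queries = select_queries(pool_probs, batch_size=2)
print(queries)
```

In a full loop, the selected instances would be labeled by an analyst, moved from the pool to the training set, and the model retrained before the next query round.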