Abstract

Normalizing data into the range 0 to 1 is an essential preliminary step for accurate data mining results, including those of the real value negative selection algorithm. Because this is a one-class classification method, only the self samples are available during normalization, so there is less confidence that they represent the whole problem when the non-self samples are unknown. An "out of range" problem arises when the values of the data being monitored exceed the boundaries set in the normalization phase. This study aimed to investigate the effect of the normalization technique and to identify the most reliable normalization algorithm for the real value negative selection algorithm, mainly when the non-self samples are not available. Three normalization algorithms were selected for the experiment: min-max, soft-max scaling, and z-score. Four universal datasets were normalized, and the performance of each normalization algorithm with the real value negative selection algorithm was measured on five key performance metrics: detection rate, specificity, false alarm rate, accuracy, and number of detectors. The results indicate that real value negative selection relies heavily on the type of normalization algorithm, and that selecting an appropriate normalization approach can improve detection performance. Min-max is the most reliable algorithm for real value negative selection, as it consistently produces good detection performance. Z-score has a similar capability; however, min-max appears to be the better approach in terms of higher specificity, lower false alarm rate, and fewer detectors. Soft-max scaling, meanwhile, was found to be unsuitable for the real value negative selection algorithm.
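
To make the "out of range" problem concrete, the following is a minimal Python sketch, not the study's implementation. It fits the three normalizations on self samples only and then applies them to a monitored value outside the self range. The function names, the toy data, and the soft-max form (a logistic function of the z-score) are assumptions for illustration; the abstract does not give the exact formulas used.

```python
import math
from statistics import mean, pstdev

def fit_stats(self_samples):
    """Collect normalization parameters from self samples only (hypothetical helper)."""
    return {
        "min": min(self_samples),
        "max": max(self_samples),
        "mean": mean(self_samples),
        "std": pstdev(self_samples),  # population standard deviation
    }

def min_max(x, s):
    # Maps the self range onto [0, 1]; assumes max > min.
    return (x - s["min"]) / (s["max"] - s["min"])

def z_score(x, s):
    # Centers on the self mean and scales by the self standard deviation; assumes std > 0.
    return (x - s["mean"]) / s["std"]

def softmax_scaling(x, s):
    # Assumed form: logistic squashing of the z-score, always inside (0, 1).
    return 1.0 / (1.0 + math.exp(-z_score(x, s)))

self_samples = [2.0, 4.0, 6.0, 8.0]   # toy self data
stats = fit_stats(self_samples)
monitored = 11.0                      # value beyond the self maximum

out_of_range = min_max(monitored, stats)    # 1.5, outside [0, 1]
squashed = softmax_scaling(monitored, stats)  # stays strictly below 1
```

In this sketch, min-max maps the unseen value to 1.5, outside the 0 to 1 range fixed during normalization, while the assumed soft-max form keeps it inside the range by compressing values far from the mean.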
