Abstract
As one of the most popular machine learning methods, random forests have been successfully applied to data analysis tasks such as classification, regression and cluster analysis. Recently, the random forest clustering method has received much attention due to its simplicity, accuracy and robustness. However, the random forest clustering algorithm cannot be applied directly to the discrete sequence clustering problem, because discrete sequences come with neither explicit features nor “negative” sequences. In this paper, we propose a new random forest clustering algorithm for discrete sequences. The proposed method first injects a set of decoy sequences and then constructs the random forest in a supervised and adaptive manner by generating features on the fly. Experimental results on real data sets show that the proposed method achieves better performance than state-of-the-art discrete sequence clustering algorithms.
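The pipeline described above can be illustrated with a minimal sketch. It is not the authors' implementation, and every concrete choice in it is an assumption made for illustration: decoys are produced by shuffling the symbols of each real sequence, a feature is the presence of a randomly sampled substring, splits are chosen by Gini impurity, and the similarity of two sequences is the fraction of trees in which they reach the same leaf. All function names and parameters (`make_decoys`, `sample_pattern`, `build_tree`, `forest_similarity`, `max_depth`, `n_candidates`) are hypothetical, and sequences are assumed to be non-empty strings.

```python
import random


def make_decoys(seqs, rng):
    """Assumed decoy scheme: shuffle the symbols of each real sequence."""
    decoys = []
    for s in seqs:
        chars = list(s)
        rng.shuffle(chars)
        decoys.append("".join(chars))
    return decoys


def gini(labels):
    """Gini impurity of a list of binary labels (1 = real, 0 = decoy)."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)


def sample_pattern(seqs, rng, max_len=3):
    """Generate a candidate feature on the fly: a random substring."""
    s = rng.choice(seqs)
    length = rng.randint(1, min(max_len, len(s)))
    start = rng.randint(0, len(s) - length)
    return s[start:start + length]


def build_tree(items, rng, depth=0, max_depth=4, n_candidates=10):
    """Grow one tree on (sequence, label) pairs; None marks a leaf."""
    labels = [lab for _, lab in items]
    if depth >= max_depth or len(set(labels)) < 2:
        return None
    real = [s for s, lab in items if lab == 1]
    best = None
    for _ in range(n_candidates):
        pat = sample_pattern(real, rng)
        yes = [it for it in items if pat in it[0]]
        no = [it for it in items if pat not in it[0]]
        if not yes or not no:
            continue  # the pattern does not split this node
        score = (len(yes) * gini([l for _, l in yes])
                 + len(no) * gini([l for _, l in no])) / len(items)
        if best is None or score < best[0]:
            best = (score, pat, yes, no)
    if best is None:
        return None
    _, pat, yes, no = best
    return (pat,
            build_tree(yes, rng, depth + 1, max_depth, n_candidates),
            build_tree(no, rng, depth + 1, max_depth, n_candidates))


def leaf_of(tree, seq):
    """Return the root-to-leaf path of a sequence as a string of 0/1."""
    path = ""
    while tree is not None:
        pat, yes, no = tree
        if pat in seq:
            path, tree = path + "1", yes
        else:
            path, tree = path + "0", no
    return path


def forest_similarity(seqs, n_trees=50, seed=0):
    """Fraction of trees in which each pair of real sequences shares a leaf."""
    rng = random.Random(seed)
    n = len(seqs)
    counts = [[0] * n for _ in range(n)]
    for _ in range(n_trees):
        decoys = make_decoys(seqs, rng)  # fresh decoys for every tree
        items = [(s, 1) for s in seqs] + [(d, 0) for d in decoys]
        tree = build_tree(items, rng)
        leaves = [leaf_of(tree, s) for s in seqs]
        for i in range(n):
            for j in range(n):
                if leaves[i] == leaves[j]:
                    counts[i][j] += 1
    return [[c / n_trees for c in row] for row in counts]
```

The resulting similarity matrix can then be passed to any clustering procedure that accepts pairwise similarities, for example hierarchical clustering on `1 - similarity`.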