Abstract
Continuous data mining has led to the generation of multi class datasets through microarray technology. New improved algorithms are then required to process and interpret these data. Cancer prediction tailored with variable selection process has shown to improve the overall prediction accuracy. Through variable selection process, the amount of informative genes gathered are much lesser than the initial data, yet the selective subset present in other methods cannot be fine-tuned to suit the necessity for particular number of variables. Hence, an improved technique of various variable range selection based on Random Forest method is proposed to allow selective variable subsets for cancer prediction. Our results indicate improvement in the overall prediction accuracy of cancer data based on the improved various variable range selection technique which allows selective variable selection to create best subset of genes. Moreover, this technique can assist in variable interaction analysis, gene network analysis, gene-ranking analysis and many other related fields.
Published Version
Talk to us
Join us for a 30 min session where you can share your feedback and ask us any queries you have