Abstract

The classification of genomic and proteomic data in extremely high dimensional datasets is a well-known problem which requires appropriate classification techniques. Classification methods are usually combined with gene selection techniques to provide optimal classification conditions—i.e. a lower dimensional classification environment. Another reason for reducing the dimensionality of such datasets is their interpretability, as it is much easier to interpret a small set of ranked genes than 20 thousand genes. This paper evaluates the classification performance of Rotation Forest classifier on small subsets of ranked genes for two dataset collections consisting of 47 genomic and proteomic classification problems. Robustness and high classification accuracy is shown to be an important feature of Rotation Forest when applied to small sets of genes.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call