Abstract

Bump hunting, proposed by Friedman and Fisher, has become important in many fields, including marketing and medicine. In particular, local sparse bump hunting algorithms such as CART (Classification and Regression Trees) and PRIM (Patient Rule Induction Method) are useful for addressing the unresolved questions of molecular heterogeneity and tumoral phenotype in cancer. In bump hunting, the trade-off curve, rather than the misclassification rate used in classification problems, serves as the criterion for judging whether an algorithm works effectively. The trade-off curve is constructed from the relation between the pureness rate and the capture rate. In earlier work, we assessed the accuracy of the trade-off curve in typical fundamental cases that may be observed in practice, and found that the proposed tree-GA can construct an effective trade-off curve. We also investigated the prediction accuracy of the tree-GA by comparing its trade-off curve with that obtained by PRIM, and found the tree-GA superior to PRIM when the sample size is large. In this paper, we focus on the sparse, small-sample-size cases observed in medical applications. Using Monte Carlo simulations of the typical fundamental cases, we found that non-ignorable biases exist in the tree-GA, and we propose a method to remove these biases.
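To make the trade-off curve concrete, the following is a minimal sketch and not the tree-GA or PRIM procedure itself. It assumes definitions the abstract does not spell out: the pureness rate is the fraction of points inside a candidate region that belong to the target class, and the capture rate is the fraction of all target-class points that fall inside that region. The function names, the axis-aligned box representation, and the toy data are hypothetical and chosen only for illustration.

```python
def rates(points, labels, box):
    """Return (pureness, capture) for one axis-aligned box.

    Assumed definitions (not given in the abstract):
      pureness = target points inside the box / all points inside the box
      capture  = target points inside the box / all target points
    `box` is a list of (low, high) bounds, one pair per coordinate.
    """
    inside = [
        all(lo <= x <= hi for x, (lo, hi) in zip(p, box))
        for p in points
    ]
    n_inside = sum(inside)
    n_target = sum(1 for y in labels if y == 1)
    n_hit = sum(1 for flag, y in zip(inside, labels) if flag and y == 1)
    pureness = n_hit / n_inside if n_inside else 0.0
    capture = n_hit / n_target if n_target else 0.0
    return pureness, capture


def trade_off_curve(points, labels, boxes):
    """Return (capture, pureness) pairs for the candidate boxes, sorted by capture."""
    pairs = []
    for box in boxes:
        pureness, capture = rates(points, labels, box)
        pairs.append((capture, pureness))
    return sorted(pairs)


# Hypothetical one-dimensional toy data: label 1 marks the target class.
demo_points = [(0.1,), (0.4,), (0.5,), (0.6,), (0.9,)]
demo_labels = [0, 1, 1, 0, 0]

# Two candidate regions: a narrow box and a wider one.
demo_boxes = [[(0.35, 0.45)], [(0.3, 0.7)]]
demo_curve = trade_off_curve(demo_points, demo_labels, demo_boxes)
```

On this toy data the narrow box is pure but captures only half of the target points, while the wider box captures all of them at lower pureness. That tension between the two rates is what the trade-off curve records.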
