Abstract

Breast cancer is currently the most common malignant tumor in humans, more common even than lung cancer. Fine Needle Aspiration (FNA) is the most common way to determine whether a breast mass is malignant. Although the survival rate is low once a malignant tumor has spread, patients still have a good chance of survival if the tumor is detected at an early stage. Because the early-stage death rate is low compared with other cancers, a convenient and efficient way to help diagnose whether a breast mass is benign or malignant is crucial. However, many existing models for predicting breast cancer include patient-level variables, such as weight changes and lifestyle, that require long-term tracking surveys, which cost precious time during the early stage of the tumor. A model that considers only the numeric variables from the FNA would therefore save time and improve the chance of survival. In this study, a model is built on FNA data from breast tumor patients in Wisconsin in 1995. Several variable selection techniques, including AIC, ridge regression, the variance inflation factor (VIF), LASSO, and cross-validation, are used to find the best combination of variables. After variable selection, a model based on the General Linear Model (GLM) is created, with an accuracy of 0.88.
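
As an illustration of one of the filtering steps named above, the following is a minimal sketch of VIF-based variable filtering. It is not the authors' pipeline: the data are synthetic, the feature names (`radius`, `perimeter`, `texture`) are illustrative stand-ins for FNA measurements, and the cutoff of 10 is a common rule of thumb assumed here, not a value taken from the study. The sketch relies on the identity that the VIF of each variable equals the corresponding diagonal entry of the inverse correlation matrix.

```python
import numpy as np


def vif(X):
    """Variance inflation factor of each column of X (n_samples x n_features).

    Uses the identity VIF_j = [R^{-1}]_{jj}, where R is the correlation matrix.
    """
    corr = np.corrcoef(X, rowvar=False)
    return np.diag(np.linalg.inv(corr))


def drop_high_vif(X, names, threshold=10.0):
    """Repeatedly drop the column with the largest VIF until all are <= threshold.

    The threshold of 10 is an assumed rule of thumb, not a value from the study.
    """
    X = np.asarray(X, dtype=float)
    names = list(names)
    while X.shape[1] > 1:
        scores = vif(X)
        worst = int(np.argmax(scores))
        if scores[worst] <= threshold:
            break
        X = np.delete(X, worst, axis=1)
        del names[worst]
    return X, names


# Synthetic stand-in data (NOT the Wisconsin FNA data): perimeter is almost a
# linear function of radius, so the two columns are strongly collinear.
rng = np.random.default_rng(0)
radius = rng.normal(14.0, 3.5, size=500)
perimeter = 2 * np.pi * radius + rng.normal(0.0, 0.5, size=500)
texture = rng.normal(19.0, 4.0, size=500)

X = np.column_stack([radius, perimeter, texture])
X_kept, kept = drop_high_vif(X, ["radius", "perimeter", "texture"])
print(kept)
```

On data like this, one of the two collinear columns is removed and the independent column is retained. In a fuller workflow, the surviving variables would then be passed to the other selection steps (AIC, LASSO, cross-validation) before fitting the final model.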
