Abstract
Non-small cell lung cancer (NSCLC) is the most common type of lung cancer. Identifying genes associated with the disease may contribute to its treatment, and many studies pursue this goal. In some of these studies, genetic data are obtained by microarray analysis and shared publicly in databases such as the NCBI Gene Expression Omnibus (GEO). In today’s big data era, machine learning algorithms are frequently used to extract valuable information from large volumes of data. In this study, all six microarray datasets related to NSCLC and drug resistance in the NCBI GEO database were analyzed in RStudio. Each dataset was analyzed separately, through the caret package, with the following algorithms: support vector machine, k-nearest neighbor, naïve Bayes, random forest, C5.0 decision tree, multilayer perceptron, and an artificial neural network with a principal component step. The top 10 genes for each algorithm are reported in the findings section in order of importance. In the resulting gene table, the ELOVL7, HMGA2, SAT1, RRM1, IER3, SLC7A11, and U2AF1 genes appear in at least two different datasets. These genes are recommended for experimental validation by researchers working in wet laboratory settings.
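The final selection step, keeping genes that appear in the top-10 lists of at least two different datasets, can be illustrated with a minimal sketch. It is not the authors' R/caret code: the function name `genes_in_multiple_datasets`, the input layout (dataset, then algorithm, then ranked gene list), and the placeholder gene and dataset names are all assumptions made for illustration, and no real results are reproduced here.

```python
from collections import defaultdict


def genes_in_multiple_datasets(top_genes, min_datasets=2):
    """Return genes found in the top lists of at least `min_datasets` datasets.

    `top_genes` is a hypothetical layout: {dataset: {algorithm: [gene, ...]}}.
    A gene picked by several algorithms within one dataset counts once.
    """
    datasets_per_gene = defaultdict(set)
    for dataset, per_algorithm in top_genes.items():
        for genes in per_algorithm.values():
            for gene in genes:
                datasets_per_gene[gene].add(dataset)
    return sorted(
        gene
        for gene, datasets in datasets_per_gene.items()
        if len(datasets) >= min_datasets
    )


# Placeholder input only; these are not genes or datasets from the study.
example = {
    "dataset_1": {"svm": ["GENE_A", "GENE_B"], "rf": ["GENE_A", "GENE_C"]},
    "dataset_2": {"svm": ["GENE_B", "GENE_D"], "rf": ["GENE_E"]},
    "dataset_3": {"knn": ["GENE_C", "GENE_E"]},
}

print(genes_in_multiple_datasets(example))
```

Counting distinct datasets rather than raw occurrences matters here: a gene ranked highly by several algorithms on a single dataset (such as `GENE_A` in the placeholder input) does not qualify, which matches the "at least two different datasets" criterion stated above.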