Several diverse fields including the healthcare system and drug development sectors have benefited immensely through the adoption of deep learning (DL), which is a subset of artificial intelligence (AI) and machine learning (ML). Cancer makes up a significant percentage of the illnesses that cause early human mortality across the globe, and this situation is likely to rise in the coming years, especially when non-communicable illnesses are not considered. As a result, cancer patients would greatly benefit from precise and timely diagnosis and prediction. Deep learning (DL) has become a common technique in healthcare due to the abundance of computational power. Gene expression datasets are frequently used in major DL-based applications for illness detection, notably in cancer therapy. The quantity of medical data, on the other hand, is often insufficient to fulfill deep learning requirements. Microarray gene expression datasets are used for training procedures despite their extreme dimensionality, limited volume of data samples, and sparsely available information. Data augmentation is commonly used to expand the training sample size for gene data. The Wasserstein Tabular Generative Adversarial Network (WT-GAN) model is used for the data augmentation process for generating synthetic data in this proposed work. The correlation-based feature selection technique selects the most relevant characteristics based on threshold values. Deep FNN and ML algorithms train and classify the gene expression samples. The augmented data give better classification results (> 97%) when using WT-GAN for cancer diagnosis.
Read full abstract