Abstract

A large number of gene expression profile datasets exist, mainly in the fields of bioinformatics and gene microarrays. Traditional classification approaches struggle to perform well on gene expression profile data because these datasets combine high dimensionality with small sample sizes. As a data augmentation technique, the Wasserstein generative adversarial network with gradient penalty (WGAN-GP), combined with a conditional generative adversarial network (CWGAN-GP), can generate samples with a specified label using a simple fully connected network, which helps improve the performance of the classification model. However, the samples generated by this data augmentation method have low diversity and an uncertain distribution, which decreases classification accuracy. Therefore, this paper proposes a conditional Wasserstein generative adversarial network for gene expression datasets (Gene-CWGAN). Gene-CWGAN adopts a dataset division strategy based on the data distribution to help the model maintain the distribution of real samples. Gene-CWGAN then enhances the diversity and quality of the generated samples by removing the activation function of the output layer and adding constraint penalty terms. Finally, Gene-CWGAN is compared with CGAN and CWGAN-GP on the Colon, Leukemia2 and SRBCT datasets, and the results verify that it effectively improves the diversity and distribution stability of the generated samples.
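
For orientation, the baseline CWGAN-GP objective that the abstract refers to can be sketched as follows. This is the commonly used conditional form of the WGAN-GP loss, not the Gene-CWGAN loss itself: the abstract does not specify the additional constraint penalty terms, so they are left out here. The symbols are assumptions chosen for illustration: $D$ is the critic, $G$ the generator, $y$ the class label, $z$ the noise vector, $p_r$ the real data distribution, $p_z$ the noise prior, and $\lambda$ the gradient penalty weight.

```latex
% Critic loss: Wasserstein estimate plus gradient penalty, conditioned on label y
L_D = \mathbb{E}_{z \sim p_z}\big[ D(G(z \mid y) \mid y) \big]
    - \mathbb{E}_{x \sim p_r}\big[ D(x \mid y) \big]
    + \lambda \, \mathbb{E}_{\hat{x}}\Big[ \big( \lVert \nabla_{\hat{x}} D(\hat{x} \mid y) \rVert_2 - 1 \big)^2 \Big]

% Generator loss
L_G = - \mathbb{E}_{z \sim p_z}\big[ D(G(z \mid y) \mid y) \big]

% Interpolated points used by the penalty
\hat{x} = \epsilon \, x + (1 - \epsilon) \, G(z \mid y), \qquad \epsilon \sim U[0, 1]
```

In this form the critic output is unbounded, and the gradient penalty softly constrains the critic's gradient norm toward 1 at points interpolated between real and generated samples.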

