DNA is a valuable tool for classifying gene expression in the detection of breast cancer. Gene expression data are biological data from which valuable hidden information can be extracted, but extracting useful features from such datasets is a challenging task. Our gene expression dataset had a small number of samples but many features. This paper compared three types of recurrent deep learning models for the classification of breast cancer: recurrent neural networks (RNN), long short-term memory (LSTM), and gated recurrent units (GRU). The goals of the study were to improve classification accuracy and to make feature selection more effective; the basic principle was to select the best features from the original datasets. When integrated with recurrent deep learning models, the bat algorithm helps select the most relevant features, which improves breast cancer classification by leveraging the training datasets. Data preprocessing involved removing unnecessary columns and filling in missing values with the median value. The result was a comparative study of recurrent deep learning models combined with the bat algorithm for breast cancer classification. The bat algorithm with LSTM achieved higher accuracy than with RNN or GRU; GRU had the lowest accuracy.
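The preprocessing step described above (dropping unnecessary columns, then filling missing values with the median) can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `preprocess`, the column names, and the toy values are all hypothetical, and missing values are assumed to be represented as `None`.

```python
from statistics import median


def preprocess(rows, drop_columns):
    """Drop the named columns, then fill missing values (None) with the column median."""
    # Keep only the columns that are not marked for removal.
    kept = [{k: v for k, v in row.items() if k not in drop_columns} for row in rows]
    if not kept:
        return kept
    for col in kept[0]:
        # Median is computed from the observed (non-missing) values only.
        observed = [row[col] for row in kept if row[col] is not None]
        if not observed:
            continue  # no observed values, so the column is left untouched
        fill = median(observed)
        for row in kept:
            if row[col] is None:
                row[col] = fill
    return kept


# Hypothetical toy data; the column names are illustrative only.
samples = [
    {"sample_id": "s1", "gene_a": 1.0, "gene_b": 4.0},
    {"sample_id": "s2", "gene_a": None, "gene_b": 6.0},
    {"sample_id": "s3", "gene_a": 3.0, "gene_b": None},
]
cleaned = preprocess(samples, drop_columns={"sample_id"})
```

The median is used here rather than the mean because it is less sensitive to outlying expression values, which matters when the number of samples is small.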