Abstract

INTRODUCTION: Breast cancer (BC) is the most commonly occurring cancer and the second leading cause of disease-related death among women. BC cases are associated with genetic mutations that are either inherited from earlier generations or acquired over time. If the disease is diagnosed at its first stage, the side effects of certain treatments can be limited, costs can be reduced, and diagnostic time can be minimized. Early diagnosis also helps specialists select the most suitable treatment and so increase the cure rate. Nevertheless, detecting BC in patients is very challenging because early symptoms are silent and because routine screening is not recommended for women under 40 years old.

OBJECTIVES: Several efforts aim at early BC detection using machine learning and deep learning systems. The proposed algorithms use different data types, such as mammography, ultrasound, and MRI (magnetic resonance imaging) images, to distinguish cancerous from non-cancerous cases, and then apply different learning tools to these data for classification. Although their classification rates exceed 90%, the major drawback of all these methods is that they are applicable only after cancerous tumors have appeared, which reduces cure rates.

METHODS: We propose a new technique for early breast cancer screening. As data, we use cancerous and non-cancerous SNP (Single Nucleotide Polymorphism) protein sequences of the TP53 gene on chromosome 17. This gene has been shown to be linked to different single amino acid mutations, which we examine here. The proposed method transforms the textual SNP sequences into numerical vectors via coding. RGB scalogram images are then generated using the continuous wavelet transform. A preprocessing of the color coefficients is applied to the scalograms to create four different databases. Finally, a convolutional neural network (CNN) is used for the binary classification of cancerous and non-cancerous images.

RESULTS: During validation, we reached good performance, with a specificity of 97.84%, a sensitivity of 96.45%, an overall accuracy of 95.29%, and a run time of 12 minutes 3 seconds. These values support the efficiency of our method. To further improve these results, we used the ORB feature detection technique, which raised the classification accuracy to 95.9%.

CONCLUSION: Our method will allow significant savings in time and lives by detecting the disease in patients whose genetic mutations are only beginning to appear.
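The first two steps described under METHODS (coding a protein sequence as a numerical vector, then turning it into a scalogram with the continuous wavelet transform) can be illustrated with a minimal sketch. Everything specific in it is an assumption made for illustration, not the authors' implementation: the integer coding table, the real-valued Morlet wavelet, the chosen scales, and the toy sequences are all hypothetical, and the output is a single grayscale channel rather than the RGB scalograms used in the paper.

```python
import math

# Hypothetical coding (assumption): each of the 20 standard amino acids is
# mapped to its 1-based position in this alphabet. The abstract does not
# state which coding scheme the authors used.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
CODE = {aa: i + 1 for i, aa in enumerate(AMINO_ACIDS)}


def encode(seq):
    """Turn a protein sequence string into a list of floats."""
    return [float(CODE[aa]) for aa in seq.upper()]


def morlet(t, w0=5.0):
    """Real-valued Morlet wavelet (assumed mother wavelet)."""
    return math.exp(-0.5 * t * t) * math.cos(w0 * t)


def cwt(signal, scales):
    """Continuous wavelet transform by direct summation.

    Returns one row per scale and one column per sequence position.
    """
    n = len(signal)
    mean = sum(signal) / n
    centered = [x - mean for x in signal]  # remove the constant offset
    rows = []
    for s in scales:
        row = []
        for b in range(n):
            acc = 0.0
            for k in range(n):
                acc += centered[k] * morlet((k - b) / s)
            row.append(acc / math.sqrt(s))
        rows.append(row)
    return rows


def scalogram(coeffs):
    """Scale coefficient magnitudes to integer intensities in 0..255."""
    mags = [[abs(c) for c in row] for row in coeffs]
    peak = max(max(row) for row in mags) or 1.0
    return [[round(255 * m / peak) for m in row] for row in mags]


# Toy sequences (made up for illustration, NOT real TP53 data): the variant
# differs from the reference by one amino acid substitution (R -> H).
reference = "MKTAYIAKQRQISFVKSHFSRQ"
variant = reference[:9] + "H" + reference[10:]

scales = [1.0, 2.0, 4.0, 8.0]
ref_coeffs = cwt(encode(reference), scales)
var_coeffs = cwt(encode(variant), scales)
ref_img = scalogram(ref_coeffs)
var_img = scalogram(var_coeffs)
```

In this sketch a single substitution changes the wavelet coefficients around the mutated position, which is the kind of local difference an image classifier could then pick up. A practical pipeline would more likely rely on an established wavelet library and a color map to obtain RGB images.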

