Abstract

Pathological diagnosis of breast cancer is hampered by a shortage of pathologists, the difficulty of labeling samples, and the heavy workload of manual diagnosis. Deep learning-based computer-assisted pathology analysis systems have therefore been developed to diagnose breast cancer and have achieved impressive results. However, large training sets are difficult to obtain because pathological images are scarce and labeling is costly, so the size of the training set should be planned before such a system is built. Here, the authors present a study to determine the optimal training set size needed to achieve high classification accuracy when developing a pathology computer-assisted breast cancer analysis system. The authors trained two kinds of CNNs on six different training set sizes and then tested the resulting systems on a total of 10,000 images. All images were acquired from the Camelyon17 challenge. The authors propose a scheme for determining the size of the training set and the size of the model when developing pathology computer-assisted breast cancer analysis systems, which can be readily applied to systems for other pathological images.
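The experimental design described above (two CNNs, six training set sizes, one fixed test set) can be laid out as a small grid of runs. The following is a minimal sketch of that layout only, not the authors' procedure: the six sizes, the pool size, the model labels, and the choice to draw nested subsets (each smaller set contained in the larger ones) are all assumptions made for illustration, since the abstract does not state them.

```python
import random
from itertools import product


def nested_subsets(n_pool, sizes, seed=0):
    """Map each training size to a list of image indices drawn from a pool.

    The pool is shuffled once and each subset is a prefix of that order,
    so every smaller subset is contained in every larger one.
    (Nesting is an assumption of this sketch, not a detail from the abstract.)
    """
    if max(sizes) > n_pool:
        raise ValueError("largest training size exceeds the pool")
    rng = random.Random(seed)
    order = list(range(n_pool))
    rng.shuffle(order)
    return {s: order[:s] for s in sorted(sizes)}


def experiment_grid(models, sizes):
    """List every (model, training size) pair to train and evaluate."""
    return list(product(models, sorted(sizes)))


# Hypothetical values: the abstract gives only the counts (two CNNs, six sizes).
SIZES = [500, 1000, 2000, 4000, 8000, 16000]
MODELS = ["small_cnn", "large_cnn"]

subsets = nested_subsets(n_pool=20000, sizes=SIZES)
grid = experiment_grid(MODELS, SIZES)

for model_name, size in grid:
    train_indices = subsets[size]
    # Training and evaluation on the fixed test set would go here.
    print(model_name, size, len(train_indices))
```

Holding the test set fixed while varying only the training subset and the model keeps the resulting accuracies comparable across all twelve runs.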
