Breast cancer continues to claim the lives of an increasing number of women. Oncologists rely on digital mammograms as a viable means of detecting breast cancer and classifying masses as benign or malignant according to their severity. The performance of traditional breast cancer detection methods could not be improved beyond a certain point owing to the limitations of conventional computing, and the constrained scope of image processing techniques in building automated detection systems has motivated researchers to shift their focus towards Artificial Intelligence based models. Neural Networks (NN) offer great scope for developing automated medical image analysis systems with a high degree of accuracy, since an NN enables an automated system to learn the features required for problem-solving without being explicitly programmed. Optimizing an NN yields additional payoffs in accuracy, computational complexity, and training time. Because the scope and suitability of optimization methods are data-dependent, the selection of an appropriate optimization method is itself emerging as a prominent domain of research. In this paper, Deep Neural Networks (DNN) with different optimizers and learning rates were designed for the prediction and classification of breast cancer. A comparative performance analysis of five distinct first-order gradient-based optimization techniques, namely Adaptive Gradient (Adagrad), Root Mean Square Propagation (RMSProp), Adaptive Delta (Adadelta), Adaptive Moment Estimation (Adam), and Stochastic Gradient Descent (SGD), was carried out to make predictions on the classification of breast cancer masses. For this purpose, the Mammographic Mass dataset was chosen for experimentation, and the experimental parameters were varied over the number of hidden layers and the learning rate, along with hyperparameter tuning.
The impact of these optimizers was tested on an NN with One Hidden Layer (NN1HL), a DNN with Four Hidden Layers (DNN4HL), and a DNN with Eight Hidden Layers (DNN8HL). The experimental results showed that DNN8HL with Adam (DNN8HL-AM) produced the highest accuracy of 91% among its counterparts. This research endorses that incorporating optimizers into a DNN contributes to increased accuracy and an optimized architecture for automated system development using neural networks.
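The five first-order update rules compared in this work can be illustrated on a toy one-dimensional quadratic loss. The sketch below is purely illustrative and is not the authors' implementation: the loss, starting point, and all hyperparameter values (learning rate, beta1, beta2, rho, eps) are assumed common defaults, not settings taken from the experiments.

```python
import math

def run(optimizer, steps=200, lr=0.1, w0=5.0, eps=1e-8):
    """Minimize the toy loss L(w) = w^2 (gradient 2w) from w0
    using one of the five first-order update rules."""
    w = w0
    cache = 0.0        # squared-gradient accumulator (Adagrad / RMSProp / Adadelta)
    delta_acc = 0.0    # accumulated squared updates (Adadelta)
    m = v = 0.0        # first and second moment estimates (Adam)
    beta1, beta2, rho = 0.9, 0.999, 0.95   # assumed default hyperparameters
    for t in range(1, steps + 1):
        g = 2.0 * w    # gradient of the toy loss, stand-in for backprop
        if optimizer == "sgd":
            update = -lr * g
        elif optimizer == "adagrad":
            cache += g * g                              # accumulate all history
            update = -lr * g / (math.sqrt(cache) + eps)
        elif optimizer == "rmsprop":
            cache = rho * cache + (1 - rho) * g * g     # exponential moving average
            update = -lr * g / (math.sqrt(cache) + eps)
        elif optimizer == "adadelta":
            cache = rho * cache + (1 - rho) * g * g
            update = -math.sqrt(delta_acc + eps) / math.sqrt(cache + eps) * g
            delta_acc = rho * delta_acc + (1 - rho) * update * update
        elif optimizer == "adam":
            m = beta1 * m + (1 - beta1) * g             # first moment
            v = beta2 * v + (1 - beta2) * g * g         # second moment
            m_hat = m / (1 - beta1 ** t)                # bias correction
            v_hat = v / (1 - beta2 ** t)
            update = -lr * m_hat / (math.sqrt(v_hat) + eps)
        w += update
    return w

for name in ("sgd", "adagrad", "rmsprop", "adadelta", "adam"):
    print(f"{name:8s} final w = {run(name):+.4f}")
```

The differing final values of `w` reflect the data-dependent behaviour noted above: Adagrad's monotonically growing accumulator shrinks its steps over time, Adadelta starts cautiously because its update accumulator begins at zero, while Adam's bias-corrected moments let it make confident early progress.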