(C) PLOS One [1]. This unaltered content originally appeared in journals.plosone.org. Licensed under Creative Commons Attribution (CC BY) license. url:https://journals.plos.org/plosone/s/licenses-and-copyright

------------

Breast lesions classifications of mammographic images using a deep convolutional neural network-based approach

Tariq Mahmood, Faculty of Information Technology, Beijing University of Technology, Beijing, Division of Science, Technology, Department of Information Sciences, University of Education, Lahore, Jianqiang Li

Date: 2022-02

Abstract

Breast cancer is one of the deadliest diseases, with a high fatality rate among women globally. Breast cancer detection needs accurate mammography interpretation and analysis, which is challenging for radiologists owing to the intricate anatomy of the breast and low image quality. Advances in deep learning-based models have significantly improved the detection, localization, risk assessment, and categorization of breast lesions. This study proposes a novel deep learning-based convolutional neural network (ConvNet) that significantly reduces human error in diagnosing malignant breast tissue. Our methodology is most effective in eliciting task-specific features, as feature learning is coupled with classification tasks to achieve higher performance in automatically classifying the suspicious regions in mammograms as benign or malignant. To evaluate the model’s validity, 322 raw mammogram images from the Mammographic Image Analysis Society (MIAS) dataset and 580 from a Private dataset were obtained to extract in-depth features, the intensity of information, and the high likelihood of malignancy. Both datasets are substantially enhanced through preprocessing, synthetic data augmentation, and transfer learning techniques to attain the distinctive combination of breast tumors.
The experimental findings indicate that the proposed approach achieved a training accuracy of 0.98, a test accuracy of 0.97, a sensitivity of 0.99, and an AUC of 0.99 in classifying breast masses on mammograms. The developed model achieved promising performance that helps clinicians with rapid mammogram analysis, breast mass diagnosis, treatment planning, and follow-up of disease progression. Moreover, it has considerable potential over retrospective approaches in consistent feature extraction and precise lesion classification.

Citation: Mahmood T, Li J, Pei Y, Akhtar F, Rehman MU, Wasti SH (2022) Breast lesions classifications of mammographic images using a deep convolutional neural network-based approach. PLoS ONE 17(1): e0263126. https://doi.org/10.1371/journal.pone.0263126

Editor: Yan Chai Hum, University Tunku Abdul Rahman, MALAYSIA

Received: August 6, 2021; Accepted: January 12, 2022; Published: January 27, 2022

Copyright: © 2022 Mahmood et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Data Availability: The Private dataset is not publicly accessible due to data privacy and ethical constraints. The Private dataset contains 580 annotated mammogram images and is held at Continental Medical College & Hayat Memorial Teaching Hospital. It can be obtained upon formal request to the official email hmth@cmclhr.edu.pk of the Hayat Memorial Teaching Hospital, only for research purposes that meet the criteria for access to confidential data.

Funding: This study is supported by the National Key R&D Program of China with the project no. 2017YFB1400803. However, the funder had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Competing interests: No authors have competing interests.
Abbreviations: BCDR, Breast Cancer Digital Repository; BIRADS, Breast Imaging-Reporting and Data System; CBIS-DDSM, Curated Breast Imaging Subset of DDSM; CNN, Convolutional Neural Network; ConvNet, Convolutional Neural Network; DDSM, Digital Database for Screening Mammography; DICOM, Digital Imaging and Communications in Medicine; GLCM, Gray-level Co-occurrence Matrix; MIAS, Mammographic Image Analysis Society; MRI, Magnetic Resonance Image; PGM, Portable Gray Map; RCNN, Regions with Convolutional Neural Networks; ReLU, Rectified linear units; ResNet, Residual neural network; RF, Random forest; RMS-Prop, Root Mean Square Propagation; ROI, Regions of Interest; RProp, Resilient Propagation; SGD, Stochastic Gradient Descent; SVM, Support Vector Machine; YOLO, You-Only-Look-Once

Introduction

Breast cancer is a threatening malignancy and the leading cause of cancer-related mortality among women, with the mortality rate increasing from 6.6% to 6.9% in the current year [1, 2]. This high death rate is primarily due to delayed malignancy detection. Breast cancer is curable if detected early, which increases the patient’s chances of survival [3]. Breast lesions are classified as calcifications or masses based on their appearance, which may aid in the detection of breast malignancies. Masses are the most prevalent clinical sign of carcinoma and appear in mammograms as grey to white pixel intensity values. Timely detection of breast cancer masses is imperative for proper treatment, because masses are small at the stage when patients exhibit no initial symptoms. Breast masses vary in intensity, distribution, shape (lobulated, irregular, round, oval) and boundary (spiculated, ill-defined, circumscribed) within the breast region, which increases the likelihood of misdiagnosis [4].
Breast masses are categorized as malignant when they are irregularly shaped and have ambiguous edges and blurred boundaries; benign masses, on the other hand, are often dense, well-defined, circumscribed, and roughly spherical. The hidden features near the mass area are crucial for breast cancer research [5]. As a result of the heterogeneity, morphological diversity, confusing boundaries, and varying cancerous cell sizes, doctors have difficulty recognizing malignant tumors, resulting in needless biopsies. Medical imaging modalities, especially digital mammography, are well-known and effective techniques for timely screening, detection, and measurement of breast density [6]. On the other hand, mammograms are challenging to interpret due to their poor contrast, architectural complexity, and the similarity of lesion intensity to normal tissue at the mass boundaries, which makes breast masses difficult to recognize. It is crucial to extract accurate mass characteristics from mammography images for mass recognition and analysis. Mammography analysis assists radiologists in detecting the location and size of breast masses, which is reassuring for potential treatment measures [7]. A radiologist usually performs mammography analysis manually, which is time-intensive, complicated, biased, and prone to significant variability between experts. Misinterpreted mammograms lead patients to undergo risky procedures, such as breast biopsies advised when malignant lesions are detected. Reduced recall and biopsy rates are imperative for reducing patient stress and treatment costs while attaining optimal cancer detection measures based on individual needs [8]. Given these obstacles, the scientific community has made several efforts to enhance radiologists’ clinical performance by developing computer-aided diagnostic (CAD) systems to diagnose breast masses using mammography screening.
Deep learning-based CAD methods are adversely affected by the inherent limitations of mammography, such as noise and illumination. Moreover, existing CAD methods are ineffective at reducing breast cancer mortality and recall rates [9]. It is difficult to distinguish lesions from the neighboring healthy tissue, resulting in many false-positive and false-negative predictions. A false-positive prediction leads to more intensive follow-up, such as re-screening and biopsies, causing excessive anxiety and pain [10]. Deep learning-based techniques for detecting and classifying high-risk lesions have been used to reduce human error rates in predicting breast lesions [11]. Developing robust deep learning-based approaches, especially convolutional neural networks (CNN), can relieve the pressure on doctors and improve automated mass detection, mass localization, feature learning, and classification [12]. Despite training challenges owing to a lack of labeled images, a CNN preserves the spatial integrity of a mammogram, such as how pixels are linked to form a distinct feature. Moreover, deep learning-based methods still face significant problems in acquiring massive annotated training sets and in ensuring generalizability across cohorts, acquisition devices, and modalities. Recent advances in deep convolutional neural network (DCNN) models have enabled the automated detection of breast anomalies without the need for extensive training data by effectively modifying different parameters [13]. Many DCNN architectures are available, including VGGNet [14], Inception [15], ResNet [16], and EfficientNet [17], each with its own distinct design tailored to a particular grading task. DCNN architectures are coupled with transfer learning (TL) paradigms to recognize suspect breast areas accurately in mammography images, enhancing radiologists’ screening abilities.
TL is a popular deep learning approach for detecting breast masses, learning features from lower layers, and classifying masses with fine-tuned hyperparameters [18]. Fine-tuning a model through TL is much more affordable and efficient than training from scratch with randomly initialized weights. Moreover, a hybrid classification approach has been reported to outperform both learned and hand-crafted models [19]. This study aims to address the gap in predicting the region of interest (ROI) of breast masses on mammograms using computer vision algorithms. To that end, it proposes a convolutional neural network that achieves state-of-the-art performance in classifying the ROI of breast masses, enabling physicians to detect even the most minor breast masses early. Each image in the MIAS and Private datasets is preprocessed with different enhancement approaches to eliminate noise and improve image quality. Afterward, the ROI of the benign and malignant classes is cropped according to the specific coordinates of each cancerous area, and small patches are extracted from the cropped ROI. Our approach efficiently extracts low-level features, reduces variability and generalization error, and improves lesion classification using cropped image patches. It also handles unbalanced data and significantly reduces computing time and false-positive and false-negative predictions. Convolution filters are employed to obtain spatial information over a broader region while retaining computational efficiency. In addition, transfer learning is used to adapt standard pre-trained models by modifying the final layer to classify ROIs of breast masses accurately. We demonstrate our technique’s effectiveness by comparing it to other state-of-the-art approaches using the MIAS dataset as the gold standard. The proposed model and five standard pre-trained CNN architectures are evaluated with metrics such as AUC, sensitivity (recall), precision, and accuracy.
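The evaluation metrics named above can be illustrated with a minimal sketch. It assumes binary labels with 1 for malignant and 0 for benign, and the function names (`confusion_counts`, `classification_metrics`, `roc_auc`) are hypothetical helpers written for this illustration, not the authors' evaluation code. AUC is computed here with the rank-based definition (the probability that a randomly chosen malignant case scores higher than a randomly chosen benign one), which is one common way to obtain it.

```python
from typing import Dict, Sequence, Tuple


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (tp, fp, tn, fn) for binary labels; 1 = malignant, 0 = benign (assumed coding)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, fp, tn, fn


def classification_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, sensitivity (identical to recall), and precision from hard predictions."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        # Sensitivity and recall are the same quantity: TP / (TP + FN).
        "sensitivity": tp / (tp + fn) if (tp + fn) else 0.0,
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
    }


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

Calling `classification_metrics(y_true, y_pred)` on thresholded predictions and `roc_auc(y_true, scores)` on the raw malignancy scores gives the four quantities; the pairwise AUC loop is quadratic in the number of samples, which is acceptable for datasets of a few hundred images but would be replaced by a sort-based computation at larger scale.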
The developed model thus aims to help experts detect and classify suspicious areas in mammograms and to support treatment planning and decision-making based on the initial symptoms of cancer. The rest of the proposed work is structured as follows. A description of the latest deep learning diagnosing and grading approaches is presented in Section. Section describes the technical strategies for breast mass detection and classification, datasets, and mammogram image preprocessing. Section depicts the investigation of the proposed architecture’s performance and experimental findings. In Section the details of the detection and classification findings are discussed through various evaluation parameters. Finally, Section summarises the important outcomes of the research work and future directions.

Related work

A mammography-based computer-aided diagnosis system enables timely breast cancer detection, diagnosis, and medication. CAD systems improve the effectiveness and performance of breast mass diagnosis [20]. Different medical imaging modalities substantially lower the false-positive rate and thereby improve the prediction of breast masses. Due to the heterogeneity of breast density and mammography’s low contrast, feature selection and manual feature extraction are computationally challenging and time-intensive [21]. However, deep learning-based CNN algorithms have improved breast mass detection and classification by learning features from raw breast images. The research community has achieved remarkable improvement in predicting breast cancer with deep CNNs by minimizing the drawbacks of standard mass detection approaches. Guan et al. [22] developed an approach for recognizing and localizing breast tumors in digital mammograms based on regions of interest paired with a DCNN. The suggested method employs asymmetry information from a pair of breasts from the same individual to improve detection accuracy. Shu et al.
[23] constructed a deep CNN classification model using two pooling structures rather than more conventional pooling approaches. The proposed technique entails two stages: feature extraction for feature learning, and a pooling structure that uses the retrieved features to segment mammograms into subregions with a high risk of malignancy. The model attained 0.922 accuracy with a 0.924 AUC on INbreast and 0.76 accuracy with a 0.82 AUC on the CBIS-DDSM databases. Samala et al. [24] devised a deep learning strategy combined with transfer learning to extract in-depth features and classify lesions on mammography as cancerous or non-cancerous. Breast mass classification was employed to assess generalization error, and the results exceeded those of analytically derived features. Lee et al. [25] developed a fully automatic deep learning-based method for segmenting and classifying the dense areas in mammography. A full-field digital screening mammography dataset was used to assess the model’s efficacy. Ragab et al. [26] developed a deep learning-based algorithm for feature learning and classification to assist clinicians in detecting breast anomalies in mammograms. The proposed approach extracted in-depth features and tested a support vector machine classifier with multiple kernel functions. The experiments were performed using the MIAS dataset and obtained an accuracy of 0.97, much higher than earlier methodologies. Tan et al. [27] designed a deep learning-based approach for detecting and classifying breast lesions as benign or malignant. The research speeds up the diagnostic process by assisting doctors in diagnosing breast masses and yields a higher accuracy of 0.82 in mass detection than previous approaches. Hadash et al. [28] presented a CNN-based approach for detecting, localizing, and classifying abnormal breast lesions.
The designed model was validated on the Digital Database for Screening Mammography (DDSM) dataset, yielding an accuracy of 0.91, sensitivity of 0.94, and AUC of 0.92. Choukroun et al. [29] proposed a multiple instance learning (MIL) technique for the automated diagnosis and grading of breast anomalies without requiring annotated data. The technique’s distinguishing feature is that it identifies discriminative regions across the mammography image, overcoming classification inaccuracies. Omonigho et al. [30] used a DCNN model to classify mammographic images. The methodology aims to extract features and segment the ROI using threshold approaches by modifying the last layer of the DCNN with an SVM model. The results demonstrated that the model’s accuracy improved. Daniel et al. [31] applied a CNN combined with TL to categorize pre-segmented breast masses as malignant or non-cancerous. Data augmentation approaches were used to compensate for the shortage of training samples, resulting in 0.92 accuracy. Accurate mammography classification by a deep learning model has significant benefits, including reduced annotation effort, improved use of contextual information, lower call-back rates, and fewer unnecessary tests without sacrificing the model’s sensitivity. Existing research on malignancy diagnosis still faces particular challenges, of which the following are the most significant: (1) The scarcity of breast images poses a significant barrier to attaining good classification accuracy. (2) Acquiring breast images from a particular vendor for training and validating breast mass classification techniques is tedious and expensive. (3) Unbalanced data in training datasets are frequent, resulting in poor model performance on small datasets. (4) Deep learning-based models are adversely affected by the inherent limitations of mammography, such as noise and illumination, so a technique for noise reduction is required.
Presently, the concept of automatic detection and classification of breast lesions is gaining momentum, yet radiologists continue to face gaps and challenges [32]. Existing studies show that CAD systems are ineffective in improving mammography diagnostic accuracy due to a lack of training data. Acquiring training data comprising breast cancer-related features and anomalies is crucial for conducting realistic analysis. Although benchmark datasets are widely available, building a live medical dataset and imaging it in a laboratory requires considerable effort [33]. Deep learning methods are widely used for automated detection, but they require massive training data encompassing all features and variations correlated with breast cancer. Few studies use large image datasets (ImageNet) to train CNN classifiers by fine-tuning the hyperparameters with transfer learning. Consequently, the proposed method employs data augmentation and transfer learning approaches to overcome the limitations above and obtain accurate breast mass predictions.

Discussions

Regular mammography screening has become widely known as the most effective method of detecting breast cancer in its earliest stages. However, radiologists’ mammogram-based diagnosis is highly prone to false positives, leading to needless imaging and tumor biopsies. Although the potential of deep learning techniques to help in mass screening is intriguing, earlier research has seldom focused on decreasing needless biopsies. Besides supporting radiologists’ performance in detecting breast masses, a deep learning framework is intended to play a more decisive role in determining whether lesions are cancerous or non-cancerous. This discrimination is especially beneficial for findings that appear suspicious yet prove benign, which otherwise lead the radiologist to order additional biopsies.
A few types of breast cancer masses, such as spiculated and ill-defined lesions, still pose barriers to accurate detection and classification. The clinical signs in dense breasts are not entirely apparent, so it is difficult to identify dense lesion features and classify lesions correctly. As a result, the standard CAD systems have fewer challenges in extracting low-level features such as textural and non-textural characteristics to diagnose breast masses. This study proposed a method for extracting local features from small image patches inside high-resolution mammograms. We demonstrated that it is essential to include fine features confined within an ROI to boost the performance of deep learning models for classifying localized masses on high-resolution images. The proposed model can be used for automated annotation to reduce the annotation cost. Furthermore, the proposed model accurately locates the breast masses during testing and training, which helps avoid redundant tests and reduce the patient call-back rate. Consequently, the presented model performs well in detecting and classifying extremely dense breast masses with distinct shapes, edges, and sizes, including those containing bright normal tissue that resembles abnormal masses. The proposed ConvNet model performs well on the Private and MIAS datasets, with accuracy of 0.98 and 0.94, respectively. Fig 8a and 8b illustrate the overall accuracy, loss, precision, sensitivity and AUC of each proposed model on both datasets. We tuned the hyperparameters of all five pre-trained DCNN models by adjusting the last layer during training to diagnose breast masses effectively. Detected masses are used directly in the classification stage, which reduces the model’s complexity and computational time. The proposed model has an average testing time, moderate computational complexity, and a quick processing speed for detecting and classifying breast masses.
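The patch-based step discussed in this section (cropping an ROI and taking small patches inside it) can be sketched as follows. This is only an illustration under stated assumptions: the image is a plain 2D list of grey levels, the ROI is a square around a given centre coordinate, and the names `crop_roi` and `extract_patches` as well as the `half_size`, `patch_size`, and `stride` parameters are hypothetical; the article does not state the patch size or stride it used.

```python
from typing import List

Image = List[List[int]]  # 2D grid of grey levels (assumed representation)


def crop_roi(image: Image, center_row: int, center_col: int, half_size: int) -> Image:
    """Crop a square ROI around a centre coordinate, clipped to the image bounds."""
    height, width = len(image), len(image[0])
    top = max(0, center_row - half_size)
    bottom = min(height, center_row + half_size)
    left = max(0, center_col - half_size)
    right = min(width, center_col + half_size)
    return [row[left:right] for row in image[top:bottom]]


def extract_patches(roi: Image, patch_size: int, stride: int) -> List[Image]:
    """Slide a patch_size x patch_size window over the ROI with the given stride."""
    if patch_size <= 0 or stride <= 0:
        raise ValueError("patch_size and stride must be positive")
    height = len(roi)
    width = len(roi[0]) if roi else 0
    patches = []
    # Only windows that fit entirely inside the ROI are kept.
    for top in range(0, height - patch_size + 1, stride):
        for left in range(0, width - patch_size + 1, stride):
            patches.append([row[left:left + patch_size] for row in roi[top:top + patch_size]])
    return patches
```

With a stride equal to the patch size the patches tile the ROI without overlap; a smaller stride yields overlapping patches, which increases the number of training samples at the cost of more correlated inputs.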
The model may further help decrease needless biopsies by acting as a second reader where radiologists are unsure about the results. As a result, the proposed model has great potential in clinical practice, providing radiologists with a precise way to assess breast masses quickly and monitor disease development. We acknowledge that our study suffers from the inherent limitations of observational studies. For instance, we did not interpret the degree of challenges associated with various kinds of breast cancer, which is clinically significant. Another limitation of the proposed method was the limited availability of mammography image data. We will address these limitations in our future projects.

Conclusion and future work

Radiologists often have difficulties interpreting patient imaging data correctly, assessing the patient’s health, and detecting benign and malignant masses. Breast cancer mortality in high-risk women is significantly reduced when mammogram interpretation is accurate and leads to effective treatment. This study presents a ConvNet and five DCNN architectures for diagnosing and classifying breast cancer masses, enabling radiologists to detect even the smallest breast masses in their early stages. The transfer learning paradigm is used to enhance the pre-trained DCNNs by fine-tuning the hyperparameters. Additionally, the proposed work showed how image preprocessing and data augmentation strategies may help overcome the dataset size bottleneck and mitigate overfitting. We anticipate that the proposed model will provide a useful automated tool to strengthen current clinical assessment and assist experts’ decision-making. The experimental findings indicate that our method yielded a training accuracy of 0.98, testing accuracy of 0.97, sensitivity of 0.99, F-Score of 0.98, and AUC of 0.99.
As future work, the proposed framework’s efficiency and accuracy can be enhanced by integrating patch information into the classification algorithm, increasing the likelihood of obtaining a correct prediction.

Acknowledgments

I would like to thank my parents for always believing in me and being with me in every situation, and my supervisor Professor Dr. Jianqiang Li and Professor Dr. Yan Pei for guiding me throughout this research. I would also like to thank all my friends for their immense support.

[1] Url: https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0263126
(C) Plos One. Licensed under Creative Commons Attribution (CC BY 4.0) URL: https://creativecommons.org/licenses/by/4.0/