Diagnosis of Breast Cancer for Mammographic Images Using Various Data Mining Techniques

  • Poornima Deshpande, Prof. K. S. Ghuge, Prof. S. M. Jaybhaye


Many surgical oncologists regard breast cancer as one of the most harmful diseases affecting women because of the large number of women who suffer from it. It affects almost 2.1 million women each year and accounts for approximately 15% of all cancer deaths among women worldwide. Although there is no permanent solution to this disease, diagnosing it at an early stage can significantly change the outcome and greatly reduce the risk of the tumour spreading to other parts of the breast. In the past decade, computer-aided diagnosis (CAD) systems, which assist in diagnosing the disorder with the help of AI tools, have proved to be an effective way of diagnosing breast cancer, with supervised techniques used for classification. This survey paper analyses various papers that use Decision Tree algorithms to diagnose breast cancer. It also lays out the detailed steps of the technique proposed in these papers, which consists mainly of four stages: pre-processing, segmentation, feature extraction and classification. Pre-processing removes unwanted noise from the mammograms by applying filters; segmentation extracts the targeted region; and, after feature extraction, the classification phase labels each mammogram as benign (non-lethal) or malignant (lethal). Two Decision Tree algorithms, CART and C4.5, are surveyed for classifying and detecting cancer. Datasets such as mini-MIAS are generally used. The studies show that the results can be improved further with the Random Forest algorithm, a supervised learning model that builds on an ensemble learning technique called Bootstrap Aggregation, or Bagging, for classification and regression.
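To make the bagging idea mentioned in the abstract concrete, the following is a minimal sketch in plain Python, not the implementation used in any of the surveyed papers. It makes several simplifying assumptions: each "tree" is a one-level decision stump chosen by Gini impurity (the split criterion used by CART), there is no random feature subsampling (which a full Random Forest adds on top of bagging), and the two features per mammogram (mean region intensity and region area) and all data values are hypothetical, invented for illustration only.

```python
import random
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels (the CART split criterion)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def fit_stump(X, y):
    """Return the (feature, threshold, left_label, right_label) split
    with the lowest weighted Gini impurity."""
    majority = Counter(y).most_common(1)[0][0]
    best = None  # (impurity, feature, threshold, left_label, right_label)
    for f in range(len(X[0])):
        for t in sorted({row[f] for row in X}):
            left = [lab for row, lab in zip(X, y) if row[f] <= t]
            right = [lab for row, lab in zip(X, y) if row[f] > t]
            if not left or not right:
                continue  # not a real split
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(y)
            if best is None or score < best[0]:
                best = (score, f, t,
                        Counter(left).most_common(1)[0][0],
                        Counter(right).most_common(1)[0][0])
    if best is None:
        # Every row is identical: fall back to a constant prediction.
        return (0, float("inf"), majority, majority)
    return best[1:]

def predict_stump(stump, row):
    f, t, left_label, right_label = stump
    return left_label if row[f] <= t else right_label

def fit_bagged(X, y, n_trees=25, seed=0):
    """Bootstrap Aggregation: train each stump on a sample of the
    training set drawn with replacement."""
    rng = random.Random(seed)
    n = len(X)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]
        forest.append(fit_stump([X[i] for i in idx], [y[i] for i in idx]))
    return forest

def predict_bagged(forest, row):
    """Classify by majority vote over all stumps."""
    votes = Counter(predict_stump(s, row) for s in forest)
    return votes.most_common(1)[0][0]

# Hypothetical feature vectors: [mean region intensity, region area in pixels].
X = [[0.20, 110], [0.25, 130], [0.30, 120], [0.22, 140],
     [0.70, 300], [0.75, 320], [0.80, 310], [0.78, 290]]
y = ["benign"] * 4 + ["malignant"] * 4

forest = fit_bagged(X, y)
```

Calling `predict_bagged(forest, row)` on a new feature vector returns the label chosen by most stumps. Averaging over bootstrap samples in this way is what makes the ensemble less sensitive to individual training images than a single tree.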