Received 9 March 2016; accepted 24 April 2016; published 27 April 2016
The occurrence of breast cancer in women has increased significantly in recent years. Among the various types of cancer, breast cancer is one of the leading causes of death among middle-aged and older women. More than 8% of women develop this disease during their lifetime. Worldwide, breast cancer caused 458,503 deaths (13.7% of cancer deaths in women) in 2008. Physical examination is one way to detect breast cancer, but its effectiveness is limited by the subjective ability of the doctor. Another way to detect breast cancer is screening mammography, one of the most effective techniques for early detection. Screening mammography has been reported to reduce death rates from breast cancer among women aged 40 to 70 years. To improve accuracy, double reading of mammograms can be used. However, it is costly and requires twice the radiologists' reading time, so routine double reading by radiologists may not be feasible. Hence, efforts have been made to develop breast cancer classification systems that help radiologists analyze mammograms in hospitals. Using the computer's results as a reference increases the accuracy of diagnosis and improves the uniformity of image interpretation.
Many researchers have developed schemes for mammogram image classification. Hadjiiski et al. used an active contour method for segmentation, stepwise feature selection to choose useful features, and linear discriminant analysis for classification. Huo et al. used a fully automated procedure to segment masses with a region growing algorithm, followed by a two-stage hybrid classifier comprising a rule-based stage and an artificial neural network stage. Erkang et al. proposed histogram-intersection-based image classification. Hassanien et al. proposed classification of mammogram images based on rough set theory. Arod et al. presented AdaBoost and support vector machines for mammogram image analysis. Zhang et al. combined features extracted from the mammogram with human-interpreted features, and used the output of a statistical classifier as a further feature in conjunction with a genetic neural network. Eddaoudi et al. (2011) presented a mass detection algorithm based on SVM classification and texture analysis. Islam et al. (2010) proposed an efficient computer-aided mass classification method for digitized mammograms using an Artificial Neural Network (ANN). Pasquale Delogu et al. used texture features and a neural classifier to characterize mammographic masses.
Kinoshita et al. used a combination of texture and shape features, with a three-layer feed-forward neural network as the classifier, for benign and malignant classification. Rangayyan et al. developed a region-based edge profile acutance measure for evaluating the sharpness of mass boundaries and reported 92% accuracy in classifying benign and malignant masses. Priebe et al. investigated a system based on fractal and texture features for the detection of abnormalities. Sahiner et al. introduced the rubber band straightening transform (RBST) for classifying masses as malignant or benign. Chitre et al. used texture measures to classify benign and malignant regions on mammograms. Furuya et al. used first-order statistics features, second-order statistics (co-occurrence) features, density features, and shape features, and introduced a feature selection algorithm together with an evaluation of feature selection criteria to discriminate malignant tumors from normal tissue on mammograms. Bruce et al. demonstrated techniques to extract multiresolution features that quantify mass shapes; the shapes were segmented manually by a radiologist. Pohlman et al. created a system for segmenting masses using an adaptive region growing algorithm. Schroeter et al. investigated a method that estimates the parameters of a Gaussian mixture with a genetic algorithm (GA) in the context of MR data segmentation. Bazi and Melgani proposed a support vector machine (SVM) classification system that uses a genetic algorithm (GA) to detect the most distinctive features and estimate the SVM parameters. Daamouche et al. presented a feature extraction technique that uses Particle Swarm Optimization (PSO) to select the most informative features obtained from morphological profiles for classification.
Nandi et al. proposed a novel mass classification method using genetic programming (GP) and feature selection. Mavroforakis et al. analyzed various clinical features for benign/malignant classification and performed statistical tests on those features. The authors used four different classifiers, namely linear discriminant analysis, artificial neural networks, SVMs, and k-nearest-neighbor, to assess the discriminative power of each feature.
The functional block diagram of the proposed system for classifying breast tumors in mammograms is shown in Figure 1. The mammogram is acquired and processed using several techniques. The extracted features are expected to provide valuable information about the nature of the mammogram for further decision making in clinical pathology. The current work comprises preprocessing, mass segmentation, feature extraction, and classification based on feature selection. These techniques are presented in the following sections.
Figure 1. Flow Graph for breast cancer classification system.
Images in the Digital Database for Screening Mammography (DDSM) are noisy and contain background information and identifying labels, which need to be eliminated before classification. A low-pass filter is applied to the image to suppress noise while preserving the mass portion. A cropping step eliminates the image label and background information. Histogram equalization is an image processing method that adjusts contrast using the image's histogram. Through this adjustment, the intensity values are better distributed over the histogram, which makes image abnormalities more visible.
Figure 2 shows the preprocessing of a mammogram image.
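The three preprocessing steps described above can be sketched in a few lines of NumPy. This is only an illustrative sketch, not the paper's MATLAB implementation: the 3 x 3 mean filter standing in for the low-pass filter, the crop rectangle, and the function names `box_blur`, `equalize_histogram`, and `preprocess` are all assumptions.

```python
import numpy as np

def box_blur(img, k=3):
    """Low-pass filter: mean over a k x k window (borders padded by reflection)."""
    pad = k // 2
    padded = np.pad(img.astype(float), pad, mode="reflect")
    out = np.zeros(img.shape, dtype=float)
    for dy in range(k):
        for dx in range(k):
            out += padded[dy:dy + img.shape[0], dx:dx + img.shape[1]]
    return out / (k * k)

def equalize_histogram(img, levels=256):
    """Map each grey level through the normalized cumulative histogram."""
    img = np.clip(np.rint(img), 0, levels - 1).astype(int)
    hist = np.bincount(img.ravel(), minlength=levels)
    cdf = hist.cumsum()
    cdf_min = cdf[cdf > 0][0]            # CDF value at the darkest occupied level
    denom = max(cdf[-1] - cdf_min, 1)    # avoid division by zero on flat images
    scaled = np.clip((cdf - cdf_min) / denom, 0.0, 1.0)
    lut = np.rint(scaled * (levels - 1)).astype(np.uint8)
    return lut[img]

def preprocess(img, crop):
    """Smooth, crop away label/background, then equalize.
    `crop` = (top, bottom, left, right) is a hypothetical, manually chosen box."""
    top, bottom, left, right = crop
    smoothed = box_blur(img)
    return equalize_histogram(smoothed[top:bottom, left:right])
```

A call such as `preprocess(image, (top, bottom, left, right))` returns an 8-bit image of the cropped region with its grey levels spread over the full range.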
To analyze a mammogram image, it is essential to differentiate the suspicious area from its surroundings. The main idea of segmentation is to extract a Region of Interest (ROI) that contains the mass and to locate the suspicious mass within the ROI. The ROIs containing masses were manually extracted from the original mammography images.
The region growing algorithm starts from a seed pixel that is manually placed inside the suspicious region and expands the area around the seed to include nearby pixels falling within a threshold range. For the similarity condition, two thresholds Th1 and Th2 are used, as given in Equations (1) and (2).
Figure 2. Preprocessing of mammogram image.
Here the average luminance value of the segmented region changes at each growing step, and K depends on a factor F, which is given in Equation (3),
where d is the distance of the pixel with maximum intensity from the initial seed, Iseed is the intensity value of the seed, Imax is the maximum intensity value in the ROI, and N × N is the size of the ROI. Further details can be found in  . Figure 3(a) and Figure 3(b) show the ROI extracted from a mammogram image and the result of mass segmentation from the ROI, respectively.
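The growing procedure can be illustrated with a simplified sketch. Because the exact threshold expressions of Equations (1) to (3) are not reproduced here, the code below substitutes a single symmetric tolerance `tol` around the running region mean for the thresholds Th1 and Th2; that substitution, the 4-connectivity, and the function name `region_grow` are assumptions made purely for illustration.

```python
from collections import deque
import numpy as np

def region_grow(roi, seed, tol):
    """Grow a region from `seed` (row, col), accepting 4-connected neighbours
    whose intensity lies within `tol` of the current region mean.
    The mean is updated every time a pixel is added."""
    h, w = roi.shape
    mask = np.zeros((h, w), dtype=bool)
    mask[seed] = True
    total, count = float(roi[seed]), 1
    frontier = deque([seed])
    while frontier:
        y, x = frontier.popleft()
        for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
            if 0 <= ny < h and 0 <= nx < w and not mask[ny, nx]:
                mean = total / count
                if abs(float(roi[ny, nx]) - mean) <= tol:
                    mask[ny, nx] = True
                    total += float(roi[ny, nx])
                    count += 1
                    frontier.append((ny, nx))
    return mask
```

The returned boolean mask marks the segmented mass; its boundary is what the later feature extraction stage would operate on.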
4. Feature Extraction
Once the mass boundary is identified, various features must be extracted from the segmented image to generate the feature vector used in the classification stage. The extracted features are texture features, intensity histogram features, shape features, and radial distance features. Table 1 summarizes the computed mammogram image features by feature type, feature number, and description.
5. Feature Selection Based on an Enhanced Cuckoo Search for Breast Cancer Classification
The main aim of feature selection (FS) is to identify the important features in the original feature vector. This is done by removing irrelevant, redundant, and noisy features, which helps to achieve the best performance in terms of accuracy. Search procedures are mainly used for feature selection. A number of search procedures have been proposed; popular feature selection algorithms include the genetic algorithm (GA), particle swarm optimization (PSO), and other metaheuristic algorithms.
Metaheuristic algorithms are capable of extracting information from a set of features and often find good feature subsets in practice. Examples include the Bat Algorithm (BA), Ant Colony Optimization (ACO), Harmony Search (HS), and, more recently, the Cuckoo Search (CS) algorithm. The proposed feature selection approach, called Enhanced Cuckoo Search, uses different types of host nests with multiple eggs. It searches for the feature subsets that maximize classification performance.
Figure 3. (a) ROI from mammogram image; (b) Results of mass segmentation from ROI.
Table 1. Summary of Computed mammogram image features.
5.1. Cuckoo Search
Cuckoo search (CS) is a recent nature-inspired metaheuristic algorithm developed by Xin-She Yang and Suash Deb in 2009. Cuckoos are fascinating birds, both for the beautiful sounds they make and for their aggressive reproductive strategy. A few species, such as the Ani and Guira cuckoos, lay their eggs in communal nests and may remove others' eggs to increase the hatching probability of their own. CS is based on the brood parasitism of some cuckoo species, which lay their eggs in the nests of other host birds. If a host bird identifies eggs that are not its own, it will either throw these alien eggs away or simply abandon its nest and build a new nest elsewhere. Each egg in a nest represents a solution, and a cuckoo egg represents a new solution. If the new solution (cuckoo) is better than an existing one, it replaces a not-so-good solution in the nest. The algorithm is enhanced by Levy flights rather than simple isotropic random walks.
The following three idealized rules describe the standard cuckoo search:
o Each cuckoo lays one egg at a time and dumps it in a randomly chosen nest.
o The best nests with high-quality eggs are carried over to the next generations.
o The number of available host nests is fixed, and a cuckoo's egg is discovered by the host bird with a probability pa ∈ (0, 1). The host bird can either get rid of the egg or abandon the nest and build a completely new one.
The last assumption can be approximated by replacing a fraction pa of the n host nests with new nests (with new random solutions). Further details can be found in  . Cuckoo search is very simple and covers an extensive search space. It uses Levy flights for global search instead of a standard random walk, which lets CS explore the search space more efficiently.
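The three rules can be turned into a compact sketch of standard cuckoo search for a continuous minimization problem. This is an illustration only, not the authors' implementation: Mantegna's algorithm for drawing Levy-distributed steps, the parameter defaults (`n_nests=15`, `pa=0.25`, `alpha=0.01`, `beta=1.5`), and the scaling of the step by the distance to the current best nest are all assumptions borrowed from common CS implementations.

```python
import math
import numpy as np

def levy_step(rng, size, beta=1.5):
    """Heavy-tailed step lengths drawn with Mantegna's algorithm."""
    sigma = (math.gamma(1 + beta) * math.sin(math.pi * beta / 2)
             / (math.gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2))) ** (1 / beta)
    u = rng.normal(0.0, sigma, size)
    v = rng.normal(0.0, 1.0, size)
    return u / np.abs(v) ** (1 / beta)

def cuckoo_search(objective, dim, n_nests=15, pa=0.25, alpha=0.01,
                  iters=200, bounds=(-5.0, 5.0), seed=0):
    """Minimize `objective` over a box; returns (best solution, best value)."""
    rng = np.random.default_rng(seed)
    lo, hi = bounds
    nests = rng.uniform(lo, hi, (n_nests, dim))
    fitness = np.array([objective(x) for x in nests])
    for _ in range(iters):
        best = nests[fitness.argmin()].copy()
        # Rule 1: each cuckoo lays one egg (a Levy-flight move) in a random nest;
        # the egg stays only if it beats the egg already there.
        for i in range(n_nests):
            step = alpha * levy_step(rng, dim) * (nests[i] - best)
            egg = np.clip(nests[i] + step, lo, hi)
            j = rng.integers(n_nests)
            f_egg = objective(egg)
            if f_egg < fitness[j]:
                nests[j], fitness[j] = egg, f_egg
        # Rule 3: roughly a fraction pa of nests is abandoned and rebuilt at random.
        # Rule 2: the best nest is always carried over.
        abandon = rng.random(n_nests) < pa
        abandon[fitness.argmin()] = False
        n_new = int(abandon.sum())
        if n_new:
            nests[abandon] = rng.uniform(lo, hi, (n_new, dim))
            fitness[abandon] = [objective(x) for x in nests[abandon]]
    k = fitness.argmin()
    return nests[k], fitness[k]
```

For example, `cuckoo_search(lambda x: float(np.sum(x ** 2)), dim=2)` searches for the minimum of a simple sphere function. Because the best nest is never abandoned, the best value found cannot get worse from one iteration to the next.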
5.2. Enhanced Cuckoo Search
Cuckoo search is modified into Enhanced Cuckoo Search by using different types of host nests with multiple eggs. In general, cuckoos select three types of nests for laying their eggs:
o The common cuckoo selects a group of host nests whose egg characteristics are similar to its own.
o Other cuckoos select a group of host nests whose egg characteristics are dissimilar to their own.
o Some other cuckoo species lay cryptic eggs, which are dark in color when the eggs of their host birds are light. This trick hides the eggs from the host and has evolved in cuckoos that parasitize hosts with dark, domed nests.
Figure 4 shows the pseudocode for Enhanced Cuckoo Search, which is based on the egg-laying behavior of cuckoos and multiple eggs in the nest.
5.2.1. Egg Representation
In feature selection using ECS, each egg contains the solution it represents. The eggs are represented by 123 features. The most common way of encoding feature selection is a binary string. Here, random values are generated for each feature position. If the value at a position is 1, the feature is selected; otherwise it is not selected. The egg is represented as a binary string, as shown in Figure 5.
Figure 4. Pseudo code for enhanced cuckoo search.
5.2.2. Initial Population
In this work, each egg represents a possible set of selected features used to classify the samples correctly. The initial population size is set to 50.
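The binary encoding and the initial population can be sketched as follows. The egg length of 123 and the population size of 50 come from the text; the function names and the use of a uniform random bit for each position are assumptions for illustration.

```python
import numpy as np

N_FEATURES = 123       # one bit per extracted feature, as stated in the text
POPULATION_SIZE = 50   # initial population size, as stated in the text

def random_egg(rng, n_features=N_FEATURES):
    """A binary string: 1 means the feature is selected, 0 means it is not."""
    return rng.integers(0, 2, size=n_features)

def initial_population(rng, size=POPULATION_SIZE, n_features=N_FEATURES):
    """Stack `size` random eggs into a (size, n_features) array."""
    return np.stack([random_egg(rng, n_features) for _ in range(size)])

def select_features(X, egg):
    """Keep only the columns of the feature matrix X whose bit is 1."""
    return X[:, egg.astype(bool)]
```

With `rng = np.random.default_rng(0)`, `initial_population(rng)` yields a 50 x 123 array of bits, and `select_features(X, egg)` reduces a sample-by-feature matrix to the subset that the egg encodes.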
5.2.3. Finding New Solutions and Levy Flight
The ECS-based feature selection method uses Levy flights to find new solutions according to Equation (4). Some of the new solutions should be generated by a Levy walk around the best solution obtained so far, which speeds up the local search.
Using a Levy flight, the new solution for cuckoo i is produced as given below,
where α is the step size. The step length follows the Levy distribution.
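For reference, the standard cuckoo search of Yang and Deb writes the Levy-flight update in the form below. Equation (4) presumably follows this form, but the symbols λ and t and the entry-wise product ⊕ are taken from that standard formulation rather than from the text here, so they should be read as assumptions.

```latex
x_i^{(t+1)} = x_i^{(t)} + \alpha \oplus \text{L\'evy}(\lambda),
\qquad
\text{L\'evy} \sim u = t^{-\lambda}, \quad 1 < \lambda \le 3
```

Here $x_i^{(t)}$ is the current solution of cuckoo $i$, $\alpha > 0$ is the step size, and the power-law tail of the Levy distribution produces occasional long jumps among many short steps.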
5.2.4. Fitness Function
The fitness function plays a vital role in the selection process. The performance of a classification system is measured by its classification accuracy, which therefore drives the choice of significant features from the feature vector. Here, the classification accuracy of the minimum distance classifier or the k-NN classifier is used as the fitness function of ECS. The fitness function fitness(f) of ECS is defined in Equation (5),
where Accuracy(f) is the test accuracy obtained on the testing data f by the classifier built with the selected feature subset of the training data. The classification accuracy of the minimum distance classifier or k-NN classifier is given in Equation (6), where
s: Number of samples that are correctly classified in test data by minimum distance classifier or k-NN Classifier
t: Total number of samples in test data
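Given the definitions of s and t, the accuracy is most naturally the ratio s/t, expressed as a percentage since the results are reported in percent; that exact form is an assumption, as Equations (5) and (6) are not reproduced here. The sketch below evaluates an egg with a small NumPy k-NN classifier using Euclidean distance. The function names, the default `k=3`, and the choice to score an empty feature subset as 0 are illustrative assumptions.

```python
import numpy as np

def knn_predict(X_train, y_train, X_test, k=3):
    """Majority vote among the k nearest training samples (Euclidean distance).
    Class labels must be non-negative integers."""
    diff = X_test[:, None, :] - X_train[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))
    nearest = np.argsort(dist, axis=1)[:, :k]
    return np.array([np.bincount(row).argmax() for row in y_train[nearest]])

def fitness(egg, X_train, y_train, X_test, y_test, k=3):
    """Accuracy (in percent) of k-NN restricted to the features the egg selects."""
    mask = egg.astype(bool)
    if not mask.any():          # an egg that selects nothing cannot classify
        return 0.0
    pred = knn_predict(X_train[:, mask], y_train, X_test[:, mask], k)
    s = int((pred == y_test).sum())   # correctly classified test samples
    t = len(y_test)                   # total test samples
    return 100.0 * s / t
```

An egg that keeps an informative feature and drops a noisy one scores higher than one that does the opposite, which is what lets the search prefer small, discriminative subsets.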
5.2.5. Parameter Pa
In enhanced cuckoo search the Pa value is changed dynamically by using the formula given in Equation (7)
The two parameters of Equation (7) are set to 0.5 and 0.3, respectively.
Figure 5. Egg representation of feature selection using ECS.
5.2.6. Crossover and Mutation
o If the cuckoo type is common cuckoo, crossover is used to create two eggs in the nest, and the better of the two is chosen.
o If the cuckoo type is European cuckoo, two eggs are created using crossover with a uniform mutation operator, and the better of the two is chosen.
o Otherwise (cryptic), the eggs are created as random solutions.
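The three cases above can be sketched on binary eggs as follows. The text does not state which crossover operator is used, so one-point crossover is an assumption here, as are the mutation rate, the function names, the string labels for the cuckoo types, and the `score` callback that stands in for the fitness function.

```python
import numpy as np

def one_point_crossover(rng, a, b):
    """Swap the tails of two parent eggs at a random cut point."""
    cut = rng.integers(1, len(a))
    return (np.concatenate([a[:cut], b[cut:]]),
            np.concatenate([b[:cut], a[cut:]]))

def uniform_mutation(rng, egg, rate=0.01):
    """Flip each bit independently with probability `rate`."""
    flips = rng.random(len(egg)) < rate
    return np.where(flips, 1 - egg, egg)

def lay_eggs(rng, cuckoo_type, parent_a, parent_b, score):
    """Create candidate eggs according to the cuckoo type and keep the best one.
    `score` maps an egg to its fitness (higher is better)."""
    if cuckoo_type == "common":
        candidates = one_point_crossover(rng, parent_a, parent_b)
    elif cuckoo_type == "european":
        candidates = [uniform_mutation(rng, child)
                      for child in one_point_crossover(rng, parent_a, parent_b)]
    else:  # cryptic: a fresh random solution
        return rng.integers(0, 2, size=len(parent_a))
    return max(candidates, key=score)
```

In a full search loop, `score` would be the classifier-accuracy fitness, so the retained egg is the child whose feature subset classifies best.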
5.2.7. K Fold Cross Validation
A K-fold cross validation procedure is used to evaluate the efficiency of the proposed system. In this procedure, the feature vector set is randomly divided into K disjoint parts of approximately equal size. The classifier is trained on K-1 parts and tested on the remaining part. This process is repeated K times (K folds), with each of the K parts used exactly once as the test data. The K results from the folds are averaged to give the average accuracy of the minimum distance classifier and the k-nearest-neighbor classifier.
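The procedure can be sketched in NumPy as follows. The function names and the `evaluate` callback, which is expected to train on one split and return the accuracy on the other, are assumptions for illustration.

```python
import numpy as np

def k_fold_indices(n_samples, k, seed=0):
    """Randomly split sample indices into k disjoint, nearly equal parts."""
    rng = np.random.default_rng(seed)
    return np.array_split(rng.permutation(n_samples), k)

def cross_validate(X, y, k, evaluate, seed=0):
    """Average the score of `evaluate(X_train, y_train, X_test, y_test)`
    over k folds, using each part exactly once as test data."""
    folds = k_fold_indices(len(y), k, seed)
    scores = []
    for i, test_idx in enumerate(folds):
        train_idx = np.concatenate([f for j, f in enumerate(folds) if j != i])
        scores.append(evaluate(X[train_idx], y[train_idx], X[test_idx], y[test_idx]))
    return float(np.mean(scores))
```

Any classifier can be plugged in through `evaluate`, so the same loop serves both the minimum distance classifier and the k-NN classifier.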
5.2.8. Similarity Measure
To measure the closeness between the training data set and the test data set, distance measures are normally used. The measures considered here are the Euclidean distance, city block distance, Minkowski distance, and chi-square distance, given in Equations (8) to (11).
Euclidean distance (8)
City block distance (9)
Minkowski distance (10)
Chi-Square Distance (11)
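The four measures can be written in their usual textbook forms, sketched below for two feature vectors x and y. The exact expressions of Equations (8) to (11) are not reproduced here, so these forms are assumptions. In particular, the chi-square distance has several variants; the one shown omits the factor 1/2 that some definitions include and assumes non-negative features. The Minkowski order `p=3` and the small constant `eps` are also illustrative choices.

```python
import numpy as np

def euclidean(x, y):
    """Square root of the summed squared differences."""
    return float(np.sqrt(np.sum((x - y) ** 2)))

def city_block(x, y):
    """Sum of absolute differences (Manhattan distance)."""
    return float(np.sum(np.abs(x - y)))

def minkowski(x, y, p=3):
    """Generalization: p=1 gives city block, p=2 gives Euclidean."""
    return float(np.sum(np.abs(x - y) ** p) ** (1.0 / p))

def chi_square(x, y, eps=1e-12):
    """Squared differences normalized by the sum of the two values;
    eps guards against division by zero when both entries are 0."""
    return float(np.sum((x - y) ** 2 / (x + y + eps)))
```

For x = (1, 2, 3) and y = (4, 6, 3), the Euclidean distance is 5 and the city block distance is 7.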
Table 2 shows the parameters and their values for ECS-based feature selection in breast cancer classification.
6. Results and Discussions
Images from the Digital Database for Screening Mammography (DDSM) were used to develop the breast cancer classification system. The proposed system was implemented in MATLAB, which was also used to extract the features for further analysis. Initially, 123 features were extracted. The proposed feature selection method with the minimum distance classifier selects 34 features in total and produces 98.75% average classification accuracy using the chi-square distance. ECS with the k-NN classifier selects 29 features in total and produces 99.13% average classification accuracy using the Euclidean distance. The average classification accuracies for the various distance measures are given in Figure 6 and Figure 7. The optimal numbers of features selected by CS, HS, and ECS are shown in Figure 8.
Table 2. Parameter setting for ECS for breast cancer classification.
Figure 6. Classifier accuracy using ECS with minimum distance classifier.
Figure 7. Classifier accuracy using ECS with k-NN classifier.
Figure 8. Optimal feature selected from CS, HS and proposed ECS.
In this paper, a total of 508 (288 benign and 212 malignant) mammogram images were used as case samples. A region growing algorithm was used for segmentation. After segmentation, each mass was represented by 123 features: 96 texture features, 9 histogram features, 11 shape features, and 7 radial distance features. Several feature selection methods were investigated for breast cancer classification, including CS, HS, and the proposed ECS. Each of these feature selection methods was evaluated with a minimum distance classifier and a k-nearest-neighbor classifier. ECS with the k-NN classifier outperforms the other algorithms for breast cancer classification. The proposed feature selection method achieves better accuracy with the smallest number of features.