Received 5 June 2016; accepted 5 July 2016; published 8 July 2016
Breast cancer is one of the major health problems for women. In the United States, one in eight women develops breast cancer during her lifetime. It is estimated that about 40,290 women will die of breast cancer in a year. Therefore, early diagnosis and early treatment of breast cancer are very important for reducing the death toll.
Ultrasound is a convenient and safe diagnostic method for distinguishing benign breast lesions from malignant ones. However, ultrasonography is an operator-dependent modality, and the operator requires much experience. To reduce this operator dependency and increase diagnostic accuracy, computer-aided diagnosis (CADx) systems that provide the likelihood of malignancy for masses and calcifications have been developed. Some investigators have reported that the use of CADx systems reduced operator dependence and improved clinicians’ diagnostic accuracy.
Architectural distortion, as well as mass and calcifications, is an important indicator of breast cancer in ultrasonographic images. It is defined in the Breast Imaging Reporting and Data System (BI-RADS) as follows: “The normal architecture of the breast is distorted with no definite mass visible. This includes spiculations radiating from a point and focal retraction or distortion at the edge of the parenchyma.” It is difficult for clinicians to distinguish between benign and malignant architectural distortions in ultrasonography because they are often subtle in appearance. Therefore, the development of CADx systems for architectural distortions in ultrasonography has been desired in clinical practice. To our knowledge, however, no studies have developed such a CADx system.
Masses are often associated with architectural distortion. To evaluate a mass with architectural distortion in an ultrasonographic image, it is necessary to develop a feature extraction method that analyzes the features of both the mass and the architectural distortion. In a past study, we developed a CADx system that could evaluate the likelihood of malignancy and the likelihood of each histological classification of masses in ultrasonographic images. However, that CADx system did not analyze objective features of the architectural distortion. Therefore, it might be possible to improve the classification accuracy of our previous method by adding objective features for architectural distortion.
In this paper, we describe the development of feature extraction methods for architectural distortion as part of a computerized scheme for the histological classification of masses with such distortion in ultrasonographic images. We employed a k-nearest neighbors (k-NN) rule along with the extracted objective features to determine the histological classifications of masses with architectural distortions. The classification accuracies were evaluated by applying the proposed method to a test set of 72 masses with architectural distortions in ultrasonographic images.
2. Materials and Methods
Our database consisted of 72 two-dimensional ultrasonographic images obtained from 47 patients at Mie University Hospital. It included 51 malignant masses (35 invasive carcinomas and 16 noninvasive carcinomas) and 21 benign masses with architectural distortion.
The histological classifications of these lesions were made through pathologic diagnosis. The ultrasonographic images were acquired with an ultrasound diagnostic system (APLIO XG SSA-790A, Toshiba Medical Systems Corp.) with a 12-MHz linear-array transducer (PLT-1204AT). The pixel size of each ultrasonographic image was 0.05 mm × 0.05 mm, and each image was quantized using a 256-level grey scale. Figure 1 shows examples of masses with the three histological classifications. The size of these images was 20 mm × 17 mm.
Figure 2 shows a schematic diagram of the proposed method for the histological classification of masses with architectural distortions. The location and shape of the mass were manually determined by an experienced clinician. We then extracted five objective features for architectural distortion and nine objective features for masses defined in our previous study. We finally employed the k-NN rule using the extracted objective features to determine the histological classifications of the masses with architectural distortion.
Figure 1. Three masses with different histological classifications. (a) Invasive carcinoma, (b) Noninvasive carcinoma, (c) Benign.
Figure 2. Schematic diagram of the proposed method to determine the histological classification of masses with architectural distortions in ultrasonographic image.
2.2.1. Segmentation of Mass
For accurate extraction of image features, the locations and shapes of all masses were determined by an experienced clinician.
2.2.2. Extraction of Objective Features
Table 1 shows all 14 objective features that were extracted, consisting of five objective features for architectural distortion and nine objective features for mass  . The asterisk indicates that the features were newly defined in this study. Here, the five objective features for architectural distortion are described in detail, whereas the nine objective features for mass are described briefly. To quantify the architectural distortion, we newly defined extraction methods for the retraction (convergence) of a mammary gland (ACI1, ACI2, and ACI3), and extraction methods for spiculations (NumCorners, RatioPMPC). Spiculations are a stellate-shaped distortion caused by the invasion of cancer into the surrounding tissue  .
Average convergence index (ACI1, ACI2, and ACI3)
To obtain the objective features concerning the convergence of mammary glands, it is necessary to detect linear structures such as mammary glands. Therefore, an ultrasonographic image was first decomposed into several subimages at scales j from 1 to 3 by using a filter bank. These subimages consisted of horizontal subimages for the second difference in the vertical direction of the ultrasonographic image, vertical subimages for the second difference in the horizontal direction of the ultrasonographic image, and diagonal subimages for the first difference in the vertical direction followed by the first difference in the horizontal direction of the ultrasonographic image. The pixel values of the horizontal, vertical, and diagonal subimages corresponded to the elements of a Hessian matrix H.
Table 1. Definitions of features and feature codes.
The asterisk * indicates that the features were newly defined in this study.
The two eigenvalues of H must satisfy a specific condition for linear structures. The enhanced image for linear structures (ELS) was defined on the basis of this condition. Figure 3 shows an example of an image enhanced for linear structures by using the filter bank. The segmented image was then obtained by applying a local gray-level thresholding technique to the ELS. A thinned image was obtained by applying a thinning algorithm to the segmented image.
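The role of the Hessian eigenvalues in detecting linear structures can be illustrated with a minimal Python sketch. It is not the filter bank used in this study: plain finite differences at a single scale stand in for the subimages, and the line criterion (one strongly negative eigenvalue with the other near zero, for a bright line) is a common choice assumed here. The function names `hessian_eigenvalues` and `line_strength` are hypothetical.

```python
import math

def hessian_eigenvalues(img, r, c):
    """Return (lam1, lam2) with |lam1| >= |lam2| at interior pixel (r, c)."""
    # Second difference in the vertical direction (responds to horizontal lines).
    f_yy = img[r + 1][c] - 2 * img[r][c] + img[r - 1][c]
    # Second difference in the horizontal direction (responds to vertical lines).
    f_xx = img[r][c + 1] - 2 * img[r][c] + img[r][c - 1]
    # First difference vertically followed by first difference horizontally.
    f_xy = (img[r + 1][c + 1] - img[r + 1][c - 1]
            - img[r - 1][c + 1] + img[r - 1][c - 1]) / 4.0
    # Closed-form eigenvalues of the symmetric matrix [[f_xx, f_xy], [f_xy, f_yy]].
    mean = (f_xx + f_yy) / 2.0
    root = math.sqrt(((f_xx - f_yy) / 2.0) ** 2 + f_xy ** 2)
    lam_a, lam_b = mean + root, mean - root
    return (lam_a, lam_b) if abs(lam_a) >= abs(lam_b) else (lam_b, lam_a)

def line_strength(img, r, c):
    """Assumed criterion: large for bright lines, zero for blobs and background."""
    lam1, lam2 = hessian_eigenvalues(img, r, c)
    if lam1 >= 0:
        return 0.0  # not a bright ridge at this pixel
    return max(0.0, abs(lam1) - abs(lam2))
```

On a synthetic image containing a single bright horizontal line, the response is high on the line and zero next to it; an isolated bright pixel, whose two eigenvalues are equal, gives zero.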
To quantify the concentration of the mammary gland, we computed the convergence index (Equation (3)). In that equation, the summation was taken over all line primitives Q in the concentration mask from R1 to R8 (Figure 4 shows the concentration mask), dist represented the distance between O and Q, dx was the length of line primitive Q, and the angle referred to the orientation of Q with respect to line OQ. The maximum value of Equation (3) was 1.0 and the minimum value was 0.0. The equation was obtained by modifying Hasegawa’s method to include the value of the enhanced image for linear structures (ELS) and to use a rectangular mask instead of a circular mask.
Figure 3. Example of an image enhanced for linear structures by a filter bank. (a) Original image, (b) Enhanced image for linear structures.
Figure 4. Concentration mask.
We divided mask R into eight regions Rk (k = 1 to 8) at 45-degree intervals, and computed the convergence index in each region Rk. Because masses with architectural distortion in ultrasonographic images varied in size, we computed three values of the average convergence index (ACI1, ACI2, and ACI3) using concentration masks of three sizes: (length1 [pixel], length2 [pixel]) = (36, 180) for ACI1, (42, 210) for ACI2, and (48, 240) for ACI3. These values were empirically determined. ACI1, ACI2, and ACI3 were each defined as the average of the convergence indices over the eight regions Rk for the corresponding mask size.
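The convergence computation can be sketched in Python under stated assumptions. The exact form of Equation (3) is not reproduced here, so the sketch uses the classical concentration-index form (length-weighted |cos| of the angle to line OQ, divided by distance, normalized to lie between 0 and 1). The `weight` field is a hypothetical stand-in for the ELS value, the rectangular mask geometry (length1, length2) is omitted, and taking the mean over eight 45-degree sectors is an assumed reading of "average convergence index".

```python
import math

def convergence_index(primitives, origin):
    """primitives: list of (x, y, theta, dx, weight); theta is the line orientation."""
    ox, oy = origin
    num = den = 0.0
    for x, y, theta, dx, weight in primitives:
        dist = math.hypot(x - ox, y - oy)
        if dist == 0.0:
            continue  # a primitive at O has no defined direction to O
        # Angle between the primitive's orientation and the line OQ.
        alpha = theta - math.atan2(y - oy, x - ox)
        num += weight * dx * abs(math.cos(alpha)) / dist
        den += weight * dx / dist
    return num / den if den > 0 else 0.0

def sector_index(x, y, origin):
    """Index 0..7 of the 45-degree sector that contains point (x, y)."""
    ox, oy = origin
    angle = math.atan2(y - oy, x - ox) % (2 * math.pi)
    return int(angle // (math.pi / 4)) % 8

def average_convergence_index(primitives, origin):
    """Mean of the per-sector convergence indices (empty sectors count as 0)."""
    sectors = [[] for _ in range(8)]
    for p in primitives:
        sectors[sector_index(p[0], p[1], origin)].append(p)
    return sum(convergence_index(s, origin) for s in sectors) / 8.0
```

With one line primitive per sector, all pointing toward O, the average is 1.0; if the same primitives are rotated to be perpendicular to OQ, it drops to 0.0, matching the stated range of the index.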
Number of corners of the mass (NumCorners)
The number of corners of the mass (NumCorners) was determined by Chen’s method  . We first detected edges to obtain a binary edge map and extract contours, as in the curvature scale space (CSS) method. The curvature was then calculated at a fixed low scale for each contour to retain the true corners. We regarded the local maxima of absolute curvature as the corner candidates, and adaptively calculated a threshold according to the mean curvature within a region of support. Round corners were removed by comparing the curvature of the corner candidates with the value of the adaptive threshold. Based on a dynamically recalculated region of support, we calculated the angles of the remaining corner candidates to eliminate false corners. Finally, we considered the end points of the open contours, and marked them as corners unless they were in the proximity of another corner. Figure 5 shows corners in a segmented mass.
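The idea of counting corners as local maxima of contour curvature can be illustrated with a deliberately simplified Python sketch. It is a stand-in, not Chen's method: it uses the turning angle between two chords as a curvature proxy and a fixed threshold, whereas the method described above uses an adaptive threshold, a dynamic region of support, and end-point handling. The function name, `step`, and `min_turn_deg` are hypothetical.

```python
import math

def count_corners(contour, step=2, min_turn_deg=60.0):
    """Count sharp turns on a closed contour given as a list of (x, y) points."""
    n = len(contour)
    turn = []
    for i in range(n):
        x0, y0 = contour[(i - step) % n]
        x1, y1 = contour[i]
        x2, y2 = contour[(i + step) % n]
        # Directions of the incoming and outgoing chords at point i.
        a_in = math.atan2(y1 - y0, x1 - x0)
        a_out = math.atan2(y2 - y1, x2 - x1)
        d = abs(a_out - a_in) % (2 * math.pi)
        turn.append(math.degrees(min(d, 2 * math.pi - d)))
    corners = 0
    for i in range(n):
        # Keep strict local maxima of the turning angle that exceed the threshold;
        # gentle bends (round corners) stay below it.
        if (turn[i] >= min_turn_deg
                and turn[i] > turn[i - 1]
                and turn[i] > turn[(i + 1) % n]):
            corners += 1
    return corners
```

For a square traced with evenly spaced points this returns 4, and for a finely sampled circle it returns 0, which is the qualitative behavior expected of a spiculation-sensitive corner count.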
Ratio of perimeter of segmented mass to that of a circle with the same area (RatioPMPC)
RatioPMPC was defined as the ratio of the perimeter of the segmented mass to that of a circle with the same area, and was given by

$$\mathrm{RatioPMPC} = \frac{P_{\mathrm{mass}}}{P_{\mathrm{circle}}}$$

where $P_{\mathrm{mass}}$ was the perimeter of the segmented mass, and $P_{\mathrm{circle}}$ was the perimeter of the circle with the same area as the segmented mass. Figure 6 shows an example of the segmented mass and the circle.
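As a worked illustration of this ratio, the following Python sketch computes it for a mass outline given as a polygon. Representing the segmented mass as a list of vertices, and the function names, are assumptions made for the example; the circle perimeter follows from the area A as 2·sqrt(π·A).

```python
import math

def polygon_perimeter(points):
    """Sum of edge lengths of a closed polygon given as (x, y) vertices."""
    return sum(math.dist(points[i], points[(i + 1) % len(points)])
               for i in range(len(points)))

def polygon_area(points):
    """Area of a simple polygon by the shoelace formula."""
    twice_area = 0.0
    for i in range(len(points)):
        x1, y1 = points[i]
        x2, y2 = points[(i + 1) % len(points)]
        twice_area += x1 * y2 - x2 * y1
    return abs(twice_area) / 2.0

def ratio_pmpc(points):
    """Perimeter of the outline divided by that of the equal-area circle."""
    p_mass = polygon_perimeter(points)
    p_circle = 2.0 * math.sqrt(math.pi * polygon_area(points))
    return p_mass / p_circle
```

A circle gives a ratio of 1, the smallest possible value, and the ratio grows as the outline becomes more irregular; a unit square, for instance, gives 2/sqrt(π), approximately 1.13.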
Objective features of mass
In past work, we had proposed nine objective features for the histological classification of masses in ultrasonographic images  . These features reflected clinicians’ subjective impressions based on experience. Our method had recorded satisfactory classification performance. Therefore, we used the same objective features in this study: depth-width ratio (D/W), degree of indistinctness along the margin (IndisMargin), homogeneity in internal echoes (HomoEchoes), echo level of internal echoes (InEchoes), echo level of posterior echoes (PostEchoes), circularity measure in mass shape (Circularity), polygon measure in mass shape (Polygon), lobulated shape measure in mass shape (Lobulated), and irregularity measure in mass shape (Irregularity).
2.2.3. Classification Scheme
A classifier based on the k-NN rule was employed to distinguish three types of histological classifications. The k-NN rule adopts a majority voting strategy using the k nearest neighbors. Unknown test data was classified as belonging to the class with the highest voting power. A leave-one-out-by-patient test method was used to train and test the classifier. In this method, data pertaining to one patient was first selected as the testing dataset, and data from the remaining patients was used to train the algorithm. This procedure was repeated until every patient in our database had been tested once.
Figure 5. A corner in a segmented mass.
Figure 6. A segmented mass (a), and a circle with the same area (b).
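The classification and testing procedure can be sketched in Python as follows. This is a minimal illustration using only the standard library, with Euclidean distance and simple majority voting; the data layout `(patient_id, features, label)` and the function names are assumptions, and the tie-breaking rule is not specified in the text, so it is left to whatever `Counter.most_common` returns first.

```python
import math
from collections import Counter

def knn_predict(train, query, k=3):
    """train: list of (features, label). Majority vote among the k nearest."""
    neighbors = sorted(train, key=lambda s: math.dist(s[0], query))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]

def leave_one_patient_out(samples, k=3):
    """samples: list of (patient_id, features, label).

    All images of one patient are held out together, so no patient
    contributes to both the training and the testing data.
    Returns a list of (true_label, predicted_label) pairs.
    """
    results = []
    for pid in sorted({s[0] for s in samples}):
        train = [(f, y) for p, f, y in samples if p != pid]
        for p, f, y in samples:
            if p == pid:
                results.append((y, knn_predict(train, f, k)))
    return results
```

Holding out by patient rather than by image matters here because the database has 72 images from 47 patients; leaving out single images could place two images of the same patient on both sides of the split.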
2.2.4. Evaluation of Classification Performance
Sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV) were defined as

$$\mathrm{Sensitivity} = \frac{TP}{TP + FN}, \qquad \mathrm{Specificity} = \frac{TN}{TN + FP}, \qquad \mathrm{PPV} = \frac{TP}{TP + FP}, \qquad \mathrm{NPV} = \frac{TN}{TN + FN}$$

where TP (true positive) represented the number of malignant masses correctly identified, TN (true negative) was the number of benign masses correctly identified, FP (false positive) represented the number of benign masses incorrectly identified as malignant, and FN (false negative) was the number of malignant masses incorrectly identified as benign. Sensitivity refers to the ability of the test to correctly identify those patients who have the disease. Specificity refers to the ability of the test to correctly identify those patients who do not have the disease. PPV is the proportion of patients with a positive test result who actually have the disease. NPV is the proportion of patients with a negative test result who are actually free of the disease.
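These four measures can be checked with a few lines of Python. The counts in the usage note are inferred from the fractions reported later in this paper (47/51 malignant masses identified as malignant, 18/21 benign masses identified as benign, 18/22 for NPV); the function name is hypothetical.

```python
def diagnostic_metrics(tp, tn, fp, fn):
    """Standard two-class measures computed from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),  # malignant masses called malignant
        "specificity": tn / (tn + fp),  # benign masses called benign
        "ppv": tp / (tp + fp),          # positive calls that are malignant
        "npv": tn / (tn + fn),          # negative calls that are benign
    }

# Counts inferred from the reported fractions: TP = 47, FN = 4, TN = 18, FP = 3.
metrics = diagnostic_metrics(tp=47, tn=18, fp=3, fn=4)
```

With these counts the sketch reproduces the reported sensitivity (92.2%), specificity (85.7%), and NPV (81.8%). The two-class PPV it returns (47/50) is not a figure reported in this paper, which instead gives PPV separately for invasive and noninvasive carcinoma.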
Figure 7 shows the distribution of the 14 objective features obtained from all masses with architectural distortions in our database. These objective features were normalized by using the average value and the standard deviation of each feature obtained from all masses. NumCorners, RatioPMPC, and Irregularity for invasive carcinomas were larger than those for other lesions. On the other hand, IndisMargin and InEchoes for invasive carcinomas were lower than those for other lesions. ACI1, ACI2, and ACI3 for invasive carcinoma and noninvasive carcinoma were larger than those for benign mass. Circularity for benign mass was also larger than that for invasive carcinoma.
Figure 7. Distribution of objective features among (a) ACI1 and ACI2, (b) ACI3 and NumCorners, (c) RatioPMPC and D/W, (d) IndisMargin and HomoEchoes, (e) InEchoes and PostEchoes, (f) Circularity and Polygon, and (g) Lobulated and Irregularity.
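The normalization by average value and standard deviation corresponds to a per-feature z-score, which can be sketched in Python as follows. Whether the sample or the population standard deviation was used is not stated; the sample form (`statistics.stdev`) is assumed here, and the function name is hypothetical.

```python
import statistics

def zscore_normalize(rows):
    """rows: one feature vector per mass. Returns per-column z-scores."""
    columns = list(zip(*rows))
    means = [statistics.fmean(col) for col in columns]
    # Assumption: sample standard deviation (n - 1 in the denominator).
    sds = [statistics.stdev(col) for col in columns]
    return [
        [(v - m) / s if s > 0 else 0.0 for v, m, s in zip(row, means, sds)]
        for row in rows
    ]
```

Normalizing in this way puts features with very different ranges, such as a corner count and a ratio close to 1, on a common scale, which matters for a distance-based classifier such as the k-NN rule.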
Table 2 shows the results of tests for univariate equality of group means. The F-value for NumCorners was larger than that for any other feature. Therefore, NumCorners made the largest contribution to distinguishing the three histological classifications of masses with architectural distortions. The p values for ACI1, ACI2, ACI3, NumCorners, RatioPMPC, IndisMargin, InEchoes, Circularity, and Irregularity were below the significance level (p < 0.05). Therefore, these nine objective features were statistically significant for the histological classification of masses with architectural distortions.
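A test for univariate equality of group means is commonly computed as a one-way analysis-of-variance F statistic for each feature. The following Python sketch shows that computation on the assumption that this is the statistic reported in Table 2; the function name is hypothetical, and the p value (which requires the F distribution) is omitted.

```python
def anova_f(groups):
    """One-way ANOVA F statistic for a single feature.

    groups: one list of feature values per class, for example
    [invasive_values, noninvasive_values, benign_values].
    """
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]
    # Variation of the class means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of the values around their own class mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))
```

A larger F value means the class means are farther apart relative to the spread within each class, which is why the feature with the largest F value is read as contributing most to the separation.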
The k-NN rule was employed with the nine objective features to distinguish among the three histological classifications. Table 3 shows the results of the distinction of the three histological classifications by use of the classifier based on the k-NN rule with k = 3. The classification accuracy of the proposed method was 91.4% (32/35) for invasive carcinoma, 75.0% (12/16) for noninvasive carcinoma, and 85.7% (18/21) for benign mass. The sensitivity and specificity values were 92.2% (47/51) and 85.7% (18/21), respectively. The positive predictive values (PPV) were 88.9% (32/36) for invasive carcinoma and 85.7% (12/14) for noninvasive carcinoma, whereas the negative predictive value (NPV) was 81.8% (18/22) for benign mass.
To investigate the usefulness of the proposed objective features on architectural distortion in terms of classification accuracy, we compared the proposed method with our previous method to assess the histological classification of masses with architectural distortions. We employed the k-NN rule with our previous objective features (D/W, IndisMargin, HomoEchoes, InEchoes, PostEchoes, Circularity, Polygon, Lobulated, and Irregularity). The classification accuracy of our previous method was 85.7% (30/35) for invasive carcinoma, 31.3% (5/16) for noninvasive carcinoma, and 76.2% (16/21) for benign mass. Here, the value of k in the k-NN rule was 8. The proposed method yielded higher classification accuracy than our previous method. Therefore, the objective features for architectural distortion defined in this study were useful for the histological classification of masses with such distortions.
Table 2. Tests for univariate equality of group means.
Table 3. Determination results of three histological classifications using the k-NN rule for k = 3.
To investigate its usefulness in terms of classification accuracy, the k-NN rule was compared with the multiple discriminant method (MDM). In past work, we had used the MDM for the histological classification of masses. Table 4 shows the classification accuracies of the k-NN rule and the MDM based on the leave-one-out-by-patient test method. We used k = 3 in the k-NN rule. As inputs to the k-NN rule and the MDM, we used the nine objective features from this paper. The classification accuracies obtained with the k-NN rule were higher than those obtained with the MDM. It is possible that the MDM did not accurately estimate the decision boundary because the number of masses in each histological classification was small (in particular for noninvasive carcinoma). In contrast to the MDM, the k-NN rule did not estimate an explicit decision boundary and was based on the distance measure (Euclidean distance) between the test data and the training data. Therefore, we believe that the k-NN rule was more appropriate than the MDM for the histological classification of masses with architectural distortions in this study.
To investigate the adequacy of the mask shape used for the convergence index, we compared the proposed method with a variant in which ACI1, ACI2, and ACI3 were obtained with a circular convergence mask; the other six objective features (NumCorners, RatioPMPC, IndisMargin, InEchoes, Circularity, and Irregularity) were the same as in the proposed method. The classification accuracies of the circular-mask variant were 91.4% (32/35) for invasive carcinoma, 43.8% (7/16) for noninvasive carcinoma, and 85.7% (18/21) for benign mass. The overall classification accuracy of the proposed method was thus higher than that of the circular-mask variant. In previous studies, the shape of lesions was approximated by a circle. Thus, it was possible to use the circular concentration mask to compute the convergence index. However, masses in ultrasonographic images vary in shape. Therefore, we believe that a rectangular mask was more suitable than a circular mask for calculating the convergence index in this study.
We also investigated the change in classification accuracy for the proposed method when k for the k-NN rule varied from 1 to 5. Table 5 shows the results for the three histological divisions in this case. With k = 3, the proposed method yielded the highest classification accuracy.
There are some limitations in our proposed method. The number of histological types used in this study was relatively small. Only three types of masses formed our database. Therefore, we need to expand the database by collecting other types of masses and re-evaluate our proposed method. Furthermore, the regions occupied by the masses were manually traced by an experienced clinician in this study. It is time consuming for clinicians to manually trace masses in clinical practice.
Table 4. Comparison of the classification accuracies of the k-NN rule and the MDM.
Table 5. Results for the three histological divisions in this case.
In this study, we developed a computerized determination scheme for the histological classification of masses with architectural distortions in ultrasonographic images. Our proposed method was shown to yield high classification accuracy for histological classification, and could be useful as a diagnostic aid in the differential diagnosis of masses with architectural distortions. In future work, we plan to develop an automatic segmentation method for masses in ultrasonographic images.
This work was supported (in part) by JSPS Grant-in-Aid for Scientific Research on Innovative Areas (Multidisciplinary Computational Anatomy), JSPS KAKENHI Grant Number 15H01118.
The views expressed in this article do not reflect the official position of Mizuho Information & Research Institute, Inc. Any errors in this article are attributable to the authors.