Open Access

Detection of masses and microcalcifications in digital mammogram images using fuzzy logic



Breast cancer affects more than 8% of women in the USA and 5% in the UK [1, 2]. Early detection of breast cancer is vital in breast cancer management and mammography is the criterion standard and most reliable method for early detection [1,3].

Mammography is a specialized imaging technique that produces an image of the breast using low-dose X-rays. The mammographic signs of breast disease are masses and microcalcifications. Detection of breast masses is a challenging task for radiologists because their appearance on mammographic images is very similar to that of normal breast tissue. Masses have varied shapes and involve larger areas than breast microcalcifications [1,4].

The sensitivity of radiologists in screening programs is estimated to be 65%-75% [1]. Independent double reading has been suggested as a way to improve the sensitivity of early breast cancer detection in screening programs, and a 4% to 15% increase in the number of detected cancers has been reported with it [5, 7]. However, double reading leads to higher cost, increased workloads, and the need to employ more radiologists. Computer-aided detection (CAD) may be a cost-effective alternative to double reading. A CAD system could act as a second reader, prompting the radiologist to review regions in the mammogram that it has selected as suspicious.

A typical CAD workflow is as follows. The radiologist may first apply enhancement procedures to the digital images, and then performs the first reading of the mammogram and selects suspicious regions. The CAD system scans the mammogram images to select suspicious regions. The radiologist then checks the CAD output to verify whether any suspicious regions were left unchecked in the first reading, making a total of 2 human readings and an additional computerized aid.

Several studies have been performed to develop techniques (automatic, semiautomatic, or even manual) for lesion detection and segmentation. For example, Vornweg et al. [8] developed a neural network classifier using morphological parameters, contrast enhancement parameters, and clinical information to predict breast cancer. In their study the classifier sensitivity was 93.6% and specificity was 91.9% compared with confirmed histological diagnosis. It was noted by the authors that expert observers had less sensitivity (92.1%) and specificity (85.6%) than the suggested classifier method [8].

The present study concentrates on developing a new system to detect masses and microcalcifications. Such systems could be helpful for radiologists as a second reader in terms of improving sensitivity and specificity.

Material and methods
Preprocessing

Mammograms are difficult images to interpret, and a preprocessing phase is necessary to improve the quality of the images and make the feature extraction phase more reliable. Two techniques were applied to the images: a cropping operation, followed by image filtering with a 3 by 3 median filter. Median filtering has been reported to be effective at reducing noise in digital mammogram images [9, 10]. Finally, the pectoral muscle was detected automatically using a region growing algorithm.
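The 3 by 3 median filtering step can be illustrated with a minimal Python sketch. It is not the implementation used in the study: the border handling (clamping coordinates to the image edge) and the toy 5 × 5 patch are assumptions made only for illustration.

```python
from statistics import median

def median_filter_3x3(image):
    """Return a 3x3 median-filtered copy of a 2-D list of gray levels.

    Border pixels are handled by clamping coordinates to the image edge
    (an assumption; the source does not say how borders were treated).
    """
    rows, cols = len(image), len(image[0])
    out = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            window = [
                image[min(max(r + dr, 0), rows - 1)][min(max(c + dc, 0), cols - 1)]
                for dr in (-1, 0, 1)
                for dc in (-1, 0, 1)
            ]
            out[r][c] = median(window)  # middle value of the 9 neighbors
    return out

# Hypothetical 5x5 patch: flat background of 10 with one impulse-noise pixel.
noisy = [[10] * 5 for _ in range(5)]
noisy[2][2] = 255
filtered = median_filter_3x3(noisy)
print(filtered[2][2])  # the isolated spike is replaced by the background value
```

Because the median ignores a single outlier in each window, the isolated bright pixel disappears while the flat background is left unchanged.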

Feature extraction

Texture is a property that describes the surface of an object based on its attributes such as shape and size. Texture analysis is used to explore useful information based on local variations in image pixel values. The goal of texture analysis is to quantify the relationship between pixels in a particular image.

Computer algorithms do not interpret textures in exactly the same way as humans do. However, computers are able to help as a “second reader” for radiologists, and they may also provide quantitative information about textures that human interpretation cannot. Among the various statistical texture methods, the gray-level co-occurrence matrix (GLCM) was preferred because it gives access to more features and is useful for detecting small objects [11]. The GLCM was therefore used in this study. A single co-occurrence matrix counts the relationships between neighboring pixels in both the forward and backward directions. There are a total of 8 directions: 0°, 45°, 90°, 135°, 180°, 225°, 270°, and 315° [11]. The most important point is that opposite directions yield the same co-occurrence matrix. Therefore, in the present study, the 4 directions suggested by Haralick [11] were employed to compute co-occurrence matrices at a distance of d = 1. Fourteen statistical measurements, such as contrast, correlation, and inverse difference moment, were derived from the 4 GLCM matrices. In addition, the mean, median, mode, standard deviation, skewness, and kurtosis were also calculated.
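A minimal Python sketch of a symmetric GLCM at d = 1 and a few of the derived measurements is given here for illustration. The function names, the 4-level toy image, and the use of the angular second moment as "energy" are assumptions, not details taken from the study.

```python
import math

# (row step, column step) for the 4 directions at distance d = 1.
OFFSETS = {0: (0, 1), 45: (-1, 1), 90: (-1, 0), 135: (-1, -1)}

def glcm(image, offset, levels):
    """Symmetric, normalized gray-level co-occurrence matrix for one offset."""
    dr, dc = offset
    rows, cols = len(image), len(image[0])
    counts = [[0] * levels for _ in range(levels)]
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dr, c + dc
            if 0 <= r2 < rows and 0 <= c2 < cols:
                i, j = image[r][c], image[r2][c2]
                counts[i][j] += 1  # forward direction
                counts[j][i] += 1  # backward (opposite) direction
    total = sum(map(sum, counts))
    return [[v / total for v in row] for row in counts]

def texture_features(p):
    """Contrast, energy (angular second moment), entropy, and correlation."""
    n = len(p)
    cells = [(i, j, p[i][j]) for i in range(n) for j in range(n)]
    mean = sum(i * v for i, _, v in cells)  # row and column means agree (symmetry)
    var = sum((i - mean) ** 2 * v for i, _, v in cells)
    cov = sum((i - mean) * (j - mean) * v for i, j, v in cells)
    return {
        "contrast": sum((i - j) ** 2 * v for i, j, v in cells),
        "energy": sum(v * v for _, _, v in cells),
        "entropy": -sum(v * math.log2(v) for _, _, v in cells if v > 0),
        # Correlation is undefined for a constant patch; 0.0 is a placeholder.
        "correlation": cov / var if var > 0 else 0.0,
    }

# Toy 4-level image (hypothetical data, not from the study).
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 2, 2, 2],
    [2, 2, 3, 3],
]
features = {angle: texture_features(glcm(image, off, 4)) for angle, off in OFFSETS.items()}
same_as_opposite = glcm(image, (0, 1), 4) == glcm(image, (0, -1), 4)  # 0 vs 180 degrees
print(features[0]["contrast"], same_as_opposite)
```

Counting each pixel pair in both orders is what makes the 0° and 180° matrices identical, which is why 4 directions suffice.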

Because the final aim of the fuzzy system was classification of different tissues, namely fat, dense, mass, and microcalcifications, small 8 by 8 view masks were defined. We extracted 18,000 samples (8 × 8 pixels each) from various breast tissues. For each small sample, the co-occurrence matrices were calculated at distance d = 1 for the angles 0°, 45°, 90°, and 135°, with a 5th matrix being the mean of the 4 directions.
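The sampling step and the 5th (mean) matrix can be sketched as follows. This is an illustration only: the study does not state whether the 8 × 8 samples overlapped, so non-overlapping tiling is an assumption, and the synthetic image and the small stand-in matrices are hypothetical.

```python
def tile_patches(image, size=8):
    """Split a 2-D list into non-overlapping size x size patches.

    Non-overlapping tiling is an assumption; incomplete edge tiles are dropped.
    """
    rows, cols = len(image), len(image[0])
    patches = []
    for r in range(0, rows - size + 1, size):
        for c in range(0, cols - size + 1, size):
            patches.append([row[c:c + size] for row in image[r:r + size]])
    return patches

def mean_matrix(matrices):
    """Element-wise mean of equally sized matrices, e.g. 4 directional GLCMs."""
    n = len(matrices)
    rows, cols = len(matrices[0]), len(matrices[0][0])
    return [[sum(m[i][j] for m in matrices) / n for j in range(cols)]
            for i in range(rows)]

# Hypothetical 16 x 20 image: 4 full patches, the last 4 columns are dropped.
image = [[(r + c) % 4 for c in range(20)] for r in range(16)]
patches = tile_patches(image)

# Stand-ins for the 4 directional matrices of one patch (made-up values).
directional = [
    [[1.0, 0.0], [0.0, 0.0]],
    [[0.0, 1.0], [0.0, 0.0]],
    [[0.0, 0.0], [1.0, 0.0]],
    [[0.0, 0.0], [0.0, 1.0]],
]
fifth = mean_matrix(directional)
print(len(patches), fifth)
```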

In the present study, to select the best features, data mining techniques were employed using Waikato Environment for Knowledge Analysis (WEKA) machine learning software (https://svn.cms.waikato.ac.nz/svn/weka/ and http://www.cs.waikato.ac.nz/~ml/weka/). The attribute selection was run and 7 features were selected, namely, mean, median, standard deviation, entropy, correlation, energy, and contrast.

System architecture

The suggested system was designed and implemented using data mining and fuzzy tools. The method consists of 3 steps, namely fuzzification, fuzzy inference, and defuzzification. These steps are explained in detail in the following subsections.

Fuzzification

Fuzzy set theory has proven useful in many areas of image processing [12]. It is well known that mammographic images have some degree of fuzziness, such as indistinct borders, ill-defined shapes, and different densities. Because of the nature of mammography and breast structure, fuzzy logic is better suited than traditional crisp methods to handling the fuzziness of mammograms.

Because the crisp data were not suitable for a fuzzy inference (classifier) system, a membership function was defined for each of the selected features specified earlier. The MATLAB Fuzzy Logic Toolbox was employed for this purpose. The bell-shaped (Gaussian) membership function was chosen because of its symmetric shape. The Gaussian curve is given by:

$$f(x) = \exp\left( \frac{-0.5\,(x - c)^2}{\sigma^2} \right)$$
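As an illustration of the Gaussian membership function, with c as the center and σ as the width of the curve, a minimal Python sketch follows. The three linguistic terms and their parameters are hypothetical; the study's actual membership parameters are not given in the text.

```python
import math

def gaussian_membership(x, c, sigma):
    """Degree of membership of x in a Gaussian fuzzy set centered at c."""
    return math.exp(-0.5 * (x - c) ** 2 / sigma ** 2)

# Hypothetical terms for one feature scaled to [0, 1]: name -> (c, sigma).
TERMS = {"low": (0.0, 0.2), "medium": (0.5, 0.2), "high": (1.0, 0.2)}

def fuzzify(x):
    """Map a crisp feature value to its membership degree in each term."""
    return {name: gaussian_membership(x, c, s) for name, (c, s) in TERMS.items()}

degrees = fuzzify(0.5)
print(degrees)
```

A value at the center of a set has membership 1, and the membership falls to exp(−0.5), about 0.61, one σ away from the center on either side.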

Fuzzy inference

A knowledge base including more than a hundred rules was prepared. The rules were developed based on the decision tree which was built using the WEKA software.
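How a set of “if-then” rules can be evaluated is sketched below in Python. The rules, feature terms, and membership degrees are invented for illustration and are not rules from the study's knowledge base; taking the minimum as the fuzzy AND and the maximum to combine rules with the same output is a common convention that is assumed here.

```python
def rule_strength(memberships, antecedent):
    """Firing strength of one rule: fuzzy AND taken as the minimum (assumed)."""
    return min(memberships[feature][term] for feature, term in antecedent)

def infer(memberships, rules):
    """Combine rules that share an output class by taking the maximum strength."""
    strengths = {}
    for antecedent, tissue in rules:
        s = rule_strength(memberships, antecedent)
        strengths[tissue] = max(strengths.get(tissue, 0.0), s)
    return strengths

# Hypothetical membership degrees of one sample after fuzzification.
memberships = {
    "contrast": {"low": 0.2, "high": 0.8},
    "entropy": {"low": 0.1, "high": 0.7},
}

# Hypothetical rules: (antecedent as (feature, term) pairs, output class).
rules = [
    ((("contrast", "high"), ("entropy", "high")), "microcalcification"),
    ((("contrast", "low"), ("entropy", "low")), "fat"),
    ((("contrast", "high"), ("entropy", "low")), "mass"),
]

result = infer(memberships, rules)
print(result)
```

Each path from the root of a decision tree to a leaf can be read as one such rule, with the tests along the path forming the antecedent and the leaf giving the output class.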

Defuzzification

The output of the rules is the defuzzified output image. The output membership function included 4 triangular members, one for each tissue type. Based on the “if-then” rules and membership weights, the output falls into one of these 4 classes. The centroid method chooses the center of the output membership area as the final fuzzy inference output. For a triangular member with base endpoints a and b and peak at x_max, the centroid position can be calculated using the following equation [13]:

$$\overline{X} = \frac{1}{3}\left( a + x_{\max} + b \right)$$
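The closed-form centroid can be checked numerically. The following Python sketch compares it with a centroid obtained by numerical integration over a triangular membership function; the triangle parameters are arbitrary example values, and reading a and b as the base endpoints and x_max as the peak is an interpretation of the equation above.

```python
def triangular(x, a, peak, b):
    """Triangular membership function with base [a, b] and maximum at peak."""
    if x <= a or x >= b:
        return 0.0
    if x <= peak:
        return (x - a) / (peak - a)
    return (b - x) / (b - peak)

def centroid_closed_form(a, peak, b):
    """Centroid of a triangle: the mean of its three corner positions."""
    return (a + peak + b) / 3

def centroid_numeric(a, peak, b, steps=10_000):
    """Centroid as sum(x * mu(x)) / sum(mu(x)) on a midpoint grid."""
    h = (b - a) / steps
    xs = [a + (k + 0.5) * h for k in range(steps)]
    weights = [triangular(x, a, peak, b) for x in xs]
    return sum(x * w for x, w in zip(xs, weights)) / sum(weights)

# Arbitrary example triangle (not a membership function from the study).
a, peak, b = 0.0, 1.0, 4.0
closed = centroid_closed_form(a, peak, b)
numeric = centroid_numeric(a, peak, b)
print(closed, numeric)
```

For an asymmetric triangle the centroid is pulled toward the longer side, so it generally differs from the peak position.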

Evaluation

After approval by the Medical Ethics Committee of the Faculty of Medicine and Health Sciences of the University Putra Malaysia (approval No. F01 (JIAPR (OB) 07)) and National Cancer Society of Malaysia (NCSM), the suggested system was implemented on 326 standard mammogram images that were obtained from the NCSM database, which included 194 normal cases and 132 abnormal cases.

To interpret images, 3 expert radiologists from the Department of Medical Imaging, University Malaya, Department of Imaging, University Putra Malaysia and Hospital Serdang with at least 5 years’ experience of digital mammogram reading and interpretation were invited to read the images and select the area of lesions.

PASW Statistics for Windows (version 18; SPSS Inc., Chicago, IL, USA) was used for data analysis. Sensitivity and specificity were used as descriptive statistics, and the kappa statistic was used as an inferential statistic. In addition, a receiver operating characteristic (ROC) curve was used.

Results

After the suggested method was applied to the images (Figure 1), a suspicious region was selected correctly in 113 of the 132 abnormal cases (86%). Table 1 shows the results of the suggested system against the true diagnoses, and Table 2 shows the results of the radiologists against the true diagnoses. The first radiologist had a better accuracy rate than the suggested system. Accuracy is generally summarized by presenting the correctly detected cases in the form of sensitivity and specificity.

The overall sensitivity of the suggested system was 85.6%, and its specificity was 90.7%. The sensitivity and specificity of the radiologists are presented in Table 3. An ROC curve plots sensitivity against 1 − specificity, and the area under the curve can be used as a criterion for comparison with other systems. Figure 2 shows the ROC curve for the experiment.
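The sensitivity and specificity follow directly from the counts of the suggested method against the true diagnosis (113 true positives, 18 false positives, 19 false negatives, 176 true negatives), as the Python sketch below shows. The last line is an assumption-based check: for a classifier with a single binary output, the trapezoidal area under the ROC curve reduces to the mean of sensitivity and specificity, which gives about 0.88, in line with the area reported in the Results. Whether the study's ROC curve was built this way is not stated.

```python
def sensitivity_specificity(tp, fp, fn, tn):
    """Return (sensitivity, specificity) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)  # detected abnormal / all abnormal
    specificity = tn / (tn + fp)  # detected normal / all normal
    return sensitivity, specificity

# Counts for the suggested method against the true diagnosis.
tp, fp, fn, tn = 113, 18, 19, 176
sens, spec = sensitivity_specificity(tp, fp, fn, tn)

# Assumption: one operating point, ROC drawn as two straight segments
# through (0, 0), (1 - spec, sens), and (1, 1).
auc_single_point = (sens + spec) / 2

print(round(100 * sens, 1), round(100 * spec, 1), round(auc_single_point, 3))
```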

The area under the ROC curve was 0.881 for the system diagnosis, which is in the “good” category and indicates that the accuracy of the suggested method is acceptable. The agreement between the results of the suggested method and the true diagnosis in the NCSM database was measured using Cohen’s kappa coefficient. Based on the data, κ was computed as 0.75 (P = 0.0001). According to the criteria of Landis and Koch [14], this value of κ indicates substantial agreement between the true diagnosis and the results of the suggested method for the detection of suspicious regions.
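Cohen's kappa can be recomputed from the 2 × 2 counts of the suggested method against the true diagnosis, as in the Python sketch below. This is a plain textbook calculation rather than the PASW procedure used in the study; with these counts it gives about 0.76, close to the reported 0.75 and within the 0.61 to 0.80 band that Landis and Koch label as substantial agreement.

```python
def cohens_kappa(table):
    """Cohen's kappa for a square agreement table (rows: method, columns: truth)."""
    k = len(table)
    n = sum(map(sum, table))
    observed = sum(table[i][i] for i in range(k)) / n
    row_totals = [sum(row) for row in table]
    col_totals = [sum(table[i][j] for i in range(k)) for j in range(k)]
    # Agreement expected by chance from the marginal totals.
    expected = sum(row_totals[i] * col_totals[i] for i in range(k)) / n ** 2
    return (observed - expected) / (1 - expected)

# Rows: suggested method (abnormal, normal); columns: true diagnosis.
table_1 = [[113, 18], [19, 176]]
kappa = cohens_kappa(table_1)
print(round(kappa, 3))
```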

Figure 1

A: Selected region on i535, B: Selected region on i541

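Table 1. The accuracy of the suggested method against the true diagnosis

| Suggested method | True diagnosis: Abnormal | True diagnosis: Normal | Total |
| --- | --- | --- | --- |
| Abnormal | 113 | 18 | 131 |
| Normal | 19 | 176 | 195 |
| Total | 132 | 194 | 326 |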

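Table 2. The accuracy of the radiologists against the true diagnosis

| Radiologist | Reading | True diagnosis: Abnormal | True diagnosis: Normal | Total |
| --- | --- | --- | --- | --- |
| R1 | Abnormal | 116 | 11 | 127 |
| R1 | Normal | 16 | 183 | 199 |
| R1 | Total | 132 | 194 | 326 |
| R2 | Abnormal | 112 | 18 | 130 |
| R2 | Normal | 20 | 176 | 196 |
| R2 | Total | 132 | 194 | 326 |
| R3 | Abnormal | 103 | 22 | 125 |
| R3 | Normal | 29 | 172 | 201 |
| R3 | Total | 132 | 194 | 326 |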

Table 3. Sensitivity and specificity of radiologists and suggested method (Mammographic Image Analysis Society database images)

| Reader | Sensitivity (%) | Specificity (%) |
| --- | --- | --- |
| Suggested system | 85.6 | 90.7 |
| R1 | 87.9 | 94.3 |
| R2 | 84.9 | 90.7 |
| R3 | 78.0 | 88.7 |

Figure 2

ROC curve of the suggested system diagnosis and diagnosis by the 3 radiologists. Diagonal segments are produced by ties.

Discussion

The findings showed 85.6% sensitivity and 90.7% specificity for the suggested method when it was applied to images obtained locally at the NCSM. This performance is higher than that described by Li et al. [15], Bovis et al. [16], and Bovis and Singh [17]. Li et al. [15] reported a sensitivity of 70% using 150 images. The present study used more images and obtained higher sensitivity and specificity, which suggests that our system performed better than these earlier systems. Bovis et al. [16] used artificial neural networks to detect lesions, with texture parameters entered into the network as inputs. Their results showed a sensitivity of 64%, which was not acceptable. Bovis and Singh [17] used subtraction of left and right images, and the resulting image was entered into a neural network to detect suspicious regions. This system produced a sensitivity of 74.7%, which was higher than that of their earlier system. The suggested system therefore had better sensitivity than any of these 3 systems.

Other studies have described different methods with higher sensitivity and specificity. For instance, Chen et al. [18] used fuzzy c-means (FCM) clustering to detect lesions in selected regions of interest (ROI). Vornweg et al. [8] designed an artificial neural network to detect lesions and reported a sensitivity of 93.6% and a specificity of 91.9%.

Conclusion

The fuzzy technique is helpful for designing appropriate CADs. Such systems may help radiologists and physicians to detect breast lesions at an early stage with acceptable sensitivity.

eISSN:
1875-855X
Language:
English
Publication timeframe:
6 times per year
Journal Subjects:
Medicine, Assistive Professions, Nursing, Basic Medical Science, other, Clinical Medicine