Received 24 March 2016; accepted 20 April 2016; published 15 June 2016
1. Introduction
Image fusion is the process of combining two or more images into a single fused image that provides more reliable and accurate information than any individual input. The fused image is useful for human visual perception, machine perception, and further analysis and image processing tasks. Image fusion plays an important role in medical imaging, machine vision, remote sensing, microscopic imaging and military applications. Over the last few decades, medical imaging has come to play an important role in a large number of healthcare applications, including diagnosis and treatment. The main objective of multimodal medical image fusion is to capture the most relevant information from the input images in a single output image that is useful in clinical applications. Different medical imaging modalities contain complementary information about human organs and tissues, which helps physicians to diagnose diseases. Modalities such as Computed Tomography (CT), Magnetic Resonance Angiography (MRA), Magnetic Resonance Imaging (MRI), Positron Emission Tomography (PET), Ultrasonography (USG), Single-Photon Emission Computed Tomography (SPECT) and X-ray imaging each provide only limited information, so no single modality gives comprehensive and accurate information. For example, MRI, CT, USG and MRA are structural modalities that provide high-resolution images with anatomical information, whereas PET, SPECT and functional MRI (fMRI) are functional modalities that provide low-spatial-resolution images with functional information. Hence, anatomical and functional medical images can be integrated to obtain more useful information about the same object. This helps in diagnosing diseases accurately and reduces storage cost, because a single fused image is stored instead of multiple input images.
Many image fusion techniques have been suggested in the literature, and some of them address multimodality medical image fusion. Image fusion techniques fall into three categories: pixel-level, feature-level and decision-level fusion. Pixel-level fusion is usually employed for medical images because it is easy to implement and computationally efficient. Hence, the proposed work focuses on pixel-level fusion.
The simplest image fusion technique creates the fused image by averaging the source images pixel by pixel, but this reduces the contrast of the fused image. Techniques based on Principal Component Analysis (PCA), Intensity-Hue-Saturation (IHS), Independent Component Analysis (ICA) and the Brovey transform (BT) produce spectral degradation. Pyramidal methods such as the Gradient pyramid (GP), Laplacian pyramid (LP), Ratio-of-low-pass pyramid (ROLP), Contrast pyramid (CP) and Morphological pyramid suffer from blocking effects and are therefore not well suited to medical image fusion. The commonly used Discrete Wavelet Transform (DWT) based techniques are good at representing isolated discontinuities, but they cannot preserve the salient features of the source images efficiently, and they introduce artifacts and inconsistencies in the fused images. Several Multiscale Geometric Analysis (MGA) tools, such as the Curvelet, the Contourlet and the NSCT, have been developed that do not suffer from these problems of the wavelet, and many image fusion techniques based on these MGA tools have been suggested by researchers.
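The contrast loss caused by pixel averaging, mentioned at the start of the paragraph above, can be seen in a small numerical sketch. The two arrays are hypothetical toy images, not data from the experiments: one carries a bright structure and the other is flat in the same region.

```python
import numpy as np

# Toy source images (hypothetical): A has a bright structure, B is flat.
a = np.zeros((4, 4))
a[:, 2:] = 200.0
b = np.full((4, 4), 100.0)

# Pixel-by-pixel averaging, the simplest fusion rule.
avg = (a + b) / 2.0

# Standard deviation is used here as a simple contrast measure.
contrast_a = a.std()      # contrast of the structure in A
contrast_avg = avg.std()  # contrast left after averaging
```

In this toy case the structure from A survives in the average, but its contrast is halved because the flat image B contributes nothing except a constant offset.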
In medical image fusion, edge information must be preserved, but DWT-based fusion may produce artifacts along the edges, and it captures only limited directional information, along the vertical, horizontal and diagonal directions. To overcome these problems of the discrete wavelet transform, the Contourlet transform was proposed; it gives an asymptotically optimal representation of contours and has been used for image fusion. However, the upsampling and downsampling operations of the Contourlet transform make it shift-variant and introduce pseudo-Gibbs phenomena in the fused image. Later, the Non-Subsampled Contourlet Transform (NSCT) was proposed by Cunha et al.; it inherits the advantages of the Contourlet transform while effectively suppressing pseudo-Gibbs phenomena. The structure of the NSCT consists of two shift-invariant components, the Non-Subsampled Pyramid (NSP) and the Non-Subsampled Directional Filter Bank (NSDFB). This structure gives the NSCT its shift-invariance and redundancy properties and makes it an attractive multiscale transform (MST) for image fusion.
In this paper, a multimodal medical image fusion technique based on the NSCT is proposed. The NSCT is applied to the source images to decompose them into low- and high-frequency components. The low-frequency components are fused by the maximum local mean scheme and the high-frequency components are fused by the maximum local variance scheme. The use of variance enhances the fusion scheme by preserving the edges in the images. This combination preserves more of the detail in the source images and improves the quality of the fused images. The fused image is obtained by applying the Inverse Non-Subsampled Contourlet Transform (INSCT) to the coefficients obtained from all the frequency bands. The effectiveness of the proposed technique is evaluated by fusion experiments on different multimodality medical image pairs. Both qualitative and quantitative analyses indicate that the proposed framework provides better fusion results than the conventional image fusion techniques.
The rest of the paper is organized as follows. The NSCT is described in Section 2, followed by the proposed medical image fusion method in Section 3. Results and discussion are given in Section 4 and the conclusion is given in Section 5.
2. Non-Subsampled Contourlet Transform (NSCT)
The NSCT was introduced by Cunha et al. It is based on the theory of the Contourlet transform, which achieves good results in image processing by representing the geometric structure of images. The Contourlet transform is shift-variant because it contains both down-samplers and up-samplers in its Laplacian Pyramid (LP) and Directional Filter Bank (DFB) stages. The NSCT is a shift-invariant, multi-scale and multi-directional transform that can be implemented efficiently. It is obtained by combining the Non-Subsampled Pyramid Filter Bank (NSP or NSPFB) and the Non-Subsampled Directional Filter Bank (NSDFB). Figure 1 shows the decomposition framework of the NSCT.
2.1. Non-Subsampled Pyramid Filter Bank
The multiscale property is ensured by the non-subsampled pyramid filter bank, which uses a two-channel non-subsampled filter bank. Each decomposition level produces one low-frequency image and one high-frequency image. At each subsequent NSP level, the low-frequency component is decomposed again, so that the singularities in the image are captured iteratively. As a result, an NSP with L decomposition levels yields L + 1 sub-images: one low-frequency image and L high-frequency images. All sub-images have the same size as the source image.
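The "L + 1 sub-images of the same size" property can be illustrated with a short sketch. It does not use the actual NSP filters: as a stand-in, it assumes a separable [1, 2, 1] / 4 lowpass kernel whose taps are spread apart at each level (the à trous idea), with edge replication at the borders. The function names `smooth` and `pyramid` are hypothetical.

```python
import numpy as np

def smooth(img, step):
    """Separable [1, 2, 1] / 4 lowpass with taps `step` pixels apart (no subsampling)."""
    out = np.asarray(img, dtype=float)
    for axis in (0, 1):
        pad = [(0, 0), (0, 0)]
        pad[axis] = (step, step)
        p = np.pad(out, pad, mode="edge")  # border handling is an assumption
        n = out.shape[axis]
        left = np.take(p, np.arange(0, n), axis=axis)
        mid = np.take(p, np.arange(step, step + n), axis=axis)
        right = np.take(p, np.arange(2 * step, 2 * step + n), axis=axis)
        out = (left + 2.0 * mid + right) / 4.0
    return out

def pyramid(img, levels):
    """Return (low, highs): one low-frequency image and `levels` high-frequency images."""
    low = np.asarray(img, dtype=float)
    highs = []
    for j in range(levels):
        next_low = smooth(low, 2 ** j)   # filter is dilated instead of downsampling the image
        highs.append(low - next_low)     # detail removed at this level
        low = next_low
    return low, highs

img = np.arange(64, dtype=float).reshape(8, 8)
low, highs = pyramid(img, levels=3)
```

With three levels the sketch returns one low-frequency image and three high-frequency images, all 8 × 8, and adding them back together recovers the input.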
2.2. Non-Subsampled Directional Filter Bank
The Non-Subsampled Directional Filter Bank is a two-channel filter bank constructed by combining directional fan filter banks. It gives the NSCT its multi-direction property. The NSDFB is obtained by eliminating the downsamplers and upsamplers in each two-channel filter bank of the DFB tree structure and upsampling the filters accordingly. The NSDFB applies a k-level directional decomposition to each high-frequency subband from the NSPFB and produces 2^k directional subbands, each with the same size as the source image. Thus, the NSDFB provides the NSCT with multidirectional capability and gives more precise directional detail information. Because of its non-subsampled operation, the NSCT provides better frequency selectivity and the important property of shift-invariance. All sub-images decomposed by the NSCT have the same size, and image fusion based on the NSCT can mitigate the effects of misregistration in the fused images. Therefore, the NSCT is more suitable for medical image fusion. The general procedure for NSCT-based image fusion is shown in Figure 2, and its steps are summarized below.
Figure 1. Decomposition of NSCT framework.
Figure 2. General procedure for NSCT based image fusion.
Step 1: The NSCT is applied to the source images to obtain the lowpass subband coefficients and the highpass directional subband coefficients at each scale and direction. The NSPFB performs the multiscale decomposition and the NSDFB performs the multi-direction decomposition.
Step 2: Fusion rules are applied to the transformed coefficients to select the NSCT coefficients of the fused image.
Step 3: The fused image is constructed by applying the inverse NSCT to the coefficients selected in Step 2.
3. Proposed Medical Image Fusion Method
The architecture of the proposed image fusion method is shown in Figure 3. The multimodal medical images are preprocessed before they are used for fusion. The preprocessing includes image registration, image resizing and, where noise is present, filtering.
Images obtained by different modalities may have different orientations, so they need to be registered before they are fused. Image registration, a fundamental task in image processing, is the process of spatially aligning two or more images of a scene: given a point in one image, registration determines the position of the same point in the other image. In this work the source images are assumed to be spatially registered, which is a common assumption in image fusion. Various techniques can be applied to medical image registration, and thorough surveys of image registration techniques are available in the literature.
The images to be fused may be of different sizes, but fusion requires images of the same size. If the sizes differ, image scaling (resizing) is required. Here, the smaller image is enlarged by interpolation that duplicates its rows and columns.
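Resizing by row and column duplication can be sketched as follows. The text does not give the exact interpolation used, so this is only a minimal illustration that assumes integer scale factors; `upscale_by_duplication` is a hypothetical helper name.

```python
import numpy as np

def upscale_by_duplication(img, factor_rows, factor_cols):
    """Enlarge an image by repeating each row and each column an integer number of times."""
    rows_repeated = np.repeat(img, factor_rows, axis=0)
    return np.repeat(rows_repeated, factor_cols, axis=1)

small = np.array([[1, 2],
                  [3, 4]])
big = upscale_by_duplication(small, 2, 2)
```

Each pixel of the 2 × 2 input becomes a 2 × 2 block in the 4 × 4 output, which is equivalent to nearest-neighbour interpolation for integer factors.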
If noise is present in the input images, filtering is required to remove it. Most smoothing techniques remove noise effectively but adversely affect edges, whereas edges are important to the visual appearance of images and must be preserved for medical image fusion. The median filter is a nonlinear sliding-window spatial filter that replaces the center value in the window with the median of all the pixel values in the window. For small to moderate levels of noise and a given, fixed window size, the median filter is demonstrably better than Gaussian blur at removing noise whilst preserving edges. It is particularly effective for speckle noise and additive noise, and it is useful in reducing impulsive (salt-and-pepper) noise. Like lowpass filtering, median filtering smooths the image and thus reduces noise; unlike lowpass filtering, it can preserve discontinuities and can smooth a few pixels whose values differ significantly from their surroundings without affecting the other pixels.
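The edge-preserving behaviour of the median filter can be checked on a toy example. The sketch below is an illustrative 3 × 3 median filter written with NumPy only; the edge-replicated border and the test image (a vertical step edge with one impulsive "salt" pixel) are assumptions made for the example.

```python
import numpy as np

def median_filter3(img):
    """3x3 sliding-window median; borders are handled by edge replication (an assumption)."""
    p = np.pad(img, 1, mode="edge")
    h, w = img.shape
    # Stack the nine shifted copies of the image, one per window position.
    stack = np.stack([p[dy:dy + h, dx:dx + w] for dy in range(3) for dx in range(3)])
    return np.median(stack, axis=0)

clean = np.zeros((5, 6))
clean[:, 3:] = 100.0      # vertical step edge
noisy = clean.copy()
noisy[2, 1] = 255.0       # a single impulsive ("salt") pixel
restored = median_filter3(noisy)
```

In this example the impulse is removed and the step edge is left exactly where it was, whereas an averaging filter of the same size would blur the edge and spread the impulse.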
Figure 3. Proposed image fusion method.
In the proposed architecture shown in Figure 3, A and B represent the two source images and F represents the resultant fused image. The low-frequency subbands (LFS) of images A and B are taken at the coarsest scale K, and the high-frequency subbands (HFS) of A and B are indexed by scale k and direction h.
3.1. Fusion of Low Frequency Subbands
The coefficients in the low-frequency subbands represent the approximation component of the input images, and most of the information in the input images is contained in this band. Hence, to fuse the low-frequency coefficients, we propose a scheme that selects coefficients according to the mean (μ) computed in a local neighborhood. The mean gives a measure of the average gray level in the region over which it is computed. It is calculated on the low-frequency components of each input image within a 3-by-3 window, and the coefficient whose neighborhood has the higher mean is selected as the fused low-frequency coefficient. The formula for the mean (μ) is given in Equation (1).
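One form of Equation (1) consistent with the definitions that follow is sketched here. Reading the window size ST as S rows by T columns, taking S and T to be odd, and normalizing by the window area are assumptions rather than details confirmed by the text:

$$\mu(p) = \frac{1}{S \times T} \sum_{r=-(S-1)/2}^{(S-1)/2} \; \sum_{c=-(T-1)/2}^{(T-1)/2} C(m + r,\, n + c) \qquad (1)$$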
where S × T is the window size, μ(p) denotes the mean value of the coefficients in the S × T window centered at (m, n), and C represents the multi-scale decomposition coefficient in the low-frequency subband. For notational simplicity, the location (m, n) of each coefficient is denoted by p. After the mean has been calculated for all coefficients in the low-frequency band, the coefficient with the higher magnitude of mean at each location is chosen for the fused image. The fusion scheme used for the low-frequency bands can be illustrated as follows:
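A selection rule consistent with this description is $C_F(p) = C_A(p)$ when $|\mu_A(p)| \ge |\mu_B(p)|$ and $C_F(p) = C_B(p)$ otherwise. The sketch below implements that reading with NumPy. It is an illustration rather than the authors' MATLAB code: sending ties to image A, replicating edges at the borders, and the names `local_mean` and `fuse_low` are all assumptions.

```python
import numpy as np

def local_mean(c, size=3):
    """Mean of each size-by-size neighborhood (edge-replicated borders, an assumption)."""
    r = size // 2
    padded = np.pad(np.asarray(c, dtype=float), r, mode="edge")
    h, w = c.shape
    total = np.zeros((h, w))
    for dy in range(size):
        for dx in range(size):
            total += padded[dy:dy + h, dx:dx + w]
    return total / (size * size)

def fuse_low(low_a, low_b, size=3):
    """Per position, keep the coefficient whose neighborhood mean has the larger magnitude."""
    mu_a = np.abs(local_mean(low_a, size))
    mu_b = np.abs(local_mean(low_b, size))
    return np.where(mu_a >= mu_b, low_a, low_b)  # ties go to A (assumption)

# Toy low-frequency subbands (hypothetical values).
low_a = np.full((4, 4), 10.0)
low_b = np.full((4, 4), 2.0)
low_b[:, 2:] = 50.0            # B is brighter on the right half
fused_low = fuse_low(low_a, low_b)
```

Note that the decision is made on the neighborhood mean, not on the coefficient itself: in the second column the coefficient of B (2) is chosen over that of A (10), because the 3 × 3 window of B already overlaps the bright region.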
3.2. Fusion of High Frequency Subbands
The objective of image fusion is that the fused image should not discard any useful information present in the input images and should preserve detailed information such as the edges, lines and boundaries of the image. These details are usually present in the high-frequency subbands. Hence, it is important to find an appropriate method to fuse the detail components of the source images. Conventional methods do not consider the neighbouring coefficients. However, a pixel in an image is related to its neighbouring pixels, which implies that a coefficient in a high-frequency subband is also related to its neighbouring coefficients.
Variance is used to characterize the details of a local region (3 × 3) in the high-frequency sub-images. The processing of high-frequency coefficients has a direct effect on the clarity and distortion of the image, because these coefficients carry the edges and details. The variance of a sub-image neighborhood characterizes the degree of change within that neighborhood. In addition, if there is strong correlation among the pixels in a local area, the characteristic information embodied in any pixel is shared by the surrounding pixels. The fusion rule of selecting the larger variance is effective when the variances corresponding to the image pixels vary greatly in local areas.
Hence, a variance-based fusion rule is used for the high-frequency coefficients: the variance is computed in a neighbourhood and used to select the coefficients. This method also produces large coefficients on the edges. The formula for the variance is given below.
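One form of the variance consistent with the definitions that follow is sketched here. It assumes an S × T window with S and T odd, centered at p = (m, n), and normalization by the window area rather than by S × T − 1; none of these details is confirmed by the text:

$$\sigma(p) = \frac{1}{S \times T} \sum_{r=-(S-1)/2}^{(S-1)/2} \; \sum_{c=-(T-1)/2}^{(T-1)/2} \bigl( C(m + r,\, n + c) - \mu(p) \bigr)^{2} \qquad (3)$$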
where S × T is the window size, μ(p) and σ(p) denote the mean and variance of the coefficients in the S × T window centered at (m, n), and C represents the multi-scale decomposition coefficient. The following fusion rule is then used to select the high-frequency coefficients.
The procedure of the proposed method can be summarized as follows.
1) The input images to be fused must be registered to ensure that the corresponding pixels are aligned.
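A rule consistent with the maximum local variance scheme is $C_F(p) = C_A(p)$ when $\sigma_A(p) \ge \sigma_B(p)$ and $C_F(p) = C_B(p)$ otherwise. The sketch below implements that reading for one pair of high-frequency subbands. Sending ties to image A, the population (divide-by-N) variance, the edge-replicated borders and the names `local_variance` and `fuse_high` are assumptions made for illustration.

```python
import numpy as np

def local_variance(c, size=3):
    """Population variance of each size-by-size neighborhood (edge-replicated borders)."""
    r = size // 2
    padded = np.pad(np.asarray(c, dtype=float), r, mode="edge")
    h, w = c.shape
    s1 = np.zeros((h, w))   # running sum of values
    s2 = np.zeros((h, w))   # running sum of squared values
    for dy in range(size):
        for dx in range(size):
            win = padded[dy:dy + h, dx:dx + w]
            s1 += win
            s2 += win * win
    n = size * size
    mean = s1 / n
    return np.maximum(s2 / n - mean * mean, 0.0)  # clip tiny negative round-off

def fuse_high(high_a, high_b, size=3):
    """Per position, keep the coefficient whose neighborhood has the larger variance."""
    var_a = local_variance(high_a, size)
    var_b = local_variance(high_b, size)
    return np.where(var_a >= var_b, high_a, high_b)  # ties go to A (assumption)

# Toy high-frequency subbands (hypothetical): each image has one vertical edge response.
high_a = np.zeros((6, 6))
high_a[:, 1] = 4.0
high_b = np.zeros((6, 6))
high_b[:, 4] = 3.0
fused_high = fuse_high(high_a, high_b)
```

In this example the fused subband keeps the edge response of A on the left and that of B on the right, since each image has the larger local variance around its own edge.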
2) Decompose the images using NSCT to get low and high frequency subbands.
3) The coefficients of low frequency subband of NSCT are selected by Equation (1) and Equation (2).
4) The coefficients of high frequency subbands of NSCT are selected by Equation (3) and Equation (4).
5) Perform the Inverse NSCT (INSCT) with the combined coefficients obtained from steps 3 and 4.
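Steps 2 to 5 can be put together in a short end-to-end sketch. It is not an NSCT implementation: a single undecimated low/high split with a 3 × 3 box filter stands in for the NSCT and its inverse, so the sketch is shift-invariant and same-size but has no directional subbands. The tie-breaking, border handling and all function names are assumptions made for illustration.

```python
import numpy as np

def window_stats(c, size=3):
    """Local mean and population variance over size-by-size windows (edge-replicated)."""
    r = size // 2
    padded = np.pad(np.asarray(c, dtype=float), r, mode="edge")
    h, w = padded.shape[0] - 2 * r, padded.shape[1] - 2 * r
    s1 = np.zeros((h, w))
    s2 = np.zeros((h, w))
    for dy in range(size):
        for dx in range(size):
            win = padded[dy:dy + h, dx:dx + w]
            s1 += win
            s2 += win * win
    mean = s1 / size ** 2
    var = np.maximum(s2 / size ** 2 - mean ** 2, 0.0)
    return mean, var

def decompose(img):
    """STAND-IN for the NSCT: one low/high split using a 3x3 box filter, no subsampling."""
    low, _ = window_stats(img)
    return low, np.asarray(img, dtype=float) - low

def reconstruct(low, high):
    """STAND-IN for the inverse NSCT: the split above is inverted by addition."""
    return low + high

def fuse(img_a, img_b):
    low_a, high_a = decompose(img_a)                 # step 2: decomposition
    low_b, high_b = decompose(img_b)
    mu_a, _ = window_stats(low_a)                    # step 3: maximum local mean
    mu_b, _ = window_stats(low_b)
    low_f = np.where(np.abs(mu_a) >= np.abs(mu_b), low_a, low_b)
    _, var_a = window_stats(high_a)                  # step 4: maximum local variance
    _, var_b = window_stats(high_b)
    high_f = np.where(var_a >= var_b, high_a, high_b)
    return reconstruct(low_f, high_f)                # step 5: inverse transform

# Random arrays stand in for a registered, equally sized image pair.
rng = np.random.default_rng(0)
img_a = rng.uniform(0, 255, size=(16, 16))
img_b = rng.uniform(0, 255, size=(16, 16))
fused = fuse(img_a, img_b)
```

A useful sanity check for any such pipeline is that fusing an image with itself returns the image unchanged, which holds here because the stand-in split is perfectly invertible.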
4. Results and Discussion
To evaluate the fused images, a set of evaluation indices is used. The indices measure the amount of information present in the fused image, its average intensity, its contrast, and the edge information present in it. The corresponding indices are Entropy, Mean, Standard deviation and the edge-based similarity measure (QAB/F).
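Three of the four indices can be computed directly from the pixel values, as the following sketch shows. It assumes 8-bit grayscale images, Shannon entropy in bits over the 256-level histogram, and the population standard deviation; the paper does not state which normalization was used. The edge-based measure QAB/F needs gradient information from both source images and is not sketched here. The function names are hypothetical.

```python
import numpy as np

def entropy(img, levels=256):
    """Shannon entropy (bits) of the gray-level histogram of an unsigned integer image."""
    hist = np.bincount(img.ravel(), minlength=levels).astype(float)
    p = hist / hist.sum()
    p = p[p > 0]                      # empty bins contribute nothing
    return float(-(p * np.log2(p)).sum())

def basic_indices(img):
    """Entropy (information), mean (average intensity) and standard deviation (contrast)."""
    return {
        "entropy": entropy(img),
        "mean": float(img.mean()),
        "std": float(img.std()),
    }

# Toy image (hypothetical): half the pixels are 0 and half are 200.
toy = np.zeros((4, 4), dtype=np.uint8)
toy[:, 2:] = 200
indices = basic_indices(toy)
```

For this two-level toy image the entropy is 1 bit, the mean is 100 and the standard deviation is 100.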
The proposed multimodal medical image fusion method has been implemented in MATLAB R2010a and tested on different multimodality medical images; CT, MRI and PET images are used as experimental images. The performance of the method is compared with the results of the pixel averaging method, the conventional DWT method with the maximum selection rule, and two existing NSCT-based methods. As in most of the literature, the input images are assumed to be perfectly registered. To implement the NSCT, maximally flat filters and diamond maxflat filters are used as the pyramidal and directional filter banks, respectively. The decomposition level of the NSCT is [1 2 3 4]. A window size of 3 × 3 is used for calculating the mean and variance, as it has been reported to be more effective.
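The decomposition setting [1 2 3 4] determines how many subbands have to be fused. The short calculation below assumes that each entry is the number of directional decomposition levels k at one pyramid scale, so that a scale with k levels contributes 2^k directional subbands; this reading is an assumption, not something stated in the text.

```python
# Directional decomposition levels per pyramid scale (assumed meaning of [1 2 3 4]).
nlevs = [1, 2, 3, 4]

# A k-level NSDFB yields 2**k directional subbands at that scale.
directional = [2 ** k for k in nlevs]

# One lowpass subband plus all directional highpass subbands.
total_subbands = 1 + sum(directional)
```

Under this reading there are 2, 4, 8 and 16 directional subbands at the four scales, 31 subbands in total. Because every NSCT subband has the same size as the source image, the coefficient storage is 31 times the image size.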
To evaluate the performance of the proposed image fusion method, six pairs of medical source images are considered, as shown in Figure 4. The image pairs in Figure 4 (a1-a2, b1-b2, c1-c2, d1-d2, e1-e2 and f1-f2) comprise CT, MRI, MRA, T1-weighted MR (MR-T1), gadolinium-enhanced MR (MR-GAD) and PET images. The corresponding pixels of the two input images in each pair are perfectly co-aligned. The proposed fusion method is applied to these image sets, and the same experimental images are used for all the existing methods for comparison.
The experimental results of the five fusion methods are shown in Figure 5. The fused images obtained from all five methods contain information about both bones and tissues, which cannot be seen together in the separate CT, MRI or PET images. The images in Figure 5 (a7-f7) show that the result of the proposed method is better than those of the other methods. The pixel averaging method gives the lowest values for all the indices because the information about bones and tissues is blurred. The results of the NSCT-based methods are good compared with those of the pixel averaging and DWT-based methods.
The fused images obtained by the proposed method are more informative and have higher contrast than the input medical images, which is helpful for visualization and interpretation. For example, the fused image for pair d contains both soft tissue information (from Figure 4(d1)) and bone information (from Figure 4(d2)). Similarly, the other fused images contain information from both of the corresponding input images. The fused images obtained by the existing NSCT-based methods are visually similar to those obtained by the proposed method, but the quantitative analysis shows that the proposed method gives higher index values than the NSCT-based methods. The pixel averaging and DWT methods suffer from contrast reduction: it is clear from Figure 5(a3-f3, a4-f4) that they lose a large amount of image detail. By comparison, the fused images in Figure 5(a7-f7) show that the proposed method gives high clarity, high information content and little contrast reduction. Hence, the subjective analysis indicates that the proposed method is more effective in fusing multimodality medical images than the compared medical image fusion techniques. An expert radiologist (Dr. P. Parimala, Ananya Scans, Karur-1, Tamil Nadu, India) was asked to evaluate subjectively the fused images obtained by the proposed method and by the compared methods. In the radiologist's opinion, the fused results of the compared methods suffer more from contrast reduction than those of the proposed method (Figure 5(a7-f7)), and the fused images obtained by the proposed method are clearer, higher in contrast and more informative than the source medical images.
Table 1 and Figure 6 show the Entropy, Mean, Standard deviation and QAB/F values of the fused images obtained by the conventional and proposed image fusion techniques. The highest value of each quantitative measure is shown in bold in Table 1. The higher values of QAB/F indicate that the fused images obtained by the proposed method have more edge strength than those of the other methods.
Figure 4. Source images: (a1) = T1-weighted MR, (a2) = MRA, (b1) = MR-GAD (gadolinium-enhanced MR), (b2) = CT, (c1) = MRI, (c2) = PET, (d1) = MRI, (d2) = CT, (e1) = CT, (e2) = PET, (f1) = CT, (f2) = MRI.
Figure 5. Fused Images: (a3), (b3), (c3), (d3), (e3), (f3)―Pixel Averaging; (a4), (b4), (c4), (d4), (e4), (f4)―DWT; (a5), (b5), (c5), (d5), (e5), (f5)―NSCT  ; (a6), (b6), (c6), (d6), (e6), (f6)―NSCT  ; (a7), (b7), (c7), (d7), (e7), (f7)―Proposed method.
Table 1. Evaluation indices for fused medical images.
Figure 6. Objective performance (entropy, mean, QAB/F, standard deviations) comparisons.
Similarly, the higher values of Entropy show that the fused images obtained by the proposed method have more information content than the other fused images.
From Table 1, the standard deviation values of the fused images obtained by the proposed method are higher, which indicates that they have higher contrast than the fused images obtained by the other image fusion techniques. Hence, Table 1 shows that the fused images obtained by the proposed method are more informative and higher in contrast, which is helpful for visualization and interpretation. Most of the existing state-of-the-art image fusion techniques suffer from contrast reduction, blocking effects and loss of image detail.
In the proposed method, the multi-scale, multi-directional and shift-invariance properties of the NSCT are combined with different fusion rules for the low- and high-frequency bands, so that the fine details present in the input medical images are captured in the fused image without reducing its contrast. Hence, the results and comparisons indicate that the fused images obtained by the proposed method are more informative and higher in contrast, which is helpful to physicians in diagnosis and treatment.
5. Conclusion
In this paper, a multimodal medical image fusion method based on the Non-Subsampled Contourlet Transform (NSCT) is proposed. It consists of three steps. In the first step, the medical images to be fused are decomposed into low- and high-frequency components by the NSCT. In the second step, two different fusion rules are used for the low-frequency and high-frequency bands, which preserves more information in the fused image and improves its quality: the low-frequency bands are fused using the local mean fusion rule, whereas the high-frequency bands are fused using the local variance fusion rule. In the last step, the fused image is reconstructed from the composite coefficients by the Inverse Non-Subsampled Contourlet Transform (INSCT). For the six pairs of medical images, the improvement of the proposed method over the conventional methods is 0% - 40% in entropy, 3% - 42% in mean, 1% - 42% in standard deviation and 0.4% - 48% in QAB/F. The visual and evaluation-index comparisons show that the proposed method preserves the details of the fused images better than the conventional fusion methods.
The authors thank Dr. P. Parimala M.B.B.S, DMRD, (Ananya Scans, Karur-1, Tamil Nadu, India) for the subjective evaluation of the fused images.