High-density objects such as hip prostheses, dental fillings and surgical clips can cause metal artifacts in computed tomography (CT) images. Streaking artifacts are caused by a combination of beam hardening, scatter, noise, photon starvation and the exponential edge-gradient effect. Dark streaks between metals are due to beam hardening and scatter, whereas sharp, thin alternating streaks can be due to motion and undersampling. Beam hardening occurs because of the polychromatic nature of the x-ray beam in a CT system. Lower-energy photons are absorbed more readily than higher-energy photons. As a result, the mean energy of the transmitted beam increases (the beam becomes harder), and cupping or dark band artifacts emerge between dense materials in the images. Photon starvation artifacts arise when an object with a high atomic number strongly attenuates the x-ray beam, leading to a decreased number of photons reaching the detector.
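The beam hardening mechanism described above can be illustrated with a toy model. This is a minimal sketch, assuming a beam made of only two photon energies with made-up attenuation coefficients (the `SPECTRUM` values are illustrative, not measured data). It shows that the attenuation coefficient inferred under a monoenergetic assumption falls as the path length grows, which is the origin of cupping and dark bands.

```python
import math

# Toy two-energy beam: (relative weight, attenuation coefficient in 1/cm).
# Both coefficients are illustrative assumptions, not measured values.
SPECTRUM = [(0.5, 0.40), (0.5, 0.20)]


def transmitted_fraction(thickness_cm):
    """Fraction of the incident beam transmitted through a uniform slab."""
    return sum(w * math.exp(-mu * thickness_cm) for w, mu in SPECTRUM)


def effective_mu(thickness_cm):
    """Attenuation coefficient a monoenergetic model would infer (1/cm)."""
    return -math.log(transmitted_fraction(thickness_cm)) / thickness_cm


for thickness in (1.0, 5.0, 20.0):
    print(f"{thickness:5.1f} cm -> effective mu = {effective_mu(thickness):.3f} /cm")
```

The low-energy component is removed first, so the longer the path through the material, the closer the inferred coefficient moves toward that of the high-energy component.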
Numerous metal artifact reduction (MAR) methods have been proposed since the 1980s and have been shown to be effective in improving image quality. Techniques include, but are not limited to, interpolation-based sinogram correction, non-interpolation-based sinogram correction, hybrid sinogram correction, iterative image reconstruction and image-based approaches. Each method has its advantages and disadvantages, but a hybrid approach appears to achieve the best results. In addition, dual-energy CT has been shown to be effective in reducing metal artifacts. Several commercial products have become available in recent years owing to increasing computing power.
For external beam radiation therapy, metal artifacts can compromise a patient's treatment in two different ways. First, the streaking artifacts can obscure anatomical details and make delineation of the target and organs at risk (OARs) challenging. Second, the artifacts change the CT Hounsfield units (HU) and affect the accuracy of dose calculation in a treatment plan. Some studies indicate that the dosimetric difference between MAR corrected and uncorrected images is not clinically significant. In contrast, findings from Spadea et al. suggest the dose error can vary between 10% and 25%. The report from Task Group 63 of the American Association of Physicists in Medicine Radiation Therapy Committee provides recommendations on dosimetric considerations when dealing with high density prosthetic devices. However, there are no recommendations on using MAR corrected datasets for dose calculation.
We had the opportunity to assess GE’s smart MAR algorithm on our CT scanner. It uses an automated, three-stage, projection-based process to improve the image quality. The focus of this study is twofold. First, we examined the ability of the MAR software to restore the CT number in the vicinity of metals without compromising the overall image quality. We evaluated image quality parameters such as geometric accuracy, low contrast resolution, uniformity and modulation transfer function (MTF) on phantoms. Second, we assessed the dosimetric impact of calculating on the MAR dataset versus the non-MAR dataset for both pelvic cancer patients with hip prostheses and head and neck (H/N) cancer patients with dental fillings.
2. Materials and Methods
2.1. Evaluation of Image Quality on Phantoms
Our phantom study was conducted with the Catphan® 504 phantom to evaluate the impact of the MAR algorithm on CT number sensitometry, geometric accuracy, MTF, low contrast resolution and uniformity. A helical scan was acquired on a GE Optima 580 RT-16 CT scanner (GE Healthcare, Milwaukee, WI) with the following parameters: 120 kV, auto mA, 1 s rotation time, 16 × 0.625 mm2 collimation, 2.5 mm slice thickness, 0.938 pitch and 25 cm sFOV. A second CT dataset was reconstructed with the MAR algorithm. Both MAR and non-MAR scans were analyzed with Image Owl QA software (Image Owl Inc., Greenwich, NY).
An in-house manufactured 20 cm diameter cylindrical water phantom was used to assess the accuracy of the CT number. The phantom contains three holes; a 19.0 mm diameter cylindrical stainless steel insert can be positioned in any one of them while the other holes are filled with cylindrical acrylic inserts. The holes are located in the center of the phantom, in the periphery of the phantom and in between these two locations. These positions were chosen to evaluate how the location of the metal affects the CT number accuracy. Three scans were acquired, one for each location of the metal insert. The scanning parameters were: helical scan, 120 kV, auto mA, 1 s rotation time, 16 × 0.625 mm2 collimation, 2.5 mm slice thickness, 0.938 pitch and 25 cm sFOV, reconstructed with and without MAR correction. An additional scan of the water phantom was acquired as the baseline image with the same scanning parameters but with all three holes filled with acrylic inserts. The accuracy of the CT number at six positions was evaluated with a square ROI in Eclipse TPS™ (version 11.0.31, Varian Medical Systems, Palo Alto, CA). We compared the CT number from the baseline image without stainless steel to that of the MAR corrected scans with the stainless steel insert. The dimension of the stainless steel insert was also measured on the CT image by identifying the metal pixels using a threshold HU value (half the maximum metal HU value). The average of the measurements taken in the lateral and vertical directions of the stainless steel rod on the central axis was compared to the physical dimension measured with an electronic caliper.
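The half-maximum threshold measurement of the insert can be sketched as follows. This is an illustrative outline only, not the procedure used in Eclipse: it assumes HU profiles have already been extracted through the center of the insert in the lateral and vertical directions, counts whole pixels without sub-pixel interpolation, and uses a hypothetical 0.5 mm pixel size and synthetic profile values.

```python
def metal_width_mm(profile_hu, pixel_mm):
    """Width of the profile at or above half of its maximum HU value."""
    threshold = 0.5 * max(profile_hu)
    metal_pixels = sum(1 for hu in profile_hu if hu >= threshold)
    return metal_pixels * pixel_mm


def insert_diameter_mm(lateral_profile, vertical_profile, pixel_mm):
    """Average of the lateral and vertical half-maximum widths."""
    lateral = metal_width_mm(lateral_profile, pixel_mm)
    vertical = metal_width_mm(vertical_profile, pixel_mm)
    return 0.5 * (lateral + vertical)


# Synthetic profiles (HU) and a hypothetical pixel size, for illustration only.
PIXEL_MM = 0.5
lateral_profile = [0] * 5 + [1500] + [3000] * 38 + [1500] + [0] * 5
vertical_profile = [0] * 5 + [3000] * 39 + [0] * 5

measured = insert_diameter_mm(lateral_profile, vertical_profile, PIXEL_MM)
print(f"measured diameter = {measured:.2f} mm")
```

The measured value would then be compared with the caliper measurement of the physical insert.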
2.2. Evaluation of Clinical Plans
A total of fifteen H/N cancer patients with dental fillings and ten pelvic cancer patients with hip prostheses who previously received radiation therapy at our clinic were selected for the study after obtaining ethics approval. The study population for H/N cases consisted of 11 males and 4 females with a mean age of 63.9 ± 15.4 years (range 34 - 85 years). For pelvic cases, there were 7 males and 3 females with a mean age of 73 ± 5.0 years (range 65 - 81 years). These patients underwent CT scanning with the following scanning parameters: helical scan, 120 kV, auto mA, 1 s rotation time, 16 × 0.625 mm2 collimation, 2.5 mm slice thickness, 0.938 pitch and 50 cm sFOV. Two CT datasets were reconstructed from the scan, a MAR dataset and a non-MAR dataset. Both datasets were exported to Eclipse TPS™, and delineation of the target and organs at risk (OARs) was performed on the MAR dataset by the radiation oncologist and radiation therapist. Clinical plans were optimized and calculated with AAA (version 11.0.31) on the MAR dataset until the PTV and OARs met our institution’s clinical dose constraints. A dose calculation grid of 2.5 mm was used with heterogeneity correction. For H/N cancer patients, 6 MV IMRT was the default planning technique, with prescriptions ranging from 45 Gy in 25 fractions to 70 Gy in 35 fractions. For patients with hip prostheses, either 6 MV IMRT or VMAT was used, with prescriptions ranging from 45 Gy in 25 fractions to 74 Gy in 34 fractions. Table 1 and Table 2 list the patients with their prescription doses and treatment techniques. As recommended by Task Group 63, we avoided treatment fields entering through the hip prosthesis. VMAT plans consisted of 2 or 2.5 arcs and were optimized with either an avoidance sector or a constraint on the prosthesis. For the H/N cases, no special attention was paid to avoiding treatment beams entering through the dental fillings because these regions were small.
Table 1. Pelvic patients with hip prosthesis.
Table 2. Head and neck patients with dental fillings.
After the treatment plan was approved by the radiation oncologist, contours from the MAR dataset were copied onto the non-MAR dataset. Next, a separate dose calculation was performed on the non-MAR dataset with the same treatment field arrangement and fluence as the clinical plan. Dose differences between the two CT datasets were evaluated for PTV and OARs. Some patients had multiple PTVs but only results from the high dose PTV will be presented here. In this study, none of the metal artifacts were contoured with density over-rides.
To quantify the percentage and absolute differences between MAR and non-MAR plans, the following conventions were utilized:
For target volume evaluation, the conformity index was used. This is the ratio of the prescription isodose volume to the target volume. Endpoints for the PTV include D99% (dose to 99% of the target volume) and V100% (volume receiving the prescription dose). For H/N OARs, we compared the mean dose to the parotids and the maximum dose to the spinal cord and brainstem. For pelvic plans, we assessed the DVH of the bladder, rectum, femoral heads, iliac crests and genitalia.
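The PTV endpoints defined above can be sketched in a few lines. This is a minimal illustration, assuming the PTV dose is available as a flat list of equal-volume voxel doses; the function names and the synthetic dose values are hypothetical. The sign of the difference follows the (MAR − no MAR) convention in the figure captions, while normalizing the percentage difference to the MAR value is an assumption, since the normalization is not stated here.

```python
import math


def d_percent(doses, percent):
    """Minimum dose received by the hottest `percent` of voxels (e.g. D99%)."""
    ranked = sorted(doses, reverse=True)
    k = max(1, math.ceil(len(ranked) * percent / 100.0))
    return ranked[k - 1]


def v_dose(doses, dose_level):
    """Percentage of voxels receiving at least `dose_level` (e.g. V100%)."""
    return 100.0 * sum(d >= dose_level for d in doses) / len(doses)


def conformity_index(isodose_volume_cc, target_volume_cc):
    """Prescription isodose volume divided by the target volume."""
    return isodose_volume_cc / target_volume_cc


def absolute_difference(mar, non_mar):
    """MAR minus non-MAR, matching the figure caption convention."""
    return mar - non_mar


def percent_difference(mar, non_mar):
    """Assumption: the difference is normalized to the MAR value."""
    return 100.0 * (mar - non_mar) / mar


# Synthetic PTV voxel doses in Gy for a 70 Gy prescription (illustration only).
ptv_mar = [70.0] * 95 + [68.0] * 4 + [66.0]
ptv_non_mar = [70.0] * 90 + [68.0] * 9 + [66.0]

v100_mar = v_dose(ptv_mar, 70.0)
v100_non_mar = v_dose(ptv_non_mar, 70.0)
print("D99% (MAR):", d_percent(ptv_mar, 99), "Gy")
print("V100% difference:", absolute_difference(v100_mar, v100_non_mar), "%")
```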
3. Results
3.1. Evaluation of Image Quality on Phantoms
Comparisons between scans with and without MAR algorithm on the Catphan phantom demonstrate similar results for image quality. Geometric accuracy, MTF, CT number for various materials and low contrast resolution were very similar, if not identical. There was a small difference for noise level. Table 3 summarizes the findings.
Evaluation of the CT number at six locations of the in-house phantom was conducted on the central axis slice. Figure 1 shows the stainless steel plug positioned at three different locations with and without the MAR algorithm. Visually, there are significant reductions of metal artifacts on images with MAR correction. The HU difference between the baseline scan and the metal scan is shown in Figure 2. ROI positions 1, 2 and 3 are the locations of the stainless steel insert, whereas positions 4, 5 and 6 are in the water phantom, as seen in the inset of Figure 2. If the MAR algorithm restored the CT number perfectly, we would expect a zero HU difference between the baseline scan and the metal scan with the MAR algorithm. However, we still observe a small HU difference when the MAR algorithm is applied. When the MAR algorithm is not applied, the HU difference between the baseline scan and the metal scan is larger. This shows that the MAR algorithm is capable of restoring the CT number in the presence of metals. The same data analysis was performed on the baseline image without the stainless steel insert, reconstructed with and without MAR correction. Results show there was negligible HU difference between the two datasets. This demonstrates that the MAR algorithm does not alter the CT number when there is no high density material. In Figure 2, ROI position 6 displays larger HU differences when the stainless steel insert is at position 2 or 3 and the MAR algorithm is not applied. This is due to the proximity of the streaking artifacts to ROI position 6, as seen in Figure 1(a). It is worth noting that in Figure 2, at ROI position 5, the HU difference was smaller without the MAR algorithm when the metal insert was at ROI position 3. This is contrary to what we observe for other ROIs. Figure 1(b) shows that the MAR corrected image has a darker band posterior to the metal insert at position 3. The alternating dark and bright streaks in the uncorrected image lead to a higher standard deviation (SD), but the mean HU averages out to be closer to 0 HU at ROI position 5. ROI position 5 has an average CT number of −8.3 ± 12.8 HU and −3.6 ± 21.5 HU for MAR corrected and uncorrected images respectively. For all ROI positions, we analyzed the difference in SD between MAR and non-MAR datasets. On average, the SD decreases by 9.1 ± 6.6 HU when the MAR algorithm is applied. The largest SD occurs when the stainless steel insert is in the center of the phantom. SD can be useful to quantify the severity of the metal artifact.
Table 3. Results of image quality tests for MAR and non-MAR scans of Catphan.
Figure 1. Stainless steel insert at three different locations of the water phantom with the other two holes filled with acrylic inserts. Original images without MAR algorithm are on the top panel (a) and images on the bottom panel (b) are with MAR correction. Viewing window = 400 HU and Level = 40 HU.
Figure 2. HU difference between baseline image and stainless steel image at six different locations of the phantom.
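The behaviour at ROI position 5, where alternating streaks give a mean close to 0 HU but a large SD, can be reproduced with synthetic numbers. This is a minimal sketch: the ROI values below are invented for illustration, the sign of the HU difference (metal scan minus baseline) is an assumption, and the population SD is used because the SD definition applied by the planning system is not stated.

```python
from statistics import mean, pstdev


def roi_stats(hu_values):
    """Mean and (population) standard deviation of the HU values in an ROI."""
    return mean(hu_values), pstdev(hu_values)


def hu_difference(baseline_roi, metal_roi):
    """Assumed sign convention: metal scan mean minus baseline mean."""
    return mean(metal_roi) - mean(baseline_roi)


# Synthetic ROI samples in HU (illustration only, not measured data).
baseline_roi = [0.5, -0.5, 1.0, -1.0, 0.0, 0.0]
streaky_roi = [-30.0, 28.0, -26.0, 24.0, -22.0, 20.0]  # alternating dark/bright streaks
band_roi = [-9.0, -8.0, -8.5, -7.5, -9.5, -8.0]        # uniform dark band

streaky_mean, streaky_sd = roi_stats(streaky_roi)
band_mean, band_sd = roi_stats(band_roi)
print(f"streaks:   {streaky_mean:6.1f} ± {streaky_sd:4.1f} HU")
print(f"dark band: {band_mean:6.1f} ± {band_sd:4.1f} HU")
```

The mean alone would rank the streaky ROI as the more accurate one, which is why the SD is a useful complementary measure of artifact severity.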
The physical diameter of the stainless steel insert was compared with the measurement from the CT image, which over-estimated the diameter of the insert by 0.9 mm. The MAR algorithm appears to correctly reconstruct the dimension of the stainless steel insert.
3.2. Dosimetric Evaluation of Clinical Plans
Similar to our phantom study, we see a significant reduction of metal artifacts in our clinical CT datasets when the MAR algorithm is applied. However, residual artifacts are still present. Figure 3 shows axial slices of two prostate patients with and without the MAR algorithm. One patient had a single hip prosthesis while the other had a bilateral hip replacement.
Figure 3. Pelvic patient images reconstructed with MAR (a) and (c) and without MAR (b) and (d). Images (a) and (b) show a bilateral hip replacement whereas images (c) and (d) show a single hip prosthesis. Viewing Window = 400 HU and Level = 40 HU.
For all fifteen H/N patients, the average percentage differences in conformity index, D99% and V100% are −0.3% ± 0.9%, −0.1% ± 0.1% and −0.1% ± 0.5% respectively. For all ten pelvic patients, the average percentage discrepancies in conformity index, D99% and V100% are −8.8% ± 11.4%, −0.1% ± 0.4% and −8.8% ± 12.1% respectively. Figure 4 and Figure 5 demonstrate PTV percentage differences for all H/N and pelvic plans respectively. In both figures, the majority of the percentage differences are negative numbers. However, a few of them have a positive percentage difference. This is due to the type of artifacts. Dark streaks have a lower HU and are less attenuating whereas bright streaks have a higher HU and are more attenuating.
Patient #13 in Figure 4 shows large differences in conformity index and V100% compared to the other patients. The metal artifacts from this patient are more severe and are in close proximity to the PTV. Figure 6 demonstrates an axial slice of this patient with the streaking artifacts from the dental fillings.
Patient #4 in Figure 5 shows substantial discrepancies in conformity index and V100%. This is due to two factors. First, the dark streaking artifacts on the non-MAR dataset have a low HU and are less attenuating than tissue. Second, for pelvic plans we aim to have 99% of the PTV receiving 95% of the prescription dose. Thus, the DVH curve has a steep slope at the prescription dose, leading to greater differences between MAR and non-MAR plans. This phenomenon is not observed in H/N plans since there are fewer metal artifacts from dental fillings and the planning goal is to have 95% of the PTV receiving 100% of the prescription dose. Overall, the PTV DVHs for both H/N and pelvic cases were very similar for MAR and non-MAR plans. For pelvic cases, patients #3, 5, 9 and 10 have bilateral hip replacements, but there is no significant difference between unilateral and bilateral hip replacements.
Figure 4. Comparison of PTV conformity index (CI), D99% and V100% for H/N plans calculated with and without the MAR algorithm (MAR − no MAR).
Figure 5. Comparison of PTV conformity index (CI), D99% and V100% for pelvic plans calculated with and without the MAR algorithm (MAR − no MAR).
Figure 6. Patient #13 from Figure 4 was reconstructed with MAR (a) and without MAR (b). The dark streaking metal artifacts from dental fillings are seen on the non-MAR dataset. Red contour is the PTV. The same W/L is used for both images.
Figure 7. Comparison of absolute dose difference (cGy) in OARs for H/N plans calculated with and without the MAR algorithm (MAR − no MAR).
For H/N OARs, we compared the mean dose to the parotids and the maximum dose to the spinal cord and brainstem, as shown in Figure 7. The maximum absolute dose difference between MAR and non-MAR plans for all OARs was 33.2 cGy, with an average dose difference of 1.4 ± 9.1 cGy. The parotids display higher dose differences than the spinal cord and brainstem since they are closer to the dental fillings. This is observed in patients #2 and #12. OARs for pelvic plans include the bladder, rectum, femoral heads, iliac crests and genitalia, as shown in Figure 8. The maximum absolute volume difference between MAR and non-MAR plans for all OARs was −9.0 cc, with an average volume difference of −0.51 ± 1.5 cc. Some data points in Figure 8 are missing because the prescription dose is lower than the dose level of the DVH constraint. For example, if the prescription is 45 Gy in 25 fractions for a pelvic plan, V70Gy for the rectum will be zero.
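The OAR endpoints above (mean dose, maximum dose and absolute volume at a dose level) can be sketched as follows. This is an illustrative outline with synthetic voxel doses; it assumes equal-volume voxels on an isotropic 2.5 mm dose grid, and the function names are hypothetical. It also shows why V70Gy is zero for a plan whose doses never reach 70 Gy.

```python
# Voxel volume for a 2.5 mm grid, assuming the grid is isotropic (0.25 cm per side).
VOXEL_CC = 0.25 ** 3


def mean_dose(doses):
    """Mean dose over the voxels of a structure."""
    return sum(doses) / len(doses)


def max_dose(doses):
    """Maximum voxel dose in a structure."""
    return max(doses)


def volume_at_or_above_cc(doses, dose_level, voxel_cc=VOXEL_CC):
    """Absolute volume (cc) receiving at least `dose_level`."""
    return voxel_cc * sum(d >= dose_level for d in doses)


# Synthetic rectum voxel doses in Gy for a 45 Gy prescription (illustration only).
rectum_mar = [46.0, 44.0, 30.0, 12.0]
rectum_non_mar = [46.5, 45.2, 29.0, 12.0]

v70_mar = volume_at_or_above_cc(rectum_mar, 70.0)
v45_difference = (volume_at_or_above_cc(rectum_mar, 45.0)
                  - volume_at_or_above_cc(rectum_non_mar, 45.0))
print("V70Gy (MAR):", v70_mar, "cc")
print("V45Gy difference (MAR - no MAR):", v45_difference, "cc")
```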
For the pelvic cases, we performed a plan subtraction in Eclipse between plans calculated on the MAR and non-MAR datasets. An example is shown in Figure 9 for a patient with a bilateral hip replacement. Isodose levels corresponding to ±2% of the prescription are shown in orange and magenta. Absolute dose differences larger than 2% are located near the boundaries of the hip prostheses and the skin surface. A review of all ten of our pelvic cases indicates that the dosimetric changes between calculations performed on the MAR and non-MAR datasets are not significant. This finding is similar to the study from Li et al., which concluded that Philips’ O-MAR software improves CT number accuracy and structure visualization but does not provide a dosimetric benefit.
Figure 8. Comparison of absolute volume difference (cc) in OARs for pelvic plans calculated with and without the MAR algorithm (MAR − no MAR).
Figure 9. Plan subtraction of MAR and non-MAR calculated plans for a bilateral hip implant case. Dose differences larger than ± 2% of the prescription are shown in orange and magenta. Viewing Window = 400 HU and Level = 40 HU.
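The plan subtraction described above amounts to a voxel-by-voxel dose difference followed by a threshold at ±2% of the prescription. A minimal sketch, assuming both dose grids are flattened to lists with identical geometry; it is not the Eclipse implementation, and the dose values and function names are invented for illustration.

```python
def dose_difference(dose_mar, dose_non_mar):
    """Voxel-wise difference, MAR minus non-MAR."""
    return [m - n for m, n in zip(dose_mar, dose_non_mar)]


def flag_differences(diff, prescription, tolerance=0.02):
    """Label each voxel: +1 above +tolerance, -1 below -tolerance, else 0."""
    limit = tolerance * prescription
    return [1 if d > limit else -1 if d < -limit else 0 for d in diff]


# Synthetic doses in Gy for a 74 Gy prescription (illustration only).
PRESCRIPTION_GY = 74.0
dose_mar = [74.0, 73.0, 40.0, 10.0]
dose_non_mar = [74.5, 71.0, 42.0, 10.1]

diff = dose_difference(dose_mar, dose_non_mar)
flags = flag_differences(diff, PRESCRIPTION_GY)
print(flags)
```

Voxels flagged +1 or −1 correspond to the two colored isodose regions in a subtraction display such as Figure 9.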
Considerable effort has been made in recent years to develop commercial algorithms that reduce metal artifacts and noise in CT images. In this paper, we provided an experimental and clinical evaluation of one commercially available MAR algorithm for CT simulation in radiation therapy. We found GE’s smart MAR algorithm to be effective in reducing artifacts for H/N patients with dental fillings and pelvic patients with hip prostheses. The reduction of streaking artifacts allows radiation oncologists to delineate targets and organs at risk more accurately. This avoids the need to increase the target margin, which may lead to more normal tissue toxicity. Furthermore, the accuracy of the CT number is improved when the MAR algorithm is applied. GE’s software was able to correctly characterize the dimension of the stainless steel insert in our phantom study. Although the algorithm provides an improved image dataset, there are still some residual artifacts in the corrected images. Han et al. evaluated dual-energy reconstructions of a GE CT scanner with and without MAR software for patients with hip prostheses. They concluded that the overall image quality in the pelvic cavity was improved with the MAR algorithm, but new artifacts were also observed when using it. Similar observations were made by other groups. A study performed by Joeng et al. suggests that Philips’ O-MAR algorithm increases the detectability of the skin boundary near the hip prosthesis, resulting in improved skin contouring, which can aid dose calculation accuracy.
The degree of dose discrepancy between treatment plans calculated on a MAR dataset and a non-MAR dataset depends on several factors. Our study shows that the dosimetric impact from hip prostheses is greater than that from dental fillings because hip prostheses produce more artifacts. The proximity of the organ to the high density material is important as well: a larger dose difference is observed when the organ of interest is closer to the high density material. The beam arrangement can also play a role, as more uncertainties are introduced when a field passes through a high density material. Dose differences between the plans can be positive or negative depending on the type of metal artifact. Dark streaks have lower HU and can introduce hot spots, whereas bright streaks have higher HU and introduce cold spots. Our findings indicate there is minimal dosimetric difference between treatment plans calculated on the MAR and non-MAR datasets. This is supported by the studies from Li et al. and Shen et al. with Philips’ O-MAR algorithm. Investigations from Spadea et al. suggest the impact of MAR on dosimetry depends on the atomic number of the metal. Low Z materials, like titanium (Z = 22), do not produce significant dose errors, whereas high Z materials, such as platinum (Z = 78), can substantially affect the dose calculation. High Z material can cause under-dosage of 20% - 25% in the region surrounding the metal and over-dosage of 10% - 15% downstream of the object. Huang et al. found that the success of MAR may depend on the type of metal and the size of the implant. In addition, they found that the largest dosimetric impact comes from the accuracy of the metal size rather than from successful artifact reduction.
In our study, we chose to compare plans calculated on the MAR dataset versus the non-MAR dataset. We did not compare the MAR plan to a non-MAR plan without heterogeneity correction because the variation between these two plans would include differences from the heterogeneity correction. Since the focus of our investigation is on the metal artifacts, we did not want to include the dosimetric effects due to heterogeneity. One weakness of our study is that we do not know the composition of the hip prostheses and dental fillings. Thus, we are unable to correlate the dosimetric impact with the type of metal.
One limitation of GE’s smart MAR algorithm is that the sFOV must be smaller than or equal to 50 cm. At our clinic, when the patient’s anatomy extends outside of the 50 cm sFOV, target and OAR contouring is performed on the MAR dataset. These contours are copied onto the non-MAR dataset for dose calculation. In addition, metal artifacts on the non-MAR scan need to be contoured and assigned 0 HU.
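Conceptually, the density override in this workflow replaces the HU of every pixel inside the artifact contour with 0 HU before dose calculation. The following sketch only illustrates that idea on a small synthetic slice; in practice the override is applied through the treatment planning system, and the `override_density` helper and mask representation are hypothetical.

```python
def override_density(hu_image, artifact_mask, override_hu=0):
    """Return a copy of the slice with masked pixels set to `override_hu`."""
    return [
        [override_hu if inside else hu for hu, inside in zip(hu_row, mask_row)]
        for hu_row, mask_row in zip(hu_image, artifact_mask)
    ]


# Synthetic 3 x 4 slice in HU with a dark streak and a bright streak (illustration only).
hu_slice = [
    [10, -250, 15, 20],
    [12, -300, 400, 18],
    [11, 14, 350, 16],
]
# True marks pixels inside the contoured artifact region.
artifact_mask = [
    [False, True, False, False],
    [False, True, True, False],
    [False, False, True, False],
]

corrected = override_density(hu_slice, artifact_mask)
print(corrected)
```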
This study indicates that GE’s smart MAR algorithm can improve CT number accuracy and correctly characterize the dimension of the metal insert without degrading the overall image quality. However, residual metal artifacts are still observed in the MAR corrected images. The dose differences between IMRT and VMAT plans calculated on the MAR and non-MAR datasets depend on the proximity of the organ to the high density material, the type of streaking artifact and the beam arrangement of the treatment plan. With our study population of 15 H/N patients with dental fillings and 10 pelvic patients with hip prostheses, we found the dosimetric difference between MAR and non-MAR datasets to be minimal for both the PTV and OARs. There are several advantages of planning on the MAR corrected images. First, there is a substantial reduction of metal artifacts, which can allow the radiation oncologist to contour targets and OARs more accurately. Second, treatment planning time can be reduced because there is no need to contour the artifacts and override the density. Last, the MAR corrected images provide better reference images for image guidance. Therefore, MAR corrected images are recommended for radiotherapy treatment planning.