Inaccurate attenuation coefficient maps derived from magnetic resonance (MR) imaging can compromise the quantitative accuracy of positron emission tomography (PET) in combined PET/MR systems. This has been recognized as one of the weakest points of current PET/MR technology. The difficulty is that the MR signal reflects the density of mobile protons in the material, weighted by the respective magnetic resonance relaxation properties (T1 and T2), rather than the material’s photon attenuation properties. Therefore, unlike computed tomography (CT) images, MR images cannot be directly translated into the 511 keV photon linear attenuation coefficients that are required for PET attenuation correction. Many different approaches to obtaining patient-specific attenuation maps from MR images have been proposed. The most common approach derives MR-based PET attenuation maps by classifying or segmenting the MR images into different tissue categories and then assigning a standard tissue-specific attenuation coefficient to each voxel according to its assigned class. The tissue-classification approach is used by all of the commercially available PET/MR models because of its simplicity and flexibility, and it is the focus of this paper.
On current commercial whole-body PET/MR systems, either a three-class (air, lung and soft tissue) or a four-class (air, lung, fat and non-fat soft tissue) method is employed to obtain PET attenuation maps. In these systems, bone is not treated as a separate tissue class in the attenuation maps but is incorporated into soft tissue, despite its substantially different photon attenuation properties. This is largely attributable to the difficulty of detecting bone in MR images. First, the number of protons in bone that are visible to the MR scanner is only about 20% - 25% of that in soft tissue. Second, and more importantly, the transverse relaxation time, T2, of bone is substantially shorter than that of soft tissue (0.3 - 0.5 ms versus 10 - 100 ms), leading to a rapidly decaying signal that cannot be captured by conventional MR sequences. A number of studies have demonstrated that simply treating bone as soft tissue can cause inaccuracy in PET quantification, especially in brain imaging (~10% - 25% underestimation) and in voxels adjacent to or inside bone in whole-body imaging (~10% underestimation). Inaccurate attenuation correction of bone may be especially problematic in pediatric studies, a potential key application of PET/MR imaging, given the higher bone-to-soft-tissue ratio. Many of these studies also suggested that identifying bone as a separate tissue class and assigning it an attenuation value higher than that of soft tissue can significantly reduce the quantification inaccuracy. To address these difficulties, special MR imaging methods such as ultrashort-echo-time (UTE) or zero-echo-time (ZTE) sequences have been proposed. These techniques use an extremely short echo time (TE; as low as 10 μs for the ZTE technique) to capture the bone signal before it decays, thereby obtaining bone information with MR. UTE/ZTE MR-based techniques have been applied to head imaging and can potentially provide bone information for attenuation correction in whole-body PET/MR systems.
However, MR bone imaging using UTE/ZTE techniques still faces many challenges. In addition to the short relaxation time and low proton density, a unique challenge of bone imaging stems from the small size of bone relative to the typical voxel sizes prescribed in clinical imaging, in both MR and CT. Unlike soft tissues, whose dimensions are often greater than the voxel size (typically 1 - 3 mm), bone (or more specifically, the mineralized, “hard tissue” component of bone) is usually close to or smaller than the voxel prescribed for the imaging session. For example, the cortical layer of the bones in the torso, such as the vertebrae and ribs, can be less than 1 mm thick, and even more sizable bones such as the pelvis have regions that are as thin as, or thinner than, the voxel size. As a result, when the signals from a human body are “voxelized” during a tomographic imaging study, most “soft tissue voxels” are homogeneous voxels that contain only soft tissue, whereas a substantial proportion of the “bone voxels” are, in fact, voxels that contain both bone and soft tissue. This partial voxel composition is the focus of this study.
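The partial voxel composition described above can be illustrated with a simple one-dimensional sketch. It assumes an idealized flat bone layer crossing a row of voxels; the function name, the layer position and the voxel count are hypothetical choices made only for illustration.

```python
def bone_fraction_per_voxel(layer_start_mm, layer_thickness_mm, voxel_size_mm, n_voxels):
    """Fraction of each 1D voxel that is covered by a flat bone layer."""
    layer_end_mm = layer_start_mm + layer_thickness_mm
    fractions = []
    for i in range(n_voxels):
        voxel_lo = i * voxel_size_mm
        voxel_hi = voxel_lo + voxel_size_mm
        # Length of the overlap between the voxel and the bone layer (never negative)
        overlap = max(0.0, min(voxel_hi, layer_end_mm) - max(voxel_lo, layer_start_mm))
        fractions.append(overlap / voxel_size_mm)
    return fractions


# Hypothetical case: a 1 mm cortical layer that straddles two 1.5 mm voxels
fractions = bone_fraction_per_voxel(1.0, 1.0, 1.5, 3)
print(fractions)  # neither of the two voxels that touch the layer is mostly bone
```

In this example, each of the two voxels touching the layer is only about one-third bone, even though the layer itself is solid bone.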
The issue of partial voxel composition related to bone deserves special scrutiny for MR because it poses a substantially greater challenge to bone imaging in MR than in CT. In clinical CT images, the superior contrast-to-noise ratio between bone and soft tissue (contrast: >1000 HU, noise: 15 - 20 HU) makes it easy to detect the presence of bone even in voxels where the volumetric fraction of bone is low (hereafter referred to as the bone volume fraction, or BVF). For example, consider a voxel that spans the interface between skeletal muscle (~50 HU) and the femur (~1400 HU) such that it is composed of 80% muscle and 20% cortical bone. The HU value of this voxel will be approximately 0.8 × 50 + 0.2 × 1400 = 320, which is far greater than the expected value of soft tissue (typically less than 100 HU). It is thus easy to recognize the presence of bone in this voxel, even though the voxel is predominantly composed of soft tissue (80%). In contrast, detecting the presence of bone in voxels of mixed composition is substantially more difficult with MR, even with the aid of UTE/ZTE techniques, and the bone identification ability of MR is far behind that of CT. Because of this discrepancy, although previous CT-simulated studies showed that binary tissue classification of bone is an effective approach to correcting bone-induced quantification inaccuracy, their applicability to MR-based PET attenuation correction is limited: the high bone detection capability (low HU threshold in bone segmentation) simulated in these studies may not yet be achievable with MR. Therefore, a more detailed investigation of this subject is warranted.
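The muscle/femur example can be reproduced with a few lines of Python. This is only a sketch of the volume-weighted averaging described in the text; the function name is our own.

```python
def mixed_voxel_hu(components):
    """Volume-weighted mean HU of a voxel.

    components: list of (volume_fraction, hu) pairs whose fractions sum to 1.
    """
    total_fraction = sum(fraction for fraction, _ in components)
    if abs(total_fraction - 1.0) > 1e-9:
        raise ValueError("volume fractions must sum to 1")
    return sum(fraction * hu for fraction, hu in components)


# 80% skeletal muscle (~50 HU) and 20% cortical bone (~1400 HU), as in the text
voxel_hu = mixed_voxel_hu([(0.8, 50.0), (0.2, 1400.0)])
print(voxel_hu)  # approximately 320 HU, well above typical soft tissue (< 100 HU)
```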
We hypothesized that accurate quantification may be achievable with less-than-perfect bone identification ability, and that the segmentation-based approach therefore remains feasible for PET/MR attenuation correction. In this study we investigated the relationship between the quantification accuracy of bone lesions in PET and the ability of an MR technique to detect the presence of bone in voxels where its fractional presence is low. We focused only on the quantification of bone lesions, as it has been demonstrated that in soft tissue lesions the quantification bias from ignoring bone is small. The goal of this study was to establish the requirement for accurate quantification of bone uptake in PET/MR and thereby to support the development of MR-based bone classification methods.
2.1. 18F-Sodium Fluoride PET/CT Data
18F-sodium fluoride (NaF) is a radiotracer for skeletal imaging that has been used to evaluate metastatic bone disease in oncology. Image data of seven patients (five male, two female; age 55.5 ± 16.5 yr; weight 93.9 ± 20.7 kg) who had undergone whole-body 18F-NaF PET/CT examinations at The University of Texas MD Anderson Cancer Center were retrospectively obtained for this study. The use of these patient data was approved by the Institutional Review Board of The University of Texas MD Anderson Cancer Center. All of the PET/CT exams had been performed on a Siemens Biograph mCT Flow PET/CT scanner. The injected NaF activity was 8.9 ± 0.6 mCi [322 ± 22 MBq]. After an uptake time of 46.6 ± 9.3 minutes, whole-body CT attenuation data and PET emission data were acquired from the vertex of the skull to the toes. No CT contrast material was administered to these patients. The CT data were acquired at 140 kVp, with a pitch factor of 1.4 and a collimation of 16 × 1.2 mm, and were reconstructed into 512 × 512 matrices with a 1.5 mm transverse pixel size. The PET data were acquired in 3D mode and reconstructed into 200 × 200 matrices with a 4.1 mm transverse pixel size. Both datasets had a 3 mm slice thickness and a 2 mm slice spacing. Following the clinical protocol at our institution, PET reconstructions were performed using the “UltraHD-PET” option, which includes both point spread function (PSF) and time-of-flight (TOF) corrections, with two iterations, 21 subsets and a 5-mm FWHM Gaussian filter. The same parameters were used for all PET reconstructions in this study.
We used the CT attenuation images that had been acquired during the NaF PET/CT scans to simulate MR-based attenuation images. To make the results of this CT-simulated study applicable to MR, the bone volume fraction (BVF), a physical quantity that is independent of imaging modality, was employed to characterize bone imaging ability. The BVF of each voxel was estimated from its HU value (see details in the Appendix). Using the estimated BVF values, MR-based attenuation maps with various levels of bone imaging ability were simulated from the CT images by classifying mixed bone voxels with a BVF above a certain threshold as bone voxels and those below the threshold as soft tissue voxels. The BVF threshold hence characterizes the bone identification ability of the hypothetical MR technique. We then determined the proper attenuation coefficient to be assigned to the voxels classified as bone at each BVF threshold, and we performed PET reconstructions with the simulated attenuation maps. It is worth noting that the rest of the CT images (soft tissue, lung and air voxels) were left unmodified in the attenuation images; this isolates the effect of bone voxel identification on PET quantification. Finally, the bone lesion uptakes in the PET data corrected with these simulated attenuation maps were compared with those in the PET data reconstructed with the original CT attenuation images to determine the PET quantification accuracy.
To summarize, our study consists of three steps:
Step 1: Estimate the BVF for each voxel using whole-body CT datasets from the attenuation scans of PET/CT studies at a resolution that is typically employed in the clinic. This step was necessary because, in this study, the ability to detect bone was characterized by the minimal BVF that could be identified as bone by an MR bone imaging technique.
Step 2: Create various attenuation maps that simulate MR-based attenuation correction with different bone detection ability by varying BVF thresholds. This was achieved by classifying only voxels above a certain BVF threshold as bone, while treating voxels below that threshold as simply soft tissue in the attenuation map.
Step 3: Perform attenuation correction using the simulated attenuation maps, and then compare the corresponding quantification accuracy of bone lesions to CT-corrected PET data. This step evaluated the quantification accuracy of MR-based attenuation correction by comparison to the gold standard, CT-based attenuation-corrected PET data.
2.2. Step 1: Estimation of Bone Volume Fraction
Prior to the calculation, all components outside of the patient anatomy, such as the CT table and any positioning pads, were digitally removed from the CT images.
The HU value in CT reflects the overall attenuation of a voxel. Ignoring noise, the linear attenuation coefficient of a voxel that contains multiple tissue types is the mean of the attenuation coefficients of the tissue types weighted by the respective volumetric fraction of each type:

$$\mathrm{HU}(x, y, z) = \sum_{n} c_n \, \mathrm{HU}_n(x, y, z), \quad \text{where } \sum_{n} c_n = 1 \tag{1}$$

In this equation, $x$, $y$, and $z$ are the spatial coordinates of the voxel, $c_n$ denotes the fraction of the volume of the voxel that consists of tissue type $n$, and $\mathrm{HU}_n(x, y, z)$ is the HU value of tissue type $n$ at the voxel. Although in theory a voxel could contain more than two different tissue types (e.g., lung, fat, soft tissue and bone), under realistic conditions voxels containing three or more tissue types are so rare that they can be ignored. In this study, the focus was on the voxels that are partially bone and partially soft tissue.
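Replacing the $c_n$ of bone with the bone volume fraction (BVF) in the dual-tissue-type scenario, Equation (1) becomes:

$$\mathrm{HU}(x, y, z) = \mathrm{BVF}(x, y, z) \, \mathrm{HU}_{\mathrm{bone}}(x, y, z) + \left[1 - \mathrm{BVF}(x, y, z)\right] \mathrm{HU}_{\mathrm{tissue}}(x, y, z) \tag{2}$$

This equation means that, given a measured HU(x, y, z), the volumetric fraction of bone (BVF) of the voxel located at (x, y, z) can be computed if the HU values for tissue and bone are known. To perform the analysis in our study, we assumed that, in contrast to the large HU difference (>1000) between soft tissue and bone, the difference within homogeneous tissue voxels is small enough to be neglected. Specifically, all homogeneous soft tissue voxels were considered to have the same HU value, HUtissue, and all homogeneous bone voxels within the same CT slice were considered to have the same HU value, HUbone(z), where z is the slice location (more details about the slice dependence of HUbone can be found in the Appendix).

Under these assumptions, Equation (2) becomes

$$\mathrm{HU}(x, y, z) = \mathrm{BVF}(x, y, z) \, \mathrm{HU}_{\mathrm{bone}}(z) + \left[1 - \mathrm{BVF}(x, y, z)\right] \mathrm{HU}_{\mathrm{tissue}} \tag{3}$$

The BVF for each voxel can then be estimated as

$$\mathrm{BVF}(x, y, z) = \frac{\mathrm{HU}(x, y, z) - \mathrm{HU}_{\mathrm{tissue}}}{\mathrm{HU}_{\mathrm{bone}}(z) - \mathrm{HU}_{\mathrm{tissue}}} \tag{4}$$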
In this study, HUtissue was set to 0, the HU value of water, and this value was used in the BVF estimation for all patient studies. The value of HUbone(z) was determined separately for each slice because the human anatomy at different cross-sections can produce different amounts of beam hardening, which results in variation of the measured HU values. The slice-to-slice variation in HU is minimal for soft tissue, but cannot be neglected for bone. The details of our method of determining HUbone(z) are described in the Appendix. The linear relationship between the volumetric fraction of bone and the HU values measured with a clinical CT scanner has also been demonstrated ex vivo by Parsa et al.
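The BVF estimation of Step 1 can be sketched as follows. This is a minimal illustration rather than the software used in the study: the data layout (one flat list of HU values per slice), the function names, the example HUbone(z) values, and the clamping of the result to the range 0 to 1 are all assumptions made for this sketch.

```python
def estimate_bvf(hu, hu_bone, hu_tissue=0.0):
    """Estimate the bone volume fraction of one voxel from its HU value."""
    if hu_bone <= hu_tissue:
        raise ValueError("hu_bone must be greater than hu_tissue")
    bvf = (hu - hu_tissue) / (hu_bone - hu_tissue)
    # Assumption of this sketch: clamp to [0, 1] so that noise, air or very
    # dense voxels do not yield fractions outside the physical range.
    return min(1.0, max(0.0, bvf))


def estimate_bvf_volume(volume, hu_bone_by_slice, hu_tissue=0.0):
    """Apply the estimate slice by slice; volume[z] is a flat list of HU values."""
    return [
        [estimate_bvf(hu, hu_bone_by_slice[z], hu_tissue) for hu in slice_hu]
        for z, slice_hu in enumerate(volume)
    ]


# Hypothetical two-slice example with slice-dependent HUbone(z) values
volume = [[0.0, 280.0, 1400.0], [30.0, 600.0, 1500.0]]
hu_bone_by_slice = [1400.0, 1200.0]
bvf = estimate_bvf_volume(volume, hu_bone_by_slice)
print(bvf)
```

Passing HUbone(z) as a per-slice lookup keeps the slice dependence explicit while leaving HUtissue as a single global value, mirroring the assumptions stated above.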
2.3. Step 2: Simulation of MRAC Images with Various Bone Volume Fraction Thresholds
Detection of bone in voxels with higher BVF is always easier than in voxels with lower BVF because there are greater signal contrasts in the MR-derived parameters between these voxels and the background soft tissue voxels. Therefore, the ability of an imaging method to detect bone can be characterized by the minimum BVF that a voxel must have in order for the presence of bone to be detectable using this method. A low BVF threshold indicates that the technique is relatively sensitive to the presence of bone, while a high BVF threshold indicates relative insensitivity.
A BVF threshold of 100% corresponds to complete insensitivity: bone would be detected only in voxels with a BVF strictly greater than 100%, so no voxel is classified as bone. This extreme threshold simulates the MR-based attenuation correction (MRAC) approaches available on current commercial PET/MR systems, which treat bone as soft tissue. As the BVF threshold decreases, the simulated sensitivity increases. In theory, the highest sensitivity corresponds to a BVF threshold of 0%, meaning that the presence of bone can be detected even in voxels in which the volumetric fraction of bone approaches 0%. This could not be simulated in our study, because we found that a 10% BVF threshold corresponds to 100 - 120 HU in the CT attenuation images. Decreasing the threshold further would start to include soft tissue voxels, producing unacceptable classification errors. This also indicates that the bone detection sensitivity of CT, the most sensitive tomographic bone imaging modality, is around 10% BVF. Expecting MR-based methods to achieve this level of sensitivity would be unrealistic with present technology.
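Under the linear model used for the BVF estimation, with HUtissue = 0, a BVF threshold translates directly into an HU threshold. The sketch below shows this conversion and its inverse; the function names are hypothetical, and the implied HUbone values are an inference from the linear model rather than values reported in this study.

```python
def hu_at_bvf(bvf, hu_bone, hu_tissue=0.0):
    """HU value of a mixed voxel with the given bone volume fraction."""
    return hu_tissue + bvf * (hu_bone - hu_tissue)


def implied_hu_bone(hu, bvf, hu_tissue=0.0):
    """HU of homogeneous bone implied by a mixed-voxel HU and its BVF."""
    if bvf <= 0.0:
        raise ValueError("bvf must be positive")
    return hu_tissue + (hu - hu_tissue) / bvf


# A 10% BVF threshold at 100 - 120 HU implies HUbone(z) of roughly 1000 - 1200 HU
low = implied_hu_bone(100.0, 0.10)
high = implied_hu_bone(120.0, 0.10)
print(low, high)
```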
In order to simulate the scenarios of different bone detection sensitivities in MR, ten sets of different attenuation images were created, representing a range of sensitivity with BVF thresholds of 100% (i.e., bone completely ignored and treated as soft tissue), 90%, 80%, 70%, 60%, 50%, 40%, 30%, 20% and 10%. Voxels with a BVF strictly higher than the thresholds were classified as bone in each CT dataset, while voxels with a BVF between 10% and the threshold values were classified as soft tissue.
For each BVF threshold, the mean HU value for the voxels classified as bone was calculated separately for each patient. These mean HU values were then assigned to the segmented bone voxels, replacing the original HU values. The mean HU values for soft tissue voxels were also calculated for each patient, and were assigned to the voxels with a BVF between 10% and the corresponding threshold. The HU values for the voxels with BVF < 10%, i.e., the soft tissue voxels, were not modified. By doing this, the attenuation contribution of bone was isolated from that of the rest of the body. The components that had initially been excluded from the CT images prior to the BVF estimation (e.g., the patient table, the positioning aids and the high attenuation components such as teeth and metal implants) were then reintroduced into the attenuation images so that they were consistent with the reference CT. The ten sets of attenuation images, along with the original CT attenuation images, were then used in reconstructions of the PET data.
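The classification and value assignment of Step 2 can be sketched as below. The sketch operates on flat lists of voxels for one patient and takes the soft tissue HU as an input parameter (`soft_tissue_hu`, a hypothetical name), because how the soft tissue mean is computed is not reproduced here; the 10% floor and the strict inequality follow the description in the text.

```python
def simulate_binary_bone_map(hu_values, bvf_values, threshold, soft_tissue_hu, floor=0.10):
    """Simulate a segmented attenuation image for one BVF threshold."""
    # Voxels with BVF strictly above the threshold are classified as bone
    bone_hu = [hu for hu, bvf in zip(hu_values, bvf_values) if bvf > threshold]
    mean_bone_hu = sum(bone_hu) / len(bone_hu) if bone_hu else None

    simulated = []
    for hu, bvf in zip(hu_values, bvf_values):
        if bvf > threshold:
            simulated.append(mean_bone_hu)    # bone class: mean HU of the class
        elif bvf >= floor:
            simulated.append(soft_tissue_hu)  # undetected bone: treated as soft tissue
        else:
            simulated.append(hu)              # BVF < 10%: left unmodified
    return simulated


# Hypothetical voxels for illustration only
hu_values = [20.0, 150.0, 400.0, 900.0, 1300.0]
bvf_values = [0.01, 0.11, 0.29, 0.64, 0.93]
map_30 = simulate_binary_bone_map(hu_values, bvf_values, 0.30, soft_tissue_hu=40.0)
map_100 = simulate_binary_bone_map(hu_values, bvf_values, 1.00, soft_tissue_hu=40.0)
print(map_30)
print(map_100)
```

With a 100% threshold no voxel is classified as bone, which reproduces the bone-ignored scenario.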
2.4. Step 3: Identification and Comparison of NaF-Avid Bone Lesions
To quantitatively assess the effect of bone detection sensitivity on PET bone lesion quantification, we evaluated the percentage difference in tracer uptake of NaF-avid bone lesions between the data reconstructed with the simulated attenuation correction maps and those reconstructed with the original CT attenuation maps. For convenience, all regions of bone with tracer uptake that was visibly higher than the adjacent background (i.e., that appeared as “hot spots”) are referred to as lesions in this study, without necessarily implying any clinical diagnosis. The lesions were identified in the PET images reconstructed with the original CT attenuation correction (CTAC) images.
The lesions were delineated using in-house software written in MATLAB (MathWorks, Natick, MA). A seed location was first manually selected for each lesion, and a region-growing algorithm was then used to segment the PET voxels with an appropriate uptake threshold that had been manually determined for each lesion. If necessary, this step was followed by morphological dilation or erosion to ensure an adequate segmentation of the entire tracer-avid region. In these NaF PET/CT studies, NaF-avid lesions could be identified in bones all over the body. However, patients sometimes move their limbs during the lengthy PET scan, causing spatial misregistration of the limbs between the CT image and the PET image. This can lead to inaccurate quantification of the affected lesions. For this reason, lesions located in the upper and lower extremities were not included in the analysis.
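The in-house MATLAB software is not reproduced here. As an illustration of the general technique only, the sketch below implements a basic seeded region-growing step in Python on a 2D image with 4-connectivity; the dimensionality, the connectivity and all names are assumptions of this sketch, not details of the software used in the study.

```python
from collections import deque


def region_grow(image, seed, threshold):
    """Return the set of (row, col) pixels connected to the seed with value >= threshold."""
    rows, cols = len(image), len(image[0])
    if image[seed[0]][seed[1]] < threshold:
        return set()
    region = {seed}
    queue = deque([seed])
    while queue:
        r, c = queue.popleft()
        # Visit the four in-plane neighbours of the current pixel
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if (0 <= nr < rows and 0 <= nc < cols
                    and (nr, nc) not in region
                    and image[nr][nc] >= threshold):
                region.add((nr, nc))
                queue.append((nr, nc))
    return region


# Hypothetical uptake image: one hot spot near the centre and a separate one in a corner
image = [
    [1, 1, 1, 1],
    [1, 8, 9, 1],
    [1, 7, 1, 1],
    [1, 1, 1, 9],
]
lesion = region_grow(image, seed=(1, 1), threshold=5)
print(sorted(lesion))
```

The hot pixel in the corner is excluded because it is not connected to the seed, which is the property that distinguishes region growing from simple global thresholding.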
A total of 119 suitable lesions were identified in five different anatomical regions: skull (N = 17), pelvis (N = 28), ribs (N = 17), vertebral processes (N = 13) and vertebral bodies (N = 44). These lesions were quantified using their maximum activity concentration values.
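The per-lesion comparison can be sketched as the percentage difference between the maximum activity concentration in the simulated reconstruction and that in the CT-corrected reference. The function name and the example voxel values are hypothetical.

```python
def percent_difference_max(simulated_values, reference_values):
    """Percentage difference of the lesion maximum relative to the reference."""
    reference_max = max(reference_values)
    if reference_max == 0:
        raise ValueError("reference maximum must be non-zero")
    return 100.0 * (max(simulated_values) - reference_max) / reference_max


# Hypothetical activity concentrations inside one lesion mask
reference_values = [10.0, 20.0, 15.0]
simulated_values = [9.0, 18.0, 13.5]
difference = percent_difference_max(simulated_values, reference_values)
print(difference)  # negative values indicate underestimation
```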
3.1. Simulated MRAC Image with Various Bone Volume Fraction Thresholds
The HU values assigned to the segmented attenuation images are the measured mean HU values of voxels above the respective BVF thresholds. These results are given in Table 1.
3.2. Quantification Difference in NaF-Avid Bone Lesions with and without an Explicit Bone Class in the Attenuation Images
Compared to the reference PET data, the quantification difference in the maximal uptake of the 119 tracer-avid bone lesions when no bone was classified in the attenuation images was −9.9% ± 5.5% (range: −2.0% to −26.4%; see Figure 1 and Table 2). The degree of underestimation differed between lesions located in the skull (−19.9% ± 3.8%) and lesions located in the body (−8.2% ± 3.6%, p < 0.001). Within the body, the underestimation of lesions in the ribs (−4.6% ± 1.6%) was significantly smaller than that of lesions in the pelvis, the vertebral bodies and the vertebral processes (−8.9% ± 3.5%, p < 0.001). The differences among lesions in the pelvis, the vertebral bodies and the vertebral processes were found not to be statistically significant (p = 0.705 between the pelvis and the vertebral processes, p = 0.522 between the pelvis and the vertebral bodies, and p = 0.180 between the vertebral processes and the vertebral bodies).
Table 1. HU values assigned to the segmented bone voxels corresponding to each sensitivity level of bone detection. These values were the measured mean HU of the voxels above the BVF threshold in the corresponding CT attenuation images.
Figure 1. Quantification error in 119 bone lesions when bone is classified as soft tissue in the attenuation image. The underestimation of uptake was significantly higher in the skull lesions, while no statistically significant difference was observed among lesions in the pelvis, the vertebral process and the vertebral body. The apparent quantification error in rib lesions was lower, which was caused by the spatial misregistration between the PET data and the CT attenuation data.
Table 2. Quantification error in 119 bone lesions when bone is classified as soft tissue in the attenuation image.
For the simulation of the case of a highly sensitive method of bone classification in MR (corresponding to a BVF threshold of 10%), the quantification difference of the 119 bone lesions improved from −9.9% ± 5.5% to 1.2% ± 4.7%. However, this small mean difference reflects compensation between overestimation in some lesions and underestimation in others. Analyzing the absolute values of the differences in order to remove the compensation effect, the absolute quantification difference of the 119 lesions decreased from 9.9% ± 5.5% to 4.0% ± 2.7%, which is still a reduction of 59% of the original bias.
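The compensation effect can be illustrated with a short sketch. The four per-lesion differences below are hypothetical values chosen only to show how a small signed mean can hide larger individual errors; they are not data from this study.

```python
def mean(values):
    """Arithmetic mean of a non-empty list."""
    return sum(values) / len(values)


# Hypothetical per-lesion quantification differences, in percent
differences = [5.0, -3.0, 4.0, -2.0]

mean_signed = mean(differences)                      # over- and underestimation cancel
mean_absolute = mean([abs(d) for d in differences])  # cancellation removed
print(mean_signed, mean_absolute)
```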
3.3. Effect of Bone Detection Sensitivity on Bone Lesion Quantification
Figure 2. Illustration of the simulated attenuation maps in this study and the original CT. For display purposes, only the torso portion of the whole-body study is shown here. Different BVF thresholds were used when segmenting the partially bone voxels, simulating a hypothetical MR-based bone imaging method with different levels of bone imaging ability, in which voxels with a BVF strictly greater than the threshold are identifiable. BVF = 100% corresponds to the scenario in which no bone voxels were identified.
An example of the simulated attenuation images with different bone imaging abilities is shown in Figure 2. The dependence of the quantification difference on the sensitivity of bone detection is plotted in Figure 3 and Figure 4. The detailed results are shown in Table 3. The overall absolute quantification error, which was 9.9% ± 5.5% with the 100% BVF threshold (i.e., without any bone identification), did not decrease monotonically as the sensitivity of bone detection increased, but instead reached a minimum of 1.5% ± 1.3% in the simulated attenuation images at a 30% BVF threshold. This corresponds to a reduction of 84% of the original quantification bias. Beyond this point, increasing the bone detection sensitivity further (i.e., reducing the BVF threshold) led to slight increases in the quantification error.
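Selecting the optimal threshold amounts to finding the BVF threshold with the smallest mean absolute error. The sketch below uses only the three overall values quoted in the text (thresholds of 100%, 30% and 10%); the remaining entries of Table 3 are not reproduced, and the variable names are our own.

```python
# Overall mean absolute quantification error (%) at the thresholds quoted in the text
abs_error_by_threshold = {100: 9.9, 30: 1.5, 10: 4.0}

# Threshold with the smallest error among the values listed above
best_threshold = min(abs_error_by_threshold, key=abs_error_by_threshold.get)
print(best_threshold)
```

Even with only these three points, the non-monotonic behavior is visible: the error at the most sensitive threshold (10%) is larger than at 30%.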
There was also a notable difference between the lesions in the skull and the lesions in the body (i.e. in the pelvis, ribs, and vertebrae). As the bone detection sensitivity increased from no bone detection to approximately 30% BVF threshold, the degree of underestimation of bone lesions in the body decreased steadily from 8.2% ± 3.6% to 1.4% ± 1.1%. Overestimation then started to occur and the absolute quantification difference reached 3.4% ± 2.0% at a 10% BVF threshold. In contrast, underestimation of the uptake in skull lesions decreased from 19.9% ± 3.8% to 2.6% ± 2.1% as the bone detection sensitivity increased to 40% BVF threshold, and then rose to 8.0% ± 3.4% for still lower BVFs without ever becoming overestimated.
Figure 3. Absolute quantification error of bone lesions at different locations vs. bone detection sensitivity. For easier visualization, the data of lesions located at different sites are slightly shifted on the X-axis. The minimal quantification error occurred at approximately 30% BVF for the body lesions and approximately 40% for the skull lesions. (V-PROC = vertebral process; V-BODY = vertebral body.)
Figure 4. Quantification difference of bone lesions (without taking absolute values) at different locations vs. bone detection sensitivity. For easier visualization, the data of lesions located at different sites are slightly shifted on the X-axis. A large difference between the data of the body lesions and of the skull lesions can be observed. This is likely the result of using one single attenuation coefficient to represent the wide range of attenuation coefficients that can be observed in typical CT images. According to these results, if the binary classification method is to be used for the correction of photon attenuation of bone, optimal quantification results can be obtained with the detection of all voxels that are partially bone with a BVF above approximately 30%. (V-PROC = vertebral process; V-BODY = vertebral body.)
Table 3. Absolute quantification error in the evaluated lesions vs. BVF threshold used in binary segmentation of bone (V-Proc = vertebral process; V-Body = vertebral body). The minimum value for each skeletal region is underlined.
MR-based attenuation correction techniques using binary tissue classification have been studied by a number of investigators, many of whom concluded that identifying bone as a separate tissue class is necessary in order to achieve accurate quantification of the PET data, especially for regions inside of or near bones. Some of these studies have shown that accurate PET quantification can be achieved with a sensitive bone detection and classification method. However, previous studies have not discussed the voxel averaging of bone and soft tissue that occurs with a binary tissue-segmentation approach, or the sensitivity that an MR tissue-classification-based attenuation correction approach requires in order to correct for the attenuation from human bones. In this study, we examined the contribution of bone to attenuation in greater detail and, for the first time, evaluated how the sensitivity of bone detection can affect PET quantification in bone lesions. We have thereby established that, with a tissue-classification-based approach, the MR imaging technique should be able to identify all voxels with greater than 30% BVF as “bone voxels” in order to minimize the bone-induced quantification inaccuracy in PET/MR studies.
When bones are not separately classified in the attenuation images, uptakes in all bone lesions evaluated in this study were underestimated. The underestimation spanned a wide range, from 2.0% to over 25%. It was significantly higher in the skull than in the other parts of the skeleton that were evaluated in this study. This is to be expected because in the head, the bone-to-tissue ratio is appreciably higher than in the body.
The underestimation of lesion uptake in the ribs was significantly smaller. However, this does not necessarily indicate that the detection of rib bones is less important for attenuation correction purposes. This difference in the quantification of rib lesions likely stems from the deleterious effects of involuntary respiratory motion, which caused spatial misregistration between the PET images and the CT images so that uptake in the rib bones was partially projected into soft tissue in the CT images, thereby rendering the reference value (i.e., the CT-corrected PET data) inaccurate.
The most interesting result of this study is the non-monotonic relationship between the bone detection sensitivity and the quantification difference in bone lesions. Intuitively, one would expect that the most accurate PET data would be reconstructed from the attenuation map that was made with the highest bone detection sensitivity (i.e., the lowest BVF threshold). Our results show otherwise: when a binary tissue-classification method is used for attenuation correction of bone in whole-body imaging, there appears to be an optimal BVF threshold for the segmentation of voxels that are partially bone. Beyond that optimal BVF, improving the sensitivity further is not only unnecessary but in fact counterproductive. This result may be explained by the manner in which tissue-classification approaches are performed in MR-based PET attenuation correction: a single attenuation coefficient is assigned to represent an entire tissue class. This can be justified relatively easily for air, fat, and non-fat soft tissues, whose attenuation coefficients have been shown to have very small inter-patient variation (partly because intra-voxel averaging is not a serious problem for these tissue classes). It proved to be more problematic, however, for the bone and lung classes, both of which are affected by substantial intra-voxel averaging (of air and soft tissue for the lung class, and of bone and soft tissue for the bone class) at typical clinical voxel sizes. As a result of tissue mixing at various ratios, the nominal attenuation coefficients measured in CT for these two classes have significantly wider distributions than those of “air”, “fat” and “soft tissue,” even within the same patient. Consequently, using a single attenuation coefficient to represent the entire tissue class inevitably leads to overestimation in some regions and underestimation in others. In our study, this can be observed both in the split trend between skull and non-skull lesions and in the variation of lesion quantification within the same type of bone.
However, this does not invalidate the use of binary tissue classification for PET bone attenuation correction. Our results in this simulated study at a representative clinical voxel size have shown that, using a binary bone segmentation method corresponding to a BVF threshold of approximately 30% and assigning 760 HU (corresponding to about 0.135 cm−1 for 511 keV photons), the absolute quantification difference of bone lesions was reduced to 1.5% ± 1.3% compared to CT-corrected PET data. Although this is probably as high an accuracy as a binary-classification method can achieve, it is acceptable because an absolute quantification difference of less than 2.0% is sufficient for most clinical PET applications, if not all. More importantly, this study demonstrates that, in order to minimize the quantification effect of bone, MR-based methods do not have to be as sensitive as CT in bone detection. This is fortunate, given the various limitations in the fundamental imaging mechanism of MR when it comes to bone imaging.
It should be noted that this study has several limitations. One of the primary limitations is the inaccuracy in the estimation of BVF using HU values, which comes from two main sources. The first is the inaccuracy of the values of HUbone. In the determination of BVF for individual voxels, ideally HUbone should be corrected for beam hardening on a voxel-by-voxel basis. However, this is not practical. By using the heuristic method that we developed to correct the HUbone variation on a slice-by-slice basis (see the Appendix), the uncertainty is reduced. The remaining variation can still cause inaccuracy in our results and, unfortunately, there is no simple way to further reduce its impact.
Another limitation of this study is associated with voxel size, which largely determines the extent of intra-voxel averaging. In theory, when the voxel size is small enough compared to the dimensions of human cortical bone, intra-voxel averaging is negligible, and most bone will be located in homogeneous bone voxels. In this scenario, even a method with low bone detection sensitivity would be able to classify a sufficient amount of bone for an adequate attenuation correction. However, reducing voxel averaging by using small voxel sizes in clinical MR scans, especially those used for attenuation correction purposes, is impractical because it requires a substantially longer acquisition time and also degrades the signal-to-noise ratio of the images. The voxel size of the CT attenuation images used in this study was 1.5 mm × 1.5 mm × 3.0 mm, which is a typical voxel size in clinical MR imaging of the body. The optimal BVF threshold for correcting bone attenuation is expected to depend on the voxel size used for MR imaging, and larger voxel sizes are expected to require higher sensitivities (i.e., lower BVF thresholds) because they result in a lower average BVF in heterogeneous voxels. Since the voxel size for whole-body MR attenuation images is not likely to be significantly smaller than the voxel size used in this study, a BVF of 30% can be considered to be the bone detection sensitivity that tissue-classification techniques should aim to achieve in order to obtain the best quantification of PET bone lesion uptake in whole-body PET/MR studies. We estimate the sensitivities of previously published UTE MRAC studies to be around 50% - 70%, using the sequence parameters published in the papers and typical MR properties of tissue. The sensitivities that were achieved in those studies can adequately correct the bone attenuation in PET/MR studies of the head. However, the present work suggests that greater sensitivity is needed for accurate bone lesion quantification in whole-body PET/MR studies.
Treating bone as soft tissue can lead to an underestimation of the uptake inside bone lesions in whole-body PET/MR studies of approximately 10%. The relationship between bone detection and the accuracy of PET quantification in bone is non-monotonic. By combining the proper level of bone detection with the corresponding mean HU for bone voxels, a tissue-classification approach can reduce the absolute quantification error of bone lesions to less than 2% compared to the reference CT-corrected PET data. The optimal bone detection threshold is approximately 40% BVF for the skull and 30% BVF for non-skull skeleton. This is the attenuation correction requirement for the most accurate quantification of bone lesions with PET/MR at a typical clinical voxel size.
The work of this paper was partially supported by the Shalek Award from the Medical Physics Program of The University of Texas Graduate School of Biomedical Sciences at Houston.
The volumetric fraction of bone, or BVF, of any voxel can be computed if the HU values of homogeneous bone and soft tissue in the voxel are known. While the HU value for soft tissue is relatively constant for a properly calibrated CT scanner, HU values for homogeneous bone voxels can be affected by factors such as scan parameters, spatial location and reconstruction methods. In this appendix, we describe our methodology for determining the value of HUbone to be used in the estimation of BVF.
The simplest way to determine HUbone is through direct measurement. However, this is not achievable for every voxel in a CT dataset simply because not every voxel is a bone voxel. The best solution is to use HUbone measured in a nearby region to approximate HUbone for voxels where HUbone is not measurable. In this study, we determined HUbone on a slice-by-slice basis, the justification of which is provided in the following paragraphs.
1) Identification of Homogeneous Bone Voxels in CT Images with Combined Thresholding and Morphological Erosion
In order to measure HUbone, homogeneous bone voxels must be first identified in the CT dataset. While homogeneous soft tissue voxels can be easily located in clinical whole-body CT images, bone voxels are scarce, and homogeneous bone voxels are much more so. The voxel averaging between bone and soft tissue can be classified into two categories: 1) the intermixing of soft tissue with the porous structure of the mineralized bone matrix, such as in trabecular bone, which we call “intrinsic” averaging, and 2) the apposition of bone and tissue at a soft tissue-bone interface, such as the boundary between cortical bone and skeletal muscle, which we call “extrinsic” averaging.
We adopted a two-step process to identify homogeneous bone voxels. First, to exclude the majority of trabecular bone voxels, a relatively high HU threshold was applied to the CT data and the voxels with high BVF were extracted. However, high BVF voxels can be located on the tissue-bone interface and remain potentially subject to extrinsic averaging. To exclude these voxels, the second step was to apply a 3D morphological erosion algorithm to the mask of high BVF voxels. Using a 1-voxel erosion radius, this step usually reduced the number of extracted voxels by 40% - 60%. The remaining voxels were mostly homogeneous bone voxels (Figure A1). The mean HU value of these voxels was then used to estimate HUbone of the corresponding CT slice.
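As a rough illustration of the two-step extraction, the following Python sketch thresholds a CT volume and erodes the resulting mask by one voxel. It is not the implementation used in this study: the function names, the 6-connected structuring element, the treatment of the volume edge as background, and the axis order (slice index first) are assumptions made for the example. The 1000 HU default is the candidate threshold reported later in this appendix.

```python
import numpy as np


def erode_6_connected(mask):
    """Erode a 3D binary mask by one voxel (6-connected neighbourhood).

    Assumption: voxels outside the volume count as background, so voxels
    on the edge of the volume are always removed.
    """
    padded = np.pad(np.asarray(mask, dtype=bool), 1,
                    mode="constant", constant_values=False)
    eroded = padded[1:-1, 1:-1, 1:-1].copy()
    for axis in range(3):
        for shift in (-1, 1):
            # A voxel survives only if its neighbour along this axis is set.
            neighbour = np.roll(padded, shift, axis=axis)
            eroded &= neighbour[1:-1, 1:-1, 1:-1]
    return eroded


def homogeneous_bone_mask(hu, threshold=1000.0):
    """Step 1: HU thresholding. Step 2: one-voxel 3D erosion."""
    candidates = np.asarray(hu) > threshold
    return erode_6_connected(candidates)


def slice_mean_bone_hu(hu, mask):
    """Mean HU of the masked voxels in each slice (axis 0).

    Slices without any masked voxel are reported as NaN.
    """
    hu = np.asarray(hu, dtype=float)
    means = np.full(hu.shape[0], np.nan)
    for z in range(hu.shape[0]):
        if mask[z].any():
            means[z] = hu[z][mask[z]].mean()
    return means
```

In this sketch, slices that return NaN are the ones for which HUbone cannot be measured directly.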
A limitation of this method is that not every CT slice contains voxels that can be used to estimate HUbone in the two-step process described above. For these slices, the value of HUbone cannot be directly measured. Instead of direct measurement, we developed a heuristic method that estimates the slice-specific HUbone value by exploiting its dependence on the beam hardening effect.
2) Effect of Beam Hardening on HU Values
In order to demonstrate the effect of beam hardening on the HU values, an anthropomorphic knee phantom was scanned with the CT component of a GE Discovery 690 PET/CT scanner using four different setups that introduced different degrees of beam hardening (Figure A2). The different extent of beam hardening was achieved by adding various amounts of tissue-equivalent attenuating material next to the phantom, with the overall attenuation increasing from setup 1 to setup 4. A series of CT datasets were acquired with the same scanning parameters: 120 kVp, 300 mAs, pitch factor = 0.984, and 40 × 0.625 mm collimation. They were reconstructed into images with a 0.98 mm transverse voxel size, a 0.625 mm slice thickness and a 0.625 mm slice spacing. Voxels of homogeneous “soft tissue” and “bone” of the entire phantom were segmented with a method similar to that described in section 1) of this appendix: an HU thresholding (soft tissue: [0 HU, 100 HU], bone: [1000 HU, ∞)) followed by morphological operations to erode the segmented masks isotropically by one voxel. The segmented voxels came only from the knee phantom and did not include any voxels from within the added attenuation materials.
Figure A1. Illustration of the two-step process of extracting homogeneous bone voxels. Left: a zoom-in cross-section view of the femur in a whole-body CT dataset. Middle: voxels extracted with HU thresholding (marked with blue “x”). Right: voxels extracted after a 3D morphological erosion operation with 1-voxel erosion radius applied to the thresholded voxels.
Figure A2. The anthropomorphic knee phantom and the scan setups used to verify the impact of beam hardening on HU values. The total amount of attenuation increased monotonically from setup 1 to setup 4.
HU values of segmented homogeneous tissue and bone voxels are plotted in Figure A3. The top row shows the HU values of all the tissue voxels and bone voxels across the phantom for setups 1 - 4. HU values of the soft tissue voxels remained essentially the same, while a considerable decrease in the HU values of the “bone” was introduced by the increased amount of attenuation and beam hardening within the reconstructed CT images. The bottom row shows the HU values of different randomly selected CT slices of the phantom plotted against total in-slice attenuation (TISA), which is calculated as the sum of HU values over all voxels within the CT slice. While the HU value of tissue voxels remains approximately constant over different TISA, the HU value of bone voxels decreases with increasing TISA, reflecting the effect of the hardened beam in a more attenuating CT slice. It can be seen that the underlying relationship between bone HU and TISA is approximately linear. Since TISA can always be computed for arbitrary CT slices, it can be used to estimate the value of HUbone for slices where direct measurement is not possible.
Figure A3. HU values corresponding to scan setups 1 - 4, with increasing overall attenuation. Top left: HU value of all tissue voxels. Top right: HU value of all bone voxels. Bottom left: HU value of tissue voxels in a randomly selected slice, corresponding to a range of different TISA (total in-slice attenuation, defined as the HU sum of all voxels within the slice). Bottom right: HU value of bone voxels in a randomly selected slice. While the HU value for tissue voxels remained essentially constant, the HU value for bone voxels was affected by the amount of attenuation present in the slices. There is an underlying linear relationship between HU and TISA.
In the data plotted here, there is approximately a 150 HU difference between the slices with maximal and minimal TISA. This corresponds to about a 5% difference in the measured CT attenuation. It should be noted that beam hardening and the resultant variation in bone attenuation values in a whole-body CT study can be greater than the difference demonstrated in this phantom study, as the difference in attenuation between cross-sections of the human body (e.g., abdomen vs. knee) can easily exceed the difference introduced in this experiment.
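Because TISA is defined simply as the sum of HU values over all voxels of a slice, it can be computed for any CT volume in a few lines. The sketch below is only an illustration; the function name and the axis order (slice index first) are assumptions.

```python
import numpy as np


def total_in_slice_attenuation(hu):
    """Return TISA for each slice: the sum of HU over all voxels in the slice.

    Assumption: `hu` is a 3D array indexed as (slice, row, column).
    """
    hu = np.asarray(hu, dtype=float)
    # Flatten each slice and sum its voxels.
    return hu.reshape(hu.shape[0], -1).sum(axis=1)
```

Note that air voxels carry negative HU values and therefore lower the sum, so the result depends on which voxels are kept in the slice.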
3) Estimation of HUbone
The relationship between HUbone and TISA was also observed in clinical CT data (with the interesting exception of skull slices), as shown in Figure A4. Using this relationship, we developed a heuristic method to estimate HUbone for each individual CT slice.
In this strategy, HUbone is adjusted for each individual slice when estimating the BVF, as described in Equation (4).
Figure A4. HU vs. TISA of homogeneous bone voxels in one clinical whole-body CT dataset. Each data point and error bar represents the voxels in one CT slice. It can be seen that the relationship between the mean HU of the homogeneous bone voxels and TISA is approximately linear (except for the skull slices).
The voxels within objects other than the patients (e.g., the scanner table and the positioning aids) were first excluded from the images, and voxels of very high attenuation values such as teeth and metal implants were also excluded using a threshold of 2000 HU. Then we computed HUbone for each slice in each CT dataset of this study using the following steps:
a) Classification of slices
The CT slices were divided into skull slices and torso slices. The slice that contained the most inferior point of the chin was identified as the landmark, and all slices above it were designated as skull slices (including the landmark slice), while slices below the landmark were regarded as torso slices.
b) Segmentation of homogeneous bone voxels
The segmentation of homogeneous bone voxels uses the method described in section 1) of this appendix. The candidate voxels to be classified as homogeneous bone voxels were those that exceeded a threshold of 1000 HU. In order to exclude the voxels that were potentially mixed with soft tissue, a morphological erosion algorithm was used to erode the mask of the candidates isotropically by one voxel. This step ensured that the voxels that were classified as bone were at least one voxel away from any soft tissue voxels.
c) Mean HU value for bone voxels
The mean HU value for the segmented homogeneous bone voxels in each slice was computed.
d) Estimation of HUbone(z) for torso slices
The total in-slice attenuation TISA (i.e., the summation of the HU values of all voxels within a particular CT slice) was calculated for each torso slice. A linear regression analysis was performed between TISA and the measured mean HU for bone voxels. HUbone(z) was then computed for each torso slice from its TISA using the fitted regression coefficients (slope and intercept).
e) Estimation of HUbone(z) for skull slices
The HU values for homogeneous bone voxels in the skull slices were observed not to be linearly correlated with TISA. Therefore, a single HUbone(z) was used for all of the skull slices. It was simply calculated as the mean HU of all the homogeneous bone voxels in the skull of each dataset.
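Steps d) and e) can be sketched as follows. This is a minimal illustration under stated assumptions, not the code used in this study: the function and argument names are hypothetical, the regression is assumed to be an unweighted least-squares line through the per-slice means, and the skull value is obtained as a voxel-count-weighted average of the per-slice means, which equals the mean over all skull bone voxels.

```python
import numpy as np


def estimate_hu_bone(tisa, mean_bone_hu, bone_voxel_counts, is_skull):
    """Estimate HUbone(z) for every slice of one dataset.

    tisa              : per-slice total in-slice attenuation
    mean_bone_hu      : per-slice mean HU of homogeneous bone voxels
                        (ignored where the voxel count is zero)
    bone_voxel_counts : per-slice number of homogeneous bone voxels
    is_skull          : per-slice flag, True for skull slices
    """
    tisa = np.asarray(tisa, dtype=float)
    mean_bone_hu = np.asarray(mean_bone_hu, dtype=float)
    counts = np.asarray(bone_voxel_counts, dtype=float)
    is_skull = np.asarray(is_skull, dtype=bool)

    measured = counts > 0
    hu_bone = np.full(tisa.shape, np.nan)

    # Torso: fit mean bone HU against TISA, then evaluate the line
    # for every torso slice, measured or not.
    torso_fit = measured & ~is_skull
    if torso_fit.sum() < 2:
        raise ValueError("need at least two torso slices with bone voxels")
    slope, intercept = np.polyfit(tisa[torso_fit], mean_bone_hu[torso_fit], 1)
    hu_bone[~is_skull] = slope * tisa[~is_skull] + intercept

    # Skull: one value, the mean over all homogeneous skull bone voxels.
    skull_fit = measured & is_skull
    if skull_fit.any():
        hu_bone[is_skull] = np.average(mean_bone_hu[skull_fit],
                                       weights=counts[skull_fit])
    return hu_bone
```

Torso slices without any homogeneous bone voxels still receive an estimate, because the fitted line is evaluated at their TISA.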
After HUbone(z) had been determined, BVF was computed for each voxel using Equation (4). The computed BVFs could have values less than 0 (for voxels with HU less than 0) or greater than 1 (for voxels with HU greater than HUbone), which are not physically possible values for BVF. In those cases, the out-of-range BVFs were set to 0 or 1, respectively.
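The final step can be illustrated with the short sketch below. Equation (4) is given in the main text and is not reproduced here, so the linear two-component mixing form used in the code is an assumption, as are the function name and the `hu_soft` parameter. Its default of 0 HU is chosen only to be consistent with the statement that voxels below 0 HU yield negative BVF.

```python
import numpy as np


def bone_volume_fraction(hu, hu_bone_per_slice, hu_soft=0.0):
    """Per-voxel BVF with out-of-range values set to 0 or 1.

    Assumed mixing model (not taken from Equation (4) verbatim):
        BVF = (HU - HU_soft) / (HU_bone(z) - HU_soft)
    `hu` is indexed as (slice, row, column); `hu_bone_per_slice` holds
    one HUbone(z) value per slice.
    """
    hu = np.asarray(hu, dtype=float)
    hu_bone = np.asarray(hu_bone_per_slice, dtype=float).reshape(-1, 1, 1)
    bvf = (hu - hu_soft) / (hu_bone - hu_soft)
    # Values below 0 or above 1 are not meaningful fractions.
    return np.clip(bvf, 0.0, 1.0)
```

For example, with HUbone(z) = 1500 HU in a slice, voxels of -100 HU, 750 HU and 2000 HU map to BVF values of 0, 0.5 and 1 under this assumed model.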