Commissioning a commercial treatment planning system (TPS) in radiation oncology involves two major tasks: modeling the beam data and validating the accuracy of the resulting models. The International Commission on Radiation Units and Measurements (ICRU) recommends an overall accuracy of 5% in the delivery of absorbed dose, and the American Association of Physicists in Medicine (AAPM) suggests an accuracy of 2% in the computed dose distribution.
Recently, AAPM published a medical physics practice guideline (MPPG 5.a.), which sets the minimum requirements for commissioning and QA of treatment planning dose calculations. MPPG 5.a. describes the required validation process in the following sections:
5. Photon beams: basic dose algorithm validation;
6. Photon beams: heterogeneity correction validation;
7. Photon beams: IMRT/VMAT dose validation;
8. Electron beam validation.
For each validation section (basic photon, heterogeneity, IMRT/VMAT and electrons), the guideline suggests validation tests and evaluation criteria. Judging the quality of a model requires taking measurement accuracy into account in addition to the limitations of the model. MPPG 5.a. leaves the choice of measurement technique for those tests to the user, but it states “Water tank profiles yield the most accurate absolute dose comparison, while array detectors can test multiple points within the distribution and provide efficient comparison to calculations.”
2. Materials and Methods
The Collapsed Cone Convolution (CCC) dose algorithm in the Pinnacle planning system was modeled following the vendor’s instructions for beam data collection. Modeling parameters in Pinnacle can be adjusted separately for different regions of the beam data (depth dose, buildup, in-field and out-of-field); they are used to model the photon spectrum, electron contamination, flattening filter attenuation, effective source size and flattening filter scatter source. Jaw and MLC leaf transmission factors are likewise treated as modeling parameters rather than being fixed at their measured values.
For all basic validation tests, absolute dose was compared between measurement and calculation at each point of interest (POI). An IBA (IBA Dosimetry GmbH, Schwarzenbruck, Germany) cc13 ion chamber was used for the photon basic dose algorithm and heterogeneity correction tests, a PTW (PTW-Freiburg, Freiburg, Germany) E type diode for the electron beam tests, and a Sun Nuclear (Sun Nuclear, Melbourne, FL, USA) diode array (ArcCHECK) for the photon IMRT/VMAT validations.
2.1. Photon Beams: Basic Dose Algorithm Validation
Tests 5.1 - 5.3 are the traditional verifications of percent depth dose (PDD), profiles and output at the nominal source-to-surface distance (SSD), which are essential for checking the agreement of the model with the commissioning data. In addition, MPPG 5.a. recommends six other tests, summarized below, for basic photon beam validation in homogeneous media with static MLC fields.
5.4 Small MLC-shaped field (non-SRS)
5.5 Large MLC-shaped field with extensive blocking (e.g., mantle)
5.6 Off-axis MLC-shaped field, with maximum allowed leaf over-travel
5.7 Asymmetric field at minimal anticipated SSD
5.8 10 × 10 cm² field at oblique incidence (at least 20°)
5.9 Large (>15 cm) field for each nonphysical wedge angle
Table 1. Summary of the MLC-shaped field tests performed using in-water profile scans, with SSD, depth and chamber positions relative to the isocenter.
Those tests were performed by comparing the absolute dose at various POIs between measurement and calculation. We scanned dose profiles in water at three SSDs (80 cm, 100 cm and 120 cm) and four depths (2 cm, 4 cm, 12 cm and 25 cm) using an IBA cc13 ion chamber and a Blue water phantom. Table 1 shows the ion chamber positions at the various SSDs and depths. The absolute dose at each point of the measured profile was obtained from the charge signal by its ratio to the charge reading under the dose calibration conditions. Each specified dose profile was calculated in the planning system using a virtual water phantom (50 cm × 50 cm × 50 cm). The resolution of the dose profiles was 2 mm for both calculation and measurement. All six tests (5.4 - 5.9) were carried out as suggested by MPPG 5.a. In our experience, test 5.7 can be incorporated into test 5.8 by using an irregular/asymmetric MLC field, and test 5.9 can use the same MLC field as test 5.5 with the wedge angles of interest added. The experiments for all six suggested tests therefore focused on the MLC fields illustrated in Figure 1.
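The charge-to-dose conversion and the point-by-point comparison described above can be sketched in a few lines of Python. This is a minimal illustration, not the analysis code used in this work: the function names and the arguments `ref_charge` and `ref_dose_cgy` (the chamber reading and the known dose under calibration conditions) are hypothetical, and the chamber response is assumed to be linear so that dose scales with the ratio of charge readings.

```python
import numpy as np

def charge_to_dose(charge, ref_charge, ref_dose_cgy):
    """Convert chamber charge readings to dose (cGy) by their ratio to the
    reading obtained under the dose calibration conditions (linear response
    is an assumption of this sketch)."""
    return np.asarray(charge, dtype=float) / ref_charge * ref_dose_cgy

def dose_difference(meas_pos_mm, meas_cgy, calc_pos_mm, calc_cgy):
    """Interpolate the calculated profile onto the measured positions and
    return calculation minus measurement at each point (cGy).
    calc_pos_mm must be in increasing order for np.interp."""
    calc_on_meas = np.interp(meas_pos_mm, calc_pos_mm, calc_cgy)
    return calc_on_meas - np.asarray(meas_cgy, dtype=float)
```

Interpolating the calculation onto the measured positions, rather than the reverse, leaves the measured points untouched, so the difference curve is evaluated only where data were actually acquired.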
Figure 1. Beam’s Eye View of the static MLC fields: (a) small non-SRS MLC-shaped field (Test 5.4); (b) large MLC-shaped field with extensive blocking (Test 5.5 or Test 5.9 with wedges); (c) off-axis MLC-shaped field with maximum allowed leaf travel (Test 5.6); (d) irregular MLC-shaped field (Test 5.7 at nominal gantry angle or Test 5.8 at oblique incidence).
2.2. Photon Beams: Heterogeneity Correction Validation
The test recommended by the guideline for the accuracy of dose calculation in heterogeneous media is a small field (5 × 5 cm²) delivered through low-density material. We employed a CIRS thorax phantom (Model: 002LFC), which consists of lung-, tissue- and bone-equivalent materials. A photon beam with a 5 × 5 cm² open jaw field delivered 100 MU from each of the AP, left lateral and PA directions. A calibrated ion chamber was inserted so that the measured point dose could be compared with the calculated volume average (mean dose). Films were also used to measure the dose at the POI. The phantom and the beam configurations are illustrated in Figure 2.
Figure 2. CIRS thorax phantom with ion chamber and beam configuration.
2.3. Photon Beams: IMRT/VMAT Dose Validation
The five types of validation tests recommended for IMRT/VMAT delivery modalities are summarized below.
7.1 Verify small field PDD, using a small detector such as diode or plastic scintillator
7.2 Verify output for small MLC-defined fields, using a small detector
7.3 TG-119 tests, using both ion chamber and array detectors with appropriate resolution
7.4 Clinical tests, using both ion chamber and array detectors with appropriate resolution
7.5 External review, various options such as IROC Houston anthropomorphic phantoms
We validated PDD and output for small MLC-shaped fields with a diode. The TG-119 IMRT plans and QA tests (prostate and C-shaped target) were measured with MapCHECK and an ion chamber in water slabs. Two representative clinical VMAT cases (lung and pelvis) were measured with ArcCHECK and its ion chamber insert. IMRT QA fields were delivered beam-by-beam at the nominal gantry angle, and the composite dose was also compared with ion chamber measurements in the high and low dose regions. The End-to-End VMAT test with the IROC H & N phantom was verified by film and TLD.
2.4. Electron Beam Validation
The recommended tests for electron beam validation include comparing the isodose distribution for a custom cutout, for an obliquely incident beam and for heterogeneous media. We performed the tests for a custom cutout and an obliquely incident beam using in-water profile scans at different SSDs and depths. Figure 3 illustrates the setup of an oblique electron beam with a 10 × 10 cm² open cone at a 30° gantry angle. Dose calculation in heterogeneous media for electron beams can be tested using a piece of film sandwiched between two thin slabs of styrofoam, with solid water slabs placed on the top and bottom as buildup and backscatter. The accuracy and limitations of the pencil beam algorithm are well known and discussed elsewhere. We do not discuss our results for this test in this publication because MPPG 5.a. does not consider the pencil beam algorithm a good choice for dose calculation in heterogeneous media.
All comparisons between calculation and measurement are given as absolute dose in cGy. For ion chamber measurements, the charge reading was converted to dose simply by its ratio to the reading under the TG-51 calibration in water. The conversion from charge to dose in different media is addressed in the discussion section. For film measurements, dose was calculated from the calibration curve of the same batch of Gafchromic film, following the manufacturer’s instructions.
Figure 3. Test 8.2: open cone at oblique incidence and/or extended SSD.
3.1. Photon Beams: Basic Dose Algorithm Validation
Doses at the POIs were compared by plotting the calculated and measured profiles at the specified SSD and depth as absolute dose for a delivery of 100 MU. The difference between calculation and measurement is also plotted. Figure 4 illustrates one of those plots for the in-plane profiles from Test 5.8.
By visual inspection, the overall agreement between calculation and measurement is consistent with the models for all test fields, i.e., the agreement in the high dose region (in-field and shoulder) varies with depth and energy. The penumbra region agrees well once the setup uncertainty in measurement is taken into account (by correcting any offset of less than 3 mm). Any disagreement is also easily identified from the difference curve or the numeric result. In general, all tests met the tolerances given by the guideline (see Table 5 in reference 3) with no substantial disagreement. Only for test 5.8 (an oblique MLC beam) were some subtle discrepancies observed: a slightly different penumbra and a spike in the calculation at deeper depths. Scrutiny of the profiles calculated at different offsets in the X direction, plotted in Figure 5, showed that those features are consistent with the MLC aperture, with the spike most pronounced across the beam central axis (X = 0). The model was apparently able to calculate the unequal scatter contributions from the asymmetric MLC shape and from a few MLC leaves protruding toward the center, resulting in different penumbra slopes. However, those subtle details were not reflected in the measurement, most likely because of the finite resolution of the ion chamber.
Figure 4. Test 5.8 for 6 MV and 10 MV at an SSD of 80 cm, in-plane profiles in absolute dose (cGy). Blue line: measurement; red line: calculation; green line: difference.
Figure 5. In-plane profiles (Y direction) calculated at an SSD of 80 cm and a depth of 25 cm for Test 5.8, plotted at various offsets along the cross-plane direction (X direction, X = 0 at CAX). The left panel is 6X and the right panel is 10X.
3.2. Photon Beams: Heterogeneity Correction Validation
The results are summarized in Table 2. The point doses from calculation and measurement agree well for beams passing through the different heterogeneous media in this test. The ion chamber and film measurements are also consistent with each other. The procedure recommended by MPPG 5.a. is to compare the ratio of dose above and below the heterogeneity along the central axis. Comparing the absolute dose at a POI should be sufficient to show the accuracy of the commissioned algorithm in heterogeneous media and, as an End-to-End test, it might be considered a less precise yet stricter approach. A more precise test would be expected to pass as long as the POI test passes.
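The two comparison approaches mentioned above, the absolute dose at a POI and the ratio of dose below to dose above the heterogeneity, can be written out explicitly. The following is an illustrative sketch only: the function names are hypothetical, and the numbers in the usage lines are made-up placeholders, not values from Table 2.

```python
def percent_difference(calculated, measured):
    """Percent difference of calculation relative to measurement."""
    return 100.0 * (calculated - measured) / measured

def dose_ratio(dose_below, dose_above):
    """Ratio of dose below the heterogeneity to dose above it."""
    return dose_below / dose_above

# Placeholder doses in cGy (hypothetical, for illustration only).
meas_above, meas_below = 100.0, 60.0
calc_above, calc_below = 101.0, 61.0

# Absolute comparison at each POI.
print(percent_difference(calc_above, meas_above))
print(percent_difference(calc_below, meas_below))

# Ratio comparison: a factor common to both points cancels here.
print(percent_difference(dose_ratio(calc_below, calc_above),
                         dose_ratio(meas_below, meas_above)))
```

Because a ratio cancels any multiplicative factor common to both points (for example an output or calibration offset), the absolute comparison is sensitive to errors that the ratio cannot detect.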
Table 2. Measured and calculated point dose for heterogeneous CIRS phantom.
3.3. Photon Beams: IMRT/VMAT Dose Validation
We measured PDD and output for small MLC-shaped fields from 1 × 1 cm² to 5 × 5 cm² using a diode. The differences between calculation and measurement were all within 3%. The IMRT QA tests for the TG-119 cases passed with both MapCHECK and ion chamber measurements. For both Elekta Infinity and TrueBeam, our IROC H & N phantom tests had pass rates over 90% for the gamma index of 7% and 4 mm, and TLD dose errors within 4%. On our TrueBeam, both IMRT and VMAT QA using ArcCHECK had pass rates greater than 95% for the gamma index of 2% and 2 mm, including large-field GYN cases (Y jaw ~ 35 cm), which we attribute to the quality of the beam data and the fine models. On our Elekta Infinity, unfortunately, the initial model had an issue: about 50% of VMAT cases failed the 2%/2 mm gamma index on ArcCHECK (pass rates below 90% even at 3%/3 mm with a 10% threshold, global gamma index and measurement uncertainty off), although the majority of IMRT cases could pass at 90%. Interestingly, the agreement for the basic photon beam and MLC tests on Elekta Infinity was similar to or better than that on TrueBeam. After exhausting the investigation of planning and measurement techniques, we had to adjust and update the model with new beam data for small MLC fields (re-measured with a diode instead of an ion chamber), after which all patient-specific VMAT QA passed. The profiles of the new beam data appear slightly sharper in the penumbra and lower in the tails. A detailed analysis and resolution of this finding will be presented as a separate study in conjunction with QA measurement techniques.
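For readers less familiar with the gamma index quoted above, the sketch below computes a global gamma pass rate for a one-dimensional profile. It is a deliberately simplified illustration and not the ArcCHECK or IROC analysis: it assumes that measurement and calculation share the same position grid, searches only over the sampled calculation points (no interpolation), and normalizes the dose criterion to the maximum measured dose. The function name and arguments are hypothetical.

```python
import numpy as np

def gamma_pass_rate(pos_mm, measured, calculated,
                    dose_pct=2.0, dta_mm=2.0, threshold_pct=10.0):
    """Global 1D gamma pass rate (%) on a shared position grid."""
    pos = np.asarray(pos_mm, dtype=float)
    meas = np.asarray(measured, dtype=float)
    calc = np.asarray(calculated, dtype=float)

    # Global normalization: dose criterion as a fraction of max measured dose.
    dose_tol = dose_pct / 100.0 * meas.max()
    # Evaluate only measured points above the low-dose threshold.
    evaluate = meas >= threshold_pct / 100.0 * meas.max()

    # For every evaluated measured point, search all calculated points.
    dist = (pos[evaluate, None] - pos[None, :]) / dta_mm
    diff = (meas[evaluate, None] - calc[None, :]) / dose_tol
    gamma = np.sqrt(dist**2 + diff**2).min(axis=1)

    # A point passes when its minimum gamma does not exceed 1.
    return 100.0 * np.mean(gamma <= 1.0)
```

Calling `gamma_pass_rate(pos, meas, calc, dose_pct=3.0, dta_mm=3.0)` would correspond to the 3%/3 mm criterion with a 10% threshold; a finer calculation grid or interpolation would be needed for a quantitative result.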
3.4. Electron Beam Validation
Figure 6 shows both in-plane and cross-plane dose profiles for the oblique electron beam. There is a sizable disagreement in the high dose region (in-field) between calculation and measurement. The impact of the central axis tilt on depth dose and penumbra was observable. The maximum discrepancy in absolute dose was up to 8%, around the lateral shoulder of the cross-plane profile for 20 MeV at an extended SSD of 105 cm and a depth of 5 cm. The tolerance in MPPG 5.a. for oblique electron beams is 5%. The pencil beam model could be tuned to reduce the difference, but we consider the model clinically acceptable since the discrepancy is no more than 5% for most energies and depths. Besides, we use manually calculated monitor units (MU) for electron beams instead of those from the Pinnacle TPS.
Figure 6. (a) Test 8.2: electron cross-plane (left panel) and in-plane (right panel) profiles in absolute dose (cGy). The X axis is the distance relative to the central axis (X = 0) at the specified depth; the upper panels are for 9 MeV at a depth of 2.5 cm and the lower panels for 20 MeV at a depth of 5.0 cm. Red line: Pinnacle calculation; blue line: ion chamber measurement; green line: diode measurement. (b) Cross-plane profiles of the ion chamber measurement (dashed line) scaled to that of the diode measurement (solid green). The left panel is for 9 MeV at a depth of 2.5 cm and the right panel is for 20 MeV at a depth of 5.0 cm.
Table 3 gives a brief overview of our results from all the validation tests we performed. Readers are referred to the text of this article and to MPPG 5.a. for details. In general, all the measurements were reliable and repeatable, and consistent when compared between machines of the same type, e.g., Varian 2100EX vs. TrueBeam.
Table 3. Summary of the results for our validation tests in comparison with the recommended evaluation criteria.
The basic TPS photon beam evaluation methods and tolerances recommended by MPPG 5.a. are 2% with one parameter change or 5% with multiple parameter changes on relative dose in the high dose region, 3 mm distance to agreement (DTA) in the penumbra region, and 3% of the maximum field dose in the low-dose tail. Validation by comparison of absolute point dose has the advantage of identifying detailed discrepancies and providing confidence in End-to-End results. The most probable source of error is the accuracy of the measurement technique, including the setup. The centering of the ion chamber can be easily corrected by the scanning software. With carefully performed measurements, we should be able to reveal the limitations of either measurement or calculation. For example, the spike seen in the calculation at deeper depths for Test 5.8 might be related to scatter from a couple of MLC leaves protruding toward the center of the field. This feature is not resolved in the measurement, probably because of the volume averaging of the ion chamber. A diode has higher spatial resolution, but we were unable to obtain the desired smooth profile even with a slow-speed or point-by-point scan, likely because of the poor signal-to-noise ratio (SNR) at large depth. Further efforts with a high-quality diode/electrometer or with films are encouraged to see whether such a fine feature as observed in the calculation can be resolved in measurement.
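The region-based tolerances listed above can be applied to a single profile as sketched below. This is an illustration under stated assumptions rather than a procedure defined by MPPG 5.a.: the region boundaries (80% and 20% of the maximum measured dose), the use of local percent difference in the high dose region, and all function names are choices made here for the example. DTA is found by linear interpolation of the calculated profile.

```python
import numpy as np

def distance_to_agreement(pos, calc, x0, d0):
    """Distance (mm) from x0 to the nearest position where the linearly
    interpolated calculated profile equals the dose d0."""
    s = calc - d0
    best = np.inf
    for i in range(len(pos) - 1):
        if s[i] == 0.0:
            x = pos[i]
        elif s[i] * s[i + 1] < 0.0:
            # Linear interpolation of the crossing between two samples.
            x = pos[i] + (pos[i + 1] - pos[i]) * s[i] / (s[i] - s[i + 1])
        else:
            continue
        best = min(best, abs(x - x0))
    if s[-1] == 0.0:
        best = min(best, abs(pos[-1] - x0))
    return best

def check_profile(pos_mm, measured, calculated, high=0.8, low=0.2):
    """Worst-case deviation per region; the thresholds are assumptions."""
    pos = np.asarray(pos_mm, dtype=float)
    meas = np.asarray(measured, dtype=float)
    calc = np.asarray(calculated, dtype=float)
    d_max = meas.max()
    rel = meas / d_max
    in_field = rel >= high          # high dose region
    tail = rel < low                # low-dose tail
    penumbra = ~in_field & ~tail    # everything in between
    diff = np.abs(calc - meas)
    dta = [distance_to_agreement(pos, calc, x, d)
           for x, d in zip(pos[penumbra], meas[penumbra])]
    return {
        "high_dose_max_pct": float(
            np.max(100.0 * diff[in_field] / meas[in_field], initial=0.0)),
        "penumbra_max_dta_mm": float(max(dta, default=0.0)),
        "tail_max_pct_of_max": float(
            np.max(100.0 * diff[tail] / d_max, initial=0.0)),
    }
```

The returned values can then be compared with 2% (or 5%), 3 mm and 3%, respectively. Reporting the worst case per region keeps the output compact; a per-point report would be more useful for locating a discrepancy such as the spike in Test 5.8.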
The accuracy of dose measurement is subject to a number of factors, including but not limited to the calibration and response of the detector, the measurement setup, and the conversion of the signal reading to absorbed dose. An ion chamber is a simple and accurate device for measuring absolute dose at a point. A question can be raised about the conversion of the charge reading to absorbed dose in different media. For photon beams, the electron energy, and thus the stopping power ratio, is essentially not depth dependent; therefore depth-ionization ≈ depth-dose. For electron beams, the electron energy, and thus the stopping power ratio, is depth dependent; therefore stopping power ratios need to be applied when converting from depth-ionization to depth-dose.
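The relation between depth-ionization and depth-dose described above can be written compactly. The notation is introduced here for illustration and is not taken from the original text: M(d) is the corrected chamber reading at depth d, and the bracketed term is the water-to-air restricted mass collision stopping power ratio at that depth. Perturbation and fluence corrections are ignored in this simplified form.

```latex
% Dose to water is proportional to the chamber reading times the
% water-to-air stopping power ratio at the same depth:
\[
  D_{\mathrm{w}}(d) \;\propto\; M(d)\,
  \left(\frac{\bar{L}}{\rho}\right)^{\mathrm{w}}_{\mathrm{air}}\!(d)
\]
% Percent depth dose, normalized to the maximum of the product:
\[
  \mathrm{PDD}(d) \;=\; 100 \times
  \frac{M(d)\,\left(\bar{L}/\rho\right)^{\mathrm{w}}_{\mathrm{air}}(d)}
       {\max_{d'}\!\left[\, M(d')\,\left(\bar{L}/\rho\right)^{\mathrm{w}}_{\mathrm{air}}(d') \right]}
\]
% Photon beams: the ratio is nearly constant with depth, so it cancels
% and PDD(d) is approximately 100 * M(d) / max M.
```

For electron beams the ratio varies with depth and does not cancel, which is why the depth-ionization curve must be corrected point by point.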
For the photon beam heterogeneity correction validation tests, we derived the dose from the in-water calibration. The dose to the solid phantom should therefore take into account the conversion from dose-to-water to dose-to-muscle, which is a difference of about 1%. For electron beams, stopping power ratios may change by a couple of percent over the depth range to R50, particularly for higher electron energies. We see little effect of a non-linear relationship between charge and dose, since the ion chamber profile, once scaled, is nearly identical to that of a diode (Figure 6(b)), which is not depth dependent. In profiles, the difference between an ion chamber and a diode in an electron beam is expected to be small and due to the volume averaging effect; the real difference, due to the stopping power effect, is in depth-ionization vs. depth-dose. We do observe a sizeable disagreement in in-field dose between the two measurements, particularly for in-plane profiles, due mainly to the different calibration and response of the ion chamber and the diode. There is a subtle difference at the shoulder of the cross-plane profiles between calculation and measurement, which might be explained by a limitation of the pencil beam algorithm related to source modeling: an imperfectly constructed source distribution can cause deviations in the shoulder/penumbra region. Additional measurements can be performed in the future to investigate the effect of the depth dependence of electron beam energy and the accuracy of electron dose calculation at oblique angles.
The dose calculations in homogeneous media were all performed using a virtual water phantom within the planning module, which is technically acceptable. MPPG 5.a. may instead intend a CT-based phantom with bulk water density, to simulate the clinical use of the system. With heterogeneity correction turned on in the calculation, a difference of ≥ 0.5% can be observed between the phantoms (water vs. medium), e.g., in the dose at a depth of 10 cm under reference conditions. As pointed out by the guideline, some heterogeneity dose calculation algorithms (e.g., Monte Carlo and grid-based Boltzmann solvers, GBBS) directly calculate dose to the material within the voxel (“dose to medium”). This can be converted to “dose to water” through the application of stopping power ratios, with the goal of reproducing conventional (e.g., convolution/superposition, C/S) TPS doses. However, this stopping power-based conversion has actually been found to decrease dosimetric agreement with conventional TPS doses in most cases, leading to “dose to medium” being recommended.
IMRT/VMAT dose validation has the least consensus among medical physicists and remains controversial. Despite widespread IMRT utilization, accurate dosimetric commissioning of an IMRT system remains a challenge. In the most recent report from IROC Houston, only 82% of the institutions passed the credentialing end-to-end test with the anthropomorphic head and neck phantom, and the conclusion was that institutional QA results were not correlated with unacceptable plan delivery. That IROC test used rather lenient dose-ratio and distance-to-agreement (DTA) criteria of 7% and 4 mm, respectively. Only 69% of the irradiations passed a narrowed TLD dose-error criterion of 5%. This raises a question about the sensitivity and reliability of specific IMRT/VMAT QA dosimeters and analysis methods. In the validation of our Elekta Infinity, however, the problem was the other way around: we passed the IROC head and neck phantom test well but failed patient-specific QA for about 50% of clinical VMAT cases. We believe that a substantial share of the failures in IMRT/VMAT validation is related to the fundamentals of TPS commissioning. Our experience showed that the acquisition and modeling of small MLC fields, particularly in the tail region, are critical to the IMRT/VMAT model. The issue might be related to the leaf gap model in the MLC configuration of this particular Elekta linac. A detailed discussion of IMRT/VMAT QA criteria is beyond the scope of this article, but further investigation is underway to better understand how the criteria of the validation tests correlate with any potential deficiency of the model.
The extensive validation tests recommended by MPPG 5.a. are meant to establish the accuracy and limitations of a commissioned dose algorithm before it is implemented in the clinic. MPPG 5.a. adapted the evaluation methods and tolerances for most validation tests from published AAPM task group reports and from the criteria used by IROC Houston. Evaluation methods need to be explored further in relation to the refinement of a model and the optimization of the recommended testing methodologies. Validation tests for IMRT/VMAT are largely independent of those for basic photon beams, and we hope that tests can be developed for the direct diagnosis of deficiencies in IMRT/VMAT delivery. Our validation tests have two clinical implications: a VMAT model needs to be carefully tested over a variety of planning cases, and electron beam calculations using the pencil beam algorithm have limited accuracy for oblique incidence and heterogeneous media. Above all, the uncertainty and efficiency of the measurements should be well understood. The experience presented here is a learning process in how validation tests can be performed effectively for a dose calculation model.