The efficient and effective movement of freight is a critical component in the transformation and growth of the economy. Transportation infrastructure improvements can assist in moving freight and growing the economic vitality of a community. To identify the best investment decisions, travel demand models, which are representations of the existing transportation infrastructure, are used to evaluate “what if” scenarios for future socio-economic growth and infrastructure decisions.
This paper examines the potential to use a database of freight origin/destination locations in a medium sized community to develop a calibrated truck trip generation equation. Medium sized communities are taken to be between 200,000 and 1,000,000 people for the focus of this work. The relationship between truck stops and employment was developed and then tested to determine the validity of the equations for use in practice and the level of accuracy associated with the use in the community travel demand model. The paper concludes that using a 20 sector model for truck trip generation can provide the level of detail necessary to incorporate truck transportation needs into the travel demand models, improve the results, and potentially lead to improved investment decisions for the community.
2. Literature Review of Freight Transportation
Truck demand modeling issues are prevalent, as the development of methods for freight forecasting has lagged behind that of passenger transportation forecasting in both theoretical and simulation modeling analyses (Samimi et al., 2010; Alho, 2011; Fischer et al., 2005; Jong et al., 2004; Tavasszy & Jong, n.d.; Hunt & Stefan, 2007; Jansuwan et al., 2014; Yang et al., 2010; North, 2009; Wheeler & Figliozzi, 2011; Chow et al., 2010; Cohen et al., 2008; Holguin-Veras et al., 2011). Additionally, data collection and availability are proving to be a large obstacle in the development and improvement of truck demand models (Bastida & Holguin-Veras, 2009; Samimi et al., 2010; Hunt & Stefan, 2007; Ruan et al., 2011; Greaves & Figliozzi, 2008; Roorda et al., 2010). Fortunately, some research studies have tested the use of GPS in collecting better data for modeling and found that it increases the quality of the models (Kuppam et al., 2014; Wheeler & Figliozzi, 2011; Greaves & Figliozzi, 2008; Doustmohammadi & Sisiopiku, 2016; Doustmohammadi et al., 2016b). Despite these issues, researchers have been working on several variations of truck demand models. Activity-based, or tour-based, modeling approaches have been shown to provide the most accurate movement predictions (Samimi et al., 2010; Figliozzi, 2007; Kuppam et al., 2014; Gliebe et al., 2007; Hunt & Stefan, 2007; Ruan et al., 2011; Kim et al., 2011; Bradley et al., 2010; Doustmohammadi & Sisiopiku, 2016; Doustmohammadi et al., 2016b). Additionally, combining modeling techniques, primarily combining tour-based modeling with other types of modeling, can meet specific needs (Alho, 2011; Fischer et al., 2005; Jansuwan et al., 2014; Holguín-Veras & Thorson, 2000; Holguín-Veras & Patil, 2008; Daly, 1982; Doustmohammadi et al., 2016a; Doustmohammadi et al., 2016b; Holguín-Veras et al., 2008; Holguín-Veras et al., 2013; Jong & Ben-Akiva, 2007).
Many medium sized communities are still using traditional four-step modeling techniques. The initial step in the model, trip generation, is the key area where advances in truck modeling can be made. The Quick Response Freight Manual (QRFM) and its updated version, QRFM II, were prepared to provide tools to assist transportation professionals in truck modeling (Cambridge, 1996; Cambridge, 2007). The models in QRFM have been shown to be useful in forecasting truck trips in a medium sized community (Anderson et al., 2013; Cambridge, 2007). However, more work can be performed, and this paper seeks to develop new truck trip generation equations based on GPS collected truck origin and destination locations and zonal employment in the area.
3. Data and Methodology
The two main datasets used to create the truck trip generation equations were the GPS collected truck origins/destinations and employment data from the U.S. Census Bureau Longitudinal Employer-Household Dynamics (LEHD) program.
The GPS truck origin-destination data were collected from a sample of vehicles for four days during 2011. The data were purchased from the American Transportation Research Institute (ATRI) and contained over 4 million observations. Analysis of the data identified over 119,000 locations where a truck remained stationary long enough to allow for loading or unloading of freight. This dwell-time requirement was applied so that stops caused by traffic congestion and red lights would not be counted as freight stops in the database.
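The stop identification described above can be illustrated with a minimal sketch. The paper does not report the dwell-time or distance thresholds it used, so the `min_dwell_s` and `max_move_m` defaults below, the `Ping` record layout, and the function name are all hypothetical choices made only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Ping:
    truck_id: str
    t: float  # seconds since the start of the observation period
    x: float  # projected easting, metres
    y: float  # projected northing, metres


def find_stops(pings, min_dwell_s=300.0, max_move_m=50.0):
    """Return (truck_id, x, y, dwell_seconds) for each stationary episode.

    Both thresholds are illustrative assumptions, not values from the paper.
    A long minimum dwell is what screens out red lights and congestion.
    """
    by_truck = {}
    for p in sorted(pings, key=lambda p: (p.truck_id, p.t)):
        by_truck.setdefault(p.truck_id, []).append(p)

    stops = []
    for truck_id, track in by_truck.items():
        start = 0  # index of the first ping in the current episode
        for i in range(1, len(track) + 1):
            # The episode ends at the end of the track or when the truck
            # has moved farther than max_move_m from the episode's start.
            ended = i == len(track) or (
                ((track[i].x - track[start].x) ** 2
                 + (track[i].y - track[start].y) ** 2) ** 0.5 > max_move_m
            )
            if ended:
                dwell = track[i - 1].t - track[start].t
                if dwell >= min_dwell_s:
                    stops.append((truck_id, track[start].x, track[start].y, dwell))
                start = i
    return stops
```

Each returned stop could then be assigned to a census block in a GIS; that spatial join is outside the scope of this sketch.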
The employment data were collected for all Birmingham, AL area census blocks. The LEHD data for 2010 included employment by 2-digit NAICS sector (see Table 1).
The employment data were combined with the GPS data for truck origins/destinations in a geographic information system (GIS) to determine the number of trucks that stopped in each census block and the distribution of employment in the block. The combined data were then used to develop equations relating the number of trucks having an origin/destination in the census block to the number and type of jobs in the block.
The truck trip generation equations were developed using standard statistical methodologies for relating independent variables, employment by NAICS sector, with the dependent variable, the number of truck trips into and out of the location. Two different modeling techniques were applied to the data: Stepwise Linear Regression and Bayesian Linear Regression. The techniques represent common statistical regression methodologies for model development. For modeling purposes, a random sample of 70 percent of the blocks were used for model development and 30 percent of the blocks were used for validation of the model.
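As a rough illustration of the development/validation workflow above, the following sketch performs a random 70/30 split and fits an ordinary least squares equation. It is a simplification: the paper used stepwise and Bayesian linear regression, whereas this sketch uses plain least squares with no variable selection and no priors. The function names, the inclusion of an intercept, and the fixed random seed are assumptions.

```python
import random


def train_test_split(rows, train_frac=0.7, seed=0):
    """Randomly split rows into development and validation subsets."""
    rng = random.Random(seed)
    shuffled = list(rows)
    rng.shuffle(shuffled)
    cut = int(round(train_frac * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]


def fit_ols(X, y):
    """Least squares with an intercept, via the normal equations.

    X is a list of rows (employment by sector for one block) and y is the
    list of truck stops per block. Returns [intercept, b1, b2, ...].
    """
    A = [[1.0] + [float(v) for v in row] for row in X]
    k = len(A[0])
    # Augmented matrix [A^T A | A^T y]
    M = [
        [sum(a[i] * a[j] for a in A) for j in range(k)]
        + [sum(a[i] * yi for a, yi in zip(A, y))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(M[r][c]))
        if abs(M[p][c]) < 1e-12:
            raise ValueError("predictors are collinear; cannot solve")
        M[c], M[p] = M[p], M[c]
        for r in range(k):
            if r != c:
                f = M[r][c] / M[c][c]
                M[r] = [vr - f * vc for vr, vc in zip(M[r], M[c])]
    return [M[i][k] / M[i][i] for i in range(k)]


def predict(coef, row):
    """Apply a fitted equation to one block's employment vector."""
    return coef[0] + sum(b * v for b, v in zip(coef[1:], row))
```

In use, the equation would be fitted on the 70 percent subset and `predict` applied to the withheld 30 percent to obtain the values compared against observed truck stops.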
4. Model Results and Validation
Table 1. NAICS data categories.
The models developed from the different regression techniques were analyzed based on accuracy. For the Stepwise Linear Regression and Bayesian Linear Regression models, Table 2 shows the equations obtained and the R-squared value.
For the model validation, the 30 percent of the data withheld from model development were used to examine the model accuracy. The validation was performed by plotting the results and calculating the Percent Root Mean Square Error (%RMSE). The %RMSE was calculated following Etz (2018).
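For readers unfamiliar with the measure, the sketch below shows one common definition of %RMSE: the root mean square error expressed as a percentage of the mean observed value. The exact expression used in the paper is not reproduced in this text, so the form below (including division by n rather than n - 1, which some travel demand modeling guidance uses) is an assumption.

```python
import math


def percent_rmse(observed, modeled):
    """Root mean square error as a percentage of the mean observed value.

    Assumes the n-denominator form; an (n - 1) denominator is also common.
    """
    n = len(observed)
    if n == 0 or n != len(modeled):
        raise ValueError("inputs must be non-empty and of equal length")
    mean_obs = sum(observed) / n
    if mean_obs == 0:
        raise ValueError("mean of observed values is zero")
    rmse = math.sqrt(sum((m - o) ** 2 for o, m in zip(observed, modeled)) / n)
    return 100.0 * rmse / mean_obs
```

Under this definition a value of 0 indicates a perfect match, and a value of 100 indicates a typical error as large as the average observed count.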
The validation scatter plots for the 20 sector Linear model and 20 sector Bayesian model are shown in Figure 1. The model accuracy, as determined by the %RMSEs, was calculated as 71.2 for the Linear model and 69.9 for the Bayesian model.
5. Accuracy of the Trip Generation Equations
Table 2. Model results and R-squared value.
Figure 1. Validation plot for 20 sector models.
To test the accuracy of the two 20 sector trip generation equations developed in this study, the number of trucks observed on the roadways in Birmingham was compared with the number of trucks assigned to the Birmingham travel demand model network. This step was necessary because the accuracy of the equations must be judged by how well they forecast truck trips when applied in the community model. The travel demand model for Birmingham was obtained and the truck trip generation equations were applied using the LEHD block-level employment data, aggregated as necessary to the model traffic analysis zones. For external stations, the number of trucks entering and leaving the study area was determined from traffic counts and assigned to each location in the model. The distribution of pass-through trucks was taken from the travel demand model methodology for dividing external-internal trips and external-external trips. For all zones, intrazonal truck trips were not allowed.
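The zone-level application step described above might be sketched as follows. The coefficient values in any call are placeholders supplied by the user, since the fitted equations of Table 2 are not reproduced in this text. The function names, the block-to-zone lookup, and the clamping of negative results to zero are all assumptions.

```python
from collections import defaultdict


def aggregate_to_zones(block_employment, block_to_zone):
    """Sum block-level employment by NAICS sector into traffic analysis zones.

    block_employment: {block_id: {sector_code: jobs}}
    block_to_zone:    {block_id: zone_id} (hypothetical lookup table)
    """
    zones = defaultdict(lambda: defaultdict(float))
    for block, by_sector in block_employment.items():
        zone = block_to_zone[block]
        for sector, jobs in by_sector.items():
            zones[zone][sector] += jobs
    return {zone: dict(by_sector) for zone, by_sector in zones.items()}


def truck_trip_ends(zone_employment, coefficients, intercept=0.0):
    """Apply a linear trip generation equation to each zone.

    Sectors missing from `coefficients` contribute nothing. Negative
    results are clamped to zero, which is an assumption of this sketch.
    """
    return {
        zone: max(
            0.0,
            intercept + sum(coefficients.get(s, 0.0) * jobs
                            for s, jobs in emp.items()),
        )
        for zone, emp in zone_employment.items()
    }
```

Note that an equation with a non-zero intercept fitted at the block level does not aggregate exactly to zones, which is one reason the default intercept here is zero.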
The truck trips were distributed to the network using a gravity model and friction factors designed for non-home-based trips. Finally, an assignment was performed using an equilibrium assignment technique. However, as this study is solely focused on truck trips, only truck trips were assigned to the network. The results of the assignment were compared to the actual truck counts, as shown in Figure 2, where the X-axis represents the observed truck counts and the Y-axis represents the model-assigned truck volumes. The %RMSEs for the two methodologies were calculated as 47.262 for the Linear model and 47.147 for the Bayesian model.
From the %RMSE values and model validation plots, there is no clear difference between the two models. This is to be expected, as the parameters in the models were very similar and the key difference lies in the number of sectors used in the analysis.
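The distribution step can be illustrated with a minimal production-constrained gravity model. The friction factors actually used in the Birmingham model are not given in this text, so the exponential form and the `beta` value below are assumptions, as are the function and argument names. Intrazonal trips are set to zero, matching the restriction stated earlier.

```python
import math


def gravity_distribute(productions, attractions, travel_time, beta=0.1):
    """Production-constrained gravity model with intrazonal trips disallowed.

    T[i][j] = P[i] * A[j] * F(t_ij) / sum_k A[k] * F(t_ik), with F(t_ii) = 0.
    The friction factor F(t) = exp(-beta * t) is an illustrative assumption.
    """
    n = len(productions)
    trips = [[0.0] * n for _ in range(n)]
    for i in range(n):
        weights = [
            0.0 if i == j
            else attractions[j] * math.exp(-beta * travel_time[i][j])
            for j in range(n)
        ]
        total = sum(weights)
        if total == 0.0:
            continue  # no reachable attractions; leave the row at zero
        for j in range(n):
            trips[i][j] = productions[i] * weights[j] / total
    return trips
```

Each row of the resulting matrix sums to the zone's productions. A doubly constrained version would additionally iterate to match attractions, which this sketch omits.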
6. Transferability of the Trip Generation Equations
To test the transferability of the two 20 sector truck trip generation equations, the travel demand models for Huntsville, AL and Montgomery, AL were used. The LEHD data for these two locations were collected and the two 20 sector truck trip generation equations were used to determine the number of truck productions and attractions for each zone. The external station truck numbers were obtained from traffic counts and a similar adjustment to remove pass-through trucks was performed.
The trucks assigned using the methodology and the actual number of trucks on the roadway were compared. The results of the comparison for Huntsville are shown in Figure 3 and Figure 4, and the results of the comparison for Montgomery are shown in Figure 5 and Figure 6. The %RMSEs for Huntsville were calculated as 80.21 for the Linear model and 87.06 for the Bayesian model, and for Montgomery as 84.35 for the Linear model and 78.41 for the Bayesian model.
Figure 2. Model validation plot for 20 sector models.
Figure 3. Transferability plot for Huntsville using the linear model.
Figure 4. Transferability plot for Huntsville using the Bayesian model.
Figure 5. Transferability plot for Montgomery using the linear model.
Figure 6. Transferability plot for Montgomery using the Bayesian model.
Figure 7. Difference in Employment between Huntsville and Montgomery.
Examining the two communities used for model transferability, the key difference is the distribution of employment within the two cities. Figure 7 shows the employment in the urbanized areas for the two cities using the 20 sectors. As can be seen, Huntsville has very high employment in Sector 54 (Professional, Scientific and Technical Services) and Sector 62 (Health Care and Social Assistance), which account for 19 percent and 16 percent of the total employment, respectively. Montgomery has very high employment in Sector 92 (Public Administration), accounting for 16 percent of the total employment.
The additional sectors included in the Bayesian model may have benefitted Montgomery, as several of its higher-employment sectors are not included in the Linear model. All of the sectors related to trade, transportation and warehousing, which together account for 18 percent of the total employment in Montgomery, are included only in the Bayesian model. Because these and other higher-employment sectors are absent from the Linear model, it generates fewer truck trips for Montgomery, leading to under-assignment of trucks in the community. This is consistent with the higher %RMSE of the Linear model there (84.35 versus 78.41 for the Bayesian model).
The results of the study show that the combination of truck GPS data and LEHD employment data can be used to create new truck trip generation equations for medium sized communities. Both 20 sector models validated well statistically and in practice for Birmingham, with nearly identical results. The final decision on the model selected rests with the community performing the modeling. However, the recommendation is the 20 sector Bayesian model, which has the advantage of including all sectors of employment. This makes it applicable in more communities, because high employment in the sectors not included in the Linear model would reduce that model's effectiveness.
Overall, the contribution of this paper is a truck trip generation model that can be used in medium sized communities (populations between 200,000 and 1,000,000) and the presentation of the means to develop a community specific truck trip generation model. While some researchers will claim that the results in this paper are not as accurate as necessary, using the models presented in this work will be an improvement over doing nothing and/or waiting until perfect truck data are available. The choice between the models presented in this paper and the methodology followed in this work to develop community specific models, if GPS truck data are available, is left to the agency. Including truck trips in the model is important, as the correct forecasting of truck trips is vital for the economic growth of the community.