Reference-Dependent Preferences and the Labor Supply of Chinese Drivers


1. Introduction

There is a vast body of studies in the economic literature focusing on the wage elasticity of labor supply. Neoclassical models of labor supply predict that work hours should respond positively to transitory positive wage changes, as workers intertemporally substitute labor and leisure, working more when wages are high and consuming more leisure when wages are low. While this prediction is straightforward, it is difficult to find empirical support for it. The empirical evidence has been surveyed extensively (for example, Blundell and MaCurdy, 1999), and a summary of the findings is that wage elasticities of labor supply are generally very small, often not significantly different from zero, and sometimes even negative.

One criticism of this literature is that the standard neoclassical models assume that workers can choose their work hours in response to transitory wage changes, or alternatively, can select a job with the optimal wage-hours combination from a joint distribution of jobs. However, actual wage changes are rarely transitory, so the hypothesis of intertemporal substitution must be tested jointly along with the auxiliary assumption of persistent wage shocks. As a result, the insignificant or negative wage elasticity of labor supply can plausibly be attributed to specification errors.

The ideal test of labor supply responses to transitory wage changes would use a context in which wages are relatively constant within a short period but uncorrelated across periods. In such a case, dynamic optimization models predict a positive relationship between wages and hours worked, because short-period wage changes have a negligible impact on life-cycle wealth (see, for example, MaCurdy, 1981).

For this purpose, drivers, as one group of workers, are a particularly appropriate research subject.

The most apparent advantage is that drivers face wages that fluctuate within a short period due to demand shocks caused by many factors, such as weather, traffic, day-of-the-week effects, holidays, and conventions. Although rates per hour/mile/job are set, during busy periods, drivers spend less time searching for customers and jobs and thus earn a higher hourly/daily wage. The wages tend to be correlated within the short periods and uncorrelated across periods.

Another advantage of focusing on drivers is that they can choose the number of hours they work each period, unlike most workers facing fixed work hours, e.g., eight hours per day and five days per week. In sum, such a study can be easily generalized to other types of workers who have the freedom to choose work hours/days or even the targeted customers, but a necessary condition is that there exist transitory wage changes.

In this paper, we use a comprehensive dataset of taxi drivers in Chengdu, China. Our dataset overcomes the aforementioned problems of the NYC taxi driver dataset. People usually do not tip taxi drivers in China, and fare information is automatically recorded by the meters on taxis. Hence, the fares recorded in our dataset are a very accurate measure of income earned. The dataset contains over 14 thousand taxis. For each taxi, there is an observation every minute, including its location and status (with or without passengers). There are more than one billion observations in total. We further combine these minute observations into trips and calculate the duration and fare earned for each trip.

Based on this comprehensive dataset, we perform empirical analyses. First, we run an OLS regression of working hours on the hourly rate earned for that day. Neoclassical theory predicts the coefficient on the hourly rate to be positive, so a negative coefficient does not support the neoclassical model. Our estimation results show that the hourly rate has a negative effect on working hours, and this relationship is statistically significant after controlling for a variety of fixed effects, including taxi fixed effects, weather fixed effects, day-of-week fixed effects, and week fixed effects. In terms of economic significance, a one standard deviation increase in the hourly rate is associated with a decrease in working hours of around 9% of its standard deviation.

2. Empirical Models of the Labor Supply Decisions

2.1. OLS Estimation of Wage Elasticity of Labor Supply

The wage elasticity of labor supply can be estimated through a simple OLS regression, and the results can show a broad picture of how the daily hours worked are correlated with hourly earning opportunities for that day. The hourly earning opportunities can be computed as a fixed daily wage rate using total fare income divided by hours worked. The regression with one observation for each shift takes the following form

$\mathrm{ln}{H}_{it}=\eta \mathrm{ln}{W}_{it}+{X}_{it}\beta +{\epsilon}_{it},$ (1)

where

H_{it} represents the hours worked by driver i at day t;

W_{it} = Y_{it}/H_{it} and Y_{it} is the total fare income of driver i at day t;

X_{it} are other factors affecting labor supply;

ε_{it} is a random component with a standard normal distribution.

The parameter η measures the wage elasticity of labor supply, and neoclassical models predict η to be positive. An important econometric problem with this approach is that the estimate relies on there being significant exogenous transitory day-to-day variation in the average wage. An accurate estimate of η depends on this variation. However, it is hard to see a source of legitimate variation in the average hourly wage in the real data.
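As a rough illustration of Equation (1) without the controls X_{it}, the sketch below regresses ln H on ln W, where W = Y/H, using a hand-written one-regressor OLS. The function names and the five example shifts are hypothetical and are not taken from the paper's data or code.

```python
import math


def ols_slope(x, y):
    """Return (slope, intercept) of a one-regressor OLS fit of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def wage_elasticity(income, hours):
    """Estimate eta in ln H = eta * ln W + const, with W = Y / H."""
    ln_h = [math.log(h) for h in hours]
    ln_w = [math.log(y / h) for y, h in zip(income, hours)]
    eta, _ = ols_slope(ln_w, ln_h)
    return eta


# Hypothetical shifts: total fare income (CNY) and hours worked.
income = [400.0, 450.0, 500.0, 480.0, 520.0]
hours = [14.0, 13.0, 12.0, 13.5, 12.5]
eta = wage_elasticity(income, hours)
```

In this toy sample the shifts with higher hourly earnings are also the shorter ones, so `eta` comes out negative. The sketch omits the fixed effects and the clustered standard errors used in the actual estimation.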

2.2. Discrete-Choice Stopping Model

Alternatively, the model of driver daily labor supply can be estimated as a survival time model in which quitting can occur at discrete points in time. Without deriving a full dynamic solution to the optimal stopping problem, a simple discrete-choice problem can be implemented empirically as a reasonable approximation.

At any point s, a driver can calculate the forward-looking expected optimal stopping point, s*. The optimal stopping point can be a function of many factors, including hours worked and expectations about future earnings possibilities, etc. If daily income effects are important, the optimal stopping point can also be a function of income earned. A driver will stop at s if s ≥ s* so that s − s* ≥ 0.

A reduced-form representation of R(s) = s – s* is

${R}_{idc}\left(s\right)={\alpha}_{1}{h}_{s}+{\alpha}_{2}{y}_{s}+{X}_{idc}\beta +{u}_{i}+{\epsilon}_{idcs},$ (2)

where

i refers to driver;

d refers to the date;

c refers to the hour of the day;

h_{s} measures cumulative hours worked on the shift at s;

y_{s} measures cumulative income earned on the shift at s;

X_{idc} measures other determinants of the optimal stopping time.

The vector of X_{idc} includes weather, a set of fixed effects for hour of the day, day of the week, and location within a province/city. These variables are included to capture variation in earning opportunities from continuing to drive.

A driver stops driving at s if R_{idc}(s) ≥ 0. The coefficient α_{1} measures whether the probability of quitting is related to hours worked, and the coefficient α_{2} measures whether income earned is important in deciding when to quit.

2.3. Asymmetric Estimation of Reference-Dependent Preferences

After any trip p during a shift, a driver can calculate the forward-looking expected optimal stopping point. This is a function of many variables, including hours worked so far on the shift and variables affecting expectations about future earning possibilities. In addition, it could also be affected by the accumulated income in a nontraditional way: when the accumulated income is more than the reference income, there is a higher probability that the driver stops working. An empirical representation of this reference-dependent model is given as follows:

${C}_{ijp}={X}_{ijp}\beta +\delta I\left[{Y}_{ijp}>Y{T}_{ij}\right]+\gamma I\left[{H}_{ijp}>H{T}_{ij}\right]+{\epsilon}_{ijp},$ (3)

where

C_{ijp} represents the forward-looking expected optimal stopping point for driver i on shift j after trip p;

X_{ijp} is a vector of variables determining the optimal stopping time;

Y_{ijp} represents the cumulative income level for driver i on shift j at trip p;

I[Y_{ijp} > YT_{ij}] is an indicator equal to one if accumulated income is larger than the reference income level, and equal to zero otherwise;

H_{ijp} represents the cumulative working hours for driver i on shift j at trip p;

I[H_{ijp} > HT_{ij}] is an indicator equal to one if accumulated working hours is larger than the reference level, and equal to zero otherwise.

A positive value of δ represents the incremental probability of stopping work when the accumulated income is above the reference income level, and a positive value of γ represents the incremental probability of stopping work when the accumulated working hours are above the reference level. This model can easily be adapted to use only income or only working hours as the reference.

2.4. Data Construction

We use a comprehensive dataset of taxi drivers in Chengdu, China from August 3 to August 23, 2016. The dataset contains over 14 thousand taxis. For each taxi, there is an observation every minute from 6:00 am to 11:59 pm, including its location and status (with or without passengers). There are more than one billion observations in total.

Compared to the dataset used in studying NYC taxi drivers, our dataset has some advantages. It is collected through devices in taxis, which record all GPS location and fare information automatically, unlike the NYC taxi driver dataset, which involves transcribing handwritten receipts. The quality of our dataset is therefore better. Moreover, people usually do not tip taxi drivers in China. Hence, the fares recorded are a very accurate measure of income earned.

We construct a new dataset by combining these minute observations into trips. We identify a run of time slots as a trip based on status changes: a trip begins when the status changes from without passengers to with passengers and ends when it changes from with passengers to without passengers. The duration of each trip is calculated as the sum of the time slots within the trip. The distance of each trip is calculated as the sum of the distances traveled during its time slots. The speed is then calculated as the distance divided by the time. Most importantly, we calculate the fare earned during each trip based on the following rule: the fare starts at 8 CNY; the price is 1.9 CNY for every kilometer travelled between 2 km and 10 km and 2.85 CNY per km beyond 10 km; if the speed is lower than 12 km per hour, the time counts toward waiting time, and every 5 minutes of waiting time is counted as 1 km travelled.

As a robustness check, we refine the dataset and keep the information for one driver over each day. Following the standard literature, we identify driver shifts by the length of time a taxi remains without passengers. If a taxi remains without passengers for more than two hours, we treat this gap as a shift change, and we keep the information only for the first driver, from the beginning of the day to the time of the shift change. We acknowledge that identifying shifts accurately is difficult from an empirical perspective, and we rely on this method because it is commonly used in the taxi driver literature. We also try identifying shifts using longer gaps, and the results are all consistent.
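The trip construction and fare rule can be sketched as follows. This is a minimal illustration, not the authors' code. It assumes that each per-minute record is a pair `(occupied, km)`, that the 8 CNY starting fare covers the first 2 km, that distance covered during slow minutes is not billed separately, and that waiting minutes are converted to kilometers pro rata. The text does not state these details, and all names are hypothetical.

```python
from itertools import groupby

BASE_FARE = 8.0      # CNY; assumed here to cover the first 2 km
RATE_MID = 1.9       # CNY per km between 2 km and 10 km
RATE_LONG = 2.85     # CNY per km beyond 10 km
SLOW_KMH = 12.0      # below this speed a minute counts as waiting
WAIT_MIN_PER_KM = 5  # 5 waiting minutes are billed as 1 km


def split_trips(minutes):
    """Group per-minute (occupied, km) records into trips.

    Returns one list of per-minute distances per run of occupied minutes.
    """
    trips = []
    for occupied, run in groupby(minutes, key=lambda rec: rec[0]):
        if occupied:
            trips.append([km for _, km in run])
    return trips


def trip_fare(km_per_minute):
    """Fare in CNY for one trip, given the distance covered in each minute."""
    moving_km = 0.0
    waiting_min = 0
    for km in km_per_minute:
        if km * 60.0 < SLOW_KMH:  # speed in km/h for a one-minute slot
            waiting_min += 1      # assumption: slow minutes bill as waiting only
        else:
            moving_km += km
    billable_km = moving_km + waiting_min / WAIT_MIN_PER_KM
    fare = BASE_FARE
    fare += RATE_MID * max(0.0, min(billable_km, 10.0) - 2.0)
    fare += RATE_LONG * max(0.0, billable_km - 10.0)
    return fare


# Hypothetical day: a 10-minute trip, a vacant gap, then a trip with waiting.
minutes = ([(False, 0.3)] * 3 + [(True, 0.5)] * 10 + [(False, 0.0)] * 2
           + [(True, 0.5)] * 4 + [(True, 0.0)] * 5)
trips = split_trips(minutes)
fares = [trip_fare(t) for t in trips]
```

Under these assumptions, the first trip covers 5 km and the second covers 2 km plus five waiting minutes, which are billed as one extra kilometer.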
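The two-hour rule for identifying the first driver's shift can be sketched as below. The input is a hypothetical list of per-minute occupancy flags for one taxi-day. Ignoring vacant time before the first passenger is an assumption of this sketch and is not stated in the text.

```python
def first_shift_end(occupied, max_gap_min=120):
    """Return the minute index at which the first shift is taken to end.

    The shift ends at the start of the first vacant spell that lasts longer
    than max_gap_min minutes. If no such spell exists, the whole day is kept.
    """
    seen_passenger = False
    gap_start = None
    for minute, busy in enumerate(occupied):
        if busy:
            seen_passenger = True
            gap_start = None
        elif seen_passenger:  # assumption: leading vacancy is not a shift change
            if gap_start is None:
                gap_start = minute
            if minute - gap_start + 1 > max_gap_min:
                return gap_start
    return len(occupied)


# Hypothetical day: 10 busy minutes, a 121-minute gap, then 5 busy minutes.
day = [True] * 10 + [False] * 121 + [True] * 5
cutoff = first_shift_end(day)
first_shift = day[:cutoff]
```

Raising `max_gap_min` corresponds to the robustness variant in which shifts are identified with longer gaps.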

3. Empirical Results

The literature on labor supply consists of two major competing theories, the neoclassical theory and the reference-dependent theory. The empirical findings regarding these two theories are mixed and inconclusive. This paper undertakes a comprehensive study of taxi drivers’ labor supply behavior using a new dataset of taxi drivers from China. By conducting our study in a different setting from the literature, we hope to clarify the findings in the literature.

3.1. Evidence from the Wage Elasticity of Labor Supply

We perform a linear regression of working hours on the hourly rate earned for that day as discussed in the empirical model (1). Specifically, we regress Ln(Work Hour) on Ln(Hourly Rate) and control for a set of fixed effects. As we discussed earlier, neoclassic theory predicts the coefficient of Ln(Hourly Rate) to be positive, and a negative coefficient does not support the neoclassic model.

We first list the summary statistics of the related variables for each taxi over each day. We have a total of 197,573 taxi-day observations. The means of Total Income and Work Hour are 473.1 and 13.99, respectively. The Hourly Rate averages about 34.37, with a standard deviation of about 49.89. The large standard deviation of Hourly Rate relative to its mean indicates that we have sufficient variation in the main independent variable of interest. We run regressions in log form, and the results are unchanged if we use the levels of the variables instead of logarithms. Ln(Work Hour) has a mean of 2.570 and a standard deviation of 0.506. Ln(Hourly Rate) has a mean of 3.493 and a standard deviation of 0.274. These means and standard deviations are used later to calculate the economic significance of our regression results. The mean of Weather is 1.187, indicating that there are relatively more sunny days than rainy days during our sample period.

Table 1 reports the OLS estimation results on the effects of hourly rate on working hours. We report the t-statistics in parentheses, and standard errors are clustered by day. In column (1), the coefficient of Ln(Hourly Rate) is −0.135 and the t-statistic is −7.539, indicating that the result is significant at the 1% level. In column (2), we include Taxi Fixed Effects. These fixed effects control for differences in working hours due to the different working habits of taxi drivers, different effects from regular work locations, etc. The coefficient of Ln(Hourly Rate) is −0.157, and it is again significant at the 1% level.

Column (3) of Table 1 includes Taxi Fixed Effects, as well as Weather Fixed Effects, Day of Week Fixed Effects, and Week Fixed Effects. The Weather Fixed Effects control for variation in working hours caused by the weather of the day; for example, a rainy day and a sunny day might affect working hours differently. The Day of Week Fixed Effects account for differences between weekdays and weekends. The Week Fixed Effects control for differences from week to week. These fixed effects are comprehensive and leave the hourly rate as a main source of variation in working hours. The coefficient of Ln(Hourly Rate) is −0.164, with a large t-statistic of −9.351.

Table 1. The effects of hourly rate on working hours.

*, **, and *** denote statistical significance (two-tailed) at the 10%, 5%, and 1% levels, respectively.

All three columns show that Ln(Hourly Rate) has a negative effect on Ln(Work Hour), and the effect is statistically significant at the 1% level. We now consider the economic significance, using column (3) as an example. As shown in Table 2, the standard deviations of Ln(Work Hour) and Ln(Hourly Rate) are 0.506 and 0.274, respectively. A one standard deviation increase in Ln(Hourly Rate) leads to a decrease in Ln(Work Hour) of 0.045 (=0.164 × 0.274), a sizeable effect of 9% (=0.045/0.506) of its standard deviation. This evidence shows that hourly rates have not only a statistically significant effect but also a large economic impact in determining working hours for taxi drivers.

Taken together, drivers work less when wages go up, which is clearly the opposite of what neoclassical theory predicts. Our finding is nevertheless in line with the literature studying taxi drivers’ labor supply. Farber et al. (2015) argue that the negative elasticity is not large enough, and point out that the negative sign could be due to measurement or specification error, which may bias the elasticity downward. This is possibly because daily working hours are the dependent variable while the average hourly income is the ratio of daily income to daily hours.

Several papers in the literature then propose to fix this problem by using various instruments, e.g., other drivers’ hourly wages on the same day. Farber et al. (2015) show that although the OLS result produces a negative elasticity, the elasticity becomes strongly positive once the instrumental variable is added, hence supporting the neoclassical prediction. This type of measurement error may exist due to “tips” or “imperfectly recorded and transcribed paper trip sheets” in the NYC taxi dataset used in many papers, including Farber et al. (2015).

Our dataset is almost immune to this problem for the following reasons. First, taxi drivers in China rarely receive tips and do not count on them as part of their income. Second, during the sample period, all trips are recorded through meters without any manual input. When the accuracy of the dataset is not a concern, the IV method may not be a good estimation approach, because such instruments lack variation and are essentially constant across drivers and days. The instruments are therefore rather weak in terms of explanatory power.
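The mechanism behind this downward bias can be illustrated with a small simulation. This is a sketch under invented parameters, not a reproduction of any estimate in the paper or in Farber et al. (2015). True log hours and log wages are drawn independently, so the true elasticity is zero. Recorded hours then carry a multiplicative error, and the hourly wage is computed as income divided by recorded hours.

```python
import random


def ols_slope(x, y):
    """Slope of a one-regressor OLS fit of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return sxy / sxx


def simulate_division_bias(n=20000, noise_sd=0.2, seed=0):
    """Estimated elasticity when the true elasticity is zero by construction."""
    rng = random.Random(seed)
    ln_h_obs, ln_w_obs = [], []
    for _ in range(n):
        ln_h_true = rng.gauss(2.5, 0.3)  # hypothetical true log hours
        ln_w_true = rng.gauss(3.5, 0.2)  # hypothetical log wage, independent of hours
        ln_y = ln_h_true + ln_w_true     # log income
        err = rng.gauss(0.0, noise_sd)   # error in recorded hours
        ln_h = ln_h_true + err
        ln_h_obs.append(ln_h)
        ln_w_obs.append(ln_y - ln_h)     # ln W = ln(Y / recorded H)
    return ols_slope(ln_w_obs, ln_h_obs)


biased = simulate_division_bias()                # error in hours
unbiased = simulate_division_bias(noise_sd=0.0)  # no error in hours
```

Because the same error enters ln H with a positive sign and ln W with a negative sign, the slope is pulled below zero. With these parameters its large-sample value is −0.04/(0.04 + 0.04) = −0.5, while the error-free run stays near zero.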

An important econometric problem with this approach is that the estimate relies on there being significant exogenous transitory day-to-day variation in the average wage. This variation drives the accurate estimate of the coefficient. However, it is hard to see a source of legitimate variation in the average hourly wage in the real data. Hence, in the following, we examine the discrete-choice stopping model and its asymmetric effects.

Table 2. Summary statistics on the daily basis.

3.2. Evidence from Discrete-Choice Stopping Model

The OLS linear regression produces a negative elasticity of labor supply, which is both economically and statistically significant. This result cannot be explained by the neoclassical theory. On the other hand, the reference-dependence model has a quite contrasting prediction on the elasticity, as suggested in the previous section. To check whether our OLS result is consistent with the reference-dependence model, we follow Farber et al. (2015) and model the labor supply decision of a taxi driver as a dynamic discrete choice problem, in which the driver decides whether to continue working after each trip. The reduced form should therefore take into consideration potential earnings opportunities, hours worked, income earned, and other factors that could affect preferences for work.

As suggested in Farber et al. (2015), without deriving a fully dynamic solution to the optimal stopping problem, a simple discrete-choice problem can be implemented empirically as a reasonable approximation. The optimal stopping point can be a function of many factors, including hours worked and expectations about future earnings possibilities, etc. If daily income effects are important, the optimal stopping point can also be a function of income earned. Following our previous discussion of reference-dependent models, individuals can make decisions based on income targets, hour targets, or both. In order to identify the most relevant explanation, we examine all three models.

We first summarize all the related variables on the trip basis in Table 3. Panel A shows that there are about 7.4 million observations in total. The means of Cum Fare and Cum Hour are 286.2 and 7.778, respectively. The standard deviation of Cum Fare is 1110, which is quite large compared to its mean. The standard deviation of Cum Hour is 4.580. Stop Trip identifies the last trip for each taxi over each day. The mean and standard deviation of Stop Trip are 0.0267 and 0.161. The small mean of Stop Trip makes sense, because there are many trips over each day but only one is identified as the last trip.
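A trip-level panel with Cum Fare, Cum Hour, and Stop Trip could be assembled along the following lines. The record layout and field names are hypothetical. Measuring Cum Hour as elapsed time since the start of the first trip is also an assumption of this sketch, since the text does not spell out how time between trips is counted.

```python
from collections import defaultdict


def build_stop_panel(trips):
    """Build trip-level rows with cumulative fare, cumulative hours, and a stop flag.

    trips: iterable of dicts with keys taxi, day, start_min, end_min, fare.
    """
    by_taxi_day = defaultdict(list)
    for t in trips:
        by_taxi_day[(t["taxi"], t["day"])].append(t)

    rows = []
    for (taxi, day), day_trips in by_taxi_day.items():
        day_trips.sort(key=lambda t: t["start_min"])
        first_start = day_trips[0]["start_min"]
        cum_fare = 0.0
        for k, t in enumerate(day_trips):
            cum_fare += t["fare"]
            rows.append({
                "taxi": taxi,
                "day": day,
                "cum_fare": cum_fare,
                # assumption: elapsed hours since the first pickup of the day
                "cum_hour": (t["end_min"] - first_start) / 60.0,
                # Stop Trip equals 1 only for the last trip of the taxi-day
                "stop_trip": 1 if k == len(day_trips) - 1 else 0,
            })
    return rows


# Hypothetical trips for two taxis on one day.
example = [
    {"taxi": "A", "day": 1, "start_min": 60, "end_min": 90, "fare": 20.0},
    {"taxi": "A", "day": 1, "start_min": 0, "end_min": 30, "fare": 15.0},
    {"taxi": "B", "day": 1, "start_min": 10, "end_min": 40, "fare": 12.0},
]
panel = build_stop_panel(example)
```

Each taxi-day contributes exactly one row with `stop_trip` equal to 1, which is why the mean of Stop Trip is small when there are many trips per day.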

Table 3. Summary statistics on the trip basis. (a) Panel A; (b) Panel B.

Fare Range and Hour Range are defined in detail in Panel B of Table 3. We use Dummy 0 as the baseline level, which means it is omitted from the regression analysis to avoid perfect multicollinearity. There are around one million observations in each of the first five Fare Range Dummies, i.e., Fare Range Dummy 0 to Fare Range Dummy 4. The remaining four Fare Range Dummies contain relatively fewer observations. This distribution arises naturally, because fare accumulates from low to high. Similarly, the number of observations exhibits a decreasing pattern across the Hour Range Dummy variables. Here we define Fare Range for every 100 CNY and Hour Range for every two hours. If we define them differently, for example, Fare Range for every 50 CNY or Hour Range for every one hour, we obtain similar results.
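The range dummies can be sketched as a simple binning step. The helper names below are hypothetical. Assigning a value that falls exactly on a boundary, such as 100 CNY, to the higher bin is an assumption of this sketch.

```python
def range_index(value, width, top_index=8):
    """Bin index: 0 for [0, width), 1 for [width, 2 * width), ..., capped at top_index."""
    return min(int(value // width), top_index)


def range_dummies(value, width, top_index=8):
    """Dummies 1..top_index for one observation; Dummy 0 is the omitted baseline."""
    idx = range_index(value, width, top_index)
    return [1 if idx == k else 0 for k in range(1, top_index + 1)]


# Fare Range uses 100 CNY bins and Hour Range uses two-hour bins.
fare_bin = range_index(286.2, 100)    # cumulative fare of 286.2 CNY
hour_bin = range_index(7.778, 2)      # cumulative hours of 7.778
top_fare_bin = range_index(950, 100)  # anything above 800 CNY shares the top bin
```

With nine bins per variable and Dummy 0 omitted, each observation contributes eight fare dummies and eight hour dummies to the regression, at most one of which is 1 in each set.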

3.2.1. Income as Target

Given the previous discussion, the reduced form of the income dependent model can be estimated by regressing Stop Trip on Fare Range Dummies and controlling for a set of fixed effects. The estimation results are presented in Table 4. Again, we report the t-statistics in parentheses, and standard errors are clustered by day. In columns (1), (2), and (3), we conduct the OLS estimation, and in column (4), we estimate using a probit model.

Table 4. Regression of stopping trip on fare range dummy variables.

*, **, and *** denote statistical significance (two-tailed) at the 10%, 5%, and 1% levels, respectively.

Table 4 estimates the probability of a shift ending after each trip due to the marginal effects of accumulated income. In column (1), we show the OLS estimation result without controlling for any fixed effects. As earned income accumulates, the probability of stopping starts to increase significantly compared to the baseline level. Specifically, compared to the cumulative fare below 100, Stop Trip increases by 0.001 when the fare range is between 100 and 200; Stop Trip increases by 0.005 when the fare range is between 200 and 300; Stop Trip increases by 0.017 when the fare range is between 300 and 400; Stop Trip increases by 0.0062 when the fare range is between 400 and 500; Stop Trip increases by 0.130 when the fare range is between 500 and 600; Stop Trip increases by 0.163 when the fare range is between 600 and 700; Stop Trip increases by 0.140 when the fare range is between 700 and 800; Stop Trip increases by 0.079 when the fare range is above 800.

In column (2), we include Taxi Fixed Effects, and in column (3), we include Taxi Fixed Effects, Weather Fixed Effects, Day of Week Fixed Effects, and Week Fixed Effects. The results in these two columns are similar to those reported in column (1).

We can see a clear pattern: the probability of stopping increases slowly up to the fare range between 300 and 400 CNY, and then it increases sharply for the fare range between 400 and 500. The probability of stopping peaks for the fare range between 600 and 700, and then gets lower as the fare range increases further. These results are all statistically significant at the 1% level, and they hold when controlling for various fixed effects.

In column (4) of Table 4, we present the probit model regression results. In running the probit model, we cannot include Taxi Fixed Effects, because the large number of dummy variables prevents the maximum likelihood estimation from converging. The pattern from the probit model is generally consistent with the previous three columns. Compared to the baseline fare range below 100, the estimated probit coefficient increases slowly from 0.114 when the fare range is between 100 and 200 to 0.707 for the fare range between 300 and 400 CNY, and then it increases sharply to 1.246 and 1.655 for the fare range between 400 and 500 and the fare range between 500 and 600, respectively. The coefficient peaks at 1.799 for the fare range between 600 and 700, and then falls slightly to 1.700 and 1.372 for the fare range between 700 and 800 and the fare range above 800, respectively. These results are all statistically significant at the 1% level.

The reference-dependent model with income target suggested that 1) if income is below the income target, drivers have a higher marginal utility of income; 2) if income is above the income target, drivers have a higher marginal utility of leisure (disutility of work). Moreover, such a change around the income target is not smooth. It implies that the probability of stopping will be lowest when income is below the income target and will be highest when income is above the target. Our results in Table 4 indeed support this prediction.

One possible alternative explanation of the finding in Table 4 is that earned income and hours worked are highly positively correlated. Therefore, one may argue that taxi drivers decide when to stop working based on an hour target instead of an income target, or on both jointly. To address this concern, we also check the reduced-form estimates of the reference-dependent models with an hour target and with both hour and income targets in the next two subsections.

3.2.2. Work Hour as Target

The reduced form of the working-hour dependent model can be estimated by regressing Stop Trip on Hour Range Dummies and controlling for a set of fixed effects. The estimation results are presented in Table 5. Again, in columns (1), (2), and (3), we conduct the OLS estimation, and column (4) presents the estimation results using probit model.

In column (1), we show the OLS estimation result without controlling for any fixed effects. As working hours accumulate, the probability of stopping starts to increase significantly compared to the baseline level. Specifically, compared to the baseline level of cumulative hours below 2 hours, Stop Trip first shows no significant difference for cumulative working hours between 2 hours and 4 hours; Stop Trip increases by 0.001 when the cumulative working hours are between 4 hours and 6 hours; Stop Trip increases by 0.003 when the cumulative working hours are between 6 hours and 8 hours; Stop Trip increases by 0.006 when the cumulative working hours are between 8 hours and 10 hours; Stop Trip increases by 0.014 when the cumulative working hours are between 10 hours and 12 hours; Stop Trip increases by 0.037 when the cumulative working hours are between 12 hours and 14 hours; Stop Trip increases by 0.112 when the cumulative working hours are between 14 hours and 16 hours; Stop Trip increases by 0.299 when the cumulative working hours are above 16 hours.

Table 5. Regression of stopping trip on hour range dummy variables.

*, **, and *** denote statistical significance (two-tailed) at the 10%, 5%, and 1% levels, respectively.

In column (2), we include Taxi Fixed Effects, and in column (3), we include Taxi Fixed Effects, Weather Fixed Effects, Day of Week Fixed Effects, and Week Fixed Effects. The results in these two columns are similar to those reported in column (1).

We can see a clear pattern: the probability of stopping increases slowly up to the working hours between 12 hours and 14 hours, and then it increases sharply for the hour range between 14 hours and 16 hours. The probability of stopping peaks for the working hour range above 16 hours, and we find no evidence of a decreasing pattern as the working hours increase. These results are all statistically significant at the 1% level and hold when controlling for various fixed effects.

Column (4) of Table 5 presents the probit model regression results. Again, we cannot include Taxi Fixed Effects. The pattern from the probit model is generally consistent with the previous three columns. Compared to the baseline of working hours below 2 hours, the estimated probit coefficient increases slowly from 0.118 with working hours between 4 hours and 6 hours to 1.157 with working hours between 14 hours and 16 hours, and then it increases sharply and reaches its peak of 1.576 for working hours above 16 hours. These results are all statistically significant at the 1% level.

Similar to the prediction of a reference-dependent model with income target, a reference-dependent model with hour target would suggest that the individual should have a higher marginal utility of work if working hours are below the target and higher marginal utility of leisure if working hours are above the target. In terms of probability of stopping, we should expect this probability to peak around the target working time.

On the other hand, the neoclassical model predicts that as working hours accumulate, taxi drivers’ marginal utility of leisure becomes larger. Therefore, the probability of ending a shift should keep increasing.

The finding in Table 5 is inconsistent with the implication of a reference-dependent model using working hours as the target. Together, our findings in Table 4 and Table 5 suggest that a reference-dependent model with an income target at least plays a role in explaining taxi drivers’ labor supply behavior, in that they rule out the model with an hour target alone as the explanation. However, we still need to consider the model with both income and hour targets, since the income target may play a different role depending on whether the hour target has been met.

3.2.3. Both Income and Work Hours as Targets

The income and working hour dependent model can be estimated by regressing Stop Trip on both Fare Range Dummies and Hour Range Dummies and controlling for a set of fixed effects. This model is discussed in detail in the empirical model (2). The estimation results are presented in Table 6. As in Table 4 and Table 5, we conduct the OLS estimation in columns (1), (2), and (3), and we estimate using a probit model in column (4).

Table 6. Regression of stopping trip on both fare and hour range dummy variables.

*, **, and *** denote statistical significance (two-tailed) at the 10%, 5%, and 1% levels, respectively.

In column (1), we show the OLS estimation result without controlling for any fixed effects. Compared to the cumulative fare below 100, Stop Trip does not show a significant increase until the fare range is between 400 and 500, in which it increases by 0.008; Stop Trip increases by 0.030 when the fare range is between 500 and 600; Stop Trip increases by 0.043 when the fare range is between 600 and 700; Stop Trip increases by a smaller magnitude of 0.024 when the fare range is between 700 and 800; Stop Trip decreases when the fare range is above 800. Considering the effects of working hour ranges, Stop Trip shows no significant difference for cumulative working hours between 2 hours and 4 hours; Stop Trip increases by 0.001 when the cumulative working hours are between 4 hours and 6 hours; Stop Trip increases by 0.003 when the cumulative working hours are between 6 hours and 8 hours; Stop Trip increases by 0.005 when the cumulative working hours are between 8 hours and 10 hours; Stop Trip increases by 0.011 when the cumulative working hours are between 10 hours and 12 hours; Stop Trip increases by 0.028 when the cumulative working hours are between 12 hours and 14 hours; Stop Trip increases by 0.096 when the cumulative working hours are between 14 hours and 16 hours; Stop Trip increases by 0.277 when the cumulative working hours are above 16 hours.

Comparing the result here with those in the column (1) of Table 4 and column (1) of Table 5, we see there is not much difference between the effects from working hour ranges, but the effects of fare ranges have been lowered. The results are similar if we include Taxi Fixed Effects in column (2), and include Taxi Fixed Effects, Weather Fixed Effects, Day of Week Fixed Effects, and Week Fixed Effects in column (3).

In column (4) of Table 4, we present the probit model regression results. The pattern from the probit model shows stronger effects from fare ranges. Comparing to the baseline level of fare range below 100, the probability of stopping starts increasing lowly to 0.027 when the fare range is between 200 and 300 CNY, it increases to 0.080 for the fare range between 300 and 400, and then it increases sharply to 0.224 and 0.358 for the fare range between 400 and 500 and fare range between 500 and 600, respectively. The probability of stopping keeps increasing and reaches its peak at 0.418 for the fare range between 600 and 700, and then gets lower to 0.334 and 0.147 for the fare range between 600 and 700 and fare range above 800, respectively. The pattern from the effects of hour range is generally consistent with the previous three columns. Comparing to the baseline level of working hour below 2 hours, the probability of stopping slowly increases from 0.107 with working hours between 4 hours and 6 hours to 0.818 with working hours between 12 hours and 14 hours, and then it sharply increases by 1.306 with working hours between 14 hours and 16 hours, and it reaches its peak to 1.951 for the working hours above 16 hours. These results are all statistically significant at 1% level.

Overall, we find that the probability of stopping rises at an increasing rate as working hours accumulate, whereas it first rises and then falls as the cumulative fare increases.

The reference-dependent model discussed in the previous section suggests that there are two “domains of losses”: 1) if income is below the income target, drivers have a higher marginal utility of income; 2) if hours are above the hours target, drivers have a higher marginal utility of leisure (disutility of work). This implies that the probability of stopping will be lowest when income is below the income target and hours are below the hours target, and highest when income and hours are above their respective targets.

Given these implications, if the hours reference point mattered for a driver’s decision on when to stop, we would expect the probability of stopping to peak at some point as working hours accumulate. Our estimates show no such peak. Instead, the probability of stopping keeps increasing, as the neoclassical model predicts when the marginal disutility of work grows with hours worked. In contrast, the probability of stopping does rise and peak at around 600 CNY of cumulative fare; moreover, the sharp jump around 500 CNY is consistent with the kink that would exist at the income target.
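The contrast between the two profiles can be checked directly from the coefficients quoted in the text. The sketch below is a simple illustration, not part of the estimation: the function name and the three-way classification are assumptions, and the inputs are the probit fare-range coefficients and the OLS column (1) hour-range coefficients reported above.

```python
def profile_shape(coefficients):
    """Classify range coefficients ordered from the lowest range to the highest.

    Returns "increasing" if every step goes up, "hump" if the values rise to an
    interior peak and then fall, and "other" for any other pattern.
    """
    steps = [b - a for a, b in zip(coefficients, coefficients[1:])]
    if all(s > 0 for s in steps):
        return "increasing"
    peak = coefficients.index(max(coefficients))
    rises = all(s > 0 for s in steps[:peak])
    falls = all(s < 0 for s in steps[peak:])
    if 0 < peak < len(coefficients) - 1 and rises and falls:
        return "hump"
    return "other"


# Probit coefficients quoted in the text for fare ranges 200-300 up to above 800 CNY.
fare_coefs = [0.027, 0.080, 0.224, 0.358, 0.418, 0.334, 0.147]

# OLS column (1) coefficients quoted in the text for hour ranges 4-6 up to above 16 hours.
hour_coefs = [0.001, 0.003, 0.005, 0.011, 0.028, 0.096, 0.277]

print(profile_shape(fare_coefs))  # hump
print(profile_shape(hour_coefs))  # increasing
```

A hump in the fare profile with no hump in the hour profile is the pattern the text interprets as evidence for an income target without an hours target.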

Our estimates show that taxi drivers’ behavior is better explained by the reference-dependent model with an income target. This stands in contrast to the findings in the literature, which rely mainly on NYC taxi datasets.

3.3. Evidence from Asymmetric Models

To further check the robustness of our finding, we follow Crawford and Meng (2011) and Farber (2008) and estimate a reduced-form model of the stopping probability, with dummy variables that measure the incremental effects of hitting the income and hours targets. As discussed for empirical model (3), we regress Stop Trip on Income Target and Hour Target, as well as Ln(Cum Fare) and Ln(Cum Hour).

As discussed in the variable definition, Income Target takes the value of 1 if Cum Fare is above the daily average income over the sample period, and 0 otherwise. Hour Target takes the value of 1 if Cum Hour is above the daily average working hours over the sample period, and 0 otherwise. The sample average income target is around 480 CNY and the sample average working hours are around 13 hours; these numbers are similar to the mean values of Total Income and Work Hour reported in Table 2. We calculate the sample average for each taxi; we also calculate the average across all taxi drivers, and the results do not vary much.
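The construction of the two target dummies can be sketched as follows. The shift records, taxi identifiers, and function names are hypothetical and serve only to illustrate the per-taxi definition described above; a strict inequality is assumed for "above the target".

```python
from collections import defaultdict
from statistics import mean

# Hypothetical shift-level records: (taxi_id, total_income_cny, work_hours).
shifts = [
    ("A", 500.0, 13.0),
    ("A", 460.0, 12.0),
    ("B", 420.0, 14.0),
    ("B", 540.0, 13.0),
]


def taxi_targets(shift_records):
    """Per-taxi targets: average daily income and average daily hours over the sample."""
    income, hours = defaultdict(list), defaultdict(list)
    for taxi_id, total_income, work_hours in shift_records:
        income[taxi_id].append(total_income)
        hours[taxi_id].append(work_hours)
    return {t: (mean(income[t]), mean(hours[t])) for t in income}


def target_dummies(taxi_id, cum_fare, cum_hours, targets):
    """Income Target and Hour Target indicators for one trip (1 if strictly above target)."""
    income_target, hour_target = targets[taxi_id]
    return int(cum_fare > income_target), int(cum_hours > hour_target)


targets = taxi_targets(shifts)
# Taxi A has targets of 480 CNY and 12.5 hours in this toy sample.
print(target_dummies("A", cum_fare=490.0, cum_hours=11.0, targets=targets))  # (1, 0)
```

Replacing the per-taxi averages with a single average across all drivers gives the alternative definition mentioned in the text.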

The two dummy variables indicate whether income or working hours are above the targets. A positive coefficient on either of them is consistent with the prediction of a reference-dependent model. Table 7 lists the regression results for the asymmetric effects of the income and hour targets. We report t-statistics in parentheses, and standard errors are clustered by day. In column (1), we exclude all the fixed effects. Column (2) includes Taxi Fixed Effects, and column (3) includes Taxi Fixed Effects, Weather Fixed Effects, Day of Week Fixed Effects, and Week Fixed Effects. The estimation results in the three columns are virtually the same.

Taking column (3) as an example, the coefficient of Income Target is 0.045 and is significant at the 1% level. The coefficient of Hour Target is also significantly positive. The coefficients of Ln(Cum Fare) and Ln(Cum Hour), however, show a very different pattern. The coefficient of Ln(Cum Fare) is not significantly different from zero, meaning that once the effect of Income Target is taken into account, the log of cumulative fare shows no effect on the probability of stopping. In contrast, the coefficient of Ln(Cum Hour) is 0.008 and is significant at the 1% level. This significantly positive coefficient shows that the probability of stopping always increases as working hours increase.

Table 7. The asymmetric effects of income and hour targets.

*, **, and *** denote statistical significance (two tailed) at the 10%, 5%, and 1% levels, respectively.

Overall, the evidence from the asymmetric model again shows that a reference-dependent model with an income target can explain the behavior of taxi drivers, but there is little evidence supporting a model with a working-hour target.

4. Discussion

We find strong evidence that the working hours of drivers are negatively related to hourly rates, and this effect is both statistically and economically significant. We then estimate a discrete-choice model of the probability of stopping on a set of cumulative fare ranges and cumulative working hour ranges. The evidence consistently shows that the probability of stopping keeps increasing as cumulative working hours increase, whereas it first increases and then decreases as the cumulative fare increases. This indicates the existence of an income target in taxi drivers’ labor supply decisions. Lastly, we use the asymmetric model with the income target and working hour target as dummy variables. The probability of stopping is significantly positively related to the income target but shows no significant relation with the cumulative fare. In contrast, both the working hour target and cumulative working hours appear to be important in explaining the probability of stopping.

Overall, our results clearly reject the prediction of the neoclassical theory, as the elasticity of labor supply is significantly negative. More interestingly, among the three reference-dependent models, our results are best explained by the income-based reference model: taxi drivers seem to target certain income levels rather than total working time. This finding is quite different from the literature. For example, Crawford and Meng (2011) find that their results are more in line with the reference-dependent model with both income and hour targets. One possible explanation for the difference is that Chinese taxi drivers may view income and leisure differently from their counterparts in New York City. Such a difference may be due to culture, working conditions, and living environment.

5. Conclusion

Drivers are a preferred research subject for studying the wage elasticity of labor supply, as the results of the models above illustrate. Applying these models to a dataset of taxi drivers in Chengdu, China, also exposes a difference between the existing literature and our empirical results, which calls for further study.

Drivers, of course, are not representative of the whole working population. Demographic differences aside, however, many other groups (e.g., farmers and small-business proprietors) hold similar self-selected occupations with low and variable wages, long work hours, and relatively high rates of accidents. It is therefore important for these workers to plan over a long horizon and to allocate their labor and investment effectively across economic and educational opportunities for themselves and their children. This calls for attention and help from educators and policy makers to improve social welfare.

References

[1] Blundell, R., & MaCurdy, T. (1999). Labor Supply: A Review of Alternative Approaches.

[2] Crawford, V. P., & Meng, J. (2011). New York City Cab Drivers’ Labor Supply Revisited: Reference-Dependent Preferences with Rational-Expectations Targets for Hours and Income. American Economic Review, 101, 1912-1932.

https://doi.org/10.1257/aer.101.5.1912

[3] Farber, H. S. (2008). Reference-Dependent Preferences and Labor Supply: The Case of New York City Taxi Drivers. The American Economic Review, 98, 1069-1082.

https://doi.org/10.1257/aer.98.3.1069

[4] Farber, H. S., Silverman, D., & von Wachter, T. (2015). Factors Determining Callbacks to Job Applications by the Unemployed: An Audit Study. IZA Discussion Papers 9465, Cambridge: Institute of Labor Economics (IZA).

https://doi.org/10.3386/w21689

[5] MaCurdy, T. E. (1981). An Empirical Model of Labor Supply in a Life-Cycle Setting. Journal of Political Economy, 89, 1059-1085.

https://doi.org/10.1086/261023