Bilateral investment treaties (BITs) are legal agreements between two governments that specify standards of treatment for foreign direct investment (FDI) and are backed by third-party enforcement, with the goal of protecting and attracting FDI (UNCTAD, 1998). These treaties contain provisions that require governments not to expropriate foreign investment without prompt and adequate compensation and to treat foreign investors no less favorably than domestic investors (national treatment). Finally, BITs allow dispute settlement before a third-party arbitrator, usually the International Centre for Settlement of Investment Disputes (ICSID), which helps allay investor concern about a host country’s potentially weak rule of law.
In principle, a BIT is designed to boost investor confidence and subsequently attract greater inflows of FDI. Empirical evidence on whether BITs help signatories attract FDI, however, is mixed. Whereas some scholars find that BITs do result in greater FDI inflows (Salacuse & Sullivan, 2005; Neumayer & Spess, 2005; Büthe & Milner, 2008), others find no effect of BITs on FDI (Hallward-Driemeier, 2003; Gallagher & Birch, 2006). More recent research questions whether current measures of FDI are even suitable in the first place (Kerner, 2018) and examines the role of publication bias (Reiter & Bellak, 2020). However, without exploring why countries sign BITs in the first place, these studies implicitly assume that all countries are equally likely to sign BITs. Yet this is not the case: there is considerable variation in the number of BITs across countries. Consequently, the empirical analysis of BITs must statistically account for this selection problem or risk biased estimates and incorrect inferences about the effects of BITs on FDI. Using a matching approach to account for the endogeneity problem described above, I find that BITs do attract FDI.
2. BITs and FDI
The defining feature of FDI is its long-term nature and, in particular, its mobility ex-ante and its illiquidity ex-post. That is, once foreign capital is invested in a country, the investor is essentially subject to the whims of the country’s leader. A leader, knowing that investments, once made, are not easily withdrawn, may have incentives to alter taxes or regulations ex-post. Consequently, investors are cautious ex-ante and, in the absence of a credible commitment to the protection of investment, may refrain from investing, which leaves both the government and the investor worse off.
Unlike world trade, there is no international institution governing FDI. As such, for some scholars, BITs are “the most important international legal mechanism for the encouragement and governance” of FDI (Elkins, Guzman, & Simmons, 2006). BITs define a set of clear rights for multinational investors and establish mutually agreeable terms for foreign direct investment. Most BITs include a standard of fair and equitable treatment; protection against arbitrary or discriminatory policies with regard to FDI; and a guarantee that investments shall not be expropriated or nationalized, either directly or indirectly, except for a public purpose, in a non-discriminatory manner, upon payment of prompt, adequate and effective compensation, and in accordance with the process of law (United Nations Model BIT, Article 6).
Like all international agreements, BITs involve costs—diplomatic, sovereignty, arbitration, and reputational. These costs suggest that countries are not equally likely to sign BITs, as evidenced by the almost complete absence of BITs between the advanced industrialized democracies. Consequently, any statistical analysis must account for the variation in a country’s likelihood of BIT signing in order to assess whether BITs fulfill their intended purpose of attracting FDI.
Perhaps the most straightforward way to ascertain whether BITs work is to answer the following question: what would have happened to FDI inflows if a country with a BIT in a particular year had not signed a BIT? However, this is impossible to know in observational studies. Since there is only information on whether country X signed a BIT in year t and a corresponding amount of FDI inflows in year t + 1, there is no way to ascertain what would have happened to FDI inflows in year t + 1 if country X had not signed a BIT in year t. This is the fundamental problem of causal inference (Holland, 1986), which Rosenbaum and Rubin’s (1983) propensity score approach is designed to address.
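The counterfactual question above can be written compactly in potential-outcomes notation. The symbols are introduced here for illustration only and do not come from the original analysis: T_{it} equals one if country i has a BIT in year t and zero otherwise, and Y_{i,t+1}(1) and Y_{i,t+1}(0) are the FDI inflows the country would receive in year t + 1 with and without the BIT.

```latex
\begin{align*}
\tau_{it} &= Y_{i,t+1}(1) - Y_{i,t+1}(0)
  && \text{(effect of a BIT for country } i\text{)} \\
Y_{i,t+1}^{\mathrm{obs}} &= T_{it}\,Y_{i,t+1}(1) + (1 - T_{it})\,Y_{i,t+1}(0)
  && \text{(what the data record)}
\end{align*}
```

Because the data record only one of the two potential outcomes for each country-year, the unit-level effect is never observed directly and must be inferred by comparing different units.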
Rosenbaum and Rubin (1983) define a propensity score as the conditional probability of treatment assignment given observed baseline covariates. Here, a propensity score is the predicted probability of signing a BIT given a vector of observable covariates. Next, I match two countries with identical (or nearly identical) propensity scores, with the condition that only one of the countries in the pair actually signed a BIT. In the parlance of experimental studies, the presence of a BIT is the “treatment” and the absence of one, the “control”. Since these two units have the same propensity score, i.e. the same likelihood of BIT signing, the only systematic difference between them on observed characteristics is the presence (or absence) of a BIT. Consequently, under the assumptions discussed below, any difference in FDI inflows between these two units can be attributed to the BIT. That is, I am able to estimate the average treatment effect (ATE) of a BIT on FDI inflows.
A discussion of the assumptions underlying this approach is in order. First, the unconfoundedness assumption states that, conditional on the covariates included in the estimation of the propensity score, BIT signing is independent of potential FDI inflows. In other words, there are no unobserved differences between countries that sign BITs and those that do not that also affect FDI. Second, the overlap assumption specifies that the probability of BIT signing lies strictly between zero and one for every combination of covariate values. This ensures that there are sufficient units in both treatment (BIT) and control (no BIT) groups with similar propensity scores for matching. If both unconfoundedness and overlap hold, then the treatment (BIT signing) is considered strongly ignorable, and the matching analysis can proceed.
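For reference, the propensity score and the two assumptions can be stated formally in the standard notation of Rosenbaum and Rubin (1983). The symbols are assumptions made here for illustration: T_{it} is the BIT indicator, X_{it} the vector of observed covariates, and Y_{i,t+1}(1), Y_{i,t+1}(0) the potential FDI inflows with and without a BIT.

```latex
\begin{align*}
e(X_{it}) &= \Pr(T_{it} = 1 \mid X_{it}) && \text{(propensity score)} \\
\bigl(Y_{i,t+1}(1),\, Y_{i,t+1}(0)\bigr) &\perp\!\!\!\perp T_{it} \mid X_{it} && \text{(unconfoundedness)} \\
0 < e(X_{it}) &< 1 && \text{(overlap)}
\end{align*}
```

Treatment assignment is strongly ignorable when the last two conditions hold together.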
3. Research Design
I assess whether BITs attract FDI by using a matching approach to generate a sample of country-pairs (one country with a BIT and one without), which allows me to directly evaluate the effect of a BIT on FDI. I use a time-series cross-sectional (TSCS) dataset with country-year as the unit of analysis. This dataset includes all developing countries from 1960 to 2010.
The primary outcome is annual foreign direct investment inflows, FDI. This is the sum of direct investment flows by foreign firms into a given country in a given year, from the World Development Indicators (World Bank, 2014). FDI ranges from −13.50 to 4.98, with a mean of 0.20 and a standard deviation of 1.92.
The outcome in the propensity score equation is whether a country has a BIT in a particular year. This variable, BIT, is a binary variable coded one if a BIT exists for a country, and zero otherwise (United Nations Conference on Trade and Development, 2013). BIT is also the main predictor, or “treatment”, in the final estimation of whether BITs predict greater FDI inflows. The list of covariates in the estimation of the propensity score includes resource rents as a percent of GDP, property rights, the natural logs of GDP and per capita GDP, trade openness, capital openness, and the natural log of prior FDI inflows, all from the World Development Indicators (World Bank, 2014).
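A propensity score equation of this kind can be sketched as a logistic regression of the BIT indicator on the covariates. The code below is a minimal, dependency-free illustration, not the estimation routine used in the analysis. The function names, the gradient-ascent fitting procedure, and the single made-up covariate are all assumptions for the example.

```python
import math

def sigmoid(z):
    """Logistic function mapping a linear index to a probability."""
    return 1.0 / (1.0 + math.exp(-z))

def fit_logit(X, y, lr=0.1, steps=5000):
    """Fit a logistic regression by gradient ascent on the mean log-likelihood.

    X: list of covariate rows; y: list of 0/1 treatment indicators.
    Returns coefficients with the intercept first.
    """
    n, k = len(X), len(X[0])
    beta = [0.0] * (k + 1)
    for _ in range(steps):
        grad = [0.0] * (k + 1)
        for xi, yi in zip(X, y):
            p = sigmoid(beta[0] + sum(b * x for b, x in zip(beta[1:], xi)))
            err = yi - p  # score contribution of this observation
            grad[0] += err
            for j, x in enumerate(xi):
                grad[j + 1] += err * x
        beta = [b + lr * g / n for b, g in zip(beta, grad)]
    return beta

def propensity(beta, xi):
    """Predicted probability of treatment (here, BIT signing) for one row."""
    return sigmoid(beta[0] + sum(b * x for b, x in zip(beta[1:], xi)))

# Hypothetical data: one covariate, six country-years, 1 = has a BIT.
X = [[0.2], [0.5], [0.9], [1.1], [1.4], [1.8]]
y = [0, 0, 1, 0, 1, 1]
beta = fit_logit(X, y)
scores = [propensity(beta, xi) for xi in X]
```

In applied work one would normally rely on an established statistical package for this step; the hand-written optimizer is used here only to keep the example self-contained.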
Table 1 shows the average values of the covariates between countries with BITs and countries without BITs in the sample. In most cases, there is a significant difference between BIT and non-BIT countries. These differences suggest that these countries are not directly comparable—without an appropriate estimation approach, the estimate of the impact of a BIT on FDI is likely to be unreliable and invalid.
My analysis proceeds in three stages. First, I estimate the propensity score using logistic regression with the covariates listed above. Next, I select a method to generate a sample of matched observations. The goal of matching, as Iacus, King and Porro (2012) state, is to “prune observations from the data so that the remaining data have better balance between the treated and control groups” (emphasis in original). Nearest-neighbor matching matches a country with a BIT to the country without a BIT that has the closest propensity score and discards unmatched countries. Kernel-based matching matches a country with a BIT to a weighted average of countries with similar propensity scores but without BITs. It assigns greater weight to the countries without BITs whose propensity scores are closest to that of the country with a BIT: the further the propensity scores of the countries without BITs are from the propensity score of the country with a BIT, the smaller the weight. Radius matching (Dehejia & Wahba, 2002) matches all countries without BITs within a predetermined radius of the propensity score of a country with a BIT and excludes unmatched countries outside the radius from the analysis.
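The three propensity-score matching rules described above can be illustrated with a short sketch. It assumes the scores have already been estimated. The country labels, score values, the Gaussian kernel, and the choice to match with replacement are all assumptions for the example, not details taken from the analysis.

```python
import math

def nearest_neighbor(treated, controls):
    """Match each treated unit to the control with the closest score.

    Matching is with replacement, so a control may be reused.
    """
    return {t_id: min(controls, key=lambda c: abs(controls[c] - t_score))
            for t_id, t_score in treated.items()}

def radius_matches(treated, controls, radius):
    """Match each treated unit to every control within `radius` of its score."""
    return {t_id: [c for c, s in controls.items() if abs(s - t_score) <= radius]
            for t_id, t_score in treated.items()}

def kernel_weights(t_score, controls, bandwidth):
    """Gaussian kernel weights over all controls, normalized to sum to one.

    Controls with scores closer to `t_score` receive larger weights.
    """
    raw = {c: math.exp(-0.5 * ((s - t_score) / bandwidth) ** 2)
           for c, s in controls.items()}
    total = sum(raw.values())
    return {c: w / total for c, w in raw.items()}

# Hypothetical propensity scores for countries with (treated) and without BITs.
treated = {"A": 0.62, "B": 0.35}
controls = {"C": 0.60, "D": 0.41, "E": 0.10}

pairs = nearest_neighbor(treated, controls)
within = radius_matches(treated, controls, radius=0.05)
weights_for_a = kernel_weights(treated["A"], controls, bandwidth=0.1)
```

In this toy example, radius matching leaves country B without a match because no control lies within 0.05 of its score, which is the kind of information loss the text describes. Kernel matching keeps every control but gives distant ones very little weight.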
Each of these matching methods has advantages and disadvantages. Nearest-neighbor matching minimizes bias by selecting country-pairs with similar propensity scores. However, this greatly reduces the number of matched pairs and discards a large amount of information. While kernel-based matching increases the number of matched pairs, it runs the risk of generating country-pairs that are only weakly matched, which increases the possibility of bias. Like kernel-based matching, radius matching increases the number of matched pairs and reduces information loss. However, the selection of a predetermined radius is at the discretion of the analyst, and there is also the risk of weak matches.
Table 1. Mean values of covariates for countries with and without BITs.
Data from the World Development Indicators (World Bank, 2014).
Consequently, I use coarsened exact matching (CEM), an alternative method that works in three steps. First, a user-customizable algorithm recodes individual values of each covariate into substantively meaningful groups, i.e. “coarsens” the data. For instance, one can recode the seven-point partisan identification scale typical in American politics into three categories (Democrat, Independent, and Republican) without the loss of relevant information. Another example is the common practice of using age groups, e.g. 18-24, 25-29, 30-34, 35-39, instead of exact ages. In essence, this procedure generates variable-sized strata on which to match observations. Next, exact matching matches a country with a BIT in one stratum of the coarsened data to a country without a BIT in the same stratum. Finally, the procedure removes the coarsening and retains the original values of the matched data. The advantages of CEM are twofold: 1) it is less reliant on model specification than propensity score matching, and 2) it reduces bias and improves efficiency (Iacus, King, & Porro, 2012). In the third and last stage of the analysis, I estimate the impact of BITs on FDI inflows using the matched sample with ordinary least squares (OLS) regression.
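The three CEM steps can be sketched in a few lines. This is an illustrative sketch under stated assumptions, not the software used in the analysis. The covariate names (`log_gdp`, `openness`), the cutpoints, and the data are hypothetical. The control weights follow the formula given by Iacus, King, and Porro (2012), which rescales controls so that each stratum reflects its share of treated units.

```python
from collections import defaultdict

def coarsen(value, cutpoints):
    """Return the index of the bin containing `value` (0 = below the first cutpoint)."""
    return sum(value >= c for c in cutpoints)

def cem(units, cutpoints):
    """Coarsened exact matching on the covariates named in `cutpoints`.

    units: list of dicts with a binary 'treated' key plus covariate values.
    cutpoints: {covariate name: ascending list of bin boundaries}.
    Returns the matched units with their original values and a 'weight' key.
    """
    names = sorted(cutpoints)
    # Step 1: coarsen each covariate and group units by stratum signature.
    strata = defaultdict(list)
    for u in units:
        key = tuple(coarsen(u[name], cutpoints[name]) for name in names)
        strata[key].append(u)
    # Step 2: keep only strata containing both treated and control units.
    kept = [m for m in strata.values()
            if 0 < sum(u["treated"] for u in m) < len(m)]
    m_t = sum(u["treated"] for m in kept for u in m)
    m_c = sum(len(m) for m in kept) - m_t
    # Step 3: return the uncoarsened values, with weights for the controls.
    matched = []
    for m in kept:
        s_t = sum(u["treated"] for u in m)
        s_c = len(m) - s_t
        for u in m:
            w = 1.0 if u["treated"] else (m_c / m_t) * (s_t / s_c)
            matched.append(dict(u, weight=w))
    return matched

# Hypothetical country-years; treated = 1 means the country has a BIT.
units = [
    {"id": "A", "treated": 1, "log_gdp": 9.1, "openness": 0.62},
    {"id": "B", "treated": 0, "log_gdp": 9.4, "openness": 0.70},
    {"id": "C", "treated": 1, "log_gdp": 7.2, "openness": 0.30},
    {"id": "D", "treated": 0, "log_gdp": 11.0, "openness": 0.35},
    {"id": "E", "treated": 0, "log_gdp": 9.8, "openness": 0.55},
]
cutpoints = {"log_gdp": [8.0, 10.0], "openness": [0.5]}
matched = cem(units, cutpoints)
```

Here countries A, B, and E fall into the same stratum and are retained, while C and D sit alone in their strata and are pruned. Coarser cutpoints would keep more observations at the cost of less exact matches, which is the trade-off the analyst controls.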
Figure 1 shows the distribution of cases with and without BITs across the range of estimated propensity scores. The height of each bar represents the proportion of cases with a specific propensity score or predicted probability of BIT signing. The orange bars indicate observations that signed BITs while the gray bars represent observations without BITs.
Figure 1. Distribution of propensity scores between BIT and Non-BIT countries.
The chart shows that the propensity score equation satisfies the overlap assumption: the estimated propensity scores lie strictly between zero and one, and there are observations across the entire range. More importantly, at each given value of the propensity score, there are cases that have signed and not signed BITs, which allows me to match countries with similar propensity scores, one with and one without a BIT.
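A simple numerical companion to this visual check is to compute the region of common support, i.e. the range of propensity scores covered by both groups, and to see how many observations fall outside it. The sketch below uses made-up scores and hypothetical function names. It is one conventional way to operationalize overlap, not necessarily the check used here.

```python
def common_support(treated_scores, control_scores):
    """Return the (low, high) interval of scores covered by both groups."""
    low = max(min(treated_scores), min(control_scores))
    high = min(max(treated_scores), max(control_scores))
    return low, high

def trim(scores, support):
    """Keep only the scores that lie inside the common-support interval."""
    low, high = support
    return [s for s in scores if low <= s <= high]

# Hypothetical propensity scores for BIT and non-BIT observations.
treated_scores = [0.3, 0.5, 0.9]
control_scores = [0.1, 0.4, 0.6]
support = common_support(treated_scores, control_scores)
treated_kept = trim(treated_scores, support)
control_kept = trim(control_scores, support)
```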
I first show the results from nearest-neighbor (NN) matching to highlight the subsequent advantage of coarsened exact matching. Table 2 compares the mean values of covariates between countries with BITs and countries without BITs before and after NN matching. Although there is a decrease in the difference in means of some of the covariates, substantial differences remain.
To better see the difference before and after NN matching, Table 3 presents the percentage improvement in balance after nearest-neighbor matching. With the exception of resource rents and growth, the balance improvements are minor.
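One common way to compute a percentage improvement in balance is to compare the absolute difference in covariate means between the two groups before and after matching. The sketch below assumes that definition. The text does not state the exact formula behind Table 3, and the function names are hypothetical.

```python
def mean(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)

def mean_difference(treated, control):
    """Difference in covariate means between treated and control units."""
    return mean(treated) - mean(control)

def pct_balance_improvement(diff_before, diff_after):
    """Percent reduction in the absolute mean difference after matching.

    Positive values mean balance improved; negative values mean it worsened.
    """
    return 100.0 * (abs(diff_before) - abs(diff_after)) / abs(diff_before)

# Hypothetical mean differences for one covariate before and after matching.
improvement = pct_balance_improvement(0.5, 0.125)
```

Under this definition a value of 100 means the mean difference was eliminated, and a negative value means matching made the groups less alike on that covariate.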
Table 2. Average values of covariates before (left) and after (right) nearest-neighbor matching.
Table 3. Balance improvement (%) after NN matching.
Given the potential drawbacks of nearest-neighbor matching discussed earlier, I now present the results of coarsened exact matching (CEM). Table 4 shows the differences across covariates for countries with BITs and countries without BITs and compares the differences to the original (unmatched) sample. There is a clear and considerable decrease in the difference in covariate means between the two groups. What is more, the remaining differences in covariate means are substantially smaller than those under nearest-neighbor matching.
Like Table 3, Table 5 shows the percentage improvement in balance, this time after CEM. Compared to nearest-neighbor matching, the improvement in balance is substantially larger. This increases confidence that the units in this sample are better matched than those in nearest-neighbor matching. As such, I use this sample from CEM in the final regression analysis of BITs on FDI.
Finally, Figure 2 presents a graphical depiction of balance before and after coarsened exact matching, using histograms on the left, and jitter plots on the right. I discard all observations in both treated (BIT) and control (no BIT) groups outside the support of the distance measure, i.e. observations that do not match on propensity scores.
Given the advantages of CEM over NN matching, Table 6 presents the results from an ordinary least squares regression analysis of BITs on FDI using the CEM-matched sample. The results show a positive and statistically significant effect of BITs on FDI. In the parlance of causal inference, there is a positive average treatment effect of BIT on FDI inflows—BITs do result in greater FDI inflows.
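To illustrate how a regression on a matched sample yields a treatment effect estimate, the sketch below runs weighted least squares of an outcome on a constant and a binary treatment indicator. It assumes a specification with no additional regressors, which may differ from the model in Table 6. The data, weights, and function name are hypothetical.

```python
def weighted_ols(y, t, w):
    """Weighted least squares of y on a constant and a single regressor t.

    Returns (intercept, slope). With a binary t, the slope equals the
    weighted difference in mean outcomes between treated and control units.
    """
    total = sum(w)
    t_bar = sum(wi * ti for wi, ti in zip(w, t)) / total
    y_bar = sum(wi * yi for wi, yi in zip(w, y)) / total
    s_ty = sum(wi * (ti - t_bar) * (yi - y_bar) for wi, ti, yi in zip(w, t, y))
    s_tt = sum(wi * (ti - t_bar) ** 2 for wi, ti in zip(w, t))
    slope = s_ty / s_tt
    intercept = y_bar - slope * t_bar
    return intercept, slope

# Hypothetical matched sample: outcome, BIT indicator, and matching weights.
y = [2.0, 3.0, 1.0, 1.5, 0.5]
t = [1, 1, 0, 0, 0]
w = [1.0, 1.0, 1.0, 1.0, 1.0]
intercept, slope = weighted_ols(y, t, w)
```

In this bivariate case the intercept is the weighted mean outcome of the control group and the slope is the estimated effect of treatment. Adding covariates to the regression would adjust for any imbalance that remains after matching.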
Figure 3 shows the predicted amount of FDI inflows a country will receive with and without a BIT. A country with a BIT receives an estimated 45% more FDI inflows than a country without a BIT.
Table 4. Average values of covariates before (left) and after (right) Coarsened Exact Matching (CEM).
Table 5. Balance improvement (%) after CEM.
Table 6. Ordinary Least Squares (OLS) regression estimate of the effects of a BIT on FDI inflows.
Standard errors are in parentheses. *** and ** indicate statistical significance at p < 0.01 and p < 0.05, respectively.
Figure 2. A graphical depiction of balance before and after matching.
Figure 3. Predicted FDI inflows with and without a BIT.
There remains a lack of consensus on whether BITs fulfill their expected purpose of attracting FDI. A core issue in the analysis of BITs and FDI is that the relationship between them is endogenous. In particular, the issue is one of simultaneity: if countries sign BITs to attract FDI, then FDI levels ought to predict BIT signing. Yet if FDI levels predict BIT signing, then how can one identify whether BITs predict FDI inflows? Put another way, if FDI inflows affect BIT formation, and BITs influence FDI inflows, then it is difficult to ascertain the true magnitude of the effect of BITs on FDI inflows. Another issue is the problem of selection. There is considerable variation in BIT formation across countries and time, which suggests that the process of BIT signing is nonrandom. If this issue is not addressed, then one cannot accurately identify whether BITs attract FDI, since it is possible that some other variable that influences BIT signing may also affect FDI inflows.
Studies that attempt to deal with this problem tend to rely on the instrumental variables or two-stage least squares (2SLS) approach. This method rests on the use of a suitable instrument that predicts BIT formation but affects FDI inflows only through BIT formation. However, such an instrument is often challenging to find. Instead, I use a matching approach, coarsened exact matching, to account for the variation in BIT signing across countries and generate a sample that matches country pairs that are as similar as possible on all key characteristics except that one country has a BIT and the other does not. In sum, I find that BITs do lead to increased FDI inflows.
 Büthe, T., & Milner, H. V. (2008). The Politics of Foreign Direct Investment into Developing Countries: Increasing FDI through International Trade Agreements? American Journal of Political Science, 52, 741-762.
 Elkins, Z., Guzman, A. T., & Simmons, B. A. (2006). Competing for Capital: The Diffusion of Bilateral Investment Treaties, 1960-2000. International Organization, 60, 811-846.
 Gallagher, K. P., & Birch, M. B. L. (2006). Do Investment Agreements Attract Investment? Evidence from Latin America. Journal of World Investment and Trade, 7, 961-976.
 Neumayer, E., & Spess, L. (2005). Do Bilateral Investment Treaties Increase Foreign Direct Investment to Developing Countries? World Development, 33, 1567-1585.
 Reiter, L., & Bellak, C. (2020). Effects of BITs on FDI: The Role of Publication Bias. In J. Chaisse, L. Choukroune, & S. Jusoh (Eds.), Handbook of International Investment Law and Policy (pp. 1-28). Singapore: Springer.