The purpose of this paper is to propose a simple algorithm for computing partial derivatives of the optimal value function. Macroeconomic problems involving incentive compatibility constraints have received wide attention in the literature due to recent advances in dynamic optimization techniques (see   and references therein). Often the optimality conditions for this class of problems involve partial derivatives with respect to endogenous state variables of the optimal value function corresponding to the dynamic programming formulation of an outside option. Although many numerical methods can provide an approximation for the value function, there is no reason to believe that a derivative of this approximation will be close in any sense to the actual value of the derivative. In this note we suggest an algorithm for accurately computing these partial derivatives by simulation.
This issue has been previously considered in  , in the context of a stochastic growth model with capital accumulation under one-sided lack of commitment. To circumvent the problem of finding the values of the derivatives in  , the authors proposed a method based on the ideas of Benveniste and Scheinkman  . Unfortunately, their method has limited applicability since it depends on the availability of an analytical solution for the derivatives as conditional expectations of the known functions of the model solution. This paper proposes a simple algorithm to fill this gap in the literature.
In order to be able to use finite differences to approximate the gradient at a given point, one would need to know the values of the optimal value function at a certain set of points. Our algorithm obtains approximations of these values with arbitrary precision. Moreover, achieving this accuracy is feasible for all points in the state-space which have economic relevance.
The initial step of our algorithm involves obtaining a numerical solution to the problem using a procedure that satisfies three criteria. First, it approximates some unknown function with flexible functional forms built from a finite number of elements. Second, it can deliver an arbitrarily accurate solution as the number of elements goes to infinity. Third, the resulting numerical solution must be expressible as a set of policy functions approximated with flexible functional forms. The next step uses Monte Carlo integration to evaluate the conditional expectation of the discounted sum of future instantaneous utilities. The final step applies the method of finite differences to approximate the values of the partial derivatives of the value function.
The attractive features of the algorithm include its rather wide scope of applicability and simplicity of implementation. It can be used to study the questions of risk sharing under imperfect enforcement of contracts, as well as partnerships with limited commitment when several state variables appear in the model corresponding to the outside option. Such models may include habit formation preferences, several types of capital, or reputational co-state variables. The suggested method is computationally inexpensive. It does not suffer from the curse of dimensionality and therefore it is particularly convenient for models involving many state variables.
The rest of the paper is organized as follows. Section 2 discusses an example, where our algorithm proves to be useful. Section 3 sketches the idea behind the algorithm. Section 4 deals with implementation of the algorithm, while Section 5 compares it with some available alternatives. Section 6 concludes.
2. Applicability of the Algorithm: An Example
To fix ideas, we start with an example of a macroeconomic model where our computational algorithm proves to be useful. The key feature of this example is that solving it boils down to designing an optimal social contract which takes into account not only technological but also incentive and legal constraints. Our example illustrates the need for computing the gradient of the value function and its practical implementation. Furthermore, it shows that our algorithm is applicable to some widely used models, to which the method in  cannot be applied.
Consider a model of international risk sharing, which distinguishes itself from the canonical model  in two respects. First, as in  , we introduce a friction in the credit markets. We assume that international loans are feasible only to the extent that they can be enforced by the threat of exclusion from participation in any other international risk-sharing arrangement. Second, we incorporate habit formation preferences into the model. The motivation for doing so is threefold. First, habits help us illustrate the features of the algorithm by expanding the set of endogenous states in the model. Second, habit formation preferences tend to improve the performance of international business cycle models   . Finally, empirical studies suggest that habit formation is consistent with observed consumption behavior  . To simplify the exposition, we assume inelastic labor supply.
The planner’s problem is to choose the sequences of consumption and investment to maximize a weighted sum of utilities
subject to an aggregate feasibility constraint
individual participation constraints for each ,
the equations of motion for the capital,
the laws of motion for habits,
and non-negativity constraints . We assume that productivity shocks follow a first order stationary vector autoregressive process, and that the initial values for the state variables , and the initial non-negative weights, , are given. In addition, the usual restrictions apply to the discount factor, , the capital depreciation rate, , and the persistence of habits, .
The outside option in the participation constraint (3) represents the optimal value function corresponding to the autarkic environment.
with the initial values being equal to the values of the state variables at the moment of deviation from the optimal plan.
In addition to Equations (2)-(5), the optimal allocations, for all , must satisfy the risk sharing condition,
the intertemporal condition,
the complementary slackness condition,
and the law of motion for the co-state variables , where , , and . In the equations above denotes , and similar abbreviations apply to other terms1.
The gradient of the optimal value function enters the intertemporal condition (8) and the risk-sharing condition (7). Approximation of this gradient is the purpose of the algorithm proposed in  and in this paper. Because both
algorithms can approximate , we will use that fact to compare their accuracy in Section 5. Our approach can also approximate ,
whereas  cannot, because the analytical solution for this partial derivative as an expectation of known functions of the model’s solution is not available. This will be further discussed in Section 5.
3. The Algorithm
Typically, in models with participation constraints, the reservation value is the value function of the outside alternative, evaluated at the current values of the endogenous state variables, , and exogenous shocks, . Consider the optimal value function at a point as the outcome of a standard optimization problem for the outside alternative:
where r is an instantaneous utility function, the discount factor, an exogenous Markov stochastic process, a vector of endogenous state variables, a vector of control variables, A a feasibility correspondence, and l the law of motion for the endogenous state variables. The functional equation for this problem can be derived using standard dynamic programming techniques. It yields a time-invariant policy function f such that the optimal allocations satisfy .
The purpose of our algorithm is to find a pointwise approximation to the partial derivative of the value function with respect to its i-th argument. The algorithm takes the following three steps:
Step I (Numerical Solution) Solve the model in (9) with a spectral method and formulate the solution in terms of approximated policy functions , which depend on the state variables and some coefficients, .
Step II (Monte Carlo Integration) Simulate N sequences of the realizations of the stochastic process of size T with a starting value , for all . For each sequence , simulate the series of the endogenous variables using the approximated policy functions , the equations of motion for the state variables (10), and the initial values . Using the simulated series, calculate the discounted sums of the instantaneous returns and average over N:
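Step II can be sketched in a few lines of code. The sketch below is our own illustration, not the paper's implementation: the names `policy`, `law_of_motion`, `shock_step`, and `reward` are hypothetical placeholders for the model-specific objects (the approximated policy functions, the transition equations, the exogenous shock process, and the instantaneous return).

```python
import numpy as np

def value_mc(x0, z0, policy, law_of_motion, shock_step, reward,
             beta=0.96, N=1000, T=300, seed=0):
    """Monte Carlo estimate of V(x0, z0): average over N simulated
    paths of the discounted sum of instantaneous returns."""
    rng = np.random.default_rng(seed)
    total = 0.0
    for _ in range(N):
        x, z = x0, z0
        v, disc = 0.0, 1.0
        for _ in range(T):
            u = policy(x, z)              # approximated policy function
            v += disc * reward(x, z, u)
            x = law_of_motion(x, z, u)    # endogenous state transition
            z = shock_step(z, rng)        # exogenous Markov shock
            disc *= beta
        total += v
    return total / N
```

Because of discounting, the truncation horizon T contributes an error of order beta^T, so a moderate T typically suffices.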
Step III (Numerical Differentiation) Repeat Step II to obtain approximations of the value function at two points, for instance and , where denotes a conformable vector of zeros with one on its i-th coordinate, and is a small positive number. Calculate the value of the partial derivative using, for example, Stirling’s finite difference formula:
The optimal choice of the method for calculating the derivatives in Step III is problem specific and its accuracy depends on the smoothness of the value function. The approaches available include a variety of difference formulas, Richardson Extrapolation, or curve fitting with cubic splines. These are described at length in the standard numerical methods texts such as    .
A brief note should be made at this point on the accuracy of the algorithm. In principle, arbitrary accuracy of the approximation can be achieved by simultaneously increasing the dimension of the approximating family of functions in Step I, increasing the number of Monte Carlo iterations in Steps I and II, and decreasing the denominator in Step III. In practical applications, however, there are several sources of approximation error. First, in order to obtain the values of the optimal value function at a point, one relies on the approximations of the policy functions implied by the numerical solution of the model. Second, because we consider stochastic models, there is an additional error stemming from the evaluation of the integral in the computation of expected discounted returns. Finally, numerical differentiation introduces two more sources of error: the truncation error and the roundoff error. The truncation error comes from omitting higher-order terms in the Taylor series expansion. The roundoff error is associated with storing real numbers in the computer’s floating-point format. Section 5 discusses some practical accuracy issues in the context of an example.
4. Implementation of the Algorithm
This section describes a practical computational strategy for implementing the algorithm using the example from Section 2. The optimality conditions include partial derivatives of the value function corresponding to the dynamic programming formulation of the agent’s outside option, i.e. autarky. The functional equation for the autarkic problem is:
The objective of the algorithm is to find the values of the partials and at a point that is likely to occur in equilibrium. Since an analytical expression for these derivatives is in general unavailable, we have no choice but to rely on numerical differentiation. A further complication is that a closed-form solution for the optimal value function is generally unavailable as well. Hence, one needs to approximate the value function at two points, e.g. and , with arbitrary accuracy in order to be able to use the finite-difference approach.
The first step of the algorithm involves solving the model with a spectral method that can approximate the policy functions with arbitrary accuracy. This example relies on a version of the stochastic simulation algorithm, which formulates the solution in terms of approximated policy functions. The Euler equation for the problem is given by:
where marginal utility of consumption is
To simplify the exposition we will consider the case of non-persistent habits, which corresponds to . We assume the functional forms standard in the growth literature. The instantaneous utility function is given by
, where and . The production function
is Cobb-Douglas and is given by . The stochastic process for productivity is , where , and are independent normally distributed random variables with zero mean and variance . In this example, we restrict attention to one particular set of parameters, summarized in Table 1.
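The productivity series used throughout is cheap to generate. The sketch below assumes the scalar log-AR(1) specification ln z(t+1) = rho*ln z(t) + eps(t+1) with eps ~ N(0, sigma^2); the exact specification (including any multi-country VAR form) should be taken from the model at hand.

```python
import numpy as np

def simulate_lnz(lnz0, rho, sigma, T, rng):
    """Draw a path of log-productivity: ln z_{t+1} = rho*ln z_t + eps_{t+1}."""
    lnz = np.empty(T + 1)
    lnz[0] = lnz0
    eps = sigma * rng.standard_normal(T)
    for t in range(T):
        lnz[t + 1] = rho * lnz[t] + eps[t]
    return lnz
```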
The algorithm follows the three steps: 1) numerical solution; 2) Monte Carlo integration; 3) numerical differentiation.
Step I (Numerical Solution) The sequences of optimal allocations must satisfy the following system of stochastic difference equations:
For expositional purposes, we solve the model with a version of the stochastic simulation algorithm, which is the easiest to implement (see e.g.  ). It takes the following steps:
1) Fix the initial conditions and draw a series of that obeys the law of motion for the exogenous shocks, with T sufficiently large. To ensure sufficient accuracy of the solution, we chose for all the numerical examples considered. The computational burden of this step is still rather low, since the model needs to be solved only once.
2) Substitute the conditional expectations in (11) with the flexible functional forms that depend on the state variables and some coefficients, , to yield
Table 1. Parameterization of the model.
where , and denotes a polynomial of degree n. By using the exponent of the logarithmic polynomial expansion, we guarantee that the left-hand side of (11) remains positive. Given , the next-period values of the capital and habit stocks follow directly from the laws of motion (12) and (13).
3) Using the realizations of , repeat the previous step in order to obtain recursively a series of the endogenous variables , for this particular parameterization .
4) Run the following non-linear regression
where the dependent variable is the expression inside the conditional expectation in Equation (11).
5) Letting be the result of the regression in the previous step, use an iterative procedure to find the fixed point of S, and the set of coefficients . This would provide the solution for the endogenous variables for this particular realization of the stochastic process along with the approximated policy functions:
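The damped fixed-point search in step 5 can be sketched generically. In the sketch below, which is our own notation rather than the paper's, `S` stands for the composite map of steps 2)-4) (simulate the model under the current coefficients, then re-estimate them by the non-linear regression), and `lam` is a damping weight:

```python
import numpy as np

def solve_coefficients(S, gamma0, lam=0.5, tol=1e-6, max_iter=500):
    """Iterate gamma <- lam*S(gamma) + (1-lam)*gamma until the
    regression returns (approximately) the coefficients it was fed."""
    gamma = np.asarray(gamma0, dtype=float)
    for _ in range(max_iter):
        gamma_new = np.asarray(S(gamma), dtype=float)
        if np.max(np.abs(gamma_new - gamma)) < tol:
            return gamma_new
        gamma = lam * gamma_new + (1.0 - lam) * gamma  # damping stabilizes the iteration
    raise RuntimeError("fixed point not found; try a smaller damping weight")
```

A smaller `lam` slows convergence but helps when the undamped iteration oscillates or diverges.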
Step II (Monte Carlo Integration) Our objective is to find approximations of the partials at a range of points. Supposing that the point of interest is , the algorithm proceeds as follows:
Simulate N sequences of the realizations of the stochastic process of size with a starting value , for all . For each sequence, simulate the series of the endogenous variables using the approximated policy functions, the laws of motion (12)-(13), and the corresponding initial values . Using the simulated series, calculate the discounted sums of the instantaneous utilities and average over N,
Step III (Numerical Differentiation) Repeat Step II to obtain approximations of the optimal value function at and , where is a small positive number. Calculate the approximated value of the partial derivative using Stirling’s finite difference formula:
The partial with respect to the habit stock is obtained in a similar way. The length of the simulated series can be very moderate due to discounting of the future utilities. The optimal value of is both computer and problem specific.
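The trade-off behind the choice of the step size is easy to exhibit numerically. In the toy computation below (our own illustration, using exp at x = 1 in double precision), a moderate step is far more accurate than a large one, because the truncation error grows as h^2; shrinking h much further eventually backfires, because the roundoff error, roughly machine epsilon over h, takes over for steps much below about 1e-5:

```python
import math

def central_diff(f, x, h):
    return (f(x + h) - f(x - h)) / (2.0 * h)

# Absolute error of the derivative of exp at x = 1 for three step sizes.
errors = {h: abs(central_diff(math.exp, 1.0, h) - math.e)
          for h in (1e-1, 1e-5, 1e-12)}
# Truncation dominates at h = 1e-1; roundoff typically dominates at
# h = 1e-12; h = 1e-5 sits near the sweet spot for double precision.
```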
5. Numerical Accuracy: A Comparison
This section considers the accuracy of the algorithm in the context of our example. First, we compare the performance of our algorithm with the approach in  when such a comparison is feasible. Next, we present several special cases, which isolate the contributions of different sources to the overall approximation error.
Consider the optimality conditions for the autarkic problem, written in the sequence form:
Condition (17) can be used to compare our algorithm with the method in  . The latter requires solving the model numerically and expressing the derivatives of interest in terms of conditional expectations and functions of the equilibrium path of the model. Applying recursive substitution and the law of iterated expectations to (17) yields:
An approximation of this derivative can be obtained by parameterizing the right hand side with flexible functional forms in the state variables and running a non-linear regression using the simulated series from the numerical solution of the model.
Figure 1 shows the approximations of the derivative obtained using our algorithm and the approach in  . The approximated values of are plotted for a range of a state variable while keeping the remaining states fixed at their deterministic steady-state values. The histograms plot the sample distributions of the capital and habit stocks. A few observations can be made based on Figure 1. First, the two algorithms produce indistinguishable results when the state variables take values that occur often in equilibrium. Second, for points that are unlikely to occur in equilibrium, the approximations differ significantly. To see this feature, consider the range of values of the capital stock in excess of 6.5. The plots of the approximate derivatives reported in the upper panel of Figure 1 do not coincide. Moreover, the upper tail of the histogram suggests that such values of are not unlikely to occur in equilibrium. Notice that while considering a relatively high value of we kept the remaining arguments of at their deterministic steady-state values. However, points where capital is very high while consumption (and hence the habit stock) is at the steady-state level are rather unusual. This is an expected result, since Monte Carlo integration delivers good approximations only in the region of the state space that is frequently visited by the model in equilibrium.
The framework we have chosen for a worked-out example embeds several well-known special cases. For instance, if , it reduces to the Brock-Mirman stochastic growth model. In this case, the analytical form of the one-period return function r, which maps the graph A of the feasibility correspondence into the real numbers, is known. The correspondence describing the feasibility constraints is given by
and the instantaneous return function becomes
where . Hence, by virtue of the Benveniste-Scheinkman theorem the derivative of interest can be expressed as
where g is the optimal policy function for the capital stock. This special case allows us to compare the output of our algorithm with an approximation whose only source of error is the approximation of the policy function g, and thereby to isolate the contribution of the errors in evaluating the integrals and in numerical differentiation to the overall approximation error of the algorithm. As shown in Figure 1, the approximation delivered by our algorithm is very close to the approximation that relies on the Benveniste-Scheinkman theorem. Once again, in the region of the state space that is often visited by the model in equilibrium, the two approximations are virtually identical. This allows us to tentatively suggest that the main contribution to the approximation error of the algorithm comes from the approximation of the policy functions.
Our final special case compares the approximation of the derivative with its known exact solution. It is well known that for the functional forms , , and , the optimal policy function is defined by the simple law of motion . Moreover, the derivative of the value function has the following analytical solution:
Figure 1. Comparison with the algorithm of Marcet and Marimon  .
By replacing the approximated policy function with the known closed-form solution, we can isolate the effect of the errors stemming from Monte Carlo integration and numerical differentiation on the accuracy of the approximation. Figure 2 compares the approximated derivative obtained using the exact policy function for with its analytical counterpart. The reported graphs are visually indistinguishable over a range of six standard deviations of around its deterministic steady-state value. The approximation errors stemming from Monte Carlo integration and numerical differentiation are on the order of 10−9 of the value of the derivative. This suggests that obtaining an accurate approximation of the policy functions is crucial for the accuracy of the whole algorithm.
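This special case is easy to reproduce independently. The sketch below is our own illustration with an arbitrary parameterization: it uses the textbook Brock-Mirman results for log utility, Cobb-Douglas production, and full depreciation, namely that the exact policy is k' = alpha*beta*z*k^alpha with c = (1 - alpha*beta)*z*k^alpha, and that the derivative of the value function is alpha/((1 - alpha*beta)*k). It simulates the value function under the exact policy and differentiates by central differences, reusing the same shock draws at both perturbed initial capital stocks (common random numbers), so that the noisy parts of the two value estimates cancel:

```python
import numpy as np

alpha, beta, rho, sigma = 0.36, 0.95, 0.90, 0.01

def value_hat(k0, eps_paths):
    """Average discounted sum of ln(c_t) over simulated paths, under the
    exact Brock-Mirman policy c = (1-alpha*beta)*z*k^alpha,
    k' = alpha*beta*z*k^alpha, with log-AR(1) productivity (z_0 = 1)."""
    N, T = eps_paths.shape
    total = 0.0
    for n in range(N):
        k, lnz, disc = k0, 0.0, 1.0
        for t in range(T):
            z = np.exp(lnz)
            total += disc * np.log((1.0 - alpha * beta) * z * k**alpha)
            k = alpha * beta * z * k**alpha
            lnz = rho * lnz + eps_paths[n, t]
            disc *= beta
    return total / N

eps = sigma * np.random.default_rng(1).standard_normal((20, 200))
h, k0 = 1e-2, 1.0
dV = (value_hat(k0 + h, eps) - value_hat(k0 - h, eps)) / (2.0 * h)
exact = alpha / ((1.0 - alpha * beta) * k0)   # analytical derivative
```

With common random numbers, even 20 paths of length 200 recover the derivative to a relative error well below 10^-3 at this step size, because the path-specific terms enter ln c_t additively and cancel in the difference.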
6. Concluding Remarks
This paper contributes to the literature on recursive contracts by proposing an algorithm for computing the gradient of the optimal value function using simulation-based techniques. The proposed procedure is conceptually straightforward, computationally inexpensive, and simple to implement. It allows researchers to extend existing risk-sharing models with limited commitment by including additional endogenous state variables. For example, one may extend a multi-country single-good model in  by including different types of capital or time-non-separable preferences. Alternatively, a multi-country multi-good model in  can be extended to include capital accumulation and cross-country trade in investment goods.
Figure 2. Comparison with a Closed Form Solution.
In terms of accuracy, the algorithm demonstrates performance comparable with Marcet and Marimon’s method  in our benchmark example. In contrast to  , our method is flexible enough to handle dynamic models with large numbers of state variables even when derivatives of interest cannot be expressed in terms of conditional expectations and functions of the equilibrium path of the model.
While our algorithm has wide applicability, it inherits its speed and accuracy trade-offs from the underlying numerical solution method. Our experiments suggest that obtaining an accurate approximation of the policy functions is crucial for the accuracy of the whole algorithm.
An additional limitation on the algorithm’s computational speed is imposed by the model’s simulation and Monte Carlo integration. However, both steps can be parallelized along the lines proposed in  in order to reduce the computational burden. Exploring the costs and benefits of a parallel implementation of the algorithm is left for future research.
1Derivation of the optimality condition is available upon request.
 Golosov, M., Tsyvinski, A. and Werquin, N. (2016) Recursive Contracts and Endogenously Incomplete Markets. In: Handbook of Macroeconomics, Vol. 2, Elsevier, Amsterdam, 725-841.
 Creel, M. (2008) Using Parallelization to Solve a Macroeconomic Model: A Parallel Parameterized Expectations Algorithm. Computational Economics, 32, 343-352.