On Nonlinear Pricing

1. Introduction

Studying prices is a cornerstone of economic analysis. The economics of prices is well understood under convexity. This includes the linkages between prices and efficiency in market economies (e.g., see Debreu, 1959; Mas-Colell et al., 1995), leading to the argument that marginal cost pricing is efficient in competitive markets. This also includes the evaluation of nonmarket goods, their prices being assessed as the marginal value of the goods (e.g., the evaluation of the price of carbon emission; see Nordhaus, 2019). While these results hold under convexity assumptions, they do not apply under nonconvexity. Indeed, under convexity, the separating hyperplane theorem holds, and a separating hyperplane can be used to define Lagrange multipliers representing relative prices (e.g., Takayama, 1985). But the separating hyperplane theorem does not hold under nonconvexity, thus raising questions about the validity of the standard Lagrangian approach and of the associated price evaluation. Such arguments have stimulated interest in revisiting and generalizing previous approaches under nonconvexity (e.g., Gould, 1969; Giannessi, 1984, 2005). A key argument is that nonconvexity implies a need to replace a separating hyperplane with a nonlinear separating hypersurface. Being nonlinear, a separating hypersurface implies nonlinear pricing. This is a key motivation for this paper: nonconvexity requires an explicit investigation of nonlinear pricing.

To stress the importance of this issue, note that nonconvexity can arise from multiple sources. An example is the case of a firm under increasing returns to scale (IRS) in production activities. IRS is a form of nonconvexity that arises in the presence of fixed costs, making it rather common. But the efficiency of marginal cost pricing does not hold under IRS. Indeed, under IRS and competition, marginal cost is less than average cost, implying that marginal cost pricing generates negative profit, which is unsustainable. Another example of nonconvexity is the presence of externalities: as shown by Baumol and Bradford (1972) and Starrett (1972), externalities are a source of nonconvexity. In general, the presence of nonconvexity can invalidate the efficiency of marginal cost pricing. This motivates the need to consider nonlinear pricing. The fact that nonlinear pricing is commonly observed in market economies (e.g., Wilson, 1993) indicates the importance of refining our understanding of nonlinear pricing.

This paper examines how nonlinear pricing arises under nonconvexity. The analysis is developed in the context of a constrained optimization problem under nonconvexity. Note that exploring these issues is not new. For example, generalized Lagrangian approaches have been explored to deal with nonconvexity (e.g., Gould, 1969; Rockafellar, 1974; Giannessi, 1984, 2005; Rubinov et al., 2002). Our analysis builds on the work by Gould (1969) and Giannessi (1984, 2005) who studied the linkages between a saddle-point of a Generalized Lagrangian and the solution to a constrained optimization problem. A key insight is to allow for nonlinear penalty functions in the generalized Lagrangian, penalty functions that provide a representation of a separating hypersurface that exists under general nonconvexity. It is well known that Lagrange multipliers can be interpreted as “marginal values of the constraints” under convexity (e.g., Takayama, 1985). This interpretation continues to apply to a Generalized Lagrangian approach under nonconvexity. Indeed, the slopes of the penalty functions provide measures of shadow prices under general conditions. But penalty functions being nonlinear imply nonlinear pricing. Importantly, the shape of the penalty functions (and hence the type of nonlinear pricing) depends on the nature of nonconvexity (as the penalty functions must satisfy the separation property).

This paper studies the nature of nonlinear pricing in the context of a general constrained optimization problem. The analysis also applies to the evaluation of economic efficiency. Indeed, following Luenberger (1995), Pareto efficiency can be expressed as the maximization of aggregate benefit subject to aggregate demands not exceeding aggregate supplies. Chavas and Briec (2012) and Chavas (2017) show that this result continues to hold under nonconvexity. In this case, pricing involves assessing the “marginal values” of the constraints. Under nonconvexity, this means that nonlinear pricing becomes an explicit part of efficiency evaluation. This argument is relevant to nonmarket allocation, in which case our analysis applies to the shadow prices of nonmarket goods (Rosen, 1974) and contracts (Salanié, 1999). It also applies to market allocations where market prices now play two roles: 1) they must clear the market; and 2) they must provide proper incentives to achieve Pareto efficiency. Our analysis provides useful insights into this second role when efficiency under nonconvexity requires nonlinear pricing.

This paper is organized as follows. Section 2 presents a general constrained optimization problem under nonconvexity. It also provides an example illustrating the challenges created by nonconvexity. The properties of pricing under nonconvexity are discussed in Section 3, where a generalization of the envelope theorem is presented. Section 4 discusses the economic implications of our analysis for nonlinear pricing and price discrimination.

2. Constrained Optimization under Nonconvexity

Consider the following constrained optimization problem:

${f}^{*}\left(b\right)={\mathrm{max}}_{x}\left\{f\left(x\right):g\left(x\right)\le b,x\in X\right\}$ (1)

where $X\subset {\mathbb{R}}^{n}$ , $f:X\to \mathbb{R}$ is the direct objective function, $b=\left({b}_{1},\cdots ,{b}_{m}\right)\in {\mathbb{R}}^{m}$ and $g:X\to {\mathbb{R}}^{m}$ define m constraints denoted by ${g}_{j}\left(x\right)\le {b}_{j},j\in \left\{1,\cdots ,m\right\}$ . Letting ${x}^{*}\left(b\right)\in \mathrm{arg}{\mathrm{max}}_{x}\left\{f\left(x\right):g\left(x\right)\le b,x\in X\right\}$ , ${f}^{*}\left(b\right)$ in (1) is the indirect objective function satisfying ${f}^{*}\left(b\right)=f\left({x}^{*}\left(b\right)\right)$ . The feasible set in (1) is $S\left(b\right)=\left\{x:g\left(x\right)\le b,x\in X\right\}$ . Throughout the paper, we assume that the functions f and g are continuous and that the feasible set $S\left(b\right)$ is non-empty and closed. The optimization problem (1) is well understood under convexity where $f\left(x\right)$ is a concave function and $S\left(b\right)$ is a convex set (e.g., Takayama, 1985). Our analysis focuses on the case of nonconvexity where $f\left(x\right)$ is not necessarily concave and/or the feasible set $S\left(b\right)$ is not necessarily convex.

Problem (1) covers a wide range of economic applications. An example is where $f\left(x\right)$ is a profit function and (1) represents profit maximization for a firm. In an industry exhibiting increasing returns to scale (IRS), the underlying technology would be nonconvex. As noted in the introduction, under IRS and marginal cost pricing, competitive firms would fail to generate a positive profit, indicating that marginal cost pricing cannot be efficient. Another example is where $f\left(x\right)$ is the earning capacity of a household and (1) represents the choice of time allocation in the maximization of household income.

A third example is the case of economic efficiency. To see that, consider the case where $x=\left(y,z\right)$ , $y=\left({y}_{1},\cdots ,{y}_{K}\right)$ , ${y}_{k}\in {\mathbb{R}}_{+}^{m}$ denotes the consumption of m goods by the k-th consumer, $k\in \left\{1,\cdots ,K\right\}$ , K is the number of consumers, $z\in Z\subset {\mathbb{R}}^{m}$ are m aggregate production goods and Z is the feasible set for z. Letting ${f}_{k}\left({y}_{k}\right)$ be the benefit function obtained by the k-th consumer, Luenberger (1995) showed that Pareto efficiency implies the maximization of aggregate benefit. Thus, a Pareto efficient allocation $x=\left(y,z\right)$ must satisfy the maximization problem (1) where

$f\left(y\right)={\displaystyle {\sum}_{k=1}^{K}{f}_{k}\left({y}_{k}\right)}$ (2a)

denotes aggregate benefit, and feasibility is represented by the constraints

${\displaystyle {\sum}_{k=1}^{K}{y}_{k}}\le z+b$ , (2b)

where b denotes initial endowment and $x=\left({y}_{1},\cdots ,{y}_{K},z\right)\in {\mathbb{R}}_{+}^{mK}\times Z$ . Equation (2b) imposes the restriction that aggregate demand for goods does not exceed aggregate supply. Luenberger (1995) showed that Pareto efficiency implies the maximization of aggregate benefit given (1)-(2) under convexity (i.e., when the benefit functions ${f}_{k}\left({y}_{k}\right)$ are concave and the set Z is convex). As shown by Chavas and Briec (2012) and Chavas (2017), this argument continues to apply under nonconvexity, when the benefit functions ${f}_{k}\left({y}_{k}\right)$ are not concave and/or the set Z is nonconvex. As discussed in the introduction, there can be multiple sources of nonconvexity for Z (including the case where production technology exhibits increasing returns to scale).

The marginal effects of b on ${f}^{*}\left(b\right)$ in (1) have been of great interest. In general, these marginal effects are the shadow prices of the constraints in (1). This interpretation also holds in the general evaluation of efficiency associated with (1)-(2). This shadow price interpretation is very useful when the goods are nonmarket goods (e.g., carbon emission) or when allocation decisions are made by nonmarket institutions (e.g., government or contract). In such cases, evaluating the marginal effects of b on ${f}^{*}\left(b\right)$ can provide useful guidance in policy making and contract design. Alternatively, when the goods are allocated in a market economy, these marginal values become market prices. Given the strong linkages between (1)-(2) and efficiency, our analysis examines the linkages between market pricing and efficiency under nonconvexity. In all cases, we want to answer the questions: What are the marginal values of the constraints in Equation (1)? And what are the implications for pricing? The answers to these questions are well known under convexity. But they are more challenging under nonconvexity. As discussed below, nonconvexity means that we must allow for nonlinear pricing.

Following Gould (1969), consider a generalized Lagrangian associated with (1). Define a function $h\in H$ , where $H=\left\{{h}^{\prime}:{\mathbb{R}}^{m}\to \mathbb{R}|{h}^{\prime}\left({a}^{\prime}\right)\ge {h}^{\prime}\left(a\right),\forall {a}^{\prime}\ge a;{h}^{\prime}\left(0\right)=0\right\}$ , H being the set of non-decreasing functions h mapping ${\mathbb{R}}^{m}$ into $\mathbb{R}$ and satisfying $h\left(0\right)=0$ . For a given b and treating the function h as a penalty function, define the Generalized Lagrangian

$L\left(x,h,b\right)=f\left(x\right)+h\left(b\right)-h\left(g\left(x\right)\right)$ , (3)

where $x\in X$ and $h\in H$ . We allow the penalty function h to be nonlinear. Note that, under convexity (where X is a convex set, the function f is concave and the function g is convex), the separating hyperplane theorem applies, the function h can be taken to be linear, and Equation (3) reduces to the standard Lagrangian $L\left(x,\lambda ,b\right)=f\left(x\right)+{\displaystyle {\sum}_{i\in M}{\lambda}_{i}\left[{b}_{i}-{g}_{i}\left(x\right)\right]}$ , where $M=\left\{1,\cdots ,m\right\}$ indexes the constraints and $\lambda =\left({\lambda}_{1},\cdots ,{\lambda}_{m}\right)\in {\mathbb{R}}_{+}^{m}$ is a vector of nonnegative Lagrange multipliers (e.g., Takayama, 1985). But such arguments no longer apply under nonconvexity, inducing us to examine the case where the function h is nonlinear. This is illustrated in Figure 1.

Figure 1 represents a maximization problem where $n=2$ , the function f is increasing and the feasible set S has an upper bound. Figure 1 shows that the global solution is at point O. In the context of efficiency analysis under (2), the line ABC would be an indifference curve giving the set of points that generate the same value of f as the optimum value ${f}^{*}$ ; and the curve DEF would be the boundary of the production possibility set. Figure 1 represents a situation of nonconvexity where the objective function f is nonconcave and the feasible set S is nonconvex. Under nonconvexity, Figure 1 shows that the separating hyperplane theorem does not apply. Indeed, at point O, the line GG’ is tangent to both the upper bound of the feasible set (the curve EOF) and the indifference curve in the neighborhood of point O (the line BC). But the hyperplane GG’ cuts the curve DEF and enters the feasible set close to point G’. This occurs because of the nonconvexity of the feasible set $S=\left\{x:g\left(x\right)\le 0,x\in X\right\}$ . Thus, under nonconvexity, the line GG’ is not a globally separating hyperplane. Without this global separation property, the standard Lagrangian approach fails to provide a proper characterization of a global solution to (1). This failure also applies to the global validity of Lagrange multipliers as a representation of relative prices.

Figure 1. Maximization under non-convexity.

Figure 1 illustrates four important arguments. First, in Figure 1, the line GG’ still exhibits the separation property in the neighborhood of point O. Indeed, when limited to this neighborhood, the line GG’ stays above the boundary of the feasible set EF and below the indifference curve BC. It means that the standard Lagrangian would remain valid “locally”, i.e. in a neighborhood of the optimum point O. Under differentiability, this result supports the standard use of the Kuhn-Tucker conditions as representing a local optimum. But this positive result is undermined by the second argument: Figure 1 shows the existence of point O’, located at the tangency between the boundary of the feasible set (the line DE) and the indifference curve going through point O’. It means that point O’ is a local solution to the maximization problem (in the sense that, in a small neighborhood of O’, there is no other feasible point that can increase f). Again, a standard Lagrangian approach could identify point O’ as a local solution. The problem is that, comparing points O and O’, Figure 1 shows that O is the global solution and O’ is not (as O is located on a higher indifference curve). This illustrates that the standard Lagrangian approach (and its associated Kuhn-Tucker conditions under differentiability) can identify local solutions that are not global. Can this limitation be overcome? The third argument addresses this question. As just noted, Figure 1 shows that the line GG’ is not a separating hyperplane. But the nonlinear line HG exhibits the separation property. Indeed, the line HG is a nonlinear separating hypersurface: except at the optimum point O, it always remains below the indifference curve ABC and above the boundary of the feasible set DEF. This indicates that a separating hypersurface always exists under nonconvexity, provided that we allow it to be nonlinear. This is a key insight we get from Figure 1. To the extent that a penalty function represents the separating function in Lagrangian approaches, it means that, in the presence of nonconvexity, we must consider a nonlinear function h in the Generalized Lagrangian (3). The fourth argument obtained from Figure 1 is that the choice of this nonlinear function is not arbitrary: it must be chosen to satisfy the separation property.

We know that, under convexity, $h\left(a\right)$ can be chosen to be linear (from the separating hyperplane theorem), its slopes being Lagrange multipliers that reflect the marginal values of the constraints. When the objective function f has a monetary interpretation, the Lagrange multipliers are the shadow prices of the constraints in (1). As we show in the next section, this interpretation remains valid under nonconvexity with one exception: allowing $h\left(a\right)$ to be nonlinear means that the slopes of $h\left(a\right)$ (i.e., the shadow prices) are no longer constant and we must consider explicitly nonlinear pricing. Figure 1 illustrates this argument. The hyperplane G’OG does not provide proper information on pricing. Indeed, if we were to take the slope of G’OG as measures of relative prices, then profit-maximizing competitive producers would improperly choose to produce at point D (as profit would be higher at point D than at point O). In the evaluation of efficiency in a market economy, this implies that uniform pricing would be inefficient. Importantly, the incentive to produce would shift to point O under the nonlinear separating hypersurface HOG. In other words, in the presence of nonconvexity, nonlinear pricing becomes an integral part of finding a global solution to the optimization problem (1). And when applied to efficiency analysis under (2) (where f measures aggregate benefit), nonlinear pricing becomes a central part of an efficient allocation.

These claims are now formalized in the context of the maximization problem (1). Consider the following dual problem

${L}^{*}\left(b\right)={\mathrm{inf}}_{h}{\mathrm{sup}}_{x}\left\{L\left(x,h,b\right):h\in H,x\in X\right\}$ (4)

Let ${x}_{b}^{*}$ and ${h}_{b}^{*}$ be a solution to problem (4) that satisfies ${L}^{*}\left(b\right)=L\left({x}_{b}^{*},{h}_{b}^{*},b\right)$ . A key issue is the relationship between ${L}^{*}\left(b\right)$ in (4) and the indirect objective function ${f}^{*}\left(b\right)$ in the primal problem (1). In general, a weak duality relationship holds: ${L}^{*}\left(b\right)\ge {f}^{*}\left(b\right)$ (Rubinov et al., 2002; Giannessi, 2005). Since ${L}^{*}\left(b\right)$ is an upper bound to ${f}^{*}\left(b\right)$ , the difference ${L}^{*}\left(b\right)-{f}^{*}\left(b\right)\ge 0$ is called a “duality gap”. In this context, a “zero-duality gap” occurs when ${L}^{*}\left(b\right)={f}^{*}\left(b\right)$ , i.e. when the primal problem (1) and the dual problem (4) have the same value. The following result was obtained by Rubinov et al. (2002).

Lemma 1. There is a zero-duality gap at b if and only if ${f}^{*}\left(b\right)$ is upper semi-continuous at b.

Lemma 1 establishes that the upper semi-continuity of ${f}^{*}\left(b\right)$ at b is a necessary and sufficient condition for a zero-duality gap. This condition is important for our analysis: it guarantees a close relationship between the maximization problem (1) and the dual Generalized Lagrangian problem (4). Note that the upper semi-continuity of ${f}^{*}\left(b\right)$ at b involves the effect of changing the constraints $\left[g\left(x\right)\le b\right]$ . As such, it is a “constraint qualification”. Other constraint qualifications have been proposed as sufficient conditions to obtain a zero-duality gap. They include Slater’s condition (stating that $\mathrm{int}\left(S\right)\ne \varnothing $ ) and various rank conditions on $\partial g/\partial x$ under differentiability (Bertsekas, 1995). Since Lemma 1 presents a condition that is necessary and sufficient for a zero-duality gap, it follows that these other constraint qualifications are special cases of the condition stated in Lemma 1. In other words, the upper semi-continuity of ${f}^{*}\left(b\right)$ at b is a “generalized constraint qualification”. We assume that it is satisfied throughout the rest of the paper.
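To make the notion of a duality gap concrete, the following is a minimal numerical sketch on a hypothetical toy problem that is not part of the analysis above: $f\left(x\right)={x}^{2}$ , $g\left(x\right)=x$ , $X=\left[0,2\right]$ and $b=1$ , with X approximated by a grid. The function names, the grid step and the range searched for the linear multiplier are all illustrative assumptions. The sketch compares the primal value with the dual value obtained when h is restricted to be linear, and with the value of $\mathrm{sup}_{x}L$ under one nonlinear penalty in H.

```python
# Hypothetical toy problem (an assumption for illustration only):
# maximize f(x) = x**2 subject to g(x) = x <= b, with X = [0, 2].

def f(x):
    return x * x

def g(x):
    return x

# Grid approximation of X = [0, 2]; the step 0.01 is an arbitrary choice.
X_GRID = [i / 100 for i in range(201)]

def primal_value(b):
    """f*(b): best objective value over the feasible grid points."""
    return max(f(x) for x in X_GRID if g(x) <= b)

def sup_lagrangian(h, b):
    """sup over x in X of L(x, h, b) = f(x) + h(b) - h(g(x))."""
    return max(f(x) + h(b) - h(g(x)) for x in X_GRID)

def linear_dual_value(b):
    """inf over linear penalties h(a) = lam * a, with lam searched on [0, 5]."""
    lams = [i / 100 for i in range(501)]
    return min(sup_lagrangian(lambda a, lam=lam: lam * a, b) for lam in lams)

def quadratic_penalty(a):
    """A nonlinear, non-decreasing penalty with h(0) = 0."""
    m = max(a, 0.0)
    return m * m

b = 1.0
primal = primal_value(b)
linear_dual = linear_dual_value(b)
nonlinear_dual = sup_lagrangian(quadratic_penalty, b)
print(primal, linear_dual, nonlinear_dual)
```

In this sketch the best linear penalty leaves a strictly positive duality gap (a dual value of 2 against a primal value of 1), while the quadratic penalty attains the primal value up to rounding. Since f is convex here, the toy problem is nonconvex in the sense used in this section.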

The following key result was first obtained by Gould (1969).

Lemma 2. If there is a zero-duality gap at b, then the following properties hold:

$L\left({x}_{b}^{*},h,b\right)\ge L\left({x}_{b}^{*},{h}_{b}^{*},b\right)\ge L\left(x,{h}_{b}^{*},b\right),x\in X,h\in H$ , (5)

$g\left({x}_{b}^{*}\right)\le b$ , (6)

${h}_{b}^{*}\left(g\left({x}_{b}^{*}\right)\right)={h}_{b}^{*}\left(b\right)$ , (7)

${x}_{b}^{*}\in \mathrm{arg}{\mathrm{max}}_{x}\left\{f\left(x\right):g\left(x\right)\le b,x\in X\right\}$ . (8)

Lemma 2 shows that, under a zero-duality gap, finding a saddle-point in (4) or (5) is equivalent to solving the optimization problem (1). Importantly, this result holds globally; and it holds under non-convexity. This includes as a special case situations where the function f is concave, the function g is convex and X is a convex set: the separating hyperplane theorem then holds, h can be taken to be linear and the coefficients of h are standard Lagrange multipliers. As stressed in Gould (1969) and Giannessi (1984, 2005), Lemma 2 makes it clear how a Generalized Lagrangian approach can support the analysis of general constrained optimization problems under non-convexity. The key to this generalization is the non-linearity of the function h.

Lemma 2 also presents two additional results. First, Equation (6) shows that a saddle-point of the Generalized Lagrangian is always consistent with the feasibility constraints $g\left(x\right)\le b$ . Second, Equation (7) is a complementary slackness condition. It states that, at a saddle-point of the Generalized Lagrangian and for any constraint $i\in M$ , at least one of the following two conditions must hold: 1) the function ${h}_{ib}^{*}\left(b\right)$ is strictly increasing in ${b}_{i}$ and the constraint ${g}_{i}$ is binding; or 2) the constraint ${g}_{i}$ is not binding and the function ${h}_{ib}^{*}\left(b\right)$ does not vary with ${b}_{i}$ in the neighborhood of b. This condition is a generalization of a similar condition obtained in the standard Lagrangian approach when the function h is linear (e.g., under convexity).

3. Pricing under Nonconvexity

Lemma 2 applies to the general constrained optimization problem in (1). It states that a saddle-point of the Generalized Lagrangian identifies the solution to the maximization problem. It also involves the penalty function ${h}_{b}^{*}\in H$ . As discussed in this section, this function provides a useful characterization of pricing under general conditions.

Proposition 1. Under a zero-duality gap, consider two points $b\in {\mathbb{R}}^{m}$ and ${b}^{\prime}\in {\mathbb{R}}^{m}$ . Then, the following inequalities hold

${h}_{b}^{*}\left({b}^{\prime}\right)-{h}_{b}^{*}\left(b\right)\ge {f}^{*}\left({b}^{\prime}\right)-{f}^{*}\left(b\right)\ge {h}_{{b}^{\prime}}^{*}\left({b}^{\prime}\right)-{h}_{{b}^{\prime}}^{*}\left(b\right)$ . (9)

Proof. Under a zero-duality gap, the Generalized Lagrangian has a saddle-point given in (5) for b and for ${b}^{\prime}$ . Then, the second inequality in (5) evaluated at $x={x}_{{b}^{\prime}}^{*}$ implies that

${f}^{*}\left(b\right)=L\left({x}_{b}^{*},{h}_{b}^{*},b\right)\ge L\left({x}_{{b}^{\prime}}^{*},{h}_{b}^{*},b\right)\equiv f\left({x}_{{b}^{\prime}}^{*}\right)+{h}_{b}^{*}\left(b\right)-{h}_{b}^{*}\left(g\left({x}_{{b}^{\prime}}^{*}\right)\right)$ . (10)

Note that

$f\left({x}_{{b}^{\prime}}^{*}\right)+{h}_{b}^{*}\left({b}^{\prime}\right)-{h}_{b}^{*}\left(g\left({x}_{{b}^{\prime}}^{*}\right)\right)\ge f\left({x}_{{b}^{\prime}}^{*}\right)={f}^{*}\left({b}^{\prime}\right)$ , (11)

since $g\left({x}_{{b}^{\prime}}^{*}\right)\le {b}^{\prime}$ from (6) and the function ${h}_{b}^{*}\left(a\right)\in H$ is non-decreasing in a. Summing (10) and (11) yields the first inequality in (9). The second inequality is obtained by multiplying the first inequality by −1 and switching b and ${b}^{\prime}$ .

Q.E.D.
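As a reading aid, the summation step in the proof can be spelled out as follows; this is only a restatement of (10) and (11), not an additional result. Adding the left-hand sides and the right-hand sides of (10) and (11) gives

${f}^{*}\left(b\right)+f\left({x}_{{b}^{\prime}}^{*}\right)+{h}_{b}^{*}\left({b}^{\prime}\right)-{h}_{b}^{*}\left(g\left({x}_{{b}^{\prime}}^{*}\right)\right)\ge f\left({x}_{{b}^{\prime}}^{*}\right)+{h}_{b}^{*}\left(b\right)-{h}_{b}^{*}\left(g\left({x}_{{b}^{\prime}}^{*}\right)\right)+{f}^{*}\left({b}^{\prime}\right)$ .

The terms $f\left({x}_{{b}^{\prime}}^{*}\right)$ and ${h}_{b}^{*}\left(g\left({x}_{{b}^{\prime}}^{*}\right)\right)$ appear on both sides and cancel, leaving

${f}^{*}\left(b\right)+{h}_{b}^{*}\left({b}^{\prime}\right)\ge {h}_{b}^{*}\left(b\right)+{f}^{*}\left({b}^{\prime}\right)$ ,

which rearranges to ${h}_{b}^{*}\left({b}^{\prime}\right)-{h}_{b}^{*}\left(b\right)\ge {f}^{*}\left({b}^{\prime}\right)-{f}^{*}\left(b\right)$ , the first inequality in (9).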

Equation (9) shows how the change in the indirect function ${f}^{*}\left(b\right)$ is closely related to the penalty function ${h}_{b}^{*}\left(b\right)$ in the Generalized Lagrangian approach. When ${b}^{\prime}\ge b$ , the function ${h}_{b}^{*}\in H$ being non-decreasing, Proposition 1 gives the following result:

Corollary 1. When ${b}^{\prime}\ge b$ , we have

${h}_{b}^{*}\left({b}^{\prime}\right)-{h}_{b}^{*}\left(b\right)\ge {f}^{*}\left({b}^{\prime}\right)-{f}^{*}\left(b\right)\ge {h}_{{b}^{\prime}}^{*}\left({b}^{\prime}\right)-{h}_{{b}^{\prime}}^{*}\left(b\right)\ge 0$ . (12)

When the functions ${h}_{b}^{*}$ and ${f}^{*}$ are differentiable at b and letting ${b}^{\prime}\to b$ , the above corollary gives the following result:

Corollary 2. When the functions ${h}_{b}^{*}$ and ${f}^{*}$ are differentiable, we have

$\frac{\partial {f}^{*}\left(b\right)}{\partial b}=\frac{\partial {h}_{b}^{*}\left(b\right)}{\partial b}\ge 0$ , (13)

where $\frac{\partial {h}_{b}^{*}\left(b\right)}{\partial {b}_{i}}={\mathrm{lim}}_{{b}^{\prime}\to b}\left[\frac{{h}_{b}^{*}\left({b}^{\prime}\right)-{h}_{b}^{*}\left(b\right)}{{{b}^{\prime}}_{i}-{b}_{i}}\right],i\in M$ .

Equation (13) is a version of the envelope theorem, stating that the derivative of the indirect objective function ${f}^{*}\left(b\right)$ is equal to the derivative of the penalty function ${h}_{b}^{*}\left(b\right)$ . Equation (13) also states that this derivative is non-negative, meaning that increasing b tends to increase the value of the indirect objective function ${f}^{*}\left(b\right)$ , reflecting that the constraints are becoming less binding. This result implies that the derivative of the penalty function can be interpreted as a measure of “shadow prices” of the constraints. This is an important generalization of the standard Lagrangian approach. Indeed, under convexity, the penalty function ${h}_{b}^{*}$ can be taken to be linear and $\partial {h}_{b}^{*}\left(b\right)/\partial b$ reduces to the standard Lagrange multipliers reflecting the slopes of a separating hyperplane. Our analysis establishes that such arguments generalize under nonconvexity. Indeed, the slopes of the penalty function ${h}_{b}^{*}$ are also shadow prices of the constraints as well as measures of the slopes of a separating hypersurface under nonconvexity.
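A hypothetical worked example may help fix ideas; the functional forms below are assumptions chosen for tractability and are not taken from the analysis. Let $n=m=1$ , $X=\left[0,2\right]$ , $f\left(x\right)={x}^{2}$ and $g\left(x\right)=x$ , with $b\in \left[0,2\right]$ . Then ${x}^{*}\left(b\right)=b$ and ${f}^{*}\left(b\right)={b}^{2}$ . The penalty function $h\left(a\right)={\left(\mathrm{max}\left\{a,0\right\}\right)}^{2}$ is non-decreasing with $h\left(0\right)=0$ , so it belongs to H, and it gives

$L\left(x,h,b\right)={x}^{2}+{b}^{2}-{x}^{2}={b}^{2}={f}^{*}\left(b\right)$ for all $x\in X$ .

Hence ${\mathrm{sup}}_{x}L\left(x,h,b\right)={f}^{*}\left(b\right)$ , there is a zero-duality gap, and h can serve as ${h}_{b}^{*}$ for every b in this range. In this example,

$\frac{\partial {f}^{*}\left(b\right)}{\partial b}=\frac{\partial {h}_{b}^{*}\left(b\right)}{\partial b}=2b\ge 0$ ,

so the shadow price of the constraint rises with b instead of being constant, which is a simple instance of nonlinear pricing.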

Comparing (9) and (13) makes it clear that (9) is a generalization of the envelope theorem in several ways. First, (9) applies under general forms of nonconvexity. Second, it remains valid under any discrete change in b. Third, it holds without assuming differentiability of f or g.

Thus, Proposition 1 shows that the changes $\left[{h}_{b}^{*}\left({b}^{\prime}\right)-{h}_{b}^{*}\left(b\right)\right]$ provide a general measure of the marginal effects of relaxing the constraints by changing b in (1). When the objective function has a monetary interpretation, h also has a monetary interpretation and its gradients provide a measure of prices (or at least of shadow prices when applied to the evaluation of nonmarket goods). When nonconvexity implies that h must be nonlinear, it follows that nonlinear pricing becomes an integral part of economic analysis. Implications of these results are discussed next.

4. Implications

Our analysis makes it clear that nonconvexity requires the introduction of nonlinear pricing in economic analysis. This result implies a need to consider departures from uniform pricing. Such departures are significant as they contrast with standard competitive markets: competitive markets are efficient under convexity, and they are “simple” in the sense that all market participants face the same market-clearing prices (e.g., Debreu, 1959). The role of prices being set to clear the market remains under nonconvexity. But insisting on uniform pricing is not appropriate under nonconvexity. Indeed, as illustrated in Figure 1, uniform pricing can be inefficient. As noted in the introduction, an example is given by a competitive industry where firms exhibit increasing returns to scale (IRS). In this case, uniform pricing is never efficient. Indeed, under marginal cost pricing, marginal cost being less than average cost under IRS, a competitive firm would make a negative profit and would have no incentive to produce. Under average cost pricing, the firm would make zero profit; but under uniform pricing, the outcome would be inefficient (the price paid by all consumers being higher than the marginal cost). The efficient solution is nonlinear pricing: prices are not uniform across all market participants. This involves price discrimination schemes among market participants. As noted by Wilson (1993), such schemes are commonly observed in many markets.
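The IRS argument can be illustrated with a minimal worked example, assuming a hypothetical cost function with a fixed cost $F>0$ and a constant marginal cost $c>0$ (these symbols are introduced here for illustration only): $C\left(q\right)=F+cq$ for output $q>0$ . Average cost is

$C\left(q\right)/q=c+F/q>c$ ,

so marginal cost lies below average cost at every output level. Under marginal cost pricing ( $p=c$ ), profit is $pq-C\left(q\right)=-F<0$ . Under average cost pricing ( $p=c+F/q$ ), profit is zero but the uniform price exceeds marginal cost by $F/q$ . A non-uniform scheme that prices the marginal unit at c while collecting a total of F through higher charges on other units or other participants would yield zero profit while keeping the marginal price equal to marginal cost.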

An example is the case of electricity pricing. The electricity industry faces two issues: 1) power plants exhibit IRS; and 2) the demand for electricity fluctuates over time (e.g., demand is higher during heat waves due to increased use of air conditioning). As just noted, this is a scenario where marginal cost pricing is not sustainable while average cost pricing is inefficient. The efficient pricing scheme is peak-load pricing: charge more for electricity during peak demand, but charge less during off-peak periods (Dutta & Mitra, 2017; Borenstein & Bushnell, 2018). Charging more during peak demand is efficient in two ways: 1) it generates additional income that can cover the difference between average cost and marginal cost (under IRS); and 2) it induces consumers to reduce their demand for electricity during peak periods, thus reducing the need to build costly new power plants just to satisfy peak demand. And charging less for electricity in off-peak periods is efficient if the off-peak price corresponds to marginal cost. Such a price discrimination scheme is a form of nonlinear pricing that can support an efficient allocation under nonconvexity.

As illustrated in Figure 1, nonlinear pricing is linked with the separation property of h, which depends on the nature of nonconvexity. A difficulty is that, while separating hypersurfaces always exist, they are not unique. This non-uniqueness makes the design and evaluation of nonlinear pricing challenging. Indeed, Figure 1 shows that the line HOG is one possible separating hypersurface; but there are others. Another possible separating hypersurface is the line ABOC corresponding to the case of perfect price discrimination. Perfect price discrimination has two important properties (Tirole, 1988): 1) it is efficient (as the line ABOC is a separating hypersurface that goes through the efficient point O); and 2) it generates the largest possible payment by consumers. This last property underlines the fact that price discrimination schemes have implications for income distribution (as further discussed below). But perfect price discrimination is very difficult to implement: by charging each unit of each good a different price (as illustrated by the slope of the line ABOC in Figure 1), it requires a very large amount of information, information that is typically not available to anyone. For this reason, even if they are efficient, perfect price discrimination schemes are neither realistic nor observed. This raises the question: can we find some “simple” price discrimination schemes that are efficient? In the context of the optimization problem in (1)-(2), this involves identifying a nonlinear penalty function that satisfies the separation property. One step in this direction is the Augmented Lagrangian approach proposed by Hestenes (1969) and Rockafellar (1974). In this context, allowing h to be a quadratic function is one possibility. But Figure 1 illustrates that choosing a quadratic function for h would fail to satisfy the global separation property. This reflects the fact that, while polynomial functions provide good local approximation properties, they may not be good choices for penalty functions.

One attractive possibility is to consider a spline function for h. A spline function can provide a global approximation to any function (Ahlberg et al., 1967). This is important in the search for flexibility in the evaluation of separation functions. Linear spline functions may be particularly appealing: being piecewise linear, they would greatly simplify the implications for pricing. In this case, the analysis would identify multiple pricing regimes: prices would be constant between spline knots but they would change across spline knots as one moves across regimes. Uniform pricing would be a special case when there is a single regime. The simplest form of nonlinear pricing would involve linear splines with two regimes. This corresponds to commonly observed two-part tariff schemes (e.g., retailers asking consumers to pay a “membership fee” on top of purchase charges; infant-industry protection policies that charge a different price on the domestic market versus the world market). In these examples, the pricing would be efficient if the lower price is set at marginal cost while the higher price generates added revenue that covers the cost of production (e.g., to pay for fixed cost under IRS). More generally, linear splines can represent flexible pricing schemes obtained by increasing the number of regimes. Examples include volume discounts (where the price is lower as the quantity purchased increases) and tariff-rate quotas commonly used in trade policy (where tariff rates increase in a stepwise manner as import quantities increase). Under a spline parametrization of h, the optimal pricing scheme could then be obtained by searching for the parameters that would satisfy a saddle-point of the Generalized Lagrangian in (3). How many regimes are needed to satisfy the separation property? Unfortunately, there is no general answer to this question: the need for nonlinearity in h depends on the nature of nonconvexity, meaning that the form of nonlinear pricing is expected to vary across situations. This identifies a need for more empirical research on this topic.
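To make the idea of pricing regimes concrete, a piecewise linear payment schedule can be sketched as follows. This is a minimal illustration, not the saddle-point search described above: the function name `payment`, the knot location, the prices and the cost figures (fixed cost 100, marginal cost 2, quantity 50) are all hypothetical. The marginal price is constant between knots and changes at each knot, and a single regime reproduces uniform pricing.

```python
def payment(q, knots, prices):
    """Total payment for quantity q under a linear-spline schedule.

    knots  : increasing quantities at which the marginal price changes
    prices : marginal price in each regime (len(prices) == len(knots) + 1)
    """
    total = 0.0
    lower = 0.0
    for upper, price in zip(knots + [float("inf")], prices):
        if q <= lower:
            break
        total += price * (min(q, upper) - lower)  # units bought in this regime
        lower = upper
    return total

# Hypothetical firm under IRS: cost(q) = fixed_cost + marginal_cost * q
fixed_cost, marginal_cost, q = 100.0, 2.0, 50.0
cost = fixed_cost + marginal_cost * q

# One regime: uniform marginal cost pricing does not cover the fixed cost.
uniform_profit = payment(q, [], [marginal_cost]) - cost

# Two regimes: a higher price on the first 10 units, marginal cost afterwards.
two_regime_profit = payment(q, [10.0], [12.0, marginal_cost]) - cost

print(uniform_profit, two_regime_profit)
```

With these illustrative numbers, uniform marginal cost pricing yields a loss equal to the fixed cost, while the two-regime schedule breaks even and still prices the last unit at marginal cost. Adding knots gives schedules with more regimes, such as volume discounts.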

An important issue in nonlinear pricing is: if prices are not uniform, who is going to pay the lower prices and who is going to pay the higher prices? When consumers are not precisely targeted, consumers can be offered multiple pricing options, letting them decide which option they prefer. In this case, consumers’ self-selection plays a role. Examples abound (e.g., Wilson, 1993). Under volume discounts, the unit price declines with the volume purchased. Then, each consumer decides whether it is worth getting a lower price on a larger purchase (e.g., is it worth buying two pairs of shoes when one can get a second pair at a reduced price?). In purchasing an airline ticket to travel from one city to another, each consumer decides whether they are willing to pay a higher “first class price” involving some “added services” compared to a lower “economy price”. In these cases, the heterogeneity of consumers in their ability to choose prices and their willingness to pay for “added services” plays a role in the extent and feasibility of possible price discrimination. This can be problematic when most consumers choose the low price and the few consumers who choose higher prices do not generate enough income to cover the total cost of production (e.g., including the cost of added services as well as fixed cost under IRS).
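The cost-recovery problem under self-selection can be sketched with a small numerical example. Everything in it is an assumption made for illustration: the two-option menu, the willingness-to-pay values, the unit costs and the fixed cost are hypothetical, and consumers are assumed to pick the option giving them the largest nonnegative surplus.

```python
def choose(wtp, menu):
    """Option with the largest surplus (willingness to pay minus price),
    or None if every option gives a negative surplus."""
    best = max(menu, key=lambda name: wtp[name] - menu[name])
    return best if wtp[best] - menu[best] >= 0 else None

def profit(consumers, menu, unit_cost, fixed_cost):
    """Firm profit when each consumer self-selects from the menu."""
    total = -fixed_cost
    for wtp in consumers:
        option = choose(wtp, menu)
        if option is not None:
            total += menu[option] - unit_cost[option]
    return total

# Hypothetical menu: economy is priced at its marginal cost,
# first class carries a margin meant to cover the fixed cost.
menu = {"economy": 100.0, "first": 300.0}
unit_cost = {"economy": 100.0, "first": 120.0}
fixed_cost = 1000.0

budget = {"economy": 150.0, "first": 200.0}    # prefers economy
premium = {"economy": 150.0, "first": 400.0}   # prefers first class

few_premium = [budget] * 97 + [premium] * 3
more_premium = [budget] * 94 + [premium] * 6

print(profit(few_premium, menu, unit_cost, fixed_cost))
print(profit(more_premium, menu, unit_cost, fixed_cost))
```

With three first-class buyers the margin earned on the high price falls short of the fixed cost, while with six it is covered. The same menu can thus be sustainable or not depending on how consumers sort themselves across options.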

This issue can be resolved in situations where the price discrimination scheme can be more precisely targeted. Recent advances in information technology have expanded the possibilities for firms to implement price targeting. Again, examples abound. Offering senior discounts is price discrimination based on age. Universities practice price discrimination on the basis of geographical origin when tuition differs between in-state and out-of-state students. Discriminating between domestic firms and foreign firms is commonly observed in trade policy. Under precise targeting, such price discrimination schemes improve the prospects of finding nonlinear pricing schemes that are efficient. For example, under fixed cost and IRS, efficient nonlinear pricing would involve marginal cost pricing for some market participants but higher prices for other market participants (to generate enough income to cover the fixed cost). This is very different from uniform pricing under competitive markets. Clearly, attaining efficiency under nonconvexity can require a departure from uniform pricing. But price discrimination schemes are not always efficient. Indeed, they can be part of rent-seeking behavior in imperfectly competitive markets (Tirole, 1988), where nonlinear pricing does not satisfy the separation property in (1)-(2). There is a need for more research to evaluate when and where price discrimination is efficient.

Finally, even when it is efficient, nonlinear pricing can raise equity issues. Indeed, under price discrimination, having some market participants face different prices may be seen as “unfair”. This unfairness has sometimes been used by policymakers to argue against price discrimination schemes (e.g., making it illegal to discriminate on the basis of race). This argument indicates a need to go beyond economic efficiency in the evaluation of nonlinear pricing. The issue of “who is paying what” becomes important when nonlinear pricing schemes are precisely targeted toward particular individuals or groups. Consumers who pay a higher price for a product are made worse off (as the higher price reduces their purchasing power). But the firms selling the product benefit from increased revenue and profit. If the increased firm profit is redistributed to the adversely affected consumers, the effects of price discrimination on the welfare of these consumers can be attenuated. But if it is not, price discrimination schemes can contribute to increasing income inequality. One scenario is when some consumers are made worse off (as they pay a high price for some goods) while the associated increase in firm profit is captured by other consumers (e.g., the owners of the firms). In this case, even if it is efficient, price discrimination would increase income inequality. In such situations, the distribution effects of nonlinear pricing can be debated and subject to economic and political bargaining. In general, the welfare and distribution effects of price discrimination depend on both the nature of the pricing scheme and the distribution of firm ownership. Addressing these issues seems to be a good topic for further research.

Declaration

No specific funding was used to support this research. Neither author has any conflict of interest related to this research. Finally, no data were used.

Acknowledgements

The authors wish to thank Letizia Pellegrini for helpful comments and suggestions. Any errors and omissions are the sole responsibility of the authors.

Notes

^{1}Note that $z\in Z\subset {\mathbb{R}}^{m}$ are m aggregate production goods. When the goods are produced by J firms, aggregate production satisfies $z={\sum}_{j=1}^{J}{z}_{j}\in Z$, where ${z}_{j}$ is the production of the j-th firm. In this context, the set Z allows for externalities in production activities among firms.

^{2}Luenberger (1995) showed that the concavity of the benefit functions ${f}_{k}\left({y}_{k}\right)$ holds under quasi-concave preferences.

^{3}As noted in the introduction, nonconvexity can arise in the presence of externalities (Baumol & Bradford, 1972; Starrett, 1972). Thus, our analysis of efficiency also applies under production externalities.

References

[1] Ahlberg, J. H., Nilson, E. N., & Walsh, J. L. (1967). The Theory of Splines and Their Applications. New York: Academic Press.

[2] Baumol, W. J., & Bradford, D. F. (1972). Detrimental Externalities and Non-Convexity of the Production Set. Economica, 39, 160-176.

https://doi.org/10.2307/2552639

[3] Bertsekas, D. P. (1995). Nonlinear Programming. Belmont, MA: Athena Scientific.

[4] Borenstein, S., & Bushnell, J. B. (2018). Do Two Electric Pricing Wrongs Make a Right? Cost Recovery, Externalities and Efficiency. Working Paper 24756, Cambridge, MA: National Bureau of Economic Research.

https://doi.org/10.3386/w24756

[5] Chavas, J. P. (2017). Ricardo Revisited: The Benefits from Trade and the Role of Non-Convex Technologies. Theoretical Economics Letters, 7, 263-293.

https://doi.org/10.4236/tel.2017.72022

[6] Chavas, J. P., & Briec, W. (2012). On Efficiency under Non-Convexity. Economic Theory, 50, 671-701.

https://doi.org/10.1007/s00199-010-0587-1

[7] Debreu, G. (1959). Theory of Value. New York: Wiley.

[8] Dutta, G., & Mitra, K. (2017). A Literature Review on Dynamic Pricing of Electricity. Journal of the Operational Research Society, 68, 1131-1145.

https://doi.org/10.1057/s41274-016-0149-4

[9] Giannessi, F. (1984). Theorems of the Alternative and Optimality Conditions. Journal of Optimization Theory and Applications, 42, 331-365.

[10] Giannessi, F. (2005). Constrained Optimization and Image Space Analysis. Berlin: Springer.

[11] Gould, F. J. (1969). Extensions of Lagrangian Multipliers in Nonlinear Programming. SIAM Journal on Applied Mathematics, 17, 1280-1297.

https://doi.org/10.1137/0117120

[12] Hestenes, M. R. (1969). Multiplier and Gradient Methods. Journal of Optimization Theory and Applications, 4, 303-320.

https://doi.org/10.1007/BF00927673

[13] Luenberger, D. G. (1995). Microeconomic Theory. New York: McGraw-Hill.

[14] Mas-Colell, A., Whinston, M. D., & Green, J. (1995). Microeconomic Theory. New York: Oxford University Press.

[15] Nordhaus, W. (2019). Climate Change: The Ultimate Challenge for Economics. American Economic Review, 109, 1991-2014.

https://doi.org/10.1257/aer.109.6.1991

[16] Rockafellar, R. T. (1974). Augmented Lagrange Multiplier Functions and Duality in Nonconvex Programming. SIAM Journal on Control and Optimization, 12, 1-19.

https://doi.org/10.1137/0312021

[17] Rosen, S. (1974). Hedonic Prices and Implicit Markets: Product Differentiation in Pure Competition. Journal of Political Economy, 82, 34-55.

https://doi.org/10.1086/260169

[18] Rubinov, A. M., Huang, X. X., & Yang, X. Q. (2002). The Zero Duality Gap Property and Lower Semicontinuity of the Perturbation Function. Mathematics of Operations Research, 27, 775-791.

https://doi.org/10.1287/moor.27.4.775.295

[19] Salanié, B. (1999). The Economics of Contracts. Cambridge, MA: The MIT Press.

[20] Starrett, D. (1972). Fundamental Non-Convexities in the Theory of Externalities. Journal of Economic Theory, 4, 180-199.

https://doi.org/10.1016/0022-0531(72)90148-2

[21] Takayama, A. (1985). Mathematical Economics (2nd ed.). Cambridge: Cambridge University Press.

[22] Tirole, J. (1988). The Theory of Industrial Organization. Cambridge, MA: The MIT Press.

[23] Wilson, R. B. (1993). Nonlinear Pricing. Oxford: Oxford University Press.