Income Inequality Measures
Author(s) Johan Fellman
ABSTRACT
Income distributions are commonly unimodal and skew with a heavy right tail. Different skew models, such as the lognormal and the Pareto, have been proposed as suitable descriptions of income distribution and applied in specific empirical situations. More wide-ranging tools have been introduced as measures for general comparisons. In this study, we review the income analysis methods and apply them to specific Lorenz models.

1. Introduction

Income distributions are commonly unimodal and skew with a heavy right tail. Therefore, different skew models, such as the lognormal and the Pareto, have been proposed as suitable descriptions of income distribution, but they are usually applied in specific empirical situations [1] . For general studies, more wide-ranging tools have been considered. Their aim is to provide measures that can be used to compare different distributions. Primary income data yield the most exact estimates of income inequality coefficients such as Gini and Pietra. Earlier studies have shown that no method is always optimal. Therefore, alternative approaches still merit study. In this study, we review income analysis methods based on Lorenz curves. The theory is applied to specific models.

Fellman [2] analyzed different methods for numerical estimation of Gini coefficients. As an application of these methods, he considered Pareto distributions. Using Lorenz curves, various numerical integration attempts were made to obtain accurate estimates. Mettle et al. [3] considered Lorenz curves and estimated the Gini coefficient of income by Newton-Cotes methods, and compared the accuracy of these estimates for some (Ghanaian) data.

2. Methods

The Lorenz curve. The most commonly used theory is based on the Lorenz curve. Lorenz [4] developed it in order to analyze the distribution of income and wealth within populations. He described the Lorenz curve, $L\left(p\right)$ , for wealth within populations in the following way:

“Plot along one axis accumulated per cents of the population from poorest to richest, and along the other, wealth held by these per cents of the population”.

Consequently, $L\left(p\right)$ is an accumulated amount of income (wealth) defined as a function of the proportion p of the population. It satisfies the condition $L\left(p\right)\le p$ because the income share of the poor is less than their proportion of the population. The increase $\Delta L$ caused by a fixed increase $\Delta p$ of the population is a growing function of p, and accordingly, the derivative ${L}^{\prime }\left(p\right)$ is an increasing function of p and $L\left(p\right)$ is a convex function [5] .

Consider the income distribution ${F}_{X}\left(x\right)$ of a non-negative variable X. Let ${f}_{X}\left(x\right)$ be the corresponding density function and let the mean of X be ${\mu }_{X}=\underset{0}{\overset{\infty }{\int }}x\text{ }{f}_{X}\left(x\right)\text{d}x$ . Then, the Lorenz curve ${L}_{X}\left(p\right)$ is

${L}_{X}\left(p\right)=\frac{1}{{\mu }_{X}}\underset{0}{\overset{{x}_{p}}{\int }}x\text{ }{f}_{X}\left(x\right)\text{d}x$ , (1)

where ${x}_{p}$ is the p quantile, that is ${F}_{X}\left({x}_{p}\right)=p$ . The Lorenz curve is not defined if the mean is zero or infinite. A Lorenz curve always starts at $\left(0,0\right)$ and ends at $\left(1,1\right)$ . The higher the Lorenz curve, the lower the inequality of the income distribution. The diagonal $L\left(p\right)=p$ is commonly interpreted as the Lorenz curve for complete equality between income receivers, but according to Wang & Smyth [6] it is not perfectly associated with the Lorenz curve. As everyone has the same income level, strictly speaking, no one can be said to be at the lowest or highest level of the population. The associated Lorenz curve then exists only at the origin and the termination point by the definition of the curve. To overcome this problem, they adopted the convention of allocating any fraction $x$ of the population to be the lowest/highest $x$ percent. This convention then allowed the 45 degree line through the origin to be associated with complete equality, as is usually, if loosely, assumed. This permitted Wang & Smyth [6] to use $L\left(p\right)=p$ as a useful component in the creation of Lorenz curves.

On the other hand, increasing inequality lowers the Lorenz curve, and theoretically, it can converge towards the lower right corner of the square. A sketch of a Lorenz curve is given in Figure 1.
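For observed data, the integral in (1) is replaced by cumulative sums over the ordered sample. The following is a minimal sketch of that empirical Lorenz curve, not a method taken from the cited studies; the function name `lorenz_points` and the four incomes are hypothetical and chosen only for illustration.

```python
def lorenz_points(incomes):
    """Return the points (p_i, L_i) of the empirical Lorenz curve of a sample."""
    xs = sorted(incomes)              # order from poorest to richest
    n = len(xs)
    total = float(sum(xs))
    if n == 0 or total <= 0.0:
        raise ValueError("need a non-empty sample with positive total income")
    points = [(0.0, 0.0)]             # every Lorenz curve starts at (0, 0)
    running = 0.0
    for i, x in enumerate(xs, start=1):
        running += x                  # income held by the i poorest receivers
        points.append((i / n, running / total))
    return points


sample = [10, 20, 30, 40]             # hypothetical incomes
curve = lorenz_points(sample)
for p, share in curve:
    print(f"p = {p:.2f}  L(p) = {share:.2f}")
```

In this sample the poorest half of the receivers holds 30 percent of the total income, so the point (0.5, 0.3) lies below the diagonal, as required by $L\left(p\right)\le p$ .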

Variable transformations. Consider a transformed variable $Y=g\left(X\right)$ , where $g\left(\cdot \right)$ is positive and monotone increasing. Then, for $y=g\left(x\right)$ , the distribution function ${F}_{Y}\left(y\right)$ of Y is

${F}_{Y}\left(y\right)=P\left(Y\le y\right)=P\left(g\left(X\right)\le g\left(x\right)\right)=P\left(X\le x\right)={F}_{X}\left(x\right)$ . (2)

Figure 1. A sketch of a Lorenz curve ${L}_{X}\left(p\right)$ . The diagonal interpreted as complete equality is included in the figure.

For the transformed variable Y, the p quantile ${y}_{p}$ is ${F}_{Y}\left({y}_{p}\right)=p$ , that is, ${y}_{p}=g\left({x}_{p}\right)$ .

Now

${f}_{Y}\left(y\right)=\frac{\text{d}{F}_{Y}\left(y\right)}{\text{d}y}=\frac{\text{d}{F}_{X}\left(x\right)}{\text{d}x}\frac{\text{d}x}{\text{d}y}={f}_{X}\left(x\right)\frac{\text{d}x}{\text{d}y}$ . (3)

Hence,

${\mu }_{Y}=\underset{0}{\overset{\infty }{\int }}y{f}_{Y}\left(y\right)\text{d}y=\underset{0}{\overset{\infty }{\int }}g\left(x\right){f}_{X}\left(x\right)\frac{\text{d}x}{\text{d}y}\text{d}y=\underset{0}{\overset{\infty }{\int }}g\left(x\right){f}_{X}\left(x\right)\text{d}x$ (4)

and

${L}_{Y}\left(p\right)=\frac{1}{{\mu }_{Y}}\underset{0}{\overset{{x}_{p}}{\int }}g\left(x\right){f}_{X}\left(x\right)\text{d}x$ . (5)

If the transformation is linear, $g\left(x\right)=\theta x$ with $\theta >0$ , then $Y=\theta X$ , ${\mu }_{Y}=\theta {\mu }_{X}$ ,

${L}_{Y}\left(p\right)=\frac{1}{\theta {\mu }_{X}}\underset{0}{\overset{{x}_{p}}{\int }}\theta x{f}_{X}\left(x\right)\text{d}x={L}_{X}\left(p\right)$ , (6)

and consequently, the Lorenz curve is invariant under linear transformations. A simple example of this property is that the Lorenz curve of income distributions is independent of the currency used. Another not so obvious result is that proportional income increase and flat tax policies are linear transformations and do not influence the Lorenz curve. Consequently, the Lorenz curve satisfies the general rules [1] :

To every distribution $F\left(x\right)$ with finite mean corresponds a unique Lorenz curve, ${L}_{X}\left(p\right)$ . The converse does not hold because every curve ${L}_{X}\left(p\right)$ is a common Lorenz curve for a whole class of distributions $F\left(\theta x\right)$ , where $\theta$ is an arbitrary positive constant.

Consider two variables X and Y, their distributions ${F}_{X}\left(x\right)$ and ${F}_{Y}\left(y\right)$ , and their Lorenz curves ${L}_{X}\left(p\right)$ and ${L}_{Y}\left(p\right)$ . If ${L}_{X}\left(p\right)\ge {L}_{Y}\left(p\right)$ for all p, then measured by the Lorenz curves, the distribution ${F}_{X}\left(x\right)$ has lower inequality than the distribution ${F}_{Y}\left(y\right)$ and ${F}_{X}\left(x\right)$ is said to Lorenz dominate ${F}_{Y}\left(y\right)$ . We denote this relation ${F}_{X}\left(x\right)\underset{L}{\succ }{F}_{Y}\left(y\right)$ [1] . An example of Lorenz dominance is given in Figure 2.

Income inequalities can be of different types, and the corresponding Lorenz curves may intersect; for such curves no Lorenz ordering can be identified (cf. Figure 3). The Lorenz curve ${L}_{1}\left(p\right)$ in Figure 3 corresponds to a population with very poor among the poor and rich who are not so rich. On the other hand, the Lorenz curve ${L}_{2}\left(p\right)$ corresponds to a population where the poor are relatively not so poor and the rich are relatively rich. For intersecting Lorenz curves, alternative inequality measures have to be defined.

Figure 2. Lorenz curves with Lorenz ordering, that is, ${L}_{X}\left(p\right)\underset{L}{\succ }{L}_{Y}\left(p\right)$ .

Figure 3. Two intersecting Lorenz curves.
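The scale invariance in (6) and the notion of Lorenz dominance can be checked on small samples. The sketch below is illustrative only: the helper names `lorenz_values` and `lorenz_dominates`, the conversion factor 7.5 and all income vectors are assumptions, and the comparison is restricted to samples of equal size so that both empirical curves share the same grid.

```python
def lorenz_values(incomes):
    """Cumulative income shares L(i/n), i = 1..n, of a sample."""
    xs = sorted(incomes)
    total = float(sum(xs))
    shares, running = [], 0.0
    for x in xs:
        running += x
        shares.append(running / total)
    return shares


def lorenz_dominates(a, b, tol=1e-12):
    """True if L_a(p) >= L_b(p) at every grid point (equal sample sizes)."""
    la, lb = lorenz_values(a), lorenz_values(b)
    if len(la) != len(lb):
        raise ValueError("this sketch compares samples of equal size only")
    return all(u >= v - tol for u, v in zip(la, lb))


x = [10, 20, 30, 40]
y = [7.5 * v for v in x]          # same incomes in another currency
same_curve = all(abs(u - v) < 1e-12
                 for u, v in zip(lorenz_values(x), lorenz_values(y)))

a = [20, 25, 25, 30]              # fairly equal incomes
b = [5, 15, 30, 50]               # more unequal incomes
c = [1, 33, 33, 33]               # very poor among the poor, rich not so rich
d = [10, 10, 10, 70]              # poor not so poor, rich relatively rich

print(same_curve)                                     # scale invariance
print(lorenz_dominates(a, b), lorenz_dominates(b, a))
print(lorenz_dominates(c, d), lorenz_dominates(d, c))
```

The samples c and d mimic the two populations described for intersecting curves: neither dominates the other, so the Lorenz ordering gives no verdict for this pair.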

Properties of Lorenz curves. The Lorenz curve has the following general properties:

i) $L\left(p\right)$ is monotone increasing,

ii) $L\left(p\right)\le p$ ,

iii) $L\left(p\right)$ is convex,

iv) $L\left(0\right)=0$ and $L\left(1\right)=1$ .

If the Lorenz curve is differentiable, the derivative has the following properties. Let ${L}_{X}\left(p\right)=\frac{1}{{\mu }_{X}}\underset{0}{\overset{{x}_{p}}{\int }}x\text{ }{f}_{X}\left(x\right)\text{d}x$ , ${F}_{X}\left({x}_{p}\right)=p$ and the density function ${f}_{X}\left(x\right)$ . When we differentiate the equation ${F}_{X}\left({x}_{p}\right)=p$ , we obtain $\frac{\text{d}{F}_{X}\left({x}_{p}\right)}{\text{d}p}=\frac{\text{d}{F}_{X}\left({x}_{p}\right)}{\text{d}{x}_{p}}\frac{\text{d}{x}_{p}}{\text{d}p}=1$ ,

${f}_{X}\left({x}_{p}\right)\frac{\text{d}{x}_{p}}{\text{d}p}=1$ (7)

and

$\frac{\text{d}{x}_{p}}{\text{d}p}=\frac{1}{{f}_{X}\left({x}_{p}\right)}$ . (8)

The differentiation of ${L}_{X}\left(p\right)=\frac{1}{{\mu }_{X}}\underset{0}{\overset{{x}_{p}}{\int }}x\text{ }{f}_{X}\left(x\right)\text{d}x$ yields

$\frac{\text{d}{L}_{X}\left(p\right)}{\text{d}p}=\frac{1}{{\mu }_{X}}\frac{\text{d}\underset{0}{\overset{{x}_{p}}{\int }}x\text{ }{f}_{X}\left(x\right)\text{d}x}{\text{d}{x}_{p}}\frac{\text{d}{x}_{p}}{\text{d}p}=\frac{1}{{\mu }_{X}}{x}_{p}{f}_{X}\left({x}_{p}\right)\frac{\text{d}{x}_{p}}{\text{d}p}=\frac{{x}_{p}}{{\mu }_{X}}$ , (9)

and consequently,

$\frac{\text{d}{L}_{X}\left(p\right)}{\text{d}p}=\frac{{x}_{p}}{{\mu }_{X}}$ . (10)

If the Lorenz curve is differentiable twice, then the second derivative is

$\frac{{\text{d}}^{2}{L}_{X}\left(p\right)}{\text{d}{p}^{2}}=\frac{1}{{\mu }_{X}}\frac{\text{d}{x}_{p}}{\text{d}p}=\frac{1}{{\mu }_{X}}\frac{1}{{f}_{X}\left({x}_{p}\right)}$ .

Hence,

$\frac{{\text{d}}^{2}L\left(p\right)}{\text{d}{p}^{2}}=\frac{1}{{\mu }_{X}{f}_{X}\left({x}_{p}\right)}$ . (11)
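Formulas (10) and (11) can be checked numerically on a concrete case. The sketch below uses the Pareto model of Section 3 with the arbitrarily chosen parameter $\alpha =2$ and compares central finite differences of the Lorenz curve with ${x}_{p}/\mu$ and $1/\left(\mu f\left({x}_{p}\right)\right)$ . The step sizes and function names are assumptions made for the illustration.

```python
ALPHA = 2.0                       # assumed Pareto parameter (alpha > 1)
MU = ALPHA / (ALPHA - 1.0)        # mean of F(x) = 1 - x**(-alpha), x >= 1


def quantile(p):
    """x_p solving F(x_p) = p."""
    return (1.0 - p) ** (-1.0 / ALPHA)


def density(x):
    """f(x) = alpha * x**(-alpha - 1)."""
    return ALPHA * x ** (-ALPHA - 1.0)


def lorenz(p):
    """L(p) = 1 - (1 - p)**((alpha - 1) / alpha)."""
    return 1.0 - (1.0 - p) ** ((ALPHA - 1.0) / ALPHA)


def first_derivative(f, p, h=1e-5):
    """Central difference approximation of f'(p)."""
    return (f(p + h) - f(p - h)) / (2.0 * h)


def second_derivative(f, p, h=1e-4):
    """Central difference approximation of f''(p)."""
    return (f(p + h) - 2.0 * f(p) + f(p - h)) / (h * h)


p = 0.5
slope_numeric = first_derivative(lorenz, p)
slope_formula = quantile(p) / MU                      # formula (10)
curv_numeric = second_derivative(lorenz, p)
curv_formula = 1.0 / (MU * density(quantile(p)))      # formula (11)
print(slope_numeric, slope_formula)
print(curv_numeric, curv_formula)
```

Both pairs agree up to the discretization error of the finite differences, and the positive second derivative reflects the convexity of the Lorenz curve.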

If $\underset{p↑1}{\mathrm{lim}}$ denotes the limit from the left, we can prove the following theorem [1] :

Theorem 1. If ${\mu }_{X}$ exists, then $\underset{p↑1}{\mathrm{lim}}{L}^{\prime }\left(p\right)\left(1-p\right)=0$ .

Proof. Consider the integral $\underset{x}{\overset{\infty }{\int }}t\text{ }{f}_{X}\left(t\right)\text{d}t$ . If ${\mu }_{X}$ exists, then $\underset{0}{\overset{\infty }{\int }}t\text{ }{f}_{X}\left(t\right)\text{d}t={\mu }_{X}$ and for every $\epsilon >0$ there exists an ${x}^{\prime }$ such that $\underset{x}{\overset{\infty }{\int }}t\text{ }{f}_{X}\left(t\right)\text{d}t<\epsilon$ if $x>{x}^{\prime }$ . Choose p so that ${x}_{p}>{x}^{\prime }$ , then

$\epsilon >\underset{{x}_{p}}{\overset{\infty }{\int }}t\text{ }{f}_{X}\left(t\right)\text{d}t\ge {x}_{p}\underset{{x}_{p}}{\overset{\infty }{\int }}{f}_{X}\left(t\right)\text{d}t={x}_{p}\left(1-p\right)$ (12)

and $\underset{p↑1}{\mathrm{lim}}{x}_{p}\left(1-p\right)=0$ .

As a consequence of (12),

$\underset{p↑1}{\mathrm{lim}}{{L}^{\prime }}_{X}\left(p\right)\left(1-p\right)=\underset{p↑1}{\mathrm{lim}}\frac{{x}_{p}}{{\mu }_{X}}\left(1-p\right)=\frac{1}{{\mu }_{X}}\underset{p↑1}{\mathrm{lim}}{x}_{p}\left(1-p\right)=0$ .

Consider a one-parametric class of cumulative distribution functions $F\left(x,\theta \right)$ , defined on the positive x-axis. If we assume that $F\left(x,\theta \right)=F\left(\theta x\right)$ , i.e. it depends only on the product $\theta x$ , then the following theorem holds [1] :

Theorem 2. Let $F\left(x,\theta \right)$ be a one-parametric class of distributions with the properties

i) $F\left(x,\theta \right)=F\left(\theta x\right)$ ,

ii) $F\left(\theta x\right)$ is defined on the positive x-axis,

iii) $F\left(\theta x\right)$ and its derivative are continuous,

iv) ${\mu }_{X}=E\left(X\right)$ exists.

Let $T=\theta X$ , then

${x}_{p}\left(\theta \right)=\frac{{t}_{p}}{\theta }$ (13)

and

${\mu }_{X}\left(\theta \right)=\frac{c}{\theta }$ , (14)

where ${t}_{p}$ and c are independent of $\theta$ .

Proof. Let $\theta$ be an arbitrary, positive parameter. Then the quantile ${x}_{p}\left(\theta \right)$ is defined by the equation $F\left(\theta {x}_{p}\right)=p$ . If we define ${t}_{p}$ by the equation $F\left({t}_{p}\right)=p$ , then ${t}_{p}$ does not depend on $\theta$ and $\theta {x}_{p}\left(\theta \right)={t}_{p}$ , and (13) is proved. The formula (14) and the statement that $L\left(p\right)=\frac{1}{\mu \left(\theta \right)}\underset{0}{\overset{{x}_{p}\left(\theta \right)}{\int }}x\text{d}F\left(\theta x\right)$ is independent of $\theta$ are proved by using the substitution $t=\theta x$ in the integrals $E\left(X\right)=\underset{0}{\overset{\infty }{\int }}x\text{d}F\left(\theta x\right)$ and $L\left(p\right)=\frac{1}{\mu \left(\theta \right)}\underset{0}{\overset{{x}_{p}\left(\theta \right)}{\int }}x\text{d}F\left(\theta x\right)$ .

Furthermore, we can prove the following [1] :

Theorem 3. Consider a function $L\left(p\right)$ defined on the interval $\left[0,1\right]$ with the properties

i) $L\left(p\right)$ is monotone increasing and convex to the p-axis,

ii) $L\left(0\right)=0$ and $L\left(1\right)=1$ ,

iii) $L\left(p\right)$ is differentiable,

iv) $\underset{p↑1}{\mathrm{lim}}{L}^{\prime }\left(p\right)\left(1-p\right)=0$ ,

then $L\left(p\right)$ is a Lorenz curve of a distribution with finite mean.

Proof. If we denote the unknown distribution $F\left(x\right)$ and its derivative $f\left(x\right)$ , then necessarily ${L}^{\prime }\left(p\right)=\frac{{x}_{p}}{\mu }$ . The derivative ${L}^{\prime }\left(p\right)$ is a monotone increasing function. If its inverse is denoted $M\left(p\right)$ , we get the necessary relation

$F\left({x}_{p}\right)=p=M\left(\frac{{x}_{p}}{\mu }\right)$ .

If $\theta =\frac{1}{\mu }$ , then $F\left(x\right)=M\left(\theta x\right)$ . Now we shall prove the sufficiency, that is, that $M\left(\theta x\right)$ is a distribution whose mean is $\mu =\frac{1}{\theta }$ and whose Lorenz curve is $L\left(p\right)$ . We denote $M\left(\theta x\right)=F\left(x\right)$ , then $f\left(x\right)={F}^{\prime }\left(x\right)=\theta {M}^{\prime }\left(\theta x\right)$ . After observing that the property (iv) indicates that ${L}^{\prime }\left(p\right)$ is integrable from 0 to 1, we introduce the variable transformation

$y=M\left( \theta x \right)$

$\text{d}y=\theta {M}^{\prime }\left(\theta x\right)\text{d}x$

$x=\frac{1}{\theta }{L}^{\prime }\left(y\right)$ .

We obtain

$\mu =\underset{t\to \infty }{\mathrm{lim}}\underset{0}{\overset{t}{\int }}x\theta {M}^{\prime }\left(\theta x\right)\text{d}x=\underset{p↑1}{\mathrm{lim}}\underset{0}{\overset{p}{\int }}\frac{1}{\theta }{L}^{\prime }\left(y\right)\text{d}y=\frac{1}{\theta }\underset{p↑1}{\mathrm{lim}}\underset{0}{\overset{p}{\int }}{L}^{\prime }\left(y\right)\text{d}y=\frac{1}{\theta }$ .

Hence, $F\left(x\right)=M\left(\theta x\right)$ , obtained from the monotone increasing inverse function of ${L}^{\prime }\left(p\right)$ , is a distribution whose mean is $\mu$ .

Using the same transformation, we obtain that the Lorenz curve $\stackrel{˜}{L}\left(p\right)$ of $F\left(x\right)=M\left(\theta x\right)$ is

$\stackrel{˜}{L}\left(p\right)=\theta \underset{0}{\overset{{x}_{p}}{\int }}x\theta {M}^{\prime }\left(\theta x\right)\text{d}x=\underset{0}{\overset{p}{\int }}{L}^{\prime }\left(y\right)\text{d}y=L\left(p\right)$ ,

and the theorem is proved.

These results have been collected in the following theorem [7] [8] :

Theorem 4. Consider a given function $L\left(p\right)$ with the properties

(i) $L\left(p\right)$ is monotone increasing and convex to the p-axis,

(ii) $L\left(0\right)=0$ and $L\left(1\right)=1$ ,

(iii) $L\left(p\right)$ is differentiable,

(iv) $\underset{p↑1}{\mathrm{lim}}{L}^{\prime }\left(p\right)\left(1-p\right)=0$ ,

then $L\left(p\right)$ is the Lorenz curve of a whole class of distribution functions $F\left(\theta x\right)$ , where $\theta$ is an arbitrary positive constant and the function $F\left(\cdot \right)$ is the inverse function to ${L}^{\prime }\left(p\right)$ .

Fellman [7] presented this result and later Fellman [8] presented the following theorem:

Theorem 5. A class of continuous distributions $F\left(x,\theta \right)$ with finite mean has a common Lorenz curve if and only if $F\left(x,\theta \right)=F\left(\theta x\right)$ .
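The construction behind Theorems 3 to 5 can be traced numerically. As a sketch under stated assumptions, take the simplified Rao-Tam curve $L\left(p\right)={p}^{\alpha }$ of Section 3 with $\alpha =3$ and an arbitrary scale $\theta =0.5$ . The quantile function is then ${x}_{p}={L}^{\prime }\left(p\right)/\theta$ , and integrating it over p (a midpoint rule is used here purely for convenience) should return the mean $1/\theta$ and the original Lorenz curve. All names and numerical settings are hypothetical.

```python
ALPHA = 3.0      # assumed parameter of L(p) = p**ALPHA
THETA = 0.5      # arbitrary positive scale; the mean should equal 1 / THETA


def lorenz(p):
    return p ** ALPHA


def lorenz_prime(p):
    return ALPHA * p ** (ALPHA - 1.0)


def quantile(p):
    """x_p = L'(p) / theta, from L'(p) = x_p / mu with mu = 1 / theta."""
    return lorenz_prime(p) / THETA


def cdf(x):
    """F(x) = M(theta * x), where M is the inverse function of L'."""
    u = THETA * x
    if u <= 0.0:
        return 0.0
    if u >= ALPHA:               # L' maps [0, 1] onto [0, ALPHA]
        return 1.0
    return (u / ALPHA) ** (1.0 / (ALPHA - 1.0))


def integrate_quantile(upper, n=20000):
    """Midpoint rule for the integral of x_q over 0 <= q <= upper."""
    h = upper / n
    return h * sum(quantile((i + 0.5) * h) for i in range(n))


mean = integrate_quantile(1.0)              # mean as integral of the quantiles
rebuilt = integrate_quantile(0.4) / mean    # Lorenz curve of F at p = 0.4
print(mean, 1.0 / THETA)
print(rebuilt, lorenz(0.4))
print(cdf(quantile(0.3)))                   # F(x_p) should return p = 0.3
```

Changing `THETA` rescales the quantiles and the mean but leaves `rebuilt` unchanged, which is the content of Theorem 5 for this family.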

Additional properties of the Lorenz curves. Consider the vertical difference D, between the diagonal and the Lorenz curve

$D=p-{L}_{X}\left( p \right)$

$\frac{\text{d}D}{\text{d}p}=1-{{L}^{\prime }}_{X}\left(p\right)=1-\frac{{x}_{p}}{{\mu }_{X}}$

$\frac{{\text{d}}^{2}D}{\text{d}{p}^{2}}=-{{L}^{″}}_{X}\left(p\right)=-\frac{1}{{\mu }_{X}}\frac{\text{d}{x}_{p}}{\text{d}p}=-\frac{1}{{\mu }_{X}{f}_{X}\left({x}_{p}\right)}<0$ .

The maximum of D implies $1-\frac{{x}_{p}}{{\mu }_{X}}=0$ , that is, ${x}_{p}={\mu }_{X}$ .

For ${x}_{p}={\mu }_{X}$ , ${{L}^{\prime }}_{X}\left(p\right)=\frac{{\mu }_{X}}{{\mu }_{X}}=1$ and at the point ${p}_{\mu }={F}_{X}\left({\mu }_{X}\right)$ the tangent is parallel to the line of perfect equality. This is also the point at which the vertical distance between the Lorenz curve and the egalitarian line attains its maximum ${D}_{\mathrm{max}}=F\left({\mu }_{X}\right)-{L}_{X}\left(F\left({\mu }_{X}\right)\right)$ . This maximum is defined as the Pietra index, in this study denoted P, and discussed below [9] .
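The location of the maximum can be illustrated by a brute-force search. The sketch below evaluates $D=p-L\left(p\right)$ on a fine grid for the Pareto model of Section 3 with the assumed parameter $\alpha =2$ and compares the result with ${p}_{\mu }=F\left(\mu \right)$ . The grid search is only a didactic device, not a recommended estimator, and the function names are hypothetical.

```python
ALPHA = 2.0                       # assumed Pareto parameter
MU = ALPHA / (ALPHA - 1.0)        # mean of F(x) = 1 - x**(-alpha), x >= 1


def cdf(x):
    return 1.0 - x ** (-ALPHA)


def lorenz(p):
    return 1.0 - (1.0 - p) ** ((ALPHA - 1.0) / ALPHA)


def pietra_by_grid(curve, n=100000):
    """Return (p*, D_max) maximizing D = p - L(p) over the grid i / n."""
    best_p, best_d = 0.0, 0.0
    for i in range(n + 1):
        p = i / n
        d = p - curve(p)
        if d > best_d:
            best_p, best_d = p, d
    return best_p, best_d


p_star, d_max = pietra_by_grid(lorenz)
p_mu = cdf(MU)                              # F(mu) = 0.75 for alpha = 2
print(p_star, p_mu)
print(d_max, p_mu - lorenz(p_mu))           # both are the Pietra index
```

For this parameter value the search returns the maximum at $p=0.75$ with ${D}_{\mathrm{max}}=0.25$ , in agreement with $F\left(\mu \right)-L\left(F\left(\mu \right)\right)$ .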

Kleiber and Kotz have outlined a progressive development of how the income distributions can be characterized by their Lorenz curves [10] [11] .

Income inequality indices. When Lorenz curves intersect, the corresponding distributions cannot be compared by the Lorenz curves. Consequently, the distributions have to be compared by numerical indices mainly based on the Lorenz curves.

Gini index. The most frequently used index is the Gini coefficient, G [12] . Using the Lorenz curves, this coefficient is the ratio of the area between the diagonal and the Lorenz curve and the whole area under the diagonal (cf. Figure 1). The formula is

$G=1-2\underset{0}{\overset{1}{\int }}L\left(p\right)\text{d}p$ . (15)

This definition yields Gini coefficients satisfying the inequalities $0 < G < 1$ . The higher the G value, the lower the Lorenz curve and the stronger the inequality. If ${G}_{X}<{G}_{Y}$ , then the distribution ${F}_{X}\left(x\right)$ , measured by the Gini coefficient, has lower inequality than the distribution ${F}_{Y}\left(y\right)$ and we say that ${F}_{X}\left(x\right)$ Gini dominates ${F}_{Y}\left(y\right)$ , and denote this relation ${F}_{X}\left(x\right)\underset{G}{\succ }{F}_{Y}\left(y\right)$ [13] . Finally, one observes the obvious result ${F}_{X}\left(x\right)\underset{L}{\succ }{F}_{Y}\left(y\right)⇒{F}_{X}\left(x\right)\underset{G}{\succ }{F}_{Y}\left(y\right)$ , that is, Lorenz dominance implies Gini dominance.
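For a sample, the empirical Lorenz curve is piecewise linear, so the integral in (15) can be evaluated exactly by summing trapezoids between consecutive points. The following sketch shows this; the function name `gini_from_sample` and the incomes are hypothetical, and no small-sample correction is applied.

```python
def gini_from_sample(incomes):
    """Gini coefficient G = 1 - 2 * (area under the empirical Lorenz curve)."""
    xs = sorted(incomes)
    n = len(xs)
    total = float(sum(xs))
    area, previous, running = 0.0, 0.0, 0.0
    for x in xs:
        running += x
        current = running / total
        area += (previous + current) / (2.0 * n)   # trapezoid of width 1 / n
        previous = current
    return 1.0 - 2.0 * area


print(gini_from_sample([10, 20, 30, 40]))   # moderately unequal incomes
print(gini_from_sample([5, 5, 5, 5]))       # identical incomes
```

The first sample gives $G=0.25$ and the second gives $G=0$ , since its Lorenz curve coincides with the diagonal.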

The coefficient allows direct comparison of the income distributions of two populations, regardless of their sizes. The Gini’s main limitation is that it is not easily decomposable or additive. Also, it does not respond in the same way to income transfers between people in opposite tails of the income distribution as it does to transfers in the middle of the distribution. The reason for the popularity of the Gini coefficient is that it is easy to compute, being a ratio of two areas in Lorenz curve diagrams. As a disadvantage, the Gini index only maps a number to the properties of a diagram, but the diagram itself is not based on any model of a distribution process. The “meaning” of the Gini index can only be understood empirically. Hence, the Gini does not capture where in the distribution the inequality occurs. As an additional result, two very different distributions of income having different Lorenz curves can have the same Gini index.

Using the Gini coefficient presented in the text, one can compare the Gini coefficients for ${L}_{1}\left(p\right)$ and ${L}_{2}\left(p\right)$ in Figure 3; ${L}_{1}\left(p\right)$ has less inequality ( ${G}_{1}=0.333$ ) than ${L}_{2}\left(p\right)$ ( ${G}_{2}=0.360$ ).

There are other inequality measures defined by the Gini coefficient. Yitzhaki [14] proposed a generalized Gini coefficient

$G\left(\nu \right)=1-\nu \left(\nu -1\right)\underset{0}{\overset{1}{\int }}{\left(1-p\right)}^{\nu -2}L\left(p\right)\text{d}p$ , (16)

where $\nu >1$ . For $\nu =2$ , formula (16) reduces to the Gini coefficient (15). Different values of $\nu$ are used in order to identify different inequality properties. For low values of $\nu$ greater weights are associated with the rich, and for high values of $\nu$ greater weights are associated with the poor.

Using the mean income ( $\mu$ ) and the Gini coefficient (G), Sen [15] proposed a welfare index

$W=\mu \left(1-G\right)$ . (17)

Pietra index. The Pietra index P is defined as the maximum ${D}_{\mathrm{max}}=F\left({\mu }_{X}\right)-{L}_{X}\left(F\left({\mu }_{X}\right)\right)$ , presented above. It can be graphically represented as the longest vertical distance between the diagonal and the Lorenz curve, or the cumulative portion of the total income held below a certain income percentile, with the 45 degree line representing perfect equality. The definitions yield Pietra coefficients satisfying the inequality $0\le P<1$ . The lower bound of P is obtained when there is total income equality, that is, the Lorenz curve coincides with the diagonal. The upper bound can be obtained when the Lorenz curve converges towards the lower right corner. The limits in the inequalities can be obtained, and this is outlined in Figure 4. The Pietra index can be interpreted as the income of the rich that should be redistributed to the poor in order to obtain total income equality. In other words, the value of the index approximates the share of total income that must be transferred from households above the mean to those below the mean to achieve equality in the distribution of incomes. Higher values of P indicate more inequality, and more redistribution is needed to achieve income equality. Therefore, the index is sometimes named the Robin Hood index. The Pietra index is also known as the Hoover index and it is still better known as the Schutz index [16] [17] [18] .

If ${P}_{X}<{P}_{Y}$ , then the distribution ${F}_{X}\left(x\right)$ measured by the Pietra index has lower inequality than the distribution ${F}_{Y}\left(y\right)$ , and we say that ${F}_{X}\left(x\right)$ Pietra dominates ${F}_{Y}\left(y\right)$ . We denote this relation ${F}_{X}\left(x\right)\underset{P}{\succ }{F}_{Y}\left(y\right)$ . For the Lorenz curves in Figure 3, ${L}_{1}\left(p\right)$ is more equal than ${L}_{2}\left(p\right)$ . In general, the Pietra and the Gini orderings are not identical [1] . However, one observes the similar obvious result ${F}_{X}\left(x\right)\underset{L}{\succ }{F}_{Y}\left(y\right)⇒{F}_{X}\left(x\right)\underset{P}{\succ }{F}_{Y}\left(y\right)$ , that is, Lorenz dominance implies Pietra dominance.

Figure 4. Sketches of two extreme cases of simplified Rao-Tam Lorenz curves with corresponding P indices. For the Lorenz curve $k=1.25$ , the Pietra index is 0.0819 and for the Lorenz curve $k=10$ the index is 0.697 [1] .

An alternative definition of the Pietra index has also been given. It can be defined as twice the area of the largest triangle inscribed in the area between the Lorenz curve and the diagonal line [9] . In Figure 5, one observes that the triangle obtains its maximum when the corner lies on the Lorenz curve where the tangent is parallel to the diagonal. The height of the triangle is $h=\frac{P}{\sqrt{2}}$ , and the base is the diagonal $b=\sqrt{2}$ . Twice the area is $2\cdot \text{area}=2\cdot \frac{1}{2}\cdot \frac{P}{\sqrt{2}}\cdot \sqrt{2}=P$ .

In comparison, the Gini index, G, is twice the area between the Lorenz curve and the diagonal, and the Pietra index is twice the area of the triangle inscribed in this area. Hence, the inequality $G\ge P$ holds generally [1] .

3. Applications

In this section, we collect some examples in order to elucidate the theory. The Pareto [19] , the simplified Rao-Tam [20] and the Chotikapanich [21] models contain only one parameter. Therefore, they can easily be analyzed. Rohde [22] and Fellman [2] [13] paid these models special attention and examined them in more detail. However, they are so simple that it is impossible to distinguish between the estimated length of the range for the income distribution function and the Gini coefficient. If one of these properties is estimated, the other is fixed. We consider these three theoretical Lorenz curve models. We present how the Lorenz curves and the Gini and the Pietra indices depend on the model parameters. In addition, we compare the Lorenz curves of the models when their Gini indices are equal.

Figure 5. The Lorenz curve and the geometric interpretations of the Pietra index.

Pareto model. We define the Pareto distribution as $F\left(x\right)=1-{x}^{-\alpha }$ , where $x\ge 1$ and $\alpha >1$ .

The density function is $f\left(x\right)=\alpha {x}^{-\alpha -1}$ , the mean is $\mu =\frac{\alpha }{\alpha -1}$ , the quantiles are ${x}_{p}={\left(\frac{1}{1-p}\right)}^{\frac{1}{\alpha }}$ , the Lorenz curve is $L\left(p\right)=1-{\left(1-p\right)}^{\frac{\alpha -1}{\alpha }}$ and the Gini coefficient is $G=\frac{1}{2\alpha -1}$ . If $\alpha \to 1$ , $G\to 1$ and if $\alpha \to \infty$ , $G\to 0$ . In Figure 6, we present the Lorenz curves of the Pareto distribution for different values of the parameter $\alpha$ .

Finally, the Pietra index is $P=\frac{1}{\alpha }{\left(\frac{\alpha -1}{\alpha }\right)}^{\alpha -1}$ . According to the general theory, the inequality $G\ge P$ holds for all parameter values, and consequently, $P\to 0$ when $\alpha \to \infty$ . Let $\beta =\alpha -1$ , then $P=\frac{1}{\beta +1}{\left(\frac{\beta }{\beta +1}\right)}^{\beta }$ . When $\beta \to 0$ , $\left(\alpha \to 1\right)$ , then, $P\to 1$ . In Figure 7, we compare the Gini and Pietra indices as functions of the model parameter $\alpha$ . One observes that the inequality $G\ge P$ holds.
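The closed forms for the Pareto model can be cross-checked against the general definitions. In the sketch below, the Gini coefficient is recomputed from (15) by a midpoint rule, and the Pietra index is recomputed as $F\left(\mu \right)-L\left(F\left(\mu \right)\right)$ . The function names, the number of grid cells and the list of $\alpha$ values are assumptions for the illustration.

```python
def pareto_lorenz(p, alpha):
    return 1.0 - (1.0 - p) ** ((alpha - 1.0) / alpha)


def pareto_gini(alpha):
    """Closed form G = 1 / (2 * alpha - 1)."""
    return 1.0 / (2.0 * alpha - 1.0)


def pareto_pietra(alpha):
    """Closed form P = (1 / alpha) * ((alpha - 1) / alpha)**(alpha - 1)."""
    return (1.0 / alpha) * ((alpha - 1.0) / alpha) ** (alpha - 1.0)


def pareto_gini_numeric(alpha, n=100000):
    """G = 1 - 2 * integral of L(p), approximated by a midpoint rule."""
    h = 1.0 / n
    area = h * sum(pareto_lorenz((i + 0.5) * h, alpha) for i in range(n))
    return 1.0 - 2.0 * area


def pareto_pietra_from_mean(alpha):
    """P = F(mu) - L(F(mu)) with F(x) = 1 - x**(-alpha)."""
    mu = alpha / (alpha - 1.0)
    p_mu = 1.0 - mu ** (-alpha)
    return p_mu - pareto_lorenz(p_mu, alpha)


for alpha in (1.1, 1.5, 2.0, 5.0, 20.0):
    print(alpha, pareto_gini(alpha), pareto_pietra(alpha))
```

For every listed $\alpha$ the Pietra index stays below the Gini coefficient, and both decrease as $\alpha$ grows.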

Simplified Rao-Tam model. Consider the simplified Rao-Tam model whose Lorenz curve is $L\left(p\right)={p}^{\alpha }$ $\left(\alpha >1\right)$ [20] . When $\alpha \to 1$ , then $L\left(p\right)\to p$ , and when $\alpha \to \infty$ , $L\left(p\right)\to 0$ for all $0\le p<1$ and the Lorenz curve converges towards the lower right corner of the square. In Figure 8, we present the Lorenz curve for a set of $\alpha$ values.

Figure 6. Lorenz curves for Pareto distributions with different parameter values.

Figure 7. Gini and Pietra indices for Pareto distributions with different parameter values.

The Gini coefficient is $G=\frac{\alpha -1}{\alpha +1}$ . When $\alpha \to 1$ , then $G\to 0$ and when $\alpha \to \infty$ then $G\to 1$ . The Pietra index is $P={\left(\frac{1}{\alpha }\right)}^{\frac{1}{\alpha -1}}-{\left(\frac{1}{\alpha }\right)}^{\frac{\alpha }{\alpha -1}}$ . Using the vertical difference $D=p-{p}^{\alpha }$ , the index inequalities $D\le P < G$ hold, and for $\alpha \to 1$ , $G\to 0$ , and consequently, $P\to 0$ . For increasing $\alpha$ values, the supremum of $D=p-{p}^{\alpha }$ is one. This must also hold for the supremum of $P={\left(\frac{1}{\alpha }\right)}^{\frac{1}{\alpha -1}}-{\left(\frac{1}{\alpha }\right)}^{\frac{\alpha }{\alpha -1}}$ . Consequently, the interval $0 < P < 1$ cannot be shortened. In Figure 9, we present G and P for different $\alpha$ .
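The two closed forms for the simplified Rao-Tam model are easy to evaluate. The sketch below, with hypothetical function names, reproduces the Pietra values quoted in the caption of Figure 4 (parameter values 1.25 and 10) and illustrates that P approaches one only for very large $\alpha$ .

```python
def rao_tam_gini(alpha):
    """G = (alpha - 1) / (alpha + 1) for L(p) = p**alpha."""
    return (alpha - 1.0) / (alpha + 1.0)


def rao_tam_pietra(alpha):
    """P = p* - (p*)**alpha, where p* = (1 / alpha)**(1 / (alpha - 1))."""
    p_star = (1.0 / alpha) ** (1.0 / (alpha - 1.0))   # point where L'(p*) = 1
    return p_star - p_star ** alpha


for alpha in (1.25, 3.0, 10.0, 1.0e6):
    print(alpha, round(rao_tam_gini(alpha), 4), round(rao_tam_pietra(alpha), 4))
```

The values for 1.25 and 10 round to 0.0819 and 0.697, matching the figures given in the caption of Figure 4.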

Figure 8. Lorenz curves for simplified Rao-Tam distributions with different parameter values.

Figure 9. Gini and Pietra indices for simplified Rao-Tam distributions with different parameter values.

Chotikapanich model. Chotikapanich [21] defined the Lorenz curve ${L}_{C}\left(p\right)=\frac{{\text{e}}^{kp}-1}{{\text{e}}^{k}-1}$ for $k>0$ .

The limits of the fractions studied below result in indefinite forms $\frac{0}{0}$ or $\frac{\infty }{\infty }$ , and l’Hospital’s rule $\underset{k\to a}{\mathrm{lim}}\frac{f\left(k\right)}{g\left(k\right)}=\underset{k\to a}{\mathrm{lim}}\frac{{f}^{\prime }\left(k\right)}{{g}^{\prime }\left(k\right)}$ has to be applied several times. For $k\to 0$ and arbitrary $0 < p < 1$ , we obtain $\underset{k\to 0}{\mathrm{lim}}\frac{{\text{e}}^{kp}-1}{{\text{e}}^{k}-1}=\underset{k\to 0}{\mathrm{lim}}\frac{{\text{e}}^{kp}p}{{\text{e}}^{k}}=p$ . This means that the Lorenz curve converges towards the diagonal. For $k\to \infty$ , one obtains that for all $0 < p < 1$

$\underset{k\to \infty }{\mathrm{lim}}\frac{{\text{e}}^{kp}-1}{{\text{e}}^{k}-1}=\underset{k\to \infty }{\mathrm{lim}}\frac{{\text{e}}^{kp}p}{{\text{e}}^{k}}=\underset{k\to \infty }{\mathrm{lim}}p{\text{e}}^{-k\left(1-p\right)}=0.$

This means that the Lorenz curve converges towards the lower right corner of the square.

The extreme Lorenz curves can be obtained by the limit studies $k\to 0$ and $k\to \infty$ , and the Lorenz curves as functions of the parameter k are presented in Figure 10.

The Gini index is for the Chotikapanich model

$G=1-2\left(\frac{\frac{1}{k}{\text{e}}^{k}-1}{{\text{e}}^{k}-1}-\frac{1}{k\left({\text{e}}^{k}-1\right)}\right)=1-2\left(\frac{{\text{e}}^{k}-k-1}{k\left({\text{e}}^{k}-1\right)}\right)$ .

When $k\to 0$ , one obtains

$\begin{array}{c}\underset{k\to 0}{\mathrm{lim}}G=\underset{k\to 0}{\mathrm{lim}}\left(1-2\left(\frac{{\text{e}}^{k}-k-1}{k\left({\text{e}}^{k}-1\right)}\right)\right)=1-2\underset{k\to 0}{\mathrm{lim}}\left(\frac{{\text{e}}^{k}-1}{\left({\text{e}}^{k}-1\right)+k{\text{e}}^{k}}\right)\\ =1-2\underset{k\to 0}{\mathrm{lim}}\left(\frac{{\text{e}}^{k}}{{\text{e}}^{k}+k{\text{e}}^{k}+{\text{e}}^{k}}\right)=1-2\frac{1}{2}=0\end{array}$

When $k\to \infty$ , one obtains

$\begin{array}{c}\underset{k\to \infty }{\mathrm{lim}}G=\underset{k\to \infty }{\mathrm{lim}}\left(1-2\left(\frac{{\text{e}}^{k}-k-1}{k\left({\text{e}}^{k}-1\right)}\right)\right)=1-2\underset{k\to \infty }{\mathrm{lim}}\left(\frac{{\text{e}}^{k}-1}{\left({\text{e}}^{k}-1\right)+k{\text{e}}^{k}}\right)\\ =1-2\underset{k\to \infty }{\mathrm{lim}}\left(\frac{{\text{e}}^{k}}{{\text{e}}^{k}+k{\text{e}}^{k}+{\text{e}}^{k}}\right)=1-2\underset{k\to \infty }{\mathrm{lim}}\left(\frac{1}{2+k}\right)=1-0=1\end{array}$

Figure 10. Lorenz curves for Chotikapanich distributions with different parameter values.

Consequently, for G, the inequalities $0 < G < 1$ hold and the range cannot be shortened.

The Pietra index is

$\begin{array}{c}P=\frac{1}{k}\mathrm{ln}\left(\frac{{\text{e}}^{k}-1}{k}\right)-\frac{1}{k}+\frac{1}{{\text{e}}^{k}-1}=\frac{1}{k}\mathrm{ln}\left(\frac{{\text{e}}^{k}-1}{k}\right)-\frac{{\text{e}}^{k}-1-k}{k\left({\text{e}}^{k}-1\right)}\\ =\frac{1}{k}k+\frac{1}{k}\mathrm{ln}\frac{\left(1-{\text{e}}^{-k}\right)}{k}-\frac{{e}^{k}-1-k}{k\left({\text{e}}^{k}-1\right)}=1+\frac{1}{k}\mathrm{ln}\frac{\left(1-{\text{e}}^{-k}\right)}{k}-\frac{{\text{e}}^{k}-1-k}{k\left({\text{e}}^{k}-1\right)}\end{array}$

In general, $P < G$ , and consequently, $P\to 0$ when $k\to 0$ . When $k\to \infty$ , one obtains

$\begin{array}{l}\underset{k\to \infty }{\mathrm{lim}}\left(1+\frac{1}{k}\mathrm{ln}\frac{\left(1-{\text{e}}^{-k}\right)}{k}-\frac{{\text{e}}^{k}-1-k}{k\left({\text{e}}^{k}-1\right)}\right)\\ =\underset{k\to \infty }{\mathrm{lim}}\left(1+\frac{1}{k}\mathrm{ln}\frac{\left(1-{\text{e}}^{-k}\right)}{k}\right)-\underset{k\to \infty }{\mathrm{lim}}\left(\frac{{\text{e}}^{k}-1}{\left({\text{e}}^{k}-1\right)+k{\text{e}}^{k}}\right)\\ =\underset{k\to \infty }{\mathrm{lim}}\left(1-\frac{\mathrm{ln}\left(k\right)}{k}+\frac{1}{k}\mathrm{ln}\left(1-{\text{e}}^{-k}\right)\right)-0=1\end{array}$

Hence, $\underset{k\to \infty }{\mathrm{lim}}P=1$ and the inequalities $0 < P < 1$ hold and cannot be shortened.

The indices G and P as functions of the parameter k are presented in Figure 11.
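The formulas for the Chotikapanich model can be evaluated directly. The sketch below implements the Gini and Pietra expressions given above and checks the Pietra formula against the definition ${p}^{*}-L\left({p}^{*}\right)$ , where ${p}^{*}=\frac{1}{k}\mathrm{ln}\left(\frac{{\text{e}}^{k}-1}{k}\right)$ solves ${L}^{\prime }\left({p}^{*}\right)=1$ . The function names and the chosen k values are assumptions; `math.expm1` is used only to limit rounding error for small k.

```python
import math


def chot_lorenz(p, k):
    """L(p) = (exp(k p) - 1) / (exp(k) - 1)."""
    return math.expm1(k * p) / math.expm1(k)


def chot_gini(k):
    """G = 1 - 2 (exp(k) - k - 1) / (k (exp(k) - 1))."""
    return 1.0 - 2.0 * (math.exp(k) - k - 1.0) / (k * math.expm1(k))


def chot_pietra(k):
    """P = (1/k) ln((exp(k) - 1) / k) - 1/k + 1 / (exp(k) - 1)."""
    return math.log(math.expm1(k) / k) / k - 1.0 / k + 1.0 / math.expm1(k)


def chot_pietra_from_definition(k):
    """P = p* - L(p*), with p* the point where the tangent has slope one."""
    p_star = math.log(math.expm1(k) / k) / k
    return p_star - chot_lorenz(p_star, k)


for k in (0.1, 1.0, 10.0, 50.0, 200.0):
    print(k, round(chot_gini(k), 4), round(chot_pietra(k), 4))
```

Both indices are close to zero for small k and increase towards one as k grows, with P remaining below G for the listed values.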

Figure 11. Gini and Pietra indices for Chotikapanich distributions with different parameter values.

Figure 12. Lorenz curves for distributions with the same Gini index ( $G=0.5$ ). Note that the Lorenz curves for the simplified Rao-Tam and Chotikapanich models are rather similar, but the Lorenz curve for the Pareto model is markedly different.

Above, we made the general remark that different distributions can result in the same Gini index. In Figure 12, we present a simple example of this finding. We compare a Chotikapanich model with the Gini index 0.500 ( $k=3.593525$ ) with a Pareto model (with $\alpha =1.5$ ) and a simplified Rao-Tam model (with $\alpha =3.0$ ), all having the same Gini index. The Lorenz curves for the simplified Rao-Tam and Chotikapanich models are rather similar, but the Lorenz curve for the Pareto model is markedly different. This difference arises because the Pareto model was introduced as a model for distributions with high income levels.
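The parameter values in this comparison can be reproduced with a short Python sketch. It assumes the standard forms $L(p)=({\text{e}}^{kp}-1)/({\text{e}}^{k}-1)$ for the Chotikapanich model, $L(p)=1-(1-p)^{1-1/\alpha}$ for the Pareto model and $L(p)=p^{\alpha}$ for the simplified Rao-Tam model, with Gini indices $1/(2\alpha-1)$ and $(\alpha-1)/(\alpha+1)$ for the latter two. The bisection solver and all function names are hypothetical helpers introduced only for this illustration.

```python
import math


def lorenz_chotikapanich(p, k):
    """Assumed form L(p) = (e^(kp) - 1) / (e^k - 1)."""
    return math.expm1(k * p) / math.expm1(k)


def lorenz_pareto(p, alpha):
    """Assumed form L(p) = 1 - (1 - p)^(1 - 1/alpha), alpha > 1."""
    return 1.0 - (1.0 - p) ** (1.0 - 1.0 / alpha)


def lorenz_rao_tam(p, alpha):
    """Assumed simplified Rao-Tam form L(p) = p^alpha."""
    return p ** alpha


def gini_chotikapanich(k):
    """G = 1 - 2(e^k - k - 1) / (k(e^k - 1)) for k > 0."""
    return 1.0 - 2.0 * (math.exp(k) - k - 1.0) / (k * math.expm1(k))


def solve_k_for_gini(target, lo=0.01, hi=100.0, tol=1e-10):
    """Bisection on k; relies on G being increasing in k."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if gini_chotikapanich(mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


k_half = solve_k_for_gini(0.5)
gini_pareto = 1.0 / (2.0 * 1.5 - 1.0)      # G = 1/(2*alpha - 1), alpha = 1.5
gini_rao_tam = (3.0 - 1.0) / (3.0 + 1.0)   # G = (alpha - 1)/(alpha + 1), alpha = 3

print(f"k for G = 0.5: {k_half:.6f}")
for p in (0.25, 0.50, 0.75):
    print(p,
          round(lorenz_chotikapanich(p, k_half), 4),
          round(lorenz_rao_tam(p, 3.0), 4),
          round(lorenz_pareto(p, 1.5), 4))
```

Printing the three curves at a few values of p gives a numerical view of the pattern in Figure 12: the Chotikapanich and simplified Rao-Tam values lie close together, while the Pareto values differ noticeably.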

4. Discussion

In general, the step from the Lorenz curve to the income distribution starts from the formula

${L}^{\prime }\left(p\right)=\frac{{x}_{p}}{\mu }$ , (18)

where ${x}_{p}$ is the p-percentile and µ is the mean of the corresponding distribution $F\left(x\right)$ . We define $M\left(\cdot \right)$ as the inverse function of the derivative ${L}^{\prime }\left(\cdot \right)$ . From (18), we obtain

$p=M\left(\frac{{x}_{p}}{\mu }\right)$ . (19)

Equation (19) indicates that $M\left(\cdot \right)$ is the income distribution function corresponding to the given Lorenz curve, that is, $F\left(x\right)=M\left(\frac{x}{\mu }\right)$ . This connection between the Lorenz curve and the distribution function is easily defined, but for most of the exact Lorenz curves, it is difficult or even impossible to mathematically obtain the distribution.
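When no closed form for $M\left(\cdot \right)$ is available, the inversion in (19) can be carried out numerically. The following Python sketch is one possible approach under stated assumptions: the derivative ${L}^{\prime }$ is increasing and finite on $[0,1]$ (which excludes, for example, the Pareto model at $p=1$), and the inverse is found by bisection. The helper `distribution_from_lorenz` is hypothetical; it is checked against the simplified Rao-Tam form $L(p)=p^{\alpha}$, assumed here, for which ${L}^{\prime }(p)=\alpha p^{\alpha -1}$ and hence $F(x)={\left(x/(\alpha \mu )\right)}^{1/(\alpha -1)}$ on $0\le x\le \alpha \mu$.

```python
def distribution_from_lorenz(lorenz_derivative, mu, tol=1e-12):
    """Return F(x) = M(x / mu), where M inverts the increasing function L'(p).

    Assumes L' is increasing and finite on [0, 1].
    """
    def cdf(x):
        y = x / mu  # right-hand side of (18)
        if y <= lorenz_derivative(0.0):
            return 0.0
        if y >= lorenz_derivative(1.0):
            return 1.0
        lo, hi = 0.0, 1.0
        while hi - lo > tol:  # bisection for L'(p) = y
            mid = 0.5 * (lo + hi)
            if lorenz_derivative(mid) < y:
                lo = mid
            else:
                hi = mid
        return 0.5 * (lo + hi)
    return cdf


# Illustrative check with L(p) = p^alpha; alpha and mu are arbitrary choices.
alpha, mu = 3.0, 1.0
cdf = distribution_from_lorenz(lambda p: alpha * p ** (alpha - 1.0), mu)


def cdf_closed_form(x):
    """F(x) = (x / (alpha * mu))^(1 / (alpha - 1)) on [0, alpha * mu]."""
    return (x / (alpha * mu)) ** (1.0 / (alpha - 1.0))


for x in (0.5, 1.0, 2.0):
    print(x, round(cdf(x), 6), round(cdf_closed_form(x), 6))
```

Bisection is chosen only for its simplicity; any root finder for increasing functions would serve the same purpose.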

Primary income data yield the most exact estimates of the income inequality coefficients, such as Gini and Pietra. Fellman [2] analyzed different methods for the numerical estimation of Gini coefficients based on Lorenz curves and, as an application of these methods, considered Pareto distributions. Various numerical integration methods were applied to the Lorenz curves to obtain accurate estimates. The trapezium rule is simple, but because the Lorenz curve is convex, it yields a positive bias for the area under the Lorenz curve and, consequently, a negative bias for the Gini coefficient. Simpson’s rule fits the Lorenz curve better, but it requires an even number of subintervals of equal length. Lagrange polynomials of second degree can be considered a generalization of Simpson’s rule. Fellman [2] also gave references concerning numerical integration. To include Simpson’s rule in his study, he considered Lorenz curves given by deciles, and he compared Simpson’s rule with the trapezium rule, Lagrange polynomials and generalizations of Golden’s method [23] . No method was uniformly optimal, but the trapezium rule was almost always inferior and Simpson’s rule was superior. Golden’s method was usually of medium quality.

Mettle et al. [3] applied Newton-Cotes methods, namely, the trapezium rule, Simpson’s 1/3 rule and Simpson’s 3/8 rule, to Lorenz curves. Using these, they estimated the Gini coefficients of income and compared the accuracy of these estimates on some (Ghanaian) data.

The curves in Figure 12 in this study indicate weaknesses in the inequality indices. Lorenz curves with the same Gini index may show marked geometrical differences and, consequently, correspond to different income distributions.
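The bias of the trapezium rule and the better fit of Simpson’s rule, as discussed above, can be illustrated with a small Python sketch. It applies both rules to decile points of a Pareto Lorenz curve, assumed here in the standard form $L(p)=1-(1-p)^{1-1/\alpha}$ with $\alpha =1.5$ , for which the exact Gini index is 0.5. The function names are hypothetical, and the sketch is not a reproduction of the computations in [2] or [3].

```python
def lorenz_pareto(p, alpha=1.5):
    """Assumed Pareto Lorenz curve L(p) = 1 - (1 - p)^(1 - 1/alpha)."""
    return 1.0 - (1.0 - p) ** (1.0 - 1.0 / alpha)


def gini_trapezium(lorenz, n=10):
    """Gini estimate from n equal subintervals using the trapezium rule."""
    h = 1.0 / n
    values = [lorenz(i / n) for i in range(n + 1)]
    area = h * (0.5 * values[0] + sum(values[1:-1]) + 0.5 * values[-1])
    return 1.0 - 2.0 * area


def gini_simpson(lorenz, n=10):
    """Gini estimate using Simpson's 1/3 rule; n must be even."""
    if n % 2 != 0:
        raise ValueError("Simpson's rule needs an even number of subintervals")
    h = 1.0 / n
    values = [lorenz(i / n) for i in range(n + 1)]
    odd = sum(values[1:-1:2])    # interior points with odd index
    even = sum(values[2:-1:2])   # interior points with even index
    area = h / 3.0 * (values[0] + 4.0 * odd + 2.0 * even + values[-1])
    return 1.0 - 2.0 * area


g_exact = 0.5  # 1 / (2 * alpha - 1) for alpha = 1.5
g_trap = gini_trapezium(lorenz_pareto)
g_simp = gini_simpson(lorenz_pareto)
print(f"trapezium: {g_trap:.4f}   Simpson: {g_simp:.4f}   exact: {g_exact:.4f}")
```

With deciles, both estimates fall below the exact value for this curve, and the Simpson estimate lies closer to it than the trapezium estimate.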

Acknowledgements

This work was supported in part by a grant from the Magnus Ehrnrooth Foundation.

Cite this paper
Fellman, J. (2018) Income Inequality Measures. Theoretical Economics Letters, 8, 557-574. doi: 10.4236/tel.2018.83039.
References
[1]   Fellman, J. (2015) Mathematical Analysis of Distribution and Redistribution of Income. Science Publishing Group, 166 p. http://www.sciencepublishinggroup.com/book/B-978-1-940366-25-8.aspx

[2]   Fellman, J. (2012) Estimation of Gini Coefficients Using Lorenz Curves. Journal of Statistical and Econometric Methods, 1, 31-38.

[3]   Mettle, F.O., Darkwah, K.A., Nortey, E.N. and Lotsi, C.A. (2016) An Estimation of the Gini Coefficient of Income Using Newton-Cotes Methods. Conference Paper February 2016.

[4]   Lorenz, M.O. (1905) Methods for Measuring Concentration of Wealth. Publ. of the Amer. Statist. Assoc., 9, 209-219.

[5]   Fellman, J. (2011) Lorenz Curve. In: Lovric, M., Ed., International Encyclopedia of Statistical Science, Part 12, 760-762.
https://doi.org/10.1007/978-3-642-04898-2_345

[6]   Wang, Z.X., Ng, Y.-K. and Smyth, R. (2011) A General Method for Creating Lorenz Curves. Review of Income and Wealth, 57, 561-582.
https://doi.org/10.1111/j.1475-4991.2010.00425.x

[7]   Fellman, J. (1976) The Effect of Transformations on Lorenz Curves. Econometrica, 44, 823-824.
https://doi.org/10.2307/1913450

[8]   Fellman, J. (1980) Transformations and Lorenz Curves. Swedish School of Economics and Business Administration Working Papers, 48, 18 p.

[9]   Lee, W.-C. (1999) Probabilistic Analysis of Global Performances of Diagnostic Tests: Interpreting the Lorenz Curve Based Summary Measures. Statistics in Medicine, 18, 455-471.
https://doi.org/10.1002/(SICI)1097-0258(19990228)18:4<455::AID-SIM44>3.0.CO;2-A

[10]   Kleiber, Ch. and Kotz, S. (2001) Characterizations of Income Distributions and the Moment Problem of Order Statistics. The 53rd Session of the International Statistical Institute, Seoul, Korea, 22-29 August 2001. http://isi.cbs.nl/iamamember/CD2/pdf/462.pdf

[11]   Kleiber, Ch. and Kotz, S. (2002) A Characterization of Income Distributions in Terms of Generalized Gini Coefficients. Social Choice and Welfare, 19, 789-794.
https://doi.org/10.1007/s003550200154

[12]   Gini, C. (1914) Sulla misura della concentrazione e della variabilita dei caratteri. Premiate officine grafiche C. Ferrari, 73c, 1203-1248.

[13]   Fellman, J. (2012) Modelling Lorenz Curve. Journal of Statistical and Econometric Methods, 1, 53-62.

[14]   Yitzhaki, S. (1983) On an Extension of the Gini Index. International Economic Review, 24, 617-628.
https://doi.org/10.2307/2648789

[15]   Sen, A. (1973) On Economic Inequality. Clarendon Press, Oxford.
https://doi.org/10.1093/0198281935.001.0001

[16]   Hoover, E. (1936) The Measurement of Industrial Localization. The Review of Economics and Statistics, 18, 162-171.
https://doi.org/10.2307/1927875

[17]   Schutz, R.R. (1951) On the Measurement of Income Inequality. American Economic Review, 41, 107-122.

[18]   Atkinson, A.B. (1970) On the Measurement of Inequality. Journal of Economic Theory, 2, 244-263.
https://doi.org/10.1016/0022-0531(70)90039-6

[19]   Rasche, R.H., Gaffney, J., Koo, A.Y.C. and Obst, N. (1980) Functional Forms for Estimating the Lorenz Curve. Econometrica, 48, 1061-1062.
https://doi.org/10.2307/1912948

[20]   Rao, U.L.G. and Tam, A.Y.-P. (1987) An Empirical Study of Selection and Estimation of Alternative Models of the Lorenz Curve. Journal of Applied Statistics, 14, 275-280.
https://doi.org/10.1080/02664768700000032

[21]   Chotikapanich, D. (1993) A Comparison of Alternative Functional Forms for the Lorenz Curve. Economics Letters, 41, 129-138.
https://doi.org/10.1016/0165-1765(93)90186-G

[22]   Rohde, N. (2009) An Alternative Functional Form for Estimating the Lorenz Curve. Economics Letters, 105, 61-63.
https://doi.org/10.1016/j.econlet.2009.05.015

[23]   Golden, J. (2008) A Simple Geometric Approach to Approximating the Gini Coefficient. The Journal of Economic Education, 39, 68-77.
https://doi.org/10.3200/JECE.39.1.68-77
