The World Inequality Report 2018 is out. It shows that during 2015-16 the top 10% of individuals captured some 37% of total wealth in the European Union, the best score among large economies such as China (41%), Russia (46%), Brazil (55%), the USA (47%), and India (55%). In terms of income inequality, the report also shows that from 1980 to 2018 the world’s richest 1% captured about 27% of new income, while the poorest 50% of the world population received only 13%. Thus, compared to the years preceding globalization, income inequality has increased almost everywhere, in developing and developed countries alike, including those that enjoy the highest level of development. At first sight, one would tend to suspect similar policy structures as the culprit. However, given that the top 10% captured some 61% of total wealth in Sub-Saharan Africa, wealth and income inequality appear to be ubiquitous. It is then reasonable to suspect that some homogeneous power law lurks behind the distribution of wealth and income in market economies.
A homogeneous power law can be represented as $y = K x^{\alpha}$, where y is a variable; K is a constant; x is the other variable of interest; and α is the scale exponent. Financial economists have studied power laws identified in stock market activities, such as firms’ size distributions, returns on investments, and traded volume, among others. But their studies attempt to reproduce Zipf’s hyperbolic law, which relates word rank and word frequency in natural languages, or they are simply influenced by the so-called Pareto Law, under which the tail probability decays as a power of the variable, $P(X > x) \propto x^{-\alpha}$. Moreover, estimates are obtained from log-log plots. Experimental data on quantities that follow a power law are usually very noisy, so obtaining reliable estimates of the exponent α is difficult. In fact, graphical methods based on a linear least-squares fit to empirical data points, such as a time series, produce biased estimators. If precision is a requirement, it is advisable to use maximum likelihood methods instead, e.g. the so-called Hill’s estimator, which gives the inverse of the exponent of the Pareto distribution. In practice, however, one never knows for certain whether an observed quantity is drawn from a power law distribution. At any rate, the information obtained from graphical or other methods is of little use in policy formulation. For example, α = 2 means that if x is doubled, then y goes up by a factor of $2^{2}$. Even if a power law is really present, the researcher will not know how it manifests itself. More importantly, if the output signal is from a multifractal, endowed with many scale exponents and driven by a strange attractor, what scale is being estimated? Clearly, graphical methods are of little use in such situations.
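The contrast between graphical and maximum likelihood estimates can be illustrated with a minimal Python sketch. It assumes synthetic Pareto data with a known exponent; the sample size, the seed, and the tail size `k` are illustrative choices, not values from this study. It compares a least-squares slope on a log-log rank plot with Hill's estimator.

```python
import numpy as np

def hill_estimator(samples, k):
    """Hill's estimate of 1/alpha based on the k largest observations."""
    ordered = np.sort(np.asarray(samples, dtype=float))[::-1]  # descending order
    top = ordered[:k]
    threshold = ordered[k]  # the (k+1)-th largest value
    return np.mean(np.log(top / threshold))

def loglog_slope_estimator(samples):
    """Graphical estimate: minus the slope of log(rank/n) against log(value)."""
    ordered = np.sort(np.asarray(samples, dtype=float))[::-1]
    ranks = np.arange(1, ordered.size + 1)
    survival = ranks / ordered.size  # empirical P(X >= x)
    slope, _ = np.polyfit(np.log(ordered), np.log(survival), 1)
    return -slope

rng = np.random.default_rng(0)
true_alpha = 2.5  # assumed exponent of the synthetic data
# rng.pareto draws from the Lomax form; adding 1 gives a classical Pareto with minimum 1
data = rng.pareto(true_alpha, size=20_000) + 1.0

alpha_hill = 1.0 / hill_estimator(data, k=2_000)  # Hill returns the inverse exponent
alpha_plot = loglog_slope_estimator(data)
print(f"Hill: {alpha_hill:.3f}  log-log fit: {alpha_plot:.3f}")
```

On real data the choice of `k` matters a great deal, since only the tail is expected to follow a power law; here the whole sample is Pareto by construction, which is the most favorable case for both estimators.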
Yet power laws with fractional exponents showing scale invariance are perhaps the only way to study many critical phenomena in real-world settings. The values of exponents reflect a large number of regularities found in physics, biology, psychology, etc., and in many human constructs such as music and economics. Whether the exponent is an integer or a non-integer, relations between variables are characterized by the notion of self-similarity. And we contend that to know why self-similarity or dissimilarity exists is to know much more about the phenomenon that we wish to study in this paper.
For tractability, however, let us first recall that self-similarity is more than a welcome attribute, because it is fundamental to our nature whether we are aware of it or not, and also because it is the concept that underlies fractals, complexity, and many laws of nature. In this paper we intend to look at income distribution through the lens of fractals, and there self-similarity plays a huge role, as evidenced by the difficulty of distinguishing between minute-by-minute and second-by-second stock averages. Indeed, power laws are found in many scientific and human constructs. Proper analyses in these areas require some generalization of the fractal concept, either with many exponents (in which case they are called multifractals) or as a fractal ‘tout court’ with one scaling exponent; at any rate, self-similarity may appear on many scales. Hence, it would really be surprising if some power law did not underlie stocks and commodity exchanges.
Power laws also govern the noise spectra of various processes. But interpreting noise spectra has its own pitfalls, because non-stationarity in the data can produce fake scaling behavior, as evidenced by the excessive association of Zipf’s law with pink noise, or 1/f noise. The latter is a signal whose power spectral density (power per frequency interval) is inversely proportional to the frequency of the signal. Maybe the attachment of economists to 1/f noise (that is, $1/f^{\beta}$ noise with β = 1, the power spectrum being the squared modulus of the Fourier transform) stems from the fact that 1/f noise is connected to systems that are near their equilibrium, whereas β > 1 is associated with driven, non-equilibrium systems. Thus, values of α = 1 or 3 found in the studies mentioned above will probably not occur in spectral analyses, because economic systems are most often far from their equilibrium, and economic data are very noisy. Spectral analysis provides representations of system outputs that are as noise-free as possible, from which one can derive precious quantitative and qualitative information concerning the system’s behavior.
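To make the notion of a spectral exponent concrete, the following Python sketch synthesizes noise with a prescribed β by shaping the Fourier amplitudes of white noise, and then recovers β from the slope of the log-periodogram. This is an illustration under simplifying assumptions (stationary synthetic data, a plain periodogram, an ordinary least-squares fit); the function names and series length are hypothetical, and this is not the wavelet procedure used later in the paper.

```python
import numpy as np

def synthesize_power_law_noise(beta, n, rng):
    """Shape white noise so that its power spectrum falls off as 1/f**beta."""
    spectrum = np.fft.rfft(rng.standard_normal(n))
    freqs = np.fft.rfftfreq(n)
    shaping = np.zeros_like(freqs)
    # amplitude ~ f**(-beta/2), so power ~ f**(-beta); the zero-frequency term is dropped
    shaping[1:] = freqs[1:] ** (-beta / 2.0)
    return np.fft.irfft(spectrum * shaping, n=n)

def estimate_spectral_exponent(series):
    """Estimate beta as minus the slope of log(power) against log(frequency)."""
    series = np.asarray(series, dtype=float)
    power = np.abs(np.fft.rfft(series - series.mean())) ** 2
    freqs = np.fft.rfftfreq(series.size)
    # skip the zero frequency and the last (Nyquist) bin
    slope, _ = np.polyfit(np.log(freqs[1:-1]), np.log(power[1:-1]), 1)
    return -slope

rng = np.random.default_rng(0)
pink = synthesize_power_law_noise(1.0, 4096, rng)
brown = synthesize_power_law_noise(2.0, 4096, rng)
beta_pink = estimate_spectral_exponent(pink)
beta_brown = estimate_spectral_exponent(brown)
print(f"pink: {beta_pink:.2f}  brown: {beta_brown:.2f}")
```

With non-stationary or trending data the same fit can report a spurious exponent, which is the pitfall mentioned above.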
The stochastic behavior of a deterministic dynamic system could show a power spectral density (PSD) that is qualitatively similar to that of a truly random time series, as in a broad-band power spectrum. On the other hand, a purely stochastic system may also contain flicker noise, because low-frequency components dominate high-frequency ones; this is known as the intermittency problem. That is why we believe that it is safer to start by characterizing the attractor, in order to ascertain whether the system under study is stochastic or one that has structure.
Power laws abound in the human experience. Their operation can easily be inferred from a few casual observations: distributively, the frequency of “smalls” exceeds that of “bigs”; there are more small stars than big ones; the frequency of unstable configurations in the universe is higher than that of stable configurations; the frequency of short words exceeds that of long words; and so on. It then follows that, if market economies are driven by power laws, it might be natural for the frequency of small income earners to exceed that of big earners. However, one must be careful in dealing with spectral power that is continuous and diverges at low frequencies. There exists a plethora of methods, such as the multitaper method, to handle these situations. Here, however, we are less interested in precise measurements than in determining the upper and lower bounds of power spectra of multifractals whose fractal attractors are characterized by fractional exponents.
If the S&P-500 Index is the output signal of a multifractal, it is then worth investigating whether income distributions (which are of interest here) are better studied by noise spectra (their power spectrum) than by distributions reconstructed from tail exponents. To do so, however, it is necessary to first express the power law in the frequency domain, i.e., $P(f) = k f^{-\beta}$, where k is a constant; β is the power spectrum; and P(f) is the power function that measures the power of the signal per unit of time. We next characterize the attractor of the index to determine its fractal dimension. We will attempt that in the next two sections. The second will be devoted to the data, to preliminaries to the computation of the singularity spectrum (to demonstrate both its multi-fractality and its non-randomness), and to the noise spectra of the S&P-500. Our findings will be summarized in the Third Section.
2. Data and Preliminaries
We used the Grand Microsoft Excel series of closing prices of the S&P-500 from January 3rd 1961 to February 28th 2011, sampled at daily intervals, and expressed as a Mixed Fractional Brownian Motion (MfBm; see Appendix A), assuming its non-randomness. The series was next truncated into 7 segments (the b’s in Appendix A) that had been determined previously, and each segment was de-trended using 3 logarithmic differences and filtered for white noise. Segment length varies from $2^{9}$ to $2^{11}$.
The analysis will be done in two stages. In the first, we will use the wavelet multi-resolution software of Trusoft International, the Benoit version, to determine the boundaries of the observational range of the Hurst exponents (H), the power spectrum, and the Hausdorff dimensions (D0; see Appendices B and D). In the second stage, D0, being the first scale exponent, will be used as the starting point in the determination of the generalized fractal dimensions or the singularity spectra of the segments.
As shown in Appendix B, the Hausdorff dimension (D0) is a more efficient measure than either the topological or the box-counting dimension, and also a more natural measure within the multifractal formalism. For, if a closed bounded set $E \subset \mathbb{R}^{n}$ (where $\mathbb{R}$ is the real line) is a manifold, then i) the value of its dimension must be either an integer or a non-integer; and ii) points and countable unions of points of zero volume must have zero dimension. It can then be seen that the topological dimension fails on these conditions since it is always an integer, giving zero for the Cantor set for example, which is obviously not true. By a similar argument, the box-counting measure also fails on ii), whereas D0 satisfies both i) and ii), and $\dim_H(E) \le \dim_{Box}(E)$.
Recall, at the same time, that if $V \subseteq X$ and G is a collection of subsets of X whose union contains V, then G is a cover of V. If, further, X is a topological space, then G is an open cover if each of its members is an open set. Therefore, the term ‘fractal dimension’, more generally referred to as the capacity dimension of fractal sets, is also the exponent D0 in the expression $N(\varepsilon) \sim \varepsilon^{-D_0}$, where $N(\varepsilon)$ is the minimum number of open sets of diameter ε needed to cover the set. The D0 given by the software will be the starting point in the computation of the singularity spectrum, including the correlation dimension.
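The covering argument can be made concrete with the Cantor set mentioned above. The sketch below is a minimal Python illustration, not part of the software used in this study: it builds the left endpoints of the level-10 Cantor intervals with integer arithmetic, counts the boxes of size $3^{-k}$ needed to cover them, and reads the capacity dimension off the slope of log N(ε) against log(1/ε). The construction level and function names are illustrative assumptions.

```python
import math
import numpy as np

def cantor_points(level):
    """Left endpoints of the level-`level` Cantor intervals, as integers scaled by 3**level."""
    points = [0]
    for _ in range(level):
        # append a base-3 digit 0 or 2 to every point (the middle third, digit 1, is removed)
        points = [3 * p for p in points] + [3 * p + 2 for p in points]
    return points

def box_counts(points, level):
    """Number of boxes of size 3**-k (k = 1..level) that contain at least one point."""
    sizes, counts = [], []
    for k in range(1, level + 1):
        cell = 3 ** (level - k)  # box width in the integer scaling
        counts.append(len({p // cell for p in points}))
        sizes.append(3.0 ** (-k))
    return np.array(sizes), np.array(counts)

level = 10
sizes, counts = box_counts(cantor_points(level), level)
slope, _ = np.polyfit(np.log(1.0 / sizes), np.log(counts), 1)
capacity_dimension = slope
print(capacity_dimension, math.log(2) / math.log(3))
```

The slope is a non-integer close to log 2 / log 3, whereas the topological dimension of the same set is zero.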
2.1. The Singularity Spectrum
The method of multifractal cascades is now known as the multifractal formalism, introduced by Mandelbrot (1974) in response to systematic experimental deviations observed in the Kolmogorov theory of homogeneous and isotropic turbulence. It has since undergone considerable theoretical development and found practical applications in many disciplines, as it seems well adapted to reveal the hierarchy governing spatial distributions of singularities of multifractal measures.
In this paper it is referred to as the Mandelbrot Method, which is a simple iterative construction that asymptotically models strange attractors. It consists of an “initiator” (the unit interval) on which a unit mass is uniformly distributed, and a “generalized generator” (ρ) with two intervals (εi), i = 1, 2. The initiator is first divided into two bins with equal probability (pi). Next, the exponent q is assigned to the probabilities, while the exponent τ is assigned to the support intervals. The exponent τ(q) is the Rényi scaling exponent, and q is a real parameter that can take positive as well as negative values. In the case of a mono-fractal (or self-affine process), τ(q) depends linearly on q; otherwise the process is a multifractal.
Quadratic maps have the same structure, but different intervals. Experimentally, Schroeder found an interval size ε1 = 0.400 to be a good approximation of εi for the logistic map. By using Equation (3) below, we obtain ε1 = 0.408903, which is equivalent to a generalized generator of ρ = 2.445564 instead of the approximate value of 2.5 chosen initially by Schroeder. The difference is due to the fact that the logistic map is not exactly self-similar. Since the approximate map of a given process might not be known in advance, one should appeal to Equation (1) below to obtain the size of the generator and intervals from the Hausdorff dimension given by the wavelet analysis. Once ε1 is known, all of Rényi’s generalized fractal dimensions, except D1 (the information dimension), can be computed. But beforehand, the Legendre Transform posits: $f(\alpha) = q\alpha - \tau(q)$; $\alpha = d\tau/dq$, and $q = df/d\alpha$. The Hölder analysis, for its part, decomposes a measure into a sum of measures, each characterized by a value of the Hölder exponent α; the latter measures the strength of the local singularity or roughness. Everything is then on hand to construct the multifractal spectrum; thus, the generality of D0 in this approach cannot be over-emphasized. To wit:
From the partition function:
, positing , and (1)
We can derive two equations in quadratic form:
Equation (2) is derived from positing q = 0, Dq = D0, and . Then, , where G is the Golden Mean; hence,
From (3), we have:
As the exponent τ(q) describes the same aspect of the multifractal spectrum, denoted f(a), we have;
$D_q = \dfrac{\tau(q)}{q - 1}$, for $q \neq 1$ (6)
Equation (6) is Renyi’s  ,  generalized dimensions of order q, which handles every portion of the support of the attractor in a uniform manner and describes the nature of singularities at the same time. It works for all q’s, except of course q = 1.
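The numbers quoted earlier for the logistic map can be checked with a short Python sketch. It rests on assumptions that are inferred from those numbers rather than taken from the equations of this section: a two-scale generator with intervals ε1 and ε2 = ε1², equal probabilities p1 = p2 = 1/2, and a partition function of the form $\sum_i p_i^{q}\,\varepsilon_i^{-\tau(q)} = 1$ with $\tau(q) = (q-1)D_q$. Under these assumptions, setting q = 0 gives a quadratic whose positive root is the inverse of the Golden Mean. All function names are hypothetical.

```python
import math

GOLDEN_MEAN = (1.0 + math.sqrt(5.0)) / 2.0

def interval_from_d0(d0):
    """q = 0 case: eps1**d0 + eps1**(2*d0) = 1, so eps1**d0 = 1/G."""
    return GOLDEN_MEAN ** (-1.0 / d0)

def d0_from_interval(eps1):
    """Inverse of interval_from_d0."""
    return math.log(1.0 / GOLDEN_MEAN) / math.log(eps1)

def tau(q, eps1):
    """Solve 2**(-q) * (y + y**2) = 1 for y = eps1**(-tau), then return tau."""
    y = (-1.0 + math.sqrt(1.0 + 4.0 * 2.0 ** q)) / 2.0
    return -math.log(y) / math.log(eps1)

def generalized_dimension(q, eps1):
    """Renyi dimension of order q (not defined at q = 1 by this formula)."""
    if q == 1:
        raise ValueError("q = 1 requires the separate information-dimension formula")
    return tau(q, eps1) / (q - 1.0)

eps1 = 0.408903                  # interval size quoted in the text
rho = 1.0 / eps1                 # generalized generator
d0 = d0_from_interval(eps1)      # Hausdorff dimension implied by eps1
d2 = generalized_dimension(2, eps1)
print(f"rho = {rho:.6f}, D0 = {d0:.4f}, D2 = {d2:.4f}")
```

Under the stated assumptions, the generator comes out close to the quoted ρ = 2.445564, and Schroeder's ε1 = 0.400 corresponds to a slightly smaller D0.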
For D1, we have:
but, for D∞ and D−∞, it is easier to expand the numerator of (6). That is,
using $\log_2$ and letting $q \to \infty$ or $q \to -\infty$, we have:
What is needed here is the correlation dimension that can be computed for q = 2; that is:
The importance of D2 lies in its relation with the concept of correlation. It can be shown that as ε → 0, the sum in (9) equals the total count, defined as C(r) and used in the method proposed by Grassberger and Procaccia. One of the many roles played by the correlation dimension lies in its ability to distinguish between deterministic chaos and randomness. In the Grassberger and Procaccia method, one builds a d-dimensional data vector from d measurements spaced equidistantly in time, and determines D2 of the d-dimensional point set. If the data were random, then D2 would increase continuously with d. However, if the system is deterministic, D2 will stop increasing once the embedding dimension exceeds D2. For more on this, the reader is referred to Appendix C.
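The saturation argument can be illustrated with a small Python sketch of the Grassberger and Procaccia idea. It is an illustration on synthetic data, not the computation reported in this paper: it compares uniform white noise with the logistic map at parameter 4, and the series length, radii, and lag are arbitrary assumptions. The correlation sum is normalized here by the number of distinct pairs, which does not affect the slope.

```python
import numpy as np

def delay_embed(series, dim, lag=1):
    """Stack `dim` lagged copies of the series into d-dimensional vectors."""
    series = np.asarray(series, dtype=float)
    n_vectors = series.size - (dim - 1) * lag
    return np.column_stack([series[i * lag: i * lag + n_vectors] for i in range(dim)])

def correlation_sum(vectors, radii):
    """Fraction of distinct pairs of vectors closer than each radius r."""
    n = vectors.shape[0]
    sq_dist = np.zeros((n, n))
    for column in vectors.T:
        sq_dist += (column[:, None] - column[None, :]) ** 2
    pair_dist = np.sqrt(sq_dist[np.triu_indices(n, k=1)])  # each pair counted once
    return np.array([(pair_dist < r).mean() for r in radii])

def correlation_dimension(series, dim, radii):
    """Slope of log C(r) against log r over the supplied radii."""
    c = correlation_sum(delay_embed(series, dim), radii)
    ok = c > 0  # guard against empty counts at very small radii
    slope, _ = np.polyfit(np.log(radii[ok]), np.log(c[ok]), 1)
    return slope

rng = np.random.default_rng(1)
noise = rng.random(1500)  # purely random series

logistic = np.empty(1600)  # deterministic series
logistic[0] = 0.3
for i in range(1, logistic.size):
    logistic[i] = 4.0 * logistic[i - 1] * (1.0 - logistic[i - 1])
logistic = logistic[100:]  # discard the transient

radii = np.geomspace(0.02, 0.1, 8)
noise_d2 = {d: correlation_dimension(noise, d, radii) for d in (1, 2, 3)}
logistic_d2 = {d: correlation_dimension(logistic, d, radii) for d in (1, 2, 3)}
print("noise:   ", noise_d2)
print("logistic:", logistic_d2)
```

For the random series the estimate keeps growing with the embedding dimension, while for the deterministic series it stays near a fixed low value.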
2.2. The Power Spectrum
The wavelet multi-resolution software computes the Hurst exponent (H) for each segment of the S&P-500 Index, and the segments are used to calculate the power spectrum as shown in Appendix D. It should be stressed at this juncture that self-similarity is at play, indicating that there exist relations between variables, and D0 is at hand. The power spectrum can then be computed for each segment of the Index. The Hurst exponent, defined by $E[R(n)/S(n)] = C n^{H}$ as $n \to \infty$ (where R(n) is the range and S(n) the standard deviation of the first n values; see Appendix D), is also a measure of persistence (H > 0.5) and anti-persistence (H < 0.5) in statistical time series. Persistence is related to long memory in time series, meaning that an increase in values is most likely to be followed by another increase, while anti-persistence (H < 1/2) relates to short-term memory or return to the mean, meaning that an increase will most likely be followed by a decrease, and vice versa. H = 0.5 is taken to mean randomness, as in a Brown noise spectrum.
Interestingly, the rescaled range and segment sizes follow a power law, and H is its exponent. The intensity of fluctuations in anti-persistence mode increases as H moves closer to zero; hence, its connection to frequency.
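The power law linking the rescaled range to the segment size can be sketched in Python. Note that this is the classical rescaled-range (R/S) estimate, swapped in here for illustration; the study itself relies on a wavelet-based software estimate. The window sizes and the synthetic series are assumptions, and the R/S estimate is known to be biased upward for short windows.

```python
import numpy as np

def rescaled_range(chunk):
    """Range of cumulative deviations from the mean, divided by the standard deviation."""
    deviations = np.cumsum(chunk - chunk.mean())
    spread = deviations.max() - deviations.min()
    scale = chunk.std()
    return spread / scale if scale > 0 else np.nan

def hurst_rs(increments, window_sizes):
    """Slope of log(mean R/S) against log(window size)."""
    increments = np.asarray(increments, dtype=float)
    mean_rs = []
    for size in window_sizes:
        n_chunks = increments.size // size
        chunks = increments[: n_chunks * size].reshape(n_chunks, size)
        mean_rs.append(np.nanmean([rescaled_range(c) for c in chunks]))
    slope, _ = np.polyfit(np.log(window_sizes), np.log(mean_rs), 1)
    return slope

rng = np.random.default_rng(2)
white = rng.standard_normal(4096)   # independent increments, H near 0.5
anti = np.diff(white)               # over-differenced series, strongly anti-persistent
windows = [16, 32, 64, 128, 256, 512]

h_white = hurst_rs(white, windows)
h_anti = hurst_rs(anti, windows)
print(f"white noise: {h_white:.2f}  differenced noise: {h_anti:.2f}")
```

The differenced series reverts to the mean at every step, so its estimated exponent falls well below that of the independent series.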
3. The Results
The results are summarized in Table 1, while Table 2 provides some additional information that might not be apparent in Table 1. As can be seen, the power spectrum β, which describes how much different frequencies contribute to the average power of the signal, fluctuates from segment to segment. Values between 1 and 2 reflect anti-persistence in the index, and values between 2 and 3 reflect long-term memory or persistence. Readers interested in the wavelength of the memory are referred to the excellent paper by Peters. Thus, over the whole period, four segments reflect anti-persistence and three reflect persistence.
Table 1. The power spectrum and the correlation dimension of the S&P Index: 1961-2011.
Table 2. More information about persistence and anti-persistence modes.
The symbols ↑ and ↓ indicate improvement or deterioration in ordinal space, or increase and decrease in real space. D2 indicates the frequency of the orbit’s visits to different subspaces in the attractor.
The interesting observation for the present purpose is that a power spectrum lying between 1 and 2 reflects dark-pink noise spectra, which implies a deterioration in income distribution, while values lying between 2 and 3 reflect dark-brown spectra, coinciding with improvement. Interestingly, over the whole range of the data considered, no brown noise was detected; we note in passing that this should be significant for studies based on Brownian motion.
For additional verification, we consider some values obtained from another measure, called the Gini Index. The latter has a few interpretative limitations. For example, it measures relative income; thus two countries could have the same Gini value and yet be very different in terms of economic status. Also, the Gini index may exceed a value of 1.0 when some individuals make a negative contribution to the total income. However, these limitations do not apply in the present case. In essence, a value of zero in the Gini Index reflects perfect equality, while a value of 1 reflects perfect inequality.
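For readers unfamiliar with the measure, the Gini Index can be computed as half the mean absolute difference between all pairs of incomes, divided by the mean income. The Python sketch below uses that standard definition on made-up income vectors; the numbers are purely illustrative and are not data from this study.

```python
import numpy as np

def gini(incomes):
    """Gini index: mean absolute pairwise difference over twice the mean income."""
    x = np.asarray(incomes, dtype=float)
    mean_abs_diff = np.abs(x[:, None] - x[None, :]).mean()
    return mean_abs_diff / (2.0 * x.mean())

equal = gini([1, 1, 1, 1])            # everyone earns the same
concentrated = gini([0, 0, 0, 10])    # one earner takes everything
with_negative = gini([-5, 0, 0, 10])  # one negative contribution to total income
print(equal, concentrated, with_negative)
```

With a finite population of n earners, complete concentration gives (n - 1)/n, which approaches 1 as n grows; the third vector shows how a negative contribution can push the index above 1.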
Now consider how the Gini Indices of the US economy vary over time. Over the period 1961-72, the S&P-500 Index was in persistence mode with a value of β = 2.044; the Gini index went down from 0.52 in 1961 to 0.42 in 1972. In contrast, from 1972 to 1983, when the S&P-500 was in anti-persistence mode, the power spectrum was 1.4, probably due to changes in the status of the US dollar, war, and the oil shock.
We would then expect a deterioration in the Gini index over that period. That is what happened: the Gini coefficient went from 0.42 to 0.46. During the whole period 2003-2011, the system was again in anti-persistence mode, and the Gini coefficient again rose, from 0.51 to 0.53. During the brief period 1998-2002, when the S&P-500 index was in persistence mode, the Gini coefficient remained at 0.50; that is the only glitch observed. But from 2003 to 2008, it increased from 0.49 to 0.50. From 2007 to 2008, the Gini index increased from 0.50 to 0.52. Thus, during the economic meltdown from 2007 to 2011, the index was in anti-persistence mode and the Gini coefficient increased from 0.50 to 0.53. If we were to examine the Gini indices for other countries, we would most likely observe a similar situation, except where it is mitigated by equality-oriented policies.
In summary therefore, dark-pink noise (1 < β < 2) implies deterioration in income distribution, while dark-brown noise (2 < β < 3) reflects improvement in income distribution.
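The reading rule summarized above can be written as a small Python helper. The thresholds and labels come from the text; the function name, the tuple layout, and the handling of the boundary value β = 2 are illustrative assumptions.

```python
def classify_spectrum(beta):
    """Map a power spectrum value to (noise color, mode, implied change in income distribution)."""
    if 1.0 < beta < 2.0:
        return ("dark-pink", "anti-persistence", "deterioration")
    if 2.0 < beta < 3.0:
        return ("dark-brown", "persistence", "improvement")
    if beta == 2.0:
        # H = 0.5: brown noise, taken in the text to mean randomness
        return ("brown", "random", None)
    raise ValueError("beta outside the range discussed in the text")

period_1961_72 = classify_spectrum(2.044)  # value reported for 1961-72
period_1972_83 = classify_spectrum(1.4)    # value reported for 1972-83
print(period_1961_72, period_1972_83)
```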
Turning now to the correlation dimension D2 in 2-D, it remained between 2 and 3 over the whole period under study, implying that the S&P-500 was never a random process. The other interesting result is that the correlation dimension (which detects probabilistic structure among variables) of each segment is a non-integer, implying that its dynamics should show a countable set of periodic orbits of arbitrarily long periods, and an uncountable set of non-periodic orbits. Such a situation might appear random to the naked eye, but in fact the process is deterministic. Furthermore, D2 increased when the system was in anti-persistence mode and decreased in persistence mode, as can be seen in the last column of Table 1. This means that a sort of phase shift occurred at the fold at H = 1/2. This is explained by an enlargement or a shrinking of the singularity spectrum at the values of the b’s in Appendix A. For example, during the period 1961-72, the process was in persistence mode, while during the period 1972-83, it was in anti-persistence mode. Consequently, the information dimension D1 went from 2.4192 to 2.7083; D0 went from 2.4780 to 2.7791, and so on. That is, the size of the attractor increases in anti-persistence mode and decreases in persistence mode, as we would expect.
Further verification comes from similar studies. One such study found that, for 3-D fractal attractors of continuous-time dissipative systems, the non-integer fractal dimension satisfies 2 < D2 < 3, as found in the last column of Table 1. To take yet another example, consider the findings of Edgar Peters, who used the Grassberger/Procaccia procedure to compute D2 of the S&P-500 Index, sampled at monthly intervals from January 1980 to July 1989. He found a value of 2.33. Even though we have neither the same series length nor the same sampling interval, this study arrives at a value of 2.3345 for the period 1983-87. This might be due to the fact that D2 remained constant over the time interval, or to the fact that this segment, like all the others, was filtered for white noise prior to the analysis (both methods are sensitive to noise), or to a combination of both; at any rate, this kind of concordance in that statistic is rather rare in economics.
4. Concluding Remarks
Our initial contention was that wealth and income inequalities in market economies are too ubiquitous and systematic not to be driven by some power law. To verify that assertion, the S&P-500 Index, sampled daily over a span of 50 years, was examined. It was found that the index varies between anti-persistence and persistence modes during the period studied. Consequently, its noise spectrum varies from dark-pink, when the power spectrum was between 1 and 2, to dark-brown, when the power spectrum lay between 2 and 3. On the assumption that the Index is the output of a multifractal, its singularity spectrum, including the correlation dimension of each segment, was also computed using the method proposed by Mandelbrot. The value computed for the correlation dimension was compared to that obtained from another procedure advocated by Grassberger and Procaccia. Values from the two methods were found to be in perfect agreement.
It was further found that both noise spectra and correlation dimensions vary with persistence modes. When the index was in anti-persistence mode (the power spectrum was between 1 and 2), the correlation dimension was over 2.5; but in persistence mode (the power spectrum was between 2 and 3) the correlation dimension was below 2.5. More interestingly, in anti-persistence mode, income distribution became more unequal than in persistence mode. This finding confirms the original suspicion that power spectra faithfully reflect the state of the economy and drive income distribution. Moreover, the fact that the correlation dimension is not an integer also shows that the whole process is deterministic and driven by a strange attractor with a fractional exponent.
If it is true that income and wealth distributions are driven by a power law, then perfect equality in this area is unattainable, even with strong equitable policies, since one would be fighting not only the rich but also the power law along the way. Incidentally, power laws should be operational in many other areas of economics. For example, if economists could muster the courage to drop the unobservable appendage called the utility function, they would see that what they term the law of demand is none other than a power law whose exponent α is 1.
Before closing, let us say that we believe that an equitable policy is desirable, but policy makers must bear in mind that producing “effective policies” is also a fight against the power law. As policy approaches equity, the power law constraint becomes more and more insurmountable. Nonetheless, what is obvious is what Table 2 reveals: all would-be effective policies should focus on the stabilization of fluctuations.
Appendix A. The Mixed Fractional Brownian Motion
The Mixed Fractional Brownian Motion is given by:
, where , . and , . (A2.1)
Zt is an observed combination of Gaussian processes, each with its own H index; these processes are the unobserved Mandelbrot-van Ness inputs into Zt. The latter not only captures the properties of the dynamic input/output construct describing the financial market, but its structure also allows the analysis of the data segment by segment, depending on their scaling limit of self-similarity. That setting allows the judicious use of both the wavelet multi-resolution analysis and the Mandelbrot Method of multifractal analysis. For, if outputs are only approximately self-similar, they must be decomposed into subsets supporting a Borel probability measure having some sort of symmetry, which can reproduce copies of the sets on arbitrarily small scales up to a given precision.
If now we consider price index as the observable output or an observable signal, Zt, we have:
(The Process): , where , , is a combination of
observed Gaussian processes, each with its own H index, while are unobservable inputs into Zt, arriving as “cars” or as “trains” in the sense of Sottinen  . Further:
Then Zt is completely characterized by its covariance function :
Zt has the following essential properties:
Property 1. (Scale invariance). and ( , ) have the same probability distribution. This property is a consequence of the covariance function, R(t, s), which is homogeneous of order 2H.
Property 2. (Stationary Increments). Over the interval (t, s), has a normal distribution with zero mean and variance given by .
Property 3. (Dependence). Defining ; , and . If , Zt is anti-persistent and STD exists; if, on the other hand, , Zt is persistent and LTD exists.
In the literature, Zt is termed the Mixed Fractional Brownian Motion (MfBm), while each of its inputs is the Mandelbrot-van Ness process (fBm).
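As an illustration of the properties listed in this appendix, the Python sketch below uses the standard covariance of fractional Brownian motion, $R(t, s) = \tfrac{1}{2}\left(|t|^{2H} + |s|^{2H} - |t - s|^{2H}\right)$, which is homogeneous of order 2H, and simulates a path by Cholesky factorization of the covariance matrix. Treating the mixed process as a weighted sum of independent fBm inputs is an assumption made here for illustration; the weights, Hurst indices, and time grid are hypothetical.

```python
import numpy as np

def fbm_covariance(t, s, hurst):
    """Standard fBm covariance, homogeneous of order 2H."""
    return 0.5 * (np.abs(t) ** (2 * hurst) + np.abs(s) ** (2 * hurst)
                  - np.abs(t - s) ** (2 * hurst))

def simulate_fbm(times, hurst, rng):
    """Draw one fBm path on strictly positive, distinct times via Cholesky factorization."""
    times = np.asarray(times, dtype=float)
    cov = fbm_covariance(times[:, None], times[None, :], hurst)
    factor = np.linalg.cholesky(cov)
    return factor @ rng.standard_normal(times.size)

def simulate_mixed_fbm(times, hursts, weights, rng):
    """Assumed form of the mixed process: a weighted sum of independent fBm paths."""
    return sum(w * simulate_fbm(times, h, rng) for w, h in zip(weights, hursts))

rng = np.random.default_rng(3)
times = np.arange(1, 129) / 128.0  # avoid t = 0, where the variance vanishes
path = simulate_mixed_fbm(times, hursts=[0.3, 0.7], weights=[0.5, 0.5], rng=rng)

# Scale invariance check: R(c t, c s) = c**(2H) R(t, s)
c, h, t, s = 3.0, 0.3, 0.4, 0.9
lhs = fbm_covariance(c * t, c * s, h)
rhs = c ** (2 * h) * fbm_covariance(t, s, h)
print(lhs, rhs)
```

The Cholesky approach costs on the order of n³ operations, so it suits short illustrative paths rather than series of the length analysed in this paper.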
Appendix B. The Hausdorff Dimension
According to the Warwick.ac.uk lectures on Fractals and Dimension Theory, the Hausdorff dimension is a description of the geometry of a fractal set. If E is a fractal set whose dimension is sought, then let $\{c_i\}$ be a finite covering of E by sets whose diameters are less than ε. Then $E \subset \bigcup_i c_i$, and each set of the covering has some diameter $d_i = d(c_i)$. If the function:
$H^{D}_{\varepsilon}(E) = \inf \sum_i d_i^{D}$,
where the infimum is taken over all coverings satisfying $d_i < \varepsilon$, defines a measure $H^{D}(E)$ for the set E as ε → 0, then $H^{D}(E)$ decreases monotonically with D. Therefore, there is a unique transition point $D_H$ that defines the Hausdorff dimension. That is:
$H^{D}(E) = \infty$ for $D < D_H$ and 0 for $D > D_H$ (B2.2)
so that , (DH is henceforth denoted, D0).
For greater ease of exposition, one might wish to define a probability u on E and consider upper and lower dimensions of u as measurable functions du(sup) and du(inf), where
where ball(b, ε) is a ball of radius ε > 0 about b.
Moreover, if a closed bounded set is a manifold, the value of its dimension must satisfy the Warwick criteria. That is, its dimension must be:
1) either an integer or a non-integer; and
2) points and countable unions of points of zero volume must have zero dimension.
It can then be seen that the topological dimension (dimT), for example, fails on both criteria since it is always an integer, giving 0 for the Cantor set, which is not true. By a similar argument, the box-counting measure fails on 2), whereas dimH(E) satisfies both 1) and 2), and dimH(E) ≤ dimBox(E).
The so-called capacity dimension and the Hausdorff dimension are closely related in the sense that both are fractal and geometric measures, but in general the Hausdorff dimension is a lower bound for the capacity dimension. Both are geometric, not probability, measures, indicating how orbits fill the phase space under the flow of a dynamic system.
Appendix C. The Correlation Dimension
This presentation is provided by Medio. For n discrete points and an arbitrary fixed time interval, the correlation function is:
where θ(s) is the Heaviside function, i.e., θ(s) = 1 if s ≥ 0 and 0 if s < 0.
For small r, C(r) behaves as a power of r. Then:
$C(r) \sim r^{D_2}$,
where D2 is the correlation dimension. In practice, one proceeds by counting how many pairs of points are separated by a Euclidean distance smaller than some given distance r. As r varies, so does C(r), defined here as the total count divided by the squared number of points. The quantity C(r) is also called the correlation sum. As r → 0, the sum $\sum_i p_i^{2}$ scales as C(r), which yields the correlation dimension. The latter is another fractal dimension and a probability measure describing the frequency with which orbits visit different parts of the attractor of a dynamic system.
Appendix D. The Hurst Exponent
The Hurst exponent is used to measure the long-term memory of time series. It relates the autocorrelations of the series to the rate at which these decrease as the lag between pairs of values increases. The Hurst exponent (H) is then defined through the asymptotic behavior of the rescaled range as a function of the time span of a time series, as follows:
E[the range of the first n values / the standard deviation of the first n values] = $C n^{H}$ as $n \to \infty$, where E stands for expected value, C is a constant, and n is the number of data points in the time series.
The Hurst exponent H is also related to both the Hausdorff dimension and the power spectrum as follows:
This relation is valid for a given range, i.e., 0 ≤ H ≤ 1.
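The values reported in Section 3 are consistent with the relations β = 2H + 1 and D0 = 3 − H. These two relations are an inference from the reported numbers and from standard results for self-affine signals, not a transcription of the equation above, and should be read as an assumption. A short Python check, with hypothetical function names:

```python
def hurst_from_beta(beta):
    """Assumed relation: beta = 2H + 1."""
    return (beta - 1.0) / 2.0

def beta_from_hurst(hurst):
    return 2.0 * hurst + 1.0

def d0_from_hurst(hurst, euclidean_dimension=3):
    """Assumed relation: D0 = 3 - H (the value 3 is inferred from the reported D0 range)."""
    return euclidean_dimension - hurst

def hurst_from_d0(d0, euclidean_dimension=3):
    return euclidean_dimension - d0

# 1961-72: reported beta = 2.044 and D0 = 2.4780
h_6172 = hurst_from_beta(2.044)
d0_6172 = d0_from_hurst(h_6172)

# 1972-83: reported D0 = 2.7791 and beta of about 1.4
h_7283 = hurst_from_d0(2.7791)
beta_7283 = beta_from_hurst(h_7283)
print(h_6172, d0_6172, h_7283, beta_7283)
```

Under these assumed relations, H = 0.5 corresponds to β = 2, and the ranges 1 < β < 2 and 2 < β < 3 correspond to H < 0.5 and H > 0.5 respectively, in line with the reading of Table 1.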
 Dominique, C.-R. and Rivera, S.L. (2012) Short-Term Dependence in Time Series as an Index of Complexity: Example from the S&P-500 Index. International Business Research, 5, 38-47.
 Mandelbrot, B. (1974) Intermittent Turbulence in Self-Similar Cascades: Divergence of High Moments and Dimension of the Carrier. Journal of Fluid Mechanics, 62, 331-358.
 Mandelbrot, B. (2003) Multifractal Power Law Distributions: Negative and Critical Dimensions and Other Anomalies, Explained by a Simple Example. Journal of Statistical Physics, 110, 739-774.
 Warwick.Ac.uk (2012) Lectures on Fractals and Dimension Theory.