Almost everybody knows how to calculate an average or (arithmetic) mean, and its use is widespread. Its interpretation is quite often questionable and sometimes ludicrous, see e.g. . Though such remarks are important, this paper focuses on a different, more mathematical point.
Sometimes an average makes perfect sense. The average weight of an airline passenger times the number of passengers gives the exact total weight of all passengers. The average size of a screw in a warehouse, by contrast, does not make sense. Probably nobody will be fooled by this extreme example. However, there are more sophisticated examples in  like designing the cockpit of a fighter jet in accordance with the height of an average pilot, who turned out to hardly exist.
Sometimes the problem is solved by using the median instead of the mean or average. But this is not the point here. The median size of a screw in a warehouse is as useless as its average size. From Chapter 2 one can see that the arithmetic mean is a number that minimizes the sum of the quadratic deviations from all data points. So the actual data points may be bigger or smaller than the average, but the (linear) deviations to the bigger and smaller side are the same. If being bigger than the average can be canceled out by being smaller, an average does make sense. It is the case in the example of the average weight of a passenger. A heavier passenger can be compensated by a lighter one. This is in contrast to screws in a warehouse. One that is too small is as bad as one that is too big.
So the pivotal point is the use of the arithmetic mean. In the example of the average weight of an airline passenger, the use was to calculate the total passenger load, which is a positive real number. The use of the screw was to fit. Such a function has only two values: does fit and does not fit. For “does not fit”, it is irrelevant whether the screw is too small or too big. Mathematically speaking, an average is meaningful only if such a use-function is strictly linear.
So much for a complicated explanation of a maybe trivial issue. However, most long-term considerations take into account non-linear equations. As an archetype consider exponential population growth. Obviously, it is not a linear function. Its exponent is essentially the birthrate. A larger birthrate implies higher population growth and vice versa, but the dependence is highly non-linear. A ten percent higher birthrate in one part of the population cannot be compensated by a ten percent lower one in another part. Nevertheless, almost everywhere an average birthrate is used to forecast the future population. Chapter 3 scrutinizes this example. Surprisingly, one may use average birthrates in the normal (unlimited) population growth formula, provided they are properly weighted and inserted into the underlying differential equation. In Chapter 3 we prove generally that average parameters such as average birthrates can be used in linear differential equations only.
A further application is financial markets. Mankind is far away from having proper differential equations determining the future profit of a company and with it the future stock price. However, people try to build estimates. In these calculations, one uses averages, from mean inflation over mean spending on R & D to mean productivity of the employees. Everything else would make the considerations impossible for practical reasons, as it is not possible to consider and determine millions of variables. However, nobody doubts that the financial world is governed by non-linear differential equations. If it were not, the solutions would have to be plane waves, in contrast to all observations. We will discuss this in more detail in Chapter 4. It leads to the sad conclusion that almost all quantitative analyses in financial markets are at least doubtful.
In Chapter 5 we will consider the diffusion model in marketing, a mathematical tool to forecast the future market share. There, also average quantities are used. And due to that, one (sometimes) gets completely wrong results. Even chaos effects have been predicted  though they do not exist . In Chapter 5 we also show how one can use a continuum limit to overcome the problem of knowing average parameters only. It is quite important to notice that in almost all financial models a continuum limit is not possible. This has been stated for the first time in . It is due to the fact that typical financial data such as e.g. stock prices are, unlike the market share, not conserved. This led to the suggestion of a conserved value  in finance. Eventually it has been proven mathematically in .
We close with a look at further research in Chapter 6. Here we consider chaos in contrast to randomness. Though chaos appears random, it is not. However, it seems so random that chaotic functions are sometimes used to create “random” numbers. Forming arithmetic means of chaotically varying quantities sometimes gives results identical to those for random variations, as one might expect. But sometimes it does not. This is an open question. It is of special relevance whenever something varies chaotically, such as prices in financial markets. Is a statistical analysis allowed at all? Can one modify ordinary statistics in order to cope with it?
2. Definition of Mean and Median
It is assumed that the arithmetic mean (here also called average) and the median are well known to any potential reader of this publication. Otherwise, their definitions can be found in any mathematical handbook such as . Here we take a route different from most textbooks. It is important to understand what mean and median mean.
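Given is a set of real numbers $x_i$ with i running from 1 to N. We define the average as a number $\bar{x}$ so that
$$\sum_{i=1}^{N} (x_i - \bar{x})^2 = \min. \tag{1}$$
To find this minimum, Equation (1) can be differentiated with respect to $\bar{x}$ and set to zero. It leads to the well-known formula for the average
$$\bar{x} = \frac{1}{N} \sum_{i=1}^{N} x_i. \tag{2}$$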
The average is therefore a special least squares fit: the data points are fitted by a constant. As with any least squares fit, taking the squares in Equation (1) is by no means justified. It is practical for getting positive numbers and keeping an analytic function. However, it is arbitrary. Why not take the fourth power? Taking squares makes small numbers smaller and larger ones larger. The error of a least squares fit is given by Equation (1). This does not make sense if the data points carry a dimension (a physical unit): the error then has the dimension of that unit squared, which has no meaning. Taking the square root of Equation (1) does not help either. Squares and roots are non-linear functions which must not be interchanged with the sum. Therefore, the least squares fit is an approximation only. A least absolute value fit is the correct procedure. However, dealing with it becomes horribly complicated, and in most cases the result can be obtained only numerically. This is the reason why a least squares fit is so popular though strictly speaking it is wrong. The difference between a (wrong) least squares fit and the (correct) absolute value fit is small in many cases. However, if the data points vary over orders of magnitude (e.g. in exponential growth) the error becomes significant. This logic brings us to the definition of the median:
Given is a set of real numbers $x_i$ with i running from 1 to N. We define the median as a number $\tilde{x}$ so that
$$\sum_{i=1}^{N} |x_i - \tilde{x}| = \min. \tag{3}$$
Equation (3) is a non-analytic function of $\tilde{x}$. Of course, the minimum can be determined. One way is to use differentiation carefully: differentiating from the right and left, respectively, at non-analytic points. Either way the minimization problem of Equation (3) has the solution
$$\sum_{i=1}^{N} \operatorname{sgn}(x_i - \tilde{x}) = 0, \tag{4}$$
where sgn denotes the signum function defined as −1 for negative arguments, +1 for positive ones and zero else. As one sees, $\tilde{x}$ must be in “the middle” of the numbers $x_i$ in order to fulfill Equation (4). $\tilde{x}$ is therefore exactly what one calls the median.
Mean and median are a least squares fit and a least absolute value fit, respectively, where the fit function is a constant. In order to find means or medians for a continuous distribution one has to replace the sums by integrals.
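The two minimization properties can be checked numerically. The following is a minimal sketch, not part of the original derivation: the sample data and the helper names (`sum_sq`, `sum_abs`, `grid_argmin`) are purely illustrative assumptions. It searches a fine grid of candidate constants and compares the best candidates with the mean and median from Python's standard library.

```python
import statistics

def sum_sq(c, xs):
    """Sum of squared deviations of the data from the constant c."""
    return sum((x - c) ** 2 for x in xs)

def sum_abs(c, xs):
    """Sum of absolute deviations of the data from the constant c."""
    return sum(abs(x - c) for x in xs)

def grid_argmin(cost, xs, steps=10001):
    """Brute-force search for the constant that minimizes cost on a grid."""
    lo, hi = min(xs), max(xs)
    candidates = [lo + (hi - lo) * k / (steps - 1) for k in range(steps)]
    return min(candidates, key=lambda c: cost(c, xs))

# Hypothetical sample with one value that is orders of magnitude larger.
data = [1.0, 2.0, 3.0, 4.0, 100.0]

best_sq = grid_argmin(sum_sq, data)    # minimizer of the squared deviations
best_abs = grid_argmin(sum_abs, data)  # minimizer of the absolute deviations
mean = statistics.mean(data)
median = statistics.median(data)

print("least squares constant:", best_sq, "mean:", mean)
print("least absolute constant:", best_abs, "median:", median)
```

With one data point far away from the others, the two fits differ strongly, which illustrates the remark that the difference matters when data vary over orders of magnitude.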
From this definition of mean and median it becomes clear that the use of mean and median is not optional depending on the situation. The median is the correct way and the mean is the approximation. If median and mean are similar, the mean is a good approximation. However, sometimes the mean gives something exact. Knowing the mean weight and the number of passengers, one knows the exact weight of all passengers combined. This may be practical, but it has nothing to do with a statistical interpretation, which is what mean and median are meant for.
3. Why the Underlying Differential Equation Must Be Considered
In this chapter we want to show that averages can be used even in non-linear functions as long as the underlying differential equation is linear. As a starting example consider the formula for unlimited population growth:
denotes the population at a time t and is the population at . b is the birth rate (number of children per woman) and is the birthrate when the population stays constant (typically ). is a constant depending essentially on the lifespan of the population. Because depends exponentially on the birthrate b, it appears doubtful to use an average birthrate. Some years ago we used Equation (5) as an exercise for graduate students: Half of the population has a birthrate and half of it . On average it yields . Of course, the population does not stay constant, because only in the beginning holds. Half of the population becomes extinct and the other half is reproducing rapidly. In a properly weighted average we have a time dependent average birthrate of
Of course, one must not insert of Equation (6) directly into Equation (5). Equation (5) is the solution of the differential equation
So one has to insert the of Equation (6) for b into Equation (7). The solution is:
If one took a realistic growth model with e.g. limited growth, the corresponding differential equation would be non-linear. A weighted average like in Equation (6) will not be possible in that case. This is important information for anyone dealing with population growth or decline. Such people use much more sophisticated models than Equation (7). Their differential equations are non-linear in almost all cases. Nevertheless, they use average birthrates only. So their results are generally wrong, yet it is hard to tell by how much. In order to check, one must have the distribution of birthrates. Such distributions cannot be found in statistical data banks. It is left to the reader to try some examples; it would also make an exercise for advanced graduate students. In what follows we will prove that the linearity of the underlying differential equation is essential for using averages.
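The two-group exercise described above can be checked numerically. The sketch below is an illustration under stated assumptions, not the authors' calculation: it assumes the linear growth law dN/dt = A (b − B0) N, and the constants `A`, `B0`, `D` and `N0` are hypothetical values chosen only for the demonstration. Half of the population has birthrate B0 + D, the other half B0 − D.

```python
import math

A = 0.03      # assumed rate constant (per year), hypothetical value
B0 = 2.1      # assumed replacement birthrate, hypothetical value
D = 0.5       # assumed deviation of the two groups from B0
N0 = 1000.0   # assumed initial total population

def group_sizes(t):
    """Sizes of the high- and low-birthrate groups at time t."""
    high = 0.5 * N0 * math.exp(A * D * t)
    low = 0.5 * N0 * math.exp(-A * D * t)
    return high, low

def exact_total(t):
    """Total population obtained by summing both groups."""
    return sum(group_sizes(t))

def weighted_birthrate(t):
    """Population-weighted, time-dependent average birthrate."""
    high, low = group_sizes(t)
    return ((B0 + D) * high + (B0 - D) * low) / (high + low)

def integrate_with_average(t_end, steps=20000):
    """Solve dN/dt = A * (b_avg(t) - B0) * N, using a midpoint rule
    for the integral of the time-dependent growth rate."""
    dt = t_end / steps
    n = N0
    for k in range(steps):
        t_mid = (k + 0.5) * dt
        n *= math.exp(A * (weighted_birthrate(t_mid) - B0) * dt)
    return n

t = 100.0
naive_total = N0 * math.exp(A * (B0 - B0) * t)  # constant average birthrate B0
print("sum of both groups:        ", exact_total(t))
print("weighted average in ODE:   ", integrate_with_average(t))
print("constant average in formula:", naive_total)
```

Under these assumptions the weighted average inserted into the differential equation reproduces the sum of both groups, whereas the constant average inserted into the growth formula predicts a constant population.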
Instead of Equation (7) we use a very general model for a function :
In a linear model, holds. The function g corresponds to the parameters of the differential equation. Without limitation we are just considering two functions and . If we are able to prove that the averaging is wrong for all non-linear functions, we have for sure shown that it will not work out for more than two functions. Furthermore, our proof can be extended easily to an arbitrary number of functions. Therefore we consider only two functions:
Needless to say that the functions g are analytic functions. Therefore a Taylor expansion is possible:
Please note that the a’s in Equation (11) have upper indices rather than exponents in contrast to the f’s in Equation (11). For the average f one can write by using Equations (10) and (11).
On the other hand, the average coefficient is given by
Comparing with the average coefficient from Equation (12) we have the following equation to hold:
Equation (14) is equivalent to
Because the different powers of f in Equation (15) are linearly independent, all corresponding powers must fulfill Equation (15) separately. This is generally impossible unless only one exponent occurs, namely the first power. In other words, the Taylor expansion of g contains only the linear term. This concludes the proof that averages can be used in linear differential equations only.
In this chapter we have shown that averages can be used even in non-linear equations as long as the underlying differential equation is linear. It is a plausible result, because the differential equation governs the situation. If it is linear, averages are fine to use. The solution of the differential equation is just the sum (integral) of the underlying microscopic interactions. If each interaction may use averages, so does its sum.
4. Financial Markets
Finance is far away from having models like population growth. At most one has heuristic models. The goal is to predict prices or at least probabilities for them. These models have grown more and more complex. The ultimate model has not been established, and the authors are convinced that it never will be (for more details see below and also  or  ). However, there is no doubt that these models will consist of non-linear differential equations. If the governing differential equations were linear, their solutions would be plane waves. This is in contrast to any observation of financial data. Furthermore, the “tools” in use are non-linear in most cases. Just as an example consider the Black-Scholes model . It is a model for pricing options. The details are not important here, but note that its partial differential equation is linear in the option price, whereas the resulting price depends non-linearly on parameters such as the volatility. (There are many more such models, also or especially in quantitative economics, see e.g.  ).
In order to use these models one needs parameters such as inflation rates, interest rates, investments for e.g. R & D, and so forth. For all these parameters one uses averages. Some are even defined as averages such as the inflation of a basket of goods over a year. Everything else would be virtually impossible. One would have to consider a huge number of variables changing every day or maybe every second. At first glance averaging appears reasonable because there is an interest in e.g. average prices. However, we have shown in Chapter 3 that one must not use average quantities in non-linear differential equations.
We come to the sad conclusion that almost all work in finance and quantitative economics suffers from these shortcomings. That such (wrong) calculations lead to at least sometimes correct results is of course far from being a justification: Ex falso quodlibet! (see Footnote 1). Especially in finance there may also be some herd effect if sufficiently many people believe in a certain model. Then it is nothing but a self-fulfilling prophecy, e.g. cf. .
However, there are many more shortcomings besides using averages in non-linear differential equations. There are many more variables than usually considered, and these variables appear to be important. It leads to the almost ludicrous result that the weather on Wall Street has an essential influence on stock prices . This comes as no surprise as it has been proven that prices of most financial products vary chaotically . Within chaos tiny changes in seemingly unimportant parameters have big effects in the end. Therefore, financial markets work similarly to gambling. However, they are not considered gambling. Otherwise there would be regulations to give the same odds to everybody.
To overcome these difficulties one has to:
• Use individual data instead of averages.
• Use many more presently unknown variables.
• Know any parameter/variable up to an extremely high accuracy due to chaos.
That is the reason why the authors are convinced that there will never be a proper model for financial markets. Please be aware that “extremely high accuracy” quite often means much more than $10^{1000}$ digits. From this, another problem arises. A computer has to perform these highly accurate calculations. In a very simple chaotic situation as mentioned in Chapter 6, we estimate calculation times of $10^{276}$ times the age of the universe on a 3.5 GHz processor. Even quantum computing would not help because it is currently only 100 million times faster than an ordinary computer. Our very simple chaotic calculation would still take $10^{268}$ times the age of the universe.
Calculating next week’s lottery numbers is simple by comparison. That is the main reason why considering conserved values has been suggested in  and proven in . By using it, all problems disappear. However, gaining money by trading financial products will also disappear. This comes as no surprise as such trading is nothing but a special form of gambling .
Furthermore, it is important to note that prices of financial products vary chaotically. As we will explain in Chapter 6, this means that the use of statistics (e.g. averages) is not without flaws.
5. Diffusion Model in Market Forecast
In this chapter we will comment on the use of the diffusion model of marketing. It is a tool for forecasting the future market share. Using averages in this model may lead to completely wrong results under certain circumstances . However, here we can present a way out by using a continuum limit, which is the reason why we comment on this particular model. Unfortunately, the continuum limit cannot be applied to the world of finance because stock prices and the like are not conserved quantities.
The use of the diffusion model in marketing started in the 1960s and it has been used ever since. There are several versions. We will consider the so-called logistic diffusion model. It is an iterative formula calculating the market share at time t from the market share at time $t-1$. The main idea behind it is that the product diffuses into the market, just as the smell of waffles sold in a shopping mall attracts more customers. The formula of the logistic diffusion model takes the form
b is a diffusion constant. A large b means that one will gain market share rapidly, and a small b implies slow growth or even shrinking. The constant M in the term is the natural limitation. Otherwise the market share would grow to infinity, which is unrealistic. It is similar to a growth limitation in a realistic growth model. From a mathematical point of view, one may always set $M = 1$. In doing so, one gets the following formula for the logistic diffusion model, which has also been used in  :
$$N_t = b\,N_{t-1}\,(1 - N_{t-1}). \tag{16}$$
If b approaches a certain value (≈3.5699) something strange happens. N changes very rapidly and seemingly randomly between 0 and 1. This comes as no surprise since Equation (16) is nothing but the logistic map, cf. Equation (19) of Chapter 6. This has been described as the end of the diffusion model in . But how can it be? Why is the market share “jumping” if it is growing sufficiently fast? How can a market share change chaotically though it is a conserved quantity (see Footnote 2)? Why is the market share varying between −∞ and +∞ for ?
As already mentioned in  , the constant b of Equation (16) is generally speaking a different one for each customer buying or not buying something. So one would probably have millions of different constants b. With such a huge number of fit parameters reality is described perfectly. On the sad side, it would make forecasts by using Equation (16) impossible. Nobody can estimate so many parameters. Therefore, one uses an average b in Equation (16). However, with the same proof as in Chapter 3 (but much simpler) one can show that one must not use averages in Equation (16). Using numbers one will find, however, that for small values of b there is only a minor error. If b approaches 4, the error becomes huge. In conclusion, using an average b in Equation (16) produced chaos effects that do not exist in reality. This is a good example to show that using averages may produce tremendous errors. The huge error has to do with the fact that Equation (16) shows chaos in a mathematical sense. Though we cannot calculate the corresponding error in the financial world (as argued in Footnote 1), it is assumed to be very large due to the fact that chaos is also present in e.g. stock prices .
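The size of the averaging error can be tried out with a few lines of code. The sketch below is only one simple, assumed reading of the situation: it takes the logistic map N_t = b N_{t-1} (1 − N_{t-1}) as the iteration, splits the customers into two equally large hypothetical groups with constants b − 0.1 and b + 0.1, and compares the average of the two resulting market shares with the market share obtained from the average b. The starting value 0.1 and the spread 0.1 are illustrative choices.

```python
def logistic_path(b, n0=0.1, steps=50):
    """Iterate N_t = b * N_{t-1} * (1 - N_{t-1}) and return all values."""
    path = [n0]
    for _ in range(steps):
        path.append(b * path[-1] * (1.0 - path[-1]))
    return path

def averaging_error(b_mean, spread=0.1, steps=50):
    """Largest gap between the average of two groups' paths and the
    single path computed from the average constant b_mean."""
    low = logistic_path(b_mean - spread, steps=steps)
    high = logistic_path(b_mean + spread, steps=steps)
    single = logistic_path(b_mean, steps=steps)
    return max(abs(0.5 * (l + h) - s) for l, h, s in zip(low, high, single))

err_small = averaging_error(1.5)  # small diffusion constant
err_large = averaging_error(3.8)  # diffusion constant close to 4

print("maximum averaging error for b = 1.5:", err_small)
print("maximum averaging error for b = 3.8:", err_large)
```

In this toy setting the error stays minor for the small constant and becomes a sizeable fraction of the whole market for the constant close to 4, in line with the statement above.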
Unlike the models of finance of the last chapter, in this example one can easily overcome the problem of using averages. The approach is the same as in diffusion in the physical world. The analog to Equation (16) is called the ballistic regime. There single molecules are considered. They scatter on each other or with other molecules. Depending on the details of each scattering the exchange of energy and momentum within each scattering is different. It makes thorough considerations next to impossible in the same way one cannot use Equation (16) with many different b. Because the number of molecules, the exchanged energy and momentum are conserved quantities, one may take averages over long time spans and long distances. The time spans and distances have to be so large, that within them many scatterings will take place, so that one can consider the average exchange of momentum and energy. Taking also into account symmetry considerations, this will lead to what is called hydrodynamics in physics.
Unfortunately, a similar approach is not possible in finance if one considers stock prices and the like. These are non-conserved quantities. The problem was first addressed in  and led to further research and eventually to the definition of a conserved value in finance and economics (  ,  ).
The market share is a perfectly conserved quantity. If the market share of one person or company goes up, it must go down somewhere else accordingly. Building upon this, it is now easy to perform a continuum limit in Equation (16). Details can be found in  , though it can be considered common sense. Equation (16) transforms into a differential equation:
Unlike Equation (16) one can easily solve Equation (17) in a closed form:
For small values of b the iterative solution of Equation (16) gives an almost identical result compared to Equation (18). Of course, Equation (18) is reasonable for any (positive) value of b. Similar formulas for other diffusion models and also for are easily obtained in the same way or can be found in . Though there is no chaos within a properly used diffusion model, chaos effects may be present in market forecasts. While it is impossible that the market share itself varies chaotically, the time to reach that market share can vary chaotically, because time is not a conserved quantity (see Footnote 3). The diffusion model of marketing has an artificial time variable because each time step has the length one. However, there are other market forecast procedures such as the one suggested in . There one explicitly determines the market share and the time to reach it. Depending on the detailed numbers, one may or may not find chaos effects. In the example in  they are explicitly proven.
So we can conclude that there never is chaos within the diffusion model. The wrong usage of averages seemingly produced chaos effects. Though one cannot show that the wrong use of averages produces a similarly tremendous effect in finance, it is at least highly plausible.
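The comparison between the iteration and its continuum limit can be sketched in code. The following is an illustration under an explicit assumption: the step N_t − N_{t-1} of the logistic map is replaced by a derivative, which gives dN/dt = (b − 1) N − b N², the ordinary logistic differential equation. This may differ in form from the differential equation and closed-form solution referred to in the text; the function names and the starting value 0.01 are hypothetical.

```python
import math

def iterate_map(b, n0, steps):
    """Iterate N_t = b * N_{t-1} * (1 - N_{t-1}) and return all values."""
    path = [n0]
    for _ in range(steps):
        path.append(b * path[-1] * (1.0 - path[-1]))
    return path

def continuum_solution(b, n0, t):
    """Closed-form solution of the assumed limit dN/dt = (b - 1) N - b N^2."""
    n_inf = (b - 1.0) / b  # saturation value of the market share
    return n_inf / (1.0 + (n_inf / n0 - 1.0) * math.exp(-(b - 1.0) * t))

def max_gap(b, n0=0.01, steps=100):
    """Largest difference between the iteration and the continuum solution."""
    path = iterate_map(b, n0, steps)
    return max(abs(path[t] - continuum_solution(b, n0, t))
               for t in range(steps + 1))

gap_slow = max_gap(1.1)  # slow diffusion
gap_fast = max_gap(3.9)  # fast diffusion, where the iteration jumps

print("maximum gap for b = 1.1:", gap_slow)
print("maximum gap for b = 3.9:", gap_fast)
```

For slow diffusion both descriptions nearly coincide, while for fast diffusion the continuum solution saturates smoothly and the iteration does not.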
6. Further Research
The purpose of this last chapter is to give some general remarks about the often mentioned word chaos. It is especially puzzling that chaotic variations appear so random that one can use them to produce random numbers. However, there are differences which we will point out. The Hausdorff dimension and the Lyapunov exponent (see e.g.  or  ) are the correct tools to evaluate chaos beyond its random look. They clarify and quantify the difference between chaos and randomness. Unfortunately, they can only be used if the chaotically varying variable is given by a mathematical formulation (equation). It is impossible to use them by considering a finite number of data points. Though we know at least from  that stock prices vary chaotically in many cases, we cannot see chaos in the stock prices quoted at the stock exchange. Evaluating them statistically is therefore far from being flawless, with no solution at hand. Therefore we leave it to further research. Here we are just explaining the problem.
Chaos effects have been known to mathematicians for more than a century. In the 1960s Edward Lorenz found that long-term weather forecasting is impossible due to chaos (butterfly effect). In the 1980s chaos became a common topic in physics. Starting in the 1990s it has been scrutinized in business and economics. Just as an example consider  or . Furthermore, chaos has also been used to explain less quantitative but nevertheless important things like the origin of war. In this context the phrase “drop of honey effect” has been coined in .
In this chapter we will introduce the maybe simplest mathematical model which shows chaos. It is the logistic map:
Equation (19) is mathematically identical to the logistic diffusion model of Equation (16). We have and as an iteration index. Starting with we will have , , , , and . The first iterations are obtained easily. The last ones are already much more complicated. Typically one has to take into account $10^{300}$ digits in order to get the correct results. These 1,000 numbers look like random numbers between 0 and 1. Indeed one finds
and also a nearly perfect equal distribution. One can also plot e.g. as a function of x. It looks identical to plotting a random number.
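Why so many digits are needed can be made plausible with exact rational arithmetic. The sketch below is an illustration, not the authors' computation: it assumes the logistic map with parameter 4 and the hypothetical starting value 3/10, and it uses Python's `fractions.Fraction` so that no rounding occurs. The number of digits of the exact result roughly doubles with every iteration.

```python
from fractions import Fraction

def exact_orbit(x0, steps, b=4):
    """Iterate x -> b * x * (1 - x) exactly with rational numbers."""
    x = Fraction(x0)
    orbit = [x]
    for _ in range(steps):
        x = b * x * (1 - x)
        orbit.append(x)
    return orbit

def float_orbit(x0, steps, b=4.0):
    """The same iteration in ordinary double precision."""
    x = float(x0)
    orbit = [x]
    for _ in range(steps):
        x = b * x * (1.0 - x)
        orbit.append(x)
    return orbit

start = Fraction(3, 10)  # hypothetical starting value
exact = exact_orbit(start, 12)
approx = float_orbit(start, 12)

# Number of decimal digits in the denominator after each iteration.
digits = [len(str(x.denominator)) for x in exact]
print("digits of the exact denominator:", digits)
print("exact vs. float after 12 steps:", float(exact[12]), approx[12])
```

Since the digit count doubles per step, an exact evaluation of 1,000 iterations would need on the order of 2 to the power 1000 digits, which is consistent with the very large digit counts mentioned in the text.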
The strange (chaotic) behavior will start at and is fully developed at . leads to a divergence. For one can show by e.g. complete induction that
Equation (20) makes it possible to calculate the 1,000 values of within a quite short computing time. Using Equation (19) directly, which is necessary if e.g. is chosen, one needs $10^{276}$ times the age of the universe as mentioned in Chapter 4. Please note that for any finite n Equations (19) and (20) are strictly speaking non-chaotic, though they look very chaotic for e.g. . Only for real chaos is present in a mathematical sense. One can also calculate the average of in the limit . As expected one will get
This is the value one expects for an equal distribution of the functional values between 0 and 1, although the mean alone does not prove such a distribution.
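A closed form for the fully chaotic logistic map can be checked against direct iteration. The sketch below rests on assumptions: it uses the parameter value 4 and the standard closed form sin²(2ⁿ arcsin √x) for the n-th iterate, which is assumed here to correspond to the closed-form expression referred to in the text. The grid size and n = 10 are illustrative choices that keep double precision sufficient.

```python
import math

def iterate(x, n):
    """Apply x -> 4 * x * (1 - x) exactly n times in double precision."""
    for _ in range(n):
        x = 4.0 * x * (1.0 - x)
    return x

def closed_form(x, n):
    """Standard closed form of the n-th iterate for the parameter value 4."""
    return math.sin(2 ** n * math.asin(math.sqrt(x))) ** 2

n = 10
grid = [(k + 0.5) / 200000 for k in range(200000)]  # midpoints in (0, 1)

# Largest deviation between direct iteration and the closed form.
max_diff = max(abs(iterate(x, n) - closed_form(x, n)) for x in grid)

# Average of the n-th iterate over the starting value x.
mean_value = sum(closed_form(x, n) for x in grid) / len(grid)

print("largest deviation:", max_diff)
print("average over x:", mean_value)
```

For moderate n both routes agree to many digits, and the average over the starting value is close to one half. For much larger n the direct iteration in double precision loses its accuracy, which is the practical reason a closed form is valuable.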
In order to see the difference between randomness and chaos we will introduce two common methods to detect chaos mathematically. The first is the Lyapunov exponent, which one will find in most textbooks about chaos such as . The Lyapunov exponent is defined as
Equation (22) holds for every function f not just the logistic map. However, f must be an iterative function. means chaos. By inserting from Equation (20) into Equation (22) the Lyapunov exponent for the logistic map ( ) is easily calculated to . It is (almost) independent of x. For certain values of x the logistic map will give 0 after a finite number of iterations. The values are:
As becomes a constant function after a finite number of operations, the differentiation in Equation (22) gives zero and the logarithm minus infinity.
Because an iterative function is a function of a function of a function …, and so forth, one may apply the chain rule for the differentiation in Equation (22) yielding a product. The logarithm transforms the product into a sum. After some rearrangement one finally gets
Inserting for f the logistic map of Equation (19) yields
Equation (25) is the only reasonable way to calculate the Lyapunov exponent of the logistic map for . Please note that a numerical calculation of the Lyapunov exponent for via Equation (25) is numerically still very challenging. For we know from above that Equation (25) will yield . The in Equation (25) look like random numbers between zero and one as stated above. So one might come up with the idea to calculate the Lyapunov exponent of random numbers via Equation (25). Naively trying it, one will get a result around 0.4. More careful considerations show that the limit in Equation (25) does not exist for random numbers. This has to do with the fact that random numbers come arbitrarily close to 0.5. Avoiding the values for x given in Equation (23), the values of the logistic map may come close to 0.5 but not arbitrarily close.
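The two finite-sample numbers mentioned above can be reproduced with a short experiment. The sketch below makes explicit assumptions: it takes the parameter value 4, uses the derivative 4 (1 − 2x) of the logistic map, and averages the logarithm of its absolute value over a finite number of points. The starting value 0.3, the sample size and the random seed are hypothetical. A double-precision orbit is not the exact orbit, and a finite average says nothing about whether the limit exists.

```python
import math
import random

def lyapunov_sum(values, b=4.0):
    """Average of ln|f'(x)| with f'(x) = b * (1 - 2x), skipping x == 0.5."""
    terms = [math.log(abs(b * (1.0 - 2.0 * x))) for x in values if x != 0.5]
    return sum(terms) / len(terms)

def logistic_orbit(x0, steps, b=4.0):
    """Values x_0, x_1, ... of the logistic map in double precision."""
    orbit = []
    x = x0
    for _ in range(steps):
        orbit.append(x)
        x = b * x * (1.0 - x)
    return orbit

# Finite-sample estimate along a (double precision) chaotic orbit.
chaotic = lyapunov_sum(logistic_orbit(0.3, 100000))

# The same sum evaluated naively for uniform pseudo-random numbers.
rng = random.Random(0)
noise = lyapunov_sum([rng.random() for _ in range(100000)])

print("chaotic orbit:", chaotic, " ln 2 =", math.log(2.0))
print("random numbers:", noise)
```

The orbit gives a value near ln 2, while the naive evaluation for random numbers gives a value around 0.4, so the two seemingly identical sequences are clearly distinguished by this quantity.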
So we have shown that the limit of Equation (25) does exist for a chaotically varying f. It is for . Using the seemingly identically varying random numbers yields a non-existing limit in Equation (25). Here the explanation for it is easy as stated in the last paragraph. However, having numerical data like e.g. stock prices one has to decide: Is it a random variation or a chaotic one? Any limits one forms may be completely different in the two cases (see Footnote 4). If one decides for a chaotic variation, one has to know how this chaos works. As stated, Lyapunov exponents are positive when chaos is present, but they may take any positive value.
Scrutinizing measured data that were not created by a known (or assumed) mathematical procedure is therefore highly risky. Ordinary statistics is at least doubtful. Therefore we called this last chapter further research, though it appears to be far from straightforward.
As mentioned above there is a second method to quantify chaos. The results there do not have the same dire consequences for finance as we got from considering the Lyapunov exponent. It may be however important for engineering and related sciences. The next method is the Hausdorff dimension. Its detailed definition can be found in any advanced textbook such as  or . Though the Hausdorff dimension is defined in any spatial dimension, we here just consider two dimensions. In a two-dimensional plane one may have objects of dimension 0 (dots), dimension 1 (lines or curves), or dimension 2 (e.g. a filled triangle). The Hausdorff dimension is a generalization of this approach which allows non-integer dimensions. Its definition goes as follows. One has to cover the objects in a plane with N circles of diameter l. When l goes to zero, N will go to infinity—at least in most cases. In that limit one may write
The exponent D determines how fast the number of circles goes to infinity. It is called the Hausdorff dimension. If one has M dots in a plane, one needs M circles to cover the dots. So with and Equation (26) is fulfilled. The Hausdorff Dimension is in this case identical to the ordinary dimension. Considering a square with side length c, Equation (26) is fulfilled for and as it should because a square is a two-dimensional object.
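The covering argument for dots, lines and squares can be tried numerically. The sketch below swaps in a related technique and says so plainly: instead of circles it counts square boxes of side length l on a grid (box counting), which gives the same dimension for these simple objects. The point sets, the two box sizes and the function names are illustrative assumptions.

```python
import math

def count_boxes(points, size):
    """Number of grid boxes of the given side length that contain a point."""
    return len({(math.floor(x / size), math.floor(y / size))
                for x, y in points})

def dimension_estimate(points, coarse=1 / 8, fine=1 / 64):
    """Slope of log N against log(1/l) between two box sizes."""
    n_coarse = count_boxes(points, coarse)
    n_fine = count_boxes(points, fine)
    return math.log(n_fine / n_coarse) / math.log(coarse / fine)

# Five isolated dots, a densely sampled diagonal line, a densely sampled square.
dots = [(0.1, 0.1), (0.3, 0.7), (0.6, 0.2), (0.9, 0.9), (0.5, 0.5)]
line = [(i / 4096, i / 4096) for i in range(4096)]
square = [(i / 256, j / 256) for i in range(256) for j in range(256)]

print("dots:  ", dimension_estimate(dots))
print("line:  ", dimension_estimate(line))
print("square:", dimension_estimate(square))
```

The estimates come out as 0, 1 and 2, matching the ordinary dimensions. The samples must be much denser than the finest box size, otherwise a line or square is mistaken for a finite set of dots.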
In order to get non-integer Hausdorff dimensions, consider from Equation (20). For any finite n it is a curve oscillating times up and down between 0 and 1. This line has a Hausdorff dimension of 1. Taking the limit is slightly tricky but a rigorous calculation yields . So in the limit from Equation (20) becomes truly chaotic showing a fractal dimension. A fractal dimension is a rigorous proof of chaos like a positive Lyapunov exponent. Please note that a positive Lyapunov exponent and a fractal Hausdorff dimension both prove chaos, but there is no algebraic connection between them, because the Hausdorff dimension is a global measure while the Lyapunov exponent depends on the variable (here x).
Instead of considering from Equation (20) one may consider a function mapping the interval to a random number between 0 and 1 (and 1 and 0 to 0). As stated, this function looks identical to . However, it is a filled square having a Hausdorff dimension of 2. So we have a second difference between randomness and chaos. In this case we have or , respectively.
As a result, chaotically varying quantities look random. Some limits and averages are identical whether random numbers or chaotically varying ones are considered. Others, such as the Lyapunov exponent or the Hausdorff dimension, are completely different. A statistical analysis of experimental data such as stock prices is therefore generally impossible, because one does not know whether they are random or chaotic. Even if one has proven or at least assumed chaos, it is impossible to determine the mathematical form of this chaos such as its Lyapunov exponent or Hausdorff dimension.
We are indebted to our colleague Sascha Fabian who showed in his lectures that averages are sometimes complete nonsense. He also gave us the hint to  where chaos effects in the diffusion model are discussed.
Footnote 1: It is not possible to show the exact margin of error due to averaging without considering a particular model. Even then we do not know the correct result in order to calculate an error. As the mathematical expression ex falso quodlibet indicates, making a wrong assumption can “prove” anything. 1 = 2 implies not only 2 = 3 but also $10^{30} = 0$.
Footnote 2: Admittedly conserved quantities in this sense were first mentioned in 2011 in  , many years after 1993 when  was published.
Footnote 3: The chaos effects in the weather forecast show the same behavior. As the amount of rain is a conserved quantity, it is well predictable. The exact time when (and where) the rain starts is by no means conserved. And indeed this time is practically unpredictable over a sufficiently long time period.
Footnote 4: Please note that a differentiation, integration, or Fourier transformation also implies building limits.