A celebrated, fundamental probability distribution for continuous random variables is the classical Gaussian distribution, named after the German mathematician Karl Friedrich Gauss following his work of 1809.
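Definition 1.1 Let $\mu$ and $\sigma$ be constants with $-\infty<\mu<\infty$ and $\sigma>0$. The function
$$f(x;\mu,\sigma)=\frac{1}{\sigma\sqrt{2\pi}}\,e^{-\frac{(x-\mu)^2}{2\sigma^2}},\qquad -\infty<x<\infty, \tag{1}$$
is called the normal probability density function of a random variable X with parameters $\mu$ and $\sigma$.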
In both theory and applications, the Gaussian distribution is unequivocally the most essential and most widely referenced distribution in statistics.
The well-known method of deriving this distribution first appeared in the second edition of The Doctrine of Chances by Abraham de Moivre, published in 1738 (hence the name de Moivre-Laplace limit theorem). The mathematical statement of the theorem follows.
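Theorem 1.1 (de Moivre-Laplace limit theorem) As n grows large ($n\to\infty$), for x in the neighborhood of np and for moderate values of p ($p\neq 0$ and $p\neq 1$), we can approximate
$$\binom{n}{x}p^{x}q^{n-x}\simeq\frac{1}{\sqrt{2\pi npq}}\,e^{-\frac{(x-np)^2}{2npq}},\qquad p+q=1,\ p,q>0. \tag{2}$$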
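Explicitly, the theorem asserts the following. Suppose n is a positive integer, and let p and q be probabilities with $p+q=1$. The function
$$b(x;n,p)=\binom{n}{x}p^{x}q^{n-x},\qquad x=0,1,\dots,n, \tag{3}$$
called the binomial probability function, converges to the probability density function of the normal distribution with mean np and standard deviation $\sqrt{npq}$ as $n\to\infty$.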
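The convergence stated in the theorem can be illustrated numerically. The sketch below compares exact binomial probabilities with the normal density having mean np and standard deviation $\sqrt{npq}$. The function names (`binomial_pmf`, `normal_pdf`, `max_abs_gap`), the window of three standard deviations around np, and the sample values of n and p are illustrative assumptions and are not part of the theorem.

```python
import math


def binomial_pmf(x: int, n: int, p: float) -> float:
    """Exact binomial probability C(n, x) p^x q^(n - x), computed via logarithms."""
    q = 1.0 - p
    log_comb = math.lgamma(n + 1) - math.lgamma(x + 1) - math.lgamma(n - x + 1)
    return math.exp(log_comb + x * math.log(p) + (n - x) * math.log(q))


def normal_pdf(x: float, mu: float, sigma: float) -> float:
    """Normal density with mean mu and standard deviation sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def max_abs_gap(n: int, p: float, width: float = 3.0) -> float:
    """Largest |binomial - normal| for x within `width` standard deviations of np."""
    mu = n * p
    sigma = math.sqrt(n * p * (1.0 - p))
    lo = max(0, math.floor(mu - width * sigma))
    hi = min(n, math.ceil(mu + width * sigma))
    return max(
        abs(binomial_pmf(x, n, p) - normal_pdf(x, mu, sigma))
        for x in range(lo, hi + 1)
    )


if __name__ == "__main__":
    # The gap should shrink as n grows (p = 0.3 is an arbitrary moderate value).
    for n in (50, 200, 1000):
        print(n, max_abs_gap(n, 0.3))
```

Running the script prints the largest discrepancy for each n; under these assumptions it decreases as n increases, in line with the theorem.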
Although de Moivre proved the result only for $p=\frac{1}{2}$, Feller extended and generalized the proof to all values of p (the probability of success in any trial) such that p is neither too small nor too large. Feller's result has since been expounded by other authors, and Bagui and Mehra (2016) used the uniqueness property of the moment generating function to prove the same theorem.
In this paper, we attempt to answer the following question: is there an alternative procedure for deriving the Gaussian probability density function, apart from the de Moivre-Laplace limit theorem approach, which relies heavily on many lemmas and theorems (the Stirling approximation formula, Maclaurin series expansion, etc.), as the existing proofs show?
2. Existing Technique
This section presents a summary proof of the existing de Moivre-Laplace limit theorem. First, we state and prove the most important lemma used in that theorem, the Stirling approximation principle.
Lemma 2.1 (Stirling Approximation Principle) Given an integer $n>0$, the factorial of a large number n can be replaced with the approximation
$$n!\approx\sqrt{2\pi n}\,\left(\frac{n}{e}\right)^{n}.$$
Proof 2.1 This lemma can be derived using the integral definition of the factorial,
$$n!=\int_{0}^{\infty}x^{n}e^{-x}\,dx. \tag{4}$$
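Note that the derivative of the logarithm of the integrand can be written
$$\frac{d}{dx}\ln\left(x^{n}e^{-x}\right)=\frac{d}{dx}\left(n\ln x-x\right)=\frac{n}{x}-1. \tag{5}$$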
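The integrand is sharply peaked, with the contribution important only near $x=n$. Therefore, let $x=n+\xi$ where $\xi\ll n$, and write
$$\ln\left(x^{n}e^{-x}\right)=n\ln(n+\xi)-(n+\xi)=n\ln n+n\ln\left(1+\frac{\xi}{n}\right)-n-\xi. \tag{6}$$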
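Recall that the Maclaurin series of $\ln(1+x)$ is $x-\frac{x^{2}}{2}+\frac{x^{3}}{3}-\cdots$. Therefore,
$$\ln\left(x^{n}e^{-x}\right)=n\ln n+n\left(\frac{\xi}{n}-\frac{\xi^{2}}{2n^{2}}+\cdots\right)-n-\xi=n\ln n-n-\frac{\xi^{2}}{2n}+\cdots \tag{7}$$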
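Taking the exponential of both sides of the preceding Equation (7) gives
$$x^{n}e^{-x}\approx e^{n\ln n}\,e^{-n}\,e^{-\frac{\xi^{2}}{2n}}=n^{n}e^{-n}e^{-\frac{\xi^{2}}{2n}}. \tag{8}$$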
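Plugging (8) into the integral expression for $n!$, that is, (4), gives
$$n!\approx\int_{-n}^{\infty}n^{n}e^{-n}e^{-\frac{\xi^{2}}{2n}}\,d\xi\approx n^{n}e^{-n}\int_{-\infty}^{\infty}e^{-\frac{\xi^{2}}{2n}}\,d\xi. \tag{9}$$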
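From (9), let $I=\int_{-\infty}^{\infty}e^{-\frac{\xi^{2}}{2n}}\,d\xi$ and treat $\xi$ and $\eta$ as dummy variables, such that
$$I^{2}=\int_{-\infty}^{\infty}\int_{-\infty}^{\infty}e^{-\frac{\xi^{2}+\eta^{2}}{2n}}\,d\xi\,d\eta. \tag{10}$$
Transforming from Cartesian to polar coordinates with $\xi=r\cos\theta$ and $\eta=r\sin\theta$ gives $\xi^{2}+\eta^{2}=r^{2}$, with the Jacobian (J) of the transformation being
$$J=\begin{vmatrix}\dfrac{\partial\xi}{\partial r} & \dfrac{\partial\xi}{\partial\theta}\\[2mm] \dfrac{\partial\eta}{\partial r} & \dfrac{\partial\eta}{\partial\theta}\end{vmatrix}=\begin{vmatrix}\cos\theta & -r\sin\theta\\ \sin\theta & r\cos\theta\end{vmatrix}=r. \tag{11}$$
Hence
$$I^{2}=\int_{0}^{2\pi}\int_{0}^{\infty}e^{-\frac{r^{2}}{2n}}\,r\,dr\,d\theta=2\pi n. \tag{12}$$
Therefore, $I=\sqrt{2\pi n}$. Substituting for I in (9) gives
$$n!\approx\sqrt{2\pi n}\;n^{n}e^{-n}. \tag{13}$$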
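The quality of the Stirling approximation $n!\approx\sqrt{2\pi n}\,(n/e)^{n}$ can be checked directly for moderate n. The following is a minimal sketch; the function names `stirling` and `relative_error` and the chosen values of n are illustrative assumptions.

```python
import math


def stirling(n: int) -> float:
    """Stirling approximation sqrt(2 pi n) * (n / e)^n (floats limit this to n <= 170)."""
    return math.sqrt(2.0 * math.pi * n) * (n / math.e) ** n


def relative_error(n: int) -> float:
    """Relative error of the approximation against the exact factorial."""
    exact = math.factorial(n)
    return abs(exact - stirling(n)) / exact


if __name__ == "__main__":
    for n in (5, 10, 50, 100):
        print(n, relative_error(n))
```

The relative error behaves roughly like $1/(12n)$, so the approximation improves steadily as n grows.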
We now prove Theorem 1.1 using the existing technique.
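Proof 2.2 Using the result of Lemma 2.1, Equation (3) can be rewritten as
$$\binom{n}{x}p^{x}q^{n-x}=\frac{n!}{x!\,(n-x)!}\,p^{x}q^{n-x}\approx\frac{\sqrt{2\pi n}\;n^{n}e^{-n}}{\sqrt{2\pi x}\;x^{x}e^{-x}\,\sqrt{2\pi(n-x)}\;(n-x)^{n-x}e^{-(n-x)}}\,p^{x}q^{n-x}=\sqrt{\frac{n}{2\pi x(n-x)}}\;\frac{n^{n}}{x^{x}(n-x)^{n-x}}\,p^{x}q^{n-x}. \tag{14}$$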
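Writing $n^{n}=n^{x}\,n^{n-x}$ in Equation (14) and collecting like powers gives
$$\binom{n}{x}p^{x}q^{n-x}\approx\sqrt{\frac{n}{2\pi x(n-x)}}\left(\frac{np}{x}\right)^{x}\left(\frac{nq}{n-x}\right)^{n-x}. \tag{15}$$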
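Since x is in the neighborhood of np, change variables to $x=np+\varepsilon$, where $\varepsilon$ measures the distance between the measured quantity x and the binomial mean np, so that $n-x=nq-\varepsilon$. Rewriting (15) in terms of $\varepsilon$ and simplifying gives
$$\binom{n}{x}p^{x}q^{n-x}\approx\sqrt{\frac{n}{2\pi(np+\varepsilon)(nq-\varepsilon)}}\left(1+\frac{\varepsilon}{np}\right)^{-(np+\varepsilon)}\left(1-\frac{\varepsilon}{nq}\right)^{-(nq-\varepsilon)}\approx\frac{1}{\sqrt{2\pi npq}}\left(1+\frac{\varepsilon}{np}\right)^{-(np+\varepsilon)}\left(1-\frac{\varepsilon}{nq}\right)^{-(nq-\varepsilon)}, \tag{16}$$
where the square-root factor uses $\varepsilon\ll np$ and $\varepsilon\ll nq$.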
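Note that $a=e^{\ln a}$ for $a>0$. Therefore, rewriting (16) in exponential form gives
$$\binom{n}{x}p^{x}q^{n-x}\approx\frac{1}{\sqrt{2\pi npq}}\exp\left\{-(np+\varepsilon)\ln\left(1+\frac{\varepsilon}{np}\right)-(nq-\varepsilon)\ln\left(1-\frac{\varepsilon}{nq}\right)\right\}. \tag{17}$$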
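Using the Maclaurin series, $\ln\left(1+\frac{\varepsilon}{np}\right)\approx\frac{\varepsilon}{np}-\frac{\varepsilon^{2}}{2n^{2}p^{2}}$ and similarly $\ln\left(1-\frac{\varepsilon}{nq}\right)\approx-\frac{\varepsilon}{nq}-\frac{\varepsilon^{2}}{2n^{2}q^{2}}$. So, to second order in $\varepsilon$, $(np+\varepsilon)\ln\left(1+\frac{\varepsilon}{np}\right)\approx\varepsilon+\frac{\varepsilon^{2}}{2np}$ and $(nq-\varepsilon)\ln\left(1-\frac{\varepsilon}{nq}\right)\approx-\varepsilon+\frac{\varepsilon^{2}}{2nq}$. As a result,
$$\binom{n}{x}p^{x}q^{n-x}\approx\frac{1}{\sqrt{2\pi npq}}\exp\left\{-\frac{\varepsilon^{2}}{2n}\left(\frac{1}{p}+\frac{1}{q}\right)\right\}=\frac{1}{\sqrt{2\pi npq}}\,e^{-\frac{\varepsilon^{2}}{2npq}}. \tag{18}$$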
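Recall that $x=np+\varepsilon$, which implies that $\varepsilon=x-np$. For the binomial distribution, $\mu=np$ and $\sigma^{2}=npq$, which implies that $\sigma=\sqrt{npq}$. Making the appropriate substitutions in Equation (18) yields
$$\binom{n}{x}p^{x}q^{n-x}\approx\frac{1}{\sigma\sqrt{2\pi}}\,e^{-\frac{(x-\mu)^{2}}{2\sigma^{2}}}. \tag{19}$$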
This confirms the theorem.
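The key step of this proof is the second-order Maclaurin expansion, which replaces the exponent $-(np+\varepsilon)\ln(1+\varepsilon/np)-(nq-\varepsilon)\ln(1-\varepsilon/nq)$ by $-\varepsilon^{2}/(2npq)$, where $\varepsilon=x-np$. The sketch below compares the two numerically. The function names and the sample values of n, p, and $\varepsilon$ are assumptions chosen only for illustration.

```python
import math


def exact_exponent(eps: float, n: int, p: float) -> float:
    """Exponent before the Maclaurin expansion (eps = x - np)."""
    q = 1.0 - p
    return (-(n * p + eps) * math.log1p(eps / (n * p))
            - (n * q - eps) * math.log1p(-eps / (n * q)))


def quadratic_exponent(eps: float, n: int, p: float) -> float:
    """Second-order approximation -eps^2 / (2 n p q)."""
    return -(eps ** 2) / (2.0 * n * p * (1.0 - p))


if __name__ == "__main__":
    n, p = 10_000, 0.4
    for eps in (5.0, 20.0, 50.0):
        print(eps, exact_exponent(eps, n, p), quadratic_exponent(eps, n, p))
```

The two exponents agree closely when $\varepsilon$ is small compared with np and nq, which is the regime assumed in the proof.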
Readers interested in the detailed proof of the theorem should consult the works listed in the references.
3. The Proposed Technique
Suppose a random experiment is performed in which a needle, dart, or similar object is thrown at the origin of the Cartesian plane with the aim of hitting the centre (see Figure 1).
Because human throwing is inconsistent and imperfect, the results vary from throw to throw and generate random errors. To make the derivation possible and less demanding, we make the following assumptions:
1) The errors are independent of the orientation of the coordinate system.
2) Errors in perpendicular directions are independent. This means that being too high does not alter the probability of being off to the right.
3) Small errors are more likely than large errors. That is, throws are more likely to land in region P than in either Q or R, since region P is closer to the target (origin). For the same reason, region Q is more likely than region R. Furthermore, a throw is more likely to hit region V than either S or T, since V has the larger surface area and the distances from the origin are approximately the same.
Figure 1. The possible results of the dart experiment.
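The three assumptions can be illustrated with a small simulation. The sketch below assumes, purely for illustration, that the horizontal and vertical errors are independent zero-mean normal variates (the form that the derivation eventually arrives at). It then checks that the spread is unchanged by a rotation of the axes, that the rotated coordinates remain uncorrelated, and that a disc around the target collects more hits than a surrounding ring of equal area. All function names, the seed, the sample size, and the rotation angle are hypothetical choices.

```python
import math
import random


def simulate_throws(count: int, sigma: float = 1.0, seed: int = 0):
    """Draw (x, y) errors as independent zero-mean normals (an assumption)."""
    rng = random.Random(seed)
    return [(rng.gauss(0.0, sigma), rng.gauss(0.0, sigma)) for _ in range(count)]


def rotate(points, angle: float):
    """Express the same points in axes rotated by `angle` radians."""
    c, s = math.cos(angle), math.sin(angle)
    return [(c * x + s * y, -s * x + c * y) for x, y in points]


def variance(values) -> float:
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def correlation(points) -> float:
    xs = [pt[0] for pt in points]
    ys = [pt[1] for pt in points]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in points) / len(points)
    return cov / math.sqrt(variance(xs) * variance(ys))


def fraction_within(points, radius: float) -> float:
    """Fraction of throws landing within `radius` of the origin."""
    return sum(1 for x, y in points if math.hypot(x, y) <= radius) / len(points)


if __name__ == "__main__":
    throws = simulate_throws(20_000)
    turned = rotate(throws, 0.7)
    print(variance([x for x, _ in throws]), variance([x for x, _ in turned]))
    print(correlation(turned))
    inner = fraction_within(throws, 1.0)
    ring = fraction_within(throws, math.sqrt(2.0)) - inner  # ring of equal area
    print(inner, ring)
```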
From Figure 2, let the probability of the needle falling in the vertical strip from x to $x+\Delta x$ be denoted by $f(x)\,\Delta x$. Similarly, let the probability of the needle falling in the horizontal strip from y to $y+\Delta y$ be $f(y)\,\Delta y$. Obviously, the function f cannot be constant, owing to the stochastic nature of the experiment. In this study, our interest is to obtain the form and characteristics of the function f. From the second assumption, the probability of the needle falling in the shaded region ABCD (see Figure 2) is
$$f(x)\,\Delta x\cdot f(y)\,\Delta y. \tag{20}$$
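Note that any region r units from the origin with area $\Delta x\,\Delta y$ has the same probability, which is a consequence of the assumption that the errors do not depend on the orientation. Denoting the corresponding density by $g(r)$, we can say that
$$g(r)\,\Delta x\,\Delta y=f(x)\,\Delta x\cdot f(y)\,\Delta y,\quad\text{that is,}\quad g(r)=f(x)\,f(y). \tag{21}$$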
Differentiating both sides of Equation (21) with respect to $\theta$, using the product rule, gives
$$\frac{dg(r)}{d\theta}=f(x)\,\frac{df(y)}{d\theta}+f(y)\,\frac{df(x)}{d\theta}. \tag{22}$$
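Here, $\frac{dg(r)}{d\theta}=0$ since $g(r)$ is independent of orientation. By transformation to polar coordinates, $x=r\cos\theta$ and $y=r\sin\theta$, we can rewrite the derivatives in Equation (22) as
$$0=f(x)\,\frac{d f(r\sin\theta)}{d\theta}+f(y)\,\frac{d f(r\cos\theta)}{d\theta}. \tag{23}$$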
Figure 2. The typical example of the experiment.
Using the chain rule of differentiation, (23) becomes
$$0=f(x)\,f'(y)\,r\cos\theta-f(y)\,f'(x)\,r\sin\theta. \tag{24}$$
Rewriting Equation (24) by replacing $r\cos\theta$ with x and $r\sin\theta$ with y yields
$$0=x\,f(x)\,f'(y)-y\,f(y)\,f'(x). \tag{25}$$
The above differential equation can be put in a form that can be solved by separation of variables:
$$\frac{f'(x)}{x\,f(x)}=\frac{f'(y)}{y\,f(y)}. \tag{26}$$
Since x and y are independent, this differential equation can hold for every x and y if and only if the ratio defined by (26) is a constant. That is, if
$$\frac{f'(x)}{x\,f(x)}=\frac{f'(y)}{y\,f(y)}=c. \tag{27}$$
Consider $\frac{f'(x)}{x\,f(x)}=c$ in (27) and rearrange to obtain
$$\frac{f'(x)}{f(x)}=c\,x. \tag{28}$$
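Integrating Equation (28) gives
$$\ln f(x)=\frac{c\,x^{2}}{2}+\ln k,\quad\text{that is,}\quad f(x)=k\,e^{\frac{c}{2}x^{2}}. \tag{29}$$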
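By the third assumption, the constant in the exponent must be negative. Replacing c by $-c$ with $c>0$, we write the probability function (29) as
$$f(x)=k\,e^{-\frac{c}{2}x^{2}},\qquad c>0. \tag{30}$$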
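The two defining properties of this solution can be checked numerically: for $f(x)=k\,e^{-\frac{c}{2}x^{2}}$ the ratio $f'(x)/(x f(x))$ should equal the same constant $-c$ at every x, and the product $f(x)f(y)$ should depend only on the distance r from the origin. The sketch below is a minimal check in which the values $k=1$ and $c=2$, the finite-difference step, and the function names are illustrative assumptions.

```python
import math


def f(x: float, k: float = 1.0, c: float = 2.0) -> float:
    """Candidate solution f(x) = k * exp(-c x^2 / 2) with c > 0."""
    return k * math.exp(-0.5 * c * x * x)


def separation_ratio(x: float, k: float = 1.0, c: float = 2.0, h: float = 1e-5) -> float:
    """f'(x) / (x f(x)), with f' estimated by a central difference."""
    derivative = (f(x + h, k, c) - f(x - h, k, c)) / (2.0 * h)
    return derivative / (x * f(x, k, c))


def joint_density(r: float, theta: float, k: float = 1.0, c: float = 2.0) -> float:
    """f(x) f(y) at the point with polar coordinates (r, theta)."""
    return f(r * math.cos(theta), k, c) * f(r * math.sin(theta), k, c)


if __name__ == "__main__":
    # The ratio should be close to -c = -2 for every x != 0.
    print([separation_ratio(x) for x in (0.3, 1.0, 1.7)])
    # The joint density should not change with the angle theta.
    print([joint_density(1.2, t) for t in (0.0, 0.5, 1.0, 2.0)])
```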
If there is a horizontal shift of the target from the origin to an arbitrary point $\mu$, which now marks the new centre, then the probability function in (30) becomes
$$f(x)=k\,e^{-\frac{c}{2}(x-\mu)^{2}}. \tag{31}$$
Differentiating (31) and setting the derivative equal to zero gives
$$f'(x)=-k\,c\,(x-\mu)\,e^{-\frac{c}{2}(x-\mu)^{2}}=0. \tag{32}$$
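Since $k\,c\,e^{-\frac{c}{2}(x-\mu)^{2}}\neq 0$, this implies $x=\mu$. Therefore, Equation (31) has its maximum value at $x=\mu$ and points of inflexion at $x=\mu\pm\frac{1}{\sqrt{c}}$. Evidently, (31) gives the basic form of the Gaussian distribution with constants k and c, and with the domain of X running from $-\infty$ to $\infty$. For Equation (31) to be regarded as a proper probability density function, the total area under the curve must be 1. That is,
$$\int_{-\infty}^{\infty}k\,e^{-\frac{c}{2}(x-\mu)^{2}}\,dx=1. \tag{33}$$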
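For a symmetric function $f(x)$, $\int_{-a}^{a}f(x)\,dx=2\int_{0}^{a}f(x)\,dx$. Applying this property to Equation (33), whose integrand is symmetric about $x=\mu$, yields
$$2k\int_{\mu}^{\infty}e^{-\frac{c}{2}(x-\mu)^{2}}\,dx=1,\quad\text{that is,}\quad\int_{\mu}^{\infty}e^{-\frac{c}{2}(x-\mu)^{2}}\,dx=\frac{1}{2k}. \tag{34}$$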
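Squaring both sides of (34) gives
$$\left(\int_{\mu}^{\infty}e^{-\frac{c}{2}(x-\mu)^{2}}\,dx\right)\left(\int_{\mu}^{\infty}e^{-\frac{c}{2}(y-\mu)^{2}}\,dy\right)=\frac{1}{4k^{2}}. \tag{35}$$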
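This is possible since x and y are just dummy variables. Recall that x and y are also independent, so we can write the product on the left-hand side of (35) as a double integral:
$$\int_{\mu}^{\infty}\int_{\mu}^{\infty}e^{-\frac{c}{2}\left[(x-\mu)^{2}+(y-\mu)^{2}\right]}\,dx\,dy=\frac{1}{4k^{2}}. \tag{36}$$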
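Putting $s=x-\mu$ and $t=y-\mu$ in the preceding Equation (36) gives
$$\int_{0}^{\infty}\int_{0}^{\infty}e^{-\frac{c}{2}\left(s^{2}+t^{2}\right)}\,ds\,dt=\frac{1}{4k^{2}}. \tag{37}$$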
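The double integral (37) can be evaluated using polar coordinates, with the two integration variables written as
$$r\cos\theta\quad\text{and}\quad r\sin\theta,\qquad 0\le r<\infty,\ 0\le\theta\le\frac{\pi}{2}, \tag{38}$$
so that the sum of their squares is $r^{2}$, and with the Jacobian (J) of the transformation being
$$J=\begin{vmatrix}\cos\theta & -r\sin\theta\\ \sin\theta & r\cos\theta\end{vmatrix}=r. \tag{39}$$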
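So, Equation (37) now becomes
$$\int_{0}^{\pi/2}\int_{0}^{\infty}e^{-\frac{c}{2}r^{2}}\,r\,dr\,d\theta=\frac{1}{4k^{2}}. \tag{40}$$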
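Evaluating the double integral in Equation (40) by first letting $w=\frac{c}{2}r^{2}$, so that $dw=c\,r\,dr$ and the left-hand side equals $\frac{\pi}{2c}$, and then solving for k in the resulting equation yields
$$k=\sqrt{\frac{c}{2\pi}}. \tag{41}$$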
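The value $k=\sqrt{c/(2\pi)}$ can be confirmed by numerical integration: with this constant, the area under $k\,e^{-\frac{c}{2}(x-\mu)^{2}}$ should be 1 for any $c>0$ and any $\mu$. The sketch below uses a simple trapezoidal rule. The helper names, the truncation of the infinite range to twelve multiples of $1/\sqrt{c}$ on either side of $\mu$, and the sample values of c and $\mu$ are assumptions made for illustration.

```python
import math


def trapezoid(func, a: float, b: float, steps: int = 20_000) -> float:
    """Composite trapezoidal rule for the integral of func over [a, b]."""
    h = (b - a) / steps
    total = 0.5 * (func(a) + func(b))
    for i in range(1, steps):
        total += func(a + i * h)
    return total * h


def area_under_curve(c: float, mu: float) -> float:
    """Area under k * exp(-c (x - mu)^2 / 2) with k = sqrt(c / (2 pi))."""
    k = math.sqrt(c / (2.0 * math.pi))
    half_width = 12.0 / math.sqrt(c)  # the tails beyond this range are negligible
    return trapezoid(
        lambda x: k * math.exp(-0.5 * c * (x - mu) ** 2),
        mu - half_width,
        mu + half_width,
    )


if __name__ == "__main__":
    for c, mu in ((0.5, 0.0), (2.5, 1.0), (9.0, -3.0)):
        print(c, mu, area_under_curve(c, mu))
```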
Putting (41) in (31), the probability density function, $f(x)$, becomes
$$f(x)=\sqrt{\frac{c}{2\pi}}\;e^{-\frac{c}{2}(x-\mu)^{2}}. \tag{42}$$
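Again, integrating a probability density function over its domain gives 1. Therefore, from (42),
$$\sqrt{\frac{c}{2\pi}}\int_{-\infty}^{\infty}e^{-\frac{c}{2}(x-\mu)^{2}}\,dx=1. \tag{43}$$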
Further simplification of the preceding Equation (43) gives
$$\int_{-\infty}^{\infty}e^{-\frac{c}{2}(x-\mu)^{2}}\,dx=\sqrt{\frac{2\pi}{c}}. \tag{44}$$
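One of the important goals in the mathematical theory of statistics is to obtain the mean and variance of any probability function under study. The mean, $\mu$, is defined to be the value of the integral $\int_{-\infty}^{\infty}x\,f(x)\,dx$. The variance, $\sigma^{2}$, is the value of the integral $\int_{-\infty}^{\infty}(x-\mu)^{2}f(x)\,dx$. Therefore, using Equation (42),
$$\sigma^{2}=\sqrt{\frac{c}{2\pi}}\int_{-\infty}^{\infty}(x-\mu)^{2}\,e^{-\frac{c}{2}(x-\mu)^{2}}\,dx, \tag{45}$$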
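or equivalently as
$$\sigma^{2}=\sqrt{\frac{c}{2\pi}}\int_{-\infty}^{\infty}(x-\mu)\cdot(x-\mu)\,e^{-\frac{c}{2}(x-\mu)^{2}}\,dx. \tag{46}$$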
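Consider Equation (46). Using integration by parts ($\int u\,dv=uv-\int v\,du$) with $u=x-\mu$ and $dv=(x-\mu)\,e^{-\frac{c}{2}(x-\mu)^{2}}\,dx$, so that $v=-\frac{1}{c}\,e^{-\frac{c}{2}(x-\mu)^{2}}$, we have
$$\sigma^{2}=\sqrt{\frac{c}{2\pi}}\left\{\left[-\frac{(x-\mu)}{c}\,e^{-\frac{c}{2}(x-\mu)^{2}}\right]_{-\infty}^{\infty}+\frac{1}{c}\int_{-\infty}^{\infty}e^{-\frac{c}{2}(x-\mu)^{2}}\,dx\right\}=\frac{1}{c}\sqrt{\frac{c}{2\pi}}\int_{-\infty}^{\infty}e^{-\frac{c}{2}(x-\mu)^{2}}\,dx.$$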
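Putting (44) in the preceding equation gives
$$\sigma^{2}=\frac{1}{c}\sqrt{\frac{c}{2\pi}}\,\sqrt{\frac{2\pi}{c}}=\frac{1}{c},\quad\text{that is,}\quad c=\frac{1}{\sigma^{2}}. \tag{47}$$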
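Substituting (47) in (42), the derived probability density function has the form
$$f(x)=\frac{1}{\sigma\sqrt{2\pi}}\,e^{-\frac{(x-\mu)^{2}}{2\sigma^{2}}},\qquad -\infty<x<\infty. \tag{48}$$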
Based on the three basic assumptions stated above, we have derived Equation (48), universally known as the Normal or Gaussian distribution function with mean $\mu$ and standard deviation $\sigma$.
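The relation between the constant c and the variance can also be checked numerically. Assuming the density $\sqrt{c/(2\pi)}\,e^{-\frac{c}{2}(x-\mu)^{2}}$, the sketch below estimates the mean and variance by trapezoidal integration and compares the variance with $1/c$. The helper names, the truncated integration range, and the sample values of c and $\mu$ are illustrative assumptions.

```python
import math


def trapezoid(func, a: float, b: float, steps: int = 20_000) -> float:
    """Composite trapezoidal rule for the integral of func over [a, b]."""
    h = (b - a) / steps
    total = 0.5 * (func(a) + func(b))
    for i in range(1, steps):
        total += func(a + i * h)
    return total * h


def mean_and_variance(c: float, mu: float):
    """Numerical mean and variance of sqrt(c / (2 pi)) * exp(-c (x - mu)^2 / 2)."""
    k = math.sqrt(c / (2.0 * math.pi))

    def pdf(x: float) -> float:
        return k * math.exp(-0.5 * c * (x - mu) ** 2)

    half_width = 12.0 / math.sqrt(c)  # the tails beyond this range are negligible
    a, b = mu - half_width, mu + half_width
    mean = trapezoid(lambda x: x * pdf(x), a, b)
    var = trapezoid(lambda x: (x - mean) ** 2 * pdf(x), a, b)
    return mean, var


if __name__ == "__main__":
    c, mu = 4.0, -1.5
    mean, var = mean_and_variance(c, mu)
    print(mean, var, 1.0 / c)  # the variance should be close to 1 / c
```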
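To verify that Equation (48) is a proper probability density function with parameters $\mu$ and $\sigma$, we show that the integral
$$I=\int_{-\infty}^{\infty}\frac{1}{\sigma\sqrt{2\pi}}\,e^{-\frac{(x-\mu)^{2}}{2\sigma^{2}}}\,dx$$
is equal to 1.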
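Change the variable of integration by letting $z=\frac{x-\mu}{\sigma}$, which implies that $dx=\sigma\,dz$. Then
$$I=\frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty}e^{-\frac{z^{2}}{2}}\,dz\quad\text{and}\quad I^{2}=\frac{1}{2\pi}\int_{-\infty}^{\infty}\int_{-\infty}^{\infty}e^{-\frac{z^{2}+y^{2}}{2}}\,dz\,dy.$$
Here z and y are dummy variables. Switching to polar coordinates by making the substitutions $z=r\cos\theta$ and $y=r\sin\theta$ produces r as the Jacobian of the transformation. So
$$I^{2}=\frac{1}{2\pi}\int_{0}^{2\pi}\int_{0}^{\infty}e^{-\frac{r^{2}}{2}}\,r\,dr\,d\theta.$$
Put $v=\frac{r^{2}}{2}$, so that $dv=r\,dr$. Therefore,
$$I^{2}=\frac{1}{2\pi}\int_{0}^{2\pi}\left[\int_{0}^{\infty}e^{-v}\,dv\right]d\theta=\frac{1}{2\pi}\int_{0}^{2\pi}d\theta=1.$$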
Thus $I=1$, indicating that (48) is a proper probability density function. Other properties of the distribution, such as the moments, moment generating function, cumulant generating function, characteristic function, and parameter estimation, can be found in the literature.
In pursuing the stated objective, we have established that there exists an approach that not only serves as an alternative derivation of the Gaussian probability density function but is also free from rigorous mathematical analysis and independent of auxiliary lemmas and theorems. This paper can be classified as a theoretical study of the Gaussian distribution and can serve as a teaching reference in probability and statistics classes, where the only background requirements are basic calculus, skill with algebraic expressions, Maclaurin series expansion, and the Euler integral of the second kind (gamma function).
The authors are highly grateful to the editor and the anonymous referees for reading through the manuscript and for their constructive comments and suggestions, which helped to improve the revised version of the paper.
Lesigne, E. (2005) Heads or Tails: An Introduction to Limit Theorems in Probability. Student Mathematical Library, Vol. 28, American Mathematical Society, Providence.
 Bagui, S.C., Bhaumik, D.K. and Mehra, K.L. (2013) A Few Counter Examples Useful in Teaching Central Limit Theorem. The American Statistician, 67, 49-56.
 Bagui, S.C. and Mehra, K.L. (2016) Convergence of Binomial, Poisson, Negative-Binomial, and Gamma to Normal Distribution: Moment Generating Functions Technique. American Journal of Mathematics and Statistics, 6, 115-121.
 Young, G.A. and Smith, R.L. (2005) Essentials of Statistical Inference. Cambridge Series in Statistical and Probabilistic Mathematics, Cambridge University Press, Cambridge.