On the Mean Difference Variance in Random Samples of Student’s Variables


1. The Mean Difference Variance General Expression

Let *X* be a continuous random variable with density function *f*(*x*) and distribution function *F*(*x*), and let
${X}_{1},{X}_{2},\cdots ,{X}_{n}$ be a simple random sample from this population. The sample mean difference is

$\bar{\Delta}=\frac{\sum_{i=1}^{n}\sum_{j=1}^{n}\left|{X}_{i}-{X}_{j}\right|}{n\left(n-1\right)}$ (1)
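As a small illustration of definition (1), the following sketch computes the sample mean difference of a list of numbers. The function name `sample_mean_difference` is ours and does not come from the paper; the code relies on the fact that each unordered pair appears twice in the double sum.

```python
from itertools import combinations


def sample_mean_difference(xs):
    """Sample mean difference of eq. (1): the sum of |x_i - x_j| over all
    ordered pairs (i, j), divided by n(n - 1)."""
    n = len(xs)
    if n < 2:
        raise ValueError("at least two observations are required")
    # Terms with i == j vanish, and each unordered pair {i, j} is counted
    # twice in the double sum, hence the factor 2.
    total = 2.0 * sum(abs(a - b) for a, b in combinations(xs, 2))
    return total / (n * (n - 1))


# Pairwise absolute differences of (1, 2, 4) are 1, 3, 2,
# so the statistic is 2 * 6 / (3 * 2).
print(sample_mean_difference([1.0, 2.0, 4.0]))  # 2.0
```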

The expected value of $\bar{\Delta}$ is equal to the mean difference of the population

$\Delta ={\displaystyle \underset{-\infty}{\overset{+\infty}{\int}}{\displaystyle \underset{-\infty}{\overset{+\infty}{\int}}\left|x-y\right|f\left(x\right)f\left(y\right)\text{d}x\text{d}y}}$ (2)

In 1952 Lomnicki [3] obtained the following general expression of the sample mean difference variance

$\mathrm{var}\left(\bar{\Delta}\right)=\frac{1}{n\left(n-1\right)}\left[4\left(n-1\right){\sigma}^{2}+16\left(n-2\right)I-2\left(2n-3\right){\Delta}^{2}\right]$ (3)

in which ${\sigma}^{2}$ and $\Delta$ are, respectively, the variance and the mean difference of the distribution under consideration, whereas

$I={\displaystyle \underset{-\infty}{\overset{+\infty}{\int}}G\left(x\right)H\left(x\right)f\left(x\right)\text{d}x}$ (4)

in which

$G\left(x\right)={\displaystyle \underset{-\infty}{\overset{x}{\int}}\left(x-y\right)f\left(y\right)\text{d}y}$ (5)

and

$H\left(x\right)=G\left(x\right)+\mu -x$, (6)

in which $\mu $ is the mean of the distribution.
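Formula (3) is straightforward to evaluate once ${\sigma}^{2}$, $I$ and $\Delta$ are available. The sketch below is only an illustration: the function name `lomnicki_variance` is ours, and the standard normal inputs ($\Delta =2/\sqrt{\pi}$ and $I=\sqrt{3}/\left(2\pi \right)-1/6$) are assumed values that are not derived in this section. For $n=2$ the term in $I$ drops out and the formula reduces to $2{\sigma}^{2}-{\Delta}^{2}$, the variance of $\left|{X}_{1}-{X}_{2}\right|$, which gives a quick sanity check.

```python
import math


def lomnicki_variance(n, sigma2, I, delta):
    """Variance of the sample mean difference according to eq. (3)."""
    if n < 2:
        raise ValueError("n must be at least 2")
    bracket = (4 * (n - 1) * sigma2
               + 16 * (n - 2) * I
               - 2 * (2 * n - 3) * delta ** 2)
    return bracket / (n * (n - 1))


# Assumed inputs for the standard normal model (not derived in the text).
sigma2 = 1.0
delta = 2 / math.sqrt(math.pi)
I = math.sqrt(3) / (2 * math.pi) - 1 / 6

for n in (2, 5, 10, 100):
    print(n, lomnicki_variance(n, sigma2, I, delta))
```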

The mean $\mu $ and the variance ${\sigma}^{2}$ are known for almost all distributions. For the mean difference $\Delta $, the known results are collected in Girone and Mazzitelli [4].

Hence, to determine the variance of the sample mean difference, only $I$ needs to be calculated.

2. I Expression for Odd g

The density function of Student's distribution is

$f\left(x\right)=\frac{{\left(1+{x}^{2}/g\right)}^{-\frac{g+1}{2}}}{\sqrt{g}B\left(g/2,1/2\right)},\quad -\infty <x<+\infty $ (7)

in which the parameter g is the number of degrees of freedom.

Using Mathematica for this model with $g=5,7,\cdots ,19$, we obtained the values of ${I}_{g}$ shown in Table 1.

Table 1. The obtained values of ${I}_{g}$ for $g=5,7,\cdots ,19$.

Each value of ${I}_{g}$ is the sum of two terms. The second term is easily represented by the following formula

$-\frac{g}{6\left(g-2\right)}$. (8)

The expression of the first term, denoted ${A}_{g}$, is harder to determine. After several attempts, comparing each first term with the previous one, we identified the recurrence

${A}_{g}={A}_{g-2}\frac{g\left(g-4\right){\left(g-3\right)}^{2}\left(3g-4\right)\left(3g-8\right)}{{\left(g-2\right)}^{4}\left(3g-7\right)\left(3g-11\right)}$, for $g=5,7,\cdots $ (9)

with the initial value ${A}_{3}=15/\left(2{\pi}^{2}\right)$.

From this relation we obtained the following expression of the first term of ${I}_{g}$ for odd values of g greater than 3:

${A}_{g}=\frac{\sqrt{3}g\Gamma {\left[\left(g-1\right)/2\right]}^{2}\Gamma \left(g/2-1/3\right)\Gamma \left(g/2+1/3\right)}{2\left(g-2\right)\Gamma {\left(g/2\right)}^{2}\Gamma \left(g/2-1/6\right)\Gamma \left(g/2-5/6\right)\pi}$, (10)

and then the ${I}_{g}$ expression

${I}_{g}=\frac{\sqrt{3}g\Gamma {\left[\left(g-1\right)/2\right]}^{2}\Gamma \left(g/2-1/3\right)\Gamma \left(g/2+1/3\right)}{2\left(g-2\right)\Gamma {\left(g/2\right)}^{2}\Gamma \left(g/2-1/6\right)\Gamma \left(g/2-5/6\right)\pi}-\frac{g}{6\left(g-2\right)}$ (11)
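The closed form (10) can be checked against recurrence (9) numerically. The sketch below uses only Python's standard `math` module; the function names are ours, and log-gamma values are used instead of `math.gamma` simply to avoid overflow for larger g. It is a numerical check for the listed values of g, not a proof.

```python
import math


def gamma_ratio(g):
    """Ratio of the four gamma functions with shifted arguments in eq. (10)."""
    a = g / 2
    return math.exp(math.lgamma(a - 1 / 3) + math.lgamma(a + 1 / 3)
                    - math.lgamma(a - 1 / 6) - math.lgamma(a - 5 / 6))


def A_closed(g):
    """First term of I_g, closed form of eq. (10)."""
    sq = math.exp(2 * (math.lgamma((g - 1) / 2) - math.lgamma(g / 2)))
    return math.sqrt(3) * g * sq * gamma_ratio(g) / (2 * (g - 2) * math.pi)


def A_by_recurrence(g):
    """First term of I_g for odd g >= 3, iterating eq. (9) from A_3."""
    a = 15 / (2 * math.pi ** 2)  # initial value A_3
    for k in range(5, g + 1, 2):
        num = k * (k - 4) * (k - 3) ** 2 * (3 * k - 4) * (3 * k - 8)
        den = (k - 2) ** 4 * (3 * k - 7) * (3 * k - 11)
        a *= num / den
    return a


def I_odd(g):
    """I_g according to eq. (11)."""
    return A_closed(g) - g / (6 * (g - 2))


for g in range(5, 21, 2):
    print(g, A_by_recurrence(g), A_closed(g), I_odd(g))
```

The two columns for ${A}_{g}$ should agree up to floating-point rounding.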

3. I Expression for Even g

Using Mathematica again for $g=4,6,\cdots ,20$, we obtained the values of ${I}_{g}$ shown in Table 2.

Table 2. The obtained values of ${I}_{g}$ for $g=4,6,\cdots ,20$.

The second term of ${I}_{g}$ is again represented by the simple formula

$-\frac{g}{6\left(g-2\right)}$. (12)

After several attempts, comparing each first term with the previous one, we identified the recurrence

${A}_{g}={A}_{g-2}\frac{g\left(g-4\right){\left(g-3\right)}^{2}\left(3g-4\right)\left(3g-8\right)}{{\left(g-2\right)}^{4}\left(3g-7\right)\left(3g-11\right)}$, for $g=6,8,\cdots $ (13)

with the initial value ${A}_{4}=8/15$. Note that the recurrence is the same as in the odd case.

From this relation we obtained the following expression of the first term of ${I}_{g}$ for even values of g greater than 4:

${A}_{g}=\frac{{2}^{3-2g}\sqrt{3}g\left(g-2\right)\Gamma {\left(g-2\right)}^{2}\Gamma \left(g/2-1/3\right)\Gamma \left(g/2+1/3\right)}{\Gamma {\left(g/2\right)}^{4}\Gamma \left(g/2-1/6\right)\Gamma \left(g/2-5/6\right)}$ (14)

and then the ${I}_{g}$ expression

${I}_{g}=\frac{{2}^{3-2g}\sqrt{3}g\left(g-2\right)\Gamma {\left(g-2\right)}^{2}\Gamma \left(g/2-1/3\right)\Gamma \left(g/2+1/3\right)}{\Gamma {\left(g/2\right)}^{4}\Gamma \left(g/2-1/6\right)\Gamma \left(g/2-5/6\right)}-\frac{g}{6\left(g-2\right)}$ (15)

Using the duplication formula for the gamma function, it is easily verified that the two ${I}_{g}$ formulas for the odd and even cases coincide. A single, more compact expression is the following

${I}_{g}=\frac{{2}^{3-2g}\sqrt{3}g\Gamma \left(g/2-1/3\right)\Gamma \left(g/2+1/3\right)}{{\left(g-1\right)}^{2}\left(g-2\right)B{\left(g/2,g/2\right)}^{2}\Gamma \left(g/2-1/6\right)\Gamma \left(g/2-5/6\right)}-\frac{g}{6\left(g-2\right)}$ (16)
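The agreement between the odd-case formula (11) and the even-case formula (15) can also be checked numerically for both parities of g. This is a minimal sketch with function names of our own choosing; it evaluates both expressions through log-gamma values and prints their difference.

```python
import math


def gamma_ratio(g):
    """Ratio of the four gamma functions with shifted arguments,
    common to eqs. (11) and (15)."""
    a = g / 2
    return math.exp(math.lgamma(a - 1 / 3) + math.lgamma(a + 1 / 3)
                    - math.lgamma(a - 1 / 6) - math.lgamma(a - 5 / 6))


def I_odd_form(g):
    """I_g written as in eq. (11)."""
    sq = math.exp(2 * (math.lgamma((g - 1) / 2) - math.lgamma(g / 2)))
    first = math.sqrt(3) * g * sq * gamma_ratio(g) / (2 * (g - 2) * math.pi)
    return first - g / (6 * (g - 2))


def I_even_form(g):
    """I_g written as in eq. (15)."""
    log_front = ((3 - 2 * g) * math.log(2)
                 + 2 * math.lgamma(g - 2) - 4 * math.lgamma(g / 2))
    first = math.sqrt(3) * g * (g - 2) * math.exp(log_front) * gamma_ratio(g)
    return first - g / (6 * (g - 2))


for g in range(3, 21):
    a, b = I_odd_form(g), I_even_form(g)
    print(g, a, b, abs(a - b))
```

For $g=4$ both forms give ${A}_{4}-1/3=8/15-1/3=1/5$, in line with the initial value quoted for the even case.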

4. The Sample Mean Difference Variance

Recall that, for Student's distribution, the mean ( $\mu $ ), the variance ( ${\sigma}^{2}$ ) and the mean difference ( $\Delta $ ) are the following:

$\mu =0$, (17)

${\sigma}^{2}=\frac{g}{g-2}$, (18)

$\Delta =\frac{\sqrt{g}B\left[\left(g-1\right)/2,\left(g+1\right)/2\right]}{\left(2g-1\right)B\left(g,g\right)B\left(g/2,g/2\right){2}^{2g-3}}$. (19)
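Expressions (18) and (19) can be evaluated with the standard library alone by writing the beta function through log-gamma values. The sketch below uses helper names of our own; the comparison value $2/\sqrt{\pi}$ is the mean difference of the standard normal distribution, which $\Delta $ should approach as g grows.

```python
import math


def log_beta(a, b):
    """Logarithm of the beta function B(a, b)."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)


def student_variance(g):
    """Variance of Student's distribution, eq. (18); requires g > 2."""
    return g / (g - 2)


def student_mean_difference(g):
    """Mean difference of Student's distribution, eq. (19)."""
    log_num = 0.5 * math.log(g) + log_beta((g - 1) / 2, (g + 1) / 2)
    log_den = (math.log(2 * g - 1) + log_beta(g, g)
               + log_beta(g / 2, g / 2) + (2 * g - 3) * math.log(2))
    return math.exp(log_num - log_den)


for g in (3, 5, 10, 100, 10000):
    print(g, student_variance(g), student_mean_difference(g))
print("normal limit:", 2 / math.sqrt(math.pi))
```

For $g=3$ the beta functions in (19) reduce to elementary values and the result is $3\sqrt{3}/\pi $.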

Using Lomnicki's formula (3), we obtained the following expression of the variance of the sample mean difference for Student's distribution:

$\begin{array}{c}\mathrm{var}\left(\bar{\Delta}\right)=\frac{4g\left(n+1\right)}{3\left(g-2\right)n\left(n-1\right)}+\frac{{2}^{5-2g}g\Gamma {\left(g-1/2\right)}^{2}\Gamma {\left[\left(g-1\right)/2\right]}^{2}\left(3-2n\right)}{\Gamma {\left(g/2\right)}^{6}n\left(n-1\right)}\\ +\frac{8\sqrt{3}g\Gamma {\left[\left(g-1\right)/2\right]}^{2}h\left(g\right)\left(n-2\right)}{\pi \left(g-2\right)\Gamma {\left(g/2\right)}^{2}n\left(n-1\right)}\end{array}$ (20)

in which

$h\left(g\right)=\frac{\Gamma \left(g/2-1/3\right)\Gamma \left(g/2+1/3\right)}{\Gamma \left(g/2-1/6\right)\Gamma \left(g/2-5/6\right)}$ (21)
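As a consistency check, expression (20) can be compared with Lomnicki's formula (3) evaluated from the separate ingredients (11), (18) and (19). The sketch below is illustrative only; all function names are ours, and gamma and beta functions are handled through `math.lgamma`.

```python
import math

lg = math.lgamma


def log_beta(a, b):
    """Logarithm of the beta function B(a, b)."""
    return lg(a) + lg(b) - lg(a + b)


def h(g):
    """Gamma-function ratio of eq. (21)."""
    a = g / 2
    return math.exp(lg(a - 1 / 3) + lg(a + 1 / 3) - lg(a - 1 / 6) - lg(a - 5 / 6))


def gamma_sq_ratio(g):
    """Gamma((g-1)/2)^2 / Gamma(g/2)^2."""
    return math.exp(2 * (lg((g - 1) / 2) - lg(g / 2)))


def var_closed_form(g, n):
    """Variance of the sample mean difference, eq. (20), term by term."""
    nn = n * (n - 1)
    t1 = 4 * g * (n + 1) / (3 * (g - 2) * nn)
    log_t2 = ((5 - 2 * g) * math.log(2) + 2 * lg(g - 0.5)
              + 2 * lg((g - 1) / 2) - 6 * lg(g / 2))
    t2 = g * math.exp(log_t2) * (3 - 2 * n) / nn
    t3 = (8 * math.sqrt(3) * g * gamma_sq_ratio(g) * h(g) * (n - 2)
          / (math.pi * (g - 2) * nn))
    return t1 + t2 + t3


def var_from_lomnicki(g, n):
    """Eq. (3) assembled from eqs. (11), (18) and (19)."""
    sigma2 = g / (g - 2)
    I = (math.sqrt(3) * g * gamma_sq_ratio(g) * h(g) / (2 * (g - 2) * math.pi)
         - g / (6 * (g - 2)))
    log_delta = (0.5 * math.log(g) + log_beta((g - 1) / 2, (g + 1) / 2)
                 - math.log(2 * g - 1) - log_beta(g, g)
                 - log_beta(g / 2, g / 2) - (2 * g - 3) * math.log(2))
    delta = math.exp(log_delta)
    bracket = 4 * (n - 1) * sigma2 + 16 * (n - 2) * I - 2 * (2 * n - 3) * delta ** 2
    return bracket / (n * (n - 1))


for g in (3, 4, 7, 20):
    for n in (2, 5, 50):
        print(g, n, var_closed_form(g, n), var_from_lomnicki(g, n))
```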

It is easily checked that, as g diverges,

$\mathrm{var}\left(\bar{\Delta}\right)\to \frac{72-48\sqrt{3}+4\pi -\left(48-24\sqrt{3}-4\pi \right)n}{3\pi n\left(n-1\right)}$, (22)

which is the variance of the sample mean difference for the standard normal model. It is also easily verified that, as n diverges, the variance (20) approaches zero; since $\bar{\Delta}$ is unbiased, this means that $\bar{\Delta}$ is also a consistent estimator.
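Both limits can be illustrated numerically. In the sketch below, which uses names of our own, the normal benchmark is obtained by inserting into Lomnicki's formula (3) the standard normal values ${\sigma}^{2}=1$, $\Delta =2/\sqrt{\pi}$ and $I=\sqrt{3}/\left(2\pi \right)-1/6$; these inputs are assumptions taken as the large-g limits of (18), (19) and (11), not values stated in the text.

```python
import math

lg = math.lgamma


def student_var_mean_difference(g, n):
    """Variance of the sample mean difference for Student's model, eq. (20)."""
    a = g / 2
    h = math.exp(lg(a - 1 / 3) + lg(a + 1 / 3) - lg(a - 1 / 6) - lg(a - 5 / 6))
    sq = math.exp(2 * (lg((g - 1) / 2) - lg(a)))
    nn = n * (n - 1)
    t1 = 4 * g * (n + 1) / (3 * (g - 2) * nn)
    log_t2 = ((5 - 2 * g) * math.log(2) + 2 * lg(g - 0.5)
              + 2 * lg((g - 1) / 2) - 6 * lg(a))
    t2 = g * math.exp(log_t2) * (3 - 2 * n) / nn
    t3 = 8 * math.sqrt(3) * g * sq * h * (n - 2) / (math.pi * (g - 2) * nn)
    return t1 + t2 + t3


def normal_var_mean_difference(n):
    """Eq. (3) with assumed standard normal inputs."""
    delta = 2 / math.sqrt(math.pi)
    I = math.sqrt(3) / (2 * math.pi) - 1 / 6
    return (4 * (n - 1) + 16 * (n - 2) * I - 2 * (2 * n - 3) * delta ** 2) / (n * (n - 1))


# As g grows with n fixed, the Student value approaches the normal one.
n = 10
for g in (5, 50, 500, 1e6):
    print(g, student_var_mean_difference(g, n), normal_var_mean_difference(n))

# As n grows with g fixed, the variance shrinks towards zero.
for n in (10, 100, 1000, 10000):
    print(n, student_var_mean_difference(5, n))
```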

5. Conclusions

The sample mean difference $\bar{\Delta}$ is an unbiased estimator of the population mean difference for every distribution. To verify whether it is also consistent, we need its variance. In this paper we have obtained a closed-form expression of the variance of $\bar{\Delta}$ for Student's distribution in terms of the parameter g (degrees of freedom) and of the sample size n.

Because this variance approaches zero as the sample size n diverges, $\bar{\Delta}$ is consistent for Student's distribution as well. As g diverges, Student's distribution tends to the normal one; accordingly, the expression we found for the variance of $\bar{\Delta}$ approaches the variance of $\bar{\Delta}$ for the normal distribution.

Acknowledgements

The helpful and constructive comments of a referee, which led to an improvement of the presentation of the paper, and the support of the editorial staff of Open Journal of Statistics in processing the paper are gratefully acknowledged.

References

[1] Campobasso, F. (2007) Alcuni risultati sulla varianza della differenza media campionaria. Annals of the Department of Statistical Sciences, 6.

[2] Nair, U.S. (1936) The Standard Error of Gini’s Mean Difference. Biometrika, 28, 428-436.

https://doi.org/10.2307/2333957

[3] Lomnicki, Z.A. (1952) The Standard Error of Gini’s Mean Difference. Annals of Mathematical Statistics, 23, 635-637.

https://doi.org/10.1214/aoms/1177729346

[4] Girone, G. and Mazzitelli, D. (2007) La differenza media nei principali modelli distributivi continui. Annals of the Department of Statistical Sciences, 6.