New Probability Distributions in Astrophysics: II. The Generalized and Double Truncated Lindley
Abstract: The statistical parameters of five generalizations of the Lindley distribution, such as the average, variance and moments, are reviewed. A new double truncated Lindley distribution with three parameters is derived. The new distributions are applied to model the initial mass function for stars.

1. Introduction

The Lindley distribution, after  , has one parameter. In recent years the Lindley distribution has been the subject of many generalizations; among others we mention one with two parameters , a two-parameter weighted one , the generalized Poisson-Lindley , the extended Lindley  and a transmuted Lindley-geometric distribution . Several generalizations of the Lindley distribution can be found in a recent review . The Lindley distribution is useful in modeling biological data from grouped mortality studies  , and its first application to astrophysics was to the initial mass function (IMF) for stars and the luminosity function for galaxies . The IMF is routinely modeled by the lognormal distribution, and therefore the following question naturally arises: can a Lindley distribution or one of its generalizations be an alternative to the lognormal fit for the IMF? In order to answer this question, Section 2 reviews the notion of a statistical sample and the Lindley distribution, Section 3 reviews five generalizations of the Lindley distribution, Section 4 introduces the double truncated Lindley distribution and Section 5 fits the six new Lindley distributions to four samples of stellar masses.

2. Preliminaries

We report some basic information on the adopted sample and on the original Lindley distribution with one parameter.

2.1. The Sample

The experimental sample consists of the data ${x}_{i}$ with i varying between 1 and n; the sample mean, $\stackrel{¯}{x}$, is

$\stackrel{¯}{x}=\frac{1}{n}\underset{i=1}{\overset{n}{\sum }}\text{ }\text{ }{x}_{i},$ (1)

the unbiased sample variance, ${s}^{2}$, is

${s}^{2}=\frac{1}{n-1}\underset{i=1}{\overset{n}{\sum }}{\left({x}_{i}-\stackrel{¯}{x}\right)}^{2},$ (2)

and the sample rth moment about the origin, ${\stackrel{¯}{x}}_{r}$, is

${\stackrel{¯}{x}}_{r}=\frac{1}{n}\underset{i=1}{\overset{n}{\sum }}{\left({x}_{i}\right)}^{r}.$ (3)
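The three sample statistics above translate directly into code. The following is a minimal sketch using only the Python standard library; the function name `sample_statistics` and the four illustrative masses are our own assumptions, not data from the clusters analysed later.

```python
def sample_statistics(data, r=3):
    """Return the sample mean (Eq. 1), the unbiased variance (Eq. 2)
    and the r-th moment about the origin (Eq. 3)."""
    n = len(data)
    if n < 2:
        raise ValueError("need at least two data points for the unbiased variance")
    mean = sum(data) / n
    variance = sum((x - mean) ** 2 for x in data) / (n - 1)
    raw_moment = sum(x ** r for x in data) / n
    return mean, variance, raw_moment


# Hypothetical masses in solar units, for illustration only.
masses = [0.2, 0.4, 0.6, 0.8]
mean, variance, third_moment = sample_statistics(masses, r=3)
```

These three numbers are the inputs of the moment-matching conditions used in Section 3.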

2.2. The Lindley Distribution with One Parameter

The Lindley probability density function (PDF) with one parameter, $f\left(x\right)$, is

$f\left(x;c\right)=\frac{{c}^{2}{\text{e}}^{-cx}\left(x+1\right)}{1+c};\text{\hspace{0.17em}}\text{\hspace{0.17em}}x>0,c>0.$ (4)

The cumulative distribution function (CDF), $F\left(x\right)$, is

$F\left(x;c\right)=1-\left(1+\frac{cx}{1+c}\right){\text{e}}^{-cx};\text{\hspace{0.17em}}\text{\hspace{0.17em}}x>0,c>0.$ (5)

At $x=0$ the PDF takes the non-zero value $f\left(0\right)=\frac{{c}^{2}}{1+c}$.

The average value or mean, $\mu$, is

$\mu \left(c\right)=\frac{2+c}{c\left(1+c\right)},$ (6)

the variance, ${\sigma }^{2}$, is

${\sigma }^{2}\left(c\right)=\frac{{c}^{2}+4c+2}{{c}^{2}{\left(1+c\right)}^{2}}.$ (7)

The rth moment about the origin for the Lindley distribution, ${{\mu }^{\prime }}_{r}$, is

${{\mu }^{\prime }}_{r}=\frac{{c}^{-r}\Gamma \left(r+2\right)+{c}^{1-r}\Gamma \left(r+1\right)}{1+c},$ (8)

where

$\Gamma \left(z\right)={\int }_{0}^{\infty }\text{ }{\text{e}}^{-t}{t}^{z-1}\text{d}t,$ (9)

is the gamma function, see . The central moments, ${\mu }_{r}$, are

${\mu }_{3}=\frac{2\text{ }{c}^{3}+12{c}^{2}+12c+4}{{c}^{3}{\left(1+c\right)}^{3}}$ (10a)

${\mu }_{4}=\frac{9\text{ }{c}^{4}+72{c}^{3}+132{c}^{2}+96c+24}{{c}^{4}{\left(1+c\right)}^{4}}$ (10b)

More details can be found in .
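As a quick consistency check of Equations (4) to (8), the sketch below codes the one-parameter Lindley distribution with the standard library only. The function names and the value c = 1.5 are illustrative assumptions.

```python
import math


def lindley_pdf(x, c):
    """Eq. (4): Lindley PDF for x > 0 and c > 0."""
    return c * c * math.exp(-c * x) * (x + 1.0) / (1.0 + c)


def lindley_cdf(x, c):
    """Eq. (5): Lindley CDF."""
    return 1.0 - (1.0 + c * x / (1.0 + c)) * math.exp(-c * x)


def lindley_raw_moment(r, c):
    """Eq. (8): r-th moment about the origin."""
    return (c ** (-r) * math.gamma(r + 2)
            + c ** (1 - r) * math.gamma(r + 1)) / (1.0 + c)


def lindley_mean(c):
    """Eq. (6)."""
    return (2.0 + c) / (c * (1.0 + c))


def lindley_variance(c):
    """Eq. (7)."""
    return (c * c + 4.0 * c + 2.0) / (c * c * (1.0 + c) ** 2)


c = 1.5  # arbitrary illustrative value
m1 = lindley_raw_moment(1, c)
m2 = lindley_raw_moment(2, c)
```

With these definitions, `m1` reproduces Equation (6) and `m2 - m1**2` reproduces Equation (7).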

3. Generalizations of the Lindley Distribution

We review the statistics of five generalizations of the Lindley distribution: the two-parameter, power, generalized, new generalized and new weighted Lindley distributions.

3.1. The Lindley Distribution with Two Parameters

The Lindley PDF with two parameters (TPLD)  is

$f\left(x;b,c\right)=\frac{{c}^{2}\left(b+x\right){\text{e}}^{-cx}}{bc+1},$ (11)

where $x>0$, $c>0$ and $bc>-1$. The CDF of the TPLD is

$F\left(x;b,c\right)=1-\frac{\left(bc+cx+1\right){\text{e}}^{-cx}}{bc+1}.$ (12)

The average value or mean of the TPLD is

$\mu \left(b,c\right)=\frac{bc+2}{c\left(bc+1\right)},$ (13)

and the variance of the TPLD is

${\sigma }^{2}\left(b,c\right)=\frac{{b}^{2}{c}^{2}+4bc+2}{{c}^{2}{\left(bc+1\right)}^{2}}.$ (14)

The mode of the TPLD is at

$Mode=\frac{1-bc}{c},$ (15)

see Equation (2.3) in . The rth moment about the origin for the TPLD, ${{\mu }^{\prime }}_{r}$, is

${{\mu }^{\prime }}_{r}=\frac{{c}^{1-r}b\Gamma \left(r+1\right)+{c}^{-r}\Gamma \left(r+2\right)}{bc+1}.$ (16)

The two parameters b and c can be obtained by matching the theoretical mean and variance to the sample values,

$\mu =\stackrel{¯}{x}$ (17a)

${\sigma }^{2}={s}^{2},$ (17b)

which gives

$\stackrel{^}{b}=\frac{-\left({s}^{2}+{\stackrel{¯}{x}}^{2}\right)\left(\stackrel{¯}{x}\text{ }\sqrt{-2{s}^{2}+2{\stackrel{¯}{x}}^{2}}-2{s}^{2}\right)}{\left(\stackrel{¯}{x}\text{ }\sqrt{-2{s}^{2}+2{\stackrel{¯}{x}}^{2}}+{\stackrel{¯}{x}}^{2}-{s}^{2}\right)\left(2\text{ }\stackrel{¯}{x}+\sqrt{-2{s}^{2}+2{\stackrel{¯}{x}}^{2}}\right)},$ (18)

and

$\stackrel{^}{c}=\frac{2\text{ }\stackrel{¯}{x}+\sqrt{-2{s}^{2}+2{\stackrel{¯}{x}}^{2}}}{{s}^{2}+{\stackrel{¯}{x}}^{2}}.$ (19)
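The closed-form estimates of Equations (18) and (19) can be checked with a round trip: start from known b and c, compute the mean and variance with Equations (13) and (14), and recover the parameters. The sketch below is a minimal illustration with hypothetical function names; it also makes explicit that the square root is real only when $s^2<\bar{x}^2$.

```python
import math


def tpld_mean_variance(b, c):
    """Eqs. (13) and (14): mean and variance of the TPLD."""
    mean = (b * c + 2.0) / (c * (b * c + 1.0))
    variance = (b * b * c * c + 4.0 * b * c + 2.0) / (c * c * (b * c + 1.0) ** 2)
    return mean, variance


def tpld_moment_estimates(xbar, s2):
    """Eqs. (18) and (19): closed-form estimates of b and c."""
    if s2 >= xbar * xbar:
        raise ValueError("Eqs. (18)-(19) need s^2 < xbar^2 for a real square root")
    root = math.sqrt(2.0 * xbar * xbar - 2.0 * s2)
    c_hat = (2.0 * xbar + root) / (s2 + xbar * xbar)
    b_hat = (-(s2 + xbar * xbar) * (xbar * root - 2.0 * s2)
             / ((xbar * root + xbar * xbar - s2) * (2.0 * xbar + root)))
    return b_hat, c_hat


# Round trip with arbitrary illustrative parameters.
b_true, c_true = 0.7, 2.0
xbar, s2 = tpld_mean_variance(b_true, c_true)
b_hat, c_hat = tpld_moment_estimates(xbar, s2)
```

For a real sample, `xbar` and `s2` would be the values given by Equations (1) and (2).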

3.2. The Power Lindley Distribution

The power Lindley PDF with two parameters (PLD) according to  is

$f\left(x;b,c\right)=\frac{c{b}^{2}\left(1+{x}^{c}\right){x}^{c-1}{\text{e}}^{-b{x}^{c}}}{b+1},$ (20)

where $b>0$, $c>0$ and $x>0$. The CDF of the PLD is

$F\left(x;b,c\right)=\frac{\left(-b{x}^{c}-b-1\right){\text{e}}^{-b{x}^{c}}+b+1}{b+1}.$ (21)

The average value or mean of the PLD is

$\mu \left(b,c\right)=\frac{\left({b}^{-{c}^{-1}}c+{b}^{\frac{c-1}{c}}c+{b}^{-{c}^{-1}}\right)\Gamma \left(\frac{c+1}{c}\right)}{\left(b+1\right)c},$ (22)

and the variance of the PLD is

${\sigma }^{2}\left(b,c\right)=\frac{NA}{DA},$ (23)

where

$\begin{array}{c}NA=-{b}^{-2{c}^{-1}}{\left(\Gamma \left(\frac{c+1}{c}\right)\right)}^{2}{c}^{2}+{b}^{-2{c}^{-1}}\Gamma \left(\frac{c+2}{c}\right)b{c}^{2}-{b}^{\frac{2\text{ }c-2}{c}}{\left(\Gamma \left(\frac{c+1}{c}\right)\right)}^{2}{c}^{2}\\ \text{\hspace{0.17em}}\text{\hspace{0.17em}}-2{\left(\Gamma \left(\frac{c+1}{c}\right)\right)}^{2}{b}^{\frac{-2+c}{c}}{c}^{2}+{b}^{\frac{-2+c}{c}}\Gamma \left(\frac{c+2}{c}\right)b{c}^{2}-2{b}^{-2{c}^{-1}}{\left(\Gamma \left(\frac{c+1}{c}\right)\right)}^{2}c\\ \text{\hspace{0.17em}}\text{\hspace{0.17em}}+2{b}^{-2{c}^{-1}}\Gamma \left(\frac{c+2}{c}\right)bc+{b}^{-2{c}^{-1}}\Gamma \left(\frac{c+2}{c}\right){c}^{2}-2{\left(\Gamma \left(\frac{c+1}{c}\right)\right)}^{2}{b}^{\frac{-2+c}{c}}c\\ \text{\hspace{0.17em}}\text{\hspace{0.17em}}+{b}^{\frac{-2+c}{c}}\Gamma \left(\frac{c+2}{c}\right){c}^{2}-{b}^{-2{c}^{-1}}{\left(\Gamma \left(\frac{c+1}{c}\right)\right)}^{2}+2{b}^{-2{c}^{-1}}\Gamma \left(\frac{c+2}{c}\right)c,\end{array}$ (24)

and

$DA={\left(b+1\right)}^{2}{c}^{2}.$ (25)

The mode of the PLD is at

$Mode={\left(\frac{-cb+\sqrt{1+\left({b}^{2}+4\right){c}^{2}+\left(-2b-4\right)c}+2c-1}{2cb}\right)}^{1/c}.$ (26)

The rth moment about the origin for the PLD is

${{\mu }^{\prime }}_{r}=\frac{{b}^{\frac{-r+c}{c}}\Gamma \left(\frac{r+c}{c}\right)+{b}^{-\frac{r}{c}}\Gamma \left(\frac{r+2\text{ }c}{c}\right)}{b+1}.$ (27)

The two parameters b and c of the PLD can be found by numerically solving the nonlinear system given by Equation (17a) and Equation (17b).
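When coding the system (17a)-(17b) for the PLD, the long expression of Equation (24) can be avoided, because the variance is also the second raw moment minus the squared mean, both available from Equation (27). The following is a minimal sketch under that choice; the function names are hypothetical and no particular root finder is implied.

```python
import math


def pld_raw_moment(r, b, c):
    """Eq. (27): r-th moment about the origin of the PLD."""
    return (b ** ((c - r) / c) * math.gamma((r + c) / c)
            + b ** (-r / c) * math.gamma((r + 2.0 * c) / c)) / (b + 1.0)


def pld_mean_variance(b, c):
    """Mean and variance from the first two raw moments."""
    m1 = pld_raw_moment(1, b, c)
    m2 = pld_raw_moment(2, b, c)
    return m1, m2 - m1 * m1


# With c = 1 the PLD reduces to the one-parameter Lindley with rate b.
b = 1.5
mean, variance = pld_mean_variance(b, 1.0)
```

In the c = 1 limit the two returned values coincide with Equations (6) and (7) evaluated at rate b, which gives a simple sanity check before the numerical solution is attempted.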

3.3. The Generalized Lindley Distribution

The generalized Lindley PDF with three parameters (GLD) according to  is

$f\left(x;a,b,c\right)=\frac{{b}^{2}{\left(bx\right)}^{a-1}\left(cx+a\right){\text{e}}^{-bx}}{\left(c+b\right)\Gamma \left(a+1\right)},$ (28)

where $a>0$, $b>0$, $c>0$ and $x>0$. The CDF of the GLD is

$\begin{array}{l}F\left(x;a,b,c\right)\\ =\frac{{\text{e}}^{-bx/2}\left({x}^{a/2}\left(c{b}^{a/2}+{b}^{a/2+1}\right){M}_{a/2,a/2+1/2}\left(bx\right)+{b}^{a+1}{x}^{a}{\text{e}}^{-bx/2}\left(a+1\right)\right)}{\left(c+b\right)\Gamma \left(a+2\right)},\end{array}$ (29)

where ${M}_{\mu ,\text{ }\nu }\left(z\right)$ is the Whittaker M function, see . The average value or mean of the GLD is

$\mu \left(a,b,c\right)=\frac{ab+ac+c}{b\left(c+b\right)},$ (30)

and the variance of the GLD is

${\sigma }^{2}\left(a,b,c\right)=\frac{a{b}^{2}+2cba+{c}^{2}a+2cb+{c}^{2}}{{b}^{2}{\left(c+b\right)}^{2}}.$ (31)

The hazard rate function, $h\left(x;a,b,c\right)$, of the GLD is

$\begin{array}{l}h\left(x;a,b,c\right)\\ =\frac{-{b}^{a+1}{x}^{a-1}\left(cx+a\right){\text{e}}^{-bx}\left(a+1\right)}{{\text{e}}^{-1/2bx}{x}^{a/2}\left(c{b}^{a/2}+{b}^{a/2+1}\right){M}_{a/2,a/2+1/2}\left(bx\right)+{x}^{a}{b}^{a+1}\left(a+1\right){\text{e}}^{-bx}-\left(c+b\right)\Gamma \left(a+2\right)},\end{array}$ (32)

and Figure 1 reports an example. Here the CDF, Equation (29), and the hazard rate function, Equation (32), are reported in closed form in contrast to what was asserted by .

Figure 1. Plot of the three-dimensional surface of the hazard rate function when b = 3 and c = 0.5.

The mode of the GLD is at

$Mode=\frac{-ab+ac+\sqrt{{a}^{2}{b}^{2}+2\text{ }{a}^{2}bc+{a}^{2}{c}^{2}-4\text{ }abc}}{2bc}.$ (33)

The rth moment about the origin for the GLD is

${{\mu }^{\prime }}_{r}=\frac{\Gamma \left(r+a\right)\left({b}^{-r}ca+{b}^{-r}cr+{b}^{-r+1}a\right)}{\left(c+b\right)\Gamma \left(a+1\right)},$ (34)

and in particular the third moment is

${{\mu }^{\prime }}_{3}=\frac{\Gamma \left(3+a\right)\left(ab+ac+3c\right)}{\left(c+b\right)\Gamma \left(a+1\right){b}^{3}}.$ (35)

The three parameters a, b and c of the GLD can be obtained by numerically solving the following three non-linear equations

$\mu =\stackrel{¯}{x}$ (36a)

${\sigma }^{2}={s}^{2}$ (36b)

${{\mu }^{\prime }}_{3}={\stackrel{¯}{x}}_{3}.$ (36c)
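The three conditions (36a)-(36c) can be written as a residual vector that any numerical root finder could drive to zero. The sketch below only builds the residuals from Equation (34); the function names are hypothetical, and the parameters a = 2, b = 3, c = 0.5 are arbitrary illustrative values.

```python
import math


def gld_raw_moment(r, a, b, c):
    """Eq. (34): r-th moment about the origin of the GLD."""
    return (math.gamma(r + a) * b ** (-r) * (c * a + c * r + b * a)
            / ((c + b) * math.gamma(a + 1.0)))


def gld_moment_residuals(params, xbar, s2, xbar3):
    """Left minus right sides of Eqs. (36a)-(36c); all zero at a solution."""
    a, b, c = params
    m1 = gld_raw_moment(1, a, b, c)
    m2 = gld_raw_moment(2, a, b, c)
    m3 = gld_raw_moment(3, a, b, c)
    return (m1 - xbar, (m2 - m1 * m1) - s2, m3 - xbar3)


# Generate targets from known parameters, then check that the residuals vanish.
a, b, c = 2.0, 3.0, 0.5
m1 = gld_raw_moment(1, a, b, c)
m2 = gld_raw_moment(2, a, b, c)
m3 = gld_raw_moment(3, a, b, c)
residuals = gld_moment_residuals((a, b, c), m1, m2 - m1 * m1, m3)
```

Here `m1` and `m2 - m1**2` agree with Equations (30) and (31), and `m3` with Equation (35).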

3.4. The New Generalized Lindley Distribution

The new generalized Lindley PDF with three parameters (NGLD) according to  is

$f\left(x;a,b,c\right)=\frac{\left({c}^{a+1}{x}^{a-1}\Gamma \left(b\right)+{c}^{b}{x}^{b-1}\Gamma \left(a\right)\right){\text{e}}^{-cx}}{\left(1+c\right)\Gamma \left(a\right)\Gamma \left(b\right)},$ (37)

where $a>0$, $b>0$, $c>0$ and $x>0$. The CDF of the NGLD is

$F\left(x;a,b,c\right)=\frac{NB}{\left(1+c\right)\Gamma \left(b+2\right)\Gamma \left(a+2\right)},$ (38)

where

$\begin{array}{c}NB=\Gamma \left(b+2\right){x}^{a}{c}^{a+1}{\text{e}}^{-cx}a+\Gamma \left(a+2\right){x}^{b}{c}^{b}{\text{e}}^{-cx}b-\Gamma \left(b+2\right)c\Gamma \left(a+1,cx\right)a\\ \text{\hspace{0.17em}}\text{\hspace{0.17em}}+\Gamma \left(b+2\right){x}^{a}{c}^{a+1}{\text{e}}^{-cx}+\Gamma \left(a+2\right){x}^{b}{c}^{b}{\text{e}}^{-cx}-\Gamma \left(b+2\right)c\Gamma \left(a+1,cx\right)\\ \text{\hspace{0.17em}}\text{\hspace{0.17em}}+\Gamma \left(b+2\right)\Gamma \left(a+2\right)c-\Gamma \left(a+2\right)\Gamma \left(b+1,cx\right)b+\Gamma \left(b+2\right)\Gamma \left(a+2\right)\\ \text{\hspace{0.17em}}\text{\hspace{0.17em}}-\Gamma \left(a+2\right)\Gamma \left(b+1,cx\right)\end{array}$ (39)

where $\Gamma \left(a,z\right)$ is the upper incomplete gamma function, defined by

$\Gamma \left(a,z\right)={\int }_{z}^{\infty }\text{ }{t}^{a-1}{\text{e}}^{-t}\text{d}t,$ (40)

see . The average value of the NGLD is

$\mu \left(a,b,c\right)=\frac{ac+b}{c\left(1+c\right)},$ (41)

and the variance of the NGLD is

${\sigma }^{2}\left(a,b,c\right)=\frac{{a}^{2}c-2\text{ }abc+a{c}^{2}+{b}^{2}c+ac+bc+b}{{c}^{2}{\left(1+c\right)}^{2}}.$ (42)

The rth moment about the origin for the NGLD is

${{\mu }^{\prime }}_{r}=\frac{{c}^{-r+1}\Gamma \left(r+a\right)\Gamma \left(b\right)+{c}^{-r}\Gamma \left(r+b\right)\Gamma \left(a\right)}{\left(1+c\right)\Gamma \left(a\right)\Gamma \left(b\right)},$ (43)

and the third moment is

${{\mu }^{\prime }}_{3}=\frac{\Gamma \left(3+a\right)\Gamma \left(b\right)c+\Gamma \left(3+b\right)\Gamma \left(a\right)}{{c}^{3}\left(1+c\right)\Gamma \left(a\right)\Gamma \left(b\right)}.$ (44)

The three parameters a, b and c of the NGLD are obtained by numerically solving the three non-linear Equation (36a), Equation (36b) and Equation (36c).
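As an aid to reading Equation (37), the NGLD PDF can be rearranged (our own rearrangement, not a formula quoted from the cited paper) as a mixture of two gamma densities with a common rate c and weights $c/\left(1+c\right)$ and $1/\left(1+c\right)$:

$f\left(x;a,b,c\right)=\frac{c}{1+c}\frac{{c}^{a}{x}^{a-1}{\text{e}}^{-cx}}{\Gamma \left(a\right)}+\frac{1}{1+c}\frac{{c}^{b}{x}^{b-1}{\text{e}}^{-cx}}{\Gamma \left(b\right)}.$

Since a gamma density with shape a and rate c has mean $a/c$, the mean of the mixture is

$\mu =\frac{c}{1+c}\frac{a}{c}+\frac{1}{1+c}\frac{b}{c}=\frac{ac+b}{c\left(1+c\right)},$

in agreement with Equation (41).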

3.5. The New Weighted Lindley Distribution

The new weighted Lindley PDF with two parameters (NWL) according to  is

$f\left(x;b,c\right)=\frac{-{c}^{2}{\left(1+b\right)}^{2}\left(1+x\right)\left(-1+{\text{e}}^{-cbx}\right){\text{e}}^{-cx}}{b\left(cb+b+c+2\right)},$ (45)

where $b>0$, $c>0$ and $x>0$. The CDF of the NWL is

$F\left(x;b,c\right)=\frac{NC}{b\left(cb+b+c+2\right)},$ (46)

where

$\begin{array}{c}NC=-{\text{e}}^{-cx}{b}^{2}cx+{\text{e}}^{-c\left(1+b\right)x}bcx-{\text{e}}^{-cx}{b}^{2}c-2\text{ }{\text{e}}^{-cx}bcx+{\text{e}}^{-c\left(1+b\right)x}bc\\ \text{\hspace{0.17em}}\text{\hspace{0.17em}}+{\text{e}}^{-c\left(1+b\right)x}cx-{\text{e}}^{-cx}{b}^{2}-2{\text{e}}^{-cx}bc-{\text{e}}^{-cx}cx+{b}^{2}c+{\text{e}}^{-c\left(1+b\right)x}c\\ \text{\hspace{0.17em}}\text{\hspace{0.17em}}-2{\text{e}}^{-cx}b-{\text{e}}^{-cx}c+{b}^{2}+cb+{\text{e}}^{-c\left(1+b\right)x}-{\text{e}}^{-cx}+2\text{ }b.\end{array}$ (47)

The average value of the NWL is

$\mu \left(b,c\right)=\frac{{b}^{2}c+2{b}^{2}+3cb+6b+2c+6}{\left(cb+b+c+2\right)c\left(1+b\right)},$ (48)

and the variance of the NWL is

${\sigma }^{2}\left(b,c\right)=\frac{ND}{{c}^{2}{\left(bc+b+c+2\right)}^{2}{\left(1+b\right)}^{2}},$ (49)

where

$\begin{array}{c}ND={b}^{4}{c}^{2}+4{b}^{4}c+4{b}^{3}{c}^{2}+2{b}^{4}+18{b}^{3}c+7{b}^{2}{c}^{2}+12{b}^{3}+32{b}^{2}c\\ \text{\hspace{0.17em}}\text{\hspace{0.17em}}+6b{c}^{2}+24{b}^{2}+30bc+2{c}^{2}+24b+12c+12.\end{array}$ (50)

The rth moment about the origin for the NWL is

${{\mu }^{\prime }}_{r}=\frac{NE}{b\left(bc+b+c+2\right)},$ (51)

where

$\begin{array}{c}NE=-\left({c}^{1-r}{b}^{1-r}{\left(\frac{1+b}{b}\right)}^{-r}+{b}^{-r}{\left(\frac{1+b}{b}\right)}^{-r}{c}^{-r}r-{c}^{-r}{b}^{2}r+{c}^{1-r}{b}^{-r}{\left(\frac{1+b}{b}\right)}^{-r}\\ \text{\hspace{0.17em}}\text{\hspace{0.17em}}-{c}^{1-r}{b}^{2}+{b}^{-r}{\left(\frac{1+b}{b}\right)}^{-r}{c}^{-r}-{c}^{-r}{b}^{2}-2{c}^{-r}br-2{c}^{1-r}b-2{c}^{-r}b-{c}^{-r}r\\ \text{\hspace{0.17em}}\text{\hspace{0.17em}}-{c}^{1-r}-{c}^{-r}\right)\Gamma \left(1+r\right).\end{array}$ (52)

The two parameters b and c of the NWL can be found by numerically solving the nonlinear system given by Equation (17a) and Equation (17b).

4. The Double Truncated Lindley Distribution

Let X be a random variable defined in $\left[{x}_{l},{x}_{u}\right]$ ; the double truncated (DTL) version of the Lindley PDF with one parameter, ${f}_{t}\left(x;c,{x}_{l},{x}_{u}\right)$, is

${f}_{t}\left(x;c,{x}_{l},{x}_{u}\right)=\frac{{c}^{2}{\text{e}}^{-cx}\left(x+1\right)}{{\text{e}}^{-c{x}_{l}}c{x}_{l}-{\text{e}}^{-c{x}_{u}}c{x}_{u}+{\text{e}}^{-c{x}_{l}}c-{\text{e}}^{-c{x}_{u}}c+{\text{e}}^{-c{x}_{l}}-{\text{e}}^{-c{x}_{u}}},$ (53)

where the double truncation increases the number of parameters from one to three, see . The double truncated Lindley distribution with scale, which has four parameters, was introduced in .

Its CDF, ${F}_{t}\left(x;c,{x}_{l},{x}_{u}\right)$, is

${F}_{t}\left(x;c,{x}_{l},{x}_{u}\right)=\frac{NF}{{\left(\left(-1+\left(-{x}_{u}-1\right)c\right){\text{e}}^{c{x}_{l}}+\left(1+\left({x}_{l}+1\right)c\right){\text{e}}^{c{x}_{u}}\right)}^{2}},$ (54)

where

$\begin{array}{c}NF=-{\text{e}}^{c\left({x}_{l}+{x}_{u}\right)}\left(-{\left(1+\left({x}_{l}+1\right)c\right)}^{2}{\text{e}}^{-c\left({x}_{l}-{x}_{u}\right)}-\left(1+\left(x+1\right)c\right)\left(1+\left({x}_{u}+1\right)c\right){\text{e}}^{c\left(-x+{x}_{l}\right)}\\ \text{\hspace{0.17em}}\text{\hspace{0.17em}}+\left(\left(1+\left(x+1\right)c\right){\text{e}}^{c\left(-x+{x}_{u}\right)}+1+\left({x}_{u}+1\right)c\right)\left(1+\left({x}_{l}+1\right)c\right)\right).\end{array}$ (55)

The average value, ${\mu }_{t}\left(c,{x}_{l},{x}_{u}\right)$, is

$\begin{array}{l}{\mu }_{t}\left(c,{x}_{l},{x}_{u}\right)\\ =\frac{\left(2+\left({x}_{u}^{2}+{x}_{u}\right){c}^{2}+\left(2\text{ }{x}_{u}+1\right)c\right){\text{e}}^{c{x}_{l}}-{\text{e}}^{c{x}_{u}}\left(2+\left({x}_{l}^{2}+{x}_{l}\right){c}^{2}+\left(2{x}_{l}+1\right)c\right)}{-c\left(\left(-1+\left(-{x}_{u}-1\right)c\right){\text{e}}^{c{x}_{l}}+{\text{e}}^{c{x}_{u}}\left(1+\left({x}_{l}+1\right)c\right)\right)}.\end{array}$ (56)

The rth moment about the origin for the DTL, ${{\mu }^{\prime }}_{r}\left(c,{x}_{l},{x}_{u}\right)$, is

${{\mu }^{\prime }}_{r}\left(c,{x}_{l},{x}_{u}\right)=\frac{NG}{\left(\left(1+\left({x}_{l}+1\right)c\right){\text{e}}^{-c{x}_{l}}-\left(1+\left({x}_{u}+1\right)c\right){\text{e}}^{-c{x}_{u}}\right)\left(r+1\right)},$ (57)

where

$\begin{array}{c}NG=-{x}_{l}^{r/2}{\text{e}}^{-1/2c{x}_{l}}\left({c}^{1-r/2}+{c}^{-r/2}\left(r+1\right)\right){M}_{r/2,r/2+1/2}\left(c{x}_{l}\right)\\ \text{\hspace{0.17em}}\text{\hspace{0.17em}}+\left({c}^{1-r/2}+{c}^{-r/2}\left(r+1\right)\right){\text{e}}^{-1/2c{x}_{u}}{x}_{u}^{r/2}{M}_{r/2,r/2+1/2}\left(c{x}_{u}\right)\\ \text{\hspace{0.17em}}\text{\hspace{0.17em}}+c\left(r+1\right)\left({\text{e}}^{-c{x}_{l}}{x}_{l}^{r+1}-{\text{e}}^{-c{x}_{u}}{x}_{u}^{r+1}\right).\end{array}$ (58)

The three parameters which characterize the DTL can be found in the following way. Consider the sample of stellar masses $\mathcal{X}={x}_{1},{x}_{2},\cdots ,{x}_{n}$ and let ${x}_{\left(1\right)}\ge {x}_{\left(2\right)}\ge \cdots \ge {x}_{\left(n\right)}$ denote their order statistics, so that ${x}_{\left(1\right)}=max\left({x}_{1},{x}_{2},\cdots ,{x}_{n}\right)$, ${x}_{\left(n\right)}=min\left({x}_{1},{x}_{2},\cdots ,{x}_{n}\right)$. The first two parameters ${x}_{l}$ and ${x}_{u}$ are

${x}_{l}={x}_{\left(n\right)},\text{ }{x}_{u}={x}_{\left(1\right)}.$ (59)

The third parameter c can be found by solving the following non-linear equation

${\mu }_{t}\left(c,{x}_{l},{x}_{u}\right)=\stackrel{¯}{x}.$ (60)
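The estimation procedure of Equations (59) and (60) can be sketched as follows. This is a minimal illustration, not the code used for the fits: the function names, the bisection bracket `[c_lo, c_hi]` and the synthetic three-point sample are our own assumptions. Equation (56) is rewritten here by dividing numerator and denominator by $\text{e}^{c x_u}$, so that only decaying exponentials appear and large values of c do not overflow.

```python
import math


def dtl_mean(c, xl, xu):
    """Eq. (56), rewritten with decaying exponentials to avoid overflow."""
    decay = math.exp(-c * (xu - xl))  # ratio e^{-c x_u} / e^{-c x_l}

    def g(x):
        return 2.0 + (x * x + x) * c * c + (2.0 * x + 1.0) * c

    def a(x):
        return 1.0 + (x + 1.0) * c

    return (g(xl) - g(xu) * decay) / (c * (a(xl) - a(xu) * decay))


def fit_dtl(sample, c_lo=1e-3, c_hi=1e3, iterations=200):
    """Eq. (59) for x_l and x_u, then bisection on Eq. (60) for c."""
    xl, xu = min(sample), max(sample)
    xbar = sum(sample) / len(sample)
    # The truncated mean decreases as c grows, so the root is bracketed
    # when dtl_mean(c_hi) <= xbar <= dtl_mean(c_lo).
    if not (dtl_mean(c_hi, xl, xu) <= xbar <= dtl_mean(c_lo, xl, xu)):
        raise ValueError("sample mean outside the range reachable in [c_lo, c_hi]")
    for _ in range(iterations):
        c_mid = 0.5 * (c_lo + c_hi)
        if dtl_mean(c_mid, xl, xu) > xbar:
            c_lo = c_mid  # mean too large: increase c
        else:
            c_hi = c_mid
    return xl, xu, 0.5 * (c_lo + c_hi)


# Synthetic check with a hypothetical three-point sample whose mean
# matches a DTL with c = 2 on [0.1, 1.5].
target = dtl_mean(2.0, 0.1, 1.5)
xl, xu, c_hat = fit_dtl([0.1, 1.5, 3.0 * target - 1.6])
```

Bisection is chosen because the truncated mean is a monotonically decreasing function of c for fixed $x_l$ and $x_u$; a positive solution exists only if the sample mean lies below the $c\to 0$ limit of Equation (56), which is why the bracket is checked first.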

5. Application to the IMF

We report the adopted statistics for four samples of stars, which are fitted with the lognormal distribution, the generalizations of the Lindley distribution and the double truncated Lindley distribution.

5.1. The Involved Statistics

The merit function ${\chi }^{2}$ is computed according to the formula

${\chi }^{2}=\underset{i=1}{\overset{n}{\sum }}\frac{{\left({T}_{i}-{O}_{i}\right)}^{2}}{{T}_{i}},$ (61)

where n is the number of bins, ${T}_{i}$ is the theoretical value, and ${O}_{i}$ is the experimental value represented by the frequencies. The theoretical frequency distribution is given by

${T}_{i}=N\Delta {x}_{i}p\left(x\right),$ (62)

where N is the number of elements of the sample, $\Delta {x}_{i}$ is the width of the ith bin, and $p\left(x\right)$ is the PDF under examination.

A reduced merit function ${\chi }_{red}^{2}$ is evaluated by

${\chi }_{red}^{2}={\chi }^{2}/NF,$ (63)

where $NF=n-k$ is the number of degrees of freedom, n is the number of bins, and k is the number of parameters. The goodness of the fit can be expressed by the probability Q, see equation 15.2.12 in , which involves the degrees of freedom and ${\chi }^{2}$. According to  p. 658, the fit “may be acceptable” if $Q>0.001$.

The Akaike information criterion (AIC), see , is defined by

$\text{AIC}=2k-2\mathrm{ln}\left(L\right),$ (64)

where L is the likelihood function and k the number of free parameters in the model. We assume a Gaussian distribution for the errors, so that the likelihood function can be derived from the ${\chi }^{2}$ statistic, $L\propto exp\left(-\frac{{\chi }^{2}}{2}\right)$, where ${\chi }^{2}$ has been computed by Equation (61), see  . Now the AIC becomes

$\text{AIC}=2k+{\chi }^{2}.$ (65)
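The binned statistics of Equations (61) to (63) and (65) can be sketched as below. The function names and the four-bin counts are hypothetical, and evaluating the PDF at the bin centres in Equation (62) is an assumption of this sketch, since the text does not state where inside the bin $p\left(x\right)$ is evaluated. The probability Q is not computed here because it requires an incomplete gamma function.

```python
def chi_square(theoretical, observed):
    """Eq. (61): sum over bins of (T_i - O_i)^2 / T_i."""
    return sum((t - o) ** 2 / t for t, o in zip(theoretical, observed))


def theoretical_frequencies(pdf, bin_centers, bin_width, n_sample):
    """Eq. (62): T_i = N * dx_i * p(x_i), with p taken at the bin centres
    (an assumption of this sketch)."""
    return [n_sample * bin_width * pdf(x) for x in bin_centers]


def fit_statistics(theoretical, observed, k):
    """Return chi^2, reduced chi^2 (Eq. 63) and AIC (Eq. 65)."""
    chi2 = chi_square(theoretical, observed)
    dof = len(observed) - k  # NF = n - k
    return chi2, chi2 / dof, 2 * k + chi2


# Hypothetical counts in four bins and a one-parameter model (k = 1).
T = [10.0, 20.0, 10.0, 5.0]
O = [12, 18, 10, 5]
chi2, chi2_red, aic = fit_statistics(T, O, k=1)
```

Because the AIC adds 2k to the same ${\chi }^{2}$, it penalises the three-parameter distributions relative to the one-parameter Lindley when the fits are of similar quality.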

The Kolmogorov-Smirnov test (K-S), see   , does not require binning the data. The K-S test, as implemented by the FORTRAN subroutine KSONE in , finds the maximum distance, D, between the theoretical and the astronomical CDF as well as the significance level ${P}_{KS}$, see formulas 14.3.5 and 14.3.9 in ; if ${P}_{KS}\ge 0.1$, the goodness of the fit is believable.
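For readers without access to the FORTRAN routine, the sketch below computes D and an asymptotic significance level in Python. It is not a port of KSONE: the function names are hypothetical, and the series $2\sum_{j}(-1)^{j-1}\text{e}^{-2j^{2}\lambda^{2}}$ with $\lambda=\left(\sqrt{n}+0.12+0.11/\sqrt{n}\right)D$ is the approximation we recall being commonly quoted for this test, so it should be checked against the cited formulas before use.

```python
import math


def ks_distance(sample, cdf):
    """Maximum distance D between the empirical and the theoretical CDF."""
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs, start=1):
        f = cdf(x)
        # The empirical CDF jumps from (i-1)/n to i/n at x.
        d = max(d, abs(i / n - f), abs((i - 1) / n - f))
    return d


def ks_significance(d, n, terms=100):
    """Asymptotic significance level of D for a sample of size n
    (assumed series approximation, see the lead-in)."""
    sqrt_n = math.sqrt(n)
    lam = (sqrt_n + 0.12 + 0.11 / sqrt_n) * d
    if lam < 0.1:
        return 1.0  # the series is numerically 1 here and converges slowly
    total = 0.0
    for j in range(1, terms + 1):
        total += (-1) ** (j - 1) * math.exp(-2.0 * j * j * lam * lam)
    return min(1.0, max(0.0, 2.0 * total))


# Hypothetical three-point sample tested against the uniform CDF on [0, 1].
d = ks_distance([0.1, 0.4, 0.7], lambda x: x)
p_ks = ks_significance(d, 3)
```

Any of the CDFs given in Sections 2 to 4 can be passed as the `cdf` argument.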

5.2. The Selected Sample of Stars

The first test is performed on NGC 2362 where the 271 stars have a range $1.47{M}_{\odot }\ge M\ge 0.11{M}_{\odot }$, see  and CDS catalog J/MNRAS/384/675/Table 1.

Table 1. Numerical values of ${\chi }_{red}^{2}$, AIC, probability Q, D, the maximum distance between theoretical and observed CDF, and ${P}_{KS}$, significance level, in the K-S test of the lognormal distribution, see Equation (66), for different mass distributions. The number of linear bins, n, is 20.

The second test is performed on the low-mass IMF in the young cluster NGC 6611, see  and CDS catalog J/MNRAS/392/1034. This massive cluster has an age of 2 - 3 Myr and contains masses in the range $1.5{M}_{\odot }\ge M\ge 0.02{M}_{\odot }$. Therefore the brown dwarf (BD) region, $\approx 0.2{\mathcal{M}}_{\odot }$, is covered.

The third test is performed on the $\gamma$ Velorum cluster, where the 237 stars have a range $1.31{M}_{\odot }\ge M\ge 0.15{M}_{\odot }$, see  and CDS catalog J/A+A/589/A70/Table 5.

The fourth test is performed on the young cluster Berkeley 59, where the 420 stars have a range $2.24{M}_{\odot }\ge M\ge 0.15{M}_{\odot }$, see  and CDS catalog J/AJ/155/44/Table 3.

5.3. The Lognormal Distribution

Let X be a random variable defined in $\left(0,\infty \right)$ ; the lognormal PDF, following  or formula (14.2) in , is

$\text{PDF}\left(x;m,\sigma \right)=\frac{{\text{e}}^{-\frac{1}{2{\sigma }^{2}}{\left(ln\left(\frac{x}{m}\right)\right)}^{2}}}{x\sigma \sqrt{2\pi }},$ (66)

where m is the median and $\sigma$ the shape parameter. The CDF is

$\text{CDF}\left(x;m,\sigma \right)=\frac{1}{2}+\frac{1}{2}\text{erf}\left(\frac{1}{2}\frac{\sqrt{2}\left(-ln\left(m\right)+ln\left(x\right)\right)}{\sigma }\right),$ (67)

where $\text{erf}\left(x\right)$ is the error function, defined as

$\text{erf}\left(x\right)=\frac{2}{\sqrt{\pi }}{\int }_{0}^{x}\text{ }{\text{e}}^{-{t}^{2}}\text{d}t,$ (68)

see . The average value or mean, $E\left(X\right)$, is

$E\left(X;m,\sigma \right)=m{\text{e}}^{\frac{1}{2}{\sigma }^{2}},$ (69)

the variance, $Var\left(X\right)$, is

$Var={\text{e}}^{{\sigma }^{2}}\left({\text{e}}^{{\sigma }^{2}}-1\right){m}^{2},$ (70)

and the second moment about the origin, $E\left({X}^{2}\right)$, is

$E\left({X}^{2};m,\sigma \right)={m}^{2}{\text{e}}^{2\text{ }{\sigma }^{2}}.$ (71)

The statistics for the lognormal distribution for these four astronomical samples of stars are reported in Table 1.
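For comparison with the Lindley family, Equations (66), (67), (69) and (70) can be coded with the standard library alone. The function names and the values m = 0.5 and σ = 0.8 are illustrative assumptions, not fitted values from Table 1.

```python
import math


def lognormal_pdf(x, m, sigma):
    """Eq. (66): lognormal PDF with median m and shape sigma, for x > 0."""
    z = math.log(x / m) / sigma
    return math.exp(-0.5 * z * z) / (x * sigma * math.sqrt(2.0 * math.pi))


def lognormal_cdf(x, m, sigma):
    """Eq. (67): lognormal CDF."""
    return 0.5 + 0.5 * math.erf((math.log(x) - math.log(m))
                                / (sigma * math.sqrt(2.0)))


def lognormal_mean_variance(m, sigma):
    """Eqs. (69) and (70)."""
    s2 = sigma * sigma
    return m * math.exp(0.5 * s2), math.exp(s2) * (math.exp(s2) - 1.0) * m * m


m, sigma = 0.5, 0.8  # arbitrary illustrative values
mean, variance = lognormal_mean_variance(m, sigma)
```

The CDF equals 1/2 at x = m, which is why m is called the median, and `mean**2 + variance` reproduces the second moment of Equation (71).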

5.4. The Generalizations of the Lindley Distribution

The statistics for the Lindley distribution and its generalizations are reported in the following tables: Table 2 for the Lindley distribution with one parameter, Table 3 for the TPLD, Table 4 for the PLD, Table 5 for the GLD, Table 6 for the NGLD and Table 7 for the NWL. The best fit for NGC 2362 is obtained with the PLD, see Figure 2.

The best fit for NGC 6611 is obtained with the Lindley PDF with one parameter, see Figure 3.

Table 2. Numerical values of ${\chi }_{red}^{2}$, AIC, probability Q, D, the maximum distance between theoretical and observed CDF, and ${P}_{KS}$, significance level, in the K-S test of the Lindley distribution with one parameter for different mass distributions. The number of linear bins, n, is 20.

Table 3. Numerical values of ${\chi }_{red}^{2}$, AIC, probability Q, D, the maximum distance between theoretical and observed CDF, and ${P}_{KS}$, significance level, in the K-S test of the TPLD distribution with two parameters for different mass distributions. The number of linear bins, n, is 20.

Table 4. Numerical values of ${\chi }_{red}^{2}$, AIC, probability Q, D, the maximum distance between theoretical and observed CDF, and ${P}_{KS}$, significance level, in the K-S test of the PLD distribution with two parameters for different mass distributions. The number of linear bins, n, is 20.

Table 5. Numerical values of ${\chi }_{red}^{2}$, AIC, probability Q, D, the maximum distance between theoretical and observed CDF, and ${P}_{KS}$, significance level, in the K-S test of the GLD distribution with three parameters for different mass distributions. The number of linear bins, n, is 20.

Table 6. Numerical values of ${\chi }_{red}^{2}$, AIC, probability Q, D, the maximum distance between theoretical and observed CDF, and ${P}_{KS}$, significance level, in the K-S test of the NGLD distribution with three parameters for different mass distributions. The number of linear bins, n, is 20.

Table 7. Numerical values of ${\chi }_{red}^{2}$, AIC, probability Q, D, the maximum distance between theoretical and observed CDF, and ${P}_{KS}$, significance level, in the K-S test of the NWL distribution with two parameters for different mass distributions. The number of linear bins, n, is 20.

The best fit for $\gamma$ Velorum is obtained with the lognormal PDF, see Figure 4.

The best fit for the young cluster Berkeley 59 is obtained with the NGLD, see Figure 5.

5.5. The Double Truncated Lindley

The statistics for the DTL with three parameters are reported in Table 8. Figure 6 reports the CDF of the DTL for NGC 6611 which is the best fit of the various distributions here analysed for this cluster.

Table 8. Numerical values of ${\chi }_{red}^{2}$, AIC, probability Q, D, the maximum distance between theoretical and observed CDF, and ${P}_{KS}$, significance level, in the K-S test of the DTL for different mass distributions. The number of linear bins, n, is 20.

Figure 2. Empirical PDF of mass distribution for NGC 2362 cluster data (273 stars + BDs) when the number of bins, n, is 20 (steps with blue full line) with a superposition of the PLD (red dashed line). Theoretical parameters as in Table 4.

Figure 3. Empirical PDF of mass distribution for NGC 6611 cluster data when the number of bins, n, is 20 (steps with blue full line) with a superposition of the Lindley PDF with one parameter (red dashed line). Theoretical parameters as in Table 2.

Figure 4. Empirical PDF of mass distribution for $\gamma$ Velorum cluster data when the number of bins, n, is 20 (steps with blue full line) with a superposition of the lognormal PDF (red dashed line). Theoretical parameters as in Table 1.

Figure 5. Empirical PDF of mass distribution for the young cluster Berkeley 59 when the number of bins, n, is 20 (steps with blue full line) with a superposition of the NGLD (red dashed line). Theoretical parameters as in Table 6.

Figure 6. Empirical CDF of mass distribution for NGC 6611 cluster data (blue dotted line) with a superposition of the DTL CDF with one parameter (red line). Theoretical parameters as in Table 8.

6. Conclusion

In this paper we explored five generalizations of the Lindley distribution as well as the double truncated Lindley distribution against the lognormal distribution. For each IMF of the four clusters here analysed, the distribution which realizes the best fit is reported in Table 9. This table allows us to conclude that the Lindley family suggested here produces better fits than the lognormal distribution does. Figure 7 reports the CDF for NGC 6611 as well as four fitting curves.

Table 9. Best fits: Name of the cluster, name of the distribution, D, the maximum distance between theoretical and observed CDF, and ${P}_{KS}$, significance level, in the K-S test.

Figure 7. Part of the empirical CDF of mass distribution for NGC 6611 cluster data (orange circles) with a superposition of the DTL CDF with one parameter (black full line), the lognormal (red dashed line), the Lindley with one parameter (green dot-dash-dot-dash line) and the TPLD (blue dot line).

Cite this paper: Zaninetti, L. (2020) New Probability Distributions in Astrophysics: II. The Generalized and Double Truncated Lindley. International Journal of Astronomy and Astrophysics, 10, 39-55. doi: 10.4236/ijaa.2020.101004.
References

   Lindley, D.V. (1958) Fiducial Distributions and Bayes’s Theorem. Journal of the Royal Statistical Society. Series B (Methodological), 20, 102-107.
https://doi.org/10.1111/j.2517-6161.1958.tb00278.x

   Lindley, D.V. (1965) Introduction to Probability and Statistics from a Bayesian Viewpoint. Cambridge University Press, Cambridge.
https://doi.org/10.1017/CBO9780511662973

   Shanker, R. and Mishra, A. (2013) A Two-Parameter Lindley Distribution. Statistics in Transition New Series, 1, 45.

   Ghitany, M., Alqallaf, F., Al-Mutairi, D.K. and Husain, H. (2011) A Two-Parameter Weighted Lindley Distribution and Its Applications to Survival Data. Mathematics and Computers in Simulation, 81, 1190.
https://doi.org/10.1016/j.matcom.2010.11.005

   Mahmoudi, E. and Zakerzadeh, H. (2010) Generalized Poisson-Lindley Distribution. Communications in Statistics Theory and Methods, 39, 1785.
https://doi.org/10.1080/03610920902898514

   Bakouch, H.S., Al-Zahrani, B.M., Al-Shomrani, A.A., Marchi, V.A. and Louzada, F. (2012) An Extended Lindley Distribution. Journal of the Korean Statistical Society, 41, 75.
https://doi.org/10.1016/j.jkss.2011.06.002

   Merovci, F. and Elbatal, I. (2013) Transmuted Lindley-Geometric Distribution and Its Applications. arXiv Preprint arXiv:1309.3774.

   Tomy, L. (2018) A Retrospective Study on Lindley Distribution. Biometrics and Biostatistics International Journal, 7, 163.
https://doi.org/10.15406/bbij.2018.07.00205

   Shanker, R. and Mishra, A. (2014) A Two-Parameter Poisson-Lindley Distribution. International Journal of Statistics and Systems, 9, 79.

   Zaninetti, L. (2019) The Truncated Lindley Distribution with Applications in Astrophysics. Galaxies, 7, 61.
https://doi.org/10.3390/galaxies7020061

   Olver, F.W.J., Lozier, D.W., Boisvert, R.F. and Clark, C.W. (2010) NIST Handbook of Mathematical Functions. Cambridge University Press, Cambridge.

   Zakerzadeh, H. and Dolati, A. (2009) Generalized Lindley Distribution. Journal of Mathematical Extension, 3, 13.

   Ibrahim, E., Merovci, F. and Elgarhy, M. (2013) A New Generalized Lindley Distribution. Mathematical Theory and Modeling, 3, 30.

   Asgharzadeh, A., Bakouch, H.S., Nadarajah, S., Sharafi, F., et al. (2016) A New Weighted Lindley Distribution with Application. Brazilian Journal of Probability and Statistics, 30, 1.
https://doi.org/10.1214/14-BJPS253

   Singh, S.K., Singh, U. and Sharma, V.K. (2014) The Truncated Lindley Distribution: Inference and Application. Journal of Statistics Applications & Probability, 3, 219.
https://doi.org/10.12785/jsap/030212

   Press, W.H., Teukolsky, S.A., Vetterling, W.T. and Flannery, B.P. (1992) Numerical Recipes in FORTRAN. The Art of Scientific Computing. Cambridge University Press, Cambridge, UK.

   Akaike, H. (1974) A New Look at the Statistical Model Identification. IEEE Transactions on Automatic Control, 19, 716.
https://doi.org/10.1109/TAC.1974.1100705

   Liddle, A.R. (2004) How Many Cosmological Parameters? Monthly Notices of the Royal Astronomical Society, 351, L49.
https://doi.org/10.1111/j.1365-2966.2004.08033.x

   Godlowski, W. and Szydłowski, M. (2005) Constraints on Dark Energy Models from Supernovae. In: Turatto, M., Benetti, S., Zampieri, L. and Shea, W., Eds., 1604-2004: Supernovae as Cosmological Lighthouses, Astronomical Society of the Pacific Conference Series, Astronomical Society of the Pacific, New York, 508-516.

   Kolmogoroff, A. (1941) Confidence Limits for an Unknown Distribution Function. The Annals of Mathematical Statistics, 12, 461.
https://doi.org/10.1214/aoms/1177731684

   Smirnov, N. (1948) Table for Estimating the Goodness of Fit of Empirical Distributions. The Annals of Mathematical Statistics, 19, 279.
https://doi.org/10.1214/aoms/1177730256

   Massey, F.J., Jr. (1951) The Kolmogorov-Smirnov Test for Goodness of Fit. Journal of the American Statistical Association, 46, 68.
https://doi.org/10.1080/01621459.1951.10500769

   Irwin, J., Hodgkin, S., Aigrain, S., Bouvier, J., Hebb, L., Irwin, M. and Moraux, E. (2008) The Monitor Project: Rotation of Low-Mass Stars in NGC 2362-Testing the Disc Regulation Paradigm at 5 Myr. Monthly Notices of the Royal Astronomical Society, 384, 675.
https://doi.org/10.1111/j.1365-2966.2007.12725.x

   Oliveira, J.M., Jeffries, R.D. and van Loon, J.T. (2009) The Low-Mass Initial Mass Function in the Young Cluster NGC 6611. Monthly Notices of the Royal Astronomical Society, 392, 1034.
https://doi.org/10.1111/j.1365-2966.2008.14140.x

   Prisinzano, L., Damiani, F., et al. (2016) The Gaia-ESO Survey: Membership and Initial Mass Function of the γ Velorum Cluster. Astronomy & Astrophysics, 589, A70.
https://doi.org/10.1051/0004-6361/201527875

   Panwar, N., Pandey, A.K., Samal, M.R., et al. (2018) Young Cluster Berkeley 59: Properties, Evolution, and Star Formation. The Astronomical Journal, 155, 44.
https://doi.org/10.3847/1538-3881/aa9f1b

   Evans, M., Hastings, N. and Peacock, B. (2000) Statistical Distributions. 3rd Edition, John Wiley & Sons Inc., New York.

   Johnson, N.L., Kotz, S. and Balakrishnan, N. (1994) Continuous Univariate Distributions. Volume 1, 2nd Edition, Wiley, New York.
