OJS, Vol. 11 No. 2, April 2021
Density Estimation Using Gumbel Kernel Estimator
Abstract: In this article, we propose a new kernel estimator, the Gumbel kernel, which broadens the class of non-negative, asymmetric kernel density estimators. Such estimators can be used in nonparametric estimation of a probability density function (pdf). When the underlying density has bounded support on [0, ∞), the proposed estimator is free of boundary bias, always non-negative, and attains the optimal rate of convergence for the mean integrated squared error (MISE). The bias, variance and optimal bandwidth of the proposed estimator are investigated both theoretically and by simulation. Further, the proposed estimator is compared with the Weibull kernel estimator, and the performance of the newly proposed kernel is outstanding.

1. Introduction

Density estimation plays a vital role in investigating the properties and features of data and in anomaly detection. For this purpose, nonparametric kernel density (or curve) estimation is a popular technique. Nonparametric estimation has certain advantages over parametric estimation: it avoids the problem of choosing a prior distribution, permits the use of non-homogeneous data, assumes no functional form and, most importantly, controls the allocation of weights [1]. In nonparametric density estimation, boundary bias is a very serious issue: the performance of the estimator suffers more at boundary points than at interior points. The problem arises when the variable represents some physical measure, such as time or length, and thus has a natural lower boundary (e.g. time of birth). When smoothing is carried out near such a boundary, a fixed symmetric kernel allocates weight outside the support of the density, and boundary bias arises [3]. This is why, in some cases, parametric curve estimation performs better than nonparametric estimation [2].

There is a vast literature on removing boundary effects in nonparametric methods, yet there appears to be no single dominant solution that corrects the boundary problem for all shapes of densities. Common techniques include reflection of the data, introduced by Schuster [4]; the negative reflection proposed by Silverman [5]; and the semi-parametric model suggested by Eubank and Speckman [6]. Chen [7] suggested replacing symmetric kernels by the asymmetric Beta kernel, which never assigns weight outside the support. Many others followed Chen's idea and proposed further kernels, e.g. Gamma [8], lognormal [9], Inverse Gaussian [10], Weibull [11], etc.

Following Chen [7], we propose a new class of density estimator, the Gumbel kernel estimator, along with its bias, variance and optimal bandwidth; it is a useful addition to the category of asymmetric kernels that resolve the problem of boundary bias. The Gumbel distribution is a particular case of the Generalized Extreme Value (GEV) distribution and is also known as the log-Weibull or Fisher-Tippett distribution. The GEV family of continuous probability distributions combines the Gumbel, Fréchet and Weibull families, also known as the type I, II and III extreme value distributions. The common functional form for all three distributions was given by McFadden [12].

This paper is organized as follows. Section 1 gives some background on kernel smoothing; Section 2 presents the proposed kernel; Section 3 derives the bias, variance and optimal bandwidth of the Gumbel kernel estimator; Section 4 tests the performance of the proposed estimator on real and simulated data sets; and Section 5 concludes.

2. Gumbel Kernel Estimator

Let $X_1, \ldots, X_n$ be a random sample from a distribution with an unknown probability density function f having bounded support on [0, ∞). The pdf of the Gumbel(µ, β) distribution is

$$f(j) = \frac{1}{\beta}\, e^{-\left(z + e^{-z}\right)}, \quad j > 0,$$  (1)

where $z = \frac{j-\mu}{\beta}$ and $\beta > 0$. The mean and variance of J are $\mu + \beta\gamma$ and $\frac{\beta^2 \pi^2}{6}$ respectively, where $\gamma \approx 0.5772$ is the Euler-Mascheroni constant.

Taking $\mu = x$ and $\beta = \sqrt{h}$, the class of Gumbel kernels considered is:

$$K_{\mathrm{Gumbel}(x,\sqrt{h})}(j) = \frac{1}{\sqrt{h}} \exp\left(-\left(\frac{j-x}{\sqrt{h}} + e^{-\frac{j-x}{\sqrt{h}}}\right)\right).$$  (2)

where h is the bandwidth, satisfying $h \to 0$ and $nh \to \infty$ as $n \to \infty$. If a random variable X has pdf $K_{\mathrm{Gumbel}(x,\sqrt{h})}$, then $E(X) = x + \sqrt{h}\,\gamma$ and $\mathrm{var}(X) = \frac{h\pi^2}{6}$.
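As a quick numerical sanity check of these moments (a sketch of our own, not part of the original derivation), one can draw from a Gumbel law with location x and scale $\sqrt{h}$ and compare the sample mean and variance with $x + \sqrt{h}\gamma$ and $h\pi^2/6$:

```python
import numpy as np

# Check E(X) = x + sqrt(h)*gamma and var(X) = h*pi^2/6 for X ~ Gumbel(x, sqrt(h)).
gamma_e = 0.57721566  # Euler-Mascheroni constant
x, h = 2.0, 0.09
rng = np.random.default_rng(1)
draws = rng.gumbel(loc=x, scale=np.sqrt(h), size=200_000)

print(draws.mean(), x + np.sqrt(h) * gamma_e)   # both close to 2.173
print(draws.var(), h * np.pi ** 2 / 6)          # both close to 0.148
```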

The corresponding estimator of the pdf is

$$\hat{f}_{\mathrm{Gumbel}}(x) = n^{-1} \sum_{i=1}^{n} K_{\mathrm{Gumbel}(x,\sqrt{h})}(X_i).$$  (3)
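The estimator in (3) is straightforward to implement. The following is a minimal sketch in Python/NumPy (the function names are ours, not the authors'), with the kernel scale taken as $\sqrt{h}$ as above:

```python
import numpy as np

def gumbel_kernel(y, x, h):
    """Gumbel kernel K_{Gumbel(x, sqrt(h))}(y): location x, scale sqrt(h)."""
    s = np.sqrt(h)
    z = np.clip((y - x) / s, -700.0, None)  # clip guards exp overflow far left of x
    return np.exp(-(z + np.exp(-z))) / s

def gumbel_kde(x_grid, data, h):
    """f_hat(x) = n^{-1} * sum_i K_{Gumbel(x, sqrt(h))}(X_i), as in Eq. (3)."""
    x = np.asarray(x_grid, dtype=float)[:, None]   # evaluation points x
    y = np.asarray(data, dtype=float)[None, :]     # observations X_i
    return gumbel_kernel(y, x, h).mean(axis=1)
```

On a sample supported on [0, ∞), the estimate is non-negative everywhere and requires no weight correction at the zero boundary.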

This estimator is easy to use. For comparison, we list the following related kernels.

Gamma 1 and Gamma 2 kernels by Chen [8] are:

$$K_{Gam1(x/b+1,\,b)}(y) = \frac{y^{x/b} \exp\{-y/b\}}{\Gamma\{x/b+1\}\, b^{x/b+1}},$$  (4)

and

$$K_{Gam2(\rho_b(x),\,b)}(y) = \frac{y^{\rho_b(x)-1} \exp\{-y/b\}}{\Gamma\{\rho_b(x)\}\, b^{\rho_b(x)}},$$  (5)

where

$$\rho_b(x) = \begin{cases} x/b & \text{if } x \in [2b, \infty); \\ \frac{1}{4}(x/b)^2 + 1 & \text{if } x \in [0, 2b). \end{cases}$$  (6)

Beta kernel by Chen [7] is:

$$K_{B(x/b+1,\,(1-x)/b+1)}(y) = \frac{y^{x/b} (1-y)^{(1-x)/b}}{B\{x/b+1,\,(1-x)/b+1\}},$$  (7)

where B is the Beta function.

Birnbaum-Saunders and Log-Normal kernels by Jin and Kawczak [9] are:

$$K_{BS(b^{1/2},\,x)}(y) = \frac{1}{2\sqrt{2\pi b}}\left(\frac{1}{\sqrt{xy}} + \sqrt{\frac{x}{y^3}}\right) \exp\left[-\frac{1}{2b}\left(\frac{y}{x} - 2 + \frac{x}{y}\right)\right],$$  (8)

and

$$K_{LN(\ln x,\,4\ln(1+b))}(y) = \frac{1}{\sqrt{8\pi \ln(1+b)}\; y} \exp\left[-\frac{(\ln y - \ln x)^2}{8\ln(1+b)}\right].$$  (9)

Inverse Gaussian and Reciprocal Inverse Gaussian kernels by Scaillet [10] are:

$$K_{IG(x,\,1/b)}(y) = \frac{1}{\sqrt{2\pi b y^3}} \exp\left[-\frac{1}{2bx}\left(\frac{y}{x} - 2 + \frac{x}{y}\right)\right],$$  (10)

and

$$K_{RIG(1/(x-b),\,1/b)}(y) = \frac{1}{\sqrt{2\pi b y}} \exp\left[-\frac{x-b}{2b}\left(\frac{y}{x-b} - 2 + \frac{x-b}{y}\right)\right].$$  (11)

Erlang kernel by Salha et al. [13] is:

$$K_{E(x,\,1/b)}(y) = \frac{1}{\Gamma\left(1+\frac{1}{b}\right)} \left[\frac{1}{x}\left(1+\frac{1}{b}\right)\right]^{\frac{b+1}{b}} y^{1/b} \exp\left(-\frac{y}{x}\left(1+\frac{1}{b}\right)\right).$$  (12)

Weibull kernel by Salha et al. [11] is:

$$K_{w(x,\,1/b)}(y) = \frac{\Gamma(1+b)}{bx} \left[\frac{y\,\Gamma(1+b)}{x}\right]^{\frac{1}{b}-1} \exp\left(-\left(\frac{y\,\Gamma(1+b)}{x}\right)^{1/b}\right).$$  (13)

3. Bias, Variance and Optimal Bandwidth

Theorem 1 (Bias)

The bias of the proposed estimator is given by:

$$\mathrm{Bias}\big\{\hat{f}_{\mathrm{Gumbel}}(x)\big\} = \sqrt{h}\left(\gamma f'(x) + \sqrt{h}\,\frac{\pi^2}{12}\, f''(x)\right) + o(1).$$  (14)

Proof:

$$E\big(\hat{f}_{\mathrm{Gumbel}}(x)\big) = \int_0^{\infty} K_{\mathrm{Gumbel}(x,\sqrt{h})}(y)\, f(y)\, dy = E\big(f(\xi_x)\big),$$

where $\xi_x$ follows a Gumbel distribution with location parameter x and scale parameter $\sqrt{h}$.

The Taylor expansion of $f(\xi_x)$ about $\mu_x = E(\xi_x)$ is:

$$f(\xi_x) = f(\mu_x) + f'(\mu_x)(\xi_x - \mu_x) + \frac{1}{2} f''(\mu_x)(\xi_x - \mu_x)^2 + o\big((\xi_x - \mu_x)^2\big).$$

So, $E\big(f(\xi_x)\big) = f(\mu_x) + \frac{1}{2} f''(\mu_x)\,\mathrm{var}(\xi_x) + o(1)$. Expanding $f(\mu_x) = f\big(x + \sqrt{h}\,\gamma\big)$ about x and using $\mathrm{var}(\xi_x) = \frac{h\pi^2}{6}$ gives

$$E\big(f(\xi_x)\big) = f(x) + \sqrt{h}\left(\gamma f'(x) + \sqrt{h}\,\frac{\pi^2}{12}\, f''(x)\right) + o(1).$$

Hence,

$$\mathrm{Bias}\big(\hat{f}_{\mathrm{Gumbel}}(x)\big) = \sqrt{h}\left(\gamma f'(x) + \sqrt{h}\,\frac{\pi^2}{12}\, f''(x)\right) + o(1).$$
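As a numerical illustration of Theorem 1 (a check of our own, assuming an Exp(1) true density $f(y)=e^{-y}$, which is not a density used in the paper), one can compute $E\hat{f}(x) = \int K f$ by quadrature and compare the resulting bias at x = 2 with the expansion above:

```python
import numpy as np

# True density f(y) = exp(-y) on [0, inf); compare the exact bias of the
# Gumbel kernel estimator at x = 2 with the Theorem 1 expansion.
gamma_e = 0.57721566
x, h = 2.0, 0.04
s = np.sqrt(h)

y = np.linspace(0.0, 30.0, 300_001)
z = (y - x) / s
kernel = np.exp(-(z + np.exp(-z))) / s            # K_{Gumbel(x, sqrt(h))}(y)
dy = y[1] - y[0]
e_fhat = np.sum(kernel * np.exp(-y)) * dy          # E f_hat(x) = int K(y) f(y) dy

bias_exact = e_fhat - np.exp(-x)
bias_theory = s * gamma_e * (-np.exp(-x)) + h * np.pi ** 2 / 12 * np.exp(-x)
print(bias_exact, bias_theory)                     # both close to -0.011
```

The two values agree to roughly $10^{-4}$, consistent with the order of the neglected remainder.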

Theorem 2 (Variance)

The variance of the proposed estimator is given by:

$$\mathrm{var}\big(\hat{f}_{\mathrm{Gumbel}}(x)\big) = \frac{1}{2 n \sqrt{h}}\left(x + \frac{\sqrt{h}\,\gamma}{2}\right) f\!\left(x + \frac{\sqrt{h}\,\gamma}{2}\right).$$  (15)

Proof:

$$\mathrm{var}\big(\hat{f}_{\mathrm{Gumbel}}(x)\big) = \frac{1}{n}\,\mathrm{var}\big[K_{\mathrm{Gumbel}(x,\sqrt{h})}(X_i)\big] = n^{-1}\, E\big(K^2_{\mathrm{Gumbel}(x,\sqrt{h})}(X_i)\big) + o(n^{-1}).$$

Let $\eta_x$ be a Gumbel$\left(x, \frac{\sqrt{h}}{2}\right)$ distributed random variable. Hence $\mu_x = E[\eta_x] = x + \frac{\sqrt{h}\,\gamma}{2}$ and $V_x = \mathrm{var}[\eta_x] = \frac{h\pi^2}{24}$. We have

$$E\big(K^2_{\mathrm{Gumbel}(x,\sqrt{h})}(X_i)\big) = J_h\, E[\eta_x f(\eta_x)],$$

where $J_h = \frac{1}{2\sqrt{h}}$. By Taylor expansion of $\eta_x f(\eta_x)$ about $\mu_x$ we get:

$$\eta_x f(\eta_x) = \mu_x f(\mu_x) + \big[\mu_x f'(\mu_x) + f(\mu_x)\big](\eta_x - \mu_x) + \frac{1}{2}\big[\mu_x f''(\mu_x) + 2 f'(\mu_x)\big](\eta_x - \mu_x)^2 + o\big((\eta_x - \mu_x)^2\big).$$

So,

$$E[\eta_x f(\eta_x)] = \mu_x f(\mu_x) + \frac{1}{2}\big[\mu_x f''(\mu_x) + 2 f'(\mu_x)\big] V_x + o(h) = \left(x + \frac{\sqrt{h}\,\gamma}{2}\right) f\!\left(x + \frac{\sqrt{h}\,\gamma}{2}\right) + O(h).$$

Therefore,

$$\mathrm{var}\big(\hat{f}_{\mathrm{Gumbel}}(x)\big) = \frac{1}{2 n \sqrt{h}}\left(x + \frac{\sqrt{h}\,\gamma}{2}\right) f\!\left(x + \frac{\sqrt{h}\,\gamma}{2}\right) + o\!\left(\frac{1}{n\sqrt{h}}\right).$$

Optimal Bandwidth

To obtain the optimal bandwidth, the Mean Squared Error (MSE) and the Mean Integrated Squared Error (MISE) are derived first. The MSE of the Gumbel kernel estimator is

$$\mathrm{MSE}\big[\hat{f}_{\mathrm{Gumbel}}(x)\big] = \mathrm{Bias}^2\big[\hat{f}_{\mathrm{Gumbel}}(x)\big] + \mathrm{var}\big[\hat{f}_{\mathrm{Gumbel}}(x)\big] = h\left(\gamma f'(x) + \sqrt{h}\,\frac{\pi^2}{12}\, f''(x)\right)^2 + \frac{1}{2 n \sqrt{h}}\left(x + \frac{\sqrt{h}\,\gamma}{2}\right) f\!\left(x + \frac{\sqrt{h}\,\gamma}{2}\right).$$

Integrating over x, we can approximate the MISE by:

$$\mathrm{MISE}\big[\hat{f}_{\mathrm{Gumbel}}\big] = h A + \frac{1}{2 n \sqrt{h}}\, B,$$  (16)

where $A = \int \left(\gamma f'(x) + \sqrt{h}\,\frac{\pi^2}{12}\, f''(x)\right)^2 dx$ and $B = \int \left(x + \frac{\sqrt{h}\,\gamma}{2}\right) f\!\left(x + \frac{\sqrt{h}\,\gamma}{2}\right) dx$.

To find the optimal bandwidth, we minimize Equation (16) with respect to h, treating A and B as constants:

$$\frac{d\,\mathrm{MISE}\big[\hat{f}_{\mathrm{Gumbel}}\big]}{dh} = A - \frac{h^{-3/2}}{4n}\, B,$$  (17)

$$\frac{d^2\,\mathrm{MISE}\big[\hat{f}_{\mathrm{Gumbel}}\big]}{dh^2} = \frac{3}{8n}\, h^{-5/2}\, B > 0.$$

Setting (17) equal to zero yields the optimal bandwidth $h_{\mathrm{opt}}$ for the given pdf and kernel:

$$h_{\mathrm{opt}} = \left(\frac{B}{4 n A}\right)^{2/3} = \left[\frac{\int \left(x + \frac{\sqrt{h}\,\gamma}{2}\right) f\!\left(x + \frac{\sqrt{h}\,\gamma}{2}\right) dx}{4 n \int \left(\gamma f'(x) + \sqrt{h}\,\frac{\pi^2}{12}\, f''(x)\right)^2 dx}\right]^{2/3}.$$
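The minimization in (16)-(17) can be verified numerically. The sketch below uses arbitrary illustrative constants for A, B and n (placeholders of ours, not values derived from any particular density) and confirms that $h_{\mathrm{opt}} = (B/(4nA))^{2/3}$ minimizes $hA + B/(2n\sqrt{h})$:

```python
import numpy as np

# Treat A and B as constants (illustrative values) and minimise
# MISE(h) = h*A + B / (2*n*sqrt(h)) on a fine grid.
A, B, n = 0.8, 1.5, 500

def mise(h):
    return h * A + B / (2 * n * np.sqrt(h))

h_opt = (B / (4 * n * A)) ** (2 / 3)              # closed-form minimiser
hs = np.linspace(h_opt / 10, h_opt * 10, 200_001)
h_grid_min = hs[np.argmin(mise(hs))]
print(h_opt, h_grid_min)                           # both close to 0.0096
```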

4. Applications

In this section, the performance of the proposed estimator in estimating the pdf is examined using real-life data as well as a simulation study.

4.1. Suicide Data Example

We take the suicide data given in Silverman [5] to inspect the performance of the newly developed kernel. The data give the lengths of treatment spells (in days) of control patients in a suicide study.

We used the logarithm of the data to draw Figure 1, with the data-driven bandwidth known as the normal scale rule (NSR) of Silverman [5]:

$$h_{NSR} = 0.79\, R\, n^{-1/5},$$  (18)

where R is the inter-quartile range; this yields h = 0.4894. It can be observed that the Gumbel kernel performs very well, is free of boundary bias, and does especially well near the end points.
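The NSR bandwidth in (18) is simple to compute; a sketch (a helper of our own, taking the inter-quartile range from the sample):

```python
import numpy as np

def nsr_bandwidth(data):
    """Normal scale rule h = 0.79 * R * n^(-1/5), R = inter-quartile range."""
    data = np.asarray(data, dtype=float)
    q75, q25 = np.percentile(data, [75, 25])
    return 0.79 * (q75 - q25) * len(data) ** (-1 / 5)
```

Applied to the log-transformed suicide data, this is how the value 0.4894 quoted above would be obtained.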

4.2. Flood Data Example

Further, we used the flood data given in Gumbel [14] to exhibit the practical performance of the Gumbel estimator. The data give the discharge per second of the Rhone River (Europe).

Here a fixed bandwidth of 1,000,000 is used. Figure 2 shows the performance of the newly proposed kernel estimator, which is acceptable.

4.3. Simulation Study

In this section we investigate the finite-sample properties of two asymmetric kernel estimators, Gumbel and Weibull, both of which belong to the family of extreme value distributions. The experiments are based on 1000 random samples of size $n = 3^5 = 243$, $n = 486$ and $n = 972$. For each simulated sample and each estimator considered, mean squared errors (MSE) are reported in Table 1 for the extreme value distributions, namely the Fréchet, Weibull and Gumbel distributions, with various parameter values, using the bandwidth given in [8].
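A sketch of one cell of such an experiment is below (our own code, not the authors'; the bandwidths are fixed illustrative values rather than the data-driven choice of [8]). It draws samples from a Gumbel(1, 1) truth, keeps the positive part, and compares the MSE of the Gumbel kernel estimate (2)-(3) with that of the Weibull kernel estimate (13) on a grid:

```python
import numpy as np
from math import gamma as gamma_fn

def gumbel_kde(grid, data, h):
    """Gumbel kernel density estimate, Eqs. (2)-(3), scale sqrt(h)."""
    s = np.sqrt(h)
    z = np.clip((data[None, :] - grid[:, None]) / s, -700.0, None)
    return np.mean(np.exp(-(z + np.exp(-z))) / s, axis=1)

def weibull_kde(grid, data, b):
    """Weibull kernel density estimate, Eq. (13): shape 1/b, scale x/Gamma(1+b)."""
    c = gamma_fn(1 + b)
    x = grid[:, None]
    t = data[None, :] * c / x
    return np.mean(c / (b * x) * t ** (1 / b - 1) * np.exp(-t ** (1 / b)), axis=1)

rng = np.random.default_rng(0)
grid = np.linspace(0.1, 6.0, 120)
true_pdf = np.exp(-(grid - 1.0) - np.exp(-(grid - 1.0)))  # Gumbel(1, 1) density

mse_g, mse_w = [], []
for _ in range(200):
    sample = rng.gumbel(1.0, 1.0, size=243)
    sample = sample[sample > 0]            # keep the positive support
    mse_g.append(np.mean((gumbel_kde(grid, sample, 0.1) - true_pdf) ** 2))
    mse_w.append(np.mean((weibull_kde(grid, sample, 0.1) - true_pdf) ** 2))

print(np.mean(mse_g), np.mean(mse_w))
```

The full study in Table 1 repeats this over 1000 replications, three sample sizes and several GEV parameter settings.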

Figure 1. The Gumbel kernel estimator for the suicide data.

Figure 2. The Gumbel kernel estimator for the flood data.

Table 1. Mean square errors.

Here in Table 1, a variety of randomly selected location parameters (small/medium/large) is examined with a constant scale parameter. We observe that the Gumbel kernel estimator performs better than the Weibull kernel estimator almost unanimously, across all density estimates, parameter settings and sample sizes. For both the Gumbel and the Weibull kernel, the MSE decreases as the sample size increases. In the graphical representation, we plot the Gumbel kernel estimate against the real density. Figure 3 shows that the performance of the Gumbel estimator is acceptable at the boundary near zero for the different densities. In the interior, the behavior of the pdf estimator becomes more similar to the true density as we move away from zero, in any of the extreme value distribution cases.

Figure 3. The Gumbel kernel estimator of the density functions for different distributions (the solid line shows the real density; the other line represents the density estimated by the Gumbel kernel). (a) Gumbel (3, 1); (b) Fréchet (3, 1, 1); (c) Weibull (25.713, 1).

5. Conclusion

In this paper, we have proposed a new kernel estimator, the Gumbel kernel, for probability density functions of iid data on [0, ∞). Such densities are encountered in a wide variety of applications describing extreme wind speeds, sea wave heights, floods, rainfall, age at death, minimum temperature, rainfall during droughts, electrical strength of materials, air pollution problems, geological problems, naval engineering, etc. [4]. The Gumbel kernel is free of boundary bias and non-negative, with a naturally varying shape. We showed that the bias depends on the smoothing parameter h and the estimation point x; it goes to zero as h → 0 and becomes smaller for values of x close to zero. The variance of the newly proposed kernel estimator was also investigated; it likewise depends on h and x, going to zero as h → 0 and becoming large for values of x close to zero.

In addition, the performance of the proposed estimator was tested in three applications. In a simulation study, we used different densities from the GEV family and compared the estimator with the Weibull (extreme value type III) kernel estimator on the basis of MSE. We observed that the performance of the proposed estimator is excellent, giving a smaller MSE. Additionally, real data examples exhibited the practical performance of the new estimator.

From the above discussion, it can be concluded that one of the reasons for adopting nonparametric methods was to control the allocation of weights at boundary points, yet boundary bias is still present when symmetric kernels are used for curve estimation. In this situation, the best alternative is an asymmetric kernel, and the Gumbel kernel is, comparatively, a finer selection than the Weibull kernel.

Cite this paper: Khan, J. and Akbar, A. (2021) Density Estimation Using Gumbel Kernel Estimator. Open Journal of Statistics, 11, 319-328. doi: 10.4236/ojs.2021.112018.
References

[1]   Adamowski, K. and Labatiuk, C. (1987) Estimation of Flood Frequencies by a Nonparametric Density Procedure. In: Singh, V.P., ed., Hydrologic Frequency Modeling, Springer, Dordrecht.
https://doi.org/10.1007/978-94-009-3953-0_5

[2]   Jou, P.H., Akhoond-Ali, A.M., Behnia, A. and Chinipardaz, R. (2008) Parametric and Nonparametric Frequency Analysis of Monthly Precipitation in Iran. Journal of Applied Sciences, 8, 3242-3248.
https://doi.org/10.3923/jas.2008.3242.3248

[3]   Cid, J.A. and von Davier, A.A. (2015) Examining Potential Boundary Bias Effects in Kernel Smoothing on Equating: An Introduction for the Adaptive and Epanechnikov Kernels. Applied Psychological Measurement, 39, 208-222.
https://doi.org/10.1177/0146621614555901

[4]   Schuster, E.F. (1985) Incorporating Support Constraints into Nonparametric Estimation of Densities. Communications in Statistics—Theory and Methods, 14, 1123-1136.
https://doi.org/10.1080/03610928508828965

[5]   Silverman, B.W. (1986) Density Estimation. Chapman and Hall/CRC, London.
https://doi.org/10.1007/978-1-4899-3324-9_6

[6]   Eubank, R.L. and Speckman, P. (1990) Curve Fitting by Polynomial-Trigonometric Regression. Biometrika, 77, 1-9.
https://doi.org/10.1093/biomet/77.1.1

[7]   Chen, S.X. (2000) Beta Kernel Smoothers for Regression Curves. Statistica Sinica, 10, 73-91.

[8]   Chen, S.X. (2000) Probability Density Function Estimation Using Gamma Kernels. Annals of the Institute of Statistical Mathematics, 52, 471-480.
https://doi.org/10.1023/A:1004165218295

[9]   Jin, X. and Kawczak, J. (2003) Birnbaum-Saunders and Lognormal Kernel Estimators for Modeling Durations in High Frequency Financial Data. Annals of Economics and Finance, 4, 103-124.

[10]   Scaillet, O. (2004) Density Estimation Using Inverse and Reciprocal Inverse Gaussian Kernels. Nonparametric Statistics, 16, 217-226.
https://doi.org/10.1080/10485250310001624819

[11]   Salha, R.B., Ahmed, E.S. and Alhoubi, I.M. (2014) Hazard Rate Function Estimation Using Weibull Kernel. Open Journal of Statistics, 4, 650-661.
https://doi.org/10.4236/ojs.2014.48061

[12]   McFadden, D. (1978) Modelling the Choice of Residential Location. Transportation Research Record, 673, 72-77.

[13]   Salha, R.B., Ahmed, E.S. and Alhoubi, I.M. (2014) Hazard Rate Function Estimation Using Erlang Kernel. Pure Mathematical Sciences, 3, 141-152.
https://doi.org/10.12988/pms.2014.4616

[14]   Gumbel, E.J. (1941) The Return Period of Flood Flows. The Annals of Mathematical Statistics, 12, 163-190.
https://doi.org/10.1214/aoms/1177731747

 
 