An Efficient Class of Estimators for the Finite Population Mean in Ranked Set Sampling


Received 14 February 2016; accepted 11 June 2016; published 14 June 2016

1. Introduction

The problem of estimating the finite population mean has been widely considered by many authors under different sampling designs. In applications, the variable of interest may be difficult or very expensive to measure, yet easy to rank at little or no cost. For this situation, [2] introduced the Ranked Set Sampling (RSS) procedure. [3] established the mathematical theory showing that the sample mean under RSS is an unbiased estimator of the finite population mean and is more precise than the sample mean under simple random sampling (SRS).

Auxiliary information plays an important role in increasing the efficiency of an estimator. [4] suggested an estimator of the population ratio in RSS and showed that it has smaller variance than the usual ratio estimator in SRS.

In RSS, perfect ranking of the elements was assumed by [2] and [3] for estimating the population mean. In some situations, ranking may not be perfect. According to [5], the sample mean in RSS is an unbiased estimator of the population mean regardless of errors in ranking the elements. In [6], the elements were ranked on the basis of the auxiliary variable instead of judgment. [1] suggested an estimator of the population mean in which the elements are ranked on the basis of the auxiliary variable. [7] suggested a class of Hartley-Ross type unbiased estimators in RSS. [8] also proposed unbiased estimators in RSS and stratified ranked set sampling.

In this paper, we suggest a class of estimators of the population mean that uses the known population mean of the auxiliary variable in RSS. It is shown that the proposed class outperforms the estimators of [9] and [1] and several other estimators. Some special cases of the proposed class are listed in Table A1 (Appendix).

2. Ranked Set Sampling Procedure

In ranked set sampling (RSS), we select m random samples, each of m units, from the population and rank the units within each sample with respect to the variable of interest. To facilitate the ranking, the design parameter m is chosen to be small. From the first sample the unit with the lowest rank is selected, from the second sample the unit with the second lowest rank is selected, and the process continues until the unit with the highest rank is selected from the last sample. In this way, we obtain m measured units, one from each sample. The cycle may be repeated r times until $n = mr$ units have been measured. These units form the RSS data.

Suppose that the variable of interest Y is difficult to measure and to rank, but an auxiliary variable X, correlated with Y, is available. The variable X may then be used to obtain the ranks of Y. To perform the sampling procedure, m bivariate random samples, each of m units, are drawn from the population, and each sample is ranked with respect to one of the variables Y or X. Here, we assume that ranking is perfect on the auxiliary variable X, while the induced ranking of Y is subject to error. From the first sample, the unit with the smallest rank of X is measured, together with the associated value of Y. From the second sample, the Y associated with the second smallest rank of X is measured. The process continues until, from the mth sample, the Y associated with the highest rank of X is measured. The cycle is repeated r times until $n = mr$ bivariate units have been measured out of the total $m^2 r$ selected units.
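The sampling procedure described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `ranked_set_sample`, the synthetic gamma population and the choice of drawing the $m^2$ units of each cycle without replacement are not taken from the paper.

```python
import numpy as np

def ranked_set_sample(y, x, m, r, rng):
    """Draw n = m*r (y, x) pairs by RSS, ranking every set on x only."""
    y = np.asarray(y, dtype=float)
    x = np.asarray(x, dtype=float)
    ys, xs = [], []
    for _ in range(r):  # r cycles
        # m disjoint sets of m units each (m*m units selected per cycle)
        sets = rng.choice(len(x), size=m * m, replace=False).reshape(m, m)
        for i in range(m):
            ranked = sets[i][np.argsort(x[sets[i]])]  # order the i-th set by X
            pick = ranked[i]                          # unit with the (i+1)-th smallest X
            ys.append(y[pick])                        # only this unit's Y is measured
            xs.append(x[pick])
    return np.array(ys), np.array(xs)

rng = np.random.default_rng(0)
x_pop = rng.gamma(shape=2.0, scale=50.0, size=1000)      # synthetic auxiliary variable
y_pop = 3.0 * x_pop + rng.normal(0.0, 20.0, size=1000)   # synthetic study variable
y_rss, x_rss = ranked_set_sample(y_pop, x_pop, m=5, r=4, rng=rng)
print(len(y_rss), round(y_rss.mean(), 2), round(y_pop.mean(), 2))
```

Because Y is never used for ranking, the ordering of Y within a set is only as good as its correlation with X, which is the imperfect-ranking situation assumed in the text.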

3. Some Existing Estimators and Notations

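We consider the situation in which the elements are ranked on the auxiliary variable. Let $(Y_{[i]j}, X_{(i)j})$ denote the pair obtained from the ith set of the jth cycle, where $X_{(i)j}$ is the ith order statistic of the auxiliary variable X in that set and $Y_{[i]j}$ is the corresponding ith judgment-ordered value of the study variable Y ($i = 1, \dots, m$; $j = 1, \dots, r$). Based on RSS, the sample mean estimator of the population mean $\bar{Y}$ is given by

$$\bar{y}_{[n]} = \frac{1}{mr}\sum_{j=1}^{r}\sum_{i=1}^{m} Y_{[i]j} \qquad (1)$$

where $n = mr$.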

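To obtain the bias and MSE of the estimators, we define

$$\bar{y}_{[n]} = \bar{Y}(1 + e_0), \qquad \bar{x}_{(n)} = \bar{X}(1 + e_1),$$

where $\bar{y}_{[n]}$ and $\bar{x}_{(n)}$ are the RSS sample means of Y and X, such that

$$E(e_0) = E(e_1) = 0,$$

and

$$E(e_0^2) = \gamma C_y^2 - W_{y[i]}^2, \quad E(e_1^2) = \gamma C_x^2 - W_{x(i)}^2, \quad E(e_0 e_1) = \gamma \rho_{yx} C_y C_x - W_{yx(i)},$$

where

$$\gamma = \frac{1}{mr}, \quad W_{y[i]}^2 = \frac{1}{m^2 r \bar{Y}^2}\sum_{i=1}^{m}\tau_{y[i]}^2, \quad W_{x(i)}^2 = \frac{1}{m^2 r \bar{X}^2}\sum_{i=1}^{m}\tau_{x(i)}^2, \quad W_{yx(i)} = \frac{1}{m^2 r \bar{X}\bar{Y}}\sum_{i=1}^{m}\tau_{yx(i)},$$

with $\tau_{y[i]} = \mu_{y[i]} - \bar{Y}$, $\tau_{x(i)} = \mu_{x(i)} - \bar{X}$ and $\tau_{yx(i)} = (\mu_{y[i]} - \bar{Y})(\mu_{x(i)} - \bar{X})$. Here $\rho_{yx}$ is the correlation coefficient between Y and X, and $C_y$ and $C_x$ are the coefficients of variation of Y and X respectively. It may also be noted that the values of $\mu_{y[i]}$ and $\mu_{x(i)}$ are the means of the ith order statistics from some specific distributions (see [10] ).

The variance of $\bar{y}_{[n]}$ under the RSS scheme is given by

$$\mathrm{Var}(\bar{y}_{[n]}) = \bar{Y}^2\left(\gamma C_y^2 - W_{y[i]}^2\right) \qquad (2)$$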

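[4] proposed an estimator of the population ratio under RSS as:

$$\hat{R}_{RSS} = \frac{\bar{y}_{[n]}}{\bar{x}_{(n)}} \qquad (3)$$

where $\bar{y}_{[n]}$ and $\bar{x}_{(n)}$ are the RSS sample means of Y and X.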

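When the population mean ($\bar{X}$) of the auxiliary variable (X) is known, and the variables Y and X are positively correlated, [9] proposed the ratio estimator for the population mean ($\bar{Y}$) based on RSS as

$$\bar{y}_{R} = \bar{y}_{[n]}\,\frac{\bar{X}}{\bar{x}_{(n)}} \qquad (4)$$

The bias and MSE of $\bar{y}_{R}$, up to the first degree of approximation, are given by

$$B(\bar{y}_{R}) \cong \bar{Y}\left[\left(\gamma C_x^2 - W_{x(i)}^2\right) - \left(\gamma \rho_{yx} C_y C_x - W_{yx(i)}\right)\right] \qquad (5)$$

and

$$\mathrm{MSE}(\bar{y}_{R}) \cong \bar{Y}^2\left[\left(\gamma C_y^2 - W_{y[i]}^2\right) + \left(\gamma C_x^2 - W_{x(i)}^2\right) - 2\left(\gamma \rho_{yx} C_y C_x - W_{yx(i)}\right)\right] \qquad (6)$$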
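The first-order bias and MSE of the ratio estimator can be traced through a short expansion. The sketch below assumes the usual relative-error notation $\bar{y}_{[n]} = \bar{Y}(1+e_0)$ and $\bar{x}_{(n)} = \bar{X}(1+e_1)$ with $E(e_0) = E(e_1) = 0$. The label $\bar{y}_R$ and the shorthand $V_0 = E(e_0^2)$, $V_1 = E(e_1^2)$, $V_{01} = E(e_0 e_1)$ are introduced here for illustration only and are not the paper's notation.

```latex
\begin{align*}
\bar{y}_{R} &= \bar{y}_{[n]}\frac{\bar{X}}{\bar{x}_{(n)}}
             = \bar{Y}(1+e_0)(1+e_1)^{-1} \\
            &\cong \bar{Y}\left(1 + e_0 - e_1 + e_1^2 - e_0 e_1\right), \\
B(\bar{y}_{R}) &= E(\bar{y}_{R} - \bar{Y})
             \cong \bar{Y}\left(V_1 - V_{01}\right), \\
\mathrm{MSE}(\bar{y}_{R}) &\cong \bar{Y}^2 E\left(e_0 - e_1\right)^2
             = \bar{Y}^2\left(V_0 + V_1 - 2V_{01}\right).
\end{align*}
```

Replacing $(1+e_1)^{-1}$ by $(1+e_1)$ gives the product estimator in the same way, with bias $\bar{Y}V_{01}$ and the sign of the cross term in the MSE reversed.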

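When the population mean ($\bar{X}$) of the auxiliary variable (X) is known, and the variables Y and X are negatively correlated, the product estimator based on RSS is defined as:

$$\bar{y}_{P} = \bar{y}_{[n]}\,\frac{\bar{x}_{(n)}}{\bar{X}} \qquad (7)$$

The bias and MSE of $\bar{y}_{P}$, up to the first degree of approximation, are given by

$$B(\bar{y}_{P}) \cong \bar{Y}\left(\gamma \rho_{yx} C_y C_x - W_{yx(i)}\right) \qquad (8)$$

and

$$\mathrm{MSE}(\bar{y}_{P}) \cong \bar{Y}^2\left[\left(\gamma C_y^2 - W_{y[i]}^2\right) + \left(\gamma C_x^2 - W_{x(i)}^2\right) + 2\left(\gamma \rho_{yx} C_y C_x - W_{yx(i)}\right)\right] \qquad (9)$$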

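[11] suggested an estimator which, under RSS, is defined as:

$$\bar{y}_{S} = \lambda\,\bar{y}_{[n]} \qquad (10)$$

where $\lambda$ is a suitably chosen constant.

The minimum bias and MSE of $\bar{y}_{S}$ at the optimum value of $\lambda$, i.e.

$$\lambda_{opt} = \frac{1}{1 + \left(\gamma C_y^2 - W_{y[i]}^2\right)},$$

are given by

$$B(\bar{y}_{S})_{\min} = -\frac{\bar{Y}\left(\gamma C_y^2 - W_{y[i]}^2\right)}{1 + \left(\gamma C_y^2 - W_{y[i]}^2\right)} \qquad (11)$$

and

$$\mathrm{MSE}(\bar{y}_{S})_{\min} = \frac{\bar{Y}^2\left(\gamma C_y^2 - W_{y[i]}^2\right)}{1 + \left(\gamma C_y^2 - W_{y[i]}^2\right)} \qquad (12)$$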

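The difference-type estimator for the population mean ($\bar{Y}$) based on RSS is given by

$$\bar{y}_{D} = \bar{y}_{[n]} + d\left(\bar{X} - \bar{x}_{(n)}\right) \qquad (13)$$

where d is a constant.

The minimum variance of $\bar{y}_{D}$ at the optimum value of d, i.e.

$$d_{opt} = \frac{\bar{Y}\left(\gamma \rho_{yx} C_y C_x - W_{yx(i)}\right)}{\bar{X}\left(\gamma C_x^2 - W_{x(i)}^2\right)},$$

is given by

$$\mathrm{Var}(\bar{y}_{D})_{\min} = \bar{Y}^2\left[\left(\gamma C_y^2 - W_{y[i]}^2\right) - \frac{\left(\gamma \rho_{yx} C_y C_x - W_{yx(i)}\right)^2}{\gamma C_x^2 - W_{x(i)}^2}\right] \qquad (14)$$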
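For a quick numerical comparison of the estimators reviewed so far, their first-order MSEs can be evaluated from three second moments of the relative errors. The sketch below assumes the standard first-order results for the sample mean, ratio, product, Searls-type and difference estimators. The function name `first_order_mse`, the argument names `v0`, `v1`, `v01` (standing for $E(e_0^2)$, $E(e_1^2)$, $E(e_0 e_1)$) and the input values are illustrative and do not come from the paper.

```python
import math

def first_order_mse(ybar_pop, v0, v1, v01):
    """First-order MSEs given v0 = E(e0^2), v1 = E(e1^2), v01 = E(e0*e1)."""
    y2 = ybar_pop ** 2
    return {
        "mean": y2 * v0,                          # RSS sample mean
        "ratio": y2 * (v0 + v1 - 2.0 * v01),      # ratio estimator
        "product": y2 * (v0 + v1 + 2.0 * v01),    # product estimator
        "searls": y2 * v0 / (1.0 + v0),           # Searls-type, at the optimum constant
        "difference": y2 * (v0 - v01 ** 2 / v1),  # difference estimator, at the optimum d
    }

# Illustrative inputs only (positively correlated Y and X).
mse = first_order_mse(ybar_pop=100.0, v0=0.04, v1=0.03, v01=0.03)
for name, value in mse.items():
    print(f"{name:>10s}: {value:10.3f}")
```

With positive `v01` the product estimator is worse than the plain mean, which matches the text's recommendation to use it only for negatively correlated variables.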

Following [12] , [1] suggested a class of estimators of the population mean (), based on RSS as:

(15)

where is a suitably chosen constant, a and b are either real numbers or functions of known parameters of the auxiliary variable X, g is a scalar which takes the value $1$ (for generating ratio-type estimators) or $-1$ (for generating product-type estimators) and are constants whose sum need not be unity.

The bias of the class of estimators in (15) is given by

(16)

The MSE of the class of estimators in (15), to the first degree of approximation, is given by

(17)

where

We discuss two cases.

Case 1: Sum of weights is unity (i.e.).

Solving (17), the optimum value of, is given by

Substituting in (17), we get the minimum MSE of, given by

(18)

Case 2: Sum of weights is flexible (i.e.).

Solving (17), the optimum values of and are given by

and

Substituting the optimum values of and in (17), we get

(19)

4. Proposed Class of Estimators

Following [1] and [12] , we propose a class of estimators of the population mean (), under RSS as

(20)

where is a suitably chosen constant, a and b are either real numbers or the functions of known parameters of the auxiliary variable X and are constants whose sum need not be unity. From (20) we can generate a large number of estimators for the different values of the constants (Table A1 in Appendix). The proposed estimator can be written in terms of and as

(21)

where.

Solving (21), we have

(22)

Taking the expectation of both sides of the above equation, we get the bias of the proposed estimator, given by

(23)

Squaring both sides of Equation (22) and ignoring higher-order terms in the e's, we have

Taking the expectation of both sides of the above equation, we obtain the MSE of the proposed estimator, given by

(24)

where

We discuss two cases.

Case 1: Sum of weights is unity (i.e.).

The optimum value of, is given by

Thus, the minimum MSE of, is given by

(25)

Case 2: Sum of weights is flexible (i.e.).

For, the MSE of in Equation (24) is minimized for

and

Substituting the optimum values of and in (24), we get

(26)

Note: A theoretical comparison is difficult because of the complexity of the expressions; we therefore rely on a numerical study.

5. Simulation Study

We use the same data set as used earlier by [1] and perform a simulation study to investigate the performance of the estimators.

Population (source: [13] ).

Y = Number of acres devoted to farms during 1992 (ACRES92).

X = Number of large farms during 1992 (LARGEF92).

We set and to select a sample of units from the population of size. To compute the values of, and by simulation, we explain our simulation methodology as follows.

Here, and can be written as

and

where

To find the possible values of the ratio of the mean of the ith order statistic of Y to the population mean of Y, for $i = 1, 2, \dots, 5$, we generate an error term and calculate five ratios centred at 0.25, 0.50, 1, 1.25 and 1.75. That is, when the smallest value is selected from the ranked set sample, the expected ratio of that value to the population mean could be close to 0.25; when the second smallest value is selected, the ratio could be close to 0.50; and when the third smallest value is selected, the expected ratio could be close to 1. Similarly, the expected ratios of the fourth and fifth values could be close to 1.25 and 1.75 respectively. In each case we weight the error term by a small number, 0.08, to make sure that the ratio remains positive. Thus, the possible values of the ratio are expected to remain close to those considered here. Similarly, for the possible values of the corresponding ratio for X, we consider five analogous quantities whose error term is weighted by a smaller number, 0.05, because it may be less risky to rank the auxiliary variable X than the study variable Y. The three quantities obtained through this simulation are reported in the last three columns of Table 1.
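The ratio-generation step can be sketched as follows. Several details are assumptions, not statements from the paper: the error terms are taken to be standard normal, the same five target values are reused for X, the number of cycles `r = 3` is a hypothetical choice, and the three summary quantities are computed as scaled sums of squared and cross relative deviations from 1.

```python
import numpy as np

m, r = 5, 3                                          # r = 3 is a hypothetical choice
rng = np.random.default_rng(1)
targets = np.array([0.25, 0.50, 1.00, 1.25, 1.75])   # target ratios quoted in the text

# Assumption: standard normal error terms; the text only gives the weights 0.08 and 0.05.
ratio_y = targets + 0.08 * rng.standard_normal(m)    # order-statistic mean of Y / mean of Y
ratio_x = targets + 0.05 * rng.standard_normal(m)    # same for X, reusing the targets (assumed)

# Relative deviations from the population mean, then the three summary terms.
dev_y, dev_x = ratio_y - 1.0, ratio_x - 1.0
w_y2 = np.sum(dev_y ** 2) / (m ** 2 * r)
w_x2 = np.sum(dev_x ** 2) / (m ** 2 * r)
w_yx = np.sum(dev_y * dev_x) / (m ** 2 * r)
print(w_y2, w_x2, w_yx)
```

A small weight keeps each generated ratio near its target, so a negative ratio is unlikely, although a normal error term does not strictly rule it out.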

Table 1. PREs of proposed class of estimators through simulation.

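We investigate the percentage relative efficiency (PRE) of the ratio estimator, the Searls estimator, the difference estimator and the [1] estimator with respect to the conventional RSS sample mean estimator. We also calculate the PRE of the proposed class of estimators, in both Case 1 and Case 2, with respect to the same conventional estimator. The PRE of the proposed estimator and of the other existing estimators with respect to the conventional estimator is defined as

$$\mathrm{PRE}(\cdot) = \frac{\mathrm{Var}(\bar{y}_{[n]})}{\mathrm{MSE}(\cdot)} \times 100 \qquad (27)$$

where $\bar{y}_{[n]}$ is the conventional RSS sample mean and $(\cdot)$ stands for any of the estimators considered.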

The PREs of the proposed estimator and of the other existing estimators with respect to the conventional estimator are given in Table 1.
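The PRE can also be approximated by Monte Carlo. The sketch below compares the RSS ratio estimator with the RSS sample mean on a synthetic population; it does not use the paper's data set or reproduce Table 1. The helper names `rss_means` and `pre`, the population model and the values of `m`, `r` and `reps` are all illustrative assumptions.

```python
import numpy as np

def rss_means(y, x, m, r, rng):
    """Means of Y and X from one ranked set sample, with sets ranked on x."""
    ys, xs = [], []
    for _ in range(r):
        sets = rng.choice(len(x), size=m * m, replace=False).reshape(m, m)
        for i in range(m):
            pick = sets[i][np.argsort(x[sets[i]])[i]]  # (i+1)-th smallest X in set i
            ys.append(y[pick])
            xs.append(x[pick])
    return np.mean(ys), np.mean(xs)

def pre(mse_reference, mse_estimator):
    """Percentage relative efficiency with respect to a reference estimator."""
    return 100.0 * mse_reference / mse_estimator

rng = np.random.default_rng(2016)
x_pop = rng.gamma(2.0, 50.0, size=2000)                  # synthetic auxiliary variable
y_pop = 3.0 * x_pop + rng.normal(0.0, 30.0, size=2000)   # synthetic study variable
Y_bar, X_bar = y_pop.mean(), x_pop.mean()

m, r, reps = 5, 3, 1000
sq_err_mean, sq_err_ratio = [], []
for _ in range(reps):
    yb, xb = rss_means(y_pop, x_pop, m, r, rng)
    sq_err_mean.append((yb - Y_bar) ** 2)                # conventional RSS mean
    sq_err_ratio.append((yb * X_bar / xb - Y_bar) ** 2)  # ratio estimator

pre_ratio = pre(np.mean(sq_err_mean), np.mean(sq_err_ratio))
print(f"PRE of the ratio estimator: {pre_ratio:.1f}")
```

A value above 100 means the estimator has a smaller simulated MSE than the conventional RSS mean. The size of the gain here reflects the nearly proportional synthetic population, not the populations studied in the paper.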

6. Conclusions

The constants in the [1] estimator and in the proposed class of estimators are fixed in advance, so a large number of combinations of their values is possible; only a limited number of results are reported in Table 1. The simulation study in Table 1 shows that the proposed class of estimators is more efficient than all the other estimators considered. Its PRE increases from 164.5 to 171.8 when the suitably chosen constant of the class changes from 0.1 to 0.9, but decreases slightly when that constant is close to 0.5. Generally, the PRE of the proposed class increases as the value of this constant increases for fixed values of the constants a, b and g. The [1] class of estimators has a maximum PRE of 167.5, but it is less efficient than the proposed class for all the choices of constants reported in Table 1. Table 1 also shows that the other competing estimators are less efficient than the proposed class. Comparing the two proposed cases, the class of estimators in Case 2 is more precise than that in Case 1. We can also see from Table 1 that, for some fixed choices of a and b, the proposed classes of estimators give more precise results when the suitably chosen constant is away from the middle of its range, that is, close to either 0 or 1, whereas for fixed positive values of a and b we get more precise results when it is close to 0.5.

Therefore, the proposed class of estimators can be preferred over its competing estimators in applications under RSS.

Acknowledgements

The authors wish to thank the editor and the anonymous referees for their suggestions which led to improvement in the earlier version of the manuscript.

Appendix

Table A1. Some special cases of the proposed class of estimators.

References

[1] Singh, H.P., Tailor, R. and Singh, S. (2014) General Procedure for Estimating the Population Mean Using Ranked Set Sampling. Journal of Statistical Computation and Simulation, 84, 931-945.

http://dx.doi.org/10.1080/00949655.2012.733395

[2] McIntyre, G. (1952) A Method for Unbiased Selective Sampling, Using Ranked Sets. Crop and Pasture Science, 3, 385-390.

http://dx.doi.org/10.1071/AR9520385

[3] Takahasi, K. and Wakimoto, K. (1968) On Unbiased Estimates of the Population Mean Based on the Sample Stratified by Means of Ordering. Annals of the Institute of Statistical Mathematics, 20, 1-31.

http://dx.doi.org/10.1007/BF02911622

[4] Samawi, H.M. and Muttlak, H.A. (1996) Estimation of Ratio Using Ranked Set Sampling. Biometrical Journal, 38, 753-764.

http://dx.doi.org/10.1002/bimj.4710380616

[5] Dell, T. and Clutter, J. (1972) Ranked Set Sampling Theory with Order Statistics Background. Biometrics, 28, 545-555.

http://dx.doi.org/10.2307/2556166

[6] Stokes, S.L. (1977) Ranked Set Sampling with Concomitant Variables. Communications in Statistics: Theory and Methods, 6, 1207-1211.

http://dx.doi.org/10.1080/03610927708827563

[7] Khan, L. and Shabbir, J. (2015) A Class of Hartley-Ross Type Unbiased Estimators for Population Mean Using Ranked Set Sampling. Hacettepe Journal of Mathematics and Statistics.

http://dx.doi.org/10.15672/HJMS.20156210579

[8] Khan, L. and Shabbir, J. (2016) Hartley-Ross Type Unbiased Estimators Using Ranked Set Sampling and Stratified Ranked Set Sampling. North Carolina Journal of Mathematics and Statistics, 2, 10-22.

[9] Kadilar, C., Unyazici, Y. and Cingi, H. (2009) Ratio Estimator for the Population Mean Using Ranked Set Sampling. Statistical Papers, 50, 301-309.

http://dx.doi.org/10.1007/s00362-007-0079-y

[10] Arnold, B.C., Balakrishnan, N. and Nagaraja, H.N. (2012) A First Course in Order Statistics. Vol. 54, SIAM.

[11] Searls, D.T. (1964) The Utilization of a Known Coefficient of Variation in the Estimation Procedure. Journal of the American Statistical Association, 59, 1225-1226.

http://dx.doi.org/10.1080/01621459.1964.10480765

[12] Khoshnevisan, M., Singh, R., Chauhan, P., Sawan, N. and Smarandache, F. (2007) A General Family of Estimators for Estimating Population Mean Using Known Value of Some Population Parameter(s). Far East Journal of Theoretical Statistics, 22, 181-191.

[13] Lohr, S. (1999) Sampling: Design and Analysis. Duxbury Press, Boston.