Subsampling Method for Robust Estimation of Regression Models

Affiliation(s)

Department of Mathematics and Statistics, University of Victoria, Victoria, British Columbia, Canada V8W 3P4.

ABSTRACT

We propose a subsampling method for robust estimation of regression models that builds on classical methods such as least squares. It exploits the non-robust nature of the underlying classical method to find a good sample within regression data contaminated with outliers, and then applies the classical method to that good sample to produce robust estimates of the regression model parameters. The subsampling method is a computational method rooted in the bootstrap methodology, which trades analytical treatment for intensive computation: it finds the good sample by repeatedly fitting the regression model to many random subsamples of the contaminated data rather than by treating the outliers analytically. The subsampling method can be applied to any regression model for which a non-robust classical method is available. In the present paper, we focus on the basic formulation and the robustness property of the subsampling method, both of which are valid for all regression models. We also discuss variations of the method and apply it to three examples involving three different regression models.
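The idea described in the abstract can be illustrated with a minimal Python sketch for simple linear regression. It is only one possible reading of the abstract, not the authors' algorithm: the function name `subsample_fit`, its parameters, the ranking of subsamples by residual mean square, and the pooling of the best-fitting subsamples into the good sample are all assumptions made for illustration, as is the simulated data.

```python
import numpy as np

def subsample_fit(X, y, subsample_size, n_subsamples=500, n_keep=10, seed=0):
    """Hypothetical sketch: least squares on a 'good sample' found by subsampling."""
    rng = np.random.default_rng(seed)
    n = len(y)
    A = np.column_stack([np.ones(n), X])  # design matrix with an intercept column
    p = A.shape[1]
    scored = []
    for _ in range(n_subsamples):
        # Draw a random subsample without replacement and fit it by least squares.
        idx = rng.choice(n, size=subsample_size, replace=False)
        coef, *_ = np.linalg.lstsq(A[idx], y[idx], rcond=None)
        resid = y[idx] - A[idx] @ coef
        # Least squares is non-robust, so a subsample containing outliers
        # tends to show a large residual mean square.
        scored.append((resid @ resid / (subsample_size - p), idx))
    # Assumption: pool the n_keep best-fitting subsamples into the good sample.
    scored.sort(key=lambda item: item[0])
    good = np.unique(np.concatenate([idx for _, idx in scored[:n_keep]]))
    # Apply the classical method once more, now to the good sample only.
    coef, *_ = np.linalg.lstsq(A[good], y[good], rcond=None)
    return coef, good

# Simulated data (assumed for illustration): y = 1 + 2x + noise,
# with the first 10 of 60 responses shifted upward to act as outliers.
rng = np.random.default_rng(42)
x = rng.uniform(0.0, 10.0, size=60)
y = 1.0 + 2.0 * x + rng.normal(0.0, 0.1, size=60)
y[:10] += 20.0

beta, good = subsample_fit(x, y, subsample_size=12, n_subsamples=1000, n_keep=5)
print("estimated (intercept, slope):", beta)
print("points in the good sample:", len(good))
```

In this sketch the subsample size is kept small relative to the number of clean points so that a reasonable share of the random subsamples is outlier-free; how these tuning constants should be chosen is a question for the paper itself, not something the sketch settles.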

Cite this paper

M. Tsao and X. Ling, "Subsampling Method for Robust Estimation of Regression Models," *Open Journal of Statistics*, Vol. 2 No. 3, 2012, pp. 281-296. doi: 10.4236/ojs.2012.23034.
