OJS Vol. 8 No. 3, June 2018
Local Kernel Dimension Reduction in Approximate Bayesian Computation
ABSTRACT
Approximate Bayesian Computation (ABC) is a popular sampling method for applications with intractable likelihood functions. Instead of evaluating the likelihood, ABC approximates the posterior distribution with a set of accepted samples simulated from a generating model. A simulated sample is accepted if its distance from the observation, computed in terms of summary statistics, is smaller than a chosen threshold. This paper proposes Local Gradient Kernel Dimension Reduction (LGKDR) to construct low-dimensional summary statistics for ABC. The proposed method identifies a sufficient subspace of the original summary statistics by implicitly considering all non-linear transforms therein, and uses a weighting kernel to concentrate the projections. No strong assumptions are made on the marginal distributions or the regression models, which permits use in a wide range of applications. Experiments are conducted with simple rejection ABC and sequential Monte Carlo ABC. The results are competitive in the former case and substantially better in the latter, in which Monte Carlo errors are reduced as much as possible.
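The accept/reject step described in the abstract can be illustrated with a minimal Python sketch of simple rejection ABC. This is not the paper's LGKDR method and involves no dimension reduction; the Gaussian generating model, the Uniform(-5, 5) prior, the two summary statistics, the threshold of 0.3, and all function names are assumptions chosen only for illustration.

```python
import random
import statistics


def summary(data):
    """Summary statistics of a data set: here (mean, standard deviation)."""
    return (statistics.fmean(data), statistics.pstdev(data))


def distance(s1, s2):
    """Euclidean distance between two summary-statistic vectors."""
    return sum((a - b) ** 2 for a, b in zip(s1, s2)) ** 0.5


def simulate(theta, n, rng):
    """Toy generating model (an assumption): n draws from Normal(theta, 1)."""
    return [rng.gauss(theta, 1.0) for _ in range(n)]


def rejection_abc(observed, n_draws, threshold, rng):
    """Return the prior draws whose simulated summaries lie within threshold."""
    s_obs = summary(observed)
    accepted = []
    for _ in range(n_draws):
        theta = rng.uniform(-5.0, 5.0)  # assumed Uniform(-5, 5) prior
        simulated = simulate(theta, len(observed), rng)
        # Accept theta only if the simulated summaries are close to the observed ones.
        if distance(summary(simulated), s_obs) < threshold:
            accepted.append(theta)
    return accepted


rng = random.Random(0)
observed = simulate(2.0, 50, rng)  # synthetic "observation" with true theta = 2.0
posterior_sample = rejection_abc(observed, 5000, 0.3, rng)
```

The accepted values in `posterior_sample` serve as an approximate posterior sample for the parameter. A smaller threshold gives a closer approximation at the cost of fewer accepted samples, and the cost grows with the number of summary statistics, which is the motivation for constructing low-dimensional summaries.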
Cite this paper
Zhou, J. and Fukumizu, K. (2018) Local Kernel Dimension Reduction in Approximate Bayesian Computation. Open Journal of Statistics, 8, 479-496. doi: 10.4236/ojs.2018.83031.