An Application of Multiple Comparison Techniques to Model Selection

Annals of the Institute of Statistical Mathematics

Abstract

Akaike's information criterion (AIC) is widely used to estimate the best model from a given candidate set of parameterized probabilistic models. In this paper, the sampling error of AIC is taken into account, and a set of good models is constructed instead of a single model being chosen. This set is called a confidence set of models; it includes the minimum E{AIC} model, where E denotes expectation, with an error rate smaller than the specified significance level. The result is given as a P-value for each model, from which the confidence set is immediately obtained. A variant of Gupta's subset selection procedure is devised, in which a standardized difference of AIC is calculated for every pair of models. The critical constants are computed by the Monte Carlo method, using the asymptotic normal approximation of AIC. The proposed method neither requires the full model nor assumes a hierarchical structure of models, and it has higher power than similar existing methods.
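
The abstract gives only an outline of the procedure, so the following Python sketch is an illustration of the general idea rather than the paper's algorithm. It assumes that per-observation log-likelihoods are available for every fitted model, approximates the joint distribution of the AIC values by a multivariate normal whose covariance is estimated from those log-likelihoods, and simulates the maximum standardized pairwise AIC difference under the configuration in which all expected AIC values are equal. The function name `aic_confidence_set`, its arguments, and the choice of test statistic are all assumptions made for this example.

```python
import numpy as np

def aic_confidence_set(loglik, n_params, alpha=0.05, n_sim=10000, seed=0):
    """Hypothetical sketch: P-values and a confidence set of models.

    loglik   : (n_obs, n_models) per-observation log-likelihoods at the fit
    n_params : number of free parameters of each model
    """
    loglik = np.asarray(loglik, dtype=float)
    n_params = np.asarray(n_params, dtype=float)
    n_obs, n_models = loglik.shape

    aic = -2.0 * loglik.sum(axis=0) + 2.0 * n_params

    # Assumed normal approximation: AIC is -2 times a sum of n_obs terms,
    # so Cov(AIC) is taken as 4 * n_obs * Cov(per-observation log-lik).
    cov = 4.0 * n_obs * np.atleast_2d(np.cov(loglik, rowvar=False))

    # Standard deviation of AIC_i - AIC_j for every pair (i, j).
    var = np.diag(cov)
    var_diff = var[:, None] + var[None, :] - 2.0 * cov
    off_diag = ~np.eye(n_models, dtype=bool)
    sd_diff = np.sqrt(np.maximum(var_diff, 0.0))
    sd_diff = np.where(off_diag, np.maximum(sd_diff, 1e-12), 1.0)

    def max_std_diff(a):
        # For each model i: max over j != i of (a_i - a_j) / sd_ij.
        d = (a[..., :, None] - a[..., None, :]) / sd_diff
        return np.where(off_diag, d, -np.inf).max(axis=-1)

    t_obs = max_std_diff(aic)

    # Monte Carlo reference distribution with all expected AICs equal.
    rng = np.random.default_rng(seed)
    z = rng.multivariate_normal(np.zeros(n_models), cov, size=n_sim)
    t_sim = max_std_diff(z)

    p_values = (t_sim >= t_obs).mean(axis=0)
    return aic, p_values, p_values >= alpha

# Toy data: standard normal sample, three candidate normal models with
# unit variance and means 0 (fixed), the sample mean (fitted), 5 (fixed).
def norm_logpdf(x, mu):
    return -0.5 * np.log(2.0 * np.pi) - 0.5 * (x - mu) ** 2

x = np.random.default_rng(1).normal(size=200)
loglik = np.column_stack(
    [norm_logpdf(x, 0.0), norm_logpdf(x, x.mean()), norm_logpdf(x, 5.0)]
)
aic, p_values, keep = aic_confidence_set(loglik, n_params=[0, 1, 0])
```

In this toy setting the model with mean 5 fits far worse than the other two, so its P-value falls below the significance level and it drops out of the set, while the minimum-AIC model is retained. The sketch makes no claim about the power or exact error rate discussed in the abstract.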

References

  • Aitkin, M. A. (1974). Simultaneous inference and the choice of variable subsets in multiple regression, Technometrics, 16, 221–227.

  • Akaike, H. (1974). A new look at the statistical model identification, IEEE Trans. Automat. Control, 19, 716–723.

  • Akaike, H. (1979). A Bayesian extension of the minimum AIC procedure of autoregressive model fitting, Biometrika, 66, 237–242.

  • Arvesen, J. N. and McCabe, G. P., Jr. (1975). Subset selection problems for variances with applications to regression analysis, J. Amer. Statist. Assoc., 70, 166–170.

  • Atkinson, A. C. (1970). A method for discriminating between models, J. Roy. Statist. Soc. Ser. B, 32, 323–353.

  • Belsley, D. A., Kuh, E. and Welsch, R. E. (1980). Regression Diagnostics, Wiley, New York.

  • Bozdogan, H. (1987). Model selection and Akaike's information criterion (AIC): the general theory and its analytical extensions, Psychometrika, 52, 345–370.

  • Cox, D. R. (1962). Further results on tests of separate families of hypotheses, J. Roy. Statist. Soc. Ser. B, 24, 406–424.

  • Dastoor, N. K. and McAleer, M. (1989). Some power comparisons of joint and paired tests for nonnested models under local hypotheses, Econometric Theory, 5, 83–94.

  • Draper, N. and Smith, H. (1981). Applied Regression Analysis (2nd ed.), Wiley, New York.

  • Efron, B. and Tibshirani, R. J. (1993). An Introduction to the Bootstrap, Chapman & Hall, New York.

  • Efron, B., Halloran, E. and Holmes, S. (1996). Bootstrap confidence levels for phylogenetic trees, Proc. Nat. Acad. Sci. U.S.A., 93, 13429–13434.

  • Felsenstein, J. (1985). Confidence limits on phylogenies: an approach using the bootstrap, Evolution, 39, 783–791.

  • Felsenstein, J. and Kishino, H. (1993). Is there something wrong with the bootstrap on phylogenies? A reply to Hillis and Bull, Systematic Biology, 42, 193–200.

  • Gupta, S. S. and Huang, D. Y. (1976). Selection procedures for the means and variances of normal populations: unequal sample sizes case, Sankhyā Ser. B, 38, 112–128.

  • Gupta, S. S. and Panchapakesan, S. (1979). Multiple Decision Procedures, Wiley, New York.

  • Hochberg, Y. and Tamhane, A. C. (1987). Multiple Comparison Procedures, Wiley, New York.

  • Linhart, H. (1988). A test whether two AIC's differ significantly, South African Statist. J., 22, 153–161.

  • Mallows, C. L. (1973). Some comments on C_p, Technometrics, 15, 661–675.

  • Shimodaira, H. (1993). A model search technique based on confidence set and map of models, Proc. Inst. Statist. Math., 41, 131–147 (in Japanese).

  • Shimodaira, H. (1997a). Assessing the error probability of the model selection test, Ann. Inst. Statist. Math., 49, 395–410.

  • Shimodaira, H. (1997b). A graphical technique for finding a set of good models using AIC and its variance, Ann. Inst. Statist. Math. (submitted).

  • Spjøtvoll, E. (1972). Multiple comparison of regression functions, Ann. Math. Statist., 43, 1076–1088.

  • Spjøtvoll, E. (1977). Alternatives to plotting C_p in multiple regression, Biometrika, 64, 1–8.

  • Vuong, Q. H. (1989). Likelihood ratio tests for model selection and non-nested hypotheses, Econometrica, 57, 307–333.

  • White, H. (1982). Maximum likelihood estimation of misspecified models, Econometrica, 50, 1–25.

Cite this article

Shimodaira, H. An Application of Multiple Comparison Techniques to Model Selection. Annals of the Institute of Statistical Mathematics 50, 1–13 (1998). https://doi.org/10.1023/A:1003483128844

  • DOI: https://doi.org/10.1023/A:1003483128844
