Random effects structure for confirmatory hypothesis testing: Keep it maximal

Dale J Barr; Roger Levy; Christoph Scheepers; Harry J Tily

doi:10.1016/j.jml.2012.11.001

J Mem Lang. 2013 Apr;68(3). doi: 10.1016/j.jml.2012.11.001.

Authors

Dale J Barr¹, Roger Levy², Christoph Scheepers¹, Harry J Tily³

Affiliations

¹ Institute of Neuroscience and Psychology, University of Glasgow, 58 Hillhead St., Glasgow G12 8QB, United Kingdom.
² Department of Linguistics, University of California at San Diego, La Jolla, CA 92093-0108, USA.
³ Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, MA 02139, USA.

Abstract

Linear mixed-effects models (LMEMs) have become increasingly prominent in psycholinguistics and related areas. However, many researchers do not seem to appreciate how random effects structures affect the generalizability of an analysis. Here, we argue that researchers using LMEMs for confirmatory hypothesis testing should minimally adhere to the standards that have been in place for many decades. Through theoretical arguments and Monte Carlo simulation, we show that LMEMs generalize best when they include the maximal random effects structure justified by the design. The generalization performance of LMEMs including data-driven random effects structures strongly depends upon modeling criteria and sample size, yielding reasonable results on moderately-sized samples when conservative criteria are used, but with little or no power advantage over maximal models. Finally, random-intercepts-only LMEMs used on within-subjects and/or within-items data from populations where subjects and/or items vary in their sensitivity to experimental manipulations always generalize worse than separate F₁ and F₂ tests, and in many cases, even worse than F₁ alone. Maximal LMEMs should be the 'gold standard' for confirmatory hypothesis testing in psycholinguistics and beyond.
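The claim about random-intercepts-only models can be illustrated with a small Monte Carlo sketch. This is not the authors' simulation code. It assumes a balanced within-subjects design with subjects only (no items), a true condition effect of zero, and hypothetical parameter values. The intercepts-only analysis is approximated by a stand-in that is equivalent for this balanced design: ordinary least squares on subject-centered data with a pooled residual variance. It is compared with an F₁-style one-sample t-test on per-subject condition differences. The function name and all defaults are illustrative.

```python
import numpy as np
from scipy import stats


def simulate_type1_rates(n_subj=24, n_trials=24, slope_sd=1.0, resid_sd=1.0,
                         n_sims=1000, alpha=0.05, seed=0):
    """Estimate Type I error rates when the true condition effect is zero
    but subjects differ in their sensitivity to the manipulation."""
    rng = np.random.default_rng(seed)
    # Deviation-coded within-subject predictor, half the trials per condition.
    cond = np.tile([-0.5, 0.5], n_trials // 2)
    n_obs = n_subj * n_trials
    rejections = {"intercepts_only": 0, "by_subject": 0}

    for _ in range(n_sims):
        intercepts = rng.normal(0.0, 1.0, size=(n_subj, 1))
        # Random slopes with mean zero: the null hypothesis is true.
        slopes = rng.normal(0.0, slope_sd, size=(n_subj, 1))
        noise = rng.normal(0.0, resid_sd, size=(n_subj, n_trials))
        y = intercepts + slopes * cond + noise

        # Per-subject condition difference (mean of +0.5 trials minus -0.5 trials).
        diff = y[:, cond > 0].mean(axis=1) - y[:, cond < 0].mean(axis=1)

        # F1-style analysis: subjects are the unit of generalization.
        if stats.ttest_1samp(diff, 0.0).pvalue < alpha:
            rejections["by_subject"] += 1

        # Intercepts-only stand-in: remove subject means, then treat all
        # remaining variation as independent trial-level noise.
        estimate = diff.mean()
        centered = y - y.mean(axis=1, keepdims=True)
        resid = centered - estimate * cond
        df = n_obs - n_subj - 1
        sigma2 = (resid ** 2).sum() / df
        se = np.sqrt(sigma2 / (n_obs * 0.25))  # sum of cond**2 is n_obs / 4
        p_value = 2.0 * stats.t.sf(abs(estimate / se), df)
        if p_value < alpha:
            rejections["intercepts_only"] += 1

    return {name: count / n_sims for name, count in rejections.items()}


if __name__ == "__main__":
    print(simulate_type1_rates())
```

Under these assumed settings, the intercepts-only stand-in should reject the true null far more often than the nominal 0.05, because by-subject slope variance is absorbed into trial-level noise, while the by-subject test should stay close to the nominal rate. Setting `slope_sd=0.0` removes the slope variance and should bring the two analyses into close agreement.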

Keywords: Monte Carlo simulation; generalization; linear mixed-effects models; statistics.

Grants and funding

R01 HD065829/HD/NICHD NIH HHS/United States