Review
Eta squared and partial eta squared as measures of effect size in educational research

https://doi.org/10.1016/j.edurev.2010.12.001

Abstract

Eta squared measures the proportion of the total variance in a dependent variable that is associated with the membership of different groups defined by an independent variable. Partial eta squared is a similar measure in which the effects of other independent variables and interactions are partialled out. The development of these measures is described and their characteristics compared. In the past, the two measures have been confused in the research literature, partly because of a labelling error in the output produced by certain versions of the statistical package SPSS. Nowadays, partial eta squared is overwhelmingly cited as a measure of effect size in the educational research literature. Although there are good reasons for this, the interpretation of both measures needs to be undertaken with care. The paper concludes with a summary of the key characteristics of eta squared and partial eta squared.

Research highlights

▶ Eta squared and partial eta squared are measures of effect size.
▶ In the past, they have been confused in the research literature.
▶ Nowadays, partial eta squared is widely cited as a measure of effect size.
▶ The interpretation of both measures needs to be undertaken with care.

Introduction

Most educational researchers appreciate, in abstract terms at least, that statements which describe the outcomes of tests of statistical inference need to be distinguished from statements which describe the importance of the relevant findings in theoretical or practical terms. Following a series of editorials and other articles by Thompson (1994, 1996, 1999, 2001) and the report of an American Psychological Association “task force” (Wilkinson & Task Force on Statistical Inference, 1999), many educational journals in North America require researchers to report measures of “effect size” as well as test statistics such as t or F (American Educational Research Association, 2006; Huberty, 2002). Journals elsewhere in the world tend to be less draconian, but authors are increasingly being encouraged to report such measures in their accounts of empirical research.

The term “effect” is most commonly associated with the use of different treatments in formal experimental research. However, nowadays it is commonly used by statisticians in the context of correlational research that examines the role of classification variables determined in advance of the research project rather than treatment variables that are under experimental control. For instance, the “effect” of gender on some measure of attainment is the difference between the scores obtained by male and female participants, or the “effect” of age might be the correlation between the participants’ ages and their scores on the relevant attainment test (Sechrest & Yeaton, 1982). The measurement of effect size is thus an issue of relevance to all educational researchers, whether they are engaged in experimental or correlational research.

Fortunately, there are good accounts of the development, interpretation and application of measures of effect size in education and cognate disciplines (e.g., Huberty, 2002; Olejnik & Algina, 2000; Richardson, 1996; Sechrest & Yeaton, 1982). Consequently, there is ample support for educational researchers wishing to develop expertise in the use of such measures. However, there is a lack of clarity in the literature concerning two particular measures called eta squared and partial eta squared. These were originally devised for measuring effect sizes in factorial designs, but nowadays they have a variety of other applications. Potentially, they are therefore of considerable importance in educational research. I recently published a paper in Educational Research Review that was intended as a tutorial overview of the development of measures of multivariate association (Richardson, 2007). In the present article, in the same spirit, I shall review the historical development of eta squared and partial eta squared and the issues involved in their application to assist other educational researchers and their students.

Section snippets

Defining eta squared

Most readers will be familiar with the linear correlation coefficient, Pearson r, which measures the degree of linear association between two variables (X and Y, say). Many will also be familiar with the fact that r², which is often termed the “coefficient of determination”, is equal to the proportion of the total variation in one of the variables that can be predicted or explained on the basis of its linear relationship with the other variable (see, e.g., Hays, 1963, p. 505). Pearson, 1903,
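
By analogy with r², eta squared can be read as the proportion of the total variation in the dependent variable that is associated with group membership. The following is a minimal sketch in Python, assuming the standard sums-of-squares definition (between-groups sum of squares divided by total sum of squares), which is presumably what Formula (1) expresses, although that formula is not reproduced in this snippet. The function name `eta_squared` and the scores are hypothetical and serve only as an illustration.

```python
from statistics import fmean


def eta_squared(groups):
    """Classical eta squared for a one-way design: SS_between / SS_total.

    `groups` is a list of lists of scores, one inner list per group.
    Assumes the scores are not all identical (otherwise SS_total is zero).
    """
    all_scores = [y for g in groups for y in g]
    grand_mean = fmean(all_scores)
    # Total variation of every score around the grand mean.
    ss_total = sum((y - grand_mean) ** 2 for y in all_scores)
    # Variation of the group means around the grand mean, weighted by group size.
    ss_between = sum(len(g) * (fmean(g) - grand_mean) ** 2 for g in groups)
    return ss_between / ss_total


# Hypothetical scores for two groups (illustrative data, not from the paper).
group_a = [4.0, 5.0, 6.0]
group_b = [7.0, 8.0, 9.0]
print(eta_squared([group_a, group_b]))  # equals 13.5 / 17.5
```

With identical group means the function returns 0, and when every member of a group has the same score it returns 1, which matches the verbal definition of the two extremes.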

Defining partial eta squared

In Section 2.1, I showed that, in a one-way analysis of variance, the statistic F can be expressed as a function of η² (see Formula (2)). Cohen (1965) pointed out, conversely, that the corresponding values of η and η² could be calculated from the value of F. In the present notation, this yields the following formula:

$$\eta^2 = \frac{F(k-1)}{F(k-1) + (N-k)}$$

For the example in Table 1, η² = [30.00 × (2 − 1)]/{[30.00 × (2 − 1)] + (100 − 2)} = 30/(30 + 98) = 30/128 = .234. In the special case when there are just two groups, η can be
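
The conversion from F to eta squared for a one-way design can be sketched in a few lines of Python. The function name `eta_squared_from_f` and its parameter names are hypothetical; the arithmetic follows the formula and the Table 1 example given above, with k groups and a total sample size of N.

```python
def eta_squared_from_f(f_value, k, n_total):
    """Recover eta squared from a one-way ANOVA F statistic.

    k is the number of groups and n_total the total sample size, so the
    degrees of freedom are (k - 1) between groups and (n_total - k) within.
    """
    df_between = k - 1
    df_within = n_total - k
    return (f_value * df_between) / (f_value * df_between + df_within)


# Values from the worked example in the text: F = 30.00, k = 2, N = 100.
print(round(eta_squared_from_f(30.00, 2, 100), 3))  # prints 0.234
```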

Metric properties of eta squared and partial eta squared

By definition, both the classical version of η² as defined in Formula (1) and partial η² as defined in Formula (6) vary between 0 and 1. The classical version of η² takes the value of 0 when the independent variable explains none of the variance in the dependent variable, and it takes the value of 1 when the independent variable in question explains all of the variance in the dependent variable. In a factorial design, the values of η² corresponding to the different components of variation are
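
The contrast between the two measures in a factorial design can be illustrated with a short Python sketch. It assumes the standard definitions (classical η² as the effect sum of squares over the total sum of squares, partial η² as the effect sum of squares over the sum of the effect and error sums of squares), which are presumably what Formulas (1) and (6) express, although neither formula is reproduced in this snippet. It also assumes a balanced design, so that the total sum of squares is the sum of its components. The function names and the sums of squares are hypothetical.

```python
def classical_eta_squared(ss_effect, ss_total):
    """Proportion of the total variation associated with one effect."""
    return ss_effect / ss_total


def partial_eta_squared(ss_effect, ss_error):
    """Proportion of variation associated with one effect after the
    other effects have been removed from the denominator."""
    return ss_effect / (ss_effect + ss_error)


# Hypothetical sums of squares for a balanced two-way design
# (illustrative values, not taken from the paper).
ss = {"A": 40.0, "B": 30.0, "AxB": 10.0}
ss_error = 20.0
ss_total = sum(ss.values()) + ss_error  # 100.0 in a balanced design

classical = {name: classical_eta_squared(v, ss_total) for name, v in ss.items()}
partial = {name: partial_eta_squared(v, ss_error) for name, v in ss.items()}

for name in ss:
    print(name, round(classical[name], 3), round(partial[name], 3))
```

In this example each value lies between 0 and 1, each partial value is at least as large as the corresponding classical value, and the partial values sum to more than 1 whereas the classical values do not.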

Confusion between eta squared and partial eta squared

For many years, partial η² was not widely discussed in statistics textbooks. Levine and Hullett (2002) examined more than 20 such books that had been published between 1970 and 2000. They found that only one mentioned partial η² by name (Pedhazur, 1997, p. 507), but two others presented Formula (7) (that is, partial η) as a definition of classical η and Formula (6) (that is, partial η²) as a definition of classical η² (Rosenthal & Rosnow, 1985, 1991). Levine and Hullett

Limitations of eta squared and partial eta squared

A number of commentators have pointed out limitations of both η² and partial η² as measures of effect size (O’Grady, 1982; Pedhazur, 1997, pp. 505–509; Sechrest & Yeaton, 1982). In Section 4, I noted that both η² and partial η² have a logical upper bound of 1. To take the hypothetical example in Table 1, the only situation in which η² and partial η² could equal 1 is one in which all the boys obtain one score and all the girls obtain some other score. If the dependent variable is distributed in

Conclusions

Recent commentators have arrived at different positions with regard to the usefulness of classical η² and partial η². Field (2009, pp. 415–416) and Kline (2009, pp. 166–168) simply presented the two measures of effect size without explaining why one might be preferred to the other. Indeed, Kline envisaged that researchers might report values of both η² and partial η² in their research publications, but this is surely likely to confuse rather than enlighten their readers. Levine and Hullett (2002)

Acknowledgements

I am grateful to Laurel Fisher, Paul Ginns, James Hartley, Anesa Hosein, Natalia Kucirkova, Erik Meyer, Anna Parpala, Richard Remedios and Dan Richards for their comments on previous versions of this paper.

References (66)

  • J. Cohen. Some statistical issues in psychological research.
  • J. Cohen. Statistical power analysis for the behavioural sciences (1969).
  • J. Cohen. Eta-squared and partial eta-squared in fixed factor ANOVA designs. Educational and Psychological Measurement (1973).
  • J. Cohen. Statistical power analysis for the behavioural sciences (1988).
  • H. Cooper et al. Expected effect sizes: Estimates for statistical power analysis in social psychology. Personality and Social Psychology Bulletin (1982).
  • S. Diamond. Information and error: An introduction to statistical analysis (1959).
  • A. Field. Discovering statistics using SPSS for Windows: Advanced techniques for the beginner (2000).
  • A. Field. Discovering statistics using SPSS (and sex, drugs and rock ‘n’ roll) (2005).
  • A. Field. Discovering statistics using SPSS (and sex and drugs and rock ‘n’ roll) (2009).
  • R.A. Fisher. Statistical methods for research workers (1925).
  • R.L. Fowler. Point estimates and confidence intervals in measures of association. Psychological Bulletin (1985).
  • H. Friedman. Magnitude of experimental effect and a table for its rapid estimation. Psychological Bulletin (1968).
  • G.V. Glass et al. Measures of association in comparative experiments: Their development and interpretation. American Educational Research Journal (1969).
  • R.J. Grissom et al. Effect sizes for research: A broad practical approach (2005).
  • R.F. Haase et al. How significant is a significant difference? Average effect size of research in counseling psychology. Journal of Counseling Psychology (1982).
  • M. Halperin et al. Recommended standards for statistical symbols and notation. American Statistician (1965).
  • W.L. Hays. Statistics (1963).
  • L.V. Hedges et al. Statistical methods for meta-analysis (1985).
  • J.F. Hemphill. Interpreting the magnitudes of correlation coefficients. American Psychologist (2003).
  • D.C. Howell. Statistical methods for psychology (2002).
  • C.J. Huberty. A history of effect size indices. Educational and Psychological Measurement (2002).
  • N.S. Jacobson et al. Clinical significance: A statistical approach to defining meaningful change in psychotherapy research. Journal of Consulting and Clinical Psychology (1991).
  • T.L. Kelley. An unbiased correlation ratio measure. Proceedings of the National Academy of Sciences (1935).