An ANCOVA puzzler

Tags: meta-analysis, effect size, standardized mean difference

Published November 24, 2020

Doing effect size calculations for meta-analysis is a good way to lose your faith in humanity—or at least your faith in researchers’ abilities to do anything like sensible statistical inference. Try it, and you’re sure to encounter head-scratchingly weird ways that authors have reported even simple analyses, like basic group comparisons. When you encounter this sort of thing, you have two paths: you can despair, curse, and/or throw things, or you can view the studies as curious little puzzles—brain-teasers, if you will—to keep you awake and prevent you from losing track of those notes you took during your stats courses, back when. Here’s one of those curious little puzzles, which I recently encountered while helping a colleague with a meta-analysis project.

A researcher conducts a randomized experiment, assigning participants to one of $G$ groups. Each participant is assessed on a variable $Y$ at pre-test and at post-test (we can assume there’s no attrition). In their study write-up, the researcher reports sample sizes for each group, means and standard deviations for each group at pre-test and at post-test, and adjusted means at post-test, where the adjustment is done using a basic analysis of covariance, controlling for pre-test scores only. The data layout looks like this:

| Group   | N     | Pre-test M  | Pre-test SD | Post-test M | Post-test SD | Adjusted post-test M |
|---------|-------|-------------|-------------|-------------|--------------|----------------------|
| Group A | $n_A$ | $\bar{x}_A$ | $s_{A0}$    | $\bar{y}_A$ | $s_{A1}$     | $\tilde{y}_A$        |
| Group B | $n_B$ | $\bar{x}_B$ | $s_{B0}$    | $\bar{y}_B$ | $s_{B1}$     | $\tilde{y}_B$        |

Note that the write-up does not provide an estimate of the correlation between the pre-test and the post-test, nor does it report a standard deviation or standard error for the mean change-score between pre-test and post-test within each group. All we have are the summary statistics, plus the adjusted post-test means. We can assume that the adjustment was done according to the basic ANCOVA model, assuming a common slope across groups as well as homoskedasticity and so on. The model is then

$$y_{ig} = \alpha_g + \beta x_{ig} + e_{ig},$$

for $i = 1,...,n_g$ and $g = 1,...,G$, where $e_{ig}$ is an independent error term that is assumed to have constant variance across groups.
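To make concrete where the adjusted means come from, here is a minimal sketch in R (with simulated data; the variable names and parameter values are illustrative assumptions, not from any particular study): fit the common-slope ANCOVA with `lm()`, then predict each group’s mean at the grand pre-test mean.

```r
# Minimal sketch of how ANCOVA-adjusted means arise (simulated data;
# all names and values here are illustrative assumptions).
set.seed(20201124)

g <- rep(c("A", "B", "C"), times = c(25, 26, 16))  # group labels
x <- rnorm(length(g), mean = 37, sd = 4.5)         # pre-test scores
y <- 10 + 0.7 * x + rnorm(length(g), sd = 3)       # post-test scores

fit <- lm(y ~ g + x)  # separate intercepts alpha_g, common slope beta

# Adjusted means: predicted post-test mean for each group,
# evaluated at the overall pre-test mean
predict(fit, newdata = data.frame(g = c("A", "B", "C"), x = mean(x)))
```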

For realz?

Here’s an example with real data, drawn from Table 2 of Murawski (2006):

| Group   | N  | Pre-test M | Pre-test SD | Post-test M | Post-test SD | Adjusted post-test M |
|---------|----|------------|-------------|-------------|--------------|----------------------|
| Group A | 25 | 37.48      | 4.64        | 37.96       | 4.35         | 37.84                |
| Group B | 26 | 36.85      | 5.18        | 36.46       | 3.86         | 36.66                |
| Group C | 16 | 37.88      | 3.88        | 37.38       | 4.76         | 36.98                |

That study reported this information for each of several outcomes, with separate analyses for each of two sub-groups (LD and NLD). The text also reports that they used a two-level hierarchical linear model for the ANCOVA adjustment. For simplicity, let’s just ignore the hierarchical linear model aspect and assume that it’s a straight, one-level ANCOVA.

The puzzler

Calculate an estimate of the standardized mean difference between group B and group A, along with the sampling variance of the SMD estimate, that adjusts for pre-test differences between groups. Candidates for the numerator of the SMD include the adjusted mean difference, $\tilde{y}_B - \tilde{y}_A$, or the difference-in-differences, $(\bar{y}_B - \bar{x}_B) - (\bar{y}_A - \bar{x}_A)$. In either case, the tricky bit is finding the sampling variance of this quantity, which involves the pre-post correlation. For the denominator of the SMD, use the post-test SD, either pooled across just groups A and B or pooled across all $G$ groups, assuming a common population variance.

The solution

The key here is to recognize that you can calculate $\hat\beta$, the estimated slope of the within-group pre-post relationship, based on the difference between the adjusted group means and the raw post-test means. The pre-post correlation can then be derived from $\hat\beta$. In the ANCOVA model,

$$\tilde{y}_g = \bar{y}_g - \hat\beta \left(\bar{x}_g - \bar{\bar{x}}\right),$$

where $\bar{\bar{x}}$ is the overall mean across groups, or

$$\bar{\bar{x}} = \frac{1}{n} \sum_{g=1}^G n_g \bar{x}_g, \qquad \text{with} \qquad n = \sum_{g=1}^G n_g.$$

Thus, we can back out $\hat\beta$ from the reported summary statistics for group $g$ as

$$\hat\beta_g = \frac{\bar{y}_g - \tilde{y}_g}{\bar{x}_g - \bar{\bar{x}}}.$$

Actually, we get $G$ estimates $\hat\beta_g$—one from each group. Taking a weighted average seems sensible here, so we end up with

$$\hat\beta = \frac{1}{n} \sum_{g=1}^G n_g \left(\frac{\bar{y}_g - \tilde{y}_g}{\bar{x}_g - \bar{\bar{x}}}\right).$$

Now, let $r$ denote the sample correlation between pre-test and post-test, after partialing out differences in means for each group. This correlation is related to $\hat\beta$ as

$$r = \hat\beta \times \frac{s_{px}}{s_{py}},$$

where $s_{px}$ and $s_{py}$ are the standard deviations of the pre-test and post-test, respectively, pooled across all $G$ groups.

Here’s the result of carrying out these calculations with the example data from Murawski (2006):

```r
library(dplyr)

# Summary statistics from Table 2 of Murawski (2006)
dat <- tibble(
  Group = c("A", "B", "C"),
  N = c(25, 26, 16),
  m_pre = c(37.48, 36.85, 37.88),
  sd_pre = c(4.64, 5.18, 3.88),
  m_post = c(37.96, 36.46, 37.38),
  sd_post = c(4.35, 3.86, 4.76),
  m_adj = c(37.84, 36.66, 36.98)
)
```

```r
# Back out beta_hat within each group, pool across groups (weighting by
# group size), and convert the pooled slope to a correlation.
corr_est <- 
  dat %>%
  mutate(
    m_pre_pooled = weighted.mean(m_pre, w = N),  # overall pre-test mean
    beta_hat = (m_post - m_adj) / (m_pre - m_pre_pooled)
  ) %>%
  summarise(
    df = sum(N - 1),                             # equals n - G here
    s_sq_x = sum((N - 1) * sd_pre^2) / df,       # pooled pre-test variance
    s_sq_y = sum((N - 1) * sd_post^2) / df,      # pooled post-test variance
    beta_hat = weighted.mean(beta_hat, w = N)
  ) %>%
  mutate(
    r = beta_hat * sqrt(s_sq_x / s_sq_y)
  )

corr_est
```

```
# A tibble: 1 × 5
     df s_sq_x s_sq_y beta_hat     r
  <dbl>  <dbl>  <dbl>    <dbl> <dbl>
1    64   22.1   18.2    0.636 0.700
```
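As a quick sanity check, we can simulate raw data (with arbitrary, illustrative parameter values), compute the adjusted means from a fitted ANCOVA, and confirm that the back-calculation from summary statistics alone recovers the fitted slope:

```r
# Sanity check on simulated data: the slope recovered from group means
# and adjusted means matches the slope from the fitted ANCOVA.
set.seed(42)
n_g <- c(25, 26, 16)
g <- rep(c("A", "B", "C"), times = n_g)
x <- rnorm(sum(n_g), mean = 37, sd = 4.5)
y <- 0.7 * x + rnorm(sum(n_g), sd = 3)

fit <- lm(y ~ g + x)  # common-slope ANCOVA
adj <- predict(fit, newdata = data.frame(g = c("A", "B", "C"), x = mean(x)))

# "Reported" summaries: group means at pre-test and post-test
y_bar <- tapply(y, g, mean)
x_bar <- tapply(x, g, mean)

# Back out the slope as in the formula above, then compare
beta_back <- weighted.mean((y_bar - adj) / (x_bar - mean(x)), w = n_g)
all.equal(unname(beta_back), unname(coef(fit)["x"]))  # TRUE
```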

From here, we can calculate the numerator of the SMD a few different ways.

Diff-in-diff

One option would be to take the between-group difference in pre-post differences (a.k.a. the diff-in-diff):

$$DD = \left(\bar{y}_B - \bar{x}_B\right) - \left(\bar{y}_A - \bar{x}_A\right).$$

Assuming that the within-group variance of the pre-test and the within-group variance of the post-test are equal (and constant across groups), then

$$\text{Var}(DD) = 2 \sigma^2 (1 - \rho) \left(\frac{1}{n_A} + \frac{1}{n_B}\right),$$

where $\sigma^2$ is the within-group population variance of the post-test and $\rho$ is the population correlation between pre-test and post-test. Dividing $DD$ by $s_{py}$ gives an estimate of the standardized mean difference between group B and group A,

$$d_{DD} = \frac{DD}{s_{py}},$$

with approximate sampling variance

$$\text{Var}(d_{DD}) \approx 2 (1 - \rho) \left(\frac{1}{n_A} + \frac{1}{n_B}\right) + \frac{\delta^2}{2(n - G)},$$

where $\delta$ is the SMD parameter. The variance can be estimated by substituting estimates of $\rho$ and $\delta$:

$$V_{DD} = 2 (1 - r) \left(\frac{1}{n_A} + \frac{1}{n_B}\right) + \frac{d_{DD}^2}{2(n - G)}.$$

If you would prefer to pool the post-test standard deviation across groups A and B only, then replace $(n - G)$ with $(n_A + n_B - 2)$ in the second term of $V_{DD}$.

Regression adjustment

An alternative to the diff-in-diff approach is to use the regression-adjusted mean difference between group B and group A as the numerator of the SMD. Here, we would calculate the standardized mean difference as

$$d_{reg} = \frac{\tilde{y}_B - \tilde{y}_A}{s_{py}}.$$

Now, the variance of the regression-adjusted mean difference is approximately

$$\text{Var}\left(\tilde{y}_B - \tilde{y}_A\right) \approx \sigma^2 \left(1 - \rho^2\right) \left(\frac{1}{n_A} + \frac{1}{n_B}\right),$$

from which it follows that the variance of the regression-adjusted SMD is approximately

$$\text{Var}(d_{reg}) \approx \left(1 - \rho^2\right) \left(\frac{1}{n_A} + \frac{1}{n_B}\right) + \frac{\delta^2}{2(n - G)}.$$

Again, the variance can be estimated by substituting estimates of $\rho$ and $\delta$:

$$V_{reg} = \left(1 - r^2\right) \left(\frac{1}{n_A} + \frac{1}{n_B}\right) + \frac{d_{reg}^2}{2(n - G)}.$$

If you would prefer to pool the post-test standard deviation across groups A and B only, then replace $(n - G)$ with $(n_A + n_B - 2)$ in the second term of $V_{reg}$.

The regression-adjusted effect size estimator will always have smaller sampling variance than the diff-in-diff estimator (under the assumptions given above), and so it would seem to be generally preferable. The main reason I could see for using the diff-in-diff estimator is if it were the only thing that could be calculated for the other studies included in a synthesis.
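To see why, compare the leading terms of the two approximate variances:

$$2(1 - \rho) - \left(1 - \rho^2\right) = 1 - 2\rho + \rho^2 = (1 - \rho)^2 \geq 0,$$

so the leading term of $\text{Var}(d_{DD})$ is never smaller than that of $\text{Var}(d_{reg})$, with equality only when $\rho = 1$.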

Numerical example

Here I calculate both estimates and their corresponding variances with the example data from Murawski (2006):

```r
# Mean differences between groups B and A (diff() gives B minus A,
# given the row order), plus 1/nA + 1/nB
diffs <- 
  dat %>%
  filter(Group %in% c("A", "B")) %>%
  summarise(
    diff_pre = diff(m_pre),
    diff_post = diff(m_post),
    diff_adj = diff(m_adj),
    inv_n = sum(1 / N)
  )
```

```r
# Both SMD estimates and their estimated variances; note that
# corr_est$df = n - G, as required by the formulas above
d_est <-
  diffs %>%
  mutate(
    d_DD = (diff_post - diff_pre) / sqrt(corr_est$s_sq_y),
    V_DD = 2 * (1 - corr_est$r) * inv_n + d_DD^2 / (2 * corr_est$df),
    d_reg = diff_adj / sqrt(corr_est$s_sq_y),
    V_reg = (1 - corr_est$r^2) * inv_n + d_reg^2 / (2 * corr_est$df)
  )

d_est %>%
  select(d_DD, V_DD, d_reg, V_reg)
```

```
# A tibble: 1 × 4
    d_DD   V_DD  d_reg  V_reg
   <dbl>  <dbl>  <dbl>  <dbl>
1 -0.204 0.0474 -0.276 0.0406
```

The effect size estimates are rather discrepant, which is a bit worrisome.
