Selection models for meta-analyses of dependent effect sizes
2025-06-13
Megha Joshi
Melissa Rodgers
Ryan Williams
Joshua Polanin
David Miller
The research reported here was supported, in whole or in part, by the Institute of Education Sciences, U.S. Department of Education, through grant R305D220026 to the American Institutes for Research. The opinions expressed are those of the authors and do not represent the views of the Institute or the U.S. Department of Education.
Selective reporting occurs when affirmative findings (i.e., statistically significant results in the expected direction) are more likely to be reported and therefore available for inclusion in meta-analysis
Selective reporting distorts the evidence base available for systematic review/meta-analysis
Concerns about selective reporting span the social, behavioral, and health sciences.
Graphical diagnostics
Tests/adjustments for funnel plot asymmetry
Selection models
p-value diagnostics
Dependent effect sizes are ubiquitous in education and social science meta-analyses.
We have well-developed methods for modeling dependent effect sizes assuming no selection.
But methods for investigating selective reporting in databases with dependent effect sizes have emerged only very recently (Chen and Pustejovsky 2024).
Under the Vevea and Hedges (1995) step-function model, the distribution of observed effect size estimates is piecewise normal.
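A sketch of the model, using the notation of the interactive demonstration that follows: prior to selection, an effect size estimate $T_{ij}$ is normally distributed with mean $\mu$ and total variance $\tau^2 + \sigma_{ij}^2$, and the probability that it is reported is a step function of its one-sided p-value $p_{ij}$:

$$
T_{ij} \sim N\left(\mu, \; \tau^2 + \sigma_{ij}^2\right), \qquad
p_{ij} = 1 - \Phi\left(T_{ij} / \sigma_{ij}\right),
$$
$$
\Pr\left(T_{ij} \text{ is observed} \mid p_{ij}\right) \;\propto\; \lambda_h
\quad \text{for } \alpha_h < p_{ij} \leq \alpha_{h+1},
$$

with thresholds $0 = \alpha_0 < \alpha_1 < \cdots < \alpha_H < \alpha_{H+1} = 1$ and reference weight $\lambda_0 = 1$.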
math = require("mathjs")
norm = import('https://unpkg.com/norm-dist@3.1.0/index.js?module')
// total standard deviation of effect size estimates prior to selection
eta = math.sqrt(tau**2 + sigma**2)
// number of steps in the selection function
H = 2
// p-value thresholds (set by the sliders below)
alpha = [alpha1, alpha2]
// relative reporting probabilities, with lambda_0 = 1 as the reference
lambda = [1, lambda1, lambda2_ratio * lambda1]
// used to rescale the selected density for plotting
lambda_max = math.max(lambda)
// return the selection weight corresponding to a one-sided p-value
function findlambda(p, alp, lam) {
  var m = 0;
  while (p >= alp[m]) {
    m += 1;
  }
  return lam[m];
}
// compute the probability of selection (Ai), the mean (ET), and the standard
// deviation (SDT) of observed effect sizes under the step-function model
function findMoments(mu, tau, sigma, alp, lam) {
  let H = alp.length;
  let eta = math.sqrt(tau**2 + sigma**2);
  // p-value thresholds converted to the effect size scale, in decreasing order
  let gamma_h = Array(H+2).fill(null).map((x,i) => {
    if (i==0) {
      return Infinity;
    } else if (i==H+1) {
      return -Infinity;
    } else {
      return sigma * norm.icdf(1 - alp[i-1]);
    }
  });
  // thresholds standardized relative to the unselected distribution
  let c_h = Array(H+2).fill(null).map((x,i) => {
    return (gamma_h[i] - mu) / eta;
  });
  // probability that an unselected estimate falls in interval h
  let B_h = Array(H+1).fill(null).map((x,i) => {
    return norm.cdf(c_h[i]) - norm.cdf(c_h[i+1]);
  });
  // overall probability that an estimate is reported
  let Ai = 0;
  for (let i = 0; i <= H; i++) {
    Ai += lam[i] * B_h[i];
  }
  // conditional mean of a standard normal within each interval
  let psi_h = Array(H+1).fill(null).map((x,i) => {
    return (norm.pdf(c_h[i+1]) - norm.pdf(c_h[i])) / B_h[i];
  });
  let psi_top = 0;
  for (let i = 0; i <= H; i++) {
    psi_top += lam[i] * B_h[i] * psi_h[i];
  }
  let psi_bar = psi_top / Ai;
  // expected value of an observed effect size
  let ET = mu + eta * psi_bar;
  // c * pdf(c), with the convention that the product is 0 at +/- infinity
  let dc_h = c_h.map((c_val) => {
    if (math.abs(c_val) == Infinity) {
      return 0;
    } else {
      return c_val * norm.pdf(c_val);
    }
  });
  // terms entering the conditional variance of a truncated standard normal
  let kappa_h = Array(H+1).fill(null).map((x,i) => {
    return (dc_h[i] - dc_h[i+1]) / B_h[i];
  });
  let kappa_top = 0;
  for (let i = 0; i <= H; i++) {
    kappa_top += lam[i] * B_h[i] * kappa_h[i];
  }
  let kappa_bar = kappa_top / Ai;
  // standard deviation of an observed effect size
  let SDT = eta * math.sqrt(1 - kappa_bar - psi_bar**2);
  return ({Ai: Ai, ET: ET, SDT: SDT});
}
moments = findMoments(mu, tau, sigma, alpha, lambda)
// formatted values for display in the TeX summary below
Ai_toprint = moments.Ai.toFixed(3)
ET_toprint = moments.ET.toFixed(3)
eta_toprint = eta.toFixed(3)
SDT_toprint = moments.SDT.toFixed(3)
// grid of points spanning mu +/- 3 eta for plotting the densities
pts = 201
density_dat = Array(pts).fill().map((element, index) => {
  let t = mu - 3 * eta + index * eta * 6 / (pts - 1);
  // one-sided p-value corresponding to effect size t
  let p = 1 - norm.cdf(t / sigma);
  // density of effect size estimates prior to selection
  let dt = norm.pdf((t - mu) / eta) / eta;
  let lambda_val = findlambda(p, alpha, lambda);
  return ({
    t: t,
    d_unselected: dt,
    // density after selection, rescaled so the maximum weight is 1
    d_selected: lambda_val * dt / lambda_max
  })
})
viewof mu = Inputs.range(
  [-2, 2],
  {value: 0.1, step: 0.01, label: tex`\mu`}
)
viewof tau = Inputs.range(
  [0, 2],
  {value: 0.15, step: 0.01, label: tex`\tau`}
)
viewof sigma = Inputs.range(
  [0, 1],
  {value: 0.25, step: 0.01, label: tex`\sigma_i`}
)
viewof alpha1 = Inputs.range(
  [0, 1],
  {value: 0.025, step: 0.005, label: tex`\alpha_1`}
)
viewof alpha2 = Inputs.range(
  [0, 1],
  {value: 0.50, step: 0.005, label: tex`\alpha_2`}
)
viewof lambda1 = Inputs.range(
  [0, 2],
  {value: 1, step: 0.01, label: tex`\lambda_1`}
)
viewof lambda2_ratio = Inputs.range(
  [0, 2],
  {value: 1, step: 0.01, label: tex`\lambda_2 / \lambda_1`}
)
Plot.plot({
  height: 500,
  width: 1000,
  y: {
    grid: false,
    label: "Density"
  },
  x: {
    label: "Effect size estimate (Ti)"
  },
  marks: [
    Plot.ruleY([0]),
    Plot.ruleX([0]),
    Plot.areaY(density_dat, {x: "t", y: "d_unselected", fillOpacity: 0.3}),
    Plot.areaY(density_dat, {x: "t", y: "d_selected", fill: "blue", fillOpacity: 0.5}),
    Plot.lineY(density_dat, {x: "t", y: "d_selected", stroke: "blue"})
  ]
})
tex`
\begin{aligned}
\mu &= ${mu} & \qquad \sqrt{\tau^2 + \sigma_{ij}^2} &= ${eta_toprint} \\
\mathbb{E}\left(T_{ij}\right) &= ${ET_toprint}
& \qquad \sqrt{\mathbb{V}\left(T_{ij}\right)} &= ${SDT_toprint} \\
\Pr(T_{ij} \text{ is observed}) &= ${Ai_toprint}
\end{aligned}
`
Model the marginal distribution of observed effects, ignoring the dependence structure
Maximum likelihood (composite marginal likelihood; sketched below)
Augmented, reweighted Gaussian likelihood
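A sketch of the composite marginal likelihood (CML) idea, in the notation of the step-function model above: each observed effect contributes the log of its marginal (piecewise-normal) density as if the effects were independent, and dependence is instead handled at the inference stage through cluster-robust (sandwich) variance estimation:

$$
\ell_{CML}\left(\mu, \tau^2, \boldsymbol{\lambda}\right)
= \sum_{i=1}^{K} \sum_{j=1}^{n_i} \log f\left(T_{ij} \mid \mu, \tau^2, \boldsymbol{\lambda}\right),
$$

where $f$ is the density of an observed effect under the step-function selection model, $K$ is the number of independent clusters (studies), and $n_i$ is the number of effects in cluster $i$.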
Lehmann, Elliot, and Calin-Jageman (2018) reported a systematic review of studies on color-priming, examining whether exposure to the color red influenced attractiveness judgments.
| Coef. | Mean ES Est. | Mean ES SE | Het. Var. Est. | Het. Var. SE |
|---|---|---|---|---|
| (A) Summary meta-analysis |  |  |  |  |
| Overall | 0.207 | 0.0571 | 0.103 | 0.0251 |
| (B) Moderation by design type |  |  |  |  |
| Between-Subjects | 0.19 | 0.0642 | 0.104 | 0.0256 |
| Within-Subjects | 0.273 | 0.1456 | 0.104 | 0.0256 |
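The results in the table come from standard meta-analytic models that assume no selection. A minimal sketch of how such estimates might be obtained using metafor with cluster-robust variance estimation from clubSandwich (the working correlation r = 0.8 and the exact model specification are assumptions for illustration, not necessarily the specification behind the table):

```r
library(metafor)
library(clubSandwich)

data("dat.lehmann2018", package = "metadat")
dat.lehmann2018$study <- dat.lehmann2018$Full_Citation
dat.lehmann2018$esid <- seq_len(nrow(dat.lehmann2018))

# working covariance matrix with an assumed correlation of 0.8
# among effect sizes from the same study
V <- impute_covariance_matrix(
  vi = dat.lehmann2018$vi,
  cluster = dat.lehmann2018$study,
  r = 0.8
)

# correlated-and-hierarchical effects model: random effects for
# studies and for effect sizes nested within studies
res <- rma.mv(
  yi ~ 1, V = V,
  random = ~ 1 | study / esid,
  data = dat.lehmann2018
)

# cluster-robust (CR2) confidence intervals
conf_int(res, vcov = "CR2")
```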
library(metaselection)
# load the data
data("dat.lehmann2018", package = "metadat")
# tidy up
dat.lehmann2018$study <- dat.lehmann2018$Full_Citation
dat.lehmann2018$sei <- sqrt(dat.lehmann2018$vi)
dat.lehmann2018$Design <- factor(
  dat.lehmann2018$Design,
  levels = c("Between Subjects", "Within Subjects"),
  labels = c("Between", "Within")
)
# fit a one-step selection model
sel1 <- selection_model(
yi = yi, # effect size est.
sei = sei, # standard error
cluster = study, # identifier for independent clusters
data = dat.lehmann2018, # dataset
selection_type = "step", # type of selection model
steps = .025, # single threshold for step-function
estimator = "CML", # estimation method
bootstrap = "none" # large-sample sandwich standard errors
)
Step Function Model
Call:
selection_model(data = dat.lehmann2018, yi = yi, sei = sei, cluster = study,
selection_type = "step", steps = 0.025, estimator = "CML",
bootstrap = "none")
Number of clusters = 41; Number of effects = 81
Steps: 0.025
Estimator: composite marginal likelihood
Variance estimator: robust
Log composite likelihood of selection model: -44.46436
Inverse selection weighted partial log likelihood: 58.35719
Mean effect estimates:
Large Sample
Coef. Estimate Std. Error p-value Lower Upper
beta 0.133 0.137 0.333 -0.136 0.402
Heterogeneity estimates:
Large Sample
Coef. Estimate Std. Error p-value Lower Upper
tau2 0.0811 0.0845 --- 0.0105 0.625
Selection process estimates:
Step: 0 < p <= 0.025; Studies: 16; Effects: 25
Large Sample
Coef. Estimate Std. Error p-value Lower Upper
lambda0 1 --- --- --- ---
Step: 0.025 < p <= 1; Studies: 29; Effects: 56
Large Sample
Coef. Estimate Std. Error p-value Lower Upper
lambda1 0.548 0.616 0.593 0.0607 4.96
# turn on parallel processing
library(future)
plan(multisession, workers = 8)
set.seed(20250613) # for reproducibility
sel1_boot <- selection_model(
yi = yi, # effect size est.
sei = sei, # standard error
cluster = study, # identifier for independent clusters
data = dat.lehmann2018, # dataset
selection_type = "step", # type of selection model
steps = .025, # single threshold for step-function
estimator = "CML", # estimation method
bootstrap = "two-stage", # recommended type of bootstrapping
R = 1999, # number of bootstrap re-samples
CI_type = c("large-sample", # keep the large-sample sandwich CI
"percentile") # recommended type of bootstrap CI
)
Step Function Model with Cluster Bootstrapping
Call:
selection_model(data = dat.lehmann2018, yi = yi, sei = sei, cluster = study,
selection_type = "step", steps = 0.025, estimator = "CML",
CI_type = c("large-sample", "percentile"), bootstrap = "two-stage",
R = 1999)
Number of clusters = 41; Number of effects = 81
Steps: 0.025
Estimator: composite marginal likelihood
Variance estimator: robust
Bootstrap type: two-stage
Number of bootstrap replications: 1999
Log composite likelihood of selection model: -44.46436
Inverse selection weighted partial log likelihood: 58.35719
Mean effect estimates:
Large Sample Percentile Bootstrap
Coef. Estimate Std. Error p-value Lower Upper Lower Upper
beta 0.133 0.137 0.333 -0.136 0.402 -0.0174 0.435
Heterogeneity estimates:
Large Sample Percentile Bootstrap
Coef. Estimate Std. Error p-value Lower Upper Lower Upper
tau2 0.0811 0.0845 --- 0.0105 0.625 1.73e-17 0.238
Selection process estimates:
Step: 0 < p <= 0.025; Studies: 16; Effects: 25
Large Sample Percentile Bootstrap
Coef. Estimate Std. Error p-value Lower Upper Lower Upper
lambda0 1 --- --- --- --- --- ---
Step: 0.025 < p <= 1; Studies: 29; Effects: 56
Large Sample Percentile Bootstrap
Coef. Estimate Std. Error p-value Lower Upper Lower Upper
lambda1 0.548 0.616 0.593 0.0607 4.96 0.0537 2.88
set.seed(20250613) # for reproducibility
sel1_mod <- selection_model(
yi = yi, # effect size est.
sei = sei, # standard error
cluster = study, # identifier for independent clusters
mean_mods = ~ 0 + Design, # design type moderator
data = dat.lehmann2018, # dataset
selection_type = "step", # type of selection model
steps = .025, # single threshold for step-function
estimator = "CML", # estimation method
bootstrap = "two-stage", # recommended type of bootstrapping
R = 1999, # number of bootstrap re-samples
CI_type = c("large-sample", # keep the large-sample sandwich CI
"percentile") # recommended type of bootstrap CI
)
Step Function Model with Cluster Bootstrapping
Call:
selection_model(data = dat.lehmann2018, yi = yi, sei = sei, cluster = study,
selection_type = "step", steps = 0.025, mean_mods = ~0 +
Design, estimator = "CML", CI_type = c("large-sample",
"percentile"), bootstrap = "two-stage", R = 1999)
Number of clusters = 41; Number of effects = 81
Steps: 0.025
Estimator: composite marginal likelihood
Variance estimator: robust
Bootstrap type: two-stage
Number of bootstrap replications: 1990
Log composite likelihood of selection model: -44.12226
Inverse selection weighted partial log likelihood: 61.14273
Mean effect estimates:
Large Sample Percentile Bootstrap
Coef. Estimate Std. Error p-value Lower Upper Lower Upper
beta_DesignBetween 0.113 0.117 0.333 -0.116 0.343 -0.0484 0.339
beta_DesignWithin 0.196 0.234 0.400 -0.261 0.654 0.0104 0.985
Heterogeneity estimates:
Large Sample Percentile Bootstrap
Coef. Estimate Std. Error p-value Lower Upper Lower Upper
tau2 0.0785 0.081 --- 0.0104 0.593 1.14e-17 0.197
Selection process estimates:
Step: 0 < p <= 0.025; Studies: 16; Effects: 25
Large Sample Percentile Bootstrap
Coef. Estimate Std. Error p-value Lower Upper Lower Upper
lambda0 1 --- --- --- --- --- ---
Step: 0.025 < p <= 1; Studies: 29; Effects: 56
Large Sample Percentile Bootstrap
Coef. Estimate Std. Error p-value Lower Upper Lower Upper
lambda1 0.533 0.601 0.577 0.0584 4.86 0.042 2.66
set.seed(20250613) # for reproducibility
sel2_mod <- selection_model(
yi = yi, # effect size est.
sei = sei, # standard error
cluster = study, # identifier for independent clusters
mean_mods = ~ 0 + Design, # design type moderator
data = dat.lehmann2018, # dataset
selection_type = "step", # type of selection model
steps = c(.025,.500), # two thresholds for step-function
estimator = "CML", # estimation method
bootstrap = "two-stage", # recommended type of bootstrapping
R = 1999, # number of bootstrap re-samples
CI_type = c("large-sample", # keep the large-sample sandwich CI
"percentile") # recommended type of bootstrap CI
)
Step Function Model with Cluster Bootstrapping
Call:
selection_model(data = dat.lehmann2018, yi = yi, sei = sei, cluster = study,
selection_type = "step", steps = c(0.025, 0.5), mean_mods = ~0 +
Design, estimator = "CML", CI_type = c("large-sample",
"percentile"), bootstrap = "two-stage", R = 1999)
Number of clusters = 41; Number of effects = 81
Steps: 0.025, 0.5
Estimator: composite marginal likelihood
Variance estimator: robust
Bootstrap type: two-stage
Number of bootstrap replications: 1990
Log composite likelihood of selection model: -43.51452
Inverse selection weighted partial log likelihood: 85.42712
Mean effect estimates:
Large Sample Percentile Bootstrap
Coef. Estimate Std. Error p-value Lower Upper Lower Upper
beta_DesignBetween 0.0419 0.140 0.765 -0.233 0.317 -0.1706 0.335
beta_DesignWithin 0.1462 0.247 0.554 -0.338 0.631 -0.0329 0.989
Heterogeneity estimates:
Large Sample Percentile Bootstrap
Coef. Estimate Std. Error p-value Lower Upper Lower Upper
tau2 0.0804 0.0881 --- 0.00937 0.69 1.26e-17 0.207
Selection process estimates:
Step: 0 < p <= 0.025; Studies: 16; Effects: 25
Large Sample Percentile Bootstrap
Coef. Estimate Std. Error p-value Lower Upper Lower Upper
lambda0 1 --- --- --- --- --- ---
Step: 0.025 < p <= 0.5; Studies: 22; Effects: 33
Large Sample Percentile Bootstrap
Coef. Estimate Std. Error p-value Lower Upper Lower Upper
lambda1 0.476 0.595 0.552 0.0412 5.5 0.0364 2.68
Step: 0.5 < p <= 1; Studies: 17; Effects: 23
Large Sample Percentile Bootstrap
Coef. Estimate Std. Error p-value Lower Upper Lower Upper
lambda2 0.307 0.466 0.437 0.0156 6.03 0.0122 3.13
Other supported models:
beta-function models (Citkowicz and Vevea 2017; see the sketch below)
Location-scale meta-regression (Viechtbauer and López‐López 2022)
Predictors of selection (Coburn and Vevea 2015)
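For instance, a beta-function selection model can be requested by changing the selection_type argument; a minimal sketch reusing the data prepared above (assuming, per the pattern of the step-function calls, that the beta model is selected with selection_type = "beta" and that no additional arguments are required):

```r
# fit a beta-function selection model (Citkowicz and Vevea 2017)
sel_beta <- selection_model(
  yi = yi,                  # effect size est.
  sei = sei,                # standard error
  cluster = study,          # identifier for independent clusters
  data = dat.lehmann2018,   # dataset
  selection_type = "beta",  # beta-function instead of step-function
  estimator = "CML",        # estimation method
  bootstrap = "none"        # large-sample sandwich standard errors
)
```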
Marginal step-function selection models are worth adding to the toolbox (Pustejovsky, Citkowicz, and Joshi 2025).
Low bias compared to other selective reporting adjustments (including PET-PEESE)
Bias-variance trade-off relative to regular meta-analytic models
Two-stage clustered bootstrap percentile confidence intervals work tolerably well
metaselection
Currently available on GitHub at https://github.com/jepusto/metaselection
Install using `remotes::install_github("jepusto/metaselection")`