Effective sample size aggregation

econometrics
causal inference
weighting
Author

James E. Pustejovsky

Published

January 22, 2019

In settings with independent observations, sample size is one way to quickly characterize the precision of an estimate. But what if your estimate is based on weighted data, where each observation doesn’t necessarily contribute equally to the estimate? Here, one useful way to gauge the precision of an estimate is the effective sample size, or ESS. Suppose that we have $N$ independent observations $Y_1,...,Y_N$ drawn from a population with standard deviation $\sigma$, and that observation $i$ receives weight $w_i$. We take the weighted sample mean

$$\tilde{y} = \frac{1}{W} \sum_{i=1}^N w_i Y_i, \qquad \text{where} \quad W = \sum_{i=1}^N w_i,$$

with sampling variance

$$\text{Var}(\tilde{y}) = \frac{\sigma^2}{W^2} \sum_{i=1}^N w_i^2.$$

The ESS is the number of observations from an equally weighted sample that would yield the same level of precision as the weighted sample mean. In an equally weighted sample of size $\tilde{N}$, the variance would be simply $\sigma^2 / \tilde{N}$, and so the ESS is the value of $\tilde{N}$ that solves

$$\frac{\sigma^2}{\tilde{N}} = \frac{\sigma^2}{W^2} \sum_{i=1}^N w_i^2.$$

Re-arranging, the ESS is thus defined as

$$\tilde{N} = \frac{W^2}{\sum_{i=1}^N w_i^2}.$$
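As a quick illustration of the definition, here is a minimal Python sketch. The function name `effective_sample_size` and the two weight vectors are made up for this example; they do not come from any of the packages mentioned in this post.

```python
def effective_sample_size(weights):
    """Return W^2 / sum(w_i^2) for a sequence of weights."""
    total = sum(weights)                    # W
    sum_sq = sum(w * w for w in weights)    # sum of squared weights
    if sum_sq == 0:
        raise ValueError("at least one weight must be non-zero")
    return total ** 2 / sum_sq


# Hypothetical weights, for illustration only.
equal = [1.0] * 10                   # equal weights: ESS equals the sample size
unequal = [4.0, 1.0, 1.0, 1.0, 1.0]  # one dominant observation

ess_equal = effective_sample_size(equal)      # 10^2 / 10 = 10
ess_unequal = effective_sample_size(unequal)  # 8^2 / 20 = 3.2

print(ess_equal, ess_unequal)
```

With equal weights, the ESS recovers the actual sample size. In the second vector, a single heavily weighted observation pulls the ESS of five observations down to about 3.2. Because the weights enter as a ratio, multiplying all of them by a constant leaves the ESS unchanged.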

The ESS is reported in several packages for propensity score weighting, including twang and optweight. In the propensity score context, ESS is a useful measure for comparing different sets of estimated propensity weights, in that weights (or propensity score models/matching methods) that have a larger ESS will yield a more precise estimate of a treatment effect. Given two sets of weights that achieve equivalent degrees of balance, the weights with larger ESS are thus preferable. Methods introduced by Zubizarreta (2015), and implemented in the optweight package, take this logic a step further by using ESS as an objective function to be maximized, subject to specified balancing constraints.

Multi-site effective sample size

Two of my recent projects have involved applying propensity score weighting methods in multi-site settings, where we are interested in estimating site-specific treatment effects as well as an overall aggregate effect. It is straightforward to calculate an ESS for each site, but how then should we aggregate the ESS across sites to characterize the precision of the overall estimate? Several times now, I have found myself having to re-derive the aggregated ESS, and so I am going to work through it here now so as to save future-me (and perhaps you, dear reader) some time.

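Suppose that we have $J$ sites, $n_j$ observations from site $j$ for $j = 1,...,J$, and total sample size $N = \sum_{j=1}^J n_j$. Observation $i$ from site $j$ has outcome $Y_{ij}$ and weight $w_{ij}$. The site-specific weighted average at site $j$ is then

$$\tilde{y}_j = \frac{1}{W_j} \sum_{i=1}^{n_j} w_{ij} Y_{ij}, \qquad \text{where} \quad W_j = \sum_{i=1}^{n_j} w_{ij},$$

and the overall average is

$$\tilde{y} = \frac{1}{N} \sum_{j=1}^J n_j \tilde{y}_j = \frac{1}{N} \sum_{j=1}^J \sum_{i=1}^{n_j} \frac{n_j w_{ij}}{W_j} Y_{ij}.$$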

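For calculating the overall average, observation $i$ from site $j$ contributes weight $u_{ij} = n_j w_{ij} / W_j$. These weights sum to $n_j$ within site $j$, and therefore to $N$ across all sites.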
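To make the re-weighting concrete, the following sketch computes the overall average two ways: as a size-weighted average of the site means, and as a pooled average using the weights u_ij. The two-site data set is a hypothetical toy example, not data from any real project.

```python
# Hypothetical toy data: weights ("w") and outcomes ("y") for two sites.
sites = {
    "A": {"w": [1.0, 2.0, 1.0], "y": [3.0, 5.0, 4.0]},
    "B": {"w": [1.0, 3.0], "y": [10.0, 6.0]},
}

N = sum(len(s["w"]) for s in sites.values())  # total sample size

# Route 1: size-weighted average of the site-specific weighted means.
overall_from_sites = 0.0
for s in sites.values():
    n_j, W_j = len(s["w"]), sum(s["w"])
    site_mean = sum(w * y for w, y in zip(s["w"], s["y"])) / W_j
    overall_from_sites += n_j * site_mean / N

# Route 2: pooled average using u_ij = n_j * w_ij / W_j.
u, y = [], []
for s in sites.values():
    n_j, W_j = len(s["w"]), sum(s["w"])
    u.extend(n_j * w / W_j for w in s["w"])
    y.extend(s["y"])
overall_from_u = sum(u_ij * y_ij for u_ij, y_ij in zip(u, y)) / N

print(overall_from_sites, overall_from_u, sum(u))
```

Both routes give an overall average of 5.35 (up to floating-point error), and the u_ij sum to the total sample size of 5.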

Using these observation-specific weights, the effective sample size for the overall average is

$$ESS = \frac{N^2}{\sum_{j=1}^J \sum_{i=1}^{n_j} u_{ij}^2}.$$

We can also define a site-specific ESS for site $j$:

$$ESS_j = \frac{W_j^2}{\sum_{i=1}^{n_j} w_{ij}^2}.$$

Using the decomposition of the weights as $u_{ij} = n_j w_{ij} / W_j$, the overall ESS can be written as

$$ESS = \frac{N^2}{\sum_{j=1}^J n_j^2 \left(\sum_{i=1}^{n_j} w_{ij}^2 / W_j^2 \right)}.$$

Noting that the term in the parentheses of the denominator is equivalent to $1 / ESS_j$, the overall ESS can therefore be written in terms of the site-specific ESSs and sample sizes:

$$ESS = \frac{N^2}{\sum_{j=1}^J n_j^2 / ESS_j}.$$
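As a numerical check on this identity, the sketch below computes the overall ESS directly from the pooled weights and again from the site-specific ESSs and sample sizes. The function names and the three sets of site weights are hypothetical, chosen only for illustration.

```python
def ess(weights):
    """Site-level ESS: W_j^2 / sum(w_ij^2)."""
    return sum(weights) ** 2 / sum(w * w for w in weights)


def overall_ess_direct(site_weights):
    """Overall ESS from the pooled weights u_ij = n_j * w_ij / W_j."""
    n_total = sum(len(w) for w in site_weights)
    sum_u_sq = 0.0
    for w in site_weights:
        n_j, w_total = len(w), sum(w)
        sum_u_sq += sum((n_j * w_ij / w_total) ** 2 for w_ij in w)
    return n_total ** 2 / sum_u_sq


def overall_ess_from_sites(site_weights):
    """Overall ESS from site sizes and site ESSs: N^2 / sum(n_j^2 / ESS_j)."""
    n_total = sum(len(w) for w in site_weights)
    return n_total ** 2 / sum(len(w) ** 2 / ess(w) for w in site_weights)


# Hypothetical weights for three sites (one inner list per site).
site_weights = [[1.0, 2.0, 1.0], [1.0, 3.0], [1.0, 1.0, 1.0, 1.0]]

direct = overall_ess_direct(site_weights)
aggregated = overall_ess_from_sites(site_weights)
print(direct, aggregated)
```

For these weights, both calculations give 81 / 9.875, or roughly 8.2 effective observations out of 9 actual observations. When every site has equal weights, each site-level ESS equals its sample size and the overall ESS reduces to N.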

There you go. Future me will thank me for this!
