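In settings with independent observations, sample size is one way to quickly characterize the precision of an estimate. But what if your estimate is based on weighted data, where each observation doesn’t necessarily contribute equally to the estimate? Here, one useful way to gauge the precision of an estimate is the effective sample size, or ESS. Suppose that we have $N$ independent observations $Y_1,\dots,Y_N$ drawn from a population with standard deviation $\sigma$, along with weights $w_1,\dots,w_N$ that are treated as fixed. The weighted sample mean is
$$\tilde{y} = \frac{1}{W} \sum_{i=1}^N w_i Y_i, \qquad \text{where} \quad W = \sum_{i=1}^N w_i,$$
and its sampling variance is
$$\text{Var}(\tilde{y}) = \frac{\sigma^2}{W^2} \sum_{i=1}^N w_i^2.$$
The ESS is the number of observations from an equally weighted sample that would yield the same level of precision as the weighted sample mean. In an equally weighted sample of size $\tilde{N}$, the variance of the sample mean would be $\sigma^2 / \tilde{N}$. Setting this equal to the variance of the weighted sample mean gives
$$\frac{\sigma^2}{\tilde{N}} = \frac{\sigma^2}{W^2} \sum_{i=1}^N w_i^2.$$
Re-arranging, the ESS is thus defined as
$$ESS = \frac{W^2}{\sum_{i=1}^N w_i^2} = \frac{\left(\sum_{i=1}^N w_i\right)^2}{\sum_{i=1}^N w_i^2}.$$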
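As a quick numerical illustration, here is a minimal Python sketch of the usual ESS formula, the squared sum of the weights divided by the sum of the squared weights. The function name `effective_sample_size` and the example weights are made up for illustration; this is not the interface of twang or optweight.

```python
def effective_sample_size(weights):
    """Effective sample size: (sum of weights)^2 / (sum of squared weights)."""
    total = sum(weights)
    total_sq = sum(w * w for w in weights)
    if total_sq == 0:
        raise ValueError("at least one weight must be non-zero")
    return total * total / total_sq


# Hypothetical weights, chosen only to show the behavior of the formula.
equal_weights = [1.0] * 10                  # every observation counts the same
unequal_weights = [4.0, 1.0, 1.0, 1.0, 1.0]  # one observation dominates

ess_equal = effective_sample_size(equal_weights)
ess_unequal = effective_sample_size(unequal_weights)

print(ess_equal)    # equals the number of observations when weights are equal
print(ess_unequal)  # smaller than the 5 observations actually in the sample
```

With equal weights the ESS is just the sample size. With unequal weights it falls below the sample size, because a few heavily weighted observations carry most of the information.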
The ESS is reported by several packages for propensity score weighting, including twang and optweight. In the propensity score context, ESS is a useful measure for comparing different sets of estimated propensity weights: all else equal, weights (or propensity score models or matching methods) with a larger ESS will yield a more precise estimate of a treatment effect. Given two sets of weights that achieve equivalent degrees of balance, the weights with the larger ESS are thus preferable. Methods introduced by Zubizarreta (2015), and implemented in the optweight package, take this logic a step further by using ESS as an objective function to be maximized (which amounts to minimizing the variability of the weights), subject to specified balancing constraints.
Multi-site effective sample size
Two of my recent projects have involved applying propensity score weighting methods in multi-site settings, where we are interested in estimating site-specific treatment effects as well as an overall aggregate effect. It is straightforward to calculate an ESS for each site, but how should we then aggregate the ESS across sites to characterize the precision of the overall estimate? Several times now, I have found myself having to re-derive the aggregated ESS, so I am going to work through it here to save future-me (and perhaps you, dear reader) some time.
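Suppose that we have $J$ sites, with $n_j$ observations from site $j$ and weights $w_{1j},\dots,w_{n_j j}$ for the observations within that site. The site-specific weighted mean is $\tilde{y}_j = \frac{1}{W_j} \sum_{i=1}^{n_j} w_{ij} Y_{ij}$, where $W_j = \sum_{i=1}^{n_j} w_{ij}$, and the site-specific effective sample size is
$$ESS_j = \frac{W_j^2}{\sum_{i=1}^{n_j} w_{ij}^2}.$$
The overall estimate is a weighted average of the site-specific means, using site-level weights $v_1,\dots,v_J$:
$$\tilde{y} = \frac{1}{V} \sum_{j=1}^J v_j \tilde{y}_j, \qquad \text{where} \quad V = \sum_{j=1}^J v_j.$$
For calculating the overall average, observation $i$ in site $j$ therefore receives the weight
$$u_{ij} = \frac{v_j w_{ij}}{W_j}.$$
Using these unit-specific weights, the effective sample size for the overall average is
$$ESS = \frac{\left(\sum_{j=1}^J \sum_{i=1}^{n_j} u_{ij}\right)^2}{\sum_{j=1}^J \sum_{i=1}^{n_j} u_{ij}^2}.$$
Using the decomposition of the weights as $u_{ij} = v_j w_{ij} / W_j$, the sum in the numerator is $\sum_{j=1}^J v_j = V$, and the denominator is
$$\sum_{j=1}^J \frac{v_j^2}{W_j^2} \sum_{i=1}^{n_j} w_{ij}^2 = \sum_{j=1}^J \frac{v_j^2}{ESS_j}.$$
The aggregated effective sample size is therefore
$$ESS = \frac{V^2}{\sum_{j=1}^J v_j^2 / ESS_j}.$$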
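To check the aggregation numerically, here is a minimal Python sketch. It assumes the overall estimate is a weighted average of site-specific weighted means, so that each observation's overall weight is its site's weight times its share of the within-site weight total. Under that assumption, the aggregated ESS is the squared sum of the site weights divided by the sum over sites of (squared site weight / site-specific ESS). The function names and the toy weights are hypothetical.

```python
def ess(weights):
    """Effective sample size: (sum of weights)^2 / (sum of squared weights)."""
    total = sum(weights)
    return total * total / sum(w * w for w in weights)


def aggregate_ess(site_weights, site_ess):
    """Aggregate site-level ESS values: V^2 / sum_j (v_j^2 / ESS_j)."""
    total = sum(site_weights)
    return total * total / sum(v * v / e for v, e in zip(site_weights, site_ess))


# Hypothetical example: three sites with made-up unit-level and site-level weights.
unit_weights = [
    [1.0, 2.0, 3.0],       # site 1
    [1.0, 1.0, 1.0, 1.0],  # site 2
    [5.0, 1.0],            # site 3
]
site_weights = [0.5, 0.3, 0.2]

# Route 1: compute each site's ESS, then aggregate.
site_ess = [ess(ws) for ws in unit_weights]
aggregated = aggregate_ess(site_weights, site_ess)

# Route 2: build each observation's overall weight (site weight times the
# observation's share of its site's total), then apply the ESS formula directly.
overall_unit_weights = [
    v * w / sum(ws)
    for v, ws in zip(site_weights, unit_weights)
    for w in ws
]
direct = ess(overall_unit_weights)

print(aggregated, direct)  # the two routes agree up to floating-point error
```

The practical payoff is that the overall ESS can be computed from the site-level summaries alone, without going back to the unit-level weights.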
There you go. Future me will thank me for this!