Heteroskedasticity-robust tests in linear regression: A review and evaluation of small-sample corrections

James E. Pustejovsky and Gleb Furman

4/30/2017

Linear regression

Consider a basic linear regression model:

\[ \begin{aligned} y_i &= \beta_0 + \beta_1 x_{1i} + \cdots + \beta_{p-1} x_{p-1,i} + e_i \\ \mathbf{y} &= \mathbf{X} \boldsymbol\beta + \mathbf{e} \end{aligned} \]

estimated by ordinary least squares:

\[ \boldsymbol{\hat\beta} = \left(\mathbf{X}'\mathbf{X}\right)^{-1} \mathbf{X}'\mathbf{y}. \]
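
As a minimal sketch of the estimator above, the following NumPy snippet computes \(\boldsymbol{\hat\beta}\) on simulated data. The function name `ols_fit`, the simulated design, and the coefficient values are illustrative assumptions and are not taken from the original analysis.

```python
import numpy as np

def ols_fit(X, y):
    """Return OLS coefficients and residuals for y = X beta + e."""
    # Solve the normal equations (X'X) beta = X'y without forming an explicit inverse.
    beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
    resid = y - X @ beta_hat
    return beta_hat, resid

# Hypothetical example data: an intercept plus two predictors (p = 3).
rng = np.random.default_rng(20170430)
n = 50
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
beta = np.array([1.0, 0.5, -0.25])
# Heteroskedastic errors: the error standard deviation depends on x_1.
e = rng.normal(size=n) * np.exp(0.5 * X[:, 1])
y = X @ beta + e

beta_hat, resid = ols_fit(X, y)
print(beta_hat)
```

Solving the normal equations with `np.linalg.solve` is numerically preferable to inverting \(\mathbf{X}'\mathbf{X}\) directly, although both correspond to the same formula.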

HCCMEs
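
Heteroskedasticity-consistent covariance matrix estimators share the sandwich form \(\left(\mathbf{X}'\mathbf{X}\right)^{-1} \mathbf{X}' \text{diag}(\omega_i \hat{e}_i^2) \mathbf{X} \left(\mathbf{X}'\mathbf{X}\right)^{-1}\) and differ only in the weights \(\omega_i\). The sketch below uses the standard definitions from the cited literature (White, 1980; MacKinnon & White, 1985; Cribari-Neto, 2004), with \(h_i\) denoting the leverage of observation \(i\). The function name `hc_vcov` and the simulated data are hypothetical, and it is an assumption that these are exactly the variants evaluated in the simulations.

```python
import numpy as np

def hc_vcov(X, y, kind="HC3"):
    """Sandwich covariance estimate for the OLS coefficients."""
    n, p = X.shape
    XtX_inv = np.linalg.inv(X.T @ X)
    beta_hat = XtX_inv @ X.T @ y
    resid = y - X @ beta_hat
    # Leverages h_i: diagonal of the hat matrix X (X'X)^{-1} X'.
    h = np.einsum("ij,jk,ik->i", X, XtX_inv, X)

    if kind == "HC0":
        omega = np.ones(n)
    elif kind == "HC1":
        omega = np.full(n, n / (n - p))
    elif kind == "HC2":
        omega = 1.0 / (1.0 - h)
    elif kind == "HC3":
        omega = 1.0 / (1.0 - h) ** 2
    elif kind == "HC4":
        delta = np.minimum(4.0, n * h / p)
        omega = (1.0 - h) ** (-delta)
    else:
        raise ValueError(f"unknown estimator: {kind}")

    # "Meat" of the sandwich: X' diag(omega_i * e_i^2) X.
    meat = X.T @ (X * (omega * resid**2)[:, None])
    return XtX_inv @ meat @ XtX_inv

# Hypothetical example data with error variance depending on the predictor.
rng = np.random.default_rng(1)
n, p = 40, 2
X = np.column_stack([np.ones(n), rng.lognormal(size=n)])
y = 1.0 + 0.5 * X[:, 1] + rng.normal(size=n) * X[:, 1]

for kind in ["HC0", "HC1", "HC2", "HC3", "HC4"]:
    se = np.sqrt(np.diag(hc_vcov(X, y, kind)))
    print(kind, se)
```

Because every leverage lies between 0 and 1, the HC3 standard errors are never smaller than the HC2 standard errors, which in turn are never smaller than the HC0 standard errors.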

Potential small-sample improvements

Other potential small-sample improvements

Approximations for the reference distribution of the test statistic:
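
One such approximation is a Satterthwaite-type degrees of freedom correction, which compares the robust t-statistic to a t distribution whose degrees of freedom match the first two moments of the variance estimator. The sketch below is a minimal illustration under stated assumptions: it uses the HC2 variance estimator for a single coefficient and a working model of homoskedastic normal errors, under which the estimator is a quadratic form \(\mathbf{y}'\mathbf{B}\mathbf{y}\) and the degrees of freedom are \(\text{tr}(\mathbf{B})^2 / \text{tr}(\mathbf{B}^2)\). The function name `satterthwaite_df` is hypothetical, and whether this is the exact variant evaluated in the simulations is an assumption.

```python
import numpy as np

def satterthwaite_df(X, j):
    """Satterthwaite degrees of freedom for the HC2 variance of coefficient j,
    assuming a working model of homoskedastic normal errors."""
    n, p = X.shape
    XtX_inv = np.linalg.inv(X.T @ X)
    H = X @ XtX_inv @ X.T          # hat matrix
    h = np.diag(H)                 # leverages
    # g_i = x_i' (X'X)^{-1} c, where c selects coefficient j.
    g = X @ XtX_inv[:, j]
    # HC2 variance of coefficient j is e' A e with A = diag(g_i^2 / (1 - h_i)).
    A = np.diag(g**2 / (1.0 - h))
    M = np.eye(n) - H              # residual maker: e = M y
    B = M @ A @ M                  # variance estimator as y' B y
    return np.trace(B) ** 2 / np.trace(B @ B)

# Hypothetical example design with a skewed predictor (high-leverage points).
rng = np.random.default_rng(2)
n, p = 30, 2
X = np.column_stack([np.ones(n), rng.lognormal(size=n)])
df = satterthwaite_df(X, j=1)
print(df)
```

The resulting degrees of freedom can never exceed \(n - p\), and they shrink as the design becomes more unbalanced, which widens the reference distribution relative to the standard normal.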

Simulations

Size of HC* variants, \(\alpha = .05\)

Size of HC* variants, \(\alpha = .01\)

Size of HC* variants, \(\alpha = .005\)

Size of selected tests, \(\alpha = .05\)

Size of selected tests, \(\alpha = .01\)

Size of selected tests, \(\alpha = .005\)

Findings

  1. The currently recommended test, HC3, does not adequately control the type-I error rate.
  2. At the \(\alpha = .05\) level, HC4 maintains the most accurate rejection rates of all tests considered.
  3. At smaller \(\alpha\) levels, the Satterthwaite and Edgeworth approximations outperform HC3 and HC4.

Discussion

References

Cribari-Neto, F. (2004). Asymptotic inference under heteroskedasticity of unknown form. Computational Statistics and Data Analysis, 45(2), 215-233.

Cribari-Neto, F., Souza, T. C., & Vasconcellos, K. L. P. (2007). Inference under heteroskedasticity and leveraged data. Communications in Statistics - Theory and Methods, 36(10), 1877-1888.

Cribari-Neto, F., & da Silva, W. B. (2011). A new heteroskedasticity-consistent covariance matrix estimator for the linear regression model. Advances in Statistical Analysis, 95(2), 129-146.

Davidson, R., & MacKinnon, J. G. (1993). Estimation and Inference in Econometrics. New York, NY: Oxford University Press.

Eicker, F. (1967). Limit theorems for regressions with unequal and dependent errors. In Proceedings of the Fifth Berkeley Symposium on Mathematical Statistics and Probability (Vol. 1, pp. 59-82). Berkeley, CA: University of California Press.

Huber, P. J. (1967). The behavior of maximum likelihood estimates under nonstandard conditions. In Proceedings of the Fifth Berkeley Symposium on Mathematical Statistics and Probability (Vol. 1, pp. 221-233). Berkeley, CA: University of California Press.

Kauermann, G., & Carroll, R. J. (2001). A note on the efficiency of sandwich covariance matrix estimation. Journal of the American Statistical Association, 96(456), 1387-1396.

Lipsitz, S. R., Ibrahim, J. G., & Parzen, M. (1999). A degrees-of-freedom approximation for a t-statistic with heterogeneous variance. Journal of the Royal Statistical Society: Series D (The Statistician), 48(4), 495-506.

Long, J. S., & Ervin, L. H. (2000). Using heteroscedasticity consistent standard errors in the linear regression model. The American Statistician, 54(3), 217-224.

MacKinnon, J. G., & White, H. (1985). Some heteroskedasticity-consistent covariance matrix estimators with improved finite sample properties. Journal of Econometrics, 29(3), 305-325.

McCaffrey, D. F., & Bell, R. M. (2006). Improved hypothesis testing for coefficients in generalized estimating equations with small samples of clusters. Statistics in Medicine, 25(23), 4081-4098.

White, H. (1980). A heteroskedasticity-consistent covariance matrix estimator and a direct test for heteroskedasticity. Econometrica, 48(4), 817-838.

Degree of heteroskedasticity