Heteroskedasticity-consistent (HC) standard errors

Heteroskedasticity occurs when the variance of a model’s error term is not constant, for example when it is related to the predictors in that model. For more information, see Wikipedia: Heteroscedasticity.

Many regression models assume homoskedasticity (i.e. constant variance of the error term), especially when calculating standard errors. In the presence of heteroskedasticity, standard errors calculated under that assumption will be incorrect. Heteroskedasticity-consistent (HC) standard errors, also called “heteroskedasticity-robust” or sometimes just “robust” standard errors, are calculated without assuming homoskedasticity. For more information, see Wikipedia: Heteroscedasticity-consistent standard errors.

Keep in Mind

  • Robust standard errors are a common way of dealing with heteroskedasticity. However, they make certain assumptions about the form of that heteroskedasticity which may not be true. You may want to use GMM instead.
  • For nonlinear models like Logit, heteroskedasticity can bias estimates in addition to messing up standard errors. Simply using a robust covariance matrix will not eliminate this bias. Check the documentation of your nonlinear regression command to see whether its robust-error options also adjust for this bias. If not, consider other ways of dealing with heteroskedasticity besides robust errors.
  • There are multiple kinds of robust standard errors, for example HC1, HC2, and HC3. Check which kinds are available in the commands you’re using.
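
The HC variants mentioned above differ only in how each squared residual is weighted inside the "sandwich" formula. The following is a minimal base-R sketch, assuming the standard textbook definitions of HC0 through HC3 (these weights are not spelled out on this page) and an illustrative regression on the built-in mtcars data. The names hc_se and hc_table are hypothetical helpers, not part of any package.

```r
# Illustrative regression on the built-in mtcars data
m <- lm(mpg ~ cyl + disp + hp, data = mtcars)

X <- model.matrix(m)   # design matrix (n x k), including the intercept
e <- residuals(m)      # OLS residuals
h <- hatvalues(m)      # leverage of each observation
n <- nrow(X)
k <- ncol(X)

# "Bread" of the sandwich: (X'X)^{-1}
bread <- solve(crossprod(X))

# Hypothetical helper: standard errors from per-observation weights w,
# i.e. sqrt(diag( (X'X)^{-1} X' diag(w) X (X'X)^{-1} ))
hc_se <- function(w) {
  meat <- crossprod(X * sqrt(w))   # equals X' diag(w) X
  sqrt(diag(bread %*% meat %*% bread))
}

# Assumed textbook weights for each variant
hc_table <- rbind(
  HC0 = hc_se(e^2),                 # raw squared residuals
  HC1 = hc_se(e^2 * n / (n - k)),   # degrees-of-freedom correction
  HC2 = hc_se(e^2 / (1 - h)),       # leverage adjustment
  HC3 = hc_se(e^2 / (1 - h)^2)      # stronger leverage adjustment
)
hc_table
```

Under these definitions, the HC3 row should agree (up to floating-point error) with what a packaged implementation reports for the same model and type, which makes this a convenient sanity check before relying on a command's default.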

The easiest way to obtain robust standard errors in R is with the estimatr package and its family of lm_robust functions. These default to “HC2” errors, but users can specify a variety of other options.

# If necessary, install estimatr
# install.packages(c('estimatr'))
library(estimatr)

# Get mtcars data
# data(mtcars) ## Optional: Will load automatically anyway

# Default is "HC2". Here we'll specify "HC3" just to illustrate.
m1 <- lm_robust(mpg ~ cyl + disp + hp, data = mtcars, se_type = "HC3")

Alternatively, users may consider the vcovHC function from the sandwich package, which is very flexible and supports a wide variety of generic regression objects. For inference (t-tests, etc.), use it in conjunction with the coeftest function from the lmtest package.

# If necessary, install lmtest and sandwich
# install.packages(c('lmtest','sandwich'))
library(lmtest)
library(sandwich)

# Create a normal regression model (i.e. without robust standard errors)
m2 <- lm(mpg ~ cyl + disp + hp, data = mtcars)

# Get the robust VCOV matrix using sandwich::vcovHC(). We can pick the kind of robust errors 
# with the "type" argument. Note that, unlike estimatr::lm_robust(), the default this time  
# is "HC3". I'll specify it here anyway just to illustrate.
vcovHC(m2, type = "HC3")
sqrt(diag(vcovHC(m2))) ## HC3 SEs (square roots of the VCOV diagonal)

# For statistical inference, use together with lmtest::coeftest().
coeftest(m2, vcov = vcovHC(m2))


Stata has robust standard errors built into most regression commands, and they generally work the same way for all commands.

* Load in auto data
sysuse auto.dta, clear

* Just add robust to the options of the regression
* This will give you HC1
regress price mpg gear_ratio foreign, robust

* For other kinds of robust standard errors use vce()
regress price mpg gear_ratio foreign, vce(hc3)