Heteroskedasticity-consistent (HC) standard errors
Heteroskedasticity occurs when the variance of a model’s error term is not constant, typically because it is related to the predictors in that model. For more information, see Wikipedia: Heteroscedasticity.
Many regression models assume homoskedasticity (i.e. constant variance of the error term), especially when calculating standard errors. In the presence of heteroskedasticity, standard errors calculated under that assumption will generally be incorrect. Heteroskedasticity-consistent (HC) standard errors, also called “heteroskedasticity-robust” or sometimes just “robust” standard errors, are calculated without assuming homoskedasticity. For more information, see Wikipedia: Heteroscedasticity-consistent standard errors.
Keep in Mind
- Robust standard errors are a common way of dealing with heteroskedasticity. However, they make certain assumptions about the form of that heteroskedasticity which may not be true. You may want to use GMM instead.
- For nonlinear models like Logit, heteroskedasticity can bias estimates in addition to messing up standard errors. Simply using a robust covariance matrix will not eliminate this bias. Check the documentation of your nonlinear regression command to see whether its robust-error options also adjust for this bias. If not, consider other ways of dealing with heteroskedasticity besides robust errors.
- There are multiple kinds of robust standard errors, for example HC1, HC2, and HC3. Check which kinds are available in the commands you’re using, and which one each command uses by default.
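As a reference for the variants named above, here is a sketch of the usual sandwich formulas for a linear model. The notation is introduced here and is not from the text above: X is the n × k design matrix, û_i are the OLS residuals, and h_ii = x_i'(X'X)^{-1}x_i are the leverages. The variants differ only in how each squared residual is weighted.

```latex
\begin{aligned}
\widehat{V}_{\mathrm{HC}} &= (X'X)^{-1}\, X'\,\mathrm{diag}(\omega_1,\dots,\omega_n)\, X\, (X'X)^{-1} \\
\text{HC0:}\quad \omega_i &= \hat{u}_i^{2} \\
\text{HC1:}\quad \omega_i &= \frac{n}{n-k}\,\hat{u}_i^{2} \\
\text{HC2:}\quad \omega_i &= \frac{\hat{u}_i^{2}}{1-h_{ii}} \\
\text{HC3:}\quad \omega_i &= \frac{\hat{u}_i^{2}}{(1-h_{ii})^{2}}
\end{aligned}
```

HC1 applies a simple degrees-of-freedom correction, while HC2 and HC3 inflate the residuals of high-leverage observations, so the variants tend to differ most in small samples.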
Also Consider
- Generalized Method of Moments
- Cluster-Robust Standard Errors
- Bootstrap Standard Errors
- Jackknife Standard Errors
Implementations
R
The easiest way to obtain robust standard errors in R is with the estimatr package (link) and its family of lm_robust functions. These will default to “HC2” errors, but users can specify a variety of other options.
# If necessary, install estimatr
# install.packages(c('estimatr'))
library(estimatr)
# Get mtcars data
# data(mtcars) ## Optional: Will load automatically anyway
# Default is "HC2". Here we'll specify "HC3" just to illustrate.
m1 <- lm_robust(mpg ~ cyl + disp + hp, data = mtcars, se_type = "HC3")
summary(m1)
Alternatively, users may consider the vcovHC function from the sandwich package (link), which is very flexible and supports a wide variety of generic regression objects. For inference (t-tests, etc.), use it in conjunction with the coeftest function from the lmtest package (link).
# If necessary, install lmtest and sandwich
# install.packages(c('lmtest','sandwich'))
library(sandwich)
library(lmtest)
# Create a normal regression model (i.e. without robust standard errors)
m2 <- lm(mpg ~ cyl + disp + hp, data = mtcars)
# Get the robust VCOV matrix using sandwich::vcovHC(). We can pick the kind of robust errors
# with the "type" argument. Note that, unlike estimatr::lm_robust(), the default this time
# is "HC3". I'll specify it here anyway just to illustrate.
vcovHC(m2, type = "HC3")
sqrt(diag(vcovHC(m2))) ## HC3 SEs (the default type)
# For statistical inference, use together with lmtest::coeftest().
coeftest(m2, vcov = vcovHC(m2))
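To make the calculation behind vcovHC less of a black box, the following is a minimal base-R sketch that builds an HC3 covariance matrix by hand from the usual sandwich formula. The object names (m_hand, X, u, h, V_hc3, se_hc3) are illustrative only. This is a sketch for a plain lm fit, not a replacement for the package functions.

```r
# Fit the same model as above (base R only; no extra packages needed)
m_hand <- lm(mpg ~ cyl + disp + hp, data = mtcars)

X <- model.matrix(m_hand)   # n x k design matrix
u <- residuals(m_hand)      # OLS residuals
h <- hatvalues(m_hand)      # leverages h_ii

# "Bread": (X'X)^{-1}
bread <- solve(crossprod(X))

# "Meat" for HC3: X' diag(u^2 / (1 - h)^2) X.
# Scaling row i of X by u_i / (1 - h_i) and taking the cross-product gives this.
meat <- crossprod(X * (u / (1 - h)))

# Sandwich estimator and the implied standard errors
V_hc3  <- bread %*% meat %*% bread
se_hc3 <- sqrt(diag(V_hc3))
se_hc3
```

If sandwich is loaded, the result should agree with vcovHC(m2, type = "HC3") up to numerical tolerance, for example via all.equal(V_hc3, vcovHC(m2, type = "HC3"), check.attributes = FALSE).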
Stata
Stata has robust standard errors built into most regression commands, and they generally work the same way for all commands.
* Load in auto data
sysuse auto.dta, clear
* Just add robust to the options of the regression
* This will give you HC1
regress price mpg gear_ratio foreign, robust
* For other kinds of robust standard errors use vce()
regress price mpg gear_ratio foreign, vce(hc3)