Heteroskedasticity is when the variance of a model’s error term is related to the predictors in that model. For more information, see Wikipedia: Heteroscedasticity.
Many regression models assume homoskedasticity (i.e. constant variance of the error term), especially when calculating standard errors. In the presence of heteroskedasticity, standard errors computed under that assumption will therefore be incorrect. Heteroskedasticity-consistent (HC) standard errors, also called “heteroskedasticity-robust” or sometimes just “robust” standard errors, are calculated without assuming homoskedasticity. For more information, see Wikipedia: Heteroscedasticity-consistent standard errors.
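For a linear model estimated by OLS, the difference can be written out explicitly. The notation here is an assumption for illustration and does not come from the text above: $X$ is the matrix of predictors, $\hat{\varepsilon}_i$ are the OLS residuals, and $\hat{\sigma}^2$ is the usual residual variance estimate. The conventional estimator uses a single variance for every observation, while the simplest heteroskedasticity-consistent estimator (usually labelled HC0) lets each observation contribute its own squared residual:

$$
\widehat{\mathrm{Var}}_{\text{classical}}(\hat{\beta}) = \hat{\sigma}^2 (X'X)^{-1},
\qquad
\widehat{\mathrm{Var}}_{\text{HC0}}(\hat{\beta}) = (X'X)^{-1} \, X' \, \mathrm{diag}\!\left(\hat{\varepsilon}_1^2, \dots, \hat{\varepsilon}_n^2\right) X \, (X'X)^{-1}
$$

Robust standard errors are the square roots of the diagonal of the second matrix. The coefficient estimates themselves are unchanged; only the estimated uncertainty around them differs.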
- Robust standard errors are a common way of dealing with heteroskedasticity. However, they make certain assumptions about the form of that heteroskedasticity which may not be true. You may want to use the Generalized Method of Moments (GMM) instead.
- For nonlinear models like Logit, heteroskedasticity can bias estimates in addition to messing up standard errors. Simply using a robust covariance matrix will not eliminate this bias. Check the documentation of your nonlinear regression command to see whether its robust-error options also adjust for this bias. If not, consider other ways of dealing with heteroskedasticity besides robust errors.
- There are multiple kinds of robust standard errors, for example HC1, HC2, and HC3. Check which kinds are available in the commands you’re using, and which one is the default.
- Generalized Method of Moments
- Cluster-Robust Standard Errors
- Bootstrap Standard Errors
- Jackknife Standard Errors
The easiest way to obtain robust standard errors in R is with the estimatr package and its family of lm_robust functions. These default to “HC2” errors, but users can specify a variety of other options.

    # If necessary, install estimatr
    # install.packages(c('estimatr'))
    library(estimatr)

    # Get mtcars data
    # data(mtcars) ## Optional: Will load automatically anyway

    # Default is "HC2". Here we'll specify "HC3" just to illustrate.
    m1 <- lm_robust(mpg ~ cyl + disp + hp, data = mtcars, se_type = "HC3")
    summary(m1)

Alternatively, users may consider the vcovHC function from the sandwich package, which is very flexible and supports a wide variety of generic regression objects. For inference (t-tests, etc.), use it in conjunction with the coeftest function from the lmtest package.

    # If necessary, install lmtest and sandwich
    # install.packages(c('lmtest','sandwich'))
    library(sandwich)
    library(lmtest)

    # Create a normal regression model (i.e. without robust standard errors)
    m2 <- lm(mpg ~ cyl + disp + hp, data = mtcars)

    # Get the robust VCOV matrix using sandwich::vcovHC(). We can pick the kind of robust errors
    # with the "type" argument. Note that, unlike estimatr::lm_robust(), the default this time
    # is "HC3". I'll specify it here anyway just to illustrate.
    vcovHC(m2, type = "HC3")
    sqrt(diag(vcovHC(m2))) ## HC3 robust SEs (square roots of the diagonal)

    # For statistical inference, use together with lmtest::coeftest().
    coeftest(m2, vcov = vcovHC(m2))

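To see what the choice between HC0, HC1, HC2, and HC3 actually changes, the sketch below recomputes each kind of standard error by hand, using base R only. It assumes the textbook definitions of the weights: HC0 uses the raw squared residuals, HC1 rescales them by n / (n - k), and HC2 and HC3 divide them by one minus the leverage, to the first and second power respectively. The helper sandwich_se() and the object names are made up for illustration and are not part of any package.

```r
# Fit an ordinary OLS model on the built-in mtcars data
fit <- lm(mpg ~ cyl + disp + hp, data = mtcars)

X <- model.matrix(fit)   # design matrix, including the intercept column
e <- residuals(fit)      # OLS residuals
h <- hatvalues(fit)      # leverage of each observation
n <- nrow(X)             # number of observations
k <- ncol(X)             # number of estimated coefficients

# "Bread" of the sandwich estimator: (X'X)^{-1}
bread <- solve(crossprod(X))

# Illustrative helper (not from any package): standard errors from the
# sandwich formula, given one weight per observation
sandwich_se <- function(w) {
  meat <- crossprod(X * sqrt(w))   # equals X' diag(w) X
  sqrt(diag(bread %*% meat %*% bread))
}

# Classical standard errors next to the assumed HC0 to HC3 definitions
se <- cbind(
  classical = sqrt(diag(vcov(fit))),
  HC0 = sandwich_se(e^2),
  HC1 = sandwich_se(e^2 * n / (n - k)),
  HC2 = sandwich_se(e^2 / (1 - h)),
  HC3 = sandwich_se(e^2 / (1 - h)^2)
)
print(round(se, 4))
```

If sandwich is installed, the HC3 column should agree with sqrt(diag(vcovHC(fit, type = "HC3"))). In a small sample such as mtcars the variants can differ noticeably, which is one reason to check which kind a command uses by default.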
Stata has robust standard errors built into most regression commands, and they generally work the same way for all commands.

    * Load in auto data
    sysuse auto.dta, clear

    * Just add robust to the options of the regression
    * This will give you HC1
    regress price mpg gear_ratio foreign, robust

    * For other kinds of robust standard errors use vce()
    regress price mpg gear_ratio foreign, vce(hc3)
