Quantile Regression is an extension of linear regression analysis. It differs from OLS in what it estimates about the response variable: OLS estimates the conditional mean of \(Y\) given the predictor variables (\(X_1, X_2, X_3, \ldots\)), whereas quantile regression estimates the conditional median (or another quantile) of \(Y\) given the same predictors. It is useful in situations where OLS assumptions are not met (heteroskedasticity, bi-modal or skewed distributions). To specify the desired quantile, select a \(\tau\) value between 0 and 1 (.5 gives the median).
For more information on Quantile Regression, see Wikipedia: Quantile Regression
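To make the role of \(\tau\) concrete, the sketch below uses the check (pinball) loss that quantile regression minimizes, in the simplest case of predicting \(Y\) with a single constant. It is an illustrative sketch only: the function names (pinball_loss, total_loss, best_constant) and the toy data are assumptions made up for this example, and real software uses far more efficient solvers than this brute-force search.

```python
def pinball_loss(residual, tau):
    """Check (pinball) loss for one residual u = y - prediction."""
    # Positive residuals are weighted by tau, negative ones by (1 - tau).
    return residual * (tau - (1.0 if residual < 0 else 0.0))


def total_loss(y, constant, tau):
    """Sum of pinball losses when every observation is predicted by one constant."""
    return sum(pinball_loss(value - constant, tau) for value in y)


def best_constant(y, tau):
    """Search the observed values for the constant that minimizes the loss."""
    return min(sorted(y), key=lambda c: total_loss(y, c, tau))


# Toy data with one extreme outlier (illustrative values only).
y = [1.0, 2.0, 3.0, 4.0, 100.0]

median_fit = best_constant(y, tau=0.5)  # minimizer of the tau = 0.5 loss
mean_fit = sum(y) / len(y)              # what a least-squares constant would give

print("tau = 0.5 minimizer:", median_fit)
print("least-squares constant (mean):", mean_fit)
```

With \(\tau = .5\) the minimizer is the median, 3, while the mean is pulled up to 22 by the single outlier.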
- This method allows the dependent variable to have any distributional form; however, it must be continuous and cannot be a dummy variable.
- This method is robust to outliers in the dependent variable, so there is generally no need to remove outlying observations.
- Either the intercept term or at least one predictor is required to run an analysis.
- The standard LASSO, which is built on the least-squares objective, cannot be used directly for feature selection in this framework because it relies on OLS assumptions being satisfied.
- This method does not restrict the use of polynomial or interaction terms. A unique functional form can be specified.
- Although Quantile Regression is useful in applications where OLS assumptions are not met, it can also be used to detect heteroskedasticity, for example by comparing slope estimates across quantiles. This makes it a useful tool for checking that this assumption holds for OLS.
- Several different standard error calculations can be used with this method; however, bootstrapped standard errors are generally the best for complex modeling situations. Clustered standard errors are also possible by estimating a quantile regression with pooled OLS clustered errors.
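As a rough illustration of the bootstrap idea mentioned in the last point, the sketch below resamples observations with replacement and re-estimates an intercept-only median regression, which reduces to the sample median. The helper name bootstrap_median_se, the number of replications, and the simulated data are assumptions for this example; a model with predictors would refit the full quantile regression on each resample instead.

```python
import random
import statistics


def bootstrap_median_se(y, n_boot=2000, seed=0):
    """Bootstrap standard error of the sample median (intercept-only, tau = 0.5)."""
    rng = random.Random(seed)
    n = len(y)
    estimates = []
    for _ in range(n_boot):
        # Resample observations with replacement and re-estimate the median.
        resample = [y[rng.randrange(n)] for _ in range(n)]
        estimates.append(statistics.median(resample))
    # The spread of the re-estimates approximates the standard error.
    return statistics.stdev(estimates)


# Simulated right-skewed outcome (illustrative data, not from the text).
data_rng = random.Random(42)
y = [data_rng.expovariate(1.0) for _ in range(200)]

se = bootstrap_median_se(y)
print("median:", statistics.median(y))
print("bootstrap SE:", se)
```

Fixing the seed makes the result reproducible; in practice the number of replications is a trade-off between run time and the stability of the estimate.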
The quantreg function in statsmodels fits quantile regression models.
import statsmodels.api as sm
import statsmodels.formula.api as smf

mtcars = sm.datasets.get_rdataset("mtcars", "datasets").data

mod = smf.quantreg('mpg ~ cyl + hp + wt', mtcars)

# Specify the quantile when you fit
res = mod.fit(q=.2)
print(res.summary())
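One way to apply the heteroskedasticity check described earlier is to fit the same model at several quantiles and compare the slopes. The following is a minimal sketch on simulated data (the data-generating process, the seed, and the variable names are assumptions for illustration, not part of the mtcars example); it relies only on the smf.quantreg call and the q argument of fit used above.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Simulated data (an assumption for illustration): the noise scale grows
# with x, so the data are heteroskedastic by construction.
rng = np.random.default_rng(0)
n = 500
x = rng.uniform(1, 10, size=n)
y = 1 + 2 * x + x * rng.normal(size=n)
df = pd.DataFrame({"x": x, "y": y})

# Fit the same model at several quantiles and collect the slope on x.
mod = smf.quantreg("y ~ x", df)
slopes = {}
for q in (0.1, 0.5, 0.9):
    res = mod.fit(q=q)
    slopes[q] = res.params["x"]
    print(f"q = {q:.1f}: slope on x = {slopes[q]:.3f}")
```

If the errors were homoskedastic, the slopes would be roughly equal across quantiles; here they should increase with q because the noise scale was made proportional to x.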
The main package for Quantile Regression in R is quantreg. Its main function is rq(), which fits a Quantile Regression model with a default \(\tau\) value of .5; this can be changed through the tau argument.
# Load package
library(quantreg)

# Load data
data(mtcars)

# Run quantile regression with mpg as outcome variable
# and cyl, hp, and wt as predictors
# Using a tau value of .2 for quantiles
quantreg_model = rq(mpg ~ cyl + hp + wt, data = mtcars, tau = .2)

# Look at results
summary(quantreg_model)
Quantile regression can be performed in Stata using the qreg command. By default it fits a median regression (the .5 quantile). See help qreg for some variants, including a bootstrapped quantile regression.
sysuse auto
qreg mpg price trunk weight, q(.2)