A nested logistic regression (nested logit, for short) is a statistical method for modeling a discrete outcome, such as a choice among several alternatives, where the outcome variable $Y$ for each alternative is binary, taking a value of 0 (not chosen) or 1 (chosen). Logit regressions, in general, are based on the logistic distribution and restrict predicted probabilities to lie between 0 and 1.

Traditional logit models require that the Independence of Irrelevant Alternatives (IIA) property holds across all possible outcomes of some process. Nested logit models relax this by grouping outcomes into ‘nests’: IIA must hold among the outcomes within each nest, but the outcomes are not required to jointly satisfy IIA across nests.
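
To make the nesting concrete, one common way to write the nested logit choice probability is sketched below. The notation is an assumption introduced here for illustration: $V_j$ is the systematic utility of alternative $j$, $B_k$ is the nest that contains $j$, and $\lambda_k$ is that nest's dissimilarity parameter.

$$
P_j = \frac{e^{V_j/\lambda_k}\left(\sum_{l \in B_k} e^{V_l/\lambda_k}\right)^{\lambda_k - 1}}{\sum_{m}\left(\sum_{l \in B_m} e^{V_l/\lambda_m}\right)^{\lambda_m}}
$$

For two alternatives $i$ and $j$ in the same nest $B_k$, the ratio $P_i/P_j = e^{(V_i - V_j)/\lambda_k}$ does not depend on any other alternative, so IIA holds within the nest. For alternatives in different nests, the ratio involves the sums over both nests, so IIA need not hold across nests. Setting every $\lambda_k = 1$ collapses the expression to the standard multinomial logit.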

For an example of violating the IIA property, see Red Bus/Blue Bus Paradox.

For a more thorough theoretical treatment, see SAS Documentation: Nested Logit.

## Keep in Mind

• Returned beta coefficients are not the marginal effects normally returned from an OLS regression. They are maximum likelihood estimates. A beta coefficient cannot be interpreted as “a unit increase in $X$ leads to a $\beta$ unit change in the probability of $Y$.”

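• The marginal effect can be obtained by performing a transformation after you estimate. A rough rule of thumb is to divide the beta coefficient by 4, which approximates the largest marginal effect in a simple logit (the effect when the predicted probability is 0.5).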

• Another transformation that may be helpful is the odds ratio. This value is found by raising $e$ to the power of the beta coefficient. $e^\beta$ can be interpreted as the multiplicative change in the odds of $Y$, given a unit change in $X$.
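
The transformations above can be traced through a short derivation. This is a sketch for the simple binary logit case with a single regressor and intercept $\alpha$, which is an assumption made here for illustration; in a nested logit the marginal effects also depend on the nest structure. The value $\beta = 0.4$ is hypothetical.

$$
P = \Pr(Y = 1 \mid X) = \frac{1}{1 + e^{-(\alpha + \beta X)}}, \qquad \frac{\partial P}{\partial X} = \beta \, P (1 - P), \qquad \left|\frac{\partial P}{\partial X}\right| \le \frac{|\beta|}{4}
$$

The bound holds because $P(1 - P)$ reaches its maximum of $1/4$ at $P = 1/2$. With $\beta = 0.4$, the marginal effect is therefore at most $0.4 / 4 = 0.1$.

$$
\frac{P}{1 - P} = e^{\alpha + \beta X} \quad \Longrightarrow \quad \frac{\text{odds}(X + 1)}{\text{odds}(X)} = e^{\beta}
$$

With $\beta = 0.4$, $e^{0.4} \approx 1.49$, so a unit increase in $X$ multiplies the odds of $Y = 1$ by about 1.49. This is a change in the odds, not in the probability itself.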

# Implementations

## R

R has multiple packages that can estimate a nested logit model. To show a simple example, we will use the mlogit package.

```r
# Install the mlogit and AER packages. The latter is just for a dataset we'll be using.
# install.packages(c("mlogit", "AER"))
library(mlogit)

# Load the TravelMode data directly from AER (AER itself does not need to be attached)
data("TravelMode", package = "AER")

# Use the mlogit() function to run a nested logit estimation

# Here, we will predict what mode of travel individuals
# choose using cost and wait times

nestedlogit <- mlogit(
  choice ~ gcost + wait,
  data = TravelMode,
  # The variable that identifies the alternatives, from which our nests are built
  alt.var = "mode",
  # The variable that records the binary choice (chosen or not) for each alternative
  choice = "choice",
  # List of nests as named vectors
  nests = list(Fast = c("air", "train"), Slow = c("car", "bus"))
)

# The results

summary(nestedlogit)

# In this case, air travel is treated as the base level.
# The other modes' maximum likelihood estimates, relative
# to air, are reported as separate intercepts

# The nest (inclusive value) parameters are displayed
# as iv:Fast and iv:Slow
```

Another set of more robust examples comes from Kenneth Train and Yves Croissant.