Difference-in-Differences Event Study / Dynamic Difference-in-Differences
A Difference-in-Differences (DID) event study, or a Dynamic DID model, is a useful tool for evaluating treatment effects across the pre- and post-treatment periods in your respective study. However, since treatment can be staggered (that is, treated groups receive treatment at different time periods), it can be challenging to create a clean event study.
In the following code, we will learn how to create a DID event study when treatment is staggered. If there is only one treatment period, the same methodology as described below can be applied.
Importantly, while this page uses data from a staggered-treatment design, Sun and Abraham (2020) showed that the basic event-study estimation below can be biased in that setup. So the basic code should be used only in cases where treatment occurs at a single time period. Where possible, the code will also show how to apply the Sun and Abraham estimator, which fixes this problem for staggered treatment.
The regression that DID event studies are based around is:
\[Y_{gt} = \alpha + \sum_{k=-T_0}^{-2}\beta_k\times treat_{gk}+\sum_{k=0}^{T_1}\beta_k\times treat_{gk}+ X_{gt}\Gamma+\phi_g+\gamma_t+\epsilon_{gt}\]

Where:

- \(treat_{gk}\) is a dummy variable, equaling 1 if the observation's period relative to group \(g\)'s first treated period is the same value as \(k\), and 0 otherwise (it is always 0 for never-treated groups).
- \(T_0\) and \(T_1\) are the number of leads (pre-treatment periods) and lags (post-treatment periods) to consider around the treatment period, respectively.
- \(X_{gt}\) are controls.
- \(\phi_g\) and \(\gamma_t\) are group (here, state) and time fixed effects.
- Estimation is generally performed with standard errors clustered at the group level.
Important notes on the regression:

- The point of the regression is to estimate, for each period before and after treatment, the difference between the treated and control groups, and in particular to show that the coefficients on the pre-treatment periods are statistically insignificant.
- Showing the control and treated groups are statistically the same before treatment (\(\beta_k=0\) for \(k < 0\)) supports, though does not prove, the parallel trends assumption in DID estimation.
- One of the time periods must be dropped to avoid perfect multicollinearity (as in most fixed-effects setups). In most event studies, the \(k = -1\) period (the last period before treatment) is used as the dropped reference; the toy example below illustrates the setup.
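To make this concrete, here is a minimal sketch (using a made-up two-group mini-panel; the column names `g`, `t`, and `first_treated` are hypothetical stand-ins, not from the data used below) of how the relative-time variable behind the \(treat_{gk}\) dummies is built:

library(dplyr)

# A toy mini-panel: group A is treated starting in period 3, group B never
toy <- tibble(
  g = rep(c("A", "B"), each = 4),
  t = rep(1:4, 2)
) %>%
  mutate(
    # NA first_treated marks never-treated groups, like _nfd in the real data
    first_treated = ifelse(g == "A", 3, NA),
    treat = !is.na(first_treated),
    # Fill with 0 for never-treated groups so the treat x time
    # interactions are well-defined for every row
    time_to_treat = ifelse(treat, t - first_treated, 0)
  )
toy
# Group A gets time_to_treat = -2, -1, 0, 1 across t = 1..4; the regression
# then includes one treat-by-(time_to_treat == k) dummy for each k,
# with k = -1 dropped as the reference.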
Keep in Mind
Mechanically, an event study is a graphical illustration of the point estimates and confidence intervals of the regression for each time period before and after the treatment period. It is especially relevant in the DID environment because the point estimates are the average differences between the treated and control groups in each period, which provides visual evidence on the credibility of the parallel trends assumption.
Also Consider
- 2x2 Difference-in-Differences
- A great resource for learning more about DID and event study theory is Causal Inference: The Mixtape.
Implementations
All implementations use the same data, which comes from Stevenson and Wolfers (2006) by way of Clarke & Schythe (2020), who use it as an example to demonstrate Goodman-Bacon effects. The data is a balanced panel, from 1964 through 1996, of United States no-fault divorce reforms and female suicide rates. You can directly download the data here.
The column `_nfd` in the data specifies the year in which the law went into effect for the respective state. We use this column to identify the leads and lags with respect to the year of treatment. `pcinc`, `asmrh`, and `cases` are controls.

Note that `_nfd` is empty for some states. These states never received treatment, and thus serve as controls.
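As a quick sanity check, here is a short sketch (assuming only the haven package, which the R section below loads anyway) confirming that the never-treated control states are exactly those with missing `_nfd`:

library(haven)
df <- read_dta("https://raw.githubusercontent.com/LOST-STATS/LOST-STATS.github.io/master/Model_Estimation/Data/Event_Study_DiD/bacon_example.dta")
# TRUE = never-treated (control) observations, FALSE = eventually treated
table(is.na(df$`_nfd`))
# And the range of adoption years among treated states
summary(df$`_nfd`)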
Python
Python makes dealing with lots of interaction terms like we have here a little painful, but we can iterate to do a lot of the work for us.
import pandas as pd
import linearmodels as lm
# Read in data
df = pd.read_stata("https://raw.githubusercontent.com/LOST-STATS/LOST-STATS.github.io/master/Model_Estimation/Data/Event_Study_DiD/bacon_example.dta")

# Create the lag/lead for treated states, filling in control obs with 0.
# This allows the interaction between `treat` and `time_to_treat`
# to occur for each state.
# Otherwise, there may be some missings and the estimations will be off.
df['time_to_treat'] = (
    df['year'].sub(df['_nfd'])
    # Missing values for _nfd imply no treatment
    .fillna(0)
    # Cast to int so we don't get decimals in our factor names
    .astype('int')
)
# Create our interactions by hand,
# skipping -1, the last period before treatment
df = (
    # get_dummies returns a dataframe with dummy columns in place of the
    # columns in the named argument, all other columns untouched
    pd.get_dummies(df, columns=['time_to_treat'], prefix='INX')
    # Be sure not to include the minus signs in the column names
    .rename(columns=lambda x: x.replace('-', 'm'))
    # get_dummies has a `drop_first` argument, but since we want to
    # refer to a specific level, we return all levels and
    # drop our reference column manually
    .drop(columns='INX_m1')
    # Set our individual and time indices for our panel data
    .set_index(['stfips', 'year'])
)
# Estimate the regression
scalars = ['pcinc', 'asmrh', 'cases']
factors = df.columns[df.columns.str.contains('INX')]
exog = factors.union(scalars)
endog = 'asmrs'

# With the standard api:
mod = lm.PanelOLS(df[endog], df[exog], entity_effects=True, time_effects=True)
fit = mod.fit(cov_type='clustered', cluster_entity=True)
fit.summary

# With the formula api:
# We can save ourselves some time by creating the regression formula automatically
formula = '{} ~ {} + EntityEffects + TimeEffects'.format(endog, '+'.join(exog))
mod = lm.PanelOLS.from_formula(formula, df)
# Specify clustering when we fit the model
clfe = mod.fit(cov_type='clustered', cluster_entity=True)
# Look at regression results
clfe.summary
Now we can plot the results with matplotlib. Two common approaches are to include vertical-line confidence intervals with `errorbar()` or to include a confidence-interval ribbon with `fill_between()`. I'll show the `errorbar()` version.
# Get coefficients and CIs
res = pd.concat([clfe.params, clfe.std_errors], axis=1)
# Scale standard error to 95% CI
res['ci'] = res['std_error'] * 1.96
# We only want the time interactions
res = res.filter(like='INX', axis=0)
# Turn the coefficient names back into numbers
res.index = (
    res.index
    .str.replace('INX_', '')
    .str.replace('m', '-')
    .astype('int')
    .rename('time_to_treat')
)
# Add our reference period back in, and sort the index
res = res.reindex(range(res.index.min(), res.index.max() + 1)).fillna(0)
# Plot the estimates as connected lines with error bars
ax = res.plot(
    y='parameter',
    yerr='ci',
    xlabel='Time to Treatment',
    ylabel='Estimated Effect',
    legend=False
)
# Add a horizontal line at 0
ax.axhline(0, linestyle='dashed')
# And a vertical line at the treatment time
# (some versions of pandas have a bug that returns an x-axis
# starting at position 0; in that case change the 0 below to 21)
ax.axvline(0, linestyle='dashed')
Which produces:
Of course, as mentioned earlier, this analysis is subject to the critique by Sun and Abraham (2020). You want to calculate effects separately by time-when-treated, and then aggregate them to the time-to-treatment level properly, avoiding the way these estimates can "contaminate" each other in the regular model. You are on your own for this process in Python, though. Read the paper (and the backend code from the R or Stata implementations listed below).
R
First, load packages and the data
library(dplyr)
library(fixest)
library(tidyverse)
library(broom)
library(haven)
# Load and prepare data
bacon_df <- read_dta("https://raw.githubusercontent.com/LOST-STATS/LOST-STATS.github.io/master/Model_Estimation/Data/Event_Study_DiD/bacon_example.dta") %>%
  mutate(
    # Create the lag/lead for treated states, filling in control obs with 0.
    # This allows the interaction between `treat` and `time_to_treat`
    # to occur for each state.
    # Otherwise, there may be some NAs and the estimations will be off.
    time_to_treat = ifelse(is.na(`_nfd`), 0, year - `_nfd`),
    # This will determine the difference
    # between control and treated states
    treat = ifelse(is.na(`_nfd`), 0, 1)
  )
Also, while it's not necessary given how we're about to use the fixest package, if you're interested in creating a dummy variable for each lead/lag in your data set, you can use the fastDummies package and run `bacon_df <- dummy_cols(bacon_df, select_columns = "time_to_treat")`.
We will run the event-study regression using `feols()` from the fixest package. fixest is very fast, supports complex fixed-effects interactions, lets us select our own reference group with `i()` like we need here, and will also help us run the Sun and Abraham (2020) estimator.
m_1 <- feols(asmrs ~
    # The time-treatment interaction terms,
    # with -1 as the reference period
    i(time_to_treat, treat, ref = -1) +
    # Controls
    pcinc + asmrh + cases |
    # State and year fixed effects
    stfips + year,
  # feols clusters by the first fixed effect anyway; we're just making that explicit
  cluster = ~stfips, data = bacon_df)
# Now turn the results into a data frame with a year column for easy plotting
event_1 <- tidy(m_1, conf.int = TRUE) %>%
  # For plotting purposes, we only want the terms that reference years,
  # not the controls
  mutate(year = as.numeric(parse_number(term))) %>%
  filter(!is.na(year))
Now we can plot the results with ggplot2. Two common approaches are to include vertical-line confidence intervals with `geom_pointrange()` or to include a confidence-interval ribbon with `geom_ribbon()`. I'll show the `geom_pointrange()` version, but this is easy to swap out; a sketch of the `geom_ribbon()` version appears a bit further down.
Now, you could simply use `coefplot(m_1)` from fixest and be done with it! This will create the event-study plot for you. Done. But if you want to do some ggplot2 styling afterwards, or make some by-hand tweaks, you can build it yourself:
event_1 %>%
  ggplot(
    mapping = aes(
      x = year,
      y = estimate,
      ymin = conf.low,
      ymax = conf.high
    )
  ) +
  geom_pointrange(
    position = position_dodge(width = 1),
    # Optional decoration:
    color = "black",
    fatten = .5,
    alpha = .8
  ) +
  # Add a line marker at y = 0 (to see if the CI overlaps 0)
  geom_hline(yintercept = 0, color = "red", alpha = 0.2) +
  # A marker for the last pre-event period
  geom_vline(xintercept = -1, color = "black", size = 0.5, alpha = 0.4) +
  # And the event period
  geom_vline(xintercept = 0, linetype = "dotted", color = "black", size = 0.5, alpha = 0.2) +
  # Additional decoration:
  theme_bw() +
  theme(
    plot.title = element_text(face = "bold", size = 12),
    legend.background = element_rect(fill = "white", size = 4, colour = "white"),
    legend.justification = c(0, 1),
    legend.position = c(0, 1),
    axis.ticks = element_line(colour = "white", size = 0.1),
    panel.grid.major = element_line(colour = "white", size = 0.07),
    panel.grid.minor = element_blank()
  ) +
  annotate("text", x = c(0, 2), y = 30, label = c("", "treat")) +
  labs(title = "Event Study: Staggered Treatment", y = "Estimate", x = "Time")
This results in:
Another common option in these graphs is to link all the individual point estimates with a line. This can be done by adding `+ geom_line()` right after the `geom_pointrange()` call.
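If you prefer the ribbon version mentioned above, here is a minimal sketch (reusing the `event_1` data frame from before) that swaps the point ranges for `geom_ribbon()` plus `geom_line()`:

# A sketch of the confidence-interval ribbon alternative
event_1 %>%
  ggplot(aes(x = year, y = estimate, ymin = conf.low, ymax = conf.high)) +
  geom_ribbon(alpha = 0.2) +
  geom_line() +
  geom_hline(yintercept = 0, color = "red", alpha = 0.2) +
  geom_vline(xintercept = -1, color = "black", size = 0.5, alpha = 0.4) +
  theme_bw() +
  labs(title = "Event Study: Staggered Treatment", y = "Estimate", x = "Time")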
Of course, as mentioned earlier, this analysis is subject to the critique by Sun and Abraham (2020). We can also use fixest to estimate the Sun and Abraham estimator, which calculates effects separately by time-when-treated, and then aggregates them to the time-to-treatment level properly, avoiding the way these estimates can "contaminate" each other in the regular model.
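In symbols (my summary of the paper, not its exact notation): Sun and Abraham first estimate a separate effect \(\hat\delta_{e,k}\) for each treatment cohort \(e\) (the groups first treated in the same year) at each relative period \(k\), and then form the interaction-weighted estimate

\[\hat\beta_k = \sum_{e}\hat{w}_{e,k}\,\hat\delta_{e,k}\]

where \(\hat{w}_{e,k}\) is cohort \(e\)'s share of the treated observations seen at relative period \(k\). This aggregation step is what the `aggregate()` call below performs.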
# See help(aggregate.fixest)
# As Sun and Abraham indicate, drop any always-treated groups
sun_df <- bacon_df %>%
  filter(`_nfd` > min(year) | !treat) %>%
  # Set time_to_treat to -1000 for untreated groups
  mutate(time_to_treat = case_when(
    treat == 0 ~ -1000,
    treat == 1 ~ time_to_treat
  )) %>%
  # And create a new year-treated variable that's impossibly far in the future
  # for untreated groups
  mutate(year_treated = case_when(
    treat == 0 ~ 10000,
    treat == 1 ~ `_nfd`
  )) %>%
  # And a shared identifier for year treated and year
  mutate(id = paste0(year_treated, ':', year))

# Don't include so many pre- and post-treatment lags that you end up
# with a lot of periods containing very few observations
table(sun_df$time_to_treat)
sun_df <- sun_df %>%
  filter(time_to_treat == -1000 | (time_to_treat >= -9 & time_to_treat <= 24))
# Read the Sun and Abraham paper before including controls as I do here
m_2 <- feols(asmrs ~
    # This time, interact time_to_treat with year treated,
    # dropping as references the -1 period and the never-treated (-1000)
    i(time_to_treat, f2 = year_treated, drop = c(-1, -1000)) +
    # Controls
    pcinc + asmrh + cases |
    # Fixed effects for group and year
    stfips + year,
  data = sun_df)

# Aggregate the coefficients by relative period
agg_coef = aggregate(m_2, "(time_to_treat)::(-?[[:digit:]]+)")
# And plot
agg_coef %>%
  as_tibble() %>%
  mutate(conf.low = Estimate - 1.96 * `Std. Error`,
         conf.high = Estimate + 1.96 * `Std. Error`,
         `Time to Treatment` = c(-9:-2, 0:24)) %>%
  ggplot(mapping = aes(x = `Time to Treatment`, y = Estimate,
                       ymin = conf.low, ymax = conf.high)) +
  geom_pointrange(position = position_dodge(width = 1),
                  # Optional decoration:
                  color = "black", fatten = .5, alpha = .8) +
  geom_line() +
  # Add a line marker at y = 0 (to see if the CI overlaps 0)
  geom_hline(yintercept = 0, color = "red", alpha = 0.2) +
  # A marker for the last pre-event period
  geom_vline(xintercept = -1, color = "black", size = 0.5, alpha = 0.4) +
  # And the event period
  geom_vline(xintercept = 0, linetype = "dotted", color = "black", size = 0.5, alpha = 0.2) +
  # Additional decoration:
  theme_bw() +
  labs(title = "Event Study: Staggered Treatment with Sun and Abraham (2020) Estimation",
       y = "Estimate", x = "Time")
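As an aside, recent versions of fixest also ship a convenience function, `sunab()`, that performs the interaction and aggregation steps above for you. A minimal sketch, assuming a reasonably recent fixest and reusing `sun_df` (which already codes never-treated states with `year_treated = 10000`, far outside the sample years):

# sunab() takes the cohort (year treated) and the period (calendar year),
# builds the cohort-by-relative-period interactions, and aggregates them
m_sunab <- feols(asmrs ~ sunab(year_treated, year) + pcinc + asmrh + cases |
                   stfips + year,
                 cluster = ~stfips, data = sun_df)
# iplot() graphs the aggregated event-study coefficients directly
iplot(m_sunab)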
Stata
We can use the reghdfe package to help with our two-way fixed effects and high-dimensional data. Install it with `ssc install reghdfe` first if you don't have it.
use "https://raw.githubusercontent.com/LOSTSTATS/LOSTSTATS.github.io/master/Model_Estimation/Data/Event_Study_DiD/bacon_example.dta", clear
* create the lag/lead for treated states
* fill in control obs with 0
* This allows for the interaction between `treat` and `time_to_treat` to occur for each state.
* Otherwise, there may be some NAs and the estimations will be off.
g time_to_treat = year  _nfd
replace time_to_treat = 0 if missing(_nfd)
* this will determine the difference
* btw controls and treated states
g treat = !missing(_nfd)
* Stata won't allow factors with negative values, so let's shift
* timetotreat to start at 0, keeping track of where the true 1 is
summ time_to_treat
g shifted_ttt = time_to_treat  r(min)
summ shifted_ttt if time_to_treat == 1
local true_neg1 = r(mean)
* Regress on our interaction terms with FEs for group and year,
* clustering at the group (state) level
* use ib# to specify our reference group
reghdfe asmrs ib`true_neg1'.shifted_ttt pcinc asmrh cases, a(stfips year) vce(cluster stfips)
Now we can plot.
* Pull out the coefficients and SEs
g coef = .
g se = .
levelsof shifted_ttt, l(times)
foreach t in `times' {
    replace coef = _b[`t'.shifted_ttt] if shifted_ttt == `t'
    replace se = _se[`t'.shifted_ttt] if shifted_ttt == `t'
}

* Make confidence intervals
g ci_top = coef + 1.96*se
g ci_bottom = coef - 1.96*se

* Limit ourselves to one observation per time period,
* switching back to time_to_treat to get the original timing
keep time_to_treat coef se ci_*
duplicates drop
sort time_to_treat

* Create connected scatterplot of coefficients
* with CIs included with rcap
* and a line at 0 both horizontally and vertically
summ ci_top
local top_range = r(max)
summ ci_bottom
local bottom_range = r(min)

twoway (sc coef time_to_treat, connect(line)) ///
    (rcap ci_top ci_bottom time_to_treat) ///
    (function y = 0, range(time_to_treat)) ///
    (function y = 0, range(`bottom_range' `top_range') horiz), ///
    xtitle("Time to Treatment") caption("95% Confidence Intervals Shown")
Which produces:
Any further decoration or theming at that point is up to you.
Of course, as mentioned earlier, this analysis is subject to the critique by Sun and Abraham (2020): we want to calculate effects separately by time-when-treated, and then aggregate them to the time-to-treatment level properly, avoiding the way these estimates can "contaminate" each other in the regular model.

In Stata, we can estimate the Sun and Abraham method using the eventstudyinteract package by Sun herself. Install it by first installing the github package with `net install github, from("https://haghish.github.io/github/")` and then installing eventstudyinteract with `github install lsun20/eventstudyinteract`. See `help eventstudyinteract` for more information.
* Reload our data since we squashed it to graph
use "https://raw.githubusercontent.com/LOST-STATS/LOST-STATS.github.io/master/Model_Estimation/Data/Event_Study_DiD/bacon_example.dta", clear

* Create the lag/lead for treated states, filling in control obs with 0.
* This allows the interaction between `treat` and `time_to_treat` to occur for each state.
* Otherwise, there may be some missings and the estimations will be off.
g time_to_treat = year - _nfd
replace time_to_treat = 0 if missing(_nfd)
* This will determine the difference
* between control and treated states
g treat = !missing(_nfd)
g never_treat = missing(_nfd)

* Create relative-time indicators for treated groups by hand,
* ignoring distant leads and lags due to lack of observations
* (note this assumes any effect outside these leads/lags is 0)
tab time_to_treat
forvalues t = -9(1)16 {
    if `t' < -1 {
        local tname = abs(`t')
        g g_m`tname' = time_to_treat == `t'
    }
    else if `t' >= 0 {
        g g_`t' = time_to_treat == `t'
    }
}

eventstudyinteract asmrs g_*, cohort(_nfd) control_cohort(never_treat) covariates(pcinc asmrh cases) absorb(i.stfips i.year) vce(cluster stfips)
* Get effects and plot
* As of this writing, the coefficient matrix is unlabeled, so we can't use _b[] and _se[];
* instead we'll work with the results table
matrix T = r(table)
g coef = 0 if time_to_treat == -1
g se = 0 if time_to_treat == -1
forvalues t = -9(1)16 {
    if `t' < -1 {
        local tname = abs(`t')
        replace coef = T[1,colnumb(T,"g_m`tname'")] if time_to_treat == `t'
        replace se = T[2,colnumb(T,"g_m`tname'")] if time_to_treat == `t'
    }
    else if `t' >= 0 {
        replace coef = T[1,colnumb(T,"g_`t'")] if time_to_treat == `t'
        replace se = T[2,colnumb(T,"g_`t'")] if time_to_treat == `t'
    }
}

* Make confidence intervals
g ci_top = coef + 1.96*se
g ci_bottom = coef - 1.96*se

keep time_to_treat coef se ci_*
duplicates drop
sort time_to_treat
keep if inrange(time_to_treat, -9, 16)
* Create connected scatterplot of coefficients
* with CIs included with rcap
* and a line at 0 both horizontally and vertically
summ ci_top
local top_range = r(max)
summ ci_bottom
local bottom_range = r(min)

twoway (sc coef time_to_treat, connect(line)) ///
    (rcap ci_top ci_bottom time_to_treat) ///
    (function y = 0, range(time_to_treat)) ///
    (function y = 0, range(`bottom_range' `top_range') horiz), ///
    xtitle("Time to Treatment with Sun and Abraham (2020) Estimation") caption("95% Confidence Intervals Shown")