2019 Northwestern-Duke Main and Advanced Causal Inference Workshops
May 3, 2019
- Main Workshop: Monday, August 12 - Friday, August 16
- Advanced Workshop: Monday, August 19 - Wednesday, August 21
Workshop Schedule
You should plan on full days, roughly 9:00 AM - 5:00 PM. Breakfast will be available at 8:30 AM.
Workshop Organizers
- Bernie Black, Northwestern University, Pritzker Law School, Institute for Policy Research, and Kellogg School of Management, Department of Finance
Bernie Black is Nicholas J. Chabraja Professor at Northwestern University, with positions in the Pritzker School of Law, the Institute for Policy Research, and the Kellogg School of Management, Finance Department. Principal research interests: health law and policy; empirical legal studies, law and finance, international corporate governance. Web page with link to CV. Papers on SSRN.
- Mat McCubbins, Duke University, Department of Political Science and Law School
Mat McCubbins is Professor of Political Science and Law at Duke University, with positions in the Political Science Department and the Law School, and director of the Center for Law and Democracy. Principal research interests: democratic institutions, legislative organization; behavioral experiments, communication, learning and decision making; statutory interpretation, administrative procedure, research design; network economics. Web page with link to CV. Papers on SSRN.
Main Workshop
Overview
Research design for causal inference is at the heart of a “credibility revolution” in empirical research. We will cover the design of true randomized experiments and contrast them to natural or quasi experiments and to pure observational studies, where part of the sample is treated in some way, the remainder is a control group, but the researcher controls neither the assignment of cases to treatment and control groups nor administration of the treatment. We will assess the causal inferences one can draw from a research design, threats to valid inference, and research designs that can mitigate those threats.
Most empirical methods courses survey a variety of methods. We will begin instead with the goal of causal inference, and emphasize how to design research to come closer to that goal. The methods are often adapted to a particular study. Some of the methods are covered in PhD programs, but rarely with a focus on credible causal inference and on which methods to use with messy, real-world datasets and limited sample sizes.
Target Audience
Quantitative empirical researchers (faculty and graduate students) in social science, including law, political science, economics, many business-school areas (finance, accounting, management, marketing, etc.), medicine, sociology, education, psychology, and any other field where causal inference is important.
We will assume knowledge, at the level of an upper-level college econometrics or similar course, of multivariate regression, including OLS, logit, and probit; basic probability and statistics including conditional and compound probabilities, confidence intervals, t-statistics, and standard errors; and some understanding of instrumental variables. Despite limited prerequisites, this course should be suitable for researchers with recent PhD-level training and for empirical legal scholars with reasonable but more limited training.
Main Workshop Faculty (in order of appearance)
- Donald B. Rubin, Harvard University, Statistics Department
Donald Rubin is John L. Loeb Professor of Statistics Emeritus, at Harvard. His work on the “Rubin Causal Model” is central to modern understanding of causal inference with observational data. Principal research interests: statistical methods for causal inference; Bayesian statistics; analysis of incomplete data. Web page. Wikipedia.
- Brigham Frandsen, Brigham Young University, Economics Department; currently visiting MIT
Brigham Frandsen is Associate Professor of Economics and Visiting Associate Professor at MIT. Research interests include developing methods for causal inference on how treatment effects vary across the treated population, and applying those methods to labor, education, and health economics questions. Web page.
- Joshua Angrist, MIT, Economics Department
Joshua Angrist is Ford Professor of Economics at MIT. His work on “causal IV” is central to the modern revival of instrumental variables methods. Principal research interests: labor economics; econometrics. Coauthor, with Jorn-Steffen Pischke, of Mostly Harmless Econometrics: An Empiricist’s Companion (2009), Mastering ‘Metrics: The Path from Cause to Effect (2014), and the Mostly Harmless Econometrics blog.
- Jens Hainmueller, Stanford University, Political Science Department
Jens Hainmueller is Professor in the Stanford Political Science Department, and co-Director of the Stanford Immigration Policy Lab. He also holds a courtesy appointment in the Stanford Graduate School of Business. His research interests include statistical methods, political economy, and political behavior. Web page. Papers on SSRN.
Monday, August 12 (Donald Rubin)
- Introduction to Modern Methods for Causal Inference
Overview of causal inference and the Rubin “potential outcomes” causal model. The “gold standard” of a randomized experiment. Treatment and control groups, and the core role of the assignment (to treatment) mechanism. Causal inference as a missing data problem, and imputation of missing potential outcomes. Rerandomization. One-sided and two-sided noncompliance.
- Matching and Reweighting Designs for “Pure” Observational Studies
The core, untestable requirement of selection [only] on observables. Ensuring covariate balance and common support. Subclassification, matching, reweighting, and regression estimators of average treatment effects. Propensity score methods.
- Instrumental variable methods
Causal inference with instrumental variables (IV), including (i) the core, untestable need to satisfy the “only through” exclusion restriction; (ii) heterogeneous treatment effects; and (iii) intent-to-treat designs for randomized trials (or quasi-experiments) with noncompliance.
- Beyond means
Going beyond estimating “average treatment effects,” and estimating quantile and marginal causal effects.
- Panel Data and Difference-in-Differences
Panel data methods: pooled OLS, random effects, correlated random effects, and fixed effects. Simple two-period DiD. The core “parallel changes” assumption. Testing this assumption. Leads and lags and distributed lag models. When does a design with unit fixed effects become DiD? Accommodating covariates. Triple differences. Robust and clustered standard errors. Introduction to synthetic controls.
- Regression Discontinuity
(Regression) discontinuity (RD) research designs: sharp and fuzzy designs; bandwidth choice; testing for covariate balance and manipulation of the threshold; discontinuities as substitutes for true randomization and sources of convincing instruments.
- Attendees will present their own research design questions from current work in breakout sessions and receive feedback on research design. Session leaders: Bernie Black, Mat McCubbins, Jens Hainmueller, Vladimir Atanasov (William and Mary). Additional parallel sessions if needed to meet demand.
On selected days (tentatively, Tuesday, Wednesday, and Thursday), we will run parallel Stata and R sessions to illustrate code for the research designs discussed in the lectures, or the speakers will build Stata code into their lecture slides. Presenters: Bernard Black (Stata) and Joshua Lerner (R).
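To give a flavor of what such an illustration can look like, here is a minimal sketch of the simple two-period DiD estimate listed under the panel data topic. It is not workshop material: the sessions use Stata and R, while this sketch uses Python, and the function name `did_estimate` and the toy data are hypothetical. It assumes each observation is a (treated, post, outcome) row and that all four group-by-period cells are non-empty.

```python
from statistics import mean


def did_estimate(rows):
    """Two-period difference-in-differences from (treated, post, outcome) rows.

    Returns (change in treated group mean) - (change in control group mean).
    Under the parallel changes assumption this estimates the average
    treatment effect on the treated.
    """
    def cell_mean(treated, post):
        values = [y for t, p, y in rows if t == treated and p == post]
        if not values:
            raise ValueError(f"no observations for treated={treated}, post={post}")
        return mean(values)

    change_treated = cell_mean(1, 1) - cell_mean(1, 0)
    change_control = cell_mean(0, 1) - cell_mean(0, 0)
    return change_treated - change_control


# Hypothetical toy data: (treated, post, outcome)
rows = [
    (1, 0, 10.0), (1, 0, 12.0),  # treated group, before
    (1, 1, 15.0), (1, 1, 17.0),  # treated group, after
    (0, 0, 8.0), (0, 0, 10.0),   # control group, before
    (0, 1, 9.0), (0, 1, 11.0),   # control group, after
]
estimate = did_estimate(rows)
print(estimate)
```

In the toy data the treated group mean rises by 5 and the control group mean rises by 1, so the estimate is 4. The sketch produces only a point estimate; standard errors, covariates, and tests of the parallel changes assumption are left out.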
Registration and Workshop Cost
Tuition is $900 ($600 for post-docs and graduate students (PhD, SJD, or law); $400 if you are Northwestern or Duke affiliated). The workshop fee includes all materials, a temporary Stata 15 license, breakfast, lunch, snacks, and an evening reception on the first workshop day.
You can cancel either workshop up to five weeks in advance (July 1 for the main workshop, July 8 for the advanced workshop) for a 75% refund, or up to three weeks in advance for a 50% refund (in each case, less the credit card processing fee). There are no refunds after that. The Northwestern-Duke discount should be applied automatically if you submit a valid northwestern.edu or duke.edu email address.
Advanced Workshop
Overview
The advanced workshop provides in-depth discussion of selected topics that are beyond what we can cover in the main workshop. The principal topics for 2019 include:
Monday and first half of Tuesday: Advanced matching and balancing methods, including synthetic controls, methods that aim at exact covariate balance, and balancing with panel data.
Tuesday afternoon and Wednesday: Application of machine learning methods to causal inference. Tuesday afternoon will be an introduction to methods. Wednesday will cover applications to causal inference, including where machine learning approaches are and are not useful.
Target Audience
Empirical researchers who are familiar with the basics of causal inference (from our main workshop or otherwise), and want to extend their knowledge. We will assume familiarity, but not expertise, with potential outcomes, difference-in-differences, regression discontinuity, and panel data.
Advanced Workshop Faculty
- Jann Spiess, Stanford University, Business School
Jann Spiess is Post-Doctoral Researcher at Microsoft Research and Assistant Professor of Operations, Information & Technology at Stanford Graduate School of Business. He is coauthor, with Sendhil Mullainathan, of Machine Learning: An Applied Econometric Approach (Journal of Economic Perspectives, 2017). His research focuses on integrating insights and techniques from machine learning into econometrics. Website.
- Yiqing Xu, UC San Diego, Political Science Department
Yiqing Xu is Assistant Professor of Political Science at University of California, San Diego. His main methods research involves causal inference with panel data. Website.
Monday, August 19 and Tuesday, August 20: Morning (Yiqing Xu)
- Advanced panel data methods
Advanced topics for causal inference with panel data, using parametric, semi-parametric, and non-parametric methods for addressing imbalance between treated and control units. Topics include interactive fixed effects and matrix completion methods, as well as reweighting approaches such as panel matching, trajectory balancing, and augmented synthetic control. Relative strengths and weaknesses of different methods will be discussed.
- Stata sessions: Monday and Tuesday afternoons, after regular class sessions
Stata code for causal inference with panel data: code and examples from Vladimir Atanasov and Bernard Black, The Trouble with Instruments: The Need for Pre-Treatment Balance in Shock-IV Designs (working paper 2018)
- Introduction to machine learning (predictive inference)
Introduction to “machine-learning” approaches to prediction algorithms. High-dimensional model selection (function classes, regularization, tuning), model combination (ensemble models, bagging, boosting), model evaluation, and implementation.
- Applications of machine learning to causal inference
When and how machine learning methods can be applied to causal inference questions. Limitations (prediction vs estimation) and opportunities (data pre-processing, prediction as quantity of interest, high-dimensional nuisance parameters), with examples from an emerging empirical literature.
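The "prediction vs estimation" limitation can be illustrated with a small sketch. It is not taken from the workshop materials; the function name `ridge_slope` and the toy data are hypothetical, and the example assumes a single regressor with no intercept. Regularization, which often helps prediction, shrinks the fitted coefficient toward zero, so that coefficient should not be read directly as an effect estimate.

```python
def ridge_slope(xs, ys, penalty):
    """Slope b minimizing sum((y - b*x)^2) + penalty * b^2 (no intercept).

    With penalty = 0 this is the OLS slope; a larger penalty shrinks b
    toward zero.
    """
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / (sxx + penalty)


# Hypothetical noise-free data generated with a true slope of 2.
xs = [-2.0, -1.0, 0.0, 1.0, 2.0]
ys = [2.0 * x for x in xs]

ols = ridge_slope(xs, ys, penalty=0.0)      # recovers the true slope
shrunk = ridge_slope(xs, ys, penalty=10.0)  # shrunk toward zero
print(ols, shrunk)
```

With these numbers the unpenalized slope equals the true value of 2, while a penalty of 10 halves it to 1. This is one simple way to see why a method tuned for prediction need not return usable parameter estimates.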
Tuition is $600 ($400 for post-docs and graduate students (PhD, SJD, or law); $300 if you are Northwestern or Duke affiliated). There is a $200 discount for non-Northwestern-Duke persons attending both workshops.
Questions about the workshops: Please email Bernie Black or Mat McCubbins for substantive questions or fee waiver requests, and Isabel Fox for logistics and registration.