The Role of Statistics and Data Science in Impact Evaluation
Category: International Statistical Institute
Proposal Description
Impact evaluation involves estimating an intervention’s effect on one or more outcomes of interest. It is often carried out by non-statisticians in policy domains including public health, social services, and education. Because correlation does not imply causation, statistical and data science tools are broadly applicable here, but they must be used carefully to estimate causal relationships correctly. This session will provide an overview of modern approaches to impact evaluation and assess their utility and limitations for estimating causal effects and informing evidence-based policymaking.
Across disciplines, “impact evaluation” refers to different problems and approaches, leading to confusion among the many methods, definitions, and notations. Stefan Sperlich (Université de Genève) will present a largely data-driven procedure that takes advantage of this diversity by combining methods. Sperlich will guide practitioners from selecting indicators and causality models through to significance tests for treatment effects. He will step through a specific example, then discuss strategies to minimize the influence of subjective judgement on results in studies that were not designed as experiments. Graph theory and nonparametric statistics can play important roles.
Daniela De Angelis (University of Cambridge) will discuss variants of the causal factor analysis (FA) model, which is broadly used to estimate the impact of an intervention from observational time-series data on multiple units, allowing for both measured and unmeasured confounders. De Angelis will demonstrate FA’s utility in settings with limited data, as well as an extension that models the dependence of causal effects on effect modifiers. Fitting these models under the Bayesian paradigm leads to straightforward uncertainty quantification for causal quantities and can ensure data-driven model parsimony by exploiting regularizing priors.
José R. Zubizarreta (Harvard University) will propose a robust weighting approach for estimation in studies that leverage changes in policies or programs over time and across exposed and unexposed locations (known as event studies). The approach allows investigators to progressively build larger valid weighted contrasts by invoking, in a sequential manner, increasingly strong assumptions on the potential outcomes and the assignment mechanism. It is adaptable to a generally defined estimand and allows for generalization. Zubizarreta will provide weighting diagnostics and visualization tools and illustrate these methods in a case study of the impact of divorce reforms on female suicide.
Elizabeth Stuart (Johns Hopkins University) will present analyses of six multi-site randomized controlled trials (RCTs), testing the ability of modern prediction methods to estimate site-specific effects from a wide range of moderator variables. While all methods yielded accurate impact predictions when the variation in impacts across sites was close to zero, accuracy diminished when that variation was substantial. Bayesian Additive Regression Trees (BART) typically produced “less inaccurate” predictions than lasso regression or the sample average treatment effect. This work raises caution about how well trials, as currently implemented, can inform policymaking in individual sites, and it motivates future work on how best to do so.
Elizabeth Eisenhauer (Westat) will discuss bias in RCTs for impact evaluation of social programs. Allowing some nonrandom assignments for ethical reasons can create nonignorable missing data, biasing impact estimates. Drawing on survey sampling methods, Eisenhauer will demonstrate adjustments that mitigate these biases and improve the validity of impact evaluations.