Biostatistics Seminar Series: A Breakthrough in Addressing Sparsity in Causal Graphical Models

April 11
11am-12pm
708 Broadway, Room 1001 / Online

This event is part of the Biostatistics Seminar Series hosted by the GPH Department of Biostatistics

Causal graphical models (e.g., DAGs) are used in many scientific domains to represent important causal assumptions about the processes that underlie collected data. The focus of this work is on graphical causal discovery, i.e., the data-driven model selection of graphs, for the “downstream” purpose of using the estimated graphs in subsequent causal inference tasks, such as establishing the identifying formula for some causal effect of interest and then estimating it. An obstacle to having confidence in existing causal discovery algorithms in public health applications is that these algorithms tend to estimate structures that are overly sparse, i.e., missing too many edges. Statistical “caution” (or “conservatism”), however, would err on the side of denser graphs rather than sparser ones.

In this talk, Dr. Daniel Malinsky proposes to reformulate the conditional independence hypothesis tests of classical constraint-based algorithms as equivalence tests: test the null hypothesis that the association is greater than some (user-chosen, sample-size-dependent) threshold, rather than the null of no association. He argues that this addresses several important statistical issues in applied causal model selection and leads to procedures with desirable behaviors and properties. Time permitting, he will also discuss recent work on a related issue: the problem of valid “post-selection” inference, i.e., constructing valid confidence intervals for causal effects that account for the model selection process.
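
For readers unfamiliar with equivalence testing, the contrast between the two null hypotheses can be illustrated with a minimal Python sketch. This is not the speaker's procedure: it assumes a Gaussian setting in which association is measured by a partial correlation, uses the Fisher z-transform with two one-sided tests, and treats the threshold `delta` as a fixed illustrative value. All function and parameter names are hypothetical.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, used for all p-values


def classical_p_value(r, n, k):
    """Two-sided p-value for H0: partial correlation = 0 (Fisher z-test).

    r: sample partial correlation, n: sample size,
    k: number of variables in the conditioning set.
    """
    scale = math.sqrt(n - k - 3)
    stat = abs(math.atanh(r)) * scale
    return 2.0 * (1.0 - _STD_NORMAL.cdf(stat))


def equivalence_p_value(r, n, k, delta):
    """p-value for H0: |partial correlation| >= delta (two one-sided tests).

    delta is an assumed, user-chosen threshold with 0 < delta < 1.
    """
    scale = math.sqrt(n - k - 3)
    z, z_delta = math.atanh(r), math.atanh(delta)
    p_upper = _STD_NORMAL.cdf((z - z_delta) * scale)        # H0: rho >= delta
    p_lower = 1.0 - _STD_NORMAL.cdf((z + z_delta) * scale)  # H0: rho <= -delta
    return max(p_upper, p_lower)


def remove_edge_classical(r, n, k, alpha=0.05):
    # Classical rule: drop the edge when independence is NOT rejected.
    return classical_p_value(r, n, k) > alpha


def remove_edge_equivalence(r, n, k, delta, alpha=0.05):
    # Equivalence rule: drop the edge only when "association >= delta" IS rejected.
    return equivalence_p_value(r, n, k, delta) < alpha
```

Under these assumptions, a weak sample partial correlation of 0.10 with n = 50, one conditioning variable, and `delta = 0.2` leads the classical rule to remove the edge (independence is not rejected), while the equivalence rule keeps it, because the data are too few to rule out an association of at least 0.2. With the burden of proof reversed in this way, limited data yield denser rather than sparser graphs.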

About the Speaker: 

Dr. Daniel Malinsky's methodological research focuses mostly on causal inference: developing statistical methods and machine learning tools to support inference about treatment effects, interventions, and policies. Current research topics include graphical structure learning (a.k.a. causal discovery or causal model selection), semiparametric inference, time series analysis, and missing data. Application areas of particular interest include environmental determinants of health and health disparities. Dr. Malinsky also studies algorithmic fairness: understanding and counteracting the biases introduced by data science tools deployed in socially impactful settings. Finally, Dr. Malinsky has interests in the philosophy of science and the foundations of statistics. Before joining the Mailman School of Public Health, Dr. Malinsky was a postdoctoral fellow at Johns Hopkins University.