Siyu Heng
Assistant Professor of Biostatistics
-
Professional overview
-
Siyu Heng, PhD, is an Assistant Professor in the Department of Biostatistics with interests in both methodological and applied research. His areas of expertise include causal inference, health data science, observational studies, randomized trials, sensitivity analysis, instrumental variables, measurement error, and survey data, along with their applications in public health.
Dr. Heng’s research has been published in the Journal of the Royal Statistical Society and in Physical Review, among others. He has been recognized with several awards, including the IPUMS Global Health Research Award for the Best Student Paper; the Lawrence D. Brown Best Paper Award; the ASA Mental Health Statistics Section Student Paper Award; the ENAR Distinguished Student Paper Award; the NESS Student Research Award; and the Wellcome Trust Data Reuse Prize.
Dr. Heng received his PhD in applied mathematics and computational science from the University of Pennsylvania, and his BA in statistics from Nanjing University.
-
Education
-
PhD Candidate, Applied Mathematics and Computational Science (Statistics Track), University of Pennsylvania
BS, Mathematics, Nanjing University
-
Honors and awards
-
IPUMS Global Health Research Award for the Best Student Paper, Integrated Public Use Microdata Series (2021)
IMS Hannan Graduate Student Travel Award, Institute of Mathematical Statistics (2021)
ASA Mental Health Statistics Section Student Paper Award, American Statistical Association Section on Mental Health Statistics (2021)
ENAR Distinguished Student Paper Award, International Biometric Society Eastern North American Region (2021)
Wellcome Trust Data Reuse Prize: Malaria, Wellcome Trust (2019)
Benjamin Franklin Fellowship, University of Pennsylvania School of Arts and Sciences (2016, 2017, 2018)
-
Areas of research and study
-
Causal Inference
Epidemiology
Global Health
Health Equity
Instrumental Variables
Observational Studies
Public Health Policy
Randomized Experimentation
Social Sciences
-
Publications
Maximizing the reach of universal child sexual abuse prevention: Protocol for an equivalence trial
Guastaferro, K., Melchior, M. S., Heng, S., Trudeau, J., & Holloway, J. L. (n.d.).
Publication year: 2024
Journal title: Contemporary Clinical Trials Communications
Volume: 41
Abstract: Background: Child sexual abuse (CSA) affects 1 in 5 girls and 1 in 12 boys before age 18. Universal school-based prevention programs are an effective and cost-efficient method of teaching students an array of personal safety skills. However, the programmatic reach of universal school-based programs is limited by the inherent reliance on the school infrastructure and a dearth of available alternative delivery modalities. Methods: The design for this study will use a rigorous cluster randomized design (N = 180 classrooms) to determine the equivalence of two delivery modalities of Safe Touches: as usual vs. modified. The as usual workshop will be delivered by two facilitators with live puppet skits (n = 90). Whereas, the modified workshop will be delivered by one facilitator using prerecorded skit videos (n = 90). We will determine the equivalence by measuring concept learning acquisition preworkshop to immediate postworkshop (Aim 1) and retention at 3-months postworkshop (Aim 2) among students in classrooms that receive the as usual or modified workshops. To conclude equivalence, it is imperative to also examine factors that may impact future dissemination and implementation, specifically program adoption among school personnel and implementation fidelity between the two modalities (Aim 3). Conclusion: Study findings will inform the ongoing development of effective CSA prevention programs and policy decisions regarding the sustainable integration of such programs within schools. Clinical trial registration: NCT06195852.
Instrumental variables: to strengthen or not to strengthen?
Heng, S., Zhang, B., Han, X., Lorch, S. A., & Small, D. S. (n.d.).
Publication year: 2023
Journal title: Journal of the Royal Statistical Society. Series A: Statistics in Society
Volume: 186
Issue: 4
Page(s): 852-873
Abstract: Instrumental variables (IVs) are extensively used to handle unmeasured confounding. However, weak IVs may cause problems. Many matched studies have considered strengthening an IV through discarding some of the sample. It is widely accepted that strengthening an IV tends to increase the power of non-parametric tests and sensitivity analyses. We re-evaluate this conventional wisdom and offer new insights. First, we evaluate the trade-off between IV strength and sample size assuming a valid IV and exhibit conditions under which strengthening an IV increases power. Second, we derive a criterion for checking the validity of a sensitivity analysis model with a continuous dose and show that the widely used Γ sensitivity analysis model, which was used to argue that strengthening an IV increases the power of sensitivity analyses in large samples, does not work for continuous IVs. Third, we quantify the bias of the Wald estimator with a possibly invalid IV and leverage it to develop a valid sensitivity analysis framework and show that strengthening an IV may or may not increase the power of sensitivity analyses. We use our framework to study the effect on premature babies of being delivered in a high technology/high volume neonatal intensive care unit.
Social distancing and COVID-19: Randomization inference for a structured dose-response relationship
Zhang, B., Heng, S., Ye, T., & Small, D. S. (n.d.).
Publication year: 2023
Journal title: Annals of Applied Statistics
Volume: 17
Issue: 1
Page(s): 23-46
Abstract: Social distancing is widely acknowledged as an effective public health policy combating the novel coronavirus. But extreme forms of social distancing, like isolation and quarantine, have costs, and it is not clear how much social distancing is needed to achieve public health effects. In this article we develop a design-based framework to test the causal null hypothesis and make inference about the dose-response relationship between reduction in social mobility and COVID-19 related public health outcomes. We first discuss how to embed observational data with a time-independent, continuous treatment dose into an approximate randomized experiment and develop a randomization-based procedure that tests if a structured dose-response relationship fits the data. We then generalize the design and testing procedure to a longitudinal setting and apply them to investigate the effect of social distancing during the first phased reopening in the United States on public health outcomes using data compiled from Unacast™, the United States Census Bureau, and the County Health Rankings and Roadmaps Program. We rejected a primary analysis null hypothesis that stated the social distancing from April 27, 2020 to June 28, 2020, had no effect on the COVID-19-related death toll from June 29, 2020 to August 2, 2020 (p-value < 0.001), and found that it took more reduction in mobility to prevent exponential growth in case numbers for nonrural counties compared to rural counties.
Testing Biased Randomization Assumptions and Quantifying Imperfect Matching and Residual Confounding in Matched Observational Studies
Chen, K., Heng, S., Long, Q., & Zhang, B. (n.d.).
Publication year: 2023
Journal title: Journal of Computational and Graphical Statistics
Volume: 32
Issue: 2
Page(s): 528-538
Abstract: One central goal of design of observational studies is to embed nonexperimental data into an approximate randomized controlled trial using statistical matching. Despite empirical researchers’ best intention and effort to create high-quality matched samples, residual imbalance due to observed covariates not being well matched often persists. Although statistical tests have been developed to test the randomization assumption and its implications, few provide a means to quantify the level of residual confounding due to observed covariates not being well matched in matched samples. In this article, we develop two generic classes of exact statistical tests for a biased randomization assumption. One important by-product of our testing framework is a quantity called residual sensitivity value (RSV), which provides a means to quantify the level of residual confounding due to imperfect matching of observed covariates in a matched sample. We advocate taking into account RSV in the downstream primary analysis. The proposed methodology is illustrated by re-examining a famous observational study concerning the effect of right heart catheterization (RHC) in the initial care of critically ill patients. Code implementing the method can be found in the supplementary materials.
The Central Role of the Propensity Score in Sensitivity Analysis for Matched Observational Studies
Heng, S. (n.d.).
Publication year: 2023
Journal title: Observational Studies
Volume: 9
Issue: 1
Page(s): 35-41
Abstract: The propensity score, which was originally introduced in Rosenbaum and Rubin (1983), has been widely considered one of the most important concepts in the causal inference literature. This article briefly reviews some propensity score models involving both observed and unobserved covariates and discusses their applications in sensitivity analysis for matched observational studies.
Bridging preference-based instrumental variable studies and cluster-randomized encouragement experiments: Study design, noncompliance, and average cluster effect ratio
Zhang, B., Heng, S., MacKay, E. J., & Ye, T. (n.d.).
Publication year: 2022
Journal title: Biometrics
Volume: 78
Issue: 4
Page(s): 1639-1650
Abstract: Instrumental variable (IV) methods are widely used in medical research to draw causal conclusions when the treatment and outcome are confounded by unmeasured confounding variables. One important feature of such studies is that the IV is often applied at the cluster level, for example, hospitals' or physicians' preference for a certain treatment where each hospital or physician naturally defines a cluster. This paper proposes to embed such observational IV data into a cluster-randomized encouragement experiment using nonbipartite matching. Potential outcomes and causal assumptions underpinning the design are formalized and examined. Testing procedures for two commonly used estimands, Fisher's sharp null hypothesis and the pooled effect ratio (PER), are extended to the current setting. We then introduce a novel cluster-heterogeneous proportional treatment effect model and the relevant estimand: the average cluster effect ratio. This new estimand is advantageous over the structural parameter in a constant proportional treatment effect model in that it allows treatment heterogeneity, and is advantageous over the PER estimand in that it does not suffer from Simpson's paradox. We develop an asymptotically valid randomization-based testing procedure for this new estimand based on solving a mixed-integer quadratically constrained optimization problem. The proposed design and inferential methods are applied to a study of the effect of using transesophageal echocardiography during coronary artery bypass graft surgery on patients' 30-day mortality rate. R package ivdesign implements the proposed method.
Hemoglobin Levels among Male Agricultural Workers: Analyses from the Demographic and Health Surveys to Investigate a Marker for Chronic Kidney Disease of Uncertain Etiology
Lin, Y., Heng, S., Anand, S., Deshpande, S. K., & Small, D. S. (n.d.).
Publication year: 2022
Journal title: Journal of Occupational and Environmental Medicine
Volume: 64
Issue: 12
Page(s): E805-E810
Abstract: Objective: Estimate agricultural work's effect on hemoglobin (Hgb) level in men. A negative effect may indicate presence of chronic kidney disease of uncertain etiology. Methods: We use Demographic and Health Surveys data from seven African and Asian countries and use matching to control for seven confounders. Results: On average, Hgb levels were 0.09 g/dL lower among agricultural workers compared with matched controls. Significant effects were observed in Ethiopia, India, Lesotho, and Senegal, with effects from 0.07 to 0.30 g/dL lower Hgb level among agricultural workers. The findings were robust to multiple control groups and a modest amount of unmeasured confounding. Conclusions: Men engaged in agricultural work in four of the seven countries studied have modestly lower Hgb levels. Our data support integrating kidney function assessments within Demographic and Health Surveys and other population-based surveys.
Association between Transesophageal Echocardiography and Clinical Outcomes after Coronary Artery Bypass Graft Surgery
MacKay, E. J., Zhang, B., Heng, S., Ye, T., Neuman, M. D., Augoustides, J. G., Feinman, J. W., Desai, N. D., & Groeneveld, P. W. (n.d.).
Publication year: 2021
Journal title: Journal of the American Society of Echocardiography
Volume: 34
Issue: 6
Page(s): 571-581
Abstract: Background: Coronary artery bypass graft (CABG) surgery is the most widely performed cardiac surgery in the United States. Transesophageal echocardiography (TEE) is frequently used in a variety of cardiac surgical procedures, but its clinical benefit in isolated CABG surgery is unclear, and guidelines remain indeterminate. The aim of this study was to compare clinical outcomes among patients undergoing isolated CABG surgery with versus without TEE in order to test the hypothesis that TEE would be associated with improved clinical outcomes after CABG surgery. Methods: A matched retrospective cohort study was conducted among Medicare beneficiaries undergoing isolated CABG surgery with versus without intraoperative monitoring using TEE in the United States. The primary analysis was a near/far instrumental variable match that paired hospitals with similar characteristics and patient populations but with opposing probabilities for using TEE in CABG surgery. Outcomes included 30-day mortality, a composite outcome of stroke or 30-day mortality, length of hospitalization, and incidence of esophageal perforation. Results: Of 114,871 patients undergoing isolated CABG surgery, 65,471 (57%) underwent TEE and 49,400 (43%) did not. Hospital-level instrumental variable matched analysis demonstrated that among the subset of 968 matched hospitals, TEE receipt was associated with lower 30-day mortality (3.7% vs 4.9%, P < .001), a lower incidence of the composite outcome of stroke or 30-day mortality (4.5% vs 5.6%, P < .001), no difference in length of hospitalization (10.32 vs 10.52 days, P = .26), and no difference in the incidence of esophageal perforation (0.01% vs 0.01%, P = .63). These results were replicated in surgeon-level and patient-level matched-pair instrumental variable analyses, and all analyses were robust to sensitivity analyses that tested for biases introduced by unmeasured confounding. Conclusions: The findings from this study suggest that TEE may offer a clinical benefit to cardiac surgical patients undergoing isolated CABG surgery.
Increasing power for observational studies of aberrant response: An adaptive approach
Heng, S., Kang, H., Small, D. S., & Fogarty, C. B. (n.d.).
Publication year: 2021
Journal title: Journal of the Royal Statistical Society. Series B: Statistical Methodology
Volume: 83
Issue: 3
Page(s): 482-504
Abstract: In many observational studies, the interest is in the effect of treatment on bad, aberrant outcomes rather than the average outcome. For such settings, the traditional approach is to define a dichotomous outcome indicating aberration from a continuous score and use the Mantel–Haenszel test with matched data. For example, studies of determinants of poor child growth use the World Health Organization’s definition of child stunting being height-for-age z-score ≤ −2. The traditional approach may lose power because it discards potentially useful information about the severity of aberration. We develop an adaptive approach that makes use of this information and asymptotically dominates the traditional approach. We develop our approach in two parts. First, we develop an aberrant rank approach in matched observational studies and prove a novel design sensitivity formula enabling its asymptotic comparison with the Mantel–Haenszel test under various settings. Second, we develop a new, general adaptive approach, the two-stage programming method, and use it to adaptively combine the aberrant rank test and the Mantel–Haenszel test. We apply our approach to a study of the effect of teenage pregnancy on stunting.
Relationship between changing malaria burden and low birth weight in sub-Saharan Africa: A difference-in-differences study via a pair-of-pairs approach
Heng, S., O’Meara, W. P., Simmons, R. A., & Small, D. S. (n.d.).
Publication year: 2021
Journal title: eLife
Volume: 10
Abstract: Background: According to the World Health Organization (WHO), in 2018, an estimated 228 million malaria cases occurred worldwide with most cases occurring in sub-Saharan Africa. Scale-up of vector control tools coupled with increased access to diagnosis and effective treatment has resulted in a large decline in malaria prevalence in some areas, but other areas have seen little change. Although interventional studies demonstrate that preventing malaria during pregnancy can reduce the rate of low birth weight (i.e. child’s birth weight <2500 g), it remains unknown whether natural changes in parasite transmission and malaria burden can improve birth outcomes. Methods: We conducted an observational study of the effect of changing malaria burden on low birth weight using data from 18,112 births in 19 countries in sub-Saharan Africa during the years 2000–2015. Specifically, we conducted a difference-in-differences study via a pair-of-pairs matching approach using the fact that some sub-Saharan areas experienced sharp drops in malaria prevalence and some experienced little change. Results: A malaria prevalence decline from a high rate (Plasmodium falciparum parasite rate in children aged 2-up-to-10 (i.e. PfPR2-10) > 0.4) to a low rate (PfPR2-10 < 0.2) is estimated to reduce the rate of low birth weight by 1.48 percentage points (95% confidence interval: 3.70 percentage points reduction, 0.74 percentage points increase), which is a 17% reduction in the low birth weight rate compared to the average (8.6%) in our study population with observed birth weight records (1.48/8.6 ≈ 17%). When focusing on first pregnancies, a decline in malaria prevalence from high to low is estimated to have a greater impact on the low birth weight rate than for all births: 3.73 percentage points (95% confidence interval: 9.11 percentage points reduction, 1.64 percentage points increase). Conclusions: Although the confidence intervals cannot rule out the possibility of no effect at the 95% confidence level, the concurrence between our primary analysis, secondary analyses, and sensitivity analyses, and the magnitude of the effect size, contribute to the weight of the evidence suggesting that declining malaria burden can potentially substantially reduce the low birth weight rate at the community level in sub-Saharan Africa, particularly among firstborns. The novel statistical methodology developed in this article, a pair-of-pairs approach to a difference-in-differences study, could be useful for many settings in which different units are observed at different times.
Sharpening the Rosenbaum sensitivity bounds to address concerns about interactions between observed and unobserved covariates
Heng, S., & Small, D. S. (n.d.).
Publication year: 2021
Journal title: Statistica Sinica
Volume: 31
Page(s): 2331-2353
Abstract: In observational studies, it is typically unrealistic to assume that treatments are assigned randomly, even conditional on adjusting for all observed covariates. Therefore, a sensitivity analysis is often needed to examine how hidden biases due to unobserved covariates affect inferences on treatment effects. In matched observational studies, where each treated unit is matched to one or multiple untreated controls for observed covariates, the Rosenbaum bounds sensitivity analysis is one of the most popular sensitivity analysis models. We show that in the presence of interactions between observed and unobserved covariates, directly applying the Rosenbaum bounds almost inevitably exaggerates the report of sensitivity of causal conclusions to hidden bias. We give sharper odds ratio bounds to fix this deficiency. We illustrate our new method by studying the effect of an anger/hostility tendency on the risk of experiencing heart problems.
Finding the strength in a weak instrument in a study of cognitive outcomes produced by Catholic high schools
Heng, S., Small, D. S., & Rosenbaum, P. R. (n.d.).
Publication year: 2020
Journal title: Journal of the Royal Statistical Society. Series A: Statistics in Society
Volume: 183
Issue: 3
Page(s): 935-958
Abstract: We show that the strength of an instrument is incompletely characterized by the proportion of compliers, and we propose and evaluate new methods that extract more information from certain settings with comparatively few compliers. Specifically, we demonstrate that, for a fixed small proportion of compliers, the presence of an equal number of always-takers and never-takers weakens an instrument, whereas the absence of always-takers or, equivalently, the absence of never-takers strengthens an instrument. In this statement, the strength of an instrument refers to its ability to recognize and reject a false hypothesis about a structural parameter. Equivalently, the strength of an instrument refers to its ability to exclude from a confidence interval a false value of a structural parameter. This ability is measured by the Bahadur efficiency of a test that assumes that the instrument is flawless, or the Bahadur efficiency of a sensitivity analysis that assumes that the instrument may be somewhat biased. When there are few compliers, the outcomes for most people are unaffected by fluctuations in the instrument, so most of the information about the treatment effect is contained in the tail of the distribution of the outcomes. Exploiting this fact, we propose new methods that emphasize the affected portion of the distribution of outcomes, thereby extracting more information from studies with few compliers. Studies of the effects of Catholic high schools on academic test performance have used ‘being Catholic’ as an instrument for ‘attending a Catholic high school’, and the application concerns such a comparison using the US National Educational Longitudinal Study. Most Catholics did not attend Catholic school, so there are few compliers, but it was rare for non-Catholics to attend Catholic school, so there are very few always-takers.
Can phoretic particles swim in two dimensions?
Sondak, D., Hawley, C., Heng, S., Vinsonhaler, R., Lauga, E., & Thiffeault, J. L. (n.d.).
Publication year: 2016
Journal title: Physical Review E
Volume: 94
Issue: 6
Abstract: Artificial phoretic particles swim using self-generated gradients in chemical species (self-diffusiophoresis) or charges and currents (self-electrophoresis). These particles can be used to study the physics of collective motion in active matter and might have promising applications in bioengineering. In the case of self-diffusiophoresis, the classical physical model relies on a steady solution of the diffusion equation, from which chemical gradients, phoretic flows, and ultimately the swimming velocity may be derived. Motivated by disk-shaped particles in thin films and under confinement, we examine the extension to two dimensions. Because the two-dimensional diffusion equation lacks a steady state with the correct boundary conditions, Laplace transforms must be used to study the long-time behavior of the problem and determine the swimming velocity. For fixed chemical fluxes on the particle surface, we find that the swimming velocity ultimately always decays logarithmically in time. In the case of finite Péclet numbers, we solve the full advection-diffusion equation numerically and show that this decay can be avoided by the particle moving to regions of unconsumed reactant. Finite advection thus regularizes the two-dimensional phoretic problem.