Siyu Heng

Assistant Professor of Biostatistics

Professional overview

Siyu Heng, PhD, is an Assistant Professor in the Department of Biostatistics, with interests in both methodological and applied research. His areas of expertise include causal inference, health data science, observational studies, randomized trials, sensitivity analysis, instrumental variables, measurement error, and survey data, along with their applications in public health.

Dr. Heng’s research has been published in the Journal of the Royal Statistical Society and in Physical Review, among other journals. He has been recognized with several awards, including the IPUMS Global Health Research Award for the Best Student Paper; the Lawrence D. Brown Best Paper Award; the ASA Mental Health Statistics Section Student Paper Award; the ENAR Distinguished Student Paper Award; the NESS Student Research Award; and the Wellcome Trust Data Reuse Prize.

Dr. Heng received his PhD in applied mathematics and computational science from the University of Pennsylvania, and his BA in statistics from Nanjing University.

Education

PhD Candidate, Applied Mathematics and Computational Science (Statistics Track), University of Pennsylvania
BS, Mathematics, Nanjing University

Honors and awards

IPUMS Global Health Research Award for the Best Student Paper, Integrated Public Use Microdata Series (2021)
IMS Hannan Graduate Student Travel Award, Institute of Mathematical Statistics (2021)
ASA Mental Health Statistics Section Student Paper Award, American Statistical Association Section on Mental Health Statistics (2021)
ENAR Distinguished Student Paper Award, International Biometric Society Eastern North American Region (2021)
Wellcome Trust Data Reuse Prize: Malaria, Wellcome Trust (2019)
Benjamin Franklin Fellowship, University of Pennsylvania School of Arts and Sciences (2016, 2017, 2018)

Areas of research and study

Causal Inference
Epidemiology
Global Health
Health Equity
Instrumental Variables
Observational Studies
Public Health Policy
Randomized Experimentation
Social Sciences

Publications

Bridging preference-based instrumental variable studies and cluster-randomized encouragement experiments: Study design, noncompliance, and average cluster effect ratio

Zhang, B., Heng, S., MacKay, E. J., & Ye, T. (n.d.).

Publication year

2022

Journal title

Biometrics

Volume

78

Issue

4

Page(s)

1639-1650
Abstract
Instrumental variable (IV) methods are widely used in medical research to draw causal conclusions when the treatment and outcome are confounded by unmeasured confounding variables. One important feature of such studies is that the IV is often applied at the cluster level, for example, hospitals' or physicians' preference for a certain treatment where each hospital or physician naturally defines a cluster. This paper proposes to embed such observational IV data into a cluster-randomized encouragement experiment using nonbipartite matching. Potential outcomes and causal assumptions underpinning the design are formalized and examined. Testing procedures for two commonly used estimands, Fisher's sharp null hypothesis and the pooled effect ratio (PER), are extended to the current setting. We then introduce a novel cluster-heterogeneous proportional treatment effect model and the relevant estimand: the average cluster effect ratio. This new estimand is advantageous over the structural parameter in a constant proportional treatment effect model in that it allows treatment heterogeneity, and is advantageous over the PER estimand in that it does not suffer from Simpson's paradox. We develop an asymptotically valid randomization-based testing procedure for this new estimand based on solving a mixed-integer quadratically constrained optimization problem. The proposed design and inferential methods are applied to a study of the effect of using transesophageal echocardiography during coronary artery bypass graft surgery on patients' 30-day mortality rate. R package ivdesign implements the proposed method.

Hemoglobin Levels among Male Agricultural Workers: Analyses from the Demographic and Health Surveys to Investigate a Marker for Chronic Kidney Disease of Uncertain Etiology

Lin, Y., Heng, S., Anand, S., Deshpande, S. K., & Small, D. S. (n.d.).

Publication year

2022

Journal title

Journal of Occupational and Environmental Medicine

Volume

64

Issue

12

Page(s)

E805-E810
Abstract
Objective: Estimate agricultural work's effect on hemoglobin (Hgb) level in men. A negative effect may indicate presence of chronic kidney disease of uncertain etiology. Methods: We use Demographic and Health Surveys data from seven African and Asian countries and use matching to control for seven confounders. Results: On average, Hgb levels were 0.09 g/dL lower among agricultural workers compared with matched controls. Significant effects were observed in Ethiopia, India, Lesotho, and Senegal, with effects from 0.07 to 0.30 g/dL lower Hgb level among agricultural workers. The findings were robust to multiple control groups and a modest amount of unmeasured confounding. Conclusions: Men engaged in agricultural work in four of the seven countries studied have modestly lower Hgb levels. Our data support integrating kidney function assessments within Demographic and Health Surveys and other population-based surveys.

Testing Biased Randomization Assumptions and Quantifying Imperfect Matching and Residual Confounding in Matched Observational Studies

Chen, K., Heng, S., Long, Q., & Zhang, B. (n.d.).

Publication year

2022

Journal title

Journal of Computational and Graphical Statistics
Abstract
One central goal of design of observational studies is to embed nonexperimental data into an approximate randomized controlled trial using statistical matching. Despite empirical researchers’ best intention and effort to create high-quality matched samples, residual imbalance due to observed covariates not being well matched often persists. Although statistical tests have been developed to test the randomization assumption and its implications, few provide a means to quantify the level of residual confounding due to observed covariates not being well matched in matched samples. In this article, we develop two generic classes of exact statistical tests for a biased randomization assumption. One important by-product of our testing framework is a quantity called residual sensitivity value (RSV), which provides a means to quantify the level of residual confounding due to imperfect matching of observed covariates in a matched sample. We advocate taking into account RSV in the downstream primary analysis. The proposed methodology is illustrated by re-examining a famous observational study concerning the effect of right heart catheterization (RHC) in the initial care of critically ill patients. Code implementing the method can be found in the supplementary materials.

Association between Transesophageal Echocardiography and Clinical Outcomes after Coronary Artery Bypass Graft Surgery

MacKay, E. J., Zhang, B., Heng, S., Ye, T., Neuman, M. D., Augoustides, J. G., Feinman, J. W., Desai, N. D., & Groeneveld, P. W. (n.d.).

Publication year

2021

Journal title

Journal of the American Society of Echocardiography

Volume

34

Issue

6

Page(s)

571-581
Abstract
Background: Coronary artery bypass graft (CABG) surgery is the most widely performed cardiac surgery in the United States. Transesophageal echocardiography (TEE) is frequently used in a variety of cardiac surgical procedures, but its clinical benefit in isolated CABG surgery is unclear, and guidelines remain indeterminate. The aim of this study was to compare clinical outcomes among patients undergoing isolated CABG surgery with versus without TEE in order to test the hypothesis that TEE would be associated with improved clinical outcomes after CABG surgery. Methods: A matched retrospective cohort study was conducted among Medicare beneficiaries undergoing isolated CABG surgery with versus without intraoperative monitoring using TEE in the United States. The primary analysis was a near/far instrumental variable match that paired hospitals with similar characteristics and patient populations but with opposing probabilities for using TEE in CABG surgery. Outcomes included 30-day mortality, a composite outcome of stroke or 30-day mortality, length of hospitalization, and incidence of esophageal perforation. Results: Of 114,871 patients undergoing isolated CABG surgery, 65,471 (57%) underwent TEE and 49,400 (43%) did not. Hospital-level instrumental variable matched analysis demonstrated that among the subset of 968 matched hospitals, TEE receipt was associated with lower 30-day mortality (3.7% vs 4.9%, P <.001), a lower incidence of the composite outcome of stroke or 30-day mortality (4.5% vs 5.6%, P <.001), no difference in length of hospitalization (10.32 vs 10.52 days, P =.26), and no difference in the incidence of esophageal perforation (0.01% vs 0.01%, P =.63). These results were replicated in surgeon-level and patient-level matched-pair instrumental variable analyses, and all analyses were robust to sensitivity analyses that tested for biases introduced by unmeasured confounding. 
Conclusions: The findings from this study suggest that TEE may offer a clinical benefit to cardiac surgical patients undergoing isolated CABG surgery.

Increasing power for observational studies of aberrant response: An adaptive approach

Heng, S., Kang, H., Small, D. S., & Fogarty, C. B. (n.d.).

Publication year

2021

Journal title

Journal of the Royal Statistical Society. Series B: Statistical Methodology

Volume

83

Issue

3

Page(s)

482-504
Abstract
In many observational studies, the interest is in the effect of treatment on bad, aberrant outcomes rather than the average outcome. For such settings, the traditional approach is to define a dichotomous outcome indicating aberration from a continuous score and use the Mantel–Haenszel test with matched data. For example, studies of determinants of poor child growth use the World Health Organization’s definition of child stunting being height-for-age z-score ≤ −2. The traditional approach may lose power because it discards potentially useful information about the severity of aberration. We develop an adaptive approach that makes use of this information and asymptotically dominates the traditional approach. We develop our approach in two parts. First, we develop an aberrant rank approach in matched observational studies and prove a novel design sensitivity formula enabling its asymptotic comparison with the Mantel–Haenszel test under various settings. Second, we develop a new, general adaptive approach, the two-stage programming method, and use it to adaptively combine the aberrant rank test and the Mantel–Haenszel test. We apply our approach to a study of the effect of teenage pregnancy on stunting.

Relationship between changing malaria burden and low birth weight in sub-Saharan Africa: A difference-in-differences study via a pair-of-pairs approach

Heng, S., O’Meara, W. P., Simmons, R. A., & Small, D. S. (n.d.).

Publication year

2021

Journal title

eLife

Volume

10
Abstract
Background: According to the World Health Organization (WHO), in 2018, an estimated 228 million malaria cases occurred worldwide with most cases occurring in sub-Saharan Africa. Scale-up of vector control tools coupled with increased access to diagnosis and effective treatment has resulted in a large decline in malaria prevalence in some areas, but other areas have seen little change. Although interventional studies demonstrate that preventing malaria during pregnancy can reduce the rate of low birth weight (i.e. child’s birth weight <2500 g), it remains unknown whether natural changes in parasite transmission and malaria burden can improve birth outcomes. Methods: We conducted an observational study of the effect of changing malaria burden on low birth weight using data from 18,112 births in 19 sub-Saharan African countries during the years 2000–2015. Specifically, we conducted a difference-in-differences study via a pair-of-pairs matching approach using the fact that some sub-Saharan areas experienced sharp drops in malaria prevalence and some experienced little change. Results: A malaria prevalence decline from a high rate (Plasmodium falciparum parasite rate in children aged 2-up-to-10 (i.e. PfPR2-10) > 0.4) to a low rate (PfPR2-10 < 0.2) is estimated to reduce the rate of low birth weight by 1.48 percentage points (95% confidence interval: 3.70 percentage points reduction, 0.74 percentage points increase), which is a 17% reduction in the low birth weight rate compared to the average (8.6%) in our study population with observed birth weight records (1.48/8.6 ≈ 17%). When focusing on first pregnancies, a decline in malaria prevalence from high to low is estimated to have a greater impact on the low birth weight rate than for all births: 3.73 percentage points (95% confidence interval: 9.11 percentage points reduction, 1.64 percentage points increase).
Conclusions: Although the confidence intervals cannot rule out the possibility of no effect at the 95% confidence level, the concurrence between our primary analysis, secondary analyses, and sensitivity analyses, and the magnitude of the effect size, contribute to the weight of the evidence suggesting that declining malaria burden can potentially substantially reduce the low birth weight rate at the community level in sub-Saharan Africa, particularly among firstborns. The novel statistical methodology developed in this article–a pair-of-pairs approach to a difference-in-differences study–could be useful for many settings in which different units are observed at different times.

Finding the strength in a weak instrument in a study of cognitive outcomes produced by Catholic high schools

Heng, S., Small, D. S., & Rosenbaum, P. R. (n.d.).

Publication year

2020

Journal title

Journal of the Royal Statistical Society. Series A: Statistics in Society

Volume

183

Issue

3

Page(s)

935-958
Abstract
We show that the strength of an instrument is incompletely characterized by the proportion of compliers, and we propose and evaluate new methods that extract more information from certain settings with comparatively few compliers. Specifically, we demonstrate that, for a fixed small proportion of compliers, the presence of an equal number of always-takers and never-takers weakens an instrument, whereas the absence of always-takers or, equivalently, the absence of never-takers strengthens an instrument. In this statement, the strength of an instrument refers to its ability to recognize and reject a false hypothesis about a structural parameter. Equivalently, the strength of an instrument refers to its ability to exclude from a confidence interval a false value of a structural parameter. This ability is measured by the Bahadur efficiency of a test that assumes that the instrument is flawless, or the Bahadur efficiency of a sensitivity analysis that assumes that the instrument may be somewhat biased. When there are few compliers, the outcomes for most people are unaffected by fluctuations in the instrument, so most of the information about the treatment effect is contained in the tail of the distribution of the outcomes. Exploiting this fact, we propose new methods that emphasize the affected portion of the distribution of outcomes, thereby extracting more information from studies with few compliers. Studies of the effects of Catholic high schools on academic test performance have used ‘being Catholic’ as an instrument for ‘attending a Catholic high school’, and the application concerns such a comparison using the US National Educational Longitudinal Study. Most Catholics did not attend Catholic school, so there are few compliers, but it was rare for non-Catholics to attend Catholic school, so there are very few always-takers.

Can phoretic particles swim in two dimensions?

Sondak, D., Hawley, C., Heng, S., Vinsonhaler, R., Lauga, E., & Thiffeault, J. L. (n.d.).

Publication year

2016

Journal title

Physical Review E

Volume

94

Issue

6
Abstract
Artificial phoretic particles swim using self-generated gradients in chemical species (self-diffusiophoresis) or charges and currents (self-electrophoresis). These particles can be used to study the physics of collective motion in active matter and might have promising applications in bioengineering. In the case of self-diffusiophoresis, the classical physical model relies on a steady solution of the diffusion equation, from which chemical gradients, phoretic flows, and ultimately the swimming velocity may be derived. Motivated by disk-shaped particles in thin films and under confinement, we examine the extension to two dimensions. Because the two-dimensional diffusion equation lacks a steady state with the correct boundary conditions, Laplace transforms must be used to study the long-time behavior of the problem and determine the swimming velocity. For fixed chemical fluxes on the particle surface, we find that the swimming velocity ultimately always decays logarithmically in time. In the case of finite Péclet numbers, we solve the full advection-diffusion equation numerically and show that this decay can be avoided by the particle moving to regions of unconsumed reactant. Finite advection thus regularizes the two-dimensional phoretic problem.

Contact

siyuheng@nyu.edu
708 Broadway, New York, NY 10003