Biostatistics Seminar Series with Abhirup Datta, Ph.D. - Bayesian Multiplier Method for Population Size Estimation Using Multi-Source Data

April 14
12:30-1:30pm
Online

This event is hosted by the Department of Biostatistics.

ABSTRACT: The multiplier method is a common approach to size estimation of key populations, such as men who have sex with men (MSM) and female sex workers (FSW), who are at increased risk for HIV. The multiplier method estimator uses two data sources: a listing (an enumeration of key population members from health registers, community-based event attendance logs, etc.) and a survey in which participants are asked about their involvement in this past listing. When multiple listings are available along with one survey, the survey data is paired with each listing to create pair-specific estimates, which are then averaged across listings to obtain the unified estimate. This practice ignores the correlation among the pair-specific estimates owing to the use of the common survey data and leads to loss of precision. We recast the multiple multiplier method as a special case of the multiple capture-recapture problem with incomplete data and propose a fully model-based approach for size estimation using multiple capture-recapture data with an arbitrary pattern of incompleteness. We use a data augmentation scheme that allows us to model the correlations in the data and produce a unified estimate of population size per region. We also consider data misalignment, where counts from some of the data sources are not available for each region but only as an aggregate over a few regions. We propose a solution to the general misalignment problem which considers data-source-specific patterns of misalignment. We use simulation studies to demonstrate the accurate inferential capabilities of our Bayesian multiplier method. This approach is then used to produce uncertainty-quantified population size estimates of key populations in eSwatini. Lastly, we propose a Bayesian nonparametric extension for incomplete capture-recapture that allows non-independent data sources.
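
For readers unfamiliar with the classical estimator that the abstract takes as its starting point, the following is a minimal sketch, not the speaker's Bayesian method. It assumes the standard form of the multiplier estimate (listing count divided by the proportion of survey respondents who report being on that listing) and the simple averaging across listings described above. The function names and all numbers are hypothetical and chosen only for illustration.

```python
def multiplier_estimate(listing_count, survey_in_listing, survey_size):
    """Classical multiplier estimate for one listing/survey pair.

    N_hat = listing_count / p_hat, where p_hat is the share of survey
    respondents who report having been part of the listing.
    """
    if survey_in_listing <= 0 or survey_size <= 0:
        raise ValueError("survey counts must be positive")
    # Algebraically equal to listing_count / (survey_in_listing / survey_size).
    return listing_count * survey_size / survey_in_listing


def averaged_estimate(listing_counts, survey_hits, survey_size):
    """Average the pair-specific estimates across listings.

    Every pair reuses the same survey, so the estimates are correlated;
    this plain average ignores that correlation.
    """
    estimates = [
        multiplier_estimate(count, hits, survey_size)
        for count, hits in zip(listing_counts, survey_hits)
    ]
    return sum(estimates) / len(estimates)


# Hypothetical data: two listings and one survey of 400 respondents.
listing_counts = [200, 150]   # people enumerated on each listing
survey_hits = [40, 25]        # respondents reporting each listing
print(averaged_estimate(listing_counts, survey_hits, survey_size=400))
```

Because both pair-specific estimates share the same survey denominator, they are not independent, which is the limitation the abstract's model-based approach is meant to address.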

BIO: Dr. Datta is an Assistant Professor in the Department of Biostatistics at the Johns Hopkins Bloomberg School of Public Health. His research interests include spatial machine learning, graphical models, fast Bayesian algorithms for high-dimensional spatial data, hierarchical models, and Bayesian machine learning and shrinkage methods for complex survey-based datasets arising from epidemiological field work.