Biostatistics Seminar Series: SIMPLE: Statistical Inference on Membership Profiles in Large Networks ft. Dr. Jinchi Lv

November 13
12:30-2pm
715 Broadway, 12th Floor, Room 1221

Please join the Department of Biostatistics for a Seminar Series lecture on SIMPLE: Statistical Inference on Membership Profiles in Large Networks featuring Dr. Jinchi Lv.

Abstract: Network data is prevalent in many contemporary big data applications in which a common interest is to unveil important latent links between different pairs of nodes. Yet a simple fundamental question of how to precisely quantify the statistical uncertainty associated with the identification of latent links still remains largely unexplored. In this paper, we propose the method of statistical inference on membership profiles in large networks (SIMPLE) in the setting of the degree-corrected mixed membership model, where the null hypothesis assumes that the pair of nodes share the same profile of community memberships. In the simpler case of no degree heterogeneity, the model reduces to the mixed membership model, for which an alternative, more robust test is also proposed. Both tests are Hotelling-type statistics based on the rows of empirical eigenvectors or their ratios, whose asymptotic covariance matrices are very challenging to derive and estimate. Nevertheless, their analytical expressions are unveiled and the unknown covariance matrices are consistently estimated. Under some mild regularity conditions, we establish the exact limiting distributions of the two forms of SIMPLE test statistics under the null hypothesis and contiguous alternative hypothesis. They are the chi-square distributions and the noncentral chi-square distributions, respectively, with degrees of freedom depending on whether the degrees are corrected or not. We also address the important issue of estimating the unknown number of communities and establish the asymptotic properties of the associated test statistics. The advantages and practical utility of our new procedures in terms of both size and power are demonstrated through several simulation examples and real network applications. This is joint work with Jianqing Fan, Yingying Fan and Xiao Han.

Short bio: Jinchi Lv is Kenneth King Stonier Chair in Business Administration and Professor in the Data Sciences and Operations Department of the Marshall School of Business at the University of Southern California, Professor in the Department of Mathematics at USC, and an Associate Fellow of the USC Dornsife Institute for New Economic Thinking (INET). He received his Ph.D. in Mathematics from Princeton University in 2007. He was McAlister Associate Professor in Business Administration at USC from 2016 to 2019. His research interests include statistics, machine learning, data science, business applications, and artificial intelligence and blockchain. His papers have been published in journals in statistics, economics, computer science, information theory, and biology, and one of them was published as a Discussion Paper in the Journal of the Royal Statistical Society Series B (2008). His honors include Fellow of the Institute of Mathematical Statistics (2019), the USC Marshall Dean's Award for Research Impact (2017), the Adobe Data Science Research Award (2017), the Royal Statistical Society Guy Medal in Bronze (2015), the NSF Faculty Early Career Development (CAREER) Award (2010), the USC Marshall Dean's Award for Research Excellence (2009), and the Zumberge Individual Award from USC's James H. Zumberge Faculty Research and Innovation Fund (2008). He has served as an associate editor of the Annals of Statistics (2013-2018), the Journal of Business & Economic Statistics (2018-present), and Statistica Sinica (2008-2016).