Yang Feng

Professor of Biostatistics

Professional overview

Yang Feng is a Professor and Ph.D. Program Director of Biostatistics in the School of Global Public Health and an affiliate faculty member in the Center for Data Science at New York University. He obtained his Ph.D. in Operations Research at Princeton University in 2010.

Feng's research interests encompass the theoretical and methodological aspects of machine learning, high-dimensional statistics, social network models, and nonparametric statistics, with practical applications to Alzheimer's disease, cancer classification, and electronic health records. His research has been funded by multiple grants from the National Institutes of Health (NIH) and the National Science Foundation (NSF), notably the NSF CAREER Award.

He is currently an Associate Editor for the Journal of the American Statistical Association (JASA), the Journal of Business & Economic Statistics (JBES), the Journal of Computational & Graphical Statistics (JCGS), and the Annals of Applied Statistics (AoAS). His professional recognitions include being named a fellow of the American Statistical Association (ASA) and the Institute of Mathematical Statistics (IMS), as well as an elected member of the International Statistical Institute (ISI).

Please visit Dr. Yang Feng's website and Google Scholar page for more information.

Education

B.S. in Mathematics, University of Science and Technology of China, Hefei, China
Ph.D. in Operations Research, Princeton University, Princeton, NJ

Areas of research and study

Bioinformatics
Biostatistics
High-dimensional data analysis/integration
Machine learning
Modeling Social and Behavioral Dynamics
Nonparametric statistics

Publications

Testing community structure for hypergraphs

Yuan, M., Liu, R., Feng, Y., & Shang, Z. (n.d.).

Publication year

2022

Journal title

The Annals of Statistics

Volume

50

Issue

1

Page(s)

147-169

The Interplay of Demographic Variables and Social Distancing Scores in Deep Prediction of US COVID-19 Cases

Tang, F., Feng, Y., Chiheb, H., & Fan, J. (n.d.).

Publication year

2021

Journal title

Journal of the American Statistical Association

The restricted consistency property of leave-$n_v$-out cross-validation for high-dimensional variable selection

Feng, Y., & Yu, Y. (n.d.).

Publication year

2019

Journal title

Statistica Sinica

Volume

29

Page(s)

1607-1630

Towards the Theory of Unsupervised Federated Learning: Non-asymptotic Analysis of Federated EM Algorithms

Tian, Y., Weng, H., & Feng, Y. (n.d.).

Publication year

2024

Journal title

Proceedings of Machine Learning Research

Volume

235

Page(s)

48226-48279
Abstract
While supervised federated learning approaches have enjoyed significant success, the domain of unsupervised federated learning remains relatively underexplored. Several federated EM algorithms have gained popularity in practice; however, their theoretical foundations are often lacking. In this paper, we first introduce a federated gradient EM algorithm (FedGrEM) designed for the unsupervised learning of mixture models, which supplements the existing federated EM algorithms by considering task heterogeneity and potential adversarial attacks. We present a comprehensive finite-sample theory that holds for general mixture models, and then apply this general theory to specific statistical models to characterize the explicit estimation error of model parameters and mixture proportions. Our theory elucidates when and how FedGrEM outperforms local single-task learning, with insights extending to existing federated EM algorithms. This bridges the gap between their practical success and theoretical understanding. Our numerical results validate our theory and demonstrate FedGrEM’s superiority over existing unsupervised federated learning benchmarks.

Transfer learning under high-dimensional generalized linear models

Tian, Y., & Feng, Y. (n.d.).

Publication year

2023

Journal title

Journal of the American Statistical Association

Volume

118

Issue

544

Page(s)

2684-2697

Unsupervised Federated Learning: A Federated Gradient EM Algorithm for Heterogeneous Mixture Models with Robustness against Adversarial Attacks

Tian, Y., Weng, H., & Feng, Y. (n.d.).

Publication year

2023

Journal title

arXiv preprint arXiv:2310.15330

Unsupervised Multi-task and Transfer Learning on Gaussian Mixture Models

Tian, Y., Weng, H., & Feng, Y. (n.d.).

Publication year

2022

Journal title

arXiv preprint arXiv:2209.15224

Variable selection for high-dimensional generalized linear model with block-missing data

He, Y., Feng, Y., & Song, X. (n.d.).

Publication year

2023

Journal title

Scandinavian Journal of Statistics

Volume

50

Issue

3

Page(s)

1279-1297

Visceral adipose tissue in patients with COVID-19: risk stratification for severity

Chandarana, H., Dane, B., Mikheev, A., Taffel, M. T., Feng, Y., & Rusinek, H. (n.d.).

Publication year

2021

Journal title

Abdominal Radiology

Volume

46

Issue

2

Page(s)

818-825

ℓ1-Penalized Multinomial Regression: Estimation, Inference, and Prediction, With an Application to Risk Factor Identification for Different Dementia Subtypes

Tian, Y., Rusinek, H., Masurkar, A. V., & Feng, Y. (n.d.).

Publication year

2024

Journal title

Statistics in Medicine

Volume

43

Issue

30

Page(s)

5711-5747
Abstract
High-dimensional multinomial regression models are very useful in practice but have received less research attention than logistic regression models, especially from the perspective of statistical inference. In this work, we analyze the estimation and prediction error of the contrast-based ℓ1-penalized multinomial regression model and extend the debiasing method to the multinomial case, providing a valid confidence interval for each coefficient and p-value of the individual hypothesis test. We also examine cases of model misspecification and non-identically distributed data to demonstrate the robustness of our method when some assumptions are violated. We apply the debiasing method to identify important predictors in the progression into dementia of different subtypes. Results from extensive simulations show the superiority of the debiasing method compared to other inference methods.

Contact

yang.feng@nyu.edu
708 Broadway, New York, NY 10003