Yang Feng
Professor of Biostatistics
-
Professional overview
-
Yang Feng is a Professor and Ph.D. Program Director of Biostatistics in the School of Global Public Health and an affiliate faculty member in the Center for Data Science at New York University. He obtained his Ph.D. in Operations Research from Princeton University in 2010.
Feng's research interests encompass the theoretical and methodological aspects of machine learning, high-dimensional statistics, social network models, and nonparametric statistics, with practical applications that include Alzheimer's disease, cancer classification, and electronic health records. His research has been funded by multiple grants from the National Institutes of Health (NIH) and the National Science Foundation (NSF), notably the NSF CAREER Award.
He is currently an Associate Editor for the Journal of the American Statistical Association (JASA), the Journal of Business & Economic Statistics (JBES), the Journal of Computational & Graphical Statistics (JCGS), and the Annals of Applied Statistics (AoAS). His professional recognitions include being named a fellow of the American Statistical Association (ASA) and the Institute of Mathematical Statistics (IMS), as well as an elected member of the International Statistical Institute (ISI).
Please visit Dr. Yang Feng's website and Google Scholar page for more information.
-
Education
-
B.S. in Mathematics, University of Science and Technology of China, Hefei, China
Ph.D. in Operations Research, Princeton University, Princeton, NJ
-
Areas of research and study
-
Bioinformatics
Biostatistics
High-dimensional data analysis/integration
Machine learning
Modeling Social and Behavioral Dynamics
Nonparametric statistics
-
Publications
A demonstration of the RaSEn package
Feng, Y., Tian, Y. e., & Feng, Y.
Publication year: 2021

A flexible quasi-likelihood model for microbiome abundance count data
Feng, Y., Shi, Y., Li, H., Wang, C., Chen, J., Jiang, H., Shih, Y.-C. T., Zhang, H., Song, Y., Feng, Y., & Liu, L.
Publication year: 2023
Journal title: Statistics in Medicine
Volume: 42
Issue: 25
Page(s): 4632-4643

A Kronecker Product Model for Repeated Pattern Detection on 2D Urban Images
Liu, J., Psarakis, E. Z., Feng, Y., & Stamos, I.
Publication year: 2019
Journal title: IEEE Transactions on Pattern Analysis and Machine Intelligence
Volume: 41
Issue: 9
Page(s): 2266-2272
Abstract: Repeated patterns (such as windows, balconies, and doors) are prominent and significant features in urban scenes. Therefore, detection of these repeated patterns becomes very important for city scene analysis. This paper attacks the problem of repeated pattern detection in a precise, efficient and automatic way, by combining traditional feature extraction with a Kronecker product based low-rank model. We introduced novel algorithms that extract repeated patterns from rectified images with solid theoretical support. Our method is tailored for 2D images of building façades and tested on a large set of façade images.

A likelihood-ratio type test for stochastic block models with bounded degrees
Feng, Y., Yuan, M., Feng, Y., & Shang, Z.
Publication year: 2022
Journal title: Journal of Statistical Planning and Inference
Volume: 219
Page(s): 98-119

A Projection Based Conditional Dependence Measure with Applications to High-dimensional Undirected Graphical Models
Feng, Y., Fan, J., Feng, Y., & Xia, L.
Publication year: 2020
Journal title: Journal of Econometrics

Accounting for incomplete testing in the estimation of epidemic parameters
Betensky, R. A., & Feng, Y.
Publication year: 2020
Journal title: International Journal of Epidemiology
Volume: 49
Issue: 5
Page(s): 1419-1426

Accounting for incomplete testing in the estimation of epidemic parameters
Feng, Y., Betensky, R. A., & Feng, Y.
Publication year: 2020
Journal title: International Journal of Epidemiology

Analytical performance of lateral flow immunoassay for SARS-CoV-2 exposure screening on venous and capillary blood samples
Black, M. A., Shen, G., Feng, X., Garcia Beltran, W. F., Feng, Y., Vasudevaraja, V., Allison, D., Lin, L. H., Gindin, T., Astudillo, M., Yang, D., Murali, M., Iafrate, A. J., Jour, G., Cotzia, P., & Snuderl, M.
Publication year: 2021
Journal title: Journal of Immunological Methods
Volume: 489
Abstract: Objectives: We validate the use of a lateral flow immunoassay (LFI) intended for rapid screening and qualitative detection of anti-SARS-CoV-2 IgM and IgG in serum, plasma, and whole blood, and compare results with ELISA. We also seek to establish the value of LFI testing on blood obtained from a capillary blood sample. Methods: Samples collected by venous blood draw and finger stick were obtained from patients with SARS-CoV-2 detected by RT-qPCR and control patients. Samples were tested with Biolidics 2019-nCoV IgG/IgM Detection Kit lateral flow immunoassay, and antibody calls were compared with ELISA. Results: Biolidics LFI showed clinical sensitivity of 92% with venous blood at 7 days after PCR diagnosis of SARS-CoV-2. Test specificity was 92% for IgM and 100% for IgG. There was no significant difference in detecting IgM and IgG with Biolidics LFI and ELISA at D0 and D7 (p = 1.00), except for detection of IgM at D7 (p = 0.04). Capillary blood of SARS-CoV-2 patients showed 93% sensitivity for antibody detection. Conclusions: Clinical performance of Biolidics 2019-nCoV IgG/IgM Detection Kit is comparable to ELISA and was consistent across sample types. This provides an opportunity for decentralized rapid testing and may allow point-of-care and longitudinal self-testing for the presence of anti-SARS-CoV-2 antibodies.

Association of body composition parameters measured on CT with risk of hospitalization in patients with Covid-19
Feng, Y., Chandarana, H., Pisuchpen, N., Krieger, R., Dane, B., Mikheev, A., Feng, Y., Kambadakone, A., & Rusinek, H.
Publication year: 2021
Journal title: European Journal of Radiology
Volume: 145
Page(s): 110031

Association of hyperglycemia and molecular subclass on survival in IDH-wildtype glioblastoma
Feng, Y., Liu, E. K., Vasudevaraja, V., Sviderskiy, V. O., Feng, Y., Tran, I., Serrano, J., Cordova, C., Kurz, S. C., Golfinos, J. G., Sulman, E. P., & others.
Publication year: 2022
Journal title: Neuro-Oncology Advances
Volume: 4
Issue: 1
Page(s): vdac163

Clinical, Pathological, and Molecular Characteristics of Diffuse Spinal Cord Gliomas
Feng, Y., Garcia, M. R., Feng, Y., Vasudevaraja, V., Galbraith, K., Serrano, J., Thomas, C., Radmanesh, A., Hidalgo, E. T., Harter, D. H., Allen, J. C., & others.
Publication year: 2022
Journal title: Journal of Neuropathology & Experimental Neurology
Volume: 81
Issue: 11
Page(s): 865-872

Comments on: Statistical inference and large-scale multiple testing for high-dimensional regression models
Feng, Y., Tian, Y. e., & Feng, Y.
Publication year: 2023
Journal title: Test
Volume: 32
Issue: 4
Page(s): 1172-1176

Community detection with nodal information: Likelihood and its variational approximation
Weng, H., & Feng, Y.
Publication year: 2022
Journal title: Stat
Volume: 11
Issue: 1
Abstract: Community detection is one of the fundamental problems in the study of network data. Most existing community detection approaches only consider edge information as inputs, and the output could be suboptimal when nodal information is available. In such cases, it is desirable to leverage nodal information for the improvement of community detection accuracy. Towards this goal, we propose a flexible network model incorporating nodal information and develop likelihood-based inference methods. For the proposed methods, we establish favorable asymptotic properties as well as efficient algorithms for computation. Numerical experiments show the effectiveness of our methods in utilizing nodal information across a variety of simulated and real network data sets.

Comparison of solid tissue sequencing and liquid biopsy accuracy in identification of clinically relevant gene mutations and rearrangements in lung adenocarcinomas
Feng, Y., Lin, L. H., Allison, D. H., Feng, Y., Jour, G., Park, K., Zhou, F., Moreira, A. L., Shen, G., Feng, X., Sabari, J., & others.
Publication year: 2021
Journal title: Modern Pathology
Volume: 34
Issue: 12
Page(s): 2168-2174

Consistent Estimation of the Number of Communities in Non-uniform Hypergraph Model
Shang, Z., Zhang, Z., & Feng, Y.
Publication year: 2025
Journal title: Stat
Volume: 14
Issue: 2
Abstract: We propose an algorithm based on cross-validation to estimate the number of communities in a general non-uniform hypergraph model. The algorithm involves a three-step process. Initially, it randomly divides the set of hyperedges into a training set and a testing set. Subsequently, for each candidate number of communities, we construct a spectral estimation of community labels and least square estimation of the hyperedge probabilities based on the training set. The final step involves the computation of cross-validation scores using the testing set. The proposed algorithm is shown to be consistent when the number of vertices tends to infinity.

DDAC-SpAM: A Distributed Algorithm for Fitting High-dimensional Sparse Additive Models with Feature Division and Decorrelation
Feng, Y., He, Y., Wu, R., Zhou, Y., & Feng, Y.
Publication year: 2023
Journal title: Journal of the American Statistical Association
Page(s): 1-12

Design-Based Causal Inference with Missing Outcomes: Missingness Mechanisms, Imputation-Assisted Randomization Tests, and Covariate Adjustment
Feng, Y., Heng, S., Zhang, J., & Feng, Y.
Publication year: 2023
Journal title: arXiv preprint arXiv:2310.18556

Differential Role of Hyperglycemia on Survival in IDH-wildtype Glioblastoma Subclasses
Feng, Y., Liu, E., Vasudevaraja, V., Sviderskiy, V., Feng, Y., Tran, I., Serrano, J., Cordova, C., Kurz, S., Golfinos, J., Sulman, E., & others. (6th eds.).
Publication year: 2022
Volume: 81
Page(s): 440

Discussion of “Cocitation and Coauthorship Networks of Statisticians”
Feng, Y., Weng, H., & Feng, Y.
Publication year: 2022
Journal title: Journal of Business & Economic Statistics
Volume: 40
Issue: 2
Page(s): 486-490

Imbalanced classification: A paradigm-based review
Feng, Y., Feng, Y., Zhou, M., & Tong, X.
Publication year: 2021
Journal title: Statistical Analysis and Data Mining: The ASA Data Science Journal
Volume: 14
Issue: 5
Page(s): 383-406

Large-scale model selection in misspecified generalized linear models
Feng, Y., Demirkaya, E., Feng, Y., Basu, P., & Lv, J.
Publication year: 2022
Journal title: Biometrika
Volume: 109
Issue: 1
Page(s): 123-136

Learning from Similar Linear Representations: Adaptivity, Minimaxity, and Robustness
Feng, Y., Tian, Y. e., Gu, Y., & Feng, Y.
Publication year: 2023
Journal title: arXiv preprint arXiv:2303.17765

Likelihood adaptively modified penalties
Feng, Y., Feng, Y., Li, T., & Ying, Z.
Publication year: 2019
Journal title: Applied Stochastic Models in Business and Industry
Volume: 35
Issue: 2
Page(s): 330-353

Machine collaboration
Feng, Y., Liu, Q., & Feng, Y.
Publication year: 2024
Journal title: Stat
Volume: 13
Issue: 1
Page(s): e661

Machine collaboration
Liu, Q., & Feng, Y.
Publication year: 2024
Journal title: Stat
Volume: 13
Issue: 1
Abstract: We propose a new ensemble framework for supervised learning, called machine collaboration (MaC), using a collection of possibly heterogeneous base learning methods (hereafter, base machines) for prediction tasks. Unlike bagging/stacking (a parallel and independent framework) and boosting (a sequential and top-down framework), MaC is a type of circular and recursive learning framework. The circular and recursive nature helps the base machines to transfer information circularly and update their structures and parameters accordingly. The theoretical result on the risk bound of the estimator from MaC reveals that the circular and recursive feature can help MaC reduce risk via a parsimonious ensemble. We conduct extensive experiments on MaC using both simulated data and 119 benchmark real datasets. The results demonstrate that in most cases, MaC performs significantly better than several other state-of-the-art methods, including classification and regression trees, neural networks, stacking, and boosting.