Yang Feng
Professor of Biostatistics
- 
  Professional overview
- 
  Yang Feng is a Professor and Ph.D. Program Director of Biostatistics in the School of Global Public Health and an affiliate faculty member in the Center for Data Science at New York University. He obtained his Ph.D. in Operations Research at Princeton University in 2010. Feng's research interests encompass the theoretical and methodological aspects of machine learning, high-dimensional statistics, social network models, and nonparametric statistics, with practical applications including Alzheimer's disease, cancer classification, and electronic health records. His research has been funded by multiple grants from the National Institutes of Health (NIH) and the National Science Foundation (NSF), notably the NSF CAREER Award. He is currently an Associate Editor for the Journal of the American Statistical Association (JASA), the Journal of Business & Economic Statistics (JBES), the Journal of Computational & Graphical Statistics (JCGS), and the Annals of Applied Statistics (AoAS). His professional recognitions include being named a fellow of the American Statistical Association (ASA) and the Institute of Mathematical Statistics (IMS), as well as an elected member of the International Statistical Institute (ISI). For more information, please visit Dr. Yang Feng's website and Google Scholar page.
- 
  Education
- 
  B.S. in Mathematics, University of Science and Technology of China, Hefei, China
  Ph.D. in Operations Research, Princeton University, Princeton, NJ
- 
  Areas of research and study
- 
  Bioinformatics
  Biostatistics
  High-dimensional data analysis/integration
  Machine learning
  Modeling Social and Behavioral Dynamics
  Nonparametric statistics
- 
  Publications
- A demonstration of the RaSEn package
  Authors: Tian, Y., & Feng, Y.
  Publication year: 2021
- A flexible quasi-likelihood model for microbiome abundance count data
  Authors: Shi, Y., Li, H., Wang, C., Chen, J., Jiang, H., Shih, Y.-C. T., Zhang, H., Song, Y., Feng, Y., & Liu, L.
  Publication year: 2023
  Journal title: Statistics in Medicine
  Volume: 42
  Issue: 25
  Page(s): 4632-4643
- A Kronecker Product Model for Repeated Pattern Detection on 2D Urban Images
  Authors: Liu, J., Psarakis, E. Z., Feng, Y., & Stamos, I.
  Publication year: 2019
  Journal title: IEEE Transactions on Pattern Analysis and Machine Intelligence
  Volume: 41
  Issue: 9
  Page(s): 2266-2272
  Abstract: Repeated patterns (such as windows, balconies, and doors) are prominent and significant features in urban scenes. Therefore, detection of these repeated patterns becomes very important for city scene analysis. This paper attacks the problem of repeated pattern detection in a precise, efficient and automatic way, by combining traditional feature extraction with a Kronecker product based low-rank model. We introduced novel algorithms that extract repeated patterns from rectified images with solid theoretical support. Our method is tailored for 2D images of building façades and tested on a large set of façade images.
- A likelihood-ratio type test for stochastic block models with bounded degrees
  Authors: Yuan, M., Feng, Y., & Shang, Z.
  Publication year: 2022
  Journal title: Journal of Statistical Planning and Inference
  Volume: 219
  Page(s): 98-119
- A Projection Based Conditional Dependence Measure with Applications to High-dimensional Undirected Graphical Models
  Authors: Fan, J., Feng, Y., & Xia, L.
  Publication year: 2020
  Journal title: Journal of Econometrics
- Accounting for incomplete testing in the estimation of epidemic parameters
  Authors: Betensky, R. A., & Feng, Y.
  Publication year: 2020
  Journal title: International Journal of Epidemiology
  Volume: 49
  Issue: 5
  Page(s): 1419-1426
- Analytical performance of lateral flow immunoassay for SARS-CoV-2 exposure screening on venous and capillary blood samples
  Authors: Black, M. A., Shen, G., Feng, X., Garcia Beltran, W. F., Feng, Y., Vasudevaraja, V., Allison, D., Lin, L. H., Gindin, T., Astudillo, M., Yang, D., Murali, M., Iafrate, A. J., Jour, G., Cotzia, P., & Snuderl, M.
  Publication year: 2021
  Journal title: Journal of Immunological Methods
  Volume: 489
  Abstract: Objectives: We validate the use of a lateral flow immunoassay (LFI) intended for rapid screening and qualitative detection of anti-SARS-CoV-2 IgM and IgG in serum, plasma, and whole blood, and compare results with ELISA. We also seek to establish the value of LFI testing on blood obtained from a capillary blood sample. Methods: Samples collected by venous blood draw and finger stick were obtained from patients with SARS-CoV-2 detected by RT-qPCR and control patients. Samples were tested with Biolidics 2019-nCoV IgG/IgM Detection Kit lateral flow immunoassay, and antibody calls were compared with ELISA. Results: Biolidics LFI showed clinical sensitivity of 92% with venous blood at 7 days after PCR diagnosis of SARS-CoV-2. Test specificity was 92% for IgM and 100% for IgG. There was no significant difference in detecting IgM and IgG with Biolidics LFI and ELISA at D0 and D7 (p = 1.00), except for detection of IgM at D7 (p = 0.04). Capillary blood of SARS-CoV-2 patients showed 93% sensitivity for antibody detection. Conclusions: Clinical performance of Biolidics 2019-nCoV IgG/IgM Detection Kit is comparable to ELISA and was consistent across sample types. This provides an opportunity for decentralized rapid testing and may allow point-of-care and longitudinal self-testing for the presence of anti-SARS-CoV-2 antibodies.
- Association of body composition parameters measured on CT with risk of hospitalization in patients with Covid-19
  Authors: Chandarana, H., Pisuchpen, N., Krieger, R., Dane, B., Mikheev, A., Feng, Y., Kambadakone, A., & Rusinek, H.
  Publication year: 2021
  Journal title: European Journal of Radiology
  Volume: 145
  Page(s): 110031
- Association of hyperglycemia and molecular subclass on survival in IDH-wildtype glioblastoma
  Authors: Liu, E. K., Vasudevaraja, V., Sviderskiy, V. O., Feng, Y., Tran, I., Serrano, J., Cordova, C., Kurz, S. C., Golfinos, J. G., Sulman, E. P., & others.
  Publication year: 2022
  Journal title: Neuro-Oncology Advances
  Volume: 4
  Issue: 1
  Page(s): vdac163
- Clinical, Pathological, and Molecular Characteristics of Diffuse Spinal Cord Gliomas
  Authors: Garcia, M. R., Feng, Y., Vasudevaraja, V., Galbraith, K., Serrano, J., Thomas, C., Radmanesh, A., Hidalgo, E. T., Harter, D. H., Allen, J. C., & others.
  Publication year: 2022
  Journal title: Journal of Neuropathology & Experimental Neurology
  Volume: 81
  Issue: 11
  Page(s): 865-872
- Comments on: Statistical inference and large-scale multiple testing for high-dimensional regression models
  Authors: Tian, Y., & Feng, Y.
  Publication year: 2023
  Journal title: Test
  Volume: 32
  Issue: 4
  Page(s): 1172-1176
- Community detection with nodal information: Likelihood and its variational approximation
  Authors: Weng, H., & Feng, Y.
  Publication year: 2022
  Journal title: Stat
  Volume: 11
  Issue: 1
  Abstract: Community detection is one of the fundamental problems in the study of network data. Most existing community detection approaches only consider edge information as inputs, and the output could be suboptimal when nodal information is available. In such cases, it is desirable to leverage nodal information for the improvement of community detection accuracy. Towards this goal, we propose a flexible network model incorporating nodal information and develop likelihood-based inference methods. For the proposed methods, we establish favorable asymptotic properties as well as efficient algorithms for computation. Numerical experiments show the effectiveness of our methods in utilizing nodal information across a variety of simulated and real network data sets.
- Comparison of solid tissue sequencing and liquid biopsy accuracy in identification of clinically relevant gene mutations and rearrangements in lung adenocarcinomas
  Authors: Lin, L. H., Allison, D. H., Feng, Y., Jour, G., Park, K., Zhou, F., Moreira, A. L., Shen, G., Feng, X., Sabari, J., & others.
  Publication year: 2021
  Journal title: Modern Pathology
  Volume: 34
  Issue: 12
  Page(s): 2168-2174
- Consistent Estimation of the Number of Communities in Non-uniform Hypergraph Model
  Authors: Shang, Z., Zhang, Z., & Feng, Y.
  Publication year: 2025
  Journal title: Stat
  Volume: 14
  Issue: 2
  Abstract: We propose an algorithm based on cross-validation to estimate the number of communities in a general non-uniform hypergraph model. The algorithm involves a three-step process. Initially, it randomly divides the set of hyperedges into a training set and a testing set. Subsequently, for each candidate number of communities, we construct a spectral estimation of community labels and least square estimation of the hyperedge probabilities based on the training set. The final step involves the computation of cross-validation scores using the testing set. The proposed algorithm is shown to be consistent when the number of vertices tends to infinity.
- DDAC-SpAM: A Distributed Algorithm for Fitting High-dimensional Sparse Additive Models with Feature Division and Decorrelation
  Authors: He, Y., Wu, R., Zhou, Y., & Feng, Y.
  Publication year: 2023
  Journal title: Journal of the American Statistical Association
  Page(s): 1-12
- Design-Based Causal Inference with Missing Outcomes: Missingness Mechanisms, Imputation-Assisted Randomization Tests, and Covariate Adjustment
  Authors: Heng, S., Zhang, J., & Feng, Y.
  Publication year: 2023
  Journal title: arXiv preprint arXiv:2310.18556
- Differential Role of Hyperglycemia on Survival in IDH-wildtype Glioblastoma Subclasses
  Authors: Liu, E., Vasudevaraja, V., Sviderskiy, V., Feng, Y., Tran, I., Serrano, J., Cordova, C., Kurz, S., Golfinos, J., Sulman, E., & others. (6th eds.).
  Publication year: 2022
  Volume: 81
  Page(s): 440
- Discussion of “Cocitation and Coauthorship Networks of Statisticians”
  Authors: Weng, H., & Feng, Y.
  Publication year: 2022
  Journal title: Journal of Business & Economic Statistics
  Volume: 40
  Issue: 2
  Page(s): 486-490
- Imbalanced classification: A paradigm-based review
  Authors: Feng, Y., Zhou, M., & Tong, X.
  Publication year: 2021
  Journal title: Statistical Analysis and Data Mining: The ASA Data Science Journal
  Volume: 14
  Issue: 5
  Page(s): 383-406
- Large-scale model selection in misspecified generalized linear models
  Authors: Demirkaya, E., Feng, Y., Basu, P., & Lv, J.
  Publication year: 2022
  Journal title: Biometrika
  Volume: 109
  Issue: 1
  Page(s): 123-136
- Learning from Similar Linear Representations: Adaptivity, Minimaxity, and Robustness
  Authors: Tian, Y., Gu, Y., & Feng, Y.
  Publication year: 2023
  Journal title: arXiv preprint arXiv:2303.17765
- Likelihood adaptively modified penalties
  Authors: Feng, Y., Li, T., & Ying, Z.
  Publication year: 2019
  Journal title: Applied Stochastic Models in Business and Industry
  Volume: 35
  Issue: 2
  Page(s): 330-353
- Machine collaboration
  Authors: Liu, Q., & Feng, Y.
  Publication year: 2024
  Journal title: Stat
  Volume: 13
  Issue: 1
  Page(s): e661
  Abstract: We propose a new ensemble framework for supervised learning, called machine collaboration (MaC), using a collection of possibly heterogeneous base learning methods (hereafter, base machines) for prediction tasks. Unlike bagging/stacking (a parallel and independent framework) and boosting (a sequential and top-down framework), MaC is a type of circular and recursive learning framework. The circular and recursive nature helps the base machines to transfer information circularly and update their structures and parameters accordingly. The theoretical result on the risk bound of the estimator from MaC reveals that the circular and recursive feature can help MaC reduce risk via a parsimonious ensemble. We conduct extensive experiments on MaC using both simulated data and 119 benchmark real datasets. The results demonstrate that in most cases, MaC performs significantly better than several other state-of-the-art methods, including classification and regression trees, neural networks, stacking, and boosting.