Yajun Mei
Professor of Biostatistics
-
Professional overview
-
Yajun Mei is a Professor of Biostatistics at NYU/GPH, a position he has held since July 1, 2024. He received his B.S. in Mathematics from Peking University, Beijing, China, in 1996, and his Ph.D. in Mathematics with a minor in Electrical Engineering from the California Institute of Technology, Pasadena, CA, in 2003. He was a postdoctoral fellow in biostatistics at the renowned Fred Hutchinson Cancer Center in Seattle, WA, from 2003 to 2005. Prior to joining NYU, Dr. Mei spent 18 years (2006-2024) in the H. Milton Stewart School of Industrial and Systems Engineering at the Georgia Institute of Technology, Atlanta, GA, rising from Assistant to Associate to Full Professor, and served from 2018 as a co-director of the Biostatistics, Epidemiology, and Research Design (BERD) core of the Georgia CTSA.
Dr. Mei's research interests span statistics, machine learning, and data science, and their applications in biomedical science and public health: streaming data analysis, sequential decision-making and design, change-point problems, precision/personalized medicine, hot-spot detection for infectious diseases, longitudinal data analysis, bioinformatics, and clinical trials. His work has been recognized with the Abraham Wald Prize in Sequential Analysis in both 2009 and 2024, an NSF CAREER Award in 2010, election as a Fellow of the American Statistical Association (ASA) in 2023, and multiple best paper awards.
-
Education
-
BS, Mathematics, Peking University
PhD, Mathematics, California Institute of Technology
-
Honors and awards
-
Fellow of the American Statistical Association (2023)
Star Research Achievement Award, 2021 Virtual Critical Care Congress (2021)
Best Paper Competition Award, Quality, Statistics & Reliability of INFORMS (2020)
Bronze Snapshot Award, Society of Critical Care Medicine (2019)
NSF CAREER Award (2010)
Thank a Teacher Certificate, Center for Teaching and Learning (2011, 2012, 2016, 2020, 2021, 2022, 2023)
Abraham Wald Prize (2009)
Best Paper Award, 11th International Conference on Information Fusion (2008)
New Researcher Fellow, Statistical and Applied Mathematical Sciences Institute (2005)
Fred Hutchinson SPAC Travel Award to attend 2005 Joint Statistical Meetings, Minneapolis, MN (2005)
Travel Award to 8th New Researchers Conference, Minneapolis, MN (2005)
Travel Award to IEEE International Symposium on Information Theory, Chicago, IL (2004)
Travel Award to IPAM workshop on inverse problems, UCLA, Los Angeles, CA (2003)
Fred Hutchinson SPAC Course Scholarship (2003)
Travel Award to the SAMSI workshop on inverse problems, Research Triangle Park, NC (2002)
-
Publications
-
Efficient Sequential UCB-based Hungarian Algorithm for Assignment Problems
Abstract: The assignment problem has many real-world applications, such as allocating agents to tasks for optimal utility gain. While it has been well studied in the optimization literature when the underlying utility between every pair of agent and task is known, research is limited when the utilities are unknown and need to be learned from data on the fly. In this work, motivated by the mentor-mentee matching application in U.S. universities, we develop an efficient sequential assignment algorithm, with the objective of nearly maximizing the overall utility at each time. Our proposed algorithm uses stochastic binary bandit feedback to estimate the unknown utilities through logistic regression, and then combines the Upper Confidence Bound (UCB) method from the multi-armed bandit problem with the Hungarian algorithm for the assignment problem. We derive theoretical bounds for our algorithm on both the estimation error and the total regret, and numerical studies are conducted to illustrate its usefulness.

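The abstract's ingredients map onto standard tools. Below is a hypothetical minimal sketch, not the authors' implementation: it simplifies the paper's logistic-regression utility estimates to running means of the binary feedback, adds a UCB exploration bonus, and solves each round's matching with the Hungarian algorithm via scipy.optimize.linear_sum_assignment.

```python
# Hypothetical UCB-plus-Hungarian assignment loop; the paper's estimator
# (logistic regression) and confidence bounds differ in detail.
import numpy as np
from scipy.optimize import linear_sum_assignment

rng = np.random.default_rng(0)
n = 5                                        # agents == tasks, for simplicity
true_util = rng.uniform(0.2, 0.9, (n, n))    # unknown success probabilities

counts = np.ones((n, n))                     # observations per (agent, task) pair
means = np.full((n, n), 0.5)                 # running utility estimates

for t in range(1, 2001):
    # UCB-inflated utility: estimate plus an exploration bonus
    ucb = means + np.sqrt(2 * np.log(t + 1) / counts)
    # Hungarian algorithm maximizes total (UCB) utility; negate for a min-cost solver
    rows, cols = linear_sum_assignment(-ucb)
    # Binary bandit feedback for each matched pair, then incremental mean update
    feedback = rng.random(n) < true_util[rows, cols]
    means[rows, cols] += (feedback - means[rows, cols]) / (counts[rows, cols] + 1)
    counts[rows, cols] += 1

rows, cols = linear_sum_assignment(-means)
print("learned assignment:", list(zip(rows, cols)))
```
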
Implicit Regularization Properties of Variance Reduced Stochastic Mirror Descent
Luo, Y., Huo, X., & Mei, Y. (2022), pp. 696-701.
Abstract: In machine learning and statistical data analysis, we often run into an objective function that is a summation: the number of terms in the summation is possibly equal to the sample size, which can be enormous. In such a setting, the stochastic mirror descent (SMD) algorithm is a numerically efficient method, with each iteration involving a very small subset of the data. The variance reduction version of SMD (VRSMD) can further improve SMD by inducing faster convergence. On the other hand, algorithms such as gradient descent and stochastic gradient descent have the implicit regularization property that leads to better performance in terms of generalization error. Little is known on whether such a property holds for VRSMD. We prove here that the discrete VRSMD estimator sequence converges to the minimum mirror interpolant in linear regression. This establishes the implicit regularization property for VRSMD. As an application of the above result, we derive a model estimation accuracy result in the setting where the true model is sparse. We use numerical examples to illustrate the empirical power of VRSMD.

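As a rough illustration of the object being analyzed, here is a minimal SVRG-style variance-reduced mirror descent on a sparse linear regression, using the squared q-norm mirror map. The step size, mirror map, and all constants are illustrative assumptions, not the paper's setup.

```python
# Sketch of variance-reduced stochastic mirror descent (SVRG-style) on least
# squares with the squared q-norm mirror map; details differ from the paper.
import numpy as np

def grad_map(x, r):
    """Gradient of psi(x) = 0.5 * ||x||_r^2; the maps for r = q and its dual
    exponent p are inverses of each other."""
    nrm = (np.abs(x) ** r).sum() ** (1.0 / r)
    if nrm == 0:
        return np.zeros_like(x)
    return np.sign(x) * np.abs(x) ** (r - 1) * nrm ** (2 - r)

rng = np.random.default_rng(1)
n, d, q = 100, 20, 1.5
p = q / (q - 1)                         # dual exponent (here p = 3)
A = rng.standard_normal((n, d))
x_star = np.zeros(d); x_star[:5] = 1.0  # sparse ground truth
b = A @ x_star

x, eta = np.zeros(d), 0.01
for epoch in range(50):
    mu = A.T @ (A @ x - b) / n          # full gradient at the snapshot
    x_snap = x.copy()
    for _ in range(n):
        i = rng.integers(n)
        g = (A[i] * (A[i] @ x - b[i])
             - A[i] * (A[i] @ x_snap - b[i]) + mu)   # variance-reduced gradient
        y = grad_map(x, q) - eta * g    # mirror step in the dual space
        x = grad_map(y, p)              # map back via the inverse mirror map
print("recovered support:", sorted(np.argsort(-np.abs(x))[:5]))
```
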
Predicting the rheology of limestone calcined clay cements (LC3): Linking composition and hydration kinetics to yield stress through Machine Learning
Canbek, O., Xu, Q., Mei, Y., Washburn, N. R., & Kurtis, K. E. (2022). Cement and Concrete Research, 160.
Abstract: The physicochemical characteristics of calcined clay influence the yield stress of limestone calcined clay cements (LC3), but the independent influences of the clay's physical and chemical characteristics, as well as the effect of other variables on LC3 rheology, are less well understood. Further, a relationship between LC3 hydration kinetics and yield stress, important for informing mixture design, has not yet been established. Here, rheological properties were determined in pastes with varying water-to-solid ratio (w/s), constituent mass ratios (PC:metakaolin:limestone), limestone particle size, and gypsum content. From these data, a machine learning model was developed that allowed independent examination of the different mechanisms by which metakaolin fraction influences the yield stress of LC3, identifying the four predictors most significant for predicting LC3 yield stress: packing index, Al2O3/SO3, total particle density, and metakaolin fraction relative to limestone (MK/LS). A methodology based on kernel smoothing also identified the hydration kinetics parameters best correlated with yield stress.

Private Sequential Hypothesis Testing for Statisticians: Privacy, Error Rates, and Sample Size
Zhang, W., Mei, Y., & Cummings, R. (2022). Proceedings of Machine Learning Research, 151, 11356-11373.
Abstract: The sequential hypothesis testing problem is a class of statistical analyses where the sample size is not fixed in advance. Instead, the decision process takes in new observations sequentially to make real-time decisions for testing an alternative hypothesis against a null hypothesis until some stopping criterion is satisfied. In many common applications of sequential hypothesis testing, the data can be highly sensitive and may require privacy protection; for example, sequential hypothesis testing is used in clinical trials, where doctors sequentially collect data from patients and must determine when to stop recruiting patients and whether the treatment is effective. The field of differential privacy has been developed to offer data analysis tools with strong privacy guarantees, and has been commonly applied to machine learning and statistical tasks. In this work, we study the sequential hypothesis testing problem under a slight variant of differential privacy known as Renyi differential privacy. We present a new private algorithm based on Wald's Sequential Probability Ratio Test (SPRT) that also gives strong theoretical privacy guarantees. We provide theoretical analysis of statistical performance measured by Type I and Type II error as well as the expected sample size, and we empirically validate our theoretical results on several synthetic databases, showing that our algorithms also perform well in practice. Unlike previous work in private hypothesis testing, which focused only on the classical fixed-sample setting, our results in the sequential setting allow a conclusion to be reached much earlier, thus saving the cost of collecting additional samples.

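To fix ideas, a toy version of a noise-perturbed SPRT is sketched below. The Laplace noise here is a stand-in with no calibrated (Renyi) privacy guarantee; the paper's mechanism and analysis are more involved.

```python
# Toy noisy SPRT for H0: N(0,1) vs H1: N(1,1); the Laplace perturbation is
# illustrative only and is NOT a calibrated differential-privacy mechanism.
import numpy as np

rng = np.random.default_rng(2)
mu0, mu1 = 0.0, 1.0
alpha, beta = 0.05, 0.05
A = np.log((1 - beta) / alpha)       # Wald's upper threshold
B = np.log(beta / (1 - alpha))       # Wald's lower threshold

def noisy_sprt(data, noise_scale=0.5):
    llr = 0.0
    for t, x in enumerate(data, 1):
        llr += (mu1 - mu0) * x - 0.5 * (mu1**2 - mu0**2)  # Gaussian LLR increment
        noisy = llr + rng.laplace(scale=noise_scale)      # perturbed comparison
        if noisy >= A:
            return "reject H0", t
        if noisy <= B:
            return "accept H0", t
    return "undecided", len(data)

print(noisy_sprt(rng.normal(mu1, 1, 500)))   # data drawn from H1
print(noisy_sprt(rng.normal(mu0, 1, 500)))   # data drawn from H0
```
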
Rapid detection of hot-spots via tensor decomposition with applications to crime rate data
Zhao, Y., Yan, H., Holte, S., & Mei, Y. (2022). Journal of Applied Statistics, 49(7), 1636-1662.
Abstract: In many real-world applications of monitoring multivariate spatio-temporal data that are non-stationary over time, one is often interested in detecting hot-spots with spatial sparsity and temporal consistency, instead of detecting system-wide changes as in the traditional statistical process control (SPC) literature. In this paper, we propose an efficient method to detect hot-spots through tensor decomposition; our method has three steps. First, we fit the observed data to a Smooth Sparse Decomposition Tensor (SSD-Tensor) model that serves as a dimension reduction and de-noising technique: it is an additive model decomposing the original data into a smooth but non-stationary global mean, sparse local anomalies, and random noise. Next, we estimate the model parameters via a penalized framework that includes the Least Absolute Shrinkage and Selection Operator (LASSO) and fused LASSO penalties; an efficient recursive optimization algorithm is developed based on the Fast Iterative Shrinkage Thresholding Algorithm (FISTA). Finally, we apply a Cumulative Sum (CUSUM) control chart to monitor the model residuals after removing global means, which helps to detect when and where hot-spots occur. To demonstrate the usefulness of our proposed SSD-Tensor method, we compare it with several other methods, including scan statistics and LASSO-based, PCA-based, and T2-based control charts, in extensive numerical simulation studies and on a real crime rate dataset.

Robust change detection for large-scale data streams
Zhang, R., Mei, Y., & Shi, J. (2022). Sequential Analysis, 41(1), 1-19.
Abstract: Robust change-point detection for large-scale data streams has many real-world applications in industrial quality control, signal detection, and biosurveillance. Unfortunately, it is highly nontrivial to develop efficient schemes due to three challenges: (1) the unknown sparse subset of affected data streams, (2) unexpected outliers, and (3) computational scalability for real-time monitoring and detection. In this article, we develop a family of efficient real-time robust detection schemes for monitoring large-scale independent data streams. For each data stream, we propose to construct a new local robust detection statistic, the (Formula presented.)-CUSUM (cumulative sum) statistic, that can reduce the effect of outliers by using the Box-Cox transformation of the likelihood function. The global scheme then raises an alarm based upon the sum of shrinkage transformations of these local (Formula presented.)-CUSUM statistics, so as to filter out unaffected data streams. In addition, we propose a new concept, the false alarm breakdown point, to measure the robustness of online monitoring schemes, and a worst-case detection efficiency score to measure detection efficiency when the data contain outliers. We then characterize the breakdown point and the efficiency score of our proposed schemes. Asymptotic analysis and numerical simulations are conducted to illustrate their robustness and efficiency.

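A schematic of the two-layer idea, written from the abstract alone: a per-stream CUSUM driven by a Box-Cox-transformed likelihood ratio (here with a negative exponent, so each increment is bounded above), plus a global alarm based on the sum of shrinkage-transformed local statistics. All constants, and the exact form of the transformation, are illustrative guesses rather than the paper's choices.

```python
# Schematic robust CUSUM per stream plus a shrinkage-summed global statistic;
# the transformation exponent and thresholds are illustrative, not the paper's.
import numpy as np

def box_cox(lr, lam=-0.5):
    """Box-Cox transform of a likelihood ratio; lam = 0 recovers log(lr)."""
    return np.log(lr) if lam == 0 else (lr**lam - 1.0) / lam

def robust_cusum_path(x, mu1=1.0, lam=-0.5):
    """CUSUM recursion for pre-change N(0,1) vs post-change N(mu1,1).
    With lam < 0 the increment is bounded above by -1/lam, so a single
    gross outlier cannot blow up the statistic."""
    w, path = 0.0, []
    for xi in x:
        lr = np.exp(mu1 * xi - 0.5 * mu1**2)
        w = max(0.0, w + box_cox(lr, lam))
        path.append(w)
    return np.array(path)

rng = np.random.default_rng(3)
K, T, shift_at = 50, 300, 150
data = rng.standard_normal((K, T))
data[:3, shift_at:] += 1.0            # only 3 of 50 streams actually change
data[10, 40] = 25.0                   # gross outlier in an unaffected stream

W = np.stack([robust_cusum_path(row) for row in data])
excess = np.maximum(W - 2.0, 0.0)     # shrinkage filters out unaffected streams
alarm = np.flatnonzero(excess.sum(axis=0) > 10.0)
print("alarm time:", alarm[0] if alarm.size else None)
```
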
Robustness and Tractability for Nonconvex M-estimators
Zhang, R., Mei, Y., Shi, J., & Xu, H. (2022). Statistica Sinica, 32(3), 1295-1316.
Abstract: We investigate two important properties of M-estimators, namely robustness and tractability, in a linear regression setting where the observations are contaminated by arbitrary outliers. Specifically, robustness is the statistical property that the estimator should always be close to the true underlying parameters regardless of the distribution of the outliers, and tractability refers to the computational property that the estimator can be computed efficiently even if the objective function of the M-estimator is nonconvex. In this article, by examining the empirical risk, we show that under some sufficient conditions, many M-estimators enjoy nice robustness and tractability properties simultaneously when the percentage of outliers is small. We extend our analysis to the high-dimensional setting, where the number of parameters is greater than the number of samples, p ≫ n, and prove that when the proportion of outliers is small, penalized M-estimators with the L1 penalty enjoy robustness and tractability simultaneously. Our research provides an analytic approach to determine the effects of outliers and tuning parameters on the robustness and tractability of some families of M-estimators. Simulations and case studies are presented to illustrate the usefulness of our theoretical results for M-estimators under Welsch's exponential squared loss and Tukey's bisquare loss.

The Directional Bias Helps Stochastic Gradient Descent to Generalize in Kernel Regression Models
Luo, Y., Huo, X., & Mei, Y. (2022), pp. 678-683.
Abstract: We study the Stochastic Gradient Descent (SGD) algorithm in nonparametric statistics, kernel regression in particular. The directional bias property of SGD, which is known in the linear regression setting, is generalized to kernel regression. More specifically, we prove that SGD with a moderate and annealing step size converges along the direction of the eigenvector that corresponds to the largest eigenvalue of the Gram matrix, whereas Gradient Descent (GD) with a moderate or small step size converges along the direction that corresponds to the smallest eigenvalue. These facts are referred to as the directional bias properties; they may explain how an SGD-computed estimator can have a smaller generalization error than a GD-computed estimator. The application of our theory is demonstrated by simulation studies and a case study based on the FashionMNIST dataset.

Treatment Effect Modeling for FTIR Signals Subject to Multiple Sources of Uncertainties
Tian, H., Wang, A., Chen, J., Jiang, X., Shi, J., Zhang, C., Mei, Y., & Wang, B. (2022). IEEE Transactions on Automation Science and Engineering, 19(2), 895-906.
Abstract: Fourier-transform infrared spectroscopy (FTIR) is a widely adopted technique for characterizing chemical composition in many physical and chemical analyses. However, FTIR spectra are subject to multiple sources of uncertainty, so their analysis relies on domain experts and can lead only to qualitative conclusions. This study aims to analyze the effect of a treatment on FTIR spectra subject to two commonly observed uncertainties: offset shift and multiplicative error. Based on a physical understanding of these uncertainties, the pre-exposure FTIR spectra are modeled as translations and stretchings of an underlying template signal, and the post-exposure FTIR spectra are modeled as the translated and stretched template signal plus an extra functional treatment effect. To provide engineering interpretation, the treatment effect is modeled as the product of a pattern of modification and its corresponding magnitude. A two-step parameter estimation algorithm is developed to estimate the underlying template signal, the pattern of modification, and the magnitude of modification at various treatment strengths. The effectiveness of the proposed method is validated in a simulation study. Further, in a real case study, the proposed method is used to investigate the effect of plasma exposure on FTIR spectra. The proposed method effectively identifies the pattern of modification under uncertainties in the manufacturing environment, which matches the knowledge of the chemical components affected by the plasma treatment, and the recovered magnitude of modification provides guidance in selecting the control parameter of the plasma treatment. Note to Practitioners: An FTIR spectrometer is often used to characterize the surface chemical composition of a material. Due to the large uncertainties associated with the nature of the spectrometer and the measurement environment, FTIR signals are usually examined visually by experienced engineers and technicians in industrial applications, which can be both time-consuming and inaccurate. To understand the effect of plasma exposure on the surface property of carbon fiber reinforced polymer (CFRP) material, the elimination of uncertainties associated with FTIR signals is investigated, and a systematic method is proposed to quantify the effect of surface treatments on FTIR signals. A two-step analytic procedure is proposed, which provides information on how the plasma exposure distorts the FTIR signals and how the plasma distance relates to the magnitude of the distortion. The methodology in this article can be used to analyze treatment effects on a variety of spectroscopic measurements that are subject to uncertainties such as offset and scaling errors, which expands the applications of in situ handheld spectrometer metrology in manufacturing industries.

A boosting inspired personalized threshold method for sepsis screening
Feng, C., Griffin, P., Kethireddy, S., & Mei, Y. (2021). Journal of Applied Statistics, 48(1), 154-175.
Abstract: Sepsis is one of the biggest risks to patient safety, with a natural mortality rate between 25% and 50%. It is difficult to diagnose, and no validated standard for diagnosis currently exists. A commonly used scoring criterion is the quick sequential organ failure assessment (qSOFA); it demonstrates very low specificity in ICU populations, however. We develop a method to personalize thresholds in qSOFA that incorporates easily measured patient baseline characteristics. We compare the personalized threshold method to qSOFA, to five previously published methods that obtain an optimal constant threshold for a single biomarker, and to machine learning algorithms based on logistic regression and AdaBoost, using patient data from the MIMIC-III database. The personalized threshold method achieves higher accuracy than qSOFA and the five published methods and has comparable performance to the machine learning methods. Personalized thresholds, however, are much easier to adopt in real-life monitoring than machine learning methods: they are computed once for a patient and used in the same way as qSOFA, whereas the machine learning methods are hard to implement and interpret.

Aneurysmal Subarachnoid Hemorrhage: Trends, Outcomes, and Predictions from a 15-Year Perspective of a Single Neurocritical Care Unit
Samuels, O. B., Sadan, O., Feng, C., Martin, K., Medani, K., Mei, Y., & Barrow, D. L. (2021). Neurosurgery, 88(3), 574-583.
Abstract: BACKGROUND: Aneurysmal subarachnoid hemorrhage (aSAH) is associated with disproportionately high mortality and long-term neurological sequelae. Management of patients with aSAH has changed markedly over the years, leading to improvements in outcome. OBJECTIVE: To describe trends in aSAH care and outcome in a high-volume single-center 15-yr cohort. METHODS: All new admissions diagnosed with subarachnoid hemorrhage (SAH) to our tertiary neuro-intensive care unit between 2002 and 2016 were reviewed. Trend analysis was performed to assess temporal changes, and a step-wise regression analysis was done to identify factors associated with outcomes. RESULTS: Of 3970 admissions of patients with SAH, 2475 proved to have a ruptured intracranial aneurysm. Over the years of the study, patient acuity increased by Hunt & Hess (H&H) grade and related complications. Endovascular therapies became more prevalent over the years and were correlated with better outcome. Functional outcome improved overall, yet the main effect was noted in the low- and intermediate-grade patients. Several parameters were associated with poor functional outcome, including long-term mechanical ventilation (odds ratio 11.99, 95% CI [7.15-20.63]), acute kidney injury (3.55 [1.64-8.24]), pneumonia (2.89 [1.89-4.42]), hydrocephalus (1.80 [1.24-2.63]), diabetes mellitus (1.71 [1.04-2.84]), seizures (1.69 [1.07-2.70]), H&H grade (1.67 [1.45-1.94]), and age (1.06 [1.05-1.07]), while an endovascular approach to treat the aneurysm, compared with clip-ligation, had a positive effect (0.35 [0.25-0.48]). CONCLUSION: This large, single referral center, retrospective analysis reveals important trends in the treatment of aSAH. It also demonstrates that despite improvement in functional outcome over the years, systemic complications remain a significant risk factor for poor prognosis. The historic H&H determination of outcome is less valid with today's improved care.

Correlation-based dynamic sampling for online high dimensional process monitoring
Nabhan, M., Mei, Y., & Shi, J. (2021). Journal of Quality Technology, 53(3), 289-308.
Abstract: Effective process monitoring of high-dimensional data streams with embedded spatial structures has been an arising challenge for environments with limited resources. Utilizing the spatial structure is key to improving monitoring performance. This article proposes a correlation-based dynamic sampling technique for change detection. Our method borrows the idea of the Upper Confidence Bound algorithm and uses the correlation structure not only to calculate a global statistic, but also to infer unobserved sensors from partial observations. Simulation studies and two case studies, on solar flare detection and on carbon nanotube (CNT) buckypaper process monitoring, are used to validate the effectiveness of our method.

Creation of a Pediatric Choledocholithiasis Prediction Model
Cohen, R. Z., Tian, H., Sauer, C. G., Willingham, F. F., Santore, M. T., Mei, Y., & Freeman, A. J. (2021). Journal of Pediatric Gastroenterology and Nutrition, 73(5), 636-641.
Abstract: Background: Definitive non-invasive detection of pediatric choledocholithiasis could allow more efficient identification of those patients who are most likely to benefit from therapeutic endoscopic retrograde cholangiopancreatography (ERCP) for stone extraction. Objective: To craft a pediatric choledocholithiasis prediction model using a combination of commonly available serum laboratory values and ultrasound results. Methods: Laboratory and imaging results from 316 pediatric patients who underwent intraoperative cholangiogram or ERCP due to suspicion of choledocholithiasis were retrospectively reviewed and compared to the presence of common bile duct stones on cholangiography. Multivariate logistic regression with supervised machine learning was used to create a predictive scoring model. Monte-Carlo cross-validation was used to validate the scoring model, and a score threshold that would provide at least 90% specificity for choledocholithiasis was determined in an effort to minimize non-therapeutic ERCP. Results: Alanine aminotransferase (ALT), total bilirubin, alkaline phosphatase, and common bile duct diameter via ultrasound were found to be the key clinical variables determining the likelihood of choledocholithiasis. The dictated specificity threshold of 90.3% yielded a sensitivity of 40.8% and overall accuracy of 71.5% in detecting choledocholithiasis. Positive predictive value was 71.4% and negative predictive value was 72.1%. Conclusion: Our novel pediatric choledocholithiasis predictive model is a highly specific tool to suggest ERCP in the setting of likely choledocholithiasis.

Editorial: Mathematical Fundamentals of Machine Learning
Glickenstein, D., Hamm, K., Huo, X., Mei, Y., & Stoll, M. (2021). Frontiers in Applied Mathematics and Statistics, 7.

Multi-Stream Quickest Detection with Unknown Post-Change Parameters under Sampling Control
Abstract: The multi-stream quickest detection problem with unknown post-change parameters is studied under a sampling control constraint: there are M local processes in a system, but one is able to take observations from only one of these M local processes at each time instant. The objective is to raise a correct alarm as quickly as possible once the change occurs, subject to both false alarm and sampling control constraints. We propose an efficient myopic-sampling-based quickest detection algorithm under the sampling control constraint, and show that it is asymptotically optimal in the sense of minimizing the detection delay in our context when the number M of processes is fixed. Simulation studies are conducted to validate our theoretical results.

Nonparametric monitoring of multivariate data via KNN learning
Li, W., Zhang, C., Tsung, F., & Mei, Y. (2021). International Journal of Production Research, 59(20), 6311-6326.
Abstract: Process monitoring of multivariate quality attributes is important in many industrial applications, in which rich historical data are often available thanks to modern sensing technologies. While multivariate statistical process control (SPC) has been receiving increasing attention, existing methods are often inadequate, as they are sensitive to the parametric model assumptions of multivariate data. In this paper, we propose a novel nonparametric k-nearest-neighbours empirical cumulative sum (KNN-ECUSUM) control chart: a machine-learning-based black-box control chart for monitoring multivariate data that utilises extensive historical data under both in-control and out-of-control scenarios. Our proposed method uses the k-nearest-neighbours (KNN) algorithm for dimension reduction, transforming multivariate data into univariate data, and then applies the CUSUM procedure to monitor changes in the empirical distribution of the transformed univariate data. Extensive simulation studies and a real industrial example based on a disk monitoring system demonstrate the robustness and effectiveness of our proposed method.

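A stripped-down version of the two-stage idea, reconstructed from the abstract: reduce each multivariate observation to a univariate KNN distance score against an in-control reference sample, then run a one-sided CUSUM on that score. The paper's chart also exploits out-of-control historical data and an empirical-distribution-based statistic; everything below is a simplified assumption.

```python
# Simplified KNN score reduction plus CUSUM; the paper's KNN-ECUSUM chart
# is more elaborate than this sketch.
import numpy as np

rng = np.random.default_rng(4)
d, k = 10, 5
ref = rng.standard_normal((2000, d))          # in-control historical sample

def knn_score(x, ref, k):
    """Average distance from x to its k nearest in-control neighbours."""
    dist = np.linalg.norm(ref - x, axis=1)
    return np.sort(dist)[:k].mean()

# Calibrate the in-control distribution of the score
cal = np.array([knn_score(x, ref, k) for x in rng.standard_normal((500, d))])
target = np.quantile(cal, 0.75)               # reference value for the CUSUM

w, alarm = 0.0, None
for t in range(1, 301):
    x = rng.standard_normal(d) + (0.8 if t > 150 else 0.0)   # mean shift at t = 151
    w = max(0.0, w + knn_score(x, ref, k) - target)
    if w > 5.0 and alarm is None:
        alarm = t
print("alarm at t =", alarm)
```
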
Optimum Multi-Stream Sequential Change-Point Detection with Sampling Control
Xu, Q., Mei, Y., & Moustakides, G. V. (2021). IEEE Transactions on Information Theory, 67(11), 7627-7636.
Abstract: In multi-stream sequential change-point detection, it is assumed that there are M processes in a system and that at some unknown time, an occurring event changes the distribution of the samples of a particular process. In this article, we consider this problem under a sampling control constraint, where one is allowed to sample only a single process at each point in time. The objective is to raise an alarm as quickly as possible subject to a proper false alarm constraint. We show that under sampling control, a simple myopic-sampling-based sequential change-point detection strategy is second-order asymptotically optimal when the number M of processes is fixed. This means that the proposed detector, even sampling at a rate 1/M of the full rate, enjoys the same detection delay, up to some additive finite constant, as the optimal procedure. Simulation experiments corroborate our theoretical results.

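The myopic sampling rule is easy to state: maintain one CUSUM statistic per stream and, at each step, observe only the stream whose statistic is currently largest. The sketch below adds a small amount of forced exploration, which is my simplification rather than the paper's exact policy.

```python
# Minimal myopic-sampling sketch: only one stream is observed per time step.
import numpy as np

rng = np.random.default_rng(5)
M, T, mu1 = 10, 2000, 1.0
changed, change_at = 7, 500          # ground truth, unknown to the detector
W = np.zeros(M)                      # one CUSUM statistic per stream

alarm = None
for t in range(1, T + 1):
    # Myopic rule with occasional forced exploration (my addition)
    i = int(np.argmax(W)) if rng.random() > 0.1 else int(rng.integers(M))
    mean = mu1 if (i == changed and t > change_at) else 0.0
    x = rng.normal(mean, 1.0)                    # sample ONLY stream i at time t
    W[i] = max(0.0, W[i] + mu1 * x - 0.5 * mu1**2)   # CUSUM (LLR increment)
    if W.max() > 8.0:
        alarm = (t, int(np.argmax(W)))
        break
print("alarm (time, stream):", alarm)
```
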
Quantitation of lymphatic transport mechanism and barrier influences on lymph node-resident leukocyte access to lymph-borne macromolecules and drug delivery systems
Archer, P. A., Sestito, L. F., Manspeaker, M. P., O'Melia, M. J., Rohner, N. A., Schudel, A., Mei, Y., & Thomas, S. N. (2021). Drug Delivery and Translational Research, 11(6), 2328-2343.
Abstract: Lymph nodes (LNs) are tissues of the immune system that house leukocytes, making them targets of interest for a variety of therapeutic immunomodulation applications. However, achieving accumulation of a therapeutic in the LN does not guarantee equal access to all leukocyte subsets. LNs are structured to enable sampling of lymph draining from peripheral tissues in a highly spatiotemporally regulated fashion in order to facilitate optimal adaptive immune responses. This structure results in restricted nanoscale drug delivery carrier access to specific leukocyte targets within the LN parenchyma. Herein, a framework is presented to assess the manner in which lymph-derived macromolecules and particles are sampled in the LN, revealing new insights into how therapeutic strategies or drug delivery systems may be designed to improve access to dLN-resident leukocytes. This summary analysis of previous reports from our group assesses model nanoscale fluorescent tracer association with various leukocyte populations across relevant time periods post administration, studies the effects of the bioactive molecule NO on access of lymph-borne solutes to dLN leukocytes, and illustrates the benefits to leukocyte access afforded by lymphatic-targeted multistage drug delivery systems. Results reveal trends consistent with the consensus view of how lymph is sampled by LN leukocytes resulting from tissue structural barriers that regulate inter-LN transport, and demonstrate how novel engineered delivery systems may be designed to overcome these barriers to unlock the therapeutic potential of LN-resident cells as drug delivery targets.

Routine Use of Contrast on Admission Transthoracic Echocardiography for Heart Failure Reduces the Rate of Repeat Echocardiography during Index Admission
Lee, K. C., Liu, S., Callahan, P., Green, T., Jarrett, T., Cochran, J. D., Mei, Y., Mobasseri, S., Sayegh, H., Rangarajan, V., Flueckiger, P., & Vannan, M. A. (2021). Journal of the American Society of Echocardiography, 34(12), 1253-1261.e4.
Abstract: Background: The authors retrospectively evaluated the impact of ultrasound enhancing agent (UEA) use in the first transthoracic echocardiographic (TTE) examination, regardless of baseline image quality, on the number of repeat TTEs and length of stay (LOS) during a heart failure (HF) admission. Methods: There were 9,115 HF admissions associated with admission TTE examinations over a 4-year period (5,337 men; mean age, 67.6 ± 15.0 years). Patients were grouped into those who received UEAs (contrast group) in the first TTE study and those who did not (noncontrast group). Repeat TTE examinations were classified as justified if performed for concrete clinical indications during hospitalization. Results: In the 9,115 admissions for HF (5,600 in the contrast group, 3,515 in the noncontrast group), 927 patients underwent repeat TTE studies (505 in the contrast group, 422 in the noncontrast group), which were considered justified in 823 patients. Of the 104 patients who underwent unjustified repeat TTE studies, 80 (76.7%) belonged to the noncontrast group and 24 to the contrast group. Also, UEA use increased from 50.4% in 2014 to 74.3%, and the rate of unjustified repeat studies decreased from 1.3% to 0.9%. The rates of unjustified repeat TTE imaging were 2.3% and 0.4% in the noncontrast and contrast groups, respectively, and patients in the contrast group were less likely to undergo unjustified repeat examinations (odds ratio, 0.18; 95% CI, 0.12–0.29; P 6 days) LOS. Conclusions: The routine use of UEA in the first TTE examination for HF, irrespective of image quality, is associated with reduced unjustified repeat TTE testing and may reduce LOS during an index HF admission.

Single and multiple change-point detection with differential privacy
Zhang, W., Krehbiel, S., Tuo, R., Mei, Y., & Cummings, R. (2021). Journal of Machine Learning Research, 22.
Abstract: The change-point detection problem seeks to identify distributional changes at an unknown change-point k* in a stream of data. This problem appears in many important practical settings involving personal data, including biosurveillance, fault detection, finance, signal detection, and security systems. The field of differential privacy offers data analysis tools that provide powerful worst-case privacy guarantees. We study the statistical problem of change-point detection through the lens of differential privacy. We give private algorithms for both online and offline change-point detection, analyze these algorithms theoretically, and provide empirical validation of our results.

Glucose Variability as Measured by Inter-measurement Percentage Change is Predictive of In-patient Mortality in Aneurysmal Subarachnoid Hemorrhage
Sadan, O., Feng, C., Vidakovic, B., Mei, Y., Martin, K., Samuels, O., & Hall, C. L. (2020). Neurocritical Care, 33(2), 458-467.
Abstract: Background: Critically ill aneurysmal subarachnoid hemorrhage (aSAH) patients suffer from systemic complications at a high rate. Hyperglycemia is a common intensive care unit (ICU) complication and has become a focus after aggressive glucose management was associated with improved ICU outcomes. Subsequent research has suggested that glucose variability, not a specific blood glucose range, may be a more appropriate clinical target. Glucose variability is highly correlated with poor outcomes in a wide spectrum of critically ill patients. Here, we investigate the changes between subsequent glucose values, termed the "inter-measurement difference," as an indicator of glucose variability and its association with outcomes in patients with aSAH. Methods: All SAH admissions to a single tertiary referral center between 2002 and 2016 were screened. All aneurysmal cases with more than 2 glucose measurements were included (n = 2451). We calculated several measures of variability, including simple variance, the average consecutive absolute change, average absolute change by time difference, within-subject variance, median absolute deviation, and average or median consecutive absolute percentage change. Predictor variables also included admission Hunt and Hess grade, age, gender, cardiovascular risk factors, and surgical treatment. In-patient mortality was the main outcome measure. Results: In a multiple regression analysis, nearly all forms of glucose variability calculations were found to be correlated with in-patient mortality. The consecutive absolute percentage change, however, was most predictive: OR 5.2 (95% CI 1.4-19.8) for percentage change and 8.8 (1.8-43.6) for median change, when controlling for the defined predictors. Survival to ICU discharge was associated with lower glucose variability (consecutive absolute percentage change 17% ± 9%) compared with the group that did not survive to discharge (20% ± 15%, p < 0.01). Interestingly, this finding was not significant in patients with pre-admission poorly controlled diabetes as indicated by HbA1c (OR 0.45 [0.04-7.18], by percentage change); the effect is driven mostly by non-diabetic patients or those with well-controlled diabetes. Conclusions: Reduced glucose variability is highly correlated with in-patient survival and long-term mortality in aSAH patients. This finding was observed in the non-diabetic and well-controlled diabetic patients, suggesting a possible benefit to personalized glucose targets based on baseline HbA1c and minimizing variability. The inter-measurement percentage change as an indicator of glucose variability is not only predictive of outcome, but is also an easy-to-use tool that could be implemented in future clinical trials.

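The variability measure itself is elementary arithmetic: for a glucose series g_1, ..., g_n, the consecutive absolute percentage change is |g_{t+1} - g_t| / g_t, averaged (or its median taken) over t, as in this small example with made-up values.

```python
# Consecutive absolute percentage change of a glucose series (illustrative values).
import numpy as np

glucose = np.array([142.0, 160.0, 131.0, 148.0, 175.0, 139.0])  # mg/dL, made up
pct_change = np.abs(np.diff(glucose)) / glucose[:-1]
print("mean   %.1f%%" % (100 * pct_change.mean()))
print("median %.1f%%" % (100 * np.median(pct_change)))
```
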
Improved performance properties of the CISPRT algorithm for distributed sequential detection
Liu, K., & Mei, Y. (2020). Signal Processing, 172.
Abstract: In distributed sequential detection problems, local sensors observe raw local observations over time and are allowed to communicate local information with their immediate neighborhood at each time step, so that the sensors can work together to make a quick but accurate decision when testing binary hypotheses on the true raw sensor distributions. One interesting algorithm is the Consensus-Innovation Sequential Probability Ratio Test (CISPRT) algorithm proposed by Sahu and Kar (IEEE Trans. Signal Process., 2016). In this article, we present improved finite-sample properties on the error probabilities and expected sample sizes of the CISPRT algorithm for Gaussian data in terms of network connectivity and, more importantly, derive its sharp first-order asymptotic properties in the classical asymptotic regime where the Type I and II error probabilities go to 0. The usefulness of our theoretical results is validated through numerical simulations.

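The consensus-innovation recursion underlying CISPRT can be sketched in a few lines: each sensor mixes its neighbors' running statistics through a doubly stochastic weight matrix and adds its own new log-likelihood ratio, stopping at Wald-style thresholds. The weights, thresholds, and ring topology below are illustrative assumptions.

```python
# Toy consensus-innovation SPRT on a ring network; weights and thresholds are
# illustrative, not the calibrated choices analyzed in the paper.
import numpy as np

rng = np.random.default_rng(6)
N, mu1 = 6, 0.5
# Doubly stochastic mixing matrix for a ring: self plus two neighbors
Wmix = np.zeros((N, N))
for i in range(N):
    Wmix[i, i] = 0.5
    Wmix[i, (i - 1) % N] = 0.25
    Wmix[i, (i + 1) % N] = 0.25

S = np.zeros(N)                      # one running statistic per sensor
A, B = 8.0, -8.0                     # Wald-style decision thresholds
for t in range(1, 1000):
    x = rng.normal(mu1, 1.0, N)      # all sensors observe (H1 is true here)
    llr = mu1 * x - 0.5 * mu1**2     # local log-likelihood-ratio innovations
    S = Wmix @ S + llr               # consensus (mixing) + innovation update
    if S.max() >= A or S.min() <= B:
        print("first sensor decides at t =", t)
        break
else:
    print("no decision within the horizon")
```
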
Rapid detection of hot-spot by tensor decomposition with application to weekly gonorrhea data
Zhao, Y., Yan, H., Holte, S. E., Kerani, R. P., & Mei, Y. (2020), pp. 289-310.
Abstract: In many bio-surveillance and healthcare applications, data sources are measured at many spatial locations repeatedly over time, say daily, weekly, or monthly. In these applications, we are typically interested in detecting hot-spots, defined as structured outliers that are sparse over the spatial domain but persistent over time. In this paper, we propose a tensor decomposition method to detect when and where hot-spots occur. Our proposed method represents the observed raw data as a three-dimensional tensor that includes a circular time dimension for daily/weekly/monthly patterns, and then decomposes the tensor into three components: a smooth global trend, local hot-spots, and residuals. A combination of LASSO and fused LASSO is used to estimate the model parameters, and a CUSUM procedure is applied to detect when and where the hot-spots might occur. The usefulness of our proposed methodology is validated through numerical simulation and a real-world dataset of weekly gonorrhea case counts from 2006 to 2018 for the 50 states of the United States.

Second-Order Asymptotically Optimal Change-point Detection Algorithm with Sampling Control
Xu, Q., Mei, Y., & Moustakides, G. V. (2020), pp. 1136-1140.
Abstract: In the sequential change-point detection problem for multi-stream data, it is assumed that there are M processes in a system and that at some unknown time, an occurring event impacts one unknown local process by changing the distribution of observations from that affected process. In this paper, we consider this problem under a sampling control constraint, in which one is able to take observations from only one of the local processes at each time step. Our objective is to design an adaptive sampling policy and a stopping time policy that together raise a correct alarm as quickly as possible, subject to false alarm and sampling control constraints. We develop an efficient sequential change-point detection algorithm under sampling control that turns out to be second-order asymptotically optimal relative to the full-data scenario. That is, with a sampling rate that is only 1/M of the full-data rate, our proposed algorithm attains the same performance, up to second order, as the optimal procedure under the full-data scenario.

Wavelet-Based Robust Estimation of Hurst Exponent with Application in Visual Impairment Classification
Feng, C., Mei, Y., & Vidakovic, B. (2020). Journal of Data Science, 18(4), 581-605.
Abstract: Pupillary response behavior (PRB) refers to changes in pupil diameter in response to simple or complex stimuli. There are underlying, unique patterns hidden within complex, high-frequency PRB data that can be utilized to classify visual impairment, but those patterns cannot be described by traditional summary statistics. For such complex high-frequency data, the Hurst exponent, as a measure of the long-term memory of a time series, becomes a powerful tool for detecting muted or irregular change patterns. In this paper, we propose robust estimators of the Hurst exponent based on non-decimated wavelet transforms. The properties of the proposed estimators are studied both theoretically and numerically. We apply our methods to PRB data to extract the Hurst exponent and then use it as a predictor to classify individuals with different degrees of visual impairment. Compared with other standard wavelet-based methods, our methods reduce the variance of the estimators and increase classification accuracy.
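For the general flavor of wavelet-based Hurst estimation, here is a rough log-spectrum sketch: regress the log2 energy of the detail coefficients on the scale index, whose slope for fractional Brownian motion is approximately 2H + 1. A median replaces the mean as a crude robustification, and the decimated transform (PyWavelets' wavedec) is used for brevity; the paper's non-decimated, robust estimators are more refined.

```python
# Rough wavelet log-spectrum estimate of the Hurst exponent; requires the
# PyWavelets package. Brownian motion is used as a test signal (true H = 0.5).
import numpy as np
import pywt

rng = np.random.default_rng(7)
signal = np.cumsum(rng.standard_normal(2**14))   # Brownian motion path

coeffs = pywt.wavedec(signal, "db2", level=8)
details = coeffs[1:]                             # [cD_8 (coarsest), ..., cD_1 (finest)]
levels = np.arange(len(details), 0, -1)          # scale indices 8, 7, ..., 1
log_energy = [np.log2(np.median(d**2)) for d in details]  # median as robust energy

slope = np.polyfit(levels, log_energy, 1)[0]     # approx. 2H + 1 for fBm
print("estimated H = %.2f" % ((slope - 1) / 2))
```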