Yajun Mei
Professor of Biostatistics
Professional overview
Yajun Mei is a Professor of Biostatistics at NYU/GPH, a position he has held since July 1, 2024. He received his B.S. in Mathematics from Peking University, Beijing, China, in 1996, and his Ph.D. in Mathematics, with a minor in Electrical Engineering, from the California Institute of Technology, Pasadena, CA, USA, in 2003. He was a postdoc in biostatistics at the renowned Fred Hutchinson Cancer Center in Seattle, WA from 2003 to 2005. Prior to joining NYU, Dr. Mei spent 18 years, from 2006 to 2024, in the H. Milton Stewart School of Industrial and Systems Engineering at the Georgia Institute of Technology, Atlanta, GA, rising from Assistant to Associate to Full Professor, and served from 2018 as a co-director of the Biostatistics, Epidemiology, and Research Design (BERD) program of the Georgia CTSA.
Dr. Mei’s research interests are statistics, machine learning, and data science, and their applications in biomedical science and public health; in particular, streaming data analysis, sequential decision/design, change-point problems, precision/personalized medicine, hot-spot detection for infectious diseases, longitudinal data analysis, bioinformatics, and clinical trials. His work has received several recognitions, including the Abraham Wald Prize in Sequential Analysis in both 2009 and 2024, an NSF CAREER Award in 2010, election as a Fellow of the American Statistical Association (ASA) in 2023, and multiple best paper awards.
Education
BS, Mathematics, Peking University
PhD, Mathematics, California Institute of Technology
Honors and awards
Fellow of American Statistical Association (2023)
Star Research Achievement Award, 2021 Virtual Critical Care Congress (2021)
Best Paper Competition Award, Quality, Statistics & Reliability of INFORMS (2020)
Bronze Snapshot Award, Society of Critical Care Medicine (2019)
NSF CAREER Award (2010)
Thank a Teacher Certificate, Center for Teaching and Learning (2011, 2012, 2016, 2020, 2021, 2022, 2023)
Abraham Wald Prize (2009)
Best Paper Award, 11th International Conference on Information Fusion (2008)
New Researcher Fellow, Statistical and Applied Mathematical Sciences Institute (2005)
Fred Hutchinson SPAC Travel Award to attend 2005 Joint Statistical Meetings, Minneapolis, MN (2005)
Travel Award to 8th New Researchers Conference, Minneapolis, MN (2005)
Travel Award to IEEE International Symposium on Information Theory, Chicago, IL (2004)
Travel Award to IPAM workshop on inverse problems, UCLA, Los Angeles, CA (2003)
Fred Hutchinson SPAC Course Scholarship (2003)
Travel Award to the SAMSI workshop on inverse problems, Research Triangle Park, NC (2002)
Publications
Adaptive Online Monitoring of the Ising model
Suh, N., Zhang, R., & Mei, Y. (2019). Pages 426-431.
Abstract: The Ising model is a general framework for capturing the dependency structure among random variables. It has many interesting real-world applications in the fields of medical imaging, genetics, disease surveillance, etc. Nonetheless, the literature on online change-point detection of the interaction parameter in the model is rather limited. This might be attributed to the following two challenges: 1) the exact evaluation of the likelihood function with the given data is computationally infeasible due to the presence of the partition function, and 2) the post-change parameter is usually unknown. In this paper, we overcome these two challenges via our proposed adaptive pseudo-CUSUM procedure, which incorporates the notion of the pseudo-likelihood function under the CUSUM framework. Asymptotic analysis, numerical simulation, and a case study corroborate the statistical efficiency and practicality of our proposed scheme.
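To make the underlying machinery concrete, here is a minimal Python sketch of the standard CUSUM recursion that the adaptive pseudo-CUSUM builds on; for simplicity it uses a Gaussian mean-shift log-likelihood ratio rather than the paper's Ising pseudo-likelihood, and the shift size and threshold are illustrative assumptions.

import numpy as np

def cusum_detect(x, llr, threshold):
    """Standard CUSUM recursion W_t = max(0, W_{t-1} + llr(x_t));
    return the first time W_t crosses the threshold, or None."""
    w = 0.0
    for t, xt in enumerate(x, start=1):
        w = max(0.0, w + llr(xt))
        if w >= threshold:
            return t
    return None

# Illustrative Gaussian mean shift: N(0,1) changes to N(1,1) at t = 101.
rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(0, 1, 100), rng.normal(1, 1, 100)])
llr = lambda v: v - 0.5  # log-likelihood ratio of N(1,1) vs N(0,1)
print("alarm at t =", cusum_detect(x, llr, threshold=5.0))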
Optimal Stopping for Interval Estimation in Bernoulli Trials
Yaacoub, T., Moustakides, G. V., & Mei, Y. (2019). IEEE Transactions on Information Theory, 65(5), 3022-3033.
Abstract: We propose an optimal sequential methodology for obtaining confidence intervals for a binomial proportion θ. Assuming that an independent and identically distributed sequence of Bernoulli(θ) trials is observed sequentially, we are interested in designing: 1) a stopping time T that will decide the best time to stop sampling the process and 2) an optimum estimator θ̂_T that will provide the optimum center of the interval estimate of θ. We follow a semi-Bayesian approach, where we assume that there exists a prior distribution for θ, and our goal is to minimize the average number of samples while we guarantee a minimal specified coverage probability level. The solution is obtained by applying standard optimal stopping theory and computing the optimum pair (T, θ̂_T) numerically. Regarding the optimum stopping time component T, we demonstrate that it enjoys certain very interesting characteristics not commonly encountered in solutions of other classical optimal stopping problems. In particular, we prove that, for a particular prior (beta density), the optimum stopping time is always bounded from above and below; it needs to first accumulate a sufficient amount of information before deciding whether or not to stop, and it will always terminate before some finite deterministic time. We also conjecture that these properties are present with any prior. Finally, we compare our method with the optimum fixed-sample-size procedure as well as with existing alternative sequential schemes.
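As a rough illustration of the semi-Bayesian stopping idea (not the paper's exact optimal-stopping solution, which is computed numerically), the sketch below stops once a Beta posterior places the required coverage on an interval of half-width h; the uniform Beta(1,1) prior and the half-width are assumptions.

import numpy as np
from scipy import stats

def semi_bayes_stop(x_stream, h=0.05, conf=0.95, a=1.0, b=1.0):
    """Stop as soon as the Beta(a+S, b+F) posterior puts mass >= conf on
    [mean - h, mean + h]; report (sample size, interval center)."""
    s = 0
    for n, xi in enumerate(x_stream, start=1):
        s += int(xi)
        post = stats.beta(a + s, b + n - s)
        c = post.mean()
        if post.cdf(c + h) - post.cdf(c - h) >= conf:
            return n, c
    return None

rng = np.random.default_rng(1)
print(semi_bayes_stop(rng.binomial(1, 0.3, size=5000)))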
Scalable sum-shrinkage schemes for distributed monitoring large-scale data streams
Liu, K., Zhang, R., & Mei, Y. (2019). Statistica Sinica, 29(1), 1-22.
Abstract: In this article, we investigate the problem of monitoring independent large-scale data streams where an undesired event may occur at some unknown time and affect only a few unknown data streams. Motivated by parallel and distributed computing, we propose to develop scalable global monitoring schemes by running local detection procedures in parallel and by using the sum of the shrinkage transformations of the local detection statistics as a global statistic to make a decision. The usefulness of our proposed SUM-Shrinkage approach is illustrated in an example of monitoring large-scale independent normally distributed data streams when the local post-change mean shifts are unknown and can be positive or negative.
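A minimal sketch of the SUM-Shrinkage idea, assuming Gaussian streams, a one-sided mean shift of assumed size delta, and illustrative choices of the soft-threshold level b and the global alarm threshold (the paper also covers two-sided shifts and principled threshold calibration):

import numpy as np

def monitor_sum_shrinkage(streams, delta=1.0, b=4.0, threshold=15.0):
    """streams: (T, K) array, one column per data stream. Run K local
    CUSUMs in parallel; the global statistic is the sum of the
    soft-thresholded local statistics, so unaffected streams (small
    CUSUM values) contribute nothing to the decision."""
    T, K = streams.shape
    w = np.zeros(K)
    for t in range(T):
        w = np.maximum(0.0, w + delta * streams[t] - delta**2 / 2)
        global_stat = np.sum(np.maximum(w - b, 0.0))  # soft shrinkage
        if global_stat >= threshold:
            return t + 1
    return None

rng = np.random.default_rng(2)
data = rng.normal(0, 1, (300, 100))
data[150:, :3] += 1.0  # change affects only 3 of 100 streams at t = 151
print("alarm at t =", monitor_sum_shrinkage(data))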
Tandem-width sequential confidence intervals for a Bernoulli proportion
Yaacoub, T., Goldsman, D., Mei, Y., & Moustakides, G. V. (2019). Sequential Analysis, 38(2), 163-183.
Abstract: We propose a two-stage sequential method for obtaining tandem-width confidence intervals for a Bernoulli proportion p. The term “tandem-width” refers to the fact that the half-width of the 100(1 - α)% confidence interval is not fixed beforehand; it is instead required to satisfy two different half-width upper bounds, h0 and h1, depending on the (unknown) values of p. To tackle this problem, we first propose a simple but useful sequential method for obtaining fixed-width confidence intervals for p, whose stopping rule is based on the minimax estimator of p. We observe Bernoulli(p) trials sequentially, and for some fixed half-width h = h0 or h1, we develop a stopping time T such that the resulting confidence interval for p, [p̂_T − h, p̂_T + h], covers the parameter with confidence at least 100(1 - α)%, where p̂_T is the maximum likelihood estimator of p at time T. Furthermore, we derive theoretical properties of our proposed fixed-width and tandem-width methods and compare their performances with existing alternative sequential schemes. The proposed minimax-based fixed-width method performs similarly to alternative fixed-width methods, while being easier to implement in practice. In addition, the proposed tandem-width method produces effective savings in sample size compared to the fixed-width counterpart and provides excellent results for scientists to use when no prior knowledge of p is available.
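The following hedged sketch illustrates a fixed-width stopping rule of the kind described above, assuming the classical minimax-style estimator p~ = (S + sqrt(n)/2)/(n + sqrt(n)) plugged into a normal-approximation half-width; the paper's exact rule and its guarantees are not reproduced here.

import numpy as np
from scipy import stats

def fixed_width_ci(x_stream, h, alpha=0.05):
    """Sample until the normal-approximation half-width, evaluated at the
    minimax-style estimator, drops below h; report the MLE-centered
    interval [p_hat - h, p_hat + h]."""
    z = stats.norm.ppf(1 - alpha / 2)
    s = 0
    for n, xi in enumerate(x_stream, start=1):
        s += int(xi)
        p_mm = (s + np.sqrt(n) / 2) / (n + np.sqrt(n))
        if z * np.sqrt(p_mm * (1 - p_mm) / n) <= h:
            p_hat = s / n  # the MLE centers the reported interval
            return n, (p_hat - h, p_hat + h)
    return None

rng = np.random.default_rng(3)
print(fixed_width_ci(rng.binomial(1, 0.25, size=200000), h=0.02))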
Asymptotic statistical properties of communication-efficient quickest detection schemes in sensor networks
Zhang, R., & Mei, Y. (2018). Sequential Analysis, 37(3), 375-396.
Abstract: The quickest change detection problem is studied in a general context of monitoring a large number K of data streams in sensor networks when the “trigger event” may affect different sensors differently. In particular, the occurring event might affect some unknown, but not necessarily all, sensors and could also have an immediate or delayed impact on those affected sensors. Motivated by censoring sensor networks, we develop scalable communication-efficient schemes based on the sum of those local cumulative sum (CUSUM) statistics that are “large” under either hard, soft, or order thresholding rules. Moreover, we provide the detection delay analysis of these communication-efficient schemes in the context of monitoring K independent data streams and establish their asymptotic statistical properties under two regimes: one is the classical asymptotic regime when the dimension K is fixed, and the other is the modern asymptotic regime when the dimension K goes to ∞. Our theoretical results illustrate the deep connections between communication efficiency and statistical efficiency.
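The three thresholding rules are easy to state in code; here is an illustrative sketch in which w holds the K local CUSUM statistics and the threshold b and count r are assumed tuning parameters:

import numpy as np

def global_statistics(w, b=3.0, r=5):
    """w: vector of local CUSUM statistics, one per sensor. Each rule
    aggregates only the 'large' local statistics, so sensors with small
    statistics need not transmit anything to the fusion center."""
    hard = np.sum(w[w >= b])               # hard thresholding
    soft = np.sum(np.maximum(w - b, 0.0))  # soft thresholding
    order = np.sum(np.sort(w)[-r:])        # order thresholding (top-r)
    return hard, soft, order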
Differentially private change-point detection
Cummings, R., Krehbiel, S., Mei, Y., Tuo, R., & Zhang, W. (2018). Advances in Neural Information Processing Systems, 2018-December, 10825-10834.
Abstract: The change-point detection problem seeks to identify distributional changes at an unknown change-point k in a stream of data. This problem appears in many important practical settings involving personal data, including biosurveillance, fault detection, finance, signal detection, and security systems. The field of differential privacy offers data analysis tools that provide powerful worst-case privacy guarantees. We study the statistical problem of change-point detection through the lens of differential privacy. We give private algorithms for both online and offline change-point detection, analyze these algorithms theoretically, and provide empirical validation of our results.
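As a loosely hedged sketch of how differential privacy can wrap an offline change-point statistic (the paper's algorithms and sensitivity analysis are more careful than this), the following applies the report-noisy-argmax pattern to partial log-likelihood-ratio sums, assuming each per-observation contribution is clipped to [-A, A]:

import numpy as np

def dp_offline_changepoint(x, llr, A, epsilon, seed=0):
    """Offline change-point estimate via V(k) = sum_{i>=k} llr(x_i),
    privatized by adding Laplace noise to each candidate and returning
    the noisy argmax. With contributions clipped to [-A, A], each V(k)
    moves by at most 2A when one record changes, so report-noisy-argmax
    uses Laplace noise of scale 2 * (2A) / epsilon."""
    contrib = np.clip([llr(xi) for xi in x], -A, A)
    v = np.cumsum(contrib[::-1])[::-1]  # V(k) for k = 0..n-1
    noise = np.random.default_rng(seed).laplace(scale=4 * A / epsilon,
                                                size=len(v))
    return int(np.argmax(v + noise))

rng = np.random.default_rng(4)
x = np.concatenate([np.zeros(500), np.ones(500)]) + rng.normal(0, 1, 1000)
# LLR of N(1,1) vs N(0,1), clipped inside the routine
print(dp_offline_changepoint(x, lambda v: v - 0.5, A=2.0, epsilon=1.0))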
Thresholded Multivariate Principal Component Analysis for Phase I Multichannel Profile Monitoring
Wang, Y., Mei, Y., & Paynabar, K. (2018). Technometrics, 60(3), 360-372.
Abstract: Monitoring multichannel profiles has important applications in manufacturing systems improvement, but it is nontrivial to develop efficient statistical methods because profiles are high-dimensional functional data with intrinsic inner- and interchannel correlations, and because the change might only affect a few unknown features of the multichannel profiles. To tackle these challenges, we propose a novel thresholded multivariate principal component analysis (PCA) method for multichannel profile monitoring. Our proposed method consists of two steps of dimension reduction: it first applies functional PCA to extract a reasonably large number of features under the in-control state, and then uses soft-thresholding techniques to further select significant features capturing profile information under the out-of-control state. The choice of the tuning parameter for soft-thresholding is provided based on asymptotic analysis, and extensive numerical studies are conducted to illustrate the efficacy of our proposed thresholded PCA methodology.
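A loose sketch of the two-step idea, assuming ordinary PCA on flattened profiles in place of the paper's functional PCA and an illustrative soft-threshold level d (the paper derives the tuning parameter from asymptotic analysis):

import numpy as np

def thresholded_pca_stat(profile, mean_ic, components, sdev, d=1.0):
    """Project a new profile onto the in-control PCs, standardize the
    scores, soft-threshold them, and sum the squares; only features that
    move noticeably out of control contribute to the statistic."""
    scores = components @ (profile - mean_ic)  # step 1: PCA features
    z = scores / sdev
    shrunk = np.sign(z) * np.maximum(np.abs(z) - d, 0.0)  # step 2
    return np.sum(shrunk**2)

rng = np.random.default_rng(5)
ic = rng.normal(0, 1, (200, 50))  # in-control training profiles
mu = ic.mean(axis=0)
U, s, Vt = np.linalg.svd(ic - mu, full_matrices=False)
comps, sd = Vt[:10], s[:10] / np.sqrt(len(ic) - 1)
print(thresholded_pca_stat(mu + rng.normal(0, 1, 50), mu, comps, sd))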
Precision in the specification of ordinary differential equations and parameter estimation in modeling biological processes
Abstract: In recent years, the use of differential equations to describe the dynamics of within-host viral infections, most frequently HIV-1 or Hepatitis B or C dynamics, has become quite common. The pioneering work described in [1,2,3,4] provided estimates of both the HIV-1 viral clearance rate, c, and the infected cell turnover rate, δ, and revealed that while it often takes years for HIV-1 infection to progress to AIDS, the virus is replicating rapidly and continuously throughout these years of apparent latent infection. In addition, at least two compartments of viral-producing cells that decay at different rates were identified. Estimates of infected cell decay and viral clearance rates dramatically changed the understanding of HIV replication, etiology, and pathogenesis. Since that time, models of this type have been used extensively to describe and predict both in vivo viral and/or immune system dynamics and the transmission of HIV throughout a population. However, there are both mathematical and statistical challenges associated with models of this type, and the goal of this chapter is to describe some of these as well as offer possible solutions or options. In particular, statistical aspects associated with parameter estimation, model comparison, and study design will be described. Although the models developed by Perelson et al. [3,4] are relatively simple and were developed nearly 20 years ago, these models will be used in this chapter to demonstrate concepts in a relatively simple setting. In the first section, a statistical approach for model comparison is described using the model developed in [4] as the null hypothesis model for formal statistical comparison to an alternative model. In the next section, the concept of the mathematical sensitivity matrix and its relationship to the Fisher information matrix (FIM) will be described and used to demonstrate how to evaluate parameter identifiability in ordinary differential equation (ODE) models. The next section demonstrates how to determine what types of additional data are required to address the problem of nonidentifiable parameters in ODE models. Examples are provided to demonstrate these concepts. The chapter ends with some recommendations.
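Since the chapter's identifiability discussion rests on the sensitivity matrix and the Fisher information matrix, here is a minimal finite-difference sketch, assuming i.i.d. Gaussian measurement error with standard deviation sigma and a user-supplied ODE solver; the one-compartment decay model at the end is hypothetical, echoing the viral-decay models discussed above.

import numpy as np

def fim_from_sensitivities(solve_ode, theta, t_obs, sigma=1.0, eps=1e-6):
    """Finite-difference sensitivity matrix S[i, j] = d y(t_i)/d theta_j,
    then FIM = S^T S / sigma^2. A near-singular (ill-conditioned) FIM
    flags non-identifiable parameters."""
    theta = np.asarray(theta, dtype=float)
    y0 = solve_ode(theta, t_obs)
    S = np.empty((len(t_obs), len(theta)))
    for j in range(len(theta)):
        th = theta.copy()
        th[j] += eps
        S[:, j] = (solve_ode(th, t_obs) - y0) / eps
    fim = S.T @ S / sigma**2
    print("FIM condition number:", np.linalg.cond(fim))
    return fim

# Hypothetical one-compartment decay y(t) = V0 * exp(-delta * t)
solve = lambda th, t: th[0] * np.exp(-th[1] * np.asarray(t))
fim_from_sensitivities(solve, [1000.0, 0.5], np.linspace(0.1, 10.0, 25))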
Search for evergreens in science: A functional data analysis
Zhang, R., Wang, J., & Mei, Y. (2017). Journal of Informetrics, 11(3), 629-644.
Abstract: Evergreens in science are papers that display a continual rise in annual citations without decline, at least within a sufficiently long time period. Aiming to better understand evergreens in particular and patterns of citation trajectories in general, this paper develops a functional data analysis method to cluster the citation trajectories of a sample of 1699 research papers published in 1980 in the American Physical Society (APS) journals. We propose a functional Poisson regression model for individual papers' citation trajectories, and fit the model to the observed 30-year citations of individual papers by functional principal component analysis and maximum likelihood estimation. Based on the estimated paper-specific coefficients, we apply the K-means clustering algorithm to cluster papers into different groups, in order to uncover general types of citation trajectories. The result demonstrates the existence of an evergreen cluster of papers that do not exhibit any decline in annual citations over 30 years.
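A rough stand-in for the pipeline described above, assuming a simple log-polynomial trend fit in place of the paper's functional Poisson regression with FPCA; the cluster count and polynomial degree are illustrative:

import numpy as np
from sklearn.cluster import KMeans

def cluster_trajectories(citations, n_clusters=4, degree=2):
    """citations: (papers, years) array of annual counts. Fit a
    log-polynomial trend per paper (a crude surrogate for the functional
    Poisson model), then k-means the fitted coefficients."""
    years = np.arange(citations.shape[1])
    X = np.vander(years, degree + 1)
    coefs = np.linalg.lstsq(X, np.log1p(citations).T, rcond=None)[0].T
    return KMeans(n_clusters=n_clusters, n_init=10).fit_predict(coefs)

rng = np.random.default_rng(6)
rising = rng.poisson(np.linspace(1, 20, 30), size=(100, 30))  # evergreen-like
fading = rng.poisson(np.linspace(10, 1, 30), size=(100, 30))  # declining
print(cluster_trajectories(np.vstack([rising, fading]), n_clusters=2)[:5])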
Sequential estimation based on conditional cost
Moustakides, G. V., Yaacoub, T., & Mei, Y. (2017). Pages 436-440.
Abstract: We consider the problem of parameter estimation under a sequential framework. Specifically, we assume that an i.i.d. random process is observed sequentially, with its common pdf having a random parameter that must be estimated. We are interested in designing a stopping time that will decide the best moment to stop sampling the process and an estimator that will use the acquired samples in order to provide the desired estimate. We follow a semi-Bayesian approach where we assign a cost to the pair (estimate, true parameter), and our goal is to minimize the average sample size while guaranteeing at the same time an average cost below some prescribed level. For our analysis we adopt a conditional average cost, which leads to a considerable simplification in the sequential estimation problem, otherwise known to be analytically intractable. We apply our results to a number of examples and compare our method with the optimum fixed-sample-size procedure as well as with existing sequential schemes.
Discussion on “Sequential detection/isolation of abrupt changes” by Igor V. Nikiforov
Liu, K., & Mei, Y. (2016). Sequential Analysis, 35(3), 316-319.
Abstract: In this interesting article, Professor Nikiforov reviewed the current state of the quickest change detection/isolation problem. In our discussion of his article, we focus on the concerns and the opportunities of the subfield of quickest change detection or, more generally, sequential methodologies, in the modern information age.
Effect of bivariate data's correlation on sequential tests of circular error probability
Li, Y., & Mei, Y. (2016). Journal of Statistical Planning and Inference, 171, 99-114.
Abstract: The problem of evaluating a military or GPS/GSM system's precision quality is considered in this article, where one sequentially observes bivariate normal data (Xi, Yi) and wants to test hypotheses on the circular error probability (CEP) or the probability of nonconforming, i.e., the probabilities of the system hitting or missing a pre-specified disk target. In such a problem, we first consider a sequential probability ratio test (SPRT) developed under the erroneous assumption of a correlation coefficient ρ = 0, and investigate its properties when the true ρ ≠ 0. It is shown that at least one of the Type I and Type II error probabilities will be larger than required if the true ρ ≠ 0, and for the detailed effects, exp(−2) ≈ 0.1353 turns out to be a critical value for the hypothesized probability of nonconforming. Moreover, we propose several sequential tests for when the correlation coefficient ρ is unknown, and among these tests, the generalized sequential likelihood ratio test (GSLRT) of Bangdiwala (1982) seems to work well.
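For readers unfamiliar with the baseline procedure being stress-tested here, this is a minimal sketch of Wald's SPRT applied to the miss indicators, with illustrative hypothesized nonconforming probabilities; the paper's analysis of how a nonzero ρ distorts the error probabilities is not reproduced.

import numpy as np

def sprt_bernoulli(misses, p0, p1, alpha=0.05, beta=0.05):
    """Wald SPRT for H0: p = p0 vs H1: p = p1, where p is the probability
    of nonconforming and each observation indicates a missed target."""
    a, b = np.log(beta / (1 - alpha)), np.log((1 - beta) / alpha)
    llr = 0.0
    for n, miss in enumerate(misses, start=1):
        llr += np.log(p1 / p0) if miss else np.log((1 - p1) / (1 - p0))
        if llr <= a:
            return n, "accept H0"
        if llr >= b:
            return n, "accept H1"
    return len(misses), "no decision"

rng = np.random.default_rng(7)
print(sprt_bernoulli(rng.binomial(1, 0.05, size=2000), p0=0.05, p1=0.15))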
Symmetric directional false discovery rate control
Holte, S. E., Lee, E. K., & Mei, Y. (2016). Statistical Methodology, 33, 71-82.
Abstract: This research is motivated by the analysis of a real gene expression data set that aims to identify a subset of “interesting” or “significant” genes for further studies. When we blindly applied the standard false discovery rate (FDR) methods, our biology collaborators were suspicious or confused, as the selected list of significant genes was highly unbalanced: there were ten times more under-expressed genes than over-expressed genes. Their concerns led us to realize that the observed two-sample t-statistics were highly skewed and asymmetric, and thus the standard FDR methods might be inappropriate. To tackle this case, we propose a symmetric directional FDR control method that categorizes the genes into “over-expressed” and “under-expressed” genes, pairs “over-expressed” and “under-expressed” genes, defines the p-values for gene pairs via column permutations, and then applies the standard FDR method to select “significant” gene pairs instead of “significant” individual genes. We compare our proposed symmetric directional FDR method with the standard FDR method by applying them to simulated data and several well-known real data sets.
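The final step of the procedure is the standard FDR rule applied to pair-level p-values; a minimal sketch of that step (the Benjamini-Hochberg rule) is below, with the gene pairing and permutation p-values omitted as they are specific to the paper's construction.

import numpy as np

def bh(pvals, q=0.05):
    """Benjamini-Hochberg: find the largest k with p_(k) <= q*k/m and
    reject the k smallest p-values; returns a boolean rejection mask."""
    m = len(pvals)
    order = np.argsort(pvals)
    passed = np.nonzero(np.sort(pvals) <= q * np.arange(1, m + 1) / m)[0]
    keep = np.zeros(m, dtype=bool)
    if passed.size:
        keep[order[:passed[-1] + 1]] = True
    return keep

pv = np.array([0.001, 0.009, 0.04, 0.2, 0.7])
print(bh(pv, q=0.05))  # -> [ True  True False False False]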
An Adaptive Sampling Strategy for Online High-Dimensional Process Monitoring
Liu, K., Mei, Y., & Shi, J. (2015). Technometrics, 57(3), 305-319.
Abstract: Temporally and spatially dense data-rich environments provide unprecedented opportunities and challenges for effective process control. In this article, we propose a systematic and scalable adaptive sampling strategy for online high-dimensional process monitoring in the context of limited resources, with only partial information available at each acquisition time. The proposed adaptive sampling strategy covers a broad range of applications: (1) when only a limited number of sensors are available; (2) when only a limited number of sensors can be in the "ON" state in a fully deployed sensor network; and (3) when only partial data streams can be analyzed at the fusion center due to limited transmission and processing capabilities, even though the full data streams have been acquired remotely. A monitoring scheme using the sum of the top-r local CUSUM statistics is developed and named "TRAS" (top-r-based adaptive sampling); it is scalable and robust in detecting a wide range of possible mean shifts in all directions when each data stream follows a univariate normal distribution. Two properties of this proposed method are also investigated. Case studies are performed on a hot-forming process and a real solar flare process to illustrate and evaluate the performance of the proposed method.
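A paraphrased single-round sketch of the TRAS idea, assuming Gaussian streams, a one-sided shift of size delta, and an assumed compensation increment for unobserved streams (the paper treats two-sided shifts and gives guidance on all tuning parameters):

import numpy as np

def tras_step(w, observed_idx, x_obs, delta=1.0, compensation=0.1, r=5):
    """One TRAS round. Update CUSUMs only on the observed streams, give
    every unobserved stream a small compensation lift so it is not
    starved forever, then compute the top-r global statistic and choose
    the streams to sample next round."""
    w = w.copy()
    mask = np.zeros(len(w), dtype=bool)
    mask[observed_idx] = True
    w[mask] = np.maximum(0.0, w[mask] + delta * x_obs - delta**2 / 2)
    w[~mask] += compensation
    next_idx = np.argsort(w)[-r:]  # sample these streams next round
    return w, float(np.sum(w[next_idx])), next_idx

Each round, only the r streams flagged by the previous round need to be sampled, which is what makes the scheme workable when sensing resources are limited.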
Large-Scale Multi-Stream Quickest Change Detection via Shrinkage Post-Change Estimation
Wang, Y., & Mei, Y. (2015). IEEE Transactions on Information Theory, 61(12), 6926-6938.
Abstract: The quickest change detection problem is considered in the context of monitoring large-scale independent normally distributed data streams with possible changes in some of the means. It is assumed that for each individual local data stream, either there are no local changes, or there is a big local change that is larger than a pre-specified lower bound. Two different types of scenarios are studied: one is the sparse post-change case, when the unknown number of affected data streams is much smaller than the total number of data streams; the other is when all local data streams are affected simultaneously, although not necessarily identically. We propose a systematic approach to develop efficient global monitoring schemes for quickest change detection by combining hard thresholding with linear shrinkage estimators to estimate all post-change parameters simultaneously. Our theoretical analysis demonstrates that shrinkage estimation can balance the tradeoff between the first-order and second-order terms of the asymptotic expression for the detection delays, and our numerical simulation studies illustrate the usefulness of shrinkage estimation and the challenge of Monte Carlo simulation of the average run length to false alarm in the context of online monitoring of large-scale data streams.
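The estimation idea can be hedged into a few lines: hard-threshold the observed post-change averages to enforce sparsity, then shrink the survivors linearly; the paper's exact combination and its delay analysis differ in detail, so treat this purely as a sketch.

import numpy as np

def shrink_estimate(xbar_post, b=1.0, rho=0.9):
    """Hard-threshold the post-change sample means at b (sparsity), then
    shrink the survivors linearly by rho; the shrunk estimates are then
    plugged into the global detection statistic."""
    return np.where(np.abs(xbar_post) >= b, rho * xbar_post, 0.0)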
Quickest change detection and Kullback-Leibler divergence for two-state hidden Markov models
Abstract: The quickest change detection problem is studied in two-state hidden Markov models (HMM), where the vector parameter θ of the HMM may change from θ0 to θ1 at some unknown time, and one wants to detect the true change as quickly as possible while controlling the false alarm rate. It turns out that the generalized likelihood ratio (GLR) scheme, while theoretically straightforward, is generally computationally infeasible for the HMM. To develop efficient but computationally simple schemes for the HMM, we first show that the recursive CUSUM scheme proposed in Fuh (Ann. Statist., 2003) can be regarded as a quasi-GLR scheme for some suitable pseudo post-change hypotheses. Next, we extend the quasi-GLR idea to propose recursive score schemes in a more complicated scenario when the post-change parameter θ1 of the HMM involves a real-valued nuisance parameter. Finally, our research provides an alternative approach that can numerically compute the Kullback-Leibler (KL) divergence of two-state HMMs via the invariant probability measure and the Fredholm integral equation.
Quickest Change Detection and Kullback-Leibler Divergence for Two-State Hidden Markov Models
Fuh, C. D., & Mei, Y. (2015). IEEE Transactions on Signal Processing, 63(18), 4866-4878.
Abstract: In this paper, the quickest change detection problem is studied in two-state hidden Markov models (HMM), where the vector parameter θ of the HMM changes from θ0 to θ1 at some unknown time, and one wants to detect the true change as quickly as possible while controlling the false alarm rate. It turns out that the generalized likelihood ratio (GLR) scheme, while theoretically straightforward, is generally computationally infeasible for the HMM. To develop efficient but computationally simple schemes for the HMM, we first discuss a subtlety in the recursive form of the GLR scheme for the HMM. Then we show that the recursive CUSUM scheme proposed in Fuh (Ann. Statist., 2003) can be regarded as a quasi-GLR scheme for pseudo post-change hypotheses with a certain dependence structure between pre- and post-change observations. Next, we extend the quasi-GLR idea to propose recursive score schemes in the scenario when the post-change parameter θ1 of the HMM involves a real-valued nuisance parameter. Finally, the Kullback-Leibler (KL) divergence plays an essential role in the quickest change detection problem and many other fields; however, it is rather challenging to compute it numerically in HMMs. Here we develop a non-Monte Carlo method that computes the KL divergence of two-state HMMs via the underlying invariant probability measure, which is characterized by a Fredholm integral equation. A numerical study demonstrates an unusual property of the KL divergence for HMMs that implies the severe effects of misspecifying the post-change parameter for the HMM.
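The paper's non-Monte Carlo KL computation characterizes the invariant measure through a Fredholm integral equation; below is a generic Nyström (quadrature) solver for integral equations of that type, not the paper's specific kernel, with the test kernel chosen so the analytic solution is known.

import numpy as np

def nystrom_fredholm(kernel, f, a, b, n=200, lam=1.0):
    """Solve phi(x) = f(x) + lam * Int_a^b K(x, y) phi(y) dy on [a, b]
    by the Nystrom method with the trapezoidal rule."""
    x = np.linspace(a, b, n)
    w = np.full(n, (b - a) / (n - 1))
    w[0] *= 0.5
    w[-1] *= 0.5
    K = kernel(x[:, None], x[None, :])
    phi = np.linalg.solve(np.eye(n) - lam * K * w, f(x))
    return x, phi

# Test: K(x,y) = x*y, f(x) = x on [0,1] has solution phi(x) = 1.5 x
x, phi = nystrom_fredholm(lambda s, t: s * t, lambda s: s, 0.0, 1.0)
print(phi[-1])  # ~1.5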
Comment on "Quantifying long-term scientific impact"
Wang, J., Mei, Y., & Hicks, D. (2014). Science, 345(6193), 149b.
Abstract: Wang et al. (Reports, 4 October 2013, p. 127) claimed high prediction power for their model of citation dynamics. We replicate their analysis but find discouraging results: 14.75% of papers are estimated with unreasonably large μ (>5) and λ (>10) and correspondingly enormous prediction errors. The prediction power is even worse than simply using short-term citations to approximate long-term citations.
Online parallel monitoring via hard-thresholding post-change estimation
Abstract: The online parallel monitoring problem is studied when one is monitoring large-scale data streams, and an event occurs at an unknown time and affects an unknown subset of the data streams. Efficient online parallel monitoring schemes are developed by combining the standard sequential change-point method with hard-thresholding post-change estimation. Theoretical analysis and a simulation study demonstrate the usefulness of hard-thresholding for online parallel monitoring.
Discussion on "Change-Points: From Sequential Detection to Biology and Back" by David O. Siegmund
Mei, Y. (2013). Sequential Analysis, 32(1), 32-35.
Abstract: In his interesting paper, Professor Siegmund illustrates that the problem formulations and methodologies are generally transferable between off-line and on-line settings of change-point problems. In our discussion of his paper, we echo his thoughts with our own experiences.
Quantization effect on the log-likelihood ratio and its application to decentralized sequential detection
Wang, Y., & Mei, Y. (2013). IEEE Transactions on Signal Processing, 61(6), 1536-1543.
Abstract: It is well known that quantization cannot increase the Kullback-Leibler divergence, which can be thought of as the expected value or first moment of the log-likelihood ratio. In this paper, we investigate the quantization effects on the second moment of the log-likelihood ratio. It is shown via the convex domination technique that quantization may result in an increase in the case of the second moment, but the increase is bounded above by 2/e. The result is then applied to decentralized sequential detection problems, not only to provide simpler sufficient conditions for asymptotic optimality theories in the simplest models, but also to shed new light on more complicated models. In addition, some brief remarks on other higher-order moments of the log-likelihood ratio are also provided.
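The first-moment statement is the classical data-processing inequality, which is easy to verify numerically; the sketch below compares the KL divergence of a raw Gaussian observation with that of a binary quantization of it, using an assumed quantizer cut-point.

import numpy as np
from scipy import stats

# Raw observation: N(0,1) under H0 vs N(1,1) under H1.
mu0, mu1, c = 0.0, 1.0, 0.5          # c = assumed quantizer cut-point
kl_raw = 0.5 * (mu1 - mu0) ** 2      # KL(N(mu0,1) || N(mu1,1))
# Binary message 1{X > c}: Bernoulli(q0) under H0 vs Bernoulli(q1) under H1
q0, q1 = stats.norm.sf(c - mu0), stats.norm.sf(c - mu1)
kl_q = q0 * np.log(q0 / q1) + (1 - q0) * np.log((1 - q0) / (1 - q1))
print(kl_raw, kl_q)  # 0.5 vs ~0.31: quantization lowers the first moment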
A multistage procedure for decentralized sequential multi-hypothesis testing problems
Wang, Y., & Mei, Y. (2012). Sequential Analysis, 31(4), 505-527.
Abstract: We study the problem of sequentially testing M ≥ 2 hypotheses with a decentralized sensor network system. In such a system, the local sensors observe raw data and then send quantized observations to a fusion center, which makes a final decision regarding which hypothesis is true. Motivated by the two-stage tests in Wang and Mei (2011), we propose a multistage decentralized sequential test that provides multiple opportunities for the local sensors to adjust to the optimal local quantizers. It is demonstrated that when the hypothesis testing problem is asymmetric, the multistage test is second-order asymptotically optimal. Even though this result constitutes an interesting theoretical improvement over two-stage tests, which can enjoy only first-order asymptotic optimality, the corresponding practical merits seem to be only marginal. Indeed, performance gains over two-stage procedures with carefully selected thresholds are small.
Quantization effect on second moment of log-likelihood ratio and its application to decentralized sequential detection
Abstract: It is well known that quantization cannot increase the Kullback-Leibler divergence, which can be thought of as the expected value or first moment of the log-likelihood ratio. In this paper, we investigate the quantization effects on the second moment of the log-likelihood ratio. It is shown that quantization may result in an increase in the case of the second moment, but the increase is bounded above by 2/e. The result is then applied to decentralized sequential detection problems to provide a simpler sufficient condition for asymptotic optimality theory, and the technique is also extended to investigate the quantization effects on other higher-order moments of the log-likelihood ratio and provide lower bounds on those higher-order moments.
Asymptotic optimality theory for decentralized sequential multihypothesis testing problems
Wang, Y., & Mei, Y. (2011). IEEE Transactions on Information Theory, 57(10), 7068-7083.
Abstract: The Bayesian formulation of sequentially testing M ≥ 3 hypotheses is studied in the context of a decentralized sensor network system. In such a system, local sensors observe raw observations and send quantized sensor messages to a fusion center, which makes a final decision when it stops taking observations. Asymptotically optimal decentralized sequential tests are developed from a class of "two-stage" tests that allows the sensor network system to make a preliminary decision in the first stage and then optimize each local sensor quantizer accordingly in the second stage. It is shown that the optimal local quantizer at each local sensor in the second stage can be defined as a maximin quantizer, which turns out to be a randomization of at most M-1 unambiguous likelihood quantizers (ULQ). We first present our results in detail for the system with a single sensor and binary sensor messages, and then extend them to more general cases involving any finite-alphabet sensor messages, multiple sensors, or composite hypotheses.
Early detection of a change in Poisson rate after accounting for population size effects
Mei, Y., Han, S. W., & Tsui, K. L. (2011). Statistica Sinica, 21(2), 597-624.
Abstract: Motivated by applications in biosurveillance and syndromic surveillance, this article is concerned with the problem of detecting a change in the mean of Poisson distributions after taking into account the effects of population size. A family of generalized likelihood ratio (GLR) schemes is proposed, and its asymptotic optimality properties are established under the classical asymptotic setting. However, numerical simulation studies illustrate that the GLR schemes are at times not as efficient as two families of ad hoc schemes based on either weighted likelihood ratios or the adaptive threshold method, both of which adjust for the effects of population size. To explain this, a further asymptotic optimality analysis is developed under a new asymptotic setting that is more suitable for our finite-sample numerical simulations. In addition, we extend our approaches to a general setting with arbitrary probability distributions, as well as to the continuous-time setting involving multiplicative intensity models for Poisson processes, though further research is needed.
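A hedged sketch of a window-limited GLR-CUSUM for this setting, assuming counts X_t ~ Poisson(n_t λ) with populations n_t entering as offsets; the candidate grid, window length, and threshold are illustrative, and the paper's weighted-likelihood and adaptive-threshold variants are not shown.

import numpy as np

def poisson_glr_cusum(counts, pops, lam0, lam1_grid, threshold, window=50):
    """Window-limited GLR-CUSUM for X_t ~ Poisson(n_t * lam), detecting a
    rate change from lam0 to an unknown lam1 maximized over a grid; the
    per-observation log-likelihood ratio is
    x*log(lam/lam0) - n*(lam - lam0)."""
    T = len(counts)
    for t in range(1, T + 1):
        for k in range(max(0, t - window), t):
            x, n = counts[k:t], pops[k:t]
            stat = max(np.sum(x * np.log(lam / lam0) - n * (lam - lam0))
                       for lam in lam1_grid)
            if stat >= threshold:
                return t
    return None

rng = np.random.default_rng(8)
pops = rng.integers(500, 1500, size=200)
rate = np.where(np.arange(200) < 120, 0.01, 0.02)  # rate doubles at t=121
counts = rng.poisson(pops * rate)
print("alarm at t =", poisson_glr_cusum(counts, pops, 0.01,
                                        np.linspace(0.012, 0.03, 10), 10.0))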