1
|
Yang Q, Ji H, Xu Z, Li Y, Wang P, Sun J, Fan X, Zhang H, Lu H, Zhang Z. Ultra-fast and accurate electron ionization mass spectrum matching for compound identification with million-scale in-silico library. Nat Commun 2023; 14:3722. [PMID: 37349295 DOI: 10.1038/s41467-023-39279-7] [Citation(s) in RCA: 10] [Impact Index Per Article: 10.0] [Received: 07/08/2022] [Accepted: 06/07/2023] [Indexed: 06/24/2023] [Open Access]
Abstract
Spectrum matching is the most common method for compound identification in mass spectrometry (MS). However, several challenges limit its efficiency, including the coverage of spectral libraries and the accuracy and speed of matching. In this study, a million-scale in-silico EI-MS library is established. Furthermore, an ultra-fast and accurate spectrum matching (FastEI) method is proposed to substantially improve accuracy using Word2vec spectral embedding and boost the speed using the hierarchical navigable small-world graph (HNSW). It achieves 80.4% recall@10 accuracy (88.3% with a 5 Da mass filter) with a speedup of two orders of magnitude compared with the weighted cosine similarity method (WCS). When FastEI is applied to identify molecules beyond the NIST 2017 library, it achieves 50% recall@1 accuracy. FastEI is packaged as standalone, user-friendly software for users with limited computational backgrounds. Overall, FastEI combined with a million-scale in-silico library facilitates compound identification as an accurate and ultra-fast tool.
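The recall@k figures quoted in this abstract can be illustrated with a minimal sketch. It assumes spectra have already been turned into fixed-length vectors and ranks the library by plain cosine similarity with an exhaustive search; it does not reproduce the Word2vec embedding, the HNSW index, or the weighting used by WCS. The names `library`, `queries`, `top_k`, and `recall_at_k` are hypothetical.

```python
import math

def cosine(u, v):
    """Plain (unweighted) cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def top_k(query, library, k):
    """Return the k library keys most similar to the query (exhaustive search)."""
    ranked = sorted(library, key=lambda name: cosine(query, library[name]), reverse=True)
    return ranked[:k]

def recall_at_k(queries, library, k):
    """Fraction of queries whose true compound appears among the top k hits."""
    hits = sum(1 for true_name, vec in queries if true_name in top_k(vec, library, k))
    return hits / len(queries)

# Toy data (assumed for illustration only): three library vectors, two labelled queries.
library = {
    "A": [1.0, 0.0, 0.0],
    "B": [0.0, 1.0, 0.0],
    "C": [0.7, 0.7, 0.0],
}
queries = [("A", [0.9, 0.1, 0.0]), ("B", [0.6, 0.8, 0.0])]

r1 = recall_at_k(queries, library, 1)
r2 = recall_at_k(queries, library, 2)
```

In this toy case the second query ranks "C" ahead of its true match "B", so recall@1 is lower than recall@2. An approximate nearest-neighbour index would replace the `sorted` call in `top_k`.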
Affiliation(s)
- Qiong Yang
- College of Chemistry and Chemical Engineering, Central South University, Changsha, 410083, PR China
| | - Hongchao Ji
- Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, 518120, PR China
| | - Zhenbo Xu
- College of Chemistry and Chemical Engineering, Central South University, Changsha, 410083, PR China
| | - Yiming Li
- College of Chemistry and Chemical Engineering, Central South University, Changsha, 410083, PR China
| | - Pingshan Wang
- College of Chemistry and Chemical Engineering, Central South University, Changsha, 410083, PR China
| | - Jinyu Sun
- College of Chemistry and Chemical Engineering, Central South University, Changsha, 410083, PR China
| | - Xiaqiong Fan
- College of Chemistry and Chemical Engineering, Central South University, Changsha, 410083, PR China
| | - Hailiang Zhang
- College of Chemistry and Chemical Engineering, Central South University, Changsha, 410083, PR China
| | - Hongmei Lu
- College of Chemistry and Chemical Engineering, Central South University, Changsha, 410083, PR China
| | - Zhimin Zhang
- College of Chemistry and Chemical Engineering, Central South University, Changsha, 410083, PR China
| |
|
2
|
Kou J, Walther G. Large-scale inference with block structure. Ann Stat 2022. [DOI: 10.1214/21-aos2162] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/19/2022]
|
3
|
Affiliation(s)
- Jiyao Kou
- Department of Statistics, Stanford University, Stanford, CA, USA
| |
|
4
|
Affiliation(s)
- Ali Abolhassani
- Department of Mathematics, Azarbaijan Shahid Madani University, Tabriz, Iran
| | - Marcos O. Prates
- Department of Statistics, Universidade Federal de Minas Gerais, Belo Horizonte, Brazil
| |
|
5
|
Allévius B, Höhle M. An unconditional space–time scan statistic for ZIP‐distributed data. Scand Stat Theory Appl 2018. [DOI: 10.1111/sjos.12341] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Indexed: 11/29/2022]
Affiliation(s)
- Benjamin Allévius
- Department of Mathematics, Stockholm University, 106 91 Stockholm, Sweden
| | - Michael Höhle
- Department of Mathematics, Stockholm University, 106 91 Stockholm, Sweden
| |
|
6
|
Han J, Zhu L, Kulldorff M, Hostovich S, Stinchcomb DG, Tatalovich Z, Lewis DR, Feuer EJ. Using Gini coefficient to determining optimal cluster reporting sizes for spatial scan statistics. Int J Health Geogr 2016; 15:27. [PMID: 27488416 PMCID: PMC4971627 DOI: 10.1186/s12942-016-0056-6] [Citation(s) in RCA: 73] [Impact Index Per Article: 9.1] [Received: 06/15/2016] [Accepted: 07/20/2016] [Indexed: 11/10/2022] [Open Access]
Abstract
BACKGROUND Spatial and space-time scan statistics are widely used in disease surveillance to identify geographical areas of elevated disease risk and for the early detection of disease outbreaks. With a scan statistic, a scanning window of variable location and size moves across the map to evaluate thousands of overlapping windows as potential clusters, adjusting for the multiple testing. Almost always, the method will find many very similar overlapping clusters, and it is not useful to report all of them. This paper proposes to use the Gini coefficient to help select which of the many overlapping clusters to report. METHODS The Gini coefficient provides a quick and intuitive way to evaluate the degree of heterogeneity of the collection of clusters, which is useful to explain how well the cluster collection reveals the underlying true cluster patterns. Using simulation studies and real cancer mortality data, it is compared with the traditional approach for reporting non-overlapping clusters. RESULTS The Gini coefficient can identify a more refined collection of non-overlapping clusters to report. For example, it is able to determine when it makes more sense to report a collection of smaller non-overlapping clusters versus a single large cluster containing all of them. It also fulfils a set of desirable theoretical properties, such as being invariant under a uniform multiplication of the population numbers by the same constant. CONCLUSIONS The Gini coefficient can be used to determine which set of non-overlapping clusters to report. It has been implemented in the free SaTScan™ software version 9.3 (www.satscan.org).
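As a rough illustration of the quantity this abstract relies on, the sketch below computes the textbook Gini coefficient of a list of non-negative values. It is not the cluster-based construction used in the paper or in SaTScan, whose exact definition is not given here; the function name `gini` and the sample inputs are assumptions for illustration.

```python
def gini(values):
    """Textbook Gini coefficient of non-negative values (0 = perfectly even)."""
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        return 0.0
    # Rank-weighted sum over the ascending values, ranks starting at 1.
    weighted = sum(rank * x for rank, x in enumerate(xs, start=1))
    return 2.0 * weighted / (n * total) - (n + 1) / n

even = gini([1, 1, 1, 1])          # all values equal
concentrated = gini([0, 0, 0, 4])  # everything in one value
scaled_a = gini([1, 2, 3])
scaled_b = gini([10, 20, 30])      # same shape, multiplied by a constant
```

Multiplying every input by the same constant leaves this generic coefficient unchanged, which is loosely analogous to the invariance property mentioned in the abstract.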
Affiliation(s)
- Junhee Han
- Division of Biostatistics, Research Institute of Convergence for Biomedical Science and Technology, Pusan National University Yangsan Hospital, Pusan, Korea
| | - Li Zhu
- Surveillance Research Program, Division of Cancer Control and Population Sciences, National Cancer Institute, National Institutes of Health, Bethesda, MD 20892 USA
| | - Martin Kulldorff
- Brigham and Women’s Hospital and Harvard Medical School, Boston, MA USA
| | | | | | - Zaria Tatalovich
- Surveillance Research Program, Division of Cancer Control and Population Sciences, National Cancer Institute, National Institutes of Health, Bethesda, MD 20892 USA
| | - Denise Riedel Lewis
- Surveillance Research Program, Division of Cancer Control and Population Sciences, National Cancer Institute, National Institutes of Health, Bethesda, MD 20892 USA
| | - Eric J. Feuer
- Surveillance Research Program, Division of Cancer Control and Population Sciences, National Cancer Institute, National Institutes of Health, Bethesda, MD 20892 USA
| |
|
7
|
Xu J, Gangnon RE. Stepwise and stagewise approaches for spatial cluster detection. Spat Spatiotemporal Epidemiol 2016; 17:59-74. [PMID: 27246273 DOI: 10.1016/j.sste.2016.04.007] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.5] [Received: 04/08/2015] [Revised: 04/05/2016] [Accepted: 04/12/2016] [Indexed: 10/21/2022]
Abstract
Spatial cluster detection is an important tool in many areas such as sociology, botany and public health. Previous work has mostly taken either a hypothesis testing framework or a Bayesian framework. In this paper, we propose a few approaches under a frequentist variable selection framework for spatial cluster detection. The forward stepwise methods search for multiple clusters by iteratively adding the currently most likely cluster while adjusting for the effects of previously identified clusters. The stagewise methods also consist of a series of steps, but with a tiny step size in each iteration. We study the features and performance of our proposed methods using simulations on idealized grids or real geographic areas. From the simulations, we compare the performance of the proposed methods in terms of estimation accuracy and power. These methods are applied to the well-known New York leukemia data as well as Indiana poverty data.
Affiliation(s)
- Jiale Xu
- Department of Statistics, University of Wisconsin-Madison, Madison, WI 53706, United States.
| | - Ronald E Gangnon
- Department of Biostatistics and Medical Informatics and Department of Population Health Sciences, University of Wisconsin-Madison, Madison, WI 53726, United States.
| |
|
8
|
Shu L, Zhou R, Su Y. A self-adjusted weighted likelihood ratio test for global clustering of disease. J Stat Comput Sim 2016. [DOI: 10.1080/00949655.2015.1049604] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/23/2022]
|
9
|
Lin P, Kung Y, Clayton M. Spatial scan statistics for detection of multiple clusters with arbitrary shapes. Biometrics 2016; 72:1226-1234. [DOI: 10.1111/biom.12509] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.6] [Received: 09/01/2014] [Revised: 12/01/2015] [Accepted: 02/01/2016] [Indexed: 11/28/2022]
Affiliation(s)
- Pei‐Sheng Lin
- Division of Biostatistics and Bioinformatics, National Health Research Institutes, Taiwan
- Department of Mathematics, National Chung Cheng University, Taiwan
| | - Yi‐Hung Kung
- Division of Biostatistics and Bioinformatics, National Health Research Institutes, Taiwan
| | - Murray Clayton
- Department of Statistics, University of Wisconsin‐Madison, Wisconsin, U.S.A.
| |
|
10
|
Affiliation(s)
- Pei-Sheng Lin
- Division of Biostatistics and Bioinformatics, National Health Research Institutes
- Department of Mathematics, National Chung Cheng University
| |
|
11
|
Zhao X, Zhou XH, Feng Z, Guo P, He H, Zhang T, Duan L, Li X. A scan statistic for binary outcome based on hypergeometric probability model, with an application to detecting spatial clusters of Japanese encephalitis. PLoS One 2013; 8:e65419. [PMID: 23785424 PMCID: PMC3681795 DOI: 10.1371/journal.pone.0065419] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 04/02/2013] [Accepted: 04/24/2013] [Indexed: 11/29/2022] [Open Access]
Abstract
As a useful tool for geographical cluster detection of events, the spatial scan statistic is widely applied in many fields and plays an increasingly important role. The classic version of the spatial scan statistic for the binary outcome was developed by Kulldorff, based on the Bernoulli or the Poisson probability model. In this paper, we apply the Hypergeometric probability model to construct the likelihood function under the null hypothesis. Compared with existing methods, the likelihood function under the null hypothesis is an alternative and indirect method to identify the potential cluster, and the test statistic is the extreme value of the likelihood function. Similar to Kulldorff's methods, we adopt a Monte Carlo test for the test of significance. Both methods are applied for detecting spatial clusters of Japanese encephalitis in Sichuan province, China, in 2009, and the detected clusters are identical. A simulation on independent benchmark data indicates that the test statistic based on the Hypergeometric model outperforms Kulldorff's statistics for clusters of high population density or large size; otherwise Kulldorff's statistics are superior.
Affiliation(s)
- Xing Zhao
- Department of Biostatistics, West China School of Public Health, Sichuan University, Chengdu, Sichuan, China
- Department of Biostatistics, School of Public Health and Community Medicine, University of Washington, Seattle, Washington, United States of America
| | - Xiao-Hua Zhou
- Department of Biostatistics, School of Public Health and Community Medicine, University of Washington, Seattle, Washington, United States of America
| | - Zijian Feng
- Office for Disease Control and Emergency Response, Chinese Center for Disease Control and Prevention (China CDC), Beijing, China
| | - Pengfei Guo
- Department of Biostatistics, West China School of Public Health, Sichuan University, Chengdu, Sichuan, China
| | - Hongyan He
- Department of Biostatistics, West China School of Public Health, Sichuan University, Chengdu, Sichuan, China
| | - Tao Zhang
- Department of Biostatistics, West China School of Public Health, Sichuan University, Chengdu, Sichuan, China
| | - Lei Duan
- School of Computer Science, Sichuan University, Chengdu, Sichuan, China
- State Key Laboratory of Software Engineering, Wuhan University, Wuhan, Hubei, China
| | - Xiaosong Li
- Department of Biostatistics, West China School of Public Health, Sichuan University, Chengdu, Sichuan, China
| |
|
12
|
On latent process models in multi-dimensional space. Stat Probab Lett 2012. [DOI: 10.1016/j.spl.2012.03.022] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/28/2022]
|
13
|
Gangnon RE. Local multiplicity adjustment for the spatial scan statistic using the Gumbel distribution. Biometrics 2011; 68:174-82. [PMID: 21762118 DOI: 10.1111/j.1541-0420.2011.01643.x] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.7] [Indexed: 11/26/2022]
Abstract
The spatial scan statistic is an important and widely used tool for cluster detection. It is based on the simultaneous evaluation of the statistical significance of the maximum likelihood ratio test statistic over a large collection of potential clusters. In most cluster detection problems, there is variation in the extent of local multiplicity across the study region. For example, using a fixed maximum geographic radius for clusters, urban areas typically have many overlapping potential clusters, whereas rural areas have relatively few. The spatial scan statistic does not account for local multiplicity variation. We describe a previously proposed local multiplicity adjustment based on a nested Bonferroni correction and propose a novel adjustment based on a Gumbel distribution approximation to the distribution of a local scan statistic. We compare the performance of all three statistics in terms of power and a novel unbiased cluster detection criterion. These methods are then applied to the well-known New York leukemia dataset and a Wisconsin breast cancer incidence dataset.
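One way to picture a Gumbel approximation to the null distribution of a maximum-type statistic is sketched below. It assumes the Gumbel parameters are fitted by the method of moments to a set of null replicates (for example, Monte Carlo maxima); whether the paper fits its local statistics this way is not stated in the abstract, and the replicate values and function names here are hypothetical.

```python
import math
import statistics

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant

def gumbel_fit(samples):
    """Method-of-moments estimates (location mu, scale beta) for a Gumbel distribution."""
    mean = statistics.fmean(samples)
    sd = statistics.stdev(samples)
    beta = sd * math.sqrt(6.0) / math.pi
    mu = mean - EULER_GAMMA * beta
    return mu, beta

def gumbel_p_value(x, mu, beta):
    """Upper-tail probability P(X >= x) under a Gumbel(mu, beta) distribution."""
    return 1.0 - math.exp(-math.exp(-(x - mu) / beta))

# Hypothetical null replicates of a maximum log likelihood ratio.
null_maxima = [1.0, 2.0, 3.0, 4.0, 5.0]
mu, beta = gumbel_fit(null_maxima)
p_moderate = gumbel_p_value(3.0, mu, beta)
p_extreme = gumbel_p_value(10.0, mu, beta)
```

Unlike a rank-based Monte Carlo p-value, the fitted tail can return values smaller than one over the number of replicates.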
Affiliation(s)
- Ronald E Gangnon
- Department of Biostatistics and Medical Informatics, University of Wisconsin-Madison, Madison, Wisconsin 53726, USA.
| |
|
14
|
Gangnon RE. A model for space-time cluster detection using spatial clusters with flexible temporal risk patterns. Stat Med 2010; 29:2325-37. [PMID: 20564730 DOI: 10.1002/sim.3984] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.7] [Indexed: 11/06/2022]
Abstract
Maps of estimated disease rates over multiple time periods are useful tools for gaining etiologic insights regarding potential exposures associated with specific locations and times. In this paper, we describe an extension of the Gangnon-Clayton model for spatial clustering to spatio-temporal data. As in the purely spatial model, a large set of circular regions of varying radii centered at observed locations are considered as potential clusters, e.g. subregions with a different pattern of risk than the remainder of the study region. Within the spatio-temporal model, no specific parametric form is imposed on the temporal pattern of risk within each cluster. In addition to the clusters, the proposed model incorporates spatial and spatio-temporal heterogeneity effects and can readily accommodate regional covariates. Inference is performed in a Bayesian framework using MCMC. Although formal inferences about the number of clusters could be obtained using a reversible jump MCMC algorithm, we use local Bayes factors from models with a fixed, but overly large, number of clusters to draw inferences about both the number and the locations of the clusters. We illustrate the approach with two applications of the model to data on female breast cancer mortality in Japan and evaluate its operating characteristics in a simulation study.
Affiliation(s)
- Ronald E Gangnon
- Departments of Biostatistics and Medical Informatics, University of Wisconsin-Madison, Madison, WI 53726, USA.
| |
|
15
|
Gangnon RE. Local multiplicity adjustments for spatial cluster detection. Environmental and Ecological Statistics 2010; 17:55-71. [PMID: 20485455 PMCID: PMC2871332 DOI: 10.1007/s10651-008-0101-0] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.5] [Indexed: 05/29/2023]
Abstract
The spatial scan statistic is a widely applied tool for cluster detection. The spatial scan statistic evaluates the significance of a series of potential circular clusters using Monte Carlo simulation to account for the multiplicity of comparisons. In most settings, the extent of the multiplicity problem varies across the study region. For example, urban areas typically have many overlapping clusters, while rural areas have few. The spatial scan statistic does not account for these local variations in the multiplicity problem. We propose two new spatially-varying multiplicity adjustments for spatial cluster detection, one based on a nested Bonferroni adjustment and one based on local averaging. Geographic variations in power for the spatial scan statistic and the two new statistics are explored through simulation studies, and the methods are applied to both the well-known New York leukemia data and data from a case-control study of breast cancer in Wisconsin.
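The procedure described in this abstract, scanning candidate windows and using Monte Carlo simulation to account for multiplicity, can be sketched on a toy example. The sketch assumes regions arranged on a line with contiguous intervals as windows (rather than circles on a map) and uses the Poisson log likelihood ratio commonly attributed to Kulldorff. The data, the `max_width` limit, and all function names are assumptions for illustration.

```python
import math
import random

def poisson_llr(c, e, total):
    """Log likelihood ratio for a window with c observed and e expected cases."""
    if c <= e:
        return 0.0  # only windows with elevated risk are of interest
    llr = c * math.log(c / e)
    if total - c > 0:
        llr += (total - c) * math.log((total - c) / (total - e))
    return llr

def max_llr(cases, expected, max_width):
    """Scan all contiguous windows up to max_width; return (best LLR, window)."""
    total = sum(cases)
    best, best_window = 0.0, None
    for start in range(len(cases)):
        for end in range(start + 1, min(start + max_width, len(cases)) + 1):
            llr = poisson_llr(sum(cases[start:end]), sum(expected[start:end]), total)
            if llr > best:
                best, best_window = llr, (start, end)
    return best, best_window

def monte_carlo_p(cases, population, max_width, n_sim=199, seed=0):
    """Monte Carlo p-value of the maximum LLR under population-proportional risk."""
    rng = random.Random(seed)
    total, pop_total = sum(cases), sum(population)
    expected = [total * p / pop_total for p in population]
    observed, window = max_llr(cases, expected, max_width)
    exceed = 0
    for _ in range(n_sim):
        # Redistribute all cases at random, proportionally to population.
        sim = [0] * len(population)
        for region in rng.choices(range(len(population)), weights=population, k=total):
            sim[region] += 1
        if max_llr(sim, expected, max_width)[0] >= observed:
            exceed += 1
    return observed, window, (exceed + 1) / (n_sim + 1)

# Toy data: ten equally sized regions with an excess in regions 4 and 5.
population = [100] * 10
cases = [2, 1, 3, 2, 12, 11, 2, 1, 3, 2]
observed, window, p_value = monte_carlo_p(cases, population, max_width=3)
```

Because the same maximum is recomputed for every simulated data set, the p-value reflects the whole collection of windows at once. That global adjustment is what the abstract argues does not adapt to local differences in the number of overlapping windows.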
Affiliation(s)
- Ronald E Gangnon
- Departments of Biostatistics and Medical Informatics and Population Health Sciences, 603 WARF Office Building, University of Wisconsin-Madison, 610 Walnut Street, Madison, WI 53726, USA.
| |
|
16
|
Abstract
This review examines the state of Bayesian thinking as Statistics in Medicine was launched in 1982, reflecting particularly on its applicability and uses in medical research. It then looks at each subsequent five-year epoch, with a focus on papers appearing in Statistics in Medicine, putting these in the context of major developments in Bayesian thinking and computation with reference to important books, landmark meetings and seminal papers. It charts the growth of Bayesian statistics as it is applied to medicine and makes predictions for the future. From sparse beginnings, where Bayesian statistics was barely mentioned, Bayesian statistics has now permeated all the major areas of medical statistics, including clinical trials, epidemiology, meta-analyses and evidence synthesis, spatial modelling, longitudinal modelling, survival modelling, molecular genetics and decision-making in respect of new technologies.
Affiliation(s)
- Deborah Ashby
- Wolfson Institute of Preventive Medicine, Barts and The London, Queen Mary's School of Medicine & Dentistry, University of London, Charterhouse Square, London EC1M 6BQ, UK.
| |
|
17
|
Christiansen LE, Andersen JS, Wegener HC, Madsen H. Spatial scan statistics using elliptic windows. Journal of Agricultural, Biological, and Environmental Statistics 2006. [DOI: 10.1198/108571106x154858] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.7] [Indexed: 11/21/2022]
|
18
|
Abstract
Many different methods have been proposed to test the spatial randomness of a point pattern adjusting for an inhomogeneous background population. These tests can be classified into cluster detection tests, concerned with the detection and inference of local clusters, and global clustering tests, which collect evidence for clustering throughout the study region. This paper is mainly concerned with global clustering tests. Some tests for spatial randomness are based on likelihoods, which include the spatial and space-time scan statistics with variable window size and Gangnon and Clayton's weighted average likelihood ratio tests. Both of these tests perform well compared to other tests for cluster detection and global clustering, respectively. In this study, we develop other likelihood-based tests for global clustering and we explore the use of different weight functions with these tests. The power of these tests is evaluated using simulated data sets and compared with existing methods.
Affiliation(s)
- Changhong Song
- Department of Statistics, University of Connecticut, Storrs, CT 06269, USA.
| | | |
|
19
|
Abstract
In this paper, we evaluate the usefulness of local Bayes factors as a tool for spatial cluster detection. In particular, we consider whether local Bayes factors from models with a fixed, but overly large number of clusters can consistently identify the evidence for clustering for a variety of prior specifications for the cluster locations. We also investigate the robustness of the local Bayes factor to the number of clusters included in the model. We explore the impacts of prior choice for cluster location and the number of clusters on posterior inference for disease rates. We conduct the comparison by analysing data on 1990 breast cancer incidence in Wisconsin.
Affiliation(s)
- Ronald E Gangnon
- Department of Biostatistics and Medical Informatics, University of Wisconsin-Madison, 610 N. Walnut Street, Madison, Wisconsin 53726, USA.
| |
|
20
|
Waller LA, Hill EG, Rudd RA. The geography of power: statistical performance of tests of clusters and clustering in heterogeneous populations. Stat Med 2006; 25:853-65. [PMID: 16453372 DOI: 10.1002/sim.2418] [Citation(s) in RCA: 38] [Impact Index Per Article: 2.1] [Indexed: 11/05/2022]
Abstract
Heterogeneous population densities complicate comparisons of statistical power between hypothesis tests evaluating spatial clusters or clustering of disease. Specifically, the location of a cluster within a heterogeneously distributed population at risk impacts power properties, complicating comparisons of tests, and allowing one to map spatial variations in statistical power for different tests. Such maps provide insight into the overall power of a particular test, and also indicate areas within the study area where tests are more or less likely to detect the same local increase in relative risk. While such maps are largely driven by local sample size, we also find differences due to features of the statistics themselves. We illustrate these concepts using two tests: Tango's index of clustering and the spatial scan statistic. Furthermore, assessments of the accuracy of the 'most likely cluster' involve not only statistical power, but also spatial accuracy in identifying the location of a true underlying cluster. We illustrate these concepts via induction of artificial clusters within the observed incidence of severe cardiac birth defects in Santa Clara County, CA in 1981.
Affiliation(s)
- Lance A Waller
- Department of Biostatistics, Rollins School of Public Health, Emory University, 1518 Clifton Road NE, Atlanta, GA 30322, USA.
| | | | | |
|
21
|
Song C, Kulldorff M. Tango's maximized excess events test with different weights. Int J Health Geogr 2005; 4:32. [PMID: 16356179 PMCID: PMC1343587 DOI: 10.1186/1476-072x-4-32] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.7] [Received: 07/25/2005] [Accepted: 12/15/2005] [Indexed: 11/26/2022] [Open Access]
Abstract
Background Tango's maximized excess events test (MEET) has been shown to have very good statistical power in detecting global disease clustering. A nice feature of this test is that it considers a range of spatial scale parameters, adjusting for the multiple testing. This means that it has good power to detect a wide range of clustering processes. The test depends on the functional form of a weight function, and it is unknown how sensitive the test is to the choice of this weight function and what function provides optimal power for different clustering processes. In this study, we evaluate the performance of the test for a wide range of weight functions. Results The power varies greatly with the choice of weight function. Tango's original choice for the weight function works very well. There are also other weight functions that provide good power. Conclusion We recommend the use of Tango's MEET to test global disease clustering, either with the original weight or one of the alternate weights that have good power.
Affiliation(s)
- Changhong Song
- Department of Statistics, University of Connecticut, Storrs, CT, 06269, USA
| | - Martin Kulldorff
- Department of Ambulatory Care and Prevention, Harvard Medical School and Harvard Pilgrim Health Care, 133 Brookline Avenue, 6th Floor, Boston, MA 02215, USA
| |
|
22
|
Yang TY, Swartz TB. Applications of Binary Segmentation to the Estimation of Quantal Response Curves and Spatial Intensity. Biom J 2005; 47:489-501. [PMID: 16161806 DOI: 10.1002/bimj.200310136] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 11/09/2022]
Abstract
This paper explores the use of binary segmentation procedures in two applications. The first application is concerned with the estimation of nonparametric quantal response curves. With Bernoulli data and an assumed monotone increasing curve, this gives rise to a change-point model where the change points are determined using a sequence of nested hypothesis tests of whether a change point exists. The second application concerns cluster identification and inference for spatial data where the shape of the clusters and the number of clusters are unknown. The procedure involves a sequence of nested hypothesis tests of a single cluster versus a pair of distinct clusters. Examples of both applications are provided.
Affiliation(s)
- Tae Y Yang
- Department of Mathematics, Myongji University, Yongin, Korea 449-728.
| | | |
|
23
|
Abstract
Maps of regional disease rates are potentially useful tools in examining spatial patterns of disease and for identifying clusters. Bayes and empirical Bayes approaches to this problem have proven useful in smoothing crude maps of disease rates. In recent years, models including both spatially correlated random effects and spatially unstructured random effects have been very popular. The spatially correlated random effects have been proposed in an attempt to capture a general clustering in the data. As an alternative, we propose replacing the spatially structured random effect with fixed clustering effects associated with particular areas. A reversible jump Markov chain Monte Carlo (RJMCMC) algorithm for posterior inference is described. We illustrate the model using the well-known New York leukaemia data.
Affiliation(s)
- Ronald E Gangnon
- Department of Biostatistics and Medical Informatics, University of Wisconsin-Madison, 610 N. Walnut Street, Madison, Wisconsin 53726, U.S.A.
| | | |
|