1
Ochieng D, Hoang AT, Dickhaus T. Multiple testing of composite null hypotheses for discrete data using randomized p-values. Biom J 2024; 66:e2300077. [PMID: 37857533 DOI: 10.1002/bimj.202300077] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/10/2023] [Revised: 09/25/2023] [Accepted: 09/29/2023] [Indexed: 10/21/2023]
Abstract
P-values that are derived from continuously distributed test statistics are typically uniformly distributed on (0,1) under least favorable parameter configurations (LFCs) in the null hypothesis. Conservativeness of a p-value P (meaning that P is under the null hypothesis stochastically larger than uniform on (0,1)) can occur if the test statistic from which P is derived is discrete, or if the true parameter value under the null is not an LFC. To deal with both of these sources of conservativeness, we present two approaches utilizing randomized p-values. We illustrate their effectiveness for testing a composite null hypothesis under a binomial model. We also give an example of how the proposed p-values can be used to test a composite null in group testing designs. We find that the proposed randomized p-values are less conservative compared to nonrandomized p-values under the null hypothesis, but that they are stochastically not smaller under the alternative. The problem of establishing the validity of randomized p-values has received attention in previous literature. We show that our proposed randomized p-values are valid under various discrete statistical models, which are such that the distribution of the corresponding test statistic belongs to an exponential family. The behavior of the power function for the tests based on the proposed randomized p-values as a function of the sample size is also investigated. Simulations and a real data example are used to compare the different considered p-values.
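For orientation, the sketch below contrasts a nonrandomized p-value with the classical randomized p-value for a one-sided binomial test of H0: theta <= theta0. This is the textbook construction and not necessarily either of the two approaches proposed in the article; the function names, the choice of test, and the example data are assumptions made for illustration.

```python
import random
from math import comb


def binom_pmf(k: int, n: int, theta: float) -> float:
    """P(X = k) for X ~ Binomial(n, theta)."""
    return comb(n, k) * theta**k * (1.0 - theta) ** (n - k)


def nonrandomized_p(x: int, n: int, theta0: float) -> float:
    """P(X >= x) evaluated at theta0, the boundary of H0: theta <= theta0."""
    return sum(binom_pmf(k, n, theta0) for k in range(x, n + 1))


def randomized_p(x: int, n: int, theta0: float, u: float) -> float:
    """P(X > x) + u * P(X = x) at theta0, with u drawn from Uniform(0, 1)."""
    tail = sum(binom_pmf(k, n, theta0) for k in range(x + 1, n + 1))
    return tail + u * binom_pmf(x, n, theta0)


if __name__ == "__main__":
    n, theta0, x = 10, 0.5, 8  # hypothetical data: 8 successes in 10 trials
    u = random.random()        # auxiliary uniform draw, independent of the data
    print(nonrandomized_p(x, n, theta0), randomized_p(x, n, theta0, u))
```

Because u never exceeds 1, the randomized value is never larger than the nonrandomized one for the same data, which is how randomization offsets the conservativeness caused by discreteness at theta0.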
Affiliation(s)
- Daniel Ochieng
  - Institute for Statistics, University of Bremen, Bremen, Germany
- Anh-Tuan Hoang
  - Institute for Statistics, University of Bremen, Bremen, Germany

2
Ebert TA, Shawer D, Brlansky RH, Rogers ME. Seasonal Patterns in the Frequency of Candidatus Liberibacter Asiaticus in Populations of Diaphorina citri (Hemiptera: Psyllidae) in Florida. Insects 2023; 14:756. [PMID: 37754724 PMCID: PMC10532026 DOI: 10.3390/insects14090756] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/25/2023] [Revised: 09/04/2023] [Accepted: 09/06/2023] [Indexed: 09/28/2023]
Abstract
Candidatus Liberibacter asiaticus (CLas) is one of the putative causal agents of huanglongbing, which is a serious disease in citrus production. The pathogen is transmitted by Diaphorina citri Kuwayama (Hemiptera: Psyllidae). As an observational study, six groves in central Florida and one grove at the southern tip of Florida were sampled monthly from January 2008 through February 2012 (50 months). The collected psyllids were sorted by sex and abdominal color. Disease prevalence in adults peaked in November, with a minor peak in February. Gray/brown females had the highest prevalence, and blue/green individuals of either sex had the lowest prevalence. CLas prevalence in blue/green females was highly correlated with the prevalence in other sexes and colors. Thus, the underlying causes for seasonal fluctuations in prevalence operated in a similar fashion for all psyllids. The pattern was caused by larger nymphs displacing smaller ones from the optimal feeding sites and immunological robustness in different sex-color morphotypes. Alternative hypotheses were also considered. Improving our understanding of biological interactions and how to sample them will improve management decisions. We agree with other authors that psyllid management is critical year-round.
Affiliation(s)
- Timothy A. Ebert
  - Citrus Research and Education Center, University of Florida, 700 Experiment Station Rd., Lake Alfred, FL 33850, USA
- Dalia Shawer
  - Department of Economic Entomology, Faculty of Agriculture, Kafr Elsheikh University, Kafr Elsheikh 33516, Egypt
- Ron H. Brlansky
  - Citrus Research and Education Center, University of Florida, 700 Experiment Station Rd., Lake Alfred, FL 33850, USA
- Michael E. Rogers
  - Citrus Research and Education Center, University of Florida, 700 Experiment Station Rd., Lake Alfred, FL 33850, USA

3
Eetemadi A, Tagkopoulos I. Algorithmic lifestyle optimization. J Am Med Inform Assoc 2022; 30:38-45. [PMID: 36308771 PMCID: PMC9748593 DOI: 10.1093/jamia/ocac186] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/17/2022] [Revised: 05/09/2022] [Accepted: 10/06/2022] [Indexed: 12/15/2022] Open
Abstract
OBJECTIVE: A hallmark of personalized medicine and nutrition is to identify effective treatment plans at the individual level. Lifestyle interventions (LIs), from diet to exercise, can have a significant effect over time, especially in the case of food intolerances and allergies. The large set of candidate interventions makes it difficult to evaluate which intervention plan would be more favorable for any given individual. In this study, we aimed to develop a method for rapid identification of favorable LIs for a given individual. MATERIALS AND METHODS: We have developed a method, algorithmic lifestyle optimization (ALO), for rapid identification of effective LIs. At its core, a group testing algorithm identifies the effectiveness of each intervention efficiently, within the context of its pertinent group. RESULTS: Evaluations on synthetic and real data show that ALO is robust to noise, data size, and data heterogeneity. Compared to the standard of practice techniques, such as the standard elimination diet (SED), it identifies the effective LIs 58.9%-68.4% faster when used to discover an individual's food intolerances and allergies to 19-56 foods. DISCUSSION: ALO achieves its superior performance by: (1) grouping multiple LIs together optimally from prior statistics, and (2) adapting the groupings of LIs from the individual's subsequent responses. Future extensions to ALO should enable incorporating nutritional constraints. CONCLUSION: ALO provides a new approach for the discovery of effective interventions in nutrition and medicine, leading to better intervention plans faster and with less inconvenience to the patient compared to SED.
Affiliation(s)
- Ameen Eetemadi
  - Department of Computer Science, University of California, Davis, Davis, California, USA
  - Genome Center, University of California, Davis, Davis, California, USA
  - AI Institute for Next Generation Food Systems (AIFS), University of California, Davis, Davis, California, USA
- Ilias Tagkopoulos
  - Department of Computer Science, University of California, Davis, Davis, California, USA
  - Genome Center, University of California, Davis, Davis, California, USA
  - AI Institute for Next Generation Food Systems (AIFS), University of California, Davis, Davis, California, USA

4
Sewell DK. Leveraging network structure to improve pooled testing efficiency. J R Stat Soc Ser C Appl Stat 2022; 71:1648-1662. [PMID: 36632279 PMCID: PMC9826453 DOI: 10.1111/rssc.12594] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/06/2021] [Accepted: 08/11/2022] [Indexed: 02/01/2023]
Abstract
Screening is a powerful tool for infection control, allowing for infectious individuals, whether they be symptomatic or asymptomatic, to be identified and isolated. The resource burden of regular and comprehensive screening can often be prohibitive, however. One such measure to address this is pooled testing, whereby groups of individuals are each given a composite test; should a group receive a positive diagnostic test result, those comprising the group are then tested individually. Infectious disease is spread through a transmission network, and this paper shows how assigning individuals to pools based on this underlying network can improve the efficiency of the pooled testing strategy, thereby reducing the resource burden. We designed a simulated annealing algorithm to improve the pooled testing efficiency as measured by the ratio of the expected number of correct classifications to the expected number of tests performed. We then evaluated our approach using an agent-based model designed to simulate the spread of SARS-CoV-2 in a school setting. Our results suggest that our approach can decrease the number of tests required to regularly screen the student body, and that these reductions are quite robust to assigning pools based on partially observed or noisy versions of the network.

5
Sewell DK. Network-Informed Constrained Divisive Pooled Testing Assignments. Front Big Data 2022; 5:893760. [PMID: 35875594 PMCID: PMC9304576 DOI: 10.3389/fdata.2022.893760] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/10/2022] [Accepted: 06/06/2022] [Indexed: 11/13/2022] Open
Abstract
Frequent universal testing in a finite population is an effective approach to preventing large infectious disease outbreaks. Yet when the target group has many constituents, this strategy can be cost prohibitive. One approach to alleviate the resource burden is to group multiple individual tests into one unit in order to determine if further tests at the individual level are necessary. This approach, referred to as a group testing or pooled testing, has received much attention in finding the minimum cost pooling strategy. Existing approaches, however, assume either independence or very simple dependence structures between individuals. This assumption ignores the fact that in the context of infectious diseases there is an underlying transmission network that connects individuals. We develop a constrained divisive hierarchical clustering algorithm that assigns individuals to pools based on the contact patterns between individuals. In a simulation study based on real networks, we show the benefits of using our proposed approach compared to random assignments even when the network is imperfectly measured and there is a high degree of missingness in the data.
Affiliation(s)
- Daniel K. Sewell
  - Department of Biostatistics, University of Iowa, Iowa City, IA, United States

6
Best AF, Malinovsky Y, Albert PS. The efficient design of Nested Group Testing algorithms for disease identification in clustered data. J Appl Stat 2022; 50:2228-2245. [PMID: 37434628 PMCID: PMC10332225 DOI: 10.1080/02664763.2022.2071419] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/19/2021] [Accepted: 04/23/2022] [Indexed: 10/18/2022]
Abstract
Group testing study designs have been used since the 1940s to reduce screening costs for uncommon diseases; for rare diseases, all cases are identifiable with substantially fewer tests than the population size. Substantial research has identified efficient designs under this paradigm. However, little work has focused on the important problem of disease screening among clustered data, such as geographic heterogeneity in HIV prevalence. We evaluated designs where we first estimate disease prevalence and then apply efficient group testing algorithms using these estimates. Specifically, we evaluate prevalence using individual testing on a fixed-size subset of each cluster and use these prevalence estimates to choose group sizes that minimize the corresponding estimated average number of tests per subject. We compare designs where we estimate cluster-specific prevalences as well as a common prevalence across clusters, use different group testing algorithms, construct groups from individuals within and in different clusters, and consider misclassification. For diseases with low prevalence, our results suggest that accounting for clustering is unnecessary. However, for diseases with higher prevalence and sizeable between-cluster heterogeneity, accounting for clustering in study design and implementation improves efficiency. We consider the practical aspects of our design recommendations with two examples with strong clustering effects: (1) Identification of HIV carriers in the US population and (2) Laboratory screening of anti-cancer compounds using cell lines.
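The step of choosing a group size that minimizes the estimated average number of tests per subject can be sketched as follows. The sketch uses the basic two-stage (Dorfman) procedure with an error-free assay and independent individuals, which is only one of the algorithms such a design might consider; the function names and the prevalence values are illustrative assumptions.

```python
def dorfman_tests_per_subject(p: float, k: int) -> float:
    """Expected tests per subject for two-stage (Dorfman) testing.

    Assumes independent individuals, a common prevalence p, and an
    error-free assay. A group of size 1 is plain individual testing.
    """
    if k == 1:
        return 1.0
    # One pooled test shared by k subjects, plus k retests if the pool is positive.
    return 1.0 / k + 1.0 - (1.0 - p) ** k


def best_group_size(p_hat: float, max_k: int = 100) -> tuple[int, float]:
    """Group size in 1..max_k that minimizes the estimated tests per subject."""
    costs = {k: dorfman_tests_per_subject(p_hat, k) for k in range(1, max_k + 1)}
    k_best = min(costs, key=costs.get)
    return k_best, costs[k_best]


if __name__ == "__main__":
    for p_hat in (0.01, 0.05, 0.20):  # hypothetical prevalence estimates
        k, cost = best_group_size(p_hat)
        print(f"p_hat={p_hat:.2f}: group size {k}, {cost:.3f} tests per subject")
```

In a clustered design, `best_group_size` would be called once per cluster-specific estimate, or once with a common estimate when clustering is ignored.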
Affiliation(s)
- Ana F. Best
  - Biostatistics Branch, Biometrics Research Program, Division of Cancer Treatment and Diagnosis, National Cancer Institute, National Institutes of Health, Bethesda, MD, USA
- Yaakov Malinovsky
  - Department of Mathematics and Statistics, University of Maryland Baltimore County, Baltimore, MD, USA
- Paul S. Albert
  - Biostatistics Branch, Division of Cancer Epidemiology and Genetics, National Cancer Institute, National Institutes of Health, Bethesda, MD, USA

7
Sudo M, Osakabe M. freqpcr: Estimation of population allele frequency using qPCR ΔΔCq measures from bulk samples. Mol Ecol Resour 2022; 22:1380-1393. [PMID: 34882971 PMCID: PMC9300209 DOI: 10.1111/1755-0998.13554] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 02/18/2021] [Revised: 10/28/2021] [Accepted: 11/03/2021] [Indexed: 11/29/2022]
Abstract
PCR techniques, both quantitative (qPCR) and nonquantitative, have been used to estimate the frequency of a specific allele in a population. However, the labour required to sample numerous individuals and subsequently handle each sample renders the quantification of rare mutations (e.g., pesticide resistance gene mutations at the early stages of resistance development) challenging. Meanwhile, pooling DNA from multiple individuals as a "bulk sample" combined with qPCR may reduce handling costs. The qPCR output for a bulk sample, however, contains uncertainty owing to variations in DNA yields from each individual, in addition to measurement errors. In this study, we have developed a statistical model to estimate the frequency of the specific allele and its confidence interval when the sample allele frequencies are obtained in the form of ΔΔCq in the qPCR analyses on multiple bulk samples collected from a population. We assumed a gamma distribution as the individual DNA yield and developed an R package for parameter estimation, which was verified using real DNA samples from acaricide-resistant spider mites, as well as a numerical simulation. Our model resulted in unbiased point estimates of the allele frequency compared with simple averaging of the ΔΔCq values. The confidence intervals suggest that dividing the bulk samples into more parts will improve precision if the total number of individuals is equal; however, if the cost of PCR analysis is higher than that of sampling, increasing the total number and pooling them into a few bulk samples may also yield comparable precision.
Affiliation(s)
- Masaaki Sudo
  - Division of Fruit Tree and Tea Pest Control Research, Institute for Plant Protection, NARO: Kanaya Tea Research Station, Shimada, Japan
- Masahiro Osakabe
  - Laboratory of Ecological Information, Graduate School of Agriculture, Kyoto University, Kyoto, Japan

8
Lin YJ, Yu CH, Liu TH, Chang CS, Chen WT. Constructions and Comparisons of Pooling Matrices for Pooled Testing of COVID-19. IEEE Trans Netw Sci Eng 2022; 9:467-480. [PMID: 35582549 PMCID: PMC9014483 DOI: 10.1109/tnse.2021.3121709] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 10/13/2020] [Revised: 09/06/2021] [Accepted: 10/17/2021] [Indexed: 06/15/2023]
Abstract
In comparison with individual testing, group testing is more efficient in reducing the number of tests and potentially leading to tremendous cost reduction. There are two key elements in a group testing technique: (i) the pooling matrix that directs samples to be pooled into groups, and (ii) the decoding algorithm that uses the group test results to reconstruct the status of each sample. In this paper, we propose a new family of pooling matrices from packing the pencil of lines (PPoL) in a finite projective plane. We compare their performance with various pooling matrices proposed in the literature, including 2D-pooling, P-BEST, and Tapestry, using the two-stage definite defectives (DD) decoding algorithm. By conducting extensive simulations for a range of prevalence rates up to 5%, our numerical results show that there is no pooling matrix with the lowest relative cost in the whole range of the prevalence rates. To optimize the performance, one should choose the right pooling matrix, depending on the prevalence rate. The family of PPoL matrices can dynamically adjust their construction parameters according to the prevalence rates and could be a better alternative than using a fixed pooling matrix.
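The interplay between a pooling matrix and definite defectives (DD) decoding can be illustrated with a small sketch. It assumes error-free pooled tests, and the three-pool layout in the example is a toy design rather than a PPoL, 2D-pooling, P-BEST, or Tapestry matrix; the function name and return convention are assumptions.

```python
def dd_decode(pools: list[set[int]], results: list[bool], n_samples: int):
    """Definite-defectives (DD) decoding for noiseless pooled tests.

    pools[i] is the set of sample indices mixed into pool i, and
    results[i] is True when pool i tested positive. Returns three sets
    of sample indices: (negatives, definite_positives, unresolved).
    """
    # Step 1: every sample that appears in a negative pool is negative.
    negatives: set[int] = set()
    for pool, positive in zip(pools, results):
        if not positive:
            negatives |= pool
    candidates = set(range(n_samples)) - negatives

    # Step 2: a positive pool with exactly one remaining candidate
    # identifies that candidate as a definite positive.
    definite: set[int] = set()
    for pool, positive in zip(pools, results):
        remaining = pool & candidates
        if positive and len(remaining) == 1:
            definite |= remaining

    # Samples still undecided would be tested individually in a second stage.
    unresolved = candidates - definite
    return negatives, definite, unresolved


if __name__ == "__main__":
    toy_pools = [{0, 1}, {1, 2}, {2, 3}]  # hypothetical design for 4 samples
    toy_results = [True, True, False]     # outcome when only sample 1 is positive
    print(dd_decode(toy_pools, toy_results, n_samples=4))
```

In the toy run, samples 2 and 3 are cleared by the negative pool, sample 1 is the only candidate left in the second pool, and sample 0 stays unresolved, so a single follow-up test would complete the screening.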
Affiliation(s)
- Yi-Jheng Lin
  - Institute of Communications Engineering, National Tsing Hua University, Hsinchu 30013, Taiwan
- Che-Hao Yu
  - Institute of Communications Engineering, National Tsing Hua University, Hsinchu 30013, Taiwan
- Tzu-Hsuan Liu
  - Institute of Communications Engineering, National Tsing Hua University, Hsinchu 30013, Taiwan
- Cheng-Shang Chang
  - Institute of Communications Engineering, National Tsing Hua University, Hsinchu 30013, Taiwan
- Wen-Tsuen Chen
  - Institute of Communications Engineering, National Tsing Hua University, Hsinchu 30013, Taiwan

9
Frontiers Production Office. Erratum: Group Testing for SARS-CoV-2 Allows for Up to 10-Fold Efficiency Increase Across Realistic Scenarios and Testing Strategies. Front Public Health 2021; 9:781326. [PMID: 34733816 PMCID: PMC8558621 DOI: 10.3389/fpubh.2021.781326] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/22/2021] [Accepted: 09/22/2021] [Indexed: 11/13/2022] Open
Abstract
[This corrects the article DOI: 10.3389/fpubh.2021.583377.].

10
Hoegh A, Peel AJ, Madden W, Ruiz Aravena M, Morris A, Washburne A, Plowright RK. Estimating viral prevalence with data fusion for adaptive two-phase pooled sampling. Ecol Evol 2021; 11:14012-14023. [PMID: 34707835 PMCID: PMC8525136 DOI: 10.1002/ece3.8107] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/18/2021] [Revised: 06/09/2021] [Accepted: 06/18/2021] [Indexed: 11/16/2022] Open
Abstract
The COVID-19 pandemic has highlighted the importance of efficient sampling strategies and statistical methods for monitoring infection prevalence, both in humans and in reservoir hosts. Pooled testing can be an efficient tool for learning pathogen prevalence in a population. Typically, pooled testing requires a second-phase retesting procedure to identify infected individuals, but when the goal is solely to learn prevalence in a population, such as a reservoir host, there are more efficient methods for allocating the second-phase samples. To estimate pathogen prevalence in a population, this manuscript presents an approach for data fusion with two-phased testing of pooled samples that allows more efficient estimation of prevalence with fewer samples than traditional methods. The first phase uses pooled samples to estimate the population prevalence and inform efficient strategies for the second phase. To combine information from both phases, we introduce a Bayesian data fusion procedure that combines pooled samples with individual samples for joint inferences about the population prevalence. Data fusion procedures result in more efficient estimation of prevalence than traditional procedures that only use individual samples or a single phase of pooled sampling. The manuscript presents guidance on implementing the first-phase and second-phase sampling plans using data fusion. Such methods can be used to assess the risk of pathogen spillover from reservoir hosts to humans, or to track pathogens such as SARS-CoV-2 in populations.
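One way to picture the fusion of pooled and individual test results is a grid approximation to the posterior of the prevalence. The sketch below is a minimal illustration and not the authors' procedure: it assumes a flat prior, an error-free assay, independent samples, and a fixed pool size, and every name and count in it is hypothetical.

```python
from math import exp, expm1, log, log1p


def fused_posterior_mean(x_pos: int, m_pools: int, k: int,
                         y_pos: int, n_indiv: int,
                         grid_size: int = 2000) -> float:
    """Posterior mean of prevalence p under a flat prior, on a grid.

    Pooled data: x_pos of m_pools pools of size k test positive, where a
    pool is positive with probability 1 - (1 - p)**k.
    Individual data: y_pos of n_indiv samples are positive with probability p.
    Binomial coefficients are omitted because they do not depend on p.
    """
    grid = [(i + 1) / (grid_size + 1) for i in range(grid_size)]
    log_post = []
    for p in grid:
        log_pool_neg = k * log1p(-p)        # log P(pool negative)
        log_pool_pos = log(-expm1(log_pool_neg))  # log P(pool positive)
        ll = (x_pos * log_pool_pos
              + (m_pools - x_pos) * log_pool_neg
              + y_pos * log(p)
              + (n_indiv - y_pos) * log1p(-p))
        log_post.append(ll)
    top = max(log_post)  # subtract the maximum for numerical stability
    weights = [exp(v - top) for v in log_post]
    total = sum(weights)
    return sum(p * w for p, w in zip(grid, weights)) / total


if __name__ == "__main__":
    # Hypothetical data: 10 of 50 pools of size 5 positive in phase one,
    # then 2 of 40 individually tested samples positive in phase two.
    print(fused_posterior_mean(x_pos=10, m_pools=50, k=5, y_pos=2, n_indiv=40))
```

Multiplying the two likelihoods is what lets both phases inform a single estimate; an informative prior or an assay error model would change the likelihood terms but not the overall structure.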
Affiliation(s)
- Andrew Hoegh
  - Department of Mathematical Sciences, Montana State University, Bozeman, MT, USA
- Alison J. Peel
  - Centre for Planetary Health and Food Security, Griffith University, Nathan, QLD, Australia
- Wyatt Madden
  - Department of Microbiology and Immunology, Montana State University, Bozeman, MT, USA
- Manuel Ruiz Aravena
  - Department of Microbiology and Immunology, Montana State University, Bozeman, MT, USA
- Aaron Morris
  - Department of Veterinary Medicine, University of Cambridge, Cambridge, UK
- Raina K. Plowright
  - Department of Microbiology and Immunology, Montana State University, Bozeman, MT, USA

11
Verdun CM, Fuchs T, Harar P, Elbrächter D, Fischer DS, Berner J, Grohs P, Theis FJ, Krahmer F. Group Testing for SARS-CoV-2 Allows for Up to 10-Fold Efficiency Increase Across Realistic Scenarios and Testing Strategies. Front Public Health 2021; 9:583377. [PMID: 34490172 PMCID: PMC8416485 DOI: 10.3389/fpubh.2021.583377] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Received: 07/14/2020] [Accepted: 07/26/2021] [Indexed: 11/24/2022] Open
Abstract
Background: Due to the ongoing COVID-19 pandemic, demand for diagnostic testing has increased drastically, resulting in shortages of necessary materials to conduct the tests and overwhelming the capacity of testing laboratories. The supply scarcity and capacity limits affect test administration: priority must be given to hospitalized patients and symptomatic individuals, which can prevent the identification of asymptomatic and presymptomatic individuals and hence effective tracking and tracing policies. We describe optimized group testing strategies applicable to SARS-CoV-2 tests in scenarios tailored to the current COVID-19 pandemic and assess significant gains compared to individual testing. Methods: We account for biochemically realistic scenarios in the context of dilution effects on SARS-CoV-2 samples and consider evidence on specificity and sensitivity of PCR-based tests for the novel coronavirus. Because of the current uncertainty and the temporal and spatial changes in the prevalence regime, we provide analysis for several realistic scenarios and propose fast and reliable strategies for massive testing procedures. Key Findings: We find significant efficiency gaps between different group testing strategies in realistic scenarios for SARS-CoV-2 testing, highlighting the need for an informed decision of the pooling protocol depending on estimated prevalence, target specificity, and high- vs. low-risk population. For example, using one of the presented methods, all 1.47 million inhabitants of Munich, Germany, could be tested using only around 141 thousand tests if an infection rate below 0.4% is assumed. Using 1 million tests, the 6.69 million inhabitants from the city of Rio de Janeiro, Brazil, could be tested as long as the infection rate does not exceed 1%. Moreover, we provide an interactive web application, available at www.grouptexting.com, for visualizing the different strategies and designing pooling schemes according to specific prevalence scenarios and test configurations. Interpretation: Altogether, this work may help provide a basis for an efficient upscaling of current testing procedures, which takes the population heterogeneity into account and is fine-grained towards the desired study populations, e.g., mild/asymptomatic individuals vs. symptomatic ones but also mixtures thereof. Funding: German Science Foundation (DFG), German Federal Ministry of Education and Research (BMBF), Chan Zuckerberg Initiative DAF, and Austrian Science Fund (FWF).
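The two city examples in the abstract can be checked with a line of arithmetic each. The helper name below is illustrative; the population and test counts are the figures quoted in the abstract.

```python
def people_per_test(population: float, tests: float) -> float:
    """Average number of people covered by a single test."""
    return population / tests


if __name__ == "__main__":
    # Figures quoted in the abstract above.
    munich = people_per_test(1.47e6, 141e3)
    rio = people_per_test(6.69e6, 1.0e6)
    print(f"Munich: {munich:.1f} people per test")
    print(f"Rio de Janeiro: {rio:.1f} people per test")
```

The Munich figure works out to roughly 10.4 people per test, in line with the "up to 10-fold" claim in the title, while the Rio de Janeiro figure of about 6.7 reflects the smaller gain available at the higher assumed infection rate.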
Affiliation(s)
- Claudio M. Verdun
  - Department of Mathematics, Technical University of Munich, Garching, Germany
  - Department of Electrical and Computer Engineering, Technical University of Munich, Munich, Germany
- Tim Fuchs
  - Department of Mathematics, Technical University of Munich, Garching, Germany
- Pavol Harar
  - Research Network Data Science, University of Vienna, Vienna, Austria
  - Department of Telecommunications, Brno University of Technology, Brno, Czechia
- David S. Fischer
  - Institute of Computational Biology, Helmholtz Zentrum München, Munich, Germany
- Julius Berner
  - Faculty of Mathematics, University of Vienna, Vienna, Austria
- Philipp Grohs
  - Research Network Data Science, University of Vienna, Vienna, Austria
  - Faculty of Mathematics, University of Vienna, Vienna, Austria
  - Johann Radon Institute for Computational and Applied Mathematics, Austrian Academy of Sciences, Linz, Austria
- Fabian J. Theis
  - Department of Mathematics, Technical University of Munich, Garching, Germany
  - Institute of Computational Biology, Helmholtz Zentrum München, Munich, Germany
- Felix Krahmer
  - Department of Mathematics, Technical University of Munich, Garching, Germany
  - Munich Data Science Institute, Technical University of Munich, Garching, Germany

12
Attia MA, Chang WT, Tandon R. Heterogeneity Aware Two-Stage Group Testing. IEEE Trans Signal Process 2021; 69:3977-3990. [PMID: 37982073 PMCID: PMC8544931 DOI: 10.1109/tsp.2021.3093785] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/02/2020] [Revised: 02/19/2021] [Accepted: 06/16/2021] [Indexed: 11/21/2023]
Abstract
Group testing refers to the process of testing pooled samples to reduce the total number of tests. Given the current pandemic, and the shortage of test supplies for COVID-19, group testing can play a critical role in time- and cost-efficient diagnostics. In many scenarios, samples collected from users are also accompanied by auxiliary information (such as demographics, history of exposure, onset of symptoms). Such auxiliary information may differ across patients, and is typically not considered while designing group testing algorithms. In this paper, we abstract such heterogeneity using a model where the population can be categorized into clusters with different prevalence rates. The main result of this work is to show that exploiting knowledge heterogeneity can further improve the efficiency of group testing. Motivated by the practical constraints and diagnostic considerations, we focus on two-stage group testing algorithms, where in the first stage, the goal is to detect as many negative samples as possible by pooling, whereas the second stage involves individual testing to detect any remaining samples. For this class of algorithms, we prove that the gain in efficiency is related to the concavity of the number of tests as a function of the prevalence. We also show how one can choose the optimal pooling parameters for one of the algorithms in this class, namely, doubly constant pooling. We present lower bounds on the average number of tests as a function of the population heterogeneity profile, and also provide numerical results and comparisons.
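The link between concavity and the gain from heterogeneity can be made concrete with a small numerical sketch. It substitutes the basic two-stage (Dorfman) cost for the doubly constant pooling analysed in the paper, and it assumes independent samples and an error-free test; the cluster prevalences and weights are hypothetical.

```python
def min_dorfman_cost(p: float, max_k: int = 200) -> float:
    """Best expected tests per person over group sizes 1..max_k.

    Uses the Dorfman cost 1/k + 1 - (1 - p)**k for k >= 2 and 1 for
    individual testing. Each term is concave in p, so the minimum is too.
    """
    best = 1.0  # k = 1: individual testing
    for k in range(2, max_k + 1):
        best = min(best, 1.0 / k + 1.0 - (1.0 - p) ** k)
    return best


def compare(prevalences: list[float], weights: list[float]) -> tuple[float, float]:
    """Return (cluster-aware cost, cluster-blind cost) in tests per person."""
    # Aware: pool within each cluster using that cluster's prevalence.
    aware = sum(w * min_dorfman_cost(p) for p, w in zip(prevalences, weights))
    # Blind: pool at random from the mixed population, which only sees the mean.
    mean_p = sum(w * p for p, w in zip(prevalences, weights))
    blind = min_dorfman_cost(mean_p)
    return aware, blind


if __name__ == "__main__":
    aware, blind = compare([0.01, 0.20], [0.5, 0.5])  # hypothetical clusters
    print(f"cluster-aware: {aware:.3f}, cluster-blind: {blind:.3f}")
```

Because the optimized cost is concave in the prevalence, Jensen's inequality guarantees that the cluster-aware average is never larger than the cluster-blind cost in this model.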
Affiliation(s)
- Mohamed A Attia
  - Department of Electrical and Computer Engineering, University of Arizona, Tucson, AZ 85721, USA
- Wei-Ting Chang
  - Department of Electrical and Computer Engineering, University of Arizona, Tucson, AZ 85721, USA
- Ravi Tandon
  - Department of Electrical and Computer Engineering, University of Arizona, Tucson, AZ 85721, USA

13
Lin YJ, Yu CH, Liu TH, Chang CS, Chen WT. Positively Correlated Samples Save Pooled Testing Costs. IEEE Trans Netw Sci Eng 2021; 8:2170-2182. [PMID: 35783009 PMCID: PMC8769016 DOI: 10.1109/tnse.2021.3081759] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 11/17/2020] [Revised: 03/29/2021] [Accepted: 05/16/2021] [Indexed: 06/15/2023]
Abstract
The group testing approach, which achieves significant cost reduction over the individual testing approach, has received a lot of interest lately for massive testing of COVID-19. Many studies simply assume samples mixed in a group are independent. However, this assumption may not be reasonable for a contagious disease like COVID-19. Specifically, people within a family tend to infect each other and thus are likely to be positively correlated. By exploiting positive correlation, we make the following two main contributions. One is to provide a rigorous proof that further cost reduction can be achieved by using the Dorfman two-stage method when samples within a group are positively correlated. The other is to propose a hierarchical agglomerative algorithm for pooled testing with a social graph, where an edge in the social graph connects frequent social contacts between two persons. Such an algorithm leads to notable cost reduction (roughly 20-35%) compared to random pooling when the Dorfman two-stage algorithm is applied.
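Why positive correlation lowers the cost of the Dorfman two-stage method can be seen in a deliberately extreme toy model, which is an assumption of this sketch and not the model used in the paper: all members of a household share one infection status, and each household is infected independently with probability p. The function names and example numbers are hypothetical.

```python
def cost_random_pooling(p: float, k: int) -> float:
    """Dorfman tests per person when the k pool members are independent."""
    return 1.0 / k + 1.0 - (1.0 - p) ** k


def cost_household_pooling(p: float, k: int, h: int) -> float:
    """Dorfman tests per person when a pool holds k // h whole households.

    Toy model: all h members of a household share one infection status,
    and a household is infected with probability p independently of the
    others, so the pool is negative only if every household in it is.
    """
    if k % h != 0:
        raise ValueError("pool size must be a multiple of household size")
    households = k // h
    return 1.0 / k + 1.0 - (1.0 - p) ** households


if __name__ == "__main__":
    p, k, h = 0.05, 8, 4  # hypothetical prevalence, pool size, household size
    print(f"random pools:    {cost_random_pooling(p, k):.3f} tests per person")
    print(f"household pools: {cost_household_pooling(p, k, h):.3f} tests per person")
```

Both schemes spend the same 1/k on first-stage tests, but household pools are positive less often at the same individual prevalence, so fewer pools trigger the second stage. Real contact networks are far less clear-cut, so the size of the gap here should not be read as an estimate of the savings reported in the abstract.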
Affiliation(s)
- Yi-Jheng Lin
  - Institute of Communications Engineering, National Tsing Hua University, Hsinchu 300044, Taiwan
- Che-Hao Yu
  - Institute of Communications Engineering, National Tsing Hua University, Hsinchu 300044, Taiwan
- Tzu-Hsuan Liu
  - Institute of Communications Engineering, National Tsing Hua University, Hsinchu 300044, Taiwan
- Cheng-Shang Chang
  - Institute of Communications Engineering, National Tsing Hua University, Hsinchu 300044, Taiwan
- Wen-Tsuen Chen
  - Institute of Communications Engineering, National Tsing Hua University, Hsinchu 300044, Taiwan

14
Whitmeyer M. An imperfect test for a virus can be worse than no test at all. Health Econ 2021; 30:1347-1360. [PMID: 33763902 PMCID: PMC8250338 DOI: 10.1002/hec.4254] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/22/2020] [Revised: 01/21/2021] [Accepted: 01/28/2021] [Indexed: 06/12/2023]
Abstract
This note studies the effect of the availability of a test for a virus on the public health of a population. It is shown by example that the existence of a freely available and moderately informative test for a virus may lower society's welfare in comparison to the case where no test exists or access to the test is restricted. In this setting, any test provided to any subset of agents who would find it optimal not to isolate absent the test improves welfare.
Affiliation(s)
- Mark Whitmeyer: Hausdorff Center for Mathematics and Institute for Microeconomics, University of Bonn, Bonn, Germany
|
15
|
Alizad-Rahvar AR, Vafadar S, Totonchi M, Sadeghi M. False Negative Mitigation in Group Testing for COVID-19 Screening. Front Med (Lausanne) 2021; 8:661277. [PMID: 34095171 PMCID: PMC8170512 DOI: 10.3389/fmed.2021.661277] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 01/30/2021] [Accepted: 04/06/2021] [Indexed: 11/13/2022] Open
Abstract
After lifting the COVID-19 lockdown restrictions and opening businesses, screening is essential to prevent the spread of the virus. Group testing could be a promising candidate for screening to save time and resources. However, due to the high false-negative rate (FNR) of the RT-PCR diagnostic test, we should be cautious about using group testing because a group's false-negative result identifies all the individuals in a group as uninfected. Repeating the test is the best solution to reduce the FNR, and repeats should be integrated with the group-testing method to increase the sensitivity of the test. The simplest way is to replicate the test twice for each group (the 2Rgt method). In this paper, we present a new method for group testing (the groupMix method), which integrates two repeats in the test. Then we introduce the 2-stage sequential version of both the groupMix and the 2Rgt methods. We compare these methods analytically regarding the sensitivity and the average number of tests. The tradeoff between the sensitivity and the average number of tests should be considered when choosing the best method for the screening strategy. We applied the groupMix method to screening 263 people and identified 2 infected individuals by performing 98 tests. This method achieved a 63% saving in the number of tests compared to individual testing. Our experimental results show that in COVID-19 screening, the viral load can be low, and the group size should not be more than 6; otherwise, the FNR increases significantly. A web interface of the groupMix method is publicly available for laboratories to implement this method.
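The reported saving can be checked with simple arithmetic, and the benefit of repeating a test can be sketched under a strong simplifying assumption. The snippet below is illustrative only: the function names are hypothetical, and the repeat model assumes independent errors across repeats with a sample called positive if any repeat is positive, which is not necessarily the error model analyzed in the paper.

```python
def fraction_of_tests_saved(n_people: int, n_tests: int) -> float:
    """Saving relative to individual testing (one test per person)."""
    return 1.0 - n_tests / n_people


def fnr_after_repeats(single_test_fnr: float, repeats: int) -> float:
    """False-negative rate when an infected sample is missed only if
    every one of `repeats` independent tests is negative (assumption)."""
    return single_test_fnr ** repeats


if __name__ == "__main__":
    # Figures reported in the abstract: 263 people screened with 98 tests.
    saving = fraction_of_tests_saved(263, 98)
    print(f"saving: {saving:.1%}")
    # Hypothetical single-test FNR of 20%, repeated twice.
    print(fnr_after_repeats(0.20, 2))
```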
Affiliation(s)
- Amir Reza Alizad-Rahvar: School of Biological Sciences, Institute for Research in Fundamental Sciences (IPM), Tehran, Iran
- Safar Vafadar: Laboratory of Biological Complex Systems and Bioinformatics (CBB), Institute of Biochemistry and Biophysics, University of Tehran, Tehran, Iran
- Mehdi Totonchi: Department of Genetics, Royan Institute for Reproductive Biomedicine, The Academic Center for Education, Culture, and Research (ACECR), Tehran, Iran
- Mehdi Sadeghi: Department of Medical Genetics, National Institute for Genetic Engineering and Biotechnology, Tehran, Iran
|
16
|
Haber G, Malinovsky Y, Albert PS. Is group testing ready for prime-time in disease identification? Stat Med 2021; 40:3865-3880. [PMID: 33913183 DOI: 10.1002/sim.9003] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 04/07/2020] [Revised: 02/19/2021] [Accepted: 03/22/2021] [Indexed: 12/18/2022]
Abstract
Large-scale disease screening is a complicated process in which high costs must be balanced against pressing public health needs. When the goal is screening for infectious disease, one approach is group testing in which samples are initially tested in pools and individual samples are retested only if the initial pooled test was positive. Intuitively, if the prevalence of infection is small, this could result in a large reduction of the total number of tests required. Despite this, the use of group testing in medical studies has been limited, largely due to skepticism about the impact of pooling on the accuracy of a given assay. While there is a large body of research addressing the issue of testing errors in group testing studies, it is customary to assume that the misclassification parameters are known from an external population and/or that the values do not change with the group size. Both of these assumptions are highly questionable for many medical practitioners considering group testing in their study design. In this article, we explore how the failure of these assumptions might impact the efficacy of a group testing design and, consequently, whether group testing is currently feasible for medical screening. Specifically, we look at how incorrect assumptions about the sensitivity function at the design stage can lead to poor estimation of a procedure's overall sensitivity and expected number of tests. Furthermore, if a validation study is used to estimate the pooled misclassification parameters of a given assay, we show that the sample sizes required are so large as to be prohibitive in all but the largest screening programs.
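The sensitivity of a two-stage design to its misclassification assumptions can be illustrated with a small sketch. It assumes independent samples, a pooled-test sensitivity and specificity that do not depend on how many positives are in the pool, and independent errors between the pooled test and the individual retest. These are exactly the kinds of simplifying assumptions the abstract questions, so the numbers are illustrative only and the function names are hypothetical.

```python
def expected_tests_per_person(p: float, k: int,
                              se_pool: float, sp_pool: float) -> float:
    """Two-stage pooling: 1/k pooled tests per person, plus one retest
    per person whenever the pool tests positive (truly or falsely)."""
    prob_pool_truly_positive = 1.0 - (1.0 - p) ** k
    prob_pool_tests_positive = (se_pool * prob_pool_truly_positive
                                + (1.0 - sp_pool) * (1.0 - prob_pool_truly_positive))
    return 1.0 / k + prob_pool_tests_positive


def overall_sensitivity(se_pool: float, se_individual: float) -> float:
    """An infected person is detected only if both the pooled test and
    the individual retest are positive (independent errors assumed)."""
    return se_pool * se_individual


if __name__ == "__main__":
    # Hypothetical design-stage assumption vs. a diluted pooled sensitivity.
    print(overall_sensitivity(se_pool=0.95, se_individual=0.95))
    print(overall_sensitivity(se_pool=0.80, se_individual=0.95))
    print(expected_tests_per_person(p=0.02, k=10, se_pool=0.80, sp_pool=0.99))
```

Comparing the two sensitivity values shows how an optimistic pooled sensitivity at the design stage overstates the overall sensitivity of the procedure.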
Affiliation(s)
- Gregory Haber: Biostatistics Branch, Division of Cancer Epidemiology and Genetics, National Cancer Institute, Rockville, Maryland, USA
- Yaakov Malinovsky: Department of Mathematics and Statistics, University of Maryland, Baltimore County, Baltimore, Maryland, USA
- Paul S Albert: Biostatistics Branch, Division of Cancer Epidemiology and Genetics, National Cancer Institute, Rockville, Maryland, USA
|
17
|
Perivolaropoulos C, Vlacha V. A reduction of the number of assays and turnaround time by optimizing polymerase chain reaction (PCR) pooled testing for SARS-CoV-2. J Med Virol 2021; 93:4508-4515. [PMID: 33783005 PMCID: PMC8250672 DOI: 10.1002/jmv.26972] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 02/06/2021] [Revised: 03/04/2021] [Accepted: 03/24/2021] [Indexed: 01/07/2023]
Abstract
Early detection of the severe acute respiratory syndrome coronavirus 2 infection can decrease the spread of the disease and provide therapeutic options promptly in affected individuals. However, the diagnosis by reverse‐transcription polymerase chain reaction is costly and time‐consuming. Several methods of group testing have been developed to overcome this problem. The proposed strategy offers optimization of group testing according to the available resources by decreasing not only the number of the assays but also the turnaround time. The initial classification of the samples would be done according to the intention of testing defined as diagnostic or screening/surveillance, achieving the best possible homogeneity. The proposed stratification of pooling is based on branching (divisions) and depth (levels of re‐pooling) of the original group in association with the estimated probability of a positive sample. The dilutional effect of the grouped samples has also been considered. The margins of minimum and maximum conservation of assays of pooled specimens are calculated and the optimum strategy can be selected in association with the probability of positive samples in the original group. This algorithm intends to be a useful tool for group testing offering a choice of strategies according to the requirements.
Affiliation(s)
- Vasiliki Vlacha: Department of Early Years Learning and Care, University of Ioannina, Ioannina, Greece; Paediatric Department, Karamandanio Children's Hospital of Patras, Patras, Greece
|
18
|
Bilder CR, Tebbs JM, McMahan CS. Informative array testing with multiplex assays. Stat Med 2021; 40:3021-3034. [PMID: 33763901 DOI: 10.1002/sim.8954] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 05/01/2020] [Revised: 02/12/2021] [Accepted: 03/01/2021] [Indexed: 11/07/2022]
Abstract
High-volume testing of clinical specimens for sexually transmitted diseases is performed frequently by a process known as group testing. This algorithmic process involves testing portions of specimens from separate individuals together as one unit (or "group") to detect diseases. Retesting is performed on groups that test positively in order to differentiate between positive and negative individual specimens. The overall goal is to use the least number of tests possible across all individuals without sacrificing diagnostic accuracy. One of the most efficient group testing algorithms is array testing. In its simplest form, specimens are arranged into a grid-like structure so that row and column groups can be formed. Positive-testing rows/columns indicate which specimens to retest. With the growing use of multiplex assays, the increasing number of diseases tested by these assays, and the availability of subject-specific risk information, opportunities exist to make this testing process even more efficient. We propose specific specimen arrangements within an array that can reduce the number of retests needed when compared with other array testing algorithms. We examine how to calculate operating characteristics, including the expected number of tests and the SD for the number of tests, and then subsequently find a best arrangement. Our methods are illustrated for chlamydia and gonorrhea detection with the Aptima Combo 2 Assay. We also provide R functions to make our research accessible to laboratories.
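The simplest form of array testing described above can be sketched as a test count. The snippet assumes a single disease, an error-free assay, and the basic rule that every specimen at the intersection of a positive row and a positive column is retested. It does not reproduce the informative arrangements or multiplex-assay calculations proposed in the paper, and the function name is hypothetical.

```python
from typing import Sequence


def array_testing_num_tests(grid: Sequence[Sequence[bool]]) -> int:
    """Count tests for one array under error-free testing.

    grid[i][j] is True when the specimen in row i, column j is positive.
    Each row and each column is tested once as a group; specimens lying
    in both a positive row and a positive column are retested individually.
    """
    n_rows = len(grid)
    n_cols = len(grid[0])
    positive_rows = sum(1 for row in grid if any(row))
    positive_cols = sum(1 for j in range(n_cols)
                        if any(row[j] for row in grid))
    return n_rows + n_cols + positive_rows * positive_cols


if __name__ == "__main__":
    # Hypothetical 3x3 array with a single positive specimen.
    grid = [
        [False, False, False],
        [False, True, False],
        [False, False, False],
    ]
    print(array_testing_num_tests(grid))
```

With one positive in a 3x3 array, the count is 3 row tests, 3 column tests, and 1 retest, compared with 9 individual tests.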
Affiliation(s)
- Christopher R Bilder: Department of Statistics, University of Nebraska-Lincoln, Lincoln, Nebraska, USA
- Joshua M Tebbs: Department of Statistics, University of South Carolina, Columbia, South Carolina, USA
- Christopher S McMahan: School of Mathematical and Statistical Sciences, Clemson University, Clemson, South Carolina, USA
|
19
|
Ghosh S, Agarwal R, Rehan MA, Pathak S, Agarwal P, Gupta Y, Consul S, Gupta N, Goenka R, Rajwade A, Gopalkrishnan M. A Compressed Sensing Approach to Pooled RT-PCR Testing for COVID-19 Detection. IEEE Open J Signal Process 2021; 2:248-264. [PMID: 34812422 PMCID: PMC8545028 DOI: 10.1109/ojsp.2021.3075913] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 11/17/2020] [Revised: 04/13/2021] [Accepted: 04/17/2021] [Indexed: 05/12/2023]
Abstract
We propose 'Tapestry', a single-round pooled testing method with application to COVID-19 testing using quantitative Reverse Transcription Polymerase Chain Reaction (RT-PCR) that can result in shorter testing time and conservation of reagents and testing kits, at clinically acceptable false positive or false negative rates. Tapestry combines ideas from compressed sensing and combinatorial group testing to create a new kind of algorithm that is very effective in deconvoluting pooled tests. Unlike Boolean group testing algorithms, the input is a quantitative readout from each test and the output is a list of viral loads for each sample relative to the pool with the highest viral load. For guaranteed recovery of [Formula: see text] infected samples out of [Formula: see text] being tested, Tapestry needs only [Formula: see text] tests with high probability, using random binary pooling matrices. However, we propose deterministic binary pooling matrices based on combinatorial design ideas of Kirkman Triple Systems, which balance between good reconstruction properties and matrix sparsity for ease of pooling while requiring fewer tests in practice. This enables large savings using Tapestry at low prevalence rates while maintaining viability at prevalence rates as high as 9.5%. Empirically we find that single-round Tapestry pooling improves over two-round Dorfman pooling by almost a factor of 2 in the number of tests required. We evaluate Tapestry in simulations with synthetic data obtained using a novel noise model for RT-PCR, and validate it in wet lab experiments with oligomers in quantitative RT-PCR assays. Lastly, we describe use-case scenarios for deployment.
Affiliation(s)
- Sabyasachi Ghosh: Department of Computer Science and Engineering, IIT Bombay, Mumbai 400076, India
- Rishi Agarwal: Department of Computer Science and Engineering, IIT Bombay, Mumbai 400076, India
- Mohammad Ali Rehan: Department of Computer Science and Engineering, IIT Bombay, Mumbai 400076, India
- Shreya Pathak: Department of Computer Science and Engineering, IIT Bombay, Mumbai 400076, India
- Pratyush Agarwal: Department of Computer Science and Engineering, IIT Bombay, Mumbai 400076, India
- Yash Gupta: Department of Computer Science and Engineering, IIT Bombay, Mumbai 400076, India
- Sarthak Consul: Department of Electrical Engineering, IIT Bombay, Mumbai 400076, India
- Nimay Gupta: Department of Computer Science and Engineering, IIT Bombay, Mumbai 400076, India
- Ritesh Goenka: Department of Computer Science and Engineering, IIT Bombay, Mumbai 400076, India
- Ajit Rajwade: Department of Computer Science and Engineering, IIT Bombay, Mumbai 400076, India
|
20
|
Zhang W, Liu A, Li Q, Albert PS. Nonparametric estimation of distributions and diagnostic accuracy based on group-tested results with differential misclassification. Biometrics 2020; 76:1147-1156. [PMID: 32083733 PMCID: PMC8581970 DOI: 10.1111/biom.13236] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 04/29/2019] [Revised: 12/06/2019] [Accepted: 01/27/2020] [Indexed: 11/30/2022]
Abstract
This article concerns the problem of estimating a continuous distribution in a diseased or nondiseased population when only group-based test results on the disease status are available. The problem is challenging in that individual disease statuses are not observed and testing results are often subject to misclassification, with further complication that the misclassification may be differential as the group size and the number of the diseased individuals in the group vary. We propose a method to construct nonparametric estimation of the distribution and obtain its asymptotic properties. The performance of the distribution estimator is evaluated under various design considerations concerning group sizes and classification errors. The method is exemplified with data from the National Health and Nutrition Examination Survey study to estimate the distribution and diagnostic accuracy of C-reactive protein in blood samples in predicting chlamydia incidence.
Affiliation(s)
- Wei Zhang: LSC, NCMIS, Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing, China
- Aiyi Liu: Biostatistics and Bioinformatics Branch, Eunice Kennedy Shriver National Institute of Child Health and Human Development, National Institutes of Health, Bethesda, Maryland
- Qizhai Li: LSC, NCMIS, Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing, China
- Paul S. Albert: Biostatistics Branch, Division of Cancer Epidemiology and Genetics, National Cancer Institute, National Institutes of Health, Bethesda, Maryland
|
21
|
Pilcher CD, Westreich D, Hudgens MG. Group Testing for Severe Acute Respiratory Syndrome-Coronavirus 2 to Enable Rapid Scale-up of Testing and Real-Time Surveillance of Incidence. J Infect Dis 2020; 222:903-909. [PMID: 32592581 PMCID: PMC7337777 DOI: 10.1093/infdis/jiaa378] [Citation(s) in RCA: 31] [Impact Index Per Article: 7.8] [Received: 05/13/2020] [Accepted: 06/23/2020] [Indexed: 01/03/2023] Open
Abstract
High-throughput molecular testing for severe acute respiratory syndrome-coronavirus 2 (SARS-CoV-2) may be enabled by group testing in which pools of specimens are screened, and individual specimens tested only after a pool tests positive. Several laboratories have recently published examples of pooling strategies applied to SARS-CoV-2 specimens, but overall guidance on efficient pooling strategies is lacking. Therefore we developed a model of the efficiency and accuracy of specimen pooling algorithms based on available data on SARS-CoV-2 viral dynamics. For a fixed number of tests, we estimate that programs using group testing could screen 2-20 times as many specimens compared with individual testing, increase the total number of true positive infections identified, and improve the positive predictive value of results. We compare outcomes that may be expected in different testing situations and provide general recommendations for group testing implementation. A free, publicly-available Web calculator is provided to help inform laboratory decisions on SARS-CoV-2 pooling algorithms.
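How many more specimens a fixed test budget can cover depends on prevalence and pool size. The sketch below searches for the pool size that maximizes specimens screened per test under two-stage pooling, assuming independent specimens and an error-free assay. It ignores the viral-dynamics and dilution effects that the paper's model accounts for, so it is not a substitute for the authors' Web calculator; the function names and the search range are assumptions.

```python
from typing import Tuple


def specimens_per_test(p: float, k: int) -> float:
    """Specimens screened per test used, for two-stage pooling with
    pool size k and prevalence p (error-free, independent specimens)."""
    tests_per_specimen = 1.0 / k + 1.0 - (1.0 - p) ** k
    return 1.0 / tests_per_specimen


def best_pool_size(p: float, max_k: int = 30) -> Tuple[int, float]:
    """Return the pool size in 2..max_k with the highest throughput,
    together with its multiplier relative to individual testing."""
    best_k = max(range(2, max_k + 1), key=lambda k: specimens_per_test(p, k))
    return best_k, specimens_per_test(p, best_k)


if __name__ == "__main__":
    for prevalence in (0.001, 0.01, 0.05, 0.10):  # hypothetical values
        k, multiplier = best_pool_size(prevalence)
        print(prevalence, k, round(multiplier, 2))
```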
Affiliation(s)
- Christopher D Pilcher: Department of Medicine, University of California San Francisco, San Francisco, California, USA
- Daniel Westreich: Department of Epidemiology, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
- Michael G Hudgens: Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
|
22
|
Affiliation(s)
- Ozkan Ufuk Nalbantoglu: Department of Computer Engineering, Erciyes University, Kayseri, Turkey; Genome and Stem Cell Center (GenKok), Erciyes University, Kayseri, Turkey
- Aycan Gundogdu: Genome and Stem Cell Center (GenKok), Erciyes University, Kayseri, Turkey; Department of Microbiology and Clinical Microbiology, Erciyes University, Kayseri, Turkey
|
23
|
Deka S, Kalita D, Mangla A, Shankar R. Analysis of multi-sample pools in the detection of SARS-CoV-2 RNA for mass screening: An Indian perspective. Indian J Med Microbiol 2020; 38:451-456. [PMID: 33154262 PMCID: PMC7709612 DOI: 10.4103/ijmm.ijmm_20_273] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 06/19/2020] [Revised: 07/27/2020] [Accepted: 08/20/2020] [Indexed: 12/02/2022]
Abstract
In the current COVID-19 crisis, many national healthcare systems are confronted with a huge demand for mass testing and an acute shortage of diagnostic resources. Considering group testing as a viable solution, this pilot study was carried out to find the maximum number of samples that can be pooled together to accurately detect one positive sample carrying the severe acute respiratory syndrome-coronavirus 2 viral RNA from different pools. We made different pool sizes ranging from 5 to 30 samples. Three positive samples, covering the common range of polymerase chain reaction (PCR) threshold cycle values (an indirect indicator of viral load) observed in our patients, were selected, and different pools were made with known negative samples. The pools underwent real-time qualitative PCR for the determination of effective maximum pool size. It was observed that up to 20-sample pools of all positive samples could accurately be detected in terms of both E gene and RdRp gene, leading to considerable conservation of resources, time and workforce. However, while deciding the optimal pool size, the infection level in that particular geographical area and sensitivity of the test assay used (limit of detection) have to be taken into account.
Affiliation(s)
- Sangeeta Deka: Department of Microbiology, All India Institute of Medical Sciences, Rishikesh, Uttarakhand, India
- Deepjyoti Kalita: Department of Microbiology, All India Institute of Medical Sciences, Rishikesh, Uttarakhand, India
- Amit Mangla: Department of Microbiology, All India Institute of Medical Sciences, Rishikesh, Uttarakhand, India
- Ravi Shankar: Department of Microbiology, All India Institute of Medical Sciences, Rishikesh, Uttarakhand, India
|
24
|
Seong JT. Group Testing-Based Robust Algorithm for Diagnosis of COVID-19. Diagnostics (Basel) 2020; 10:diagnostics10060396. [PMID: 32545224 PMCID: PMC7345105 DOI: 10.3390/diagnostics10060396] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 04/28/2020] [Revised: 06/01/2020] [Accepted: 06/08/2020] [Indexed: 11/16/2022] Open
Abstract
At the time of writing, the COVID-19 infection is spreading rapidly. Currently, there is no vaccine or treatment, and researchers around the world are attempting to fight the infection. In this paper, we consider a diagnosis method for COVID-19, which is characterized by a very rapid rate of infection and is widespread. A possible method for avoiding severe infections is to stop the spread of the infection in advance by the prompt and accurate diagnosis of COVID-19. To this end, we exploit a group testing (GT) scheme, which is used to find a small set of confirmed cases out of a large population. For the accurate detection of false positives and negatives, we propose a robust algorithm (RA) based on the maximum a posteriori probability (MAP). The key idea of the proposed RA is to exploit iterative detection to propagate beliefs to neighbor nodes by exchanging marginal probabilities between input and output nodes. As a result, we show that our proposed RA provides the benefit of being robust against noise in the GT schemes. In addition, we demonstrate the performance of our proposal with a number of tests and successfully find a set of infected samples in both noiseless and noisy GT schemes with different COVID-19 incidence rates.
Affiliation(s)
- Jin-Taek Seong: Department of Convergence Software, Mokpo National University, Muan 58554, Korea
|
25
|
Zhang W, Liu A, Li Q, Albert PS. Incorporating retesting outcomes for estimation of disease prevalence. Stat Med 2019; 39:687-697. [PMID: 31758594 DOI: 10.1002/sim.8439] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 05/29/2019] [Revised: 10/31/2019] [Accepted: 11/03/2019] [Indexed: 11/12/2022]
Abstract
Group testing has been widely used as a cost-effective strategy to screen for and estimate the prevalence of a rare disease. While it is well-recognized that retesting is necessary for identifying infected subjects, it is not required for estimating the prevalence. For a test without misclassification, gains in statistical efficiency are expected from incorporating retesting results in the estimation of the prevalence. However, when the test is subject to misclassification, it is not clear how much gain should be expected. There are a number of theoretical challenges in addressing this issue, including (1) enumerating the potential test results from retesting individual subjects in a group, (2) the dependence among these test results and the test result from testing at the group level, and (3) differential misclassification due to pooling of biospecimens. Overcoming some of these challenges, we show that retesting subjects in either positive or negative groups can substantially improve the efficiency of the estimation and that retesting positive groups yields higher efficiency than retesting the same number or proportion of negative groups.
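As a baseline for the statement that retesting is not required to estimate prevalence, the following sketch computes the standard maximum likelihood estimate from pooled results alone. It assumes equal-sized pools, independent subjects, and a test without misclassification; it does not incorporate retesting outcomes or misclassification, which are the paper's actual subject. The function name is hypothetical.

```python
def prevalence_mle_from_pools(n_pools: int, n_positive_pools: int,
                              pool_size: int) -> float:
    """MLE of individual prevalence from pooled results only.

    A pool of size k is negative with probability (1 - p)**k, so the
    observed fraction of negative pools estimates (1 - p)**k.
    """
    if not 0 <= n_positive_pools <= n_pools:
        raise ValueError("n_positive_pools must lie between 0 and n_pools")
    fraction_negative = 1.0 - n_positive_pools / n_pools
    return 1.0 - fraction_negative ** (1.0 / pool_size)


if __name__ == "__main__":
    # Hypothetical data: 100 pools of 5 subjects, 10 pools positive.
    print(prevalence_mle_from_pools(100, 10, 5))
```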
Affiliation(s)
- Wei Zhang: LSC, NCMIS, Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing, China; Division of Intramural Population Health Research, Eunice Kennedy Shriver National Institute of Child Health and Human Development, National Institutes of Health, Bethesda, Maryland
- Aiyi Liu: Division of Intramural Population Health Research, Eunice Kennedy Shriver National Institute of Child Health and Human Development, National Institutes of Health, Bethesda, Maryland
- Qizhai Li: LSC, NCMIS, Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing, China
- Paul S Albert: Biostatistics Branch, Division of Cancer Epidemiology and Genetics, National Institutes of Health, Rockville, Maryland
|
26
|
Jones TA. Effect of Collaborative Group Testing on Dental Students' Performance and Perceived Learning in an Introductory Comprehensive Care Course. J Dent Educ 2019; 83:88-93. [PMID: 30600254 DOI: 10.21815/jde.019.011] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 03/26/2018] [Accepted: 07/03/2018] [Indexed: 11/20/2022]
Abstract
The use of collaboration while in dental school can help prepare dental students for the team-oriented nature of the workforce. One way to do this is via collaborative group testing (CGT), a method of assessment allowing students to learn from one another. The aim of this study was to examine the CGT method in a predoctoral dental education setting to determine if student examination performance improved with the addition of collaboration and if collaborative testing was beneficial to students' learning process. In 2016, all first-year dental students (n=76) at one U.S. dental school were assessed in an introductory comprehensive care course using a two-stage CGT in which students were assessed individually, prior to taking the same test in collaboration with an assigned partner. Three quizzes and a final examination were given in which student participants served as both control (individual assessments) and treatment (collaborative assessment). At the conclusion of the course, a questionnaire was administered to ascertain student perspectives. All assessments yielded favorable results with an overall score improvement from a mean of 81.1% on individual assessments to 91% on collaborative assessments (p=0.001), indicating that collaboration improved assessment outcomes. Additionally, retention of material was suggested with individual scores on the cumulative final surpassing average individual scores of the preceding quizzes (p<0.001). Students' responses on the questionnaire indicated that they perceived implementation of CGT was beneficial to their learning process. With these results, this testing methodology shows promise to enhance dental student learning, material retention, and teamwork.
Affiliation(s)
- Tobie Ann Jones: Assistant Professor of Restorative Dentistry, School of Dentistry, Oregon Health & Science University
|
27
|
Haber G, Malinovsky Y. Efficient methods for the estimation of the multinomial parameter for the two-trait group testing model. Electron J Stat 2019; 13:2624-2657. [PMID: 34267856 DOI: 10.1214/19-ejs1583] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/19/2022]
Abstract
Estimation of a single Bernoulli parameter using pooled sampling is among the oldest problems in the group testing literature. To carry out such estimation, an array of efficient estimators have been introduced covering a wide range of situations routinely encountered in applications. More recently, there has been growing interest in using group testing to simultaneously estimate the joint probabilities of two correlated traits using a multinomial model. Unfortunately, basic estimation results, such as the maximum likelihood estimator (MLE), have not been adequately addressed in the literature for such cases. In this paper, we show that finding the MLE for this problem is equivalent to maximizing a multinomial likelihood with a restricted parameter space. A solution using the EM algorithm is presented which is guaranteed to converge to the global maximizer, even on the boundary of the parameter space. Two additional closed form estimators are presented with the goal of minimizing the bias and/or mean square error. The methods are illustrated by considering an application to the joint estimation of transmission prevalence for two strains of the Potato virus Y by the aphid Myzus persicae.
Affiliation(s)
- Gregory Haber: Biostatistics Branch, Division of Cancer Epidemiology and Genetics, National Cancer Institute, NIH, Bethesda, MD 20892, USA
- Yaakov Malinovsky: Department of Mathematics and Statistics, University of Maryland, Baltimore County, Baltimore, MD 21250, USA
|
28
|
Eskridge KM, Gilmour SG, Posadas LG. Group screening for rare events based on incomplete block designs. Biotechnol Prog 2018; 35:e2770. [PMID: 30592187 DOI: 10.1002/btpr.2770] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/25/2018] [Revised: 12/06/2018] [Indexed: 11/08/2022]
Abstract
Fields such as diagnostic testing, biotherapeutics, drug development, and toxicology, among others, center on the premise of searching through many specimens for a rare event. Scientists in the business of "searching for a needle in a haystack" may greatly benefit from the use of group screening design strategies. Group screening, where specimens are composited into pools with each pool being tested for the presence of the event, can be much more cost-efficient than testing each individual specimen. A number of group screening designs have been proposed in the literature. Incomplete block screening designs are described here and compared with other group screening designs. It is shown that, under certain conditions, incomplete block screening designs can provide nearly a 90% cost saving compared to other group screening designs, for example when prevalence is 0.001 and 3876 specimens are screened with an ICB-sequential design rather than a Dorfman design. In other cases, previous group screening designs are shown to be most efficient. Overall, when prevalence is small (≤0.05), group screening designs are shown to be quite cost effective at screening a large number of specimens, and in general there is no one design that is best in all situations. © 2018 American Institute of Chemical Engineers Biotechnol Progress, 35: e2770, 2019.
Affiliation(s)
- Kent M Eskridge: Dept. of Statistics, University of Nebraska, Lincoln, Nebraska
- Steven G Gilmour: Dept. of Mathematics, King's College, University of London, London, U.K.
- Luis G Posadas: Dept. of Agronomy and Horticulture, University of Nebraska, Lincoln, Nebraska
|
29
|
Chaturvedi N, Menezes RXD, Goeman JJ, Wieringen WV. A test for detecting differential indirect trans effects between two groups of samples. Stat Appl Genet Mol Biol 2018; 17. [PMID: 30059350 DOI: 10.1515/sagmb-2017-0058] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/15/2022]
Abstract
Integrative analysis of copy number and gene expression data can help in understanding the cis and trans effect of copy number aberrations on transcription levels of genes involved in a pathway. To analyse how these copy number mediated gene-gene interactions differ between groups of samples we propose a new method, named dNET. Our method uses ridge regression to model the network topology involving one gene's expression level, its gene dosage and the expression levels of other genes in the network. The interaction parameters are estimated by fitting the model per gene for all samples together. However, instead of testing for differential network topology per gene, dNET tests for an overall difference in estimated parameters between two groups of samples and produces a single p-value. With the help of several simulation studies, we show that dNET can detect differential network nodes with high accuracy and low rate of false positives even in the presence of differential cis effects. We also apply dNET to publicly available TCGA cancer datasets and identify pathways where copy number mediated gene-gene interactions differ between samples with cancer stage lower than stage 3 and samples with cancer stage 3 or above.
Affiliation(s)
- Nimisha Chaturvedi
- Afdeling Epidemiologie en Biostatistiek, Amsterdam Public Health Research Institute, Medische Faculteit (F-vleugel), VU Medisch Centrum, 1007 MB Amsterdam, The Netherlands
- Netherlands Bioinformatics Center, 260 NBIC, 6500 HB Nijmegen, The Netherlands
- Renée X de Menezes
- Afdeling Epidemiologie en Biostatistiek, Amsterdam Public Health Research Institute, Medische Faculteit (F-vleugel), VU Medisch Centrum, 1007 MB Amsterdam, The Netherlands
- Netherlands Bioinformatics Center, 260 NBIC, 6500 HB Nijmegen, The Netherlands
- Jelle J Goeman
- Department of Biomedical Data Sciences, Room Number S5-P, LUMC Main Building, Leiden University Medical Center, Albinusdreef 2, 2333 ZA Leiden, The Netherlands
- Wessel van Wieringen
- Afdeling Epidemiologie en Biostatistiek, Amsterdam Public Health Research Institute, Medische Faculteit (F-vleugel), VU Medisch Centrum, 1007 MB Amsterdam, The Netherlands
- Department of Mathematics, Amsterdam Public Health Research Institute, Faculty of Sciences, Vrije Universiteit, De Boelelaan 1081a, 1081 HV Amsterdam, The Netherlands
|
30
|
Zhang W, Liu A, Albert PS, Ashmead RD, Schisterman EF, Mills JL. A pooling strategy to effectively use genotype data in quantitative traits genome-wide association studies. Stat Med 2018; 37:4083-4095. [PMID: 30003569 DOI: 10.1002/sim.7898] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 01/07/2018] [Revised: 04/17/2018] [Accepted: 06/01/2018] [Indexed: 11/11/2022]
Abstract
The goal of quantitative traits genome-wide association studies is to identify associations between a phenotypic variable, such as a vitamin level, and genetic variants, often single-nucleotide polymorphisms. When funding limits the number of assays that can be performed to measure the level of the phenotypic variable, a subgroup of subjects is often randomly selected from the genotype database and the level of the phenotypic variable is then measured for each subject. Because only a proportion of the genotype data can be used, such a simple random sampling method may suffer from substantial loss of efficiency, especially when the number of assays is relatively small and the frequency of the less common variant (minor allele frequency) is low. We propose a pooling strategy in which subjects in a randomly selected reference subgroup are aligned with randomly selected subjects from the remaining study subjects to form independent pools; blood samples from subjects in each pool are mixed; and the level of the phenotypic variable is measured for each pool. We demonstrate that the proposed pooling approach produces considerable gains in efficiency over the simple random sampling method for inference concerning the phenotype-genotype association, resulting in higher precision and power. The methods are illustrated using genotypic and phenotypic data from the Trinity Students Study, a quantitative genome-wide association study.
Affiliation(s)
- Wei Zhang
- Division of Intramural Population Health Research, Eunice Kennedy Shriver National Institute of Child Health and Human Development, National Institutes of Health, Bethesda, Maryland
- Aiyi Liu
- Division of Intramural Population Health Research, Eunice Kennedy Shriver National Institute of Child Health and Human Development, National Institutes of Health, Bethesda, Maryland
- Paul S Albert
- Biostatistics Branch, Division of Cancer Epidemiology and Genetics, National Cancer Institute, Rockville, Maryland
- Robert D Ashmead
- Center for Statistical Research and Methodology, US Census Bureau, Washington, District of Columbia
- Enrique F Schisterman
- Division of Intramural Population Health Research, Eunice Kennedy Shriver National Institute of Child Health and Human Development, National Institutes of Health, Bethesda, Maryland
- James L Mills
- Division of Intramural Population Health Research, Eunice Kennedy Shriver National Institute of Child Health and Human Development, National Institutes of Health, Bethesda, Maryland
|
31
|
Hyun N, Gastwirth JL, Graubard BI. Grouping methods for estimating the prevalences of rare traits from complex survey data that preserve confidentiality of respondents. Stat Med 2018; 37:2174-2186. [PMID: 29579785 DOI: 10.1002/sim.7648] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 10/19/2017] [Revised: 01/01/2018] [Accepted: 02/07/2018] [Indexed: 11/06/2022]
Abstract
Originally, 2-stage group testing was developed for efficiently screening individuals for a disease. In response to the HIV/AIDS epidemic, 1-stage group testing was adopted for estimating prevalences of a single or multiple traits from testing groups of size q, so individuals were not tested. This paper extends the methodology of 1-stage group testing to surveys with sample weighted complex multistage-cluster designs. Sample weighted-generalized estimating equations are used to estimate the prevalences of categorical traits while accounting for the error rates inherent in the tests. Two difficulties arise when using group testing in complex samples: (1) How does one weight the results of the test on each group as the sample weights will differ among observations in the same group. Furthermore, if the sample weights are related to positivity of the diagnostic test, then group-level weighting is needed to reduce bias in the prevalence estimation; (2) How does one form groups that will allow accurate estimation of the standard errors of prevalence estimates under multistage-cluster sampling allowing for intracluster correlation of the test results. We study 5 different grouping methods to address the weighting and cluster sampling aspects of complex designed samples. Finite sample properties of the estimators of prevalences, variances, and confidence interval coverage for these grouping methods are studied using simulations. National Health and Nutrition Examination Survey data are used to illustrate the methods.
Affiliation(s)
- Noorie Hyun
- Division of Biostatistics, Institute of Health and Equity, Medical College of Wisconsin, Milwaukee, WI, USA
- Joseph L Gastwirth
- Department of Statistics, George Washington University, Washington, DC, USA
- Barry I Graubard
- Division of Cancer Epidemiology and Genetics, National Cancer Institute, Rockville, MD, U.S.A
|
32
|
Abstract
Pooled testing is useful for identifying positive specimens in large-scale screening. Matrix pooling is one of the commonly used algorithms. In this work, we investigate the properties of matrix pooling and reveal that its efficiency is related to the magnitude of overlap among groups. Based on this property, we develop a new design that further improves efficiency while taking testing error into account. The efficiency, pooling sensitivity, and specificity of this algorithm are explicitly derived and verified through plasmode simulation of detecting acute human immunodeficiency virus among patients suspected of having malaria in rural Uganda. We show that the new design outperforms matrix pooling in efficiency while retaining the pooling sensitivity and specificity.
Affiliation(s)
- Wenjun Xiong
- School of Mathematics and Statistics, Guangxi Normal University, Guilin, China; Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing, China
- Juan Ding
- School of Mathematics and Statistics, Guangxi Normal University, Guilin, China; Department of Medicine, Vanderbilt University School of Medicine, Nashville, USA
- Yuanzhen He
- School of Mathematical Sciences, Beijing Normal University, Beijing, China
- Qizhai Li
- Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing, China
|
33
|
Warasi MS, Tebbs JM, McMahan CS, Bilder CR. Estimating the prevalence of multiple diseases from two-stage hierarchical pooling. Stat Med 2016; 35:3851-64. [PMID: 27090057 PMCID: PMC4965323 DOI: 10.1002/sim.6964] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.1] [Received: 07/04/2015] [Revised: 12/31/2015] [Accepted: 03/17/2016] [Indexed: 11/08/2022]
Abstract
Testing protocols in large-scale sexually transmitted disease screening applications often involve pooling biospecimens (e.g., blood, urine, and swabs) to lower costs and to increase the number of individuals who can be tested. With the recent development of assays that detect multiple diseases, it is now common to test biospecimen pools for multiple infections simultaneously. Recent work has developed an expectation-maximization algorithm to estimate the prevalence of two infections using a two-stage, Dorfman-type testing algorithm motivated by current screening practices for chlamydia and gonorrhea in the USA. In this article, we have the same goal but instead take a more flexible Bayesian approach. Doing so allows us to incorporate information about assay uncertainty during the testing process, which involves testing both pools and individuals, and also to update information as individuals are tested. Overall, our approach provides reliable inference for disease probabilities and accurately estimates assay sensitivity and specificity even when little or no information is provided in the prior distributions. We illustrate the performance of our estimation methods using simulation and by applying them to chlamydia and gonorrhea data collected in Nebraska. Copyright © 2016 John Wiley & Sons, Ltd.
Affiliation(s)
- Md S Warasi
- Department of Statistics, University of South Carolina, Columbia, 29208, SC, U.S.A
- Joshua M Tebbs
- Department of Statistics, University of South Carolina, Columbia, 29208, SC, U.S.A
- Christopher R Bilder
- Department of Statistics, University of Nebraska-Lincoln, Lincoln, 68583, NE, U.S.A
|
34
|
Zhao S, He Y, Zhang X, Xu W, Wu W, Gao S. Group Testing with Multiple Inhibitor Sets and Error-Tolerant and Its Decoding Algorithms. J Comput Biol 2016; 23:821-9. [PMID: 27387263 DOI: 10.1089/cmb.2014.0202] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/12/2022]
Abstract
In this article, we advance a new group testing model [Formula: see text] with multiple inhibitor sets and error-tolerant and propose decoding algorithms for it to identify all its positives by using [Formula: see text]-disjunct matrix. The decoding complexity for it is [Formula: see text], where [Formula: see text]. Moreover, we extend this new group testing to threshold group testing and give the threshold group testing model [Formula: see text] with multiple inhibitor sets and error-tolerant. By using [Formula: see text]-disjunct matrix, we propose its decoding algorithms for gap g = 0 and g > 0, respectively. Finally, we point out that the new group testing is the natural generalization for the clone model.
Affiliation(s)
- Shufang Zhao
- Scientific and Educational Department, Hebei General Hospital, Shijiazhuang, China
- Yichao He
- College of Information Engineering, Shijiazhuang University of Economics, Shijiazhuang, China
- Xinlu Zhang
- College of Mathematics and Information Science, Hebei Normal University, Shijiazhuang, China
- Wen Xu
- Department of Computer Science, University of Texas at Dallas, Richardson, Texas
- Weili Wu
- Department of Computer Science, University of Texas at Dallas, Richardson, Texas
- Suogang Gao
- College of Information Engineering, Shijiazhuang University of Economics, Shijiazhuang, China
|
35
|
Abstract
A cover-free family is a family of subsets of a finite set in which no one is covered by the union of r others. We study a variation of cover-free family: A binary matrix is (r, w]-consecutive-disjunct if for any w cyclically consecutive columns [Formula: see text] and another r cyclically consecutive columns [Formula: see text], there exists one row intersecting [Formula: see text] but none of [Formula: see text]. In group testing, the goal is to determine a small subset of positive items D in a large population [Formula: see text] by group tests. By applying consecutive-disjunct matrices, we solve threshold group testing of consecutive positives in [Formula: see text] group tests nonadaptively, and the decoding complexity is [Formula: see text] where u is a threshold parameter in threshold group testing and it is assumed that |D|≤d and [Formula: see text]. Meanwhile, we obtain that for group testing of consecutive positives, all positives can be identified in [Formula: see text] group tests nonadaptively and the decoding complexity is [Formula: see text].
Affiliation(s)
- Huilan Chang
- Department of Applied Mathematics, National University of Kaohsiung, Taiwan, Republic of China
- Yi-Chang Chiu
- Department of Applied Mathematics, National University of Kaohsiung, Taiwan, Republic of China
- Yi-Lin Tsai
- Department of Applied Mathematics, National University of Kaohsiung, Taiwan, Republic of China
|
36
|
Zhang Z, Liu C, Kim S, Liu A. Prevalence estimation subject to misclassification: the mis-substitution bias and some remedies. Stat Med 2014; 33:4482-500. [PMID: 25043925 DOI: 10.1002/sim.6268] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.3] [Received: 10/28/2013] [Revised: 06/24/2014] [Accepted: 06/30/2014] [Indexed: 11/07/2022]
Abstract
We consider the problem of estimating the prevalence of a disease under a group testing framework. Because assays are usually imperfect, misclassification of disease status is a major challenge in prevalence estimation. To account for possible misclassification, it is usually assumed that the sensitivity and specificity of the assay are known and independent of the group size. This assumption is often questionable, and substitution of incorrect values of an assay's sensitivity and specificity can result in a large bias in the prevalence estimate, which we refer to as the mis-substitution bias. In this article, we propose simple designs and methods for prevalence estimation that do not require known values of assay sensitivity and specificity. If a gold standard test is available, it can be applied to a validation subsample to yield information on the imperfect assay's sensitivity and specificity. When a gold standard is unavailable, it is possible to estimate assay sensitivity and specificity, either as unknown constants or as specified functions of the group size, from group testing data with varying group size. We develop methods for estimating parameters and for finding or approximating optimal designs, and perform extensive simulation experiments to evaluate and compare the different designs. An example concerning human immunodeficiency virus infection is used to illustrate the validation subsample design.
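The mis-substitution bias described above can be illustrated with a short sketch. It assumes the standard pooled-testing model in which a pool of size k tests positive with probability Se·(1 − (1 − p)^k) + (1 − Sp)·(1 − p)^k, with sensitivity Se and specificity Sp not depending on k. That constancy is itself the assumption the abstract questions, and the authors' own designs and estimators are not reproduced here; the function names are hypothetical.

```python
def pool_positive_prob(p: float, k: int, se: float, sp: float) -> float:
    """Probability that a pool of size k tests positive under an imperfect assay."""
    none_infected = (1.0 - p) ** k
    return se * (1.0 - none_infected) + (1.0 - sp) * none_infected


def prevalence_from_pool_rate(pi: float, k: int, se: float, sp: float) -> float:
    """Invert the model above: recover p from the pool-positive rate pi.

    Requires se + sp > 1. The intermediate value is clipped to [0, 1] so that
    sampling noise cannot produce an invalid estimate.
    """
    none_infected = (se - pi) / (se + sp - 1.0)
    none_infected = min(1.0, max(0.0, none_infected))
    return 1.0 - none_infected ** (1.0 / k)


if __name__ == "__main__":
    true_p, k = 0.02, 10
    true_se, true_sp = 0.80, 0.99  # illustrative values, not taken from the paper
    pi = pool_positive_prob(true_p, k, true_se, true_sp)

    # Substituting the correct assay characteristics recovers the prevalence.
    print(prevalence_from_pool_rate(pi, k, true_se, true_sp))
    # Substituting an overly optimistic sensitivity biases the estimate downward.
    print(prevalence_from_pool_rate(pi, k, 0.99, true_sp))
```

With the illustrative values above, plugging in a sensitivity of 0.99 when the true value is 0.80 yields an estimate noticeably below the true prevalence of 0.02.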
Affiliation(s)
- Zhiwei Zhang
- Division of Biostatistics, Office of Surveillance and Biometrics, Center for Devices and Radiological Health, Food and Drug Administration, Silver Spring, MD, U.S.A
|
37
|
Cao CC, Li C, Huang Z, Ma X, Sun X. Identifying rare variants with optimal depth of coverage and cost-effective overlapping pool sequencing. Genet Epidemiol 2013; 37:820-30. [PMID: 24166758 DOI: 10.1002/gepi.21769] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.8] [Received: 04/23/2013] [Revised: 09/09/2013] [Accepted: 09/27/2013] [Indexed: 01/19/2023]
Abstract
Genome-wide association studies have identified hundreds of genetic variants associated with complex diseases, although most variants identified so far explain only a small proportion of heritability, suggesting that rare variants are responsible for missing heritability. Identification of rare variants through large-scale resequencing is becoming increasingly important but remains prohibitively expensive despite the rapid decline in sequencing costs. Nevertheless, group-testing-based overlapping pool sequencing, in which pooled rather than individual samples are sequenced, will greatly reduce the effort of sample preparation as well as the cost of screening for rare variants. Here, we propose an overlapping pool sequencing strategy to screen rare variants with optimal sequencing depth, together with a corresponding cost model. We formulated a model to compute the optimal depth for sufficient observations of variants in pooled sequencing. Using the shifted transversal design algorithm, appropriate parameters for overlapping pool sequencing could be selected to minimize cost and guarantee accuracy. Due to the mixing constraint and the high depth required for pooled sequencing, results showed that it was more cost-effective to divide a large population into smaller blocks that were tested independently using optimized strategies. Finally, we conducted an experiment to screen variant carriers at a frequency of 1%. With simulated pools and publicly available human exome sequencing data, the experiment achieved 99.93% accuracy. Using overlapping pool sequencing, the cost of screening variant carriers at a frequency of 1% in 200 diploid individuals dropped to at least 66% when the target sequencing region was set to 30 Mb.
Affiliation(s)
- Chang-Chang Cao
- State Key Laboratory of Bioelectronics, School of Biological Science and Medical Engineering, Southeast University, Nanjing, China
|
38
|
Abstract
Monitoring populations of hosts as well as insect vectors is an important part of agricultural and public health risk assessment. In applications where pathogen prevalence is likely low, it is common to test pools of subjects for the presence of infection, rather than to test subjects individually. This technique is known as pooled (group) testing. In this paper, we revisit the problem of estimating the population prevalence p from pooled testing, but we consider applications where inverse binomial sampling is used. Our work is unlike previous research in pooled testing, which has largely assumed a binomial model. Inverse sampling is natural to implement when there is a need to report estimates early on in the data collection process and has been used in individual testing applications when disease incidence is low. We consider point and interval estimation procedures for p in this new pooled testing setting, and we use example data sets from the literature to describe and to illustrate our methods.
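To make the estimation problem concrete, the sketch below gives a plain maximum likelihood point estimate of p from pooled tests, assuming equal pool sizes, a perfect assay, and independent subjects. Under inverse binomial sampling, pools are tested until a fixed number of positives is observed; the MLE of the pool-positive probability is then the ratio of positives to pools tested, the same ratio as in the binomial model. The estimators actually studied in the paper (for example any bias-corrected or interval procedures) are not reproduced; all names are hypothetical.

```python
import random


def prevalence_mle(positive_pools: float, pools_tested: float, pool_size: int) -> float:
    """MLE of individual prevalence p from pooled tests (perfect assay assumed).

    theta_hat = positive_pools / pools_tested estimates P(pool is positive),
    and p_hat = 1 - (1 - theta_hat)^(1 / pool_size) follows by invariance.
    """
    if pools_tested <= 0 or not 0 <= positive_pools <= pools_tested:
        raise ValueError("need 0 <= positive_pools <= pools_tested and pools_tested > 0")
    theta_hat = positive_pools / pools_tested
    return 1.0 - (1.0 - theta_hat) ** (1.0 / pool_size)


def simulate_inverse_sampling(p: float, pool_size: int, target_positives: int,
                              seed: int = 0) -> int:
    """Test pools one by one until target_positives positive pools are seen.

    Returns the total number of pools tested (the random quantity under
    inverse binomial sampling).
    """
    rng = random.Random(seed)
    theta = 1.0 - (1.0 - p) ** pool_size  # probability that a pool is positive
    pools_tested = positives = 0
    while positives < target_positives:
        pools_tested += 1
        if rng.random() < theta:
            positives += 1
    return pools_tested


if __name__ == "__main__":
    r, k = 5, 10
    t = simulate_inverse_sampling(p=0.01, pool_size=k, target_positives=r, seed=1)
    print(f"pools tested: {t}, estimated prevalence: {prevalence_mle(r, t, k):.4f}")
```

Because the stopping rule fixes the number of positives rather than the number of pools, an estimate is available as soon as the target count is reached, which matches the early-reporting motivation given in the abstract.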
Affiliation(s)
- Nicholas A. Pritchard
- Department of Mathematics and Statistics, Coastal Carolina University, Conway, SC 29528, USA
- Joshua M. Tebbs
- Department of Statistics, University of South Carolina, Columbia, SC 29208, USA
|
39
|
Abstract
Over the past three decades we have steadily increased our knowledge on the genetic basis of many severe disorders. Nevertheless, there are still great challenges in applying this knowledge routinely in the clinic, mainly due to the relatively tedious and expensive process of genotyping. Since the genetic variations that underlie the disorders are relatively rare in the population, they can be thought of as a sparse signal. Using methods and ideas from compressed sensing and group testing, we have developed a cost-effective genotyping protocol to detect carriers for severe genetic disorders. In particular, we have adapted our scheme to a recently developed class of high throughput DNA sequencing technologies. The mathematical framework presented here has some important distinctions from the 'traditional' compressed sensing and group testing frameworks in order to address biological and technical constraints of our setting.
Affiliation(s)
- Yaniv Erlich
- Watson School of Biological Science, Cold Spring Harbor Laboratory, NY, 11724 USA
- Assaf Gordon
- Watson School of Biological Science, Cold Spring Harbor Laboratory, NY, 11724 USA
- Gregory J. Hannon
- Watson School of Biological Science, Cold Spring Harbor Laboratory, NY, 11724 USA
- Partha P. Mitra
- Watson School of Biological Science, Cold Spring Harbor Laboratory, NY, 11724 USA
|
40
|
Kainkaryam RM, Woolf PJ. Pooling in high-throughput drug screening. Curr Opin Drug Discov Devel 2009; 12:339-50. [PMID: 19396735 PMCID: PMC3204799] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/27/2023]
Abstract
Pooling in HTS refers to the act of testing mixtures of compounds in a primary screen to accurately identify hits for secondary screening. The reduction in the number of tests needed to screen a compound library by pooling can also be extended to achieve much-needed error tolerance in HTS. Despite the success of HTS in other biological experiments, pooling in high-throughput drug screening has been a controversial and often marginalized paradigm. At first appearance, pooling appears to promise gains from reduced effort, or possibly could create more problems than solutions. However, this article demonstrates that pooling is a practical and necessary part of HTS: discussions include the rationale for pooling compounds in HTS, a unifying view of pooling design theory, a review of past attempts at pooling and their success, and recent advances in the field.
Affiliation(s)
- Peter J. Woolf
- Department of Chemical Engineering, University of Michigan, Ann Arbor
|
41
|
Abstract
Pooling experiments date as far back as 1915 and were initially used in dilution studies for estimating the density of organisms in some medium. These early uses of pooling were necessitated by scientific and technical limitations. Today, pooling experiments are driven by the potential cost savings and precision gains that can result, and they are making a substantial impact on blood screening and drug discovery. A general review of pooling experiments is given here, with additional details and discussion of issues and methods for two important application areas, namely, blood testing and drug discovery. The blood testing application is very old, from 1943, yet is still used today, especially for HIV antibody screening. In contrast, the drug discovery application is relatively new, with early uses occurring in the period from the late 1980s to early 1990s. Statistical methods for this latter application are still actively being investigated and developed through both the pharmaceutical industries and academic research. The ability of pooling to investigate synergism offers exciting prospects for the discovery of combination therapies.
|