1
Gajewski BJ, Carlson SE, Brown AR, Mudaranthakam DP, Kerling EH, Valentine CJ. The value of a two-armed Bayesian response adaptive randomization trial. J Biopharm Stat 2023; 33:43-52. [PMID: 36411742 PMCID: PMC9812849 DOI: 10.1080/10543406.2022.2148161] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/18/2021] [Accepted: 11/12/2022] [Indexed: 11/23/2022]
Abstract
We assess the value of a two-armed Bayesian response adaptive randomization (RAR) design in a trial comparing early preterm birth rates under high versus low doses of docosahexaenoic acid during pregnancy. Unexpectedly, the COVID-19 pandemic forced recruitment to pause at 1100 participants rather than the planned 1355. Power was 87% with the number of participants at the pause, compared with 90% with the planned number. We decided to stop the study. This paper describes how the RAR was used to execute the study. The value of RAR in two-armed studies is quite high, and its use in the future is promising.
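The abstract does not state the allocation rule used in the trial, so the following is only a minimal sketch of how a two-armed Bayesian RAR update could work for a binary outcome such as early preterm birth. The Beta(1, 1) priors, the Monte Carlo comparison, the clipping floor of 0.1, and all counts are illustrative assumptions, not details from the paper.

```python
import random


def prob_high_dose_better(events_high, n_high, events_low, n_low,
                          draws=20000, seed=0):
    """Estimate P(rate_high < rate_low | data) by Monte Carlo.

    Assumes independent Beta(1, 1) priors on each arm's event rate,
    so each posterior is Beta(1 + events, 1 + non-events).
    """
    rng = random.Random(seed)
    wins = 0
    for _ in range(draws):
        rate_high = rng.betavariate(1 + events_high, 1 + n_high - events_high)
        rate_low = rng.betavariate(1 + events_low, 1 + n_low - events_low)
        if rate_high < rate_low:
            wins += 1
    return wins / draws


def allocation_probability(p_better, floor=0.1):
    """Probability of assigning the next participant to the high-dose arm.

    The posterior probability is clipped to [floor, 1 - floor] so that
    neither arm is ever starved of participants (an assumed safeguard).
    """
    return min(max(p_better, floor), 1 - floor)


# Hypothetical interim data: events / participants in each arm.
p = prob_high_dose_better(events_high=5, n_high=200, events_low=15, n_low=200)
print("P(high dose has lower event rate):", p)
print("Next allocation probability to high dose:", allocation_probability(p))
```

In a design of this kind the allocation probability would be recomputed at each interim look, so accumulating evidence gradually shifts enrollment toward the arm that appears better.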
Affiliation(s)
- Byron J Gajewski
- Department of Biostatistics & Data Science, University of Kansas Medical Center, Kansas City, KS, USA
- Susan E Carlson
- Department of Dietetics and Nutrition, University of Kansas Medical Center, Kansas City, KS, USA
- Alexandra R Brown
- Department of Biostatistics & Data Science, University of Kansas Medical Center, Kansas City, KS, USA
- Dinesh Pal Mudaranthakam
- Department of Biostatistics & Data Science, University of Kansas Medical Center, Kansas City, KS, USA
- Elizabeth H Kerling
- Department of Dietetics and Nutrition, University of Kansas Medical Center, Kansas City, KS, USA
2
Delomas TA, Willis SC, Parker BL, Miller D, Anders P, Schreier A, Narum S. Genotyping single nucleotide polymorphisms and inferring ploidy by amplicon sequencing for polyploid, ploidy-variable organisms. Mol Ecol Resour 2021; 21:2288-2298. [PMID: 34008918 DOI: 10.1111/1755-0998.13431] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 07/30/2020] [Revised: 04/21/2021] [Accepted: 05/11/2021] [Indexed: 11/30/2022]
Abstract
Whole genome duplication is hypothesized to have played a critical role in the evolution of several major taxa, including vertebrates, and while many lineages have rediploidized, some retain polyploid genomes. Additionally, variation in ploidy can occur naturally or be artificially induced within select plant and animal species. Modern genetic techniques have not been widely applied to polyploid or ploidy-variable species, in part due to the difficulty of obtaining genotype data from polyploids. In this study, we demonstrate a strategy for developing an amplicon sequencing panel of single nucleotide polymorphisms for high-throughput genotyping of polyploid organisms. We then develop a method to infer ploidy of individuals from amplicon sequencing data that is generalized to apply to any ploidy and does not require prior identification of heterozygous genotypes. Combining these two techniques will allow researchers to both infer ploidy and generate ploidy-aware genotypes with the same amplicon sequencing panel. We demonstrate this approach with white sturgeon Acipenser transmontanus, a ploidy-variable (octoploid, decaploid and dodecaploid) imperiled species under conservation management in the Pacific Northwest, obtaining a panel of 325 loci. These loci were validated by examining inheritance in known-cross families, and the ploidy inference method was validated with known ploidy samples. We provide scripts that adapt existing pipelines to genotype polyploids and an R package for application of the ploidy inference method. We expect that these techniques will empower studies of genetic variation and inheritance in polyploid organisms that vary in ploidy level, either naturally or as a result of artificial propagation practices.
Affiliation(s)
- Thomas A Delomas
- Pacific States Marine Fisheries Commission/Idaho Department of Fish and Game, Eagle Fish Genetics Laboratory, Eagle, ID, USA
- Stuart C Willis
- Hagerman Genetics Lab, Columbia River Inter-Tribal Fish Commission, Hagerman, ID, USA
- Blaine L Parker
- Columbia River Inter-Tribal Fish Commission, Portland, OR, USA
- Andrea Schreier
- Genomic Variation Laboratory, Department of Animal Science, University of California Davis, Davis, CA, USA
- Shawn Narum
- Hagerman Genetics Lab, Columbia River Inter-Tribal Fish Commission, Hagerman, ID, USA
3
Abstract
Using a sample from a population to estimate the proportion of the population with a certain category label is a broadly important problem. In the context of microbiome studies, this problem arises when researchers wish to use a sample from a population of microbes to estimate the population proportion of a particular taxon, known as the taxon's relative abundance. In this paper, we propose a beta-binomial model for this task. Like existing models, our model allows for a taxon's relative abundance to be associated with covariates of interest. However, unlike existing models, our proposal also allows for the overdispersion in the taxon's counts to be associated with covariates of interest. We exploit this model in order to propose tests not only for differential relative abundance, but also for differential variability. The latter is particularly valuable in light of speculation that dysbiosis, the perturbation from a normal microbiome that can occur in certain disease conditions, may manifest as a loss of stability, or increase in variability, of the counts associated with each taxon. We demonstrate the performance of our proposed model using a simulation study and an application to soil microbial data.
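The model described above can be illustrated with a small likelihood sketch. The abstract does not give the parameterization or link functions, so the mean/overdispersion form below (alpha = mu(1 - phi)/phi, beta = (1 - mu)(1 - phi)/phi), the logit links for both parameters, the single covariate, and all function names and data are assumptions made for illustration.

```python
from math import exp, lgamma


def log_beta(a, b):
    """Logarithm of the Beta function B(a, b)."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)


def beta_binomial_logpmf(w, m, mu, phi):
    """Log-probability of w taxon counts out of m total reads.

    mu is the expected relative abundance and phi in (0, 1) is the
    overdispersion; phi close to 0 approaches the binomial model.
    """
    a = mu * (1 - phi) / phi
    b = (1 - mu) * (1 - phi) / phi
    log_choose = lgamma(m + 1) - lgamma(w + 1) - lgamma(m - w + 1)
    return log_choose + log_beta(w + a, m - w + b) - log_beta(a, b)


def inv_logit(x):
    return 1.0 / (1.0 + exp(-x))


def log_likelihood(counts, depths, covariate, coef_mu, coef_phi):
    """Sum of log-probabilities when BOTH mu and phi depend on a covariate.

    coef_mu and coef_phi are (intercept, slope) pairs on the logit scale.
    """
    total = 0.0
    for w, m, x in zip(counts, depths, covariate):
        mu = inv_logit(coef_mu[0] + coef_mu[1] * x)
        phi = inv_logit(coef_phi[0] + coef_phi[1] * x)
        total += beta_binomial_logpmf(w, m, mu, phi)
    return total


# Hypothetical data: taxon counts, sequencing depths, and a binary covariate.
counts = [12, 30, 5, 44]
depths = [200, 250, 180, 220]
group = [0, 1, 0, 1]
print(log_likelihood(counts, depths, group, (-3.0, 1.0), (-4.0, 0.5)))
```

Under these assumptions, a test for differential relative abundance would compare fits with and without the slope in `coef_mu`, and a test for differential variability would do the same for the slope in `coef_phi`.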
Affiliation(s)
- Daniela Witten
- Departments of Statistics and Biostatistics, University of Washington
- Amy D Willis
- Department of Biostatistics, University of Washington
4
Menssen M, Schaarschmidt F. Prediction intervals for overdispersed binomial data with application to historical controls. Stat Med 2019; 38:2652-2663. [PMID: 30835886 DOI: 10.1002/sim.8124] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 07/20/2018] [Revised: 01/24/2019] [Accepted: 01/25/2019] [Indexed: 11/05/2022]
Abstract
Bioassays are highly standardized trials for assessing the impact of a chemical compound on a model organism. In that context, it is standard to compare several treatment groups with an untreated control. If the same type of bioassay is carried out several times, the amount of information about the historical controls rises with every new study. This information can be applied to predict the outcome of one future control using a prediction interval. Since the observations are counts of successes out of a given sample size, like mortality or histopathological findings, the data can be assumed to be binomial but may exhibit overdispersion caused by the variability between historical studies. We describe two approaches that account for overdispersion: asymptotic prediction intervals using the quasi-binomial assumption and prediction intervals based on the quantiles of the beta-binomial distribution. Both interval types were α-calibrated using bootstrap methods. To assess the intervals' coverage probabilities, a simulation study was run based on various numbers of historical studies and sample sizes as well as different binomial proportions and varying levels of overdispersion. It was shown that α-calibration can improve the coverage probabilities of both interval types. The coverage probability of the calibrated intervals, calculated based on at least 10 historical studies, was satisfactorily close to the nominal 95%. In a last step, the intervals were computed based on a real data set from the NTP homepage, using historical controls from bioassays with the mouse strain B6C3F1.
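The second approach, an interval built from beta-binomial quantiles, can be sketched as follows. This is an uncalibrated illustration only: it omits the bootstrap α-calibration described in the abstract, takes the estimated proportion and overdispersion as given inputs rather than estimating them from historical studies, and uses hypothetical names and values throughout.

```python
from math import exp, lgamma


def bb_pmf(k, m, a, b):
    """Beta-binomial probability of k successes out of m with shapes a, b."""
    log_choose = lgamma(m + 1) - lgamma(k + 1) - lgamma(m - k + 1)
    log_num = lgamma(k + a) + lgamma(m - k + b) - lgamma(m + a + b)
    log_den = lgamma(a) + lgamma(b) - lgamma(a + b)
    return exp(log_choose + log_num - log_den)


def bb_quantile(p, m, a, b):
    """Smallest k whose cumulative probability is at least p."""
    cumulative = 0.0
    for k in range(m + 1):
        cumulative += bb_pmf(k, m, a, b)
        if cumulative >= p:
            return k
    return m


def prediction_interval(m, pi_hat, rho_hat, level=0.95):
    """Equal-tailed interval for the count in one future control group.

    pi_hat is the estimated proportion and rho_hat in (0, 1) the estimated
    intra-study correlation; both are assumed to come from historical data.
    """
    a = pi_hat * (1 - rho_hat) / rho_hat
    b = (1 - pi_hat) * (1 - rho_hat) / rho_hat
    tail = (1 - level) / 2
    return bb_quantile(tail, m, a, b), bb_quantile(1 - tail, m, a, b)


# Hypothetical estimates from historical controls, future group of 50 animals.
print(prediction_interval(m=50, pi_hat=0.1, rho_hat=0.05))
```

Because the count is discrete, the interval's probability content under the assumed distribution is at least the nominal level; this ignores the uncertainty in the estimated parameters, which is one motivation for calibrating the intervals.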
Affiliation(s)
- Max Menssen
- Abteilung Biostatistik, Institut für Zellbiologie und Biophysik, Leibniz Universität Hannover, Hannover, Germany
- Frank Schaarschmidt
- Abteilung Biostatistik, Institut für Zellbiologie und Biophysik, Leibniz Universität Hannover, Hannover, Germany
5
Jakaitiene A, Avino M, Guarracino MR. Beta-Binomial Model for the Detection of Rare Mutations in Pooled Next-Generation Sequencing Experiments. J Comput Biol 2016; 24:357-367. [PMID: 27632638 DOI: 10.1089/cmb.2016.0106] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.8] [Indexed: 12/30/2022]
Abstract
Despite diminishing costs, next-generation sequencing (NGS) remains expensive for studies with a large number of individuals. To save costs, the genomes of pools containing multiple samples may be sequenced instead. Many software tools are currently available for the detection of single-nucleotide polymorphisms (SNPs). Sensitivity and specificity depend on the model used and the data analyzed, indicating that all of these tools have room for improvement. We use a beta-binomial model to detect rare mutations in untagged pooled NGS experiments. We propose a multireference framework for pooled data with the ability to be specific for up to two patients affected by neuromuscular disorders (NMD). We assessed the results by comparison with the Genome Analysis Toolkit (GATK), CRISP, SNVer, and FreeBayes. Our results show that the multireference approach applying the beta-binomial model is accurate in predicting rare mutations at a fraction of 0.01. Finally, we explored the concordance of mutations between the model and the software tools, checking their involvement in any NMD-related gene. We detected seven novel SNPs, for which functional analysis produced enriched terms related to locomotion and musculature.
Affiliation(s)
- Audrone Jakaitiene
- Bioinformatics and Biostatistics Center, Department of Human and Medical Genetics, Faculty of Medicine, Vilnius University, Vilnius, Lithuania
- Mariano Avino
- High Performance Computing and Networking Institute, National Research Council, Naples, Italy
- Mario Rosario Guarracino
- High Performance Computing and Networking Institute, National Research Council, Naples, Italy
6
Robinson MD, Kahraman A, Law CW, Lindsay H, Nowicka M, Weber LM, Zhou X. Statistical methods for detecting differentially methylated loci and regions. Front Genet 2014; 5:324. [PMID: 25278959 PMCID: PMC4165320 DOI: 10.3389/fgene.2014.00324] [Citation(s) in RCA: 71] [Impact Index Per Article: 7.1] [Received: 07/14/2014] [Accepted: 08/29/2014] [Indexed: 12/19/2022]
Abstract
DNA methylation, the reversible addition of methyl groups at CpG dinucleotides, represents an important regulatory layer associated with gene expression. Changed methylation status has been noted across diverse pathological states, including cancer. The rapid development and uptake of microarrays and large scale DNA sequencing has prompted an explosion of data analytic methods for processing and discovering changes in DNA methylation across varied data types. In this mini-review, we present a compact and accessible discussion of many of the salient challenges, such as experimental design, statistical methods for differential methylation detection, critical considerations such as cell type composition and the potential confounding that can arise from batch effects. From a statistical perspective, our main interests include the use of empirical Bayes or hierarchical models, which have proved immensely powerful in genomics, and the procedures by which false discovery control is achieved.
Affiliation(s)
- Mark D Robinson
- Institute of Molecular Life Sciences, University of Zurich, Zurich, Switzerland; SIB Swiss Institute of Bioinformatics, University of Zurich, Zurich, Switzerland
- Abdullah Kahraman
- Institute of Molecular Life Sciences, University of Zurich, Zurich, Switzerland; SIB Swiss Institute of Bioinformatics, University of Zurich, Zurich, Switzerland
- Charity W Law
- Institute of Molecular Life Sciences, University of Zurich, Zurich, Switzerland; SIB Swiss Institute of Bioinformatics, University of Zurich, Zurich, Switzerland
- Helen Lindsay
- Institute of Molecular Life Sciences, University of Zurich, Zurich, Switzerland; SIB Swiss Institute of Bioinformatics, University of Zurich, Zurich, Switzerland
- Malgorzata Nowicka
- Institute of Molecular Life Sciences, University of Zurich, Zurich, Switzerland; SIB Swiss Institute of Bioinformatics, University of Zurich, Zurich, Switzerland
- Lukas M Weber
- Institute of Molecular Life Sciences, University of Zurich, Zurich, Switzerland; SIB Swiss Institute of Bioinformatics, University of Zurich, Zurich, Switzerland
- Xiaobei Zhou
- Institute of Molecular Life Sciences, University of Zurich, Zurich, Switzerland; SIB Swiss Institute of Bioinformatics, University of Zurich, Zurich, Switzerland
7
Iddi S, Molenberghs G, Aregay M, Kalema G. Empirical Bayes estimates for correlated hierarchical data with overdispersion. Pharm Stat 2014; 13:316-26. [PMID: 25181392 DOI: 10.1002/pst.1635] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 11/01/2013] [Revised: 05/02/2014] [Accepted: 07/24/2014] [Indexed: 11/07/2022]
Abstract
An extension of the generalized linear mixed model was constructed to simultaneously accommodate overdispersion and hierarchies present in longitudinal or clustered data. This so-called combined model includes conjugate random effects at observation level for overdispersion and normal random effects at subject level to handle correlation, respectively. A variety of data types can be handled in this way, using different members of the exponential family. Both maximum likelihood and Bayesian estimation for covariate effects and variance components were proposed. The focus of this paper is the development of an estimation procedure for the two sets of random effects. These are necessary when making predictions for future responses or their associated probabilities. Such (empirical) Bayes estimates will also be helpful in model diagnosis, both when checking the fit of the model as well as when investigating outlying observations. The proposed procedure is applied to three datasets of different outcome types.
Affiliation(s)
- Samuel Iddi
- Department of Statistics, University of Ghana, Legon-Accra, Ghana; I-BioStat, KU Leuven - University of Leuven
8
Abstract
One of the most important indicators of dental caries prevalence is the total count of decayed, missing or filled surfaces in a tooth. These count data are often clustered in nature (several count responses clustered within a subject), over-dispersed as well as spatially referenced (a diseased tooth might be positively influencing the decay process of a set of neighbouring teeth). In this article, we develop a multivariate spatial beta-binomial (BB) model for these data that accommodates both over-dispersion and latent spatial associations. Using a Bayesian paradigm, the re-parameterised marginal mean (as well as variance) under the BB framework is modelled using a regression on subject/tooth-specific co-variables and a conditionally autoregressive prior that models the latent spatial process. The necessity of exploiting spatial associations to model count data arising in dental caries research is demonstrated using a small simulation study. An application to real data confirms that our spatial BB model provides superior estimation and model fit compared to other sub-models that do not consider modelling spatial associations.
Affiliation(s)
- Dipankar Bandyopadhyay
- Division of Biostatistics and Epidemiology, Medical University of South Carolina, Charleston, SC 29425, USA.