26
|
Rahnenführer J, Futschik A. Cost-effective Screening for Differentially Expressed Genes in Microarray Experiments Based on Normal Mixtures. AUSTRIAN JOURNAL OF STATISTICS 2016. [DOI: 10.17713/ajs.v32i3.458] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 11/06/2022] Open
Abstract
Microarray experiments allow the monitoring of expression levels for thousands of genes simultaneously. Based on data obtained from the co-hybridization of two mRNA samples, a frequent goal is to find out which genes are differentially expressed. For this purpose, we propose to estimate the distribution of popular test statistics by a mixture of normal distributions. These statistics are calculated for each gene separately. A Bayes classifier is then used to decide upon differential expression. The cut-off for the classifier is chosen according to the number of false positives and negatives when applied to realistic data generating models. In particular, we generate data from a mixture model and from an Empirical Bayes model. By comparing the numbers of false decisions for various test statistics in the context of the considered models, we investigate which of the statistics are particularly suitable with our approach.
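The classification step described in this abstract can be illustrated with a minimal sketch. It assumes a two-component normal mixture for the per-gene test statistic, with fixed, made-up parameters; in the paper the mixture is estimated from data and the cut-off is tuned on simulated models. The names `posterior_de`, `classify`, `p_de` and `alt_sd` are hypothetical.

```python
import math

def normal_pdf(x, mean, sd):
    """Density of a normal distribution with the given mean and standard deviation."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))

def posterior_de(stat, p_de, null_sd=1.0, alt_sd=3.0):
    """Posterior probability of differential expression for one gene.

    Assumed mixture (illustrative only):
    (1 - p_de) * N(0, null_sd^2) + p_de * N(0, alt_sd^2).
    """
    f0 = (1.0 - p_de) * normal_pdf(stat, 0.0, null_sd)
    f1 = p_de * normal_pdf(stat, 0.0, alt_sd)
    return f1 / (f0 + f1)

def classify(stats, p_de, cutoff=0.5):
    """Bayes classifier: flag a gene when its posterior exceeds the cut-off."""
    return [posterior_de(s, p_de) > cutoff for s in stats]

if __name__ == "__main__":
    print(classify([0.1, 6.0], p_de=0.1))
```

Raising `cutoff` trades false positives for false negatives, which is the quantity the abstract says is tuned.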
|
27
|
Wertz W, Duller C, Pilz J, Quatember A, Berghold A, Stadlober E, Bomze I, Steckel-Berger G, Katzenbeisser W, Futschik A, Liebmann FG. Book Reviews. AUSTRIAN JOURNAL OF STATISTICS 2016. [DOI: 10.17713/ajs.v29i1.501] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/06/2022] Open
Abstract
Elements of Large-Sample Theory. (E.L. Lehmann)
Mathematical Statistics. (J. Shao)
A Practical Guide to Heavy Tails. (R.J. Adler, R.E. Feldmann, M.S. Taqqu)
Statistik. Der Weg zur Datenanalyse. (L. Fahrmeir, R. Künstler, I. Pigeot, G. Tutz)
Einführung in die angewandte Statistik für Biowissenschaftler. (A. Kessel, M. Junge, W. Nachtigall)
Data Analysis, Statistical and Computational Methods for Scientists and Engineers. (S. Brandt)
Numerical Analysis for Statisticians. (K. Lange)
Angewandte Statistik. (L. Sachs)
Using SPSS for Windows. Data Analysis and Graphics. (K.E. Voelkl, S. Gerber)
Stochastik mit Mathematica. (M. Overbeck-Larisch, W. Dolejsky)
Negativauslese und Tarifdifferenzierung im Versicherungssektor. (Ch. Bach)
|
28
|
Futschik A. Ist der Euro fair? Ergebnis einer empirischen Untersuchung. AUSTRIAN JOURNAL OF STATISTICS 2016. [DOI: 10.17713/ajs.v31i1.467] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/06/2022] Open
Abstract
Recently, media reports have appeared claiming that the euro is not a fair coin. A deviation of the probability of heads from 1/2 has been claimed in particular for the Belgian and French euro coins. In this context, we present results of coin-tossing experiments and discuss the implications from a statistical point of view.
|
29
|
Lürzel S, Windschnurer I, Futschik A, Palme R, Waiblinger S. Effects of gentle interactions on the relationship with humans and on stress-related parameters in group-housed calves. Anim Welf 2015. [DOI: 10.7120/09627286.24.4.475] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.9] [Indexed: 11/05/2022]
|
30
|
Lürzel S, Münsch C, Windschnurer I, Futschik A, Palme R, Waiblinger S. The influence of gentle interactions on avoidance distance towards humans, weight gain and physiological parameters in group-housed dairy calves. Appl Anim Behav Sci 2015. [DOI: 10.1016/j.applanim.2015.09.004] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.2] [Indexed: 10/23/2022]
|
31
|
Nordmann E, Barth K, Futschik A, Palme R, Waiblinger S. Head partitions at the feed barrier affect behaviour of goats. Appl Anim Behav Sci 2015. [DOI: 10.1016/j.applanim.2015.03.011] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Indexed: 11/16/2022]
|
32
|
Wagner K, Seitner D, Barth K, Palme R, Futschik A, Waiblinger S. Effects of mother versus artificial rearing during the first 12 weeks of life on challenge responses of dairy cows. Appl Anim Behav Sci 2015. [DOI: 10.1016/j.applanim.2014.12.010] [Citation(s) in RCA: 27] [Impact Index Per Article: 3.0] [Indexed: 11/26/2022]
|
33
|
Burgstaller JP, Johnston IG, Jones NS, Albrechtová J, Kolbe T, Vogl C, Futschik A, Mayrhofer C, Klein D, Sabitzer S, Blattner M, Gülly C, Poulton J, Rülicke T, Piálek J, Steinborn R, Brem G. MtDNA segregation in heteroplasmic tissues is common in vivo and modulated by haplotype differences and developmental stage. Cell Rep 2014; 7:2031-2041. [PMID: 24910436 DOI: 10.1016/j.celrep.2014.05.020] [Citation(s) in RCA: 64] [Impact Index Per Article: 6.4] [Received: 12/12/2013] [Revised: 03/11/2014] [Accepted: 05/12/2014] [Indexed: 12/21/2022] Open
Abstract
The dynamics by which mitochondrial DNA (mtDNA) evolves within organisms are still poorly understood, despite the fact that inheritance and proliferation of mutated mtDNA cause fatal and incurable diseases. When two mtDNA haplotypes are present in a cell, it is usually assumed that segregation (the proliferation of one haplotype over another) is negligible. We challenge this assumption by showing that segregation depends on the genetic distance between haplotypes. We provide evidence by creating four mouse models containing mtDNA haplotype pairs of varying diversity. We find tissue-specific segregation in all models over a wide range of tissues. Key findings are segregation in postmitotic tissues (important for disease models) and segregation covering all developmental stages from prenatal to old age. We identify four dynamic regimes of mtDNA segregation. Our findings suggest potential complications for therapies in human populations: we propose "haplotype matching" as an approach to avoid these issues.
|
34
|
Futschik A, Hotz T, Munk A, Sieling H. Multiscale DNA partitioning: statistical evidence for segments. Bioinformatics 2014; 30:2255-62. [PMID: 24753487 DOI: 10.1093/bioinformatics/btu180] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.6] [Indexed: 11/12/2022] Open
Abstract
MOTIVATION: DNA segmentation, i.e. the partitioning of DNA in compositionally homogeneous segments, is a basic task in bioinformatics. Different algorithms have been proposed for various partitioning criteria such as Guanine/Cytosine (GC) content, local ancestry in population genetics or copy number variation. A critical component of any such method is the choice of an appropriate number of segments. Some methods use model selection criteria and do not provide a suitable error control. Other methods that are based on simulating a statistic under a null model provide suitable error control only if the correct null model is chosen. RESULTS: Here, we focus on partitioning with respect to GC content and propose a new approach that provides statistical error control: as in statistical hypothesis testing, it guarantees with a user-specified probability [Formula: see text] that the number of identified segments does not exceed the number of actually present segments. The method is based on a statistical multiscale criterion, rendering this as a segmentation method that searches segments of any length (on all scales) simultaneously. It is also accurate in localizing segments: under benchmark scenarios, our approach leads to a segmentation that is more accurate than the approaches discussed in the comparative review of Elhaik et al. In our real data examples, we find segments that often correspond well to features taken from standard University of California at Santa Cruz (UCSC) genome annotation tracks. AVAILABILITY AND IMPLEMENTATION: Our method is implemented in function smuceR of the R-package stepR available at http://www.stochastik.math.uni-goettingen.de/smuce.
|
35
|
Suvorov A, Nolte V, Pandey RV, Franssen SU, Futschik A, Schlötterer C. Intra-specific regulatory variation in Drosophila pseudoobscura. PLoS One 2013; 8:e83547. [PMID: 24386226 PMCID: PMC3873948 DOI: 10.1371/journal.pone.0083547] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.0] [Received: 04/13/2013] [Accepted: 11/06/2013] [Indexed: 11/18/2022] Open
Abstract
It is generally accepted that gene regulation serves an important role in determining the phenotype. To shed light on the evolutionary forces operating on gene regulation, previous studies mainly focused on the expression differences between species and their inter-specific hybrids. Here, we use RNA-Seq to study the intra-specific distribution of cis- and trans-regulatory variation in Drosophila pseudoobscura. Consistent with previous results, we find almost twice as many genes (26%) with significant trans-effects than genes with significant cis-effects (18%). While this result supports the previous suggestion of a larger mutational target of trans-effects, we also show that trans-effects may be subjected to purifying selection. Our results underline the importance of intra-specific analyses for the understanding of the evolution of gene expression.
|
36
|
Szabò S, Barth K, Graml C, Futschik A, Palme R, Waiblinger S. Introducing young dairy goats into the adult herd after parturition reduces social stress. J Dairy Sci 2013; 96:5644-55. [DOI: 10.3168/jds.2012-5556] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.6] [Received: 03/21/2012] [Accepted: 04/29/2013] [Indexed: 11/19/2022]
|
37
|
Wagner K, Barth K, Hillmann E, Palme R, Futschik A, Waiblinger S. Mother rearing of dairy calves: Reactions to isolation and to confrontation with an unfamiliar conspecific in a new environment. Appl Anim Behav Sci 2013. [DOI: 10.1016/j.applanim.2013.04.010] [Citation(s) in RCA: 34] [Impact Index Per Article: 3.1] [Indexed: 01/09/2023]
|
38
|
Bastide H, Betancourt A, Nolte V, Tobler R, Stöbe P, Futschik A, Schlötterer C. A genome-wide, fine-scale map of natural pigmentation variation in Drosophila melanogaster. PLoS Genet 2013; 9:e1003534. [PMID: 23754958 PMCID: PMC3674992 DOI: 10.1371/journal.pgen.1003534] [Citation(s) in RCA: 97] [Impact Index Per Article: 8.8] [Received: 11/28/2012] [Accepted: 04/11/2013] [Indexed: 11/25/2022] Open
Abstract
Various approaches can be applied to uncover the genetic basis of natural phenotypic variation, each with their specific strengths and limitations. Here, we use a replicated genome-wide association approach (Pool-GWAS) to fine-scale map genomic regions contributing to natural variation in female abdominal pigmentation in Drosophila melanogaster, a trait that is highly variable in natural populations and highly heritable in the laboratory. We examined abdominal pigmentation phenotypes in approximately 8000 female European D. melanogaster, isolating 1000 individuals with extreme phenotypes. We then used whole-genome Illumina sequencing to identify single nucleotide polymorphisms (SNPs) segregating in our sample, and tested these for associations with pigmentation by contrasting allele frequencies between replicate pools of light and dark individuals. We identify two small regions near the pigmentation genes tan and bric-à-brac 1, both corresponding to known cis-regulatory regions, which contain SNPs showing significant associations with pigmentation variation. While the Pool-GWAS approach suffers some limitations, its cost advantage facilitates replication and it can be applied to any non-model system with an available reference genome. Phenotypic variation is abundant in natural populations, but its genetic basis is not always well-understood. Here, we examine the genetic basis of body pigmentation in Drosophila, a trait with a long history of study in Drosophila genetics and evolution. We conducted the first genome-wide scan for polymorphism associated with pigmentation variation in a large natural sample of D. melanogaster, and found SNPs near two genes, tan and bric-à-brac 1, affecting the trait. The SNPs associated with pigmentation variation in these genes appear to act by affecting the regulation of the pigmentation genes, rather than their protein coding sequence.
|
39
|
Pandey RV, Franssen SU, Futschik A, Schlötterer C. Allelic imbalance metre (Allim), a new tool for measuring allele-specific gene expression with RNA-seq data. Mol Ecol Resour 2013; 13:740-5. [PMID: 23615333 PMCID: PMC3739924 DOI: 10.1111/1755-0998.12110] [Citation(s) in RCA: 29] [Impact Index Per Article: 2.6] [Received: 01/22/2013] [Revised: 03/18/2013] [Accepted: 03/22/2013] [Indexed: 11/29/2022]
Abstract
Estimating differences in gene expression among alleles is of high interest for many areas in biology and medicine. Here, we present a user-friendly software tool, Allim, to estimate allele-specific gene expression. Because mapping bias is a major problem for reliable estimates of allele-specific gene expression using RNA-seq, Allim combines two different strategies to account for the mapping biases. In order to reduce the mapping bias, Allim first generates a polymorphism-aware reference genome that accounts for the sequence variation between the alleles. Then, a sequence-specific simulation tool estimates the residual mapping bias. Statistical tests for allelic imbalance are provided that can be used with the bias corrected RNA-seq data.
|
40
|
Faisal M, Futschik A, Hussain I. A new approach to choose acceptance cutoff for approximate Bayesian computation. J Appl Stat 2013. [DOI: 10.1080/02664763.2012.756860] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 10/27/2022]
|
41
|
Boitard S, Kofler R, Françoise P, Robelin D, Schlötterer C, Futschik A. Pool-hmm: a Python program for estimating the allele frequency spectrum and detecting selective sweeps from next generation sequencing of pooled samples. Mol Ecol Resour 2013; 13:337-40. [PMID: 23311589 PMCID: PMC3592992 DOI: 10.1111/1755-0998.12063] [Citation(s) in RCA: 48] [Impact Index Per Article: 4.4] [Received: 07/10/2012] [Revised: 11/26/2012] [Accepted: 11/29/2012] [Indexed: 11/28/2022]
Abstract
Due to its cost effectiveness, next generation sequencing of pools of individuals (Pool-Seq) is becoming a popular strategy for genome-wide estimation of allele frequencies in population samples. As the allele frequency spectrum provides information about past episodes of selection, Pool-Seq is also a promising design for genomic scans for selection. However, no software tool has yet been developed for selection scans based on Pool-Seq data. We introduce Pool-hmm, a Python program for the estimation of allele frequencies and the detection of selective sweeps in a Pool-Seq sample. Pool-hmm includes several options that allow a flexible analysis of Pool-Seq data, and can be run in parallel on several processors. Source code and documentation for Pool-hmm are freely available at https://qgsp.jouy.inra.fr/.
|
42
|
Aeschbacher S, Futschik A, Beaumont MA. Approximate Bayesian computation for modular inference problems with many parameters: the example of migration rates. Mol Ecol 2013; 22:987-1002. [DOI: 10.1111/mec.12165] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.2] [Received: 02/28/2012] [Revised: 11/08/2012] [Accepted: 11/08/2012] [Indexed: 12/01/2022]
|
43
|
Wagner K, Barth K, Palme R, Futschik A, Waiblinger S. Integration into the dairy cow herd: Long-term effects of mother contact during the first twelve weeks of life. Appl Anim Behav Sci 2012. [DOI: 10.1016/j.applanim.2012.08.011] [Citation(s) in RCA: 43] [Impact Index Per Article: 3.6] [Indexed: 11/16/2022]
|
44
|
Ramsey DM, Futschik A. DNA pooling and statistical tests for the detection of single nucleotide polymorphisms. Stat Appl Genet Mol Biol 2012; 11:Article 1. [PMID: 23023700 DOI: 10.1515/1544-6115.1763] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Indexed: 11/15/2022]
Abstract
The development of next generation genome sequencers gives the opportunity of learning more about the genetic make-up of human and other populations. One important question involves the location of sites at which variation occurs within a population. Our focus will be on the detection of rare variants. Such variants will often not be present in smaller samples and are hard to distinguish from sequencing errors in larger samples. This is particularly true for pooled samples which are often used as part of a cost saving strategy. The focus of this article is on experiments that involve DNA pooling. We derive experimental designs that optimize the power of statistical tests for detecting single nucleotide polymorphisms (SNPs, sites at which there is variation within a population). We also present a new simple test that calls a SNP, if the maximum number of reads of a prospective variant across lanes exceeds a certain threshold. The value of this threshold is defined according to the number of available lanes, the parameters of the genome sequencer and a specified probability of accepting that there is variation at a site when no variation is present. On the basis of this test, we derive pool sizes which are optimal for the detection of rare variants. This test is compared with a likelihood ratio test, which takes into account the number of reads of a prospective variant from all the lanes. It is shown that the threshold based rule achieves a comparable power to this likelihood ratio test and may well be a useful tool in determining near optimal pool sizes for the detection of rare alleles in practical applications.
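The threshold rule described in this abstract can be sketched as follows. This is a minimal illustration under assumptions not stated in the abstract: each lane yields `n_reads` reads at the site, erroneous reads of the prospective variant are Binomial(`n_reads`, `err`) under the null of no variation, and lanes are independent. All function and parameter names are hypothetical.

```python
import math

def binom_cdf(t, n, p):
    """P(X <= t) for X ~ Binomial(n, p)."""
    return sum(math.comb(n, k) * p**k * (1.0 - p)**(n - k) for k in range(t + 1))

def false_call_prob(t, n_reads, n_lanes, err):
    """Null probability that the maximum variant count over lanes exceeds t."""
    return 1.0 - binom_cdf(t, n_reads, err) ** n_lanes

def snp_threshold(n_reads, n_lanes, err, alpha):
    """Smallest t whose false-call probability is at most alpha."""
    for t in range(n_reads + 1):
        if false_call_prob(t, n_reads, n_lanes, err) <= alpha:
            return t
    return n_reads

def call_snp(variant_reads_per_lane, n_reads, err, alpha):
    """Call a SNP when the largest per-lane variant count exceeds the threshold."""
    t = snp_threshold(n_reads, len(variant_reads_per_lane), err, alpha)
    return max(variant_reads_per_lane) > t

if __name__ == "__main__":
    print(snp_threshold(n_reads=100, n_lanes=4, err=0.01, alpha=0.05))
    print(call_snp([0, 1, 0, 30], n_reads=100, err=0.01, alpha=0.05))
```

The threshold depends only on the number of lanes, the assumed sequencer error rate and the tolerated false-call probability, matching the three ingredients the abstract lists.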
|
45
|
Boitard S, Schlötterer C, Nolte V, Pandey RV, Futschik A. Detecting selective sweeps from pooled next-generation sequencing samples. Mol Biol Evol 2012; 29:2177-86. [PMID: 22411855 PMCID: PMC3424412 DOI: 10.1093/molbev/mss090] [Citation(s) in RCA: 71] [Impact Index Per Article: 5.9] [Indexed: 12/19/2022] Open
Abstract
Due to its cost effectiveness, next-generation sequencing of pools of individuals (Pool-Seq) is becoming a popular strategy for characterizing variation in population samples. Because Pool-Seq provides genome-wide SNP frequency data, it is possible to use them for demographic inference and/or the identification of selective sweeps. Here, we introduce a statistical method that is designed to detect selective sweeps from pooled data by accounting for statistical challenges associated with Pool-Seq, namely sequencing errors and random sampling among chromosomes. This allows for an efficient use of the information: all base calls are included in the analysis, but the higher credibility of regions with higher coverage and base calls with better quality scores is accounted for. Computer simulations show that our method efficiently detects sweeps even at very low coverage (0.5× per chromosome). Indeed, the power of detecting sweeps is similar to what we could expect from sequences of individual chromosomes. Since the inference of selective sweeps is based on the allele frequency spectrum (AFS), we also provide a method to accurately estimate the AFS provided that the quality scores for the sequence reads are reliable. Applying our approach to Pool-Seq data from Drosophila melanogaster, we identify several selective sweep signatures on chromosome X that include some previously well-characterized sweeps like the wapl region.
|
46
|
|
47
|
Futschik A, Gach F. On the inadmissibility of Watterson's estimator. Theor Popul Biol 2007; 73:212-21. [PMID: 18215409 DOI: 10.1016/j.tpb.2007.11.009] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.5] [Received: 03/26/2007] [Revised: 11/08/2007] [Accepted: 11/13/2007] [Indexed: 11/19/2022]
Abstract
We consider the estimation of the scaled mutation parameter theta, which is one of the parameters of key interest in population genetics. We provide a general result showing when estimators of theta can be improved using shrinkage when taking the mean squared error as the measure of performance. As a consequence, we show that Watterson's estimator is inadmissible, and propose an alternative shrinkage-based estimator that is easy to calculate and has a smaller mean squared error than Watterson's estimator for all possible parameter values 0<theta<infinity. This estimator is admissible in the class of all linear estimators. We then derive improved versions for other estimators of theta, including the MLE. We also investigate how an improvement can be obtained both when combining information from several independent loci and when explicitly taking into account recombination. A simulation study provides information about the amount of improvement achieved by our alternative estimators.
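To make the shrinkage idea concrete, here is a small sketch comparing mean squared errors analytically. It assumes a single non-recombining locus under the standard neutral model, where the number of segregating sites S has mean a_n·theta and variance a_n·theta + b_n·theta^2, and Watterson's estimator is S / a_n. The shrinkage factor used below, a_n^2 / (a_n^2 + b_n), is one choice that lowers the MSE for every theta under these assumptions; it is shown for illustration and is not claimed to be the exact estimator proposed in the paper. All function names are hypothetical.

```python
def harmonic_sums(n):
    """Return a_n = sum 1/i and b_n = sum 1/i^2 for i = 1..n-1 (n sampled sequences)."""
    a = sum(1.0 / i for i in range(1, n))
    b = sum(1.0 / i**2 for i in range(1, n))
    return a, b

def mse_scaled_watterson(theta, n, c=1.0):
    """MSE of c * S / a_n, assuming Var[S] = a_n*theta + b_n*theta^2 (no recombination)."""
    a, b = harmonic_sums(n)
    variance = c * c * (theta / a + b * theta**2 / a**2)
    bias = (c - 1.0) * theta
    return variance + bias**2

def shrinkage_factor(n):
    """Illustrative shrinkage factor a_n^2 / (a_n^2 + b_n); always below 1."""
    a, b = harmonic_sums(n)
    return a * a / (a * a + b)

if __name__ == "__main__":
    n = 20
    c = shrinkage_factor(n)
    for theta in (0.1, 1.0, 10.0):
        print(theta, mse_scaled_watterson(theta, n), mse_scaled_watterson(theta, n, c))
```

Because both the theta and theta^2 terms of the MSE difference are negative for this factor, the improvement holds for all theta > 0 rather than only on average.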
|
48
|
Baierl A, Futschik A, Bogdan M, Biecek P. Locating multiple interacting quantitative trait loci using robust model selection. Comput Stat Data Anal 2007. [DOI: 10.1016/j.csda.2007.02.010] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Indexed: 10/23/2022]
|
49
|
Zak M, Baierl A, Bogdan M, Futschik A. Locating multiple interacting quantitative trait Loci using rank-based model selection. Genetics 2007; 176:1845-54. [PMID: 17507685 PMCID: PMC1931563 DOI: 10.1534/genetics.106.068031] [Citation(s) in RCA: 15] [Impact Index Per Article: 0.9] [Received: 11/09/2006] [Accepted: 04/17/2007] [Indexed: 11/18/2022] Open
Abstract
In previous work, a modified version of the Bayesian information criterion (mBIC) was proposed to locate multiple interacting quantitative trait loci (QTL). Simulation studies and real data analysis demonstrate good properties of the mBIC in situations where the error distribution is approximately normal. However, as with other standard techniques of QTL mapping, the performance of the mBIC strongly deteriorates when the trait distribution is heavy tailed or when the data contain a significant proportion of outliers. In the present article, we propose a suitable robust version of the mBIC that is based on ranks. We investigate the properties of the resulting method on the basis of theoretical calculations, computer simulations, and a real data analysis. Our simulation results show that for the sample sizes typically used in QTL mapping, the methods based on ranks are almost as efficient as standard techniques when the data are normal and are much better when the data come from some heavy-tailed distribution or include a proportion of outliers.
|
50
|
Clarke BR, Futschik A. On the convergence of Newton's method when estimating higher dimensional parameters. J MULTIVARIATE ANAL 2007. [DOI: 10.1016/j.jmva.2006.12.002] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 10/23/2022]
|