Haselimashhadi H, Mason JC, Munoz-Fuentes V, López-Gómez F, Babalola K, Acar EF, Kumar V, White J, Flenniken AM, King R, Straiton E, Seavitt JR, Gaspero A, Garza A, Christianson AE, Hsu CW, Reynolds CL, Lanza DG, Lorenzo I, Green JR, Gallegos JJ, Bohat R, Samaco RC, Veeraragavan S, Kim JK, Miller G, Fuchs H, Garrett L, Becker L, Kang YK, Clary D, Cho SY, Tamura M, Tanaka N, Soo KD, Bezginov A, About GB, Champy MF, Vasseur L, Leblanc S, Meziane H, Selloum M, Reilly PT, Spielmann N, Maier H, Gailus-Durner V, Sorg T, Hiroshi M, Yuichi O, Heaney JD, Dickinson ME, Wolfgang W, Tocchini-Valentini GP, Lloyd KCK, McKerlie C, Seong JK, Yann H, de Angelis MH, Brown SDM, Smedley D, Flicek P, Mallon AM, Parkinson H, Meehan TF. Soft windowing application to improve analysis of high-throughput phenotyping data. Bioinformatics 2020; 36:1492-1500. [PMID: 31591642 PMCID: PMC7115897 DOI: 10.1093/bioinformatics/btz744] [Received: 05/09/2019] [Revised: 08/20/2019] [Accepted: 10/04/2019] [Indexed: 11/14/2022]
Abstract
Motivation: High-throughput phenomic projects generate complex data from small treatment and large control groups that increase the power of the analyses but introduce variation over time. A method is needed to utilize a set of temporally local controls that maximizes analytic power while minimizing noise from unspecified environmental factors.
Results: Here we introduce ‘soft windowing’, a methodological approach that selects a window of time that includes the most appropriate controls for analysis. Using phenotype data from the International Mouse Phenotyping Consortium (IMPC), adaptive windows were applied such that control data collected proximally to mutants were assigned the maximal weight, while data collected earlier or later had less weight. We applied this method to IMPC data and compared the results with those obtained from a standard non-windowed approach. Validation was performed using a resampling approach in which we demonstrate a 10% reduction of false positives from 2.5 million analyses. We applied the method to our production analysis pipeline that establishes genotype–phenotype associations by comparing mutant versus control data. We report an increase of 30% in significant P-values, as well as linkage to 106 versus 99 disease models via phenotype overlap with the soft-windowed and non-windowed approaches, respectively, from a set of 2082 mutant mouse lines. Our method is generalizable and can benefit large-scale human phenomic projects such as the UK Biobank and the All of Us resources.
Availability and implementation: The method is freely available in the R package SmoothWin on CRAN: http://CRAN.R-project.org/package=SmoothWin.
Supplementary information: Supplementary data are available at Bioinformatics online.
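The weighting idea described in the abstract can be illustrated with a minimal sketch. It assumes a window built from the product of two logistic curves, which may differ from the function actually implemented in SmoothWin. The names `soft_window_weight`, `weighted_control_mean`, `half_width` and `sharpness` are hypothetical and chosen only for illustration.

```python
import math


def _sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def soft_window_weight(t, center, half_width, sharpness):
    """Weight in (0, 1) for a control measured at time t (e.g. in days).

    Assumption: the window is the product of a rising and a falling
    logistic curve placed half_width either side of center. Controls
    near center get a weight close to 1; the weight decays smoothly
    towards 0 for earlier or later controls.
    """
    rise = _sigmoid(sharpness * (t - (center - half_width)))
    fall = _sigmoid(-sharpness * (t - (center + half_width)))
    return rise * fall


def weighted_control_mean(times, values, center, half_width=30.0, sharpness=0.5):
    """Weighted mean of control values, emphasising temporally local controls."""
    weights = [soft_window_weight(t, center, half_width, sharpness) for t in times]
    total = sum(weights)
    if total == 0.0:
        raise ValueError("all control weights are zero; widen the window")
    return sum(w * v for w, v in zip(weights, values)) / total
```

In this sketch `center` would be the date on which the mutants were measured, `half_width` sets how far from that date controls keep near-full weight, and `sharpness` sets how abruptly the weight falls off. A very large `sharpness` approaches a hard window, while a very large `half_width` approaches the non-windowed analysis.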
Affiliation(s)
- Hamed Haselimashhadi
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, UK
- Jeremy C Mason
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, UK
- Violeta Munoz-Fuentes
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, UK
- Federico López-Gómez
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, UK
- Kolawole Babalola
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, UK
- Elif F Acar
  - The Centre for Phenogenomics; The Hospital for Sick Children, Toronto, Canada; Department of Statistics, University of Manitoba, Winnipeg, MB R3T 2N2 Canada
- Vivek Kumar
  - The Jackson Laboratory, Bar Harbor, ME 04609, USA
- Jacqui White
  - The Jackson Laboratory, Bar Harbor, ME 04609, USA
- Ann M Flenniken
  - The Centre for Phenogenomics; Lunenfeld-Tanenbaum Research Institute, Mount Sinai Hospital, Toronto, Canada
- Ritu Bohat
  - Baylor College of Medicine, Houston, TX, USA
- Jong Kyoung Kim
  - Daegu Gyeongbuk Institute of Science & Technology (DGIST), Daegu, Korea
- Lore Becker
  - Helmholtz Center Munich, Neuherberg, Germany
- David Clary
  - Mouse Biology Program, University of California Davis, Davis, CA, USA
- Soo Young Cho
  - National Cancer Center (NCC) & Korea Mouse Phenotyping Center (KMPC), Korea
- Kyung Dong Soo
  - Seoul National University & Korea Mouse Phenotyping Center (KMPC), Korea
- Alexandr Bezginov
  - The Centre for Phenogenomics; The Hospital for Sick Children, Toronto, Canada
- Ghina Bou About
  - Université de Strasbourg, CNRS, INSERM, Institut Clinique de la Souris, PHENOMIN-ICS, 67404 Illkirch, France
- Marie-France Champy
  - Université de Strasbourg, CNRS, INSERM, Institut Clinique de la Souris, PHENOMIN-ICS, 67404 Illkirch, France
- Laurent Vasseur
  - Université de Strasbourg, CNRS, INSERM, Institut Clinique de la Souris, PHENOMIN-ICS, 67404 Illkirch, France
- Sophie Leblanc
  - Université de Strasbourg, CNRS, INSERM, Institut Clinique de la Souris, PHENOMIN-ICS, 67404 Illkirch, France
- Hamid Meziane
  - Université de Strasbourg, CNRS, INSERM, Institut Clinique de la Souris, PHENOMIN-ICS, 67404 Illkirch, France
- Mohammed Selloum
  - Université de Strasbourg, CNRS, INSERM, Institut Clinique de la Souris, PHENOMIN-ICS, 67404 Illkirch, France
- Patrick T Reilly
  - Université de Strasbourg, CNRS, INSERM, Institut Clinique de la Souris, PHENOMIN-ICS, 67404 Illkirch, France
- Tania Sorg
  - Université de Strasbourg, CNRS, INSERM, Institut Clinique de la Souris, PHENOMIN-ICS, 67404 Illkirch, France
- Obata Yuichi
  - RIKEN BioResource Research Center, Tsukuba, Japan
- Wurst Wolfgang
  - Institute of Developmental Genetics, Helmholtz Centre Munich, Munich, Germany
- Colin McKerlie
  - The Centre for Phenogenomics; The Hospital for Sick Children, Toronto, Canada
- Je Kyung Seong
  - Seoul National University & Korea Mouse Phenotyping Center (KMPC), Korea
- Herault Yann
  - Université de Strasbourg, CNRS, INSERM, Institut de Génétique, Biologie Moléculaire et Cellulaire, Institut Clinique de la Souris, IGBMC, PHENOMIN-ICS, 67404 Illkirch, France
- Damian Smedley
  - William Harvey Research Institute, Charterhouse Square, Barts and the London School of Medicine and Dentistry, Queen Mary University of London, London EC1M 6BQ, UK
- Paul Flicek
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, UK
- Helen Parkinson
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, UK
- Terrence F Meehan
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, UK