1
Liu Y, Yeung WSB, Chiu PCN, Cao D. Computational approaches for predicting variant impact: An overview from resources, principles to applications. Front Genet 2022; 13:981005. [PMID: 36246661 PMCID: PMC9559863 DOI: 10.3389/fgene.2022.981005] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/29/2022] [Accepted: 08/08/2022] [Indexed: 11/13/2022] Open
Abstract
One objective of human genetics is to unveil the variants that contribute to human diseases. With the rapid development and wide use of next-generation sequencing (NGS), massive genomic sequence data have been created, making personal genetic information available. Conventional experimental evidence is critical in establishing the relationship between sequence variants and phenotype, but it is obtained with low efficiency. Because comprehensive databases and resources presenting clinical and experimental evidence on genotype-phenotype relationships are lacking, and because variants found by NGS keep accumulating, various computational tools that predict the impact of variants on phenotype have been developed to bridge the gap. In this review, we present a brief introduction to and discussion of the computational approaches for variant impact prediction. Using a novel categorization, we mainly focus on approaches for predicting the impact of non-synonymous variants (nsSNVs) and group them into six classes. Their underlying rationale and constraints, together with the concerns and remedies raised in comparative studies, are discussed. We also present how the predictive approaches are employed in different research settings. Although diverse constraints exist, the computational predictive approaches are indispensable in exploring genotype-phenotype relationships.
Affiliation(s)
- Ye Liu
  - Shenzhen Key Laboratory of Fertility Regulation, Reproductive Medicine Center, The University of Hong Kong-Shenzhen Hospital, Shenzhen, China
- William S. B. Yeung
  - Shenzhen Key Laboratory of Fertility Regulation, Reproductive Medicine Center, The University of Hong Kong-Shenzhen Hospital, Shenzhen, China
  - Department of Obstetrics and Gynaecology, Li Ka Shing Faculty of Medicine, The University of Hong Kong, Hong Kong SAR, China
- Philip C. N. Chiu
  - Shenzhen Key Laboratory of Fertility Regulation, Reproductive Medicine Center, The University of Hong Kong-Shenzhen Hospital, Shenzhen, China
  - Department of Obstetrics and Gynaecology, Li Ka Shing Faculty of Medicine, The University of Hong Kong, Hong Kong SAR, China
- Dandan Cao
  - Shenzhen Key Laboratory of Fertility Regulation, Reproductive Medicine Center, The University of Hong Kong-Shenzhen Hospital, Shenzhen, China
2
Hunt SE, Moore B, Amode RM, Armean IM, Lemos D, Mushtaq A, Parton A, Schuilenburg H, Szpak M, Thormann A, Perry E, Trevanion SJ, Flicek P, Yates AD, Cunningham F. Annotating and prioritizing genomic variants using the Ensembl Variant Effect Predictor-A tutorial. Hum Mutat 2021; 43:986-997. [PMID: 34816521 PMCID: PMC7613081 DOI: 10.1002/humu.24298] [Citation(s) in RCA: 25] [Impact Index Per Article: 8.3] [Received: 05/03/2021] [Revised: 11/02/2021] [Accepted: 11/14/2021] [Indexed: 11/05/2022]
Abstract
The Ensembl Variant Effect Predictor (VEP) is a freely available, open-source tool for the annotation and filtering of genomic variants. It predicts variant molecular consequences using the Ensembl/GENCODE or RefSeq gene sets. It also reports phenotype associations from databases such as ClinVar, allele frequencies from studies including gnomAD, and predictions of deleteriousness from tools such as Sorting Intolerant From Tolerant and Combined Annotation Dependent Depletion. Ensembl VEP includes filtering options to customize variant prioritization. It is well supported and updated roughly quarterly to incorporate the latest gene, variant, and phenotype association information. Ensembl VEP analysis can be performed using a highly configurable, extensible command-line tool, a Representational State Transfer application programming interface, and a user-friendly web interface. These access methods are designed to suit different levels of bioinformatics experience and meet different needs in terms of data size, visualization, and flexibility. In this tutorial, we will describe performing variant annotation using the Ensembl VEP web tool, which enables sophisticated analysis through a simple interface.
Affiliation(s)
- Sarah E Hunt
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, UK
- Benjamin Moore
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, UK
- Ridwan M Amode
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, UK
- Irina M Armean
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, UK
- Diana Lemos
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, UK
- Aleena Mushtaq
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, UK
- Andrew Parton
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, UK
- Helen Schuilenburg
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, UK
- Michał Szpak
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, UK
- Anja Thormann
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, UK
- Emily Perry
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, UK
- Stephen J Trevanion
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, UK
- Paul Flicek
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, UK
- Andrew D Yates
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, UK
- Fiona Cunningham
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, UK
3
Wagner AH, Babb L, Alterovitz G, Baudis M, Brush M, Cameron DL, Cline M, Griffith M, Griffith OL, Hunt SE, Kreda D, Lee JM, Li S, Lopez J, Moyer E, Nelson T, Patel RY, Riehle K, Robinson PN, Rynearson S, Schuilenburg H, Tsukanov K, Walsh B, Konopko M, Rehm HL, Yates AD, Freimuth RR, Hart RK. The GA4GH Variation Representation Specification: A computational framework for variation representation and federated identification. Cell Genomics 2021; 1. [PMID: 35311178 PMCID: PMC8929418 DOI: 10.1016/j.xgen.2021.100027] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Indexed: 12/13/2022]
Abstract
Maximizing the personal, public, research, and clinical value of genomic information will require the reliable exchange of genetic variation data. We report here the Variation Representation Specification (VRS, pronounced "verse"), an extensible framework for the computable representation of variation that complements contemporary human-readable and flat file standards for genomic variation representation. VRS provides semantically precise representations of variation and leverages this design to enable federated identification of biomolecular variation with globally consistent and unique computed identifiers. The VRS framework includes a terminology and information model, machine-readable schema, data sharing conventions, and a reference implementation, each of which is intended to be broadly useful and freely available for community use. VRS was developed by a partnership among national information resource providers, public initiatives, and diagnostic testing laboratories under the auspices of the Global Alliance for Genomics and Health (GA4GH).
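The idea of a globally consistent computed identifier can be illustrated with a digest derived from a record's content alone. The following is a minimal sketch and not the VRS reference implementation: the abstract does not describe the algorithm, so the truncated SHA-512 digest, the JSON canonicalisation, the `toy` prefix, and the example records are all assumptions made for illustration.

```python
import base64
import hashlib
import json


def truncated_digest(blob: bytes, nbytes: int = 24) -> str:
    """URL-safe base64 of the first `nbytes` bytes of SHA-512(blob)."""
    digest = hashlib.sha512(blob).digest()[:nbytes]
    return base64.urlsafe_b64encode(digest).decode("ascii")


def computed_identifier(record: dict, prefix: str = "toy") -> str:
    """Derive an identifier from the record's content only.

    Keys are sorted and whitespace is removed so that two parties holding
    the same content produce the same bytes, and hence the same identifier.
    """
    canonical = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return f"{prefix}:{truncated_digest(canonical.encode('utf-8'))}"


# Hypothetical records: same content, different key order.
variant_a = {"sequence": "hypothetical-seq-1", "start": 100, "end": 101, "alt": "T"}
variant_b = {"alt": "T", "end": 101, "start": 100, "sequence": "hypothetical-seq-1"}

# Both sites compute the same identifier without consulting a central registry.
print(computed_identifier(variant_a) == computed_identifier(variant_b))
```

Because the identifier depends only on the canonical content, independent resources can agree on it without coordinating, which is the sense in which identification is federated.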
Affiliation(s)
- Alex H. Wagner
  - Department of Pediatrics, The Ohio State University College of Medicine, Columbus, OH 43210, USA
  - The Steve and Cindy Rasmussen Institute for Genomic Medicine, Nationwide Children’s Hospital, Columbus, OH 43215, USA
  - Corresponding author
- Lawrence Babb
  - Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
  - Corresponding author
- Gil Alterovitz
  - Harvard Medical School, Boston, MA 02115, USA
  - Department of Medicine, Brigham and Women’s Hospital, Boston, MA 02115, USA
- Michael Baudis
  - University of Zurich and Swiss Institute of Bioinformatics, Zurich, Switzerland
- Matthew Brush
  - Oregon Health & Science University, Portland, OR 97239, USA
- Daniel L. Cameron
  - Bioinformatics Division, Walter and Eliza Hall Institute of Medical Research, Melbourne, VIC, Australia
  - Department of Medical Biology, University of Melbourne, Melbourne, VIC, Australia
- Melissa Cline
  - UC Santa Cruz Genomics Institute, Santa Cruz, CA 95060, USA
- Malachi Griffith
  - Washington University School of Medicine, St. Louis, MO 63108, USA
- Obi L. Griffith
  - Washington University School of Medicine, St. Louis, MO 63108, USA
- Sarah E. Hunt
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- David Kreda
  - Department of Biomedical Informatics, Harvard Medical School, Boston, MA 02115, USA
- Jennifer M. Lee
  - Essex Management LLC and National Cancer Institute, Rockville, MD 20850, USA
- Stephanie Li
  - The Global Alliance for Genomics and Health, Toronto, ON, Canada
- Eric Moyer
  - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA
- Kevin Riehle
  - Baylor College of Medicine, Houston, TX 77030, USA
- Shawn Rynearson
  - Utah Center for Genetic Discovery, University of Utah, Salt Lake City, UT 84112, USA
- Helen Schuilenburg
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Kirill Tsukanov
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Brian Walsh
  - Oregon Health & Science University, Portland, OR 97239, USA
- Melissa Konopko
  - The Global Alliance for Genomics and Health, Toronto, ON, Canada
- Heidi L. Rehm
  - Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
  - Center for Genomic Medicine, Massachusetts General Hospital, Cambridge, MA 02142, USA
- Andrew D. Yates
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Robert R. Freimuth
  - Center for Individualized Medicine, Department of Artificial Intelligence and Informatics, Mayo Clinic, Rochester, MN 55905, USA
- Reece K. Hart
  - Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
  - MyOme, Inc., Menlo Park, CA 94070, USA
  - Corresponding author
4
Mante J, Roehner N, Keating K, McLaughlin JA, Young E, Beal J, Myers CJ. Curation Principles Derived from the Analysis of the SBOL iGEM Data Set. ACS Synth Biol 2021; 10:2592-2606. [PMID: 34546707 DOI: 10.1021/acssynbio.1c00225] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 01/04/2023]
Abstract
As an engineering endeavor, synthetic biology requires effective sharing of genetic design information that can be reused in the construction of new designs. While there are a number of large community repositories of design information, curation of this information has been limited. This in turn limits the ways in which design information can be put to use. The aim of this work was to improve this situation by creating a curated library of parts from the International Genetically Engineered Machines (iGEM) registry data set. To this end, an analysis of the Synthetic Biology Open Language (SBOL) version of the iGEM registry was carried out using four different approaches (simple statistics, SnapGene autoannotation, SYNBICT autoannotation, and expert analysis), the results of which are presented herein. Key challenges encountered include the use of free text, insufficient part provenance, part duplication, lack of part removal, and insufficient continuous curation. On the basis of these analyses, the focus has shifted from the creation of a curated iGEM part library to the extraction of a set of lessons, which are presented here. These lessons can be exploited to facilitate the creation and curation of other part libraries using a simpler and less labor-intensive process.
Affiliation(s)
- Jeanet Mante
  - University of Colorado Boulder, Boulder, Colorado 80309, United States
- Nicholas Roehner
  - Raytheon BBN Technologies, Cambridge, Massachusetts 02138, United States
- Kevin Keating
  - Worcester Polytechnic Institute, Worcester, Massachusetts 01609, United States
- Eric Young
  - Worcester Polytechnic Institute, Worcester, Massachusetts 01609, United States
- Jacob Beal
  - Raytheon BBN Technologies, Cambridge, Massachusetts 02138, United States
- Chris J. Myers
  - University of Colorado Boulder, Boulder, Colorado 80309, United States
5
Pandi MT, Koromina M, Tsafaridis I, Patsilinakos S, Christoforou E, van der Spek PJ, Patrinos GP. A novel machine learning-based approach for the computational functional assessment of pharmacogenomic variants. Hum Genomics 2021; 15:51. [PMID: 34372920 PMCID: PMC8351412 DOI: 10.1186/s40246-021-00352-1] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Received: 04/22/2021] [Accepted: 07/28/2021] [Indexed: 12/18/2022] Open
Abstract
BACKGROUND: The field of pharmacogenomics focuses on the way a person's genome affects his or her response to a certain dose of a specified medication. The main aim is to utilize this information to guide and personalize the treatment in a way that maximizes the clinical benefits and minimizes the risks for the patients, thus fulfilling the promises of personalized medicine. Technological advances in genome sequencing, combined with the development of improved computational methods for the efficient analysis of the huge amount of generated data, have allowed the fast and inexpensive sequencing of a patient's genome, hence rendering its incorporation into clinical routine practice a realistic possibility. METHODS: This study exploited SNVs that have been thoroughly characterized at the functional level within genes involved in drug metabolism and transport, to train a classifier that would categorize novel variants according to their expected effect on protein functionality. This categorization is based on the available in silico prediction and/or conservation scores, which are selected using a recursive feature elimination process. Toward this end, information regarding 190 pharmacovariants was leveraged, alongside 4 machine learning algorithms, namely AdaBoost, XGBoost, multinomial logistic regression, and random forest (RF), the performance of which was assessed through 5-fold cross validation. RESULTS: All models achieved similar performance, with the RF model achieving the highest accuracy (85%, 95% CI: 0.79, 0.90) as well as improved overall performance (precision 85%, sensitivity 84%, specificity 94%); it was therefore used for subsequent analyses. When applied to real-world WGS data, the selected RF model identified 2 missense variants expected to lead to decreased-function proteins and 1 expected to lead to an increased-function protein. As expected, a greater number of variants were highlighted when the approach was used on NGS data derived from targeted resequencing of coding regions. Specifically, 71 variants (out of 156 with sufficient annotation information) were classified as "Decreased function," 41 variants as "No function," and 1 variant as "Increased function." CONCLUSION: Overall, the proposed RF-based classification model holds promise as a useful variant prioritization and scoring tool with interesting clinical applications in the fields of pharmacogenomics and personalized medicine.
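The reported accuracy, precision, sensitivity, and specificity are all derived from a confusion matrix. The sketch below shows one common way to compute them per class (one-vs-rest) for a three-class problem. It is not the authors' code, and the counts in `toy_matrix` are invented for illustration; they are not results from the study.

```python
from typing import Dict, List

LABELS = ["Decreased function", "No function", "Increased function"]


def accuracy(matrix: List[List[int]]) -> float:
    """Fraction of variants on the diagonal (true class == predicted class)."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total


def one_vs_rest_metrics(matrix: List[List[int]], k: int) -> Dict[str, float]:
    """Precision, sensitivity and specificity for class k.

    matrix[i][j] counts variants whose true class is i and predicted class is j.
    """
    total = sum(sum(row) for row in matrix)
    tp = matrix[k][k]
    fn = sum(matrix[k]) - tp                 # true class k, predicted otherwise
    fp = sum(row[k] for row in matrix) - tp  # predicted k, true class otherwise
    tn = total - tp - fn - fp
    return {
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "sensitivity": tp / (tp + fn) if tp + fn else 0.0,
        "specificity": tn / (tn + fp) if tn + fp else 0.0,
    }


# Hypothetical counts, rows = true class, columns = predicted class.
toy_matrix = [
    [8, 1, 1],
    [2, 6, 0],
    [0, 1, 1],
]

print("accuracy:", accuracy(toy_matrix))
for k, label in enumerate(LABELS):
    print(label, one_vs_rest_metrics(toy_matrix, k))
```

In a cross-validated setting, such a matrix would typically be accumulated over the held-out folds before the metrics are computed.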
Affiliation(s)
- Maria-Theodora Pandi
  - Erasmus University Medical Center, Faculty of Medicine and Health Sciences, Department of Pathology, Bioinformatics Unit, Rotterdam, the Netherlands
- Maria Koromina
  - Laboratory of Pharmacogenomics and Individualized Therapy, Department of Pharmacy, School of Health Sciences, University of Patras, Patras, Greece
  - The Golden Helix Foundation, London, UK
- Peter J van der Spek
  - Erasmus University Medical Center, Faculty of Medicine and Health Sciences, Department of Pathology, Bioinformatics Unit, Rotterdam, the Netherlands
- George P Patrinos
  - Laboratory of Pharmacogenomics and Individualized Therapy, Department of Pharmacy, School of Health Sciences, University of Patras, Patras, Greece
  - Zayed Center of Health Sciences, United Arab Emirates University, Al-Ain, United Arab Emirates
  - Department of Pathology, College of Medicine and Health Sciences, United Arab Emirates University, Al-Ain, United Arab Emirates
6
Wegrzyn JL, Falk T, Grau E, Buehler S, Ramnath R, Herndon N. Cyberinfrastructure and resources to enable an integrative approach to studying forest trees. Evol Appl 2020; 13:228-241. [PMID: 31892954 PMCID: PMC6935593 DOI: 10.1111/eva.12860] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 03/12/2019] [Revised: 08/11/2019] [Accepted: 08/14/2019] [Indexed: 12/19/2022] Open
Abstract
Sequencing technologies and bioinformatic approaches are now available to resolve the challenges associated with complex and heterozygous genomes. Increased access to less expensive and more effective instrumentation will contribute to a wealth of high-quality plant genomes in the next few years. In the meantime, more than 370 tree species are associated with public projects in primary repositories that are interrogating expression profiles, identifying variants, or analyzing targeted capture without a high-quality reference genome. Genomic data from these projects generate sequences that represent intermediate assemblies for transcriptomes and genomes. These data contribute to forest tree biology, but the associated sequence remains trapped in supplemental files that are poorly integrated in plant community databases and comparative genomic platforms. Successful implementation of life science cyberinfrastructure is improving data standards, ontologies, analytic workflows, and integrated database platforms for both model and non-model plant species. Unique to forest trees with large populations that are long-lived, outcrossing, and genetically diverse, the phenotypic and environmental metrics associated with georeferenced populations are just as important as the genomic data sampled for each individual. To address questions related to forest health and productivity, cyberinfrastructure must keep pace with the magnitude of genomic and phenomic sampling of larger populations. This review examines the current landscape of cyberinfrastructure, with an emphasis on best practices and resources to align community data with the Findable, Accessible, Interoperable, and Reusable (FAIR) guidelines.
Affiliation(s)
- Jill L. Wegrzyn
  - Department of Ecology and Evolutionary Biology, University of Connecticut, Storrs, Connecticut
- Taylor Falk
  - Department of Ecology and Evolutionary Biology, University of Connecticut, Storrs, Connecticut
- Emily Grau
  - Department of Ecology and Evolutionary Biology, University of Connecticut, Storrs, Connecticut
- Sean Buehler
  - Department of Ecology and Evolutionary Biology, University of Connecticut, Storrs, Connecticut
- Risharde Ramnath
  - Department of Ecology and Evolutionary Biology, University of Connecticut, Storrs, Connecticut
- Nic Herndon
  - Department of Ecology and Evolutionary Biology, University of Connecticut, Storrs, Connecticut
7
Raudvere U, Kolberg L, Kuzmin I, Arak T, Adler P, Peterson H, Vilo J. g:Profiler: a web server for functional enrichment analysis and conversions of gene lists (2019 update). Nucleic Acids Res 2019; 47:W191-W198. [PMID: 31066453 PMCID: PMC6602461 DOI: 10.1093/nar/gkz369] [Citation(s) in RCA: 3017] [Impact Index Per Article: 603.4] [Received: 02/27/2019] [Revised: 04/07/2019] [Accepted: 04/29/2019] [Indexed: 02/07/2023] Open
Abstract
Biological data analysis often deals with lists of genes arising from various studies. The g:Profiler toolset is widely used for finding biological categories enriched in gene lists, conversions between gene identifiers and mappings to their orthologs. The mission of g:Profiler is to provide a reliable service based on up-to-date high quality data in a convenient manner across many evidence types, identifier spaces and organisms. g:Profiler relies on Ensembl as a primary data source and follows their quarterly release cycle while updating the other data sources simultaneously. The current update provides a better user experience due to a modern responsive web interface, standardised API and libraries. The results are delivered through an interactive and configurable web design. Results can be downloaded as publication ready visualisations or delimited text files. In the current update we have extended the support to 467 species and strains, including vertebrates, plants, fungi, insects and parasites. By supporting user uploaded custom GMT files, g:Profiler is now capable of analysing data from any organism. All past releases are maintained for reproducibility and transparency. The 2019 update introduces an extensive technical rewrite making the services faster and more flexible. g:Profiler is freely available at https://biit.cs.ut.ee/gprofiler.
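The custom GMT files mentioned above are plain text gene-set definitions. The sketch below assumes the commonly used tab-separated layout (set name, description, then member genes) and shows how such a file could be read and intersected with a query gene list. It is not g:Profiler code: `parse_gmt`, the set names, and the gene symbols are hypothetical, and the statistical enrichment test that follows the overlap step is deliberately not shown.

```python
from typing import Dict, Iterable, Set


def parse_gmt(lines: Iterable[str]) -> Dict[str, Set[str]]:
    """Read gene sets from GMT-style lines: name<TAB>description<TAB>gene..."""
    gene_sets: Dict[str, Set[str]] = {}
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        name, _description, *genes = fields
        gene_sets[name] = {g for g in genes if g}
    return gene_sets


# Hypothetical file content, given as a list of lines.
sample = [
    "SET_A\thypothetical description\tGENE1\tGENE2\tGENE3\n",
    "SET_B\tanother made-up set\tGENE2\tGENE4\n",
]

gene_sets = parse_gmt(sample)
query = {"GENE2", "GENE3", "GENE5"}

# Overlap between the query list and each set: the input to an enrichment test.
overlap = {name: sorted(genes & query) for name, genes in gene_sets.items()}
print(overlap)
```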
Affiliation(s)
- Uku Raudvere
  - Institute of Computer Science, University of Tartu, J. Liivi 2, 50409 Tartu, Estonia
- Liis Kolberg
  - Institute of Computer Science, University of Tartu, J. Liivi 2, 50409 Tartu, Estonia
- Ivan Kuzmin
  - Institute of Computer Science, University of Tartu, J. Liivi 2, 50409 Tartu, Estonia
- Tambet Arak
  - Institute of Computer Science, University of Tartu, J. Liivi 2, 50409 Tartu, Estonia
- Priit Adler
  - Institute of Computer Science, University of Tartu, J. Liivi 2, 50409 Tartu, Estonia
  - Quretec Ltd, Ülikooli 6a, 51003, Tartu, Estonia
- Hedi Peterson
  - Institute of Computer Science, University of Tartu, J. Liivi 2, 50409 Tartu, Estonia
  - Quretec Ltd, Ülikooli 6a, 51003, Tartu, Estonia
- Jaak Vilo
  - Institute of Computer Science, University of Tartu, J. Liivi 2, 50409 Tartu, Estonia
  - Quretec Ltd, Ülikooli 6a, 51003, Tartu, Estonia
  - Software Technology and Applications Competence Centre, Ülikooli 2, 51003 Tartu, Estonia
8
Shaik NA, Banaganapalli B. Computational Molecular Phenotypic Analysis of PTPN22 (W620R), IL6R (D358A), and TYK2 (P1104A) Gene Mutations of Rheumatoid Arthritis. Front Genet 2019; 10:168. [PMID: 30899276 PMCID: PMC6416176 DOI: 10.3389/fgene.2019.00168] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.4] [Received: 10/16/2018] [Accepted: 02/15/2019] [Indexed: 12/31/2022] Open
Abstract
Rheumatoid arthritis (RA) is a chronic autoimmune disorder of bone joints caused by the complex interplay between several factors such as body physiology, the environment, and genetic background. The recent meta-analysis of GWAS has expanded the total number of RA-associated loci to more than 100, of which approximately 97% (98 variants) are located in non-coding regions, and the other ∼3% (3 variants) are in three different non-HLA genes, i.e., TYK2 (Pro1104Ala), IL6R (Asp358Ala), and PTPN22 (Trp620Arg). However, whether these variants prompt changes in the protein phenotype with regard to its stability, structure, and interaction with other molecules remains unknown. Thus, we selected the three clinically pathogenic variants described above as positive controls and applied diverse computational methods to examine whether those mutations cause changes in the protein phenotype. Both wild-type and mutant protein structures of PTPN22 (W620R), IL6R (D358A), and TYK2 (P1104A) were modeled and studied for structural deviations. Furthermore, we studied the secondary structure characteristics, solvent accessibility and stability, and the molecular interaction deformities caused by the amino acid substitutions. We observed that the simple nucleotide predictions of SIFT, PolyPhen, CADD and FATHMM yield mixed findings in screening the RA missense variants that met the P-value threshold of 5 × 10⁻⁸ in genome-wide association studies. However, structure-based analysis confirms that mutant structures show subtle but significant changes at their core regions, and that their functional domains seem to lose wild-type-like functional interaction. Our findings suggest that the multidirectional computational analysis of clinically relevant RA mutations could act as a primary screening step before undertaking functional biology assays.
Affiliation(s)
- Noor Ahmad Shaik
  - Department of Genetic Medicine, Faculty of Medicine, King Abdulaziz University, Jeddah, Saudi Arabia
  - Princess Al-Jawhara Center of Excellence in Research of Hereditary Disorders, King Abdulaziz University, Jeddah, Saudi Arabia
- Babajan Banaganapalli
  - Department of Genetic Medicine, Faculty of Medicine, King Abdulaziz University, Jeddah, Saudi Arabia
  - Princess Al-Jawhara Center of Excellence in Research of Hereditary Disorders, King Abdulaziz University, Jeddah, Saudi Arabia
9
Cusin I, Teixeira D, Zahn-Zabal M, Rech de Laval V, Gleizes A, Viassolo V, Chappuis PO, Hutter P, Bairoch A, Gaudet P. A new bioinformatics tool to help assess the significance of BRCA1 variants. Hum Genomics 2018; 12:36. [PMID: 29996917 PMCID: PMC6042458 DOI: 10.1186/s40246-018-0168-0] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Received: 01/09/2018] [Accepted: 06/25/2018] [Indexed: 12/23/2022] Open
Abstract
Background: Germline pathogenic variants in the breast cancer type 1 susceptibility gene BRCA1 are associated with a 60% lifetime risk for breast and ovarian cancer. This overall risk estimate is for all BRCA1 variants; not all variants confer the same risk of developing a disease. In cancer patients, loss of BRCA1 function in tumor tissue has been associated with an increased sensitivity to platinum agents and to poly-(ADP-ribose) polymerase (PARP) inhibitors. For clinical management of both at-risk individuals and cancer patients, it would be important that each identified genetic variant be associated with clinical significance. Unfortunately, for the vast majority of variants, the clinical impact is unknown. The availability of results from studies assessing the impact of variants on protein function may provide insight of crucial importance. Results and conclusion: We have collected, curated, and structured the molecular and cellular phenotypic impact of 3654 distinct BRCA1 variants. The data was modeled in triple format, using the variant as the subject, the studied function as the object, and a predicate describing the relation between the two. Each annotation is supported by fully traceable evidence. The data was captured using standard ontologies to ensure consistency and to enhance searchability and interoperability. We have assessed the extent to which functional defects at the molecular and cellular levels correlate with the clinical interpretation of variants by ClinVar submitters. Approximately 30% of the ClinVar BRCA1 missense variants have some molecular or cellular assay available in the literature. Pathogenic variants (as assigned by ClinVar) have at least some significant functional defect in 94% of testable cases. Of the ClinVar benign variants for which the neXtProt Cancer variant portal has data, 77% show either no or mild experimental functional defects. While this does not provide evidence for clinical interpretation of variants, it may provide some guidance for variants of unknown significance, in the absence of more reliable data. The neXtProt Cancer variant portal (https://www.nextprot.org/portals/breast-cancer) contains over 6300 observations at the molecular and/or cellular level for BRCA1 variants.
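The triple format described in this abstract (variant as subject, studied function as object, and a predicate relating the two, each backed by evidence) maps naturally onto a small record type. The following is a minimal sketch of that data structure, not the neXtProt data model: the field names, variant labels, predicates, functions, and evidence strings are all hypothetical placeholders.

```python
from typing import List, NamedTuple


class Annotation(NamedTuple):
    subject: str    # the variant being described
    predicate: str  # relation between the variant and the function
    obj: str        # the studied function
    evidence: str   # pointer to the supporting source


# Hypothetical annotations; none of these values come from the portal.
annotations: List[Annotation] = [
    Annotation("variant-X", "decreases", "function-A", "placeholder-ref-1"),
    Annotation("variant-X", "has-no-impact-on", "function-B", "placeholder-ref-2"),
    Annotation("variant-Y", "decreases", "function-A", "placeholder-ref-3"),
]


def by_subject(items: List[Annotation], subject: str) -> List[Annotation]:
    """All annotations recorded for one variant."""
    return [a for a in items if a.subject == subject]


def subjects_with(items: List[Annotation], predicate: str, obj: str) -> List[str]:
    """Variants linked to a given function by a given predicate."""
    return sorted({a.subject for a in items if a.predicate == predicate and a.obj == obj})


print(by_subject(annotations, "variant-X"))
print(subjects_with(annotations, "decreases", "function-A"))
```

Keeping predicates and functions as controlled terms rather than free text is what makes queries like `subjects_with` reliable, which is consistent with the abstract's emphasis on standard ontologies.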
Affiliation(s)
- Isabelle Cusin
  - CALIPHO group, SIB Swiss Institute of Bioinformatics, 1211, Geneva 4, Switzerland
- Daniel Teixeira
  - CALIPHO group, SIB Swiss Institute of Bioinformatics, 1211, Geneva 4, Switzerland
- Monique Zahn-Zabal
  - CALIPHO group, SIB Swiss Institute of Bioinformatics, 1211, Geneva 4, Switzerland
- Valentine Rech de Laval
  - CALIPHO group, SIB Swiss Institute of Bioinformatics, 1211, Geneva 4, Switzerland
  - Department of Human Protein Sciences, Faculty of Medicine, University of Geneva, Geneva, Switzerland
- Anne Gleizes
  - CALIPHO group, SIB Swiss Institute of Bioinformatics, 1211, Geneva 4, Switzerland
- Valeria Viassolo
  - Oncogenetics and Cancer Prevention Unit, Division of Oncology, University Hospitals of Geneva, 1205, Geneva, Switzerland
- Pierre O Chappuis
  - Oncogenetics and Cancer Prevention Unit, Division of Oncology, University Hospitals of Geneva, 1205, Geneva, Switzerland
  - Division of Genetic Medicine, University Hospitals of Geneva, 1205, Geneva, Switzerland
- Pierre Hutter
  - Sophia Genetics, Rue du Centre 172, 1025, Saint Sulpice, Switzerland
- Amos Bairoch
  - CALIPHO group, SIB Swiss Institute of Bioinformatics, 1211, Geneva 4, Switzerland
  - Department of Human Protein Sciences, Faculty of Medicine, University of Geneva, Geneva, Switzerland
- Pascale Gaudet
  - CALIPHO group, SIB Swiss Institute of Bioinformatics, 1211, Geneva 4, Switzerland
  - Department of Human Protein Sciences, Faculty of Medicine, University of Geneva, Geneva, Switzerland
10
Shaik NA, Awan ZA, Verma PK, Elango R, Banaganapalli B. Protein phenotype diagnosis of autosomal dominant calmodulin mutations causing irregular heart rhythms. J Cell Biochem 2018; 119:8233-8248. [PMID: 29932249 DOI: 10.1002/jcb.26834] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.2] [Received: 01/02/2018] [Accepted: 03/09/2018] [Indexed: 12/21/2022]
Abstract
The life-threatening group of irregular cardiac rhythmic disorders known as Cardiac Arrhythmias (CA) is caused by mutations in highly conserved Calmodulin (CALM/CaM) genes. Herein, we present a multidimensional approach to diagnose changes in phenotypic, stability, and Ca2+ ion binding properties of CA-causing mutations. Mutation pathogenicity was determined by diverse computational machine learning approaches. We further modeled the mutations in 3D protein structure and analyzed residue level phenotype plasticity. We have also examined the influence of torsion angles, number of H-bonds, and free energy dynamics on the stability, near-native simulation dynamic potential of residue fluctuations in protein structures, and Ca2+ ion binding potentials of CaM mutants. Our study recommends using the M-CAP method for measuring the pathogenicity of CA-causing CaM variants. Interestingly, most CA-causing variants we analyzed exist in either the third (V/H-96, S/I-98, V-103) or fourth (G/V-130, V/E/H-132, H-134, P-136, G-141, and L-142) EF-hands located in carboxyl domains of the CaM molecule. We observed that the minor structural fluctuations caused by these variants are likely tolerable owing to the highly flexible nature of calmodulin's globular domains. However, our molecular docking results support that these variants disturb the affinity of CaM toward Ca2+ ions and corroborate previous findings from functional studies. Taken together, these computational findings can explain the molecular reasons for subtle changes in structure, flexibility, and stability aspects of the mutant CaM molecule. Our comprehensive molecular scanning approach demonstrates the utility of computational methods in quick preliminary screening of CA-CaM mutations before undertaking time-consuming and complicated functional laboratory assays.
Affiliation(s)
- Noor A Shaik
- Department of Genetic Medicine, Faculty of Medicine, King Abdulaziz University, Jeddah, Saudi Arabia; Princess Al-Jawhara Al-Brahim Centre of Excellence in Research of Hereditary Disorders (PACER-HD), King Abdulaziz University, Jeddah, Saudi Arabia
- Zuhier A Awan
- Department of Clinical Biochemistry, Faculty of Medicine, King Abdulaziz University, Jeddah, Saudi Arabia
- Prashant K Verma
- Department of Genetic Medicine, Faculty of Medicine, King Abdulaziz University, Jeddah, Saudi Arabia
- Ramu Elango
- Department of Genetic Medicine, Faculty of Medicine, King Abdulaziz University, Jeddah, Saudi Arabia; Princess Al-Jawhara Al-Brahim Centre of Excellence in Research of Hereditary Disorders (PACER-HD), King Abdulaziz University, Jeddah, Saudi Arabia
- Babajan Banaganapalli
- Department of Genetic Medicine, Faculty of Medicine, King Abdulaziz University, Jeddah, Saudi Arabia; Princess Al-Jawhara Al-Brahim Centre of Excellence in Research of Hereditary Disorders (PACER-HD), King Abdulaziz University, Jeddah, Saudi Arabia
11
Wood MA, Paralkar M, Paralkar MP, Nguyen A, Struck AJ, Ellrott K, Margolin A, Nellore A, Thompson RF. Population-level distribution and putative immunogenicity of cancer neoepitopes. BMC Cancer 2018; 18:414. [PMID: 29653567 PMCID: PMC5899330 DOI: 10.1186/s12885-018-4325-6] [Citation(s) in RCA: 28] [Impact Index Per Article: 4.7] [Received: 09/21/2017] [Accepted: 04/03/2018] [Indexed: 02/08/2023] Open
Abstract
Background: Tumor neoantigens are drivers of cancer immunotherapy response; however, current prediction tools produce many candidates requiring further prioritization. Additional filtration criteria and population-level understanding may assist with prioritization. Herein, we show neoepitope immunogenicity is related to measures of peptide novelty and report population-level behavior of these and other metrics.
Methods: We propose four peptide novelty metrics to refine predicted neoantigenicity: tumor vs. paired normal peptide binding affinity difference, tumor vs. paired normal peptide sequence similarity, tumor vs. closest human peptide sequence similarity, and tumor vs. closest microbial peptide sequence similarity. We apply these metrics to neoepitopes predicted from somatic missense mutations in The Cancer Genome Atlas (TCGA) and a cohort of melanoma patients, and to a group of peptides with neoepitope-specific immune response data using an extension of pVAC-Seq (Hundal et al., pVAC-Seq: a genome-guided in silico approach to identifying tumor neoantigens. Genome Med 8:11, 2016).
Results: We show neoepitope burden varies across TCGA diseases and HLA alleles, with surprisingly low repetition of neoepitope sequences across patients or neoepitope preferences among sets of HLA alleles. Only 20.3% of predicted neoepitopes across TCGA patients displayed novel binding change based on our binding affinity difference criteria. Similarity of amino acid sequence was typically high between paired tumor-normal epitopes, but in 24.6% of cases, neoepitopes were more similar to other human peptides, or bacterial (56.8% of cases) or viral peptides (15.5% of cases), than their paired normal counterparts. Applied to peptides with neoepitope-specific immune response, a linear model incorporating neoepitope binding affinity, protein sequence similarity between neoepitopes and their closest viral peptides, and paired binding affinity difference was able to predict immunogenicity (AUROC = 0.66).
Conclusions: Our proposed prioritization criteria emphasize neoepitope novelty and refine patient neoepitope predictions for focus on biologically meaningful candidate neoantigens. We have demonstrated that neoepitopes should be considered not only with respect to their paired normal epitope, but to the entire human proteome, and bacterial and viral peptides, with potential implications for neoepitope immunogenicity and personalized vaccines for cancer treatment. We conclude that putative neoantigens are highly variable across individuals as a function of cancer genetics and personalized HLA repertoire, while the overall behavior of filtration criteria reflects predictable patterns.
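The peptide novelty metrics named in this abstract can be illustrated with a toy sketch. This is not the authors' implementation: the function names are hypothetical, the binding affinities are assumed to be predicted IC50 values in nM (lower means tighter binding), and sequence similarity is approximated here by simple per-position identity rather than whatever alignment-based score the study used.

```python
from typing import Iterable


def binding_affinity_difference(tumor_ic50_nm: float, normal_ic50_nm: float) -> float:
    """Paired-normal IC50 minus tumor IC50, in nM.

    Assumption: affinities are predicted IC50 values, so a positive result
    means the tumor peptide is predicted to bind more tightly than its
    paired normal peptide.
    """
    return normal_ic50_nm - tumor_ic50_nm


def fraction_identical(pep_a: str, pep_b: str) -> float:
    """Fraction of positions with identical residues (a stand-in similarity score)."""
    if not pep_a or len(pep_a) != len(pep_b):
        raise ValueError("peptides must be non-empty and of equal length")
    return sum(a == b for a, b in zip(pep_a, pep_b)) / len(pep_a)


def closest_similarity(query: str, candidates: Iterable[str]) -> float:
    """Highest identity between the query and any same-length candidate peptide."""
    return max(
        (fraction_identical(query, c) for c in candidates if len(c) == len(query)),
        default=0.0,
    )


def more_similar_elsewhere(tumor: str, normal: str, others: Iterable[str]) -> bool:
    """True if some other peptide resembles the tumor peptide more than its paired normal does."""
    return closest_similarity(tumor, others) > fraction_identical(tumor, normal)
```

The same `closest_similarity` helper could be pointed at a human, bacterial, or viral peptide collection, which mirrors how the abstract compares each neoepitope against several reference sets.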
Affiliation(s)
- Mary A Wood
- Computational Biology Program, Oregon Health and Science University, Portland, OR, USA; Portland VA Research Foundation, Portland, OR, USA
- Mayur Paralkar
- Computational Biology Program, Oregon Health and Science University, Portland, OR, USA; Carnegie Mellon University, Pittsburgh, PA, USA
- Mihir P Paralkar
- Computational Biology Program, Oregon Health and Science University, Portland, OR, USA; Carnegie Mellon University, Pittsburgh, PA, USA
- Austin Nguyen
- Computational Biology Program, Oregon Health and Science University, Portland, OR, USA; Oregon State University, Corvallis, OR, USA
- Adam J Struck
- Computational Biology Program, Oregon Health and Science University, Portland, OR, USA
- Kyle Ellrott
- Computational Biology Program, Oregon Health and Science University, Portland, OR, USA; Department of Biomedical Engineering, Oregon Health and Science University, Portland, OR, USA
- Adam Margolin
- Computational Biology Program, Oregon Health and Science University, Portland, OR, USA; Department of Biomedical Engineering, Oregon Health and Science University, Portland, OR, USA
- Abhinav Nellore
- Computational Biology Program, Oregon Health and Science University, Portland, OR, USA; Department of Biomedical Engineering, Oregon Health and Science University, Portland, OR, USA; Department of Surgery, Oregon Health and Science University, Portland, OR, USA
- Reid F Thompson
- Computational Biology Program, Oregon Health and Science University, Portland, OR, USA; Department of Biomedical Engineering, Oregon Health and Science University, Portland, OR, USA; Department of Radiation Medicine, Oregon Health and Science University, Portland, OR, USA; VA Portland Health Care System, Portland, OR, USA
12
Hunt SE, McLaren W, Gil L, Thormann A, Schuilenburg H, Sheppard D, Parton A, Armean IM, Trevanion SJ, Flicek P, Cunningham F. Ensembl variation resources. Database: The Journal of Biological Databases and Curation 2018; 2018:5255129. [PMID: 30576484 PMCID: PMC6310513 DOI: 10.1093/database/bay119] [Citation(s) in RCA: 280] [Impact Index Per Article: 46.7] [Received: 07/31/2018] [Accepted: 10/04/2018] [Indexed: 12/31/2022]
Abstract
The major goal of sequencing humans and many other species is to understand the link between genomic variation, phenotype and disease. There are numerous valuable and well-established variation resources, but collating and making sense of non-homogeneous, often large-scale data sets from disparate sources remains a challenge. Without a systematic catalogue of these data and appropriate query and annotation tools, understanding the genome sequence of an individual and assessing their disease risk is impossible. In Ensembl, we substantially solve this problem: we develop methods to facilitate data integration and broad access; aggregate information in a consistent manner and make it available in a variety of standard formats, both visually and programmatically; build analysis pipelines to compare variants to comprehensive genomic annotation sets; and make all tools and data publicly available.
Affiliation(s)
- Sarah E Hunt
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, UK
- William McLaren
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, UK
- Laurent Gil
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, UK
- Anja Thormann
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, UK
- Helen Schuilenburg
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, UK
- Dan Sheppard
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, UK
- Andrew Parton
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, UK
- Irina M Armean
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, UK
- Stephen J Trevanion
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, UK
- Paul Flicek
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, UK
- Fiona Cunningham
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, UK
13
Eilbeck K, Quinlan A, Yandell M. Settling the score: variant prioritization and Mendelian disease. Nat Rev Genet 2017; 18:599-612. [PMID: 28804138 PMCID: PMC5935497 DOI: 10.1038/nrg.2017.52] [Citation(s) in RCA: 152] [Impact Index Per Article: 21.7] [Indexed: 12/12/2022]
Abstract
When investigating Mendelian disease using exome or genome sequencing, distinguishing disease-causing genetic variants from the multitude of candidate variants is a complex, multidimensional task. Many prioritization tools and online interpretation resources exist, and professional organizations have offered clinical guidelines for review and return of prioritization results. In this Review, we describe the strengths and weaknesses of widely used computational approaches, explain their roles in the diagnostic and discovery process and discuss how they can inform (and misinform) expert reviewers. We place variant prioritization in the wider context of gene prioritization, burden testing and genotype-phenotype association, and we discuss opportunities and challenges introduced by whole-genome sequencing.
Affiliation(s)
- Karen Eilbeck
- Department of Biomedical Informatics, School of Medicine, University of Utah, 421 Wakara Way, Suite 120, Salt Lake City, Utah 84108, USA
- Aaron Quinlan
- Department of Biomedical Informatics, School of Medicine, University of Utah, 421 Wakara Way, Suite 120, Salt Lake City, Utah 84108, USA; Department of Human Genetics, Eccles Institute of Human Genetics, School of Medicine, University of Utah, 15 S 2030 E, Salt Lake City, Utah 84112, USA
- Mark Yandell
- Department of Human Genetics, Eccles Institute of Human Genetics, School of Medicine, University of Utah, 15 S 2030 E, Salt Lake City, Utah 84112, USA
14
Allot A, Chennen K, Nevers Y, Poidevin L, Kress A, Ripp R, Thompson JD, Poch O, Lecompte O. MyGeneFriends: A Social Network Linking Genes, Genetic Diseases, and Researchers. J Med Internet Res 2017. [PMID: 28623182 PMCID: PMC5493784 DOI: 10.2196/jmir.6676] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.7] [Indexed: 01/08/2023] Open
Abstract
Background: The constant and massive increase of biological data offers unprecedented opportunities to decipher the function and evolution of genes and their roles in human diseases. However, the multiplicity of sources and flow of data mean that efficient access to useful information and knowledge production has become a major challenge. This challenge can be addressed by taking inspiration from Web 2.0 and particularly social networks, which are at the forefront of big data exploration and human-data interaction.
Objective: MyGeneFriends is a Web platform inspired by social networks, devoted to genetic disease analysis, and organized around three types of proactive agents: genes, humans, and genetic diseases. The aim of this study was to improve exploration and exploitation of biological, postgenomic era big data.
Methods: MyGeneFriends leverages conventions popularized by top social networks (Facebook, LinkedIn, etc.), such as networks of friends, profile pages, friendship recommendations, affinity scores, news feeds, content recommendation, and data visualization.
Results: MyGeneFriends provides simple and intuitive interactions with data through evaluation and visualization of connections (friendships) between genes, humans, and diseases. The platform suggests new friends and publications and allows agents to follow the activity of their friends. It dynamically personalizes information depending on the user’s specific interests and provides an efficient way to share information with collaborators. Furthermore, the user’s behavior itself generates new information that constitutes an added value integrated in the network, which can be used to discover new connections between biological agents.
Conclusions: We have developed MyGeneFriends, a Web platform leveraging conventions from popular social networks to redefine the relationship between humans and biological big data and improve human processing of biomedical data. MyGeneFriends is available at lbgi.fr/mygenefriends.
Affiliation(s)
- Alexis Allot
- ICUBE UMR 7357, Complex Systems and Translational Bioinformatics, Université de Strasbourg - CNRS - FMTS, Strasbourg, France
- Kirsley Chennen
- ICUBE UMR 7357, Complex Systems and Translational Bioinformatics, Université de Strasbourg - CNRS - FMTS, Strasbourg, France
- Yannis Nevers
- ICUBE UMR 7357, Complex Systems and Translational Bioinformatics, Université de Strasbourg - CNRS - FMTS, Strasbourg, France
- Laetitia Poidevin
- ICUBE UMR 7357, Complex Systems and Translational Bioinformatics, Université de Strasbourg - CNRS - FMTS, Strasbourg, France
- Arnaud Kress
- ICUBE UMR 7357, Complex Systems and Translational Bioinformatics, Université de Strasbourg - CNRS - FMTS, Strasbourg, France
- Raymond Ripp
- ICUBE UMR 7357, Complex Systems and Translational Bioinformatics, Université de Strasbourg - CNRS - FMTS, Strasbourg, France
- Julie Dawn Thompson
- ICUBE UMR 7357, Complex Systems and Translational Bioinformatics, Université de Strasbourg - CNRS - FMTS, Strasbourg, France
- Olivier Poch
- ICUBE UMR 7357, Complex Systems and Translational Bioinformatics, Université de Strasbourg - CNRS - FMTS, Strasbourg, France
- Odile Lecompte
- ICUBE UMR 7357, Complex Systems and Translational Bioinformatics, Université de Strasbourg - CNRS - FMTS, Strasbourg, France
15
Uzilov AV, Cheesman KC, Fink MY, Newman LC, Pandya C, Lalazar Y, Hefti M, Fowkes M, Deikus G, Lau CY, Moe AS, Kinoshita Y, Kasai Y, Zweig M, Gupta A, Starcevic D, Mahajan M, Schadt EE, Post KD, Donovan MJ, Sebra R, Chen R, Geer EB. Identification of a novel RASD1 somatic mutation in a USP8-mutated corticotroph adenoma. Cold Spring Harb Mol Case Stud 2017; 3:a001602. [PMID: 28487882 PMCID: PMC5411693 DOI: 10.1101/mcs.a001602] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.1] [Received: 11/23/2016] [Accepted: 02/15/2017] [Indexed: 12/30/2022] Open
Abstract
Cushing's disease (CD) is caused by pituitary corticotroph adenomas that secrete excess adrenocorticotropic hormone (ACTH). In these tumors, somatic mutations in the gene USP8 have been identified as recurrent and pathogenic and are the sole known molecular driver for CD. Although other somatic mutations were reported in these studies, their contribution to the pathogenesis of CD remains unexplored. No molecular drivers have been established for a large proportion of CD cases and tumor heterogeneity has not yet been investigated using genomics methods. Also, even in USP8-mutant tumors, a possibility may exist of additional contributing mutations, following a paradigm from other neoplasm types where multiple somatic alterations contribute to neoplastic transformation. The current study utilizes whole-exome discovery sequencing on the Illumina platform, followed by targeted amplicon-validation sequencing on the Pacific Biosciences platform, to interrogate the somatic mutation landscape in a corticotroph adenoma resected from a CD patient. In this USP8-mutated tumor, we identified an interesting somatic mutation in the gene RASD1, which is a component of the corticotropin-releasing hormone receptor signaling system. This finding may provide insight into a novel mechanism involving loss of feedback control to the corticotropin-releasing hormone receptor and subsequent deregulation of ACTH production in corticotroph tumors.
Affiliation(s)
- Andrew V Uzilov
- Department of Genetics and Genomic Sciences and Icahn Institute for Genomics and Multiscale Biology, Icahn School of Medicine at Mount Sinai, New York, New York 10029, USA
- Khadeen C Cheesman
- Division of Endocrinology, Diabetes, and Bone Disease, Department of Medicine, Icahn School of Medicine at Mount Sinai, New York, New York 10029, USA
- Marc Y Fink
- Department of Genetics and Genomic Sciences and Icahn Institute for Genomics and Multiscale Biology, Icahn School of Medicine at Mount Sinai, New York, New York 10029, USA
- Leah C Newman
- Department of Genetics and Genomic Sciences and Icahn Institute for Genomics and Multiscale Biology, Icahn School of Medicine at Mount Sinai, New York, New York 10029, USA
- Chetanya Pandya
- Department of Genetics and Genomic Sciences and Icahn Institute for Genomics and Multiscale Biology, Icahn School of Medicine at Mount Sinai, New York, New York 10029, USA
- Yelena Lalazar
- Division of Endocrinology, Diabetes, and Bone Disease, Department of Medicine, Icahn School of Medicine at Mount Sinai, New York, New York 10029, USA
- Marco Hefti
- Department of Pathology, Icahn School of Medicine at Mount Sinai, New York, New York 10029, USA
- Mary Fowkes
- Department of Pathology, Icahn School of Medicine at Mount Sinai, New York, New York 10029, USA
- Gintaras Deikus
- Department of Genetics and Genomic Sciences and Icahn Institute for Genomics and Multiscale Biology, Icahn School of Medicine at Mount Sinai, New York, New York 10029, USA
- Chun Yee Lau
- Department of Genetics and Genomic Sciences and Icahn Institute for Genomics and Multiscale Biology, Icahn School of Medicine at Mount Sinai, New York, New York 10029, USA
- Aye S Moe
- Department of Genetics and Genomic Sciences and Icahn Institute for Genomics and Multiscale Biology, Icahn School of Medicine at Mount Sinai, New York, New York 10029, USA
- Yayoi Kinoshita
- Department of Pathology, Icahn School of Medicine at Mount Sinai, New York, New York 10029, USA
- Yumi Kasai
- Department of Genetics and Genomic Sciences and Icahn Institute for Genomics and Multiscale Biology, Icahn School of Medicine at Mount Sinai, New York, New York 10029, USA
- Micol Zweig
- Department of Genetics and Genomic Sciences and Icahn Institute for Genomics and Multiscale Biology, Icahn School of Medicine at Mount Sinai, New York, New York 10029, USA
- Arpeta Gupta
- Division of Endocrinology, Diabetes, and Bone Disease, Department of Medicine, Icahn School of Medicine at Mount Sinai, New York, New York 10029, USA
- Daniela Starcevic
- Department of Genetics and Genomic Sciences and Icahn Institute for Genomics and Multiscale Biology, Icahn School of Medicine at Mount Sinai, New York, New York 10029, USA
- Milind Mahajan
- Department of Genetics and Genomic Sciences and Icahn Institute for Genomics and Multiscale Biology, Icahn School of Medicine at Mount Sinai, New York, New York 10029, USA
- Eric E Schadt
- Department of Genetics and Genomic Sciences and Icahn Institute for Genomics and Multiscale Biology, Icahn School of Medicine at Mount Sinai, New York, New York 10029, USA
- Kalmon D Post
- Department of Neurosurgery, Icahn School of Medicine at Mount Sinai, New York, New York 10029, USA
- Michael J Donovan
- Department of Pathology, Icahn School of Medicine at Mount Sinai, New York, New York 10029, USA
- Robert Sebra
- Department of Genetics and Genomic Sciences and Icahn Institute for Genomics and Multiscale Biology, Icahn School of Medicine at Mount Sinai, New York, New York 10029, USA
- Rong Chen
- Department of Genetics and Genomic Sciences and Icahn Institute for Genomics and Multiscale Biology, Icahn School of Medicine at Mount Sinai, New York, New York 10029, USA
- Eliza B Geer
- Division of Endocrinology, Diabetes, and Bone Disease, Department of Medicine, Icahn School of Medicine at Mount Sinai, New York, New York 10029, USA; Department of Neurosurgery, Icahn School of Medicine at Mount Sinai, New York, New York 10029, USA; Multidisciplinary Pituitary and Skull Base Tumor Center, Memorial Sloan Kettering, New York, New York 10065, USA
16
Abstract
The Protein Ontology (PRO) is the reference ontology for proteins in the Open Biomedical Ontologies (OBO) foundry and consists of three sub-ontologies representing protein classes of homologous genes, proteoforms (e.g., splice isoforms, sequence variants, and post-translationally modified forms), and protein complexes. PRO defines classes of proteins and protein complexes, both species-specific and species-nonspecific, and indicates their relationships in a hierarchical framework, supporting accurate protein annotation at the appropriate level of granularity, analyses of protein conservation across species, and semantic reasoning. In the first section of this chapter, we describe the PRO framework including categories of PRO terms and the relationship of PRO to other ontologies and protein resources. Next, we provide a tutorial about the PRO website (proconsortium.org) where users can browse and search the PRO hierarchy, view reports on individual PRO terms, and visualize relationships among PRO terms in a hierarchical table view, a multiple sequence alignment view, and a Cytoscape network view. Finally, we describe several examples illustrating the unique and rich information available in PRO.
17
Aken BL, Achuthan P, Akanni W, Amode MR, Bernsdorff F, Bhai J, Billis K, Carvalho-Silva D, Cummins C, Clapham P, Gil L, Girón CG, Gordon L, Hourlier T, Hunt SE, Janacek SH, Juettemann T, Keenan S, Laird MR, Lavidas I, Maurel T, McLaren W, Moore B, Murphy DN, Nag R, Newman V, Nuhn M, Ong CK, Parker A, Patricio M, Riat HS, Sheppard D, Sparrow H, Taylor K, Thormann A, Vullo A, Walts B, Wilder SP, Zadissa A, Kostadima M, Martin FJ, Muffato M, Perry E, Ruffier M, Staines DM, Trevanion SJ, Cunningham F, Yates A, Zerbino DR, Flicek P. Ensembl 2017. Nucleic Acids Res 2016; 45:D635-D642. [PMID: 27899575 PMCID: PMC5210575 DOI: 10.1093/nar/gkw1104] [Citation(s) in RCA: 409] [Impact Index Per Article: 51.1] [Received: 10/08/2016] [Revised: 10/25/2016] [Accepted: 10/28/2016] [Indexed: 12/12/2022] Open
Abstract
Ensembl (www.ensembl.org) is a database and genome browser for enabling research on vertebrate genomes. We import, analyse, curate and integrate a diverse collection of large-scale reference data to create a more comprehensive view of genome biology than would be possible from any individual dataset. Our extensive data resources include evidence-based gene and regulatory region annotation, genome variation and gene trees. An accompanying suite of tools, infrastructure and programmatic access methods ensure uniform data analysis and distribution for all supported species. Together, these provide a comprehensive solution for large-scale and targeted genomics applications alike. Among many other developments over the past year, we have improved our resources for gene regulation and comparative genomics, and added CRISPR/Cas9 target sites. We released new browser functionality and tools, including improved filtering and prioritization of genome variation, Manhattan plot visualization for linkage disequilibrium and eQTL data, and an ontology search for phenotypes, traits and disease. We have also enhanced data discovery and access with a track hub registry and a selection of new REST end points. All Ensembl data are freely released to the scientific community and our source code is available via the open source Apache 2.0 license.
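The programmatic access mentioned in this abstract can be pictured with a small sketch that only builds a request URL for the public Ensembl REST service. The helper name is hypothetical, and the `/vep/:species/hgvs/:hgvs_notation` path and `content-type` query parameter reflect the public REST documentation as understood here; they should be checked against the current documentation before use. No network request is made.

```python
from urllib.parse import quote

# Base URL of the public Ensembl REST service.
ENSEMBL_REST = "https://rest.ensembl.org"


def vep_hgvs_url(hgvs_notation: str, species: str = "human") -> str:
    """Build a URL asking the Variant Effect Predictor about one HGVS variant.

    Hypothetical helper: it only assembles the URL, percent-encoding the
    HGVS string so characters such as ':' and '>' are safe in the path.
    """
    encoded_species = quote(species, safe="")
    encoded_hgvs = quote(hgvs_notation, safe="")
    return (
        f"{ENSEMBL_REST}/vep/{encoded_species}/hgvs/{encoded_hgvs}"
        "?content-type=application/json"
    )
```

Calling `vep_hgvs_url("ENST00000366667:c.803C>T")` yields a URL that could be passed to any HTTP client; the response handling is left out because its exact JSON layout is not described in the abstract.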
Affiliation(s)
- Bronwen L Aken
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Premanand Achuthan
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Wasiu Akanni
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - M Ridwan Amode
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Friederike Bernsdorff
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Jyothish Bhai
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Konstantinos Billis
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Denise Carvalho-Silva
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Carla Cummins
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Peter Clapham
- Wellcome Trust Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SA, UK
| | - Laurent Gil
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Carlos García Girón
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Leo Gordon
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Thibaut Hourlier
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Sarah E Hunt
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Sophie H Janacek
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Thomas Juettemann
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Stephen Keenan
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Matthew R Laird
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Ilias Lavidas
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Thomas Maurel
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- William McLaren
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Benjamin Moore
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Daniel N Murphy
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Rishi Nag
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Victoria Newman
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Michael Nuhn
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Chuang Kee Ong
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Anne Parker
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Mateus Patricio
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Harpreet Singh Riat
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Daniel Sheppard
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Helen Sparrow
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Kieron Taylor
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Anja Thormann
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Alessandro Vullo
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Brandon Walts
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Steven P Wilder
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Amonida Zadissa
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Myrto Kostadima
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Fergal J Martin
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Matthieu Muffato
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Emily Perry
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Magali Ruffier
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Daniel M Staines
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Stephen J Trevanion
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Fiona Cunningham
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Andrew Yates
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Daniel R Zerbino
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Paul Flicek
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK; Wellcome Trust Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SA, UK
|
18
|
Howe DG, Bradford YM, Eagle A, Fashena D, Frazer K, Kalita P, Mani P, Martin R, Moxon ST, Paddock H, Pich C, Ramachandran S, Ruzicka L, Schaper K, Shao X, Singer A, Toro S, Van Slyke C, Westerfield M. The Zebrafish Model Organism Database: new support for human disease models, mutation details, gene expression phenotypes and searching. Nucleic Acids Res 2016; 45:D758-D768. [PMID: 27899582 PMCID: PMC5210580 DOI: 10.1093/nar/gkw1116] [Citation(s) in RCA: 50] [Impact Index Per Article: 6.3] [Received: 09/09/2016] [Revised: 10/25/2016] [Accepted: 10/27/2016] [Indexed: 12/16/2022]
Abstract
The Zebrafish Model Organism Database (ZFIN; http://zfin.org) is the central resource for zebrafish (Danio rerio) genetic, genomic, phenotypic and developmental data. ZFIN curators provide expert manual curation and integration of comprehensive data involving zebrafish genes, mutants, transgenic constructs and lines, phenotypes, genotypes, gene expressions, morpholinos, TALENs, CRISPRs, antibodies, anatomical structures, models of human disease and publications. We integrate curated, directly submitted, and collaboratively generated data, making these available to the zebrafish research community. Among the vertebrate model organisms, zebrafish are superbly suited for rapid generation of sequence-targeted mutant lines, characterization of phenotypes including gene expression patterns, and generation of human disease models. The recent rapid adoption of zebrafish as human disease models is making management of these data particularly important to both the research and clinical communities. Here, we describe recent enhancements to ZFIN including use of the zebrafish experimental conditions ontology, ‘Fish’ records in the ZFIN database, support for gene expression phenotypes, models of human disease, mutation details at the DNA, RNA and protein levels, and updates to the ZFIN single box search.
Affiliation(s)
- Douglas G Howe
- The Institute of Neuroscience, University of Oregon, Eugene, OR 97403-1254, USA
- Yvonne M Bradford
- The Institute of Neuroscience, University of Oregon, Eugene, OR 97403-1254, USA
- Anne Eagle
- The Institute of Neuroscience, University of Oregon, Eugene, OR 97403-1254, USA
- David Fashena
- The Institute of Neuroscience, University of Oregon, Eugene, OR 97403-1254, USA
- Ken Frazer
- The Institute of Neuroscience, University of Oregon, Eugene, OR 97403-1254, USA
- Patrick Kalita
- The Institute of Neuroscience, University of Oregon, Eugene, OR 97403-1254, USA
- Prita Mani
- The Institute of Neuroscience, University of Oregon, Eugene, OR 97403-1254, USA
- Ryan Martin
- The Institute of Neuroscience, University of Oregon, Eugene, OR 97403-1254, USA
- Sierra Taylor Moxon
- The Institute of Neuroscience, University of Oregon, Eugene, OR 97403-1254, USA
- Holly Paddock
- The Institute of Neuroscience, University of Oregon, Eugene, OR 97403-1254, USA
- Christian Pich
- The Institute of Neuroscience, University of Oregon, Eugene, OR 97403-1254, USA
- Leyla Ruzicka
- The Institute of Neuroscience, University of Oregon, Eugene, OR 97403-1254, USA
- Kevin Schaper
- The Institute of Neuroscience, University of Oregon, Eugene, OR 97403-1254, USA
- Xiang Shao
- The Institute of Neuroscience, University of Oregon, Eugene, OR 97403-1254, USA
- Amy Singer
- The Institute of Neuroscience, University of Oregon, Eugene, OR 97403-1254, USA
- Sabrina Toro
- The Institute of Neuroscience, University of Oregon, Eugene, OR 97403-1254, USA
- Ceri Van Slyke
- The Institute of Neuroscience, University of Oregon, Eugene, OR 97403-1254, USA
- Monte Westerfield
- The Institute of Neuroscience, University of Oregon, Eugene, OR 97403-1254, USA
|
20
|
McLaren W, Gil L, Hunt SE, Riat HS, Ritchie GRS, Thormann A, Flicek P, Cunningham F. The Ensembl Variant Effect Predictor. Genome Biol 2016; 17:122. [PMID: 27268795 PMCID: PMC4893825 DOI: 10.1186/s13059-016-0974-4] [Citation(s) in RCA: 4261] [Impact Index Per Article: 532.6] [Received: 03/18/2016] [Accepted: 05/03/2016] [Indexed: 02/06/2023]
Abstract
The Ensembl Variant Effect Predictor is a powerful toolset for the analysis, annotation, and prioritization of genomic variants in coding and non-coding regions. It provides access to an extensive collection of genomic annotation, with a variety of interfaces to suit different requirements, and simple options for configuring and extending analysis. It is open source, free to use, and supports full reproducibility of results. The Ensembl Variant Effect Predictor can simplify and accelerate variant interpretation in a wide range of study designs.
Affiliation(s)
- William McLaren
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
- Laurent Gil
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
- Sarah E Hunt
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
- Harpreet Singh Riat
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
- Graham R S Ritchie
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
- Anja Thormann
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
- Paul Flicek
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
- Fiona Cunningham
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
|