1
Baldarelli RM, Smith CL, Ringwald M, Richardson JE, Bult CJ. Mouse Genome Informatics: an integrated knowledgebase system for the laboratory mouse. Genetics 2024; 227:iyae031. [PMID: 38531069 PMCID: PMC11075557 DOI: 10.1093/genetics/iyae031] [Received: 12/02/2023] [Accepted: 02/13/2024] [Indexed: 03/28/2024]
Abstract
Mouse Genome Informatics (MGI) is a federation of expertly curated information resources designed to support experimental and computational investigations into genetic and genomic aspects of human biology and disease using the laboratory mouse as a model system. The Mouse Genome Database (MGD) and the Gene Expression Database (GXD) are core MGI databases that share data and system architecture. MGI serves as the central community resource of integrated information about mouse genome features, variation, expression, gene function, phenotype, and human disease models acquired from peer-reviewed publications, author submissions, and major bioinformatics resources. To facilitate integration and standardization of data, biocuration scientists annotate using terms from controlled metadata vocabularies and biological ontologies (e.g. Mammalian Phenotype Ontology, Mouse Developmental Anatomy, Disease Ontology, Gene Ontology, etc.), and by applying international community standards for gene, allele, and mouse strain nomenclature. MGI serves basic scientists, translational researchers, and data scientists by providing access to FAIR-compliant data in both human-readable and compute-ready formats. The MGI resource is accessible at https://informatics.jax.org. Here, we present an overview of the core data types represented in MGI and highlight recent enhancements to the resource with a focus on new data and functionality for MGD and GXD.
Affiliation(s)
- Carol J Bult
  - The Jackson Laboratory, Bar Harbor, ME 04609, USA
2
Liang Y, Pan C, Yin T, Wang L, Gao X, Wang E, Quang H, Huang D, Tan L, Xiang K, Wang Y, Alexander PB, Li Q, Yao T, Zhang Z, Wang X. Branched-Chain Amino Acid Accumulation Fuels the Senescence-Associated Secretory Phenotype. Advanced Science (Weinheim, Baden-Wurttemberg, Germany) 2024; 11:e2303489. [PMID: 37964763 PMCID: PMC10787106 DOI: 10.1002/advs.202303489] [Received: 05/29/2023] [Revised: 10/07/2023] [Indexed: 11/16/2023]
Abstract
The essential branched-chain amino acids (BCAAs) leucine, isoleucine, and valine play critical roles in protein synthesis and energy metabolism. Despite their widespread use as nutritional supplements, BCAAs' full effects on mammalian physiology remain uncertain due to the complexities of BCAA metabolic regulation. Here, a novel mechanism is identified that links intrinsic alterations in BCAA metabolism to cellular senescence and the senescence-associated secretory phenotype (SASP), both of which contribute to organismal aging and inflammation-related diseases. Altered BCAA metabolism driving the SASP is mediated by robust activation of the BCAA transporters Solute Carrier Family 6 Members 14 and 15 as well as downregulation of the catabolic enzyme BCAA transaminase 1 during onset of cellular senescence, leading to highly elevated intracellular BCAA levels in senescent cells. This, in turn, activates the mammalian target of rapamycin complex 1 (mTORC1) to establish the full SASP program. Transgenic Drosophila models further indicate that orthologous BCAA regulators are involved in the induction of cellular senescence and age-related phenotypes in flies, suggesting evolutionary conservation of this metabolic pathway during aging. Finally, experimentally blocking BCAA accumulation attenuates the inflammatory response in a mouse senescence model, highlighting the therapeutic potential of modulating BCAA metabolism for the treatment of age-related and inflammatory diseases.
Affiliation(s)
- Yaosi Liang
  - Department of Pharmacology and Cancer Biology, Duke University Medical Center, Durham, NC 27710, USA
- Christopher Pan
  - Department of Pharmacology and Cancer Biology, Duke University Medical Center, Durham, NC 27710, USA
- Tao Yin
  - Department of Pharmacology and Cancer Biology, Duke University Medical Center, Durham, NC 27710, USA
- Lu Wang
  - Department of Pharmacology and Cancer Biology, Duke University Medical Center, Durham, NC 27710, USA
  - State Key Laboratory of Molecular Biology, Shanghai Institute of Biochemistry and Cell Biology, Center for Excellence in Molecular Cell Science, Chinese Academy of Sciences, Shanghai 200031, China
- Xia Gao
  - Department of Pharmacology and Cancer Biology, Duke University Medical Center, Durham, NC 27710, USA
  - Children's Nutrition Research Center, Department of Pediatrics, Baylor College of Medicine, Houston, TX 77030, USA
- Ergang Wang
  - Department of Pharmacology and Cancer Biology, Duke University Medical Center, Durham, NC 27710, USA
- Holly Quang
  - Children's Nutrition Research Center, Department of Pediatrics, Baylor College of Medicine, Houston, TX 77030, USA
- De Huang
  - Department of Pharmacology and Cancer Biology, Duke University Medical Center, Durham, NC 27710, USA
  - School of Basic Medical Sciences, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei 230026, China
- Lianmei Tan
  - Department of Pharmacology and Cancer Biology, Duke University Medical Center, Durham, NC 27710, USA
- Kun Xiang
  - Department of Pharmacology and Cancer Biology, Duke University Medical Center, Durham, NC 27710, USA
- Yu Wang
  - Center for Regenerative Medicine, Massachusetts General Hospital, Harvard Medical School, Boston, MA 02114, USA
- Peter B. Alexander
  - Department of Pharmacology and Cancer Biology, Duke University Medical Center, Durham, NC 27710, USA
- Qi‐Jing Li
  - Department of Immunology, Duke University Medical Center, Durham, NC 27710, USA
  - Institute of Molecular and Cell Biology, Agency for Science, Technology and Research (A*STAR), Singapore 138673, Singapore
  - Singapore Immunology Network, Agency for Science, Technology and Research (A*STAR), Singapore 138673, Singapore
- Tso‐Pang Yao
  - Department of Pharmacology and Cancer Biology, Duke University Medical Center, Durham, NC 27710, USA
- Zhao Zhang
  - Department of Pharmacology and Cancer Biology, Duke University Medical Center, Durham, NC 27710, USA
- Xiao‐Fan Wang
  - Department of Pharmacology and Cancer Biology, Duke University Medical Center, Durham, NC 27710, USA
3
Holmes A, Carvalho-Silva D, Sondka Z, Ahmed M, Argasinska J, Lyne R, Sangrador-Vegas A, Ward S. Help biocurators to maximize the reach of your data. PLoS Biol 2024; 22:e3002477. [PMID: 38271296 PMCID: PMC10810541 DOI: 10.1371/journal.pbio.3002477] [Indexed: 01/27/2024]
Abstract
Curated scientific databases catalogue and amplify research findings to maximize their reach. Authors should write their papers with this in mind, ensuring that data are accurate, easy to extract, and presented in standardized formats.
Affiliation(s)
- Zbyslaw Sondka
  - Wellcome Sanger Institute, Hinxton, Cambridge, United Kingdom
- Madiha Ahmed
  - Wellcome Sanger Institute, Hinxton, Cambridge, United Kingdom
- Rachel Lyne
  - Wellcome Sanger Institute, Hinxton, Cambridge, United Kingdom
- Sari Ward
  - Wellcome Sanger Institute, Hinxton, Cambridge, United Kingdom
4
Bult CJ, Sternberg PW. The alliance of genome resources: transforming comparative genomics. Mamm Genome 2023; 34:531-544. [PMID: 37666946 PMCID: PMC10628019 DOI: 10.1007/s00335-023-10015-2] [Received: 05/16/2023] [Accepted: 08/11/2023] [Indexed: 09/06/2023]
Abstract
Comparing genomic and biological characteristics across multiple species is essential to using model systems to investigate the molecular and cellular mechanisms underlying human biology and disease and to translate mechanistic insights from studies in model organisms for clinical applications. Building a scalable knowledge commons platform that supports cross-species comparison of rich, expertly curated knowledge regarding gene function, phenotype, and disease associations available for model organisms and humans is the primary mission of the Alliance of Genome Resources (the Alliance). The Alliance is a consortium of seven model organism knowledgebases (mouse, rat, yeast, nematode, zebrafish, frog, fruit fly) and the Gene Ontology resource. The Alliance uses a common set of gene ortholog assertions as the basis for comparing biological annotations across the organisms represented in the Alliance. The major types of knowledge associated with genes that are represented in the Alliance database currently include gene function, phenotypic alleles and variants, human disease associations, pathways, gene expression, and both protein-protein and genetic interactions. The Alliance has enhanced the ability of researchers to easily compare biological annotations for common data types across model organisms and human through the implementation of shared programmatic access mechanisms, data-specific web pages with a unified "look and feel", and interactive user interfaces specifically designed to support comparative biology. The modular infrastructure developed by the Alliance allows the resource to serve as an extensible "knowledge commons" capable of expanding to accommodate additional model organisms.
5
Yehudi Y, Hughes-Noehrer L, Goble C, Jay C. Subjective data models in bioinformatics and how wet lab and computational biologists conceptualise data. Sci Data 2023; 10:756. [PMID: 37919302 PMCID: PMC10622411 DOI: 10.1038/s41597-023-02627-9] [Received: 04/24/2023] [Accepted: 10/09/2023] [Indexed: 11/04/2023]
Abstract
Biological science produces "big data" in varied formats, which necessitates using computational tools to process, integrate, and analyse data. Researchers using computational biology tools range from those using computers for communication, to those writing analysis code. We examine differences in how researchers conceptualise the same data, which we call "subjective data models". We interviewed 22 people with biological experience and varied levels of computational experience, and found that many had fluid subjective data models that changed depending on circumstance. Surprisingly, results did not cluster around participants' computational experience levels. People did not consistently map entities from abstract data models to the real-world entities in files, and certain data identifier formats were easier to infer meaning from than others. Real-world implications: 1) software engineers should design interfaces for task performance, emulating popular user interfaces, rather than targeting professional backgrounds; 2) when insufficient context is provided, people may guess what data means, whether or not they are correct, emphasising the importance of contextual metadata to remove the need for erroneous guesswork.
Affiliation(s)
- Yo Yehudi
  - Department of Computer Science, University of Manchester, Oxford Road, Manchester, M13 9PL, UK
  - OLS, Wimblington, PE15 0QE, UK
- Lukas Hughes-Noehrer
  - Department of Computer Science, University of Manchester, Oxford Road, Manchester, M13 9PL, UK
- Carole Goble
  - Department of Computer Science, University of Manchester, Oxford Road, Manchester, M13 9PL, UK
- Caroline Jay
  - Department of Computer Science, University of Manchester, Oxford Road, Manchester, M13 9PL, UK
6
Castanza AS, Recla JM, Eby D, Thorvaldsdóttir H, Bult CJ, Mesirov JP. Extending support for mouse data in the Molecular Signatures Database (MSigDB). Nat Methods 2023; 20:1619-1620. [PMID: 37704782 DOI: 10.1038/s41592-023-02014-7] [Indexed: 09/15/2023]
Affiliation(s)
- Anthony S Castanza
  - Department of Medicine, University of California San Diego, La Jolla, CA, USA
- Jill M Recla
  - The Jackson Laboratory for Mammalian Genomics, Bar Harbor, ME, USA
- David Eby
  - Department of Medicine, University of California San Diego, La Jolla, CA, USA
- Carol J Bult
  - The Jackson Laboratory for Mammalian Genomics, Bar Harbor, ME, USA
- Jill P Mesirov
  - Department of Medicine, University of California San Diego, La Jolla, CA, USA
  - Moores Cancer Center, University of California San Diego, La Jolla, CA, USA
7
Swart PC, Du Plessis M, Rust C, Womersley JS, van den Heuvel LL, Seedat S, Hemmings SMJ. Identifying genetic loci that are associated with changes in gene expression in PTSD in a South African cohort. J Neurochem 2023; 166:705-719. [PMID: 37522158 PMCID: PMC10953375 DOI: 10.1111/jnc.15919] [Received: 01/18/2023] [Revised: 06/30/2023] [Accepted: 07/05/2023] [Indexed: 08/01/2023]
Abstract
The molecular mechanisms underlying posttraumatic stress disorder (PTSD) are yet to be fully elucidated, especially in underrepresented population groups. Expression quantitative trait loci (eQTLs) are DNA sequence variants that influence gene expression, in a local (cis-) or distal (trans-) manner, and subsequently impact cellular, tissue, and system physiology. This study aims to identify genetic loci associated with gene expression changes in a South African PTSD cohort. Genome-wide genotype and RNA-sequencing data were obtained from 32 trauma-exposed controls and 35 PTSD cases of mixed-ancestry, as part of the SHARED ROOTS project. The first approach utilised 108 937 single-nucleotide polymorphisms (SNPs) (MAF > 10%) and 11 312 genes with Matrix eQTL to map potential eQTLs, while controlling for covariates as appropriate. The second analysis was focused on 5638 SNPs related to a previously calculated PTSD polygenic risk score for this cohort. SNP-gene pairs were considered eQTLs if they surpassed Bonferroni correction and had a false discovery rate <0.05. We did not identify eQTLs that significantly influenced gene expression in a PTSD-dependent manner. However, several known cis-eQTLs, independent of PTSD diagnosis, were observed. rs8521 (C > T) was associated with TAGLN and SIDT2 expression, and rs11085906 (C > T) was associated with ZNF333 expression. This exploratory study provides insight into the molecular mechanisms associated with PTSD in a non-European, admixed sample population. This study was limited by the cross-sectional design and insufficient statistical power. Overall, this study should encourage further multi-omics approaches towards investigating PTSD in diverse populations.
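The eQTL criterion described in this abstract (a SNP-gene pair must pass Bonferroni correction and have a false discovery rate below 0.05) can be illustrated with a minimal Python sketch. This is not the authors' pipeline and does not use Matrix eQTL; the function names `bonferroni_threshold`, `benjamini_hochberg`, and `call_eqtls` are hypothetical, the Benjamini-Hochberg procedure is assumed as the FDR method (the abstract does not name one), and the p-values in the usage note are made up for illustration.

```python
from typing import List


def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test p-value cutoff under Bonferroni correction."""
    return alpha / n_tests


def benjamini_hochberg(p_values: List[float]) -> List[float]:
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that adjusted values
    # never increase as the raw p-values decrease.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def call_eqtls(p_values: List[float], alpha: float = 0.05) -> List[bool]:
    """Flag tests that pass both Bonferroni and BH-FDR at level alpha."""
    if not p_values:
        return []
    cutoff = bonferroni_threshold(alpha, len(p_values))
    q_values = benjamini_hochberg(p_values)
    return [p <= cutoff and q < alpha for p, q in zip(p_values, q_values)]
```

For example, `call_eqtls([0.001, 0.02, 0.03, 0.5])` flags only the first test: the Bonferroni cutoff is 0.05 / 4 = 0.0125, and only 0.001 falls below it. In this sketch the Bonferroni condition is the stricter of the two, so any test that passes it also has a Benjamini-Hochberg adjusted p-value no larger than alpha.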
Affiliation(s)
- Patricia C. Swart
  - Department of Psychiatry, Faculty of Medicine and Health Sciences, Stellenbosch University, Cape Town, South Africa
  - South African Medical Research Council/Stellenbosch University Genomics of Brain Disorders Unit, Cape Town, South Africa
- Morne Du Plessis
  - Department of Psychiatry, Faculty of Medicine and Health Sciences, Stellenbosch University, Cape Town, South Africa
  - South African Medical Research Council/Stellenbosch University Genomics of Brain Disorders Unit, Cape Town, South Africa
- Carlien Rust
  - Department of Psychiatry, Faculty of Medicine and Health Sciences, Stellenbosch University, Cape Town, South Africa
  - South African Medical Research Council/Stellenbosch University Genomics of Brain Disorders Unit, Cape Town, South Africa
- Jacqueline S. Womersley
  - Department of Psychiatry, Faculty of Medicine and Health Sciences, Stellenbosch University, Cape Town, South Africa
  - South African Medical Research Council/Stellenbosch University Genomics of Brain Disorders Unit, Cape Town, South Africa
- Leigh L. van den Heuvel
  - Department of Psychiatry, Faculty of Medicine and Health Sciences, Stellenbosch University, Cape Town, South Africa
  - South African Medical Research Council/Stellenbosch University Genomics of Brain Disorders Unit, Cape Town, South Africa
- Soraya Seedat
  - Department of Psychiatry, Faculty of Medicine and Health Sciences, Stellenbosch University, Cape Town, South Africa
  - South African Medical Research Council/Stellenbosch University Genomics of Brain Disorders Unit, Cape Town, South Africa
- Sian M. J. Hemmings
  - Department of Psychiatry, Faculty of Medicine and Health Sciences, Stellenbosch University, Cape Town, South Africa
  - South African Medical Research Council/Stellenbosch University Genomics of Brain Disorders Unit, Cape Town, South Africa
8
Nakamura A, Broséus L, Tost J, Vaiman D, Martins S, Keyes K, Bonello K, Fekom M, Strandberg-Larsen K, Sutter-Dallay AL, Heude B, Melchior M, Lepeule J. Epigenome-Wide Associations of Placental DNA Methylation and Behavioral and Emotional Difficulties in Children at 3 Years of Age. Int J Mol Sci 2023; 24:11772. [PMID: 37511531 PMCID: PMC10380531 DOI: 10.3390/ijms241411772] [Received: 05/12/2023] [Revised: 07/04/2023] [Accepted: 07/12/2023] [Indexed: 07/30/2023]
Abstract
The placenta is a key organ for fetal and brain development. Its epigenome can be regarded as a biochemical record of the prenatal environment and a potential mechanism of its association with the future health of the fetus. We investigated associations between placental DNA methylation levels and child behavioral and emotional difficulties, assessed at 3 years of age using the Strengths and Difficulties Questionnaire (SDQ) in 441 mother-child dyads from the EDEN cohort. Hypothesis-driven and exploratory analyses (on differentially methylated probes (EWAS) and regions (DMR)) were adjusted for confounders, technical factors, and cell composition estimates, corrected for multiple comparisons, and stratified by child sex. Hypothesis-driven analyses showed an association of cg26703534 (AHRR) with emotional symptoms, and exploratory analyses identified two probes, cg09126090 (intergenic region) and cg10305789 (PPP1R16B), as negatively associated with peer relationship problems, as well as 33 DMRs, mostly positively associated with at least one of the SDQ subscales. Among girls, most associations were seen with emotional difficulties, whereas in boys, DMRs were associated as much with emotional as with behavioral difficulties. This study provides the first evidence of associations between placental DNA methylation and child behavioral and emotional difficulties. Our results suggest sex-specific associations and might provide new insights into the mechanisms of neurodevelopment.
Affiliation(s)
- Aurélie Nakamura
  - Team of Environmental Epidemiology Applied to Development and Respiratory Health, Institute for Advanced Biosciences (IAB), University Grenoble Alpes, INSERM, 38700 La Tronche, France
- Lucile Broséus
  - Team of Environmental Epidemiology Applied to Development and Respiratory Health, Institute for Advanced Biosciences (IAB), University Grenoble Alpes, INSERM, 38700 La Tronche, France
- Jörg Tost
  - Laboratory for Epigenetics and Environment, Centre National de Recherche en Génomique Humaine, CEA—Institut de Biologie François Jacob, University Paris Saclay, 91057 Evry, France
- Daniel Vaiman
  - From Gametes to Birth, Institut Cochin, U1016 INSERM, UMR 8104 CNRS, Paris Cité University, 75014 Paris, France
- Silvia Martins
  - Department of Epidemiology, Columbia University Mailman School of Public Health, New York, NY 10032, USA
- Katherine Keyes
  - Department of Epidemiology, Columbia University Mailman School of Public Health, New York, NY 10032, USA
- Kim Bonello
  - Institut Pierre Louis d’Epidémiologie et de Santé Publique (IPLESP), Equipe de Recherche en Epidémiologie Sociale (ERES), Sorbonne Université, INSERM, 75571 Paris, France
  - Department of General Practice, School of Medicine, Sorbonne University, 75013 Paris, France
- Mathilde Fekom
  - Institut Pierre Louis d’Epidémiologie et de Santé Publique (IPLESP), Equipe de Recherche en Epidémiologie Sociale (ERES), Sorbonne Université, INSERM, 75571 Paris, France
- Katrine Strandberg-Larsen
  - Section of Epidemiology, Department of Public Health, University of Copenhagen, 1165 Copenhagen, Denmark
- Anne-Laure Sutter-Dallay
  - Bordeaux Population Health, Bordeaux University, INSERM, UMR 1219, 33076 Bordeaux, France
  - University Department of Child and Adolescent Psychiatry, Charles Perrens Hospital, 33000 Bordeaux, France
- Barbara Heude
  - Center for Research in Epidemiology and Statistics (CRESS), Université Paris Cité and Université Sorbonne Paris Nord, INSERM, INRAE, 75004 Paris, France
- Maria Melchior
  - Institut Pierre Louis d’Epidémiologie et de Santé Publique (IPLESP), Equipe de Recherche en Epidémiologie Sociale (ERES), Sorbonne Université, INSERM, 75571 Paris, France
- Johanna Lepeule
  - Team of Environmental Epidemiology Applied to Development and Respiratory Health, Institute for Advanced Biosciences (IAB), University Grenoble Alpes, INSERM, 38700 La Tronche, France
9
Walls GM, Ghita M, Queen R, Edgar KS, Gill EK, Kuburas R, Grieve DJ, Watson CJ, McWilliam A, Van Herk M, Williams KJ, Cole AJ, Jain S, Butterworth KT. Spatial Gene Expression Changes in the Mouse Heart After Base-Targeted Irradiation. Int J Radiat Oncol Biol Phys 2023; 115:453-463. [PMID: 35985456 DOI: 10.1016/j.ijrobp.2022.08.031] [Received: 03/30/2022] [Revised: 08/03/2022] [Accepted: 08/05/2022] [Indexed: 01/11/2023]
Abstract
PURPOSE: Radiation cardiotoxicity (RC) is a clinically significant adverse effect of treatment for patients with thoracic malignancies. Clinical studies in lung cancer have indicated that heart substructures are not uniformly radiosensitive, and that dose to the heart base drives RC. In this study, we aimed to characterize late changes in gene expression using spatial transcriptomics in a mouse model of base regional radiosensitivity.
METHODS AND MATERIALS: An aged female C57BL/6 mouse was irradiated with 16 Gy delivered to the cranial third of the heart using a 6 × 9 mm parallel opposed beam geometry on a small animal radiation research platform, and a second mouse was sham-irradiated. After echocardiography, whole hearts were collected at 30 weeks for spatial transcriptomic analysis to map gene expression changes occurring in different regions of the partially irradiated heart. Cardiac regions were manually annotated on the capture slides and the gene expression profiles compared across different regions.
RESULTS: Ejection fraction was reduced at 30 weeks after a 16 Gy irradiation to the heart base, compared with the sham-irradiated controls. There were markedly more significant gene expression changes within the irradiated regions compared with nonirradiated regions. Variation was observed in the transcriptomic effects of radiation on different cardiac base structures (eg, between the right atrium [n = 86 dysregulated genes], left atrium [n = 96 dysregulated genes], and the vasculature [n = 129 dysregulated genes]). Disrupted biological processes spanned extracellular matrix as well as circulatory, neuronal, and contractility activities.
CONCLUSIONS: This is the first study to report spatially resolved gene expression changes in irradiated tissues. Examination of the regional radiation response in the heart can help to further our understanding of the cardiac base's radiosensitivity and support the development of actionable targets for pharmacologic intervention and biologically relevant dose constraints.
Affiliation(s)
- Gerard M Walls
  - Patrick G Johnston Centre for Cancer Research, Queen's University Belfast, Northern Ireland; Cancer Centre Belfast City Hospital, Belfast Health & Social Care Trust, Belfast, Northern Ireland
- Mihaela Ghita
  - Patrick G Johnston Centre for Cancer Research, Queen's University Belfast, Northern Ireland
- Rachel Queen
  - Biosciences Institute, Faculty of Medical Sciences, Newcastle University, Newcastle-upon-Tyne, England
- Kevin S Edgar
  - Wellcome-Wolfson Institute for Experimental Medicine, Queen's University Belfast, Belfast, Northern Ireland
- Eleanor K Gill
  - Wellcome-Wolfson Institute for Experimental Medicine, Queen's University Belfast, Belfast, Northern Ireland; Department of Physiology, Anatomy and Genetics, University of Oxford, Oxford, England
- Refik Kuburas
  - Patrick G Johnston Centre for Cancer Research, Queen's University Belfast, Northern Ireland
- David J Grieve
  - Wellcome-Wolfson Institute for Experimental Medicine, Queen's University Belfast, Belfast, Northern Ireland
- Chris J Watson
  - Wellcome-Wolfson Institute for Experimental Medicine, Queen's University Belfast, Belfast, Northern Ireland
- Alan McWilliam
  - Division of Cancer Sciences, University of Manchester, Oglesby Building, Manchester, England; Department of Radiation Therapy Related Research, The Christie Foundation Trust, Manchester, England
- Marcel Van Herk
  - Division of Cancer Sciences, University of Manchester, Oglesby Building, Manchester, England; Department of Radiation Therapy Related Research, The Christie Foundation Trust, Manchester, England
- Kaye J Williams
  - Division of Pharmacy and Optometry, School of Health Science, Faculty of Biology Medicine and Health, University of Manchester, Manchester, England
- Aidan J Cole
  - Patrick G Johnston Centre for Cancer Research, Queen's University Belfast, Northern Ireland; Cancer Centre Belfast City Hospital, Belfast Health & Social Care Trust, Belfast, Northern Ireland
- Suneil Jain
  - Patrick G Johnston Centre for Cancer Research, Queen's University Belfast, Northern Ireland; Cancer Centre Belfast City Hospital, Belfast Health & Social Care Trust, Belfast, Northern Ireland
- Karl T Butterworth
  - Patrick G Johnston Centre for Cancer Research, Queen's University Belfast, Northern Ireland
10
REDfly: An Integrated Knowledgebase for Insect Regulatory Genomics. Insects 2022; 13:insects13070618. [PMID: 35886794 PMCID: PMC9323752 DOI: 10.3390/insects13070618] [Received: 06/16/2022] [Revised: 07/01/2022] [Accepted: 07/06/2022] [Indexed: 11/29/2022]
Abstract
Simple Summary: Understanding how genes are regulated is a vital area of current biological research and a crucial adjunct to ongoing efforts to sequence entire genomes. Knowing the DNA sequences responsible for gene regulation—transcriptional cis-regulatory modules (CRMs, e.g., “enhancers”) and transcription factor binding sites (TFBSs)—is important for many areas of research including interpretation and validation of data developed by large-scale genomics projects, providing training data for machine-learning CRM-discovery methods, genome annotation, modeling gene-regulatory networks, studying the evolution of gene regulation, and numerous aspects of the basic biology of transcriptional regulation. Knowledge of insect CRMs is also an important step in developing biotechnology methods for control of insect disease vectors and for eliminating pathogen transmission. The REDfly (Regulatory Element Database for Fly) database integrates all of the available insect cis-regulatory information from multiple sources to provide a comprehensive collection of known regulatory elements. In this paper, we describe REDfly’s basic contents and data model, emphasizing recently added features, and provide illustrated walk-throughs of some common search scenarios.
Abstract: We provide here an updated description of the REDfly (Regulatory Element Database for Fly) database of transcriptional regulatory elements, a unique resource that provides regulatory annotation for the genome of Drosophila and other insects. The genomic sequences regulating insect gene expression—transcriptional cis-regulatory modules (CRMs, e.g., “enhancers”) and transcription factor binding sites (TFBSs)—are not currently curated by any other major database resources. However, knowledge of such sequences is important, as CRMs play critical roles with respect to disease as well as normal development, phenotypic variation, and evolution. Characterized CRMs also provide useful tools for both basic and applied research, including developing methods for insect control. REDfly, which is the most detailed existing platform for metazoan regulatory-element annotation, includes over 40,000 experimentally verified CRMs and TFBSs along with their DNA sequences, their associated genes, and the expression patterns they direct. Here, we briefly describe REDfly’s contents and data model, with an emphasis on the new features implemented since 2020. We then provide an illustrated walk-through of several common REDfly search use cases.
11
Kachroo AH, Vandeloo M, Greco BM, Abdullah M. Humanized yeast to model human biology, disease and evolution. Dis Model Mech 2022; 15:275614. [PMID: 35661208 PMCID: PMC9194483 DOI: 10.1242/dmm.049309] [Indexed: 12/25/2022]
Abstract
For decades, budding yeast, a single-cellular eukaryote, has provided remarkable insights into human biology. Yeast and humans share several thousand genes despite morphological and cellular differences and over a billion years of separate evolution. These genes encode critical cellular processes, the failure of which in humans results in disease. Although recent developments in genome engineering of mammalian cells permit genetic assays in human cell lines, there is still a need to develop biological reagents to study human disease variants in a high-throughput manner. Many protein-coding human genes can successfully substitute for their yeast equivalents and sustain yeast growth, thus opening up doors for developing direct assays of human gene function in a tractable system referred to as 'humanized yeast'. Humanized yeast permits the discovery of new human biology by measuring human protein activity in a simplified organismal context. This Review summarizes recent developments showing how humanized yeast can directly assay human gene function and explore variant effects at scale. Thus, by extending the 'awesome power of yeast genetics' to study human biology, humanizing yeast reinforces the high relevance of evolutionarily distant model organisms to explore human gene evolution, function and disease.
12
A Saccharomyces eubayanus haploid resource for research studies. Sci Rep 2022; 12:5976. [PMID: 35396494 PMCID: PMC8993842 DOI: 10.1038/s41598-022-10048-8] [Received: 02/18/2022] [Accepted: 04/01/2022] [Indexed: 12/16/2022]
Abstract
Since its identification, Saccharomyces eubayanus has been recognized as the missing parent of the lager hybrid, S. pastorianus. This wild yeast has never been isolated from fermentation environments, thus representing an interesting candidate for evolutionary, ecological and genetic studies. However, it is imperative to develop additional molecular genetics tools to ease manipulation and thus facilitate future studies. With this in mind, we generated a collection of stable haploid strains representative of three main lineages described in S. eubayanus (PB-1, PB-2 and PB-3), by deleting the HO gene using CRISPR-Cas9 and tetrad micromanipulation. Phenotypic characterization under different conditions demonstrated that the haploid derivates were extremely similar to their parental strains. Genomic analysis in three strains highlighted a likely low frequency of off-targets, and sequencing of a single tetrad evidenced no structural variants in any of the haploid spores. Finally, we demonstrate the utilization of the haploid set by challenging the strains under mass-mating conditions. In this way, we found that S. eubayanus under liquid conditions has a preference to remain in a haploid state, unlike S. cerevisiae that mates rapidly. This haploid resource is a novel set of strains for future yeast molecular genetics studies.
|
13
|
Engel SR, Wong ED, Nash RS, Aleksander S, Alexander M, Douglass E, Karra K, Miyasato SR, Simison M, Skrzypek MS, Weng S, Cherry JM. New data and collaborations at the Saccharomyces Genome Database: updated reference genome, alleles, and the Alliance of Genome Resources. Genetics 2022; 220:iyab224. [PMID: 34897464 PMCID: PMC9209811 DOI: 10.1093/genetics/iyab224] [Citation(s) in RCA: 19] [Impact Index Per Article: 9.5] [Received: 09/15/2021] [Accepted: 11/11/2021] [Indexed: 02/03/2023] Open
Abstract
Saccharomyces cerevisiae is used to provide fundamental understanding of eukaryotic genetics, gene product function, and cellular biological processes. Saccharomyces Genome Database (SGD) has been supporting the yeast research community since 1993, serving as its de facto hub. Over the years, SGD has maintained the genetic nomenclature, chromosome maps, and functional annotation, and developed various tools and methods for analysis and curation of a variety of emerging data types. More recently, SGD and six other model organism focused knowledgebases have come together to create the Alliance of Genome Resources to develop sustainable genome information resources that promote and support the use of various model organisms to understand the genetic and genomic bases of human biology and disease. Here we describe recent activities at SGD, including the latest reference genome annotation update, the development of a curation system for mutant alleles, and new pages addressing homology across model organisms as well as the use of yeast to study human disease.
Affiliation(s)
- Stacia R Engel
- Department of Genetics, Stanford University, Stanford, CA 94305-5120, USA
| | - Edith D Wong
- Department of Genetics, Stanford University, Stanford, CA 94305-5120, USA
| | - Robert S Nash
- Department of Genetics, Stanford University, Stanford, CA 94305-5120, USA
| | - Suzi Aleksander
- Department of Genetics, Stanford University, Stanford, CA 94305-5120, USA
| | - Micheal Alexander
- Department of Genetics, Stanford University, Stanford, CA 94305-5120, USA
| | - Eric Douglass
- Department of Genetics, Stanford University, Stanford, CA 94305-5120, USA
| | - Kalpana Karra
- Department of Genetics, Stanford University, Stanford, CA 94305-5120, USA
| | - Stuart R Miyasato
- Department of Genetics, Stanford University, Stanford, CA 94305-5120, USA
| | - Matt Simison
- Department of Genetics, Stanford University, Stanford, CA 94305-5120, USA
| | - Marek S Skrzypek
- Department of Genetics, Stanford University, Stanford, CA 94305-5120, USA
| | - Shuai Weng
- Department of Genetics, Stanford University, Stanford, CA 94305-5120, USA
| | - J Michael Cherry
- Department of Genetics, Stanford University, Stanford, CA 94305-5120, USA
| |
|
14
|
Wood V, Sternberg PW, Lipshitz HD. Making biological knowledge useful for humans and machines. Genetics 2022; 220:6563297. [PMID: 35380659 PMCID: PMC8982017 DOI: 10.1093/genetics/iyac001] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Indexed: 02/03/2023] Open
|
15
|
Agapite J, Albou LP, Aleksander SA, Alexander M, Anagnostopoulos AV, Antonazzo G, Argasinska J, Arnaboldi V, Attrill H, Becerra A, Bello SM, Blake JA, Blodgett O, Bradford YM, Bult CJ, Cain S, Calvi BR, Carbon S, Chan J, Chen WJ, Michael Cherry J, Cho J, Christie KR, Crosby MA, Davis P, da Veiga Beltrame E, De Pons JL, D’Eustachio P, Diamantakis S, Dolan ME, dos Santos G, Douglass E, Dunn B, Eagle A, Ebert D, Engel SR, Fashena D, Foley S, Frazer K, Gao S, Gibson AC, Gondwe F, Goodman J, Sian Gramates L, Grove CA, Hale P, Harris T, Thomas Hayman G, Hill DP, Howe DG, Howe KL, Hu Y, Jha S, Kadin JA, Kaufman TC, Kalita P, Karra K, Kishore R, Kwitek AE, Laulederkind SJF, Lee R, Longden I, Luypaert M, MacPherson KA, Martin R, Marygold SJ, Matthews B, McAndrews MS, Millburn G, Miyasato S, Motenko H, Moxon S, Muller HM, Mungall CJ, Muruganujan A, Mushayahama T, Nalabolu HS, Nash RS, Ng P, Nuin P, Paddock H, Paulini M, Perrimon N, Pich C, Quinton-Tulloch M, Raciti D, Ramachandran S, Richardson JE, Gelbart SR, Ruzicka L, Schaper K, Schindelman G, Shimoyama M, Simison M, Shaw DR, Shrivatsav A, Singer A, Skrzypek M, Smith CM, Smith CL, Smith JR, Stein L, Sternberg PW, Tabone CJ, Thomas PD, Thorat K, Thota J, Toro S, Tomczuk M, Trovisco V, Tutaj MA, Tutaj M, Urbano JM, Van Auken K, Van Slyke CE, Wang Q, Wang SJ, Weng S, Westerfield M, Williams G, Wilming LG, Wong ED, Wright A, Yook K, Zarowiecki M, Zhou P, Zytkovicz M. Harmonizing model organism data in the Alliance of Genome Resources. Genetics 2022; 220:iyac022. [PMID: 35380658 PMCID: PMC8982023 DOI: 10.1093/genetics/iyac022] [Citation(s) in RCA: 50] [Impact Index Per Article: 25.0] [Received: 12/12/2021] [Accepted: 01/26/2022] [Indexed: 02/06/2023] Open
Abstract
The Alliance of Genome Resources (the Alliance) is a combined effort of 7 knowledgebase projects: Saccharomyces Genome Database, WormBase, FlyBase, Mouse Genome Database, the Zebrafish Information Network, Rat Genome Database, and the Gene Ontology Resource. The Alliance seeks to provide several benefits: better service to the various communities served by these projects; a harmonized view of data for all biomedical researchers, bioinformaticians, clinicians, and students; and a more sustainable infrastructure. The Alliance has harmonized cross-organism data to provide useful comparative views of gene function, gene expression, and human disease relevance. The basis of the comparative views is shared calls of orthology relationships and the use of common ontologies. The key types of data are alleles and variants, gene function based on gene ontology annotations, phenotypes, association to human disease, gene expression, protein-protein and genetic interactions, and participation in pathways. The information is presented on uniform gene pages that allow facile summarization of information about each gene in each of the 7 organisms covered (budding yeast, roundworm Caenorhabditis elegans, fruit fly, house mouse, zebrafish, brown rat, and human). The harmonized knowledge is freely available on the alliancegenome.org portal, as downloadable files, and by APIs. We expect other existing and emerging knowledge bases to join in the effort to provide the union of useful data and features that each knowledge base currently provides.
|
16
|
Bradford YM, Van Slyke CE, Ruzicka L, Singer A, Eagle A, Fashena D, Howe DG, Frazer K, Martin R, Paddock H, Pich C, Ramachandran S, Westerfield M. Zebrafish Information Network, the knowledgebase for Danio rerio research. Genetics 2022; 220:6528852. [PMID: 35166825 PMCID: PMC8982015 DOI: 10.1093/genetics/iyac016] [Citation(s) in RCA: 84] [Impact Index Per Article: 42.0] [Received: 12/10/2021] [Accepted: 01/18/2022] [Indexed: 11/24/2022] Open
Abstract
The Zebrafish Information Network (zfin.org) is the central repository for Danio rerio genetic and genomic data. The Zebrafish Information Network has served the zebrafish research community since 1994, expertly curating, integrating, and displaying zebrafish data. Key data types available at the Zebrafish Information Network include, but are not limited to, genes, alleles, human disease models, gene expression, phenotype, and gene function. The Zebrafish Information Network makes zebrafish research data Findable, Accessible, Interoperable, and Reusable through nomenclature, curatorial and annotation activities, web interfaces, and data downloads. Recently, the Zebrafish Information Network and 6 other model organism knowledgebases have collaborated to form the Alliance of Genome Resources, aiming to develop sustainable genome information resources that enable the use of model organisms to understand the genetic and genomic basis of human biology and disease. Here, we provide an overview of the data available at the Zebrafish Information Network including recent updates to the gene page to provide access to single-cell RNA sequencing data, links to Alliance web pages, ribbon diagrams to summarize the biological systems and Gene Ontology terms that have annotations, and data integration with the Alliance of Genome Resources.
Affiliation(s)
- Yvonne M Bradford
- The Institute of Neuroscience, University of Oregon, Eugene, Oregon 97403-1254, USA
| | - Ceri E Van Slyke
- The Institute of Neuroscience, University of Oregon, Eugene, Oregon 97403-1254, USA
| | - Leyla Ruzicka
- The Institute of Neuroscience, University of Oregon, Eugene, Oregon 97403-1254, USA
| | - Amy Singer
- The Institute of Neuroscience, University of Oregon, Eugene, Oregon 97403-1254, USA
| | - Anne Eagle
- The Institute of Neuroscience, University of Oregon, Eugene, Oregon 97403-1254, USA
| | - David Fashena
- The Institute of Neuroscience, University of Oregon, Eugene, Oregon 97403-1254, USA
| | - Douglas G Howe
- The Institute of Neuroscience, University of Oregon, Eugene, Oregon 97403-1254, USA
| | - Ken Frazer
- The Institute of Neuroscience, University of Oregon, Eugene, Oregon 97403-1254, USA
| | - Ryan Martin
- The Institute of Neuroscience, University of Oregon, Eugene, Oregon 97403-1254, USA
| | - Holly Paddock
- The Institute of Neuroscience, University of Oregon, Eugene, Oregon 97403-1254, USA
| | - Christian Pich
- The Institute of Neuroscience, University of Oregon, Eugene, Oregon 97403-1254, USA
| | - Sridhar Ramachandran
- The Institute of Neuroscience, University of Oregon, Eugene, Oregon 97403-1254, USA
| | - Monte Westerfield
- The Institute of Neuroscience, University of Oregon, Eugene, Oregon 97403-1254, USA
| |
|
17
|
Davis P, Zarowiecki M, Arnaboldi V, Becerra A, Cain S, Chan J, Chen WJ, Cho J, da Veiga Beltrame E, Diamantakis S, Gao S, Grigoriadis D, Grove CA, Harris TW, Kishore R, Le T, Lee RYN, Luypaert M, Müller HM, Nakamura C, Nuin P, Paulini M, Quinton-Tulloch M, Raciti D, Rodgers FH, Russell M, Schindelman G, Singh A, Stickland T, Van Auken K, Wang Q, Williams G, Wright AJ, Yook K, Berriman M, Howe KL, Schedl T, Stein L, Sternberg PW. WormBase in 2022-data, processes, and tools for analyzing Caenorhabditis elegans. Genetics 2022; 220:6521733. [PMID: 35134929 PMCID: PMC8982018 DOI: 10.1093/genetics/iyac003] [Citation(s) in RCA: 148] [Impact Index Per Article: 74.0] [Received: 10/14/2021] [Accepted: 12/17/2021] [Indexed: 02/06/2023] Open
Abstract
WormBase (www.wormbase.org) is the central repository for the genetics and genomics of the nematode Caenorhabditis elegans. We provide the research community with data and tools to facilitate the use of C. elegans and related nematodes as model organisms for studying human health, development, and many aspects of fundamental biology. Throughout our 22-year history, we have continued to evolve to reflect progress and innovation in the science and technologies involved in the study of C. elegans. We strive to incorporate new data types and richer data sets, and to provide integrated displays and services that avail the knowledge generated by the published nematode genetics literature. Here, we provide a broad overview of the current state of WormBase in terms of data type, curation workflows, analysis, and tools, including exciting new advances for analysis of single-cell data, text mining and visualization, and the new community collaboration forum. Concurrently, we continue the integration and harmonization of infrastructure, processes, and tools with the Alliance of Genome Resources, of which WormBase is a founding member.
Affiliation(s)
- Paul Davis
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Magdalena Zarowiecki
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Valerio Arnaboldi
- Division of Biology and Biological Engineering 140-18, California Institute of Technology, Pasadena, CA 91125, USA
| | - Andrés Becerra
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Scott Cain
- Informatics and Bio-computing Platform, Ontario Institute for Cancer Research, Toronto, ON M5G0A3, Canada
| | - Juancarlos Chan
- Division of Biology and Biological Engineering 140-18, California Institute of Technology, Pasadena, CA 91125, USA
| | - Wen J Chen
- Division of Biology and Biological Engineering 140-18, California Institute of Technology, Pasadena, CA 91125, USA
| | - Jaehyoung Cho
- Division of Biology and Biological Engineering 140-18, California Institute of Technology, Pasadena, CA 91125, USA
| | - Eduardo da Veiga Beltrame
- Division of Biology and Biological Engineering 140-18, California Institute of Technology, Pasadena, CA 91125, USA
| | - Stavros Diamantakis
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Sibyl Gao
- Informatics and Bio-computing Platform, Ontario Institute for Cancer Research, Toronto, ON M5G0A3, Canada
| | - Dionysis Grigoriadis
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Christian A Grove
- Division of Biology and Biological Engineering 140-18, California Institute of Technology, Pasadena, CA 91125, USA
| | - Todd W Harris
- Informatics and Bio-computing Platform, Ontario Institute for Cancer Research, Toronto, ON M5G0A3, Canada
| | - Ranjana Kishore
- Division of Biology and Biological Engineering 140-18, California Institute of Technology, Pasadena, CA 91125, USA
| | - Tuan Le
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Raymond Y N Lee
- Division of Biology and Biological Engineering 140-18, California Institute of Technology, Pasadena, CA 91125, USA
| | - Manuel Luypaert
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Hans-Michael Müller
- Division of Biology and Biological Engineering 140-18, California Institute of Technology, Pasadena, CA 91125, USA
| | - Cecilia Nakamura
- Division of Biology and Biological Engineering 140-18, California Institute of Technology, Pasadena, CA 91125, USA
| | - Paulo Nuin
- Informatics and Bio-computing Platform, Ontario Institute for Cancer Research, Toronto, ON M5G0A3, Canada
| | - Michael Paulini
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Mark Quinton-Tulloch
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Daniela Raciti
- Division of Biology and Biological Engineering 140-18, California Institute of Technology, Pasadena, CA 91125, USA
| | - Faye H Rodgers
- Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SA, UK
| | - Matthew Russell
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Gary Schindelman
- Division of Biology and Biological Engineering 140-18, California Institute of Technology, Pasadena, CA 91125, USA
| | - Archana Singh
- Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SA, UK
| | - Tim Stickland
- Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SA, UK
| | - Kimberly Van Auken
- Division of Biology and Biological Engineering 140-18, California Institute of Technology, Pasadena, CA 91125, USA
| | - Qinghua Wang
- Division of Biology and Biological Engineering 140-18, California Institute of Technology, Pasadena, CA 91125, USA
| | - Gary Williams
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Adam J Wright
- Informatics and Bio-computing Platform, Ontario Institute for Cancer Research, Toronto, ON M5G0A3, Canada
| | - Karen Yook
- Division of Biology and Biological Engineering 140-18, California Institute of Technology, Pasadena, CA 91125, USA
| | - Matt Berriman
- Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SA, UK
| | - Kevin L Howe
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
| | - Tim Schedl
- Department of Genetics, Washington University School of Medicine, St Louis, MO 63110, USA
| | - Lincoln Stein
- Informatics and Bio-computing Platform, Ontario Institute for Cancer Research, Toronto, ON M5G0A3, Canada
| | - Paul W Sternberg
- Division of Biology and Biological Engineering 140-18, California Institute of Technology, Pasadena, CA 91125, USA
| |
|
18
|
Arshinoff BI, Cary GA, Karimi K, Foley S, Agalakov S, Delgado F, Lotay VS, Ku CJ, Pells TJ, Beatman TR, Kim E, Cameron RA, Vize PD, Telmer C, Croce JC, Ettensohn CA, Hinman VF. Echinobase: leveraging an extant model organism database to build a knowledgebase supporting research on the genomics and biology of echinoderms. Nucleic Acids Res 2022; 50:D970-D979. [PMID: 34791383 PMCID: PMC8728261 DOI: 10.1093/nar/gkab1005] [Citation(s) in RCA: 41] [Impact Index Per Article: 20.5] [Received: 08/13/2021] [Revised: 10/05/2021] [Accepted: 10/13/2021] [Indexed: 12/16/2022] Open
Abstract
Echinobase (www.echinobase.org) is a third generation web resource supporting genomic research on echinoderms. The new version was built by cloning the mature Xenopus model organism knowledgebase, Xenbase, refactoring data ingestion pipelines and modifying the user interface to adapt to multispecies echinoderm content. This approach leveraged over 15 years of previous database and web application development to generate a new fully featured informatics resource in a single year. In addition to the software stack, Echinobase uses the private cloud and physical hosts that support Xenbase. Echinobase currently supports six echinoderm species, focused on those used for genomics, developmental biology and gene regulatory network analyses. Over 38 000 gene pages, 18 000 publications, new improved genome assemblies, JBrowse genome browser and BLAST + services are available and supported by the development of a new echinoderm anatomical ontology, uniformly applied formal gene nomenclature, and consistent orthology predictions. A novel feature of Echinobase is integrating support for multiple, disparate species. New genomes from the diverse echinoderm phylum will be added and supported as data becomes available. The common code development design of the integrated knowledgebases ensures parallel improvements as each resource evolves. This approach is widely applicable for developing new model organism informatics resources.
Affiliation(s)
- Bradley I Arshinoff
- Department of Biological Sciences, University of Calgary, Calgary, AB T2N 1N4, Canada
| | - Gregory A Cary
- Department of Biological Sciences, Carnegie Mellon University, Pittsburgh, PA 15213, USA
| | - Kamran Karimi
- Department of Biological Sciences, University of Calgary, Calgary, AB T2N 1N4, Canada
| | - Saoirse Foley
- Department of Biological Sciences, Carnegie Mellon University, Pittsburgh, PA 15213, USA
| | - Sergei Agalakov
- Department of Biological Sciences, University of Calgary, Calgary, AB T2N 1N4, Canada
| | - Francisco Delgado
- Department of Biological Sciences, Carnegie Mellon University, Pittsburgh, PA 15213, USA
| | - Vaneet S Lotay
- Department of Biological Sciences, University of Calgary, Calgary, AB T2N 1N4, Canada
| | - Carolyn J Ku
- Department of Biological Sciences, Carnegie Mellon University, Pittsburgh, PA 15213, USA
| | - Troy J Pells
- Department of Biological Sciences, University of Calgary, Calgary, AB T2N 1N4, Canada
| | - Thomas R Beatman
- Department of Biological Sciences, Carnegie Mellon University, Pittsburgh, PA 15213, USA
| | - Eugene Kim
- Department of Biological Sciences, University of Calgary, Calgary, AB T2N 1N4, Canada
| | - R Andrew Cameron
- Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA 91125, USA
| | - Peter D Vize
- Department of Biological Sciences, University of Calgary, Calgary, AB T2N 1N4, Canada
| | - Cheryl A Telmer
- Department of Biological Sciences, Carnegie Mellon University, Pittsburgh, PA 15213, USA
| | - Jenifer C Croce
- Laboratoire de Biologie du Développement de Villefranche-sur-Mer (LBDV), Institut de la Mer de Villefranche (IMEV), Sorbonne Université, CNRS, Villefranche-sur-Mer, France
| | - Charles A Ettensohn
- Department of Biological Sciences, Carnegie Mellon University, Pittsburgh, PA 15213, USA
| | - Veronica F Hinman
- Department of Biological Sciences, Carnegie Mellon University, Pittsburgh, PA 15213, USA
| |
|
19
|
Abstract
Since 1992, FlyBase has provided a freely available online database of information about the model organism Drosophila melanogaster. Data in FlyBase is curated manually from research papers as well as computationally from a variety of relevant sources, to serve as an information hub that enables and accelerates research discovery. This chapter aims to give users new to the database an overview of the layout and types of data available, as well as introducing some tools with which to access the data. More experienced users will find useful information about recent improvements and descriptions to enable more efficient navigation of the database.
Affiliation(s)
| | - Aoife Larkin
- Department of Physiology, Development and Neuroscience, University of Cambridge, Cambridge, UK
| | - Jim Thurmond
- Department of Biology, Indiana University, Bloomington, IN, USA
| |
|
20
|
Thomas PD, Ebert D, Muruganujan A, Mushayahama T, Albou L, Mi H. PANTHER: Making genome-scale phylogenetics accessible to all. Protein Sci 2022; 31:8-22. [PMID: 34717010 PMCID: PMC8740835 DOI: 10.1002/pro.4218] [Citation(s) in RCA: 501] [Impact Index Per Article: 250.5] [Received: 08/26/2021] [Revised: 10/24/2021] [Accepted: 10/26/2021] [Indexed: 02/03/2023]
Abstract
Phylogenetics is a powerful tool for analyzing protein sequences, by inferring their evolutionary relationships to other proteins. However, phylogenetics analyses can be challenging: they are computationally expensive and must be performed carefully in order to avoid systematic errors and artifacts. Protein Analysis THrough Evolutionary Relationships (PANTHER; http://pantherdb.org) is a publicly available, user-focused knowledgebase that stores the results of an extensive phylogenetic reconstruction pipeline that includes computational and manual processes and quality control steps. First, fully reconciled phylogenetic trees (including ancestral protein sequences) are reconstructed for a set of "reference" protein sequences obtained from fully sequenced genomes of organisms across the tree of life. Second, the resulting phylogenetic trees are manually reviewed and annotated with function evolution events: inferred gains and losses of protein function along branches of the phylogenetic tree. Here, we describe in detail the current contents of PANTHER, how those contents are generated, and how they can be used in a variety of applications. The PANTHER knowledgebase can be downloaded or accessed via an extensive API. In addition, PANTHER provides software tools to facilitate the application of the knowledgebase to common protein sequence analysis tasks: exploring an annotated genome by gene function; performing "enrichment analysis" of lists of genes; annotating a single sequence or large batch of sequences by homology; and assessing the likelihood that a genetic variant at a particular site in a protein will have deleterious effects.
Affiliation(s)
- Paul D. Thomas
- Division of Bioinformatics, Department of Population and Public Health Sciences, University of Southern California, Los Angeles, California, USA
| | - Dustin Ebert
- Division of Bioinformatics, Department of Population and Public Health Sciences, University of Southern California, Los Angeles, California, USA
| | - Anushya Muruganujan
- Division of Bioinformatics, Department of Population and Public Health Sciences, University of Southern California, Los Angeles, California, USA
| | - Tremayne Mushayahama
- Division of Bioinformatics, Department of Population and Public Health Sciences, University of Southern California, Los Angeles, California, USA
| | - Laurent‐Philippe Albou
- Division of Bioinformatics, Department of Population and Public Health Sciences, University of Southern California, Los Angeles, California, USA
| | - Huaiyu Mi
- Division of Bioinformatics, Department of Population and Public Health Sciences, University of Southern California, Los Angeles, California, USA
| |
|
21
|
Brayton CF. Laboratory Codes in Nomenclature and Scientific Communication (Advancing Organism Nomenclature in Scientific Communication to Improve Research Reporting and Reproducibility). ILAR J 2021; 62:295-309. [PMID: 36528817 DOI: 10.1093/ilar/ilac016] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/16/2022] [Revised: 08/23/2022] [Indexed: 12/23/2022] Open
Abstract
Laboratory registration codes, also known as laboratory codes or lab codes, are a key element in standardized laboratory animal and genetic nomenclature. As such they are critical to accurate scientific communication and to research reproducibility and integrity. The original committee on Mouse Genetic Nomenclature published nomenclature conventions for mice genetics in 1940, and then conventions for inbred strains in 1952. Unique designations were needed, and have been in use since the 1950s, for the sources of animals and substrains, for the laboratories that identified new alleles or mutations, and then for developers of transgenes and induced mutations. Current laboratory codes are typically a 2- to 4-letter acronym for an institution or an investigator. Unique codes are assigned from the International Laboratory Code Registry, which was developed and is maintained by ILAR in the National Academies (National Academies of Sciences Engineering and Medicine and previously National Academy of Sciences). As a resource for the global research community, the registry has been online since 1997. Since 2003 mouse and rat genetic and strain nomenclature rules have been reviewed and updated annually as a joint effort of the International Committee on Standardized Genetic Nomenclature for Mice and the Rat Genome and Nomenclature Committee. The current nomenclature conventions (particularly conventions for non-inbred animals) are applicable beyond rodents, although not widely adopted. Ongoing recognition, since at least the 1930s, of the research relevance of genetic backgrounds and origins of animals, and of spontaneous and induced genetic variants speaks to the need for broader application of standardized nomenclature for animals in research, particularly given the increasing numbers and complexities of genetically modified swine, nonhuman primates, fish, and other species.
Affiliation(s)
- Cory F Brayton
- Johns Hopkins Medicine, Molecular and Comparative Pathobiology, Baltimore, Maryland, USA
| |
|
22
|
Rutherford KM, Harris MA, Oliferenko S, Wood V. JaponicusDB: rapid deployment of a model organism database for an emerging model species. Genetics 2021; 220:6481558. [PMID: 35380656 PMCID: PMC9209809 DOI: 10.1093/genetics/iyab223] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 09/22/2021] [Accepted: 11/09/2021] [Indexed: 02/03/2023] Open
Abstract
The fission yeast Schizosaccharomyces japonicus has recently emerged as a powerful system for studying the evolution of essential cellular processes, drawing on similarities as well as key differences between S. japonicus and the related, well-established model Schizosaccharomyces pombe. We have deployed the open-source, modular code and tools originally developed for PomBase, the S. pombe model organism database (MOD), to create JaponicusDB (www.japonicusdb.org), a new MOD dedicated to S. japonicus. By providing a central resource with ready access to a growing body of experimental data, ontology-based curation, seamless browsing and querying, and the ability to integrate new data with existing knowledge, JaponicusDB supports fission yeast biologists to a far greater extent than any other source of S. japonicus data. JaponicusDB thus enables S. japonicus researchers to realize the full potential of studying a newly emerging model species and illustrates the widely applicable power and utility of harnessing reusable PomBase code to build a comprehensive, community-maintainable repository of species-relevant knowledge.
Affiliation(s)
- Kim M Rutherford
- Department of Biochemistry, University of Cambridge, Cambridge CB2 1GA, UK
| | - Midori A Harris
- Department of Biochemistry, University of Cambridge, Cambridge CB2 1GA, UK
| | - Snezhana Oliferenko
- The Francis Crick Institute, London NW1 1AT, UK,Randall Centre for Cell and Molecular Biophysics, School of Basic and Medical Biosciences, King’s College London, London SE1 1UL, UK,Corresponding author: (S.O.); (V.W.)
| | - Valerie Wood
- Department of Biochemistry, University of Cambridge, Cambridge CB2 1GA, UK. Corresponding authors: S.O., V.W.
| |
|
23
|
Ringwald M, Richardson JE, Baldarelli RM, Blake JA, Kadin JA, Smith C, Bult CJ. Mouse Genome Informatics (MGI): latest news from MGD and GXD. Mamm Genome 2021; 33:4-18. [PMID: 34698891 PMCID: PMC8913530 DOI: 10.1007/s00335-021-09921-0] [Citation(s) in RCA: 23] [Impact Index Per Article: 7.7] [Received: 07/16/2021] [Accepted: 09/21/2021] [Indexed: 12/01/2022]
Abstract
The Mouse Genome Informatics (MGI) database system combines multiple expertly curated community data resources into a shared knowledge management ecosystem united by common metadata annotation standards. MGI's mission is to facilitate the use of the mouse as an experimental model for understanding the genetic and genomic basis of human health and disease. MGI is the authoritative source for mouse gene, allele, and strain nomenclature and is the primary source of mouse phenotype annotations, functional annotations, developmental gene expression information, and annotations of mouse models with human diseases. MGI maintains mouse anatomy and phenotype ontologies and contributes to the development of the Gene Ontology and Disease Ontology and uses these ontologies as standard terminologies for annotation. The Mouse Genome Database (MGD) and the Gene Expression Database (GXD) are MGI's two major knowledgebases. Here, we highlight some of the recent changes and enhancements to MGD and GXD that have been implemented in response to changing needs of the biomedical research community and to improve the efficiency of expert curation. MGI can be accessed freely at http://www.informatics.jax.org.
|
24
|
|
25
|
Richardson JE, Baldarelli RM, Bult CJ. Multiple genome viewer (MGV): a new tool for visualization and comparison of multiple annotated genomes. Mamm Genome 2021; 33:44-54. [PMID: 34448927 PMCID: PMC8913476 DOI: 10.1007/s00335-021-09904-1] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 07/05/2021] [Accepted: 08/08/2021] [Indexed: 11/30/2022]
Abstract
The assembled and annotated genomes for 16 inbred mouse strains (Lilue et al., Nat Genet 50:1574–1583, 2018) and two wild-derived strains (CAROLI/EiJ and PAHARI/EiJ) (Thybert et al., Genome Res 28:448–459, 2018) are valuable resources for mouse genetics and comparative genomics. We developed the multiple genome viewer (MGV; http://www.informatics.jax.org/mgv) to support visualization, exploration, and comparison of genome annotations within and across these genomes. MGV displays chromosomal regions of user-selected genomes as horizontal tracks. Equivalent features across the genome tracks are highlighted using vertical ‘swim lane’ connectors. Navigation across the genomes is synchronized as a researcher uses the scroll and zoom functions. Researchers can generate custom sets of genes and other genome features to be displayed in MGV by entering genome coordinates, function, phenotype, disease, and/or pathway terms. MGV was developed to be genome agnostic and can be used to display homologous features across genomes of different organisms.
|
26
|
Campos TL, Korhonen PK, Hofmann A, Gasser RB, Young ND. Harnessing model organism genomics to underpin the machine learning-based prediction of essential genes in eukaryotes - Biotechnological implications. Biotechnol Adv 2021; 54:107822. [PMID: 34461202 DOI: 10.1016/j.biotechadv.2021.107822] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 04/09/2021] [Revised: 08/17/2021] [Accepted: 08/24/2021] [Indexed: 12/17/2022]
Abstract
The availability of high-quality genomes and advances in functional genomics have enabled large-scale studies of essential genes in model eukaryotes, including the 'elegant worm' (Caenorhabditis elegans; Nematoda) and the 'vinegar fly' (Drosophila melanogaster; Arthropoda). However, this is not the case for other, much less-studied organisms, such as socioeconomically important parasites, for which functional genomic platforms usually do not exist. Thus, there is a need to develop innovative techniques or approaches for the prediction, identification and investigation of essential genes. A key approach that could enable the prediction of such genes is machine learning (ML). Here, we undertake an historical review of experimental and computational approaches employed for the characterisation of essential genes in eukaryotes, with a particular focus on model ecdysozoans (C. elegans and D. melanogaster), and discuss the possible applicability of ML-approaches to organisms such as socioeconomically important parasites. We highlight some recent results showing that high-performance ML, combined with feature engineering, allows a reliable prediction of essential genes from extensive, publicly available 'omic data sets, with major potential to prioritise such genes (with statistical confidence) for subsequent functional genomic validation. These findings could 'open the door' to fundamental and applied research areas. Evidence of some commonality in the essential gene-complement between these two organisms indicates that an ML-engineering approach could find broader applicability to ecdysozoans such as parasitic nematodes or arthropods, provided that suitably large and informative data sets become/are available for proper feature engineering, and for the robust training and validation of algorithms. 
This area warrants detailed exploration to, for example, facilitate the identification and characterisation of essential molecules as novel targets for drugs and vaccines against parasitic diseases. This focus is particularly important, given the substantial impact that such diseases have worldwide, and the current challenges associated with their prevention and control and with drug resistance in parasite populations.
Affiliation(s)
- Tulio L Campos
- Department of Veterinary Biosciences, Melbourne Veterinary School, The University of Melbourne, Parkville, Victoria 3010, Australia; Bioinformatics Core Facility, Instituto Aggeu Magalhães, Fundação Oswaldo Cruz (IAM-Fiocruz), Recife, Pernambuco, Brazil
| | - Pasi K Korhonen
- Department of Veterinary Biosciences, Melbourne Veterinary School, The University of Melbourne, Parkville, Victoria 3010, Australia
| | - Andreas Hofmann
- Department of Veterinary Biosciences, Melbourne Veterinary School, The University of Melbourne, Parkville, Victoria 3010, Australia
| | - Robin B Gasser
- Department of Veterinary Biosciences, Melbourne Veterinary School, The University of Melbourne, Parkville, Victoria 3010, Australia.
| | - Neil D Young
- Department of Veterinary Biosciences, Melbourne Veterinary School, The University of Melbourne, Parkville, Victoria 3010, Australia.
| |
|
27
|
Huang X, Wang C, Chen L, Zhang T, Leung KL, Wong G. Human amyloid beta and α-synuclein co-expression in neurons impair behavior and recapitulate features for Lewy body dementia in Caenorhabditis elegans. Biochim Biophys Acta Mol Basis Dis 2021; 1867:166203. [PMID: 34146705 DOI: 10.1016/j.bbadis.2021.166203] [Citation(s) in RCA: 14] [Impact Index Per Article: 4.7] [Received: 02/28/2021] [Revised: 06/10/2021] [Accepted: 06/15/2021] [Indexed: 12/12/2022]
Abstract
Amyloid β (Aβ), a product of APP, and SNCA (α-synuclein (α-syn)) are two of the key proteins found in lesions associated with the age-related neurodegenerative disorders Alzheimer's disease (AD) and Parkinson's disease (PD), respectively. Previous clinical studies uncovered Aβ and α-syn co-expression in the brains of patients, which leads to Lewy body dementia (LBD), a disease encompassing Dementia with Lewy bodies (DLB) and Parkinson's disease dementia (PDD). To explore the pathogenesis and define the relationship between Aβ and α-syn for LBD, we established a C. elegans model which co-expresses human Aβ and α-syn with alanine 53 to threonine mutant (α-syn(A53T)) in pan-neurons. Compared to α-syn(A53T) single transgenic animals, pan-neuronal Aβ and α-syn(A53T) co-expression further enhanced the thrashing, egg laying, serotonin and cholinergic signaling deficits, and dopaminergic neuron damage in C. elegans. In addition, Aβ increased α-syn expression in transgenic animals. Transcriptome analysis of both Aβ;α-syn(A53T) strains and DLB patients showed common downregulation in lipid metabolism and lysosome function genes, suggesting that a decrease of lysosome function may reduce the clearance ability in DLB, and this may lead to further pathogenic protein accumulation. These findings suggest that our model can recapitulate some features in LBD and provides a mechanism by which Aβ may exacerbate α-syn pathogenesis.
Affiliation(s)
- Xiaobing Huang
- Cancer Centre, Centre of Reproduction, Development and Aging, Faculty of Health Sciences, University of Macau, Macau 999078, China
| | - Changliang Wang
- Cancer Centre, Centre of Reproduction, Development and Aging, Faculty of Health Sciences, University of Macau, Macau 999078, China; Bioland Laboratory (Guangzhou Regenerative Medicine and Health Guangdong Laboratory), Guangzhou 510005, China
| | - Liang Chen
- Department of Computer Science, College of Engineering, Shantou University, Shantou 515063, China; Key Laboratory of Intelligent Manufacturing Technology of Ministry of Education, Shantou University, Shantou 515063, China
| | - Tianjiao Zhang
- Cancer Centre, Centre of Reproduction, Development and Aging, Faculty of Health Sciences, University of Macau, Macau 999078, China
| | - Ka Lai Leung
- Cancer Centre, Centre of Reproduction, Development and Aging, Faculty of Health Sciences, University of Macau, Macau 999078, China
| | - Garry Wong
- Cancer Centre, Centre of Reproduction, Development and Aging, Faculty of Health Sciences, University of Macau, Macau 999078, China.
| |
|
28
|
Kałuzińska Ż, Kołat D, Bednarek AK, Płuciennik E. PLEK2, RRM2, GCSH: A Novel WWOX-Dependent Biomarker Triad of Glioblastoma at the Crossroads of Cytoskeleton Reorganization and Metabolism Alterations. Cancers (Basel) 2021; 13:cancers13122955. [PMID: 34204789 PMCID: PMC8231639 DOI: 10.3390/cancers13122955] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 03/19/2021] [Revised: 04/30/2021] [Accepted: 06/11/2021] [Indexed: 02/07/2023] Open
Abstract
Glioblastoma is one of the deadliest human cancers. Its malignancy depends on cytoskeleton reorganization, which is related to, e.g., epithelial-to-mesenchymal transition and metastasis. The malignant phenotype of glioblastoma is also affected by the WWOX gene, which is lost in nearly a quarter of gliomas. Although the role of WWOX in the cytoskeleton rearrangement has been found in neural progenitor cells, its function as a modulator of cytoskeleton in gliomas was not investigated. Therefore, this study aimed to investigate the role of WWOX and its collaborators in cytoskeleton dynamics of glioblastoma. Methodology on RNA-seq data integrated the use of databases, bioinformatics tools, web-based platforms, and machine learning algorithm, and the obtained results were validated through microarray data. PLEK2, RRM2, and GCSH were the most relevant WWOX-dependent genes that could serve as novel biomarkers. Other genes important in the context of cytoskeleton (BMP4, CCL11, CUX2, DUSP7, FAM92B, GRIN2B, HOXA1, HOXA10, KIF20A, NF2, SPOCK1, TTR, UHRF1, and WT1), metabolism (MTHFD2), or correlation with WWOX (COL3A1, KIF20A, RNF141, and RXRG) were also discovered. For the first time, we propose that changes in WWOX expression dictate a myriad of alterations that affect both glioblastoma cytoskeleton and metabolism, rendering new therapeutic possibilities.
|
29
|
Abstract
Following funding cuts, several model organism databases and the larger efforts that rely on their data are left facing uncertain futures.
|
30
|
Howe DG, Ramachandran S, Bradford YM, Fashena D, Toro S, Eagle A, Frazer K, Kalita P, Mani P, Martin R, Moxon ST, Paddock H, Pich C, Ruzicka L, Schaper K, Shao X, Singer A, Van Slyke CE, Westerfield M. The Zebrafish Information Network: major gene page and home page updates. Nucleic Acids Res 2021; 49:D1058-D1064. [PMID: 33170210 PMCID: PMC7778988 DOI: 10.1093/nar/gkaa1010] [Citation(s) in RCA: 14] [Impact Index Per Article: 4.7] [Received: 09/15/2020] [Revised: 10/05/2020] [Accepted: 10/13/2020] [Indexed: 02/06/2023] Open
Abstract
The Zebrafish Information Network (ZFIN) (https://zfin.org/) is the database for the model organism, zebrafish (Danio rerio). ZFIN expertly curates, organizes, and provides a wide array of zebrafish genetic and genomic data, including genes, alleles, transgenic lines, gene expression, gene function, mutant phenotypes, orthology, human disease models, gene and mutant nomenclature, and reagents. New features at ZFIN include major updates to the home page and the gene page, the two most used pages at ZFIN. Data including disease models, phenotypes, expression, mutants and gene function continue to be contributed to The Alliance of Genome Resources for integration with similar data from other model organisms.
Affiliation(s)
- Douglas G Howe
- The Institute of Neuroscience, University of Oregon, Eugene, OR 97403-1254, USA
| | | | - Yvonne M Bradford
- The Institute of Neuroscience, University of Oregon, Eugene, OR 97403-1254, USA
| | - David Fashena
- The Institute of Neuroscience, University of Oregon, Eugene, OR 97403-1254, USA
| | - Sabrina Toro
- The Institute of Neuroscience, University of Oregon, Eugene, OR 97403-1254, USA
| | - Anne Eagle
- The Institute of Neuroscience, University of Oregon, Eugene, OR 97403-1254, USA
| | - Ken Frazer
- The Institute of Neuroscience, University of Oregon, Eugene, OR 97403-1254, USA
| | - Patrick Kalita
- The Institute of Neuroscience, University of Oregon, Eugene, OR 97403-1254, USA
| | - Prita Mani
- The Institute of Neuroscience, University of Oregon, Eugene, OR 97403-1254, USA
| | - Ryan Martin
- The Institute of Neuroscience, University of Oregon, Eugene, OR 97403-1254, USA
| | - Sierra Taylor Moxon
- The Institute of Neuroscience, University of Oregon, Eugene, OR 97403-1254, USA
| | - Holly Paddock
- The Institute of Neuroscience, University of Oregon, Eugene, OR 97403-1254, USA
| | - Christian Pich
- The Institute of Neuroscience, University of Oregon, Eugene, OR 97403-1254, USA
| | - Leyla Ruzicka
- The Institute of Neuroscience, University of Oregon, Eugene, OR 97403-1254, USA
| | - Kevin Schaper
- The Institute of Neuroscience, University of Oregon, Eugene, OR 97403-1254, USA
| | - Xiang Shao
- The Institute of Neuroscience, University of Oregon, Eugene, OR 97403-1254, USA
| | - Amy Singer
- The Institute of Neuroscience, University of Oregon, Eugene, OR 97403-1254, USA
| | - Ceri E Van Slyke
- The Institute of Neuroscience, University of Oregon, Eugene, OR 97403-1254, USA
| | - Monte Westerfield
- The Institute of Neuroscience, University of Oregon, Eugene, OR 97403-1254, USA
| |
|
31
|
Hu Y, Comjean A, Rodiger J, Liu Y, Gao Y, Chung V, Zirin J, Perrimon N, Mohr SE. FlyRNAi.org-the database of the Drosophila RNAi screening center and transgenic RNAi project: 2021 update. Nucleic Acids Res 2021; 49:D908-D915. [PMID: 33104800 PMCID: PMC7778949 DOI: 10.1093/nar/gkaa936] [Citation(s) in RCA: 32] [Impact Index Per Article: 10.7] [Received: 09/03/2020] [Revised: 10/01/2020] [Accepted: 10/06/2020] [Indexed: 12/24/2022] Open
Abstract
The FlyRNAi database at the Drosophila RNAi Screening Center and Transgenic RNAi Project (DRSC/TRiP) provides a suite of online resources that facilitate functional genomics studies with a special emphasis on Drosophila melanogaster. Currently, the database provides: gene-centric resources that facilitate ortholog mapping and mining of information about orthologs in common genetic model species; reagent-centric resources that help researchers identify RNAi and CRISPR sgRNA reagents or designs; and data-centric resources that facilitate visualization and mining of transcriptomics data, protein modification data, protein interactions, and more. Here, we discuss updated and new features that help biological and biomedical researchers efficiently identify, visualize, analyze, and integrate information and data for Drosophila and other species. Together, these resources facilitate multiple steps in functional genomics workflows, from building gene and reagent lists to management, analysis, and integration of data.
Affiliation(s)
- Yanhui Hu
- Department of Genetics, Blavatnik Institute, Harvard Medical School, 77 Avenue Louis Pasteur, Boston, MA 02115, USA; Drosophila RNAi Screening Center, Harvard Medical School, 77 Avenue Louis Pasteur, Boston, MA 02115, USA
| | - Aram Comjean
- Department of Genetics, Blavatnik Institute, Harvard Medical School, 77 Avenue Louis Pasteur, Boston, MA 02115, USA; Drosophila RNAi Screening Center, Harvard Medical School, 77 Avenue Louis Pasteur, Boston, MA 02115, USA
| | - Jonathan Rodiger
- Department of Genetics, Blavatnik Institute, Harvard Medical School, 77 Avenue Louis Pasteur, Boston, MA 02115, USA; Drosophila RNAi Screening Center, Harvard Medical School, 77 Avenue Louis Pasteur, Boston, MA 02115, USA
| | - Yifang Liu
- Department of Genetics, Blavatnik Institute, Harvard Medical School, 77 Avenue Louis Pasteur, Boston, MA 02115, USA; Drosophila RNAi Screening Center, Harvard Medical School, 77 Avenue Louis Pasteur, Boston, MA 02115, USA
| | - Yue Gao
- Department of Genetics, Blavatnik Institute, Harvard Medical School, 77 Avenue Louis Pasteur, Boston, MA 02115, USA; Drosophila RNAi Screening Center, Harvard Medical School, 77 Avenue Louis Pasteur, Boston, MA 02115, USA
| | - Verena Chung
- Department of Genetics, Blavatnik Institute, Harvard Medical School, 77 Avenue Louis Pasteur, Boston, MA 02115, USA; Drosophila RNAi Screening Center, Harvard Medical School, 77 Avenue Louis Pasteur, Boston, MA 02115, USA
| | - Jonathan Zirin
- Department of Genetics, Blavatnik Institute, Harvard Medical School, 77 Avenue Louis Pasteur, Boston, MA 02115, USA; Drosophila RNAi Screening Center, Harvard Medical School, 77 Avenue Louis Pasteur, Boston, MA 02115, USA
| | - Norbert Perrimon
- Department of Genetics, Blavatnik Institute, Harvard Medical School, 77 Avenue Louis Pasteur, Boston, MA 02115, USA; Drosophila RNAi Screening Center, Harvard Medical School, 77 Avenue Louis Pasteur, Boston, MA 02115, USA; Howard Hughes Medical Institute, 77 Avenue Louis Pasteur, Boston, MA 02115, USA
| | - Stephanie E Mohr
- Department of Genetics, Blavatnik Institute, Harvard Medical School, 77 Avenue Louis Pasteur, Boston, MA 02115, USA; Drosophila RNAi Screening Center, Harvard Medical School, 77 Avenue Louis Pasteur, Boston, MA 02115, USA
| |
|
32
|
Sweeney BA, Petrov AI, Ribas CE, Finn RD, Bateman A, Szymanski M, Karlowski WM, Seemann SE, Gorodkin J, Cannone JJ, Gutell RR, Kay S, Marygold S, dos Santos G, Frankish A, Mudge JM, Barshir R, Fishilevich S, Chan PP, Lowe TM, Seal R, Bruford E, Panni S, Porras P, Karagkouni D, Hatzigeorgiou AG, Ma L, Zhang Z, Volders PJ, Mestdagh P, Griffiths-Jones S, Fromm B, Peterson KJ, Kalvari I, Nawrocki EP, Petrov AS, Weng S, Bouchard-Bourelle P, Scott M, Lui LM, Hoksza D, Lovering RC, Kramarz B, Mani P, Ramachandran S, Weinberg Z. RNAcentral 2021: secondary structure integration, improved sequence search and new member databases. Nucleic Acids Res 2021; 49:D212-D220. [PMID: 33106848 PMCID: PMC7779037 DOI: 10.1093/nar/gkaa921] [Citation(s) in RCA: 134] [Impact Index Per Article: 44.7] [Received: 09/15/2020] [Accepted: 10/05/2020] [Indexed: 12/16/2022] Open
Abstract
RNAcentral is a comprehensive database of non-coding RNA (ncRNA) sequences that provides a single access point to 44 RNA resources and >18 million ncRNA sequences from a wide range of organisms and RNA types. RNAcentral now also includes secondary (2D) structure information for >13 million sequences, making RNAcentral the world's largest RNA 2D structure database. The 2D diagrams are displayed using R2DT, a new 2D structure visualization method that uses consistent, reproducible and recognizable layouts for related RNAs. The sequence similarity search has been updated with a faster interface featuring facets for filtering search results by RNA type, organism, source database or any keyword. This sequence search tool is available as a reusable web component, and has been integrated into several RNAcentral member databases, including Rfam, miRBase and snoDB. To allow for a more fine-grained assignment of RNA types and subtypes, all RNAcentral sequences have been annotated with Sequence Ontology terms. The RNAcentral database continues to grow and provide a central data resource for the RNA community. RNAcentral is freely available at https://rnacentral.org.
|
33
|
Blake JA, Baldarelli R, Kadin JA, Richardson JE, Smith C, Bult CJ. Mouse Genome Database (MGD): Knowledgebase for mouse-human comparative biology. Nucleic Acids Res 2021; 49:D981-D987. [PMID: 33231642 PMCID: PMC7779030 DOI: 10.1093/nar/gkaa1083] [Citation(s) in RCA: 178] [Impact Index Per Article: 59.3] [Received: 09/15/2020] [Revised: 10/18/2020] [Accepted: 11/22/2020] [Indexed: 11/17/2022] Open
Abstract
The Mouse Genome Database (MGD; http://www.informatics.jax.org) is the community model organism knowledgebase for the laboratory mouse, a widely used animal model for comparative studies of the genetic and genomic basis for human health and disease. MGD is the authoritative source for biological reference data related to mouse genes, gene functions, phenotypes and mouse models of human disease. MGD is the primary source for official gene, allele, and mouse strain nomenclature based on the guidelines set by the International Committee on Standardized Nomenclature for Mice. MGD's biocuration scientists curate information from the biomedical literature and from large and small datasets contributed directly by investigators. In this report we describe significant enhancements to the content and interfaces at MGD, including (i) improvements in the Multi Genome Viewer for exploring the genomes of multiple mouse strains, (ii) inclusion of many more mouse strains and new mouse strain pages with extended query options and (iii) integration of extensive data about mouse strain variants. We also describe improvements to the efficiency of literature curation processes and the implementation of an information portal focused on mouse models and genes for the study of COVID-19.
|
34
|
Meyer C, Scalzitti N, Jeannin-Girardon A, Collet P, Poch O, Thompson JD. Understanding the causes of errors in eukaryotic protein-coding gene prediction: a case study of primate proteomes. BMC Bioinformatics 2020; 21:513. [PMID: 33172385 PMCID: PMC7656754 DOI: 10.1186/s12859-020-03855-1] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Received: 07/28/2020] [Accepted: 10/30/2020] [Indexed: 11/10/2022] Open
Abstract
Background: Recent advances in sequencing technologies have led to an explosion in the number of genomes available, but accurate genome annotation remains a major challenge. The prediction of protein-coding genes in eukaryotic genomes is especially problematic, due to their complex exon–intron structures. Even the best eukaryotic gene prediction algorithms can make serious errors that will significantly affect subsequent analyses. Results: We first investigated the prevalence of gene prediction errors in a large set of 176,478 proteins from ten primate proteomes available in public databases. Using the well-studied human proteins as a reference, a total of 82,305 potential errors were detected, including 44,001 deletions, 27,289 insertions and 11,015 mismatched segments where part of the correct protein sequence is replaced with an alternative erroneous sequence. We then focused on the mismatched sequence errors that cause particular problems for downstream applications. A detailed characterization allowed us to identify the potential causes for the gene misprediction in approximately half (5446) of these cases. As a proof-of-concept, we also developed a simple method which allowed us to propose improved sequences for 603 primate proteins. Conclusions: Gene prediction errors in primate proteomes affect up to 50% of the sequences. Major causes of errors include undetermined genome regions, genome sequencing or assembly issues, and limitations in the models used to represent gene exon–intron structures. Nevertheless, existing genome sequences can still be exploited to improve protein sequence quality. Perspectives of the work include the characterization of other types of gene prediction errors, as well as the development of a more comprehensive algorithm for protein sequence error correction.
Affiliation(s)
- Corentin Meyer
- Department of Computer Science, ICube, CNRS, University of Strasbourg, Strasbourg, France
| | - Nicolas Scalzitti
- Department of Computer Science, ICube, CNRS, University of Strasbourg, Strasbourg, France
| | - Anne Jeannin-Girardon
- Department of Computer Science, ICube, CNRS, University of Strasbourg, Strasbourg, France
| | - Pierre Collet
- Department of Computer Science, ICube, CNRS, University of Strasbourg, Strasbourg, France
| | - Olivier Poch
- Department of Computer Science, ICube, CNRS, University of Strasbourg, Strasbourg, France
| | - Julie D Thompson
- Department of Computer Science, ICube, CNRS, University of Strasbourg, Strasbourg, France.
| |
|
35
|
Delso G, Cirillo D, Kaggie JD, Valencia A, Metser U, Veit-Haibach P. How to Design AI-Driven Clinical Trials in Nuclear Medicine. Semin Nucl Med 2020; 51:112-119. [PMID: 33509367 DOI: 10.1053/j.semnuclmed.2020.09.003] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Indexed: 12/17/2022]
Abstract
Artificial intelligence (AI) is an overarching term for a multitude of technologies which are currently being discussed and introduced in several areas of medicine and in medical imaging specifically. There is, however, limited literature and information about how AI techniques can be integrated into the design of clinical imaging trials. This article will present several aspects of AI being used in trials today and how imaging departments and especially nuclear medicine departments can prepare themselves to be at the forefront of AI-driven clinical trials. Beginning with some basic explanation on AI techniques currently being used and existing challenges of its implementation, it will also cover the logistical prerequisites which have to be in place in nuclear medicine departments to participate successfully in AI-driven clinical trials.
Affiliation(s)
| | | | - Joshua D Kaggie
- Department of Radiology, University of Cambridge, Cambridge, UK
| | | | - Ur Metser
- Joint Department of Medical Imaging, University Health Network, Toronto, CA
| | | |
|