1
Sunil RS, Lim SC, Itharajula M, Mutwil M. The gene function prediction challenge: Large language models and knowledge graphs to the rescue. Current Opinion in Plant Biology 2024; 82:102665. [PMID: 39579414 DOI: 10.1016/j.pbi.2024.102665] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/13/2024] [Revised: 10/23/2024] [Accepted: 10/24/2024] [Indexed: 11/25/2024]
Abstract
Elucidating gene function is one of the ultimate goals of plant science. Despite this, only ~15% of all genes in the model plant Arabidopsis thaliana have comprehensively experimentally verified functions. While bioinformatic gene function prediction approaches can guide biologists in their experimental efforts, neither the performance of gene function prediction methods nor the number of experimental characterizations of genes has increased dramatically in recent years. In this review, we discuss the status quo and the trajectory of gene function elucidation and outline recent advances in gene function prediction approaches. We then discuss how recent artificial intelligence advances in large language models and knowledge graphs can be leveraged to accelerate gene function predictions and keep us updated with the scientific literature.
Affiliation(s)
- Rohan Shawn Sunil
  - School of Biological Sciences, Nanyang Technological University, 60 Nanyang Drive, Singapore, 637551, Singapore
- Shan Chun Lim
  - School of Biological Sciences, Nanyang Technological University, 60 Nanyang Drive, Singapore, 637551, Singapore
- Manoj Itharajula
  - School of Biological Sciences, Nanyang Technological University, 60 Nanyang Drive, Singapore, 637551, Singapore
- Marek Mutwil
  - School of Biological Sciences, Nanyang Technological University, 60 Nanyang Drive, Singapore, 637551, Singapore
2
Alfatah M, Lim JJJ, Zhang Y, Naaz A, Cheng TYN, Yogasundaram S, Faidzinn NA, Lin JJ, Eisenhaber B, Eisenhaber F. Uncharacterized yeast gene YBR238C, an effector of TORC1 signaling in a mitochondrial feedback loop, accelerates cellular aging via HAP4- and RMD9-dependent mechanisms. eLife 2024; 12:RP92178. [PMID: 38713053 PMCID: PMC11076046 DOI: 10.7554/elife.92178] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/08/2024] [Open Access]
Abstract
Uncovering the regulators of cellular aging will unravel the complexity of aging biology and identify potential therapeutic interventions to delay the onset and progress of chronic, aging-related diseases. In this work, we systematically compared genesets involved in regulating the lifespan of Saccharomyces cerevisiae (a powerful model organism to study the cellular aging of humans) and those with expression changes under rapamycin treatment. Among the functionally uncharacterized genes in the overlap set, YBR238C stood out as the only one downregulated by rapamycin and with an increased chronological and replicative lifespan upon deletion. We show that YBR238C and its paralog RMD9 oppositely affect mitochondria and aging. YBR238C deletion increases the cellular lifespan by enhancing mitochondrial function. Its overexpression accelerates cellular aging via mitochondrial dysfunction. We find that the phenotypic effect of YBR238C is largely explained by HAP4- and RMD9-dependent mechanisms. Furthermore, we find that genetic- or chemical-based induction of mitochondrial dysfunction increases TORC1 (Target of Rapamycin Complex 1) activity that, subsequently, accelerates cellular aging. Notably, TORC1 inhibition by rapamycin (or deletion of YBR238C) improves the shortened lifespan under these mitochondrial dysfunction conditions in yeast and human cells. The growth of mutant cells (a proxy of TORC1 activity) with enhanced mitochondrial function is sensitive to rapamycin whereas the growth of defective mitochondrial mutants is largely resistant to rapamycin compared to wild type. Our findings demonstrate a feedback loop between TORC1 and mitochondria (the TORC1-MItochondria-TORC1 (TOMITO) signaling process) that regulates cellular aging processes. Hereby, YBR238C is an effector of TORC1 modulating mitochondrial function.
Affiliation(s)
- Mohammad Alfatah
  - Bioinformatics Institute (BII), Agency for Science, Technology and Research (A*STAR), Singapore, Singapore
- Jolyn Jia Jia Lim
  - Bioinformatics Institute (BII), Agency for Science, Technology and Research (A*STAR), Singapore, Singapore
- Yizhong Zhang
  - Bioinformatics Institute (BII), Agency for Science, Technology and Research (A*STAR), Singapore, Singapore
- Arshia Naaz
  - Genome Institute of Singapore (GIS), Agency for Science, Technology and Research (A*STAR), Singapore, Singapore
- Trishia Yi Ning Cheng
  - Bioinformatics Institute (BII), Agency for Science, Technology and Research (A*STAR), Singapore, Singapore
- Sonia Yogasundaram
  - Bioinformatics Institute (BII), Agency for Science, Technology and Research (A*STAR), Singapore, Singapore
- Nashrul Afiq Faidzinn
  - Bioinformatics Institute (BII), Agency for Science, Technology and Research (A*STAR), Singapore, Singapore
- Jovian Jing Lin
  - Bioinformatics Institute (BII), Agency for Science, Technology and Research (A*STAR), Singapore, Singapore
- Birgit Eisenhaber
  - Bioinformatics Institute (BII), Agency for Science, Technology and Research (A*STAR), Singapore, Singapore
  - Genome Institute of Singapore (GIS), Agency for Science, Technology and Research (A*STAR), Singapore, Singapore
  - LASA – Lausitz Advanced Scientific Applications gGmbH, Weißwasser, Germany
- Frank Eisenhaber
  - Bioinformatics Institute (BII), Agency for Science, Technology and Research (A*STAR), Singapore, Singapore
  - Genome Institute of Singapore (GIS), Agency for Science, Technology and Research (A*STAR), Singapore, Singapore
  - LASA – Lausitz Advanced Scientific Applications gGmbH, Weißwasser, Germany
  - School of Biological Sciences (SBS), Nanyang Technological University (NTU), Singapore, Singapore
3
Karampatakis T, Tsergouli K, Behzadi P. Pan-Genome Plasticity and Virulence Factors: A Natural Treasure Trove for Acinetobacter baumannii. Antibiotics (Basel) 2024; 13:257. [PMID: 38534692 DOI: 10.3390/antibiotics13030257] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/04/2024] [Revised: 02/17/2024] [Accepted: 03/12/2024] [Indexed: 03/28/2024] [Open Access]
Abstract
Acinetobacter baumannii is a Gram-negative pathogen responsible for a variety of community- and hospital-acquired infections. It is recognized as a life-threatening pathogen among hospitalized individuals and, in particular, immunocompromised patients in many countries. A. baumannii, as a member of the ESKAPE group, encompasses high genomic plasticity and simultaneously is predisposed to receive and exchange the mobile genetic elements (MGEs) through horizontal genetic transfer (HGT). Indeed, A. baumannii is a treasure trove that contains a high number of virulence factors. In accordance with these unique pathogenic characteristics of A. baumannii, the authors aim to discuss the natural treasure trove of pan-genome and virulence factors pertaining to this bacterial monster and try to highlight the reasons why this bacterium is a great concern in the global public health system.
Affiliation(s)
- Katerina Tsergouli
  - Microbiology Department, Agios Pavlos General Hospital, 55134 Thessaloniki, Greece
- Payam Behzadi
  - Department of Microbiology, Shahr-e-Qods Branch, Islamic Azad University, Tehran 37541-374, Iran
4
Sintsova A, Ruscheweyh HJ, Field CM, Feer L, Nguyen BD, Daniel B, Hardt WD, Vorholt JA, Sunagawa S. mBARq: a versatile and user-friendly framework for the analysis of DNA barcodes from transposon insertion libraries, knockout mutants, and isogenic strain populations. Bioinformatics 2024; 40:btae078. [PMID: 38341646 PMCID: PMC10885212 DOI: 10.1093/bioinformatics/btae078] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/31/2023] [Revised: 12/18/2023] [Accepted: 02/08/2024] [Indexed: 02/12/2024] [Open Access]
Abstract
MOTIVATION: DNA barcoding has become a powerful tool for assessing the fitness of strains in a variety of studies, including random transposon mutagenesis screens, attenuation of site-directed mutants, and population dynamics of isogenic strain pools. However, the statistical analysis, visualization, and contextualization of the data resulting from such experiments can be complex and require bioinformatic skills. RESULTS: Here, we developed mBARq, a user-friendly tool designed to simplify these steps for diverse experimental setups. The tool is seamlessly integrated with an intuitive web app for interactive data exploration via the STRING and KEGG databases to accelerate scientific discovery. AVAILABILITY AND IMPLEMENTATION: The tool is implemented in Python. The source code is freely available (https://github.com/MicrobiologyETHZ/mbarq) and the web app can be accessed at: https://microbiomics.io/tools/mbarq-app.
Affiliation(s)
- Anna Sintsova
  - Department of Biology, Institute of Microbiology, ETH Zurich, Zurich 8093, Switzerland
  - Department of Biology, Institute of Microbiology, Swiss Institute of Bioinformatics, ETH Zurich, Zurich 8093, Switzerland
- Hans-Joachim Ruscheweyh
  - Department of Biology, Institute of Microbiology, ETH Zurich, Zurich 8093, Switzerland
  - Department of Biology, Institute of Microbiology, Swiss Institute of Bioinformatics, ETH Zurich, Zurich 8093, Switzerland
- Christopher M Field
  - Department of Biology, Institute of Microbiology, ETH Zurich, Zurich 8093, Switzerland
  - Department of Biology, Institute of Microbiology, Swiss Institute of Bioinformatics, ETH Zurich, Zurich 8093, Switzerland
- Lilith Feer
  - Department of Biology, Institute of Microbiology, ETH Zurich, Zurich 8093, Switzerland
  - Department of Biology, Institute of Microbiology, Swiss Institute of Bioinformatics, ETH Zurich, Zurich 8093, Switzerland
- Bidong D Nguyen
  - Department of Biology, Institute of Microbiology, ETH Zurich, Zurich 8093, Switzerland
- Benjamin Daniel
  - Department of Biology, Institute of Microbiology, ETH Zurich, Zurich 8093, Switzerland
- Wolf-Dietrich Hardt
  - Department of Biology, Institute of Microbiology, ETH Zurich, Zurich 8093, Switzerland
- Julia A Vorholt
  - Department of Biology, Institute of Microbiology, ETH Zurich, Zurich 8093, Switzerland
- Shinichi Sunagawa
  - Department of Biology, Institute of Microbiology, ETH Zurich, Zurich 8093, Switzerland
  - Department of Biology, Institute of Microbiology, Swiss Institute of Bioinformatics, ETH Zurich, Zurich 8093, Switzerland
5
Rappsilber J. A dive into the unknome. Trends Genet 2024; 40:15-16. [PMID: 37968205 DOI: 10.1016/j.tig.2023.10.011] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/23/2023] [Accepted: 10/23/2023] [Indexed: 11/17/2023]
Abstract
We may never understand the function of all genes, findings by Freeman, Munro and colleagues suggest, unless we rethink our approaches. They make a thorough attempt at quantifying the unknownness of protein-coding genes and experimentally prove that many neglected genes hold the seed of important discoveries.
Affiliation(s)
- Juri Rappsilber
- Technische Universität Berlin, Chair of Bioanalytics, 10623 Berlin, Germany; Wellcome Centre for Cell Biology, University of Edinburgh, Edinburgh, EH9 3BF, UK; Si-M/'Der Simulierte Mensch', a Science Framework of Technische Universität Berlin and Charité - Universitätsmedizin Berlin, Berlin, Germany.
6
Tantoso E, Eisenhaber B, Sinha S, Jensen LJ, Eisenhaber F. Did the early full genome sequencing of yeast boost gene function discovery? Biol Direct 2023; 18:46. [PMID: 37574542 PMCID: PMC10424406 DOI: 10.1186/s13062-023-00403-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/15/2023] [Accepted: 08/01/2023] [Indexed: 08/15/2023] [Open Access]
Abstract
BACKGROUND: Although the genome of Saccharomyces cerevisiae (S. cerevisiae) was the first of a eukaryotic organism to be fully sequenced (in 1996), a complete understanding of the potential of the encoded biomolecular mechanisms has not yet been achieved. Here, we quantify how far away the goal of a full list of S. cerevisiae gene functions still is. RESULTS: The scientific literature about S. cerevisiae protein-coding genes has been mapped onto the yeast genome via the mentions of names for genomic regions in scientific publications. The match was quantified with the ratio of a given gene name's occurrences to those of any gene names in the article. We find that ~230 elite genes with ≥75 full publication equivalents (FPEs; FPE = 1 is an idealized publication referring to just a single gene) command ~45% of all literature. At the same time, about two thirds of the genes (each with fewer than 10 FPEs) are described in just 12% of the literature (on average, each such gene has just ~1.5% of the literature of an elite gene). About 600 genes have not been mentioned in any dedicated article. Compared with other groups of genes, the literature growth rates were highest for uncharacterized or understudied genes until the late 1990s. Yet these growth rates deteriorated and became negative thereafter. Thus, yeast function discovery for previously uncharacterized genes has returned to the level of ~1980. At the same time, the literature for already well-studied genes (with a threshold T10 (≥10 FPEs) and higher) keeps growing steadily. CONCLUSIONS: Did the early full genome sequencing of yeast boost gene function discovery? The data prove that the moment of publishing the full genome in reality coincides with the onset of the decline of gene function discovery for previously uncharacterized genes. If the current status of the literature about yeast molecular mechanisms can be extrapolated into the future, it will take about another 50 years to complete the yeast gene function list. We found that a small group of scientific journals contributed extraordinarily to publishing early reports relevant to yeast gene function discoveries.
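The per-article ratio described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function name `fpe_per_gene`, the input layout (one list of gene-name mentions per article), and the toy mention counts are all hypothetical, and summing the per-article ratios into a per-gene FPE total is an assumption based on the abstract's definition that FPE = 1 corresponds to a publication referring to a single gene.

```python
from collections import Counter, defaultdict


def fpe_per_gene(articles):
    """Sum, per gene, the share of gene-name mentions it holds in each article.

    `articles` is an iterable of lists; each list holds the gene names
    mentioned in one article, with repeats. This layout is hypothetical.
    """
    totals = defaultdict(float)
    for mentions in articles:
        counts = Counter(mentions)
        n_mentions = sum(counts.values())
        if n_mentions == 0:
            continue  # an article without gene names contributes nothing
        for gene, count in counts.items():
            # ratio of this gene's occurrences to all gene-name occurrences
            totals[gene] += count / n_mentions
    return dict(totals)


# Toy data (made up): one article split evenly between two genes,
# one article dedicated to a single gene, one with no gene names.
articles = [
    ["HAP4", "HAP4", "RMD9", "RMD9"],
    ["HAP4"],
    [],
]
fpe = fpe_per_gene(articles)
```

Under this assumed aggregation, an article devoted to a single gene adds exactly 1 to that gene's total, and the totals over all genes add up to the number of articles that mention at least one gene.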
Affiliation(s)
- Erwin Tantoso
  - Agency for Science, Technology and Research (A*STAR), Bioinformatics Institute (BII), 30 Biopolis Street #07-01, Matrix Building, Singapore, 138671, Republic of Singapore
  - Agency for Science, Technology and Research (A*STAR), Genome Institute of Singapore (GIS), 60 Biopolis Street, Singapore, 138672, Republic of Singapore
- Birgit Eisenhaber
  - Agency for Science, Technology and Research (A*STAR), Bioinformatics Institute (BII), 30 Biopolis Street #07-01, Matrix Building, Singapore, 138671, Republic of Singapore
  - Agency for Science, Technology and Research (A*STAR), Genome Institute of Singapore (GIS), 60 Biopolis Street, Singapore, 138672, Republic of Singapore
  - LASA - Lausitz Advanced Scientific Applications gGmbH, Straße Der Einheit 2-24, 02943, Weißwasser, Federal Republic of Germany
- Swati Sinha
  - European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
- Lars Juhl Jensen
  - Novo Nordisk Foundation Center for Protein Research, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen, Denmark
- Frank Eisenhaber
  - Agency for Science, Technology and Research (A*STAR), Bioinformatics Institute (BII), 30 Biopolis Street #07-01, Matrix Building, Singapore, 138671, Republic of Singapore
  - Agency for Science, Technology and Research (A*STAR), Genome Institute of Singapore (GIS), 60 Biopolis Street, Singapore, 138672, Republic of Singapore
  - LASA - Lausitz Advanced Scientific Applications gGmbH, Straße Der Einheit 2-24, 02943, Weißwasser, Federal Republic of Germany
  - School of Biological Sciences, Nanyang Technological University, 60 Nanyang Drive, Singapore, 637551, Republic of Singapore
7
Shimada T, Yoshida H. Overview of the Molecular Mechanism of Bacterial Environmental Adaptation by Comprehensive Analysis. Int J Mol Sci 2023; 24:ijms24087602. [PMID: 37108762 PMCID: PMC10145747 DOI: 10.3390/ijms24087602] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/03/2023] [Revised: 04/17/2023] [Accepted: 04/19/2023] [Indexed: 04/29/2023] [Open Access]
Abstract
So far, the genome sequences of more than tens of thousands of organisms have been determined, and the overall picture of the genes that make up one organism has been clarified [https://www [...].
Affiliation(s)
- Tomohiro Shimada
  - School of Agriculture, Meiji University, Kawasaki 214-8571, Kanagawa, Japan
- Hideji Yoshida
  - Department of Physics, Osaka Medical and Pharmaceutical University, Takatsuki 569-8686, Osaka, Japan