Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Nevers Y, Jones TEM, Jyothi D, Yates B, Ferret M, Portell-Silva L, Codo L, Cosentino S, Marcet-Houben M, Vlasova A, Poidevin L, Kress A, Hickman M, Persson E, Piližota I, Guijarro-Clarke C, Iwasaki W, Lecompte O, Sonnhammer E, Roos DS, Gabaldón T, Thybert D, Thomas PD, Hu Y, Emms DM, Bruford E, Capella-Gutierrez S, Martin MJ, Dessimoz C, Altenhoff A. The Quest for Orthologs orthology benchmark service in 2022. Nucleic Acids Res 2022;50:W623-W632. [PMID: 35552456 PMCID: PMC9252809 DOI: 10.1093/nar/gkac330] [Citation(s) in RCA: 19] [Impact Index Per Article: 9.5] [Received: 02/23/2022] [Revised: 04/07/2022] [Accepted: 04/30/2022] [Indexed: 11/15/2022] Open

Cited by Other Article(s)

Fukunaga T, Ogawa T, Iwasaki W, Sonoike K. Phylogenetic Profiling Analysis of the Phycobilisome Revealed a Novel State-Transition Regulator Gene in Synechocystis sp. PCC 6803. Plant & Cell Physiology 2024;65:1450-1460. [PMID: 39034452 PMCID: PMC11447641 DOI: 10.1093/pcp/pcae083] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/28/2024] [Revised: 07/05/2024] [Accepted: 07/20/2024] [Indexed: 07/23/2024]

Cosentino S, Sriswasdi S, Iwasaki W. SonicParanoid2: fast, accurate, and comprehensive orthology inference with machine learning and language models. Genome Biol 2024;25:195. [PMID: 39054525 PMCID: PMC11270883 DOI: 10.1186/s13059-024-03298-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/11/2023] [Accepted: 06/04/2024] [Indexed: 07/27/2024] Open

Cox RM, Papoulas O, Shril S, Lee C, Gardner T, Battenhouse AM, Lee M, Drew K, McWhite CD, Yang D, Leggere JC, Durand D, Hildebrandt F, Wallingford JB, Marcotte EM. Ancient eukaryotic protein interactions illuminate modern genetic traits and disorders. bioRxiv 2024:2024.05.26.595818. [PMID: 38853926 PMCID: PMC11160598 DOI: 10.1101/2024.05.26.595818] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/11/2024]

Tian X, Teo WFA, Yang Y, Dong L, Wong A, Chen L, Ahmed H, Choo SW, Jakubovics NS, Tan GYA. Genome characterisation and comparative analysis of Schaalia dentiphila sp. nov. and its subspecies, S. dentiphila subsp. denticola subsp. nov., from the human oral cavity. BMC Microbiol 2024;24:185. [PMID: 38802738 PMCID: PMC11131293 DOI: 10.1186/s12866-024-03346-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/17/2024] [Accepted: 05/21/2024] [Indexed: 05/29/2024] Open

Abstract

BACKGROUND

Schaalia species are primarily found among the oral microbiota of humans and other animals. They have been associated with various infections through their involvement in biofilm formation, modulation of host responses, and interaction with other microorganisms. In this study, two strains previously indicated as Actinomyces spp. were found to be novel members of the genus Schaalia based on their whole genome sequences.

RESULTS

Whole-genome sequencing showed that both strains have a genome size of 2.3 Mbp and a GC content of 65.5%. Phylogenetic analysis for taxonomic placement identified strains NCTC 9931 and C24 as distinct species within the genus Schaalia. Overall genome-relatedness indices, including digital DNA-DNA hybridization (dDDH) and average nucleotide/amino acid identity (ANI/AAI), confirmed both strains as distinct species, with values below the species boundary thresholds (dDDH < 70%; ANI and AAI < 95%) when compared to the nearest type strain, Schaalia odontolytica NCTC 9935T. Pangenome and orthology analyses highlighted their differences in gene properties and biological functions compared to existing type strains. Additionally, the identification of genomic islands (GIs) and virulence-associated factors indicated their genetic diversity and potential adaptive capabilities, as well as potential implications for human health. Notably, CRISPR-Cas systems in strain NCTC 9931 underscore its adaptive immune mechanisms compared to strain C24.

CONCLUSIONS

Based on these findings, strain NCTC 9931T (= ATCC 17982T = DSM 43331T = CIP 104728T = CCUG 18309T = NCTC 14978T = CGMCC 1.90328T) represents a novel species, for which the name Schaalia dentiphila subsp. dentiphila sp. nov. subsp. nov. is proposed, while strain C24T (= NCTC 14980T = CGMCC 1.90329T) represents a distinct novel subspecies, for which the name Schaalia dentiphila subsp. denticola subsp. nov. is proposed. This study enriches our understanding of the genomic diversity of Schaalia species and paves the way for further investigations into their roles in oral health.

SIGNIFICANCE

This research reveals two Schaalia strains, NCTC 9931T and C24T, as novel entities with distinct genomic features. Expanding the taxonomic framework of the genus Schaalia, this study offers a critical resource for probing the metabolic intricacies and resistance patterns of these bacteria. This work stands as a cornerstone for microbial taxonomy, paving the way for significant advances in clinical diagnostics.
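The species-boundary test quoted in the RESULTS paragraph (dDDH < 70%, ANI and AAI < 95%) can be written as a simple predicate. The sketch below is illustrative only: the function name, its parameters, and the sample values are hypothetical and are not taken from the study, which does not report per-comparison numbers in this abstract.

```python
def below_species_boundary(dddh: float, ani: float, aai: float,
                           dddh_cutoff: float = 70.0,
                           identity_cutoff: float = 95.0) -> bool:
    """Return True when all three indices fall below the cutoffs.

    The default cutoffs are the ones quoted in the abstract
    (dDDH < 70%, ANI and AAI < 95%). All values are percentages.
    """
    return (dddh < dddh_cutoff
            and ani < identity_cutoff
            and aai < identity_cutoff)


# Hypothetical comparisons against a type strain (made-up values).
print(below_species_boundary(dddh=45.0, ani=91.2, aai=92.8))
print(below_species_boundary(dddh=82.0, ani=97.5, aai=98.1))
```

Requiring all three indices to agree is a conservative choice made for this sketch; a real analysis might weigh the indices differently or report them separately.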

Affiliation(s)

Xuechen Tian Institute of Biological Sciences, Faculty of Science, Universiti Malaya, Kuala Lumpur, 50603, Malaysia; College of Science, Mathematics and Technology, Wenzhou-Kean University, 88 Daxue Road, Ouhai, Wenzhou, Zhejiang Province, 325060, China; Wenzhou Municipal Key Laboratory for Applied Biomedical and Biopharmaceutical Informatics, Wenzhou-Kean University, Ouhai, Wenzhou, Zhejiang Province, 325060, China; Zhejiang Bioinformatics International Science and Technology Cooperation Center, Wenzhou-Kean University, Ouhai, Wenzhou, Zhejiang Province, 325060, China
Wee Fei Aaron Teo Institute of Biological Sciences, Faculty of Science, Universiti Malaya, Kuala Lumpur, 50603, Malaysia
Yixin Yang College of Science, Mathematics and Technology, Wenzhou-Kean University, 88 Daxue Road, Ouhai, Wenzhou, Zhejiang Province, 325060, China; Wenzhou Municipal Key Laboratory for Applied Biomedical and Biopharmaceutical Informatics, Wenzhou-Kean University, Ouhai, Wenzhou, Zhejiang Province, 325060, China; Zhejiang Bioinformatics International Science and Technology Cooperation Center, Wenzhou-Kean University, Ouhai, Wenzhou, Zhejiang Province, 325060, China; Dorothy and George Hennings College of Science, Mathematics and Technology, Kean University, 1000 Morris Ave, Union, NJ, 07083, USA
Linyinxue Dong Wenzhou Municipal Key Laboratory for Applied Biomedical and Biopharmaceutical Informatics, Wenzhou-Kean University, Ouhai, Wenzhou, Zhejiang Province, 325060, China; Zhejiang Bioinformatics International Science and Technology Cooperation Center, Wenzhou-Kean University, Ouhai, Wenzhou, Zhejiang Province, 325060, China
Aloysius Wong College of Science, Mathematics and Technology, Wenzhou-Kean University, 88 Daxue Road, Ouhai, Wenzhou, Zhejiang Province, 325060, China; Wenzhou Municipal Key Laboratory for Applied Biomedical and Biopharmaceutical Informatics, Wenzhou-Kean University, Ouhai, Wenzhou, Zhejiang Province, 325060, China; Zhejiang Bioinformatics International Science and Technology Cooperation Center, Wenzhou-Kean University, Ouhai, Wenzhou, Zhejiang Province, 325060, China; Dorothy and George Hennings College of Science, Mathematics and Technology, Kean University, 1000 Morris Ave, Union, NJ, 07083, USA
Li Chen Institute of Biological Sciences, Faculty of Science, Universiti Malaya, Kuala Lumpur, 50603, Malaysia
Halah Ahmed School of Dental Sciences, Faculty of Medical Sciences, Newcastle University, Framlington Place, Newcastle Upon Tyne, NE2 4BW, UK
Siew Woh Choo College of Science, Mathematics and Technology, Wenzhou-Kean University, 88 Daxue Road, Ouhai, Wenzhou, Zhejiang Province, 325060, China; Wenzhou Municipal Key Laboratory for Applied Biomedical and Biopharmaceutical Informatics, Wenzhou-Kean University, Ouhai, Wenzhou, Zhejiang Province, 325060, China; Zhejiang Bioinformatics International Science and Technology Cooperation Center, Wenzhou-Kean University, Ouhai, Wenzhou, Zhejiang Province, 325060, China; Dorothy and George Hennings College of Science, Mathematics and Technology, Kean University, 1000 Morris Ave, Union, NJ, 07083, USA
Nicholas S Jakubovics School of Dental Sciences, Faculty of Medical Sciences, Newcastle University, Framlington Place, Newcastle Upon Tyne, NE2 4BW, UK
Geok Yuan Annie Tan Institute of Biological Sciences, Faculty of Science, Universiti Malaya, Kuala Lumpur, 50603, Malaysia

Sternberg PW, Van Auken K, Wang Q, Wright A, Yook K, Zarowiecki M, Arnaboldi V, Becerra A, Brown S, Cain S, Chan J, Chen WJ, Cho J, Davis P, Diamantakis S, Dyer S, Grigoriadis D, Grove CA, Harris T, Howe K, Kishore R, Lee R, Longden I, Luypaert M, Müller HM, Nuin P, Quinton-Tulloch M, Raciti D, Schedl T, Schindelman G, Stein L. WormBase 2024: status and transitioning to Alliance infrastructure. Genetics 2024;227:iyae050. [PMID: 38573366 PMCID: PMC11075546 DOI: 10.1093/genetics/iyae050] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/21/2023] [Revised: 03/19/2024] [Accepted: 03/20/2024] [Indexed: 04/05/2024] Open

Affiliation(s)

Paul W Sternberg Division of Biology and Biological Engineering 140-18, California Institute of Technology, Pasadena, CA 91125, USA
Kimberly Van Auken Division of Biology and Biological Engineering 140-18, California Institute of Technology, Pasadena, CA 91125, USA
Qinghua Wang Division of Biology and Biological Engineering 140-18, California Institute of Technology, Pasadena, CA 91125, USA
Adam Wright Informatics and Bio-computing Platform, Ontario Institute for Cancer Research, Toronto, ON M5G0A3, Canada
Karen Yook Division of Biology and Biological Engineering 140-18, California Institute of Technology, Pasadena, CA 91125, USA
Magdalena Zarowiecki European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Cambridge CB10 1SD, UK
Valerio Arnaboldi Division of Biology and Biological Engineering 140-18, California Institute of Technology, Pasadena, CA 91125, USA
Andrés Becerra European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Cambridge CB10 1SD, UK
Stephanie Brown School of Infection and Immunity, University of Glasgow, Glasgow G12 8TA, UK
Scott Cain Informatics and Bio-computing Platform, Ontario Institute for Cancer Research, Toronto, ON M5G0A3, Canada
Juancarlos Chan Division of Biology and Biological Engineering 140-18, California Institute of Technology, Pasadena, CA 91125, USA
Wen J Chen Division of Biology and Biological Engineering 140-18, California Institute of Technology, Pasadena, CA 91125, USA
Jaehyoung Cho Division of Biology and Biological Engineering 140-18, California Institute of Technology, Pasadena, CA 91125, USA
Paul Davis European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Cambridge CB10 1SD, UK
Stavros Diamantakis European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Cambridge CB10 1SD, UK
Sarah Dyer European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Cambridge CB10 1SD, UK
Dionysis Grigoriadis School of Infection and Immunity, University of Glasgow, Glasgow G12 8TA, UK
Christian A Grove Division of Biology and Biological Engineering 140-18, California Institute of Technology, Pasadena, CA 91125, USA
Todd Harris Informatics and Bio-computing Platform, Ontario Institute for Cancer Research, Toronto, ON M5G0A3, Canada
Kevin Howe European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Cambridge CB10 1SD, UK
Ranjana Kishore Division of Biology and Biological Engineering 140-18, California Institute of Technology, Pasadena, CA 91125, USA
Raymond Lee Division of Biology and Biological Engineering 140-18, California Institute of Technology, Pasadena, CA 91125, USA
Ian Longden Informatics and Bio-computing Platform, Ontario Institute for Cancer Research, Toronto, ON M5G0A3, Canada
Manuel Luypaert European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Cambridge CB10 1SD, UK
Hans-Michael Müller Division of Biology and Biological Engineering 140-18, California Institute of Technology, Pasadena, CA 91125, USA
Paulo Nuin Informatics and Bio-computing Platform, Ontario Institute for Cancer Research, Toronto, ON M5G0A3, Canada
Mark Quinton-Tulloch European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Cambridge CB10 1SD, UK
Daniela Raciti Division of Biology and Biological Engineering 140-18, California Institute of Technology, Pasadena, CA 91125, USA
Tim Schedl Department of Genetics, Washington University School of Medicine, St. Louis, MO 63110, USA
Gary Schindelman Division of Biology and Biological Engineering 140-18, California Institute of Technology, Pasadena, CA 91125, USA
Lincoln Stein Informatics and Bio-computing Platform, Ontario Institute for Cancer Research, Toronto, ON M5G0A3, Canada

Aleksander SA, Anagnostopoulos AV, Antonazzo G, Arnaboldi V, Attrill H, Becerra A, Bello SM, Blodgett O, Bradford YM, Bult CJ, Cain S, Calvi BR, Carbon S, Chan J, Chen WJ, Cherry JM, Cho J, Crosby MA, De Pons JL, D’Eustachio P, Diamantakis S, Dolan ME, dos Santos G, Dyer S, Ebert D, Engel SR, Fashena D, Fisher M, Foley S, Gibson AC, Gollapally VR, Gramates LS, Grove CA, Hale P, Harris T, Hayman GT, Hu Y, James-Zorn C, Karimi K, Karra K, Kishore R, Kwitek AE, Laulederkind SJF, Lee R, Longden I, Luypaert M, Markarian N, Marygold SJ, Matthews B, McAndrews MS, Millburn G, Miyasato S, Motenko H, Moxon S, Muller HM, Mungall CJ, Muruganujan A, Mushayahama T, Nash RS, Nuin P, Paddock H, Pells T, Perrimon N, Pich C, Quinton-Tulloch M, Raciti D, Ramachandran S, Richardson JE, Gelbart SR, Ruzicka L, Schindelman G, Shaw DR, Sherlock G, Shrivatsav A, Singer A, Smith CM, Smith CL, Smith JR, Stein L, Sternberg PW, Tabone CJ, Thomas PD, Thorat K, Thota J, Tomczuk M, Trovisco V, Tutaj MA, Urbano JM, Van Auken K, Van Slyke CE, Vize PD, Wang Q, Weng S, Westerfield M, Wilming LG, Wong ED, Wright A, Yook K, Zhou P, Zorn A, Zytkovicz M. Updates to the Alliance of Genome Resources central infrastructure. Genetics 2024;227:iyae049. [PMID: 38552170 PMCID: PMC11075569 DOI: 10.1093/genetics/iyae049] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/20/2023] [Revised: 02/28/2024] [Accepted: 02/29/2024] [Indexed: 04/09/2024] Open

Affiliation(s)

The Alliance of Genome Resources Consortium
Suzanne A Aleksander Department of Genetics, Stanford University, Stanford, CA 94305
Anna V Anagnostopoulos The Jackson Laboratory for Mammalian Genomics, Bar Harbor, ME 04609, USA
Giulia Antonazzo Department of Physiology, Development and Neuroscience, University of Cambridge, Downing Street, Cambridge CB2 3DY, UK
Valerio Arnaboldi Division of Biology and Biological Engineering 140-18, California Institute of Technology, Pasadena, CA 91125, USA
Helen Attrill Department of Physiology, Development and Neuroscience, University of Cambridge, Downing Street, Cambridge CB2 3DY, UK
Andrés Becerra European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
Susan M Bello The Jackson Laboratory for Mammalian Genomics, Bar Harbor, ME 04609, USA
Olin Blodgett The Jackson Laboratory for Mammalian Genomics, Bar Harbor, ME 04609, USA
Yvonne M Bradford Institute of Neuroscience, University of Oregon, Eugene, OR 97403
Carol J Bult The Jackson Laboratory for Mammalian Genomics, Bar Harbor, ME 04609, USA
Scott Cain Informatics and Bio-computing Platform, Ontario Institute for Cancer Research, Toronto, ON M5G0A3, Canada
Brian R Calvi Department of Biology, Indiana University, Bloomington, IN 47408, USA
Seth Carbon Environmental Genomics and Systems Biology, Lawrence Berkeley National Laboratory, Berkeley, CA
Juancarlos Chan Division of Biology and Biological Engineering 140-18, California Institute of Technology, Pasadena, CA 91125, USA
Wen J Chen Division of Biology and Biological Engineering 140-18, California Institute of Technology, Pasadena, CA 91125, USA
J Michael Cherry Department of Genetics, Stanford University, Stanford, CA 94305
Jaehyoung Cho Division of Biology and Biological Engineering 140-18, California Institute of Technology, Pasadena, CA 91125, USA
Madeline A Crosby The Biological Laboratories, Harvard University, 16 Divinity Avenue, Cambridge, MA 02138, USA
Jeffrey L De Pons Medical College of Wisconsin—Rat Genome Database, Departments of Physiology and Biomedical Engineering, Medical College of Wisconsin, Milwaukee, WI 53226, USA
Peter D’Eustachio NYU Grossman School of Medicine, New York, NY 10016
Stavros Diamantakis European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
Mary E Dolan The Jackson Laboratory for Mammalian Genomics, Bar Harbor, ME 04609, USA
Gilberto dos Santos The Biological Laboratories, Harvard University, 16 Divinity Avenue, Cambridge, MA 02138, USA
Sarah Dyer European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
Dustin Ebert Department of Population and Public Health Sciences, University of Southern California, Los Angeles, CA 90033, USA
Stacia R Engel Department of Genetics, Stanford University, Stanford, CA 94305
David Fashena Institute of Neuroscience, University of Oregon, Eugene, OR 97403
Malcolm Fisher Division of Developmental Biology, Cincinnati Children's Hospital Medical Center, 3333 Burnet Ave, Cincinnati, OH 45229, USA
Saoirse Foley Department of Biological Sciences, Carnegie Mellon University, 5000 Forbes Ave, Pittsburgh, PA 15203
Adam C Gibson Medical College of Wisconsin—Rat Genome Database, Departments of Physiology and Biomedical Engineering, Medical College of Wisconsin, Milwaukee, WI 53226, USA
Varun R Gollapally Medical College of Wisconsin—Rat Genome Database, Departments of Physiology and Biomedical Engineering, Medical College of Wisconsin, Milwaukee, WI 53226, USA
L Sian Gramates The Biological Laboratories, Harvard University, 16 Divinity Avenue, Cambridge, MA 02138, USA
Christian A Grove Division of Biology and Biological Engineering 140-18, California Institute of Technology, Pasadena, CA 91125, USA
Paul Hale The Jackson Laboratory for Mammalian Genomics, Bar Harbor, ME 04609, USA
Todd Harris Informatics and Bio-computing Platform, Ontario Institute for Cancer Research, Toronto, ON M5G0A3, Canada
G Thomas Hayman Medical College of Wisconsin—Rat Genome Database, Departments of Physiology and Biomedical Engineering, Medical College of Wisconsin, Milwaukee, WI 53226, USA
Yanhui Hu Department of Genetics, Howard Hughes Medical Institute, Harvard Medical School, 77 Avenue Louis Pasteur, Boston, MA 02115, USA
Christina James-Zorn Division of Developmental Biology, Cincinnati Children's Hospital Medical Center, 3333 Burnet Ave, Cincinnati, OH 45229, USA
Kamran Karimi Department of Biological Sciences, University of Calgary, 507 Campus Dr NW, Calgary, AB T2N 4V8, Canada
Kalpana Karra Department of Genetics, Stanford University, Stanford, CA 94305
Ranjana Kishore Division of Biology and Biological Engineering 140-18, California Institute of Technology, Pasadena, CA 91125, USA
Anne E Kwitek Medical College of Wisconsin—Rat Genome Database, Departments of Physiology and Biomedical Engineering, Medical College of Wisconsin, Milwaukee, WI 53226, USA
Stanley J F Laulederkind Medical College of Wisconsin—Rat Genome Database, Departments of Physiology and Biomedical Engineering, Medical College of Wisconsin, Milwaukee, WI 53226, USA
Raymond Lee Division of Biology and Biological Engineering 140-18, California Institute of Technology, Pasadena, CA 91125, USA
Ian Longden The Biological Laboratories, Harvard University, 16 Divinity Avenue, Cambridge, MA 02138, USA
Manuel Luypaert European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
Nicholas Markarian Division of Biology and Biological Engineering 140-18, California Institute of Technology, Pasadena, CA 91125, USA
Steven J Marygold Department of Physiology, Development and Neuroscience, University of Cambridge, Downing Street, Cambridge CB2 3DY, UK
Beverley Matthews The Biological Laboratories, Harvard University, 16 Divinity Avenue, Cambridge, MA 02138, USA
Monica S McAndrews The Jackson Laboratory for Mammalian Genomics, Bar Harbor, ME 04609, USA
Gillian Millburn Department of Physiology, Development and Neuroscience, University of Cambridge, Downing Street, Cambridge CB2 3DY, UK
Stuart Miyasato Department of Genetics, Stanford University, Stanford, CA 94305
Howie Motenko The Jackson Laboratory for Mammalian Genomics, Bar Harbor, ME 04609, USA
Sierra Moxon Environmental Genomics and Systems Biology, Lawrence Berkeley National Laboratory, Berkeley, CA
Hans-Michael Muller Division of Biology and Biological Engineering 140-18, California Institute of Technology, Pasadena, CA 91125, USA
Christopher J Mungall Environmental Genomics and Systems Biology, Lawrence Berkeley National Laboratory, Berkeley, CA
Anushya Muruganujan Department of Population and Public Health Sciences, University of Southern California, Los Angeles, CA 90033, USA
Tremayne Mushayahama Department of Population and Public Health Sciences, University of Southern California, Los Angeles, CA 90033, USA
Robert S Nash Department of Genetics, Stanford University, Stanford, CA 94305
Paulo Nuin Informatics and Bio-computing Platform, Ontario Institute for Cancer Research, Toronto, ON M5G0A3, Canada
Holly Paddock Institute of Neuroscience, University of Oregon, Eugene, OR 97403
Troy Pells Department of Biological Sciences, University of Calgary, 507 Campus Dr NW, Calgary, AB T2N 4V8, Canada
Norbert Perrimon Department of Genetics, Howard Hughes Medical Institute, Harvard Medical School, 77 Avenue Louis Pasteur, Boston, MA 02115, USA
Christian Pich Institute of Neuroscience, University of Oregon, Eugene, OR 97403
Mark Quinton-Tulloch European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
Daniela Raciti Division of Biology and Biological Engineering 140-18, California Institute of Technology, Pasadena, CA 91125, USA
Sridhar Ramachandran Institute of Neuroscience, University of Oregon, Eugene, OR 97403
Joel E Richardson Institute of Neuroscience, University of Oregon, Eugene, OR 97403
Susan Russo Gelbart The Biological Laboratories, Harvard University, 16 Divinity Avenue, Cambridge, MA 02138, USA
Leyla Ruzicka Institute of Neuroscience, University of Oregon, Eugene, OR 97403
Gary Schindelman Division of Biology and Biological Engineering 140-18, California Institute of Technology, Pasadena, CA 91125, USA
David R Shaw The Jackson Laboratory for Mammalian Genomics, Bar Harbor, ME 04609, USA
Gavin Sherlock Department of Genetics, Stanford University, Stanford, CA 94305
Ajay Shrivatsav Department of Genetics, Stanford University, Stanford, CA 94305
Amy Singer Institute of Neuroscience, University of Oregon, Eugene, OR 97403
Constance M Smith The Jackson Laboratory for Mammalian Genomics, Bar Harbor, ME 04609, USA
Cynthia L Smith The Jackson Laboratory for Mammalian Genomics, Bar Harbor, ME 04609, USA
Jennifer R Smith Medical College of Wisconsin—Rat Genome Database, Departments of Physiology and Biomedical Engineering, Medical College of Wisconsin, Milwaukee, WI 53226, USA
Lincoln Stein Informatics and Bio-computing Platform, Ontario Institute for Cancer Research, Toronto, ON M5G0A3, Canada
Paul W Sternberg Division of Biology and Biological Engineering 140-18, California Institute of Technology, Pasadena, CA 91125, USA
Christopher J Tabone The Biological Laboratories, Harvard University, 16 Divinity Avenue, Cambridge, MA 02138, USA
Paul D Thomas Department of Population and Public Health Sciences, University of Southern California, Los Angeles, CA 90033, USA
Ketaki Thorat Medical College of Wisconsin—Rat Genome Database, Departments of Physiology and Biomedical Engineering, Medical College of Wisconsin, Milwaukee, WI 53226, USA
Jyothi Thota Medical College of Wisconsin—Rat Genome Database, Departments of Physiology and Biomedical Engineering, Medical College of Wisconsin, Milwaukee, WI 53226, USA
Monika Tomczuk The Jackson Laboratory for Mammalian Genomics, Bar Harbor, ME 04609, USA
Vitor Trovisco Department of Physiology, Development and Neuroscience, University of Cambridge, Downing Street, Cambridge CB2 3DY, UK
Marek A Tutaj Medical College of Wisconsin—Rat Genome Database, Departments of Physiology and Biomedical Engineering, Medical College of Wisconsin, Milwaukee, WI 53226, USA
Jose-Maria Urbano Department of Physiology, Development and Neuroscience, University of Cambridge, Downing Street, Cambridge CB2 3DY, UK
Kimberly Van Auken Division of Biology and Biological Engineering 140-18, California Institute of Technology, Pasadena, CA 91125, USA
Ceri E Van Slyke Institute of Neuroscience, University of Oregon, Eugene, OR 97403
Peter D Vize Department of Biological Sciences, University of Calgary, 507 Campus Dr NW, Calgary, AB T2N 4V8, Canada
Qinghua Wang Division of Biology and Biological Engineering 140-18, California Institute of Technology, Pasadena, CA 91125, USA
Shuai Weng Department of Genetics, Stanford University, Stanford, CA 94305
Monte Westerfield Institute of Neuroscience, University of Oregon, Eugene, OR 97403
Laurens G Wilming The Jackson Laboratory for Mammalian Genomics, Bar Harbor, ME 04609, USA
Edith D Wong Department of Genetics, Stanford University, Stanford, CA 94305
Adam Wright Informatics and Bio-computing Platform, Ontario Institute for Cancer Research, Toronto, ON M5G0A3, Canada
Karen Yook Division of Biology and Biological Engineering 140-18, California Institute of Technology, Pasadena, CA 91125, USA
Pinglei Zhou The Biological Laboratories, Harvard University, 16 Divinity Avenue, Cambridge, MA 02138, USA
Aaron Zorn Division of Developmental Biology, Cincinnati Children's Hospital Medical Center, 3333 Burnet Ave, Cincinnati, OH 45229, USA
Mark Zytkovicz The Biological Laboratories, Harvard University, 16 Divinity Avenue, Cambridge, MA 02138, USA

Brooks TG, Lahens NF, Mrčela A, Grant GR. Challenges and best practices in omics benchmarking. Nat Rev Genet 2024;25:326-339. [PMID: 38216661 DOI: 10.1038/s41576-023-00679-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 11/14/2023] [Indexed: 01/14/2024]

Ludwig J, Mrázek J. OrthoRefine: automated enhancement of prior ortholog identification via synteny. BMC Bioinformatics 2024;25:163. [PMID: 38664637 PMCID: PMC11044567 DOI: 10.1186/s12859-024-05786-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/27/2023] [Accepted: 04/15/2024] [Indexed: 04/29/2024] Open

Abstract

BACKGROUND

Identifying orthologs continues to be an early and imperative step in genome analysis but remains a challenging problem. While synteny (conservation of gene order) has previously been used independently and in combination with other methods to identify orthologs, applying synteny in ortholog identification has yet to be automated in a user-friendly manner. This desire for automation and ease-of-use led us to develop OrthoRefine, a standalone program that uses synteny to refine ortholog identification.

RESULTS

We developed OrthoRefine to improve the detection of orthologous genes by implementing a look-around window approach to detect synteny. We tested OrthoRefine in tandem with OrthoFinder, one of the most widely used programs for ortholog identification in recent years. We evaluated the improvements provided by OrthoRefine on several bacterial datasets and one eukaryotic dataset. OrthoRefine efficiently eliminates paralogs from orthologous groups detected by OrthoFinder. Using synteny increased specificity and functional ortholog identification; additionally, analysis of BLAST e-value, phylogenetics, and operon occurrence further supported using synteny for ortholog identification. A comparison of several window sizes suggested that smaller window sizes (eight genes) were generally the most suitable for identifying orthologs via synteny. However, larger windows (30 genes) performed better in datasets containing less closely related genomes. A typical run of OrthoRefine with ~10 bacterial genomes can be completed in a few minutes on a regular desktop PC.

CONCLUSION

OrthoRefine is a simple-to-use, standalone tool that automates the application of synteny to improve ortholog detection. OrthoRefine is particularly efficient in eliminating paralogs from orthologous groups delineated by standard methods.
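To make the look-around window idea concrete, here is a minimal sketch of one way a window-based synteny check could work. It is not OrthoRefine's implementation: the function names, the scoring rule (counting orthogroups shared between the two neighborhoods), the reading of "window" as half the genes on each side, and the toy data are all assumptions made for illustration.

```python
from typing import Dict, List


def neighbors(genome: List[str], index: int, window: int) -> List[str]:
    """Return up to window // 2 genes on each side of genome[index]."""
    half = window // 2
    start = max(0, index - half)
    end = min(len(genome), index + half + 1)
    return [g for i, g in enumerate(genome[start:end], start) if i != index]


def synteny_support(genome_a: List[str], genome_b: List[str],
                    gene_a: str, gene_b: str,
                    orthogroup: Dict[str, str], window: int = 8) -> int:
    """Count orthogroups present in both genes' neighborhoods.

    Hypothetical scoring: a higher count means the two genes sit in
    more similar gene-order contexts.
    """
    near_a = neighbors(genome_a, genome_a.index(gene_a), window)
    near_b = neighbors(genome_b, genome_b.index(gene_b), window)
    groups_a = {orthogroup[g] for g in near_a if g in orthogroup}
    groups_b = {orthogroup[g] for g in near_b if g in orthogroup}
    return len(groups_a & groups_b)


# Toy data: b3 and b9 are both in OG3 with a3, but only b3 shares a3's context.
genome_a = ["a1", "a2", "a3", "a4", "a5"]
genome_b = ["b1", "b2", "b3", "b4", "b5", "x1", "x2", "b9", "x3", "x4"]
orthogroup = {"a1": "OG1", "b1": "OG1", "a2": "OG2", "b2": "OG2",
              "a3": "OG3", "b3": "OG3", "b9": "OG3",
              "a4": "OG4", "b4": "OG4", "a5": "OG5", "b5": "OG5"}

support_b3 = synteny_support(genome_a, genome_b, "a3", "b3", orthogroup, window=4)
support_b9 = synteny_support(genome_a, genome_b, "a3", "b9", orthogroup, window=4)
print(support_b3, support_b9)  # prints "4 0"
```

In this toy case the candidate with no shared neighbors (b9) would be the one flagged as a likely paralog. A real tool would also need a support threshold and handling for contig boundaries or circular chromosomes, which this sketch leaves out.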

Roder T, Pimentel G, Fuchsmann P, Stern MT, von Ah U, Vergères G, Peischl S, Brynildsrud O, Bruggmann R, Bär C. Scoary2: rapid association of phenotypic multi-omics data with microbial pan-genomes. Genome Biol 2024;25:93. [PMID: 38605417 PMCID: PMC11007987 DOI: 10.1186/s13059-024-03233-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/27/2023] [Accepted: 03/29/2024] [Indexed: 04/13/2024] Open

Thiébaut A, Altenhoff AM, Campli G, Glover N, Dessimoz C, Waterhouse RM. DrosOMA: the Drosophila Orthologous Matrix browser. F1000Res 2024;12:936. [PMID: 38434623 PMCID: PMC10905159 DOI: 10.12688/f1000research.135250.2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 01/12/2024] [Indexed: 03/05/2024] Open

Abstract

Background

Comparative genomic analyses to delineate gene evolutionary histories inform the understanding of organismal biology by characterising gene and gene family origins, trajectories, and dynamics, as well as enabling the tracing of speciation, duplication, and loss events, and facilitating the transfer of gene functional information across species. Genomic data are available for an increasing number of species from the genus Drosophila; however, a dedicated resource exploiting these data to provide the research community with browsable results from genus-wide orthology delineation has been lacking.

Methods

Using the OMA Orthologous Matrix orthology inference approach and browser deployment framework, we catalogued orthologues across a selected set of Drosophila species with high-quality annotated genomes. We developed and deployed a dedicated instance of the OMA browser to facilitate intuitive exploration, visualisation, and downloading of the genus-wide orthology delineation results.

Results

DrosOMA - the Drosophila Orthologous Matrix browser, accessible from https://drosoma.dcsr.unil.ch/ - presents the results of orthology delineation for 36 drosophilids from across the genus and four outgroup dipterans. It enables querying and browsing of the orthology data through a feature-rich web interface, with gene-view, orthologous group-view, and genome-view pages, including comprehensive gene name and identifier cross-references together with available functional annotations and protein domain architectures, as well as tools to visualise local and global synteny conservation.

Conclusions

The DrosOMA browser demonstrates the deployability of the OMA browser framework for building user-friendly orthology databases with dense sampling of a selected taxonomic group. It provides the Drosophila research community with a tailored resource of browsable results from genus-wide orthology delineation.

Carhuaricra-Huaman D, Setubal JC. Protein-Coding Gene Families in Prokaryote Genome Comparisons. Methods Mol Biol 2024;2802:33-55. [PMID: 38819555 DOI: 10.1007/978-1-0716-3838-5_2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 06/01/2024]

Singleton M, Eisen M. Leveraging genomic redundancy to improve inference and alignment of orthologous proteins. G3 (BETHESDA, MD.) 2023;13:jkad222. [PMID: 37770067 PMCID: PMC10700111 DOI: 10.1093/g3journal/jkad222] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 07/28/2023] [Revised: 09/11/2023] [Accepted: 09/19/2023] [Indexed: 10/03/2023]

Nestor BJ, Bayer PE, Fernandez CGT, Edwards D, Finnegan PM. Approaches to increase the validity of gene family identification using manual homology search tools. Genetica 2023;151:325-338. [PMID: 37817002 PMCID: PMC10692271 DOI: 10.1007/s10709-023-00196-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/07/2023] [Accepted: 10/01/2023] [Indexed: 10/12/2023]

Jin Z, Sato Y, Kawashima M, Kanehisa M. KEGG tools for classification and analysis of viral proteins. Protein Sci 2023;32:e4820. [PMID: 37881892 PMCID: PMC10661063 DOI: 10.1002/pro.4820] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/31/2023] [Revised: 10/19/2023] [Accepted: 10/21/2023] [Indexed: 10/27/2023]

Bult CJ, Sternberg PW. The alliance of genome resources: transforming comparative genomics. Mamm Genome 2023;34:531-544. [PMID: 37666946 PMCID: PMC10628019 DOI: 10.1007/s00335-023-10015-2] [Citation(s) in RCA: 17] [Impact Index Per Article: 17.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/16/2023] [Accepted: 08/11/2023] [Indexed: 09/06/2023]

Aleksander SA, Anagnostopoulos AV, Antonazzo G, Arnaboldi V, Attrill H, Becerra A, Bello SM, Blodgett O, Bradford YM, Bult CJ, Cain S, Calvi BR, Carbon S, Chan J, Chen WJ, Michael Cherry J, Cho J, Crosby MA, De Pons JL, D’Eustachio P, Diamantakis S, Dolan ME, Santos GD, Dyer S, Ebert D, Engel SR, Fashena D, Fisher M, Foley S, Gibson AC, Gollapally VR, Sian Gramates L, Grove CA, Hale P, Harris T, Thomas Hayman G, Hu Y, James-Zorn C, Karimi K, Karra K, Kishore R, Kwitek AE, Laulederkind SJF, Lee R, Longden I, Luypaert M, Markarian N, Marygold SJ, Matthews B, McAndrews MS, Millburn G, Miyasato S, Motenko H, Moxon S, Muller HM, Mungall CJ, Muruganujan A, Mushayahama T, Nash RS, Nuin P, Paddock H, Pells T, Perrimon N, Pich C, Quinton-Tulloch M, Raciti D, Ramachandran S, Richardson JE, Gelbart SR, Ruzicka L, Schindelman G, Shaw DR, Sherlock G, Shrivatsav A, Singer A, Smith CM, Smith CL, Smith JR, Stein L, Sternberg PW, Tabone CJ, Thomas PD, Thorat K, Thota J, Tomczuk M, Trovisco V, Tutaj MA, Urbano JM, Auken KV, Van Slyke CE, Vize PD, Wang Q, Weng S, Westerfield M, Wilming LG, Wong ED, Wright A, Yook K, Zhou P, Zorn A, Zytkovicz M. Updates to the Alliance of Genome Resources Central Infrastructure Alliance of Genome Resources Consortium. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2023:2023.11.20.567935. [PMID: 38045425 PMCID: PMC10690154 DOI: 10.1101/2023.11.20.567935] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 12/05/2023]

Contreras-Moreira B, Saraf S, Naamati G, Casas AM, Amberkar SS, Flicek P, Jones AR, Dyer S. GET_PANGENES: calling pangenes from plant genome alignments confirms presence-absence variation. Genome Biol 2023;24:223. [PMID: 37798615 PMCID: PMC10552430 DOI: 10.1186/s13059-023-03071-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/07/2023] [Accepted: 09/21/2023] [Indexed: 10/07/2023] Open

Pajkos M, Erdős G, Dosztányi Z. The Origin of Discrepancies between Predictions and Annotations in Intrinsically Disordered Proteins. Biomolecules 2023;13:1442. [PMID: 37892124 PMCID: PMC10604070 DOI: 10.3390/biom13101442] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/08/2023] [Revised: 09/05/2023] [Accepted: 09/20/2023] [Indexed: 10/29/2023] Open

Chodkowski M, Zielezinski A, Anbalagan S. A ligand-receptor interactome atlas of the zebrafish. iScience 2023;26:107309. [PMID: 37539027 PMCID: PMC10393773 DOI: 10.1016/j.isci.2023.107309] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/24/2023] [Revised: 05/25/2023] [Accepted: 07/04/2023] [Indexed: 08/05/2023] Open

Lyubetsky VA, Rubanov LI, Tereshina MB, Ivanova AS, Araslanova KR, Uroshlev LA, Goremykina GI, Yang JR, Kanovei VG, Zverkov OA, Shitikov AD, Korotkova DD, Zaraisky AG. Wide-scale identification of novel/eliminated genes responsible for evolutionary transformations. Biol Direct 2023;18:45. [PMID: 37568147 PMCID: PMC10416458 DOI: 10.1186/s13062-023-00405-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/02/2023] [Accepted: 08/07/2023] [Indexed: 08/13/2023] Open

Abstract

BACKGROUND

It is generally accepted that most evolutionary transformations at the phenotype level are associated either with rearrangements of genomic regulatory elements, which control the activity of gene networks, or with changes in the amino acid contents of proteins. Recently, evidence has accumulated that significant evolutionary transformations could also be associated with the loss/emergence of whole genes. The targeted identification of such genes is a challenging problem for both bioinformatics and evo-devo research.

RESULTS

To solve this problem, we propose the WINEGRET method, named after the first letters of the title. Its main idea is to search for genes that satisfy two requirements: first, the genes were lost or emerged at the same evolutionary stage at which the phenotypic trait of interest was lost or emerged, and second, the expression of these genes changes significantly during the development of the trait of interest in the model organism. To verify the first requirement, we do not use existing databases of orthologs, but rely purely on gene homology and local synteny by using some novel quickly computable conditions. Genes satisfying the second requirement are found by deep RNA sequencing. As a proof of principle, we used our method to find genes absent in extant amniotes (reptiles, birds, mammals) but present in anamniotes (fish and amphibians), in which these genes are involved in the regeneration of large body appendages. As a result, 57 genes were identified. For three of them, c-c motif chemokine 4, eotaxin-like, and a previously unknown gene called here sod4, essential roles for tail regeneration were demonstrated. Notably, we established that the latter gene belongs to a novel family of Cu/Zn-superoxide dismutases lost by amniotes, SOD4.

CONCLUSIONS

We present a method for targeted identification of genes whose loss/emergence in evolution could be associated with the loss/emergence of a phenotypic trait of interest. In a proof-of-principle study, we identified genes absent in amniotes that participate in body appendage regeneration in anamniotes. Our method provides a wide range of opportunities for studying the relationship between the loss/emergence of phenotypic traits and the loss/emergence of specific genes in evolution.
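
The two requirements stated in this abstract amount to intersecting a phylogenetic presence/absence filter with an expression filter. The following Python sketch illustrates that intersection only. It is not the WINEGRET implementation: the abstract says presence is established from homology and local synteny, whereas here presence calls are simply taken as input, and the function name, the "present in at least one trait-bearing species" rule, the fold-change threshold, and the toy data are all hypothetical.

```python
from typing import Dict, List, Set


def candidate_genes(presence: Dict[str, Set[str]],
                    with_trait: Set[str],
                    without_trait: Set[str],
                    log2_fold_change: Dict[str, float],
                    min_abs_lfc: float = 1.0) -> List[str]:
    """Return genes that pass both (assumed) filters.

    Filter 1: present in at least one species that has the trait and
              absent from every species that lacks it.
    Filter 2: absolute log2 fold change during trait development is at
              least `min_abs_lfc` (an illustrative threshold).
    """
    hits = []
    for gene, species in presence.items():
        matches_trait = bool(species & with_trait) and not (species & without_trait)
        changes = abs(log2_fold_change.get(gene, 0.0)) >= min_abs_lfc
        if matches_trait and changes:
            hits.append(gene)
    return sorted(hits)


# Toy input with made-up gene names and values.
presence = {
    "gene_a": {"frog", "zebrafish"},   # only in trait-bearing species
    "gene_b": {"frog", "mouse"},       # also present where the trait is absent
    "gene_c": {"zebrafish"},           # right pattern, but expression is flat
}
with_trait = {"frog", "zebrafish"}
without_trait = {"mouse", "chicken"}
log2_fold_change = {"gene_a": 3.2, "gene_b": 2.5, "gene_c": 0.1}

selected = candidate_genes(presence, with_trait, without_trait, log2_fold_change)
```

Only `gene_a` survives both filters in this toy input: `gene_b` fails the presence/absence pattern and `gene_c` fails the expression threshold.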

Affiliation(s)

Vassily A Lyubetsky: Institute for Information Transmission Problems of the Russian Academy of Sciences (Kharkevich Institute), 19 Build. 1, Bolshoy Karetny per., Moscow, Russia, 127051; Department of Mechanics and Mathematics, Lomonosov Moscow State University, Kolmogorova Str., 1, Moscow, Russia, 119234
Lev I Rubanov: Institute for Information Transmission Problems of the Russian Academy of Sciences (Kharkevich Institute), 19 Build. 1, Bolshoy Karetny per., Moscow, Russia, 127051
Maria B Tereshina: Shemyakin-Ovchinnikov Institute of Bioorganic Chemistry, Russian Academy of Sciences, 16/10, Miklukho-Maklaya Str., Moscow, Russia, 117997; Pirogov Russian National Research Medical University, Moscow, Russia
Anastasiya S Ivanova: Shemyakin-Ovchinnikov Institute of Bioorganic Chemistry, Russian Academy of Sciences, 16/10, Miklukho-Maklaya Str., Moscow, Russia, 117997; Department of Molecular Medicine, The Scripps Research Institute, La Jolla, USA
Karina R Araslanova: Shemyakin-Ovchinnikov Institute of Bioorganic Chemistry, Russian Academy of Sciences, 16/10, Miklukho-Maklaya Str., Moscow, Russia, 117997
Leonid A Uroshlev: Engelhardt Institute of Molecular Biology, Russian Academy of Sciences, 32, Vavilova Str., Moscow, Russia, 119991
Galina I Goremykina: Plekhanov Russian University of Economics, Stremyanny Lane 36, Moscow, Russia
Jian-Rong Yang: Advanced Medical Technology Center, The First Affiliated Hospital, Zhongshan School of Medicine, Sun Yat-sen University, Guangzhou, 510080, China; Department of Genetics and Biomedical Informatics, Zhongshan School of Medicine, Sun Yat-sen University, Guangzhou, 510080, China
Vladimir G Kanovei: Institute for Information Transmission Problems of the Russian Academy of Sciences (Kharkevich Institute), 19 Build. 1, Bolshoy Karetny per., Moscow, Russia, 127051
Oleg A Zverkov: Institute for Information Transmission Problems of the Russian Academy of Sciences (Kharkevich Institute), 19 Build. 1, Bolshoy Karetny per., Moscow, Russia, 127051
Alexander D Shitikov: Shemyakin-Ovchinnikov Institute of Bioorganic Chemistry, Russian Academy of Sciences, 16/10, Miklukho-Maklaya Str., Moscow, Russia, 117997
Daria D Korotkova: Shemyakin-Ovchinnikov Institute of Bioorganic Chemistry, Russian Academy of Sciences, 16/10, Miklukho-Maklaya Str., Moscow, Russia, 117997; Global Health Institute, School of Life Sciences, EPFL, Lausanne, Switzerland
Andrey G Zaraisky: Shemyakin-Ovchinnikov Institute of Bioorganic Chemistry, Russian Academy of Sciences, 16/10, Miklukho-Maklaya Str., Moscow, Russia, 117997; Pirogov Russian National Research Medical University, Moscow, Russia

Langschied F, Leisegang MS, Brandes RP, Ebersberger I. ncOrtho: efficient and reliable identification of miRNA orthologs. Nucleic Acids Res 2023;51:e71. [PMID: 37260093 PMCID: PMC10359484 DOI: 10.1093/nar/gkad467] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/06/2022] [Revised: 05/04/2023] [Accepted: 05/30/2023] [Indexed: 06/02/2023] Open

Moi D, Dessimoz C. Phylogenetic profiling in eukaryotes comes of age. Proc Natl Acad Sci U S A 2023;120:e2305013120. [PMID: 37126713 PMCID: PMC10175774 DOI: 10.1073/pnas.2305013120] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 05/03/2023] Open

Sun J, Lu F, Luo Y, Bie L, Xu L, Wang Y. OrthoVenn3: an integrated platform for exploring and visualizing orthologous data across genomes. Nucleic Acids Res 2023:7146343. [PMID: 37114999 DOI: 10.1093/nar/gkad313] [Citation(s) in RCA: 81] [Impact Index Per Article: 81.0] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/08/2023] [Revised: 04/07/2023] [Accepted: 04/13/2023] [Indexed: 04/29/2023] Open

Persson E, Sonnhammer ELL. InParanoiDB 9: Ortholog Groups for Protein Domains and Full-Length Proteins. J Mol Biol 2023:168001. [PMID: 36764355 DOI: 10.1016/j.jmb.2023.168001] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/30/2022] [Revised: 01/20/2023] [Accepted: 02/01/2023] [Indexed: 02/11/2023]

Kaur H, Lynn AM. Mapping the FtsQBL divisome components in bacterial NTD pathogens as potential drug targets. Front Genet 2023;13:1010870. [PMID: 36685953 PMCID: PMC9846249 DOI: 10.3389/fgene.2022.1010870] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/03/2022] [Accepted: 12/05/2022] [Indexed: 01/05/2023] Open

Abstract

Cytokinesis is an essential process in bacterial cell division, and it involves more than 25 essential/non-essential cell division proteins that form a protein complex known as a divisome. Central to the divisome are the proteins FtsB and FtsL binding to FtsQ to form a complex FtsQBL, which helps link the early proteins with late proteins. The FtsQBL complex is highly conserved as a component across bacteria. Pathogens like Vibrio cholerae, Mycobacterium ulcerans, Mycobacterium leprae, and Chlamydia trachomatis are the causative agents of the bacterial Neglected Tropical Diseases Cholera, Buruli ulcer, Leprosy, and Trachoma, respectively, some of which seemingly lack known homologs for some of the FtsQBL complex proteins. In the absence of experimental characterization, either due to insufficient resources or the massive increase in novel sequences generated from genomics, functional annotation is traditionally inferred by sequence similarity to a known homolog. With the advent of accurate protein structure prediction methods, features both at the fold level and at the protein interaction level can be used to identify orthologs that cannot be unambiguously identified using sequence similarity methods. Using the FtsQBL complex proteins as a case study, we report potential remote homologs using Profile Hidden Markov models and structures predicted using AlphaFold. Predicted ortholog structures show conformational similarity with corresponding E. coli proteins irrespective of their level of sequence similarity. AlphaFold-Multimer was used to characterize remote homologs as FtsB or FtsL, when they were not sufficiently distinguishable at either the sequence or the structure level, as their interactions with FtsQ and FtsW play a crucial role in their function. The structures were then analyzed to identify functionally critical regions of the proteins consistent with their homologs and delineate regions potentially useful for inhibitor discovery.

Liu X, Shen Q, Zhang S. Cross-species cell-type assignment from single-cell RNA-seq data by a heterogeneous graph neural network. Genome Res 2023;33:96-111. [PMID: 36526433 PMCID: PMC9977153 DOI: 10.1101/gr.276868.122] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/27/2022] [Accepted: 12/09/2022] [Indexed: 12/23/2022]

Kress A, Poch O, Lecompte O, Thompson JD. Real or fake? Measuring the impact of protein annotation errors on estimates of domain gain and loss events. FRONTIERS IN BIOINFORMATICS 2023;3:1178926. [PMID: 37151482 PMCID: PMC10158824 DOI: 10.3389/fbinf.2023.1178926] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/03/2023] [Accepted: 04/05/2023] [Indexed: 05/09/2023] Open

Abstract

Protein annotation errors can have significant consequences in a wide range of fields, ranging from protein structure and function prediction to biomedical research, drug discovery, and biotechnology. By comparing the domains of different proteins, scientists can identify common domains, classify proteins based on their domain architecture, and highlight proteins that have evolved differently in one or more species or clades. However, genome-wide identification of different protein domain architectures involves a complex error-prone pipeline that includes genome sequencing, prediction of gene exon/intron structures, and inference of protein sequences and domain annotations. Here we developed an automated fact-checking approach to distinguish true domain loss/gain events from false events caused by errors that occur during the annotation process. Using genome-wide ortholog sets and taking advantage of the high-quality human and Saccharomyces cerevisiae genome annotations, we analyzed the domain gain and loss events in the predicted proteomes of 9 non-human primates (NHP) and 20 non-S. cerevisiae fungi (NSF) as annotated in the UniProt and InterPro databases. Our approach allowed us to quantify the impact of errors on estimates of protein domain gains and losses, and we show that domain losses are over-estimated ten-fold and three-fold in the NHP and NSF proteins respectively. This is in line with previous studies of gene-level losses, where issues with genome sequencing or gene annotation led to genes being falsely inferred as absent. In addition, we show that inconsistent protein domain annotations are a major factor contributing to the false events. For the first time, to our knowledge, we show that domain gains are also over-estimated by three-fold and two-fold respectively in NHP and NSF proteins. Based on our more accurate estimates, we infer that true domain losses and gains in NHP with respect to humans are observed at similar rates, while domain gains in the more divergent NSF are observed twice as frequently as domain losses with respect to S. cerevisiae. This study highlights the need to critically examine the scientific validity of protein annotations, and represents a significant step toward scalable computational fact-checking methods that may one day mitigate the propagation of wrong information in protein databases.

Duan G, Wu G, Chen X, Tian D, Li Z, Sun Y, Du Z, Hao L, Song S, Gao Y, Xiao J, Zhang Z, Bao Y, Tang B, Zhao W. HGD: an integrated homologous gene database across multiple species. Nucleic Acids Res 2022;51:D994-D1002. [PMID: 36318261 PMCID: PMC9825607 DOI: 10.1093/nar/gkac970] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/15/2022] [Revised: 09/28/2022] [Accepted: 10/17/2022] [Indexed: 11/06/2022] Open

Affiliation(s)

Guangya Duan
Gangao Wu
Xiaoning Chen: National Genomics Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences and China National Center for Bioinformation, Beijing 100101, China; University of Chinese Academy of Sciences, Beijing 100049, China
Dongmei Tian: National Genomics Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences and China National Center for Bioinformation, Beijing 100101, China; CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences and China National Center for Bioinformation, Beijing 100101, China
Zhaohua Li: National Genomics Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences and China National Center for Bioinformation, Beijing 100101, China; University of Chinese Academy of Sciences, Beijing 100049, China
Yanling Sun: National Genomics Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences and China National Center for Bioinformation, Beijing 100101, China; CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences and China National Center for Bioinformation, Beijing 100101, China
Zhenglin Du: National Genomics Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences and China National Center for Bioinformation, Beijing 100101, China; CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences and China National Center for Bioinformation, Beijing 100101, China
Lili Hao: National Genomics Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences and China National Center for Bioinformation, Beijing 100101, China; CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences and China National Center for Bioinformation, Beijing 100101, China
Shuhui Song: National Genomics Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences and China National Center for Bioinformation, Beijing 100101, China; CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences and China National Center for Bioinformation, Beijing 100101, China; University of Chinese Academy of Sciences, Beijing 100049, China
Yuan Gao: National Genomics Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences and China National Center for Bioinformation, Beijing 100101, China; CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences and China National Center for Bioinformation, Beijing 100101, China; University of Chinese Academy of Sciences, Beijing 100049, China
Jingfa Xiao: National Genomics Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences and China National Center for Bioinformation, Beijing 100101, China; CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences and China National Center for Bioinformation, Beijing 100101, China; University of Chinese Academy of Sciences, Beijing 100049, China
Zhang Zhang: National Genomics Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences and China National Center for Bioinformation, Beijing 100101, China; CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences and China National Center for Bioinformation, Beijing 100101, China; University of Chinese Academy of Sciences, Beijing 100049, China
Yiming Bao: National Genomics Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences and China National Center for Bioinformation, Beijing 100101, China; CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences and China National Center for Bioinformation, Beijing 100101, China; University of Chinese Academy of Sciences, Beijing 100049, China
Bixia Tang Correspondence may also be addressed to Bixia Tang.
Wenming Zhao To whom correspondence should be addressed. Tel: +86 1084097636; Fax: +86 1084097720;
