Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Arendsee Z, Li J, Singh U, Seetharam A, Dorman K, Wurtele ES. phylostratr: a framework for phylostratigraphy. Bioinformatics 2020;35:3617-3627. [PMID: 30873536 DOI: 10.1093/bioinformatics/btz171] [Citation(s) in RCA: 23] [Impact Index Per Article: 5.8] [Received: 11/01/2018] [Revised: 02/27/2019] [Accepted: 03/13/2019] [Indexed: 12/20/2022] Open

Cited by Other Article(s)

Reyes-Herrera PH, Delgadillo-Duran DA, Flores-Gonzalez M, Mueller LA, Cristancho MA, Barrero LS. Chromosome-scale genome assembly and annotation of the tetraploid potato cultivar Diacol Capiro adapted to the Andean region. G3 (Bethesda) 2024:jkae139. [PMID: 39058924 DOI: 10.1093/g3journal/jkae139] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/26/2024] [Accepted: 06/05/2024] [Indexed: 07/28/2024]

Abstract

Potato (Solanum tuberosum) is an essential crop for food security and is ranked as the third most important crop worldwide for human consumption. The Diacol Capiro cultivar holds the dominant position in Colombian cultivation, primarily catering to the food processing industry. This highly heterozygous, autotetraploid cultivar belongs to the Andigenum group and stands out for its adaptation to a wide variety of environments spanning altitudes from 1,800 to 3,200 meters above sea level. Here, a chromosome-scale assembly, referred to as DC, is presented for this cultivar. The assembly was generated by combining circular consensus sequencing with proximity ligation Hi-C for the scaffolding and represents 2.369 Gb with 48 pseudochromosomes covering 2.091 Gb and an anchor rate of 88.26%. The reference genome metrics, including an N50 of 50.5 Mb, a BUSCO (Benchmarking Universal Single-Copy Orthologue) score of 99.38%, and a Long Terminal Repeat Assembly Index score of 13.53, collectively signal the achieved high assembly quality. A comprehensive annotation yielded a total of 154,114 genes, and the associated BUSCO score of 95.78% for the annotated sequences attests to their completeness. The number of predicted NLR (Nucleotide-Binding and Leucine-Rich-Repeat) genes was 2,107, with a large representation of genes containing NBARC (nucleotide-binding domain shared by Apaf-1, certain R gene products, and CED-4) domains (99.85%). Further comparative analysis of the proposed annotation-based assembly with high-quality known potato genomes showed similar genome metrics, with differences in total gene numbers related to the ploidy status. The genome assembly and annotation of DC presented in this study represent a valuable asset for comprehending potato genetics. This resource aids in targeted breeding initiatives and contributes to the creation of enhanced, resilient, and more productive potato varieties, particularly beneficial for countries in Latin America.

Hayford RK, Haley OC, Cannon EK, Portwood JL, Gardiner JM, Andorf CM, Woodhouse MR. Functional annotation and meta-analysis of maize transcriptomes reveal genes involved in biotic and abiotic stress. BMC Genomics 2024;25:533. [PMID: 38816789 PMCID: PMC11137889 DOI: 10.1186/s12864-024-10443-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/30/2023] [Accepted: 05/22/2024] [Indexed: 06/01/2024] Open

Abstract

BACKGROUND

Environmental stress factors, such as biotic and abiotic stress, are becoming more common due to climate variability, significantly affecting global maize yield. Transcriptome profiling studies provide insights into the molecular mechanisms underlying stress response in maize, though the functions of many genes are still unknown. To enhance the functional annotation of maize-specific genes, MaizeGDB has outlined a data-driven approach with an emphasis on identifying genes and traits related to biotic and abiotic stress.

RESULTS

We mapped high-quality RNA-Seq expression reads from 24 different publicly available datasets (17 abiotic and seven biotic studies) generated from the B73 cultivar to the recent version of the reference genome B73 (B73v5) and deduced stress-related functional annotation of maize gene models. We conducted a robust meta-analysis of the transcriptome profiles from the datasets to identify maize loci responsive to stress, identifying 3,230 differentially expressed genes (DEGs): 2,555 DEGs regulated in response to abiotic stress, 408 DEGs regulated during biotic stress, and 267 common DEGs (co-DEGs) that overlap between abiotic and biotic stress. We discovered hub genes from network analyses, and among the hub genes of the co-DEGs we identified a putative NAC domain transcription factor superfamily protein (Zm00001eb369060) IDP275, which previously responded to herbivory and drought stress. IDP275 was up-regulated in our analysis in response to eight different abiotic and four different biotic stresses. A gene set enrichment and pathway analysis of hub genes of the co-DEGs revealed hormone-mediated signaling processes and phenylpropanoid biosynthesis pathways, respectively. Using phylostratigraphic analysis, we also demonstrated how abiotic and biotic stress genes differentially evolve to adapt to changing environments.

CONCLUSIONS

These results will help facilitate the functional annotation of multiple stress response gene models and annotation in maize. Data can be accessed and downloaded at the Maize Genetics and Genomics Database (MaizeGDB).

Sen S, Woodhouse MR, Portwood JL, Andorf CM. Maize Feature Store: A centralized resource to manage and analyze curated maize multi-omics features for machine learning applications. Database (Oxford) 2023;2023:baad078. [PMID: 37935586 PMCID: PMC10634621 DOI: 10.1093/database/baad078] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/27/2022] [Revised: 09/16/2023] [Accepted: 10/19/2023] [Indexed: 11/09/2023]

Barrera-Redondo J, Lotharukpong JS, Drost HG, Coelho SM. Uncovering gene-family founder events during major evolutionary transitions in animals, plants and fungi using GenEra. Genome Biol 2023;24:54. [PMID: 36964572 PMCID: PMC10037820 DOI: 10.1186/s13059-023-02895-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/26/2022] [Accepted: 03/10/2023] [Indexed: 03/26/2023] Open

Nesterenko M, Miroliubov A. From head to rootlet: comparative transcriptomic analysis of a rhizocephalan barnacle Peltogaster reticulata (Crustacea: Rhizocephala). F1000Res 2023;11:583. [PMID: 36447930 PMCID: PMC9664023 DOI: 10.12688/f1000research.110492.2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 01/04/2023] [Indexed: 01/11/2023] Open

Abstract

Background: Rhizocephalan barnacles stand out in the diverse world of metazoan parasites. The body of a rhizocephalan female is modified beyond revealing any recognizable morphological features, consisting of the interna, a system of rootlets, and the externa, a sac-like reproductive body. Moreover, rhizocephalans have an outstanding ability to control their hosts, literally turning them into "zombies". Despite all these amazing traits, there are no genomic or transcriptomic data about any Rhizocephala. Methods: We collected transcriptomes from four body parts of an adult female rhizocephalan Peltogaster reticulata: the externa, and the main, growing, and thoracic parts of the interna. We used all prepared data for the de novo assembly of the reference transcriptome. Next, a set of encoded proteins was determined, the expression levels of protein-coding genes in different parts of the parasite's body were calculated and lists of enriched bioprocesses were identified. We also in silico identified and analyzed sets of potential excretory / secretory proteins. Finally, we applied phylostratigraphy and evolutionary transcriptomics approaches to our data. Results: The assembled reference transcriptome included transcripts of 12,620 protein-coding genes and was the first for any rhizocephalan. Based on the results obtained, the spatial heterogeneity of protein-coding gene expression in different regions of the adult female body of P. reticulata was established. The results of both transcriptomic analysis and histological studies indicated the presence of germ-like cells in the lumen of the interna. The potential molecular basis of the interaction between the nervous system of the host and the parasite's interna was also determined. Given the prolonged expression of development-associated genes, we suggest that rhizocephalans "got stuck in their metamorphosis", even at the reproductive stage. 
Conclusions: The results of the first comparative transcriptomic analysis for Rhizocephala not only clarified but also expanded the existing ideas about the biology of these extraordinary parasites.

Ma S, Skarica M, Li Q, Xu C, Risgaard RD, Tebbenkamp AT, Mato-Blanco X, Kovner R, Krsnik Ž, de Martin X, Luria V, Martí-Pérez X, Liang D, Karger A, Schmidt DK, Gomez-Sanchez Z, Qi C, Gobeske KT, Pochareddy S, Debnath A, Hottman CJ, Spurrier J, Teo L, Boghdadi AG, Homman-Ludiye J, Ely JJ, Daadi EW, Mi D, Daadi M, Marín O, Hof PR, Rasin MR, Bourne J, Sherwood CC, Santpere G, Girgenti MJ, Strittmatter SM, Sousa AM, Sestan N. Molecular and cellular evolution of the primate dorsolateral prefrontal cortex. Science 2022;377:eabo7257. [PMID: 36007006 PMCID: PMC9614553 DOI: 10.1126/science.abo7257] [Citation(s) in RCA: 69] [Impact Index Per Article: 34.5] [Indexed: 12/12/2022]

Affiliation(s)

Shaojie Ma Department of Neuroscience, Yale School of Medicine, New Haven, CT 06510, USA
Mario Skarica Department of Neuroscience, Yale School of Medicine, New Haven, CT 06510, USA
Qian Li Department of Neuroscience, Yale School of Medicine, New Haven, CT 06510, USA
Chuan Xu Department of Neuroscience, Yale School of Medicine, New Haven, CT 06510, USA
Ryan D. Risgaard Waisman Center, School of Medicine and Public Health, University of Wisconsin-Madison, Madison, WI 53705, USA; Medical Scientist Training Program, School of Medicine and Public Health, University of Wisconsin-Madison, Madison, WI 53705, USA
Andrew T.N. Tebbenkamp Department of Neuroscience, Yale School of Medicine, New Haven, CT 06510, USA
Xoel Mato-Blanco Neurogenomics Group, Research Programme on Biomedical Informatics (GRIB), Hospital del Mar Medical Research Institute (IMIM), MELIS, Universitat Pompeu Fabra, 08003 Barcelona, Catalonia, Spain
Rothem Kovner Department of Neuroscience, Yale School of Medicine, New Haven, CT 06510, USA
Željka Krsnik Department of Neuroscience, Yale School of Medicine, New Haven, CT 06510, USA; Croatian Institute for Brain Research, School of Medicine, University of Zagreb, 10000 Zagreb, Croatia
Xabier de Martin Neurogenomics Group, Research Programme on Biomedical Informatics (GRIB), Hospital del Mar Medical Research Institute (IMIM), MELIS, Universitat Pompeu Fabra, 08003 Barcelona, Catalonia, Spain
Victor Luria Department of Neuroscience, Yale School of Medicine, New Haven, CT 06510, USA
Xavier Martí-Pérez Neurogenomics Group, Research Programme on Biomedical Informatics (GRIB), Hospital del Mar Medical Research Institute (IMIM), MELIS, Universitat Pompeu Fabra, 08003 Barcelona, Catalonia, Spain
Dan Liang Department of Neuroscience, Yale School of Medicine, New Haven, CT 06510, USA
Amir Karger IT-Research Computing, Harvard Medical School, Boston, MA, USA
Danielle K. Schmidt Waisman Center, School of Medicine and Public Health, University of Wisconsin-Madison, Madison, WI 53705, USA
Zachary Gomez-Sanchez Waisman Center, School of Medicine and Public Health, University of Wisconsin-Madison, Madison, WI 53705, USA
Cai Qi Department of Neuroscience, Yale School of Medicine, New Haven, CT 06510, USA
Kevin T. Gobeske Division of Neurocritical Care and Emergency Neurology, Department of Neurology, Yale School of Medicine, New Haven, CT 06510, USA
Sirisha Pochareddy Department of Neuroscience, Yale School of Medicine, New Haven, CT 06510, USA
Ashwin Debnath Waisman Center, School of Medicine and Public Health, University of Wisconsin-Madison, Madison, WI 53705, USA
Cade J. Hottman Waisman Center, School of Medicine and Public Health, University of Wisconsin-Madison, Madison, WI 53705, USA
Joshua Spurrier Program in Cellular Neuroscience, Neurodegeneration and Repair, Department of Neurology, Yale School of Medicine, New Haven, CT 06536, USA
Leon Teo Australian Regenerative Medicine Institute, 15 Innovation Walk, Monash University, Clayton VIC, 3800, Australia
Anthony G. Boghdadi Australian Regenerative Medicine Institute, 15 Innovation Walk, Monash University, Clayton VIC, 3800, Australia
Jihane Homman-Ludiye Australian Regenerative Medicine Institute, 15 Innovation Walk, Monash University, Clayton VIC, 3800, Australia
John J. Ely MAEBIOS, Alamogordo, NM 88310, USA; Department of Anthropology and Center for the Advanced Study of Human Paleobiology, The George Washington University, Washington, DC, USA
Etienne W. Daadi Southwest National Primate Research Center, Texas Biomedical Research Institute, San Antonio, TX, USA
Da Mi Tsinghua-Peking Center for Life Sciences, IDG/McGovern Institute for Brain Research, School of Life Sciences, Tsinghua University, Beijing 100084, China
Marcel Daadi Southwest National Primate Research Center, Texas Biomedical Research Institute, San Antonio, TX, USA; Department of Cell Systems & Anatomy, Radiology, Long School of Medicine, UT Health San Antonio; NeoNeuron LLC, Palo Alto, CA 94306, USA
Oscar Marín Centre for Developmental Neurobiology, Institute of Psychiatry, Psychology and Neuroscience, King’s College London, London SE1 1UL, UK; MRC Centre for Neurodevelopmental Disorders, King’s College London, London SE1 1UL, UK
Patrick R. Hof Nash Family Department of Neuroscience and Friedman Brain Institute, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
Mladen-Roko Rasin Department of Neuroscience and Cell Biology, Robert Wood Johnson Medical School, Rutgers University, Piscataway, NJ 08854, USA
James Bourne Australian Regenerative Medicine Institute, 15 Innovation Walk, Monash University, Clayton VIC, 3800, Australia
Chet C. Sherwood Department of Anthropology and Center for the Advanced Study of Human Paleobiology, The George Washington University, Washington, DC, USA
Gabriel Santpere Department of Neuroscience, Yale School of Medicine, New Haven, CT 06510, USA; Neurogenomics Group, Research Programme on Biomedical Informatics (GRIB), Hospital del Mar Medical Research Institute (IMIM), MELIS, Universitat Pompeu Fabra, 08003 Barcelona, Catalonia, Spain
Matthew J. Girgenti Department of Psychiatry, Yale School of Medicine, New Haven, CT 06510, USA; National Center for PTSD, US Department of Veterans Affairs, White River Junction, VT, USA
Stephen M. Strittmatter Department of Neuroscience, Yale School of Medicine, New Haven, CT 06510, USA; Program in Cellular Neuroscience, Neurodegeneration and Repair, Department of Neurology, Yale School of Medicine, New Haven, CT 06536, USA; Kavli Institute for Neuroscience, Yale School of Medicine, New Haven, CT 06510, USA
André M.M. Sousa Waisman Center, School of Medicine and Public Health, University of Wisconsin-Madison, Madison, WI 53705, USA; Department of Neuroscience, School of Medicine and Public Health, University of Wisconsin-Madison, Madison, WI 53705, USA
Nenad Sestan Department of Neuroscience, Yale School of Medicine, New Haven, CT 06510, USA; Department of Psychiatry, Yale School of Medicine, New Haven, CT 06510, USA; Kavli Institute for Neuroscience, Yale School of Medicine, New Haven, CT 06510, USA; Departments of Genetics and Comparative Medicine, Program in Cellular Neuroscience, Neurodegeneration and Repair, and Yale Child Study Center, Yale School of Medicine, New Haven, CT 06510, USA

Raxwal VK, Singh S, Agarwal M, Riha K. Transcriptional and post-transcriptional regulation of young genes in plants. BMC Biol 2022;20:134. [PMID: 35676681 PMCID: PMC9178820 DOI: 10.1186/s12915-022-01339-7] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 04/29/2022] [Accepted: 05/30/2022] [Indexed: 12/03/2022] Open

Nesterenko M, Miroliubov A. From head to rootlet: comparative transcriptomic analysis of a rhizocephalan barnacle Peltogaster reticulata (Crustacea: Rhizocephala). F1000Res 2022;11:583. [PMID: 36447930 PMCID: PMC9664023 DOI: 10.12688/f1000research.110492.1] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Accepted: 01/04/2023] [Indexed: 09/16/2023] Open

Li J, Singh U, Bhandary P, Campbell J, Arendsee Z, Seetharam AS, Wurtele ES. Foster thy young: enhanced prediction of orphan genes in assembled genomes. Nucleic Acids Res 2021;50:e37. [PMID: 34928390 PMCID: PMC9023268 DOI: 10.1093/nar/gkab1238] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 08/27/2021] [Revised: 10/22/2021] [Accepted: 12/02/2021] [Indexed: 02/06/2023] Open

Li F, Rane RV, Luria V, Xiong Z, Chen J, Li Z, Catullo RA, Griffin PC, Schiffer M, Pearce S, Lee SF, McElroy K, Stocker A, Shirriffs J, Cockerell F, Coppin C, Sgrò CM, Karger A, Cain JW, Weber JA, Santpere G, Kirschner MW, Hoffmann AA, Oakeshott JG, Zhang G. Phylogenomic analyses of the genus Drosophila reveals genomic signals of climate adaptation. Mol Ecol Resour 2021;22:1559-1581. [PMID: 34839580 PMCID: PMC9299920 DOI: 10.1111/1755-0998.13561] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 10/09/2020] [Accepted: 11/10/2021] [Indexed: 01/13/2023]

Abstract

Many Drosophila species differ widely in their distributions and climate niches, making them excellent subjects for evolutionary genomic studies. Here, we have developed a database of high‐quality assemblies for 46 Drosophila species and one closely related Zaprionus. Fifteen of the genomes were newly sequenced, and 20 were improved with additional sequencing. New or improved annotations were generated for all 47 species, assisted by new transcriptomes for 19. Phylogenomic analyses of these data resolved several previously ambiguous relationships, especially in the melanogaster species group. However, it also revealed significant phylogenetic incongruence among genes, mainly in the form of incomplete lineage sorting in the subgenus Sophophora but also including asymmetric introgression in the subgenus Drosophila. Using the phylogeny as a framework and taking into account these incongruences, we then screened the data for genome‐wide signals of adaptation to different climatic niches. First, phylostratigraphy revealed relatively high rates of recent novel gene gain in three temperate pseudoobscura and five desert‐adapted cactophilic mulleri subgroup species. Second, we found differing ratios of nonsynonymous to synonymous substitutions in several hundred orthologues between climate generalists and specialists, with trends for significantly higher ratios for those in tropical and lower ratios for those in temperate‐continental specialists respectively than those in the climate generalists. Finally, resequencing natural populations of 13 species revealed tropics‐restricted species generally had smaller population sizes, lower genome diversity and more deleterious mutations than the more widespread species. We conclude that adaptation to different climates in the genus Drosophila has been associated with large‐scale and multifaceted genomic changes.

Affiliation(s)

Fang Li BGI-Shenzhen, Shenzhen, China; Section for Ecology and Evolution, Department of Biology, University of Copenhagen, Copenhagen, Denmark
Rahul V Rane Commonwealth Scientific and Industrial Research Organisation, Acton, ACT, Australia; Bio21 Institute, School of BioSciences, University of Melbourne, Parkville, Vic., Australia
Victor Luria Department of Systems Biology, Harvard Medical School, Boston, Massachusetts, USA
Zijun Xiong BGI-Shenzhen, Shenzhen, China; State Key Laboratory of Genetic Resources and Evolution, Kunming Institute of Zoology, Chinese Academy of Sciences (CAS), Kunming, Yunnan, China; College of Life Sciences, University of Chinese Academy of Sciences, Beijing, China
Jiawei Chen BGI-Shenzhen, Shenzhen, China
Zimai Li BGI-Shenzhen, Shenzhen, China
Renee A Catullo Commonwealth Scientific and Industrial Research Organisation, Acton, ACT, Australia; Division of Ecology and Evolution, Centre for Biodiversity Analysis, The Australian National University, Acton, ACT, Australia
Philippa C Griffin Bio21 Institute, School of BioSciences, University of Melbourne, Parkville, Vic., Australia
Michele Schiffer Bio21 Institute, School of BioSciences, University of Melbourne, Parkville, Vic., Australia; Daintree Rainforest Observatory, James Cook University, Cape Tribulation, Qld, Australia
Stephen Pearce Commonwealth Scientific and Industrial Research Organisation, Acton, ACT, Australia
Siu Fai Lee Commonwealth Scientific and Industrial Research Organisation, Acton, ACT, Australia; Applied BioSciences, Macquarie University, North Ryde, NSW, Australia
Kerensa McElroy Commonwealth Scientific and Industrial Research Organisation, Acton, ACT, Australia
Ann Stocker Bio21 Institute, School of BioSciences, University of Melbourne, Parkville, Vic., Australia
Jennifer Shirriffs Bio21 Institute, School of BioSciences, University of Melbourne, Parkville, Vic., Australia
Fiona Cockerell School of Biological Sciences, Monash University, Clayton, Vic., Australia
Chris Coppin Commonwealth Scientific and Industrial Research Organisation, Acton, ACT, Australia
Carla M Sgrò School of Biological Sciences, Monash University, Clayton, Vic., Australia
Amir Karger IT - Research Computing, Harvard Medical School, Boston, Massachusetts, USA
John W Cain Department of Mathematics, Harvard University, Cambridge, Massachusetts, USA
Jessica A Weber Department of Genetics, Harvard Medical School, Boston, Massachusetts, USA
Gabriel Santpere Neurogenomics Group, Research Programme on Biomedical Informatics (GRIB), Department of Experimental and Health Sciences (DCEXS), Hospital del Mar Medical Research Institute (IMIM), Universitat Pompeu Fabra, Barcelona, Catalonia, Spain
Marc W Kirschner Department of Systems Biology, Harvard Medical School, Boston, Massachusetts, USA
Ary A Hoffmann Bio21 Institute, School of BioSciences, University of Melbourne, Parkville, Vic., Australia
John G Oakeshott Commonwealth Scientific and Industrial Research Organisation, Acton, ACT, Australia; Applied BioSciences, Macquarie University, North Ryde, NSW, Australia
Guojie Zhang BGI-Shenzhen, Shenzhen, China; Section for Ecology and Evolution, Department of Biology, University of Copenhagen, Copenhagen, Denmark; State Key Laboratory of Genetic Resources and Evolution, Kunming Institute of Zoology, Chinese Academy of Sciences (CAS), Kunming, Yunnan, China; Center for Excellence in Animal Evolution and Genetics, Chinese Academy of Sciences, Kunming, China

Stein WD, Hoshen MB. During evolution from the earliest tetrapoda, newly-recruited genes are increasingly paralogues of existing genes and distribute non-randomly among the chromosomes. BMC Genomics 2021;22:794. [PMID: 34736418 PMCID: PMC8570013 DOI: 10.1186/s12864-021-08066-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/06/2020] [Accepted: 09/28/2021] [Indexed: 11/10/2022] Open

Seetharam AS, Yu Y, Bélanger S, Clark LG, Meyers BC, Kellogg EA, Hufford MB. The Streptochaeta Genome and the Evolution of the Grasses. Front Plant Sci 2021;12:710383. [PMID: 34671369 PMCID: PMC8521107 DOI: 10.3389/fpls.2021.710383] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 05/16/2021] [Accepted: 09/08/2021] [Indexed: 05/15/2023]

Li J, Singh U, Arendsee Z, Wurtele ES. Landscape of the Dark Transcriptome Revealed Through Re-mining Massive RNA-Seq Data. Front Genet 2021;12:722981. [PMID: 34484307 PMCID: PMC8415361 DOI: 10.3389/fgene.2021.722981] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 06/09/2021] [Accepted: 07/26/2021] [Indexed: 12/13/2022] Open

Banerjee S, Bhandary P, Woodhouse M, Sen TZ, Wise RP, Andorf CM. FINDER: an automated software package to annotate eukaryotic genes from RNA-Seq data and associated protein sequences. BMC Bioinformatics 2021;22:205. [PMID: 33879057 PMCID: PMC8056616 DOI: 10.1186/s12859-021-04120-9] [Citation(s) in RCA: 14] [Impact Index Per Article: 4.7] [Received: 02/05/2021] [Accepted: 04/07/2021] [Indexed: 12/23/2022] Open

Abstract

BACKGROUND

Gene annotation in eukaryotes is a non-trivial task that requires meticulous analysis of accumulated transcript data. Challenges include transcriptionally active regions of the genome that contain overlapping genes, genes that produce numerous transcripts, transposable elements and numerous diverse sequence repeats. Currently available gene annotation software applications depend on pre-constructed full-length gene sequence assemblies which are not guaranteed to be error-free. The origins of these sequences are often uncertain, making it difficult to identify and rectify errors in them. This hinders the creation of an accurate and holistic representation of the transcriptomic landscape across multiple tissue types and experimental conditions. Therefore, to gauge the extent of diversity in gene structures, a comprehensive analysis of genome-wide expression data is imperative.

RESULTS

We present FINDER, a fully automated computational tool that optimizes the entire process of annotating genes and transcript structures. Unlike current state-of-the-art pipelines, FINDER automates the RNA-Seq pre-processing step by working directly with raw sequence reads and optimizes gene prediction from BRAKER2 by supplementing these reads with associated proteins. The FINDER pipeline (1) reports transcripts and recognizes genes that are expressed under specific conditions, (2) generates all possible alternatively spliced transcripts from expressed RNA-Seq data, (3) analyzes read coverage patterns to modify existing transcript models and create new ones, and (4) scores genes as high- or low-confidence based on the available evidence across multiple datasets. We demonstrate the ability of FINDER to automatically annotate a diverse pool of genomes from eight species.

CONCLUSIONS

FINDER takes a completely automated approach to annotate genes directly from raw expression data. It is capable of processing eukaryotic genomes of all sizes and requires no manual supervision, which makes it ideal for bench researchers with limited experience in handling computational tools.

Sociality sculpts similar patterns of molecular evolution in two independently evolved lineages of eusocial bees. Commun Biol 2021;4:253. [PMID: 33637860 PMCID: PMC7977082 DOI: 10.1038/s42003-021-01770-6] [Citation(s) in RCA: 15] [Impact Index Per Article: 5.0] [Received: 09/09/2020] [Accepted: 01/28/2021] [Indexed: 12/19/2022] Open

Singh U, Hur M, Dorman K, Wurtele ES. MetaOmGraph: a workbench for interactive exploratory data analysis of large expression datasets. Nucleic Acids Res 2020;48:e23. [PMID: 31956905 PMCID: PMC7039010 DOI: 10.1093/nar/gkz1209] [Citation(s) in RCA: 16] [Impact Index Per Article: 4.0] [Received: 10/28/2019] [Revised: 12/05/2019] [Accepted: 12/17/2019] [Indexed: 12/17/2022] Open

Singh U, Syrkin Wurtele E. How new genes are born. eLife 2020;9:e55136. [PMID: 32072921 PMCID: PMC7030788 DOI: 10.7554/elife.55136] [Citation(s) in RCA: 14] [Impact Index Per Article: 3.5] [Received: 02/17/2020] [Accepted: 02/17/2020] [Indexed: 12/17/2022] Open

Leiboff S, Hake S. Reconstructing the Transcriptional Ontogeny of Maize and Sorghum Supports an Inverse Hourglass Model of Inflorescence Development. Curr Biol 2019;29:3410-3419.e3. [PMID: 31587998 DOI: 10.1016/j.cub.2019.08.044] [Citation(s) in RCA: 23] [Impact Index Per Article: 4.6] [Received: 05/03/2019] [Revised: 06/29/2019] [Accepted: 08/19/2019] [Indexed: 12/31/2022]

Arendsee Z, Li J, Singh U, Bhandary P, Seetharam A, Wurtele ES. fagin: synteny-based phylostratigraphy and finer classification of young genes. BMC Bioinformatics 2019;20:440. [PMID: 31455236 PMCID: PMC6712868 DOI: 10.1186/s12859-019-3023-y] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Received: 03/14/2019] [Accepted: 08/08/2019] [Indexed: 12/30/2022] Open

Abstract

BACKGROUND

With every new genome that is sequenced, thousands of species-specific genes (orphans) are found, some originating from ultra-rapid mutations of existing genes, many others originating de novo from non-genic regions of the genome. If some of these genes survive across speciations, then extant organisms will contain a patchwork of genes whose ancestors first appeared at different times. Standard phylostratigraphy, the technique of partitioning genes by their age, is based solely on protein similarity algorithms. However, this approach relies on negative evidence: a failure to detect a homolog of a query gene. An alternative approach is to limit the search for homologs to syntenic regions. Then, genes can be positively identified as de novo orphans by tracing them to non-coding sequences in related species.

RESULTS

We have developed fagin, a synteny-based pipeline in the R framework. It determines the genomic context of each query gene in a focal species compared to homologous sequence in target species. We tested the fagin pipeline on two focal species, Arabidopsis thaliana (plus four target species in Brassicaceae) and Saccharomyces cerevisiae (plus six target species in Saccharomyces). Using microsynteny maps, fagin classified the homology relationship of each query gene against each target genome into three main classes (and further subclasses): AAic (has a coding syntenic homolog), NTic (has a non-coding syntenic homolog), and Unknown (has no detected syntenic homolog). fagin inferred that over half of the "Unknown" A. thaliana query genes, and about 20% of those for S. cerevisiae, lack a syntenic homolog because of local indels or scrambled synteny.
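
The three main classes named above can be illustrated with a minimal sketch. This is not fagin's code (fagin is an R pipeline, and its actual decision rules and subclasses are more detailed); it is a Python illustration under stated assumptions. The `SyntenicEvidence` record, its two fields, and the `classify` function are hypothetical names introduced only for this example, and the order in which the checks are applied (coding match first, then non-coding match) is an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SyntenicEvidence:
    """Hypothetical summary of the target interval syntenic to one query gene."""
    coding_match: bool      # a coding (protein-level) match lies in the interval
    noncoding_match: bool   # a non-coding (nucleotide-level) match lies in the interval


def classify(evidence: Optional[SyntenicEvidence]) -> str:
    """Assign one query gene, against one target genome, to a main class.

    Assumption: coding evidence takes priority over non-coding evidence.
    `None` stands for "no syntenic interval could be found at all".
    """
    if evidence is None:
        return "Unknown"
    if evidence.coding_match:
        return "AAic"   # has a coding syntenic homolog
    if evidence.noncoding_match:
        return "NTic"   # has a non-coding syntenic homolog
    return "Unknown"    # no detected syntenic homolog


# One query gene compared against three hypothetical target genomes.
per_target = {
    "target_1": SyntenicEvidence(coding_match=True, noncoding_match=True),
    "target_2": SyntenicEvidence(coding_match=False, noncoding_match=True),
    "target_3": None,
}
labels = {name: classify(ev) for name, ev in per_target.items()}
```

In this toy run the query gene would be labeled AAic, NTic, and Unknown against the three targets respectively. The sketch stops at the main classes and does not attempt to separate "Unknown" cases into those caused by local indels or scrambled synteny.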

CONCLUSIONS

fagin augments standard phylostratigraphy, and extends synteny-based phylostratigraphy with an automated, customizable, and detailed contextual analysis. By comparing synteny-based phylostrata to standard phylostrata, fagin systematically identifies those orphans and lineage-specific genes that are well supported as having originated de novo. Analyzing within-species genomes should distinguish orphan genes that may have originated through rapid divergence from de novo orphans. fagin also delineates whether a gene lacks a syntenic homolog for technical or biological reasons. These analyses indicate that some orphans may be associated with regions of high genomic perturbation.
