1
Bryant MJ, Coello AM, Glendening AM, Hilliman SA, Jara CF, Pring SS, Rodriguez Rivera A, Santiago Membreño J, Nigro L, Pauloski N, Graham MR, King T, Jockusch EL, O'Neill RJ, Wegrzyn JL, Santibáñez-López CE, Webster CN. Unveiling the genetic blueprint of a desert scorpion: A chromosome-level genome of Hadrurus arizonensis provides the first reference for Parvorder Iurida. Genome Biol Evol 2024:evae097. [PMID: 38701023 DOI: 10.1093/gbe/evae097] [Received: 03/22/2024] [Revised: 04/19/2024] [Accepted: 04/28/2024] [Indexed: 05/05/2024]
Abstract
Over 400 million years old, scorpions represent an ancient group of arachnids and one of the first animals to adapt to life on land. Presently, the lack of available genomes within scorpions hinders research on their evolution. This study leverages ultra-long nanopore sequencing and Pore-C to generate the first chromosome-level assembly and annotation for the desert hairy scorpion, Hadrurus arizonensis. The assembled genome is 2.23 Gb in size with an N50 of 280 Mb. Pore-C scaffolding re-oriented 99.6% of bases into nine chromosomes and BUSCO identified 998 (98.6%) complete arthropod single-copy orthologs. Repetitive elements represent 54.69% of the assembled bases, including 872,874 (29.39%) LINE elements. A total of 18,996 protein-coding genes and 75,256 transcripts were predicted, and extracted protein sequences yielded a BUSCO score of 97.2%. This is the first genome assembled and annotated within the family Hadruridae, representing a crucial resource for closing gaps in genomic knowledge of scorpions, resolving arachnid phylogeny, and advancing studies in comparative and functional genomics.
Affiliation(s)
- Meridia Jane Bryant
  - Department of Ecology and Evolutionary Biology, University of Connecticut, Storrs, CT, USA
- Asher M Coello
  - Department of Ecology and Evolutionary Biology, University of Connecticut, Storrs, CT, USA
- Adam M Glendening
  - Department of Ecology and Evolutionary Biology, University of Connecticut, Storrs, CT, USA
- Samuel A Hilliman
  - Department of Ecology and Evolutionary Biology, University of Connecticut, Storrs, CT, USA
- Carolina Fernanda Jara
  - Department of Ecology and Evolutionary Biology, University of Connecticut, Storrs, CT, USA
- Samuel S Pring
  - Department of Ecology and Evolutionary Biology, University of Connecticut, Storrs, CT, USA
- Lisa Nigro
  - Institute for Systems Genomics, University of Connecticut, Storrs, CT, USA
- Nicole Pauloski
  - Department of Molecular and Cell Biology, University of Connecticut, Storrs, CT, USA
  - Institute for Systems Genomics, University of Connecticut, Storrs, CT, USA
- Matthew R Graham
  - Department of Biology, Eastern Connecticut State University, CT, USA
- Teisha King
  - Department of Ecology and Evolutionary Biology, University of Connecticut, Storrs, CT, USA
- Elizabeth L Jockusch
  - Department of Ecology and Evolutionary Biology, University of Connecticut, Storrs, CT, USA
- Rachel J O'Neill
  - Department of Molecular and Cell Biology, University of Connecticut, Storrs, CT, USA
  - Institute for Systems Genomics, University of Connecticut, Storrs, CT, USA
- Jill L Wegrzyn
  - Department of Ecology and Evolutionary Biology, University of Connecticut, Storrs, CT, USA
  - Institute for Systems Genomics, University of Connecticut, Storrs, CT, USA
- Cynthia N Webster
  - Department of Ecology and Evolutionary Biology, University of Connecticut, Storrs, CT, USA
  - Institute for Systems Genomics, University of Connecticut, Storrs, CT, USA
2
McEvoy SL, Grady PGS, Pauloski N, O'Neill RJ, Wegrzyn JL. Profiling genome-wide methylation in two maples: Fine-scale approaches to detection with nanopore technology. Evol Appl 2024; 17:e13669. [PMID: 38633133 PMCID: PMC11022628 DOI: 10.1111/eva.13669] [Received: 08/14/2023] [Revised: 02/04/2024] [Accepted: 02/12/2024] [Indexed: 04/19/2024]
Abstract
DNA methylation is critical to the regulation of transposable elements and gene expression and can play an important role in the adaptation of stress response mechanisms in plants. Traditional methods of methylation quantification rely on bisulfite conversion that can compromise accuracy. Recent advances in long-read sequencing technologies allow for methylation detection in real time. The associated algorithms that interpret these modifications have evolved from strictly statistical approaches to Hidden Markov Models and, recently, deep learning approaches. Much of the existing software focuses on methylation in the CG context, but methylation in other contexts is important to quantify, as it is extensively leveraged in plants. Here, we present methylation profiles for two maple species across the full range of 5mC sequence contexts using Oxford Nanopore Technologies (ONT) long-reads. Hybrid and reference-guided assemblies were generated for two new Acer accessions: Acer negundo (box elder; 65x ONT and 111X Illumina) and Acer saccharum (sugar maple; 93x ONT and 148X Illumina). The ONT reads generated for these assemblies were re-basecalled, and methylation detection was conducted in a custom pipeline with the published Acer references (PacBio assemblies) and hybrid assemblies reported herein to generate four epigenomes. Examination of the transposable element landscape revealed the dominance of LTR Copia elements and patterns of methylation associated with different classes of TEs. Methylation distributions were examined at high resolution across gene and repeat density and described within the broader angiosperm context, and more narrowly in the context of gene family dynamics and candidate nutrient stress genes.
Affiliation(s)
- Susan L. McEvoy
  - Department of Ecology and Evolutionary Biology, University of Connecticut, Storrs, Connecticut, USA
  - Department of Forest Sciences, University of Helsinki, Helsinki, Finland
- Patrick G. S. Grady
  - Department of Molecular and Cell Biology, University of Connecticut, Storrs, Connecticut, USA
- Nicole Pauloski
  - Department of Molecular and Cell Biology, University of Connecticut, Storrs, Connecticut, USA
  - Institute for Systems Genomics, University of Connecticut, Storrs, Connecticut, USA
- Rachel J. O'Neill
  - Department of Molecular and Cell Biology, University of Connecticut, Storrs, Connecticut, USA
  - Institute for Systems Genomics, University of Connecticut, Storrs, Connecticut, USA
- Jill L. Wegrzyn
  - Department of Ecology and Evolutionary Biology, University of Connecticut, Storrs, Connecticut, USA
  - Institute for Systems Genomics, University of Connecticut, Storrs, Connecticut, USA
3
Neale DB, Zimin AV, Meltzer A, Bhattarai A, Amee M, Figueroa Corona L, Allen BJ, Puiu D, Wright J, De La Torre AR, McGuire PE, Timp W, Salzberg SL, Wegrzyn JL. A Genome Sequence for the Threatened Whitebark Pine. G3 (Bethesda) 2024:jkae061. [PMID: 38526344 DOI: 10.1093/g3journal/jkae061] [Received: 12/11/2023] [Revised: 02/29/2024] [Accepted: 03/12/2024] [Indexed: 03/26/2024]
Abstract
Whitebark pine (WBP, Pinus albicaulis) is a white pine of subalpine regions in western contiguous US and Canada. WBP has become critically threatened throughout a significant part of its natural range due to mortality from the introduced fungal pathogen white pine blister rust (WPBR, Cronartium ribicola) and additional threats from mountain pine beetle (Dendroctonus ponderosae), wildfire, and maladaptation due to changing climate. Vast acreages of WBP have suffered nearly complete mortality. Genomic technologies can contribute to a faster, more cost-effective approach to the traditional practices of identifying disease-resistant, climate-adapted seed sources for restoration. With deep-coverage Illumina short-reads of haploid megagametophyte tissue and Oxford Nanopore long-reads of diploid needle tissue, followed by a hybrid, multistep assembly approach, we produced a final assembly containing 27.6 Gbp of sequence in 92,740 contigs (N50 537,007 bp) and 34,716 scaffolds (N50 2.0 Gbp). Approximately 87.2% (24.0 Gbp) of total sequence was placed on the twelve WBP chromosomes. Annotation yielded 25,362 protein-coding genes, and over 77% of the genome was characterized as repeats. WBP has demonstrated the greatest variation in resistance to WPBR among the North American white pines. Candidate genes for quantitative resistance include disease resistance genes known as nucleotide-binding leucine-rich-repeat receptors (NLRs). A combination of protein domain alignments and direct genome scanning was employed to fully describe the three subclasses of NLRs. Our high-quality reference sequence and annotation provide a marked improvement in NLR identification compared to previous assessments that leveraged de novo assembled transcriptomes.
Affiliation(s)
- David B Neale
  - Department of Plant Sciences, University of California, Davis, Davis, CA 95616 USA
  - Whitebark Pine Ecosystem Foundation, Missoula, MT 59808 USA
- Aleksey V Zimin
  - Department of Biomedical Engineering and Center for Computational Biology, Johns Hopkins University, Baltimore, MD 21218 USA
- Amy Meltzer
  - Department of Biomedical Engineering and Center for Computational Biology, Johns Hopkins University, Baltimore, MD 21218 USA
- Akriti Bhattarai
  - Department of Ecology and Evolutionary Biology, University of Connecticut, Storrs, CT 06269 USA
- Maurice Amee
  - Department of Ecology and Evolutionary Biology, University of Connecticut, Storrs, CT 06269 USA
- Brian J Allen
  - Department of Plant Sciences, University of California, Davis, Davis, CA 95616 USA
  - University of California Cooperative Extension, Central Sierra, Jackson, CA 95642 USA
- Daniela Puiu
  - Department of Biomedical Engineering and Center for Computational Biology, Johns Hopkins University, Baltimore, MD 21218 USA
- Jessica Wright
  - USDA Forest Service, Pacific Southwest Research Station, Davis, CA 95618 USA
- Patrick E McGuire
  - Department of Plant Sciences, University of California, Davis, Davis, CA 95616 USA
- Winston Timp
  - Department of Biomedical Engineering and Center for Computational Biology, Johns Hopkins University, Baltimore, MD 21218 USA
- Steven L Salzberg
  - Department of Biomedical Engineering and Center for Computational Biology, Johns Hopkins University, Baltimore, MD 21218 USA
  - Departments of Computer Science and Biostatistics, Johns Hopkins University, Baltimore, MD 21218 USA
- Jill L Wegrzyn
  - Department of Ecology and Evolutionary Biology, University of Connecticut, Storrs, CT 06269 USA
  - Institute for Systems Genomics, University of Connecticut, Storrs, CT 06269 USA
4
Guzman-Torres CR, Trybulec E, LeVasseur H, Akella H, Amee M, Strickland E, Pauloski N, Williams M, Romero-Severson J, Hoban S, Woeste K, Pike CC, Fetter KC, Webster CN, Neitzey ML, O’Neill RJ, Wegrzyn JL. Conserving a threatened North American walnut: a chromosome-scale reference genome for butternut (Juglans cinerea). G3 (Bethesda) 2024; 14:jkad189. [PMID: 37703053 PMCID: PMC10849370 DOI: 10.1093/g3journal/jkad189] [Received: 05/23/2023] [Revised: 05/23/2023] [Accepted: 07/28/2023] [Indexed: 09/14/2023]
Abstract
With the advent of affordable and more accurate third-generation sequencing technologies, and the associated bioinformatic tools, it is now possible to sequence, assemble, and annotate more species of conservation concern than ever before. Juglans cinerea, commonly known as butternut or white walnut, is a member of the walnut family, native to the Eastern United States and Southeastern Canada. The species is currently listed as Endangered on the IUCN Red List due to decline from an invasive fungus known as Ophiognomonia clavigignenti-juglandacearum (Oc-j) that causes butternut canker. Oc-j creates visible sores on the trunks of the tree which essentially starves and slowly kills the tree. Natural resistance to this pathogen is rare. Conserving butternut is of utmost priority due to its critical ecosystem role and cultural significance. As part of an integrated undergraduate and graduate student training program in biodiversity and conservation genomics, the first reference genome for Juglans cinerea is described here. This chromosome-scale 539 Mb assembly was generated from over 100 × coverage of Oxford Nanopore long reads and scaffolded with the Juglans mandshurica genome. Scaffolding with a closely related species oriented and ordered the sequences in a manner more representative of the structure of the genome without altering the sequence. Comparisons with sequenced Juglandaceae revealed high levels of synteny and further supported J. cinerea's recent phylogenetic placement. Comparative assessment of gene family evolution revealed a significant number of contracting families, including several associated with biotic stress response.
Affiliation(s)
- Cristopher R Guzman-Torres
  - Department of Ecology and Evolutionary Biology, University of Connecticut, Storrs, CT 06269, USA
  - Institute for Systems Genomics, University of Connecticut, Storrs, CT 06269, USA
- Emily Trybulec
  - Institute for Systems Genomics, University of Connecticut, Storrs, CT 06269, USA
  - Department of Molecular and Cell Biology, University of Connecticut, Storrs, CT 06269, USA
- Hannah LeVasseur
  - Department of Ecology and Evolutionary Biology, University of Connecticut, Storrs, CT 06269, USA
  - Institute for Systems Genomics, University of Connecticut, Storrs, CT 06269, USA
- Harshita Akella
  - Department of Ecology and Evolutionary Biology, University of Connecticut, Storrs, CT 06269, USA
  - Institute for Systems Genomics, University of Connecticut, Storrs, CT 06269, USA
- Maurice Amee
  - Department of Ecology and Evolutionary Biology, University of Connecticut, Storrs, CT 06269, USA
  - Institute for Systems Genomics, University of Connecticut, Storrs, CT 06269, USA
- Emily Strickland
  - Department of Ecology and Evolutionary Biology, University of Connecticut, Storrs, CT 06269, USA
- Nicole Pauloski
  - Institute for Systems Genomics, University of Connecticut, Storrs, CT 06269, USA
  - Department of Molecular and Cell Biology, University of Connecticut, Storrs, CT 06269, USA
- Martin Williams
  - Atlantic Forestry Center, Canadian Forest Service, Natural Resources Canada, Fredericton, NB E3B 5P7, Canada
- Sean Hoban
  - The Center for Tree Science, The Morton Arboretum, Lisle, IL 60532, USA
- Keith Woeste
  - USDA Forest Service, Northern Research Station, West Lafayette, IN 47906, USA
- Carolyn C Pike
  - USDA Forest Service, Eastern Region State, Private and Tribal Forestry, West Lafayette, IN 47906, USA
- Karl C Fetter
  - Department of Ecology and Evolutionary Biology, University of Connecticut, Storrs, CT 06269, USA
  - Institute for Systems Genomics, University of Connecticut, Storrs, CT 06269, USA
- Cynthia N Webster
  - Department of Ecology and Evolutionary Biology, University of Connecticut, Storrs, CT 06269, USA
  - Institute for Systems Genomics, University of Connecticut, Storrs, CT 06269, USA
- Michelle L Neitzey
  - Institute for Systems Genomics, University of Connecticut, Storrs, CT 06269, USA
  - Department of Molecular and Cell Biology, University of Connecticut, Storrs, CT 06269, USA
- Rachel J O’Neill
  - Institute for Systems Genomics, University of Connecticut, Storrs, CT 06269, USA
  - Department of Molecular and Cell Biology, University of Connecticut, Storrs, CT 06269, USA
- Jill L Wegrzyn
  - Department of Ecology and Evolutionary Biology, University of Connecticut, Storrs, CT 06269, USA
  - Institute for Systems Genomics, University of Connecticut, Storrs, CT 06269, USA
5
Knutie SA, Webster CN, Vaziri GJ, Albert L, Harvey JA, LaRue M, Verrett TB, Soldo A, Koop JAH, Chaves JA, Wegrzyn JL. Urban living can rescue Darwin's finches from the lethal effects of invasive vampire flies. Glob Chang Biol 2024; 30:e17145. [PMID: 38273516 DOI: 10.1111/gcb.17145] [Received: 03/06/2023] [Revised: 12/15/2023] [Accepted: 12/21/2023] [Indexed: 01/27/2024]
Abstract
Human activity changes multiple factors in the environment, which can have positive or negative synergistic effects on organisms. However, few studies have explored the causal effects of multiple anthropogenic factors, such as urbanization and invasive species, on animals and the mechanisms that mediate these interactions. This study examines the influence of urbanization on the detrimental effect of invasive avian vampire flies (Philornis downsi) on endemic Darwin's finches in the Galápagos Islands. We experimentally manipulated nest fly abundance in urban and non-urban locations and then characterized nestling health, fledging success, diet, and gene expression patterns related to host defense. Fledging success of non-parasitized nestlings from urban (79%) and non-urban (75%) nests did not differ significantly. However, parasitized, non-urban nestlings lost more blood, and fewer nestlings survived (8%) compared to urban nestlings (50%). Stable isotopic values (δ15N) from urban nestling feces were higher than those from non-urban nestlings, suggesting that urban nestlings are consuming more protein. δ15N values correlated negatively with parasite abundance, which suggests that diet might influence host defenses (e.g., tolerance and resistance). Parasitized, urban nestlings differentially expressed genes within pathways associated with red blood cell production (tolerance) and pro-inflammatory response (innate immunological resistance), compared to parasitized, non-urban nestlings. In contrast, parasitized non-urban nestlings differentially expressed genes within pathways associated with immunoglobulin production (adaptive immunological resistance). Our results suggest that urban nestlings are investing more in pro-inflammatory responses to resist parasites but also recovering more blood cells to tolerate blood loss. Although non-urban nestlings are mounting an adaptive immune response, it is likely a last effort by the immune system rather than an effective defense against avian vampire flies since few nestlings survived.
Affiliation(s)
- Sarah A Knutie
  - Department of Ecology and Evolutionary Biology, University of Connecticut, Storrs, Connecticut, USA
  - Institute for Systems Genomics, University of Connecticut, Storrs, Connecticut, USA
- Cynthia N Webster
  - Department of Ecology and Evolutionary Biology, University of Connecticut, Storrs, Connecticut, USA
- Grace J Vaziri
  - Department of Ecology and Evolutionary Biology, University of Connecticut, Storrs, Connecticut, USA
- Lauren Albert
  - Department of Ecology and Evolutionary Biology, University of Connecticut, Storrs, Connecticut, USA
- Johanna A Harvey
  - Department of Ecology and Evolutionary Biology, University of Connecticut, Storrs, Connecticut, USA
  - Department of Science and Technology, University of Maryland, College Park, Maryland, USA
- Michelle LaRue
  - School of Earth and Environment, University of Canterbury, Christchurch, New Zealand
- Taylor B Verrett
  - Department of Ecology and Evolutionary Biology, University of Connecticut, Storrs, Connecticut, USA
- Alexandria Soldo
  - Department of Ecology and Evolutionary Biology, University of Connecticut, Storrs, Connecticut, USA
- Jennifer A H Koop
  - Department of Biological Sciences, Northern Illinois University, DeKalb, Illinois, USA
- Jaime A Chaves
  - Department of Biology, San Francisco State University, San Francisco, California, USA
  - Colegio de Ciencias Biológicas y Ambientales, Universidad San Francisco de Quito, Quito, Ecuador
- Jill L Wegrzyn
  - Department of Ecology and Evolutionary Biology, University of Connecticut, Storrs, Connecticut, USA
  - Institute for Systems Genomics, University of Connecticut, Storrs, Connecticut, USA
6
Neale DB, Zimin AV, Meltzer A, Bhattarai A, Amee M, Corona LF, Allen BJ, Puiu D, Wright J, Torre ARDL, McGuire PE, Timp W, Salzberg SL, Wegrzyn JL. A Genome Sequence for the Threatened Whitebark Pine. bioRxiv 2023:2023.11.16.567420. [PMID: 38014212 PMCID: PMC10680812 DOI: 10.1101/2023.11.16.567420] [Indexed: 11/29/2023]
Abstract
Whitebark pine (WBP, Pinus albicaulis) is a white pine of subalpine regions in western contiguous US and Canada. WBP has become critically threatened throughout a significant part of its natural range due to mortality from the introduced fungal pathogen white pine blister rust (WPBR, Cronartium ribicola) and additional threats from mountain pine beetle (Dendroctonus ponderosae), wildfire, and maladaptation due to changing climate. Vast acreages of WBP have suffered nearly complete mortality. Genomic technologies can contribute to a faster, more cost-effective approach to the traditional practices of identifying disease-resistant, climate-adapted seed sources for restoration. With deep-coverage Illumina short-reads of haploid megagametophyte tissue and Oxford Nanopore long-reads of diploid needle tissue, followed by a hybrid, multistep assembly approach, we produced a final assembly containing 27.6 Gbp of sequence in 92,740 contigs (N50 537,007 bp) and 34,716 scaffolds (N50 2.0 Gbp). Approximately 87.2% (24.0 Gbp) of total sequence was placed on the twelve WBP chromosomes. Annotation yielded 25,362 protein-coding genes, and over 77% of the genome was characterized as repeats. WBP has demonstrated the greatest variation in resistance to WPBR among the North American white pines. Candidate genes for quantitative resistance include disease resistance genes known as nucleotide-binding leucine-rich-repeat receptors (NLRs). A combination of protein domain alignments and direct genome scanning was employed to fully describe the three subclasses of NLRs (TNL, CNL, RNL). Our high-quality reference sequence and annotation provide a marked improvement in NLR identification compared to previous assessments that leveraged de novo assembled transcriptomes.
7
Vuruputoor VS, Monyak D, Fetter KC, Webster C, Bhattarai A, Shrestha B, Zaman S, Bennett J, McEvoy SL, Caballero M, Wegrzyn JL. Welcome to the big leaves: Best practices for improving genome annotation in non-model plant genomes. Appl Plant Sci 2023; 11:e11533. [PMID: 37601314 PMCID: PMC10439824 DOI: 10.1002/aps3.11533] [Received: 10/04/2022] [Revised: 02/04/2023] [Accepted: 02/10/2023] [Indexed: 08/22/2023]
Abstract
Premise: Robust standards to evaluate quality and completeness are lacking in eukaryotic structural genome annotation, as genome annotation software is developed using model organisms and typically lacks benchmarking to comprehensively evaluate the quality and accuracy of the final predictions. The annotation of plant genomes is particularly challenging due to their large sizes, abundant transposable elements, and variable ploidies. This study investigates the impact of genome quality, complexity, sequence read input, and method on protein-coding gene predictions.
Methods: The impact of repeat masking, long-read and short-read inputs, and de novo and genome-guided protein evidence was examined in the context of the popular BRAKER and MAKER workflows for five plant genomes. The annotations were benchmarked for structural traits and sequence similarity.
Results: Benchmarks that reflect gene structures, reciprocal similarity search alignments, and mono-exonic/multi-exonic gene counts provide a more complete view of annotation accuracy. Transcripts derived from RNA-read alignments alone are not sufficient for genome annotation. Gene prediction workflows that combine evidence-based and ab initio approaches are recommended, and a combination of short and long reads can improve genome annotation. Adding protein evidence from de novo assemblies, genome-guided transcriptome assemblies, or full-length proteins from OrthoDB generates more putative false positives as implemented in the current workflows. Post-processing with functional and structural filters is highly recommended.
Discussion: While the annotation of non-model plant genomes remains complex, this study provides recommendations for inputs and methodological approaches. We discuss a set of best practices to generate an optimal plant genome annotation and present a more robust set of metrics to evaluate the resulting predictions.
Affiliation(s)
- Vidya S. Vuruputoor
  - Department of Ecology and Evolutionary Biology, University of Connecticut, Storrs, Connecticut 06269, USA
- Daniel Monyak
  - Department of Ecology and Evolutionary Biology, University of Connecticut, Storrs, Connecticut 06269, USA
- Karl C. Fetter
  - Department of Ecology and Evolutionary Biology, University of Connecticut, Storrs, Connecticut 06269, USA
- Cynthia Webster
  - Department of Ecology and Evolutionary Biology, University of Connecticut, Storrs, Connecticut 06269, USA
- Akriti Bhattarai
  - Department of Ecology and Evolutionary Biology, University of Connecticut, Storrs, Connecticut 06269, USA
- Bikash Shrestha
  - Department of Ecology and Evolutionary Biology, University of Connecticut, Storrs, Connecticut 06269, USA
- Sumaira Zaman
  - Department of Ecology and Evolutionary Biology, University of Connecticut, Storrs, Connecticut 06269, USA
- Jeremy Bennett
  - Department of Ecology and Evolutionary Biology, University of Connecticut, Storrs, Connecticut 06269, USA
- Susan L. McEvoy
  - Department of Ecology and Evolutionary Biology, University of Connecticut, Storrs, Connecticut 06269, USA
- Madison Caballero
  - Department of Ecology and Evolutionary Biology, University of Connecticut, Storrs, Connecticut 06269, USA
- Jill L. Wegrzyn
  - Department of Ecology and Evolutionary Biology, University of Connecticut, Storrs, Connecticut 06269, USA
8
Visser EA, Kampmann TP, Wegrzyn JL, Naidoo S. Multispecies comparison of host responses to Fusarium circinatum challenge in tropical pines show consistency in resistance mechanisms. Plant Cell Environ 2023; 46:1705-1725. [PMID: 36541367 DOI: 10.1111/pce.14522] [Received: 01/27/2022] [Accepted: 12/18/2022] [Indexed: 06/17/2023]
Abstract
Fusarium circinatum poses a threat to both commercial and natural pine forests. Large variation in host resistance exists between species, with many economically important species being susceptible. Development of resistant genotypes could be expedited and optimised by investigating the molecular mechanisms underlying host resistance and susceptibility as well as increasing the available genetic resources. RNA-seq data, from F. circinatum inoculated and mock-inoculated ca. 6-month-old shoot tissue at 3- and 7-days postinoculation, was generated for three commercially important tropical pines, Pinus oocarpa, Pinus maximinoi and Pinus greggii. De novo transcriptomes were assembled and used to investigate the NLR and PR gene content within available pine references. Host responses to F. circinatum challenge were investigated in P. oocarpa (resistant) and P. greggii (susceptible), in comparison to previously generated expression profiles from Pinus tecunumanii (resistant) and Pinus patula (susceptible). Expression results indicated crosstalk between induced salicylate, jasmonate and ethylene signalling is involved in host resistance and compromised in susceptible hosts. Additionally, higher constitutive expression of sulfur metabolism and flavonoid biosynthesis in resistant hosts suggest involvement of these metabolites in resistance.
Affiliation(s)
- Erik A Visser
  - Department of Biochemistry, Genetics and Microbiology, Forestry and Agricultural Biotechnology Institute (FABI), University of Pretoria, Pretoria, South Africa
- Tamanique P Kampmann
  - Department of Biochemistry, Genetics and Microbiology, Forestry and Agricultural Biotechnology Institute (FABI), University of Pretoria, Pretoria, South Africa
- Jill L Wegrzyn
  - Department of Ecology and Evolutionary Biology, University of Connecticut, Storrs, Connecticut, USA
- Sanushka Naidoo
  - Department of Biochemistry, Genetics and Microbiology, Forestry and Agricultural Biotechnology Institute (FABI), University of Pretoria, Pretoria, South Africa
9
Velasco VME, Ferreira A, Zaman S, Noordermeer D, Ensminger I, Wegrzyn JL. A long-read and short-read transcriptomics approach provides the first high-quality reference transcriptome and genome annotation for Pseudotsuga menziesii (Douglas-fir). G3 (Bethesda) 2023; 13:jkac304. [PMID: 36454025 PMCID: PMC10468028 DOI: 10.1093/g3journal/jkac304] [Received: 12/13/2021] [Revised: 12/13/2021] [Accepted: 10/19/2022] [Indexed: 12/02/2022]
Abstract
Douglas-fir (Pseudotsuga menziesii) is native to western North America. It grows in a wide range of environmental conditions and is an important timber tree. Although there are several studies on the gene expression responses of Douglas-fir to abiotic cues, the absence of high-quality transcriptome and genome data is a barrier to further investigation. As for most conifers, the available transcriptome and genome reference dataset for Douglas-fir remains fragmented and requires refinement. We aimed to generate a highly accurate and complete reference transcriptome and genome annotation. We deep-sequenced the transcriptome of Douglas-fir needles from seedlings that were grown under nonstress control conditions or a combination of heat and drought stress conditions using long-read (LR) and short-read (SR) sequencing platforms. We used 2 computational approaches, namely de novo and genome-guided LR transcriptome assembly. Using the LR de novo assembly, we identified 1.3X more high-quality transcripts, 1.85X more "complete" genes, and 2.7X more functionally annotated genes compared to the genome-guided assembly approach. We predicted 666 long noncoding RNAs and 12,778 unique protein-coding transcripts including 2,016 putative transcription factors. We leveraged the LR de novo assembled transcriptome with paired-end SR and a published single-end SR transcriptome to generate an improved genome annotation. This was conducted with BRAKER2 and refined based on functional annotation, repetitive content, and transcriptome alignment. This high-quality genome annotation has 51,419 unique gene models derived from 322,631 initial predictions. Overall, our informatics approach provides a new reference Douglas-fir transcriptome assembly and genome annotation with considerably improved completeness and functional annotation.
Affiliation(s)
| | - Alyssa Ferreira
- Department of Evolution and Ecology, University of Connecticut, Storrs, CT 06269, USA
| | - Sumaira Zaman
- Department of Evolution and Ecology, University of Connecticut, Storrs, CT 06269, USA
| | - Devin Noordermeer
- Department of Biology, University of Toronto, Mississauga, ON L5L 1C8, Canada
- Graduate Department of Cell and Systems Biology, University of Toronto, Toronto, ON M5S, Canada
| | - Ingo Ensminger
- Department of Biology, University of Toronto, Mississauga, ON L5L 1C8, Canada
- Graduate Department of Cell and Systems Biology, University of Toronto, Toronto, ON M5S, Canada
- Graduate Department of Ecology and Evolutionary Biology, University of Toronto, Toronto, ON M5S, Canada
| | - Jill L Wegrzyn
- Department of Evolution and Ecology, University of Connecticut, Storrs, CT 06269, USA
|
10
|
Cobo-Simón I, Maloof JN, Li R, Amini H, Méndez-Cea B, García-García I, Gómez-Garrido J, Esteve-Codina A, Dabad M, Alioto T, Wegrzyn JL, Seco JI, Linares JC, Gallego FJ. Contrasting transcriptomic patterns reveal a genomic basis for drought resilience in the relict fir Abies pinsapo Boiss. Tree Physiol 2023; 43:315-334. [PMID: 36210755 DOI: 10.1093/treephys/tpac115] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 05/09/2022] [Accepted: 10/05/2022] [Indexed: 06/16/2023]
Abstract
Climate change challenges the adaptive capacity of several forest tree species in the face of increasing drought and rising temperatures. Therefore, understanding the mechanistic connections between genetic diversity and drought resilience is highly valuable for conserving drought-sensitive forests. Nonetheless, the post-drought recovery in trees from a transcriptomic perspective has not yet been studied by comparing contrasting phenotypes. Here, experimental drought treatments, gas-exchange dynamics and transcriptomic analysis (RNA-seq) were performed in the relict and drought-sensitive fir Abies pinsapo Boiss. to identify gene expression differences over immediate (24 h) and extended drought (20 days). Post-drought responses were investigated to define resilient and sensitive phenotypes. Single nucleotide polymorphisms (SNPs) were also studied to characterize the genomic basis of A. pinsapo drought resilience. Weighted gene co-expression network analysis showed an activation of stomatal closing and an inhibition of plant growth-related genes during the immediate drought, consistent with an isohydric dynamic. During the extended drought, transcription factors, as well as cellular damage and homeostasis protection-related genes prevailed. Resilient individuals activate photosynthesis-related genes and inhibit aerial growth-related genes, suggesting a shifting shoot/root biomass allocation to improve water uptake and whole-plant carbon balance. About 152 fixed SNPs were found between resilient and sensitive seedlings, which were mostly located in RNA-activity-related genes, including epigenetic regulation. Contrasting gene expression and SNPs were found between different post-drought resilience phenotypes for the first time in a forest tree, suggesting a transcriptomic and genomic basis for drought resilience. The obtained drought-related transcriptomic profile and drought-resilience candidate genes may guide conservation programs for this threatened tree species.
Affiliation(s)
- Irene Cobo-Simón
- Dpto Sistemas Físicos, Químicos y Naturales, Univ. Pablo de Olavide, 41013 Sevilla, Spain
- Dpto Genética, Fisiología y Microbiología, Unidad de Genética, Facultad de CC Biológicas, Universidad Complutense de Madrid 28040, Spain
| | - Julin N Maloof
- University of California at Davis, Department of Plant Biology, Davis, CA 95616, USA
| | - Ruijuan Li
- University of California at Davis, Department of Plant Biology, Davis, CA 95616, USA
| | - Hajar Amini
- University of California at Davis, Department of Plant Biology, Davis, CA 95616, USA
| | - Belén Méndez-Cea
- Dpto Genética, Fisiología y Microbiología, Unidad de Genética, Facultad de CC Biológicas, Universidad Complutense de Madrid 28040, Spain
| | - Isabel García-García
- Dpto Genética, Fisiología y Microbiología, Unidad de Genética, Facultad de CC Biológicas, Universidad Complutense de Madrid 28040, Spain
| | - Jèssica Gómez-Garrido
- CNAG-CRG, Centre for Genomic Regulation, Barcelona Institute of Science and Technology, Barcelona 08028, Spain
| | - Anna Esteve-Codina
- CNAG-CRG, Centre for Genomic Regulation, Barcelona Institute of Science and Technology, Barcelona 08028, Spain
| | - Marc Dabad
- CNAG-CRG, Centre for Genomic Regulation, Barcelona Institute of Science and Technology, Barcelona 08028, Spain
| | - Tyler Alioto
- CNAG-CRG, Centre for Genomic Regulation, Barcelona Institute of Science and Technology, Barcelona 08028, Spain
| | - Jill L Wegrzyn
- Department of Ecology and Evolutionary Biology, University of Connecticut, Storrs, CT 06269, USA
| | - José Ignacio Seco
- Dpto Sistemas Físicos, Químicos y Naturales, Univ. Pablo de Olavide, 41013 Sevilla, Spain
| | - Juan Carlos Linares
- Dpto Sistemas Físicos, Químicos y Naturales, Univ. Pablo de Olavide, 41013 Sevilla, Spain
| | - Francisco Javier Gallego
- Dpto Genética, Fisiología y Microbiología, Unidad de Genética, Facultad de CC Biológicas, Universidad Complutense de Madrid 28040, Spain
|
11
|
Lötter A, Duong TA, Candotti J, Mizrachi E, Wegrzyn JL, Myburg AA. Haplogenome assembly reveals structural variation in Eucalyptus interspecific hybrids. Gigascience 2022; 12:giad064. [PMID: 37632754 PMCID: PMC10460159 DOI: 10.1093/gigascience/giad064] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/25/2022] [Revised: 02/15/2023] [Accepted: 07/27/2023] [Indexed: 08/28/2023] Open
Abstract
BACKGROUND: De novo phased (haplo)genome assembly using long-read DNA sequencing data has improved the detection and characterization of structural variants (SVs) in plant and animal genomes. Able to span across haplotypes, long reads allow phased, haplogenome assembly in highly outbred organisms such as forest trees. Eucalyptus tree species and interspecific hybrids are the most widely planted hardwood trees with F1 hybrids of Eucalyptus grandis and E. urophylla forming the bulk of fast-growing pulpwood plantations in subtropical regions. The extent of structural variation and its effect on interspecific hybridization is unknown in these trees. As a first step towards elucidating the extent of structural variation between the genomes of E. grandis and E. urophylla, we sequenced and assembled the haplogenomes contained in an F1 hybrid of the two species. FINDINGS: Using Nanopore sequencing and a trio-binning approach, we assembled the separate haplogenomes (566.7 Mb and 544.5 Mb) to 98.0% BUSCO completion. High-density SNP genetic linkage maps of both parents allowed scaffolding of 88.0% of the haplogenome contigs into 11 pseudo-chromosomes (scaffold N50 of 43.8 Mb and 42.5 Mb for the E. grandis and E. urophylla haplogenomes, respectively). We identify 48,729 SVs between the two haplogenomes providing the first detailed insight into genome structural rearrangement in these species. The two haplogenomes have similar gene content, 35,572 and 33,915 functionally annotated genes, of which 34.7% are contained in genome rearrangements. CONCLUSIONS: Knowledge of SV and haplotype diversity in the two species will form the basis for understanding the genetic basis of hybrid superiority in these trees.
Affiliation(s)
- Anneri Lötter
- Department of Biochemistry, Genetics and Microbiology, Forestry and Agricultural Biotechnology Institute (FABI), University of Pretoria, Private bag X20, Pretoria 0028, South Africa
| | - Tuan A Duong
- Department of Biochemistry, Genetics and Microbiology, Forestry and Agricultural Biotechnology Institute (FABI), University of Pretoria, Private bag X20, Pretoria 0028, South Africa
| | - Julia Candotti
- Department of Biochemistry, Genetics and Microbiology, Forestry and Agricultural Biotechnology Institute (FABI), University of Pretoria, Private bag X20, Pretoria 0028, South Africa
| | - Eshchar Mizrachi
- Department of Biochemistry, Genetics and Microbiology, Forestry and Agricultural Biotechnology Institute (FABI), University of Pretoria, Private bag X20, Pretoria 0028, South Africa
| | - Jill L Wegrzyn
- Department of Ecology and Evolutionary Biology, Institute for Systems Genomics: Computational Biology Core, University of Connecticut, Storrs, CT 06269, USA
| | - Alexander A Myburg
- Department of Biochemistry, Genetics and Microbiology, Forestry and Agricultural Biotechnology Institute (FABI), University of Pretoria, Private bag X20, Pretoria 0028, South Africa
|
12
|
Liu Y, Wang S, Li L, Yang T, Dong S, Wei T, Wu S, Liu Y, Gong Y, Feng X, Ma J, Chang G, Huang J, Yang Y, Wang H, Liu M, Xu Y, Liang H, Yu J, Cai Y, Zhang Z, Fan Y, Mu W, Sahu SK, Liu S, Lang X, Yang L, Li N, Habib S, Yang Y, Lindstrom AJ, Liang P, Goffinet B, Zaman S, Wegrzyn JL, Li D, Liu J, Cui J, Sonnenschein EC, Wang X, Ruan J, Xue JY, Shao ZQ, Song C, Fan G, Li Z, Zhang L, Liu J, Liu ZJ, Jiao Y, Wang XQ, Wu H, Wang E, Lisby M, Yang H, Wang J, Liu X, Xu X, Li N, Soltis PS, Van de Peer Y, Soltis DE, Gong X, Liu H, Zhang S. The Cycas genome and the early evolution of seed plants. Nat Plants 2022; 8:389-401. [PMID: 35437001 PMCID: PMC9023351 DOI: 10.1038/s41477-022-01129-7] [Citation(s) in RCA: 52] [Impact Index Per Article: 26.0] [Received: 09/03/2021] [Accepted: 03/10/2022] [Indexed: 05/05/2023]
Abstract
Cycads represent one of the most ancient lineages of living seed plants. Identifying genomic features uniquely shared by cycads and other extant seed plants, but not non-seed-producing plants, may shed light on the origin of key innovations, as well as the early diversification of seed plants. Here, we report the 10.5-Gb reference genome of Cycas panzhihuaensis, complemented by the transcriptomes of 339 cycad species. Nuclear and plastid phylogenomic analyses strongly suggest that cycads and Ginkgo form a clade sister to all other living gymnosperms, in contrast to mitochondrial data, which place cycads alone in this position. We found evidence for an ancient whole-genome duplication in the common ancestor of extant gymnosperms. The Cycas genome contains four homologues of the fitD gene family that were likely acquired via horizontal gene transfer from fungi, and these genes confer herbivore resistance in cycads. The male-specific region of the Y chromosome of C. panzhihuaensis contains a MADS-box transcription factor expressed exclusively in male cones that is similar to a system reported in Ginkgo, suggesting that a sex determination mechanism controlled by MADS-box genes may have originated in the common ancestor of cycads and Ginkgo. The C. panzhihuaensis genome provides an important new resource of broad utility for biologists.
Affiliation(s)
- Yang Liu
- State Key Laboratory of Agricultural Genomics, BGI-Shenzhen, Shenzhen, China.
- Key Laboratory of Southern Subtropical Plant Diversity, Fairy Lake Botanical Garden, Shenzhen & Chinese Academy of Sciences, Shenzhen, China.
| | - Sibo Wang
- State Key Laboratory of Agricultural Genomics, BGI-Shenzhen, Shenzhen, China
| | - Linzhou Li
- State Key Laboratory of Agricultural Genomics, BGI-Shenzhen, Shenzhen, China
| | - Ting Yang
- State Key Laboratory of Agricultural Genomics, BGI-Shenzhen, Shenzhen, China
| | - Shanshan Dong
- Key Laboratory of Southern Subtropical Plant Diversity, Fairy Lake Botanical Garden, Shenzhen & Chinese Academy of Sciences, Shenzhen, China
| | - Tong Wei
- State Key Laboratory of Agricultural Genomics, BGI-Shenzhen, Shenzhen, China
| | - Shengdan Wu
- State Key Laboratory of Grassland Agro-Ecosystems, College of Ecology, Lanzhou University, Lanzhou, China
| | - Yongbo Liu
- State Environmental Protection Key Laboratory of Regional Eco-process and Function Assessment, Chinese Research Academy of Environmental Sciences, Beijing, China
| | - Yiqing Gong
- Key Laboratory of Southern Subtropical Plant Diversity, Fairy Lake Botanical Garden, Shenzhen & Chinese Academy of Sciences, Shenzhen, China
| | - Xiuyan Feng
- Key Laboratory for Plant Diversity and Biogeography of East Asia, Kunming Institute of Botany, Chinese Academy of Sciences, Kunming, China
| | - Jianchao Ma
- Key Laboratory of Plant Stress Biology, State Key Laboratory of Crop Stress Adaptation and Improvement, Henan University, Kaifeng, China
| | - Guanxiao Chang
- Key Laboratory of Plant Stress Biology, State Key Laboratory of Crop Stress Adaptation and Improvement, Henan University, Kaifeng, China
| | - Jinling Huang
- Key Laboratory for Plant Diversity and Biogeography of East Asia, Kunming Institute of Botany, Chinese Academy of Sciences, Kunming, China
- Key Laboratory of Plant Stress Biology, State Key Laboratory of Crop Stress Adaptation and Improvement, Henan University, Kaifeng, China
- Department of Biology, East Carolina University, Greenville, NC, USA
| | - Yong Yang
- College of Biology and Environment, Nanjing Forestry University, Nanjing, China
| | - Hongli Wang
- State Key Laboratory of Agricultural Genomics, BGI-Shenzhen, Shenzhen, China
- College of Life Sciences, University of Chinese Academy of Sciences, Beijing, China
| | - Min Liu
- State Key Laboratory of Agricultural Genomics, BGI-Shenzhen, Shenzhen, China
| | - Yan Xu
- State Key Laboratory of Agricultural Genomics, BGI-Shenzhen, Shenzhen, China
- College of Life Sciences, University of Chinese Academy of Sciences, Beijing, China
| | - Hongping Liang
- State Key Laboratory of Agricultural Genomics, BGI-Shenzhen, Shenzhen, China
- College of Life Sciences, University of Chinese Academy of Sciences, Beijing, China
| | - Jin Yu
- State Key Laboratory of Agricultural Genomics, BGI-Shenzhen, Shenzhen, China
- College of Life Sciences, University of Chinese Academy of Sciences, Beijing, China
| | - Yuqing Cai
- State Key Laboratory of Agricultural Genomics, BGI-Shenzhen, Shenzhen, China
- College of Life Sciences, University of Chinese Academy of Sciences, Beijing, China
| | - Zhaowu Zhang
- State Key Laboratory of Agricultural Genomics, BGI-Shenzhen, Shenzhen, China
- College of Life Sciences, University of Chinese Academy of Sciences, Beijing, China
| | - Yannan Fan
- State Key Laboratory of Agricultural Genomics, BGI-Shenzhen, Shenzhen, China
| | - Weixue Mu
- State Key Laboratory of Agricultural Genomics, BGI-Shenzhen, Shenzhen, China
| | - Sunil Kumar Sahu
- State Key Laboratory of Agricultural Genomics, BGI-Shenzhen, Shenzhen, China
| | - Shuchun Liu
- Key Laboratory of Southern Subtropical Plant Diversity, Fairy Lake Botanical Garden, Shenzhen & Chinese Academy of Sciences, Shenzhen, China
| | - Xiaoan Lang
- Key Laboratory of Southern Subtropical Plant Diversity, Fairy Lake Botanical Garden, Shenzhen & Chinese Academy of Sciences, Shenzhen, China
- Nanning Botanical Garden, Nanning, China
| | - Leilei Yang
- Key Laboratory of Southern Subtropical Plant Diversity, Fairy Lake Botanical Garden, Shenzhen & Chinese Academy of Sciences, Shenzhen, China
| | - Na Li
- Key Laboratory of Southern Subtropical Plant Diversity, Fairy Lake Botanical Garden, Shenzhen & Chinese Academy of Sciences, Shenzhen, China
| | - Sadaf Habib
- Key Laboratory of Southern Subtropical Plant Diversity, Fairy Lake Botanical Garden, Shenzhen & Chinese Academy of Sciences, Shenzhen, China
- School of Life Sciences, Sun Yat-sen University, Guangzhou, China
| | - Yongqiong Yang
- Sichuan Cycas panzhihuaensis National Nature Reserve, Panzhihua, China
| | | | - Pei Liang
- Department of Entomology, China Agricultural University, Beijing, China
| | - Bernard Goffinet
- Department of Ecology and Evolutionary Biology, University of Connecticut, Storrs, CT, USA
| | - Sumaira Zaman
- Department of Ecology and Evolutionary Biology, University of Connecticut, Storrs, CT, USA
| | - Jill L Wegrzyn
- Department of Ecology and Evolutionary Biology, University of Connecticut, Storrs, CT, USA
| | - Dexiang Li
- Nanning Botanical Garden, Nanning, China
| | - Jian Liu
- Key Laboratory for Plant Diversity and Biogeography of East Asia, Kunming Institute of Botany, Chinese Academy of Sciences, Kunming, China
| | - Jie Cui
- Guangdong Provincial Key Laboratory for Plant Epigenetics, Longhua Institute of Innovative Biotechnology, College of Life Sciences and Oceanography, Shenzhen University, Shenzhen, China
| | - Eva C Sonnenschein
- Department of Biotechnology and Biomedicine, Technical University of Denmark, Lyngby, Denmark
| | - Xiaobo Wang
- Shenzhen Agricultural Genome Research Institute, Chinese Academy of Agricultural Sciences, Shenzhen, China
| | - Jue Ruan
- Shenzhen Agricultural Genome Research Institute, Chinese Academy of Agricultural Sciences, Shenzhen, China
| | - Jia-Yu Xue
- College of Horticulture, Academy for Advanced Interdisciplinary Studies, Nanjing Agricultural University, Nanjing, China
| | - Zhu-Qing Shao
- State Key Laboratory of Pharmaceutical Biotechnology, School of Life Sciences, Nanjing University, Nanjing, China
| | - Chi Song
- Chengdu University of Traditional Chinese Medicine, Chengdu, China
| | - Guangyi Fan
- State Key Laboratory of Agricultural Genomics, BGI-Shenzhen, Shenzhen, China
| | - Zhen Li
- Department of Plant Biotechnology and Bioinformatics, Ghent University, VIB UGent Center for Plant Systems Biology, Gent, Belgium
| | - Liangsheng Zhang
- College of Agriculture and Biotechnology, Zhejiang University, Hangzhou, China
- Hainan Institute of Zhejiang University, Sanya, China
| | - Jianquan Liu
- The College of Life Sciences, Sichuan University, Chengdu, China
| | - Zhong-Jian Liu
- Key Laboratory of Orchid Conservation and Utilization of National Forestry and Grassland Administration at College of Landscape Architecture, Fujian Agriculture and Forestry University, Fuzhou, China
| | - Yuannian Jiao
- State Key Laboratory of Systematic and Evolutionary Botany, Institute of Botany, Chinese Academy of Sciences, Beijing, China
| | - Xiao-Quan Wang
- State Key Laboratory of Systematic and Evolutionary Botany, Institute of Botany, Chinese Academy of Sciences, Beijing, China
| | - Hong Wu
- College of Life Sciences, South China Agricultural University, Guangzhou, China
| | - Ertao Wang
- National Key Laboratory of Plant Molecular Genetics, Chinese Academy of Sciences Center for Excellence in Molecular Plant Sciences, Institute of Plant Physiology and Ecology, Chinese Academy of Sciences, Shanghai, China
| | - Michael Lisby
- Department of Biology, University of Copenhagen, Copenhagen, Denmark
| | - Huanming Yang
- State Key Laboratory of Agricultural Genomics, BGI-Shenzhen, Shenzhen, China
| | - Jian Wang
- State Key Laboratory of Agricultural Genomics, BGI-Shenzhen, Shenzhen, China
| | - Xin Liu
- State Key Laboratory of Agricultural Genomics, BGI-Shenzhen, Shenzhen, China
| | - Xun Xu
- State Key Laboratory of Agricultural Genomics, BGI-Shenzhen, Shenzhen, China
| | - Nan Li
- Key Laboratory of Southern Subtropical Plant Diversity, Fairy Lake Botanical Garden, Shenzhen & Chinese Academy of Sciences, Shenzhen, China
| | - Pamela S Soltis
- Florida Museum of Natural History, University of Florida, Gainesville, FL, USA
| | - Yves Van de Peer
- College of Horticulture, Academy for Advanced Interdisciplinary Studies, Nanjing Agricultural University, Nanjing, China.
- Department of Plant Biotechnology and Bioinformatics, Ghent University, VIB UGent Center for Plant Systems Biology, Gent, Belgium.
- Department of Biochemistry, Genetics and Microbiology, University of Pretoria, Pretoria, South Africa.
| | - Douglas E Soltis
- Florida Museum of Natural History, University of Florida, Gainesville, FL, USA.
- Department of Biology, University of Florida, Gainesville, FL, USA.
| | - Xun Gong
- Key Laboratory for Plant Diversity and Biogeography of East Asia, Kunming Institute of Botany, Chinese Academy of Sciences, Kunming, China.
| | - Huan Liu
- State Key Laboratory of Agricultural Genomics, BGI-Shenzhen, Shenzhen, China.
| | - Shouzhou Zhang
- Key Laboratory of Southern Subtropical Plant Diversity, Fairy Lake Botanical Garden, Shenzhen & Chinese Academy of Sciences, Shenzhen, China.
|
13
|
Mahoney JD, Wang S, Iorio LA, Wegrzyn JL, Dorris M, Martin D, Bolling BW, Brand MH, Wang H. De novo assembly of a fruit transcriptome set identifies AmMYB10 as a key regulator of anthocyanin biosynthesis in Aronia melanocarpa. BMC Plant Biol 2022; 22:143. [PMID: 35337270 PMCID: PMC8951710 DOI: 10.1186/s12870-022-03518-8] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 09/05/2021] [Accepted: 02/25/2022] [Indexed: 06/14/2023]
Abstract
Aronia is a group of deciduous fruiting shrubs, of the Rosaceae family, native to eastern North America. Interest in Aronia has increased because of the high levels of dietary antioxidants in Aronia fruits. Using Illumina RNA-seq transcriptome analysis, this study investigates the molecular mechanisms of polyphenol biosynthesis during Aronia fruit development. Six A. melanocarpa (diploid) accessions were collected at four fruit developmental stages. De novo assembly was performed with 341 million clean reads from 24 samples and assembled into 90,008 transcripts with an average length of 801 bp. The transcriptome was 96.1% complete according to Benchmarking Universal Single-Copy Orthologs (BUSCOs). The differentially expressed genes (DEGs) were identified in flavonoid biosynthetic and metabolic processes, pigment biosynthesis, carbohydrate metabolic processes, and polysaccharide metabolic processes based on significant Gene Ontology (GO) biological terms. The expression of ten anthocyanin biosynthetic genes showed significant up-regulation during fruit development according to the transcriptomic data, which was further confirmed using qRT-PCR expression analyses. Additionally, transcription factor genes were identified among the DEGs. Using a transient expression assay, we confirmed that AmMYB10 induces anthocyanin biosynthesis. The de novo transcriptome data provide a valuable resource for understanding the molecular mechanisms of fruit anthocyanin biosynthesis in Aronia and species of the Rosaceae family.
Affiliation(s)
- Jonathan D Mahoney
- Department of Plant Science and Landscape Architecture, University of Connecticut, Storrs, CT, 06269, USA
| | - Sining Wang
- Department of Plant Science and Landscape Architecture, University of Connecticut, Storrs, CT, 06269, USA
| | - Liam A Iorio
- Department of Plant Science and Landscape Architecture, University of Connecticut, Storrs, CT, 06269, USA
| | - Jill L Wegrzyn
- Department of Ecology and Evolutionary Biology, University of Connecticut, Storrs, CT, 06269, USA
- Institute for Systems Genomics, University of Connecticut, Storrs, CT, 06269, USA
| | - Matthew Dorris
- Department of Food Science, University of Wisconsin, Madison, WI, 53706, USA
| | - Derek Martin
- Department of Food Science, University of Wisconsin, Madison, WI, 53706, USA
| | - Bradley W Bolling
- Department of Food Science, University of Wisconsin, Madison, WI, 53706, USA
| | - Mark H Brand
- Department of Plant Science and Landscape Architecture, University of Connecticut, Storrs, CT, 06269, USA
| | - Huanzhong Wang
- Department of Plant Science and Landscape Architecture, University of Connecticut, Storrs, CT, 06269, USA.
- Institute for Systems Genomics, University of Connecticut, Storrs, CT, 06269, USA.
|
14
|
McEvoy SL, Sezen UU, Trouern‐Trend A, McMahon SM, Schaberg PG, Yang J, Wegrzyn JL, Swenson NG. Strategies of tolerance reflected in two North American maple genomes. Plant J 2022; 109:1591-1613. [PMID: 34967059 PMCID: PMC9304320 DOI: 10.1111/tpj.15657] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 08/05/2021] [Accepted: 12/22/2021] [Indexed: 05/24/2023]
Abstract
The first chromosome‐scale assemblies for North American members of the Acer genus, sugar maple (Acer saccharum) and boxelder (Acer negundo), as well as transcriptomic evaluation of the abiotic stress response in A. saccharum are reported. This integrated study describes in‐depth aspects contributing to each species' approach to tolerance and applies current knowledge in many areas of plant genome biology with Acer physiology to help convey the genomic complexities underlying tolerance in broadleaf tree species.
Affiliation(s)
- Susan L. McEvoy
- Department of Ecology and Evolutionary Biology, University of Connecticut, Storrs, Connecticut 06269, USA
| | - U. Uzay Sezen
- Smithsonian Environmental Research Center, Edgewater, Maryland 21037, USA
| | - Alexander Trouern‐Trend
- Department of Ecology and Evolutionary Biology, University of Connecticut, Storrs, Connecticut 06269, USA
| | - Sean M. McMahon
- Smithsonian Environmental Research Center, Edgewater, Maryland 21037, USA
| | - Paul G. Schaberg
- Forest Service, U.S. Department of Agriculture, Northern Research Station, Burlington, Vermont 05405, USA
| | - Jie Yang
- CAS Key Laboratory of Tropical Forest Ecology, Xishuangbanna Tropical Botanical Garden, Chinese Academy of Sciences, Mengla 666303, Yunnan, China
| | - Jill L. Wegrzyn
- Department of Ecology and Evolutionary Biology, University of Connecticut, Storrs, Connecticut 06269, USA
| | - Nathan G. Swenson
- Department of Biological Sciences, University of Notre Dame, Notre Dame, Indiana 46556, USA
|
15
|
Webster C, Figueroa‐Corona L, Méndez‐González ID, Álvarez‐Soto L, Neale DB, Jaramillo‐Correa JP, Wegrzyn JL, Vázquez‐Lobo A. Comparative analysis of differential gene expression indicates divergence in ontogenetic strategies of leaves in two conifer genera. Ecol Evol 2022; 12:e8611. [PMID: 35222971 PMCID: PMC8848466 DOI: 10.1002/ece3.8611] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/13/2021] [Revised: 12/21/2021] [Accepted: 01/23/2022] [Indexed: 11/09/2022] Open
Abstract
In land plants, heteroblasty broadly refers to a drastic change in morphology during growth through ontogeny. Juniperus flaccida and Pinus cembroides are conifers of independent lineages known to exhibit leaf heteroblasty between the juvenile and adult life stage of development. Juvenile leaves of P. cembroides develop spirally on the main stem and appear decurrent, flattened, and needle‐like, whereas adult photosynthetic leaves are triangular or semi‐circular needle‐like, and grow in whorls on secondary or tertiary compact dwarf shoots. By comparison, J. flaccida juvenile leaves are decurrent and needle‐like, and adult leaves are compact, short, and scale‐like. Comparative analyses were performed to evaluate differences in anatomy and gene expression patterns between developmental phases in both species. RNA from 12 samples was sequenced and analyzed with available software. Transcriptomes were assembled de novo from the RNA‐Seq reads. Following assembly, 63,741 high‐quality transcripts were functionally annotated in P. cembroides and 69,448 in J. flaccida. Evaluation of the orthologous groups yielded 4140 shared gene families among the four references (adult and juvenile from each species). Activities related to cell division and development were more abundant in juveniles than adults in P. cembroides, and more abundant in adults than juveniles in J. flaccida. Overall, there were 509 up‐regulated and 81 down‐regulated genes in the juvenile condition of P. cembroides and 14 up‐regulated and 22 down‐regulated genes in J. flaccida. Gene interaction network analysis showed evidence of co‐expression and co‐localization of up‐regulated genes involved in cell wall and cuticle formation, development, and phenylpropanoid pathway, in juvenile P. cembroides leaves, whereas in J. flaccida, differential expression and gene interaction patterns were detected in genes involved in photosynthesis and chloroplast biogenesis. Although J. flaccida and P. cembroides both exhibit leaf heteroblastic development, little overlap was detected, and unique genes and pathways were highlighted in this study.
Affiliation(s)
- Cynthia Webster
- Department of Ecology and Evolutionary Biology, University of Connecticut, Storrs, Connecticut, USA
| | - Laura Figueroa‐Corona
- Departamento de Ecología Evolutiva, Instituto de Ecología, Universidad Nacional Autónoma de México, Ciudad de México, Mexico
| | - Iván David Méndez‐González
- Departamento de Ecología Evolutiva, Instituto de Ecología, Universidad Nacional Autónoma de México, Ciudad de México, Mexico
- Department of Biological Sciences, University of Pittsburgh, Pittsburgh, Pennsylvania, USA
| | - Lluvia Álvarez‐Soto
- Facultad de Ciencias Biológicas, Universidad Autónoma del Estado de Morelos, Cuernavaca, México
| | - David B. Neale
- Department of Plant Sciences, University of California, Davis, California, USA
| | - Juan Pablo Jaramillo‐Correa
- Departamento de Ecología Evolutiva, Instituto de Ecología, Universidad Nacional Autónoma de México, Ciudad de México, Mexico
| | - Jill L. Wegrzyn
- Department of Ecology and Evolutionary Biology, University of Connecticut, Storrs, Connecticut, USA
| | - Alejandra Vázquez‐Lobo
- Centro de Investigación en Biodiversidad y Conservación, Universidad Autónoma del Estado de Morelos, Cuernavaca, México
|
16
|
Lawniczak MKN, Durbin R, Flicek P, Lindblad-Toh K, Wei X, Archibald JM, Baker WJ, Belov K, Blaxter ML, Marques Bonet T, Childers AK, Coddington JA, Crandall KA, Crawford AJ, Davey RP, Di Palma F, Fang Q, Haerty W, Hall N, Hoff KJ, Howe K, Jarvis ED, Johnson WE, Johnson RN, Kersey PJ, Liu X, Lopez JV, Myers EW, Pettersson OV, Phillippy AM, Poelchau MF, Pruitt KD, Rhie A, Castilla-Rubio JC, Sahu SK, Salmon NA, Soltis PS, Swarbreck D, Thibaud-Nissen F, Wang S, Wegrzyn JL, Zhang G, Zhang H, Lewin HA, Richards S. Standards recommendations for the Earth BioGenome Project. Proc Natl Acad Sci U S A 2022; 119:e2115639118. [PMID: 35042802 PMCID: PMC8795494 DOI: 10.1073/pnas.2115639118] [Citation(s) in RCA: 20] [Impact Index Per Article: 10.0] [Indexed: 11/25/2022] Open
Abstract
A global international initiative, such as the Earth BioGenome Project (EBP), requires both agreement and coordination on standards to ensure that the collective effort generates rapid progress toward its goals. To this end, the EBP initiated five technical standards committees comprising volunteer members from the global genomics scientific community: Sample Collection and Processing, Sequencing and Assembly, Annotation, Analysis, and IT and Informatics. The current versions of the resulting standards documents are available on the EBP website, with the recognition that opportunities, technologies, and challenges may improve or change in the future, requiring flexibility for the EBP to meet its goals. Here, we describe some highlights from the proposed standards, and areas where additional challenges will need to be met.
Affiliation(s)
- Mara K N Lawniczak
- Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton CB10 1SA, United Kingdom
| | - Richard Durbin
- Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton CB10 1SA, United Kingdom
- Department of Genetics, University of Cambridge, Cambridge CB3 0DH, United Kingdom
| | - Paul Flicek
- Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton CB10 1SA, United Kingdom
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Cambridge CB10 1SD, United Kingdom
| | - Kerstin Lindblad-Toh
- Broad Institute of MIT and Harvard, Cambridge, MA 02142
- Science for Life Laboratory, Department of Medical Biochemistry and Microbiology, Uppsala University 751 23 Uppsala, Sweden
| | | | - John M Archibald
- Department of Biochemistry and Molecular Biology, Dalhousie University, Halifax, NS B3H 4R2, Canada
| | - William J Baker
- Department of Accelerated Taxonomy, Royal Botanic Gardens, Kew, Surrey TW9 3AE, United Kingdom
| | - Katherine Belov
- School of Life and Environmental Sciences, Faculty of Science, University of Sydney, Sydney, NSW 2006, Australia
| | - Mark L Blaxter
- Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton CB10 1SA, United Kingdom
| | - Tomas Marques Bonet
- Institute of Evolutionary Biology, Consejo Superior de Investigaciones Científicas-Universitat Pompeu Fabra, Parc de Recerca Biomèdica Barcelona 08003 Barcelona, Spain
- Catalan Institution of Research and Advanced Studies 08010 Barcelona, Spain
- Centre Nacional d'Anàlisi Genòmica - Centre for Genomic Regulation, Barcelona Institute of Science and Technology 08028 Barcelona, Spain
- Institut Català de Paleontologia Miquel Crusafont, Universitat Autònoma de Barcelona 08193 Barcelona, Spain
| | - Anna K Childers
- Bee Research Laboratory, Beltsville Agricultural Research Center, US Department of Agriculture, Agricultural Research Service, Beltsville, MD 20705
| | - Jonathan A Coddington
- Smithsonian Institution, National Museum of Natural History, Washington, DC 20560-0105
| | - Keith A Crandall
- Computational Biology Institute and Department of Biostatistics & Bioinformatics, Milken Institute School of Public Health, The George Washington University, Washington, DC 20052
| | - Andrew J Crawford
- Department of Biological Sciences, Universidad de los Andes 111711 Bogotá, Colombia
| | - Robert P Davey
- Engineering Biology, Earlham Institute, Norwich Research Park, Norwich NR4 7UZ, United Kingdom
| | | | - Qi Fang
- BGI-Shenzhen, Beishan Industrial Zone, Shenzhen 518083, China
| | - Wilfried Haerty
- Earlham Institute, Norwich Research Park, Norwich NR4 7UZ, United Kingdom
| | - Neil Hall
- Genome British Columbia, Vancouver, BC V5Z 0C4, Canada
- Earlham Institute, Norwich Research Park, Norwich NR4 7UZ, United Kingdom
| | - Katharina J Hoff
- Institute of Mathematics and Computer Science, Center for Functional Genomics of Microbes, University of Greifswald 17489 Greifswald, Germany
| | - Kerstin Howe
- Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton CB10 1SA, United Kingdom
| | - Erich D Jarvis
- Vertebrate Genomes Lab, The Rockefeller University, New York, NY 10065
- HHMI, Chevy Chase, MD 20815
| | - Warren E Johnson
- Center for Species Survival, Smithsonian Conservation Biology Institute, National Zoological Park, Front Royal, VA 22630
- The Walter Reed Biosystematics Unit, Museum Support Center MRC-534, Smithsonian Institution, Suitland, MD 20746-2863
| | - Rebecca N Johnson
- Smithsonian Institution, National Museum of Natural History, Washington, DC 20560-0105
| | - Paul J Kersey
- European Molecular Biology Laboratory, European Bioinformatics Institute, Cambridge CB10 1SD, United Kingdom
| | - Xin Liu
- China National GeneBank, Shenzhen 518120, China
| | - Jose Victor Lopez
- Halmos College of Arts and Sciences, Guy Harvey Oceanographic Center, Nova Southeastern University, Dania Beach, FL 33004
| | - Eugene W Myers
- Department of Systems Biology, Max Planck Institute of Molecular Cell Biology and Genetics, Dresden 01307, Germany
| | | | - Adam M Phillippy
- Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, NIH, Bethesda, MD 20894
| | - Monica F Poelchau
- National Agricultural Library, USDA Agricultural Research Service, Beltsville, MD 20705
| | - Kim D Pruitt
- National Center for Biotechnology Information, National Library of Medicine, NIH, Bethesda, MD 20894
| | - Arang Rhie
- Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, NIH, Bethesda, MD 20894
| | | | - Sunil Kumar Sahu
- China National GeneBank, Shenzhen 518120, China
- BGI-Shenzhen, Beishan Industrial Zone, Shenzhen 518083, China
| | - Nicholas A Salmon
- Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton CB10 1SA, United Kingdom
| | - Pamela S Soltis
- Florida Museum of Natural History, University of Florida, Gainesville, FL 32611
| | - David Swarbreck
- Earlham Institute, Norwich Research Park, Norwich NR4 7UZ, United Kingdom
| | - Françoise Thibaud-Nissen
- National Center for Biotechnology Information, National Library of Medicine, NIH, Bethesda, MD 20894
| | - Sibo Wang
- China National GeneBank, Shenzhen 518120, China
| | - Jill L Wegrzyn
- Department of Ecology and Evolutionary Biology, University of Connecticut, Storrs, CT 06269
- Institute for Systems Genomics, Computational Biology Core, University of Connecticut, Storrs, CT 06269
| | - Guojie Zhang
- Villum Center for Biodiversity Genomics, Section for Ecology and Evolution, Department of Biology, University of Copenhagen 1165 Copenhagen, Denmark
- China National Genebank, BGI-Shenzhen 518083 Shenzhen, China
- State Key Laboratory of Genetic Resources and Evolution, Kunming Institute of Zoology, Chinese Academy of Sciences 650223 Kunming, China
- Center for Excellence in Animal Evolution and Genetics, Chinese Academy of Sciences 650223 Kunming, China
| | - He Zhang
- BGI-Qingdao, BGI-Shenzhen 266555 Qingdao, China
| | - Harris A Lewin
- University of California Davis Genome Center, University of California, Davis, CA 95616
- Department of Evolution and Ecology, University of California, Davis, CA 95616
| | - Stephen Richards
- University of California Davis Genome Center, University of California, Davis, CA 95616
|
17
|
Kress WJ, Soltis DE, Kersey PJ, Wegrzyn JL, Leebens-Mack JH, Gostel MR, Liu X, Soltis PS. Green plant genomes: What we know in an era of rapidly expanding opportunities. Proc Natl Acad Sci U S A 2022; 119:e2115640118. [PMID: 35042803 PMCID: PMC8795535 DOI: 10.1073/pnas.2115640118] [Citation(s) in RCA: 46] [Impact Index Per Article: 23.0] [Indexed: 01/01/2023] Open
Abstract
Green plants play a fundamental role in ecosystems, human health, and agriculture. As de novo genomes are being generated for all known eukaryotic species as advocated by the Earth BioGenome Project, increasing genomic information on green land plants is essential. However, setting standards for the generation and storage of the complex set of genomes that characterize the green lineage of life is a major challenge for plant scientists. Such standards will need to accommodate the immense variation in green plant genome size, transposable element content, and structural complexity while enabling research into the molecular and evolutionary processes that have resulted in this enormous genomic variation. Here we provide an overview and assessment of the current state of knowledge of green plant genomes. To date fewer than 300 complete chromosome-scale genome assemblies representing fewer than 900 species have been generated across the estimated 450,000 to 500,000 species in the green plant clade. These genomes range in size from 12 Mb to 27.6 Gb and are biased toward agricultural crops with large branches of the green tree of life untouched by genomic-scale sequencing. Locating suitable tissue samples of most species of plants, especially those taxa from extreme environments, remains one of the biggest hurdles to increasing our genomic inventory. Furthermore, the annotation of plant genomes is at present undergoing intensive improvement. It is our hope that this fresh overview will help in the development of genomic quality standards for a cohesive and meaningful synthesis of green plant genomes as we scale up for the future.
Affiliation(s)
- W John Kress
- National Museum of Natural History, Smithsonian Institution, Department of Botany, Washington, DC 20013-7012
- Department of Biological Sciences, Dartmouth College, Hanover, NH 03755
- Arnold Arboretum, Harvard University, Boston, MA 02130
| | - Douglas E Soltis
- Florida Museum of Natural History, University of Florida, Gainesville, FL 32611
- Biodiversity Institute, University of Florida, Gainesville, FL 32611
- Department of Biology, University of Florida, Gainesville, FL 32611
| | - Paul J Kersey
- Royal Botanic Gardens, Kew, Richmond, Surrey TW9 3AE, United Kingdom
| | - Jill L Wegrzyn
- Department of Ecology and Evolutionary Biology, Institute for Systems Genomics: Computational Biology Core, University of Connecticut, Storrs, CT 06269-3214
| | - James H Leebens-Mack
- Department of Plant Biology, 2101 Miller Plant Sciences, University of Georgia, Athens, GA 30602-7271
| | - Morgan R Gostel
- Botanical Research Institute of Texas, Fort Worth, TX 76107-3400
| | - Xin Liu
- China National GeneBank, BGI-Shenzhen, Shenzhen 518120, China
| | - Pamela S Soltis
- Florida Museum of Natural History, University of Florida, Gainesville, FL 32611
- Biodiversity Institute, University of Florida, Gainesville, FL 32611
|
18
|
Neale DB, Zimin AV, Zaman S, Scott AD, Shrestha B, Workman RE, Puiu D, Allen BJ, Moore ZJ, Sekhwal MK, De La Torre AR, McGuire PE, Burns E, Timp W, Wegrzyn JL, Salzberg SL. Assembled and annotated 26.5 Gbp coast redwood genome: a resource for estimating evolutionary adaptive potential and investigating hexaploid origin. G3 (Bethesda) 2022; 12:6460957. [PMID: 35100403 PMCID: PMC8728005 DOI: 10.1093/g3journal/jkab380] [Citation(s) in RCA: 14] [Impact Index Per Article: 7.0] [Received: 09/26/2021] [Accepted: 10/25/2021] [Indexed: 12/15/2022]
Abstract
Sequencing, assembly, and annotation of the 26.5 Gbp hexaploid genome of coast redwood (Sequoia sempervirens) was completed leading toward discovery of genes related to climate adaptation and investigation of the origin of the hexaploid genome. Deep-coverage short-read Illumina sequencing data from haploid tissue from a single seed were combined with long-read Oxford Nanopore Technologies sequencing data from diploid needle tissue to create an initial assembly, which was then scaffolded using proximity ligation data to produce a highly contiguous final assembly, SESE 2.1, with a scaffold N50 size of 44.9 Mbp. The assembly included several scaffolds that span entire chromosome arms, confirmed by the presence of telomere and centromere sequences on the ends of the scaffolds. The structural annotation produced 118,906 genes with 113 containing introns that exceed 500 Kbp in length and one reaching 2 Mb. Nearly 19 Gbp of the genome represented repetitive content with the vast majority characterized as long terminal repeats, with a 2.9:1 ratio of Copia to Gypsy elements that may aid in gene expression control. Comparison of coast redwood to other conifers revealed species-specific expansions for a plethora of abiotic and biotic stress response genes, including those involved in fungal disease resistance, detoxification, and physical injury/structural remodeling and others supporting flavonoid biosynthesis. Analysis of multiple genes that exist in triplicate in coast redwood but only once in its diploid relative, giant sequoia, supports a previous hypothesis that the hexaploidy is the result of autopolyploidy rather than any hybridizations with separate but closely related conifer species.
Affiliation(s)
- David B Neale
- Department of Plant Sciences, University of California, Davis, Davis, CA 95616, USA
| | - Aleksey V Zimin
- Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD 21218, USA
- Center for Computational Biology, Johns Hopkins University, Baltimore, MD 21211, USA
| | - Sumaira Zaman
- Department of Ecology and Evolutionary Biology, University of Connecticut, Storrs, CT 06269, USA
- Department of Computer Science & Engineering, University of Connecticut, Storrs, CT 06269, USA
| | - Alison D Scott
- Department of Plant Sciences, University of California, Davis, Davis, CA 95616, USA
| | - Bikash Shrestha
- Department of Ecology and Evolutionary Biology, University of Connecticut, Storrs, CT 06269, USA
| | - Rachael E Workman
- Department of Molecular Biology and Genetics, Johns Hopkins University, Baltimore, MD 21205, USA
| | - Daniela Puiu
- Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD 21218, USA
- Center for Computational Biology, Johns Hopkins University, Baltimore, MD 21211, USA
| | - Brian J Allen
- Department of Plant Sciences, University of California, Davis, Davis, CA 95616, USA
| | - Zane J Moore
- Department of Plant Sciences, University of California, Davis, Davis, CA 95616, USA
| | - Manoj K Sekhwal
- School of Forestry, Northern Arizona University, Flagstaff, AZ 86011, USA
| | | | - Patrick E McGuire
- Department of Plant Sciences, University of California, Davis, Davis, CA 95616, USA
| | - Emily Burns
- Save the Redwoods League, San Francisco, CA 94104, USA
| | - Winston Timp
- Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD 21218, USA
- Center for Computational Biology, Johns Hopkins University, Baltimore, MD 21211, USA
- Department of Molecular Biology and Genetics, Johns Hopkins University, Baltimore, MD 21205, USA
| | - Jill L Wegrzyn
- Department of Ecology and Evolutionary Biology, University of Connecticut, Storrs, CT 06269, USA
- Institute for Systems Genomics, University of Connecticut, Storrs, CT 06269, USA
| | - Steven L Salzberg
- Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD 21218, USA
- Center for Computational Biology, Johns Hopkins University, Baltimore, MD 21211, USA
- Department of Computer Science, Johns Hopkins University, Baltimore, MD 21218, USA
- Department of Biostatistics, Johns Hopkins University, Baltimore, MD 21205, USA
|
19
|
Urban MC, Merow C, Wegrzyn JL, Maitner BS, Corcoran D. How to Publish at Pandemic Speed. Bioscience 2021. [DOI: 10.1093/biosci/biab084] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/14/2022] Open
Affiliation(s)
- Mark C Urban
- Center of Biological Risk, University of Connecticut, Storrs, Connecticut, United States
| | - Cory Merow
- Center of Biological Risk, University of Connecticut, Storrs, Connecticut, United States
- Eversource Energy Center, University of Connecticut, Storrs, Connecticut, United States
| | - Jill L Wegrzyn
- Center of Biological Risk, University of Connecticut, Storrs, Connecticut, United States
| | - Brian S Maitner
- Department of Ecology and Evolutionary Biology, University of Connecticut, Storrs, Connecticut, United States
| | - Derek Corcoran
- Center of Biological Risk, University of Connecticut, Storrs, Connecticut, United States
|
20
|
Caballero M, Lauer E, Bennett J, Zaman S, McEvoy S, Acosta J, Jackson C, Townsend L, Eckert A, Whetten RW, Loopstra C, Holliday J, Mandal M, Wegrzyn JL, Isik F. Toward genomic selection in Pinus taeda: Integrating resources to support array design in a complex conifer genome. Appl Plant Sci 2021; 9:e11439. [PMID: 34268018 PMCID: PMC8272584 DOI: 10.1002/aps3.11439] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 03/02/2021] [Accepted: 05/21/2021] [Indexed: 05/13/2023]
Abstract
PREMISE: An informatics approach was used for the construction of an Axiom genotyping array from heterogeneous, high-throughput sequence data to assess the complex genome of loblolly pine (Pinus taeda).
METHODS: High-throughput sequence data, sourced from exome capture and whole genome reduced-representation approaches from 2698 trees across five sequence populations, were analyzed with the improved genome assembly and annotation for the loblolly pine. A variant detection, filtering, and probe design pipeline was developed to detect true variants across and within populations. From 8.27 million variants, a total of 642,275 were evaluated and 423,695 of those were screened across a range-wide population.
RESULTS: The final informatics and screening approach delivered an Axiom array representing 46,439 high-confidence variants to the forest tree breeding and genetics community. Based on the annotated reference genome, 34% were located in or directly upstream or downstream of genic regions.
DISCUSSION: The Pita50K array represents a genome-wide resource developed from sequence data for an economically important conifer, loblolly pine. It uniquely integrates independent projects that assessed trees sampled across the native range. The challenges associated with the large and repetitive genome are addressed in the development of this resource.
Affiliation(s)
- Madison Caballero
- Department of Ecology and Evolutionary Biology, University of Connecticut, Storrs, Connecticut 06269, USA
| | - Edwin Lauer
- Department of Forestry and Environmental Resources, North Carolina State University, Raleigh, North Carolina 27695, USA
| | - Jeremy Bennett
- Department of Ecology and Evolutionary Biology, University of Connecticut, Storrs, Connecticut 06269, USA
| | - Sumaira Zaman
- Department of Ecology and Evolutionary Biology, University of Connecticut, Storrs, Connecticut 06269, USA
| | - Susan McEvoy
- Department of Ecology and Evolutionary Biology, University of Connecticut, Storrs, Connecticut 06269, USA
| | - Juan Acosta
- Department of Forestry and Environmental Resources, North Carolina State University, Raleigh, North Carolina 27695, USA
| | - Colin Jackson
- Department of Forestry and Environmental Resources, North Carolina State University, Raleigh, North Carolina 27695, USA
| | - Laura Townsend
- Department of Forestry and Environmental Resources, North Carolina State University, Raleigh, North Carolina 27695, USA
| | - Andrew Eckert
- Department of Biology, Virginia Commonwealth University, Richmond, Virginia 23284, USA
| | - Ross W. Whetten
- Department of Forestry and Environmental Resources, North Carolina State University, Raleigh, North Carolina 27695, USA
| | - Carol Loopstra
- Department of Ecology and Conservation Biology, Texas A&M University, College Station, Texas 77843, USA
| | - Jason Holliday
- Department of Forest Resources and Environmental Conservation, Virginia Polytechnic Institute and State University, Blacksburg, Virginia 24061, USA
| | - Mihir Mandal
- Department of Biology, Claflin University, Orangeburg, South Carolina 29115, USA
| | - Jill L. Wegrzyn
- Department of Ecology and Evolutionary Biology, University of Connecticut, Storrs, Connecticut 06269, USA
| | - Fikret Isik
- Department of Forestry and Environmental Resources, North Carolina State University, Raleigh, North Carolina 27695, USA
|
21
|
Li J, West JB, Hart A, Wegrzyn JL, Smith MA, Domec JC, Loopstra CA, Casola C. Extensive Variation in Drought-Induced Gene Expression Changes Between Loblolly Pine Genotypes. Front Genet 2021; 12:661440. [PMID: 34140968 PMCID: PMC8203665 DOI: 10.3389/fgene.2021.661440] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 01/30/2021] [Accepted: 04/07/2021] [Indexed: 01/22/2023] Open
Abstract
Drought response is coordinated through expression changes in a large suite of genes. Interspecific variation in this response is common and associated with drought-tolerant and -sensitive genotypes. The extent to which different genetic networks orchestrate the adjustments to water deficit in tolerant and sensitive genotypes has not been fully elucidated, particularly in non-model or woody plants. Differential expression analysis via RNA-seq was evaluated in root tissue exposed to simulated drought conditions in two loblolly pine (Pinus taeda L.) clones with contrasting tolerance to drought. Loblolly pine is the prevalent conifer in southeastern U.S. and a major commercial forestry species worldwide. Significant changes in gene expression levels were found in more than 4,000 transcripts [drought-related transcripts (DRTs)]. Genotype by environment (GxE) interactions were prevalent, suggesting that different cohorts of genes are influenced by drought conditions in the tolerant vs. sensitive genotypes. Functional annotation categories and metabolic pathways associated with DRTs showed higher levels of overlap between clones, with the notable exception of GO categories in upregulated DRTs. Conversely, both differentially expressed transcription factors (TFs) and TF families were largely different between clones. Our results indicate that the response of a drought-tolerant loblolly pine genotype vs. a sensitive genotype to water limitation is remarkably different on a gene-by-gene level, although it involves similar genetic networks. Upregulated transcripts under drought conditions represent the most diverging component between genotypes, which might depend on the activation and repression of substantially different groups of TFs.
Affiliation(s)
- Jingjia Li
- Department of Ecology and Conservation Biology, Texas A&M University, College Station, TX, United States
| | - Jason B West
- Department of Ecology and Conservation Biology, Texas A&M University, College Station, TX, United States
| | - Alexander Hart
- Department of Ecology and Evolutionary Biology, University of Connecticut, Storrs, CT, United States
| | - Jill L Wegrzyn
- Department of Ecology and Evolutionary Biology, University of Connecticut, Storrs, CT, United States
| | - Matthew A Smith
- Department of Biological Sciences, Florida International University, Miami, FL, United States
| | - Jean-Christophe Domec
- Bordeaux Sciences Agro, UMR 1391 INRA ISPA, Gradignan, France
- Nicholas School of the Environment, Duke University, Durham, NC, United States
| | - Carol A Loopstra
- Department of Ecology and Conservation Biology, Texas A&M University, College Station, TX, United States
| | - Claudio Casola
- Department of Ecology and Conservation Biology, Texas A&M University, College Station, TX, United States
|
22
|
Trouern-Trend AJ, Falk T, Zaman S, Caballero M, Neale DB, Langley CH, Dandekar AM, Stevens KA, Wegrzyn JL. Comparative genomics of six Juglans species reveals disease-associated gene family contractions. Plant J 2020; 102:410-423. [PMID: 31823432 DOI: 10.1111/tpj.14630] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.8] [Received: 02/23/2019] [Accepted: 11/01/2019] [Indexed: 06/10/2023]
Abstract
Juglans (walnuts), the most speciose genus in the walnut family (Juglandaceae), represents most of the family's commercially valuable fruit and wood-producing trees. It includes several species used as rootstock for their resistance to various abiotic and biotic stressors. We present the full structural and functional genome annotations of six Juglans species and one outgroup within Juglandaceae (Juglans regia, J. cathayensis, J. hindsii, J. microcarpa, J. nigra, J. sigillata and Pterocarya stenoptera) produced using BRAKER2 semi-unsupervised gene prediction pipeline and additional tools. For each annotation, gene predictors were trained using 19 tissue-specific J. regia transcriptomes aligned to the genomes. Additional functional evidence and filters were applied to multi-exonic and mono-exonic putative genes to yield between 27 000 and 44 000 high-confidence gene models per species. Comparison of gene models to the BUSCO embryophyta dataset suggested that, on average, genome annotation completeness was 85.6%. We utilized these high-quality annotations to assess gene family evolution within Juglans, and among Juglans and selected Eurosid species. We found notable contractions in several gene families in J. hindsii, including disease resistance-related wall-associated kinase (WAK), Catharanthus roseus receptor-like kinase (CrRLK1L) and others involved in abiotic stress response. Finally, we confirmed an ancient whole-genome duplication that took place in a common ancestor of Juglandaceae using site substitution comparative analysis.
Affiliation(s)
| | - Taylor Falk
- Department of Ecology and Evolutionary Biology, University of Connecticut, Storrs, CT, USA
| | - Sumaira Zaman
- Department of Ecology and Evolutionary Biology, University of Connecticut, Storrs, CT, USA
| | - Madison Caballero
- Department of Ecology and Evolutionary Biology, University of Connecticut, Storrs, CT, USA
| | - David B Neale
- Department of Plant Sciences, University of California Davis, Davis, CA, USA
| | - Charles H Langley
- Department of Evolution and Ecology, University of California Davis, Davis, CA, USA
| | - Abhaya M Dandekar
- Department of Plant Sciences, University of California Davis, Davis, CA, USA
| | - Kristian A Stevens
- Department of Evolution and Ecology, University of California Davis, Davis, CA, USA
- Department of Computer Science, University of California Davis, Davis, CA, USA
| | - Jill L Wegrzyn
- Department of Ecology and Evolutionary Biology, University of Connecticut, Storrs, CT, USA
|
23
|
Spoor S, Cheng CH, Sanderson LA, Condon B, Almsaeed A, Chen M, Bretaudeau A, Rasche H, Jung S, Main D, Bett K, Staton M, Wegrzyn JL, Feltus FA, Ficklin SP. Tripal v3: an ontology-based toolkit for construction of FAIR biological community databases. Database (Oxford) 2020; 2019:5532788. [PMID: 31328773 PMCID: PMC6643302 DOI: 10.1093/database/baz077] [Citation(s) in RCA: 18] [Impact Index Per Article: 4.5] [Received: 02/08/2019] [Revised: 05/12/2019] [Accepted: 05/22/2019] [Indexed: 12/20/2022]
Abstract
Community biological databases provide an important online resource for both public and private data, analysis tools and community engagement. These sites house genomic, transcriptomic, genetic, breeding and ancillary data for specific species, families or clades. Due to the complexity and increasing quantities of these data, construction of online resources is increasingly difficult especially with limited funding and access to technical expertise. Furthermore, online repositories are expected to promote FAIR data principles (findable, accessible, interoperable and reusable) that presents additional challenges. The open-source Tripal database toolkit seeks to mitigate these challenges by creating both the software and an interactive community of developers for construction of online community databases. Additionally, through coordinated, distributed co-development, Tripal sites encourage community-wide sustainability. Here, we report the release of Tripal version 3 that improves data accessibility and data sharing through systematic use of controlled vocabularies (CVs). Tripal uses the community-developed Chado database as a default data store, but now provides tools to support other data stores, while ensuring that CVs remain the central organizational structure for the data. A new site developer can use Tripal to develop a basic site with little to no programming, with the ability to integrate other data types using extension modules and the Tripal application programming interface. A thorough online User’s Guide and Developer’s Handbook are available at http://tripal.info, providing download, installation and step-by-step setup instructions.
Affiliation(s)
- Shawna Spoor
- Department of Horticulture, Washington State University, Pullman, WA, USA
| | - Chun-Huai Cheng
- Department of Horticulture, Washington State University, Pullman, WA, USA
| | | | - Bradford Condon
- Department of Entomology and Plant Pathology, University of Tennessee, Knoxville, TN, USA
| | - Abdullah Almsaeed
- Department of Entomology and Plant Pathology, University of Tennessee, Knoxville, TN, USA
| | - Ming Chen
- Department of Entomology and Plant Pathology, University of Tennessee, Knoxville, TN, USA
| | - Anthony Bretaudeau
- INRA, UMR IGEPP, BIPAA/GenOuest, INRIA/Irisa - Campus de Beaulieu, Rennes Cedex, France
| | - Helena Rasche
- Bioinformatics Group, Department of Computer Science, University of Freiburg, Freiburg im Breisgau, Germany
| | - Sook Jung
- Department of Horticulture, Washington State University, Pullman, WA, USA
| | - Dorrie Main
- Department of Horticulture, Washington State University, Pullman, WA, USA
| | - Kirstin Bett
- Department of Plant Sciences, University of Saskatchewan, Saskatoon, SK, Canada
| | - Margaret Staton
- Department of Entomology and Plant Pathology, University of Tennessee, Knoxville, TN, USA
| | - Jill L Wegrzyn
- Department of Ecology and Evolutionary Biology, University of Connecticut, Storrs, CT, USA
- Computational Biology Core, Institute for Systems Genomics, University of Connecticut, Storrs, CT, USA
| | - F Alex Feltus
- Dept. of Genetics and Biochemistry, Clemson University, Clemson, USA
| | - Stephen P Ficklin
- Department of Horticulture, Washington State University, Pullman, WA, USA
|
24
|
Wegrzyn JL, Falk T, Grau E, Buehler S, Ramnath R, Herndon N. Cyberinfrastructure and resources to enable an integrative approach to studying forest trees. Evol Appl 2020; 13:228-241. [PMID: 31892954 PMCID: PMC6935593 DOI: 10.1111/eva.12860] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 03/12/2019] [Revised: 08/11/2019] [Accepted: 08/14/2019] [Indexed: 12/19/2022] Open
Abstract
Sequencing technologies and bioinformatic approaches are now available to resolve the challenges associated with complex and heterozygous genomes. Increased access to less expensive and more effective instrumentation will contribute to a wealth of high-quality plant genomes in the next few years. In the meantime, more than 370 tree species are associated with public projects in primary repositories that are interrogating expression profiles, identifying variants, or analyzing targeted capture without a high-quality reference genome. Genomic data from these projects generates sequences that represent intermediate assemblies for transcriptomes and genomes. These data contribute to forest tree biology, but the associated sequence remains trapped in supplemental files that are poorly integrated in plant community databases and comparative genomic platforms. Successful implementation of life science cyberinfrastructure is improving data standards, ontologies, analytic workflows, and integrated database platforms for both model and non-model plant species. Unique to forest trees with large populations that are long-lived, outcrossing, and genetically diverse, the phenotypic and environmental metrics associated with georeferenced populations are just as important as the genomic data sampled for each individual. To address questions related to forest health and productivity, cyberinfrastructure must keep pace with the magnitude of genomic and phenomic sampling of larger populations. This review examines the current landscape of cyberinfrastructure, with an emphasis on best practices and resources to align community data with the Findable, Accessible, Interoperable, and Reusable (FAIR) guidelines.
Affiliation(s)
- Jill L. Wegrzyn
- Department of Ecology and Evolutionary Biology, University of Connecticut, Storrs, Connecticut
| | - Taylor Falk
- Department of Ecology and Evolutionary Biology, University of Connecticut, Storrs, Connecticut
| | - Emily Grau
- Department of Ecology and Evolutionary Biology, University of Connecticut, Storrs, Connecticut
| | - Sean Buehler
- Department of Ecology and Evolutionary Biology, University of Connecticut, Storrs, Connecticut
| | - Risharde Ramnath
- Department of Ecology and Evolutionary Biology, University of Connecticut, Storrs, Connecticut
| | - Nic Herndon
- Department of Ecology and Evolutionary Biology, University of Connecticut, Storrs, Connecticut
|
25
|
Hart AJ, Ginzburg S, Xu MS, Fisher CR, Rahmatpour N, Mitton JB, Paul R, Wegrzyn JL. EnTAP: Bringing faster and smarter functional annotation to non-model eukaryotic transcriptomes. Mol Ecol Resour 2019; 20:591-604. [PMID: 31628884 DOI: 10.1111/1755-0998.13106] [Citation(s) in RCA: 75] [Impact Index Per Article: 15.0] [Received: 06/10/2018] [Revised: 09/18/2019] [Accepted: 09/24/2019] [Indexed: 11/28/2022]
Abstract
EnTAP (Eukaryotic Non-Model Transcriptome Annotation Pipeline) was designed to improve the accuracy, speed, and flexibility of functional gene annotation for de novo assembled transcriptomes in non-model eukaryotes. This software package addresses the fragmentation and related assembly issues that result in inflated transcript estimates and poor annotation rates of protein-coding transcripts. Following filters applied through assessment of true expression and frame selection, open-source tools are leveraged to functionally annotate the reduced set of translated proteins. Downstream features include fast similarity search across five repositories, protein domain assignment, orthologous gene family assessment, and Gene Ontology (GO) term assignment. The final annotation integrates across multiple databases and selects an optimal assignment from a combination of weighted metrics describing similarity search score, taxonomic relationship, and informativeness. Researchers have the option to include additional filters to identify and remove contaminants, identify associated pathways, and prepare the transcripts for enrichment analysis. This fully featured pipeline is easy to install, configure, and runs significantly faster than comparable annotation packages. EnTAP is optimized to generate extensive functional information for the gene space of organisms with limited or poorly characterized genomic resources.
Affiliation(s)
- Alexander J Hart
- Department of Ecology and Evolutionary Biology, University of Connecticut, Storrs, CT, USA
| | - Samuel Ginzburg
- Department of Ecology and Evolutionary Biology, University of Connecticut, Storrs, CT, USA
| | - Muyang Sam Xu
- Department of Ecology and Evolutionary Biology, University of Connecticut, Storrs, CT, USA
| | - Cera R Fisher
- Department of Ecology and Evolutionary Biology, University of Connecticut, Storrs, CT, USA
| | - Nasim Rahmatpour
- Department of Ecology and Evolutionary Biology, University of Connecticut, Storrs, CT, USA
| | - Jeffry B Mitton
- Department of Ecology and Evolutionary Biology, University of Colorado Boulder, Boulder, CO, USA
| | - Robin Paul
- Department of Ecology and Evolutionary Biology, University of Connecticut, Storrs, CT, USA
| | - Jill L Wegrzyn
- Department of Ecology and Evolutionary Biology, University of Connecticut, Storrs, CT, USA
|
26
|
Visser EA, Wegrzyn JL, Steenkamp ET, Myburg AA, Naidoo S. Dual RNA-Seq Analysis of the Pine-Fusarium circinatum Interaction in Resistant (Pinus tecunumanii) and Susceptible (Pinus patula) Hosts. Microorganisms 2019; 7:E315. [PMID: 31487786 PMCID: PMC6780516 DOI: 10.3390/microorganisms7090315] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.0] [Received: 07/03/2019] [Revised: 08/15/2019] [Accepted: 08/20/2019] [Indexed: 12/15/2022] Open
Abstract
Fusarium circinatum poses a serious threat to many pine species in both commercial and natural pine forests. Knowledge regarding the molecular basis of pine-F. circinatum host-pathogen interactions could assist efforts to produce more resistant planting stock. This study aimed to identify molecular responses underlying resistance against F. circinatum. A dual RNA-seq approach was used to investigate host and pathogen expression in F. circinatum challenged Pinus tecunumanii (resistant) and Pinus patula (susceptible), at three- and seven-days post inoculation. RNA-seq reads were mapped to combined host-pathogen references for both pine species to identify differentially expressed genes (DEGs). F. circinatum genes expressed during infection showed decreased ergosterol biosynthesis in P. tecunumanii relative to P. patula. For P. tecunumanii, enriched gene ontologies and DEGs indicated roles for auxin-, ethylene-, jasmonate- and salicylate-mediated phytohormone signalling. Correspondingly, key phytohormone signaling components were down-regulated in P. patula. Key F. circinatum ergosterol biosynthesis genes were expressed at lower levels during infection of the resistant relative to the susceptible host. This study further suggests that coordination of phytohormone signaling is required for F. circinatum resistance in P. tecunumanii, while a comparatively delayed response and impaired phytohormone signaling contributes to susceptibility in P. patula.
Affiliation(s)
- Erik A Visser
- Department of Biochemistry, Genetics and Microbiology, Forestry and Agricultural Biotechnology Institute (FABI), Centre for Bioinformatics and Computational Biology, University of Pretoria, Private Bag X20, Pretoria 0028, South Africa
| | - Jill L Wegrzyn
- Department of Ecology and Evolutionary Biology, University of Connecticut, Storrs, CT 06269, USA
| | - Emma T Steenkamp
- Department of Biochemistry, Genetics and Microbiology, Forestry and Agricultural Biotechnology Institute (FABI), Centre for Bioinformatics and Computational Biology, University of Pretoria, Private Bag X20, Pretoria 0028, South Africa
| | - Alexander A Myburg
- Department of Biochemistry, Genetics and Microbiology, Forestry and Agricultural Biotechnology Institute (FABI), Centre for Bioinformatics and Computational Biology, University of Pretoria, Private Bag X20, Pretoria 0028, South Africa
| | - Sanushka Naidoo
- Department of Biochemistry, Genetics and Microbiology, Forestry and Agricultural Biotechnology Institute (FABI), Centre for Bioinformatics and Computational Biology, University of Pretoria, Private Bag X20, Pretoria 0028, South Africa
|
27
|
Wegrzyn JL, Staton MA, Street NR, Main D, Grau E, Herndon N, Buehler S, Falk T, Zaman S, Ramnath R, Richter P, Sun L, Condon B, Almsaeed A, Chen M, Mannapperuma C, Jung S, Ficklin S. Cyberinfrastructure to Improve Forest Health and Productivity: The Role of Tree Databases in Connecting Genomes, Phenomes, and the Environment. Front Plant Sci 2019; 10:813. [PMID: 31293610 PMCID: PMC6603172 DOI: 10.3389/fpls.2019.00813] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Received: 08/16/2018] [Accepted: 06/05/2019] [Indexed: 05/11/2023]
Abstract
Despite tremendous advancements in high throughput sequencing, the vast majority of tree genomes, and in particular, forest trees, remain elusive. Although primary databases store genetic resources for just over 2,000 forest tree species, these are largely focused on sequence storage, basic genome assemblies, and functional assignment through existing pipelines. The tree databases reviewed here serve as secondary repositories for community data. They vary in their focal species, the data they curate, and the analytics provided, but they are united in moving toward a goal of centralizing both data access and analysis. They provide frameworks to view and update annotations for complex genomes, interrogate systems level expression profiles, curate data for comparative genomics, and perform real-time analysis with genotype and phenotype data. The organism databases of today are no longer simply catalogs or containers of genetic information. These repositories represent integrated cyberinfrastructure that support cross-site queries and analysis in web-based environments. These resources are striving to integrate across diverse experimental designs, sequence types, and related measures through ontologies, community standards, and web services. Efficient, simple, and robust platforms that enhance the data generated by the research community, contribute to improving forest health and productivity.
Affiliation(s)
- Jill L. Wegrzyn
- Department of Ecology and Evolutionary Biology, University of Connecticut, Storrs, CT, United States
| | - Margaret A. Staton
- Department of Entomology and Plant Pathology, University of Tennessee, Knoxville, Knoxville, TN, United States
| | - Nathaniel R. Street
- Umeå Plant Science Centre, Department of Plant Physiology, Umeå University, Umeå, Sweden
| | - Dorrie Main
- Department of Horticulture, Washington State University, Pullman, WA, United States
| | - Emily Grau
- Department of Ecology and Evolutionary Biology, University of Connecticut, Storrs, CT, United States
| | - Nic Herndon
- Department of Ecology and Evolutionary Biology, University of Connecticut, Storrs, CT, United States
| | - Sean Buehler
- Department of Ecology and Evolutionary Biology, University of Connecticut, Storrs, CT, United States
| | - Taylor Falk
- Department of Ecology and Evolutionary Biology, University of Connecticut, Storrs, CT, United States
| | - Sumaira Zaman
- Department of Ecology and Evolutionary Biology, University of Connecticut, Storrs, CT, United States
| | - Risharde Ramnath
- Department of Ecology and Evolutionary Biology, University of Connecticut, Storrs, CT, United States
| | - Peter Richter
- Department of Ecology and Evolutionary Biology, University of Connecticut, Storrs, CT, United States
| | - Lang Sun
- Department of Ecology and Evolutionary Biology, University of Connecticut, Storrs, CT, United States
| | - Bradford Condon
- Department of Entomology and Plant Pathology, University of Tennessee, Knoxville, Knoxville, TN, United States
| | - Abdullah Almsaeed
- Department of Entomology and Plant Pathology, University of Tennessee, Knoxville, Knoxville, TN, United States
| | - Ming Chen
- Department of Entomology and Plant Pathology, University of Tennessee, Knoxville, Knoxville, TN, United States
| | - Chanaka Mannapperuma
- Umeå Plant Science Centre, Department of Plant Physiology, Umeå University, Umeå, Sweden
| | - Sook Jung
- Department of Horticulture, Washington State University, Pullman, WA, United States
| | - Stephen Ficklin
- Department of Horticulture, Washington State University, Pullman, WA, United States
|
28
|
Falk T, Herndon N, Grau E, Buehler S, Richter P, Zaman S, Baker EM, Ramnath R, Ficklin S, Staton M, Feltus FA, Jung S, Main D, Wegrzyn JL. Growing and cultivating the forest genomics database, TreeGenes. Database (Oxford) 2019; 2019:5375414. [PMID: 30865259 PMCID: PMC6424413 DOI: 10.1093/database/baz043] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Indexed: 11/22/2022]
Affiliation(s)
- Taylor Falk
- Department of Ecology and Evolutionary Biology, University of Connecticut, Storrs, CT, USA
| | - Nic Herndon
- Department of Ecology and Evolutionary Biology, University of Connecticut, Storrs, CT, USA
| | - Emily Grau
- Department of Ecology and Evolutionary Biology, University of Connecticut, Storrs, CT, USA
| | - Sean Buehler
- Department of Ecology and Evolutionary Biology, University of Connecticut, Storrs, CT, USA
| | - Peter Richter
- Department of Ecology and Evolutionary Biology, University of Connecticut, Storrs, CT, USA
| | - Sumaira Zaman
- Department of Ecology and Evolutionary Biology, University of Connecticut, Storrs, CT, USA
| | - Eliza M Baker
- Department of Ecology and Evolutionary Biology, University of Connecticut, Storrs, CT, USA
| | - Risharde Ramnath
- Department of Ecology and Evolutionary Biology, University of Connecticut, Storrs, CT, USA
| | - Stephen Ficklin
- Department of Horticulture, Washington State University, Pullman, WA, USA
| | - Margaret Staton
- Department of Entomology and Plant Pathology, University of Tennessee, Knoxville, TN, USA
| | - Frank A Feltus
- Department of Genetics and Biochemistry, Clemson University, Clemson, SC, USA
| | - Sook Jung
- Department of Horticulture, Washington State University, Pullman, WA, USA
| | - Doreen Main
- Department of Horticulture, Washington State University, Pullman, WA, USA
| | - Jill L Wegrzyn
- Department of Ecology and Evolutionary Biology, University of Connecticut, Storrs, CT, USA
|
29
|
Jung S, Lee T, Cheng CH, Buble K, Zheng P, Yu J, Humann J, Ficklin SP, Gasic K, Scott K, Frank M, Ru S, Hough H, Evans K, Peace C, Olmstead M, DeVetter LW, McFerson J, Coe M, Wegrzyn JL, Staton ME, Abbott AG, Main D. 15 years of GDR: New data and functionality in the Genome Database for Rosaceae. Nucleic Acids Res 2019; 47:D1137-D1145. [PMID: 30357347 PMCID: PMC6324069 DOI: 10.1093/nar/gky1000] [Citation(s) in RCA: 187] [Impact Index Per Article: 37.4] [Received: 09/01/2018] [Accepted: 10/09/2018] [Indexed: 12/13/2022] Open
Abstract
The Genome Database for Rosaceae (GDR, https://www.rosaceae.org) is an integrated web-based community database resource providing access to publicly available genomics, genetics and breeding data and data-mining tools to facilitate basic, translational and applied research in Rosaceae. The volume of data in GDR has increased greatly over the last 5 years. The GDR now houses multiple versions of whole genome assembly and annotation data from 14 species, made available by recent advances in sequencing technology. Annotated and searchable reference transcriptomes, RefTrans, combining peer-reviewed published RNA-Seq as well as EST datasets, are newly available for major crop species. Significantly more quantitative trait loci, genetic maps and markers are available in MapViewer, a new visualization tool that better integrates with other pages in GDR. Pathways can be accessed through the new GDR Cyc Pathways databases, and synteny among the newest genome assemblies from eight species can be viewed through the new synteny browser, SynView. Collated single-nucleotide polymorphism diversity data and phenotypic data from publicly available breeding datasets are integrated with other relevant data. Also, the new Breeding Information Management System allows breeders to upload, manage and analyze their private breeding data within the secure GDR server with an option to release data publicly.
Affiliation(s)
- Sook Jung
- Department of Horticulture, Washington State University, Pullman, WA 99164-6414, USA
| | - Taein Lee
- Department of Horticulture, Washington State University, Pullman, WA 99164-6414, USA
| | - Chun-Huai Cheng
- Department of Horticulture, Washington State University, Pullman, WA 99164-6414, USA
| | - Katheryn Buble
- Department of Horticulture, Washington State University, Pullman, WA 99164-6414, USA
| | - Ping Zheng
- Department of Horticulture, Washington State University, Pullman, WA 99164-6414, USA
| | - Jing Yu
- Department of Horticulture, Washington State University, Pullman, WA 99164-6414, USA
| | - Jodi Humann
- Department of Horticulture, Washington State University, Pullman, WA 99164-6414, USA
| | - Stephen P Ficklin
- Department of Horticulture, Washington State University, Pullman, WA 99164-6414, USA
| | - Ksenija Gasic
- Department of Plant and Environmental Sciences, Clemson University, Clemson, SC 29634-0310, USA
| | - Kristin Scott
- Department of Horticulture, Washington State University, Pullman, WA 99164-6414, USA
| | - Morgan Frank
- Department of Horticulture, Washington State University, Pullman, WA 99164-6414, USA
| | - Sushan Ru
- Department of Agronomy and Plant Genetics, University of Minnesota, St Paul, MN 55108, USA
| | - Heidi Hough
- Department of Horticulture, Washington State University, Pullman, WA 99164-6414, USA
| | - Kate Evans
- Department of Horticulture, Washington State University Tree Fruit Research and Extension Center, Wenatchee, WA 98801, USA
| | - Cameron Peace
- Department of Horticulture, Washington State University, Pullman, WA 99164-6414, USA
| | - Mercy Olmstead
- Horticultural Sciences Department, University of Florida, Gainesville, FL 32611, USA
| | - Lisa W DeVetter
- Department of Horticulture, Washington State University, Northwestern Washington Research and Extension Center, Mount Vernon, WA 98273, USA
| | - James McFerson
- Department of Horticulture, Washington State University Tree Fruit Research and Extension Center, Wenatchee, WA 98801, USA
| | - Michael Coe
- Cedar Lake Research Group, LLC, Portland, OR 97293, USA
| | - Jill L Wegrzyn
- Department of Ecology and Evolutionary Biology, University of Connecticut, Storrs, CT 06269, USA
| | - Margaret E Staton
- Department of Entomology and Plant Pathology, University of Tennessee, Knoxville, TN 37996, USA
| | - Albert G Abbott
- Forest Health Research and Extension Center, University of Kentucky, Lexington, KY 40546-0091, USA
| | - Dorrie Main
- Department of Horticulture, Washington State University, Pullman, WA 99164-6414, USA
|
30
|
Buble K, Jung S, Humann JL, Yu J, Cheng CH, Lee T, Ficklin SP, Hough H, Condon B, Staton ME, Wegrzyn JL, Main D. Tripal MapViewer: A tool for interactive visualization and comparison of genetic maps. Database (Oxford) 2019; 2019:baz100. [PMID: 31688940 PMCID: PMC6829499 DOI: 10.1093/database/baz100] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 12/21/2018] [Revised: 06/09/2019] [Accepted: 07/16/2019] [Indexed: 11/14/2022]
Abstract
Tripal is an open-source, resource-efficient toolkit for construction of genomic, genetic and breeding databases. It facilitates development of biological websites by providing tools to integrate and display biological data using the generic database schema, Chado, together with Drupal, a popular website creation and content management system. Tripal MapViewer is a new interactive tool for visualizing genetic map data. Developed as a Tripal replacement for Comparative Map Viewer (CMap), it enables visualization of entire maps or linkage groups and features such as molecular markers, quantitative trait loci (QTLs) and heritable phenotypic markers. It also provides graphical comparison of maps sharing the same markers as well as dot plot and correspondence matrices. MapViewer integrates directly with the Tripal application programming interface framework, improving data searching capability and providing a more seamless experience for site visitors. The Tripal MapViewer interface can be integrated in any Tripal map page and linked from any Tripal page for markers, QTLs, heritable morphological markers or genes. Configuration of the display is available through a control panel and the administration interface. The administration interface also allows configuration of the custom database query for building materialized views, providing better performance and flexibility in the way data is stored in the Chado database schema. MapViewer is implemented with the D3.js technology and is currently being used at the Genome Database for Rosaceae (https://www.rosaceae.org), CottonGen (https://www.cottongen.org), Citrus Genome Database (https://citrusgenomedb.org), Vaccinium Genome Database (https://www.vaccinium.org) and Cool Season Food Legume Database (https://www.coolseasonfoodlegume.org). It is also currently in development on the Hardwood Genomics Web (https://hardwoodgenomics.org) and TreeGenes (https://treegenesdb.org). 
Database URL: https://gitlab.com/mainlabwsu/tripal_map.
Affiliation(s)
- Katheryn Buble
- Department of Horticulture, Washington State University, Pullman, WA 99164-6414, USA
| | - Sook Jung
- Department of Horticulture, Washington State University, Pullman, WA 99164-6414, USA
| | - Jodi L Humann
- Department of Horticulture, Washington State University, Pullman, WA 99164-6414, USA
| | - Jing Yu
- Department of Horticulture, Washington State University, Pullman, WA 99164-6414, USA
| | - Chun-Huai Cheng
- Department of Horticulture, Washington State University, Pullman, WA 99164-6414, USA
| | - Taein Lee
- Department of Horticulture, Washington State University, Pullman, WA 99164-6414, USA
| | - Stephen P Ficklin
- Department of Horticulture, Washington State University, Pullman, WA 99164-6414, USA
| | - Heidi Hough
- Department of Horticulture, Washington State University, Pullman, WA 99164-6414, USA
| | - Bradford Condon
- Department of Entomology and Plant Pathology, University of Tennessee, Knoxville, TN 37996, USA
| | - Margaret E Staton
- Department of Entomology and Plant Pathology, University of Tennessee, Knoxville, TN 37996, USA
| | - Jill L Wegrzyn
- Department of Ecology and Evolutionary Biology, University of Connecticut, Storrs, CT 06269, USA
| | - Dorrie Main
- Department of Horticulture, Washington State University, Pullman, WA 99164-6414, USA
|
31
|
Visser EA, Wegrzyn JL, Myburg AA, Naidoo S. Defence transcriptome assembly and pathogenesis related gene family analysis in Pinus tecunumanii (low elevation). BMC Genomics 2018; 19:632. [PMID: 30139335 PMCID: PMC6108113 DOI: 10.1186/s12864-018-5015-0] [Citation(s) in RCA: 24] [Impact Index Per Article: 4.0] [Received: 01/03/2018] [Accepted: 08/14/2018] [Indexed: 12/23/2022] Open
Abstract
BACKGROUND Fusarium circinatum is a pressing threat to the cultivation of many economically important pine tree species. Efforts to develop effective disease management strategies can be aided by investigating the molecular mechanisms involved in the host-pathogen interaction between F. circinatum and pine species. Pinus tecunumanii and Pinus patula are two closely related tropical pine species that differ widely in their resistance to F. circinatum challenge, being resistant and susceptible respectively, providing the potential for a useful pathosystem to investigate the molecular responses underlying resistance to F. circinatum. However, no genomic resources are available for P. tecunumanii. Pathogenesis-related proteins are classes of proteins that play important roles in plant-microbe interactions, e.g. chitinases; proteins that break down the major structural component of fungal cell walls. Generating a reference sequence for P. tecunumanii and characterizing pathogenesis related gene families in these two pine species is an important step towards unravelling the pine-F. circinatum interaction. RESULTS Eight reference based and 12 de novo assembled transcriptomes were produced, for juvenile shoot tissue from both species. EvidentialGene pipeline redundancy reduction, expression filtering, protein clustering and taxonomic filtering produced a 50 Mb shoot transcriptome consisting of 28,621 contigs for P. tecunumanii and a 72 Mb shoot transcriptome consisting of 52,735 contigs for P. patula. Predicted protein sequences encoded by the assembled transcriptomes were clustered with reference proteomes from 92 other species to identify pathogenesis related gene families in P. patula, P. tecunumanii and other pine species. CONCLUSIONS The P. tecunumanii transcriptome is the first gene catalogue for the species, representing an important resource for studying resistance to the pitch canker pathogen, F. circinatum. 
This study also constitutes, to our knowledge, the largest index of gymnosperm PR-genes to date.
Affiliation(s)
- Erik A. Visser
- Department of Biochemistry, Genetics and Microbiology, Forestry and Agricultural Biotechnology Institute (FABI), Genomics Research Institute (GRI), University of Pretoria, Private bag X20, Pretoria, 0028 South Africa
| | - Jill L. Wegrzyn
- Department of Ecology and Evolutionary Biology, University of Connecticut, Storrs, CT 06269 USA
| | - Alexander A. Myburg
- Department of Biochemistry, Genetics and Microbiology, Forestry and Agricultural Biotechnology Institute (FABI), Genomics Research Institute (GRI), University of Pretoria, Private bag X20, Pretoria, 0028 South Africa
| | - Sanushka Naidoo
- Department of Biochemistry, Genetics and Microbiology, Forestry and Agricultural Biotechnology Institute (FABI), Genomics Research Institute (GRI), University of Pretoria, Private bag X20, Pretoria, 0028 South Africa
|
32
|
Falk T, Herndon N, Grau E, Buehler S, Richter P, Zaman S, Baker EM, Ramnath R, Ficklin S, Staton M, Feltus FA, Jung S, Main D, Wegrzyn JL. Growing and cultivating the forest genomics database, TreeGenes. Database (Oxford) 2018; 2018:1-11. [PMID: 30239664 PMCID: PMC6146132 DOI: 10.1093/database/bay084] [Citation(s) in RCA: 24] [Impact Index Per Article: 4.0] [Received: 05/04/2018] [Accepted: 07/20/2018] [Indexed: 11/15/2022]
Abstract
Forest trees are valued sources of pulp, timber and biofuels, and serve a role in carbon sequestration, biodiversity maintenance and watershed stability. Examining the relationships among genetic, phenotypic and environmental factors for these species provides insight on the areas of concern for breeders and researchers alike. The TreeGenes database is a web-based repository that is home to 1790 tree species and over 1500 registered users. The database provides a curated archive for high-throughput genomics, including reference genomes, transcriptomes, genetic maps and variant data. These resources are paired with extensive phenotypic information and environmental layers. TreeGenes recently migrated to Tripal, an integrated and open-source database schema and content management system. This migration enabled developments focused on data exchange, data transfer and improved analytical capacity, as well as providing TreeGenes the opportunity to communicate with the following partner databases: Hardwood Genomics Web, Genome Database for Rosaceae, and the Citrus Genome Database. Recent development in TreeGenes has focused on coordinating information for georeferenced accessions, including metadata acquisition and ontological frameworks, to improve integration across studies combining genetic, phenotypic and environmental data. This focus was paired with the development of tools to enable comparative genomics and data visualization. By combining advanced data importers, relevant metadata standards and integrated analytical frameworks, TreeGenes provides a platform for researchers to store, submit and analyze forest tree data.
Affiliation(s)
- Taylor Falk
- Department of Ecology and Evolutionary Biology, University of Connecticut, Storrs, CT, USA
| | - Nic Herndon
- Department of Ecology and Evolutionary Biology, University of Connecticut, Storrs, CT, USA
| | - Emily Grau
- Department of Ecology and Evolutionary Biology, University of Connecticut, Storrs, CT, USA
| | - Sean Buehler
- Department of Ecology and Evolutionary Biology, University of Connecticut, Storrs, CT, USA
| | - Peter Richter
- Department of Ecology and Evolutionary Biology, University of Connecticut, Storrs, CT, USA
| | - Sumaira Zaman
- Department of Ecology and Evolutionary Biology, University of Connecticut, Storrs, CT, USA
| | - Eliza M Baker
- Department of Ecology and Evolutionary Biology, University of Connecticut, Storrs, CT, USA
| | - Risharde Ramnath
- Department of Ecology and Evolutionary Biology, University of Connecticut, Storrs, CT, USA
| | - Stephen Ficklin
- Department of Horticulture, Washington State University, Pullman, WA, USA
| | - Margaret Staton
- Department of Entomology and Plant Pathology, University of Tennessee, Knoxville, TN, USA
| | - Frank A Feltus
- Department of Genetics and Biochemistry, Clemson University, Clemson, SC, USA
| | - Sook Jung
- Department of Horticulture, Washington State University, Pullman, WA, USA
| | - Doreen Main
- Department of Horticulture, Washington State University, Pullman, WA, USA
| | - Jill L Wegrzyn
- Department of Ecology and Evolutionary Biology, University of Connecticut, Storrs, CT, USA
|
33
|
Zimin AV, Stevens KA, Crepeau MW, Puiu D, Wegrzyn JL, Yorke JA, Langley CH, Neale DB, Salzberg SL. Erratum to: An improved assembly of the loblolly pine mega-genome using long-read single-molecule sequencing. Gigascience 2017; 6:1. [PMID: 29020755 PMCID: PMC5632297 DOI: 10.1093/gigascience/gix072] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Indexed: 11/29/2022] Open
Abstract
The 22-gigabase genome of loblolly pine (Pinus taeda) is one of the largest ever sequenced. The draft assembly published in 2014 was built entirely from short Illumina reads, with lengths ranging from 100 to 250 base pairs (bp). The assembly was quite fragmented, containing over 11 million contigs whose weighted average (N50) size was 8206 bp. To improve this result, we generated approximately 12-fold coverage in long reads using the Single Molecule Real Time sequencing technology developed at Pacific Biosciences. We assembled the long and short reads together using the MaSuRCA mega-reads assembly algorithm, which produced a substantially better assembly, P. taeda version 2.0. The new assembly has an N50 contig size of 25 361, more than three times as large as achieved in the original assembly, and an N50 scaffold size of 107 821, 61% larger than the previous assembly.
|
34
|
Zimin AV, Stevens KA, Crepeau MW, Puiu D, Wegrzyn JL, Yorke JA, Langley CH, Neale DB, Salzberg SL. An improved assembly of the loblolly pine mega-genome using long-read single-molecule sequencing. Gigascience 2017; 6:1-4. [PMID: 28369353 PMCID: PMC5437942 DOI: 10.1093/gigascience/giw016] [Citation(s) in RCA: 50] [Impact Index Per Article: 7.1] [Received: 10/04/2016] [Accepted: 12/21/2016] [Indexed: 11/30/2022] Open
Abstract
The 22-gigabase genome of loblolly pine (Pinus taeda) is one of the largest ever sequenced. The draft assembly published in 2014 was built entirely from short Illumina reads, with lengths ranging from 100 to 250 base pairs (bp). The assembly was quite fragmented, containing over 11 million contigs whose weighted average (N50) size was 8206 bp. To improve this result, we generated approximately 12-fold coverage in long reads using the Single Molecule Real Time sequencing technology developed at Pacific Biosciences. We assembled the long and short reads together using the MaSuRCA mega-reads assembly algorithm, which produced a substantially better assembly, P. taeda version 2.0. The new assembly has an N50 contig size of 25 361, more than three times as large as achieved in the original assembly, and an N50 scaffold size of 107 821, 61% larger than the previous assembly.
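The N50 statistic quoted in this abstract can be illustrated with a minimal sketch. It assumes the common definition of N50 (the length of the shortest contig in the smallest set of longest contigs that together cover at least half of the total assembly length); the function name `n50` and the toy contig lengths are hypothetical and are not taken from the paper.

```python
def n50(lengths):
    """Return the N50 of a collection of contig (or scaffold) lengths.

    Contigs are visited from longest to shortest; the N50 is the length
    of the contig at which the running total first reaches half of the
    summed length of all contigs.
    """
    if not lengths:
        raise ValueError("need at least one contig length")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        # Compare doubled running total to avoid fractional halves.
        if 2 * running >= total:
            return length


# Hypothetical toy assembly, lengths in bp (illustrative only).
toy_contigs = [2, 3, 4, 5, 6, 10]
print(n50(toy_contigs))  # total is 30; 10 + 6 = 16 >= 15, so N50 is 6
```

Because the statistic is weighted by length, a few long contigs raise N50 far more than many short ones, which is why adding long reads can improve it even when the contig count stays large.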
Affiliation(s)
- Aleksey V Zimin
- Institute for Physical Sciences and Technology, University of Maryland, College Park, MD.,Center for Computational Biology, McKusick-Nathans Institute of Genetic Medicine, Johns Hopkins School of Medicine, Baltimore, MD
| | - Kristian A Stevens
- Department of Evolution and Ecology, University of California, Davis, CA
| | - Marc W Crepeau
- Department of Evolution and Ecology, University of California, Davis, CA
| | - Daniela Puiu
- Center for Computational Biology, McKusick-Nathans Institute of Genetic Medicine, Johns Hopkins School of Medicine, Baltimore, MD
| | - Jill L Wegrzyn
- Department of Ecology and Evolutionary Biology, University of Connecticut, Storrs, CT
| | - James A Yorke
- Institute for Physical Sciences and Technology, University of Maryland, College Park, MD
| | - Charles H Langley
- Department of Evolution and Ecology, University of California, Davis, CA
| | - David B Neale
- Department of Plant Sciences, University of California, Davis, CA
| | - Steven L Salzberg
- Center for Computational Biology, McKusick-Nathans Institute of Genetic Medicine, Johns Hopkins School of Medicine, Baltimore, MD.,Departments of Biomedical Engineering, Computer Science, and Biostatistics, Johns Hopkins University, Baltimore, MD
| |
|
35
|
Sablok G, Chen TW, Lee CC, Yang C, Gan RC, Wegrzyn JL, Porta NL, Nayak KC, Huang PJ, Varotto C, Tang P. ChloroMitoCU: Codon patterns across organelle genomes for functional genomics and evolutionary applications. DNA Res 2017; 24:327-332. [PMID: 28419256 PMCID: PMC5499650 DOI: 10.1093/dnares/dsw044] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 03/18/2016] [Accepted: 09/14/2016] [Indexed: 01/01/2023] Open
Abstract
Organelle genomes are widely thought to have arisen from reduction events involving cyanobacterial and archaeal genomes, in the case of chloroplasts, or α-proteobacterial genomes, in the case of mitochondria. Heterogeneity in base composition and codon preference has long been the subject of investigation of topics ranging from phylogenetic distortion to the design of overexpression cassettes for transgenic expression. From the overexpression point of view, it is critical to systematically analyze the codon usage patterns of the organelle genomes. In light of the importance of codon usage patterns in the development of hyper-expression organelle transgenics, we present ChloroMitoCU, the first-ever curated, web-based reference catalog of the codon usage patterns in organelle genomes. ChloroMitoCU contains the pre-compiled codon usage patterns of 328 chloroplast genomes (29,960 CDS) and 3,502 mitochondrial genomes (49,066 CDS), enabling genome-wide exploration and comparative analysis of codon usage patterns across species. ChloroMitoCU allows the phylogenetic comparison of codon usage patterns across organelle genomes, the prediction of codon usage patterns based on user-submitted transcripts or assembled organelle genes, and comparative analysis with the pre-compiled patterns across species of interest. ChloroMitoCU can increase our understanding of the biased patterns of codon usage in organelle genomes across multiple clades. ChloroMitoCU can be accessed at: http://chloromitocu.cgu.edu.tw/
Affiliation(s)
- Gaurav Sablok
- Department of Biodiversity and Molecular Ecology, Research and Innovation Centre, Fondazione Edmund Mach, Via E. Mach 1, 38010 S. Michele all'Adige (TN), Italy
| | - Ting-Wen Chen
- Bioinformatics Core Laboratory, Molecular Medicine Research Center, Chang Gung University, Kweishan, Taoyuan 333, Taiwan
| | - Chi-Ching Lee
- Bioinformatics Core Laboratory, Molecular Medicine Research Center, Chang Gung University, Kweishan, Taoyuan 333, Taiwan
| | - Chi Yang
- Bioinformatics Core Laboratory, Molecular Medicine Research Center, Chang Gung University, Kweishan, Taoyuan 333, Taiwan
| | - Ruei-Chi Gan
- Bioinformatics Core Laboratory, Molecular Medicine Research Center, Chang Gung University, Kweishan, Taoyuan 333, Taiwan
| | - Jill L Wegrzyn
- Department of Ecology and Evolutionary Biology, University of Connecticut, 75 North Eagleville Road, Storrs, CT 06269-3043 USA
| | - Nicola L Porta
- Department of Sustainable Agrobiosystems and Bioresources, Research and Innovation Centre, Fondazione Edmund Mach, Via E. Mach 1, 38010 S. Michele all'Adige (TN), Italy.,MOUNTFOR Project Centre, European Forest Institute, Via E. Mach 1, 38010 San Michele all'Adige, Trento, Italy
| | - Kinshuk C Nayak
- Bioinformatics Centre, Institute of Life Sciences, Department of Biotechnology, Govt. India, Nalco Square, Bhubaneswar - 751 023, India
| | - Po-Jung Huang
- Bioinformatics Core Laboratory, Molecular Medicine Research Center, Chang Gung University, Kweishan, Taoyuan 333, Taiwan
| | - Claudio Varotto
- Department of Biodiversity and Molecular Ecology, Research and Innovation Centre, Fondazione Edmund Mach, Via E. Mach 1, 38010 S. Michele all'Adige (TN), Italy
| | - Petrus Tang
- Bioinformatics Core Laboratory, Molecular Medicine Research Center, Chang Gung University, Kweishan, Taoyuan 333, Taiwan.,Molecular Infectious Diseases Research Center, Chang Gung Memorial Hospital, Kweishan, Taoyuan 333, Taiwan
| |
|
36
|
Cronn R, Dolan PC, Jogdeo S, Wegrzyn JL, Neale DB, St Clair JB, Denver DR. Transcription through the eye of a needle: daily and annual cyclic gene expression variation in Douglas-fir needles. BMC Genomics 2017; 18:558. [PMID: 28738815 PMCID: PMC5525293 DOI: 10.1186/s12864-017-3916-y] [Citation(s) in RCA: 25] [Impact Index Per Article: 3.6] [Received: 11/17/2016] [Accepted: 06/30/2017] [Indexed: 12/12/2022] Open
Abstract
BACKGROUND Perennial growth in plants is the product of interdependent cycles of daily and annual stimuli that induce cycles of growth and dormancy. In conifers, needles are the key perennial organ that integrates daily and seasonal signals from light, temperature, and water availability. To understand the relationship between seasonal cycles and seasonal gene expression responses in conifers, we examined needle mRNA accumulation in Douglas-fir (Pseudotsuga menziesii) at diurnal and circannual scales. Using mRNA sequencing, we sampled 6.1 × 10^9 reads from 19 trees and constructed a de novo pan-transcriptome reference that includes 173,882 tree-derived transcripts. Using this reference, we mapped RNA-Seq reads from 179 samples that capture daily and annual variation. RESULTS We identified 12,042 diurnally-cyclic transcripts, 9299 of which showed homology to annotated genes from other plant genomes, including angiosperm core clock genes. Annual analysis revealed 21,225 circannual transcripts, 17,335 of which showed homology to annotated genes from other plant genomes. The timing of maximum gene expression is associated with light intensity at diurnal scales and photoperiod at annual scales, with approximately half of transcripts reaching maximum expression ±2 h from sunrise and sunset, and ±20 days from winter and summer solstices. Comparisons with published studies from other conifers show congruent behavior in clock genes with Japanese cedar (Cryptomeria), and a significant preservation of gene expression patterns for 2278 putative orthologs from Douglas-fir during the summer growing season, and 760 putative orthologs from spruce (Picea) during the transition from fall to winter. CONCLUSIONS Our study highlights the extensive diurnal and circannual transcriptome variability demonstrated in conifer needles. At these temporal scales, 29% of expressed transcripts show a significant diurnal cycle, and 58.7% show a significant circannual cycle. Remarkably, thousands of genes reach their annual peak activity during winter dormancy. Our study establishes the fine-scale timing of daily and annual maximum gene expression for diverse needle genes in Douglas-fir, and it highlights the potential for using this information for evaluating hypotheses concerning the daily or seasonal timing of gene activity in temperate-zone conifers, and for identifying cyclic transcriptome components in other conifer species.
Affiliation(s)
- Richard Cronn
- Pacific Northwest Research Station, USDA Forest Service, Corvallis, OR, 97331, USA.
| | - Peter C Dolan
- University of Minnesota - Morris, Morris, MN, 56267, USA
| | - Sanjuro Jogdeo
- Department of Integrative Biology, Oregon State University, Corvallis, OR, 97331, USA
| | - Jill L Wegrzyn
- Department of Ecology and Evolutionary Biology, University of Connecticut, Storrs, CT, 06269, USA
| | - David B Neale
- Department of Plant Sciences, University of California - Davis, Davis, CA, 95616, USA
| | - J Bradley St Clair
- Pacific Northwest Research Station, USDA Forest Service, Corvallis, OR, 97331, USA
| | - Dee R Denver
- Department of Integrative Biology, Oregon State University, Corvallis, OR, 97331, USA
| |
|
37
|
Lind BM, Friedline CJ, Wegrzyn JL, Maloney PE, Vogler DR, Neale DB, Eckert AJ. Water availability drives signatures of local adaptation in whitebark pine (Pinus albicaulis Engelm.) across fine spatial scales of the Lake Tahoe Basin, USA. Mol Ecol 2017; 26:3168-3185. [PMID: 28316116 DOI: 10.1111/mec.14106] [Citation(s) in RCA: 43] [Impact Index Per Article: 6.1] [Received: 09/13/2016] [Revised: 03/03/2017] [Accepted: 03/06/2017] [Indexed: 12/18/2022]
Abstract
Patterns of local adaptation at fine spatial scales are central to understanding how evolution proceeds, and are essential to the effective management of economically and ecologically important forest tree species. Here, we employ single and multilocus analyses of genetic data (n = 116 231 SNPs) to describe signatures of fine-scale adaptation within eight whitebark pine (Pinus albicaulis Engelm.) populations across the local extent of the environmentally heterogeneous Lake Tahoe Basin, USA. We show that despite highly shared genetic variation (FST = 0.0069), there is strong evidence for adaptation to the rain shadow experienced across the eastern Sierra Nevada. Specifically, we build upon evidence from a common garden study and find that allele frequencies of loci associated with four phenotypes (mean = 236 SNPs), 18 environmental variables (mean = 99 SNPs), and those detected through genetic differentiation (n = 110 SNPs) exhibit significantly higher signals of selection (covariance of allele frequencies) than could be expected to arise, given the data. We also provide evidence that this covariance tracks environmental measures related to soil water availability through subtle allele frequency shifts across populations. Our results replicate empirical support for theoretical expectations of local adaptation for populations exhibiting strong gene flow and high selective pressures and suggest that ongoing adaptation of many P. albicaulis populations within the Lake Tahoe Basin will not be constrained by the lack of genetic variation. Even so, some populations exhibit low levels of heritability for the traits presumed to be related to fitness. These instances could be used to prioritize management to maintain adaptive potential. Overall, we suggest that established practices regarding whitebark pine conservation be maintained, with the additional context of fine-scale adaptation.
Affiliation(s)
- Brandon M Lind
- Integrative Life Sciences Program, Virginia Commonwealth University, Richmond, VA, 23284, USA
| | | | - Jill L Wegrzyn
- Department of Ecology and Evolutionary Biology, University of Connecticut, Storrs, CT, 06269, USA
| | - Patricia E Maloney
- Department of Plant Pathology and Tahoe Environmental Research Center, University of California, Davis, CA, 95616, USA
| | - Detlev R Vogler
- USDA, Forest Service, Pacific Southwest Research Station, Institute of Forest Genetics, 2480 Carson Road, Placerville, CA, 95667, USA
| | - David B Neale
- Department of Plant Sciences, University of California, Davis, CA, 95616, USA
| | - Andrew J Eckert
- Department of Biology, Virginia Commonwealth University, Richmond, VA, 23284, USA
| |
|
38
|
Velotta JP, Wegrzyn JL, Ginzburg S, Kang L, Czesny S, O'Neill RJ, McCormick SD, Michalak P, Schultz ET. Transcriptomic imprints of adaptation to fresh water: parallel evolution of osmoregulatory gene expression in the Alewife. Mol Ecol 2017; 26:831-848. [DOI: 10.1111/mec.13983] [Citation(s) in RCA: 46] [Impact Index Per Article: 6.6] [Received: 01/26/2016] [Revised: 11/15/2016] [Accepted: 11/18/2016] [Indexed: 01/08/2023]
Affiliation(s)
- Jonathan P. Velotta
- Department of Ecology and Evolutionary Biology; University of Connecticut; Storrs CT 06269-3043 USA
| | - Jill L. Wegrzyn
- Department of Ecology and Evolutionary Biology; University of Connecticut; Storrs CT 06269-3043 USA
| | - Samuel Ginzburg
- Department of Ecology and Evolutionary Biology; University of Connecticut; Storrs CT 06269-3043 USA
| | - Lin Kang
- Department of Biological Sciences; Virginia Bioinformatics Institute; Virginia Tech; Blacksburg VA 24061 USA
| | - Sergiusz Czesny
- Lake Michigan Biological Station; Illinois Natural History Survey; University of Illinois; Zion IL 60099 USA
| | - Rachel J. O'Neill
- Department of Molecular and Cell Biology; University of Connecticut; Storrs CT 06269-3125 USA
| | - Stephen D. McCormick
- Conte Anadromous Fish Research Center; U.S. Geological Survey; Turners Falls MA 01376 USA
| | - Pawel Michalak
- Department of Biological Sciences; Virginia Bioinformatics Institute; Virginia Tech; Blacksburg VA 24061 USA
| | - Eric T. Schultz
- Department of Ecology and Evolutionary Biology; University of Connecticut; Storrs CT 06269-3043 USA
| |
|
39
|
Gonzalez-Ibeas D, Martinez-Garcia PJ, Famula RA, Delfino-Mix A, Stevens KA, Loopstra CA, Langley CH, Neale DB, Wegrzyn JL. Assessing the Gene Content of the Megagenome: Sugar Pine (Pinus lambertiana). G3 (Bethesda) 2016; 6:3787-3802. [PMID: 27799338 PMCID: PMC5144951 DOI: 10.1534/g3.116.032805] [Citation(s) in RCA: 29] [Impact Index Per Article: 3.6] [Received: 06/24/2016] [Accepted: 07/13/2016] [Indexed: 02/06/2023]
Abstract
Sugar pine (Pinus lambertiana Douglas) is within the subgenus Strobus with an estimated genome size of 31 Gbp. Transcriptomic resources are of particular interest in conifers due to the challenges presented in their megagenomes for gene identification. In this study, we present the first comprehensive survey of the P. lambertiana transcriptome through deep sequencing of a variety of tissue types to generate more than 2.5 billion short reads. Third generation, long reads generated through PacBio Iso-Seq have been included for the first time in conifers to combat the challenges associated with de novo transcriptome assembly. A technology comparison is provided here to contribute to the otherwise scarce comparisons of second and third generation transcriptome sequencing approaches in plant species. In addition, the transcriptome reference was essential for gene model identification and quality assessment in the parallel project responsible for sequencing and assembly of the entire genome. In this study, the transcriptomic data were also used to address questions surrounding lineage-specific Dicer-like proteins in conifers. These proteins play a role in the control of transposable element proliferation and the related genome expansion in conifers.
Affiliation(s)
- Daniel Gonzalez-Ibeas
- Department of Ecology and Evolutionary Biology, University of Connecticut, Storrs, Connecticut 06269
| | | | - Randi A Famula
- Department of Plant Sciences, University of California, Davis, California 95616
| | - Annette Delfino-Mix
- United States Department of Agriculture Forest Service, Institute of Forest Genetics, Placerville, California 95667
| | - Kristian A Stevens
- Department of Evolution and Ecology, University of California, Davis, California 95616
| | - Carol A Loopstra
- Department of Ecosystem Science and Management, Texas A&M University, College Station, Texas 77843
| | - Charles H Langley
- Department of Evolution and Ecology, University of California, Davis, California 95616
| | - David B Neale
- Department of Plant Sciences, University of California, Davis, California 95616
| | - Jill L Wegrzyn
- Department of Ecology and Evolutionary Biology, University of Connecticut, Storrs, Connecticut 06269
| |
|
40
|
Martínez-García PJ, Crepeau MW, Puiu D, Gonzalez-Ibeas D, Whalen J, Stevens KA, Paul R, Butterfield TS, Britton MT, Reagan RL, Chakraborty S, Walawage SL, Vasquez-Gross HA, Cardeno C, Famula RA, Pratt K, Kuruganti S, Aradhya MK, Leslie CA, Dandekar AM, Salzberg SL, Wegrzyn JL, Langley CH, Neale DB. The walnut (Juglans regia) genome sequence reveals diversity in genes coding for the biosynthesis of non-structural polyphenols. Plant J 2016; 87:507-32. [PMID: 27145194 DOI: 10.1111/tpj.13207] [Citation(s) in RCA: 118] [Impact Index Per Article: 14.8] [Received: 09/21/2015] [Revised: 04/22/2016] [Accepted: 04/27/2016] [Indexed: 05/18/2023]
Abstract
The Persian walnut (Juglans regia L.), a diploid species native to the mountainous regions of Central Asia, is the major walnut species cultivated for nut production and is one of the most widespread tree nut species in the world. The high nutritional value of J. regia nuts is associated with a rich array of polyphenolic compounds, whose complete biosynthetic pathways are still unknown. A J. regia genome sequence was obtained from the cultivar 'Chandler' to discover target genes and additional unknown genes. The 667-Mbp genome was assembled using two different methods (SOAPdenovo2 and MaSuRCA), with an N50 scaffold size of 464 955 bp (based on a genome size of 606 Mbp), 221 640 contigs and a GC content of 37%. Annotation with MAKER-P and other genomic resources yielded 32 498 gene models. Previous studies in walnut relying on tissue-specific methods have only identified a single polyphenol oxidase (PPO) gene (JrPPO1). Enabled by the J. regia genome sequence, a second homolog of PPO (JrPPO2) was discovered. In addition, about 130 genes in the large gallate 1-β-glucosyltransferase (GGT) superfamily were detected. Specifically, two genes, JrGGT1 and JrGGT2, were significantly homologous to the GGT from Quercus robur (QrGGT), which is involved in the synthesis of 1-O-galloyl-β-d-glucose, a precursor for the synthesis of hydrolysable tannins. The reference genome for J. regia provides meaningful insight into the complex pathways required for the synthesis of polyphenols. The walnut genome sequence provides important tools and methods to accelerate breeding and to facilitate the genetic dissection of complex traits.
Affiliation(s)
| | - Marc W Crepeau
- Department of Evolution and Ecology, University of California, Davis, CA, 95616, USA
| | - Daniela Puiu
- Center for Computational Biology, McKusick-Nathans Institute of Genetic Medicine, Johns Hopkins University, Baltimore, MD, 21205, USA
| | - Daniel Gonzalez-Ibeas
- Department of Ecology and Evolutionary Biology, University of Connecticut, Storrs, CT, 06269-3043, USA
| | - Jeanne Whalen
- Department of Ecology and Evolutionary Biology, University of Connecticut, Storrs, CT, 06269-3043, USA
| | - Kristian A Stevens
- Department of Evolution and Ecology, University of California, Davis, CA, 95616, USA
| | - Robin Paul
- Department of Ecology and Evolutionary Biology, University of Connecticut, Storrs, CT, 06269-3043, USA
| | | | | | - Russell L Reagan
- Department of Plant Sciences, University of California, Davis, CA, 95616, USA
| | - Sandeep Chakraborty
- Department of Plant Sciences, University of California, Davis, CA, 95616, USA
| | - Sriema L Walawage
- Department of Plant Sciences, University of California, Davis, CA, 95616, USA
| | | | - Charis Cardeno
- Department of Evolution and Ecology, University of California, Davis, CA, 95616, USA
| | - Randi A Famula
- Department of Plant Sciences, University of California, Davis, CA, 95616, USA
| | - Kevin Pratt
- Department of Ecology and Evolutionary Biology, University of Connecticut, Storrs, CT, 06269-3043, USA
| | - Sowmya Kuruganti
- Department of Ecology and Evolutionary Biology, University of Connecticut, Storrs, CT, 06269-3043, USA
| | | | - Charles A Leslie
- Department of Plant Sciences, University of California, Davis, CA, 95616, USA
| | - Abhaya M Dandekar
- Department of Plant Sciences, University of California, Davis, CA, 95616, USA
| | - Steven L Salzberg
- Center for Computational Biology, McKusick-Nathans Institute of Genetic Medicine, Johns Hopkins University, Baltimore, MD, 21205, USA
- Departments of Biomedical Engineering, Computer Science, and Biostatistics, Johns Hopkins University, Baltimore, MD, 21205, USA
| | - Jill L Wegrzyn
- Department of Ecology and Evolutionary Biology, University of Connecticut, Storrs, CT, 06269-3043, USA
| | - Charles H Langley
- Department of Evolution and Ecology, University of California, Davis, CA, 95616, USA
| | - David B Neale
- Department of Plant Sciences, University of California, Davis, CA, 95616, USA.
| |
|
41
|
Visser EA, Wegrzyn JL, Steenkmap ET, Myburg AA, Naidoo S. Combined de novo and genome guided assembly and annotation of the Pinus patula juvenile shoot transcriptome. BMC Genomics 2015; 16:1057. [PMID: 26652261 PMCID: PMC4676862 DOI: 10.1186/s12864-015-2277-7] [Citation(s) in RCA: 37] [Impact Index Per Article: 4.1] [Received: 09/09/2015] [Accepted: 12/06/2015] [Indexed: 11/25/2022] Open
Abstract
Background Pines are the most important tree species to the international forestry industry, covering 42 % of the global industrial forest plantation area. One of the most pressing threats to cultivation of some pine species is the pitch canker fungus, Fusarium circinatum, which can have devastating effects in both the field and nursery. Investigation of the Pinus-F. circinatum host-pathogen interaction is crucial for development of effective disease management strategies. As with many non-model organisms, investigation of host-pathogen interactions in pine species is hampered by limited genomic resources. This was partially alleviated through release of the 22 Gbp Pinus taeda v1.01 genome sequence (http://pinegenome.org/pinerefseq/) in 2014. Despite the fact that the fragmented state of the genome may hamper comprehensive transcriptome analysis, it is possible to leverage the inherent redundancy resulting from deep RNA sequencing with Illumina short reads to assemble transcripts in the absence of a completed reference sequence. These data can then be integrated with available genomic data to produce a comprehensive transcriptome resource. The aim of this study was to provide a foundation for gene expression analysis of disease response mechanisms in Pinus patula through transcriptome assembly. Results Eighteen de novo and two reference based assemblies were produced for P. patula shoot tissue. For this purpose three transcriptome assemblers, Trinity, Velvet/OASES and SOAPdenovo-Trans, were used to maximise diversity and completeness of assembled transcripts. Redundancy in the assembly was reduced using the EvidentialGene pipeline. The resulting 52 Mb P. patula v1.0 shoot transcriptome consists of 52 112 unigenes, 60 % of which could be functionally annotated. Conclusions The assembled transcriptome will serve as a major genomic resource for future investigation of P. patula and represents the largest gene catalogue produced to date for this species. 
Furthermore, this assembly can help detect gene-based genetic markers for P. patula and the comparative assembly workflow could be applied to generate similar resources for other non-model species.
Affiliation(s)
- Erik A Visser
- Department of Genetics, Forestry and Agricultural Biotechnology Institute (FABI), Genomics Research Institute (GRI), University of Pretoria, Private bag X20, Pretoria, 0028, South Africa.
| | - Jill L Wegrzyn
- Department of Ecology and Evolutionary Biology, University of Connecticut, Storrs, CT, 06269, USA.
| | - Emma T Steenkmap
- Department of Microbiology and Plant Pathology, Forestry and Agricultural Biotechnology Institute (FABI), Genomics Research Institute (GRI), University of Pretoria, Private bag X20, Pretoria, 0028, South Africa.
| | - Alexander A Myburg
- Department of Genetics, Forestry and Agricultural Biotechnology Institute (FABI), Genomics Research Institute (GRI), University of Pretoria, Private bag X20, Pretoria, 0028, South Africa.
| | - Sanushka Naidoo
- Department of Genetics, Forestry and Agricultural Biotechnology Institute (FABI), Genomics Research Institute (GRI), University of Pretoria, Private bag X20, Pretoria, 0028, South Africa.
| |
|
42
|
Westbrook JW, Walker AR, Neves LG, Munoz P, Resende MFR, Neale DB, Wegrzyn JL, Huber DA, Kirst M, Davis JM, Peter GF. Discovering candidate genes that regulate resin canal number in Pinus taeda stems by integrating genetic analysis across environments, ages, and populations. New Phytol 2015; 205:627-641. [PMID: 25266813 DOI: 10.1111/nph.13074] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.6] [Received: 04/18/2014] [Accepted: 08/14/2014] [Indexed: 06/03/2023]
Abstract
Genetically improving constitutive resin canal development in Pinus stems may enhance the capacity to synthesize terpenes for bark beetle resistance, chemical feedstocks, and biofuels. To discover genes that potentially regulate axial resin canal number (RCN), single nucleotide polymorphisms (SNPs) in 4027 genes were tested for association with RCN in two growth rings and three environments in a complex pedigree of 520 Pinus taeda individuals (CCLONES). The map locations of associated genes were compared with RCN quantitative trait loci (QTLs) in a (P. taeda × Pinus elliottii) × P. elliottii pseudo-backcross of 345 full-sibs (BC1). Resin canal number was heritable (h² ≈ 0.12-0.21) and positively genetically correlated with xylem growth (rg ≈ 0.32-0.72) and oleoresin flow (rg ≈ 0.15-0.51). Sixteen well-supported candidate regulators of RCN were discovered in CCLONES, including genes associated across sites and ages, unidirectionally associated with oleoresin flow and xylem growth, and mapped to RCN QTLs in BC1. Breeding is predicted to increase RCN 11% in one generation and could be accelerated with genomic selection at accuracies of 0.45-0.52 across environments. There is significant genetic variation for RCN in loblolly pine, which can be exploited in breeding for elevated terpene content.
Affiliation(s)
- Jared W Westbrook
- Forest Genomics Laboratory, Genetics Institute, University of Florida, 1376 Mowry Rd, Rm 320, Gainesville, FL, 32611, USA; Plant Molecular and Cellular Biology graduate program, University of Florida, Gainesville, PO Box 110410, FL 32611, USA
| |
|
43
|
Neale DB, Wegrzyn JL, Stevens KA, Zimin AV, Puiu D, Crepeau MW, Cardeno C, Koriabine M, Holtz-Morris AE, Liechty JD, Martínez-García PJ, Vasquez-Gross HA, Lin BY, Zieve JJ, Dougherty WM, Fuentes-Soriano S, Wu LS, Gilbert D, Marçais G, Roberts M, Holt C, Yandell M, Davis JM, Smith KE, Dean JFD, Lorenz WW, Whetten RW, Sederoff R, Wheeler N, McGuire PE, Main D, Loopstra CA, Mockaitis K, deJong PJ, Yorke JA, Salzberg SL, Langley CH. Decoding the massive genome of loblolly pine using haploid DNA and novel assembly strategies. Genome Biol 2014; 15:R59. [PMID: 24647006 PMCID: PMC4053751 DOI: 10.1186/gb-2014-15-3-r59] [Citation(s) in RCA: 274] [Impact Index Per Article: 27.4] [Received: 02/11/2014] [Accepted: 03/04/2014] [Indexed: 11/30/2022] Open
Abstract
Background The size and complexity of conifer genomes have, until now, prevented full genome sequencing and assembly. The large research community and economic importance of loblolly pine, Pinus taeda L., made it an early candidate for reference sequence determination. Results We develop a novel strategy to sequence the genome of loblolly pine that combines unique aspects of pine reproductive biology and genome assembly methodology. We use a whole genome shotgun approach relying primarily on next generation sequence generated from a single haploid seed megagametophyte from a loblolly pine tree, 20-1010, that has been used in industrial forest tree breeding. The resulting sequence and assembly were used to generate a draft genome spanning 23.2 Gbp and containing 20.1 Gbp with an N50 scaffold size of 66.9 kbp, making it a significant improvement over available conifer genomes. The long scaffold lengths allow the annotation of 50,172 gene models with intron lengths averaging over 2.7 kbp and sometimes exceeding 100 kbp in length. Analysis of orthologous gene sets identifies gene families that may be unique to conifers. We further characterize and expand the existing repeat library based on the de novo analysis of the repetitive content, estimated to encompass 82% of the genome. Conclusions In addition to its value as a resource for researchers and breeders, the loblolly pine genome sequence and assembly reported here demonstrate a novel approach to sequencing the large and complex genomes of this important group of plants that can now be widely applied.
|
44
|
Eckert AJ, Bower AD, Jermstad KD, Wegrzyn JL, Knaus BJ, Syring JV, Neale DB. Multilocus analyses reveal little evidence for lineage-wide adaptive evolution within major clades of soft pines (Pinus subgenus Strobus). Mol Ecol 2013; 22:5635-50. [PMID: 24134614 DOI: 10.1111/mec.12514] [Citation(s) in RCA: 29] [Impact Index Per Article: 2.6] [Received: 04/09/2012] [Revised: 08/27/2013] [Accepted: 08/29/2013] [Indexed: 12/26/2022]
Abstract
Estimates from molecular data for the fraction of new nonsynonymous mutations that are adaptive vary strongly across plant species. Much of this variation is due to differences in life history strategies as they influence the effective population size (Ne). Ample variation for these estimates, however, remains even when comparisons are made across species with similar values of Ne. An open question thus remains as to why the large disparity for estimates of adaptive evolution exists among plant species. Here, we have estimated the distribution of deleterious fitness effects (DFE) and the fraction of adaptive nonsynonymous substitutions (α) for 11 species of soft pines (subgenus Strobus) using DNA sequence data from 167 orthologous nuclear gene fragments. Most newly arising nonsynonymous mutations were inferred to be so strongly deleterious that they would rarely become fixed. Little evidence for long-term adaptive evolution was detected, as all 11 estimates for α were not significantly different from zero. Nucleotide diversity at synonymous sites, moreover, was strongly correlated with attributes of the DFE across species, thus illustrating a strong consistency with the expectations from the Nearly Neutral Theory of molecular evolution. Application of these patterns to genome-wide expectations for these species, however, was difficult as the loci chosen for the analysis were a biased set of conserved loci, which greatly influenced the estimates of the DFE and α. This implies that genome-wide parameter estimates will need truly genome-wide data, so that many of the existing patterns documented previously for forest trees (e.g. little evidence for signature of selection) may need revision.
Affiliation(s)
- Andrew J Eckert
- Department of Biology, Virginia Commonwealth University, Richmond, VA, 23284, USA
|
45
|
Mann IK, Wegrzyn JL, Rajora OP. Generation, functional annotation and comparative analysis of black spruce (Picea mariana) ESTs: an important conifer genomic resource. BMC Genomics 2013; 14:702. [PMID: 24119028 PMCID: PMC4007752 DOI: 10.1186/1471-2164-14-702] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Received: 10/01/2013] [Accepted: 10/08/2013] [Indexed: 12/01/2022] Open
Abstract
Background: EST (expressed sequence tag) sequences and their annotation provide a highly valuable resource for gene discovery, genome sequence annotation, and other genomics studies that can be applied in genetics, breeding and conservation programs for non-model organisms. Conifers are long-lived plants that are ecologically and economically important globally, and have a large genome size. Black spruce (Picea mariana) is a transcontinental species of the North American boreal and temperate forests. However, there are limited transcriptomic and genomic resources for this species. The primary objective of our study was to develop a black spruce transcriptomic resource to facilitate on-going functional genomics projects related to growth and adaptation to climate change.
Results: We conducted bidirectional sequencing of cDNA clones from a standard cDNA library constructed from black spruce needle tissues. We obtained 4,594 high-quality (2,455 5' end and 2,139 3' end) sequence reads, with an average read-length of 532 bp. Clustering and assembly of ESTs resulted in 2,731 unique sequences, consisting of 2,234 singletons and 497 contigs. Approximately two-thirds (63%) of unique sequences were functionally annotated. Genes involved in 36 molecular functions and 90 biological processes were discovered, including 24 putative transcription factors and 232 genes involved in photosynthesis. Most abundantly expressed transcripts were associated with photosynthesis, growth factors, stress and disease response, and transcription factors. A total of 216 full-length genes were identified. About 18% (493) of the transcripts were novel, representing an important addition to the GenBank EST database (dbEST). Fifty-seven di-, tri-, tetra- and penta-nucleotide simple sequence repeats were identified.
Conclusions: We have developed the first high-quality EST resource for black spruce and identified 493 novel transcripts, which may be species-specific and related to life history and ecological traits. We have also identified full-length genes and microsatellite-containing ESTs. Based on EST sequence similarities, black spruce showed closer evolutionary relationships with congeneric Picea glauca and Picea sitchensis than with other Pinaceae members and angiosperms. The EST sequences reported here provide an important resource for genome annotation, functional and comparative genomics, molecular breeding, conservation and management studies and applications in black spruce and related conifer species.
Affiliation(s)
- Ishminder K Mann
- Forest Genetics and Biotechnology Group, Department of Biology, Life Sciences Centre, Dalhousie University, 1355 Oxford Street, Halifax, NS B3H 4J1, Canada.
|
46
|
Wegrzyn JL, Lin BY, Zieve JJ, Dougherty WM, Martínez-García PJ, Koriabine M, Holtz-Morris A, deJong P, Crepeau M, Langley CH, Puiu D, Salzberg SL, Neale DB, Stevens KA. Insights into the loblolly pine genome: characterization of BAC and fosmid sequences. PLoS One 2013; 8:e72439. [PMID: 24023741 PMCID: PMC3762812 DOI: 10.1371/journal.pone.0072439] [Citation(s) in RCA: 43] [Impact Index Per Article: 3.9] [Received: 05/13/2013] [Accepted: 07/10/2013] [Indexed: 12/22/2022] Open
Abstract
Despite their prevalence and importance, the genome sequences of loblolly pine, Norway spruce, and white spruce, three ecologically and economically important conifer species, are just becoming available to the research community. Following the completion of these large assemblies, annotation efforts will be undertaken to characterize the reference sequences. Accurate annotation of these ancient genomes would be aided by a comprehensive repeat library; however, few studies have generated enough sequence to fully evaluate and catalog their non-genic content. In this paper, two sets of loblolly pine genomic sequence, 103 previously assembled BACs and 90,954 newly sequenced and assembled fosmid scaffolds, were analyzed. Together, this sequence represents 280 Mbp (roughly 1% of the loblolly pine genome) and one of the most comprehensive studies of repetitive elements and genes in a gymnosperm species. A combination of homology and de novo methodologies was applied to identify both conserved and novel repeats. Similarity analysis estimated a repetitive content of 27% that included both full and partial elements. When combined with the de novo investigation, the estimate increased to almost 86%. Over 60% of the repetitive sequence consists of full or partial LTR (long terminal repeat) retrotransposons. Through de novo approaches, 6,270 novel, full-length transposable element families and 9,415 sub-families were identified. Among those 6,270 families, 82% were annotated as single-copy. Several of the novel, high-copy families are described here, with the largest, PtPiedmont, comprising 133 full-length copies. In addition to repeats, analysis of the coding region reported 23 full-length eukaryotic orthologous proteins (KOGs) and another 29 novel or orthologous genes. These discoveries, along with other genomic resources, will be used to annotate conifer genomes and address long-standing questions about gymnosperm evolution.
Affiliation(s)
- Jill L. Wegrzyn
- Department of Plant Sciences, University of California Davis, Davis, California, United States of America
- * E-mail: (JLW); (KAS)
- Brian Y. Lin
- Department of Plant Sciences, University of California Davis, Davis, California, United States of America
- Jacob J. Zieve
- Department of Plant Sciences, University of California Davis, Davis, California, United States of America
- William M. Dougherty
- Department of Evolution and Ecology, University of California Davis, Davis, California, United States of America
- Pedro J. Martínez-García
- Department of Plant Sciences, University of California Davis, Davis, California, United States of America
- Maxim Koriabine
- Children's Hospital Oakland Research Institute, Oakland, California, United States of America
- Ann Holtz-Morris
- Children's Hospital Oakland Research Institute, Oakland, California, United States of America
- Pieter deJong
- Children's Hospital Oakland Research Institute, Oakland, California, United States of America
- Marc Crepeau
- Department of Evolution and Ecology, University of California Davis, Davis, California, United States of America
- Charles H. Langley
- Department of Evolution and Ecology, University of California Davis, Davis, California, United States of America
- Daniela Puiu
- Center for Computational Biology, McKusick-Nathans Institute of Genetic Medicine, Johns Hopkins University, Baltimore, Maryland, United States of America
- Steven L. Salzberg
- Center for Computational Biology, McKusick-Nathans Institute of Genetic Medicine, Johns Hopkins University, Baltimore, Maryland, United States of America
- David B. Neale
- Department of Plant Sciences, University of California Davis, Davis, California, United States of America
- Kristian A. Stevens
- Department of Evolution and Ecology, University of California Davis, Davis, California, United States of America
- * E-mail: (JLW); (KAS)
|
47
|
Westbrook JW, Resende MFR, Munoz P, Walker AR, Wegrzyn JL, Nelson CD, Neale DB, Kirst M, Huber DA, Gezan SA, Peter GF, Davis JM. Association genetics of oleoresin flow in loblolly pine: discovering genes and predicting phenotype for improved resistance to bark beetles and bioenergy potential. New Phytol 2013; 199:89-100. [PMID: 23534834 DOI: 10.1111/nph.12240] [Citation(s) in RCA: 25] [Impact Index Per Article: 2.3] [Received: 01/18/2013] [Accepted: 02/15/2013] [Indexed: 05/28/2023]
Abstract
Rapidly enhancing oleoresin production in conifer stems through genomic selection and genetic engineering may increase resistance to bark beetles and terpenoid yield for liquid biofuels. We integrated association genetic and genomic prediction analyses of oleoresin flow (g 24 h⁻¹) using 4854 single nucleotide polymorphisms (SNPs) in expressed genes within a pedigreed population of loblolly pine (Pinus taeda) that was clonally replicated at three sites in the southeastern United States. Additive genetic variation in oleoresin flow (h² ≈ 0.12-0.30) was strongly correlated between years in which precipitation varied (rₐ ≈ 0.95), while the genetic correlation between sites declined from 0.8 to 0.37 with increasing differences in soil and climate among sites. A total of 231 SNPs were significantly associated with oleoresin flow, of which 81% were specific to individual sites. SNPs in sequences similar to ethylene signaling proteins, ABC transporters, and diterpenoid hydroxylases were associated with oleoresin flow across sites. Despite this complex genetic architecture, we developed a genomic prediction model to accelerate breeding for enhanced oleoresin flow that is robust to environmental variation. Results imply that breeding could increase oleoresin flow 1.5- to 2.4-fold in one generation.
Affiliation(s)
- Jared W Westbrook
- Forest Genomics Laboratory, Genetics Institute, University of Florida, 1376 Mowry Rd, Rm 320, Gainesville, FL, 32611, USA
- Marcio F R Resende
- Forest Genomics Laboratory, Genetics Institute, University of Florida, 1376 Mowry Rd, Rm 320, Gainesville, FL, 32611, USA
- Patricio Munoz
- Forest Genomics Laboratory, Genetics Institute, University of Florida, 1376 Mowry Rd, Rm 320, Gainesville, FL, 32611, USA
- Alejandro R Walker
- Forest Genomics Laboratory, Genetics Institute, University of Florida, 1376 Mowry Rd, Rm 320, Gainesville, FL, 32611, USA
- School of Forest Resources and Conservation, University of Florida, PO Box 110410, Gainesville, FL, 32611, USA
- Jill L Wegrzyn
- Department of Plant Sciences, University of California at Davis, Mail Stop 4, Davis, CA, 95616, USA
- C Dana Nelson
- Southern Institute of Forest Genetics, USDA Forest Service, Southern Research Station, 23332 Success Rd, Saucier, MS, 39574, USA
- David B Neale
- Department of Plant Sciences, University of California at Davis, Mail Stop 4, Davis, CA, 95616, USA
- Matias Kirst
- Forest Genomics Laboratory, Genetics Institute, University of Florida, 1376 Mowry Rd, Rm 320, Gainesville, FL, 32611, USA
- School of Forest Resources and Conservation, University of Florida, PO Box 110410, Gainesville, FL, 32611, USA
- Dudley A Huber
- School of Forest Resources and Conservation, University of Florida, PO Box 110410, Gainesville, FL, 32611, USA
- Salvador A Gezan
- School of Forest Resources and Conservation, University of Florida, PO Box 110410, Gainesville, FL, 32611, USA
- Gary F Peter
- Forest Genomics Laboratory, Genetics Institute, University of Florida, 1376 Mowry Rd, Rm 320, Gainesville, FL, 32611, USA
- School of Forest Resources and Conservation, University of Florida, PO Box 110410, Gainesville, FL, 32611, USA
- John M Davis
- Forest Genomics Laboratory, Genetics Institute, University of Florida, 1376 Mowry Rd, Rm 320, Gainesville, FL, 32611, USA
- School of Forest Resources and Conservation, University of Florida, PO Box 110410, Gainesville, FL, 32611, USA
|
48
|
Palle SR, Seeve CM, Eckert AJ, Wegrzyn JL, Neale DB, Loopstra CA. Association of loblolly pine xylem development gene expression with single-nucleotide polymorphisms. Tree Physiol 2013; 33:763-74. [PMID: 23933831 DOI: 10.1093/treephys/tpt054] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Indexed: 05/11/2023]
Abstract
Variation in the expression of genes with putative roles in wood development was associated with single-nucleotide polymorphisms (SNPs) using a population of loblolly pine (Pinus taeda L.) that included individuals from much of the native range. Association studies were performed using 3938 SNPs and expression data obtained using quantitative real-time polymerase chain reaction (PCR) (qRT-PCR) for 106 xylem development genes in 400 clonally replicated loblolly pine individuals. A general linear model (GLM) approach, which takes the underlying population structure into consideration, was used to discover significant associations. After adjustment for multiple testing using a false discovery rate correction, 88 statistically significant associations (Q<0.05) were observed for 80 SNPs with the expression data of 33 xylem development genes. Thirty SNPs caused nonsynonymous mutations, 18 resulted in synonymous mutations, 11 were in 3' untranslated regions (UTRs), 1 was in a 5' UTR and 20 were in introns. Using AraNet, we found that Arabidopsis genes with high similarity to the loblolly pine genes involved in 21 of the 88 statistically significant associations are connected in functional gene networks. Comparisons of gene expression values revealed that in most cases the average expression in plants homozygous for the rare SNP allele was lower than that of plants that were heterozygous or homozygous for the abundant allele. Although there are association studies of SNPs and expression profiles for humans, Arabidopsis and white spruce, to the best of our knowledge, this is the first example of such an association genetic study in pines. Functional validation of these associations will lead to a deeper understanding of the molecular basis of phenotypic differences in wood development among individuals in conifer populations.
Affiliation(s)
- Sreenath R Palle
- Department of Ecosystem Science and Management, Molecular and Environmental Plant Sciences, Texas A&M University, TAMU 2138, College Station, TX 77843, USA
|
49
|
Abstract
An open-access culture and a well-developed comparative-genomics infrastructure must be developed in forest trees to derive the full potential of genome sequencing in this diverse group of plants that are the dominant species in much of the earth's terrestrial ecosystems.
|
50
|
Vasquez‐Gross HA, Yu JJ, Figueroa B, Gessler DDG, Neale DB, Wegrzyn JL. CartograTree: connecting tree genomes, phenotypes and environment. Mol Ecol Resour 2013; 13:528-37. [DOI: 10.1111/1755-0998.12067] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.7] [Received: 09/27/2012] [Revised: 12/03/2012] [Accepted: 12/12/2012] [Indexed: 01/01/2023]
Affiliation(s)
- John J. Yu
- Department of Plant Sciences, University of California at Davis, Davis, CA 95616, USA
- Ben Figueroa
- Department of Plant Sciences, University of California at Davis, Davis, CA 95616, USA
- David B. Neale
- Department of Plant Sciences, University of California at Davis, Davis, CA 95616, USA
- Jill L. Wegrzyn
- Department of Plant Sciences, University of California at Davis, Davis, CA 95616, USA
|