1
Monnahan PJ, Michno JM, O'Connor C, Brohammer AB, Springer NM, McGaugh SE, Hirsch CN. Using multiple reference genomes to identify and resolve annotation inconsistencies. BMC Genomics 2020; 21:281. [PMID: 32264824 PMCID: PMC7140576 DOI: 10.1186/s12864-020-6696-8] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 05/30/2019] [Accepted: 03/24/2020] [Indexed: 11/10/2022]
Abstract
BACKGROUND Advances in sequencing technologies have led to the release of reference genomes and annotations for multiple individuals within more well-studied systems. While these new genome assemblies share substantial synteny with one another, the annotated structure of gene models within these regions can differ. Of particular concern are split-gene misannotations, in which a single gene is incorrectly annotated as two distinct genes or two genes are incorrectly annotated as a single gene. These misannotations can have major impacts on functional prediction, estimates of expression, and many downstream analyses. RESULTS We developed a high-throughput method based on pairwise comparisons of annotations that detects potential split-gene misannotations and quantifies support for whether the genes should be merged into a single gene model. We demonstrated the utility of our method using gene annotations of three reference genomes from maize (B73, PH207, and W22), a difficult system from an annotation perspective due to the size and complexity of the genome. On average, we found several hundred of these potential split-gene misannotations in each pairwise comparison, corresponding to 3-5% of gene models across annotations. To determine which state (i.e., one gene or multiple genes) is biologically supported, we utilized RNAseq data from 10 tissues throughout development along with a novel metric and simulation framework. The methods we have developed require minimal human interaction and can be applied to future assemblies to aid in annotation efforts. CONCLUSIONS Split-gene misannotations occur at appreciable frequency in maize annotations. We have developed a method to easily identify and correct these misannotations. Importantly, this method is generic in that it can utilize any type of short-read expression data. Failure to account for split-gene misannotations has serious consequences for biological inference, particularly for expression-based analyses.
Affiliation(s)
- Patrick J Monnahan
  - Department of Agronomy and Plant Genetics, University of Minnesota, St. Paul, MN, 55108, USA
  - Department of Ecology, Evolution, and Behavior, University of Minnesota, St. Paul, MN, 55108, USA
  - Department of Plant and Microbial Biology, University of Minnesota, St. Paul, MN, 55108, USA
- Jean-Michel Michno
  - Department of Agronomy and Plant Genetics, University of Minnesota, St. Paul, MN, 55108, USA
- Christine O'Connor
  - Department of Agronomy and Plant Genetics, University of Minnesota, St. Paul, MN, 55108, USA
  - Department of Ecology, Evolution, and Behavior, University of Minnesota, St. Paul, MN, 55108, USA
- Alex B Brohammer
  - Department of Agronomy and Plant Genetics, University of Minnesota, St. Paul, MN, 55108, USA
- Nathan M Springer
  - Department of Plant and Microbial Biology, University of Minnesota, St. Paul, MN, 55108, USA
- Suzanne E McGaugh
  - Department of Ecology, Evolution, and Behavior, University of Minnesota, St. Paul, MN, 55108, USA
- Candice N Hirsch
  - Department of Agronomy and Plant Genetics, University of Minnesota, St. Paul, MN, 55108, USA
2
Hu X, Steimel JP, Kapka-Kitzman DM, Davis-Vogel C, Richtman NM, Mathis JP, Nelson ME, Lu AL, Wu G. Molecular characterization of the insecticidal activity of double-stranded RNA targeting the smooth septate junction of western corn rootworm (Diabrotica virgifera virgifera). PLoS One 2019; 14:e0210491. [PMID: 30629687 PMCID: PMC6328145 DOI: 10.1371/journal.pone.0210491] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.6] [Received: 07/09/2018] [Accepted: 12/24/2018] [Indexed: 01/14/2023]
Abstract
The western corn rootworm (WCR, Diabrotica virgifera virgifera) gene, dvssj1, is a putative homolog of the Drosophila melanogaster gene, snakeskin (ssk). This gene encodes a membrane protein associated with the smooth septate junction (SSJ), which is required for the proper barrier function of the epithelial lining of insect intestines. Disruption of DVSSJ integrity by RNAi has previously been shown to be an effective approach for corn rootworm control, apparently through suppression of DVSSJ1 protein production, leading to growth inhibition and mortality. To understand the mechanism that leads to the death of WCR larvae by dvssj1 double-stranded RNA, we examined the molecular characteristics associated with SSJ functions during larval development. Feeding on a diet containing dvssj1 dsRNA results in dose-dependent suppression of mRNA and protein; this impairs SSJ formation and barrier function of the midgut and results in larval mortality. These findings suggest that the malfunctioning of the SSJ complex in the midgut triggered by dvssj1 silencing is the principal cause of WCR death. This study also illustrates that dvssj1 is a midgut-specific gene in WCR and that its functions are consistent with the biological functions described for ssk.
Affiliation(s)
- Xu Hu
  - DuPont Pioneer, Johnston, Iowa, United States of America
  - * E-mail: (XH); (MEN)
- John P. Mathis
  - DuPont Pioneer, Johnston, Iowa, United States of America
- Mark E. Nelson
  - DuPont Pioneer, Johnston, Iowa, United States of America
  - * E-mail: (XH); (MEN)
- Albert L. Lu
  - DuPont Pioneer, Johnston, Iowa, United States of America
- Gusui Wu
  - DuPont Pioneer, Hayward, California, United States of America
3
Avila LM, Obeidat W, Earl H, Niu X, Hargreaves W, Lukens L. Shared and genetically distinct Zea mays transcriptome responses to ongoing and past low temperature exposure. BMC Genomics 2018; 19:761. [PMID: 30342485 PMCID: PMC6196024 DOI: 10.1186/s12864-018-5134-7] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.0] [Received: 03/17/2018] [Accepted: 10/01/2018] [Indexed: 11/10/2022]
Abstract
BACKGROUND Cold temperatures and their alleviation affect many plant traits, including the abundance of protein-coding gene transcripts. Transcript level changes that occur in response to cold temperatures and their alleviation are either shared across genotypes or vary among them. In this study we identify individual transcripts and groups of functionally related transcripts that consistently respond to cold and its alleviation. Genes that respond differently to temperature changes across genotypes may have limited functional importance. We investigate whether these genes share functions, and whether their genotype-specific gene expression levels change in magnitude or rank across temperatures. RESULTS We estimate transcript abundances from over 22,000 genes in two unrelated Zea mays inbred lines during and after cold temperature exposure. Genotype and temperature contribute to many genes' abundances. Past cold exposure affects many fewer genes. Genes up-regulated in cold encode many cytokinin glucoside biosynthesis enzymes, transcription factors, signalling molecules, and proteins involved in diverse environmental responses. After cold exposure, protease inhibitors and cuticular wax genes are newly up-regulated, and environmentally responsive genes continue to be up-regulated. Genes down-regulated in response to cold include many photosynthesis, translation, and DNA replication associated genes. After cold exposure, DNA replication and translation genes are still preferentially down-regulated. Lignin and suberin biosynthesis are newly down-regulated. DNA replication, reactive oxygen species response, and anthocyanin biosynthesis genes have strong, genotype-specific temperature responses. The ranks of genotypes' transcript abundances often change across temperatures. CONCLUSIONS We report a large, core transcriptome response to cold and the alleviation of cold. In cold, many of the core suite of genes are up- or down-regulated to control plant growth and photosynthesis and limit cellular damage. In recovery, core responses serve in part to prepare for future stress. Functionally related genes are consistently and greatly up-regulated in a single genotype in response to cold or its alleviation, suggesting positive selection has driven genotype-specific temperature responses in maize.
Affiliation(s)
- Luis M Avila
  - Department of Plant Agriculture, University of Guelph, 50 Stone Road East, Guelph, ON N1G 2W1 Canada
- Wisam Obeidat
  - Department of Plant Agriculture, University of Guelph, 50 Stone Road East, Guelph, ON N1G 2W1 Canada
- Hugh Earl
  - Department of Plant Agriculture, University of Guelph, 50 Stone Road East, Guelph, ON N1G 2W1 Canada
- Xiaomu Niu
  - Dupont/Pioneer, 7300 NW 62nd Ave, DuPont Pioneer, Johnston, Iowa, 50131 USA
- William Hargreaves
  - Department of Plant Agriculture, University of Guelph, 50 Stone Road East, Guelph, ON N1G 2W1 Canada
- Lewis Lukens
  - Department of Plant Agriculture, University of Guelph, 50 Stone Road East, Guelph, ON N1G 2W1 Canada
4
Bilinski P, Albert PS, Berg JJ, Birchler JA, Grote MN, Lorant A, Quezada J, Swarts K, Yang J, Ross-Ibarra J. Parallel altitudinal clines reveal trends in adaptive evolution of genome size in Zea mays. PLoS Genet 2018; 14:e1007162. [PMID: 29746459 PMCID: PMC5944917 DOI: 10.1371/journal.pgen.1007162] [Citation(s) in RCA: 60] [Impact Index Per Article: 10.0] [Received: 08/10/2017] [Accepted: 12/20/2017] [Indexed: 12/03/2022]
Abstract
While the vast majority of genome size variation in plants is due to differences in repetitive sequence, we know little about how selection acts on repeat content in natural populations. Here we investigate parallel changes in intraspecific genome size and repeat content of domesticated maize (Zea mays) landraces and their wild relative teosinte across altitudinal gradients in Mesoamerica and South America. We combine genotyping, low coverage whole-genome sequence data, and flow cytometry to test for evidence of selection on genome size and individual repeat abundance. We find that population structure alone cannot explain the observed variation, implying that clinal patterns of genome size are maintained by natural selection. Our modeling additionally provides evidence of selection on individual heterochromatic knob repeats, likely due to their large individual contribution to genome size. To better understand the phenotypes driving selection on genome size, we conducted a growth chamber experiment using a population of highland teosinte exhibiting extensive variation in genome size. We find weak support for a positive correlation between genome size and cell size, but stronger support for a negative correlation between genome size and the rate of cell production. Reanalyzing published data of cell counts in maize shoot apical meristems, we then identify a negative correlation between cell production rate and flowering time. Together, our data suggest a model in which variation in genome size is driven by natural selection on flowering time across altitudinal clines, connecting intraspecific variation in repetitive sequence to important differences in adaptive phenotypes.
Genome size in plants can vary by orders of magnitude, but this variation has long been considered to be of little functional consequence. Studying three independent adaptations to high altitude in Zea mays, we find that genome size experiences parallel pressures from natural selection, causing a reduction in genome size with increasing altitude. Though reductions in overall repetitive content are responsible for the genome size change, we find that only those individual loci contributing most to the variation in genome size are individually targeted by selection. To identify the phenotype influenced by genome size, we study how variation in genome size within a single wild population impacts leaf growth and cell division. We find that genome size variation correlates negatively with the rate of cell division, suggesting that individuals with larger genomes require longer to complete a mitotic cycle. Finally, we reanalyze data from maize inbreds to show that faster cell division is correlated with earlier flowering, connecting observed variation in genome size to an important adaptive phenotype.
Affiliation(s)
- Paul Bilinski
  - Department of Plant Sciences, University of California, Davis, Davis, California, United States of America
  - Research Group for Ancient Genomics and Evolution, Department of Molecular Biology, Max Planck Institute for Developmental Biology, Tuebingen, Germany
  - * E-mail: (PB); (JRI)
- Patrice S. Albert
  - Division of Biological Sciences, University of Missouri, Columbia, Missouri, United States of America
- Jeremy J. Berg
  - Center for Population Biology, University of California, Davis, Davis, California, United States of America
  - Department of Evolution and Ecology, University of California, Davis, Davis, California, United States of America
- James A. Birchler
  - Division of Biological Sciences, University of Missouri, Columbia, Missouri, United States of America
- Mark N. Grote
  - Department of Anthropology, University of California, Davis, Davis, California, United States of America
- Anne Lorant
  - Department of Plant Sciences, University of California, Davis, Davis, California, United States of America
- Juvenal Quezada
  - Department of Plant Sciences, University of California, Davis, Davis, California, United States of America
- Kelly Swarts
  - Research Group for Ancient Genomics and Evolution, Department of Molecular Biology, Max Planck Institute for Developmental Biology, Tuebingen, Germany
- Jinliang Yang
  - Department of Plant Sciences, University of California, Davis, Davis, California, United States of America
  - Department of Agronomy and Horticulture, University of Nebraska-Lincoln, Lincoln, Nebraska, United States of America
- Jeffrey Ross-Ibarra
  - Department of Plant Sciences, University of California, Davis, Davis, California, United States of America
  - Center for Population Biology, University of California, Davis, Davis, California, United States of America
  - Genome Center, University of California, Davis, Davis, California, United States of America
  - * E-mail: (PB); (JRI)
5
Bilinski P, Han Y, Hufford MB, Lorant A, Zhang P, Estep MC, Jiang J, Ross-Ibarra J. Genomic abundance is not predictive of tandem repeat localization in grass genomes. PLoS One 2017; 12:e0177896. [PMID: 28570674 PMCID: PMC5453492 DOI: 10.1371/journal.pone.0177896] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Received: 10/31/2016] [Accepted: 05/04/2017] [Indexed: 11/25/2022]
Abstract
Highly repetitive regions have historically posed a challenge when investigating sequence variation and content. High-throughput sequencing has enabled researchers to use whole-genome shotgun sequencing to estimate the abundance of repetitive sequence, and these methodologies have recently been applied to centromeres. Previous research has investigated variation in centromere repeats across eukaryotes, positing that the highest abundance tandem repeat in a genome is often the centromeric repeat. To test this assumption, we used shotgun sequencing and a bioinformatic pipeline to identify common tandem repeats across a number of grass species. We find that de novo assembly and subsequent abundance ranking of repeats can successfully identify tandem repeats with homology to known tandem repeats. Fluorescence in situ hybridization shows that de novo assembly and ranking of repeats from non-model taxa identify chromosome domains rich in tandem repeats both near pericentromeres and elsewhere in the genome.
Affiliation(s)
- Paul Bilinski
  - Dept. of Plant Sciences, University of California, Davis, Davis, CA, United States of America
  - * E-mail: (PB); (JRI)
- Yonghua Han
  - School of Life Sciences, Jiangsu Normal University, Xuzhou, China
  - Dept. of Horticulture, University of Wisconsin-Madison, Madison, WI, United States of America
- Matthew B. Hufford
  - Department of Ecology, Evolution, and Organismal Biology, Iowa State University, Ames, IA, United States of America
- Anne Lorant
  - Dept. of Plant Sciences, University of California, Davis, Davis, CA, United States of America
- Pingdong Zhang
  - Dept. of Horticulture, University of Wisconsin-Madison, Madison, WI, United States of America
  - College of Bioscience and Biotechnology, Beijing Forestry University, Beijing, China
- Matt C. Estep
  - Dept. of Biology, Appalachian State University, Boone, NC, United States of America
- Jiming Jiang
  - Dept. of Horticulture, University of Wisconsin-Madison, Madison, WI, United States of America
- Jeffrey Ross-Ibarra
  - Dept. of Plant Sciences, University of California, Davis, Davis, CA, United States of America
  - Genome Center and Center for Population Biology, University of California, Davis, Davis, CA, United States of America
  - * E-mail: (PB); (JRI)
6
de Andrade LRB, Fritsche Neto R, Granato ÍSC, Sant’Ana GC, Morais PPP, Borém A. Genetic Vulnerability and the Relationship of Commercial Germplasms of Maize in Brazil with the Nested Association Mapping Parents. PLoS One 2016; 11:e0163739. [PMID: 27780247 PMCID: PMC5079593 DOI: 10.1371/journal.pone.0163739] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.9] [Received: 06/27/2016] [Accepted: 09/13/2016] [Indexed: 11/19/2022]
Abstract
A few breeding companies dominate the maize (Zea mays L.) hybrid market in Brazil: Monsanto® (35%), DuPont Pioneer® (30%), Dow Agrosciences® (15%), Syngenta® (10%) and Helix Sementes (4%). Therefore, it is important to monitor the genetic diversity in commercial germplasms, as breeding practices, registration and marketing of new cultivars can lead to a significant reduction of genetic diversity. Reduced genetic variation may lead to crop vulnerabilities, food insecurity and limited genetic gains following selection. The aim of this study was to evaluate the genetic vulnerability risk by examining the relationship between the commercial Brazilian maize germplasms and the Nested Association Mapping (NAM) parents. For this purpose, we used the commercial hybrids with the largest market share in Brazil and the NAM parents. The hybrids were genotyped for 768 single nucleotide polymorphisms (SNPs) using the Illumina Goldengate® platform. The NAM parent genomic data, comprising 1,536 SNPs for each line, were obtained from the Panzea data bank. The population structure, genetic diversity and the correlation between allele frequencies were analyzed. Based on the estimated effective population size and genetic variability, there is a low risk of genetic vulnerability in the commercial Brazilian maize germplasms. However, their genetic diversity is lower than that found in the NAM parents. Furthermore, the Brazilian germplasms showed no close relationship with most NAM parents, except B73. This indicates that B73, or its heterotic group (Iowa Stiff Stalk Synthetic), contributed to the development of the commercial Brazilian germplasms.
Affiliation(s)
- Roberto Fritsche Neto
  - Genetics Department, Luiz de Queiroz College of Agriculture, University of São Paulo, Piracicaba, São Paulo, Brazil
- Gustavo César Sant’Ana
  - Systèmes biologiques, Centre de Coopération Internationale en Recherche Agronomique pour le Développement, Montpellier, Languedoc-Roussillon, France
- Aluízio Borém
  - Plants Science Department, Federal University of Viçosa, Viçosa, Minas Gerais, Brazil