1
Marin-Recinos MF, Pucker B. Genetic factors explaining anthocyanin pigmentation differences. BMC Plant Biology 2024; 24:627. [PMID: 38961369 PMCID: PMC11221117 DOI: 10.1186/s12870-024-05316-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/15/2023] [Accepted: 06/20/2024] [Indexed: 07/05/2024]
Abstract
BACKGROUND Anthocyanins are important contributors to coloration across a wide phylogenetic range of plants. Biological functions of anthocyanins span from reproduction to protection against biotic and abiotic stressors. Owing to a clearly visible phenotype of mutants, the anthocyanin biosynthesis and its sophisticated regulation have been studied in numerous plant species. Genes encoding the anthocyanin biosynthesis enzymes are regulated by a transcription factor complex comprising MYB, bHLH and WD40 proteins. RESULTS A systematic comparison of anthocyanin-pigmented vs. non-pigmented varieties was performed within numerous plant species covering the taxonomic diversity of flowering plants. The literature was screened for cases in which genetic factors causing anthocyanin loss were reported. Additionally, transcriptomic data sets from four previous studies were reanalyzed to determine the genes possibly responsible for color variation based on their expression pattern. The contribution of different structural and regulatory genes to the intraspecific pigmentation differences was quantified. Differences concerning transcription factors are by far the most frequent explanation for pigmentation differences observed between two varieties of the same species. Among the transcription factors in the analyzed cases, MYB genes are significantly more prone to account for pigmentation differences compared to bHLH or WD40 genes. Among the structural genes, DFR genes are most often associated with anthocyanin loss. CONCLUSIONS These findings support previous assumptions about the susceptibility of transcriptional regulation to evolutionary changes and its importance for the evolution of novel coloration phenotypes. Our findings underline the particular significance of MYBs and their apparent prevalent role in the specificity of the MBW complex.
Affiliation(s)
- Maria F Marin-Recinos
  - Plant Biotechnology and Bioinformatics, Institute of Plant Biology and BRICS, TU Braunschweig, Braunschweig, Germany
- Boas Pucker
  - Plant Biotechnology and Bioinformatics, Institute of Plant Biology and BRICS, TU Braunschweig, Braunschweig, Germany
2
Zhou H, Su X, Song B. ACMGA: a reference-free multiple-genome alignment pipeline for plant species. BMC Genomics 2024; 25:515. [PMID: 38796435 PMCID: PMC11127342 DOI: 10.1186/s12864-024-10430-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/17/2024] [Accepted: 05/20/2024] [Indexed: 05/28/2024] Open
Abstract
BACKGROUND The short-read whole-genome sequencing (WGS) approach has been widely applied to investigate the genomic variation in the natural populations of many plant species. With the rapid advancements in long-read sequencing and genome assembly technologies, high-quality genome sequences are available for a group of varieties for many plant species. These genome sequences are expected to help researchers comprehensively investigate any type of genomic variants that are missed by the WGS technology. However, multiple genome alignment (MGA) tools designed by the human genome research community might be unsuitable for plant genomes. RESULTS To fill this gap, we developed the AnchorWave-Cactus Multiple Genome Alignment (ACMGA) pipeline, which improved the alignment of repeat elements and could identify long (> 50 bp) deletions or insertions (INDELs). We conducted MGA using ACMGA and Cactus for 8 Arabidopsis (Arabidopsis thaliana) and 26 maize (Zea mays) de novo assembled genome sequences and compared them with the previously published short-read variant calling results. MGA identified more single nucleotide variants (SNVs) and long INDELs than did previously published WGS variant callings. Additionally, ACMGA detected significantly more SNVs and long INDELs in repetitive regions and the whole genome than did Cactus. Compared with the results of Cactus, the results of ACMGA were more similar to the previously published variants called using short reads. These two MGA pipelines identified numerous multi-allelic variants that were missed by the WGS variant calling pipeline. CONCLUSIONS Aligning de novo assembled genome sequences could identify more SNVs and INDELs than mapping short reads. ACMGA combines the advantages of AnchorWave and Cactus and offers a practical solution for plant MGA by integrating global alignment, a 2-piece-affine-gap cost strategy, and the progressive MGA algorithm.
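The abstract above names a 2-piece-affine-gap cost strategy without spelling it out. The following is a minimal sketch, assuming the common formulation in which a gap of length k is charged the cheaper of two affine functions, so that short gaps are penalized steeply while long INDELs stay affordable. The function name and all four penalty values are illustrative assumptions, not parameters taken from ACMGA, AnchorWave, or Cactus.

```python
def two_piece_affine_gap_cost(length, open1=4, extend1=2, open2=24, extend2=1):
    """Cost of a gap of `length` bases under a two-piece affine model.

    All four penalty values are hypothetical, chosen only for illustration.
    """
    if length <= 0:
        return 0
    # Piece 1: cheap to open, expensive to extend (favours short gaps).
    short_piece = open1 + extend1 * length
    # Piece 2: expensive to open, cheap to extend (favours long gaps).
    long_piece = open2 + extend2 * length
    # The gap is charged whichever piece is cheaper at this length.
    return min(short_piece, long_piece)


if __name__ == "__main__":
    for k in (1, 5, 20, 100):
        print(k, two_piece_affine_gap_cost(k))
```

With these assumed values the two pieces cross at a gap length of 20 bases; beyond that, each extra base costs 1 instead of 2, which is what keeps long insertions and deletions from being scored out of the alignment.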
Affiliation(s)
- Huafeng Zhou
  - College of Computer Science and Technology, Qingdao University, Qingdao, Shandong, 266071, China
  - National Key Laboratory of Wheat Improvement, Peking University Institute of Advanced Agricultural Sciences, Shandong Laboratory of Advanced Agriculture Sciences in Weifang, Weifang, Shandong, 261325, China
- Xiaoquan Su
  - College of Computer Science and Technology, Qingdao University, Qingdao, Shandong, 266071, China
- Baoxing Song
  - National Key Laboratory of Wheat Improvement, Peking University Institute of Advanced Agricultural Sciences, Shandong Laboratory of Advanced Agriculture Sciences in Weifang, Weifang, Shandong, 261325, China
  - Key Laboratory of Maize Biology and Genetic Breeding in Arid Area of Northwest Region of the Ministry of Agriculture, College of Agronomy, Northwest A&F University, Yangling, Shaanxi, 712100, China
3
Natarajan S, Pucker B, Srivastava S. Genomic and transcriptomic analysis of camptothecin producing novel fungal endophyte: Alternaria burnsii NCIM 1409. Sci Rep 2023; 13:14614. [PMID: 37670002 PMCID: PMC10480469 DOI: 10.1038/s41598-023-41738-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/17/2023] [Accepted: 08/30/2023] [Indexed: 09/07/2023] Open
Abstract
Camptothecin is an important anticancer alkaloid produced by particular plant species. No suitable synthetic route has been established for camptothecin production yet, imposing a stress on plant-based production systems. Endophytes associated with these camptothecin-producing plants have been reported to also produce camptothecin and other high-value phytochemicals. A previous study identified a fungal endophyte Alternaria burnsii NCIM 1409, isolated from Nothapodytes nimmoniana, to be a sustainable producer of camptothecin. Our study provides key insights on camptothecin biosynthesis in this recently discovered endophyte. The whole genome sequence of A. burnsii NCIM 1409 was assembled and screened for biosynthetic gene clusters. Comparative studies with related fungi supported the identification of candidate genes involved in camptothecin synthesis and also helped to understand some aspects of the endophyte's defense against the toxic effects of camptothecin. No evidence for horizontal gene transfer of the camptothecin biosynthetic genes from the host plant to the endophyte was detected suggesting an independent evolution of the camptothecin biosynthesis in this fungus.
Affiliation(s)
- Shakunthala Natarajan
  - Plant Biotechnology and Bioinformatics, Institute of Plant Biology and Braunschweig Integrated Centre of Systems Biology (BRICS), TU Braunschweig, 38106, Brunswick, Germany
  - Department of Biotechnology, Bhupat and Jyoti Mehta School of Biosciences, Indian Institute of Technology Madras, Chennai, 600 036, India
- Boas Pucker
  - Plant Biotechnology and Bioinformatics, Institute of Plant Biology and Braunschweig Integrated Centre of Systems Biology (BRICS), TU Braunschweig, 38106, Brunswick, Germany
- Smita Srivastava
  - Department of Biotechnology, Bhupat and Jyoti Mehta School of Biosciences, Indian Institute of Technology Madras, Chennai, 600 036, India
4
Meckoni SN, Nass B, Pucker B. Phylogenetic placement of Ceratophyllum submersum based on a complete plastome sequence derived from nanopore long read sequencing data. BMC Res Notes 2023; 16:187. [PMID: 37626355 PMCID: PMC10464454 DOI: 10.1186/s13104-023-06459-z] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 06/29/2023] [Accepted: 08/14/2023] [Indexed: 08/27/2023] Open
Abstract
OBJECTIVE Eutrophication poses a mounting concern in today's world. Ceratophyllum submersum L. is one of many plants capable of living in eutrophic conditions; therefore, it could play a critical role in addressing the problem of eutrophication. This study aimed to take a first genomic look at C. submersum. RESULTS Sequencing of gDNA from C. submersum yielded enough reads to assemble a plastome. Subsequent annotation and phylogenetic analysis validated existing information regarding angiosperm relationships and the positioning of Ceratophyllales in a wider phylogenetic context.
Affiliation(s)
- Samuel Nestor Meckoni
  - Plant Biotechnology and Bioinformatics, Institute of Plant Biology, TU Braunschweig, 38106, Braunschweig, Germany
- Benneth Nass
  - Plant Biotechnology and Bioinformatics, Institute of Plant Biology, TU Braunschweig, 38106, Braunschweig, Germany
  - Faculty of Environmental Sciences, Czech University of Life Sciences Prague, Kamýcká 129, Praha 6 - Suchdol, CZ-165 21, Prague, Czech Republic
- Boas Pucker
  - Plant Biotechnology and Bioinformatics, Institute of Plant Biology, TU Braunschweig, 38106, Braunschweig, Germany
  - Braunschweig Integrated Centre of Systems Biology (BRICS), TU Braunschweig, 38106, Braunschweig, Germany
5
McConnell SC, Hernandez KM, Andrade J, de Jong JLO. Immune gene variation associated with chromosome-scale differences among individual zebrafish genomes. Sci Rep 2023; 13:7777. [PMID: 37179373 PMCID: PMC10183018 DOI: 10.1038/s41598-023-34467-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/19/2023] [Accepted: 04/30/2023] [Indexed: 05/15/2023] Open
Abstract
Immune genes have evolved to maintain exceptional diversity, offering robust defense against pathogens. We performed genomic assembly to examine immune gene variation in zebrafish. Gene pathway analysis identified immune genes as significantly enriched among genes with evidence of positive selection. A large subset of genes was absent from analysis of coding sequences due to apparent lack of reads, prompting us to examine genes overlapping zero coverage regions (ZCRs), defined as 2 kb stretches without mapped reads. Immune genes were identified as highly enriched within ZCRs, including over 60% of major histocompatibility complex (MHC) genes and NOD-like receptor (NLR) genes, mediators of direct and indirect pathogen recognition. This variation was most highly concentrated throughout one arm of chromosome 4 carrying a large cluster of NLR genes, associated with large-scale structural variation covering more than half of the chromosome. Our genomic assemblies uncovered alternative haplotypes and distinct complements of immune genes among individual zebrafish, including the MHC Class II locus on chromosome 8 and the NLR gene cluster on chromosome 4. While previous studies have shown marked variation in NLR genes between vertebrate species, our study highlights extensive variation in NLR gene regions between individuals of the same species. Taken together, these findings provide evidence of immune gene variation on a scale previously unknown in other vertebrate species and raise questions about potential impact on immune function.
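The zero coverage regions (ZCRs) described above lend themselves to a short illustration. The following is a minimal sketch, assuming a per-base read depth list for one chromosome and reading "2 kb stretches without mapped reads" as runs of at least 2000 consecutive zero-depth positions. The function name, the input format, and that interpretation of the threshold are assumptions, not the authors' pipeline.

```python
def zero_coverage_regions(depth, min_len=2000):
    """Return half-open (start, end) intervals where read depth is zero
    for at least `min_len` consecutive positions.

    `depth` is a hypothetical per-base depth list for one chromosome.
    """
    regions = []
    run_start = None  # start of the current zero-depth run, if any
    for pos, d in enumerate(depth):
        if d == 0:
            if run_start is None:
                run_start = pos
        elif run_start is not None:
            # A covered base ends the run; keep it only if long enough.
            if pos - run_start >= min_len:
                regions.append((run_start, pos))
            run_start = None
    # Handle a zero-depth run that reaches the end of the chromosome.
    if run_start is not None and len(depth) - run_start >= min_len:
        regions.append((run_start, len(depth)))
    return regions


if __name__ == "__main__":
    toy_depth = [5] * 10 + [0] * 2000 + [3] * 5 + [0] * 100
    print(zero_coverage_regions(toy_depth))
```

Genes overlapping the returned intervals could then be collected with an ordinary interval-overlap check against an annotation, which is the step the abstract describes as examining genes overlapping ZCRs.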
Affiliation(s)
- Sean C McConnell
  - Section of Hematology-Oncology and Stem Cell Transplant, Department of Pediatrics, The University of Chicago, Chicago, IL, 60637, USA
- Kyle M Hernandez
  - Center for Research Informatics, The University of Chicago, Chicago, IL, 60637, USA
  - Department of Medicine, Computational Biomedicine and Biomedical Data Science, Center for Translational Data Science, The University of Chicago, Chicago, IL, 60637, USA
- Jorge Andrade
  - Center for Research Informatics, The University of Chicago, Chicago, IL, 60637, USA
  - Kite Pharma, Santa Monica, CA, 90404, USA
- Jill L O de Jong
  - Section of Hematology-Oncology and Stem Cell Transplant, Department of Pediatrics, The University of Chicago, Chicago, IL, 60637, USA
6
Ventimiglia M, Marturano G, Vangelisti A, Usai G, Simoni S, Cavallini A, Giordani T, Natali L, Zuccolo A, Mascagni F. Genome-wide identification and characterization of exapted transposable elements in the large genome of sunflower (Helianthus annuus L.). The Plant Journal: For Cell and Molecular Biology 2023; 113:734-748. [PMID: 36573648 DOI: 10.1111/tpj.16078] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/10/2022] [Revised: 12/07/2022] [Accepted: 12/21/2022] [Indexed: 06/17/2023]
Abstract
Transposable elements (TEs) are an important source of genome variability, playing many roles in the evolution of eukaryotic species. Besides well-known phenomena, TEs may undergo the exaptation process and generate the so-called exapted transposable element genes (ETEs). Here we present a genome-wide survey of ETEs in the large genome of sunflower (Helianthus annuus L.), in which the massive amount of TEs provides a significant source for exaptation. A library of sunflower TEs was used to build TE-specific Hidden Markov Model profiles, which were used to search all available sunflower gene products. In doing so, 20 016 putative ETEs were identified and further investigated for the characteristics that distinguish TEs from genes, leading to the validation of 3530 ETEs. The analysis of ETE transcription patterns under different stress conditions showed a differential regulation triggered by treatments mimicking biotic and abiotic stress; furthermore, the distribution of functional domains of differentially regulated ETEs revealed a relevant presence of domains involved in many aspects of cellular functions. A comparative genomic investigation was performed including species representative of Asterids and appropriate outgroups: the bulk of the resulting ETEs were specific to sunflower, while few ETEs had orthologues in the genomes of all analyzed species, which suggests a conserved function for these. This study highlights the crucial role played by exaptation, actively contributing to species evolution.
Affiliation(s)
- Maria Ventimiglia
  - Department of Agriculture, Food and Environment, University of Pisa, Via del Borghetto 80, 56124, Pisa, Italy
- Giovanni Marturano
  - Crop Science Research Center, Sant'Anna School of Advanced Studies, Piazza Martiri della Libertà 33, 56127, Pisa, Italy
- Alberto Vangelisti
  - Department of Agriculture, Food and Environment, University of Pisa, Via del Borghetto 80, 56124, Pisa, Italy
- Gabriele Usai
  - Department of Agriculture, Food and Environment, University of Pisa, Via del Borghetto 80, 56124, Pisa, Italy
- Samuel Simoni
  - Department of Agriculture, Food and Environment, University of Pisa, Via del Borghetto 80, 56124, Pisa, Italy
- Andrea Cavallini
  - Department of Agriculture, Food and Environment, University of Pisa, Via del Borghetto 80, 56124, Pisa, Italy
- Tommaso Giordani
  - Department of Agriculture, Food and Environment, University of Pisa, Via del Borghetto 80, 56124, Pisa, Italy
- Lucia Natali
  - Department of Agriculture, Food and Environment, University of Pisa, Via del Borghetto 80, 56124, Pisa, Italy
- Andrea Zuccolo
  - Crop Science Research Center, Sant'Anna School of Advanced Studies, Piazza Martiri della Libertà 33, 56127, Pisa, Italy
  - Center for Desert Agriculture, Biological and Environmental Sciences & Engineering Division (BESE), King Abdullah University of Science and Technology (KAUST), Thuwal, 23955-6900, Saudi Arabia
- Flavia Mascagni
  - Department of Agriculture, Food and Environment, University of Pisa, Via del Borghetto 80, 56124, Pisa, Italy
7
Piña JS, Orozco-Arias S, Tobón-Orozco N, Camargo-Forero L, Tabares-Soto R, Guyot R. G-SAIP: Graphical Sequence Alignment Through Parallel Programming in the Post-Genomic Era. Evol Bioinform Online 2023; 19:11769343221150585. [PMID: 36703866 PMCID: PMC9871978 DOI: 10.1177/11769343221150585] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/14/2022] [Accepted: 12/23/2022] [Indexed: 01/22/2023] Open
Abstract
A common task in bioinformatics is to compare DNA sequences to identify similarities between organisms at the sequence level. One approach to such comparison is the dot-plot, a 2-dimensional graphical representation used to analyze DNA or protein alignments. Dot-plot alignment software existed before the sequencing revolution, and it now faces an ongoing limitation when dealing with large sequences, resulting in very long execution times. High-Performance Computing (HPC) techniques have been successfully used in many applications to reduce computing times, but so far, very few applications for graphical sequence alignment using HPC have been reported. Here, we present G-SAIP (Graphical Sequence Alignment in Parallel), software capable of spawning multiple distributed processes on CPUs over a supercomputing infrastructure to speed up dot-plot generation by up to 1.68× compared with the fastest current tools, and to improve efficiency in comparative structural genomic analysis and phylogenetics, given the benefits of pairwise alignments for genome comparison, repetitive structure identification, and assembly quality checking.
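To make the dot-plot idea in the abstract above concrete, the following is a minimal serial sketch, assuming exact k-mer matching: each (i, j) pair it returns is one dot, meaning the k-mer starting at position i of the first sequence also occurs at position j of the second. The function name and the default k are illustrative assumptions; this is not G-SAIP's implementation and does no parallel processing.

```python
from collections import defaultdict


def dotplot_matches(seq_a, seq_b, k=3):
    """Return sorted-by-i (i, j) coordinates where the k-mer at seq_a[i]
    equals the k-mer at seq_b[j]. Serial, exact-match sketch only.
    """
    # Index every k-mer of seq_b by its start positions.
    index = defaultdict(list)
    for j in range(len(seq_b) - k + 1):
        index[seq_b[j:j + k]].append(j)
    # Look up each k-mer of seq_a and record one dot per shared occurrence.
    matches = []
    for i in range(len(seq_a) - k + 1):
        for j in index.get(seq_a[i:i + k], []):
            matches.append((i, j))
    return matches


if __name__ == "__main__":
    for dot in dotplot_matches("ACGTAC", "ACGT", k=3):
        print(dot)
```

Runs of dots along a diagonal indicate collinear similarity, and off-diagonal repeats of the same run indicate repetitive structure. Because each position of the first sequence is looked up independently, the outer loop is the natural place to split work across processes, which is the kind of parallelism the abstract refers to.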
Affiliation(s)
- Johan S. Piña
  - Department of Data Science, People Contact, Manizales, Caldas, Colombia
  - Department of Computer Science, Universidad Autónoma de Manizales, Manizales, Caldas, Colombia
  - Johan S. Piña, Department of Computer Science, Universidad Autónoma de Manizales, Antigua estación del ferrocarril, Manizales, Caldas 170004, Colombia
- Simon Orozco-Arias
  - Department of Computer Science, Universidad Autónoma de Manizales, Manizales, Caldas, Colombia
  - Department of Systems and Informatics, Universidad de Caldas, Manizales, Caldas, Colombia
- Nicolas Tobón-Orozco
  - Department of Computer Science, Universidad Autónoma de Manizales, Manizales, Caldas, Colombia
- Reinel Tabares-Soto
  - Department of Electronics and Automation, Universidad Autónoma de Manizales, Manizales, Caldas, Colombia
- Romain Guyot
  - Department of Electronics and Automation, Universidad Autónoma de Manizales, Manizales, Caldas, Colombia
  - Institut de Recherche pour le Développement, CIRAD, University of Montpellier, Montpellier, France
8
Pucker B, Iorizzo M. Apiaceae FNS I originated from F3H through tandem gene duplication. PLoS One 2023; 18:e0280155. [PMID: 36656808 PMCID: PMC9851555 DOI: 10.1371/journal.pone.0280155] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 07/20/2022] [Accepted: 12/21/2022] [Indexed: 01/20/2023] Open
Abstract
BACKGROUND Flavonoids are specialized metabolites with numerous biological functions in stress response and reproduction of plants. Flavones are one subgroup that is produced by the flavone synthase (FNS). Two distinct enzyme families evolved that can catalyze the biosynthesis of flavones. While the membrane-bound FNS II is widely distributed in seed plants, one lineage of soluble FNS I appeared to be unique to Apiaceae species. RESULTS We show through phylogenetic and comparative genomic analyses that Apiaceae FNS I evolved through tandem gene duplication of flavanone 3-hydroxylase (F3H) followed by neofunctionalization. Currently available datasets suggest that this event happened within the Apiaceae in a common ancestor of Daucus carota and Apium graveolens. The results also support previous findings that FNS I in the Apiaceae evolved independent of FNS I in other plant species. CONCLUSION We validated a long standing hypothesis about the evolution of Apiaceae FNS I and predicted the phylogenetic position of this event. Our results explain how an Apiaceae-specific FNS I lineage evolved and confirm independence from other FNS I lineages reported in non-Apiaceae species.
Affiliation(s)
- Boas Pucker
  - Institute of Plant Biology, TU Braunschweig, Braunschweig, Germany
  - BRICS, TU Braunschweig, Braunschweig, Germany
  - * E-mail: (BP); (MI)
- Massimo Iorizzo
  - Plants for Human Health Institute, NC State University, Kannapolis, North Carolina, United States of America
  - Department of Horticultural Science, NC State University, Raleigh, North Carolina, United States of America
  - * E-mail: (BP); (MI)
9
Rabanal FA, Gräff M, Lanz C, Fritschi K, Llaca V, Lang M, Carbonell-Bejerano P, Henderson I, Weigel D. Pushing the limits of HiFi assemblies reveals centromere diversity between two Arabidopsis thaliana genomes. Nucleic Acids Res 2022; 50:12309-12327. [PMID: 36453992 PMCID: PMC9757041 DOI: 10.1093/nar/gkac1115] [Citation(s) in RCA: 14] [Impact Index Per Article: 7.0] [Received: 02/18/2022] [Revised: 09/13/2022] [Accepted: 11/10/2022] [Indexed: 12/05/2022] Open
Abstract
Although long-read sequencing can often enable chromosome-level reconstruction of genomes, it is still unclear how one can routinely obtain gapless assemblies. In the model plant Arabidopsis thaliana, other than the reference accession Col-0, all other accessions de novo assembled with long-reads until now have used PacBio continuous long reads (CLR). Although these assemblies sometimes achieved chromosome-arm level contigs, they inevitably broke near the centromeres, excluding megabases of DNA from analysis in pan-genome projects. Since PacBio high-fidelity (HiFi) reads circumvent the high error rate of CLR technologies, albeit at the expense of read length, we compared a CLR assembly of accession Eyach15-2 to HiFi assemblies of the same sample. The use of five different assemblers starting from subsampled data allowed us to evaluate the impact of coverage and read length. We found that centromeres and rDNA clusters are responsible for 71% of contig breaks in the CLR scaffolds, while relatively short stretches of GA/TC repeats are at the core of >85% of the unfilled gaps in our best HiFi assemblies. Since the HiFi technology consistently enabled us to reconstruct gapless centromeres and 5S rDNA clusters, we demonstrate the value of the approach by comparing these previously inaccessible regions of the genome between the Eyach15-2 accession and the reference accession Col-0.
Affiliation(s)
- Christa Lanz
  - Department of Molecular Biology, Max Planck Institute for Biology Tübingen, 72076 Tübingen, Germany
- Katrin Fritschi
  - Department of Molecular Biology, Max Planck Institute for Biology Tübingen, 72076 Tübingen, Germany
- Victor Llaca
  - Genomics Technologies, Corteva Agriscience, Johnston, IA 50131, USA
- Michelle Lang
  - Genomics Technologies, Corteva Agriscience, Johnston, IA 50131, USA
- Pablo Carbonell-Bejerano
  - Department of Molecular Biology, Max Planck Institute for Biology Tübingen, 72076 Tübingen, Germany
- Ian Henderson
  - Department of Plant Sciences, University of Cambridge, Cambridge, CB2 3EA, UK
- Detlef Weigel
  - Correspondence may also be addressed to Detlef Weigel. Tel: +49 7071 601 1410;
10
Orantes-Bonilla M, Makhoul M, Lee H, Chawla HS, Vollrath P, Langstroff A, Sedlazeck FJ, Zou J, Snowdon RJ. Frequent spontaneous structural rearrangements promote rapid genome diversification in a Brassica napus F1 generation. Frontiers in Plant Science 2022; 13:1057953. [PMID: 36466276 PMCID: PMC9716091 DOI: 10.3389/fpls.2022.1057953] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 09/30/2022] [Accepted: 10/31/2022] [Indexed: 05/26/2023]
Abstract
In a cross between two homozygous Brassica napus plants of synthetic and natural origin, we demonstrate that novel structural genome variants from the synthetic parent cause immediate genome diversification among F1 offspring. Long read sequencing in twelve F1 sister plants revealed five large-scale structural rearrangements where both parents carried different homozygous alleles but the heterozygous F1 genomes were not identical heterozygotes as expected. Such spontaneous rearrangements were part of homoeologous exchanges or segmental deletions and were identified in different, individual F1 plants. The variants caused deletions, gene copy-number variations, diverging methylation patterns and other structural changes in large numbers of genes and may have been causal for unexpected phenotypic variation between individual F1 sister plants, for example strong divergence of plant height and leaf area. This example supports the hypothesis that spontaneous de novo structural rearrangements after de novo polyploidization can rapidly overcome intense allopolyploidization bottlenecks to re-expand crop genetic diversity for ecogeographical expansion and human selection. The findings imply that natural genome restructuring in allopolyploid plants from interspecific hybridization, a common approach in plant breeding, can have a considerably more drastic impact on genetic diversity in agricultural ecosystems than extremely precise, biotechnological genome modifications.
Affiliation(s)
- Mauricio Orantes-Bonilla
  - Department of Plant Breeding, IFZ Research Centre for Biosystems, Land Use and Nutrition, Justus Liebig University, Giessen, Germany
- Manar Makhoul
  - Department of Plant Breeding, IFZ Research Centre for Biosystems, Land Use and Nutrition, Justus Liebig University, Giessen, Germany
- HueyTyng Lee
  - Department of Plant Breeding, IFZ Research Centre for Biosystems, Land Use and Nutrition, Justus Liebig University, Giessen, Germany
- Harmeet Singh Chawla
  - Department of Plant Sciences, Crop Development Centre, University of Saskatchewan, Saskatoon, SK, Canada
- Paul Vollrath
  - Department of Plant Breeding, IFZ Research Centre for Biosystems, Land Use and Nutrition, Justus Liebig University, Giessen, Germany
- Anna Langstroff
  - Department of Plant Breeding, IFZ Research Centre for Biosystems, Land Use and Nutrition, Justus Liebig University, Giessen, Germany
- Fritz J. Sedlazeck
  - Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, United States
- Jun Zou
  - National Key Laboratory of Crop Genetic Improvement, College of Plant Science & Technology, Huazhong Agricultural University, Wuhan, China
- Rod J. Snowdon
  - Department of Plant Breeding, IFZ Research Centre for Biosystems, Land Use and Nutrition, Justus Liebig University, Giessen, Germany
11
Feng T, Wu P, Gao H, Kosma DK, Jenks MA, Lü S. Natural variation in root suberization is associated with local environment in Arabidopsis thaliana. The New Phytologist 2022; 236:385-398. [PMID: 35751382 DOI: 10.1111/nph.18341] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Received: 03/08/2022] [Accepted: 06/16/2022] [Indexed: 06/15/2023]
Abstract
Genetic signatures of climate adaptation have been widely recognized across the genomes of many organisms; however, the eco-physiological basis for linking genomic polymorphisms with local adaptations remains largely unexplored. Using a panel of 218 world-wide Arabidopsis accessions, we characterized the natural variation in root suberization by quantifying 16 suberin monomers. We explored the associations between suberization traits and 126 climate variables. We conducted genome-wide association analysis and integrated previous genotype-environment association (GEA) to identify the genetic bases underlying suberization variation and their involvement in climate adaptation. Root suberin content displays extensive variation across Arabidopsis populations and significantly correlates with local moisture gradients and soil characteristics. Specifically, enhanced suberization is associated with drier environments, higher soil cation-exchange capacity, and lower soil pH; higher proportional levels of very-long-chain suberin are negatively correlated with moisture availability, lower soil gravel content, and higher soil silt fraction. We identified 94 putative causal loci and experimentally proved that GPAT6 is involved in C16 suberin biosynthesis. Highly significant associations between the putative genes and environmental variables were observed. Roots appear highly responsive to environmental heterogeneity via regulation of suberization, especially the suberin composition. The patterns of suberization-environment correlation and the suberin-related GEA fit the expectations of local adaptation for the polygenic suberization trait.
Affiliation(s)
- Tao Feng
  - State Key Laboratory of Biocatalysis and Enzyme Engineering, School of Life Sciences, Hubei University, Wuhan, 430062, China
  - Hubei Hongshan Laboratory, Wuhan, 430070, China
- Pan Wu
  - State Key Laboratory of Biocatalysis and Enzyme Engineering, School of Life Sciences, Hubei University, Wuhan, 430062, China
- Huani Gao
  - State Key Laboratory of Biocatalysis and Enzyme Engineering, School of Life Sciences, Hubei University, Wuhan, 430062, China
- Dylan K Kosma
  - Department of Biochemistry and Molecular Biology, University of Nevada, Reno, Reno, NV, 89557, USA
- Matthew A Jenks
  - School of Plant Sciences, College of Agriculture and Life Sciences, The University of Arizona, Tucson, AZ, 85721, USA
- Shiyou Lü
  - State Key Laboratory of Biocatalysis and Enzyme Engineering, School of Life Sciences, Hubei University, Wuhan, 430062, China
  - Hubei Hongshan Laboratory, Wuhan, 430070, China
12
Rajput R, Tyagi S, Naik J, Pucker B, Stracke R, Pandey A. The R2R3-MYB gene family in Cicer arietinum: genome-wide identification and expression analysis leads to functional characterization of proanthocyanidin biosynthesis regulators in the seed coat. Planta 2022; 256:67. [PMID: 36038740 DOI: 10.1007/s00425-022-03979-z] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 06/29/2022] [Accepted: 08/19/2022] [Indexed: 06/15/2023]
Abstract
We identified 119 typical CaMYB encoding genes and reveal the major components of the proanthocyanidin regulatory network. CaPARs emerged as promising targets for genetic engineering toward improved agronomic traits in C. arietinum. Chickpea (Cicer arietinum) is among the eight oldest crops and has two main types, i.e., desi and kabuli, whose most obvious difference is the color of their seeds. We show that this color difference is due to differences in proanthocyanidin content of seed coats. Using a targeted approach, we performed in silico analysis, metabolite profiling, molecular, genetic, and biochemical studies to decipher the transcriptional regulatory network involved in proanthocyanidin biosynthesis in the seed coat of C. arietinum. Based on the annotated C. arietinum reference genome sequence, we identified 119 typical CaMYB encoding genes, grouped in 32 distinct clades. Two CaR2R3-MYB transcription factors, named CaPAR1 and CaPAR2, clustering with known proanthocyanidin regulators (PARs) were identified and further analyzed. The expression of CaPAR genes correlated well with the expression of the key structural proanthocyanidin biosynthesis genes CaANR and CaLAR and with proanthocyanidin levels. Protein-protein interaction studies suggest the in vivo interaction of CaPAR1 and CaPAR2 with the bHLH-type transcription factor CaTT8. Co-transfection analyses using Arabidopsis thaliana protoplasts showed that the CaPAR proteins form a MBW complex with CaTT8 and CaTTG1, able to activate the promoters of CaANR and CaLAR in planta. Finally, transgenic expression of CaPARs in the proanthocyanidin-deficient A. thaliana mutant tt2-1 leads to complementation of the transparent testa phenotype. Taken together, our results reveal main components of the proanthocyanidin regulatory network in C. arietinum and suggest that CaPARs are relevant targets of genetic engineering toward improved agronomic traits.
Affiliation(s)
- Ruchika Rajput
  - National Institute of Plant Genome Research, Aruna Asaf Ali Marg, New Delhi, 110067, India
- Shivi Tyagi
  - National Institute of Plant Genome Research, Aruna Asaf Ali Marg, New Delhi, 110067, India
- Jogindra Naik
  - National Institute of Plant Genome Research, Aruna Asaf Ali Marg, New Delhi, 110067, India
- Boas Pucker
  - Chair of Genetics and Genomics of Plants, Bielefeld University, 33615, Bielefeld, Germany
  - Institute of Plant Biology and Braunschweig Integrated Centre of Systems Biology (BRICS), TU Braunschweig, Braunschweig, Germany
- Ralf Stracke
  - Chair of Genetics and Genomics of Plants, Bielefeld University, 33615, Bielefeld, Germany
- Ashutosh Pandey
  - National Institute of Plant Genome Research, Aruna Asaf Ali Marg, New Delhi, 110067, India
13
Schilbert HM, Pucker B, Ries D, Viehöver P, Micic Z, Dreyer F, Beckmann K, Wittkop B, Weisshaar B, Holtgräwe D. Mapping-by-Sequencing Reveals Genomic Regions Associated with Seed Quality Parameters in Brassica napus. Genes (Basel) 2022; 13:genes13071131. [PMID: 35885914 PMCID: PMC9317104 DOI: 10.3390/genes13071131] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 05/31/2022] [Revised: 06/15/2022] [Accepted: 06/22/2022] [Indexed: 11/21/2022]
Abstract
Rapeseed (Brassica napus L.) is an important oil crop and has the potential to serve as a highly productive source of protein. This protein exhibits an excellent amino acid composition and has high nutritional value for humans. Seed protein content (SPC) and seed oil content (SOC) are two complex quantitative and polygenic traits which are negatively correlated and assumed to be controlled by additive and epistatic effects. A reduction in seed glucosinolate (GSL) content is desired as GSLs cause a stringent and bitter taste. The goal here was the identification of genomic intervals relevant for seed GSL content and SPC/SOC. Mapping by sequencing (MBS) revealed 30 and 15 new and known genomic intervals associated with seed GSL content and SPC/SOC, respectively. Within these intervals, we identified known but also so far unknown putatively causal genes and sequence variants. A 4 bp insertion in the MYB28 homolog on C09 shows a significant association with a reduction in seed GSL content. This study provides insights into the genetic architecture and potential mechanisms underlying seed quality traits, which will enhance future breeding approaches in B. napus.
Affiliation(s)
- Hanna Marie Schilbert
  - Genetics and Genomics of Plants, CeBiTec & Faculty of Biology, Bielefeld University, Universitätsstraße 27, 33615 Bielefeld, Germany
  - Graduate School DILS, Bielefeld Institute for Bioinformatics Infrastructure (BIBI), Faculty of Technology, Bielefeld University, Universitätsstraße 27, 33615 Bielefeld, Germany
- Boas Pucker
  - Genetics and Genomics of Plants, CeBiTec & Faculty of Biology, Bielefeld University, Universitätsstraße 27, 33615 Bielefeld, Germany
  - Plant Biotechnology and Bioinformatics, Institute of Plant Biology & Braunschweig Integrated Centre of Systems Biology (BRICS), TU Braunschweig, Mendelssohnstraße 4, 38106 Braunschweig, Germany
- David Ries
  - Genetics and Genomics of Plants, CeBiTec & Faculty of Biology, Bielefeld University, Universitätsstraße 27, 33615 Bielefeld, Germany
- Prisca Viehöver
  - Genetics and Genomics of Plants, CeBiTec & Faculty of Biology, Bielefeld University, Universitätsstraße 27, 33615 Bielefeld, Germany
- Zeljko Micic
  - Deutsche Saatveredelung AG, Weissenburger Straße 5, 59557 Lippstadt, Germany
- Felix Dreyer
  - NPZ Innovation GmbH, Hohenlieth-Hof 1, 24363 Holtsee, Germany
- Katrin Beckmann
  - NPZ Innovation GmbH, Hohenlieth-Hof 1, 24363 Holtsee, Germany
- Benjamin Wittkop
  - Department of Plant Breeding, Justus Liebig University, Heinrich-Buff-Ring 26-32, 35392 Giessen, Germany
- Bernd Weisshaar
  - Genetics and Genomics of Plants, CeBiTec & Faculty of Biology, Bielefeld University, Universitätsstraße 27, 33615 Bielefeld, Germany
- Daniela Holtgräwe
  - Genetics and Genomics of Plants, CeBiTec & Faculty of Biology, Bielefeld University, Universitätsstraße 27, 33615 Bielefeld, Germany
14
Automatic identification and annotation of MYB gene family members in plants. BMC Genomics 2022; 23:220. [PMID: 35305581 PMCID: PMC8933966 DOI: 10.1186/s12864-022-08452-5] [Citation(s) in RCA: 17] [Impact Index Per Article: 8.5] [Received: 11/15/2021] [Accepted: 03/07/2022] [Indexed: 12/31/2022]
Abstract
BACKGROUND MYBs are among the largest transcription factor families in plants. Consequently, members of this family are involved in a plethora of processes including development and specialized metabolism. The MYB families of many plant species were investigated in the last two decades since the first investigation looked at Arabidopsis thaliana. This body of knowledge and characterized sequences provide the basis for the identification, classification, and functional annotation of candidate sequences in new genome and transcriptome assemblies. RESULTS A pipeline for the automatic identification and functional annotation of MYBs in a given sequence data set was implemented in Python. MYB candidates are identified, screened for the presence of a MYB domain and other motifs, and finally placed in a phylogenetic context with well characterized sequences. In addition to technical benchmarking based on existing annotation, the transcriptome assembly of Croton tiglium and the annotated genome sequence of Castanea crenata were screened for MYBs. Results of both analyses are presented in this study to illustrate the potential of this application. The analysis of one species takes only a few minutes depending on the number of predicted sequences and the size of the MYB gene family. This pipeline, the required bait sequences, and reference sequences for a classification are freely available on GitHub: https://github.com/bpucker/MYB_annotator. CONCLUSIONS This automatic annotation of the MYB gene family in novel assemblies makes genome-wide investigations consistent and paves the way for comparative studies in the future. Candidate genes for in-depth analyses are presented based on their orthology to previously characterized sequences which allows the functional annotation of the newly identified MYBs with high confidence. The identification of orthologs can also be harnessed to detect duplication and deletion events.
15
Wierzbicki F, Schwarz F, Cannalonga O, Kofler R. Novel quality metrics allow identifying and generating high-quality assemblies of piRNA clusters. Mol Ecol Resour 2022; 22:102-121. [PMID: 34181811 DOI: 10.1111/1755-0998.13455] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 12/15/2020] [Revised: 04/30/2021] [Accepted: 06/14/2021] [Indexed: 12/30/2022]
Abstract
In most animals, it is thought that the proliferation of a transposable element (TE) is stopped when the TE jumps into a piRNA cluster. Despite this central importance, little is known about the composition and the evolutionary dynamics of piRNA clusters. This is largely because piRNA clusters are notoriously difficult to assemble as they are frequently composed of highly repetitive DNA. With long reads, we may finally be able to obtain reliable assemblies of piRNA clusters. Unfortunately, it is unclear how to generate and identify the best assemblies, as many assembly strategies exist and standard quality metrics are ignorant of TEs. To address these problems, we introduce several novel quality metrics that assess: (a) the fraction of completely assembled piRNA clusters, (b) the quality of the assembled clusters and (c) whether an assembly captures the overall TE landscape of an organism (i.e. the abundance, the number of SNPs and internal deletions of all TE families). The requirements for computing these metrics vary, ranging from annotations of piRNA clusters to consensus sequences of TEs and genomic sequencing data. Using these novel metrics, we evaluate the effect of assembly algorithm, polishing, read length, coverage, residual polymorphisms and finally identify strategies that yield reliable assemblies of piRNA clusters. Based on an optimized approach, we provide assemblies for the two Drosophila melanogaster strains Canton-S and Pi2. About 80% of known piRNA clusters were assembled in both strains. Finally, we demonstrate the generality of our approach by extending our metrics to humans and Arabidopsis thaliana.
Affiliation(s)
- Filip Wierzbicki
  - Institut für Populationsgenetik, Vetmeduni Vienna, Wien, Austria
  - Vienna Graduate School of Population Genetics, Vetmeduni Vienna, Vienna, Austria
- Florian Schwarz
  - Institut für Populationsgenetik, Vetmeduni Vienna, Wien, Austria
  - Vienna Graduate School of Population Genetics, Vetmeduni Vienna, Vienna, Austria
- Robert Kofler
  - Institut für Populationsgenetik, Vetmeduni Vienna, Wien, Austria
16
Schilbert HM, Schöne M, Baier T, Busche M, Viehöver P, Weisshaar B, Holtgräwe D. Characterization of the Brassica napus Flavonol Synthase Gene Family Reveals Bifunctional Flavonol Synthases. FRONTIERS IN PLANT SCIENCE 2021; 12:733762. [PMID: 34721462 PMCID: PMC8548573 DOI: 10.3389/fpls.2021.733762] [Citation(s) in RCA: 16] [Impact Index Per Article: 5.3] [Received: 06/30/2021] [Accepted: 09/21/2021] [Indexed: 06/13/2023]
Abstract
Flavonol synthase (FLS) is a key enzyme for the formation of flavonols, which are a subclass of the flavonoids. FLS catalyzes the conversion of dihydroflavonols to flavonols. The enzyme belongs to the 2-oxoglutarate-dependent dioxygenases (2-ODD) superfamily. We characterized the FLS gene family of Brassica napus that covers 13 genes, based on the genome sequence of the B. napus cultivar Express 617. The goal was to unravel which BnaFLS genes are relevant for seed flavonol accumulation in the amphidiploid species B. napus. Two BnaFLS1 homeologs were identified and shown to encode bifunctional enzymes. Both exhibit FLS activity as well as flavanone 3-hydroxylase (F3H) activity, which was demonstrated in vivo and in planta. BnaFLS1-1 and -2 are capable of converting flavanones into dihydroflavonols and further into flavonols. Analysis of spatio-temporal transcription patterns revealed similar expression profiles of BnaFLS1 genes. Both are mainly expressed in reproductive organs and co-expressed with the genes encoding early steps of flavonoid biosynthesis. Our results provide novel insights into flavonol biosynthesis in B. napus and contribute information for breeding targets with the aim to modify the flavonol content in rapeseed.
Affiliation(s)
- Hanna Marie Schilbert
  - Genetics and Genomics of Plants, CeBiTec and Faculty of Biology, Bielefeld University, Bielefeld, Germany
- Maximilian Schöne
  - Genetics and Genomics of Plants, CeBiTec and Faculty of Biology, Bielefeld University, Bielefeld, Germany
- Thomas Baier
  - Algae Biotechnology and Bioenergy, CeBiTec and Faculty of Biology, Bielefeld University, Bielefeld, Germany
- Mareike Busche
  - Genetics and Genomics of Plants, CeBiTec and Faculty of Biology, Bielefeld University, Bielefeld, Germany
- Prisca Viehöver
  - Genetics and Genomics of Plants, CeBiTec and Faculty of Biology, Bielefeld University, Bielefeld, Germany
- Bernd Weisshaar
  - Genetics and Genomics of Plants, CeBiTec and Faculty of Biology, Bielefeld University, Bielefeld, Germany
- Daniela Holtgräwe
  - Genetics and Genomics of Plants, CeBiTec and Faculty of Biology, Bielefeld University, Bielefeld, Germany
17
Pucker B, Kleinbölting N, Weisshaar B. Large scale genomic rearrangements in selected Arabidopsis thaliana T-DNA lines are caused by T-DNA insertion mutagenesis. BMC Genomics 2021; 22:599. [PMID: 34362298 PMCID: PMC8348815 DOI: 10.1186/s12864-021-07877-8] [Citation(s) in RCA: 32] [Impact Index Per Article: 10.7] [Received: 03/07/2021] [Accepted: 07/06/2021] [Indexed: 01/04/2023]
Abstract
BACKGROUND Experimental proof of gene function assignments in plants is based on mutant analyses. T-DNA insertion lines provided an invaluable resource of mutants and enabled systematic reverse genetics-based investigation of the functions of Arabidopsis thaliana genes during the last decades. RESULTS We sequenced the genomes of 14 A. thaliana GABI-Kat T-DNA insertion lines, which eluded flanking sequence tag-based attempts to characterize their insertion loci, with Oxford Nanopore Technologies (ONT) long reads. Complex T-DNA insertions were resolved and 11 previously unknown T-DNA loci identified, resulting in about 2 T-DNA insertions per line and suggesting that this number was previously underestimated. T-DNA mutagenesis caused fusions of chromosomes along with compensating translocations to keep the gene set complete throughout meiosis. Also, an inverted duplication of 800 kbp was detected. About 10 % of GABI-Kat lines might be affected by chromosomal rearrangements, some of which do not involve T-DNA. Local assembly of selected reads was shown to be a computationally effective method to resolve the structure of T-DNA insertion loci. We developed an automated workflow to support investigation of long read data from T-DNA insertion lines. All steps from DNA extraction to assembly of T-DNA loci can be completed within days. CONCLUSIONS Long read sequencing was demonstrated to be an effective way to resolve complex T-DNA insertions and chromosome fusions. Many T-DNA insertions comprise not just a single T-DNA, but complex arrays of multiple T-DNAs. It is becoming obvious that T-DNA insertion alleles must be characterized by exact identification of both T-DNA::genome junctions to generate clear genotype-to-phenotype relations.
Affiliation(s)
- Boas Pucker
  - Genetics and Genomics of Plants, Center for Biotechnology (CeBiTec), Bielefeld University, Sequenz 1, 33615 Bielefeld, Germany
  - Evolution and Diversity, Department of Plant Sciences, University of Cambridge, Cambridge, UK
- Nils Kleinbölting
  - Bioinformatics Resource Facility, Center for Biotechnology (CeBiTec), Bielefeld University, Sequenz 1, 33615 Bielefeld, Germany
- Bernd Weisshaar
  - Genetics and Genomics of Plants, Center for Biotechnology (CeBiTec), Bielefeld University, Sequenz 1, 33615 Bielefeld, Germany
18
Zmienko A, Marszalek-Zenczak M, Wojciechowski P, Samelak-Czajka A, Luczak M, Kozlowski P, Karlowski WM, Figlerowicz M. AthCNV: A Map of DNA Copy Number Variations in the Arabidopsis Genome. THE PLANT CELL 2020; 32:1797-1819. [PMID: 32265262 PMCID: PMC7268809 DOI: 10.1105/tpc.19.00640] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.8] [Received: 08/19/2019] [Revised: 03/09/2020] [Accepted: 03/30/2020] [Indexed: 05/13/2023]
Abstract
Copy number variations (CNVs) greatly contribute to intraspecies genetic polymorphism and phenotypic diversity. Recent analyses of sequencing data for >1000 Arabidopsis (Arabidopsis thaliana) accessions focused on small variations and did not include CNVs. Here, we performed genome-wide analysis and identified large indels (50 to 499 bp) and CNVs (500 bp and larger) in these accessions. The CNVs fully overlap with 18.3% of protein-coding genes, with enrichment for evolutionarily young genes and genes involved in stress and defense. By combining analysis of both genes and transposable elements (TEs) affected by CNVs, we revealed that the variation statuses of genes and TEs are tightly linked and jointly contribute to the unequal distribution of these elements in the genome. We also determined the gene copy numbers in a set of 1060 accessions and experimentally validated the accuracy of our predictions by multiplex ligation-dependent probe amplification assays. We then successfully used the CNVs as markers to analyze population structure and migration patterns. Finally, we examined the impact of gene dosage variation triggered by a CNV spanning the SEC10 gene on SEC10 expression at both the transcript and protein levels. The catalog of CNVs, CNV-overlapping genes, and their genotypes in a top model dicot will stimulate the exploration of the genetic basis of phenotypic variation.
Affiliation(s)
- Agnieszka Zmienko
  - Institute of Bioorganic Chemistry, Polish Academy of Sciences, 61-704 Poznan, Poland
  - Institute of Computing Science, Faculty of Computing Science, Poznan University of Technology, Poznan, Poland
- Pawel Wojciechowski
  - Institute of Bioorganic Chemistry, Polish Academy of Sciences, 61-704 Poznan, Poland
  - Institute of Computing Science, Faculty of Computing Science, Poznan University of Technology, Poznan, Poland
- Anna Samelak-Czajka
  - Institute of Bioorganic Chemistry, Polish Academy of Sciences, 61-704 Poznan, Poland
- Magdalena Luczak
  - Institute of Bioorganic Chemistry, Polish Academy of Sciences, 61-704 Poznan, Poland
- Piotr Kozlowski
  - Institute of Bioorganic Chemistry, Polish Academy of Sciences, 61-704 Poznan, Poland
- Wojciech M Karlowski
  - Department of Computational Biology, Institute of Molecular Biology and Biotechnology, Faculty of Biology, Adam Mickiewicz University, 61-614 Poznan, Poland
- Marek Figlerowicz
  - Institute of Bioorganic Chemistry, Polish Academy of Sciences, 61-704 Poznan, Poland
  - Institute of Computing Science, Faculty of Computing Science, Poznan University of Technology, Poznan, Poland
19
Schilbert HM, Rempel A, Pucker B. Comparison of Read Mapping and Variant Calling Tools for the Analysis of Plant NGS Data. PLANTS (BASEL, SWITZERLAND) 2020; 9:E439. [PMID: 32252268 PMCID: PMC7238416 DOI: 10.3390/plants9040439] [Citation(s) in RCA: 23] [Impact Index Per Article: 5.8] [Received: 03/15/2020] [Revised: 03/28/2020] [Accepted: 03/30/2020] [Indexed: 12/30/2022]
Abstract
High-throughput sequencing technologies have rapidly developed during the past years and have become an essential tool in plant sciences. However, the analysis of genomic data remains challenging and relies mostly on the performance of automatic pipelines. Frequently applied pipelines involve the alignment of sequence reads against a reference sequence and the identification of sequence variants. Since most benchmarking studies of bioinformatics tools for this purpose have been conducted on human datasets, there is a lack of benchmarking studies in plant sciences. In this study, we evaluated the performance of 50 different variant calling pipelines, including five read mappers and ten variant callers, on six real plant datasets of the model organism Arabidopsis thaliana. Sets of variants were evaluated based on various parameters including sensitivity and specificity. We found that all investigated tools are suitable for analysis of NGS data in plant research. When looking at different performance metrics, BWA-MEM and Novoalign were the best mappers and GATK returned the best results in the variant calling step.
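The figure of 50 pipelines follows from pairing each of the five read mappers with each of the ten variant callers. The sketch below illustrates that combination and the two evaluation metrics named in the abstract. It is only an illustration, not the authors' benchmarking code: apart from BWA-MEM, Novoalign and GATK, all tool names are hypothetical placeholders, and the counts in the usage note are invented.

```python
from itertools import product

# BWA-MEM, Novoalign and GATK are named in the abstract; every
# "mapper_N" / "caller_N" entry is a hypothetical placeholder.
MAPPERS = ["BWA-MEM", "Novoalign", "mapper_3", "mapper_4", "mapper_5"]
CALLERS = ["GATK"] + [f"caller_{i}" for i in range(2, 11)]

# Each mapper is paired with each caller: 5 x 10 = 50 pipelines.
PIPELINES = list(product(MAPPERS, CALLERS))


def sensitivity(true_pos: int, false_neg: int) -> float:
    """Fraction of real variants that the pipeline reported."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg: int, false_pos: int) -> float:
    """Fraction of non-variant sites that the pipeline left uncalled."""
    return true_neg / (true_neg + false_pos)
```

For example, a pipeline that recovers 90 of 100 known variants has a sensitivity of 0.9, regardless of how many false positives it reports; specificity captures the latter.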
Affiliation(s)
- Hanna Marie Schilbert
  - Genetics and Genomics of Plants, CeBiTec and Faculty of Biology, Bielefeld University, 33615 Bielefeld, Germany
- Andreas Rempel
  - Genetics and Genomics of Plants, CeBiTec and Faculty of Biology, Bielefeld University, 33615 Bielefeld, Germany
  - Graduate School DILS, Bielefeld Institute for Bioinformatics Infrastructure (BIBI), Faculty of Technology, Bielefeld University, 33615 Bielefeld, Germany
- Boas Pucker
  - Genetics and Genomics of Plants, CeBiTec and Faculty of Biology, Bielefeld University, 33615 Bielefeld, Germany
  - Molecular Genetics and Physiology of Plants, Faculty of Biology and Biotechnology, Ruhr-University Bochum, 44801 Bochum, Germany
20
Siadjeu C, Pucker B, Viehöver P, Albach DC, Weisshaar B. High Contiguity De Novo Genome Sequence Assembly of Trifoliate Yam (Dioscorea dumetorum) Using Long Read Sequencing. Genes (Basel) 2020; 11:E274. [PMID: 32143301 PMCID: PMC7140821 DOI: 10.3390/genes11030274] [Citation(s) in RCA: 28] [Impact Index Per Article: 7.0] [Received: 01/31/2020] [Revised: 02/25/2020] [Accepted: 02/29/2020] [Indexed: 12/17/2022]
Abstract
Trifoliate yam (Dioscorea dumetorum) is one example of an orphan crop, not traded internationally. Post-harvest hardening of the tubers of this species starts within 24 h after harvesting and renders the tubers inedible. Genomic resources are required for D. dumetorum to improve breeding for non-hardening varieties as well as for other traits. We sequenced the D. dumetorum genome and generated the corresponding annotation. The two haplophases of this highly heterozygous genome were separated to a large extent. The assembly represents 485 Mbp of the genome with an N50 of over 3.2 Mbp. A total of 35,269 protein-encoding gene models as well as 9941 non-coding RNA genes were predicted, and functional annotations were assigned.
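The N50 quoted above is the standard contiguity statistic: the length of the shortest contig in the smallest set of longest contigs that together cover at least half of the assembly. A minimal sketch of that definition follows; the function name and the toy contig lengths are illustrative assumptions, not data from the study.

```python
def n50(contig_lengths):
    """Return the N50 of a collection of contig lengths.

    Contigs are visited from longest to shortest; the N50 is the length
    of the contig at which the running sum first reaches half of the
    total assembly size.
    """
    lengths = sorted(contig_lengths, reverse=True)
    if not lengths:
        raise ValueError("at least one contig length is required")
    total = sum(lengths)
    running = 0
    for length in lengths:
        running += length
        # Compare 2 * running with total to avoid fractional halves.
        if 2 * running >= total:
            return length
```

With toy contigs of 6, 5, 4, 3 and 2 Mbp (20 Mbp in total), the two longest contigs already cover 11 Mbp, so the N50 is 5 Mbp.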
Affiliation(s)
- Christian Siadjeu
  - Institute for Biology and Environmental Sciences, Biodiversity and Evolution of Plants, Carl-von-Ossietzky University Oldenburg, Carl-von-Ossietzky Str. 9-11, 26111 Oldenburg, Germany
  - Genetics and Genomics of Plants, Faculty of Biology, Center for Biotechnology (CeBiTec), Bielefeld University, Sequenz 1, 33615 Bielefeld, NRW, Germany
- Boas Pucker
  - Genetics and Genomics of Plants, Faculty of Biology, Center for Biotechnology (CeBiTec), Bielefeld University, Sequenz 1, 33615 Bielefeld, NRW, Germany
  - Molecular Genetics and Physiology of Plants, Faculty of Biology and Biotechnology, Ruhr-University Bochum, Universitätsstraße 150, 44801 Bochum, Germany
- Prisca Viehöver
  - Genetics and Genomics of Plants, Faculty of Biology, Center for Biotechnology (CeBiTec), Bielefeld University, Sequenz 1, 33615 Bielefeld, NRW, Germany
- Dirk C. Albach
  - Institute for Biology and Environmental Sciences, Biodiversity and Evolution of Plants, Carl-von-Ossietzky University Oldenburg, Carl-von-Ossietzky Str. 9-11, 26111 Oldenburg, Germany
- Bernd Weisshaar
  - Genetics and Genomics of Plants, Faculty of Biology, Center for Biotechnology (CeBiTec), Bielefeld University, Sequenz 1, 33615 Bielefeld, NRW, Germany
21
Genome Sequencing of Musa acuminata Dwarf Cavendish Reveals a Duplication of a Large Segment of Chromosome 2. G3-GENES GENOMES GENETICS 2020; 10:37-42. [PMID: 31712258 PMCID: PMC6945009 DOI: 10.1534/g3.119.400847] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Indexed: 02/03/2023]
Abstract
Different Musa species, subspecies, and cultivars are currently investigated to reveal their genomic diversity. Here, we compare the genome sequence of one of the commercially most important cultivars, Musa acuminata Dwarf Cavendish, against the Pahang reference genome assembly. Numerous small sequence variants were detected and the ploidy of the cultivar presented here was determined as triploid based on sequence variant frequencies. Illumina sequence data also revealed a duplication of a large segment on the long arm of chromosome 2 in the Dwarf Cavendish genome. Comparison against previously sequenced cultivars provided evidence that this duplication is unique to Dwarf Cavendish. Although no functional relevance of this duplication was identified, this example shows the potential of plants to tolerate such aneuploidies.
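Inferring ploidy from sequence variant frequencies rests on a simple expectation: at heterozygous sites, the alternative allele should appear in about k/n of the reads for a genome with n chromosome sets (0.5 for a diploid, 1/3 and 2/3 for a triploid). The sketch below illustrates this reasoning only; it is not the authors' procedure, and the function names, candidate ploidies and example frequencies are assumptions made for illustration.

```python
def expected_fractions(ploidy):
    """Allele fractions expected at heterozygous sites: 1/n ... (n-1)/n."""
    return [k / ploidy for k in range(1, ploidy)]


def best_ploidy(allele_freqs, candidates=(2, 3, 4)):
    """Pick the candidate ploidy whose expected fractions fit best.

    The fit is the mean distance of each observed allele frequency to
    its nearest expected fraction. Ties go to the lowest candidate,
    so a clean 0.5 peak is reported as diploid rather than tetraploid.
    """
    if not allele_freqs:
        raise ValueError("at least one allele frequency is required")

    def mean_distance(ploidy):
        expected = expected_fractions(ploidy)
        return sum(
            min(abs(freq - e) for e in expected) for freq in allele_freqs
        ) / len(allele_freqs)

    return min(candidates, key=mean_distance)
```

Frequencies clustering around 0.33 and 0.67, as in `best_ploidy([0.32, 0.35, 0.66, 0.68, 0.33])`, are best explained by three chromosome sets. Real data would need filtering for coverage and sequencing errors before such a comparison is meaningful.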
22
Rutter MT, Murren CJ, Callahan HS, Bisner AM, Leebens-Mack J, Wolyniak MJ, Strand AE. Distributed phenomics with the unPAK project reveals the effects of mutations. THE PLANT JOURNAL : FOR CELL AND MOLECULAR BIOLOGY 2019; 100:199-211. [PMID: 31155775 DOI: 10.1111/tpj.14427] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Received: 02/14/2019] [Revised: 05/01/2019] [Accepted: 05/10/2019] [Indexed: 06/09/2023]
Abstract
Determining how genes are associated with traits in plants and other organisms is a major challenge in modern biology. The unPAK project - undergraduates phenotyping Arabidopsis knockouts - has generated phenotype data for thousands of non-lethal insertion mutation lines within a single Arabidopsis thaliana genomic background. The focal phenotypes examined by unPAK are complex macroscopic fitness-related traits, which have ecological, evolutionary and agricultural importance. These phenotypes are placed in the context of the wild-type and also natural accessions (phytometers), and standardized for environmental differences between assays. Data from the unPAK project are used to describe broad patterns in the phenotypic consequences of insertion mutation, and to identify individual mutant lines with distinct phenotypes as candidates for further study. Inclusion of undergraduate researchers is at the core of unPAK activities, and an important broader impact of the project is providing students an opportunity to obtain research experience.
Affiliation(s)
- Matthew T Rutter
  - Department of Biology, College of Charleston, 66 George Street, Charleston, SC, 29424, USA
- Courtney J Murren
  - Department of Biology, College of Charleston, 66 George Street, Charleston, SC, 29424, USA
- Hilary S Callahan
  - Department of Biology, Barnard College, 3009 Broadway, New York, NY, 10027, USA
- April M Bisner
  - Department of Biology, College of Charleston, 66 George Street, Charleston, SC, 29424, USA
- Jim Leebens-Mack
  - Department of Plant Biology, University of Georgia, 120 Carlton St, Athens, GA, 30602, USA
- Allan E Strand
  - Department of Biology, College of Charleston, 66 George Street, Charleston, SC, 29424, USA
23
Pucker B, Rückert C, Stracke R, Viehöver P, Kalinowski J, Weisshaar B. Twenty-Five Years of Propagation in Suspension Cell Culture Results in Substantial Alterations of the Arabidopsis Thaliana Genome. Genes (Basel) 2019; 10:E671. [PMID: 31480756 PMCID: PMC6770967 DOI: 10.3390/genes10090671] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Received: 07/26/2019] [Revised: 08/23/2019] [Accepted: 08/29/2019] [Indexed: 01/16/2023]
Abstract
Arabidopsis thaliana is one of the best studied plant model organisms. Besides cultivation in greenhouses, cells of this plant can also be propagated in suspension cell culture. At7 is one such cell line that was established about 25 years ago. Here, we report the sequencing and the analysis of the At7 genome. Large scale duplications and deletions compared to the Columbia-0 (Col-0) reference sequence were detected. The number of deletions exceeds the number of insertions, thus indicating that a haploid genome size reduction is ongoing. Patterns of small sequence variants differ from the ones observed between A. thaliana accessions, e.g., the number of single nucleotide variants matches the number of insertions/deletions. RNA-Seq analysis reveals that disrupted alleles are less frequent in the transcriptome than the native ones.
Affiliation(s)
- Boas Pucker
  - Genetics and Genomics of Plants, Faculty of Biology, Center for Biotechnology (CeBiTec), Bielefeld University, Sequenz 1, 33615 Bielefeld, NRW, Germany
- Christian Rückert
  - Microbial Genomics and Biotechnology, Center for Biotechnology (CeBiTec), Bielefeld University, Sequenz 1, 33615 Bielefeld, NRW, Germany
- Ralf Stracke
  - Genetics and Genomics of Plants, Faculty of Biology, Center for Biotechnology (CeBiTec), Bielefeld University, Sequenz 1, 33615 Bielefeld, NRW, Germany
- Prisca Viehöver
  - Genetics and Genomics of Plants, Faculty of Biology, Center for Biotechnology (CeBiTec), Bielefeld University, Sequenz 1, 33615 Bielefeld, NRW, Germany
- Jörn Kalinowski
  - Microbial Genomics and Biotechnology, Center for Biotechnology (CeBiTec), Bielefeld University, Sequenz 1, 33615 Bielefeld, NRW, Germany
- Bernd Weisshaar
  - Genetics and Genomics of Plants, Faculty of Biology, Center for Biotechnology (CeBiTec), Bielefeld University, Sequenz 1, 33615 Bielefeld, NRW, Germany
24
Ou S, Chen J, Jiang N. Assessing genome assembly quality using the LTR Assembly Index (LAI). Nucleic Acids Res 2019; 46:e126. [PMID: 30107434 PMCID: PMC6265445 DOI: 10.1093/nar/gky730] [Citation(s) in RCA: 226] [Impact Index Per Article: 45.2] [Received: 04/20/2018] [Accepted: 07/31/2018] [Indexed: 12/15/2022]
Abstract
Assembling a plant genome is challenging due to the abundance of repetitive sequences, yet no standard is available to evaluate the assembly of repeat space. LTR retrotransposons (LTR-RTs) are the predominant interspersed repeat that is poorly assembled in draft genomes. Here, we propose a reference-free genome metric called LTR Assembly Index (LAI) that evaluates assembly continuity using LTR-RTs. After correcting for LTR-RT amplification dynamics, we show that LAI is independent of genome size, genomic LTR-RT content, and gene space evaluation metrics (i.e., BUSCO and CEGMA). By comparing genomic sequences produced by various sequencing techniques, we reveal the significant gain of assembly continuity by using long-read-based techniques over short-read-based methods. Moreover, LAI can facilitate iterative assembly improvement with assembler selection and identify low-quality genomic regions. To apply LAI, intact LTR-RTs and total LTR-RTs should contribute at least 0.1% and 5% to the genome size, respectively. The LAI program is freely available on GitHub: https://github.com/oushujun/LTR_retriever.
Affiliation(s)
- Shujun Ou
  - Department of Horticulture, Michigan State University, East Lansing, MI 48824, USA
  - Program in Ecology, Evolutionary Biology and Behavior, Michigan State University, East Lansing, MI 48824, USA
- Jinfeng Chen
  - Department of Plant Pathology and Microbiology, University of California, Riverside, CA 92507, USA
- Ning Jiang
  - Department of Horticulture, Michigan State University, East Lansing, MI 48824, USA
  - Program in Ecology, Evolutionary Biology and Behavior, Michigan State University, East Lansing, MI 48824, USA
25
Pucker B, Schilbert HM, Schumacher SF. Integrating Molecular Biology and Bioinformatics Education. J Integr Bioinform 2019; 16:/j/jib.ahead-of-print/jib-2019-0005/jib-2019-0005.xml. [PMID: 31145692 PMCID: PMC6798849 DOI: 10.1515/jib-2019-0005] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.8] [Received: 02/04/2019] [Accepted: 04/15/2019] [Indexed: 02/01/2023]
Abstract
Combined awareness of the power and limitations of bioinformatics and molecular biology enables advanced research based on high-throughput data. Despite an increasing demand for scientists with a combined background in both fields, education in dry and wet lab subjects is often still separated. This work describes an example of integrated education with a focus on genomics and transcriptomics. Participants learned computational and molecular biology methods in the same practical course. Peer review was applied as a teaching method to foster cooperative learning among students with heterogeneous backgrounds. The positive evaluation results indicate that this approach was accepted by the participants and would likely be suitable for wider-scale application.
Affiliation(s)
- Boas Pucker
- Genetics and Genomics of Plants, CeBiTec and Faculty of Biology, Bielefeld University, Bielefeld, Germany
- Hanna Marie Schilbert
- Genetics and Genomics of Plants, CeBiTec and Faculty of Biology, Bielefeld University, Bielefeld, Germany
|
26
|
Pucker B, Holtgräwe D, Stadermann KB, Frey K, Huettel B, Reinhardt R, Weisshaar B. A chromosome-level sequence assembly reveals the structure of the Arabidopsis thaliana Nd-1 genome and its gene set. PLoS One 2019; 14:e0216233. [PMID: 31112551 PMCID: PMC6529160 DOI: 10.1371/journal.pone.0216233] [Citation(s) in RCA: 24] [Impact Index Per Article: 4.8] [Received: 12/12/2018] [Accepted: 04/16/2019] [Indexed: 01/27/2023]
Abstract
In addition to the BAC-based reference sequence of the accession Columbia-0 from the year 2000, several short-read assemblies of the plant model organism Arabidopsis thaliana have been published in recent years. Also, a SMRT-based assembly of Landsberg erecta has been generated that identified translocation and inversion polymorphisms between two genotypes of the species. Here we provide a chromosome-arm level assembly of the A. thaliana accession Niederzenz-1 (AthNd-1_v2c) based on SMRT sequencing data. The best assembly comprises 69 nucleome sequences and displays a contig length of up to 16 Mbp. Compared to an earlier Illumina short read-based NGS assembly (AthNd-1_v1), a 75-fold increase in contiguity was observed for AthNd-1_v2c. To assign contig locations independently of the Col-0 gold standard reference sequence, we used genetic anchoring to generate a de novo assembly. In addition, we assembled the chondrome and plastome sequences. Detailed analyses of AthNd-1_v2c allowed reliable identification of large genomic rearrangements between A. thaliana accessions contributing to differences in the gene sets that distinguish the genotypes. One of the differences detected identified a gene that is lacking from the Col-0 gold standard sequence. This de novo assembly extends the known proportion of the A. thaliana pan-genome.
Affiliation(s)
- Boas Pucker
- Bielefeld University, Faculty of Biology & Center for Biotechnology, Bielefeld, Germany
- Daniela Holtgräwe
- Bielefeld University, Faculty of Biology & Center for Biotechnology, Bielefeld, Germany
- Kai Bernd Stadermann
- Bielefeld University, Faculty of Biology & Center for Biotechnology, Bielefeld, Germany
- Katharina Frey
- Bielefeld University, Faculty of Biology & Center for Biotechnology, Bielefeld, Germany
- Bruno Huettel
- Max Planck Genome Centre Cologne, Max Planck Institute for Plant Breeding Research, Cologne, Germany
- Richard Reinhardt
- Max Planck Genome Centre Cologne, Max Planck Institute for Plant Breeding Research, Cologne, Germany
- Bernd Weisshaar
- Bielefeld University, Faculty of Biology & Center for Biotechnology, Bielefeld, Germany
|
27
|
Bragina MK, Afonnikov DA, Salina EA. Progress in plant genome sequencing: research directions. Vavilovskii Zhurnal Genet Selektsii 2019. [DOI: 10.18699/vj19.459] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Indexed: 11/19/2022]
Abstract
Since the first plant genome, that of Arabidopsis thaliana, was sequenced and published, genome sequencing technologies have undergone significant changes. New algorithms, sequencing technologies and bioinformatic approaches were adopted to obtain genome, transcriptome and exome sequences for model and crop species, which have permitted deep inferences into plant biology. As a result of improved genome assembly and analysis methods, genome sequencing costs have plummeted and the number of high-quality plant genome sequences is constantly growing. Consequently, more than 300 plant genome sequences have been published over the past twenty years. Although many of the published genomes are considered incomplete, they have proved to be a valuable tool for identifying genes involved in the formation of economically valuable plant traits, for marker-assisted and genomic selection, and for comparative analysis of plant genomes to determine the basic patterns of origin of various plant species. Since high coverage and resolution of a genome sequence are not enough to detect all changes in complex samples, targeted sequencing, which consists of the isolation and sequencing of a specific region of the genome, has begun to develop. Targeted sequencing has a higher detection power (the ability to identify new differences/variants) and resolution (down to a single base). In addition, exome sequencing (sequencing only the protein-coding regions of genes) is being actively developed, which allows for the sequencing of non-expressed alleles and genes that cannot be found with RNA-seq. This review analyzes the development of sequencing technologies and the construction of “reference” genomes of plants, and compares methods of targeted sequencing that rely on a reference DNA sequence.
Affiliation(s)
- D. A. Afonnikov
- Institute of Cytology and Genetics, SB RAS; Novosibirsk State University
|
28
|
Pucker B, Brockington SF. Genome-wide analyses supported by RNA-Seq reveal non-canonical splice sites in plant genomes. BMC Genomics 2018; 19:980. [PMID: 30594132 PMCID: PMC6310983 DOI: 10.1186/s12864-018-5360-z] [Citation(s) in RCA: 30] [Impact Index Per Article: 5.0] [Received: 10/11/2018] [Accepted: 12/10/2018] [Indexed: 12/19/2022]
Abstract
BACKGROUND Most eukaryotic genes comprise exons and introns, thus requiring the precise removal of introns from pre-mRNAs to enable protein biosynthesis. U2 and U12 spliceosomes catalyze this step by recognizing motifs on the transcript in order to remove the introns. This process depends on the precise definition of exon-intron borders by splice sites, which are consequently highly conserved across species. Only very few combinations of terminal dinucleotides are frequently observed at intron ends, dominated by the canonical GT-AG splice sites on the DNA level. RESULTS Here we investigate the occurrence of diverse combinations of dinucleotides at predicted splice sites. Analyzing 121 plant genome sequences based on their annotation revealed strong splice site conservation across species, annotation errors, and true biological divergence from canonical splice sites. The frequency of non-canonical splice sites clearly correlates with their divergence from canonical ones, indicating either an accumulation of probably neutral mutations or evolution towards canonical splice sites. Strong conservation across multiple species and non-random accumulation of substitutions in splice sites indicate a functional relevance of non-canonical splice sites. The average composition of splice sites across all investigated species is 98.7% for GT-AG, 1.2% for GC-AG, 0.06% for AT-AC, and 0.09% for minor non-canonical splice sites. RNA-Seq data sets of 35 species were incorporated to validate non-canonical splice site predictions through gaps in sequencing read alignments and to demonstrate the expression of affected genes. CONCLUSION We conclude that bona fide non-canonical splice sites are present and appear to be functionally relevant in most plant genomes, although at low abundance.
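The splice site classes named in this abstract are defined by the terminal dinucleotides of each intron, so a composition like the one reported can be tallied along the following lines. This is a minimal sketch, not the authors' pipeline: the function names are hypothetical, and it assumes intron sequences are already extracted as DNA strings on the sense strand.

```python
from collections import Counter

# Splice site classes named in the abstract; everything else is pooled as "other".
MAJOR_CLASSES = ("GT-AG", "GC-AG", "AT-AC")


def classify_intron(intron_seq):
    """Label an intron by its first and last two bases (sense-strand DNA)."""
    seq = intron_seq.upper()
    if len(seq) < 4:
        raise ValueError("intron must be at least 4 bases long")
    label = f"{seq[:2]}-{seq[-2:]}"
    return label if label in MAJOR_CLASSES else "other"


def splice_site_composition(introns):
    """Return the percentage of introns in each splice site class."""
    counts = Counter(classify_intron(s) for s in introns)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no introns given")
    # Counter returns 0 for classes that were never observed.
    return {cls: 100.0 * counts[cls] / total for cls in (*MAJOR_CLASSES, "other")}
```

Called on a list of intron strings, `splice_site_composition` returns a dictionary with one percentage per class. Real annotations would also need strand handling, which is left out here.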
Affiliation(s)
- Boas Pucker
- Evolution and Diversity, Department of Plant Sciences, University of Cambridge, Cambridge, UK
- Genetics and Genomics of Plants, CeBiTec & Faculty of Biology, Bielefeld University, Bielefeld, Germany
- Samuel F. Brockington
- Evolution and Diversity, Department of Plant Sciences, University of Cambridge, Cambridge, UK
|
29
|
Behnke N, Suprianto E, Möllers C. A major QTL on chromosome C05 significantly reduces acid detergent lignin (ADL) content and increases seed oil and protein content in oilseed rape (Brassica napus L.). TAG. THEORETICAL AND APPLIED GENETICS. THEORETISCHE UND ANGEWANDTE GENETIK 2018; 131:2477-2492. [PMID: 30143828 DOI: 10.1007/s00122-018-3167-6] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.3] [Received: 05/27/2018] [Accepted: 08/17/2018] [Indexed: 05/27/2023]
Abstract
A reduction in acid detergent lignin content in oilseed rape resulted in an increase in seed oil and protein content. Worldwide increasing demand for vegetable oil and protein requires continuous breeding efforts to enhance the yield of oil and protein crop species. The oil-extracted meal of oilseed rape is currently mainly used for feeding livestock, but efforts are undertaken to use the oilseed rape protein in food production. One limiting factor is the high lignin content of black-seeded oilseed rape that negatively affects digestibility and sensory quality of food products compared to soybean. Breeding attempts to develop yellow seeded oilseed rape with reduced lignin content have not yet resulted in competitive cultivars. The objective of this work was to investigate the inheritance of seed quality in a DH population derived from the cross of the high oil lines SGDH14 and cv. Express. The DH population of 139 lines was tested in field experiments in 14 environments in north-west Europe. Seeds harvested from open pollinated plants were used for extensive seed quality analysis. A molecular marker map based on the Illumina Infinium 60 K Brassica SNP chip was used to map QTL. Amongst others, one major QTL for acid detergent lignin content, explaining 81% of the phenotypic variance, was identified on chromosome C05. Lines with reduced lignin content nevertheless did not show a yellowish appearance, but showed a reduced seed hull content. The position of the QTL co-located with QTL for oil and protein content of the defatted meal with opposite additive effects, suggesting that the reduction in lignin content resulted in an increase in oil and protein content.
Affiliation(s)
- Nina Behnke
- Department of Crop Sciences, Georg-August-Universität Göttingen, Von-Siebold-Str. 8, 37075, Göttingen, Germany
- Edy Suprianto
- Department of Crop Sciences, Georg-August-Universität Göttingen, Von-Siebold-Str. 8, 37075, Göttingen, Germany
- Christian Möllers
- Department of Crop Sciences, Georg-August-Universität Göttingen, Von-Siebold-Str. 8, 37075, Göttingen, Germany
|
30
|
Haak M, Vinke S, Keller W, Droste J, Rückert C, Kalinowski J, Pucker B. High Quality de Novo Transcriptome Assembly of Croton tiglium. Front Mol Biosci 2018; 5:62. [PMID: 30027092 PMCID: PMC6041412 DOI: 10.3389/fmolb.2018.00062] [Citation(s) in RCA: 25] [Impact Index Per Article: 4.2] [Received: 01/03/2018] [Accepted: 06/18/2018] [Indexed: 12/31/2022]
Affiliation(s)
- Markus Haak
- Center for Biotechnology, Bielefeld University, Bielefeld, Germany
- Svenja Vinke
- Center for Biotechnology, Bielefeld University, Bielefeld, Germany
- Willy Keller
- Center for Biotechnology, Bielefeld University, Bielefeld, Germany; Faculty of Biology, Bielefeld University, Bielefeld, Germany
- Julian Droste
- Center for Biotechnology, Bielefeld University, Bielefeld, Germany; Faculty of Biology, Bielefeld University, Bielefeld, Germany
- Christian Rückert
- Center for Biotechnology, Bielefeld University, Bielefeld, Germany; Faculty of Biology, Bielefeld University, Bielefeld, Germany; Department of Biology, Massachusetts Institute of Technology, Cambridge, MA, United States
- Jörn Kalinowski
- Center for Biotechnology, Bielefeld University, Bielefeld, Germany; Faculty of Biology, Bielefeld University, Bielefeld, Germany
- Boas Pucker
- Center for Biotechnology, Bielefeld University, Bielefeld, Germany; Faculty of Biology, Bielefeld University, Bielefeld, Germany; Department of Plant Sciences, University of Cambridge, Cambridge, United Kingdom
|
31
|
Zhao N, Wang Y, Hua J. The Roles of Mitochondrion in Intergenomic Gene Transfer in Plants: A Source and a Pool. Int J Mol Sci 2018; 19:ijms19020547. [PMID: 29439501 PMCID: PMC5855769 DOI: 10.3390/ijms19020547] [Citation(s) in RCA: 37] [Impact Index Per Article: 6.2] [Received: 12/22/2017] [Revised: 01/31/2018] [Accepted: 02/06/2018] [Indexed: 11/30/2022]
Abstract
Intergenomic gene transfer (IGT) has occurred continuously throughout the evolutionary history of plants. Most studies in this field concentrate on a few related species. Here, we look at IGT from a broader evolutionary perspective, using 24 plants. We discover many IGT events by assessing the data from nuclear, mitochondrial and chloroplast genomes. We summarize two roles of the mitochondrion: a source and a pool. That is, the mitochondrion donates large amounts of sequence and integrates nuclear transposons and chloroplast tRNA genes. Although the directions are opposite, many similarities emerge. First, mitochondrial gene transfer is pervasive in all 24 plants. Second, gene transfer occurred as a single event in certain shared ancestors during evolutionary divergence. Third, sequence features of homologies vary for different purposes in the donor and recipient genomes. Finally, small repeats (or micro-homologies) contribute to gene transfer by mediating recombination in the recipient genome.
Affiliation(s)
- Nan Zhao
- Laboratory of Cotton Genetics, Genomics and Breeding/Key Laboratory of Crop Heterosis and Utilization of Ministry of Education, College of Agronomy and Biotechnology, China Agricultural University, Beijing 100193, China
- Yumei Wang
- Institute of Cash Crops, Hubei Academy of Agricultural Sciences, Wuhan 430064, China
- Jinping Hua
- Laboratory of Cotton Genetics, Genomics and Breeding/Key Laboratory of Crop Heterosis and Utilization of Ministry of Education, College of Agronomy and Biotechnology, China Agricultural University, Beijing 100193, China
|
32
|
Pucker B, Holtgräwe D, Weisshaar B. Consideration of non-canonical splice sites improves gene prediction on the Arabidopsis thaliana Niederzenz-1 genome sequence. BMC Res Notes 2017; 10:667. [PMID: 29202864 PMCID: PMC5716242 DOI: 10.1186/s13104-017-2985-y] [Citation(s) in RCA: 21] [Impact Index Per Article: 3.0] [Received: 05/25/2017] [Accepted: 11/23/2017] [Indexed: 12/26/2022]
Abstract
OBJECTIVE The Arabidopsis thaliana Niederzenz-1 genome sequence was recently published with an ab initio gene prediction. In-depth analysis of the predicted gene set revealed some errors involving genes with non-canonical splice sites in their introns. Since non-canonical splice sites are difficult to predict ab initio, we checked for options to improve the annotation by transferring annotation information from Araport11, the recently released annotation of the Columbia-0 reference genome sequence. RESULTS Incorporation of hints generated from Araport11 enabled the precise prediction of non-canonical splice sites. Manual inspection of RNA-Seq read mappings and RT-PCR were applied to validate the structural annotations of non-canonical splice sites. Predictions of untranslated regions were also updated by harnessing Araport11’s information, which was generated using high-coverage RNA-Seq data. The improved gene set of the Nd-1 genome assembly (GeneSet_Nd-1_v1.1) was evaluated via comparison to the initial gene prediction (GeneSet_Nd-1_v1.0) as well as against Araport11 for the Col-0 reference genome sequence. GeneSet_Nd-1_v1.1 contains previously missed non-canonical splice sites in 1256 genes. Reciprocal best hits for 24,527 (89.4%) of all nuclear Col-0 genes against GeneSet_Nd-1_v1.1 indicate a high gene prediction quality.
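The reciprocal best hit criterion used as a quality measure in this abstract can be illustrated as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: the function name and the gene identifiers are hypothetical, and the best hit per gene in each search direction is assumed to be precomputed (for example from sequence similarity searches).

```python
def reciprocal_best_hits(best_a_to_b, best_b_to_a):
    """Return sorted (a, b) pairs where a's best hit is b and b's best hit is a.

    Each argument maps a query gene ID to the ID of its single best hit
    in the other gene set.
    """
    return sorted(
        (a, b)
        for a, b in best_a_to_b.items()
        if best_b_to_a.get(b) == a
    )


# Toy example with made-up gene identifiers.
col0_to_nd1 = {"geneA": "nd1_1", "geneB": "nd1_2", "geneC": "nd1_2"}
nd1_to_col0 = {"nd1_1": "geneA", "nd1_2": "geneB", "nd1_3": "geneC"}
rbh_pairs = reciprocal_best_hits(col0_to_nd1, nd1_to_col0)
rbh_percent = 100.0 * len(rbh_pairs) / len(col0_to_nd1)
```

In the toy data, `geneC` has no reciprocal partner because its best hit `nd1_2` points back to `geneB`, so two of the three query genes form reciprocal pairs.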
Affiliation(s)
- Boas Pucker
- Faculty of Biology & Center for Biotechnology, Bielefeld University, Bielefeld, Germany
- Daniela Holtgräwe
- Faculty of Biology & Center for Biotechnology, Bielefeld University, Bielefeld, Germany
- Bernd Weisshaar
- Faculty of Biology & Center for Biotechnology, Bielefeld University, Bielefeld, Germany
|
33
|
Mahato NK, Gupta V, Singh P, Kumari R, Verma H, Tripathi C, Rani P, Sharma A, Singhvi N, Sood U, Hira P, Kohli P, Nayyar N, Puri A, Bajaj A, Kumar R, Negi V, Talwar C, Khurana H, Nagar S, Sharma M, Mishra H, Singh AK, Dhingra G, Negi RK, Shakarad M, Singh Y, Lal R. Microbial taxonomy in the era of OMICS: application of DNA sequences, computational tools and techniques. Antonie van Leeuwenhoek 2017; 110:1357-1371. [PMID: 28831610 DOI: 10.1007/s10482-017-0928-1] [Citation(s) in RCA: 37] [Impact Index Per Article: 5.3] [Received: 01/17/2017] [Accepted: 08/10/2017] [Indexed: 02/06/2023]
Abstract
The current prokaryotic taxonomy classifies phenotypically and genotypically diverse microorganisms using a polyphasic approach. With advances in the next-generation sequencing technologies and computational tools for analysis of genomes, the traditional polyphasic method is complemented with genomic data to delineate and classify bacterial genera and species as an alternative to cumbersome and error-prone laboratory tests. This review discusses the applications of sequence-based tools and techniques for bacterial classification and provides a scheme for more robust and reproducible bacterial classification based on genomic data. The present review highlights promising tools and techniques such as ortho-Average Nucleotide Identity, Genome to Genome Distance Calculator and Multi Locus Sequence Analysis, which can be validly employed for characterizing novel microorganisms and assessing phylogenetic relationships. In addition, the review discusses the possibility of employing metagenomic data to assess the phylogenetic associations of uncultured microorganisms. Through this article, we present a review of genomic approaches that can be included in the scheme of taxonomy of bacteria and archaea based on computational and in silico advances to boost the credibility of taxonomic classification in this genomic era.
Affiliation(s)
- Vipin Gupta
- Department of Zoology, University of Delhi, Delhi, 110007, India
- Priya Singh
- Department of Zoology, University of Delhi, Delhi, 110007, India
- Rashmi Kumari
- Department of Zoology, University of Delhi, Delhi, 110007, India
- Charu Tripathi
- Department of Zoology, University of Delhi, Delhi, 110007, India
- Pooja Rani
- Department of Zoology, University of Delhi, Delhi, 110007, India
- Anukriti Sharma
- Department of Zoology, University of Delhi, Delhi, 110007, India
- Nirjara Singhvi
- Department of Zoology, University of Delhi, Delhi, 110007, India
- Utkarsh Sood
- Department of Zoology, University of Delhi, Delhi, 110007, India
- Princy Hira
- Department of Zoology, University of Delhi, Delhi, 110007, India
- Puneet Kohli
- Department of Zoology, University of Delhi, Delhi, 110007, India
- Namita Nayyar
- Department of Zoology, University of Delhi, Delhi, 110007, India
- Akshita Puri
- Department of Zoology, University of Delhi, Delhi, 110007, India
- Abhay Bajaj
- Department of Zoology, University of Delhi, Delhi, 110007, India
- Roshan Kumar
- Department of Zoology, University of Delhi, Delhi, 110007, India
- Vivek Negi
- Department of Zoology, University of Delhi, Delhi, 110007, India
- Chandni Talwar
- Department of Zoology, University of Delhi, Delhi, 110007, India
- Himani Khurana
- Department of Zoology, University of Delhi, Delhi, 110007, India
- Shekhar Nagar
- Department of Zoology, University of Delhi, Delhi, 110007, India
- Monika Sharma
- Department of Zoology, University of Delhi, Delhi, 110007, India
- Harshita Mishra
- Department of Zoology, University of Delhi, Delhi, 110007, India
- Amit Kumar Singh
- Department of Zoology, University of Delhi, Delhi, 110007, India
- Gauri Dhingra
- Department of Zoology, University of Delhi, Delhi, 110007, India
- Ram Krishan Negi
- Department of Zoology, University of Delhi, Delhi, 110007, India
- Yogendra Singh
- Department of Zoology, University of Delhi, Delhi, 110007, India
- Rup Lal
- Department of Zoology, University of Delhi, Delhi, 110007, India
|
34
|
From plant genomes to phenotypes. J Biotechnol 2017; 261:46-52. [PMID: 28602791 DOI: 10.1016/j.jbiotec.2017.06.003] [Citation(s) in RCA: 21] [Impact Index Per Article: 3.0] [Received: 02/17/2017] [Revised: 05/27/2017] [Accepted: 06/07/2017] [Indexed: 12/21/2022]
Abstract
Recent advances in sequencing technologies have greatly accelerated the rate of plant genome and applied breeding research. Despite this advancing trend, plant genomes continue to present numerous difficulties to the standard tools and pipelines not only for genome assembly but also gene annotation and downstream analysis. Here we give a perspective on tools, resources and services necessary to assemble and analyze plant genomes and link them to plant phenotypes.
|