1. Deng CH, Naithani S, Kumari S, Cobo-Simón I, Quezada-Rodríguez EH, Skrabisova M, Gladman N, Correll MJ, Sikiru AB, Afuwape OO, Marrano A, Rebollo I, Zhang W, Jung S. Genotype and phenotype data standardization, utilization and integration in the big data era for agricultural sciences. Database (Oxford) 2023; 2023:baad088. [PMID: 38079567 PMCID: PMC10712715 DOI: 10.1093/database/baad088] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/13/2023] [Revised: 10/17/2023] [Accepted: 11/28/2023] [Indexed: 12/18/2023]
Abstract
Large-scale genotype and phenotype data have been increasingly generated to identify genetic markers, understand gene function and evolution and facilitate genomic selection. These datasets hold immense value for both current and future studies, as they are vital for crop breeding, yield improvement and overall agricultural sustainability. However, integrating these datasets from heterogeneous sources presents significant challenges and hinders their effective utilization. We established the Genotype-Phenotype Working Group in November 2021 as a part of the AgBioData Consortium (https://www.agbiodata.org) to review current data types and resources that support archiving, analysis and visualization of genotype and phenotype data, and to understand the needs and challenges of the plant genomic research community. During 2021-22, we identified different types of datasets and examined metadata annotations related to experimental design, methods, sample collection, etc. Furthermore, we thoroughly reviewed publicly funded repositories for raw and processed data as well as secondary databases and knowledgebases that enable the integration of heterogeneous data in the context of the genome browser, pathway networks and tissue-specific gene expression. Based on our survey, we identify a need for (i) additional infrastructural support for archiving many new data types, (ii) development of community standards for data annotation and formatting, (iii) resources for biocuration and (iv) analysis and visualization tools to connect genotype data with phenotype data to enhance knowledge synthesis and to foster translational research. Although this paper only covers the data and resources relevant to the plant research community, we expect that similar issues and needs are shared by researchers working on animals. Database URL: https://www.agbiodata.org.
Affiliation(s)
- Cecilia H Deng
- Molecular and Digital Breeding, New Cultivar Innovation, The New Zealand Institute for Plant and Food Research Limited, 120 Mt Albert Road, Auckland 1025, New Zealand
- Sushma Naithani
- Department of Botany and Plant Pathology, Oregon State University, Corvallis, OR 97331, USA
- Sunita Kumari
- Cold Spring Harbor Laboratory, 1 Bungtown Rd, Cold Spring Harbor, New York, NY 11724, USA
- Irene Cobo-Simón
- Department of Ecology and Evolutionary Biology, University of Connecticut, Storrs, CT, USA
- Institute of Forest Science (ICIFOR-INIA, CSIC), Madrid, Spain
- Elsa H Quezada-Rodríguez
- Departamento de Producción Agrícola y Animal, Universidad Autónoma Metropolitana-Xochimilco, Ciudad de México, México
- Centro de Ciencias de la Complejidad, Universidad Nacional Autónoma de México, Ciudad de México, México
- Maria Skrabisova
- Department of Biochemistry, Faculty of Science, Palacky University, Olomouc, Czech Republic
- Nick Gladman
- Cold Spring Harbor Laboratory, 1 Bungtown Rd, Cold Spring Harbor, New York, NY 11724, USA
- U.S. Department of Agriculture-Agricultural Research Service, NEA Robert W. Holley Center for Agriculture and Health, Cornell University, Ithaca, NY 14853, USA
- Melanie J Correll
- Agricultural and Biological Engineering Department, University of Florida, 1741 Museum Rd, Gainesville, FL 32611, USA
- Annarita Marrano
- Phoenix Bioinformatics, 39899 Balentine Drive, Suite 200, Newark, CA 94560, USA
- Wentao Zhang
- National Research Council Canada, 110 Gymnasium Pl, Saskatoon, Saskatchewan S7N 0W9, Canada
- Sook Jung
- Department of Horticulture, Washington State University, 303c Plant Sciences Building, Pullman, WA 99164-6414, USA
2. Rey MD, Labella-Ortega M, Guerrero-Sánchez VM, Carleial R, Castillejo MÁ, Ruggieri V, Jorrín-Novo JV. A first draft genome of holm oak (Quercus ilex subsp. ballota), the most representative species of the Mediterranean forest and the Spanish agrosylvopastoral ecosystem "dehesa". Front Mol Biosci 2023; 10:1242943. [PMID: 37905231 PMCID: PMC10613499 DOI: 10.3389/fmolb.2023.1242943] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 06/20/2023] [Accepted: 09/20/2023] [Indexed: 11/02/2023]
Abstract
The holm oak (Quercus ilex subsp. ballota) is the most representative species of the Mediterranean Basin and the agrosylvopastoral Spanish "dehesa" ecosystem. Being part of our life, culture, and subsistence since ancient times, it has significant environmental and economic importance. More recently, there has been a renewed interest in using the Q. ilex acorn as a functional food due to its nutritional and nutraceutical properties. However, the holm oak and its related ecosystems are threatened by different factors, with oak decline syndrome and climate change being the most worrying in the short and medium term. Breeding programs informed by the selection of elite genotypes seem to be the most plausible biotechnological solution to rescue populations under threat. To achieve this and other downstream analyses, we need a high-quality and well-annotated Q. ilex reference genome. Here, we introduce the first draft genome assembly of Q. ilex using long-read sequencing (PacBio). The assembled nuclear haploid genome had 530 contigs totaling 842.2 Mbp (N50 = 3.3 Mbp), of which 448.7 Mbp (53%) were repetitive sequences. We annotated 39,443 protein-coding genes, of which 94.80% were complete and single-copy genes. Phylogenetic analyses showed no evidence of a recent whole-genome duplication, and high synteny of the 12 chromosomes between Q. ilex and Quercus lobata and between Q. ilex and Quercus robur. The chloroplast genome size was 142.3 kbp, with 149 protein-coding genes successfully annotated. This first draft should allow for the validation of omics data as well as the identification and functional annotation of genes related to phenotypes of interest, such as those associated with resilience against oak decline syndrome and climate change and with higher acorn productivity and nutraceutical value.
Affiliation(s)
- María-Dolores Rey
- Agroforestry and Plant Biochemistry, Proteomics and Systems Biology, Department of Biochemistry and Molecular Biology, University of Cordoba, UCO-CeiA3, Cordoba, Spain
- Mónica Labella-Ortega
- Agroforestry and Plant Biochemistry, Proteomics and Systems Biology, Department of Biochemistry and Molecular Biology, University of Cordoba, UCO-CeiA3, Cordoba, Spain
- Víctor M. Guerrero-Sánchez
- Agroforestry and Plant Biochemistry, Proteomics and Systems Biology, Department of Biochemistry and Molecular Biology, University of Cordoba, UCO-CeiA3, Cordoba, Spain
- María Ángeles Castillejo
- Agroforestry and Plant Biochemistry, Proteomics and Systems Biology, Department of Biochemistry and Molecular Biology, University of Cordoba, UCO-CeiA3, Cordoba, Spain
- Valentino Ruggieri
- Biomeets Consulting ITNIG, Carrer d’Alaba 61, 08005 Catalonia, Barcelona, Spain
- Jesús V. Jorrín-Novo
- Agroforestry and Plant Biochemistry, Proteomics and Systems Biology, Department of Biochemistry and Molecular Biology, University of Cordoba, UCO-CeiA3, Cordoba, Spain
3. Rutz C, Bonassin L, Kress A, Francesconi C, Boštjančić LL, Merlat D, Theissinger K, Lecompte O. Abundance and Diversification of Repetitive Elements in Decapoda Genomes. Genes (Basel) 2023; 14:1627. [PMID: 37628678 PMCID: PMC10454600 DOI: 10.3390/genes14081627] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/07/2023] [Revised: 08/05/2023] [Accepted: 08/12/2023] [Indexed: 08/27/2023]
Abstract
Repetitive elements are a major component of DNA sequences due to their ability to propagate through the genome. Characterization of Metazoan repetitive profiles is improving; however, current pipelines fail to identify a significant proportion of divergent repeats in non-model organisms. The Decapoda order, for which repeat content analyses are largely lacking, is characterized by extremely variable genome sizes that suggest an important presence of repetitive elements. Here, we developed a new standardized pipeline to annotate repetitive elements in non-model organisms, which we applied to twenty Decapoda and six other Crustacea genomes. Using this new tool, we identified 10% more repetitive elements than standard pipelines. Repetitive elements were more abundant in Decapoda species than in other Crustacea, with a very large number of highly repeated satellite DNA families. Moreover, we demonstrated a high correlation between assembly size and transposable elements and different repeat dynamics between Dendrobranchiata and Reptantia. The patterns of repetitive elements largely reflect the phylogenetic relationships of Decapoda and the distinct evolutionary trajectories within Crustacea. In summary, our results highlight the impact of repetitive elements on genome evolution in Decapoda and the value of our novel annotation pipeline, which will provide a baseline for future comparative analyses.
Affiliation(s)
- Christelle Rutz
- Department of Computer Science, ICube, UMR 7357, University of Strasbourg, CNRS, Rue Eugène Boeckel 1, 67000 Strasbourg, France; (C.R.); (L.B.); (A.K.); (L.L.B.); (D.M.)
- Lena Bonassin
- Department of Computer Science, ICube, UMR 7357, University of Strasbourg, CNRS, Rue Eugène Boeckel 1, 67000 Strasbourg, France; (C.R.); (L.B.); (A.K.); (L.L.B.); (D.M.)
- LOEWE Centre for Translational Biodiversity Genomics (LOEWE-TBG), Senckenberg Biodiversity and Climate Research Centre, Georg-Voigt-Str. 14-16, 60325 Frankfurt am Main, Germany; (C.F.); (K.T.)
- Department of Molecular Ecology, Institute for Environmental Sciences, Rhineland-Palatinate Technical University Kaiserslautern Landau, Fortstr. 7, 76829 Landau, Germany
- Arnaud Kress
- Department of Computer Science, ICube, UMR 7357, University of Strasbourg, CNRS, Rue Eugène Boeckel 1, 67000 Strasbourg, France; (C.R.); (L.B.); (A.K.); (L.L.B.); (D.M.)
- Caterina Francesconi
- LOEWE Centre for Translational Biodiversity Genomics (LOEWE-TBG), Senckenberg Biodiversity and Climate Research Centre, Georg-Voigt-Str. 14-16, 60325 Frankfurt am Main, Germany; (C.F.); (K.T.)
- Department of Molecular Ecology, Institute for Environmental Sciences, Rhineland-Palatinate Technical University Kaiserslautern Landau, Fortstr. 7, 76829 Landau, Germany
- Ljudevit Luka Boštjančić
- Department of Computer Science, ICube, UMR 7357, University of Strasbourg, CNRS, Rue Eugène Boeckel 1, 67000 Strasbourg, France; (C.R.); (L.B.); (A.K.); (L.L.B.); (D.M.)
- LOEWE Centre for Translational Biodiversity Genomics (LOEWE-TBG), Senckenberg Biodiversity and Climate Research Centre, Georg-Voigt-Str. 14-16, 60325 Frankfurt am Main, Germany; (C.F.); (K.T.)
- Department of Molecular Ecology, Institute for Environmental Sciences, Rhineland-Palatinate Technical University Kaiserslautern Landau, Fortstr. 7, 76829 Landau, Germany
- Dorine Merlat
- Department of Computer Science, ICube, UMR 7357, University of Strasbourg, CNRS, Rue Eugène Boeckel 1, 67000 Strasbourg, France; (C.R.); (L.B.); (A.K.); (L.L.B.); (D.M.)
- Kathrin Theissinger
- LOEWE Centre for Translational Biodiversity Genomics (LOEWE-TBG), Senckenberg Biodiversity and Climate Research Centre, Georg-Voigt-Str. 14-16, 60325 Frankfurt am Main, Germany; (C.F.); (K.T.)
- Odile Lecompte
- Department of Computer Science, ICube, UMR 7357, University of Strasbourg, CNRS, Rue Eugène Boeckel 1, 67000 Strasbourg, France; (C.R.); (L.B.); (A.K.); (L.L.B.); (D.M.)
4. Kim YK, Jo S, Cheon SH, Hong JR, Kim KJ. Ancient Horizontal Gene Transfers from Plastome to Mitogenome of a Nonphotosynthetic Orchid, Gastrodia pubilabiata (Epidendroideae, Orchidaceae). Int J Mol Sci 2023; 24:11448. [PMID: 37511216 PMCID: PMC10380568 DOI: 10.3390/ijms241411448] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/29/2023] [Revised: 07/08/2023] [Accepted: 07/11/2023] [Indexed: 07/30/2023]
Abstract
Gastrodia pubilabiata is a nonphotosynthetic and mycoheterotrophic orchid belonging to subfamily Epidendroideae. Compared to other typical angiosperm species, the plastome of G. pubilabiata is dramatically reduced in size to only 30,698 base pairs (bp). This reduction has led to the loss of most photosynthesis-related genes and some housekeeping genes in the plastome, which now contains only 19 protein-coding genes, three tRNAs, and three rRNAs. In contrast, the plastome of a typical orchid species contains 79 protein-coding genes, 30 tRNAs, and four rRNAs. This study decoded the entire mitogenome of G. pubilabiata, which consisted of 44 contigs with a total length of 867,349 bp. Its mitogenome contained 38 protein-coding genes, nine tRNAs, and three rRNAs. The gene content of the G. pubilabiata mitogenome is similar to that of typical plant mitogenomes, even though the mitogenome is twice as large as typical ones. To determine possible gene transfer events between the plastome and the mitogenome, individual BLASTN searches were conducted using all available orchid plastome sequences and flowering plant mitogenome sequences. Plastid rRNA fragments were found at a high frequency in the mitogenome. Seven plastid protein-coding gene fragments (ndhC, ndhJ, ndhK, psaA, psbF, rpoB, and rps4) were also identified in the mitogenome of G. pubilabiata. Phylogenetic trees using these seven plastid protein-coding gene fragments suggested that horizontal gene transfer (HGT) from plastome to mitogenome occurred before the losses of photosynthesis-related genes in the lineage leading to G. pubilabiata. Based on comparison with the species phylogeny of the orchid lineage, it was estimated that HGT might have occurred approximately 30 million years ago.
Affiliation(s)
- Young-Kee Kim
- Division of Life Sciences, Korea University, Seoul 02841, Republic of Korea
- Sangjin Jo
- Division of Life Sciences, Korea University, Seoul 02841, Republic of Korea
- International Biological Material Research Center, Korea Research Institute of Bioscience and Biotechnology, Daejeon 34141, Republic of Korea
- Se-Hwan Cheon
- Division of Life Sciences, Korea University, Seoul 02841, Republic of Korea
- Ja-Ram Hong
- Division of Life Sciences, Korea University, Seoul 02841, Republic of Korea
- Ki-Joong Kim
- Division of Life Sciences, Korea University, Seoul 02841, Republic of Korea
5. Wu Y, Li D, Hu Y, Li H, Ramstein GP, Zhou S, Zhang X, Bao Z, Zhang Y, Song B, Zhou Y, Zhou Y, Gagnon E, Särkinen T, Knapp S, Zhang C, Städler T, Buckler ES, Huang S. Phylogenomic discovery of deleterious mutations facilitates hybrid potato breeding. Cell 2023; 186:2313-2328.e15. [PMID: 37146612 DOI: 10.1016/j.cell.2023.04.008] [Citation(s) in RCA: 15] [Impact Index Per Article: 15.0] [Received: 08/24/2022] [Revised: 02/20/2023] [Accepted: 04/05/2023] [Indexed: 05/07/2023]
Abstract
Hybrid potato breeding will transform the crop from a clonally propagated tetraploid to a seed-reproducing diploid. Historical accumulation of deleterious mutations in potato genomes has hindered the development of elite inbred lines and hybrids. Utilizing a whole-genome phylogeny of 92 Solanaceae and its sister clade species, we employ an evolutionary strategy to identify deleterious mutations. The deep phylogeny reveals the genome-wide landscape of highly constrained sites, comprising ∼2.4% of the genome. Based on a diploid potato diversity panel, we infer 367,499 deleterious variants, of which 50% occur at non-coding and 15% at synonymous sites. Counterintuitively, diploid lines with relatively high homozygous deleterious burden can be better starting material for inbred-line development, despite showing less vigorous growth. Inclusion of inferred deleterious mutations increases genomic-prediction accuracy for yield by 24.7%. Our study generates insights into the genome-wide incidence and properties of deleterious mutations and their far-reaching consequences for breeding.
Affiliation(s)
- Yaoyao Wu
- State Key Laboratory of Tropical Crop Breeding, Shenzhen Branch, Guangdong Laboratory of Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture and Rural Affairs, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, Guangdong 518120, China; Institute for Genomic Diversity, Cornell University, Ithaca, NY 14853, USA
- Dawei Li
- State Key Laboratory of Tropical Crop Breeding, Shenzhen Branch, Guangdong Laboratory of Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture and Rural Affairs, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, Guangdong 518120, China; State Key Laboratory of Tropical Crop Breeding, Chinese Academy of Tropical Agricultural Sciences, Haikou, Hainan 571101, China
- Yong Hu
- State Key Laboratory of Tropical Crop Breeding, Shenzhen Branch, Guangdong Laboratory of Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture and Rural Affairs, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, Guangdong 518120, China; The AGISCAAS-YNNU Joint Academy of Potato Sciences, Yunnan Normal University, Kunming, Yunnan 650500, China
- Hongbo Li
- State Key Laboratory of Tropical Crop Breeding, Shenzhen Branch, Guangdong Laboratory of Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture and Rural Affairs, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, Guangdong 518120, China
- Guillaume P Ramstein
- Center for Quantitative Genetics and Genomics, Aarhus University, Aarhus 8000, Denmark
- Shaoqun Zhou
- State Key Laboratory of Tropical Crop Breeding, Shenzhen Branch, Guangdong Laboratory of Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture and Rural Affairs, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, Guangdong 518120, China
- Xinyan Zhang
- State Key Laboratory of Tropical Crop Breeding, Shenzhen Branch, Guangdong Laboratory of Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture and Rural Affairs, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, Guangdong 518120, China
- Zhigui Bao
- State Key Laboratory of Tropical Crop Breeding, Shenzhen Branch, Guangdong Laboratory of Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture and Rural Affairs, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, Guangdong 518120, China; Department of Molecular Biology, Max Planck Institute for Biology Tübingen, 72076 Tübingen, Germany
- Yu Zhang
- State Key Laboratory of Tropical Crop Breeding, Shenzhen Branch, Guangdong Laboratory of Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture and Rural Affairs, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, Guangdong 518120, China; School of Agriculture, Sun Yat-sen University, Shenzhen, Guangdong 518107, China
- Baoxing Song
- Peking University Institute of Advanced Agricultural Sciences, Weifang, Shandong 261000, China
- Yao Zhou
- State Key Laboratory of Tropical Crop Breeding, Shenzhen Branch, Guangdong Laboratory of Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture and Rural Affairs, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, Guangdong 518120, China; Institute of Botany, Chinese Academy of Sciences, Beijing 100093, China; College of Life Sciences, University of Chinese Academy of Sciences, Beijing 100094, China
- Yongfeng Zhou
- State Key Laboratory of Tropical Crop Breeding, Shenzhen Branch, Guangdong Laboratory of Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture and Rural Affairs, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, Guangdong 518120, China
- Edeline Gagnon
- Technische Universität München, TUM School of Life Sciences, Emil-Ramann-Strasse 2, 85354 Freising, Germany
- Tiina Särkinen
- Royal Botanic Garden Edinburgh, 20A Inverleith Row, Edinburgh EH3 5LR, UK
- Sandra Knapp
- Natural History Museum, Cromwell Road, London SW7 5BD, UK
- Chunzhi Zhang
- State Key Laboratory of Tropical Crop Breeding, Shenzhen Branch, Guangdong Laboratory of Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture and Rural Affairs, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, Guangdong 518120, China
- Thomas Städler
- Institute of Integrative Biology and Zurich-Basel Plant Science Center, ETH Zurich, 8092 Zurich, Switzerland
- Edward S Buckler
- Institute for Genomic Diversity, Cornell University, Ithaca, NY 14853, USA; USDA-ARS, Ithaca, NY 14853, USA
- Sanwen Huang
- State Key Laboratory of Tropical Crop Breeding, Shenzhen Branch, Guangdong Laboratory of Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture and Rural Affairs, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, Guangdong 518120, China; State Key Laboratory of Tropical Crop Breeding, Chinese Academy of Tropical Agricultural Sciences, Haikou, Hainan 571101, China.
6. Anita VPD, Matra DD, Siregar UJ. Chloroplast genome draft assembly of Falcataria moluccana using hybrid sequencing technology. BMC Res Notes 2023; 16:31. [PMID: 36894969 PMCID: PMC9996948 DOI: 10.1186/s13104-023-06290-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/28/2022] [Accepted: 02/16/2023] [Indexed: 03/11/2023]
Abstract
OBJECTIVES: Falcataria moluccana, known locally as sengon, is a fast-growing legume tree that is commonly planted in community forests on Java Island, Indonesia. However, plantations face attacks by the boktor stem borer (Xystrocera festiva) and gall-rust disease (Uromycladium falcatariae), which are major threats to productivity. Controlling this pest and disease requires resistant sengon clones developed through a tree improvement program, which in turn needs genetic and genomic information. This dataset was created to construct a draft sengon chloroplast genome and to study the evolution of sengon based on the matK and rbcL barcode genes. DATA DESCRIPTION: Genomic DNA was extracted from leaf samples of one healthy tree in a private plantation. The DNA was sequenced using an Illumina NovaSeq 6000 (Novogen AIT, Singapore) for short-read data and a Nanopore MinION, following the manufacturer's protocol SQK-LSK110, for long-read data. The 66.3 Gb of short reads and 12 Gb of long reads were hybrid assembled to construct a 128,867 bp F. moluccana chloroplast genome with a quadripartite structure containing a pair of inverted repeats, a large single-copy region and a small single-copy region. A phylogenetic tree constructed using matK and rbcL showed a monophyletic origin of F. moluccana and other legume trees.
Affiliation(s)
- Vilda Puji Dini Anita
- Tropical Silviculture Program, Department of Silviculture, Faculty of Forestry and Environment, IPB University (Bogor Agricultural University), Bogor, Indonesia
- Deden Derajat Matra
- Department of Agronomy and Horticulture, Faculty of Agriculture, IPB University (Bogor Agricultural University), Bogor, Indonesia
- Ulfah Juniarti Siregar
- Department of Silviculture, Faculty of Forestry and Environment, IPB University (Bogor Agricultural University), Bogor, Indonesia.
7. Lin X, Torres Ascurra YC, Fillianti H, Dethier L, de Rond L, Domazakis E, Aguilera-Galvez C, Kiros AY, Jacobsen E, Visser RGF, Nürnberger T, Vleeshouwers VGAA. Recognition of Pep-13/25 MAMPs of Phytophthora localizes to an RLK locus in Solanum microdontum. Front Plant Sci 2023; 13:1037030. [PMID: 36714772 PMCID: PMC9879208 DOI: 10.3389/fpls.2022.1037030] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 09/05/2022] [Accepted: 12/09/2022] [Indexed: 06/18/2023]
Abstract
Pattern-triggered immunity (PTI) in plants is mediated by cell surface-localized pattern recognition receptors (PRRs) upon perception of microbe-associated molecular patterns (MAMPs). MAMPs are molecules conserved across microbial species, or even kingdoms, and PRRs can confer broad-spectrum disease resistance. Pep-13/25 are well-characterized MAMPs in Phytophthora species, which are devastating oomycete pathogens of potato and other plants and for which genetic resistance is in high demand. Pep-13/25 are derived from a 42 kDa transglutaminase, GP42, but their cognate PRR has remained unknown. Here, we genetically mapped a novel surface immune receptor that recognizes Pep-25. Using effectoromics screening, we characterized the recognition spectrum of Pep-13/25 in diverse Solanaceae species. Response to Pep-13/25 was predominantly found in potato and related wild tuber-bearing Solanum species. Bulk-segregant RNA sequencing (BSR-Seq) and genetic mapping of the response to Pep-25 led to a 0.081 cM region at the top of chromosome 3 in the wild potato species Solanum microdontum subsp. gigantophyllum. Some BAC clones in this region were isolated and sequenced, and we found that the Pep-25 receptor is located in a complex receptor-like kinase (RLK) locus. This study is an important step toward the identification of the Pep-13/25 receptor, which could lead to broad application in potato and various other hosts of Phytophthora species.
Affiliation(s)
- Xiao Lin
- Plant Breeding, Wageningen University and Research, Wageningen, Netherlands
- Happyka Fillianti
- Plant Breeding, Wageningen University and Research, Wageningen, Netherlands
- Laura Dethier
- Plant Breeding, Wageningen University and Research, Wageningen, Netherlands
- Laura de Rond
- Plant Breeding, Wageningen University and Research, Wageningen, Netherlands
- Evert Jacobsen
- Plant Breeding, Wageningen University and Research, Wageningen, Netherlands
- Thorsten Nürnberger
- Department of Plant Biochemistry, Centre of Plant Molecular Biology (ZMBP), University of Tübingen, Tübingen, Germany
- Department of Biochemistry, University of Johannesburg, Johannesburg, South Africa
8. Zhang T, Zhou J, Gao W, Jia Y, Wei Y, Wang G. Complex genome assembly based on long-read sequencing. Brief Bioinform 2022; 23:6657663. [PMID: 35940845 DOI: 10.1093/bib/bbac305] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Received: 05/06/2022] [Revised: 06/20/2022] [Accepted: 07/06/2022] [Indexed: 11/12/2022]
Abstract
High-quality chromosome-scale genome sequences provide an important basis for downstream genomic analysis. In particular, the construction of haplotype-resolved and complete genomes plays a key role in genome annotation, mutation detection, evolutionary analysis, gene function research, comparative genomics and other areas. However, it is difficult to produce a complete genome from genome-wide short-read sequencing when the genome is complex, with high duplication and multiple heterozygosity. The emergence of long-read sequencing technology has greatly improved the completeness of complex genome assembly. We review a variety of computational methods for complex genome assembly and describe in detail the theories, innovations and shortcomings of collapsed, semi-collapsed and uncollapsed assemblers based on long reads. Among the three methods, uncollapsed assembly is the most correct and complete way to represent genomes. In addition, genome assembly is closely related to haplotype reconstruction: uncollapsed assembly realizes haplotype reconstruction, and haplotype reconstruction promotes uncollapsed assembly. We hope that gapless, telomere-to-telomere and accurate assembly of complex genomes can be routinely achieved using only a simple process or a single tool in the future.
Affiliation(s)
- Tianjiao Zhang
- College of Information and Computer Engineering, Northeast Forestry University, Harbin, 150040, China
- Jie Zhou
- College of Information and Computer Engineering, Northeast Forestry University, Harbin, 150040, China
- Wentao Gao
- College of Information and Computer Engineering, Northeast Forestry University, Harbin, 150040, China
- Yuran Jia
- College of Information and Computer Engineering, Northeast Forestry University, Harbin, 150040, China
- Yanan Wei
- College of Information and Computer Engineering, Northeast Forestry University, Harbin, 150040, China
- Guohua Wang
- College of Information and Computer Engineering, Northeast Forestry University, Harbin, 150040, China
9. Minio A, Cochetel N, Vondras AM, Massonnet M, Cantu D. Assembly of complete diploid-phased chromosomes from draft genome sequences. G3 (Bethesda) 2022; 12:6605224. [PMID: 35686922 PMCID: PMC9339290 DOI: 10.1093/g3journal/jkac143] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Received: 04/11/2022] [Accepted: 05/30/2022] [Indexed: 01/27/2023]
Abstract
De novo genome assembly is essential for genomic research. High-quality genomes assembled into phased pseudomolecules are challenging to produce and often contain assembly errors because of repeats, heterozygosity, or the chosen assembly strategy. Although algorithms that produce partially phased assemblies exist, haploid draft assemblies that may lack biological information remain favored because they are easier to generate and use. We developed HaploSync, a suite of tools that produces fully phased, chromosome-scale diploid genome assemblies and performs extensive quality control to limit assembly artifacts. HaploSync scaffolds sequences from a draft diploid assembly into phased pseudomolecules guided by a genetic map and/or the genome of a closely related species. HaploSync generates a report that visualizes the relationships between current and legacy sequences, for both haplotypes, and displays their gene and marker content. This quality control helps the user identify misassemblies and guides HaploSync's correction of scaffolding errors. Finally, HaploSync fills assembly gaps with unplaced sequences and resolves collapsed homozygous regions. In a series of plant, fungal, and animal kingdom case studies, we demonstrate that HaploSync efficiently increases the assembly contiguity of phased chromosomes, improves completeness by filling gaps, corrects scaffolding, and correctly phases highly heterozygous, complex regions.
Affiliation(s)
- Andrea Minio
- Department of Viticulture and Enology, University of California Davis, Davis, CA 95616, USA
| | - Noé Cochetel
- Department of Viticulture and Enology, University of California Davis, Davis, CA 95616, USA
| | - Amanda M Vondras
- Department of Viticulture and Enology, University of California Davis, Davis, CA 95616, USA
| | - Mélanie Massonnet
- Department of Viticulture and Enology, University of California Davis, Davis, CA 95616, USA
| | - Dario Cantu
- Department of Viticulture and Enology, University of California Davis, Davis, CA 95616, USA
| |
|
10
|
Pootakham W, Naktang C, Sonthirod C, Kongkachana W, Yoocha T, Jomchai N, Maknual C, Chumriang P, Pravinvongvuthi T, Tangphatsornruang S. De novo reference assembly of the upriver orange mangrove (Bruguiera sexangula) genome. Genome Biol Evol 2022; 14:6527208. [PMID: 35148390 PMCID: PMC8872974 DOI: 10.1093/gbe/evac025] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Accepted: 02/01/2022] [Indexed: 11/21/2022]
Abstract
Upriver orange mangrove (Bruguiera sexangula) is a member of the most mangrove-rich taxon (Rhizophoraceae family) and is commonly distributed in the intertidal zones in tropical and subtropical latitudes. In this study, we employed the 10× Genomics linked-read technology to obtain a preliminary de novo assembly of the B. sexangula genome, which was further scaffolded to a pseudomolecule level using the Bruguiera parviflora genome as a reference. The final assembly of the B. sexangula genome contained 260 Mb with an N50 scaffold length of 11,020,310 bases. The assembly comprised 18 pseudomolecules (corresponding to the haploid chromosome number in B. sexangula), covering 204,645,832 bases or 78.6% of the 260-Mb assembly. We predicted a total of 23,978 protein-coding sequences, 17,598 of which were associated with gene ontology terms. Our gene prediction recovered 96.6% of the highly conserved orthologs based on the Benchmarking Universal Single-Copy Orthologs (BUSCO) analysis. The chromosome-level assembly presented in this work provides a valuable genetic resource to help strengthen our understanding of mangroves’ physiological and morphological adaptations to the intertidal zones.
Affiliation(s)
- Wirulda Pootakham
- National Omics Center, National Science and Technology Development Agency (NSTDA), Pathum Thani, Thailand
| | - Chaiwat Naktang
- National Omics Center, National Science and Technology Development Agency (NSTDA), Pathum Thani, Thailand
| | - Chutima Sonthirod
- National Omics Center, National Science and Technology Development Agency (NSTDA), Pathum Thani, Thailand
| | - Wasitthee Kongkachana
- National Omics Center, National Science and Technology Development Agency (NSTDA), Pathum Thani, Thailand
| | - Thippawan Yoocha
- National Omics Center, National Science and Technology Development Agency (NSTDA), Pathum Thani, Thailand
| | - Nukoon Jomchai
- National Omics Center, National Science and Technology Development Agency (NSTDA), Pathum Thani, Thailand
| | - Chatree Maknual
- Department of Marine and Coastal Resources, 120 The Government Complex, Chaengwatthana Rd., Thung Song Hong, Bangkok, 10210, Thailand
| | - Pranom Chumriang
- Department of Marine and Coastal Resources, 120 The Government Complex, Chaengwatthana Rd., Thung Song Hong, Bangkok, 10210, Thailand
| | - Tamanai Pravinvongvuthi
- Department of Marine and Coastal Resources, 120 The Government Complex, Chaengwatthana Rd., Thung Song Hong, Bangkok, 10210, Thailand
| | | |
|
11
|
Hörandl E. Novel Approaches for Species Concepts and Delimitation in Polyploids and Hybrids. PLANTS (BASEL, SWITZERLAND) 2022; 11:204. [PMID: 35050093 PMCID: PMC8781807 DOI: 10.3390/plants11020204] [Citation(s) in RCA: 13] [Impact Index Per Article: 6.5] [Received: 11/11/2021] [Revised: 01/07/2022] [Accepted: 01/10/2022] [Indexed: 05/08/2023]
Abstract
Hybridization and polyploidization are important processes for plant evolution. However, classification of hybrid or polyploid species has been notoriously difficult because of the complexity of processes and different evolutionary scenarios that do not fit with classical species concepts. Polyploid complexes are formed via combinations of allopolyploidy, autopolyploidy and homoploid hybridization with persisting sexual reproduction, resulting in many discrete lineages that have been classified as species. Polyploid complexes with facultative apomixis result in complicated network-like clusters, or rarely in agamospecies. Various case studies illustrate the problems that arise when traditional species concepts are applied to hybrids and polyploids. Conceptual progress can be made if lineage formation is accepted as an inevitable consequence of meiotic sex, which is established already in the first eukaryotes as a DNA restoration tool. The turnaround of the viewpoint that sex forms species as lineages helps to overcome traditional thinking of species as "units". Lineage formation and self-sustainability are the prerequisites for speciation and can also be applied to hybrids and polyploids. Species delimitation is aided by the improved recognition of lineages via various novel -omics methods, by understanding meiosis functions, and by recognizing functional phenotypes by considering morphological-physiological-ecological adaptations.
Affiliation(s)
- Elvira Hörandl
- Department of Systematics, Biodiversity and Evolution of Plants (with Herbarium), University of Goettingen, 37073 Göttingen, Germany
| |
|
12
|
Pucker B, Irisarri I, de Vries J, Xu B. Plant genome sequence assembly in the era of long reads: Progress, challenges and future directions. QUANTITATIVE PLANT BIOLOGY 2022; 3:e5. [PMID: 37077982 PMCID: PMC10095996 DOI: 10.1017/qpb.2021.18] [Citation(s) in RCA: 18] [Impact Index Per Article: 9.0] [Received: 06/29/2021] [Revised: 11/24/2021] [Accepted: 12/21/2021] [Indexed: 05/03/2023]
Abstract
Third-generation long-read sequencing is transforming plant genomics. Oxford Nanopore Technologies and Pacific Biosciences offer competing long-read sequencing technologies that enable plant scientists to investigate even large and complex plant genomes. Sequencing projects can be conducted by single research groups and sequences of smaller plant genomes can be completed within days. This has also led to increased investigation of genomes from multiple species at large scale to address fundamental questions associated with the origin and evolution of land plants. Increased accessibility of sequencing devices and user-friendly software allows more researchers to get involved in genomics. Current challenges are accurately resolving diploid or polyploid genome sequences and better accounting for the intra-specific diversity by switching from the use of single reference genome sequences to a pangenome graph.
Affiliation(s)
- Boas Pucker
- Department of Plant Sciences, University of Cambridge, Cambridge, United Kingdom
- Institute of Plant Biology & Braunschweig Integrated Centre of Systems Biology (BRICS), TU Braunschweig, Braunschweig, Germany
- Author for correspondence: Boas Pucker
| | - Iker Irisarri
- Department of Applied Bioinformatics, Institute for Microbiology and Genetics, University of Goettingen, Göttingen, Germany
- Campus Institute Data Science (CIDAS), University of Goettingen, Göttingen, Germany
| | - Jan de Vries
- Department of Applied Bioinformatics, Institute for Microbiology and Genetics, University of Goettingen, Göttingen, Germany
- Campus Institute Data Science (CIDAS), University of Goettingen, Göttingen, Germany
- Department of Applied Bioinformatics, Göttingen Center for Molecular Biosciences (GZMB), University of Goettingen, Göttingen, Germany
| | - Bo Xu
- State Key Laboratory of Systematic and Evolutionary Botany, Institute of Botany, Chinese Academy of Sciences, Beijing, China
| |
|
13
|
Kiryushkin AS, Ilina EL, Guseva ED, Pawlowski K, Demchenko KN. Hairy CRISPR: Genome Editing in Plants Using Hairy Root Transformation. PLANTS (BASEL, SWITZERLAND) 2021; 11:51. [PMID: 35009056 PMCID: PMC8747350 DOI: 10.3390/plants11010051] [Citation(s) in RCA: 21] [Impact Index Per Article: 7.0] [Received: 11/26/2021] [Revised: 12/15/2021] [Accepted: 12/20/2021] [Indexed: 05/27/2023]
Abstract
CRISPR/Cas-mediated genome editing is a powerful tool of plant functional genomics. Hairy root transformation is a rapid and convenient approach for obtaining transgenic roots. When combined, these techniques represent a fast and effective means of studying gene function. In this review, we outline the current state of the art reached by the combination of these approaches over seven years. Additionally, we discuss the origins of different Agrobacterium rhizogenes strains that are widely used for hairy root transformation; the components of CRISPR/Cas vectors, such as the promoters that drive Cas or gRNA expression, the types of Cas nuclease, and selectable and screenable markers; and the application of CRISPR/Cas genome editing in hairy roots. The modification of the already known vector pKSE401 with the addition of the rice translational enhancer OsMac3 and the gene encoding the fluorescent protein DsRed1 is also described.
Affiliation(s)
- Alexey S. Kiryushkin
- Laboratory of Cellular and Molecular Mechanisms of Plant Development, Komarov Botanical Institute, Russian Academy of Sciences, 197376 Saint Petersburg, Russia
| | - Elena L. Ilina
- Laboratory of Cellular and Molecular Mechanisms of Plant Development, Komarov Botanical Institute, Russian Academy of Sciences, 197376 Saint Petersburg, Russia
| | - Elizaveta D. Guseva
- Laboratory of Cellular and Molecular Mechanisms of Plant Development, Komarov Botanical Institute, Russian Academy of Sciences, 197376 Saint Petersburg, Russia
| | - Katharina Pawlowski
- Department of Ecology, Environment and Plant Sciences, Stockholm University, 10691 Stockholm, Sweden
| | - Kirill N. Demchenko
- Laboratory of Cellular and Molecular Mechanisms of Plant Development, Komarov Botanical Institute, Russian Academy of Sciences, 197376 Saint Petersburg, Russia
| |
|
14
|
Bohutínská M, Handrick V, Yant L, Schmickl R, Kolář F, Bomblies K, Paajanen P. De Novo Mutation and Rapid Protein (Co-)evolution during Meiotic Adaptation in Arabidopsis arenosa. Mol Biol Evol 2021; 38:1980-1994. [PMID: 33502506 PMCID: PMC8097281 DOI: 10.1093/molbev/msab001] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Indexed: 02/07/2023]
Abstract
A sudden shift in environment or cellular context necessitates rapid adaptation. A dramatic example is genome duplication, which leads to polyploidy. In such situations, the waiting time for new mutations might be prohibitive; theoretical and empirical studies suggest that rapid adaptation will largely rely on standing variation already present in source populations. Here, we investigate the evolution of meiosis proteins in Arabidopsis arenosa, some of which were previously implicated in adaptation to polyploidy, and in a diploid, habitat. A striking and unexplained feature of prior results was the large number of amino acid changes in multiple interacting proteins, especially in the relatively young tetraploid. Here, we investigate whether selection on meiosis genes is found in other lineages, how the polyploid may have accumulated so many differences, and whether derived variants were selected from standing variation. We use a range-wide sample of 145 resequenced genomes of diploid and tetraploid A. arenosa, with new genome assemblies. We confirmed signals of positive selection in the polyploid and diploid lineages they were previously reported in and find additional meiosis genes with evidence of selection. We show that the polyploid lineage stands out both qualitatively and quantitatively. Compared with diploids, meiosis proteins in the polyploid have more amino acid changes and a higher proportion affecting more strongly conserved sites. We find evidence that in tetraploids, positive selection may have commonly acted on de novo mutations. Several tests provide hints that coevolution, and in some cases, multinucleotide mutations, might contribute to rapid accumulation of changes in meiotic proteins.
Affiliation(s)
- Magdalena Bohutínská
- Department of Botany, Faculty of Science, Charles University, Prague, Czech Republic; Institute of Botany of the Czech Academy of Sciences, Průhonice, Czech Republic
| | - Vinzenz Handrick
- Department of Cell and Developmental Biology, John Innes Centre, Norwich, United Kingdom
| | - Levi Yant
- Department of Cell and Developmental Biology, John Innes Centre, Norwich, United Kingdom
| | - Roswitha Schmickl
- Department of Botany, Faculty of Science, Charles University, Prague, Czech Republic; Institute of Botany of the Czech Academy of Sciences, Průhonice, Czech Republic
| | - Filip Kolář
- Department of Botany, Faculty of Science, Charles University, Prague, Czech Republic; Institute of Botany of the Czech Academy of Sciences, Průhonice, Czech Republic; Department of Botany, University of Innsbruck, Innsbruck, Austria
| | - Kirsten Bomblies
- Department of Cell and Developmental Biology, John Innes Centre, Norwich, United Kingdom; Plant Evolutionary Genetics, Department of Biology, Institute of Molecular Plant Biology, ETH Zürich, Zurich, Switzerland
| | - Pirita Paajanen
- Department of Cell and Developmental Biology, John Innes Centre, Norwich, United Kingdom
| |
|
15
|
Pootakham W, Naktang C, Kongkachana W, Sonthirod C, Yoocha T, Sangsrakru D, Jomchai N, U-Thoomporn S, Romyanon K, Toojinda T, Tangphatsornruang S. De novo chromosome-level assembly of the Centella asiatica genome. Genomics 2021; 113:2221-2228. [PMID: 34022344 DOI: 10.1016/j.ygeno.2021.05.019] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Received: 09/17/2020] [Revised: 03/05/2021] [Accepted: 05/17/2021] [Indexed: 12/22/2022]
Abstract
Centella asiatica is a herbaceous, perennial species indigenous to India and Southeast Asia. C. asiatica possesses several medicinal properties: anti-aging, anti-inflammatory, wound healing and memory enhancing. The lack of available genomics resources significantly impedes the improvement of C. asiatica varieties through molecular breeding. Here, we combined the 10× Genomics linked-read technology and the long-range HiC technique to obtain the genome assembly. The final assembly contained nine pseudomolecules, corresponding to the haploid chromosome number in C. asiatica. These nine chromosomes covered 402,536,584 bases or 93.6% of the 430-Mb assembly. Comparative genomics analyses based on single-copy orthologous genes showed that C. asiatica and the common ancestor of Coriandrum sativum (coriander) and Daucus carota (carrot) diverged about 48 million years ago. This assembly provides a valuable reference genome for future molecular studies, varietal development through marker-assisted breeding and comparative genomics studies in C. asiatica.
Affiliation(s)
- Wirulda Pootakham
- National Omics Center, National Science and Technology Development Agency (NSTDA), Pathum Thani, Thailand.
| | - Chaiwat Naktang
- National Omics Center, National Science and Technology Development Agency (NSTDA), Pathum Thani, Thailand
| | - Wasitthee Kongkachana
- National Omics Center, National Science and Technology Development Agency (NSTDA), Pathum Thani, Thailand
| | - Chutima Sonthirod
- National Omics Center, National Science and Technology Development Agency (NSTDA), Pathum Thani, Thailand
| | - Thippawan Yoocha
- National Omics Center, National Science and Technology Development Agency (NSTDA), Pathum Thani, Thailand
| | - Duangjai Sangsrakru
- National Omics Center, National Science and Technology Development Agency (NSTDA), Pathum Thani, Thailand
| | - Nukoon Jomchai
- National Omics Center, National Science and Technology Development Agency (NSTDA), Pathum Thani, Thailand
| | - Sonicha U-Thoomporn
- National Omics Center, National Science and Technology Development Agency (NSTDA), Pathum Thani, Thailand
| | - Kanokwan Romyanon
- National Center for Genetic Engineering and Biotechnology, National Science and Technology Development Agency (NSTDA), Pathum Thani, Thailand
| | - Theerayut Toojinda
- National Center for Genetic Engineering and Biotechnology, National Science and Technology Development Agency (NSTDA), Pathum Thani, Thailand
| | | |
|
16
|
Peng Y, Li H, Liu Z, Zhang C, Li K, Gong Y, Geng L, Su J, Guan X, Liu L, Zhou R, Zhao Z, Guo J, Liang Q, Li X. Chromosome-level genome assembly of the Arctic fox (Vulpes lagopus) using PacBio sequencing and Hi-C technology. Mol Ecol Resour 2021; 21:2093-2108. [PMID: 33829635 DOI: 10.1111/1755-0998.13397] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 07/31/2020] [Revised: 03/28/2021] [Accepted: 03/30/2021] [Indexed: 10/21/2022]
Abstract
The Arctic fox (Vulpes lagopus) is the only fox species occurring in the Arctic and has adapted to its extreme climatic conditions. Currently, the molecular basis of its adaptation to the extreme climate has not been characterized. Here, we applied PacBio sequencing and a chromosome structure capture technique to produce the first V. lagopus genome assembly, which is assembled into chromosome fragments. The genome assembly has a total length of 2.345 Gb with a contig N50 of 31.848 Mb and a scaffold N50 of 131.537 Mb, consisting of 25 pseudochromosomal scaffolds. The V. lagopus genome had approximately 32.33% repeat sequences. In total, 21,278 protein-coding genes were predicted, of which 99.14% were functionally annotated. Compared with 12 other mammals, V. lagopus was most closely related to V. vulpes with an estimated divergence time of ~7.1 Ma. The expanded gene families and positively selected genes potentially play roles in the adaptation of V. lagopus to the extreme Arctic environment. This high-quality assembled genome will not only promote future studies of genetic diversity and evolution in foxes and other canids but also provide important resources for conservation of Arctic species.
Affiliation(s)
- Yongdong Peng
- Hebei Key Laboratory of Specialty Animal Germplasm Resources Exploration and Innovation (Under Planning), College of Animal Science and Technology, Hebei Normal University of Science and Technology, Qinhuangdao, China
| | - Hong Li
- Novogene Bioinformatics Institute, Beijing, China
| | - Zhengzhu Liu
- Hebei Key Laboratory of Specialty Animal Germplasm Resources Exploration and Innovation (Under Planning), College of Animal Science and Technology, Hebei Normal University of Science and Technology, Qinhuangdao, China
| | - Chuansheng Zhang
- Hebei Key Laboratory of Specialty Animal Germplasm Resources Exploration and Innovation (Under Planning), College of Animal Science and Technology, Hebei Normal University of Science and Technology, Qinhuangdao, China
| | - Keqiang Li
- Hebei Key Laboratory of Specialty Animal Germplasm Resources Exploration and Innovation (Under Planning), College of Mathematics and Information Science, Hebei Normal University of Science and Technology, Qinhuangdao, China
| | - Yuanfang Gong
- Hebei Key Laboratory of Specialty Animal Germplasm Resources Exploration and Innovation (Under Planning), College of Animal Science and Technology, Hebei Normal University of Science and Technology, Qinhuangdao, China
| | - Liying Geng
- Hebei Key Laboratory of Specialty Animal Germplasm Resources Exploration and Innovation (Under Planning), College of Agronomy and Biotechnology, Hebei Normal University of Science and Technology, Qinhuangdao, China
| | - Jingjing Su
- Hebei Key Laboratory of Applied Chemistry, School of Environmental and Chemical Engineering, Yanshan University, Qinhuangdao, China
| | - Xuemin Guan
- Hebei Key Laboratory of Specialty Animal Germplasm Resources Exploration and Innovation (Under Planning), College of Agronomy and Biotechnology, Hebei Normal University of Science and Technology, Qinhuangdao, China
| | - Lei Liu
- College of Animal Science and Technology, Shandong Agricultural University, Tai-an, China
| | - Ruihong Zhou
- Hebei Key Laboratory of Specialty Animal Germplasm Resources Exploration and Innovation (Under Planning), College of Animal Science and Technology, Hebei Normal University of Science and Technology, Qinhuangdao, China
| | - Ziya Zhao
- Hebei Key Laboratory of Specialty Animal Germplasm Resources Exploration and Innovation (Under Planning), College of Animal Science and Technology, Hebei Normal University of Science and Technology, Qinhuangdao, China
| | - Jianxu Guo
- Hebei Key Laboratory of Specialty Animal Germplasm Resources Exploration and Innovation (Under Planning), College of Animal Science and Technology, Hebei Normal University of Science and Technology, Qinhuangdao, China
| | - Qiqi Liang
- Novogene Bioinformatics Institute, Beijing, China
| | - Xianglong Li
- Hebei Key Laboratory of Specialty Animal Germplasm Resources Exploration and Innovation (Under Planning), College of Animal Science and Technology, Hebei Normal University of Science and Technology, Qinhuangdao, China
| |
|
17
|
Du M, Wang T, Lian Q, Zhang X, Xin G, Pu Y, Bryan GJ, Qi J. Developing a new model system for potato genetics by androgenesis. JOURNAL OF INTEGRATIVE PLANT BIOLOGY 2021; 63:628-633. [PMID: 32965762 DOI: 10.1111/jipb.13018] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 07/27/2020] [Accepted: 09/21/2020] [Indexed: 06/11/2023]
Abstract
High heterozygosity and tetrasomic inheritance complicate studies of asexually propagated polyploids, such as potato. Reverse genetics approaches, especially mutant library construction, can be an ideal choice if a proper mutagenesis genotype is available. Here, we aimed to generate a model system for potato research using anther cultures of Solanum verrucosum, a self-compatible diploid potato with strong late blight resistance. Six of the 23 regenerants obtained (SVA4, SVA7, SVA22, SVA23, SVA32, and SVA33) were diploids, and their homozygosity was estimated to be >99.99% with 22 polymorphic InDel markers. Two lines, SVA4 and SVA32, had reduced stature (plant height ≤80 cm), high seed yield (>1,000 seeds/plant), and good tuber set (>30 tubers/plant). We further confirmed the full homozygosity of SVA4 and SVA32 using whole-genome resequencing. These two regenerants possess all the characteristics of a model plant: diploidy, 100% homozygosity, self-compatibility, and amenability to transgenesis. Thus, we have successfully generated two lines, SVA4 and SVA32, which can potentially be used for mutagenesis and as model plants to rejuvenate current methods of conducting potato research.
Affiliation(s)
- Miru Du
- Inner Mongolia Potato Engineering and Technology Research Center, Inner Mongolia University, Hohhot, 010021, China
| | - Ting Wang
- Inner Mongolia Potato Engineering and Technology Research Center, Inner Mongolia University, Hohhot, 010021, China
| | - Qun Lian
- Shenzhen Branch, Guangdong Laboratory for Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, 518000, China
| | - Xiaojie Zhang
- Inner Mongolia Potato Engineering and Technology Research Center, Inner Mongolia University, Hohhot, 010021, China
| | - Guohui Xin
- Inner Mongolia Potato Engineering and Technology Research Center, Inner Mongolia University, Hohhot, 010021, China
| | - Yuanyuan Pu
- Inner Mongolia Potato Engineering and Technology Research Center, Inner Mongolia University, Hohhot, 010021, China
| | - Glenn J Bryan
- Cell and Molecular Sciences, The James Hutton Institute, Invergowrie, Dundee, DD2 5DA, UK
| | - Jianjian Qi
- Inner Mongolia Potato Engineering and Technology Research Center, Inner Mongolia University, Hohhot, 010021, China
| |
|
18
|
Peona V, Blom MPK, Xu L, Burri R, Sullivan S, Bunikis I, Liachko I, Haryoko T, Jønsson KA, Zhou Q, Irestedt M, Suh A. Identifying the causes and consequences of assembly gaps using a multiplatform genome assembly of a bird-of-paradise. Mol Ecol Resour 2021; 21:263-286. [PMID: 32937018 PMCID: PMC7757076 DOI: 10.1111/1755-0998.13252] [Citation(s) in RCA: 74] [Impact Index Per Article: 24.7] [Received: 04/14/2020] [Revised: 08/21/2020] [Accepted: 08/26/2020] [Indexed: 01/09/2023]
Abstract
Genome assemblies are currently being produced at an impressive rate by consortia and individual laboratories. The low costs and increasing efficiency of sequencing technologies now enable assembling genomes at unprecedented quality and contiguity. However, the difficulty in assembling repeat-rich and GC-rich regions (genomic "dark matter") limits insights into the evolution of genome structure and regulatory networks. Here, we compare the efficiency of currently available sequencing technologies (short/linked/long reads and proximity ligation maps) and combinations thereof in assembling genomic dark matter. By adopting different de novo assembly strategies, we compare individual draft assemblies to a curated multiplatform reference assembly and identify the genomic features that cause gaps within each assembly. We show that a multiplatform assembly implementing long-read, linked-read and proximity sequencing technologies performs best at recovering transposable elements, multicopy MHC genes, GC-rich microchromosomes and the repeat-rich W chromosome. Telomere-to-telomere assemblies are not a reality yet for most organisms, but by leveraging technology choice it is now possible to minimize genome assembly gaps for downstream analysis. We provide a roadmap to tailor sequencing projects for optimized completeness of both the coding and noncoding parts of nonmodel genomes.
Affiliation(s)
- Valentina Peona
- Department of Ecology and Genetics—Evolutionary Biology, Science for Life Laboratories, Uppsala University, Uppsala, Sweden
- Department of Organismal Biology—Systematic Biology, Science for Life Laboratories, Uppsala University, Uppsala, Sweden
| | - Mozes P. K. Blom
- Department of Bioinformatics and Genetics, Swedish Museum of Natural History, Stockholm, Sweden
- Museum für Naturkunde, Leibniz Institut für Evolutions- und Biodiversitätsforschung, Berlin, Germany
| | - Luohao Xu
- Department of Neurosciences and Developmental Biology, University of Vienna, Vienna, Austria
| | - Reto Burri
- Department of Population Ecology, Institute of Ecology and Evolution, Friedrich-Schiller-University Jena, Jena, Germany
| | | | - Ignas Bunikis
- Department of Immunology, Genetics and Pathology, Science for Life Laboratory, Uppsala Genome Center, Uppsala University, Uppsala, Sweden
| | | | - Tri Haryoko
- Research Centre for Biology, Museum Zoologicum Bogoriense, Indonesian Institute of Sciences (LIPI), Cibinong, Indonesia
| | - Knud A. Jønsson
- Natural History Museum of Denmark, University of Copenhagen, Copenhagen, Denmark
| | - Qi Zhou
- Department of Neurosciences and Developmental Biology, University of Vienna, Vienna, Austria
- MOE Laboratory of Biosystems Homeostasis & Protection, Life Sciences Institute, Zhejiang University, Hangzhou, China
- Center for Reproductive Medicine, The 2nd Affiliated Hospital, School of Medicine, Zhejiang University, Hangzhou, China
| | - Martin Irestedt
- Department of Bioinformatics and Genetics, Swedish Museum of Natural History, Stockholm, Sweden
| | - Alexander Suh
- Department of Ecology and Genetics—Evolutionary Biology, Science for Life Laboratories, Uppsala University, Uppsala, Sweden
- Department of Organismal Biology—Systematic Biology, Science for Life Laboratories, Uppsala University, Uppsala, Sweden
- School of Biological Sciences—Organisms and the Environment, University of East Anglia, Norwich, UK
| |
|
19
|
Murigneux V, Rai SK, Furtado A, Bruxner TJC, Tian W, Harliwong I, Wei H, Yang B, Ye Q, Anderson E, Mao Q, Drmanac R, Wang O, Peters BA, Xu M, Wu P, Topp B, Coin LJM, Henry RJ. Comparison of long-read methods for sequencing and assembly of a plant genome. Gigascience 2020; 9:giaa146. [PMID: 33347571 PMCID: PMC7751402 DOI: 10.1093/gigascience/giaa146] [Citation(s) in RCA: 43] [Impact Index Per Article: 10.8] [Received: 03/13/2020] [Revised: 07/07/2020] [Accepted: 11/22/2020] [Indexed: 01/25/2023]
Abstract
BACKGROUND: Sequencing technologies have advanced to the point where it is possible to generate high-accuracy, haplotype-resolved, chromosome-scale assemblies. Several long-read sequencing technologies are available, and a growing number of algorithms have been developed to assemble the reads generated by those technologies. When starting a new genome project, it is therefore challenging to select the most cost-effective sequencing technology, as well as the most appropriate software for assembly and polishing. It is thus important to benchmark different approaches applied to the same sample. RESULTS: Here, we report a comparison of 3 long-read sequencing technologies applied to the de novo assembly of a plant genome, Macadamia jansenii. We have generated sequencing data using Pacific Biosciences (Sequel I), Oxford Nanopore Technologies (PromethION), and BGI (single-tube Long Fragment Read) technologies for the same sample. Several assemblers were benchmarked in the assembly of Pacific Biosciences and Nanopore reads. Results obtained from combining long-read technologies or short-read and long-read technologies are also presented. The assemblies were compared for contiguity, base accuracy, and completeness, as well as sequencing costs and DNA material requirements. CONCLUSIONS: The 3 long-read technologies produced highly contiguous and complete genome assemblies of M. jansenii. At the time of sequencing, the cost associated with each method was significantly different, but continuous improvements in technologies have resulted in greater accuracy, increased throughput, and reduced costs. We propose updating this comparison regularly with reports on significant iterations of the sequencing technologies.
Affiliation(s)
- Valentine Murigneux
- Genome Innovation Hub, The University of Queensland, 306 Carmody Road, Brisbane, QLD 4072, Australia
- Institute for Molecular Bioscience, The University of Queensland, 306 Carmody Road, Brisbane, QLD 4072, Australia
| | - Subash Kumar Rai
- Genome Innovation Hub, The University of Queensland, 306 Carmody Road, Brisbane, QLD 4072, Australia
- Institute for Molecular Bioscience, The University of Queensland, 306 Carmody Road, Brisbane, QLD 4072, Australia
| | - Agnelo Furtado
- Queensland Alliance for Agriculture and Food Innovation, The University of Queensland, Brisbane, QLD 4072, Australia
| | - Timothy J C Bruxner
- Institute for Molecular Bioscience, The University of Queensland, 306 Carmody Road, Brisbane, QLD 4072, Australia
| | - Wei Tian
- BGI-Shenzhen, No.21 Hongan 3rd Street, Yantian District, Shenzhen 518083, China
- BGI-Australia, 300 Herston Road, Herston, QLD 4006, Australia
| | - Ivon Harliwong
- BGI-Shenzhen, No.21 Hongan 3rd Street, Yantian District, Shenzhen 518083, China
- BGI-Australia, 300 Herston Road, Herston, QLD 4006, Australia
| | - Hanmin Wei
- BGI-Shenzhen, No.21 Hongan 3rd Street, Yantian District, Shenzhen 518083, China
- MGI, BGI-Shenzhen, Building 11, Beishan Industrial Zone, Yantian District, Shenzhen 518083, China
| | - Bicheng Yang
- BGI-Shenzhen, No.21 Hongan 3rd Street, Yantian District, Shenzhen 518083, China
- BGI-Australia, 300 Herston Road, Herston, QLD 4006, Australia
| | - Qianyu Ye
- BGI-Shenzhen, No.21 Hongan 3rd Street, Yantian District, Shenzhen 518083, China
- BGI-Australia, 300 Herston Road, Herston, QLD 4006, Australia
| | - Ellis Anderson
- MGI, BGI-Shenzhen, Building 11, Beishan Industrial Zone, Yantian District, Shenzhen 518083, China
- Advanced Genomics Technology Lab, Complete Genomics Inc., 2904 Orchard Parkway, San Jose, CA 95134, USA
| | - Qing Mao
- MGI, BGI-Shenzhen, Building 11, Beishan Industrial Zone, Yantian District, Shenzhen 518083, China
- Advanced Genomics Technology Lab, Complete Genomics Inc., 2904 Orchard Parkway, San Jose, CA 95134, USA
| | - Radoje Drmanac
- BGI-Shenzhen, No.21 Hongan 3rd Street, Yantian District, Shenzhen 518083, China
- MGI, BGI-Shenzhen, Building 11, Beishan Industrial Zone, Yantian District, Shenzhen 518083, China
- Advanced Genomics Technology Lab, Complete Genomics Inc., 2904 Orchard Parkway, San Jose, CA 95134, USA
| | - Ou Wang
- BGI-Shenzhen, No.21 Hongan 3rd Street, Yantian District, Shenzhen 518083, China
| | - Brock A Peters
- BGI-Shenzhen, No.21 Hongan 3rd Street, Yantian District, Shenzhen 518083, China
- MGI, BGI-Shenzhen, Building 11, Beishan Industrial Zone, Yantian District, Shenzhen 518083, China
- Advanced Genomics Technology Lab, Complete Genomics Inc., 2904 Orchard Parkway, San Jose, CA 95134, USA
| | - Mengyang Xu
- BGI-Shenzhen, No.21 Hongan 3rd Street, Yantian District, Shenzhen 518083, China
- BGI-Qingdao, Building 2, No. 2 Hengyunshan Road, Qingdao 266555, China
| | - Pei Wu
- BGI-Shenzhen, No.21 Hongan 3rd Street, Yantian District, Shenzhen 518083, China
- BGI-Tianjin, Airport Business Park, Building E3, Airport Economics Area, Tianjin 300308, China
| | - Bruce Topp
- Queensland Alliance for Agriculture and Food Innovation, The University of Queensland, Brisbane, QLD 4072, Australia
| | - Lachlan J M Coin
- Genome Innovation Hub, The University of Queensland, 306 Carmody Road, Brisbane, QLD 4072, Australia
- Institute for Molecular Bioscience, The University of Queensland, 306 Carmody Road, Brisbane, QLD 4072, Australia
- Department of Microbiology and Immunology, University of Melbourne at The Peter Doherty Institute for Infection and Immunity, 792 Elizabeth Street, Melbourne, VIC 3004, Australia
| | - Robert J Henry
- Queensland Alliance for Agriculture and Food Innovation, The University of Queensland, Brisbane, QLD 4072, Australia
| |
|
20
|
Murigneux V, Rai SK, Furtado A, Bruxner TJC, Tian W, Harliwong I, Wei H, Yang B, Ye Q, Anderson E, Mao Q, Drmanac R, Wang O, Peters BA, Xu M, Wu P, Topp B, Coin LJM, Henry RJ. Comparison of long-read methods for sequencing and assembly of a plant genome. Gigascience 2020; 9:6042729. [PMID: 33347571 DOI: 10.1101/2020.03.16.992933] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 03/13/2020] [Revised: 07/07/2020] [Accepted: 11/22/2020] [Indexed: 05/23/2023]
Abstract
BACKGROUND Sequencing technologies have advanced to the point where it is possible to generate high-accuracy, haplotype-resolved, chromosome-scale assemblies. Several long-read sequencing technologies are available, and a growing number of algorithms have been developed to assemble the reads generated by those technologies. When starting a new genome project, it is therefore challenging to select the most cost-effective sequencing technology, as well as the most appropriate software for assembly and polishing. It is thus important to benchmark different approaches applied to the same sample. RESULTS Here, we report a comparison of 3 long-read sequencing technologies applied to the de novo assembly of a plant genome, Macadamia jansenii. We have generated sequencing data using Pacific Biosciences (Sequel I), Oxford Nanopore Technologies (PromethION), and BGI (single-tube Long Fragment Read) technologies for the same sample. Several assemblers were benchmarked in the assembly of Pacific Biosciences and Nanopore reads. Results obtained from combining long-read technologies or short-read and long-read technologies are also presented. The assemblies were compared for contiguity, base accuracy, and completeness, as well as sequencing costs and DNA material requirements. CONCLUSIONS The 3 long-read technologies produced highly contiguous and complete genome assemblies of M. jansenii. At the time of sequencing, the cost associated with each method was significantly different, but continuous improvements in technologies have resulted in greater accuracy, increased throughput, and reduced costs. We propose updating this comparison regularly with reports on significant iterations of the sequencing technologies.
Affiliation(s)
- Valentine Murigneux
- Genome Innovation Hub, The University of Queensland, 306 Carmody Road, Brisbane, QLD 4072, Australia
- Institute for Molecular Bioscience, The University of Queensland, 306 Carmody Road, Brisbane, QLD 4072, Australia
| | - Subash Kumar Rai
- Genome Innovation Hub, The University of Queensland, 306 Carmody Road, Brisbane, QLD 4072, Australia
- Institute for Molecular Bioscience, The University of Queensland, 306 Carmody Road, Brisbane, QLD 4072, Australia
| | - Agnelo Furtado
- Queensland Alliance for Agriculture and Food Innovation, The University of Queensland, Brisbane, QLD 4072, Australia
| | - Timothy J C Bruxner
- Institute for Molecular Bioscience, The University of Queensland, 306 Carmody Road, Brisbane, QLD 4072, Australia
| | - Wei Tian
- BGI-Shenzhen, No.21 Hongan 3rd Street, Yantian District, Shenzhen 518083, China
- BGI-Australia, 300 Herston Road, Herston, QLD 4006, Australia
| | - Ivon Harliwong
- BGI-Shenzhen, No.21 Hongan 3rd Street, Yantian District, Shenzhen 518083, China
- BGI-Australia, 300 Herston Road, Herston, QLD 4006, Australia
| | - Hanmin Wei
- BGI-Shenzhen, No.21 Hongan 3rd Street, Yantian District, Shenzhen 518083, China
- MGI, BGI-Shenzhen, Building 11, Beishan Industrial Zone, Yantian District, Shenzhen 518083, China
| | - Bicheng Yang
- BGI-Shenzhen, No.21 Hongan 3rd Street, Yantian District, Shenzhen 518083, China
- BGI-Australia, 300 Herston Road, Herston, QLD 4006, Australia
| | - Qianyu Ye
- BGI-Shenzhen, No.21 Hongan 3rd Street, Yantian District, Shenzhen 518083, China
- BGI-Australia, 300 Herston Road, Herston, QLD 4006, Australia
| | - Ellis Anderson
- MGI, BGI-Shenzhen, Building 11, Beishan Industrial Zone, Yantian District, Shenzhen 518083, China
- Advanced Genomics Technology Lab, Complete Genomics Inc., 2904 Orchard Parkway, San Jose, CA 95134, USA
| | - Qing Mao
- MGI, BGI-Shenzhen, Building 11, Beishan Industrial Zone, Yantian District, Shenzhen 518083, China
- Advanced Genomics Technology Lab, Complete Genomics Inc., 2904 Orchard Parkway, San Jose, CA 95134, USA
| | - Radoje Drmanac
- BGI-Shenzhen, No.21 Hongan 3rd Street, Yantian District, Shenzhen 518083, China
- MGI, BGI-Shenzhen, Building 11, Beishan Industrial Zone, Yantian District, Shenzhen 518083, China
- Advanced Genomics Technology Lab, Complete Genomics Inc., 2904 Orchard Parkway, San Jose, CA 95134, USA
| | - Ou Wang
- BGI-Shenzhen, No.21 Hongan 3rd Street, Yantian District, Shenzhen 518083, China
| | - Brock A Peters
- BGI-Shenzhen, No.21 Hongan 3rd Street, Yantian District, Shenzhen 518083, China
- MGI, BGI-Shenzhen, Building 11, Beishan Industrial Zone, Yantian District, Shenzhen 518083, China
- Advanced Genomics Technology Lab, Complete Genomics Inc., 2904 Orchard Parkway, San Jose, CA 95134, USA
| | - Mengyang Xu
- BGI-Shenzhen, No.21 Hongan 3rd Street, Yantian District, Shenzhen 518083, China
- BGI-Qingdao, Building 2, No. 2 Hengyunshan Road, Qingdao 266555, China
| | - Pei Wu
- BGI-Shenzhen, No.21 Hongan 3rd Street, Yantian District, Shenzhen 518083, China
- BGI-Tianjin, Airport Business Park, Building E3, Airport Economics Area, Tianjin 300308, China
| | - Bruce Topp
- Queensland Alliance for Agriculture and Food Innovation, The University of Queensland, Brisbane, QLD 4072, Australia
| | - Lachlan J M Coin
- Genome Innovation Hub, The University of Queensland, 306 Carmody Road, Brisbane, QLD 4072, Australia
- Institute for Molecular Bioscience, The University of Queensland, 306 Carmody Road, Brisbane, QLD 4072, Australia
- Department of Microbiology and Immunology, University of Melbourne at The Peter Doherty Institute for Infection and Immunity, 792 Elizabeth Street, Melbourne, VIC 3004, Australia
| | - Robert J Henry
- Queensland Alliance for Agriculture and Food Innovation, The University of Queensland, Brisbane, QLD 4072, Australia
| |
|
21
|
Bohra A, Jha UC, Godwin ID, Varshney RK. Genomic interventions for sustainable agriculture. Plant Biotechnology Journal 2020; 18:2388-2405. [PMID: 32875704 PMCID: PMC7680532 DOI: 10.1111/pbi.13472] [Citation(s) in RCA: 36] [Impact Index Per Article: 9.0] [Received: 04/25/2020] [Revised: 07/21/2020] [Accepted: 08/16/2020] [Indexed: 05/05/2023]
Abstract
Agricultural production faces a Herculean challenge to feed the increasing global population. Food production systems need to deliver more with finite land and water resources while exerting the least negative influence on the ecosystem. The unpredictability of climate change and consequent changes in pests/pathogens dynamics aggravate the enormity of the challenge. Crop improvement has made significant contributions towards food security, and breeding climate-smart cultivars are considered the most sustainable way to accelerate food production. However, a fundamental change is needed in the conventional breeding framework in order to respond adequately to the growing food demands. Progress in genomics has provided new concepts and tools that hold promise to make plant breeding procedures more precise and efficient. For instance, reference genome assemblies in combination with germplasm sequencing delineate breeding targets that could contribute to securing future food supply. In this review, we highlight key breakthroughs in plant genome sequencing and explain how the presence of these genome resources in combination with gene editing techniques has revolutionized the procedures of trait discovery and manipulation. Adoption of new approaches such as speed breeding, genomic selection and haplotype-based breeding could overcome several limitations of conventional breeding. We advocate that strengthening varietal release and seed distribution systems will play a more determining role in delivering genetic gains at farmer's field. A holistic approach outlined here would be crucial to deliver steady stream of climate-smart crop cultivars for sustainable agriculture.
Affiliation(s)
- Abhishek Bohra
- ICAR-Indian Institute of Pulses Research (IIPR), Kanpur, India
| | - Uday Chand Jha
- ICAR-Indian Institute of Pulses Research (IIPR), Kanpur, India
| | - Ian D. Godwin
- Centre for Crop Science, Queensland Alliance for Agriculture and Food Innovation (QAAFI), The University of Queensland, Brisbane, Qld, Australia
| | - Rajeev Kumar Varshney
- International Crops Research Institute for the Semi-Arid Tropics (ICRISAT), Hyderabad, India
- The UWA Institute of Agriculture, The University of Western Australia, Perth, Australia
| |
|
22
|
Jung H, Ventura T, Chung JS, Kim WJ, Nam BH, Kong HJ, Kim YO, Jeon MS, Eyun SI. Twelve quick steps for genome assembly and annotation in the classroom. PLoS Comput Biol 2020; 16:e1008325. [PMID: 33180771 PMCID: PMC7660529 DOI: 10.1371/journal.pcbi.1008325] [Citation(s) in RCA: 25] [Impact Index Per Article: 6.3] [Indexed: 12/13/2022]
Abstract
Eukaryotic genome sequencing and de novo assembly, once the exclusive domain of well-funded international consortia, have become increasingly affordable, thus fitting the budgets of individual research groups. Third-generation long-read DNA sequencing technologies are increasingly used, providing extensive genomic toolkits that were once reserved for a few select model organisms. Generating high-quality genome assemblies and annotations for many aquatic species still presents significant challenges due to their large genome sizes, complexity, and high chromosome numbers. Indeed, selecting the most appropriate sequencing and software platforms and annotation pipelines for a new genome project can be daunting because tools often only work in limited contexts. In genomics, generating a high-quality genome assembly/annotation has become an indispensable tool for better understanding the biology of any species. Herein, we state 12 steps to help researchers get started in genome projects by presenting guidelines that are broadly applicable (to any species), sustainable over time, and cover all aspects of genome assembly and annotation projects from start to finish. We review some commonly used approaches, including practical methods to extract high-quality DNA and choices for the best sequencing platforms and library preparations. In addition, we discuss the range of potential bioinformatics pipelines, including structural and functional annotations (e.g., transposable elements and repetitive sequences). This paper also includes information on how to build a wide community for a genome project, the importance of data management, and how to make the data and results Findable, Accessible, Interoperable, and Reusable (FAIR) by submitting them to a public repository and sharing them with the research community.
Affiliation(s)
- Hyungtaek Jung
- School of Biological Sciences, The University of Queensland, St Lucia, Queensland, Australia
- Centre for Agriculture and Bioeconomy, Queensland University of Technology, Brisbane, Queensland, Australia
| | - Tomer Ventura
- Genecology Research Centre, School of Science and Engineering, University of the Sunshine Coast, Sippy Downs, Queensland, Australia
| | - J. Sook Chung
- Institute of Marine and Environmental Technology, University of Maryland Center for Environmental Science, Baltimore, Maryland, United States of America
| | - Woo-Jin Kim
- Genetics and Breeding Research Center, National Institute of Fisheries Science, Geoje, Korea
| | - Bo-Hye Nam
- Biotechnology Research Division, National Institute of Fisheries Science, Busan, Korea
| | - Hee Jeong Kong
- Biotechnology Research Division, National Institute of Fisheries Science, Busan, Korea
| | - Young-Ok Kim
- Biotechnology Research Division, National Institute of Fisheries Science, Busan, Korea
| | - Min-Seung Jeon
- Department of Life Science, Chung-Ang University, Seoul, Korea
| | - Seong-il Eyun
- Department of Life Science, Chung-Ang University, Seoul, Korea
| |
|
23
|
Pootakham W, Nawae W, Naktang C, Sonthirod C, Yoocha T, Kongkachana W, Sangsrakru D, Jomchai N, U-Thoomporn S, Somta P, Laosatit K, Tangphatsornruang S. A chromosome-scale assembly of the black gram (Vigna mungo) genome. Mol Ecol Resour 2020; 21:238-250. [PMID: 32794377 DOI: 10.1111/1755-0998.13243] [Citation(s) in RCA: 20] [Impact Index Per Article: 5.0] [Received: 05/20/2020] [Revised: 08/05/2020] [Accepted: 08/10/2020] [Indexed: 02/06/2023]
Abstract
Black gram (Vigna mungo) is an important short duration grain legume crop. Black gram seeds provide an inexpensive source of dietary protein. Here, we applied the 10X Genomics linked-read technology to obtain a de novo whole genome assembly of V. mungo cultivated variety Chai Nat 80 (CN80). The preliminary assembly contained 12,228 contigs and had an N50 length of 5.2 Mb. Subsequent scaffolding using the long-range Chicago and HiC techniques yielded the first high-quality, chromosome-level assembly of 499 Mb comprising 11 pseudomolecules. Comparative genomics analyses based on sequence information from single-copy orthologous genes revealed that black gram and mungbean (Vigna radiata) diverged about 2.7 million years ago. The transversion rate (4DTv) analysis in V. mungo revealed no evidence supporting a recent genome-wide duplication event observed in the tetraploid créole bean (Vigna reflexo-pilosa). The proportion of repetitive elements in the black gram genome is slightly lower than the numbers reported for related Vigna species. The majority of long terminal repeat retrotransposons appeared to integrate into the genome within the last five million years. We also examined alternative splicing events in V. mungo using full-length transcript sequences. While intron retention was the most prevalent mode of alternative splicing in several plant species, alternative 3' acceptor site selection represented the majority of events in black gram. Our high-quality genome assembly along with the genomic variation information from the germplasm provides valuable resources for accelerating the development of elite varieties through marker-assisted breeding and for future comparative genomics and phylogenetic studies in legume species.
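As a reading aid for the N50 value quoted in this abstract: N50 is conventionally the length of the shortest contig in the smallest set of longest contigs that together cover at least half of the assembly. The sketch below is a minimal Python illustration of that definition, not the authors' pipeline; the function name `n50` and the toy contig lengths are assumptions made for the example.

```python
def n50(lengths):
    """Return the N50 of a collection of contig (or scaffold) lengths.

    Contigs are visited from longest to shortest; the N50 is the length of
    the contig at which the running total first reaches half of the
    assembly size.
    """
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    raise ValueError("n50() needs at least one contig of non-zero length")


# Toy lengths (hypothetical, in Mb): total is 20, half is 10.
toy_contigs = [2, 3, 4, 5, 6]
toy_n50 = n50(toy_contigs)  # 6 + 5 = 11 >= 10, so the N50 is 5
```

Because N50 depends only on the length distribution, it measures contiguity and says nothing about base accuracy or completeness, which is why the abstracts listed here report it alongside other metrics.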
Affiliation(s)
- Wirulda Pootakham
- National Omics Center, National Science and Technology Development Agency (NSTDA), Pathum Thani, Thailand
| | - Wanapinun Nawae
- National Omics Center, National Science and Technology Development Agency (NSTDA), Pathum Thani, Thailand
| | - Chaiwat Naktang
- National Omics Center, National Science and Technology Development Agency (NSTDA), Pathum Thani, Thailand
| | - Chutima Sonthirod
- National Omics Center, National Science and Technology Development Agency (NSTDA), Pathum Thani, Thailand
| | - Thippawan Yoocha
- National Omics Center, National Science and Technology Development Agency (NSTDA), Pathum Thani, Thailand
| | - Wasitthee Kongkachana
- National Omics Center, National Science and Technology Development Agency (NSTDA), Pathum Thani, Thailand
| | - Duangjai Sangsrakru
- National Omics Center, National Science and Technology Development Agency (NSTDA), Pathum Thani, Thailand
| | - Nukoon Jomchai
- National Omics Center, National Science and Technology Development Agency (NSTDA), Pathum Thani, Thailand
| | - Sonicha U-Thoomporn
- National Omics Center, National Science and Technology Development Agency (NSTDA), Pathum Thani, Thailand
| | - Prakit Somta
- Department of Agronomy, Faculty of Agriculture at Kamphaeng Saen, Kasetsart University, Nakhon Pathom, Thailand
| | - Kularb Laosatit
- Department of Agronomy, Faculty of Agriculture at Kamphaeng Saen, Kasetsart University, Nakhon Pathom, Thailand
| | | |
|
24
|
Bellinger MR, Paudel R, Starnes S, Kambic L, Kantar MB, Wolfgruber T, Lamour K, Geib S, Sim S, Miyasaka SC, Helmkampf M, Shintaku M. Taro Genome Assembly and Linkage Map Reveal QTLs for Resistance to Taro Leaf Blight. G3 (Bethesda, Md.) 2020; 10:2763-2775. [PMID: 32546503 PMCID: PMC7407455 DOI: 10.1534/g3.120.401367] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 05/09/2020] [Accepted: 06/08/2020] [Indexed: 02/06/2023]
Abstract
Taro (Colocasia esculenta) is a food staple widely cultivated in the humid tropics of Asia, Africa, Pacific and the Caribbean. One of the greatest threats to taro production is Taro Leaf Blight caused by the oomycete pathogen Phytophthora colocasiae. Here we describe a de novo taro genome assembly and use it to analyze sequence data from a Taro Leaf Blight resistant mapping population. The genome was assembled from linked-read sequences (10x Genomics; ∼60x coverage) and gap-filled and scaffolded with contigs assembled from Oxford Nanopore Technology long-reads and linkage map results. The haploid assembly was 2.45 Gb total, with a maximum contig length of 38 Mb and scaffold N50 of 317,420 bp. A comparison of family-level (Araceae) genome features reveals the repeat content of taro to be 82%, >3.5x greater than in great duckweed (Spirodela polyrhiza), 23%. Both genomes recovered a similar percent of Benchmarking Universal Single-copy Orthologs, 80% and 84%, based on a 3,236 gene database for monocot plants. A greater number of nucleotide-binding leucine-rich repeat disease resistance genes were present in genomes of taro than the duckweed, ∼391 vs. ∼70 (∼182 and ∼46 complete). The mapping population data revealed 16 major linkage groups with 520 markers, and 10 quantitative trait loci (QTL) significantly associated with Taro Leaf Blight disease resistance. The genome sequence of taro enhances our understanding of resistance to TLB, and provides markers that may accelerate breeding programs. This genome project may provide a template for developing genomic resources in other understudied plant species.
Affiliation(s)
| | - Roshan Paudel
- University of Hawaii at Manoa, Department of Tropical Plant and Soil Sciences, Honolulu, Hawaii
| | - Steven Starnes
- University of Hawaii at Hilo, College of Agriculture, Forestry and Natural Resource Management, Hilo, Hawaii
| | - Lukas Kambic
- University of Hawaii at Hilo, College of Agriculture, Forestry and Natural Resource Management, Hilo, Hawaii
| | - Michael B Kantar
- University of Hawaii at Manoa, Department of Tropical Plant and Soil Sciences, Honolulu, Hawaii
| | - Thomas Wolfgruber
- University of Hawaii at Manoa, Department of Tropical Plant and Soil Sciences, Honolulu, Hawaii
| | - Kurt Lamour
- University of Tennessee at Knoxville, Department of Entomology and Plant Pathology, Knoxville, Tennessee
| | - Scott Geib
- United States Department of Agriculture-Agricultural Research Service, Hilo, Hawaii
| | - Sheina Sim
- United States Department of Agriculture-Agricultural Research Service, Hilo, Hawaii
| | - Susan C Miyasaka
- University of Hawaii at Manoa, Department of Tropical Plant and Soil Sciences, Honolulu, Hawaii
| | - Martin Helmkampf
- University of Hawaii at Hilo, Department of Biology, Hilo, Hawaii
| | - Michael Shintaku
- University of Hawaii at Hilo, College of Agriculture, Forestry and Natural Resource Management, Hilo, Hawaii
| |
|
25
|
Wang J, Liu W, Zhu D, Hong P, Zhang S, Xiao S, Tan Y, Chen X, Xu L, Zong X, Zhang L, Wei H, Yuan X, Liu Q. Chromosome-scale genome assembly of sweet cherry (Prunus avium L.) cv. Tieton obtained using long-read and Hi-C sequencing. Horticulture Research 2020; 7:122. [PMID: 32821405 PMCID: PMC7395734 DOI: 10.1038/s41438-020-00343-8] [Citation(s) in RCA: 38] [Impact Index Per Article: 9.5] [Received: 03/26/2020] [Revised: 04/29/2020] [Accepted: 05/12/2020] [Indexed: 05/25/2023]
Abstract
Sweet cherry (Prunus avium) is an economically significant fruit species in the genus Prunus. However, in contrast to other important fruit trees in this genus, only one draft genome assembly is available for sweet cherry, which was assembled using only Illumina short-read sequences. The incompleteness and low quality of the current sweet cherry draft genome limit its use in genetic and genomic studies. A high-quality chromosome-scale sweet cherry reference genome assembly is therefore needed. A total of 65.05 Gb of Oxford Nanopore long reads and 46.24 Gb of Illumina short reads were generated, representing ~190x and 136x coverage, respectively, of the sweet cherry genome. The final de novo assembly resulted in a phased haplotype assembly of 344.29 Mb with a contig N50 of 3.25 Mb. Hi-C scaffolding of the genome resulted in eight pseudochromosomes containing 99.59% of the bases in the assembled genome. Genome annotation revealed that more than half of the genome (59.40%) was composed of repetitive sequences, and 40,338 protein-coding genes were predicted, 75.40% of which were functionally annotated. With the chromosome-scale assembly, we revealed that gene duplication events contributed to the expansion of gene families for salicylic acid/jasmonic acid carboxyl methyltransferase and ankyrin repeat-containing proteins in the genome of sweet cherry. Four auxin-responsive genes (two GH3s and two SAURs) were induced in the late stage of fruit development, indicating that auxin is crucial for the sweet cherry ripening process. In addition, 772 resistance genes were identified and functionally predicted in the sweet cherry genome. The high-quality genome assembly of sweet cherry obtained in this study will provide valuable genomic resources for sweet cherry improvement and molecular breeding.
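The coverage figures in this abstract can be cross-checked with simple arithmetic: fold coverage is total sequenced bases divided by genome size, so yield divided by coverage gives the genome size each figure implies. The Python sketch below is only an illustrative check using the numbers quoted above; the helper name `implied_genome_size_mb` is an assumption for the example and does not come from the paper.

```python
def implied_genome_size_mb(yield_gb: float, coverage: float) -> float:
    """Genome size in Mb implied by a sequencing yield (Gb) and fold coverage."""
    if coverage <= 0:
        raise ValueError("coverage must be positive")
    return yield_gb * 1000.0 / coverage


# Values quoted in the abstract.
nanopore_mb = implied_genome_size_mb(65.05, 190)  # roughly 342 Mb
illumina_mb = implied_genome_size_mb(46.24, 136)  # roughly 340 Mb
assembly_mb = 344.29                              # reported assembly size

print(f"Nanopore-implied size: {nanopore_mb:.1f} Mb")
print(f"Illumina-implied size: {illumina_mb:.1f} Mb")
print(f"Reported assembly:     {assembly_mb:.2f} Mb")
```

Both implied sizes fall within about 1.5% of the reported 344.29 Mb assembly, so the stated yields, coverage values, and assembly size are mutually consistent.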
Affiliation(s)
- Jiawei Wang
- Shandong Key Laboratory of Fruit Biotechnology Breeding, Shandong Institute of Pomology, Taian, Shandong 271000 China
| | - Weizhen Liu
- School of Computer Science and Technology, Wuhan University of Technology, Wuhan, Hubei 430070 China
| | - Dongzi Zhu
- Shandong Key Laboratory of Fruit Biotechnology Breeding, Shandong Institute of Pomology, Taian, Shandong 271000 China
| | - Po Hong
- Shandong Key Laboratory of Fruit Biotechnology Breeding, Shandong Institute of Pomology, Taian, Shandong 271000 China
| | - Shizhong Zhang
- State Key Laboratory of Crop Biology, Shandong Agricultural University, Taian, Shandong 271018 China
| | - Shijun Xiao
- School of Computer Science and Technology, Wuhan University of Technology, Wuhan, Hubei 430070 China
- Gooal Gene, Wuhan, Hubei 430070 China
| | - Yue Tan
- Shandong Key Laboratory of Fruit Biotechnology Breeding, Shandong Institute of Pomology, Taian, Shandong 271000 China
| | - Xin Chen
- Shandong Key Laboratory of Fruit Biotechnology Breeding, Shandong Institute of Pomology, Taian, Shandong 271000 China
| | - Li Xu
- Shandong Key Laboratory of Fruit Biotechnology Breeding, Shandong Institute of Pomology, Taian, Shandong 271000 China
| | - Xiaojuan Zong
- Shandong Key Laboratory of Fruit Biotechnology Breeding, Shandong Institute of Pomology, Taian, Shandong 271000 China
| | - Lisi Zhang
- Shandong Key Laboratory of Fruit Biotechnology Breeding, Shandong Institute of Pomology, Taian, Shandong 271000 China
| | - Hairong Wei
- Shandong Key Laboratory of Fruit Biotechnology Breeding, Shandong Institute of Pomology, Taian, Shandong 271000 China
| | - Xiaohui Yuan
- School of Computer Science and Technology, Wuhan University of Technology, Wuhan, Hubei 430070 China
| | - Qingzhong Liu
- Shandong Key Laboratory of Fruit Biotechnology Breeding, Shandong Institute of Pomology, Taian, Shandong 271000 China
| |
|
26
|
Jung H, Jeon MS, Hodgett M, Waterhouse P, Eyun SI. Comparative Evaluation of Genome Assemblers from Long-Read Sequencing for Plants and Crops. Journal of Agricultural and Food Chemistry 2020; 68:7670-7677. [PMID: 32530283 DOI: 10.1021/acs.jafc.0c01647] [Citation(s) in RCA: 16] [Impact Index Per Article: 4.0] [Indexed: 05/23/2023]
Abstract
The availability of recent state-of-the-art long-read sequencing technologies has significantly increased the ease and speed of producing high-quality plant genome assemblies. A wide variety of genome-related software tools are now available and they are typically benchmarked using microbial or model eukaryotic genomes such as Arabidopsis and rice. However, many plant species have much larger and more complex genomes than these, and the choice of tools, parameters, and/or strategies that can be used is not always obvious. Thus, we have compared the metrics of assemblies generated by various pipelines to discuss how assembly quality can be affected by two different assembly strategies. First, we focused on optimizing read preprocessing and assembler variables using eight different de novo assemblers on five different Pacific Biosciences long-read datasets of diploid and tetraploid species. Then, we examined a single scaffolding tool (quickmerge) that has been employed for the postprocessing step. We then merged the outputs from multiple assemblies to produce a higher quality consensus assembly. Then, we benchmarked the assemblies for completeness and accuracy (assembly metrics and BUSCO), computer memory, and CPU times. Two lightweight assemblers, Miniasm/Minimap/Racon and WTDBG, were deemed good for novice users because they involved smaller required learning curves and light computational resources. However, two heavyweight tools, CANU and Flye, should be the first choice when the goal is to achieve accurate and complete assemblies. Our results will provide valuable guidance in future plant genome projects and beyond.
Affiliation(s)
- Hyungtaek Jung
- Centre for Agriculture and Biocommodities, Queensland University of Technology, Brisbane, Queensland 4001, Australia
| | - Min-Seung Jeon
- Department of Life Science, Chung-Ang University, Seoul 06974, Korea
| | - Matthew Hodgett
- Information Technology Services, Queensland University of Technology, Brisbane, Queensland 4001, Australia
| | - Peter Waterhouse
- Centre for Agriculture and Biocommodities, Queensland University of Technology, Brisbane, Queensland 4001, Australia
| | - Seong-Il Eyun
- Department of Life Science, Chung-Ang University, Seoul 06974, Korea
| |
|
27
|
Lin X, Wang S, de Rond L, Bertolin N, Wouters RHM, Wouters D, Domazakis E, Bitew MK, Win J, Dong S, Visser RGF, Birch P, Kamoun S, Vleeshouwers VGAA. Divergent Evolution of PcF/SCR74 Effectors in Oomycetes Is Associated with Distinct Recognition Patterns in Solanaceous Plants. mBio 2020; 11:e00947-20. [PMID: 32605983 PMCID: PMC7327169 DOI: 10.1128/mbio.00947-20] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 04/20/2020] [Accepted: 06/02/2020] [Indexed: 01/03/2023]
Abstract
Plants deploy cell surface receptors known as pattern-recognition receptors (PRRs) that recognize non-self molecules from pathogens and microbes to defend against invaders. PRRs typically recognize microbe-associated molecular patterns (MAMPs) that are usually widely conserved, some even across kingdoms. Here, we report an oomycete-specific family of small secreted cysteine-rich (SCR) proteins that displays divergent patterns of sequence variation in the Irish potato famine pathogen Phytophthora infestans. A subclass that includes the conserved effector PcF from Phytophthora cactorum activates immunity in a wide range of plant species. In contrast, the more diverse SCR74 subclass is specific to P. infestans and tends to trigger immune responses only in a limited number of wild potato genotypes. The SCR74 response was recently mapped to a G-type lectin receptor kinase (G-LecRK) locus in the wild potato Solanum microdontum subsp. gigantophyllum. The G-LecRK locus displays a high diversity in Solanum host species compared to other solanaceous plants. We propose that the diversification of the SCR74 proteins in P. infestans is driven by a fast coevolutionary arms race with cell surface immune receptors in wild potato, which contrasts the presumed slower dynamics between conserved apoplastic effectors and PRRs. Understanding the molecular determinants of plant immune responses to these divergent molecular patterns in oomycetes is expected to contribute to deploying multiple layers of disease resistance in crop plants. IMPORTANCE Immune receptors at the plant cell surface can recognize invading microbes. The perceived microbial molecules are typically widely conserved and therefore the matching surface receptors can detect a broad spectrum of pathogens. Here we describe a family of Phytophthora small extracellular proteins that consists of conserved subfamilies that are widely recognized by solanaceous plants. Remarkably, one subclass of SCR74 proteins is highly diverse, restricted to the late blight pathogen Phytophthora infestans and is specifically detected in wild potato plants. The diversification of this subfamily exhibits signatures of a coevolutionary arms race with surface receptors in potato. Insights into the molecular interaction between these potato-specific receptors and the recognized Phytophthora proteins are expected to contribute to disease resistance breeding in potato.
Affiliation(s)
- Xiao Lin
  - Wageningen UR Plant Breeding, Wageningen University and Research, Wageningen, The Netherlands
- Shumei Wang
  - Cell and Molecular Sciences, The James Hutton Institute, Dundee, United Kingdom
- Laura de Rond
  - Wageningen UR Plant Breeding, Wageningen University and Research, Wageningen, The Netherlands
- Nicoletta Bertolin
  - Wageningen UR Plant Breeding, Wageningen University and Research, Wageningen, The Netherlands
- Roland H M Wouters
  - Wageningen UR Plant Breeding, Wageningen University and Research, Wageningen, The Netherlands
- Doret Wouters
  - Wageningen UR Plant Breeding, Wageningen University and Research, Wageningen, The Netherlands
- Emmanouil Domazakis
  - Wageningen UR Plant Breeding, Wageningen University and Research, Wageningen, The Netherlands
- Mulusew Kassa Bitew
  - Wageningen UR Plant Breeding, Wageningen University and Research, Wageningen, The Netherlands
- Joe Win
  - The Sainsbury Laboratory, University of East Anglia, Norwich, United Kingdom
- Suomeng Dong
  - The Sainsbury Laboratory, University of East Anglia, Norwich, United Kingdom
- Richard G F Visser
  - Wageningen UR Plant Breeding, Wageningen University and Research, Wageningen, The Netherlands
- Paul Birch
  - Cell and Molecular Sciences, The James Hutton Institute, Dundee, United Kingdom
  - School of Life Sciences, Division of Plant Sciences, University of Dundee at the James Hutton Institute, Dundee, United Kingdom
- Sophien Kamoun
  - The Sainsbury Laboratory, University of East Anglia, Norwich, United Kingdom
28. Wang W, Wang F, Hao R, Wang A, Sharshov K, Druzyaka A, Lancuo Z, Shi Y, Feng S. First de novo whole genome sequencing and assembly of the bar-headed goose. PeerJ 2020; 8:e8914. [PMID: 32292659 PMCID: PMC7144584 DOI: 10.7717/peerj.8914] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 09/23/2019] [Accepted: 03/15/2020] [Indexed: 12/23/2022] Open
Abstract
Background: The bar-headed goose (Anser indicus) mainly inhabits the plateau wetlands of Asia. As a specialized high-altitude species, bar-headed geese can migrate between South and Central Asia and annually fly twice over the Himalayan mountains along the central Asian flyway. The physiological, biochemical and behavioral adaptations of bar-headed geese to high-altitude living and flying have raised much interest. However, to date, there is still no genome assembly information publicly available for bar-headed geese.
Methods: In this study, we present the first de novo whole genome sequencing and assembly of the bar-headed goose, along with gene prediction and annotation.
Results: 10X Genomics sequencing produced a total of 124 Gb sequencing data, which can cover the estimated genome size of bar-headed goose for 103 times (average coverage). The genome assembly comprised 10,528 scaffolds, with a total length of 1.143 Gb and a scaffold N50 of 10.09 Mb. Annotation of the bar-headed goose genome assembly identified a total of 102 Mb (8.9%) of repetitive sequences, 16,428 protein-coding genes, and 282 tRNAs. In total, we determined that there were 63 expanded and 20 contracted gene families in the bar-headed goose compared with the other 15 vertebrates. We also performed a positive selection analysis between the bar-headed goose and the closely related low-altitude goose, swan goose (Anser cygnoides), to uncover its genetic adaptations to the Qinghai-Tibetan Plateau.
Conclusion: We reported the currently most complete genome sequence of the bar-headed goose. Our assembly will provide a valuable resource to enhance further studies of the gene functions of bar-headed goose. The data will also be valuable for facilitating studies of the evolution, population genetics and high-altitude adaptations of the bar-headed geese at the genomic level.
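The assembly statistics quoted in this abstract (scaffold N50, average coverage) follow standard definitions, which the sketch below illustrates. It is not taken from the cited paper: the function names `n50` and `mean_coverage` and the toy scaffold lengths are hypothetical, and only the 124 Gb and 103x figures come from the abstract.

```python
def n50(lengths):
    """Return the N50: the length L such that sequences of length >= L
    together contain at least half of the total assembled bases."""
    if not lengths:
        raise ValueError("need at least one sequence length")
    total = sum(lengths)
    running = 0
    # Walk from the longest sequence down until half the assembly is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length


def mean_coverage(total_bases, genome_size):
    """Average sequencing depth: sequenced bases divided by genome size."""
    return total_bases / genome_size


# Toy scaffold lengths (hypothetical, not from the paper).
toy_scaffolds = [2, 3, 4, 5, 6, 7, 8, 9, 10]
toy_n50 = n50(toy_scaffolds)

# Figures from the abstract: 124 Gb of reads at 103x average coverage
# imply an estimated genome size of roughly 1.2 Gb.
implied_genome_size = 124e9 / 103
```

With the toy lengths, the three longest scaffolds (10, 9 and 8) already hold 27 of the 54 total bases, so the N50 is 8. The same definition applies to the 10.09 Mb scaffold N50 reported above, although the paper's own tooling for computing it is not stated here.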
Affiliation(s)
- Wen Wang
  - State Key Laboratory of Plateau Ecology and Agriculture, Qinghai University, Xi'ning, Qinghai, China
- Fang Wang
  - Northwest Institute of Plateau Biology, Chinese Academy of Sciences, Xi'ning, Qinghai, China
- Rongkai Hao
  - Novogene Bioinformatics Institute, Beijing, China
- Aizhen Wang
  - College of Eco-Environmental Engineering, Qinghai University, Xi'ning, Qinghai, China
- Kirill Sharshov
  - Research Institute of Experimental and Clinical Medicine, Novosibirsk, Russia
- Alexey Druzyaka
  - Institute of Systematics and Ecology of Animals, Siberian Branch of the Russian Academy of Sciences, Novosibirsk, Russia
- Zhuoma Lancuo
  - School of Finance and Economics, Qinghai University, Xi'ning, Qinghai, China
- Yuetong Shi
  - KunLun College of Qinghai University, Xi'ning, Qinghai, China
- Shuo Feng
  - State Key Laboratory of Plateau Ecology and Agriculture, Qinghai University, Xi'ning, Qinghai, China
29. Danilevicz MF, Tay Fernandez CG, Marsh JI, Bayer PE, Edwards D. Plant pangenomics: approaches, applications and advancements. Current Opinion in Plant Biology 2020; 54:18-25. [PMID: 31982844 DOI: 10.1016/j.pbi.2019.12.005] [Citation(s) in RCA: 67] [Impact Index Per Article: 16.8] [Received: 08/19/2019] [Revised: 12/15/2019] [Accepted: 12/18/2019] [Indexed: 05/05/2023]
Abstract
With the assembly of increasing numbers of plant genomes, it is becoming accepted that a single reference assembly does not reflect the gene diversity of a species. The production of pangenomes, which reflect the structural variation and polymorphisms in genomes, enables in depth comparisons of variation within species or higher taxonomic groups. In this review, we discuss the current and emerging approaches for pangenome assembly, analysis and visualisation. In addition, we consider the potential of pangenomes for applied crop improvement, evolutionary and biodiversity studies. To fully exploit the value of pangenomes it is important to integrate broad information such as phenotypic, environmental, and expression data to gain insights into the role of variable regions within genomes.
Affiliation(s)
- Monica Furaste Danilevicz
  - School of Biological Sciences and Institute of Agriculture, University of Western Australia, Perth, WA, Australia
- Jacob Ian Marsh
  - School of Biological Sciences and Institute of Agriculture, University of Western Australia, Perth, WA, Australia
- Philipp Emanuel Bayer
  - School of Biological Sciences and Institute of Agriculture, University of Western Australia, Perth, WA, Australia
- David Edwards
  - School of Biological Sciences and Institute of Agriculture, University of Western Australia, Perth, WA, Australia
30. Siadjeu C, Pucker B, Viehöver P, Albach DC, Weisshaar B. High Contiguity De Novo Genome Sequence Assembly of Trifoliate Yam (Dioscorea dumetorum) Using Long Read Sequencing. Genes (Basel) 2020; 11:E274. [PMID: 32143301 PMCID: PMC7140821 DOI: 10.3390/genes11030274] [Citation(s) in RCA: 28] [Impact Index Per Article: 7.0] [Received: 01/31/2020] [Revised: 02/25/2020] [Accepted: 02/29/2020] [Indexed: 12/17/2022] Open
Abstract
Trifoliate yam (Dioscorea dumetorum) is one example of an orphan crop, not traded internationally. Post-harvest hardening of the tubers of this species starts within 24 h after harvesting and renders the tubers inedible. Genomic resources are required for D. dumetorum to improve breeding for non-hardening varieties as well as for other traits. We sequenced the D. dumetorum genome and generated the corresponding annotation. The two haplophases of this highly heterozygous genome were separated to a large extent. The assembly represents 485 Mbp of the genome with an N50 of over 3.2 Mbp. A total of 35,269 protein-encoding gene models as well as 9941 non-coding RNA genes were predicted, and functional annotations were assigned.
Affiliation(s)
- Christian Siadjeu
  - Institute for Biology and Environmental Sciences, Biodiversity and Evolution of Plants, Carl-von-Ossietzky University Oldenburg, Carl-von-Ossietzky Str. 9-11, 26111 Oldenburg, Germany
  - Genetics and Genomics of Plants, Faculty of Biology, Center for Biotechnology (CeBiTec), Bielefeld University, Sequenz 1, 33615 Bielefeld, NRW, Germany
- Boas Pucker
  - Genetics and Genomics of Plants, Faculty of Biology, Center for Biotechnology (CeBiTec), Bielefeld University, Sequenz 1, 33615 Bielefeld, NRW, Germany
  - Molecular Genetics and Physiology of Plants, Faculty of Biology and Biotechnology, Ruhr-University Bochum, Universitätsstraße 150, 44801 Bochum, Germany
- Prisca Viehöver
  - Genetics and Genomics of Plants, Faculty of Biology, Center for Biotechnology (CeBiTec), Bielefeld University, Sequenz 1, 33615 Bielefeld, NRW, Germany
- Dirk C. Albach
  - Institute for Biology and Environmental Sciences, Biodiversity and Evolution of Plants, Carl-von-Ossietzky University Oldenburg, Carl-von-Ossietzky Str. 9-11, 26111 Oldenburg, Germany
- Bernd Weisshaar
  - Genetics and Genomics of Plants, Faculty of Biology, Center for Biotechnology (CeBiTec), Bielefeld University, Sequenz 1, 33615 Bielefeld, NRW, Germany
31. Edelman NB, Frandsen PB, Miyagi M, Clavijo B, Davey J, Dikow RB, García-Accinelli G, Van Belleghem SM, Patterson N, Neafsey DE, Challis R, Kumar S, Moreira GRP, Salazar C, Chouteau M, Counterman BA, Papa R, Blaxter M, Reed RD, Dasmahapatra KK, Kronforst M, Joron M, Jiggins CD, McMillan WO, Di Palma F, Blumberg AJ, Wakeley J, Jaffe D, Mallet J. Genomic architecture and introgression shape a butterfly radiation. Science 2019; 366:594-599. [PMID: 31672890 PMCID: PMC7197882 DOI: 10.1126/science.aaw2090] [Citation(s) in RCA: 266] [Impact Index Per Article: 53.2] [Received: 12/11/2018] [Accepted: 09/16/2019] [Indexed: 12/26/2022]
Abstract
We used 20 de novo genome assemblies to probe the speciation history and architecture of gene flow in rapidly radiating Heliconius butterflies. Our tests to distinguish incomplete lineage sorting from introgression indicate that gene flow has obscured several ancient phylogenetic relationships in this group over large swathes of the genome. Introgressed loci are underrepresented in low-recombination and gene-rich regions, consistent with the purging of foreign alleles more tightly linked to incompatibility loci. Here, we identify a hitherto unknown inversion that traps a color pattern switch locus. We infer that this inversion was transferred between lineages by introgression and is convergent with a similar rearrangement in another part of the genus. These multiple de novo genome sequences enable improved understanding of the importance of introgression and selective processes in adaptive radiation.
Affiliation(s)
- Nathaniel B Edelman
  - Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, MA 02138, USA
- Paul B Frandsen
  - Department of Plant and Wildlife Sciences, Brigham Young University, Provo, UT 84602, USA
  - Data Science Lab, Office of the Chief Information Officer, Smithsonian Institution, Washington, DC 20560, USA
- Miriam Miyagi
  - Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, MA 02138, USA
- John Davey
  - Bioscience Technology Facility, Department of Biology, University of York, York YO10 5DD, UK
  - Department of Zoology, University of Cambridge, Cambridge CB2 3EJ, UK
- Rebecca B Dikow
  - Data Science Lab, Office of the Chief Information Officer, Smithsonian Institution, Washington, DC 20560, USA
- Steven M Van Belleghem
  - Department of Biology, University of Puerto Rico, Río Piedras Campus, San Juan, PR 00931-3360, Puerto Rico
- Nick Patterson
  - Department of Human Evolutionary Biology, Harvard University, Cambridge, MA 02138, USA
  - Broad Institute of MIT and Harvard, Cambridge, MA, 02142 USA
- Daniel E Neafsey
  - Broad Institute of MIT and Harvard, Cambridge, MA, 02142 USA
  - Harvard TH Chan School of Public Health, Boston, MA 02115, USA
- Richard Challis
  - Wellcome Sanger Institute, Wellcome Genome Campus, Cambridge CB10 1SA, UK
- Sujai Kumar
  - Institute of Evolutionary Biology, University of Edinburgh, Edinburgh EH9 3JT, UK
- Gilson R P Moreira
  - Departamento de Zoologia, Universidade Federal do Rio Grande do Sul, Porto Alegre, 91501-970 Brasil
- Camilo Salazar
  - Biology Program, Faculty of Natural Sciences and Mathematics, Universidad del Rosario, Carrera 24, No. 63C-69, Bogotá D.C. 111221, Colombia
- Mathieu Chouteau
  - Laboratoire Ecologie, Evolution, Interactions des Systèmes Amazoniens (LEEISA), USR 3456, Université De Guyane, CNRS Guyane, 275 Route de Montabo, 97334 Cayenne, French Guiana
- Brian A Counterman
  - Department of Biological Sciences, Mississippi State University, Starkville, MS 39762, USA
- Riccardo Papa
  - Department of Biology, University of Puerto Rico, Río Piedras Campus, San Juan, PR 00931-3360, Puerto Rico
  - Molecular Sciences and Research Center, University of Puerto Rico, San Juan, PR 00931-3360, Puerto Rico
- Mark Blaxter
  - Wellcome Sanger Institute, Wellcome Genome Campus, Cambridge CB10 1SA, UK
- Robert D Reed
  - Department of Ecology and Evolutionary Biology, Cornell University, Ithaca, NY 14853, USA
- Kanchon K Dasmahapatra
  - Bioscience Technology Facility, Department of Biology, University of York, York YO10 5DD, UK
- Marcus Kronforst
  - Department of Ecology and Evolution, University of Chicago, Chicago, IL 60637, USA
- Mathieu Joron
  - CEFE, CNRS, Université de Montpellier, Université Paul Valéry Montpellier 3, EPHE, IRD, 34090 Montpellier, France
- Chris D Jiggins
  - Department of Zoology, University of Cambridge, Cambridge CB2 3EJ, UK
- W Owen McMillan
  - Smithsonian Tropical Research Institute, Apartado 0843-03092 Panamá, Panama
- Andrew J Blumberg
  - Department of Mathematics, University of Texas, Austin, TX 78712, USA
- John Wakeley
  - Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, MA 02138, USA
- David Jaffe
  - Broad Institute of MIT and Harvard, Cambridge, MA, 02142 USA
  - 10x Genomics, Pleasanton, CA 94566, USA
- James Mallet
  - Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, MA 02138, USA
32. The architecture of the Plasmodiophora brassicae nuclear and mitochondrial genomes. Sci Rep 2019; 9:15753. [PMID: 31673019 PMCID: PMC6823432 DOI: 10.1038/s41598-019-52274-7] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Received: 06/17/2019] [Accepted: 10/15/2019] [Indexed: 11/09/2022] Open
Abstract
Plasmodiophora brassicae is a soil-borne pathogen that attacks roots of cruciferous plants causing clubroot disease. The pathogen belongs to the Plasmodiophorida order in Phytomyxea. Here we used long-read SMRT technology to clarify the P. brassicae e3 genomic constituents along with comparative and phylogenetic analyses. Twenty contigs representing the nuclear genome and one mitochondrial (mt) contig were generated, together comprising 25.1 Mbp. Thirteen of the 20 nuclear contigs represented chromosomes from telomere to telomere characterized by [TTTTAGGG] sequences. Seven active gene candidates encoding synaptonemal complex-associated and meiotic-related protein homologs were identified, a finding that argues for possible genetic recombination events. The circular mt genome is large (114,663 bp), gene dense and intron rich. It shares high synteny with the mt genome of Spongospora subterranea, except in a unique 12 kb region delimited by shifts in GC content and containing tandem minisatellite- and microsatellite repeats with partially palindromic sequences. De novo annotation identified 32 protein-coding genes, 28 structural RNA genes and 19 ORFs. ORFs predicted in the repeat-rich region showed similarities to diverse organisms suggesting possible evolutionary connections. The data generated here form a refined platform for the next step involving functional analysis, all to clarify the complex biology of P. brassicae.
33. Song G, Lee J, Kim J, Kang S, Lee H, Kwon D, Lee D, Lang GI, Cherry JM, Kim J. Integrative Meta-Assembly Pipeline (IMAP): Chromosome-level genome assembler combining multiple de novo assemblies. PLoS One 2019; 14:e0221858. [PMID: 31454399 PMCID: PMC6711525 DOI: 10.1371/journal.pone.0221858] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 11/10/2018] [Accepted: 08/18/2019] [Indexed: 11/29/2022] Open
Abstract
BACKGROUND: Genomic data have become major resources to understand complex mechanisms at fine-scale temporal and spatial resolution in functional and evolutionary genetic studies, including human diseases, such as cancers. Recently, a large number of whole genomes of evolving populations of yeast (Saccharomyces cerevisiae W303 strain) were sequenced in a time-dependent manner to identify temporal evolutionary patterns. For this type of study, a chromosome-level sequence assembly of the strain or population at time zero is required to compare with the genomes derived later. However, there is no fully automated computational approach in experimental evolution studies to establish the chromosome-level genome assembly using unique features of sequencing data.
METHODS AND RESULTS: In this study, we developed a new software pipeline, the integrative meta-assembly pipeline (IMAP), to build chromosome-level genome sequence assemblies by generating and combining multiple initial assemblies using three de novo assemblers from short-read sequencing data. We significantly improved the continuity and accuracy of the genome assembly using a large collection of sequencing data and hybrid assembly approaches. We validated our pipeline by generating chromosome-level assemblies of yeast strains W303 and SK1, and compared our results with assemblies built using long-read sequencing and various assembly evaluation metrics. We also constructed chromosome-level sequence assemblies of S. cerevisiae strain Sigma1278b, and three commonly used fungal strains: Aspergillus nidulans A713, Neurospora crassa 73, and Thielavia terrestris CBS 492.74, for which long-read sequencing data are not yet available. Finally, we examined the effect of IMAP parameters, such as reference and resolution, on the quality of the final assembly of the yeast strains W303 and SK1.
CONCLUSIONS: We developed a cost-effective pipeline to generate chromosome-level sequence assemblies using only short-read sequencing data. Our pipeline combines the strengths of reference-guided and meta-assembly approaches. Our pipeline is available online at http://github.com/jkimlab/IMAP including a Docker image, as well as a Perl script, to help users install the IMAP package, including several prerequisite programs. Users can use IMAP to easily build the chromosome-level assembly for the genome of their interest.
Affiliation(s)
- Giltae Song
  - School of Computer Science and Engineering, Pusan National University, Busan, South Korea
- Jongin Lee
  - Department of Biomedical Science and Engineering, Konkuk University, Seoul, South Korea
- Juyeon Kim
  - Department of Biomedical Science and Engineering, Konkuk University, Seoul, South Korea
- Seokwoo Kang
  - School of Computer Science and Engineering, Pusan National University, Busan, South Korea
- Hoyong Lee
  - School of Computer Science and Engineering, Pusan National University, Busan, South Korea
- Daehong Kwon
  - Department of Biomedical Science and Engineering, Konkuk University, Seoul, South Korea
- Daehwan Lee
  - Department of Biomedical Science and Engineering, Konkuk University, Seoul, South Korea
- Gregory I. Lang
  - Department of Biological Sciences, Lehigh University, Bethlehem, PA, United States of America
- J. Michael Cherry
  - Department of Genetics, Stanford University School of Medicine, Stanford, California, United States of America
- Jaebum Kim
  - Department of Biomedical Science and Engineering, Konkuk University, Seoul, South Korea
34. Jung H, Winefield C, Bombarely A, Prentis P, Waterhouse P. Tools and Strategies for Long-Read Sequencing and De Novo Assembly of Plant Genomes. Trends in Plant Science 2019; 24:700-724. [PMID: 31208890 DOI: 10.1016/j.tplants.2019.05.003] [Citation(s) in RCA: 49] [Impact Index Per Article: 9.8] [Received: 01/06/2019] [Revised: 05/01/2019] [Accepted: 05/10/2019] [Indexed: 05/16/2023]
Abstract
The commercial release of third-generation sequencing technologies (TGSTs), giving long and ultra-long sequencing reads, has stimulated the development of new tools for assembling highly contiguous genome sequences with unprecedented accuracy across complex repeat regions. We survey here a wide range of emerging sequencing platforms and analytical tools for de novo assembly, provide background information for each of their steps, and discuss the spectrum of available options. Our decision tree recommends workflows for the generation of a high-quality genome assembly when used in combination with the specific needs and resources of a project.
Affiliation(s)
- Hyungtaek Jung
  - Centre for Tropical Crops and Biocommodities, Queensland University of Technology, Brisbane, QLD 4001, Australia
- Christopher Winefield
  - Department of Wine, Food, and Molecular Biosciences, Lincoln University, 7647 Christchurch, New Zealand
- Aureliano Bombarely
  - Department of Bioscience, University of Milan, Milan 20133, Italy
  - School of Plants and Environmental Sciences, Virginia Tech, Blacksburg, VA 24061, USA
- Peter Prentis
  - School of Earth, Environmental, and Biological Sciences, Queensland University of Technology, Brisbane, QLD, 4001, Australia
- Peter Waterhouse
  - Centre for Tropical Crops and Biocommodities, Queensland University of Technology, Brisbane, QLD 4001, Australia
  - School of Biological Sciences, University of Sydney, Sydney, NSW 2006, Australia
35. Helmkampf M, Bellinger MR, Geib SM, Sim SB, Takabayashi M. Draft Genome of the Rice Coral Montipora capitata Obtained from Linked-Read Sequencing. Genome Biol Evol 2019; 11:2045-2054. [PMID: 31243452 PMCID: PMC6668484 DOI: 10.1093/gbe/evz135] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.8] [Accepted: 06/24/2019] [Indexed: 12/24/2022] Open
Abstract
The rice coral, Montipora capitata, is widely distributed throughout the Indo-Pacific and comprises one of the most important reef-building species in the Hawaiian Islands. Here, we describe a de novo assembly of its genome based on a linked-read sequencing approach developed by 10x Genomics. The final draft assembly consisted of 27,870 scaffolds with a N50 size of 186 kb and contained a fairly complete set (81%) of metazoan benchmarking (BUSCO) genes. Based on haploid assembly size (615 Mb) and read k-mer profiles, we estimated the genome size to fall between 600 and 700 Mb, although the high fraction of repetitive sequence introduced considerable uncertainty. Repeat analysis indicated that 42% of the assembly consisted of interspersed, mostly unclassified repeats, and almost 3% tandem repeats. We also identified 36,691 protein-coding genes with a median coding sequence length of 807 bp, together spanning 7% of the assembly. The high repeat content and heterozygosity of the genome proved a challenging scenario for assembly, requiring additional steps to merge haplotypes and resulting in a higher than expected fragmentation at the scaffold level. Despite these challenges, the assembly turned out to be comparable in most quality measures to that of other available coral genomes while being considerably more cost-effective, especially with respect to long-read sequencing methods. Provided high-molecular-weight DNA is available, linked-read technology may thus serve as a valuable alternative capable of providing quality genome assemblies of nonmodel organisms.
Affiliation(s)
- Martin Helmkampf
  - Tropical Conservation Biology and Environmental Science, University of Hawaiʻi at Hilo
- M Renee Bellinger
  - Tropical Conservation Biology and Environmental Science, University of Hawaiʻi at Hilo
- Scott M Geib
  - Daniel K. Inouye U.S. Pacific Basin Agricultural Research Center, United States Department of Agriculture, Hilo, Hawaiʻi
- Sheina B Sim
  - Daniel K. Inouye U.S. Pacific Basin Agricultural Research Center, United States Department of Agriculture, Hilo, Hawaiʻi
- Misaki Takabayashi
  - Marine Science Department, University of Hawaiʻi at Hilo
  - Okinawa Institute of Science and Technology, Onna, Okinawa, Japan
36. Platanus-allee is a de novo haplotype assembler enabling a comprehensive access to divergent heterozygous regions. Nat Commun 2019; 10:1702. [PMID: 30979905 PMCID: PMC6461651 DOI: 10.1038/s41467-019-09575-2] [Citation(s) in RCA: 82] [Impact Index Per Article: 16.4] [Received: 07/23/2018] [Accepted: 03/19/2019] [Indexed: 12/14/2022] Open
Abstract
The ultimate goal for diploid genome determination is to completely decode homologous chromosomes independently, and several phasing programs from consensus sequences have been developed. These methods work well for lowly heterozygous genomes, but the manifold species have high heterozygosity. Additionally, there are highly divergent regions (HDRs), where the haplotype sequences differ considerably. Because HDRs are likely to direct various interesting biological phenomena, many genomic analysis targets fall within these regions. However, they cannot be accessed by existing phasing methods, and we have to adopt costly traditional methods. Here, we develop a de novo haplotype assembler, Platanus-allee (http://platanus.bio.titech.ac.jp/platanus2), which initially constructs each haplotype sequence and then untangles the assembly graphs utilizing sequence links and synteny information. A comprehensive benchmark analysis reveals that Platanus-allee exhibits high recall and precision, particularly for HDRs. Using this approach, previously unknown HDRs are detected in the human genome, which may uncover novel aspects of genome variability.