1
Cavalet-Giorsa E, González-Muñoz A, Athiyannan N, Holden S, Salhi A, Gardener C, Quiroz-Chávez J, Rustamova SM, Elkot AF, Patpour M, Rasheed A, Mao L, Lagudah ES, Periyannan SK, Sharon A, Himmelbach A, Reif JC, Knauft M, Mascher M, Stein N, Chayut N, Ghosh S, Perovic D, Putra A, Perera AB, Hu CY, Yu G, Ahmed HI, Laquai KD, Rivera LF, Chen R, Wang Y, Gao X, Liu S, Raupp WJ, Olson EL, Lee JY, Chhuneja P, Kaur S, Zhang P, Park RF, Ding Y, Liu DC, Li W, Nasyrova FY, Dvorak J, Abbasi M, Li M, Kumar N, Meyer WB, Boshoff WHP, Steffenson BJ, Matny O, Sharma PK, Tiwari VK, Grewal S, Pozniak CJ, Chawla HS, Ens J, Dunning LT, Kolmer JA, Lazo GR, Xu SS, Gu YQ, Xu X, Uauy C, Abrouk M, Bougouffa S, Brar GS, Wulff BBH, Krattinger SG. Origin and evolution of the bread wheat D genome. Nature 2024:10.1038/s41586-024-07808-z. [PMID: 39143210 DOI: 10.1038/s41586-024-07808-z] [Received: 11/15/2023] [Accepted: 07/10/2024] [Indexed: 08/16/2024]
Abstract
Bread wheat (Triticum aestivum) is a globally dominant crop and major source of calories and proteins for the human diet. Compared with its wild ancestors, modern bread wheat shows lower genetic diversity, caused by polyploidisation, domestication and breeding bottlenecks [1,2]. Wild wheat relatives represent genetic reservoirs, and harbour diversity and beneficial alleles that have not been incorporated into bread wheat. Here we establish and analyse extensive genome resources for Tausch's goatgrass (Aegilops tauschii), the donor of the bread wheat D genome. Our analysis of 46 Ae. tauschii genomes enabled us to clone a disease resistance gene and perform haplotype analysis across a complex disease resistance locus, allowing us to discern alleles from paralogous gene copies. We also reveal the complex genetic composition and history of the bread wheat D genome, which involves contributions from genetically and geographically discrete Ae. tauschii subpopulations. Together, our results reveal the complex history of the bread wheat D genome and demonstrate the potential of wild relatives in crop improvement.
Affiliation(s)
- Emile Cavalet-Giorsa
- Plant Science Program, Biological and Environmental Science and Engineering Division (BESE), King Abdullah University of Science and Technology (KAUST), Thuwal, Saudi Arabia
- Andrea González-Muñoz
- Plant Science Program, Biological and Environmental Science and Engineering Division (BESE), King Abdullah University of Science and Technology (KAUST), Thuwal, Saudi Arabia
- Naveenkumar Athiyannan
- Plant Science Program, Biological and Environmental Science and Engineering Division (BESE), King Abdullah University of Science and Technology (KAUST), Thuwal, Saudi Arabia
- Samuel Holden
- Faculty of Land and Food Systems, The University of British Columbia (UBC), Vancouver, British Columbia, Canada
- Adil Salhi
- Computer, Electrical and Mathematical Sciences and Engineering Division (CEMSE), Computational Bioscience Research Center (CBRC), King Abdullah University of Science and Technology (KAUST), Thuwal, Saudi Arabia
- Catherine Gardener
- Plant Science Program, Biological and Environmental Science and Engineering Division (BESE), King Abdullah University of Science and Technology (KAUST), Thuwal, Saudi Arabia
- Samira M Rustamova
- Institute of Molecular Biology and Biotechnologies, Ministry of Science and Education of the Republic of Azerbaijan, Baku, Azerbaijan
- Ahmed Fawzy Elkot
- Wheat Research Department, Field Crops Research Institute, Agricultural Research Center (ARC), Giza, Egypt
- Mehran Patpour
- Department of Agroecology, Aarhus University, Slagelse, Denmark
- Awais Rasheed
- Department of Plant Sciences, Quaid-i-Azam University, Islamabad, Pakistan
- International Maize and Wheat Improvement Centre (CIMMYT), c/o CAAS, Beijing, China
- Long Mao
- State Key Laboratory of Crop Gene Resources and Breeding and National Key Facility for Crop Gene Resources and Genetic Improvement, Institute of Crop Science, Chinese Academy of Agricultural Sciences, Beijing, China
- Evans S Lagudah
- Commonwealth Scientific and Industrial Research Organization (CSIRO), Agriculture and Food, Canberra, New South Wales, Australia
- Sambasivam K Periyannan
- Commonwealth Scientific and Industrial Research Organization (CSIRO), Agriculture and Food, Canberra, New South Wales, Australia
- Centre for Crop Health School of Agriculture and Environmental Science, University of Southern Queensland, Toowoomba, Queensland, Australia
- Amir Sharon
- Institute for Cereal Crops Improvement, School of Plant Sciences and Food Security, Tel Aviv University, Tel Aviv, Israel
- Axel Himmelbach
- Leibniz Institute of Plant Genetics and Crop Plant Research (IPK), Seeland, Germany
- Jochen C Reif
- Leibniz Institute of Plant Genetics and Crop Plant Research (IPK), Seeland, Germany
- Manuela Knauft
- Leibniz Institute of Plant Genetics and Crop Plant Research (IPK), Seeland, Germany
- Martin Mascher
- Leibniz Institute of Plant Genetics and Crop Plant Research (IPK), Seeland, Germany
- German Centre for Integrative Biodiversity Research (iDiv) Halle-Jena-Leipzig, Leipzig, Germany
- Nils Stein
- Leibniz Institute of Plant Genetics and Crop Plant Research (IPK), Seeland, Germany
- Institute of Agricultural and Nutritional Sciences, Martin Luther University Halle-Wittenberg, Halle, Germany
- Noam Chayut
- John Innes Centre, Norwich Research Park, Norwich, UK
- Sreya Ghosh
- John Innes Centre, Norwich Research Park, Norwich, UK
- Dragan Perovic
- Julius Kuehn-Institute (JKI), Federal Research Centre for Cultivated Plants, Institute for Resistance Research and Stress Tolerance, Quedlinburg, Germany
- Alexander Putra
- Bioscience Core Lab, King Abdullah University of Science and Technology (KAUST), Thuwal, Saudi Arabia
- Ana B Perera
- Plant Science Program, Biological and Environmental Science and Engineering Division (BESE), King Abdullah University of Science and Technology (KAUST), Thuwal, Saudi Arabia
- Chia-Yi Hu
- Plant Science Program, Biological and Environmental Science and Engineering Division (BESE), King Abdullah University of Science and Technology (KAUST), Thuwal, Saudi Arabia
- Guotai Yu
- Plant Science Program, Biological and Environmental Science and Engineering Division (BESE), King Abdullah University of Science and Technology (KAUST), Thuwal, Saudi Arabia
- Hanin Ibrahim Ahmed
- Plant Science Program, Biological and Environmental Science and Engineering Division (BESE), King Abdullah University of Science and Technology (KAUST), Thuwal, Saudi Arabia
- Centre d'anthropobiologie et de génomique de Toulouse (CAGT), Laboratoire d'Anthropobiologie et d'Imagerie de Synthèse, CNRS UMR 5288, Faculté de Médecine de Purpan, Toulouse, France
- Konstanze D Laquai
- Plant Science Program, Biological and Environmental Science and Engineering Division (BESE), King Abdullah University of Science and Technology (KAUST), Thuwal, Saudi Arabia
- Luis F Rivera
- Plant Science Program, Biological and Environmental Science and Engineering Division (BESE), King Abdullah University of Science and Technology (KAUST), Thuwal, Saudi Arabia
- Renjie Chen
- Plant Science Program, Biological and Environmental Science and Engineering Division (BESE), King Abdullah University of Science and Technology (KAUST), Thuwal, Saudi Arabia
- Yajun Wang
- Plant Science Program, Biological and Environmental Science and Engineering Division (BESE), King Abdullah University of Science and Technology (KAUST), Thuwal, Saudi Arabia
- National Key Laboratory of Plant Molecular Genetics, Center for Excellence in Molecular Plant Sciences, Institute of Plant Physiology and Ecology, Chinese Academy of Sciences, Shanghai, China
- Xin Gao
- Computer, Electrical and Mathematical Sciences and Engineering Division (CEMSE), Computational Bioscience Research Center (CBRC), King Abdullah University of Science and Technology (KAUST), Thuwal, Saudi Arabia
- Sanzhen Liu
- Department of Plant Pathology, Kansas State University, Manhattan, KS, USA
- W John Raupp
- Department of Plant Pathology and Wheat Genetics Resource Center, Kansas State University, Manhattan, KS, USA
- Eric L Olson
- Department of Plant, Soil and Microbial Sciences, Michigan State University, East Lansing, MI, USA
- Jong-Yeol Lee
- National Institute of Agricultural Sciences, Rural Development Administration, Jeonju, South Korea
- Parveen Chhuneja
- School of Agricultural Biotechnology, Punjab Agricultural University, Ludhiana, India
- Satinder Kaur
- School of Agricultural Biotechnology, Punjab Agricultural University, Ludhiana, India
- Peng Zhang
- Plant Breeding Institute, School of Life and Environmental Sciences, University of Sydney, Cobbitty, New South Wales, Australia
- Robert F Park
- Plant Breeding Institute, School of Life and Environmental Sciences, University of Sydney, Cobbitty, New South Wales, Australia
- Yi Ding
- Plant Breeding Institute, School of Life and Environmental Sciences, University of Sydney, Cobbitty, New South Wales, Australia
- Deng-Cai Liu
- Triticeae Research Institute, Sichuan Agricultural University, Chengdu, China
- Wanlong Li
- Department of Biology and Microbiology, South Dakota State University, Brookings, SD, USA
- Firuza Y Nasyrova
- Institute of Botany, Plant Physiology and Genetics, Tajik National Academy of Sciences, Dushanbe, Tajikistan
- Jan Dvorak
- Department of Plant Sciences, University of California, Davis, CA, USA
- Mehrdad Abbasi
- Faculty of Land and Food Systems, The University of British Columbia (UBC), Vancouver, British Columbia, Canada
- Meng Li
- Faculty of Land and Food Systems, The University of British Columbia (UBC), Vancouver, British Columbia, Canada
- Naveen Kumar
- Faculty of Land and Food Systems, The University of British Columbia (UBC), Vancouver, British Columbia, Canada
- Wilku B Meyer
- Department of Plant Sciences, University of the Free State, Bloemfontein, South Africa
- Willem H P Boshoff
- Department of Plant Sciences, University of the Free State, Bloemfontein, South Africa
- Brian J Steffenson
- Department of Plant Pathology, University of Minnesota, Saint Paul, MN, USA
- Oadi Matny
- Department of Plant Pathology, University of Minnesota, Saint Paul, MN, USA
- Parva K Sharma
- Department of Plant Science and Landscape Architecture, University of Maryland, College Park, MD, USA
- Vijay K Tiwari
- Department of Plant Science and Landscape Architecture, University of Maryland, College Park, MD, USA
- Surbhi Grewal
- Nottingham Wheat Research Centre, School of Biosciences, University of Nottingham, Loughborough, UK
- Curtis J Pozniak
- University of Saskatchewan, Crop Development Centre, Agriculture Building, Saskatoon, Saskatchewan, Canada
- Harmeet Singh Chawla
- University of Saskatchewan, Crop Development Centre, Agriculture Building, Saskatoon, Saskatchewan, Canada
- Department of Plant Science, University of Manitoba, Winnipeg, Manitoba, Canada
- Jennifer Ens
- University of Saskatchewan, Crop Development Centre, Agriculture Building, Saskatoon, Saskatchewan, Canada
- Luke T Dunning
- Ecology and Evolutionary Biology, School of Biosciences, University of Sheffield, Western Bank, Sheffield, UK
- Gerard R Lazo
- Crop Improvement and Genetics Research Unit, Western Regional Research Center, USDA-ARS, Albany, CA, USA
- Steven S Xu
- Crop Improvement and Genetics Research Unit, Western Regional Research Center, USDA-ARS, Albany, CA, USA
- Yong Q Gu
- Crop Improvement and Genetics Research Unit, Western Regional Research Center, USDA-ARS, Albany, CA, USA
- Xianyang Xu
- Peanut and Small Grains Research Unit, USDA-ARS, Stillwater, OK, USA
- Michael Abrouk
- Plant Science Program, Biological and Environmental Science and Engineering Division (BESE), King Abdullah University of Science and Technology (KAUST), Thuwal, Saudi Arabia
- Salim Bougouffa
- Computer, Electrical and Mathematical Sciences and Engineering Division (CEMSE), Computational Bioscience Research Center (CBRC), King Abdullah University of Science and Technology (KAUST), Thuwal, Saudi Arabia
- Gurcharn S Brar
- Faculty of Land and Food Systems, The University of British Columbia (UBC), Vancouver, British Columbia, Canada
- Faculty of Agricultural, Life and Environmental Sciences, University of Alberta, Edmonton, Canada
- Brande B H Wulff
- Plant Science Program, Biological and Environmental Science and Engineering Division (BESE), King Abdullah University of Science and Technology (KAUST), Thuwal, Saudi Arabia.
- Simon G Krattinger
- Plant Science Program, Biological and Environmental Science and Engineering Division (BESE), King Abdullah University of Science and Technology (KAUST), Thuwal, Saudi Arabia.
2
Mazooji K, Shomorony I. Fast multiple sequence alignment via multi-armed bandits. Bioinformatics 2024; 40:i328-i336. [PMID: 38940160 PMCID: PMC11211838 DOI: 10.1093/bioinformatics/btae225] [Indexed: 06/29/2024]
Abstract
SUMMARY: Multiple sequence alignment is an important problem in computational biology with applications that include phylogeny and the detection of remote homology between protein sequences. UPP is a popular software package that constructs accurate multiple sequence alignments for large datasets based on ensembles of hidden Markov models (HMMs). A computational bottleneck for this method is a sequence-to-HMM assignment step, which relies on the precise computation of probability scores on the HMMs. In this work, we show that we can speed up this assignment step significantly by replacing these HMM probability scores with alternative scores that can be efficiently estimated. Our proposed approach utilizes a multi-armed bandit algorithm to adaptively and efficiently compute estimates of these scores. This allows us to achieve similar alignment accuracy as UPP with a significant reduction in computation time, particularly for datasets with long sequences. AVAILABILITY AND IMPLEMENTATION: The code used to produce the results in this paper is available on GitHub at: https://github.com/ilanshom/adaptiveMSA.
Affiliation(s)
- Kayvon Mazooji
- Department of Electrical and Computer Engineering, University of Illinois Urbana-Champaign, Urbana, IL 61801, United States
- Ilan Shomorony
- Department of Electrical and Computer Engineering, University of Illinois Urbana-Champaign, Urbana, IL 61801, United States
3
Liu Z, Yang F, Deng C, Wan H, Tang H, Feng J, Wang Q, Yang N, Li J, Yang W. Chromosome-level assembly of the synthetic hexaploid wheat-derived cultivar Chuanmai 104. Sci Data 2024; 11:670. [PMID: 38909086 PMCID: PMC11193762 DOI: 10.1038/s41597-024-03527-2] [Received: 03/01/2024] [Accepted: 06/14/2024] [Indexed: 06/24/2024]
Abstract
Synthetic hexaploid wheats (SHWs) are effective genetic resources for transferring agronomically important genes from wild relatives to common wheat (Triticum aestivum L.). Dozens of reference-quality pseudomolecule assemblies of hexaploid wheat have been generated, but none is reported for SHW-derived cultivars. Here, we generated a chromosome-scale assembly for the SHW-derived cultivar 'Chuanmai 104' based on PacBio HiFi reads and chromosome conformation capture sequencing. The total assembly size was 14.81 Gb with a contig N50 length of 58.25 Mb. A BUSCO analysis yielded a completeness score of 99.30%. In total, repetitive elements comprised 81.36% of the genome and 122,554 high-confidence protein-coding gene models were predicted. In summary, the first chromosome-level assembly for a SHW-derived cultivar presents a promising outlook for the study and utilization of SHWs in wheat improvement, which is essential to meet the global food demand.
Affiliation(s)
- Zehou Liu
- Crop Research Institute, Sichuan Academy of Agricultural Sciences, Chengdu, China
- Environment Friendly Crop Germplasm Innovation and Genetic Improvement Key Laboratory of Sichuan Province, Chengdu, China
- Key Laboratory of Wheat Biology and Genetic Improvement on Southwestern China, Chengdu, China
- Key Laboratory of Tianfu Seed Industry Innovation, Chengdu, China
- Fan Yang
- Key Laboratory of Tianfu Seed Industry Innovation, Chengdu, China
- Biotechnology and Nuclear Technology Research Institute, Sichuan Academy of Agricultural Sciences, Chengdu, China
- Cao Deng
- The Key Laboratory of Animal Disease and Human Health of Sichuan Province, College of Veterinary Medicine, Sichuan Agricultural University, Chengdu, China
- Departments of Bioinformatics, DNA Stories Bioinformatics Center, Chengdu, China
- Hongshen Wan
- Crop Research Institute, Sichuan Academy of Agricultural Sciences, Chengdu, China
- Environment Friendly Crop Germplasm Innovation and Genetic Improvement Key Laboratory of Sichuan Province, Chengdu, China
- Key Laboratory of Wheat Biology and Genetic Improvement on Southwestern China, Chengdu, China
- Key Laboratory of Tianfu Seed Industry Innovation, Chengdu, China
- Hao Tang
- Crop Research Institute, Sichuan Academy of Agricultural Sciences, Chengdu, China
- Environment Friendly Crop Germplasm Innovation and Genetic Improvement Key Laboratory of Sichuan Province, Chengdu, China
- Key Laboratory of Wheat Biology and Genetic Improvement on Southwestern China, Chengdu, China
- Key Laboratory of Tianfu Seed Industry Innovation, Chengdu, China
- Junyan Feng
- Key Laboratory of Tianfu Seed Industry Innovation, Chengdu, China
- Biotechnology and Nuclear Technology Research Institute, Sichuan Academy of Agricultural Sciences, Chengdu, China
- Qin Wang
- Crop Research Institute, Sichuan Academy of Agricultural Sciences, Chengdu, China
- Environment Friendly Crop Germplasm Innovation and Genetic Improvement Key Laboratory of Sichuan Province, Chengdu, China
- Key Laboratory of Wheat Biology and Genetic Improvement on Southwestern China, Chengdu, China
- Key Laboratory of Tianfu Seed Industry Innovation, Chengdu, China
- Ning Yang
- Crop Research Institute, Sichuan Academy of Agricultural Sciences, Chengdu, China
- Environment Friendly Crop Germplasm Innovation and Genetic Improvement Key Laboratory of Sichuan Province, Chengdu, China
- Key Laboratory of Wheat Biology and Genetic Improvement on Southwestern China, Chengdu, China
- Key Laboratory of Tianfu Seed Industry Innovation, Chengdu, China
- Jun Li
- Crop Research Institute, Sichuan Academy of Agricultural Sciences, Chengdu, China.
- Environment Friendly Crop Germplasm Innovation and Genetic Improvement Key Laboratory of Sichuan Province, Chengdu, China.
- Key Laboratory of Wheat Biology and Genetic Improvement on Southwestern China, Chengdu, China.
- Key Laboratory of Tianfu Seed Industry Innovation, Chengdu, China.
- Wuyun Yang
- Crop Research Institute, Sichuan Academy of Agricultural Sciences, Chengdu, China.
- Environment Friendly Crop Germplasm Innovation and Genetic Improvement Key Laboratory of Sichuan Province, Chengdu, China.
- Key Laboratory of Wheat Biology and Genetic Improvement on Southwestern China, Chengdu, China.
- Key Laboratory of Tianfu Seed Industry Innovation, Chengdu, China.
4
Makova KD, Pickett BD, Harris RS, Hartley GA, Cechova M, Pal K, Nurk S, Yoo D, Li Q, Hebbar P, McGrath BC, Antonacci F, Aubel M, Biddanda A, Borchers M, Bornberg-Bauer E, Bouffard GG, Brooks SY, Carbone L, Carrel L, Carroll A, Chang PC, Chin CS, Cook DE, Craig SJC, de Gennaro L, Diekhans M, Dutra A, Garcia GH, Grady PGS, Green RE, Haddad D, Hallast P, Harvey WT, Hickey G, Hillis DA, Hoyt SJ, Jeong H, Kamali K, Pond SLK, LaPolice TM, Lee C, Lewis AP, Loh YHE, Masterson P, McGarvey KM, McCoy RC, Medvedev P, Miga KH, Munson KM, Pak E, Paten B, Pinto BJ, Potapova T, Rhie A, Rocha JL, Ryabov F, Ryder OA, Sacco S, Shafin K, Shepelev VA, Slon V, Solar SJ, Storer JM, Sudmant PH, Sweetalana, Sweeten A, Tassia MG, Thibaud-Nissen F, Ventura M, Wilson MA, Young AC, Zeng H, Zhang X, Szpiech ZA, Huber CD, Gerton JL, Yi SV, Schatz MC, Alexandrov IA, Koren S, O'Neill RJ, Eichler EE, Phillippy AM. The complete sequence and comparative analysis of ape sex chromosomes. Nature 2024; 630:401-411. [PMID: 38811727 PMCID: PMC11168930 DOI: 10.1038/s41586-024-07473-2] [Received: 11/17/2023] [Accepted: 04/26/2024] [Indexed: 05/31/2024]
Abstract
Apes possess two sex chromosomes: the male-specific Y chromosome and the X chromosome, which is present in both males and females. The Y chromosome is crucial for male reproduction, with deletions being linked to infertility [1]. The X chromosome is vital for reproduction and cognition [2]. Variation in mating patterns and brain function among apes suggests corresponding differences in their sex chromosomes. However, owing to their repetitive nature and incomplete reference assemblies, ape sex chromosomes have been challenging to study. Here, using the methodology developed for the telomere-to-telomere (T2T) human genome, we produced gapless assemblies of the X and Y chromosomes for five great apes (bonobo (Pan paniscus), chimpanzee (Pan troglodytes), western lowland gorilla (Gorilla gorilla gorilla), Bornean orangutan (Pongo pygmaeus) and Sumatran orangutan (Pongo abelii)) and a lesser ape (the siamang gibbon (Symphalangus syndactylus)), and untangled the intricacies of their evolution. Compared with the X chromosomes, the ape Y chromosomes vary greatly in size and have low alignability and high levels of structural rearrangements, owing to the accumulation of lineage-specific ampliconic regions, palindromes, transposable elements and satellites. Many Y chromosome genes expand in multi-copy families and some evolve under purifying selection. Thus, the Y chromosome exhibits dynamic evolution, whereas the X chromosome is more stable. Mapping short-read sequencing data to these assemblies revealed diversity and selection patterns on sex chromosomes of more than 100 individual great apes. These reference assemblies are expected to inform human evolution and conservation genetics of non-human apes, all of which are endangered species.
Affiliation(s)
- Brandon D Pickett
- National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
- Monika Cechova
- University of California Santa Cruz, Santa Cruz, CA, USA
- Karol Pal
- Penn State University, University Park, PA, USA
- Sergey Nurk
- National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
- DongAhn Yoo
- University of Washington School of Medicine, Seattle, WA, USA
- Qiuhui Li
- Johns Hopkins University, Baltimore, MD, USA
- Prajna Hebbar
- University of California Santa Cruz, Santa Cruz, CA, USA
- Erich Bornberg-Bauer
- University of Münster, Münster, Germany
- MPI for Developmental Biology, Tübingen, Germany
- Gerard G Bouffard
- National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
- Shelise Y Brooks
- National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
- Lucia Carbone
- Oregon Health and Science University, Portland, OR, USA
- Oregon National Primate Research Center, Hillsboro, OR, USA
- Laura Carrel
- Penn State University School of Medicine, Hershey, PA, USA
- Chen-Shan Chin
- Foundation of Biological Data Sciences, Belmont, CA, USA
- Mark Diekhans
- University of California Santa Cruz, Santa Cruz, CA, USA
- Amalia Dutra
- National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
- Gage H Garcia
- University of Washington School of Medicine, Seattle, WA, USA
- Diana Haddad
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, USA
- Pille Hallast
- The Jackson Laboratory for Genomic Medicine, Farmington, CT, USA
- Glenn Hickey
- University of California Santa Cruz, Santa Cruz, CA, USA
- David A Hillis
- University of California Santa Barbara, Santa Barbara, CA, USA
- Hyeonsoo Jeong
- University of Washington School of Medicine, Seattle, WA, USA
- Charles Lee
- The Jackson Laboratory for Genomic Medicine, Farmington, CT, USA
- Yong-Hwee E Loh
- University of California Santa Barbara, Santa Barbara, CA, USA
- Patrick Masterson
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, USA
- Kelly M McGarvey
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, USA
- Karen H Miga
- University of California Santa Cruz, Santa Cruz, CA, USA
- Evgenia Pak
- National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
- Benedict Paten
- University of California Santa Cruz, Santa Cruz, CA, USA
- Arang Rhie
- National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
- Joana L Rocha
- University of California Berkeley, Berkeley, CA, USA
- Fedor Ryabov
- Masters Program in National Research, University Higher School of Economics, Moscow, Russia
- Samuel Sacco
- University of California Santa Cruz, Santa Cruz, CA, USA
- Steven J Solar
- National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
- Sweetalana
- Penn State University, University Park, PA, USA
- Alex Sweeten
- National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
- Johns Hopkins University, Baltimore, MD, USA
- Françoise Thibaud-Nissen
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, USA
- Mario Ventura
- Università degli Studi di Bari Aldo Moro, Bari, Italy
- Alice C Young
- National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
- Xinru Zhang
- Penn State University, University Park, PA, USA
- Soojin V Yi
- University of California Santa Barbara, Santa Barbara, CA, USA
- Sergey Koren
- National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
- Evan E Eichler
- University of Washington School of Medicine, Seattle, WA, USA.
- Howard Hughes Medical Institute, University of Washington, Seattle, WA, USA.
- Adam M Phillippy
- National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA.
5
Simmons JR, Estrem B, Zagoskin MV, Oldridge R, Zadegan SB, Wang J. Chromosome fusion and programmed DNA elimination shape karyotypes of nematodes. Curr Biol 2024; 34:2147-2161.e5. [PMID: 38688284 PMCID: PMC11111355 DOI: 10.1016/j.cub.2024.04.022] [Received: 11/30/2023] [Revised: 02/21/2024] [Accepted: 04/09/2024] [Indexed: 05/02/2024]
Abstract
An increasing number of metazoans undergo programmed DNA elimination (PDE), where a significant amount of DNA is selectively lost from the somatic genome during development. In some nematodes, PDE leads to the removal and remodeling of the ends of all germline chromosomes. In several species, PDE also generates internal breaks that lead to sequence loss and increased numbers of somatic chromosomes. The biological significance of these karyotype changes associated with PDE and the origin and evolution of nematode PDE remain largely unknown. Here, we assembled the single germline chromosome of the nematode Parascaris univalens and compared the karyotypes, chromosomal gene organization, and PDE features among other nematodes. We show that PDE in Parascaris converts an XX/XY sex-determination system in the germline into an XX/XO system in the somatic cells. Comparisons of Ascaris, Parascaris, and Baylisascaris ascarid chromosomes suggest that PDE existed in the ancestor of these nematodes, and their current distinct germline karyotypes were derived from fusion events of smaller ancestral chromosomes. The DNA breaks involved in PDE resolve these fused germline chromosomes into their pre-fusion karyotypes. These karyotype changes may lead to alterations in genome architecture and gene expression in the somatic cells. Cytological and genomic analyses further suggest that satellite DNA and the heterochromatic chromosome arms are dynamic and may play a role during meiosis. Overall, our results show that chromosome fusion and PDE have been harnessed in these ascarids to sculpt their karyotypes, altering the genome organization and serving specific functions in the germline and somatic cells.
Affiliation(s)
- James R Simmons
- Department of Biochemistry and Cellular and Molecular Biology, University of Tennessee, Knoxville, TN 37996, USA
- Brandon Estrem
- Department of Biochemistry and Cellular and Molecular Biology, University of Tennessee, Knoxville, TN 37996, USA
- Maxim V Zagoskin
- Department of Biochemistry and Cellular and Molecular Biology, University of Tennessee, Knoxville, TN 37996, USA
- Ryan Oldridge
- Department of Biochemistry and Cellular and Molecular Biology, University of Tennessee, Knoxville, TN 37996, USA
- Sobhan Bahrami Zadegan
- UT-ORNL Graduate School of Genome Science and Technology, University of Tennessee, Knoxville, TN 37996, USA
- Jianbin Wang
- Department of Biochemistry and Cellular and Molecular Biology, University of Tennessee, Knoxville, TN 37996, USA; UT-ORNL Graduate School of Genome Science and Technology, University of Tennessee, Knoxville, TN 37996, USA.
6
Carroll RA, Rice ES, Murphy WJ, Lyons LA, Thibaud-Nissen F, Coghill LM, Swanson WF, Terio KA, Boyd T, Warren WC. A chromosome-scale fishing cat reference genome for the evaluation of potential germline risk variants. Sci Rep 2024; 14:8073. [PMID: 38580653 PMCID: PMC10997796 DOI: 10.1038/s41598-024-56003-7] [Received: 11/08/2023] [Accepted: 02/29/2024] [Indexed: 04/07/2024]
Abstract
The fishing cat, Prionailurus viverrinus, faces a population decline, increasing the importance of maintaining healthy zoo populations. Unfortunately, zoo-managed individuals currently face a high prevalence of transitional cell carcinoma (TCC), a form of bladder cancer. To investigate the genetics of inherited diseases among captive fishing cats, we present a chromosome-scale assembly, generate the pedigree of the zoo-managed population, reaffirm the close genetic relationship with the Asian leopard cat (Prionailurus bengalensis), and identify 7.4 million single nucleotide variants (SNVs) and 23,432 structural variants (SVs) from whole genome sequencing (WGS) data of healthy and TCC cats. Only BRCA2 was found to have a high recurrent number of missense mutations in fishing cats diagnosed with TCC when compared to inherited human cancer risk variants. These new fishing cat genomic resources will aid conservation efforts to improve their genetic fitness and enhance the comparative study of feline genomes.
Affiliation(s)
- Rachel A Carroll
- Bond Life Sciences Center, University of Missouri, 1201 Rollins St., Columbia, MO, 65211, USA
| | - Edward S Rice
- Bond Life Sciences Center, University of Missouri, 1201 Rollins St., Columbia, MO, 65211, USA
| | - William J Murphy
- Department of Veterinary Integrative Biosciences, Texas A and M University, College Station, TX, 77843-4458, USA
| | - Leslie A Lyons
- Department of Veterinary Medicine and Surgery, College of Veterinary Medicine, University of Missouri, Columbia, MO, 65211, USA
| | - Francoise Thibaud-Nissen
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, USA
| | - Lyndon M Coghill
- Bioinformatics and Analytics Core, University of Missouri, 1201 Rollins St., Columbia, MO, 65211, USA
| | - William F Swanson
- Center for Conservation and Research of Endangered Wildlife, Cincinnati Zoo and Botanical Garden, 3400 Vine St., Cincinnati, OH, 45220, USA
| | - Karen A Terio
- Zoological Pathology Program, University of Illinois, 3300 Golf Rd, Brookfield, IL, 60513, USA
| | - Tyler Boyd
- Oklahoma City Zoo and Botanical Garden, 2000 Remington Pl., Oklahoma, OK, 73111, USA
| | - Wesley C Warren
- Bond Life Sciences Center, University of Missouri, 1201 Rollins St., Columbia, MO, 65211, USA.
- Department of Surgery, Bond Life Sciences Center, Institute of Data Science and Informatics, University of Missouri, 1201 Rollins St., Columbia, MO, 65211, USA.
|
7
|
Smith T, Olagunju T, Rosen B, Neibergs H, Becker G, Davenport K, Elsik C, Hadfield T, Koren S, Kuhn K, Rhie A, Shira K, Skibiel A, Stegemiller M, Thorne J, Villamediana P, Cockett N, Murdoch B. The first complete T2T Assemblies of Cattle and Sheep Y-Chromosomes uncover remarkable divergence in structure and gene content. RESEARCH SQUARE 2024:rs.3.rs-4033388. [PMID: 38712074 PMCID: PMC11071540 DOI: 10.21203/rs.3.rs-4033388/v1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/08/2024]
Abstract
Reference genomes of cattle and sheep have lacked contiguous assemblies of the sex-determining Y chromosome. We assembled complete and gapless telomere-to-telomere (T2T) Y chromosomes for these species. The pseudo-autosomal regions were similar in length, but the total chromosome size was substantially different, with the cattle Y more than twice the length of the sheep Y. The length disparity was accounted for by an expanded ampliconic region in cattle. The genic amplification in cattle contrasts with pseudogenization in sheep, suggesting opposite evolutionary mechanisms since their divergence 18 MYA. The centromeres also differed dramatically despite the close relationship between these species at the overall genome sequence level. These Y chromosomes have been added to the current reference assemblies in GenBank, opening new opportunities for the study of evolution and variation while supporting efforts to improve sustainability in these important livestock species, which generally use sire-driven genetic improvement strategies.
Affiliation(s)
- Timothy Smith
- USDA, ARS, U.S. Meat Animal Research Center (USMARC)
| | - Sergey Koren
- Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health
| | - Arang Rhie
- Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
|
8
|
Koren S, Bao Z, Guarracino A, Ou S, Goodwin S, Jenike KM, Lucas J, McNulty B, Park J, Rautiainen M, Rhie A, Roelofs D, Schneiders H, Vrijenhoek I, Nijbroek K, Ware D, Schatz MC, Garrison E, Huang S, McCombie WR, Miga KH, Wittenberg AH, Phillippy AM. Gapless assembly of complete human and plant chromosomes using only nanopore sequencing. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2024:2024.03.15.585294. [PMID: 38529488 PMCID: PMC10962732 DOI: 10.1101/2024.03.15.585294] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 03/27/2024]
Abstract
The combination of ultra-long Oxford Nanopore (ONT) sequencing reads with long, accurate PacBio HiFi reads has enabled the completion of a human genome and spurred similar efforts to complete the genomes of many other species. However, this approach for complete, "telomere-to-telomere" genome assembly relies on multiple sequencing platforms, limiting its accessibility. ONT "Duplex" sequencing reads, where both strands of the DNA are read to improve quality, promise high per-base accuracy. To evaluate this new data type, we generated ONT Duplex data for three widely-studied genomes: human HG002, Solanum lycopersicum Heinz 1706 (tomato), and Zea mays B73 (maize). For the diploid, heterozygous HG002 genome, we also used "Pore-C" chromatin contact mapping to completely phase the haplotypes. We found the accuracy of Duplex data to be similar to HiFi sequencing, but with read lengths tens of kilobases longer, and the Pore-C data to be compatible with existing diploid assembly algorithms. This combination of read length and accuracy enables the construction of a high-quality initial assembly, which can then be further resolved using the ultra-long reads, and finally phased into chromosome-scale haplotypes with Pore-C. The resulting assemblies have a base accuracy exceeding 99.999% (Q50) and near-perfect continuity, with most chromosomes assembled as single contigs. We conclude that ONT sequencing is a viable alternative to HiFi sequencing for de novo genome assembly, and has the potential to provide a single-instrument solution for the reconstruction of complete genomes.
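The pairing of "99.999%" with "Q50" in the abstract above follows the Phred convention, in which a quality value Q corresponds to a per-base error probability of 10^(-Q/10). The following is a minimal sketch of that conversion; the function names are hypothetical and are not taken from the cited work.

```python
import math


def phred_to_accuracy(q: float) -> float:
    """Convert a Phred-scaled quality value to per-base accuracy (0 to 1)."""
    # Error probability is 10^(-Q/10); accuracy is its complement.
    return 1.0 - 10 ** (-q / 10.0)


def accuracy_to_phred(accuracy: float) -> float:
    """Convert per-base accuracy (0 <= accuracy < 1) to a Phred-scaled value."""
    if not 0.0 <= accuracy < 1.0:
        raise ValueError("accuracy must be in the range [0, 1)")
    return -10.0 * math.log10(1.0 - accuracy)


if __name__ == "__main__":
    # Hypothetical illustration: Q50 as quoted in the abstract.
    print(f"Q50 accuracy: {phred_to_accuracy(50):.7f}")
    print(f"Phred value for 99.999%: {accuracy_to_phred(0.99999):.2f}")
```

Under this convention, each additional 10 Q units corresponds to a tenfold reduction in the expected error rate.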
Affiliation(s)
- Sergey Koren
- Genome Informatics Section, Center for Genomics and Data Science Research, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
| | - Zhigui Bao
- Department of Molecular Biology, Max Planck Institute for Biology Tübingen, Tübingen, Baden-Württemberg, Germany
- Shenzhen Branch, Guangdong Laboratory of Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture and Rural Affairs, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, China
| | - Andrea Guarracino
- Department of Genetics, Genomics and Informatics, University of Tennessee Health Science Center, Memphis, Tennessee, USA
- Human Technopole, Milan, Italy
| | - Shujun Ou
- Ohio State University, Columbus, OH, USA
| | - Sara Goodwin
- Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, USA
| | - Katharine M. Jenike
- Department of Computer Science, Johns Hopkins University, Baltimore, MD, USA
| | - Julian Lucas
- University of California Santa Cruz, Santa Cruz, CA, USA
| | - Brandy McNulty
- University of California Santa Cruz, Santa Cruz, CA, USA
| | - Jimin Park
- University of California Santa Cruz, Santa Cruz, CA, USA
| | - Mikko Rautiainen
- Genome Informatics Section, Center for Genomics and Data Science Research, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
| | - Arang Rhie
- Genome Informatics Section, Center for Genomics and Data Science Research, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
| | - Dick Roelofs
- KeyGene, Agro Business Park 90, 6708 PW Wageningen, Netherlands
| | - Ilse Vrijenhoek
- KeyGene, Agro Business Park 90, 6708 PW Wageningen, Netherlands
| | - Koen Nijbroek
- KeyGene, Agro Business Park 90, 6708 PW Wageningen, Netherlands
| | - Doreen Ware
- Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, USA
| | - Michael C. Schatz
- Department of Computer Science, Johns Hopkins University, Baltimore, MD, USA
| | - Erik Garrison
- Department of Genetics, Genomics and Informatics, University of Tennessee Health Science Center, Memphis, Tennessee, USA
| | - Sanwen Huang
- Shenzhen Branch, Guangdong Laboratory of Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture and Rural Affairs, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, China
- State Key Laboratory of Tropical Crop Breeding, Chinese Academy of Tropical Agricultural Sciences, Haikou, Hainan, China
| | - Karen H. Miga
- University of California Santa Cruz, Santa Cruz, CA, USA
| | - Adam M. Phillippy
- Genome Informatics Section, Center for Genomics and Data Science Research, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
|
9
|
Mao Y, Harvey WT, Porubsky D, Munson KM, Hoekzema K, Lewis AP, Audano PA, Rozanski A, Yang X, Zhang S, Yoo D, Gordon DS, Fair T, Wei X, Logsdon GA, Haukness M, Dishuck PC, Jeong H, Del Rosario R, Bauer VL, Fattor WT, Wilkerson GK, Mao Y, Shi Y, Sun Q, Lu Q, Paten B, Bakken TE, Pollen AA, Feng G, Sawyer SL, Warren WC, Carbone L, Eichler EE. Structurally divergent and recurrently mutated regions of primate genomes. Cell 2024; 187:1547-1562.e13. [PMID: 38428424 PMCID: PMC10947866 DOI: 10.1016/j.cell.2024.01.052] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/28/2023] [Revised: 11/26/2023] [Accepted: 01/31/2024] [Indexed: 03/03/2024]
Abstract
We sequenced and assembled using multiple long-read sequencing technologies the genomes of chimpanzee, bonobo, gorilla, orangutan, gibbon, macaque, owl monkey, and marmoset. We identified 1,338,997 lineage-specific fixed structural variants (SVs) disrupting 1,561 protein-coding genes and 136,932 regulatory elements, including the most complete set of human-specific fixed differences. We estimate that 819.47 Mbp or ∼27% of the genome has been affected by SVs across primate evolution. We identify 1,607 structurally divergent regions wherein recurrent structural variation contributes to creating SV hotspots where genes are recurrently lost (e.g., CARD, C4, and OLAH gene families) and additional lineage-specific genes are generated (e.g., CKAP2, VPS36, ACBD7, and NEK5 paralogs), becoming targets of rapid chromosomal diversification and positive selection (e.g., RGPD gene family). High-fidelity long-read sequencing has made these dynamic regions of the genome accessible for sequence-level analyses within and between primate species.
Affiliation(s)
- Yafei Mao
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA; Bio-X Institutes, Key Laboratory for the Genetics of Developmental and Neuropsychiatric Disorders, Ministry of Education, Shanghai Jiao Tong University, Shanghai, China.
| | - William T Harvey
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
| | - David Porubsky
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
| | - Katherine M Munson
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
| | - Kendra Hoekzema
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
| | - Alexandra P Lewis
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
| | - Peter A Audano
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
| | - Allison Rozanski
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
| | - Xiangyu Yang
- Bio-X Institutes, Key Laboratory for the Genetics of Developmental and Neuropsychiatric Disorders, Ministry of Education, Shanghai Jiao Tong University, Shanghai, China
| | - Shilong Zhang
- Bio-X Institutes, Key Laboratory for the Genetics of Developmental and Neuropsychiatric Disorders, Ministry of Education, Shanghai Jiao Tong University, Shanghai, China
| | - DongAhn Yoo
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
| | - David S Gordon
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA; Howard Hughes Medical Institute, University of Washington, Seattle, WA, USA
| | - Tyler Fair
- Eli and Edythe Broad Center of Regeneration Medicine and Stem Cell Research, University of California, San Francisco, San Francisco, CA, USA
| | - Xiaoxi Wei
- Bio-X Institutes, Key Laboratory for the Genetics of Developmental and Neuropsychiatric Disorders, Ministry of Education, Shanghai Jiao Tong University, Shanghai, China
| | - Glennis A Logsdon
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
| | - Marina Haukness
- UC Santa Cruz Genomics Institute, University of California, Santa Cruz, Santa Cruz, CA, USA
| | - Philip C Dishuck
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
| | - Hyeonsoo Jeong
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
| | - Ricardo Del Rosario
- McGovern Institute for Brain Research, Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, MA, USA; Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Vanessa L Bauer
- BioFrontiers Institute, Department of Molecular, Cellular, and Developmental Biology, University of Colorado, Boulder, CO, USA
| | - Will T Fattor
- BioFrontiers Institute, Department of Molecular, Cellular, and Developmental Biology, University of Colorado, Boulder, CO, USA
| | - Gregory K Wilkerson
- Department of Veterinary Sciences, Michale E. Keeling Center for Comparative Medicine and Research, The University of Texas MD Anderson Cancer Center, Bastrop, TX, USA; Department of Clinical Sciences, North Carolina State University, Raleigh, NC, USA
| | - Yuxiang Mao
- Institute of Neuroscience, State Key Laboratory of Neuroscience, Center for Excellence in Brain Science & Intelligence Technology, Chinese Academy of Sciences, Shanghai, China; Shanghai Center for Brain Science and Brain-Inspired Intelligence Technology, Shanghai, China
| | - Yongyong Shi
- Bio-X Institutes, Key Laboratory for the Genetics of Developmental and Neuropsychiatric Disorders, Ministry of Education, Shanghai Jiao Tong University, Shanghai, China; Institute of Neuroscience, State Key Laboratory of Neuroscience, Center for Excellence in Brain Science & Intelligence Technology, Chinese Academy of Sciences, Shanghai, China; Shanghai Center for Brain Science and Brain-Inspired Intelligence Technology, Shanghai, China
| | - Qiang Sun
- Institute of Neuroscience, State Key Laboratory of Neuroscience, Center for Excellence in Brain Science & Intelligence Technology, Chinese Academy of Sciences, Shanghai, China; Shanghai Center for Brain Science and Brain-Inspired Intelligence Technology, Shanghai, China
| | - Qing Lu
- Bio-X Institutes, Key Laboratory for the Genetics of Developmental and Neuropsychiatric Disorders, Ministry of Education, Shanghai Jiao Tong University, Shanghai, China
| | - Benedict Paten
- UC Santa Cruz Genomics Institute, University of California, Santa Cruz, Santa Cruz, CA, USA
| | - Alex A Pollen
- Eli and Edythe Broad Center of Regeneration Medicine and Stem Cell Research, University of California, San Francisco, San Francisco, CA, USA; Department of Neurology, University of California, San Francisco, San Francisco, CA, USA
| | - Guoping Feng
- McGovern Institute for Brain Research, Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, MA, USA; Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Sara L Sawyer
- BioFrontiers Institute, Department of Molecular, Cellular, and Developmental Biology, University of Colorado, Boulder, CO, USA
| | - Wesley C Warren
- Department of Animal Sciences, Bond Life Sciences Center, University of Missouri, Columbia, MO, USA; Department of Surgery, School of Medicine, University of Missouri, Columbia, MO, USA; Institute of Data Science and Informatics, University of Missouri, Columbia, MO, USA
| | - Lucia Carbone
- Department of Medicine, Knight Cardiovascular Institute, Oregon Health and Science University, Portland, OR, USA; Division of Genetics, Oregon National Primate Research Center, Beaverton, OR, USA; Department of Molecular and Medical Genetics, Oregon Health and Science University, Portland, OR, USA; Department of Medical Informatics and Clinical Epidemiology, Oregon Health and Science University, Portland, OR, USA
| | - Evan E Eichler
- Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA; Howard Hughes Medical Institute, University of Washington, Seattle, WA, USA.
|
10
|
Massaro I, Poethig RS, Sinha NR, Leichty AR. Chromosome-level genome of the transformable northern wattle, Acacia crassicarpa. G3 (BETHESDA, MD.) 2024; 14:jkad284. [PMID: 38096217 PMCID: PMC10917515 DOI: 10.1093/g3journal/jkad284] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 09/13/2023] [Accepted: 12/01/2023] [Indexed: 03/08/2024]
Abstract
The genus Acacia is a large group of woody legumes containing an enormous amount of morphological diversity in leaf shape. This diversity is at least in part the result of an innovation in leaf development where many Acacia species are capable of developing leaves of both bifacial and unifacial morphologies. While not unique in the plant kingdom, unifaciality is most commonly associated with monocots, and its developmental genetic mechanisms have yet to be explored beyond this group. In this study, we identify an accession of Acacia crassicarpa with high regeneration rates and isolate a clone for genome sequencing. We generate a chromosome-level assembly of this readily transformable clone, and using comparative analyses, confirm a whole-genome duplication unique to Caesalpinoid legumes. This resource will be important for future work examining genome evolution in legumes and the unique developmental genetic mechanisms underlying unifacial morphogenesis in Acacia.
Affiliation(s)
- Isabelle Massaro
- Department of Plant and Microbial Biology, University of California Berkeley, Berkeley, CA 94720, USA
| | - Neelima R Sinha
- Department of Plant Biology, University of California Davis, Davis, CA 95616, USA
| | - Aaron R Leichty
- Department of Plant and Microbial Biology, University of California Berkeley, Berkeley, CA 94720, USA
- Department of Plant Biology, University of California Davis, Davis, CA 95616, USA
- USDA Plant Gene Expression Center, 800 Buchanan Street, Albany, CA 94710, USA
|
11
|
Gerrick ER, Zlitni S, West PT, Carter MM, Mechler CM, Olm MR, Caffrey EB, Li JA, Higginbottom SK, Severyn CJ, Kracke F, Spormann AM, Sonnenburg JL, Bhatt AS, Howitt MR. Metabolic diversity in commensal protists regulates intestinal immunity and trans-kingdom competition. Cell 2024; 187:62-78.e20. [PMID: 38096822 DOI: 10.1016/j.cell.2023.11.018] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/26/2022] [Revised: 08/01/2023] [Accepted: 11/14/2023] [Indexed: 01/05/2024]
Abstract
The microbiota influences intestinal health and physiology, yet the contributions of commensal protists to the gut environment have been largely overlooked. Here, we discover human- and rodent-associated parabasalid protists, revealing substantial diversity and prevalence in nonindustrialized human populations. Genomic and metabolomic analyses of murine parabasalids from the genus Tritrichomonas revealed species-level differences in excretion of the metabolite succinate, which results in distinct small intestinal immune responses. Metabolic differences between Tritrichomonas species also determine their ecological niche within the microbiota. By manipulating dietary fibers and developing in vitro protist culture, we show that different Tritrichomonas species prefer dietary polysaccharides or mucus glycans. These polysaccharide preferences drive trans-kingdom competition with specific commensal bacteria, which affects intestinal immunity in a diet-dependent manner. Our findings reveal unappreciated diversity in commensal parabasalids, elucidate differences in commensal protist metabolism, and suggest how dietary interventions could regulate their impact on gut health.
Affiliation(s)
- Elias R Gerrick
- Department of Pathology, Stanford University School of Medicine, Stanford, CA 94305, USA
| | - Soumaya Zlitni
- Department of Genetics, Stanford University, Stanford, CA 94305, USA; Department of Medicine, Stanford University School of Medicine, Stanford, CA 94305, USA
| | - Patrick T West
- Department of Genetics, Stanford University, Stanford, CA 94305, USA; Department of Medicine, Stanford University School of Medicine, Stanford, CA 94305, USA
| | - Matthew M Carter
- Department of Microbiology and Immunology, Stanford University School of Medicine, Stanford, CA 94305, USA
| | - Claire M Mechler
- Department of Pathology, Stanford University School of Medicine, Stanford, CA 94305, USA
| | - Matthew R Olm
- Department of Microbiology and Immunology, Stanford University School of Medicine, Stanford, CA 94305, USA
| | - Elisa B Caffrey
- Department of Microbiology and Immunology, Stanford University School of Medicine, Stanford, CA 94305, USA
| | - Jessica A Li
- Department of Pathology, Stanford University School of Medicine, Stanford, CA 94305, USA
| | - Steven K Higginbottom
- Department of Microbiology and Immunology, Stanford University School of Medicine, Stanford, CA 94305, USA
| | - Christopher J Severyn
- Department of Genetics, Stanford University, Stanford, CA 94305, USA; Department of Pediatrics, Division of Hematology/Oncology/Stem Cell Transplant and Regenerative Medicine Stanford University, Palo Alto, CA 94305, USA
| | - Frauke Kracke
- Department of Civil and Environmental Engineering, Stanford University, Stanford, CA 94305, USA
| | - Alfred M Spormann
- Department of Civil and Environmental Engineering, Stanford University, Stanford, CA 94305, USA; Department of Chemical Engineering, Stanford University, Stanford, CA 94305, USA
| | - Justin L Sonnenburg
- Department of Microbiology and Immunology, Stanford University School of Medicine, Stanford, CA 94305, USA; Chan Zuckerberg Biohub, San Francisco, CA 94158, USA
| | - Ami S Bhatt
- Department of Genetics, Stanford University, Stanford, CA 94305, USA; Department of Medicine, Stanford University School of Medicine, Stanford, CA 94305, USA
| | - Michael R Howitt
- Department of Pathology, Stanford University School of Medicine, Stanford, CA 94305, USA; Department of Microbiology and Immunology, Stanford University School of Medicine, Stanford, CA 94305, USA.
|
12
|
Volpe E, Corda L, Tommaso ED, Pelliccia F, Ottalevi R, Licastro D, Guarracino A, Capulli M, Formenti G, Tassone E, Giunta S. The complete diploid reference genome of RPE-1 identifies human phased epigenetic landscapes. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2023:2023.11.01.565049. [PMID: 38168337 PMCID: PMC10760208 DOI: 10.1101/2023.11.01.565049] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 01/05/2024]
Abstract
Comparative analysis of recent human genome assemblies highlights profound sequence divergence that peaks within polymorphic loci such as centromeres. This raises the question of whether human reference genomes are adequate for accurately analyzing sequencing data derived from experimental cell lines. Here, we generated the complete diploid genome assembly for the human retinal epithelial cells (RPE-1), a widely used non-cancer laboratory cell line with a stable karyotype, to use as a matched reference for multi-omics sequencing data analysis. Our RPE1v1.0 assembly presents completely phased haplotypes and chromosome-level scaffolds that span centromeres with ultra-high base accuracy (>QV60). We mapped the haplotype-specific genomic variation of this cell line, including t(Xq;10q), a stable 73.18 Mb duplication of chromosome 10 translocated onto the microdeleted chromosome X telomere. Polymorphisms between haplotypes of the same genome reveal genetic and epigenetic variation for all chromosomes, especially at centromeres. Used as a matched reference genome, the RPE-1 assembly improves the mapping quality of multi-omics reads originating from RPE-1 cells, with a drastic reduction in alignment mismatches compared to using the most complete human reference to date (CHM13). Leveraging the accuracy achieved using a matched reference, we were able to identify the kinetochore sites at base pair resolution and show unprecedented variation between haplotypes. This work showcases the use of matched reference genomes for multi-omics analyses and serves as the foundation for a call to comprehensively assemble experimentally relevant cell lines for widespread application.
Affiliation(s)
- Emilia Volpe
- Giunta Laboratory of Genome Evolution, Department of Biology and Biotechnologies Charles Darwin, University of Rome “Sapienza”, Piazzale Aldo Moro 5, 00185 Rome, Italy
| | - Luca Corda
- Giunta Laboratory of Genome Evolution, Department of Biology and Biotechnologies Charles Darwin, University of Rome “Sapienza”, Piazzale Aldo Moro 5, 00185 Rome, Italy
| | - Elena Di Tommaso
- Giunta Laboratory of Genome Evolution, Department of Biology and Biotechnologies Charles Darwin, University of Rome “Sapienza”, Piazzale Aldo Moro 5, 00185 Rome, Italy
| | - Franca Pelliccia
- Giunta Laboratory of Genome Evolution, Department of Biology and Biotechnologies Charles Darwin, University of Rome “Sapienza”, Piazzale Aldo Moro 5, 00185 Rome, Italy
| | - Riccardo Ottalevi
- Department of Bioinformatic, Dante Genomics Corp Inc., 667 Madison Avenue, New York, NY 10065 USA and S.s.17, 67100, L’Aquila, Italy
| | - Andrea Guarracino
- Department of Genetics, Genomics and Informatics, University of Tennessee Health Science Center, Memphis, TN 38163, USA
| | - Mattia Capulli
- Department of Biotechnological and Applied Clinical Sciences, University of L’Aquila, L’Aquila, Italy
| | - Giulio Formenti
- The Rockefeller University, 1230 York Avenue, 10065 New York, USA
| | - Evelyne Tassone
- Giunta Laboratory of Genome Evolution, Department of Biology and Biotechnologies Charles Darwin, University of Rome “Sapienza”, Piazzale Aldo Moro 5, 00185 Rome, Italy
| | - Simona Giunta
- Giunta Laboratory of Genome Evolution, Department of Biology and Biotechnologies Charles Darwin, University of Rome “Sapienza”, Piazzale Aldo Moro 5, 00185 Rome, Italy
|
13
|
LoTempio J, Delot E, Vilain E. Benchmarking long-read genome sequence alignment tools for human genomics applications. PeerJ 2023; 11:e16515. [PMID: 38130927 PMCID: PMC10734412 DOI: 10.7717/peerj.16515] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/25/2023] [Accepted: 11/02/2023] [Indexed: 12/23/2023] Open
Abstract
Background: The utility of long-read genome sequencing platforms has been shown in many fields including whole genome assembly, metagenomics, and amplicon sequencing. Less clear is the applicability of long reads to reference-guided human genomics, which is the foundation of genomic medicine. Here, we benchmark available platform-agnostic alignment tools on datasets from nanopore and single-molecule real-time platforms to understand their suitability in producing a genome representation. Results: For this study, we leveraged publicly available data from sample NA12878 generated on Oxford Nanopore and sample NA24385 on Pacific Biosciences platforms. We employed state-of-the-art sequence alignment tools including GraphMap2, long-read aligner (LRA), Minimap2, CoNvex Gap-cost alignMents for Long Reads (NGMLR), and Winnowmap2. Minimap2 and Winnowmap2 were computationally lightweight enough for use at scale, while GraphMap2 was not. NGMLR took a long time and required many resources, but produced alignments each time. LRA was fast, but only worked on Pacific Biosciences data. The tools disagreed widely on which reads to leave unaligned, affecting the end genome coverage and the number of discoverable breakpoints. No alignment tool independently resolved all large structural variants (1,001-100,000 base pairs) present in the Database of Genome Variants (DGV) for sample NA12878 or the truthset for NA24385. Conclusions: These results suggest a combined approach is needed for LRS alignments for human genomics. Specifically, leveraging alignments from three tools will be more effective in generating a complete picture of genomic variability. It should be best practice to use an analysis pipeline that generates alignments with both Minimap2 and Winnowmap2, as they are lightweight and yield different views of the genome. Depending on the question at hand, the data available, and the time constraints, NGMLR and LRA are good options for a third tool.
If computational resources and time are not a factor for a given case or experiment, NGMLR will provide another view, and another chance to resolve a case. LRA, while fast, did not work on the nanopore data for our cluster, but PacBio results were promising in that those computations completed faster than Minimap2. Due to its significant burden on computational resources and slow run time, GraphMap2 is not an ideal tool for exploration of a whole human genome generated on a long-read sequencing platform.
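The observation above that the tools disagreed on which reads to leave unaligned can be illustrated with simple set operations. The following is a minimal sketch, assuming the unaligned read IDs from each tool have already been extracted into lists; the function name, the tool labels used as dictionary keys, and the read IDs are all hypothetical and do not come from the cited study.

```python
def unaligned_agreement(unaligned_by_tool):
    """Summarize agreement between tools on which reads were left unaligned.

    unaligned_by_tool maps a tool label to an iterable of unaligned read IDs.
    Returns (shared, unique), where shared is the set of reads unaligned by
    every tool and unique maps each tool to reads only that tool left unaligned.
    """
    sets = {tool: set(reads) for tool, reads in unaligned_by_tool.items()}
    if not sets:
        raise ValueError("at least one tool is required")
    shared = set.intersection(*sets.values())
    unique = {}
    for tool, reads in sets.items():
        # Union of the unaligned reads reported by all other tools.
        others = set().union(*(s for t, s in sets.items() if t != tool))
        unique[tool] = reads - others
    return shared, unique


if __name__ == "__main__":
    # Hypothetical read IDs, for illustration only.
    example = {
        "minimap2": ["r1", "r2", "r3"],
        "winnowmap2": ["r2", "r3", "r4"],
        "ngmlr": ["r3", "r5"],
    }
    shared, unique = unaligned_agreement(example)
    print("unaligned by every tool:", sorted(shared))
    for tool, reads in unique.items():
        print(f"unaligned only by {tool}:", sorted(reads))
```

Reads left unaligned by only one tool are the ones for which a combined, multi-tool approach can recover coverage.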
Affiliation(s)
- Jonathan LoTempio
- Institute for Clinical and Translational Science, University of California, Irvine, CA, United States of America
- International Research Laboratory (IRL2006) “Epigenetics, Data, Politics (EpiDaPo)”, Centre National de la Recherche Scientifique, Washington, DC, United States of America
| | - Emmanuele Delot
- Center for Genetic Medicine Research, Children’s National Hospital, Washington, DC, United States of America
- Department of Genomics and Precision Medicine, George Washington University, Washington, DC, United States of America
| | - Eric Vilain
- Institute for Clinical and Translational Science, University of California, Irvine, CA, United States of America
- International Research Laboratory (IRL2006) “Epigenetics, Data, Politics (EpiDaPo)”, Centre National de la Recherche Scientifique, Washington, DC, United States of America
|
14
|
Xu XW, Sun P, Gao C, Zheng W, Chen S. Assembly of the poorly differentiated Verasper variegatus W chromosome by different sequencing technologies. Sci Data 2023; 10:893. [PMID: 38092799 PMCID: PMC10719390 DOI: 10.1038/s41597-023-02790-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/08/2023] [Accepted: 11/24/2023] [Indexed: 12/17/2023] Open
Abstract
The assembly of W and Y chromosomes poses significant challenges in vertebrate genome sequencing and assembly. Here, we successfully assembled the W chromosome of Verasper variegatus with a length of 20.48 Mb by combining population and PacBio HiFi sequencing data. It was identified as a young sex chromosome and showed signs of expansion in repetitive sequences. The major component of the expansion was Ty3/Gypsy. The ancestral Osteichthyes karyotype consists of 24 protochromosomes. The sex chromosomes in four Pleuronectiformes species derived from a pair of homologous protochromosomes resulting from a whole-genome duplication event in teleost fish, yet with different sex-determination systems. V. variegatus and Cynoglossus semilaevis adhere to the ZZ/ZW system, while Hippoglossus stenolepis and H. hippoglossus follow the XX/XY system. Interestingly, V. variegatus and H. hippoglossus derived from one protochromosome, while C. semilaevis and H. stenolepis derived from another protochromosome. Our study provides valuable insights into the evolution of sex chromosomes in flatfish and sheds light on the important role of whole-genome duplication in shaping the evolution of sex chromosomes.
Affiliation(s)
- Xi-Wen Xu
  - State Key Laboratory of Mariculture Biobreeding and Sustainable Goods, Yellow Sea Fisheries Research Institute, Chinese Academy of Fishery Sciences, Qingdao, 266071, China
  - Laboratory for Marine Fisheries Science and Food Production Processes, Laoshan Laboratory, Qingdao, 266237, China
- Pengchuan Sun
  - Key Laboratory for Bio-resources and Eco-environment & Sichuan Zoige Alpine Wetland Ecosystem National Observation and Research Station, College of Life Sciences, Sichuan University, Chengdu, 610065, China
- Chengbin Gao
  - State Key Laboratory of Mariculture Biobreeding and Sustainable Goods, Yellow Sea Fisheries Research Institute, Chinese Academy of Fishery Sciences, Qingdao, 266071, China
- Weiwei Zheng
  - State Key Laboratory of Mariculture Biobreeding and Sustainable Goods, Yellow Sea Fisheries Research Institute, Chinese Academy of Fishery Sciences, Qingdao, 266071, China
- Songlin Chen
  - State Key Laboratory of Mariculture Biobreeding and Sustainable Goods, Yellow Sea Fisheries Research Institute, Chinese Academy of Fishery Sciences, Qingdao, 266071, China
  - Laboratory for Marine Fisheries Science and Food Production Processes, Laoshan Laboratory, Qingdao, 266237, China
|
15
|
Yang Y, Wu Z, Wu Z, Li T, Shen Z, Zhou X, Wu X, Li G, Zhang Y. A near-complete assembly of asparagus bean provides insights into anthocyanin accumulation in pods. Plant Biotechnol J 2023; 21:2473-2489. [PMID: 37558431 PMCID: PMC10651155 DOI: 10.1111/pbi.14142] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/17/2022] [Revised: 07/11/2023] [Accepted: 07/23/2023] [Indexed: 08/11/2023]
Abstract
Asparagus bean (Vigna unguiculata ssp. sesquipedalis), a subspecies of V. unguiculata, is a vital legume crop widely cultivated in Asia for its tender pods consumed as vegetables. However, the existing asparagus bean assemblies still contain numerous gaps and unanchored sequences, which presents challenges to functional genomics research. Here, we present an improved reference genome sequence of an elite asparagus bean variety, Fengchan 6, achieved through the integration of nanopore ultra-long reads, PacBio high-fidelity reads, and Hi-C technology. The improved assembly is 521.3 Mb in length and demonstrates several enhancements, including a higher N50 length (46.4 Mb), an anchor ratio of 99.8%, and the presence of only one gap. Furthermore, we successfully assembled 14 telomeres and all 11 centromeres, including four telomere-to-telomere chromosomes. Remarkably, the centromeric regions cover a total length of 38.1 Mb, providing valuable insights into the complex architecture of centromeres. Among the 30 594 predicted protein-coding genes, we identified 2356 genes that are tandemly duplicated in segmental duplication regions. These findings have implications for defence responses and may contribute to evolutionary processes. By utilizing the reference genome, we were able to effectively identify the presence of the gene VuMYB114, which regulates the accumulation of anthocyanins, thereby controlling the purple coloration of the pods. This discovery holds significant implications for understanding the underlying mechanisms of color determination and the breeding process. Overall, the highly improved reference genome serves as a crucial resource and lays a solid foundation for asparagus bean genomic studies and genetic improvement efforts.
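The N50 length quoted in this abstract is a standard contiguity statistic: the length of the sequence at which the cumulative sum of lengths, taken from longest to shortest, first reaches half of the total assembly size. A minimal sketch of that calculation follows; the function name `n50` and the toy lengths are assumptions made for illustration and are not taken from the paper.

```python
def n50(lengths):
    """Return the N50 of a list of contig or scaffold lengths."""
    if not lengths:
        raise ValueError("lengths must not be empty")
    total = sum(lengths)
    running = 0
    # Walk from the longest sequence down until half the total is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Toy lengths in Mb, invented for illustration (not the Fengchan 6 data).
toy_lengths = [50, 46, 40, 30, 20, 10, 4]
print(n50(toy_lengths))
```

A higher N50 for the same total length means that fewer, longer sequences make up half of the assembly.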
Affiliation(s)
- Yi Yang
  - Vegetable Research Institute, Guangdong Academy of Agricultural Sciences, Guangzhou, China
  - Guangdong Key Laboratory for New Technology Research of Vegetables, Guangzhou, China
- Zhikun Wu
  - State Key Laboratory of Ophthalmology, Zhongshan Ophthalmic Center, Sun Yat-Sen University, Guangzhou, China
- Zengxiang Wu
  - Vegetable Research Institute, Guangdong Academy of Agricultural Sciences, Guangzhou, China
  - Guangdong Key Laboratory for New Technology Research of Vegetables, Guangzhou, China
- Tinyao Li
  - Vegetable Research Institute, Guangdong Academy of Agricultural Sciences, Guangzhou, China
  - Guangdong Key Laboratory for New Technology Research of Vegetables, Guangzhou, China
- Zhuo Shen
  - Vegetable Research Institute, Guangdong Academy of Agricultural Sciences, Guangzhou, China
  - Guangdong Key Laboratory for New Technology Research of Vegetables, Guangzhou, China
- Xuan Zhou
  - Vegetable Research Institute, Guangdong Academy of Agricultural Sciences, Guangzhou, China
  - Guangdong Key Laboratory for New Technology Research of Vegetables, Guangzhou, China
- Xinyi Wu
  - Institute of Vegetable, Zhejiang Academy of Agricultural Sciences, Hangzhou, China
- Guojing Li
  - Institute of Vegetable, Zhejiang Academy of Agricultural Sciences, Hangzhou, China
- Yan Zhang
  - Vegetable Research Institute, Guangdong Academy of Agricultural Sciences, Guangzhou, China
  - Guangdong Key Laboratory for New Technology Research of Vegetables, Guangzhou, China
|
16
|
Makova KD, Pickett BD, Harris RS, Hartley GA, Cechova M, Pal K, Nurk S, Yoo D, Li Q, Hebbar P, McGrath BC, Antonacci F, Aubel M, Biddanda A, Borchers M, Bomberg E, Bouffard GG, Brooks SY, Carbone L, Carrel L, Carroll A, Chang PC, Chin CS, Cook DE, Craig SJ, de Gennaro L, Diekhans M, Dutra A, Garcia GH, Grady PG, Green RE, Haddad D, Hallast P, Harvey WT, Hickey G, Hillis DA, Hoyt SJ, Jeong H, Kamali K, Kosakovsky Pond SL, LaPolice TM, Lee C, Lewis AP, Loh YHE, Masterson P, McCoy RC, Medvedev P, Miga KH, Munson KM, Pak E, Paten B, Pinto BJ, Potapova T, Rhie A, Rocha JL, Ryabov F, Ryder OA, Sacco S, Shafin K, Shepelev VA, Slon V, Solar SJ, Storer JM, Sudmant PH, Sweetalana, Sweeten A, Tassia MG, Thibaud-Nissen F, Ventura M, Wilson MA, Young AC, Zeng H, Zhang X, Szpiech ZA, Huber CD, Gerton JL, Yi SV, Schatz MC, Alexandrov IA, Koren S, O’Neill RJ, Eichler E, Phillippy AM. The Complete Sequence and Comparative Analysis of Ape Sex Chromosomes. bioRxiv 2023:2023.11.30.569198. [PMID: 38077089 PMCID: PMC10705393 DOI: 10.1101/2023.11.30.569198] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/24/2023]
Abstract
Apes possess two sex chromosomes: the male-specific Y and the X shared by males and females. The Y chromosome is crucial for male reproduction, with deletions linked to infertility. The X chromosome carries genes vital for reproduction and cognition. Variation in mating patterns and brain function among great apes suggests corresponding differences in their sex chromosome structure and evolution. However, due to their highly repetitive nature and incomplete reference assemblies, ape sex chromosomes have been challenging to study. Here, using the state-of-the-art experimental and computational methods developed for the telomere-to-telomere (T2T) human genome, we produced gapless, complete assemblies of the X and Y chromosomes for five great apes (chimpanzee, bonobo, gorilla, Bornean and Sumatran orangutans) and a lesser ape, the siamang gibbon. These assemblies completely resolved ampliconic, palindromic, and satellite sequences, including the entire centromeres, allowing us to untangle the intricacies of ape sex chromosome evolution. We found that, compared to the X, ape Y chromosomes vary greatly in size and have low alignability and high levels of structural rearrangements. This divergence on the Y arises from the accumulation of lineage-specific ampliconic regions and palindromes (which are shared more broadly among species on the X) and from the abundance of transposable elements and satellites (which have a lower representation on the X). Our analysis of Y chromosome genes revealed lineage-specific expansions of multi-copy gene families and signatures of purifying selection. In summary, the Y exhibits dynamic evolution, while the X is more stable. Finally, mapping short-read sequencing data from >100 great ape individuals revealed the patterns of diversity and selection on their sex chromosomes, demonstrating the utility of these reference assemblies for studies of great ape evolution.
These complete sex chromosome assemblies are expected to further inform conservation genetics of nonhuman apes, all of which are endangered species.
Affiliation(s)
- Brandon D. Pickett
  - National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
- Monika Cechova
  - University of California Santa Cruz, Santa Cruz, CA, USA
- Karol Pal
  - Penn State University, University Park, PA, USA
- Sergey Nurk
  - National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
- DongAhn Yoo
  - University of Washington School of Medicine, Seattle, WA, USA
- Qiuhui Li
  - Johns Hopkins University, Baltimore, MD, USA
- Prajna Hebbar
  - University of California Santa Cruz, Santa Cruz, CA, USA
- Erich Bomberg
  - University of Münster, Münster, Germany
  - MPI for Developmental Biology, Tübingen, Germany
- Gerard G. Bouffard
  - National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
- Shelise Y. Brooks
  - National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
- Lucia Carbone
  - Oregon Health & Science University, Portland, OR, USA
  - Oregon National Primate Research Center, Hillsboro, OR, USA
- Laura Carrel
  - Penn State University School of Medicine, Hershey, PA, USA
- Chen-Shan Chin
  - Foundation of Biological Data Sciences, Belmont, CA, USA
- Mark Diekhans
  - University of California Santa Cruz, Santa Cruz, CA, USA
- Amalia Dutra
  - National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
- Gage H. Garcia
  - University of Washington School of Medicine, Seattle, WA, USA
- Diana Haddad
  - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, USA
- Pille Hallast
  - The Jackson Laboratory for Genomic Medicine, Farmington, CT, USA
- Glenn Hickey
  - University of California Santa Cruz, Santa Cruz, CA, USA
- David A. Hillis
  - University of California Santa Barbara, Santa Barbara, CA, USA
- Hyeonsoo Jeong
  - University of Washington School of Medicine, Seattle, WA, USA
- Charles Lee
  - The Jackson Laboratory for Genomic Medicine, Farmington, CT, USA
- Patrick Masterson
  - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, USA
- Karen H. Miga
  - University of California Santa Cruz, Santa Cruz, CA, USA
- Evgenia Pak
  - National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
- Benedict Paten
  - University of California Santa Cruz, Santa Cruz, CA, USA
- Arang Rhie
  - National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
- Fedor Ryabov
  - Masters Program in National Research University Higher School of Economics, Moscow, Russia
- Samuel Sacco
  - University of California Santa Cruz, Santa Cruz, CA, USA
- Steven J. Solar
  - National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
- Sweetalana
  - Penn State University, University Park, PA, USA
- Alex Sweeten
  - National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
  - Johns Hopkins University, Baltimore, MD, USA
- Françoise Thibaud-Nissen
  - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, USA
- Alice C. Young
  - National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
- Xinru Zhang
  - Penn State University, University Park, PA, USA
- Soojin V. Yi
  - University of California Santa Barbara, Santa Barbara, CA, USA
- Sergey Koren
  - National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
- Evan Eichler
  - University of Washington School of Medicine, Seattle, WA, USA
  - Howard Hughes Medical Institute, University of Washington, Seattle, WA, USA
- Adam M. Phillippy
  - National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
|
17
|
Rice ES, Alberdi A, Alfieri J, Athrey G, Balacco JR, Bardou P, Blackmon H, Charles M, Cheng HH, Fedrigo O, Fiddaman SR, Formenti G, Frantz LAF, Gilbert MTP, Hearn CJ, Jarvis ED, Klopp C, Marcos S, Mason AS, Velez-Irizarry D, Xu L, Warren WC. A pangenome graph reference of 30 chicken genomes allows genotyping of large and complex structural variants. BMC Biol 2023; 21:267. [PMID: 37993882 PMCID: PMC10664547 DOI: 10.1186/s12915-023-01758-0] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 07/14/2023] [Accepted: 11/02/2023] [Indexed: 11/24/2023] Open
Abstract
BACKGROUND: The red junglefowl, the wild outgroup of domestic chickens, has historically served as a reference for genomic studies of domestic chickens. These studies have provided insight into the etiology of traits of commercial importance. However, the use of a single reference genome does not capture diversity present among modern breeds, many of which have accumulated molecular changes due to drift and selection. While reference-based resequencing is well-suited to cataloging simple variants such as single-nucleotide changes and short insertions and deletions, it is mostly inadequate to discover more complex structural variation in the genome.
METHODS: We present a pangenome for the domestic chicken consisting of thirty assemblies of chickens from different breeds and research lines.
RESULTS: We demonstrate how this pangenome can be used to catalog structural variants present in modern breeds and untangle complex nested variation. We show that alignment of short reads from 100 diverse wild and domestic chickens to this pangenome reduces reference bias by 38%, which affects downstream genotyping results. This approach also allows for the accurate genotyping of a large and complex pair of structural variants at the K feathering locus using short reads, which would not be possible using a linear reference.
CONCLUSIONS: We expect that this new paradigm of genomic reference will allow better pinpointing of exact mutations responsible for specific phenotypes, which will in turn be necessary for breeding chickens that meet new sustainability criteria and are resilient to quickly evolving pathogen threats.
Affiliation(s)
- Edward S Rice
  - Bond Life Sciences Center, University of Missouri, Columbia, MO, USA
  - Faculty of Veterinary Medicine, Ludwig-Maximilians-Universität, Munich, Germany
- Antton Alberdi
  - Center for Evolutionary Hologenomics, Globe Institute, University of Copenhagen (UCPH), Copenhagen, Denmark
- James Alfieri
  - Department of Ecology & Evolutionary Biology, Texas A&M University, College Station, TX, USA
- Giridhar Athrey
  - Department of Poultry Science, Texas A&M University, College Station, TX, USA
- Jennifer R Balacco
  - Vertebrate Genome Laboratory, The Rockefeller University, New York, NY, USA
- Philippe Bardou
  - Sigenae, GenPhySE, Université de Toulouse, INRAE, ENVT, Castanet Tolosan, 31326, France
- Heath Blackmon
  - Department of Biology, Texas A&M University, College Station, TX, USA
- Mathieu Charles
  - University Paris-Saclay, INRAE, AgroParisTech, GABI, Sigenae, Jouy-en-Josas, France
- Hans H Cheng
  - Avian Disease and Oncology Laboratory, USDA, ARS, USNPRC, East Lansing, MI, USA
- Olivier Fedrigo
  - Vertebrate Genome Laboratory, The Rockefeller University, New York, NY, USA
- Giulio Formenti
  - Vertebrate Genome Laboratory, The Rockefeller University, New York, NY, USA
- Laurent A F Frantz
  - Faculty of Veterinary Medicine, Ludwig-Maximilians-Universität, Munich, Germany
  - School of Biological and Behavioural Sciences, Queen Mary University of London, London, E1 4DQ, UK
- M Thomas P Gilbert
  - Center for Evolutionary Hologenomics, Globe Institute, University of Copenhagen (UCPH), Copenhagen, Denmark
- Cari J Hearn
  - Avian Disease and Oncology Laboratory, USDA, ARS, USNPRC, East Lansing, MI, USA
- Erich D Jarvis
  - Vertebrate Genome Laboratory, The Rockefeller University, New York, NY, USA
  - The Howard Hughes Medical Institute, Chevy Chase, MD, USA
- Christophe Klopp
  - Sigenae, Genotoul Bioinfo, MIAT UR875, INRAE, Castanet Tolosan, France
- Sofia Marcos
  - Center for Evolutionary Hologenomics, Globe Institute, University of Copenhagen (UCPH), Copenhagen, Denmark
  - Applied Genomics and Bioinformatics, University of the Basque Country (UPV/EHU), Leioa, Bilbao, Spain
- Luohao Xu
  - Key Laboratory of Freshwater Fish Reproduction and Development (Ministry of Education), Key Laboratory of Aquatic Science of Chongqing, School of Life Sciences, Southwest University, Chongqing, 400715, China
- Wesley C Warren
  - Department of Animal Sciences, University of Missouri, Columbia, MO, USA
|
18
|
Fruzangohar M, Moolhuijzen P, Bakaj N, Taylor J. CoreDetector: a flexible and efficient program for core-genome alignment of evolutionary diverse genomes. Bioinformatics 2023; 39:btad628. [PMID: 37878789 PMCID: PMC10663985 DOI: 10.1093/bioinformatics/btad628] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/29/2023] [Revised: 09/20/2023] [Accepted: 10/23/2023] [Indexed: 10/27/2023] Open
Abstract
MOTIVATION: Whole genome alignment of eukaryote species remains an important method for the determination of sequence and structural variations and can also be used to ascertain the representative non-redundant core-genome sequence of a population. Many whole genome alignment tools were first developed for the more mature analysis of prokaryote species with few current tools containing the functionality to process larger genomes of eukaryotes as well as genomes of more divergent species. In addition, the functionality of these tools becomes computationally prohibitive due to the significant compute resources needed to handle larger genomes.
RESULTS: In this research, we present CoreDetector, an easy-to-use general-purpose program that can align the core-genome sequences for a range of genome sizes and divergence levels. To illustrate the flexibility of CoreDetector, we conducted alignments of a large set of closely related fungal pathogen and hexaploid wheat cultivar genomes as well as more divergent fly and rodent species genomes. In all cases, compared to existing multiple genome alignment tools, CoreDetector exhibited improved flexibility, efficiency, and competitive accuracy in tested cases.
AVAILABILITY AND IMPLEMENTATION: CoreDetector was developed in the cross platform, and easily deployable, Java language. A packaged pipeline is readily executable in a bash terminal without any external need for Perl or Python environments. Installation, example data, and usage instructions for CoreDetector are freely available from https://github.com/mfruzan/CoreDetector.
Affiliation(s)
- Mario Fruzangohar
  - The Biometry Hub, School of Agriculture, Food and Wine, University of Adelaide, Urrbrae, South Australia 5064, Australia
- Paula Moolhuijzen
  - Centre for Crop Disease Management, School of Molecular and Life Sciences, Curtin University, Bentley, Western Australia 6102, Australia
- Nicolette Bakaj
  - The Biometry Hub, School of Agriculture, Food and Wine, University of Adelaide, Urrbrae, South Australia 5064, Australia
- Julian Taylor
  - The Biometry Hub, School of Agriculture, Food and Wine, University of Adelaide, Urrbrae, South Australia 5064, Australia
|
19
|
Rautiainen M, Nurk S, Walenz BP, Logsdon GA, Porubsky D, Rhie A, Eichler EE, Phillippy AM, Koren S. Telomere-to-telomere assembly of diploid chromosomes with Verkko. Nat Biotechnol 2023; 41:1474-1482. [PMID: 36797493 PMCID: PMC10427740 DOI: 10.1038/s41587-023-01662-6] [Citation(s) in RCA: 90] [Impact Index Per Article: 90.0] [Received: 06/24/2022] [Accepted: 01/03/2023] [Indexed: 02/18/2023]
Abstract
The Telomere-to-Telomere consortium recently assembled the first truly complete sequence of a human genome. To resolve the most complex repeats, this project relied on manual integration of ultra-long Oxford Nanopore sequencing reads with a high-resolution assembly graph built from long, accurate PacBio high-fidelity reads. We have improved and automated this strategy in Verkko, an iterative, graph-based pipeline for assembling complete, diploid genomes. Verkko begins with a multiplex de Bruijn graph built from long, accurate reads and progressively simplifies this graph by integrating ultra-long reads and haplotype-specific markers. The result is a phased, diploid assembly of both haplotypes, with many chromosomes automatically assembled from telomere to telomere. Running Verkko on the HG002 human genome resulted in 20 of 46 diploid chromosomes assembled without gaps at 99.9997% accuracy. The complete assembly of diploid genomes is a critical step towards the construction of comprehensive pangenome databases and chromosome-scale comparative genomics.
Affiliation(s)
- Mikko Rautiainen
  - Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
- Sergey Nurk
  - Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
  - Oxford Nanopore Technologies, Oxford, UK
- Brian P Walenz
  - Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
- Glennis A Logsdon
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- David Porubsky
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Arang Rhie
  - Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
- Evan E Eichler
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
  - Howard Hughes Medical Institute, University of Washington, Seattle, WA, USA
- Adam M Phillippy
  - Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
- Sergey Koren
  - Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
|
20
|
Bae SH, Lee MH, Lee JH, Yu Y, Lee J, Kim TH. The Genome of the Korean Island-Originated Perilla citriodora 'Jeju17' Sheds Light on Its Environmental Adaptation and Fatty Acid and Lipid Production Pathways. Genes (Basel) 2023; 14:1898. [PMID: 37895247 PMCID: PMC10606934 DOI: 10.3390/genes14101898] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/29/2023] [Revised: 09/27/2023] [Accepted: 09/28/2023] [Indexed: 10/29/2023] Open
Abstract
Perilla is a key component of Korean food. It contains several plant-specialized metabolites that provide medical benefits. In response to an increased interest in healthy supplement food from the public, people are focusing on the properties of Perilla. Nevertheless, unlike rice and soybeans, there are few studies based on molecular genetics on Perilla, so it is difficult to systematically study the molecular breed. The wild Perilla, Perilla citriodora 'Jeju17', was identified a decade ago on the Korean island of Jeju. Using short-reads, long-reads, and Hi-C, a chromosome-scale genome spanning 676 Mbp, with high contiguity, was assembled. Aligning the 'Jeju17' genome to the 'PC002' Chinese species revealed significant collinearity with respect to the total length. A total of 31,769 coding sequences were predicted, among which 3331 were 'Jeju17'-specific. Gene enrichment of the species-specific gene repertoire highlighted environment adaptation, fatty acid metabolism, and plant-specialized metabolite biosynthesis. Using a homology-based approach, genes involved in fatty acid and lipid triacylglycerol biosynthesis were identified. A total of 22 fatty acid desaturases were found and comprehensively characterized. Expression of the FAD genes in 'Jeju17' was examined at the seed level, and hormone signaling factors were identified. The results showed that the expression of FAD genes in 'Jeju17' at the seed level was high 25 days after flowering, and their responses of hormones and stress were mainly associated with hormone signal transduction and abiotic stress via cis-elements patterns. This study presents a chromosome-level genome assembly of P. citriodora 'Jeju17', the first wild Perilla to be sequenced from the Korean island of Jeju. The analyses provided can be useful in designing ALA-enhanced Perilla genotypes in the future.
Affiliation(s)
- Seon-Hwa Bae
  - Genomics Division, Department of Agricultural Biotechnology, National Institute of Agricultural Sciences, Rural Development Administration, Jeonju 54874, Republic of Korea
- Myoung Hee Lee
  - Upland Crop Breeding Research Division, Department of Southern Area Crop Science, Rural Development Administration (RDA), Miryang 50424, Republic of Korea
- Jeong-Hee Lee
  - SEEDERS Inc., 118, Jungang-ro, Jung-gu, Daejeon 34912, Republic of Korea
- Yeisoo Yu
  - DNACARE Co., Ltd., 48, Teheran-ro 25-gil, Gangnam-gu, Seoul 06126, Republic of Korea
- Jundae Lee
  - Department of Horticulture, College of Agriculture and Life Sciences, Jeonbuk National University, Jeonju 54896, Republic of Korea
- Tae-Ho Kim
  - Genomics Division, Department of Agricultural Biotechnology, National Institute of Agricultural Sciences, Rural Development Administration, Jeonju 54874, Republic of Korea
|
21
|
Kille B, Garrison E, Treangen TJ, Phillippy AM. Minmers are a generalization of minimizers that enable unbiased local Jaccard estimation. Bioinformatics 2023; 39:btad512. [PMID: 37603771 PMCID: PMC10505501 DOI: 10.1093/bioinformatics/btad512] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/18/2023] [Revised: 07/19/2023] [Accepted: 08/18/2023] [Indexed: 08/23/2023] Open
Abstract
MOTIVATION: The Jaccard similarity on k-mer sets has shown to be a convenient proxy for sequence identity. By avoiding expensive base-level alignments and comparing reduced sequence representations, tools such as MashMap can scale to massive numbers of pairwise comparisons while still providing useful similarity estimates. However, due to their reliance on minimizer winnowing, previous versions of MashMap were shown to be biased and inconsistent estimators of Jaccard similarity. This directly impacts downstream tools that rely on the accuracy of these estimates.
RESULTS: To address this, we propose the minmer winnowing scheme, which generalizes the minimizer scheme by use of a rolling minhash with multiple sampled k-mers per window. We show both theoretically and empirically that minmers yield an unbiased estimator of local Jaccard similarity, and we implement this scheme in an updated version of MashMap. The minmer-based implementation is over 10 times faster than the minimizer-based version under the default ANI threshold, making it well-suited for large-scale comparative genomics applications.
AVAILABILITY AND IMPLEMENTATION: MashMap3 is available at https://github.com/marbl/MashMap.
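The ideas named in this abstract can be illustrated with a toy sketch: the exact Jaccard similarity of two k-mer sets, and a window-based sampling in which the `s` lowest-hash k-mers are kept from each window of `w` consecutive k-mers (with `s = 1` this reduces to minimizer sampling). This is a simplified illustration only, not the MashMap3 implementation; the function names, the BLAKE2b hash, the parameter values, and the example sequences are all assumptions chosen for the example.

```python
import hashlib


def kmers(seq, k):
    """Return every overlapping k-mer of seq, in order."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def jaccard(a, b, k):
    """Exact Jaccard similarity between the k-mer sets of two sequences."""
    set_a, set_b = set(kmers(a, k)), set(kmers(b, k))
    if not set_a and not set_b:
        return 1.0  # convention chosen here for two empty sets
    return len(set_a & set_b) / len(set_a | set_b)


def kmer_hash(kmer):
    """Deterministic 64-bit hash (an illustrative choice, not MashMap's)."""
    digest = hashlib.blake2b(kmer.encode(), digest_size=8).digest()
    return int.from_bytes(digest, "big")


def minmers(seq, k, w, s):
    """Sample the s lowest-hash distinct k-mers from each window of w k-mers.

    With s == 1 this reduces to minimizer sampling.
    """
    all_kmers = kmers(seq, k)
    sampled = set()
    for start in range(max(1, len(all_kmers) - w + 1)):
        window = set(all_kmers[start:start + w])
        sampled.update(sorted(window, key=kmer_hash)[:s])
    return sampled


# Two made-up sequences that differ at a single position.
a = "ACGTACGGTCAGTTACGATCG"
b = "ACGTACGGTCAGATACGATCG"
print("exact Jaccard:", jaccard(a, b, k=4))
print("minimizers (s=1):", sorted(minmers(a, k=4, w=5, s=1)))
print("minmers (s=2):", sorted(minmers(a, k=4, w=5, s=2)))
```

Raising `s` keeps more k-mers per window, so the sampled set grows toward the full k-mer set; how such samples are turned into a similarity estimate is described in the paper and is not reproduced here.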
Affiliation(s)
- Bryce Kille
  - Department of Computer Science, Rice University, Houston, TX, United States
- Erik Garrison
  - Department of Genetics, Genomics and Informatics, University of Tennessee Health Science Center, Memphis, TN, United States
- Todd J Treangen
  - Department of Computer Science, Rice University, Houston, TX, United States
- Adam M Phillippy
  - Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, United States
|
22
|
Fawcett JA, Takeshima R, Kikuchi S, Yazaki E, Katsube-Tanaka T, Dong Y, Li M, Hunt HV, Jones MK, Lister DL, Ohsako T, Ogiso-Tanaka E, Fujii K, Hara T, Matsui K, Mizuno N, Nishimura K, Nakazaki T, Saito H, Takeuchi N, Ueno M, Matsumoto D, Norizuki M, Shirasawa K, Li C, Hirakawa H, Ota T, Yasui Y. Genome sequencing reveals the genetic architecture of heterostyly and domestication history of common buckwheat. Nat Plants 2023; 9:1236-1251. [PMID: 37563460 DOI: 10.1038/s41477-023-01474-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/26/2023] [Accepted: 07/03/2023] [Indexed: 08/12/2023]
Abstract
Common buckwheat, Fagopyrum esculentum, is an orphan crop domesticated in southwest China that exhibits heterostylous self-incompatibility. Here we present chromosome-scale assemblies of a self-compatible F. esculentum accession and a self-compatible wild relative, Fagopyrum homotropicum, together with the resequencing of 104 wild and cultivated F. esculentum accessions. Using these genomic data, we report the roles of transposable elements and whole-genome duplications in the evolution of Fagopyrum. In addition, we show that (1) the breakdown of heterostyly occurs through the disruption of a hemizygous gene jointly regulating the style length and female compatibility and (2) southeast Tibet was involved in common buckwheat domestication. Moreover, we obtained mutants conferring the waxy phenotype for the first time in buckwheat. These findings demonstrate the utility of our F. esculentum assembly as a reference genome and promise to accelerate buckwheat research and breeding.
Collapse
Affiliation(s)
- Ryoma Takeshima
  - Institute of Crop Science, National Agriculture and Food Research Organization (NARO), Tsukuba, Japan
- Shinji Kikuchi
  - Graduate School of Horticulture, Chiba University, Matsudo, Japan
  - Plant Molecular Science Center, Chiba University, Chiba, Japan
- Yumei Dong
  - State Key Laboratory for Conservation and Utilization of Bio-Resources in Yunnan, Yunnan Agricultural University, Kunming, China
- Meifang Li
  - State Key Laboratory for Conservation and Utilization of Bio-Resources in Yunnan, Yunnan Agricultural University, Kunming, China
- Harriet V Hunt
  - McDonald Institute for Archaeological Research, University of Cambridge, Cambridge, UK
  - Royal Botanic Gardens Kew, Richmond, UK
- Martin K Jones
  - McDonald Institute for Archaeological Research, University of Cambridge, Cambridge, UK
- Diane L Lister
  - McDonald Institute for Archaeological Research, University of Cambridge, Cambridge, UK
  - Conservation Research Institute, University of Cambridge, Cambridge, UK
- Takanori Ohsako
  - Graduate School of Life and Environmental Sciences, Kyoto Prefectural University, Kyoto, Japan
- Eri Ogiso-Tanaka
  - Institute of Crop Science, National Agriculture and Food Research Organization (NARO), Tsukuba, Japan
  - Center for Molecular Biodiversity Research, National Museum of Nature and Science, Tsukuba, Japan
- Kenichiro Fujii
  - Institute of Crop Science, National Agriculture and Food Research Organization (NARO), Tsukuba, Japan
- Takashi Hara
  - Hokkaido Agricultural Research Center, National Agriculture and Food Research Organization (NARO), Kasai, Japan
- Katsuhiro Matsui
  - Institute of Crop Science, National Agriculture and Food Research Organization (NARO), Tsukuba, Japan
  - Institute of Life and Environmental Sciences, University of Tsukuba, Tsukuba, Japan
- Nobuyuki Mizuno
  - Graduate School of Agriculture, Kyoto University, Kyoto, Japan
- Hiroki Saito
  - Graduate School of Agriculture, Kyoto University, Kyoto, Japan
  - Tropical Agriculture Research Front, Japan International Research Center for Agricultural Sciences, Ishigaki, Japan
- Naoko Takeuchi
  - Graduate School of Agriculture, Kyoto University, Kyoto, Japan
- Mariko Ueno
  - Graduate School of Agriculture, Kyoto University, Kyoto, Japan
- Daiki Matsumoto
  - Faculty of Bioscience and Biotechnology, Fukui Prefectural University, Awara, Japan
- Miyu Norizuki
  - Graduate School of Horticulture, Chiba University, Matsudo, Japan
- Chengyun Li
  - State Key Laboratory for Conservation and Utilization of Bio-Resources in Yunnan, Yunnan Agricultural University, Kunming, China
- Tatsuya Ota
  - Department of Evolutionary Studies of Biosystems, SOKENDAI, Hayama, Japan
  - Research Center for Integrative Evolutionary Science, SOKENDAI, Hayama, Japan
- Yasuo Yasui
  - Graduate School of Agriculture, Kyoto University, Kyoto, Japan
23
Mathers TC, Wouters RHM, Mugford ST, Biello R, van Oosterhout C, Hogenhout SA. Hybridisation has shaped a recent radiation of grass-feeding aphids. BMC Biol 2023; 21:157. [PMID: 37443008 PMCID: PMC10347838 DOI: 10.1186/s12915-023-01649-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/19/2022] [Accepted: 06/13/2023] [Indexed: 07/15/2023] Open
Abstract
BACKGROUND: Aphids are common crop pests. These insects reproduce by facultative parthenogenesis involving several rounds of clonal reproduction interspersed with an occasional sexual cycle. Furthermore, clonal aphids give birth to live young that are already pregnant. These qualities enable rapid population growth and have facilitated the colonisation of crops globally. In several cases, so-called "super clones" have come to dominate agricultural systems. However, the extent to which the sexual stage of the aphid life cycle has shaped global pest populations has remained unclear, as have the origins of successful lineages. Here, we used chromosome-scale genome assemblies to disentangle the evolution of two global pests of cereals: the English (Sitobion avenae) and Indian (Sitobion miscanthi) grain aphids.
RESULTS: Genome-wide divergence between S. avenae and S. miscanthi is low. Moreover, comparison of haplotype-resolved assemblies revealed that the S. miscanthi isolate used for genome sequencing is likely a hybrid, with one of its diploid genome copies closely related to S. avenae (~0.5% divergence) and the other substantially more divergent (>1%). Population genomics analyses of UK and China grain aphids showed that S. avenae and S. miscanthi are part of a cryptic species complex with many highly differentiated lineages that predate the origins of agriculture. The complex consists of hybrid lineages that display a tangled history of hybridisation and genetic introgression.
CONCLUSIONS: Our analyses reveal that hybridisation has substantially contributed to grain aphid diversity, and hence, to the evolutionary potential of this important pest species. Furthermore, we propose that aphids are particularly well placed to exploit hybridisation events via the rapid propagation of live-born "frozen hybrids" via asexual reproduction, increasing the likelihood of hybrid lineage formation.
Affiliation(s)
- Thomas C Mathers
  - Department of Crop Genetics, John Innes Centre, Norwich Research Park, Norwich, UK
  - Tree of Life, Wellcome Sanger Institute, Hinxton, Cambridge, UK
- Roland H M Wouters
  - Department of Crop Genetics, John Innes Centre, Norwich Research Park, Norwich, UK
- Sam T Mugford
  - Department of Crop Genetics, John Innes Centre, Norwich Research Park, Norwich, UK
- Roberto Biello
  - Department of Crop Genetics, John Innes Centre, Norwich Research Park, Norwich, UK
- Saskia A Hogenhout
  - Department of Crop Genetics, John Innes Centre, Norwich Research Park, Norwich, UK
24
Ekim B, Sahlin K, Medvedev P, Berger B, Chikhi R. Efficient mapping of accurate long reads in minimizer space with mapquik. Genome Res 2023; 33:1188-1197. [PMID: 37399256 PMCID: PMC10538364 DOI: 10.1101/gr.277679.123] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/08/2023] [Accepted: 06/26/2023] [Indexed: 07/05/2023]
Abstract
DNA sequencing data continue to progress toward longer reads with increasingly lower sequencing error rates. We focus on the critical problem of mapping, or aligning, low-divergence sequences from long reads (e.g., Pacific Biosciences [PacBio] HiFi) to a reference genome, which poses challenges in terms of accuracy and computational resources when using cutting-edge read mapping approaches that are designed for all types of alignments. A natural idea would be to optimize efficiency with longer seeds to reduce the probability of extraneous matches; however, contiguous exact seeds quickly reach a sensitivity limit. We introduce mapquik, a novel strategy that creates accurate longer seeds by anchoring alignments through matches of k consecutively sampled minimizers (k-min-mers) and only indexing k-min-mers that occur once in the reference genome, thereby unlocking ultrafast mapping while retaining high sensitivity. We show that mapquik significantly accelerates the seeding and chaining steps, which are fundamental bottlenecks to read mapping, for both the human and maize genomes with [Formula: see text] sensitivity and near-perfect specificity. On the human genome, for both real and simulated reads, mapquik achieves a [Formula: see text] speedup over the state-of-the-art tool minimap2, and on the maize genome, mapquik achieves a [Formula: see text] speedup over minimap2, making mapquik the fastest mapper to date. These accelerations are enabled not only by minimizer-space seeding but also by a novel heuristic [Formula: see text] pseudochaining algorithm, which improves upon the long-standing [Formula: see text] bound. Minimizer-space computation builds the foundation for achieving real-time analysis of long-read sequencing data.
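The seeding idea in this abstract (seeds made of k consecutive sampled minimizers, indexed only when they occur once in the reference) can be illustrated with a toy sketch. This is not mapquik's implementation: the function names, the CRC32 hash, the density-threshold sampling rule, and the parameter values `k`, `l`, and `density` are all assumptions chosen for illustration, and the chaining step is omitted entirely.

```python
import random
import zlib
from collections import defaultdict


def sampled_minimizers(seq, l=8, density=0.2):
    """Return (position, l-mer) pairs whose hash falls below a density threshold.

    Assumption: an l-mer is sampled when its CRC32 value is in the lowest
    `density` fraction of the 32-bit range, so sampling is context-free.
    """
    threshold = int(density * 2**32)
    picked = []
    for i in range(len(seq) - l + 1):
        lmer = seq[i:i + l]
        if zlib.crc32(lmer.encode()) < threshold:
            picked.append((i, lmer))
    return picked


def k_min_mers(seq, k=3, l=8, density=0.2):
    """Yield (start position, tuple of k consecutive sampled minimizers)."""
    mins = sampled_minimizers(seq, l, density)
    for j in range(len(mins) - k + 1):
        window = mins[j:j + k]
        yield window[0][0], tuple(m for _, m in window)


def build_unique_index(reference, k=3, l=8, density=0.2):
    """Index only the k-min-mers that occur exactly once in the reference."""
    seen = defaultdict(list)
    for pos, kmm in k_min_mers(reference, k, l, density):
        seen[kmm].append(pos)
    return {kmm: hits[0] for kmm, hits in seen.items() if len(hits) == 1}


def map_read(read, index, k=3, l=8, density=0.2):
    """Return one candidate reference offset for the read start per seed hit."""
    return [index[kmm] - pos
            for pos, kmm in k_min_mers(read, k, l, density)
            if kmm in index]


# Toy data: a random reference and an error-free read taken from position 1200.
random.seed(0)
reference = "".join(random.choice("ACGT") for _ in range(5000))
read = reference[1200:1600]
offsets = map_read(read, build_unique_index(reference))
```

Because each seed spans several sampled l-mers, a hit is far more specific than a single short exact match, and restricting the index to unique k-min-mers means every hit votes for a single reference location.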
Affiliation(s)
- Barış Ekim
  - Computer Science and Artificial Intelligence Laboratory (CSAIL), Massachusetts Institute of Technology (MIT), Cambridge, Massachusetts 02139, USA
  - Department of Mathematics, Massachusetts Institute of Technology (MIT), Cambridge, Massachusetts 02139, USA
- Kristoffer Sahlin
  - Department of Mathematics, Science for Life Laboratory, Stockholm University, SE-106 91 Stockholm, Sweden
- Paul Medvedev
  - Department of Computer Science and Engineering, The Pennsylvania State University, University Park, Pennsylvania 16802, USA
  - Department of Biochemistry and Molecular Biology, The Pennsylvania State University, University Park, Pennsylvania 16802, USA
  - Huck Institutes of the Life Sciences, The Pennsylvania State University, University Park, Pennsylvania 16802, USA
- Bonnie Berger
  - Computer Science and Artificial Intelligence Laboratory (CSAIL), Massachusetts Institute of Technology (MIT), Cambridge, Massachusetts 02139, USA
  - Department of Mathematics, Massachusetts Institute of Technology (MIT), Cambridge, Massachusetts 02139, USA
- Rayan Chikhi
  - Department of Computational Biology, Institut Pasteur, 75015 Paris, France
25
Kille B, Garrison E, Treangen TJ, Phillippy AM. Minmers are a generalization of minimizers that enable unbiased local Jaccard estimation. bioRxiv 2023:2023.05.16.540882. [PMID: 37325780 PMCID: PMC10268037 DOI: 10.1101/2023.05.16.540882] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/17/2023]
Abstract
Motivation: The Jaccard similarity on k-mer sets has been shown to be a convenient proxy for sequence identity. By avoiding expensive base-level alignments and comparing reduced sequence representations, tools such as MashMap can scale to massive numbers of pairwise comparisons while still providing useful similarity estimates. However, due to their reliance on minimizer winnowing, previous versions of MashMap were shown to be biased and inconsistent estimators of Jaccard similarity. This directly impacts downstream tools that rely on the accuracy of these estimates.
Results: To address this, we propose the minmer winnowing scheme, which generalizes the minimizer scheme by use of a rolling minhash with multiple sampled k-mers per window. We show both theoretically and empirically that minmers yield an unbiased estimator of local Jaccard similarity, and we implement this scheme in an updated version of MashMap. The minmer-based implementation is over 10 times faster than the minimizer-based version under the default ANI threshold, making it well-suited for large-scale comparative genomics applications.
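A small sketch may help make the two ideas in this abstract concrete: the Jaccard similarity of k-mer sets, and a winnowing scheme that keeps the s smallest-hash k-mers per window (with s = 1 reducing to ordinary minimizers). This is a naive illustration under stated assumptions, not MashMap's rolling implementation or its estimator: the function names, the CRC32 hash, and the parameters `k`, `w`, and `s` are hypothetical choices.

```python
import zlib


def kmers(seq, k):
    """All k-mers of seq, in order of occurrence."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def jaccard(a, b):
    """Exact Jaccard similarity of two collections treated as sets."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def hash_key(kmer):
    # Assumed hash: CRC32, with the k-mer itself as a deterministic tie-break.
    return (zlib.crc32(kmer.encode()), kmer)


def minmers(seq, k, w, s):
    """Union, over every window of w consecutive k-mers, of the s distinct
    k-mers with the smallest hash in that window (recomputed per window)."""
    ks = kmers(seq, k)
    chosen = set()
    for start in range(len(ks) - w + 1):
        window = set(ks[start:start + w])
        chosen.update(sorted(window, key=hash_key)[:s])
    return chosen


seq_a = "ACGTTGCATGCCGATAGCTAGGCTTACGATCG"
seq_b = "ACGTTGCATGCCGTTAGCTAGGCTTACGATCG"  # one substitution relative to seq_a

exact = jaccard(kmers(seq_a, 5), kmers(seq_b, 5))
minimizer_set = minmers(seq_a, k=5, w=6, s=1)   # s = 1: classic minimizers
minmer_set = minmers(seq_a, k=5, w=6, s=2)      # s = 2: two samples per window
```

Raising `s` can only add sampled k-mers to each window's selection, so the minimizer set is always contained in the minmer set for the same `k` and `w`.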
Affiliation(s)
- Bryce Kille
  - Department of Computer Science, Rice University, Houston, TX, USA
- Erik Garrison
  - Department of Genetics, Genomics and Informatics, University of Tennessee Health Science Center, Memphis, TN, USA
- Todd J Treangen
  - Department of Computer Science, Rice University, Houston, TX, USA
- Adam M Phillippy
  - Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
26
Sedeek K, Zuccolo A, Fornasiero A, Weber AM, Sanikommu K, Sampathkumar S, Rivera LF, Butt H, Mussurova S, Alhabsi A, Nurmansyah N, Ryan EP, Wing RA, Mahfouz MM. Multi-omics resources for targeted agronomic improvement of pigmented rice. Nature Food 2023; 4:366-371. [PMID: 37169820 DOI: 10.1038/s43016-023-00742-9] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 07/05/2022] [Accepted: 03/24/2023] [Indexed: 05/13/2023]
Abstract
Pigmented rice (Oryza sativa L.) is a rich source of nutrients, but pigmented lines typically have long life cycles and limited productivity. Here we generated genome assemblies of 5 pigmented rice varieties and evaluated the genetic variation among 51 pigmented rice varieties by resequencing an additional 46 varieties. Phylogenetic analyses divided the pigmented varieties into four varietal groups: Geng-japonica, Xian-indica, circum-Aus and circum-Basmati. Metabolomics and ionomics profiling revealed that black rice varieties are rich in aromatic secondary metabolites. We established a regeneration and transformation system and used CRISPR-Cas9 to knock out three flowering time repressors (Hd2, Hd4 and Hd5) in the black Indonesian rice Cempo Ireng, resulting in an early maturing variety with shorter stature. Our study thus provides a multi-omics resource for understanding and improving Asian pigmented rice.
Affiliation(s)
- Khalid Sedeek
  - Laboratory for Genome Engineering and Synthetic Biology, Division of Biological Sciences, King Abdullah University of Science and Technology, Thuwal, Saudi Arabia
  - Center for Desert Agriculture, Biological and Environmental Sciences and Engineering Division, King Abdullah University of Science and Technology, Thuwal, Saudi Arabia
- Andrea Zuccolo
  - Center for Desert Agriculture, Biological and Environmental Sciences and Engineering Division, King Abdullah University of Science and Technology, Thuwal, Saudi Arabia
  - Crop Science Research Center, Sant'Anna School of Advanced Studies, Pisa, Italy
- Alice Fornasiero
  - Center for Desert Agriculture, Biological and Environmental Sciences and Engineering Division, King Abdullah University of Science and Technology, Thuwal, Saudi Arabia
- Annika M Weber
  - Department of Environmental and Radiological Health Sciences, Colorado State University, Fort Collins, CO, USA
- Krishnaveni Sanikommu
  - Laboratory for Genome Engineering and Synthetic Biology, Division of Biological Sciences, King Abdullah University of Science and Technology, Thuwal, Saudi Arabia
  - Center for Desert Agriculture, Biological and Environmental Sciences and Engineering Division, King Abdullah University of Science and Technology, Thuwal, Saudi Arabia
- Sangeetha Sampathkumar
  - Laboratory for Genome Engineering and Synthetic Biology, Division of Biological Sciences, King Abdullah University of Science and Technology, Thuwal, Saudi Arabia
  - Center for Desert Agriculture, Biological and Environmental Sciences and Engineering Division, King Abdullah University of Science and Technology, Thuwal, Saudi Arabia
- Luis F Rivera
  - Center for Desert Agriculture, Biological and Environmental Sciences and Engineering Division, King Abdullah University of Science and Technology, Thuwal, Saudi Arabia
- Haroon Butt
  - Laboratory for Genome Engineering and Synthetic Biology, Division of Biological Sciences, King Abdullah University of Science and Technology, Thuwal, Saudi Arabia
  - Center for Desert Agriculture, Biological and Environmental Sciences and Engineering Division, King Abdullah University of Science and Technology, Thuwal, Saudi Arabia
- Saule Mussurova
  - Center for Desert Agriculture, Biological and Environmental Sciences and Engineering Division, King Abdullah University of Science and Technology, Thuwal, Saudi Arabia
- Abdulrahman Alhabsi
  - Laboratory for Genome Engineering and Synthetic Biology, Division of Biological Sciences, King Abdullah University of Science and Technology, Thuwal, Saudi Arabia
  - Center for Desert Agriculture, Biological and Environmental Sciences and Engineering Division, King Abdullah University of Science and Technology, Thuwal, Saudi Arabia
- Nurmansyah Nurmansyah
  - Department of Agronomy, Faculty of Agriculture, Universitas Gadjah Mada, Yogyakarta, Indonesia
- Elizabeth P Ryan
  - Department of Environmental and Radiological Health Sciences, Colorado State University, Fort Collins, CO, USA
- Rod A Wing
  - Center for Desert Agriculture, Biological and Environmental Sciences and Engineering Division, King Abdullah University of Science and Technology, Thuwal, Saudi Arabia
  - Arizona Genomics Institute, School of Plant Sciences, University of Arizona, Tucson, AZ, USA
  - International Rice Research Institute, Strategic Innovation, Los Baños, Philippines
- Magdy M Mahfouz
  - Laboratory for Genome Engineering and Synthetic Biology, Division of Biological Sciences, King Abdullah University of Science and Technology, Thuwal, Saudi Arabia
  - Center for Desert Agriculture, Biological and Environmental Sciences and Engineering Division, King Abdullah University of Science and Technology, Thuwal, Saudi Arabia
27
Si J, Dai D, Li K, Fang L, Zhang Y. A Multi-Tissue Gene Expression Atlas of Water Buffalo (Bubalus bubalis) Reveals Transcriptome Conservation between Buffalo and Cattle. Genes (Basel) 2023; 14:genes14040890. [PMID: 37107649 PMCID: PMC10137413 DOI: 10.3390/genes14040890] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/22/2023] [Revised: 04/04/2023] [Accepted: 04/07/2023] [Indexed: 04/29/2023] Open
Abstract
We generated 73 transcriptomic datasets from water buffalo and integrated them with publicly available data for this species, yielding a large dataset of 355 samples representing 20 major tissue categories. From these data we established a multi-tissue gene expression atlas of water buffalo. Furthermore, by comparing the atlas with 4866 cattle transcriptomic datasets from the cattle genotype-tissue expression atlas (CattleGTEx), we found that the transcriptomes of the two species were conserved in their overall gene expression patterns, tissue-specific gene expression and house-keeping gene expression. We further identified genes with conserved and divergent expression between the two species, with the largest number of differentially expressed genes found in the skin, which may be related to structural and functional differences in the skin of the two species. This work provides a source of functional annotation of the buffalo genome and lays the foundations for future genetic and evolutionary studies in water buffalo.
Affiliation(s)
- Jingfang Si
  - College of Animal Science and Technology, China Agricultural University, Beijing 100193, China
- Dongmei Dai
  - College of Animal Science and Technology, China Agricultural University, Beijing 100193, China
- Kun Li
  - College of Animal Science and Technology, China Agricultural University, Beijing 100193, China
- Lingzhao Fang
  - The Center for Quantitative Genetics and Genomics (QGG), Aarhus University, 11, 8000 Aarhus, Denmark
- Yi Zhang
  - College of Animal Science and Technology, China Agricultural University, Beijing 100193, China
28
Berger B, Yu YW. Navigating bottlenecks and trade-offs in genomic data analysis. Nat Rev Genet 2023; 24:235-250. [PMID: 36476810 DOI: 10.1038/s41576-022-00551-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 10/27/2022] [Indexed: 12/12/2022]
Abstract
Genome sequencing and analysis allow researchers to decode the functional information hidden in DNA sequences as well as to study cell to cell variation within a cell population. Traditionally, the primary bottleneck in genomic analysis pipelines has been the sequencing itself, which has been much more expensive than the computational analyses that follow. However, an important consequence of the continued drive to expand the throughput of sequencing platforms at lower cost is that often the analytical pipelines are struggling to keep up with the sheer amount of raw data produced. Computational cost and efficiency have thus become of ever increasing importance. Recent methodological advances, such as data sketching, accelerators and domain-specific libraries/languages, promise to address these modern computational challenges. However, despite being more efficient, these innovations come with a new set of trade-offs, both expected, such as accuracy versus memory and expense versus time, and more subtle, including the human expertise needed to use non-standard programming interfaces and set up complex infrastructure. In this Review, we discuss how to navigate these new methodological advances and their trade-offs.
Affiliation(s)
- Bonnie Berger
  - Department of Mathematics, Massachusetts Institute of Technology, Cambridge, MA, USA
  - Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, MA, USA
- Yun William Yu
  - Department of Computer and Mathematical Sciences, University of Toronto Scarborough, Toronto, Ontario, Canada
  - Tri-Campus Department of Mathematics, University of Toronto, Toronto, Ontario, Canada
29
Zuccolo A, Mfarrej S, Celii M, Mussurova S, Rivera LF, Llaca V, Mohammed N, Pain A, Alrefaei AF, Alrefaei AF, Wing RA. The gyrfalcon (Falco rusticolus) genome. G3 (Bethesda) 2023; 13:6972330. [PMID: 36611193 PMCID: PMC9997569 DOI: 10.1093/g3journal/jkad001] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/15/2022] [Revised: 12/22/2022] [Accepted: 12/26/2022] [Indexed: 01/09/2023]
Abstract
High-quality genome assemblies are characterized by high-sequence contiguity, completeness, and a low error rate, thus providing the basis for a wide array of studies focusing on natural species ecology, conservation, evolution, and population genomics. To provide this valuable resource for conservation projects and comparative genomics studies on gyrfalcon (Falco rusticolus), we sequenced and assembled the genome of this species using third-generation sequencing strategies and optical maps. Here, we describe a highly contiguous and complete genome assembly comprising 20 scaffolds and 13 contigs with a total size of 1.193 Gbp, including 8,064 complete Benchmarking Universal Single-Copy Orthologs (BUSCOs) of the total 8,338 BUSCO groups present in the library aves_odb10. Of these BUSCO genes, 96.7% were complete, 96.1% were present as a single copy, and 0.6% were duplicated. Furthermore, 0.8% of BUSCO genes were fragmented and 2.5% (210) were missing. A de novo search for transposable elements (TEs) identified 5,716 TEs that masked 7.61% of the F. rusticolus genome assembly when combined with publicly available TE collections. Long interspersed nuclear elements, in particular, the element Chicken-repeat 1 (CR1), were the most abundant TEs in the F. rusticolus genome. A de novo first-pass gene annotation was performed using 293,349 PacBio Iso-Seq transcripts and 496,195 transcripts derived from the assembly of 42,429,525 Illumina PE RNA-seq reads. In all, 19,602 putative genes, of which 59.31% were functionally characterized and associated with Gene Ontology terms, were annotated. A comparison of the gyrfalcon genome assembly with the publicly available assemblies of the domestic chicken (Gallus gallus), zebra finch (Taeniopygia guttata), and hummingbird (Calypte anna) revealed several genome rearrangements. In particular, nine putative chromosome fusions were identified in the gyrfalcon genome assembly compared with those in the G. gallus genome assembly. 
This genome assembly, its annotation for TEs and genes, and the comparative analyses presented complement and strengthen the base of high-quality genome assemblies and associated resources available for comparative studies focusing on the evolution, ecology, and conservation of Aves.
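The BUSCO percentages quoted in this abstract can be checked against the stated counts with a few lines of arithmetic. The sketch below uses only figures given in the abstract (8,338 BUSCO groups, 8,064 complete, 210 missing); the fragmented count is not stated and is derived here as the remainder, which is an assumption. The variable and function names are illustrative.

```python
# Counts stated in the abstract.
total_buscos = 8338
complete = 8064
missing = 210

# Assumption: every BUSCO group is complete, fragmented, or missing,
# so the fragmented count is whatever remains.
fragmented = total_buscos - complete - missing


def percent(count, total=total_buscos):
    """Share of BUSCO groups as a percentage, rounded to one decimal place."""
    return round(100 * count / total, 1)


complete_pct = percent(complete)      # 96.7
missing_pct = percent(missing)        # 2.5
fragmented_pct = percent(fragmented)  # 0.8

print(complete_pct, fragmented_pct, missing_pct)
```

The derived values agree with the reported 96.7% complete, 0.8% fragmented, and 2.5% missing, and the reported single-copy (96.1%) and duplicated (0.6%) shares sum to the complete share.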
Affiliation(s)
- Andrea Zuccolo
  - Center for Desert Agriculture (CDA), Biological and Environmental Sciences & Engineering Division (BESE), King Abdullah University of Science and Technology (KAUST), Thuwal 23955-6900, Saudi Arabia
  - Crop Science Research Center, Sant'Anna School of Advanced Studies, Piazza Martiri della Libertà 33, 56127 Pisa, Italy
- Sara Mfarrej
  - King Abdullah University of Science and Technology (KAUST), Pathogen Genomics Laboratory, Biological and Environmental Science and Engineering (BESE), Thuwal-Jeddah 23955-6900, Saudi Arabia
- Mirko Celii
  - Center for Desert Agriculture (CDA), Biological and Environmental Sciences & Engineering Division (BESE), King Abdullah University of Science and Technology (KAUST), Thuwal 23955-6900, Saudi Arabia
- Saule Mussurova
  - Center for Desert Agriculture (CDA), Biological and Environmental Sciences & Engineering Division (BESE), King Abdullah University of Science and Technology (KAUST), Thuwal 23955-6900, Saudi Arabia
- Luis F Rivera
  - Center for Desert Agriculture (CDA), Biological and Environmental Sciences & Engineering Division (BESE), King Abdullah University of Science and Technology (KAUST), Thuwal 23955-6900, Saudi Arabia
- Victor Llaca
  - Research and Development, Corteva Agriscience, Johnston, IA 50131, USA
- Nahed Mohammed
  - Center for Desert Agriculture (CDA), Biological and Environmental Sciences & Engineering Division (BESE), King Abdullah University of Science and Technology (KAUST), Thuwal 23955-6900, Saudi Arabia
- Arnab Pain
  - King Abdullah University of Science and Technology (KAUST), Pathogen Genomics Laboratory, Biological and Environmental Science and Engineering (BESE), Thuwal-Jeddah 23955-6900, Saudi Arabia
- Abdulwahed Fahad Alrefaei
  - Department of Zoology, College of Science, King Saud University, P.O. Box 2455, Riyadh 11451, Saudi Arabia
- Rod A Wing
  - Center for Desert Agriculture (CDA), Biological and Environmental Sciences & Engineering Division (BESE), King Abdullah University of Science and Technology (KAUST), Thuwal 23955-6900, Saudi Arabia
  - School of Plant Sciences, Arizona Genomics Institute, University of Arizona, 24 Tucson, Arizona 85721, USA
30
Mao Y, Harvey WT, Porubsky D, Munson KM, Hoekzema K, Lewis AP, Audano PA, Rozanski A, Yang X, Zhang S, Gordon DS, Wei X, Logsdon GA, Haukness M, Dishuck PC, Jeong H, Del Rosario R, Bauer VL, Fattor WT, Wilkerson GK, Lu Q, Paten B, Feng G, Sawyer SL, Warren WC, Carbone L, Eichler EE. Structurally divergent and recurrently mutated regions of primate genomes. bioRxiv 2023:2023.03.07.531415. [PMID: 36945442 PMCID: PMC10028934 DOI: 10.1101/2023.03.07.531415] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Indexed: 03/10/2023]
Abstract
To better understand the pattern of primate genome structural variation, we sequenced and assembled using multiple long-read sequencing technologies the genomes of eight nonhuman primate species, including New World monkeys (owl monkey and marmoset), Old World monkey (macaque), Asian apes (orangutan and gibbon), and African ape lineages (gorilla, bonobo, and chimpanzee). Compared to the human genome, we identified 1,338,997 lineage-specific fixed structural variants (SVs) disrupting 1,561 protein-coding genes and 136,932 regulatory elements, including the most complete set of human-specific fixed differences. Across 50 million years of primate evolution, we estimate that 819.47 Mbp or ~27% of the genome has been affected by SVs based on analysis of these primate lineages. We identify 1,607 structurally divergent regions (SDRs) wherein recurrent structural variation contributes to creating SV hotspots where genes are recurrently lost (CARDs, ABCD7, OLAH) and new lineage-specific genes are generated (e.g., CKAP2, NEK5) and have become targets of rapid chromosomal diversification and positive selection (e.g., RGPDs). High-fidelity long-read sequencing has made these dynamic regions of the genome accessible for sequence-level analyses within and between primate species for the first time.
Affiliation(s)
- Yafei Mao
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
  - Bio-X Institutes, Key Laboratory for the Genetics of Developmental and Neuropsychiatric Disorders, Ministry of Education, Shanghai Jiao Tong University, Shanghai, China
- William T Harvey
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- David Porubsky
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Katherine M Munson
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Kendra Hoekzema
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Alexandra P Lewis
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Peter A Audano
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Allison Rozanski
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Xiangyu Yang
  - Bio-X Institutes, Key Laboratory for the Genetics of Developmental and Neuropsychiatric Disorders, Ministry of Education, Shanghai Jiao Tong University, Shanghai, China
- Shilong Zhang
  - Bio-X Institutes, Key Laboratory for the Genetics of Developmental and Neuropsychiatric Disorders, Ministry of Education, Shanghai Jiao Tong University, Shanghai, China
- David S Gordon
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
  - Howard Hughes Medical Institute, University of Washington, Seattle, WA, USA
- Xiaoxi Wei
  - Bio-X Institutes, Key Laboratory for the Genetics of Developmental and Neuropsychiatric Disorders, Ministry of Education, Shanghai Jiao Tong University, Shanghai, China
- Glennis A Logsdon
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Marina Haukness
  - UC Santa Cruz Genomics Institute, University of California, Santa Cruz, Santa Cruz, CA, USA
- Philip C Dishuck
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Hyeonsoo Jeong
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
- Ricardo Del Rosario
  - McGovern Institute for Brain Research, Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, MA, USA
  - Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Vanessa L Bauer
  - BioFrontiers Institute, Department of Molecular, Cellular, and Developmental Biology, University of Colorado, Boulder, CO, USA
- Will T Fattor
  - BioFrontiers Institute, Department of Molecular, Cellular, and Developmental Biology, University of Colorado, Boulder, CO, USA
- Gregory K Wilkerson
  - Department of Veterinary Sciences, Michale E. Keeling Center for Comparative Medicine and Research, The University of Texas MD Anderson Cancer Center, Bastrop, TX, USA
  - Department of Clinical Sciences, North Carolina State University, Raleigh, NC, USA
- Qing Lu
  - Bio-X Institutes, Key Laboratory for the Genetics of Developmental and Neuropsychiatric Disorders, Ministry of Education, Shanghai Jiao Tong University, Shanghai, China
- Benedict Paten
  - UC Santa Cruz Genomics Institute, University of California, Santa Cruz, Santa Cruz, CA, USA
- Guoping Feng
  - McGovern Institute for Brain Research, Department of Brain and Cognitive Sciences, Massachusetts Institute of Technology, Cambridge, MA, USA
  - Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Sara L Sawyer
  - BioFrontiers Institute, Department of Molecular, Cellular, and Developmental Biology, University of Colorado, Boulder, CO, USA
- Wesley C Warren
  - Department of Animal Sciences, Bond Life Sciences Center, University of Missouri, Columbia, MO, USA
  - Department of Surgery, School of Medicine, University of Missouri, Columbia, MO, USA
  - Institute of Data Science and Informatics, University of Missouri, Columbia, MO, USA
- Lucia Carbone
  - Department of Medicine, Knight Cardiovascular Institute, Oregon Health and Science University, Portland, OR, USA
  - Division of Genetics, Oregon National Primate Research Center, Beaverton, OR, USA
  - Department of Molecular and Medical Genetics, Oregon Health and Science University, Portland, OR, USA
  - Department of Medical Informatics and Clinical Epidemiology, Oregon Health and Science University, Portland, OR, USA
- Evan E Eichler
  - Department of Genome Sciences, University of Washington School of Medicine, Seattle, WA, USA
  - Howard Hughes Medical Institute, University of Washington, Seattle, WA, USA
31
Exploring the Potential Molecular Mechanisms of Interactions between a Probiotic Consortium and Its Coral Host. mSystems 2023; 8:e0092122. [PMID: 36688656 PMCID: PMC9948713 DOI: 10.1128/msystems.00921-22] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Indexed: 01/24/2023]
Abstract
Beneficial microorganisms for corals (BMCs) have been demonstrated to be effective probiotics to alleviate bleaching and mitigate coral mortality in vivo. The selection of putative BMCs is traditionally performed manually, using an array of biochemical and molecular tests for putative BMC traits. We present a comprehensive genetic survey of BMC traits using a genome-based framework for the identification of alternative mechanisms that can be used for future in silico selection of BMC strains. We identify exclusive BMC traits associated with specific strains and propose new BMC mechanisms, such as the synthesis of glycine betaine and ectoines. Our roadmap facilitates the selection of BMC strains while increasing the array of genetic targets that can be included in the selection of putative BMC strains to be tested as coral probiotics.
IMPORTANCE: Probiotics are currently the main hope as a potential medicine for corals, organisms that are considered the marine "canaries of the coal mine" and that are threatened with extinction. Our experiments have proved the concept that probiotics mitigate coral bleaching and can also prevent coral mortality. Here, we present a comprehensive genetic survey of probiotic traits using a genome-based framework. The main outcomes are a roadmap that facilitates the selection of coral probiotic strains while increasing the array of mechanisms that can be included in the selection of coral probiotics.
|
32
|
Piña JS, Orozco-Arias S, Tobón-Orozco N, Camargo-Forero L, Tabares-Soto R, Guyot R. G-SAIP: Graphical Sequence Alignment Through Parallel Programming in the Post-Genomic Era. Evol Bioinform Online 2023; 19:11769343221150585. [PMID: 36703866 PMCID: PMC9871978 DOI: 10.1177/11769343221150585] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/14/2022] [Accepted: 12/23/2022] [Indexed: 01/22/2023]
Abstract
A common task in bioinformatics is to compare DNA sequences to identify similarities between organisms at the sequence level. One approach to such comparison is the dot-plot, a 2-dimensional graphical representation used to analyze DNA or protein alignments. Dot-plot alignment software existed before the sequencing revolution, but it remains limited when dealing with large sequences, resulting in very long execution times. High-Performance Computing (HPC) techniques have been successfully used in many applications to reduce computing times, but so far, very few applications for graphical sequence alignment using HPC have been reported. Here, we present G-SAIP (Graphical Sequence Alignment in Parallel), software capable of spawning multiple distributed processes on CPUs over a supercomputing infrastructure to speed up dot-plot generation by up to 1.68× compared with the fastest current tools. This improves the efficiency of comparative structural genomic analysis and phylogenetics, which benefit from pairwise alignments for comparison between genomes, as well as repetitive structure identification and assembly quality checking.
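As an illustration of the dot-plot idea described in this abstract, the following is a minimal single-process sketch, assuming exact k-mer matches mark the plotted cells. It is not G-SAIP's implementation and does no parallelisation; the function names `dot_plot` and `render` and the parameter `k` are hypothetical and chosen for this example only.

```python
def dot_plot(seq_a, seq_b, k=3):
    """Return the set of (i, j) cells where the k-mer starting at
    seq_a[i] exactly equals the k-mer starting at seq_b[j].
    Assumption: exact matching, no scoring or filtering."""
    # Index every k-mer position in seq_b so each k-mer of seq_a
    # is looked up once instead of scanning seq_b repeatedly.
    index = {}
    for j in range(len(seq_b) - k + 1):
        index.setdefault(seq_b[j:j + k], []).append(j)
    hits = set()
    for i in range(len(seq_a) - k + 1):
        for j in index.get(seq_a[i:i + k], ()):
            hits.add((i, j))
    return hits


def render(hits, rows, cols):
    """Draw the hits as text: '*' for a match cell, '.' otherwise."""
    return "\n".join(
        "".join("*" if (i, j) in hits else "." for j in range(cols))
        for i in range(rows)
    )


if __name__ == "__main__":
    cells = dot_plot("ACGTACGT", "ACGTTCGT", k=3)
    print(render(cells, 6, 6))
```

Runs of matches along a diagonal indicate shared segments; off-diagonal runs indicate repeats or rearrangements. Because each row of the plot can be computed independently, the outer loop is the natural unit to distribute across processes.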
Affiliation(s)
- Johan S. Piña
- Department of Data Science, People Contact, Manizales, Caldas, Colombia
- Department of Computer Science, Universidad Autónoma de Manizales, Manizales, Caldas, Colombia
- Johan S. Piña, Department of Computer Science, Universidad Autónoma de Manizales, Antigua estación del ferrocarril, Manizales, Caldas 170004, Colombia.
| | - Simon Orozco-Arias
- Department of Computer Science, Universidad Autónoma de Manizales, Manizales, Caldas, Colombia
- Department of Systems and Informatics, Universidad de Caldas, Manizales, Caldas, Colombia
| | - Nicolas Tobón-Orozco
- Department of Computer Science, Universidad Autónoma de Manizales, Manizales, Caldas, Colombia
| | | | - Reinel Tabares-Soto
- Department of Electronics and Automation, Universidad Autónoma de Manizales, Manizales, Caldas, Colombia
| | - Romain Guyot
- Department of Electronics and Automation, Universidad Autónoma de Manizales, Manizales, Caldas, Colombia
- Institut de Recherche pour le Développement, CIRAD, University of Montpellier, Montpellier, France
| |
|
33
|
Abstract
MOTIVATION: Pangenome variation graphs model the mutual alignment of collections of DNA sequences. A set of pairwise alignments implies a variation graph, but there are no scalable methods to generate such a graph from these alignments. Existing related approaches depend on a single reference, a specific ordering of genomes or a de Bruijn model based on a fixed k-mer length. A scalable, self-contained method to build pangenome graphs without such limitations would be a key step in pangenome construction and manipulation pipelines.
RESULTS: We design the seqwish algorithm, which builds a variation graph from a set of sequences and alignments between them. We first transform the alignment set into an implicit interval tree. To build up the variation graph, we query this tree-based representation of the alignments to reduce transitive matches into single DNA segments in a sequence graph. By recording the mapping from input sequence to output graph, we can trace the original paths through this graph, yielding a pangenome variation graph. We present an implementation that operates in external memory, using disk-backed data structures and lock-free parallel methods to drive the core graph induction step. We demonstrate that our method scales to very large graph induction problems by applying it to build pangenome graphs for several species.
AVAILABILITY AND IMPLEMENTATION: seqwish is published as free software under the MIT open source license. Source code and documentation are available at https://github.com/ekg/seqwish. seqwish can be installed via Bioconda https://bioconda.github.io/recipes/seqwish/README.html or GNU Guix https://github.com/ekg/guix-genomics/blob/master/seqwish.scm.
Affiliation(s)
| | - Andrea Guarracino
- Genomics Research Centre, Human Technopole, Viale Rita Levi-Montalcini 1, Milan 20157, Italy
| |
|
34
|
Sahlin K. Strobealign: flexible seed size enables ultra-fast and accurate read alignment. Genome Biol 2022; 23:260. [PMID: 36522758 PMCID: PMC9753264 DOI: 10.1186/s13059-022-02831-7] [Citation(s) in RCA: 17] [Impact Index Per Article: 8.5] [Received: 05/24/2022] [Accepted: 12/02/2022] [Indexed: 12/23/2022]
Abstract
Read alignment is often the computational bottleneck in analyses. Recently, several advances have been made on seeding methods for fast sequence comparison. We combine two such methods, syncmers and strobemers, in a novel seeding approach for constructing dynamic-sized fuzzy seeds and implement the method in a short-read aligner, strobealign. The seeding is fast to construct and effectively reduces repetitiveness in the seeding step, as shown using a novel metric E-hits. strobealign is several times faster than traditional aligners at similar and sometimes higher accuracy while being both faster and more accurate than more recently proposed aligners for short reads of lengths 150nt and longer. Availability: https://github.com/ksahlin/strobealign.
Affiliation(s)
- Kristoffer Sahlin
- Department of Mathematics, Science for Life Laboratory, Stockholm University, 106 91, Stockholm, Sweden.
| |
|
35
|
Papa Y, Wellenreuther M, Morrison MA, Ritchie PA. Genome assembly and isoform analysis of a highly heterozygous New Zealand fisheries species, the tarakihi (Nemadactylus macropterus). G3 (BETHESDA, MD.) 2022; 13:6883520. [PMID: 36477875 PMCID: PMC9911067 DOI: 10.1093/g3journal/jkac315] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/10/2022] [Revised: 11/01/2022] [Accepted: 11/08/2022] [Indexed: 12/14/2022]
Abstract
Although being some of the most valuable and heavily exploited wild organisms, few fisheries species have been studied at the whole-genome level. This is especially the case in New Zealand, where genomics resources are urgently needed to assist fisheries management. Here, we generated 55 Gb of short Illumina reads (92× coverage) and 73 Gb of long Nanopore reads (122×) to produce the first genome assembly of the marine teleost tarakihi [Nemadactylus macropterus (Forster, 1801)], a highly valuable fisheries species in New Zealand. An additional 300 Mb of Iso-Seq reads were obtained to assist in gene annotation. The final genome assembly was 568 Mb long with an N50 of 3.37 Mb. The genome completeness was high, with 97.8% of complete Actinopterygii Benchmarking Universal Single-Copy Orthologs. Heterozygosity values estimated through k-mer counting (1.00%) and bi-allelic SNPs (0.64%) were high compared with the same values reported for other fishes. Iso-Seq analysis recovered 91,313 unique transcripts from 15,515 genes (mean ratio of 5.89 transcripts per gene), and the most common alternative splicing event was intron retention. This highly contiguous genome assembly and the isoform-resolved transcriptome will provide a useful resource to assist the study of population genomics and comparative eco-evolutionary studies in teleosts and related organisms.
Affiliation(s)
- Yvan Papa
- School of Biological Sciences, Victoria University of Wellington, Wellington 6012, New Zealand
| | - Maren Wellenreuther
- Seafood Production Group, The New Zealand Institute for Plant and Food Research Limited, Nelson 7010, New Zealand
- School of Biological Sciences, The University of Auckland, Auckland 1010, New Zealand
| | - Mark A Morrison
- National Institute of Water and Atmospheric Research, Auckland 1010, New Zealand
| | - Peter A Ritchie
- Corresponding author: Te Toki A Rata, Gate 7, Kelburn Parade, Wellington 6012, New Zealand.
| |
|
36
|
Yoo D, Park J, Lee C, Song I, Lee YH, Yun T, Lee H, Heguy A, Han JY, Dasen JS, Kim H, Baek M. Little skate genome provides insights into genetic programs essential for limb-based locomotion. eLife 2022; 11:e78345. [PMID: 36288084 PMCID: PMC9605692 DOI: 10.7554/elife.78345] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 03/03/2022] [Accepted: 10/10/2022] [Indexed: 11/13/2022]
Abstract
The little skate Leucoraja erinacea, a cartilaginous fish, displays pelvic fin driven walking-like behavior using genetic programs and neuronal subtypes similar to those of land vertebrates. However, mechanistic studies on little skate motor circuit development have been limited, due to the lack of a high-quality reference genome. Here, we generated an assembly of the little skate genome, with precise gene annotation and structures, which allowed post-genome analysis of spinal motor neurons (MNs) essential for locomotion. Through interspecies comparison of mouse, skate and chicken MN transcriptomes, shared and divergent gene expression profiles were identified. Comparison of accessible chromatin regions between mouse and skate MNs predicted shared transcription factor (TF) motifs with divergent ones, which could be used for achieving differential regulation of MN-expressed genes. A greater number of TF motif predictions were observed in MN-expressed genes in mouse than in little skate. These findings suggest conserved and divergent molecular mechanisms controlling MN development of vertebrates during evolution, which might contribute to intricate gene regulatory networks in the emergence of a more sophisticated motor system in tetrapods.
Affiliation(s)
- DongAhn Yoo
- Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul, Republic of Korea
| | - Junhee Park
- Department of Brain Sciences, DGIST, Daegu, Republic of Korea
| | - Chul Lee
- Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul, Republic of Korea
| | - Injun Song
- Department of Brain Sciences, DGIST, Daegu, Republic of Korea
| | - Young Ho Lee
- Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul, Republic of Korea
| | - Tery Yun
- Department of Brain Sciences, DGIST, Daegu, Republic of Korea
| | - Hyemin Lee
- Department of Biology, Graduate School of Arts and Science, NYU, New York, United States
| | - Adriana Heguy
- Genome Technology Center, Division for Advanced Research Technologies, and Department of Pathology, NYU School of Medicine, New York, United States
| | - Jae Yong Han
- Department of Agricultural Biotechnology, Seoul National University, Seoul, Republic of Korea
| | - Jeremy S Dasen
- Neuroscience Institute, Department of Neuroscience and Physiology, New York University School of Medicine, New York, United States
| | - Heebal Kim
- Interdisciplinary Program in Bioinformatics, Seoul National University, Seoul, Republic of Korea
- Department of Agricultural Biotechnology and Research Institute of Agriculture and Life Sciences, Seoul National University, Seoul, Republic of Korea
- eGnome, Inc, Seoul, Republic of Korea
| | - Myungin Baek
- Department of Brain Sciences, DGIST, Daegu, Republic of Korea
| |
|
37
|
Schield DR, Perry BW, Card DC, Pasquesi GIM, Westfall AK, Mackessy SP, Castoe TA. The Rattlesnake W Chromosome: A GC-Rich Retroelement Refugium with Retained Gene Function Across Ancient Evolutionary Strata. Genome Biol Evol 2022; 14:evac116. [PMID: 35867356 PMCID: PMC9447483 DOI: 10.1093/gbe/evac116] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Accepted: 07/17/2022] [Indexed: 11/18/2022]
Abstract
Sex chromosomes diverge after the establishment of recombination suppression, resulting in differential sex-linkage of genes involved in genetic sex determination and dimorphic traits. This process produces systems of male or female heterogamety wherein the Y and W chromosomes are only present in one sex and are often highly degenerated. Sex-limited Y and W chromosomes contain valuable information about the evolutionary transition from autosomes to sex chromosomes, yet detailed characterizations of the structure, composition, and gene content of sex-limited chromosomes are lacking for many species. In this study, we characterize the female-specific W chromosome of the prairie rattlesnake (Crotalus viridis) and evaluate how recombination suppression and other processes have shaped sex chromosome evolution in ZW snakes. Our analyses indicate that the rattlesnake W chromosome is over 80% repetitive and that an abundance of GC-rich mdg4 elements has driven an overall high degree of GC-richness despite a lack of recombination. The W chromosome is also highly enriched for repeat sequences derived from endogenous retroviruses and likely acts as a "refugium" for these and other retroelements. We annotated 219 putatively functional W-linked genes across at least two evolutionary strata identified based on estimates of sequence divergence between Z and W gametologs. The youngest of these strata is relatively gene-rich, however gene expression across strata suggests retained gene function amidst a greater degree of degeneration following ancient recombination suppression. Functional annotation of W-linked genes indicates a specialization of the W chromosome for reproductive and developmental function since recombination suppression from the Z chromosome.
Affiliation(s)
- Drew R Schield
- Department of Ecology and Evolutionary Biology, University of Colorado, Boulder, Colorado, USA
| | - Blair W Perry
- Department of Biology, University of Texas at Arlington, Arlington, Texas, USA
- School of Biological Sciences, Washington State University, Pullman, Washington, USA
| | - Daren C Card
- Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, Massachusetts, USA
- Museum of Comparative Zoology, Harvard University, Cambridge, Massachusetts, USA
| | - Giulia I M Pasquesi
- Department of Molecular, Cellular, and Developmental Biology, University of Colorado, Boulder, Colorado, USA
| | - Aundrea K Westfall
- Department of Biology, University of Texas at Arlington, Arlington, Texas, USA
| | - Stephen P Mackessy
- School of Biological Sciences, University of Northern Colorado, Greeley, Colorado, USA
| | - Todd A Castoe
- Department of Biology, University of Texas at Arlington, Arlington, Texas, USA
| |
|
38
|
Pickett BD, Glass JR, Johnson TP, Ridge PG, Kauwe JSK. The genome of a giant (trevally): Caranx ignobilis. GIGABYTE 2022; 2022:gigabyte67. [PMID: 36824527 PMCID: PMC9694125 DOI: 10.46471/gigabyte.67] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/24/2022] [Accepted: 08/25/2022] [Indexed: 11/09/2022]
Abstract
Caranx ignobilis, commonly known as giant kingfish or giant trevally, is a large, reef-associated apex predator. It is a prized sportfish, targeted throughout its tropical and subtropical range in the Indian and Pacific Oceans. It has also gained significant interest in aquaculture due to its unusual freshwater tolerance. Here, we present a draft assembly of the estimated 625.92 Mbp nuclear genome of a C. ignobilis individual from Hawaiian waters, which host a genetically distinct population. Our 97.4% BUSCO-complete assembly has a contig NG50 of 7.3 Mbp and a scaffold NG50 of 46.3 Mbp. Twenty-five of the 203 scaffolds contain 90% of the genome. We also present noisy, long-read DNA, Hi-C, and RNA-seq datasets; the RNA-seq data cover eight distinct tissues and can help with annotations and studies of freshwater tolerance. Our genome assembly and its supporting data are valuable tools for ecological and comparative genomics studies of kingfishes and other carangoid fishes.
Affiliation(s)
| | - Jessica R. Glass
- South African Institute for Aquatic Biodiversity, Makhanda, South Africa
- College of Fisheries and Ocean Sciences, University of Alaska Fairbanks, Fairbanks, Alaska, USA
| | | | - Perry G. Ridge
- Department of Biology, Brigham Young University, Provo, Utah, USA
| | - John S. K. Kauwe
- Department of Biology, Brigham Young University, Provo, Utah, USA
- Brigham Young University – Hawai‘i, Laie, Hawai‘i, USA
| |
|
39
|
Kille B, Balaji A, Sedlazeck FJ, Nute M, Treangen TJ. Multiple genome alignment in the telomere-to-telomere assembly era. Genome Biol 2022; 23:182. [PMID: 36038949 PMCID: PMC9421119 DOI: 10.1186/s13059-022-02735-6] [Citation(s) in RCA: 18] [Impact Index Per Article: 9.0] [Received: 10/20/2021] [Accepted: 07/21/2022] [Indexed: 01/22/2023]
Abstract
With the arrival of telomere-to-telomere (T2T) assemblies of the human genome comes the computational challenge of efficiently and accurately constructing multiple genome alignments at an unprecedented scale. By identifying nucleotides across genomes which share a common ancestor, multiple genome alignments commonly serve as the bedrock for comparative genomics studies. In this review, we provide an overview of the algorithmic template that most multiple genome alignment methods follow. We also discuss prospective areas of improvement of multiple genome alignment for keeping up with continuously arriving high-quality T2T assembled genomes and for unlocking clinically-relevant insights.
Affiliation(s)
- Bryce Kille
- Department of Computer Science, Rice University, Houston, TX, USA
| | - Advait Balaji
- Department of Computer Science, Rice University, Houston, TX, USA
| | - Fritz J Sedlazeck
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, USA
| | - Michael Nute
- Department of Computer Science, Rice University, Houston, TX, USA
| | - Todd J Treangen
- Department of Computer Science, Rice University, Houston, TX, USA.
| |
|
40
|
Ilyukhin E, Markovskaja S, Elgorban AM, Al-Rejaie SS, Maharachchikumbura SS. Genomic Characteristics and Comparative Genomics Analysis of Parafenestella ontariensis sp. nov. J Fungi (Basel) 2022; 8:jof8070732. [PMID: 35887487 PMCID: PMC9318755 DOI: 10.3390/jof8070732] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 06/04/2022] [Revised: 07/07/2022] [Accepted: 07/11/2022] [Indexed: 11/16/2022]
Abstract
A new ascomycetous species of Parafenestella was isolated from Acer negundo during the survey of diseased trees in Southern Ontario, Canada. The species is morphologically similar to other taxa of Cucurbitariaceae (Pleosporales). The new species is different from the extant species in the morphology of ascospores, culture characteristics and molecular data. The novel species is described as Parafenestella ontariensis sp. nov. based on morphological and multi-gene phylogenetic analyses using a combined set of ITS, LSU, tef1 and tub2 loci. Additionally, the genome of P. ontariensis was sequenced and analyzed. The phylogenomic analysis confirmed the close relationship of the species to the fenestelloid clades of Cucurbitariaceae. The comparative genomics analysis revealed that the species lifestyle appears to be multitrophic (necrotrophic or hemi-biotrophic) with a capability to turn pathogenic on a corresponding plant host.
Affiliation(s)
- Evgeny Ilyukhin
- Department of Biological Sciences, Brock University, St. Catharines, ON L2S 3A1, Canada
| | | | - Abdallah M. Elgorban
- Department of Botany and Microbiology, College of Science, King Saud University, Riyadh 11451, Saudi Arabia
| | - Salim S. Al-Rejaie
- Department of Pharmacology and Toxicology, College of Pharmacy, King Saud University, Riyadh 11451, Saudi Arabia
| | - Sajeewa S.N. Maharachchikumbura
- Center for Informational Biology, School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu 611731, China
| |
|
41
|
Phillips AL, Ferguson S, Watson-Haigh NS, Jones AW, Borevitz JO, Burton RA, Atwell BJ. The first long-read nuclear genome assembly of Oryza australiensis, a wild rice from northern Australia. Sci Rep 2022; 12:10823. [PMID: 35752642 PMCID: PMC9233661 DOI: 10.1038/s41598-022-14893-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/17/2021] [Accepted: 06/14/2022] [Indexed: 11/17/2022]
Abstract
Oryza australiensis is a wild rice native to monsoonal northern Australia. The International Oryza Map Alignment Project emphasises its significance as the sole representative of the EE genome clade. Assembly of the O. australiensis genome has previously been challenging due to its high Long Terminal Repeat (LTR) retrotransposon (RT) content. Oxford Nanopore long reads were combined with Illumina short reads to generate a high-quality ~ 858 Mbp genome assembly within 850 contigs with 46× long read coverage. Reference-guided scaffolding increased genome contiguity, placing 88.2% of contigs into 12 pseudomolecules. After alignment to the Oryza sativa cv. Nipponbare genome, we observed several structural variations. PacBio Iso-Seq data were generated for five distinct tissues to improve the functional annotation of 34,587 protein-coding genes and 42,329 transcripts. We also report SNV numbers for three additional O. australiensis genotypes based on Illumina re-sequencing. Although genetic similarity reflected geographical separation, the density of SNVs also correlated with our previous report on variations in salinity tolerance. This genome re-confirms the genetic remoteness of the O. australiensis lineage within the O. officinalis genome complex. Assembly of a high-quality genome for O. australiensis provides an important resource for the discovery of critical genes involved in development and stress tolerance.
Affiliation(s)
- Aaron L Phillips
- Department of Food Science, University of Adelaide, Adelaide, SA, Australia
- ARC Centre of Excellence in Plant Energy Biology, Adelaide, SA, Australia
| | - Scott Ferguson
- Research School of Biology, Australian National University, Canberra, ACT, Australia
- ARC Centre of Excellence in Plant Energy Biology, Canberra, ACT, Australia
| | - Nathan S Watson-Haigh
- South Australian Genomics Centre, University of Adelaide, Adelaide, SA, Australia
- Australian Genome Research Facility, Victorian Comprehensive Cancer Centre, Melbourne, VIC, Australia
| | - Ashley W Jones
- Research School of Biology, Australian National University, Canberra, ACT, Australia
- ARC Centre of Excellence in Plant Energy Biology, Canberra, ACT, Australia
| | - Justin O Borevitz
- Research School of Biology, Australian National University, Canberra, ACT, Australia
- ARC Centre of Excellence in Plant Energy Biology, Canberra, ACT, Australia
| | - Rachel A Burton
- Department of Food Science, University of Adelaide, Adelaide, SA, Australia
- ARC Centre of Excellence in Plant Energy Biology, Adelaide, SA, Australia
| | - Brian J Atwell
- School of Natural Sciences, Macquarie University, Sydney, NSW, Australia.
| |
|
42
|
Belbasi M, Blanca A, Harris RS, Koslicki D, Medvedev P. The minimizer Jaccard estimator is biased and inconsistent. Bioinformatics 2022; 38:i169-i176. [PMID: 35758786 PMCID: PMC9235516 DOI: 10.1093/bioinformatics/btac244] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 12/30/2022]
Abstract
Motivation: Sketching is now widely used in bioinformatics to reduce data size and increase data processing speed. Sketching approaches entice with improved scalability but also carry the danger of decreased accuracy and added bias. In this article, we investigate the minimizer sketch and its use to estimate the Jaccard similarity between two sequences.
Results: We show that the minimizer Jaccard estimator is biased and inconsistent, which means that the expected difference (i.e. the bias) between the estimator and the true value is not zero, even in the limit as the lengths of the sequences grow. We derive an analytical formula for the bias as a function of how the shared k-mers are laid out along the sequences. We show both theoretically and empirically that there are families of sequences where the bias can be substantial (e.g. the true Jaccard can be more than double the estimate). Finally, we demonstrate that this bias affects the accuracy of the widely used mashmap read mapping tool.
Availability and implementation: Scripts to reproduce our experiments are available at https://github.com/medvedevgroup/minimizer-jaccard-estimator/tree/main/reproduce.
Supplementary information: Supplementary data are available at Bioinformatics online.
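To make the two quantities compared in this abstract concrete, here is a minimal sketch, assuming the common window-based definition of a minimizer sketch (the smallest-hash k-mer in each window of w consecutive k-mers). It is not the authors' code; the helper names, the use of BLAKE2b as the ordering hash, and the convention that two empty sets have similarity 1.0 are all assumptions made for this example.

```python
import hashlib


def kmers(seq, k):
    """All k-mers of seq, in order of position."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def _order(kmer):
    # Assumption: a fixed hash gives a pseudo-random but repeatable
    # ordering of k-mers; any consistent ordering would do here.
    digest = hashlib.blake2b(kmer.encode(), digest_size=8).digest()
    return int.from_bytes(digest, "big")


def minimizer_sketch(seq, k, w):
    """Set of k-mers that are the minimum in at least one window
    of w consecutive k-mers."""
    ks = kmers(seq, k)
    return {min(ks[s:s + w], key=_order) for s in range(len(ks) - w + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets (1.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def true_jaccard(s, t, k):
    """Jaccard similarity over the full k-mer sets."""
    return jaccard(set(kmers(s, k)), set(kmers(t, k)))


def minimizer_jaccard(s, t, k, w):
    """Jaccard similarity over the minimizer sketches only."""
    return jaccard(minimizer_sketch(s, k, w), minimizer_sketch(t, k, w))


if __name__ == "__main__":
    s, t = "ACGTTGCAACGGTACCA", "ACGTAGCAACGGTTCCA"
    print(true_jaccard(s, t, 4), minimizer_jaccard(s, t, 4, 3))
```

With w = 1 every k-mer is its own minimizer, so the two functions agree; for larger w the sketch keeps only a subset of k-mers, and the gap between the two values on a given pair of sequences is the kind of discrepancy the abstract analyses.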
Affiliation(s)
- Mahdi Belbasi
- Department of Computer Science and Engineering, The Pennsylvania State University, University Park, PA, USA
| | - Antonio Blanca
- Department of Computer Science and Engineering, The Pennsylvania State University, University Park, PA, USA
| | - Robert S Harris
- Department of Biology, The Pennsylvania State University, University Park, PA, USA
| | - David Koslicki
- Department of Computer Science and Engineering, The Pennsylvania State University, University Park, PA, USA
- Department of Biology, The Pennsylvania State University, University Park, PA, USA
- Huck Institutes of the Life Sciences, The Pennsylvania State University, University Park, PA, USA
| | - Paul Medvedev
- Department of Computer Science and Engineering, The Pennsylvania State University, University Park, PA, USA
- Huck Institutes of the Life Sciences, The Pennsylvania State University, University Park, PA, USA
- Department of Biochemistry and Molecular Biology, The Pennsylvania State University, University Park, PA, USA
| |
|
43
|
Jain C, Rhie A, Hansen NF, Koren S, Phillippy AM. Long-read mapping to repetitive reference sequences using Winnowmap2. Nat Methods 2022; 19:705-710. [PMID: 35365778 PMCID: PMC10510034 DOI: 10.1038/s41592-022-01457-8] [Citation(s) in RCA: 51] [Impact Index Per Article: 25.5] [Received: 05/22/2021] [Accepted: 03/17/2022] [Indexed: 01/10/2023]
Abstract
Approximately 5-10% of the human genome remains inaccessible due to the presence of repetitive sequences such as segmental duplications and tandem repeat arrays. We show that existing long-read mappers often yield incorrect alignments and variant calls within long, near-identical repeats, as they remain vulnerable to allelic bias. In the presence of a nonreference allele within a repeat, a read sampled from that region could be mapped to an incorrect repeat copy. To address this limitation, we developed a new long-read mapping method, Winnowmap2, by using minimal confidently alignable substrings. Winnowmap2 computes each read mapping through a collection of confident subalignments. This approach is more tolerant of structural variation and more sensitive to paralog-specific variants within repeats. Our experiments highlight that Winnowmap2 successfully addresses the issue of allelic bias, enabling more accurate downstream variant calls in repetitive sequences.
Affiliation(s)
- Chirag Jain
- Department of Computational and Data Sciences, Indian Institute of Science, Bangalore, India.
- Genome Informatics Section, National Human Genome Research Institute, Bethesda, MD, USA.
| | - Arang Rhie
- Genome Informatics Section, National Human Genome Research Institute, Bethesda, MD, USA
| | - Nancy F Hansen
- Comparative Genomics Analysis Unit, National Human Genome Research Institute, Bethesda, MD, USA
| | - Sergey Koren
- Genome Informatics Section, National Human Genome Research Institute, Bethesda, MD, USA
| | - Adam M Phillippy
- Genome Informatics Section, National Human Genome Research Institute, Bethesda, MD, USA
| |
|
44
|
Low WY, Rosen BD, Ren Y, Bickhart DM, To TH, Martin FJ, Billis K, Sonstegard TS, Sullivan ST, Hiendleder S, Williams JL, Heaton MP, Smith TPL. Gaur genome reveals expansion of sperm odorant receptors in domesticated cattle. BMC Genomics 2022; 23:344. [PMID: 35508966 PMCID: PMC9069736 DOI: 10.1186/s12864-022-08561-1] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 01/12/2022] [Accepted: 04/13/2022] [Indexed: 02/08/2023]
Abstract
Background: The gaur (Bos gaurus) is the largest extant wild bovine species, native to South and Southeast Asia, with unique traits, and is listed as vulnerable by the International Union for Conservation of Nature (IUCN).
Results: We report the first gaur reference genome and identify three biological pathways including lysozyme activity, proton transmembrane transporter activity, and oxygen transport with significant changes in gene copy number in gaur compared to other mammals. These may reflect adaptation to challenges related to climate and nutrition. Comparative analyses with domesticated indicine (Bos indicus) and taurine (Bos taurus) cattle revealed genomic signatures of artificial selection, including the expansion of sperm odorant receptor genes in domesticated cattle, which may have important implications for understanding selection for male fertility.
Conclusions: Apart from aiding dissection of economically important traits, the gaur genome will also provide the foundation to conserve the species.
Supplementary Information: The online version contains supplementary material available at 10.1186/s12864-022-08561-1.
Collapse
Affiliation(s)
- Wai Yee Low
- The Davies Research Centre, School of Animal and Veterinary Sciences, University of Adelaide, Roseworthy, SA, 5371, Australia
| | - Benjamin D Rosen
- Animal Genomics and Improvement Laboratory, ARS USDA, Beltsville, MD, USA
| | - Yan Ren
- The Davies Research Centre, School of Animal and Veterinary Sciences, University of Adelaide, Roseworthy, SA, 5371, Australia
| | | | - Thu-Hien To
- Norwegian University of Life Sciences: NMBU, Universitetstunet 3, 1430, Ås, Norway
| | - Fergal J Martin
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
| | - Konstantinos Billis
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
| | | | - Shawn T Sullivan
- Phase Genomics, 4000 Mason Road, Suite 225, Seattle, WA, 98195, USA
| | - Stefan Hiendleder
- The Davies Research Centre, School of Animal and Veterinary Sciences, University of Adelaide, Roseworthy, SA, 5371, Australia
| | - John L Williams
- The Davies Research Centre, School of Animal and Veterinary Sciences, University of Adelaide, Roseworthy, SA, 5371, Australia
- Department of Animal Science, Food and Nutrition, Università Cattolica del Sacro Cuore, 29122, Piacenza, Italy
| | - Michael P Heaton
- U.S. Department of Agriculture, Agricultural Research Service, U.S. Meat Animal Research Center, Clay Center, Nebraska, USA
| | - Timothy P L Smith
- U.S. Department of Agriculture, Agricultural Research Service, U.S. Meat Animal Research Center, Clay Center, Nebraska, USA
| |
|
45
|
Lo R, Dougan KE, Chen Y, Shah S, Bhattacharya D, Chan CX. Alignment-Free Analysis of Whole-Genome Sequences From Symbiodiniaceae Reveals Different Phylogenetic Signals in Distinct Regions. Front Plant Sci 2022; 13:815714. [PMID: 35557718 PMCID: PMC9087856 DOI: 10.3389/fpls.2022.815714] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 11/15/2021] [Accepted: 04/04/2022] [Indexed: 05/24/2023]
Abstract
Dinoflagellates of the family Symbiodiniaceae are predominantly essential symbionts of corals and other marine organisms. Recent research reveals extensive genome sequence divergence among Symbiodiniaceae taxa and high phylogenetic diversity hidden behind subtly different cell morphologies. Using an alignment-free phylogenetic approach based on sub-sequences of fixed length k (i.e. k-mers), we assessed the phylogenetic signal among whole-genome sequences from 16 Symbiodiniaceae taxa (including the genera of Symbiodinium, Breviolum, Cladocopium, Durusdinium and Fugacium) and two strains of Polarella glacialis as outgroup. Based on phylogenetic trees inferred from k-mers in distinct genomic regions (i.e. repeat-masked genome sequences, protein-coding sequences, introns and repeats) and in protein sequences, the phylogenetic signal associated with protein-coding DNA and the encoded amino acids is largely consistent with the Symbiodiniaceae phylogeny based on established markers, such as large subunit rRNA. The other genome sequences (introns and repeats) exhibit distinct phylogenetic signals, supporting the expected differential evolutionary pressure acting on these regions. Our analysis of conserved core k-mers revealed the prevalence of conserved k-mers (>95% core 23-mers among all 18 genomes) in annotated repeats and non-genic regions of the genomes. We observed 180 distinct repeat types that are significantly enriched in genomes of the symbiotic versus free-living Symbiodinium taxa, suggesting an enhanced activity of transposable elements linked to the symbiotic lifestyle. We provide evidence that representation of alignment-free phylogenies as dynamic networks enhances the ability to generate new hypotheses about genome evolution in Symbiodiniaceae. 
These results demonstrate the potential of alignment-free phylogenetic methods as a scalable approach for inferring comprehensive, unbiased whole-genome phylogenies of dinoflagellates and more broadly of microbial eukaryotes.
Affiliation(s)
- Rosalyn Lo
- Australian Centre for Ecogenomics, School of Chemistry and Molecular Biosciences, University of Queensland, Brisbane, QLD, Australia
| | - Katherine E. Dougan
- Australian Centre for Ecogenomics, School of Chemistry and Molecular Biosciences, University of Queensland, Brisbane, QLD, Australia
| | - Yibi Chen
- Australian Centre for Ecogenomics, School of Chemistry and Molecular Biosciences, University of Queensland, Brisbane, QLD, Australia
| | - Sarah Shah
- Australian Centre for Ecogenomics, School of Chemistry and Molecular Biosciences, University of Queensland, Brisbane, QLD, Australia
| | - Debashish Bhattacharya
- Department of Biochemistry and Microbiology, Rutgers University, New Brunswick, NJ, United States
| | - Cheong Xin Chan
- Australian Centre for Ecogenomics, School of Chemistry and Molecular Biosciences, University of Queensland, Brisbane, QLD, Australia
| |
|
46
|
Turbek SP, Schield DR, Scordato ESC, Contina A, Da XW, Liu Y, Liu Y, Pagani-Núñez E, Ren QM, Smith CCR, Stricker CA, Wunder M, Zonana DM, Safran RJ. A migratory divide spanning two continents is associated with genomic and ecological divergence. Evolution 2022; 76:722-736. [PMID: 35166383 DOI: 10.1111/evo.14448] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 02/04/2021] [Revised: 12/21/2021] [Accepted: 12/29/2021] [Indexed: 01/22/2023]
Abstract
Migratory divides are contact zones between breeding populations with divergent migratory strategies during the nonbreeding season. These locations provide an opportunity to evaluate the role of seasonal migration in the maintenance of reproductive isolation, particularly the relationship between population structure and features associated with distinct migratory strategies. We combine light-level geolocators, genomic sequencing, and stable isotopes to investigate the timing of migration and migratory routes of individuals breeding on either side of a migratory divide coinciding with genomic differentiation across a hybrid zone between barn swallow (Hirundo rustica) subspecies in China. Individuals west of the hybrid zone, with H. r. rustica ancestry, had comparatively enriched stable-carbon and hydrogen isotope values and overwintered in eastern Africa, whereas birds east of the hybrid zone, with H. r. gutturalis ancestry, had depleted isotope values and migrated to southern India. The two subspecies took divergent migratory routes around the high-altitude Karakoram Range and arrived on the breeding grounds over 3 weeks apart. These results indicate that assortative mating by timing of arrival and/or selection against hybrids with intermediate migratory traits may maintain reproductive isolation between the subspecies, and that inhospitable geographic features may have contributed to the diversification of Asian avifauna by influencing migratory patterns.
Affiliation(s)
- Sheela P Turbek
- Department of Ecology and Evolutionary Biology, University of Colorado, Boulder, Colorado, 80309
| | - Drew R Schield
- Department of Ecology and Evolutionary Biology, University of Colorado, Boulder, Colorado, 80309
| | - Elizabeth S C Scordato
- Department of Ecology and Evolutionary Biology, University of Colorado, Boulder, Colorado, 80309
- Department of Biological Sciences, Cal Poly Pomona, Pomona, California, 91768
| | - Andrea Contina
- Department of Integrative Biology, University of Colorado, Denver, Colorado, 80217
| | - Xin-Wei Da
- College of Life Science, Wuhan University, Wuhan, 430072, China
| | - Yang Liu
- School of Ecology, Sun Yat-sen University, Guangzhou, 510275, China
| | - Yu Liu
- Key Laboratory for Biodiversity Sciences and Ecological Engineering, College of Life Sciences, Beijing Normal University, Beijing, 100875, China
| | - Emilio Pagani-Núñez
- Department of Health and Environmental Sciences, Xi'an Jiaotong-Liverpool University, Suzhou, 215123, China
| | - Qing-Miao Ren
- School of Life Sciences, Lanzhou University, Lanzhou, 730000, China
| | - Chris C R Smith
- Department of Ecology and Evolutionary Biology, University of Colorado, Boulder, Colorado, 80309
| | - Craig A Stricker
- U.S. Geological Survey, Fort Collins Science Center, Fort Collins, Colorado, 80526
| | - Michael Wunder
- Department of Integrative Biology, University of Colorado, Denver, Colorado, 80217
| | - David M Zonana
- Department of Ecology and Evolutionary Biology, University of Colorado, Boulder, Colorado, 80309
- Department of Biological Sciences, University of Denver, Denver, Colorado, 80210
| | - Rebecca J Safran
- Department of Ecology and Evolutionary Biology, University of Colorado, Boulder, Colorado, 80309
| |
|
47
|
Nurk S, Koren S, Rhie A, Rautiainen M, Bzikadze AV, Mikheenko A, Vollger MR, Altemose N, Uralsky L, Gershman A, Aganezov S, Hoyt SJ, Diekhans M, Logsdon GA, Alonge M, Antonarakis SE, Borchers M, Bouffard GG, Brooks SY, Caldas GV, Chen NC, Cheng H, Chin CS, Chow W, de Lima LG, Dishuck PC, Durbin R, Dvorkina T, Fiddes IT, Formenti G, Fulton RS, Fungtammasan A, Garrison E, Grady PG, Graves-Lindsay TA, Hall IM, Hansen NF, Hartley GA, Haukness M, Howe K, Hunkapiller MW, Jain C, Jain M, Jarvis ED, Kerpedjiev P, Kirsche M, Kolmogorov M, Korlach J, Kremitzki M, Li H, Maduro VV, Marschall T, McCartney AM, McDaniel J, Miller DE, Mullikin JC, Myers EW, Olson ND, Paten B, Peluso P, Pevzner PA, Porubsky D, Potapova T, Rogaev EI, Rosenfeld JA, Salzberg SL, Schneider VA, Sedlazeck FJ, Shafin K, Shew CJ, Shumate A, Sims Y, Smit AFA, Soto DC, Sović I, Storer JM, Streets A, Sullivan BA, Thibaud-Nissen F, Torrance J, Wagner J, Walenz BP, Wenger A, Wood JMD, Xiao C, Yan SM, Young AC, Zarate S, Surti U, McCoy RC, Dennis MY, Alexandrov IA, Gerton JL, O’Neill RJ, Timp W, Zook JM, Schatz MC, Eichler EE, Miga KH, Phillippy AM. The complete sequence of a human genome. Science 2022; 376:44-53. [PMID: 35357919 PMCID: PMC9186530 DOI: 10.1126/science.abj6987] [Citation(s) in RCA: 1097] [Impact Index Per Article: 548.5] [Indexed: 12/11/2022]
Abstract
Since its initial release in 2000, the human reference genome has covered only the euchromatic fraction of the genome, leaving important heterochromatic regions unfinished. Addressing the remaining 8% of the genome, the Telomere-to-Telomere (T2T) Consortium presents a complete 3.055 billion-base pair sequence of a human genome, T2T-CHM13, that includes gapless assemblies for all chromosomes except Y, corrects errors in the prior references, and introduces nearly 200 million base pairs of sequence containing 1956 gene predictions, 99 of which are predicted to be protein coding. The completed regions include all centromeric satellite arrays, recent segmental duplications, and the short arms of all five acrocentric chromosomes, unlocking these complex regions of the genome to variational and functional studies.
Affiliation(s)
- Sergey Nurk
- Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health; Bethesda, MD, USA
| | - Sergey Koren
- Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health; Bethesda, MD, USA
| | - Arang Rhie
- Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health; Bethesda, MD, USA
| | - Mikko Rautiainen
- Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health; Bethesda, MD, USA
| | - Andrey V. Bzikadze
- Graduate Program in Bioinformatics and Systems Biology, University of California, San Diego; La Jolla, CA, USA
| | - Alla Mikheenko
- Center for Algorithmic Biotechnology, Institute of Translational Biomedicine, Saint Petersburg State University; Saint Petersburg, Russia
| | - Mitchell R. Vollger
- Department of Genome Sciences, University of Washington School of Medicine; Seattle, WA, USA
| | - Nicolas Altemose
- Department of Bioengineering, University of California, Berkeley; Berkeley, CA, USA
| | - Lev Uralsky
- Sirius University of Science and Technology; Sochi, Russia
- Vavilov Institute of General Genetics; Moscow, Russia
| | - Ariel Gershman
- Department of Molecular Biology and Genetics, Johns Hopkins University; Baltimore, MD, USA
| | - Sergey Aganezov
- Department of Computer Science, Johns Hopkins University; Baltimore, MD, USA
| | - Savannah J. Hoyt
- Institute for Systems Genomics and Department of Molecular and Cell Biology, University of Connecticut; Storrs, CT, USA
| | - Mark Diekhans
- UC Santa Cruz Genomics Institute, University of California, Santa Cruz; Santa Cruz, CA, USA
| | - Glennis A. Logsdon
- Department of Genome Sciences, University of Washington School of Medicine; Seattle, WA, USA
| | - Michael Alonge
- Department of Computer Science, Johns Hopkins University; Baltimore, MD, USA
| | | | | | - Gerard G. Bouffard
- NIH Intramural Sequencing Center, National Human Genome Research Institute, National Institutes of Health; Bethesda, MD, USA
| | - Shelise Y. Brooks
- NIH Intramural Sequencing Center, National Human Genome Research Institute, National Institutes of Health; Bethesda, MD, USA
| | - Gina V. Caldas
- Department of Molecular and Cell Biology, University of California, Berkeley; Berkeley, CA, USA
| | - Nae-Chyun Chen
- Department of Computer Science, Johns Hopkins University; Baltimore, MD, USA
| | - Haoyu Cheng
- Department of Data Sciences, Dana-Farber Cancer Institute; Boston, MA, USA
- Department of Biomedical Informatics, Harvard Medical School; Boston, MA, USA
| | | | | | | | - Philip C. Dishuck
- Department of Genome Sciences, University of Washington School of Medicine; Seattle, WA, USA
| | - Richard Durbin
- Wellcome Sanger Institute; Cambridge, UK
- Department of Genetics, University of Cambridge; Cambridge, UK
| | - Tatiana Dvorkina
- Center for Algorithmic Biotechnology, Institute of Translational Biomedicine, Saint Petersburg State University; Saint Petersburg, Russia
| | | | - Giulio Formenti
- Laboratory of Neurogenetics of Language and The Vertebrate Genome Lab, The Rockefeller University; New York, NY, USA
- Howard Hughes Medical Institute; Chevy Chase, MD, USA
| | - Robert S. Fulton
- Department of Genetics, Washington University School of Medicine; St. Louis, MO, USA
| | | | - Erik Garrison
- UC Santa Cruz Genomics Institute, University of California, Santa Cruz; Santa Cruz, CA, USA
- University of Tennessee Health Science Center; Memphis, TN, USA
| | - Patrick G.S. Grady
- Institute for Systems Genomics and Department of Molecular and Cell Biology, University of Connecticut; Storrs, CT, USA
| | | | - Ira M. Hall
- Department of Genetics, Yale University School of Medicine; New Haven, CT, USA
| | - Nancy F. Hansen
- Comparative Genomics Analysis Unit, Cancer Genetics and Comparative Genomics Branch, National Human Genome Research Institute, National Institutes of Health; Bethesda, MD, USA
| | - Gabrielle A. Hartley
- Institute for Systems Genomics and Department of Molecular and Cell Biology, University of Connecticut; Storrs, CT, USA
| | - Marina Haukness
- UC Santa Cruz Genomics Institute, University of California, Santa Cruz; Santa Cruz, CA, USA
| | | | | | - Chirag Jain
- Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health; Bethesda, MD, USA
- Department of Computational and Data Sciences, Indian Institute of Science; Bangalore, KA, India
| | - Miten Jain
- UC Santa Cruz Genomics Institute, University of California, Santa Cruz; Santa Cruz, CA, USA
| | - Erich D. Jarvis
- Laboratory of Neurogenetics of Language and The Vertebrate Genome Lab, The Rockefeller University; New York, NY, USA
- Howard Hughes Medical Institute; Chevy Chase, MD, USA
| | | | - Melanie Kirsche
- Department of Computer Science, Johns Hopkins University; Baltimore, MD, USA
| | - Mikhail Kolmogorov
- Department of Computer Science and Engineering, University of California, San Diego; San Diego, CA, USA
| | | | - Milinn Kremitzki
- McDonnell Genome Institute, Washington University in St. Louis; St. Louis, MO, USA
| | - Heng Li
- Department of Data Sciences, Dana-Farber Cancer Institute; Boston, MA, USA
- Department of Biomedical Informatics, Harvard Medical School; Boston, MA, USA
| | - Valerie V. Maduro
- Undiagnosed Diseases Program, National Human Genome Research Institute, National Institutes of Health; Bethesda, MD, USA
| | - Tobias Marschall
- Heinrich Heine University Düsseldorf, Medical Faculty, Institute for Medical Biometry and Bioinformatics; Düsseldorf, Germany
| | - Ann M. McCartney
- Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health; Bethesda, MD, USA
| | - Jennifer McDaniel
- Biosystems and Biomaterials Division, National Institute of Standards and Technology; Gaithersburg, MD, USA
| | - Danny E. Miller
- Department of Genome Sciences, University of Washington School of Medicine; Seattle, WA, USA
- Department of Pediatrics, Division of Genetic Medicine, University of Washington and Seattle Children’s Hospital; Seattle, WA, USA
| | - James C. Mullikin
- NIH Intramural Sequencing Center, National Human Genome Research Institute, National Institutes of Health; Bethesda, MD, USA
- Comparative Genomics Analysis Unit, Cancer Genetics and Comparative Genomics Branch, National Human Genome Research Institute, National Institutes of Health; Bethesda, MD, USA
| | - Eugene W. Myers
- Max-Planck Institute of Molecular Cell Biology and Genetics; Dresden, Germany
| | - Nathan D. Olson
- Biosystems and Biomaterials Division, National Institute of Standards and Technology; Gaithersburg, MD, USA
| | - Benedict Paten
- UC Santa Cruz Genomics Institute, University of California, Santa Cruz; Santa Cruz, CA, USA
| | | | - Pavel A. Pevzner
- Department of Computer Science and Engineering, University of California, San Diego; San Diego, CA, USA
| | - David Porubsky
- Department of Genome Sciences, University of Washington School of Medicine; Seattle, WA, USA
| | - Tamara Potapova
- Stowers Institute for Medical Research; Kansas City, MO, USA
| | - Evgeny I. Rogaev
- Sirius University of Science and Technology; Sochi, Russia
- Vavilov Institute of General Genetics; Moscow, Russia
- Department of Psychiatry, University of Massachusetts Medical School; Worcester, MA, USA
- Faculty of Biology, Lomonosov Moscow State University; Moscow, Russia
| | | | - Steven L. Salzberg
- Department of Computer Science, Johns Hopkins University; Baltimore, MD, USA
- Department of Biomedical Engineering, Johns Hopkins University; Baltimore, MD, USA
| | - Valerie A. Schneider
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health; Bethesda, MD, USA
| | - Fritz J. Sedlazeck
- Human Genome Sequencing Center, Baylor College of Medicine; Houston, TX, USA
| | - Kishwar Shafin
- UC Santa Cruz Genomics Institute, University of California, Santa Cruz; Santa Cruz, CA, USA
| | - Colin J. Shew
- Genome Center, MIND Institute, Department of Biochemistry and Molecular Medicine, University of California, Davis; CA, USA
| | - Alaina Shumate
- Department of Biomedical Engineering, Johns Hopkins University; Baltimore, MD, USA
| | - Ying Sims
- Wellcome Sanger Institute; Cambridge, UK
| | | | - Daniela C. Soto
- Genome Center, MIND Institute, Department of Biochemistry and Molecular Medicine, University of California, Davis; CA, USA
| | - Ivan Sović
- Pacific Biosciences; Menlo Park, CA, USA
- Digital BioLogic d.o.o.; Ivanić-Grad, Croatia
| | | | - Aaron Streets
- Department of Bioengineering, University of California, Berkeley; Berkeley, CA, USA
- Chan Zuckerberg Biohub; San Francisco, CA, USA
| | - Beth A. Sullivan
- Department of Molecular Genetics and Microbiology, Duke University School of Medicine; Durham, NC, USA
| | - Françoise Thibaud-Nissen
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health; Bethesda, MD, USA
| | | | - Justin Wagner
- Biosystems and Biomaterials Division, National Institute of Standards and Technology; Gaithersburg, MD, USA
| | - Brian P. Walenz
- Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health; Bethesda, MD, USA
| | | | | | - Chunlin Xiao
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health; Bethesda, MD, USA
| | - Stephanie M. Yan
- Department of Biology, Johns Hopkins University; Baltimore, MD, USA
| | - Alice C. Young
- NIH Intramural Sequencing Center, National Human Genome Research Institute, National Institutes of Health; Bethesda, MD, USA
| | - Samantha Zarate
- Department of Computer Science, Johns Hopkins University; Baltimore, MD, USA
| | - Urvashi Surti
- Department of Pathology, University of Pittsburgh; Pittsburgh, PA, USA
| | - Rajiv C. McCoy
- Department of Biology, Johns Hopkins University; Baltimore, MD, USA
| | - Megan Y. Dennis
- Genome Center, MIND Institute, Department of Biochemistry and Molecular Medicine, University of California, Davis; CA, USA
| | - Ivan A. Alexandrov
- Center for Algorithmic Biotechnology, Institute of Translational Biomedicine, Saint Petersburg State University; Saint Petersburg, Russia
- Vavilov Institute of General Genetics; Moscow, Russia
- Research Center of Biotechnology of the Russian Academy of Sciences; Moscow, Russia
| | - Jennifer L. Gerton
- Stowers Institute for Medical Research; Kansas City, MO, USA
- Department of Biochemistry and Molecular Biology, University of Kansas Medical School; Kansas City, MO, USA
| | - Rachel J. O’Neill
- Institute for Systems Genomics and Department of Molecular and Cell Biology, University of Connecticut; Storrs, CT, USA
| | - Winston Timp
- Department of Molecular Biology and Genetics, Johns Hopkins University; Baltimore, MD, USA
- Department of Biomedical Engineering, Johns Hopkins University; Baltimore, MD, USA
| | - Justin M. Zook
- Biosystems and Biomaterials Division, National Institute of Standards and Technology; Gaithersburg, MD, USA
| | - Michael C. Schatz
- Department of Computer Science, Johns Hopkins University; Baltimore, MD, USA
- Department of Biology, Johns Hopkins University; Baltimore, MD, USA
| | - Evan E. Eichler
- Department of Genome Sciences, University of Washington School of Medicine; Seattle, WA, USA
- Howard Hughes Medical Institute; Chevy Chase, MD, USA
| | - Karen H. Miga
- UC Santa Cruz Genomics Institute, University of California, Santa Cruz; Santa Cruz, CA, USA
- Department of Biomolecular Engineering, University of California, Santa Cruz; Santa Cruz, CA, USA
| | - Adam M. Phillippy
- Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health; Bethesda, MD, USA
| |
|
48
|
Išerić H, Alkan C, Hach F, Numanagić I. Fast characterization of segmental duplication structure in multiple genome assemblies. Algorithms Mol Biol 2022; 17:4. [PMID: 35303886 PMCID: PMC8932185 DOI: 10.1186/s13015-022-00210-2] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 11/16/2021] [Accepted: 02/08/2022] [Indexed: 11/29/2022]
Abstract
Motivation: The increasing availability of high-quality genome assemblies raised interest in the characterization of genomic architecture. Major architectural elements, such as common repeats and segmental duplications (SDs), increase genome plasticity that stimulates further evolution by changing the genomic structure and inventing new genes. Optimal computation of SDs within a genome requires quadratic-time local alignment algorithms that are impractical due to the size of most genomes. Additionally, to perform evolutionary analysis, one needs to characterize SDs in multiple genomes and find relations between those SDs and unique (non-duplicated) segments in other genomes. A naïve approach consisting of multiple sequence alignment would make the optimal solution to this problem even more impractical. Thus there is a need for fast and accurate algorithms to characterize SD structure in multiple genome assemblies to better understand the evolutionary forces that shaped the genomes of today. Results: Here we introduce a new approach, BISER, to quickly detect SDs in multiple genomes and identify elementary SDs and core duplicons that drive the formation of such SDs. BISER improves earlier tools by (i) scaling the detection of SDs with low homology to multiple genomes while introducing further 7-33× speed-ups over the existing tools, and by (ii) characterizing elementary SDs and detecting core duplicons to help trace the evolutionary history of duplications to as far as 300 million years. Availability and implementation: BISER is implemented in Seq programming language and is publicly available at https://github.com/0xTCG/biser.
Affiliation(s)
- Hamza Išerić
- Department of Computer Science, University of Victoria, Victoria, BC, V8P 5C2, Canada
| | - Can Alkan
- Department of Computer Engineering, Bilkent University, 06800, Ankara, Turkey
| | - Faraz Hach
- Vancouver Prostate Centre, Vancouver, BC, V6H 3Z6, Canada
- Department of Urologic Sciences, University of British Columbia, Vancouver, BC, V5Z 1M9, Canada
| | - Ibrahim Numanagić
- Department of Computer Science, University of Victoria, Victoria, BC, V8P 5C2, Canada
| |
|
49
|
Chen J, Zhong J, He X, Li X, Ni P, Safner T, Šprem N, Han J. The de novo assembly of a European wild boar genome revealed unique patterns of chromosomal structural variations and segmental duplications. Anim Genet 2022; 53:281-292. [PMID: 35238061 PMCID: PMC9314987 DOI: 10.1111/age.13181] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 12/22/2021] [Revised: 02/12/2022] [Accepted: 02/12/2022] [Indexed: 02/05/2023]
Abstract
The rapid progress of sequencing technology has greatly facilitated the de novo genome assembly of pig breeds. However, the assembly of the wild boar genome is still lacking, hampering our understanding of chromosomal and genomic evolution during domestication from wild boars into domestic pigs. Here, we sequenced and de novo assembled a European wild boar genome (ASM2165605v1) using the long‐range information provided by 10× Linked‐Reads sequencing. We achieved a high‐quality assembly with contig N50 of 26.09 Mb. Additionally, 1.64% of the contigs (222) with lengths from 107.65 kb to 75.36 Mb covered 90.3% of the total genome size of ASM2165605v1 (~2.5 Gb). Mapping analysis revealed that the contigs can fill 24.73% (93/376) of the gaps present in the orthologous regions of the updated pig reference genome (Sscrofa11.1). We further improved the contigs into chromosome level with a reference‐assistant scaffolding method. Using the ‘assembly‐to‐assembly’ approach, we identified intra‐chromosomal large structural variations (SVs, length >1 kb) between ASM2165605v1 and Sscrofa11.1 assemblies. Interestingly, we found that the number of SV events on the X chromosome deviated significantly from the linear models fitting autosomes (R2 > 0.64, p < 0.001). Specifically, deletions and insertions were deficient on the X chromosome by 66.14 and 58.41% respectively, whereas duplications and inversions were excessive on the X chromosome by 71.96 and 107.61% respectively. We further used the large segmental duplications (SDs, >1 kb) events as a proxy to understand the large‐scale inter‐chromosomal evolution, by resolving parental‐derived relationships for SD pairs. We revealed a significant excess of SD movements from the X chromosome to autosomes (p < 0.001), consistent with the expectation of meiotic sex chromosome inactivation. 
Enrichment analyses indicated that the genes within derived SD copies on autosomes were significantly related to biological processes involving nervous system, lipid biosynthesis and sperm motility (p < 0.01). Together, our analyses of the de novo assembly of ASM2165605v1 provide insight into the SVs between European wild boar and domestic pig, in addition to the ongoing process of meiotic sex chromosome inactivation in driving inter‐chromosomal interaction between the sex chromosome and autosomes.
Affiliation(s)
- Jianhai Chen
- Institutes for Systems Genetics, Frontiers Science Center for Disease-related Molecular Network, West China Hospital, Sichuan University, Chengdu, China
| | - Jie Zhong
- Institutes for Systems Genetics, Frontiers Science Center for Disease-related Molecular Network, West China Hospital, Sichuan University, Chengdu, China
| | - Xuefei He
- Institutes for Systems Genetics, Frontiers Science Center for Disease-related Molecular Network, West China Hospital, Sichuan University, Chengdu, China
| | - Xiaoyu Li
- Institutes for Systems Genetics, Frontiers Science Center for Disease-related Molecular Network, West China Hospital, Sichuan University, Chengdu, China
| | - Pan Ni
- Animal Husbandry and Veterinary Institute of Keqiao District, Shaoxing, Zhejiang, China
| | - Toni Safner
- Faculty of Agriculture, University of Zagreb, Zagreb, Croatia
- Centre of Excellence for Biodiversity and Molecular Plant Breeding (CoE CroP-BioDiv), Zagreb, Croatia
| | - Nikica Šprem
- Faculty of Agriculture, University of Zagreb, Zagreb, Croatia
| | - Jianlin Han
- International Livestock Research Institute, Nairobi, Kenya
- CAAS-ILRI Joint Laboratory on Livestock and Forage Genetic Resources, Institute of Animal Science, Chinese Academy of Agricultural Sciences, Beijing, China
| |
|
50
|
Athiyannan N, Abrouk M, Boshoff WHP, Cauet S, Rodde N, Kudrna D, Mohammed N, Bettgenhaeuser J, Botha KS, Derman SS, Wing RA, Prins R, Krattinger SG. Long-read genome sequencing of bread wheat facilitates disease resistance gene cloning. Nat Genet 2022; 54:227-231. [PMID: 35288708 PMCID: PMC8920886 DOI: 10.1038/s41588-022-01022-1] [Citation(s) in RCA: 56] [Impact Index Per Article: 28.0] [Received: 06/30/2021] [Accepted: 01/25/2022] [Indexed: 12/19/2022]
Abstract
The cloning of agronomically important genes from large, complex crop genomes remains challenging. Here we generate a 14.7 gigabase chromosome-scale assembly of the South African bread wheat (Triticum aestivum) cultivar Kariega by combining high-fidelity long reads, optical mapping and chromosome conformation capture. The resulting assembly is an order of magnitude more contiguous than previous wheat assemblies. Kariega shows durable resistance to the devastating fungal stripe rust disease. We identified the race-specific disease resistance gene Yr27, which encodes an intracellular immune receptor, to be a major contributor to this resistance. Yr27 is allelic to the leaf rust resistance gene Lr13; the Yr27 and Lr13 proteins show 97% sequence identity. Our results demonstrate the feasibility of generating chromosome-scale wheat assemblies to clone genes, and exemplify that highly similar alleles of a single-copy gene can confer resistance to different pathogens, which might provide a basis for engineering Yr27 alleles with multiple recognition specificities in the future.
Collapse
Affiliation(s)
- Naveenkumar Athiyannan
- Center for Desert Agriculture, Biological and Environmental Science and Engineering Division (BESE), King Abdullah University of Science and Technology (KAUST), Thuwal, Saudi Arabia
| | - Michael Abrouk
- Center for Desert Agriculture, Biological and Environmental Science and Engineering Division (BESE), King Abdullah University of Science and Technology (KAUST), Thuwal, Saudi Arabia
| | - Willem H P Boshoff
- Department of Plant Sciences, University of the Free State, Bloemfontein, South Africa
| | - Stéphane Cauet
- INRAE-CNRGV French Plant Genomic Resource Center, Castanet-Tolosan, France
| | - Nathalie Rodde
- INRAE-CNRGV French Plant Genomic Resource Center, Castanet-Tolosan, France
| | - David Kudrna
- Arizona Genomics Institute, School of Plant Sciences, University of Arizona, Tucson, AZ, USA
| | - Nahed Mohammed
- Center for Desert Agriculture, Biological and Environmental Science and Engineering Division (BESE), King Abdullah University of Science and Technology (KAUST), Thuwal, Saudi Arabia
| | - Jan Bettgenhaeuser
- Center for Desert Agriculture, Biological and Environmental Science and Engineering Division (BESE), King Abdullah University of Science and Technology (KAUST), Thuwal, Saudi Arabia
| | | | | | - Rod A Wing
- Center for Desert Agriculture, Biological and Environmental Science and Engineering Division (BESE), King Abdullah University of Science and Technology (KAUST), Thuwal, Saudi Arabia
- Arizona Genomics Institute, School of Plant Sciences, University of Arizona, Tucson, AZ, USA
| | - Renée Prins
- CenGen (Pty) Ltd, Worcester, South Africa
- Department of Genetics, Stellenbosch University, Stellenbosch, South Africa
| | - Simon G Krattinger
- Center for Desert Agriculture, Biological and Environmental Science and Engineering Division (BESE), King Abdullah University of Science and Technology (KAUST), Thuwal, Saudi Arabia
| |
|