2
Hebert PDN, Bock DG, Prosser SWJ. Interrogating 1000 insect genomes for NUMTs: A risk assessment for estimates of species richness. PLoS One 2023; 18:e0286620. [PMID: 37289794 PMCID: PMC10249859 DOI: 10.1371/journal.pone.0286620] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 03/08/2023] [Accepted: 05/22/2023] [Indexed: 06/10/2023]
Abstract
The nuclear genomes of most animal species include NUMTs, segments of the mitogenome incorporated into their chromosomes. Although NUMT counts are known to vary greatly among species, there has been no comprehensive study of their frequency/attributes in the most diverse group of terrestrial organisms, insects. This study examines NUMTs derived from a 658 bp 5' segment of the cytochrome c oxidase I (COI) gene, the barcode region for the animal kingdom. This assessment is important because unrecognized NUMTs can elevate estimates of species richness obtained through DNA barcoding and derived approaches (eDNA, metabarcoding). This investigation detected nearly 10,000 COI NUMTs ≥ 100 bp in the genomes of 1,002 insect species (range = 0-443). Variation in nuclear genome size explained 56% of the mitogenome-wide variation in NUMT counts. Although insect orders with the largest genome sizes possessed the highest NUMT counts, there was considerable variation among their component lineages. Two thirds of COI NUMTs possessed an IPSC (indel and/or premature stop codon) allowing their recognition and exclusion from downstream analyses. The remainder can elevate species richness as they showed 10.1% mean divergence from their mitochondrial homologue. The extent of exposure to "ghost species" is strongly impacted by the target amplicon's length. NUMTs can raise apparent species richness by up to 22% when a 658 bp COI amplicon is examined versus a doubling of apparent richness when 150 bp amplicons are targeted. Given these impacts, metabarcoding and eDNA studies should target the longest possible amplicons while also avoiding use of 12S/16S rDNA as they triple NUMT exposure because IPSC screens cannot be employed.
Affiliation(s)
- Paul D. N. Hebert
  - Centre for Biodiversity Genomics, University of Guelph, Guelph, ON, Canada
- Dan G. Bock
  - Centre for Biodiversity Genomics, University of Guelph, Guelph, ON, Canada
- Sean W. J. Prosser
  - Centre for Biodiversity Genomics, University of Guelph, Guelph, ON, Canada
3
Santos BF, Miller ME, Miklasevskaja M, McKeown JTA, Redmond NE, Coddington JA, Bird J, Miller SE, Smith A, Brady SG, Buffington ML, Chamorro ML, Dikow T, Gates MW, Goldstein P, Konstantinov A, Kula R, Silverson ND, Solis MA, deWaard SL, Naik S, Nikolova N, Pentinsaari M, Prosser SWJ, Sones JE, Zakharov EV, deWaard JR. Enhancing DNA barcode reference libraries by harvesting terrestrial arthropods at the Smithsonian's National Museum of Natural History. Biodivers Data J 2023; 11:e100904. [PMID: 38327288 PMCID: PMC10848724 DOI: 10.3897/bdj.11.e100904] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 01/23/2023] [Accepted: 03/30/2023] [Indexed: 02/09/2024]
Abstract
The use of DNA barcoding has revolutionised biodiversity science, but its application depends on the existence of comprehensive and reliable reference libraries. For many poorly known taxa, such reference sequences are missing even at higher-level taxonomic scales. We harvested the collections of the Smithsonian's National Museum of Natural History (USNM) to generate DNA barcoding sequences for genera of terrestrial arthropods previously not recorded in one or more major public sequence databases. Our workflow used a mix of Sanger and Next-Generation Sequencing (NGS) approaches to maximise sequence recovery while ensuring affordable cost. In total, COI sequences were obtained for 5,686 specimens belonging to 3,737 determined species in 3,886 genera and 205 families distributed in 137 countries. Success rates varied widely according to collection data and focal taxon. NGS helped recover sequences of specimens that failed a previous run of Sanger sequencing. Success rates and the optimal balance between Sanger and NGS are the most important drivers to maximise output and minimise cost in future projects. The corresponding sequence and taxonomic data can be accessed through the Barcode of Life Data System, GenBank, the Global Biodiversity Information Facility, the Global Genome Biodiversity Network Data Portal and the NMNH data portal.
Affiliation(s)
- Bernardo F. Santos
  - National Museum of Natural History, Smithsonian Institution, Washington, United States of America
  - Institut de Systématique, Evolution, Biodiversité (ISYEB), Muséum National d’Histoire naturelle, CNRS, SU, EPHE, UA, Paris, France
- Meredith E. Miller
  - Centre for Biodiversity Genomics, University of Guelph, Guelph, Canada
- Margarita Miklasevskaja
  - Centre for Biodiversity Genomics, University of Guelph, Guelph, Canada
- Jaclyn T. A. McKeown
  - Centre for Biodiversity Genomics, University of Guelph, Guelph, Canada
- Niamh E. Redmond
  - National Museum of Natural History, Smithsonian Institution, Washington, United States of America
- Jonathan A. Coddington
  - National Museum of Natural History, Smithsonian Institution, Washington, United States of America
- Jessica Bird
  - National Museum of Natural History, Smithsonian Institution, Washington, United States of America
- Scott E. Miller
  - National Museum of Natural History, Smithsonian Institution, Washington, United States of America
- Ashton Smith
  - National Museum of Natural History, Smithsonian Institution, Washington, United States of America
- Seán G. Brady
  - National Museum of Natural History, Smithsonian Institution, Washington, United States of America
- Matthew L. Buffington
  - Systematic Entomology Laboratory, Beltsville Agricultural Research Center, Agricultural Research Service, U.S. Department of Agriculture, Washington, United States of America
- M. Lourdes Chamorro
  - Systematic Entomology Laboratory, Beltsville Agricultural Research Center, Agricultural Research Service, U.S. Department of Agriculture, Washington, United States of America
- Torsten Dikow
  - National Museum of Natural History, Smithsonian Institution, Washington, United States of America
- Michael W. Gates
  - Systematic Entomology Laboratory, Beltsville Agricultural Research Center, Agricultural Research Service, U.S. Department of Agriculture, Washington, United States of America
- Paul Goldstein
  - Systematic Entomology Laboratory, Beltsville Agricultural Research Center, Agricultural Research Service, U.S. Department of Agriculture, Washington, United States of America
- Alexander Konstantinov
  - Systematic Entomology Laboratory, Beltsville Agricultural Research Center, Agricultural Research Service, U.S. Department of Agriculture, Washington, United States of America
- Robert Kula
  - Systematic Entomology Laboratory, Beltsville Agricultural Research Center, Agricultural Research Service, U.S. Department of Agriculture, Washington, United States of America
- Nicholas D. Silverson
  - National Museum of Natural History, Smithsonian Institution, Washington, United States of America
- M. Alma Solis
  - Systematic Entomology Laboratory, Beltsville Agricultural Research Center, Agricultural Research Service, U.S. Department of Agriculture, Washington, United States of America
- Stephanie L. deWaard
  - Centre for Biodiversity Genomics, University of Guelph, Guelph, Canada
- Suresh Naik
  - Centre for Biodiversity Genomics, University of Guelph, Guelph, Canada
  - Department of Integrative Biology, University of Guelph, Guelph, Canada
- Nadya Nikolova
  - Centre for Biodiversity Genomics, University of Guelph, Guelph, Canada
- Mikko Pentinsaari
  - Centre for Biodiversity Genomics, University of Guelph, Guelph, Canada
- Sean W. J. Prosser
  - Centre for Biodiversity Genomics, University of Guelph, Guelph, Canada
- Jayme E. Sones
  - Centre for Biodiversity Genomics, University of Guelph, Guelph, Canada
- Evgeny V. Zakharov
  - Centre for Biodiversity Genomics, University of Guelph, Guelph, Canada
  - Department of Integrative Biology, University of Guelph, Guelph, Canada
- Jeremy R. deWaard
  - National Museum of Natural History, Smithsonian Institution, Washington, United States of America
  - Centre for Biodiversity Genomics, University of Guelph, Guelph, Canada
  - School of Environmental Sciences, University of Guelph, Guelph, Canada