1
|
Diaz-Silveira GL, Deutsch J, Little DP. DNA Barcode Authentication of Devil's Claw Herbal Dietary Supplements. PLANTS (BASEL, SWITZERLAND) 2021; 10:plants10102005. [PMID: 34685813 PMCID: PMC8540935 DOI: 10.3390/plants10102005] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 08/30/2021] [Revised: 09/17/2021] [Accepted: 09/21/2021] [Indexed: 06/13/2023]
Abstract
Devil's claw is the vernacular name for a genus of medicinal plants that occur in the Kalahari Desert and Namibia Steppes. The genus comprises two distinct species: Harpagophytum procumbens and H. zeyheri. Although the European pharmacopeia considers the species interchangeable, recent studies have demonstrated that H. procumbens and H. zeyheri are chemically distinct and should not be treated as the same species. Further, the sale of H. zeyheri as an herbal supplement is not legal in the United States. Four markers were tested for their ability to distinguish H. procumbens from H. zeyheri: rbcL, matK, nrITS2, and psbA-trnH. Of these, only psbA-trnH was successful. A novel DNA mini-barcode assay that produces a 178-base amplicon in Harpagophytum (specificity = 1.00 [95% confidence interval = 0.80-1.00]; sensitivity = 1.00 [95% confidence interval = 0.75-1.00]) was used to estimate mislabeling frequency in a sample of 23 devil's claw supplements purchased in the United States. PCR amplification failed in 13% of cases. Among the 20 fully-analyzable supplements: H. procumbens was not detected in 75%; 25% contained both H. procumbens and H. zeyheri; none contained only H. procumbens. We recommend this novel mini-barcode region as a standard method of quality control in the manufacture of devil's claw supplements.
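The percentages in this abstract can be turned back into counts, and the shape of the reported confidence intervals (a perfect point estimate with a lower bound below 1) can be illustrated with an exact binomial bound. The sketch below is only an illustration: the abstract does not state the number of reference samples behind the sensitivity and specificity estimates, nor which interval method was used, so the `n` values and the choice of the Clopper-Pearson bound are assumptions.

```python
# Counts implied by the abstract: 23 supplements purchased, 20 fully analyzable.
purchased = 23
analyzable = 20
failed = purchased - analyzable           # PCR failures
pct_failed = round(100 * failed / purchased)

no_procumbens = round(0.75 * analyzable)  # H. procumbens not detected
mixed = round(0.25 * analyzable)          # both species detected


def lower_bound_all_correct(n: int, alpha: float = 0.05) -> float:
    """Clopper-Pearson lower limit when all n of n calls are correct.

    For x = n successes the exact lower limit is (alpha / 2) ** (1 / n)
    and the upper limit is 1. Assumed method, not confirmed by the source.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    return (alpha / 2) ** (1 / n)


if __name__ == "__main__":
    print(f"failed: {failed} of {purchased} ({pct_failed}%)")
    print(f"no H. procumbens: {no_procumbens}, mixed: {mixed}")
    # Hypothetical reference-sample sizes, for illustration only.
    for n in (10, 15, 20):
        print(n, round(lower_bound_all_correct(n), 2))
```

The bound tightens toward 1 as the number of reference samples grows, which is why a small validation set yields a wide interval even when every call is correct.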
|
2
|
Yang B, Zhang Z, Yang C, Wang Y, Orr MC, Hongbin W, Zhang AB. Identification of Species by Combining Molecular and Morphological Data Using Convolutional Neural Networks. Syst Biol 2021; 71:690-705. [PMID: 34524452 DOI: 10.1093/sysbio/syab076] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 12/19/2020] [Accepted: 09/08/2021] [Indexed: 11/14/2022]
Abstract
Integrative taxonomy is central to modern taxonomy and systematic biology, including behaviour, niche preference, distribution, morphological analysis and DNA barcoding. However, decades of use demonstrate that these methods can face challenges when used in isolation, for instance, potential misidentifications due to phenotypic plasticity for morphological methods, and incorrect identifications because of introgression, incomplete lineage sorting and horizontal gene transfer for DNA barcoding. Although researchers have advocated the use of integrative taxonomy, few detailed algorithms have been proposed. Here, we develop a convolutional neural network method (morphology-molecule network (MMNet)) that integrates morphological and molecular data for species identification. The newly proposed method (MMNet) worked better than four currently-available alternative methods when tested with 10 independent datasets representing varying genetic diversity from different taxa. High accuracies were achieved for all groups, including beetles (98.1% of 123 species), butterflies (98.8% of 24 species), fishes (96.3% of 214 species) and moths (96.4% of 150 total species). Further, MMNet demonstrated a high degree of accuracy (>98%) in four datasets including closely related species from the same genus. The average accuracy of two modest sub-genomic (single nucleotide polymorphism) datasets, comprising eight putative subspecies respectively, is 90%. Additional tests show that the success rate of species identification under this method most strongly depends on the amount of training data, and is robust to sequence length and image size. Analyses on the contribution of different data types (image versus gene) indicate that both morphological and genetic data are important to the model, and that genetic data contribute slightly more. 
The approaches developed here serve as a foundation for the future integration of multi-modal information for integrative taxonomy, such as image, audio, video, 3D scanning and biosensor data, to characterize organisms more comprehensively as a basis for improved investigation, monitoring and conservation of biodiversity.
Affiliation(s)
- Bing Yang
- College of Life Sciences, Capital Normal University, Beijing 100048, People's Republic of China
| | - Zhenxin Zhang
- The Key Laboratory of 3D Information Acquisition and Application, MOE, Capital Normal University, Beijing 100048, People's Republic of China.,Beijing Laboratory of Water Resources Security, Capital Normal University, Beijing 100048, People's Republic of China.,Base of the State Key Laboratory of Urban Environmental Process and Digital, Capital Normal University, Beijing 100048, People's Republic of China
| | - Caiqing Yang
- College of Life Sciences, Capital Normal University, Beijing 100048, People's Republic of China
| | - Ying Wang
- College of Life Sciences, Capital Normal University, Beijing 100048, People's Republic of China
| | - Michael C Orr
- Key Laboratory of Zoological Systematics and Evolution, Institute of Zoology, Chinese Academy of Sciences, Beijing 100101, People's Republic of China
| | - Wang Hongbin
- Museum of Forest Biodiversity, Research Institute of Forest Ecology, Environment and Protection, Chinese Academy of Forestry, Beijing 100091, People's Republic of China
| | - Ai-Bing Zhang
- College of Life Sciences, Capital Normal University, Beijing 100048, People's Republic of China
|
3
|
Howard C, Lockie-Williams C, Slater A. Applied Barcoding: The Practicalities of DNA Testing for Herbals. PLANTS (BASEL, SWITZERLAND) 2020; 9:E1150. [PMID: 32899738 PMCID: PMC7570336 DOI: 10.3390/plants9091150] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Received: 06/07/2020] [Revised: 07/22/2020] [Accepted: 08/28/2020] [Indexed: 12/26/2022]
Abstract
DNA barcoding is a widely accepted technique for the identification of plant materials, and its application to the authentication of commercial medicinal plants has attracted significant attention. The incorporation of DNA-based technologies into the quality testing protocols of international pharmacopoeias represents a step-change in status, requiring the establishment of standardized, reliable and reproducible methods. The process by which this can be achieved for any herbal medicine is described, using Hypericum perforatum L. (St John's Wort) and potential adulterant Hypericum species as a case study. A range of practical issues are considered including quality control of DNA sequences from public repositories and the construction of individual curated databases, choice of DNA barcode region(s) and the identification of informative polymorphic nucleotide sequences. A decision tree informs the structure of the manuscript and provides a template to guide the development of future DNA barcode tests for herbals.
Affiliation(s)
- Caroline Howard
- Biomolecular Technology Group, Leicester School of Allied Health Science, Faculty of Health and Life Sciences, De Montfort University, Leicester LE1 9BH, UK
- BP-NIBSC Herbal Laboratory, National Institute for Biological Standards and Controls, Potters Bar EN6 3QG, UK
| | - Claire Lockie-Williams
- BP-NIBSC Herbal Laboratory, National Institute for Biological Standards and Controls, Potters Bar EN6 3QG, UK
| | - Adrian Slater
- Biomolecular Technology Group, Leicester School of Allied Health Science, Faculty of Health and Life Sciences, De Montfort University, Leicester LE1 9BH, UK
|
4
|
Adibah A, Syazwan S, Haniza Hanim M, Badrul Munir M, Intan Faraha A, Siti Azizah M. Evaluation of DNA barcoding to facilitate the authentication of processed fish products in the seafood industry. Lebensm Wiss Technol 2020. [DOI: 10.1016/j.lwt.2020.109585] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Indexed: 10/24/2022]
|
5
|
Adeoba MI, Kabongo R, Van der Bank H, Yessoufou K. Re-evaluation of the discriminatory power of DNA barcoding on some specimens of African Cyprinidae (subfamilies Cyprininae and Danioninae). Zookeys 2018:105-121. [PMID: 29674898 PMCID: PMC5906743 DOI: 10.3897/zookeys.746.13502] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Received: 05/02/2017] [Accepted: 12/14/2017] [Indexed: 12/02/2022]
Abstract
Specimen identification in the absence of diagnostic morphological characters (e.g., larvae) can be problematic even for experts. The goal of the present study was to assess the performance of COI in discriminating specimens of the fish family Cyprinidae in Africa, and to explore whether COI-phylogeny can be reliably used for phylogenetic comparative analysis. The main objective was to analyse a matrix of COI sequences for 315 specimens from 15 genera of African Cyprinidae using various distance-based identification methods alongside multiple tests of DNA barcode efficacy (barcode gap, species monophyly on NJ tree). Some morphological and biological characters were also mapped on a COI-phylogeny reconstructed using Maximum Parsimony. First, the results indicated the existence of barcode gaps, a discriminatory power of COI ranging from 79% to 92%, and that most nodes form well-supported monophyletic clades on an NJ tree. Second, it was found that some morphological and biological characters are clustered on the COI-phylogeny, and this indicates the reliability of these characters for taxonomic discrimination within the family. Taken together, our results not only provide additional support for COI as a good barcode marker for the African Cyprinidae but also indicate the utility of COI-based phylogenies for a wide spectrum of ecological questions related to African Cyprinidae.
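The "barcode gap" test mentioned in this abstract compares the largest within-species distance with the smallest between-species distance. A minimal sketch follows, assuming pre-aligned sequences of equal length and an uncorrected p-distance; the abstract does not say which distance model the authors used, and the species labels and sequences here are invented toy data.

```python
from itertools import combinations


def p_distance(a: str, b: str) -> float:
    """Proportion of differing sites between two aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def barcode_gap(records):
    """Return (max intraspecific, min interspecific, gap) distances.

    `records` is a list of (species_label, aligned_sequence) pairs and must
    contain at least one within-species and one between-species pair.
    A positive gap means the two distance ranges do not overlap.
    """
    intra, inter = [], []
    for (sp1, s1), (sp2, s2) in combinations(records, 2):
        (intra if sp1 == sp2 else inter).append(p_distance(s1, s2))
    max_intra, min_inter = max(intra), min(inter)
    return max_intra, min_inter, min_inter - max_intra


# Hypothetical toy alignment, not data from the study.
toy = [
    ("sp_A", "ACGTACGTAC"),
    ("sp_A", "ACGTACGTAT"),
    ("sp_B", "ACGAACCTAC"),
    ("sp_B", "ACGAACCTAC"),
]

if __name__ == "__main__":
    print(barcode_gap(toy))
```

With real data the same comparison is usually made per species rather than over the pooled dataset, since a pooled gap can hide individual species whose ranges overlap.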
Affiliation(s)
- Mariam I Adeoba
- Department of Zoology, University of Johannesburg, Kingsway Campus PO Box 524, Auckland Park 2006, South Africa
| | - Ronny Kabongo
- African Centre for DNA Barcoding, University of Johannesburg, Kingsway Campus, PO Box 524, Auckland Park 2006, South Africa
| | - Herman Van der Bank
- Department of Zoology, University of Johannesburg, Kingsway Campus PO Box 524, Auckland Park 2006, South Africa
| | - Kowiyou Yessoufou
- Department of Geography, Environmental management and Energy studies, University of Johannesburg, Kingsway Campus PO Box 524, Auckland Park 2006, South Africa
|
6
|
Adeoba MI, Kabongo R, Van der Bank H, Yessoufou K. Re-evaluation of the discriminatory power of DNA barcoding on some specimens of African Cyprinidae (subfamilies Cyprininae and Danioninae). Zookeys 2018. [DOI: 10.3897/zookeys.744.13502] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/12/2022]
|
7
|
Adeoba MI, Kabongo R, Van der Bank H, Yessoufou K. Re-evaluation of the discriminatory power of DNA barcoding on some specimens of African Cyprinidae (subfamilies Cyprininae and Danioninae). Zookeys 2018. [DOI: 10.3897/zookeys.740.13502] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/12/2022]
|
8
|
Zielezinski A, Vinga S, Almeida J, Karlowski WM. Alignment-free sequence comparison: benefits, applications, and tools. Genome Biol 2017; 18:186. [PMID: 28974235 PMCID: PMC5627421 DOI: 10.1186/s13059-017-1319-7] [Citation(s) in RCA: 239] [Impact Index Per Article: 34.1] [Indexed: 01/30/2023]
Abstract
Alignment-free sequence analyses have been applied to problems ranging from whole-genome phylogeny to the classification of protein families, identification of horizontally transferred genes, and detection of recombined sequences. The strength of these methods makes them particularly useful for next-generation sequencing data processing and analysis. However, many researchers are unclear about how these methods work, how they compare to alignment-based methods, and what their potential is for use for their research. We address these questions and provide a guide to the currently available alignment-free sequence analysis tools.
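One common family of alignment-free methods compares sequences through their k-mer (word) counts rather than through an alignment. The sketch below shows that idea in its simplest form, using Euclidean distance between k-mer count vectors; the function names, the default `k`, and the choice of distance are illustrative assumptions, not a description of any particular tool covered by the review.

```python
from collections import Counter
from math import sqrt


def kmer_counts(seq: str, k: int) -> Counter:
    """Count every overlapping k-mer in a sequence (case-insensitive)."""
    seq = seq.upper()
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def kmer_distance(a: str, b: str, k: int = 3) -> float:
    """Euclidean distance between the k-mer count vectors of two sequences.

    No alignment is performed, so the sequences may differ in length.
    """
    ca, cb = kmer_counts(a, k), kmer_counts(b, k)
    # Counter returns 0 for k-mers absent from one of the sequences.
    return sqrt(sum((ca[w] - cb[w]) ** 2 for w in set(ca) | set(cb)))


if __name__ == "__main__":
    # Hypothetical toy sequences of different lengths.
    print(kmer_distance("ACGTACGTACGT", "ACGTACGAACGTTT"))
```

Raw counts make longer sequences look more distant; normalising counts to frequencies is a usual refinement when lengths vary widely.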
Affiliation(s)
- Andrzej Zielezinski
- Department of Computational Biology, Faculty of Biology, Adam Mickiewicz University in Poznan, Umultowska 89, 61-614, Poznan, Poland
| | - Susana Vinga
- IDMEC, Instituto Superior Técnico, Universidade de Lisboa, Av. Rovisco Pais 1, 1049-001, Lisbon, Portugal
| | - Jonas Almeida
- Stony Brook University (SUNY), 101 Nicolls Road, Stony Brook, NY, 11794, USA
| | - Wojciech M Karlowski
- Department of Computational Biology, Faculty of Biology, Adam Mickiewicz University in Poznan, Umultowska 89, 61-614, Poznan, Poland.
|
9
|
Hosein FN, Austin N, Maharaj S, Johnson W, Rostant L, Ramdass AC, Rampersad SN. Utility of DNA barcoding to identify rare endemic vascular plant species in Trinidad. Ecol Evol 2017; 7:7311-7333. [PMID: 28944019 PMCID: PMC5606854 DOI: 10.1002/ece3.3220] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.4] [Received: 08/10/2016] [Revised: 05/17/2017] [Accepted: 06/12/2017] [Indexed: 02/06/2023]
Abstract
The islands of the Caribbean are considered to be a "biodiversity hotspot." Collectively, a high level of endemism for several plant groups has been reported for this region. Biodiversity conservation should, in part, be informed by taxonomy, population status, and distribution of flora. One taxonomic impediment to species inventory and management is correct identification, as conventional morphology-based assessment is subject to several caveats. DNA barcoding can be a useful tool to quickly and accurately identify species and has the potential to prompt the discovery of new species. In this study, the ability of DNA barcoding to confirm the identities of 14 endangered endemic vascular plant species in Trinidad was assessed using three DNA barcodes (matK, rbcL, and rpoC1). Herbarium identifications were previously made for all species under study. matK, rbcL, and rpoC1 markers were successful in amplifying target regions for seven of the 14 species. rpoC1 sequences required extensive editing and were unusable. rbcL primers resulted in the cleanest reads; however, matK appeared to be superior to rbcL based on a number of parameters assessed, including level of DNA polymorphism in the sequences, genetic distance, reference library coverage based on BLASTN statistics, direct sequence comparisons within "best match" and "best close match" criteria, and finally, degree of clustering with moderate to strong bootstrap support (>60%) in neighbor-joining tree-based comparisons. The performance of both markers seemed to be species-specific based on the parameters examined. Overall, the Trinidad sequences were accurately identified to the genus level for all endemic plant species successfully amplified and sequenced using both matK and rbcL markers. DNA barcoding can contribute to taxonomic and biodiversity research and will complement efforts to select taxa for various molecular ecology and population genetics studies.
Affiliation(s)
- Fazeeda N. Hosein
- Faculty of Science and Technology, Department of Life Sciences, The University of the West Indies, St. Augustine, Trinidad and Tobago – West Indies
| | - Nigel Austin
- Faculty of Science and Technology, Department of Life Sciences, The University of the West Indies, St. Augustine, Trinidad and Tobago – West Indies
| | - Shobha Maharaj
- Faculty of Science and Technology, Department of Life Sciences, The University of the West Indies, St. Augustine, Trinidad and Tobago – West Indies
| | - Winston Johnson
- Faculty of Science and Technology, Department of Life Sciences, The University of the West Indies, St. Augustine, Trinidad and Tobago – West Indies
| | - Luke Rostant
- Faculty of Science and Technology, Department of Life Sciences, The University of the West Indies, St. Augustine, Trinidad and Tobago – West Indies
| | - Amanda C. Ramdass
- Faculty of Science and Technology, Department of Life Sciences, The University of the West Indies, St. Augustine, Trinidad and Tobago – West Indies
| | - Sephra N. Rampersad
- Faculty of Science and Technology, Department of Life Sciences, The University of the West Indies, St. Augustine, Trinidad and Tobago – West Indies
|
10
|
Parker J, Helmstetter AJ, Devey D, Wilkinson T, Papadopulos AST. Field-based species identification of closely-related plants using real-time nanopore sequencing. Sci Rep 2017; 7:8345. [PMID: 28827531 PMCID: PMC5566789 DOI: 10.1038/s41598-017-08461-5] [Citation(s) in RCA: 54] [Impact Index Per Article: 7.7] [Received: 07/03/2017] [Accepted: 07/12/2017] [Indexed: 01/04/2023]
Abstract
Advances in DNA sequencing and informatics have revolutionised biology over the past four decades, but technological limitations have left many applications unexplored. Recently, portable, real-time, nanopore sequencing (RTnS) has become available. This offers opportunities to rapidly collect and analyse genomic data anywhere. However, generation of datasets from large, complex genomes has been constrained to laboratories. The portability and long DNA sequences of RTnS offer great potential for field-based species identification, but the feasibility and accuracy of these technologies for this purpose have not been assessed. Here, we show that a field-based RTnS analysis of closely-related plant species (Arabidopsis spp.) has many advantages over laboratory-based high-throughput sequencing (HTS) methods for species level identification and phylogenomics. Samples were collected and sequenced in a single day by RTnS using a portable, “al fresco” laboratory. Our analyses demonstrate that correctly identifying unknown reads from matches to a reference database with RTnS reads enables rapid and confident species identification. Individually annotated RTnS reads can be used to infer the evolutionary relationships of A. thaliana. Furthermore, hybrid genome assembly with RTnS and HTS reads substantially improved upon a genome assembled from HTS reads alone. Field-based RTnS makes real-time, rapid specimen identification and genome wide analyses possible.
Affiliation(s)
- Joe Parker
- Jodrell Laboratory, Royal Botanic Gardens, Kew, Richmond, Surrey, UK, TW9 3AB.
| | | | - Dion Devey
- Jodrell Laboratory, Royal Botanic Gardens, Kew, Richmond, Surrey, UK, TW9 3AB
| | - Tim Wilkinson
- Jodrell Laboratory, Royal Botanic Gardens, Kew, Richmond, Surrey, UK, TW9 3AB
| | - Alexander S T Papadopulos
- Jodrell Laboratory, Royal Botanic Gardens, Kew, Richmond, Surrey, UK, TW9 3AB; Molecular Ecology and Fisheries Genetics Laboratory, Environment Centre Wales, School of Biological Sciences, Bangor University, Bangor, UK, LL57 2UW
|
11
|
Korshunova T, Martynov A, Bakken T, Picton B. External diversity is restrained by internal conservatism: New nudibranch mollusc contributes to the cryptic species problem. ZOOL SCR 2017. [DOI: 10.1111/zsc.12253] [Citation(s) in RCA: 29] [Impact Index Per Article: 4.1] [Indexed: 12/20/2022]
Affiliation(s)
- Tatiana Korshunova
- Koltzov Institute of Developmental Biology RAS; Moscow Russia
- Zoological Museum; Moscow State University; Moscow Russia
| | | | - Torkild Bakken
- NTNU University Museum; Norwegian University of Science and Technology; Trondheim Norway
|
12
|
Using Next-Generation Sequencing for DNA Barcoding: Capturing Allelic Variation in ITS2. G3-GENES GENOMES GENETICS 2017; 7:19-29. [PMID: 27799340 PMCID: PMC5217108 DOI: 10.1534/g3.116.036145] [Citation(s) in RCA: 26] [Impact Index Per Article: 3.7] [Indexed: 01/14/2023]
Abstract
Internal Transcribed Spacer 2 (ITS2) is a popular DNA barcoding marker; however, in some animal species it is hypervariable and therefore difficult to sequence with traditional methods. With next-generation sequencing (NGS) it is possible to sequence all gene variants despite the presence of single nucleotide polymorphisms (SNPs), insertions/deletions (indels), homopolymeric regions, and microsatellites. Our aim was to compare the performance of Sanger sequencing and NGS amplicon sequencing in characterizing ITS2 in 26 mosquito species represented by 88 samples. The suitability of ITS2 as a DNA barcoding marker for mosquitoes, and its allelic diversity in individuals and species, was also assessed. Compared to Sanger sequencing, NGS was able to characterize the ITS2 region to a greater extent, with resolution within and between individuals and species that was previously not possible. A total of 382 unique sequences (alleles) were generated from the 88 mosquito specimens, demonstrating the diversity present that has been overlooked by traditional sequencing methods. Multiple indels and microsatellites were present in the ITS2 alleles, which were often specific to species or genera, causing variation in sequence length. As a barcoding marker, ITS2 was able to separate all of the species, apart from members of the Culex pipiens complex, providing the same resolution as the commonly used Cytochrome Oxidase I (COI). The ability to cost-effectively sequence hypervariable markers makes NGS an invaluable tool with many applications in the DNA barcoding field, and provides insights into the limitations of previous studies and techniques.
|
13
|
Horn T, Häser A. Bamboo tea: reduction of taxonomic complexity and application of DNA diagnostics based on rbcL and matK sequence data. PeerJ 2016; 4:e2781. [PMID: 27957401 PMCID: PMC5149056 DOI: 10.7717/peerj.2781] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.1] [Received: 06/24/2016] [Accepted: 11/10/2016] [Indexed: 11/30/2022]
Abstract
Background: Names used in ingredient lists of food products are trivial and by their nature rarely precise. The most recent scientific interpretation of the term bamboo (Bambusoideae, Poaceae) comprises over 1,600 distinct species. In the European Union, only a few of these exotic species are well-known sources of food ingredients (i.e., bamboo sprouts) and are thus not considered novel foods, which would require safety assessments before marketing of corresponding products. In contrast, the use of bamboo leaves and their taxonomic origin is mostly unclear. However, products containing bamboo leaves are currently marketed. Methods: We analysed bamboo species and tea products containing bamboo leaves using anatomical leaf characters and DNA sequence data. To reduce taxonomic complexity associated with the term bamboo, we used a phylogenetic framework to trace the origin of DNA from commercially available bamboo leaves within the bambusoid subfamily. For authentication purposes, we introduced a simple PCR-based test distinguishing genuine bamboo from other leaf components and assessed the diagnostic potential of rbcL and matK to resolve taxonomic entities within the bamboo subfamily and tribes. Results: Based on anatomical and DNA data, we were able to trace the taxonomic origin of bamboo leaves used in products to the genera Phyllostachys and Pseudosasa from the temperate "woody" bamboo tribe (Arundinarieae). Currently available rbcL and matK sequence data allow the character-based diagnosis of 80% of represented bamboo genera. We detected adulteration by carnation in four of eight tea products and, after adapting our objectives, could trace the taxonomic origin of the adulterant to Dianthus chinensis (Caryophyllaceae), a well-known traditional Chinese medicine with contraindications for pregnant women.
Affiliation(s)
- Thomas Horn
- Molecular Cellbiology, Karlsruhe Institute of Technology, Karlsruhe, Germany
| | - Annette Häser
- Molecular Cellbiology, Karlsruhe Institute of Technology, Karlsruhe, Germany
|
14
|
Fiscon G, Weitschek E, Cella E, Lo Presti A, Giovanetti M, Babakir-Mina M, Ciotti M, Ciccozzi M, Pierangeli A, Bertolazzi P, Felici G. MISSEL: a method to identify a large number of small species-specific genomic subsequences and its application to viruses classification. BioData Min 2016; 9:38. [PMID: 27980679 PMCID: PMC5139023 DOI: 10.1186/s13040-016-0116-2] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.5] [Received: 07/04/2016] [Accepted: 11/20/2016] [Indexed: 12/04/2022]
Abstract
Background: Continuous improvements in next generation sequencing technologies have led to ever-increasing collections of genomic sequences, which have not been easily characterized by biologists, and whose analysis requires huge computational effort. The classification of species emerged as one of the main applications of DNA analysis and has been addressed with several approaches, e.g., multiple alignments-, phylogenetic trees-, statistical- and character-based methods. Results: We propose a supervised method based on a genetic algorithm to identify small genomic subsequences that discriminate among different species. The method identifies multiple subsequences of bounded length with the same information power in a given genomic region. The algorithm has been successfully evaluated through its integration into a rule-based classification framework and applied to three different biological data sets: Influenza, Polyoma, and Rhino virus sequences. Conclusions: We discover a large number of small subsequences that can be used to identify each virus type with high accuracy and low computational time, and moreover help to characterize different genomic regions. Bounding their length to 20, our method found 1164 characterizing subsequences for all the Influenza virus subtypes, 194 for all the Polyoma viruses, and 11 for Rhino viruses. The abundance of small separating subsequences extracted for each genomic region may be an important support for quick and robust virus identification. Finally, useful biological information can be derived from the relative location and abundance of such subsequences along the different regions.
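To make the idea of "small species-specific subsequences" concrete, the sketch below enumerates fixed-length k-mers that occur in every sequence of one class and in no sequence of any other class. This is a brute-force stand-in chosen for clarity; it is not the genetic algorithm the abstract describes, and the function names, class labels, and toy sequences are all hypothetical.

```python
def kmers(seq: str, k: int) -> set:
    """All distinct overlapping substrings of length k."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def discriminating_kmers(groups: dict, k: int) -> dict:
    """Map each class label to its class-specific k-mers.

    `groups` maps a label to a non-empty list of sequences. A k-mer is kept
    for a label if it is present in every sequence of that class and absent
    from every sequence of all other classes.
    """
    result = {}
    for label, seqs in groups.items():
        shared = set.intersection(*(kmers(s, k) for s in seqs))
        elsewhere = set().union(
            *(kmers(s, k)
              for other, others in groups.items() if other != label
              for s in others)
        )
        result[label] = shared - elsewhere
    return result


# Hypothetical toy data, not sequences from the paper.
toy_groups = {
    "type_1": ["ACGTTGCA", "TTACGTTG"],
    "type_2": ["GGCCAATT", "CCAATTGG"],
}

if __name__ == "__main__":
    for label, found in discriminating_kmers(toy_groups, 4).items():
        print(label, sorted(found))
```

Exhaustive enumeration like this scales poorly with sequence count and length, which is one motivation for a heuristic search such as the genetic algorithm named in the abstract.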
Affiliation(s)
- Giulia Fiscon
- Institute of Systems Analysis and Computer Science A. Ruberti (IASI), National Research Council (CNR), Via dei Taurini 19, Rome, 00185 Italy
| | - Emanuel Weitschek
- Institute of Systems Analysis and Computer Science A. Ruberti (IASI), National Research Council (CNR), Via dei Taurini 19, Rome, 00185 Italy.,Department of Engineering, Uninettuno International University, Corso Vittorio Emanuele II 39, Rome, 00186 Italy
| | - Eleonora Cella
- Department of Infectious Diseases, Istituto Superiore di Sanita, Viale Regina Margherita 299, Rome, 00161 Italy.,Public Health and Infectious Diseases, Sapienza University, Piazzale Aldo Moro 5, Rome, 00185 Italy
| | - Alessandra Lo Presti
- Department of Infectious Diseases, Istituto Superiore di Sanita, Viale Regina Margherita 299, Rome, 00161 Italy
| | - Marta Giovanetti
- Department of Infectious Diseases, Istituto Superiore di Sanita, Viale Regina Margherita 299, Rome, 00161 Italy.,Department of Biology, University of Rome Tor Vergata, Via della Ricerca Scientifica 1, Rome, 00133 Italy
| | | | - Marco Ciotti
- Laboratory of Molecular Virology, Polyclinic Tor Vergata Foundation, Viale Oxford 81, Rome, 00133 Italy
| | - Massimo Ciccozzi
- Institute of Systems Analysis and Computer Science A. Ruberti (IASI), National Research Council (CNR), Via dei Taurini 19, Rome, 00185 Italy.,Department of Infectious Diseases, Istituto Superiore di Sanita, Viale Regina Margherita 299, Rome, 00161 Italy
| | - Alessandra Pierangeli
- Virology Laboratory, Department of Molecular Medicine, Sapienza University, Viale di Porta Tiburtina 2, Rome, 00185 Italy
| | - Paola Bertolazzi
- Institute of Systems Analysis and Computer Science A. Ruberti (IASI), National Research Council (CNR), Via dei Taurini 19, Rome, 00185 Italy
| | - Giovanni Felici
- Institute of Systems Analysis and Computer Science A. Ruberti (IASI), National Research Council (CNR), Via dei Taurini 19, Rome, 00185 Italy
| |
Collapse
|
15
|
Zhang A, Hao M, Yang C, Shi Z. BarcodingR: an integrated R package for species identification using DNA barcodes. Methods Ecol Evol 2016. [DOI: 10.1111/2041-210x.12682] [Citation(s) in RCA: 30] [Impact Index Per Article: 3.8] [Indexed: 12/01/2022]
Affiliation(s)
- Ai‐bing Zhang: College of Life Sciences, Capital Normal University, Beijing 100048, China
- Meng‐di Hao: College of Life Sciences, Capital Normal University, Beijing 100048, China
- Cai‐qing Yang: College of Life Sciences, Capital Normal University, Beijing 100048, China
- Zhi‐yong Shi: College of Life Sciences, Capital Normal University, Beijing 100048, China
|
16
|
Williamson J, Maurin O, Shiba S, van der Bank H, Pfab M, Pilusa M, Kabongo R, van der Bank M. Exposing the illegal trade in cycad species (Cycadophyta:Encephalartos) at two traditional medicine markets in South Africa using DNA barcoding. Genome 2016; 59:771-81. [DOI: 10.1139/gen-2016-0032] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.1] [Indexed: 11/22/2022]
Abstract
Species in the cycad genus Encephalartos are listed in CITES Appendix I and as Threatened or Protected Species in terms of South Africa’s National Environmental Management: Biodiversity Act (NEM:BA) of 2004. Despite regulations, illegal plant harvesting for medicinal trade has continued in South Africa and resulted in declines in cycad populations and even complete loss of sub-populations. Encephalartos is traded at traditional medicine markets in South Africa in the form of bark strips and stem sections; thus, determining the species traded presents a major challenge due to a lack of characteristic plant parts. Here, a case study is presented on the use of DNA barcoding to identify cycads sold at the Faraday and Warwick traditional medicine markets in Johannesburg and Durban, respectively. Market samples were sequenced for the core DNA barcodes (rbcLa and matK) as well as two additional regions: nrITS and trnH-psbA. The barcoding database for cycads at the University of Johannesburg was utilized to assign query samples to known species. Three approaches were followed: tree-based, similarity-based, and character-based (BRONX) methods. Market samples identified were Encephalartos ferox (Near Threatened), Encephalartos lebomboensis (Endangered), Encephalartos natalensis (Near Threatened), Encephalartos senticosus (Vulnerable), and Encephalartos villosus (Least Concern). Results from this study are crucial for making appropriate assessments and decisions on how to manage these markets.
Affiliation(s)
- J. Williamson: The African Centre for DNA Barcoding, Department of Botany & Plant Biotechnology, University of Johannesburg, APK Campus, P.O. Box 524, Auckland Park, 2006, South Africa
- O. Maurin: The African Centre for DNA Barcoding, Department of Botany & Plant Biotechnology, University of Johannesburg, APK Campus, P.O. Box 524, Auckland Park, 2006, South Africa
- S.N.S. Shiba: The African Centre for DNA Barcoding, Department of Botany & Plant Biotechnology, University of Johannesburg, APK Campus, P.O. Box 524, Auckland Park, 2006, South Africa
- H. van der Bank: The African Centre for DNA Barcoding, Department of Zoology, University of Johannesburg, APK Campus, P.O. Box 524, Auckland Park, 2006, South Africa
- M. Pfab: South African National Biodiversity Institute, Pretoria National Botanical Garden, P/Bag X101, Silverton, 0184, South Africa
- M. Pilusa: The African Centre for DNA Barcoding, Department of Botany & Plant Biotechnology, University of Johannesburg, APK Campus, P.O. Box 524, Auckland Park, 2006, South Africa
- R.M. Kabongo: The African Centre for DNA Barcoding, Department of Botany & Plant Biotechnology, University of Johannesburg, APK Campus, P.O. Box 524, Auckland Park, 2006, South Africa
- M. van der Bank: The African Centre for DNA Barcoding, Department of Botany & Plant Biotechnology, University of Johannesburg, APK Campus, P.O. Box 524, Auckland Park, 2006, South Africa
|
17
|
Hartvig I, Czako M, Kjær ED, Nielsen LR, Theilade I. The Use of DNA Barcoding in Identification and Conservation of Rosewood (Dalbergia spp.). PLoS One 2015; 10:e0138231. [PMID: 26375850 PMCID: PMC4573973 DOI: 10.1371/journal.pone.0138231] [Citation(s) in RCA: 70] [Impact Index Per Article: 7.8] [Received: 06/19/2015] [Accepted: 08/26/2015] [Indexed: 11/19/2022] Open
Abstract
The genus Dalbergia contains many valuable timber species threatened by illegal logging and deforestation, but knowledge on distributions and threats is often limited and accurate species identification difficult. The aim of this study was to apply DNA barcoding methods to support conservation efforts of Dalbergia species in Indochina. We used the recommended rbcL, matK and ITS barcoding markers on 95 samples covering 31 species of Dalbergia, and tested their discrimination ability with both traditional distance-based as well as different model-based machine learning methods. We specifically tested whether the markers could be used to solve taxonomic confusion concerning the timber species Dalbergia oliveri, and to identify the CITES-listed Dalbergia cochinchinensis. We also applied the barcoding markers to 14 samples of unknown identity. In general, we found that the barcoding markers discriminated among Dalbergia species with high accuracy. We found that ITS yielded the single highest discrimination rate (100%), but due to difficulties in obtaining high-quality sequences from degraded material, the better overall choice for Dalbergia seems to be the standard rbcL+matK barcode, as this yielded discrimination rates close to 90% and amplified well. The distance-based method TaxonDNA showed the highest identification rates overall, although a more complete specimen sampling is needed to conclude on the best analytic method. We found strong support for a monophyletic Dalbergia oliveri and encourage that this name is used consistently in Indochina. The CITES-listed Dalbergia cochinchinensis was successfully identified, and a species-specific assay can be developed from the data generated in this study for the identification of illegally traded timber. 
We suggest that the use of DNA barcoding is integrated into the work flow during floristic studies and at national herbaria in the region, as this could significantly increase the number of identified specimens and improve knowledge about species distributions.
Affiliation(s)
- Ida Hartvig: Forest Genetics and Diversity, Department of Geosciences and Natural Resource Management, University of Copenhagen, Frederiksberg, Denmark
- Mihaly Czako: Department of Biological Sciences, University of South Carolina, Columbia, South Carolina, United States of America
- Erik Dahl Kjær: Forest Genetics and Diversity, Department of Geosciences and Natural Resource Management, University of Copenhagen, Frederiksberg, Denmark
- Lene Rostgaard Nielsen: Forest Genetics and Diversity, Department of Geosciences and Natural Resource Management, University of Copenhagen, Frederiksberg, Denmark
- Ida Theilade: Global Development, Department of Food and Resource Economics, University of Copenhagen, Frederiksberg, Denmark
|
18
|
Fiannaca A, La Rosa M, Rizzo R, Urso A. A k-mer-based barcode DNA classification methodology based on spectral representation and a neural gas network. Artif Intell Med 2015; 64:173-84. [PMID: 26170017 DOI: 10.1016/j.artmed.2015.06.002] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.2] [Received: 08/27/2014] [Revised: 05/25/2015] [Accepted: 06/25/2015] [Indexed: 11/28/2022]
Abstract
OBJECTIVES In this paper, an alignment-free method for DNA barcode classification that is based on both a spectral representation and a neural gas network for unsupervised clustering is proposed. METHODS In the proposed methodology, distinctive words are identified from a spectral representation of DNA sequences. A taxonomic classification of the DNA sequence is then performed using the sequence signature, i.e., the smallest set of k-mers that can assign a DNA sequence to its proper taxonomic category. Experiments were then performed to compare our method with other supervised machine learning classification algorithms, such as support vector machine, random forest, ripper, naïve Bayes, ridor, and classification tree, which also consider short DNA sequence fragments of 200 and 300 base pairs (bp). The experimental tests were conducted over 10 real barcode datasets belonging to different animal species, which were provided by the on-line resource "Barcode of Life Database". RESULTS The experimental results showed that our k-mer-based approach is directly comparable, in terms of accuracy, recall and precision metrics, with the other classifiers when considering full-length sequences. In addition, we demonstrate the robustness of our method when a classification task is performed with a set of short DNA sequences that were randomly extracted from the original data. For example, the proposed method can reach an accuracy of 64.8% at the species level with 200-bp fragments. Under the same conditions, the best other classifier (random forest) reaches an accuracy of 20.9%. CONCLUSIONS Our results indicate that we obtained a clear improvement over the other classifiers for the study of short DNA barcode sequence fragments.
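A spectral representation of the kind mentioned in this abstract can be sketched as a normalised k-mer frequency vector. The Python below is a minimal illustration under assumptions: the alphabet is restricted to A, C, G and T, the comparison uses cosine similarity, and the neural gas clustering step is not implemented. The names `kmer_spectrum` and `cosine` are hypothetical.

```python
import math
from collections import Counter
from itertools import product


def kmer_spectrum(seq, k):
    """Return relative frequencies of all 4**k DNA words, in a fixed order.

    Windows containing characters outside ACGT are ignored (an assumption).
    """
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    words = ["".join(p) for p in product("ACGT", repeat=k)]
    total = sum(counts[w] for w in words)
    return [counts[w] / total if total else 0.0 for w in words]


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


# Hypothetical toy sequence: three overlapping 2-mers (AC, CG, GT).
spec = kmer_spectrum("ACGT", k=2)
self_similarity = cosine(spec, spec)
```

Because the vector has a fixed length of 4**k, sequences of different lengths can be compared without alignment, which is the property the abstract relies on for short fragments.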
Affiliation(s)
- Antonino Fiannaca: Institute of High-Performance Computing and Networking, National Research Council of Italy, Viale delle Scienze, Ed. 11, 90128 Palermo, Italy
- Massimo La Rosa: Institute of High-Performance Computing and Networking, National Research Council of Italy, Viale delle Scienze, Ed. 11, 90128 Palermo, Italy
- Riccardo Rizzo: Institute of High-Performance Computing and Networking, National Research Council of Italy, Viale delle Scienze, Ed. 11, 90128 Palermo, Italy
- Alfonso Urso: Institute of High-Performance Computing and Networking, National Research Council of Italy, Viale delle Scienze, Ed. 11, 90128 Palermo, Italy
|
19
|
Deagle BE, Jarman SN, Coissac E, Pompanon F, Taberlet P. DNA metabarcoding and the cytochrome c oxidase subunit I marker: not a perfect match. Biol Lett 2015; 10:rsbl.2014.0562. [PMID: 25209199 DOI: 10.1098/rsbl.2014.0562] [Citation(s) in RCA: 273] [Impact Index Per Article: 30.3] [Indexed: 11/12/2022] Open
Abstract
DNA metabarcoding enables efficient characterization of species composition in environmental DNA or bulk biodiversity samples, and this approach is making significant and unique contributions in the field of ecology. In metabarcoding of animals, the cytochrome c oxidase subunit I (COI) gene is frequently used as the marker of choice because no other genetic region can be found in taxonomically verified databases with sequences covering so many taxa. However, the accuracy of metabarcoding datasets is dependent on recovery of the targeted taxa using conserved amplification primers. We argue that COI does not contain suitably conserved regions for most amplicon-based metabarcoding applications. Marker selection deserves increased scrutiny and available marker choices should be broadened in order to maximize potential in this exciting field of research.
Affiliation(s)
- Eric Coissac: Université Grenoble Alpes, Centre National de la Recherche Scientifique (CNRS), Laboratoire d'Ecologie Alpine, F-38000 Grenoble, France; Centre National de la Recherche Scientifique (CNRS), Laboratoire d'Ecologie Alpine, F-38000 Grenoble, France
- François Pompanon: Université Grenoble Alpes, Centre National de la Recherche Scientifique (CNRS), Laboratoire d'Ecologie Alpine, F-38000 Grenoble, France; Centre National de la Recherche Scientifique (CNRS), Laboratoire d'Ecologie Alpine, F-38000 Grenoble, France
- Pierre Taberlet: Université Grenoble Alpes, Centre National de la Recherche Scientifique (CNRS), Laboratoire d'Ecologie Alpine, F-38000 Grenoble, France; Centre National de la Recherche Scientifique (CNRS), Laboratoire d'Ecologie Alpine, F-38000 Grenoble, France
|
20
|
Christa G, Händeler K, Kück P, Vleugels M, Franken J, Karmeinski D, Wägele H. Phylogenetic evidence for multiple independent origins of functional kleptoplasty in Sacoglossa (Heterobranchia, Gastropoda). ORG DIVERS EVOL 2014. [DOI: 10.1007/s13127-014-0189-z] [Citation(s) in RCA: 32] [Impact Index Per Article: 3.2] [Indexed: 12/11/2022]
|
21
|
Abstract
Ginkgo biloba L. (known as ginkgo or maidenhair tree) is a phylogenetically isolated, charismatic, gymnosperm tree. Herbal dietary supplements, prepared from G. biloba leaves, are consumed to boost cognitive capacity via improved blood perfusion and mitochondrial function. A novel DNA mini-barcode assay was designed and validated for the authentication of G. biloba in herbal dietary supplements (n = 22; sensitivity = 1.00, 95% CI = 0.59-1.00; specificity = 1.00, 95% CI = 0.64-1.00). This assay was further used to estimate the frequency of mislabeled ginkgo herbal dietary supplements on the market in the United States of America: DNA amenable to PCR could not be extracted from three (7.5%) of the 40 supplements sampled, 31 of 37 (83.8%) assayable supplements contained identifiable G. biloba DNA, and six supplements (16.2%) contained fillers without any detectable G. biloba DNA. It is hoped that this assay will be used by supplement manufacturers to ensure that their supplements contain G. biloba.
Affiliation(s)
- Damon P Little: Lewis B. and Dorothy Cullman Program for Molecular Systematics, The New York Botanical Garden, Bronx, NY 10458-5126, USA
|
22
|
Abstract
Accurate identification of unknown specimens by means of DNA barcoding is contingent on the presence of a DNA barcoding gap, among other factors, as its absence may result in dubious specimen identifications - false negatives or positives. Whereas the utility of DNA barcoding would be greatly reduced in the absence of a distinct and sufficiently sized barcoding gap, the limits of intraspecific and interspecific distances are seldom thoroughly inspected across comprehensive sampling. The present study aims to illuminate this aspect of barcoding in a comprehensive manner for the animal phylum Annelida. All cytochrome c oxidase subunit I sequences (cox1 gene; the chosen region for zoological DNA barcoding) present in GenBank for Annelida, as well as for "Polychaeta", "Oligochaeta", and Hirudinea separately, were downloaded and curated for length, coverage and potential contaminations. The final datasets consisted of 9782 (Annelida), 5545 ("Polychaeta"), 3639 ("Oligochaeta"), and 598 (Hirudinea) cox1 sequences and these were either (i) used as is in an automated global barcoding gap detection analysis or (ii) further analyzed for genetic distances, separated into bins containing intraspecific and interspecific comparisons and plotted in a graph to visualize any potential global barcoding gap. Over 70 million pairwise genetic comparisons were made and results suggest that although there is a tendency towards separation, no distinct or sufficiently sized global barcoding gap exists in either of the datasets rendering future barcoding efforts at risk of erroneous specimen identifications (but local barcoding gaps may still exist allowing for the identification of specimens at lower taxonomic ranks). This seems to be especially true for earthworm taxa, which account for fully 35% of the total number of interspecific comparisons that show 0% divergence.
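The intraspecific versus interspecific comparison described above can be sketched in a few lines of Python. This is a minimal illustration, assuming pre-aligned sequences of equal length, uncorrected p-distances, and the usual reading of a barcoding gap as the smallest interspecific distance exceeding the largest intraspecific distance. It is not the pipeline used in the study, and the function names and toy records are hypothetical.

```python
from itertools import combinations


def p_distance(a, b):
    """Proportion of aligned sites at which two sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def barcoding_gap(records):
    """Return (max intraspecific, min interspecific, gap present?).

    records is a list of (species, aligned_sequence) pairs. Every pair of
    records is binned as intraspecific or interspecific by its labels.
    """
    intra, inter = [], []
    for (sp1, s1), (sp2, s2) in combinations(records, 2):
        (intra if sp1 == sp2 else inter).append(p_distance(s1, s2))
    if not intra or not inter:
        raise ValueError("need both intraspecific and interspecific pairs")
    return max(intra), min(inter), min(inter) > max(intra)


# Hypothetical toy alignment: two species, two specimens each.
records = [
    ("sp1", "ACGTACGTAC"),
    ("sp1", "ACGTACGTAT"),
    ("sp2", "ACGAACCTAC"),
    ("sp2", "ACGAACCTAC"),
]
result = barcoding_gap(records)
```

The all-pairs loop grows quadratically with the number of sequences, which is consistent with the tens of millions of comparisons the abstract reports for datasets of a few thousand sequences.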
Affiliation(s)
- Sebastian Kvist: Museum of Comparative Zoology, Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, MA, USA
|
23
|
Weitschek E, Fiscon G, Felici G. Supervised DNA Barcodes species classification: analysis, comparisons and results. BioData Min 2014; 7:4. [PMID: 24721333 PMCID: PMC4022351 DOI: 10.1186/1756-0381-7-4] [Citation(s) in RCA: 52] [Impact Index Per Article: 5.2] [Received: 11/18/2013] [Accepted: 04/05/2014] [Indexed: 11/15/2022] Open
Abstract
BACKGROUND Specific fragments, coming from short portions of DNA (e.g., mitochondrial, nuclear, and plastid sequences), have been defined as DNA Barcode and can be used as markers for organisms of the main life kingdoms. Species classification with DNA Barcode sequences has been proven effective on different organisms. Indeed, specific gene regions have been identified as Barcode: COI in animals, rbcL and matK in plants, and ITS in fungi. The classification problem assigns an unknown specimen to a known species by analyzing its Barcode. This task has to be supported with reliable methods and algorithms. METHODS In this work the efficacy of supervised machine learning methods to classify species with DNA Barcode sequences is shown. The Weka software suite, which includes a collection of supervised classification methods, is adopted to address the task of DNA Barcode analysis. Classifier families are tested on synthetic and empirical datasets belonging to the animal, fungus, and plant kingdoms. In particular, the function-based method Support Vector Machines (SVM), the rule-based RIPPER, the decision tree C4.5, and the Naïve Bayes method are considered. Additionally, the classification results are compared with respect to ad-hoc and well-established DNA Barcode classification methods. RESULTS A software that converts the DNA Barcode FASTA sequences to the Weka format is released, to adapt different input formats and to allow the execution of the classification procedure. The analysis of results on synthetic and real datasets shows that SVM and Naïve Bayes outperform on average the other considered classifiers, although they do not provide a human interpretable classification model. Rule-based methods have slightly inferior classification performances, but deliver the species specific positions and nucleotide assignments. 
On synthetic data the supervised machine learning methods obtain superior classification performances with respect to the traditional DNA Barcode classification methods. On empirical data their classification performances are at a comparable level to the other methods. CONCLUSIONS The classification analysis shows that supervised machine learning methods are promising candidates for handling with success the DNA Barcoding species classification problem, obtaining excellent performances. To conclude, a powerful tool to perform species identification is now available to the DNA Barcoding community.
Affiliation(s)
- Emanuel Weitschek: Department of Engineering, Roma Tre University, Via della Vasca Navale, 79, 00146 Rome, Italy; Institute of Systems Analysis and Computer Science Antonio Ruberti, National Research Council, Viale Manzoni, 30, 00185 Rome, Italy
- Giulia Fiscon: Institute of Systems Analysis and Computer Science Antonio Ruberti, National Research Council, Viale Manzoni, 30, 00185 Rome, Italy; Department of Computer, Control, and Management Engineering, Sapienza University, Via Ariosto, 25, 00185 Rome, Italy
- Giovanni Felici: Institute of Systems Analysis and Computer Science Antonio Ruberti, National Research Council, Viale Manzoni, 30, 00185 Rome, Italy
|
24
|
Fan L, Hui JHL, Yu ZG, Chu KH. VIP Barcoding: composition vector-based software for rapid species identification based on DNA barcoding. Mol Ecol Resour 2014; 14:871-81. [PMID: 24479510 DOI: 10.1111/1755-0998.12235] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.8] [Received: 08/09/2013] [Revised: 01/22/2014] [Accepted: 01/24/2014] [Indexed: 12/17/2022]
Abstract
Species identification based on short sequences of DNA markers, that is, DNA barcoding, has emerged as an integral part of modern taxonomy. However, software for the analysis of large and multilocus barcoding data sets is scarce. The Basic Local Alignment Search Tool (BLAST) is currently the fastest tool capable of handling large databases (e.g. >5000 sequences), but its accuracy is a concern and has been criticized for its local optimization. However, current more accurate software requires sequence alignment or complex calculations, which are time-consuming when dealing with large data sets during data preprocessing or during the search stage. Therefore, it is imperative to develop a practical program for both accurate and scalable species identification for DNA barcoding. In this context, we present VIP Barcoding: a user-friendly software in graphical user interface for rapid DNA barcoding. It adopts a hybrid, two-stage algorithm. First, an alignment-free composition vector (CV) method is utilized to reduce searching space by screening a reference database. The alignment-based K2P distance nearest-neighbour method is then employed to analyse the smaller data set generated in the first stage. In comparison with other software, we demonstrate that VIP Barcoding has (i) higher accuracy than Blastn and several alignment-free methods and (ii) higher scalability than alignment-based distance methods and character-based methods. These results suggest that this platform is able to deal with both large-scale and multilocus barcoding data with accuracy and can contribute to DNA barcoding for modern taxonomy. VIP Barcoding is free and available at http://msl.sls.cuhk.edu.hk/vipbarcoding/.
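The two-stage idea in this abstract (a cheap alignment-free screen followed by a K2P nearest-neighbour search on the shortlist) can be sketched as follows. This is a simplified illustration and not the VIP Barcoding implementation: stage one uses plain k-mer count cosine similarity as a stand-in for the composition vector method, and stage two assumes the sequences are already aligned to equal length. All names and the toy reference set are hypothetical.

```python
import math
from collections import Counter

PURINES = {"A", "G"}


def k2p(a, b):
    """Kimura two-parameter distance between two aligned sequences."""
    pairs = [(x, y) for x, y in zip(a, b) if x in "ACGT" and y in "ACGT"]
    if not pairs:
        raise ValueError("no comparable sites")
    n = len(pairs)
    # Transitions stay within purines or within pyrimidines.
    ts = sum(1 for x, y in pairs if x != y and (x in PURINES) == (y in PURINES))
    tv = sum(1 for x, y in pairs if x != y and (x in PURINES) != (y in PURINES))
    p, q = ts / n, tv / n
    arg = (1 - 2 * p - q) * math.sqrt(max(1 - 2 * q, 0.0))
    if arg <= 0:
        raise ValueError("distance undefined for saturated sequences")
    return -0.5 * math.log(arg)


def kmer_counts(seq, k):
    """Count overlapping k-mers in a sequence."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def cosine(c1, c2):
    """Cosine similarity between two k-mer count tables."""
    dot = sum(c1[w] * c2[w] for w in c1)
    n1 = math.sqrt(sum(v * v for v in c1.values()))
    n2 = math.sqrt(sum(v * v for v in c2.values()))
    return dot / (n1 * n2) if n1 and n2 else 0.0


def identify(query, reference, keep=2, k=3):
    """Stage 1: keep the `keep` most similar references by k-mer profile.
    Stage 2: return the species label with the smallest K2P distance."""
    qc = kmer_counts(query, k)
    shortlist = sorted(
        reference,
        key=lambda rec: cosine(qc, kmer_counts(rec[1], k)),
        reverse=True,
    )[:keep]
    return min(shortlist, key=lambda rec: k2p(query, rec[1]))[0]


# Hypothetical toy reference library and query.
reference = [
    ("sp1", "ACGTACGTACGT"),
    ("sp2", "ACGTTCGTTCGA"),
    ("sp3", "TTTTGGGGCCCC"),
]
best = identify("ACGTACGTACGA", reference)
```

The screen discards `sp3`, which shares no 3-mers with the query, so the slower distance calculation only runs on two candidates. That is the scalability argument the abstract makes, shown here at toy scale.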
Affiliation(s)
- Long Fan: School of Life Sciences, The Chinese University of Hong Kong, Shatin, Hong Kong SAR, China
|
25
|
Identification of sequestered chloroplasts in photosynthetic and non-photosynthetic sacoglossan sea slugs (Mollusca, Gastropoda). Front Zool 2014; 11:15. [PMID: 24555467 PMCID: PMC3941943 DOI: 10.1186/1742-9994-11-15] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.9] [Received: 10/09/2013] [Accepted: 02/06/2014] [Indexed: 11/24/2022] Open
Abstract
Background: Sacoglossan sea slugs are well known for their unique ability among metazoans to incorporate functional chloroplasts (kleptoplasty) in digestive glandular cells, enabling the slugs to use these as energy source when starved for weeks and months. However, members assigned to the shelled Oxynoacea and Limapontioidea (often with dorsal processes) are in general not able to keep the incorporated chloroplasts functional. Since obviously no algal genes are present within three (out of six known) species with chloroplast retention of several months, other factors enabling functional kleptoplasty have to be considered. Certainly, the origin of the chloroplasts is important, however, food source of most of the about 300 described species is not known so far. Therefore, a deduction of specific algal food source as a factor to perform functional kleptoplasty was still missing. Results: We investigated the food sources of 26 sacoglossan species, freshly collected from the field, by applying the chloroplast marker genes tufA and rbcL and compared our results with literature data of species known for their retention capability. For the majority of the investigated species, especially for the genus Thuridilla, we were able to identify food sources for the first time. Furthermore, published data based on feeding observations were confirmed and enlarged by the molecular methods. We also found that certain chloroplasts are most likely essential for establishing functional kleptoplasty. Conclusions: Applying DNA-Barcoding appeared to be very efficient and allowed a detailed insight into sacoglossan food sources. We favor rbcL for future analyses, but tufA might be used additionally in ambiguous cases. We narrowed down the algal species that seem to be essential for long-term-functional photosynthesis: Halimeda, Caulerpa, Penicillus, Avrainvillea, Acetabularia and Vaucheria. 
None of these were found in Thuridilla, the only plakobranchoidean genus without long-term retention forms. The chloroplast type, however, does not solely determine functional kleptoplasty; members of no-retention genera, such as Cylindrobulla or Volvatella, feed on the same algae as e.g., the long-term-retention forms Plakobranchus ocellatus or Elysia crispata, respectively. Evolutionary benefits of functional kleptoplasty are still questionable, since a polyphagous life style would render slugs more independent of specific food sources and their abundance.
|
26
|
DNA barcode authentication of saw palmetto herbal dietary supplements. Sci Rep 2013; 3:3518. [PMID: 24343362 PMCID: PMC3865462 DOI: 10.1038/srep03518] [Citation(s) in RCA: 41] [Impact Index Per Article: 3.7] [Received: 07/29/2013] [Accepted: 11/28/2013] [Indexed: 11/09/2022] Open
Abstract
Herbal dietary supplements made from saw palmetto (Serenoa repens; Arecaceae) fruit are commonly consumed to ameliorate benign prostate hyperplasia. A novel DNA mini-barcode assay to accurately identify [specificity = 1.00 (95% confidence interval = 0.74-1.00); sensitivity = 1.00 (95% confidence interval = 0.66-1.00); n = 31] saw palmetto dietary supplements was designed from a DNA barcode reference library created for this purpose. The mini-barcodes were used to estimate the frequency of mislabeled saw palmetto herbal dietary supplements on the market in the United States of America. Of the 37 supplements examined, amplifiable DNA could be extracted from 34 (92%). Mini-barcode analysis of these supplements demonstrated that 29 (85%) contain saw palmetto and that 2 (6%) supplements contain related species that cannot be legally sold as herbal dietary supplements in the United States of America. The identity of 3 (9%) supplements could not be conclusively determined.
|
27
|
Little DP. A DNA mini-barcode for land plants. Mol Ecol Resour 2013; 14:437-46. [PMID: 24286499 DOI: 10.1111/1755-0998.12194] [Citation(s) in RCA: 72] [Impact Index Per Article: 6.5] [Received: 07/25/2013] [Revised: 10/11/2013] [Accepted: 10/18/2013] [Indexed: 11/27/2022]
Abstract
Small portions of the barcode region - mini-barcodes - may be used in place of full-length barcodes to overcome DNA degradation for samples with poor DNA preservation. 591,491,286 rbcL mini-barcode primer combinations were electronically evaluated for PCR universality, and two novel highly universal sets of priming sites were identified. Novel and published rbcL mini-barcode primers were evaluated for PCR amplification [determined with a validated electronic simulation (n = 2765) and empirically (n = 188)], Sanger sequence quality [determined empirically (n = 188)], and taxonomic discrimination [determined empirically (n = 30,472)]. PCR amplification for all mini-barcodes, as estimated by validated electronic simulation, was successful for 90.2-99.8% of species. Overall Sanger sequence quality for mini-barcodes was very low - the best mini-barcode tested produced sequences of adequate quality (B20 ≥ 0.5) for 74.5% of samples. The majority of mini-barcodes provide correct identifications of families in excess of 70.1% of the time. Discriminatory power noticeably decreased at lower taxonomic levels. At the species level, the discriminatory power of the best mini-barcode was less than 38.2%. For samples believed to contain DNA from only one species, an investigator should attempt to sequence, in decreasing order of utility and probability of success, mini-barcodes F (rbcL1/rbcLB), D (F52/R193) and K (F517/R604). For samples believed to contain DNA from more than one species, an investigator should amplify and sequence mini-barcode D (F52/R193).
Affiliation(s)
- Damon P Little: Cullman Program for Molecular Systematics, The New York Botanical Garden, 2900 Southern Boulevard, Bronx, NY, 10458, USA
|
28
|
Murray DC, Haile J, Dortch J, White NE, Haouchar D, Bellgard MI, Allcock RJ, Prideaux GJ, Bunce M. Scrapheap challenge: a novel bulk-bone metabarcoding method to investigate ancient DNA in faunal assemblages. Sci Rep 2013; 3:3371. [PMID: 24288018 PMCID: PMC3842778 DOI: 10.1038/srep03371] [Citation(s) in RCA: 61] [Impact Index Per Article: 5.5] [Received: 05/03/2013] [Accepted: 11/05/2013] [Indexed: 11/23/2022] Open
Abstract
Highly fragmented and morphologically indistinct fossil bone is common in archaeological and paleontological deposits but unfortunately it is of little use in compiling faunal assemblages. The development of a cost-effective methodology to taxonomically identify bulk bone is therefore a key challenge. Here, an ancient DNA methodology using high-throughput sequencing is developed to survey and analyse thousands of archaeological bones from southwest Australia. Fossils were collectively ground together depending on which of fifteen stratigraphical layers they were excavated from. By generating fifteen synthetic blends of bulk bone powder, each corresponding to a chronologically distinct layer, samples could be collectively analysed in an efficient manner. A diverse range of taxa, including endemic, extirpated and hitherto unrecorded taxa, dating back to c.46,000 years BP was characterized. The method is a novel, cost-effective use for unidentifiable bone fragments and a powerful molecular tool for surveying fossils that otherwise end up on the taxonomic “scrapheap”.
Affiliation(s)
- Dáithí C Murray: Ancient DNA Laboratory, School of Veterinary and Life Sciences, Murdoch University, South Street, Murdoch, WA, 6150, Australia
|
29
|
Little DP, Knopf P, Schulz C. DNA barcode identification of Podocarpaceae--the second largest conifer family. PLoS One 2013; 8:e81008. [PMID: 24312258 PMCID: PMC3842326 DOI: 10.1371/journal.pone.0081008] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.4] [Received: 04/30/2013] [Accepted: 10/09/2013] [Indexed: 11/28/2022] Open
Abstract
We have generated matK, rbcL, and nrITS2 DNA barcodes for 320 specimens representing all 18 extant genera of the conifer family Podocarpaceae. The sample includes 145 of the 198 recognized species. Comparative analyses of sequence quality and species discrimination were conducted on the 159 individuals from which all three markers were recovered (representing 15 genera and 97 species). The vast majority of sequences were of high quality (B30 = 0.596-0.989). Even the lowest quality sequences exceeded the minimum requirements of the BARCODE data standard. In the few instances that low quality sequences were generated, the responsible mechanism could not be discerned. There were no statistically significant differences in the discriminatory power of markers or marker combinations (p = 0.05). The discriminatory power of the barcode markers individually and in combination is low (56.7% of species at maximum). In some instances, species discrimination failed in spite of ostensibly useful variation being present (genotypes were shared among species), but in many cases there was simply an absence of sequence variation. Barcode gaps (maximum intraspecific p-distance > minimum interspecific p-distance) were observed in 50.5% of species when all three markers were considered simultaneously. The presence of a barcode gap was not predictive of discrimination success (p = 0.02) and there was no statistically significant difference in the frequency of barcode gaps among markers (p = 0.05). In addition, there was no correlation between number of individuals sampled per species and the presence of a barcode gap (p = 0.27).
Affiliation(s)
- Damon P. Little
  - Lewis B. and Dorothy Cullman Program for Molecular Systematics, The New York Botanical Garden, Bronx, New York, United States of America
- Patrick Knopf
  - Lehrstuhl für Evolution und Biodiversität der Pflanzen, Ruhr–Universität Bochum, Bochum, Nordrhein–Westfalen, Bundesrepublik Deutschland
- Christian Schulz
  - Lehrstuhl für Evolution und Biodiversität der Pflanzen, Ruhr–Universität Bochum, Bochum, Nordrhein–Westfalen, Bundesrepublik Deutschland
30
Saarela JM, Sokoloff PC, Gillespie LJ, Consaul LL, Bull RD. DNA barcoding the Canadian Arctic flora: core plastid barcodes (rbcL + matK) for 490 vascular plant species. PLoS One 2013; 8:e77982. [PMID: 24348895 PMCID: PMC3865322 DOI: 10.1371/journal.pone.0077982] [Citation(s) in RCA: 64] [Impact Index Per Article: 5.8] [Received: 04/25/2013] [Accepted: 09/08/2013] [Indexed: 01/16/2023]
Abstract
Accurate identification of Arctic plant species is critical for understanding potential climate-induced changes in their diversity and distributions. To facilitate rapid identification we generated DNA barcodes for the core plastid barcode loci (rbcL and matK) for 490 vascular plant species, representing nearly half of the Canadian Arctic flora and 93% of the flora of the Canadian Arctic Archipelago. Sequence recovery was higher for rbcL than matK (93% and 81%), and rbcL was easier to recover than matK from herbarium specimens (92% and 77%). Distance-based and sequence-similarity analyses of combined rbcL + matK data discriminate 97% of genera, 56% of species, and 7% of infraspecific taxa. There is a significant negative correlation between the number of species sampled per genus and the percent species resolution per genus. We characterize barcode variation in detail in the ten largest genera sampled (Carex, Draba, Festuca, Pedicularis, Poa, Potentilla, Puccinellia, Ranunculus, Salix, and Saxifraga) in the context of their phylogenetic relationships and taxonomy. Discrimination with the core barcode loci in these genera ranges from 0% in Salix to 85% in Carex. Haplotype variation in multiple genera does not correspond to species boundaries, including Taraxacum, in which the distribution of plastid haplotypes among Arctic species is consistent with plastid variation documented in non-Arctic species. Introgression of Poa glauca plastid DNA into multiple individuals of P. hartzii is problematic for identification of these species with DNA barcodes. Of three supplementary barcode loci (psbA-trnH, psbK-psbI, atpF-atpH) collected for a subset of Poa and Puccinellia species, only atpF-atpH improved discrimination in Puccinellia, compared with rbcL and matK. Variation in matK in Vaccinium uliginosum and rbcL in Saxifraga oppositifolia corresponds to variation in other loci used to characterize the phylogeographic histories of these Arctic-alpine species.
Affiliation(s)
- Jeffery M. Saarela
  - Botany Section, Research and Collections Services, Canadian Museum of Nature, Ottawa, Ontario, Canada
- Paul C. Sokoloff
  - Botany Section, Research and Collections Services, Canadian Museum of Nature, Ottawa, Ontario, Canada
- Lynn J. Gillespie
  - Botany Section, Research and Collections Services, Canadian Museum of Nature, Ottawa, Ontario, Canada
- Laurie L. Consaul
  - Botany Section, Research and Collections Services, Canadian Museum of Nature, Ottawa, Ontario, Canada
- Roger D. Bull
  - Botany Section, Research and Collections Services, Canadian Museum of Nature, Ottawa, Ontario, Canada
31
Tanabe AS, Toju H. Two new computational methods for universal DNA barcoding: a benchmark using barcode sequences of bacteria, archaea, animals, fungi, and land plants. PLoS One 2013; 8:e76910. [PMID: 24204702 PMCID: PMC3799923 DOI: 10.1371/journal.pone.0076910] [Citation(s) in RCA: 125] [Impact Index Per Article: 11.4] [Received: 05/27/2013] [Accepted: 08/25/2013] [Indexed: 11/24/2022]
Abstract
Taxonomic identification of biological specimens based on DNA sequence information (a.k.a. DNA barcoding) is becoming increasingly common in biodiversity science. Although several methods have been proposed, many of them are not universally applicable due to the need for prerequisite phylogenetic/machine-learning analyses, the need for huge computational resources, or the lack of a firm theoretical background. Here, we propose two new computational methods of DNA barcoding and show a benchmark for bacterial/archaeal 16S, animal COX1, fungal internal transcribed spacer, and three plant chloroplast (rbcL, matK, and trnH-psbA) barcode loci that can be used to compare the performance of existing and new methods. The benchmark was performed under two alternative situations: query sequences were available in the corresponding reference sequence databases in one, but were not available in the other. In the former situation, the commonly used “1-nearest-neighbor” (1-NN) method, which assigns the taxonomic information of the most similar sequences in a reference database (i.e., BLAST-top-hit reference sequence) to a query, displays the highest rate and highest precision of successful taxonomic identification. However, in the latter situation, the 1-NN method produced extremely high rates of misidentification for all the barcode loci examined. In contrast, one of our new methods, the query-centric auto-k-nearest-neighbor (QCauto) method, consistently produced low rates of misidentification for all the loci examined in both situations. These results indicate that the 1-NN method is most suitable if the reference sequences of all potentially observable species are available in databases; otherwise, the QCauto method returns the most reliable identification results. The benchmark results also indicated that the taxon coverage of reference sequences is far from complete for genus or species level identification in all the barcode loci examined. Therefore, we need to accelerate the registration of reference barcode sequences to apply high-throughput DNA barcoding to genus or species level identification in biodiversity research.
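The behaviour of the 1-NN approach described above can be sketched in a few lines of Python. This is only an illustration of nearest-neighbour assignment on pre-aligned toy sequences using a plain p-distance, not the BLAST-based procedure or the QCauto method of the paper; all names and sequences are hypothetical.

```python
def p_distance(a, b):
    """Proportion of differing sites between two equal-length aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a.upper(), b.upper())) / len(a)


def one_nn(query, references):
    """Assign the taxon of the closest reference sequence to the query.

    references: list of (taxon, aligned_sequence) tuples.
    Returns (taxon, distance). Note that a taxon is ALWAYS returned,
    even when the closest reference is quite distant.
    """
    taxon, seq = min(references, key=lambda r: p_distance(query, r[1]))
    return taxon, p_distance(query, seq)


# Hypothetical reference library and queries.
refs = [
    ("sp_A", "ACGTACGTAC"),
    ("sp_B", "ACGTACGTTT"),
    ("sp_C", "TTGTACCTAC"),
]
query_in_db = "ACGTACGTAC"      # its species is represented in refs
query_not_in_db = "ACGAACGAAC"  # its species is absent from refs

hit_known = one_nn(query_in_db, refs)
hit_unknown = one_nn(query_not_in_db, refs)
print(hit_known)
print(hit_unknown)
```

The second query is still given the name sp_A although it differs from that reference at 20% of sites. That is the failure mode the abstract reports when the true species is missing from the reference database, and it is why methods that can decline to assign a name are preferable under incomplete taxon coverage.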
Affiliation(s)
- Akifumi S. Tanabe
  - Graduate School of Global Environmental Studies, Kyoto University, Kyoto, Kyoto, Japan
  - Research Center for Aquatic Genomics, National Research Institute of Fisheries Science, Fisheries Research Agency, Yokohama, Kanagawa, Japan
- Hirokazu Toju
  - Graduate School of Global Environmental Studies, Kyoto University, Kyoto, Kyoto, Japan
  - Graduate School of Human and Environmental Studies, Kyoto University, Kyoto, Kyoto, Japan
32
Parmentier I, Duminil J, Kuzmina M, Philippe M, Thomas DW, Kenfack D, Chuyong GB, Cruaud C, Hardy OJ. How effective are DNA barcodes in the identification of African rainforest trees? PLoS One 2013; 8:e54921. [PMID: 23565134 PMCID: PMC3615068 DOI: 10.1371/journal.pone.0054921] [Citation(s) in RCA: 70] [Impact Index Per Article: 6.4] [Received: 10/27/2011] [Accepted: 12/20/2012] [Indexed: 12/11/2022]
Abstract
BACKGROUND DNA barcoding of rain forest trees could potentially help biologists identify species and discover new ones. However, DNA barcodes cannot always distinguish between closely related species, and the size and completeness of barcode databases are key parameters for their successful application. We test the ability of rbcL, matK and trnH-psbA plastid DNA markers to identify rain forest trees at two sites in Atlantic central Africa under the assumption that a database is exhaustive in terms of species content, but not necessarily in terms of haplotype diversity within species. METHODOLOGY/PRINCIPAL FINDINGS We assess the accuracy of identification to species or genus using a genetic distance matrix between samples either based on a global multiple sequence alignment (GD) or on a basic local alignment search tool (BLAST). Where a local database is available (within a 50 ha plot), barcoding was generally reliable for genus identification (95-100% success), but less for species identification (71-88%). Using a single marker, best results for species identification were obtained with trnH-psbA. There was a significant decrease of barcoding success in species-rich clades. When the local database was used to identify the genus of trees from another region and did include all genera from the query individuals but not all species, genus identification success decreased to 84-90%. The GD method performed best but a global multiple sequence alignment is not applicable on trnH-psbA. CONCLUSIONS/SIGNIFICANCE Barcoding is a useful tool to assign unidentified African rain forest trees to a genus, but identification to a species is less reliable, especially in species-rich clades, even using an exhaustive local database. Combining two markers improves the accuracy of species identification but it would only marginally improve genus identification. 
Finally, we highlight some limitations of the BLAST algorithm as currently implemented and suggest possible improvements for barcoding applications.
Affiliation(s)
- Ingrid Parmentier
  - Evolutionary Biology and Ecology – Faculté des Sciences, Université Libre de Bruxelles, Brussels, Belgium
- Jérôme Duminil
  - Evolutionary Biology and Ecology – Faculté des Sciences, Université Libre de Bruxelles, Brussels, Belgium
  - Sub-regional Office for Central Africa, Bioversity International, Yaoundé, Cameroon
- Maria Kuzmina
  - Canadian Centre for DNA Barcoding, Biodiversity Institute of Ontario, University of Guelph, Guelph, Ontario, Canada
- Morgane Philippe
  - Evolutionary Biology and Ecology – Faculté des Sciences, Université Libre de Bruxelles, Brussels, Belgium
- Duncan W. Thomas
  - Department of Botany and Plant Pathology, Oregon State University, Corvallis, Oregon, United States of America
- David Kenfack
  - Department of Botany, Smithsonian Institution, Washington, D.C., United States of America
- George B. Chuyong
  - Department of Plant and Animal Sciences, University of Buea, Buea, Cameroon
- Corinne Cruaud
  - Institut de Génomique – Génoscope, Commissariat à l'énergie atomique et aux énergies alternatives (CEA), Evry, France
- Olivier J. Hardy
  - Evolutionary Biology and Ecology – Faculté des Sciences, Université Libre de Bruxelles, Brussels, Belgium
33
Bhargava M, Sharma A. DNA barcoding in plants: evolution and applications of in silico approaches and resources. Mol Phylogenet Evol 2013; 67:631-41. [PMID: 23500333 DOI: 10.1016/j.ympev.2013.03.002] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.0] [Received: 09/04/2012] [Revised: 02/28/2013] [Accepted: 03/01/2013] [Indexed: 02/03/2023]
Abstract
Bioinformatics has played an important role in the analysis of DNA barcoding data. The process of DNA barcoding initially involves collecting the available data from existing databases. Many databases have been developed in recent years, e.g. MMDBD [Medicinal Materials DNA Barcode Database], BioBarcode, etc. If sequences are not available, sequencing has to be done in vitro, for which the recently developed software ecoPrimers can be helpful. This is followed by multiple sequence alignment. Further, basic sequence statistics computation and phylogenetic analysis can be performed with MEGA and PHYLIP/PAUP respectively. Some of the recent tools for in silico and statistical analysis specifically designed for barcoding, viz. CAOS (Character Based DNA Barcoding), BRONX (DNA Barcode Sequence Identification Incorporating Taxonomic Hierarchy and within Taxon Variability), Spider (Analysis of species identity and evolution, particularly DNA barcoding), jMOTU and Taxonerator (Turning DNA Barcode Sequences into Annotated OTUs), OTUbase (Analysis of OTU data and taxonomic data), SAP (Statistical Assignment Package), etc., are discussed and analysed in this review. The paper presents a comprehensive overview of the various in silico methods, tools, software and databases used for DNA barcoding of plants.
Affiliation(s)
- Mili Bhargava
  - Biotechnology Division, Central Institute of Medicinal and Aromatic Plants, Council of Scientific and Industrial Research, PO, Lucknow 226 015, India.
34
Collins RA, Cruickshank RH. The seven deadly sins of DNA barcoding. Mol Ecol Resour 2012; 13:969-75. [DOI: 10.1111/1755-0998.12046] [Citation(s) in RCA: 211] [Impact Index Per Article: 17.6] [Received: 06/05/2012] [Revised: 08/27/2012] [Accepted: 11/09/2012] [Indexed: 11/27/2022]
Affiliation(s)
- R. A. Collins
  - Bio-Protection Research Centre, Lincoln University, PO Box 84, Lincoln 7647, Canterbury, New Zealand
- R. H. Cruickshank
  - Department of Ecology, Faculty of Agriculture and Life Sciences, Lincoln University, Lincoln 7647, Canterbury, New Zealand
35
Biswal DK, Debnath M, Kumar S, Tandon P. Phylogenetic reconstruction in the order Nymphaeales: ITS2 secondary structure analysis and in silico testing of maturase k (matK) as a potential marker for DNA bar coding. BMC Bioinformatics 2012; 13 Suppl 17:S26. [PMID: 23282079 PMCID: PMC3521246 DOI: 10.1186/1471-2105-13-s17-s26] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.2] [Indexed: 11/10/2022]
Abstract
BACKGROUND The Nymphaeales (water lily and relatives) lineage diverged as the second branch of basal angiosperms and comprises two families: Cabombaceae and Nymphaeaceae. The classification of Nymphaeales and its phylogeny within the flowering plants are intriguing, as several systems (the Thorne, Dahlgren, Cronquist, Takhtajan and APG III [Angiosperm Phylogeny Group III] systems) have attempted to redefine Nymphaeales taxonomy. There are also fossil records, consisting especially of seeds, pollen, stems, leaves and flowers, from as early as the Lower Cretaceous. Here we present an in silico study of the order Nymphaeales taking maturase K (matK) and internal transcribed spacer 2 (ITS2) as biomarkers for phylogeny reconstruction (using character-based methods and a Bayesian approach) and identification of motifs for DNA barcoding. RESULTS The Maximum Likelihood (ML) and Bayesian approaches yielded congruent, fully resolved and well-supported trees using a concatenated (ITS2 + matK) supermatrix aligned dataset. The taxon sampling corroborates the monophyly of Cabombaceae. Nuphar emerges as a monophyletic clade in the family Nymphaeaceae, while there are slight discrepancies in the monophyletic nature of the genus Nymphaea owing to Victoria-Euryale and Ondinea grouping in the same node of Nymphaeaceae. ITS2 secondary structure alignment corroborates the primary sequence analysis. Hydatellaceae emerged as a sister clade to Nymphaeaceae and had a basal lineage amongst the water lily clades. Species from Cycas and Ginkgo were taken as outgroups and were rooted in the overall tree topology from various methods. CONCLUSIONS matK genes are fast-evolving, highly variable regions of plant chloroplast DNA that can serve as potential biomarkers for DNA barcoding and also in generating primers for angiosperms with identification of unique motif regions. We have reported unique genus-specific motif regions in the order Nymphaeales from the matK dataset, which can be further validated for barcoding and the design of PCR primers. Our analysis, using a novel approach of sequence-structure alignment and phylogenetic reconstruction using molecular morphometrics, is congruent with the current placement of Hydatellaceae within the early-divergent angiosperm order Nymphaeales. The results underscore the fact that more diverse genera, if not fully resolved to be monophyletic, should be represented by all major lineages.
Affiliation(s)
- Devendra Kumar Biswal
  - Bioinformatics Centre, North Eastern Hill University, Shillong 793022, Meghalaya, India
36
Hao DC, Xiao PG, Ge GB, Liu M. Biological, Chemical, and Omics Research of Taxus Medicinal Resources. Drug Dev Res 2012. [DOI: 10.1002/ddr.21040] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.4] [Indexed: 01/10/2023]
Affiliation(s)
- Da-Cheng Hao
  - Biotechnology Institute/School of Environment, Dalian Jiaotong University, Dalian, China
- Guang-Bo Ge
  - Pharmaceutical resource discovery, Dalian Institute of Chemical Physics, Chinese Academy of Sciences, Dalian, China
- Ming Liu
  - Biotechnology Institute/School of Environment, Dalian Jiaotong University, Dalian, China
37
Ghahramanzadeh R, Esselink G, Kodde LP, Duistermaat H, van Valkenburg JLCH, Marashi SH, Smulders MJM, van de Wiel CCM. Efficient distinction of invasive aquatic plant species from non-invasive related species using DNA barcoding. Mol Ecol Resour 2012; 13:21-31. [PMID: 23039943 DOI: 10.1111/1755-0998.12020] [Citation(s) in RCA: 31] [Impact Index Per Article: 2.6] [Received: 05/25/2012] [Revised: 08/08/2012] [Accepted: 08/16/2012] [Indexed: 11/29/2022]
Abstract
Biological invasions are regarded as threats to global biodiversity. Among invasive aliens, a number of plant species belonging to the genera Myriophyllum, Ludwigia and Cabomba, and to the family Hydrocharitaceae, pose a particular ecological threat to water bodies, so it is desirable to prevent them from entering a country. However, many related species are commercially traded, and distinguishing invasive from non-invasive species based on morphology alone is often difficult for plants in a vegetative stage. In this regard, DNA barcoding could become a good alternative. In this study, 242 samples belonging to 26 species from 10 genera of aquatic plants were assessed using the chloroplast loci trnH-psbA, matK and rbcL. Despite testing a large number of primer sets and several PCR protocols, the matK locus could not be amplified or sequenced reliably and was therefore left out of the analysis. Using the other two loci, eight invasive species could be distinguished from their respective related species; a ninth failed to produce sequences of sufficient quality. Based on the criteria of universal application, high sequence divergence and level of species discrimination, the trnH-psbA noncoding spacer was the best performing barcode in the aquatic plant species studied. Thus, DNA barcoding may help enforce a ban on trade in such invasive species, such as the one already in place in the Netherlands. This will be even more the case once DNA barcoding becomes a routine procedure operable by nonspecialists in botany and molecular genetics.
Affiliation(s)
- R Ghahramanzadeh
  - Wageningen UR Plant Breeding, Wageningen, NL-6700 AA, The Netherlands
38
Coissac E, Riaz T, Puillandre N. Bioinformatic challenges for DNA metabarcoding of plants and animals. Mol Ecol 2012; 21:1834-47. [PMID: 22486822 DOI: 10.1111/j.1365-294x.2012.05550.x] [Citation(s) in RCA: 160] [Impact Index Per Article: 13.3] [Indexed: 11/28/2022]
Abstract
Almost all empirical studies in ecology have to identify the species involved in the ecological process under examination. DNA metabarcoding, which couples the principles of DNA barcoding with next generation sequencing technology, provides an opportunity to easily produce large amounts of data on biodiversity. Microbiologists have long used metabarcoding approaches, but use of this technique in the assessment of biodiversity in plant and animal communities is under-explored. Despite its relationship with DNA barcoding, several unique features of DNA metabarcoding justify the development of specific data analysis methodologies. In this review, we describe the bioinformatics tools available for DNA metabarcoding of plants and animals, and we revisit others developed for DNA barcoding or microbial metabarcoding. We also discuss the principles and associated tools for evaluating and comparing DNA barcodes in the context of DNA metabarcoding, for designing new custom-made barcodes adapted to specific ecological questions, for dealing with PCR and sequencing errors, and for inferring taxonomical data from sequences.
Affiliation(s)
- Eric Coissac
  - Laboratoire d'Ecologie Alpine, CNRS UMR 5553, Université Joseph Fourier, Grenoble, France.
39
Li CP, Yu ZG, Han GS, Chu KH. Analyzing multi-locus plant barcoding datasets with a composition vector method based on adjustable weighted distance. PLoS One 2012; 7:e42154. [PMID: 22848736 PMCID: PMC3407124 DOI: 10.1371/journal.pone.0042154] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Received: 04/04/2012] [Accepted: 07/02/2012] [Indexed: 11/18/2022]
Abstract
BACKGROUND The composition vector (CV) method has been proved to be a reliable and fast alignment-free method to analyze large COI barcoding data. In this study, we modify this method for analyzing multi-gene datasets for plant DNA barcoding. The modified method includes an adjustable-weighted algorithm for the vector distance according to the ratio in sequence length of the candidate genes for each pair of taxa. METHODOLOGY/PRINCIPAL FINDINGS Three datasets, matK+rbcL dataset with 2,083 sequences, matK+rbcL dataset with 397 sequences and matK+rbcL+trnH-psbA dataset with 397 sequences, were tested. We showed that the success rates of grouping sequences at the genus/species level based on this modified CV approach are always higher than those based on the traditional K2P/NJ method. For the matK+rbcL datasets, the modified CV approach outperformed the K2P-NJ approach by 7.9% in both the 2,083-sequence and 397-sequence datasets, and for the matK+rbcL+trnH-psbA dataset, the CV approach outperformed the traditional approach by 16.7%. CONCLUSIONS We conclude that the modified CV approach is an efficient method for analyzing large multi-gene datasets for plant DNA barcoding. Source code, implemented in C++ and supported on MS Windows, is freely available for download at http://math.xtu.edu.cn/myphp/math/research/source/Barcode_source_codes.zip.
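The K2P distance that serves as the baseline in this comparison is the Kimura two-parameter distance, d = -1/2 ln[(1 - 2P - Q) sqrt(1 - 2Q)], where P and Q are the proportions of sites showing transitions and transversions. The Python sketch below computes it for one pair of aligned sequences. It illustrates the baseline only, not the composition vector method or the authors' C++ code, and the function name and toy sequences are hypothetical.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(a, b):
    """Kimura two-parameter distance between two aligned DNA sequences.

    Sites with gaps or ambiguity codes in either sequence are skipped.
    Raises ValueError if the sequences are too divergent for the formula
    (the argument of the logarithm becomes non-positive).
    """
    sites = transitions = transversions = 0
    for x, y in zip(a.upper(), b.upper()):
        if x not in "ACGT" or y not in "ACGT":
            continue
        sites += 1
        if x == y:
            continue
        pair = {x, y}
        if pair <= PURINES or pair <= PYRIMIDINES:
            transitions += 1      # A<->G or C<->T
        else:
            transversions += 1    # purine <-> pyrimidine
    if sites == 0:
        raise ValueError("no comparable sites")
    p = transitions / sites
    q = transversions / sites
    inner = (1 - 2 * p - q)
    if inner <= 0 or (1 - 2 * q) <= 0:
        raise ValueError("sequences too divergent for K2P")
    return -0.5 * math.log(inner * math.sqrt(1 - 2 * q))


# Toy example: a single A->G transition in ten sites.
d_same = k2p_distance("ACGTACGTAC", "ACGTACGTAC")
d_one_transition = k2p_distance("ACGTACGTAC", "GCGTACGTAC")
print(d_same, d_one_transition)
```

A matrix of such pairwise distances is what a neighbour-joining tree (the "NJ" in K2P/NJ) would be built from. Because the distance is defined per aligned site, it requires a multiple sequence alignment, which is the step that alignment-free approaches such as the composition vector method avoid.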
Affiliation(s)
- Chi Pang Li
  - School of Life Sciences, The Chinese University of Hong Kong, Shatin, Hong Kong SAR, China
- Zu Guo Yu
  - School of Mathematics and Computational Science, Xiangtan University, Hunan, China
  - School of Mathematical Sciences, Queensland University of Technology, Brisbane, Australia
- Guo Sheng Han
  - School of Mathematics and Computational Science, Xiangtan University, Hunan, China
- Ka Hou Chu
  - School of Life Sciences, The Chinese University of Hong Kong, Shatin, Hong Kong SAR, China
40
de Vere N, Rich TCG, Ford CR, Trinder SA, Long C, Moore CW, Satterthwaite D, Davies H, Allainguillaume J, Ronca S, Tatarinova T, Garbett H, Walker K, Wilkinson MJ. DNA barcoding the native flowering plants and conifers of Wales. PLoS One 2012; 7:e37945. [PMID: 22701588 PMCID: PMC3368937 DOI: 10.1371/journal.pone.0037945] [Citation(s) in RCA: 93] [Impact Index Per Article: 7.8] [Received: 01/15/2012] [Accepted: 04/26/2012] [Indexed: 11/19/2022]
Abstract
We present the first national DNA barcode resource that covers the native flowering plants and conifers for the nation of Wales (1143 species). Using the plant DNA barcode markers rbcL and matK, we have assembled 97.7% coverage for rbcL, 90.2% for matK, and a dual-locus barcode for 89.7% of the native Welsh flora. We have sampled multiple individuals for each species, resulting in 3304 rbcL and 2419 matK sequences. The majority of our samples (85%) are from DNA extracted from herbarium specimens. Recoverability of DNA barcodes is lower using herbarium specimens, compared to freshly collected material, mostly due to lower amplification success, but this is balanced by the increased efficiency of sampling species that have already been collected, identified, and verified by taxonomic experts. The effectiveness of the DNA barcodes for identification (level of discrimination) is assessed using four approaches: the presence of a barcode gap (using pairwise and multiple alignments), formation of monophyletic groups using Neighbour-Joining trees, and sequence similarity in BLASTn searches. These approaches yield similar results, providing relative discrimination levels of 69.4 to 74.9% of all species and 98.6 to 99.8% of genera using both markers. Species discrimination can be further improved using spatially explicit sampling. Mean species discrimination using barcode gap analysis (with a multiple alignment) is 81.6% within 10×10 km squares and 93.3% for 2×2 km squares. Our database of DNA barcodes for Welsh native flowering plants and conifers represents the most complete coverage of any national flora, and offers a valuable platform for a wide range of applications that require accurate species identification.
Affiliation(s)
- Natasha de Vere
  - National Botanic Garden of Wales, Llanarthne, United Kingdom.
41
Coghlan ML, Haile J, Houston J, Murray DC, White NE, Moolhuijzen P, Bellgard MI, Bunce M. Deep sequencing of plant and animal DNA contained within traditional Chinese medicines reveals legality issues and health safety concerns. PLoS Genet 2012; 8:e1002657. [PMID: 22511890 PMCID: PMC3325194 DOI: 10.1371/journal.pgen.1002657] [Citation(s) in RCA: 166] [Impact Index Per Article: 13.8] [Received: 08/12/2011] [Accepted: 03/02/2012] [Indexed: 12/14/2022]
Abstract
Traditional Chinese medicine (TCM) has been practiced for thousands of years, but only within the last few decades has its use become more widespread outside of Asia. Concerns continue to be raised about the efficacy, legality, and safety of many popular complementary alternative medicines, including TCMs. Ingredients of some TCMs are known to include derivatives of endangered, trade-restricted species of plants and animals, and therefore contravene the Convention on International Trade in Endangered Species (CITES) legislation. Chromatographic studies have detected the presence of heavy metals and plant toxins within some TCMs, and there are numerous cases of adverse reactions. It is in the interests of both biodiversity conservation and public safety that techniques are developed to screen medicinals like TCMs. Targeting both the p-loop region of the plastid trnL gene and the mitochondrial 16S ribosomal RNA gene, over 49,000 amplicon sequence reads were generated from 15 TCM samples presented in the form of powders, tablets, capsules, bile flakes, and herbal teas. Here we show that second-generation, high-throughput sequencing (HTS) of DNA represents an effective means to genetically audit organic ingredients within complex TCMs. Comparison of DNA sequence data to reference databases revealed the presence of 68 different plant families and included genera, such as Ephedra and Asarum, that are potentially toxic. Similarly, animal families were identified that include genera that are classified as vulnerable, endangered, or critically endangered, including Asiatic black bear (Ursus thibetanus) and Saiga antelope (Saiga tatarica). Bovidae, Cervidae, and Bufonidae DNA were also detected in many of the TCM samples and were rarely declared on the product packaging. 
This study demonstrates that deep sequencing via HTS is an efficient and cost-effective way to audit highly processed TCM products and will assist in monitoring their legality and safety especially when plant reference databases become better established.
Affiliation(s)
- Megan L Coghlan
  - Australian Wildlife Forensic Services and Ancient DNA Laboratory, School of Biological Sciences and Biotechnology, Murdoch University, Murdoch, Australia
42
Abstract
Success of species assignment using DNA barcodes has been shown to vary among plant lineages because of a wide range of different factors. In this study, we confirm the theoretical prediction that gene flow influences species assignment with simulations and a literature survey. We show that the genome experiencing the highest gene flow is, in the majority of the cases, the best suited for species delimitation. Our results clearly suggest that, for most angiosperm groups, plastid markers will not be the most appropriate for use as DNA barcodes. We therefore advocate shifting the focus from plastid to nuclear markers to achieve an overall higher success using DNA barcodes.
Affiliation(s)
- Yamama Naciri
  - Unité de Phylogénie et de Génétique Moléculaires, Conservatoire et Jardin botaniques de la Ville de Genève, Chemin de l'Impératrice 1, 1292 Chambésy, Switzerland.
43
van Velzen R, Weitschek E, Felici G, Bakker FT. DNA barcoding of recently diverged species: relative performance of matching methods. PLoS One 2012; 7:e30490. [PMID: 22272356 PMCID: PMC3260286 DOI: 10.1371/journal.pone.0030490] [Citation(s) in RCA: 124] [Impact Index Per Article: 10.3] [Received: 06/22/2011] [Accepted: 12/22/2011] [Indexed: 12/23/2022]
Abstract
Recently diverged species are challenging for identification, yet they are frequently of special interest scientifically as well as from a regulatory perspective. DNA barcoding has proven instrumental in species identification, especially in insects and vertebrates, but for the identification of recently diverged species it has been reported to be problematic in some cases. Problems are mostly due to incomplete lineage sorting or simply lack of a 'barcode gap' and probably related to large effective population size and/or low mutation rate. Our objective was to compare six methods in their ability to correctly identify recently diverged species with DNA barcodes: neighbor joining and parsimony (both tree-based), nearest neighbor and BLAST (similarity-based), and the diagnostic methods DNA-BAR, and BLOG. We analyzed simulated data assuming three different effective population sizes as well as three selected empirical data sets from published studies. Results show, as expected, that success rates are significantly lower for recently diverged species (∼75%) than for older species (∼97%) (P<0.00001). Similarity-based and diagnostic methods significantly outperform tree-based methods, when applied to simulated DNA barcode data (P<0.00001). The diagnostic method BLOG had highest correct query identification rate based on simulated (86.2%) as well as empirical data (93.1%), indicating that it is a consistently better method overall. Another advantage of BLOG is that it offers species-level information that can be used outside the realm of DNA barcoding, for instance in species description or molecular detection assays. Even though we can confirm that identification success based on DNA barcoding is generally high in our data, recently diverged species remain difficult to identify. Nevertheless, our results contribute to improved solutions for their accurate identification.
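The general idea behind diagnostic (character-based) identification can be sketched as a search for alignment positions where every sampled member of a species carries a base that no other species carries. The Python below is a simplified illustration of that idea only. It does not reproduce the DNA-BAR or BLOG algorithms compared in the paper, and the function name and toy data are hypothetical.

```python
def diagnostic_sites(records):
    """Find simple diagnostic characters in an alignment.

    records: list of (species, aligned_sequence) tuples of equal length.
    Returns {species: [(position, base), ...]} listing 0-based positions
    where the species is fixed for a base absent from all other species.
    """
    by_species = {}
    for name, seq in records:
        by_species.setdefault(name, []).append(seq.upper())
    length = len(records[0][1])
    found = {name: [] for name in by_species}
    for pos in range(length):
        for name, seqs in by_species.items():
            own = {s[pos] for s in seqs}
            if len(own) != 1:
                continue  # polymorphic within the species, not diagnostic
            base = next(iter(own))
            others = {s[pos]
                      for other, other_seqs in by_species.items()
                      if other != name
                      for s in other_seqs}
            if base not in others:
                found[name].append((pos, base))
    return found


# Toy alignment (hypothetical data).
toy = [
    ("sp_A", "ACGTAC"), ("sp_A", "ACGTAC"),
    ("sp_B", "ACGTTC"), ("sp_B", "ACGTTC"),
    ("sp_C", "GCGTAC"), ("sp_C", "GCGTAT"),
]
sites = diagnostic_sites(toy)
print(sites)
```

In the toy data sp_B and sp_C each have one private fixed character, while sp_A has none and could only be recognised by a combination of characters. Characters of this kind are the sort of species-level information that, as the abstract notes, can be reused outside barcoding, for example in species descriptions or detection assays.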
Affiliation(s)
- Robin van Velzen
  - Biosystematics Group, Wageningen University, Wageningen, The Netherlands.