1
An HE, Mun MH, Malik A, Kim CB. Development of a two-layer machine learning model for the forensic application of legal and illegal poppy classification based on sequence data. Forensic Sci Int Genet 2024; 71:103061. [PMID: 38820740 DOI: 10.1016/j.fsigen.2024.103061] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/07/2023] [Revised: 02/09/2024] [Accepted: 05/06/2024] [Indexed: 06/02/2024]
Abstract
Poppies are beneficial plants with a variety of applications, including medicinal, edible, ornamental, and industrial purposes. Some Papaver species are forensically significant plants because they contain opium, a narcotic substance. Internationally trafficked species of illegal poppies are being identified by DNA barcoding employing multiple markers in response to their forensic value. However, effective markers for precise species identification of legal and illegal poppies are still under discussion, with research on illegal poppies focusing on Papaver somniferum L., and species identification studies of Papaver bracteatum and Papaver setigerum DC. still lacking. As a result, in order to evaluate the performance of genetic markers and classify their DNA sequences in the genus Papaver, this study developed the first machine learning-based two-layer model, in which the first layer classifies legal and illegal poppies from the given sequence and the second layer identifies species of illegal poppies using their sequences. We constructed the dataset and investigated biological features from four markers, internal transcribed spacer 1 (ITS1), internal transcribed spacer 2 (ITS2), transfer RNA Leucine (trnL), transfer RNA Leucine - transfer RNA Phenylalanine intergenic spacer (trnL-trnF intergenic spacer) and their combination, using four machine learning algorithms, K-nearest neighbor (KNN), Naïve Bayes (NB), extreme gradient boost (XGBoost) and Random Forest (RF). According to our findings, for Layer 1 to classify legal and illegal poppies, KNN-based models using combined ITS region achieved the greatest performance of accuracy 0.846 and 0.889 using training and test sets, respectively. Additionally, for Layer 2 to identify illegal poppy species, KNN-based models using combined ITS region achieved the best performance of 0.833 and 1.000 for using training and test sets, respectively. 
To validate the model, the combined ITS region, which includes ITS1 and ITS2 sequences, from blind poppy samples was used as a case study, with Layer 1 correctly classifying legal and illegal poppies with over 0.830 accuracy. Layer 2 correctly identified P. setigerum DC.; however, only one of the three P. somniferum L. samples was accurately identified. Nevertheless, our research shows that machine learning can be used to classify and identify legal and illegal poppy species using DNA barcodes, which can then be used as an efficient and effective forensic tool for improved law enforcement and a safer society.
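The two-layer design described in this abstract (first legal versus illegal, then species within the illegal class) can be sketched in a few lines. This is an illustration only: the k-mer frequency features, the Euclidean distance, the default `k=1`, and the names `kmer_profile`, `knn_predict`, and `TwoLayerClassifier` are assumptions made for this example and are not taken from the paper, which reports only that KNN models on the combined ITS region performed best.

```python
from collections import Counter
from math import sqrt


def kmer_profile(seq, k=3):
    """Relative k-mer frequencies of a DNA sequence (assumed feature set)."""
    seq = seq.upper()
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts.values()) or 1
    return {kmer: n / total for kmer, n in counts.items()}


def distance(p, q):
    """Euclidean distance between two sparse k-mer profiles."""
    keys = set(p) | set(q)
    return sqrt(sum((p.get(x, 0.0) - q.get(x, 0.0)) ** 2 for x in keys))


def knn_predict(train, query, k=1):
    """Majority vote among the k nearest (profile, label) training pairs."""
    ranked = sorted(train, key=lambda item: distance(item[0], query))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


class TwoLayerClassifier:
    """Layer 1: legal vs. illegal. Layer 2: species, illegal samples only."""

    def __init__(self, records, k=1):
        # records: iterable of (sequence, status, species) tuples,
        # where status is "legal" or "illegal".
        self.k = k
        self.layer1 = [(kmer_profile(s), status) for s, status, _ in records]
        self.layer2 = [
            (kmer_profile(s), species)
            for s, status, species in records
            if status == "illegal"
        ]

    def predict(self, seq):
        profile = kmer_profile(seq)
        status = knn_predict(self.layer1, profile, self.k)
        if status != "illegal":
            return status, None  # Layer 2 is skipped for legal samples
        return status, knn_predict(self.layer2, profile, self.k)
```

Routing only the samples flagged as illegal into the second layer keeps the species model focused on the classes that matter forensically, which mirrors the structure the abstract describes.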
Affiliation(s)
- Hyung-Eun An
  - Department of Biotechnology, Sangmyung University, Seoul 03016, the Republic of Korea
- Min-Ho Mun
  - Department of Biotechnology, Sangmyung University, Seoul 03016, the Republic of Korea
- Adeel Malik
  - Institute of Intelligence Informatics Technology, Sangmyung University, Seoul 03016, the Republic of Korea
- Chang-Bae Kim
  - Department of Biotechnology, Sangmyung University, Seoul 03016, the Republic of Korea
2
Riza LS, Zain MI, Izzuddin A, Prasetyo Y, Hidayat T, Abu Samah KAF. Implementation of machine learning in DNA barcoding for determining the plant family taxonomy. Heliyon 2023; 9:e20161. [PMID: 37767518 PMCID: PMC10520734 DOI: 10.1016/j.heliyon.2023.e20161] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/09/2022] [Revised: 09/05/2023] [Accepted: 09/13/2023] [Indexed: 09/29/2023] Open
Abstract
The DNA barcoding approach has been used extensively in taxonomy and phylogenetics. The differences in certain DNA sequences are able to differentiate and help classify organisms into taxa. It has been used in cases of taxonomic disputes where morphology by itself is insufficient. This research aimed to utilize hierarchical clustering, an unsupervised machine learning method, to determine and resolve disputes in plant family taxonomy. We take a case study of Leguminosae that historically some classify into three families (Fabaceae, Caesalpiniaceae, and Mimosaceae) but others classify into one family (Leguminosae). This study is divided into several phases, which are: (i) data collection, (ii) data preprocessing, (iii) finding the best distance method, and (iv) determining disputed family. The data used are collected from several sources, including National Center for Biotechnology Information (NCBI), journals, and websites. The data for validation of the methods were collected from NCBI. This was used to determine the best distance method for differentiating families or genera. The data for the case study in the Leguminosae group was collected from journals and a website. From the experiment that we have conducted, we found that the Pearson method is the best distance method to do clustering ITS sequence of plants, both in accuracy and computational cost. We use the Pearson method to determine the disputed family between Leguminosae. We found that the case study of Leguminosae should be grouped into one family based on our research.
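Hierarchical clustering with a Pearson-based distance, as named in this abstract, can be illustrated with a small standard-library sketch. Several choices here are assumptions for the example rather than details from the paper: sequences are encoded as k-mer count vectors, the distance is taken as one minus the Pearson correlation, and clusters are merged by average linkage. The function names are hypothetical.

```python
from itertools import product
from math import sqrt

ALPHABET = "ACGT"


def kmer_vector(seq, k=2):
    """Fixed-order k-mer count vector (assumed numeric encoding)."""
    kmers = ["".join(p) for p in product(ALPHABET, repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    seq = seq.upper()
    for i in range(len(seq) - k + 1):
        word = seq[i:i + k]
        if word in counts:  # skip windows containing ambiguous bases
            counts[word] += 1
    return [counts[w] for w in kmers]


def pearson_distance(x, y):
    """1 - Pearson correlation; 0 for perfectly correlated vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 1.0  # correlation undefined for a constant vector
    return 1.0 - cov / (sx * sy)


def average_linkage(vectors, n_clusters):
    """Agglomerative clustering: merge the closest pair until n_clusters remain."""
    clusters = [[i] for i in range(len(vectors))]

    def link(a, b):
        total = sum(pearson_distance(vectors[i], vectors[j]) for i in a for j in b)
        return total / (len(a) * len(b))

    while len(clusters) > n_clusters:
        pairs = [
            (link(clusters[i], clusters[j]), i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        ]
        _, i, j = min(pairs)
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return [sorted(c) for c in clusters]
```

Cutting the resulting hierarchy at one versus three clusters is the kind of comparison that would bear on the one-family versus three-family question raised in the abstract.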
Affiliation(s)
- Lala Septem Riza
  - Department of Computer Science Education, Universitas Pendidikan Indonesia, Bandung, Indonesia
- Muhammad Iqbal Zain
  - Department of Computer Science Education, Universitas Pendidikan Indonesia, Bandung, Indonesia
- Ahmad Izzuddin
  - Department of Computer Science Education, Universitas Pendidikan Indonesia, Bandung, Indonesia
- Yudi Prasetyo
  - Department of Computer Science Education, Universitas Pendidikan Indonesia, Bandung, Indonesia
- Topik Hidayat
  - Department of Biology Education, Universitas Pendidikan Indonesia, Bandung, Indonesia
3
Dev SA, Unnikrishnan R, Prathibha PS, Sijimol K, Sreekumar VB, AzharAli A, Anoop EV, Viswanath S. Artificial intelligence in timber forensics employing DNA barcode database. 3 Biotech 2023; 13:183. [PMID: 37193334 PMCID: PMC10182240 DOI: 10.1007/s13205-023-03604-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/18/2022] [Accepted: 05/03/2023] [Indexed: 05/18/2023] Open
Abstract
Extreme difficulties in species identification of illegally sourced wood with conventional tools have accelerated illicit logging activities, leading to the destruction of natural resources in India. In this regard, the study primarily focused on developing a DNA barcode database for 41 commercial timber tree species which are highly vulnerable to adulteration in south India. The developed DNA barcode database was validated using an integrated approach involving wood anatomical features of traded wood samples collected from south India. Traded wood samples were primarily identified using wood anatomical features using IAWA list of microscopic features for hardwood identification. Consortium of Barcode of Life (CBOL) recommended barcode gene regions (rbcL, matK & psbA-trnH) were employed for developing DNA barcode database. Secondly, we employed artificial intelligence (AI) analytical platform, Waikato Environment for Knowledge Analysis (WEKA) for analyzing DNA barcode sequence database which could append precision, speed, and accuracy for the entire identification process. Among the four classification algorithms implemented in the machine learning algorithm (WEKA), best performance was shown by SMO, which could clearly allocate individual samples to their respective sequence database of biological reference materials (BRM) with 100 % accuracy, indicating its efficiency in authenticating the traded timber species. Major advantage of AI is the ability to analyze huge data sets with more precision and also provides a large platform for rapid authentication of species, which subsequently reduces human labor and time. Supplementary Information The online version contains supplementary material available at 10.1007/s13205-023-03604-0.
Affiliation(s)
- Suma Arun Dev
  - Forest Genetic & Biotechnology Division, Kerala Forest Research Institute, Peechi, Thrissur, Kerala 680653 India
- Remya Unnikrishnan
  - Forest Genetic & Biotechnology Division, Kerala Forest Research Institute, Peechi, Thrissur, Kerala 680653 India
  - Cochin University of Science & Technology, Kochi, Kerala India
- P. S. Prathibha
  - Forest Genetic & Biotechnology Division, Kerala Forest Research Institute, Peechi, Thrissur, Kerala 680653 India
- K. Sijimol
  - Forest Genetic & Biotechnology Division, Kerala Forest Research Institute, Peechi, Thrissur, Kerala 680653 India
- V. B. Sreekumar
  - Forest Genetic & Biotechnology Division, Kerala Forest Research Institute, Peechi, Thrissur, Kerala 680653 India
- A. AzharAli
  - Department of Forest Products and Utilization, College of Forestry, Kerala Agricultural University, Vellanikara, Thrissur, Kerala 680654 India
- E. V. Anoop
  - Department of Forest Products and Utilization, College of Forestry, Kerala Agricultural University, Vellanikara, Thrissur, Kerala 680654 India
- Syam Viswanath
  - Forest Genetic & Biotechnology Division, Kerala Forest Research Institute, Peechi, Thrissur, Kerala 680653 India
4
Hishamuddin MS, Lee SY, Syazwan SA, Ramlee SI, Lamasudin DU, Mohamed R. Highly divergent regions in the complete plastome sequences of Aquilaria are suitable for DNA barcoding applications including identifying species origin of agarwood products. 3 Biotech 2023; 13:78. [PMID: 36761338 PMCID: PMC9902582 DOI: 10.1007/s13205-023-03479-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/05/2022] [Accepted: 01/13/2023] [Indexed: 02/09/2023] Open
Abstract
Members of Aquilaria Lam. (Thymelaeaceae) are evergreen trees that are widely distributed in the Indomalesia region. Aquilaria is highly prized for its unique scented resin, agarwood, which is often the subject of unlawful trade activities. Survival of the tree is heavily threatened by destructive harvesting and agarwood poaching, leading to its protection under the Convention on International Trade in Endangered Species of Wild Fauna and Flora (CITES). Unfortunately, an efficient species identification method, which is crucial to aid in the conservation efforts of Aquilaria is lacking. Here, we described our search for a suitable specific DNA barcode for Aquilaria species using eight complete plastome sequences. We identified five highly variable regions (HVR) (matK-rps16, ndhF-rpl32, psbJ-petA, trnD, and trnT-trnL) in the plastomes. These regions were further analyzed using the neighbor-joining (NJ) method to assess their ability at discriminating the eight species. Coupled with in silico primer design, two potential barcoding regions, psbJ-petA and trnT-trnL, were identified. Their strengths in species delimitation were evaluated individually and in combination, via DNA barcoding analysis. Our findings showed that the combined dataset, psbJ-petA + trnT-trnL, effectively resolved members of the genus Aquilaria by clustering all species into their respective clades. In addition, we demonstrated that the newly proposed DNA barcode was capable at identifying the species of origin of six commercial agarwood samples that were included as unknown samples. Such achievement offers a new technical advancement, useful in the combat against illicit agarwood trades and in assisting the conservation of these valuable species in natural populations. Supplementary Information The online version contains supplementary material available at 10.1007/s13205-023-03479-1.
Affiliation(s)
- Muhammad Syahmi Hishamuddin
  - Forest Biotechnology Laboratory, Department of Forestry Science and Biodiversity, Faculty of Forestry and Environment, Universiti Putra Malaysia, 43400 UPM Serdang, Selangor Malaysia
- Shiou Yih Lee
  - Forest Biotechnology Laboratory, Department of Forestry Science and Biodiversity, Faculty of Forestry and Environment, Universiti Putra Malaysia, 43400 UPM Serdang, Selangor Malaysia
- Samsuddin Ahmad Syazwan
  - Forest Biotechnology Laboratory, Department of Forestry Science and Biodiversity, Faculty of Forestry and Environment, Universiti Putra Malaysia, 43400 UPM Serdang, Selangor Malaysia
  - Mycology and Pathology Branch, Forest Biodiversity Division, Forest Research Institute Malaysia (FRIM), Jalan FRIM, 52109 Kuala Lumpur, Selangor Malaysia
- Shairul Izan Ramlee
  - Department of Crop Science, Faculty of Agriculture, Universiti Putra Malaysia, 43400 UPM Serdang, Selangor Malaysia
- Dhilia Udie Lamasudin
  - Department of Cell and Molecular Biology, Faculty of Biotechnology and Biomolecular Sciences, Universiti Putra Malaysia, 43400 UPM Serdang, Selangor Malaysia
- Rozi Mohamed
  - Forest Biotechnology Laboratory, Department of Forestry Science and Biodiversity, Faculty of Forestry and Environment, Universiti Putra Malaysia, 43400 UPM Serdang, Selangor Malaysia
5
DNA Barcodes for Accurate Identification of Selected Medicinal Plants (Caryophyllales): Toward Barcoding Flowering Plants of the United Arab Emirates. Diversity 2022. [DOI: 10.3390/d14040262] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Indexed: 01/12/2023]
Abstract
The need for herbal medicinal plants is steadily increasing. Hence, the accurate identification of plant material has become vital for safe usage, avoiding adulteration, and medicinal plant trading. DNA barcoding has shown to be a valuable molecular identification tool for medicinal plants, ensuring the safety and efficacy of plant materials of therapeutic significance. Using morphological characters in genera with closely related species, species delimitation is often difficult. Here, we evaluated the capability of the nuclear barcode ITS2 and plastid DNA barcodes rbcL and matK to identify 20 medicinally important plant species of Caryophyllales. In our analysis, we applied an integrative approach for species discrimination using pairwise distance-based unsupervised operational taxonomic unit “OTU picking” methods, viz., ABGD (Automated Barcode Gap Analysis) and ASAP (Assemble Species by Automatic Partitioning). Along with the unsupervised OTU picking methods, Supervised Machine Learning methods (SML) were also implemented to recognize divergent taxa. Our results indicated that ITS2 was more successful in distinguishing between examined species, implying that it could be used to detect the contamination and adulteration of these medicinally important plants. Moreover, this study suggests that the combination of more than one method could assist in the resolution of morphologically similar or closely related taxa.
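Distance-based OTU picking rests on the idea of a barcode gap: pairwise distances within a species should be smaller than distances between species. The sketch below is a simplified illustration of that idea only and is not an implementation of ABGD or ASAP. It assumes pre-aligned sequences of equal length and uses the uncorrected p-distance; the function names and the tuple layout are hypothetical.

```python
from itertools import combinations


def p_distance(a, b):
    """Proportion of differing sites between two aligned sequences.

    Sites where either sequence has a gap ('-') are ignored.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    sites = [(x, y) for x, y in zip(a.upper(), b.upper()) if x != "-" and y != "-"]
    if not sites:
        raise ValueError("no comparable sites")
    return sum(x != y for x, y in sites) / len(sites)


def barcode_gap(records):
    """Return (max intraspecific, min interspecific) p-distance.

    records: list of (species_label, aligned_sequence) tuples.
    A marker shows a barcode gap when the first value is below the second.
    """
    intra, inter = [], []
    for (sp1, s1), (sp2, s2) in combinations(records, 2):
        (intra if sp1 == sp2 else inter).append(p_distance(s1, s2))
    return max(intra, default=0.0), min(inter, default=0.0)
```

Running `barcode_gap` separately on the alignments for each marker gives a rough, marker-by-marker comparison of discriminating power before any clustering or supervised method is applied.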
6
Geometric and Topological Bases of a New Classification of Wood Vascular Tissues, Part 2: Classification of Vessels According to Their Grouping. Sustainability 2022. [DOI: 10.3390/su14042031] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/27/2023]
Abstract
The arrangement of vessels and their grouping is unique in most tree species. When observing tiny, microscopic samples of wood, the arrangement of the wood vessels forms a characteristic and repetitive pattern, which is largely determined by the tree species, but it is also influenced by the site conditions as well as its location in the tree. The present study is part of a project aimed at applying computer vision and computer recognition methods to present a more general and comprehensive group classification of wood vessels. Quantitative descriptions of the grouping of vessels, as a rule, have so far been used mainly to reveal characteristic deviations from the typical structure of wood, for example, due to extreme site conditions. Therefore, they are applicable but not sufficient for the present study and need in-depth revision. A classification of vessels is presented depending on their mutual position, and more precisely, the groups of adjacent vessels are determined using quantitative methods. The quantitative indicators used for this purpose are based on the diameter and other quantitative indicators of the vessels’ arrangements. The proposed classification, although based on a long-known classification scheme in structural wood science, allows for the more precise definition of the classes of a grouping of adjacent vessels in a cross-section as a necessary step towards the wider use of the methods of machine recognition of wood.
7
Jamdade R, Al-Shaer K, Al-Sallani M, Al-Harthi E, Mahmoud T, Gairola S, Shabana HA. Multilocus marker-based delimitation of Salicornia persica and its population discrimination assisted by supervised machine learning approach. PLoS One 2022; 17:e0270463. [PMID: 35895732 PMCID: PMC9328517 DOI: 10.1371/journal.pone.0270463] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 01/28/2021] [Accepted: 06/10/2022] [Indexed: 11/18/2022] Open
Abstract
The Salicornia L. has been considered one of the most taxonomically challenging genera due to high morphological plasticity, intergradation between related species, and lack of diagnostic features in preserved herbarium specimens. In the United Arab Emirates (UAE), only one species of this genus, Salicornia europaea, has been reported, though investigating its identity at the molecular level has not yet been undertaken. Moreover, based on growth form and morphology variation between the Ras-Al-Khaimah (RAK) population and the Umm-Al-Quwain (UAQ) population, we suspect the presence of different species or morphotypes. The present study aimed to initially perform species identification using multilocus DNA barcode markers from chloroplast DNA (cpDNA) and nuclear ribosomal DNA (nrDNA), followed by the genetic divergence between two populations (RAK and UAQ) belonging to two different coastal localities in the UAE. The analysis resulted in high-quality multilocus barcode sequences subjected to species discrimination through the unsupervised OTU picking and supervised learning methods. The ETS sequence data from our study sites had high identity with the previously reported sequences of Salicornia persica using NCBI blast and was further confirmed using OTU picking methods viz., TaxonDNAs Species identifier and Assemble Species by Automatic Partitioning (ASAP). Moreover, matK sequence data showed a non-monophyletic relationship, and significant discrimination between the two populations through alignment-based unsupervised OTU picking, alignment-free Co-Phylog, and alignment & alignment-free supervised learning approaches. Other markers viz., rbcL, trnH-psbA, ITS2, and ETS could not distinguish the two populations individually, though their combination with matK (cpDNA & cpDNA+nrDNA) showed enough population discrimination. However, the ITS2+ETS (nrDNA) exhibited much higher genetic divergence, further splitting both the populations into four haplotypes. 
Based on the observed morphology, genetic divergence, and the number of haplotypes predicted using the matK marker, it can be suggested that two distinct populations (RAK and UAQ) do exist. Further extensive morpho-taxonomic studies are required to determine the inter-population variability of Salicornia in the UAE. Altogether, our results suggest that S. persica is the species that grow in the present study area in UAE, and do not support previous treatments as S. europaea.
Affiliation(s)
- Rahul Jamdade
  - Sharjah Seed Bank and Herbarium, Environment and Protected Areas Authority (EPAA), Sharjah, United Arab Emirates
- Khawla Al-Shaer
  - Sharjah Seed Bank and Herbarium, Environment and Protected Areas Authority (EPAA), Sharjah, United Arab Emirates
- Mariam Al-Sallani
  - Sharjah Seed Bank and Herbarium, Environment and Protected Areas Authority (EPAA), Sharjah, United Arab Emirates
- Eman Al-Harthi
  - Sharjah Seed Bank and Herbarium, Environment and Protected Areas Authority (EPAA), Sharjah, United Arab Emirates
- Tamer Mahmoud
  - Sharjah Seed Bank and Herbarium, Environment and Protected Areas Authority (EPAA), Sharjah, United Arab Emirates
  - Nature Conservation Sector, Egyptian Environmental Affairs Agency, Cairo, Egypt
- Sanjay Gairola
  - Sharjah Seed Bank and Herbarium, Environment and Protected Areas Authority (EPAA), Sharjah, United Arab Emirates
- Hatem A. Shabana
  - Sharjah Seed Bank and Herbarium, Environment and Protected Areas Authority (EPAA), Sharjah, United Arab Emirates
  - Nature Conservation Sector, Egyptian Environmental Affairs Agency, Cairo, Egypt
8
Quantification of adulteration in traded ayurvedic raw drugs employing machine learning approaches with DNA barcode database. 3 Biotech 2021; 11:463. [PMID: 34745814 DOI: 10.1007/s13205-021-03001-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/01/2021] [Accepted: 09/26/2021] [Indexed: 10/20/2022] Open
Abstract
Adulteration of expensive raw drugs with inferior taxa has become a routine practice, compromising the quality and safety of derived herbal products. In this regard, the study addresses the development of an integrated approach encompassing DNA barcode and HPTLC fingerprinting to authenticate chiefly traded ayurvedic raw drugs in south India [viz. Saraca asoca (Roxb.) Willd., Terminalia arjuna (Roxb. ex DC.) Wight and Arn., Sida alnifolia L. and Desmodium gangeticum (L.) DC.] from their adulterants. Consortium of Barcode of Life (CBOL) recommended DNA barcode gene regions viz. nuclear ribosomal-Internal Transcribed Spacer (nrDNA-ITS), maturase K (matK), ribulose-1,5-bisphosphate carboxylase/oxygenase large subunit (rbcL) and psbA-trnH spacer regions along with HPTLC profiling were tested and a reference database was created. Further, an integrated analytical approach employing a genetic distance-based Maximum Likelihood phylogenetic tree and Artificial Intelligence (AI)-based Machine Learning Algorithms (MLA), namely Waikato Environment for Knowledge Analysis (WEKA) and Barcoding with Logic (BLOG), was employed to prove the efficacy of the DNA barcode tool. Although, among the four barcodes, psbA-trnH (S. alnifolia and its adulterants, T. arjuna and its adulterants) or ITS region (S. asoca and its adulterants, D. gangeticum and its adulterants) showed the highest interspecific divergences in the selected Biological Reference Materials (BRMs), rbcL or matK barcode regions alone were successful for authentication of traded samples. The automated species identification techniques, WEKA and BLOG, applied for the first time in India for raw drug validation, could achieve rapid and precise identification. A national certification agency for raw drug authentication employing an integrated approach involving a DNA barcoding tool along with standard organoleptic and analytical methods can strengthen and ensure safety and quality of herbal medicines in India.
SUPPLEMENTARY INFORMATION The online version contains supplementary material available at 10.1007/s13205-021-03001-5.
9
Unnikrishnan R, Sumod M, Jayaraj R, Sujanapal P, Dev SA. The efficacy of machine learning algorithm for raw drug authentication in Coscinium fenestratum (Gaertn.) Colebr. employing a DNA barcode database. Physiology and Molecular Biology of Plants 2021; 27:605-617. [PMID: 33854287 PMCID: PMC7981360 DOI: 10.1007/s12298-021-00965-9] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 01/03/2021] [Revised: 02/19/2021] [Accepted: 03/02/2021] [Indexed: 05/05/2023]
Abstract
Medicinal plants are a valuable resource for traditional as well as modern medicine. Consequently, huge demand has exerted a heavy strain on the existing natural resources. Due to overexploitation and unscientific collection, most of the commercially traded ayurvedic plants are in the phase of depletion. Adulteration of expensive raw drugs with inferior taxa has become a common practice to meet the annual demand of the ayurvedic industry. Although there are several recommended methods for proper identification, varying from the traditional taxonomic to organoleptic and physicochemical, it is difficult to authenticate ayurvedic raw drugs available in extremely dried, powdered or shredded forms. In this regard, the study addresses proper authentication and illicit trade in Coscinium fenestratum (Gaertn.) Colebr. using CBOL recommended standard barcode regions viz. nuclear ribosomal-Internally Transcribed Spacer (nrDNA-ITS), maturase K (matK), ribulose-1,5-bisphosphate carboxylase/oxygenase large subunit (rbcL), and psbA-trnH spacer regions. Further, an integrated analytical approach employing Maximum Likelihood phylogenetic tree and Machine Learning Approach, Waikato Environment for Knowledge Analysis, was employed to prove efficacy of the method. The automated species identification technique, Artificial Intelligence, uses the ability of computers to build models that can receive the input data and then conduct statistical analyses, which significantly reduces the human labour. Concurrently, scientific management, restoration, cultivation and conservation measures should be given utmost priority to reduce the depletion of wild resources as well as to meet the rapidly increasing demand of the herbal industries.
Affiliation(s)
- Remya Unnikrishnan
  - Forest Genetics and Biotechnology Division, Kerala Forest Research Institute, Peechi, Thrissur, Kerala 680653 India
  - Cochin University of Science and Technology, Kochi, Kerala India
- M. Sumod
  - Sustainable Forest Management Division, Kerala Forest Research Institute, Peechi, Thrissur, Kerala 680653 India
- R. Jayaraj
  - Forest Ecology and Biodiversity Division, Kerala Forest Research Institute, Peechi, Thrissur, Kerala 680653 India
- P. Sujanapal
  - Sustainable Forest Management Division, Kerala Forest Research Institute, Peechi, Thrissur, Kerala 680653 India
- Suma Arun Dev
  - Forest Genetics and Biotechnology Division, Kerala Forest Research Institute, Peechi, Thrissur, Kerala 680653 India
10
Hong Z, Wu Z, Zhao K, Yang Z, Zhang N, Guo J, Tembrock LR, Xu D. Comparative Analyses of Five Complete Chloroplast Genomes from the Genus Pterocarpus (Fabacaeae). Int J Mol Sci 2020; 21:E3758. [PMID: 32466556 PMCID: PMC7312355 DOI: 10.3390/ijms21113758] [Citation(s) in RCA: 41] [Impact Index Per Article: 10.3] [Received: 03/28/2020] [Revised: 05/20/2020] [Accepted: 05/24/2020] [Indexed: 12/14/2022] Open
Abstract
Pterocarpus is a genus of trees mainly distributed in tropical Asia, Africa, and South America. Some species of Pterocarpus are rosewood tree species, having important economic value for timber, and for some species, medicinal value as well. Up to now, information about this genus with regard to the genomic characteristics of the chloroplasts has been limited. Based on a combination of next-generation sequencing (Illumina Hiseq) and long-read sequencing (PacBio), the whole chloroplast genomes (cp genomes) of five species (rosewoods) in Pterocarpus (Pterocarpus macrocarpus, P. santalinus, P. indicus, P. pedatus, P. marsupium) have been assembled. The cp genomes of five species in Pterocarpus have similar structural characteristics, gene content, and sequence to other flowering plants. The cp genomes have a typical four-part structure, containing 110 unique genes (77 protein coding genes, 4 rRNAs, 29 tRNAs). Through comparative genomic analysis, abundant simple sequence repeat (SSR) loci (333-349) were detected in Pterocarpus, among which A/T single nucleotide repeats accounted for the highest proportion (72.8-76.4%). In the five cp genomes of Pterocarpus, eight hypervariable regions, including trnH-GUG_psbA, trnS-UGA_psbC, accD-psaI, ndhI-exon2_ndhI-exon1, ndhG_ndhi-exon2, rpoC2-exon2, ccsA, and trnfM-CAU, are proposed for use as DNA barcode regions. In the comparison of gene selection pressures (P. santalinus as the reference genome), purifying selection was inferred as the primary mode of selection in maintaining important biological functions. Phylogenetic analysis shows that Pterocarpus is a monophyletic group. The species P. tinctorius is resolved as early diverging in the genus. Pterocarpus was resolved as sister to the genus Tipuana.
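The count of single-nucleotide repeats and the share of them made up of A or T, as reported in this abstract, can be illustrated with a short regular-expression sketch. The minimum run length of 10 and the function names `find_mono_ssrs` and `at_fraction` are assumptions chosen for this example; the abstract does not state the detection tool or thresholds that were used.

```python
import re


def find_mono_ssrs(seq, min_repeats=10):
    """Locate mononucleotide repeats of at least min_repeats bases.

    Returns a list of (start_index, base, run_length) tuples.
    The threshold is an assumed value, not one taken from the paper.
    """
    # ([ACGT]) captures one base; \1{n,} requires n or more further copies.
    pattern = re.compile(r"([ACGT])\1{%d,}" % (min_repeats - 1))
    return [
        (m.start(), m.group(1), len(m.group(0)))
        for m in pattern.finditer(seq.upper())
    ]


def at_fraction(ssrs):
    """Fraction of the detected repeats whose repeated base is A or T."""
    if not ssrs:
        return 0.0
    return sum(1 for _, base, _ in ssrs if base in "AT") / len(ssrs)
```

Extending the pattern to di- and trinucleotide motifs would need a separate threshold per motif length, which is how dedicated SSR tools are usually configured.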
Affiliation(s)
- Zhou Hong
  - State Key Laboratory of Tree Genetics and Breeding, Research Institute of Tropical Forestry, Chinese Academy of Forestry, Guangzhou 510520, China
- Zhiqiang Wu
  - Guangdong Laboratory for Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen 518120, China
- Kunkun Zhao
  - State Key Laboratory of Tree Genetics and Breeding, Research Institute of Tropical Forestry, Chinese Academy of Forestry, Guangzhou 510520, China
- Zengjiang Yang
  - State Key Laboratory of Tree Genetics and Breeding, Research Institute of Tropical Forestry, Chinese Academy of Forestry, Guangzhou 510520, China
- Ningnan Zhang
  - State Key Laboratory of Tree Genetics and Breeding, Research Institute of Tropical Forestry, Chinese Academy of Forestry, Guangzhou 510520, China
- Junyu Guo
  - State Key Laboratory of Tree Genetics and Breeding, Research Institute of Tropical Forestry, Chinese Academy of Forestry, Guangzhou 510520, China
- Luke R. Tembrock
  - Department of Agricultural Biology, Colorado State University, Fort Collins, CO 80523, USA
- Daping Xu
  - State Key Laboratory of Tree Genetics and Breeding, Research Institute of Tropical Forestry, Chinese Academy of Forestry, Guangzhou 510520, China
11
Yang CQ, Lv Q, Zhang AB. Sixteen Years of DNA Barcoding in China: What Has Been Done? What Can Be Done? Front Ecol Evol 2020. [DOI: 10.3389/fevo.2020.00057] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Indexed: 12/12/2022] Open
12
Machine Learning Models with Quantitative Wood Anatomy Data Can Discriminate between Swietenia macrophylla and Swietenia mahagoni. Forests 2019. [DOI: 10.3390/f11010036] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Indexed: 12/18/2022]
Abstract
Illegal logging and associated trade aggravate the over-exploitation of Swietenia species, of which S. macrophylla King, S. mahagoni (L.) Jacq, and S. humilis Zucc. have been listed in Convention on International Trade in Endangered Species of Wild Fauna and Flora (CITES) Appendix Ⅱ. Implementation of CITES necessitates the development of efficient forensic tools to identify wood species accurately, and ideally ones readily deployable in wood anatomy laboratories across the world. Herein, a method using quantitative wood anatomy data in combination with machine learning models to discriminate between three Swietenia species is presented, in addition to a second model focusing only on the two historically more important species S. mahagoni and S. macrophylla. The intra- and inter-specific variations in nine quantitative wood anatomical characters were measured and calculated based on 278 wood specimens, and four machine learning classifiers—Decision Tree C5.0, Naïve Bayes (NB), Support Vector Machine (SVM), and Artificial Neural Network (ANN)—were used to discriminate between the species. Among these species, S. macrophylla exhibited the largest intraspecific variation, and all three species showed at least partly overlapping values for all nine characters. SVM performed the best of all the classifiers, with an overall accuracy of 91.4% and a per-species correct identification rate of 66.7%, 95.0%, and 80.0% for S. humilis, S. macrophylla, and S. mahagoni, respectively. The two-species model discriminated between S. macrophylla and S. mahagoni with accuracies of over 90.0% using SVM. These accuracies are lower than perfect forensic certainty but nonetheless demonstrate that quantitative wood anatomy data in combination with machine learning models can be applied as an efficient tool to discriminate anatomically between similar species in the wood anatomy laboratory. 
It is probable that a range of previously anatomically inseparable species may become identifiable by incorporating in-depth analysis of quantitative characters and appropriate statistical classifiers.
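The abstract reports an overall accuracy alongside per-species correct identification rates, and the two can diverge sharply when one species dominates the specimen set. A minimal sketch of both quantities follows; the function name `accuracy_report` is hypothetical, and the labels in the usage note are made up for illustration rather than drawn from the study.

```python
from collections import Counter


def accuracy_report(y_true, y_pred):
    """Overall accuracy plus the correct-identification rate per true class."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("label lists must be non-empty and of equal length")
    totals = Counter(y_true)
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    overall = sum(correct.values()) / len(y_true)
    per_class = {label: correct[label] / n for label, n in totals.items()}
    return overall, per_class
```

With eight specimens of a common class all identified correctly and one of two specimens of a rare class missed, the overall accuracy is 0.9 while the rare class is identified only half the time, which is why per-species rates matter for forensic use.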