1
Hernández-León S, Little DP, Acevedo-Sandoval O, Gernandt DS, Rodríguez-Laguna R, Saucedo-García M, Arce-Cervantes O, Razo-Zárate R, Espitia-López J. Plant core DNA barcode performance at a local scale: identification of the conifers of the state of Hidalgo, Mexico. Syst Biodivers 2019. [DOI: 10.1080/14772000.2018.1546240]
Affiliation(s)
- Sergio Hernández-León
  - Área Académica de Ciencias Agrícolas y Forestales, Instituto de Ciencias Agropecuarias, Universidad Autónoma del Estado de Hidalgo, Tulancingo, Hidalgo, C.P. 43600, A.P. 32, México
- Damon P. Little
  - Lewis B. and Dorothy Cullman Program for Molecular Systematics, The New York Botanical Garden, Bronx, New York, 10458-5126, USA
- Otilio Acevedo-Sandoval
  - Área Académica de Ciencias Agrícolas y Forestales, Instituto de Ciencias Agropecuarias, Universidad Autónoma del Estado de Hidalgo, Tulancingo, Hidalgo, C.P. 43600, A.P. 32, México
- David S. Gernandt
  - Departamento de Botánica, Instituto de Biología, Universidad Nacional Autónoma de México, Ciudad de México, C.P. 04510, AP 70-233, México
- Rodrigo Rodríguez-Laguna
  - Área Académica de Ciencias Agrícolas y Forestales, Instituto de Ciencias Agropecuarias, Universidad Autónoma del Estado de Hidalgo, Tulancingo, Hidalgo, C.P. 43600, A.P. 32, México
- Mariana Saucedo-García
  - Área Académica de Ciencias Agrícolas y Forestales, Instituto de Ciencias Agropecuarias, Universidad Autónoma del Estado de Hidalgo, Tulancingo, Hidalgo, C.P. 43600, A.P. 32, México
- Oscar Arce-Cervantes
  - Área Académica de Ciencias Agrícolas y Forestales, Instituto de Ciencias Agropecuarias, Universidad Autónoma del Estado de Hidalgo, Tulancingo, Hidalgo, C.P. 43600, A.P. 32, México
- Ramón Razo-Zárate
  - Área Académica de Ciencias Agrícolas y Forestales, Instituto de Ciencias Agropecuarias, Universidad Autónoma del Estado de Hidalgo, Tulancingo, Hidalgo, C.P. 43600, A.P. 32, México
- Josefa Espitia-López
  - Área Académica de Ciencias Agrícolas y Forestales, Instituto de Ciencias Agropecuarias, Universidad Autónoma del Estado de Hidalgo, Tulancingo, Hidalgo, C.P. 43600, A.P. 32, México
2
Little DP, Knopf P, Schulz C. DNA barcode identification of Podocarpaceae--the second largest conifer family. PLoS One 2013; 8:e81008. [PMID: 24312258 PMCID: PMC3842326 DOI: 10.1371/journal.pone.0081008] [Received: 04/30/2013] [Accepted: 10/09/2013]
Abstract
We have generated matK, rbcL, and nrITS2 DNA barcodes for 320 specimens representing all 18 extant genera of the conifer family Podocarpaceae. The sample includes 145 of the 198 recognized species. Comparative analyses of sequence quality and species discrimination were conducted on the 159 individuals from which all three markers were recovered (representing 15 genera and 97 species). The vast majority of sequences were of high quality (B30 = 0.596-0.989). Even the lowest quality sequences exceeded the minimum requirements of the BARCODE data standard. In the few instances that low quality sequences were generated, the responsible mechanism could not be discerned. There were no statistically significant differences in the discriminatory power of markers or marker combinations (p = 0.05). The discriminatory power of the barcode markers individually and in combination is low (56.7% of species at maximum). In some instances, species discrimination failed in spite of ostensibly useful variation being present (genotypes were shared among species), but in many cases there was simply an absence of sequence variation. Barcode gaps (maximum intraspecific p-distance < minimum interspecific p-distance) were observed in 50.5% of species when all three markers were considered simultaneously. The presence of a barcode gap was not predictive of discrimination success (p = 0.02) and there was no statistically significant difference in the frequency of barcode gaps among markers (p = 0.05). In addition, there was no correlation between number of individuals sampled per species and the presence of a barcode gap (p = 0.27).
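The barcode-gap test mentioned in this abstract can be illustrated with a minimal sketch. It assumes the conventional reading, in which a species shows a gap when its largest within-species p-distance is smaller than its smallest distance to any other species. The function names, the handling of gaps and ambiguous bases, and the toy sequences are all hypothetical and are not taken from the paper.

```python
from itertools import combinations


def p_distance(a, b):
    """Proportion of differing sites between two aligned sequences.

    Sites where either sequence has a gap or an ambiguous base are skipped
    (an assumption of this sketch, not a rule stated in the abstract).
    """
    pairs = [(x, y) for x, y in zip(a.upper(), b.upper())
             if x in "ACGT" and y in "ACGT"]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(x != y for x, y in pairs) / len(pairs)


def barcode_gaps(samples):
    """Map each species to True when max intraspecific < min interspecific.

    `samples` is a list of (species, aligned_sequence) tuples. Species with
    fewer than two individuals are skipped because the intraspecific
    distance is undefined for them.
    """
    result = {}
    for species in sorted({name for name, _ in samples}):
        own = [seq for name, seq in samples if name == species]
        others = [seq for name, seq in samples if name != species]
        if len(own) < 2 or not others:
            continue
        max_intra = max(p_distance(a, b) for a, b in combinations(own, 2))
        min_inter = min(p_distance(a, b) for a in own for b in others)
        result[species] = max_intra < min_inter
    return result


# Toy alignment (hypothetical data): sp_A shares a genotype with sp_C,
# so it has no gap even though it contains sequence variation.
samples = [
    ("sp_A", "ACGTACGT"),
    ("sp_A", "ACGTACGA"),
    ("sp_B", "TTGTACCC"),
    ("sp_B", "TTGTACCC"),
    ("sp_C", "ACGTACGA"),
]
gaps = barcode_gaps(samples)
```

In the toy data, `sp_A` mirrors the case the abstract describes where variation is present but a genotype is shared among species, while `sp_C` is skipped because it has a single individual.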
Affiliation(s)
- Damon P. Little
  - Lewis B. and Dorothy Cullman Program for Molecular Systematics, The New York Botanical Garden, Bronx, New York, United States of America
- Patrick Knopf
  - Lehrstuhl für Evolution und Biodiversität der Pflanzen, Ruhr–Universität Bochum, Bochum, Nordrhein–Westfalen, Bundesrepublik Deutschland
- Christian Schulz
  - Lehrstuhl für Evolution und Biodiversität der Pflanzen, Ruhr–Universität Bochum, Bochum, Nordrhein–Westfalen, Bundesrepublik Deutschland
3
Ma BG. How to describe genes: Enlightenment from the quaternary number system. Biosystems 2007; 90:20-7. [PMID: 16945479 DOI: 10.1016/j.biosystems.2006.06.004] [Received: 05/13/2005] [Revised: 06/15/2006] [Accepted: 06/19/2006]
Abstract
As an open problem, computational gene identification has been widely studied, and many gene finders (software) have become available. However, little attention has been given to the problem of describing the common features of known genes in databanks so as to transform raw data into human-understandable knowledge. In this paper, we draw attention to the task of describing genes and propose a trial implementation that treats DNA sequences as quaternary numbers. Under such a treatment, the common features of genes can be represented by a "position weight function", the core concept of a number system. In principle, the "position weight function" can be any real-valued function. In this paper, by approximating the function using trigonometric functions, some characteristic parameters indicating single-nucleotide periodicities were obtained for the genome of the bacterium Escherichia coli K12 and the genome of the eukaryote yeast. As a byproduct of this approach, a single-nucleotide-level measure is derived that complements codon-based indexes in describing the coding quality and expression level of an open reading frame (ORF). The ideas presented here have the potential to become a general methodology for biological sequence analysis.
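The idea of reading a DNA sequence as a quaternary number can be sketched as follows. The digit assignment (A=0, C=1, G=2, T=3), the function names, and the period-3 cosine weight are assumptions made for illustration only; the abstract does not state the mapping or the fitted trigonometric form used in the paper.

```python
import math

# Assumed digit assignment; the abstract does not specify one.
DIGITS = {"A": 0, "C": 1, "G": 2, "T": 3}


def weighted_value(seq, weight):
    """Sum of digit * weight(position) over the sequence.

    `weight` plays the role of a position weight function: any real-valued
    function of the zero-based position index.
    """
    return sum(DIGITS[base] * weight(i) for i, base in enumerate(seq.upper()))


def base4_weight(length):
    """Ordinary positional weights 4**k, most significant digit first."""
    return lambda i: 4 ** (length - 1 - i)


def period3_weight(i):
    """Hypothetical trigonometric weight with period 3 (illustrative only)."""
    return math.cos(2 * math.pi * i / 3)


# With the ordinary base-4 weights, "ACGT" is the quaternary numeral 0123.
value = weighted_value("ACGT", base4_weight(4))
```

With the ordinary weights the result agrees with Python's own base-4 parsing, `int("0123", 4)`. Swapping in a different weight function, such as `period3_weight`, changes what the weighted sum is sensitive to without changing the digit encoding.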
Affiliation(s)
- Bin-Guang Ma
- College of Chemistry and Chemical Engineering, Suzhou University, Suzhou 215006, PR China.
4
Ouyang Z, Liu JK, She ZS. Hierarchical structure analysis describing abnormal base composition of genomes. Phys Rev E Stat Nonlin Soft Matter Phys 2005; 72:041915. [PMID: 16383428 DOI: 10.1103/physreve.72.041915] [Received: 12/08/2004]
Abstract
Abnormal base compositional patterns of genomic DNA sequences are studied in the framework of a hierarchical structure (HS) model originally proposed for the study of fully developed turbulence [She and Lévêque, Phys. Rev. Lett. 72, 336 (1994)]. The HS similarity law is verified over scales between 10^3 bp and 10^5 bp, and the HS parameter beta is proposed to describe the degree of heterogeneity in the base composition patterns. More than one hundred bacterial, archaeal, viral, yeast, and human genome sequences have been analyzed, and the results show that the HS analysis efficiently captures abnormal base composition patterns and that the parameter beta is a characteristic measure of the genome. Detailed examination of the values of beta reveals an intriguing link to the evolutionary events of genetic material transfer. Finally, a sequence complexity (S) measure is proposed to characterize the gradual increase of organizational complexity of the genome during evolution. The present study raises several interesting issues in the evolutionary history of genomes.
Affiliation(s)
- Zhengqing Ouyang
- State Key Lab for Turbulence and Complex Systems and Center for Theoretical Biology, Peking University, Beijing 100871, People's Republic of China
5
Gissi C, Iannelli F, Pesole G. Complete mtDNA of Ciona intestinalis reveals extensive gene rearrangement and the presence of an atp8 and an extra trnM gene in ascidians. J Mol Evol 2004; 58:376-89. [PMID: 15114417 DOI: 10.1007/s00239-003-2559-6] [Received: 06/30/2003] [Accepted: 10/23/2003]
Abstract
The complete mitochondrial genome (mtDNA) of the model organism Ciona intestinalis (Urochordata, Ascidiacea) has been amplified by long-PCR using specific primers designed on putative mitochondrial transcripts identified from publicly available mitochondrial-like expressed sequence tags. The C. intestinalis mtDNA encodes 39 genes: 2 rRNAs, 13 subunits of the respiratory complexes, including ATPase subunit 8 (atp8), and 24 tRNAs, including 2 tRNA-Met with anticodons 5'-UAU-3' and 5'-CAU-3', respectively. All genes are transcribed from the same strand. This gene content seems to be a common feature of ascidian mtDNAs, as we have verified the presence of a previously undetected atp8 and of two trnM genes in the two other sequenced ascidian mtDNAs. Extensive gene rearrangement has been found in C. intestinalis with respect not only to the common Vertebrata/Cephalochordata/Hemichordata gene organization but also to other ascidian mtDNAs, including the congeneric Ciona savignyi. Other features such as the absence of long noncoding regions, the shortness of rRNA genes, the low GC content (21.4%), and the absence of asymmetric base distribution between the two strands suggest that this genome is more similar to those of some protostomes than to deuterostomes.
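The compositional features named at the end of this abstract (GC content and asymmetric base distribution between strands) are often summarized with simple counts. The sketch below uses GC skew and AT skew, which are common measures of strand asymmetry; whether the authors used exactly these statistics is an assumption, and the function name and example sequence are hypothetical.

```python
from collections import Counter


def composition_stats(seq):
    """Return GC content, GC skew and AT skew for one strand.

    Skews are (G - C) / (G + C) and (A - T) / (A + T); values near zero
    indicate little asymmetry. Characters other than A, C, G, T are ignored.
    """
    counts = Counter(seq.upper())
    a, c, g, t = (counts[base] for base in "ACGT")
    total = a + c + g + t
    if total == 0:
        raise ValueError("sequence contains no A, C, G or T")
    return {
        "gc_content": (g + c) / total,
        "gc_skew": (g - c) / (g + c) if (g + c) else 0.0,
        "at_skew": (a - t) / (a + t) if (a + t) else 0.0,
    }


# Hypothetical toy sequence: 3 A, 1 T, 3 G, 1 C.
stats = composition_stats("AAATGGGC")
```

For a real genome the same function would be applied to the full sequence of one strand, and a GC content of 21.4% would correspond to `gc_content` of about 0.214.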
Affiliation(s)
- Carmela Gissi
- Dipartimento di Scienze Biomolecolari e Biotecnologie, Università di Milano, Via Celoria 26, 20133 Milan, Italy
6
Rogozin IB, Pavlov YI. Theoretical analysis of mutation hotspots and their DNA sequence context specificity. Mutat Res 2003; 544:65-85. [PMID: 12888108 DOI: 10.1016/s1383-5742(03)00032-2]
Abstract
Mutation frequencies vary significantly along nucleotide sequences such that mutations often concentrate at certain positions called hotspots. Mutation hotspots in DNA reflect intrinsic properties of the mutation process, such as sequence specificity, that manifests itself at the level of interaction between mutagens, DNA, and the action of the repair and replication machineries. The hotspots might also reflect structural and functional features of the respective DNA sequences. When mutations in a gene are identified using a particular experimental system, resulting hotspots could reflect the properties of the gene product and the mutant selection scheme. Analysis of the nucleotide sequence context of hotspots can provide information on the molecular mechanisms of mutagenesis. However, the determinants of mutation frequency and specificity are complex, and there are many analytical methods for their study. Here we review computational approaches for analyzing mutation spectra (distribution of mutations along the target genes) that include many mutable (detectable) positions. The following methods are reviewed: derivation of a consensus sequence, application of regression approaches to correlate nucleotide sequence features with mutation frequency, mutation hotspot prediction, analysis of oligonucleotide composition of regions containing mutations, pairwise comparison of mutation spectra, analysis of multiple spectra, and analysis of "context-free" characteristics. The advantages and pitfalls of these methods are discussed and illustrated by examples from the literature. The most reliable analyses were obtained when several methods were combined and information from theoretical analysis and experimental observations was considered simultaneously. Simple, robust approaches should be used with small samples of mutations, whereas combinations of simple and complex approaches may be required for large samples. 
We discuss several well-documented studies where analysis of mutation spectra has substantially contributed to the current understanding of molecular mechanisms of mutagenesis. The nucleotide sequence context of mutational hotspots is a fingerprint of interactions between DNA and DNA repair, replication, and modification enzymes, and the analysis of hotspot context provides evidence of such interactions.
Affiliation(s)
- Igor B Rogozin
- Institute of Cytology and Genetics, Russian Academy of Sciences, Novosibirsk, Russia
7
Saccone C, Gissi C, Reyes A, Larizza A, Sbisà E, Pesole G. Mitochondrial DNA in metazoa: degree of freedom in a frozen event. Gene 2002; 286:3-12. [PMID: 11943454 DOI: 10.1016/s0378-1119(01)00807-1]
Abstract
The mitochondrial genome (mtDNA), owing to peculiar features such as the exclusive presence of orthologous genes, uniparental inheritance, lack of recombination, small size and constant gene content, certainly represents a major model system for studies of evolutionary genomics in metazoans. Over 800 million years of evolution the gene content of metazoan mitochondrial genomes has remained practically frozen, but several evolutionary processes have taken place. These processes, reviewed here, include rearrangements of gene order, changes in base composition and the emergence of compositional asymmetry between the two strands, variations in the genetic code and evolution of codon usage, lineage-specific nucleotide substitution rates, and evolutionary patterns of mtDNA control regions.
Affiliation(s)
- Cecilia Saccone
- Centro di Studio sui Mitocondri e Metabolismo Energetico, CNR, via Amendola 165/A, 70126 Bari, Italy.
8
Wang J, Zhang Q, Ren K, She Z. Multi-scaling hierarchical structure analysis on the sequence of E. coli complete genome. ACTA ACUST UNITED AC 2001. [DOI: 10.1007/bf02901913]
9
Ostergaard L, Pedersen AG, Jespersen HM, Brunak S, Welinder KG. Computational analyses and annotations of the Arabidopsis peroxidase gene family. FEBS Lett 1998; 433:98-102. [PMID: 9738941 DOI: 10.1016/s0014-5793(98)00849-7]
Abstract
Classical heme-containing plant peroxidases have been ascribed a wide variety of functional roles related to development, defense, lignification, and hormonal signaling. More than 40 peroxidase genes are now known in Arabidopsis thaliana for which functional association is complicated by a general lack of peroxidase substrate specificity. Computational analysis was performed on 30 near full-length Arabidopsis peroxidase cDNAs for annotation of start codons and signal peptide cleavage sites. A compositional analysis revealed that 23 of the 30 peroxidase cDNAs have 5' untranslated regions containing 40-71% adenine, a rare feature observed also in cDNAs which predominantly encode stress-induced proteins, and which may indicate translational regulation.
Affiliation(s)
- L Ostergaard
- Department of Protein Chemistry, Institute of Molecular Biology, University of Copenhagen, Denmark
10
Gissi C, Gullberg A, Arnason U. The complete mitochondrial DNA sequence of the rabbit, Oryctolagus cuniculus. Genomics 1998; 50:161-9. [PMID: 9653643 DOI: 10.1006/geno.1998.5282]
Abstract
The nucleotide sequence of the complete mitochondrial DNA (mtDNA) molecule of the rabbit (Oryctolagus cuniculus, order Lagomorpha) was determined. The length of the molecule is 17,245 nt, but the length is not absolute due to the presence of different numbers of repeated motifs in the control region. The organization and gene contents of the mtDNA of the rabbit conform to those of other eutherian species. The putative secondary structures of the tRNAs of the rabbit have been described. These structures as well as the structure of the L-strand origin of replication comply with those characteristic for eutherians in general. The compositional differences between the two mtDNA strands have also been detailed.
Affiliation(s)
- C Gissi
- Department of Biochemistry and Molecular Biology, University of Bari, Italy.