1
Castaldi V, Langella E, Buonanno M, Di Lelio I, Aprile AM, Molisso D, Criscuolo MC, D'Andrea LD, Romanelli A, Amoresano A, Pinto G, Illiano A, Chiaiese P, Becchimanzi A, Pennacchio F, Rao R, Monti SM. Intrinsically disordered Prosystemin discloses biologically active repeat motifs. Plant Sci 2024; 340:111969. [PMID: 38159610 DOI: 10.1016/j.plantsci.2023.111969] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/18/2023] [Revised: 12/22/2023] [Accepted: 12/26/2023] [Indexed: 01/03/2024]
Abstract
Over the years, in-depth studies of the defence barriers of tomato plants have shown that the Systemin peptide controls the response to a wealth of environmental stress agents. This multifaceted stress reaction seems to be related to the intrinsic disorder of its precursor protein, Prosystemin (ProSys). Since the latest findings show that ProSys has biological functions beyond the Systemin sequence, here we wanted to assess whether this precursor includes peptide motifs able to trigger stress-related pathways. Candidate peptides were identified in silico and synthesized to test their capacity to trigger defence responses in tomato plants against different biotic stressors. Our results demonstrated that ProSys harbours several repeat motifs which triggered plant immune reactions against pathogens and pest insects. Three of these peptides were detected by mass spectrometry in plants expressing ProSys, demonstrating their effective presence in vivo. These experimental data shed light on unrecognized functions of ProSys, mediated by multiple biologically active sequences which may partly account for the capacity of ProSys to induce defence responses to different stress agents.
Affiliation(s)
- Valeria Castaldi
- Department of Agricultural Sciences, University of Naples Federico II, via Università 100, Portici, Naples 80055, Italy
- Emma Langella
- Institute of Biostructures and Bioimaging, National Research Council (IBB, CNR), via Pietro Castellino 111, Naples 80131, Italy.
- Martina Buonanno
- Institute of Biostructures and Bioimaging, National Research Council (IBB, CNR), via Pietro Castellino 111, Naples 80131, Italy
- Ilaria Di Lelio
- Department of Agricultural Sciences, University of Naples Federico II, via Università 100, Portici, Naples 80055, Italy; Interuniversity Center for Studies on Bioinspired Agro-Environmental Technology (BAT Center), University of Naples Federico II, via Università 100, Portici, 80055 Naples, Italy
- Anna Maria Aprile
- Department of Agricultural Sciences, University of Naples Federico II, via Università 100, Portici, Naples 80055, Italy
- Donata Molisso
- Department of Agricultural Sciences, University of Naples Federico II, via Università 100, Portici, Naples 80055, Italy
- Martina Chiara Criscuolo
- Department of Agricultural Sciences, University of Naples Federico II, via Università 100, Portici, Naples 80055, Italy
- Luca Domenico D'Andrea
- Istituto di Scienze e Tecnologie Chimiche "Giulio Natta" (SCITEC), Consiglio Nazionale delle Ricerche (CNR), via Alfonso Corti 12, 20131 Milano, Italy
- Angela Amoresano
- Department of Chemical Sciences, University of Naples Federico II, via Cynthia 8, Napoli and Interuniversitary Consortium "Istituto Nazionale Biostrutture e Biosistemi", 80126 Roma, Italy
- Gabriella Pinto
- Department of Chemical Sciences, University of Naples Federico II, via Cynthia 8, Napoli and Interuniversitary Consortium "Istituto Nazionale Biostrutture e Biosistemi", 80126 Roma, Italy
- Anna Illiano
- Department of Chemical Sciences, University of Naples Federico II, via Cynthia 8, Napoli and Interuniversitary Consortium "Istituto Nazionale Biostrutture e Biosistemi", 80126 Roma, Italy
- Pasquale Chiaiese
- Department of Agricultural Sciences, University of Naples Federico II, via Università 100, Portici, Naples 80055, Italy
- Andrea Becchimanzi
- Department of Agricultural Sciences, University of Naples Federico II, via Università 100, Portici, Naples 80055, Italy; Interuniversity Center for Studies on Bioinspired Agro-Environmental Technology (BAT Center), University of Naples Federico II, via Università 100, Portici, 80055 Naples, Italy
- Francesco Pennacchio
- Department of Agricultural Sciences, University of Naples Federico II, via Università 100, Portici, Naples 80055, Italy; Interuniversity Center for Studies on Bioinspired Agro-Environmental Technology (BAT Center), University of Naples Federico II, via Università 100, Portici, 80055 Naples, Italy
- Rosa Rao
- Department of Agricultural Sciences, University of Naples Federico II, via Università 100, Portici, Naples 80055, Italy; Interuniversity Center for Studies on Bioinspired Agro-Environmental Technology (BAT Center), University of Naples Federico II, via Università 100, Portici, 80055 Naples, Italy.
- Simona Maria Monti
- Institute of Biostructures and Bioimaging, National Research Council (IBB, CNR), via Pietro Castellino 111, Naples 80131, Italy.
2
Hall LL, Creamer KM, Byron M, Lawrence JB. Differences in Alu vs L1-rich chromosome bands underpin architectural reorganization of the inactive-X chromosome and SAHFs. bioRxiv 2024:2024.01.09.574742. [PMID: 38260534 PMCID: PMC10802495 DOI: 10.1101/2024.01.09.574742] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/24/2024]
Abstract
The linear DNA sequence of mammalian chromosomes is organized in large blocks of DNA with similar sequence properties, producing a pattern of dark and light staining bands on mitotic chromosomes. Cytogenetic banding is essentially invariant between people and cell-types and thus may be assumed unrelated to genome regulation. We investigate whether large blocks of Alu-rich R-bands and L1-rich G-bands provide a framework upon which functional genome architecture is built. We examine two models of large-scale chromatin condensation: X-chromosome inactivation and formation of senescence-associated heterochromatin foci (SAHFs). XIST RNA triggers gene silencing but also formation of the condensed Barr Body (BB), thought to reflect cumulative gene silencing. However, we find Alu-rich regions are depleted from the L1-rich BB, supporting it is a dense core but not the entire chromosome. Alu-rich bands are also gene-rich, affirming our earlier findings that genes localize at the outer periphery of the BB. SAHFs similarly form within each territory by coalescence of syntenic L1 regions depleted for highly Alu-rich DNA. Analysis of senescent cell Hi-C data also shows large contiguous blocks of G-band and R-band DNA remodel as a segmental unit. Entire dark-bands gain distal intrachromosomal interactions as L1-rich regions form the SAHF. Most striking is that sharp Alu peaks within R-bands resist these changes in condensation. We further show that Chr19, which is exceptionally Alu rich, fails to form a SAHF. Collective results show regulation of genome architecture corresponding to large blocks of DNA and demonstrate resistance of segments with high Alu to chromosome condensation.
Affiliation(s)
- Lisa L. Hall
- Department of Neurology, University of Massachusetts Chan Medical School, Worcester, MA 01655, USA
- Kevin M. Creamer
- Department of Neurology, University of Massachusetts Chan Medical School, Worcester, MA 01655, USA
- Meg Byron
- Department of Neurology, University of Massachusetts Chan Medical School, Worcester, MA 01655, USA
- Jeanne B. Lawrence
- Department of Neurology, University of Massachusetts Chan Medical School, Worcester, MA 01655, USA
3
Chen J, Zang Y, Shang S, Yang Z, Liang S, Xue S, Wang Y, Tang X. Chloroplast genomic comparison provides insights into the evolution of seagrasses. BMC Plant Biol 2023; 23:104. [PMID: 36814193 PMCID: PMC9945681 DOI: 10.1186/s12870-023-04119-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/06/2022] [Accepted: 02/14/2023] [Indexed: 06/18/2023]
Abstract
BACKGROUND Seagrasses are a polyphyletic group of monocotyledonous angiosperms that have evolved to live entirely submerged in marine waters. Thus, these species are ideal for studying plant adaptation to marine environments. Herein, we sequenced the chloroplast (cp) genomes of two seagrass species (Zostera muelleri and Halophila ovalis) and performed a comparative analysis of them with 10 previously published seagrasses, resulting in various novel findings. RESULTS The cp genomes of the seagrasses ranged in size from 143,877 bp (Zostera marina) to 178,261 bp (Thalassia hemprichii), and also varied in size among different families in the following order: Hydrocharitaceae > Cymodoceaceae > Ruppiaceae > Zosteraceae. The length differences between families were mainly related to the expansion and contraction of the IR region. In addition, we screened out 2,751 simple sequence repeats and 1,757 long repeat sequence types in the cp genome sequences of the 12 seagrass species, ultimately finding seven hot spots in coding regions. Interestingly, we found nine genes with positive selection sites, including two ATP subunit genes (atpA and atpF), three ribosome subunit genes (rps4, rps7, and rpl20), one photosystem subunit gene (psbH), and the ycf2, accD, and rbcL genes. These gene regions may have played critical roles in the adaptation of seagrasses to diverse environments. In addition, phylogenetic analysis strongly supported the division of the 12 seagrass species into four previously recognized major clades. Finally, the divergence time of the seagrasses inferred from the cp genome sequences was generally consistent with previous studies. CONCLUSIONS In this study, we compared chloroplast genomes from 12 seagrass species, covering the main phylogenetic clades. Our findings will provide valuable genetic data for research into the taxonomy, phylogeny, and species evolution of seagrasses.
Affiliation(s)
- Jun Chen
- College of Marine Life Sciences, Ocean University of China, Qingdao, Shandong, China
- Yu Zang
- Ministry of Natural Resources, Key Laboratory of Marine Eco-Environmental Science and Technology, First Institute of Oceanography, Qingdao, Shandong, China
- Shuai Shang
- College of Marine Life Sciences, Ocean University of China, Qingdao, Shandong, China
- Zhibo Yang
- College of Marine Life Sciences, Ocean University of China, Qingdao, Shandong, China
- Shuo Liang
- College of Marine Life Sciences, Ocean University of China, Qingdao, Shandong, China
- Song Xue
- College of Marine Life Sciences, Ocean University of China, Qingdao, Shandong, China
- Ying Wang
- College of Marine Life Sciences, Ocean University of China, Qingdao, Shandong, China.
- Laboratory for Marine Ecology and Environmental Science, Qingdao National Laboratory for Marine Science and Technology, Qingdao, Shandong, China.
- Xuexi Tang
- College of Marine Life Sciences, Ocean University of China, Qingdao, Shandong, China.
- Laboratory for Marine Ecology and Environmental Science, Qingdao National Laboratory for Marine Science and Technology, Qingdao, Shandong, China.
4
Liu G, Zhang T. Bioinformatic Prediction of Bulked Oligonucleotide Probes for FISH Using Chorus2. Methods Mol Biol 2023; 2672:389-408. [PMID: 37335491 DOI: 10.1007/978-1-0716-3226-0_25] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/21/2023]
Abstract
Fluorescence in situ hybridization (FISH) is a convenient method for detecting and visualizing specific genomic segments. Oligonucleotide (oligo)-based FISH has further broadened its applications in plant cytogenetic research. Highly specific single-copy oligo probes are essential for successful oligo-FISH experiments. Here, we introduce a bioinformatic pipeline that uses the Chorus2 software to design genome-scale single-copy oligos and filter out repeat-related probes. With this pipeline, robust probes can be obtained both for well-assembled genomes and for species without a reference genome.
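To make the idea of a "single-copy oligo" concrete, the following is a minimal Python sketch of one naive way to select such oligos from a sequence. It is not the Chorus2 algorithm and uses no Chorus2 API; the function names `reverse_complement` and `single_copy_oligos`, the toy sequence, and the exact-match criterion are all assumptions made for illustration. Real probe design typically also considers near-matches, melting temperature, and repeats that are absent from the assembly.

```python
from collections import Counter

# Translation table for complementing DNA bases (illustrative helper).
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def single_copy_oligos(genome, length):
    """Return (position, oligo) pairs whose sequence occurs exactly once.

    Occurrences are counted on both strands by exact matching only.
    An oligo equal to its own reverse complement is counted twice
    and is therefore excluded under this simplified criterion.
    """
    counts = Counter()
    windows = range(len(genome) - length + 1)
    for i in windows:
        oligo = genome[i:i + length]
        counts[oligo] += 1
        counts[reverse_complement(oligo)] += 1
    return [(i, genome[i:i + length]) for i in windows
            if counts[genome[i:i + length]] == 1]


# Hypothetical toy sequence; real genomes and oligo lengths are far larger.
toy_genome = "AAACCCAAAGGT"
candidates = single_copy_oligos(toy_genome, 3)
```

In this toy case the repeated window `AAA` is rejected, and so are `ACC` and `GGT`, because each matches the reverse complement of the other.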
Affiliation(s)
- Guanqing Liu
- Jiangsu Key Laboratory of Crop Genomics and Molecular Breeding, Agricultural College of Yangzhou University, Yangzhou, China
- Key Laboratory of Plant Functional Genomics of the Ministry of Education/Joint International Research Laboratory of Agriculture and Agri-Product Safety, The Ministry of Education of China, Yangzhou University, Yangzhou, China
- Tao Zhang
- Jiangsu Key Laboratory of Crop Genomics and Molecular Breeding, Agricultural College of Yangzhou University, Yangzhou, China.
- Key Laboratory of Plant Functional Genomics of the Ministry of Education/Joint International Research Laboratory of Agriculture and Agri-Product Safety, The Ministry of Education of China, Yangzhou University, Yangzhou, China.
- Jiangsu Co-Innovation Center for Modern Production Technology of Grain Crops, Yangzhou University, Yangzhou, China.
5
Williams SL, Coster G. Cloning and expansion of repetitive DNA sequences. Methods Cell Biol 2022; 182:167-185. [PMID: 38359975 DOI: 10.1016/bs.mcb.2022.10.013] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/29/2022]
Abstract
Repeat and structure-prone DNA sequences comprise a large proportion of the human genome. The instability of these sequences has been implicated in a range of diseases, including cancers and neurodegenerative disorders. However, the mechanism of pathogenicity is poorly understood. As such, further studies on repetitive DNA are required. Cloning and maintaining repeat-containing substrates is challenging due to their inherent ability to form non-B DNA secondary structures which are refractory to DNA polymerases and prone to undergo rearrangements. Here, we describe an approach to clone and expand tandem-repeat DNA without interruptions, thereby allowing for its manipulation and subsequent investigation.
Affiliation(s)
- Sophie L Williams
- Genome Replication lab, Division of Cancer Biology, Institute of Cancer Research, Chester Beatty Laboratories, London, United Kingdom
- Gideon Coster
- Genome Replication lab, Division of Cancer Biology, Institute of Cancer Research, Chester Beatty Laboratories, London, United Kingdom.
6
Tang C, Chen X, Deng Y, Geng L, Ma J, Wei X. Complete chloroplast genomes of Sorbus sensu stricto (Rosaceae): comparative analyses and phylogenetic relationships. BMC Plant Biol 2022; 22:495. [PMID: 36273120 PMCID: PMC9587547 DOI: 10.1186/s12870-022-03858-5] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 05/25/2022] [Accepted: 09/26/2022] [Indexed: 06/16/2023]
Abstract
BACKGROUND Sorbus sensu stricto (Sorbus s.s.) is a genus of considerable economic value because of its beautiful leaves and flowers and especially its colorful fruits. It belongs to the tribe Maleae of the family Rosaceae, and comprises about 90 species mainly distributed in China. There is an ongoing dispute about its infrageneric classification and species delimitation, as the species are morphologically similar. With the aim of shedding light on the circumscription of taxa within the genus, phylogenetic analyses were performed using 29 Sorbus s.s. chloroplast (cp) genomes (16 newly sequenced) representing two subgenera and eight sections. RESULTS The 16 newly sequenced cp genomes range between 159,646 bp and 160,178 bp in length. All the samples examined and 22 taxa re-annotated in Sorbus sensu lato (Sorbus s.l.) contain 113 unique genes, with 19 of these duplicated in the inverted repeat (IR). Six hypervariable regions, including trnR-atpA, petN-psbM, rpl32-trnL, trnH-psbA, trnT-trnL and ndhC-trnV, were screened, and 44-53 SSRs and 14-31 dispersed repeats were identified as potential molecular markers. Phylogenetic analyses under ML/BI indicated that Sorbus s.l. is polyphyletic, but Sorbus s.s. and the other five segregate genera, Aria, Chamaemespilus, Cormus, Micromeles and Torminalis, are monophyletic. Two major clades and four sub-clades resolved with full support within Sorbus s.s. are not consistent with the existing infrageneric classification. Two subgenera, subg. Sorbus and subg. Albocarmesinae, are supported as monophyletic when S. tianschanica is transferred to subg. Albocarmesinae from subg. Sorbus and S. hupehensis var. paucijuga is transferred to subg. Sorbus from subg. Albocarmesinae. The current classification at sectional level is not supported by analysis of cp genome phylogeny. CONCLUSION Phylogenomic analyses of the cp genomes are useful for inferring phylogenetic relationships in Sorbus s.s. Though genome structure is highly conserved in the genus, hypervariable regions and repeat sequences are the most promising molecular markers for population genetics, species delimitation and phylogenetic studies.
Affiliation(s)
- Chenqian Tang
- Co-Innovation Center for Sustainable Forestry in Southern China, College of Biology and the Environment, Nanjing Forestry University, Nanjing, 210037, Jiangsu, China
- Xin Chen
- Co-Innovation Center for Sustainable Forestry in Southern China, College of Biology and the Environment, Nanjing Forestry University, Nanjing, 210037, Jiangsu, China.
- Yunfei Deng
- Key Laboratory of Plant Resources Conservation and Sustainable Utilization, South China Botanical Garden, Chinese Academy of Sciences, Guangzhou, 510650, China
- Liyang Geng
- Co-Innovation Center for Sustainable Forestry in Southern China, College of Biology and the Environment, Nanjing Forestry University, Nanjing, 210037, Jiangsu, China
- Jianhui Ma
- Co-Innovation Center for Sustainable Forestry in Southern China, College of Biology and the Environment, Nanjing Forestry University, Nanjing, 210037, Jiangsu, China
- Xueyan Wei
- Co-Innovation Center for Sustainable Forestry in Southern China, College of Biology and the Environment, Nanjing Forestry University, Nanjing, 210037, Jiangsu, China
7
Westerdahl H, Mellinger S, Sigeman H, Kutschera VE, Proux-Wéra E, Lundberg M, Weissensteiner M, Churcher A, Bunikis I, Hansson B, Wolf JBW, Strandh M. The genomic architecture of the passerine MHC region: high repeat content and contrasting evolutionary histories of single copy and tandemly duplicated MHC genes. Mol Ecol Resour 2022; 22:2379-2395. [PMID: 35348299 DOI: 10.1111/1755-0998.13614] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 11/11/2021] [Revised: 03/09/2022] [Accepted: 03/23/2022] [Indexed: 12/01/2022]
Abstract
The Major Histocompatibility Complex (MHC) is of central importance to the immune system, and an optimal MHC diversity is believed to maximize pathogen elimination. Birds show substantial variation in MHC diversity, ranging from few genes in most bird orders to very many genes in passerines. Our understanding of the evolutionary trajectories of the MHC in passerines is hampered by lack of data on genomic organization. Therefore, we assemble and annotate the MHC genomic region of the great reed warbler (Acrocephalus arundinaceus), using long-read sequencing and optical mapping. The MHC region is large (>5.5Mb), characterized by structural changes compared to hitherto investigated bird orders and shows higher repeat content than the genome average. These features were supported by analyses in three additional passerines. MHC genes in passerines are found in two different chromosomal arrangements, either as single copy MHC genes located among non-MHC genes, or as tandemly duplicated tightly linked MHC genes. Some single copy MHC genes are old and putative orthologs among species. In contrast tandemly duplicated MHC genes are monophyletic within species and have evolved by simultaneous gene duplication of several MHC genes. Structural differences in the MHC genomic region among bird orders seem substantial compared to mammals and have possibly been fuelled by clade-specific immune system adaptations. Our study provides methodological guidance in characterizing complex genomic regions, constitutes a resource for MHC research in birds, and calls for a revision of the general belief that avian MHC has a conserved gene order and small size compared to mammals.
Affiliation(s)
- Helena Westerdahl
- Molecular Ecology and Evolution Lab, Department of Biology, Lund University, Sölvegatan 37, SE-223 62, Lund, Sweden
- Samantha Mellinger
- Molecular Ecology and Evolution Lab, Department of Biology, Lund University, Sölvegatan 37, SE-223 62, Lund, Sweden
- Hanna Sigeman
- Molecular Ecology and Evolution Lab, Department of Biology, Lund University, Sölvegatan 37, SE-223 62, Lund, Sweden
- Verena E Kutschera
- Department of Biochemistry and Biophysics, National Bioinformatics Infrastructure Sweden, Science for Life Laboratory, Stockholm University, Box 1031, SE-17121, Solna, Sweden
- Estelle Proux-Wéra
- Department of Biochemistry and Biophysics, National Bioinformatics Infrastructure Sweden, Science for Life Laboratory, Stockholm University, Box 1031, SE-17121, Solna, Sweden
- Max Lundberg
- Molecular Ecology and Evolution Lab, Department of Biology, Lund University, Sölvegatan 37, SE-223 62, Lund, Sweden
- Matthias Weissensteiner
- Division of Evolutionary Biology, Faculty of Biology, LMU Munich, Grosshaderner Str. 2, 82152, Planegg-Martinsried, Germany
- Allison Churcher
- National Bioinformatics Infrastructure Sweden, Department of Molecular Biology, Umeå University, SE-901 87, Umeå, Sweden
- Ignas Bunikis
- Uppsala Genome Center, Science for Life Laboratory, Dept. of Immunology, Genetics and Pathology, Uppsala University, BMC, Box 815, SE-752 37, Uppsala, Sweden
- Bengt Hansson
- Molecular Ecology and Evolution Lab, Department of Biology, Lund University, Sölvegatan 37, SE-223 62, Lund, Sweden
- Jochen B W Wolf
- Division of Evolutionary Biology, Faculty of Biology, LMU Munich, Grosshaderner Str. 2, 82152, Planegg-Martinsried, Germany
- Maria Strandh
- Molecular Ecology and Evolution Lab, Department of Biology, Lund University, Sölvegatan 37, SE-223 62, Lund, Sweden
8
Farhat S, Bonnivard E, Pales Espinosa E, Tanguy A, Boutet I, Guiglielmoni N, Flot JF, Allam B. Comparative analysis of the Mercenaria mercenaria genome provides insights into the diversity of transposable elements and immune molecules in bivalve mollusks. BMC Genomics 2022; 23:192. [PMID: 35260071 PMCID: PMC8905726 DOI: 10.1186/s12864-021-08262-1] [Citation(s) in RCA: 11] [Impact Index Per Article: 5.5] [Received: 10/13/2021] [Accepted: 12/15/2021] [Indexed: 01/05/2023] Open
Abstract
BACKGROUND The hard clam Mercenaria mercenaria is a major marine resource along the Atlantic coasts of North America and has been introduced to other continents for resource restoration or aquaculture activities. Significant mortality events have been reported in the species throughout its native range as a result of diseases (microbial infections, leukemia) and acute environmental stress. In this context, the characterization of the hard clam genome can provide highly needed resources to enable basic (e.g., oncogenesis and cancer transmission, adaptation biology) and applied (clam stock enhancement, genomic selection) sciences. RESULTS Using a combination of long and short-read sequencing technologies, a 1.86 Gb chromosome-level assembly of the clam genome was generated. The assembly was scaffolded into 19 chromosomes, with an N50 of 83 Mb. Genome annotation yielded 34,728 predicted protein-coding genes, markedly more than the few other members of the Venerida sequenced so far, with coding regions representing only 2% of the assembly. Indeed, more than half of the genome is composed of repeated elements, including transposable elements. Major chromosome rearrangements were detected between this assembly and another recent assembly derived from a genetically segregated clam stock. Comparative analysis of the clam genome allowed the identification of a marked diversification in immune-related proteins, particularly extensive tandem duplications and expansions in tumor necrosis factors (TNFs) and C1q domain-containing proteins, some of which were previously shown to play a role in clam interactions with infectious microbes. The study also generated a comparative repertoire highlighting the diversity and, in some instances, the specificity of LTR-retrotransposon elements, particularly Steamer elements in bivalves. CONCLUSIONS The diversity of immune molecules in M. mercenaria may allow this species to cope with varying and complex microbial and environmental landscapes. The repertoire of transposable elements identified in this study, particularly Steamer elements, should be a prime target for the investigation of cancer cell development and transmission among bivalve mollusks.
Affiliation(s)
- Sarah Farhat
- Marine Animal Disease Laboratory, School of Marine and Atmospheric Sciences, 100 Nicolls Road, Stony Brook University, Stony Brook, NY, 11794-5000, USA
- Eric Bonnivard
- Sorbonne Université, CNRS, UMR 7144 AD2M, Station Biologique de Roscoff, Place Georges Teissier, 29688, Roscoff, France
- Emmanuelle Pales Espinosa
- Marine Animal Disease Laboratory, School of Marine and Atmospheric Sciences, 100 Nicolls Road, Stony Brook University, Stony Brook, NY, 11794-5000, USA
- Arnaud Tanguy
- Sorbonne Université, CNRS, UMR 7144 AD2M, Station Biologique de Roscoff, Place Georges Teissier, 29688, Roscoff, France
- Isabelle Boutet
- Sorbonne Université, CNRS, UMR 7144 AD2M, Station Biologique de Roscoff, Place Georges Teissier, 29688, Roscoff, France
- Nadège Guiglielmoni
- Université libre de Bruxelles (ULB), Evolutionary Biology & Ecology, Avenue F.D. Roosevelt 50, B-1050, Brussels, Belgium
- Jean-François Flot
- Université libre de Bruxelles (ULB), Evolutionary Biology & Ecology, Avenue F.D. Roosevelt 50, B-1050, Brussels, Belgium; Interuniversity Institute of Bioinformatics in Brussels - (IB)2, B-1050, Brussels, Belgium
- Bassem Allam
- Marine Animal Disease Laboratory, School of Marine and Atmospheric Sciences, 100 Nicolls Road, Stony Brook University, Stony Brook, NY, 11794-5000, USA.
9
Chunduri AR, Lima A, Rajan R, Mamillapalli A. Nuclear matrix associated RNAs in posterior silk glands show developmental dynamics in Bombyx mori in 5th instar larvae. BMC Res Notes 2022; 15:68. [PMID: 35183251 DOI: 10.1186/s13104-022-05951-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/17/2021] [Accepted: 02/03/2022] [Indexed: 11/10/2022] Open
Abstract
OBJECTIVES The nuclear matrix maintains and regulates chromatin structure. RNA is an integral component of the nuclear matrix and is essential to its structural maintenance. Bombyx mori is a major economic contributor in the sericulture industry and produces fibroin, the most important silk protein, in its posterior silk glands during the 5th instar larval stage. The present study investigates the composition of nuclear matrix RNA prepared from the posterior silk glands of Bombyx mori during the fifth instar larval stage, when maximum silk production occurs. The datasets on which the analysis is based are part of the data note titled "Nuclear matrix associated RNA datasets of posterior silk glands of Bombyx mori during 5th instar larval development". RESULTS The results showed significant enrichment of nuclear matrix RNA from day 1 to day 5 and day 7. Nuclear RNA showed increased abundance from day 1 to day 5 and day 7. Nuclear matrix RNA exhibited repetitive RNA sequences, of which UGUCC and GCUGGU were the most abundant. Genes involved in metabolic pathways showed significant enrichment correlating with silk production. These results emphasize the role of dynamic, repetitive DNA transcripts in chromatin architecture and further reveal the close association between the nuclear matrix and gene expression.
10
Maljković MM, Mitić NS, de Brevern AG. Prediction of structural alphabet protein blocks using data mining. Biochimie 2022; 197:74-85. [PMID: 35143919 DOI: 10.1016/j.biochi.2022.01.019] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/20/2021] [Revised: 01/22/2022] [Accepted: 01/31/2022] [Indexed: 11/17/2022]
Abstract
3D protein structures determine proteins' biological functions. The 3D structure of the protein backbone can be approximated using the prototypes of local protein conformations. Sets of these prototypes are called structural alphabets (SAs). Amongst several approaches to the prediction of 3D structures from amino acid sequences, one approach is based on the prediction of SA prototypes for a given amino acid sequence. Protein Blocks (PBs) is the most known SA, and it is composed of 16 prototypes of five consecutive amino acids which were identified as optimal prototypes considering the ability to correctly approximate the local structure and the prediction accuracy of prototypes from an amino acid sequence. We developed models for PBs prediction from sequence information using different data mining approaches and machine learning algorithms. Besides the amino acid sequences, the results of the following tools were used to train the models: the Spider3 predictor of protein structure properties, several predictors of the protein's intrinsically disordered regions, and a tool for finding repeats in amino acid sequences. The highest accuracy of the constructed models is 80%, which is a significant improvement compared to the previous best available prediction, whose accuracy was 61%. Analyzing the models constructed by applying different algorithms, it was noticed that the significance of input attributes differs among the models constructed by algorithms. Using the information about amino acids belonging to intrinsically disordered regions and repeats improves the precision of prediction for some PBs using the CART classification algorithm, while this is not the case with the C5.0 classification algorithm. Improved prediction approaches can have interesting applications in protein structural model approaches or computational protein design.
Affiliation(s)
- Mirjana M Maljković
  - Faculty of Mathematics, University of Belgrade, Studentski Trg 16, 11000, Belgrade, Serbia
- Nenad S Mitić
  - Faculty of Mathematics, University of Belgrade, Studentski Trg 16, 11000, Belgrade, Serbia
- Alexandre G de Brevern
  - Université de Paris, INSERM UMR_S 1134, DSIMB, Université de la Réunion, INTS6, Rue Alexandre Cabanel, 75015, Paris, France
11
Abstract
Plant genomes contain a particularly high proportion of repeated structures of various types. This chapter proposes a guided tour of the available software that can help biologists scan automatically for these repeats in sequence data or check hypothetical models intended to characterize their structures. Since transposable elements (TEs) are a major source of repeats in plants, many methods have been used or developed for this broad class of sequences. They are representative of the range of tools available for other classes of repeats, and we have provided two sections on this topic (for the analysis of genomes or directly of sequenced reads), as well as a selection of the main existing software. It may be hard to keep up with the profusion of proposals in this dynamic field, and the rest of the chapter is devoted to the foundations of an efficient search for repeats and more complex patterns. We first introduce the key concepts of the art of indexing and mapping or querying sequences. We end the chapter with the more prospective issue of building models of repeat families. We first present the machine learning approach, which seeks to build predictors automatically for some families of TEs from a set of sequences known to belong to the family. A second approach, the linguistic (or syntactic) approach, allows biologists to describe models of their favorite repeat family themselves and check their validity.
12
Ma Q, Wang Y, Li S, Wen J, Zhu L, Yan K, Du Y, Ren J, Li S, Chen Z, Bi C, Li Q. Assembly and comparative analysis of the first complete mitochondrial genome of Acer truncatum Bunge: a woody oil-tree species producing nervonic acid. BMC Plant Biol 2022; 22:29. [PMID: 35026989 PMCID: PMC8756732 DOI: 10.1186/s12870-021-03416-5] [Citation(s) in RCA: 18] [Impact Index Per Article: 9.0] [Received: 08/14/2021] [Accepted: 12/27/2021] [Indexed: 05/12/2023]
Abstract
BACKGROUND: Acer truncatum (purpleblow maple) is a woody tree species that produces seeds with high levels of valuable fatty acids (especially nervonic acid). The species is admired as a landscape plant with high developmental prospects and scientific research value. The A. truncatum chloroplast genome has recently been reported; however, the mitochondrial genome (mitogenome) is still unexplored. RESULTS: We characterized the A. truncatum mitogenome, which was assembled using reads from the PacBio and Illumina sequencing platforms, and performed a comparative analysis against different species of Acer. The circular mitogenome of A. truncatum has a length of 791,052 bp, with a base composition of 27.11% A, 27.21% T, 22.79% G, and 22.89% C. The A. truncatum mitogenome contains 62 genes, including 35 protein-coding genes, 23 tRNA genes and 4 rRNA genes. We also examined codon usage, sequence repeats, RNA editing and selective pressure in the A. truncatum mitogenome. To determine the evolutionary and taxonomic status of A. truncatum, we conducted a phylogenetic analysis based on the mitogenomes of A. truncatum and 25 other taxa. In addition, gene migration from the chloroplast and nuclear genomes to the mitogenome was analyzed. Finally, we developed a novel NAD1 intron indel marker for distinguishing several Acer species. CONCLUSIONS: In this study, we assembled and annotated the mitogenome of A. truncatum, a woody oil-tree species producing nervonic acid. The results of our analyses provide comprehensive information on the A. truncatum mitogenome, which should facilitate evolutionary research and molecular barcoding in Acer.
Affiliation(s)
- Qiuyue Ma
  - Institute of Leisure Agriculture, Jiangsu Academy of Agricultural Sciences, Nanjing, 210014 China
- Yuxiao Wang
  - Nanjing Forestry University, Nanjing, 210037 China
- Shushun Li
  - Institute of Leisure Agriculture, Jiangsu Academy of Agricultural Sciences, Nanjing, 210014 China
- Jing Wen
  - Institute of Leisure Agriculture, Jiangsu Academy of Agricultural Sciences, Nanjing, 210014 China
- Lu Zhu
  - Institute of Leisure Agriculture, Jiangsu Academy of Agricultural Sciences, Nanjing, 210014 China
- Kunyuan Yan
  - Institute of Leisure Agriculture, Jiangsu Academy of Agricultural Sciences, Nanjing, 210014 China
- Yiming Du
  - Institute of Leisure Agriculture, Jiangsu Academy of Agricultural Sciences, Nanjing, 210014 China
- Jie Ren
  - Institute of Agricultural Engineering, Anhui Academy of Agricultural Sciences, 40 Nongkenanlu, Hefei, 230031 Anhui China
- Shuxian Li
  - Nanjing Forestry University, Nanjing, 210037 China
- Zhu Chen
  - Institute of Agricultural Engineering, Anhui Academy of Agricultural Sciences, 40 Nongkenanlu, Hefei, 230031 Anhui China
- Changwei Bi
  - Nanjing Forestry University, Nanjing, 210037 China
- Qianzhong Li
  - Institute of Leisure Agriculture, Jiangsu Academy of Agricultural Sciences, Nanjing, 210014 China
13
Kamal N, Lux T, Jayakodi M, Haberer G, Gundlach H, Mayer KFX, Mascher M, Spannagl M. The Barley and Wheat Pan-Genomes. Methods Mol Biol 2022; 2443:147-159. [PMID: 35037204 DOI: 10.1007/978-1-0716-2067-0_7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/14/2023]
Abstract
Multi-genome comparisons are an essential tool for unlocking the genetic potential of crops. Decreasing costs and improved sequencing technologies have democratized plant genome sequencing. On the one hand, they have led to a vast increase in the number of available reference sequences; on the other, they have enabled the assembly of even the largest, most complex and most repetitive crop genomes, such as those of wheat and barley. These developments have led to the era of pan-genomics in recent years. Pan-genome projects enable the definition of the core and dispensable genome for various crop species as well as the analysis of structural and functional variation, and hence offer unprecedented opportunities for exploring and utilizing the genetic basis of natural variation in crops. Comparing, analyzing, and visualizing these multiple reference genomes and their diversity requires powerful and specialized computational strategies and tools.
Affiliation(s)
- Nadia Kamal
  - Helmholtz Zentrum München-German Research Center for Environmental Health, Neuherberg, Germany
- Thomas Lux
  - Helmholtz Zentrum München-German Research Center for Environmental Health, Neuherberg, Germany
- Murukarthick Jayakodi
  - Leibniz Institute of Plant Genetics and Crop Plant Research (IPK), Gatersleben, Seeland, Germany
- Georg Haberer
  - Helmholtz Zentrum München-German Research Center for Environmental Health, Neuherberg, Germany
- Heidrun Gundlach
  - Helmholtz Zentrum München-German Research Center for Environmental Health, Neuherberg, Germany
- Klaus F X Mayer
  - Helmholtz Zentrum München-German Research Center for Environmental Health, Neuherberg, Germany
- Martin Mascher
  - Leibniz Institute of Plant Genetics and Crop Plant Research (IPK), Gatersleben, Seeland, Germany
- Manuel Spannagl
  - Helmholtz Zentrum München-German Research Center for Environmental Health, Neuherberg, Germany
14
Lampasona A, Almeida S, Gao FB. Translation of the poly(GR) frame in C9ORF72-ALS/FTD is regulated by cis-elements involved in alternative splicing. Neurobiol Aging 2021; 105:327-332. [PMID: 34157654 PMCID: PMC8338774 DOI: 10.1016/j.neurobiolaging.2021.04.030] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 02/02/2021] [Revised: 04/29/2021] [Accepted: 04/29/2021] [Indexed: 12/31/2022]
Abstract
GGGGCC (G4C2) repeat expansion in the first intron of C9ORF72 is the most common genetic cause of amyotrophic lateral sclerosis and frontotemporal dementia, two devastating age-dependent neurodegenerative disorders. Both sense and antisense repeat RNAs can be translated into 5 different dipeptide repeat proteins, such as poly(GR), which is toxic in various cellular and animal models. However, it remains unknown how poly(GR) is synthesized in patient neurons. Using a reporter construct containing 70 G4C2 repeats flanked by human intronic and exonic sequences, we show that translation of the poly(GR) frame does not depend on repeats or the CUG start codon in the poly(GA) frame, suggesting that poly(GR) is not produced after ribosomal frameshifting in the poly(GA) frame. However, deletion analysis suggests that translation of the poly(GR) frame depends on the length of the intronic sequence 5' adjacent to the G4C2 repeats. Moreover, several 5' cis elements that are predicted to be involved in alternative splicing regulate poly(GR) synthesis. These results suggest that translation of repeat RNAs in the poly(GR) frame is regulated by multiple cis elements, likely through RNA secondary structures and/or associated RNA binding proteins.
Affiliation(s)
- Alexa Lampasona
  - Department of Neurology, University of Massachusetts Medical School, Worcester, MA 01605, USA
- Sandra Almeida
  - Department of Neurology, University of Massachusetts Medical School, Worcester, MA 01605, USA
- Fen-Biao Gao
  - Department of Neurology, University of Massachusetts Medical School, Worcester, MA 01605, USA
15
Chintalaphani SR, Pineda SS, Deveson IW, Kumar KR. An update on the neurological short tandem repeat expansion disorders and the emergence of long-read sequencing diagnostics. Acta Neuropathol Commun 2021; 9:98. [PMID: 34034831 PMCID: PMC8145836 DOI: 10.1186/s40478-021-01201-x] [Citation(s) in RCA: 62] [Impact Index Per Article: 20.7] [Received: 03/31/2021] [Accepted: 05/17/2021] [Indexed: 12/14/2022]
Abstract
BACKGROUND: Short tandem repeat (STR) expansion disorders are an important cause of human neurological disease. They have an established role in more than 40 different phenotypes including the myotonic dystrophies, Fragile X syndrome, Huntington's disease, the hereditary cerebellar ataxias, amyotrophic lateral sclerosis and frontotemporal dementia. MAIN BODY: STR expansions are difficult to detect and may explain unsolved diseases, as highlighted by recent findings including the discovery of a biallelic intronic 'AAGGG' repeat in RFC1 as the cause of cerebellar ataxia, neuropathy, and vestibular areflexia syndrome (CANVAS), and the finding of 'CGG' repeat expansions in NOTCH2NLC as the cause of neuronal intranuclear inclusion disease and a range of clinical phenotypes. However, established laboratory techniques for diagnosis of repeat expansions (repeat-primed PCR and Southern blot) are cumbersome, low-throughput and poorly suited to parallel analysis of multiple gene regions. While next generation sequencing (NGS) has been increasingly used, established short-read NGS platforms (e.g., Illumina) are unable to genotype large and/or complex repeat expansions. Long-read sequencing platforms recently developed by Oxford Nanopore Technologies and Pacific Biosciences promise to overcome these limitations and deliver enhanced diagnosis of repeat expansion disorders in a rapid and cost-effective fashion. CONCLUSION: We anticipate that long-read sequencing will rapidly transform the detection of short tandem repeat expansion disorders for both clinical diagnosis and gene discovery.
Affiliation(s)
- Sanjog R. Chintalaphani
  - School of Medicine, University of New South Wales, Sydney, 2052 Australia
  - Kinghorn Centre for Clinical Genomics, Garvan Institute of Medical Research, Darlinghurst, NSW 2010 Australia
- Sandy S. Pineda
  - Garvan-Weizmann Centre for Cellular Genomics, Garvan Institute of Medical Research, Darlinghurst, NSW 2010 Australia
  - Brain and Mind Centre, University of Sydney, Camperdown, NSW 2050 Australia
- Ira W. Deveson
  - Kinghorn Centre for Clinical Genomics, Garvan Institute of Medical Research, Darlinghurst, NSW 2010 Australia
  - Faculty of Medicine, St Vincent’s Clinical School, University of New South Wales, Sydney, NSW 2010 Australia
- Kishore R. Kumar
  - Kinghorn Centre for Clinical Genomics, Garvan Institute of Medical Research, Darlinghurst, NSW 2010 Australia
  - Molecular Medicine Laboratory and Neurology Department, Central Clinical School, Concord Repatriation General Hospital, University of Sydney, Concord, NSW 2137 Australia
16
Formenti G, Rhie A, Balacco J, Haase B, Mountcastle J, Fedrigo O, Brown S, Capodiferro MR, Al-Ajli FO, Ambrosini R, Houde P, Koren S, Oliver K, Smith M, Skelton J, Betteridge E, Dolucan J, Corton C, Bista I, Torrance J, Tracey A, Wood J, Uliano-Silva M, Howe K, McCarthy S, Winkler S, Kwak W, Korlach J, Fungtammasan A, Fordham D, Costa V, Mayes S, Chiara M, Horner DS, Myers E, Durbin R, Achilli A, Braun EL, Phillippy AM, Jarvis ED. Complete vertebrate mitogenomes reveal widespread repeats and gene duplications. Genome Biol 2021; 22:120. [PMID: 33910595 PMCID: PMC8082918 DOI: 10.1186/s13059-021-02336-9] [Citation(s) in RCA: 47] [Impact Index Per Article: 15.7] [Received: 08/17/2020] [Accepted: 03/31/2021] [Indexed: 01/22/2023]
Abstract
BACKGROUND: Modern sequencing technologies should make the assembly of the relatively small mitochondrial genomes an easy undertaking. However, few tools exist that address mitochondrial assembly directly. RESULTS: As part of the Vertebrate Genomes Project (VGP) we develop mitoVGP, a fully automated pipeline for similarity-based identification of mitochondrial reads and de novo assembly of mitochondrial genomes that incorporates both long (> 10 kbp, PacBio or Nanopore) and short (100-300 bp, Illumina) reads. Our pipeline leads to successful complete mitogenome assemblies of 100 vertebrate species of the VGP. We observe that tissue type and library size selection have considerable impact on mitogenome sequencing and assembly. Comparing our assemblies to purportedly complete reference mitogenomes based on short-read sequencing, we identify errors, missing sequences, and incomplete genes in those references, particularly in repetitive regions. Our assemblies also identify novel gene region duplications. The presence of repeats and duplications in over half of the species herein assembled indicates that their occurrence is a principle of mitochondrial structure rather than an exception, shedding new light on mitochondrial genome evolution and organization. CONCLUSIONS: Our results indicate that even in the "simple" case of vertebrate mitogenomes the completeness of many currently available reference sequences can be further improved, and caution should be exercised before claiming the complete assembly of a mitogenome, particularly from short reads alone.
Affiliation(s)
- Giulio Formenti
  - The Vertebrate Genome Lab, Rockefeller University, New York, NY, USA
  - Laboratory of Neurogenetics of Language, Rockefeller University, New York, NY, USA
  - The Howard Hughes Medical Institute, Chevy Chase, MD, USA
- Arang Rhie
  - Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
- Jennifer Balacco
  - The Vertebrate Genome Lab, Rockefeller University, New York, NY, USA
- Bettina Haase
  - The Vertebrate Genome Lab, Rockefeller University, New York, NY, USA
- Olivier Fedrigo
  - The Vertebrate Genome Lab, Rockefeller University, New York, NY, USA
- Samara Brown
  - Laboratory of Neurogenetics of Language, Rockefeller University, New York, NY, USA
  - The Howard Hughes Medical Institute, Chevy Chase, MD, USA
- Farooq O Al-Ajli
  - Monash University Malaysia Genomics Facility, School of Science, Bandar Sunway, Selangor Darul Ehsan, Malaysia
  - Tropical Medicine and Biology Multidisciplinary Platform, Monash University Malaysia, Bandar Sunway, Selangor Darul Ehsan, Malaysia
  - Qatar Falcon Genome Project, Doha, State of Qatar
- Roberto Ambrosini
  - Department of Environmental Science and Policy, University of Milan, Milan, Italy
- Peter Houde
  - Department of Biology, New Mexico State University, Las Cruces, NM, USA
- Sergey Koren
  - Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
- Iliana Bista
  - Wellcome Sanger Institute, Cambridge, UK
  - Department of Genetics, University of Cambridge, Cambridge, UK
- Shane McCarthy
  - Wellcome Sanger Institute, Cambridge, UK
  - Department of Genetics, University of Cambridge, Cambridge, UK
- Sylke Winkler
  - Max Planck Institute of Molecular Cell Biology & Genetics, Dresden, Germany
- Daniel Fordham
  - Oxford Nanopore Technologies Ltd, Oxford Science Park, Oxford, UK
- Vania Costa
  - Oxford Nanopore Technologies Ltd, Oxford Science Park, Oxford, UK
- Simon Mayes
  - Oxford Nanopore Technologies Ltd, Oxford Science Park, Oxford, UK
- Matteo Chiara
  - Department of Biosciences, University of Milan, Milan, Italy
- David S Horner
  - Department of Biosciences, University of Milan, Milan, Italy
- Eugene Myers
  - Max Planck Institute of Molecular Cell Biology & Genetics, Dresden, Germany
- Richard Durbin
  - Wellcome Sanger Institute, Cambridge, UK
  - Department of Genetics, University of Cambridge, Cambridge, UK
- Alessandro Achilli
  - Department of Biology and Biotechnology "L. Spallanzani", University of Pavia, Pavia, Italy
- Edward L Braun
  - Department of Biology, University of Florida, Gainesville, FL, USA
- Adam M Phillippy
  - Genome Informatics Section, Computational and Statistical Genomics Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
- Erich D Jarvis
  - The Vertebrate Genome Lab, Rockefeller University, New York, NY, USA
  - Laboratory of Neurogenetics of Language, Rockefeller University, New York, NY, USA
  - The Howard Hughes Medical Institute, Chevy Chase, MD, USA
17
Cheng Y, He X, Priyadarshani SVGN, Wang Y, Ye L, Shi C, Ye K, Zhou Q, Luo Z, Deng F, Cao L, Zheng P, Aslam M, Qin Y. Assembly and comparative analysis of the complete mitochondrial genome of Suaeda glauca. BMC Genomics 2021; 22:167. [PMID: 33750312 PMCID: PMC7941912 DOI: 10.1186/s12864-021-07490-9] [Citation(s) in RCA: 48] [Impact Index Per Article: 16.0] [Received: 11/08/2020] [Accepted: 02/26/2021] [Indexed: 01/30/2023]
Abstract
Background: Suaeda glauca (S. glauca) is a halophyte widely distributed on saline and sandy beaches, with strong saline-alkali tolerance. It is also admired as a landscape plant with high development prospects and scientific research value. The S. glauca chloroplast (cp) genome has recently been reported; however, the mitochondrial (mt) genome is still unexplored. Results: The mt genome of S. glauca was assembled from reads generated on the PacBio and Illumina sequencing platforms. The circular mt genome of S. glauca has a length of 474,330 bp, with a base composition of A (28.00%), T (27.93%), C (21.62%), and G (22.45%). The S. glauca mt genome contains 61 genes, including 27 protein-coding genes, 29 tRNA genes, and 5 rRNA genes. Sequence repeats, RNA editing, and gene migration from cp to mt were observed in the S. glauca mt genome. Phylogenetic analysis based on the mt genomes of S. glauca and 28 other taxa reflects the evolutionary and taxonomic status of S. glauca. Furthermore, mt genome characteristics, including genome size, GC content, genome organization, and gene repeats, were compared between S. glauca and other land plants, indicating the variation of the mt genome in plants. Subsequent Ka/Ks analysis revealed that most of the protein-coding genes in the mt genome have undergone negative selection, reflecting the importance of those genes in the mt genome. Conclusions: In this study, we report the mt genome assembly and annotation of the halophytic model plant S. glauca. The subsequent analysis provides a comprehensive understanding of the S. glauca mt genome, which might facilitate research on salt-tolerant plant species. Supplementary Information: The online version contains supplementary material available at 10.1186/s12864-021-07490-9.
Affiliation(s)
- Yan Cheng
  - State Key Laboratory of Ecological Pest Control for Fujian and Taiwan Crops, College of Plant Protection, Fujian Provincial Key Laboratory of Haixia Applied Plant Systems Biology, Center for Genomics and Biotechnology, College of Life Science, Fujian Agriculture and Forestry University, Fuzhou, 350002, China
- Xiaoxue He
  - State Key Laboratory of Ecological Pest Control for Fujian and Taiwan Crops, College of Plant Protection, Fujian Provincial Key Laboratory of Haixia Applied Plant Systems Biology, Center for Genomics and Biotechnology, College of Life Science, Fujian Agriculture and Forestry University, Fuzhou, 350002, China
- S V G N Priyadarshani
  - State Key Laboratory of Ecological Pest Control for Fujian and Taiwan Crops, College of Plant Protection, Fujian Provincial Key Laboratory of Haixia Applied Plant Systems Biology, Center for Genomics and Biotechnology, College of Life Science, Fujian Agriculture and Forestry University, Fuzhou, 350002, China
- Yu Wang
  - State Key Laboratory of Ecological Pest Control for Fujian and Taiwan Crops, College of Plant Protection, Fujian Provincial Key Laboratory of Haixia Applied Plant Systems Biology, Center for Genomics and Biotechnology, College of Life Science, Fujian Agriculture and Forestry University, Fuzhou, 350002, China
  - College of Agriculture, Fujian Agriculture and Forestry University, Fuzhou, China
- Li Ye
  - State Key Laboratory of Ecological Pest Control for Fujian and Taiwan Crops, College of Plant Protection, Fujian Provincial Key Laboratory of Haixia Applied Plant Systems Biology, Center for Genomics and Biotechnology, College of Life Science, Fujian Agriculture and Forestry University, Fuzhou, 350002, China
- Chao Shi
  - State Key Laboratory of Ecological Pest Control for Fujian and Taiwan Crops, College of Plant Protection, Fujian Provincial Key Laboratory of Haixia Applied Plant Systems Biology, Center for Genomics and Biotechnology, College of Life Science, Fujian Agriculture and Forestry University, Fuzhou, 350002, China
  - College of Agriculture, Fujian Agriculture and Forestry University, Fuzhou, China
- Kangzhuo Ye
  - State Key Laboratory of Ecological Pest Control for Fujian and Taiwan Crops, College of Plant Protection, Fujian Provincial Key Laboratory of Haixia Applied Plant Systems Biology, Center for Genomics and Biotechnology, College of Life Science, Fujian Agriculture and Forestry University, Fuzhou, 350002, China
- Qiao Zhou
  - State Key Laboratory of Ecological Pest Control for Fujian and Taiwan Crops, College of Plant Protection, Fujian Provincial Key Laboratory of Haixia Applied Plant Systems Biology, Center for Genomics and Biotechnology, College of Life Science, Fujian Agriculture and Forestry University, Fuzhou, 350002, China
- Ziqiang Luo
  - State Key Laboratory of Ecological Pest Control for Fujian and Taiwan Crops, College of Plant Protection, Fujian Provincial Key Laboratory of Haixia Applied Plant Systems Biology, Center for Genomics and Biotechnology, College of Life Science, Fujian Agriculture and Forestry University, Fuzhou, 350002, China
- Fang Deng
  - State Key Laboratory of Ecological Pest Control for Fujian and Taiwan Crops, College of Plant Protection, Fujian Provincial Key Laboratory of Haixia Applied Plant Systems Biology, Center for Genomics and Biotechnology, College of Life Science, Fujian Agriculture and Forestry University, Fuzhou, 350002, China
- Ling Cao
  - State Key Laboratory of Ecological Pest Control for Fujian and Taiwan Crops, College of Plant Protection, Fujian Provincial Key Laboratory of Haixia Applied Plant Systems Biology, Center for Genomics and Biotechnology, College of Life Science, Fujian Agriculture and Forestry University, Fuzhou, 350002, China
- Ping Zheng
  - State Key Laboratory of Ecological Pest Control for Fujian and Taiwan Crops, College of Plant Protection, Fujian Provincial Key Laboratory of Haixia Applied Plant Systems Biology, Center for Genomics and Biotechnology, College of Life Science, Fujian Agriculture and Forestry University, Fuzhou, 350002, China
- Mohammad Aslam
  - State Key Laboratory of Ecological Pest Control for Fujian and Taiwan Crops, College of Plant Protection, Fujian Provincial Key Laboratory of Haixia Applied Plant Systems Biology, Center for Genomics and Biotechnology, College of Life Science, Fujian Agriculture and Forestry University, Fuzhou, 350002, China
  - State Key Laboratory for Conservation and Utilization of Subtropical Agro-Bioresources, Guangxi Key Lab of Sugarcane Biology, College of Agriculture, Guangxi University, Nanning, 530004, Guangxi, China
- Yuan Qin
  - State Key Laboratory of Ecological Pest Control for Fujian and Taiwan Crops, College of Plant Protection, Fujian Provincial Key Laboratory of Haixia Applied Plant Systems Biology, Center for Genomics and Biotechnology, College of Life Science, Fujian Agriculture and Forestry University, Fuzhou, 350002, China
  - State Key Laboratory for Conservation and Utilization of Subtropical Agro-Bioresources, Guangxi Key Lab of Sugarcane Biology, College of Agriculture, Guangxi University, Nanning, 530004, Guangxi, China
18
Vourc'h P, Wurmser F, Brulard C, Mouzat K, Kassem S, Dangoumau A, Laumonnier F, Blasco H, Corcia P, Andres CR. Genes containing hexanucleotide repeats resembling C9ORF72 and expressed in the central nervous system are frequent in the human genome. Neurobiol Aging 2021; 97:148.e1-7. [PMID: 32843153 DOI: 10.1016/j.neurobiolaging.2020.07.027] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 01/04/2020] [Revised: 07/22/2020] [Accepted: 07/25/2020] [Indexed: 12/14/2022]
Abstract
More than 40 human diseases, mainly diseases affecting the central nervous system, are caused by the expansion of unstable nucleotide repeats. Repeats of sequences such as (CAG)n, present in different genes, can be responsible for various diseases of the central nervous system. An expanded hexanucleotide repeat (GGGGCC)n in the C9ORF72 gene has been characterized as the most frequent genetic cause of amyotrophic lateral sclerosis and frontotemporal lobar dementia. In this study, we performed a genome-wide analysis of the human genome and identified 74 genes containing this precise hexanucleotide repeat, with a preference for a location in exon 1 or intron 1, similar to the C9ORF72 gene. Based on their function, 36 of these 74 genes may be of interest as candidates in neurodevelopmental or neurodegenerative diseases.
19
Fernandes JD, Zamudio-Hurtado A, Clawson H, Kent WJ, Haussler D, Salama SR, Haeussler M. The UCSC repeat browser allows discovery and visualization of evolutionary conflict across repeat families. Mob DNA 2020; 11:13. [PMID: 32266012 DOI: 10.1186/s13100-020-00208-w] [Citation(s) in RCA: 20] [Impact Index Per Article: 5.0] [Received: 11/27/2019] [Accepted: 03/10/2020] [Indexed: 01/12/2023]
Abstract
Background: Nearly half the human genome consists of repeat elements, most of which are retrotransposons, and many of which play important biological roles. However, repeat elements pose several unique challenges to current bioinformatic analyses and visualization tools, as short repeat sequences can map to multiple genomic loci, resulting in their misclassification and misinterpretation. In fact, sequence data mapping to repeat elements are often discarded from analysis pipelines. Therefore, there is a continued need for standardized tools and techniques to interpret genomic data of repeats. Results: We present the UCSC Repeat Browser, which consists of a complete set of human repeat reference sequences derived from annotations made by the commonly used program RepeatMasker. The UCSC Repeat Browser also provides an alignment from the human genome to these references, uses it to map the standard human genome annotation tracks, and presents all of them in a comprehensive interface to facilitate work with repetitive elements. It also provides processed tracks of multiple publicly available datasets of particular interest to the repeat community, including ChIP-seq datasets for KRAB Zinc Finger Proteins (KZNFs), a family of proteins known to bind and repress certain classes of repeats. We used the UCSC Repeat Browser in combination with these datasets, as well as RepeatMasker annotations in several non-human primates, to trace the independent trajectories of species-specific evolutionary battles between LINE-1 retroelements and their repressors. Furthermore, we document at https://repeatbrowser.ucsc.edu how researchers can map their own human genome annotations to these reference repeat sequences. Conclusions: The UCSC Repeat Browser allows easy and intuitive visualization of genomic data on consensus repeat elements, circumventing the problem of multi-mapping, in which sequencing reads of repeat elements map to multiple locations on the human genome. By developing a reference consensus, multiple datasets and annotation tracks can easily be overlaid to reveal complex evolutionary histories of repeats in a single interactive window. Specifically, we use this approach to retrace the history of several primate-specific LINE-1 families across apes, and discover several species-specific routes of evolution that correlate with the emergence and binding of KZNFs.
20
Rajapakse RPVJ, Pham KLT, Karunathilake KJK, Lawton SP, Le TH. Characterization and phylogenetic properties of the complete mitochondrial genome of Fascioloides jacksoni (syn. Fasciola jacksoni) support the suggested intergeneric change from Fasciola to Fascioloides (Platyhelminthes: Trematoda: Plagiorchiida). Infect Genet Evol 2020; 82:104281. [PMID: 32165245 DOI: 10.1016/j.meegid.2020.104281] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 12/10/2019] [Revised: 03/06/2020] [Accepted: 03/08/2020] [Indexed: 11/27/2022]
Abstract
Fascioloides jacksoni (syn. Fasciola jacksoni, Cobbold, 1869) (Platyhelminthes: Echinostomatoidea) is a liver fluke that causes severe morbidity and mortality in Asian elephants (Elephas maximus maximus). Understanding of the molecular diagnosis, epidemiology, genetics and evolution of this flatworm is limited. In this study, we present the complete mitochondrial DNA (mt) sequence of 14,952 bp obtained from an individual fluke and a comparative characterization of its mitogenomic features with those of fasciolids, primarily Fascioloides magna, and other taxa in the superfamily Echinostomatoidea. Taxonomic relationships within and between Echinostomatoidea, Opisthorchioidea and Paramphistomoidea in the order Plagiorchiida are also considered. The complete circular mt molecule of Fas. jacksoni contained 12 protein-coding, two ribosomal RNA and 22 transfer RNA genes, and a non-coding region (NCR) rich in tandem repeat units. As is common in digenean trematodes, Fas. jacksoni has the usual gene order, lacks atp8, and shows a 40 bp overlap between the nad4L and nad4 genes. The NCR, located between tRNAGlu (trnE) and cox3, contained nine nearly identical tandem repeat units (TRs of 113 bp each). Special tRNAs for serine lacking the DHU arm were found for both tRNAS1(AGN) and tRNAS2(UCN). Base composition indicated that cox1 of Fas. jacksoni showed the lowest divergence rate (11.8% to Fas. magna, 12.9-13.6% to Fasciola spp. and 18.1% to Fasciolopsis buski) and nad6 the highest (19.2%, 23.8-26.5% and 27.2% to each fasciolid group, respectively). A clear bias in nucleotide composition (61.68%, 62.88% and 61.54%), with negative AT-skew values (-0.523, -0.225 and -0.426), was found for the PCGs, MRGs and mtDNA of Fas. jacksoni, respectively, with similar data for the other fasciolids. Phylogenetic analysis confirmed that Fas. jacksoni and Fas. magna form sister branches with a nodal support of 100%, clearly separated from the taxonomically recognized Fasciola spp. Together with previous studies, the mitogenomic data presented here strongly support the reappraisal of Fasciola jacksoni as Fascioloides jacksoni in the genus Fascioloides.
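Note on the composition statistics: AT-skew is conventionally defined as (A - T)/(A + T), so a negative value means thymine outnumbers adenine on the analysed strand. The following is a minimal sketch of that calculation, not the authors' pipeline; the function name and the toy sequence are assumptions made for illustration only.

```python
from collections import Counter


def at_content_and_skew(seq: str) -> tuple[float, float]:
    """Return (A+T fraction, AT-skew) for a DNA sequence.

    AT-skew uses the conventional definition (A - T) / (A + T).
    Ambiguous symbols such as N are ignored.
    """
    counts = Counter(seq.upper())
    a, t, g, c = (counts[base] for base in "ATGC")
    total = a + t + g + c
    if total == 0 or a + t == 0:
        raise ValueError("sequence has no A/T bases to compute a skew from")
    return (a + t) / total, (a - t) / (a + t)


# Toy sequence (not real fluke data): 2 A, 6 T, 1 G, 1 C
content, skew = at_content_and_skew("ATTTTTTAGC")
```

For the toy sequence the A+T fraction is 8/10 and the skew is (2 - 6)/8, i.e. negative, which is the same direction of bias as the values reported in the abstract.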
Affiliation(s)
- R P V J Rajapakse
- Department of Veterinary Pathobiology, Faculty of Veterinary Medicine and Animal Science, University of Peradeniya, Peradeniya, Sri Lanka
- Khanh Linh Thi Pham
- Institute of Biotechnology (IBT), Vietnam Academy of Science and Technology (VAST), 18. Hoang Quoc Viet Rd., Cau Giay, Hanoi, Viet Nam
- K J Kumari Karunathilake
- Department of Veterinary Pathobiology, Faculty of Veterinary Medicine and Animal Science, University of Peradeniya, Peradeniya, Sri Lanka
- Scott P Lawton
- Molecular Parasitology Laboratory, School of Life Sciences, Pharmacy and Chemistry, Kingston University London, Kingston Upon Thames, Surrey KT1 2EE, UK
- Thanh Hoa Le
- Institute of Biotechnology (IBT), Vietnam Academy of Science and Technology (VAST), 18. Hoang Quoc Viet Rd., Cau Giay, Hanoi, Viet Nam; Graduate University of Science and Technology (GUST), Vietnam Academy of Science and Technology (VAST), 18. Hoang Quoc Viet Rd., Cau Giay, Hanoi, Viet Nam.
|
21
|
Shortt JA, Ruggiero RP, Cox C, Wacholder AC, Pollock DD. Finding and extending ancient simple sequence repeat-derived regions in the human genome. Mob DNA 2020; 11:11. [PMID: 32095164 PMCID: PMC7027126 DOI: 10.1186/s13100-020-00206-y] [Citation(s) in RCA: 16] [Impact Index Per Article: 4.0] [Received: 11/04/2019] [Accepted: 02/04/2020] [Indexed: 12/19/2022] Open
Abstract
Background Previously, 3% of the human genome has been annotated as simple sequence repeats (SSRs), similar to the proportion annotated as protein coding. The origin of much of the genome is not well annotated, however, and some of the unidentified regions are likely to be ancient SSR-derived regions not identified by current methods. The identification of these regions is complicated because SSRs appear to evolve through complex cycles of expansion and contraction, often interrupted by mutations that alter both the repeated motif and mutation rate. We applied an empirical, kmer-based, approach to identify genome regions that are likely derived from SSRs. Results The sequences flanking annotated SSRs are enriched for similar sequences and for SSRs with similar motifs, suggesting that the evolutionary remains of SSR activity abound in regions near obvious SSRs. Using our previously described P-clouds approach, we identified ‘SSR-clouds’, groups of similar kmers (or ‘oligos’) that are enriched near a training set of unbroken SSR loci, and then used the SSR-clouds to detect likely SSR-derived regions throughout the genome. Conclusions Our analysis indicates that the amount of likely SSR-derived sequence in the human genome is 6.77%, over twice as much as previous estimates, including millions of newly identified ancient SSR-derived loci. SSR-clouds identified poly-A sequences adjacent to transposable element termini in over 74% of the oldest class of Alu (roughly, AluJ), validating the sensitivity of the approach. Poly-A’s annotated by SSR-clouds also had a length distribution that was more consistent with their poly-A origins, with mean about 35 bp even in older Alus. This work demonstrates that the high sensitivity provided by SSR-Clouds improves the detection of SSR-derived regions and will enable deeper analysis of how decaying repeats contribute to genome structure.
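Note on SSR detection: the "obvious" unbroken SSR loci that serve as a training set in this abstract are perfect tandem repeats of a short motif. The sketch below finds such perfect repeats with a regular expression. It is a simplified illustration and not the P-clouds or SSR-clouds method; the function name, the motif length limit and the minimum copy number are assumptions.

```python
import re


def find_ssrs(seq: str, max_motif: int = 6, min_repeats: int = 3):
    """Yield (start, end, motif, copies) for perfect tandem repeats.

    The motif group is lazy, so the shortest repeating unit is reported
    (e.g. "CA" rather than "CACA"). Coordinates are 0-based, end-exclusive.
    """
    pattern = re.compile(
        r"([ACGT]{1,%d}?)\1{%d,}" % (max_motif, min_repeats - 1)
    )
    for match in pattern.finditer(seq.upper()):
        motif = match.group(1)
        copies = (match.end() - match.start()) // len(motif)
        yield match.start(), match.end(), motif, copies


# Toy sequence with one (CA)4 microsatellite at positions 2..10
hits = list(find_ssrs("GGCACACACATT"))
```

A regex of this kind only recovers uninterrupted repeats; degraded, mutation-interrupted SSR remnants of the sort the abstract targets would need a similarity-based approach.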
Affiliation(s)
- Jonathan A Shortt
- Colorado Center for Personalized Medicine, University of Colorado School of Medicine, Aurora, CO 80045 USA
- Robert P Ruggiero
- Department of Biology, Southeast Missouri State University, Cape Girardeau, MO 63701 USA
- Corey Cox
- Colorado Center for Personalized Medicine, University of Colorado School of Medicine, Aurora, CO 80045 USA
- Aaron C Wacholder
- Department of Computational and Systems Biology, School of Medicine, University of Pittsburgh, Pittsburgh, PA 15213 USA
- David D Pollock
- Department of Biochemistry & Molecular Genetics, University of Colorado School of Medicine, Aurora, CO 80045 USA
|
22
|
Lee SR, Kim K, Lee BY, Lim CE. Complete chloroplast genomes of all six Hosta species occurring in Korea: molecular structures, comparative, and phylogenetic analyses. BMC Genomics 2019; 20:833. [PMID: 31706273 PMCID: PMC6842461 DOI: 10.1186/s12864-019-6215-y] [Citation(s) in RCA: 24] [Impact Index Per Article: 4.8] [Received: 06/10/2019] [Accepted: 10/22/2019] [Indexed: 12/02/2022] Open
Abstract
BACKGROUND The genus Hosta is a group of economically appreciated perennial herbs consisting of approximately 25 species that is endemic to eastern Asia. Due to considerable morphological variability, the genus has been well recognized as a group with taxonomic problems. Chloroplast is a cytoplasmic organelle with its own genome, which is the most commonly used for phylogenetic and genetic diversity analyses for land plants. To understand the genomic architecture of Hosta chloroplasts and examine the level of nucleotide and size variation, we newly sequenced four (H. clausa, H. jonesii, H. minor, and H. venusta) and analyzed six Hosta species (including the four, H. capitata and H. yingeri) distributed throughout South Korea. RESULTS The average size of complete chloroplast genomes for the Hosta taxa was 156,642 bp with a maximum size difference of ~ 300 bp. The overall gene content and organization across the six Hosta were nearly identical with a few exceptions. There was a single tRNA gene deletion in H. jonesii and four genes were pseudogenized in three taxa (H. capitata, H. minor, and H. jonesii). We did not find major structural variation, but there were a minor expansion and contractions in IR region for three species (H. capitata, H. minor, and H. venusta). Sequence variations were higher in non-coding regions than in coding regions. Four genic and intergenic regions including two coding genes (psbA and ndhD) exhibited the largest sequence divergence showing potential as phylogenetic markers. We found compositional codon usage bias toward A/T at the third position. The Hosta plastomes had a comparable number of dispersed and tandem repeats (simple sequence repeats) to the ones identified in other angiosperm taxa. The phylogeny of 20 Agavoideae (Asparagaceae) taxa including the six Hosta species inferred from complete plastome data showed well resolved monophyletic clades for closely related taxa with high node supports. 
CONCLUSIONS Our study provides detailed information on the chloroplast genomes of the Hosta taxa. We identified nucleotide diversity hotspots and characterized types of repeats, which can be used for developing molecular markers applicable in various research areas.
Affiliation(s)
- Soo-Rang Lee
- Department of Biological Science, Texas Tech University, Lubbock, TX USA
- Kyeonghee Kim
- National Institute of Biological Resources, 42 Hwangyeong-ro, Seo-gu, Incheon, 22689 South Korea
- Byoung-Yoon Lee
- National Institute of Biological Resources, 42 Hwangyeong-ro, Seo-gu, Incheon, 22689 South Korea
- Chae Eun Lim
- National Institute of Biological Resources, 42 Hwangyeong-ro, Seo-gu, Incheon, 22689 South Korea
|
23
|
Choi IS, Schwarz EN, Ruhlman TA, Khiyami MA, Sabir JSM, Hajarah NH, Sabir MJ, Rabah SO, Jansen RK. Fluctuations in Fabaceae mitochondrial genome size and content are both ancient and recent. BMC Plant Biol 2019; 19:448. [PMID: 31653201 PMCID: PMC6814987 DOI: 10.1186/s12870-019-2064-8] [Citation(s) in RCA: 28] [Impact Index Per Article: 5.6] [Received: 06/18/2019] [Accepted: 10/02/2019] [Indexed: 05/12/2023]
Abstract
BACKGROUND Organelle genome studies of Fabaceae, an economically and ecologically important plant family, have been biased towards the plastid genome (plastome). Thus far, less than 15 mitochondrial genome (mitogenome) sequences of Fabaceae have been published, all but four of which belong to the subfamily Papilionoideae, limiting the understanding of size variation and content across the family. To address this, four mitogenomes were sequenced and assembled from three different subfamilies (Cercidoideae, Detarioideae and Caesalpinioideae). RESULTS Phylogenetic analysis based on shared mitochondrial protein coding regions produced a fully resolved and well-supported phylogeny that was completely congruent with the plastome tree. Comparative analyses suggest that two kinds of mitogenome expansions have occurred in Fabaceae. Size expansion of four genera (Tamarindus, Libidibia, Haematoxylum, and Leucaena) in two subfamilies (Detarioideae and Caesalpinioideae) occurred in relatively deep nodes, and was mainly caused by intercellular gene transfer and/or interspecific horizontal gene transfer (HGT). The second, more recent expansion occurred in the Papilionoideae as a result of duplication of native mitochondrial sequences. Family-wide gene content analysis revealed 11 gene losses, four (rps2, 7, 11 and 13) of which occurred in the ancestor of Fabaceae. Losses of the remaining seven genes (cox2, rpl2, rpl10, rps1, rps19, sdh3, sdh4) were restricted to specific lineages or occurred independently in different clades. Introns of three genes (cox2, ccmFc and rps10) showed extensive lineage-specific length variation due to large sequence insertions and deletions. Shared DNA analysis among Fabaceae mitogenomes demonstrated a substantial decay of intergenic spacers and provided further insight into HGT between the mimosoid clade of Caesalpinioideae and the holoparasitic Lophophytum (Balanophoraceae). 
CONCLUSION This study represents the most exhaustive analysis of Fabaceae mitogenomes so far, and extends the understanding of the dynamic variation in size and gene/intron content. The four newly sequenced mitogenomes reported here expand the phylogenetic coverage to four subfamilies. The family has experienced multiple mitogenome size fluctuations in both ancient and recent times. The causes of these size variations are distinct in different lineages. Fabaceae mitogenomes experienced extensive size fluctuation by recruitment of exogenous DNA and duplication of native mitochondrial DNA.
Affiliation(s)
- In-Su Choi
- Department of Integrative Biology, University of Texas at Austin, Austin, TX 78712 USA
- Erika N. Schwarz
- Department of Biological Sciences, St. Edward’s University, Austin, TX 78704 USA
- Tracey A. Ruhlman
- Department of Integrative Biology, University of Texas at Austin, Austin, TX 78712 USA
- Mohammad A. Khiyami
- King Abdulaziz City for Science and Technology (KACST), Riyadh, 11442 Saudi Arabia
- Jamal S. M. Sabir
- Centre of Excellence in Bionanoscience Research, Department of Biological Sciences, Faculty of Science, King Abdulaziz University, Jeddah, 21589 Saudi Arabia
- Nahid H. Hajarah
- Centre of Excellence in Bionanoscience Research, Department of Biological Sciences, Faculty of Science, King Abdulaziz University, Jeddah, 21589 Saudi Arabia
- Mernan J. Sabir
- Centre of Excellence in Bionanoscience Research, Department of Biological Sciences, Faculty of Science, King Abdulaziz University, Jeddah, 21589 Saudi Arabia
- Samar O. Rabah
- Department of Biological Sciences, Faculty of Science, King Abdulaziz University, Jeddah, 21589 Saudi Arabia
- Robert K. Jansen
- Department of Integrative Biology, University of Texas at Austin, Austin, TX 78712 USA
- Centre of Excellence in Bionanoscience Research, Department of Biological Sciences, Faculty of Science, King Abdulaziz University, Jeddah, 21589 Saudi Arabia
|
24
|
Le TH, Nguyen KT, Nguyen NTB, Doan HTT, Agatsuma T, Blair D. The complete mitochondrial genome of Paragonimus ohirai (Paragonimidae: Trematoda: Platyhelminthes) and its comparison with P. westermani congeners and other trematodes. PeerJ 2019; 7:e7031. [PMID: 31259095 PMCID: PMC6589331 DOI: 10.7717/peerj.7031] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.0] [Received: 02/07/2019] [Accepted: 04/27/2019] [Indexed: 11/20/2022] Open
Abstract
We present the complete mitochondrial genome of Paragonimus ohirai Miyazaki, 1939 and compare its features with those of previously reported mitochondrial genomes of the pathogenic lung-fluke, Paragonimus westermani, and other members of the genus. The circular mitochondrial DNA molecule of the single fully sequenced individual of P. ohirai was 14,818 bp in length, containing 12 protein-coding, two ribosomal RNA and 22 transfer RNA genes. As is common among trematodes, an atp8 gene was absent from the mitogenome of P. ohirai and the 5' end of nad4 overlapped with the 3' end of nad4L by 40 bp. Paragonimus ohirai and four forms/strains of P. westermani from South Korea and India exhibited remarkably different base compositions and hence codon usage in protein-coding genes. In the fully sequenced P. ohirai individual, the non-coding region started with two long identical repeats (292 bp each), separated by tRNAGlu. These were followed by an array of six short tandem repeats (STR), 117 bp each. Numbers of the short tandem repeats varied among P. ohirai individuals. A phylogenetic tree inferred from concatenated mitochondrial protein sequences of 50 strains encompassing 42 species of trematodes belonging to 14 families identified a monophyletic Paragonimidae in the class Trematoda. Characterization of additional mitogenomes in the genus Paragonimus will be useful for biomedical studies and development of molecular tools and mitochondrial markers for diagnostic, identification, hybridization and phylogenetic/epidemiological/evolutionary studies.
Affiliation(s)
- Thanh Hoa Le
- Immunology Department, Institute of Biotechnology (IBT), Vietnam Academy of Science and Technology (VAST), Hanoi, Vietnam
- Graduate University of Science and Technology (GUST), Vietnam Academy of Science and Technology (VAST), Hanoi, Vietnam
- Khue Thi Nguyen
- Immunology Department, Institute of Biotechnology (IBT), Vietnam Academy of Science and Technology (VAST), Hanoi, Vietnam
- Nga Thi Bich Nguyen
- Immunology Department, Institute of Biotechnology (IBT), Vietnam Academy of Science and Technology (VAST), Hanoi, Vietnam
- Huong Thi Thanh Doan
- Immunology Department, Institute of Biotechnology (IBT), Vietnam Academy of Science and Technology (VAST), Hanoi, Vietnam
- Graduate University of Science and Technology (GUST), Vietnam Academy of Science and Technology (VAST), Hanoi, Vietnam
- Takeshi Agatsuma
- Department of Environmental Medicine, Kochi Medical School, Kochi University, Oko, Nankoku City, Kochi, Japan
- David Blair
- College of Science and Engineering, James Cook University, Townsville, Australia
|
25
|
Abstract
BACKGROUND Long terminal repeat retrotransposons are the most abundant transposons in plants. They play important roles in alternative splicing, recombination, gene regulation, and defense mechanisms. Large-scale sequencing projects for plant genomes are currently underway. Software tools are important for annotating long terminal repeat retrotransposons in these newly available genomes. However, the available tools are not very sensitive to known elements and perform inconsistently on different genomes. Some are hard to install or obsolete. They may struggle to process large plant genomes. None can be executed in parallel out of the box and very few have features to support visual review of new elements. To overcome these limitations, we developed LtrDetector, which uses techniques inspired by signal-processing. RESULTS We compared LtrDetector to LTR_Finder and LTRharvest, the two most successful predecessor tools, on six plant genomes. For each organism, we constructed a ground truth data set based on queries from a consensus sequence database. According to this evaluation, LtrDetector was the most sensitive tool, achieving 16-23% improvement in sensitivity over LTRharvest and 21% improvement over LTR_Finder. All three tools had low false positive rates, with LtrDetector achieving 98.2% precision, in between its two competitors. Overall, LtrDetector provides the best compromise between high sensitivity and low false positive rate while requiring moderate time and utilizing memory available on personal computers. CONCLUSIONS LtrDetector uses a novel methodology revolving around k-mer distributions, which allows it to produce high-quality results using relatively lightweight procedures. It is easy to install and use. It is not species specific, performing well using its default parameters on genomes of varying size and repeat content. It is automatically configured for parallel execution and runs efficiently on an ordinary personal computer. 
It includes a k-mer scores visualization tool to facilitate manual review of the identified elements. These features make LtrDetector an attractive tool for future annotation projects involving long terminal repeat retrotransposons.
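Note on the evaluation metrics: sensitivity is the fraction of ground-truth elements that a tool recovers, and precision is the fraction of its predictions that match a ground-truth element. The sketch below shows one way to compute both from coordinate intervals. It is not LtrDetector's evaluation code; the function names, the 90% overlap criterion and the toy coordinates are all assumptions chosen for illustration.

```python
def overlaps(a, b, min_frac=0.9):
    """True if intervals a and b share at least min_frac of the longer one.

    Intervals are (start, end) tuples; the 0.9 default is an assumed
    matching criterion, not one taken from the paper.
    """
    shared = min(a[1], b[1]) - max(a[0], b[0])
    longer = max(a[1] - a[0], b[1] - b[0])
    return shared > 0 and shared >= min_frac * longer


def sensitivity_and_precision(predicted, truth, min_frac=0.9):
    """Return (sensitivity, precision) for predicted vs. true intervals."""
    if not predicted or not truth:
        raise ValueError("both interval lists must be non-empty")
    # Ground-truth elements recovered by at least one prediction
    found = sum(any(overlaps(t, p, min_frac) for p in predicted) for t in truth)
    # Predictions that correspond to at least one ground-truth element
    correct = sum(any(overlaps(p, t, min_frac) for t in truth) for p in predicted)
    return found / len(truth), correct / len(predicted)


# Toy coordinates (hypothetical): 4 true elements, 3 predictions
truth = [(100, 200), (500, 700), (1000, 1300), (3000, 3400)]
predicted = [(100, 200), (505, 700), (2000, 2100)]
sens, prec = sensitivity_and_precision(predicted, truth)
```

In the toy data two of four true elements are recovered and two of three predictions are correct, so sensitivity is 0.5 and precision is 2/3.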
Affiliation(s)
- Joseph D Valencia
- The Bioinformatics Toolsmith Laboratory, Tandy School of Computer Science, University of Tulsa, 800 South Tucker Drive, Tulsa, 74104, OK, USA
- Hani Z Girgis
- The Bioinformatics Toolsmith Laboratory, Tandy School of Computer Science, University of Tulsa, 800 South Tucker Drive, Tulsa, 74104, OK, USA.
|
26
|
Sun J, Dong X, Cao Q, Xu T, Zhu M, Sun J, Dong T, Ma D, Han Y, Li Z. A systematic comparison of eight new plastome sequences from Ipomoea L. PeerJ 2019; 7:e6563. [PMID: 30881765 PMCID: PMC6417408 DOI: 10.7717/peerj.6563] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 08/28/2018] [Accepted: 02/04/2019] [Indexed: 11/20/2022] Open
Abstract
Background Ipomoea is the largest genus in the family Convolvulaceae. The species in this genus have been widely used in many fields, such as agriculture, nutrition, and medicine. With the development of next-generation sequencing, more than 50 chloroplast genomes of Ipomoea species have been sequenced. However, the repeats and divergence regions in Ipomoea have not been well investigated. In the present study, we sequenced and assembled eight chloroplast genomes from sweet potato's close wild relatives. By combining these with 32 published chloroplast genomes, we conducted a detailed comparative analysis of a broad range of Ipomoea species. Methods Eight chloroplast genomes were assembled using short DNA sequences generated by next-generation sequencing technology. By combining these chloroplast genomes with 32 other published Ipomoea chloroplast genomes downloaded from GenBank and the Oxford Research Archive, we conducted a comparative analysis of the repeat sequences and divergence regions across the Ipomoea genus. In addition, separate analyses of the Batatas group and Quamoclit group were also performed. Results The eight newly sequenced chloroplast genomes ranged from 161,225 to 161,721 bp in length and displayed the typical circular quadripartite structure, consisting of a pair of inverted repeat (IR) regions (30,798-30,910 bp each) separated by a large single copy (LSC) region (87,575-88,004 bp) and a small single copy (SSC) region (12,018-12,051 bp). The average guanine-cytosine (GC) content was approximately 40.5% in the IR region, 36.1% in the LSC region, 32.2% in the SSC regions, and 37.5% in complete sequence for all the generated plastomes. The eight chloroplast genome sequences from this study included 80 protein-coding genes, four rRNAs (rrn23, rrn16, rrn5, and rrn4.5), and 37 tRNAs. The boundaries of single copy regions and IR regions were highly conserved in the eight chloroplast genomes. 
In Ipomoea, 57-89 pairs of repetitive sequences and 39-64 simple sequence repeats were found. By conducting a sliding window analysis, we found six relatively high variable regions (ndhA intron, ndhH-ndhF, ndhF-rpl32, rpl32-trnL, rps16-trnQ, and ndhF) in the Ipomoea genus, eight (trnG, rpl32-trnL, ndhA intron, ndhF-rpl32, ndhH-ndhF, ccsA-ndhD, trnG-trnR, and pasA-ycf3) in the Batatas group, and eight (ndhA intron, petN-psbM, rpl32-trnL, trnG-trnR, trnK-rps16, ndhC-trnV, rps16-trnQ, and trnG) in the Quamoclit group. Our maximum-likelihood tree based on whole chloroplast genomes confirmed the phylogenetic topology reported in previous studies. Conclusions The chloroplast genome sequence and structure were highly conserved in the eight newly-sequenced Ipomoea species. Our comparative analysis included a broad range of Ipomoea chloroplast genomes, providing valuable information for Ipomoea species identification and enhancing the understanding of Ipomoea genetic resources.
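Note on the sliding window analysis: highly variable regions are typically located by computing nucleotide diversity (the mean number of pairwise differences per site) in consecutive windows along an alignment. The sketch below illustrates that idea under simplifying assumptions: the sequences are already aligned, equal in length and free of gaps. The function name, window size and toy alignment are hypothetical and do not reproduce the settings used in the study.

```python
from itertools import combinations


def sliding_window_pi(aligned, window, step):
    """Return [(window_start, pi), ...] for an ungapped alignment.

    pi is the mean number of pairwise differences per site, averaged
    over all sequence pairs within each window.
    """
    if len(aligned) < 2:
        raise ValueError("need at least two aligned sequences")
    length = len(aligned[0])
    if any(len(s) != length for s in aligned):
        raise ValueError("sequences must have equal length")
    pairs = list(combinations(aligned, 2))
    result = []
    for start in range(0, length - window + 1, step):
        end = start + window
        diffs = sum(
            sum(x != y for x, y in zip(a[start:end], b[start:end]))
            for a, b in pairs
        )
        result.append((start, diffs / (len(pairs) * window)))
    return result


# Toy alignment (not Ipomoea data): variation only in the second window
profile = sliding_window_pi(["AAAAAAAA", "AAAAAAAT", "AAAAAATT"], 4, 4)
```

In the toy alignment the first window is invariant and the second carries four pairwise differences over three pairs and four sites, giving a diversity of 1/3.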
Affiliation(s)
- Jianying Sun
- Institute of Integrative Plant Biology, School of Life Sciences, Jiangsu Normal University, Xuzhou, China; Jiangsu Key Laboratory of Phylogenomics and Comparative Genomics, Jiangsu Normal University, Xuzhou, China
- Xiaofeng Dong
- Institute of Integrative Plant Biology, School of Life Sciences, Jiangsu Normal University, Xuzhou, China; Jiangsu Key Laboratory of Phylogenomics and Comparative Genomics, Jiangsu Normal University, Xuzhou, China
- Qinghe Cao
- Jiangsu Xuhuai Regional Xuzhou Institute of Agricultural Sciences, Chinese Academy of Agricultural Sciences, Xuzhou, China
- Tao Xu
- Institute of Integrative Plant Biology, School of Life Sciences, Jiangsu Normal University, Xuzhou, China; Jiangsu Key Laboratory of Phylogenomics and Comparative Genomics, Jiangsu Normal University, Xuzhou, China
- Mingku Zhu
- Institute of Integrative Plant Biology, School of Life Sciences, Jiangsu Normal University, Xuzhou, China; Jiangsu Key Laboratory of Phylogenomics and Comparative Genomics, Jiangsu Normal University, Xuzhou, China
- Jian Sun
- Institute of Integrative Plant Biology, School of Life Sciences, Jiangsu Normal University, Xuzhou, China; Jiangsu Key Laboratory of Phylogenomics and Comparative Genomics, Jiangsu Normal University, Xuzhou, China
- Tingting Dong
- Institute of Integrative Plant Biology, School of Life Sciences, Jiangsu Normal University, Xuzhou, China; Jiangsu Key Laboratory of Phylogenomics and Comparative Genomics, Jiangsu Normal University, Xuzhou, China
- Daifu Ma
- Jiangsu Xuhuai Regional Xuzhou Institute of Agricultural Sciences, Chinese Academy of Agricultural Sciences, Xuzhou, China
- Yonghua Han
- Institute of Integrative Plant Biology, School of Life Sciences, Jiangsu Normal University, Xuzhou, China; Jiangsu Key Laboratory of Phylogenomics and Comparative Genomics, Jiangsu Normal University, Xuzhou, China
- Zongyun Li
- Institute of Integrative Plant Biology, School of Life Sciences, Jiangsu Normal University, Xuzhou, China; Jiangsu Key Laboratory of Phylogenomics and Comparative Genomics, Jiangsu Normal University, Xuzhou, China
|
27
|
Piégu B, Arensburger P, Guillou F, Bigot Y. But where did the centromeres go in the chicken genome models? Chromosome Res 2018; 26:297-306. [PMID: 30225548 DOI: 10.1007/s10577-018-9585-0] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Received: 07/17/2018] [Revised: 08/31/2018] [Accepted: 09/03/2018] [Indexed: 11/30/2022]
Abstract
The chicken genome was the third vertebrate to be sequenced. To date, its sequence and feature annotations are used as the reference for avian models in genome sequencing projects developed on birds and other Sauropsida species, and in genetic studies of domesticated birds of economic and evolutionary biology interest. Therefore, an accurate description of this genome model is important to a wide number of scientists. Here, we review the location and features of a very basic element, the centromeres of chromosomes in the galGal5 genome model. Centromeres are elements that are not determined by their DNA sequence but by their epigenetic status, in particular by the accumulation of the histone-like protein CENP-A. Comparison of data from several public sources (primarily marker probes flanking centromeres using fluorescent in situ hybridization done on giant lampbrush chromosomes and CENP-A ChIP-seq datasets) with galGal5 annotations revealed that centromeres are likely inappropriately mapped in 9 of the 16 galGal5 chromosome models in which they are described. Analysis of karyology data confirmed that the location of the main CENP-A peaks in chromosomes is the best means of locating the centromeres in 25 galGal5 chromosome models, the majority of which (16) are fully sequenced and assembled. This data re-analysis reaffirms that several sources of information should be examined to produce accurate genome annotations, particularly for basic structures such as centromeres that are epigenetically determined.
Affiliation(s)
- Benoît Piégu
- PRC, UMR INRA0085, CNRS 7247, Centre INRA Val de Loire, 37380, Nouzilly, France
- Peter Arensburger
- Biological Sciences Department, California State Polytechnic University, Pomona, CA, 91768, USA
- Florian Guillou
- PRC, UMR INRA0085, CNRS 7247, Centre INRA Val de Loire, 37380, Nouzilly, France
- Yves Bigot
- PRC, UMR INRA0085, CNRS 7247, Centre INRA Val de Loire, 37380, Nouzilly, France.
|
28
|
Glasner P, Johnson SD, Leitner M. A comparative analysis to forecast apartment burglaries in Vienna, Austria, based on repeat and near repeat victimization. Crime Sci 2018; 7:9. [PMID: 30956933 PMCID: PMC6417393 DOI: 10.1186/s40163-018-0083-7] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 05/17/2018] [Accepted: 08/14/2018] [Indexed: 06/09/2023]
Abstract
In this paper, we introduce two methods to forecast apartment burglaries that are based on repeat and near repeat victimization. While the first approach, the "heuristic method", generates buffer areas around each new apartment burglary, the second approach concentrates on forecasting near repeat chain links. These near repeat chain links are events that follow a near repeat pair, that is, an originating event and a (near) repeat event that are close in space and in time. We name this approach the "near repeat chain method". This research analyzes apartment burglaries from November 2013 to November 2016 in Vienna, Austria. The overall research goal is to investigate whether the near repeat chain method shows better prediction efficiencies (using a capture rate and the prediction accuracy index) while producing fewer prediction areas. Results show that the near repeat chain method is more efficient than the heuristic method for all bandwidth combinations analyzed in this research.
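Note on the two building blocks named in this abstract: a near repeat pair is a later event that falls within a spatial and a temporal bandwidth of an earlier one, and the prediction accuracy index (PAI) is commonly defined as the share of crimes captured divided by the share of the study area flagged. The sketch below is a minimal illustration under those definitions, not the authors' implementation; the function names, bandwidths and toy events are assumptions, and same-day events are excluded here purely as a simplification.

```python
from math import hypot


def near_repeat_pairs(events, max_dist, max_days):
    """Return index pairs (i, j) where event j follows event i closely.

    events is a list of (x, y, day) tuples. Event j pairs with event i
    if it occurs 1..max_days days later and within max_dist of it.
    """
    pairs = []
    for i, (xi, yi, ti) in enumerate(events):
        for j, (xj, yj, tj) in enumerate(events):
            if 0 < tj - ti <= max_days and hypot(xj - xi, yj - yi) <= max_dist:
                pairs.append((i, j))
    return pairs


def prediction_accuracy_index(hits, total_crimes, flagged_area, total_area):
    """PAI = (hits / total_crimes) / (flagged_area / total_area)."""
    return (hits / total_crimes) / (flagged_area / total_area)


# Toy events (hypothetical coordinates in metres, time in days)
events = [(0, 0, 0), (50, 0, 3), (1000, 1000, 4), (60, 10, 20)]
pairs = near_repeat_pairs(events, max_dist=200, max_days=7)

# Toy forecast: 10 of 40 burglaries fall in 1/8 of the study area
pai = prediction_accuracy_index(hits=10, total_crimes=40,
                                flagged_area=1, total_area=8)
```

A PAI above 1 means the flagged areas capture more crime than their share of the map would suggest, which is why producing fewer prediction areas at the same capture rate raises the index.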
Affiliation(s)
- Philip Glasner
- Department of Geoinformatics–Z_GIS, University of Salzburg, Salzburg, Austria
- SynerGIS Informationssysteme GmbH, Vienna, Austria
- Shane D. Johnson
- Jill Dando Institute of Security and Crime Science, University College London, London, UK
- Michael Leitner
- Department of Geoinformatics–Z_GIS, University of Salzburg, Salzburg, Austria
- Department of Geography and Anthropology, Louisiana State University, Baton Rouge, LA USA
|
29
|
Dong S, Zhao C, Chen F, Liu Y, Zhang S, Wu H, Zhang L, Liu Y. The complete mitochondrial genome of the early flowering plant Nymphaea colorata is highly repetitive with low recombination. BMC Genomics 2018; 19:614. [PMID: 30107780 PMCID: PMC6092842 DOI: 10.1186/s12864-018-4991-4] [Citation(s) in RCA: 73] [Impact Index Per Article: 12.2] [Received: 12/21/2017] [Accepted: 08/02/2018] [Indexed: 12/22/2022] Open
Abstract
BACKGROUND Mitochondrial genomes of flowering plants (angiosperms) are highly dynamic in genome structure. The mitogenome of the earliest angiosperm Amborella is remarkable in carrying rampant foreign DNAs, in contrast to Liriodendron, the only other known early angiosperm mitogenome, which is described as 'fossilized'. The distinctive features observed in the two early flowering plant mitogenomes add to the current confusion about what early flowering plant mitogenomes look like. Expanded sampling would provide more details for understanding the mitogenomic evolution of early angiosperms. Here we report the complete mitochondrial genome of the water lily Nymphaea colorata from Nymphaeales, one of the three orders of the earliest angiosperms. RESULTS Assembly of data from PacBio long-read sequencing yielded a circular mitochondrial chromosome of 617,195 bp with an average depth of 601×. The genome encoded 41 protein coding genes, 20 tRNA and three rRNA genes with 25 group II introns disrupting 10 protein coding genes. Nearly half of the genome is composed of repeated sequences, which contributed substantially to the intron size expansion, making the gross intron length of the Nymphaea mitochondrial genome one of the longest among angiosperms, including an 11.4-kb intron in cox2, which is the longest organellar intron reported to date in plants. Nevertheless, repeat-mediated homologous recombination is unexpectedly low in Nymphaea, as evidenced by 74 recombined reads detected from ten recombinationally active repeat pairs among 886,982 repeat pairs examined. Extensive gene order changes were detected in the three early angiosperm mitogenomes, i.e. 38 or 44 events of inversions and translocations are needed to reconcile the mitogenome of Nymphaea with Amborella or Liriodendron, respectively. In contrast to Amborella with six genome equivalents of foreign mitochondrial DNA, not a single horizontal gene transfer event was observed in the Nymphaea mitogenome. 
CONCLUSIONS The Nymphaea mitogenome resembles the other available early angiosperm mitogenomes in its similarly rich 64-coding gene set and many conserved gene clusters, but stands out by its highly repetitive nature and resultant remarkable intron expansions. The low recombination level in Nymphaea provides evidence for the predominant master conformation in vivo with a highly substoichiometric set of rearranged molecules.
Affiliation(s)
- Shanshan Dong
  - Fairylake Botanical Garden, Shenzhen & Chinese Academy of Sciences, Shenzhen, China
  - College of Life Sciences, South China Agricultural University, Guangzhou, China
- Chaoxian Zhao
  - Fairylake Botanical Garden, Shenzhen & Chinese Academy of Sciences, Shenzhen, China
  - Department of Biology, School of Life Sciences, East China Normal University, Shanghai, China
- Fei Chen
  - State Key Laboratory of Ecological Pest Control for Fujian and Taiwan Crops, Fujian Agriculture and Forestry University, Fuzhou, China
  - Ministry of Education Key Laboratory of Genetics, Breeding and Multiple Utilization of Crops, Fujian Agriculture and Forestry University, Fuzhou, China
  - Fujian Provincial Key Laboratory of Haixia Applied Plant Systems Biology, Fujian Agriculture and Forestry University, Fuzhou, China
- Yanhui Liu
  - State Key Laboratory of Ecological Pest Control for Fujian and Taiwan Crops, Fujian Agriculture and Forestry University, Fuzhou, China
  - Ministry of Education Key Laboratory of Genetics, Breeding and Multiple Utilization of Crops, Fujian Agriculture and Forestry University, Fuzhou, China
  - Fujian Provincial Key Laboratory of Haixia Applied Plant Systems Biology, Fujian Agriculture and Forestry University, Fuzhou, China
- Shouzhou Zhang
  - Fairylake Botanical Garden, Shenzhen & Chinese Academy of Sciences, Shenzhen, China
- Hong Wu
  - College of Life Sciences, South China Agricultural University, Guangzhou, China
- Liangsheng Zhang
  - State Key Laboratory of Ecological Pest Control for Fujian and Taiwan Crops, Fujian Agriculture and Forestry University, Fuzhou, China
  - Ministry of Education Key Laboratory of Genetics, Breeding and Multiple Utilization of Crops, Fujian Agriculture and Forestry University, Fuzhou, China
  - Fujian Provincial Key Laboratory of Haixia Applied Plant Systems Biology, Fujian Agriculture and Forestry University, Fuzhou, China
- Yang Liu
  - Fairylake Botanical Garden, Shenzhen & Chinese Academy of Sciences, Shenzhen, China
  - BGI-Shenzhen, Shenzhen, 518083 China
|
30
|
Zhang H, Jin J, Moore MJ, Yi T, Li D. Plastome characteristics of Cannabaceae. Plant Divers 2018; 40:127-137. [PMID: 30175293 PMCID: PMC6114266 DOI: 10.1016/j.pld.2018.04.003] [Citation(s) in RCA: 20] [Impact Index Per Article: 3.3] [Received: 03/17/2018] [Revised: 04/11/2018] [Accepted: 04/18/2018] [Indexed: 05/02/2023]
Abstract
Cannabaceae is an economically important family that includes ten genera and ca. 117 accepted species. To explore the structure and size variation of their plastomes, we sequenced ten plastomes representing all ten genera of Cannabaceae. Each plastome possessed the typical angiosperm quadripartite structure and contained a total of 128 genes. The Inverted Repeat (IR) regions in five plastomes had experienced small expansions (330-983 bp) into the Large Single-Copy (LSC) region. The plastome of Chaetachme aristata has experienced a 942-bp IR contraction and lost rpl22 and rps19 in its IRs. The substitution rates of rps19 and rpl22 decreased after they shifted from the LSC to the IR. A 270-bp inversion was detected in the Parasponia rugosa plastome, which might have been mediated by 18-bp inverted repeats. Repeat sequences, simple sequence repeats, and nucleotide substitution rates varied among these plastomes. Molecular markers with more than 13% variable sites and 5% parsimony-informative sites were identified, which may be useful for further phylogenetic analysis and species identification. Our results show strong support for a sister relationship between Gironniera and Lozanella (BS = 100). Celtis, Cannabis-Humulus, Chaetachme-Pteroceltis, and Trema-Parasponia formed a strongly supported clade, and their relationships were well resolved with strong support (BS = 100). The availability of these ten plastomes provides valuable genetic information for accurately identifying species, clarifying taxonomy and reconstructing the intergeneric phylogeny of Cannabaceae.
Affiliation(s)
- Huanlei Zhang
  - Germplasm Bank of Wild Species, Kunming Institute of Botany, Chinese Academy of Sciences, Kunming 650201, China
  - Kunming College of Life Sciences, University of Chinese Academy of Sciences, Kunming 650201, China
- Jianjun Jin
  - Germplasm Bank of Wild Species, Kunming Institute of Botany, Chinese Academy of Sciences, Kunming 650201, China
  - Kunming College of Life Sciences, University of Chinese Academy of Sciences, Kunming 650201, China
- Tingshuang Yi
  - Germplasm Bank of Wild Species, Kunming Institute of Botany, Chinese Academy of Sciences, Kunming 650201, China
  - Corresponding author.
- Dezhu Li
  - Germplasm Bank of Wild Species, Kunming Institute of Botany, Chinese Academy of Sciences, Kunming 650201, China
  - Corresponding author.
|
31
|
Nagy YI, Hussein MMM, Ragab YM, Attia AS. Isogenic mutations in the Moraxella catarrhalis CydDC system display pleiotropic phenotypes and reveal the role of a palindrome sequence in its transcriptional regulation. Microbiol Res 2017. [PMID: 28647125 DOI: 10.1016/j.micres.2017.06.002] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 01/08/2023]
Abstract
Moraxella catarrhalis is becoming an important human respiratory tract pathogen affecting a significant proportion of the population. However, little is still known about its physiology and molecular regulation. To this end, CydDC, a heterodimeric ATP-binding cassette transporter that has been shown to contribute to the maintenance of redox homeostasis across the periplasm in other Gram-negative bacteria, is studied here. Multiple sequence alignments of amino acid sequences indicated that M. catarrhalis CydC differs from the CydC proteins of the bacterial species in which this system has been previously studied. These findings prompted further interest in studying this system in M. catarrhalis. An isogenic mutant in the CydDC system showed a suppressed growth rate, hypersensitivity to oxidative and reductive stress, and increased accumulation of intracellular cysteine. In addition, growth of the cydC- mutant was hypersensitive to exogenous cysteine; however, the mutant did not differ significantly from its wild-type counterpart in the murine pulmonary clearance model. Moreover, a palindrome was detected 94 bp upstream of the cydD ORF, suggesting that it might act as a regulatory element. Real-time reverse transcription-PCR analysis showed that deleting or altering the palindrome changed the transcription levels of cydC. A better understanding of this system and its regulation should help in developing better ways to combat M. catarrhalis infections.
Affiliation(s)
- Yosra I Nagy
  - Department of Microbiology and Immunology, Faculty of Pharmacy, Cairo University, Cairo, 11562, Egypt
- Manal M M Hussein
  - Department of Microbiology and Immunology, Faculty of Pharmacy, Cairo University, Cairo, 11562, Egypt
- Yasser M Ragab
  - Department of Microbiology and Immunology, Faculty of Pharmacy, Cairo University, Cairo, 11562, Egypt
- Ahmed S Attia
  - Department of Microbiology and Immunology, Faculty of Pharmacy, Cairo University, Cairo, 11562, Egypt.
|
32
|
Ye N, Wang X, Li J, Bi C, Xu Y, Wu D, Ye Q. Assembly and comparative analysis of complete mitochondrial genome sequence of an economic plant Salix suchowensis. PeerJ 2017; 5:e3148. [PMID: 28367378 PMCID: PMC5374973 DOI: 10.7717/peerj.3148] [Citation(s) in RCA: 32] [Impact Index Per Article: 4.6] [Received: 01/05/2017] [Accepted: 03/05/2017] [Indexed: 11/20/2022] Open
Abstract
Willow is a widely used dioecious woody plant of the Salicaceae family in China. Owing to their high biomass yields, willows are promising bioenergy crops. In this study, we assembled the complete mitochondrial (mt) genome sequence of S. suchowensis, with a length of 644,437 bp, using Roche-454 GS FLX Titanium sequencing technology. The base composition of the S. suchowensis mt genome is A (27.43%), T (27.59%), C (22.34%), and G (22.64%), giving a GC content (44.98%) comparable to that of other angiosperms. This long circular mt genome encodes 58 unique genes (32 protein-coding genes, 23 tRNA genes and 3 rRNA genes), and 9 of the 32 protein-coding genes contain 17 introns. Phylogenetic analysis of 35 species based on 23 protein-coding genes supports Salix as sister to Populus. With the detailed phylogenetic information and the identification of phylogenetic position, some ribosomal protein genes and succinate dehydrogenase genes are found to have been frequently lost during evolution. As S. suchowensis is a native shrub willow species, this study of its mt genome will provide useful information for better understanding genomic breeding and the missing pieces of sex determination evolution.
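The GC content implied by the reported base composition can be checked with a few lines of arithmetic. The following is a minimal sketch in Python; the function names are illustrative assumptions and are not taken from the article, and only the four percentages come from the abstract.

```python
# Base composition (percent) as reported in the abstract above.
composition = {"A": 27.43, "T": 27.59, "C": 22.34, "G": 22.64}


def gc_content(comp):
    """Return GC content (percent) from a per-base composition mapping."""
    total = sum(comp.values())
    return 100.0 * (comp["G"] + comp["C"]) / total


def gc_content_from_sequence(seq):
    """Return GC content (percent) of a DNA string, counting only G and C."""
    seq = seq.upper()
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


# Hypothetical usage: the first call uses the reported percentages,
# the second a made-up toy sequence.
reported_gc = round(gc_content(composition), 2)
toy_gc = gc_content_from_sequence("GGCCAATT")
```

The second helper is included only to show the same calculation on a raw sequence; the toy string is not real data.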
Affiliation(s)
- Ning Ye
  - College of Information Science and Technology, Nanjing Forestry University, Nanjing, Jiangsu, China
- Xuelin Wang
  - College of Information Science and Technology, Nanjing Forestry University, Nanjing, Jiangsu, China
- Juan Li
  - School of Electrical and Automatic Engineering, Nanjing Normal University, Nanjing, Jiangsu, China
- Changwei Bi
  - School of Biological Science and Medical Engineering, Southeast University, Nanjing, Jiangsu, China
- Yiqing Xu
  - College of Information Science and Technology, Nanjing Forestry University, Nanjing, Jiangsu, China
- Dongyang Wu
  - College of Forest Resources and Environment, Nanjing Forestry University, Nanjing, Jiangsu, China
- Qiaolin Ye
  - College of Information Science and Technology, Nanjing Forestry University, Nanjing, Jiangsu, China
|
33
|
Lima L, Sinaimeri B, Sacomoto G, Lopez-Maestre H, Marchet C, Miele V, Sagot MF, Lacroix V. Playing hide and seek with repeats in local and global de novo transcriptome assembly of short RNA-seq reads. Algorithms Mol Biol 2017; 12:2. [PMID: 28250805 PMCID: PMC5322684 DOI: 10.1186/s13015-017-0091-2] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.1] [Received: 07/27/2016] [Accepted: 01/27/2017] [Indexed: 11/30/2022] Open
Abstract
BACKGROUND The main challenge in de novo genome assembly of DNA-seq data is certainly to deal with repeats that are longer than the reads. In de novo transcriptome assembly of RNA-seq reads, on the other hand, this problem has been underestimated so far. Even though we have fewer and shorter repeated sequences in transcriptomics, they do create ambiguities and confuse assemblers if not addressed properly. Most transcriptome assemblers of short reads are based on de Bruijn graphs (DBG) and have no clear and explicit model for repeats in RNA-seq data, relying instead on heuristics to deal with them. RESULTS The results of this work are threefold. First, we introduce a formal model for representing high copy-number and low-divergence repeats in RNA-seq data and exploit its properties to infer a combinatorial characteristic of repeat-associated subgraphs. We show that the problem of identifying such subgraphs in a DBG is NP-complete. Second, we show that in the specific case of local assembly of alternative splicing (AS) events, we can implicitly avoid such subgraphs, and we present an efficient algorithm to enumerate AS events that are not included in repeats. Using simulated data, we show that this strategy is significantly more sensitive and precise than the previous version of KisSplice (Sacomoto et al. in WABI, pp 99-111, 1), Trinity (Grabherr et al. in Nat Biotechnol 29(7):644-652, 2), and Oases (Schulz et al. in Bioinformatics 28(8):1086-1092, 3), for the specific task of calling AS events. Third, we turn our focus to full-length transcriptome assembly, and we show that exploring the topology of DBGs can improve de novo transcriptome evaluation methods. Based on the observation that repeats create complicated regions in a DBG, and when assemblers try to traverse these regions, they can infer erroneous transcripts, we propose a measure to flag transcripts traversing such troublesome regions, thereby giving a confidence level for each transcript. 
The originality of our work when compared to other transcriptome evaluation methods is that we use only the topology of the DBG, and not read nor coverage information. We show that our simple method gives better results than Rsem-Eval (Li et al. in Genome Biol 15(12):553, 4) and TransRate (Smith-Unna et al. in Genome Res 26(8):1134-1144, 5) on both real and simulated datasets for detecting chimeras, and therefore is able to capture assembly errors missed by these methods.
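To picture why repeats complicate a de Bruijn graph, a toy example may help. The sketch below is not the formal model or the algorithm of the article: it only builds a small DBG from two made-up reads that share a repeated stretch and lists the nodes where the graph branches. The function names, the reads and the choice k = 4 are all illustrative assumptions.

```python
from collections import defaultdict


def build_dbg(reads, k):
    """Build a de Bruijn graph: nodes are (k-1)-mers, one edge per distinct k-mer."""
    graph = defaultdict(set)
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            graph[kmer[:-1]].add(kmer[1:])
    return graph


def branching_nodes(graph):
    """Return the nodes with more than one successor, in sorted order."""
    return sorted(node for node, succs in graph.items() if len(succs) > 1)


# Two hypothetical reads from different transcripts sharing the stretch GATTACA.
reads = ["AAGATTACACC", "TTGATTACAGG"]
graph = build_dbg(reads, k=4)
ambiguous = branching_nodes(graph)
```

In this toy graph the two reads merge when they enter the shared stretch and split again when they leave it, so an assembler walking through it cannot tell from topology alone which entry belongs with which exit. Chimeric transcripts can arise from exactly this kind of ambiguity.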
Affiliation(s)
- Leandro Lima
  - Inria Grenoble, 655, Avenue de l’Europe, 38334 Montbonnot, France
  - CNRS, UMR5558, Université Claude Bernard Lyon 1, 43, Boulevard du 11 Novembre 1918, 69622 Villeurbanne, France
- Blerina Sinaimeri
  - Inria Grenoble, 655, Avenue de l’Europe, 38334 Montbonnot, France
  - CNRS, UMR5558, Université Claude Bernard Lyon 1, 43, Boulevard du 11 Novembre 1918, 69622 Villeurbanne, France
- Gustavo Sacomoto
  - Inria Grenoble, 655, Avenue de l’Europe, 38334 Montbonnot, France
  - CNRS, UMR5558, Université Claude Bernard Lyon 1, 43, Boulevard du 11 Novembre 1918, 69622 Villeurbanne, France
- Helene Lopez-Maestre
  - Inria Grenoble, 655, Avenue de l’Europe, 38334 Montbonnot, France
  - CNRS, UMR5558, Université Claude Bernard Lyon 1, 43, Boulevard du 11 Novembre 1918, 69622 Villeurbanne, France
- Camille Marchet
  - IRISA Inria Rennes Bretagne Atlantique; GenScale Team, Université Rennes 1, 263, Avenue Général Leclerc, 35042 Rennes, France
- Vincent Miele
  - CNRS, UMR5558, Université Claude Bernard Lyon 1, 43, Boulevard du 11 Novembre 1918, 69622 Villeurbanne, France
- Marie-France Sagot
  - Inria Grenoble, 655, Avenue de l’Europe, 38334 Montbonnot, France
  - CNRS, UMR5558, Université Claude Bernard Lyon 1, 43, Boulevard du 11 Novembre 1918, 69622 Villeurbanne, France
- Vincent Lacroix
  - Inria Grenoble, 655, Avenue de l’Europe, 38334 Montbonnot, France
  - CNRS, UMR5558, Université Claude Bernard Lyon 1, 43, Boulevard du 11 Novembre 1918, 69622 Villeurbanne, France
|
34
|
Kowar T, Zakrzewski F, Macas J, Kobližková A, Viehoever P, Weisshaar B, Schmidt T. Repeat Composition of CenH3-chromatin and H3K9me2-marked heterochromatin in Sugar Beet (Beta vulgaris). BMC Plant Biol 2016; 16:120. [PMID: 27230558 PMCID: PMC4881148 DOI: 10.1186/s12870-016-0805-5] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.6] [Received: 12/07/2015] [Accepted: 05/17/2016] [Indexed: 05/18/2023]
Abstract
BACKGROUND Sugar beet (Beta vulgaris) is an important crop of temperate climate zones, which provides nearly 30 % of the world's annual sugar needs. From the total genome size of 758 Mb, only 567 Mb were incorporated in the recently published genome sequence, due to the fact that regions with high repetitive DNA contents (e.g. satellite DNAs) are only partially included. Therefore, to fill these gaps and to gain information about the repeat composition of centromeres and heterochromatic regions, we performed chromatin immunoprecipitation followed by sequencing (ChIP-Seq) using antibodies against the centromere-specific histone H3 variant of sugar beet (CenH3) and the heterochromatic mark of dimethylated lysine 9 of histone H3 (H3K9me2). RESULTS ChIP-Seq analysis revealed that active centromeres containing CenH3 consist of the satellite pBV and the Ty3-gypsy retrotransposon Beetle7, while heterochromatin marked by H3K9me2 exhibits heterogeneity in repeat composition. H3K9me2 was mainly associated with the satellite family pEV, the Ty1-copia retrotransposon family Cotzilla and the DNA transposon superfamily of the En/Spm type. In members of the section Beta within the genus Beta, immunostaining using the CenH3 antibody was successful, indicating that orthologous CenH3 proteins are present in closely related species within this section. CONCLUSIONS The identification of repetitive genome portions by ChIP-Seq experiments complemented the sugar beet reference sequence by providing insights into the repeat composition of poorly characterized CenH3-chromatin and H3K9me2-heterochromatin. Therefore, our work provides the basis for future research and application concerning the sugar beet centromere and repeat-rich heterochromatic regions characterized by the presence of H3K9me2.
Affiliation(s)
- Teresa Kowar
  - Department of Plant Cell and Molecular Biology, TU Dresden, Dresden, D-01062, Germany
- Falk Zakrzewski
  - Department of Plant Cell and Molecular Biology, TU Dresden, Dresden, D-01062, Germany
- Jiří Macas
  - Biology Centre ASCR, Institute of Plant Molecular Biology, Branišovská 31, České Budějovice, CZ-37005, Czech Republic
- Andrea Kobližková
  - Biology Centre ASCR, Institute of Plant Molecular Biology, Branišovská 31, České Budějovice, CZ-37005, Czech Republic
- Prisca Viehoever
  - CeBiTec & Faculty of Biology, Bielefeld University, Universitätsstr. 25, Bielefeld, D-33615, Germany
- Bernd Weisshaar
  - CeBiTec & Faculty of Biology, Bielefeld University, Universitätsstr. 25, Bielefeld, D-33615, Germany.
- Thomas Schmidt
  - Department of Plant Cell and Molecular Biology, TU Dresden, Dresden, D-01062, Germany
|
35
|
Abstract
Next-generation sequencing (NGS) technologies have rapidly evolved in the last 5 years, leading to the generation of millions of short reads in a single run. Consequently, various sequence alignment algorithms have been developed to compare these reads to an appropriate reference in order to perform important downstream analysis. SOAP2 from the SOAP series is one of the most commonly used alignment programs to handle NGS data, and it efficiently does so using low computer memory usage and fast alignment speed. This chapter describes the protocol used to align short reads to a reference genome using SOAP2, and highlights the significance of using the in-built command-line options to tune the behavior of the algorithm according to the inputs and the desired results.
|
36
|
Abstract
Transposable elements (TEs) have recently been shown to have many regulatory roles within the genome. In this chapter, we will examine two in silico methods for analyzing TEs and identifying families that may have acquired such functions. The first method will look at how the overrepresentation of a repeat family in a set of genomic features can be discovered. The example situation of OCT4 binding sites originating from LTR7 TE sequences will be used to show how this method could be applied. The second method will describe how to determine if a TE family exhibits a cell type-specific expression pattern. As an example, we will look at the expression of HERV-H, an endogenous retrovirus known to act as an lncRNA in embryonic stem cells. We will use this example to demonstrate how RNA-seq data can be used to compare cell type expression of repeats.
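One common way to test whether a repeat family is overrepresented in a set of genomic features is a one-sided hypergeometric test. The sketch below assumes that approach; the abstract does not say which statistic the chapter uses. The function name and all counts are hypothetical, and treating candidate regions as equally likely draws is a simplification.

```python
from math import comb


def hypergeom_sf(k, N, K, n):
    """P(X >= k) when n items are drawn without replacement from N items,
    K of which carry the property of interest (here: overlap a TE family)."""
    upper = min(K, n)
    favourable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favourable / comb(N, n)


# Hypothetical counts, NOT taken from the chapter:
# N candidate regions in total, K of them overlapping the repeat family,
# n regions bound by the factor, k of those overlapping the repeat family.
N, K, n, k = 1000, 50, 100, 20
p_value = hypergeom_sf(k, N, K, n)
```

A small `p_value` would suggest that bound regions overlap the family more often than chance alone predicts. A real analysis would also need a background that reflects mappability and region length.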
Affiliation(s)
- LeeAnn Ramsay
  - Department of Human Genetics, McGill University, Montréal, QC, Canada, H3A 1A4
- Guillaume Bourque
  - Department of Human Genetics, McGill University, Montréal, QC, Canada, H3A 1A4.
  - McGill University and Génome Québec Innovation Center, Montréal, QC, Canada, H3A 1A4.
|
37
|
Abstract
Plant genomes contain a particularly high proportion of repeated structures of various types. This chapter proposes a guided tour of available software that can help biologists to look for these repeats and check some hypothetical models intended to characterize their structures. Since transposable elements are a major source of repeats in plants, many methods have been used or developed for this large class of sequences. They are representative of the range of tools available for other classes of repeats and we have provided a whole section on this topic as well as a selection of the main existing software. In order to better understand how they work and how repeats may be efficiently found in genomes, it is necessary to look at the technical issues involved in the large-scale search of these structures. Indeed, it may be hard to keep up with the profusion of proposals in this dynamic field and the rest of the chapter is devoted to the foundations of the search for repeats and more complex patterns. The second section introduces the key concepts that are useful for understanding the current state of the art in playing with words, applied to genomic sequences. This can be seen as the first stage of a very general approach called linguistic analysis that is interested in the analysis of natural or artificial texts. Words, the lexical level, correspond to simple repeated entities in texts or strings. In fact, biologists need to represent more complex entities where a repeat family is built on more abstract structures, including direct or inverted small repeats, motifs, composition constraints as well as ordering and distance constraints between these elementary blocks. In terms of linguistics, this corresponds to the syntactic level of a language. The last section introduces concepts and practical tools that can be used to reach this syntactic level in biological sequence analysis.
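The lexical level described above, where words are simple repeated entities in a string, can be illustrated by indexing every word of length k and keeping those that occur more than once. This is a minimal sketch under that reading, not one of the tools surveyed in the chapter; the function name and the toy sequence are assumptions.

```python
from collections import defaultdict


def repeated_words(seq, k):
    """Map each length-k word occurring more than once in seq to its start positions."""
    positions = defaultdict(list)
    for i in range(len(seq) - k + 1):
        positions[seq[i:i + k]].append(i)
    return {word: pos for word, pos in positions.items() if len(pos) > 1}


# Toy sequence (made up): ACGT occurs at positions 0 and 4.
repeats = repeated_words("ACGTACGT", 4)
```

Exact word indexing like this only finds identical copies. The syntactic level discussed in the abstract, with inverted repeats, motifs and distance constraints, needs richer pattern models than this sketch covers.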
Affiliation(s)
- Jacques Nicolas
  - Dyliss Team, Irisa/Inria Centre de Rennes Bretagne Atlantique, Campus de Beaulieu, 35510, Rennes cedex, France.
- Pierre Peterlongo
  - Irisa/Inria Centre de Rennes Bretagne Atlantique, Campus de Beaulieu, 35510, Rennes cedex, France
- Sébastien Tempel
  - LCB, CNRS UMR 7283, 31 Chemin Joseph Aiguier, 13402, Marseille cedex 20, France
|
38
|
Agrawal S, Ganley ARD. Complete Sequence Construction of the Highly Repetitive Ribosomal RNA Gene Repeats in Eukaryotes Using Whole Genome Sequence Data. Methods Mol Biol 2016; 1455:161-181. [PMID: 27576718 DOI: 10.1007/978-1-4939-3792-9_13] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.3] [Indexed: 06/06/2023]
Abstract
The ribosomal RNA genes (rDNA) encode the major rRNA species of the ribosome, and thus are essential across life. These genes are highly repetitive in most eukaryotes, forming blocks of tandem repeats that form the core of nucleoli. The primary role of the rDNA in encoding rRNA has been long understood, but more recently the rDNA has been implicated in a number of other important biological phenomena, including genome stability, cell cycle, and epigenetic silencing. Noncoding elements, primarily located in the intergenic spacer region, appear to mediate many of these phenomena. Although sequence information is available for the genomes of many organisms, in almost all cases rDNA repeat sequences are lacking, primarily due to problems in assembling these intriguing regions during whole genome assemblies. Here, we present a method to obtain complete rDNA repeat unit sequences from whole genome assemblies. Limitations of next generation sequencing (NGS) data make them unsuitable for assembling complete rDNA unit sequences; therefore, the method we present relies on the use of Sanger whole genome sequence data. Our method makes use of the Arachne assembler, which can assemble highly repetitive regions such as the rDNA in a memory-efficient way. We provide a detailed step-by-step protocol for generating rDNA sequences from whole genome Sanger sequence data using Arachne, for refining complete rDNA unit sequences, and for validating the sequences obtained. In principle, our method will work for any species where the rDNA is organized into tandem repeats. This will help researchers working on species without a complete rDNA sequence, those working on evolutionary aspects of the rDNA, and those interested in conducting phylogenetic footprinting studies with the rDNA.
Affiliation(s)
- Saumya Agrawal
  - Institute of Natural and Mathematical Sciences, Massey University, Private Bag 102-904, Auckland, 0632, New Zealand.
  - School of Biological Sciences, University of Auckland, Auckland, New Zealand.
- Austen R D Ganley
  - Institute of Natural and Mathematical Sciences, Massey University, Private Bag 102-904, Auckland, 0632, New Zealand.
  - School of Biological Sciences, University of Auckland, Private Bag 92019, Auckland, 1142, New Zealand.
|
39
|
Abstract
Protein structural motifs such as helical assemblies and α/β barrels combine secondary structure elements with various types of interactions. Helix-helix interfaces of assemblies - Ankyrin, ARM/HEAT, PUM, LRR, and TPR repeats - exhibit unique amino acid composition and patterns of interactions that correlate with curvature of solenoids, surface geometry and mutual orientation of the helical edges. Inner rows of ankyrin, ARM/HEAT, and PUM-HD repeats utilize edges (i-1, i) and (i+1, i+2) for the interaction of the given α-helix with preceding and following helices correspondingly, whereas outer rows of these proteins and LRR repeats invert this pattern and utilize edges (i-1, i) and (i-3, i-2). Arrangement of contacts observed in protein ligands that bind helical assemblies has to mimic the assembly pattern to provide the same curvature as a determinant of binding specificity. These characteristics are important for understanding fold recognition, specificity of protein-protein interactions, and design of new drugs and materials.
Affiliation(s)
- Natalya A Kurochkina
  - The School of Theoretical Modeling, 1629 K St NW s 300, Washington, DC 20006, United States.
- Michael J Iadarola
  - Anesthesia Section, Department of Perioperative Medicine, Clinical Center, NIH, Building 10, Room 2C401, 10 Center Drive, MSC 1510, Bethesda, MD 20892, United States.
|
40
|
Kleine B, Ali L, Wobser D, Sakιnç T. The N-terminal repeat and the ligand binding domain A of SdrI protein is involved in hydrophobicity of S. saprophyticus. Microbiol Res 2014; 172:88-94. [PMID: 25497915 DOI: 10.1016/j.micres.2014.11.003] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.7] [Received: 02/25/2014] [Revised: 10/18/2014] [Accepted: 11/20/2014] [Indexed: 11/27/2022]
Abstract
Staphylococcus saprophyticus is an important cause of urinary tract infection, and its cell surface hydrophobicity may contribute to virulence by facilitating adherence of the organism to uroepithelia. S. saprophyticus expresses the surface protein SdrI, a member of the serine-aspartate repeat (SD) protein family, which has multifunctional properties. The SdrI knockout mutant has a reduced hydrophobicity index (HPI) of 25%, and SdrI expressed in the non-hydrophobic Staphylococcus carnosus strain TM300 confers hydrophobicity. Using hydrophobic interaction chromatography (HIC), we confined the hydrophobic site of SdrI to the N-terminal repeat region. S. saprophyticus strains carrying different plasmid constructs lacking either the N-terminal repeats, both B or SD-repeats were less hydrophobic than the wild type and the fully complemented SdrI mutant (HPI: 51%). The surface hydrophobicity and HPI of both the wild type and the complemented strain were also influenced by calcium (Ca2+) and were reduced from 81.3% and 82.4% to 10.9% and 12.3%, respectively. This study confirms that the SdrI protein of S. saprophyticus is a crucial factor for surface hydrophobicity and also gives a first significant functional description of the N-terminal repeats, which in conjunction with the B-repeats form an optimal hydrophobic conformation.
Affiliation(s)
- Britta Kleine
  - Institut für Hygiene und Mikrobiologie, Abteilung für Medizinische Mikrobiologie, Ruhr-Universität Bochum, D-44780 Bochum, Germany.
- Liaqat Ali
  - Division of Infectious Diseases, Department of Internal Medicine II, University Hospital Freiburg, 79106 Freiburg, Germany; Faculty of Biology, Albert Ludwigs University of Freiburg, 79104 Freiburg, Germany.
- Dominique Wobser
  - Division of Infectious Diseases, Department of Internal Medicine II, University Hospital Freiburg, 79106 Freiburg, Germany.
- Türkân Sakιnç
  - Division of Infectious Diseases, Department of Internal Medicine II, University Hospital Freiburg, 79106 Freiburg, Germany.
|
41
|
Kohler TP, Gisch N, Binsker U, Schlag M, Darm K, Völker U, Zähringer U, Hammerschmidt S. Repeating structures of the major staphylococcal autolysin are essential for the interaction with human thrombospondin 1 and vitronectin. J Biol Chem 2013; 289:4070-82. [PMID: 24371140 DOI: 10.1074/jbc.m113.521229] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.0] [Indexed: 01/10/2023] Open
Abstract
Human thrombospondin 1 (hTSP-1) is a matricellular glycoprotein facilitating bacterial adherence to and invasion into eukaryotic cells. However, the bacterial adhesin(s) remain elusive. In this study, we show a dose-dependent binding of soluble hTSP-1 to Gram-positive but not Gram-negative bacteria. Diminished binding of soluble hTSP-1 to proteolytically pretreated staphylococci suggested a proteinaceous nature of potential bacterial adhesin(s) for hTSP-1. A combination of separation of staphylococcal surface proteins by two-dimensional gel electrophoresis with a ligand overlay assay with hTSP-1 and identification of the target protein by mass spectrometry revealed the major staphylococcal autolysin Atl as a bacterial binding protein for hTSP-1. Binding experiments with heterologously expressed repeats of the AtlE amidase from Staphylococcus epidermidis suggest that the repeating sequences (R1ab-R2ab) of the N-acetyl-muramoyl-L-alanine amidase of Atl are essential for binding of hTSP-1. Atl has also been identified previously as a staphylococcal vitronectin (Vn)-binding protein. Similar to the interaction with hTSP-1, the R1ab-R2ab repeats of Atl are shown here to be crucial for the interaction of Atl with the complement inhibition and matrix protein Vn. Competition assays with hTSP-1 and Vn revealed the R1ab-R2ab repeats of AtlE as the common binding domain for both host proteins. Furthermore, Vn competes with hTSP-1 for binding to Atl repeats and vice versa. In conclusion, this study identifies the Atl repeats as bacterial adhesive structures interacting with the human glycoproteins hTSP-1 and Vn. Finally, this study provides insight into the molecular interplay between hTSP-1 and Vn, respectively, and a bacterial autolysin.
|
42
|
Abstract
The non-coding fraction of the human genome, which is approximately 98%, is mainly constituted by repeats. Transpositions, expansions and deletions of these repeat elements contribute to a number of diseases. None of the available databases consolidates information on both tandem and interspersed repeats with the flexibility of FASTA based homology search with reference to disease genes. Repeats in diseases database (RiDs db) is a web accessible relational database, which aids analysis of repeats associated with Mendelian disorders. It is a repository of disease genes, which can be searched by FASTA program or by limited or free-text keywords. Unlike other databases, RiDs db contains the sequences of these genes with access to corresponding information on both interspersed and tandem repeats contained within them, on a unified platform. Comparative analysis of novel or patient sequences with the reference sequences in RiDs db using FASTA search will indicate change in structure of repeats, if any, with a particular disorder. This database also provides links to orthologs in model organisms such as zebrafish, mouse and Drosophila.
Affiliation(s)
- Anurag Chaturvedi
  - Centre for Cellular and Molecular Biology, Habsiguda, Hyderabad - 500007, Andhra Pradesh, India
|