1. Wang YR, Chang SM, Lin JJ, Chen HC, Lee LT, Tsai DY, Lee SD, Lan CY, Chang CR, Chen CF, Ng CS. A comprehensive study of Z-DNA density and its evolutionary implications in birds. BMC Genomics 2024; 25:1123. [PMID: 39573987 PMCID: PMC11580473 DOI: 10.1186/s12864-024-11039-x] [Received: 09/11/2024] [Accepted: 11/13/2024] [Indexed: 11/25/2024]
Abstract
BACKGROUND: Z-DNA, a left-handed helical form of DNA, plays a significant role in genomic stability and gene regulation. Its formation, associated with high GC content and repetitive sequences, is linked to genomic instability, potentially leading to large-scale deletions and contributing to phenotypic diversity and evolutionary adaptation.
RESULTS: In this study, we analyzed the density of Z-DNA-prone motifs of 154 avian genomes using the non-B DNA Motif Search Tool (nBMST). Our findings indicate a higher prevalence of Z-DNA motifs in promoter regions across all avian species compared to other genomic regions. A negative correlation was observed between Z-DNA density and developmental time in birds, suggesting that species with shorter developmental periods tend to have higher Z-DNA densities. This relationship implies that Z-DNA may influence the timing and regulation of development in avian species. Furthermore, Z-DNA density showed associations with traits such as body mass, egg mass, and genome size, highlighting the complex interactions between genome architecture and phenotypic characteristics. Gene Ontology (GO) analysis revealed that Z-DNA motifs are enriched in genes involved in nucleic acid binding, kinase activity, and translation regulation, suggesting a role in fine-tuning gene expression essential for cellular functions and responses to environmental changes. Additionally, the potential of Z-DNA to drive genomic instability and facilitate adaptive evolution underscores its importance in shaping phenotypic diversity.
CONCLUSIONS: This study emphasizes the role of Z-DNA as a dynamic genomic element contributing to gene regulation, genomic stability, and phenotypic diversity in avian species. Future research should experimentally validate these associations and explore the molecular mechanisms by which Z-DNA influences avian biology.
Affiliation(s)
- Yu-Ren Wang
  - Institute of Molecular and Cellular Biology, National Tsing Hua University, Hsinchu, 300044, Taiwan
- Shao-Ming Chang
  - Institute of Molecular and Cellular Biology, National Tsing Hua University, Hsinchu, 300044, Taiwan
- Jinn-Jy Lin
  - National Center for High-performance Computing, National Applied Research Laboratories, Hsinchu, 300092, Taiwan
- Hsiao-Chian Chen
  - Institute of Molecular and Cellular Biology, National Tsing Hua University, Hsinchu, 300044, Taiwan
  - Marine Research Station, Academia Sinica, Yilan, 262204, Taiwan
  - Okinawa Institute of Science and Technology, Okinawa, 904-0495, Japan
- Lo-Tung Lee
  - Institute of Molecular and Cellular Biology, National Tsing Hua University, Hsinchu, 300044, Taiwan
- Dien-Yu Tsai
  - Institute of Molecular and Cellular Biology, National Tsing Hua University, Hsinchu, 300044, Taiwan
- Shih-Da Lee
  - Institute of Molecular and Cellular Biology, National Tsing Hua University, Hsinchu, 300044, Taiwan
- Chung-Yu Lan
  - Institute of Molecular and Cellular Biology, National Tsing Hua University, Hsinchu, 300044, Taiwan
  - Department of Life Science, National Tsing Hua University, Hsinchu, 300044, Taiwan
- Chuang-Rung Chang
  - Institute of Biotechnology, National Tsing Hua University, Hsinchu, 300044, Taiwan
  - Department of Medical Science, National Tsing Hua University, Hsinchu, 300044, Taiwan
  - School of Medicine, National Tsing Hua University, Hsinchu, 300044, Taiwan
- Chih-Feng Chen
  - Department of Animal Sciences, National Chung Hsing University, Taichung, 402202, Taiwan
  - The iEGG and Animal Biotechnology Center, National Chung Hsing University, Taichung, 402202, Taiwan
- Chen Siang Ng
  - Institute of Molecular and Cellular Biology, National Tsing Hua University, Hsinchu, 300044, Taiwan
  - Department of Life Science, National Tsing Hua University, Hsinchu, 300044, Taiwan
  - The iEGG and Animal Biotechnology Center, National Chung Hsing University, Taichung, 402202, Taiwan
  - Bioresource Conservation Research Center, National Tsing Hua University, Hsinchu, 300044, Taiwan

2. Provatas K, Chantzi N, Patsakis M, Nayak A, Mouratidis I, Pavlopoulos GA, Georgakopoulos-Soares I. invertiaDB: A Database of Inverted Repeats Across Organismal Genomes. bioRxiv 2024:2024.11.11.622808. [PMID: 39605716 PMCID: PMC11601276 DOI: 10.1101/2024.11.11.622808] [Indexed: 11/29/2024]
Abstract
Inverted repeats are repetitive elements that can form hairpin and cruciform structures. They are linked to genomic instability; however, they also have various biological functions. Their distribution differs markedly across taxonomic groups in the tree of life, and they exhibit high polymorphism due to their inherent genomic instability. Advances in sequencing technologies and declining costs have enabled the generation of an ever-growing number of complete genomes for organisms across taxonomic groups in the tree of life. However, a comprehensive database encompassing inverted repeats across diverse organismal genomes has been lacking. We present invertiaDB, the first comprehensive database of inverted repeats spanning multiple taxa, featuring repeats identified in the genomes of 118,070 organisms across all major taxonomic groups. The database currently hosts 30,067,666 inverted repeat sequences, serving as a centralized, user-friendly repository to perform searches, interactive visualization, and download existing inverted repeat data for independent analysis. invertiaDB is implemented as a web portal for browsing, analyzing and downloading inverted repeat data. invertiaDB is publicly available at https://invertiadb.netlify.app/homepage.html.
Affiliation(s)
- Kimonas Provatas
  - Institute for Personalized Medicine, Department of Biochemistry and Molecular Biology, The Pennsylvania State University College of Medicine, Hershey, PA, USA
  - Huck Institute of the Life Sciences, Pennsylvania State University, University Park, PA, USA
- Nikol Chantzi
  - Institute for Personalized Medicine, Department of Biochemistry and Molecular Biology, The Pennsylvania State University College of Medicine, Hershey, PA, USA
  - Huck Institute of the Life Sciences, Pennsylvania State University, University Park, PA, USA
- Michail Patsakis
  - Institute for Personalized Medicine, Department of Biochemistry and Molecular Biology, The Pennsylvania State University College of Medicine, Hershey, PA, USA
  - Huck Institute of the Life Sciences, Pennsylvania State University, University Park, PA, USA
- Akshatha Nayak
  - Institute for Personalized Medicine, Department of Biochemistry and Molecular Biology, The Pennsylvania State University College of Medicine, Hershey, PA, USA
  - Huck Institute of the Life Sciences, Pennsylvania State University, University Park, PA, USA
- Ioannis Mouratidis
  - Institute for Personalized Medicine, Department of Biochemistry and Molecular Biology, The Pennsylvania State University College of Medicine, Hershey, PA, USA
  - Huck Institute of the Life Sciences, Pennsylvania State University, University Park, PA, USA
- Ilias Georgakopoulos-Soares
  - Institute for Personalized Medicine, Department of Biochemistry and Molecular Biology, The Pennsylvania State University College of Medicine, Hershey, PA, USA
  - Huck Institute of the Life Sciences, Pennsylvania State University, University Park, PA, USA

3. Guo Z, Wang S, Wang Y, Wang Z, Ou G. A machine learning enhanced EMS mutagenesis probability map for efficient identification of causal mutations in Caenorhabditis elegans. PLoS Genet 2024; 20:e1011377. [PMID: 39186782 PMCID: PMC11379379 DOI: 10.1371/journal.pgen.1011377] [Received: 04/12/2024] [Revised: 09/06/2024] [Accepted: 07/27/2024] [Indexed: 08/28/2024]
Abstract
Chemical mutagenesis-driven forward genetic screens are pivotal in unveiling gene functions, yet identifying causal mutations behind phenotypes remains laborious, hindering their high-throughput application. Here, we reveal a non-uniform mutation rate caused by Ethyl Methane Sulfonate (EMS) mutagenesis in the C. elegans genome, indicating that mutation frequency is influenced by proximate sequence context and chromatin status. Leveraging these factors, we developed a machine learning enhanced pipeline to create a comprehensive EMS mutagenesis probability map for the C. elegans genome. This map operates on the principle that causative mutations are enriched in genetic screens targeting specific phenotypes among random mutations. Applying this map to Whole Genome Sequencing (WGS) data of genetic suppressors that rescue a C. elegans ciliary kinesin mutant, we successfully pinpointed causal mutations without generating recombinant inbred lines. This method can be adapted in other species, offering a scalable approach for identifying causal genes and revitalizing the effectiveness of forward genetic screens.
Affiliation(s)
- Zhengyang Guo
  - Tsinghua-Peking Center for Life Sciences, Beijing Frontier Research Center for Biological Structure, McGovern Institute for Brain Research, State Key Laboratory of Membrane Biology, School of Life Sciences and MOE Key Laboratory for Protein Science, Tsinghua University, Beijing, China
- Shimin Wang
  - Tsinghua-Peking Center for Life Sciences, Beijing Frontier Research Center for Biological Structure, McGovern Institute for Brain Research, State Key Laboratory of Membrane Biology, School of Life Sciences and MOE Key Laboratory for Protein Science, Tsinghua University, Beijing, China
- Yang Wang
  - Tsinghua-Peking Center for Life Sciences, Beijing Frontier Research Center for Biological Structure, McGovern Institute for Brain Research, State Key Laboratory of Membrane Biology, School of Life Sciences and MOE Key Laboratory for Protein Science, Tsinghua University, Beijing, China
- Zi Wang
  - Tsinghua-Peking Center for Life Sciences, Beijing Frontier Research Center for Biological Structure, McGovern Institute for Brain Research, State Key Laboratory of Membrane Biology, School of Life Sciences and MOE Key Laboratory for Protein Science, Tsinghua University, Beijing, China
- Guangshuo Ou
  - Tsinghua-Peking Center for Life Sciences, Beijing Frontier Research Center for Biological Structure, McGovern Institute for Brain Research, State Key Laboratory of Membrane Biology, School of Life Sciences and MOE Key Laboratory for Protein Science, Tsinghua University, Beijing, China

4. Aseev LV, Koledinskaya LS, Boni IV. Extraribosomal Functions of Bacterial Ribosomal Proteins-An Update, 2023. Int J Mol Sci 2024; 25:2957. [PMID: 38474204 DOI: 10.3390/ijms25052957] [Received: 01/19/2024] [Revised: 02/19/2024] [Accepted: 02/21/2024] [Indexed: 03/14/2024]
Abstract
Ribosomal proteins (r-proteins) are abundant, highly conserved, and multifaceted cellular proteins in all domains of life. Most r-proteins have RNA-binding properties and can form protein-protein contacts. Bacterial r-proteins govern the co-transcriptional rRNA folding during ribosome assembly and participate in the formation of the ribosome functional sites, such as the mRNA-binding site, tRNA-binding sites, the peptidyl transferase center, and the protein exit tunnel. In addition to their primary role in a cell as integral components of the protein synthesis machinery, many r-proteins can function beyond the ribosome (the phenomenon known as moonlighting), acting either as individual regulatory proteins or in complexes with various cellular components. The extraribosomal activities of r-proteins have been studied over the decades. In the past decade, our understanding of r-protein functions has advanced significantly due to intensive studies on ribosomes and gene expression mechanisms not only in model bacteria like Escherichia coli or Bacillus subtilis but also in little-explored bacterial species from various phyla. The aim of this review is to update information on the multiple functions of r-proteins in bacteria.
Affiliation(s)
- Leonid V Aseev
  - Shemyakin-Ovchinnikov Institute of Bioorganic Chemistry RAS, 117997 Moscow, Russia
- Irina V Boni
  - Shemyakin-Ovchinnikov Institute of Bioorganic Chemistry RAS, 117997 Moscow, Russia

5. Li Z, Liu X, Ning N, Li T, Wang H. Diversity, Distribution, and Chromosomal Rearrangements of TRIP1 Repeat Sequences in Escherichia coli. Genes (Basel) 2024; 15:236. [PMID: 38397225 PMCID: PMC10888264 DOI: 10.3390/genes15020236] [Received: 01/17/2024] [Revised: 02/07/2024] [Accepted: 02/10/2024] [Indexed: 02/25/2024]
Abstract
The bacterial genome contains numerous repeated sequences that greatly affect its genomic plasticity. The Escherichia coli K-12 genome contains three copies of the TRIP1 repeat sequence (TRIP1a, TRIP1b, and TRIP1c). However, the diversity, distribution, and role of the TRIP1 repeat sequence in the E. coli genome are still unclear. In this study, after screening 6725 E. coli genomes, the TRIP1 repeat was found in the majority of E. coli strains (96%: 6454/6725). The copy number and direction of the TRIP1 repeat sequence varied in each genome. Overall, 2449 genomes (36%: 2449/6725) had three copies of TRIP1 (TRIP1a, TRIP1b, and TRIP1c), which is the same as E. coli K-12. Five types of TRIP1 repeats, including two new types (TRIP1d and TRIP1e), are identified in E. coli genomes, located in 4703, 3529, 5741, 1565, and 232 genomes, respectively. Each type of TRIP1 repeat is localized to a specific locus on the chromosome. TRIP1 repeats can cause intra-chromosomal rearrangements. A total of 156 rearrangement events were identified, of which 88% (137/156) were between TRIP1a and TRIP1c. These findings have important implications for future research on TRIP1 repeats.
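The percentages quoted in this abstract can be re-derived from the raw counts it gives. The Python snippet below is only a reader's consistency check, not part of the authors' analysis; the helper name `percent` and the dictionary labels are illustrative.

```python
def percent(part: int, whole: int) -> int:
    """Return part/whole as a percentage rounded to the nearest integer."""
    return round(100 * part / whole)


# (count, total, percentage stated in the abstract)
reported = {
    "genomes carrying TRIP1": (6454, 6725, 96),
    "genomes with three TRIP1 copies": (2449, 6725, 36),
    "rearrangements between TRIP1a and TRIP1c": (137, 156, 88),
}

for label, (part, whole, stated) in reported.items():
    computed = percent(part, whole)
    print(f"{label}: {part}/{whole} = {computed}% (abstract states {stated}%)")
```

All three computed values agree with the rounded figures in the abstract.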
Affiliation(s)
- Zhan Li
  - State Key Laboratory of Pathogens and Biosecurity, Beijing Institute of Microbiology and Epidemiology, No. 20 Dongda Street, Fengtai District, Beijing 100071, China
- Xiong Liu
  - Chinese PLA Center for Disease Control and Prevention, Dongda Street 20#, Fengtai District, Beijing 100071, China
- Nianzhi Ning
  - State Key Laboratory of Pathogens and Biosecurity, Beijing Institute of Microbiology and Epidemiology, No. 20 Dongda Street, Fengtai District, Beijing 100071, China
- Tao Li
  - State Key Laboratory of Pathogens and Biosecurity, Beijing Institute of Microbiology and Epidemiology, No. 20 Dongda Street, Fengtai District, Beijing 100071, China
- Hui Wang
  - State Key Laboratory of Pathogens and Biosecurity, Beijing Institute of Microbiology and Epidemiology, No. 20 Dongda Street, Fengtai District, Beijing 100071, China

6. Rigou S, Schmitt A, Alempic JM, Lartigue A, Vendloczki P, Abergel C, Claverie JM, Legendre M. Pithoviruses Are Invaded by Repeats That Contribute to Their Evolution and Divergence from Cedratviruses. Mol Biol Evol 2023; 40:msad244. [PMID: 37950899 PMCID: PMC10664404 DOI: 10.1093/molbev/msad244] [Received: 02/25/2023] [Revised: 10/31/2023] [Accepted: 11/07/2023] [Indexed: 11/13/2023]
Abstract
Pithoviridae are amoeba-infecting giant viruses possessing the largest viral particles known so far. Since the discovery of Pithovirus sibericum, recovered from a 30,000-yr-old permafrost sample, other pithoviruses, and related cedratviruses, were isolated from various terrestrial and aquatic samples. Here, we report the isolation and genome sequencing of 2 Pithoviridae from soil samples, in addition to 3 other recent isolates. Using the 12 available genome sequences, we conducted a thorough comparative genomic study of the Pithoviridae family to decipher the organization and evolution of their genomes. Our study reveals a nonuniform genome organization in 2 main regions: 1 concentrating core genes and another gene duplications. We also found that Pithoviridae genomes are more conservative than other families of giant viruses, with a low and stable proportion (5% to 7%) of genes originating from horizontal transfers. Genome size variation within the family is mainly due to variations in gene duplication rates (from 14% to 28%) and massive invasion by inverted repeats. While these repeated elements are absent from cedratviruses, repeat-rich regions cover as much as a quarter of the pithoviruses genomes. These regions, identified using a dedicated pipeline, are hotspots of mutations, gene capture events, and genomic rearrangements that contribute to their evolution.
Affiliation(s)
- Sofia Rigou
  - Information Génomique & Structurale, Unité Mixte de Recherche 7256 (Institut de Microbiologie de la Méditerranée, FR3479), IM2B, IOM, Aix–Marseille University, Centre National de la Recherche Scientifique, Marseille 13288 Cedex 9, France
- Alain Schmitt
  - Information Génomique & Structurale, Unité Mixte de Recherche 7256 (Institut de Microbiologie de la Méditerranée, FR3479), IM2B, IOM, Aix–Marseille University, Centre National de la Recherche Scientifique, Marseille 13288 Cedex 9, France
- Jean-Marie Alempic
  - Information Génomique & Structurale, Unité Mixte de Recherche 7256 (Institut de Microbiologie de la Méditerranée, FR3479), IM2B, IOM, Aix–Marseille University, Centre National de la Recherche Scientifique, Marseille 13288 Cedex 9, France
- Audrey Lartigue
  - Information Génomique & Structurale, Unité Mixte de Recherche 7256 (Institut de Microbiologie de la Méditerranée, FR3479), IM2B, IOM, Aix–Marseille University, Centre National de la Recherche Scientifique, Marseille 13288 Cedex 9, France
- Peter Vendloczki
  - Information Génomique & Structurale, Unité Mixte de Recherche 7256 (Institut de Microbiologie de la Méditerranée, FR3479), IM2B, IOM, Aix–Marseille University, Centre National de la Recherche Scientifique, Marseille 13288 Cedex 9, France
- Chantal Abergel
  - Information Génomique & Structurale, Unité Mixte de Recherche 7256 (Institut de Microbiologie de la Méditerranée, FR3479), IM2B, IOM, Aix–Marseille University, Centre National de la Recherche Scientifique, Marseille 13288 Cedex 9, France
- Jean-Michel Claverie
  - Information Génomique & Structurale, Unité Mixte de Recherche 7256 (Institut de Microbiologie de la Méditerranée, FR3479), IM2B, IOM, Aix–Marseille University, Centre National de la Recherche Scientifique, Marseille 13288 Cedex 9, France
- Matthieu Legendre
  - Information Génomique & Structurale, Unité Mixte de Recherche 7256 (Institut de Microbiologie de la Méditerranée, FR3479), IM2B, IOM, Aix–Marseille University, Centre National de la Recherche Scientifique, Marseille 13288 Cedex 9, France

7. Lim SH, Kim DH, Lee JY. Molecular mechanism controlling anthocyanin composition and content in radish plants with different root colors. Plant Physiol Biochem 2023; 204:108091. [PMID: 37864927 DOI: 10.1016/j.plaphy.2023.108091] [Received: 06/01/2023] [Revised: 10/03/2023] [Accepted: 10/11/2023] [Indexed: 10/23/2023]
Abstract
Radish (Raphanus sativus) roots exhibit various colors that reflect their anthocyanin compositions and contents. However, the details of the mechanism linking the expression of anthocyanin biosynthesis genes and their transcriptional regulators to anthocyanin composition in radish roots remained unknown. Here, we characterized the role of the anthocyanin biosynthetic enzyme flavonoid 3'-hydroxylase (RsF3'H), together with the R2R3 MYB transcription factor (TF) RsMYB1 and the basic helix-loop-helix (bHLH) TF TRANSPARENT TESTA 8 (RsTT8), in four radish plants with different root colors: white (W), deep red (DR), dark purple (DP), and dark greyish purple (DGP). The DR plant was heterozygous for RsF3'H, with a low expression level, and accumulated a large amount of pelargonidin, resulting in a deep red color. In contrast, the DP and DGP plants accumulated cyanidin due to the higher expression level of functional RsF3'H. Notably, RsMYB1 and RsTT8 transcripts were abundant in all pigmented roots, but not in white roots. To investigate the differential expression of RsMYB1 and RsTT8, we compared the sequences of their promoter regions among the four radish plants, revealing variations in the numbers of cis-elements and in promoter architecture. Promoter activation assays demonstrated that variation in the RsMYB1 and RsTT8 promoters may contribute to the expression level of these genes, and that RsMYB1 can activate its own expression as well as promote RsTT8 expression. These results suggested that RsF3'H plays a vital role in anthocyanin composition and that the expression levels of both RsMYB1 and RsTT8 are crucial determinants of anthocyanin content in radish roots. Overall, these findings provide insight into the molecular basis of anthocyanin composition and level in radish roots.
Affiliation(s)
- Sun-Hyung Lim
  - Division of Horticultural Biotechnology, School of Biotechnology, Hankyong National University, Anseong, 17579, Republic of Korea; Research Institute of International Technology and Information, Hankyong National University, Anseong, 17579, Republic of Korea
- Da-Hye Kim
  - Division of Horticultural Biotechnology, School of Biotechnology, Hankyong National University, Anseong, 17579, Republic of Korea; Research Institute of International Technology and Information, Hankyong National University, Anseong, 17579, Republic of Korea
- Jong-Yeol Lee
  - National Academy of Agricultural Science, Rural Development Administration, Jeonju, 54874, Republic of Korea

8. Porubiaková O, Havlík J, Indu, Šedý M, Přepechalová V, Bartas M, Bidula S, Šťastný J, Fojta M, Brázda V. Variability of Inverted Repeats in All Available Genomes of Bacteria. Microbiol Spectr 2023; 11:e0164823. [PMID: 37358458 PMCID: PMC10434271 DOI: 10.1128/spectrum.01648-23] [Received: 04/21/2023] [Accepted: 06/03/2023] [Indexed: 06/27/2023]
Abstract
Noncanonical secondary structures in nucleic acids have been studied intensively in recent years. Important biological roles of cruciform structures formed by inverted repeats (IRs) have been demonstrated in diverse organisms, including humans. Using Palindrome analyser, we analyzed IRs in all accessible bacterial genome sequences to determine their frequencies, lengths, and localizations. IR sequences were identified in all species, but their frequencies differed significantly across various evolutionary groups. We detected 242,373,717 IRs in all 1,565 bacterial genomes. The highest mean IR frequency was detected in the Tenericutes (61.89 IRs/kbp) and the lowest mean frequency was found in the Alphaproteobacteria (27.08 IRs/kbp). IRs were abundant near genes and around regulatory, tRNA, transfer-messenger RNA (tmRNA), and rRNA regions, pointing to the importance of IRs in such basic cellular processes as genome maintenance, DNA replication, and transcription. Moreover, we found that organisms with high IR frequencies were more likely to be endosymbiotic, antibiotic producing, or pathogenic. On the other hand, those with low IR frequencies were far more likely to be thermophilic. This first comprehensive analysis of IRs in all available bacterial genomes demonstrates their genomic ubiquity, nonrandom distribution, and enrichment in genomic regulatory regions.
IMPORTANCE: Our manuscript reports for the first time a complete analysis of inverted repeats in all fully sequenced bacterial genomes. Thanks to the availability of unique computational resources, we were able to statistically evaluate the presence and localization of these important regulatory sequences in bacterial genomes. This work revealed a strong abundance of these sequences in regulatory regions and provides researchers with a valuable tool for their manipulation.
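For readers unfamiliar with the term, an inverted repeat is a stretch of sequence followed, after an optional spacer, by its reverse complement. The following Python sketch illustrates that definition and the IRs/kbp density unit quoted in the abstract. It is not the Palindrome analyser algorithm: the function names, the fixed arm length, the exact-match requirement and the spacer limit are all assumptions chosen for brevity.

```python
from typing import List, Tuple

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_inverted_repeats(seq: str, arm_len: int = 6,
                          max_spacer: int = 10) -> List[Tuple[int, int, int]]:
    """List (start, arm_len, spacer) for every exact inverted repeat.

    Hypothetical parameters: arms of fixed length `arm_len`, separated by
    0..max_spacer bases, with no mismatches allowed.
    """
    seq = seq.upper()
    hits = []
    for start in range(len(seq) - 2 * arm_len + 1):
        left = seq[start:start + arm_len]
        if set(left) - set("ACGT"):
            continue  # skip arms containing ambiguous bases such as N
        target = reverse_complement(left)
        for spacer in range(max_spacer + 1):
            right_start = start + arm_len + spacer
            if right_start + arm_len > len(seq):
                break
            if seq[right_start:right_start + arm_len] == target:
                hits.append((start, arm_len, spacer))
    return hits


def ir_frequency_per_kbp(seq: str, **kwargs) -> float:
    """Number of inverted repeats per 1,000 bp of sequence."""
    return 1000 * len(find_inverted_repeats(seq, **kwargs)) / len(seq)


# GGA ... TCC are reverse complements of each other, separated by 2 bases.
print(find_inverted_repeats("GGAAATCC", arm_len=3))
```

Published tools typically also allow mismatches and a range of arm lengths, so densities from this sketch are not comparable with the values reported above.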
Affiliation(s)
- Otília Porubiaková
  - Institute of Biophysics of the Czech Academy of Sciences, Brno, Czech Republic
- Jan Havlík
  - Mendel University in Brno, Brno, Czech Republic
- Indu
  - Institute of Biophysics of the Czech Academy of Sciences, Brno, Czech Republic
  - Department of Experimental Biology, Faculty of Science, Masaryk University, Brno, Czech Republic
- Michal Šedý
  - Brno University of Technology, Faculty of Chemistry, Brno, Czech Republic
- Veronika Přepechalová
  - Institute of Biophysics of the Czech Academy of Sciences, Brno, Czech Republic
  - Brno University of Technology, Faculty of Chemistry, Brno, Czech Republic
- Martin Bartas
  - Department of Biology and Ecology, Faculty of Science, University of Ostrava, Ostrava, Czech Republic
- Stefan Bidula
  - School of Pharmacy, University of East Anglia, Norwich Research Park, Norwich, United Kingdom
- Jiří Šťastný
  - Mendel University in Brno, Brno, Czech Republic
  - Brno University of Technology, Faculty of Mechanical Engineering, Brno, Czech Republic
- Miroslav Fojta
  - Institute of Biophysics of the Czech Academy of Sciences, Brno, Czech Republic
- Václav Brázda
  - Institute of Biophysics of the Czech Academy of Sciences, Brno, Czech Republic
  - Brno University of Technology, Faculty of Chemistry, Brno, Czech Republic

9. Bastos CAC, Afreixo V, Rodrigues JMOS, Pinho AJ. Concentration of inverted repeats along human DNA. J Integr Bioinform 2023; 20:jib-2022-0052. [PMID: 37486620 PMCID: PMC10561070 DOI: 10.1515/jib-2022-0052] [Received: 10/24/2022] [Accepted: 02/27/2023] [Indexed: 07/25/2023]
Abstract
This work aims to describe the observed enrichment of inverted repeats in the human genome; and to identify and describe, with detailed length profiles, the regions with significant and relevant enriched occurrence of inverted repeats. The enrichment is assessed and tested with a recently proposed measure (z-scores based measure). We simulate a genome using an order 7 Markov model trained with the data from the real genome. The simulated genome is used to establish the critical values which are used as decision thresholds to identify the regions with significant enriched concentrations. Several human genome regions are highly enriched in the occurrence of inverted repeats. This is observed in all the human chromosomes. The distribution of inverted repeat lengths varies along the genome. The majority of the regions with severely exaggerated enrichment contain mainly short length inverted repeats. There are also regions with regular peaks along the inverted repeats lengths distribution (periodic regularities) and other regions with exaggerated enrichment for long lengths (less frequent). However, adjacent regions tend to have similar distributions.
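The null-model step described in this abstract (simulating a genome from an order-7 Markov model trained on the real sequence) can be sketched in Python as follows. This is an illustrative reconstruction, not the authors' code: the function names, the uniform fallback for contexts never seen in training, and the toy training sequence are assumptions.

```python
import random
from collections import Counter, defaultdict


def train_markov(seq: str, order: int) -> dict:
    """Count which symbol follows each length-`order` context in seq."""
    counts = defaultdict(Counter)
    for i in range(len(seq) - order):
        counts[seq[i:i + order]][seq[i + order]] += 1
    return counts


def simulate(counts: dict, order: int, length: int, start: str,
             rng: random.Random, alphabet: str = "ACGT") -> str:
    """Generate `length` symbols, beginning with the context `start`."""
    out = list(start[:length])
    while len(out) < length:
        followers = counts.get("".join(out[-order:]))
        if followers:
            # Draw the next symbol in proportion to its training counts.
            symbols = list(followers)
            weights = list(followers.values())
            out.append(rng.choices(symbols, weights=weights)[0])
        else:
            # Assumption: for an unseen context, fall back to a uniform draw.
            out.append(rng.choice(alphabet))
    return "".join(out)


# Toy usage; the setting in the abstract would be order=7 on the real genome.
training = "ACGT" * 50
model = train_markov(training, order=2)
simulated = simulate(model, order=2, length=100, start="AC",
                     rng=random.Random(0))
print(simulated[:20])
```

In the workflow the abstract describes, the enrichment measure would then be computed on the simulated sequence to obtain the critical values used as decision thresholds.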
Affiliation(s)
- Carlos A. C. Bastos
  - DETI – Department of Electronics, Telecommunications and Informatics, IEETA – Institute of Electronics and Informatics Engineering of Aveiro, University of Aveiro, 3810-193 Aveiro, Portugal
  - LASI – Intelligent Systems Associate Laboratory, Aveiro, Portugal
- Vera Afreixo
  - CIDMA – Center for Research and Development in Mathematics and Applications, DMAT – Department of Mathematics, University of Aveiro, 3810-193 Aveiro, Portugal
- João M. O. S. Rodrigues
  - DETI – Department of Electronics, Telecommunications and Informatics, IEETA – Institute of Electronics and Informatics Engineering of Aveiro, University of Aveiro, 3810-193 Aveiro, Portugal
  - LASI – Intelligent Systems Associate Laboratory, Aveiro, Portugal
- Armando J. Pinho
  - DETI – Department of Electronics, Telecommunications and Informatics, IEETA – Institute of Electronics and Informatics Engineering of Aveiro, University of Aveiro, 3810-193 Aveiro, Portugal
  - LASI – Intelligent Systems Associate Laboratory, Aveiro, Portugal

10. Ait Saada A, Guo W, Costa AB, Yang J, Wang J, Lobachev K. Widely spaced and divergent inverted repeats become a potent source of chromosomal rearrangements in long single-stranded DNA regions. Nucleic Acids Res 2023; 51:3722-3734. [PMID: 36919609 PMCID: PMC10164571 DOI: 10.1093/nar/gkad153] [Received: 11/11/2022] [Revised: 02/16/2023] [Accepted: 02/20/2023] [Indexed: 03/16/2023]
Abstract
DNA inverted repeats (IRs) are widespread across many eukaryotic genomes. Their ability to form stable hairpin/cruciform secondary structures is causative in triggering chromosome instability leading to several human diseases. Distance and sequence divergence between IRs are inversely correlated with their ability to induce gross chromosomal rearrangements (GCRs) because of a lesser probability of secondary structure formation and chromosomal breakage. In this study, we demonstrate that structural parameters that normally constrain the instability of IRs are overcome when the repeats interact in single-stranded DNA (ssDNA). We established a system in budding yeast whereby >73 kb of ssDNA can be formed in cdc13-707fs mutants. We found that in ssDNA, 12 bp or 30 kb spaced Alu-IRs show similarly high levels of GCRs, while heterology only beyond 25% suppresses IR-induced instability. Mechanistically, rearrangements arise after cis-interaction of IRs leading to a DNA fold-back and the formation of a dicentric chromosome, which requires Rad52/Rad59 for IR annealing as well as Rad1-Rad10, Slx4, Msh2/Msh3 and Saw1 proteins for nonhomologous tail removal. Importantly, using structural characteristics rendering IRs permissive to DNA fold-back in yeast, we found that ssDNA regions mapped in cancer genomes contain a substantial number of potentially interacting and unstable IRs.
Affiliation(s)
- Anissia Ait Saada
  - School of Biological Sciences and Institute for Bioengineering and Bioscience, Georgia Institute of Technology, Atlanta, GA 30332, USA
- Wenying Guo
  - School of Biological Sciences and Institute for Bioengineering and Bioscience, Georgia Institute of Technology, Atlanta, GA 30332, USA
- Alex B Costa
  - School of Biological Sciences and Institute for Bioengineering and Bioscience, Georgia Institute of Technology, Atlanta, GA 30332, USA
- Jiaxin Yang
  - Department of Computational Mathematics, Science and Engineering, Michigan State University, East Lansing, MI 48824, USA
- Jianrong Wang
  - Department of Computational Mathematics, Science and Engineering, Michigan State University, East Lansing, MI 48824, USA
- Kirill S Lobachev
  - School of Biological Sciences and Institute for Bioengineering and Bioscience, Georgia Institute of Technology, Atlanta, GA 30332, USA

11. Bowater RP, Brázda V. Impacts of Molecular Structure on Nucleic Acid-Protein Interactions. Int J Mol Sci 2022; 24:ijms24010407. [PMID: 36613851 PMCID: PMC9820666 DOI: 10.3390/ijms24010407] [Received: 12/16/2022] [Accepted: 12/21/2022] [Indexed: 12/28/2022]
Abstract
Interactions between nucleic acids and proteins are some of the most important interactions in biology because they are the cornerstones for fundamental biological processes, such as replication, transcription, and recombination [...].
Affiliation(s)
- Richard P. Bowater
  - School of Biological Sciences, University of East Anglia, Norwich Research Park, Norwich NR4 7TJ, UK
  - Correspondence: (R.P.B.); (V.B.)
- Václav Brázda
  - Institute of Biophysics of the Czech Academy of Sciences, Královopolská 135, 612 00 Brno, Czech Republic
  - Correspondence: (R.P.B.); (V.B.)

12. DNA Motifs and an Accessory CRISPR Factor Determine Cas1 Binding and Integration Activity in Sulfolobus islandicus. Int J Mol Sci 2022; 23:ijms231710178. [PMID: 36077578 PMCID: PMC9456107 DOI: 10.3390/ijms231710178] [Received: 08/19/2022] [Revised: 08/31/2022] [Accepted: 09/02/2022] [Indexed: 11/17/2022]
Abstract
CRISPR-Cas systems empower prokaryotes with adaptive immunity against invasive mobile genetic elements. At the first step of CRISPR immunity adaptation, short DNA fragments from the invaders are integrated into CRISPR arrays at the leader-proximal end. To date, the mechanism of recognition of the leader-proximal end remains largely unknown. Here, in the Sulfolobus islandicus subtype I-A system, we show that mutations destroying the proximal region reduce CRISPR adaptation in vivo. We identify that a stem-loop structure is present on the leader-proximal end, and we demonstrate that Cas1 preferentially binds the stem-loop structure in vitro. Moreover, we demonstrate that the integrase activity of Cas1 is modulated by interacting with a CRISPR-associated factor Csa3a. When translocated to the CRISPR array, the Csa3a-Cas1 complex is separated by Csa3a binding to the leader-distal motif and Cas1 binding to the leader-proximal end. Mutation at the leader-distal motif reduces CRISPR adaptation efficiency, further confirming the in vivo function of leader-distal motif. Together, our results suggest a general model for binding of Cas1 protein to a leader motif and modulation of integrase activity by an accessory factor.