1. Verma A, Lin M, Smith D, Walker JC, Hewezi T, Davis EL, Hussey RS, Baum TJ, Mitchum MG. A novel sugar beet cyst nematode effector 2D01 targets the Arabidopsis HAESA receptor-like kinase. Molecular Plant Pathology 2022; 23:1765-1782. [PMID: 36069343 PMCID: PMC9644282 DOI: 10.1111/mpp.13263] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/18/2022] [Revised: 08/10/2022] [Accepted: 08/11/2022] [Indexed: 06/15/2023]
Abstract
Plant-parasitic cyst nematodes use a stylet to deliver effector proteins produced in oesophageal gland cells into root cells to cause disease in plants. These effectors are deployed to modulate plant defence responses and developmental programmes for the formation of a specialized feeding site called a syncytium. The Hg2D01 effector gene, coding for a novel 185-amino-acid secreted protein, was previously shown to be up-regulated in the dorsal gland of parasitic juveniles of the soybean cyst nematode Heterodera glycines, but its function has remained unknown. Genome analyses revealed that Hg2D01 belongs to a highly diversified effector gene family in the genomes of H. glycines and the sugar beet cyst nematode Heterodera schachtii. For functional studies using the model Arabidopsis thaliana-H. schachtii pathosystem, we cloned the orthologous Hs2D01 sequence from H. schachtii. We demonstrate that Hs2D01 is a cytoplasmic effector that interacts with the intracellular kinase domain of HAESA (HAE), a cell surface-associated leucine-rich repeat (LRR) receptor-like kinase (RLK) involved in signalling the activation of cell wall-remodelling enzymes important for cell separation during abscission and lateral root emergence. Furthermore, we show that AtHAE is expressed in the syncytium and, therefore, could serve as a viable host target for Hs2D01. Infective juveniles effectively penetrated the roots of HAE and HAESA-LIKE2 (HSL2) double mutant plants; however, fewer nematodes developed on the roots, consistent with a role for this receptor family in nematode infection. Taken together, our results suggest that the Hs2D01-AtHAE interaction may play an important role in sugar beet cyst nematode parasitism.
Affiliation(s)
- Anju Verma
  - Department of Plant Pathology and Institute of Plant Breeding, Genetics, and Genomics, University of Georgia, Athens, Georgia, USA
  - Division of Plant Sciences and Bond Life Sciences Center, University of Missouri, Columbia, Missouri, USA
- Marriam Lin
  - Division of Plant Sciences and Bond Life Sciences Center, University of Missouri, Columbia, Missouri, USA
  - Boyle Frederickson Intellectual Property Law, Milwaukee, Wisconsin, USA
- Dante Smith
  - Division of Plant Sciences and Bond Life Sciences Center, University of Missouri, Columbia, Missouri, USA
  - Conagra Brands, Inc., Corporate Microbiology, Research and Development, Omaha, Nebraska, USA
- John C. Walker
  - Division of Biological Sciences, University of Missouri, Columbia, Missouri, USA
- Tarek Hewezi
  - Department of Plant Sciences, University of Tennessee, Knoxville, Tennessee, USA
- Eric L. Davis
  - Department of Entomology and Plant Pathology, North Carolina State University, Raleigh, North Carolina, USA
- Richard S. Hussey
  - Department of Plant Pathology and Institute of Plant Breeding, Genetics, and Genomics, University of Georgia, Athens, Georgia, USA
- Thomas J. Baum
  - Department of Plant Pathology and Microbiology, Iowa State University, Ames, Iowa, USA
- Melissa G. Mitchum
  - Department of Plant Pathology and Institute of Plant Breeding, Genetics, and Genomics, University of Georgia, Athens, Georgia, USA
  - Division of Plant Sciences and Bond Life Sciences Center, University of Missouri, Columbia, Missouri, USA
2. Xu K, Zhang YF, Guo DY, Qin L, Ashraf M, Ahmad N. Recent advances in yeast genome evolution with stress tolerance for green biological manufacturing. Biotechnol Bioeng 2022; 119:2689-2697. [PMID: 35841179 DOI: 10.1002/bit.28183] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 04/28/2022] [Revised: 06/20/2022] [Accepted: 07/13/2022] [Indexed: 01/04/2023]
Abstract
Green biological manufacturing is a revolutionary industrial model utilizing yeast as a significant microbial cell factory to produce biofuels and other biochemicals. However, biotransformation efficiency is often limited owing to several stress factors resulting from environmental changes or metabolic imbalance, leading to the slow growth of cells, compromised yield, and enhanced energy consumption. These factors make biological manufacturing competitively less economical. In this regard, minimizing the stress impact on microbial cell factories and strong robust performance have been an interesting area of interest in the last few decades. In this review, we focused on revealing the stress factors and their associated mechanisms for yeast in biological manufacturing. To improve yeast tolerance, rational and irrational strategies were introduced, and the molecular basis of genome evolution in yeast was also summarized. Furthermore, strategies of genome-directed evolution such as homology directed repair and nonhomologous end-joining, and the synthetic chromosome recombination and modification by LoxP-mediated evolution and their association with stress tolerance was highlighted. We hope that genome evolution provides new insights for solving the limitations of the natural phenotypes of microorganisms in industrial fermentation for the production of valuable compounds.
Affiliation(s)
- Ke Xu
  - Department of Life Science, Tangshan Key Laboratory of Agricultural Pathogenic Fungi and Toxins, Tangshan Normal University, Tangshan
  - Department of Chemical Engineering, Key Lab for Industrial Biocatalysis, Ministry of Education, Tsinghua University, Beijing, PR China
- Yun-Feng Zhang
  - Department of Life Science, Tangshan Key Laboratory of Agricultural Pathogenic Fungi and Toxins, Tangshan Normal University, Tangshan
- Dong-Yu Guo
  - Department of Life Science, Tangshan Key Laboratory of Agricultural Pathogenic Fungi and Toxins, Tangshan Normal University, Tangshan
- Lei Qin
  - Department of Chemical Engineering, Key Lab for Industrial Biocatalysis, Ministry of Education, Tsinghua University, Beijing, PR China
- Munaza Ashraf
  - Department of Zoology, University of Sargodha, Sargodha, Pakistan
- Nadeem Ahmad
  - Department of Pharmacy, COMSATS University Islamabad, Abbottabad Campus, Abbottabad, Pakistan
3. Brázda V, Bohálová N, Bowater RP. New telomere to telomere assembly of human chromosome 8 reveals a previous underestimation of G-quadruplex forming sequences and inverted repeats. Gene 2021; 810:146058. [PMID: 34737002 DOI: 10.1016/j.gene.2021.146058] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 08/31/2021] [Revised: 10/14/2021] [Accepted: 10/29/2021] [Indexed: 11/04/2022]
Abstract
Taking advantage of evolving and improving sequencing methods, human chromosome 8 is now available as a gapless, end-to-end assembly. Thanks to advances in long-read sequencing technologies, its centromere, telomeres, duplicated gene families and repeat-rich regions are now fully sequenced. We were interested to assess if the new assembly altered our understanding of the potential impact of non-B DNA structures within this completed chromosome sequence. It has been shown that non-B secondary structures, such as G-quadruplexes, hairpins and cruciforms, have important regulatory functions and potential as targeted therapeutics. Therefore, we analysed the presence of putative G-quadruplex forming sequences and inverted repeats in the current human reference genome (GRCh38) and in the new end-to-end assembly of chromosome 8. The comparison revealed that the new assembly contains significantly more inverted repeats and G-quadruplex forming sequences compared to the current reference sequence. This observation can be explained by improved accuracy of the new sequencing methods, particularly in regions that contain extensive repeats of bases, as is preferred by many non-B DNA structures. These results show a significant underestimation of the prevalence of non-B DNA secondary structure in previous assembly versions of the human genome and point to their importance being not fully appreciated. We anticipate that similar observations will occur as the improved sequencing technologies fill in gaps across the genomes of humans and other organisms.
Affiliation(s)
- Václav Brázda
  - Institute of Biophysics of the Czech Academy of Sciences, Královopolská 135, Brno 612 65, Czech Republic
- Natália Bohálová
  - Institute of Biophysics of the Czech Academy of Sciences, Královopolská 135, Brno 612 65, Czech Republic
  - Department of Experimental Biology, Faculty of Science, Masaryk University, Kamenice 5, Brno 62500, Czech Republic
- Richard P Bowater
  - School of Biological Sciences, University of East Anglia, Norwich Research Park, Norwich NR4 7TJ, United Kingdom
4. Jia L, Li Y, Huang F, Jiang Y, Li H, Wang Z, Chen T, Li J, Zhang Z, Yao W. LIRBase: a comprehensive database of long inverted repeats in eukaryotic genomes. Nucleic Acids Res 2021; 50:D174-D182. [PMID: 34643715 PMCID: PMC8728187 DOI: 10.1093/nar/gkab912] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 07/31/2021] [Revised: 09/20/2021] [Accepted: 09/25/2021] [Indexed: 11/14/2022]
Abstract
Small RNAs (sRNAs) constitute a large portion of functional elements in eukaryotic genomes. Long inverted repeats (LIRs) can be transcribed into long hairpin RNAs (hpRNAs), which can further be processed into small interfering RNAs (siRNAs) with vital biological roles. In this study, we systematically identified a total of 6 619 473 LIRs in 424 eukaryotic genomes and developed LIRBase (https://venyao.xyz/lirbase/), a specialized database of LIRs across different eukaryotic genomes aiming to facilitate the annotation and identification of LIRs encoding long hpRNAs and siRNAs. LIRBase houses a comprehensive collection of LIRs identified in a wide range of eukaryotic genomes. In addition, LIRBase not only allows users to browse and search the identified LIRs in any eukaryotic genome(s) of interest available in GenBank, but also provides friendly web functionalities to facilitate users to identify LIRs in user-uploaded sequences, align sRNA sequencing data to LIRs, perform differential expression analysis of LIRs, predict mRNA targets for LIR-derived siRNAs, and visualize the secondary structure of candidate long hpRNAs encoded by LIRs. As demonstrated by two case studies, collectively, LIRBase bears the great utility for systematic investigation and characterization of LIRs and functional exploration of potential roles of LIRs and their derived siRNAs in diverse species.
Affiliation(s)
- Lihua Jia
  - National Key Laboratory of Wheat and Maize Crop Science, College of Life Sciences, Henan Agricultural University, Zhengzhou 450002, China
  - National Key Laboratory of Wheat and Maize Crop Science, College of Agronomy, Henan Agricultural University, Zhengzhou 450002, China
- Yang Li
  - National Key Laboratory of Wheat and Maize Crop Science, College of Life Sciences, Henan Agricultural University, Zhengzhou 450002, China
- Fangfang Huang
  - National Key Laboratory of Wheat and Maize Crop Science, College of Life Sciences, Henan Agricultural University, Zhengzhou 450002, China
- Yingru Jiang
  - National Key Laboratory of Wheat and Maize Crop Science, College of Life Sciences, Henan Agricultural University, Zhengzhou 450002, China
- Haoran Li
  - National Key Laboratory of Wheat and Maize Crop Science, College of Life Sciences, Henan Agricultural University, Zhengzhou 450002, China
- Zhizhan Wang
  - National Key Laboratory of Wheat and Maize Crop Science, College of Life Sciences, Henan Agricultural University, Zhengzhou 450002, China
- Tiantian Chen
  - National Key Laboratory of Wheat and Maize Crop Science, College of Life Sciences, Henan Agricultural University, Zhengzhou 450002, China
- Jiaming Li
  - National Key Laboratory of Wheat and Maize Crop Science, College of Life Sciences, Henan Agricultural University, Zhengzhou 450002, China
- Zhang Zhang
  - China National Center for Bioinformation, Beijing 100101, China
  - National Genomics Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China
  - CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China
  - University of Chinese Academy of Sciences, Beijing 100101, China
- Wen Yao
  - National Key Laboratory of Wheat and Maize Crop Science, College of Life Sciences, Henan Agricultural University, Zhengzhou 450002, China
5. Wu X, Liang Y, Gao H, Wang J, Zhao Y, Hua L, Yuan Y, Wang A, Zhang X, Liu J, Zhou J, Meng X, Zhang D, Lin S, Huang X, Han B, Li J, Wang Y. Enhancing rice grain production by manipulating the naturally evolved cis-regulatory element-containing inverted repeat sequence of OsREM20. Molecular Plant 2021; 14:997-1011. [PMID: 33741527 DOI: 10.1016/j.molp.2021.03.016] [Citation(s) in RCA: 17] [Impact Index Per Article: 5.7] [Received: 11/16/2020] [Revised: 01/19/2021] [Accepted: 03/14/2021] [Indexed: 05/05/2023]
Abstract
Grain number per panicle (GNP) is an important agronomic trait that contributes to rice grain yield. Despite its importance in rice breeding, the molecular mechanism underlying GNP regulation remains largely unknown. In this study, we identified a previously unrecognized regulatory gene that controls GNP in rice, Oryza sativa REPRODUCTIVE MERISTEM 20 (OsREM20), which encodes a B3 domain transcription factor. Through genetic analysis and transgenic validation we found that genetic variation in the CArG box-containing inverted repeat (IR) sequence of the OsREM20 promoter alters its expression level and contributes to GNP variation among rice varieties. Furthermore, we revealed that the IR sequence regulates OsREM20 expression by affecting the direct binding of OsMADS34 to the CArG box within the IR sequence. Interestingly, the divergent pOsREM20IR and pOsREM20ΔIR alleles were found to originate from different Oryza rufipogon accessions, and were independently inherited into the japonica and indica subspecies, respectively, during domestication. Importantly, we demonstrated that IR sequence variations in the OsREM20 promoter can be utilized for germplasm improvement through either genome editing or traditional breeding. Taken together, our study characterizes novel genetic variations responsible for GNP diversity in rice, reveals the underlying molecular mechanism in the regulation of agronomically important gene expression, and provides a promising strategy for improving rice production by manipulating the cis-regulatory element-containing IR sequence.
Affiliation(s)
- Xiaowei Wu
  - State Key Laboratory of Plant Genomics and National Center for Plant Gene Research (Beijing), Institute of Genetics and Developmental Biology, Chinese Academy of Sciences, Beijing 100101, China
  - University of Chinese Academy of Sciences, Beijing 100049, China
- Yan Liang
  - State Key Laboratory of Plant Genomics and National Center for Plant Gene Research (Beijing), Institute of Genetics and Developmental Biology, Chinese Academy of Sciences, Beijing 100101, China
  - College of Life Sciences, Shandong Agricultural University, Taian, Shandong 271018, China
- Hengbin Gao
  - College of Life Sciences, Shandong Agricultural University, Taian, Shandong 271018, China
- Jiyao Wang
  - State Key Laboratory of Plant Genomics and National Center for Plant Gene Research (Beijing), Institute of Genetics and Developmental Biology, Chinese Academy of Sciences, Beijing 100101, China
  - University of Chinese Academy of Sciences, Beijing 100049, China
- Yan Zhao
  - National Center for Gene Research, Institute of Plant Physiology and Ecology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai 200233, China
- Lekai Hua
  - College of Resources and Environment, Fujian Agriculture and Forestry University, Fuzhou 350002, China
- Yundong Yuan
  - State Key Laboratory of Plant Genomics and National Center for Plant Gene Research (Beijing), Institute of Genetics and Developmental Biology, Chinese Academy of Sciences, Beijing 100101, China
- Ahong Wang
  - National Center for Gene Research, Institute of Plant Physiology and Ecology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai 200233, China
- Xiaohui Zhang
  - State Key Laboratory of Plant Genomics and National Center for Plant Gene Research (Beijing), Institute of Genetics and Developmental Biology, Chinese Academy of Sciences, Beijing 100101, China
  - University of Chinese Academy of Sciences, Beijing 100049, China
- Jiafan Liu
  - State Key Laboratory of Plant Genomics and National Center for Plant Gene Research (Beijing), Institute of Genetics and Developmental Biology, Chinese Academy of Sciences, Beijing 100101, China
- Jie Zhou
  - State Key Laboratory of Plant Genomics and National Center for Plant Gene Research (Beijing), Institute of Genetics and Developmental Biology, Chinese Academy of Sciences, Beijing 100101, China
- Xiangbing Meng
  - State Key Laboratory of Plant Genomics and National Center for Plant Gene Research (Beijing), Institute of Genetics and Developmental Biology, Chinese Academy of Sciences, Beijing 100101, China
- Dahan Zhang
  - State Key Laboratory of Plant Genomics and National Center for Plant Gene Research (Beijing), Institute of Genetics and Developmental Biology, Chinese Academy of Sciences, Beijing 100101, China
  - University of Chinese Academy of Sciences, Beijing 100049, China
- Shaoyang Lin
  - State Key Laboratory of Plant Genomics and National Center for Plant Gene Research (Beijing), Institute of Genetics and Developmental Biology, Chinese Academy of Sciences, Beijing 100101, China
- Xuehui Huang
  - Shanghai Key Laboratory of Plant Molecular Sciences, College of Life Sciences, Shanghai Normal University, Shanghai 200234, China
- Bin Han
  - CAS Center for Excellence in Molecular Plant Sciences, Chinese Academy of Sciences, Shanghai 200032, China
  - National Center for Gene Research, Institute of Plant Physiology and Ecology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai 200233, China
- Jiayang Li
  - State Key Laboratory of Plant Genomics and National Center for Plant Gene Research (Beijing), Institute of Genetics and Developmental Biology, Chinese Academy of Sciences, Beijing 100101, China
  - Innovation Academy for Seed Design, Chinese Academy of Sciences, Beijing 100101, China
  - University of Chinese Academy of Sciences, Beijing 100049, China
- Yonghong Wang
  - State Key Laboratory of Plant Genomics and National Center for Plant Gene Research (Beijing), Institute of Genetics and Developmental Biology, Chinese Academy of Sciences, Beijing 100101, China
  - CAS Center for Excellence in Molecular Plant Sciences, Chinese Academy of Sciences, Shanghai 200032, China
  - University of Chinese Academy of Sciences, Beijing 100049, China
  - College of Life Sciences, Shandong Agricultural University, Taian, Shandong 271018, China
6. Cruciform Formable Sequences within Pou5f1 Enhancer Are Indispensable for Mouse ES Cell Integrity. Int J Mol Sci 2021; 22:ijms22073399. [PMID: 33810223 PMCID: PMC8036336 DOI: 10.3390/ijms22073399] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 02/17/2021] [Revised: 03/22/2021] [Accepted: 03/22/2021] [Indexed: 01/04/2023]
Abstract
DNA can adopt various structures besides the B-form. Among them, cruciform structures are formed on inverted repeat (IR) sequences. While cruciform formable IRs (CFIRs) are sometimes found in regulatory regions of transcription, their function in transcription remains elusive, especially in eukaryotes. We found a cluster of CFIRs within the mouse Pou5f1 enhancer. Here, we demonstrate that this cluster or some member(s) plays an active role in the transcriptional regulation of not only Pou5f1, but also Sox2, Nanog, Klf4 and Esrrb. To clarify in vivo function of the cluster, we performed genome editing using mouse ES cells, in which each of the CFIRs was altered to the corresponding mirror repeat sequence. The alterations reduced the level of the Pou5f1 transcript in the genome-edited cell lines, and elevated those of Sox2, Nanog, Klf4 and Esrrb. Furthermore, transcription of non-coding RNAs (ncRNAs) within the enhancer was also upregulated in the genome-edited cell lines, in a similar manner to Sox2, Nanog, Klf4 and Esrrb. These ncRNAs are hypothesized to control the expression of these four pluripotency genes. The CFIRs present in the Pou5f1 enhancer seem to be important to maintain the integrity of ES cells.
7. Zhang R, Ge F, Li H, Chen Y, Zhao Y, Gao Y, Liu Z, Yang L. PCIR: a database of Plant Chloroplast Inverted Repeats. Database: The Journal of Biological Databases and Curation 2020; 2019:5611292. [PMID: 31696928 PMCID: PMC6835207 DOI: 10.1093/database/baz127] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 07/16/2019] [Revised: 09/26/2019] [Accepted: 10/07/2019] [Indexed: 01/06/2023]
Abstract
Inverted repeats (IRs) serve as potential biomarkers for genomic instability, DNA replication and other genetic processes. However, little information can be found in databases to help researchers recognize potential IR nucleotides, explore junction sites and annotate related functional genes. Plant Chloroplast Inverted Repeats (PCIR) is an interactive, web-based platform containing various sequenced chloroplast genomes that enables detection, searching and visualization of large-scale detailed information on IRs. PCIR contains many datasets, including 21 433 IRs, 113 plants chloroplast genomes, 16 948 functional genes and 21 659 visual maps. This database offers an online prediction tool for detecting IRs based on DNA sequences. PCIR can also analyze phylogenetic relationships using IR information among different species and provide users with high-quality marker maps. This database will be a valuable resource for IR distribution patterns, related genes and architectural features.
Affiliation(s)
- Rui Zhang
  - Agricultural Big-Data Research Center and College of Plant Protection, Shandong Agricultural University, Tai'an 271018, China
- Fangfang Ge
  - Agricultural Big-Data Research Center and College of Plant Protection, Shandong Agricultural University, Tai'an 271018, China
- Huayang Li
  - Agricultural Big-Data Research Center and College of Plant Protection, Shandong Agricultural University, Tai'an 271018, China
- Yudong Chen
  - Agricultural Big-Data Research Center and College of Plant Protection, Shandong Agricultural University, Tai'an 271018, China
- Ying Zhao
  - Agricultural Big-Data Research Center and College of Plant Protection, Shandong Agricultural University, Tai'an 271018, China
- Ying Gao
  - Agricultural Big-Data Research Center and College of Plant Protection, Shandong Agricultural University, Tai'an 271018, China
- Zhiguo Liu
  - Agricultural Big-Data Research Center and College of Plant Protection, Shandong Agricultural University, Tai'an 271018, China
- Long Yang
  - Agricultural Big-Data Research Center and College of Plant Protection, Shandong Agricultural University, Tai'an 271018, China
8. Bastos CAC, Afreixo V, Rodrigues JMOS, Pinho AJ, Silva RM. Distribution of Distances Between Symmetric Words in the Human Genome: Analysis of Regular Peaks. Interdiscip Sci 2019; 11:367-372. [PMID: 30911903 DOI: 10.1007/s12539-019-00326-x] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 08/24/2018] [Revised: 01/24/2019] [Accepted: 02/27/2019] [Indexed: 11/29/2022]
Abstract
Finding DNA sites with high potential for the formation of hairpin/cruciform structures is an important task. Previous works studied the distances between adjacent reversed complement words (symmetric word pairs) and also for non-adjacent words. It was observed that for some words a few distances were favoured (peaks) and that in some distributions there was strong peak regularity. The present work extends previous studies, by improving the detection and characterization of peak regularities in the symmetric word pairs distance distributions of the human genome. This work also analyzes the location of the sequences that originate the observed strong peak periodicity in the distance distribution. The results obtained in this work may indicate genomic sites with potential for the formation of hairpin/cruciform structures.
Affiliation(s)
- Carlos A C Bastos
  - Department of Electronics, Telecommunications and Informatics, IEETA-Institute of Electronics and Informatics Engineering of Aveiro, University of Aveiro, Campus Universitário de Santiago, Aveiro, Portugal
- Vera Afreixo
  - Department of Mathematics, IEETA-Institute of Electronics and Informatics Engineering of Aveiro, CIDMA-Center for Research and Development in Mathematics and Applications, University of Aveiro, Campus Universitário de Santiago, Aveiro, Portugal
- João M O S Rodrigues
  - Department of Electronics, Telecommunications and Informatics, IEETA-Institute of Electronics and Informatics Engineering of Aveiro, University of Aveiro, Campus Universitário de Santiago, Aveiro, Portugal
- Armando J Pinho
  - Department of Electronics, Telecommunications and Informatics, IEETA-Institute of Electronics and Informatics Engineering of Aveiro, University of Aveiro, Campus Universitário de Santiago, Aveiro, Portugal
- Raquel M Silva
  - Department of Medical Sciences, iBiMED, IEETA-Institute of Electronics and Informatics Engineering of Aveiro, University of Aveiro, Campus Universitário de Santiago, Aveiro, Portugal
9. Christmas MJ, Wallberg A, Bunikis I, Olsson A, Wallerman O, Webster MT. Chromosomal inversions associated with environmental adaptation in honeybees. Mol Ecol 2018; 28:1358-1374. [DOI: 10.1111/mec.14944] [Citation(s) in RCA: 39] [Impact Index Per Article: 6.5] [Received: 07/17/2018] [Revised: 11/07/2018] [Accepted: 11/07/2018] [Indexed: 01/03/2023]
Affiliation(s)
- Matthew J. Christmas
  - Department of Medical Biochemistry and Microbiology, Science for Life Laboratory, Uppsala University, Uppsala, Sweden
- Andreas Wallberg
  - Department of Medical Biochemistry and Microbiology, Science for Life Laboratory, Uppsala University, Uppsala, Sweden
- Ignas Bunikis
  - Department of Immunology, Genetics and Pathology, Science for Life Laboratory, Uppsala University, Uppsala, Sweden
- Anna Olsson
  - Department of Medical Biochemistry and Microbiology, Science for Life Laboratory, Uppsala University, Uppsala, Sweden
- Ola Wallerman
  - Department of Medical Biochemistry and Microbiology, Science for Life Laboratory, Uppsala University, Uppsala, Sweden
- Matthew T. Webster
  - Department of Medical Biochemistry and Microbiology, Science for Life Laboratory, Uppsala University, Uppsala, Sweden
10. Miura O, Ogake T, Yoneyama H, Kikuchi Y, Ohyama T. A strong structural correlation between short inverted repeat sequences and the polyadenylation signal in yeast and nucleosome exclusion by these inverted repeats. Curr Genet 2018; 65:575-590. [PMID: 30498953 PMCID: PMC6420913 DOI: 10.1007/s00294-018-0907-8] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Received: 09/18/2018] [Revised: 11/14/2018] [Accepted: 11/15/2018] [Indexed: 11/22/2022]
Abstract
DNA sequences that read the same from 5′ to 3′ in either strand are called inverted repeat sequences or simply IRs. They are found throughout a wide variety of genomes, from prokaryotes to eukaryotes. Despite extensive research, their in vivo functions, if any, remain unclear. Using Saccharomyces cerevisiae, we performed genome-wide analyses for the distribution, occurrence frequency, sequence characteristics and relevance to chromatin structure, for the IRs that reportedly have a cruciform-forming potential. Here, we provide the first comprehensive map of these IRs in the S. cerevisiae genome. The statistically significant enrichment of the IRs was found in the close vicinity of the DNA positions corresponding to polyadenylation [poly(A)] sites and ~ 30 to ~ 60 bp downstream of start codon-coding sites (referred to as ‘start codons’). In the former, ApT- or TpA-rich IRs and A-tract- or T-tract-rich IRs are enriched, while in the latter, different IRs are enriched. Furthermore, we found a strong structural correlation between the former IRs and the poly(A) signal. In the chromatin formed on the gene end regions, the majority of the IRs causes low nucleosome occupancy. The IRs in the region ~ 30 to ~ 60 bp downstream of start codons are located in the + 1 nucleosomes. In contrast, fewer IRs are present in the adjacent region downstream of start codons. The current study suggests that the IRs play similar roles in Escherichia coli and S. cerevisiae to regulate or complete transcription at the RNA level.
Affiliation(s)
- Osamu Miura
  - Department of Biology, Faculty of Education and Integrated Arts and Sciences, Waseda University, 2-2 Wakamatsu-cho, Shinjuku-ku, Tokyo, 162-8480, Japan
- Toshihiro Ogake
  - Major in Integrative Bioscience and Biomedical Engineering, Graduate School of Science and Engineering, Waseda University, 2-2 Wakamatsu-cho, Shinjuku-ku, Tokyo, 162-8480, Japan
- Hiroki Yoneyama
  - Major in Integrative Bioscience and Biomedical Engineering, Graduate School of Science and Engineering, Waseda University, 2-2 Wakamatsu-cho, Shinjuku-ku, Tokyo, 162-8480, Japan
- Yo Kikuchi
  - Major in Integrative Bioscience and Biomedical Engineering, Graduate School of Science and Engineering, Waseda University, 2-2 Wakamatsu-cho, Shinjuku-ku, Tokyo, 162-8480, Japan
- Takashi Ohyama
  - Department of Biology, Faculty of Education and Integrated Arts and Sciences, Waseda University, 2-2 Wakamatsu-cho, Shinjuku-ku, Tokyo, 162-8480, Japan
  - Major in Integrative Bioscience and Biomedical Engineering, Graduate School of Science and Engineering, Waseda University, 2-2 Wakamatsu-cho, Shinjuku-ku, Tokyo, 162-8480, Japan
11. Miura O, Ogake T, Ohyama T. Requirement or exclusion of inverted repeat sequences with cruciform-forming potential in Escherichia coli revealed by genome-wide analyses. Curr Genet 2018; 64:945-958. [PMID: 29484452 PMCID: PMC6060812 DOI: 10.1007/s00294-018-0815-y] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Received: 11/20/2017] [Revised: 02/16/2018] [Accepted: 02/19/2018] [Indexed: 12/31/2022]
Abstract
Inverted repeat (IR) sequences are DNA sequences that read the same from 5' to 3' in each strand. Some IRs can form cruciforms under the stress of negative supercoiling, and these IRs are widely found in genomes. However, their biological significance remains unclear. The aim of the current study is to explore this issue further. We constructed the first Escherichia coli genome-wide comprehensive map of IRs with cruciform-forming potential. Based on the map, we performed detailed and quantitative analyses. Here, we report that IRs with cruciform-forming potential are statistically enriched in the following five regions: the adjacent regions downstream of the stop codon-coding sites (referred to as the stop codons), on and around the positions corresponding to mRNA ends (referred to as the gene ends), ~ 20 to ~45 bp upstream of the start codon-coding sites (referred to as the start codons) within the 5'-UTR (untranslated region), ~ 25 to ~ 60 bp downstream of the start codons, and promoter regions. For the adjacent regions downstream of the stop codons and on and around the gene ends, most of the IRs with a repeat unit length of ≥ 8 bp and a spacer size of ≤ 8 bp were parts of the intrinsic terminators, regardless of the location, and presumably used for Rho-independent transcription termination. In contrast, fewer IRs were present in the small region preceding the start codons. In E. coli, IRs with cruciform-forming potential are actively placed or excluded in the regulatory regions for the initiation and termination of transcription and translation, indicating their deep involvement or influence in these processes.
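The definition used in this abstract (a stretch followed, after a short spacer, by its own reverse complement) can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names are hypothetical, only perfect repeats of one fixed arm length are detected, and the defaults of 8 bp for the arm and 8 bp for the maximum spacer merely echo the thresholds quoted above for terminator-associated IRs.

```python
# Hypothetical illustration, not the method used in the cited study.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an A/C/G/T sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def find_inverted_repeats(seq: str, arm_len: int = 8, max_spacer: int = 8):
    """List (start, arm_len, spacer) for every perfect inverted repeat.

    A hit is a left arm of exactly arm_len bases that is followed, after
    0..max_spacer spacer bases, by the reverse complement of that arm.
    """
    seq = seq.upper()
    hits = []
    for start in range(len(seq) - 2 * arm_len + 1):
        target = reverse_complement(seq[start:start + arm_len])
        for spacer in range(max_spacer + 1):
            right_start = start + arm_len + spacer
            right = seq[right_start:right_start + arm_len]
            if len(right) < arm_len:
                break  # the right arm would run past the end of the sequence
            if right == target:
                hits.append((start, arm_len, spacer))
    return hits


if __name__ == "__main__":
    # Left arm, 3-base spacer, then the reverse complement of the left arm.
    example = "ACGGTCAG" + "TTT" + "CTGACCGT"
    print(find_inverted_repeats(example))
```

Because the arm length is fixed, a longer repeat is reported as several overlapping hits; a real search would typically extend and merge them and might tolerate mismatches.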
Affiliation(s)
- Osamu Miura
- Department of Biology, Faculty of Education and Integrated Arts and Sciences, Waseda University, 2-2 Wakamatsu-cho, Shinjuku-ku, Tokyo, 162-8480, Japan
- Major in Integrative Bioscience and Biomedical Engineering, Graduate School of Science and Engineering, Waseda University, 2-2 Wakamatsu-cho, Shinjuku-ku, Tokyo, 162-8480, Japan
- Toshihiro Ogake
- Major in Integrative Bioscience and Biomedical Engineering, Graduate School of Science and Engineering, Waseda University, 2-2 Wakamatsu-cho, Shinjuku-ku, Tokyo, 162-8480, Japan
- Takashi Ohyama
- Department of Biology, Faculty of Education and Integrated Arts and Sciences, Waseda University, 2-2 Wakamatsu-cho, Shinjuku-ku, Tokyo, 162-8480, Japan
- Major in Integrative Bioscience and Biomedical Engineering, Graduate School of Science and Engineering, Waseda University, 2-2 Wakamatsu-cho, Shinjuku-ku, Tokyo, 162-8480, Japan
|
12
|
Wang Y, Huang JM. Lirex: A Package for Identification of Long Inverted Repeats in Genomes. Genomics Proteomics & Bioinformatics 2017; 15:141-146. [PMID: 28392477 PMCID: PMC5414712 DOI: 10.1016/j.gpb.2017.01.005] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Received: 10/14/2016] [Revised: 01/04/2017] [Accepted: 01/22/2017] [Indexed: 11/30/2022]
Abstract
Long inverted repeats (LIRs) are evolutionarily and functionally important structures in genomes because of their involvement in RNA interference, DNA recombination, and gene duplication. Identification of LIRs is highly complicated when mismatches and indels between the repeats are permitted. Long inverted repeat explorer (Lirex) was developed and is introduced in this report. Written in Java, Lirex provides a user-friendly interface and allows users to specify LIR searching criteria, such as length of the region, as well as pattern and size of the repeats. Recombinogenic LIRs can be selected from the identified LIRs on the basis of mismatch rate and internal spacer size. Lirex, as a cross-platform tool to identify LIRs in a genome, may assist in designing follow-up experiments to explore the function of LIRs. Our tool can identify more LIRs than other LIR searching tools. Lirex is publicly available at http://124.16.219.129/Lirex.
Affiliation(s)
- Yong Wang
- Institute of Deep-Sea Science and Engineering, Chinese Academy of Sciences, Sanya 572000, China
- Jiao-Mei Huang
- Institute of Deep-Sea Science and Engineering, Chinese Academy of Sciences, Sanya 572000, China
|
13
|
Tavares AHMP, Pinho AJ, Silva RM, Rodrigues JMOS, Bastos CAC, Ferreira PJSG, Afreixo V. DNA word analysis based on the distribution of the distances between symmetric words. Sci Rep 2017; 7:728. [PMID: 28389642 PMCID: PMC5428789 DOI: 10.1038/s41598-017-00646-2] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.4] [Received: 08/11/2016] [Accepted: 03/02/2017] [Indexed: 02/01/2023] Open
Abstract
We address the problem of discovering pairs of symmetric genomic words (i.e., words and the corresponding reversed complements) occurring at distances that are overrepresented. For this purpose, we developed new procedures to identify symmetric word pairs with uncommon empirical distance distribution and with clusters of overrepresented short distances. We speculate that patterns of overrepresentation of short distances between symmetric word pairs may allow the occurrence of non-standard DNA conformations, such as hairpin/cruciform structures. We focused on the human genome, and analysed both the complete genome as well as a version with known repetitive sequences masked out. We reported several well-defined features in the distributions of distances, which can be classified into three different profiles, showing enrichment in distinct distance ranges. We analysed in greater detail certain pairs of symmetric words of length seven, found by our procedure, characterised by the surprising fact that they occur at single distances more frequently than expected.
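As a rough illustration of the quantity discussed in this abstract, the Python sketch below tabulates distances between a word and the next downstream occurrence of its reversed complement. The distance convention (difference between start positions, nearest downstream partner only) and all function names are assumptions made for this example; the abstract does not specify the authors' exact definition or their statistical procedure.

```python
# Hypothetical illustration; the distance convention is an assumption.
import bisect
from collections import Counter

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(word: str) -> str:
    """Return the reversed complement of an A/C/G/T word."""
    return word.upper().translate(COMPLEMENT)[::-1]


def occurrences(seq: str, word: str) -> list:
    """Start positions of all (possibly overlapping) occurrences of word."""
    positions = []
    i = seq.find(word)
    while i != -1:
        positions.append(i)
        i = seq.find(word, i + 1)
    return positions


def symmetric_pair_distances(seq: str, word: str) -> Counter:
    """Count distances from each word occurrence to the next symmetric word."""
    seq, word = seq.upper(), word.upper()
    partner_positions = occurrences(seq, reverse_complement(word))
    distances = Counter()
    for pos in occurrences(seq, word):
        # First partner that starts strictly after this occurrence.
        j = bisect.bisect_right(partner_positions, pos)
        if j < len(partner_positions):
            distances[partner_positions[j] - pos] += 1
    return distances


if __name__ == "__main__":
    example = "ACGGAAACCGTTTACGGAAACCGT"
    print(symmetric_pair_distances(example, "ACGG"))
```

Judging whether a distance is overrepresented would additionally require an expected distribution, which this sketch does not attempt to model.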
Affiliation(s)
- Ana H M P Tavares
- Department of Mathematics & CIDMA, University of Aveiro, Aveiro, Portugal
- Department of Medical Sciences & iBiMED, University of Aveiro, Aveiro, Portugal
- Armando J Pinho
- Department of Electronics, Telecommunications and Informatics, University of Aveiro, Aveiro, Portugal
- IEETA, University of Aveiro, Aveiro, Portugal
- Raquel M Silva
- Department of Medical Sciences & iBiMED, University of Aveiro, Aveiro, Portugal
- IEETA, University of Aveiro, Aveiro, Portugal
- João M O S Rodrigues
- Department of Electronics, Telecommunications and Informatics, University of Aveiro, Aveiro, Portugal
- IEETA, University of Aveiro, Aveiro, Portugal
- Carlos A C Bastos
- Department of Electronics, Telecommunications and Informatics, University of Aveiro, Aveiro, Portugal
- IEETA, University of Aveiro, Aveiro, Portugal
- Paulo J S G Ferreira
- Department of Electronics, Telecommunications and Informatics, University of Aveiro, Aveiro, Portugal
- IEETA, University of Aveiro, Aveiro, Portugal
- Vera Afreixo
- Department of Mathematics & CIDMA, University of Aveiro, Aveiro, Portugal
- Department of Medical Sciences & iBiMED, University of Aveiro, Aveiro, Portugal
- IEETA, University of Aveiro, Aveiro, Portugal
|
14
|
Brázda V, Kolomazník J, Lýsek J, Hároníková L, Coufal J, Šťastný J. Palindrome analyser - A new web-based server for predicting and evaluating inverted repeats in nucleotide sequences. Biochem Biophys Res Commun 2016; 478:1739-45. [PMID: 27603574 DOI: 10.1016/j.bbrc.2016.09.015] [Citation(s) in RCA: 53] [Impact Index Per Article: 6.6] [Received: 08/31/2016] [Accepted: 09/02/2016] [Indexed: 10/21/2022]
Abstract
DNA cruciform structures play an important role in the regulation of natural processes including gene replication and expression, as well as nucleosome structure and recombination. They have also been implicated in the evolution and development of diseases such as cancer and neurodegenerative disorders. Cruciform structures are formed by inverted repeats, and their stability is enhanced by DNA supercoiling and protein binding. They have received broad attention because of their important roles in biology. Computational approaches to study inverted repeats have allowed detailed analysis of genomes. However, currently there are no easily accessible and user-friendly tools that can analyse inverted repeats, especially among long nucleotide sequences. We have developed a web-based server, Palindrome analyser, which is a user-friendly application for analysing inverted repeats in various DNA (or RNA) sequences including genome sequences and oligonucleotides. It allows users to search and retrieve desired gene/nucleotide sequence entries from the NCBI databases, and provides data on length, sequence, locations and energy required for cruciform formation. Palindrome analyser also features an interactive graphical data representation of the distribution of the inverted repeats, with options for sorting according to the length of inverted repeat, length of loop, and number of mismatches. Palindrome analyser can be accessed at http://bioinformatics.ibp.cz.
Affiliation(s)
- Václav Brázda
- Institute of Biophysics, Academy of Sciences of the Czech Republic, Královopolská 135, 612 65, Brno, Czech Republic
- Jan Kolomazník
- Mendel University in Brno, Zemědělská 1, 613 00, Brno, Czech Republic
- Jiří Lýsek
- Mendel University in Brno, Zemědělská 1, 613 00, Brno, Czech Republic
- Lucia Hároníková
- Institute of Biophysics, Academy of Sciences of the Czech Republic, Královopolská 135, 612 65, Brno, Czech Republic
- Jan Coufal
- Institute of Biophysics, Academy of Sciences of the Czech Republic, Královopolská 135, 612 65, Brno, Czech Republic
- Jiří Šťastný
- Mendel University in Brno, Zemědělská 1, 613 00, Brno, Czech Republic
|
15
|
Lai PJ, Lim CT, Le HP, Katayama T, Leach DRF, Furukohri A, Maki H. Long inverted repeat transiently stalls DNA replication by forming hairpin structures on both leading and lagging strands. Genes Cells 2016; 21:136-45. [PMID: 26738888 DOI: 10.1111/gtc.12326] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.8] [Received: 09/12/2015] [Accepted: 11/18/2015] [Indexed: 11/27/2022]
Abstract
Long inverted repeats (LIRs), often found in eukaryotic genomes, are unstable in Escherichia coli where they are recognized by the SbcCD (the bacterial Mre11/Rad50 homologue), an endonuclease/exonuclease capable of cleaving hairpin DNA. It has long been postulated that LIRs form hairpin structures exclusively on the lagging-strand template during DNA replication, and SbcCD cleaves these hairpin-containing lagging strands to generate DNA double-strand breaks. Using a reconstituted oriC plasmid DNA replication system, we have examined how a replication fork behaves when it meets a LIR on DNA. We have shown that leading-strand synthesis stalls transiently within the upstream half of the LIR. Pausing of lagging-strand synthesis at the LIR was not clearly observed, but the pattern of priming sites for Okazaki fragment synthesis was altered within the downstream half of the LIR. We have found that the LIR on a replicating plasmid was cleaved by SbcCD with almost equal frequency on both the leading- and lagging-strand templates. These data strongly suggest that the LIR is readily converted to a cruciform DNA, before the arrival of the fork, creating SbcCD-sensitive hairpin structures on both leading and lagging strands. We propose a model for the replication-dependent extrusion of LIRs to form cruciform structures that transiently impede replication fork movement.
Affiliation(s)
- Pey Jiun Lai
- Division of Systems Biology, Graduate School of Biological Sciences, Nara Institute of Science and Technology, Ikoma, Nara, 630-0192, Japan
- Chew Theng Lim
- Division of Systems Biology, Graduate School of Biological Sciences, Nara Institute of Science and Technology, Ikoma, Nara, 630-0192, Japan
- Hang Phuong Le
- Division of Systems Biology, Graduate School of Biological Sciences, Nara Institute of Science and Technology, Ikoma, Nara, 630-0192, Japan
- Tsutomu Katayama
- Department of Molecular Biology, Graduate School of Pharmaceutical Sciences, Kyushu University, Fukuoka, 812-8582, Japan
- David R F Leach
- Institute of Cell Biology, School of Biological Sciences, University of Edinburgh, Kings Buildings, Edinburgh, EH9 3JR, UK
- Asako Furukohri
- Division of Systems Biology, Graduate School of Biological Sciences, Nara Institute of Science and Technology, Ikoma, Nara, 630-0192, Japan
- Hisaji Maki
- Division of Systems Biology, Graduate School of Biological Sciences, Nara Institute of Science and Technology, Ikoma, Nara, 630-0192, Japan
|
16
|
Javadekar SM, Raghavan SC. Snaps and mends: DNA breaks and chromosomal translocations. FEBS J 2015; 282:2627-45. [PMID: 25913527 DOI: 10.1111/febs.13311] [Citation(s) in RCA: 37] [Impact Index Per Article: 4.1] [Received: 02/10/2015] [Revised: 03/29/2015] [Accepted: 04/23/2015] [Indexed: 01/11/2023]
Abstract
Integrity in entirety is the preferred state of any organism. The temporal and spatial integrity of the genome ensures continued survival of a cell. DNA breakage is the first step towards creation of chromosomal translocations. In this review, we highlight the factors contributing towards the breakage of chromosomal DNA. It has been well-established that the structure and sequence of DNA play a critical role in selective fragility of the genome. Several non-B-DNA structures such as Z-DNA, cruciform DNA, G-quadruplexes, R loops and triplexes have been implicated in generation of genomic fragility leading to translocations. Similarly, specific sequences targeted by proteins such as Recombination Activating Genes and Activation Induced Cytidine Deaminase are involved in translocations. Processes that ensure the integrity of the genome through repair may lead to persistence of breakage and eventually translocations if their actions are anomalous. An insufficient supply of nucleotides and chromatin architecture may also play a critical role. This review focuses on a range of events with the potential to threaten the genomic integrity of a cell, leading to cancer.
Affiliation(s)
- Saniya M Javadekar
- Department of Biochemistry, Indian Institute of Science, Bangalore, India
- Sathees C Raghavan
- Department of Biochemistry, Indian Institute of Science, Bangalore, India
|
17
|
Lu S, Wang G, Bacolla A, Zhao J, Spitser S, Vasquez KM. Short Inverted Repeats Are Hotspots for Genetic Instability: Relevance to Cancer Genomes. Cell Rep 2015; 10:1674-1680. [PMID: 25772355 DOI: 10.1016/j.celrep.2015.02.039] [Citation(s) in RCA: 65] [Impact Index Per Article: 7.2] [Received: 08/19/2014] [Revised: 01/26/2015] [Accepted: 02/16/2015] [Indexed: 12/25/2022] Open
Abstract
Analyses of chromosomal aberrations in human genetic disorders have revealed that inverted repeat sequences (IRs) often co-localize with endogenous chromosomal instability and breakage hotspots. Approximately 80% of all IRs in the human genome are short (<100 bp), yet the mutagenic potential of such short cruciform-forming sequences has not been characterized. Here, we find that short IRs are enriched at translocation breakpoints in human cancer and stimulate the formation of DNA double-strand breaks (DSBs) and deletions in mammalian and yeast cells. We provide evidence for replication-related mechanisms of IR-induced genetic instability and a novel XPF cleavage-based mechanism independent of DNA replication. These discoveries implicate short IRs as endogenous sources of DNA breakage involved in disease etiology and suggest that these repeats represent a feature of genome plasticity that may contribute to the evolution of the human genome by providing a means for diversity within the population.
Affiliation(s)
- Steve Lu
- Division of Pharmacology and Toxicology, College of Pharmacy, The University of Texas at Austin - Dell Pediatric Research Institute, 1400 Barbara Jordan Boulevard R1800, Austin, TX 78723, USA
- Guliang Wang
- Division of Pharmacology and Toxicology, College of Pharmacy, The University of Texas at Austin - Dell Pediatric Research Institute, 1400 Barbara Jordan Boulevard R1800, Austin, TX 78723, USA
- Albino Bacolla
- Division of Pharmacology and Toxicology, College of Pharmacy, The University of Texas at Austin - Dell Pediatric Research Institute, 1400 Barbara Jordan Boulevard R1800, Austin, TX 78723, USA
- Junhua Zhao
- Division of Pharmacology and Toxicology, College of Pharmacy, The University of Texas at Austin - Dell Pediatric Research Institute, 1400 Barbara Jordan Boulevard R1800, Austin, TX 78723, USA
- Scott Spitser
- Division of Pharmacology and Toxicology, College of Pharmacy, The University of Texas at Austin - Dell Pediatric Research Institute, 1400 Barbara Jordan Boulevard R1800, Austin, TX 78723, USA
- Karen M Vasquez
- Division of Pharmacology and Toxicology, College of Pharmacy, The University of Texas at Austin - Dell Pediatric Research Institute, 1400 Barbara Jordan Boulevard R1800, Austin, TX 78723, USA
|
18
|
Aygun N. Correlations between long inverted repeat (LIR) features, deletion size and distance from breakpoint in human gross gene deletions. Sci Rep 2015; 5:8300. [PMID: 25657065 PMCID: PMC4319165 DOI: 10.1038/srep08300] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.7] [Received: 10/02/2014] [Accepted: 01/14/2015] [Indexed: 11/09/2022] Open
Abstract
Long inverted repeats (LIRs) have been shown to induce genomic deletions in yeast. In this study, LIRs were investigated within ±10 kb spanning each breakpoint from 109 human gross deletions, using Inverted Repeat Finder (IRF) software. LIR number was significantly higher at the breakpoint regions than in control segments (P < 0.001). In addition, a strong correlation was found between 5' and 3' LIR numbers, suggesting a contribution to DNA sequence evolution (r = 0.85, P < 0.001). 138 LIR features at ±3 kb breakpoints in 89 (81%) of 109 gross deletions were evaluated. Significant correlations were found between distance from breakpoint and loop length (r = -0.18, P < 0.05) and stem length (r = -0.18, P < 0.05), suggesting that DNA strands are potentially broken in locations closer to bigger LIRs. In addition, bigger loops were associated with larger deletions (r = 0.19, P < 0.05). Moreover, loop length (r = 0.29, P < 0.02) and identity between stem copies (r = 0.30, P < 0.05) of 3' LIRs were more important in larger deletions. Consequently, DNA breaks may form via LIR-induced cruciform structures during replication. DNA ends may later be repaired by non-homologous end-joining (NHEJ), with subsequent deletion.
Affiliation(s)
- Nevim Aygun
- Department of Medical Biology, Faculty of Medicine, Dokuz Eylul University, Inciralti, Izmir, Turkey
|
19
|
Shen JJ, Dushoff J, Bewick AJ, Chain FJ, Evans BJ. Genomic dynamics of transposable elements in the western clawed frog (Silurana tropicalis). Genome Biol Evol 2013; 5:998-1009. [PMID: 23645600 PMCID: PMC3673623 DOI: 10.1093/gbe/evt065] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.1] [Accepted: 04/18/2013] [Indexed: 02/07/2023] Open
Abstract
Transposable elements (TEs) are repetitive DNA sequences that can make new copies of themselves that are inserted elsewhere in a host genome. The abundance and distributions of TEs vary considerably among phylogenetically diverse hosts. With the aim of exploring the basis of this variation, we evaluated correlations between several genomic variables and the presence of TEs and non-TE repeats in the complete genome sequence of the Western clawed frog (Silurana tropicalis). This analysis reveals patterns of TE insertion consistent with gene disruption but not with the insertional preference model. Analysis of non-TE repeats recovered unique features of their genome-wide distribution when compared with TE repeats, including no strong correlation with exons and a particularly strong negative correlation with GC content. We also collected polymorphism data from 25 TE insertion sites in 19 wild-caught S. tropicalis individuals. DNA transposon insertions were fixed at eight of nine sites and at a high frequency at one of nine, whereas insertions of long terminal repeat (LTR) and non-LTR retrotransposons were fixed at only 4 of 16 sites and at low frequency at 12 of 16. A maximum likelihood model failed to attribute these differences in insertion frequencies to variation in selection pressure on different classes of TE, opening the possibility that other phenomena such as variation in rates of replication or duration of residence in the genome could play a role. Taken together, these results identify factors that sculpt heterogeneity in TE distribution in S. tropicalis and illustrate that genomic dynamics differ markedly among TE classes and between TE and non-TE repeats.
Affiliation(s)
- Jiangshan J. Shen
- Department of Biology, McMaster University, Hamilton, Ontario, Canada
- Present address: Department of Pathology, The University of Hong Kong, Hong Kong, China
- Jonathan Dushoff
- Department of Biology, McMaster University, Hamilton, Ontario, Canada
- Adam J. Bewick
- Department of Biology, McMaster University, Hamilton, Ontario, Canada
- Frédéric J.J. Chain
- Department of Evolutionary Ecology, Max Planck Institute for Evolutionary Biology, Plön, Germany
- Ben J. Evans
- Department of Biology, McMaster University, Hamilton, Ontario, Canada
|
20
|
Humphrey-Dixon EL, Sharp R, Schuckers M, Lock R. Comparative genome analysis suggests characteristics of yeast inverted repeats that are important for transcriptional activity. Genome 2011; 54:934-42. [PMID: 22029652 DOI: 10.1139/g11-058] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Indexed: 11/22/2022]
Abstract
Inverted repeats are sequences of DNA that, when read in the 5' to 3' direction, have the same sequence on both strands (palindromic portion), with the exception of a small number of nucleotides in the exact center (nonpalindromic spacer). They have been implicated in various DNA-mediated processes including replication, transcription, and genomic instability. At least some of these sequences are capable of forming an alternative DNA structure, called a cruciform, that may be important for mediating these functions. We generated a list of inverted repeats in the Saccharomyces cerevisiae genome and determined which of them are conserved in three related yeasts. We have identified characteristics of inverted repeats that make them more likely to be conserved than the surrounding DNA and characteristics, such as position and base composition, that make the genes they are associated with likely to be more actively transcribed. This is an important step in determining the functions of this group of genomic elements.
|
21
|
Strawbridge EM, Benson G, Gelfand Y, Benham CJ. The distribution of inverted repeat sequences in the Saccharomyces cerevisiae genome. Curr Genet 2010; 56:321-40. [PMID: 20446088 PMCID: PMC2908449 DOI: 10.1007/s00294-010-0302-6] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.6] [Received: 02/09/2010] [Revised: 04/05/2010] [Accepted: 04/08/2010] [Indexed: 02/06/2023]
Abstract
Although a variety of possible functions have been proposed for inverted repeat sequences (IRs), it is not known which of them might occur in vivo. We investigate this question by assessing the distributions and properties of IRs in the Saccharomyces cerevisiae (SC) genome. Using the IRFinder algorithm we detect 100,514 IRs having copy length greater than 6 bp and spacer length less than 77 bp. To assess statistical significance we also determine the IR distributions in two types of randomization of the S. cerevisiae genome. We find that the S. cerevisiae genome is significantly enriched in IRs relative to random. The S. cerevisiae IRs are significantly longer and contain fewer imperfections than those from the randomized genomes, suggesting that processes to lengthen and/or correct errors in IRs may be operative in vivo. The S. cerevisiae IRs are highly clustered in intergenic regions, while their occurrence in coding sequences is consistent with random. Clustering is stronger in the 3' flanks of genes than in their 5' flanks. However, the S. cerevisiae genome is not enriched in those IRs that would extrude cruciforms, suggesting that this is not a common event. Various explanations for these results are considered.
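The comparison against randomized genomes described in this abstract can be sketched in Python as a simple shuffle test. This is only an illustration under loudly stated assumptions: it is not the IRFinder algorithm, it counts perfect repeats only, it uses a plain base-composition-preserving shuffle (the abstract does not say which two randomizations were used), and the function names and the empirical p-value formula are choices made for this example. The defaults of 7 bp arms and a spacer of at most 76 bp mirror "copy length greater than 6 bp and spacer length less than 77 bp".

```python
# Hypothetical illustration; not IRFinder and not the authors' randomization.
import random

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def count_perfect_irs(seq: str, arm_len: int = 7, max_spacer: int = 76) -> int:
    """Count (left arm, right arm) pairs forming perfect inverted repeats."""
    seq = seq.upper()
    count = 0
    for start in range(len(seq) - 2 * arm_len + 1):
        target = seq[start:start + arm_len].translate(COMPLEMENT)[::-1]
        # The right arm must begin after the left arm and end within the
        # window allowed by the maximum spacer.
        window_end = start + 2 * arm_len + max_spacer
        idx = seq.find(target, start + arm_len, window_end)
        while idx != -1:
            count += 1
            idx = seq.find(target, idx + 1, window_end)
    return count


def shuffle_enrichment(seq: str, n_shuffles: int = 100, seed: int = 0, **kwargs):
    """Return (observed count, empirical p-value) against shuffled sequences."""
    rng = random.Random(seed)
    observed = count_perfect_irs(seq, **kwargs)
    bases = list(seq.upper())
    at_least_as_many = 0
    for _ in range(n_shuffles):
        rng.shuffle(bases)  # keeps base composition, destroys base order
        if count_perfect_irs("".join(bases), **kwargs) >= observed:
            at_least_as_many += 1
    # Add-one correction so the estimate is never exactly zero.
    p_value = (at_least_as_many + 1) / (n_shuffles + 1)
    return observed, p_value


if __name__ == "__main__":
    example = "ACGGTCAG" + "TTT" + "CTGACCGT"
    print(shuffle_enrichment(example, arm_len=8, max_spacer=8))
```

Counting fixed-length seeds means one long repeat contributes several counts, so the numbers are comparable between the real and shuffled sequence but are not IR totals in the sense of the paper.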
Affiliation(s)
- Gary Benson
- Laboratory for Biocomputing and Informatics, Boston University, Boston, MA USA
- Yevgeniy Gelfand
- Laboratory for Biocomputing and Informatics, Boston University, Boston, MA USA
- Craig J. Benham
- Department of Mathematics, University of California, Davis, CA 95616 USA
|
22
|
Wang Y, Leung FCC. Discovery of a long inverted repeat in human POTE genes. Genomics 2009; 94:278-83. [PMID: 19463943 DOI: 10.1016/j.ygeno.2009.05.005] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Received: 02/16/2009] [Revised: 05/08/2009] [Accepted: 05/13/2009] [Indexed: 01/18/2023]
Abstract
The POTE gene family is closely associated with prostate, ovary, testis and placenta cancers. We recently identified an intronic long inverted repeat (LIR) in some members of the POTE gene family. Because LIRs can induce gene amplification, the POTE intronic LIRs may be involved in over-expression of the POTE genes. Our study aimed to understand the origin of the LIR in primates. We collected the LIR and its flanking sequences within the rhesus monkey, chimpanzee and human genomes. The rhesus monkey genome has only half-sized LIRs (lacking one repeat copy), whereas the human and chimpanzee genomes contain both full-sized and half-sized LIRs. The phylogenetic tree indicates that the LIR formed after the divergence of rhesus monkey from the common ancestor of human and chimpanzee. The POTE genes containing a full-sized LIR were amplified in the human genome.
Affiliation(s)
- Yong Wang
- School of Biological Sciences and Genome Research Centre, The University of Hong Kong, Pokfulam, Hong Kong, China
|
23
|
Abstract
The human GSTM gene family is composed of five gene members, GSTM1-5, and plays an important role in detoxification. In this study, the human GSTM5 gene was found to have a long inverted repeat (LIR) in intron 5. The LIR is able to form a stem-loop structure with a 31-bp stem and a 9-nt loop. The intronic LIR was also identified in other primates but not in non-primates. The human and chimpanzee LIRs had undergone compensating mutations that make the stem loop more stable, suggesting a functional role for the LIR. Sequence homology showed that the LIR was actually a part of inverted exons acquired by the intron. Results of phylogenetic analysis indicate that the inverted exons were derived from exon 5 of GSTM4 and exon 5 of GSTM1. The intronic LIR and inverted GSTM exons can probably introduce complexity in the expression of GSTM gene family.
|
24
|
Wang Y, Leung FCC. A study on genomic distribution and sequence features of human long inverted repeats reveals species-specific intronic inverted repeats. FEBS J 2009; 276:1986-98. [PMID: 19243432 DOI: 10.1111/j.1742-4658.2009.06930.x] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.2] [Indexed: 01/07/2023]
Abstract
The inverted repeats present in a genome play dual roles. They can induce genomic instability and, on the other hand, regulate gene expression. In the present study, we report the distribution and sequence features of recombinogenic long inverted repeats (LIRs) that are capable of forming stable stem-loops or palindromes within the human genome. A total of 2551 LIRs were identified, and 37% of them were located in long introns (largely > 10 kb) of genes. Their distribution appears to be random in introns and is not restrictive, even for regions near intron-exon boundaries. Almost half of them comprise TG/CA-rich repeats, inversely arranged Alu repeats and MADE1 mariners. The remaining LIRs are mostly unique in their sequence features. Comparative studies of human, chimpanzee, rhesus monkey and mouse orthologous genes reveal that human genes have more recombinogenic LIRs than other orthologs, and over 80% are human-specific. The human genes associated with the human-specific LIRs are involved in the pathways of cell communication, development and the nervous system, as based on significantly over-represented Gene Ontology terms. The functional pathways related to the development and functions of the nervous system are not enriched in chimpanzee and mouse orthologs. The findings of the present study provide insight into the role of intronic LIRs in gene regulation and primate speciation.
Affiliation(s)
- Yong Wang
- School of Biological Sciences, The University of Hong Kong, Hong Kong, China
|
25
|
Replication stalling at unstable inverted repeats: interplay between DNA hairpins and fork stabilizing proteins. Proc Natl Acad Sci U S A 2008; 105:9936-41. [PMID: 18632578 DOI: 10.1073/pnas.0804510105] [Citation(s) in RCA: 199] [Impact Index Per Article: 12.4] [Indexed: 11/18/2022] Open
Abstract
DNA inverted repeats (IRs) are hotspots of genomic instability in both prokaryotes and eukaryotes. This feature is commonly attributed to their ability to fold into hairpin- or cruciform-like DNA structures interfering with DNA replication and other genetic processes. However, direct evidence that IRs are replication stall sites in vivo is currently lacking. Here, we show by 2D electrophoretic analysis of replication intermediates that replication forks stall at IRs in bacteria, yeast, and mammalian cells. We found that DNA hairpins, rather than DNA cruciforms, are responsible for the replication stalling by comparing the effects of specifically designed imperfect IRs with varying lengths of their central spacer. Finally, we report that yeast fork-stabilizing proteins, Tof1 and Mrc1, are required to counteract repeat-mediated replication stalling. We show that the function of the Tof1 protein at DNA structure-mediated stall sites is different from its previously described effect on protein-mediated replication fork barriers.
|
26
|
Coulibaly MB, Lobo NF, Fitzpatrick MC, Kern M, Grushko O, Thaner DV, Traoré SF, Collins FH, Besansky NJ. Segmental duplication implicated in the genesis of inversion 2Rj of Anopheles gambiae. PLoS One 2007; 2:e849. [PMID: 17786220 PMCID: PMC1952172 DOI: 10.1371/journal.pone.0000849] [Citation(s) in RCA: 25] [Impact Index Per Article: 1.5] [Received: 06/05/2007] [Accepted: 08/15/2007] [Indexed: 01/26/2023] Open
Abstract
The malaria vector Anopheles gambiae maintains high levels of inversion polymorphism that facilitate its exploitation of diverse ecological settings across tropical Africa. Molecular characterization of inversion breakpoints is a first step toward understanding the processes that generate and maintain inversions. Here we focused on inversion 2Rj because of its association with the assortatively mating Bamako chromosomal form of An. gambiae, whose distinctive breeding sites are rock pools beside the Niger River in Mali and Guinea. Sequence and computational analysis of 2Rj revealed the same 14.6 kb insertion between both breakpoints, which occurred near but not within predicted genes. Each insertion consists of 5.3 kb terminal inverted repeat arms separated by a 4 kb spacer. The insertions lack coding capacity, and are comprised of degraded remnants of repetitive sequences including class I and II transposable elements. Because of their large size and patchwork composition, and as no other instances of these insertions were identified in the An. gambiae genome, they do not appear to be transposable elements. The 14.6 kb modules inserted at both 2Rj breakpoint junctions represent low copy repeats (LCRs, also called segmental duplications) that are strongly implicated in the recent (∼0.4Ne generations) origin of 2Rj. The LCRs contribute to further genome instability, as demonstrated by an imprecise excision event at the proximal breakpoint of 2Rj in field isolates.
Affiliation(s)
- Mamadou B. Coulibaly
  - Center for Global Health and Infectious Diseases, Department of Biological Sciences, University of Notre Dame, Notre Dame, Indiana, United States of America
  - Malaria Research and Training Center, University of Bamako, Bamako, Mali
- Neil F. Lobo
  - Center for Global Health and Infectious Diseases, Department of Biological Sciences, University of Notre Dame, Notre Dame, Indiana, United States of America
- Meagan C. Fitzpatrick
  - Center for Global Health and Infectious Diseases, Department of Biological Sciences, University of Notre Dame, Notre Dame, Indiana, United States of America
- Marcia Kern
  - Center for Global Health and Infectious Diseases, Department of Biological Sciences, University of Notre Dame, Notre Dame, Indiana, United States of America
- Olga Grushko
  - Center for Global Health and Infectious Diseases, Department of Biological Sciences, University of Notre Dame, Notre Dame, Indiana, United States of America
- Daniel V. Thaner
  - Center for Global Health and Infectious Diseases, Department of Biological Sciences, University of Notre Dame, Notre Dame, Indiana, United States of America
- Sékou F. Traoré
  - Malaria Research and Training Center, University of Bamako, Bamako, Mali
- Frank H. Collins
  - Center for Global Health and Infectious Diseases, Department of Biological Sciences, University of Notre Dame, Notre Dame, Indiana, United States of America
- Nora J. Besansky
  - Center for Global Health and Infectious Diseases, Department of Biological Sciences, University of Notre Dame, Notre Dame, Indiana, United States of America
|
27
|
Inagaki K, Lewis SM, Wu X, Ma C, Munroe DJ, Fuess S, Storm TA, Kay MA, Nakai H. DNA palindromes with a modest arm length of ≳20 base pairs are a significant target for recombinant adeno-associated virus vector integration in the liver, muscles, and heart in mice. J Virol 2007; 81:11290-303. [PMID: 17686840 PMCID: PMC2045527 DOI: 10.1128/jvi.00963-07] [Citation(s) in RCA: 45] [Impact Index Per Article: 2.6] [Indexed: 12/30/2022]
Abstract
Our previous study has shown that recombinant adeno-associated virus (rAAV) vector integrates preferentially in genes, near transcription start sites and CpG islands in mouse liver (H. Nakai, X. Wu, S. Fuess, T. A. Storm, D. Munroe, E. Montini, S. M. Burgess, M. Grompe, and M. A. Kay, J. Virol. 79:3606-3614, 2005). However, the previous method relied on in vivo selection of rAAV integrants and could be employed for the liver but not for other tissues. Here, we describe a novel method for high-throughput rAAV integration site analysis that does not rely on marker gene expression, selection, or cell division, and therefore it can identify rAAV integration sites in nondividing cells without cell manipulations. Using this new method, we identified and characterized a total of 997 rAAV integration sites in mouse liver, skeletal muscle, and heart, transduced with rAAV2 or rAAV8 vector. The results support our previous observations, but notably they have revealed that DNA palindromes with an arm length of ≳20 bp (total length, ≳40 bp) are a significant target for rAAV integration. Up to approximately 30% of total integration events occurred in the vicinity of DNA palindromes with an arm length of ≳20 bp. Considering that DNA palindromes may constitute fragile genomic sites, our results support the notion that rAAV integrates at chromosomal sites susceptible to breakage or preexisting breakage sites. The use of rAAV to label fragile genomic sites may provide an important new tool for probing the intrinsic source of ongoing genomic instability in various tissues in animals, studying DNA palindrome metabolism in vivo, and understanding their possible contributions to carcinogenesis and aging.
Affiliation(s)
- Katsuya Inagaki
- Department of Molecular Genetics & Biochemistry, University of Pittsburgh School of Medicine, W1244 BSTWR, 200 Lothrop St., Pittsburgh, PA 15261, USA
|
28
|
Zhao G, Chang KY, Varley K, Stormo GD. Evidence for active maintenance of inverted repeat structures identified by a comparative genomic approach. PLoS One 2007; 2:e262. [PMID: 17327921 PMCID: PMC1803023 DOI: 10.1371/journal.pone.0000262] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.7] [Received: 11/17/2006] [Accepted: 02/08/2007] [Indexed: 11/19/2022]
Abstract
Inverted repeats have been found to occur in both prokaryotic and eukaryotic genomes. Usually they are short and some have important functions in various biological processes. However, long inverted repeats are rare and can cause genome instability. Analyses of the C. elegans genome identified long, nearly-perfect inverted repeat sequences involving both divergently and convergently oriented homologous gene pairs and complete intergenic sequences. Comparisons with the orthologous regions from the genomes of C. briggsae and C. remanei show that the inverted repeat structures are often far more conserved than the sequences. This observation implies that there is an active mechanism for maintaining the inverted repeat nature of the sequences.
Affiliation(s)
- Guoyan Zhao
  - Department of Genetics, Washington University School of Medicine, St. Louis, Missouri, United States of America
- Kuan Y. Chang
  - Department of Pathology and Immunology, Washington University School of Medicine, St. Louis, Missouri, United States of America
- Katherine Varley
  - Department of Genetics, Washington University School of Medicine, St. Louis, Missouri, United States of America
- Gary D. Stormo
  - Department of Genetics, Washington University School of Medicine, St. Louis, Missouri, United States of America
|