201
|
Shu W, Bo X, Zheng Z, Wang S. A novel representation of RNA secondary structure based on element-contact graphs. BMC Bioinformatics 2008; 9:188. [PMID: 18402706 PMCID: PMC2373570 DOI: 10.1186/1471-2105-9-188] [Citation(s) in RCA: 15] [Impact Index Per Article: 0.9] [Received: 09/04/2007] [Accepted: 04/11/2008] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Depending on their specific structures, noncoding RNAs (ncRNAs) play important roles in many biological processes. Interest in developing new topological indices based on RNA graphs has been revived in recent years, as such indices can be used to compare, identify and classify RNAs. Although the topological indices presented before characterize the main topological features of RNA secondary structures, information on RNA structural details is ignored to some degree. Therefore, it is necessary to identify topological features with low degeneracy based on complete and fine-grained RNA graphical representations. RESULTS In this study, we present a complete and fine scheme for RNA graph representation as a new basis for constructing RNA topological indices. We propose a combination of three vertex-weighted element-contact graphs (ECGs) to describe the RNA element details and their adjacent patterns in RNA secondary structure. Both the stem and loop topologies are encoded completely in the ECGs. The relationship among the three typical topological index families defined by their ECGs and RNA secondary structures was investigated from a dataset of 6,305 ncRNAs. The applicability of topological indices is illustrated by three application case studies. Based on the applied small dataset, we find that the topological indices can distinguish true pre-miRNAs from pseudo pre-miRNAs with about 96% accuracy, and can cluster known types of ncRNAs with about 98% accuracy, respectively. CONCLUSION The results indicate that the topological indices can characterize the details of RNA structures and may have a potential role in identifying and classifying ncRNAs. Moreover, these indices may lead to a new approach for discovering novel ncRNAs. However, further research is needed to fully resolve the challenging problem of predicting and classifying noncoding RNAs.
Affiliation(s)
- Wenjie Shu
- Beijing Institute of Radiation Medicine, Beijing 100850, China.
|
202
|
Allali-Hassani A, Pereira MP, Navani NK, Brown ED, Li Y. Isolation of DNA aptamers for CDP-ribitol synthase, and characterization of their inhibitory and structural properties. Chembiochem 2008; 8:2052-7. [PMID: 17929340 DOI: 10.1002/cbic.200700257] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.4] [Indexed: 11/09/2022]
Affiliation(s)
- Abdellah Allali-Hassani
- Department of Biochemistry and Biomedical Sciences, McMaster University, 1200 Main Street West, Hamilton, ON L8N 3Z5, Canada
|
203
|
Abstract
Genomic evidence reveals that gene expression in humans is precisely controlled in cellular, tissue-type, temporal, and condition-specific manners. Completely understanding the regulatory mechanisms of gene expression is therefore one of the most important issues in genomic medicine. Surprisingly, recent analyses of the human and animal genomes have demonstrated that the majority of RNA transcripts are relatively small, noncoding RNAs (sncRNAs), rather than large, protein-coding messenger RNAs (mRNAs). Moreover, these sncRNAs may represent a novel important layer of regulation for gene expression. The most important breakthrough in this new area is the discovery of microRNAs (miRNAs). miRNAs comprise a novel class of endogenous, small, noncoding RNAs that negatively regulate gene expression via degradation or translational inhibition of their target mRNAs. As a group, miRNAs may directly regulate approximately 30% of the genes in the human genome. In keeping with the nomenclature of RNomics, which is to study sncRNAs on the genomic scale, "microRNomics" is coined here to describe a novel subdiscipline of genomics that studies the identification, expression, biogenesis, structure, regulation of expression, targets, and biological functions of miRNAs on the genomic scale. A growing body of exciting evidence suggests that miRNAs are important regulators of cell differentiation, proliferation/growth, mobility, and apoptosis. These miRNAs therefore play important roles in development and physiology. Consequently, dysregulation of miRNA function may lead to human diseases such as cancer, cardiovascular disease, liver disease, immune dysfunction, and metabolic disorders. microRNomics may be a newly emerging approach for human disease biology.
Affiliation(s)
- Chunxiang Zhang
- RNA and Cardiovascular Research Laboratory, Department of Anesthesiology, New Jersey Medical School, University of Medicine and Dentistry of New Jersey, Newark, New Jersey 07101-1709, USA.
|
204
|
Zhou F, Tran T, Xu Y. Nezha, a novel active miniature inverted-repeat transposable element in cyanobacteria. Biochem Biophys Res Commun 2008; 365:790-4. [DOI: 10.1016/j.bbrc.2007.11.038] [Citation(s) in RCA: 31] [Impact Index Per Article: 1.9] [Received: 11/07/2007] [Accepted: 11/09/2007] [Indexed: 11/16/2022]
|
205
|
Mourier T, Carret C, Kyes S, Christodoulou Z, Gardner PP, Jeffares DC, Pinches R, Barrell B, Berriman M, Griffiths-Jones S, Ivens A, Newbold C, Pain A. Genome-wide discovery and verification of novel structured RNAs in Plasmodium falciparum. Genome Res 2007; 18:281-92. [PMID: 18096748 PMCID: PMC2203626 DOI: 10.1101/gr.6836108] [Citation(s) in RCA: 78] [Impact Index Per Article: 4.6] [Indexed: 01/08/2023]
Abstract
We undertook a genome-wide search for novel noncoding RNAs (ncRNA) in the malaria parasite Plasmodium falciparum. We used the RNAz program to predict structures in the noncoding regions of the P. falciparum 3D7 genome that were conserved with at least one of seven other Plasmodium spp. genome sequences. By using Northern blot analysis for 76 high-scoring predictions and microarray analysis for the majority of candidates, we have verified the expression of 33 novel ncRNA transcripts including four members of a ncRNA family in the asexual blood stage. These transcripts represent novel structured ncRNAs in P. falciparum and are not represented in any RNA databases. We provide supporting evidence for purifying selection acting on the experimentally verified ncRNAs by comparing the nucleotide substitutions in the predicted ncRNA candidate structures in P. falciparum with the closely related chimp malaria parasite P. reichenowi. The high confirmation rate within a single parasite life cycle stage suggests that many more of the predictions may be expressed in other stages of the organism's life cycle.
Affiliation(s)
- Tobias Mourier
- Ancient DNA and Evolution Group, Department of Biology, University of Copenhagen, Copenhagen DK-2100, Denmark
|
206
|
Rose D, Hackermüller J, Washietl S, Reiche K, Hertel J, Findeiss S, Stadler PF, Prohaska SJ. Computational RNomics of drosophilids. BMC Genomics 2007; 8:406. [PMID: 17996037 PMCID: PMC2216035 DOI: 10.1186/1471-2164-8-406] [Citation(s) in RCA: 34] [Impact Index Per Article: 2.0] [Received: 01/15/2007] [Accepted: 11/08/2007] [Indexed: 11/11/2022] Open
Abstract
Background Recent experimental and computational studies have provided overwhelming evidence for a plethora of diverse transcripts that are unrelated to protein-coding genes. One subclass consists of those RNAs that require distinctive secondary structure motifs to exert their biological function and hence exhibit distinctive patterns of sequence conservation characteristic for positive selection on RNA secondary structure. The deep-sequencing of 12 drosophilid species coordinated by the NHGRI provides an ideal data set for comparative computational approaches to determine those genomic loci that code for evolutionarily conserved RNA motifs. This class of loci includes the majority of the known small ncRNAs as well as structured RNA motifs in mRNAs. We report here on a genome-wide survey using RNAz. Results We obtain 16 000 high quality predictions among which we recover the majority of the known ncRNAs. Taking a pessimistically estimated false discovery rate of 40% into account, this implies that at least some ten thousand loci in the Drosophila genome show the hallmarks of stabilizing selection acting on RNA structure, and hence are most likely functional at the RNA level. A subset of RNAz predictions overlapping with TRF1 and BRF binding sites [Isogai et al., EMBO J. 26: 79–89 (2007)], which are plausible candidates of Pol III transcripts, have been studied in more detail. Among these sequences we identify several "clusters" of ncRNA candidates with striking structural similarities. Conclusion The statistical evaluation of the RNAz predictions in comparison with a similar analysis of vertebrate genomes [Washietl et al., Nat. Biotech. 23: 1383–1390 (2005)] shows that qualitatively similar fractions of structured RNAs are found in introns, UTRs, and intergenic regions. The intergenic RNA structures, however, are concentrated much more closely around known protein-coding loci, suggesting that flies have a significantly smaller complement of independent structured ncRNAs compared to mammals.
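The "some ten thousand loci" figure in the Results can be checked with simple arithmetic. The short Python sketch below is only an illustration: the function name is hypothetical, and only the two inputs (16 000 predictions, 40% false discovery rate) come from the abstract.

```python
def expected_true_positives(n_predictions: int, false_discovery_rate: float) -> float:
    """Expected number of genuine hits among n_predictions at a given FDR."""
    if not 0.0 <= false_discovery_rate <= 1.0:
        raise ValueError("false_discovery_rate must lie in [0, 1]")
    return n_predictions * (1.0 - false_discovery_rate)


# Figures quoted in the abstract: 16 000 predictions, pessimistic FDR of 40%.
estimate = expected_true_positives(16_000, 0.40)
print(round(estimate))  # 9600
```

The result, 9,600, is the order of magnitude the abstract rounds to "some ten thousand".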
Affiliation(s)
- Dominic Rose
- Bioinformatics Group, Department of Computer Science, University of Leipzig, Härtelstrasse 16-18, Leipzig, Germany.
|
207
|
Gilbert SD, Love CE, Edwards AL, Batey RT. Mutational analysis of the purine riboswitch aptamer domain. Biochemistry 2007; 46:13297-309. [PMID: 17960911 DOI: 10.1021/bi700410g] [Citation(s) in RCA: 78] [Impact Index Per Article: 4.6] [Indexed: 01/15/2023]
Abstract
The purine riboswitch is one of a number of mRNA elements commonly found in the 5'-untranslated region capable of controlling expression in a cis-fashion via its ability to directly bind small-molecule metabolites. Extensive biochemical and structural analysis of the nucleobase-binding domain of the riboswitch, referred to as the aptamer domain, has revealed that the mRNA recognizes its cognate ligand using an intricately folded three-way junction motif that completely encapsulates the ligand. High-affinity binding of the purine nucleobase is facilitated by a distal loop-loop interaction that is conserved between both the adenine and guanine riboswitches. To understand the contribution of conserved nucleotides in both the three-way junction and the loop-loop interaction of this RNA, we performed a detailed mutagenic survey of these elements in the context of an adenine-responsive variant of the xpt-pbuX guanine riboswitch from Bacillus subtilis. The varying ability of these mutants to bind ligand as measured by isothermal titration calorimetry uncovered the conserved nucleotides whose identity is required for purine binding. Crystallographic analysis of the bound form of five mutants and chemical probing of their free state demonstrate that the identity of several universally conserved nucleotides is not essential for formation of the RNA-ligand complex but rather for maintaining a binding-competent form of the free RNA. These data show that conservation patterns in riboswitches arise from a combination of formation of the ligand-bound complex, promoting an open form of the free RNA, and participating in the secondary structural switch with the expression platform.
Affiliation(s)
- Sunny D Gilbert
- Department of Chemistry and Biochemistry, University of Colorado at Boulder, Campus Box 215, Boulder, Colorado 80309-0215, USA
|
208
|
del Val C, Rivas E, Torres-Quesada O, Toro N, Jiménez-Zurdo JI. Identification of differentially expressed small non-coding RNAs in the legume endosymbiont Sinorhizobium meliloti by comparative genomics. Mol Microbiol 2007; 66:1080-91. [PMID: 17971083 PMCID: PMC2780559 DOI: 10.1111/j.1365-2958.2007.05978.x] [Citation(s) in RCA: 80] [Impact Index Per Article: 4.7] [Indexed: 12/29/2022]
Abstract
Bacterial small non-coding RNAs (sRNAs) are being recognized as novel widespread regulators of gene expression in response to environmental signals. Here, we present the first search for sRNA-encoding genes in the nitrogen-fixing endosymbiont Sinorhizobium meliloti, performed by a genome-wide computational analysis of its intergenic regions. Comparative sequence data from eight related α-proteobacteria were obtained, and the interspecies pairwise alignments were scored with the programs eQRNA and RNAz as complementary predictive tools to identify conserved and stable secondary structures corresponding to putative non-coding RNAs. Northern experiments confirmed that eight of the predicted loci, selected among the original 32 candidates as most probable sRNA genes, expressed small transcripts. This result supports the combined use of eQRNA and RNAz as a robust strategy to identify novel sRNAs in bacteria. Furthermore, seven of the transcripts accumulated differentially in free-living and symbiotic conditions. Experimental mapping of the 5′-ends of the detected transcripts revealed that their encoding genes are organized in autonomous transcription units with recognizable promoter and, in most cases, termination signatures. These findings suggest novel regulatory functions for sRNAs related to the interactions of α-proteobacteria with their eukaryotic hosts.
Affiliation(s)
- Coral del Val
- Department of Computer Science and Artificial Intelligence, E.T.S.I. Informatics, Universidad de Granada, Daniel Saucedo s/n, 18071 Granada, Spain
|
209
|
Machado-Lima A, del Portillo HA, Durham AM. Computational methods in noncoding RNA research. J Math Biol 2007; 56:15-49. [PMID: 17786447 DOI: 10.1007/s00285-007-0122-6] [Citation(s) in RCA: 45] [Impact Index Per Article: 2.6] [Received: 01/01/2007] [Indexed: 11/26/2022]
Abstract
Non-protein-coding RNAs (ncRNAs) are a research hotspot in bioinformatics. Recent discoveries have revealed new ncRNA families performing a variety of roles, from gene expression regulation to catalytic activities. It is also believed that other families are still to be unveiled. Computational methods developed for protein-coding genes often fail when searching for ncRNAs. Noncoding RNA functionality is often heavily dependent on secondary structure, which makes gene discovery very different from that for protein-coding RNA genes. This motivated the development of specific methods for ncRNA research. This article reviews the main approaches used to identify ncRNAs and predict secondary structure.
Affiliation(s)
- Ariane Machado-Lima
- Institute of Mathematics and Statistics, University of Sao Paulo, Sao Paulo, SP, Brazil.
|
210
|
Luo Y, Gong X, Xu L, Li S. Isolation of RNA and RT-PCR, cloning, and sequencing of noncoding RNAs from fungi. Biochemistry and Molecular Biology Education 2007; 35:355-358. [PMID: 21591123 DOI: 10.1002/bmb.76] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Indexed: 05/30/2023]
Abstract
RNomics, the understanding of functional RNAs and their interactions at a genomic level, is of utmost practical and theoretical importance in modern life sciences. To introduce our students to the techniques and promise of this emerging field, a practical class activity for advanced undergraduate students in biochemistry and molecular biology is described. In these exercises, students first identify noncoding RNA from different fungi by computational methods and analyze their transcription regulation signals and splice signals by bioinformatics tools, then isolate total RNA from these fungi, and finally verify these noncoding RNA gene expressions by reverse transcriptase-polymerase chain reaction, cloning, and sequencing. This activity not only introduces students to the concept of RNomics, noncoding RNA, and RNA splicing, but also introduces students to the practice of basic molecular techniques. The natural combination of the genome projects and bioinformatics with modern molecular biology techniques is considered a major advantage of this laboratory course.
Affiliation(s)
- Yuping Luo
- From the Laboratory of Molecular Biology and Gene Engineering, College of Life Science, Nanchang University, Nanchang 330031, China
|
211
|
Vallon-Christersson J, Staaf J, Kvist A, Medstrand P, Borg Å, Rovira C. Non-coding antisense transcription detected by conventional and single-stranded cDNA microarray. BMC Genomics 2007; 8:295. [PMID: 17727707 PMCID: PMC2020490 DOI: 10.1186/1471-2164-8-295] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.5] [Received: 03/05/2007] [Accepted: 08/29/2007] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Recent studies revealed that many mammalian protein-coding genes also transcribe their complementary strands. This phenomenon raises questions regarding the validity of data obtained from double-stranded cDNA microarrays since hybridization to both strands may occur. Here, we wanted to analyze experimentally the incidence of antisense transcription in human cells and to estimate its influence on protein-coding expression patterns obtained by double-stranded microarrays. Therefore, we profiled transcription of sense and antisense independently by using strand-specific cDNA microarrays. RESULTS Up to 88% of expressed protein-coding loci displayed concurrent expression from the complementary strand. Antisense transcription is cell specific and showed a strong tendency to be positively correlated to the expression of the sense counterparts. Even if their expression is widespread, detected antisense signals seem to have a limited distorting effect on sense profiles obtained with double-stranded probes. CONCLUSION Antisense transcription in humans can be far more common than previously estimated. However, it has limited influence on expression profiles obtained with conventional cDNA probes. This can be explained by a biological phenomenon and a bias of the technique: a) a co-ordinate sense and antisense expression variation and b) a bias for sense-hybridization to occur with more efficiency, presumably due to variable exonic overlap between antisense transcripts.
Affiliation(s)
- Johan Vallon-Christersson
- Department of Oncology, Institute of Clinical Sciences, and SWEGENE DNA microarray resource center, Lund University, Barngatan 2:1, SE-221 85 Lund, Sweden
- Johan Staaf
- Department of Oncology, Institute of Clinical Sciences, and SWEGENE DNA microarray resource center, Lund University, Barngatan 2:1, SE-221 85 Lund, Sweden
- Anders Kvist
- Genomics and Bioinformatics, Department of Experimental Medical Science, BMC C13, SE-221 84 Lund, Sweden
- Patrik Medstrand
- Genomics and Bioinformatics, Department of Experimental Medical Science, BMC C13, SE-221 84 Lund, Sweden
- Åke Borg
- Department of Oncology, Institute of Clinical Sciences, and SWEGENE DNA microarray resource center, Lund University, Barngatan 2:1, SE-221 85 Lund, Sweden
- Lund Stem Cell Centre, University of Lund, BMC C13 SE-221 84 Lund, Sweden
- Carlos Rovira
- Department of Oncology, Institute of Clinical Sciences, and SWEGENE DNA microarray resource center, Lund University, Barngatan 2:1, SE-221 85 Lund, Sweden
- Lund Stem Cell Centre, University of Lund, BMC C13 SE-221 84 Lund, Sweden
|
212
|
Chen XH, Koumoutsi A, Scholz R, Eisenreich A, Schneider K, Heinemeyer I, Morgenstern B, Voss B, Hess WR, Reva O, Junge H, Voigt B, Jungblut PR, Vater J, Süssmuth R, Liesegang H, Strittmatter A, Gottschalk G, Borriss R. Comparative analysis of the complete genome sequence of the plant growth–promoting bacterium Bacillus amyloliquefaciens FZB42. Nat Biotechnol 2007; 25:1007-14. [PMID: 17704766 DOI: 10.1038/nbt1325] [Citation(s) in RCA: 493] [Impact Index Per Article: 29.0] [Received: 05/14/2007] [Accepted: 07/09/2007] [Indexed: 11/09/2022]
Abstract
Bacillus amyloliquefaciens FZB42 is a Gram-positive, plant-associated bacterium, which stimulates plant growth and produces secondary metabolites that suppress soil-borne plant pathogens. Its 3,918-kb genome, containing an estimated 3,693 protein-coding sequences, lacks extended phage insertions, which occur ubiquitously in the closely related Bacillus subtilis 168 genome. The B. amyloliquefaciens FZB42 genome reveals an unexpected potential to produce secondary metabolites, including the polyketides bacillaene and difficidin. More than 8.5% of the genome is devoted to synthesizing antibiotics and siderophores by pathways not involving ribosomes. Besides five gene clusters, known from B. subtilis to mediate nonribosomal synthesis of secondary metabolites, we identified four giant gene clusters absent in B. subtilis 168. The pks2 gene cluster encodes the components to synthesize the macrolactin core skeleton.
Affiliation(s)
- Xiao Hua Chen
- Bakteriengenetik, Institut für Biologie, Humboldt Universität, Chausseestrasse 117, D-10115 Berlin, Germany
|
213
|
Abstract
While the concept of a gene has been helpful in defining the relationship of a portion of a genome to a phenotype, this traditional term may not be as useful as it once was. Currently, "gene" has come to refer principally to a genomic region producing a polyadenylated mRNA that encodes a protein. However, the recent emergence of a large collection of unannotated transcripts with apparently little protein-coding capacity, collectively called transcripts of unknown function (TUFs), has begun to blur the physical boundaries and genomic organization of genic regions, with noncoding transcripts often overlapping protein-coding genes on the same (sense) and opposite (antisense) strands. Moreover, they are often located in intergenic regions, making the genic portions of the human genome an interleaved network of both annotated polyadenylated and nonpolyadenylated transcripts, including splice variants with novel 5' ends extending hundreds of kilobases. This complex transcriptional organization and other recently observed features of genomes argue for the reconsideration of the term "gene" and suggest that transcripts may be used to define the operational unit of a genome.
|
214
|
Yang DH, Barari M, Arif BM, Krell PJ. Development of an oligonucleotide-based DNA microarray for transcriptional analysis of Choristoneura fumiferana nucleopolyhedrovirus (CfMNPV) genes. J Virol Methods 2007; 143:175-85. [PMID: 17428552 DOI: 10.1016/j.jviromet.2007.03.007] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.4] [Received: 12/04/2006] [Revised: 03/05/2007] [Accepted: 03/08/2007] [Indexed: 11/25/2022]
Abstract
A modified oligonucleotide-based two-channel DNA microarray was developed for characterization of temporal expression profiles of select Choristoneura fumiferana nucleopolyhedrovirus (CfMNPV) ORFs including its 7 unique ORFs. The microarray chip contained oligonucleotide probes for 23 CfMNPV ORFs and their complements as well as five host genes. Total RNA was isolated at different times post infection from Cf203 insect cells infected with CfMNPV. The cDNA was synthesized, fluorescently labelled with Cy3, and co-hybridized to the microarray chips along with Cy5-labelled viral genomic DNA, which served as equimolar reference standards for each probe. Transcription of the 7 CfMNPV unique ORFs was detected using DNA microarray analysis and their temporal expression profiles suggest that they are functional genes. The expression levels of three host genes varied throughout virus infection and therefore were unsuitable for normalization between microarrays. The DNA microarray results were compared to quantitative RT-PCR (qRT-PCR). Transcription of the non-coding (antisense) strands of some of the select CfMNPV genes, including the polyhedrin gene, was also detected by array analysis and confirmed by qRT-PCR. The polyhedrin antisense transcript, based on long-range RT-PCR analysis, appeared to be a read-through product of an adjacent ORF in the same orientation as the antisense transcript.
Affiliation(s)
- Dan-Hui Yang
- Department of Molecular and Cellular Biology, University of Guelph, Guelph, Ont. N1G 2W1, Canada
|
215
|
Discovery of Small Regulatory RNAs Extends Our Understanding of Gene Regulation in the Acidithiobacillus Genus. Advanced Materials Research 2007. [DOI: 10.4028/www.scientific.net/amr.20-21.535] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Indexed: 11/21/2022]
Abstract
Small regulatory RNAs (srRNAs) control gene expression in Bacteria, usually at the posttranscriptional level, by acting as antisense RNAs that bind targeted mRNAs or by interacting with regulatory proteins. srRNAs are involved in the regulation of a large variety of processes such as plasmid replication, transposition and global genetic circuits that respond to environmental changes. Since their discovery a few years ago, it has become apparent that they are prolific and widespread. In this study, we describe bioinformatic approaches to srRNA discovery in the biomining microorganisms Acidithiobacillus ferrooxidans, A. caldus and A. thiooxidans. Intergenic regions of the annotated genomes were extracted and computationally searched for srRNAs. Candidate srRNAs that were associated with predicted sigma 70 promoters and/or rho-independent terminators were chosen for further study. The resulting potential srRNAs include known examples from other microorganisms and some novel candidates and reveal interesting underlying biology of the Acidithiobacillus genus.
|
216
|
Can Clustal-style progressive pairwise alignment of multiple sequences be used in RNA secondary structure prediction? BMC Bioinformatics 2007; 8:190. [PMID: 17559658 PMCID: PMC1904245 DOI: 10.1186/1471-2105-8-190] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.3] [Received: 09/25/2006] [Accepted: 06/08/2007] [Indexed: 11/15/2022] Open
Abstract
Background In ribonucleic acid (RNA) molecules whose function depends on their final, folded three-dimensional shape (such as those in ribosomes or spliceosome complexes), the secondary structure, defined by the set of internal basepair interactions, is more consistently conserved than the primary structure, defined by the sequence of nucleotides. Results The research presented here investigates the possibility of applying a progressive, pairwise approach to the alignment of multiple RNA sequences by simultaneously predicting an energy-optimized consensus secondary structure. We take an existing algorithm for finding the secondary structure common to two RNA sequences, Dynalign, and alter it to align profiles of multiple sequences. We then explore the relative successes of different approaches to designing the tree that will guide progressive alignments of sequence profiles to create a multiple alignment and prediction of conserved structure. Conclusion We have found that applying a progressive, pairwise approach to the alignment of multiple ribonucleic acid sequences produces highly reliable predictions of conserved basepairs, and we have shown how these predictions can be used as constraints to improve the results of a single-sequence structure prediction algorithm. However, we have also discovered that the amount of detail included in a consensus structure prediction is highly dependent on the order in which sequences are added to the alignment (the guide tree), and that if a consensus structure does not have sufficient detail, it is less likely to provide useful constraints for the single-sequence method.
|
217
|
Horesh Y, Amir A, Michaeli S, Unger R. RNAMAT: an efficient method to detect classes of RNA molecules and their structural features. Conference Proceedings : ... Annual International Conference of the IEEE Engineering in Medicine and Biology Society. IEEE Engineering in Medicine and Biology Society. Annual Conference 2007; 2004:2869-72. [PMID: 17270876 DOI: 10.1109/iembs.2004.1403817] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 11/09/2022]
Abstract
There is a growing appreciation for the diverse and important roles RNA molecules play in cellular function. RNAMAT is an approach based on a matrix representation of all potential base pairings in a set of sequences to reveal common secondary-structure features. When the RNA sequences come from one class, proper summation of these matrices exposes common structural features, as demonstrated for tRNA and H/ACA-RNA. For C/D-RNA, a novel structural motif is suggested. Furthermore, it is demonstrated in the case of tmRNA that the method can detect pseudoknots, structural motifs that are difficult to detect with other methods. When the sequences come from diverse sources, a specific clustering algorithm is suggested that is capable of detecting the common motifs. The algorithm is demonstrated on a simulated example and on a real case derived from a comparative RNomics study of trypanosomes.
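The matrix idea in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the pairing rules (Watson-Crick plus G-U wobble), the equal-length requirement, and the function names are all assumptions made for illustration, and the abstract does not specify how RNAMAT performs its "proper summation".

```python
# Allowed pairs: Watson-Crick plus G-U wobble (an assumption of this sketch).
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def pairing_matrix(seq: str) -> list[list[int]]:
    """Binary matrix m where m[i][j] = 1 if bases i and j could pair."""
    n = len(seq)
    return [[1 if (seq[i], seq[j]) in PAIRS else 0 for j in range(n)]
            for i in range(n)]


def summed_matrix(seqs: list[str]) -> list[list[int]]:
    """Element-wise sum of the pairing matrices of equal-length sequences."""
    lengths = {len(s) for s in seqs}
    if len(lengths) != 1:
        raise ValueError("this sketch assumes equal-length (pre-aligned) sequences")
    n = lengths.pop()
    total = [[0] * n for _ in range(n)]
    for s in seqs:
        m = pairing_matrix(s)
        for i in range(n):
            for j in range(n):
                total[i][j] += m[i][j]
    return total


# Two toy hairpins: positions 0-2 can pair with positions 8-6 in both.
total = summed_matrix(["GGGAAACCC", "GCGAAACGC"])
print(total[0][8], total[3][5])  # 2 0
```

Cells with high counts mark base pairs possible in many sequences at once, which is the sense in which summation can expose shared structure. A real analysis would also have to handle sequences of different lengths.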
Affiliation(s)
- Yair Horesh
- Dept. of Comput. Sci., Bar-Ilan Univ., Ramat-Gan, Israel
|
218
|
Jossinet F, Ludwig TE, Westhof E. RNA structure: bioinformatic analysis. Curr Opin Microbiol 2007; 10:279-85. [PMID: 17548241 DOI: 10.1016/j.mib.2007.05.010] [Citation(s) in RCA: 29] [Impact Index Per Article: 1.7] [Received: 01/24/2007] [Accepted: 05/23/2007] [Indexed: 01/30/2023]
Abstract
The range of functions ascribed to RNA molecules has grown considerably during recent years. Consequently, the analysis and comparison of RNA sequences have become recurrent tasks in molecular biology. Because the biological function of an RNA is expressed more by its folded architecture than by its sequence, original computational tools adapted to the multifaceted RNA functions have to be developed. Such tools, recently published, enable a user to solve classical problems related to RNA research: constructing 'structural' multiple alignments, inferring complete structures and structural motifs from RNA alignments, or searching structural homology in genomic databases.
Affiliation(s)
- Fabrice Jossinet
- Architecture et Réactivité de l'ARN, Université Louis Pasteur, Institut de Biologie Moléculaire et Cellulaire, CNRS, F-67084 Strasbourg, France
|
219
|
Sridhar J, Rafi ZA. Small RNA identification in Enterobacteriaceae using synteny and genomic backbone retention. OMICS: A Journal of Integrative Biology 2007; 11:74-99. [PMID: 17411397 DOI: 10.1089/omi.2006.0006] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.5] [Indexed: 11/12/2022]
Abstract
Genomic screens for small RNA candidates in Enterobacteriaceae genomes were carried out with existing small RNA sequences, conserved flanking genes, and genomic backbone information. The small RNA sequences and contexts from E. coli K12 formed the basis of the search. Sequence identity identified 117 additional small RNA homologs in related genomes. Motifs of continuous sequence stretches added another 48 sRNA regions, termed partial homologs. However, this study is unique in identifying 160 nonhomologous sRNA loci in related genomes based on the conserved flanking gene synteny and the backbone retention information obtained from KEGG-SSDB. Gene synteny and genomic backbone continuity were observed to be correlated with all of the sRNAs in related genomes. This search is the first of its kind toward identification of functionally important regions using gene order and backbone information. A disruption in flanking gene order or genomic backbone indicates a possible hotspot for alien gene pool integration. This study reports both the occurrence of multiple copies of an sRNA and the co-occurrence of different sRNAs between a pair of conserved flanking genes. In general, synteny and genomic backbone retention information can be added as search criteria in the design of precise bioinformatics tools for sRNA identification, gene identification, and gene functional annotation in related genomes.
Affiliation(s)
- Jayavel Sridhar
- Centre of Excellence in Bioinformatics, School of Biotechnology, Madurai Kamaraj University, Madurai, Tamilnadu, India
|
220
|
Mrázek J, Kreutmayer SB, Grässer FA, Polacek N, Hüttenhofer A. Subtractive hybridization identifies novel differentially expressed ncRNA species in EBV-infected human B cells. Nucleic Acids Res 2007; 35:e73. [PMID: 17478510 PMCID: PMC1904266 DOI: 10.1093/nar/gkm244] [Citation(s) in RCA: 85] [Impact Index Per Article: 5.0] [Received: 03/07/2007] [Accepted: 04/03/2007] [Indexed: 12/31/2022] Open
Abstract
Non-protein-coding RNAs (ncRNAs) fulfill a wide range of cellular functions from protein synthesis to regulation of gene expression. Identification of novel regulatory ncRNAs by experimental approaches commonly includes the generation of specialized cDNA libraries encoding small ncRNA species. However, such identification is severely hampered by the presence of constitutively expressed and highly abundant 'house-keeping' ncRNAs, such as ribosomal RNAs, small nuclear RNAs or transfer RNAs. We have developed a novel experimental strategy, designated as subtractive hybridization of ncRNA transcripts (SHORT) to specifically select and amplify novel regulatory ncRNAs, which are only expressed at certain stages or under specific growth conditions of cells. The method is based on the selective subtractive hybridization technique, formerly applied to the detection of differentially expressed mRNAs. As a model system, we applied SHORT to Epstein-Barr virus (EBV) infected human B cells. Thereby, we identified 21 novel as well as previously reported ncRNA species to be up-regulated during virus infection. Our method will serve as a powerful tool to identify novel functional ncRNAs acting as genetic switches in the regulation of fundamental cellular processes such as development, tissue differentiation or disease.
Affiliation(s)
- Jan Mrázek
- Innsbruck Biocenter, Division of Genomics and RNomics, Innsbruck Medical University, Fritz-Pregl-Strasse 3, 6020 Innsbruck, Austria and Institut für Mikrobiologie und Hygiene, Abteilung Virologie, Haus 47, Universitätskliniken, D-66421 Homburg/Saar, Germany
- Simone B. Kreutmayer
- Innsbruck Biocenter, Division of Genomics and RNomics, Innsbruck Medical University, Fritz-Pregl-Strasse 3, 6020 Innsbruck, Austria and Institut für Mikrobiologie und Hygiene, Abteilung Virologie, Haus 47, Universitätskliniken, D-66421 Homburg/Saar, Germany
- Friedrich A. Grässer
- Innsbruck Biocenter, Division of Genomics and RNomics, Innsbruck Medical University, Fritz-Pregl-Strasse 3, 6020 Innsbruck, Austria and Institut für Mikrobiologie und Hygiene, Abteilung Virologie, Haus 47, Universitätskliniken, D-66421 Homburg/Saar, Germany
- Norbert Polacek
- Innsbruck Biocenter, Division of Genomics and RNomics, Innsbruck Medical University, Fritz-Pregl-Strasse 3, 6020 Innsbruck, Austria and Institut für Mikrobiologie und Hygiene, Abteilung Virologie, Haus 47, Universitätskliniken, D-66421 Homburg/Saar, Germany
- Alexander Hüttenhofer
- Innsbruck Biocenter, Division of Genomics and RNomics, Innsbruck Medical University, Fritz-Pregl-Strasse 3, 6020 Innsbruck, Austria and Institut für Mikrobiologie und Hygiene, Abteilung Virologie, Haus 47, Universitätskliniken, D-66421 Homburg/Saar, Germany
|
221
|
Busch A, Backofen R. INFO-RNA--a server for fast inverse RNA folding satisfying sequence constraints. Nucleic Acids Res 2007; 35:W310-3. [PMID: 17452349 PMCID: PMC1933236 DOI: 10.1093/nar/gkm218] [Citation(s) in RCA: 26] [Impact Index Per Article: 1.5] [Indexed: 11/12/2022] Open
Abstract
INFO-RNA is a new web server for designing RNA sequences that fold into a user-given secondary structure. Furthermore, constraints on the sequence can be specified, e.g. one can restrict sequence positions to a fixed nucleotide or to a set of nucleotides. Moreover, the user can allow violations of the constraints at some positions, which can be advantageous in complicated cases. The INFO-RNA web server allows biologists to design RNA sequences in an automatic manner. It is clearly and intuitively arranged and easy to use. The procedure is fast, as most applications are completed within seconds, and it proceeds better and faster than other existing tools. The INFO-RNA web server is freely available at http://www.bioinf.uni-freiburg.de/Software/INFO-RNA/
Affiliation(s)
- Anke Busch
- Albert-Ludwigs-University Freiburg, Institute of Computer Science, Bioinformatics Group, Georges-Koehler-Allee 106, 79110 Freiburg, Germany.
|
222
|
Shimada N, Kawata T. Evidence that noncoding RNA dutA is a multicopy suppressor of Dictyostelium discoideum STAT protein Dd-STATa. Eukaryot Cell 2007; 6:1030-40. [PMID: 17435008 PMCID: PMC1951520 DOI: 10.1128/ec.00035-07] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.8] [Indexed: 01/07/2023]
Abstract
Dd-STATa, a Dictyostelium discoideum homologue of metazoan STAT transcription factors, is necessary for culmination. We created a mutant strain with partial Dd-STATa activity and used it to screen for unlinked suppressor genes. We screened approximately 450,000 clones from a slug-stage cDNA library for their ability to rescue the culmination defect when overexpressed. There were 12 multicopy suppressors of Dd-STATa, of which 4 encoded segments of a known noncoding RNA, dutA. Expression of dutA is specific to the pstA zone, the region where Dd-STATa is activated. In suppressed strains, the expression patterns of several putative Dd-STATa target genes become similar to those of the wild-type strain. In addition, the amount of the tyrosine-phosphorylated form of Dd-STATa is significantly increased in the suppressed strain. These results indicate that partial copies of dutA may act upstream of Dd-STATa to regulate tyrosine phosphorylation by an unknown mechanism.
Affiliation(s)
- Nao Shimada
- Department of Biology, Faculty of Science, Toho University, 2-2-1 Miyama, Funabashi, Chiba 274-8510, Japan
|
223
|
Kim N, Gan HH, Schlick T. A computational proposal for designing structured RNA pools for in vitro selection of RNAs. RNA 2007; 13:478-92. [PMID: 17322501 PMCID: PMC1831855 DOI: 10.1261/rna.374907] [Citation(s) in RCA: 33] [Impact Index Per Article: 1.9] [Indexed: 05/07/2023]
Abstract
Although in vitro selection technology is a versatile experimental tool for discovering novel synthetic RNA molecules, finding complex RNA molecules is difficult because most RNAs identified from random sequence pools are simple motifs, consistent with recent computational analysis of such sequence pools. Thus, enriching in vitro selection pools with complex structures could increase the probability of discovering novel RNAs. Here we develop an approach for engineering sequence pools that links RNA sequence space regions with corresponding structural distributions via a "mixing matrix" approach combined with a graph theory analysis. We define five classes of mixing matrices motivated by covariance mutations in RNA; these constructs define nucleotide transition rates and are applied to chosen starting sequences to yield specific nonrandom pools. We examine the coverage of sequence space as a function of the mixing matrix and starting sequence via clustering analysis. We show that, in contrast to random sequences, which are associated only with a local region of sequence space, our designed pools, including a structured pool for GTP aptamers, can target specific motifs. It follows that experimental synthesis of designed pools can benefit from using optimized starting sequences, mixing matrices, and pool fractions associated with each of our constructed pools as a guide. Automation of our approach could provide practical tools for pool design applications for in vitro selection of RNAs and related problems.
Affiliation(s)
- Namhee Kim
- Department of Chemistry, New York University, New York, NY 10003, USA
|
224
|
Salon J, Sheng J, Jiang J, Chen G, Caton-Williams J, Huang Z. Oxygen replacement with selenium at the thymidine 4-position for the Se base pairing and crystal structure studies. J Am Chem Soc 2007; 129:4862-3. [PMID: 17388591 DOI: 10.1021/ja0680919] [Citation(s) in RCA: 72] [Impact Index Per Article: 4.2] [Indexed: 11/30/2022]
Affiliation(s)
- Jozef Salon
- Department of Chemistry, Georgia State University, Atlanta, Georgia 30303, USA
|
225
|
Liu J, Ma B, Zhang K. An algorithm for searching RNA motifs in genomic sequences. Biomol Eng 2007; 24:343-50. [PMID: 17482512 DOI: 10.1016/j.bioeng.2007.02.005] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 02/02/2006] [Revised: 05/11/2006] [Accepted: 05/11/2006] [Indexed: 11/22/2022]
Abstract
RNA molecules, which are found in all living cells, fold into characteristic structures that account for their diverse functional activities. Many of these RNA structures consist of a collection of fundamental RNA motifs. The various combinations of basic RNA components form different RNA classes and define their unique structural and functional properties. The availability of many genome sequences makes it possible to search computationally for functional RNAs. Biological experiments indicate that functional RNAs have characteristic RNA structural motifs represented by specific combinations of base pairings and conserved nucleotides in the loop regions. Searching for these well-ordered RNA structures and their homologues in genomic sequences is very helpful for understanding RNA-based gene regulation. In this paper, we consider the following problem: given an RNA sequence with a known secondary structure, efficiently determine candidate segments in genomic sequences that can potentially form RNA secondary structures similar to the given RNA secondary structure. Our new bottom-up approach first searches for all potential stem-loops similar to those of the given RNA secondary structure, and then, based on the located stem-loops, detects potential homologous structural RNAs in genomic sequences.
Affiliation(s)
- Jingping Liu
- Department of Computer Science, University of Western Ontario, London, Ontario, Canada.
|
226
|
Baker ML, Indiviglio S, Nyberg AM, Rosenberg GH, Lindblad-Toh K, Miller RD, Papenfuss AT. Analysis of a set of Australian northern brown bandicoot expressed sequence tags with comparison to the genome sequence of the South American grey short tailed opossum. BMC Genomics 2007; 8:50. [PMID: 17298671 PMCID: PMC1802078 DOI: 10.1186/1471-2164-8-50] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.8] [Received: 10/03/2006] [Accepted: 02/13/2007] [Indexed: 12/21/2022] Open
Abstract
Background: Expressed sequence tags (ESTs) have been used for rapid gene discovery in a variety of organisms and provide a valuable resource for whole genome annotation. Although the genome of one marsupial, the opossum Monodelphis domestica, has now been sequenced, no EST datasets have been reported from any marsupial species. In this study we describe an EST dataset from the bandicoot, Isoodon macrourus, providing information on the transcriptional profile of the bandicoot thymus and the opportunity for a genome-wide comparison between the bandicoot and opossum, two distantly related marsupial species.
Results: A set of 1319 ESTs was generated from sequencing randomly chosen clones from a bandicoot thymus cDNA library. The nucleic acid and deduced amino acid sequences were compared with sequences both in GenBank and in the recently completed whole genome sequence of M. domestica. This study provides information on the transcriptional profile of the bandicoot thymus with the identification of genes involved in a broad range of activities including protein metabolism (24%), transcription and/or nucleic acid metabolism (10%), metabolism/energy pathways (9%), immunity (5%), signal transduction (5%), cell growth and maintenance (3%), transport (3%), cell cycle (0.7%) and apoptosis (0.5%), and a proportion of genes whose function is unknown (5.8%). Thirty-four percent of the bandicoot ESTs found no match with annotated sequences in any of the public databases. Clustering and assembly of the 1319 bandicoot ESTs resulted in a set of 949 unique sequences, of which 375 were unannotated ESTs. Of these, seventy-one unannotated ESTs aligned to non-coding regions in the opossum, human, or both genomes, and were identified as strong non-coding RNA candidates. Eighty-four percent of the 949 assembled ESTs aligned with the M. domestica genome sequence, indicating a high level of conservation between these two distantly related marsupials.
Conclusion: This study is among the first reported marsupial EST datasets with a significant inter-species genome comparison between marsupials, providing a valuable resource for transcriptional analyses in marsupials and for future annotation of marsupial whole genome sequences.
Affiliation(s)
- Michelle L Baker
- Center for Evolutionary and Theoretical Immunology, Department of Biology, University of New Mexico, Albuquerque, NM 87131, USA
- Sandra Indiviglio
- Center for Evolutionary and Theoretical Immunology, Department of Biology, University of New Mexico, Albuquerque, NM 87131, USA
- April M Nyberg
- Center for Evolutionary and Theoretical Immunology, Department of Biology, University of New Mexico, Albuquerque, NM 87131, USA
- George H Rosenberg
- Center for Evolutionary and Theoretical Immunology, Department of Biology, University of New Mexico, Albuquerque, NM 87131, USA
- Kerstin Lindblad-Toh
- Broad Institute of MIT and Harvard, 7 Cambridge Center, Cambridge, MA 02142, USA
- Robert D Miller
- Center for Evolutionary and Theoretical Immunology, Department of Biology, University of New Mexico, Albuquerque, NM 87131, USA
- Anthony T Papenfuss
- Bioinformatics Division, The Walter and Eliza Hall Institute of Medical Research, Parkville, Australia
|
227
|
Ng Kwang Loong S, Mishra SK. Unique folding of precursor microRNAs: quantitative evidence and implications for de novo identification. RNA 2007; 13:170-87. [PMID: 17194722 PMCID: PMC1781370 DOI: 10.1261/rna.223807] [Citation(s) in RCA: 35] [Impact Index Per Article: 2.1] [Indexed: 05/02/2023]
Abstract
MicroRNAs (miRNAs) participate in diverse cellular and physiological processes through the post-transcriptional gene regulatory pathway. The hairpin is a crucial structural feature for the computational identification of precursor miRNAs (pre-miRs), as its formation is critically associated with the early stages of mature miRNA biogenesis. Our incomplete knowledge about the number of miRNAs present in the genomes of vertebrates, worms, plants, and even viruses necessitates a thorough understanding of their sequence motifs, hairpin structural characteristics, and topological descriptors. In this in-depth study, we investigate a comprehensive and heterogeneous collection of 2241 published (nonredundant) pre-miRs across 41 species (miRBase 8.2), 8494 pseudohairpins extracted from the human RefSeq genes, 12,387 (nonredundant) ncRNAs spanning 457 types (Rfam 7.0), 31 full-length mRNAs randomly selected from GenBank, and four sets of synthetically generated genomic background corresponding to each of the native RNA sequences. Our large-scale characterization analysis reveals that pre-miRs are significantly different from other types of ncRNAs, pseudohairpins, mRNAs, and genomic background according to the nonparametric Kruskal-Wallis ANOVA (p<0.001). We examine the intrinsic and global features at the sequence, structural, and topological levels including %G+C content, normalized base-pairing propensity P(S), normalized minimum free energy of folding MFE(s), normalized Shannon entropy Q(s), normalized base-pair distance D(s), and degree of compactness F(S), as well as their corresponding Z scores of P(S), MFE(s), Q(s), D(s), and F(S). The findings will promote more accurate guidelines and distinctive criteria for the prediction of novel pre-miRs with improved performance.
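Two of the simpler features named in this abstract can be sketched from a sequence and its dot-bracket structure. The sketch assumes that "normalized" means dividing by sequence length, which the abstract does not state, and it takes the structure as given instead of folding the sequence. The function names are hypothetical.

```python
def gc_percent(seq):
    """Percentage of G and C bases in the sequence."""
    seq = seq.upper()
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def base_pairs(dot_bracket):
    """Count base pairs in a balanced dot-bracket string such as '((..))'."""
    depth = 0
    pairs = 0
    for ch in dot_bracket:
        if ch == "(":
            depth += 1
        elif ch == ")":
            if depth == 0:
                raise ValueError("unbalanced structure")
            depth -= 1
            pairs += 1
    if depth != 0:
        raise ValueError("unbalanced structure")
    return pairs


def pairing_propensity(dot_bracket):
    """Base pairs per nucleotide (assumed length normalization)."""
    return base_pairs(dot_bracket) / len(dot_bracket)


seq = "GGGAAAUCCC"
structure = "(((....)))"
gc = gc_percent(seq)
propensity = pairing_propensity(structure)
```

The remaining features (folding free energy, Shannon entropy, base-pair distance) depend on an RNA folding program and are left out of this sketch.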
|
228
|
Arluison V, Hohng S, Roy R, Pellegrini O, Régnier P, Ha T. Spectroscopic observation of RNA chaperone activities of Hfq in post-transcriptional regulation by a small non-coding RNA. Nucleic Acids Res 2007; 35:999-1006. [PMID: 17259214 PMCID: PMC1807976 DOI: 10.1093/nar/gkl1124] [Citation(s) in RCA: 67] [Impact Index Per Article: 3.9] [Indexed: 11/13/2022] Open
Abstract
Hfq protein is vital for the function of many non-coding small (s)RNAs in bacteria, but the mechanism by which Hfq facilitates the function of sRNA is still debated. We developed a fluorescence resonance energy transfer assay to probe how Hfq modulates the interaction between an sRNA, DsrA, and its regulatory target mRNA, rpoS. The relevant RNA fragments were labelled so that changes in intra- and intermolecular RNA structures can be monitored in real time. Our data show that Hfq promotes the strand exchange reaction in which the internal structure of rpoS is replaced by pairing with DsrA such that the Shine-Dalgarno sequence of the mRNA becomes exposed. Hfq appears to carry out strand exchange by inducing rapid association of DsrA and a premelted rpoS and by aiding in the slow disruption of the rpoS secondary structure. Unexpectedly, Hfq also disrupts a preformed complex between rpoS and DsrA. While it may not be a frequent event in vivo, this melting activity may have implications in the reversal of sRNA-based regulation. Overall, our data suggest that Hfq not only promotes strand exchange by binding rapidly to both DsrA and rpoS but also possesses RNA chaperoning properties that facilitate dynamic RNA–RNA interactions.
Affiliation(s)
- Véronique Arluison
- Institut de Biologie Physico-Chimique, CNRS UPR 9073 conventionnée avec l’Université Paris 7, 13 rue P. et M. Curie, 75005 Paris, France, Department of Physics and Howard Hughes Medical Institute and Center for Biophysics and Computational Biology, University of Illinois, Urbana-Champaign, Urbana, Illinois 61081, USA
- Sungchul Hohng
- Institut de Biologie Physico-Chimique, CNRS UPR 9073 conventionnée avec l’Université Paris 7, 13 rue P. et M. Curie, 75005 Paris, France, Department of Physics and Howard Hughes Medical Institute and Center for Biophysics and Computational Biology, University of Illinois, Urbana-Champaign, Urbana, Illinois 61081, USA
- Rahul Roy
- Institut de Biologie Physico-Chimique, CNRS UPR 9073 conventionnée avec l’Université Paris 7, 13 rue P. et M. Curie, 75005 Paris, France, Department of Physics and Howard Hughes Medical Institute and Center for Biophysics and Computational Biology, University of Illinois, Urbana-Champaign, Urbana, Illinois 61081, USA
- Olivier Pellegrini
- Institut de Biologie Physico-Chimique, CNRS UPR 9073 conventionnée avec l’Université Paris 7, 13 rue P. et M. Curie, 75005 Paris, France, Department of Physics and Howard Hughes Medical Institute and Center for Biophysics and Computational Biology, University of Illinois, Urbana-Champaign, Urbana, Illinois 61081, USA
- Philippe Régnier
- Institut de Biologie Physico-Chimique, CNRS UPR 9073 conventionnée avec l’Université Paris 7, 13 rue P. et M. Curie, 75005 Paris, France, Department of Physics and Howard Hughes Medical Institute and Center for Biophysics and Computational Biology, University of Illinois, Urbana-Champaign, Urbana, Illinois 61081, USA
- Taekjip Ha
- Institut de Biologie Physico-Chimique, CNRS UPR 9073 conventionnée avec l’Université Paris 7, 13 rue P. et M. Curie, 75005 Paris, France, Department of Physics and Howard Hughes Medical Institute and Center for Biophysics and Computational Biology, University of Illinois, Urbana-Champaign, Urbana, Illinois 61081, USA
- *To whom correspondence should be addressed. Tel: (217) 265 0717; Fax: (217) 244 7187;
|
229
|
McIntosh S, Watson L, Bundock P, Crawford A, White J, Cordeiro G, Barbary D, Rooke L, Henry R. SAGE of the developing wheat caryopsis. Plant Biotechnol J 2007; 5:69-83. [PMID: 17207258 DOI: 10.1111/j.1467-7652.2006.00218.x] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.2] [Indexed: 05/09/2023]
Abstract
Understanding the development of the cereal caryopsis holds the future for metabolic engineering in the interests of enhancing global food production. We have developed a Serial Analysis of Gene Expression (SAGE) data platform to investigate the developing wheat (Triticum aestivum) caryopsis. LongSAGE libraries have been constructed at five time-points post-anthesis to coincide with key processes in caryopsis development. More than 90,000 LongSAGE tags have been sequenced, generating 29,261 unique tag sequences across all five libraries. Tag abundance, generated from cumulative tag counts, provides insight into the redundancy and diversity of each library. Annotation of the 500 most abundant tags spanning development highlights the array of functional groups being expressed. The relative frequency of these more abundant transcripts allows quantitative analysis of patterns of expression during grain development. We have identified activities of cellular proliferation/differentiation, the accumulation of storage proteins and starch biosynthesis. The abundance of calcium-dependent protein kinases indicates their importance in signalling across development. Acquisition of a broad array of defence coincides with storage accumulation and is dominated by inhibitors of amylase activity. Differential expression profiles of abundant tags from each library reveal the coordinated expression of genes responsible for the cellular events constituting caryopsis development. This SAGE platform has also provided a resource of novel sequence and expression information, including the identification of potentially useful promoter activities. Further investigations into both the abundant and low expressing transcripts will provide greater insight into wheat caryopsis development and assist in wheat improvement programmes.
Affiliation(s)
- Shane McIntosh
- Grain Foods CRC, Centre for Plant Conservation Genetics, Southern Cross University, PO Box 157, Lismore, NSW 2480 Australia.
|
230
|
Kim OTP, Yura K, Go N. Amino acid residue doublet propensity in the protein-RNA interface and its application to RNA interface prediction. Nucleic Acids Res 2006; 34:6450-60. [PMID: 17130160 PMCID: PMC1761430 DOI: 10.1093/nar/gkl819] [Citation(s) in RCA: 97] [Impact Index Per Article: 5.4] [Indexed: 01/07/2023] Open
Abstract
Protein-RNA interactions play essential roles in a number of regulatory mechanisms for gene expression such as RNA splicing, transport, translation and post-transcriptional control. As the number of available protein-RNA complex 3D structures has increased, it is now possible to statistically examine protein-RNA interactions based on 3D structures. We performed computational analyses of 86 representative protein-RNA complexes retrieved from the Protein Data Bank. Interface residue propensity, a measure of the relative importance of different amino acid residues in the RNA interface, was calculated for each amino acid residue type (residue singlet interface propensity). In addition to the residue singlet propensity, we introduce a new residue-based propensity, which gives a measure of residue pairing preferences in the RNA interface of a protein (residue doublet interface propensity). The residue doublet interface propensity contains much more information than the sum of two singlet propensities alone. The prediction of the RNA interface using the two types of propensities plus a position-specific multiple sequence profile can achieve a specificity of about 80%. The prediction method was then applied to the 3D structures of two mRNA export factors, TAP (Mex67) and UAP56 (Sub2). The prediction enables us to point out candidate RNA interfaces, some of which are consistent with previous experimental studies and may contribute to the elucidation of atomic mechanisms of mRNA export.
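A singlet interface propensity of the kind described in this abstract is commonly computed as a ratio of frequencies. The sketch below assumes that definition (share of a residue type in the interface divided by its share among all residues considered); the abstract does not give the authors' exact formula, and the function name and toy data are hypothetical.

```python
from collections import Counter


def singlet_propensity(interface_residues, all_residues):
    """Ratio of a residue type's interface share to its overall share.

    Values above 1 mean the residue type is enriched in the interface.
    """
    iface = Counter(interface_residues)
    total = Counter(all_residues)
    n_iface = sum(iface.values())
    n_total = sum(total.values())
    if n_iface == 0 or n_total == 0:
        raise ValueError("both residue collections must be non-empty")
    return {
        aa: (iface[aa] / n_iface) / (total[aa] / n_total)
        for aa in total
    }


# Toy data: one-letter residue codes, not taken from any real structure.
all_residues = "RRKKAAAALL"
interface_residues = "RRKA"
prop = singlet_propensity(interface_residues, all_residues)
```

In this toy input, arginine is over-represented in the interface and leucine is absent from it, so their propensities fall above 1 and at 0 respectively. A doublet propensity would apply the same ratio to residue pairs, but how the pairs are defined is not stated in the abstract.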
Affiliation(s)
- Oanh T. P. Kim
- Quantum Bioinformatics Team, Center for Computational Science and Engineering, Japan Atomic Energy Agency, Kizu-cho, Souraku-gun, Kyoto 619-0215, Japan
- Kei Yura
- Quantum Bioinformatics Team, Center for Computational Science and Engineering, Japan Atomic Energy Agency, Kizu-cho, Souraku-gun, Kyoto 619-0215, Japan
- Research Unit for Quantum Beam Life Science Initiative, Quantum Beam Science Directorate, Japan Atomic Energy Agency, Kizu-cho, Souraku-gun, Kyoto 619-0215, Japan
- CREST, JST, Japan Atomic Energy Agency, Kizu-cho, Souraku-gun, Kyoto 619-0215, Japan
- To whom correspondence should be addressed. Tel: +81 774 71 3462; Fax: +81 774 71 3460;
- Nobuhiro Go
- Research Unit for Quantum Beam Life Science Initiative, Quantum Beam Science Directorate, Japan Atomic Energy Agency, Kizu-cho, Souraku-gun, Kyoto 619-0215, Japan
- Computational Biology Group, Quantum Beam Science Directorate, Japan Atomic Energy Agency, Kizu-cho, Souraku-gun, Kyoto 619-0215, Japan
- Bioinformatics Unit, Nara Institute of Science and Technology, Takayama-cho, Ikoma-shi, Nara 630-0196, Japan
|
231
|
Carrasco N, Caton-Williams J, Brandt G, Wang S, Huang Z. Efficient enzymatic synthesis of phosphoroselenoate RNA by using adenosine 5'-(alpha-P-seleno)triphosphate. Angew Chem Int Ed Engl 2006; 45:94-7. [PMID: 16304655 DOI: 10.1002/anie.200502215] [Citation(s) in RCA: 54] [Impact Index Per Article: 3.0] [Indexed: 11/10/2022]
Affiliation(s)
- Nicolas Carrasco
- Department of Chemistry, Georgia State University, Atlanta, GA 30303, USA
|
232
|
Cao X, Yeo G, Muotri AR, Kuwabara T, Gage FH. Noncoding RNAs in the mammalian central nervous system. Annu Rev Neurosci 2006; 29:77-103. [PMID: 16776580 DOI: 10.1146/annurev.neuro.29.051605.112839] [Citation(s) in RCA: 332] [Impact Index Per Article: 18.4] [Indexed: 11/09/2022]
Abstract
The central nervous system (CNS) is arguably one of the most complex systems in the universe. To understand the CNS, scientists have investigated a variety of molecules, including proteins, lipids, and various small molecules. However, one large class of molecules, noncoding RNAs (ncRNAs), has been relatively unexplored. ncRNAs function directly as structural, catalytic, or regulatory molecules rather than serving as templates for protein synthesis. The increasing variety of ncRNAs being identified in the CNS suggests a strong connection between the biogenesis, dynamics of action, and combinatorial regulatory potential of ncRNAs and the complexity of the CNS. In this review, we give an overview of the diversity and abundance of ncRNAs before delving into specific examples that illustrate their importance in the CNS. In particular, we cover recent evidence for the roles of microRNAs, small nucleolar RNAs, retrotransposons, the NRSE small modulatory RNA, and BC1/BC200 in the CNS. Finally, we speculate why ncRNAs are well adapted to improving organism-environment interactions.
Affiliation(s)
- Xinwei Cao
- Laboratory of Genetics, The Salk Institute for Biological Studies, La Jolla, California 92037, USA.
|
233
|
Johansen J, Rasmussen AA, Overgaard M, Valentin-Hansen P. Conserved small non-coding RNAs that belong to the sigmaE regulon: role in down-regulation of outer membrane proteins. J Mol Biol 2006; 364:1-8. [PMID: 17007876 DOI: 10.1016/j.jmb.2006.09.004] [Citation(s) in RCA: 204] [Impact Index Per Article: 11.3] [Received: 05/31/2006] [Revised: 08/27/2006] [Accepted: 09/01/2006] [Indexed: 11/24/2022]
Abstract
Enteric bacteria respond to misfolded proteins by activating the transcription of "heat shock" genes. These genes are arranged in two major regulons controlled by the alternative sigma factors sigmaH and sigmaE. The two transcription factors coordinate the stress response in different cellular compartments; the sigmaH regulon is induced by stress in the cytoplasm, whereas the sigmaE regulon is activated by stress signals in the cell envelope. In Escherichia coli, sigmaE plays a central role in maintaining cell envelope integrity both under stress conditions and during normal growth. Previous work established that sigmaE is essential for viability of the bacterium and up-regulates expression of approximately 100 protein-encoding genes that influence nearly every aspect of the cell envelope. Moreover, the expression of several outer membrane proteins is down-regulated upon sigmaE activation. Here, we show that two Hfq-binding small RNAs, MicA and RybB, are under positive control of sigmaE. Transient induction of RybB resulted in decreased levels of the mRNAs encoding OmpC and OmpW. sigmaE-mediated regulation of ompC and ompW expression was abolished in strains lacking RybB or Hfq. Recently, MicA was shown to act in destabilizing the ompA transcript when rapidly grown cells entered the stationary phase of growth. Also, the alternative sigma factor down-regulates this message in a small non-coding RNA-dependent fashion. These findings add the sigmaE regulon to the growing list of stress-induced regulatory circuits that include small regulatory RNAs and provide insight into a homeostatic loop that prevents a build-up of unassembled outer membrane proteins in the envelope.
Affiliation(s)
- Jesper Johansen
- Department of Biochemistry and Molecular Biology, University of Southern Denmark, Campusvej 55, DK-5230, Odense M, Denmark
234
|
Dowell RD, Eddy SR. Efficient pairwise RNA structure prediction and alignment using sequence alignment constraints. BMC Bioinformatics 2006; 7:400. [PMID: 16952317 PMCID: PMC1579236 DOI: 10.1186/1471-2105-7-400] [Citation(s) in RCA: 96] [Impact Index Per Article: 5.3] [Received: 03/04/2006] [Accepted: 09/04/2006] [Indexed: 11/10/2022]
Abstract
BACKGROUND We are interested in the problem of predicting secondary structure for small sets of homologous RNAs, by incorporating limited comparative sequence information into an RNA folding model. The Sankoff algorithm for simultaneous RNA folding and alignment is a basis for approaches to this problem. There are two open problems in applying a Sankoff algorithm: development of a good unified scoring system for alignment and folding and development of practical heuristics for dealing with the computational complexity of the algorithm. RESULTS We use probabilistic models (pair stochastic context-free grammars, pairSCFGs) as a unifying framework for scoring pairwise alignment and folding. A constrained version of the pairSCFG structural alignment algorithm was developed which assumes knowledge of a few confidently aligned positions (pins). These pins are selected based on the posterior probabilities of a probabilistic pairwise sequence alignment. CONCLUSION Pairwise RNA structural alignment improves on structure prediction accuracy relative to single sequence folding. Constraining on alignment is a straightforward method of reducing the runtime and memory requirements of the algorithm. Five practical implementations of the pairwise Sankoff algorithm - this work (Consan), David Mathews' Dynalign, Ian Holmes' Stemloc, Ivo Hofacker's PMcomp, and Jan Gorodkin's FOLDALIGN - have comparable overall performance with different strengths and weaknesses.
Affiliation(s)
- Robin D Dowell
- Howard Hughes Medical Institute and Department of Genetics, Washington University School of Medicine, 4444 Forest Park Blvd. Box 8510, St. Louis, MO 63108, USA
- MIT Computer Science and Artificial Intelligence Laboratory, 32 Vassar Street, Cambridge, MA 02139, USA
- Sean R Eddy
- Howard Hughes Medical Institute and Department of Genetics, Washington University School of Medicine, 4444 Forest Park Blvd. Box 8510, St. Louis, MO 63108, USA
235
|
Melin M, Klar J, Gedde-Dahl T Jr, Fredriksson R, Hausser I, Brandrup F, Bygum A, Vahlquist A, Hellström Pigg M, Dahl N. A founder mutation for ichthyosis prematurity syndrome restricted to 76 kb by haplotype association. J Hum Genet 2006; 51:864-871. [PMID: 16946994 DOI: 10.1007/s10038-006-0035-z] [Citation(s) in RCA: 17] [Impact Index Per Article: 0.9] [Received: 05/29/2006] [Accepted: 06/29/2006] [Indexed: 10/24/2022]
Abstract
Autosomal recessive congenital ichthyosis (ARCI) is a group of keratinisation disorders that includes the ichthyosis prematurity syndrome (IPS). IPS is rare and almost exclusively present in a restricted region in the middle of Norway and Sweden, which indicates a founder effect for the disorder. We recently reported linkage of IPS to chromosome 9q34, and we present here the subsequent fine-mapping of this region with known and novel microsatellite markers as well as single nucleotide polymorphisms (SNPs). Allelic association, evaluated with Fisher's exact test and P(excess), was used to refine the IPS haplotype to approximately 1.6 Mb. On the basis of the average length of the haplotype in IPS patients, we estimated the age of the founder mutation at approximately 1,900 years. The IPS haplotype contains a core region of 76 kb consisting of four marker alleles shared by 97.7% of the chromosomes associated with IPS. This region spans four known genes, all of which are expressed in mature epidermal cells. We present the results from the analysis of these four genes and their corresponding transcripts in normal and patient-derived samples.
Affiliation(s)
- M Melin
- Department of Genetics and Pathology, Rudbeck Laboratory, Uppsala University, Uppsala, 751 85, Sweden
- J Klar
- Department of Genetics and Pathology, Rudbeck Laboratory, Uppsala University, Uppsala, 751 85, Sweden
- T Gedde-Dahl Jr
- Department of Dermatology, Rikshospitalet University Hospital, and Institute of Forensic Medicine, University of Oslo, Oslo, Norway
- R Fredriksson
- Department of Neuroscience, Uppsala University, Uppsala, Sweden
- I Hausser
- Department of Dermatology, University of Heidelberg, Heidelberg, Germany
- F Brandrup
- Department of Dermatology, Odense University Hospital, Odense, Denmark
- A Bygum
- Department of Dermatology, Odense University Hospital, Odense, Denmark
- A Vahlquist
- Department of Medical Science, Uppsala University Hospital, Uppsala, Sweden
- M Hellström Pigg
- Department of Genetics and Pathology, Rudbeck Laboratory, Uppsala University, Uppsala, 751 85, Sweden
- N Dahl
- Department of Genetics and Pathology, Rudbeck Laboratory, Uppsala University, Uppsala, 751 85, Sweden.
236
|
Brandt G, Carrasco N, Huang Z. Efficient substrate cleavage catalyzed by hammerhead ribozymes derivatized with selenium for X-ray crystallography. Biochemistry 2006; 45:8972-7. [PMID: 16846240 PMCID: PMC2519893 DOI: 10.1021/bi060455m] [Citation(s) in RCA: 33] [Impact Index Per Article: 1.8] [Indexed: 11/29/2022]
Abstract
Because oxygen and selenium are in the same group (Family VI) in the periodic table, site-specific mutagenesis at the atomic level by replacing RNA oxygen with selenium can provide insights into the structure and function of catalytic RNAs. We report here the first Se-derivatized ribozymes transcribed with all nucleoside 5'-(alpha-P-seleno)triphosphates (NTPalphaSe, including A, C, G, and U). We found that T7 RNA polymerase recognizes NTPalphaSe Sp diastereomers as well as the natural NTPs, whereas NTPalphaSe Rp diastereomers are neither substrates nor inhibitors. We also demonstrated the catalytic activity of these Se-derivatized hammerhead ribozymes by cleaving the RNA substrate, and we found that these phosphoroselenoate ribozymes can be as active as the native one. These hammerhead ribozymes site-specifically mutagenized by selenium reveal the close relationship between the catalytic activities and the replaced oxygen atoms, which provides insight into the participation of oxygen in catalysis or intramolecular interaction. This demonstrates a convenient strategy for the mechanistic study of functional RNAs. In addition, the active ribozymes site-specifically derivatized by selenium will allow for convenient MAD phasing in X-ray crystal structure studies.
Affiliation(s)
- Gary Brandt
- Department of Chemistry, Georgia State University, Atlanta, Georgia 30303, USA
237
|
Abstract
I have demonstrated that nuclear transcription modulates the distribution of replication origins along mammalian chromosomes. Chinese Hamster Ovary (CHO) cells were exposed to transcription inhibitors in early G1 phase and replication origin sites in the dihydrofolate reductase (DHFR) gene locus were mapped several hours later. DNA within nuclei prepared from control and transcription-deficient G1-phase cells was replicated with similar efficiencies when introduced into Xenopus egg extracts. Replication initiated in the intergenic region within control late-G1 nuclei, but randomly within transcriptionally repressed nuclei. Random initiation was not a consequence of inability to produce an essential protein(s), since initiation was site-specific within cells exposed to the translation inhibitor cycloheximide during the same interval of G1 phase. Furthermore, in vivo inhibition of transcription within late-G1-phase cells reduced the frequency of usage of pre-established DHFR replication origin sites. Transcription rates in the DHFR domain were very low and did not change throughout G1 phase. This implies that, although ongoing nuclear transcription is required, local expression of the genes in the DHFR locus alone is not sufficient to create a site-specific replication initiation pattern. I conclude that epigenetic factors, including general nuclear transcription, play a role in replication origin selection in mammalian nuclei.
Affiliation(s)
- Daniela S Dimitrova
- Department of Biological Sciences, State University of New York at Buffalo, Buffalo, NY 14260, USA.
238
|
Saha S, Murthy S, Rangarajan PN. Identification and characterization of a virus-inducible non-coding RNA in mouse brain. J Gen Virol 2006; 87:1991-1995. [PMID: 16760401 DOI: 10.1099/vir.0.81768-0] [Citation(s) in RCA: 91] [Impact Index Per Article: 5.1] [Indexed: 12/26/2022]
Abstract
Infection of mice with Japanese encephalitis virus or Rabies virus results in the activation of a gene encoding a novel, non-coding RNA (ncRNA) in the mouse central nervous system. This transcript, named virus-inducible ncRNA (VINC), is identical to a 3.18 kb transcript expressed in mouse neonate skin (GenBank accession no. AK028745) that, together with a number of unannotated cDNAs and expressed sequence tags, is grouped in the mouse unigene cluster Mm281895. VINC is expressed constitutively in early mouse embryo and several adult non-neuronal mouse tissues, as well as a murine renal adenocarcinoma (RAG) cell line. Northern blotting of nuclear and cytoplasmic RNAs revealed that VINC is localized primarily in the nucleus of RAG cells and is thus a novel member of the nuclear ncRNA family.
Affiliation(s)
- Sougata Saha
- Department of Biochemistry, Indian Institute of Science, Bangalore 560012, India
- Sreenivasa Murthy
- Department of Biochemistry, Indian Institute of Science, Bangalore 560012, India
- Pundi N Rangarajan
- Department of Biochemistry, Indian Institute of Science, Bangalore 560012, India
239
|
Le SY, Maizel JV, Zhang K. An algorithm for detecting homologues of known structured RNAs in genomes. Proceedings. IEEE Computational Systems Bioinformatics Conference 2006:300-10. [PMID: 16448023 DOI: 10.1109/csb.2004.1332443] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Indexed: 11/07/2022]
Abstract
Distinct RNA structures are frequently involved in a wide range of functions in various biological mechanisms. The three-dimensional RNA structures solved by X-ray crystallography and various well-established RNA phylogenetic structures indicate that functional RNAs have characteristic RNA structural motifs represented by specific combinations of base pairings and conserved nucleotides in the loop region. Discovery of well-ordered RNA structures and their homologues in genome-wide searches will enhance our ability to detect the RNA structural motifs and help us to highlight their association with functional and regulatory RNA elements. We present here a novel computer algorithm, HomoStRscan, that takes a single RNA sequence with its secondary structure to search for homologous RNAs in complete genomes. This novel algorithm differs completely from other currently used search algorithms for homologous structures or structural motifs. For an arbitrary segment (or window) in the target sequence that is similar in size to the query sequence, HomoStRscan finds the structure most similar to the input query structure and computes the maximal similarity score (MSS) between the two structures. The homologous RNA structures are then statistically inferred from the MSS distribution computed in the target genome. The method provides a flexible, robust and fine search tool for any homologous structural RNAs.
Affiliation(s)
- Shu-Yun Le
- Laboratory of Experimental and Computational Biology, NCI Center for Cancer Research, National Cancer Institute, NIH, Bldg., Frederick, MD 21702, USA.
240
|
Friedrich M, Grahnert A, Klein C, Tschöp K, Engeland K, Hauschildt S. Genomic organization and expression of the human mono-ADP-ribosyltransferase ART3 gene. Biochim Biophys Acta 2006; 1759:270-80. [PMID: 16934346 DOI: 10.1016/j.bbaexp.2006.06.004] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.3] [Received: 01/13/2006] [Revised: 06/26/2006] [Accepted: 06/29/2006] [Indexed: 01/17/2023]
Abstract
Here we describe an RT-PCR analysis of mono-ADP-ribosyltransferase 3 (ART3) mRNA expression in macrophages, testis, semen, tonsil, heart and skeletal muscle and the complete gene structure as obtained by sequence alignment of PCR products with a human genomic clone (GenBank accession no. AC112719). Twelve exons (ex1-12) were found to make up the coding region of the gene (one more than previously published). Two prominent classes of ART3 splice variants could be distinguished by the presence or absence of ex2, which encodes most of the ART3 protein. Among the ex2-containing mRNA species, the most frequently amplified variant did not include exons 9 to 11, except in skeletal muscle, in which the major splice variant lacked ex10 only. Two different, previously unreported 5' non-translated regions (5' UTRs) were identified, demonstrating the presence of two alternative promoters that we termed palpha and pbeta. Whereas the 5' UTR originating from palpha was split up into three exons, a single exon represented the 5' UTR of pbeta transcripts. Strikingly, in heart, skeletal muscle and tonsils the upstream promoter palpha was totally inactive and ART3 transcription appears to be driven solely by pbeta. In all other cell types tested, transcription started mainly (if not exclusively) at palpha. Thus, ART3 expression in human cells appears to be governed by a combination of differential splicing and tissue-preferential use of two alternative promoters. This specific use is evolutionarily conserved, as shown by analysis of the 5' UTR of the mouse ART3 mRNA.
Affiliation(s)
- Maik Friedrich
- Institute of Biology II/Department of Immunobiology, University of Leipzig, Talstrasse 33, D-04103 Leipzig, Germany
241
|
Dror O, Nussinov R, Wolfson HJ. The ARTS web server for aligning RNA tertiary structures. Nucleic Acids Res 2006; 34:W412-5. [PMID: 16845038 PMCID: PMC1538835 DOI: 10.1093/nar/gkl312] [Citation(s) in RCA: 34] [Impact Index Per Article: 1.9] [Received: 02/14/2006] [Revised: 03/06/2006] [Accepted: 04/11/2006] [Indexed: 11/12/2022]
Abstract
RNA molecules with common structural features may share similar functional properties. Structural comparison of RNAs and detection of common substructures is, thus, a highly important task. Nevertheless, the currently available tools in the RNA community provide only a partial solution, since they either work at the 2D level or are suitable for detecting predefined or local contiguous tertiary motifs only. Here, we describe a web server built around ARTS, a method for aligning tertiary structures of nucleic acids (both RNA and DNA). ARTS receives a pair of 3D nucleic acid structures and searches for a priori unknown common substructures. The search is truly 3D and irrespective of the order of the nucleotides on the chain. The identified common substructures can be large global folds with hundreds and even thousands of nucleotides as well as small local motifs with at least two successive base pairs. The method is highly efficient and has been used to conduct an all-against-all comparison of all the RNA structures in the Protein Data Bank. The web server, together with a software package for download, is freely accessible at http://bioinfo3d.cs.tau.ac.il/ARTS.
Affiliation(s)
- Oranit Dror
- School of Computer Science, Raymond and Beverly Sackler Faculty of Exact Sciences, Tel Aviv University, Tel Aviv 69978, Israel.
242
|
Dawson W, Fujiwara K, Kawai G, Futamura Y, Yamamoto K. A method for finding optimal RNA secondary structures using a new entropy model (vsfold). Nucleosides Nucleotides & Nucleic Acids 2006; 25:171-89. [PMID: 16541960 DOI: 10.1080/15257770500446915] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.0] [Indexed: 10/25/2022]
Abstract
We are developing a program to calculate optimal RNA secondary structures. The model uses di-nucleotide pairing energies as with most traditional approaches. However, for long-range entropy interactions, the approach uses an entropy-loss model based on the accumulated sum of the entropy of bonding between each base-pair weighted inversely by the correlation of the RNA sequence (the Kuhn length). Stiff RNA forms very different structures from flexible RNA. The results demonstrate that the long-range folding is largely governed by this entropy and the Kuhn length.
Affiliation(s)
- Wayne Dawson
- Department of Life and Environmental Sciences, Chiba Institute of Technology, Tsudanuma, Narashino-shi, Chiba, Japan.
243
|
Abstract
As one of the earliest problems in computational biology, the RNA secondary structure prediction problem (sometimes referred to as "RNA folding") has attracted attention again, thanks to the recent discoveries of many novel non-coding RNA molecules. The two common approaches to this problem are de novo prediction of RNA secondary structure based on energy minimization and the consensus folding approach (computing the common secondary structure for a set of unaligned RNA sequences). Consensus folding algorithms work well when the correct seed alignment is part of the input to the problem. However, seed alignment itself is a challenging problem for diverged RNA families. In this paper, we propose a novel framework to predict the common secondary structure for unaligned RNA sequences. By matching putative stacks in RNA sequences, we make use of both primary sequence information and thermodynamic stability for prediction at the same time. We show that our method can predict the correct common RNA secondary structures even when we are given only a limited number of unaligned RNA sequences, and it outperforms current algorithms in sensitivity and accuracy.
Affiliation(s)
- Vineet Bafna
- Department of Computer Science and Engineering, University of California, San Diego, La Jolla, 92093, USA
244
|
Lu W, Zhou D, Glusman G, Utleg AG, White JT, Nelson PS, Vasicek TJ, Hood L, Lin B. KLK31P is a novel androgen regulated and transcribed pseudogene of kallikreins that is expressed at lower levels in prostate cancer cells than in normal prostate cells. Prostate 2006; 66:936-44. [PMID: 16541416 DOI: 10.1002/pros.20382] [Citation(s) in RCA: 31] [Impact Index Per Article: 1.7] [Indexed: 11/09/2022]
Abstract
BACKGROUND Fifteen human tissue kallikrein (KLK) genes have been identified as a cluster on chromosome 19. KLK expression is associated with various human diseases including cancers. Noncoding RNAs such as PCA3/DD3 and PCGEM1 have been identified in prostate cancer cells. METHODS Using massively parallel signature sequencing (MPSS) technology, RT-PCR, and 5' rapid amplification of cDNA ends (RACE), we identified and cloned a novel gene that maps to the KLK locus. RESULTS We have characterized this gene, named KLK31P by the HUGO Gene Nomenclature Committee, as an unprocessed KLK pseudogene. It contains five exons, two of which are KLK-derived while the rest are "exonized" interspersed repeats. KLK31P is expressed abundantly in prostate tissues and is androgen-regulated. KLK31P is expressed at lower levels in localized and metastatic prostate cancer cells than in normal prostate cells. CONCLUSIONS KLK31P is a novel androgen-regulated and transcribed pseudogene of kallikreins that may play a role in prostate carcinogenesis or maintenance.
MESH Headings
- Amino Acid Sequence
- Androgens/physiology
- Blotting, Northern
- Cell Line, Tumor
- Chromosomes, Human, Pair 19/genetics
- Cloning, Organism
- DNA/analysis
- DNA/genetics
- DNA, Complementary/genetics
- DNA, Neoplasm/analysis
- DNA, Neoplasm/genetics
- Exons/genetics
- Gene Expression Regulation
- Gene Expression Regulation, Neoplastic
- Humans
- Kidney/chemistry
- Male
- Molecular Sequence Data
- Multigene Family
- Prostate/chemistry
- Prostate/physiology
- Prostatic Neoplasms/chemistry
- Prostatic Neoplasms/genetics
- Prostatic Neoplasms/physiopathology
- Pseudogenes
- Reverse Transcriptase Polymerase Chain Reaction
- Tissue Kallikreins/analysis
- Tissue Kallikreins/genetics
- Tissue Kallikreins/physiology
- Transcription, Genetic
Affiliation(s)
- Wei Lu
- Institute for Systems Biology, Seattle, Washington 98103, USA
245
|
Halligan DL, Keightley PD. Ubiquitous selective constraints in the Drosophila genome revealed by a genome-wide interspecies comparison. Genome Res 2006; 16:875-84. [PMID: 16751341 PMCID: PMC1484454 DOI: 10.1101/gr.5022906] [Citation(s) in RCA: 181] [Impact Index Per Article: 10.1] [Indexed: 01/09/2023]
Abstract
Non-coding DNA comprises approximately 80% of the euchromatic portion of the Drosophila melanogaster genome. Non-coding sequences are known to contain functionally important elements controlling gene expression, but the proportion of sites that are selectively constrained is still largely unknown. We have compared the complete D. melanogaster and Drosophila simulans genome sequences to estimate mean selective constraint (the fraction of mutations that are eliminated by selection) in coding and non-coding DNA by standardizing to substitution rates in putatively unconstrained sequences. We show that constraint is positively correlated with intronic and intergenic sequence length and is generally remarkably strong in non-coding DNA, implying that more than half of all point mutations in the Drosophila genome are deleterious. This fraction is also likely to be an underestimate if many substitutions in non-coding DNA are adaptively driven to fixation. We also show that substitutions in long introns and intergenic sequences are clustered, such that there is an excess of substitutions <8 bp apart and a deficit farther apart. These results suggest that there are blocks of constrained nucleotides, presumably involved in gene expression control, that are concentrated in long non-coding sequences. Furthermore, we infer that there is more than three times as much functional non-coding DNA as protein-coding DNA in the Drosophila genome. Most deleterious mutations therefore occur in non-coding DNA, and these may make an important contribution to a wide variety of evolutionary processes.
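The constraint measure described in this abstract (the fraction of mutations eliminated by selection, standardized to the substitution rate in putatively unconstrained sequence) can be illustrated with a minimal sketch. The formula C = 1 - K_observed / K_neutral is the simplest reading of that description and is an assumption here; the function and parameter names are hypothetical, and the paper's actual estimator may include corrections not shown.

```python
def selective_constraint(observed_rate: float, neutral_rate: float) -> float:
    """Estimate constraint as the shortfall in substitutions relative to neutrality.

    observed_rate: substitutions per site in the region of interest.
    neutral_rate:  substitutions per site in putatively unconstrained sequence.
    Both names are illustrative assumptions, not taken from the paper.
    """
    if neutral_rate <= 0:
        raise ValueError("neutral_rate must be positive")
    # If selection removes a fraction C of new mutations, the observed rate
    # is (1 - C) times the neutral rate, so C = 1 - observed / neutral.
    return 1.0 - observed_rate / neutral_rate


# Hypothetical numbers: a region evolving at 40% of the neutral rate.
constraint = selective_constraint(observed_rate=0.04, neutral_rate=0.10)
```

A negative value from this sketch would mean the region evolves faster than the neutral reference, which is one reason the abstract notes the estimate could be biased if many substitutions are adaptive.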
Affiliation(s)
- Daniel L Halligan
- Institute of Evolutionary Biology, University of Edinburgh, Edinburgh EH9 3JT, United Kingdom.
246
|
Abstract
MOTIVATION The structure of RNA molecules is often crucial for their function. Therefore, secondary structure prediction has gained much interest. Here, we consider the inverse RNA folding problem, which means designing RNA sequences that fold into a given structure. RESULTS We introduce a new algorithm for the inverse folding problem (INFO-RNA) that consists of two parts: a dynamic programming method for good initial sequences, followed by an improved stochastic local search that uses an effective neighbor selection method. During the initialization, we design a sequence that, among all sequences, adopts the given structure with the lowest possible energy. For the selection of neighbors during the search, we use a one-step look-ahead that applies an additional energy-based criterion. Afterwards, the pre-ordered neighbors are tested using the actual optimization criterion of minimizing the structure distance between the target structure and the mfe structure of the considered neighbor. We compared our algorithm to RNAinverse and RNA-SSD on artificial and biological test sets. INFO-RNA performed better than RNAinverse and, in most cases, gave better results than RNA-SSD, probably the best inverse RNA folding tool available. AVAILABILITY www.bioinf.uni-freiburg.de?Subpages/software.html.
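The optimization criterion named in this abstract is a structure distance between the target structure and the mfe structure of a candidate sequence. The abstract does not say which distance is used; the sketch below assumes the common base-pair distance on dot-bracket strings (the number of base pairs present in exactly one of the two structures). Function names are hypothetical, and computing the mfe structure itself requires a folding program that is not shown.

```python
def base_pairs(dot_bracket: str) -> set:
    """Return the set of (i, j) index pairs encoded by a dot-bracket string."""
    stack = []
    pairs = set()
    for i, ch in enumerate(dot_bracket):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            if not stack:
                raise ValueError(f"unbalanced ')' at position {i}")
            pairs.add((stack.pop(), i))
        elif ch != ".":
            raise ValueError(f"unexpected character {ch!r} at position {i}")
    if stack:
        raise ValueError("unbalanced '(' in structure")
    return pairs


def base_pair_distance(target: str, candidate: str) -> int:
    """Count base pairs that occur in exactly one of the two structures."""
    if len(target) != len(candidate):
        raise ValueError("structures must have the same length")
    return len(base_pairs(target) ^ base_pairs(candidate))


# A candidate whose predicted structure lost the inner pair of the target hairpin.
distance = base_pair_distance("((..))", "(....)")
```

In an inverse-folding search of the kind described, a distance of zero would mean the candidate sequence's predicted structure matches the target exactly.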
Affiliation(s)
- Anke Busch
- Albert-Ludwigs-University Freiburg, Institute of Computer Science, Chair of Bioinformatics Georges-Koehler-Allee 106, 79110 Freiburg, Germany
247
|
Anwar M, Nguyen T, Turcotte M. Identification of consensus RNA secondary structures using suffix arrays. BMC Bioinformatics 2006; 7:244. [PMID: 16677380 PMCID: PMC1475642 DOI: 10.1186/1471-2105-7-244] [Citation(s) in RCA: 16] [Impact Index Per Article: 0.9] [Received: 09/01/2005] [Accepted: 05/05/2006] [Indexed: 11/16/2022]
Abstract
Background The identification of a consensus RNA motif often consists in finding a conserved secondary structure with minimum free energy in an ensemble of aligned sequences. However, an alignment is often difficult to obtain without prior structural information, hence the need for tools to automate this process. Results We present an algorithm called Seed to identify all the conserved RNA secondary structure motifs in a set of unaligned sequences. The search space is defined as the set of all the secondary structure motifs inducible from a seed sequence. A general-to-specific search allows finding all the motifs that are conserved. Suffix arrays are used to enumerate efficiently all the biological palindromes as well as for the matching of RNA secondary structure expressions. We assessed the ability of this approach to uncover known structures using four datasets. The enumeration of the motifs relies only on the secondary structure definition and conservation, therefore allowing for the independent evaluation of scoring schemes. Twelve simple objective functions based on free energy were evaluated for their potential to discriminate native folds from the rest. Conclusion Our evaluation shows that 1) support and exclusion constraints are sufficient to make an exhaustive search of the secondary structure space feasible; 2) the search space induced from a seed sequence contains known motifs; and 3) simple objective functions, consisting of a combination of the free energy of matching sequences, can generally identify motifs with high positive predictive value and sensitivity to known motifs.
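As background for the suffix-array step mentioned in this abstract, the following is a minimal sketch of the underlying data structure only: a naive suffix array and a binary search for all occurrences of an exact substring. It does not reproduce Seed's enumeration of biological palindromes or its structure-expression matching; the function names and the example sequence are hypothetical.

```python
def suffix_array(text: str) -> list:
    """Return suffix start positions sorted by suffix (naive construction)."""
    return sorted(range(len(text)), key=lambda i: text[i:])


def find_occurrences(text: str, sa: list, pattern: str) -> list:
    """Return sorted start positions of pattern in text using the suffix array."""
    m = len(pattern)
    # Lower bound: first suffix whose length-m prefix is >= pattern.
    lo, hi = 0, len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] < pattern:
            lo = mid + 1
        else:
            hi = mid
    start = lo
    # Upper bound: first suffix whose length-m prefix is > pattern.
    hi = len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] <= pattern:
            lo = mid + 1
        else:
            hi = mid
    return sorted(sa[start:lo])


# Hypothetical toy sequence: locate every copy of a candidate stem strand.
sequence = "GGGAAACCCGGG"
positions = find_occurrences(sequence, suffix_array(sequence), "GGG")
```

All suffixes sharing a prefix sit in one contiguous block of the array, which is what makes repeated lookups of candidate stem strands cheap once the array is built.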
Affiliation(s)
- Mohammad Anwar
- School of Information Technology and Engineering, University of Ottawa, Ottawa, Ontario, Canada
- Truong Nguyen
- School of Information Technology and Engineering, University of Ottawa, Ottawa, Ontario, Canada
- Marcel Turcotte
- School of Information Technology and Engineering, University of Ottawa, Ottawa, Ontario, Canada
248
|
Lewis A, Reik W. How imprinting centres work. Cytogenet Genome Res 2006; 113:81-9. [PMID: 16575166 DOI: 10.1159/000090818] [Citation(s) in RCA: 138] [Impact Index Per Article: 7.7] [Received: 06/07/2005] [Accepted: 09/15/2005] [Indexed: 11/19/2022]
Abstract
Imprinted genes tend to be clustered in the genome. Most of these clusters have been found to be under the control of discrete DNA elements called imprinting centres (ICs) which are normally differentially methylated in the germline. ICs can regulate imprinted expression and epigenetic marks at many genes in the region, even those which lie several megabases away. Some of the molecular and cellular mechanisms by which ICs control other genes and regulatory regions in the cluster are becoming clear. One involves the insulation of genes on one side of the IC from enhancers on the other, mediated by the insulator protein CTCF and higher-order chromatin interactions. Another mechanism may involve non-coding RNAs that originate from the IC, targeting histone modifications to the surrounding genes. Given that several imprinting clusters contain CTCF dependent insulators and/or non-coding RNAs, it is likely that one or both of these two mechanisms regulate imprinting at many loci. Both mechanisms involve a variety of epigenetic marks including DNA methylation and histone modifications but the hierarchy of and interactions between these modifications are not yet understood. The challenge now is to establish a chain of developmental events beginning with differential methylation of an IC in the germline and ending with imprinting of many genes, often in a lineage dependent manner.
Affiliation(s)
- A Lewis
- Laboratory of Developmental Genetics and Imprinting, The Babraham Institute, Cambridge, UK.
249
|
Dalli D, Wilm A, Mainz I, Steger G. STRAL: progressive alignment of non-coding RNA using base pairing probability vectors in quadratic time. Bioinformatics 2006; 22:1593-9. [PMID: 16613908 DOI: 10.1093/bioinformatics/btl142] [Citation(s) in RCA: 57] [Impact Index Per Article: 3.2] [Indexed: 11/13/2022]
Abstract
MOTIVATION Alignment of RNA has a wide range of applications, for example in phylogeny inference, consensus structure prediction and homology searches. Yet aligning structural or non-coding RNAs (ncRNAs) correctly is notoriously difficult as these RNA sequences may evolve by compensatory mutations, which maintain base pairing but destroy sequence homology. Ideally, alignment programs would take RNA structure into account. The Sankoff algorithm for the simultaneous solution of RNA structure prediction and RNA sequence alignment was proposed 20 years ago but suffers from its exponential complexity. A number of programs implement lightweight versions of the Sankoff algorithm by restricting its application to a limited type of structure and/or only pairwise alignment. Thus, despite recent advances, the proper alignment of multiple structural RNA sequences remains a problem. RESULTS Here we present StrAl, a heuristic method for alignment of ncRNA that reduces sequence-structure alignment to a two-dimensional problem similar to standard multiple sequence alignment. The scoring function takes into account sequence similarity as well as up- and downstream pairing probability. To test the robustness of the algorithm and the performance of the program, we scored alignments produced by StrAl against a large set of published reference alignments. The quality of alignments predicted by StrAl is far better than that obtained by standard sequence alignment programs, especially when sequence homologies drop below approximately 65%; nevertheless StrAl's runtime is comparable to that of ClustalW.
Affiliation(s)
- Deniz Dalli
- Heinrich-Heine-Universität Düsseldorf, Institut für Physikalische Biologie D-40225 Düsseldorf, Germany
|
250
|
Ginger MR, Shore AN, Contreras A, Rijnkels M, Miller J, Gonzalez-Rimbau MF, Rosen JM. A noncoding RNA is a potential marker of cell fate during mammary gland development. Proc Natl Acad Sci U S A 2006; 103:5781-6. [PMID: 16574773 PMCID: PMC1420634 DOI: 10.1073/pnas.0600745103] [Citation(s) in RCA: 146] [Impact Index Per Article: 8.1] [Received: 04/28/2005] [Indexed: 12/26/2022]
Abstract
PINC is a large, alternatively spliced, developmentally regulated, noncoding RNA expressed in the regressed terminal ductal lobular unit-like structures of the parous mammary gland. Previous studies have shown that this population of cells possesses not only progenitor-like qualities (the ability to proliferate and repopulate a mammary gland) and the ability to survive developmentally programmed cell death but also the inhibition of carcinogen-induced proliferation. Here we report that PINC expression is temporally and spatially regulated in response to developmental stimuli in vivo and that PINC RNA is localized to distinct foci in either the nucleus or the cytoplasm in a cell-cycle-specific manner. Loss-of-function experiments suggest that PINC performs dual roles in cell survival and regulation of cell-cycle progression, suggesting that PINC may contribute to the developmentally mediated changes previously observed in the terminal ductal lobular unit-like structures of the parous gland. This is one of the first reports describing the functional properties of a large, developmentally regulated, mammalian, noncoding RNA.
Affiliation(s)
- Amy N. Shore
- Program in Developmental Biology, Baylor College of Medicine, 1 Baylor Plaza, Houston, TX 77030
- Monique Rijnkels
- U.S. Department of Agriculture/Agricultural Research Services Children’s Nutrition Research Center, Department of Pediatrics, Baylor College of Medicine, 1100 Bates Street, Houston, TX 77030
|