1
Computational Prediction of RNA-Binding Proteins and Binding Sites. Int J Mol Sci 2015; 16:26303-17. [PMID: 26540053 PMCID: PMC4661811 DOI: 10.3390/ijms161125952] [Citation(s) in RCA: 54] [Impact Index Per Article: 6.0] [Received: 10/01/2015] [Revised: 10/20/2015] [Accepted: 10/23/2015] [Indexed: 11/19/2022] Open
Abstract
Protein–RNA interactions play vital roles in many cellular processes such as protein synthesis, sequence encoding, RNA transfer, and gene regulation at the transcriptional and post-transcriptional levels. Approximately 6%–8% of all proteins are RNA-binding proteins (RBPs). Distinguishing these RBPs or their binding residues is a major aim of structural biology. A number of experimental methods have been developed for determining protein–RNA interactions, but they are expensive, time-consuming, and labor-intensive. As an alternative, researchers have developed many computational approaches to predict RBPs and protein–RNA binding sites by combining various machine learning methods with abundant sequence and/or structural features. These approaches fall into three kinds: prediction from protein sequence, prediction from protein structure, and protein–RNA docking. In this paper, we review all existing studies on the prediction of RNA-binding sites, RBPs, and complexes, including the data sets used in different approaches, the sequence and structural features used in several predictors, prediction method classifications, performance comparisons, evaluation methods, and future directions.
2
Tuszynska I, Magnus M, Jonak K, Dawson W, Bujnicki JM. NPDock: a web server for protein-nucleic acid docking. Nucleic Acids Res 2015; 43:W425-30. [PMID: 25977296 PMCID: PMC4489298 DOI: 10.1093/nar/gkv493] [Citation(s) in RCA: 151] [Impact Index Per Article: 16.8] [Received: 03/04/2015] [Accepted: 05/02/2015] [Indexed: 01/03/2023] Open
Abstract
Protein–RNA and protein–DNA interactions play fundamental roles in many biological processes. A detailed understanding of these interactions requires knowledge about protein–nucleic acid complex structures. Because the experimental determination of these complexes is time-consuming and perhaps futile in some instances, we have focused on computational docking methods starting from the separate structures. Docking methods are widely employed to study protein–protein interactions; however, only a few methods have been made available to model protein–nucleic acid complexes. Here, we describe NPDock (Nucleic acid–Protein Docking), a novel web server for predicting protein–nucleic acid complex structures. It implements a computational workflow that includes docking, scoring of poses, clustering of the best-scored models, and refinement of the most promising solutions. The NPDock server provides a user-friendly interface and 3D visualization of the results. The minimal input consists of a protein structure and a DNA or RNA structure in PDB format. Advanced options are available to control specific details of the docking process and obtain intermediate results. The web server is available at http://genesilico.pl/NPDock.
Affiliation(s)
- Irina Tuszynska
  - Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology in Warsaw, ul. Ks. Trojdena 4, PL-02-109 Warsaw, Poland; Institute of Informatics, University of Warsaw, Banacha 2, PL-02-097 Warsaw, Poland
- Marcin Magnus
  - Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology in Warsaw, ul. Ks. Trojdena 4, PL-02-109 Warsaw, Poland
- Katarzyna Jonak
  - Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology in Warsaw, ul. Ks. Trojdena 4, PL-02-109 Warsaw, Poland
- Wayne Dawson
  - Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology in Warsaw, ul. Ks. Trojdena 4, PL-02-109 Warsaw, Poland
- Janusz M Bujnicki
  - Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology in Warsaw, ul. Ks. Trojdena 4, PL-02-109 Warsaw, Poland; Bioinformatics Laboratory, Institute of Molecular Biology and Biotechnology, Faculty of Biology, Adam Mickiewicz University, ul. Umultowska 89, PL-61-614 Poznan, Poland
3
Faoro C, Ataide SF. Ribonomic approaches to study the RNA-binding proteome. FEBS Lett 2014; 588:3649-64. [PMID: 25150170 DOI: 10.1016/j.febslet.2014.07.039] [Citation(s) in RCA: 35] [Impact Index Per Article: 3.5] [Received: 05/22/2014] [Revised: 07/04/2014] [Accepted: 07/04/2014] [Indexed: 01/23/2023]
Abstract
Gene expression is controlled through a complex interplay among mRNAs, non-coding RNAs and RNA-binding proteins (RBPs), which all assemble along with other RNA-associated factors in dynamic and functional ribonucleoprotein complexes (RNPs). To date, our understanding of RBPs is largely limited to proteins with known or predicted RNA-binding domains. However, various methods have been recently developed to capture an RNA of interest and comprehensively identify its associated RBPs. In this review, we discuss RNA-affinity purification methods followed by mass spectrometry analysis (AP-MS), RBP screening within protein libraries, and computational methods that can be used to study the RNA-binding proteome (RBPome).
Affiliation(s)
- Camilla Faoro
  - School of Molecular Biosciences, University of Sydney, NSW, Australia
- Sandro F Ataide
  - School of Molecular Biosciences, University of Sydney, NSW, Australia
4
Minie M, Chopra G, Sethi G, Horst J, White G, Roy A, Hatti K, Samudrala R. CANDO and the infinite drug discovery frontier. Drug Discov Today 2014; 19:1353-63. [PMID: 24980786 DOI: 10.1016/j.drudis.2014.06.018] [Citation(s) in RCA: 32] [Impact Index Per Article: 3.2] [Received: 01/10/2014] [Revised: 06/18/2014] [Accepted: 06/19/2014] [Indexed: 12/21/2022]
Abstract
The Computational Analysis of Novel Drug Opportunities (CANDO) platform (http://protinfo.org/cando) uses similarity of compound-proteome interaction signatures to infer homology of compound/drug behavior. We constructed interaction signatures for 3733 human ingestible compounds covering 48,278 protein structures mapping to 2030 indications based on basic science methodologies to predict and analyze protein structure, function, and interactions developed by us and others. Our signature comparison and ranking approach yielded benchmarking accuracies of 12-25% for 1439 indications with at least two approved compounds. We prospectively validated 49/82 'high value' predictions from nine studies covering seven indications, with comparable or better activity to existing drugs, which serve as novel repurposed therapeutics. Our approach may be generalized to compounds beyond those approved by the FDA, and can also consider mutations in protein structures to enable personalization. Our platform provides a holistic multiscale modeling framework of complex atomic, molecular, and physiological systems with broader applications in medicine and engineering.
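As a rough illustration of ranking compounds by the similarity of their interaction signatures, the sketch below compares toy signature vectors. The cosine metric, the compound names, and the scores are all assumptions made for illustration; the abstract does not say which similarity measure CANDO uses.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length score vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_similar_compounds(query, signatures):
    """Rank all other compounds by signature similarity to the query."""
    query_signature = signatures[query]
    scored = [
        (name, cosine_similarity(query_signature, signature))
        for name, signature in signatures.items()
        if name != query
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical signatures: one predicted interaction score per protein.
signatures = {
    "compound_A": [0.9, 0.1, 0.8, 0.0],
    "compound_B": [0.8, 0.2, 0.7, 0.1],
    "compound_C": [0.0, 0.9, 0.1, 0.8],
}

ranking = rank_similar_compounds("compound_A", signatures)
```

In this toy data `compound_B` ranks first for `compound_A`, so under the signature-similarity idea it would be the first candidate to share `compound_A`'s indications.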
Affiliation(s)
- Mark Minie
  - University of Washington, Department of Bioengineering, Seattle, WA 98109, United States
- Gaurav Chopra
  - University of Washington, Department of Microbiology, Seattle, WA 98109, United States; University of California, San Francisco, Diabetes Center, San Francisco, CA 94143, United States
- Geetika Sethi
  - University of Washington, Department of Microbiology, Seattle, WA 98109, United States
- Jeremy Horst
  - University of California, School of Medicine, San Francisco, CA 94143, United States
- George White
  - University of Washington, Department of Microbiology, Seattle, WA 98109, United States
- Ambrish Roy
  - Georgia Institute of Technology, Center for the Study of Systems Biology, Atlanta, GA 30318, United States
- Kaushik Hatti
  - Molecular Biophysics Unit, Indian Institute of Science Bangalore, 560012, India
- Ram Samudrala
  - University of Washington, Department of Microbiology, Seattle, WA 98109, United States
5
Guariglia-Oropeza V, Orsi RH, Yu H, Boor KJ, Wiedmann M, Guldimann C. Regulatory network features in Listeria monocytogenes-changing the way we talk. Front Cell Infect Microbiol 2014; 4:14. [PMID: 24592357 PMCID: PMC3924034 DOI: 10.3389/fcimb.2014.00014] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.6] [Received: 10/29/2013] [Accepted: 01/27/2014] [Indexed: 01/04/2023] Open
Abstract
Our understanding of how pathogens shape their gene expression profiles in response to environmental changes is ever growing. Advances in bioinformatics have made it possible to model complex systems and integrate data from various sources into one large regulatory network. In these analyses, regulatory networks are typically broken down into regulatory motifs such as feed-forward loops (FFLs) or auto-regulatory feedbacks. This simplifies the structure, while the functional implications of the different regulatory motifs allow informed assumptions to be made about the function of a specific regulatory pathway. Here we review the basic concepts of network features and use this language to break down the regulatory networks that govern the interactions between the main regulators of stress response, virulence, and transmission in Listeria monocytogenes. We point out the advantage that taking a “systems approach” could have for our understanding of gene functions, the detection of distant regulatory inputs, interspecies comparisons, and co-expression.
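The two motif types named in the abstract can be made concrete with a small sketch. It treats a regulatory network as a list of directed edges (regulator, target), reports a feed-forward loop whenever X regulates Y, Y regulates Z, and X also regulates Z, and reports auto-regulation as a self-edge. The node names are placeholders, not actual Listeria regulators, and the function names are hypothetical.

```python
def find_feed_forward_loops(edges):
    """Return sorted (x, y, z) triples with edges x->y, y->z and x->z."""
    targets = {}
    for regulator, target in edges:
        targets.setdefault(regulator, set()).add(target)

    loops = []
    for x, x_targets in targets.items():
        for y in x_targets:
            if y == x:
                continue  # a self-edge is auto-regulation, not an FFL
            for z in targets.get(y, ()):
                if z != x and z != y and z in x_targets:
                    loops.append((x, y, z))
    return sorted(loops)


def find_autoregulators(edges):
    """Return sorted nodes that regulate themselves."""
    return sorted({src for src, dst in edges if src == dst})


# Placeholder network: X -> Y -> Z with a direct X -> Z edge,
# Z auto-regulates and also regulates W.
edges = [("X", "Y"), ("Y", "Z"), ("X", "Z"), ("Z", "Z"), ("Z", "W")]

ffls = find_feed_forward_loops(edges)
autoregulators = find_autoregulators(edges)
```

Enumerating motifs this way is cubic in the worst case, which is acceptable for the small curated networks typical of a single organism's core regulators.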
Affiliation(s)
- Renato H Orsi
  - Department of Food Science, Cornell University, Ithaca, NY, USA
- Haiyuan Yu
  - Department of Biological Statistics and Computational Biology, Cornell University, Ithaca, NY, USA; Department of Biological Statistics and Computational Biology, Weill Institute for Cell and Molecular Biology, Cornell University, Ithaca, NY, USA
- Kathryn J Boor
  - Department of Food Science, Cornell University, Ithaca, NY, USA
- Martin Wiedmann
  - Department of Food Science, Cornell University, Ithaca, NY, USA
6
Computational modeling of protein-RNA complex structures. Methods 2013; 65:310-9. [PMID: 24083976 DOI: 10.1016/j.ymeth.2013.09.014] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.1] [Received: 07/16/2013] [Revised: 09/17/2013] [Accepted: 09/19/2013] [Indexed: 12/26/2022] Open
Abstract
Protein-RNA interactions play fundamental roles in many biological processes, such as regulation of gene expression, RNA splicing, and protein synthesis. The understanding of these processes improves as new structures of protein-RNA complexes are solved and the molecular details of interactions analyzed. However, experimental determination of protein-RNA complex structures by high-resolution methods is tedious and difficult. Therefore, studies on protein-RNA recognition and complex formation present major technical challenges for macromolecular structural biology. Alternatively, protein-RNA interactions can be predicted by computational methods. Although less accurate than experimental measurements, theoretical models of macromolecular structures can be sufficiently accurate to prompt functional hypotheses and guide e.g. identification of important amino acid or nucleotide residues. In this article we present an overview of strategies and methods for computational modeling of protein-RNA complexes, including software developed in our laboratory, and illustrate it with practical examples of structural predictions.
7
Computational methods for prediction of protein-RNA interactions. J Struct Biol 2011; 179:261-8. [PMID: 22019768 DOI: 10.1016/j.jsb.2011.10.001] [Citation(s) in RCA: 83] [Impact Index Per Article: 6.4] [Received: 06/13/2011] [Revised: 09/28/2011] [Accepted: 10/04/2011] [Indexed: 12/21/2022]
Abstract
Understanding the molecular mechanism of protein-RNA recognition and complex formation is a major challenge in structural biology. Unfortunately, the experimental determination of protein-RNA complexes by X-ray crystallography and nuclear magnetic resonance spectroscopy (NMR) is tedious and difficult. Alternatively, protein-RNA interactions can be predicted by computational methods. Although less accurate than experimental observations, computational predictions can be sufficiently accurate to prompt functional hypotheses and guide experiments, e.g. to identify individual amino acid or nucleotide residues. In this article we review 10 methods for predicting protein-RNA interactions, seven of which predict RNA-binding sites from protein sequences and three from structures. We also developed a meta-predictor that uses the output of top three sequence-based primary predictors to calculate a consensus prediction, which outperforms all the primary predictors. In order to fully cover the software for predicting protein-RNA interactions, we also describe five methods for protein-RNA docking. The article highlights the strengths and shortcomings of existing methods for the prediction of protein-RNA interactions and provides suggestions for their further development.
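To show what a consensus over several primary predictors can look like, here is a minimal sketch that averages per-residue scores from three predictors and thresholds the mean. The averaging rule, the 0.5 threshold, and the example scores are assumptions; the abstract does not describe how the authors' meta-predictor combines its inputs.

```python
def consensus_prediction(score_lists, threshold=0.5):
    """Average per-residue scores across predictors and threshold the mean.

    score_lists: one list of per-residue scores in [0, 1] per predictor,
    all covering the same protein sequence.
    Returns (mean_scores, labels), where labels[i] is True if residue i
    is called RNA-binding by the consensus.
    """
    if len({len(scores) for scores in score_lists}) != 1:
        raise ValueError("all predictors must score the same residues")

    n_predictors = len(score_lists)
    mean_scores = [sum(column) / n_predictors for column in zip(*score_lists)]
    labels = [score >= threshold for score in mean_scores]
    return mean_scores, labels


# Hypothetical outputs of three sequence-based predictors for 4 residues.
predictor_a = [0.9, 0.2, 0.6, 0.1]
predictor_b = [0.8, 0.4, 0.3, 0.2]
predictor_c = [0.7, 0.1, 0.7, 0.0]

mean_scores, labels = consensus_prediction([predictor_a, predictor_b, predictor_c])
```

Here the third residue is called binding by the consensus even though one predictor scores it below the threshold, which is the kind of disagreement a meta-predictor is meant to smooth over.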
8
Gu H, Zhu P, Jiao Y, Meng Y, Chen M. PRIN: a predicted rice interactome network. BMC Bioinformatics 2011; 12:161. [PMID: 21575196 PMCID: PMC3118165 DOI: 10.1186/1471-2105-12-161] [Citation(s) in RCA: 131] [Impact Index Per Article: 10.1] [Received: 12/15/2010] [Accepted: 05/16/2011] [Indexed: 12/22/2022] Open
Abstract
Background: Protein-protein interactions play a fundamental role in elucidating the molecular mechanisms of biomolecular function, signal transduction and metabolic pathways of living organisms. Although high-throughput technologies such as the yeast two-hybrid system and affinity purification followed by mass spectrometry are widely used in model organisms, progress in detecting protein-protein interactions in plants is rather slow. With this motivation, our work presents a computational approach to predict protein-protein interactions in Oryza sativa. Results: To better understand the interactions of proteins in Oryza sativa, we have developed PRIN, a Predicted Rice Interactome Network. Protein-protein interaction data of PRIN are based on the interologs of six model organisms where large-scale protein-protein interaction experiments have been applied: yeast (Saccharomyces cerevisiae), worm (Caenorhabditis elegans), fruit fly (Drosophila melanogaster), human (Homo sapiens), Escherichia coli K12 and Arabidopsis thaliana. With certain quality controls, altogether we obtained 76,585 non-redundant rice protein interaction pairs among 5,049 rice proteins. Further analysis showed that the topological properties of the predicted rice protein interaction network are more similar to those of yeast than to those of the other five organisms. This may not be surprising, as the interologs based on yeast contribute nearly 74% of total interactions. In addition, GO annotation, subcellular localization information and gene expression data are also mapped to our network for validation. Finally, a user-friendly web interface was developed to offer convenient database search and network visualization. Conclusions: PRIN is the first well annotated protein interaction database for the important model plant Oryza sativa. It has greatly extended the currently available protein-protein interaction data of rice with a computational approach, which will certainly provide further insights into rice functional genomics and systems biology. PRIN is available online at http://bis.zju.edu.cn/prin/.
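The interolog idea behind such a network can be sketched in a few lines: if two proteins interact in a model organism and each has an ortholog in the target species, the ortholog pair is predicted to interact. The protein identifiers and the ortholog table below are invented placeholders, and the sketch omits the quality controls the abstract mentions.

```python
from itertools import product


def predict_interologs(model_interactions, orthologs):
    """Transfer interactions from a model organism via an ortholog map.

    model_interactions: iterable of (protein_a, protein_b) pairs.
    orthologs: dict mapping a model-organism protein to a list of its
               orthologs in the target species.
    Returns a set of unordered target-species pairs (stored sorted).
    """
    predicted = set()
    for protein_a, protein_b in model_interactions:
        targets_a = orthologs.get(protein_a, ())
        targets_b = orthologs.get(protein_b, ())
        for target_a, target_b in product(targets_a, targets_b):
            # Sort so that (x, y) and (y, x) count as one pair.
            predicted.add(tuple(sorted((target_a, target_b))))
    return predicted


# Placeholder data: two yeast interactions, orthologs known for two proteins.
yeast_interactions = [("yeastP1", "yeastP2"), ("yeastP2", "yeastP3")]
rice_orthologs = {
    "yeastP1": ["riceA"],
    "yeastP2": ["riceB", "riceC"],
}

rice_pairs = predict_interologs(yeast_interactions, rice_orthologs)
```

Storing each pair in sorted order is what makes the result non-redundant when the same interaction is inferred from several model organisms.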
Affiliation(s)
- Haibin Gu
  - Department of Bioinformatics, State Key Laboratory of Plant Physiology and Biochemistry, College of Life Sciences, Zhejiang University, Hangzhou 310058, China
9
Abdi A, Emamian ES. Fault diagnosis engineering in molecular signaling networks: an overview and applications in target discovery. Chem Biodivers 2010; 7:1111-23. [PMID: 20491069 DOI: 10.1002/cbdv.200900315] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.7] [Indexed: 02/02/2023]
Abstract
Fault diagnosis engineering is a key component of modern industrial facilities and complex systems, and has gone through considerable developments in the past few decades. In this paper, the principles and concepts of molecular fault diagnosis engineering are reviewed. In this area, molecular intracellular networks are considered as complex systems that may fail to function, due to the presence of some faulty molecules. Dysfunction of the system due to the presence of a single or multiple molecules can ultimately lead to the transition from the normal state to the disease state. It is the goal of molecular fault diagnosis engineering to identify the critical components of molecular networks, i.e., those whose dysfunction can interrupt the function of the entire network. The results of the fault analysis of several signaling networks are discussed, and possible connections of the findings with some complex human diseases are examined. Implications of molecular fault diagnosis engineering for target discovery and drug development are outlined as well.
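One way to picture the search for critical components is a stuck-at fault analysis on a Boolean model of a network. The sketch below forces each molecule to a fixed state in turn and measures how often the network output changes. The three-node network, its logic rules, and the scoring are toy assumptions for illustration, not the networks or the exact vulnerability measure analyzed in the paper.

```python
from itertools import product

# Toy signaling network: two inputs feed A and B, which activate output C.
INPUTS = ["in1", "in2"]
NODES = [
    ("A", lambda state: state["in1"]),
    ("B", lambda state: state["in2"]),
    ("C", lambda state: state["A"] or state["B"]),
]
OUTPUT = "C"


def simulate(input_values, stuck_node=None, stuck_value=None):
    """Evaluate the network, optionally forcing one node to a fixed value."""
    state = dict(zip(INPUTS, input_values))
    for name, rule in NODES:
        state[name] = stuck_value if name == stuck_node else rule(state)
    return state[OUTPUT]


def vulnerability(node):
    """Fraction of (input, stuck value) cases where the output is wrong."""
    mismatches = 0
    trials = 0
    for input_values in product([False, True], repeat=len(INPUTS)):
        healthy_output = simulate(input_values)
        for stuck_value in (False, True):
            trials += 1
            if simulate(input_values, node, stuck_value) != healthy_output:
                mismatches += 1
    return mismatches / trials


scores = {name: vulnerability(name) for name, _ in NODES}
most_critical = max(scores, key=scores.get)
```

In this toy network `C` scores highest because every signal passes through it, whereas a fault in `A` or `B` is often masked by the other branch.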
Affiliation(s)
- Ali Abdi
  - Center for Wireless Communications and Signal Processing Research, Department of Electrical and Computer Engineering & Department of Biological Sciences, New Jersey Institute of Technology, 323 King Blvd, Newark, NJ 07102, USA