1
DelRosso N, Suzuki PH, Griffith D, Lotthammer JM, Novak B, Kocalar S, Sheth MU, Holehouse AS, Bintu L, Fordyce P. High-throughput affinity measurements of direct interactions between activation domains and co-activators. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2024:2024.08.19.608698. [PMID: 39229005 PMCID: PMC11370418 DOI: 10.1101/2024.08.19.608698] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/05/2024]
Abstract
Sequence-specific activation by transcription factors is essential for gene regulation [1,2]. Key to this are activation domains, which often fall within disordered regions of transcription factors [3,4] and recruit co-activators to initiate transcription [5]. These interactions are difficult to characterize via most experimental techniques because they are typically weak and transient [6,7]. Consequently, we know very little about whether these interactions are promiscuous or specific, the mechanisms of binding, and how these interactions tune the strength of gene activation. To address these questions, we developed a microfluidic platform for expression and purification of hundreds of activation domains in parallel followed by direct measurement of co-activator binding affinities (STAMMPPING, for Simultaneous Trapping of Affinity Measurements via a Microfluidic Protein-Protein INteraction Generator). By applying STAMMPPING to quantify direct interactions between eight co-activators and 204 human activation domains (>1,500 Kd values), we provide the first quantitative map of these interactions and reveal 334 novel binding pairs. We find that the metazoan-specific co-activator P300 directly binds >100 activation domains, potentially explaining its widespread recruitment across the genome to influence transcriptional activation. Despite sharing similar molecular properties (e.g., enrichment of negative and hydrophobic residues), activation domains utilize distinct biophysical properties to recruit certain co-activator domains. Co-activator domain affinity and occupancy are well-predicted by analytical models that account for multivalency, and in vitro affinities quantitatively predict activation in cells with an ultrasensitive response.
Not only do our results demonstrate the ability to measure affinities between even weak protein-protein interactions in high throughput, but they also provide a necessary resource of over 1,500 activation domain/co-activator affinities which lays the foundation for understanding the molecular basis of transcriptional activation.
2
P P, Riyaz A, Choudhury A, Choudhury PR, Pradhan N, Singh A, Nakul M, Dudeja C, Yadav A, Nath SK, Khanna V, Sharma T, Pradhan G, Takkar S, Rawal K. DNASCANNER v2: A Web-Based Tool to Analyze the Characteristic Properties of Nucleotide Sequences. J Comput Biol 2024; 31:651-669. [PMID: 38662479 DOI: 10.1089/cmb.2023.0227] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/18/2024] Open
Abstract
Throughout the process of evolution, DNA undergoes the accumulation of distinct mutations, which can often result in highly organized patterns that serve various essential biological functions. These patterns encompass various genomic elements and provide valuable insights into the regulatory and functional aspects of DNA. The physicochemical, mechanical, thermodynamic, and structural properties of DNA sequences play a crucial role in the formation of specific patterns. These properties contribute to the three-dimensional structure of DNA and influence their interactions with proteins, regulatory elements, and other molecules. In this study, we introduce DNASCANNER v2, an advanced version of our previously published algorithm DNASCANNER for analyzing DNA properties. The current tool is built using the FLASK framework in Python language. Featuring a user-friendly interface tailored for nonspecialized researchers, it offers an extensive analysis of 158 DNA properties, including mono/di/trinucleotide frequencies, structural, physicochemical, thermodynamics, and mechanical properties of DNA sequences. The tool provides downloadable results and offers interactive plots for easy interpretation and comparison between different features. We also demonstrate the utility of DNASCANNER v2 in analyzing splice-site junctions, casposon insertion sequences, and transposon insertion sites (TIS) within the bacterial and human genomes, respectively. We also developed a deep learning module for the prediction of potential TIS in a given nucleotide sequence. In the future, we aim to optimize the performance of this prediction model through extensive training on larger data sets.
Affiliation(s)
- Preeti P
  - Center for Computational Biology and Bioinformatics, Amity Institute of Biotechnology, Amity University, Noida, Uttar Pradesh, India
- Azeen Riyaz
  - Center for Computational Biology and Bioinformatics, Amity Institute of Biotechnology, Amity University, Noida, Uttar Pradesh, India
- Alakto Choudhury
  - Center for Computational Biology and Bioinformatics, Amity Institute of Biotechnology, Amity University, Noida, Uttar Pradesh, India
- Priyanka Ray Choudhury
  - Center for Computational Biology and Bioinformatics, Amity Institute of Biotechnology, Amity University, Noida, Uttar Pradesh, India
- Nischal Pradhan
  - Center for Computational Biology and Bioinformatics, Amity Institute of Biotechnology, Amity University, Noida, Uttar Pradesh, India
- Abhishek Singh
  - Center for Computational Biology and Bioinformatics, Amity Institute of Biotechnology, Amity University, Noida, Uttar Pradesh, India
- Mihir Nakul
  - Center for Computational Biology and Bioinformatics, Amity Institute of Biotechnology, Amity University, Noida, Uttar Pradesh, India
- Chhavi Dudeja
  - Center for Computational Biology and Bioinformatics, Amity Institute of Biotechnology, Amity University, Noida, Uttar Pradesh, India
- Abhijeet Yadav
  - Center for Computational Biology and Bioinformatics, Amity Institute of Biotechnology, Amity University, Noida, Uttar Pradesh, India
- Swarsat Kaushik Nath
  - Center for Computational Biology and Bioinformatics, Amity Institute of Biotechnology, Amity University, Noida, Uttar Pradesh, India
- Vrinda Khanna
  - Center for Computational Biology and Bioinformatics, Amity Institute of Biotechnology, Amity University, Noida, Uttar Pradesh, India
- Trapti Sharma
  - Center for Computational Biology and Bioinformatics, Amity Institute of Biotechnology, Amity University, Noida, Uttar Pradesh, India
- Gayatri Pradhan
  - Center for Computational Biology and Bioinformatics, Amity Institute of Biotechnology, Amity University, Noida, Uttar Pradesh, India
- Simran Takkar
  - Center for Computational Biology and Bioinformatics, Amity Institute of Biotechnology, Amity University, Noida, Uttar Pradesh, India
- Kamal Rawal
  - Center for Computational Biology and Bioinformatics, Amity Institute of Biotechnology, Amity University, Noida, Uttar Pradesh, India
3
Ridha F, Gromiha MM. MPA-Pred: A machine learning approach for predicting the binding affinity of membrane protein-protein complexes. Proteins 2024; 92:499-508. [PMID: 37949651 DOI: 10.1002/prot.26633] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/05/2023] [Revised: 10/05/2023] [Accepted: 10/25/2023] [Indexed: 11/12/2023]
Abstract
Membrane protein-protein interactions are essential for several functions including cell signaling, ion transport, and enzymatic activity. These interactions are mainly dictated by their binding affinities. Although several methods are available for predicting the binding affinity of protein-protein complexes, there exists no specific method for membrane protein-protein complexes. In this work, we collected the experimental binding affinity data for a set of 114 membrane protein-protein complexes and derived several structure and sequence-based features. Our analysis of the relationship between binding affinity and the features revealed that the important factors mainly depend on the type of membrane protein and the functional class of the protein. Specifically, aromatic and charged residues at the interface, and aromatic-aromatic and electrostatic interactions are found to be important to understand the binding affinity. Further, we developed a method, MPA-Pred, for predicting the binding affinity of membrane protein-protein complexes using a machine learning approach. It showed an average correlation and mean absolute error of 0.83 and 0.91 kcal/mol, respectively, using the jack-knife test on a set of 114 complexes. We have also developed a web server and it is available at https://web.iitm.ac.in/bioinfo2/MPA-Pred/. This method can be used for predicting the affinity of membrane protein-protein complexes at a large scale and help improve drug design strategies.
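The jack-knife test reported in this abstract can be illustrated with a small sketch. Each complex is predicted by a model trained on all the other complexes, and the correlation and mean absolute error are then computed over those held-out predictions. The sketch below assumes a single-feature least-squares line as a stand-in for the machine learning model, and the feature values and ΔG values are made-up toy numbers, not data from the paper.

```python
from math import sqrt

def fit_line(xs, ys):
    """Ordinary least-squares fit y = a*x + b for a single feature."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    a = sxy / sxx
    return a, my - a * mx

def jackknife_predictions(xs, ys):
    """Leave-one-out: predict each sample from a model fit on all the others."""
    preds = []
    for i in range(len(xs)):
        a, b = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        preds.append(a * xs[i] + b)
    return preds

def pearson(u, v):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    return cov / sqrt(sum((a - mu) ** 2 for a in u) * sum((b - mv) ** 2 for b in v))

def mean_absolute_error(u, v):
    """Mean of the absolute differences between paired values."""
    return sum(abs(a - b) for a, b in zip(u, v)) / len(u)

# Hypothetical toy data: one interface feature and experimental dG (kcal/mol).
feature = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
dg_exp = [-7.9, -8.6, -9.1, -10.2, -10.8, -12.1]

dg_pred = jackknife_predictions(feature, dg_exp)
r = pearson(dg_exp, dg_pred)
mae = mean_absolute_error(dg_exp, dg_pred)
print(f"jack-knife r = {r:.2f}, MAE = {mae:.2f} kcal/mol")
```

Because every prediction comes from a model that never saw the complex being predicted, the resulting correlation and error are a less optimistic estimate than the same metrics computed on the training fit.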
Affiliation(s)
- Fathima Ridha
  - Department of Biotechnology, Bhupat and Jyoti Mehta School of Biosciences, Indian Institute of Technology Madras, Chennai, India
- M Michael Gromiha
  - Department of Biotechnology, Bhupat and Jyoti Mehta School of Biosciences, Indian Institute of Technology Madras, Chennai, India
  - Department of Computer Science, National University of Singapore, Singapore, Singapore
4
Melikishvili M, Fried MG, Fondufe-Mittendorf YN. Cooperative nucleic acid binding by Poly ADP-ribose polymerase 1. Sci Rep 2024; 14:7530. [PMID: 38553566 PMCID: PMC10980755 DOI: 10.1038/s41598-024-58076-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/01/2023] [Accepted: 03/25/2024] [Indexed: 04/02/2024] Open
Abstract
Poly (ADP)-ribose polymerase 1 (PARP1) is an abundant nuclear protein that is well known for its role in DNA repair yet also participates in DNA replication, transcription, and co-transcriptional splicing, where DNA is undamaged. Thus, binding to undamaged regions in DNA and RNA is likely a part of PARP1's normal repertoire. Here we describe analyses of PARP1 binding to two short single-stranded DNAs, a single-stranded RNA, and a double-stranded DNA. The investigations involved comparing the wild-type (WT) full-length enzyme with mutants lacking the catalytic domain (∆CAT) or zinc fingers 1 and 2 (∆Zn1∆Zn2). All three protein types exhibited monomeric characteristics in solution and formed saturated 2:1 complexes with single-stranded T20 and U20 oligonucleotides. These complexes formed without accumulation of 1:1 intermediates, a pattern suggestive of positive binding cooperativity. The retention of binding activities by ∆CAT and ∆Zn1∆Zn2 enzymes suggests that neither the catalytic domain nor zinc fingers 1 and 2 are indispensable for cooperative binding. In contrast, when a double-stranded 19mer DNA was tested, WT PARP1 formed a 4:1 complex while the ∆Zn1∆Zn2 mutant binding saturated at 1:1 stoichiometry. These deviations from the 2:1 pattern observed with T20 and U20 oligonucleotides show that PARP1's binding mechanism can be influenced by the secondary structure of the nucleic acid. Our studies show that PARP1:nucleic acid interactions are strongly dependent on the nucleic acid type and properties, perhaps reflecting PARP1's ability to respond differently to different nucleic acid ligands in cells. These findings lay a platform for understanding how the functionally versatile PARP1 recognizes diverse oligonucleotides within the realms of chromatin and RNA biology.
Affiliation(s)
- Manana Melikishvili
  - Department of Epigenetics, Van Andel Institute, Grand Rapids, MI, 49503, USA
- Michael G Fried
  - Center for Structural Biology, Department of Molecular and Cellular Biochemistry, University of Kentucky, Lexington, KY, 40536, USA.
5
Pandey U, Behara SM, Sharma S, Patil RS, Nambiar S, Koner D, Bhukya H. DeePNAP: A Deep Learning Method to Predict Protein-Nucleic Acid Binding Affinity from Their Sequences. J Chem Inf Model 2024; 64:1806-1815. [PMID: 38458968 DOI: 10.1021/acs.jcim.3c01151] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/10/2024]
Abstract
Predicting the protein-nucleic acid (PNA) binding affinity solely from their sequences is of paramount importance for the experimental design and analysis of PNA interactions (PNAIs). A large number of currently developed models for binding affinity prediction are limited to specific PNAIs while also relying on the sequence and structural information of the PNA complexes for both training and testing, and also as inputs. As the PNA complex structures available are scarce, this significantly limits the diversity and generalizability due to the small training data set. Additionally, a majority of the tools predict a single parameter, such as binding affinity or free energy changes upon mutations, rendering a model less versatile for usage. Hence, we propose DeePNAP, a machine learning-based model built from a vast and heterogeneous data set with 14,401 entries (from both eukaryotes and prokaryotes) from the ProNAB database, consisting of wild-type and mutant PNA complex binding parameters. Our model precisely predicts the binding affinity and free energy changes due to the mutation(s) of PNAIs exclusively from their sequences. While other similar tools extract features from both sequence and structure information, DeePNAP employs sequence-based features to yield high correlation coefficients between the predicted and experimental values with low root mean squared errors for PNA complexes in predicting KD and ΔΔG, implying the generalizability of DeePNAP. Additionally, we have also developed a web interface hosting DeePNAP that can serve as a powerful tool to rapidly predict binding affinities for a myriad of PNAIs with high precision toward developing a deeper understanding of their implications in various biological systems. Web interface: http://14.139.174.41:8080/.
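The two quantities named in this abstract, KD and ΔΔG, are linked by the standard thermodynamic relation ΔG = RT ln(KD). The short sketch below shows that conversion. It assumes a 1 M standard state, a temperature of 298.15 K, and the sign convention ΔΔG = ΔG(mutant) − ΔG(wild type); none of these choices is stated in the abstract, and other conventions are also in use. The function names are illustrative and not part of DeePNAP.

```python
import math

# Gas constant in kcal/(mol*K).
R_KCAL = 1.987204e-3

def delta_g_from_kd(kd_molar, temperature_k=298.15):
    """Binding free energy (kcal/mol) from a dissociation constant in mol/L."""
    if kd_molar <= 0:
        raise ValueError("KD must be positive")
    return R_KCAL * temperature_k * math.log(kd_molar)

def delta_delta_g(kd_mutant, kd_wild_type, temperature_k=298.15):
    """ddG = dG(mutant) - dG(wild type); positive means weaker binding (assumed convention)."""
    return (delta_g_from_kd(kd_mutant, temperature_k)
            - delta_g_from_kd(kd_wild_type, temperature_k))

# A 1 nM complex and a mutant that binds ten times more weakly.
print(f"dG(wild type) = {delta_g_from_kd(1e-9):.2f} kcal/mol")
print(f"ddG(mutant)   = {delta_delta_g(1e-8, 1e-9):.2f} kcal/mol")
```

Under these assumptions, every tenfold change in KD shifts ΔG by RT ln(10), roughly 1.36 kcal/mol at 298.15 K, which gives a sense of scale for reported prediction errors of about 1 kcal/mol.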
Affiliation(s)
- Uddeshya Pandey
  - Department of Biology, Indian Institute of Science Education and Research Tirupati, Tirupati 517507, India
- Sasi M Behara
  - Department of Biology, Indian Institute of Science Education and Research Tirupati, Tirupati 517507, India
- Siddhant Sharma
  - Department of Biology, Indian Institute of Science Education and Research Tirupati, Tirupati 517507, India
- Rachit S Patil
  - Department of Biology, Indian Institute of Science Education and Research Tirupati, Tirupati 517507, India
- Souparnika Nambiar
  - Department of Biology, Indian Institute of Science Education and Research Tirupati, Tirupati 517507, India
- Debasish Koner
  - Department of Chemistry, Indian Institute of Technology Hyderabad, Kandi 502284, India
- Hussain Bhukya
  - Department of Biology, Indian Institute of Science Education and Research Tirupati, Tirupati 517507, India
6
Harini K, Sekijima M, Gromiha MM. PRA-Pred: Structure-based prediction of protein-RNA binding affinity. Int J Biol Macromol 2024; 259:129490. [PMID: 38224813 DOI: 10.1016/j.ijbiomac.2024.129490] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/10/2023] [Revised: 01/10/2024] [Accepted: 01/12/2024] [Indexed: 01/17/2024]
Abstract
Understanding crucial factors that affect the binding affinity of protein-RNA complexes is vital for comprehending their recognition mechanisms. This study involved compiling experimentally measured binding affinity (ΔG) values of 217 protein-RNA complexes and extracting numerous structure-based features, considering RNA, protein, and interactions between protein and RNA. Our findings indicate the significance of RNA base-step parameters, interaction energies, number of atomic contacts in the complex, hydrogen bonds, and contact potentials in understanding the binding affinity. Further, we observed that these factors are influenced by the type of RNA strand and the function of the protein in a protein-RNA complex. Multiple regression equations were developed for different classes of complexes to perform the prediction of the binding affinity between the protein and RNA. We evaluated the models using the jack-knife test and achieved an overall correlation of 0.77 between the experimental and predicted binding affinities with a mean absolute error of 1.02 kcal/mol. Furthermore, we introduced a web server, PRA-Pred, intended for the prediction of protein-RNA binding affinity, and it is freely accessible through https://web.iitm.ac.in/bioinfo2/prapred/. We propose that our approach could function as a potential resource for investigating protein-RNA recognitions and developing therapeutic strategies.
Affiliation(s)
- K Harini
  - Department of Biotechnology, Bhupat and Jyoti Mehta School of Biosciences, Indian Institute of Technology Madras, Chennai 600036, India
- M Sekijima
  - Department of Computer Science, Tokyo Institute of Technology, Yokohama, Japan
- M Michael Gromiha
  - Department of Biotechnology, Bhupat and Jyoti Mehta School of Biosciences, Indian Institute of Technology Madras, Chennai 600036, India; International Research Frontiers Initiative, School of Computing, Tokyo Institute of Technology, Yokohama, 226-8501, Japan; Department of Computer Science, National University of Singapore, Singapore.
7
Liu S, Gomez-Alcala P, Leemans C, Glassford WJ, Mann RS, Bussemaker HJ. Predicting the DNA binding specificity of mutated transcription factors using family-level biophysically interpretable machine learning. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2024:2024.01.24.577115. [PMID: 38352411 PMCID: PMC10862739 DOI: 10.1101/2024.01.24.577115] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/20/2024]
Abstract
Sequence-specific interactions of transcription factors (TFs) with genomic DNA underlie many cellular processes. High-throughput in vitro binding assays coupled with computational analysis have made it possible to accurately define such sequence recognition in a biophysically interpretable yet mechanism-agnostic way for individual TFs. The fact that such sequence-to-affinity models are now available for hundreds of TFs provides new avenues for predicting how the DNA binding specificity of a TF changes when its protein sequence is mutated. To this end, we developed an analytical framework based on a tetrahedron embedding that can be applied at the level of a given structural TF family. Using bHLH as a test case, we demonstrate that we can systematically map dependencies between the protein sequence of a TF and base preference within the DNA binding site. We also develop a regression approach to predict the quantitative energetic impact of mutations in the DNA binding domain of a TF on its DNA binding specificity, and perform SELEX-seq assays on mutated TFs to experimentally validate our results. Our results point to the feasibility of predicting the functional impact of disease mutations and allelic variation in the cell-wide TF repertoire by leveraging high-quality functional information across sets of homologous wild-type proteins.
Affiliation(s)
- Shaoxun Liu
  - Department of Biological Sciences, Columbia University, New York, NY, USA
- Pilar Gomez-Alcala
  - Department of Biological Sciences, Columbia University, New York, NY, USA
- Christ Leemans
  - Department of Biological Sciences, Columbia University, New York, NY, USA
- William J Glassford
  - Department of Biochemistry and Molecular Biophysics, Columbia University, New York, NY, USA
- Richard S Mann
  - Department of Biochemistry and Molecular Biophysics, Columbia University, New York, NY, USA
  - Department of Systems Biology, Columbia University, New York, NY, USA
- Harmen J Bussemaker
  - Department of Biological Sciences, Columbia University, New York, NY, USA
  - Department of Systems Biology, Columbia University, New York, NY, USA
8
Xu K, Li J, Li WX. Simulation of STAT and HP1 interaction by molecular docking. Cell Signal 2023; 112:110925. [PMID: 37839545 DOI: 10.1016/j.cellsig.2023.110925] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/27/2023] [Revised: 10/06/2023] [Accepted: 10/11/2023] [Indexed: 10/17/2023]
Abstract
Heterochromatin Protein 1 (HP1) is a major component of heterochromatin. Multiple proteins have been shown to interact with HP1 with the HP1-binding motif PxVxL/I, thereby affecting heterochromatin stability. The HP1-interacting proteins include the signal transducer and activator of transcription (STAT) protein, which can be regulated by phosphorylation on a tyrosine around amino acid 700 in the carboxyl terminus. Previous research has shown that unphosphorylated STAT (uSTAT) binds to HP1 via a PxVxI HP1-binding motif and maintains the stability of heterochromatin, while phosphorylated STAT (pSTAT) dissociates from HP1, resulting in heterochromatin disruption. To understand the theoretical basis of the biochemical observations, we employed computational modeling to investigate STAT-HP1 binding configurations and the effect of STAT phosphorylation on their interaction. Using STAT3 and HP1α protein structures for molecular docking and thermodynamic calculations, our computations predict that uSTAT homodimers have a higher affinity for HP1 and a lower affinity for DNA than pSTAT homodimers, and that phosphorylation induces a conformational change in STAT, shifting its binding preference from HP1 to DNA. The results of our modeling studies support the idea that phosphorylation drives STAT from HP1-binding to DNA-binding, suggesting a potential role for uSTAT in both maintaining and initiating heterochromatin formation.
Affiliation(s)
- Kangxin Xu
  - Department of Medicine, University of California San Diego, USA
- Jinghong Li
  - Department of Medicine, University of California San Diego, USA
- Willis X Li
  - Department of Medicine, University of California San Diego, USA.
9
Zhang X, Mei LC, Gao YY, Hao GF, Song BA. Web tools support predicting protein-nucleic acid complexes stability with affinity changes. WILEY INTERDISCIPLINARY REVIEWS. RNA 2023; 14:e1781. [PMID: 36693636 DOI: 10.1002/wrna.1781] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/14/2022] [Revised: 11/10/2022] [Accepted: 11/28/2022] [Indexed: 01/26/2023]
Abstract
Numerous biological processes, such as transcription, replication, and translation, rely on protein-nucleic acid interactions (PNIs). Demonstrating the binding stability of protein-nucleic acid complexes is vital to deciphering the code for PNIs. Numerous web-based tools have been developed to attach importance to protein-nucleic acid stability, facilitating rapid prediction of PNI characteristics. However, the data and tools are dispersed and lack the comprehensive integration needed to better understand the stability of PNIs. In this review, we first summarize existing databases for evaluating the stability of protein-nucleic acid binding. Then, we compare and evaluate the pros and cons of web tools for forecasting the interaction energies of protein-nucleic acid complexes. Finally, we discuss the application of combining models and capabilities of PNIs. We hope these web-based tools will facilitate the discovery of recognition mechanisms for protein-nucleic acid binding stability. This article is categorized under: RNA Interactions with Proteins and Other Molecules > Protein-RNA Recognition; RNA Interactions with Proteins and Other Molecules > RNA-Protein Complexes; RNA Interactions with Proteins and Other Molecules > Protein-RNA Interactions: Functional Implications.
Affiliation(s)
- Xiao Zhang
  - National Key Laboratory of Green Pesticide, Key Laboratory of Green Pesticide and Agricultural Bioengineering, Ministry of Education, Center for Research and Development of Fine Chemicals, Guizhou University, Guiyang, China
- Long-Can Mei
  - National Key Laboratory of Green Pesticide, Central China Normal University, Wuhan, China
- Yang-Yang Gao
  - National Key Laboratory of Green Pesticide, Key Laboratory of Green Pesticide and Agricultural Bioengineering, Ministry of Education, Center for Research and Development of Fine Chemicals, Guizhou University, Guiyang, China
- Ge-Fei Hao
  - National Key Laboratory of Green Pesticide, Key Laboratory of Green Pesticide and Agricultural Bioengineering, Ministry of Education, Center for Research and Development of Fine Chemicals, Guizhou University, Guiyang, China
  - National Key Laboratory of Green Pesticide, Central China Normal University, Wuhan, China
- Bao-An Song
  - National Key Laboratory of Green Pesticide, Key Laboratory of Green Pesticide and Agricultural Bioengineering, Ministry of Education, Center for Research and Development of Fine Chemicals, Guizhou University, Guiyang, China
10
Yang S, Gong W, Zhou T, Sun X, Chen L, Zhou W, Li C. emPDBA: protein-DNA binding affinity prediction by combining features from binding partners and interface learned with ensemble regression model. Brief Bioinform 2023:7165253. [PMID: 37193676 DOI: 10.1093/bib/bbad192] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/26/2023] [Revised: 04/26/2023] [Accepted: 04/29/2023] [Indexed: 05/18/2023] Open
Abstract
Protein-deoxyribonucleic acid (DNA) interactions are important in a variety of biological processes. Accurately predicting protein-DNA binding affinity has been one of the most attractive and challenging issues in computational biology. However, the existing approaches still have much room for improvement. In this work, we propose an ensemble model for Protein-DNA Binding Affinity prediction (emPDBA), which combines six base models with one meta-model. The complexes are classified into four types based on the DNA structure (double-stranded or other forms) and the percentage of interface residues. For each type, emPDBA is trained with the sequence-based, structure-based and energy features from binding partners and complex structures. Through feature selection by the sequential forward selection method, it is found that there do exist considerable differences in the key factors contributing to intermolecular binding affinity. The complex classification is beneficial for the important feature extraction for binding affinity prediction. The performance comparison of our method with other peer ones on the independent testing dataset shows that emPDBA outperforms the state-of-the-art methods with the Pearson correlation coefficient of 0.53 and the mean absolute error of 1.11 kcal/mol. The comprehensive results demonstrate that our method has a good performance for protein-DNA binding affinity prediction. Availability and implementation: The source code is available at https://github.com/ChunhuaLiLab/emPDBA/.
Affiliation(s)
- Shuang Yang
  - Faculty of Environmental and Life Sciences, Beijing University of Technology, Beijing 100124, China
- Weikang Gong
  - Faculty of Environmental and Life Sciences, Beijing University of Technology, Beijing 100124, China
- Tong Zhou
  - Faculty of Environmental and Life Sciences, Beijing University of Technology, Beijing 100124, China
- Xiaohan Sun
  - Faculty of Environmental and Life Sciences, Beijing University of Technology, Beijing 100124, China
- Lei Chen
  - Faculty of Environmental and Life Sciences, Beijing University of Technology, Beijing 100124, China
- Wenxue Zhou
  - Faculty of Environmental and Life Sciences, Beijing University of Technology, Beijing 100124, China
- Chunhua Li
  - Faculty of Environmental and Life Sciences, Beijing University of Technology, Beijing 100124, China
11
Harini K, Kihara D, Michael Gromiha M. PDA-Pred: Predicting the binding affinity of protein-DNA complexes using machine learning techniques and structural features. Methods 2023; 213:10-17. [PMID: 36924867 PMCID: PMC10563387 DOI: 10.1016/j.ymeth.2023.03.002] [Citation(s) in RCA: 8] [Impact Index Per Article: 8.0] [Received: 11/10/2022] [Revised: 02/17/2023] [Accepted: 03/11/2023] [Indexed: 03/17/2023] Open
Abstract
Protein-DNA interactions play an important role in various biological processes such as gene expression, replication, and transcription. Understanding the important features that dictate the binding affinity of protein-DNA complexes and predicting their affinities is important for elucidating their recognition mechanisms. In this work, we have collected the experimental binding free energy (ΔG) for a set of 391 protein-DNA complexes and derived several structure-based features such as interaction energy, contact potentials, volume and surface area of binding site residues, base step parameters of the DNA and contacts between different types of atoms. Our analysis of the relationship between binding affinity and structural features revealed that the important factors mainly depend on the number of DNA strands as well as functional and structural classes of proteins. Specifically, binding site properties such as number of atom contacts between the DNA and protein, volume of protein binding sites and interaction-based features such as interaction energies and contact potentials are important to understand the binding affinity. Further, we developed multiple regression equations for predicting the binding affinity of protein-DNA complexes belonging to different structural and functional classes. Our method showed an average correlation and mean absolute error of 0.78 and 0.98 kcal/mol, respectively, between the experimental and predicted binding affinities on a jack-knife test. We have developed a webserver, PDA-Pred (Protein-DNA Binding affinity predictor), for predicting the affinity of protein-DNA complexes and it is freely available at https://web.iitm.ac.in/bioinfo2/pdapred/.
Affiliation(s)
- K Harini
  - Department of Biotechnology, Bhupat and Jyoti Mehta School of Biosciences, Indian Institute of Technology Madras, Chennai 600036, India
- Daisuke Kihara
  - Department of Biological Sciences, Purdue University, West Lafayette, IN, United States; Department of Computer Science, Purdue University, West Lafayette, IN, United States
- M Michael Gromiha
  - Department of Biotechnology, Bhupat and Jyoti Mehta School of Biosciences, Indian Institute of Technology Madras, Chennai 600036, India; International Research Frontiers Initiative, School of Computing, Tokyo Institute of Technology, Yokohama 226-8501, Japan.
12
Esmaeeli R, Bauzá A, Perez A. Structural predictions of protein-DNA binding: MELD-DNA. Nucleic Acids Res 2023; 51:1625-1636. [PMID: 36727436 PMCID: PMC9976882 DOI: 10.1093/nar/gkad013] [Citation(s) in RCA: 8] [Impact Index Per Article: 8.0] [Received: 11/11/2022] [Revised: 12/27/2022] [Accepted: 01/30/2023] [Indexed: 02/03/2023] Open
Abstract
Structural, regulatory and enzymatic proteins interact with DNA to maintain a healthy and functional genome. Yet, our structural understanding of how proteins interact with DNA is limited. We present MELD-DNA, a novel computational approach to predict the structures of protein-DNA complexes. The method combines molecular dynamics simulations with general knowledge or experimental information through Bayesian inference. The physical model is sensitive to sequence-dependent properties and conformational changes required for binding, while information accelerates sampling of bound conformations. MELD-DNA can: (i) sample multiple binding modes; (ii) identify the preferred binding mode from the ensembles; and (iii) provide qualitative binding preferences between DNA sequences. We first assess performance on a dataset of 15 protein-DNA complexes and compare it with state-of-the-art methodologies. Furthermore, for three selected complexes, we show sequence dependence effects of binding in MELD predictions. We expect that the results presented herein, together with the freely available software, will impact structural biology (by complementing DNA structural databases) and molecular recognition (by bringing new insights into aspects governing protein-DNA interactions).
Affiliation(s)
- Reza Esmaeeli: Department of Chemistry, Quantum theory project, University of Florida, Gainesville, FL 32611, USA
- Antonio Bauzá: Department of Chemistry, Universitat de les Illes Balears, Palma de Mallorca (Baleares), 07122, Spain
- Alberto Perez: Department of Chemistry, Quantum theory project, University of Florida, Gainesville, FL 32611, USA
13
Su W, Xie XQ, Liu XW, Gao D, Ma CY, Zulfiqar H, Yang H, Lin H, Yu XL, Li YW. iRNA-ac4C: A novel computational method for effectively detecting N4-acetylcytidine sites in human mRNA. Int J Biol Macromol 2023; 227:1174-1181. [PMID: 36470433 DOI: 10.1016/j.ijbiomac.2022.11.299] [Citation(s) in RCA: 10] [Impact Index Per Article: 10.0] [Received: 09/22/2022] [Revised: 11/10/2022] [Accepted: 11/25/2022] [Indexed: 12/07/2022]
Abstract
RNA N4-acetylcytidine (ac4C) is the acetylation of cytidine at the nitrogen-4 position; it is a highly conserved RNA modification and is involved in a variety of biological processes. Hence, accurate identification of genome-wide ac4C sites is vital for understanding the regulatory mechanisms of gene expression. In this work, a novel predictor, named iRNA-ac4C, was established to identify ac4C sites in human mRNA based on three feature extraction methods: nucleotide composition, nucleotide chemical property, and accumulated nucleotide frequency. Subsequently, minimum-Redundancy-Maximum-Relevance combined with incremental feature selection was used to select the optimal feature subset. Using this optimal feature subset, the best ac4C classification model was trained by a gradient boosting decision tree with 10-fold cross-validation. Results on an independent test set indicated that the proposed method has encouraging generalization capabilities. For the convenience of other researchers, we established a user-friendly web server, which is freely available at http://lin-group.cn/server/iRNA-ac4C/. We hope that the tool can serve as a guide for experimental researchers.
Affiliation(s)
- Wei Su: Center for Informational Biology, School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu 611731, China
- Xue-Qin Xie: Center for Informational Biology, School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu 611731, China
- Xiao-Wei Liu: Center for Informational Biology, School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu 611731, China
- Dong Gao: Center for Informational Biology, School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu 611731, China
- Cai-Yi Ma: Center for Informational Biology, School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu 611731, China
- Hasan Zulfiqar: Center for Informational Biology, School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu 611731, China
- Hui Yang: Center for Informational Biology, School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu 611731, China
- Hao Lin: Center for Informational Biology, School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu 611731, China
- Xiao-Long Yu: School of Materials Science and Engineering, Hainan University, Haikou 570228, China
- Yan-Wen Li: School of Information Science and Technology, Northeast Normal University, Changchun 130117, China; Key Laboratory of Intelligent Information Processing of Jilin Province, Northeast Normal University, Changchun 130117, China; Institute of Computational Biology, Northeast Normal University, Changchun 130117, China
14
Li H, Zhu D, Yang Y, Ma Y, Chen Y, Xue P, Chen J, Qin M, Xu D, Cai C, Cheng H. Determinants of DNMT2/TRDMT1 preference for substrates tRNA and DNA during the evolution. RNA Biol 2023; 20:875-892. [PMID: 37966982 PMCID: PMC10653749 DOI: 10.1080/15476286.2023.2272473] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 10/12/2023] [Indexed: 11/17/2023] Open
Abstract
RNA methyltransferase DNMT2/TRDMT1 is the most conserved member of the DNMT family from bacteria to plants and mammals. In previous studies, we found some determinants for tRNA recognition by DNMT2/TRDMT1, but the mechanism underlying this enzyme's preference for the substrates tRNA and DNA remains to be explored. In the present study, we show that the CFT-containing target recognition domain (TRD) and the target recognition extension domain (TRED) in DNMT2/TRDMT1 play a crucial role in the selection between substrate DNA and RNA during evolution. Moreover, the classical substrate tRNA for DNMT2/TRDMT1 had a characteristic sequence, CUXXCAC, in the anticodon loop. When position 35 was occupied by U, cytosine-38 (C38) twisted into the loop, whereas C, G or A at position 35 kept C38 in the flipped state. Hence, the substrate preference could be modulated by how easily the target cytosine in tRNA flips, as well as by the TRD and TRED. Additionally, the activity of DNMT2/TRDMT1 cancer mutants was collectively mediated by five enzymatic characteristics, which might impact gene expression. Importantly, the G155C, G155V and G155S mutations reduced enzymatic activity and showed significant associations with disease according to seven prediction methods. Altogether, these findings will help to clarify the substrate preference mechanism of DNMT2/TRDMT1 and provide a promising therapeutic strategy for cancer.
Affiliation(s)
- Huari Li: College of Veterinary Medicine, Huazhong Agricultural University, Wuhan, Hubei, China
- Daiyun Zhu: College of Veterinary Medicine, Huazhong Agricultural University, Wuhan, Hubei, China
- Yapeng Yang: College of Veterinary Medicine, Huazhong Agricultural University, Wuhan, Hubei, China
- Yunfei Ma: College of Veterinary Medicine, Huazhong Agricultural University, Wuhan, Hubei, China
- Yong Chen: College of Veterinary Medicine, Huazhong Agricultural University, Wuhan, Hubei, China
- Pingfang Xue: College of Veterinary Medicine, Huazhong Agricultural University, Wuhan, Hubei, China
- Juan Chen: College of Veterinary Medicine, Huazhong Agricultural University, Wuhan, Hubei, China
- Mian Qin: College of Veterinary Medicine, Huazhong Agricultural University, Wuhan, Hubei, China
- Dandan Xu: College of Veterinary Medicine, Huazhong Agricultural University, Wuhan, Hubei, China
- Chao Cai: College of Veterinary Medicine, Huazhong Agricultural University, Wuhan, Hubei, China
- Hongjing Cheng: College of Veterinary Medicine, Huazhong Agricultural University, Wuhan, Hubei, China
15
Zhang H, Zou Q, Ju Y, Song C, Chen D. Distance-based support vector machine to predict DNA N6-methyladenine modification. Curr Bioinform 2022. [DOI: 10.2174/1574893617666220404145517] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/22/2022]
Abstract
Background:
DNA N6-methyladenine plays an important role in the restriction-modification system, which protects the cell against invading foreign DNA. Experimental detection methods are time-consuming and costly, which has motivated computational alternatives. Support vector machines have received extensive attention in bioinformatics owing to their solid theoretical foundation and favorable properties.
Objective:
Machine learning methods generally include an important feature-extraction step. This study omits that step and instead uses an easy-to-obtain sequence distance matrix to obtain better results.
Method:
First, sequence alignment was used to build a similarity matrix. A novel transformation then converted the similarity matrix into a distance matrix. Next, the similarity-distance matrix was made positive semi-definite so that it could be used as a kernel matrix. Finally, the LIBSVM software was used to solve the support vector machine.
Results:
Five-fold cross-validation of this model on rice and mouse data achieved accuracies of 92.04% and 96.51%, respectively, indicating that the DB-SVM method has clear advantages over traditional machine learning methods. The model also achieved accuracies of 0.943, 0.982 and 0.818, Matthews correlation coefficients of 0.944, 0.982 and 0.838, and F1 scores of 0.942, 0.982 and 0.840 for the rice, M. musculus and cross-species genome datasets, respectively.
Conclusion:
These outcomes show that the model outperforms iIM-CNN and csDMA, the latest methods for predicting DNA 6mA modification.
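The Method step of making a similarity-derived matrix positive semi-definite can be sketched as follows. The abstract does not say which transformation the authors used, so this example assumes one common option, clipping negative eigenvalues to zero after symmetrising the matrix; the function name and the toy matrix are hypothetical, and NumPy is assumed to be available.

```python
import numpy as np


def nearest_psd_by_clipping(similarity: np.ndarray) -> np.ndarray:
    """Return a positive semi-definite matrix close to `similarity`.

    Assumption: the matrix is symmetrised and its negative eigenvalues are
    set to zero. This is one standard repair, not necessarily the one used
    in the DB-SVM paper.
    """
    symmetric = (similarity + similarity.T) / 2.0
    eigenvalues, eigenvectors = np.linalg.eigh(symmetric)
    clipped = np.clip(eigenvalues, 0.0, None)
    # Rebuild the matrix as V * diag(clipped) * V^T.
    return (eigenvectors * clipped) @ eigenvectors.T


# Toy alignment-style similarity matrix that is symmetric but indefinite.
toy_similarity = np.array(
    [
        [1.0, 0.9, 0.1],
        [0.9, 1.0, 0.9],
        [0.1, 0.9, 1.0],
    ]
)
kernel = nearest_psd_by_clipping(toy_similarity)
print(np.linalg.eigvalsh(toy_similarity).min(), np.linalg.eigvalsh(kernel).min())
```

A matrix repaired in this way can be supplied to an SVM solver that accepts precomputed kernels, which is what allows an explicit feature-extraction step to be skipped.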
Affiliation(s)
- Haoyu Zhang: Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu 610051, China
- Quan Zou: Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu 610051, China
- Ying Ju: School of Informatics, Xiamen University, Xiamen 361005, China
- Chenggang Song: Beidahuang Industry Group General Hospital, Harbin 150001, China
- Dong Chen: College of Electrical and Information Engineering, Quzhou University, Quzhou 324000, China
16
Harini K, Srivastava A, Kulandaisamy A, Gromiha MM. ProNAB: database for binding affinities of protein-nucleic acid complexes and their mutants. Nucleic Acids Res 2021; 50:D1528-D1534. [PMID: 34606614 PMCID: PMC8728258 DOI: 10.1093/nar/gkab848] [Citation(s) in RCA: 17] [Impact Index Per Article: 5.7] [Received: 08/12/2021] [Revised: 09/08/2021] [Accepted: 09/10/2021] [Indexed: 11/16/2022] Open
Abstract
Protein–nucleic acid interactions are involved in various biological processes such as gene expression, replication, transcription, translation and packaging. The binding affinities of protein–DNA and protein–RNA complexes are important for elucidating the mechanism of protein–nucleic acid recognition. Although experimental data on binding affinity are reported abundantly in the literature, no well-curated database is currently available for protein–nucleic acid binding affinity. We have developed a database, ProNAB, which contains more than 20 000 experimental data points for the binding affinities of protein–DNA and protein–RNA complexes. Each entry provides comprehensive information on the sequence and structural features of a protein, nucleic acid and its complex, experimental conditions, thermodynamic parameters such as dissociation constant (Kd), binding free energy (ΔG) and change in binding free energy upon mutation (ΔΔG), and literature information. ProNAB is cross-linked with GenBank, UniProt, PDB, ProThermDB, PROSITE, DisProt and PubMed. It provides a user-friendly web interface with options for searching, displaying, sorting, visualizing, downloading and uploading data. ProNAB is freely available at https://web.iitm.ac.in/bioinfo2/pronab/ and has potential applications such as understanding the factors that influence affinity, developing prediction tools, predicting binding affinity changes upon mutation and designing complexes with a desired affinity.
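The thermodynamic parameters listed in this abstract are related by the standard expression ΔG = RT ln(Kd / 1 M). A minimal sketch of the conversion follows, assuming a 1 M standard state, a temperature of 298.15 K, and the convention ΔΔG = ΔG(mutant) − ΔG(wild type); the function names are illustrative and are not part of ProNAB, and the database itself may report values under other conventions.

```python
import math

# Gas constant in kcal/(mol*K).
R_KCAL_PER_MOL_K = 1.987204e-3


def kd_to_delta_g(kd_molar: float, temperature_k: float = 298.15) -> float:
    """Binding free energy (kcal/mol) from a dissociation constant in mol/L.

    Uses dG = R * T * ln(Kd / 1 M), so tighter binding (smaller Kd) gives a
    more negative dG.
    """
    return R_KCAL_PER_MOL_K * temperature_k * math.log(kd_molar)


def delta_delta_g(kd_mutant: float, kd_wild_type: float,
                  temperature_k: float = 298.15) -> float:
    """Change in binding free energy upon mutation (mutant minus wild type)."""
    return (kd_to_delta_g(kd_mutant, temperature_k)
            - kd_to_delta_g(kd_wild_type, temperature_k))


# Hypothetical values: a 1 nM wild-type complex weakened 10-fold by a mutation.
print(kd_to_delta_g(1e-9))
print(delta_delta_g(1e-8, 1e-9))
```

Under this sign convention a positive ΔΔG indicates that the mutation weakens binding. A tenfold increase in Kd at 298.15 K corresponds to roughly 1.36 kcal/mol.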
Affiliation(s)
- Kannan Harini: Department of Biotechnology, Bhupat and Jyoti Mehta School of Biosciences, Indian Institute of Technology Madras, Chennai 600036, India
- Ambuj Srivastava: Department of Biotechnology, Bhupat and Jyoti Mehta School of Biosciences, Indian Institute of Technology Madras, Chennai 600036, India
- Arulsamy Kulandaisamy: Department of Biotechnology, Bhupat and Jyoti Mehta School of Biosciences, Indian Institute of Technology Madras, Chennai 600036, India
- M Michael Gromiha: Department of Biotechnology, Bhupat and Jyoti Mehta School of Biosciences, Indian Institute of Technology Madras, Chennai 600036, India
17
Mei LC, Wang YL, Wu FX, Wang F, Hao GF, Yang GF. HISNAPI: a bioinformatic tool for dynamic hot spot analysis in nucleic acid-protein interface with a case study. Brief Bioinform 2021; 22:bbaa373. [PMID: 33406224 PMCID: PMC7929440 DOI: 10.1093/bib/bbaa373] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 09/24/2020] [Revised: 11/19/2020] [Accepted: 11/23/2020] [Indexed: 01/18/2023] Open
Abstract
Protein-nucleic acid interactions play essential roles in many biological processes, such as transcription, replication and translation. In protein-nucleic acid interfaces, hotspot residues contribute the majority of binding affinity toward molecular recognition. Hotspot residues are commonly regarded as potential binding sites for compound molecules in drug design projects. The dynamic property is a considerable factor that affects the binding of ligands. Computational approaches have been developed to expedite the prediction of hotspot residues on protein-nucleic acid interfaces. However, existing approaches overlook hotspot dynamics, despite their essential role in protein function. Here, we report a web server named Hotspots In silico Scanning on Nucleic Acid and Protein Interface (HISNAPI) to analyze hotspot residue dynamics by integrating molecular dynamics simulation and one-step free energy perturbation. HISNAPI is capable of not only predicting the hotspot residues in protein-nucleic acid interfaces but also providing insights into their intensity and correlation of dynamic motion. Protein dynamics have been recognized as a vital factor that has an effect on the interaction specificity and affinity of the binding partners. We applied HISNAPI to the case of SARS-CoV-2 RNA-dependent RNA polymerase, a vital target of the antiviral drug for the treatment of coronavirus disease 2019. We identified the hotspot residues and characterized their dynamic behaviors, which might provide insight into the target site for antiviral drug design. The web server is freely available via a user-friendly web interface at http://chemyang.ccnu.edu.cn/ccb/server/HISNAPI/ and http://agroda.gzu.edu.cn:9999/ccb/server/HISNAPI/.
Affiliation(s)
- Long-Can Mei: College of Chemistry, Central China Normal University
- Guang-Fu Yang: Pesticide Science from Nankai University, Tianjin, China
18
Feng Y, Wang Z, Yang N, Liu S, Yan J, Song J, Yang S, Zhang Y. Identification of Biomarkers for Cervical Cancer Radiotherapy Resistance Based on RNA Sequencing Data. Front Cell Dev Biol 2021; 9:724172. [PMID: 34414195 PMCID: PMC8369412 DOI: 10.3389/fcell.2021.724172] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 06/12/2021] [Accepted: 07/14/2021] [Indexed: 11/28/2022] Open
Abstract
Cervical cancer, a common gynecological malignancy, threatens the health and lives of women. Resistance to radiotherapy is the primary cause of treatment failure and is mainly related to differences in the inherent vulnerability of tumors after radiotherapy. Here, we investigated signature genes associated with poor response to radiotherapy by analyzing an independent cervical cancer dataset from the Gene Expression Omnibus, including pre-irradiation and mid-irradiation information. A total of 316 significantly differentially expressed genes were identified. The correlations between these genes were investigated through Pearson correlation analysis. Subsequently, a random forest model was used to determine cancer-related genes, and all genes were ranked by random forest score. The top 30 candidate genes were selected for investigation of their biological functions. Functional enrichment analysis revealed that the biological functions were chiefly enriched in tumor immune responses, such as cellular defense response, negative regulation of immune system process, T cell activation, neutrophil activation involved in immune response, regulation of antigen processing and presentation, and peptidyl-tyrosine autophosphorylation. Finally, the top 30 genes were screened and analyzed through literature verification. After validation, 10 genes (KLRK1, LCK, KIF20A, CD247, FASLG, CD163, ZAP70, CD8B, ZNF683, and F10) met our objective. Overall, the present research confirmed that integrated bioinformatics methods can contribute to the understanding of the molecular mechanisms and potential therapeutic targets underlying radiotherapy resistance in cervical cancer.
Affiliation(s)
- Yue Feng: Department of Gynecological Radiotherapy, Harbin Medical University Cancer Hospital, Harbin, China
- Zhao Wang: Department of Gynecological Radiotherapy, Harbin Medical University Cancer Hospital, Harbin, China
- Nan Yang: Department of Gynecological Radiotherapy, Harbin Medical University Cancer Hospital, Harbin, China
- Sijia Liu: Department of Gynecological Radiotherapy, Harbin Medical University Cancer Hospital, Harbin, China
- Jiazhuo Yan: Department of Gynecological Radiotherapy, Harbin Medical University Cancer Hospital, Harbin, China
- Jiayu Song: Department of Gynecological Radiotherapy, Harbin Medical University Cancer Hospital, Harbin, China
- Shanshan Yang: Department of Gynecological Radiotherapy, Harbin Medical University Cancer Hospital, Harbin, China
- Yunyan Zhang: Department of Gynecological Radiotherapy, Harbin Medical University Cancer Hospital, Harbin, China
19
Nagy Z, Pethő Z, Kardos G, Major T, Szűcs A, Szarka K. Effect of E2 and long control region polymorphisms on disease severity in human papillomavirus type 11 mediated mucosal disease: Protein modelling and functional analysis. INFECTION GENETICS AND EVOLUTION 2021; 93:104948. [PMID: 34089910 DOI: 10.1016/j.meegid.2021.104948] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 04/07/2021] [Revised: 05/20/2021] [Accepted: 05/30/2021] [Indexed: 10/21/2022]
Abstract
Interaction of the long control region (LCR) and the E2 protein of HPV11s was studied by in silico modelling and in vitro functional analysis. Genomes of HPV11s from fifteen (six known and nine novel) patients (two solitary papillomas, eleven respiratory papillomatoses of different severity, one condyloma acuminatum and one cervical atypia) were sequenced; E2 polymorphisms were analysed in silico by protein modelling. E2 and LCR variants were cloned into pcDNA3.1+ expression vector and into pALuc reporter vector, respectively, transfected to HEp2 cells alone or in different combinations and the luciferase activity was measured. In the E2, the ubiquitous polymorphism K308R caused stronger binding between the dimers but did not alter DNA binding; E2s with this polymorphism were significantly less efficient than the reference in promoting LCR activity. The unique polymorphism Q86K changed the negative surface charge of E2 (Q86) to positive (K86). The unique polymorphisms S245F and N247T in the hinge region disrupt a probable phosphorylation site in a RXXS motif targeted by protein kinase A and B, but do not affect directly the amino acids critical to nuclear transport. Both unique patterns partly restored the LCR activating potential disrupted by K308R. A unique E2/E4 ORF with a 58-bp deletion leading to a frameshift and an early stop codon resulted in a practically nonfunctional E2, and was associated with a papillomatosis with dysplasia. When testing existing LCR-E2 combinations, LCR with intrinsically lower enhancer capacity was only marginally activated by its E2 (R308 and the deletion mutant), and did not significantly exceed the activity of the reference LCR without E2. Combined with more potent LCRs associated with more severe disease, the activity was significantly higher, but still significantly lower than LCRs with reference E2. In summary, LCR-E2 interaction determined by their polymorphisms may explain, at least partly, differences in disease severity.
Affiliation(s)
- Zsófia Nagy: Department of Medical Microbiology, Faculty of Medicine, University of Debrecen, Nagyerdei krt. 98, H-4032 Debrecen, Hungary
- Zoltán Pethő: Department of Biophysics and Cell Biology, Faculty of Medicine, University of Debrecen, Nagyerdei krt. 98, H-4032 Debrecen, Hungary; Institute of Physiology II, University Muenster, Robert-Koch-Str. 27B, 48147 Münster, Germany
- Gábor Kardos: Department of Medical Microbiology, Faculty of Medicine, University of Debrecen, Nagyerdei krt. 98, H-4032 Debrecen, Hungary
- Tamás Major: Otorhinolaryngology and Head-Neck Surgery Division, Kenézy Gyula Teaching Hospital, University of Debrecen, Bartók Béla út 2-26, H-4031 Debrecen, Hungary
- Attila Szűcs: Otorhinolaryngology and Head and Neck Surgery Clinic, Faculty of Medicine, University of Debrecen, Nagyerdei krt. 98, H-4032 Debrecen, Hungary
- Krisztina Szarka: Department of Medical Microbiology, Faculty of Medicine, University of Debrecen, Nagyerdei krt. 98, H-4032 Debrecen, Hungary
20
Apostolides M, Jiang Y, Husić M, Siddaway R, Hawkins C, Turinsky AL, Brudno M, Ramani AK. MetaFusion: A high-confidence metacaller for filtering and prioritizing RNA-seq gene fusion candidates. Bioinformatics 2021; 37:3144-3151. [PMID: 33944895 DOI: 10.1093/bioinformatics/btab249] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 09/17/2020] [Revised: 03/04/2021] [Accepted: 05/03/2021] [Indexed: 11/13/2022] Open
Abstract
MOTIVATION Current fusion detection tools use diverse calling approaches and provide varying results, making selection of the appropriate tool challenging. Ensemble fusion calling techniques appear promising; however, current options have limited accessibility and function. RESULTS MetaFusion is a flexible meta-calling tool that amalgamates outputs from any number of fusion callers. Individual caller results are standardized by conversion into the new file type Common Fusion Format (CFF). Calls are annotated, merged using graph clustering, filtered, and ranked to provide a final output of high confidence candidates. MetaFusion consistently achieves higher precision and recall than individual callers on real and simulated datasets, and reaches up to 100% precision, indicating that ensemble calling is imperative for high confidence results. MetaFusion uses FusionAnnotator to annotate calls with information from cancer fusion databases, and is provided with a benchmarking toolkit to calibrate new callers. AVAILABILITY MetaFusion is freely available at https://github.com/ccmbioinfo/MetaFusion. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
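The idea of merging calls from several fusion callers by graph clustering can be illustrated with a minimal sketch. It assumes that two calls belong together when they name the same gene pair and their breakpoints lie within a fixed window, and it groups them as connected components using union-find. The record fields, the `max_gap` threshold and the caller names are hypothetical; this is not MetaFusion's implementation or its Common Fusion Format.

```python
from collections import defaultdict


def merge_fusion_calls(calls, max_gap=100):
    """Cluster fusion calls and rank clusters by the number of supporting callers.

    Each call is a dict with the hypothetical keys
    'caller', 'gene5', 'gene3', 'pos5' and 'pos3'.
    """
    parent = list(range(len(calls)))

    def find(i):
        # Follow parent links to the cluster root, compressing the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Link every pair of calls that describe the same event.
    for i in range(len(calls)):
        for j in range(i + 1, len(calls)):
            a, b = calls[i], calls[j]
            same_genes = (a["gene5"], a["gene3"]) == (b["gene5"], b["gene3"])
            close = (abs(a["pos5"] - b["pos5"]) <= max_gap
                     and abs(a["pos3"] - b["pos3"]) <= max_gap)
            if same_genes and close:
                parent[find(j)] = find(i)

    clusters = defaultdict(list)
    for index, call in enumerate(calls):
        clusters[find(index)].append(call)

    merged = [
        {
            "gene5": group[0]["gene5"],
            "gene3": group[0]["gene3"],
            "callers": sorted({c["caller"] for c in group}),
        }
        for group in clusters.values()
    ]
    # Candidates supported by more callers are ranked first.
    merged.sort(key=lambda m: len(m["callers"]), reverse=True)
    return merged


example_calls = [
    {"caller": "callerA", "gene5": "BCR", "gene3": "ABL1", "pos5": 1000, "pos3": 5000},
    {"caller": "callerB", "gene5": "BCR", "gene3": "ABL1", "pos5": 1010, "pos3": 4990},
    {"caller": "callerA", "gene5": "GENE1", "gene3": "GENE2", "pos5": 200, "pos3": 900},
]
print(merge_fusion_calls(example_calls))
```

Ranking merged candidates by the number of independent callers that support them is one simple way to express the intuition that agreement between callers raises confidence in a call.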
Affiliation(s)
- Michael Apostolides: Centre for Computational Medicine, The Hospital For Sick Children, Toronto, ON, Canada
- Yue Jiang: Centre for Computational Medicine, The Hospital For Sick Children, Toronto, ON, Canada
- Mia Husić: Centre for Computational Medicine, The Hospital For Sick Children, Toronto, ON, Canada
- Robert Siddaway: The Arthur and Sonia Labatt Brain Tumour Research Centre, The Hospital for Sick Children, Toronto, ON, Canada
- Cynthia Hawkins: The Arthur and Sonia Labatt Brain Tumour Research Centre, The Hospital for Sick Children, Toronto, ON, Canada; Department of Laboratory Medicine and Pathobiology, University of Toronto, Toronto, ON, Canada; Division of Pathology, The Hospital for Sick Children, Toronto, ON, Canada
- Andrei L Turinsky: Centre for Computational Medicine, The Hospital For Sick Children, Toronto, ON, Canada
- Michael Brudno: Centre for Computational Medicine, The Hospital For Sick Children, Toronto, ON, Canada; Genetics and Genome Biology Program, The Hospital for Sick Children, Toronto, ON, Canada; Department of Computer Science, University of Toronto, Toronto, ON, Canada; University Health Network, Toronto, ON, Canada
- Arun K Ramani: Centre for Computational Medicine, The Hospital For Sick Children, Toronto, ON, Canada
21
Dao FY, Lv H, Yang YH, Zulfiqar H, Gao H, Lin H. Computational identification of N6-methyladenosine sites in multiple tissues of mammals. Comput Struct Biotechnol J 2020; 18:1084-1091. [PMID: 32435427 PMCID: PMC7229270 DOI: 10.1016/j.csbj.2020.04.015] [Citation(s) in RCA: 58] [Impact Index Per Article: 14.5] [Received: 02/23/2020] [Revised: 04/20/2020] [Accepted: 04/21/2020] [Indexed: 12/12/2022] Open
Abstract
N6-methyladenosine (m6A) is the methylation of adenosine at the nitrogen-6 position; it is the most abundant RNA methylation modification and is involved in a series of important biological processes. Accurate genome-wide identification of m6A sites is invaluable for better understanding their biological functions. In this work, an ensemble predictor named iRNA-m6A was established to identify m6A sites in multiple tissues of human, mouse and rat based on data from high-throughput sequencing techniques. In the proposed predictor, RNA sequences were encoded by a physical-chemical property matrix, mono-nucleotide binary encoding and nucleotide chemical property. Subsequently, these features were optimized using the minimum Redundancy Maximum Relevance (mRMR) feature selection method. Based on the optimal feature subset, the best m6A classification models were trained by Support Vector Machine (SVM) with a 5-fold cross-validation test. Prediction results on an independent dataset showed that the proposed method has excellent generalization ability. We also established a user-friendly webserver called iRNA-m6A, which is freely accessible at http://lin-group.cn/server/iRNA-m6A. This tool will make it more convenient for users to study m6A modification in different tissues.
Affiliation(s)
- Yu-He Yang: Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu 610054, China
- Hasan Zulfiqar: Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu 610054, China
- Hui Gao: Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu 610054, China
- Hao Lin: Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu 610054, China