1
Yuan Y, Tang X, Li H, Lang X, Song Y, Yang Y, Zhou Z. BiLSTM- and CNN-Based m6A Modification Prediction Model for circRNAs. Molecules 2024; 29:2429. [PMID: 38893304 PMCID: PMC11173551 DOI: 10.3390/molecules29112429] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/03/2024] [Revised: 05/13/2024] [Accepted: 05/20/2024] [Indexed: 06/21/2024] Open
Abstract
m6A methylation, a ubiquitous modification on circRNAs, exerts a profound influence on RNA function, intracellular behavior, and diverse biological processes, including disease development. While prediction algorithms exist for mRNA m6A modifications, a critical gap remains in the prediction of circRNA m6A modifications. Therefore, accurate identification and prediction of m6A sites are imperative for understanding RNA function and regulation. This study presents a novel hybrid model combining a convolutional neural network (CNN) and a bidirectional long short-term memory network (BiLSTM) for precise m6A methylation site prediction in circular RNAs (circRNAs) based on data from HEK293 cells. This model exploits the synergy between CNN's ability to extract intricate sequence features and BiLSTM's strength in capturing long-range dependencies. Furthermore, the integrated attention mechanism empowers the model to pinpoint critical biological information for studying circRNA m6A methylation. Our model, exhibiting over 78% prediction accuracy on independent datasets, offers not only a valuable tool for scientific research but also a strong foundation for future biomedical applications. This work not only furthers our understanding of gene expression regulation but also opens new avenues for the exploration of circRNA methylation in biological research.
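The attention step mentioned in this abstract can be illustrated with a minimal sketch. It assumes a one-hot encoding of a short RNA window and a hand-picked query vector; in the published model the attention would act on learned CNN/BiLSTM features, so the names `one_hot`, `attention_pool`, `window`, and `query` are hypothetical and only show how softmax weights highlight individual positions.

```python
import math

NUCLEOTIDES = "ACGU"


def one_hot(seq):
    """Encode an RNA string as one 4-element one-hot vector per position."""
    seq = seq.upper().replace("T", "U")
    return [[1.0 if base == n else 0.0 for n in NUCLEOTIDES] for base in seq]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(features, query):
    """Score each position against the query, then return the softmax
    weights and the attention-weighted sum of the feature vectors."""
    scores = [sum(f * q for f, q in zip(vec, query)) for vec in features]
    weights = softmax(scores)
    dim = len(features[0])
    pooled = [sum(w * vec[d] for w, vec in zip(weights, features))
              for d in range(dim)]
    return weights, pooled


# Hypothetical 5-nt window with the candidate adenosine in the centre.
window = "GGACU"
# Hypothetical stand-in for a learned query vector; it favours 'A'.
query = [2.0, 0.0, 0.0, 0.0]
weights, pooled = attention_pool(one_hot(window), query)
```

Here the largest weight falls on the central adenosine, which is the kind of position-level signal an attention layer can expose for inspection.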
Affiliation(s)
- Yuqian Yuan
- School of Artificial Intelligence and Information Technology, Nanjing University of Chinese Medicine, Nanjing 210023, China
- Xiaozhu Tang
- School of Medicine & Holistic Integrative Medicine, Nanjing University of Chinese Medicine, Nanjing 210023, China
- Hongyan Li
- School of Artificial Intelligence and Information Technology, Nanjing University of Chinese Medicine, Nanjing 210023, China
- Xufeng Lang
- School of Artificial Intelligence and Information Technology, Nanjing University of Chinese Medicine, Nanjing 210023, China
- Yihua Song
- School of Artificial Intelligence and Information Technology, Nanjing University of Chinese Medicine, Nanjing 210023, China
- Ye Yang
- School of Medicine & Holistic Integrative Medicine, Nanjing University of Chinese Medicine, Nanjing 210023, China
- Zuojian Zhou
- School of Artificial Intelligence and Information Technology, Nanjing University of Chinese Medicine, Nanjing 210023, China
2
Laveglia V, Bazayeva M, Andreini C, Rosato A. Hunting down zinc(II)-binding sites in proteins with distance matrices. Bioinformatics 2023; 39:btad653. [PMID: 37878807 PMCID: PMC10630175 DOI: 10.1093/bioinformatics/btad653] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/14/2023] [Revised: 10/17/2023] [Accepted: 10/23/2023] [Indexed: 10/27/2023] Open
Abstract
MOTIVATION In recent years, high-throughput sequencing technologies have made available the genome sequences of a huge variety of organisms. However, the functional annotation of the encoded proteins often still relies on low-throughput and costly experimental studies. Bioinformatics approaches offer a promising alternative to accelerate this process. In this work, we focus on the binding of zinc(II) ions, which is needed for 5%-10% of any organism's proteins to achieve their physiologically relevant form. RESULTS To implement a predictor of zinc(II)-binding sites in the 3D structures of proteins, we used a neural network, followed by a filter of the network output against the local structure of all known sites. The latter was implemented as a function comparing the distance matrices of the Cα and Cβ atoms of the sites. We called the resulting tool Master of Metals (MOM). The structural models for the entire proteome of an organism generated by AlphaFold can be used as input to our tool in order to achieve annotation at the whole organism level within a few hours. To demonstrate this, we applied MOM to the yeast proteome, obtaining a precision of about 76%, based on data for homologous proteins. AVAILABILITY AND IMPLEMENTATION Master of Metals has been implemented in Python and is available at https://github.com/cerm-cirmmp/Master-of-metals.
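The idea of comparing sites through distance matrices can be sketched as follows. This is not the MOM implementation: the function names, the RMSD-style score, and the toy coordinates are assumptions chosen for illustration, and the sketch presumes the atoms of the two sites are already listed in corresponding order.

```python
import math


def distance_matrix(coords):
    """Pairwise Euclidean distances between 3D points (e.g. C-alpha atoms)."""
    n = len(coords)
    return [[math.dist(coords[i], coords[j]) for j in range(n)]
            for i in range(n)]


def matrix_rmsd(a, b):
    """Root-mean-square difference over the upper triangles of two
    equally sized distance matrices (needs at least two atoms)."""
    n = len(a)
    if n != len(b) or n < 2:
        raise ValueError("matrices must have the same size, with n >= 2")
    squared = [(a[i][j] - b[i][j]) ** 2
               for i in range(n) for j in range(i + 1, n)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical reference site: four atom positions in angstroms.
reference = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (3.8, 3.8, 0.0), (0.0, 3.8, 0.0)]
# The same geometry translated in space, and a distorted variant.
shifted = [(x + 10.0, y - 5.0, z + 3.0) for x, y, z in reference]
distorted = [(0.0, 0.0, 0.0), (5.0, 0.0, 0.0), (3.8, 3.8, 0.0), (0.0, 3.8, 0.0)]

score_same = matrix_rmsd(distance_matrix(reference), distance_matrix(shifted))
score_diff = matrix_rmsd(distance_matrix(reference), distance_matrix(distorted))
```

Because internal distances do not change under rotation or translation, the shifted copy scores essentially zero without any structural superposition, while the distorted site scores higher.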
Affiliation(s)
- Vincenzo Laveglia
- Department of Chemistry, University of Florence, Sesto Fiorentino 50019, Italy
- Milana Bazayeva
- Department of Chemistry, University of Florence, Sesto Fiorentino 50019, Italy
- Magnetic Resonance Center (CERM), University of Florence, Sesto Fiorentino 50019, Italy
- Claudia Andreini
- Department of Chemistry, University of Florence, Sesto Fiorentino 50019, Italy
- Magnetic Resonance Center (CERM), University of Florence, Sesto Fiorentino 50019, Italy
- Consorzio Interuniversitario di Risonanze Magnetiche di Metallo Proteine, Sesto Fiorentino 50019, Italy
- Antonio Rosato
- Department of Chemistry, University of Florence, Sesto Fiorentino 50019, Italy
- Magnetic Resonance Center (CERM), University of Florence, Sesto Fiorentino 50019, Italy
- Consorzio Interuniversitario di Risonanze Magnetiche di Metallo Proteine, Sesto Fiorentino 50019, Italy
3
Liu S, Liang Y, Li J, Yang S, Liu M, Liu C, Yang D, Zuo Y. Integrating reduced amino acid composition into PSSM for improving copper ion-binding protein prediction. Int J Biol Macromol 2023:124993. [PMID: 37307968 DOI: 10.1016/j.ijbiomac.2023.124993] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 03/28/2023] [Revised: 05/12/2023] [Accepted: 05/19/2023] [Indexed: 06/14/2023]
Abstract
Copper ion-binding proteins play an essential role in metabolic processes and are critical factors in many diseases, such as breast cancer, lung cancer, and Menkes disease. Many algorithms have been developed for predicting metal ion classification and binding sites, but none have been applied to copper ion-binding proteins. In this study, we developed a copper ion-bound protein classifier, RPCIBP, which integrates the reduced amino acid composition into the position-specific score matrix (PSSM). The reduced amino acid composition filters out a large number of useless evolutionary features, improving the operational efficiency and predictive ability of the model (feature dimension reduced from 2900 to 200, ACC increased from 83% to 85.1%). Compared with the basic model using only three sequence feature extraction methods (ACC in the training set between 73.8% and 86.2%, ACC in the test set between 69.3% and 87.5%), the model integrating the evolutionary features of the reduced amino acid composition showed higher accuracy and robustness (ACC in the training set between 83.1% and 90.8%, ACC in the test set between 79.1% and 91.9%). The best copper ion-binding protein classifiers, filtered by the feature selection process, were deployed in a user-friendly web server (http://bioinfor.imu.edu.cn/RPCIBP). RPCIBP can accurately predict copper ion-binding proteins, which is convenient for further structural and functional studies and conducive to mechanism exploration and targeted drug development.
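One way to picture how a reduced amino acid alphabet shrinks PSSM-derived features is to merge the 20 PSSM columns into a few cluster columns. The sketch below is an assumption-laden illustration: the six clusters, the averaging rule, and the column order are hypothetical and are not taken from RPCIBP.

```python
# Column order assumed for the PSSM (a common convention, stated here
# as an assumption rather than a requirement of any specific tool).
AMINO_ACIDS = "ARNDCQEGHILKMFPSTWYV"

# Hypothetical reduction scheme: six clusters covering all 20 residues.
CLUSTERS = ["AVLIMC", "FWYH", "STNQ", "KR", "DE", "GP"]


def reduce_pssm(pssm, clusters=CLUSTERS):
    """Collapse each 20-column PSSM row into one averaged score per cluster."""
    column = {aa: i for i, aa in enumerate(AMINO_ACIDS)}
    reduced = []
    for row in pssm:
        reduced.append([sum(row[column[aa]] for aa in cluster) / len(cluster)
                        for cluster in clusters])
    return reduced


# Toy PSSM with two positions; real matrices come from a profile search.
toy_pssm = [list(range(20)), [1.0] * 20]
reduced = reduce_pssm(toy_pssm)
```

Each position now carries 6 values instead of 20, so any feature built from pairs of columns shrinks accordingly, which is the general effect the abstract describes.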
Affiliation(s)
- Shanghua Liu
- State Key Laboratory of Reproductive Regulation and Breeding of Grassland Livestock, Institutes of Biomedical Sciences, School of Life Sciences, Inner Mongolia University, Hohhot 010021, China; Inner Mongolia International Mongolian Hospital, Hohhot 010065, China
- Yuchao Liang
- State Key Laboratory of Reproductive Regulation and Breeding of Grassland Livestock, Institutes of Biomedical Sciences, School of Life Sciences, Inner Mongolia University, Hohhot 010021, China; Digital College, Inner Mongolia Intelligent Union Big Data Academy, Hohhot 010010, China
- Jinzhao Li
- State Key Laboratory of Reproductive Regulation and Breeding of Grassland Livestock, Institutes of Biomedical Sciences, School of Life Sciences, Inner Mongolia University, Hohhot 010021, China
- Siqi Yang
- State Key Laboratory of Reproductive Regulation and Breeding of Grassland Livestock, Institutes of Biomedical Sciences, School of Life Sciences, Inner Mongolia University, Hohhot 010021, China
- Ming Liu
- State Key Laboratory of Reproductive Regulation and Breeding of Grassland Livestock, Institutes of Biomedical Sciences, School of Life Sciences, Inner Mongolia University, Hohhot 010021, China
- Chengfang Liu
- State Key Laboratory of Reproductive Regulation and Breeding of Grassland Livestock, Institutes of Biomedical Sciences, School of Life Sciences, Inner Mongolia University, Hohhot 010021, China
- Dezhi Yang
- Inner Mongolia International Mongolian Hospital, Hohhot 010065, China
- Yongchun Zuo
- State Key Laboratory of Reproductive Regulation and Breeding of Grassland Livestock, Institutes of Biomedical Sciences, School of Life Sciences, Inner Mongolia University, Hohhot 010021, China; Inner Mongolia International Mongolian Hospital, Hohhot 010065, China; Digital College, Inner Mongolia Intelligent Union Big Data Academy, Hohhot 010010, China
4
Rodzik A, Railean V, Pomastowski P, Žuvela P, Wong MW, Buszewski B. The influence of zinc ions concentration on β-lactoglobulin structure – physicochemical properties of Zn–β-lactoglobulin complexes. J Mol Struct 2022. [DOI: 10.1016/j.molstruc.2022.133745] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/31/2022]
5
Chen Z, Liu X, Zhao P, Li C, Wang Y, Li F, Akutsu T, Bain C, Gasser RB, Li J, Yang Z, Gao X, Kurgan L, Song J. iFeatureOmega: an integrative platform for engineering, visualization and analysis of features from molecular sequences, structural and ligand data sets. Nucleic Acids Res 2022; 50:W434-W447. [PMID: 35524557 PMCID: PMC9252729 DOI: 10.1093/nar/gkac351] [Citation(s) in RCA: 22] [Impact Index Per Article: 11.0] [Received: 03/21/2022] [Revised: 04/22/2022] [Accepted: 04/25/2022] [Indexed: 01/07/2023] Open
Abstract
The rapid accumulation of molecular data motivates development of innovative approaches to computationally characterize sequences, structures and functions of biological and chemical molecules in an efficient, accessible and accurate manner. Notwithstanding several computational tools that characterize protein or nucleic acids data, there are no one-stop computational toolkits that comprehensively characterize a wide range of biomolecules. We address this vital need by developing a holistic platform that generates features from sequence and structural data for a diverse collection of molecule types. Our freely available and easy-to-use iFeatureOmega platform generates, analyzes and visualizes 189 representations for biological sequences, structures and ligands. To the best of our knowledge, iFeatureOmega provides the largest scope when directly compared to the current solutions, in terms of the number of feature extraction and analysis approaches and coverage of different molecules. We release three versions of iFeatureOmega including a webserver, command line interface and graphical interface to satisfy needs of experienced bioinformaticians and less computer-savvy biologists and biochemists. With the assistance of iFeatureOmega, users can encode their molecular data into representations that facilitate construction of predictive models and analytical studies. We highlight benefits of iFeatureOmega based on three research applications, demonstrating how it can be used to accelerate and streamline research in bioinformatics, computational biology, and cheminformatics areas. The iFeatureOmega webserver is freely available at http://ifeatureomega.erc.monash.edu and the standalone versions can be downloaded from https://github.com/Superzchen/iFeatureOmega-GUI/ and https://github.com/Superzchen/iFeatureOmega-CLI/.
Affiliation(s)
- Zhen Chen
- Collaborative Innovation Center of Henan Grain Crops, Henan Agricultural University, Zhengzhou 450046, China
- Center for Crop Genome Engineering, Henan Agricultural University, Zhengzhou 450046, China
- Xuhan Liu
- Drug Discovery and Safety, Leiden Academic Centre for Drug Research, Einsteinweg 55, Leiden 2333 CC, The Netherlands
- Pei Zhao
- State Key Laboratory of Cotton Biology, Institute of Cotton Research of Chinese Academy of Agricultural Sciences (CAAS), Anyang 455000, China
- Chen Li
- Monash Biomedicine Discovery Institute and Department of Biochemistry and Molecular Biology, Monash University, Melbourne, Victoria 3800, Australia
- Yanan Wang
- Monash Biomedicine Discovery Institute and Department of Biochemistry and Molecular Biology, Monash University, Melbourne, Victoria 3800, Australia
- Fuyi Li
- Monash Biomedicine Discovery Institute and Department of Biochemistry and Molecular Biology, Monash University, Melbourne, Victoria 3800, Australia
- Tatsuya Akutsu
- Bioinformatics Center, Institute for Chemical Research, Kyoto University, Kyoto 611-0011, Japan
- Chris Bain
- Monash Data Future Institutes, Monash University, Melbourne, Victoria 3800, Australia
- Robin B Gasser
- Department of Veterinary Biosciences, Melbourne Veterinary School, The University of Melbourne, Parkville, Victoria 3010, Australia
- Junzhou Li
- Collaborative Innovation Center of Henan Grain Crops, Henan Agricultural University, Zhengzhou 450046, China
- Zuoren Yang
- State Key Laboratory of Cotton Biology, Institute of Cotton Research of Chinese Academy of Agricultural Sciences (CAAS), Anyang 455000, China
- Xin Gao
- Computational Bioscience Research Center (CBRC), Computer, Electrical and Mathematical Sciences and Engineering Division, King Abdullah University of Science and Technology (KAUST), Thuwal 23955, Saudi Arabia
- Lukasz Kurgan
- Department of Computer Science, Virginia Commonwealth University, Richmond, VA, USA
- Jiangning Song
- Monash Biomedicine Discovery Institute and Department of Biochemistry and Molecular Biology, Monash University, Melbourne, Victoria 3800, Australia
- Monash Data Future Institutes, Monash University, Melbourne, Victoria 3800, Australia
6
A Comprehensive Review of Computation-Based Metal-Binding Prediction Approaches at the Residue Level. BIOMED RESEARCH INTERNATIONAL 2022; 2022:8965712. [PMID: 35402609 PMCID: PMC8989566 DOI: 10.1155/2022/8965712] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 01/02/2022] [Accepted: 03/04/2022] [Indexed: 12/29/2022]
Abstract
Clear evidence has shown that metal ions strongly connect and delicately tune the dynamic homeostasis in living bodies. They have been proved to be associated with protein structure, stability, regulation, and function. Even small changes in the concentration of metal ions can shift their effects from natural beneficial functions to harmful. This leads to degenerative diseases, malignant tumors, and cancers. Accurate characterizations and predictions of metalloproteins at the residue level promise informative clues to the investigation of intrinsic mechanisms of protein-metal ion interactions. Compared to biophysical or biochemical wet-lab technologies, computational methods provide open web interfaces of high-resolution databases and high-throughput predictors for efficient investigation of metal-binding residues. This review surveys and details 18 public databases of metal-protein binding. We collect a comprehensive set of 44 computation-based methods and classify them into four categories, namely, learning-, docking-, template-, and meta-based methods. We analyze the benchmark datasets, assessment criteria, feature construction, and algorithms. We also compare several methods on two benchmark testing datasets and include a discussion about currently publicly available predictive tools. Finally, we summarize the challenges and underlying limitations of the current studies and propose several prospective directions concerning the future development of the related databases and methods.
7
Zhou J, Bo S, Wang H, Zheng L, Liang P, Zuo Y. Identification of Disease-Related 2-Oxoglutarate/Fe (II)-Dependent Oxygenase Based on Reduced Amino Acid Cluster Strategy. Front Cell Dev Biol 2021; 9:707938. [PMID: 34336861 PMCID: PMC8323781 DOI: 10.3389/fcell.2021.707938] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 05/11/2021] [Accepted: 06/10/2021] [Indexed: 11/17/2022] Open
Abstract
The 2-oxoglutarate/Fe (II)-dependent (2OG) oxygenase superfamily is mainly responsible for protein modification, nucleic acid repair and/or modification, and fatty acid metabolism and plays important roles in cancer, cardiovascular disease, and other diseases. Its members are likely to become new targets for the treatment of cancer and other diseases, so the accurate identification of 2OG oxygenases is of great significance. Many computational methods have been proposed to predict functional proteins to compensate for the time-consuming and expensive experimental identification. However, machine learning has not been applied to the study of 2OG oxygenases. In this study, we developed OGFE_RAAC, a prediction model to identify whether a protein is a 2OG oxygenase. To improve the performance of OGFE_RAAC, 673 amino acid reduction alphabets were used to determine the optimal feature representation scheme by recoding the protein sequence. The 10-fold cross-validation test showed that the accuracy of the model in identifying 2OG oxygenases is 91.04%. In addition, the independent dataset results showed that the model has excellent generalization and robustness. It is expected to become an effective tool for the identification of 2OG oxygenases. With further research, we have also found that the function of 2OG oxygenases may be related to their polarity and hydrophobicity, which will help the follow-up study of the catalytic mechanism of 2OG oxygenases and the way they interact with the substrate. Based on the model we built, a user-friendly web server was established and can be accessed at http://bioinfor.imu.edu.cn/ogferaac.
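Recoding a protein sequence with a reduction alphabet, as described above, can be sketched in a few lines. The four-group alphabet below is hypothetical (one of many possible schemes, loosely based on polarity and charge), and the dipeptide-style composition is an assumed feature choice, not necessarily the one used by OGFE_RAAC.

```python
from collections import Counter
from itertools import product

# Hypothetical reduction alphabet: h = hydrophobic, p = polar,
# b = basic, a = acidic. Other groupings are equally plausible.
GROUPS = {"h": "AVLIMFWC", "p": "STNQYGPH", "b": "KR", "a": "DE"}
LOOKUP = {aa: label for label, members in GROUPS.items() for aa in members}


def recode(sequence):
    """Rewrite a protein sequence in the reduced alphabet.
    Non-standard residues (e.g. 'X') raise KeyError in this sketch."""
    return "".join(LOOKUP[aa] for aa in sequence.upper())


def kmer_composition(reduced, k=2):
    """Frequency of every possible k-mer over the reduced alphabet."""
    labels = sorted(GROUPS)
    kmers = ["".join(p) for p in product(labels, repeat=k)]
    counts = Counter(reduced[i:i + k] for i in range(len(reduced) - k + 1))
    total = max(len(reduced) - k + 1, 1)
    return {kmer: counts[kmer] / total for kmer in kmers}


reduced_seq = recode("MKDE")
features = kmer_composition(reduced_seq)
```

With four groups, a dipeptide composition has 16 features instead of the 400 needed for the full 20-letter alphabet, which is why testing many candidate alphabets is computationally feasible.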
Affiliation(s)
- Jian Zhou
- State Key Laboratory of Reproductive Regulation and Breeding of Grassland Livestock, College of Life Sciences, Inner Mongolia University, Hohhot, China
- Suling Bo
- College of Computer and Information, Inner Mongolia Medical University, Hohhot, China
- Hao Wang
- State Key Laboratory of Reproductive Regulation and Breeding of Grassland Livestock, College of Life Sciences, Inner Mongolia University, Hohhot, China
- Lei Zheng
- State Key Laboratory of Reproductive Regulation and Breeding of Grassland Livestock, College of Life Sciences, Inner Mongolia University, Hohhot, China
- Pengfei Liang
- State Key Laboratory of Reproductive Regulation and Breeding of Grassland Livestock, College of Life Sciences, Inner Mongolia University, Hohhot, China
- Yongchun Zuo
- State Key Laboratory of Reproductive Regulation and Breeding of Grassland Livestock, College of Life Sciences, Inner Mongolia University, Hohhot, China
8
Chen YZ, Wang ZZ, Wang Y, Ying G, Chen Z, Song J. nhKcr: a new bioinformatics tool for predicting crotonylation sites on human nonhistone proteins based on deep learning. Brief Bioinform 2021; 22:6277413. [PMID: 34002774 DOI: 10.1093/bib/bbab146] [Citation(s) in RCA: 22] [Impact Index Per Article: 7.3] [Received: 02/22/2021] [Revised: 03/18/2021] [Accepted: 03/25/2021] [Indexed: 12/20/2022] Open
Abstract
Lysine crotonylation (Kcr) is a newly discovered type of protein post-translational modification and has been reported to be involved in various pathophysiological processes. High-resolution mass spectrometry is the primary approach for identification of Kcr sites. However, experimental approaches for identifying Kcr sites are often time-consuming and expensive when compared with computational approaches. To date, several predictors for Kcr site prediction have been developed, most of which are capable of predicting crotonylation sites on either histones alone or mixed histone and nonhistone proteins together. These methods exhibit high diversity in their algorithms, encoding schemes, feature selection techniques and performance assessment strategies. However, none of them were designed for predicting Kcr sites on nonhistone proteins. Therefore, it is desirable to develop an effective predictor for identifying Kcr sites from the large amount of nonhistone sequence data. For this purpose, we first provide a comprehensive review on six methods for predicting crotonylation sites. Second, we develop a novel deep learning-based computational framework termed as CNNrgb for Kcr site prediction on nonhistone proteins by integrating different types of features. We benchmark its performance against multiple commonly used machine learning classifiers (including random forest, logitboost, naïve Bayes and logistic regression) by performing both 10-fold cross-validation and independent test. The results show that the proposed CNNrgb framework achieves the best performance with high computational efficiency on large datasets. Moreover, to facilitate users' efforts to investigate Kcr sites on human nonhistone proteins, we implement an online server called nhKcr and compare it with other existing tools to illustrate the utility and robustness of our method. The nhKcr web server and all the datasets utilized in this study are freely accessible at http://nhKcr.erc.monash.edu/.
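The 10-fold cross-validation used for benchmarking can be illustrated with a small index-splitting sketch. Whether the authors stratified their folds is not stated here, so the stratification, the function name `stratified_folds`, and the toy label counts are assumptions.

```python
import random


def stratified_folds(labels, k=10, seed=0):
    """Split sample indices into k folds while keeping the class
    proportions roughly equal in every fold."""
    rng = random.Random(seed)
    by_class = {}
    for index, label in enumerate(labels):
        by_class.setdefault(label, []).append(index)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal the shuffled indices of each class round-robin into folds.
        for position, index in enumerate(indices):
            folds[position % k].append(index)
    return folds


# Toy imbalanced dataset: 20 modified sites (1) and 80 unmodified sites (0).
labels = [1] * 20 + [0] * 80
folds = stratified_folds(labels, k=10)

for held_out in range(len(folds)):
    test_idx = folds[held_out]
    train_idx = [i for f, fold in enumerate(folds) if f != held_out
                 for i in fold]
    # A model would be trained on train_idx and scored on test_idx here.
```

Each sample is held out exactly once, and the reported score is usually the average over the ten held-out folds.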
Affiliation(s)
- Yong-Zi Chen
- Laboratory of Tumor Cell Biology, Tianjin Medical University Cancer Institute and Hospital, Tianjin 300060, China
- Guoguang Ying
- Laboratory of Tumor Cell Biology in Tianjin Medical University Cancer Institute and Hospital, Tianjin 300060, China
- Zhen Chen
- Collaborative Innovation Center of Henan Grain Crops, Henan Agricultural University, China
- Jiangning Song
- Monash Biomedicine Discovery Institute, Monash University, Australia
9
Ireland SM, Martin ACR. Zincbindpredict-Prediction of Zinc Binding Sites in Proteins. Molecules 2021; 26:966. [PMID: 33673040 PMCID: PMC7918553 DOI: 10.3390/molecules26040966] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 11/27/2020] [Revised: 01/26/2021] [Accepted: 02/09/2021] [Indexed: 11/21/2022] Open
Abstract
Background: Zinc binding proteins make up a significant proportion of the proteomes of most organisms and, within those proteins, zinc performs rôles in catalysis and structure stabilisation. Identifying the ability to bind zinc in a novel protein can offer insights into its functions and the mechanism by which it carries out those functions. Computational means of doing so are faster than spectroscopic means, allowing for searching at much greater speeds and scales, and thereby guiding complementary experimental approaches. Typically, computational models of zinc binding predict zinc binding for individual residues rather than as a single binding site, and typically do not distinguish between different classes of binding site, missing crucial properties indicative of zinc binding. Methods: Previously, we created ZincBindDB, a continuously updated database of known zinc binding sites, categorised by family (the set of liganding residues). Here, we use this dataset to create ZincBindPredict, a set of machine learning methods to predict the most common zinc binding site families for both structure and sequence. Results: The models all achieve an MCC ≥ 0.88, recall ≥ 0.93 and precision ≥ 0.91 for the structural models (mean MCC = 0.97), while the sequence models have MCC ≥ 0.64, recall ≥ 0.80 and precision ≥ 0.83 (mean MCC = 0.87), with the models for binding sites containing four liganding residues performing much better than this. Conclusions: The predictors outperform competing zinc binding site predictors and are available online via a web interface and a GraphQL API.
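For readers less familiar with the metrics quoted in the Results, the sketch below computes precision, recall, and the Matthews correlation coefficient (MCC) from a binary confusion matrix. The counts are invented for illustration and do not come from the paper.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Precision, recall and Matthews correlation coefficient from the
    four cells of a binary confusion matrix."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention MCC is reported as 0 when any marginal total is zero.
    mcc = (tp * tn - fp * fn) / denominator if denominator else 0.0
    return precision, recall, mcc


# Made-up counts: 90 true positives, 10 false positives,
# 95 true negatives, 5 false negatives.
precision, recall, mcc = binary_metrics(tp=90, fp=10, tn=95, fn=5)
```

Unlike precision and recall, MCC uses all four cells of the confusion matrix, so it stays informative when binding and non-binding examples are heavily imbalanced.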
10
Zhang Y, Zheng J. Bioinformatics of Metalloproteins and Metalloproteomes. Molecules 2020; 25:3366. [PMID: 32722260 PMCID: PMC7435645 DOI: 10.3390/molecules25153366] [Citation(s) in RCA: 31] [Impact Index Per Article: 7.8] [Received: 06/17/2020] [Revised: 07/17/2020] [Accepted: 07/22/2020] [Indexed: 12/14/2022] Open
Abstract
Trace metals are inorganic elements that are required for all organisms in very low quantities. They serve as cofactors and activators of metalloproteins involved in a variety of key cellular processes. While substantial effort has been made in experimental characterization of metalloproteins and their functions, the application of bioinformatics in the research of metalloproteins and metalloproteomes is still limited. In the last few years, computational prediction and comparative genomics of metalloprotein genes have arisen, which provide significant insights into their distribution, function, and evolution in nature. This review aims to offer an overview of recent advances in bioinformatic analysis of metalloproteins, mainly focusing on metalloprotein prediction and the use of different metals across the tree of life. We describe current computational approaches for the identification of metalloprotein genes and metal-binding sites/patterns in proteins, and then introduce a set of related databases. Furthermore, we discuss the latest research progress in comparative genomics of several important metals in both prokaryotes and eukaryotes, which demonstrates divergent and dynamic evolutionary patterns of different metalloprotein families and metalloproteomes. Overall, bioinformatic studies of metalloproteins provide a foundation for systematic understanding of trace metal utilization in all three domains of life.
Affiliation(s)
- Yan Zhang
- Shenzhen Key Laboratory of Marine Bioresources and Ecology, College of Life Sciences and Oceanography, Shenzhen University, Shenzhen 518055, China
- Shenzhen-Hong Kong Institute of Brain Science-Shenzhen Fundamental Research Institutions, Shenzhen 518055, China
- Shenzhen Bay Laboratory, Shenzhen 518055, China
- Correspondence: ; Tel.: +86-755-2692-2024
- Junge Zheng
- Shenzhen Key Laboratory of Marine Bioresources and Ecology, College of Life Sciences and Oceanography, Shenzhen University, Shenzhen 518055, China
- Shenzhen-Hong Kong Institute of Brain Science-Shenzhen Fundamental Research Institutions, Shenzhen 518055, China
- Shenzhen Bay Laboratory, Shenzhen 518055, China
11
Meng C, Guo F, Zou Q. CWLy-SVM: A support vector machine-based tool for identifying cell wall lytic enzymes. Comput Biol Chem 2020; 87:107304. [PMID: 32580129 DOI: 10.1016/j.compbiolchem.2020.107304] [Citation(s) in RCA: 14] [Impact Index Per Article: 3.5] [Received: 11/27/2019] [Revised: 06/07/2020] [Accepted: 06/08/2020] [Indexed: 12/21/2022]
Abstract
Cell wall lytic enzymes, an important biotechnical tool in drug development, agriculture and the food industry, have attracted increasing research attention. Accurate identification of cell wall lytic enzymes is one of the key and fundamental tasks in this field. In this study, to avoid the inefficiency of in vitro experiments, a support vector machine-based cell wall lytic enzyme identification model was constructed using bioinformatics. This machine learning process includes feature extraction, feature selection, model training and optimization. According to the jackknife cross-validation test, this model obtained a sensitivity of 0.853, a specificity of 0.977, an MCC of 0.845 and an AUC of 0.915. These benchmark results demonstrate that the proposed model outperforms the state-of-the-art method and that it has powerful cell wall lytic enzyme identification ability. Furthermore, we comprehensively analyzed the selected optimal features and used the proposed model to construct a user-friendly web server called CWLy-SVM to identify cell wall lytic enzymes, which is available at http://server.malab.cn/CWLy-SVM/index.jsp.
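The jackknife (leave-one-out) test mentioned above can be sketched as follows. To stay dependency-free, the sketch swaps the paper's support vector machine for a simple nearest-centroid classifier, and the two-dimensional feature vectors are invented; only the evaluation loop is the point.

```python
def nearest_centroid_predict(train_x, train_y, sample):
    """Stand-in classifier (not an SVM): assign the label whose class
    centroid is closest to the sample in squared Euclidean distance."""
    centroids = {}
    for label in set(train_y):
        rows = [x for x, y in zip(train_x, train_y) if y == label]
        centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
    return min(centroids,
               key=lambda lab: sum((a - b) ** 2
                                   for a, b in zip(sample, centroids[lab])))


def jackknife(features, labels):
    """Hold out each sample once, train on the rest, and return
    (sensitivity, specificity) over all held-out predictions."""
    tp = fp = tn = fn = 0
    for i in range(len(features)):
        train_x = features[:i] + features[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        predicted = nearest_centroid_predict(train_x, train_y, features[i])
        if labels[i] == 1:
            tp += predicted == 1
            fn += predicted != 1
        else:
            tn += predicted == 0
            fp += predicted != 0
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return sensitivity, specificity


# Invented, well-separated feature vectors: 1 = lytic enzyme, 0 = other.
features = [[1.0, 1.0], [1.2, 0.9], [0.9, 1.1],
            [-1.0, -1.0], [-1.1, -0.8], [-0.9, -1.2]]
labels = [1, 1, 1, 0, 0, 0]
sensitivity, specificity = jackknife(features, labels)
```

The jackknife trains one model per sample, which is expensive for large datasets but removes the randomness of fold assignment.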
Affiliation(s)
- Chaolu Meng
- College of Intelligence and Computing, Tianjin University, Tianjin, China; College of Computer and Information Engineering, Inner Mongolia Agricultural University, Hohhot, China
- Fei Guo
- College of Intelligence and Computing, Tianjin University, Tianjin, China
- Quan Zou
- Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu, China; Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu, China
12
Babu VMP, Sankari S, Budnick JA, Caswell CC, Walker GC. Sinorhizobium meliloti YbeY is a zinc-dependent single-strand specific endoribonuclease that plays an important role in 16S ribosomal RNA processing. Nucleic Acids Res 2020; 48:332-348. [PMID: 31777930 PMCID: PMC6943124 DOI: 10.1093/nar/gkz1095] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 08/08/2019] [Revised: 11/01/2019] [Accepted: 11/21/2019] [Indexed: 12/19/2022] Open
Abstract
Single-strand specific endoribonuclease YbeY has been shown to play an important role in the processing of the 3' end of the 16S rRNA in Escherichia coli. Lack of YbeY results in the accumulation of the 17S rRNA precursor. In contrast to a previous report, we show that Sinorhizobium meliloti YbeY exhibits endoribonuclease activity on single-stranded RNA substrate but not on double-stranded substrate. This study also identifies the previously unknown metal ion involved in YbeY function to be Zn2+ and shows that the activity of YbeY is enhanced when the occupancy of zinc is increased. We have identified a pre-16S rRNA precursor that accumulates in the S. meliloti ΔybeY strain. We also show that a ΔybeY mutant of Brucella abortus, a mammalian pathogen, accumulates a similar pre-16S rRNA. The pre-16S species is longer in alpha-proteobacteria than in gamma-proteobacteria. We demonstrate that the YbeY from E. coli and S. meliloti can reciprocally complement the rRNA processing defect in a ΔybeY mutant of the other organism. These results establish YbeY as a zinc-dependent single-strand specific endoribonuclease that functions in 16S rRNA processing in both alpha- and gamma-proteobacteria.
Affiliation(s)
- Vignesh M P Babu
- Department of Biology, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
| | - Siva Sankari
- Department of Biology, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
| | - James A Budnick
- Department of Biomedical Sciences and Pathobiology, VA-MD College of Veterinary Medicine, Virginia Tech, Blacksburg, VA, USA
- Department of Microbiology and Molecular Genetics, School of Medicine, University of Pittsburgh, PA, USA
| | - Clayton C Caswell
- Department of Biomedical Sciences and Pathobiology, VA-MD College of Veterinary Medicine, Virginia Tech, Blacksburg, VA, USA
| | - Graham C Walker
- Department of Biology, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
| |
|
13
|
Chen Z, Liu X, Li F, Li C, Marquez-Lago T, Leier A, Akutsu T, Webb GI, Xu D, Smith AI, Li L, Chou KC, Song J. Large-scale comparative assessment of computational predictors for lysine post-translational modification sites. Brief Bioinform 2019; 20:2267-2290. [PMID: 30285084 PMCID: PMC6954452 DOI: 10.1093/bib/bby089] [Citation(s) in RCA: 78] [Impact Index Per Article: 15.6] [Received: 06/14/2018] [Revised: 08/17/2018] [Accepted: 08/18/2018] [Indexed: 12/22/2022] Open
Abstract
Lysine post-translational modifications (PTMs) play a crucial role in regulating diverse functions and biological processes of proteins. However, because of the large volumes of sequencing data generated from genome-sequencing projects, systematic identification of different types of lysine PTM substrates and PTM sites in the entire proteome remains a major challenge. In recent years, a number of computational methods for lysine PTM identification have been developed. These methods show high diversity in their core algorithms, features extracted and feature selection techniques and evaluation strategies. There is therefore an urgent need to revisit these methods and summarize their methodologies, to improve and further develop computational techniques to identify and characterize lysine PTMs from the large amounts of sequence data. With this goal in mind, we first provide a comprehensive survey on a large collection of 49 state-of-the-art approaches for lysine PTM prediction. We cover a variety of important aspects that are crucial for the development of successful predictors, including operating algorithms, sequence and structural features, feature selection, model performance evaluation and software utility. We further provide our thoughts on potential strategies to improve the model performance. Second, in order to examine the feasibility of using deep learning for lysine PTM prediction, we propose a novel computational framework, termed MUscADEL (Multiple Scalable Accurate Deep Learner for lysine PTMs), using deep, bidirectional, long short-term memory recurrent neural networks for accurate and systematic mapping of eight major types of lysine PTMs in the human and mouse proteomes. Extensive benchmarking tests show that MUscADEL outperforms current methods for lysine PTM characterization, demonstrating the potential and power of deep learning techniques in protein PTM prediction. 
The web server of MUscADEL, together with all the data sets assembled in this study, is freely available at http://muscadel.erc.monash.edu/. We anticipate that this comprehensive review and the application of deep learning will provide a practical guide and useful insights into PTM prediction and inspire future bioinformatics studies in related fields.
Affiliation(s)
- Zhen Chen
- School of Basic Medical Science, Qingdao University, Dengzhou Road, Qingdao, Shandong, China
| | - Xuhan Liu
- Medicinal Chemistry, Leiden Academic Centre for Drug Research, Einsteinweg, Leiden, The Netherlands
| | - Fuyi Li
- Biomedicine Discovery Institute and Department of Biochemistry and Molecular Biology, Faculty of Medicine, Monash University, Melbourne, VIC, Australia
- ARC Centre of Excellence in Advanced Molecular Imaging, Monash University, Melbourne, VIC, Australia
| | - Chen Li
- Biomedicine Discovery Institute and Department of Biochemistry and Molecular Biology, Faculty of Medicine, Monash University, Melbourne, VIC, Australia
- Institute of Molecular Systems Biology, ETH Zürich, Auguste-Piccard-Hof, Zürich, Switzerland
| | - Tatiana Marquez-Lago
- Department of Genetics, School of Medicine, University of Alabama at Birmingham, AL, USA
- Department of Cell, Developmental and Integrative Biology, School of Medicine, University of Alabama at Birmingham, AL, USA
| | - André Leier
- Department of Genetics, School of Medicine, University of Alabama at Birmingham, AL, USA
- Department of Cell, Developmental and Integrative Biology, School of Medicine, University of Alabama at Birmingham, AL, USA
| | - Tatsuya Akutsu
- Bioinformatics Center, Institute for Chemical Research, Kyoto University, Uji, Kyoto, Japan
| | - Geoffrey I Webb
- Monash Centre for Data Science, Faculty of Information Technology, Monash University, Melbourne, VIC, Australia
| | - Dakang Xu
- Faculty of Medical Laboratory Science, Ruijin Hospital, School of Medicine, Shanghai Jiao Tong University, Shanghai, China
- Department of Molecular and Translational Science, Faculty of Medicine, Hudson Institute of Medical Research, Monash University, Melbourne, VIC, Australia
| | - Alexander Ian Smith
- Biomedicine Discovery Institute and Department of Biochemistry and Molecular Biology, Faculty of Medicine, Monash University, Melbourne, VIC, Australia
- ARC Centre of Excellence in Advanced Molecular Imaging, Monash University, Melbourne, VIC, Australia
| | - Lei Li
- School of Basic Medical Science, Qingdao University, Dengzhou Road, Qingdao, Shandong, China
| | - Kuo-Chen Chou
- Gordon Life Science Institute, Boston, MA, USA
- Center for Informational Biology, School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu, China
| | - Jiangning Song
- Biomedicine Discovery Institute and Department of Biochemistry and Molecular Biology, Faculty of Medicine, Monash University, Melbourne, VIC, Australia
- ARC Centre of Excellence in Advanced Molecular Imaging, Monash University, Melbourne, VIC, Australia
- Monash Centre for Data Science, Faculty of Information Technology, Monash University, Melbourne, VIC, Australia
| |
|
14
|
Chen Z, Zhao P, Li F, Wang Y, Smith AI, Webb GI, Akutsu T, Baggag A, Bensmail H, Song J. Comprehensive review and assessment of computational methods for predicting RNA post-transcriptional modification sites from RNA sequences. Brief Bioinform 2019; 21:1676-1696. [DOI: 10.1093/bib/bbz112] [Citation(s) in RCA: 57] [Impact Index Per Article: 11.4] [Received: 07/04/2019] [Revised: 07/31/2019] [Accepted: 08/07/2019] [Indexed: 12/14/2022] Open
Abstract
RNA post-transcriptional modifications play a crucial role in a myriad of biological processes and cellular functions. To date, more than 160 RNA modifications have been discovered; therefore, accurate identification of RNA-modification sites is fundamental for a better understanding of RNA-mediated biological functions and mechanisms. However, due to limitations in experimental methods, systematic identification of different types of RNA-modification sites remains a major challenge. Recently, more than 20 computational methods have been developed to identify RNA-modification sites in tandem with high-throughput experimental methods, with most of these capable of predicting only single types of RNA-modification sites. These methods show high diversity in their dataset size, data quality, core algorithms, features extracted and feature selection techniques and evaluation strategies. Therefore, there is an urgent need to revisit these methods and summarize their methodologies, in order to improve and further develop computational techniques to identify and characterize RNA-modification sites from the large amounts of sequence data. With this goal in mind, first, we provide a comprehensive survey on a large collection of 27 state-of-the-art approaches for predicting N1-methyladenosine and N6-methyladenosine sites. We cover a variety of important aspects that are crucial for the development of successful predictors, including the dataset quality, operating algorithms, sequence and genomic features, feature selection, model performance evaluation and software utility. In addition, we also provide our thoughts on potential strategies to improve the model performance. Second, we propose a computational approach called DeepPromise based on deep learning techniques for simultaneous prediction of N1-methyladenosine and N6-methyladenosine. 
To extract the sequence context surrounding the modification sites, three feature encodings, including enhanced nucleic acid composition, one-hot encoding, and RNA embedding, were used as the input to seven consecutive layers of convolutional neural networks (CNNs), respectively. Moreover, DeepPromise further combined the prediction score of the CNN-based models and achieved around 43% higher area under receiver-operating curve (AUROC) for m1A site prediction and 2–6% higher AUROC for m6A site prediction, respectively, when compared with several existing state-of-the-art approaches on the independent test. In-depth analyses of characteristic sequence motifs identified from the convolution-layer filters indicated that nucleotide presentation at proximal positions surrounding the modification sites contributed most to the classification, whereas those at distal positions also affected classification but to different extents. To maximize user convenience, a web server was developed as an implementation of DeepPromise and made publicly available at http://DeepPromise.erc.monash.edu/, with the server accepting both RNA sequences and genomic sequences to allow prediction of two types of putative RNA-modification sites.
Affiliation(s)
- Zhen Chen
- School of Basic Medical Science, Qingdao University, China
| | - Pei Zhao
- State Key Laboratory of Cotton Biology, Institute of Cotton Research of Chinese Academy of Agricultural Sciences, Anyang 455000, Henan, China
| | - Fuyi Li
- Northwest A&F University, China
| | - A Ian Smith
- Prince Henrys Institute Melbourne and Monash University, Australia
| | - Abdelkader Baggag
- Qatar Computing Research Institute, Hamad Bin Khalifa University, Doha 34110, Qatar
| | - Halima Bensmail
- Qatar Computing Research Institute, Hamad Bin Khalifa University, Doha 34110, Qatar
| | - Jiangning Song
- Biomedicine Discovery Institute and Department of Biochemistry and Molecular Biology, Monash University, Victoria 3800, Australia
| |
|
15
|
Yan R, Wang X, Tian Y, Xu J, Xu X, Lin J. Prediction of zinc-binding sites using multiple sequence profiles and machine learning methods. Mol Omics 2019; 15:205-215. [PMID: 31046040 DOI: 10.1039/c9mo00043g] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Indexed: 01/11/2023]
Abstract
The zinc (Zn2+) cofactor has been proven to be involved in numerous biological mechanisms and the zinc-binding site is recognized as one of the most important post-translation modifications in proteins. Therefore, accurate knowledge of zinc ions in protein structures can provide potential clues for elucidation of protein folding and functions. However, determining zinc-binding residues by experimental means is usually lab-intensive and associated with high cost in most cases. In this context, the development of computational tools for identifying zinc-binding sites is highly desired, especially in the current post-genomic era. In this work, we developed a novel zinc-binding site prediction method by combining several intensively-trained machine learning models. To establish an accurate and generative method, we downloaded all zinc-binding proteins from the Protein Data Bank and prepared a non-redundant dataset. Meanwhile, a well-prepared dataset by other groups was also used. Then, effective and complementary features were extracted from sequences and three-dimensional structures of these proteins. Moreover, several well-designed machine learning models were intensively trained to construct accurate models. To assess the performance, the obtained predictors were stringently benchmarked using the diverse zinc-binding sites. Furthermore, several state-of-the-art in silico methods developed specifically for zinc-binding sites were also evaluated and compared. The results confirmed that our method is very competitive in real world applications and could become a complementary tool to wet lab experiments. To facilitate research in the community, a web server and stand-alone program implementing our method were constructed and are publicly available at . The downloadable program of our method can be easily used for the high-throughput screening of potential zinc-binding sites across proteomes.
Affiliation(s)
- Renxiang Yan
- School of Biological Sciences and Engineering, Fuzhou University, Fuzhou 350002, China; Fujian Key Laboratory of Marine Enzyme Engineering, Fuzhou 350002, China
| | - Xiaofeng Wang
- College of Mathematics and Computer Science, Shanxi Normal University, Linfen 041004, China
| | - Yarong Tian
- Institute of Biomedicine, Sahlgrenska Academy, University of Gothenburg, 40530, Sweden
| | - Jing Xu
- School of Biological Sciences and Engineering, Fuzhou University, Fuzhou 350002, China; Fujian Key Laboratory of Marine Enzyme Engineering, Fuzhou 350002, China
| | - Xiaoli Xu
- School of Biological Sciences and Engineering, Fuzhou University, Fuzhou 350002, China.
| | - Juan Lin
- School of Biological Sciences and Engineering, Fuzhou University, Fuzhou 350002, China; Fujian Key Laboratory of Marine Enzyme Engineering, Fuzhou 350002, China
| |
|
16
|
Slepchenko KG, Holub JM, Li YV. Intracellular zinc increase affects phosphorylation state and subcellular localization of protein kinase C delta (δ). Cell Signal 2018; 44:148-157. [DOI: 10.1016/j.cellsig.2018.01.018] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Received: 11/27/2017] [Accepted: 01/14/2018] [Indexed: 10/18/2022]
|
17
|
Song J, Li F, Takemoto K, Haffari G, Akutsu T, Chou KC, Webb GI. PREvaIL, an integrative approach for inferring catalytic residues using sequence, structural, and network features in a machine-learning framework. J Theor Biol 2018; 443:125-137. [DOI: 10.1016/j.jtbi.2018.01.023] [Citation(s) in RCA: 95] [Impact Index Per Article: 15.8] [Received: 09/12/2017] [Revised: 01/17/2018] [Accepted: 01/18/2018] [Indexed: 10/18/2022]
|
18
|
Trace Elements and Healthcare: A Bioinformatics Perspective. Advances in Experimental Medicine and Biology 2018; 1005:63-98. [PMID: 28916929 DOI: 10.1007/978-981-10-5717-5_4] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Indexed: 12/25/2022]
Abstract
Biological trace elements are essential for human health. Imbalance in trace element metabolism and homeostasis may play an important role in a variety of diseases and disorders. While most previous research has focused on experimental verification of genes involved in trace element metabolism and those encoding trace element-dependent proteins, bioinformatics studies of trace elements are relatively rare and still at an early stage. This chapter offers an overview of recent progress in bioinformatics analyses of trace element utilization, metabolism, and function, especially comparative genomics of several important metals. The relationship between individual elements and several diseases, based on recent large-scale systematic studies such as genome-wide association studies and case-control studies, is discussed. Lastly, developments in ionomics and its recent application in human health are also introduced.
|
19
|
Srivastava A, Kumar M. Prediction of zinc binding sites in proteins using sequence derived information. J Biomol Struct Dyn 2018; 36:4413-4423. [PMID: 29241411 DOI: 10.1080/07391102.2017.1417910] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Indexed: 01/30/2023]
Abstract
Zinc is one of the most abundant catalytic cofactors and also an important structural component of a large number of metalloproteins. Hence, prediction of zinc-binding sites in proteins can be a significant step in annotating the molecular function of a large number of proteins. The majority of existing methods for zinc-binding site prediction are based on a data-set of proteins that was compiled nearly a decade ago. Hence, there is a need to develop a zinc-binding site prediction system using current, updated data that include recently added proteins. Herein, we propose a support vector machine-based method, named ZincBinder, for prediction of zinc-binding sites in a protein using sequence profile information. The predictor was trained using a fivefold cross-validation approach and achieved 85.37% sensitivity with 86.20% specificity during training. Benchmarking on an independent non-redundant data-set, which was not used during training, showed better performance of ZincBinder vis-à-vis existing methods. Executable versions, source code, sample datasets, and usage instructions are available at http://proteininformatics.org/mkumar/znbinder/.
Affiliation(s)
- Abhishikha Srivastava
- Department of Biophysics, University of Delhi South Campus, Benito Juarez Road, New Delhi 110021, India
| | - Manish Kumar
- Department of Biophysics, University of Delhi South Campus, Benito Juarez Road, New Delhi 110021, India
| |
|
20
|
Kumar S. Prediction of Metal Ion Binding Sites in Proteins from Amino Acid Sequences by Using Simplified Amino Acid Alphabets and Random Forest Model. Genomics Inform 2017; 15:162-169. [PMID: 29307143 PMCID: PMC5769865 DOI: 10.5808/gi.2017.15.4.162] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.1] [Received: 10/16/2017] [Revised: 11/16/2017] [Accepted: 11/16/2017] [Indexed: 11/20/2022] Open
Abstract
In metal-binding proteins, or metalloproteins, metal ions are important for the stability of the protein and also serve as cofactors in various functions such as controlling metabolism, regulating signal transport, and metal homeostasis. In structural genomics, prediction of metal-binding proteins helps in the selection of a suitable growth medium for overexpression studies and also helps in obtaining the functional protein. Computational prediction using machine learning approaches has been widely used in various fields of bioinformatics, based on the premise that all the information is contained in the amino acid sequence. In this study, random forest machine learning prediction systems were deployed with simplified amino acid alphabets for prediction of individual major metal ion binding sites, namely copper, calcium, cobalt, iron, magnesium, manganese, nickel, and zinc.
Affiliation(s)
- Suresh Kumar
- Department of Diagnostic and Allied Health Sciences, Faculty of Health and Life Sciences, Management and Science University, 40100 Shah Alam, Malaysia
| |
|
21
|
Niedzialkowska E, Mrugała B, Rugor A, Czub MP, Skotnicka A, Cotelesage JJH, George GN, Szaleniec M, Minor W, Lewiński K. Optimization of overexpression of a chaperone protein of steroid C25 dehydrogenase for biochemical and biophysical characterization. Protein Expr Purif 2017; 134:47-62. [PMID: 28343996 DOI: 10.1016/j.pep.2017.03.019] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Received: 11/29/2016] [Revised: 03/02/2017] [Accepted: 03/21/2017] [Indexed: 11/27/2022]
Abstract
Molybdenum is an essential nutrient for metabolism in plants, bacteria, and animals. Molybdoenzymes are involved in nitrogen assimilation, oxidoreductive detoxification, and bioconversion reactions of environmental, industrial, and pharmaceutical interest. Molybdoenzymes contain a molybdenum cofactor (Moco), which is a pyranopterin heterocyclic compound that binds a molybdenum atom via a dithiolene group. Because Moco is a large and complex compound deeply buried within the protein, molybdoenzymes are accompanied by private chaperone proteins responsible for the cofactor's insertion into the enzyme and the enzyme's maturation. Efficient recombinant expression and purification of both Moco-free and Moco-containing molybdoenzymes and their chaperones is of paramount importance for fundamental and applied research related to molybdoenzymes. In this work, we focused on a D1 protein annotated as a chaperone of steroid C25 dehydrogenase (S25DH) from Sterolibacterium denitrificans Chol-1S. The D1 protein is presumably involved in the maturation of S25DH engaged in oxygen-independent oxidation of sterols. As this chaperone is thought to be a crucial element that ensures the insertion of Moco into the enzyme and, consequently, proper folding of S25DH, optimization of the chaperone's expression is the first step toward the development of recombinant expression and purification methods for S25DH. We have identified common E. coli strains and conditions for both expression and purification that allow us to selectively produce Moco-containing and Moco-free chaperones. We have also characterized the Moco-containing chaperone by EXAFS and HPLC analysis and identified conditions that stabilize both forms of the protein. The protocols presented here are efficient and result in protein quantities sufficient for biochemical studies.
Affiliation(s)
- Ewa Niedzialkowska
- Jerzy Haber Institute of Catalysis and Surface Chemistry, Polish Academy of Sciences, Niezapominajek 8, 30239 Krakow, Poland.
| | - Beata Mrugała
- Jerzy Haber Institute of Catalysis and Surface Chemistry, Polish Academy of Sciences, Niezapominajek 8, 30239 Krakow, Poland
| | - Agnieszka Rugor
- Jerzy Haber Institute of Catalysis and Surface Chemistry, Polish Academy of Sciences, Niezapominajek 8, 30239 Krakow, Poland
| | - Mateusz P Czub
- Faculty of Chemistry, Jagiellonian University, Ingardena 3, Krakow 30060, Poland; Department of Molecular Physiology and Biological Physics, University of Virginia, 1340 Jefferson Park Avenue, Charlottesville, VA 22908, USA
| | - Anna Skotnicka
- Faculty of Agriculture and Economics, University of Agriculture in Krakow, Mickiewicza 21, 31120 Krakow, Poland
| | - Julien J H Cotelesage
- Molecular and Environmental Sciences Group, Department of Geological Sciences, University of Saskatchewan, Saskatoon, SK S7N 5E2, Canada
| | - Graham N George
- Molecular and Environmental Sciences Group, Department of Geological Sciences, University of Saskatchewan, Saskatoon, SK S7N 5E2, Canada
| | - Maciej Szaleniec
- Jerzy Haber Institute of Catalysis and Surface Chemistry, Polish Academy of Sciences, Niezapominajek 8, 30239 Krakow, Poland
| | - Wladek Minor
- Department of Molecular Physiology and Biological Physics, University of Virginia, 1340 Jefferson Park Avenue, Charlottesville, VA 22908, USA
| | - Krzysztof Lewiński
- Faculty of Chemistry, Jagiellonian University, Ingardena 3, Krakow 30060, Poland
| |
|
22
|
Skliros D, Kalatzis PG, Katharios P, Flemetakis E. Comparative Functional Genomic Analysis of Two Vibrio Phages Reveals Complex Metabolic Interactions with the Host Cell. Front Microbiol 2016; 7:1807. [PMID: 27895630 PMCID: PMC5107563 DOI: 10.3389/fmicb.2016.01807] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.3] [Received: 08/25/2016] [Accepted: 10/27/2016] [Indexed: 01/21/2023] Open
Abstract
Sequencing and annotation were performed for two large double-stranded DNA bacteriophages, φGrn1 and φSt2 of the Myoviridae family, considered to be of great interest for phage therapy against Vibrios in aquaculture live feeds. In addition, phage–host metabolic interactions and exploitation were studied by transcript profiling of selected viral and host genes. Comparative genomic analysis with other large Vibrio phages was also performed to establish the presence and location of homing endonucleases, highlighting distinct features for both phages. Phylogenetic analysis revealed that they belong to the “schizoT4like” clade. Although many reports of newly sequenced viruses have provided a large set of information, basic research related to the shift of the bacterial metabolism during infection remains stagnant. The function of many viral protein products in the process of infection is still unknown. Genome annotation identified the presence of several viral open reading frames (ORFs) participating in metabolism, including a Sir2/cobB (sirtuin) protein and a number of genes involved in auxiliary NAD+ and nucleotide biosynthesis, necessary for phage DNA replication. Key genes were subsequently selected for detailed study of their expression levels during infection. This work suggests a complex metabolic interaction and exploitation of the host metabolic pathways and biochemical processes, including a possible post-translational protein modification, by the virus during infection.
Affiliation(s)
- Dimitrios Skliros
- Laboratory of Molecular Biology, Department of Biotechnology, School of Food, Biotechnology and Development, Agricultural University of Athens, Athens, Greece
| | - Panos G Kalatzis
- Institute of Marine Biology, Biotechnology and Aquaculture, Hellenic Centre for Marine Research, Heraklion, Crete, Greece; Marine Biological Section, University of Copenhagen, Helsingør, Denmark
| | - Pantelis Katharios
- Institute of Marine Biology, Biotechnology and Aquaculture, Hellenic Centre for Marine Research, Heraklion, Crete, Greece
| | - Emmanouil Flemetakis
- Laboratory of Molecular Biology, Department of Biotechnology, School of Food, Biotechnology and Development, Agricultural University of Athens, Athens, Greece
| |
|
23
|
Erthal LCS, Marques AF, Almeida FCL, Melo GLM, Carvalho CM, Palmieri LC, Cabral KMS, Fontes GN, Lima LMTR. Regulation of the assembly and amyloid aggregation of murine amylin by zinc. Biophys Chem 2016; 218:58-70. [PMID: 27693831 DOI: 10.1016/j.bpc.2016.09.008] [Citation(s) in RCA: 30] [Impact Index Per Article: 3.8] [Received: 06/12/2016] [Revised: 09/10/2016] [Accepted: 09/17/2016] [Indexed: 11/17/2022]
Abstract
The secretory granule of the pancreatic β-cells is a zinc-rich environment copopulated with the hormones amylin and insulin. Human amylin has been shown to interact with zinc ions, with a major contribution from the single histidine residue, which is absent in amylin from other species such as cat, rhesus and rodents. We report here the interaction of murine amylin with zinc ions in vitro. The self-assembly of murine amylin is tightly regulated by zinc and pH. Ion mobility mass spectrometry revealed zinc interaction with monomers and oligomers. Nuclear magnetic resonance confirms the binding of zinc to murine amylin. The aggregation process of murine amylin into amyloid fibrils is accelerated by zinc. Collectively, these data suggest a general role of zinc in modulating the oligomerization and amyloid fibril formation of amylin variants.
Affiliation(s)
- Luiza C S Erthal
- School of Pharmacy, Federal University of Rio de Janeiro - UFRJ, CCS, Bss24, Ilha do Fundão, 21941-590 Rio de Janeiro, Brazil
| | - Adriana F Marques
- School of Pharmacy, Federal University of Rio de Janeiro - UFRJ, CCS, Bss24, Ilha do Fundão, 21941-590 Rio de Janeiro, Brazil
| | - Fábio C L Almeida
- School of Pharmacy, Federal University of Rio de Janeiro - UFRJ, CCS, Bss24, Ilha do Fundão, 21941-590 Rio de Janeiro, Brazil
| | - Gustavo L M Melo
- School of Pharmacy, Federal University of Rio de Janeiro - UFRJ, CCS, Bss24, Ilha do Fundão, 21941-590 Rio de Janeiro, Brazil
| | - Camila M Carvalho
- School of Pharmacy, Federal University of Rio de Janeiro - UFRJ, CCS, Bss24, Ilha do Fundão, 21941-590 Rio de Janeiro, Brazil
| | - Leonardo C Palmieri
- School of Pharmacy, Federal University of Rio de Janeiro - UFRJ, CCS, Bss24, Ilha do Fundão, 21941-590 Rio de Janeiro, Brazil
| | - Katia M S Cabral
- School of Pharmacy, Federal University of Rio de Janeiro - UFRJ, CCS, Bss24, Ilha do Fundão, 21941-590 Rio de Janeiro, Brazil
| | - Giselle N Fontes
- Laboratory for Macromolecules (LAMAC-DIMAV), Brazilian National Institute of Metrology, Quality and Technology - INMETRO, Av. N. Sa. das Graças, 50 - Xerém, Duque de Caxias-RJ, 25250-020 Rio de Janeiro, Brazil
| | - Luís Maurício T R Lima
- School of Pharmacy, Federal University of Rio de Janeiro - UFRJ, CCS, Bss24, Ilha do Fundão, 21941-590 Rio de Janeiro, Brazil; Laboratory for Macromolecules (LAMAC-DIMAV), Brazilian National Institute of Metrology, Quality and Technology - INMETRO, Av. N. Sa. das Graças, 50 - Xerém, Duque de Caxias-RJ, 25250-020 Rio de Janeiro, Brazil; National Institute of Science and Technology for Structural Biology and Bioimaging (INBEB-INCT), Federal University of Rio de Janeiro, Rio de Janeiro 21941-590, Brazil.
| |
|
24
|
Yan R, Wang X, Huang L, Lin J, Cai W, Zhang Z. GPCRserver: an accurate and novel G protein-coupled receptor predictor. Molecular BioSystems 2015; 10:2495-504. [PMID: 25014909 DOI: 10.1039/c4mb00272e] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.1] [Indexed: 12/22/2022]
Abstract
G protein-coupled receptors (GPCRs), also known as seven-transmembrane domain receptors, pass through the cellular membrane seven times and play diverse biological roles in cells, such as signaling, transport of molecules and cell-cell communication. In this work, we develop a web server, namely the GPCRserver, which is capable of identifying GPCRs from genomic sequences and locating their transmembrane regions. The GPCRserver contains three modules: (1) the Trans-GPCR for transmembrane region prediction using sequence evolutionary profiles with the assistance of neural network training, (2) the SSEA-GPCR for identifying GPCRs from genomic data using secondary structure element alignment, and (3) the PPA-GPCR for identifying GPCRs using profile-to-profile alignment. Our predictor was strictly benchmarked and showed favorable performance in real applications. The web server and stand-alone programs are publicly available at .
Affiliation(s)
- Renxiang Yan
- Institute of Applied Genomics, School of Biological Sciences and Engineering, Fuzhou University, Fuzhou 350002, China.
| |
|
25
|
Li L, Yu S, Xiao W, Li Y, Hu W, Huang L, Zheng X, Zhou S, Yang H. Protein submitochondrial localization from integrated sequence representation and SVM-based backward feature extraction. Molecular BioSystems 2015; 11:170-7. [PMID: 25335193 DOI: 10.1039/c4mb00340c] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.0] [Indexed: 02/05/2023]
Abstract
The mitochondrion, a tiny energy factory, plays an important role in various biological processes in most eukaryotic cells.
Affiliation(s)
- Liqi Li
- Department of General Surgery, Xinqiao Hospital, Third Military Medical University, Chongqing 400037, China
| | - Sanjiu Yu
- Institute of Cardiovascular Diseases of PLA, Xinqiao Hospital, Third Military Medical University, Chongqing 400037, China
| | - Weidong Xiao
- Department of General Surgery, Xinqiao Hospital, Third Military Medical University, Chongqing 400037, China
| | - Yongsheng Li
- Institute of Cancer, Xinqiao Hospital, Third Military Medical University, Chongqing 400037, China
| | - Wenjuan Hu
- Department of Pathophysiology and High Altitude Pathology, College of High Altitude Military Medicine, Third Military Medical University, Chongqing 400038, China
| | - Lan Huang
- Institute of Cardiovascular Diseases of PLA, Xinqiao Hospital, Third Military Medical University, Chongqing 400037, China
| | - Xiaoqi Zheng
- Department of Mathematics, Shanghai Normal University, Shanghai 200234, China
| | - Shiwen Zhou
- National Drug Clinical Trial Institution, Xinqiao Hospital, Third Military Medical University, Chongqing 400037, China
| | - Hua Yang
- Department of General Surgery, Xinqiao Hospital, Third Military Medical University, Chongqing 400037, China
|
26
|
Yu X, Strub MP, Barnard TJ, Noinaj N, Piszczek G, Buchanan SK, Taraska JW. An engineered palette of metal ion quenchable fluorescent proteins. PLoS One 2014; 9:e95808. [PMID: 24752441 PMCID: PMC3994163 DOI: 10.1371/journal.pone.0095808] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.5] [Received: 02/03/2014] [Accepted: 03/31/2014] [Indexed: 12/17/2022] Open
Abstract
Many fluorescent proteins have been created to act as genetically encoded biosensors. With these sensors, changes in fluorescence report on chemical states in living cells. Transition metal ions such as copper, nickel, and zinc are crucial in many physiological and pathophysiological pathways. Here, we engineered a spectral series of optimized transition metal ion-binding fluorescent proteins that respond to metals with large changes in fluorescence intensity. These proteins can act as metal biosensors or imaging probes whose fluorescence can be tuned by metals. Each protein is uniquely modulated by four different metals (Cu²⁺, Ni²⁺, Co²⁺, and Zn²⁺). Crystallography revealed the geometry and location of metal binding to the engineered sites. When attached to the extracellular terminus of the membrane protein VAMP2, dimeric pairs of the sensors could be used in cells as ratiometric probes for transition metal ions. Thus, these engineered fluorescent proteins act as sensitive transition metal ion-responsive genetically encoded probes that span the visible spectrum.
Affiliation(s)
- Xiaozhen Yu
- Laboratory of Molecular Biophysics, National Heart Lung and Blood Institute, National Institutes of Health, Bethesda, Maryland, United States of America
| | - Marie-Paule Strub
- Laboratory of Molecular Biophysics, National Heart Lung and Blood Institute, National Institutes of Health, Bethesda, Maryland, United States of America
| | - Travis J. Barnard
- Laboratory of Molecular Biology, National Institute of Diabetes and Digestive and Kidney Diseases, National Institutes of Health, Bethesda, Maryland, United States of America
| | - Nicholas Noinaj
- Laboratory of Molecular Biology, National Institute of Diabetes and Digestive and Kidney Diseases, National Institutes of Health, Bethesda, Maryland, United States of America
| | - Grzegorz Piszczek
- Laboratory of Biochemistry, National Heart Lung and Blood Institute, National Institutes of Health, Bethesda, Maryland, United States of America
| | - Susan K. Buchanan
- Laboratory of Molecular Biology, National Institute of Diabetes and Digestive and Kidney Diseases, National Institutes of Health, Bethesda, Maryland, United States of America
| | - Justin W. Taraska
- Laboratory of Molecular Biophysics, National Heart Lung and Blood Institute, National Institutes of Health, Bethesda, Maryland, United States of America
|