1
Wang XT, Liu KH, Li Y, Ren YY, Li Q, Wang BT. Zinc metalloprotease FgM35, which targets the wheat zinc-binding protein TaZnBP, contributes to the virulence of Fusarium graminearum. Stress Biology 2024; 4:45. [PMID: 39472326 PMCID: PMC11522218 DOI: 10.1007/s44154-024-00171-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/23/2023] [Accepted: 04/29/2024] [Indexed: 11/02/2024]
Abstract
Metalloproteinases are ubiquitous in organisms. Most metalloproteinases secreted by pathogenic microorganisms are also called virulence factors, because they degrade proteins in the external tissues of the host, thereby reducing the host's immunity and increasing its susceptibility to disease. Zinc metalloproteinase is one of the most common metalloproteinases. In our report, we studied the biological function of zinc metalloprotease FgM35 in Fusarium graminearum and the pathogen-host interaction during infection. We found that the asexual and sexual reproduction of the deletion mutant ΔFgM35 were affected, as well as the tolerance of F. graminearum to metal stress. In addition, deletion of FgM35 reduced the virulence of F. graminearum. The wheat target TaZnBP was screened using a wheat yeast cDNA library, and the interaction between FgM35 and TaZnBP was verified by HADDOCK molecular docking, yeast two-hybrid, Bi-FC, Luc, and Co-IP assays. The contribution of TaZnBP to plant immunity was also demonstrated. In summary, our work revealed the indispensable role of FgM35 in the reproductive process and the pathogenicity of F. graminearum, and it identified the interaction between FgM35 and TaZnBP as well as the function of TaZnBP. This provides a theoretical basis for further study of the function of metalloproteinases in pathogen-host interactions.
Affiliation(s)
- Xin-Tong Wang
  - State Key Laboratory of Crop Stress Resistance and High-Efficiency Production, College of Plant Protection, Northwest A&F University, Yangling, Shaanxi Province, 712100, People's Republic of China
- Kou-Han Liu
  - State Key Laboratory of Crop Stress Resistance and High-Efficiency Production, College of Plant Protection, Northwest A&F University, Yangling, Shaanxi Province, 712100, People's Republic of China
- Ying Li
  - State Key Laboratory of Crop Stress Resistance and High-Efficiency Production, College of Plant Protection, Northwest A&F University, Yangling, Shaanxi Province, 712100, People's Republic of China
- Yan-Yan Ren
  - State Key Laboratory of Crop Stress Resistance and High-Efficiency Production, College of Plant Protection, Northwest A&F University, Yangling, Shaanxi Province, 712100, People's Republic of China
- Qiang Li
  - State Key Laboratory of Crop Stress Resistance and High-Efficiency Production, College of Plant Protection, Northwest A&F University, Yangling, Shaanxi Province, 712100, People's Republic of China
- Bao-Tong Wang
  - State Key Laboratory of Crop Stress Resistance and High-Efficiency Production, College of Plant Protection, Northwest A&F University, Yangling, Shaanxi Province, 712100, People's Republic of China
2
Andrikopoulos PC, Čabart P. The chromatin remodeler SMARCA5 binds to d-block metal supports: Characterization of affinities by IMAC chromatography and QM analysis. PLoS One 2024; 19:e0309134. [PMID: 39374200 PMCID: PMC11458017 DOI: 10.1371/journal.pone.0309134] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/23/2023] [Accepted: 08/05/2024] [Indexed: 10/09/2024] Open
Abstract
The ISWI family protein SMARCA5 contains the ATP-binding pocket that coordinates the catalytic Mg²⁺ ion and water molecules for ATP hydrolysis. In this study, we demonstrate that SMARCA5 can also possess an alternative metal-binding ability. First, we isolated SMARCA5 on the cobalt column (IMAC) to near homogeneity. Examination of the interactions of SMARCA5 with metal-chelating supports showed that, apart from Co²⁺, it binds to Cu²⁺, Zn²⁺ and Ni²⁺. The efficiency of the binding to the last-listed metal was influenced by the chelating ligand, resulting in a strong preference for Ni-NTA over the Ni-CM-Asp equivalent. To gain insight into the preferential affinity for the Ni-NTA ligand, QM calculations were performed on model systems and metal-ligand complexes with a limited protein fragment of SMARCA5 containing the double-histidine (dHis) motif. The calculations correlated the observed affinity with the relative stability of the d-block metals to tetradentate ligand coordination over tridentate, as well as their overall octahedral coordination capacity. Likewise, binding free energies derived from model imidazole complexes mirrored the observed Ni-NTA/Ni-CM-Asp preferential affinity. Finally, similar calculations on complexes with a SMARCA5 peptide fragment derived from the AlphaFold structural prediction captured almost accurately the expected relative stability of the TM complexes, and produced a large energetic separation (~10 kcal·mol⁻¹) between Ni-NTA and Ni-CM-Asp in favour of the former.
Affiliation(s)
- Prokopis C. Andrikopoulos
  - BIOCEV, Institute of Biotechnology of the Czech Academy of Sciences, Vestec, Czechia
  - BIOCEV, 1st Faculty of Medicine, Charles University, Vestec, Czechia
- Pavel Čabart
  - Institute of Experimental Medicine of the Czech Academy of Sciences, Prague, Czechia
3
Liu F, Hutchinson R. Visible particles in parenteral drug products: A review of current safety assessment practice. Curr Res Toxicol 2024; 7:100175. [PMID: 38975062 PMCID: PMC11223083 DOI: 10.1016/j.crtox.2024.100175] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/29/2024] [Revised: 06/01/2024] [Accepted: 06/05/2024] [Indexed: 07/09/2024] Open
Abstract
Parenteral drug products (PDPs) are administered extensively to treat various diseases. Product quality plays a critical role in ensuring patient safety and product efficacy. One important quality challenge is the contamination of particles in PDPs. Particle presence in PDPs represents potential safety risk to patients. Differential guidance and practice have been in place for visible (VPs) and subvisible particles (SVPs) in PDPs. For SVPs, the amount limits have been harmonized in multiple Pharmacopeias. The pharmaceutical industry follows the guided limits for regulatory and quality compliance. However, for VPs, no such acceptable limit has been set. This results in not only quality but also safety challenges for manufacturers and drug developers in managing and evaluating VPs. It is important to understand the potential safety risk of VPs so these can be weighed against the benefit of the PDPs. To evaluate their potential risk(s), it is necessary to understand their nature, origin, frequency of their occurrence, safety risk, the risk mitigation measures, and the method to evaluate their safety. The current paper reviews the critical literature on these aspects and provides insight into considerations when performing safety assessment and managing the risk(s) for VPs in PDPs.
Affiliation(s)
- Frank Liu
  - Safe Product Services LLC, Pittsfield, MA, USA
4
Diptiman D, Jalan A, Pal R, Dodwani S, Bandyopadhyay D. Hist-i-fy - a multiple histidine post-translational-modification (PTM) prediction server based on protein sequences using convolution neural network: a case study on mass spectroscopy data. J Biomol Struct Dyn 2024:1-10. [PMID: 38285683 DOI: 10.1080/07391102.2024.2310200] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/24/2023] [Accepted: 01/19/2024] [Indexed: 01/31/2024]
Abstract
Computational characterization of multiple Histidine (His) post-translational-modifications (PTM) at enzyme active sites complements tedious experimental characterization in proteins-of-unknown-functions (PUFs) and domain-of-unknown-functions (DUFs). There are only a handful of Histidine-PTM-prediction-tools and those also annotate only a single function. Here, we addressed the problem using artificial neural networks on functional histidine dataset curated from enzyme (protein) sequences available in UniProt database (sample size n = 1584). The convolution-neural-network (CNN) model ('Hist-i-fy') performed the best with 75% overall accuracy/F1-score. A case study was performed on histidine-phosphorylation (n = 34) obtained from mass spectroscopy data. For the first time, we report multiple His-PTM-prediction-tool (https://histify.streamlit.app/ & https://github.com/dibyansu24-maker/Histify), with optimal performance. The inputs to the tool are (i) protein sequence containing histidine, and (ii) the histidine residue number. Prediction output is one out of the eight histidine functions: acetylation, ribosylation, glycosylation, hydroxylation, methylation, oxidation, phosphorylation, and protein splicing. Communicated by Ramaswamy H. Sarma.
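Editorial illustration (not from the cited paper): the two inputs described in the abstract, a protein sequence and a histidine residue number, can be sanity-checked before being submitted to any such predictor. The sketch below is a minimal example under stated assumptions: the function name `check_histidine_query` is hypothetical and not part of the Hist-i-fy code base, and 1-based residue numbering is assumed because the abstract does not specify the convention. Only the eight class labels are taken from the abstract.

```python
# Hypothetical helper for illustration; not part of the Hist-i-fy code base.
# The eight output classes are the ones named in the abstract.
HIS_PTM_CLASSES = (
    "acetylation", "ribosylation", "glycosylation", "hydroxylation",
    "methylation", "oxidation", "phosphorylation", "protein splicing",
)


def check_histidine_query(sequence: str, residue_number: int) -> tuple[str, int]:
    """Validate a (sequence, residue number) pair.

    Assumption: residue numbers are 1-based, as is usual for protein sequences.
    Returns the cleaned sequence and the residue number, or raises ValueError.
    """
    # Drop whitespace (e.g. FASTA line breaks) and normalise to upper case.
    seq = "".join(sequence.split()).upper()
    if not seq.isalpha():
        raise ValueError("sequence must be non-empty and contain only letters")
    if not 1 <= residue_number <= len(seq):
        raise ValueError("residue number is outside the sequence")
    residue = seq[residue_number - 1]
    if residue != "H":
        raise ValueError(f"residue {residue_number} is {residue}, not histidine (H)")
    return seq, residue_number


if __name__ == "__main__":
    # Toy sequence; position 4 is a histidine.
    print(check_histidine_query("MKTH AHLLE", 4))
```

A query that passes this check would then be handed to the prediction tool itself, which returns one of the eight labels listed in `HIS_PTM_CLASSES`.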
Affiliation(s)
- Dibyansu Diptiman
  - Department of Biological Sciences, Birla Institute of Technology and Science, Hyderabad, India
- Abhishek Jalan
  - Department of Biological Sciences, Birla Institute of Technology and Science, Hyderabad, India
- Rishabh Pal
  - Department of Biological Sciences, Birla Institute of Technology and Science, Hyderabad, India
- Sachin Dodwani
  - Department of Biological Sciences, Birla Institute of Technology and Science, Hyderabad, India
- Debashree Bandyopadhyay
  - Department of Biological Sciences, Birla Institute of Technology and Science, Hyderabad, India
5
Williams AH, Zhan CG. Staying Ahead of the Game: How SARS-CoV-2 has Accelerated the Application of Machine Learning in Pandemic Management. BioDrugs 2023; 37:649-674. [PMID: 37464099 DOI: 10.1007/s40259-023-00611-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 05/28/2023] [Indexed: 07/20/2023]
Abstract
In recent years, machine learning (ML) techniques have garnered considerable interest for their potential use in accelerating the rate of drug discovery. With the emergence of the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) pandemic, the utilization of ML has become even more crucial in the search for effective antiviral medications. The pandemic has presented the scientific community with a unique challenge, and the rapid identification of potential treatments has become an urgent priority. Researchers have been able to accelerate the process of identifying drug candidates, repurposing existing drugs, and designing new compounds with desirable properties using machine learning in drug discovery. To train predictive models, ML techniques in drug discovery rely on the analysis of large datasets, including both experimental and clinical data. These models can be used to predict the biological activities, potential side effects, and interactions with specific target proteins of drug candidates. This strategy has proven to be an effective method for identifying potential coronavirus disease 2019 (COVID-19) and other disease treatments. This paper offers a thorough analysis of the various ML techniques implemented to combat COVID-19, including supervised and unsupervised learning, deep learning, and natural language processing. The paper discusses the impact of these techniques on pandemic drug development, including the identification of potential treatments, the understanding of the disease mechanism, and the creation of effective and safe therapeutics. The lessons learned can be applied to future outbreaks and drug discovery initiatives.
Affiliation(s)
- Alexander H Williams
  - Molecular Modeling and Biopharmaceutical Center, University of Kentucky, 789 South Limestone Street, Lexington, KY, 40536, USA
  - Department of Pharmaceutical Sciences, College of Pharmacy, University of Kentucky, 789 South Limestone Street, Lexington, KY, 40536, USA
  - GSK Upper Providence, 1250 S. Collegeville Road, Collegeville, PA, 19426, USA
- Chang-Guo Zhan
  - Molecular Modeling and Biopharmaceutical Center, University of Kentucky, 789 South Limestone Street, Lexington, KY, 40536, USA
  - Department of Pharmaceutical Sciences, College of Pharmacy, University of Kentucky, 789 South Limestone Street, Lexington, KY, 40536, USA
6
Cheng Y, Wang H, Xu H, Liu Y, Ma B, Chen X, Zeng X, Wang X, Wang B, Shiau C, Ovchinnikov S, Su XD, Wang C. Co-evolution-based prediction of metal-binding sites in proteomes by machine learning. Nat Chem Biol 2023; 19:548-555. [PMID: 36593274 DOI: 10.1038/s41589-022-01223-z] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 01/11/2022] [Accepted: 11/08/2022] [Indexed: 01/03/2023]
Abstract
Metal ions have various important biological roles in proteins, including structural maintenance, molecular recognition and catalysis. Previous methods of predicting metal-binding sites in proteomes were based on either sequence or structural motifs. Here we developed a co-evolution-based pipeline named 'MetalNet' to systematically predict metal-binding sites in proteomes. We applied MetalNet to proteomes of four representative prokaryotic species and predicted 4,849 potential metalloproteins, which substantially expands the currently annotated metalloproteomes. We biochemically and structurally validated previously unannotated metal-binding sites in several proteins, including apo-citrate lyase phosphoribosyl-dephospho-CoA transferase citX, an Escherichia coli enzyme lacking structural or sequence homology to any known metalloprotein (Protein Data Bank (PDB) codes: 7DCM and 7DCN). MetalNet also successfully recapitulated all known zinc-binding sites from the human spliceosome complex. The pipeline of MetalNet provides a unique and enabling tool for interrogating the hidden metalloproteome and studying metal biology.
Affiliation(s)
- Yao Cheng
  - Synthetic and Functional Biomolecules Center, Beijing National Laboratory for Molecular Sciences, Key Laboratory of Bioorganic Chemistry and Molecular Engineering of Ministry of Education, Peking University, Beijing, China
  - Department of Chemical Biology, College of Chemistry and Molecular Engineering, Peking University, Beijing, China
- Haobo Wang
  - Synthetic and Functional Biomolecules Center, Beijing National Laboratory for Molecular Sciences, Key Laboratory of Bioorganic Chemistry and Molecular Engineering of Ministry of Education, Peking University, Beijing, China
  - Department of Chemical Biology, College of Chemistry and Molecular Engineering, Peking University, Beijing, China
- Hua Xu
  - State Key Laboratory of Protein and Plant Gene Research, and Biomedical Pioneering Innovation Center (BIOPIC), Peking University, Beijing, China
- Yuan Liu
  - Synthetic and Functional Biomolecules Center, Beijing National Laboratory for Molecular Sciences, Key Laboratory of Bioorganic Chemistry and Molecular Engineering of Ministry of Education, Peking University, Beijing, China
  - Department of Chemical Biology, College of Chemistry and Molecular Engineering, Peking University, Beijing, China
- Bin Ma
  - Synthetic and Functional Biomolecules Center, Beijing National Laboratory for Molecular Sciences, Key Laboratory of Bioorganic Chemistry and Molecular Engineering of Ministry of Education, Peking University, Beijing, China
  - Department of Chemical Biology, College of Chemistry and Molecular Engineering, Peking University, Beijing, China
- Xuemin Chen
  - Synthetic and Functional Biomolecules Center, Beijing National Laboratory for Molecular Sciences, Key Laboratory of Bioorganic Chemistry and Molecular Engineering of Ministry of Education, Peking University, Beijing, China
  - Department of Chemical Biology, College of Chemistry and Molecular Engineering, Peking University, Beijing, China
- Xin Zeng
  - Peking-Tsinghua Center for Life Sciences, Peking University, Beijing, China
- Xianghe Wang
  - Synthetic and Functional Biomolecules Center, Beijing National Laboratory for Molecular Sciences, Key Laboratory of Bioorganic Chemistry and Molecular Engineering of Ministry of Education, Peking University, Beijing, China
  - Department of Chemical Biology, College of Chemistry and Molecular Engineering, Peking University, Beijing, China
- Bo Wang
  - State Key Laboratory of Protein and Plant Gene Research, and Biomedical Pioneering Innovation Center (BIOPIC), Peking University, Beijing, China
- Sergey Ovchinnikov
  - John Harvard Distinguished Science Fellow, Harvard University, Cambridge, MA, USA
- Xiao-Dong Su
  - State Key Laboratory of Protein and Plant Gene Research, and Biomedical Pioneering Innovation Center (BIOPIC), Peking University, Beijing, China
- Chu Wang
  - Synthetic and Functional Biomolecules Center, Beijing National Laboratory for Molecular Sciences, Key Laboratory of Bioorganic Chemistry and Molecular Engineering of Ministry of Education, Peking University, Beijing, China
  - Department of Chemical Biology, College of Chemistry and Molecular Engineering, Peking University, Beijing, China
  - Peking-Tsinghua Center for Life Sciences, Peking University, Beijing, China
7
Chen Z, Liu X, Zhao P, Li C, Wang Y, Li F, Akutsu T, Bain C, Gasser RB, Li J, Yang Z, Gao X, Kurgan L, Song J. iFeatureOmega: an integrative platform for engineering, visualization and analysis of features from molecular sequences, structural and ligand data sets. Nucleic Acids Res 2022; 50:W434-W447. [PMID: 35524557 PMCID: PMC9252729 DOI: 10.1093/nar/gkac351] [Citation(s) in RCA: 22] [Impact Index Per Article: 11.0] [Received: 03/21/2022] [Revised: 04/22/2022] [Accepted: 04/25/2022] [Indexed: 01/07/2023] Open
Abstract
The rapid accumulation of molecular data motivates development of innovative approaches to computationally characterize sequences, structures and functions of biological and chemical molecules in an efficient, accessible and accurate manner. Notwithstanding several computational tools that characterize protein or nucleic acids data, there are no one-stop computational toolkits that comprehensively characterize a wide range of biomolecules. We address this vital need by developing a holistic platform that generates features from sequence and structural data for a diverse collection of molecule types. Our freely available and easy-to-use iFeatureOmega platform generates, analyzes and visualizes 189 representations for biological sequences, structures and ligands. To the best of our knowledge, iFeatureOmega provides the largest scope when directly compared to the current solutions, in terms of the number of feature extraction and analysis approaches and coverage of different molecules. We release three versions of iFeatureOmega including a webserver, command line interface and graphical interface to satisfy needs of experienced bioinformaticians and less computer-savvy biologists and biochemists. With the assistance of iFeatureOmega, users can encode their molecular data into representations that facilitate construction of predictive models and analytical studies. We highlight benefits of iFeatureOmega based on three research applications, demonstrating how it can be used to accelerate and streamline research in bioinformatics, computational biology, and cheminformatics areas. The iFeatureOmega webserver is freely available at http://ifeatureomega.erc.monash.edu and the standalone versions can be downloaded from https://github.com/Superzchen/iFeatureOmega-GUI/ and https://github.com/Superzchen/iFeatureOmega-CLI/.
Affiliation(s)
- Zhen Chen
  - Collaborative Innovation Center of Henan Grain Crops, Henan Agricultural University, Zhengzhou 450046, China
  - Center for Crop Genome Engineering, Henan Agricultural University, Zhengzhou 450046, China
- Xuhan Liu
  - Drug Discovery and Safety, Leiden Academic Centre for Drug Research, Einsteinweg 55, Leiden 2333 CC, The Netherlands
- Pei Zhao
  - State Key Laboratory of Cotton Biology, Institute of Cotton Research of Chinese Academy of Agricultural Sciences (CAAS), Anyang 455000, China
- Chen Li
  - Monash Biomedicine Discovery Institute and Department of Biochemistry and Molecular Biology, Monash University, Melbourne, Victoria 3800, Australia
- Yanan Wang
  - Monash Biomedicine Discovery Institute and Department of Biochemistry and Molecular Biology, Monash University, Melbourne, Victoria 3800, Australia
- Fuyi Li
  - Monash Biomedicine Discovery Institute and Department of Biochemistry and Molecular Biology, Monash University, Melbourne, Victoria 3800, Australia
- Tatsuya Akutsu
  - Bioinformatics Center, Institute for Chemical Research, Kyoto University, Kyoto 611-0011, Japan
- Chris Bain
  - Monash Data Future Institutes, Monash University, Melbourne, Victoria 3800, Australia
- Robin B Gasser
  - Department of Veterinary Biosciences, Melbourne Veterinary School, The University of Melbourne, Parkville, Victoria 3010, Australia
- Junzhou Li
  - Collaborative Innovation Center of Henan Grain Crops, Henan Agricultural University, Zhengzhou 450046, China
- Zuoren Yang
  - State Key Laboratory of Cotton Biology, Institute of Cotton Research of Chinese Academy of Agricultural Sciences (CAAS), Anyang 455000, China
- Xin Gao
  - Computational Bioscience Research Center (CBRC), Computer, Electrical and Mathematical Sciences and Engineering Division, King Abdullah University of Science and Technology (KAUST), Thuwal 23955, Saudi Arabia
- Lukasz Kurgan
  - Department of Computer Science, Virginia Commonwealth University, Richmond, VA, USA
- Jiangning Song
  - Monash Biomedicine Discovery Institute and Department of Biochemistry and Molecular Biology, Monash University, Melbourne, Victoria 3800, Australia
  - Monash Data Future Institutes, Monash University, Melbourne, Victoria 3800, Australia
8
Roy P, Bhattacharyya D. MetBP: A Software Tool for Detection of Interaction between Metal Ion-RNA Base Pairs. Bioinformatics 2022; 38:3833-3834. [PMID: 35695777 DOI: 10.1093/bioinformatics/btac392] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/17/2022] [Revised: 05/09/2022] [Accepted: 06/10/2022] [Indexed: 11/14/2022] Open
Abstract
MOTIVATION: The role of metals in shaping and functioning of RNA is a well established fact and the understanding of that through the analysis of structural data has biological relevance. Often metal ions bind to one or more atoms of the nucleobase of an RNA. This fact becomes more interesting when such bases form a base pair with any other base. Furthermore, when metal ions bind to any residue of an RNA, the secondary structural features of the residue (helix, loop, unpaired etc) are also biologically important. The available metal binding related software tools cannot address such type specific queries. RESULTS: To fill this limitation, we have designed a software tool, called MetBP, that meets the goal. This tool is a stand-alone command line based tool and has no dependency on the other existing software. It accepts a structure file in mmCIF or PDB format and computes the base pairs and thereafter reports all metals that bind to one or more nucleotides that form pairs with another. It reports binding distance, angles along with base pair stability. It also reports several other important aspects, e.g. secondary structure of the residue in the RNA. MetBP can be used as a generalized metal binding site detection tool for Proteins and DNA as well. AVAILABILITY: https://github.com/computational-biology/metbp. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Parthajit Roy
  - The Department of Computer Science, The University of Burdwan, Burdwan 713104, West Bengal, India
9
Aptekmann AA, Buongiorno J, Giovannelli D, Glamoclija M, Ferreiro DU, Bromberg Y. mebipred: identifying metal binding potential in protein sequence. Bioinformatics 2022; 38:3532-3540. [PMID: 35639953 PMCID: PMC9272798 DOI: 10.1093/bioinformatics/btac358] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 10/06/2021] [Revised: 03/27/2022] [Accepted: 05/22/2022] [Indexed: 11/23/2022] Open
Abstract
Motivation: Metal-binding proteins have a central role in maintaining life processes. Nearly one-third of known protein structures contain metal ions that are used for a variety of needs, such as catalysis, DNA/RNA binding, protein structure stability, etc. Identifying metal-binding proteins is thus crucial for understanding the mechanisms of cellular activity. However, experimental annotation of protein metal-binding potential is severely lacking, while computational techniques are often imprecise and of limited applicability. Results: We developed a novel machine learning-based method, mebipred, for identifying metal-binding proteins from sequence-derived features. This method is over 80% accurate in recognizing proteins that bind metal ion-containing ligands; the specific identity of 11 ubiquitously present metal ions can also be annotated. mebipred is reference-free, i.e. no sequence alignments are involved, and is thus faster than alignment-based methods; it is also more accurate than other sequence-based prediction methods. Additionally, mebipred can identify protein metal-binding capabilities from short sequence stretches, e.g. translated sequencing reads, and, thus, may be useful for the annotation of metal requirements of metagenomic samples. We performed an analysis of available microbiome data and found that ocean, hot spring sediments and soil microbiomes use a more diverse set of metals than human host-related ones. For human microbiomes, physiological conditions explain the observed metal preferences. Similarly, subtle changes in ocean sample ion concentration affect the abundance of relevant metal-binding proteins. These results highlight mebipred's utility in analyzing microbiome metal requirements. Availability and implementation: mebipred is available as a web server at services.bromberglab.org/mebipred and as a standalone package at https://pypi.org/project/mymetal/. Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- A A Aptekmann
  - Department of Biochemistry and Microbiology, Rutgers University, 76 Lipman Dr, New Brunswick, NJ, 08873, USA
  - Institute of Marine and Coastal Sciences, Rutgers University, New Brunswick, NJ, 08901, USA
- D Giovannelli
  - Institute of Marine and Coastal Sciences, Rutgers University, New Brunswick, NJ, 08901, USA
  - Department of Biology, University of Naples Federico II, Naples, Italy
  - Institute for Marine Biological Resources and Biotechnology-IRBIM, National Research Council of Italy, CNR, Ancona, Italy
- M Glamoclija
  - Department of Earth and Environmental Sciences, Rutgers University, New Brunswick, NJ, 07102, USA
- D U Ferreiro
  - Protein Physiology Lab, Departamento de Quimica Biologica, Facultad de Ciencias Exactas y Naturales, Universidad de Buenos Aires-CONICET-IQUIBICEN, Buenos Aires, 1428, Argentina
- Y Bromberg
  - Department of Biochemistry and Microbiology, Rutgers University, 76 Lipman Dr, New Brunswick, NJ, 08873, USA
10
Keßler M, Wittig I, Ackermann J, Koch I. Prediction and analysis of redox-sensitive cysteines using machine learning and statistical methods. Biol Chem 2021; 402:925-935. [PMID: 34261205 DOI: 10.1515/hsz-2020-0321] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 05/10/2020] [Accepted: 12/07/2020] [Indexed: 12/19/2022]
Abstract
Reactive oxygen species are produced by a number of stimuli and can lead both to irreversible intracellular damage and signaling through reversible post-translational modification. It is unclear which factors contribute to the sensitivity of cysteines to redox modification. Here, we used statistical and machine learning methods to investigate the influence of different structural and sequence features on the modifiability of cysteines. We found several strong structural predictors for redox modification. Sensitive cysteines tend to be characterized by higher exposure, a lack of secondary structure elements, and a high number of positively charged amino acids in their close environment. Our results indicate that modified cysteines tend to occur close to other post-translational modifications, such as phosphorylated serines. We used these features to create models and predict the presence of redox-modifiable cysteines in human mitochondrial complex I as well as make novel predictions regarding redox-sensitive cysteines in proteins.
Affiliation(s)
- Marcus Keßler
  - Molecular Bioinformatics Group, Institute of Computer Science, Goethe-University, Robert-Mayer-Str. 11-15, 60325, Frankfurt am Main, Germany
- Ilka Wittig
  - Functional Proteomics Group, Medical School, Goethe-University, Theodor-Stern-Kai 7, 60590, Frankfurt am Main, Germany
- Jörg Ackermann
  - Molecular Bioinformatics Group, Institute of Computer Science, Goethe-University, Robert-Mayer-Str. 11-15, 60325, Frankfurt am Main, Germany
- Ina Koch
  - Molecular Bioinformatics Group, Institute of Computer Science, Goethe-University, Robert-Mayer-Str. 11-15, 60325, Frankfurt am Main, Germany
11
Meng C, Wang K, Zhang X, Zhu X. Purification and structure analysis of zinc-binding protein from Mizuhopecten yessoensis. J Food Biochem 2021; 45:e13756. [PMID: 33993503 DOI: 10.1111/jfbc.13756] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 02/25/2021] [Revised: 03/31/2021] [Accepted: 04/19/2021] [Indexed: 11/27/2022]
Abstract
Zn-binding protein was obtained after purification from scallops (Mizuhopecten yessoensis) using gel permeation and ion-exchange chromatography. Amino acid determination showed that the cysteine of the zinc-binding protein accounted for one-third of the total amino acids, which is a typical feature of metallothionein (MT). The spectra of Fourier Transform Infrared Spectroscopy (FTIR) and Circular Dichroism (CD) were analyzed to predict the secondary structure information of zinc-binding protein: the α-helix was 46.55%, the β-sheets was 27.07%, the random coil was 16.48%, and the β-turns was 9.89%. Using a commercial kit to measure its antioxidant activity in vitro, the result showed that it had good scavenging ability to 1,1-diphenyl-2-picrylhydrazyl (DPPH), hydroxyl radical (·OH), and reducing the ability to ferrous iron ions. With the process provided by this study, zinc-binding protein can be prepared in large quantities, which is the basis for its future commercialization. PRACTICAL APPLICATIONS: According to the extraction and purification process established in this study, a large amount of zinc-bound MT from the viscera of scallops can be obtained. And the zinc-bound MT had good antioxidant activity. In addition, the yield of each purification step has been calculated. The zinc-bound MTs from scallops' viscera can be prepared in large quantities by directly using the process in this manuscript or by equal magnification of this process. In the future, large-scale production can be considered to increase the economic value of scallops' viscera.
Affiliation(s)
- Chunying Meng
- School of Food Science and Biotechnology, Zhejiang Gongshang University, Hangzhou, P.R. China
- Laboratory of Aquatic Product Processing and Quality Safety, Zhejiang Marine Fisheries Research Institute, Zhoushan, P.R. China
- Kuiwu Wang
- School of Food Science and Biotechnology, Zhejiang Gongshang University, Hangzhou, P.R. China
- Xiaojun Zhang
- Laboratory of Aquatic Product Processing and Quality Safety, Zhejiang Marine Fisheries Research Institute, Zhoushan, P.R. China
- Xinyue Zhu
- School of Food Science and Biotechnology, Zhejiang Gongshang University, Hangzhou, P.R. China
|
12
|
Garg A, Pal D. Inferring metal binding sites in flexible regions of proteins. Proteins 2021; 89:1125-1133. [PMID: 33864411 DOI: 10.1002/prot.26085] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 01/12/2021] [Revised: 03/15/2021] [Accepted: 04/12/2021] [Indexed: 12/20/2022]
Abstract
Metal ions are central to the molecular function of many proteins, so knowing where they bind in experimentally determined structures is important; however, such structures often lose bound metal ions during sample preparation. Identifying these metal-binding sites becomes difficult when the receptor is novel and/or its conformation differs between the bound and unbound states. Locating such sites in theoretical models also poses a challenge due to the uncertainties of side-chain modeling. We address the problem by employing the Geometric Hashing algorithm to create a template library of functionally important binding sites and match query structures with the available templates. The matching is done on the structure ensemble obtained from coarse-grained molecular dynamics simulation, where metal-specific amino acids are screened to infer the true site. Tests on 1347 non-redundant monomer protein structures show that Ca2+, Zn2+, Mg2+, Cu2+, and Fe3+ binding site residues can be classified at 0.92, 0.95, 0.80, 0.90, and 0.92 aggregate performance (out of 1), respectively, across all possible thresholds. The performance for Ca2+ and Zn2+ is notably superior to that of state-of-the-art methods such as IonCom and MIB. Specific case studies show that additionally predicted metal-binding site residues in proteins have features necessary for ion binding. These include new sites not predicted by other methods. The use of coarse-grained dynamics thus provides a generalized approach to improve metal-binding site prediction. The work is expected to improve our ability to correctly predict protein molecular function where knowledge of metal binding is a key requirement.
Affiliation(s)
- Aditi Garg
- Department of Computational and Data Sciences, Indian Institute of Science, Bengaluru, India
- Debnath Pal
- Department of Computational and Data Sciences, Indian Institute of Science, Bengaluru, India
|
13
|
AbdelGawwad MR, Mahmutović E, Al Farraj DA, Elshikh MS. In silico prediction of silver nitrate nanoparticles and Nitrate Reductase A (NAR A) interaction in the treatment of infectious disease causing clinical strains of E. coli. J Infect Public Health 2020; 13:1580-1585. [PMID: 32855089 DOI: 10.1016/j.jiph.2020.08.004] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/20/2020] [Revised: 07/28/2020] [Accepted: 08/13/2020] [Indexed: 10/23/2022] Open
Abstract
BACKGROUND The interaction of specially designed nanoparticles with proteins is the basis for the formation of a nanoparticle-protein corona. Silver nanoparticles and molecules have been used in many fields due to their strong antimicrobial activity against pathogenic microorganisms such as bacteria, yeast and fungi. E. coli is a Gram-negative bacterium whose genome has been completely sequenced and for which the majority of protein 3D structures have been determined. Nitrate Reductase A is a cellular protein, found in many bacteria, that uses nitrate as an electron acceptor during anaerobic growth. The enzyme is composed of three different chains, α, β and γ, all of which contain metal-binding regions and domains. METHODS Bioinformatics tools were used to investigate the structure, domains, interactomes, and docking sites of E. coli Nitrate Reductase A in order to predict the possible site of interaction of silver nitrate (AgNO3) with the protein. The 3D structure of the NAR A protein was predicted with the Phyre2 protein modeling software. The structures generated by Phyre2 were validated and evaluated by analysis of Ramachandran plots using the RAMPAGE online software. To understand the evolutionary relationships between the subunits of Nitrate Reductase A, a phylogenetic tree was constructed using Phylogeny.fr. RESULTS All cysteine and histidine residues in the amino acid sequences were identified, the 3D structures of the subunits were predicted together with Ramachandran plots, and the electrostatic potential was computed using various bioinformatics tools. The reactive cationic property of the silver ion leads to attachment to specific anionic regions and active sites of the three subunits, causing deactivation of nitrate reductase in many prokaryotic cells. The results showed the possible sites of attachment of silver ions and their reactivity with domains that have metal-binding properties.
In silico analysis of silver nanoparticles and nitrate reductase is helpful for treating various infectious diseases caused by E. coli.
Affiliation(s)
- Mohamed Ragab AbdelGawwad
- Genetics and Bioengineering, Faculty of Engineering and Natural Sciences, International University of Sarajevo, Sarajevo, Bosnia and Herzegovina
- Ensar Mahmutović
- Genetics and Bioengineering, Faculty of Engineering and Natural Sciences, International University of Sarajevo, Sarajevo, Bosnia and Herzegovina
- Dunia A Al Farraj
- Department of Botany and Microbiology, College of Sciences, King Saud University, Riyadh 11451, Saudi Arabia
- Mohamed Soliman Elshikh
- Department of Botany and Microbiology, College of Sciences, King Saud University, Riyadh 11451, Saudi Arabia
|
14
|
Determination of interactions between antibody biotherapeutics and copper by size exclusion chromatography (SEC) coupled with inductively coupled plasma mass spectrometry (ICP/MS). Anal Chim Acta 2019; 1079:252-259. [DOI: 10.1016/j.aca.2019.06.047] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 01/30/2019] [Revised: 06/18/2019] [Accepted: 06/23/2019] [Indexed: 01/29/2023]
|
15
|
Yan R, Wang X, Tian Y, Xu J, Xu X, Lin J. Prediction of zinc-binding sites using multiple sequence profiles and machine learning methods. Mol Omics 2019; 15:205-215. [PMID: 31046040 DOI: 10.1039/c9mo00043g] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Indexed: 01/11/2023]
Abstract
The zinc (Zn2+) cofactor has been proven to be involved in numerous biological mechanisms, and zinc binding is recognized as one of the most important post-translational modifications in proteins. Therefore, accurate knowledge of zinc ions in protein structures can provide potential clues for elucidation of protein folding and functions. However, determining zinc-binding residues by experimental means is usually lab-intensive and associated with high cost in most cases. In this context, the development of computational tools for identifying zinc-binding sites is highly desired, especially in the current post-genomic era. In this work, we developed a novel zinc-binding site prediction method by combining several intensively trained machine learning models. To establish an accurate and generative method, we downloaded all zinc-binding proteins from the Protein Data Bank and prepared a non-redundant dataset. A well-prepared dataset from other groups was also used. Effective and complementary features were then extracted from the sequences and three-dimensional structures of these proteins, and several well-designed machine learning models were intensively trained to construct accurate models. To assess performance, the obtained predictors were stringently benchmarked on diverse zinc-binding sites. Furthermore, several state-of-the-art in silico methods developed specifically for zinc-binding sites were also evaluated and compared. The results confirmed that our method is very competitive in real-world applications and could become a complementary tool to wet lab experiments. To facilitate research in the community, a web server and stand-alone program implementing our method were constructed and are publicly available at . The downloadable program of our method can be easily used for the high-throughput screening of potential zinc-binding sites across proteomes.
Affiliation(s)
- Renxiang Yan
- School of Biological Sciences and Engineering, Fuzhou University, Fuzhou 350002, China
- Fujian Key Laboratory of Marine Enzyme Engineering, Fuzhou 350002, China
- Xiaofeng Wang
- College of Mathematics and Computer Science, Shanxi Normal University, Linfen 041004, China
- Yarong Tian
- Institute of Biomedicine, Sahlgrenska Academy, University of Gothenburg, 40530, Sweden
- Jing Xu
- School of Biological Sciences and Engineering, Fuzhou University, Fuzhou 350002, China
- Fujian Key Laboratory of Marine Enzyme Engineering, Fuzhou 350002, China
- Xiaoli Xu
- School of Biological Sciences and Engineering, Fuzhou University, Fuzhou 350002, China
- Juan Lin
- School of Biological Sciences and Engineering, Fuzhou University, Fuzhou 350002, China
- Fujian Key Laboratory of Marine Enzyme Engineering, Fuzhou 350002, China
|
16
|
Haberal İ, Oğul H. Prediction of Protein Metal Binding Sites Using Deep Neural Networks. Mol Inform 2019; 38:e1800169. [PMID: 30977960 DOI: 10.1002/minf.201800169] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Received: 12/19/2018] [Accepted: 03/29/2019] [Indexed: 11/06/2022]
Abstract
Metals play crucial roles in many physiological, pathological, and diagnostic processes. Metal-binding proteins, or metalloproteins, are important for metabolic functions. The three-dimensional structure that a protein reaches by folding determines which vital function it fulfills. Predicting metal binding in proteins is a step in function assignment for new proteins; it helps to obtain functional proteins in genomic studies and is critical to protein function annotation and drug discovery. Computational predictions made with machine learning methods from amino acid sequence data are widely used in protein metal-binding prediction and in other areas of bioinformatics. In this work, we present three deep learning architectures for predicting the metal binding of histidine (HIS) and cysteine (CYS) residues: a 2D Convolutional Neural Network, a Long Short-Term Memory network, and a Recurrent Neural Network. They are compared on three sets of features derived from a public dataset of protein sequences, obtained using the PAM scoring matrix, a protein composition server, and a binary representation method. The results show that the Convolutional Neural Network architecture gives better performance for the prediction of protein metal-binding sites.
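The binary representation mentioned in this abstract can be illustrated with a short sketch. This is not the authors' code: the window half-width, the zero-padding at sequence ends, and the function names `candidate_sites` and `one_hot_window` are assumptions chosen for illustration only.

```python
# Hypothetical sketch: one-hot ("binary") encoding of a sequence window
# centered on each candidate histidine or cysteine residue.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def candidate_sites(sequence, residues="HC"):
    """Return 0-based positions of residues that may bind metal (His, Cys)."""
    return [i for i, aa in enumerate(sequence) if aa in residues]


def one_hot_window(sequence, center, half_width=3):
    """Encode the window around `center` as a flat list of 0/1 values.

    Positions outside the sequence, and non-standard residues,
    are encoded as all zeros (an assumption of this sketch).
    """
    encoded = []
    for pos in range(center - half_width, center + half_width + 1):
        row = [0] * len(AMINO_ACIDS)
        if 0 <= pos < len(sequence):
            idx = AMINO_ACIDS.find(sequence[pos])
            if idx >= 0:
                row[idx] = 1
        encoded.extend(row)
    return encoded


seq = "MKCAHLLCG"  # made-up example sequence
sites = candidate_sites(seq)
features = {i: one_hot_window(seq, i) for i in sites}
```

Each candidate residue yields a fixed-length vector (here 7 positions × 20 residue types), which is the kind of input a convolutional or recurrent network could consume.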
Affiliation(s)
- İsmail Haberal
- Department of Computer Engineering, Başkent University, Fatih Sultan Mahallesi Eskişehir Yolu 18. km, 06790, Etimesgut, Ankara, Turkey
- Hasan Oğul
- Department of Computer Engineering, Başkent University, Fatih Sultan Mahallesi Eskişehir Yolu 18. km, 06790, Etimesgut, Ankara, Turkey
- Faculty of Computer Sciences, Østfold University College, Halden, Norway
|
17
|
Atanasova P, Hoffmann RC, Stitz N, Sanctis S, Burghard Z, Bill J, Schneider JJ, Eiben S. Engineered nanostructured virus/ZnO hybrid materials with dedicated functional properties. BIOINSPIRED BIOMIMETIC AND NANOBIOMATERIALS 2019. [DOI: 10.1680/jbibn.18.00006] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Indexed: 01/07/2023]
Affiliation(s)
- Petia Atanasova
- Institute for Materials Science, University of Stuttgart, Stuttgart, Germany
- Rudolf C Hoffmann
- Eduard-Zintl-Institut für Anorganische und Physikalische Chemie, Technische Universität Darmstadt, Darmstadt, Germany
- Nina Stitz
- Institute for Materials Science, University of Stuttgart, Stuttgart, Germany
- Shawn Sanctis
- Eduard-Zintl-Institut für Anorganische und Physikalische Chemie, Technische Universität Darmstadt, Darmstadt, Germany
- Zaklina Burghard
- Institute for Materials Science, University of Stuttgart, Stuttgart, Germany
- Joachim Bill
- Institute for Materials Science, University of Stuttgart, Stuttgart, Germany
- Jörg J Schneider
- Eduard-Zintl-Institut für Anorganische und Physikalische Chemie, Technische Universität Darmstadt, Darmstadt, Germany
- Sabine Eiben
- Institute of Biomaterials and Biological Systems, University of Stuttgart, Stuttgart, Germany
|
18
|
Qiao L, Xie D. MIonSite: Ligand-specific prediction of metal ion-binding sites via enhanced AdaBoost algorithm with protein sequence information. Anal Biochem 2019; 566:75-88. [DOI: 10.1016/j.ab.2018.11.009] [Citation(s) in RCA: 16] [Impact Index Per Article: 3.2] [Received: 05/06/2018] [Revised: 10/15/2018] [Accepted: 11/07/2018] [Indexed: 11/24/2022]
|
19
|
Shaik NA, Awan ZA, Verma PK, Elango R, Banaganapalli B. Protein phenotype diagnosis of autosomal dominant calmodulin mutations causing irregular heart rhythms. J Cell Biochem 2018; 119:8233-8248. [PMID: 29932249 DOI: 10.1002/jcb.26834] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.2] [Received: 01/02/2018] [Accepted: 03/09/2018] [Indexed: 12/21/2022]
Abstract
The life-threatening group of irregular cardiac rhythm disorders known as Cardiac Arrhythmias (CA) are caused by mutations in highly conserved Calmodulin (CALM/CaM) genes. Herein, we present a multidimensional approach to diagnose changes in the phenotypic, stability, and Ca2+ ion binding properties of CA-causing mutations. Mutation pathogenicity was determined by diverse computational machine learning approaches. We further modeled the mutations in the 3D protein structure and analyzed residue-level phenotype plasticity. We also examined the influence of torsion angles, number of H-bonds, and free energy dynamics on the stability, the near-native simulation dynamics of residue fluctuations, and the Ca2+ ion binding potential of CaM mutants. Our study recommends using the M-CAP method for measuring the pathogenicity of CA-causing CaM variants. Interestingly, most CA-causing variants we analyzed exist in either the third (V/H-96, S/I-98, V-103) or fourth (G/V-130, V/E/H-132, H-134, P-136, G-141, and L-142) EF-hand, located in the carboxyl domain of the CaM molecule. We observed that the minor structural fluctuations caused by these variants are likely tolerable owing to the highly flexible nature of calmodulin's globular domains. However, our molecular docking results support that these variants disturb the affinity of CaM toward Ca2+ ions and corroborate previous findings from functional studies. Taken together, these computational findings can explain the molecular reasons for subtle changes in the structure, flexibility, and stability of the mutant CaM molecule. Our comprehensive molecular scanning approach demonstrates the utility of computational methods for quick preliminary screening of CA-CaM mutations before undertaking time-consuming and complicated functional laboratory assays.
Affiliation(s)
- Noor A Shaik
- Department of Genetic Medicine, Faculty of Medicine, King Abdulaziz University, Jeddah, Saudi Arabia
- Princess Al-Jawhara Al-Brahim Centre of Excellence in Research of Hereditary Disorders (PACER-HD), King Abdulaziz University, Jeddah, Saudi Arabia
- Zuhier A Awan
- Department of Clinical Biochemistry, Faculty of Medicine, King Abdulaziz University, Jeddah, Saudi Arabia
- Prashant K Verma
- Department of Genetic Medicine, Faculty of Medicine, King Abdulaziz University, Jeddah, Saudi Arabia
- Ramu Elango
- Department of Genetic Medicine, Faculty of Medicine, King Abdulaziz University, Jeddah, Saudi Arabia
- Princess Al-Jawhara Al-Brahim Centre of Excellence in Research of Hereditary Disorders (PACER-HD), King Abdulaziz University, Jeddah, Saudi Arabia
- Babajan Banaganapalli
- Department of Genetic Medicine, Faculty of Medicine, King Abdulaziz University, Jeddah, Saudi Arabia
- Princess Al-Jawhara Al-Brahim Centre of Excellence in Research of Hereditary Disorders (PACER-HD), King Abdulaziz University, Jeddah, Saudi Arabia
|
20
|
Srivastava A, Kumar M. Prediction of zinc binding sites in proteins using sequence derived information. J Biomol Struct Dyn 2018; 36:4413-4423. [PMID: 29241411 DOI: 10.1080/07391102.2017.1417910] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Indexed: 01/30/2023]
Abstract
Zinc is one of the most abundant catalytic cofactors and also an important structural component of a large number of metalloproteins. Hence, prediction of zinc-binding sites in proteins can be a significant step in annotating the molecular function of a large number of proteins. The majority of existing methods for zinc-binding site prediction are based on a dataset of proteins that was compiled nearly a decade ago. Hence, there is a need to develop a zinc-binding site prediction system using current, updated data that include recently added proteins. Herein, we propose a support vector machine-based method, named ZincBinder, for prediction of zinc-binding sites in a protein using sequence profile information. The predictor was trained using a fivefold cross-validation approach and achieved 85.37% sensitivity with 86.20% specificity during training. Benchmarking on an independent non-redundant dataset, which was not used during training, showed better performance of ZincBinder vis-à-vis existing methods. Executable versions, source code, sample datasets, and usage instructions are available at http://proteininformatics.org/mkumar/znbinder/.
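For readers less familiar with the measures reported in this abstract, sensitivity and specificity (and the related accuracy and Matthews correlation coefficient) follow directly from confusion-matrix counts. The sketch below is illustrative only: the function name `binary_metrics` is hypothetical and the counts are made up, not taken from the ZincBinder study.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Compute standard binary-classification metrics from raw counts."""
    sensitivity = tp / (tp + fn)  # fraction of true binding residues recovered
    specificity = tn / (tn + fp)  # fraction of non-binding residues rejected
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # MCC is undefined when any marginal is zero; this sketch returns 0.0 then.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "mcc": mcc,
    }


# Made-up counts for illustration only.
metrics = binary_metrics(tp=8, fp=2, tn=8, fn=2)
```

In a fivefold cross-validation setting, such counts would typically be accumulated over the five held-out folds before the metrics are computed.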
Affiliation(s)
- Abhishikha Srivastava
- Department of Biophysics, University of Delhi South Campus, Benito Juarez Road, New Delhi 110021, India
- Manish Kumar
- Department of Biophysics, University of Delhi South Campus, Benito Juarez Road, New Delhi 110021, India
|
21
|
Kumar S. Prediction of Metal Ion Binding Sites in Proteins from Amino Acid Sequences by Using Simplified Amino Acid Alphabets and Random Forest Model. Genomics Inform 2017; 15:162-169. [PMID: 29307143 PMCID: PMC5769865 DOI: 10.5808/gi.2017.15.4.162] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.1] [Received: 10/16/2017] [Revised: 11/16/2017] [Accepted: 11/16/2017] [Indexed: 11/20/2022] Open
Abstract
Metal ions in metal-binding proteins (metalloproteins) are important for protein stability and also serve as cofactors in various functions such as controlling metabolism, regulating signal transport, and metal homeostasis. In structural genomics, prediction of metal-binding proteins helps in selecting a suitable growth medium for overexpression studies and in obtaining the functional protein. Computational prediction using machine learning has been widely used in various fields of bioinformatics, on the premise that all the necessary information is contained in the amino acid sequence. In this study, random forest machine learning prediction systems were deployed with simplified amino acid alphabets to predict binding sites of individual major metal ions: copper, calcium, cobalt, iron, magnesium, manganese, nickel, and zinc.
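A simplified amino acid alphabet maps the 20 residues onto a smaller set of classes before features are computed. The sketch below shows the idea under stated assumptions: the six-group physicochemical partition in `REDUCED_GROUPS` is a common textbook-style grouping chosen for illustration, not the alphabet used in the study, and the function names are hypothetical.

```python
from collections import Counter

# Assumed grouping for illustration; the study's actual alphabets may differ.
REDUCED_GROUPS = {
    "hydrophobic": "AVLIMC",
    "aromatic": "FWY",
    "polar": "STNQ",
    "positive": "KRH",
    "negative": "DE",
    "special": "GP",
}
AA_TO_GROUP = {aa: group for group, members in REDUCED_GROUPS.items() for aa in members}


def reduce_sequence(sequence):
    """Translate each residue to its reduced-alphabet class."""
    return [AA_TO_GROUP.get(aa, "unknown") for aa in sequence]


def group_composition(sequence):
    """Fraction of residues in each class; a simple fixed-length feature vector."""
    if not sequence:
        return {group: 0.0 for group in REDUCED_GROUPS}
    counts = Counter(reduce_sequence(sequence))
    return {group: counts.get(group, 0) / len(sequence) for group in REDUCED_GROUPS}


composition = group_composition("HDEC")  # made-up four-residue window
```

Composition vectors of this kind have one entry per class regardless of sequence length, which makes them convenient inputs for a random forest classifier.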
Affiliation(s)
- Suresh Kumar
- Department of Diagnostic and Allied Health Sciences, Faculty of Health and Life Sciences, Management and Science University, 40100 Shah Alam, Malaysia
|
22
|
Petyaev IM, Alekseev KP, Tsibezov VV, Kostina LV, Kozlov AY, Kyle NH, Bashmakov YK. Structural Organization of 6B9 Molecule, a Monoclonal Antibody Against Lycopene. Monoclon Antib Immunodiagn Immunother 2017; 36:259-263. [PMID: 29267147 DOI: 10.1089/mab.2017.0041] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/12/2022] Open
Abstract
Full cDNA and corresponding amino acid (AA) sequences of the 6B9 monoclonal antibody (mAb) against lycopene were obtained using Step-Out RACE technology. Variable (V) and constant (C) regions were identified. The light chain of 6B9 contained 238 AA IgM with the highest level of identity (0.93) to both the anti-VEGF receptor antibody and anti-collagen type II FAb CIIC1. The heavy chain was composed of 634 AA with a high identity (0.9) to the Ig mu chain C region. Potential posttranslational modification regions in both chains were identified along with disulfide bond sites. The obtained information can be used for making chimeric constructs containing 6B9 mAb (or its fragments) and lycopene, a powerful carotenoid with antioxidant as well as antiproliferative properties, which can be implemented in the treatment of an aggressive form of prostate cancer and possibly other malignancies.
Affiliation(s)
- Ivan M Petyaev
- Lycotec Ltd., Granta Park Campus, Cambridge, United Kingdom
- Konstantin P Alekseev
- Gamaleya Research Center of Epidemiology and Microbiology, Ministry of Health, Moscow, Russia
- Valeriy V Tsibezov
- Gamaleya Research Center of Epidemiology and Microbiology, Ministry of Health, Moscow, Russia
- Ludmila V Kostina
- Gamaleya Research Center of Epidemiology and Microbiology, Ministry of Health, Moscow, Russia
- Alexey Y Kozlov
- Gamaleya Research Center of Epidemiology and Microbiology, Ministry of Health, Moscow, Russia
- Nigel H Kyle
- Lycotec Ltd., Granta Park Campus, Cambridge, United Kingdom
|
23
|
Wang H, Chen X, Li C, Liu Y, Yang F, Wang C. Sequence-Based Prediction of Cysteine Reactivity Using Machine Learning. Biochemistry 2017; 57:451-460. [DOI: 10.1021/acs.biochem.7b00897] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.4] [Indexed: 01/04/2023]
Affiliation(s)
- Haobo Wang
- Synthetic and Functional Biomolecules Center, Beijing National Laboratory for Molecular Sciences, Key Laboratory of Bioorganic Chemistry and Molecular Engineering of Ministry of Education, Peking University, Beijing 100871, China
- Department of Chemical Biology, College of Chemistry and Molecular Engineering, Peking University, Beijing 100871, China
- Xuemin Chen
- Synthetic and Functional Biomolecules Center, Beijing National Laboratory for Molecular Sciences, Key Laboratory of Bioorganic Chemistry and Molecular Engineering of Ministry of Education, Peking University, Beijing 100871, China
- Department of Chemical Biology, College of Chemistry and Molecular Engineering, Peking University, Beijing 100871, China
- Can Li
- Department of Chemical Engineering, Tsinghua University, Beijing 100084, China
- Peking-Tsinghua Center for Life Sciences, Peking University, Beijing 100871, China
- Yuan Liu
- Synthetic and Functional Biomolecules Center, Beijing National Laboratory for Molecular Sciences, Key Laboratory of Bioorganic Chemistry and Molecular Engineering of Ministry of Education, Peking University, Beijing 100871, China
- Department of Chemical Biology, College of Chemistry and Molecular Engineering, Peking University, Beijing 100871, China
- Peking-Tsinghua Center for Life Sciences, Peking University, Beijing 100871, China
- Fan Yang
- Synthetic and Functional Biomolecules Center, Beijing National Laboratory for Molecular Sciences, Key Laboratory of Bioorganic Chemistry and Molecular Engineering of Ministry of Education, Peking University, Beijing 100871, China
- Department of Chemical Biology, College of Chemistry and Molecular Engineering, Peking University, Beijing 100871, China
- Chu Wang
- Synthetic and Functional Biomolecules Center, Beijing National Laboratory for Molecular Sciences, Key Laboratory of Bioorganic Chemistry and Molecular Engineering of Ministry of Education, Peking University, Beijing 100871, China
- Department of Chemical Biology, College of Chemistry and Molecular Engineering, Peking University, Beijing 100871, China
- Peking-Tsinghua Center for Life Sciences, Peking University, Beijing 100871, China
|
24
|
Chen P, Hu S, Zhang J, Gao X, Li J, Xia J, Wang B. A Sequence-Based Dynamic Ensemble Learning System for Protein Ligand-Binding Site Prediction. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2016; 13:901-912. [PMID: 26661785 DOI: 10.1109/tcbb.2015.2505286] [Citation(s) in RCA: 26] [Impact Index Per Article: 3.3] [Indexed: 06/05/2023]
Abstract
BACKGROUND Proteins have the fundamental ability to selectively bind to other molecules and perform specific functions through such interactions, such as protein-ligand binding. Accurate prediction of protein residues that physically bind to ligands is important for drug design and protein docking studies. Most of the successful protein-ligand binding predictions were based on known structures. However, structural information is often unavailable in practice due to the huge gap between the number of known protein sequences and that of experimentally solved structures. RESULTS This paper proposes a dynamic ensemble approach to identify protein-ligand binding residues by using sequence information only. To avoid problems resulting from highly imbalanced samples between ligand-binding sites and non-ligand-binding sites, we constructed several balanced data sets and trained a random forest classifier for each of them. We dynamically selected a subset of classifiers according to the similarity between the target protein and the proteins in the training data set. Combining the predictions of the classifier subset for each query protein yielded the final predictions. The ensemble of these classifiers formed a sequence-based predictor to identify protein-ligand binding sites. CONCLUSIONS Experimental results on two Critical Assessment of protein Structure Prediction datasets and the ccPDB dataset demonstrated that our proposed method compared favorably with the state of the art. AVAILABILITY http://www2.ahu.edu.cn/pchen/web/LigandDSES.htm.
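The three steps described in this abstract (balanced training subsets, similarity-based classifier selection, and combination of predictions) can be sketched in plain Python. This is a hedged illustration rather than the authors' implementation: the function names are hypothetical, the similarity scores are assumed to be supplied by some external measure the abstract does not specify, and simple averaging stands in for whatever combination rule the paper uses.

```python
import random


def balanced_subsets(positives, negatives, n_subsets, seed=0):
    """Pair all positives with an equal-sized random sample of negatives."""
    if len(negatives) < len(positives):
        raise ValueError("sketch assumes negatives are the majority class")
    rng = random.Random(seed)
    return [
        (list(positives), rng.sample(negatives, len(positives)))
        for _ in range(n_subsets)
    ]


def select_classifiers(similarities, top_k):
    """Indices of the top_k classifiers whose training data best match the query."""
    ranked = sorted(range(len(similarities)), key=lambda i: similarities[i], reverse=True)
    return ranked[:top_k]


def combine(predictions, selected):
    """Average per-residue scores over the selected classifiers (an assumption)."""
    n_residues = len(predictions[0])
    return [
        sum(predictions[i][j] for i in selected) / len(selected)
        for j in range(n_residues)
    ]


# Made-up data: 3 binding residues, 20 non-binding residues, 4 balanced subsets.
subsets = balanced_subsets([1, 2, 3], list(range(10, 30)), n_subsets=4)
chosen = select_classifiers([0.2, 0.9, 0.5], top_k=2)
scores = combine([[0.0, 1.0], [1.0, 1.0], [0.5, 0.0]], chosen)
```

In a full pipeline, one classifier (a random forest in the paper) would be trained per balanced subset, and `predictions[i]` would hold classifier `i`'s per-residue scores for the query protein.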
|
25
|
Sun MA, Zhang Q, Wang Y, Ge W, Guo D. Prediction of redox-sensitive cysteines using sequential distance and other sequence-based features. BMC Bioinformatics 2016; 17:316. [PMID: 27553667 PMCID: PMC4995733 DOI: 10.1186/s12859-016-1185-4] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.8] [Received: 12/01/2015] [Accepted: 08/12/2016] [Indexed: 11/10/2022] Open
Abstract
Background Reactive oxygen species can modify the structure and function of proteins and may also act as important signaling molecules in various cellular processes. Cysteine thiol groups of proteins are particularly susceptible to oxidation, and their reversible oxidation plays critical roles in redox regulation and signaling. Recently, several computational tools have been developed for predicting redox-sensitive cysteines; however, those methods either focus only on catalytic redox-sensitive cysteines in thiol oxidoreductases or depend heavily on protein structural data, and thus cannot be widely used. Results In this study, we analyzed various sequence-based features potentially related to cysteine redox sensitivity and identified three types of features for efficient computational prediction of redox-sensitive cysteines: sequential distance to the nearby cysteines, PSSM profile, and predicted secondary structure of flanking residues. After further feature selection using SVM-RFE, we developed the Redox-Sensitive Cysteine Predictor (RSCP), an SVM-based classifier for redox-sensitive cysteine prediction using primary sequence only. Using 10-fold cross-validation on the RSC758 dataset, the accuracy, sensitivity, specificity, MCC and AUC were estimated as 0.679, 0.602, 0.756, 0.362 and 0.727, respectively. When evaluated using 10-fold cross-validation on the BALOSCTdb dataset, which has structure information, the model achieved performance comparable to the current structure-based method. Further validation using an independent dataset indicates that it is robust and relatively more accurate for predicting redox-sensitive cysteines in non-enzyme proteins. Conclusions In this study, we developed a sequence-based classifier for predicting redox-sensitive cysteines. The major advantage of this method is that it does not rely on protein structure data, which allows broader application than other current implementations.
Accurate prediction of redox-sensitive cysteines not only enhances our understanding of cysteine redox sensitivity but may also complement the proteomics approach and facilitate further experimental investigation of important redox-sensitive cysteines. Electronic supplementary material The online version of this article (doi:10.1186/s12859-016-1185-4) contains supplementary material, which is available to authorized users.
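The "sequential distance to the nearby cysteines" feature named in this abstract can be sketched as follows. The abstract does not define the feature precisely, so this sketch assumes it means the distance in residues from each cysteine to its nearest upstream and downstream cysteine; the function name and the use of `None` for a missing neighbor are illustrative choices.

```python
def nearest_cysteine_distances(sequence):
    """Map each cysteine position (0-based) to (upstream, downstream) distances.

    A distance is None when there is no cysteine on that side
    (a convention of this sketch, not of RSCP).
    """
    positions = [i for i, aa in enumerate(sequence) if aa == "C"]
    features = {}
    for k, pos in enumerate(positions):
        upstream = pos - positions[k - 1] if k > 0 else None
        downstream = positions[k + 1] - pos if k + 1 < len(positions) else None
        features[pos] = (upstream, downstream)
    return features


# Made-up sequence with cysteines at positions 1, 3 and 7.
distances = nearest_cysteine_distances("ACDCGGGC")
```

A feature like this needs only the primary sequence, which is consistent with the abstract's emphasis on avoiding structural data.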
Affiliation(s)
- Ming-An Sun
- State Key Laboratory of Agrobiotechnology and School of Life Sciences, The Chinese University of Hong Kong, Shatin, New Territories, Hong Kong, People's Republic of China
- Qing Zhang
- State Key Laboratory of Agrobiotechnology and School of Life Sciences, The Chinese University of Hong Kong, Shatin, New Territories, Hong Kong, People's Republic of China
- Yejun Wang
- Department of Cell Biology and Genetics, School of Basic Medical Sciences, Shenzhen University Health Science Center, Nanhai Ave 3688, Shenzhen, 518060, People's Republic of China
- Wei Ge
- Centre of Reproduction, Development and Aging, Faculty of Health Sciences, University of Macau, Taipa, Macau, People's Republic of China
- Dianjing Guo
- State Key Laboratory of Agrobiotechnology and School of Life Sciences, The Chinese University of Hong Kong, Shatin, New Territories, Hong Kong, People's Republic of China
|
26
|
Atanasova P, Stitz N, Sanctis S, Maurer JHM, Hoffmann RC, Eiben S, Jeske H, Schneider JJ, Bill J. Genetically improved monolayer-forming tobacco mosaic viruses to generate nanostructured semiconducting bio/inorganic hybrids. LANGMUIR : THE ACS JOURNAL OF SURFACES AND COLLOIDS 2015; 31:3897-3903. [PMID: 25768914 DOI: 10.1021/acs.langmuir.5b00700] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.1] [Indexed: 06/04/2023]
Abstract
The genetically determined design of structured functional bio/inorganic materials was investigated by applying a convective assembly approach. Wildtype tobacco mosaic virus (wt TMV) as well as several TMV mutants were organized on substrates over macroscopic length scales. Depending on the virus type, the self-organization behavior showed pronounced differences in the surface arrangement under the same convective assembly conditions. Additionally, under varying assembly parameters, the virus particles generated structures with morphologies ranging from single micrometer-long fibers aligned parallel to the triple-contact line, through disordered but dense films, to smooth and uniform monolayers. Monolayers with diverse packing densities were used as templates to form TMV/ZnO hybrid materials. The semiconducting properties can be directly designed and tuned by varying the template architecture, which is reflected in the transistor performance.
Affiliation(s)
- Petia Atanasova
- †Institute of Materials Science, Universität Stuttgart, Heisenbergstrasse 3, 70569 Stuttgart, Germany
- Nina Stitz
- †Institute of Materials Science, Universität Stuttgart, Heisenbergstrasse 3, 70569 Stuttgart, Germany
- Shawn Sanctis
- ‡Fachbereich Chemie, Eduard-Zintl-Institut für Anorganische und Physikalische Chemie, Technische Universität Darmstadt, Alarich-Weiss-Strasse 12, 64287 Darmstadt, Germany
- Johannes H M Maurer
- †Institute of Materials Science, Universität Stuttgart, Heisenbergstrasse 3, 70569 Stuttgart, Germany
- Rudolf C Hoffmann
- ‡Fachbereich Chemie, Eduard-Zintl-Institut für Anorganische und Physikalische Chemie, Technische Universität Darmstadt, Alarich-Weiss-Strasse 12, 64287 Darmstadt, Germany
- Sabine Eiben
- §Institute of Biomaterials and Biological Systems, Universität Stuttgart, Pfaffenwaldring 57, 70569 Stuttgart, Germany
- Holger Jeske
- §Institute of Biomaterials and Biological Systems, Universität Stuttgart, Pfaffenwaldring 57, 70569 Stuttgart, Germany
- Jörg J Schneider
- ‡Fachbereich Chemie, Eduard-Zintl-Institut für Anorganische und Physikalische Chemie, Technische Universität Darmstadt, Alarich-Weiss-Strasse 12, 64287 Darmstadt, Germany
- Joachim Bill
- †Institute of Materials Science, Universität Stuttgart, Heisenbergstrasse 3, 70569 Stuttgart, Germany
|
27
|
Chen P, Huang JZ, Gao X. LigandRFs: random forest ensemble to identify ligand-binding residues from sequence information alone. BMC Bioinformatics 2014; 15 Suppl 15:S4. [PMID: 25474163 PMCID: PMC4271564 DOI: 10.1186/1471-2105-15-s15-s4] [Citation(s) in RCA: 33] [Impact Index Per Article: 3.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/28/2022] Open
Abstract
Background Protein-ligand binding is important for some proteins to perform their functions. Protein-ligand binding sites are the residues of proteins that physically bind to ligands. Despite recent advances in computational prediction of protein-ligand binding sites, the state-of-the-art methods search for similar, known structures of the query and predict the binding sites based on the solved structures. However, such structural information is not commonly available. Results In this paper, we propose a sequence-based approach to identify protein-ligand binding residues. We propose a combination technique to reduce the effects of different sliding residue windows in the process of encoding input feature vectors. Moreover, because the samples are highly imbalanced between ligand-binding sites and non-ligand-binding sites, we construct several balanced data sets, for each of which a random forest (RF)-based classifier is trained. The ensemble of these RF classifiers forms a sequence-based protein-ligand binding site predictor. Conclusions Experimental results on CASP9 and CASP8 data sets demonstrate that our method compares favorably with the state-of-the-art protein-ligand binding site prediction methods.
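The abstract above describes building several balanced data sets from highly imbalanced samples and combining one classifier per set. A minimal sketch of that data-splitting idea follows, assuming the majority (non-binding) class is shuffled and cut into chunks the size of the minority class; the function names, the chunking rule and the majority vote are illustrative assumptions, and no random forest is trained here.

```python
import random


def balanced_subsets(positives, negatives, seed=0):
    """Pair all positives with disjoint, equally sized chunks of negatives.

    Hypothetical scheme: leftover negatives that do not fill a whole
    chunk are dropped.
    """
    if not positives:
        raise ValueError("at least one positive sample is required")
    rng = random.Random(seed)
    shuffled = list(negatives)
    rng.shuffle(shuffled)
    size = len(positives)
    subsets = []
    for start in range(0, len(shuffled) - size + 1, size):
        subsets.append((list(positives), shuffled[start:start + size]))
    return subsets


def majority_vote(votes):
    """Return True when strictly more than half of the votes are positive."""
    return sum(votes) * 2 > len(votes)


if __name__ == "__main__":
    subsets = balanced_subsets([1, 2, 3], list(range(100, 110)))
    print(len(subsets), "balanced subsets")
```

Each returned pair could then be used to train one classifier, with majority_vote combining the per-classifier decisions for a residue.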
|
28
|
Teso S, Passerini A. Joint probabilistic-logical refinement of multiple protein feature predictors. BMC Bioinformatics 2014; 15:16. [PMID: 24428894 PMCID: PMC3929554 DOI: 10.1186/1471-2105-15-16] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/03/2012] [Accepted: 11/06/2013] [Indexed: 11/24/2022] Open
Abstract
Background Computational methods for the prediction of protein features from sequence are a long-standing focus of bioinformatics. A key observation is that several protein features are closely inter-related, that is, they are conditioned on each other. Researchers invested a lot of effort into designing predictors that exploit this fact. Most existing methods leverage inter-feature constraints by including known (or predicted) correlated features as inputs to the predictor, thus conditioning the result. Results By including correlated features as inputs, existing methods only rely on one side of the relation: the output feature is conditioned on the known input features. Here we show how to jointly improve the outputs of multiple correlated predictors by means of a probabilistic-logical consistency layer. The logical layer enforces a set of weighted first-order rules encoding biological constraints between the features, and improves the raw predictions so that they least violate the constraints. In particular, we show how to integrate three stand-alone predictors of correlated features: subcellular localization (Loctree [J Mol Biol 348:85–100, 2005]), disulfide bonding state (Disulfind [Nucleic Acids Res 34:W177–W181, 2006]), and metal bonding state (MetalDetector [Bioinformatics 24:2094–2095, 2008]), in a way that takes into account the respective strengths and weaknesses, and does not require any change to the predictors themselves. We also compare our methodology against two alternative refinement pipelines based on state-of-the-art sequential prediction methods. Conclusions The proposed framework is able to improve the performance of the underlying predictors by removing rule violations. We show that different predictors offer complementary advantages, and our method is able to integrate them using non-trivial constraints, generating more consistent predictions. 
In addition, our framework is fully general, and could in principle be applied to a vast array of heterogeneous predictions without requiring any change to the underlying software. On the other hand, the alternative strategies are more specific and tend to favor one task at the expense of the others, as shown by our experimental evaluation. The ultimate goal of our framework is to seamlessly integrate full prediction suites, such as Distill [BMC Bioinformatics 7:402, 2006] and PredictProtein [Nucleic Acids Res 32:W321–W326, 2004].
Affiliation(s)
- Stefano Teso
- Department of Information Engineering and Computer Science, Università degli Studi di Trento, Trento, Italy.
|
29
|
Brylinski M. Exploring the "dark matter" of a mammalian proteome by protein structure and function modeling. Proteome Sci 2013; 11:47. [PMID: 24321360 PMCID: PMC3866606 DOI: 10.1186/1477-5956-11-47] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/22/2013] [Accepted: 12/03/2013] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND A growing body of evidence shows that gene products encoded by short open reading frames play key roles in numerous cellular processes. Yet, they are generally overlooked in genome assembly, escaping annotation because small protein-coding genes are difficult to predict computationally. Consequently, there are still a considerable number of small proteins whose functions are yet to be characterized. RESULTS To address this issue, we apply a collection of structural bioinformatics algorithms to infer molecular function of putative small proteins from the mouse proteome. Specifically, we construct 1,743 confident structure models of small proteins, which reveal a significant structural diversity with a noticeably high helical content. A subsequent structure-based function annotation of small protein models exposes 178,745 putative protein-protein interactions with the remaining gene products in the mouse proteome, 1,100 potential binding sites for small organic molecules and 987 metal-binding signatures. CONCLUSIONS These results strongly indicate that many small proteins adopt three-dimensional structures and are fully functional, playing important roles in transcriptional regulation, cell signaling and metabolism. Data collected through this work is freely available to the academic community at http://www.brylinski.org/content/databases to support future studies oriented on elucidating the functions of hypothetical small proteins.
Affiliation(s)
- Michal Brylinski
- Department of Biological Sciences, Louisiana State University, 70803 Baton Rouge, LA, USA.
|
30
|
Identification of Functional Regulatory Residues of the β -Lactam Inducible Penicillin Binding Protein in Methicillin-Resistant Staphylococcus aureus. CHEMOTHERAPY RESEARCH AND PRACTICE 2013; 2013:614670. [PMID: 23984067 PMCID: PMC3745919 DOI: 10.1155/2013/614670] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 04/27/2013] [Revised: 06/19/2013] [Accepted: 07/03/2013] [Indexed: 11/30/2022]
Abstract
Resistance to methicillin by Staphylococcus aureus is a persistent clinical problem worldwide. A mechanism for resistance has been proposed in which methicillin-resistant Staphylococcus aureus (MRSA) isolates acquired a new protein called β-lactam inducible penicillin binding protein (PBP-2′). PBP-2′ functions by substituting for other penicillin binding proteins that have been inhibited by β-lactam antibiotics. Presently, there is no structural and regulatory information on the PBP-2′ protein. We conducted a complete structural and functional regulatory analysis of the PBP-2′ protein. Our analysis revealed that PBP-2′ is very stable, with more hydrophilic amino acids expressing antigenic sites. PBP-2′ has three striking regulatory points: a first penicillin binding site at Ser25, a second penicillin binding site at Ser405, and a single metallic ligand binding site at Glu657, which binds Zn2+ ions. This report highlights structural features of PBP-2′ that can serve as targets for developing new chemotherapeutic agents and conducting site-directed mutagenesis experiments.
|
31
|
Yu DJ, Hu J, Yang J, Shen HB, Tang J, Yang JY. Designing template-free predictor for targeting protein-ligand binding sites with classifier ensemble and spatial clustering. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2013; 10:994-1008. [PMID: 24334392 DOI: 10.1109/tcbb.2013.104] [Citation(s) in RCA: 87] [Impact Index Per Article: 7.9] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/03/2023]
Abstract
Accurately identifying protein-ligand binding sites or pockets is of significant importance for both protein function analysis and drug design. Although much progress has been made, challenges remain, especially when the 3D structures of target proteins are not available or no homology templates can be found in the library, in which case template-based methods are hard to apply. In this paper, we report a new ligand-specific template-free predictor called TargetS for targeting protein-ligand binding sites from primary sequences. TargetS first predicts the binding residues along the sequence with a ligand-specific strategy and then further identifies the binding sites from the predicted binding residues through a recursive spatial clustering algorithm. Protein evolutionary information, predicted protein secondary structure, and ligand-specific binding propensities of residues are combined to construct discriminative features; an improved AdaBoost classifier ensemble scheme based on random undersampling is proposed to deal with the serious imbalance problem between positive (binding) and negative (nonbinding) samples. Experimental results demonstrate that TargetS achieves high performance and outperforms many existing predictors. TargetS web server and data sets are freely available at: http://www.csbio.sjtu.edu.cn/bioinf/TargetS/ for academic use.
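The second stage described above groups predicted binding residues into binding sites by spatial clustering. The abstract does not specify the recursive algorithm, so the sketch below swaps in a plain single-linkage grouping by distance cutoff purely to illustrate the idea; the function name, the coordinate representation and the cutoff are all assumptions.

```python
import math


def cluster_residues(coords, cutoff):
    """Group points so that any two points within `cutoff` of each other
    end up in the same cluster (single linkage).

    Not the TargetS algorithm: a simplified stand-in for illustration.
    Returns clusters as sorted lists of indices into `coords`.
    """
    unvisited = set(range(len(coords)))
    clusters = []
    while unvisited:
        seed = unvisited.pop()
        cluster, frontier = [seed], [seed]
        while frontier:
            current = frontier.pop()
            near = [j for j in unvisited
                    if math.dist(coords[current], coords[j]) <= cutoff]
            for j in near:
                unvisited.remove(j)
            cluster.extend(near)
            frontier.extend(near)   # grow the cluster from the new members
        clusters.append(sorted(cluster))
    return sorted(clusters)


if __name__ == "__main__":
    points = [(0, 0, 0), (1, 0, 0), (10, 0, 0), (11, 0, 0), (30, 0, 0)]
    print(cluster_residues(points, cutoff=2.0))
```

Each resulting cluster would correspond to one candidate binding site; isolated residues form singleton clusters that a real pipeline might discard.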
Affiliation(s)
- Dong-Jun Yu
- Nanjing University of Science and Technology, Nanjing
- Jun Hu
- Nanjing University of Science and Technology, Nanjing
- Jing Yang
- Shanghai Jiao Tong University, Shanghai and Ministry of Education of China, Shanghai
- Hong-Bin Shen
- Shanghai Jiao Tong University, Shanghai and Ministry of Education of China, Shanghai
- Jinhui Tang
- Nanjing University of Science and Technology, Nanjing
- Jing-Yu Yang
- Nanjing University of Science and Technology, Nanjing
|
32
|
Liu Z, Wang Y, Zhou C, Xue Y, Zhao W, Liu H. Computationally characterizing and comprehensive analysis of zinc-binding sites in proteins. BIOCHIMICA ET BIOPHYSICA ACTA-PROTEINS AND PROTEOMICS 2013; 1844:171-80. [PMID: 23499845 DOI: 10.1016/j.bbapap.2013.03.001] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Subscribe] [Scholar Register] [Received: 12/03/2012] [Revised: 03/02/2013] [Accepted: 03/04/2013] [Indexed: 10/27/2022]
Abstract
Zinc is one of the most essential metals utilized by organisms, and zinc-binding proteins play an important role in a variety of biological processes such as transcription regulation, cell metabolism and apoptosis. Thus, characterizing the precise zinc-binding sites is fundamental to elucidating the biological functions and molecular mechanisms of zinc-binding proteins. Using systematic analyses of structural characteristics, we observed that 4-residue and 3-residue zinc-binding sites have distinctly specific geometric features. Based on the results, we developed the novel computational program Geometric REstriction for Zinc-binding (GRE4Zn) to characterize the zinc-binding sites in protein structures by restricting the distances between zinc and its coordinating atoms. The comparison between GRE4Zn and analogous tools revealed that it achieved a superior performance. A large-scale prediction for structurally characterized proteins was performed with this predictor, and statistical analyses of the results indicated that, during the course of evolution, zinc-binding proteins have come to be significantly involved in more complicated biological processes in higher species than in simpler species. Further analyses suggested that zinc-binding proteins are preferentially implicated in a variety of diseases and highly enriched in known drug targets, and that the prediction of zinc-binding sites can be helpful for the investigation of molecular mechanisms. In this regard, these prediction and analysis results should prove highly useful for further biomedical study and drug design. The online service of GRE4Zn is freely available at: http://biocomp.ustc.edu.cn/gre4zn/. This article is part of a Special Issue entitled: Computational Proteomics, Systems Biology & Clinical Implications. Guest Editor: Yudong Cai.
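The core idea stated above is to restrict the distances between a zinc ion and its coordinating atoms for 3-residue and 4-residue sites. A minimal sketch of such a check is given below; the function name is hypothetical and the distance bounds are left as parameters because the abstract does not report the thresholds GRE4Zn uses.

```python
import math


def satisfies_distance_restriction(zinc_xyz, ligand_atoms, min_dist, max_dist):
    """Return True if a candidate site has 3 or 4 coordinating atoms and
    every zinc-to-atom distance lies within [min_dist, max_dist].

    min_dist and max_dist are caller-supplied assumptions, not values
    taken from GRE4Zn.
    """
    if len(ligand_atoms) not in (3, 4):
        return False
    return all(min_dist <= math.dist(zinc_xyz, atom) <= max_dist
               for atom in ligand_atoms)


if __name__ == "__main__":
    zinc = (0.0, 0.0, 0.0)
    ligands = [(2.0, 0.0, 0.0), (0.0, 2.1, 0.0), (0.0, 0.0, 2.2)]
    # The bounds below are placeholders chosen only for the demonstration.
    print(satisfies_distance_restriction(zinc, ligands, 1.8, 2.6))
```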
Affiliation(s)
- Zexian Liu
- Hefei National Laboratory for Physical Sciences at Microscale and School of Life Sciences, University of Science & Technology of China, Hefei, Anhui 230027, China
|
33
|
Ou YY, Chen SA, Wu SC. ETMB-RBF: discrimination of metal-binding sites in electron transporters based on RBF networks with PSSM profiles and significant amino acid pairs. PLoS One 2013; 8:e46572. [PMID: 23405059 PMCID: PMC3566168 DOI: 10.1371/journal.pone.0046572] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/22/2012] [Accepted: 08/31/2012] [Indexed: 11/18/2022] Open
Abstract
Background Cellular respiration is the process by which cells obtain energy from glucose and is a very important biological process in living cells. To carry out cellular respiration, cells need a pathway to store and transport electrons: the electron transport chain. The function of the electron transport chain is to produce a trans-membrane proton electrochemical gradient as a result of oxidation–reduction reactions. In these reactions, metal ions play a very important role as electron donors and acceptors. For example, Fe ions are in complex I and complex II, and Cu ions are in complex IV. Therefore, identifying metal-binding sites in electron transporters is an important step in helping biologists better understand the workings of the electron transport chain. Methods We propose a method based on Position Specific Scoring Matrix (PSSM) profiles and significant amino acid pairs to identify metal-binding residues in electron transport proteins. Results We selected a non-redundant set of 55 metal-binding electron transport proteins as our dataset. The proposed method can predict metal-binding sites in electron transport proteins with an average 10-fold cross-validation accuracy of 93.2% and 93.1% for metal-binding cysteine and histidine, respectively. Compared with the general metal-binding predictor from A. Passerini et al., the proposed method improves sensitivity by over 9% and specificity by 14% on the independent dataset in identifying metal-binding cysteines. The proposed method also improves sensitivity by almost 76% at the same specificity for metal-binding histidine, and MCC improves from 0.28 to 0.88. Conclusions We have developed a novel approach based on PSSM profiles and significant amino acid pairs for identifying metal-binding sites in electron transport proteins. The proposed approach achieved a significant improvement on an independent test set of metal-binding electron transport proteins.
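PSSM-based predictors such as the one described above typically encode a residue by the PSSM rows of its sequence neighbours. The sketch below shows one common way to do that, a fixed-width window flattened into a feature vector; the function name, window size and zero padding at the sequence ends are assumptions, and the "significant amino acid pairs" part of the method is not covered.

```python
def pssm_window_features(pssm, center, half_width):
    """Flatten the PSSM rows in a window around `center` into one list.

    `pssm` is a list of per-residue score rows (20 columns in a real
    PSSM). Positions that fall off either end of the sequence are
    padded with zeros, which is an illustrative choice.
    """
    n_cols = len(pssm[0])
    features = []
    for pos in range(center - half_width, center + half_width + 1):
        if 0 <= pos < len(pssm):
            features.extend(pssm[pos])
        else:
            features.extend([0] * n_cols)
    return features


if __name__ == "__main__":
    toy_pssm = [[1, 2], [3, 4], [5, 6]]   # 3 residues, 2 columns for brevity
    print(pssm_window_features(toy_pssm, center=0, half_width=1))
```

The resulting vector always has (2 * half_width + 1) * n_cols entries, so every candidate cysteine or histidine yields an input of the same length.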
Affiliation(s)
- Yu-Yen Ou
- Department of Computer Science and Engineering, Yuan Ze University, Chung-Li, Taiwan.
|
34
|
Chen Z, Wang Y, Zhai YF, Song J, Zhang Z. ZincExplorer: an accurate hybrid method to improve the prediction of zinc-binding sites from protein sequences. MOLECULAR BIOSYSTEMS 2013; 9:2213-22. [DOI: 10.1039/c3mb70100j] [Citation(s) in RCA: 29] [Impact Index Per Article: 2.6] [Reference Citation Analysis] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/21/2022]
|
35
|
An integrative computational framework based on a two-step random forest algorithm improves prediction of zinc-binding sites in proteins. PLoS One 2012; 7:e49716. [PMID: 23166753 PMCID: PMC3499040 DOI: 10.1371/journal.pone.0049716] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.9] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/22/2012] [Accepted: 10/12/2012] [Indexed: 11/30/2022] Open
Abstract
Zinc-binding proteins are the most abundant metalloproteins in the Protein Data Bank where the zinc ions usually have catalytic, regulatory or structural roles critical for the function of the protein. Accurate prediction of zinc-binding sites is not only useful for the inference of protein function but also important for the prediction of 3D structure. Here, we present a new integrative framework that combines multiple sequence and structural properties and graph-theoretic network features, followed by an efficient feature selection to improve prediction of zinc-binding sites. We investigate what information can be retrieved from the sequence, structure and network levels that is relevant to zinc-binding site prediction. We perform a two-step feature selection using random forest to remove redundant features and quantify the relative importance of the retrieved features. Benchmarking on a high-quality structural dataset containing 1,103 protein chains and 484 zinc-binding residues, our method achieved >80% recall at a precision of 75% for the zinc-binding residues Cys, His, Glu and Asp on 5-fold cross-validation tests, which is a 10%-28% higher recall at the 75% equal precision compared to SitePredict and zincfinder at residue level using the same dataset. The independent test also indicates that our method has achieved recall of 0.790 and 0.759 at residue and protein levels, respectively, which is a performance better than the other two methods. Moreover, AUC (the Area Under the Curve) and AURPC (the Area Under the Recall-Precision Curve) by our method are also respectively better than those of the other two methods. Our method can not only be applied to large-scale identification of zinc-binding sites when structural information of the target is available, but also give valuable insights into important features arising from different levels that collectively characterize the zinc-binding sites. 
The scripts and datasets are available at http://protein.cau.edu.cn/zincidentifier/.
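The abstract above reports recall at a fixed precision of 75%. As a sketch of how such a figure can be read off a ranked list of predictions, the function below sweeps the score threshold and keeps the best recall among thresholds whose precision meets the target; the function name is hypothetical, tied scores are not treated specially, and this is not the authors' evaluation script.

```python
def recall_at_precision(scores, labels, target_precision):
    """Return the highest recall achievable at precision >= target_precision.

    `labels` are 1 for true binding residues and 0 otherwise. Predictions
    are thresholded by walking down the score-sorted list.
    """
    total_pos = sum(labels)
    if total_pos == 0:
        return 0.0
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    best_recall = 0.0
    tp = fp = 0
    for _, label in ranked:
        if label:
            tp += 1
        else:
            fp += 1
        precision = tp / (tp + fp)
        if precision >= target_precision:
            best_recall = max(best_recall, tp / total_pos)
    return best_recall


if __name__ == "__main__":
    scores = [0.9, 0.8, 0.7, 0.6, 0.5]
    labels = [1, 1, 0, 1, 0]
    print(recall_at_precision(scores, labels, 0.75))
```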
|
36
|
Pathak RK, Dessingou J, Rao CP. Multiple Sensor Array of Mn2+, Fe2+, Co2+, Ni2+, Cu2+, and Zn2+ Complexes of a Triazole Linked Imino-Phenol Based Calix[4]arene Conjugate for the Selective Recognition of Asp, Glu, Cys, and His. Anal Chem 2012; 84:8294-300. [DOI: 10.1021/ac301821c] [Citation(s) in RCA: 47] [Impact Index Per Article: 3.9] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/15/2022]
Affiliation(s)
- Rakesh K. Pathak
- Bioinorganic Laboratory, Department of Chemistry, Indian Institute of Technology Bombay, Powai, Mumbai 400 076, India
- Jayaraman Dessingou
- Bioinorganic Laboratory, Department of Chemistry, Indian Institute of Technology Bombay, Powai, Mumbai 400 076, India
- Chebrolu P. Rao
- Bioinorganic Laboratory, Department of Chemistry, Indian Institute of Technology Bombay, Powai, Mumbai 400 076, India
|
37
|
Seguritan V, Alves N, Arnoult M, Raymond A, Lorimer D, Burgin AB, Salamon P, Segall AM. Artificial neural networks trained to detect viral and phage structural proteins. PLoS Comput Biol 2012; 8:e1002657. [PMID: 22927809 PMCID: PMC3426561 DOI: 10.1371/journal.pcbi.1002657] [Citation(s) in RCA: 66] [Impact Index Per Article: 5.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/21/2011] [Accepted: 06/29/2012] [Indexed: 01/03/2023] Open
Abstract
Phages play critical roles in the survival and pathogenicity of their hosts, via lysogenic conversion factors, and in nutrient redistribution, via cell lysis. Analyses of phage- and viral-encoded genes in environmental samples provide insights into the physiological impact of viruses on microbial communities and human health. However, phage ORFs are extremely diverse, and over 70% of them are dissimilar to any genes with annotated functions in GenBank. Better identification of viruses would also aid in better detection and diagnosis of disease, in vaccine development, and generally in better understanding the physiological potential of any environment. In contrast to enzymes, viral structural protein function can be much more challenging to detect from sequence data because of low sequence conservation, few known conserved catalytic sites or sequence domains, and relatively limited experimental data. We have designed a method of predicting phage structural protein sequences that uses Artificial Neural Networks (ANNs). First, we trained ANNs to classify viral structural proteins using amino acid frequency; these correctly classify a large fraction of test cases with a high degree of specificity and sensitivity. Subsequently, we added estimates of protein isoelectric points as a feature to ANNs that classify specialized families of proteins, namely major capsid and tail proteins. As expected, these more specialized ANNs are more accurate than the structural ANNs. To experimentally validate the ANN predictions, several ORFs with no significant similarities to known sequences that are ANN-predicted structural proteins were examined by transmission electron microscopy. Some of these self-assembled into structures strongly resembling virion structures. Thus, our ANNs are new tools for identifying phage and potential prophage structural proteins that are difficult or impossible to detect by other bioinformatic analysis. 
The networks will be valuable when sequence is available but in vitro propagation of the phage may not be practical or possible. Bacteriophages are extremely abundant and diverse biological entities. All phage particles are comprised of nucleic acids and structural proteins, with few other packaged proteins. Despite their simplicity and abundance, more than 70% of phage sequences in the viral Reference Sequence database encode proteins with unknown function based on FASTA annotations. As a result, the use of sequence similarity is often insufficient for detecting virus structural proteins among unknown viral sequences. Viral structural protein function is challenging to detect from sequence data because structural proteins possess few known conserved catalytic motifs and sequence domains. To address these issues we investigated the use of Artificial Neural Networks as an alternative means of predicting function. Here, we trained thousands of networks using the amino acid frequency of structural protein sequences and identified the optimal architectures with the highest accuracies. Some hypothetical protein sequences detected by our networks were expressed and visualized by TEM, and produced images that strongly resemble virion structures. Our results support the utility of our neural networks in predicting the functions of unknown viral sequences.
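The networks described above take the amino acid frequency of a protein sequence as input. A minimal sketch of that input encoding follows; the function name, the alphabet ordering and the decision to ignore non-standard letters are assumptions, and no neural network is trained here.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues, fixed order


def amino_acid_frequencies(sequence):
    """Return a 20-element list of residue frequencies for `sequence`.

    Characters outside the 20 standard amino acids (for example X) are
    ignored, which is an illustrative choice.
    """
    counts = {aa: 0 for aa in AMINO_ACIDS}
    for aa in sequence.upper():
        if aa in counts:
            counts[aa] += 1
    total = sum(counts.values())
    if total == 0:
        return [0.0] * len(AMINO_ACIDS)
    return [counts[aa] / total for aa in AMINO_ACIDS]


if __name__ == "__main__":
    print(amino_acid_frequencies("AACD")[:4])
```

The vector has the same length for every protein regardless of sequence length, which is what makes it usable as a fixed-size network input.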
Affiliation(s)
- Victor Seguritan
- Program of Computational Science, San Diego State University, San Diego, California, United States of America
- Nelson Alves
- Department of Genetics, Federal University of Rio de Janeiro, Rio de Janeiro, Brazil
- Michael Arnoult
- Department of Biology, San Diego State University, San Diego, California, United States of America
- Amy Raymond
- Emerald BioStructures, Seattle, Washington, United States of America
- Don Lorimer
- Emerald BioStructures, Seattle, Washington, United States of America
- Alex B. Burgin
- Emerald BioStructures, Seattle, Washington, United States of America
- Peter Salamon
- Department of Mathematics and Statistics, San Diego State University, San Diego, California, United States of America
- Anca M. Segall
- Program of Computational Science, San Diego State University, San Diego, California, United States of America
- Department of Biology, San Diego State University, San Diego, California, United States of America
|
38
|
Predicting nonspecific ion binding using DelPhi. Biophys J 2012; 102:2885-93. [PMID: 22735539 DOI: 10.1016/j.bpj.2012.05.013] [Citation(s) in RCA: 24] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/03/2012] [Revised: 04/29/2012] [Accepted: 05/01/2012] [Indexed: 11/24/2022] Open
Abstract
Ions are an important component of the cell and affect the corresponding biological macromolecules either via direct binding or as a screening ion cloud. Although some ion binding is highly specific and frequently associated with the function of the macromolecule, other ions bind to the protein surface nonspecifically, presumably because the electrostatic attraction is strong enough to immobilize them. Here, we test such a scenario and demonstrate that experimentally identified surface-bound ions are located at a potential that facilitates binding, which indicates that the major driving force is the electrostatics. Without taking into consideration geometrical factors and structural fluctuations, we show that ions tend to be bound onto the protein surface at positions with strong potential but with polarity opposite to that of the ion. This observation is used to develop a method that uses a DelPhi-calculated potential map in conjunction with an in-house-developed clustering algorithm to predict nonspecific ion-binding sites. Although this approach distinguishes only the polarity of the ions, and not their chemical nature, it can predict nonspecific binding of positively or negatively charged ions with acceptable accuracy. One can use the predictions in the Poisson-Boltzmann approach by placing explicit ions in the predicted positions, which in turn will reduce the magnitude of the local potential and extend the limits of the Poisson-Boltzmann equation. In addition, one can use this approach to place the desired number of ions before conducting molecular-dynamics simulations to neutralize the net charge of the protein, because it was shown to perform better than standard screened Coulomb canned routines, or to predict ion-binding sites in proteins. This latter is especially true for proteins that are involved in ion transport, because such ions are loosely bound and very difficult to detect experimentally.
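The observation above is that surface-bound ions sit where the electrostatic potential is strong and opposite in polarity to the ion. The sketch below illustrates only that filtering step on a precomputed potential map; it does not call DelPhi, the dictionary representation of the grid is an assumption, and the threshold is a placeholder in whatever units the map uses. The clustering step mentioned in the abstract is omitted.

```python
def candidate_ion_sites(grid_potentials, ion_charge, threshold):
    """Select grid points whose potential is opposite in sign to the ion
    and at least `threshold` in magnitude.

    `grid_potentials` maps (x, y, z) tuples to potential values; this
    layout is hypothetical and chosen only for illustration.
    """
    sites = []
    for point, potential in grid_potentials.items():
        opposite_polarity = potential * ion_charge < 0
        if opposite_polarity and abs(potential) >= threshold:
            sites.append(point)
    return sorted(sites)


if __name__ == "__main__":
    grid = {(0, 0, 0): -5.0, (1, 0, 0): -0.5, (2, 0, 0): 4.0}
    print(candidate_ion_sites(grid, ion_charge=+1, threshold=2.0))
```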
|
39
|
Lu CH, Lin YF, Lin JJ, Yu CS. Prediction of metal ion-binding sites in proteins using the fragment transformation method. PLoS One 2012; 7:e39252. [PMID: 22723976 PMCID: PMC3377655 DOI: 10.1371/journal.pone.0039252] [Citation(s) in RCA: 85] [Impact Index Per Article: 7.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/29/2011] [Accepted: 05/21/2012] [Indexed: 11/19/2022] Open
Abstract
The structure of a protein determines its function and its interactions with other factors. Regions of proteins that interact with ligands, substrates, and/or other proteins tend to be conserved both in sequence and structure, and the residues involved are usually in close spatial proximity. More than 70,000 protein structures are currently found in the Protein Data Bank, and approximately one-third contain metal ions essential for function. Identifying and characterizing metal ion-binding sites experimentally is time-consuming and costly. Many computational methods have been developed to identify metal ion-binding sites, and most use only sequence information. For the work reported herein, we developed a method that uses sequence and structural information to predict the residues in metal ion-binding sites. Six types of metal ion-binding templates (those involving Ca2+, Cu2+, Fe3+, Mg2+, Mn2+, and Zn2+) were constructed using the residues within 3.5 Å of the center of the metal ion. Using the fragment transformation method, we then compared known metal ion-binding sites with the templates to assess the accuracy of our method. Our method achieved an overall 94.6% accuracy with a true positive rate of 60.5% at a 5% false positive rate and therefore constitutes a significant improvement in metal-binding site prediction.
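The templates above are built from the residues within 3.5 Å of the metal ion centre. A minimal sketch of that selection step is shown below, assuming atoms are supplied as (residue identifier, coordinates) pairs; the function name and input layout are hypothetical, and the fragment transformation comparison itself is not reproduced.

```python
import math


def residues_near_metal(metal_xyz, atoms, cutoff=3.5):
    """Return sorted residue identifiers having at least one atom within
    `cutoff` angstroms of the metal ion centre.

    `atoms` is an iterable of (residue_id, (x, y, z)) pairs; the 3.5
    default follows the distance quoted in the abstract.
    """
    return sorted({residue_id
                   for residue_id, xyz in atoms
                   if math.dist(metal_xyz, xyz) <= cutoff})


if __name__ == "__main__":
    atoms = [
        ("HIS94", (2.0, 0.0, 0.0)),
        ("HIS94", (5.0, 0.0, 0.0)),
        ("CYS12", (0.0, 3.4, 0.0)),
        ("GLU50", (0.0, 0.0, 3.6)),
    ]
    print(residues_near_metal((0.0, 0.0, 0.0), atoms))
```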
Affiliation(s)
- Chih-Hao Lu
- Graduate Institute of Molecular Systems Biomedicine, China Medical University, Taichung, Taiwan.
|
40
|
Passerini A, Lippi M, Frasconi P. Predicting metal-binding sites from protein sequence. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2012; 9:203-213. [PMID: 21606549 DOI: 10.1109/tcbb.2011.94] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/30/2023]
Abstract
Prediction of binding sites from sequence can significantly help toward determining the function of uncharacterized proteins on a genomic scale. The task is highly challenging due to the enormous amount of alternative candidate configurations. Previous research has only considered this prediction problem starting from 3D information. When starting from sequence alone, only methods that predict the bonding state of selected residues are available. The sole exception consists of pattern-based approaches, which rely on very specific motifs and cannot be applied to discover truly novel sites. We develop new algorithmic ideas based on structured-output learning for determining transition-metal-binding sites coordinated by cysteines and histidines. The inference step (retrieving the best scoring output) is intractable for general output types (i.e., general graphs). However, under the assumption that no residue can coordinate more than one metal ion, we prove that metal binding has the algebraic structure of a matroid, allowing us to employ a very efficient greedy algorithm. We test our predictor in a highly stringent setting where the training set consists of protein chains belonging to SCOP folds different from the ones used for accuracy estimation. In this setting, our predictor achieves 56 percent precision and 60 percent recall in the identification of ligand-ion bonds.
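The greedy inference idea mentioned in this abstract can be sketched under a strong simplification: each candidate ligand-ion bond carries an independent score, and the only constraint is that a residue may coordinate at most one ion. That constraint alone defines a partition matroid, for which greedy selection by descending score maximizes the total score. The function `greedy_bonds`, the candidate tuples, and the scores below are hypothetical, and the scoring model of the paper is not reproduced here.

```python
def greedy_bonds(candidates):
    """Pick bonds by descending score; each residue keeps at most one ion."""
    chosen = {}
    for residue, ion, score in sorted(candidates, key=lambda c: c[2], reverse=True):
        if score <= 0:
            break  # remaining candidates cannot increase the total score
        if residue not in chosen:  # independence check: one ion per residue
            chosen[residue] = ion
    return chosen

# Hypothetical candidate bonds: (residue, ion label, score).
candidates = [
    ("CYS12", "ion_A", 0.9),
    ("CYS12", "ion_B", 0.4),
    ("HIS30", "ion_A", 0.7),
    ("HIS45", "ion_B", 0.6),
    ("CYS50", "ion_B", -0.2),
]
bonds = greedy_bonds(candidates)
print(bonds)
```

The greedy pass costs one sort plus a linear scan, which is the practical benefit of the matroid structure over searching all candidate configurations.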
|
41
|
Marino SM, Gladyshev VN. Analysis and functional prediction of reactive cysteine residues. J Biol Chem 2011; 287:4419-25. [PMID: 22157013 DOI: 10.1074/jbc.r111.275578] [Citation(s) in RCA: 203] [Impact Index Per Article: 15.6] [Indexed: 02/01/2023]
Abstract
Cys is much different from other common amino acids in proteins. Being one of the least abundant residues, Cys is often observed in functional sites in proteins. This residue is reactive, polarizable, and redox-active; has high affinity for metals; and is particularly responsive to the local environment. A better understanding of the basic properties of Cys is essential for interpretation of high-throughput data sets and for prediction and classification of functional Cys residues. We provide an overview of approaches used to study Cys residues, from methods for investigation of their basic properties, such as exposure and pK(a), to algorithms for functional prediction of different types of Cys in proteins.
Affiliation(s)
- Stefano M Marino
- Division of Genetics, Department of Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, Massachusetts 02115, USA
|
42
|
Marino SM, Gladyshev VN. Redox biology: computational approaches to the investigation of functional cysteine residues. Antioxid Redox Signal 2011; 15:135-46. [PMID: 20812876 PMCID: PMC3110093 DOI: 10.1089/ars.2010.3561] [Citation(s) in RCA: 38] [Impact Index Per Article: 2.9] [Received: 08/07/2010] [Revised: 08/19/2010] [Accepted: 09/02/2010] [Indexed: 12/18/2022]
Abstract
Cysteine (Cys) residues serve many functions, such as catalysis, stabilization of protein structure through disulfides, metal binding, and regulation of protein function. Cys residues are also subject to numerous post-translational modifications. In recent years, various computational tools aiming at classifying and predicting different functional categories of Cys have been developed, particularly for structural and catalytic Cys. On the other hand, given the complexity of the subject, bioinformatics approaches have been less successful for the investigation of regulatory Cys sites. In this review, we introduce different functional categories of Cys residues. For each category, an overview of state-of-the-art bioinformatics methods and tools is provided, along with examples of successful applications and potential limitations associated with each approach. Finally, we discuss Cys-based redox switches, which modify the view of distinct functional categories of Cys in proteins.
Affiliation(s)
- Stefano M Marino
- Division of Genetics, Department of Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, Massachusetts 02115, USA
|
43
|
Passerini A, Lippi M, Frasconi P. MetalDetector v2.0: predicting the geometry of metal binding sites from protein sequence. Nucleic Acids Res 2011; 39:W288-92. [PMID: 21576237 PMCID: PMC3125771 DOI: 10.1093/nar/gkr365] [Citation(s) in RCA: 51] [Impact Index Per Article: 3.9] [Indexed: 11/25/2022]
Abstract
MetalDetector identifies CYS and HIS involved in transition metal protein binding sites, starting from sequence alone. A major new feature of release 2.0 is the ability to predict which residues are jointly involved in the coordination of the same metal ion. The server is available at http://metaldetector.dsi.unifi.it/v2.0/.
Affiliation(s)
- Andrea Passerini
- Dipartimento di Ingegneria e Scienza dell'Informazione, Università degli Studi di Trento, Via Sommarive 14, 38123 Povo di Trento, Italy.
|
44
|
Ashrafi E, Alemzadeh A, Ebrahimi M, Ebrahimie E, Dadkhodaei N, Ebrahimi M. Amino Acid Features of P1B-ATPase Heavy Metal Transporters Enabling Small Numbers of Organisms to Cope with Heavy Metal Pollution. Bioinform Biol Insights 2011; 5:59-82. [PMID: 21573033 PMCID: PMC3091408 DOI: 10.4137/bbi.s6206] [Citation(s) in RCA: 25] [Impact Index Per Article: 1.9] [Indexed: 01/14/2023]
Abstract
Phytoremediation refers to the use of plants for extraction and detoxification of pollutants, providing a new and powerful weapon against a polluted environment. In some plants, such as Thlaspi spp., heavy metal ATPases are involved in overall metal ion homeostasis and hyperaccumulation. P1B-ATPases pump a wide range of cations, especially heavy metals, across membranes against their electrochemical gradients. Determination of the protein characteristics of P1B-ATPases in hyperaccumulator plants provides a new opportunity for engineering of phytoremediating plants. In this study, using diverse weighting and modeling approaches, 2644 protein characteristics of primary, secondary, and tertiary structures of P1B-ATPases in hyperaccumulator and nonhyperaccumulator plants were extracted and compared to identify differences between proteins in hyperaccumulator and nonhyperaccumulator pumps. Although the protein characteristics were variable in their weighting, tree, and rule induction models, glycine count, frequency of glutamine-valine, and valine-phenylalanine count were the most important attributes, highlighted by 10, five, and four models, respectively. In addition, a precise model was built to discriminate P1B-ATPases in different organisms based on their structural protein features. Moreover, reliable models for prediction of the hyperaccumulating activity of unknown P1B-ATPase pumps were developed. Uncovering important structural features of hyperaccumulator pumps in this study has provided the knowledge required for future modification and engineering of these pumps by techniques such as site-directed mutagenesis.
Affiliation(s)
- E Ashrafi
- Department of Crop Production and Plant Breeding, College of Agriculture, Shiraz University, Shiraz, Iran
|
45
|
Shi W, Punta M, Bohon J, Sauder JM, D'Mello R, Sullivan M, Toomey J, Abel D, Lippi M, Passerini A, Frasconi P, Burley SK, Rost B, Chance MR. Characterization of metalloproteins by high-throughput X-ray absorption spectroscopy. Genome Res 2011; 21:898-907. [PMID: 21482623 DOI: 10.1101/gr.115097.110] [Citation(s) in RCA: 34] [Impact Index Per Article: 2.6] [Indexed: 11/24/2022]
Abstract
High-throughput X-ray absorption spectroscopy was used to measure transition metal content based on quantitative detection of X-ray fluorescence signals for 3879 purified proteins from several hundred different protein families generated by the New York SGX Research Center for Structural Genomics. Approximately 9% of the proteins analyzed showed the presence of transition metal atoms (Zn, Cu, Ni, Co, Fe, or Mn) in stoichiometric amounts. The method is highly automated and highly reliable based on comparison of the results to crystal structure data derived from the same protein set. To leverage the experimental metalloprotein annotations, we used a sequence-based de novo prediction method, MetalDetector, to identify Cys and His residues that bind to transition metals for the redundancy-reduced subset of 2411 sequences sharing <70% sequence identity and having at least one His or Cys. As the HT-XAS identifies metal type and protein binding, while the bioinformatics analysis identifies metal-binding residues, the results were combined to identify putative metal-binding sites in the proteins and their associated families. We explored the combination of this data with homology models to generate detailed structure models of metal-binding sites for representative proteins. Finally, we used extended X-ray absorption fine structure data from two of the purified Zn metalloproteins to validate predicted metalloprotein binding site structures. This combination of experimental and bioinformatics approaches provides comprehensive active site analysis on the genome scale for metalloproteins as a class, revealing new insights into metalloprotein structure and function.
Affiliation(s)
- Wuxian Shi
- New York SGX Research Center for Structural Genomics (NYSGXRC), Case Western Reserve University, Center for Proteomics and Bioinformatics, Case Center for Synchrotron Biosciences, Upton, New York 11973, USA.
|
46
|
Zhao W, Xu M, Liang Z, Ding B, Niu L, Liu H, Teng M. Structure-based de novo prediction of zinc-binding sites in proteins of unknown function. Bioinformatics 2011; 27:1262-8. [PMID: 21414989 DOI: 10.1093/bioinformatics/btr133] [Citation(s) in RCA: 32] [Impact Index Per Article: 2.5] [Indexed: 11/13/2022]
Abstract
MOTIVATION: Zinc-binding proteins are the most abundant metalloproteins in the Protein Data Bank (PDB). Accurate prediction of zinc-binding sites in proteins of unknown function may provide important clues for the inference of protein function. As zinc binding is often associated with characteristic 3D arrangements of zinc ligand residues, its prediction may benefit from using not only the sequence information but also the structure information of proteins. RESULTS: In this work, we present a structure-based method, TEMSP (3D TEmplate-based Metal Site Prediction), to predict zinc-binding sites. TEMSP significantly improves over previously reported best methods in predicting as many as possible true ligand residues for zinc with minimum overpredictions: if only those results in which all zinc ligand residues have been correctly predicted are defined as true positives, our method improves sensitivity from less than 30% to above 60%, and selectivity from around 25% to 80%. These results are for predictions based on apo state structures. In addition, the method can predict the zinc-bound local structures reliably, generating predictions useful for function inference. We applied TEMSP to 1888 protein structures of the 'Unknown Function' class in the PDB database. A number of zinc-binding sites have been discovered de novo, i.e. based solely on the protein structures. Using the predicted local structures of these sites, possible functional roles were analyzed. AVAILABILITY: TEMSP is freely available from http://netalign.ustc.edu.cn/temsp/.
Affiliation(s)
- Wei Zhao
- Hefei National Laboratory for Physical Sciences at Microscale and School of Life Sciences, University of Science and Technology of China and Key Laboratory of Structural Biology, Chinese Academy of Sciences, 96 Jinzhai Road, Hefei, Anhui, China
|
47
|
Zhou S, Schöneich C, Singh SK. Biologics formulation factors affecting metal leachables from stainless steel. AAPS PharmSciTech 2011; 12:411-21. [PMID: 21360314 DOI: 10.1208/s12249-011-9592-3] [Citation(s) in RCA: 42] [Impact Index Per Article: 3.2] [Received: 06/08/2010] [Accepted: 01/19/2011] [Indexed: 11/30/2022]
Abstract
An area of increasing concern and scientific scrutiny is the potential contamination of drug products by leachables entering the product during manufacturing and storage. These contaminants may either have a direct safety impact on the patients or act indirectly through the alteration of the physicochemical properties of the product. In the case of biotherapeutics, trace amounts of metal contaminants can arise from various sources, but mainly from contact with stainless steel (ss). The effects of various factors (buffer species, solution fill volume per unit contact surface area, metal chelators, and pH) on metal leachables from contact with ss over time were investigated individually. Three major metal leachables, iron, chromium, and nickel, were monitored by inductively coupled plasma-mass spectrometry because they are the major components of 316L ss. Iron was primarily used to evaluate the effect of each factor since it is the most abundant. It was observed that each studied factor exhibited its own effect on metal leachables from contact with ss. The effect of buffer species and pH exhibited temperature dependence over the studied temperature range. The metal leachables decreased with the increased fill volume (mL) per unit contact ss surface area (cm²) but a plateau was achieved at approximately 3 mL/cm². Metal chelators produced the strongest effect in facilitating metal leaching. In order to minimize the metal leachables and optimize biological product stability, each formulation factor must be evaluated for its impact, to balance its risk and benefit in achieving the target drug product shelf life.
|
48
|
Dutta A, Bahar I. Metal-binding sites are designed to achieve optimal mechanical and signaling properties. Structure 2011; 18:1140-8. [PMID: 20826340 DOI: 10.1016/j.str.2010.06.013] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.7] [Received: 02/19/2010] [Revised: 05/21/2010] [Accepted: 06/17/2010] [Indexed: 11/29/2022]
Abstract
Many proteins require bound metals to achieve their function. We take advantage of increasing structural data on metal-binding proteins to elucidate three properties: the involvement of metal-binding sites in the global dynamics of the protein, predicted by elastic network models, their exposure/burial to solvent, and their signal-processing properties indicated by Markovian stochastics analysis. Systematic analysis of a data set of 145 structures reveals that the residues that coordinate metal ions enjoy remarkably efficient and precise signal transduction properties. These properties are rationalized in terms of their physical properties: participation in hinge sites that control the softest modes collectively accessible to the protein and occupancy of central positions minimally exposed to solvent. Our observations suggest that metal-binding sites may have been evolutionarily selected to achieve optimum allosteric communication. They also provide insights into basic principles for designing metal-binding sites, which are verified to be met by recently designed de novo metal-binding proteins.
Affiliation(s)
- Anindita Dutta
- Department of Computational and Systems Biology, School of Medicine, University of Pittsburgh, 3064 BST3, 3501 Fifth Avenue, Pittsburgh, PA 15213, USA
|
49
|
Brylinski M, Skolnick J. FINDSITE-metal: integrating evolutionary information and machine learning for structure-based metal-binding site prediction at the proteome level. Proteins 2010; 79:735-51. [PMID: 21287609 DOI: 10.1002/prot.22913] [Citation(s) in RCA: 73] [Impact Index Per Article: 5.2] [Received: 08/04/2010] [Revised: 09/27/2010] [Accepted: 10/07/2010] [Indexed: 12/13/2022]
Abstract
The rapid accumulation of gene sequences, many of which are hypothetical proteins with unknown function, has stimulated the development of accurate computational tools for protein function prediction with evolution/structure-based approaches showing considerable promise. In this article, we present FINDSITE-metal, a new threading-based method designed specifically to detect metal-binding sites in modeled protein structures. Comprehensive benchmarks using different quality protein structures show that weakly homologous protein models provide sufficient structural information for quite accurate annotation by FINDSITE-metal. Combining structure/evolutionary information with machine learning results in highly accurate metal-binding annotations; for protein models constructed by TASSER, whose average Cα RMSD from the native structure is 8.9 Å, 59.5% (71.9%) of the best of top five predicted metal locations are within 4 Å (8 Å) from a bound metal in the crystal structure. For most of the targets, multiple metal-binding sites are detected with the best predicted binding site at rank 1 and within the top two ranks in 65.6% and 83.1% of the cases, respectively. Furthermore, for iron, copper, zinc, calcium, and magnesium ions, the binding metal can be predicted with high, typically 70% to 90%, accuracy. FINDSITE-metal also provides a set of confidence indexes that help assess the reliability of predictions. Finally, we describe the proteome-wide application of FINDSITE-metal that quantifies the metal-binding complement of the human proteome. FINDSITE-metal is freely available to the academic community at http://cssb.biology.gatech.edu/findsite-metal/.
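The "best of top five within 4 Å (8 Å)" figures in this abstract correspond to a simple distance-based hit rate. A minimal sketch of how such a metric could be computed follows; the function `hit_rate`, the data layout, and the toy coordinates are assumptions for illustration and do not reproduce the FINDSITE-metal benchmark.

```python
import math

def hit_rate(targets, cutoff, top_n=5):
    """Fraction of targets whose best top-N predicted site lies within `cutoff` Å of the true metal."""
    hits = 0
    for true_xyz, predictions in targets:
        # predictions are assumed to be ordered by rank, best first
        best = min(math.dist(true_xyz, p) for p in predictions[:top_n])
        if best <= cutoff:
            hits += 1
    return hits / len(targets)

# Hypothetical targets: (true metal position, ranked predicted positions), in Å.
targets = [
    ((0.0, 0.0, 0.0), [(1.0, 0.0, 0.0)]),                    # best distance 1 Å
    ((0.0, 0.0, 0.0), [(6.0, 0.0, 0.0), (10.0, 0.0, 0.0)]),  # best distance 6 Å
    ((0.0, 0.0, 0.0), [(9.0, 0.0, 0.0)]),                    # best distance 9 Å
    ((5.0, 5.0, 5.0), [(5.0, 5.0, 8.0), (0.0, 0.0, 0.0)]),   # best distance 3 Å
]
rate_4 = hit_rate(targets, cutoff=4.0)
rate_8 = hit_rate(targets, cutoff=8.0)
print(rate_4, rate_8)
```

Each target must have at least one prediction; a real evaluation would also need a rule for targets with several bound metals.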
Affiliation(s)
- Michal Brylinski
- Center for the Study of Systems Biology, Georgia Institute of Technology, Atlanta, Georgia 30318, USA
|
50
|
Shi W, Chance MR. Metalloproteomics: forward and reverse approaches in metalloprotein structural and functional characterization. Curr Opin Chem Biol 2010; 15:144-8. [PMID: 21130021 DOI: 10.1016/j.cbpa.2010.11.004] [Citation(s) in RCA: 56] [Impact Index Per Article: 4.0] [Received: 10/04/2010] [Revised: 10/29/2010] [Accepted: 11/01/2010] [Indexed: 11/20/2022]
Abstract
About one-third of all proteins are associated with a metal. Metalloproteomics is defined as the structural and functional characterization of metalloproteins on a genome-wide scale. The methodologies utilized in metalloproteomics, including both forward (bottom-up) and reverse (top-down) technologies, to provide information on the identity, quantity, and function of metalloproteins are discussed. Important techniques frequently employed in metalloproteomics include classical proteomic tools such as mass spectrometry and 2D gels, immobilized-metal affinity chromatography, bioinformatic sequence analysis and homology modeling, X-ray absorption spectroscopy and other synchrotron radiation based tools. Combinative applications of these techniques provide a powerful approach to understand the function of metalloproteins.
Affiliation(s)
- Wuxian Shi
- Center for Proteomics and Bioinformatics, Case Western Reserve University, 10900 Euclid Ave, BRB 113, Cleveland, OH 44106, USA
|