1
Ali F, Cai Q, Hu J, Zhang L, Hoare R, Monaghan SJ, Pang H. In silico analysis of AhyI protein and AI-1 inhibition using N-cis-octadec-9z-enoyl-l-homoserine lactone inhibitor in Aeromonas hydrophila. Microb Pathog 2021; 162:105356. [PMID: 34915138 DOI: 10.1016/j.micpath.2021.105356] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 09/21/2021] [Revised: 11/26/2021] [Accepted: 12/07/2021] [Indexed: 10/19/2022]
Abstract
AhyI is homologous to the protein LuxI and is conserved throughout bacterial species including Aeromonas hydrophila. A. hydrophila causes opportunistic infections in fish and other aquatic organisms. Furthermore, this pathogen not only poses a great risk for the aquaculture industry, but also for human public health. AhyI (expressing acylhomoserine lactone) is responsible for the biosynthesis of autoinducer-1 (AI-1), commonly referred to as a quorum sensing (QS) signaling molecule, which plays an essential role in bacterial communication. Studying protein structure is essential for understanding molecular mechanisms of pathogenicity in microbes. Here, we have deduced a predicted structure of AhyI protein and characterized its function using in silico methods to aid the development of new treatments for controlling A. hydrophila infections. In addition to modeling AhyI, an appropriate inhibitor molecule was identified via high throughput virtual screening (HTVS) using mcule drug-like databases. The AhyI-inhibitor N-cis-octadec-9Z-enoyl-l-Homoserine lactone was selected with the best drug score. In order to understand the pocket sites (ligand binding sites) and their interaction with the selected inhibitor, docking (predicted protein binding complex) servers were used and the selected ligand was docked with the predicted AhyI protein model. Remarkably, N-cis-octadec-9Z-enoyl-l-Homoserine lactone established interfaces with the protein via 16 residues (V24, R27, F28, R31, W34, V36, D45, M77, F82, T101, R102, L103, 104, V143, S145, and V168), which are involved with regulating mechanisms of inhibition. These proposed predictions suggest that this inhibitor molecule may be used as a novel drug candidate for the inhibition of auto-inducer-1 (AI-1) activity. The N-cis-octadec-9Z-enoyl-l-Homoserine lactone inhibitor molecule was studied on cultured bacteria to validate its potency against AI-1 production. At a concentration of 40 μM, optimal inhibition efficiency of AI-1 was observed in bacterial culture media. These results suggest that the inhibitor molecule N-cis-octadec-9Z-enoyl-l-Homoserine lactone is a competitive inhibitor of AI-1 biosynthesis.
Affiliation(s)
- Farman Ali
- Fujian Provincial Key Laboratory of Agro Ecological Processing and Safety Monitoring, College of Life Sciences, Fujian Agriculture and Forestry University, Fuzhou, 35002, China; Key Laboratory of Crop Ecology and Molecular Physiology (Fujian Agriculture and Forestry University) Fujian Province University, Fuzhou, 35002, China
- Qilan Cai
- Fujian Provincial Key Laboratory of Agro Ecological Processing and Safety Monitoring, College of Life Sciences, Fujian Agriculture and Forestry University, Fuzhou, 35002, China; Key Laboratory of Crop Ecology and Molecular Physiology (Fujian Agriculture and Forestry University) Fujian Province University, Fuzhou, 35002, China
- Jialing Hu
- College of Fisheries, Guangdong Ocean University, Zhanjiang, 524025, China; Guangdong Provincial Key Laboratory of Pathogenic Biology and Epidemiology for Aquatic Economic Animal, Key Laboratory of Control for Disease of Aquatic Animals of Guangdong Higher Education Institutes, Zhanjiang, 524025, China
- Lishan Zhang
- Fujian Provincial Key Laboratory of Agro Ecological Processing and Safety Monitoring, College of Life Sciences, Fujian Agriculture and Forestry University, Fuzhou, 35002, China; Key Laboratory of Crop Ecology and Molecular Physiology (Fujian Agriculture and Forestry University) Fujian Province University, Fuzhou, 35002, China
- Rowena Hoare
- Institute of Aquaculture, University of Stirling, Stirling, FK9 4LA, Scotland, UK
- Sean J Monaghan
- Institute of Aquaculture, University of Stirling, Stirling, FK9 4LA, Scotland, UK
- Huanying Pang
- College of Fisheries, Guangdong Ocean University, Zhanjiang, 524025, China; Guangdong Provincial Key Laboratory of Pathogenic Biology and Epidemiology for Aquatic Economic Animal, Key Laboratory of Control for Disease of Aquatic Animals of Guangdong Higher Education Institutes, Zhanjiang, 524025, China.

2
Ghadermarzi S, Krawczyk B, Song J, Kurgan L. XRRpred: Accurate Predictor of Crystal Structure Quality from Protein Sequence. Bioinformatics 2021; 37:4366-4374. [PMID: 34247234 DOI: 10.1093/bioinformatics/btab509] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 03/26/2021] [Revised: 06/10/2021] [Accepted: 07/06/2021] [Indexed: 11/12/2022]
Abstract
MOTIVATION: X-ray crystallography was used to produce nearly 90% of protein structures. These efforts were supported by numerous sequence-based tools that accurately predict crystallizable proteins. However, protein structures vary widely in their quality, typically measured with resolution and R-free. This impacts the ability to use these structures for some applications, including rational drug design and molecular docking, and motivates development of methods that accurately predict structure quality. RESULTS: We introduce XRRpred, the first predictor of the resolution and R-free values from protein sequences. XRRpred relies on original sequence profiles, hand-crafted features, empirically selected and parametrized regressors, and modern resampling techniques. Using an independent test dataset, we show that XRRpred provides accurate predictions of resolution and R-free. We demonstrate that XRRpred's predictions correctly model the relationship between the resolution and R-free and reproduce structure quality relations between structural classes of proteins. We also show that XRRpred significantly outperforms indirect alternative ways to predict the structure quality that include predictors of crystallization propensity and an alignment-based approach. XRRpred is available as a convenient webserver that allows batch predictions and offers informative visualization of the results. AVAILABILITY: http://biomine.cs.vcu.edu/servers/XRRPred/.
Affiliation(s)
- Sina Ghadermarzi
- Department of Computer Science, Virginia Commonwealth University, Richmond, VA 23284, USA
- Bartosz Krawczyk
- Department of Computer Science, Virginia Commonwealth University, Richmond, VA 23284, USA
- Jiangning Song
- Monash Biomedicine Discovery Institute and Department of Biochemistry and Molecular Biology, Monash University, Melbourne, VIC 3800, Australia; Monash Centre for Data Science, Faculty of Information Technology, Monash University, Melbourne, VIC 3800, Australia
- Lukasz Kurgan
- Department of Computer Science, Virginia Commonwealth University, Richmond, VA 23284, USA

3
Structural genomics and the Protein Data Bank. J Biol Chem 2021; 296:100747. [PMID: 33957120 PMCID: PMC8166929 DOI: 10.1016/j.jbc.2021.100747] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 02/17/2021] [Revised: 04/16/2021] [Accepted: 04/30/2021] [Indexed: 12/14/2022]
Abstract
The field of Structural Genomics arose over the last 3 decades to address a large and rapidly growing divergence between microbial genomic, functional, and structural data. Several international programs took advantage of the vast genomic sequence information and evaluated the feasibility of structure determination for expanded and newly discovered protein families. As a consequence, structural genomics has developed structure-determination pipelines and applied them to a wide range of novel, uncharacterized proteins, often from “microbial dark matter,” and later to proteins from human pathogens. Advances were especially needed in protein production and rapid de novo structure solution. The experimental three-dimensional models were promptly made public, facilitating structure determination of other members of the family and helping to understand their molecular and biochemical functions. Improvements in experimental methods and databases resulted in fast progress in molecular and structural biology. The Protein Data Bank structure repository played a central role in the coordination of structural genomics efforts and the structural biology community as a whole. It facilitated development of standards and validation tools essential for maintaining high quality of deposited structural data.

4
Yadav M, Khandelwal S. Homology modeling and molecular dynamics dimulation study of β carbonic anhydrase of Ascaris lumbricoides. Bioinformation 2019; 15:572-578. [PMID: 31719767 PMCID: PMC6822520 DOI: 10.6026/97320630015572] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Received: 07/12/2019] [Accepted: 08/12/2019] [Indexed: 11/23/2022]
Abstract
Ascaris lumbricoides is the prevalent parasite causing ascariasis by infecting the human alimentary tract. It is common in the jejunum of the small intestine. Therefore, it is of interest to describe the target protein β carbonic anhydrase involved in ascariasis. Carbonic anhydrases (CAs), which are metalloenzymes, are encoded by six evolutionarily divergent gene families, α, β, γ, δ, ζ, and η, which contain a zinc ion in their catalytic active site. β-CA is found in plants, algae, fungi, bacteria, protozoans, arthropods, and nematodes and is completely absent from vertebrate genomes. The absence of β-CA protein in vertebrates makes the enzyme an important target for inhibitory studies against helminthic infection. Sequence-to-function information and 3D structure data for β-CA of Ascaris lumbricoides are not available. Hence, we modeled the 3D structure (using PRIME) for molecular dynamics simulation studies (using Desmond from the Schrödinger software) and interaction analysis (using the STRING database). The β-CA protein was found to interact with the carbonic anhydrase protein family along with the T27A3, alh13, mtp18, T22F3, and gcy29 proteins. These results provide insights for understanding the functional and biological roles played by β-CA. Hence, these data are useful for the design of drugs for ascariasis.
Affiliation(s)
- Mahima Yadav
- Amity Institute of Biotechnology, Amity University Haryana, Gurgaon-122413, India
- Shikha Khandelwal
- Amity Institute of Biotechnology, Amity University Haryana, Gurgaon-122413, India

5
Pellizza L, Smal C, Rodrigo G, Arán M. Codon usage clusters correlation: towards protein solubility prediction in heterologous expression systems in E. coli. Sci Rep 2018; 8:10618. [PMID: 30006617 PMCID: PMC6045634 DOI: 10.1038/s41598-018-29035-z] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.4] [Received: 03/20/2018] [Accepted: 06/21/2018] [Indexed: 12/15/2022]
Abstract
Production of soluble recombinant proteins is crucial to the development of industry and basic research. However, aggregation due to incorrect folding of the nascent polypeptides is still a major bottleneck. Understanding the factors governing protein solubility is important to grasp the underlying mechanisms and improve the design of recombinant proteins. Here we show a quantitative study of the expression and solubility of a set of proteins from Bizionia argentinensis. Through the analysis of different features known to modulate protein production, we defined two parameters based on the %MinMax algorithm to compare codon usage clusters between the host and the target genes. We demonstrate that the absolute difference between all %MinMax frequencies of the host and the target gene is significantly negatively correlated with protein expression levels. Most importantly, a strong positive correlation between solubility and the degree of conservation of codon usage clusters is observed for two independent datasets. Moreover, we show that this correlation is higher in codon usage clusters involved in less compact protein secondary structure regions. Our results provide important tools for protein design and support the notion that codon usage may dictate translation rate and modulate co-translational folding.
Affiliation(s)
- Leonardo Pellizza
- Laboratory of Nuclear Magnetic Resonance, Fundación Instituto Leloir, IIBBA-CONICET, Av. Patricias Argentinas 435, C1405BWE, CABA, Argentina
- Clara Smal
- Laboratory of Nuclear Magnetic Resonance, Fundación Instituto Leloir, IIBBA-CONICET, Av. Patricias Argentinas 435, C1405BWE, CABA, Argentina
- Guido Rodrigo
- Laboratory of Nuclear Magnetic Resonance, Fundación Instituto Leloir, IIBBA-CONICET, Av. Patricias Argentinas 435, C1405BWE, CABA, Argentina
- Martín Arán
- Laboratory of Nuclear Magnetic Resonance, Fundación Instituto Leloir, IIBBA-CONICET, Av. Patricias Argentinas 435, C1405BWE, CABA, Argentina.

6
Guleria S, Walia A, Chauhan A, Shirkot CK. Molecular characterization of alkaline protease of Bacillus amyloliquefaciens SP1 involved in biocontrol of Fusarium oxysporum. Int J Food Microbiol 2016; 232:134-43. [PMID: 27294522 DOI: 10.1016/j.ijfoodmicro.2016.05.030] [Citation(s) in RCA: 25] [Impact Index Per Article: 2.8] [Received: 12/21/2015] [Revised: 05/16/2016] [Accepted: 05/30/2016] [Indexed: 11/18/2022]
Abstract
An alkaline protease gene was amplified from genomic DNA of Bacillus amyloliquefaciens SP1, which was involved in effective biocontrol of Fusarium oxysporum. We investigated the antagonistic capacity of the protease of B. amyloliquefaciens SP1 under in vitro conditions. The 5.62-fold purified enzyme, with a specific activity of 607.69 U/mg, showed 24.14% growth inhibition of F. oxysporum. However, no antagonistic activity was found after addition of the protease inhibitor PMSF (15 mM) to the purified enzyme. An 1149 bp nucleotide sequence of the protease gene encoded 382 amino acids, with a molecular mass of 43 kDa and a calculated isoelectric point of 9.29. Analysis of the deduced amino acid sequence revealed high homology (86%) with subtilisin E of Bacillus subtilis. The B. amyloliquefaciens SP1 protease gene was expressed in Escherichia coli BL21. The expressed protease was secreted into the culture medium by E. coli and exhibited optimum activity at pH 8.0 and 60°C. The most reliable three-dimensional structure of the alkaline protease was determined using the Phyre2 server and validated on the basis of the Ramachandran plot and ERRAT value. The expression and structure prediction of the enzyme offer potential value for commercial application in agriculture and industry.
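The sequence figures quoted in this abstract can be cross-checked with simple arithmetic. The sketch below is only an illustration: it assumes the 1149 bp include one stop codon, and it uses the common rule of thumb of roughly 110 Da per residue, neither of which is stated by the authors. The function names are hypothetical.

```python
# Rule-of-thumb average residue mass; an assumption, not a value from the abstract.
AVERAGE_RESIDUE_MASS_DA = 110.0


def residues_from_orf(orf_length_bp, includes_stop=True):
    """Return the number of amino acids encoded by an ORF of the given length."""
    if orf_length_bp % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    codons = orf_length_bp // 3
    # The stop codon occupies one codon but encodes no residue.
    return codons - 1 if includes_stop else codons


def approx_mass_kda(n_residues):
    """Rough molecular mass estimate in kDa from the residue count."""
    return n_residues * AVERAGE_RESIDUE_MASS_DA / 1000.0


n_residues = residues_from_orf(1149)
print(n_residues, round(approx_mass_kda(n_residues), 1))
```

Under these assumptions the residue count comes out at 382, matching the abstract, and the rough mass estimate of about 42 kDa is close to the reported 43 kDa.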
Affiliation(s)
- Shiwani Guleria
- Department of Microbiology, DAV University, Jalandhar, Punjab 144012, India.
- Abhishek Walia
- Department of Microbiology, DAV University, Jalandhar, Punjab 144012, India.
- Anjali Chauhan
- Department of Basic Sciences (Microbiology Section), Dr. Y. S. Parmar University of Horticulture and Forestry, Nauni, Solan 173230 (H.P.), India.
- C K Shirkot
- Department of Basic Sciences (Microbiology Section), Dr. Y. S. Parmar University of Horticulture and Forestry, Nauni, Solan 173230 (H.P.), India.

7
Tokmakov AA, Kurotani A, Shirouzu M, Fukami Y, Yokoyama S. Bioinformatics analysis and optimization of cell-free protein synthesis. Methods Mol Biol 2014; 1118:17-33. [PMID: 24395407 DOI: 10.1007/978-1-62703-782-2_2] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Indexed: 12/12/2022]
Abstract
Cell-free protein synthesis offers substantial advantages over cell-based expression, allowing direct access to the protein synthetic reaction and meticulous control over the reaction conditions. Recently, we identified a number of statistically significant correlations between calculated and predicted properties of amino acid sequences and their amenability to heterologous cell-free expression. These correlations can be of practical use for predicting expression success and optimizing cell-free protein synthesis. In this chapter, we describe our approach and demonstrate how computational and predictive bioinformatics can be used to analyze and optimize cell-free protein expression.

8
DePietro PJ, Julfayev ES, McLaughlin WA. Quantification of the impact of PSI:Biology according to the annotations of the determined structures. BMC Structural Biology 2013; 13:24. [PMID: 24139526 PMCID: PMC4016320 DOI: 10.1186/1472-6807-13-24] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 06/14/2013] [Accepted: 10/14/2013] [Indexed: 11/23/2022]
Abstract
Background: Protein Structure Initiative:Biology (PSI:Biology) is the third phase of PSI where protein structures are determined in high-throughput to characterize their biological functions. The transition to the third phase entailed the formation of PSI:Biology Partnerships which are composed of structural genomics centers and biomedical science laboratories. We present a method to examine the impact of protein structures determined under the auspices of PSI:Biology by measuring their rates of annotations. The mean numbers of annotations per structure and per residue are examined. These are designed to provide measures of the amount of structure to function connections that can be leveraged from each structure. Results: One result is that PSI:Biology structures are found to have a higher rate of annotations than structures determined during the first two phases of PSI. A second result is that the subset of PSI:Biology structures determined through PSI:Biology Partnerships have a higher rate of annotations than those determined exclusive of those partnerships. Both results hold when the annotation rates are examined either at the level of the entire protein or for annotations that are known to fall at specific residues within the portion of the protein that has a determined structure. Conclusions: We conclude that PSI:Biology determines structures that are estimated to have a higher degree of biomedical interest than those determined during the first two phases of PSI based on a broad array of biomedical annotations. For the PSI:Biology Partnerships, we see that there is an associated added value that represents part of the progress toward the goals of PSI:Biology. We interpret the added value to mean that team-based structural biology projects that utilize the expertise and technologies of structural genomics centers together with biological laboratories in the community are conducted in a synergistic manner. We show that the annotation rates can be used in conjunction with established metrics, i.e. the numbers of structures and impact of publication records, to monitor the progress of PSI:Biology towards its goals of examining structure to function connections of high biomedical relevance. The metric provides an objective means to quantify the overall impact of PSI:Biology as it uses biomedical annotations from external sources.
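The two annotation rates named in this abstract, per structure and per residue, can be illustrated with a minimal sketch. The input layout, the function name, and the pooled per-residue definition (total annotations divided by total residues) are assumptions made for illustration; the abstract does not specify how the authors computed them, and the numbers below are toy data.

```python
def annotation_rates(structures):
    """Compute annotation rates for a set of structures.

    structures: list of (n_annotations, n_residues) tuples, one per structure.
    Returns (mean annotations per structure, annotations per residue),
    where the per-residue rate is pooled over all structures (an assumption).
    """
    if not structures:
        raise ValueError("at least one structure is required")
    total_annotations = sum(n_ann for n_ann, _ in structures)
    total_residues = sum(n_res for _, n_res in structures)
    per_structure = total_annotations / len(structures)
    per_residue = total_annotations / total_residues
    return per_structure, per_residue


# Toy data: three hypothetical structures.
toy_structures = [(12, 300), (4, 100), (8, 400)]
per_structure, per_residue = annotation_rates(toy_structures)
print(per_structure, per_residue)
```

Comparing these two numbers between groups of structures (for example, partnership versus non-partnership targets) is the kind of comparison the abstract describes.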
Affiliation(s)
- William A McLaughlin
- Department of Basic Science, The Commonwealth Medical College, 525 Pine Street, Scranton, PA 18509, USA.

9
Pieper U, Schlessinger A, Kloppmann E, Chang GA, Chou JJ, Dumont ME, Fox BG, Fromme P, Hendrickson WA, Malkowski MG, Rees DC, Stokes DL, Stowell MHB, Wiener MC, Rost B, Stroud RM, Stevens RC, Sali A. Coordinating the impact of structural genomics on the human α-helical transmembrane proteome. Nat Struct Mol Biol 2013; 20:135-8. [PMID: 23381628 DOI: 10.1038/nsmb.2508] [Citation(s) in RCA: 56] [Impact Index Per Article: 4.7] [Received: 08/16/2012] [Accepted: 01/09/2013] [Indexed: 12/19/2022]
Affiliation(s)
- Ursula Pieper
- Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, San Francisco, California, USA

10
Tokmakov AA, Kurotani A, Takagi T, Toyama M, Shirouzu M, Fukami Y, Yokoyama S. Multiple post-translational modifications affect heterologous protein synthesis. J Biol Chem 2012; 287:27106-16. [PMID: 22674579 DOI: 10.1074/jbc.m112.366351] [Citation(s) in RCA: 49] [Impact Index Per Article: 3.8] [Indexed: 12/31/2022]
Abstract
Post-translational modifications (PTMs) are required for proper folding of many proteins. The low capacity for PTMs hinders the production of heterologous proteins in the widely used prokaryotic systems of protein synthesis. Until now, a systematic and comprehensive study concerning the specific effects of individual PTMs on heterologous protein synthesis has not been presented. To address this issue, we expressed 1488 human proteins and their domains in a bacterial cell-free system, and we examined the correlation of the expression yields with the presence of multiple PTM sites bioinformatically predicted in these proteins. This approach revealed a number of previously unknown statistically significant correlations. Prediction of some PTMs, such as myristoylation, glycosylation, palmitoylation, and disulfide bond formation, was found to significantly worsen protein amenability to soluble expression. The presence of other PTMs, such as aspartyl hydroxylation, C-terminal amidation, and Tyr sulfation, did not correlate with the yield of heterologous protein expression. Surprisingly, the predicted presence of several PTMs, such as phosphorylation, ubiquitination, SUMOylation, and prenylation, was associated with the increased production of properly folded soluble proteins. The plausible rationales for the existence of the observed correlations are presented. Our findings suggest that identification of potential PTMs in polypeptide sequences can be of practical use for predicting expression success and optimizing heterologous protein synthesis. In sum, this study provides the most compelling evidence so far for the role of multiple PTMs in the stability and solubility of heterologously expressed recombinant proteins.
Affiliation(s)
- Alexander A Tokmakov
- RIKEN Systems and Structural Biology Center, University of Tokyo, Bunkyo, Tokyo 113-0033, Japan.

11
Syed R, Rani R, Sabeena, Masoodi TA, Shafi G, Alharbi K. Functional analysis and structure determination of alkaline protease from Aspergillus flavus. Bioinformation 2012; 8:175-80. [PMID: 22419836 PMCID: PMC3301997 DOI: 10.6026/97320630008175] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.9] [Received: 02/06/2012] [Accepted: 02/07/2012] [Indexed: 11/23/2022]
Abstract
Proteases are among the highest-value commercial enzymes, as they have broad applications in the food, pharmaceutical, detergent, and dairy industries and serve as vital tools in determining the structure of proteins and polypeptides. The multiple applications of these enzymes have stimulated interest in discovering proteases with novel properties and considerable advancement of basic research into these enzymes. A broad understanding of the active site of the enzyme and of the mechanism of its inactivation is essential for delineating its structure-function relationship. Primary structure analysis of alkaline protease showed 42% of its content to be alpha helix, making it stable for three-dimensional structure modeling. A homology model of alkaline protease has been constructed using the X-ray structure (3F7O) as a template and SWISS-MODEL as the workspace. The model was validated by ProSA, SAVES, PROCHECK, PROSAII and RMSD. The results showed that the final refined model is reliable. It has 53% amino acid sequence identity with the template, an RMSD of 0.24 Å, and a Z-score of -7.53. The Ramachandran plot analysis showed that conformations for 83.4% of amino acid residues are within the most favored regions and only 0.4% are in the disallowed regions.
Affiliation(s)
- Rabbani Syed
- College of Applied Medical Sciences, King Saud University, Riyadh, Saudi Arabia
- Roja Rani
- Biotechnology Department, Acharya Nagarjuna University, Guntur, AP, India
- Sabeena
- Jawaharlal Nehru Institute of Advanced Studies, Hyderabad, India
- Tariq Ahmad Masoodi
- College of Applied Medical Sciences, King Saud University, Riyadh, Saudi Arabia
- Gowher Shafi
- Institute of Genetics and Hospital for Genetic Diseases, Hyderabad, India
- Khalid Alharbi
- College of Applied Medical Sciences, King Saud University, Riyadh, Saudi Arabia

12
Kim Y, Babnigg G, Jedrzejczak R, Eschenfeldt WH, Li H, Maltseva N, Hatzos-Skintges C, Gu M, Makowska-Grzyska M, Wu R, An H, Chhor G, Joachimiak A. High-throughput protein purification and quality assessment for crystallization. Methods 2011; 55:12-28. [PMID: 21907284 DOI: 10.1016/j.ymeth.2011.07.010] [Citation(s) in RCA: 118] [Impact Index Per Article: 8.4] [Received: 02/28/2011] [Revised: 07/14/2011] [Accepted: 07/14/2011] [Indexed: 12/31/2022]
Abstract
The ultimate goal of structural biology is to understand the structural basis of proteins in cellular processes. In structural biology, the most critical issue is the availability of high-quality samples. "Structural biology-grade" proteins must be generated in the quantity and quality suitable for structure determination using X-ray crystallography or nuclear magnetic resonance (NMR) spectroscopy. The purification procedures must reproducibly yield homogeneous proteins or their derivatives containing marker atom(s) in milligram quantities. The choice of protein purification and handling procedures plays a critical role in obtaining high-quality protein samples. With structural genomics emphasizing a genome-based approach in understanding protein structure and function, a number of unique structures covering most of the protein folding space have been determined and new technologies with high efficiency have been developed. At the Midwest Center for Structural Genomics (MCSG), we have developed semi-automated protocols for high-throughput parallel protein expression and purification. A protein, expressed as a fusion with a cleavable affinity tag, is purified in two consecutive immobilized metal affinity chromatography (IMAC) steps: (i) the first step is an IMAC coupled with buffer-exchange, or size exclusion chromatography (IMAC-I), followed by the cleavage of the affinity tag using the highly specific Tobacco Etch Virus (TEV) protease; (ii) the second step is IMAC and buffer exchange (IMAC-II) to remove the cleaved tag and tagged TEV protease. These protocols have been implemented on multidimensional chromatography workstations and, as we have shown, many proteins can be successfully produced at large scale. All methods and protocols used for purification, some developed by MCSG, others adopted and integrated into the MCSG purification pipeline and more recently the Center for Structural Genomics of Infectious Diseases (CSGID) purification pipeline, are discussed in this chapter.
Affiliation(s)
- Youngchang Kim
- Midwest Center for Structural Genomics, Biosciences Division, Argonne National Laboratory, Argonne, IL 60439, USA

13
Kherraz K, Kherraz K, Kameli A. Homology modeling of Ferredoxin-nitrite reductase from Arabidopsis thaliana. Bioinformation 2011; 6:115-9. [PMID: 21584187 PMCID: PMC3089885 DOI: 10.6026/97320630006115] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.8] [Received: 12/02/2010] [Accepted: 12/24/2010] [Indexed: 12/02/2022]
Abstract
Nitrogen is one of the major growth-limiting nutrients for plants. The main source of nitrogen in most of the higher plants is nitrate taken up through roots. Nitrate can be reduced both in the chloroplasts (photosynthetic tissues) and in proplastids (nonphotosynthetic tissues such as roots). Ferredoxin-nitrite reductase (NiR) catalyses the reduction of nitrite to ammonium in the second step of the nitrate-assimilation pathway. A homology model of Ferredoxin-nitrite reductase has been constructed using the X-ray structure (PDB code: 2akj) as a template and MODELLER 9v5 software. The resulting model was assessed by PROCHECK, PROSAII and RMSD, which showed that the final refined model is reliable: it has 81% amino acid sequence identity with the template, an RMSD of 0.2 Å, and a Z-score of -10.37. The Ramachandran plot analysis showed that conformations for 99.5% of amino acid residues are within the most favored regions. The model could prove useful in further functional characterization of this protein. ABBREVIATIONS: PDB - Protein Data Bank, NMR - Nuclear Magnetic Resonance, NiR - Nitrite Reductase, RMSD - Root Mean Squared Deviation, Fd - ferredoxin.
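For readers unfamiliar with the RMSD figure quoted here, the following minimal sketch shows how a root mean squared deviation is computed over paired atom coordinates. It assumes the model and template are already superimposed and that atoms are matched one to one; the function name and the toy coordinates are hypothetical and are not taken from the paper.

```python
import math


def rmsd(coords_a, coords_b):
    """Root mean squared deviation between two equally sized lists of (x, y, z).

    Assumes the two structures are already superimposed and atom-paired;
    no fitting or alignment is performed here.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared_sum = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        squared_sum += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(squared_sum / len(coords_a))


# Toy example: two atoms, each displaced by 0.2 Å along z.
model = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
template = [(0.0, 0.0, 0.2), (1.0, 0.0, -0.2)]
print(round(rmsd(model, template), 3))
```

In practice the superposition step matters as much as the formula, so values reported by modeling tools depend on which atoms are compared and how the structures were fitted.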
Affiliation(s)
- Karim Kherraz
- Biology department, Ecole Normale Superieure, ENS-Kouba, PB 92, Algiers, Algeria
- Khaled Kherraz
- Biology department, Ecole Normale Superieure, ENS-Kouba, PB 92, Algiers, Algeria
- Abdelkrim Kameli
- Biology department, Ecole Normale Superieure, ENS-Kouba, PB 92, Algiers, Algeria

14
Julfayev ES, McLaughlin RJ, Tao YP, McLaughlin WA. A new approach to assess and predict the functional roles of proteins across all known structures. Journal of Structural and Functional Genomics 2011; 12:9-20. [PMID: 21445639 PMCID: PMC3089730 DOI: 10.1007/s10969-011-9105-3] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Received: 12/17/2010] [Accepted: 03/14/2011] [Indexed: 12/11/2022]
Abstract
The three-dimensional atomic structures of proteins provide information regarding their function, and codified relationships between structure and function enable the assessment of function from structure. In the current study, a new data mining tool was implemented that checks current gene ontology (GO) annotations and predicts new ones across all the protein structures available in the Protein Data Bank (PDB). The tool overcomes some of the challenges of utilizing large amounts of protein annotation and measurement information to form correspondences between protein structure and function. Protein attributes were extracted from the Structural Biology Knowledgebase and open source biological databases. Based on the presence or absence of a given set of attributes, a given protein's functional annotations were inferred. The results show that attributes derived from the three-dimensional structures of proteins enhanced predictions over those using attributes derived only from primary amino acid sequence. Some predictions reflected known but not completely documented GO annotations. For example, predictions for the GO term for copper ion binding reflected information that a copper ion was known to interact with the protein, based on a ligand interaction database. Other predictions were novel and require further experimental validation. These include predictions for proteins labeled as unknown function in the PDB. Two examples are a role in the regulation of transcription for the protein AF1396 from Archaeoglobus fulgidus and a role in RNA metabolism for the protein psuG from Thermotoga maritima.
Affiliation(s)
- Elchin S. Julfayev
- Department of Basic Science, The Commonwealth Medical College, 525 Pine Street, Scranton, PA 18509 USA
- Ryan J. McLaughlin
- Department of Basic Science, The Commonwealth Medical College, 525 Pine Street, Scranton, PA 18509 USA
- Yi-Ping Tao
- Department of Chemistry and Chemical Biology, Rutgers, The State University of New Jersey, 610 Taylor Road, Piscataway, NJ 08854-8087 USA
- William A. McLaughlin
- Department of Basic Science, The Commonwealth Medical College, 525 Pine Street, Scranton, PA 18509 USA
|
15
|
Brooks MA, Gewartowski K, Mitsiki E, Létoquart J, Pache RA, Billier Y, Bertero M, Corréa M, Czarnocki-Cieciura M, Dadlez M, Henriot V, Lazar N, Delbos L, Lebert D, Piwowarski J, Rochaix P, Böttcher B, Serrano L, Séraphin B, van Tilbeurgh H, Aloy P, Perrakis A, Dziembowski A. Systematic bioinformatics and experimental validation of yeast complexes reduces the rate of attrition during structural investigations. Structure 2010; 18:1075-82. [PMID: 20826334 DOI: 10.1016/j.str.2010.08.001] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.6] [Received: 02/21/2010] [Revised: 06/30/2010] [Accepted: 08/07/2010] [Indexed: 10/19/2022]
Abstract
For high-throughput structural studies of protein complexes of composition inferred from proteomics data, it is crucial that candidate complexes are selected accurately. Herein, we exemplify a procedure that combines a bioinformatics tool for complex selection with in vivo validation, to deliver structural results in a medium-throughput manner. We have selected a set of 20 yeast complexes, which were predicted to be feasible by an automated bioinformatics algorithm, by manual inspection of primary data, or by literature searches. These complexes were validated with two straightforward and efficient biochemical assays, and heterologous expression technologies of complex components were then used to produce the complexes to assess their feasibility experimentally. Approximately one-half of the selected complexes were useful for structural studies, and we detail one particular success story. Our results underscore the importance of accurate target selection and validation in avoiding transient, unstable, or simply nonexistent complexes from the outset.
Affiliation(s)
- Mark A Brooks
- IBBMC-CNRS UMR8619, IFR 115, Bât. 430, Université Paris-Sud, 91405 Orsay, France
|
16
|
Abstract
The drug discovery process mainly relies on the experimental high-throughput screening of huge compound libraries in their pursuit of new active compounds. However, spiraling research and development costs and unimpressive success rates have driven the development of more rational, efficient, and cost-effective methods. With the increasing availability of protein structural information, advancement in computational algorithms, and faster computing resources, in silico docking-based methods are increasingly used to design smaller and focused compound libraries in order to reduce screening efforts and costs and at the same time identify active compounds with a better chance of progressing through the optimization stages. This chapter is a primer on the various docking-based methods developed for the purpose of structure-based library design. Our aim is to elucidate some basic terms related to the docking technique and explain the methodology behind several docking-based library design methods. This chapter also aims to guide the novice computational practitioner by laying out the general steps involved for such an exercise. Selected successful case studies conclude this chapter.
Affiliation(s)
- Claudio N Cavasotto
- School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX, USA.
|
17
|
Oke M, Carter LG, Johnson KA, Liu H, McMahon SA, Yan X, Kerou M, Weikart ND, Kadi N, Sheikh MA, Schmelz S, Dorward M, Zawadzki M, Cozens C, Falconer H, Powers H, Overton IM, van Niekerk CAJ, Peng X, Patel P, Garrett RA, Prangishvili D, Botting CH, Coote PJ, Dryden DTF, Barton GJ, Schwarz-Linek U, Challis GL, Taylor GL, White MF, Naismith JH. The Scottish Structural Proteomics Facility: targets, methods and outputs. J Struct Funct Genomics 2010; 11:167-80. [PMID: 20419351 PMCID: PMC2883930 DOI: 10.1007/s10969-010-9090-y] [Citation(s) in RCA: 100] [Impact Index Per Article: 6.7] [Received: 03/09/2010] [Accepted: 04/06/2010] [Indexed: 12/19/2022]
Abstract
The Scottish Structural Proteomics Facility was funded to develop a laboratory scale approach to high throughput structure determination. The effort was successful in that over 40 structures were determined. These structures and the methods harnessed to obtain them are reported here. This report reflects on the value of automation but also on the continued requirement for a high degree of scientific and technical expertise. The efficiency of the process poses challenges to the current paradigm of structural analysis and publication. In the 5 year period we published ten peer-reviewed papers reporting structural data arising from the pipeline. Nevertheless, the number of structures solved exceeded our ability to analyse and publish each new finding. By reporting the experimental details and depositing the structures we hope to maximize the impact of the project by allowing others to follow up the relevant biology.
Affiliation(s)
- Muse Oke
- Biomedical Sciences Research Complex, University of St Andrews, St Andrews, KY16 9ST UK
- Lester G. Carter
- Biomedical Sciences Research Complex, University of St Andrews, St Andrews, KY16 9ST UK
- Present Address: Stanford Synchrotron Radiation Light Source, 2575 Sand Hill Road, MS 69, Menlo Park, CA 94025 USA
- Kenneth A. Johnson
- Biomedical Sciences Research Complex, University of St Andrews, St Andrews, KY16 9ST UK
- Present Address: The Norwegian Structural Biology Centre, University of Tromsø, 9037 Tromsø, Norway
- Huanting Liu
- Biomedical Sciences Research Complex, University of St Andrews, St Andrews, KY16 9ST UK
- Stephen A. McMahon
- Biomedical Sciences Research Complex, University of St Andrews, St Andrews, KY16 9ST UK
- Xuan Yan
- Biomedical Sciences Research Complex, University of St Andrews, St Andrews, KY16 9ST UK
- Melina Kerou
- Biomedical Sciences Research Complex, University of St Andrews, St Andrews, KY16 9ST UK
- Nadine D. Weikart
- Biomedical Sciences Research Complex, University of St Andrews, St Andrews, KY16 9ST UK
- Present Address: Faculty of Chemistry, Technische Universität Dortmund, Otto-Hahn-Str. 6, 44227 Dortmund, Germany
- Nadia Kadi
- Department of Chemistry, University of Warwick, Coventry, CV4 7AL UK
- Present Address: Institute of Cancer Research, 15 Cotswold Road, Belmont, Sutton, Surrey, SM2 5NG UK
- Md. Arif Sheikh
- Biomedical Sciences Research Complex, University of St Andrews, St Andrews, KY16 9ST UK
- Stefan Schmelz
- Biomedical Sciences Research Complex, University of St Andrews, St Andrews, KY16 9ST UK
- Mark Dorward
- Biomedical Sciences Research Complex, University of St Andrews, St Andrews, KY16 9ST UK
- Present Address: Division of Signal Transduction Therapy, College of Life Sciences, University of Dundee, Dundee, DD1 5EH Scotland, UK
- Michal Zawadzki
- Biomedical Sciences Research Complex, University of St Andrews, St Andrews, KY16 9ST UK
- Present Address: Syngenta Ltd, Jealott’s Hill International Research Centre, Bracknell, Berkshire, RG42 6EY UK
- Christopher Cozens
- Biomedical Sciences Research Complex, University of St Andrews, St Andrews, KY16 9ST UK
- Present Address: Medical Research Council Laboratory of Molecular Biology, Hills Road, Cambridge, CB2 0QH UK
- Helen Falconer
- Biomedical Sciences Research Complex, University of St Andrews, St Andrews, KY16 9ST UK
- Present Address: Institute of Structural and Molecular Biology, Edinburgh University, Kings Buildings, Edinburgh, EH9 3JR UK
- Helen Powers
- Biomedical Sciences Research Complex, University of St Andrews, St Andrews, KY16 9ST UK
- Ian M. Overton
- Division of Biological Chemistry and Drug Discovery, College of Life Sciences, University of Dundee, Dundee, DD1 5EH Scotland, UK
- Present Address: MRC Human Genetics Unit, Crewe Road South, Edinburgh, EH4 2XU UK
- C. A. Johannes van Niekerk
- Division of Biological Chemistry and Drug Discovery, College of Life Sciences, University of Dundee, Dundee, DD1 5EH Scotland, UK
- Xu Peng
- Department of Biology, Archaea Centre, University of Copenhagen, Ole Maaløes Vej 5, 2200, Copenhagen N, Denmark
- Prakash Patel
- Department of Chemistry, University of Warwick, Coventry, CV4 7AL UK
- Roger A. Garrett
- Department of Biology, Archaea Centre, University of Copenhagen, Ole Maaløes Vej 5, 2200, Copenhagen N, Denmark
- Catherine H. Botting
- Biomedical Sciences Research Complex, University of St Andrews, St Andrews, KY16 9ST UK
- Peter J. Coote
- Biomedical Sciences Research Complex, University of St Andrews, St Andrews, KY16 9ST UK
- David T. F. Dryden
- EaStChem School of Chemistry, University of Edinburgh, The King’s Buildings, Edinburgh, EH9 3JJ UK
- Geoffrey J. Barton
- Division of Biological Chemistry and Drug Discovery, College of Life Sciences, University of Dundee, Dundee, DD1 5EH Scotland, UK
- Ulrich Schwarz-Linek
- Biomedical Sciences Research Complex, University of St Andrews, St Andrews, KY16 9ST UK
- Garry L. Taylor
- Biomedical Sciences Research Complex, University of St Andrews, St Andrews, KY16 9ST UK
- Malcolm F. White
- Biomedical Sciences Research Complex, University of St Andrews, St Andrews, KY16 9ST UK
- James H. Naismith
- Biomedical Sciences Research Complex, University of St Andrews, St Andrews, KY16 9ST UK
|
18
|
Eschenfeldt WH, Maltseva N, Stols L, Donnelly MI, Gu M, Nocek B, Tan K, Kim Y, Joachimiak A. Cleavable C-terminal His-tag vectors for structure determination. J Struct Funct Genomics 2010; 11:31-9. [PMID: 20213425 PMCID: PMC2885959 DOI: 10.1007/s10969-010-9082-y] [Citation(s) in RCA: 30] [Impact Index Per Article: 2.0] [Received: 10/07/2009] [Accepted: 02/11/2010] [Indexed: 10/19/2022]
Abstract
High-throughput structural genomics projects seek to delineate protein structure space by determining the structure of representatives of all major protein families. Generally this is accomplished by processing numerous proteins through standardized protocols, for the most part involving purification of N-terminally His-tagged proteins. Often proteins that fail this approach are abandoned, but in many cases further effort is warranted because of a protein's intrinsic value. In addition, failure often occurs relatively far into the path to structure determination, and many failed proteins passed the first critical step, expression as a soluble protein. Salvage pathways seek to recoup the investment in this subset of failed proteins through alternative cloning, nested truncations, chemical modification, mutagenesis, screening buffers, ligands and modifying processing steps. To this end we have developed a series of ligation-independent cloning expression vectors that append various cleavable C-terminal tags instead of the conventional N-terminal tags. In an initial set of 16 proteins that failed with an N-terminal appendage, structures were obtained for C-terminally tagged derivatives of five proteins, including an example for which several alternative salvaging steps had failed. The new vectors allow appending C-terminal His(6)-tag and His(6)- and MBP-tags, and are cleavable with TEV or with both TEV and TVMV proteases.
Affiliation(s)
- William H. Eschenfeldt
- Midwest Center for Structural Genomics, Biosciences Division, Argonne National Laboratory, Bldg. 202/Rm. BE111, 9700 South Cass Avenue, Argonne, IL 60439, USA
- Natalia Maltseva
- Center for Structural Genomics of Infectious Diseases, Computational Institute, University of Chicago, Chicago, IL 60667, USA
- Lucy Stols
- Midwest Center for Structural Genomics, Biosciences Division, Argonne National Laboratory, Bldg. 202/Rm. BE111, 9700 South Cass Avenue, Argonne, IL 60439, USA
- Mark I. Donnelly
- Midwest Center for Structural Genomics, Biosciences Division, Argonne National Laboratory, Bldg. 202/Rm. BE111, 9700 South Cass Avenue, Argonne, IL 60439, USA
- Minyi Gu
- Midwest Center for Structural Genomics, Biosciences Division, Argonne National Laboratory, Bldg. 202/Rm. BE111, 9700 South Cass Avenue, Argonne, IL 60439, USA; Center for Structural Genomics of Infectious Diseases, Computational Institute, University of Chicago, Chicago, IL 60667, USA
- Boguslaw Nocek
- Center for Structural Genomics of Infectious Diseases, Computational Institute, University of Chicago, Chicago, IL 60667, USA
- Kemin Tan
- Midwest Center for Structural Genomics, Biosciences Division, Argonne National Laboratory, Bldg. 202/Rm. BE111, 9700 South Cass Avenue, Argonne, IL 60439, USA
- Youngchang Kim
- Midwest Center for Structural Genomics, Biosciences Division, Argonne National Laboratory, Bldg. 202/Rm. BE111, 9700 South Cass Avenue, Argonne, IL 60439, USA; Center for Structural Genomics of Infectious Diseases, Computational Institute, University of Chicago, Chicago, IL 60667, USA
- Andrzej Joachimiak
- Midwest Center for Structural Genomics, Biosciences Division, Argonne National Laboratory, Bldg. 202/Rm. BE111, 9700 South Cass Avenue, Argonne, IL 60439, USA; Center for Structural Genomics of Infectious Diseases, Computational Institute, University of Chicago, Chicago, IL 60667, USA
|
19
|
Babnigg G, Joachimiak A. Predicting protein crystallization propensity from protein sequence. J Struct Funct Genomics 2010; 11:71-80. [PMID: 20177794 DOI: 10.1007/s10969-010-9080-0] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.5] [Received: 11/25/2009] [Accepted: 02/05/2010] [Indexed: 10/19/2022]
Abstract
The high-throughput structure determination pipelines developed by structural genomics programs offer a unique opportunity for data mining. One important question is how protein properties derived from a primary sequence correlate with the protein's propensity to yield X-ray quality crystals (crystallizability) and 3D X-ray structures. A set of protein properties were computed for over 1,300 proteins that expressed well but were insoluble, and for approximately 720 unique proteins that resulted in X-ray structures. The correlation of the protein's iso-electric point and grand average hydropathy (GRAVY) with crystallizability was analyzed for full length and domain constructs of protein targets. In a second step, several additional properties that can be calculated from the protein sequence were added and evaluated. Using statistical analyses we have identified a set of the attributes correlating with a protein's propensity to crystallize and implemented a Support Vector Machine (SVM) classifier based on these. We have created applications to analyze and provide optimal boundary information for query sequences and to visualize the data. These tools are available via the web site http://bioinformatics.anl.gov/cgi-bin/tools/pdpredictor .
Affiliation(s)
- György Babnigg
- Midwest Center for Structural Genomics, Biosciences Division, Argonne National Laboratory, 9700 S Cass Ave., Argonne, IL 60439, USA.
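One of the sequence-derived properties named in the abstract above, the grand average of hydropathy (GRAVY), is the mean Kyte-Doolittle hydropathy value over all residues of a sequence. The following is a minimal sketch of that calculation only; the function name `gravy` is a hypothetical choice for illustration, and the sketch does not reproduce the authors' isoelectric point calculation, their SVM classifier, or their web tool.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(sequence: str) -> float:
    """Return the mean Kyte-Doolittle hydropathy of a one-letter protein sequence.

    Hypothetical helper for illustration; raises KeyError on non-standard
    residue codes and ValueError on an empty sequence.
    """
    residues = sequence.upper()
    if not residues:
        raise ValueError("empty sequence")
    return sum(KYTE_DOOLITTLE[aa] for aa in residues) / len(residues)
```

Positive values indicate a more hydrophobic sequence on average and negative values a more hydrophilic one. How such a value should be weighed against crystallization outcome is what the cited study analyzes, and is not something this sketch attempts.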
|
20
|
Wu S, Liu T, Altman RB. Identification of recurring protein structure microenvironments and discovery of novel functional sites around CYS residues. BMC Struct Biol 2010; 10:4. [PMID: 20122268 PMCID: PMC2833161 DOI: 10.1186/1472-6807-10-4] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.7] [Received: 07/31/2009] [Accepted: 02/02/2010] [Indexed: 11/29/2022]
Abstract
Background: The emergence of structural genomics presents significant challenges in the annotation of biologically uncharacterized proteins. Unfortunately, our ability to analyze these proteins is restricted by the limited catalog of known molecular functions and their associated 3D motifs. Results: In order to identify novel 3D motifs that may be associated with molecular functions, we employ an unsupervised, two-phase clustering approach that combines k-means and hierarchical clustering with knowledge-informed cluster selection and annotation methods. We applied the approach to approximately 20,000 cysteine-based protein microenvironments (3D regions 7.5 Å in radius) and identified 70 interesting clusters, some of which represent known motifs (e.g. metal binding and phosphatase activity), and some of which are novel, including several zinc binding sites. Detailed annotation results are available online for all 70 clusters at http://feature.stanford.edu/clustering/cys. Conclusions: The use of microenvironments instead of backbone geometric criteria enables flexible exploration of protein function space, and detection of recurring motifs that are discontinuous in sequence and diverse in structure. Clustering microenvironments may thus help to functionally characterize novel proteins and better understand the protein structure-function relationship.
Affiliation(s)
- Shirley Wu
- 23andMe, 1390 Shorebird Way, Mountain View, CA, USA
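The two-phase clustering idea mentioned in the abstract above (k-means followed by hierarchical clustering) can be illustrated generically: k-means first compresses the data into a moderate number of centroids, and agglomerative clustering then merges those centroids into fewer groups. The sketch below is an assumption-laden toy, not the authors' pipeline: the function names, the average-linkage choice, the seeded initialization, and the use of plain coordinate tuples are all hypothetical, and the knowledge-informed cluster selection step is not modeled.

```python
import math
import random


def kmeans(points, k, iterations=20, seed=0):
    """Plain Lloyd's k-means on tuples of floats; returns (centroids, labels)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # seeded so that runs are repeatable
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest centroid.
        labels = [min(range(k), key=lambda c: math.dist(p, centroids[c]))
                  for p in points]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = tuple(sum(x) / len(members) for x in zip(*members))
    return centroids, labels


def merge_centroids(centroids, n_groups):
    """Average-linkage agglomerative merging of centroid indices."""
    groups = [[i] for i in range(len(centroids))]

    def linkage(a, b):
        total = sum(math.dist(centroids[i], centroids[j]) for i in a for j in b)
        return total / (len(a) * len(b))

    while len(groups) > n_groups:
        pairs = ((i, j) for i in range(len(groups))
                 for j in range(i + 1, len(groups)))
        i, j = min(pairs, key=lambda ij: linkage(groups[ij[0]], groups[ij[1]]))
        merged = groups.pop(j)  # j > i, so index i is unaffected by the pop
        groups[i].extend(merged)
    return groups


def two_phase(points, k, n_groups):
    """Phase 1: k-means into k clusters. Phase 2: merge them into n_groups."""
    centroids, labels = kmeans(points, k)
    groups = merge_centroids(centroids, n_groups)
    group_of = {c: g for g, members in enumerate(groups) for c in members}
    return [group_of[lab] for lab in labels]
```

For instance, calling `two_phase(points, k=4, n_groups=2)` on two well-separated blobs of points should place each blob in its own final group. Running k-means first keeps the pairwise-distance work of the hierarchical step proportional to the number of centroids rather than the number of points, which is one plausible reason to combine the two methods on large data sets.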
|
21
|
Kurotani A, Takagi T, Toyama M, Shirouzu M, Yokoyama S, Fukami Y, Tokmakov AA. Comprehensive bioinformatics analysis of cell-free protein synthesis: identification of multiple protein properties that correlate with successful expression. FASEB J 2009; 24:1095-104. [PMID: 19940260 DOI: 10.1096/fj.09-139527] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.1] [Indexed: 11/11/2022]
Abstract
High-throughput cell-free protein synthesis is being used increasingly in structural/functional genomics projects. However, the factors determining expression success are poorly understood. Here, we evaluated the expression of 3066 human proteins and their domains in a bacterial cell-free system and analyzed the correlation of protein expression with 39 physicochemical and structural properties of proteins. As a result of the bioinformatics analysis performed, we determined the 18 most influential features that affect protein amenability to cell-free expression. They include protein length; hydrophobicity; pI; content of charged, nonpolar, and aromatic residues; cysteine content; solvent accessibility; presence of coiled coil; content of intrinsically disordered and structured (alpha-helix and beta-sheet) sequence; number of disulfide bonds and functional domains; presence of transmembrane regions; PEST motifs; and signaling sequences. This study represents the first comprehensive bioinformatics analysis of heterologous protein synthesis in a cell-free system. The rules and correlations revealed here provide a plethora of important insights into rationalization of cell-free protein production and can be of practical use for protein engineering with the aim of increasing expression success.
|
22
|
Dessailly BH, Nair R, Jaroszewski L, Fajardo JE, Kouranov A, Lee D, Fiser A, Godzik A, Rost B, Orengo C. PSI-2: structural genomics to cover protein domain family space. Structure 2009; 17:869-81. [PMID: 19523904 DOI: 10.1016/j.str.2009.03.015] [Citation(s) in RCA: 106] [Impact Index Per Article: 6.6] [Received: 11/26/2008] [Revised: 03/18/2009] [Accepted: 03/22/2009] [Indexed: 11/25/2022]
Abstract
One major objective of structural genomics efforts, including the NIH-funded Protein Structure Initiative (PSI), has been to increase the structural coverage of protein sequence space. Here, we present the target selection strategy used during the second phase of PSI (PSI-2). This strategy, jointly devised by the bioinformatics groups associated with the PSI-2 large-scale production centers, targets representatives from large, structurally uncharacterized protein domain families, and from structurally uncharacterized subfamilies in very large and diverse families with incomplete structural coverage. These very large families are extremely diverse both structurally and functionally, and are highly overrepresented in known proteomes. On the basis of several metrics, we then discuss to what extent PSI-2, during its first 3 years, has increased the structural coverage of genomes, and contributed structural and functional novelty. Together, the results presented here suggest that PSI-2 is successfully meeting its objectives and provides useful insights into structural and functional space.
Affiliation(s)
- Benoît H Dessailly
- Department of Structural and Molecular Biology, University College London, London WC1E 6BT, UK.
|
23
|
Joachimiak A. High-throughput crystallography for structural genomics. Curr Opin Struct Biol 2009; 19:573-84. [PMID: 19765976 DOI: 10.1016/j.sbi.2009.08.002] [Citation(s) in RCA: 84] [Impact Index Per Article: 5.3] [Received: 06/18/2009] [Revised: 08/14/2009] [Accepted: 08/20/2009] [Indexed: 11/20/2022]
Abstract
Protein X-ray crystallography recently celebrated its 50th anniversary. The structures of myoglobin and hemoglobin determined by Kendrew and Perutz provided the first glimpses into the complex protein architecture and chemistry. Since then, the field of structural molecular biology has experienced extraordinary progress and now more than 55000 protein structures have been deposited into the Protein Data Bank. In the past decade many advances in macromolecular crystallography have been driven by world-wide structural genomics efforts. This was made possible because of third-generation synchrotron sources, structure phasing approaches using anomalous signal, and cryo-crystallography. Complementary progress in molecular biology, proteomics, hardware and software for crystallographic data collection, structure determination and refinement, computer science, databases, robotics and automation improved and accelerated many processes. These advancements provide the robust foundation for structural molecular biology and assure strong contribution to science in the future. In this report we focus mainly on reviewing structural genomics high-throughput X-ray crystallography technologies and their impact.
Affiliation(s)
- Andrzej Joachimiak
- Midwest Center for Structural Genomics, Structural Biology Center, Biosciences Division, Argonne National Laboratory, 9700 S Cass Ave., Argonne, IL 60439, USA.
|
24
|
Sim DW, Lee YS, Kim JH, Seo MD, Lee BJ, Won HS. HP0902 from Helicobacter pylori is a thermostable, dimeric protein belonging to an all-β topology of the cupin superfamily. BMB Rep 2009; 42:387-92. [PMID: 19558799 DOI: 10.5483/bmbrep.2009.42.6.387] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.6] [Indexed: 11/20/2022]
Affiliation(s)
- Dae-Won Sim
- Department of Biotechnology, College of Biomedical and Health Science, Konkuk University, Chungju, 380-701, Korea
|
25
|
Nair R, Liu J, Soong TT, Acton TB, Everett JK, Kouranov A, Fiser A, Godzik A, Jaroszewski L, Orengo C, Montelione GT, Rost B. Structural genomics is the largest contributor of novel structural leverage. J Struct Funct Genomics 2009; 10:181-91. [PMID: 19194785 PMCID: PMC2705706 DOI: 10.1007/s10969-008-9055-6] [Citation(s) in RCA: 54] [Impact Index Per Article: 3.4] [Received: 11/15/2008] [Accepted: 12/08/2008] [Indexed: 11/28/2022]
Abstract
The Protein Structure Initiative (PSI) at the US National Institutes of Health (NIH) is funding four large-scale centers for structural genomics (SG). These centers systematically target many large families without structural coverage, as well as very large families with inadequate structural coverage. Here, we report a few simple metrics that demonstrate how successfully these efforts optimize structural coverage: while the PSI-2 (2005-now) contributed more than 8% of all structures deposited into the PDB, it contributed over 20% of all novel structures (i.e. structures for protein sequences with no structural representative in the PDB on the date of deposition). The structural coverage of the protein universe represented by today’s UniProt (v12.8) has increased linearly from 1992 to 2008; structural genomics has contributed significantly to the maintenance of this growth rate. Success in increasing novel leverage (defined in Liu et al. in Nat Biotechnol 25:849–851, 2007) has resulted from systematic targeting of large families. PSI’s per-structure contribution to novel leverage was over 4-fold higher than that for non-PSI structural biology efforts during the past 8 years. If the success of the PSI continues, it may take just another ~15 years to cover most sequences in the current UniProt database.
Affiliation(s)
- Rajesh Nair
- Department of Biochemistry and Molecular Biophysics, Columbia University, New York, NY 10032, USA
|