601
Vinchon S, Moreau SJM, Drezen JM, Prévost G, Cherqui A. Molecular and biochemical analysis of an aspartylglucosaminidase from the venom of the parasitoid wasp Asobara tabida (Hymenoptera: Braconidae). INSECT BIOCHEMISTRY AND MOLECULAR BIOLOGY 2010; 40:38-48. [PMID: 20036741 DOI: 10.1016/j.ibmb.2009.12.007] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.6] [Received: 08/24/2009] [Revised: 12/07/2009] [Accepted: 12/09/2009] [Indexed: 05/28/2023]
Abstract
The most abundant venom protein of the parasitoid wasp Asobara tabida was identified as an aspartylglucosaminidase (hereafter named AtAGA). The aim of the present work is to identify: 1) its cDNA and deduced amino acid sequences, 2) its subunit organization and 3) its activity. The cDNA of AtAGA coded for a proalphabeta precursor molecule preceded by a signal peptide of 19 amino acids. The gene products were detected specifically in the wasp venom gland, where they were found in two forms: an active heterotetramer composed of two alpha and two beta subunits of 30 and 18 kDa, respectively, and a homodimer of the 44 kDa precursor. The AtAGA enzyme showed a limited tolerance to variations in pH and temperature. Since the enzyme failed to exhibit any glycopeptide N-glycosidase activity toward entire glycoproteins, its activity seemed to be restricted, like that of human AGA, to the deglycosylation of free glycosylasparagines, indicating that AtAGA did not acquire a broader function in the course of evolution. The study of this enzyme may allow a better understanding of the functional evolution of venom enzymes in hymenopteran parasitoids.
Affiliation(s)
- S Vinchon
- Laboratoire de Biologie des Entomophages, EA3900 BioPI, Université de Picardie Jules Verne, 33 rue Saint-Leu, 80039 Amiens Cedex, France.
602
Elhefnawi MM, Youssif AA, Ghalwash AZ, Behaidy WHE. An integrated methodology for mining promiscuous proteins: a case study of an integrative bioinformatics approach for hepatitis C virus non-structural 5A protein. ADVANCES IN EXPERIMENTAL MEDICINE AND BIOLOGY 2010; 680:299-305. [PMID: 20865513 DOI: 10.1007/978-1-4419-5913-3_34] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Indexed: 01/27/2023]
Abstract
A methodology for elucidating structural, functional, and mechanistic knowledge of promiscuous proteins is proposed, consisting of a workflow of integrated bioinformatics analyses. Sequence alignments with closely related homologues can reveal conserved regions which are functionally important. Scanning protein motif databases, along with secondary structure and surface accessibility predictions integrated with prediction of post-translational modification (PTM) sites, reveals functional and protein-binding motifs. Integrating this information about the protein with the GO, SCOP, and CATH annotations of the templates can help to formulate a 3D model with reasonable accuracy even in the case of distant sequence homology. A novel integrative model is proposed for the non-structural protein 5A (NS5A) of Hepatitis C virus, a promiscuous hub protein with roles in virus replication and host interactions. The 3D structure of domain II was predicted using the Homo sapiens Replication factor-A protein-1 (RPA1) as a template, based on consensus meta-server results. Domain III is an intrinsically unstructured domain with a fold from the retroviral matrix protein, which mediates diverse protein interactions and is involved in viral replication. It also has a single-stranded DNA-binding protein motif (SSDP) signature for pyrimidine binding during viral replication. Two protein-binding motifs with high sequence conservation and disordered regions are proposed; the first corresponds to an Interleukin-8B receptor signature (IL-8R-B), while the second has high local similarity to the lymphotoxin beta receptor (LTβR). A mechanism is proposed for their contribution to the interception of the interferon signaling pathway by NS5A. Lastly, the overlap between the LTβR and SSDP motifs is considered a sign that NS5A acts as a date hub.
603
Predicting miRNA's target from primary structure by the nearest neighbor algorithm. Mol Divers 2009; 14:719-29. [PMID: 20041294 DOI: 10.1007/s11030-009-9216-y] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Received: 05/16/2009] [Accepted: 12/08/2009] [Indexed: 12/19/2022]
Abstract
We used a machine learning method, the nearest neighbor algorithm (NNA), to learn the relationship between miRNAs and their target proteins, generating a predictor which can then judge whether a new miRNA-target pair is true or not. We acquired 198 positive (true) miRNA-target pairs from Tarbase and the literature, and generated 4,888 negative (false) pairs through random combination. A 0/1 system and the frequencies of single nucleotides and di-nucleotides were used to encode miRNAs into vectors, while various physicochemical parameters were used to encode the targets. The NNA was then applied, learning from these data to produce a predictor. We implemented minimum redundancy maximum relevance (mRMR) and properties forward selection (PFS) to reduce the redundancy of our encoding system, obtaining the 91 most efficient properties. In the Jackknife cross-validation test, we obtained a positive accuracy of 69.2% and an overall accuracy of 96.0% with all 253 properties, and a positive accuracy of 83.8% and an overall accuracy of 97.2% with the 91 most efficient properties. A web server for predictions is available at http://app3.biosino.org:8080/miRTP/index.jsp.
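The evaluation scheme described in this abstract (a nearest neighbor predictor scored by a jackknife, i.e. leave-one-out, test) can be sketched roughly as follows. This is a minimal illustration only: the Euclidean distance, the two-dimensional feature vectors, and the function names are assumptions made for the example, not the encoding or distance used in the paper.

```python
import math

def nearest_neighbor_label(query, samples):
    """Return the label of the sample closest to `query` (Euclidean distance, assumed)."""
    best_label, best_dist = None, math.inf
    for vector, label in samples:
        dist = math.dist(query, vector)
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label

def jackknife_accuracy(samples):
    """Leave-one-out test: predict each sample from all the others.

    Returns (positive_accuracy, overall_accuracy).
    """
    pos_total = pos_correct = correct = 0
    for i, (vector, label) in enumerate(samples):
        rest = samples[:i] + samples[i + 1:]
        predicted = nearest_neighbor_label(vector, rest)
        correct += predicted == label
        if label == 1:
            pos_total += 1
            pos_correct += predicted == label
    return pos_correct / pos_total, correct / len(samples)

# Made-up feature vectors: label 1 = true pair, label 0 = randomly combined pair.
toy_pairs = [
    ((5.0, 5.0), 1), ((6.0, 5.0), 1), ((2.0, 2.0), 1),
    ((0.0, 0.0), 0), ((1.0, 0.0), 0), ((0.0, 1.0), 0), ((1.0, 1.0), 0),
]
positive_acc, overall_acc = jackknife_accuracy(toy_pairs)
print(f"positive accuracy: {positive_acc:.3f}, overall accuracy: {overall_acc:.3f}")
```

With many more negatives than positives, as in the data set described here, overall accuracy can be high even when positive accuracy is modest, which is why the two figures are reported separately.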
604
El Hefnawi MM, El Behaidy WH, Youssif AA, Ghalwash AZ, El Housseiny LA, Zada S. Natural genetic engineering of hepatitis C virus NS5A for immune system counterattack. Ann N Y Acad Sci 2009; 1178:173-85. [PMID: 19845637 DOI: 10.1111/j.1749-6632.2009.05003.x] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.5] [Indexed: 12/16/2022]
Abstract
The Hepatitis C virus nonstructural 5A (NS5A) protein is a hydrophilic phosphoprotein with diverse functions. The domain assignment of NS5A was refined using a systematic in silico bioinformatics approach: using DOMAC, the protein was divided into three domains, and domain III was subdivided into two subdomains using the ProDom and SSEP servers. The fold structures of domains II and III were predicted using the meta-server 3D-Jury. Scanning motif databases (SMART, BLOCKS, and PROSITE) gave new motifs. Two important motifs, the interleukin 1 and 8 interaction motifs, relating to NS5A function in inducing the interleukin 8 promoter, were discovered from the BLOCKS scan. Protein-protein interaction motifs were predicted as hot loops and disordered regions, corresponding to the binding regions for the ds-protein kinase R, the viral polymerase, and Src homology 3 signaling proteins. Other hot loops were predicted in the V3 region and in the single-stranded DNA-binding protein motif. The different mechanisms by which the NS5A protein leads to immune system signaling dysfunction point to the natural genetic engineering of this protein.
Affiliation(s)
- Mahmoud M El Hefnawi
- Informatics and Systems Department, Division of Engineering Research Sciences, the National Research Centre, Egypt.
605
Rangwala H, Kauffman C, Karypis G. svmPRAT: SVM-based protein residue annotation toolkit. BMC Bioinformatics 2009; 10:439. [PMID: 20028521 PMCID: PMC2805646 DOI: 10.1186/1471-2105-10-439] [Citation(s) in RCA: 24] [Impact Index Per Article: 1.6] [Received: 06/15/2009] [Accepted: 12/22/2009] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Over the last decade several prediction methods have been developed for determining the structural and functional properties of individual protein residues using sequence and sequence-derived information. Most of these methods are based on support vector machines as they provide accurate and generalizable prediction models. RESULTS We present a general purpose protein residue annotation toolkit (svmPRAT) to allow biologists to formulate residue-wise prediction problems. svmPRAT formulates the annotation problem as a classification or regression problem using support vector machines. One of the key features of svmPRAT is its ease of use in incorporating any user-provided information in the form of feature matrices. For every residue, svmPRAT captures local information around the residue to create fixed-length feature vectors. svmPRAT implements accurate and fast kernel functions, and also introduces a flexible window-based encoding scheme that accurately captures signals and patterns for training effective predictive models. CONCLUSIONS In this work we evaluate svmPRAT on several classification and regression problems including disorder prediction, residue-wise contact order estimation, DNA-binding site prediction, and local structure alphabet prediction. svmPRAT has also been used for the development of a state-of-the-art transmembrane helix prediction method called TOPTMH and a secondary structure prediction method called YASSPP. The toolkit provides practitioners with an efficient and easy-to-use tool for a wide variety of annotation problems. AVAILABILITY http://www.cs.gmu.edu/~mlbio/svmprat.
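The idea of turning per-residue feature rows into fixed-length vectors with a window can be sketched as below. This is a generic illustration and not svmPRAT's actual interface or encoding: the function name `window_encode`, the zero-padding at the sequence ends, and the toy feature matrix are all assumptions made for the example.

```python
def window_encode(feature_rows, half_width):
    """Build one fixed-length vector per residue from a sliding window.

    feature_rows: one list of numbers per residue (all the same length).
    half_width:   number of neighbours taken on each side of the residue.
    Window positions that fall outside the sequence are zero-padded (assumed).
    """
    n_features = len(feature_rows[0])
    padding = [0.0] * n_features
    encoded = []
    for center in range(len(feature_rows)):
        vector = []
        for pos in range(center - half_width, center + half_width + 1):
            if 0 <= pos < len(feature_rows):
                vector.extend(feature_rows[pos])
            else:
                vector.extend(padding)
        encoded.append(vector)
    return encoded

# Hypothetical feature matrix: 4 residues, 2 features per residue.
features = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5], [0.2, 0.8]]
vectors = window_encode(features, half_width=1)
for vector in vectors:
    print(vector)
```

Every residue ends up with a vector of length (2 * half_width + 1) * n_features, so the vectors can be passed to any classifier or regressor that expects fixed-size inputs.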
Affiliation(s)
- Huzefa Rangwala
- Computer Science Department, George Mason University, Fairfax, VA, USA.
606
Deng X, Eickholt J, Cheng J. PreDisorder: ab initio sequence-based prediction of protein disordered regions. BMC Bioinformatics 2009; 10:436. [PMID: 20025768 PMCID: PMC3087350 DOI: 10.1186/1471-2105-10-436] [Citation(s) in RCA: 80] [Impact Index Per Article: 5.3] [Received: 08/03/2009] [Accepted: 12/21/2009] [Indexed: 11/10/2022] Open
Abstract
Background Disordered regions are segments of the protein chain which do not adopt stable structures. Such segments are often of interest because they have a close relationship with protein expression and functionality. As such, protein disorder prediction is important for protein structure prediction, structure determination and function annotation. Results This paper presents our protein disorder prediction server, PreDisorder. It is based on our ab initio prediction method (MULTICOM-CMFR) which, along with our meta (or consensus) prediction method (MULTICOM), was recently ranked among the top disorder predictors in the eighth edition of the Critical Assessment of Techniques for Protein Structure Prediction (CASP8). We systematically benchmarked PreDisorder along with 26 other protein disorder predictors on the CASP8 data set and assessed its accuracy using a number of measures. The results show that it compared favourably with other ab initio methods and its performance is comparable to that of the best meta and clustering methods. Conclusion PreDisorder is a fast and reliable server which can be used to predict protein disordered regions on genomic scale. It is available at http://casp.rnet.missouri.edu/predisorder.html.
Affiliation(s)
- Xin Deng
- Department of Computer Science, University of Missouri-Columbia, Columbia, MO 65211, USA.
607
Annan RB, Lee AY, Reid ID, Sayad A, Whiteway M, Hallett M, Thomas DY. A biochemical genomics screen for substrates of Ste20p kinase enables the in silico prediction of novel substrates. PLoS One 2009; 4:e8279. [PMID: 20020052 PMCID: PMC2791418 DOI: 10.1371/journal.pone.0008279] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Received: 08/17/2009] [Accepted: 11/19/2009] [Indexed: 01/13/2023] Open
Abstract
The Ste20/PAK family is involved in many cellular processes, including the regulation of actin-based cytoskeletal dynamics and the activation of MAPK signaling pathways. Despite its numerous roles, few of its substrates have been identified. To better characterize the roles of the yeast Ste20p kinase, we developed an in vitro biochemical genomics screen to identify its substrates. When applied to 539 purified yeast proteins, the screen reported 14 targets of Ste20p phosphorylation. We used the data resulting from our screen to build an in silico predictor to identify Ste20p substrates on a proteome-wide basis. Since kinase-substrate specificity is often mediated by additional binding events at sites distal to the phosphorylation site, the predictor uses the presence/absence of multiple sequence motifs to evaluate potential substrates. Statistical validation estimates a threefold improvement in substrate recovery over random predictions, despite the lack of a single dominant motif that can characterize Ste20p phosphorylation. The set of predicted substrates significantly overrepresents elements of the genetic and physical interaction networks surrounding Ste20p, suggesting that some of the predicted substrates are in vivo targets. We validated this combined experimental and computational approach for identifying kinase substrates by confirming the in vitro phosphorylation of polarisome components Bni1p and Bud6p, thus suggesting a mechanism by which Ste20p effects polarized growth.
Affiliation(s)
- Robert B Annan
- Department of Biochemistry, McGill University, Montreal, Quebec, Canada.
608
Makarov VV, Rybakova EN, Efimov AV, Dobrov EN, Serebryakova MV, Solovyev AG, Yaminsky IV, Taliansky ME, Morozov SY, Kalinina NO. Domain organization of the N-terminal portion of hordeivirus movement protein TGBp1. J Gen Virol 2009; 90:3022-3032. [DOI: 10.1099/vir.0.013862-0] [Citation(s) in RCA: 31] [Impact Index Per Article: 2.1] [Indexed: 12/15/2022] Open
Abstract
Three ‘triple gene block’ proteins known as TGBp1, TGBp2 and TGBp3 are required for cell-to-cell movement of plant viruses belonging to a number of genera including Hordeivirus. Hordeiviral TGBp1 interacts with viral genomic RNAs to form ribonucleoprotein (RNP) complexes competent for translocation between cells through plasmodesmata and over long distances via the phloem. Binding of hordeivirus TGBp1 to RNA involves two protein regions, the C-terminal NTPase/helicase domain and the N-terminal extension region. This study demonstrated that the extension region of hordeivirus TGBp1 consists of two structurally and functionally distinct domains called the N-terminal domain (NTD) and the internal domain (ID). In agreement with secondary structure predictions, analysis of circular dichroism spectra of the isolated NTD and ID demonstrated that the NTD represents a natively unfolded protein domain, whereas the ID has a pronounced secondary structure. Both the NTD and ID were able to bind ssRNA non-specifically. However, whilst the NTD interacted with ssRNA non-cooperatively, the ID bound ssRNA in a cooperative manner. Additionally, both domains bound dsRNA. The NTD and ID formed low-molecular-mass oligomers, whereas the ID also gave rise to high-molecular-mass complexes. The isolated ID was able to interact with both the NTD and the C-terminal NTPase/helicase domain in solution. These data demonstrate that the hordeivirus TGBp1 has three RNA-binding domains and that interaction between these structural units can provide a basis for remodelling of viral RNP complexes at different steps of cell-to-cell and long-distance transport of virus infection.
Affiliation(s)
- Valentin V. Makarov
- A. N. Belozersky Institute of Physico-Chemical Biology, Moscow State University, Moscow 119992, Russia
- Ekaterina N. Rybakova
- A. N. Belozersky Institute of Physico-Chemical Biology, Moscow State University, Moscow 119992, Russia
- Alexander V. Efimov
- Institute of Protein Research, Russian Academy of Sciences, Pushchino, Moscow Region 142290, Russia
- Eugene N. Dobrov
- A. N. Belozersky Institute of Physico-Chemical Biology, Moscow State University, Moscow 119992, Russia
- Andrey G. Solovyev
- Institute of Agricultural Biotechnology, Russian Academy of Agricultural Sciences, Moscow 127550, Russia
- A. N. Belozersky Institute of Physico-Chemical Biology, Moscow State University, Moscow 119992, Russia
- Igor V. Yaminsky
- Physical Faculty, Moscow State University, Moscow 119992, Russia
- Sergey Yu. Morozov
- Department of Virology, Biological Faculty, Moscow State University, Moscow 119992, Russia
- A. N. Belozersky Institute of Physico-Chemical Biology, Moscow State University, Moscow 119992, Russia
- Natalia O. Kalinina
- A. N. Belozersky Institute of Physico-Chemical Biology, Moscow State University, Moscow 119992, Russia
609
Guzman L, Moraga-Cid G, Avila A, Figueroa M, Yevenes GE, Fuentealba J, Aguayo LG. Blockade of ethanol-induced potentiation of glycine receptors by a peptide that interferes with Gbetagamma binding. J Pharmacol Exp Ther 2009; 331:933-9. [PMID: 19773530 PMCID: PMC2784719 DOI: 10.1124/jpet.109.160440] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.3] [Received: 08/14/2009] [Accepted: 09/21/2009] [Indexed: 01/09/2023] Open
Abstract
The large intracellular loop (IL) of the glycine receptor (GlyR) interacts with various signaling proteins and plays a fundamental role in trafficking and regulation of several receptor properties, including a direct interaction with Gbetagamma. In the present study, we found that mutation of basic residues in the N-terminal region of the IL reduced the binding of Gbetagamma to 21 +/- 10% of control. Two basic residues in the C-terminal region, on the other hand, contributed to a smaller extent to Gbetagamma binding. Using docking analysis, we found that both basic regions of the IL bind in nearby regions to the Gbetagamma dimer, within an area of high density of amino acids having an electronegative character. Thereafter, we generated a 17-amino acid peptide with the N-terminal sequence of the wild-type IL (RQH) that was able to inhibit the in vitro binding of Gbetagamma to GlyRs to 57 +/- 5% of control in glutathione S-transferase pull-down assays using purified proteins. More interestingly, when the peptide was intracellularly applied to human embryonic kidney 293 cells, it inhibited the Gbetagamma-mediated modulations of G protein-coupled inwardly rectifying potassium channel by baclofen (24 +/- 14% of control) and attenuated the GlyR potentiation by ethanol (51 +/- 10% versus 10 +/- 3%).
Affiliation(s)
- Leonardo Guzman
- Laboratory of Neurophysiology, Department of Physiology, University of Concepción, Concepción, Chile
610
Wang Z, Tegge AN, Cheng J. Evaluating the absolute quality of a single protein model using structural features and support vector machines. Proteins 2009; 75:638-47. [PMID: 19004001 DOI: 10.1002/prot.22275] [Citation(s) in RCA: 78] [Impact Index Per Article: 5.2] [Indexed: 11/12/2022]
Abstract
Knowing the quality of a protein structure model is important for its appropriate usage. We developed a model evaluation method to assess the absolute quality of a single protein model using only structural features with support vector machine regression. The method assigns an absolute quantitative score (i.e. GDT-TS) to a model by comparing its secondary structure, relative solvent accessibility, contact map, and beta sheet structure with their counterparts predicted from its primary sequence. We trained and tested the method on the CASP6 dataset using cross-validation. The correlation between predicted and true scores is 0.82. On the independent CASP7 dataset, the correlation averaged over 95 protein targets is 0.76; the average correlation for template-based and ab initio targets is 0.82 and 0.50, respectively. Furthermore, the predicted absolute quality scores can be used to rank models effectively. The average difference (or loss) between the scores of the top-ranked models and the best models is 5.70 on the CASP7 targets. This method performs favorably when compared with the other methods used on the same dataset. Moreover, the predicted absolute quality scores are comparable across models for different proteins. These features make the method a valuable tool for model quality assurance and ranking.
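The ranking loss mentioned in this abstract can be read as follows: for each target, take the model ranked first by predicted score and compare its true score with the true score of the best available model, then average over targets. The sketch below follows that reading; the function names, target names, and scores are invented for illustration, and the paper's exact definition may differ.

```python
def ranking_loss(models):
    """models: list of (predicted_score, true_score) pairs for one target.

    Loss = true score of the best model minus the true score of the model
    that the predictor ranks first.
    """
    top_ranked = max(models, key=lambda m: m[0])
    best_true = max(true for _, true in models)
    return best_true - top_ranked[1]

def average_loss(targets):
    """Average the per-target ranking loss over all targets."""
    return sum(ranking_loss(models) for models in targets.values()) / len(targets)

# Made-up scores on a GDT-TS-like 0-100 scale.
targets = {
    "target_A": [(70.0, 65.0), (60.0, 72.0), (40.0, 30.0)],
    "target_B": [(55.0, 50.0), (35.0, 41.0)],
}
print(ranking_loss(targets["target_A"]))
print(average_loss(targets))
```

A loss of zero for a target means the predictor picked the truly best model; larger values mean the chosen model was further from the best one.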
Affiliation(s)
- Zheng Wang
- Computer Science Department, Informatics Institute, University of Missouri, Columbia, MO 65211, USA
611
Chen L, Shi X, Kong X, Zeng Z, Cai YD. Identifying Protein Complexes Using Hybrid Properties. J Proteome Res 2009; 8:5212-8. [DOI: 10.1021/pr900554a] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.1] [Indexed: 11/29/2022]
Affiliation(s)
- Lei Chen
- Institute of Systems Biology, Shanghai University, Shanghai 200444, People’s Republic of China, Centre for Computational Systems Biology, Fudan University, Shanghai 200433, People’s Republic of China, Shanghai Key Laboratory of Trustworthy Computing, East China Normal University, Shanghai 200062, People’s Republic of China, and Institute of Health Sciences, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences and Shanghai Jiao Tong University School of Medicine, Shanghai 200025,
- Xiaohe Shi
- Institute of Systems Biology, Shanghai University, Shanghai 200444, People’s Republic of China, Centre for Computational Systems Biology, Fudan University, Shanghai 200433, People’s Republic of China, Shanghai Key Laboratory of Trustworthy Computing, East China Normal University, Shanghai 200062, People’s Republic of China, and Institute of Health Sciences, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences and Shanghai Jiao Tong University School of Medicine, Shanghai 200025,
- Xiangyin Kong
- Institute of Systems Biology, Shanghai University, Shanghai 200444, People’s Republic of China, Centre for Computational Systems Biology, Fudan University, Shanghai 200433, People’s Republic of China, Shanghai Key Laboratory of Trustworthy Computing, East China Normal University, Shanghai 200062, People’s Republic of China, and Institute of Health Sciences, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences and Shanghai Jiao Tong University School of Medicine, Shanghai 200025,
- Zhenbing Zeng
- Institute of Systems Biology, Shanghai University, Shanghai 200444, People’s Republic of China, Centre for Computational Systems Biology, Fudan University, Shanghai 200433, People’s Republic of China, Shanghai Key Laboratory of Trustworthy Computing, East China Normal University, Shanghai 200062, People’s Republic of China, and Institute of Health Sciences, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences and Shanghai Jiao Tong University School of Medicine, Shanghai 200025,
- Yu-Dong Cai
- Institute of Systems Biology, Shanghai University, Shanghai 200444, People’s Republic of China, Centre for Computational Systems Biology, Fudan University, Shanghai 200433, People’s Republic of China, Shanghai Key Laboratory of Trustworthy Computing, East China Normal University, Shanghai 200062, People’s Republic of China, and Institute of Health Sciences, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences and Shanghai Jiao Tong University School of Medicine, Shanghai 200025,
612
Biastoff S, Brandt W, Dräger B. Putrescine N-methyltransferase--the start for alkaloids. PHYTOCHEMISTRY 2009; 70:1708-18. [PMID: 19651420 DOI: 10.1016/j.phytochem.2009.06.012] [Citation(s) in RCA: 27] [Impact Index Per Article: 1.8] [Received: 02/26/2009] [Revised: 06/08/2009] [Accepted: 06/12/2009] [Indexed: 05/08/2023]
Abstract
Putrescine N-methyltransferase (PMT) catalyses S-adenosylmethionine (SAM) dependent methylation of the diamine putrescine. The product N-methylputrescine is the first specific metabolite on the route to nicotine, tropane, and nortropane alkaloids. PMT cDNA sequences were cloned from tobacco species and other Solanaceae, also from nortropane-forming Convolvulaceae and enzyme proteins were synthesised in Escherichia coli. PMT activity was measured by HPLC separation of polyamine derivatives and by an enzyme-coupled colorimetric assay using S-adenosylhomocysteine. PMT cDNA sequences resemble those of plant spermidine synthases (putrescine aminopropyltransferases) and display little similarity to other plant methyltransferases. PMT is likely to have evolved from the ubiquitous enzyme spermidine synthase. PMT and spermidine synthase proteins share the same overall protein structure; they bind the same substrate putrescine and similar co-substrates, SAM and decarboxylated S-adenosylmethionine. The active sites of both proteins, however, were shaped differentially in the course of evolution. Phylogenetic analysis of both enzyme groups from plants revealed a deep bifurcation and confirmed an early descent of PMT from spermidine synthase in the course of angiosperm development.
Affiliation(s)
- Stefan Biastoff
- Institute of Pharmacy, Faculty of Natural Sciences I, Martin Luther University Halle-Wittenberg, Halle/Saale, Germany
613
Song J, Tan H, Mahmood K, Law RHP, Buckle AM, Webb GI, Akutsu T, Whisstock JC. Prodepth: predict residue depth by support vector regression approach from protein sequences only. PLoS One 2009; 4:e7072. [PMID: 19759917 PMCID: PMC2742725 DOI: 10.1371/journal.pone.0007072] [Citation(s) in RCA: 35] [Impact Index Per Article: 2.3] [Received: 04/14/2009] [Accepted: 08/20/2009] [Indexed: 11/24/2022] Open
Abstract
Residue depth (RD) is a solvent exposure measure that complements the information provided by conventional accessible surface area (ASA) and describes to what extent a residue is buried in the protein structure space. Previous studies have established that RD is correlated with several protein properties, such as protein stability, residue conservation and amino acid types. Accurate prediction of RD has many potentially important applications in the field of structural bioinformatics, for example, facilitating the identification of functionally important residues, or residues in the folding nucleus, or enzyme active sites from sequence information. In this work, we introduce an efficient approach that uses support vector regression to quantify the relationship between RD and protein sequence. We systematically investigated eight different sequence encoding schemes including both local and global sequence characteristics and examined their respective prediction performances. For the objective evaluation of our approach, we used 5-fold cross-validation to assess the prediction accuracies and showed that the overall best performance could be achieved with a correlation coefficient (CC) of 0.71 between the observed and predicted RD values and a root mean square error (RMSE) of 1.74, after incorporating the relevant multiple sequence features. The results suggest that residue depth could be reliably predicted solely from protein primary sequences: local sequence environments are the major determinants, while global sequence features could influence the prediction performance marginally. We highlight two examples as a comparison in order to illustrate the applicability of this approach. We also discuss the potential implications of this new structural parameter in the field of protein structure prediction and homology modeling. This method might prove to be a powerful tool for sequence analysis.
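The two evaluation measures quoted in this abstract, the correlation coefficient (CC) and the root mean square error (RMSE) between observed and predicted values, can be computed as in the sketch below. It assumes CC means the Pearson correlation, which is the usual convention but is not stated explicitly here; the residue depth values are made up.

```python
import math

def pearson_cc(observed, predicted):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_o * var_p)

def rmse(observed, predicted):
    """Root mean square error between observed and predicted values."""
    squared = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    return math.sqrt(squared / len(observed))

# Made-up residue depth values for four residues.
observed_rd = [2.0, 4.0, 6.0, 8.0]
predicted_rd = [3.0, 3.0, 7.0, 7.0]
print(f"CC = {pearson_cc(observed_rd, predicted_rd):.3f}")
print(f"RMSE = {rmse(observed_rd, predicted_rd):.3f}")
```

The two measures are complementary: CC reflects whether predictions follow the trend of the observed values, while RMSE reports the typical size of the error in the same units as the depth values.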
Affiliation(s)
- Jiangning Song
- Department of Biochemistry and Molecular Biology, Monash University, Clayton, Melbourne, Victoria, Australia
- Bioinformatics Center, Institute for Chemical Research, Kyoto University, Gokasho, Uji, Kyoto, Japan
- * E-mail: (JS); (JCW)
- Hao Tan
- Department of Biochemistry and Molecular Biology, Monash University, Clayton, Melbourne, Victoria, Australia
- Khalid Mahmood
- Department of Biochemistry and Molecular Biology, Monash University, Clayton, Melbourne, Victoria, Australia
- ARC Centre of Excellence for Structural and Functional Microbial Genomics, Monash University, Clayton, Melbourne, Victoria, Australia
- Ruby H. P. Law
- Department of Biochemistry and Molecular Biology, Monash University, Clayton, Melbourne, Victoria, Australia
- Ashley M. Buckle
- Department of Biochemistry and Molecular Biology, Monash University, Clayton, Melbourne, Victoria, Australia
- Geoffrey I. Webb
- Faculty of Information Technology, Monash University, Clayton, Melbourne, Victoria, Australia
- Tatsuya Akutsu
- Bioinformatics Center, Institute for Chemical Research, Kyoto University, Gokasho, Uji, Kyoto, Japan
- James C. Whisstock
- Department of Biochemistry and Molecular Biology, Monash University, Clayton, Melbourne, Victoria, Australia
- ARC Centre of Excellence for Structural and Functional Microbial Genomics, Monash University, Clayton, Melbourne, Victoria, Australia
- * E-mail: (JS); (JCW)
614
Roca AL, Ishida Y, Nikolaidis N, Kolokotronis SO, Fratpietro S, Stewardson K, Hensley S, Tisdale M, Boeskorov G, Greenwood AD. Genetic variation at hair length candidate genes in elephants and the extinct woolly mammoth. BMC Evol Biol 2009; 9:232. [PMID: 19747392 PMCID: PMC2754481 DOI: 10.1186/1471-2148-9-232] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.1] [Received: 03/11/2009] [Accepted: 09/11/2009] [Indexed: 11/17/2022] Open
Abstract
Background Like humans, the living elephants are unusual among mammals in being sparsely covered with hair. Relative to extant elephants, the extinct woolly mammoth, Mammuthus primigenius, had a dense hair cover and extremely long hair, which likely were adaptations to its subarctic habitat. The fibroblast growth factor 5 (FGF5) gene affects hair length in a diverse set of mammalian species. Mutations in FGF5 lead to recessive long hair phenotypes in mice, dogs, and cats; and the gene has been implicated in hair length variation in rabbits. Thus, FGF5 represents a leading candidate gene for the phenotypic differences in hair length notable between extant elephants and the woolly mammoth. We therefore sequenced the three exons (except for the 3' UTR) and a portion of the promoter of FGF5 from the living elephantid species (Asian, African savanna and African forest elephants) and, using protocols for ancient DNA, from a woolly mammoth. Results Between the extant elephants and the mammoth, two single base substitutions were observed in FGF5, neither of which alters the amino acid sequence. Modeling of the protein structure suggests that the elephantid proteins fold similarly to the human FGF5 protein. Bioinformatics analyses and DNA sequencing of another locus that has been implicated in hair cover in humans, type I hair keratin pseudogene (KRTHAP1), also yielded negative results. Interestingly, KRTHAP1 is a pseudogene in elephantids as in humans (although fully functional in non-human primates). Conclusion The data suggest that the coding sequence of the FGF5 gene is not the critical determinant of hair length differences among elephantids. The results are discussed in the context of hairlessness among mammals and in terms of the potential impact of large body size, subarctic conditions, and an aquatic ancestor on hair cover in the Proboscidea.
Affiliation(s)
- Alfred L Roca
- Department of Animal Sciences, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA.
|
615
|
Trindade DM, Silva JC, Navarro MS, Torriani ICL, Kobarg J. Low-resolution structural studies of human Stanniocalcin-1. BMC STRUCTURAL BIOLOGY 2009; 9:57. [PMID: 19712479 PMCID: PMC2744999 DOI: 10.1186/1472-6807-9-57] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.2] [Received: 03/17/2009] [Accepted: 08/27/2009] [Indexed: 11/10/2022]
Abstract
BACKGROUND Stanniocalcins (STCs) are small glycoprotein hormones, found in all vertebrates, which have been functionally implicated in calcium homeostasis. However, recent data from mammalian systems indicate that they may also be involved in embryogenesis and tumorigenesis and, in the context of the latter, especially in angiogenesis. Human STC1 is a 247-amino-acid protein with a predicted molecular mass of 27 kDa, but preliminary data suggested its di- or multimerization. The latter, in conjunction with alternative splicing and/or post-translational modification, gives rise to forms described as STC50 and "big STC", whose molecular weights range from 56 to 135 kDa. RESULTS In this study we performed a biochemical and structural analysis of STC1 with the aim of obtaining low-resolution structural information about human STC1, since structural information on this protein family is scarce. We expressed STC1 in both E. coli and insect cells using the baculovirus system with a C-terminal 6 x His fusion tag. From the latter we obtained reasonable amounts of soluble protein. Circular dichroism analysis showed STC1 to be a well-structured protein with 52% alpha-helical content. Mass spectrometry analysis of the recombinant protein allowed assignment of the five intramolecular disulfide bridges as well as the dimerization Cys202, thereby confirming the conservation of the disulfide pattern previously described for fish STC1. SAXS data also clearly demonstrated that STC1 adopts a dimeric, slightly elongated structure in solution. CONCLUSION Our data reveal the first low-resolution structural information for human STC1. Theoretical predictions and circular dichroism spectroscopy both suggested that STC1 has a high content of alpha-helices, and SAXS experiments revealed that STC1 is a dimer of slightly elongated shape in solution. The dimerization was confirmed by mass spectrometry, as was the highly conserved disulfide pattern, which is identical to that found in fish STC1.
Affiliation(s)
- Daniel M Trindade
- Centro de Biologia Molecular Estrutural (CEBIME), Campinas, SP, Brazil
- Instituto de Biologia, Departamento de Bioquímica, Universidade Estadual de Campinas, Campinas, SP, Brazil
- Júlio C Silva
- Instituto de Física "Gleb Wataghin", Universidade Estadual de Campinas, Campinas, SP, Brazil
- Laboratório Nacional de Luz Síncrotron (LNLS), Campinas, SP, Brazil
- Iris CL Torriani
- Instituto de Física "Gleb Wataghin", Universidade Estadual de Campinas, Campinas, SP, Brazil
- Laboratório Nacional de Luz Síncrotron (LNLS), Campinas, SP, Brazil
- Jörg Kobarg
- Centro de Biologia Molecular Estrutural (CEBIME), Campinas, SP, Brazil
- Instituto de Biologia, Departamento de Bioquímica, Universidade Estadual de Campinas, Campinas, SP, Brazil
|
616
|
Ulijasz AT, Cornilescu G, von Stetten D, Cornilescu C, Velazquez Escobar F, Zhang J, Stankey RJ, Rivera M, Hildebrandt P, Vierstra RD. Cyanochromes are blue/green light photoreversible photoreceptors defined by a stable double cysteine linkage to a phycoviolobilin-type chromophore. J Biol Chem 2009; 284:29757-72. [PMID: 19671704 DOI: 10.1074/jbc.m109.038513] [Citation(s) in RCA: 63] [Impact Index Per Article: 4.2] [Indexed: 01/10/2023]
Abstract
Phytochromes are a collection of bilin-containing photoreceptors that regulate a diverse array of processes in microorganisms and plants through photoconversion between two stable states, a red light-absorbing Pr form, and a far red light-absorbing Pfr form. Recently, a novel set of phytochrome-like chromoproteins was discovered in cyanobacteria, designated here as cyanochromes, that instead photoconvert between stable blue and green light-absorbing forms Pb and Pg, respectively. Here, we show that the distinctive absorption properties of cyanochromes are facilitated through the binding of phycocyanobilin via two stable cysteine-based thioether linkages within the cGMP phosphodiesterase/adenyl cyclase/FhlA domain. Absorption, resonance Raman and infrared spectroscopy, and molecular modeling of the Te-PixJ GAF (cGMP phosphodiesterase/adenyl cyclase/FhlA) domain assembled with phycocyanobilin are consistent with attachments to the C3(1) carbon of the ethylidene side chain and the C4 or C5 carbons in the A-B methine bridge to generate a double thioether-linked phycoviolobilin-type chromophore. These spectroscopic methods combined with NMR data show that the bilin is fully protonated in the Pb and Pg states and that numerous conformation changes occur during Pb --> Pg photoconversion. Also identified were a number of photochromically inactive mutants with strong yellow or red fluorescence that may be useful for fluorescence-based cell biological assays. Phylogenetic analyses detected cyanochromes capable of different signaling outputs in a wide range of cyanobacterial species. One unusual case is the Synechocystis cyanochrome Etr1 that also binds ethylene, suggesting that it works as a hybrid receptor to simultaneously integrate light and hormone signals.
Affiliation(s)
- Andrew T Ulijasz
- Department of Genetics, University of Wisconsin, Madison, WI 53706, USA
|
617
|
Petersen B, Petersen TN, Andersen P, Nielsen M, Lundegaard C. A generic method for assignment of reliability scores applied to solvent accessibility predictions. BMC STRUCTURAL BIOLOGY 2009; 9:51. [PMID: 19646261 PMCID: PMC2725087 DOI: 10.1186/1472-6807-9-51] [Citation(s) in RCA: 474] [Impact Index Per Article: 31.6] [Received: 02/10/2009] [Accepted: 07/31/2009] [Indexed: 11/25/2022]
Abstract
BACKGROUND Estimation of the reliability of specific real value predictions is nontrivial and the efficacy of this is often questionable. It is important to know if you can trust a given prediction and therefore the best methods associate a prediction with a reliability score or index. For discrete qualitative predictions, the reliability is conventionally estimated as the difference between output scores of selected classes. Such an approach is not feasible for methods that predict a biological feature as a single real value rather than a classification. As a solution to this challenge, we have implemented a method that predicts the relative surface accessibility of an amino acid and simultaneously predicts the reliability for each prediction, in the form of a Z-score. RESULTS An ensemble of artificial neural networks has been trained on a set of experimentally solved protein structures to predict the relative exposure of the amino acids. The method assigns a reliability score to each surface accessibility prediction as an inherent part of the training process. This is in contrast to the most commonly used procedures where reliabilities are obtained by post-processing the output. CONCLUSION The performance of the neural networks was evaluated on a commonly used set of sequences known as the CB513 set. An overall Pearson's correlation coefficient of 0.72 was obtained, which is comparable to the performance of the currently best publicly available method, Real-SPINE. Both methods associate a reliability score with the individual predictions. However, our implementation of reliability scores in the form of a Z-score is shown to be the more informative measure for discriminating good predictions from bad ones in the entire range from completely buried to fully exposed amino acids. This is evident when comparing the Pearson's correlation coefficient for the upper 20% of predictions sorted according to reliability. For this subset, values of 0.79 and 0.74 are obtained using our and the compared method, respectively. This tendency is true for any selected subset.
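The evaluation described in this abstract (Pearson's correlation on the 20% of predictions with the highest reliability) can be sketched as follows. This is a minimal illustration only: the function names, the synthetic data, and the use of negative noise level as a stand-in reliability score are assumptions made here, not the authors' code or data.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def pearson_top_fraction(pred, obs, reliability, fraction=0.2):
    """Correlation restricted to the most reliable `fraction` of predictions."""
    # Rank indices from most to least reliable, then keep the top slice.
    ranked = sorted(range(len(pred)), key=lambda i: reliability[i], reverse=True)
    keep = ranked[: max(2, int(len(ranked) * fraction))]
    return pearson([pred[i] for i in keep], [obs[i] for i in keep])


# Synthetic data (hypothetical, for illustration): each prediction is the
# observed value plus noise, and the reliability score is higher when the
# noise level is lower.
rng = random.Random(0)
n = 2000
obs = [rng.random() for _ in range(n)]
noise_sd = [rng.uniform(0.05, 0.5) for _ in range(n)]
pred = [o + rng.gauss(0.0, s) for o, s in zip(obs, noise_sd)]
reliability = [-s for s in noise_sd]

overall = pearson(pred, obs)
top20 = pearson_top_fraction(pred, obs, reliability, fraction=0.2)
print(f"overall r = {overall:.2f}, top 20% r = {top20:.2f}")
```

An informative reliability score should give a higher correlation on the top-ranked subset than on the full set, which is the comparison the abstract reports (0.79 versus 0.72 overall).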
Affiliation(s)
- Bent Petersen
- Center for Biological Sequence Analysis – CBS, Department of Systems Biology, Kemitorvet 208, Technical University of Denmark – DTU, DK-2800 Lyngby, Denmark
- Thomas Nordahl Petersen
- Center for Biological Sequence Analysis – CBS, Department of Systems Biology, Kemitorvet 208, Technical University of Denmark – DTU, DK-2800 Lyngby, Denmark
- Pernille Andersen
- Center for Biological Sequence Analysis – CBS, Department of Systems Biology, Kemitorvet 208, Technical University of Denmark – DTU, DK-2800 Lyngby, Denmark
- Centre for Medical Parasitology – CMP, CSS Building 22, University of Copenhagen, DK-1014 Copenhagen, Denmark
- Morten Nielsen
- Center for Biological Sequence Analysis – CBS, Department of Systems Biology, Kemitorvet 208, Technical University of Denmark – DTU, DK-2800 Lyngby, Denmark
- Claus Lundegaard
- Center for Biological Sequence Analysis – CBS, Department of Systems Biology, Kemitorvet 208, Technical University of Denmark – DTU, DK-2800 Lyngby, Denmark
|
618
|
Thusberg J, Vihinen M. Pathogenic or not? And if so, then how? Studying the effects of missense mutations using bioinformatics methods. Hum Mutat 2009; 30:703-14. [PMID: 19267389 DOI: 10.1002/humu.20938] [Citation(s) in RCA: 181] [Impact Index Per Article: 12.1] [Indexed: 02/05/2023]
Abstract
Many gene defects are relatively easy to identify experimentally, but obtaining information about the effects of sequence variations and elucidation of the detailed molecular mechanisms of genetic diseases will be among the next major efforts in mutation research. Amino acid substitutions may have diverse effects on protein structure and function; thus, a detailed analysis of the mutations is essential. Experimental study of the molecular effects of mutations is laborious, whereas useful and reliable information about the effects of amino acid substitutions can readily be obtained by theoretical methods. Experimentally defined structures and molecular modeling can be used as a basis for interpretation of the mutations. The effects of missense mutations can be analyzed even when the 3D structure of the protein has not been determined, although structure-based analyses are more reliable. Structural analyses include studies of the contacts between residues, their implication for the stability of the protein, and the effects of the introduced residues. Investigations of steric and stereochemical consequences of substitutions provide insights on the molecular fit of the introduced residue. Mutations that change the electrostatic surface potential of a protein have wide-ranging effects. Analyses of the effects of mutations on interactions with ligands and partners have been performed for elucidation of functional mutations. We have employed numerous methods for predicting the effects of amino acid substitutions. We discuss the applicability of these methods in the analysis of genes, proteins, and diseases to reveal protein structure-function relationships, which is essential to gain insights into disease genotype-phenotype correlations.
Affiliation(s)
- Janita Thusberg
- Institute of Medical Technology, FI-33014 University of Tampere, Finland
|
619
|
Lippi M, Frasconi P. Prediction of protein beta-residue contacts by Markov logic networks with grounding-specific weights. Bioinformatics 2009; 25:2326-33. [PMID: 19592394 DOI: 10.1093/bioinformatics/btp421] [Citation(s) in RCA: 38] [Impact Index Per Article: 2.5] [Indexed: 11/13/2022]
Abstract
MOTIVATION Accurate prediction of contacts between beta-strand residues can significantly contribute towards ab initio prediction of the 3D structure of many proteins. Contacts in the same protein are highly interdependent. Therefore, significant improvements can be expected by applying statistical relational learners that overcome the usual machine learning assumption that examples are independent and identically distributed. Furthermore, the dependencies among beta-residue contacts are subject to strong regularities, many of which are known a priori. In this article, we take advantage of Markov logic, a statistical relational learning framework that is able to capture dependencies between contacts, and constrain the solution according to domain knowledge expressed by means of weighted rules in a logical language. RESULTS We introduce a novel hybrid architecture based on neural and Markov logic networks with grounding-specific weights. On a non-redundant dataset, our method achieves 44.9% F(1) measure, with 47.3% precision and 42.7% recall, which is significantly better (P < 0.01) than previously reported performance obtained by 2D recursive neural networks. Our approach also significantly improves the number of chains for which beta-strands are nearly perfectly paired (36% of the chains are predicted with F(1) ≥ 70% on coarse map). It also outperforms more general contact predictors on recent CASP 2008 targets.
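The F(1) measure quoted in this abstract is the harmonic mean of precision and recall. A short sketch of that standard formula, applied to the reported precision and recall, shows how the 44.9% figure follows; the function name is a hypothetical chosen for illustration.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (the standard F1 definition)."""
    if precision + recall == 0:
        return 0.0  # conventional value when both inputs are zero
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported in the abstract above.
f1 = f1_score(0.473, 0.427)
print(f"F1 = {f1:.1%}")  # prints "F1 = 44.9%"
```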
Affiliation(s)
- Marco Lippi
- Dipartimento di Sistemi e Informatica, Università degli Studi di Firenze, Firenze, Italy.
|
620
|
Tegge AN, Wang Z, Eickholt J, Cheng J. NNcon: improved protein contact map prediction using 2D-recursive neural networks. Nucleic Acids Res 2009; 37:W515-8. [PMID: 19420062 PMCID: PMC2703959 DOI: 10.1093/nar/gkp305] [Citation(s) in RCA: 110] [Impact Index Per Article: 7.3] [Received: 01/30/2009] [Revised: 04/13/2009] [Accepted: 04/16/2009] [Indexed: 11/13/2022]
Abstract
Protein contact map prediction is useful for protein folding rate prediction, model selection and 3D structure prediction. Here we describe NNcon, a fast and reliable contact map prediction server and software. NNcon was ranked among the most accurate residue contact predictors in the Eighth Critical Assessment of Techniques for Protein Structure Prediction (CASP8), 2008. Both NNcon server and software are available at http://casp.rnet.missouri.edu/nncon.html.
Affiliation(s)
- Jianlin Cheng
- Computer Science Department, Informatics Institute, University of Missouri, Columbia, MO 65213, USA
|
621
|
Verspurten J, Gevaert K, Declercq W, Vandenabeele P. SitePredicting the cleavage of proteinase substrates. Trends Biochem Sci 2009; 34:319-23. [DOI: 10.1016/j.tibs.2009.04.001] [Citation(s) in RCA: 96] [Impact Index Per Article: 6.4] [Received: 04/27/2009] [Revised: 04/27/2009] [Accepted: 04/28/2009] [Indexed: 11/16/2022]
|
622
|
Deschavanne P, Tufféry P. Enhanced protein fold recognition using a structural alphabet. Proteins 2009; 76:129-37. [DOI: 10.1002/prot.22324] [Citation(s) in RCA: 30] [Impact Index Per Article: 2.0] [Indexed: 11/08/2022]
|
623
|
Schiermeyer A, Hartenstein H, Mandal MK, Otte B, Wahner V, Schillberg S. A membrane-bound matrix-metalloproteinase from Nicotiana tabacum cv. BY-2 is induced by bacterial pathogens. BMC PLANT BIOLOGY 2009; 9:83. [PMID: 19563670 PMCID: PMC2715019 DOI: 10.1186/1471-2229-9-83] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.9] [Received: 02/11/2009] [Accepted: 06/29/2009] [Indexed: 05/22/2023]
Abstract
BACKGROUND Plant matrix metalloproteinases (MMP) are conserved proteolytic enzymes found in a wide range of monocotyledonous and dicotyledonous plant species. Acting on the plant extracellular matrix, they play crucial roles in many aspects of plant physiology including growth, development and the response to stresses such as pathogen attack. RESULTS We have identified the first tobacco MMP, designated NtMMP1, and have isolated the corresponding cDNA sequence from the tobacco suspension cell line BY-2. The overall domain structure of NtMMP1 is similar to known MMP sequences, although certain features suggest it may be constitutively active rather than dependent on proteolytic processing. The protein appears to be expressed in two forms with different molecular masses, both of which are enzymatically active as determined by casein zymography. Exchanging the catalytic domain of NtMMP1 with green fluorescent protein (GFP) facilitated subcellular localization by confocal laser scanning microscopy, showing the protein is normally inserted into the plasma membrane. The NtMMP1 gene is expressed constitutively at a low level but can be induced by exposure to bacterial pathogens. CONCLUSION Our biochemical analysis of NtMMP1 together with bioinformatic data on the primary sequence indicate that NtMMP1 is a constitutively-active protease. Given its induction in response to bacterial pathogens and its localization in the plasma membrane, we propose a role in pathogen defense at the cell periphery.
Affiliation(s)
- Andreas Schiermeyer
- Fraunhofer Institute for Molecular Biology and Applied Ecology (IME), Department Plant Biotechnology, Forckenbeckstrasse 6, 52074 Aachen, Germany
- Hanna Hartenstein
- RWTH Aachen University, Institute for Molecular Biotechnology, Worringerweg 1, 52074 Aachen, Germany
- Manoj K Mandal
- RWTH Aachen University, Institute for Molecular Biotechnology, Worringerweg 1, 52074 Aachen, Germany
- Burkhard Otte
- RWTH Aachen University, Institute for Molecular Biotechnology, Worringerweg 1, 52074 Aachen, Germany
- Verena Wahner
- Aachen University for Applied Sciences, Campus Juelich, Ginsterweg 1, 52428 Juelich, Germany
- Stefan Schillberg
- Fraunhofer Institute for Molecular Biology and Applied Ecology (IME), Department Plant Biotechnology, Forckenbeckstrasse 6, 52074 Aachen, Germany
|
624
|
Magnan CN, Randall A, Baldi P. SOLpro: accurate sequence-based prediction of protein solubility. Bioinformatics 2009; 25:2200-7. [PMID: 19549632 DOI: 10.1093/bioinformatics/btp386] [Citation(s) in RCA: 357] [Impact Index Per Article: 23.8] [Indexed: 11/13/2022]
Abstract
MOTIVATION Protein insolubility is a major obstacle for many experimental studies. A sequence-based prediction method able to accurately predict the propensity of a protein to be soluble on overexpression could be used, for instance, to prioritize targets in large-scale proteomics projects and to identify mutations likely to increase the solubility of insoluble proteins. RESULTS Here, we first curate a large, non-redundant and balanced training set of more than 17 000 proteins. Next, we extract and study 23 groups of features computed directly or predicted (e.g. secondary structure) from the primary sequence. The data and the features are used to train a two-stage support vector machine (SVM) architecture. The resulting predictor, SOLpro, is compared directly with existing methods and shows significant improvement according to standard evaluation metrics, with an overall accuracy of over 74% estimated using multiple runs of 10-fold cross-validation.
Affiliation(s)
- Christophe N Magnan
- Institute for Genomics and Bioinformatics, School of Information and Computer Sciences, University of California, Irvine, CA, USA
|
625
|
Fang J, Koen YM, Hanzlik RP. Bioinformatic analysis of xenobiotic reactive metabolite target proteins and their interacting partners. BMC CHEMICAL BIOLOGY 2009; 9:5. [PMID: 19523227 PMCID: PMC2711050 DOI: 10.1186/1472-6769-9-5] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.3] [Received: 02/02/2009] [Accepted: 06/12/2009] [Indexed: 12/12/2022]
Abstract
BACKGROUND Protein covalent binding by reactive metabolites of drugs, chemicals and natural products can lead to acute cytotoxicity. Recent rapid progress in reactive metabolite target protein identification has shown that adduction is surprisingly selective and inspired the hope that analysis of target proteins might reveal protein factors that differentiate target from non-target proteins and illuminate mechanisms connecting covalent binding to cytotoxicity. RESULTS Sorting 171 known reactive metabolite target proteins revealed a number of GO categories and KEGG pathways to be significantly enriched in targets, but in most cases the classes were too large, and the "percent coverage" too small, to allow meaningful conclusions about mechanisms of toxicity. However, a similar analysis of the directly interacting partners of 28 common targets of multiple reactive metabolites revealed highly significant enrichments in terms likely to be highly relevant to cytotoxicity (e.g., MAP kinase pathways, apoptosis, response to unfolded protein). Machine learning was used to rank the contribution of 211 computed protein features to determining protein susceptibility to adduction. Protein lysine (but not cysteine) content and protein instability index (i.e., rate of turnover in vivo) were among the features most important to determining susceptibility. CONCLUSION As yet there is no good explanation for why some low-abundance proteins become heavily adducted while some abundant proteins become only lightly adducted in vivo. Analyzing the directly interacting partners of target proteins appears to yield greater insight into mechanisms of toxicity than analyzing target proteins per se. The insights provided can readily be formulated as hypotheses to test in future experimental studies.
Affiliation(s)
- Jianwen Fang
- Applied Bioinformatics Laboratory, University of Kansas, Lawrence, KS 66045, USA
- Yakov M Koen
- Department of Medicinal Chemistry, University of Kansas, Lawrence, KS 66045, USA
- Robert P Hanzlik
- Department of Medicinal Chemistry, University of Kansas, Lawrence, KS 66045, USA
|
626
|
Shiryaev SA, Chernov AV, Aleshin AE, Shiryaeva TN, Strongin AY. NS4A regulates the ATPase activity of the NS3 helicase: a novel cofactor role of the non-structural protein NS4A from West Nile virus. J Gen Virol 2009; 90:2081-5. [PMID: 19474250 DOI: 10.1099/vir.0.012864-0] [Citation(s) in RCA: 62] [Impact Index Per Article: 4.1] [Indexed: 12/18/2022]
Abstract
Using constructs that encode the individual West Nile virus (WNV) NS3 helicase (NS3hel) and NS3hel linked to the hydrophilic, N-terminal 1-50 sequence of NS4A, we demonstrated that the presence of NS4A allows NS3hel to conserve energy in the course of oligonucleotide substrate unwinding. Using NS4A mutants, we also determined that the C-terminal acidic EELPD/E motif of NS4A, which appears to be functionally similar to the acidic EFDEMEE motif of hepatitis C virus (HCV) NS4A, is essential for regulating the ATPase activity of NS3hel. We concluded that, similar to HCV NS4A, NS4A of WNV acts as a cofactor for NS3hel and allows the helicase to sustain the unwinding rate of the viral RNA under conditions of ATP deficiency.
|
627
|
Bhide MR, Mucha R, Mikula I, Kisova L, Skrabana R, Novak M, Mikula I. Novel mutations in TLR genes cause hyporesponsiveness to Mycobacterium avium subsp. paratuberculosis infection. BMC Genet 2009; 10:21. [PMID: 19470169 PMCID: PMC2705378 DOI: 10.1186/1471-2156-10-21] [Citation(s) in RCA: 39] [Impact Index Per Article: 2.6] [Received: 12/21/2008] [Accepted: 05/26/2009] [Indexed: 11/10/2022]
Abstract
BACKGROUND Toll-like receptors (TLR) play a central role in the recognition of pathogen-associated molecular patterns (PAMPs). Mutations in the TLR1, TLR2 and TLR4 genes may change the ability to recognize PAMPs and cause altered responsiveness to bacterial pathogens. RESULTS The study presents an association between TLR gene mutations and increased susceptibility to Mycobacterium avium subsp. paratuberculosis (MAP) infection. Novel mutations in TLR genes (TLR1 - Ser150Gly and Val220Met; TLR2 - Phe670Leu) were statistically correlated with hindrance in the recognition of MAP ligands. This correlation was subsequently confirmed by measuring the expression levels of cytokines (IL-4, IL-8, IL-10, IL-12 and IFN-gamma) in mutant and wild-type moDCs (monocyte-derived dendritic cells) after challenge with MAP cell lysate or LPS. Further in silico analysis of the TLR1 and TLR4 ectodomains (ECD) revealed the polymorphic nature of the central ECD and irregularities in the central LRR (leucine-rich repeat) motifs. CONCLUSION The most critical positions that may alter the pathogen recognition ability of TLR were the 9th amino acid position in the LRR motif (TLR1-LRR10) and the 4th residue downstream of the LRR domain (extra-LRR region of TLR4). The study describes novel mutations in the TLRs and presents their association with MAP infection.
Affiliation(s)
- Mangesh R Bhide
- Laboratory of Biomedical Microbiology and Immunology, University of Veterinary Medicine, Komenskeho-73, Kosice, Slovakia.
|
628
|
Benkert P, Künzli M, Schwede T. QMEAN server for protein model quality estimation. Nucleic Acids Res 2009; 37:W510-4. [PMID: 19429685 DOI: 10.1093/nar/gkp322] [Citation(s) in RCA: 593] [Impact Index Per Article: 39.5] [Indexed: 11/12/2022]
Abstract
Model quality estimation is an essential component of protein structure prediction, since ultimately the accuracy of a model determines its usefulness for specific applications. Usually, in the course of protein structure prediction a set of alternative models is produced, from which subsequently the most accurate model has to be selected. The QMEAN server provides access to two scoring functions successfully tested at the eighth round of the community-wide blind test experiment CASP. The user can choose between the composite scoring function QMEAN, which derives a quality estimate on the basis of the geometrical analysis of single models, and the clustering-based scoring function QMEANclust which calculates a global and local quality estimate based on a weighted all-against-all comparison of the models from the ensemble provided by the user. The web server performs a ranking of the input models and highlights potentially problematic regions for each model. The QMEAN server is available at http://swissmodel.expasy.org/qmean.
|
629
|
Pivotal roles of the outer membrane polysaccharide export and polysaccharide copolymerase protein families in export of extracellular polysaccharides in gram-negative bacteria. Microbiol Mol Biol Rev 2009; 73:155-77. [PMID: 19258536 PMCID: PMC2650888 DOI: 10.1128/mmbr.00024-08] [Citation(s) in RCA: 185] [Impact Index Per Article: 12.3] [Indexed: 02/01/2023]
Abstract
Many bacteria export extracellular polysaccharides (EPS) and capsular polysaccharides (CPS). These polymers exhibit remarkably diverse structures and play important roles in the biology of free-living, commensal, and pathogenic bacteria. EPS and CPS production represents a major challenge because these high-molecular-weight hydrophilic polymers must be assembled and exported in a process spanning the envelope, without compromising the essential barrier properties of the envelope. Emerging evidence points to the existence of molecular scaffolds that perform these critical polymer-trafficking functions. Two major pathways with different polymer biosynthesis strategies are involved in the assembly of most EPS/CPS: the Wzy-dependent and ATP-binding cassette (ABC) transporter-dependent pathways. They converge in an outer membrane export step mediated by a member of the outer membrane auxiliary (OMA) protein family. OMA proteins form outer membrane efflux channels for the polymers, and here we propose the revised name outer membrane polysaccharide export (OPX) proteins. Proteins in the polysaccharide copolymerase (PCP) family have been implicated in several aspects of polymer biogenesis, but there is unequivocal evidence for some systems that PCP and OPX proteins interact to form a trans-envelope scaffold for polymer export. Understanding of the precise functions of the OPX and PCP proteins has been advanced by recent findings from biochemistry and structural biology approaches and by parallel studies of other macromolecular trafficking events. Phylogenetic analyses reported here also contribute important new insight into the distribution, structural relationships, and function of the OPX and PCP proteins. This review is intended as an update on progress in this important area of microbial cell biology.
|
630
|
Mooney C, Pollastri G. Beyond the Twilight Zone: Automated prediction of structural properties of proteins by recursive neural networks and remote homology information. Proteins 2009; 77:181-90. [DOI: 10.1002/prot.22429] [Citation(s) in RCA: 43] [Impact Index Per Article: 2.9] [Indexed: 11/06/2022]
|
631
|
Stagner EE, Bouvrette DJ, Cheng J, Bryda EC. The polycystic kidney disease-related proteins Bicc1 and SamCystin interact. Biochem Biophys Res Commun 2009; 383:16-21. [PMID: 19324013 DOI: 10.1016/j.bbrc.2009.03.113] [Citation(s) in RCA: 40] [Impact Index Per Article: 2.7] [Received: 03/16/2009] [Accepted: 03/19/2009] [Indexed: 12/27/2022]
Abstract
Mutations in either the Bicaudal-C or the Anks6 gene, which encode the Bicc1 and SamCystin proteins respectively, cause formation of renal cysts in rodent models of polycystic kidney disease; however, their role in the mammalian kidney is unknown. Immunolocalization studies demonstrated that, unlike many other PKD-related proteins, SamCystin and Bicc1 do not localize to the primary cilia of cultured kidney cells. Epitope-tagged recombinant SamCystin and Bicc1 proteins were transiently transfected into inner medullary collecting duct (IMCD) cells and co-immunoprecipitated. The results showed that SamCystin self-associates, Bicc1 and SamCystin interact, the mutation responsible for PKD in the Han:SPRD-Cy rat disrupts the self-association of SamCystin but not the Bicc1-SamCystin interaction, and RNA may be an important component of the Bicc1-SamCystin complex. These studies provide the first evidence that Bicc1 and SamCystin interact at the protein level, suggesting that they function in a common molecular pathway that, when perturbed, is involved in cystogenesis.
Affiliation(s)
- Emily E Stagner
- Department of Veterinary Pathobiology, University of Missouri, Columbia, 65211, USA
632
Nuclear magnetic resonance and circular dichroism study of metastin (Kisspeptin-54) structure in solution. Clin Exp Metastasis 2009; 26:527-33. [PMID: 19308666 DOI: 10.1007/s10585-009-9252-0] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Received: 11/25/2008] [Accepted: 03/09/2009] [Indexed: 12/12/2022]
Abstract
KISS1 was first discovered as a metastasis suppressor, but also plays crucial roles in the onset of puberty. The KISS1 gene encodes a secreted protein of 145 amino acids that exhibits no sequence similarity with any known proteins. KISS1 protein is proteolytically processed to generate a number of so-called kisspeptins (KP), the best characterized of which is known as KP-54 or metastin. KP-54 is carboxy-terminally amidated and binds to and activates the KISS1 receptor (KISS1R). The current studies were undertaken to determine the structure of KP-54 using nuclear magnetic resonance and circular dichroism. KP-54 is mostly disordered both in water and in trifluoroethanol/water mixed solvent, with no structural motifs. In sodium dodecyl sulfate micelles, KP-54 remains mostly disordered except for a small increase in helical propensity (from 3.7% in water to 9.9% in micelles). Despite this apparent lack of structure, KP-54 is biologically active. The intrinsic disorder of KP-54 may confer advantages in its ability to recognize and bind a wide range of target proteins.
633
Shi X, Elliott RM. Generation and analysis of recombinant Bunyamwera orthobunyaviruses expressing V5 epitope-tagged L proteins. J Gen Virol 2009; 90:297-306. [PMID: 19141438 PMCID: PMC2885054 DOI: 10.1099/vir.0.007567-0] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.3] [Indexed: 12/16/2022] Open
Abstract
The L protein of Bunyamwera virus (BUNV; family Bunyaviridae) is an RNA-dependent RNA polymerase, 2238 aa in length, that catalyses transcription and replication of the negative-sense, tripartite RNA genome. To learn more about the molecular interactions of the L protein and to monitor its intracellular distribution we inserted a 14 aa V5 epitope derived from parainfluenza virus type 5, against which high-affinity antibodies are available, into different regions of the protein. Insertion of the epitope at positions 1935 or 2046 resulted in recombinant L proteins that retained functionality in a minireplicon assay. Two viable recombinant viruses, rBUNL4V5 and rBUNL5V5, expressing the tagged L protein were rescued by reverse genetics, and characterized with respect to their plaque size, growth kinetics and protein synthesis profile. The recombinant viruses behaved similarly to wild-type (wt) BUNV in BHK-21 cells, but formed smaller plaques and grew to lower titres in Vero E6 cells compared with wt BUNV. Immunofluorescent staining of infected cells showed the L protein to have a punctate to reticular distribution in the cytoplasm, and cell fractionation studies indicated that the L protein was present in both soluble and microsomal fractions. Co-immunoprecipitation and confocal microscopic assays confirmed an interaction between BUNV L and N proteins. The recombinant viruses expressing tagged L protein will be highly valuable reagents for the detailed dissection of the role of the BUNV L protein in virus replication.
Affiliation(s)
- Xiaohong Shi
- Centre for Biomolecular Sciences, School of Biology, University of St Andrews, North Haugh, St Andrews, Scotland KY16 9ST, UK
- Richard M Elliott
- Centre for Biomolecular Sciences, School of Biology, University of St Andrews, North Haugh, St Andrews, Scotland KY16 9ST, UK
634
Mimicking the folding pathway to improve homology-free protein structure prediction. Proc Natl Acad Sci U S A 2009; 106:3734-9. [PMID: 19237560 DOI: 10.1073/pnas.0811363106] [Citation(s) in RCA: 55] [Impact Index Per Article: 3.7] [Indexed: 11/18/2022] Open
Abstract
Since the demonstration that the sequence of a protein encodes its structure, the prediction of structure from sequence remains an outstanding problem that impacts numerous scientific disciplines, including many genome projects. By iteratively fixing secondary structure assignments of residues during Monte Carlo simulations of folding, our coarse-grained model without information concerning homology or explicit side chains can outperform current homology-based secondary structure prediction methods for many proteins. The computationally rapid algorithm using only single (phi,psi) dihedral angle moves also generates tertiary structures of accuracy comparable with existing all-atom methods for many small proteins, particularly those with low homology. Hence, given appropriate search strategies and scoring functions, reduced representations can be used for accurately predicting secondary structure and providing 3D structures, thereby increasing the size of proteins approachable by homology-free methods and the accuracy of template methods that depend on a high-quality input secondary structure.
635
Nabuurs SM, Westphal AH, van Mierlo CPM. Extensive formation of off-pathway species during folding of an alpha-beta parallel protein is due to docking of (non)native structure elements in unfolded molecules. J Am Chem Soc 2009; 130:16914-20. [PMID: 19053416 DOI: 10.1021/ja803841n] [Citation(s) in RCA: 31] [Impact Index Per Article: 2.1] [Indexed: 11/29/2022]
Abstract
Detailed information about unfolded states is required to understand how proteins fold. Knowledge about folding intermediates formed subsequently is essential to get a grip on pathological aggregation phenomena. During folding of apoflavodoxin, which adopts the widely prevalent alpha-beta parallel topology, most molecules fold via an off-pathway folding intermediate with helical properties. To better understand why this species is formed, guanidine hydrochloride-unfolded apoflavodoxin is characterized at the residue level using heteronuclear NMR spectroscopy. In 6.0 M denaturant, the protein behaves as a random coil. In contrast, at 3.4 M denaturant, secondary shifts and (1)H-(15)N relaxation rates report four transiently ordered regions in unfolded apoflavodoxin. These regions have restricted flexibility on the (sub)nanosecond time scale. Secondary shifts show that three of these regions form alpha-helices, which are populated about 10% of the time, as confirmed by far-UV CD data. One region of unfolded apoflavodoxin adopts non-native structure. Of the alpha-helices observed, two are present in native apoflavodoxin as well. A substantial part of the third helix becomes beta-strand while forming native protein. Chemical shift changes due to amino acid residue replacement show that the latter alpha-helix has hydrophobic interactions with all other ordered regions in unfolded apoflavodoxin. Remarkably, these ordered segments dock non-natively, which causes strong competition with on-pathway folding. Thus, rather than directing productive folding, conformational preorganization in the unfolded state of an alpha-beta parallel-type protein promotes off-pathway species formation.
Affiliation(s)
- Sanne M Nabuurs
- Laboratory of Biochemistry, Wageningen University, Dreijenlaan 3, 6703 HA Wageningen, The Netherlands
636
Madhusmita S, Singh H, Karlapalem K, Mitra A. A real valued Genetic Algorithm for generating native like structure of small globular protein. ANNUAL INTERNATIONAL CONFERENCE OF THE IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY. IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY. ANNUAL INTERNATIONAL CONFERENCE 2009; 2008:1359-62. [PMID: 19162920 DOI: 10.1109/iembs.2008.4649417] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/08/2022]
Abstract
Predicting the 3D native conformation of a protein given the amino acid sequence is known as the protein structure prediction (PSP) problem and is one of the greatest challenges of computational biology. The current work uses a real valued Genetic Algorithm (GA), a powerful variant of GA, to simulate the PSP problem. This algorithm consists of basic evolutionary operators and a fitness vector. The fitness vector is designed by combining a set of knowledge based biophysical filters, viz. persistence length, radius of gyration, packing fraction, hydrophobicity ratio and irregularity index of phi and psi. This vector converts all these biophysical measures into a real value by using specific weights or factors. The algorithm has been validated on six known globular proteins, with their length varying from 17-61 residues and total number of helices and strands in the range of 2-4. For all the test proteins the algorithm converges rapidly and the converged structure shows a backbone RMSD (root mean square deviation) of 3-6 Å as compared to the native structure.
Affiliation(s)
- S Madhusmita
- Centre for Computational Natural Sciences and Bioinformatics (CCNSB), International Institute of Information Technology, Gachibowli, Hyderabad-500032, India
637
Usuki S, Nakatani Y, Taguchi K, Fujita T, Tanabe S, Ustunomiya I, Gu Y, Cawthraw SA, Newell DG, Pajaniappan M, Thompson SA, Ariga T, Yu RK. Topology and patch-clamp analysis of the sodium channel in relationship to the anti-lipid A antibody in campylobacteriosis. J Neurosci Res 2009; 86:3359-74. [PMID: 18627035 DOI: 10.1002/jnr.21781] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.4] [Indexed: 11/07/2022]
Abstract
An infecting strain VLA2/18 of Campylobacter jejuni was obtained from an individual with campylobacteriosis and used to prepare chicken sera by experimental infection to investigate the role of serum anti-ganglioside antibodies in Guillain-Barré syndrome. Both sera of the patient and chicken contained anti-ganglioside antibodies and anti-Lipid A (anti-Kdo2-Lipid A) antibodies directed against the lipid A portion of the bacterial lipooligosaccharide. The anti-Kdo2-Lipid A activities inhibited voltage-gated Na (Nav) channel of NSC-34 cells in culture. We hypothesized that anti-Kdo2-Lipid A antibody acts on the functional inhibition of Nav1.4. To test this possibility, a rabbit peptide antibody (anti-Nav1.4 pAb) against a 19-mer peptide (KELKDNHILNHVGLTDGPR) on the alpha subunit of Nav1.4 was produced. Anti-Nav1.4 pAb was cross-reactive to Kdo2-Lipid A. Anti-Kdo2-lipid A antibody activity in the chicken serum was tested for the Na(+) current inhibition in NSC-34 cells in combination with mu-Conotoxin and tetrodotoxin. Contrary to our expectations, the anti-Kdo2-Lipid A antibody activity was extended to Nav channels other than Nav1.4. By overlapping structural analysis, it was found that there might be multiple peptide epitopes containing certain dipeptides showing a structural similarity with v-Lipid A. Thus, our study suggests the possibility that there are multiple epitopic peptides on the extracellular domains of Nav1.1 to 1.9, and some of them may represent target sites for anti-Kdo2-Lipid A antibody, to induce neurophysiological changes in GBS by disrupting the normal function of the Nav channels.
Affiliation(s)
- Seigo Usuki
- Institute of Molecular Medicine and Genetics, Medical College of Georgia, Augusta, Georgia 30912-2697, USA
638
Brown JB, Akutsu T. Identification of novel DNA repair proteins via primary sequence, secondary structure, and homology. BMC Bioinformatics 2009; 10:25. [PMID: 19154573 PMCID: PMC2660303 DOI: 10.1186/1471-2105-10-25] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.1] [Received: 04/09/2008] [Accepted: 01/20/2009] [Indexed: 01/20/2023] Open
Abstract
BACKGROUND DNA repair is the general term for the collection of critical mechanisms which repair many forms of DNA damage such as methylation or ionizing radiation. DNA repair has mainly been studied in experimental and clinical situations, and relatively few information-based approaches to extracting new DNA repair knowledge exist. As a first step, automatic detection of DNA repair proteins in genomes via informatics techniques is desirable; however, there are many forms of DNA repair and it is not a straightforward process to identify and classify repair proteins with a single optimal method. We perform a study of the ability of homology and machine learning-based methods to identify and classify DNA repair proteins, as well as scan vertebrate genomes for the presence of novel repair proteins. Combinations of primary sequence polypeptide frequency, secondary structure, and homology information are used as feature information for input to a Support Vector Machine (SVM). RESULTS We identify that SVM techniques are capable of identifying portions of DNA repair protein datasets without admitting false positives; at low levels of false positive tolerance, homology can also identify and classify proteins with good performance. Secondary structure information provides improved performance compared to using primary structure alone. Furthermore, we observe that machine learning methods incorporating homology information perform best when data is filtered by some clustering technique. Analysis by applying these methodologies to the scanning of multiple vertebrate genomes confirms a positive correlation between the size of a genome and the number of DNA repair protein transcripts it is likely to contain, and simultaneously suggests that all organisms have a non-zero minimum number of repair genes. In addition, the scan result clusters several organisms' repair abilities in an evolutionarily consistent fashion. Analysis also identifies several functionally unconfirmed proteins that are highly likely to be involved in the repair process. A new web service, INTREPED, has been made available for the immediate search and annotation of DNA repair proteins in newly sequenced genomes. CONCLUSION Despite complexity due to a multitude of repair pathways, combinations of sequence, structure, and homology with Support Vector Machines offer good methods in addition to existing homology searches for DNA repair protein identification and functional annotation. Most importantly, this study has uncovered relationships between the size of a genome and a genome's available repair repertoire, and offers a number of new predictions as well as a prediction service, both of which reduce the search time and cost for novel repair genes and proteins.
Affiliation(s)
- J B Brown
- Bioinformatics Center, Institute for Chemical Research, Kyoto University, Uji, Kyoto, 611-0011, Japan.
639
Design and characterization of novel hybrid peptides from LFB15(W4,10), HP(2-20), and cecropin A based on structure parameters by computer-aided method. Appl Microbiol Biotechnol 2009; 82:1097-103. [PMID: 19148638 DOI: 10.1007/s00253-008-1839-x] [Citation(s) in RCA: 25] [Impact Index Per Article: 1.7] [Received: 11/14/2008] [Revised: 12/19/2008] [Accepted: 12/21/2008] [Indexed: 10/21/2022]
Abstract
The increasing problem of antibiotic resistance among pathogenic bacteria requires development of new antimicrobial agents. The pivotal assets of antimicrobial peptides include potential for rapid bactericidal activity and low propensity for resistance. Four new antimicrobial hybrid peptides were designed based on peptides LFB15(W4,10), HP(2-20), and cecropin A according to the structure-activity relationship of the amphipathic and cationic antimicrobial peptides. Their structural parameters were assessed by bioinformatics tools, and then the two most promising hybrid candidates were synthesized. The hybrid peptide LH28 showed increased antibiotic activity (MIC50 = 1.56-3.13 µM) against given bacterial strains and did not cause obvious hemolysis of rabbit erythrocytes at 3.13 µM, a concentration with effective antimicrobial activity. The results demonstrate that evaluating the structural parameters could be useful for designing novel antimicrobial peptides.
640
Abstract
Molecular modeling techniques have made significant advances in recent years and are becoming essential components of many chemical, physical and biological studies. Here we present three techniques widely used in the simulation of biomolecular systems: structural and homology modeling, molecular dynamics and molecular docking. For each of these topics we present a brief discussion of the underlying scientific basis of the technique, some simple examples of how the method is commonly applied, and some discussion of the limitations and caveats of which the user should be aware. References for further reading as well as an extensive list of software resources are provided.
Affiliation(s)
- Akansha Saxena
- Biomedical Engineering, Washington University, St Louis, Missouri, USA
- Diana Wong
- Biomedical Engineering, Washington University, St Louis, Missouri, USA
- Karthikeyan Diraviyam
- Biomedical Engineering and Center for Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, Michigan, USA
- David Sept
- Biomedical Engineering and Center for Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, Michigan, USA
641
642
Wywial E, Dongre VN, Singh SM. Proteomic tools for the analysis of cytoskeleton proteins. Methods Mol Biol 2009; 586:375-388. [PMID: 19768443 DOI: 10.1007/978-1-60761-376-3_22] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/28/2023]
Abstract
Proteomic tools have become an essential part of the tool kit of the molecular biologist, and provide techniques for detecting homologous sequences, recognizing functional domains, modeling, and analyzing the three-dimensional structure for any given protein sequence. Although a wealth of structural and functional information is available for a large number of members of the various classes of cytoskeletal proteins, many more members remain uncharacterized. These computational tools that are freely and easily accessible to the scientific community provide an excellent starting point to predict the structural and functional properties of such partially or fully uncharacterized protein sequences, and can lead to elegantly designed experiments to probe the hypothesized function. This chapter discusses various proteomic analysis tools with a focus on protein structure and function predictions.
Affiliation(s)
- Ewa Wywial
- Department of Biology, Brooklyn College-CUNY, Brooklyn, NY, USA
643
Hübscher J, Lüthy L, Berger-Bächi B, Stutzmann Meier P. Phylogenetic distribution and membrane topology of the LytR-CpsA-Psr protein family. BMC Genomics 2008; 9:617. [PMID: 19099556 PMCID: PMC2632651 DOI: 10.1186/1471-2164-9-617] [Citation(s) in RCA: 58] [Impact Index Per Article: 3.6] [Received: 09/25/2008] [Accepted: 12/19/2008] [Indexed: 01/04/2023] Open
Abstract
BACKGROUND The bacterial cell wall is the target of many antibiotics and cell envelope constituents are critical to host-pathogen interactions. To combat resistance development and virulence, a detailed knowledge of the individual factors involved is essential. Members of the LytR-CpsA-Psr family of cell envelope-associated attenuators are relevant for beta-lactam resistance, biofilm formation, and stress tolerance, and they are suggested to play a role in cell wall maintenance. However, their precise function is still unknown. This study addresses the occurrence as well as sequence-based characteristics of the LytR-CpsA-Psr proteins. RESULTS A comprehensive list of LytR-CpsA-Psr proteins was established, and their phylogenetic distribution and clustering into subgroups was determined. LytR-CpsA-Psr proteins were present in all Gram-positive organisms, except for the cell wall-deficient Mollicutes and one strain of the Clostridiales. In contrast, the majority of Gram-negatives did not contain LytR-CpsA-Psr family members. Despite high sequence divergence, the LytR-CpsA-Psr domains of different subclusters shared a highly similar, predicted mixed alpha/beta structure, and conserved charged residues. PhoA fusion experiments, using MsrR of Staphylococcus aureus, confirmed membrane topology predictions and extracellular location of its LytR-CpsA-Psr domain. CONCLUSION The LytR-CpsA-Psr domain is unique to bacteria. The presence of diverse subgroups within the LytR-CpsA-Psr family might indicate functional differences, and could explain variations in phenotypes of respective mutants reported. The identified conserved structural elements and amino acids are likely to be important for the function of the domain and will help to guide future studies of the LytR-CpsA-Psr proteins.
Affiliation(s)
- Judith Hübscher
- Institute of Medical Microbiology, University of Zürich, Zürich, Switzerland.
644
Sweredoski MJ, Baldi P. COBEpro: a novel system for predicting continuous B-cell epitopes. Protein Eng Des Sel 2008; 22:113-20. [PMID: 19074155 DOI: 10.1093/protein/gzn075] [Citation(s) in RCA: 122] [Impact Index Per Article: 7.6] [Indexed: 11/12/2022] Open
Abstract
Accurate prediction of B-cell epitopes has remained a challenging task in computational immunology despite several decades of research. Only 10% of the known B-cell epitopes are estimated to be continuous, yet they are often the targets of predictors because a solved tertiary structure is not required and they are integral to the development of peptide vaccines and engineering therapeutic proteins. In this article, we present COBEpro, a novel two-step system for predicting continuous B-cell epitopes. COBEpro is capable of assigning epitopic propensity scores to both standalone peptide fragments and residues within an antigen sequence. COBEpro first uses a support vector machine to make predictions on short peptide fragments within the query antigen sequence and then calculates an epitopic propensity score for each residue based on the fragment predictions. Secondary structure and solvent accessibility information (either predicted or exact) can be incorporated to improve performance. COBEpro achieved a cross-validated area under the curve (AUC) of the receiver operating characteristic up to 0.829 on the fragment epitopic propensity scoring task and an AUC up to 0.628 on the residue epitopic propensity scoring task. COBEpro is incorporated into the SCRATCH prediction suite at http://scratch.proteomics.ics.uci.edu.
Affiliation(s)
- Michael J Sweredoski
- Department of Computer Science, University of California, Irvine, 92697-3435, USA
645
Mucha R, Bhide MR, Chakurkar EB, Novak M, Mikula I. Toll-like receptors TLR1, TLR2 and TLR4 gene mutations and natural resistance to Mycobacterium avium subsp. paratuberculosis infection in cattle. Vet Immunol Immunopathol 2008; 128:381-8. [PMID: 19131114 DOI: 10.1016/j.vetimm.2008.12.007] [Citation(s) in RCA: 93] [Impact Index Per Article: 5.8] [Received: 06/09/2008] [Revised: 11/21/2008] [Accepted: 12/01/2008] [Indexed: 11/15/2022]
Abstract
Toll-like receptors (TLRs) are a class of pattern recognition receptors belonging to the innate immune system. Mutations in the protein coding region of TLRs are associated with altered responsiveness to pathogen-associated molecular patterns (PAMPs). A search was performed for novel mutations in bovine TLR1, TLR2 and TLR4 genes associated with Mycobacterium avium subsp. paratuberculosis (MAP) infection. The work also focused on the assessment of linkage between well known mutations in TLR genes (TLR2: Arg677Trp, Pro681His and Arg753Gln; TLR4: Asp299Gly and Thr399Ile) and the susceptibility of cattle to MAP infection. Detection of MAP infection in the cattle population (n=711) was based on IS900 PCR, which revealed 22.50% (n=160) MAP positivity. Known mutations in TLR2 and TLR4 genes were not found in the cattle population. A novel mutation, Val220Met, was associated (odds ratio, OR = 3.459) with increased susceptibility to MAP infection. The Toll/interleukin-1 receptor (TIR) domain of TLR2 was screened for the presence of mutations, wherein a novel Ile680Val mutation was linked with MAP infection. In silico analysis of the bovine TLR4 ectodomain (ECD) revealed the polymorphic nature of the central ECD and irregularities in the central LRR motifs. LRR11 of TLR4 showed five missense mutations possibly linked with increased susceptibility to MAP infection. The most critical position that may alter the pathogen recognition of the TLR molecule was the 4th residue downstream of the LRR domain. Two such missense mutations in TLR4 (Asp299Asn downstream of LRR11, and Gly389Ser downstream of LRR15) were associated with MAP infection. Briefly, the work describes novel mutations in the bovine TLRs and presents their association with MAP infection.
Affiliation(s)
- R Mucha
- Laboratory of Biomedical Microbiology and Immunology, University of Veterinary Medicine, Kosice, Slovakia
646
Randall A, Baldi P. SELECTpro: effective protein model selection using a structure-based energy function resistant to BLUNDERs. BMC STRUCTURAL BIOLOGY 2008; 8:52. [PMID: 19055744 PMCID: PMC2667183 DOI: 10.1186/1472-6807-8-52] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.3] [Received: 06/26/2008] [Accepted: 12/03/2008] [Indexed: 11/10/2022]
Abstract
Background Protein tertiary structure prediction is a fundamental problem in computational biology and identifying the most native-like model from a set of predicted models is a key sub-problem. Consensus methods work well when the redundant models in the set are the most native-like, but fail when the most native-like model is unique. In contrast, structure-based methods score models independently and can be applied to model sets of any size and redundancy level. Additionally, structure-based methods have a variety of important applications including analogous fold recognition, refinement of sequence-structure alignments, and de novo prediction. The purpose of this work was to develop a structure-based model selection method based on predicted structural features that could be applied successfully to any set of models. Results Here we introduce SELECTpro, a novel structure-based model selection method derived from an energy function comprising physical, statistical, and predicted structural terms. Novel and unique energy terms include predicted secondary structure, predicted solvent accessibility, predicted contact map, β-strand pairing, and side-chain hydrogen bonding. SELECTpro participated in the new model quality assessment (QA) category in CASP7, submitting predictions for all 95 targets and achieved top results. The average difference in GDT-TS between models ranked first by SELECTpro and the most native-like model was 5.07. This GDT-TS difference was less than 1% of the GDT-TS of the most native-like model for 18 targets, and less than 10% for 66 targets. SELECTpro also ranked the single most native-like first for 15 targets, in the top five for 39 targets, and in the top ten for 53 targets, more often than any other method. Because the ranking metric is skewed by model redundancy and ignores poor models with a better ranking than the most native-like model, the BLUNDER metric is introduced to overcome these limitations. 
SELECTpro is also evaluated on a recent benchmark set of 16 small proteins with large decoy sets of 12500 to 20000 models for each protein, where it outperforms the benchmarked method (I-TASSER). Conclusion SELECTpro is an effective model selection method that scores models independently and is appropriate for use on any model set. SELECTpro is available for download as a stand-alone application at: . SELECTpro is also available as a public server at the same site.
Affiliation(s)
- Arlo Randall
- School of Information and Computer Sciences, University of California, Irvine, CA 92697, USA.
647
Chu FH, Wang SY, Lee LC, Shaw JF. Identification and characterization of a lipase gene from Antrodia cinnamomea. Mycol Res 2008; 112:1421-7. [DOI: 10.1016/j.mycres.2008.06.006] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.5] [Received: 03/23/2007] [Revised: 05/12/2008] [Accepted: 06/11/2008] [Indexed: 12/01/2022]
648
Schaeffer C, Santambrogio S, Perucca S, Casari G, Rampoldi L. Analysis of uromodulin polymerization provides new insights into the mechanisms regulating ZP domain-mediated protein assembly. Mol Biol Cell 2008; 20:589-99. [PMID: 19005207 DOI: 10.1091/mbc.e08-08-0876] [Citation(s) in RCA: 72] [Impact Index Per Article: 4.5] [Indexed: 01/26/2023] Open
Abstract
Uromodulin is the most abundant protein secreted in urine, in which it is found as a high-molecular-weight polymer. Polymerization occurs via its zona pellucida (ZP) domain, a conserved module shared by many extracellular eukaryotic proteins that are able to assemble into matrices. In this work, we identified two motifs in uromodulin, mapping in the linker region of the ZP domain and in between protein cleavage and glycosylphosphatidylinositol (GPI)-anchoring sites, which regulate its polymerization. Indeed, mutations in either module led to premature intracellular polymerization of a soluble uromodulin isoform, demonstrating the inhibitory role of these motifs for ZP domain-mediated protein assembly. Proteolytic cleavage separating the external motif from the mature monomer is necessary to release the inhibitory function and allow protein polymerization. Moreover, we report absent or abnormal assembly into filaments of GPI-anchored uromodulin mutated in either the internal or the external motif. This effect is due to altered processing on the plasma membrane, demonstrating that the presence of the two modules has not only an inhibitory function but also can positively regulate protein polymerization. Our data expand previous knowledge on the control of ZP domain function and suggest a common mechanism regulating polymerization of ZP domain proteins.
Affiliation(s)
- Céline Schaeffer
- Dulbecco Telethon Institute, Molecular Genetics of Renal Disorders, San Raffaele Scientific Institute, 20132 Milan, Italy
|
649
|
Wu Y, Dousis AD, Chen M, Li J, Ma J. OPUS-Dom: applying the folding-based method VECFOLD to determine protein domain boundaries. J Mol Biol 2008; 385:1314-29. [PMID: 19026662 DOI: 10.1016/j.jmb.2008.10.093] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.8] [Received: 08/27/2008] [Revised: 10/29/2008] [Accepted: 10/31/2008] [Indexed: 10/21/2022]
Abstract
In this article, we present a de novo method for predicting protein domain boundaries, called OPUS-Dom. The core of the method is a novel coarse-grained folding method, VECFOLD, which constructs low-resolution structural models from a target sequence by folding a chain of vectors representing the predicted secondary-structure elements. OPUS-Dom generates a large ensemble of folded structure decoys by VECFOLD and labels the domain boundaries of each decoy by a domain parsing algorithm. Consensus domain boundaries are then derived from the statistical distribution of the putative boundaries and three empirical sequence-based domain profiles. OPUS-Dom generally outperformed several state-of-the-art domain prediction algorithms over various benchmark protein sets. Even though each VECFOLD-generated structure contains large errors, collectively these structures provide a more robust delineation of domain boundaries. The success of OPUS-Dom suggests that the arrangement of protein domains is more a consequence of limited coordination patterns per domain arising from tertiary packing of secondary-structure segments, rather than sequence-specific constraints.
Affiliation(s)
- Yinghao Wu
- Department of Bioengineering, Rice University, Houston, TX 77005, USA
|
650
|
Alamo L, Wriggers W, Pinto A, Bártoli F, Salazar L, Zhao FQ, Craig R, Padrón R. Three-dimensional reconstruction of tarantula myosin filaments suggests how phosphorylation may regulate myosin activity. J Mol Biol 2008; 384:780-97. [PMID: 18951904 DOI: 10.1016/j.jmb.2008.10.013] [Citation(s) in RCA: 117] [Impact Index Per Article: 7.3] [Received: 06/26/2008] [Revised: 09/27/2008] [Accepted: 10/02/2008] [Indexed: 11/19/2022]
Abstract
Muscle contraction involves the interaction of the myosin heads of the thick filaments with actin subunits of the thin filaments. Relaxation occurs when this interaction is blocked by molecular switches on these filaments. In many muscles, myosin-linked regulation involves phosphorylation of the myosin regulatory light chains (RLCs). Electron microscopy of vertebrate smooth muscle myosin molecules (regulated by phosphorylation) has provided insight into the relaxed structure, revealing that myosin is switched off by intramolecular interactions between its two heads, the free head and the blocked head. Three-dimensional reconstruction of frozen-hydrated specimens revealed that this asymmetric head interaction is also present in native thick filaments of tarantula striated muscle. Our goal in this study was to elucidate the structural features of the tarantula filament involved in phosphorylation-based regulation. A new reconstruction revealed intra- and intermolecular myosin interactions in addition to those seen previously. To help interpret the interactions, we sequenced the tarantula RLC and fitted an atomic model of the myosin head that included the predicted RLC atomic structure and an S2 (subfragment 2) crystal structure to the reconstruction. The fitting suggests one intramolecular interaction, between the cardiomyopathy loop of the free head and its own S2, and two intermolecular interactions, between the cardiac loop of the free head and the essential light chain of the blocked head and between the Leu305-Gln327 interaction loop of the free head and the N-terminal fragment of the RLC of the blocked head. These interactions, added to those previously described, would help switch off the thick filament. Molecular dynamics simulations suggest how phosphorylation could increase the helical content of the RLC N-terminus, weakening these interactions, thus releasing both heads and activating the thick filament.
Affiliation(s)
- Lorenzo Alamo
- Departamento de Biología Estructural, Instituto Venezolano de Investigaciones Científicas, Caracas, Venezuela
|