51
|
Hazai E, Bikádi Z. Homology modeling of breast cancer resistance protein (ABCG2). J Struct Biol 2008; 162:63-74. [DOI: 10.1016/j.jsb.2007.12.001] [Citation(s) in RCA: 70] [Impact Index Per Article: 4.4] [Received: 09/14/2007] [Revised: 11/16/2007] [Accepted: 12/06/2007] [Indexed: 01/31/2023]
|
52
|
Faure G, Bornot A, de Brevern AG. Protein contacts, inter-residue interactions and side-chain modelling. Biochimie 2008; 90:626-39. [DOI: 10.1016/j.biochi.2007.11.007] [Citation(s) in RCA: 40] [Impact Index Per Article: 2.5] [Received: 06/14/2007] [Accepted: 11/22/2007] [Indexed: 10/22/2022]
|
53
|
Sardar PS, Samanta S, Maity SS, Dasgupta S, Ghosh S. Energy Transfer Photophysics from Serum Albumins to Sequestered 3-Hydroxy-2-Naphthoic Acid, an Excited State Intramolecular Proton-Transfer Probe. J Phys Chem B 2008; 112:3451-61. [DOI: 10.1021/jp074598+] [Citation(s) in RCA: 47] [Impact Index Per Article: 2.9] [Indexed: 11/29/2022]
|
54
|
Shackelford G, Karplus K. Contact prediction using mutual information and neural nets. Proteins 2008; 69 Suppl 8:159-64. [PMID: 17932918 DOI: 10.1002/prot.21791] [Citation(s) in RCA: 81] [Impact Index Per Article: 5.1] [Indexed: 11/06/2022]
Abstract
Prediction of protein structures continues to be a difficult problem, particularly when there are no solved structures for homologous proteins to use as templates. Local structure prediction (secondary structure and burial) is fairly reliable, but does not provide enough information to produce complete three-dimensional structures. Residue-residue contact prediction, though still not highly reliable, may provide a useful guide for assembling local structure prediction into full tertiary prediction. We develop a neural network which is applied to pairs of residue positions and outputs a probability of contact between the positions. One of the neural net inputs is a novel statistic for detecting correlated mutations: the statistical significance of the mutual information between the corresponding columns of a multiple sequence alignment. This statistic, combined with a second statistic based on the propensity of two amino acid types being in contact, results in a simple neural network that is a good predictor of contacts. Adding more features from amino-acid distributions and local structure predictions, the final neural network predicts contacts better than other submitted contact predictions at CASP7, including contact predictions derived from fragment-based tertiary models on free-modeling domains. It is still not known if contact predictions can improve tertiary models on free-modeling domains. Available at http://www.soe.ucsc.edu/research/compbio/SAM_T06/T06-query.html.
Affiliation(s)
- George Shackelford
- Department of Biomolecular Engineering, University of California, Santa Cruz, California 95064, USA
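The abstract of entry 54 describes a correlated-mutation input built from the mutual information between two columns of a multiple sequence alignment. The following is a minimal sketch of the plain mutual information only, assuming each column is given as a string with one residue per aligned sequence. It does not reproduce the authors' statistical-significance statistic or their neural network, and the function name `column_mutual_information` is hypothetical.

```python
import math
from collections import Counter


def column_mutual_information(col_a, col_b):
    """Mutual information (in bits) between two alignment columns.

    col_a and col_b hold one residue per aligned sequence, in the same
    sequence order. This is plain MI from observed frequencies, not the
    significance-based statistic described in the abstract.
    """
    if len(col_a) != len(col_b):
        raise ValueError("columns must cover the same sequences")
    n = len(col_a)
    if n == 0:
        raise ValueError("columns must not be empty")

    count_a = Counter(col_a)               # marginal counts, column A
    count_b = Counter(col_b)               # marginal counts, column B
    count_ab = Counter(zip(col_a, col_b))  # joint counts of residue pairs

    mi = 0.0
    for (a, b), c in count_ab.items():
        p_ab = c / n
        # p_ab / (p_a * p_b) simplifies to c * n / (count_a * count_b)
        mi += p_ab * math.log2(c * n / (count_a[a] * count_b[b]))
    return mi


# Illustrative columns (made up for this sketch, not taken from the paper).
covarying = column_mutual_information("AAVV", "LLII")
independent = column_mutual_information("AAVV", "LILI")
```

In this toy input the first pair of columns varies together and the second pair does not, so the first value is larger. On real alignments, raw MI is affected by sampling and phylogenetic bias, which is one reason a significance measure such as the one the abstract mentions can be preferable.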
|
55
|
Yao XQ, Zhu H, She ZS. A dynamic Bayesian network approach to protein secondary structure prediction. BMC Bioinformatics 2008; 9:49. [PMID: 18218144 PMCID: PMC2266706 DOI: 10.1186/1471-2105-9-49] [Citation(s) in RCA: 46] [Impact Index Per Article: 2.9] [Received: 09/05/2007] [Accepted: 01/25/2008] [Indexed: 11/19/2022] Open
Abstract
Background: Protein secondary structure prediction method based on probabilistic models such as hidden Markov model (HMM) appeals to many because it provides meaningful information relevant to sequence-structure relationship. However, at present, the prediction accuracy of pure HMM-type methods is much lower than that of machine learning-based methods such as neural networks (NN) or support vector machines (SVM). Results: In this paper, we report a new method of probabilistic nature for protein secondary structure prediction, based on dynamic Bayesian networks (DBN). The new method models the PSI-BLAST profile of a protein sequence using a multivariate Gaussian distribution, and simultaneously takes into account the dependency between the profile and secondary structure and the dependency between profiles of neighboring residues. In addition, a segment length distribution is introduced for each secondary structure state. Tests show that the DBN method has made a significant improvement in the accuracy compared to other pure HMM-type methods. Further improvement is achieved by combining the DBN with an NN, a method called DBNN, which shows better Q3 accuracy than many popular methods and is competitive to the current state-of-the-arts. The most interesting feature of DBN/DBNN is that a significant improvement in the prediction accuracy is achieved when combined with other methods by a simple consensus. Conclusion: The DBN method using a Gaussian distribution for the PSI-BLAST profile and a high-ordered dependency between profiles of neighboring residues produces significantly better prediction accuracy than other HMM-type probabilistic methods. Owing to their different nature, the DBN and NN combine to form a more accurate method DBNN. Future improvement may be achieved by combining DBNN with a method of SVM type.
Affiliation(s)
- Xin-Qiu Yao
- State Key Laboratory for Turbulence and Complex Systems and Department of Biomedical Engineering, Peking University, Beijing 100871, China.
|
56
|
Battey JND, Kopp J, Bordoli L, Read RJ, Clarke ND, Schwede T. Automated server predictions in CASP7. Proteins 2008; 69 Suppl 8:68-82. [PMID: 17894354 DOI: 10.1002/prot.21761] [Citation(s) in RCA: 94] [Impact Index Per Article: 5.9] [Indexed: 11/12/2022]
Abstract
With each round of CASP (Critical Assessment of Techniques for Protein Structure Prediction), automated prediction servers have played an increasingly important role. Today, most protein structure prediction approaches in some way depend on automated methods for fold recognition or model building. The accuracy of server predictions has significantly increased over the last years, and, in CASP7, we observed a continuation of this trend. In the template-based modeling category, the best prediction server was ranked third overall, i.e. it outperformed all but two of the human participating groups. This server also ranked among the very best predictors in the free modeling category as well, being clearly beaten by only one human group. In the high accuracy (HA) subset of TBM, two of the top five groups were servers. This article summarizes the contribution of automated structure prediction servers in the CASP7 experiment, with emphasis on 3D structure prediction, as well as information on their prediction scope and public availability.
|
57
|
|
58
|
Investigating the binding of curcumin derivatives to bovine serum albumin. Biophys Chem 2007; 132:81-8. [PMID: 18037556 DOI: 10.1016/j.bpc.2007.10.007] [Citation(s) in RCA: 49] [Impact Index Per Article: 2.9] [Received: 09/10/2007] [Revised: 10/14/2007] [Accepted: 10/14/2007] [Indexed: 11/22/2022]
Abstract
The interaction of bovine serum albumin (BSA) with isoxazolcurcumin (IOC) and diacetylcurcumin (DAC) has been investigated. Binding constants obtained were found to be in the 10^5 M^-1 range. Minor conformational changes of BSA were observed from circular dichroism (CD) and Fourier transformed infrared (FT-IR) studies on binding. Based on Förster's theory of non-radiation energy transfer, the average binding distance, r, between the donor (BSA) and acceptors IOC and DAC was found to be 3.79 and 4.27 nm respectively. Molecular docking of isoxazolcurcumin and diacetylcurcumin with bovine serum albumin indicated that they docked close to Trp 213, which is within the hydrophobic subdomain.
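The abstract of entry 58 derives donor-acceptor distances from Förster's theory. As a rough illustration of the standard relation behind such estimates, E = R0^6 / (R0^6 + r^6), the sketch below converts between transfer efficiency and distance. The abstract reports neither the Förster radius R0 nor the efficiencies, so the R0 value used here is a placeholder assumption, and the function names are hypothetical.

```python
def fret_efficiency(r_nm, r0_nm):
    """Transfer efficiency E = R0^6 / (R0^6 + r^6) for distance r (nm)."""
    if r_nm < 0 or r0_nm <= 0:
        raise ValueError("r must be non-negative and R0 must be positive")
    return r0_nm ** 6 / (r0_nm ** 6 + r_nm ** 6)


def fret_distance(efficiency, r0_nm):
    """Invert the relation above: r = R0 * ((1 - E) / E) ** (1/6)."""
    if not 0.0 < efficiency < 1.0:
        raise ValueError("efficiency must lie strictly between 0 and 1")
    return r0_nm * ((1.0 - efficiency) / efficiency) ** (1.0 / 6.0)


# ASSUMPTION: R0 = 3.0 nm is a placeholder, not a value from the paper.
assumed_r0_nm = 3.0
efficiency_at_reported_r = fret_efficiency(3.79, assumed_r0_nm)
```

Because of the sixth-power dependence, the efficiency falls off steeply once r exceeds R0, which is why the method is most sensitive for distances near the Förster radius.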
|
59
|
Becker E, Cotillard A, Meyer V, Madaoui H, Guérois R. HMM-Kalign: a tool for generating sub-optimal HMM alignments. Bioinformatics 2007; 23:3095-7. [PMID: 17921492 DOI: 10.1093/bioinformatics/btm492] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.3] [Indexed: 11/12/2022]
Abstract
Recent development of strategies using multiple sequence alignments (MSA) or profiles to detect remote homologies between proteins has led to a significant increase in the number of proteins whose structures can be generated by comparative modeling methods. However, prediction of the optimal alignment between these highly divergent homologous proteins remains a difficult issue. We present a tool based on a generalized Viterbi algorithm that generates optimal and sub-optimal alignments between a sequence and a Hidden Markov Model. The tool is implemented as a new function within the HMMER package called hmmkalign.
Affiliation(s)
- Emmanuelle Becker
- CEA, iBiTecS, URA 2096, SBSM, Laboratoire de Biologie Structurale et Radiobiologie, Gif sur Yvette, F-91191 France
|
60
|
Vries JK, Liu X, Bahar I. The relationship between n-gram patterns and protein secondary structure. Proteins 2007; 68:830-8. [PMID: 17523186 DOI: 10.1002/prot.21480] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.8] [Indexed: 11/06/2022]
Abstract
An n-gram pattern (NP{n,m}) in a protein sequence is a set of n residues and m wildcards in a window of size n+m. Each window of n+m amino acids is associated with a collection of NP{n,m} patterns based on the combinatorics of n+m objects taken m at a time. NP{n,m} patterns that are shared between sequences reflect evolutionary relationships. Recently the authors developed an alignment-independent protein classification algorithm based on shared NP{4,2} patterns that compared favorably to PSI-BLAST. Theoretically, NP{4,2} patterns should also reflect secondary structure propensity since they contain all possible n-grams for 1 ≤ n ≤ 4 and a window of 6 residues is wide enough to capture periodicities in the 2 ≤ n ≤ 5 range. This sparked interest in differentiating the information content in NP{4,2} patterns related to evolution from the content related to local propensity. The probability of alpha-, beta-, and coil components was determined for every NP{4,2} pattern over all the chains in the Protein Data Bank (PDB). An algorithm exclusively based on the Z-values of these distributions was developed, which accurately predicted 71-76% of alpha-helical segments and 62-67% of beta-sheets in rigorous jackknife tests. This provided evidence for the strong correlation between NP{4,2} patterns and secondary structure. By grouping PDB chains into subsets with increasing levels of sequence identity, it was also possible to separate the evolutionary and local propensity contributions to the classification process. The results showed that information derived from evolutionary relationships was more important for beta-sheet prediction than alpha-helix prediction.
Affiliation(s)
- John K Vries
- Department of Computational Biology, School of Medicine, University of Pittsburgh, Pittsburgh, Pennsylvania 15213, USA.
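The abstract of entry 60 defines an NP{n,m} pattern as n residues and m wildcards in a window of n+m positions, with one pattern per way of choosing the m wildcard positions. The sketch below enumerates those patterns for each window of a sequence. It is an illustration of the combinatorics only, assuming wildcards may fall at any position in the window, including the ends. The wildcard symbol `.` and the function names are hypothetical choices, and the Z-value scoring from the paper is not reproduced.

```python
from itertools import combinations


def ngram_patterns(window, m, wildcard="."):
    """All patterns obtained by masking m positions of the window."""
    size = len(window)
    if not 0 <= m <= size:
        raise ValueError("m must be between 0 and the window length")
    patterns = []
    # One pattern per choice of m wildcard positions out of n + m.
    for wild_positions in combinations(range(size), m):
        wild = set(wild_positions)
        patterns.append(
            "".join(wildcard if i in wild else residue
                    for i, residue in enumerate(window))
        )
    return patterns


def sequence_patterns(sequence, n, m):
    """Yield (start index, patterns) for every window of size n + m."""
    size = n + m
    for start in range(len(sequence) - size + 1):
        yield start, ngram_patterns(sequence[start:start + size], m)


# Illustrative six-residue window (made up, not taken from the paper).
np42 = ngram_patterns("ACDEFG", 2)
```

For NP{4,2} each six-residue window yields C(6,2) = 15 patterns, so counting patterns per window is a quick sanity check on an implementation.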
|
61
|
Akhavan A, Crivelli SN, Singh M, Lingappa VR, Muschler JL. SEA domain proteolysis determines the functional composition of dystroglycan. FASEB J 2007; 22:612-21. [PMID: 17905726 DOI: 10.1096/fj.07-8354com] [Citation(s) in RCA: 37] [Impact Index Per Article: 2.2] [Indexed: 11/11/2022]
Abstract
Post-translational modifications of the extracellular matrix receptor dystroglycan (DG) determine its functional state, and defects in these modifications are linked to muscular dystrophies and cancers. A prominent feature of DG biosynthesis is a precursor cleavage that segregates the ligand-binding and transmembrane domains into the noncovalently attached alpha- and beta-subunits. We investigate here the structural determinants and functional significance of this cleavage. We show that cleavage of DG elicits a conspicuous change in its ligand-binding activity. Mutations that obstruct this cleavage result in increased capacity to bind laminin, in part, due to enhanced glycosylation of alpha-DG. Reconstitution of DG cleavage in a cell-free expression system demonstrates that cleavage takes place in the endoplasmic reticulum, providing a suitable regulatory point for later processing events. Sequence and mutational analyses reveal that the cleavage occurs within a full SEA (sea urchin, enterokinase, agrin) module with traits matching those ascribed to autoproteolysis. Thus, cleavage of DG constitutes a control point for the modulation of its ligand-binding properties, with therapeutic implications for muscular dystrophies. We provide a structural model for the cleavage domain that is validated by experimental analysis and discuss this cleavage in the context of mucin protein and SEA domain evolution.
Affiliation(s)
- Armin Akhavan
- California Pacific Medical Center Research Institute, 475 Brannan St., Ste. 220, San Francisco, CA 94107, USA
|
62
|
Brinkman D, Burnell J. Identification, cloning and sequencing of two major venom proteins from the box jellyfish, Chironex fleckeri. Toxicon 2007; 50:850-60. [PMID: 17688901 DOI: 10.1016/j.toxicon.2007.06.016] [Citation(s) in RCA: 62] [Impact Index Per Article: 3.6] [Received: 05/31/2007] [Revised: 06/21/2007] [Accepted: 06/21/2007] [Indexed: 10/23/2022]
Abstract
Two of the most abundant proteins found in the nematocysts of the box jellyfish Chironex fleckeri have been identified as C. fleckeri toxin-1 (CfTX-1) and toxin-2 (CfTX-2). The molecular masses of CfTX-1 and CfTX-2, as determined by SDS-PAGE, are approximately 43 and 45 kDa, respectively, and both proteins are strongly antigenic to commercially available box jellyfish antivenom and rabbit polyclonal antibodies raised against C. fleckeri nematocyst extracts. The amino acid sequences of mature CfTX-1 and CfTX-2 (436 and 445 residues, respectively) share significant homology with three known proteins: CqTX-A from Chiropsalmus quadrigatus, CrTXs from Carybdea rastoni and CaTX-A from Carybdea alata, all of which are lethal, haemolytic box jellyfish toxins. Multiple sequence alignment of the five jellyfish proteins has identified several short, but highly conserved regions of amino acids that coincide with a predicted transmembrane spanning region, referred to as TSR1, which may be involved in a pore-forming mechanism of action. Furthermore, remote protein homology predictions for CfTX-2 and CaTX-A suggest weak structural similarities to pore-forming insecticidal delta-endotoxins Cry1Aa, Cry3Bb and Cry3A.
Affiliation(s)
- Diane Brinkman
- Department of Biochemistry and Molecular Biology, School of Pharmacy and Molecular Sciences, James Cook University, Townsville, Qld 4811, Australia.
|
63
|
Fong JCN, Yildiz FH. The rbmBCDEF gene cluster modulates development of rugose colony morphology and biofilm formation in Vibrio cholerae. J Bacteriol 2007; 189:2319-30. [PMID: 17220218 PMCID: PMC1899372 DOI: 10.1128/jb.01569-06] [Citation(s) in RCA: 122] [Impact Index Per Article: 7.2] [Indexed: 11/20/2022] Open
Abstract
Vibrio cholerae, the causative agent of cholera, can undergo phenotypic variation generating rugose and smooth variants. The rugose variant forms corrugated colonies and well-developed biofilms and exhibits increased levels of resistance to several environmental stresses. Many of these phenotypes are mediated in part by increased expression of the vps genes, which are organized into vps-I and vps-II coding regions, separated by an intergenic region. In this study, we generated in-frame deletions of the five genes located in the vps intergenic region, termed rbmB to -F (rugosity and biofilm structure modulators B to F) in the rugose genetic background, and characterized the mutants for rugose colony development and biofilm formation. Deletion of rbmB, which encodes a protein with low sequence similarity to polysaccharide hydrolases, resulted in an increase in colony corrugation and accumulation of exopolysaccharides relative to the rugose variant. RbmC and its homolog Bap1 are predicted to encode proteins with carbohydrate-binding domains. The colonies of the rbmC bap1 double deletion mutant and bap1 single deletion mutant exhibited a decrease in colony corrugation. Furthermore, the rbmC bap1 double deletion mutant was unable to form biofilms at the air-liquid interface after 2 days, while the biofilms formed on solid surfaces detached readily. Although the colony morphology of rbmDEF mutants was similar to that of the rugose variant, their biofilm structure and cell aggregation phenotypes were different than those of the rugose variant. Taken together, these results indicate that vps intergenic region genes encode proteins that are involved in biofilm matrix production and maintenance of biofilm structure and stability.
Affiliation(s)
- Jiunn C N Fong
- Department of Environmental Toxicology, University of California, Santa Cruz, Santa Cruz, CA 95064, USA
|
64
|
|
65
|
Abstract
Two RNases, Dicer and Argonaute, are at the heart of the RNA interference (RNAi) molecular machinery responsible for gene silencing. Both RNases contain multiple domains, most of which have been characterized or have functions that can be predicted based on sequence comparisons. However, Dicers of higher eukaryotes contain the domain known as DUF283 which at present has no assigned role. Using sensitive profile-profile comparisons, we detected a divergent double-stranded RNA-binding domain coinciding with the DUF283 of Dicer. This finding has potential implications regarding the mechanistic role of Dicer in RNAi.
Affiliation(s)
- Mensur Dlakić
- Department of Microbiology, Montana State University Bozeman, MT 59717, USA.
|
66
|
Maaty WSA, Ortmann AC, Dlakić M, Schulstad K, Hilmer JK, Liepold L, Weidenheft B, Khayat R, Douglas T, Young MJ, Bothner B. Characterization of the archaeal thermophile Sulfolobus turreted icosahedral virus validates an evolutionary link among double-stranded DNA viruses from all domains of life. J Virol 2006; 80:7625-35. [PMID: 16840341 PMCID: PMC1563717 DOI: 10.1128/jvi.00522-06] [Citation(s) in RCA: 77] [Impact Index Per Article: 4.3] [Indexed: 11/20/2022] Open
Abstract
Icosahedral nontailed double-stranded DNA (dsDNA) viruses are present in all three domains of life, leading to speculation about a common viral ancestor that predates the divergence of Eukarya, Bacteria, and Archaea. This suggestion is supported by the shared general architecture of this group of viruses and the common fold of their major capsid protein. However, limited information on the diversity and replication of archaeal viruses, in general, has hampered further analysis. Sulfolobus turreted icosahedral virus (STIV), isolated from a hot spring in Yellowstone National Park, was the first icosahedral virus with an archaeal host to be described. Here we present a detailed characterization of the components forming this unusual virus. Using a proteomics-based approach, we identified nine viral and two host proteins from purified STIV particles. Interestingly, one of the viral proteins originates from a reading frame lacking a consensus start site. The major capsid protein (B345) was found to be glycosylated, implying a strong similarity to proteins from other dsDNA viruses. Sequence analysis and structural prediction of virion-associated viral proteins suggest that they may have roles in DNA packaging, penton formation, and protein-protein interaction. The presence of an internal lipid layer containing acidic tetraether lipids has also been confirmed. The previously presented structural models in conjunction with the protein, lipid, and carbohydrate information reported here reveal that STIV is strikingly similar to viruses associated with the Bacteria and Eukarya domains of life, further strengthening the hypothesis for a common ancestor of this group of dsDNA viruses from all domains of life.
Affiliation(s)
- Walid S A Maaty
- Department of Chemistry and Biochemistry, Montana State University, Bozeman, MT 59715, USA
|
67
|
Dunbrack RL. Sequence comparison and protein structure prediction. Curr Opin Struct Biol 2006; 16:374-84. [PMID: 16713709 DOI: 10.1016/j.sbi.2006.05.006] [Citation(s) in RCA: 119] [Impact Index Per Article: 6.6] [Received: 02/23/2006] [Revised: 03/22/2006] [Accepted: 05/08/2006] [Indexed: 10/24/2022]
Abstract
Sequence comparison is a major step in the prediction of protein structure from existing templates in the Protein Data Bank. The identification of potentially remote homologues to be used as templates for modeling target sequences of unknown structure and their accurate alignment remain challenges, despite many years of study. The most recent advances have been in combining as many sources of information as possible--including amino acid variation in the form of profiles or hidden Markov models for both the target and template families, known and predicted secondary structures of the template and target, respectively, the combination of structure alignment for distant homologues and sequence alignment for close homologues to build better profiles, and the anchoring of certain regions of the alignment based on existing biological data. Newer technologies have been applied to the problem, including the use of support vector machines to tackle the fold classification problem for a target sequence and the alignment of hidden Markov models. Finally, using the consensus of many fold recognition methods, whether based on profile-profile alignments, threading or other approaches, continues to be one of the most successful strategies for both recognition and alignment of remote homologues. Although there is still room for improvement in identification and alignment methods, additional progress may come from model building and refinement methods that can compensate for large structural changes between remotely related targets and templates, as well as for regions of misalignment.
Affiliation(s)
- Roland L Dunbrack
- Institute for Cancer Research, Fox Chase Cancer Center, 333 Cottman Avenue, Philadelphia, PA 19111, USA.
|
68
|
Fong JCN, Karplus K, Schoolnik GK, Yildiz FH. Identification and characterization of RbmA, a novel protein required for the development of rugose colony morphology and biofilm structure in Vibrio cholerae. J Bacteriol 2006; 188:1049-59. [PMID: 16428409 PMCID: PMC1347326 DOI: 10.1128/jb.188.3.1049-1059.2006] [Citation(s) in RCA: 109] [Impact Index Per Article: 6.1] [Indexed: 12/31/2022] Open
Abstract
Phase variation between smooth and rugose colony variants of Vibrio cholerae is predicted to be important for the pathogen's survival in its natural aquatic ecosystems. The rugose variant forms corrugated colonies, exhibits increased levels of resistance to osmotic, acid, and oxidative stresses, and has an enhanced capacity to form biofilms. Many of these phenotypes are mediated in part by increased production of an exopolysaccharide termed VPS. In this study, we compared total protein profiles of the smooth and rugose variants using two-dimensional gel electrophoresis and identified one protein that is present at a higher level in the rugose variant. A mutation in the gene encoding this protein, which does not have any known homologs in the protein databases, causes cells to form biofilms that are more fragile and sensitive to sodium dodecyl sulfate than wild-type biofilms. The results indicate that the gene, termed rbmA (rugosity and biofilm structure modulator A), is required for rugose colony formation and biofilm structure integrity in V. cholerae. Transcription of rbmA is positively regulated by the response regulator VpsR but not VpsT.
Affiliation(s)
- Jiunn C N Fong
- Department of Environmental Toxicology, University of California, Santa Cruz, Santa Cruz, CA 95064, USA
|
69
|
Dayalan S, Gooneratne ND, Bevinakoppa S, Schroder H. Dihedral angle and secondary structure database of short amino acid fragments. Bioinformation 2006; 1:78-80. [PMID: 17597859 PMCID: PMC1891663 DOI: 10.6026/97320630001078] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.6] [Received: 11/25/2005] [Revised: 12/18/2005] [Accepted: 12/23/2005] [Indexed: 11/23/2022] Open
Abstract
Dihedral angles of amino acids are of considerable importance in protein tertiary structure prediction as they define the backbone of a protein and hence almost define the protein's entire conformation. Most ab initio protein structure prediction methods predict the secondary structure of a protein before predicting the tertiary structure because three-dimensional fold consists of repeating units of secondary structures. Hence, both dihedral angles and secondary structures are important in tertiary structure prediction of proteins. Here we describe a database called DASSD (Dihedral Angle and Secondary Structure Database of Short Amino acid Fragments) that contains dihedral angle values and secondary structure details of short amino acid fragments of lengths 1, 3 and 5. Information stored in this database was extracted from a set of 5,227 non-redundant high resolution (less than 2-angstroms) protein structures. In total, DASSD stores details for about 733,000 fragments. This database finds application in the development of ab initio protein structure prediction methods using fragment libraries and fragment assembly techniques. It is also useful in protein secondary structure prediction. Availability: DASSD can be accessed and downloaded from http://www.cs.rmit.edu.au/dassd/
Affiliation(s)
- Saravanan Dayalan
- School of Computer Science and Information Technology, RMIT University, GPO Box 2474V, Melbourne 3001, Australia.
|