26
|
Li ZR, Lin HH, Han LY, Jiang L, Chen X, Chen YZ. PROFEAT: a web server for computing structural and physicochemical features of proteins and peptides from amino acid sequence. Nucleic Acids Res 2006; 34:W32-7. [PMID: 16845018 PMCID: PMC1538821 DOI: 10.1093/nar/gkl305] [Citation(s) in RCA: 203] [Impact Index Per Article: 11.3] [Received: 12/23/2005] [Revised: 01/17/2006] [Accepted: 04/10/2006] [Indexed: 02/01/2023] Open
Abstract
Sequence-derived structural and physicochemical features have frequently been used in the development of statistical learning models for predicting proteins and peptides of different structural, functional and interaction profiles. PROFEAT (Protein Features) is a web server for computing commonly used structural and physicochemical features of proteins and peptides from amino acid sequence. It computes six feature groups composed of ten features that include 51 descriptors and 1447 descriptor values. The computed features include amino acid composition, dipeptide composition, normalized Moreau-Broto autocorrelation, Moran autocorrelation, Geary autocorrelation, sequence-order-coupling number, quasi-sequence-order descriptors and the composition, transition and distribution of various structural and physicochemical properties. In addition, it can compute the above autocorrelation descriptors based on user-defined properties. Our computational algorithms were extensively tested and the computed protein features have been used in a number of published works for predicting proteins of functional classes, protein-protein interactions and MHC-binding peptides. PROFEAT is accessible at http://jing.cz3.nus.edu.sg/cgi-bin/prof/prof.cgi.
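As an illustration of the two simplest feature groups named in the abstract, the following is a minimal Python sketch of amino acid composition (20 values) and dipeptide composition (400 values). It is not PROFEAT's code: the function names, the handling of non-standard residues (skipped here), and the normalization to fractions are assumptions made for this example.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def amino_acid_composition(seq: str) -> dict[str, float]:
    """Fraction of each standard amino acid among the standard residues in seq."""
    seq = seq.upper()
    counts = {aa: seq.count(aa) for aa in AMINO_ACIDS}
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return {aa: n / total for aa, n in counts.items()}


def dipeptide_composition(seq: str) -> dict[str, float]:
    """Fraction of each of the 400 possible adjacent residue pairs in seq."""
    seq = seq.upper()
    # Start every possible dipeptide at zero so the result always has 400 keys.
    counts = {a + b: 0 for a, b in product(AMINO_ACIDS, repeat=2)}
    total = 0
    for i in range(len(seq) - 1):
        pair = seq[i:i + 2]
        if pair in counts:  # assumption: pairs with non-standard residues are skipped
            counts[pair] += 1
            total += 1
    if total == 0:
        raise ValueError("sequence contains no standard dipeptides")
    return {pair: n / total for pair, n in counts.items()}
```

Concatenating the two dictionaries' values in a fixed key order gives a 420-element feature vector of the kind that can be fed to a statistical learning model.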
|
27
|
Cui J, Han LY, Lin HH, Zhang HL, Tang ZQ, Zheng CJ, Cao ZW, Chen YZ. Prediction of MHC-binding peptides of flexible lengths from sequence-derived structural and physicochemical properties. Mol Immunol 2006; 44:866-77. [PMID: 16806474 DOI: 10.1016/j.molimm.2006.04.001] [Citation(s) in RCA: 50] [Impact Index Per Article: 2.8] [Received: 03/12/2006] [Revised: 04/05/2006] [Accepted: 04/06/2006] [Indexed: 11/22/2022]
Abstract
Peptide binding to MHC is critical for antigen recognition by T-cells. To facilitate vaccine design, computational methods have been developed for predicting MHC-binding peptides, which achieve impressive prediction accuracies of 70-90% for binders and 40-80% for non-binders. These methods have been developed for peptides of fixed lengths, for a limited number of alleles, trained from a small number of non-binders, and in some cases based straightforwardly on sequence. These factors limit prediction coverage and accuracy, particularly for non-binders. It is desirable to explore methods that predict binders of flexible lengths from sequence-derived physicochemical properties and are trained from diverse sets of non-binders. This work explores support vector machines (SVM) as such a method for developing prediction systems of 18 MHC class I and 12 class II alleles by using 4208-3252 binders and 234,333-168,793 non-binders, and evaluated by an independent set of 545-476 binders and 110,564-84,430 non-binders. Binder accuracies are 86-99% for 25 and 70-80% for 5 alleles; non-binder accuracies are 96-99% for 30 alleles. Binder accuracies are comparable and non-binder accuracies substantially improved against other results. Our method correctly predicts 73.3% of the 15 newly-published epitopes in the last 4 months of 2005. Of the 251 recently-published HLA-A*0201 non-epitopes predicted as binders by other methods, 63 are predicted as binders by our method. Screening of HIV-1 genome shows that, compared to other methods, a comparable percentage (75-100%) of its known epitopes is correctly predicted, while a lower percentage (0.01-5% for 24 and 5-8% for 6 alleles) of its constituent peptides are predicted as binders. Our software can be accessed at .
|
28
|
Yap CW, Xue Y, Li H, Li ZR, Ung CY, Han LY, Zheng CJ, Cao ZW, Chen YZ. Prediction of compounds with specific pharmacodynamic, pharmacokinetic or toxicological property by statistical learning methods. Mini Rev Med Chem 2006; 6:449-59. [PMID: 16613581 DOI: 10.2174/138955706776361501] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.2] [Indexed: 11/22/2022]
Abstract
Computational methods for predicting compounds of specific pharmacodynamic, pharmacokinetic, or toxicological property are useful for facilitating drug discovery and drug safety evaluation. The quantitative structure-activity relationship (QSAR) and quantitative structure-property relationship (QSPR) methods are the most successfully used statistical learning methods for predicting compounds of specific property. More recently, other statistical learning methods such as neural networks and support vector machines have been explored for predicting compounds of higher structural diversity than those covered by QSAR and QSPR. These methods have shown promising potential in a number of studies. This article is intended to review the strategies, current progresses and underlying difficulties in using statistical learning methods for predicting compounds of specific property. It also evaluates algorithms commonly used for representing structural and physicochemical properties of compounds.
|
29
|
Ji ZL, Zhou H, Wang JF, Han LY, Zheng CJ, Chen YZ. Traditional Chinese medicine information database. J Ethnopharmacol 2006; 103:501. [PMID: 16376038 DOI: 10.1016/j.jep.2005.11.003] [Citation(s) in RCA: 17] [Impact Index Per Article: 0.9] [Received: 10/14/2005] [Revised: 10/25/2005] [Accepted: 11/01/2005] [Indexed: 05/05/2023]
|
30
|
Lin HH, Han LY, Zhang HL, Zheng CJ, Xie B, Chen YZ. Prediction of the functional class of lipid binding proteins from sequence-derived properties irrespective of sequence similarity. J Lipid Res 2006; 47:824-31. [PMID: 16443826 DOI: 10.1194/jlr.m500530-jlr200] [Citation(s) in RCA: 29] [Impact Index Per Article: 1.6] [Indexed: 11/20/2022] Open
Abstract
Lipid binding proteins play important roles in signaling, regulation, membrane trafficking, immune response, lipid metabolism, and transport. Because of their functional and sequence diversity, it is desirable to explore additional methods for predicting lipid binding proteins irrespective of sequence similarity. This work explores the use of support vector machines (SVMs) as such a method. SVM prediction systems are developed using 14,776 lipid binding and 133,441 nonlipid binding proteins and are evaluated by an independent set of 6,768 lipid binding and 64,761 nonlipid binding proteins. The computed prediction accuracy is 78.9, 79.5, 82.2, 79.5, 84.4, 76.6, 90.6, 79.0, and 89.9% for lipid degradation, lipid metabolism, lipid synthesis, lipid transport, lipid binding, lipopolysaccharide biosynthesis, lipoprotein, lipoyl, and all lipid binding proteins, respectively. The accuracy for the nonmember proteins of each class is 99.9, 99.2, 99.6, 99.8, 99.9, 99.8, 98.5, 99.9, and 97.0%, respectively. Comparable accuracies are obtained when homologous proteins are considered as one, or by using a different SVM kernel function. Our method predicts 86.8% of the 76 lipid binding proteins nonhomologous to any protein in the Swiss-Prot database and 89.0% of the 73 known lipid binding domains as lipid binding. These findings suggest the usefulness of SVMs for facilitating the prediction of lipid binding proteins. Our software can be accessed at the SVMProt server (http://jing.cz3.nus.edu.sg/cgi-bin/svmprot.cgi).
|
31
|
Cui J, Han LY, Cai CZ, Zheng CJ, Ji ZL, Chen YZ. Prediction of functional class of novel bacterial proteins without the use of sequence similarity by a statistical learning method. J Mol Microbiol Biotechnol 2006; 9:86-100. [PMID: 16319498 DOI: 10.1159/000088839] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.3] [Indexed: 11/19/2022] Open
Abstract
A substantial percentage of the putative protein-encoding open reading frames (ORFs) in bacterial genomes have no homolog of known function, and their function cannot be confidently assigned on the basis of sequence similarity. Methods not based on sequence similarity are needed and are being developed. One method, SVMProt (http://jing.cz3.nus.edu.sg/cgi-bin/svmprot.cgi), predicts protein functional family irrespective of sequence similarity (Nucleic Acids Res. 2003;31:3692-3697). While it has been tested on a large number of proteins, its capability for non-homologous proteins has so far been evaluated for a relatively small number of proteins, and additional tests are needed to more fully assess SVMProt. In this work, 90 novel bacterial proteins (non-homologous to known proteins) are used to evaluate the capability of SVMProt. These proteins are such that none of their homologs are in the Swiss-Prot database, their functions are not clearly described in the literature, and they themselves and their homologs are not included in the training sets of SVMProt. They represent proteins whose function cannot be confidently predicted by sequence similarity methods at present. The predicted functional class of 76.7% of these proteins shows various levels of consistency with the literature-described function, compared to the overall accuracy of 87% for the SVMProt functional class assignment of 34,582 proteins that have at least one homolog of known function. Our study suggests that SVMProt is capable of assigning functional class for novel bacterial proteins at a level not too much lower than that of sequence alignment methods for homologous proteins.
|
32
|
Cai CZ, Han LY, Chen X, Cao ZW, Chen YZ. Prediction of functional class of the SARS coronavirus proteins by a statistical learning method. J Proteome Res 2006; 4:1855-62. [PMID: 16212442 DOI: 10.1021/pr050110a] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.4] [Indexed: 11/28/2022]
Abstract
The complete genome of severe acute respiratory syndrome coronavirus (SARS-CoV) reveals the existence of putative proteins unique to SARS-CoV. Identification of their function facilitates a mechanistic understanding of SARS infection and drug development for its treatment. The sequence of the majority of these putative proteins has no significant similarity to those of known proteins, which complicates the task of using sequence analysis tools to probe their function. The support vector machine (SVM) method, useful for predicting the functional class of distantly related proteins, is employed to ascribe a possible functional class to SARS-CoV proteins. Testing results indicate that SVM is able to predict the functional class of 73% of the known SARS-CoV proteins with available sequences and 67% of 18 other novel viral proteins. A combination of the sequence comparison method BLAST and SVMProt can further improve the prediction accuracy of SVMProt such that the functional class of two additional SARS-CoV proteins is correctly predicted. Our study suggests that the SARS-CoV genome possibly contains a putative voltage-gated ion channel, structural proteins, a carbon-oxygen lyase, oxidoreductases acting on the CH-OH group of donors, and an ATP-binding cassette transporter. A web version of our software, SVMProt, is accessible at http://jing.cz3.nus.edu.sg/cgi-bin/svmprot.cgi.
|
33
|
Han LY, Lin HH, Li ZR, Zheng CJ, Cao ZW, Xie B, Chen YZ. PEARLS: Program for Energetic Analysis of Receptor−Ligand System. J Chem Inf Model 2006; 46:445-50. [PMID: 16426079 DOI: 10.1021/ci0502146] [Citation(s) in RCA: 51] [Impact Index Per Article: 2.8] [Indexed: 11/28/2022]
Abstract
Analysis of the energetics of small molecule ligand-protein, ligand-nucleic acid, and protein-nucleic acid interactions facilitates the quantitative understanding of molecular interactions that regulate the function and conformation of proteins. It has also been extensively used for ranking potential new ligands in virtual drug screening. We developed a Web-based software, PEARLS (Program for Energetic Analysis of Ligand-Receptor Systems), for computing interaction energies of ligand-protein, ligand-nucleic acid, protein-nucleic acid, and ligand-protein-nucleic acid complexes from their 3D structures. The AMBER molecular force field, Morse potential, and empirical energy functions are used to compute the van der Waals, electrostatic, hydrogen bond, metal-ligand bonding, and water-mediated hydrogen bond energies between the binding molecules. The change in the solvation free energy of molecular binding is estimated by using an empirical solvation free energy model. The contribution from ligand conformational entropy change is also estimated by a simple model. The computed free energies for a number of PDB ligand-receptor complexes were studied and compared to experimental binding affinity. A substantial degree of correlation between the computed free energy and experimental binding affinity was found, which suggests that PEARLS may be useful in facilitating energetic analysis of ligand-protein, ligand-nucleic acid, and protein-nucleic acid interactions. PEARLS can be accessed at http://ang.cz3.nus.edu.sg/cgi-bin/prog/rune.pl.
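The abstract names the Morse potential as one of the energy functions used. As a generic illustration only, the sketch below evaluates the textbook Morse form, shifted so that the energy tends to zero at large separation. The abstract does not say which interaction terms use it or with what parameters, so the function name, argument names, and the shifted convention are assumptions for this example and should not be read as PEARLS's implementation.

```python
import math


def morse_energy(r: float, well_depth: float, width: float, r_eq: float) -> float:
    """Textbook Morse potential, shifted to approach 0 as r grows large.

    V(r) = D * ((1 - exp(-a * (r - r0)))**2 - 1)

    r          -- separation between the two interacting atoms
    well_depth -- D, depth of the energy well (hypothetical parameter value)
    width      -- a, controls how narrow the well is (hypothetical parameter value)
    r_eq       -- r0, separation at which the energy is lowest
    """
    displacement = 1.0 - math.exp(-width * (r - r_eq))
    return well_depth * (displacement ** 2 - 1.0)
```

With this convention the energy equals `-well_depth` at `r == r_eq`, rises steeply at shorter separations, and decays toward zero at long range.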
|
34
|
Li H, Yap CW, Xue Y, Li ZR, Ung CY, Han LY, Chen YZ. Statistical learning approach for predicting specific pharmacodynamic, pharmacokinetic, or toxicological properties of pharmaceutical agents. Drug Dev Res 2005. [DOI: 10.1002/ddr.20044] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.1] [Indexed: 11/08/2022]
|
35
|
Lin HH, Han LY, Cai CZ, Ji ZL, Chen YZ. Prediction of transporter family from protein sequence by support vector machine approach. Proteins 2005; 62:218-31. [PMID: 16287089 DOI: 10.1002/prot.20605] [Citation(s) in RCA: 50] [Impact Index Per Article: 2.6] [Indexed: 11/07/2022]
Abstract
Transporters play key roles in cellular transport and metabolic processes, and in facilitating drug delivery and excretion. These proteins are classified into families based on the transporter classification (TC) system. Determination of the TC family of transporters facilitates the study of their cellular and pharmacological functions. Methods for predicting TC family without sequence alignments or clustering are particularly useful for studying novel transporters whose function cannot be determined by sequence similarity. This work explores the use of a machine learning method, support vector machines (SVMs), for predicting the family of transporters from their sequence without the use of sequence similarity. A total of 10,636 transporters in 13 TC subclasses, 1914 transporters in eight TC families, and 168,341 nontransporter proteins are used to train and test the SVM prediction system. Testing results by using a separate set of 4351 transporters and 83,151 nontransporter proteins show that the overall accuracy for predicting members of these TC subclasses and families is 83.4% and 88.0%, respectively, and that of nonmembers is 99.3% and 96.6%, respectively. The accuracies for predicting members and nonmembers of individual TC subclasses are in the range of 70.7-96.1% and 97.6-99.9%, respectively, and those of individual TC families are in the range of 60.6-97.1% and 91.5-99.4%, respectively. A further test by using 26,139 transmembrane proteins outside each of the 13 TC subclasses shows that 90.4-99.6% of these are correctly predicted. Our study suggests that the SVM is potentially useful for facilitating functional study of transporters irrespective of sequence similarity.
|
36
|
Han LY, Zheng CJ, Lin HH, Cui J, Li H, Zhang HL, Tang ZQ, Chen YZ. Prediction of functional class of novel plant proteins by a statistical learning method. New Phytol 2005; 168:109-21. [PMID: 16159326 DOI: 10.1111/j.1469-8137.2005.01482.x] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.2] [Indexed: 05/04/2023]
Abstract
In plant genomes, the function of a substantial percentage of the putative protein-coding open reading frames (ORFs) is unknown. These ORFs have no significant sequence similarity to known proteins, which complicates the task of functional study of these proteins. Efforts are being made to explore methods that are complementary to, or may be used in combination with, sequence alignment and clustering methods. A web-based protein functional class prediction software, SVMProt, has shown some capability for predicting functional class of distantly related proteins. Here the usefulness of SVMProt for functional study of novel plant proteins is evaluated. To test SVMProt, 49 plant proteins (without a sequence homolog in the Swiss-Prot protein database, not in the SVMProt training set, and with functional indications provided in the literature) were selected from a comprehensive search of MEDLINE abstracts and Swiss-Prot databases in 1999-2004. These represent unique proteins the function of which, at present, cannot be confidently predicted by sequence alignment and clustering methods. The predicted functional class of 31 proteins was consistent, and that of four other proteins was weakly consistent, with published functions. Overall, the functional class of 71.4% of these proteins was consistent, or weakly consistent, with functional indications described in the literature. SVMProt shows a certain level of ability to provide useful hints about the functions of novel plant proteins with no similarity to known proteins.
|
37
|
Wang JF, Zhou H, Han LY, Chen X, Chen YZ, Cao ZW. Traditional Chinese medicine information database. Clin Pharmacol Ther 2005; 78:92-3. [PMID: 16003299 DOI: 10.1016/j.clpt.2005.03.010] [Citation(s) in RCA: 27] [Impact Index Per Article: 1.4] [Indexed: 11/16/2022]
|
38
|
Landen CN, Immaneni A, Deavers MT, Thornton A, Celestino J, Thanker PH, Han LY, Bodurka DC, Gershenson DM, Brinkley WR, Sood AK. Overexpression of the centrosomal protein aurora-A kinase is associated with poor prognosis in epithelial ovarian cancer patients. J Clin Oncol 2005. [DOI: 10.1200/jco.2005.23.16_suppl.5039] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Indexed: 11/20/2022] Open
|
39
|
Abstract
Lead discovery against a preselected therapeutic target is a key component in modern drug development. Continuous effort and increasing interest have been directed at the search for new targets, which has led to the identification of a growing number of them. Data from the therapeutic target database, at http://bidd.nus.edu.sg/group/cjttd/ttd.asp, show that, as of July 2004, the number of documented targets of marketed and investigational drugs has reached 1,174 distinct proteins (including subtypes) and 27 nucleic acids, 239 of which are targets of the marketed drugs. Analysis of these targets, particularly those of recently approved drugs and patented investigational agents, provides useful hints about general trends of target exploration and current focus in drug discovery for the treatment of high impact diseases needing effective or more treatment options.
|
40
|
Han LY, Cai CZ, Ji ZL, Cao ZW, Cui J, Chen YZ. Predicting functional family of novel enzymes irrespective of sequence similarity: a statistical learning approach. Nucleic Acids Res 2004; 32:6437-44. [PMID: 15585667 PMCID: PMC535691 DOI: 10.1093/nar/gkh984] [Citation(s) in RCA: 67] [Impact Index Per Article: 3.4] [Indexed: 11/13/2022] Open
Abstract
The function of a protein that has no sequence homolog of known function is difficult to assign on the basis of sequence similarity. The same problem may arise for homologous proteins of different functions if one is newly discovered and the other is the only known protein of similar sequence. It is desirable to explore methods that are not based on sequence similarity. One approach is to assign the functional family of a protein to provide a useful hint about its function. Several groups have employed a statistical learning method, support vector machines (SVMs), for predicting protein functional family directly from sequence irrespective of sequence similarity. These studies showed that SVM prediction accuracy is at a level useful for functional family assignment. However, its capability for assignment of distantly related proteins and homologous proteins of different functions has not been critically and adequately assessed. Here SVM is tested for functional family assignment of two groups of enzymes. One consists of 50 enzymes that have no homolog of known function from PSI-BLAST search of protein databases. The other contains eight pairs of homologous enzymes of different families. SVM correctly assigns 72% of the enzymes in the first group and 62% of the enzyme pairs in the second group, suggesting that it is potentially useful for facilitating functional study of novel proteins. A web version of our software, SVMProt, is accessible at http://jing.cz3.nus.edu.sg/cgi-bin/svmprot.cgi.
|
41
|
Han LY, Schimp V, Oh JC, Ramirez PT. A gelatin matrix-thrombin tissue sealant (FloSeal) application in the management of groin breakdown after inguinal lymphadenectomy for vulvar cancer. Int J Gynecol Cancer 2004; 14:621-4. [PMID: 15304156 DOI: 10.1111/j.1048-891x.2004.14411.x] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.1] [Indexed: 11/28/2022] Open
Abstract
The rate of groin breakdown after radical wide vulvar excision and inguinal lymphadenectomy for vulvar cancer remains significant despite conservative surgical approaches. An 86-year-old Latin American woman underwent wide radical excision and bilateral inguinal lymphadenectomy for vulvar cancer. The postoperative course was complicated by bilateral groin wound separation and high output lymphorrhea. The patient responded to the application of a gelatin matrix-thrombin tissue sealant (FloSeal) to the bases of each groin with resolution in lymphorrhea and formation of granulation tissue. The application of a gelatin matrix-thrombin tissue sealant (FloSeal) may be a viable treatment in the management of groin breakdown in selected patients when conventional therapy produces suboptimal results.
|
42
|
Cao ZW, Xue Y, Han LY, Xie B, Zhou H, Zheng CJ, Lin HH, Chen YZ. MoViES: molecular vibrations evaluation server for analysis of fluctuational dynamics of proteins and nucleic acids. Nucleic Acids Res 2004; 32:W679-85. [PMID: 15215475 PMCID: PMC441522 DOI: 10.1093/nar/gkh384] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.7] [Indexed: 11/13/2022] Open
Abstract
Analysis of vibrational motions and thermal fluctuational dynamics is a widely used approach for studying structural, dynamic and functional properties of proteins and nucleic acids. Development of a freely accessible web server for computation of vibrational and thermal fluctuational dynamics of biomolecules is thus useful for facilitating the relevant studies. We have developed a computer program for computing vibrational normal modes and thermal fluctuational properties of proteins and nucleic acids and applied it in several studies. In our program, vibrational normal modes are computed by using modified AMBER molecular mechanics force fields, and thermal fluctuational properties are computed by means of a self-consistent harmonic approximation method. A web version of our program, MoViES (Molecular Vibrations Evaluation Server), was set up to facilitate the use of our program to study vibrational dynamics of proteins and nucleic acids. This software was tested on selected proteins, which show that the computed normal modes and thermal fluctuational bond disruption probabilities are consistent with experimental findings and other normal mode computations. MoViES can be accessed at http://ang.cz3.nus.edu.sg/cgi-bin/prog/norm.pl.
|
43
|
Abstract
One approach for facilitating protein function prediction is to classify proteins into functional families. Recent studies on the classification of G-protein coupled receptors and other proteins suggest that a statistical learning method, support vector machines (SVM), may be potentially useful for protein classification into functional families. In this work, SVM is applied and tested on the classification of enzymes into functional families defined by the Enzyme Nomenclature Committee of IUBMB. The SVM classification system for each family is trained from representative enzymes of that family and seed proteins of Pfam curated protein families. The classification accuracy for enzymes from 46 families and for non-enzymes is in the range of 50.0% to 95.7% and 79.0% to 100%, respectively. The corresponding Matthews correlation coefficient is in the range of 54.1% to 96.1%. Moreover, 80.3% of the 8,291 correctly classified enzymes are uniquely classified into a specific enzyme family by using a scoring function, indicating that SVM may have a certain level of unique prediction capability. Testing results also suggest that SVM in some cases is capable of classification of distantly related enzymes and homologous enzymes of different functions. Effort is being made to use a more comprehensive set of enzymes as training sets and to incorporate multi-class SVM classification systems to further enhance the unique prediction accuracy. Our results suggest the potential of SVM for enzyme family classification and for facilitating protein function prediction. Our software is accessible at http://jing.cz3.nus.edu.sg/cgi-bin/svmprot.cgi.
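The abstract reports three per-family figures: accuracy for family members, accuracy for non-members, and the Matthews correlation coefficient (MCC). The following minimal Python sketch shows how these are commonly computed from confusion-matrix counts, assuming that "accuracy for enzymes" means sensitivity and "accuracy for non-enzymes" means specificity. The function name and the zero-denominator convention are choices made for this example; since the abstract quotes the MCC as a percentage, the value returned here would be multiplied by 100 to compare.

```python
import math


def classification_metrics(tp: int, fn: int, tn: int, fp: int) -> dict[str, float]:
    """Sensitivity, specificity and MCC from confusion-matrix counts.

    tp / fn -- family members predicted as members / as non-members
    tn / fp -- non-members predicted as non-members / as members
    """
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0  # member accuracy
    specificity = tn / (tn + fp) if (tn + fp) else 0.0  # non-member accuracy
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Convention assumed here: MCC is reported as 0 when the denominator is 0.
    mcc = (tp * tn - fp * fn) / denominator if denominator else 0.0
    return {"sensitivity": sensitivity, "specificity": specificity, "mcc": mcc}
```

Unlike the two accuracies, the MCC combines all four counts in one number, which makes it less misleading when non-members greatly outnumber members.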
|
44
|
Zheng CJ, Zhou H, Xie B, Han LY, Yap CW, Chen YZ. TRMP: a database of therapeutically relevant multiple pathways. Bioinformatics 2004; 20:2236-41. [PMID: 15059817 DOI: 10.1093/bioinformatics/bth233] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.7] [Indexed: 11/13/2022] Open
Abstract
Disease processes often involve crosstalk between proteins in different pathways. Different proteins have been used as separate therapeutic targets for the same disease. Synergetic targeting of multiple targets has been explored in combination therapy of a number of diseases. Potential harmful interactions of multiple targeting have also been closely studied. To facilitate mechanistic study of drug actions and a more comprehensive understanding of the relationship between different targets of the same disease, it is useful to develop a database of known therapeutically relevant multiple pathways (TRMPs). Information about non-target proteins and natural small molecules involved in these pathways also provides useful hints for searching new therapeutic targets and facilitates the understanding of how therapeutic targets interact with other molecules in performing specific tasks. The TRMPs database is designed to provide information about such multiple pathways along with related therapeutic targets, corresponding drugs/ligands, targeted disease conditions, constituent individual pathways, and structural and functional information about each protein in the pathways. Cross links to other databases are also introduced to facilitate the access of information about individual pathways and proteins. Availability: This database can be accessed at http://bidd.nus.edu.sg/group/trmp/trmp.asp and it currently contains 11 entries of multiple pathways, 97 entries of individual pathways, 120 targets covering 72 disease conditions together with 120 sets of drugs directed at each of these targets. Each entry can be retrieved through multiple methods including multiple pathway name, individual pathway name and disease name. Supplementary information: http://bidd.nus.edu.sg/group/trmp/sm.pdf
|
45
|
Han LY, Schimp V, Oh JC, Ramirez PT. A gelatin matrix-thrombin tissue sealant (FloSeal®) application in the management of groin breakdown after inguinal lymphadenectomy for vulvar cancer. Int J Gynecol Cancer 2004. [DOI: 10.1136/ijgc-00009577-200407000-00008] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/03/2022] Open
Abstract
The rate of groin breakdown after radical wide vulvar excision and inguinal lymphadenectomy for vulvar cancer remains significant despite conservative surgical approaches. An 86-year-old Latin American woman underwent wide radical excision and bilateral inguinal lymphadenectomy for vulvar cancer. The postoperative course was complicated by bilateral groin wound separation and high output lymphorrhea. The patient responded to the application of a gelatin matrix-thrombin tissue sealant (FloSeal®) to the bases of each groin with resolution in lymphorrhea and formation of granulation tissue. The application of a gelatin matrix-thrombin tissue sealant (FloSeal®) may be a viable treatment in the management of groin breakdown in selected patients when conventional therapy produces suboptimal results.
|
46
|
Zheng C, Sun LZ, Han LY, Ji ZL, Chen X, Chen YZ. Drug ADME-associated protein database as a resource for facilitating pharmacogenomics research. Drug Dev Res 2004. [DOI: 10.1002/ddr.10376] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.6] [Indexed: 11/09/2022]
|
47
|
Cai CZ, Han LY, Ji ZL, Chen X, Chen YZ. SVM-Prot: Web-based support vector machine software for functional classification of a protein from its primary sequence. Nucleic Acids Res 2003; 31:3692-7. [PMID: 12824396 PMCID: PMC169006 DOI: 10.1093/nar/gkg600] [Citation(s) in RCA: 358] [Impact Index Per Article: 17.0] [Indexed: 11/13/2022] Open
Abstract
Prediction of protein function is of significance in studying biological processes. One approach for function prediction is to classify a protein into a functional family. Support vector machine (SVM) is a useful method for such classification, which may involve proteins with diverse sequence distribution. We have developed a web-based software, SVMProt, for SVM classification of a protein into a functional family from its primary sequence. The SVMProt classification system is trained from representative proteins of a number of functional families and seed proteins of Pfam curated protein families. It currently covers 54 functional families and additional families will be added in the near future. The computed accuracy for protein family classification is found to be in the range of 69.1-99.6%. SVMProt shows a certain degree of capability for the classification of distantly related proteins and homologous proteins of different function and thus may be used as a protein function prediction tool that complements sequence alignment methods. SVMProt can be accessed at http://jing.cz3.nus.edu.sg/cgi-bin/svmprot.cgi.
|
48
|
Ji ZL, Chen X, Zhen CJ, Yao LX, Han LY, Yeo WK, Chung PC, Puy HS, Tay YT, Muhammad A, Chen YZ. KDBI: Kinetic Data of Bio-molecular Interactions database. Nucleic Acids Res 2003; 31:255-7. [PMID: 12519995 PMCID: PMC165514 DOI: 10.1093/nar/gkg067] [Citation(s) in RCA: 31] [Impact Index Per Article: 1.5] [Indexed: 11/12/2022] Open
Abstract
Understanding of cellular processes and underlying molecular events requires knowledge about different aspects of molecular interactions, networks of molecules and pathways in addition to the sequence, structure and function of individual molecules involved. Databases of interacting molecules, pathways and related chemical reaction equations have been developed. The kinetic data for these interactions, which is important for mechanistic investigation, quantitative study and simulation of cellular processes and events, is not provided in the existing databases. We introduce a new database of Kinetic Data of Bio-molecular Interactions (KDBI) aimed at providing experimentally determined kinetic data of protein-protein, protein-RNA, protein-DNA, protein-ligand, RNA-ligand, DNA-ligand binding or reaction events described in the literature. KDBI contains information about binding or reaction event, participating molecules (name, synonyms, molecular formula, classification, SWISS-PROT AC or CAS number), binding or reaction equation, kinetic data and related references. The kinetic data is in terms of one or a combination of the following quantities as given in the literature of a particular event: association/dissociation or on/off rate constant, first/second/third/... order rate constant, equilibrium rate constant, catalytic rate constant, equilibrium association/dissociation constant, inhibition constant and binding affinity constant. Each entry can be retrieved through protein or nucleic acid or ligand name, SWISS-PROT AC number, ligand CAS number and full-text search of a binding or reaction event. KDBI currently contains 8273 entries of biomolecular binding or reaction events involving 1380 proteins, 143 nucleic acids and 1395 small molecules. Hyperlinks are provided for accessing references in Medline and available 3D structures in PDB and NDB. This database can be accessed at http://xin.cz3.nus.edu.sg/group/kdbi/kdbi.asp.
|
49
|
Arnold SE, Han LY, Moberg PJ, Turetsky BI, Gur RE, Trojanowski JQ, Hahn CG. Dysregulation of olfactory receptor neuron lineage in schizophrenia. Arch Gen Psychiatry 2001; 58:829-35. [PMID: 11545665 DOI: 10.1001/archpsyc.58.9.829] [Citation(s) in RCA: 87] [Impact Index Per Article: 3.8] [Indexed: 11/14/2022]
Abstract
BACKGROUND Growing evidence implicates abnormal neurodevelopment in schizophrenia. While neuron birth and differentiation is largely completed by the end of gestation, the olfactory epithelium (OE) is a unique part of the central nervous system that undergoes regeneration throughout life, thus offering an opportunity to investigate cellular and molecular events of neurogenesis and development postmortem. We hypothesized that OE neurons exhibit deviant progress through neurodevelopment in schizophrenia characterized by an increase in immature neurons. METHODS Olfactory epithelium was removed at autopsy from 13 prospectively assessed elderly subjects who had schizophrenia and 10 nonpsychiatric control subjects. Sections were immunolabeled with antibodies that distinguish OE neurons in different stages of development, including basal cells (low-affinity nerve growth factor receptor, p75NGFR), postmitotic immature neurons (growth-associated protein 43 [GAP43]), and mature olfactory receptor neurons (olfactory marker protein). Absolute and relative densities of each cell type were determined. RESULTS We observed a significantly lower density of p75NGFR basal cells (37%) in schizophrenia and increases in GAP43+ postmitotic immature neurons (316%) and ratios of GAP43+ postmitotic immature neurons to p75NGFR+ cells (665%) and olfactory marker protein+ mature neurons to p75NGFR+ basal cells (328%). Neuroleptic-free schizophrenia subjects exhibited the highest GAP43+ postmitotic immature neuron values. CONCLUSIONS Abnormal densities and ratios of OE neurons at different stages of development indicate dysregulation of OE neuronal lineage in schizophrenia. This could be because of intrinsic factors controlling differentiation or an inability to gain trophic support from axonal targets in the olfactory bulb. While caution is necessary in extrapolating developmental findings in mature OE to early brain development, similarities in molecular events suggest that such studies may be instructive.
|
50
|
Mitchell TW, Nissanov J, Han LY, Mufson EJ, Schneider JA, Cochran EJ, Bennett DA, Lee VM, Trojanowski JQ, Arnold SE. Novel method to quantify neuropil threads in brains from elders with or without cognitive impairment. J Histochem Cytochem 2000; 48:1627-38. [PMID: 11101631 DOI: 10.1177/002215540004801206] [Citation(s) in RCA: 55] [Impact Index Per Article: 2.3] [Indexed: 11/16/2022] Open
Abstract
Pathological alterations in dendrites and axons (i.e., neuritic pathologies) occur in the normal aging brain as well as in brains from elders with mild cognitive impairment and neurodegenerative dementia. These alterations may correlate with clinical measures of cognitive abilities, but the contribution of neuropil threads (NTs), which constitute 85-90% of cortical tau pathology, has not been clear because of the lack of quantitative methodologies. We combined quantitative fractionation and image analysis to devise a strategy for measuring the burden of tau-rich NTs in the entorhinal and perirhinal cortex of brains from elders with and without cognitive impairment, including dementia due to Alzheimer's disease (AD). On the basis of data presented here using this novel strategy, we conclude that this quantitative imaging technique will facilitate efforts to determine the behavioral correlations of neuritic lesions in AD and other brain disorders.
|