1
|
Wijaya SH, Batubara I, Nishioka T, Altaf-Ul-Amin M, Kanaya S. Metabolomic Studies of Indonesian Jamu Medicines: Prediction of Jamu Efficacy and Identification of Important Metabolites. Mol Inform 2017; 36. [PMID: 28682479 DOI: 10.1002/minf.201700050] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.6] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/23/2017] [Accepted: 06/22/2017] [Indexed: 12/15/2022]
Abstract
In order to obtain a better understanding why some Jamu formulas can be used to treat a specific disease, we performed metabolomic studies of Jamu by taking into consideration the biologically active compounds existing in plants used as Jamu ingredients. A thorough integration of information from omics is expected to provide solid evidence-based scientific rationales for the development of modern phytomedicines. This study focused on prediction of Jamu efficacy based on its component metabolites and also identification of important metabolites related to each efficacy group. Initially, we compared the performance of Support Vector Machines and Random Forest to predict the Jamu efficacy with three different data pre-processing approaches, such as no filtering, Single Filtering algorithm, and a combination of Single Filtering algorithm and feature selection using Regularized Random Forest. Both classifiers performed very well and according to 5-fold cross-validation results, the mean accuracy of Support Vector Machine with linear kernel was slightly better than Random Forest. It can be concluded that machine learning methods can successfully relate Jamu efficacy with metabolites. In addition, we extended our analysis by identifying important metabolites from the Random Forest model. The inTrees framework was used to extract the rules and to select important metabolites for each efficacy group. Overall, we identified 94 significant metabolites associated to 12 efficacy groups and many of them were validated by published literature and KNApSAcK Metabolite Activity database.
Collapse
Affiliation(s)
- Sony Hartono Wijaya
- Graduate School of Information Science, Nara Institute of Science and Technology, 8916-5, Takayama, Ikoma, Nara 630-0192, Japan.,Department of Computer Science, Bogor Agricultural University, Jl. Meranti Wing 20 Level 5 Kampus IPB Dramaga, Bogor, 16680, Indonesia
| | - Irmanida Batubara
- Tropical Biopharmaca Research Center, Bogor Agricultural University, Jl. Taman Kencana No. 3 Kampus IPB Taman Kencana, Bogor, 16128, Indonesia.,Department of Chemistry, Bogor Agricultural University, Jl. Tanjung Wing 1 Level 3 Kampus IPB Dramaga, Bogor, 16680, Indonesia
| | - Takaaki Nishioka
- Graduate School of Information Science, Nara Institute of Science and Technology, 8916-5, Takayama, Ikoma, Nara 630-0192, Japan
| | - Md Altaf-Ul-Amin
- Graduate School of Information Science, Nara Institute of Science and Technology, 8916-5, Takayama, Ikoma, Nara 630-0192, Japan
| | - Shigehiko Kanaya
- Graduate School of Information Science, Nara Institute of Science and Technology, 8916-5, Takayama, Ikoma, Nara 630-0192, Japan
| |
Collapse
|
2
|
Strömbergsson H, Lapins M, Kleywegt GJ, Wikberg JES. Towards Proteome-Wide Interaction Models Using the Proteochemometrics Approach. Mol Inform 2010; 29:499-508. [PMID: 27463328 DOI: 10.1002/minf.201000052] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/05/2010] [Accepted: 05/25/2010] [Indexed: 02/02/2023]
Abstract
A proteochemometrics model was induced from all interaction data in the BindingDB database, comprizing in all 7078 protein-ligand complexes with representatives from all major drug target categories. Proteins were represented by alignment-independent sequence descriptors holding information on properties such as hydrophobicity, charge, and secondary structure. Ligands were represented by commonly used QSAR descriptors. The inhibition constant (pKi ) values of protein-ligand complexes were discretized into "high" and "low" interaction activity. Different machine-learning techniques were used to induce models relating protein and ligand properties to the interaction activity. The best was decision trees, which gave an accuracy of 80 % and an area under the ROC curve of 0.81. The tree pointed to the protein and ligand properties, which are relevant for the interaction. As the approach does neither require alignments nor knowledge of protein 3D structures virtually all available protein-ligand interaction data could be utilized, thus opening a way to completely general interaction models that may span entire proteomes.
Collapse
Affiliation(s)
- Helena Strömbergsson
- The Linnaeus Centre for Bioinformatics, Department of Cell and Molecular Biology, Biomedical Centre, Box 598, SE-751 24, Uppsala, Sweden.
| | - Maris Lapins
- Department of Pharmaceutical Pharmacology, Biomedical Centre, Box 591, SE-751 24 Uppsala, Sweden
| | - Gerard J Kleywegt
- Department of Cell and Molecular Biology, Biomedical Centre, Box 596, SE-751 24, Uppsala, Sweden
| | - Jarl E S Wikberg
- Department of Pharmaceutical Pharmacology, Biomedical Centre, Box 591, SE-751 24 Uppsala, Sweden
| |
Collapse
|