1
Vijayaraghavan S, Lakshminarayanan A, Bhargava N, Ravichandran J, Vivek-Ananth RP, Samal A. Machine Learning Models for Prediction of Xenobiotic Chemicals with High Propensity to Transfer into Human Milk. ACS Omega 2024; 9:13006-13016. [PMID: 38524439 PMCID: PMC10955560 DOI: 10.1021/acsomega.3c09392] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/24/2023] [Revised: 02/04/2024] [Accepted: 02/21/2024] [Indexed: 03/26/2024]
Abstract
Breast milk serves as a vital source of essential nutrients for infants. However, human milk contamination via the transfer of environmental chemicals from maternal exposome is a significant concern for infant health. The milk-to-plasma concentration (M/P) ratio is a critical metric that quantifies the extent to which these chemicals transfer from maternal plasma into breast milk, impacting infant exposure. Machine learning-based predictive toxicology models can be valuable in predicting chemicals with a high propensity to transfer into human milk. To this end, we build such classification- and regression-based models by employing multiple machine learning algorithms and leveraging the largest curated data set, to date, of 375 chemicals with known milk-to-plasma concentration (M/P) ratios. Our support vector machine (SVM)-based classifier outperforms other models in terms of different performance metrics, when evaluated on both (internal) test data and an external test data set. Specifically, the SVM-based classifier on (internal) test data achieved a classification accuracy of 77.33%, a specificity of 84%, a sensitivity of 64%, and an F-score of 65.31%. When evaluated on an external test data set, our SVM-based classifier is found to be generalizable with a sensitivity of 77.78%. While we were able to build highly predictive classification models, our best regression models for predicting the M/P ratio of chemicals could achieve only moderate R² values on the (internal) test data. As noted in the earlier literature, our study also highlights the challenges in developing accurate regression models for predicting the M/P ratio of xenobiotic chemicals. Overall, this study attests to the immense potential of predictive computational toxicology models in characterizing the myriad of chemicals in the human exposome.
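The accuracy, specificity, sensitivity, and F-score quoted in this abstract can all be derived from one confusion matrix. The sketch below is illustrative only. The abstract does not report the underlying counts, so the values used here (16 true positives, 9 false negatives, 42 true negatives, 8 false positives) are an assumption, chosen because they reproduce the reported percentages. The function name `classification_metrics` is hypothetical.

```python
def classification_metrics(tp: int, fn: int, tn: int, fp: int) -> dict:
    """Compute standard binary-classification metrics from confusion-matrix counts."""
    total = tp + fn + tn + fp
    sensitivity = tp / (tp + fn)   # recall on the positive (high-transfer) class
    specificity = tn / (tn + fp)   # recall on the negative class
    precision = tp / (tp + fp)
    # F-score taken here as the harmonic mean of precision and sensitivity (F1).
    f_score = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f_score": f_score,
    }


# Assumed counts (not given in the abstract), chosen to match the reported figures.
metrics = classification_metrics(tp=16, fn=9, tn=42, fp=8)
for name, value in metrics.items():
    print(f"{name}: {value:.2%}")
```

With these assumed counts the script prints an accuracy of 77.33%, a sensitivity of 64.00%, a specificity of 84.00%, and an F-score of 65.31%. Other count combinations may also be consistent with the rounded percentages.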
Affiliation(s)
- Akshaya Lakshminarayanan
- Department of Applied Mathematics and Computational Sciences, PSG College of Technology, Coimbatore 641004, India
- Naman Bhargava
- Department of Applied Mathematics and Computational Sciences, PSG College of Technology, Coimbatore 641004, India
- Janani Ravichandran
- The Institute of Mathematical Sciences (IMSc), Chennai 600113, India; Homi Bhabha National Institute (HBNI), Mumbai 400094, India
- R. P. Vivek-Ananth
- The Institute of Mathematical Sciences (IMSc), Chennai 600113, India; Homi Bhabha National Institute (HBNI), Mumbai 400094, India
- Areejit Samal
- The Institute of Mathematical Sciences (IMSc), Chennai 600113, India; Homi Bhabha National Institute (HBNI), Mumbai 400094, India
2
Khaouane A, Ferhat S, Hanini S. A Quantitative Structure-Activity Relationship for Human Plasma Protein Binding: Prediction, Validation and Applicability Domain. Adv Pharm Bull 2023; 13:784-791. [PMID: 38022813 PMCID: PMC10676552 DOI: 10.34172/apb.2023.078] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/24/2022] [Revised: 01/23/2023] [Accepted: 04/24/2023] [Indexed: 12/01/2023]
Abstract
Purpose: The purpose of this study was to develop a robust and externally predictive in silico QSAR-neural network model for predicting plasma protein binding of drugs. This model aims to enhance drug discovery processes by reducing the need for chemical synthesis and extensive laboratory testing. Methods: A dataset of 277 drugs was used to develop the QSAR-neural network model. The model was constructed using a Filter method to select 55 molecular descriptors. The validation set's external accuracy was assessed through the predictive squared correlation coefficient Q² and the root mean squared error (RMSE). Results: The developed QSAR-neural network model demonstrated robustness and good applicability domain. The external accuracy of the validation set was high, with a predictive squared correlation coefficient Q² of 0.966 and a root mean squared error (RMSE) of 0.063. Comparatively, this model outperformed previously published models in the literature. Conclusion: The study successfully developed an advanced QSAR-neural network model capable of predicting plasma protein binding in human plasma for a diverse set of 277 drugs. This model's accuracy and robustness make it a valuable tool in drug discovery, potentially reducing the need for resource-intensive chemical synthesis and laboratory testing.
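For readers unfamiliar with the two validation statistics named in this abstract, the following is a minimal sketch of how RMSE and a predictive squared correlation coefficient are commonly computed. The abstract does not say which Q² variant the authors used, so the baseline mean is left as an explicit parameter. The function names, the example values, and the training-set mean are all hypothetical.

```python
import math


def rmse(observed, predicted):
    """Root mean squared error between observed and predicted values."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def q2_external(observed, predicted, reference_mean):
    """Predictive squared correlation coefficient: 1 - PRESS / SS_ref.

    reference_mean is the mean response used as the baseline. Conventions
    differ on whether the training-set or validation-set mean is used.
    """
    press = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_ref = sum((o - reference_mean) ** 2 for o in observed)
    return 1.0 - press / ss_ref


# Made-up validation values (fraction bound), for illustration only.
observed = [0.10, 0.45, 0.80, 0.95]
predicted = [0.15, 0.40, 0.85, 0.90]
train_mean = 0.60  # hypothetical training-set mean

print(round(rmse(observed, predicted), 3))
print(round(q2_external(observed, predicted, train_mean), 3))
```

A Q² close to 1 means the squared prediction errors are small relative to the spread of the observed values around the baseline mean.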
Affiliation(s)
- Affaf Khaouane
- Laboratory of Biomaterial and transport Phenomena (LBMPT), University of Médéa, pole urbain, 26000, Médéa, Algeria
3
Maeshima T, Yoshida S, Watanabe M, Itagaki F. Prediction model for milk transfer of drugs by primarily evaluating the area under the curve using QSAR/QSPR. Pharm Res 2023; 40:711-719. [PMID: 36720832 PMCID: PMC10036427 DOI: 10.1007/s11095-023-03477-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/24/2022] [Accepted: 01/25/2023] [Indexed: 02/02/2023]
Abstract
PURPOSE: Information on milk transferability of drugs is important for patients who wish to breastfeed. The purpose of this study is to develop a prediction model for the milk-to-plasma drug concentration ratio based on area under the curve (M/P_AUC). The quantitative structure-activity/property relationship (QSAR/QSPR) approach was used to predict compounds involved in active transport during milk transfer. METHODS: We collected M/P ratio data from the literature, which were curated and divided into M/P_AUC ≥ 1 and M/P_AUC < 1. Using the ADMET Predictor® and ADMET Modeler™, we constructed two types of binary classification models: an artificial neural network (ANN) and a support vector machine (SVM). RESULTS: M/P ratios of 403 compounds were collected; M/P_AUC data were obtained for 173 compounds, while 230 compounds only had M/P_non-AUC values reported. The models were constructed using 129 of the 173 compounds, excluding colostrum data. The sensitivity of the ANN model was 0.969 for the training set and 0.833 for the test set, while the sensitivity of the SVM model was 0.971 for the training set and 0.667 for the test set. The contribution of the charge-based descriptor was high in both models. CONCLUSIONS: We built an M/P_AUC prediction model using QSAR/QSPR. These predictive models can play an auxiliary role in evaluating the milk transferability of drugs.
Affiliation(s)
- Tae Maeshima
- Department of Clinical & Pharmaceutical Sciences, Faculty of Pharma Science, Teikyo University, Itabashi-Ku, Tokyo, 173-8605, Japan
- Shin Yoshida
- Department of Clinical & Pharmaceutical Sciences, Faculty of Pharma Science, Teikyo University, Itabashi-Ku, Tokyo, 173-8605, Japan
- Machiko Watanabe
- Department of Clinical & Pharmaceutical Sciences, Faculty of Pharma Science, Teikyo University, Itabashi-Ku, Tokyo, 173-8605, Japan
- Fumio Itagaki
- Department of Clinical & Pharmaceutical Sciences, Faculty of Pharma Science, Teikyo University, Itabashi-Ku, Tokyo, 173-8605, Japan
4
Chu CSM, Simpson JD, O'Neill PM, Berry NG. Machine learning - Predicting Ames mutagenicity of small molecules. J Mol Graph Model 2021; 109:108011. [PMID: 34555723 DOI: 10.1016/j.jmgm.2021.108011] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 05/13/2021] [Revised: 07/29/2021] [Accepted: 08/18/2021] [Indexed: 10/20/2022]
Abstract
In modern drug discovery, detection of a compound's potential mutagenicity is crucial. However, the traditional method of mutagenicity detection using the Ames test is costly and time-consuming, as the compounds need to be synthesised and then tested, and the results are not always accurate and reproducible. Therefore, it would be advantageous to develop robust in silico models which can accurately predict the mutagenicity of a compound prior to synthesis to overcome the inadequacies of the Ames test. After curation of a previously defined compound mutagenicity library, over 5000 molecules had their chemical fingerprints and molecular properties calculated. Using 8 classification modelling algorithms, including support vector machine (SVM), random forest (RF) and extreme gradient boosting (XGB), a total of 112 predictive models have been constructed. Their performance has been assessed using 10-fold cross validation and a hold-out test set, and some of the top performing models have been assessed using the y-randomisation approach. As a result, we have found SVM and XGB models to have good performance during the 10-fold cross validation (AUROC >0.90, sensitivity >0.85, specificity >0.75, balanced accuracy >0.80, Kappa >0.65) and on the test set (AUROC >0.65, sensitivity >0.65, specificity >0.60, balanced accuracy >0.65, Kappa >0.30). We have also identified molecular properties that are the most influential for mutagenicity prediction when combined with chemical molecular fingerprints. Using the Class A mutagenic compounds from the Ames/QSAR International Challenge Project, we were able to verify that our models perform better, predicting more mutagens correctly than the StarDrop Ames mutagenicity prediction and TEST mutagenicity prediction.
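Two of the metrics listed in this abstract, balanced accuracy and Kappa, are less familiar than sensitivity and specificity. The sketch below shows one common way to compute them from confusion-matrix counts, assuming "Kappa" refers to Cohen's kappa. The counts are hypothetical and are not taken from the paper, and the function names are illustrative.

```python
def balanced_accuracy(tp, fn, tn, fp):
    """Mean of sensitivity and specificity."""
    return 0.5 * (tp / (tp + fn) + tn / (tn + fp))


def cohen_kappa(tp, fn, tn, fp):
    """Cohen's kappa: agreement between predictions and labels, corrected for chance."""
    n = tp + fn + tn + fp
    p_observed = (tp + tn) / n
    # Chance agreement from the marginal frequencies of each class.
    p_pos = ((tp + fn) / n) * ((tp + fp) / n)
    p_neg = ((tn + fp) / n) * ((tn + fn) / n)
    p_chance = p_pos + p_neg
    return (p_observed - p_chance) / (1 - p_chance)


# Hypothetical hold-out counts, not taken from the paper.
counts = dict(tp=70, fn=30, tn=65, fp=35)
print(round(balanced_accuracy(**counts), 3))
print(round(cohen_kappa(**counts), 3))
```

Kappa is 0 when the classifier agrees with the labels no more often than chance would predict and 1 for perfect agreement, which is why it can be much lower than the raw accuracy on the same data.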
Affiliation(s)
- Charmaine S M Chu
- Department of Chemistry, University of Liverpool, Crown Street, Liverpool, L69 7ZD, UK
- Jack D Simpson
- Department of Chemistry, University of Liverpool, Crown Street, Liverpool, L69 7ZD, UK
- Paul M O'Neill
- Department of Chemistry, University of Liverpool, Crown Street, Liverpool, L69 7ZD, UK
- Neil G Berry
- Department of Chemistry, University of Liverpool, Crown Street, Liverpool, L69 7ZD, UK
5
Karthikeyan BS, Ravichandran J, Aparna SR, Samal A. ExHuMId: A curated resource and analysis of Exposome of Human Milk across India. Chemosphere 2021; 271:129583. [PMID: 33460906 DOI: 10.1016/j.chemosphere.2021.129583] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 11/20/2020] [Revised: 12/30/2020] [Accepted: 01/04/2021] [Indexed: 06/12/2023]
Abstract
Human milk is a vital source of nourishment for infants. However, numerous environmental contaminants also find their way into human milk, making up the major part of a newborn's external exposome. While there are chemical regulations in India and scientific literature on environmental contaminants is available, the systematic compilation, monitoring, and risk management of human milk contaminants are inadequate. We have harnessed the potential of this large body of literature to develop the Exposome of Human Milk across India (ExHuMId) version 1.0 containing detailed information on 101 environmental contaminants detected in human milk samples across 13 Indian states, compiled from 36 research articles. ExHuMId also compiles the detected concentrations of the contaminants, structural and physicochemical properties, and factors associated with the donor of the sample. We also present findings from a three-pronged analysis of ExHuMId and two other resources on human milk contaminants, with a focus on the Indian scenario. Through a comparative analysis with global chemical regulations and guidelines, we identify human milk contaminants of high concern, such as potential carcinogens, endocrine disruptors and neurotoxins. We then study the physicochemical properties of the contaminants to gain insights on their propensity to transfer into human milk. Lastly, we employ a systems biology approach to shed light on potential effects of human milk contaminants on maternal and infant health, by identifying contaminant-gene interactions associated with lactation, cytokine signalling and production, and protein-mediated transport. ExHuMId 1.0 is accessible online at: https://cb.imsc.res.in/exhumid/.
Affiliation(s)
- Janani Ravichandran
- The Institute of Mathematical Sciences (IMSc), Chennai, 600113, India; Homi Bhabha National Institute (HBNI), Mumbai, 400094, India
- S R Aparna
- The Institute of Mathematical Sciences (IMSc), Chennai, 600113, India
- Areejit Samal
- The Institute of Mathematical Sciences (IMSc), Chennai, 600113, India; Homi Bhabha National Institute (HBNI), Mumbai, 400094, India
6
Anderson PO, Sauberan JB. Modeling drug passage into human milk. Clin Pharmacol Ther 2016; 100:42-52. [PMID: 27060684 DOI: 10.1002/cpt.377] [Citation(s) in RCA: 75] [Impact Index Per Article: 9.4] [Received: 02/01/2016] [Revised: 03/16/2016] [Accepted: 04/01/2016] [Indexed: 01/16/2023]
Abstract
Breastfeeding has positive health consequences for both the breastfed infant and the nursing mother.(1,2) Although information on drug use during lactation is available through sites such as LactMed,(3) available information is often incomplete. Unlike pregnancy, in which large numbers of pregnant women need to be studied to assure safety, measurement of drug concentrations in breastmilk in relatively few subjects can provide valuable information to assess drug safety. This article reviews methods of measuring and predicting drug passage into breastmilk.
Affiliation(s)
- P O Anderson
- Health Sciences Clinical Professor, University of California San Diego, Skaggs School of Pharmacy and Pharmaceutical Sciences, La Jolla, California, USA
- J B Sauberan
- Neonatal Research Institute, Sharp Mary Birch Hospital for Women and Newborns, San Diego, California, USA
7
Zhan X, Jiang S, Yang Y, Liang J, Shi T, Li X. Inline Measurement of Particle Concentrations in Multicomponent Suspensions using Ultrasonic Sensor and Least Squares Support Vector Machines. Sensors 2015; 15:24109-24. [PMID: 26393611 PMCID: PMC4610515 DOI: 10.3390/s150924109] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.3] [Received: 07/31/2015] [Revised: 09/11/2015] [Accepted: 09/11/2015] [Indexed: 11/16/2022]
Abstract
This paper proposes an ultrasonic measurement system based on least squares support vector machines (LS-SVM) for inline measurement of particle concentrations in multicomponent suspensions. Firstly, the ultrasonic signals are analyzed and processed, and the optimal feature subset that contributes to the best model performance is selected based on the importance of features. Secondly, the LS-SVM model is tuned, trained and tested with different feature subsets to obtain the optimal model. In addition, a comparison is made between the partial least squares (PLS) model and the LS-SVM model. Finally, the optimal LS-SVM model with the optimal feature subset is applied to inline measurement of particle concentrations in the mixing process. The results show that the proposed method is reliable and accurate for inline measurement of particle concentrations in multicomponent suspensions, and that the measurement accuracy is sufficiently high for industrial application. Furthermore, the proposed method is applicable to dynamic modeling of nonlinear systems and provides a feasible way to monitor industrial processes.
Affiliation(s)
- Xiaobin Zhan
- State Key Laboratory of Digital Manufacturing Equipment and Technology, Huazhong University of Science and Technology, Wuhan 430074, China
- Shulan Jiang
- Tribology Research Institute, National Traction Power Laboratory, Southwest Jiaotong University, Chengdu 610031, China
- Yili Yang
- State Key Laboratory of Digital Manufacturing Equipment and Technology, Huazhong University of Science and Technology, Wuhan 430074, China
- Jian Liang
- State Key Laboratory of Digital Manufacturing Equipment and Technology, Huazhong University of Science and Technology, Wuhan 430074, China
- Tielin Shi
- State Key Laboratory of Digital Manufacturing Equipment and Technology, Huazhong University of Science and Technology, Wuhan 430074, China
- Xiwen Li
- State Key Laboratory of Digital Manufacturing Equipment and Technology, Huazhong University of Science and Technology, Wuhan 430074, China
8
Wu J, Fatah EEA, Mahfouz MR. Fully automatic initialization of two-dimensional-three-dimensional medical image registration using hybrid classifier. J Med Imaging (Bellingham) 2015; 2:024007. [PMID: 26158102 DOI: 10.1117/1.jmi.2.2.024007] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.9] [Received: 09/03/2014] [Accepted: 05/01/2015] [Indexed: 11/14/2022]
Abstract
X-ray video fluoroscopy along with two-dimensional-three-dimensional (2D-3D) registration techniques is widely used to study the in vivo kinematic behavior of joints. These techniques, however, are generally very sensitive to the initial alignment of the 3-D model. We present an automatic initialization method for 2D-3D registration of medical images. The contour of the knee bone or implant was first automatically extracted from a 2-D x-ray image. Shape descriptors were calculated by normalized elliptical Fourier descriptors to represent the contour shape. The optimal pose was then determined by a hybrid classifier combining k-nearest neighbors and support vector machine. The feasibility of the method was first validated on computer synthesized images, with 100% successful estimation for the femur and tibia implants, 92% for the femur and 95% for the tibia. The method was further validated on fluoroscopic x-ray images with all the poses of the testing cases successfully estimated. Finally, the method was evaluated as an initialization of a feature-based 2D-3D registration. The initialized and uninitialized registrations had success rates of 100% and 50%, respectively. The proposed method can be easily utilized for 2D-3D image registration on various medical objects and imaging modalities.
Affiliation(s)
- Jing Wu
- University of Tennessee, Institute of Biomedical Engineering, 1506 Middle Drive, Knoxville, Tennessee 37996-2000, United States
- Emam E Abdel Fatah
- University of Tennessee, Institute of Biomedical Engineering, 1506 Middle Drive, Knoxville, Tennessee 37996-2000, United States
- Mohamed R Mahfouz
- University of Tennessee, Institute of Biomedical Engineering, 1506 Middle Drive, Knoxville, Tennessee 37996-2000, United States
9
Prediction of Drug Transfer into Milk Considering Breast Cancer Resistance Protein (BCRP)-Mediated Transport. Pharm Res 2015; 32:2527-37. [PMID: 25690342 DOI: 10.1007/s11095-015-1641-2] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.1] [Received: 09/06/2014] [Accepted: 01/27/2015] [Indexed: 01/13/2023]
Abstract
PURPOSE: Drug transfer into milk is of concern due to the unnecessary exposure of infants to drugs. Proposed prediction methods for such transfer assume only passive drug diffusion across the mammary epithelium. This study reorganized data from the literature to assess the contribution of carrier-mediated transport to drug transfer into milk, and to improve the predictability thereof. METHODS: Milk-to-plasma drug concentration ratios (M/Ps) in humans were exhaustively collected from the literature and converted into observed unbound concentration ratios (M/P_unbound,obs). The ratios were also predicted based on passive diffusion across the mammary epithelium (M/P_unbound,pred). An in vitro transport assay was performed for selected drugs in breast cancer resistance protein (BCRP)-expressing cell monolayers. RESULTS: M/P_unbound,obs and M/P_unbound,pred values were compared for 166 drugs. M/P_unbound,obs values were 1.5 times or more higher than M/P_unbound,pred values for as many as 13 out of 16 known BCRP substrates, reconfirming BCRP as the predominant transporter contributing to secretory transfer of drugs into milk. Predictability of M/P values for selected BCRP substrates and non-substrates was improved by considering in vitro-evaluated BCRP-mediated transport relative to passive diffusion alone. CONCLUSIONS: The current analysis improved the predictability of drug transfer into milk, particularly for BCRP substrates, based on an exhaustive data overhaul followed by focused in vitro transport experimentation.
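The comparison described in the results, flagging drugs whose observed unbound ratio is at least 1.5 times the passive-diffusion prediction, can be sketched as a simple filter. This is an illustration under stated assumptions, not the authors' procedure: the function name `flag_underpredicted`, the compound names, and every numeric value are placeholders invented for the example.

```python
def flag_underpredicted(ratios, fold_threshold=1.5):
    """Return names whose observed unbound M/P is at least
    fold_threshold times the predicted (passive-diffusion) value."""
    return [
        name
        for name, (observed, predicted) in ratios.items()
        if observed >= fold_threshold * predicted
    ]


# Placeholder compounds and (observed, predicted) values, invented for illustration.
example = {
    "drug_A": (3.0, 1.0),    # observed is 3.0x predicted, flagged
    "drug_B": (1.2, 1.0),    # within 1.5x, not flagged
    "drug_C": (0.75, 0.5),   # exactly 1.5x, flagged
}
print(flag_underpredicted(example))
```

Compounds flagged this way are candidates for carrier-mediated secretion, since passive diffusion alone underpredicts their transfer into milk.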
10
Abstract
The chemical structure of any drug determines its pharmacokinetics and pharmacodynamics. Detailed understanding of relationships between the drug chemical structure and individual disposition pathways (i.e., distribution and elimination) is required for efficient use of existing drugs and effective development of new drugs. Different approaches have been developed for this purpose, ranging from statistics-based quantitative structure-property (or structure-pharmacokinetic) relationships (QSPR) analysis to physiologically based pharmacokinetic (PBPK) models. This review critically analyzes currently available approaches for analysis and prediction of drug disposition on the basis of chemical structure. Models that can be used to predict different aspects of disposition are presented, including: (a) value of the individual pharmacokinetic parameter (e.g., clearance or volume of distribution), (b) efficiency of the specific disposition pathway (e.g., biliary drug excretion or cytochrome P450 3A4 metabolism), (c) accumulation in a specific organ or tissue (e.g., permeability of the placenta or accumulation in the brain), and (d) the whole-body disposition in the individual patients. Examples of presented pharmacological agents include "classical" low-molecular-weight compounds, biopharmaceuticals, and drugs encapsulated in specialized drug-delivery systems. The clinical efficiency of agents from all these groups can be suboptimal, because of inefficient permeability of the drug to the site of action and/or excessive accumulation in other organs and tissues. Therefore, robust and reliable approaches for chemical structure-based prediction of drug disposition are required to overcome these limitations. PBPK models are increasingly being used for prediction of drug disposition. These models can reflect the complex interplay of factors that determine drug disposition in a mechanistically correct fashion and can be combined with other approaches, for example QSPR-based prediction of drug permeability and metabolism, pharmacogenomic data and tools, pharmacokinetic-pharmacodynamic modeling approaches, etc. Moreover, the PBPK models enable detailed analysis of clinically relevant scenarios, for example the effect of the specific conditions on the time course of the analyzed drug in the individual organs and tissues, including the site of action. It is expected that further development of such combined approaches will increase their precision, enhance the effectiveness of drugs, and lead to individualized drug therapy for different patient populations (geriatric, pediatric, specific diseases, etc.).
11
Kar S, Roy K. Prediction of Milk/Plasma Concentration Ratios of Drugs and Environmental Pollutants Using In Silico Tools: Classification and Regression Based QSARs and Pharmacophore Mapping. Mol Inform 2013; 32:693-705. [DOI: 10.1002/minf.201300018] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Received: 02/11/2013] [Accepted: 04/17/2013] [Indexed: 11/12/2022]
12
Fatemi MH, Ghorbanzad’e M. Classification of drugs according to their milk/plasma concentration ratio. Eur J Med Chem 2010; 45:5051-5. [DOI: 10.1016/j.ejmech.2010.08.013] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.8] [Received: 04/07/2010] [Revised: 07/24/2010] [Accepted: 08/07/2010] [Indexed: 11/12/2022]
13
Sakiyama Y. The use of machine learning and nonlinear statistical tools for ADME prediction. Expert Opin Drug Metab Toxicol 2010; 5:149-69. [PMID: 19239395 DOI: 10.1517/17425250902753261] [Citation(s) in RCA: 29] [Impact Index Per Article: 2.1] [Indexed: 01/31/2023]
Abstract
Absorption, distribution, metabolism and excretion (ADME)-related failure of drug candidates is a major issue for the pharmaceutical industry today. Prediction of ADME by in silico tools has now become an inevitable paradigm to reduce cost and enhance efficiency in pharmaceutical research. Recently, machine learning as well as nonlinear statistical tools have been widely applied to predict routine ADME end points. To achieve accurate and reliable predictions, it would be a prerequisite to understand the concepts, mechanisms and limitations of these tools. Here, we have devised a small synthetic nonlinear data set to help understand the mechanism of machine learning by 2D-visualisation. We applied six new machine learning methods to four different data sets. The methods include Naive Bayes classifier, classification and regression tree, random forest, Gaussian process, support vector machine and k nearest neighbour. The results demonstrated that ensemble learning and kernel machine displayed greater accuracy of prediction than classical methods irrespective of the data set size. The importance of interaction with the engineering field is also addressed. The results described here provide insights into the mechanism of machine learning, which will enable appropriate usage in the future.
Affiliation(s)
- Yojiro Sakiyama
- Pharmacokinetics Dynamics Metabolism, Pfizer Global Research and Development, Sandwich Laboratories, Kent, UK
14
Lin M, Hu B, Chen L, Sun P, Fan Y, Wu P, Chen X. Computational identification of potential molecular interactions in Arabidopsis. Plant Physiology 2009; 151:34-46. [PMID: 19592425 PMCID: PMC2735983 DOI: 10.1104/pp.109.141317] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.4] [Received: 05/11/2009] [Accepted: 07/06/2009] [Indexed: 05/21/2023]
Abstract
Knowledge of the protein interaction network is useful to assist molecular mechanism studies. Several major repositories have been established to collect and organize reported protein interactions. Many interactions have been reported in several model organisms, yet a very limited number of plant interactions can thus far be found in these major databases. Computational identification of potential plant interactions, therefore, is desired to facilitate relevant research. In this work, we constructed a support vector machine model to predict potential Arabidopsis (Arabidopsis thaliana) protein interactions based on a variety of indirect evidence. In a 100-iteration bootstrap evaluation, the confidence of our predicted interactions was estimated to be 48.67%, and these interactions were expected to cover 29.02% of the entire interactome. The sensitivity of our model was validated with an independent evaluation data set consisting of newly reported interactions that did not overlap with the examples used in model training and testing. Results showed that our model successfully recognized 28.91% of the new interactions, similar to its expected sensitivity (29.02%). Applying this model to all possible Arabidopsis protein pairs resulted in 224,206 potential interactions, which is the largest and most accurate set of predicted Arabidopsis interactions at present. In order to facilitate the use of our results, we present the Predicted Arabidopsis Interactome Resource, with detailed annotations and more specific per interaction confidence measurements. This database and related documents are freely accessible at http://www.cls.zju.edu.cn/pair/.
Affiliation(s)
- Mingzhi Lin
- Department of Bioinformatics, Zhejiang University, Hangzhou, People's Republic of China, 310058
15
Yang XG, Chen D, Wang M, Xue Y, Chen YZ. Prediction of antibacterial compounds by machine learning approaches. J Comput Chem 2009; 30:1202-11. [DOI: 10.1002/jcc.21148] [Citation(s) in RCA: 34] [Impact Index Per Article: 2.3] [Indexed: 02/02/2023]