1
Chou JCC, Decosto CM, Chatterjee P, Dassama LMK. Rapid proteome-wide prediction of lipid-interacting proteins through ligand-guided structural genomics. bioRxiv 2024:2024.01.26.577452. [PMID: 38352308 PMCID: PMC10862712 DOI: 10.1101/2024.01.26.577452] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/20/2024]
Abstract
Lipids are primary metabolites that play essential roles in multiple cellular pathways. Alterations in lipid metabolism and transport are associated with infectious diseases and cancers. As such, proteins involved in lipid synthesis, trafficking, and modification are targets for therapeutic intervention. The ability to rapidly detect these proteins can accelerate their biochemical and structural characterization. However, it remains challenging to identify lipid binding motifs in proteins due to a lack of conservation at the amino acid level. Therefore, new bioinformatic tools that can detect conserved features in lipid binding sites are necessary. Here, we present Structure-based Lipid-interacting Pocket Predictor (SLiPP), a structural bioinformatics algorithm that uses machine learning to detect protein cavities capable of binding to lipids in experimental and AlphaFold-predicted protein structures. SLiPP, which can be used at proteome-wide scales, predicts lipid binding pockets with an accuracy of 96.8% and an F1 score of 86.9%. Our analyses revealed that the algorithm relies on hydrophobicity-related features to distinguish lipid binding pockets from those that bind to other ligands. Use of the algorithm to detect lipid binding proteins in the proteomes of various bacteria, yeast, and human has produced hits annotated or verified as lipid binding proteins, and many other uncharacterized proteins whose functions are not discernible from sequence alone. Because of its ability to identify novel lipid binding proteins, SLiPP can spur the discovery of new lipid metabolic and trafficking pathways that can be targeted for therapeutic development.
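The abstract reports accuracy and F1 score side by side. As a minimal sketch of how the two metrics are computed from a confusion matrix, and of why they can differ when positives are rare, the following uses invented counts; the function name `accuracy_and_f1` and all numbers are hypothetical and are not taken from the SLiPP paper.

```python
def accuracy_and_f1(tp, fp, fn, tn):
    """Return (accuracy, F1) for a binary classifier from confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    # Precision and recall only look at the positive class.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return accuracy, f1


# Hypothetical, imbalanced example: 95 true positives among 1000 pockets.
acc, f1 = accuracy_and_f1(tp=80, fp=10, fn=15, tn=895)
print(f"accuracy={acc:.3f}, F1={f1:.3f}")
```

Because true negatives dominate the total, accuracy stays high even when a noticeable share of positives is missed, while F1 ignores true negatives and therefore comes out lower.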
Affiliation(s)
- Jonathan Chiu-Chun Chou
- Department of Chemistry and Sarafan ChEM-H Institute, Stanford University, Stanford, CA 94305
- Cassandra M. Decosto
- Department of Chemistry and Sarafan ChEM-H Institute, Stanford University, Stanford, CA 94305
- Poulami Chatterjee
- Department of Chemistry and Sarafan ChEM-H Institute, Stanford University, Stanford, CA 94305
- Laura M. K. Dassama
- Department of Chemistry and Sarafan ChEM-H Institute, Stanford University, Stanford, CA 94305
- Department of Microbiology and Immunology, Stanford School of Medicine, Stanford, CA 94305
2
Liu Y, Munteanu CR, Kong Z, Ran T, Sahagún-Ruiz A, He Z, Zhou C, Tan Z. Identification of coenzyme-binding proteins with machine learning algorithms. Comput Biol Chem 2019; 79:185-192. [PMID: 30851647 DOI: 10.1016/j.compbiolchem.2019.01.014] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 04/02/2018] [Revised: 09/11/2018] [Accepted: 01/25/2019] [Indexed: 01/12/2023]
Abstract
Coenzyme-binding proteins play a vital role in cellular metabolic processes, such as fatty acid biosynthesis, enzyme and gene regulation, lipid synthesis, particular vesicular traffic, and β-oxidation donation of acyl-CoA esters. Based on the theory of Star Graph Topological Indices (SGTIs) of protein primary sequences, we propose a method to develop a first classification model for predicting proteins with coenzyme-binding properties. To model the properties of coenzyme-binding proteins, we created a dataset containing 2897 proteins, of which 456 have coenzyme-binding activity. The SGTIs of the peptide sequences were calculated with the Sequence to Star Network (S2SNet) application. We used the SGTIs as inputs to several classification techniques in the machine learning software Weka. A Random Forest classifier based on 3 features of the embedded and non-embedded graphs was identified as the best predictive model for coenzyme-binding proteins. This model achieved a true positive (TP) rate of 91.7%, a false positive (FP) rate of 7.6%, and an Area Under the Receiver Operating Characteristic Curve (AUROC) of 0.971. The prediction of new coenzyme-binding proteins using this model could be useful for further drug development or enzyme metabolism research.
Affiliation(s)
- Yong Liu
- Key Laboratory for Agro-Ecological Processes in Subtropical Region, National Engineering Laboratory for Pollution Control and Waste Utilization in Livestock and Poultry Production, South Central Experimental Station of Animal Nutrition and Feed Science in the Ministry of Agriculture, Institute of Subtropical Agriculture, The Chinese Academy of Sciences, Changsha, Hunan, 410125, PR China; Hunan Co-Innovation Center of Animal Production Safety, CICAPS, Changsha, Hunan, 410128, PR China
- Cristian R Munteanu
- RNASA-IMEDIR, Computer Science Faculty, University of A Coruna, A Coruña, Spain; Biomedical Research Institute of A Coruña (INIBIC), University Hospital Complex of A Coruña (CHUAC), A Coruña, 15006, Spain
- Zhiwei Kong
- Key Laboratory for Agro-Ecological Processes in Subtropical Region, National Engineering Laboratory for Pollution Control and Waste Utilization in Livestock and Poultry Production, South Central Experimental Station of Animal Nutrition and Feed Science in the Ministry of Agriculture, Institute of Subtropical Agriculture, The Chinese Academy of Sciences, Changsha, Hunan, 410125, PR China; University of the Chinese Academy of Sciences, Beijing, 100049, PR China
- Tao Ran
- Key Laboratory for Agro-Ecological Processes in Subtropical Region, National Engineering Laboratory for Pollution Control and Waste Utilization in Livestock and Poultry Production, South Central Experimental Station of Animal Nutrition and Feed Science in the Ministry of Agriculture, Institute of Subtropical Agriculture, The Chinese Academy of Sciences, Changsha, Hunan, 410125, PR China; Lethbridge Research and Development Centre, Agriculture and Agri-Food Canada, Lethbridge, Alberta, T1J 4B1, Canada
- Alfredo Sahagún-Ruiz
- Department of Microbiology and Immunology, Faculty of Veterinary Medicine and Animal Science, National Autonomous University of Mexico, Universidad 3000, Copilco Coyoacán, CP 04510, México D.F., Mexico
- Zhixiong He
- Key Laboratory for Agro-Ecological Processes in Subtropical Region, National Engineering Laboratory for Pollution Control and Waste Utilization in Livestock and Poultry Production, South Central Experimental Station of Animal Nutrition and Feed Science in the Ministry of Agriculture, Institute of Subtropical Agriculture, The Chinese Academy of Sciences, Changsha, Hunan, 410125, PR China; Hunan Co-Innovation Center of Animal Production Safety, CICAPS, Changsha, Hunan, 410128, PR China.
- Chuanshe Zhou
- Key Laboratory for Agro-Ecological Processes in Subtropical Region, National Engineering Laboratory for Pollution Control and Waste Utilization in Livestock and Poultry Production, South Central Experimental Station of Animal Nutrition and Feed Science in the Ministry of Agriculture, Institute of Subtropical Agriculture, The Chinese Academy of Sciences, Changsha, Hunan, 410125, PR China; Hunan Co-Innovation Center of Animal Production Safety, CICAPS, Changsha, Hunan, 410128, PR China
- Zhiliang Tan
- Key Laboratory for Agro-Ecological Processes in Subtropical Region, National Engineering Laboratory for Pollution Control and Waste Utilization in Livestock and Poultry Production, South Central Experimental Station of Animal Nutrition and Feed Science in the Ministry of Agriculture, Institute of Subtropical Agriculture, The Chinese Academy of Sciences, Changsha, Hunan, 410125, PR China; Hunan Co-Innovation Center of Animal Production Safety, CICAPS, Changsha, Hunan, 410128, PR China
3
Akbar S, Hayat M, Iqbal M, Jan MA. iACP-GAEnsC: Evolutionary genetic algorithm based ensemble classification of anticancer peptides by utilizing hybrid feature space. Artif Intell Med 2017; 79:62-70. [PMID: 28655440 DOI: 10.1016/j.artmed.2017.06.008] [Citation(s) in RCA: 94] [Impact Index Per Article: 13.4] [Received: 04/26/2017] [Revised: 06/12/2017] [Accepted: 06/16/2017] [Indexed: 01/10/2023]
Abstract
Cancer is a fatal disease, responsible for one-quarter of all deaths in developed countries. Traditional anticancer therapies, such as chemotherapy and radiation, are highly expensive, susceptible to errors, and ineffective. These conventional techniques induce severe side effects on human cells. Due to the perilous impact of cancer, the development of an accurate and highly efficient intelligent computational model is desirable for the identification of anticancer peptides. In this paper, an evolutionary intelligent genetic algorithm-based ensemble model, 'iACP-GAEnsC', is proposed for the identification of anticancer peptides. In this model, the protein sequences are formulated using three different discrete feature representation methods, i.e., amphiphilic pseudo amino acid composition, g-gap dipeptide composition, and reduced amino acid alphabet composition. The performance of the extracted feature spaces is investigated separately, and the feature spaces are then merged to exhibit the significance of hybridization. In addition, the predicted results of individual classifiers are combined using an optimized genetic algorithm and a simple majority technique in order to enhance the true classification rate. It is observed that genetic algorithm-based ensemble classification outperforms individual classifiers as well as the simple majority voting-based ensemble. The highest performance of genetic algorithm-based ensemble classification is reported on the hybrid feature space, with an accuracy of 96.45%. In comparison to the existing techniques, the 'iACP-GAEnsC' model has achieved remarkable improvement in terms of various performance metrics. Based on the simulation results, it is observed that the 'iACP-GAEnsC' model might be a leading tool in the field of drug design and proteomics for researchers.
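The abstract contrasts a simple majority vote with an ensemble whose combination is optimized by a genetic algorithm. The sketch below illustrates only the difference between unweighted and weighted voting; it does not implement a genetic algorithm and is not the authors' method. The function names, the predictions, and the weights (imagined as the output of some optimizer) are all hypothetical.

```python
def majority_vote(predictions):
    """Combine binary labels from several classifiers by simple majority.

    predictions: list of per-classifier label lists (0 or 1), all the same length.
    """
    n = len(predictions)
    return [1 if 2 * sum(column) > n else 0 for column in zip(*predictions)]


def weighted_vote(predictions, weights, threshold=0.5):
    """Combine binary labels using one weight per classifier."""
    total = sum(weights)
    combined = []
    for column in zip(*predictions):
        score = sum(w * p for w, p in zip(weights, column)) / total
        combined.append(1 if score > threshold else 0)
    return combined


# Hypothetical predictions from three classifiers on four peptides.
preds = [
    [1, 0, 1, 0],  # classifier A
    [0, 0, 1, 1],  # classifier B
    [0, 1, 1, 1],  # classifier C
]
majority = majority_vote(preds)
weighted = weighted_vote(preds, weights=[3, 1, 1])  # assumed weights favoring A
print(majority, weighted)
```

With equal votes the two agreeing classifiers decide each label, whereas a weight vector that favors a more reliable classifier can overturn them. Such weights are the kind of parameter an optimizer like a genetic algorithm could tune.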
Affiliation(s)
- Shahid Akbar
- Department of Computer Science, Abdul Wali Khan University Mardan, KP 23200, Pakistan.
- Maqsood Hayat
- Department of Computer Science, Abdul Wali Khan University Mardan, KP 23200, Pakistan.
- Muhammad Iqbal
- Department of Computer Science, Abdul Wali Khan University Mardan, KP 23200, Pakistan.
- Mian Ahmad Jan
- Department of Computer Science, Abdul Wali Khan University Mardan, KP 23200, Pakistan.
4
Sánchez-Ovejero C, Benito-Lopez F, Díez P, Casulli A, Siles-Lucas M, Fuentes M, Manzano-Román R. Sensing parasites: Proteomic and advanced bio-detection alternatives. J Proteomics 2016; 136:145-56. [PMID: 26773860 DOI: 10.1016/j.jprot.2015.12.030] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.5] [Received: 08/28/2015] [Revised: 12/22/2015] [Accepted: 12/29/2015] [Indexed: 12/12/2022]
Abstract
Parasitic diseases have a great impact on human and animal health. The gold standard for the diagnosis of the majority of parasitic infections is still conventional microscopy, which presents important limitations in terms of sensitivity and specificity and commonly requires highly trained technicians. More accurate molecular-based diagnostic tools are needed for the implementation of early detection, effective treatments and massive screenings with high-throughput capacities. In this respect, sensitive and affordable devices could greatly impact the sustainable control programmes that exist against parasitic diseases, especially in low-income settings. Proteomics and nanotechnology approaches are valuable tools for sensing pathogens and host alteration signatures within microfluidic detection platforms. These new devices might provide novel solutions to fight parasitic diseases. Newly described specific parasite-derived products with immune-modulatory properties have been postulated as the best candidates for the early and accurate detection of parasitic infections as well as for the blockage of parasite development. This review provides the most recent methodological and technological advances with great potential for bio-sensing parasites in their hosts, showing the newest opportunities offered by modern "-omics" and platforms for parasite detection and control.
Affiliation(s)
- Carlos Sánchez-Ovejero
- Instituto de Recursos Naturales y Agrobiología de Salamanca (IRNASA-CSIC), 37008 Salamanca, Spain
- Fernando Benito-Lopez
- Analytical Chemistry Department, Universidad del País Vasco UPV/EHU, 01006 Vitoria-Gasteiz, Spain
- Paula Díez
- Department of Medicine and General Cytometry Service-Nucleus, Cancer Research Centre (IBMCC/CSIC/USAL/IBSAL), 37007 Salamanca, Spain; Proteomics Unit, Cancer Research Centre (IBMCC/CSIC/USAL/IBSAL), 37007 Salamanca, Spain
- Adriano Casulli
- Department of Infectious, Parasitic and Immunomediated Diseases, Istituto Superiore di Sanità, Viale Regina Elena 299, 00161 Rome, Italy
- Mar Siles-Lucas
- Instituto de Recursos Naturales y Agrobiología de Salamanca (IRNASA-CSIC), 37008 Salamanca, Spain
- Manuel Fuentes
- Department of Medicine and General Cytometry Service-Nucleus, Cancer Research Centre (IBMCC/CSIC/USAL/IBSAL), 37007 Salamanca, Spain; Proteomics Unit, Cancer Research Centre (IBMCC/CSIC/USAL/IBSAL), 37007 Salamanca, Spain.
- Raúl Manzano-Román
- Instituto de Recursos Naturales y Agrobiología de Salamanca (IRNASA-CSIC), 37008 Salamanca, Spain.
5
Dhiman K, Agarwal SM. NPred: QSAR classification model for identifying plant based naturally occurring anti-cancerous inhibitors. RSC Adv 2016. [DOI: 10.1039/c6ra02772e] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.1] [Indexed: 01/29/2023]
Abstract
Prediction of naturally occurring plant based compounds as anticancer agents is the key to developing new chemical entities in the area of therapeutic oncology. A webserver for assessing anticancer potential of phytomolecules has been developed.
Affiliation(s)
- Kanika Dhiman
- Bioinformatics Division, Institute of Cytology and Preventive Oncology, Noida-201301, India
- Subhash Mohan Agarwal
- Bioinformatics Division, Institute of Cytology and Preventive Oncology, Noida-201301, India
6
Liu Y, Munteanu CR, Fernández Blanco E, Tan Z, Santos Del Riego A, Pazos A. Prediction of Nucleotide Binding Peptides Using Star Graph Topological Indices. Mol Inform 2015; 34:736-41. [PMID: 27491034 DOI: 10.1002/minf.201500064] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Received: 05/29/2015] [Accepted: 07/06/2015] [Indexed: 01/14/2023]
Abstract
Nucleotide binding proteins are involved in many important cellular processes, such as transmission of genetic information or energy transfer and storage. Therefore, the screening of new peptides for this biological function is an important research topic. The current study proposes a mixed methodology to obtain the first classification model that is able to predict new nucleotide binding peptides using only the amino acid sequence. The methodology uses a Star graph molecular descriptor of the peptide sequences and machine learning techniques to find the best classifier. The best model is a Random Forest classifier based on two features of the embedded and non-embedded graphs. The performance of the model is excellent, considering similar models in the field, with an Area Under the Receiver Operating Characteristic Curve (AUROC) value of 0.938 and a true positive rate (TPR) of 0.886 (test subset). The prediction of new nucleotide binding peptides with this model could be useful for drug target studies in drug development.
Affiliation(s)
- Yong Liu
- Department of Information and Communication Technologies, Computer Science Faculty, University of A Coruna, Campus de Elviña s/n, 15071, A Coruña, Spain, phone/fax: +34-981167000/+34-981167160; Faculty of Veterinary Medicine and Animal Science, Autonomous University of the State of Mexico, Toluca, 50090, México; Key Laboratory of Subtropical Agro-ecological Engineering, Institute of Subtropical Agriculture, the Chinese Academy of Sciences, Changsha, Hunan, 410125, P. R. China
- Cristian R Munteanu
- Department of Information and Communication Technologies, Computer Science Faculty, University of A Coruna, Campus de Elviña s/n, 15071, A Coruña, Spain, phone/fax: +34-981167000/+34-981167160.
- Enrique Fernández Blanco
- Department of Information and Communication Technologies, Computer Science Faculty, University of A Coruna, Campus de Elviña s/n, 15071, A Coruña, Spain, phone/fax: +34-981167000/+34-981167160
- Zhiliang Tan
- Key Laboratory of Subtropical Agro-ecological Engineering, Institute of Subtropical Agriculture, the Chinese Academy of Sciences, Changsha, Hunan, 410125, P. R. China
- Antonino Santos Del Riego
- Department of Information and Communication Technologies, Computer Science Faculty, University of A Coruna, Campus de Elviña s/n, 15071, A Coruña, Spain, phone/fax: +34-981167000/+34-981167160
- Alejandro Pazos
- Department of Information and Communication Technologies, Computer Science Faculty, University of A Coruna, Campus de Elviña s/n, 15071, A Coruña, Spain, phone/fax: +34-981167000/+34-981167160
7
Markov mean properties for cell death-related protein classification. J Theor Biol 2014; 349:12-21. [DOI: 10.1016/j.jtbi.2014.01.033] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.9] [Received: 11/05/2013] [Revised: 01/21/2014] [Accepted: 01/24/2014] [Indexed: 11/18/2022]
8
Chen L, Lu J, Luo X, Feng KY. Prediction of drug target groups based on chemical–chemical similarities and chemical–chemical/protein connections. Biochimica et Biophysica Acta - Proteins and Proteomics 2014; 1844:207-13. [DOI: 10.1016/j.bbapap.2013.05.021] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.3] [Received: 11/29/2012] [Revised: 05/20/2013] [Accepted: 05/22/2013] [Indexed: 10/26/2022]
9
Fernandez-Lozano C, Fernández-Blanco E, Dave K, Pedreira N, Gestal M, Dorado J, Munteanu CR. Improving enzyme regulatory protein classification by means of SVM-RFE feature selection. Molecular BioSystems 2014; 10:1063-71. [DOI: 10.1039/c3mb70489k] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.7] [Indexed: 01/23/2023]
10
Speck-Planche A, Kleandrova VV, Cordeiro MND. New insights toward the discovery of antibacterial agents: Multi-tasking QSBER model for the simultaneous prediction of anti-tuberculosis activity and toxicological profiles of drugs. Eur J Pharm Sci 2013; 48:812-8. [DOI: 10.1016/j.ejps.2013.01.011] [Citation(s) in RCA: 36] [Impact Index Per Article: 3.3] [Received: 10/08/2012] [Revised: 01/05/2013] [Accepted: 01/23/2013] [Indexed: 01/11/2023]
11
Speck-Planche A, Kleandrova VV, Luan F, Cordeiro MNDS. A ligand-based approach for the in silico discovery of multi-target inhibitors for proteins associated with HIV infection. Molecular BioSystems 2012; 8:2188-96. [PMID: 22688327 DOI: 10.1039/c2mb25093d] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.8] [Indexed: 01/05/2023]
Abstract
Acquired immunodeficiency syndrome (AIDS) is a dangerous disease, which damages the immune system cells to the point that the immune system can no longer fight against other infections that it would usually be able to prevent. The causal agent is the human immunodeficiency virus (HIV), and for this reason, the search for more effective chemotherapies against HIV is a challenge for the scientific community. Chemoinformatics and Quantitative Structure-Activity Relationship (QSAR) studies have played an essential role in the design of potent inhibitors for proteins associated with the HIV infection. However, all previous studies took into consideration the discovery of future drug candidates using homogeneous series of compounds against only one protein. This fact limits the use of more efficient anti-HIV chemotherapies. In this work, we develop the first ligand-based approach for the in silico design of multi-target (mt) inhibitors for seven key proteins associated with the HIV infection. Two mt-QSAR models were constructed from a large and heterogeneous database of compounds. The first model was based on linear discriminant analysis (mt-QSAR-LDA) employing fragment-based descriptors. The second model was obtained using artificial neural networks (mt-QSAR-ANN) with global 2D descriptors. Both models correctly classified more than 90% of active and inactive compounds in training and prediction sets. Some fragments were extracted and their contributions to anti-HIV activity through inhibition of the different proteins were calculated using the mt-QSAR-LDA model. New molecules designed from fragments with positive contributions were suggested and correctly predicted by the two models as possible potent and versatile anti-HIV agents.
Affiliation(s)
- Alejandro Speck-Planche
- REQUIMTE/Department of Chemistry and Biochemistry, University of Porto, 4169-007 Porto, Portugal.