1
Mao M, Ahrens L, Luka J, Contreras F, Kurkina T, Bienstein M, Sárria Pereira de Passos M, Schirinzi G, Mehn D, Valsesia A, Desmet C, Serra MÁ, Gilliland D, Schwaneberg U. Material-specific binding peptides empower sustainable innovations in plant health, biocatalysis, medicine and microplastic quantification. Chem Soc Rev 2024; 53:6445-6510. [PMID: 38747901 DOI: 10.1039/d2cs00991a] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/30/2024]
Abstract
Material-binding peptides (MBPs) have emerged as a diverse and innovation-enabling class of peptides in applications such as plant-/human health, immobilization of catalysts, bioactive coatings, accelerated polymer degradation and analytics for micro-/nanoplastics quantification. Progress has been fuelled by recent advancements in protein engineering methodologies and advances in computational and analytical methodologies, which allow the design of, for instance, material-specific MBPs with fine-tuned binding strength for numerous demands in material science applications. A genetic or chemical conjugation of second (biological, chemical or physical property-changing) functionality to MBPs empowers the design of advanced (hybrid) materials, bioactive coatings and analytical tools. In this review, we provide a comprehensive overview comprising naturally occurring MBPs and their function in nature, binding properties of short man-made MBPs (<20 amino acids) mainly obtained from phage-display libraries, and medium-sized binding peptides (20-100 amino acids) that have been reported to bind to metals, polymers or other industrially produced materials. The goal of this review is to provide an in-depth understanding of molecular interactions between materials and material-specific binding peptides, and thereby empower the use of MBPs in material science applications. Protein engineering methodologies and selected examples to tailor MBPs toward applications in agriculture with a focus on plant health, biocatalysis, medicine and environmental monitoring serve as examples of the transformative power of MBPs for various industrial applications. An emphasis will be given to MBPs' role in detecting and quantifying microplastics in high throughput, distinguishing microplastics from other environmental particles, and thereby assisting to close an analytical gap in food safety and monitoring of environmental plastic pollution. 
In essence, this review aims to provide an overview among researchers from diverse disciplines in respect to material-(specific) binding of MBPs, protein engineering methodologies to tailor their properties to application demands, re-engineering for material science applications using MBPs, and thereby inspire researchers to employ MBPs in their research.
Affiliation(s)
- Maochao Mao
  - Lehrstuhl für Biotechnologie, RWTH Aachen University, Worringerweg 3, 52074 Aachen, Germany.
- Leon Ahrens
  - Lehrstuhl für Biotechnologie, RWTH Aachen University, Worringerweg 3, 52074 Aachen, Germany.
- Julian Luka
  - Lehrstuhl für Biotechnologie, RWTH Aachen University, Worringerweg 3, 52074 Aachen, Germany.
- Francisca Contreras
  - Lehrstuhl für Biotechnologie, RWTH Aachen University, Worringerweg 3, 52074 Aachen, Germany.
- Tetiana Kurkina
  - Lehrstuhl für Biotechnologie, RWTH Aachen University, Worringerweg 3, 52074 Aachen, Germany.
- Marian Bienstein
  - Lehrstuhl für Biotechnologie, RWTH Aachen University, Worringerweg 3, 52074 Aachen, Germany.
- Dora Mehn
  - European Commission, Joint Research Centre (JRC), Ispra, Italy
- Andrea Valsesia
  - European Commission, Joint Research Centre (JRC), Ispra, Italy
- Cloé Desmet
  - European Commission, Joint Research Centre (JRC), Ispra, Italy
- Ulrich Schwaneberg
  - Lehrstuhl für Biotechnologie, RWTH Aachen University, Worringerweg 3, 52074 Aachen, Germany.
2
Xu J, Ruan X, Yang J, Hu B, Li S, Hu J. SME-MFP: A novel spatiotemporal neural network with multiangle initialization embedding toward multifunctional peptides prediction. Comput Biol Chem 2024; 109:108033. [PMID: 38412804 DOI: 10.1016/j.compbiolchem.2024.108033] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/11/2023] [Revised: 01/09/2024] [Accepted: 02/17/2024] [Indexed: 02/29/2024]
Abstract
As a promising alternative to conventional antibiotic drugs in the biomedical field, functional peptides have been widely used in disease treatment owing to their low toxicity, high absorption rate, and biological activity. Recently, several machine learning methods have been developed for functional peptide prediction. However, most research relies heavily on statistical features, and few studies consider multifunctional peptide identification. We therefore propose SME-MFP, a novel predictor for imbalanced multi-label functional peptide datasets. First, we employ physicochemical and evolutionary information to represent the peptide sequence's initialization features from multiple perspectives. Second, the features are fused and then fed into spatial feature extractors, where residual connections and a multiscale convolutional neural network extract more discriminative features from peptide sequences of different lengths. In addition, we design AFT-based temporal feature extractors to fully capture the global interactions of the sequences. Finally, we devise a new loss to replace the traditional cross-entropy loss and address the class imbalance problem. The results show that our framework not only enhances the model's ability to capture sequence features effectively but also improves accuracy by 3.89% over existing methods on public peptide datasets.
Affiliation(s)
- Jing Xu
  - State Key Laboratory of Public Big Data, Guizhou University, Guiyang 550025, China
- Xiaoli Ruan
  - State Key Laboratory of Public Big Data, Guizhou University, Guiyang 550025, China.
- Jing Yang
  - State Key Laboratory of Public Big Data, Guizhou University, Guiyang 550025, China
- Bingqi Hu
  - State Key Laboratory of Public Big Data, Guizhou University, Guiyang 550025, China
- Shaobo Li
  - State Key Laboratory of Public Big Data, Guizhou University, Guiyang 550025, China
- Jianjun Hu
  - Department of Computer Science and Engineering, University of South Carolina, Columbia 29208, USA
3
Meng C, Yuan Y, Zhao H, Pei Y, Li Z. IIFS: An improved incremental feature selection method for protein sequence processing. Comput Biol Med 2023; 167:107654. [PMID: 37944304 DOI: 10.1016/j.compbiomed.2023.107654] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/01/2023] [Revised: 10/09/2023] [Accepted: 10/31/2023] [Indexed: 11/12/2023]
Abstract
MOTIVATION: Discrete features can be obtained from protein sequences using a feature extraction method. These features are the basis of downstream processing of protein data, but because they are generally redundant, the important ones must be screened and selected. RESULT: Here, we report IIFS, an improved incremental feature selection method that exploits a new subset search strategy to find the optimal feature set. IIFS combines nonadjacent sorted features to avoid the drawbacks of data explosion and excessive reliance on feature sorting results. Comparative experiments on 27 feature sorting datasets show that IIFS can find more accurate and important features than existing methods. The IIFS approach also handles data redundancy more efficiently and finds more representative and discriminatory features while ensuring minimal feature dimensionality and good evaluation metrics. Moreover, we wrap this method and deploy it on a web server for access at http://112.124.26.17:8005/.
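The abstract does not spell out the IIFS search strategy, so the following is only a minimal sketch of the general idea under stated assumptions. It contrasts classic incremental feature selection (which scores every prefix of a ranked feature list) with a simple greedy variant that may skip a feature, so the kept features need not be adjacent in the ranking. The function names, the `toy_score` evaluator, and the feature labels are all hypothetical and are not taken from the paper.

```python
from typing import Callable, List, Sequence, Tuple

Scorer = Callable[[List[str]], float]


def prefix_ifs(ranked: Sequence[str], score: Scorer) -> Tuple[List[str], float]:
    """Classic IFS: score every prefix of the ranked list and keep the best one."""
    best_subset: List[str] = []
    best_score = float("-inf")
    for k in range(1, len(ranked) + 1):
        subset = list(ranked[:k])
        s = score(subset)
        if s > best_score:
            best_subset, best_score = subset, s
    return best_subset, best_score


def skipping_ifs(ranked: Sequence[str], score: Scorer) -> Tuple[List[str], float]:
    """Greedy variant (an assumption, not the published IIFS): keep a feature
    only if adding it improves the score, so kept features may be nonadjacent."""
    chosen: List[str] = []
    best_score = float("-inf")
    for feature in ranked:
        s = score(chosen + [feature])
        if s > best_score:
            chosen, best_score = chosen + [feature], s
    return chosen, best_score


# Hypothetical evaluator: useful features add to the score, redundant ones cost a little.
USEFUL = {"f1": 0.30, "f3": 0.25, "f4": 0.10}


def toy_score(subset: List[str]) -> float:
    gain = sum(USEFUL.get(f, 0.0) for f in subset)
    penalty = 0.05 * sum(1 for f in subset if f not in USEFUL)
    return 0.5 + gain - penalty


ranked_features = ["f1", "f2", "f3", "f4", "f5"]
prefix_subset, prefix_best = prefix_ifs(ranked_features, toy_score)
skip_subset, skip_best = skipping_ifs(ranked_features, toy_score)
```

In this toy setting the prefix search is forced to carry the redundant feature `f2` along, whereas the skipping variant drops it. In practice the scorer would be a cross-validated classifier metric rather than a lookup table.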
Affiliation(s)
- Chaolu Meng
  - College of Computer and Information Engineering, Inner Mongolia Agricultural University, Hohhot, China; Inner Mongolia Autonomous Region Key Laboratory of Big Data Research and Application of Agriculture and Animal Husbandry, China
- Ye Yuan
  - Beidahuang Industry Group General Hospital, Harbin, 150001, China
- Haiyan Zhao
  - College of Integration of Traditional Chinese and Western Medicine to Southwest Medical University, Luzhou, Sichuan, 646000, China
- Yue Pei
  - Computer Network Information Center, Chinese Academy of Sciences, Beijing, 100190, China
- Zhi Li
  - Department of Spleen and Stomach Diseases, The Affiliated Traditional Chinese Medicine Hospital of Southwest Medical University, Luzhou, Sichuan, 646000, China.
4
Bergman M, Xiao X, Hall CK. In Silico Design and Analysis of Plastic-Binding Peptides. J Phys Chem B 2023; 127:8370-8381. [PMID: 37735840 PMCID: PMC10591858 DOI: 10.1021/acs.jpcb.3c04319] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/23/2023]
Abstract
Peptides that bind to inorganic materials can be used to functionalize surfaces, control crystallization, or assist in interfacial self-assembly. In the past, inorganic-binding peptides have been found predominantly through peptide library screening. While this method has successfully identified peptides that bind to a variety of materials, an alternative design approach that can intelligently search for peptides and provide physical insight for peptide affinity would be desirable. In this work, we develop a computational, physics-based approach to design inorganic-binding peptides, focusing on peptides that bind to the common plastics polyethylene, polypropylene, polystyrene, and poly(ethylene terephthalate). The PepBD algorithm, a Monte Carlo method that samples peptide sequence and conformational space, was modified to include simulated annealing, relax hydration constraints, and an ensemble of conformations to initiate design. These modifications led to the discovery of peptides with significantly better scores compared to those obtained using the original PepBD. PepBD scores were found to improve with increasing van der Waals interactions, although strengthening the intermolecular van der Waals interactions comes at the cost of introducing unfavorable electrostatic interactions. The best designs are enriched in amino acids with bulky side chains and possess hydrophobic and hydrophilic patches whose location depends on the adsorbed conformation. Future work will evaluate the top peptide designs in molecular dynamics simulations and experiment, enabling their application in microplastic pollution remediation and plastic-based biosensors.
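To illustrate what "a Monte Carlo method with simulated annealing" over sequence space can look like, here is a minimal sketch under strong assumptions. It is not PepBD: it samples only the sequence (no conformations), and `toy_score` is a hypothetical stand-in that merely rewards bulky residues and penalizes charged ones, loosely echoing the trends the abstract reports. The cooling schedule, step count, and starting sequence are likewise illustrative choices.

```python
import math
import random
from typing import Iterable, Tuple

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
BULKY = set("FWYLIM")      # hypothetical "favourable" residues
CHARGED = set("DEKR")      # hypothetical "unfavourable" residues


def toy_score(seq: Iterable[str]) -> float:
    """Hypothetical score (lower is better); not a physical binding energy."""
    residues = list(seq)
    return (-1.0 * sum(1 for a in residues if a in BULKY)
            + 0.5 * sum(1 for a in residues if a in CHARGED))


def anneal_sequence(start: str, steps: int = 2000, t_start: float = 2.0,
                    t_end: float = 0.05, seed: int = 0) -> Tuple[str, float]:
    """Metropolis Monte Carlo over single-residue mutations with geometric cooling."""
    rng = random.Random(seed)
    current = list(start)
    current_score = toy_score(current)
    best, best_score = current.copy(), current_score
    for step in range(steps):
        # Temperature decays geometrically from t_start to t_end.
        t = t_start * (t_end / t_start) ** (step / max(1, steps - 1))
        proposal = current.copy()
        proposal[rng.randrange(len(proposal))] = rng.choice(AMINO_ACIDS)
        delta = toy_score(proposal) - current_score
        # Always accept improvements; accept worse moves with Boltzmann probability.
        if delta <= 0 or rng.random() < math.exp(-delta / t):
            current, current_score = proposal, current_score + delta
            if current_score < best_score:
                best, best_score = current.copy(), current_score
    return "".join(best), best_score


start_sequence = "GDKSAEGTRK"  # arbitrary illustrative starting peptide
designed, designed_score = anneal_sequence(start_sequence)
```

Accepting some uphill moves at high temperature lets the search escape local minima early on, while the low final temperature makes the late stage nearly greedy.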
Affiliation(s)
- Michael Bergman
  - Department of Chemical and Biomolecular Engineering, North Carolina State University, Raleigh, North Carolina, 27606, USA
- Xingqing Xiao
  - Department of Chemistry, School of Science, Hainan University, Longhua District, Haikou, Hainan, 571101, China
- Carol K. Hall
  - Department of Chemical and Biomolecular Engineering, North Carolina State University, Raleigh, North Carolina, 27606, USA
5
Petinrin OO, Saeed F, Toseef M, Liu Z, Basurra S, Muyide IO, Li X, Lin Q, Wong KC. Machine Learning in Metastatic Cancer Research: Potentials, Possibilities, and Prospects. Comput Struct Biotechnol J 2023; 21:2454-2470. [PMID: 37077177 PMCID: PMC10106342 DOI: 10.1016/j.csbj.2023.03.046] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 01/17/2023] [Revised: 03/26/2023] [Accepted: 03/27/2023] [Indexed: 03/31/2023] Open
Abstract
Cancer has received extensive recognition for its high mortality rate, with metastatic cancer being the top cause of cancer-related deaths. Metastatic cancer involves the spread of the primary tumor to other body organs. As much as the early detection of cancer is essential, the timely detection of metastasis, the identification of biomarkers, and treatment choice are valuable for improving the quality of life for metastatic cancer patients. This study reviews the existing studies on classical machine learning (ML) and deep learning (DL) in metastatic cancer research. Since the majority of metastatic cancer research data are collected in the formats of PET/CT and MRI image data, deep learning techniques are heavily involved. However, its black-box nature and expensive computational cost are notable concerns. Furthermore, existing models could be overestimated for their generality due to the non-diverse population in clinical trial datasets. Therefore, research gaps are itemized; follow-up studies should be carried out on metastatic cancer using machine learning and deep learning tools with data in a symmetric manner.
6
Dixit R, Khambhati K, Supraja KV, Singh V, Lederer F, Show PL, Awasthi MK, Sharma A, Jain R. Application of machine learning on understanding biomolecule interactions in cellular machinery. BIORESOURCE TECHNOLOGY 2023; 370:128522. [PMID: 36565819 DOI: 10.1016/j.biortech.2022.128522] [Citation(s) in RCA: 7] [Impact Index Per Article: 7.0] [Received: 10/29/2022] [Revised: 12/17/2022] [Accepted: 12/20/2022] [Indexed: 06/17/2023]
Abstract
Machine learning (ML) applications have become ubiquitous in all fields of research, including protein science and engineering. Apart from protein structure and mutation prediction, scientists are focusing on knowledge gaps with respect to the molecular mechanisms involved in protein binding and interactions with other components in experimental setups or the human body. Researchers are working on several wet-lab techniques and generating data for a better understanding of the concepts and mechanics involved. The resulting information, such as biomolecular structures, binding affinities, and structural fluctuations and movements, is enormous and can be handled and analyzed by ML. Therefore, this review highlights the significance of ML in understanding biomolecular interactions while assisting various fields of research such as drug discovery, nanomedicine, nanotoxicity and material science. Hence, the way ahead would be for laboratory work and computational techniques to go hand in hand.
Affiliation(s)
- Rewati Dixit
  - Waste Treatment Laboratory, Department of Biochemical Engineering and Biotechnology, Indian Institute of Technology Delhi, Haus-khas, New Delhi 110016, India
- Khushal Khambhati
  - Department of Biosciences, School of Science, Indrashil University, Rajpur, Mehsana 382715, Gujarat, India
- Kolli Venkata Supraja
  - Waste Treatment Laboratory, Department of Biochemical Engineering and Biotechnology, Indian Institute of Technology Delhi, Haus-khas, New Delhi 110016, India
- Vijai Singh
  - Department of Biosciences, School of Science, Indrashil University, Rajpur, Mehsana 382715, Gujarat, India
- Franziska Lederer
  - Helmholtz-Zentrum Dresden-Rossendorf, Helmholtz Institute Freiberg for Resource Technology, Bautzner landstrasse 400, 01328 Dresden, Germany
- Pau-Loke Show
  - Zhejiang Provincial Key Laboratory for Subtropical Water Environment and Marine Biological Resources Protection, Wenzhou University, Wenzhou 325035, China; Department of Sustainable Engineering, Saveetha School of Engineering, SIMATS, Chennai 602105, India; Department of Chemical and Environmental Engineering, University of Nottingham, Malaysia, 43500 Semenyih, Selangor Darul Ehsan, Malaysia
- Mukesh Kumar Awasthi
  - College of Natural Resources and Environment, Northwest A&F University, Yangling 712100, China
- Abhinav Sharma
  - Institute Theory of Polymers, Leibniz Institute for Polymer Research, Hohe Strasse 6, 01069 Dresden, Germany
- Rohan Jain
  - Helmholtz-Zentrum Dresden-Rossendorf, Helmholtz Institute Freiberg for Resource Technology, Bautzner landstrasse 400, 01328 Dresden, Germany.
7
Yue ZX, Yan TC, Xu HQ, Liu YH, Hong YF, Chen GX, Xie T, Tao L. A systematic review on the state-of-the-art strategies for protein representation. Comput Biol Med 2023; 152:106440. [PMID: 36543002 DOI: 10.1016/j.compbiomed.2022.106440] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 11/15/2022] [Revised: 12/08/2022] [Accepted: 12/15/2022] [Indexed: 12/23/2022]
Abstract
The study of drug-target protein interaction is a key step in drug research. In recent years, machine learning techniques have become attractive for research, including drug research, due to their automated nature, predictive power, and expected efficiency. Protein representation is a key step in the study of drug-target protein interaction by machine learning and plays a fundamental role in achieving accurate results. With the progress of machine learning, protein representation methods have gradually attracted attention and have consequently developed rapidly. Therefore, in this review, we systematically classify current protein representation methods, comprehensively review them, and discuss the latest advances of interest. According to their information extraction methods and information sources, these representation methods are generally divided into structure-based and sequence-based methods. Each primary class can be further divided into specific subcategories. The particular representation methods covered include both traditional and the latest approaches. This review contains a comprehensive assessment of the various methods, which researchers can use as a reference for their specific protein-related research requirements, including drug research.
Affiliation(s)
- Zi-Xuan Yue
  - Key Laboratory of Elemene Class Anti-cancer Chinese Medicines, School of Pharmacy, Hangzhou Normal University, Hangzhou, 311121, China
- Tian-Ci Yan
  - Key Laboratory of Elemene Class Anti-cancer Chinese Medicines, School of Pharmacy, Hangzhou Normal University, Hangzhou, 311121, China
- Hong-Quan Xu
  - Key Laboratory of Elemene Class Anti-cancer Chinese Medicines, School of Pharmacy, Hangzhou Normal University, Hangzhou, 311121, China
- Yu-Hong Liu
  - Key Laboratory of Elemene Class Anti-cancer Chinese Medicines, School of Pharmacy, Hangzhou Normal University, Hangzhou, 311121, China
- Yan-Feng Hong
  - Key Laboratory of Elemene Class Anti-cancer Chinese Medicines, School of Pharmacy, Hangzhou Normal University, Hangzhou, 311121, China
- Gong-Xing Chen
  - Key Laboratory of Elemene Class Anti-cancer Chinese Medicines, School of Pharmacy, Hangzhou Normal University, Hangzhou, 311121, China
- Tian Xie
  - Key Laboratory of Elemene Class Anti-cancer Chinese Medicines, School of Pharmacy, Hangzhou Normal University, Hangzhou, 311121, China.
- Lin Tao
  - Key Laboratory of Elemene Class Anti-cancer Chinese Medicines, School of Pharmacy, Hangzhou Normal University, Hangzhou, 311121, China.
8
Pei X, Luo Z, Qiao L, Xiao Q, Zhang P, Wang A, Sheldon RA. Putting precision and elegance in enzyme immobilisation with bio-orthogonal chemistry. Chem Soc Rev 2022; 51:7281-7304. [PMID: 35920313 DOI: 10.1039/d1cs01004b] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/21/2022]
Abstract
The covalent immobilisation of enzymes generally involves the use of highly reactive crosslinkers, such as glutaraldehyde, to couple enzyme molecules to each other or to carriers through, for example, the free amino groups of lysine residues on the enzyme surface. Unfortunately, such methods suffer from a lack of precision. Random formation of covalent linkages with reactive functional groups in the enzyme leads to disruption of the three-dimensional structure and accompanying activity losses. This review focuses on recent advances in the use of bio-orthogonal chemistry in conjunction with rec-DNA to effect highly precise immobilisation of enzymes. In this way, a cost-effective combination of production, purification and immobilisation of an enzyme is achieved in a single unit operation with a high degree of precision. Various bio-orthogonal techniques for putting this precision and elegance into enzyme immobilisation are elaborated. These include, for example, fusing (grafting) peptide or protein tags to the target enzyme that enable its immobilisation in cell lysate, or incorporating non-standard amino acids that enable the application of bio-orthogonal chemistry.
Affiliation(s)
- Xiaolin Pei
  - College of Materials, Chemistry and Chemical Engineering, Key Laboratory of Organosilicon Chemistry and Material Technology, Ministry of Education, Key Laboratory of Organosilicon Material Technology, Hangzhou Normal University, Zhejiang Province, Hangzhou, 311121, Zhejiang, P. R. China
- Zhiyuan Luo
  - College of Materials, Chemistry and Chemical Engineering, Key Laboratory of Organosilicon Chemistry and Material Technology, Ministry of Education, Key Laboratory of Organosilicon Material Technology, Hangzhou Normal University, Zhejiang Province, Hangzhou, 311121, Zhejiang, P. R. China
- Li Qiao
  - College of Materials, Chemistry and Chemical Engineering, Key Laboratory of Organosilicon Chemistry and Material Technology, Ministry of Education, Key Laboratory of Organosilicon Material Technology, Hangzhou Normal University, Zhejiang Province, Hangzhou, 311121, Zhejiang, P. R. China
- Qinjie Xiao
  - College of Materials, Chemistry and Chemical Engineering, Key Laboratory of Organosilicon Chemistry and Material Technology, Ministry of Education, Key Laboratory of Organosilicon Material Technology, Hangzhou Normal University, Zhejiang Province, Hangzhou, 311121, Zhejiang, P. R. China
- Pengfei Zhang
  - College of Materials, Chemistry and Chemical Engineering, Key Laboratory of Organosilicon Chemistry and Material Technology, Ministry of Education, Key Laboratory of Organosilicon Material Technology, Hangzhou Normal University, Zhejiang Province, Hangzhou, 311121, Zhejiang, P. R. China
- Anming Wang
  - College of Materials, Chemistry and Chemical Engineering, Key Laboratory of Organosilicon Chemistry and Material Technology, Ministry of Education, Key Laboratory of Organosilicon Material Technology, Hangzhou Normal University, Zhejiang Province, Hangzhou, 311121, Zhejiang, P. R. China
- Roger A Sheldon
  - Molecular Sciences Institute, School of Chemistry, University of the Witwatersrand, PO Wits, 2050, Johannesburg, South Africa; Department of Biotechnology, Section BOC, Delft University of Technology, van der Maasweg 9, 2629 HZ Delft, The Netherlands
9
Xu C, Zhang R, Duan M, Zhou Y, Bao J, Lu H, Wang J, Hu M, Hu Z, Zhou F, Zhu W. A polygenic stacking classifier revealed the complicated platelet transcriptomic landscape of adult immune thrombocytopenia. MOLECULAR THERAPY - NUCLEIC ACIDS 2022; 28:477-487. [PMID: 35505964 PMCID: PMC9046129 DOI: 10.1016/j.omtn.2022.04.004] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 12/23/2021] [Accepted: 04/01/2022] [Indexed: 01/19/2023]
Abstract
Immune thrombocytopenia (ITP) is an autoimmune disease with the typical symptom of a low platelet count in blood. ITP demonstrates age and sex biases in both occurrence and prognosis, and adult ITP is mainly induced by the living environment. The current diagnosis guideline lacks the integration of molecular heterogeneity. This study recruited the largest cohort of platelet transcriptome samples. A comprehensive procedure of feature selection, feature engineering, and stacking classification was carried out to detect the ITP biomarkers using RNA sequencing (RNA-seq) transcriptomes. The 40 detected biomarkers were loaded to train the final ITP detection model, with an overall accuracy of 0.974. The biomarkers suggested that ITP onset may be associated with various transcribed components, including protein-coding genes, long intergenic non-coding RNA (lincRNA) genes, and pseudogenes with apparent transcriptions. The delivered ITP detection model may also be utilized as a complementary ITP diagnosis tool. The code and the example dataset are freely available at http://www.healthinformaticslab.org/supp/resources.php
Affiliation(s)
- Chengfeng Xu
  - Department of Hematology, Yueyang Hospital of Integrated Traditional Chinese and Western Medicine, Shanghai University of Traditional Chinese Medicine, 110 Ganhe Road, Hongkou District, Shanghai 200437, China
- Ruochi Zhang
  - College of Computer Science and Technology, Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun, Jilin 130012, China
- Meiyu Duan
  - College of Computer Science and Technology, Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun, Jilin 130012, China
- Yongming Zhou
  - Department of Hematology, Yueyang Hospital of Integrated Traditional Chinese and Western Medicine, Shanghai University of Traditional Chinese Medicine, 110 Ganhe Road, Hongkou District, Shanghai 200437, China
- Jizhang Bao
  - Department of Hematology, Yueyang Hospital of Integrated Traditional Chinese and Western Medicine, Shanghai University of Traditional Chinese Medicine, 110 Ganhe Road, Hongkou District, Shanghai 200437, China
- Hao Lu
  - Department of Hematology, Yueyang Hospital of Integrated Traditional Chinese and Western Medicine, Shanghai University of Traditional Chinese Medicine, 110 Ganhe Road, Hongkou District, Shanghai 200437, China
- Jie Wang
  - Department of Hematology, Yueyang Hospital of Integrated Traditional Chinese and Western Medicine, Shanghai University of Traditional Chinese Medicine, 110 Ganhe Road, Hongkou District, Shanghai 200437, China
- Minghui Hu
  - Department of Hematology, Yueyang Hospital of Integrated Traditional Chinese and Western Medicine, Shanghai University of Traditional Chinese Medicine, 110 Ganhe Road, Hongkou District, Shanghai 200437, China
- Zhaoyang Hu
  - Fun-Med Pharmaceutical Technology (Shanghai) Co., Ltd., RM. A310, 115 Xinjunhuan Road, Minhang District, Shanghai 201100, China
  - Corresponding author Zhaoyang Hu, PhD, Fengneng Pharmaceutical Technology (Shanghai) Co., Ltd., RM. A310, 115 Xinjunhuan Road, Minhang District, Shanghai 201100, China.
- Fengfeng Zhou
  - College of Computer Science and Technology, Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun, Jilin 130012, China
  - Corresponding author Fengfeng Zhou, PhD, College of Computer Science and Technology, and Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, Changchun, Jilin 130012, China.
- Wenwei Zhu
  - Department of Hematology, Yueyang Hospital of Integrated Traditional Chinese and Western Medicine, Shanghai University of Traditional Chinese Medicine, 110 Ganhe Road, Hongkou District, Shanghai 200437, China
  - Corresponding author Wenwei Zhu, PhD, Department of Hematology, Yueyang Hospital of Integrated Traditional Chinese and Western Medicine, Shanghai University of Traditional Chinese Medicine, 110 Ganhe Road, Hongkou District, Shanghai 200437, China.
10
Zou H, Yang F, Yin Z. Identification of tumor homing peptides by utilizing hybrid feature representation. J Biomol Struct Dyn 2022; 41:3405-3412. [PMID: 35262448 DOI: 10.1080/07391102.2022.2049368] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 12/31/2022]
Abstract
Cancer is a serious disease, and recent studies have reported that tumor homing peptides (THPs) play a key role in its treatment. Because experimental methods are time-consuming and expensive, there is an urgent need for automatic computational approaches to identify THPs. Hence, in this study, we propose a novel machine learning method to distinguish THPs from non-THPs, in which the peptide sequences are first encoded by the pseudo residue pairwise energy content matrix (PseRECM) and pseudo physicochemical property (PsePC). Moreover, the least absolute shrinkage and selection operator (LASSO) was employed to select optimal features from the extracted features. All of the selected features were fed into a support vector machine (SVM) for identifying THPs. We achieved 89.02%, 88.49%, and 94.58% classification accuracy on the Main, Small, and Main90 datasets, respectively. Experimental results showed that our proposed method outperforms the existing predictors on the same benchmark datasets, indicating that it may be a useful tool for identifying THPs. The datasets and codes used in the current study are available at https://figshare.com/articles/online_resource/iTHPs/16778770. Communicated by Ramaswamy H. Sarma.
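The feature-selection step named in this abstract (the least absolute shrinkage and selection operator) can be sketched as follows. This is a minimal pure-Python coordinate-descent solver for the objective (1/(2n))·||y − Xw||² + λ·||w||₁, shown on a tiny invented design matrix. It is not the authors' code: the data, the value of `lam`, and the regression-style target are assumptions for illustration, and the downstream SVM is omitted.

```python
from typing import List


def soft_threshold(rho: float, lam: float) -> float:
    """Shrink rho toward zero by lam; values within [-lam, lam] become exactly 0."""
    if rho > lam:
        return rho - lam
    if rho < -lam:
        return rho + lam
    return 0.0


def lasso_coordinate_descent(X: List[List[float]], y: List[float],
                             lam: float, n_iter: int = 100) -> List[float]:
    """Minimise (1/(2n)) * ||y - Xw||^2 + lam * ||w||_1 by cyclic coordinate descent."""
    n, p = len(X), len(X[0])
    w = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            rho = 0.0   # correlation of feature j with the partial residual
            z = 0.0     # mean squared value of feature j
            for i in range(n):
                pred_without_j = sum(X[i][k] * w[k] for k in range(p) if k != j)
                rho += X[i][j] * (y[i] - pred_without_j)
                z += X[i][j] ** 2
            rho /= n
            z /= n
            w[j] = soft_threshold(rho, lam) / z if z > 0 else 0.0
    return w


# Invented toy data: three mutually orthogonal features; the target depends
# strongly on feature 0, weakly on feature 1, and not at all on feature 2.
X_toy = [[1.0, 1.0, 1.0],
         [-1.0, 1.0, -1.0],
         [1.0, -1.0, -1.0],
         [-1.0, -1.0, 1.0]]
y_toy = [2.0 * row[0] + 0.1 * row[1] for row in X_toy]

weights = lasso_coordinate_descent(X_toy, y_toy, lam=0.5)
selected = [j for j, wj in enumerate(weights) if wj != 0.0]
```

Only the features in `selected` would be passed on to the classifier. The L1 penalty drives weak or irrelevant coefficients to exactly zero, which is what makes this regression method usable as a feature selector.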
Affiliation(s)
- Hongliang Zou
  - School of Communications and Electronics, Jiangxi Science and Technology Normal University, Nanchang, China
- Fan Yang
  - School of Communications and Electronics, Jiangxi Science and Technology Normal University, Nanchang, China
- Zhijian Yin
  - School of Communications and Electronics, Jiangxi Science and Technology Normal University, Nanchang, China
11
Tan G, Huang B, Cui Z, Dou H, Zheng S, Zhou T. A noise-immune reinforcement learning method for early diagnosis of neuropsychiatric systemic lupus erythematosus. MATHEMATICAL BIOSCIENCES AND ENGINEERING : MBE 2022; 19:2219-2239. [PMID: 35240783 DOI: 10.3934/mbe.2022104] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Indexed: 06/14/2023]
Abstract
The neuropsychiatric systemic lupus erythematosus (NPSLE), a severe disease that can damage the heart, liver, kidney, and other vital organs, often involves the central nervous system and even leads to death. Magnetic resonance spectroscopy (MRS) is a brain functional imaging technology that can detect the concentration of metabolites in organs and tissues non-invasively. However, the performance of early diagnosis of NPSLE through conventional MRS analysis is still unsatisfactory. In this paper, we propose a novel method based on genetic algorithm (GA) and multi-agent reinforcement learning (MARL) to improve the performance of the NPSLE diagnosis model. Firstly, the proton magnetic resonance spectroscopy (1H-MRS) data from 23 NPSLE patients and 16 age-matched healthy controls (HC) were standardized before training. Secondly, we adopt MARL by assigning an agent to each feature to select the optimal feature subset. Thirdly, the parameter of SVM is optimized by GA. Our experiment shows that the SVM classifier optimized by feature selection and parameter optimization achieves 94.9% accuracy, 91.3% sensitivity, 100% specificity and 0.87 cross-validation score, which is the best score compared with other state-of-the-art machine learning algorithms. Furthermore, our method is even better than other dimension reduction ones, such as SVM based on principal component analysis (PCA) and variational autoencoder (VAE). By analyzing the metabolites obtained by MRS, we believe that this method can provide a reliable classification result for doctors and can be effectively used for the early diagnosis of this disease.
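As a reading aid, the reported accuracy, sensitivity, and specificity can be related to the cohort size with a short calculation. The confusion-matrix counts below are not stated in the abstract; they are an inferred assumption, namely one set of counts for 23 patients and 16 controls that reproduces the reported percentages after rounding.

```python
# Cohort sizes taken from the abstract.
n_patients = 23
n_controls = 16

# Assumed counts (inferred, not reported): 2 patients missed, no false alarms.
true_pos, false_neg = 21, 2
true_neg, false_pos = 16, 0

sensitivity = true_pos / (true_pos + false_neg)      # recall on NPSLE patients
specificity = true_neg / (true_neg + false_pos)      # recall on healthy controls
accuracy = (true_pos + true_neg) / (n_patients + n_controls)

sensitivity_pct = round(100 * sensitivity, 1)
specificity_pct = round(100 * specificity, 1)
accuracy_pct = round(100 * accuracy, 1)
```

With so few samples, a single reclassified subject shifts accuracy by roughly 2.6 percentage points, which is worth keeping in mind when comparing methods on this dataset.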
Affiliation(s)
- Guanru Tan
  - Department of Computer Science, Shantou University, Shantou 515063, China
- Boyu Huang
  - Department of Computer Science, Shantou University, Shantou 515063, China
- Zhihan Cui
  - Department of Computer Science, Shantou University, Shantou 515063, China
- Haowen Dou
  - Department of Computer Science, Shantou University, Shantou 515063, China
- Shiqiang Zheng
  - Department of Computer Science, Shantou University, Shantou 515063, China
- Teng Zhou
  - Department of Computer Science, Shantou University, Shantou 515063, China
  - Key Laboratory of Intelligent Manufacturing Technology, Shantou University, Ministry of Education, Shantou 515063, China
12
He W, Jiang Y, Jin J, Li Z, Zhao J, Manavalan B, Su R, Gao X, Wei L. Accelerating bioactive peptide discovery via mutual information-based meta-learning. Brief Bioinform 2021; 23:6457168. [PMID: 34882225 DOI: 10.1093/bib/bbab499] [Citation(s) in RCA: 26] [Impact Index Per Article: 8.7] [Received: 08/19/2021] [Revised: 10/07/2021] [Accepted: 10/30/2021] [Indexed: 12/28/2022]
Abstract
Recently, machine learning methods have been developed to identify various peptide bioactivities. However, because experimentally validated peptides are scarce, machine learning methods cannot be trained sufficiently, which easily results in poor generalizability. Furthermore, there is no generic computational framework to predict the bioactivities of different peptides. Thus, a natural question is whether we can use limited samples to build an effective predictive model for different kinds of peptides. To address this question, we propose Mutual Information Maximization Meta-Learning (MIMML), a novel meta-learning-based predictive model for bioactive peptide discovery. Using few samples from various functional peptides, MIMML can sufficiently learn the discriminative information among various functions and characterize functional differences. Experimental results show excellent performance of MIMML despite using far fewer training samples than the state-of-the-art methods. We also decipher the latent relationships among different kinds of functions to understand what the meta-model learned to improve a specific task. In summary, this study is a pioneering work in the field of functional peptide mining and provides the first-of-its-kind solution for few-sample learning problems in biological sequence analysis, accelerating the discovery of new functional peptides. The source code and datasets are available at https://github.com/TearsWaiting/MIMML.
Affiliation(s)
- Wenjia He
  - School of Software, Shandong University, Jinan, China
  - Joint SDU-NTU Centre for Artificial Intelligence Research (C-FAIR), Shandong University, Jinan, China
  - BioMap, Beijing, China
- Yi Jiang
  - School of Software, Shandong University, Jinan, China
  - Joint SDU-NTU Centre for Artificial Intelligence Research (C-FAIR), Shandong University, Jinan, China
- Junru Jin
  - School of Software, Shandong University, Jinan, China
  - Joint SDU-NTU Centre for Artificial Intelligence Research (C-FAIR), Shandong University, Jinan, China
- Zhongshen Li
  - School of Software, Shandong University, Jinan, China
  - Joint SDU-NTU Centre for Artificial Intelligence Research (C-FAIR), Shandong University, Jinan, China
- Jiaojiao Zhao
  - School of Software, Shandong University, Jinan, China
  - Joint SDU-NTU Centre for Artificial Intelligence Research (C-FAIR), Shandong University, Jinan, China
- Ran Su
  - College of Intelligence and Computing, Tianjin University, Tianjin, China
- Xin Gao
  - King Abdullah University of Science and Technology (KAUST), Computational Bioscience Research Center (CBRC), Computer, Electrical, and Mathematical Sciences and Engineering (CEMSE) Division, Thuwal, 23955-6900, Saudi Arabia
- Leyi Wei
  - School of Software, Shandong University, Jinan, China
  - Joint SDU-NTU Centre for Artificial Intelligence Research (C-FAIR), Shandong University, Jinan, China
13
Zou H, Yin Z. m7G-DPP: Identifying N7-methylguanosine sites based on dinucleotide physicochemical properties of RNA. Biophys Chem 2021; 279:106697. [PMID: 34628276 DOI: 10.1016/j.bpc.2021.106697] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 07/18/2021] [Revised: 10/01/2021] [Accepted: 10/02/2021] [Indexed: 11/17/2022]
Abstract
N7-methylguanosine (m7G) is one of the most common post-transcriptional RNA modifications and plays a vital role in regulating gene expression. Dysfunction of m7G may result in developmental defects and some serious diseases, so identifying m7G sites quickly and accurately is an urgent task. Because experimental approaches are costly and time-consuming, researchers have turned their attention to computational models. In this study, we propose a novel predictor, m7G-DPP, to identify m7G sites. In the predictor, RNA sequences are first encoded by the physicochemical (PC) properties of their dinucleotides. A sliding-window approach then divides the PC matrix into multiple sub-matrices, and Pearson's correlation coefficient (PCC), dynamic time warping (DTW), and distance correlation (DC) are used to extract classification features from each window. Next, the least absolute shrinkage and selection operator (LASSO) algorithm is applied to select discriminative features. Finally, the selected features are fed into a support vector machine to identify m7G sites. Experimental results show that the proposed method is effective and may play a complementary role in current m7G site prediction studies. The MATLAB code and dataset are available at https://figshare.com/articles/online_resource/m7G-DPP/15000348.
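The first two stages described above (dinucleotide property encoding, then windowed correlation features) can be sketched as follows. This is a minimal illustration and not the authors' MATLAB implementation: the property tables are placeholder numbers rather than real physicochemical values, the choice to correlate pairs of property rows within each window is an assumption (the abstract does not say which vectors are compared), and the names `encode`, `pearson`, and `window_features` as well as the `width` and `step` parameters are hypothetical. The DTW, DC, LASSO, and SVM stages are omitted.

```python
import math
from itertools import product

BASES = "ACGU"
# Placeholder weights, NOT real physicochemical values; they only give each
# dinucleotide a distinct number per property for illustration.
_W1 = {"A": 0.1, "C": 0.4, "G": 0.7, "U": 0.2}
_W2 = {"A": 0.9, "C": 0.3, "G": 0.5, "U": 0.6}
PROPERTIES = [
    {a + b: _W1[a] + _W1[b] for a, b in product(BASES, repeat=2)},
    {a + b: _W2[a] * _W2[b] for a, b in product(BASES, repeat=2)},
    {a + b: _W1[a] - _W2[b] for a, b in product(BASES, repeat=2)},
]


def encode(seq):
    """Build the PC matrix: one row per property, one column per dinucleotide."""
    dinucs = [seq[i:i + 2] for i in range(len(seq) - 1)]
    return [[table[d] for d in dinucs] for table in PROPERTIES]


def pearson(x, y):
    """Pearson's correlation coefficient; returns 0.0 if either input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return 0.0 if sx == 0 or sy == 0 else cov / (sx * sy)


def window_features(seq, width=8, step=4):
    """Slide a window over the PC matrix and correlate every pair of property rows."""
    matrix = encode(seq)
    n_cols = len(matrix[0])
    feats = []
    for start in range(0, n_cols - width + 1, step):
        rows = [row[start:start + width] for row in matrix]
        for i in range(len(rows)):
            for j in range(i + 1, len(rows)):
                feats.append(pearson(rows[i], rows[j]))
    return feats


# A made-up 25-nucleotide RNA fragment, used only to exercise the functions.
features = window_features("GGACUAGCUAGGCAUCGAUCGGAUC")
print(len(features), "features")
```

In a full pipeline, the resulting feature vectors from many labeled sequences would then go through feature selection and a classifier, as the abstract describes.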
Affiliation(s)
- Hongliang Zou
  - School of Communications and Electronics, Jiangxi Science and Technology Normal University, Nanchang 330003, China
- Zhijian Yin
  - School of Communications and Electronics, Jiangxi Science and Technology Normal University, Nanchang 330003, China
14
Gupta R, Srivastava D, Sahu M, Tiwari S, Ambasta RK, Kumar P. Artificial intelligence to deep learning: machine intelligence approach for drug discovery. Mol Divers 2021; 25:1315-1360. [PMID: 33844136 PMCID: PMC8040371 DOI: 10.1007/s11030-021-10217-3] [Citation(s) in RCA: 256] [Impact Index Per Article: 85.3] [Received: 01/29/2021] [Accepted: 03/22/2021] [Indexed: 02/06/2023]
Abstract
Drug design and development is an important area of research for pharmaceutical companies and chemical scientists. However, low efficacy, off-target delivery, time consumption, and high cost pose hurdles and challenges for drug design and discovery. Further, complex and large data sets from genomics, proteomics, microarrays, and clinical trials pose an additional obstacle in the drug discovery pipeline. Artificial intelligence and machine learning technology play a crucial role in drug discovery and development. In other words, artificial neural networks and deep learning algorithms have modernized the area. Machine learning and deep learning algorithms have been implemented in several drug discovery processes such as peptide synthesis, structure-based virtual screening, ligand-based virtual screening, toxicity prediction, drug monitoring and release, pharmacophore modeling, quantitative structure-activity relationship, drug repositioning, polypharmacology, and physicochemical activity. Evidence from the past strengthens the case for implementing artificial intelligence and deep learning in this field. Moreover, novel data mining, curation, and management techniques have provided critical support to recently developed modeling algorithms. In summary, artificial intelligence and deep learning advancements provide an excellent opportunity for the rational drug design and discovery process, which will eventually impact mankind.
The primary concerns associated with drug design and development are time consumption and production cost. Further, inefficiency, inaccurate target delivery, and inappropriate dosage are other hurdles that inhibit the process of drug delivery and development. With advancements in technology, computer-aided drug design integrating artificial intelligence algorithms can eliminate the challenges and hurdles of traditional drug design and development.
Artificial intelligence is a superset that includes machine learning, and machine learning in turn comprises supervised learning, unsupervised learning, and reinforcement learning. Further, deep learning, a subset of machine learning, has been extensively implemented in drug design and development. Artificial neural networks, deep neural networks, support vector machines, classification and regression, generative adversarial networks, symbolic learning, and meta-learning are examples of the algorithms applied to the drug design and discovery process. Artificial intelligence has been applied to different areas of the drug design and development process, from peptide synthesis to molecule design, virtual screening to molecular docking, quantitative structure-activity relationship to drug repositioning, protein misfolding to protein-protein interactions, and molecular pathway identification to polypharmacology. Artificial intelligence principles have been applied to the classification of active and inactive compounds, monitoring of drug release, pre-clinical and clinical development, primary and secondary drug screening, biomarker development, pharmaceutical manufacturing, identification of bioactivity and physicochemical properties, prediction of toxicity, and identification of mode of action.
Affiliation(s)
- Rohan Gupta
  - Molecular Neuroscience and Functional Genomics Laboratory, Department of Biotechnology, Delhi Technological University (Formerly DCE), Shahbad Daulatpur, Bawana Road, Delhi, 110042, India
- Devesh Srivastava
  - Molecular Neuroscience and Functional Genomics Laboratory, Department of Biotechnology, Delhi Technological University (Formerly DCE), Shahbad Daulatpur, Bawana Road, Delhi, 110042, India
- Mehar Sahu
  - Molecular Neuroscience and Functional Genomics Laboratory, Department of Biotechnology, Delhi Technological University (Formerly DCE), Shahbad Daulatpur, Bawana Road, Delhi, 110042, India
- Swati Tiwari
  - Molecular Neuroscience and Functional Genomics Laboratory, Department of Biotechnology, Delhi Technological University (Formerly DCE), Shahbad Daulatpur, Bawana Road, Delhi, 110042, India
- Rashmi K Ambasta
  - Molecular Neuroscience and Functional Genomics Laboratory, Department of Biotechnology, Delhi Technological University (Formerly DCE), Shahbad Daulatpur, Bawana Road, Delhi, 110042, India
- Pravir Kumar
  - Molecular Neuroscience and Functional Genomics Laboratory, Department of Biotechnology, Delhi Technological University (Formerly DCE), Shahbad Daulatpur, Bawana Road, Delhi, 110042, India
15
Meng C, Wu J, Guo F, Dong B, Xu L. CWLy-pred: A novel cell wall lytic enzyme identifier based on an improved MRMD feature selection method. Genomics 2020; 112:4715-4721. [DOI: 10.1016/j.ygeno.2020.08.015] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 07/10/2020] [Revised: 08/04/2020] [Accepted: 08/13/2020] [Indexed: 10/25/2022]
16
Liang G, Fan W, Luo H, Zhu X. The emerging roles of artificial intelligence in cancer drug development and precision therapy. Biomed Pharmacother 2020; 128:110255. [DOI: 10.1016/j.biopha.2020.110255] [Citation(s) in RCA: 46] [Impact Index Per Article: 11.5] [Received: 03/07/2020] [Revised: 04/22/2020] [Accepted: 05/10/2020] [Indexed: 12/12/2022]