1. Fallah A, Havaei SA, Sedighian H, Kachuei R, Fooladi AAI. Prediction of aptamer affinity using an artificial intelligence approach. J Mater Chem B 2024; 12:8825-8842. [PMID: 39158322 DOI: 10.1039/d4tb00909f] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/20/2024]
Abstract
Aptamers are oligonucleotide sequences that can connect to particular target molecules, similar to monoclonal antibodies. They can be chosen by systematic evolution of ligands by exponential enrichment (SELEX), and are modifiable and can be synthesized. Even if the SELEX approach has been improved a lot, it is frequently challenging and time-consuming to identify aptamers experimentally. In particular, structure-based methods are the most used in computer-aided design and development of aptamers. For this purpose, numerous web-based platforms have been suggested for the purpose of forecasting the secondary structure and 3D configurations of RNAs and DNAs. Also, molecular docking and molecular dynamics (MD), which are commonly utilized in protein compound selection by structural information, are suitable for aptamer selection. On the other hand, from a large number of sequences, artificial intelligence (AI) may be able to quickly discover the possible aptamer candidates. Conversely, sophisticated machine and deep-learning (DL) models have demonstrated efficacy in forecasting the binding properties between ligands and targets during drug discovery; as such, they may provide a reliable and precise method for forecasting the binding of aptamers to targets. This research looks at advancements in AI pipelines and strategies for aptamer binding ability prediction, such as machine and deep learning, as well as structure-based approaches, molecular dynamics and molecular docking simulation methods.
Affiliation(s)
- Arezoo Fallah
  - Department of Bacteriology and Virology, Faculty of Medicine, Isfahan University of Medical Sciences, Isfahan, Iran
- Seyed Asghar Havaei
  - Department of Microbiology, School of Medicine, Isfahan University of Medical Sciences, Isfahan, Iran
- Hamid Sedighian
  - Applied Microbiology Research Center, Biomedicine Technologies Institute, Baqiyatallah University of Medical Sciences, Tehran, Iran
- Reza Kachuei
  - Molecular Biology Research Center, Biomedicine Technologies Institute, Baqiyatallah University of Medical Sciences, Tehran, Iran
- Abbas Ali Imani Fooladi
  - Applied Microbiology Research Center, Biomedicine Technologies Institute, Baqiyatallah University of Medical Sciences, Tehran, Iran
2. Fang Z, Wu Z, Wu X, Chen S, Wang X, Umrao S, Dwivedy A. APIPred: An XGBoost-Based Method for Predicting Aptamer-Protein Interactions. J Chem Inf Model 2024; 64:2290-2301. [PMID: 38127053 PMCID: PMC11001522 DOI: 10.1021/acs.jcim.3c00713] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/23/2023]
Abstract
Aptamers are single-stranded DNA or RNA oligos that can bind to a variety of targets with high specificity and selectivity and thus are widely used in the field of biosensing and disease therapies. Aptamers are generated by SELEX, which is a time-consuming procedure. In this study, using in silico and computational tools, we attempt to predict whether an aptamer can interact with a specific protein target. We present multiple data representations of protein and aptamer pairs and multiple machine-learning-based models to predict aptamer-protein interactions with a fair degree of selectivity. One of our models showed 96.5% accuracy and 97% precision, which are significantly better than those of the previously reported models. Additionally, we used molecular docking and SPR binding assays for two aptamers and the predicted targets as examples to exhibit the robustness of the APIPred algorithm. This reported model can be used for the high throughput screening of aptamer-protein pairs for targeting cancer and rapidly evolving viral epidemics.
Affiliation(s)
- Zheng Fang
  - Holonyak Micro and Nanotechnology Lab (HMNTL), University of Illinois at Urbana-Champaign, Champaign-Urbana 61801, United States
  - Zhejiang University-University of Illinois at Urbana-Champaign Institute, Haining, Zhejiang 314400, China
  - Department of Electrical and Computer Engineering, National University of Singapore, 117583 Singapore
- Zhongqi Wu
  - Holonyak Micro and Nanotechnology Lab (HMNTL), University of Illinois at Urbana-Champaign, Champaign-Urbana 61801, United States
  - Zhejiang University-University of Illinois at Urbana-Champaign Institute, Haining, Zhejiang 314400, China
  - Department of Electrical and Computer Engineering, University of California, San Diego 92161, United States
- Xinbo Wu
  - Department of Electrical and Computer Engineering, University of Illinois at Urbana-Champaign, Champaign-Urbana 61801, United States
- Shixin Chen
  - Holonyak Micro and Nanotechnology Lab (HMNTL), University of Illinois at Urbana-Champaign, Champaign-Urbana 61801, United States
  - Zhejiang University-University of Illinois at Urbana-Champaign Institute, Haining, Zhejiang 314400, China
  - Department of Electrical and Computer Engineering, University of Illinois at Urbana-Champaign, Champaign-Urbana 61801, United States
- Xing Wang
  - Holonyak Micro and Nanotechnology Lab (HMNTL), University of Illinois at Urbana-Champaign, Champaign-Urbana 61801, United States
  - Department of Bioengineering, University of Illinois at Urbana-Champaign, Champaign-Urbana 61801, United States
  - Department of Chemistry, University of Illinois at Urbana-Champaign, Champaign-Urbana 61801, United States
  - Carl R. Woese Institute for Genomic Biology, University of Illinois at Urbana-Champaign, Champaign-Urbana 61801, United States
- Saurabh Umrao
  - Holonyak Micro and Nanotechnology Lab (HMNTL), University of Illinois at Urbana-Champaign, Champaign-Urbana 61801, United States
  - Department of Bioengineering, University of Illinois at Urbana-Champaign, Champaign-Urbana 61801, United States
  - Carl R. Woese Institute for Genomic Biology, University of Illinois at Urbana-Champaign, Champaign-Urbana 61801, United States
- Abhisek Dwivedy
  - Holonyak Micro and Nanotechnology Lab (HMNTL), University of Illinois at Urbana-Champaign, Champaign-Urbana 61801, United States
  - Department of Bioengineering, University of Illinois at Urbana-Champaign, Champaign-Urbana 61801, United States
  - Carl R. Woese Institute for Genomic Biology, University of Illinois at Urbana-Champaign, Champaign-Urbana 61801, United States
3. Shin I, Kang K, Kim J, Sel S, Choi J, Lee JW, Kang HY, Song G. AptaTrans: a deep neural network for predicting aptamer-protein interaction using pretrained encoders. BMC Bioinformatics 2023; 24:447. [PMID: 38012571 PMCID: PMC10680337 DOI: 10.1186/s12859-023-05577-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/24/2023] [Accepted: 11/21/2023] [Indexed: 11/29/2023] Open
Abstract
BACKGROUND: Aptamers, which are biomaterials comprised of single-stranded DNA/RNA that form tertiary structures, have significant potential as next-generation materials, particularly for drug discovery. The systematic evolution of ligands by exponential enrichment (SELEX) method is a critical in vitro technique employed to identify aptamers that bind specifically to target proteins. While advanced SELEX-based methods such as Cell- and HT-SELEX are available, they often encounter issues such as extended time consumption and suboptimal accuracy. Several in silico aptamer discovery methods have been proposed to address these challenges. These methods are specifically designed to predict aptamer-protein interaction (API) using benchmark datasets. However, these methods often fail to consider the physicochemical interactions between aptamers and proteins within tertiary structures.
RESULTS: In this study, we propose AptaTrans, a pipeline for predicting API using deep learning techniques. AptaTrans uses transformer-based encoders to handle aptamer and protein sequences at the monomer level. Furthermore, pretrained encoders are utilized for the structural representation. After validation with a benchmark dataset, AptaTrans has been integrated into a comprehensive toolset. This pipeline synergistically combines with Apta-MCTS, a generative algorithm for recommending aptamer candidates.
CONCLUSION: The results show that AptaTrans outperforms existing models for predicting API, and the efficacy of the AptaTrans pipeline has been confirmed through various experimental tools. We expect AptaTrans will enhance the cost-effectiveness and efficiency of SELEX in drug discovery. The source code and benchmark dataset for AptaTrans are available at https://github.com/pnumlb/AptaTrans.
Affiliation(s)
- Incheol Shin
  - Division of Artificial Intelligence, Pusan National University, Busan, Republic of Korea
- Keumseok Kang
  - Division of Artificial Intelligence, Pusan National University, Busan, Republic of Korea
- Juseong Kim
  - Division of Artificial Intelligence, Pusan National University, Busan, Republic of Korea
- Sanghun Sel
  - Division of Artificial Intelligence, Pusan National University, Busan, Republic of Korea
- Jeonghoon Choi
  - Division of Artificial Intelligence, Pusan National University, Busan, Republic of Korea
- Jae-Wook Lee
  - Research & Development, NuclixBio, Seoul, Republic of Korea
- Ho Young Kang
  - Research & Development, NuclixBio, Seoul, Republic of Korea
- Giltae Song
  - Division of Artificial Intelligence, Pusan National University, Busan, Republic of Korea
  - School of Computer Science and Engineering, Pusan National University, Busan, Republic of Korea
  - Center for Artificial Intelligence Research, Pusan National University, Busan, Republic of Korea
4. Andress C, Kappel K, Villena ME, Cuperlovic-Culf M, Yan H, Li Y. DAPTEV: Deep aptamer evolutionary modelling for COVID-19 drug design. PLoS Comput Biol 2023; 19:e1010774. [PMID: 37406007 DOI: 10.1371/journal.pcbi.1010774] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/29/2022] [Accepted: 06/13/2023] [Indexed: 07/07/2023] Open
Abstract
Typical drug discovery and development processes are costly, time consuming and often biased by expert opinion. Aptamers are short, single-stranded oligonucleotides (RNA/DNA) that bind to target proteins and other types of biomolecules. Compared with small-molecule drugs, aptamers can bind to their targets with high affinity (binding strength) and specificity (uniquely interacting with the target only). The conventional development process for aptamers utilizes a manual process known as Systematic Evolution of Ligands by Exponential Enrichment (SELEX), which is costly, slow, dependent on library choice and often produces aptamers that are not optimized. To address these challenges, in this research, we create an intelligent approach, named DAPTEV, for generating and evolving aptamer sequences to support aptamer-based drug discovery and development. Using the COVID-19 spike protein as a target, our computational results suggest that DAPTEV is able to produce structurally complex aptamers with strong binding affinities.
Affiliation(s)
- Cameron Andress
  - Department of Computer Science, Brock University, St. Catharines, Canada
- Kalli Kappel
  - Broad Institute of MIT and Harvard, Cambridge, Massachusetts, United States of America
- Hongbin Yan
  - Department of Chemistry, Brock University, St. Catharines, Canada
- Yifeng Li
  - Department of Computer Science, Brock University, St. Catharines, Canada
  - Department of Biological Sciences, Brock University, St. Catharines, Canada
5. Lee SJ, Cho J, Lee BH, Hwang D, Park JW. Design and Prediction of Aptamers Assisted by In Silico Methods. Biomedicines 2023; 11:356. [PMID: 36830893 PMCID: PMC9953197 DOI: 10.3390/biomedicines11020356] [Citation(s) in RCA: 11] [Impact Index Per Article: 11.0] [Received: 12/30/2022] [Revised: 01/21/2023] [Accepted: 01/23/2023] [Indexed: 01/28/2023] Open
Abstract
An aptamer is a single-stranded DNA or RNA that binds to a specific target with high binding affinity. Aptamers are developed through the process of systematic evolution of ligands by exponential enrichment (SELEX), which is repeated to increase the binding power and specificity. However, the SELEX process is time-consuming, and the characterization of aptamer candidates selected through it requires additional effort. Here, we describe in silico methods in order to suggest the most efficient way to develop aptamers and minimize the laborious effort required to screen and optimise aptamers. We investigated several methods for the estimation of aptamer-target molecule binding through conformational structure prediction, molecular docking, and molecular dynamic simulation. In addition, examples of machine learning and deep learning technologies used to predict the binding of targets and ligands in the development of new drugs are introduced. This review will be helpful in the development and application of in silico aptamer screening and characterization.
Affiliation(s)
- Su Jin Lee
  - Drug Manufacturing Center, Daegu-Gyeongbuk Medical Innovation Foundation (K-MEDI Hub), Daegu 41061, Republic of Korea
- Junmin Cho
  - Medical Device Development Center, Daegu-Gyeongbuk Medical Innovation Foundation (K-MEDI Hub), Daegu 41061, Republic of Korea
- Byung-Hoon Lee
  - Medical Device Development Center, Daegu-Gyeongbuk Medical Innovation Foundation (K-MEDI Hub), Daegu 41061, Republic of Korea
- Donghwan Hwang
  - Medical Device Development Center, Daegu-Gyeongbuk Medical Innovation Foundation (K-MEDI Hub), Daegu 41061, Republic of Korea
- Jee-Woong Park
  - Medical Device Development Center, Daegu-Gyeongbuk Medical Innovation Foundation (K-MEDI Hub), Daegu 41061, Republic of Korea
6. Hasanzadeh A, Hamblin MR, Kiani J, Noori H, Hardie JM, Karimi M, Shafiee H. Could artificial intelligence revolutionize the development of nanovectors for gene therapy and mRNA vaccines? Nano Today 2022; 47:101665. [PMID: 37034382 PMCID: PMC10081506 DOI: 10.1016/j.nantod.2022.101665] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Indexed: 06/19/2023]
Abstract
Gene therapy enables the introduction of nucleic acids like DNA and RNA into host cells, and is expected to revolutionize the treatment of a wide range of diseases. This growth has been further accelerated by the discovery of CRISPR/Cas technology, which allows accurate genomic editing in a broad range of cells and organisms in vitro and in vivo. Despite many advances in gene delivery and the development of various viral and non-viral gene delivery vectors, the lack of highly efficient non-viral systems with low cellular toxicity remains a challenge. The application of cutting-edge technologies such as artificial intelligence (AI) has great potential to find new paradigms to solve this issue. Herein, we review AI and its major subfields including machine learning (ML), neural networks (NNs), expert systems, deep learning (DL), computer vision and robotics. We discuss the potential of AI-based models and algorithms in the design of targeted gene delivery vehicles capable of crossing extracellular and intracellular barriers by viral mimicry strategies. We finally discuss the role of AI in improving the function of CRISPR/Cas systems, developing novel nanobots, and mRNA vaccine carriers.
Affiliation(s)
- Akbar Hasanzadeh
  - Cellular and Molecular Research Center, Iran University of Medical Sciences, Tehran 1449614535, Iran
  - Department of Medical Nanotechnology, Faculty of Advanced Technologies in Medicine, Iran University of Medical Sciences, Tehran 1449614535, Iran
- Michael R Hamblin
  - Laser Research Centre, Faculty of Health Science, University of Johannesburg, Doornfontein 2028, South Africa
  - Radiation Biology Research Center, Iran University of Medical Sciences, Tehran, Iran
- Jafar Kiani
  - Oncopathology Research Center, Iran University of Medical Sciences, Tehran 1449614535, Iran
  - Department of Molecular Medicine, Faculty of Advanced Technologies in Medicine, Iran University of Medical Sciences, Tehran, Iran
- Hamid Noori
  - Cellular and Molecular Research Center, Iran University of Medical Sciences, Tehran 1449614535, Iran
  - Department of Medical Nanotechnology, Faculty of Advanced Technologies in Medicine, Iran University of Medical Sciences, Tehran 1449614535, Iran
- Joseph M. Hardie
  - Division of Engineering in Medicine, Department of Medicine, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA, 02139 USA
- Mahdi Karimi
  - Cellular and Molecular Research Center, Iran University of Medical Sciences, Tehran 1449614535, Iran
  - Department of Medical Nanotechnology, Faculty of Advanced Technologies in Medicine, Iran University of Medical Sciences, Tehran 1449614535, Iran
  - Oncopathology Research Center, Iran University of Medical Sciences, Tehran 1449614535, Iran
  - Research Center for Science and Technology in Medicine, Tehran University of Medical Sciences, Tehran 141556559, Iran
  - Applied Biotechnology Research Centre, Tehran Medical Science, Islamic Azad University, Tehran 1584743311, Iran
- Hadi Shafiee
  - Division of Engineering in Medicine, Department of Medicine, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA, 02139 USA
7. Uwiragiye E, Rhinehardt KL. TFIDF-Random Forest: Prediction of Aptamer-Protein Interacting Pairs. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2022; 19:3032-3037. [PMID: 34310317 DOI: 10.1109/tcbb.2021.3098709] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/13/2023]
Abstract
Aptamers are short, single-stranded oligonucleotides or peptides generated from in vitro selection to selectively bind with various molecules. Due to their molecular recognition capability for proteins, aptamers are becoming promising reagents in new drug development. Aptamers can fold into specific spatial configurations that bind to certain targets with extremely high specificity. The ability of aptamers to reversibly bind proteins has generated increasing interest in using them to facilitate controlled release of therapeutic biomolecules. In vitro selection experiments to produce aptamer-protein binding pairs are very complex, and MD/MM in silico experiments can be computationally expensive. In this study, we introduce a natural language processing approach for data-driven computational selection. We compared our method to the sequential model with the embedding layer applied in the literature. We transformed the DNA/RNA and protein sequences into text format using a sliding window approach. With this methodology, efficiency was notably higher than that reported in the literature. This indicates that our preliminary model is a marked improvement over previous models, which brings us closer to a data-driven computational selection method.
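The sliding-window step described in this abstract can be illustrated with a minimal sketch. The abstract does not state the window width or the TF-IDF variant, so `k=3`, the `ln(N/df) + 1` weighting, the function names, and the toy sequences below are all assumptions made for illustration; the classifier that would consume these vectors is not shown.

```python
import math
from collections import Counter


def kmer_tokens(sequence, k=3):
    """Slide a window of width k over the sequence, one position at a time."""
    return [sequence[i:i + k] for i in range(len(sequence) - k + 1)]


def tfidf(documents):
    """Return one {token: weight} dict per tokenized document.

    Assumed variant: tf = count / len(doc), idf = ln(N / df) + 1.
    """
    n_docs = len(documents)
    # Document frequency: in how many documents each token appears.
    doc_freq = Counter(token for doc in documents for token in set(doc))
    weights = []
    for doc in documents:
        counts = Counter(doc)
        weights.append({
            token: (count / len(doc)) * (math.log(n_docs / doc_freq[token]) + 1.0)
            for token, count in counts.items()
        })
    return weights


# Hypothetical toy sequences, not taken from the paper's dataset.
sequences = ["GGTTGGTGTGGTTGG", "ACGUACGUAC"]
docs = [kmer_tokens(seq, k=3) for seq in sequences]
vectors = tfidf(docs)
```

Each sequence becomes a "sentence" of overlapping k-mer "words", which is what lets standard text-vectorization tools be applied to nucleotide or protein sequences.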
8. Douaki A, Garoli D, Inam AKMS, Angeli MAC, Cantarella G, Rocchia W, Wang J, Petti L, Lugli P. Smart Approach for the Design of Highly Selective Aptamer-Based Biosensors. Biosensors 2022; 12:574. [PMID: 36004970 PMCID: PMC9405846 DOI: 10.3390/bios12080574] [Citation(s) in RCA: 15] [Impact Index Per Article: 7.5] [Received: 06/04/2022] [Revised: 07/23/2022] [Accepted: 07/25/2022] [Indexed: 11/30/2022]
Abstract
Aptamers are chemically synthesized single-stranded DNA or RNA oligonucleotides widely used nowadays in sensors and nanoscale devices as highly sensitive biorecognition elements. With proper design, aptamers are able to bind to a specific target molecule with high selectivity. To date, the systematic evolution of ligands by exponential enrichment (SELEX) process is employed to isolate aptamers. Nevertheless, this method requires complex and time-consuming procedures. In silico methods comprising machine learning models have been recently proposed to reduce the time and cost of aptamer design. In this work, we present a new in silico approach allowing the generation of highly sensitive and selective RNA aptamers towards a specific target, here represented by ammonium dissolved in water. By using machine learning and bioinformatics tools, a rational design of aptamers is demonstrated. This “smart” SELEX method is experimentally proved by choosing the best five aptamer candidates obtained from the design process and applying them as functional elements in an electrochemical sensor to detect, as the target molecule, ammonium at different concentrations. We observed that the use of five different aptamers leads to a significant difference in the sensor’s response. This can be explained by considering the aptamers’ conformational change due to their interaction with the target molecule. We studied these conformational changes using a molecular dynamics simulation and suggested a possible explanation of the experimental observations. Finally, electrochemical measurements exposing the same sensors to different molecules were used to confirm the high selectivity of the designed aptamers. The proposed in silico SELEX approach can potentially reduce the cost and the time needed to identify the aptamers and potentially be applied to any target molecule.
Affiliation(s)
- Ali Douaki
  - Faculty of Science and Technology, Libera Università di Bolzano, Piazza Università 1, 39100 Bolzano, Italy
  - Correspondence: A.D.; P.L.
- Denis Garoli
  - Istituto Italiano di Tecnologia, Via Morego, 30, 16163 Genova, Italy
- A. K. M. Sarwar Inam
  - Faculty of Science and Technology, Libera Università di Bolzano, Piazza Università 1, 39100 Bolzano, Italy
- Martina Aurora Costa Angeli
  - Faculty of Science and Technology, Libera Università di Bolzano, Piazza Università 1, 39100 Bolzano, Italy
- Giuseppe Cantarella
  - Faculty of Science and Technology, Libera Università di Bolzano, Piazza Università 1, 39100 Bolzano, Italy
- Walter Rocchia
  - CONCEPT Lab, Istituto Italiano di Tecnologia, Via Enrico Melen 83, 16152 Genova, Italy
- Jiahai Wang
  - School of Mechanical and Electrical Engineering, School of Chemistry and Chemical Engineering, Guangzhou University, Guangzhou 510006, China
- Luisa Petti
  - Faculty of Science and Technology, Libera Università di Bolzano, Piazza Università 1, 39100 Bolzano, Italy
- Paolo Lugli
  - Faculty of Science and Technology, Libera Università di Bolzano, Piazza Università 1, 39100 Bolzano, Italy
  - Correspondence: A.D.; P.L.
9. Ghiasi A, Malekpour A, Mahpishanian S. Aptamer functionalized magnetic metal–organic framework MIL-101(Cr)-NH2 for specific extraction of acetamiprid from fruit juice and water samples. Food Chem 2022; 382:132218. [DOI: 10.1016/j.foodchem.2022.132218] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 07/15/2021] [Revised: 01/15/2022] [Accepted: 01/19/2022] [Indexed: 11/30/2022]
10. Zhang N. Meet the Editorial Board Member. Curr Med Chem 2022. [DOI: 10.2174/092986732912220324160351] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/22/2022]
11. Heredia FL, Roche-Lima A, Parés-Matos EI. A novel artificial intelligence-based approach for identification of deoxynucleotide aptamers. PLoS Comput Biol 2021; 17:e1009247. [PMID: 34343165 PMCID: PMC8362955 DOI: 10.1371/journal.pcbi.1009247] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Received: 09/26/2020] [Revised: 08/13/2021] [Accepted: 07/05/2021] [Indexed: 02/07/2023] Open
Abstract
The selection of a DNA aptamer through the Systematic Evolution of Ligands by EXponential enrichment (SELEX) method involves multiple binding steps, in which a target and a library of randomized DNA sequences are mixed for selection of a single, nucleotide-specific molecule. Usually, 10 to 20 steps are required for SELEX to be completed. Throughout this process it is necessary to discriminate between true DNA aptamers and unspecified DNA-binding sequences. Thus, a novel machine learning-based approach was developed to support and simplify the early steps of the SELEX process and to help discriminate DNA aptamers from unspecified DNA-binding sequences. An Artificial Intelligence (AI) approach to identify aptamers was implemented based on Natural Language Processing (NLP) and Machine Learning (ML). An NLP method (CountVectorizer) was used to extract information from the nucleotide sequences. Four ML algorithms (Logistic Regression, Decision Tree, Gaussian Naïve Bayes, Support Vector Machines) were trained using data from the NLP method along with sequence information. The best-performing model was Support Vector Machines because it had the best ability to discriminate between positive and negative classes. In our model, an Accuracy (A) of 0.995, the fraction of samples that the model correctly classified, and an Area Under the Receiver Operating Characteristic curve (AUROC) of 0.998, the degree to which a model is capable of distinguishing between classes, were observed. The developed AI approach is useful for identifying potential DNA aptamers and reducing the number of rounds in a SELEX selection. This new approach could be applied in the design of DNA libraries and result in a more efficient and faster process for DNA aptamers to be chosen during SELEX.
In this manuscript, the authors explain the development and validation of a novel artificial intelligence approach to support and simplify the early steps of the SELEX process and to help discriminate deoxynucleotide aptamers from unspecified DNA-binding sequences. The approach was implemented based on Natural Language Processing and Machine Learning. CountVectorizer, a Natural Language Processing method, was used to extract information from nucleotide sequences. Four Machine Learning algorithms (Logistic Regression, Decision Tree, Gaussian Naïve Bayes, and Support Vector Machines) were trained using data from the Natural Language Processing method along with sequence information. Of these four trained machine learning algorithms, the best-performing and selected model was Support Vector Machines, because it had the best discriminatory metrics (i.e., Accuracy (A) = 0.995; AUROC (AU) = 0.998). In general, all models showed good metric results for predicting DNA aptamer sequences. The complexity of the Machine Learning model and the difficulty of interpreting it may hinder its application in standard practice. For this reason, the development of a web-app is already taking place to facilitate the interpretation and application of the obtained results.
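The two metrics quoted in this abstract can be made concrete with a small sketch. It computes accuracy as the fraction of correctly classified samples and AUROC through its pairwise interpretation (the probability that a randomly chosen positive receives a higher score than a randomly chosen negative, with ties counted as one half). The function names and the toy labels and scores are hypothetical and are not data from the paper.

```python
def accuracy(labels, predictions):
    """Fraction of samples whose predicted class equals the true class."""
    correct = sum(1 for y, p in zip(labels, predictions) if y == p)
    return correct / len(labels)


def auroc(labels, scores):
    """Probability that a random positive outscores a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUROC needs at least one positive and one negative")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical toy data: 1 = aptamer, 0 = non-aptamer.
labels = [1, 1, 0, 0]
scores = [0.9, 0.4, 0.6, 0.1]
predictions = [1 if s >= 0.5 else 0 for s in scores]
```

The pairwise form is quadratic in the number of samples; it is used here only because it mirrors the definition directly.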
Affiliation(s)
- Frances L. Heredia
  - Department of Chemistry, University of Puerto Rico-Mayagüez Campus, Mayagüez, Puerto Rico, United States of America
- Abiel Roche-Lima
  - Center for Collaborative Research in Health Disparities, University of Puerto Rico-Medical Sciences Campus, San Juan, Puerto Rico, United States of America
- Elsie I. Parés-Matos
  - Department of Chemistry, University of Puerto Rico-Mayagüez Campus, Mayagüez, Puerto Rico, United States of America
12. Predicting aptamer sequences that interact with target proteins using an aptamer-protein interaction classifier and a Monte Carlo tree search approach. PLoS One 2021; 16:e0253760. [PMID: 34170922 PMCID: PMC8232527 DOI: 10.1371/journal.pone.0253760] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 10/29/2020] [Accepted: 06/14/2021] [Indexed: 11/19/2022] Open
Abstract
Oligonucleotide-based aptamers, which have a three-dimensional structure with a single-stranded fragment, feature various characteristics with respect to size, toxicity, and permeability. Accordingly, aptamers are advantageous in terms of diagnosis and treatment and are materials that can be produced through relatively simple experiments. Systematic evolution of ligands by exponential enrichment (SELEX) is one of the most widely used experimental methods for generating aptamers; however, it is highly expensive and time-consuming. To reduce the related costs, recent studies have used in silico approaches, such as aptamer-protein interaction (API) classifiers that use sequence patterns to determine the binding affinity between RNA aptamers and proteins. Some of these methods generate candidate RNA aptamer sequences that bind to a target protein, but they are limited to producing candidates of a specific size. In this study, we present a machine learning approach for selecting candidate sequences of various sizes that have a high binding affinity for a specific sequence of a target protein. We applied the Monte Carlo tree search (MCTS) algorithm for generating the candidate sequences using a score function based on an API classifier. The tree structure that we designed with MCTS enables nucleotide sequence sampling, and the obtained sequences are potential aptamer candidates. We performed a quality assessment using the scores of docking simulations. Our validation datasets revealed that our model showed similar or better docking scores in ZDOCK docking simulations than the known aptamers. We expect that our method, which is size-independent and easy to use, can provide insights into searching for an appropriate aptamer sequence for a target protein during the simulation step of SELEX.
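The idea of sampling nucleotide sequences with Monte Carlo tree search under a score function can be sketched as follows. This is a generic MCTS loop (selection by UCB1, expansion, random rollout, backpropagation), not the authors' implementation: `toy_score` is a stand-in for an API classifier, the exploration constant `c=1.4` is an arbitrary choice, and the sketch fixes the sequence length for simplicity even though the abstract describes a size-independent method.

```python
import math
import random

ALPHABET = "ACGU"


class Node:
    def __init__(self, prefix, parent=None):
        self.prefix = prefix      # partial sequence represented by this node
        self.parent = parent
        self.children = {}        # base -> Node
        self.visits = 0
        self.total = 0.0          # sum of rollout rewards seen through this node


def toy_score(sequence):
    """Placeholder for an API classifier score: fraction of G/C bases."""
    return sum(base in "GC" for base in sequence) / len(sequence)


def ucb1(child, parent_visits, c=1.4):
    """Mean reward plus an exploration bonus for rarely visited children."""
    mean = child.total / child.visits
    return mean + c * math.sqrt(math.log(parent_visits) / child.visits)


def search(length, iterations, score=toy_score, seed=0):
    if length < 1:
        raise ValueError("length must be at least 1")
    rng = random.Random(seed)
    root = Node("")
    best_seq, best_score = None, -1.0
    for _ in range(iterations):
        node = root
        # Selection: descend while the node is fully expanded and not terminal.
        while len(node.prefix) < length and len(node.children) == len(ALPHABET):
            node = max(node.children.values(),
                       key=lambda ch: ucb1(ch, node.visits))
        # Expansion: add one untried base below a non-terminal node.
        if len(node.prefix) < length:
            untried = [b for b in ALPHABET if b not in node.children]
            base = rng.choice(untried)
            child = Node(node.prefix + base, parent=node)
            node.children[base] = child
            node = child
        # Simulation: complete the prefix with random bases and score it.
        tail = "".join(rng.choice(ALPHABET)
                       for _ in range(length - len(node.prefix)))
        rollout = node.prefix + tail
        reward = score(rollout)
        if reward > best_score:
            best_seq, best_score = rollout, reward
        # Backpropagation: update statistics up to the root.
        while node is not None:
            node.visits += 1
            node.total += reward
            node = node.parent
    return best_seq, best_score


candidate, candidate_score = search(length=8, iterations=200)
```

In a setting like the one the abstract describes, `score` would be replaced by the classifier's predicted interaction probability for the candidate sequence and the target protein.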
13. Using an Ensemble to Identify and Classify Macroalgae Antimicrobial Peptides. Interdiscip Sci 2021; 13:321-333. [PMID: 33978916 DOI: 10.1007/s12539-021-00435-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/28/2020] [Revised: 04/27/2021] [Accepted: 04/27/2021] [Indexed: 10/21/2022]
Abstract
The rapid spread of multi-drug resistant microbes has led researchers to discover natural alternative remedies such as antimicrobial peptides (AMPs). In the first line of defense, AMPs display a broad spectrum of potent activity against multi-resistant pathogenic bacteria, viruses, fungi, and even cancer. AMPs can be further characterised into families according to amino acid composition, secondary structure, and function. However, despite recent advancements in rapid computational methods for AMP prediction from various mammalian, aquatic, and terrestrial species, there is limited information regarding their presence, functional roles, and family type from marine macroalgae. In this paper, we present a promising two-tier ensemble of heterogeneous machine learning models that integrates seven well-known machine learning classifiers to predict AMPs from macroalgae. The first tier of the ensemble consists of a suite of binary classifiers that identify AMPs from protein sequence data, which are then forwarded to a second-tier multi-class ensemble to characterise their functional family type. The two-tier ensemble was successfully used to identify 39 putative AMP sequences in 12 macroalgae species from three different phyla groups. The approach we describe is not limited to AMPs and can also be applied to search sequence data for other types of proteins.
|
14
|
Zhang N. Meet Our Editorial Board Member. Curr Med Chem 2021. [DOI: 10.2174/092986732813210504125325] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/22/2022]
|
15
|
Chen Z, Hu L, Zhang BT, Lu A, Wang Y, Yu Y, Zhang G. Artificial Intelligence in Aptamer-Target Binding Prediction. Int J Mol Sci 2021; 22:3605. [PMID: 33808496 PMCID: PMC8038094 DOI: 10.3390/ijms22073605] [Citation(s) in RCA: 37] [Impact Index Per Article: 12.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/04/2021] [Revised: 03/25/2021] [Accepted: 03/26/2021] [Indexed: 12/18/2022] Open
Abstract
Aptamers are short single-stranded DNA, RNA, or synthetic Xeno nucleic acids (XNA) molecules that can interact with corresponding targets with high affinity. Owing to their unique features, including low cost of production, easy chemical modification, high thermal stability, reproducibility, as well as low levels of immunogenicity and toxicity, aptamers can be used as an alternative to antibodies in diagnostics and therapeutics. Systematic evolution of ligands by exponential enrichment (SELEX), an experimental approach for aptamer screening, allows the selection and identification of in vitro aptamers with high affinity and specificity. However, the SELEX process is time consuming and characterization of the representative aptamer candidates from SELEX is rather laborious. Artificial intelligence (AI) could help to rapidly identify the potential aptamer candidates from a vast number of sequences. This review discusses the advancements of AI pipelines/methods, including structure-based and machine/deep learning-based methods, for predicting the binding ability of aptamers to targets. Structure-based methods are the most used in computer-aided drug design. For this part, we review the secondary and tertiary structure prediction methods for aptamers, molecular docking, as well as molecular dynamic simulation methods for aptamer-target binding. We also performed analysis to compare the accuracy of different secondary and tertiary structure prediction methods for aptamers. On the other hand, advanced machine-/deep-learning models have witnessed successes in predicting the binding abilities between targets and ligands in drug discovery and thus potentially offer a robust and accurate approach to predict the binding between aptamers and targets. The research utilizing machine-/deep-learning techniques for prediction of aptamer-target binding is limited currently. Therefore, perspectives for models, algorithms, and implementation strategies of machine/deep learning-based methods are discussed. This review could facilitate the development and application of high-throughput and less laborious in silico methods in aptamer selection and characterization.
Affiliation(s)
- Zihao Chen
- School of Chinese Medicine, The Chinese University of Hong Kong, Hong Kong, China; (Z.C.); (B.-T.Z.)
| | - Long Hu
- Law Sau Fai Institute for Advancing Translational Medicine in Bone & Joint Diseases, School of Chinese Medicine, Hong Kong Baptist University, Hong Kong, China;
| | - Bao-Ting Zhang
- School of Chinese Medicine, The Chinese University of Hong Kong, Hong Kong, China; (Z.C.); (B.-T.Z.)
| | - Aiping Lu
- Institute of Integrated Bioinformedicine and Translational Science, School of Chinese Medicine, Hong Kong Baptist University, Hong Kong, China;
- Guangdong-Hong Kong Macao Greater Bay Area International Research Platform for Aptamer-Based Translational Medicine and Drug Discovery, Hong Kong, China
| | - Yaofeng Wang
- Centre for Regenerative Medicine and Health, Hong Kong Institute of Science & Innovation, Chinese Academy of Sciences, Hong Kong, China
| | - Yuanyuan Yu
- Law Sau Fai Institute for Advancing Translational Medicine in Bone & Joint Diseases, School of Chinese Medicine, Hong Kong Baptist University, Hong Kong, China;
- Guangdong-Hong Kong Macao Greater Bay Area International Research Platform for Aptamer-Based Translational Medicine and Drug Discovery, Hong Kong, China
| | - Ge Zhang
- Law Sau Fai Institute for Advancing Translational Medicine in Bone & Joint Diseases, School of Chinese Medicine, Hong Kong Baptist University, Hong Kong, China;
- Guangdong-Hong Kong Macao Greater Bay Area International Research Platform for Aptamer-Based Translational Medicine and Drug Discovery, Hong Kong, China
|
16
|
Emami N, Ferdousi R. AptaNet as a deep learning approach for aptamer-protein interaction prediction. Sci Rep 2021; 11:6074. [PMID: 33727685 PMCID: PMC7971039 DOI: 10.1038/s41598-021-85629-0] [Citation(s) in RCA: 20] [Impact Index Per Article: 6.7] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/26/2020] [Accepted: 03/03/2021] [Indexed: 02/08/2023] Open
Abstract
Aptamers are short oligonucleotides (DNA/RNA) or peptide molecules that can selectively bind to their specific targets with high specificity and affinity. As a powerful new class of amino acid ligands, aptamers have high potential in biosensing, therapeutic, and diagnostic fields. Here, we present AptaNet, a new deep neural network, to predict the aptamer-protein interaction pairs by integrating features derived from both aptamers and the target proteins. Aptamers were encoded by using two different strategies, including k-mer and reverse complement k-mer frequency. Amino acid composition (AAC) and pseudo amino acid composition (PseAAC) were applied to represent target information using 24 physicochemical and conformational properties of the proteins. To handle the imbalance problem in the data, we applied a neighborhood cleaning algorithm. The predictor was constructed based on a deep neural network, and optimal features were selected using the random forest algorithm. As a result, 99.79% accuracy was achieved for the training dataset, and 91.38% accuracy was obtained for the testing dataset. AptaNet achieved high performance on our constructed aptamer-protein benchmark dataset. The results indicate that AptaNet can help identify novel aptamer-protein interacting pairs and build more-efficient insights into the relationship between aptamers and proteins. Our benchmark dataset and the source codes for AptaNet are available at: https://github.com/nedaemami/AptaNet .
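The two aptamer encodings named in this abstract (k-mer frequency and reverse complement k-mer frequency) can be illustrated with a minimal sketch. This is not the AptaNet code: the function names, the normalization by the number of windows, and the choice of the lexicographically smaller k-mer as the merged key are all assumptions made for illustration.

```python
from collections import Counter
from itertools import product

# DNA complement table; RNA input is handled by mapping U to T first.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def kmer_frequencies(seq, k):
    """Relative frequency of each of the 4**k possible k-mers in seq.

    Assumption: frequencies are normalized by the number of sliding
    windows; windows containing letters other than A/C/G/T count toward
    the total but have no entry of their own.
    """
    seq = seq.upper().replace("U", "T")
    windows = max(len(seq) - k + 1, 0)
    counts = Counter(seq[i:i + k] for i in range(windows))
    total = max(windows, 1)
    return {"".join(p): counts["".join(p)] / total
            for p in product("ACGT", repeat=k)}


def revcomp_kmer_frequencies(seq, k):
    """Merge each k-mer with its reverse complement into one feature.

    Assumption: the lexicographically smaller of the pair is the key.
    """
    merged = {}
    for kmer, freq in kmer_frequencies(seq, k).items():
        key = min(kmer, reverse_complement(kmer))
        merged[key] = merged.get(key, 0.0) + freq
    return merged
```

For k = 2 the plain encoding yields 16 features and the merged encoding yields 10, because four 2-mers (AT, TA, CG, GC) are their own reverse complement.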
Affiliation(s)
- Neda Emami
- Department of Health Information Technology, School of Management and Medical Informatics, Tabriz University of Medical Sciences, Tabriz, Iran
| | - Reza Ferdousi
- Department of Health Information Technology, School of Management and Medical Informatics, Tabriz University of Medical Sciences, Tabriz, Iran.
- Research Center for Pharmaceutical Nanotechnology, Biomedicine Institute, Tabriz University of Medical Sciences, Tabriz, Iran.
|
17
|
Torkamanian-Afshar M, Nematzadeh S, Tabarzad M, Najafi A, Lanjanian H, Masoudi-Nejad A. In silico design of novel aptamers utilizing a hybrid method of machine learning and genetic algorithm. Mol Divers 2021; 25:1395-1407. [PMID: 33554306 DOI: 10.1007/s11030-021-10192-9] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/11/2020] [Accepted: 01/28/2021] [Indexed: 12/29/2022]
Abstract
Aptamers can be regarded as efficient substitutes for monoclonal antibodies in many diagnostic and therapeutic applications. Because SELEX (systematic evolution of ligands by exponential enrichment) is tedious and prohibitively costly, in silico methods have been developed to improve the rate of the enrichment process. However, most of these methods make no attempt to design novel aptamers. Moreover, some target proteins may not have any binding RNA candidates in nature, and a reductive mechanism is needed to generate novel aptamer pools, from among the enormous number of possible nucleotide combinations, to be examined in vitro. We have applied a genetic algorithm (GA) with an embedded binding predictor fitness function to the in silico design of RNA aptamers. As a case study of this research, all steps were accomplished to generate an aptamer pool against the aminopeptidase N (CD13) biomarker. First, the model was developed based on sequential and structural features of known RNA-protein complexes. Then, using RNA sequences involved in complexes with positive prediction results as the first generation, novel aptamers were designed and top-ranked sequences were selected. A 76-mer aptamer was identified with the highest fitness value, with a score 3 to 6 times higher than that of the parent oligonucleotides. The reliability of the obtained sequences was confirmed using docking and molecular dynamics simulation. The proposed method provides an important simplified contribution to the oligonucleotide-aptamer design process. It can also serve as a basis for designing novel aptamers against a wide range of biomarkers.
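The general shape of a genetic algorithm driven by a fitness function, as described in this abstract, might look like the sketch below. It is only an illustration under stated assumptions: `toy_fitness` is a made-up placeholder (GC fraction) standing in for the trained binding predictor, and truncation selection, one-point crossover, and the mutation rate are choices made here, not details taken from the paper.

```python
import random

ALPHABET = "ACGU"


def toy_fitness(seq):
    # PLACEHOLDER (assumption): GC fraction, used only so the sketch runs.
    # In the abstract, fitness comes from a trained binding predictor.
    return sum(base in "GC" for base in seq) / len(seq)


def crossover(a, b, rng):
    """One-point crossover: prefix of a joined to suffix of b."""
    point = rng.randrange(1, min(len(a), len(b)))
    return a[:point] + b[point:]


def mutate(seq, rate, rng):
    """Redraw each base at random with probability `rate`."""
    return "".join(rng.choice(ALPHABET) if rng.random() < rate else base
                   for base in seq)


def evolve(population, fitness, generations=30, mutation_rate=0.05, seed=0):
    """Evolve a pool of sequences; returns the final pool, best first."""
    rng = random.Random(seed)
    size = len(population)
    for _ in range(generations):
        ranked = sorted(population, key=fitness, reverse=True)
        # Truncation selection: the top half survives unchanged (elitism).
        parents = ranked[:max(2, size // 2)]
        children = [
            mutate(crossover(*rng.sample(parents, 2), rng), mutation_rate, rng)
            for _ in range(size - len(parents))
        ]
        population = parents + children
    return sorted(population, key=fitness, reverse=True)
```

Because the best parents are carried over unchanged, the top fitness in the pool can never decrease from one generation to the next; swapping `toy_fitness` for a real predictor only changes the `fitness` argument.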
Affiliation(s)
- Mahsa Torkamanian-Afshar
- Department of Bioinformatics, Kish International Campus, University of Tehran, Kish Island, Iran.,Laboratory of Systems Biology and Bioinformatics (LBB), Institute of Biochemistry and Biophysics, University of Tehran, Tehran, Iran.,Department of Computer Technologies, Beykent University, Istanbul, Turkey
| | - Sajjad Nematzadeh
- Department of Computer Technologies, Beykent University, Istanbul, Turkey
| | - Maryam Tabarzad
- Protein Technology Research Center, Shahid Beheshti University of Medical Sciences, Tehran, Iran
| | - Ali Najafi
- Molecular Biology Research Center, Systems Biology and Poisonings Institute, Tehran, Iran
| | - Hossein Lanjanian
- Cellular and Molecular Endocrine Research Center, Research Institute for Endocrine Sciences, Shahid Beheshti University of Medical Sciences, Tehran, Iran
| | - Ali Masoudi-Nejad
- Department of Bioinformatics, Kish International Campus, University of Tehran, Kish Island, Iran. .,Laboratory of Systems Biology and Bioinformatics (LBB), Institute of Biochemistry and Biophysics, University of Tehran, Tehran, Iran.
|
18
|
Lee W, Han K. Constructive Prediction of Potential RNA Aptamers for a Protein Target. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2020; 17:1476-1482. [PMID: 31689200 DOI: 10.1109/tcbb.2019.2951114] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/10/2023]
Abstract
Aptamers are short single-stranded nucleic acids that bind to target molecules with high affinity and selectivity. Aptamers are generally identified in vitro by performing SELEX (systematic evolution of ligands by exponential enrichment). Complementing the SELEX process, several computational methods have been proposed in the search for aptamers. However, many of these methods cannot be applied for finding new aptamers, either because they are classifiers for determining whether an RNA and protein interact with each other, or because they are limited to a specific target only. Hence, we developed a new random forest (RF) model for finding potential RNA aptamers for a protein target. From an extensive analysis of protein-RNA complexes including RNA aptamers-protein complexes, we identified key features of interacting RNA and protein molecules, and structural constraints on RNA aptamers. The potential RNA aptamers predicted by our method reveal similar secondary and protein-binding structures as the actual RNA aptamers. The RF model showed a reliable performance in both cross validations and independent testing. The key features of interacting RNA and protein molecules and the structural constraints identified in our study were effective in finding potential aptamers for a protein target. Although preliminary, our results are promising, and we believe this approach will be useful in reducing time and money spent on in vitro experiments by substantially limiting the size of the initial pool of nucleic acid sequences.
|
19
|
Volk MJ, Lourentzou I, Mishra S, Vo LT, Zhai C, Zhao H. Biosystems Design by Machine Learning. ACS Synth Biol 2020; 9:1514-1533. [PMID: 32485108 DOI: 10.1021/acssynbio.0c00129] [Citation(s) in RCA: 52] [Impact Index Per Article: 13.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/16/2022]
Abstract
Biosystems such as enzymes, pathways, and whole cells have been increasingly explored for biotechnological applications. However, the intricate connectivity and resulting complexity of biosystems poses a major hurdle in designing biosystems with desirable features. As -omics and other high throughput technologies have been rapidly developed, the promise of applying machine learning (ML) techniques in biosystems design has started to become a reality. ML models enable the identification of patterns within complicated biological data across multiple scales of analysis and can augment biosystems design applications by predicting new candidates for optimized performance. ML is being used at every stage of biosystems design to help find nonobvious engineering solutions with fewer design iterations. In this review, we first describe commonly used models and modeling paradigms within ML. We then discuss some applications of these models that have already shown success in biotechnological applications. Moreover, we discuss successful applications at all scales of biosystems design, including nucleic acids, genetic circuits, proteins, pathways, genomes, and bioprocesses. Finally, we discuss some limitations of these methods and potential solutions as well as prospects of the combination of ML and biosystems design.
|
20
|
Li J, Ma X, Li X, Gu J. PPAI: a web server for predicting protein-aptamer interactions. BMC Bioinformatics 2020; 21:236. [PMID: 32517696 PMCID: PMC7285591 DOI: 10.1186/s12859-020-03574-7] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Abstract] [Key Words] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/03/2019] [Accepted: 05/28/2020] [Indexed: 01/09/2023] Open
Abstract
BACKGROUND: The interactions between proteins and aptamers are prevalent in organisms and play an important role in various life activities. Thanks to the rapid accumulation of protein-aptamer interaction data, it is both necessary and feasible to construct an accurate and effective computational model to predict aptamers binding to specific proteins of interest and to predict protein-aptamer interactions, which is beneficial for understanding the mechanisms of protein-aptamer interactions and improving aptamer-based therapies. RESULTS: In this study, a novel web server named PPAI is developed to predict aptamers and protein-aptamer interactions using key sequence features of proteins/aptamers and a machine learning framework integrating AdaBoost and random forest. A new method for extracting several key sequence features of both proteins and aptamers is presented, where the features for proteins are extracted from amino acid composition, pseudo-amino acid composition, grouped amino acid composition, C/T/D composition and sequence-order-coupling number, while the features for aptamers are extracted from nucleotide composition, pseudo-nucleotide composition (PseKNC) and normalized Moreau-Broto autocorrelation coefficient. On the basis of these feature sets, and after balancing the samples with the SMOTE algorithm, we validate the performance of PPAI on the independent test set. The results demonstrate that the Area Under Curve (AUC) is 0.907 for prediction of aptamers, while the AUC reaches 0.871 for prediction of protein-aptamer interactions. CONCLUSION: These results indicate that PPAI can query aptamers and proteins, predict aptamers and predict protein-aptamer interactions in batch mode precisely and efficiently, making it a novel bioinformatics tool for research on protein-aptamer interactions. The PPAI web server is freely available at http://39.96.85.9/PPAI.
Affiliation(s)
- Jianwei Li
- Institute of Computational Medicine, School of Artificial Intelligence, Hebei University of Technology, Tianjin, China. .,Tianjin Key Laboratory of Bioelectromagnetic Technology and Intelligent Health, Hebei University of Technology, Tianjin, China.
| | - Xiaoyu Ma
- Institute of Computational Medicine, School of Artificial Intelligence, Hebei University of Technology, Tianjin, China
| | - Xichuan Li
- Tianjin Key Laboratory of Animal and Plant Resistance, College of Life Sciences, Tianjin Normal University, Tianjin, China
| | - Junhua Gu
- Institute of Computational Medicine, School of Artificial Intelligence, Hebei University of Technology, Tianjin, China
|
21
|
Emami N, Pakchin PS, Ferdousi R. Computational predictive approaches for interaction and structure of aptamers. J Theor Biol 2020; 497:110268. [PMID: 32311376 DOI: 10.1016/j.jtbi.2020.110268] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Reference Citation Analysis] [Abstract] [Key Words] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/10/2020] [Revised: 03/27/2020] [Accepted: 04/02/2020] [Indexed: 02/07/2023]
Abstract
Aptamers are short single-strand sequences that can bind to their specific targets with high affinity and specificity. Usually, aptamers are selected experimentally via systematic evolution of ligands by exponential enrichment (SELEX), an evolutionary process that consists of multiple cycles of selection and amplification. The SELEX process is expensive and time-consuming, and its success rates are relatively low. To overcome these difficulties, several computational techniques that bring together different disciplines and branches of technology have been developed in aptamer science in recent years. This paper presents a complementary review of computational predictive approaches for aptamers. In general, these approaches fall into two main categories: interaction-based prediction and structure-based prediction. The available software packages and toolkits in this area are also reviewed. The aim of describing these computational methods and tools is to help aptamer scientists use them to develop more accurate and more sensitive aptamers.
Affiliation(s)
- Neda Emami
- Department of Health Information Technology, School of Management and Medical Informatics, Tabriz University of Medical Sciences, Tabriz, Iran
| | - Parvin Samadi Pakchin
- Research Center for Pharmaceutical Nanotechnology, Biomedicine Institute, Tabriz University of Medical Sciences, Tabriz, Iran
| | - Reza Ferdousi
- Department of Health Information Technology, School of Management and Medical Informatics, Tabriz University of Medical Sciences, Tabriz, Iran; Research Center for Pharmaceutical Nanotechnology, Biomedicine Institute, Tabriz University of Medical Sciences, Tabriz, Iran.
|
22
|
Yang Q, Jia C, Li T. Prediction of aptamer-protein interacting pairs based on sparse autoencoder feature extraction and an ensemble classifier. Math Biosci 2019; 311:103-108. [PMID: 30880100 DOI: 10.1016/j.mbs.2019.01.009] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [Key Words] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/10/2018] [Revised: 01/29/2019] [Accepted: 01/29/2019] [Indexed: 10/27/2022]
Abstract
Aptamer-protein interacting pairs play important roles in physiological functions and structural characterization. Identifying aptamer-protein interacting pairs is challenging and limited, despite the tremendous applications of aptamers. Therefore, it is vital to construct a high prediction performance model for identifying aptamer-target interacting pairs. In this study, a novel ensemble method is presented to predict aptamer-protein interacting pairs by integrating sequence characteristics derived from aptamers and the target proteins. The features extracted for aptamers were the compositions of amino acids and pseudo K-tuple nucleotides. In addition, a sparse autoencoder was used to characterize features for the target protein sequences. To remove redundant features, gradient boosting decision tree (GBDT) and incremental feature selection (IFS) methods were used to obtain the optimum combination of sequence characters. Based on 616 selected features, an ensemble of three sub-support vector machine (SVM) classifiers was used to construct our prediction model. Evaluated on an independent dataset, our predictor obtained an accuracy of 75.7%, Matthew's Correlation Coefficient of 0.478, and Youden's Index of 0.538, which were superior to the values reached using other existing predictors. The results show that our model can be used to distinguish novel aptamer-protein interacting pairs and to reveal the interrelation between aptamers and proteins.
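The three figures quoted in this abstract (accuracy, Matthews correlation coefficient, Youden's Index) are all functions of a binary confusion matrix. The sketch below uses the standard definitions; the function name is an assumption, and any counts passed to it in examples are invented, since the abstract does not report the underlying confusion matrix.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Standard binary-classification metrics from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)   # true positive rate
    specificity = tn / (tn + fp)   # true negative rate
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # MCC is conventionally set to 0 when a row or column of the matrix is empty.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "mcc": mcc,
        "youden": sensitivity + specificity - 1,
    }
```

With made-up counts such as `binary_metrics(tp=80, fp=30, tn=70, fn=20)`, sensitivity is 0.8 and specificity is 0.7, so Youden's Index is 0.5 and accuracy is 0.75.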
Affiliation(s)
- Qing Yang
- Institute of Environmental Systems Biology, College of Environmental and Engineering, Dalian Maritime University, No. 1 Linghai Road, Dalian 116026, China
| | - Cangzhi Jia
- School of Science, Dalian Maritime University, No. 1 Linghai Road, Dalian 116026, China
| | - Taoying Li
- Department of Maritime Economics and Management, Dalian Maritime University, No. 1 Linghai Road, Dalian 116026, China.
|
23
|
Yu X, Wang Y, Yang H, Huang X. Prediction of the binding affinity of aptamers against the influenza virus. SAR AND QSAR IN ENVIRONMENTAL RESEARCH 2019; 30:51-62. [PMID: 30638061 DOI: 10.1080/1062936x.2018.1558416] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 10/24/2018] [Indexed: 06/09/2023]
Abstract
Thousands of investigations on quantitative structure-activity/property relationships (QSARs/QSPRs) have been reported. However, few publications can be found that deal with QSARs for aptamers, because calculating two-dimensional and three-dimensional descriptors directly from aptamers (typically with 15-45 nucleotides) is difficult. This paper describes calculating molecular descriptors from amino acid sequences that are translated from DNA aptamer sequences with DNAMAN software, and developing QSAR models for the aptamers' binding affinity to the influenza virus. General regression neural network (GRNN) based on Parzen windows estimation was used to build the QSAR model by applying six molecular descriptors. The optimal spreading factor σ of Gaussian function of 0.3 was obtained with the circulation method. The correlation coefficients r from the GRNN model were 0.889 for the training set and 0.892 for the test set. Compared with the existing model for aptamers' binding affinity to the influenza virus, our model is accurate and competes favourably. The feasibility of calculating molecular descriptors from an amino acid sequence translated from DNA aptamer sequences to develop a QSAR model for the anti-influenza aptamers was demonstrated.
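A general regression neural network of the kind mentioned in this abstract amounts to a Gaussian-kernel weighted average of the training targets. The sketch below shows that core computation only; the function name and the fallback for vanishing weights are assumptions, the default spread of 0.3 simply mirrors the value quoted in the abstract, and the paper's six descriptors and data are not reproduced.

```python
import math


def grnn_predict(x, train_x, train_y, sigma=0.3):
    """Predict y at x as a Gaussian-kernel weighted mean of train_y.

    x        : list of descriptor values for the query sample
    train_x  : list of descriptor lists for the training samples
    train_y  : list of training targets (e.g. binding affinities)
    sigma    : spreading factor of the Gaussian kernel
    """
    weights = []
    for xi in train_x:
        squared_distance = sum((a - b) ** 2 for a, b in zip(x, xi))
        weights.append(math.exp(-squared_distance / (2 * sigma ** 2)))
    total = sum(weights)
    if total == 0.0:
        # Assumption: fall back to the plain mean if every weight underflows.
        return sum(train_y) / len(train_y)
    return sum(w * y for w, y in zip(weights, train_y)) / total
```

A query point midway between two training samples receives equal weights and therefore the mean of their targets; a smaller sigma makes the prediction track the nearest training sample more closely.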
Affiliation(s)
- X Yu
- a Hunan Provincial Key Laboratory of Environmental Catalysis & Waste Regeneration, College of Chemistry and Chemical Engineering, Hunan Institute of Engineering , Xiangtan , China
- b State Key Laboratory of Chemo/Biosensing and Chemometrics, Hunan University , Changsha , China
| | - Y Wang
- a Hunan Provincial Key Laboratory of Environmental Catalysis & Waste Regeneration, College of Chemistry and Chemical Engineering, Hunan Institute of Engineering , Xiangtan , China
| | - H Yang
- a Hunan Provincial Key Laboratory of Environmental Catalysis & Waste Regeneration, College of Chemistry and Chemical Engineering, Hunan Institute of Engineering , Xiangtan , China
| | - X Huang
- a Hunan Provincial Key Laboratory of Environmental Catalysis & Waste Regeneration, College of Chemistry and Chemical Engineering, Hunan Institute of Engineering , Xiangtan , China
|
24
|
Yu X, Yang H, Huang X. Novel Method for Structure-Activity Relationship of Aptamer Sequences for Human Prostate Cancer. ACS OMEGA 2018; 3:10002-10007. [PMID: 31459128 PMCID: PMC6644987 DOI: 10.1021/acsomega.8b01464] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Received: 06/26/2018] [Accepted: 08/20/2018] [Indexed: 06/10/2023]
Abstract
Prostate cancer (PCa) is one of the most common malignancies in men and seriously threatens men's health. Developing aptamer probes for PCa cells is of great significance for early diagnosis and treatment of PCa. This paper reports a classification model for SELEX-based aptamers, which were obtained with PCa cell line PCa-3M-1E8 (highly metastatic tumor cell) as target cells and PCa cell line PCa-3M-2B4 (low metastatic tumor cell) as control cells. On the basis of the SELEX principle, 100 oligonucleotide sequences from the 3rd round of SELEX were defined as low affinity and specificity aptamers, and 100 sequences from the 11th round were set as high affinity and specificity aptamers. Seven molecular descriptors were used for the classification model, which were calculated from amino acid sequences translated from DNA aptamer sequences with DNAMAN software. The classification model based on binary logical regression analysis has prediction accuracies, sensitivity, and specificity of about 80% for both the training set and test set. Therefore, it is feasible to calculate molecular descriptors from amino acid sequence translated from DNA aptamer sequences and develop a classification model for PCa cell line PCa-3M-1E8.
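The step of translating a DNA aptamer sequence into an amino acid sequence, from which descriptors are then computed, can be sketched as follows. The abstract performs this step with DNAMAN and does not state the settings, so the standard genetic code, reading frame 0, "*" for stop codons, and "X" for unrecognized codons are all assumptions of this illustration.

```python
from itertools import product

# Standard genetic code in the conventional TCAG ordering (first base slowest).
BASES = "TCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*W"
               "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR"
               "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def translate(dna, frame=0):
    """Translate a DNA string codon by codon, starting at offset `frame`.

    Stop codons become "*", codons with unknown letters become "X",
    and trailing bases that do not fill a codon are ignored.
    """
    dna = dna.upper()
    codons = (dna[i:i + 3] for i in range(frame, len(dna) - 2, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)
```

For example, `translate("ATGGCCTAA")` gives `"MA*"`: ATG codes for methionine, GCC for alanine, and TAA is a stop codon.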
Affiliation(s)
- Xinliang Yu
- College of Chemistry and Chemical Engineering, Hunan Institute of Engineering, Xiangtan, Hunan 411104, China
- State Key Laboratory of Chemo/Biosensing and Chemometrics, Hunan University, Changsha, Hunan 410082, China
| | - Huiqiong Yang
- College of Chemistry and Chemical Engineering, Hunan Institute of Engineering, Xiangtan, Hunan 411104, China
| | - Xianwei Huang
- College of Chemistry and Chemical Engineering, Hunan Institute of Engineering, Xiangtan, Hunan 411104, China
|
25
|
Affinity capture of aflatoxin B1 and B2 by aptamer-functionalized magnetic agarose microspheres prior to their determination by HPLC. Mikrochim Acta 2018; 185:326. [PMID: 29896649 DOI: 10.1007/s00604-018-2849-8] [Citation(s) in RCA: 21] [Impact Index Per Article: 3.5] [Reference Citation Analysis] [Abstract] [Key Words] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/28/2018] [Accepted: 05/22/2018] [Indexed: 10/14/2022]
Abstract
A novel adsorbent is described for magnetic solid-phase extraction (MSPE) of the aflatoxins AFB1 and AFB2 (AFBs). Magnetic agarose microspheres (MAMs) were functionalized with an aptamer to bind the AFBs which then were quantified by HPLC and on-line post-column photochemical derivatization with fluorescence detection. Streptavidin-conjugated MAMs were synthesized first by a highly reproducible strategy. They possess strong magnetism and high surface area. The MAMs were characterized by transmission electron microscopy, scanning electron microscopy, optical microscopy, laser diffraction particle size analyzer, Fourier transform infrared spectrometry, vibrating sample magnetometry and laser scanning confocal microscopy. Then, the AFB-aptamers were immobilized on MAMs through biotin-streptavidin interaction. Finally, the MSPE is performed by suspending the aptamer-modified MAMs in the sample. They are then collected by an external magnetic field and the AFBs are eluted with methanol/buffer (20:80). Several parameters affecting the coupling, capturing and eluting efficiency were optimized. Under the optimized conditions, the method is fast, has good linearity, high selectivity, and sensitivity. The LODs are 25 pg·mL⁻¹ for AFB1 and 10 pg·mL⁻¹ for AFB2. The binding capacity is 350 ± 8 ng·g⁻¹ for AFB1 and 384 ± 8 ng·g⁻¹ for AFB2, and the precision of the assay is <8%. The method was successfully applied to the analysis of AFBs in spiked maize samples. Graphical abstract: Schematic of novel aptamer functionalized magnetic agarose microspheres (Apt-MAM) as magnetic adsorbents for simultaneous and specific affinity capture of aflatoxins B1 and B2 (AFBs).
|
26
|
Huang G, Li J, Zhao C. Computational Prediction and Analysis of Associations between Small Molecules and Binding-Associated S-Nitrosylation Sites. Molecules 2018; 23:954. [PMID: 29671802 PMCID: PMC6017196 DOI: 10.3390/molecules23040954] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/24/2018] [Revised: 03/30/2018] [Accepted: 04/09/2018] [Indexed: 01/12/2023] Open
Abstract
Interactions between drugs and proteins occupy a central position during the process of drug discovery and development. Numerous methods have recently been developed for identifying drug–target interactions, but few have been devoted to finding interactions between post-translationally modified proteins and drugs. We presented a machine learning-based method for identifying associations between small molecules and binding-associated S-nitrosylated (SNO-) proteins. Namely, small molecules were encoded by molecular fingerprint, SNO-proteins were encoded by the information entropy-based method, and the random forest was used to train a classifier. Ten-fold and leave-one-out cross validations achieved, respectively, 0.7235 and 0.7490 of the area under a receiver operating characteristic curve. Computational analysis of similarity suggested that SNO-proteins associated with the same drug shared statistically significant similarity, and vice versa. This method and finding are useful to identify drug–SNO associations and further facilitate the discovery and development of SNO-associated drugs.
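The cross-validation results in this abstract are reported as areas under the receiver operating characteristic curve. As a reminder of what that number measures, the sketch below computes AUC directly from its pairwise definition: the probability that a randomly chosen positive is scored above a randomly chosen negative, with ties counted as one half. The function name is an assumption and the code is not taken from the paper.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of scores.

    labels : iterable of 0/1 class labels
    scores : iterable of classifier scores (higher means more positive)
    """
    positives = [s for label, s in zip(labels, scores) if label == 1]
    negatives = [s for label, s in zip(labels, scores) if label == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in positives for n in negatives)
    return wins / (len(positives) * len(negatives))
```

This quadratic-time form is fine for small folds; an AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation.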
Affiliation(s)
- Guohua Huang
- Provincial Key Laboratory of Informational Service for Rural Area of Southwestern Hunan, Shaoyang University, Shaoyang 422000, China.
- College of Information Engineering, Shaoyang University, Shaoyang 422000, China.
| | - Jincheng Li
- Provincial Key Laboratory of Informational Service for Rural Area of Southwestern Hunan, Shaoyang University, Shaoyang 422000, China.
- College of Information Engineering, Shaoyang University, Shaoyang 422000, China.
| | - Chenglin Zhao
- Provincial Key Laboratory of Informational Service for Rural Area of Southwestern Hunan, Shaoyang University, Shaoyang 422000, China.
- College of Information Engineering, Shaoyang University, Shaoyang 422000, China.
|
27
|
Li L, Li J, Xiao W, Li Y, Qin Y, Zhou S, Yang H. Prediction the Substrate Specificities of Membrane Transport Proteins Based on Support Vector Machine and Hybrid Features. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2016; 13:947-953. [PMID: 26571537 DOI: 10.1109/tcbb.2015.2495140] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 02/05/2023]
|
28
|
Prediction of aptamer-protein interacting pairs using an ensemble classifier in combination with various protein sequence attributes. BMC Bioinformatics 2016; 17:225. [PMID: 27245069 PMCID: PMC4888498 DOI: 10.1186/s12859-016-1087-5] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.9] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/07/2016] [Accepted: 05/17/2016] [Indexed: 02/05/2023] Open
Abstract
Background: Aptamer-protein interacting pairs serve a variety of physiological functions and have therapeutic potential in organisms. Rapid and effective prediction of aptamer-protein interacting pairs is important for designing aptamers that bind to proteins of interest, which will give insight into the mechanisms of aptamer-protein interaction and support the development of aptamer-based therapies.
Results: In this study, an ensemble method is presented to predict aptamer-protein interacting pairs with hybrid features. The features for aptamers are extracted from Pseudo K-tuple Nucleotide Composition (PseKNC), while the features for proteins incorporate Discrete Cosine Transformation (DCT), disorder information, and bi-gram Position Specific Scoring Matrix (PSSM). We investigate the predictive capabilities of various feature spaces. The proposed ensemble method obtains its best performance, a Youden's Index of 0.380, using the hybrid feature space of PseKNC, DCT, bi-gram PSSM, and disorder information under 10-fold cross-validation. The Relief-Incremental Feature Selection (IFS) method is adopted to obtain the optimal feature set. Based on the optimal feature set, the proposed method achieves a balanced performance with a sensitivity of 0.753 and a specificity of 0.725 on the training dataset, which indicates that this method can handle the imbalanced data problem effectively. To evaluate the prediction performance objectively, an independent testing dataset is used. Encouragingly, the proposed method performs better than the previous study, with a sensitivity of 0.738 and a Youden's Index of 0.451.
Conclusions: These results suggest that the proposed method is a potential candidate for aptamer-protein interacting pair prediction, which may contribute to finding novel aptamer-protein interacting pairs and understanding the relationship between aptamers and proteins.
Electronic supplementary material: The online version of this article (doi:10.1186/s12859-016-1087-5) contains supplementary material, which is available to authorized users.
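The abstract above reports performance as sensitivity, specificity, and Youden's Index. As a reading aid, the following minimal sketch shows how these three quantities relate, using the standard definition J = sensitivity + specificity - 1. The function names are illustrative and not from the cited paper, and the two derived values (a training-set J and an independent-test specificity) are implied by the reported numbers under that definition; they are not stated in the abstract.

```python
def youden_index(sensitivity: float, specificity: float) -> float:
    """Youden's J statistic: sensitivity + specificity - 1."""
    return sensitivity + specificity - 1.0


def specificity_from_youden(j: float, sensitivity: float) -> float:
    """Rearrange J = sensitivity + specificity - 1 to recover specificity."""
    return j - sensitivity + 1.0


# Training set with the optimal feature set: sensitivity 0.753, specificity 0.725
# (both reported in the abstract). The resulting J is derived here, not reported.
train_j = youden_index(0.753, 0.725)

# Independent test set: sensitivity 0.738 and J 0.451 are reported;
# the specificity below is only what those two numbers imply.
test_specificity = specificity_from_youden(0.451, 0.738)

print(f"implied training J: {train_j:.3f}")
print(f"implied test specificity: {test_specificity:.3f}")
```

Because J weights sensitivity and specificity equally, it is less flattering than raw accuracy on imbalanced data, which is presumably why the authors report it.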
|
29
|
Lin S, Gan N, Cao Y, Chen Y, Jiang Q. Selective dispersive solid phase extraction-chromatography tandem mass spectrometry based on aptamer-functionalized UiO-66-NH2 for determination of polychlorinated biphenyls. J Chromatogr A 2016; 1446:34-40. [PMID: 27083256 DOI: 10.1016/j.chroma.2016.04.016] [Citation(s) in RCA: 49] [Impact Index Per Article: 6.1] [Received: 01/21/2016] [Revised: 04/05/2016] [Accepted: 04/06/2016] [Indexed: 10/22/2022]
Abstract
In this paper, a novel dispersive solid phase extraction (dSPE) adsorbent based on an aptamer-functionalized magnetic metal-organic framework material was developed for selective enrichment of trace polychlorinated biphenyls (PCBs) from soil samples. First, we developed a simple, versatile synthetic strategy to prepare highly reproducible magnetic amino-functionalized UiO-66 (Fe3O4@PDA@UiO-66-NH2) by using polydopamine (PDA) as a covalent linker. Then, amino-functionalized aptamers that recognize 2,3',5,5'-tetrachlorobiphenyl (PCB72) and 2',3',4',5,5'-pentachlorobiphenyl (PCB106) were covalently immobilized on UiO-66-NH2 using glutaraldehyde as the coupling reagent. The aptamer-functionalized adsorbent (Fe3O4@PDA@UiO-66-Apt) can specifically capture PCBs from a complex matrix with high adsorption capacity, based on the specific affinity of the aptamer for its target. Moreover, the adsorbent can be easily isolated from the solution by magnetic separation after extraction. Detection was then carried out with gas chromatography tandem mass spectrometry (GC-MS). The selective dSPE pretreatment coupled with GC-MS showed high selectivity, good binding capacity, stability, repeatability and reproducibility for the extraction of PCBs. Furthermore, the adsorbent showed good mechanical stability and could be reused for at least 60 extraction cycles with recovery over 80%. The method provided a linear range of 0.02-400 ng mL⁻¹ with a good correlation coefficient (R² = 0.9994-0.9996), and the limit of detection was 0.010-0.015 ng mL⁻¹. The method was successfully used for the determination of PCBs in soil samples.
Affiliation(s)
- Saichai Lin
- The State Key Laboratory Base of Novel Functional Materials and Preparation Science, Faculty of Material Science and Chemical Engineering, Ningbo University, Ningbo 315211, China
- Ning Gan
- The State Key Laboratory Base of Novel Functional Materials and Preparation Science, Faculty of Material Science and Chemical Engineering, Ningbo University, Ningbo 315211, China
- Yuting Cao
- The State Key Laboratory Base of Novel Functional Materials and Preparation Science, Faculty of Material Science and Chemical Engineering, Ningbo University, Ningbo 315211, China
- Yinji Chen
- Department of Food Science and Engineering/Collaborative Innovation Center for Modern Grain Circulation and Safety, Nanjing University of Finance and Economics, Nanjing 210007, China
- Qianli Jiang
- Department of Hematology, Nanfang Hospital, Southern Medical University, Guangzhou 510515, China
|
30
|
Jokar M, Safaralizadeh MH, Hadizadeh F, Rahmani F, Kalani MR. Design and evaluation of an apta-nano-sensor to detect Acetamiprid in vitro and in silico. J Biomol Struct Dyn 2016; 34:2505-17. [PMID: 26609886 DOI: 10.1080/07391102.2015.1123188] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.3] [Indexed: 10/22/2022]
Abstract
Pesticide detection is a main concern of food safety experts. Therefore, an accurate, rapid, and cheap test is urgently needed. Biosensors that detect pesticide residues could replace current methods, such as HPLC or GC-MS. This research designs a biosensor that uses an aptamer (single-stranded DNA oligonucleotide) as the receptor, silver nanoparticles (AgNPs) as optical sensors, and salt (NaCl) as the aggregation inducer of the AgNPs to detect the presence of Acetamiprid. After optimization, 0.6 μM aptamer and 100 mM salt were employed. The selectivity and sensitivity of the complex were examined with different pesticides and different Acetamiprid concentrations. To simulate the in vitro experimental conditions, bioinformatics software was used for in silico analysis. The results showed selective detection of Acetamiprid at the 0.02 ppm (89.8 nM) level. Docking outputs identified two loops as active sites in the aptamer and confirmed aptamer-Acetamiprid binding. Circular dichroism (CD) spectroscopy confirmed that, upon Acetamiprid binding, the aptamer folded through stem-loop formation. The stability of the Apt-Acetamiprid complex in simulated aqueous media was examined by molecular dynamics studies.
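The abstract above quotes the same detection level in two units, ppm and nM. The conversion can be checked with a short calculation. This is a sketch under two assumptions that are not stated in the abstract: that 1 ppm corresponds to 1 mg/L (reasonable for dilute aqueous solutions), and that the molar mass of Acetamiprid (C10H11ClN4) is about 222.67 g/mol. The function name is illustrative.

```python
# Assumed molar mass of Acetamiprid (C10H11ClN4), in g/mol; not given in the abstract.
ACETAMIPRID_MOLAR_MASS = 222.67


def ppm_to_nanomolar(ppm: float, molar_mass_g_per_mol: float) -> float:
    """Convert a mass concentration in ppm to nM.

    Assumes a dilute aqueous solution, so that 1 ppm is taken as 1 mg/L.
    """
    grams_per_litre = ppm * 1e-3              # mg/L -> g/L
    mol_per_litre = grams_per_litre / molar_mass_g_per_mol
    return mol_per_litre * 1e9                # mol/L -> nmol/L


detection_level_nm = ppm_to_nanomolar(0.02, ACETAMIPRID_MOLAR_MASS)
print(round(detection_level_nm, 1))  # 89.8
```

Under these assumptions the calculation reproduces the 89.8 nM figure quoted alongside 0.02 ppm.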
Affiliation(s)
- Mahmoud Jokar
- Department of Entomology and Plant Pathology, Urmia University, Urmia, Iran
- Farzin Hadizadeh
- Biotechnology Research Center, School of Pharmacy, Mashhad University of Medical Sciences, Mashhad, Iran
- Fatemeh Rahmani
- Faculty of Sciences, Department of Biology, Urmia University, Urmia, Iran
|
31
|
Analysis and Identification of Aptamer-Compound Interactions with a Maximum Relevance Minimum Redundancy and Nearest Neighbor Algorithm. BioMed Research International 2016; 2016:8351204. [PMID: 26955638 PMCID: PMC4756144 DOI: 10.1155/2016/8351204] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.3] [Received: 11/05/2015] [Accepted: 01/05/2016] [Indexed: 12/02/2022]
Abstract
The development of biochemistry and molecular biology has revealed an increasingly important role of compounds in several biological processes. Like aptamer-protein interaction, aptamer-compound interaction is attracting increasing attention. However, it is time-consuming to select suitable aptamers against compounds using traditional methods, such as exponential enrichment. Thus, there is an urgent need for effective computational methods for finding aptamers against compounds. This study attempted to extract important features for aptamer-compound interactions using feature selection methods, namely Maximum Relevance Minimum Redundancy and incremental feature selection. Each aptamer-compound pair was represented by properties derived from the aptamer and the compound, including frequencies of single nucleotides and dinucleotides for the aptamer, as well as the constitutional, electrostatic, quantum-chemical, and space conformational descriptors of the compounds. As a result, some important features were obtained. To confirm the importance of the obtained features, we further discussed the associations between them and aptamer-compound interactions. In addition, an optimal prediction model based on the nearest neighbor algorithm was built to identify aptamer-compound interactions, which has the potential to be a useful tool for the identification of novel aptamer-compound interactions. The program is available upon request.
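The aptamer side of the representation described above (frequencies of single nucleotides and dinucleotides) can be illustrated with a minimal sketch. It assumes a DNA alphabet (A, C, G, T) and overlapping dinucleotides; the function name, the example sequence, and the normalization choices are illustrative and are not taken from the cited study, whose exact encoding may differ.

```python
from itertools import product

NUCLEOTIDES = "ACGT"  # assumed DNA alphabet; an RNA aptamer would use U instead of T


def nucleotide_features(sequence: str) -> dict[str, float]:
    """Return 4 single-nucleotide and 16 dinucleotide frequencies for a sequence.

    Dinucleotides are counted with overlap (positions i, i+1), and each group
    of frequencies is normalized by the number of items counted in that group.
    """
    seq = sequence.upper()
    if len(seq) < 2:
        raise ValueError("sequence must contain at least two nucleotides")

    mono = {n: seq.count(n) / len(seq) for n in NUCLEOTIDES}

    pairs = [seq[i:i + 2] for i in range(len(seq) - 1)]
    di = {
        a + b: pairs.count(a + b) / len(pairs)
        for a, b in product(NUCLEOTIDES, repeat=2)
    }
    return {**mono, **di}


# Hypothetical toy sequence, used only to show the shape of the feature vector.
features = nucleotide_features("ACGTAC")
print(len(features))   # 20 features: 4 mono + 16 di
print(features["AC"])  # 0.4 (two of the five overlapping pairs)
```

A vector of this kind would then be concatenated with the compound descriptors before feature selection and nearest-neighbor classification, as the abstract outlines.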
|
32
|
Chen L, Chu C, Huang T, Kong X, Cai YD. Prediction and analysis of cell-penetrating peptides using pseudo-amino acid composition and random forest models. Amino Acids 2015; 47:1485-93. [PMID: 25894890 DOI: 10.1007/s00726-015-1974-5] [Citation(s) in RCA: 62] [Impact Index Per Article: 6.9] [Received: 02/19/2015] [Accepted: 03/27/2015] [Indexed: 12/26/2022]
Abstract
Cell-penetrating peptides, a group of short peptides, can traverse cell membranes to enter cells and thus facilitate the uptake of various molecular cargoes. They therefore have the potential to become powerful drug delivery systems. The correct identification of peptides as cell-penetrating or non-cell-penetrating would accelerate this application. In this study, we determined which features were important for a peptide to be cell-penetrating or non-cell-penetrating and built a predictive model based on the key features extracted from this analysis. The investigated peptides were retrieved from a previous study, and each was encoded as a numeric vector according to six properties of amino acids (amino acid frequency, codon diversity, electrostatic charge, molecular volume, polarity, and secondary structure) by the pseudo-amino acid composition method. Methods of minimum redundancy maximum relevance and incremental feature selection were then employed to analyze these features, and some were found to be key determinants of cell penetration. In parallel, an optimal random forest prediction model was built. We hope that our findings will provide new resources for the study of cell-penetrating peptides.
Affiliation(s)
- Lei Chen
- College of Life Science, Shanghai University, Shanghai, 200444, People's Republic of China
|
33
|
Saidijam M, Patching SG. Amino acid composition analysis of secondary transport proteins from Escherichia coli with relation to functional classification, ligand specificity and structure. J Biomol Struct Dyn 2015; 33:2205-20. [DOI: 10.1080/07391102.2014.998283] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.7] [Indexed: 01/09/2023]
Affiliation(s)
- Massoud Saidijam
- Department of Molecular Medicine and Genetics, Research Centre for Molecular Medicine, School of Medicine, Hamadan University of Medical Sciences, Hamadan, Iran
- Simon G. Patching
- Department of Molecular Medicine and Genetics, Research Centre for Molecular Medicine, School of Medicine, Hamadan University of Medical Sciences, Hamadan, Iran
|