1
Kim YA, Mousavi K, Yazdi A, Zwierzyna M, Cardinali M, Fox D, Peel T, Coller J, Aggarwal K, Maruggi G. Computational design of mRNA vaccines. Vaccine 2024; 42:1831-1840. [PMID: 37479613 DOI: 10.1016/j.vaccine.2023.07.024] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/31/2023] [Revised: 06/23/2023] [Accepted: 07/10/2023] [Indexed: 07/23/2023]
Abstract
mRNA technology has emerged as a successful vaccine platform that offered a swift response to the COVID-19 pandemic. Accumulating evidence shows that vaccine efficacy, thermostability, and other important properties are largely impacted by intrinsic properties of the mRNA molecule, such as RNA sequence and structure, both of which can be optimized. Designing mRNA sequence for vaccines presents a combinatorial problem due to an extremely large selection space. For instance, due to the degeneracy of the genetic code, there are over 10^632 possible mRNA sequences that could encode the spike protein, the COVID-19 vaccines' target. Moreover, designing different elements of the mRNA sequence simultaneously against multiple objectives such as translational efficiency, reduced reactogenicity, and improved stability requires an efficient and sophisticated optimization strategy. Recently, there has been a growing interest in utilizing computational tools to redesign mRNA sequences to improve vaccine characteristics and expedite discovery timelines. In this review, we explore important biophysical features of mRNA to be considered for vaccine design and discuss how computational approaches can be applied to rapidly design mRNA sequences with desirable characteristics.
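Editorial illustration (not part of the cited abstract): the degeneracy argument can be made concrete with a short sketch. It multiplies the number of synonymous codons for each residue of a protein, using the standard genetic code. The function names and the example peptide are hypothetical, stop codons are ignored, and the sketch does not attempt to reproduce the spike protein figure quoted in the abstract.

```python
import math

# Number of synonymous codons per amino acid in the standard genetic code
# (61 sense codons in total; stop codons are deliberately left out).
CODON_DEGENERACY = {
    "A": 4, "R": 6, "N": 2, "D": 2, "C": 2, "Q": 2, "E": 2, "G": 4,
    "H": 2, "I": 3, "L": 6, "K": 2, "M": 1, "F": 2, "P": 4, "S": 6,
    "T": 4, "W": 1, "Y": 2, "V": 4,
}


def count_synonymous_sequences(protein: str) -> int:
    """Exact number of coding sequences that translate to `protein`."""
    total = 1
    for residue in protein:
        total *= CODON_DEGENERACY[residue]
    return total


def log10_synonymous_sequences(protein: str) -> float:
    """Same count on a log10 scale, convenient for long proteins."""
    return sum(math.log10(CODON_DEGENERACY[residue]) for residue in protein)


if __name__ == "__main__":
    peptide = "MKL"  # hypothetical three-residue example: 1 * 2 * 6 codon choices
    print(count_synonymous_sequences(peptide))  # prints 12
```

Because the count is a product over residues, it grows exponentially with protein length, which is why exhaustive enumeration is impractical and optimization strategies are needed.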
Affiliation(s)
- Jeff Coller
  - Johns Hopkins University, Baltimore, MD, USA
2
Salimi A, Jang JH, Lee JY. Leveraging attention-enhanced variational autoencoders: Novel approach for investigating latent space of aptamer sequences. Int J Biol Macromol 2024; 255:127884. [PMID: 37926303 DOI: 10.1016/j.ijbiomac.2023.127884] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/31/2023] [Revised: 10/27/2023] [Accepted: 11/02/2023] [Indexed: 11/07/2023]
Abstract
Aptamers are increasingly recognized as potent alternatives to antibodies for diagnostic and therapeutic applications. The application of deep learning, particularly attention-based models, for aptamer (DNA/RNA) sequences is an innovative field. The ongoing advancements in aptamer sequencing technologies coupled with machine learning algorithms have resulted in novel developments. Further research is required to investigate the full potential of deep learning models and address the challenges associated with the generation of sequences, like the large search space of possible sequences. In this study, we propose a workflow that integrates an attention mechanism within a framework of a generative variational autoencoder, to generate novel sequences by expanding latent memory. They show 100 % novelty compared with the dataset, and approximately 88 % of them show negative values for the minimum free energy, which may indicate the likelihood of an RNA sequence folding into a functional structure. Because the field of aptamer discovery is affected by data scarcity, advanced strategies that facilitate the generation of diverse and superior sequences are necessitated. The utilization of our workflow can result in novel aptamers. Thus, investigations such as the present study can address the abovementioned challenge. Our research is anticipated to facilitate further discoveries and advancements in aptamer fields.
Affiliation(s)
- Abbas Salimi
  - Department of Chemistry, Sungkyunkwan University, Suwon 16419, Republic of Korea
- Jee Hwan Jang
  - School of Materials Science and Engineering, Sungkyunkwan University, Suwon 16419, Republic of Korea; Ucaretron Inc., No. 3508, 40, Simin-daero 365 beon-gil, Dongan-gu, Anyang-si, Gyeonggi-do, Republic of Korea
- Jin Yong Lee
  - Department of Chemistry, Sungkyunkwan University, Suwon 16419, Republic of Korea
3
Andress C, Kappel K, Villena ME, Cuperlovic-Culf M, Yan H, Li Y. DAPTEV: Deep aptamer evolutionary modelling for COVID-19 drug design. PLoS Comput Biol 2023; 19:e1010774. [PMID: 37406007 DOI: 10.1371/journal.pcbi.1010774] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/29/2022] [Accepted: 06/13/2023] [Indexed: 07/07/2023] Open
Abstract
Typical drug discovery and development processes are costly, time consuming and often biased by expert opinion. Aptamers are short, single-stranded oligonucleotides (RNA/DNA) that bind to target proteins and other types of biomolecules. Compared with small-molecule drugs, aptamers can bind to their targets with high affinity (binding strength) and specificity (uniquely interacting with the target only). The conventional development process for aptamers utilizes a manual process known as Systematic Evolution of Ligands by Exponential Enrichment (SELEX), which is costly, slow, dependent on library choice and often produces aptamers that are not optimized. To address these challenges, in this research, we create an intelligent approach, named DAPTEV, for generating and evolving aptamer sequences to support aptamer-based drug discovery and development. Using the COVID-19 spike protein as a target, our computational results suggest that DAPTEV is able to produce structurally complex aptamers with strong binding affinities.
Affiliation(s)
- Cameron Andress
  - Department of Computer Science, Brock University, St. Catharines, Canada
- Kalli Kappel
  - Broad Institute of MIT and Harvard, Cambridge, Massachusetts, United States of America
- Hongbin Yan
  - Department of Chemistry, Brock University, St. Catharines, Canada
- Yifeng Li
  - Department of Computer Science, Brock University, St. Catharines, Canada
  - Department of Biological Sciences, Brock University, St. Catharines, Canada
4
Iwano N, Adachi T, Aoki K, Nakamura Y, Hamada M. Generative aptamer discovery using RaptGen. Nat Comput Sci 2022; 2:378-386. [PMID: 38177576 PMCID: PMC10766510 DOI: 10.1038/s43588-022-00249-6] [Citation(s) in RCA: 17] [Impact Index Per Article: 8.5] [Received: 02/13/2021] [Accepted: 04/21/2022] [Indexed: 01/06/2024]
Abstract
Nucleic acid aptamers are generated by an in vitro molecular evolution method known as systematic evolution of ligands by exponential enrichment (SELEX). Various candidates are limited by actual sequencing data from an experiment. Here we developed RaptGen, which is a variational autoencoder for in silico aptamer generation. RaptGen exploits a profile hidden Markov model decoder to represent motif sequences effectively. We showed that RaptGen embedded simulation sequence data into low-dimensional latent space on the basis of motif information. We also performed sequence embedding using two independent SELEX datasets. RaptGen successfully generated aptamers from the latent space even though they were not included in high-throughput sequencing. RaptGen could also generate a truncated aptamer with a short learning model. We demonstrated that RaptGen could be applied to activity-guided aptamer generation according to Bayesian optimization. We concluded that a generative method by RaptGen and latent representation are useful for aptamer discovery.
Affiliation(s)
- Natsuki Iwano
  - Graduate School of Advanced Science and Engineering, Waseda University, Tokyo, Japan
- Michiaki Hamada
  - Graduate School of Advanced Science and Engineering, Waseda University, Tokyo, Japan
  - Computational Bio Big-Data Open Innovation Laboratory (CBBD-OIL), National Institute of Advanced Industrial Science and Technology (AIST), Tokyo, Japan
  - Graduate School of Medicine, Nippon Medical School, Tokyo, Japan
5
Navigating the pitfalls of applying machine learning in genomics. Nat Rev Genet 2022; 23:169-181. [PMID: 34837041 DOI: 10.1038/s41576-021-00434-9] [Citation(s) in RCA: 83] [Impact Index Per Article: 41.5] [Accepted: 10/28/2021] [Indexed: 11/08/2022]
Abstract
The scale of genetic, epigenomic, transcriptomic, cheminformatic and proteomic data available today, coupled with easy-to-use machine learning (ML) toolkits, has propelled the application of supervised learning in genomics research. However, the assumptions behind the statistical models and performance evaluations in ML software frequently are not met in biological systems. In this Review, we illustrate the impact of several common pitfalls encountered when applying supervised ML in genomics. We explore how the structure of genomics data can bias performance evaluations and predictions. To address the challenges associated with applying cutting-edge ML methods to genomics, we describe solutions and appropriate use cases where ML modelling shows great potential.
6
Loyez M, DeRosa MC, Caucheteur C, Wattiez R. Overview and emerging trends in optical fiber aptasensing. Biosens Bioelectron 2022; 196:113694. [PMID: 34637994 DOI: 10.1016/j.bios.2021.113694] [Citation(s) in RCA: 13] [Impact Index Per Article: 6.5] [Received: 08/10/2021] [Revised: 09/30/2021] [Accepted: 10/01/2021] [Indexed: 12/16/2022]
Abstract
Optical fiber biosensors have attracted growing interest over the last decade and quickly became a key enabling technology, especially for the detection of biomarkers at extremely low concentrations and in small volumes. Among the many and recent fiber-optic sensing amenities, aptamers-based sensors have shown unequalled performances in terms of ease of production, specificity, and sensitivity. The immobilization of small and highly stable bioreceptors such as DNA has bolstered their use for the most varied applications e.g., medical diagnosis, food safety and environmental monitoring. This review highlights the recent advances in aptamer-based optical fiber biosensors. An in-depth analysis of the literature summarizes different fiber-optic structures and biochemical strategies for molecular detection and immobilization of receptors over diverse surfaces. In this review, we analyze the features offered by those sensors and discuss about the next challenges to be addressed. This overview investigates both biochemical and optical parameters, drawing the guiding lines for forthcoming innovations and prospects in this ever-growing field of research.
Affiliation(s)
- Médéric Loyez
  - Proteomics and Microbiology Department, University of Mons, Avenue du Champ de Mars 6, 7000, Mons, Belgium; Electromagnetism and Telecommunication Department, University of Mons, Bld. Dolez 31, 7000, Mons, Belgium
- Maria C DeRosa
  - Department of Chemistry, 203 Steacie Building, Carleton University, 1125, Colonel By Drive, Ottawa, ON K1S 5B6, Canada
- Christophe Caucheteur
  - Electromagnetism and Telecommunication Department, University of Mons, Bld. Dolez 31, 7000, Mons, Belgium
- Ruddy Wattiez
  - Proteomics and Microbiology Department, University of Mons, Avenue du Champ de Mars 6, 7000, Mons, Belgium
7
Mardikoraem M, Woldring D. Machine Learning-driven Protein Library Design: A Path Toward Smarter Libraries. Methods Mol Biol 2022; 2491:87-104. [PMID: 35482186 DOI: 10.1007/978-1-0716-2285-8_5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/14/2023]
Abstract
Proteins are small yet valuable biomolecules that play a versatile role in therapeutics and diagnostics. The intricate sequence-structure-function paradigm in the realm of proteins opens the possibility for directly mapping amino acid sequence to function. However, the rugged nature of the protein fitness landscape and an astronomical number of possible mutations even for small proteins make navigating this system a daunting task. Moreover, the scarcity of functional proteins and the ease with which deleterious mutations are introduced, due to complex epistatic relationships, compound the existing challenges. This highlights the need for auxiliary tools in current techniques such as rational design and directed evolution. To that end, the state-of-the-art machine learning can offer time and cost efficiency in finding high fitness proteins, circumventing unnecessary wet-lab experiments. In the context of improving library design, machine learning provides valuable insights via its unique features such as high adaptation to complex systems, multi-tasking, and parallelism, and the ability to capture hidden trends in input data. Finally, both the advancements in computational resources and the rapidly increasing number of sequences in protein databases will allow more promising and detailed insights delivered from machine learning to protein library design. In this chapter, fundamental concepts and a method for machine learning-driven library design leveraging deep sequencing datasets will be discussed. We elaborate on (1) basic knowledge about machine learning algorithms, (2) the benefit of machine learning in library design, and (3) methodology for implementing machine learning in library design.
Affiliation(s)
- Mehrsa Mardikoraem
  - Department of Chemical Engineering and Materials Science, Michigan State University, East Lansing, MI, USA
  - Institute for Quantitative Health Science and Engineering, Michigan State University, East Lansing, MI, USA
- Daniel Woldring
  - Department of Chemical Engineering and Materials Science, Michigan State University, East Lansing, MI, USA
  - Institute for Quantitative Health Science and Engineering, Michigan State University, East Lansing, MI, USA
8
Palomino‐Hernandez O, Margreiter MA, Rossetti G. Challenges in RNA Regulation in Huntington's Disease: Insights from Computational Studies. Isr J Chem 2020. [DOI: 10.1002/ijch.202000021] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/06/2022]
Affiliation(s)
- Oscar Palomino‐Hernandez
  - Computational Biomedicine, Institute of Neuroscience and Medicine (INM-9)/Institute for Advanced Simulations (IAS-5), Forschungszentrum Juelich, 52425 Jülich, Germany
  - Faculty 1, RWTH Aachen, 52425 Aachen, Germany
  - Computation-based Science and Technology Research Center, The Cyprus Institute, Nicosia 2121, Cyprus
  - Institute of Life Science, The Hebrew University of Jerusalem, Jerusalem 91904, Israel
- Michael A. Margreiter
  - Computational Biomedicine, Institute of Neuroscience and Medicine (INM-9)/Institute for Advanced Simulations (IAS-5), Forschungszentrum Juelich, 52425 Jülich, Germany
  - Faculty 1, RWTH Aachen, 52425 Aachen, Germany
- Giulia Rossetti
  - Computational Biomedicine, Institute of Neuroscience and Medicine (INM-9)/Institute for Advanced Simulations (IAS-5), Forschungszentrum Juelich, 52425 Jülich, Germany
  - Jülich Supercomputing Centre (JSC), Forschungszentrum Jülich, 52425 Jülich, Germany
  - Department of Hematology, Oncology, Hemostaseology and Stem Cell Transplantation, University Hospital Aachen, RWTH Aachen University, Pauwelsstraße 30, 52074 Aachen, Germany