1
Schlotter T, Kloter T, Hengsteler J, Yang K, Zhan L, Ragavan S, Hu H, Zhang X, Duru J, Vörös J, Zambelli T, Nakatsuka N. Aptamer-Functionalized Interface Nanopores Enable Amino Acid-Specific Peptide Detection. ACS Nano 2024; 18:6286-6297. [PMID: 38355286 PMCID: PMC10906075 DOI: 10.1021/acsnano.3c10679] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/30/2023] [Revised: 02/06/2024] [Accepted: 02/08/2024] [Indexed: 02/16/2024]
Abstract
Single-molecule proteomics based on nanopore technology has made significant advances in recent years. However, to achieve nanopore sensing with single amino acid resolution, several bottlenecks must be tackled: controlling nanopore sizes with nanoscale precision and slowing molecular translocation events. Herein, we address these challenges by integrating amino acid-specific DNA aptamers into interface nanopores with dynamically tunable pore sizes. A phenylalanine aptamer was used as a proof-of-concept: aptamer recognition of phenylalanine moieties led to the retention of specific peptides, slowing translocation speeds. Importantly, while phenylalanine aptamers were isolated against the free amino acid, the aptamers were determined to recognize the combination of the benzyl or phenyl and the carbonyl group in the peptide backbone, enabling binding to specific phenylalanine-containing peptides. We decoupled specific binding between aptamers and phenylalanine-containing peptides from nonspecific interactions (e.g., electrostatics and hydrophobic interactions) using optical waveguide lightmode spectroscopy. Aptamer-modified interface nanopores differentiated peptides containing phenylalanine from control peptides with structurally similar amino acids (i.e., tyrosine and tryptophan). When the duration of aptamer-target interactions inside the nanopore was prolonged by lowering the applied voltage, discrete ionic current levels with repetitive motifs were observed. Such recurring signatures in the measured signal suggest that the proposed method may be able to resolve amino acid-specific aptamer recognition, a step toward single-molecule proteomics.
Affiliation(s)
- Tilman Schlotter
  - Laboratory of Biosensors and Bioelectronics, Institute for Biomedical Engineering, ETH Zürich, 8092 Zürich, Switzerland
- Tom Kloter
  - Laboratory of Biosensors and Bioelectronics, Institute for Biomedical Engineering, ETH Zürich, 8092 Zürich, Switzerland
- Julian Hengsteler
  - Laboratory of Biosensors and Bioelectronics, Institute for Biomedical Engineering, ETH Zürich, 8092 Zürich, Switzerland
- Kyungae Yang
  - Department of Medicine, Columbia University Irving Medical Center, New York, New York 10032, United States
- Lijian Zhan
  - Laboratory of Biosensors and Bioelectronics, Institute for Biomedical Engineering, ETH Zürich, 8092 Zürich, Switzerland
- Sujeni Ragavan
  - Laboratory of Biosensors and Bioelectronics, Institute for Biomedical Engineering, ETH Zürich, 8092 Zürich, Switzerland
- Haiying Hu
  - Laboratory of Biosensors and Bioelectronics, Institute for Biomedical Engineering, ETH Zürich, 8092 Zürich, Switzerland
- Xinyu Zhang
  - Laboratory of Biosensors and Bioelectronics, Institute for Biomedical Engineering, ETH Zürich, 8092 Zürich, Switzerland
- Jens Duru
  - Laboratory of Biosensors and Bioelectronics, Institute for Biomedical Engineering, ETH Zürich, 8092 Zürich, Switzerland
- János Vörös
  - Laboratory of Biosensors and Bioelectronics, Institute for Biomedical Engineering, ETH Zürich, 8092 Zürich, Switzerland
- Tomaso Zambelli
  - Laboratory of Biosensors and Bioelectronics, Institute for Biomedical Engineering, ETH Zürich, 8092 Zürich, Switzerland
- Nako Nakatsuka
  - Laboratory of Biosensors and Bioelectronics, Institute for Biomedical Engineering, ETH Zürich, 8092 Zürich, Switzerland
2
Singh G, Alser M, Denolf K, Firtina C, Khodamoradi A, Cavlak MB, Corporaal H, Mutlu O. RUBICON: a framework for designing efficient deep learning-based genomic basecallers. Genome Biol 2024; 25:49. [PMID: 38365730 PMCID: PMC10870431 DOI: 10.1186/s13059-024-03181-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/24/2023] [Accepted: 02/02/2024] [Indexed: 02/18/2024] [Open Access]
Abstract
Nanopore sequencing generates noisy electrical signals that need to be converted into a standard string of DNA nucleotide bases using a computational step called basecalling. The performance of basecalling has critical implications for all later steps in genome analysis. Therefore, there is a need to reduce the computation and memory cost of basecalling while maintaining accuracy. We present RUBICON, a framework to develop efficient hardware-optimized basecallers. We demonstrate the effectiveness of RUBICON by developing RUBICALL, the first hardware-optimized mixed-precision basecaller that performs efficient basecalling, outperforming the state-of-the-art basecallers. We believe RUBICON offers a promising path to develop future hardware-optimized basecallers.
Affiliation(s)
- Gagandeep Singh
  - Department of Information Technology and Electrical Engineering, ETH Zürich, Zürich, Switzerland
  - Research and Advanced Development, AMD, Longmont, USA
- Mohammed Alser
  - Department of Information Technology and Electrical Engineering, ETH Zürich, Zürich, Switzerland
- Can Firtina
  - Department of Information Technology and Electrical Engineering, ETH Zürich, Zürich, Switzerland
- Meryem Banu Cavlak
  - Department of Information Technology and Electrical Engineering, ETH Zürich, Zürich, Switzerland
- Henk Corporaal
  - Department of Electrical Engineering, Eindhoven University of Technology, Eindhoven, The Netherlands
- Onur Mutlu
  - Department of Information Technology and Electrical Engineering, ETH Zürich, Zürich, Switzerland
3
Stuber A, Schlotter T, Hengsteler J, Nakatsuka N. Solid-State Nanopores for Biomolecular Analysis and Detection. Advances in Biochemical Engineering/Biotechnology 2024; 187:283-316. [PMID: 38273209 DOI: 10.1007/10_2023_240] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/27/2024]
Abstract
Advances in nanopore technology and data processing have rendered DNA sequencing highly accessible, unlocking a new realm of biotechnological opportunities. Commercially available nanopores for DNA sequencing are of biological origin and have certain disadvantages such as having specific environmental requirements to retain functionality. Solid-state nanopores have received increased attention as modular systems with controllable characteristics that enable deployment in non-physiological milieu. Thus, we focus our review on summarizing recent innovations in the field of solid-state nanopores to envision the future of this technology for biomolecular analysis and detection. We begin by introducing the physical aspects of nanopore measurements ranging from interfacial interactions at pore and electrode surfaces to mass transport of analytes and data analysis of recorded signals. Then, developments in nanopore fabrication and post-processing techniques with the pros and cons of different methodologies are examined. Subsequently, progress to facilitate DNA sequencing using solid-state nanopores is described to assess how this platform is evolving to tackle the more complex challenge of protein sequencing. Beyond sequencing, we highlight the recent developments in biosensing of nucleic acids, proteins, and sugars and conclude with an outlook on the frontiers of nanopore technologies.
Affiliation(s)
- Annina Stuber
  - Laboratory of Biosensors and Bioelectronics, Institute for Biomedical Engineering, ETH Zürich, Zürich, Switzerland
- Tilman Schlotter
  - Laboratory of Biosensors and Bioelectronics, Institute for Biomedical Engineering, ETH Zürich, Zürich, Switzerland
- Julian Hengsteler
  - Laboratory of Biosensors and Bioelectronics, Institute for Biomedical Engineering, ETH Zürich, Zürich, Switzerland
- Nako Nakatsuka
  - Laboratory of Biosensors and Bioelectronics, Institute for Biomedical Engineering, ETH Zürich, Zürich, Switzerland
4
Xu X, Bhalla N, Ståhl P, Jaldén J. Lokatt: a hybrid DNA nanopore basecaller with an explicit duration hidden Markov model and a residual LSTM network. BMC Bioinformatics 2023; 24:461. [PMID: 38062356 PMCID: PMC10704643 DOI: 10.1186/s12859-023-05580-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/12/2023] [Accepted: 11/23/2023] [Indexed: 12/18/2023] [Open Access]
Abstract
BACKGROUND: Basecalling long DNA sequences is a crucial step in nanopore-based DNA sequencing protocols. In recent years, the CTC-RNN model has become the leading basecalling model, supplanting preceding hidden Markov models (HMMs) that relied on pre-segmenting ion current measurements. However, the CTC-RNN model operates independently of prior biological and physical insights. RESULTS: We present a novel basecaller named Lokatt: explicit duration Markov model and residual-LSTM network. It leverages an explicit duration HMM (EDHMM) designed to model the nanopore sequencing processes. Trained on a newly generated library with methylation-free E. coli samples and MinION R9.4.1 chemistry, the Lokatt basecaller achieves a median single-read identity score of 0.930 and a genome coverage ratio of 99.750%, on par with the existing state-of-the-art structure when trained on the same datasets. CONCLUSION: Our research underlines the potential of incorporating prior knowledge into the basecalling processes, particularly through integrating HMMs and recurrent neural networks. The Lokatt basecaller showcases the efficacy of a hybrid approach, emphasizing its capacity to achieve high-quality basecalling performance while accommodating the nuances of nanopore sequencing. These outcomes pave the way for advanced basecalling methodologies, with potential implications for enhancing the accuracy and efficiency of nanopore-based DNA sequencing protocols.
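The abstract above reports a median single-read identity score without defining it. As a rough illustration only, the following Python sketch assumes one common definition: the number of matching bases divided by the number of alignment columns in a global edit-distance alignment between a basecalled read and its reference. The function names `read_identity` and `median_identity` are hypothetical, and the authors' actual evaluation pipeline (aligner, alignment mode, identity definition) may differ.

```python
from statistics import median


def read_identity(read: str, reference: str) -> float:
    """Identity = matches / alignment columns (assumed definition, see lead-in)."""
    n, m = len(read), len(reference)
    # dist[i][j] holds the edit distance between read[:i] and reference[:j].
    dist = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dist[i][0] = i
    for j in range(1, m + 1):
        dist[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = 0 if read[i - 1] == reference[j - 1] else 1
            dist[i][j] = min(
                dist[i - 1][j - 1] + cost,  # match or substitution
                dist[i - 1][j] + 1,         # extra base in the read
                dist[i][j - 1] + 1,         # base missing from the read
            )
    # Trace back one optimal alignment, counting matches and columns.
    i, j = n, m
    matches = columns = 0
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            cost = 0 if read[i - 1] == reference[j - 1] else 1
            if dist[i][j] == dist[i - 1][j - 1] + cost:
                matches += 1 if cost == 0 else 0
                i, j = i - 1, j - 1
                columns += 1
                continue
        if i > 0 and dist[i][j] == dist[i - 1][j] + 1:
            i -= 1
        else:
            j -= 1
        columns += 1
    # Two empty sequences are treated as identical by convention here.
    return matches / columns if columns else 1.0


def median_identity(pairs):
    """Median identity over (read, reference) pairs."""
    return median(read_identity(read, ref) for read, ref in pairs)
```

The quadratic-time dynamic program is only practical for short toy sequences; real nanopore reads are typically aligned with a dedicated long-read aligner, and the identity is then derived from the reported alignment.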
Affiliation(s)
- Xuechun Xu
  - Division of Information Science and Engineering, KTH Royal Institute of Technology, 11428, Stockholm, Sweden
- Nayanika Bhalla
  - Department of Gene Technology, Science for Life Laboratory, KTH Royal Institute of Technology, Solna, 17165, Stockholm, Sweden
- Patrik Ståhl
  - Department of Gene Technology, Science for Life Laboratory, KTH Royal Institute of Technology, Solna, 17165, Stockholm, Sweden
- Joakim Jaldén
  - Division of Information Science and Engineering, KTH Royal Institute of Technology, 11428, Stockholm, Sweden
5
Pagès-Gallego M, de Ridder J. Comprehensive benchmark and architectural analysis of deep learning models for nanopore sequencing basecalling. Genome Biol 2023; 24:71. [PMID: 37041647 PMCID: PMC10088207 DOI: 10.1186/s13059-023-02903-2] [Citation(s) in RCA: 9] [Impact Index Per Article: 9.0] [Received: 08/01/2022] [Accepted: 03/20/2023] [Indexed: 04/13/2023] [Open Access]
Abstract
BACKGROUND: Nanopore-based DNA sequencing relies on basecalling the electric current signal. Basecalling requires neural networks to achieve competitive accuracies. To improve sequencing accuracy further, new models are continuously proposed with new architectures. However, benchmarking is currently not standardized, and evaluation metrics and datasets used are defined on a per-publication basis, impeding progress in the field. This makes it impossible to distinguish data-driven from model-driven improvements. RESULTS: To standardize the process of benchmarking, we unified existing benchmarking datasets and defined a rigorous set of evaluation metrics. We benchmarked the latest seven basecaller models by recreating and analyzing their neural network architectures. Our results show that overall Bonito's architecture is the best for basecalling. We find, however, that species bias in training can have a large impact on performance. Our comprehensive evaluation of 90 novel architectures demonstrates that different models excel at reducing different types of errors, and that using recurrent neural networks (long short-term memory) and a conditional random field decoder are the main drivers of high-performing models. CONCLUSIONS: We believe that our work can facilitate the benchmarking of new basecaller tools and that the community can further expand on this work.
Affiliation(s)
- Marc Pagès-Gallego
  - Center for Molecular Medicine, University Medical Center Utrecht, Universiteitsweg 100, 3584 CG, Utrecht, The Netherlands
  - Oncode Institute, Utrecht, The Netherlands
- Jeroen de Ridder
  - Center for Molecular Medicine, University Medical Center Utrecht, Universiteitsweg 100, 3584 CG, Utrecht, The Netherlands
  - Oncode Institute, Utrecht, The Netherlands
6
Xie S, Leung AWS, Zheng Z, Zhang D, Xiao C, Luo R, Luo M, Zhang S. Applications and potentials of nanopore sequencing in the (epi)genome and (epi)transcriptome era. Innovation (N Y) 2021; 2:100153. [PMID: 34901902 PMCID: PMC8640597 DOI: 10.1016/j.xinn.2021.100153] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 06/09/2021] [Accepted: 08/09/2021] [Indexed: 02/08/2023] [Open Access]
Abstract
The Human Genome Project opened an era of (epi)genomic research, and also provided a platform for the development of new sequencing technologies. During and after the project, several sequencing technologies have continued to dominate nucleic acid sequencing markets. Currently, Illumina (short-read), PacBio (long-read), and Oxford Nanopore (long-read) are the most popular sequencing technologies. Unlike PacBio or the popular short-read sequencers before it, which, as examples of the second or so-called Next-Generation Sequencing platforms, need to synthesize when sequencing, nanopore technology directly sequences native DNA and RNA molecules. Nanopore sequencing, therefore, avoids converting mRNA into cDNA molecules, which not only allows for the sequencing of extremely long native DNA and full-length RNA molecules but also documents modifications that have been made to those native DNA or RNA bases. In this review on direct DNA sequencing and direct RNA sequencing using Oxford Nanopore technology, we focus on their development and application achievements, discussing their challenges and future perspective. We also address the problems researchers may encounter applying these approaches in their research topics, and how to resolve them.
- Nanopore-seq can dissect native DNA/RNA molecules from any organism at unlimited length.
- A wide variety of algorithms greatly increase the accuracy of signal decoding in Nanopore-Seq.
- Nanopore-Seq significantly facilitates genome assembly and structural variant calling, and can simultaneously detect base modifications.
- These advantages ensure its great potential in future medical and agricultural practices.
Affiliation(s)
- Shangqian Xie
  - Key Laboratory of Ministry of Education for Genetics and Germplasm Innovation of Tropical Special Trees and Ornamental Plants, College of Forestry, Hainan University, Haikou 570228, China
- Amy Wing-Sze Leung
  - Department of Computer Science, The University of Hong Kong, Hong Kong 999077, China
- Zhenxian Zheng
  - Department of Computer Science, The University of Hong Kong, Hong Kong 999077, China
- Dake Zhang
  - Beijing Advanced Innovation Centre for Biomedical Engineering, Key Laboratory for Biomechanics and Mechanobiology of Ministry of Education, School of Biological Science and Medical Engineering, Beihang University, Beijing 100083, China
- Chuanle Xiao
  - State Key Laboratory of Ophthalmology, Zhongshan Ophthalmic Centre, Sun Yat-sen University, Guangzhou 510060, China
- Ruibang Luo
  - Department of Computer Science, The University of Hong Kong, Hong Kong 999077, China
- Ming Luo
  - Agriculture and Biotechnology Research Center, Guangdong Provincial Key Laboratory of Applied Botany, Center of Economic Botany, Core Botanical Gardens, South China Botanical Garden, Chinese Academy of Sciences, Guangzhou 510650, China
- Shoudong Zhang
  - School of Life Sciences, The Chinese University of Hong Kong, Shatin, Hong Kong 999077, China
  - Center for Soybean Research of the State Key Laboratory of Agrobiotechnology, The Chinese University of Hong Kong, Shatin, Hong Kong 999077, China
7
Nanopore sequencing technology, bioinformatics and applications. Nat Biotechnol 2021; 39:1348-1365. [PMID: 34750572 PMCID: PMC8988251 DOI: 10.1038/s41587-021-01108-x] [Citation(s) in RCA: 480] [Impact Index Per Article: 160.0] [Received: 12/09/2019] [Accepted: 09/22/2021] [Indexed: 12/13/2022]
Abstract
Rapid advances in nanopore technologies for sequencing single long DNA and RNA molecules have led to substantial improvements in accuracy, read length and throughput. These breakthroughs have required extensive development of experimental and bioinformatics methods to fully exploit nanopore long reads for investigations of genomes, transcriptomes, epigenomes and epitranscriptomes. Nanopore sequencing is being applied in genome assembly, full-length transcript detection and base modification detection and in more specialized areas, such as rapid clinical diagnoses and outbreak surveillance. Many opportunities remain for improving data quality and analytical approaches through the development of new nanopores, base-calling methods and experimental protocols tailored to particular applications.
8
Ciuffreda L, Rodríguez-Pérez H, Flores C. Nanopore sequencing and its application to the study of microbial communities. Comput Struct Biotechnol J 2021; 19:1497-1511. [PMID: 33815688 PMCID: PMC7985215 DOI: 10.1016/j.csbj.2021.02.020] [Citation(s) in RCA: 89] [Impact Index Per Article: 29.7] [Received: 11/18/2020] [Revised: 02/24/2021] [Accepted: 02/27/2021] [Indexed: 12/14/2022] [Open Access]
Abstract
Since its introduction, nanopore sequencing has enhanced our ability to study complex microbial samples by making it possible to sequence long reads in real time using inexpensive and portable technologies. The use of long reads has made it possible to address several previously unsolved issues in the field, such as the resolution of complex genomic structures, and has facilitated access to metagenome-assembled genomes (MAGs). Furthermore, the low cost and portability of platforms, together with the development of rapid protocols and analysis pipelines, have established nanopore technology as an attractive and ever-growing tool for real-time in-field sequencing for environmental microbial analysis. This review provides an up-to-date summary of the experimental protocols and bioinformatic tools for the study of microbial communities using nanopore sequencing, highlighting the most important and recent research in the field with a major focus on infectious diseases. An overview of the main approaches, including targeted and shotgun approaches, metatranscriptomics, epigenomics, and epitranscriptomics, is provided, together with an outlook on the major challenges and perspectives on the use of this technology for microbial studies.
Affiliation(s)
- Laura Ciuffreda
  - Research Unit, Hospital Universitario N.S. de Candelaria, Universidad de La Laguna, 38010 Santa Cruz de Tenerife, Spain
- Héctor Rodríguez-Pérez
  - Research Unit, Hospital Universitario N.S. de Candelaria, Universidad de La Laguna, 38010 Santa Cruz de Tenerife, Spain
- Carlos Flores
  - Research Unit, Hospital Universitario N.S. de Candelaria, Universidad de La Laguna, 38010 Santa Cruz de Tenerife, Spain
  - CIBER de Enfermedades Respiratorias, Instituto de Salud Carlos III, 28029 Madrid, Spain
  - Genomics Division, Instituto Tecnológico y de Energías Renovables (ITER), 38600 Santa Cruz de Tenerife, Spain
  - Instituto de Tecnologías Biomédicas (ITB), Universidad de La Laguna, 38200 Santa Cruz de Tenerife, Spain