1
Firtina C, Soysal M, Lindegger J, Mutlu O. RawHash2: mapping raw nanopore signals using hash-based seeding and adaptive quantization. Bioinformatics 2024; 40:btae478. [PMID: 39078113 PMCID: PMC11333567 DOI: 10.1093/bioinformatics/btae478] [Received: 11/03/2023] [Revised: 07/04/2024] [Accepted: 07/29/2024] [Indexed: 07/31/2024]
Abstract
SUMMARY: Raw nanopore signals can be analyzed while they are being generated, a process known as real-time analysis. Real-time analysis of raw signals is essential to utilize the unique features that nanopore sequencing provides, enabling the early stopping of the sequencing of a read or the entire sequencing run based on the analysis. The state-of-the-art mechanism, RawHash, offers the first hash-based efficient and accurate similarity identification between raw signals and a reference genome by quickly matching their hash values. In this work, we introduce RawHash2, which provides major improvements over RawHash, including more sensitive quantization and chaining algorithms, weighted mapping decisions, frequency filters to reduce ambiguous seed hits, minimizers for hash-based sketching, and support for the R10.4 flow cell version and the POD5 and SLOW5 file formats. Compared to RawHash, RawHash2 provides better F1 accuracy (on average by 10.57% and up to 20.25%) and better throughput (on average by 4.0× and up to 9.9×).
AVAILABILITY AND IMPLEMENTATION: RawHash2 is available at https://github.com/CMU-SAFARI/RawHash. We also provide the scripts to fully reproduce our results on our GitHub page.
Affiliation(s)
- Can Firtina
  - Department of Information Technology and Electrical Engineering, ETH Zurich, Zurich 8092, Switzerland
- Melina Soysal
  - Department of Information Technology and Electrical Engineering, ETH Zurich, Zurich 8092, Switzerland
- Joël Lindegger
  - Department of Information Technology and Electrical Engineering, ETH Zurich, Zurich 8092, Switzerland
- Onur Mutlu
  - Department of Information Technology and Electrical Engineering, ETH Zurich, Zurich 8092, Switzerland
2
Lin Y, Zhang Y, Sun H, Jiang H, Zhao X, Teng X, Lin J, Shu B, Sun H, Liao Y, Zhou J. NanoDeep: a deep learning framework for nanopore adaptive sampling on microbial sequencing. Brief Bioinform 2023; 25:bbad499. [PMID: 38189540 PMCID: PMC10772945 DOI: 10.1093/bib/bbad499] [Received: 07/30/2023] [Revised: 11/21/2023] [Accepted: 12/11/2023] [Indexed: 01/09/2024]
Abstract
Nanopore sequencers can enrich or deplete the targeted DNA molecules in a library by reversing the voltage across individual nanopores. However, this requires substantial computational resources to achieve rapid operations in parallel at read-time sequencing. We present a deep learning framework, NanoDeep, to overcome these limitations by incorporating a convolutional neural network and squeeze-and-excitation. We first showed that the raw squiggle derived from native DNA sequences determines the origin of microbial and human genomes. Then, we demonstrated that NanoDeep successfully classified bacterial reads from a pooled library containing human sequence and showed enrichment for bacterial sequence compared with the routine nanopore sequencing setting. Further, we showed that NanoDeep improves the sequencing efficiency and preserves the fidelity of bacterial genomes in the mock sample. In addition, NanoDeep performs well in the enrichment of metagenome sequences of gut samples, showing its potential applications in the enrichment of unknown microbiota. Our toolkit is available at https://github.com/lysovosyl/NanoDeep.
Affiliation(s)
- Yusen Lin
  - Dermatology Hospital, Southern Medical University, Guangzhou, China
- Yongjun Zhang
  - Dermatology Hospital, Southern Medical University, Guangzhou, China
- Hang Sun
  - Dermatology Hospital, Southern Medical University, Guangzhou, China
- Hang Jiang
  - Dermatology Hospital, Southern Medical University, Guangzhou, China
- Xing Zhao
  - Li Ka Shing Institute of Health Sciences, The Chinese University of Hong Kong, Prince of Wales Hospital, Shatin, New Territories, Hong Kong SAR, China
  - Department of Chemical Pathology, The Chinese University of Hong Kong, Prince of Wales Hospital, Shatin, New Territories, Hong Kong SAR, China
- Xiaojuan Teng
  - Dermatology Hospital, Southern Medical University, Guangzhou, China
- Jingxia Lin
  - Dermatology Hospital, Southern Medical University, Guangzhou, China
- Bowen Shu
  - Dermatology Hospital, Southern Medical University, Guangzhou, China
- Hao Sun
  - Li Ka Shing Institute of Health Sciences, The Chinese University of Hong Kong, Prince of Wales Hospital, Shatin, New Territories, Hong Kong SAR, China
  - Department of Chemical Pathology, The Chinese University of Hong Kong, Prince of Wales Hospital, Shatin, New Territories, Hong Kong SAR, China
- Yuhui Liao
  - Dermatology Hospital, Southern Medical University, Guangzhou, China
- Jiajian Zhou
  - Dermatology Hospital, Southern Medical University, Guangzhou, China