Reference Citation Analysis

For: Weese D, Emde AK, Rausch T, Döring A, Reinert K. RazerS--fast read mapping with sensitivity control. Genome Res 2009;19:1646-54. [PMID: 19592482 DOI: 10.1101/gr.088823.108] [Citation(s) in RCA: 94] [Impact Index Per Article: 6.3] [Indexed: 11/24/2022]

Cited by Other Article(s)

Yu C, Zhao Y, Zhao C, Jin J, Mao K, Wang G. MiniDBG: A Novel and Minimal De Bruijn Graph for Read Mapping. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2024;21:129-142. [PMID: 38060353 DOI: 10.1109/tcbb.2023.3340251] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/07/2024]

Liu Y, Shen X, Gong Y, Liu Y, Song B, Zeng X. Sequence Alignment/Map format: a comprehensive review of approaches and applications. Brief Bioinform 2023;24:bbad320. [PMID: 37668049 DOI: 10.1093/bib/bbad320] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 05/16/2023] [Revised: 08/16/2023] [Accepted: 08/18/2023] [Indexed: 09/06/2023]

Firtina C, Park J, Alser M, Kim JS, Cali D, Shahroodi T, Ghiasi N, Singh G, Kanellopoulos K, Alkan C, Mutlu O. BLEND: a fast, memory-efficient and accurate mechanism to find fuzzy seed matches in genome analysis. NAR Genom Bioinform 2023;5:lqad004. [PMID: 36685727 PMCID: PMC9853099 DOI: 10.1093/nargab/lqad004] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/29/2022] [Revised: 12/16/2022] [Accepted: 01/10/2023] [Indexed: 01/22/2023]

Gudur VY, Maheshwari S, Acharyya A, Shafik R. An FPGA Based Energy-Efficient Read Mapper With Parallel Filtering and In-Situ Verification. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2022;19:2697-2711. [PMID: 34415836 DOI: 10.1109/tcbb.2021.3106311] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/13/2023]

Alser M, Rotman J, Deshpande D, Taraszka K, Shi H, Baykal PI, Yang HT, Xue V, Knyazev S, Singer BD, Balliu B, Koslicki D, Skums P, Zelikovsky A, Alkan C, Mutlu O, Mangul S. Technology dictates algorithms: recent developments in read alignment. Genome Biol 2021;22:249. [PMID: 34446078 PMCID: PMC8390189 DOI: 10.1186/s13059-021-02443-7] [Citation(s) in RCA: 37] [Impact Index Per Article: 12.3] [Received: 10/15/2020] [Accepted: 07/28/2021] [Indexed: 01/08/2023]

Affiliation(s)

Mohammed Alser Computer Science Department, ETH Zürich, 8092, Zürich, Switzerland Computer Engineering Department, Bilkent University, 06800 Bilkent, Ankara, Turkey Information Technology and Electrical Engineering Department, ETH Zürich, Zürich, 8092, Switzerland
Jeremy Rotman Department of Computer Science, University of California Los Angeles, Los Angeles, CA, 90095, USA
Dhrithi Deshpande Department of Clinical Pharmacy, School of Pharmacy, University of Southern California, Los Angeles, CA, 90089, USA
Kodi Taraszka Department of Computer Science, University of California Los Angeles, Los Angeles, CA, 90095, USA
Huwenbo Shi Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA, 02115, USA Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, 02142, USA
Pelin Icer Baykal Department of Computer Science, Georgia State University, Atlanta, GA, 30302, USA
Harry Taegyun Yang Department of Computer Science, University of California Los Angeles, Los Angeles, CA, 90095, USA Bioinformatics Interdepartmental Ph.D. Program, University of California Los Angeles, Los Angeles, CA, 90095, USA
Victor Xue Department of Computer Science, University of California Los Angeles, Los Angeles, CA, 90095, USA
Sergey Knyazev Department of Computer Science, Georgia State University, Atlanta, GA, 30302, USA
Benjamin D Singer Division of Pulmonary and Critical Care Medicine, Northwestern University Feinberg School of Medicine, Chicago, IL, 60611, USA Department of Biochemistry & Molecular Genetics, Northwestern University Feinberg School of Medicine, Chicago, USA Simpson Querrey Institute for Epigenetics, Northwestern University Feinberg School of Medicine, Chicago, IL, 60611, USA
Brunilda Balliu Department of Computational Medicine, University of California Los Angeles, Los Angeles, CA, 90095, USA
David Koslicki Computer Science and Engineering, Pennsylvania State University, University Park, PA, 16801, USA Biology Department, Pennsylvania State University, University Park, PA, 16801, USA The Huck Institutes of the Life Sciences, Pennsylvania State University, University Park, PA, 16801, USA
Pavel Skums Department of Computer Science, Georgia State University, Atlanta, GA, 30302, USA
Alex Zelikovsky Department of Computer Science, Georgia State University, Atlanta, GA, 30302, USA The Laboratory of Bioinformatics, I.M. Sechenov First Moscow State Medical University, Moscow, 119991, Russia
Can Alkan Computer Engineering Department, Bilkent University, 06800 Bilkent, Ankara, Turkey Bilkent-Hacettepe Health Sciences and Technologies Program, Ankara, Turkey
Onur Mutlu Computer Science Department, ETH Zürich, 8092, Zürich, Switzerland Computer Engineering Department, Bilkent University, 06800 Bilkent, Ankara, Turkey Information Technology and Electrical Engineering Department, ETH Zürich, Zürich, 8092, Switzerland
Serghei Mangul Department of Clinical Pharmacy, School of Pharmacy, University of Southern California, Los Angeles, CA, 90089, USA.

Yang W, Wang L. Fast and Accurate Algorithms for Mapping and Aligning Long Reads. J Comput Biol 2021;28:789-803. [PMID: 34161175 DOI: 10.1089/cmb.2020.0603] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 12/26/2022]

Comparison of High-Throughput Sequencing for Phage Display Peptide Screening on Two Commercially Available Platforms. Int J Pept Res Ther 2020. [DOI: 10.1007/s10989-019-09858-8] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Indexed: 11/28/2022]

Frints SGM, Ozanturk A, Rodríguez Criado G, Grasshoff U, de Hoon B, Field M, Manouvrier-Hanu S, Hickey SE, Kammoun M, Gripp KW, Bauer C, Schroeder C, Toutain A, Mihalic Mosher T, Kelly BJ, White P, Dufke A, Rentmeester E, Moon S, Koboldt DC, van Roozendaal KEP, Hu H, Haas SA, Ropers HH, Murray L, Haan E, Shaw M, Carroll R, Friend K, Liebelt J, Hobson L, De Rademaeker M, Geraedts J, Fryns JP, Vermeesch J, Raynaud M, Riess O, Gribnau J, Katsanis N, Devriendt K, Bauer P, Gecz J, Golzio C, Gontan C, Kalscheuer VM. Pathogenic variants in E3 ubiquitin ligase RLIM/RNF12 lead to a syndromic X-linked intellectual disability and behavior disorder. Mol Psychiatry 2019;24:1748-1768. [PMID: 29728705 DOI: 10.1038/s41380-018-0065-x] [Citation(s) in RCA: 22] [Impact Index Per Article: 4.4] [Received: 11/23/2017] [Accepted: 02/28/2018] [Indexed: 12/25/2022]

Affiliation(s)

Suzanna G M Frints Department of Clinical Genetics, Maastricht University Medical Center+, azM, Maastricht, 6202 AZ, The Netherlands; Department of Genetics and Cell Biology, School for Oncology and Developmental Biology, GROW, FHML, Maastricht University, Maastricht, 6200 MD, The Netherlands
Aysegul Ozanturk Center for Human Disease Modeling and Departments of Pediatrics and Psychiatry, Duke University, Durham, NC, 27710, USA
Germán Rodríguez Criado Unidad de Genética Clínica, Hospital Virgen del Rocío, Sevilla, 41920, Spain
Ute Grasshoff Institute of Medical Genetics and Applied Genomics, University of Tübingen, Tübingen, 72076, Germany
Bas de Hoon Department of Developmental Biology, Erasmus University Medical Center, Rotterdam, 3015 CN, Rotterdam, The Netherlands; Department of Gynaecology and Obstetrics, Erasmus University Medical Center, Rotterdam, 3015 CN, The Netherlands
Michael Field GOLD (Genetics of Learning and Disability) Service, Hunter Genetics, Waratah, NSW, 2298, Australia
Sylvie Manouvrier-Hanu Clinique de Génétique médicale Guy Fontaine, Centre de référence maladies rares Anomalies du développement Hôpital Jeanne de Flandre, Lille, 59000, France; EA 7364 RADEME Maladies Rares du Développement et du Métabolisme, Faculté de Médecine, Université de Lille, Lille, 59000, France
Scott E Hickey Division of Molecular & Human Genetics, Nationwide Children's Hospital, Columbus, OH, 43205, USA; Department of Pediatrics, The Ohio State University College of Medicine, Columbus, OH, 43205, USA
Molka Kammoun Center for Human Genetics, University Hospitals Leuven, Leuven, 3000, Belgium
Karen W Gripp Alfred I. duPont Hospital for Children Nemours, Wilmington, DE, 19803, USA
Claudia Bauer Institute of Medical Genetics and Applied Genomics, University of Tübingen, Tübingen, 72076, Germany
Christopher Schroeder Institute of Medical Genetics and Applied Genomics, University of Tübingen, Tübingen, 72076, Germany
Annick Toutain Service de Génétique, Hôpital Bretonneau, CHU de Tours, Tours, 37044, France; UMR 1253, iBrain, Université de Tours, Inserm, Tours, 37032, France
Theresa Mihalic Mosher Division of Molecular & Human Genetics, Nationwide Children's Hospital, Columbus, OH, 43205, USA; Department of Pediatrics, The Ohio State University College of Medicine, Columbus, OH, 43205, USA; The Institute for Genomic Medicine, Nationwide Children's Hospital, Columbus, OH, 43205, USA
Benjamin J Kelly The Institute for Genomic Medicine, Nationwide Children's Hospital, Columbus, OH, 43205, USA
Peter White Department of Pediatrics, The Ohio State University College of Medicine, Columbus, OH, 43205, USA; The Institute for Genomic Medicine, Nationwide Children's Hospital, Columbus, OH, 43205, USA
Andreas Dufke Institute of Medical Genetics and Applied Genomics, University of Tübingen, Tübingen, 72076, Germany
Eveline Rentmeester Department of Developmental Biology, Erasmus University Medical Center, Rotterdam, 3015 CN, Rotterdam, The Netherlands
Sungjin Moon Center for Human Disease Modeling and Departments of Pediatrics and Psychiatry, Duke University, Durham, NC, 27710, USA
Daniel C Koboldt Department of Pediatrics, The Ohio State University College of Medicine, Columbus, OH, 43205, USA; The Institute for Genomic Medicine, Nationwide Children's Hospital, Columbus, OH, 43205, USA
Kees E P van Roozendaal Department of Clinical Genetics, Maastricht University Medical Center+, azM, Maastricht, 6202 AZ, The Netherlands; Department of Genetics and Cell Biology, School for Oncology and Developmental Biology, GROW, FHML, Maastricht University, Maastricht, 6200 MD, The Netherlands
Hao Hu Department of Human Molecular Genetics, Max Planck Institute for Molecular Genetics, Berlin, 14195, Germany
Stefan A Haas Department of Computational Molecular Biology, Max Planck Institute for Molecular Genetics, Berlin, 14195, Germany
Hans-Hilger Ropers Department of Human Molecular Genetics, Max Planck Institute for Molecular Genetics, Berlin, 14195, Germany
Lucinda Murray GOLD (Genetics of Learning and Disability) Service, Hunter Genetics, Waratah, NSW, 2298, Australia
Eric Haan Adelaide Medical School and Robinson Research Institute, The University of Adelaide, Adelaide, SA, 5000, Australia; South Australian Clinical Genetics Service, SA Pathology (at Women's and Children's Hospital), North Adelaide, SA, 5006, Australia
Marie Shaw Adelaide Medical School and Robinson Research Institute, The University of Adelaide, Adelaide, SA, 5000, Australia
Renee Carroll Adelaide Medical School and Robinson Research Institute, The University of Adelaide, Adelaide, SA, 5000, Australia
Kathryn Friend Genetics and Molecular Pathology, SA Pathology, Adelaide, SA, 5006, Australia
Jan Liebelt South Australian Clinical Genetics Service, SA Pathology (at Women's and Children's Hospital), North Adelaide, SA, 5006, Australia
Lynne Hobson Genetics and Molecular Pathology, SA Pathology, Adelaide, SA, 5006, Australia
Marjan De Rademaeker Centre for Medical Genetics, Reproduction and Genetics, Reproduction Genetics and Regenerative Medicine, Vrije Universiteit Brussel (VUB), UZ Brussel, 1090, Brussels, Belgium
Joep Geraedts Department of Clinical Genetics, Maastricht University Medical Center+, azM, Maastricht, 6202 AZ, The Netherlands; Department of Genetics and Cell Biology, School for Oncology and Developmental Biology, GROW, FHML, Maastricht University, Maastricht, 6200 MD, The Netherlands
Jean-Pierre Fryns Center for Human Genetics, University Hospitals Leuven, Leuven, 3000, Belgium
Joris Vermeesch Center for Human Genetics, University Hospitals Leuven, Leuven, 3000, Belgium
Martine Raynaud Service de Génétique, Hôpital Bretonneau, CHU de Tours, Tours, 37044, France; UMR 1253, iBrain, Université de Tours, Inserm, Tours, 37032, France
Olaf Riess Institute of Medical Genetics and Applied Genomics, University of Tübingen, Tübingen, 72076, Germany
Joost Gribnau Department of Developmental Biology, Erasmus University Medical Center, Rotterdam, 3015 CN, Rotterdam, The Netherlands
Nicholas Katsanis Center for Human Disease Modeling and Departments of Pediatrics and Psychiatry, Duke University, Durham, NC, 27710, USA
Koen Devriendt Center for Human Genetics, University Hospitals Leuven, Leuven, 3000, Belgium
Peter Bauer Institute of Medical Genetics and Applied Genomics, University of Tübingen, Tübingen, 72076, Germany
Jozef Gecz Adelaide Medical School and Robinson Research Institute, The University of Adelaide, Adelaide, SA, 5000, Australia; South Australian Health and Medical Research Institute, Adelaide, SA, 5000, Australia
Christelle Golzio Center for Human Disease Modeling and Departments of Pediatrics and Psychiatry, Duke University, Durham, NC, 27710, USA; Institut de Génétique et de Biologie Moléculaire et Cellulaire, Department of Translational Medicine and Neurogenetics; Centre National de la Recherche Scientifique, UMR7104; Institut National de la Santé et de la Recherche Médicale, U964, Université de Strasbourg, 67400, Illkirch, France
Cristina Gontan Department of Developmental Biology, Erasmus University Medical Center, Rotterdam, 3015 CN, Rotterdam, The Netherlands
Vera M Kalscheuer Research Group Development and Disease, Max Planck Institute for Molecular Genetics, Berlin, 14195, Germany.

Senol Cali D, Kim JS, Ghose S, Alkan C, Mutlu O. Nanopore sequencing technology and tools for genome assembly: computational analysis of the current state, bottlenecks and future directions. Brief Bioinform 2019;20:1542-1559. [PMID: 29617724 PMCID: PMC6781587 DOI: 10.1093/bib/bby017] [Citation(s) in RCA: 108] [Impact Index Per Article: 21.6] [Received: 11/20/2017] [Revised: 02/06/2018] [Indexed: 02/06/2023]

Abstract

Nanopore sequencing technology has the potential to render other sequencing technologies obsolete with its ability to generate long reads and provide portability. However, high error rates of the technology pose a challenge while generating accurate genome assemblies. The tools used for nanopore sequence analysis are of critical importance, as they should overcome the high error rates of the technology. Our goal in this work is to comprehensively analyze current publicly available tools for nanopore sequence analysis to understand their advantages, disadvantages and performance bottlenecks. It is important to understand where the current tools do not perform well to develop better tools. To this end, we (1) analyze the multiple steps and the associated tools in the genome assembly pipeline using nanopore sequence data, and (2) provide guidelines for determining the appropriate tools for each step. Based on our analyses, we make four key observations: (1) the choice of the tool for basecalling plays a critical role in overcoming the high error rates of nanopore sequencing technology. (2) Read-to-read overlap finding tools, GraphMap and Minimap, perform similarly in terms of accuracy. However, Minimap has a lower memory usage, and it is faster than GraphMap. (3) There is a trade-off between accuracy and performance when deciding on the appropriate tool for the assembly step. The fast but less accurate assembler Miniasm can be used for quick initial assembly, and further polishing can be applied on top of it to increase the accuracy, which leads to faster overall assembly. (4) The state-of-the-art polishing tool, Racon, generates high-quality consensus sequences while providing a significant speedup over another polishing tool, Nanopolish. We analyze various combinations of different tools and expose the trade-offs between accuracy, performance, memory usage and scalability. 
We conclude that our observations can guide researchers and practitioners in making conscious and effective choices for each step of the genome assembly pipeline using nanopore sequence data. Also, with the help of bottlenecks we have found, developers can improve the current tools or build new ones that are both accurate and fast, to overcome the high error rates of the nanopore sequencing technology.

Mozafari F, Babashah H, Koohi S, Kavehvash Z. Speeding up DNA sequence alignment by optical correlator. Optics & Laser Technology 2018;108:124-135. [DOI: 10.1016/j.optlastec.2018.06.027] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Indexed: 10/19/2023]

Alser M, Hassan H, Xin H, Ergin O, Mutlu O, Alkan C. GateKeeper: a new hardware architecture for accelerating pre-alignment in DNA short read mapping. Bioinformatics 2018;33:3355-3363. [PMID: 28575161 DOI: 10.1093/bioinformatics/btx342] [Citation(s) in RCA: 23] [Impact Index Per Article: 3.8] [Received: 11/16/2016] [Accepted: 05/29/2017] [Indexed: 01/06/2023]

Abstract

Motivation

High throughput DNA sequencing (HTS) technologies generate an excessive number of small DNA segments -called short reads- that cause significant computational burden. To analyze the entire genome, each of the billions of short reads must be mapped to a reference genome based on the similarity between a read and 'candidate' locations in that reference genome. The similarity measurement, called alignment, formulated as an approximate string matching problem, is the computational bottleneck because: (i) it is implemented using quadratic-time dynamic programming algorithms and (ii) the majority of candidate locations in the reference genome do not align with a given read due to high dissimilarity. Calculating the alignment of such incorrect candidate locations consumes an overwhelming majority of a modern read mapper's execution time. Therefore, it is crucial to develop a fast and effective filter that can detect incorrect candidate locations and eliminate them before invoking computationally costly alignment algorithms.
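
The filter-then-align idea described above can be sketched in software. The following is a minimal illustration only, not GateKeeper's FPGA design: the filter used here is a simple character-count lower bound on edit distance (a swapped-in technique chosen for brevity), and the names `edit_distance`, `count_lower_bound` and `map_with_filter` are hypothetical.

```python
from collections import Counter


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance via the classic O(len(a) * len(b)) dynamic program."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]


def count_lower_bound(a: str, b: str) -> int:
    """Cheap lower bound on edit distance from character counts.

    Each edit removes at most one surplus and one missing character,
    so the distance is at least max(surplus, deficit).
    """
    ca, cb = Counter(a), Counter(b)
    surplus = sum((ca - cb).values())  # characters a has in excess of b
    deficit = sum((cb - ca).values())  # characters b has in excess of a
    return max(surplus, deficit)


def map_with_filter(read, candidates, max_errors):
    """Return accepted positions and the number of DP alignments actually run.

    candidates: iterable of (position, reference_substring) pairs.
    """
    accepted, dp_calls = [], 0
    for pos, ref in candidates:
        if count_lower_bound(read, ref) > max_errors:
            continue  # rejected without running the quadratic DP
        dp_calls += 1
        if edit_distance(read, ref) <= max_errors:
            accepted.append(pos)
    return accepted, dp_calls
```

Because the bound never exceeds the true edit distance, this filter cannot reject a candidate that alignment would accept; it only saves work on clearly dissimilar candidates.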

Results

We propose GateKeeper, a new hardware accelerator that functions as a pre-alignment step that quickly filters out most incorrect candidate locations. GateKeeper is the first design to accelerate pre-alignment using Field-Programmable Gate Arrays (FPGAs), which can perform pre-alignment much faster than software. When implemented on a single FPGA chip, GateKeeper maintains high accuracy (on average >96%) while providing, on average, 90-fold and 130-fold speedup over the state-of-the-art software pre-alignment techniques, Adjacency Filter and Shifted Hamming Distance (SHD), respectively. The addition of GateKeeper as a pre-alignment step can reduce the verification time of the mrFAST mapper by a factor of 10.

Availability and implementation

https://github.com/BilkentCompGen/GateKeeper.

Contact

mohammedalser@bilkent.edu.tr or onur.mutlu@inf.ethz.ch or calkan@cs.bilkent.edu.tr.

Supplementary information

Supplementary data are available at Bioinformatics online.

Kim JS, Senol Cali D, Xin H, Lee D, Ghose S, Alser M, Hassan H, Ergin O, Alkan C, Mutlu O. GRIM-Filter: Fast seed location filtering in DNA read mapping using processing-in-memory technologies. BMC Genomics 2018;19:89. [PMID: 29764378 PMCID: PMC5954284 DOI: 10.1186/s12864-018-4460-0] [Citation(s) in RCA: 61] [Impact Index Per Article: 10.2] [Indexed: 01/01/2023]

Abstract

Background

Seed location filtering is critical in DNA read mapping, a process where billions of DNA fragments (reads) sampled from a donor are mapped onto a reference genome to identify genomic variants of the donor. State-of-the-art read mappers 1) quickly generate possible mapping locations for seeds (i.e., smaller segments) within each read, 2) extract reference sequences at each of the mapping locations, and 3) check similarity between each read and its associated reference sequences with a computationally-expensive algorithm (i.e., sequence alignment) to determine the origin of the read. A seed location filter comes into play before alignment, discarding seed locations that alignment would deem a poor match. The ideal seed location filter would discard all poor match locations prior to alignment such that there is no wasted computation on unnecessary alignments.
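
The three mapper steps listed above (seed lookup, reference extraction, similarity check) can be sketched as follows. This is a toy illustration under stated assumptions: seeds are non-overlapping k-mers, the index is a plain dictionary, and Hamming distance stands in for full sequence alignment. All function names are hypothetical.

```python
from collections import defaultdict


def build_seed_index(reference, k):
    """Map every k-mer of the reference to the list of positions where it occurs."""
    index = defaultdict(list)
    for i in range(len(reference) - k + 1):
        index[reference[i:i + k]].append(i)
    return index


def candidate_locations(read, index, k, ref_len):
    """Step 1: turn seed hits into candidate start positions for the whole read."""
    locations = set()
    for offset in range(0, len(read) - k + 1, k):  # non-overlapping seeds
        for hit in index.get(read[offset:offset + k], []):
            start = hit - offset
            if 0 <= start <= ref_len - len(read):
                locations.add(start)
    return sorted(locations)


def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def map_read(read, reference, index, k, max_mismatches):
    """Steps 2 and 3: extract each candidate window and check its similarity."""
    hits = []
    for start in candidate_locations(read, index, k, len(reference)):
        window = reference[start:start + len(read)]
        if hamming(read, window) <= max_mismatches:
            hits.append(start)
    return hits
```

A seed location filter would sit between `candidate_locations` and the similarity check, discarding starts that are unlikely to pass.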

Results

We propose a novel seed location filtering algorithm, GRIM-Filter, optimized to exploit 3D-stacked memory systems that integrate computation within a logic layer stacked under memory layers, to perform processing-in-memory (PIM). GRIM-Filter quickly filters seed locations by 1) introducing a new representation of coarse-grained segments of the reference genome, and 2) using massively-parallel in-memory operations to identify read presence within each coarse-grained segment. Our evaluations show that for a sequence alignment error tolerance of 0.05, GRIM-Filter 1) reduces the false negative rate of filtering by 5.59x–6.41x, and 2) provides an end-to-end read mapper speedup of 1.81x–3.65x, compared to a state-of-the-art read mapper employing the best previous seed location filtering algorithm.
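
A rough software analogue of coarse-grained segment filtering is sketched below. It assumes, purely for illustration, that each segment (bin) records which k-mers occur in it and that a read is kept for a bin only if enough of its k-mers are present. Python sets stand in for whatever in-memory representation the hardware uses, and `build_bins`, `candidate_bins` and the `min_hits` threshold are hypothetical.

```python
def kmers(seq, k):
    """All overlapping k-mers of seq, in order."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def build_bins(reference, bin_size, k):
    """Record the set of k-mers present in each coarse-grained segment.

    Segments are extended by k - 1 bases so that k-mers starting near
    the end of a bin are still attributed to that bin.
    """
    bins = []
    for start in range(0, len(reference), bin_size):
        segment = reference[start:start + bin_size + k - 1]
        bins.append(set(kmers(segment, k)))
    return bins


def candidate_bins(read, bins, k, min_hits):
    """Keep only bins that contain at least min_hits of the read's k-mers."""
    read_kmers = kmers(read, k)
    kept = []
    for idx, present in enumerate(bins):
        hits = sum(1 for km in read_kmers if km in present)
        if hits >= min_hits:
            kept.append(idx)
    return kept
```

Seed locations that fall in bins not returned by `candidate_bins` would be discarded before alignment. The per-bin checks are independent of each other, which is what makes this style of filter amenable to massively parallel execution.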

Conclusion

GRIM-Filter exploits 3D-stacked memory, which enables the efficient use of processing-in-memory, to overcome the memory bandwidth bottleneck in seed location filtering. We show that GRIM-Filter significantly improves the performance of a state-of-the-art read mapper. GRIM-Filter is a universal seed location filter that can be applied to any read mapper. We hope that our results provide inspiration for new works to design other bioinformatics algorithms that take advantage of emerging technologies and new processing paradigms, such as processing-in-memory using 3D-stacked memory devices.

Almutairy M, Torng E. Comparing fixed sampling with minimizer sampling when using k-mer indexes to find maximal exact matches. PLoS One 2018;13:e0189960. [PMID: 29389989 PMCID: PMC5794061 DOI: 10.1371/journal.pone.0189960] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Received: 05/08/2017] [Accepted: 12/05/2017] [Indexed: 01/20/2023]

Abstract

Bioinformatics applications and pipelines increasingly use k-mer indexes to search for similar sequences. The major problem with k-mer indexes is that they require lots of memory. Sampling is often used to reduce index size and query time. Most applications use one of two major types of sampling: fixed sampling and minimizer sampling. It is well known that fixed sampling will produce a smaller index, typically by roughly a factor of two, whereas it is generally assumed that minimizer sampling will produce faster query times since query k-mers can also be sampled. However, no direct comparison of fixed and minimizer sampling has been performed to verify these assumptions. We systematically compare fixed and minimizer sampling using the human genome as our database. We use the resulting k-mer indexes for fixed sampling and minimizer sampling to find all maximal exact matches between our database, the human genome, and three separate query sets, the mouse genome, the chimp genome, and an NGS data set. We reach the following conclusions. First, using larger k-mers reduces query time for both fixed sampling and minimizer sampling at a cost of requiring more space. If we use the same k-mer size for both methods, fixed sampling requires typically half as much space whereas minimizer sampling processes queries only slightly faster. If we are allowed to use any k-mer size for each method, then we can choose a k-mer size such that fixed sampling both uses less space and processes queries faster than minimizer sampling. The reason is that although minimizer sampling is able to sample query k-mers, the number of shared k-mer occurrences that must be processed is much larger for minimizer sampling than fixed sampling. In conclusion, we argue that for any application where each shared k-mer occurrence must be processed, fixed sampling is the right sampling method.
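
The two sampling schemes compared above can be illustrated on a toy sequence. The sketch below assumes fixed sampling keeps every w-th k-mer and minimizer sampling keeps the lexicographically smallest k-mer in each window of w consecutive k-mers (ties broken by leftmost position). The ordering and tie-breaking rules are assumptions, and the function names are hypothetical.

```python
def fixed_sample(seq, k, w):
    """Start positions of the k-mers kept by fixed sampling (every w-th k-mer)."""
    return list(range(0, len(seq) - k + 1, w))


def minimizer_sample(seq, k, w):
    """Start positions of the k-mers kept by minimizer sampling.

    For each window of w consecutive k-mers, keep the smallest k-mer
    in lexicographic order, preferring the leftmost on ties.
    """
    n_kmers = len(seq) - k + 1
    chosen = set()
    for start in range(n_kmers - w + 1):
        window = range(start, start + w)
        best = min(window, key=lambda i: (seq[i:i + k], i))
        chosen.add(best)
    return sorted(chosen)
```

On the sequence `ACGTACGTGA` with k = 3 and w = 3, fixed sampling keeps three k-mers and minimizer sampling keeps four, a small-scale instance of fixed sampling producing the smaller index. The upside of minimizers is that the same rule can be applied to a query, so query k-mers can be sampled as well.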

Kinghorn AB, Fraser LA, Liang S, Shiu SCC, Tanner JA. Aptamer Bioinformatics. Int J Mol Sci 2017;18:E2516. [PMID: 29186809 PMCID: PMC5751119 DOI: 10.3390/ijms18122516] [Citation(s) in RCA: 89] [Impact Index Per Article: 12.7] [Received: 10/31/2017] [Revised: 11/17/2017] [Accepted: 11/20/2017] [Indexed: 02/07/2023]

Reinert K, Dadi TH, Ehrhardt M, Hauswedell H, Mehringer S, Rahn R, Kim J, Pockrandt C, Winkler J, Siragusa E, Urgese G, Weese D. The SeqAn C++ template library for efficient sequence analysis: A resource for programmers. J Biotechnol 2017;261:157-168. [PMID: 28888961 DOI: 10.1016/j.jbiotec.2017.07.017] [Citation(s) in RCA: 67] [Impact Index Per Article: 9.6] [Received: 02/26/2017] [Revised: 07/17/2017] [Accepted: 07/19/2017] [Indexed: 11/27/2022]

Tsai MH, Liu YY, Soo VW. PathoBacTyper: A Web Server for Pathogenic Bacteria Identification and Molecular Genotyping. Front Microbiol 2017;8:1474. [PMID: 28824598 PMCID: PMC5540972 DOI: 10.3389/fmicb.2017.01474] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Received: 05/01/2017] [Accepted: 07/20/2017] [Indexed: 11/13/2022]

Almutairy M, Torng E. The effects of sampling on the efficiency and accuracy of k-mer indexes: Theoretical and empirical comparisons using the human genome. PLoS One 2017;12:e0179046. [PMID: 28686614 PMCID: PMC5501444 DOI: 10.1371/journal.pone.0179046] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 11/18/2016] [Accepted: 05/23/2017] [Indexed: 01/11/2023]

Canzar S, Salzberg SL. Short Read Mapping: An Algorithmic Tour. Proceedings of the IEEE. Institute of Electrical and Electronics Engineers 2017;105:436-458. [PMID: 28502990 PMCID: PMC5425171 DOI: 10.1109/jproc.2015.2455551] [Citation(s) in RCA: 27] [Impact Index Per Article: 3.9] [Indexed: 06/07/2023]

From next-generation resequencing reads to a high-quality variant data set. Heredity (Edinb) 2016;118:111-124. [PMID: 27759079 DOI: 10.1038/hdy.2016.102] [Citation(s) in RCA: 48] [Impact Index Per Article: 6.0] [Received: 04/24/2016] [Revised: 09/03/2016] [Accepted: 09/06/2016] [Indexed: 12/11/2022]

Do LAH, Wilm A, van Doorn HR, Lam HM, Sim S, Sukumaran R, Tran AT, Nguyen BH, Tran TTL, Tran QH, Vo QB, Dac NAT, Trinh HN, Nguyen TTH, Binh BTL, Le K, Nguyen MT, Thai QT, Vo TV, Ngo NQM, Dang TKH, Cao NH, Tran TV, Ho LV, Farrar J, de Jong M, Chen S, Nagarajan N, Bryant JE, Hibberd ML. Direct whole-genome deep-sequencing of human respiratory syncytial virus A and B from Vietnamese children identifies distinct patterns of inter- and intra-host evolution. J Gen Virol 2016;96:3470-3483. [PMID: 26407694 DOI: 10.1099/jgv.0.000298] [Citation(s) in RCA: 30] [Impact Index Per Article: 3.8] [Indexed: 01/07/2023]

Affiliation(s)

Lien Anh Ha Do Oxford University Clinical Research Unit, Wellcome Trust Major Overseas Program, Ho Chi Minh City, Vietnam
Andreas Wilm Genome Institute of Singapore, Genome Building, 138672 Singapore
H Rogier van Doorn Oxford University Clinical Research Unit, Wellcome Trust Major Overseas Program, Ho Chi Minh City, Vietnam; Nuffield Department of Clinical Medicine, University of Oxford, Oxford, UK
Ha Minh Lam Oxford University Clinical Research Unit, Wellcome Trust Major Overseas Program, Ho Chi Minh City, Vietnam
Shuzhen Sim Genome Institute of Singapore, Genome Building, 138672 Singapore
Rashmi Sukumaran Genome Institute of Singapore, Genome Building, 138672 Singapore
Anh Tuan Tran Children's Hospital 1, Ward 10, District 10, Ho Chi Minh City, Vietnam
Bach Hue Nguyen Children's Hospital 1, Ward 10, District 10, Ho Chi Minh City, Vietnam
Thi Thu Loan Tran Children's Hospital 2, Ben Nghe Ward, District 1, Ho Chi Minh City, Vietnam
Quynh Huong Tran Children's Hospital 2, Ben Nghe Ward, District 1, Ho Chi Minh City, Vietnam
Quoc Bao Vo Children's Hospital 2, Ben Nghe Ward, District 1, Ho Chi Minh City, Vietnam
Nguyen Anh Tran Dac Children's Hospital 2, Ben Nghe Ward, District 1, Ho Chi Minh City, Vietnam
Hong Nhien Trinh Children's Hospital 1, Ward 10, District 10, Ho Chi Minh City, Vietnam
Thi Thanh Hai Nguyen Children's Hospital 1, Ward 10, District 10, Ho Chi Minh City, Vietnam
Bao Tinh Le Binh Children's Hospital 1, Ward 10, District 10, Ho Chi Minh City, Vietnam
Khanh Le Children's Hospital 1, Ward 10, District 10, Ho Chi Minh City, Vietnam
Minh Tien Nguyen Children's Hospital 1, Ward 10, District 10, Ho Chi Minh City, Vietnam
Quang Tung Thai Children's Hospital 1, Ward 10, District 10, Ho Chi Minh City, Vietnam
Thanh Vu Vo Children's Hospital 1, Ward 10, District 10, Ho Chi Minh City, Vietnam
Ngoc Quang Minh Ngo Children's Hospital 1, Ward 10, District 10, Ho Chi Minh City, Vietnam
Thi Kim Huyen Dang Children's Hospital 2, Ben Nghe Ward, District 1, Ho Chi Minh City, Vietnam
Ngoc Huong Cao Children's Hospital 2, Ben Nghe Ward, District 1, Ho Chi Minh City, Vietnam
Thu Van Tran Children's Hospital 2, Ben Nghe Ward, District 1, Ho Chi Minh City, Vietnam
Lu Viet Ho Children's Hospital 2, Ben Nghe Ward, District 1, Ho Chi Minh City, Vietnam
Jeremy Farrar Oxford University Clinical Research Unit, Wellcome Trust Major Overseas Program, Ho Chi Minh City, Vietnam
Menno de Jong Oxford University Clinical Research Unit, Wellcome Trust Major Overseas Program, Ho Chi Minh City, Vietnam; Nuffield Department of Clinical Medicine, University of Oxford, Oxford, UK; Department of Medical Microbiology, Academic Medical Center, University of Amsterdam, Amsterdam, The Netherlands
Swaine Chen Genome Institute of Singapore, Genome Building, 138672 Singapore
Niranjan Nagarajan Genome Institute of Singapore, Genome Building, 138672 Singapore
Juliet E Bryant Oxford University Clinical Research Unit, Wellcome Trust Major Overseas Program, Ho Chi Minh City, Vietnam; Nuffield Department of Clinical Medicine, University of Oxford, Oxford, UK
Martin L Hibberd Genome Institute of Singapore, Genome Building, 138672 Singapore


Mapping and differential expression analysis from short-read RNA-Seq data in model organisms. QUANTITATIVE BIOLOGY 2016. [DOI: 10.1007/s40484-016-0060-7] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/25/2022]

Kalscheuer VM, James VM, Himelright ML, Long P, Oegema R, Jensen C, Bienek M, Hu H, Haas SA, Topf M, Hoogeboom AJM, Harvey K, Walikonis R, Harvey RJ. Novel Missense Mutation A789V in IQSEC2 Underlies X-Linked Intellectual Disability in the MRX78 Family. Front Mol Neurosci 2016;8:85. [PMID: 26793055 PMCID: PMC4707274 DOI: 10.3389/fnmol.2015.00085] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.6] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/23/2015] [Accepted: 12/14/2015] [Indexed: 12/04/2022] Open

Hu H, Haas SA, Chelly J, Van Esch H, Raynaud M, de Brouwer APM, Weinert S, Froyen G, Frints SGM, Laumonnier F, Zemojtel T, Love MI, Richard H, Emde AK, Bienek M, Jensen C, Hambrock M, Fischer U, Langnick C, Feldkamp M, Wissink-Lindhout W, Lebrun N, Castelnau L, Rucci J, Montjean R, Dorseuil O, Billuart P, Stuhlmann T, Shaw M, Corbett MA, Gardner A, Willis-Owen S, Tan C, Friend KL, Belet S, van Roozendaal KEP, Jimenez-Pocquet M, Moizard MP, Ronce N, Sun R, O'Keeffe S, Chenna R, van Bömmel A, Göke J, Hackett A, Field M, Christie L, Boyle J, Haan E, Nelson J, Turner G, Baynam G, Gillessen-Kaesbach G, Müller U, Steinberger D, Budny B, Badura-Stronka M, Latos-Bieleńska A, Ousager LB, Wieacker P, Rodríguez Criado G, Bondeson ML, Annerén G, Dufke A, Cohen M, Van Maldergem L, Vincent-Delorme C, Echenne B, Simon-Bouy B, Kleefstra T, Willemsen M, Fryns JP, Devriendt K, Ullmann R, Vingron M, Wrogemann K, Wienker TF, Tzschach A, van Bokhoven H, Gecz J, Jentsch TJ, Chen W, Ropers HH, Kalscheuer VM. X-exome sequencing of 405 unresolved families identifies seven novel intellectual disability genes. Mol Psychiatry 2016;21:133-48. [PMID: 25644381 PMCID: PMC5414091 DOI: 10.1038/mp.2014.193] [Citation(s) in RCA: 208] [Impact Index Per Article: 26.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 08/04/2014] [Revised: 11/17/2014] [Accepted: 12/08/2014] [Indexed: 12/27/2022]

Affiliation(s)

H Hu Department of Human Molecular Genetics, Max Planck Institute for Molecular Genetics, Berlin, Germany
S A Haas Department of Computational Molecular Biology, Max Planck Institute for Molecular Genetics, Berlin, Germany
J Chelly University Paris Descartes, Paris, France; Centre National de la Recherche Scientifique Unité Mixte de Recherche 8104, Institut National de la Santé et de la Recherche Médicale Unité 1016, Institut Cochin, Paris, France
H Van Esch Center for Human Genetics, University Hospitals Leuven, Leuven, Belgium
M Raynaud Inserm U930 ‘Imaging and Brain’, Tours, France; University François-Rabelais, Tours, France; Centre Hospitalier Régional Universitaire, Service de Génétique, Tours, France
A P M de Brouwer Department of Human Genetics, Radboud University Medical Center, Donders Institute for Brain, Cognition and Behaviour, Nijmegen, The Netherlands
S Weinert Max-Delbrück-Centrum für Molekulare Medizin, Berlin, Germany; Leibniz-Institut für Molekulare Pharmakologie, Berlin, Germany
G Froyen Human Genome Laboratory, VIB Center for the Biology of Disease, Leuven, Belgium; Human Genome Laboratory, Department of Human Genetics, K.U. Leuven, Leuven, Belgium
S G M Frints Department of Clinical Genetics, Maastricht University Medical Center, azM, Maastricht, The Netherlands; School for Oncology and Developmental Biology, GROW, Maastricht University, Maastricht, The Netherlands
F Laumonnier Inserm U930 ‘Imaging and Brain’, Tours, France; University François-Rabelais, Tours, France
T Zemojtel Department of Computational Molecular Biology, Max Planck Institute for Molecular Genetics, Berlin, Germany
M I Love Department of Computational Molecular Biology, Max Planck Institute for Molecular Genetics, Berlin, Germany
H Richard Department of Computational Molecular Biology, Max Planck Institute for Molecular Genetics, Berlin, Germany
A-K Emde Department of Computational Molecular Biology, Max Planck Institute for Molecular Genetics, Berlin, Germany
M Bienek Department of Human Molecular Genetics, Max Planck Institute for Molecular Genetics, Berlin, Germany
C Jensen Department of Human Molecular Genetics, Max Planck Institute for Molecular Genetics, Berlin, Germany
M Hambrock Department of Human Molecular Genetics, Max Planck Institute for Molecular Genetics, Berlin, Germany
U Fischer Department of Human Molecular Genetics, Max Planck Institute for Molecular Genetics, Berlin, Germany
C Langnick Max-Delbrück-Centrum für Molekulare Medizin, Berlin, Germany
M Feldkamp Max-Delbrück-Centrum für Molekulare Medizin, Berlin, Germany
W Wissink-Lindhout Department of Human Genetics, Radboud University Medical Center, Donders Institute for Brain, Cognition and Behaviour, Nijmegen, The Netherlands
N Lebrun University Paris Descartes, Paris, France; Centre National de la Recherche Scientifique Unité Mixte de Recherche 8104, Institut National de la Santé et de la Recherche Médicale Unité 1016, Institut Cochin, Paris, France
L Castelnau University Paris Descartes, Paris, France; Centre National de la Recherche Scientifique Unité Mixte de Recherche 8104, Institut National de la Santé et de la Recherche Médicale Unité 1016, Institut Cochin, Paris, France
J Rucci University Paris Descartes, Paris, France; Centre National de la Recherche Scientifique Unité Mixte de Recherche 8104, Institut National de la Santé et de la Recherche Médicale Unité 1016, Institut Cochin, Paris, France
R Montjean University Paris Descartes, Paris, France; Centre National de la Recherche Scientifique Unité Mixte de Recherche 8104, Institut National de la Santé et de la Recherche Médicale Unité 1016, Institut Cochin, Paris, France
O Dorseuil University Paris Descartes, Paris, France; Centre National de la Recherche Scientifique Unité Mixte de Recherche 8104, Institut National de la Santé et de la Recherche Médicale Unité 1016, Institut Cochin, Paris, France
P Billuart University Paris Descartes, Paris, France; Centre National de la Recherche Scientifique Unité Mixte de Recherche 8104, Institut National de la Santé et de la Recherche Médicale Unité 1016, Institut Cochin, Paris, France
T Stuhlmann Max-Delbrück-Centrum für Molekulare Medizin, Berlin, Germany; Leibniz-Institut für Molekulare Pharmakologie, Berlin, Germany
M Shaw School of Paediatrics and Reproductive Health, The University of Adelaide, Adelaide, SA, Australia; Robinson Research Institute, The University of Adelaide, Adelaide, SA, Australia
M A Corbett School of Paediatrics and Reproductive Health, The University of Adelaide, Adelaide, SA, Australia; Robinson Research Institute, The University of Adelaide, Adelaide, SA, Australia
A Gardner School of Paediatrics and Reproductive Health, The University of Adelaide, Adelaide, SA, Australia; Robinson Research Institute, The University of Adelaide, Adelaide, SA, Australia
S Willis-Owen School of Paediatrics and Reproductive Health, The University of Adelaide, Adelaide, SA, Australia; National Heart and Lung Institute, Imperial College London, London, UK
C Tan School of Paediatrics and Reproductive Health, The University of Adelaide, Adelaide, SA, Australia
K L Friend SA Pathology, Women's and Children's Hospital, Adelaide, SA, Australia
S Belet Human Genome Laboratory, VIB Center for the Biology of Disease, Leuven, Belgium; Human Genome Laboratory, Department of Human Genetics, K.U. Leuven, Leuven, Belgium
K E P van Roozendaal Department of Clinical Genetics, Maastricht University Medical Center, azM, Maastricht, The Netherlands; School for Oncology and Developmental Biology, GROW, Maastricht University, Maastricht, The Netherlands
M Jimenez-Pocquet Centre Hospitalier Régional Universitaire, Service de Génétique, Tours, France
M-P Moizard Inserm U930 ‘Imaging and Brain’, Tours, France; University François-Rabelais, Tours, France; Centre Hospitalier Régional Universitaire, Service de Génétique, Tours, France
N Ronce Inserm U930 ‘Imaging and Brain’, Tours, France; University François-Rabelais, Tours, France; Centre Hospitalier Régional Universitaire, Service de Génétique, Tours, France
R Sun Department of Computational Molecular Biology, Max Planck Institute for Molecular Genetics, Berlin, Germany
S O'Keeffe Department of Computational Molecular Biology, Max Planck Institute for Molecular Genetics, Berlin, Germany
R Chenna Department of Computational Molecular Biology, Max Planck Institute for Molecular Genetics, Berlin, Germany
A van Bömmel Department of Computational Molecular Biology, Max Planck Institute for Molecular Genetics, Berlin, Germany
J Göke Department of Computational Molecular Biology, Max Planck Institute for Molecular Genetics, Berlin, Germany
A Hackett Genetics of Learning and Disability Service, Hunter Genetics, Waratah, NSW, Australia
M Field Genetics of Learning and Disability Service, Hunter Genetics, Waratah, NSW, Australia
L Christie Genetics of Learning and Disability Service, Hunter Genetics, Waratah, NSW, Australia
J Boyle Genetics of Learning and Disability Service, Hunter Genetics, Waratah, NSW, Australia
E Haan School of Paediatrics and Reproductive Health, The University of Adelaide, Adelaide, SA, Australia; SA Pathology, Women's and Children's Hospital, Adelaide, SA, Australia
J Nelson Genetic Services of Western Australia, King Edward Memorial Hospital, Perth, WA, Australia
G Turner Genetics of Learning and Disability Service, Hunter Genetics, Waratah, NSW, Australia
G Baynam Genetic Services of Western Australia, King Edward Memorial Hospital, Perth, WA, Australia; School of Paediatrics and Child Health, University of Western Australia, Perth, WA, Australia; Institute for Immunology and Infectious Diseases, Murdoch University, Perth, WA, Australia; Telethon Kids Institute, Perth, WA, Australia
G Gillessen-Kaesbach Institut für Humangenetik, Universität zu Lübeck, Lübeck, Germany
U Müller Institut für Humangenetik, Justus-Liebig-Universität Giessen, Giessen, Germany; bio.logis Center for Human Genetics, Frankfurt a. M., Germany
D Steinberger Institut für Humangenetik, Justus-Liebig-Universität Giessen, Giessen, Germany; bio.logis Center for Human Genetics, Frankfurt a. M., Germany
B Budny Chair and Department of Endocrinology, Metabolism and Internal Diseases, Poznan University of Medical Sciences, Poznan, Poland
M Badura-Stronka Chair and Department of Medical Genetics, Poznan University of Medical Sciences, Poznan, Poland
A Latos-Bieleńska Chair and Department of Medical Genetics, Poznan University of Medical Sciences, Poznan, Poland
L B Ousager Department of Clinical Genetics, Odense University Hospital, Odense, Denmark
P Wieacker Institut für Humangenetik, Universitätsklinikum Münster, Muenster, Germany
G Rodríguez Criado Unidad de Genética Clínica, Hospital Virgen del Rocío, Sevilla, España
M-L Bondeson Department of Immunology, Genetics and Pathology, Uppsala University, Uppsala, Sweden
G Annerén Department of Immunology, Genetics and Pathology, Uppsala University, Uppsala, Sweden
A Dufke Institut für Medizinische Genetik und Angewandte Genomik, Tübingen, Germany
M Cohen Kinderzentrum München, München, Germany
L Van Maldergem Centre de Génétique Humaine, Université de Franche-Comté, Besançon, France
C Vincent-Delorme Service de Génétique, Hôpital Jeanne de Flandre CHRU de Lille, Lille, France
B Echenne Service de Neuro-Pédiatrie, CHU Montpellier, Montpellier, France
B Simon-Bouy Laboratoire SESEP, Centre hospitalier de Versailles, Le Chesnay, France
T Kleefstra Department of Human Genetics, Radboud University Medical Center, Donders Institute for Brain, Cognition and Behaviour, Nijmegen, The Netherlands
M Willemsen Department of Human Genetics, Radboud University Medical Center, Donders Institute for Brain, Cognition and Behaviour, Nijmegen, The Netherlands
J-P Fryns Center for Human Genetics, University Hospitals Leuven, Leuven, Belgium
K Devriendt Center for Human Genetics, University Hospitals Leuven, Leuven, Belgium
R Ullmann Department of Human Molecular Genetics, Max Planck Institute for Molecular Genetics, Berlin, Germany
M Vingron Department of Computational Molecular Biology, Max Planck Institute for Molecular Genetics, Berlin, Germany
K Wrogemann Department of Human Molecular Genetics, Max Planck Institute for Molecular Genetics, Berlin, Germany; Department of Biochemistry and Medical Genetics, University of Manitoba, Winnipeg, MB, Canada
T F Wienker Department of Human Molecular Genetics, Max Planck Institute for Molecular Genetics, Berlin, Germany
A Tzschach Department of Human Molecular Genetics, Max Planck Institute for Molecular Genetics, Berlin, Germany
H van Bokhoven Department of Human Genetics, Radboud University Medical Center, Donders Institute for Brain, Cognition and Behaviour, Nijmegen, The Netherlands
J Gecz School of Paediatrics and Reproductive Health, The University of Adelaide, Adelaide, SA, Australia; Robinson Research Institute, The University of Adelaide, Adelaide, SA, Australia
T J Jentsch Max-Delbrück-Centrum für Molekulare Medizin, Berlin, Germany; Leibniz-Institut für Molekulare Pharmakologie, Berlin, Germany
W Chen Department of Human Molecular Genetics, Max Planck Institute for Molecular Genetics, Berlin, Germany; Max-Delbrück-Centrum für Molekulare Medizin, Berlin, Germany
H-H Ropers Department of Human Molecular Genetics, Max Planck Institute for Molecular Genetics, Berlin, Germany
V M Kalscheuer Department of Human Molecular Genetics, Max Planck Institute for Molecular Genetics, Berlin, Germany; Max Planck Institute for Molecular Genetics, Ihnestrasse 73, Berlin 14195, Germany


Reinert K, Langmead B, Weese D, Evers DJ. Alignment of Next-Generation Sequencing Reads. Annu Rev Genomics Hum Genet 2015;16:133-51. [DOI: 10.1146/annurev-genom-090413-025358] [Citation(s) in RCA: 82] [Impact Index Per Article: 9.1] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/09/2022]

Lim JQ, Tennakoon C, Guan P, Sung WK. BatAlign: an incremental method for accurate alignment of sequencing reads. Nucleic Acids Res 2015;43:e107. [PMID: 26170239 PMCID: PMC4652746 DOI: 10.1093/nar/gkv533] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/09/2015] [Accepted: 05/09/2015] [Indexed: 11/12/2022] Open

Cheng H, Jiang H, Yang J, Xu Y, Shang Y. BitMapper: an efficient all-mapper based on bit-vector computing. BMC Bioinformatics 2015;16:192. [PMID: 26063651 PMCID: PMC4462005 DOI: 10.1186/s12859-015-0626-9] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/26/2014] [Accepted: 05/22/2015] [Indexed: 11/10/2022] Open

Evaluation and application of the strand-specific protocol for next-generation sequencing. BIOMED RESEARCH INTERNATIONAL 2015;2015:182389. [PMID: 25893191 PMCID: PMC4393923 DOI: 10.1155/2015/182389] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 09/11/2014] [Accepted: 02/03/2015] [Indexed: 12/02/2022]

Hormozdiari F, Eskin E. Memory efficient assembly of human genome. J Bioinform Comput Biol 2015;13:1550008. [PMID: 25603998 DOI: 10.1142/s0219720015500080] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/18/2022]

Hauswedell H, Singer J, Reinert K. Lambda: the local aligner for massive biological data. Bioinformatics 2015;30:i349-55. [PMID: 25161219 PMCID: PMC4147892 DOI: 10.1093/bioinformatics/btu439] [Citation(s) in RCA: 48] [Impact Index Per Article: 5.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/30/2022]

Pasquier C, Clément M, Dombrovsky A, Penaud S, Da Rocha M, Rancurel C, Ledger N, Capovilla M, Robichon A. Environmentally selected aphid variants in clonality context display differential patterns of methylation in the genome. PLoS One 2014;9:e115022. [PMID: 25551225 PMCID: PMC4281257 DOI: 10.1371/journal.pone.0115022] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.4] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/23/2014] [Accepted: 11/17/2014] [Indexed: 11/18/2022] Open

Chandran PA, Keller A, Weinmann L, Seida AA, Braun M, Andreev K, Fischer B, Horn E, Schwinn S, Junker M, Houben R, Dombrowski Y, Dietl J, Finotto S, Wölfl M, Meister G, Wischhusen J. The TGF-β-inducible miR-23a cluster attenuates IFN-γ levels and antigen-specific cytotoxicity in human CD8⁺ T cells. J Leukoc Biol 2014;96:633-45. [PMID: 25030422 DOI: 10.1189/jlb.3a0114-025r] [Citation(s) in RCA: 32] [Impact Index Per Article: 3.2] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/19/2022] Open

RandAL: a randomized approach to aligning DNA sequences to reference genomes. BMC Genomics 2014;15 Suppl 5:S2. [PMID: 25081493 PMCID: PMC4120144 DOI: 10.1186/1471-2164-15-s5-s2] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/10/2022] Open

Poole CB, Gu W, Kumar S, Jin J, Davis PJ, Bauche D, McReynolds LA. Diversity and expression of microRNAs in the filarial parasite, Brugia malayi. PLoS One 2014;9:e96498. [PMID: 24824352 PMCID: PMC4019659 DOI: 10.1371/journal.pone.0096498] [Citation(s) in RCA: 24] [Impact Index Per Article: 2.4] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/22/2013] [Accepted: 04/08/2014] [Indexed: 11/18/2022] Open

Hach F, Sarrafi I, Hormozdiari F, Alkan C, Eichler EE, Sahinalp SC. mrsFAST-Ultra: a compact, SNP-aware mapper for high performance sequencing applications. Nucleic Acids Res 2014;42:W494-500. [PMID: 24810850 PMCID: PMC4086126 DOI: 10.1093/nar/gku370] [Citation(s) in RCA: 41] [Impact Index Per Article: 4.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/31/2022] Open

Abstract

High throughput sequencing (HTS) platforms generate unprecedented amounts of data that introduce challenges for processing and downstream analysis. While tools that report the ‘best’ mapping location of each read provide a fast way to process HTS data, they are not suitable for many types of downstream analysis such as structural variation detection, where it is important to report multiple mapping loci for each read. For this purpose we introduce mrsFAST-Ultra, a fast, cache oblivious, SNP-aware aligner that can handle the multi-mapping of HTS reads very efficiently. mrsFAST-Ultra improves mrsFAST, our first cache oblivious read aligner capable of handling multi-mapping reads, through new and compact index structures that reduce not only the overall memory usage but also the number of CPU operations per alignment. In fact the size of the index generated by mrsFAST-Ultra is 10 times smaller than that of mrsFAST. As importantly, mrsFAST-Ultra introduces new features such as being able to (i) obtain the best mapping loci for each read, and (ii) return all reads that have at most n mapping loci (within an error threshold), together with these loci, for any user specified n. Furthermore, mrsFAST-Ultra is SNP-aware, i.e. it can map reads to the reference genome while discounting the mismatches that occur at common SNP locations provided by dbSNP; this significantly increases the number of reads that can be mapped to the reference genome. Notice that all of the above features are implemented within the index structure and are not simple post-processing steps and thus are performed highly efficiently. Finally, mrsFAST-Ultra utilizes multiple available cores and processors and can be tuned for various memory settings. Our results show that mrsFAST-Ultra is roughly five times faster than its predecessor mrsFAST.
In comparison to newly enhanced popular tools such as Bowtie2, it is more sensitive (it can report 10 times or more mappings per read) and much faster (six times or more) in the multi-mapping mode. Furthermore, mrsFAST-Ultra has an index size of 2GB for the entire human reference genome, which is roughly half of that of Bowtie2. mrsFAST-Ultra is open source and it can be accessed at http://mrsfast.sourceforge.net.
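
The SNP-aware idea described in this abstract (discounting mismatches at known SNP positions) can be illustrated with a minimal sketch. This is not mrsFAST-Ultra's implementation: the abstract states that the real tool builds this into its index structure, whereas the sketch below is a naive linear scan. The function names, the use of a plain set of 0-based reference positions for the SNP catalogue, and the choice to ignore any mismatch at a SNP position regardless of allele are all assumptions made for illustration.

```python
def snp_aware_mismatches(read, window, snp_offsets):
    """Count mismatches between a read and an equal-length reference window,
    skipping offsets listed in snp_offsets (hypothetical SNP catalogue)."""
    if len(read) != len(window):
        raise ValueError("read and window must have the same length")
    return sum(
        1
        for i, (r, g) in enumerate(zip(read, window))
        if r != g and i not in snp_offsets
    )


def map_read(read, reference, snp_positions, max_mismatches):
    """Naive all-mapper: return every start position where the read aligns
    with at most max_mismatches mismatches outside known SNP positions.

    snp_positions is assumed to be a set of 0-based reference coordinates.
    """
    n = len(read)
    hits = []
    for start in range(len(reference) - n + 1):
        # Translate reference coordinates into offsets within this window.
        local = {p - start for p in snp_positions if start <= p < start + n}
        window = reference[start:start + n]
        if snp_aware_mismatches(read, window, local) <= max_mismatches:
            hits.append(start)
    return hits


if __name__ == "__main__":
    reference = "ACGTACGTAC"
    read = "ACTT"
    # Without SNP information the read has no exact match.
    print(map_read(read, reference, set(), 0))
    # Treating reference position 2 as a known SNP site recovers a hit.
    print(map_read(read, reference, {2}, 0))
```

Under these assumptions, a read that differs from the reference only at a catalogued SNP site is reported as an exact match, which is the mechanism by which the abstract says more reads become mappable.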


Evaluation and comparison of multiple aligners for next-generation sequencing data analysis. BIOMED RESEARCH INTERNATIONAL 2014;2014:309650. [PMID: 24779008 PMCID: PMC3980841 DOI: 10.1155/2014/309650] [Citation(s) in RCA: 44] [Impact Index Per Article: 4.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 12/17/2013] [Accepted: 02/04/2014] [Indexed: 12/23/2022]

The effects of carbon dioxide and temperature on microRNA expression in Arabidopsis development. Nat Commun 2014;4:2145. [PMID: 23900278 DOI: 10.1038/ncomms3145] [Citation(s) in RCA: 99] [Impact Index Per Article: 9.9] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/13/2012] [Accepted: 06/14/2013] [Indexed: 11/09/2022] Open

Grunert M, Dorn C, Schueler M, Dunkel I, Schlesinger J, Mebus S, Alexi-Meskishvili V, Perrot A, Wassilew K, Timmermann B, Hetzer R, Berger F, Sperling SR. Rare and private variations in neural crest, apoptosis and sarcomere genes define the polygenic background of isolated Tetralogy of Fallot. Hum Mol Genet 2014;23:3115-28. [PMID: 24459294 DOI: 10.1093/hmg/ddu021] [Citation(s) in RCA: 38] [Impact Index Per Article: 3.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/17/2023] Open

Affiliation(s)

Marcel Grunert Group of Cardiovascular Genetics, Department of Vertebrate Genomics and Cardiovascular Genetics, Experimental and Clinical Research Center, Charité-Universitätsmedizin Berlin and Max Delbrück Center (MDC) for Molecular Medicine, Berlin 13125, Germany
Cornelia Dorn Group of Cardiovascular Genetics, Department of Vertebrate Genomics and Cardiovascular Genetics, Experimental and Clinical Research Center, Charité-Universitätsmedizin Berlin and Max Delbrück Center (MDC) for Molecular Medicine, Berlin 13125, Germany; Department of Biology, Chemistry and Pharmacy, Free University of Berlin, Berlin 14195, Germany
Markus Schueler Group of Cardiovascular Genetics, Department of Vertebrate Genomics and Cardiovascular Genetics, Experimental and Clinical Research Center, Charité-Universitätsmedizin Berlin and Max Delbrück Center (MDC) for Molecular Medicine, Berlin 13125, Germany
Ilona Dunkel Group of Cardiovascular Genetics, Department of Vertebrate Genomics and
Jenny Schlesinger Group of Cardiovascular Genetics, Department of Vertebrate Genomics and Cardiovascular Genetics, Experimental and Clinical Research Center, Charité-Universitätsmedizin Berlin and Max Delbrück Center (MDC) for Molecular Medicine, Berlin 13125, Germany
Siegrun Mebus Department of Pediatric Cardiology, German Heart Institute Berlin and Department of Pediatric Cardiology, Charité-Universitätsmedizin Berlin, Berlin 13353, Germany
Vladimir Alexi-Meskishvili Department of Cardiac Surgery and
Andreas Perrot Cardiovascular Genetics, Experimental and Clinical Research Center, Charité-Universitätsmedizin Berlin and Max Delbrück Center (MDC) for Molecular Medicine, Berlin 13125, Germany
Katharina Wassilew Department of Pathology, German Heart Institute Berlin, Berlin, Germany
Bernd Timmermann Next Generation Service Group, Max Planck Institute for Molecular Genetics, Berlin 14195, Germany
Roland Hetzer Department of Cardiac Surgery and
Felix Berger Department of Pediatric Cardiology, German Heart Institute Berlin and Department of Pediatric Cardiology, Charité-Universitätsmedizin Berlin, Berlin 13353, Germany
Silke R Sperling Group of Cardiovascular Genetics, Department of Vertebrate Genomics and Cardiovascular Genetics, Experimental and Clinical Research Center, Charité-Universitätsmedizin Berlin and Max Delbrück Center (MDC) for Molecular Medicine, Berlin 13125, Germany; Department of Biology, Chemistry and Pharmacy, Free University of Berlin, Berlin 14195, Germany


Impact of Next-Generation Whole-Exome sequencing in molecular diagnostics. Drug Invention Today 2013. [DOI: 10.1016/j.dit.2013.07.005] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/07/2023]

Busse CE, Czogiel I, Braun P, Arndt PF, Wardemann H. Single-cell based high-throughput sequencing of full-length immunoglobulin heavy and light chain genes. Eur J Immunol 2013;44:597-603. [PMID: 24114719 DOI: 10.1002/eji.201343917] [Citation(s) in RCA: 94] [Impact Index Per Article: 8.5] [Reference Citation Analysis] [Abstract] [Key Words] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/21/2013] [Revised: 08/27/2013] [Accepted: 09/19/2013] [Indexed: 11/09/2022]

Lederman R. A random-permutations-based approach to fast read alignment. BMC Bioinformatics 2013;14 Suppl 5:S8. [PMID: 23734846 PMCID: PMC3622637 DOI: 10.1186/1471-2105-14-s5-s8] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/23/2022] Open

Hatem A, Bozdağ D, Toland AE, Çatalyürek ÜV. Benchmarking short sequence mapping tools. BMC Bioinformatics 2013;14:184. [PMID: 23758764 PMCID: PMC3694458 DOI: 10.1186/1471-2105-14-184] [Citation(s) in RCA: 121] [Impact Index Per Article: 11.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/09/2012] [Accepted: 05/28/2013] [Indexed: 01/21/2023] Open

Abstract

BACKGROUND

The development of next-generation sequencing instruments has led to the generation of millions of short sequences in a single run. The process of aligning these reads to a reference genome is time consuming and demands the development of fast and accurate alignment tools. However, the current proposed tools make different compromises between the accuracy and the speed of mapping. Moreover, many important aspects are overlooked while comparing the performance of a newly developed tool to the state of the art. Therefore, there is a need for an objective evaluation method that covers all the aspects. In this work, we introduce a benchmarking suite to extensively analyze sequencing tools with respect to various aspects and provide an objective comparison.

RESULTS

We applied our benchmarking tests on 9 well known mapping tools, namely, Bowtie, Bowtie2, BWA, SOAP2, MAQ, RMAP, GSNAP, Novoalign, and mrsFAST (mrFAST) using synthetic data and real RNA-Seq data. MAQ and RMAP are based on building hash tables for the reads, whereas the remaining tools are based on indexing the reference genome. The benchmarking tests reveal the strengths and weaknesses of each tool. The results show that no single tool outperforms all others in all metrics. However, Bowtie maintained the best throughput for most of the tests while BWA performed better for longer read lengths. The benchmarking tests are not restricted to the mentioned tools and can be further applied to others.

CONCLUSION

The mapping process is still a hard problem that is affected by many factors. In this work, we provided a benchmarking suite that reveals and evaluates the different factors affecting the mapping process. Still, there is no tool that outperforms all of the others in all the tests. Therefore, the end user should clearly specify his needs in order to choose the tool that provides the best results.


Giese SH, Zickmann F, Renard BY. Specificity control for read alignments using an artificial reference genome-guided false discovery rate. Bioinformatics 2013;30:9-16. [PMID: 23685787 DOI: 10.1093/bioinformatics/btt255] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/12/2022]

Mahmud MP, Wiedenhoeft J, Schliep A. Indel-tolerant read mapping with trinucleotide frequencies using cache-oblivious kd-trees. Bioinformatics 2013;28:i325-i332. [PMID: 22962448 PMCID: PMC3436807 DOI: 10.1093/bioinformatics/bts380] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/31/2022] Open

Fimereli D, Detours V, Konopka T. TriageTools: tools for partitioning and prioritizing analysis of high-throughput sequencing data. Nucleic Acids Res 2013;41:e86. [PMID: 23408855 PMCID: PMC3627586 DOI: 10.1093/nar/gkt094] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/22/2022] Open

Nestorov P, Battke F, Levesque MP, Gerberding M. The maternal transcriptome of the crustacean Parhyale hawaiensis is inherited asymmetrically to invariant cell lineages of the ectoderm and mesoderm. PLoS One 2013;8:e56049. [PMID: 23418507 PMCID: PMC3572164 DOI: 10.1371/journal.pone.0056049] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.9] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/26/2012] [Accepted: 01/04/2013] [Indexed: 11/18/2022] Open

Abstract

BACKGROUND

The embryo of the crustacean Parhyale hawaiensis has a total, unequal and invariant early cleavage pattern. It specifies cell fates earlier than other arthropods, including Drosophila, as individual blastomeres of the 8-cell stage are allocated to the germ layers and the germline. Furthermore, the 8-cell stage is amenable to embryological manipulations. These unique features make Parhyale a suitable system for elucidating germ layer specification in arthropods. Since asymmetric localization of maternally provided RNA is a widespread mechanism to specify early cell fates, we asked whether this is also true for Parhyale. A candidate gene approach did not find RNAs that are asymmetrically distributed at the 8-cell stage. Therefore, we designed a high-density microarray from 9400 recently sequenced ESTs (1) to identify maternally provided RNAs and (2) to find RNAs that are differentially distributed among cells of the 8-cell stage.

RESULTS

Maternal-zygotic transition takes place around the 32-cell stage, i.e. after the specification of germ layers. By comparing a pool of RNAs from early embryos without zygotic transcription to zygotic RNAs of the germband, we found that more than 10% of the targets on the array were enriched in the maternal transcript pool. A screen for asymmetrically distributed RNAs at the 8-cell stage revealed 129 transcripts, from which 50% are predominantly expressed in the early embryonic stages. Finally, we performed knockdown experiments for two of these genes and observed cell-fate-related defects of embryonic development.

CONCLUSIONS

In contrast to Drosophila, the four primary germ layer cell lineages in Parhyale are specified during the maternal control phase of the embryo. A key step in this process is the asymmetric distribution of a large number of maternal RNAs to the germ layer progenitor cells.

Xin H, Lee D, Hormozdiari F, Yedkar S, Mutlu O, Alkan C. Accelerating read mapping with FastHASH. BMC Genomics 2013;14 Suppl 1:S13. [PMID: 23369189 PMCID: PMC3549798 DOI: 10.1186/1471-2164-14-s1-s13] [Citation(s) in RCA: 37] [Impact Index Per Article: 3.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/10/2022] Open

Veeneman BA, Iyer MK, Chinnaiyan AM. Oculus: faster sequence alignment by streaming read compression. BMC Bioinformatics 2012;13:297. [PMID: 23148484 PMCID: PMC3534618 DOI: 10.1186/1471-2105-13-297] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/10/2012] [Accepted: 11/01/2012] [Indexed: 01/17/2023] Open

Wilm A, Aw PPK, Bertrand D, Yeo GHT, Ong SH, Wong CH, Khor CC, Petric R, Hibberd ML, Nagarajan N. LoFreq: a sequence-quality aware, ultra-sensitive variant caller for uncovering cell-population heterogeneity from high-throughput sequencing datasets. Nucleic Acids Res 2012;40:11189-201. [PMID: 23066108 PMCID: PMC3526318 DOI: 10.1093/nar/gks918] [Citation(s) in RCA: 887] [Impact Index Per Article: 73.9] [Reference Citation Analysis] [Abstract] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 02/06/2023] Open

Fonseca NA, Rung J, Brazma A, Marioni JC. Tools for mapping high-throughput sequencing data. Bioinformatics 2012;28:3169-77. [DOI: 10.1093/bioinformatics/bts605] [Citation(s) in RCA: 207] [Impact Index Per Article: 17.3] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/18/2022] Open

Chaisson MJ, Tesler G. Mapping single molecule sequencing reads using basic local alignment with successive refinement (BLASR): application and theory. BMC Bioinformatics 2012;13:238. [PMID: 22988817 PMCID: PMC3572422 DOI: 10.1186/1471-2105-13-238] [Citation(s) in RCA: 795] [Impact Index Per Article: 66.3] [Reference Citation Analysis] [Abstract] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/12/2012] [Accepted: 09/17/2012] [Indexed: 11/17/2022] Open