1. Naïja A, Mutlu O, Khan T, Seers TD, Yalcin HC. An optimized CT-dense agent perfusion and micro-CT imaging protocol for chick embryo developmental stages. BMC Biomed Eng 2024; 6:3. PMID: 38654382. DOI: 10.1186/s42490-024-00078-w. (Received 02/02/2023; Accepted 04/04/2024)
Abstract
Compared to classical techniques of morphological analysis, micro-CT (μ-CT) has become an effective approach that allows rapid screening of morphological changes. In the present work, we aimed to provide an optimized μ-CT dense agent perfusion protocol and μ-CT imaging guidelines for different stages of chick embryo cardiogenesis. Our study was conducted in chick embryo hearts over a period of 10 embryonic days (up to Hamburger-Hamilton stage HH36). During perfusion of the contrast agent at different developmental stages (HH19, HH24, HH27, HH29, HH31, HH34, HH35, and HH36), we found that the durations and volumes of the injected contrast agent gradually increased with developmental stage, whereas the flow rate remained unchanged throughout the experiment. Analysis of the CT images confirmed the efficiency of the optimized heart perfusion parameters.
Affiliation(s)
- Azza Naïja
- Biomedical Research Center, Qatar University, Doha, Qatar
- Onur Mutlu
- Biomedical Research Center, Qatar University, Doha, Qatar
- Talha Khan
- Petroleum Engineering Program, Texas A&M University, Doha, Qatar
- Huseyin C Yalcin
- Biomedical Research Center, Qatar University, Doha, Qatar
- Department of Biomedical Sciences, College of Health Sciences, QU Health, Qatar University, Doha, Qatar
- Department of Industrial and Mechanical Engineering, Qatar University, Doha, Qatar
2. Alser M, Lawlor B, Abdill RJ, Waymost S, Ayyala R, Rajkumar N, LaPierre N, Brito J, Ribeiro-Dos-Santos AM, Almadhoun N, Sarwal V, Firtina C, Osinski T, Eskin E, Hu Q, Strong D, Kim BD, Abedalthagafi MS, Mutlu O, Mangul S. Packaging and containerization of computational methods. Nat Protoc 2024. PMID: 38565959. DOI: 10.1038/s41596-024-00986-0. (Received 06/29/2022; Accepted 02/12/2024)
Abstract
Methods for analyzing the full complement of a biomolecule type, e.g., proteomics or metabolomics, generate large amounts of complex data. The software tools used to analyze omics data have reshaped the landscape of modern biology and become an essential component of biomedical research. These tools are themselves quite complex and often require the installation of other supporting software, libraries and/or databases. A researcher may also be using multiple different tools that require different versions of the same supporting materials. The increasing dependence of biomedical scientists on these powerful tools creates a need for easier installation and greater usability. Packaging and containerization are different approaches to satisfy this need by delivering omics tools already wrapped in additional software that makes the tools easier to install and use. In this systematic review, we describe and compare the features of prominent packaging and containerization platforms. We outline the challenges, advantages and limitations of each approach and some of the most widely used platforms from the perspectives of users, software developers and system administrators. We also propose principles to make the distribution of omics software more sustainable and robust to increase the reproducibility of biomedical and life science research.
Affiliation(s)
- Mohammed Alser
- Department of Information Technology and Electrical Engineering, ETH Zürich, Zurich, Switzerland
- Brendan Lawlor
- Department of Computer Science, Munster Technological University, Cork, Ireland
- Department of Biological Sciences, Munster Technological University, Cork, Ireland
- Richard J Abdill
- Section of Genetic Medicine, Department of Medicine, University of Chicago, Chicago, IL, USA
- Sharon Waymost
- Department of Computer Science, University of California, Los Angeles, Los Angeles, CA, USA
- Ram Ayyala
- Department of Quantitative and Computational Biology, University of Southern California, Los Angeles, CA, USA
- Titus Family Department of Clinical Pharmacy, USC Alfred E. Mann School of Pharmacy and Pharmaceutical Sciences, University of Southern California, Los Angeles, CA, USA
- Neha Rajkumar
- Department of Bioengineering, University of California, Los Angeles, Los Angeles, CA, USA
- Nathan LaPierre
- Department of Computer Science, University of California, Los Angeles, Los Angeles, CA, USA
- Department of Human Genetics, University of Chicago, Chicago, IL, USA
- Jaqueline Brito
- Titus Family Department of Clinical Pharmacy, USC Alfred E. Mann School of Pharmacy and Pharmaceutical Sciences, University of Southern California, Los Angeles, CA, USA
- Nour Almadhoun
- Department of Information Technology and Electrical Engineering, ETH Zürich, Zurich, Switzerland
- Varuni Sarwal
- Department of Computer Science, University of California, Los Angeles, Los Angeles, CA, USA
- Can Firtina
- Department of Information Technology and Electrical Engineering, ETH Zürich, Zurich, Switzerland
- Tomasz Osinski
- Center for Advanced Research Computing, University of Southern California, Los Angeles, CA, USA
- Eleazar Eskin
- Department of Computer Science, University of California, Los Angeles, Los Angeles, CA, USA
- Department of Computational Medicine, University of California, Los Angeles, Los Angeles, CA, USA
- Department of Human Genetics, University of California, Los Angeles, CA, USA
- Qiyang Hu
- Office of Advanced Research Computing, University of California, Los Angeles, CA, USA
- Derek Strong
- Center for Advanced Research Computing, University of Southern California, Los Angeles, CA, USA
- Byoung-Do B D Kim
- Center for Advanced Research Computing, University of Southern California, Los Angeles, CA, USA
- Malak S Abedalthagafi
- Department of Pathology & Laboratory Medicine, Emory University Hospital, Atlanta, GA, USA
- King Salman Center for Disability Research, Riyadh, Saudi Arabia
- Onur Mutlu
- Department of Information Technology and Electrical Engineering, ETH Zürich, Zurich, Switzerland
- Serghei Mangul
- Department of Quantitative and Computational Biology, University of Southern California, Los Angeles, CA, USA
- Titus Family Department of Clinical Pharmacy, USC Alfred E. Mann School of Pharmacy and Pharmaceutical Sciences, University of Southern California, Los Angeles, CA, USA
3. Singh G, Alser M, Denolf K, Firtina C, Khodamoradi A, Cavlak MB, Corporaal H, Mutlu O. RUBICON: a framework for designing efficient deep learning-based genomic basecallers. Genome Biol 2024; 25:49. PMID: 38365730. PMCID: PMC10870431. DOI: 10.1186/s13059-024-03181-2. (Received 04/24/2023; Accepted 02/02/2024)
Abstract
Nanopore sequencing generates noisy electrical signals that need to be converted into a standard string of DNA nucleotide bases using a computational step called basecalling. The performance of basecalling has critical implications for all later steps in genome analysis. Therefore, there is a need to reduce the computation and memory cost of basecalling while maintaining accuracy. We present RUBICON, a framework to develop efficient hardware-optimized basecallers. We demonstrate the effectiveness of RUBICON by developing RUBICALL, the first hardware-optimized mixed-precision basecaller that performs efficient basecalling, outperforming the state-of-the-art basecallers. We believe RUBICON offers a promising path to develop future hardware-optimized basecallers.
Affiliation(s)
- Gagandeep Singh
- Department of Information Technology and Electrical Engineering, ETH Zürich, Zürich, Switzerland
- Research and Advanced Development, AMD, Longmont, USA
- Mohammed Alser
- Department of Information Technology and Electrical Engineering, ETH Zürich, Zürich, Switzerland
- Can Firtina
- Department of Information Technology and Electrical Engineering, ETH Zürich, Zürich, Switzerland
- Meryem Banu Cavlak
- Department of Information Technology and Electrical Engineering, ETH Zürich, Zürich, Switzerland
- Henk Corporaal
- Department of Electrical Engineering, Eindhoven University of Technology, Eindhoven, The Netherlands
- Onur Mutlu
- Department of Information Technology and Electrical Engineering, ETH Zürich, Zürich, Switzerland
4. Sukumaran V, Mutlu O, Murtaza M, Alhalbouni R, Dubansky B, Yalcin HC. Experimental assessment of cardiovascular physiology in the chick embryo. Dev Dyn 2023; 252:1247-1268. PMID: 37002896. DOI: 10.1002/dvdy.589. (Received 04/04/2022; Revised 12/13/2022; Accepted 03/10/2023)
Abstract
High-resolution assessment of cardiac functional parameters is crucial in translational animal research. The chick embryo is a historically well-used in vivo model for cardiovascular research due to its many practical advantages and the conserved form and function of the chick and human cardiogenesis programs. This review provides an overview of several technical approaches for cardiac assessment in the chick embryo: Doppler echocardiography, optical coherence tomography, micro-magnetic resonance imaging, micro-particle image velocimetry, and real-time pressure monitoring, along with the practical issues associated with each technique. Alongside this discussion, we also highlight recent advances in cardiac function measurements in chick embryos.
Affiliation(s)
- Onur Mutlu
- Biomedical Research Center, Qatar University, Doha, Qatar
- Benjamin Dubansky
- Department of Biological and Agricultural Engineering, Office of Research and Economic Development, Louisiana State University, Baton Rouge, Louisiana, USA
- Huseyin C Yalcin
- Biomedical Research Center, Qatar University, Doha, Qatar
- Department of Biomedical Science, College of Health Sciences, QU Health, Qatar University, Doha, Qatar
5. Tahir AM, Mutlu O, Bensaali F, Ward R, Ghareeb AN, Helmy SMHA, Othman KT, Al-Hashemi MA, Abujalala S, Chowdhury MEH, Alnabti ARDMH, Yalcin HC. Latest Developments in Adapting Deep Learning for Assessing TAVR Procedures and Outcomes. J Clin Med 2023; 12:4774. PMID: 37510889. PMCID: PMC10381346. DOI: 10.3390/jcm12144774. (Received 02/28/2023; Revised 04/08/2023; Accepted 04/10/2023)
Abstract
Aortic valve defects are among the most prevalent clinical conditions. A severely damaged or non-functioning aortic valve is commonly replaced with a bioprosthetic heart valve (BHV) via the transcatheter aortic valve replacement (TAVR) procedure. Accurate pre-operative planning is crucial for a successful TAVR outcome. Computational fluid dynamics (CFD), finite element analysis (FEA), and fluid-solid interaction (FSI) analyses have been increasingly utilized to evaluate BHV mechanics and dynamics. However, high computational costs and the complexity of operating computational models hinder their application. Recent advancements in the deep learning (DL) domain can offer a real-time surrogate that renders hemodynamic parameters in a few seconds, guiding clinicians toward the optimal treatment option. Herein, we provide a comprehensive review of classical computational modeling approaches, medical imaging, and DL approaches for planning and outcome assessment of TAVR. In particular, we focus on DL approaches in previous studies, highlighting the utilized datasets, deployed DL models, and achieved results. We emphasize the critical challenges and recommend several future directions for innovative researchers to tackle. Finally, an end-to-end smart DL framework is outlined for real-time assessment and recommendation of the best BHV design for TAVR. Deploying such a framework in future studies will support clinicians in minimizing risks during TAVR therapy planning and help improve patient care.
Affiliation(s)
- Anas M Tahir
- Electrical and Computer Engineering Department, The University of British Columbia, Vancouver, BC V6T 1Z4, Canada
- Biomedical Research Center, Qatar University, Doha 2713, Qatar
- Onur Mutlu
- Biomedical Research Center, Qatar University, Doha 2713, Qatar
- Faycal Bensaali
- Department of Electrical Engineering, Qatar University, Doha 2713, Qatar
- Rabab Ward
- Electrical and Computer Engineering Department, The University of British Columbia, Vancouver, BC V6T 1Z4, Canada
- Abdel Naser Ghareeb
- Heart Hospital, Hamad Medical Corporation, Doha 3050, Qatar
- Faculty of Medicine, Al Azhar University, Cairo 11884, Egypt
- Sherif M H A Helmy
- Noninvasive Cardiology Section, Cardiology Department, Heart Hospital, Hamad Medical Corporation, Doha 3050, Qatar
- Mohammed A Al-Hashemi
- Noninvasive Cardiology Section, Cardiology Department, Heart Hospital, Hamad Medical Corporation, Doha 3050, Qatar
- Huseyin C Yalcin
- Biomedical Research Center, Qatar University, Doha 2713, Qatar
- Department of Biomedical Science, College of Health Sciences, QU Health, Qatar University, Doha 2713, Qatar
6. Firtina C, Mansouri Ghiasi N, Lindegger J, Singh G, Cavlak MB, Mao H, Mutlu O. RawHash: enabling fast and accurate real-time analysis of raw nanopore signals for large genomes. Bioinformatics 2023; 39:i297-i307. PMID: 37387139. DOI: 10.1093/bioinformatics/btad272.
Abstract
Nanopore sequencers generate electrical raw signals in real-time while sequencing long genomic strands. These raw signals can be analyzed as they are generated, providing an opportunity for real-time genome analysis. An important feature of nanopore sequencing, Read Until, can eject strands from sequencers without fully sequencing them, which provides opportunities to computationally reduce the sequencing time and cost. However, existing works utilizing Read Until either (i) require powerful computational resources that may not be available for portable sequencers or (ii) lack scalability for large genomes, rendering them inaccurate or ineffective. We propose RawHash, the first mechanism that can accurately and efficiently perform real-time analysis of nanopore raw signals for large genomes using a hash-based similarity search. To enable this, RawHash ensures the signals corresponding to the same DNA content lead to the same hash value, regardless of the slight variations in these signals. RawHash achieves an accurate hash-based similarity search via an effective quantization of the raw signals such that signals corresponding to the same DNA content have the same quantized value and, subsequently, the same hash value. We evaluate RawHash on three applications: (i) read mapping, (ii) relative abundance estimation, and (iii) contamination analysis. Our evaluations show that RawHash is the only tool that can provide high accuracy and high throughput for analyzing large genomes in real-time. When compared to the state-of-the-art techniques, UNCALLED and Sigmap, RawHash provides (i) 25.8× and 3.4× better average throughput and (ii) significantly better accuracy for large genomes, respectively. Source code is available at https://github.com/CMU-SAFARI/RawHash.
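The quantization idea at the core of this abstract can be illustrated with a small sketch (illustrative only, not the RawHash implementation; the event values, bucket width, and function names are hypothetical): nearby signal values are rounded into the same bucket, so slightly perturbed versions of the same signal hash to the same value.

```python
# Sketch of hashing noisy signal events so that near-identical signals
# map to the same hash value (hypothetical bucket width and event means).
def quantize(event_means, step=0.35):
    # Map each event's mean current to a coarse bucket; nearby values
    # (slight signal variation) fall into the same bucket.
    return tuple(round(m / step) for m in event_means)

def signal_hash(event_means, step=0.35):
    # Hash the quantized representation, not the raw floats.
    return hash(quantize(event_means, step))

a = [82.1, 95.6, 101.2, 88.9]   # hypothetical event means (pA)
b = [82.2, 95.5, 101.3, 88.8]   # slightly perturbed version of the same events
assert signal_hash(a) == signal_hash(b)  # same bucket sequence, same hash
```

A real raw-signal mapper must also handle events that straddle bucket boundaries; this sketch only shows why quantizing before hashing tolerates small signal variations at all.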
Affiliation(s)
- Can Firtina
- Department of Information Technology and Electrical Engineering, ETH Zurich, 8092 Zurich, Switzerland
- Nika Mansouri Ghiasi
- Department of Information Technology and Electrical Engineering, ETH Zurich, 8092 Zurich, Switzerland
- Joel Lindegger
- Department of Information Technology and Electrical Engineering, ETH Zurich, 8092 Zurich, Switzerland
- Gagandeep Singh
- Department of Information Technology and Electrical Engineering, ETH Zurich, 8092 Zurich, Switzerland
- Meryem Banu Cavlak
- Department of Information Technology and Electrical Engineering, ETH Zurich, 8092 Zurich, Switzerland
- Haiyu Mao
- Department of Information Technology and Electrical Engineering, ETH Zurich, 8092 Zurich, Switzerland
- Onur Mutlu
- Department of Information Technology and Electrical Engineering, ETH Zurich, 8092 Zurich, Switzerland
7. Calciu I, Imran MT, Puddu I, Kashyap S, Al Maruf H, Mutlu O, Kolli A. Using Local Cache Coherence for Disaggregated Memory Systems. SIGOPS Oper Syst Rev 2023; 57:21-28. DOI: 10.1145/3606557.3606561.
Abstract
Disaggregated memory provides many cost savings and resource provisioning benefits for current datacenters, but software systems enabling disaggregated memory access result in high performance penalties. These systems require intrusive code changes to port applications for disaggregated memory or employ slow virtual memory mechanisms to avoid code changes. Such mechanisms result in high overhead page faults to access remote data and high dirty data amplification when tracking changes to cached data at page-granularity. In this paper, we propose a fundamentally new approach for disaggregated memory systems, based on the observation that we can use local cache coherence to track applications' memory accesses transparently, without code changes, at cache-line granularity. This simple idea (1) eliminates page faults from the application critical path when accessing remote data, and (2) decouples the application memory access tracking from the virtual memory page size, enabling cache-line granularity dirty data tracking and eviction. Using this observation, we implemented a new software runtime for disaggregated memory that improves average memory access time and reduces dirty data amplification.
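The dirty-data-amplification problem the abstract describes is easy to quantify with a toy model (a sketch, not the paper's runtime; the write addresses are hypothetical): the same three small writes dirty far more data when tracked at 4 KiB page granularity than at 64-byte cache-line granularity.

```python
CACHE_LINE = 64   # bytes per granule with cache-coherence-based tracking
PAGE = 4096       # bytes per granule with virtual-memory-based tracking

def writeback_bytes(write_addrs, granularity):
    # Each written byte address dirties its enclosing granule; on eviction,
    # every dirty granule must be written back to remote memory in full.
    dirty_granules = {addr // granularity for addr in write_addrs}
    return len(dirty_granules) * granularity

writes = [0, 200, 100_000]                        # three 1-byte writes
line_bytes = writeback_bytes(writes, CACHE_LINE)  # 3 dirty lines  -> 192 bytes
page_bytes = writeback_bytes(writes, PAGE)        # 2 dirty pages  -> 8192 bytes
print(page_bytes / line_bytes)                    # ~42x more data written back
```

The ratio grows with sparser write patterns, which is why cache-line-granularity tracking reduces amplification for workloads that scatter small writes across many pages.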
8. Alserr NA, Kale G, Mutlu O, Tastan O, Ayday E. Tuning Privacy-Utility Tradeoff in Genomic Studies Using Selective SNP Hiding. Proc Asia Pac Bioinform Conf 2023; 2023:3039. PMID: 37383349. PMCID: PMC10306260.
Abstract
Researchers need a rich trove of genomic datasets that they can leverage to gain a better understanding of the genetic basis of the human genome and identify associations between phenotypes and specific parts of DNA. However, sharing genomic datasets that include sensitive genetic or medical information of individuals can lead to serious privacy-related consequences if the data lands in the wrong hands. Restricting access to genomic datasets is one solution, but this greatly reduces their usefulness for research purposes. To allow sharing of genomic datasets while addressing these privacy concerns, several studies propose privacy-preserving mechanisms for data sharing. Differential privacy (DP) is one such mechanism, providing rigorous mathematical foundations for privacy guarantees while sharing aggregated statistical information about a dataset. Nevertheless, it has been shown that the original privacy guarantees of DP-based solutions degrade when there are dependent tuples in the dataset, which is a common scenario for genomic datasets (due to the presence of family members). In this work, we introduce a new mechanism to mitigate vulnerability to inference attacks on differentially private query results from genomic datasets that include dependent tuples. We propose a utility-maximizing and privacy-preserving approach for sharing statistics that selectively hides SNPs of family members participating in a genomic dataset. By evaluating our mechanism on a real-world genomic dataset, we empirically demonstrate that it can achieve up to 40% better privacy than state-of-the-art DP-based solutions while near-optimally minimizing utility loss.
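As background for the DP machinery this work builds on, here is a minimal sketch of the standard Laplace mechanism for a count query (a generic illustration, not the paper's selective SNP-hiding strategy; the epsilon value and the allele-count scenario are assumptions for the example):

```python
import math
import random

def laplace_noise(scale):
    # Inverse-CDF sampling from Laplace(0, scale).
    u = random.random() - 0.5
    return -scale * math.copysign(math.log(1.0 - 2.0 * abs(u)), u)

def dp_count(true_count, epsilon=1.0, sensitivity=1.0):
    # Laplace mechanism: adding Laplace(sensitivity/epsilon) noise to a
    # count query with L1-sensitivity 1 satisfies epsilon-differential
    # privacy for independent records; the paper targets the harder
    # dependent-tuple case (family members), which this sketch ignores.
    return true_count + laplace_noise(sensitivity / epsilon)

# e.g. releasing a noisy minor-allele count for one SNP over a cohort
noisy = dp_count(42, epsilon=0.5)
```

Smaller epsilon means stronger privacy but noisier statistics; the paper's contribution is restoring the intended guarantee when relatives' records make this independence assumption fail.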
Affiliation(s)
- Nour Almadhoun Alserr
- Department of Information Technology and Electrical Engineering, ETH Zurich, Zurich 8006, Switzerland
- Gulce Kale
- Computer Engineering Department, Bilkent University, Ankara 06800, Turkey
- Onur Mutlu
- Department of Information Technology and Electrical Engineering, ETH Zurich, Zurich 8006, Switzerland
- Computer Engineering Department, Bilkent University, Ankara 06800, Turkey
- Oznur Tastan
- Computer Science and Engineering, Sabanci University, Istanbul 34956, Turkey
- Erman Ayday
- Department of Electrical Engineering and Computer Science, Case Western Reserve University, Cleveland, OH 44106, USA
- Computer Engineering Department, Bilkent University, Ankara 06800, Turkey
9. Diab S, Nassereldine A, Alser M, Gómez Luna J, Mutlu O, El Hajj I. A Framework for High-throughput Sequence Alignment using Real Processing-in-Memory Systems. Bioinformatics 2023; 39:7087101. PMID: 36971586. PMCID: PMC10159653. DOI: 10.1093/bioinformatics/btad155. (Received 08/03/2022; Revised 02/24/2023; Accepted 03/25/2023)
Abstract
Motivation
Sequence alignment is a memory-bound computation whose performance in modern systems is limited by the memory bandwidth bottleneck. Processing-in-memory architectures alleviate this bottleneck by providing the memory with computing capabilities. We propose Alignment-in-Memory (AIM), a framework for high-throughput sequence alignment using processing-in-memory, and evaluate it on UPMEM, the first publicly available general-purpose programmable processing-in-memory system.
Results
Our evaluation shows that a real processing-in-memory system can substantially outperform server-grade multi-threaded CPU systems running at full-scale when performing sequence alignment for a variety of algorithms, read lengths, and edit distance thresholds. We hope that our findings inspire more work on creating and accelerating bioinformatics algorithms for such real processing-in-memory systems.
Availability
Our code is available at https://github.com/safaad/aim.
Affiliation(s)
- Safaa Diab
- Department of Computer Science, American University of Beirut, Riad El-Solh, Beirut 1107 2020, Lebanon
- Amir Nassereldine
- Department of Computer Science, American University of Beirut, Riad El-Solh, Beirut 1107 2020, Lebanon
- Mohammed Alser
- Department of Information Technology and Electrical Engineering, ETH Zürich, Gloriastrasse 35, Zürich 8092, Switzerland
- Juan Gómez Luna
- Department of Information Technology and Electrical Engineering, ETH Zürich, Gloriastrasse 35, Zürich 8092, Switzerland
- Onur Mutlu
- Department of Information Technology and Electrical Engineering, ETH Zürich, Gloriastrasse 35, Zürich 8092, Switzerland
- Izzat El Hajj
- Department of Computer Science, American University of Beirut, Riad El-Solh, Beirut 1107 2020, Lebanon
10. Lindegger J, Cali DS, Alser M, Gómez-Luna J, Ghiasi NM, Mutlu O. Scrooge: A Fast and Memory-Frugal Genomic Sequence Aligner for CPUs, GPUs, and ASICs. Bioinformatics 2023; 39:7085594. PMID: 36961334. DOI: 10.1093/bioinformatics/btad151. (Received 08/21/2022; Revised 01/11/2023; Accepted 03/23/2023)
Abstract
Motivation
Pairwise sequence alignment is a very time-consuming step in common bioinformatics pipelines. Speeding up this step requires heuristics, efficient implementations, and/or hardware acceleration. A promising candidate for all of the above is the recently proposed GenASM algorithm. We identify and address three inefficiencies in the GenASM algorithm: it has a high amount of data movement, a large memory footprint, and does some unnecessary work.
Results
We propose Scrooge, a fast and memory-frugal genomic sequence aligner. Scrooge includes three novel algorithmic improvements that reduce the data movement, memory footprint, and number of operations in the GenASM algorithm. We provide efficient open-source implementations of the Scrooge algorithm for CPUs and GPUs, which demonstrate the significant benefits of our algorithmic improvements. For long reads, the CPU version of Scrooge achieves a 20.1×, 1.7×, and 2.1× speedup over KSW2, Edlib, and a CPU implementation of GenASM, respectively. The GPU version of Scrooge achieves a 4.0×, 80.4×, 6.8×, 12.6×, and 5.9× speedup over the CPU version of Scrooge, KSW2, Edlib, Darwin-GPU, and a GPU implementation of GenASM, respectively. We estimate an ASIC implementation of Scrooge to use 3.6× less chip area and 2.1× less power than a GenASM ASIC while maintaining the same throughput. Further, we systematically analyze the throughput and accuracy behavior of GenASM and Scrooge under various configurations. As the best configuration of Scrooge depends on the computing platform, we make several observations that can help guide future implementations of Scrooge.
Availability and Implementation
https://github.com/CMU-SAFARI/Scrooge.
Affiliation(s)
- Joël Lindegger
- Department of Information Technology and Electrical Engineering, ETH Zurich, Zurich, 8006, Switzerland
- Mohammed Alser
- Department of Information Technology and Electrical Engineering, ETH Zurich, Zurich, 8006, Switzerland
- Juan Gómez-Luna
- Department of Information Technology and Electrical Engineering, ETH Zurich, Zurich, 8006, Switzerland
- Nika Mansouri Ghiasi
- Department of Information Technology and Electrical Engineering, ETH Zurich, Zurich, 8006, Switzerland
- Onur Mutlu
- Department of Information Technology and Electrical Engineering, ETH Zurich, Zurich, 8006, Switzerland
11. Mutlu O, Salman HE, Al-Thani H, El-Menyar A, Qidwai UA, Yalcin HC. How does hemodynamics affect rupture tissue mechanics in abdominal aortic aneurysm: Focus on wall shear stress derived parameters, time-averaged wall shear stress, oscillatory shear index, endothelial cell activation potential, and relative residence time. Comput Biol Med 2023; 154:106609. PMID: 36724610. DOI: 10.1016/j.compbiomed.2023.106609. (Received 10/06/2022; Revised 01/19/2023; Accepted 01/22/2023)
Abstract
An abdominal aortic aneurysm (AAA) is a critical health condition with a risk of rupture, in which the diameter of the aorta enlarges to more than 50% above its normal value. The incidence of AAA has increased worldwide; currently, about three out of every 100,000 people have aortic diseases. The diameter and geometry of AAAs influence the hemodynamic forces exerted on the arterial wall, so a reliable assessment of hemodynamics is crucial for predicting rupture risk. Wall shear stress (WSS) is an important metric quantifying the frictional force on the AAA wall. Excessive levels of WSS deteriorate the remodeling mechanism of the arteries and lead to abnormal conditions. WSS-derived hemodynamic parameters, such as time-averaged WSS (TAWSS), the oscillatory shear index (OSI), the endothelial cell activation potential (ECAP), and the relative residence time (RRT), provide detailed information for evaluating the shear environment on the AAA wall. Calculating these parameters is not straightforward and requires a physical understanding of what they represent; moreover, computational fluid dynamics (CFD) solvers do not readily calculate them when hemodynamics is simulated. This review explains the WSS-derived parameters, focusing on how each represents a different characteristic of disturbed hemodynamics. A representative case with spatial and temporal formulations is presented to help interested researchers perform practical calculations. Finally, recent hemodynamic investigations relating WSS-derived parameters to AAA rupture risk assessment are presented. This review will be useful for understanding the physical meaning of WSS-related parameters in cardiovascular flows and how they can be calculated in practice for AAA investigations.
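For reference, the four WSS-derived parameters named in this abstract are commonly defined as follows (standard formulations from the hemodynamics literature, where $\vec{\tau}_w$ is the instantaneous WSS vector and $T$ is the cardiac cycle period):

```latex
\mathrm{TAWSS} = \frac{1}{T}\int_0^T \left|\vec{\tau}_w\right| dt,
\qquad
\mathrm{OSI} = \frac{1}{2}\left(1 - \frac{\left|\int_0^T \vec{\tau}_w \, dt\right|}{\int_0^T \left|\vec{\tau}_w\right| dt}\right),
\qquad
\mathrm{ECAP} = \frac{\mathrm{OSI}}{\mathrm{TAWSS}},
\qquad
\mathrm{RRT} = \frac{1}{\left(1 - 2\,\mathrm{OSI}\right)\mathrm{TAWSS}}.
```

OSI ranges from 0 (unidirectional shear) to 0.5 (purely oscillatory shear); high OSI combined with low TAWSS yields high ECAP and RRT, marking regions of disturbed, stagnant flow.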
Affiliation(s)
- Onur Mutlu
- Biomedical Research Center, Qatar University, Doha, Qatar
- Huseyin Enes Salman
- Department of Mechanical Engineering, TOBB University of Economics and Technology, Ankara, Turkey
- Hassan Al-Thani
- Department of Surgery, Trauma and Vascular Surgery, Hamad General Hospital, Hamad Medical Corporation, P.O. Box 3050, Doha, Qatar
- Ayman El-Menyar
- Department of Surgery, Trauma and Vascular Surgery, Hamad General Hospital, Hamad Medical Corporation, P.O. Box 3050, Doha, Qatar; Clinical Medicine, Weill Cornell Medical College, Doha, Qatar
- Uvais Ahmed Qidwai
- Department of Computer Science Engineering, Qatar University, Doha, Qatar
12. Naija A, Mutlu O, Khan T, Seers TD, Yalcin HC. An optimized CT-dense agent perfusion and micro-CT imaging protocol for chick embryo developmental stages. Preprint. DOI: 10.21203/rs.3.rs-2541863/v1.
Abstract
Compared to classical techniques of morphological analysis, micro-CT (µ-CT) has become an effective approach that allows rapid screening of morphological changes. In the present work, we aimed to provide an optimized µ-CT dense agent perfusion protocol and µ-CT imaging guidelines for different stages of chick embryo cardiogenesis. Our study was conducted in chick embryo hearts over a period of 10 embryonic days (up to Hamburger-Hamilton stage HH36). During perfusion of the contrast agent at different developmental stages (HH19, HH24, HH27, HH29, HH31, HH34, HH35, and HH36), we found that the durations and volumes of the injected contrast agent gradually increased with developmental stage, whereas the flow rate remained unchanged throughout the experiment. Analysis of the CT images confirmed the efficiency of the optimized heart perfusion parameters.
13
Firtina C, Park J, Alser M, Kim JS, Cali D, Shahroodi T, Ghiasi N, Singh G, Kanellopoulos K, Alkan C, Mutlu O. BLEND: a fast, memory-efficient and accurate mechanism to find fuzzy seed matches in genome analysis. NAR Genom Bioinform 2023; 5:lqad004. [PMID: 36685727] [PMCID: PMC9853099] [DOI: 10.1093/nargab/lqad004]
Abstract
Generating the hash values of short subsequences, called seeds, enables quickly identifying similarities between genomic sequences by matching seeds with a single lookup of their hash values. However, these hash values can be used only to find exact-matching seeds, as conventional hashing methods assign distinct hash values to different seeds, including highly similar ones. Finding only exact-matching seeds leads either to (i) increased use of costly sequence alignment or (ii) limited sensitivity. We introduce BLEND, the first efficient and accurate mechanism that can identify both exact-matching and highly similar seeds, called fuzzy seed matches, with a single lookup of their hash values. BLEND (i) utilizes a technique called SimHash, which can generate the same hash value for similar sets, and (ii) provides the proper mechanisms for using seeds as sets with the SimHash technique to find fuzzy seed matches efficiently. We show the benefits of BLEND when used in read overlapping and read mapping. For read overlapping, BLEND is faster by 2.4×-83.9× (on average 19.3×), has a lower memory footprint by 0.9×-14.1× (on average 3.8×), and finds higher-quality overlaps, leading to more accurate de novo assemblies, than the state-of-the-art tool minimap2. For read mapping, BLEND is faster by 0.8×-4.1× (on average 1.7×) than minimap2. Source code is available at https://github.com/CMU-SAFARI/BLEND.
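The SimHash idea at the heart of this entry can be illustrated with a minimal Python sketch (an illustration of the general technique, not BLEND's implementation; the `simhash` function and the toy k-mer sets are ours):

```python
import hashlib

def simhash(items, bits=32):
    """SimHash a set of items: each item votes per bit position, so
    similar sets tend to receive identical or near-identical
    fingerprints, unlike conventional hashing, which scatters them."""
    counts = [0] * bits
    for item in items:
        h = int(hashlib.md5(item.encode()).hexdigest(), 16)
        for i in range(bits):
            # Each item votes +1/-1 on every bit position.
            counts[i] += 1 if (h >> i) & 1 else -1
    # Majority vote per bit yields the final fingerprint.
    return sum(1 << i for i in range(bits) if counts[i] > 0)

# Two seed sets differing in a single k-mer tend to get close
# fingerprints, enabling fuzzy seed matching via hash lookups.
a = simhash(["ACGT", "CGTA", "GTAC", "TACG"])
b = simhash(["ACGT", "CGTA", "GTAC", "TACC"])
diff = bin(a ^ b).count("1")  # Hamming distance between fingerprints
```

The fingerprint is deterministic, so identical seed sets always collide, while near-identical sets land a small Hamming distance apart.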
Affiliation(s)
- Can Firtina
- To whom correspondence should be addressed. Tel: +41 44 632 64 29
- Jisung Park
- ETH Zurich, Zurich 8092, Switzerland; POSTECH, Pohang 37673, Republic of Korea
- Can Alkan
- Bilkent University, Ankara 06800, Turkey
- Onur Mutlu
- Correspondence may also be addressed to Onur Mutlu. Tel: +41 44 632 64 29
14
Kim JS, Firtina C, Cavlak MB, Senol Cali D, Alkan C, Mutlu O. FastRemap: a tool for quickly remapping reads between genome assemblies. Bioinformatics 2022; 38:4633-4635. [PMID: 35976109] [DOI: 10.1093/bioinformatics/btac554]
Abstract
MOTIVATION A genome read dataset can be quickly and efficiently remapped from one reference to another similar reference (e.g., between two reference versions or two similar species) using a variety of tools, e.g., the commonly used CrossMap tool. With the explosion of available genomic datasets and references, high-performance remapping tools will become even more important for keeping up with the computational demands of genome assembly and analysis. RESULTS We provide FastRemap, a fast and efficient tool for remapping reads between genome assemblies. FastRemap provides up to a 7.82× speedup (6.47× on average) and uses as little as 61.7% (80.7% on average) of the peak memory consumed by the state-of-the-art remapping tool, CrossMap. AVAILABILITY AND IMPLEMENTATION FastRemap is written in C++. Source code and user manual are freely available at: github.com/CMU-SAFARI/FastRemap. Docker image available at: https://hub.docker.com/r/alkanlab/fastremap. Also available in Bioconda at: https://anaconda.org/bioconda/fastremap-bio.
Affiliation(s)
- Jeremie S Kim
- Department of Computer Engineering, ETH Zurich, D-ITET, Zurich 8006, Switzerland
- Can Firtina
- Department of Computer Engineering, ETH Zurich, D-ITET, Zurich 8006, Switzerland
- Meryem Banu Cavlak
- Department of Computer Engineering, ETH Zurich, D-ITET, Zurich 8006, Switzerland
- Damla Senol Cali
- Bionano Genomics, San Diego, CA 92121, USA
- Can Alkan
- Department of Computer Engineering, Bilkent University, Ankara 06800, Turkey
- Onur Mutlu
- Department of Computer Engineering, ETH Zurich, D-ITET, Zurich 8006, Switzerland; Department of Computer Engineering, Bilkent University, Ankara 06800, Turkey
15
Alser M, Kim JS, Almadhoun Alserr N, Tell SW, Mutlu O. COVIDHunter: COVID-19 Pandemic Wave Prediction and Mitigation via Seasonality Aware Modeling. Front Public Health 2022; 10:877621. [PMID: 35784219] [PMCID: PMC9247408] [DOI: 10.3389/fpubh.2022.877621]
Abstract
Early detection and isolation of COVID-19 patients are essential for successful implementation of mitigation strategies and, eventually, curbing the disease spread. With a limited number of daily COVID-19 tests performed in every country, simulating the COVID-19 spread along with the potential effect of each mitigation strategy currently remains one of the most effective ways of managing the healthcare system and guiding policy-makers. We introduce COVIDHunter, a flexible and accurate COVID-19 outbreak simulation model that evaluates the mitigation measures currently applied to a region, predicts COVID-19 statistics (the daily number of cases, hospitalizations, and deaths), and suggests how strong the upcoming mitigation measures should be. The key idea of COVIDHunter is to quantify the spread of COVID-19 in a geographical region by simulating the average number of new infections caused by an infected person, considering the effect of external factors such as environmental conditions (e.g., climate, temperature, humidity), different variants of concern, vaccination rate, and mitigation measures. Using Switzerland as a case study, COVIDHunter estimates that the country is experiencing a deadly new wave peaking on 26 January 2022, very similar in numbers to the wave of February 2020, and that policy-makers have only one choice: to increase the strength of the currently applied mitigation measures for 30 days. Unlike existing models, COVIDHunter accurately monitors and predicts the daily number of cases, hospitalizations, and deaths due to COVID-19. The model is flexible to configure and simple to modify for modeling different scenarios under different environmental conditions and mitigation measures. We release the source code of the COVIDHunter implementation at https://github.com/CMU-SAFARI/COVIDHunter and show how to flexibly configure our model for any scenario and easily extend it for measures and conditions beyond those we account for.
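The key idea of scaling new infections by an effective reproduction number that folds in mitigation strength and seasonality can be sketched as follows (a toy illustration in the spirit of the model's key idea, not the released COVIDHunter code; the function name, the one-day serial interval, and all parameter values are our simplifications):

```python
def simulate_daily_cases(r0, days, initial_cases=1, mitigation=0.0,
                         seasonality=None):
    """Toy wave simulator: each day's new infections scale with an
    effective reproduction number that folds in mitigation strength
    and a per-day seasonal multiplier. Collapsing the serial interval
    to one day is a deliberate simplification for illustration."""
    seasonality = seasonality or [1.0] * days
    cases = [float(initial_cases)]
    for day in range(1, days):
        # Effective R after mitigation and seasonality are applied.
        r_eff = r0 * (1.0 - mitigation) * seasonality[day]
        cases.append(cases[-1] * r_eff)
    return cases

# Stronger mitigation turns exponential growth into decay
# (effective R of 1.2 vs. 0.6 with these toy parameters).
weak = simulate_daily_cases(r0=1.5, days=10, mitigation=0.2)
strong = simulate_daily_cases(r0=1.5, days=10, mitigation=0.6)
```

Sweeping `mitigation` in such a loop is the kind of what-if exploration the abstract describes for choosing the strength of an upcoming measure.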
16
Breitwieser L, Hesam A, de Montigny J, Vavourakis V, Iosif A, Jennings J, Kaiser M, Manca M, Di Meglio A, Al-Ars Z, Rademakers F, Mutlu O, Bauer R. BioDynaMo: a modular platform for high-performance agent-based simulation. Bioinformatics 2022; 38:453-460. [PMID: 34529036] [PMCID: PMC8723141] [DOI: 10.1093/bioinformatics/btab649]
Abstract
MOTIVATION Agent-based modeling is an indispensable tool for studying complex biological systems. However, existing simulation platforms do not always take full advantage of modern hardware and often have a field-specific software design. RESULTS We present a novel simulation platform called BioDynaMo that alleviates both of these problems. BioDynaMo features a modular and high-performance simulation engine. We demonstrate that BioDynaMo can be used to simulate use cases in neuroscience, oncology, and epidemiology. For each use case, we validate our findings with experimental data or an analytical solution. Our performance results show that BioDynaMo performs up to three orders of magnitude faster than the state-of-the-art baselines. This improvement makes it feasible to simulate each use case with one billion agents on a single server, showcasing the potential BioDynaMo has for computational biology research. AVAILABILITY AND IMPLEMENTATION BioDynaMo is an open-source project under the Apache 2.0 license and is available at www.biodynamo.org. Instructions to reproduce the results are available in the supplementary information. SUPPLEMENTARY INFORMATION Available at https://doi.org/10.5281/zenodo.5121618.
Affiliation(s)
- Lukas Breitwieser
- CERN openlab, IT Department, CERN, Geneva 1211, Switzerland; Department of Computer Science, ETH Zurich, Zurich 8092, Switzerland
- Ahmad Hesam
- CERN openlab, IT Department, CERN, Geneva 1211, Switzerland; Department of Quantum & Computer Engineering, Delft University of Technology, Delft 2628CD, The Netherlands
- Vasileios Vavourakis
- Department of Mechanical & Manufacturing Engineering, University of Cyprus, Nicosia 2109, Cyprus; Department of Medical Physics & Biomedical Engineering, University College London, London WC1E 6BT, UK
- Alexandros Iosif
- Department of Mechanical & Manufacturing Engineering, University of Cyprus, Nicosia 2109, Cyprus
- Jack Jennings
- School of Computing, Newcastle University, Newcastle upon Tyne NE4 5TG, UK
- Marcus Kaiser
- School of Computing, Newcastle University, Newcastle upon Tyne NE4 5TG, UK; Department of Functional Neurosurgery, Ruijin Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai 200025, China; Precision Imaging Beacon, School of Medicine, University of Nottingham, Nottingham NG7 2UH, UK
- Marco Manca
- SCimPulse Foundation, Geleen 6162 BC, The Netherlands
- Zaid Al-Ars
- Department of Quantum & Computer Engineering, Delft University of Technology, Delft 2628CD, The Netherlands
- Onur Mutlu
- Department of Computer Science, ETH Zurich, Zurich 8092, Switzerland; Department of Information Technology and Electrical Engineering, ETH Zurich, Zurich 8092, Switzerland
- Roman Bauer
- Department of Computer Science, University of Surrey, Guildford GU2 7XH, UK
17
Alser M, Lindegger J, Firtina C, Almadhoun N, Mao H, Singh G, Gomez-Luna J, Mutlu O. From molecules to genomic variations: Accelerating genome analysis via intelligent algorithms and architectures. Comput Struct Biotechnol J 2022; 20:4579-4599. [PMID: 36090814] [PMCID: PMC9436709] [DOI: 10.1016/j.csbj.2022.08.019]
18
Tercanlı MF, Olcay AB, Mutlu O, Bilgin C, Hakyemez B. Investigation of the effect of anticoagulant usage in the flow diverter stent treatment of the patient-specific cerebral aneurysm using the Lagrangian coherent structures. J Clin Neurosci 2021; 94:86-93. [PMID: 34863468] [DOI: 10.1016/j.jocn.2021.10.007]
Abstract
Anticoagulants are prescribed to flow-diverter-treated patients to diminish the risk of embolism in the arteries. In the present study, digital subtraction angiography images of a 49-year-old female patient with a left paraophthalmic aneurysm were used to build a numerical model to investigate the effect of an anticoagulant on hemodynamics at the aneurysm site. The Carreau-Yasuda viscosity model was utilized to define blood viscosity, and the coefficients of the viscosity model were updated to account for warfarin use. Numerical simulations spanning five cardiac cycles were performed, and Lagrangian coherent structure, hyperbolic time, and fluid particle analyses were employed in the numerical models. These analyses allowed us to evaluate the formation of stagnated regions, recirculation zones, and the number of jailed particles inside the aneurysm sac following flow diverter placement. We found that anticoagulant use made the blood less viscous, allowing a substantial amount of incoming blood flow to enter the aneurysm sac: only 12% of the nearly 25,000 fluid particles seeded from the artery inlet stayed inside the sac. Furthermore, the deviation between warfarin-added blood and normal blood flow grows with every heartbeat, undermining the effectiveness of patient-specific CFD models when anticoagulant use is overlooked in the viscosity models.
Affiliation(s)
- Ali Bahadır Olcay
- Yeditepe University, Faculty of Engineering, Department of Mechanical Engineering, Kayisdagi Cad., 34755 Istanbul, Turkey
- Onur Mutlu
- Biomedical Research Center, Qatar University, Doha P.O. Box 2713, Qatar
- Cem Bilgin
- University of Health Sciences, Bursa Yuksek Ihtisas Training and Research Hospital, Department of Radiology, Yildirim, Bursa 16310, Turkey
- Bahattin Hakyemez
- Uludag University School of Medicine, Department of Radiology, Gorukle, Bursa 16059, Turkey
19
Alser M, Rotman J, Deshpande D, Taraszka K, Shi H, Baykal PI, Yang HT, Xue V, Knyazev S, Singer BD, Balliu B, Koslicki D, Skums P, Zelikovsky A, Alkan C, Mutlu O, Mangul S. Technology dictates algorithms: recent developments in read alignment. Genome Biol 2021; 22:249. [PMID: 34446078] [PMCID: PMC8390189] [DOI: 10.1186/s13059-021-02443-7]
Abstract
Aligning sequencing reads onto a reference is an essential step of the majority of genomic analysis pipelines. Computational algorithms for read alignment have evolved in accordance with technological advances, leading to today's diverse array of alignment methods. We provide a systematic survey of algorithmic foundations and methodologies across 107 alignment methods, for both short and long reads. We provide a rigorous experimental evaluation of 11 read aligners to demonstrate the effect of these underlying algorithms on the speed and efficiency of read alignment. We discuss how general alignment algorithms have been tailored to the specific needs of various domains in biology.
Affiliation(s)
- Mohammed Alser
- Computer Science Department, ETH Zürich, 8092, Zürich, Switzerland
- Computer Engineering Department, Bilkent University, 06800 Bilkent, Ankara, Turkey
- Information Technology and Electrical Engineering Department, ETH Zürich, Zürich, 8092, Switzerland
- Jeremy Rotman
- Department of Computer Science, University of California Los Angeles, Los Angeles, CA, 90095, USA
- Dhrithi Deshpande
- Department of Clinical Pharmacy, School of Pharmacy, University of Southern California, Los Angeles, CA, 90089, USA
- Kodi Taraszka
- Department of Computer Science, University of California Los Angeles, Los Angeles, CA, 90095, USA
- Huwenbo Shi
- Department of Epidemiology, Harvard T.H. Chan School of Public Health, Boston, MA, 02115, USA
- Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, 02142, USA
- Pelin Icer Baykal
- Department of Computer Science, Georgia State University, Atlanta, GA, 30302, USA
- Harry Taegyun Yang
- Department of Computer Science, University of California Los Angeles, Los Angeles, CA, 90095, USA
- Bioinformatics Interdepartmental Ph.D. Program, University of California Los Angeles, Los Angeles, CA, 90095, USA
- Victor Xue
- Department of Computer Science, University of California Los Angeles, Los Angeles, CA, 90095, USA
- Sergey Knyazev
- Department of Computer Science, Georgia State University, Atlanta, GA, 30302, USA
- Benjamin D Singer
- Division of Pulmonary and Critical Care Medicine, Northwestern University Feinberg School of Medicine, Chicago, IL, 60611, USA
- Department of Biochemistry & Molecular Genetics, Northwestern University Feinberg School of Medicine, Chicago, USA
- Simpson Querrey Institute for Epigenetics, Northwestern University Feinberg School of Medicine, Chicago, IL, 60611, USA
- Brunilda Balliu
- Department of Computational Medicine, University of California Los Angeles, Los Angeles, CA, 90095, USA
- David Koslicki
- Computer Science and Engineering, Pennsylvania State University, University Park, PA, 16801, USA
- Biology Department, Pennsylvania State University, University Park, PA, 16801, USA
- The Huck Institutes of the Life Sciences, Pennsylvania State University, University Park, PA, 16801, USA
- Pavel Skums
- Department of Computer Science, Georgia State University, Atlanta, GA, 30302, USA
- Alex Zelikovsky
- Department of Computer Science, Georgia State University, Atlanta, GA, 30302, USA
- The Laboratory of Bioinformatics, I.M. Sechenov First Moscow State Medical University, Moscow, 119991, Russia
- Can Alkan
- Computer Engineering Department, Bilkent University, 06800 Bilkent, Ankara, Turkey
- Bilkent-Hacettepe Health Sciences and Technologies Program, Ankara, Turkey
- Onur Mutlu
- Computer Science Department, ETH Zürich, 8092, Zürich, Switzerland
- Computer Engineering Department, Bilkent University, 06800 Bilkent, Ankara, Turkey
- Information Technology and Electrical Engineering Department, ETH Zürich, Zürich, 8092, Switzerland
- Serghei Mangul
- Department of Clinical Pharmacy, School of Pharmacy, University of Southern California, Los Angeles, CA, 90089, USA
20
Calciu I, Imran MT, Puddu I, Kashyap S, Maruf HA, Mutlu O, Kolli A. Rethinking software runtimes for disaggregated memory. Proceedings of the 26th ACM International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), 2021. [DOI: 10.1145/3445814.3446713]
21
Alser M, Shahroodi T, Gómez-Luna J, Alkan C, Mutlu O. SneakySnake: a fast and accurate universal genome pre-alignment filter for CPUs, GPUs and FPGAs. Bioinformatics 2020; 36:5282-5290. [PMID: 33315064] [DOI: 10.1093/bioinformatics/btaa1015]
Abstract
Motivation
We introduce SneakySnake, a highly parallel and highly accurate pre-alignment filter that remarkably reduces the need for computationally costly sequence alignment. The key idea of SneakySnake is to reduce the approximate string matching (ASM) problem to the single net routing (SNR) problem in VLSI chip layout. In the SNR problem, we are interested in finding the optimal path that connects two terminals with the least routing cost on a special grid layout that contains obstacles. The SneakySnake algorithm quickly solves the SNR problem and uses the found optimal path to decide whether or not performing sequence alignment is necessary. Reducing the ASM problem into SNR also makes SneakySnake efficient to implement on CPUs, GPUs and FPGAs.
Results
SneakySnake significantly improves the accuracy of pre-alignment filtering by up to four orders of magnitude compared to the state-of-the-art pre-alignment filters, Shouji, GateKeeper and SHD. For short sequences, SneakySnake accelerates Edlib (state-of-the-art implementation of Myers’s bit-vector algorithm) and Parasail (state-of-the-art sequence aligner with a configurable scoring function), by up to 37.7× and 43.9× (>12× on average), respectively, with its CPU implementation, and by up to 413× and 689× (>400× on average), respectively, with FPGA and GPU acceleration. For long sequences, the CPU implementation of SneakySnake accelerates Parasail and KSW2 (sequence aligner of minimap2) by up to 979× (276.9× on average) and 91.7× (31.7× on average), respectively. As SneakySnake does not replace sequence alignment, users can still obtain all capabilities (e.g. configurable scoring functions) of the aligner of their choice, unlike existing acceleration efforts that sacrifice some aligner capabilities.
Availability and implementation
https://github.com/CMU-SAFARI/SneakySnake.
Supplementary information
Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Mohammed Alser
- Department of Computer Science, ETH Zurich, Zurich 8006, Switzerland
- Department of Information Technology and Electrical Engineering, ETH Zurich, Zurich 8006, Switzerland
- Taha Shahroodi
- Department of Computer Science, ETH Zurich, Zurich 8006, Switzerland
- Juan Gómez-Luna
- Department of Computer Science, ETH Zurich, Zurich 8006, Switzerland
- Department of Information Technology and Electrical Engineering, ETH Zurich, Zurich 8006, Switzerland
- Can Alkan
- Department of Computer Engineering, Bilkent University, Ankara 06800, Turkey
- Onur Mutlu
- Department of Computer Science, ETH Zurich, Zurich 8006, Switzerland
- Department of Information Technology and Electrical Engineering, ETH Zurich, Zurich 8006, Switzerland
- Department of Computer Engineering, Bilkent University, Ankara 06800, Turkey
- Department of Electrical and Computer Engineering, Carnegie Mellon University, Pittsburgh, PA 15213, USA
22
Firtina C, Kim JS, Alser M, Senol Cali D, Cicek AE, Alkan C, Mutlu O. Apollo: a sequencing-technology-independent, scalable and accurate assembly polishing algorithm. Bioinformatics 2020; 36:3669-3679. [PMID: 32167530] [DOI: 10.1093/bioinformatics/btaa179]
Abstract
MOTIVATION Third-generation sequencing technologies can sequence long reads that contain as many as 2 million base pairs. These long reads are used to construct an assembly (i.e., the subject's genome), which is further used in downstream genome analysis. Unfortunately, third-generation sequencing technologies have high sequencing error rates, and a large proportion of base pairs in these long reads is incorrectly identified. These errors propagate to the assembly and affect the accuracy of genome analysis. Assembly polishing algorithms minimize such error propagation by polishing or fixing errors in the assembly using information from alignments between reads and the assembly (i.e., read-to-assembly alignment information). However, current assembly polishing algorithms can only polish an assembly using reads from either a certain sequencing technology or a small assembly. This technology dependency and assembly-size dependency force researchers to (i) run multiple polishing algorithms to use all available read sets and (ii) split a large genome into small chunks to polish it. RESULTS We introduce Apollo, a universal assembly polishing algorithm that scales well to polish an assembly of any size (i.e., both large and small genomes) using reads from all sequencing technologies (i.e., second- and third-generation). Our goal is to provide a single algorithm that uses read sets from all available sequencing technologies to improve the accuracy of assembly polishing and that can polish large genomes. Apollo (i) models an assembly as a profile hidden Markov model (pHMM), (ii) uses read-to-assembly alignments to train the pHMM with the Forward-Backward algorithm and (iii) decodes the trained model with the Viterbi algorithm to produce a polished assembly. Our experiments with real read sets demonstrate that Apollo is the only algorithm that (i) uses reads from any sequencing technology within a single run and (ii) scales well to polish large assemblies without splitting the assembly into multiple parts. AVAILABILITY AND IMPLEMENTATION Source code is available at https://github.com/CMU-SAFARI/Apollo. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
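The decoding step named in the abstract, running Viterbi over a trained HMM, follows the standard dynamic-programming recurrence. A generic log-space sketch (not Apollo's pHMM-specific code; the toy match/error model below is our own illustration) looks like:

```python
import math

def viterbi(obs, states, start_p, trans_p, emit_p):
    """Standard log-space Viterbi decoding: find the most likely state
    path through an HMM for an observation sequence."""
    V = [{s: math.log(start_p[s]) + math.log(emit_p[s][obs[0]])
          for s in states}]
    back = [{}]
    for t in range(1, len(obs)):
        V.append({})
        back.append({})
        for s in states:
            # Best-scoring predecessor for state s at time t.
            prev, score = max(
                ((p, V[t - 1][p] + math.log(trans_p[p][s])) for p in states),
                key=lambda x: x[1])
            V[t][s] = score + math.log(emit_p[s][obs[t]])
            back[t][s] = prev
    # Trace back the most likely path from the best final state.
    last = max(states, key=lambda s: V[-1][s])
    path = [last]
    for t in range(len(obs) - 1, 0, -1):
        path.append(back[t][path[-1]])
    return list(reversed(path))

# Toy two-state model: is each assembly position a "match" or "error"?
states = ("match", "error")
start_p = {"match": 0.9, "error": 0.1}
trans_p = {"match": {"match": 0.9, "error": 0.1},
           "error": {"match": 0.8, "error": 0.2}}
emit_p = {"match": {"A": 0.97, "X": 0.03},
          "error": {"A": 0.2, "X": 0.8}}
path = viterbi(["A", "A", "X", "A"], states, start_p, trans_p, emit_p)
```

Log-space accumulation is the standard way to avoid floating-point underflow when the observation sequence is millions of bases long, as it is for an assembly.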
Affiliation(s)
- Can Firtina
- Department of Computer Science, ETH Zurich, Zurich 8092, Switzerland
- Jeremie S Kim
- Department of Computer Science, ETH Zurich, Zurich 8092, Switzerland; Department of Electrical and Computer Engineering, Carnegie Mellon University, Pittsburgh, PA 15213, USA
- Mohammed Alser
- Department of Computer Science, ETH Zurich, Zurich 8092, Switzerland
- Damla Senol Cali
- Department of Electrical and Computer Engineering, Carnegie Mellon University, Pittsburgh, PA 15213, USA
- A Ercument Cicek
- Department of Computer Engineering, Bilkent University, Ankara 06800, Turkey
- Can Alkan
- Department of Computer Engineering, Bilkent University, Ankara 06800, Turkey
- Onur Mutlu
- Department of Computer Science, ETH Zurich, Zurich 8092, Switzerland; Department of Electrical and Computer Engineering, Carnegie Mellon University, Pittsburgh, PA 15213, USA; Department of Computer Engineering, Bilkent University, Ankara 06800, Turkey
23
Zachariadis O, Teatini A, Satpute N, Gómez-Luna J, Mutlu O, Elle OJ, Olivares J. Accelerating B-spline interpolation on GPUs: Application to medical image registration. Comput Methods Programs Biomed 2020; 193:105431. [PMID: 32283385] [DOI: 10.1016/j.cmpb.2020.105431]
Abstract
BACKGROUND AND OBJECTIVE B-spline interpolation (BSI) is a popular technique in the context of medical imaging due to its adaptability and robustness in 3D object modeling. A field that utilizes BSI is Image Guided Surgery (IGS). IGS provides navigation using medical images, which can be segmented and reconstructed into 3D models, often through BSI. Image registration tasks also use BSI to transform medical imaging data collected before the surgery and intra-operative data collected during the surgery into a common coordinate space. However, such IGS tasks are computationally demanding, especially when applied to 3D medical images, due to the complexity and amount of data involved. Therefore, optimization of IGS algorithms is greatly desirable, for example, to perform image registration tasks intra-operatively and to enable real-time applications. A traditional CPU does not have sufficient computing power to achieve these goals and, thus, it is preferable to rely on GPUs. In this paper, we introduce a novel GPU implementation of BSI to accelerate the calculation of the deformation field in non-rigid image registration algorithms. METHODS Our BSI implementation on GPUs minimizes the data that needs to be moved between memory and processing cores during loading of the input grid, and leverages the large on-chip GPU register file for reuse of input values. Moreover, we re-formulate our method as trilinear interpolations to reduce computational complexity and increase accuracy. To provide pre-clinical validation of our method and demonstrate its benefits in medical applications, we integrate our improved BSI into a registration workflow for compensation of liver deformation (caused by pneumoperitoneum, i.e., inflation of the abdomen) and evaluate its performance. RESULTS Our approach improves the performance of BSI by an average of 6.5× and interpolation accuracy by 2× compared to three state-of-the-art GPU implementations. 
Through pre-clinical validation, we demonstrate that our optimized interpolation accelerates a non-rigid image registration algorithm, which is based on the Free Form Deformation (FFD) method, by up to 34%. CONCLUSION Our study shows that we can achieve significant performance and accuracy gains with our novel parallelization scheme that makes effective use of the GPU resources. We show that our method improves the performance of real medical imaging registration applications used in practice today.
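The trilinear primitive that the method reduces B-spline evaluation to can be sketched on the CPU as follows (an illustrative reference implementation, not the paper's GPU kernel; coordinates are assumed to lie strictly inside the grid so the upper cell corners exist):

```python
def trilerp(grid, x, y, z):
    """Trilinear interpolation on a 3-D scalar grid: a weighted average
    of the 8 corners of the cell containing (x, y, z), with weights
    given by products of 1-D linear weights."""
    x0, y0, z0 = int(x), int(y), int(z)
    dx, dy, dz = x - x0, y - y0, z - z0
    value = 0.0
    for i in (0, 1):
        for j in (0, 1):
            for k in (0, 1):
                # Each corner's weight is the product of 1-D weights.
                w = ((dx if i else 1 - dx) *
                     (dy if j else 1 - dy) *
                     (dz if k else 1 - dz))
                value += w * grid[x0 + i][y0 + j][z0 + k]
    return value

# On a linear field f(i, j, k) = i + j + k, trilinear interpolation is
# exact, which makes the sketch easy to sanity-check.
grid = [[[float(i + j + k) for k in range(2)] for j in range(2)]
        for i in range(2)]
mid = trilerp(grid, 0.5, 0.5, 0.5)  # exact value: 1.5
```

Each evaluation touches only 8 neighboring samples, which is what makes the primitive attractive for memory-bandwidth-limited GPU kernels.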
Affiliation(s)
- Orestis Zachariadis
- Department of Electronics and Computer Engineering, Universidad de Cordoba, Córdoba, Spain
- Andrea Teatini
- The Intervention Centre, Oslo University Hospital - Rikshospitalet, Oslo, Norway; Department of Informatics, University of Oslo, Oslo, Norway
- Nitin Satpute
- Department of Electronics and Computer Engineering, Universidad de Cordoba, Córdoba, Spain
- Juan Gómez-Luna
- Department of Computer Science, ETH Zurich, Zurich, Switzerland
- Onur Mutlu
- Department of Computer Science, ETH Zurich, Zurich, Switzerland
- Ole Jakob Elle
- The Intervention Centre, Oslo University Hospital - Rikshospitalet, Oslo, Norway; Department of Informatics, University of Oslo, Oslo, Norway
- Joaquín Olivares
- Department of Electronics and Computer Engineering, Universidad de Cordoba, Córdoba, Spain
24
Mutlu O, Olcay AB, Bilgin C, Hakyemez B. Understanding the effect of effective metal surface area of flow diverter stent's on the patient-specific intracranial aneurysm numerical model using Lagrangian coherent structures. J Clin Neurosci 2020; 80:298-309. [PMID: 32712121] [DOI: 10.1016/j.jocn.2020.04.111]
Abstract
The effective metal surface area (EMSA) of flow diverters plays an essential role in the occlusion mechanism inside the aneurysm, since the value of EMSA determines the amount of blood flow into the aneurysm sac. In the present study, three different models of a flow diverter stent, namely FRED 4017, FRED 4038, and FRED 4539, were virtually placed at the aneurysm neck of a 52-year-old female patient to identify the effect of EMSA on stagnation region formation inside the aneurysm sac. Lagrangian coherent structures (LCSs), hyperbolic time, and particle tracking analyses were applied to the velocity vectors obtained from computational fluid dynamics (CFD). Use of the FRED 4017 stent, with an EMSA value of 0.42, caused nearly 40% of the weightless blood flow particles (more than FRED 4038 and FRED 4539) to stay inside the aneurysm, whereas only 0.35% of the blood flow remained inside the aneurysm sac when no stent was placed at the aneurysm site. Furthermore, hyperbolic time computations illustrated the formation of stagnant flow zones that can be associated with the residence time of the blood flow particles. Lastly, the results of the hyperbolic time analysis are in good agreement with digital subtraction angiography (DSA) images taken in the clinic a few minutes after FRED 4017 implantation.
Affiliation(s)
- Onur Mutlu
- Yeditepe University, Faculty of Engineering, Department of Mechanical Engineering, Kayisdagi Cad., 34755 Istanbul, Turkey
- Ali Bahadır Olcay
- Yeditepe University, Faculty of Engineering, Department of Mechanical Engineering, Kayisdagi Cad., 34755 Istanbul, Turkey
- Cem Bilgin
- Uludag University School of Medicine, Department of Radiology, Gorukle, Bursa 16059, Turkey
- Bahattin Hakyemez
- Uludag University School of Medicine, Department of Radiology, Gorukle, Bursa 16059, Turkey
25
Alser M, Hassan H, Kumar A, Mutlu O, Alkan C. Shouji: a fast and efficient pre-alignment filter for sequence alignment. Bioinformatics 2020; 35:4255-4263. PMID: 30923804. DOI: 10.1093/bioinformatics/btz234.
Abstract
MOTIVATION The ability to generate massive amounts of sequencing data continues to overwhelm the processing capability of existing algorithms and compute infrastructures. In this work, we explore the use of hardware/software co-design and hardware acceleration to significantly reduce the execution time of short sequence alignment, a crucial step in analyzing sequenced genomes. We introduce Shouji, a highly parallel and accurate pre-alignment filter that remarkably reduces the need for computationally-costly dynamic programming algorithms. The first key idea of our proposed pre-alignment filter is to provide high filtering accuracy by correctly detecting all common subsequences shared between two given sequences. The second key idea is to design a hardware accelerator that adopts modern field-programmable gate array (FPGA) architectures to further boost the performance of our algorithm. RESULTS Shouji significantly improves the accuracy of pre-alignment filtering by up to two orders of magnitude compared to the state-of-the-art pre-alignment filters, GateKeeper and SHD. Our FPGA-based accelerator is up to three orders of magnitude faster than the equivalent CPU implementation of Shouji. Using a single FPGA chip, we benchmark the benefits of integrating Shouji with five state-of-the-art sequence aligners, designed for different computing platforms. The addition of Shouji as a pre-alignment step reduces the execution time of the five state-of-the-art sequence aligners by up to 18.8×. Shouji can be adapted for any bioinformatics pipeline that performs sequence alignment for verification. Unlike most existing methods that aim to accelerate sequence alignment, Shouji does not sacrifice any of the aligner capabilities, as it does not modify or replace the alignment step. AVAILABILITY AND IMPLEMENTATION https://github.com/CMU-SAFARI/Shouji. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
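The pre-alignment idea (admit a sequence pair only if enough read positions find a matching reference base within the edit-distance band) can be sketched in Python. This is a drastic simplification for illustration, not the published Shouji algorithm or its FPGA design:

```python
def prealign_filter(read, ref, e):
    """Admit the pair only if at most e read positions fail to find a
    matching reference base within a shift of e. One-sided guarantee: a pair
    within edit distance e is never rejected, because each edit can leave at
    most one read position unmatched; some dissimilar pairs may still pass
    and be rejected later by full alignment."""
    covered = [False] * len(read)
    for shift in range(-e, e + 1):
        for i, base in enumerate(read):
            j = i + shift
            if 0 <= j < len(ref) and base == ref[j]:
                covered[i] = True
    return covered.count(False) <= e

print(prealign_filter("ACGTACGT", "ACGAACGT", e=2))  # one substitution apart
print(prealign_filter("ACGTACGT", "TTTTTTTT", e=2))  # highly dissimilar pair
```

The real filter gains its speed by evaluating all shifts as wide bitwise operations on an FPGA rather than with per-position loops.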
Affiliation(s)
- Mohammed Alser
- Computer Science Department, ETH Zürich, Zürich, Switzerland
- Chair for Processor Design, Center For Advancing Electronics Dresden, Institute of Computer Engineering, Technische Universität Dresden, Dresden, Germany
- Computer Engineering Department, Bilkent University, Ankara, Turkey
- Hasan Hassan
- Computer Science Department, ETH Zürich, Zürich, Switzerland
- Akash Kumar
- Chair for Processor Design, Center For Advancing Electronics Dresden, Institute of Computer Engineering, Technische Universität Dresden, Dresden, Germany
- Onur Mutlu
- Computer Science Department, ETH Zürich, Zürich, Switzerland
- Computer Engineering Department, Bilkent University, Ankara, Turkey
- Can Alkan
- Computer Engineering Department, Bilkent University, Ankara, Turkey
26
Senol Cali D, Kim JS, Ghose S, Alkan C, Mutlu O. Nanopore sequencing technology and tools for genome assembly: computational analysis of the current state, bottlenecks and future directions. Brief Bioinform 2019; 20:1542-1559. PMID: 29617724. PMCID: PMC6781587. DOI: 10.1093/bib/bby017.
Abstract
Nanopore sequencing technology has the potential to render other sequencing technologies obsolete with its ability to generate long reads and provide portability. However, high error rates of the technology pose a challenge while generating accurate genome assemblies. The tools used for nanopore sequence analysis are of critical importance, as they should overcome the high error rates of the technology. Our goal in this work is to comprehensively analyze current publicly available tools for nanopore sequence analysis to understand their advantages, disadvantages and performance bottlenecks. It is important to understand where the current tools do not perform well to develop better tools. To this end, we (1) analyze the multiple steps and the associated tools in the genome assembly pipeline using nanopore sequence data, and (2) provide guidelines for determining the appropriate tools for each step. Based on our analyses, we make four key observations: (1) the choice of the tool for basecalling plays a critical role in overcoming the high error rates of nanopore sequencing technology. (2) Read-to-read overlap finding tools, GraphMap and Minimap, perform similarly in terms of accuracy. However, Minimap has a lower memory usage, and it is faster than GraphMap. (3) There is a trade-off between accuracy and performance when deciding on the appropriate tool for the assembly step. The fast but less accurate assembler Miniasm can be used for quick initial assembly, and further polishing can be applied on top of it to increase the accuracy, which leads to faster overall assembly. (4) The state-of-the-art polishing tool, Racon, generates high-quality consensus sequences while providing a significant speedup over another polishing tool, Nanopolish. We analyze various combinations of different tools and expose the trade-offs between accuracy, performance, memory usage and scalability. 
We conclude that our observations can guide researchers and practitioners in making conscious and effective choices for each step of the genome assembly pipeline using nanopore sequence data. The bottlenecks we identify can also help developers improve the current tools, or build new ones that are both accurate and fast, to overcome the high error rates of nanopore sequencing technology.
Affiliation(s)
- Damla Senol Cali
- Department of Electrical and Computer Engineering, Carnegie Mellon University, Pittsburgh, PA, USA
- Jeremie S Kim
- Department of Electrical and Computer Engineering, Carnegie Mellon University, Pittsburgh, PA, USA
- Department of Computer Science, Systems Group, ETH Zürich, Zürich, Switzerland
- Saugata Ghose
- Department of Electrical and Computer Engineering, Carnegie Mellon University, Pittsburgh, PA, USA
- Can Alkan
- Department of Computer Engineering, Bilkent University, Bilkent, Ankara, Turkey
- Onur Mutlu
- Department of Electrical and Computer Engineering, Carnegie Mellon University, Pittsburgh, PA, USA
- Department of Computer Science, Systems Group, ETH Zürich, Zürich, Switzerland
27
Ghose S, Yaglikçi AG, Gupta R, Lee D, Kudrolli K, Liu WX, Hassan H, Chang KK, Chatterjee N, Agrawal A, O'Connor M, Mutlu O. What Your DRAM Power Models Are Not Telling You. Proc ACM Meas Anal Comput Syst 2018. DOI: 10.1145/3224419.
Abstract
Main memory (DRAM) consumes as much as half of the total system power in a computer today, due to the increasing demand for memory capacity and bandwidth. There is a growing need to understand and analyze DRAM power consumption, which can be used to research new DRAM architectures and systems that consume less power. A major obstacle to such research is the lack of detailed and accurate information on the power consumption behavior of modern DRAM devices. Researchers have long relied on DRAM power models that are predominantly based on a set of standardized current measurements provided by DRAM vendors, called IDD values. Unfortunately, we find that state-of-the-art DRAM power models are often highly inaccurate, as these models do not reflect the actual power consumed by real DRAM devices. To build an accurate model and provide insights into DRAM power consumption, we perform the first comprehensive experimental characterization of the power consumed by modern real-world DRAM modules. Our extensive characterization of 50 DDR3L DRAM modules from three major vendors yields four key new observations about DRAM power consumption that prior models cannot capture: (1) across all IDD values that we measure, the current consumed by real DRAM modules varies significantly from the current specified by the vendors; (2) DRAM power consumption strongly depends on the data value that is read or written; (3) there is significant structural variation, where the same banks and rows across multiple DRAM modules from the same model consume more power than other banks or rows; and (4) over successive process technology generations, DRAM power consumption has not decreased by as much as vendor specifications have indicated. Because state-of-the-art DRAM power models do not account for any of these four key characteristics, they are highly inaccurate compared to the actual, measured power consumption of 50 real DDR3L modules.
Based on our detailed analysis and characterization data, we develop the Variation-Aware model of Memory Power Informed by Real Experiments (VAMPIRE). VAMPIRE is a new, accurate power consumption model for DRAM that takes into account (1) module-to-module and intra-module variations, and (2) power consumption variation due to data value dependency. We show that VAMPIRE has a mean absolute percentage error of only 6.8% compared to actual measured DRAM power. VAMPIRE enables a wide range of studies that were not possible using prior DRAM power models. As an example, we use VAMPIRE to evaluate the energy efficiency of three different encodings that can be used to store data in DRAM. We find that a new power-aware data encoding mechanism can reduce total DRAM energy consumption by an average of 12.2%, across a wide range of applications. We have open-sourced both VAMPIRE and our extensive raw data collected during our experimental characterization.
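Observation (2) above, that power depends on the data value, can be illustrated with a toy E = V x I x t model. Every constant and function below is hypothetical and merely stands in for VAMPIRE's curves fitted to real measurements:

```python
VDD = 1.35  # DDR3L supply voltage (volts)

def read_current_ma(ones_fraction):
    # Hypothetical measured read current as a function of the fraction of
    # one-bits in the data word, standing in for VAMPIRE's fitted curves.
    return 180.0 + 40.0 * ones_fraction

def read_energy_nj(data, t_ns=50.0):
    # Energy of one read burst: E = V * I * t.
    ones = sum(bin(b).count("1") for b in data)
    frac = ones / (8 * len(data))
    # volts * mA * ns gives picojoules, so divide by 1000 for nanojoules.
    return VDD * read_current_ma(frac) * t_ns / 1000.0

print(read_energy_nj(b"\x00" * 64))  # all-zero data: lowest energy
print(read_energy_nj(b"\xff" * 64))  # all-one data: highest energy
```

A model of this shape is what makes power-aware data encodings (storing data in the cheaper polarity) evaluable, as in the paper's 12.2% energy-saving example.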
Affiliation(s)
- Mike O'Connor
- NVIDIA / University of Texas at Austin, Austin, TX, USA
- Onur Mutlu
- ETH Zürich / Carnegie Mellon University, Zürich, Switzerland
28
Abstract
Contemporary discrete GPUs support rich memory management features such as virtual memory and demand paging. These features simplify GPU programming by providing a virtual address space abstraction similar to CPUs and eliminating manual memory management, but they introduce high performance overheads during (1) address translation and (2) page faults. A GPU relies on high degrees of thread-level parallelism (TLP) to hide memory latency. Address translation can undermine TLP, as a single miss in the translation lookaside buffer (TLB) invokes an expensive serialized page table walk that often stalls multiple threads. Demand paging can also undermine TLP, as multiple threads often stall while they wait for an expensive data transfer over the system I/O (e.g., PCIe) bus when the GPU demands a page.
In modern GPUs, we face a trade-off in how the page size used for memory management affects address translation and demand paging. The address translation overhead is lower when we employ a larger page size (e.g., 2MB large pages, compared with conventional 4KB base pages), which increases TLB coverage and thus reduces TLB misses. Conversely, the demand paging overhead is lower when we employ a smaller page size, which decreases the system I/O bus transfer latency. Support for multiple page sizes can help relax the page size trade-off so that address translation and demand paging optimizations work together synergistically. However, existing page coalescing (i.e., merging base pages into a large page) and splintering (i.e., splitting a large page into base pages) policies require costly base page migrations that undermine the benefits multiple page sizes provide. In this paper, we observe that GPGPU applications present an opportunity to support multiple page sizes without costly data migration, as the applications perform most of their memory allocation en masse (i.e., they allocate a large number of base pages at once). We show that this en masse allocation allows us to create intelligent memory allocation policies that ensure that base pages that are contiguous in virtual memory are allocated to contiguous physical memory pages. As a result, coalescing and splintering operations no longer need to migrate base pages.
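The en-masse allocation insight can be sketched as a toy allocator that gives each bulk request physically contiguous, large-page-aligned frames, so a later coalescing check succeeds without migrating any base page. All names, sizes, and structures below are illustrative, not the paper's implementation:

```python
LARGE = 512  # 4 KB base pages per 2 MB large page

class Allocator:
    """Toy frame allocator: an en-masse request is placed so that virtually
    contiguous base pages receive physically contiguous, large-page-aligned
    frames (sketch only, assuming a simple bump allocator)."""
    def __init__(self, num_frames):
        self.next_free = 0
        self.num_frames = num_frames

    def alloc_bulk(self, vpn_start, count):
        # Round up to a large-page boundary so the run can later coalesce.
        base = (self.next_free + LARGE - 1) // LARGE * LARGE
        assert base + count <= self.num_frames, "out of memory"
        self.next_free = base + count
        return {vpn_start + i: base + i for i in range(count)}

def coalescible(page_table, vpn_start):
    # A 2 MB region can be coalesced in place iff the mapping is an
    # identity-offset run starting on a large-page-aligned frame.
    first = page_table[vpn_start]
    return first % LARGE == 0 and all(
        page_table.get(vpn_start + i) == first + i for i in range(LARGE))

pt = {}
alloc = Allocator(num_frames=4 * LARGE)
pt.update(alloc.alloc_bulk(vpn_start=0, count=LARGE))      # en-masse request
pt.update(alloc.alloc_bulk(vpn_start=LARGE, count=LARGE))  # next 2 MB region
print(coalescible(pt, 0), coalescible(pt, LARGE))
```

Because both regions satisfy the check, either one can be promoted to a large page, or split back into base pages, without copying a single frame.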
29
Alser M, Hassan H, Xin H, Ergin O, Mutlu O, Alkan C. GateKeeper: a new hardware architecture for accelerating pre-alignment in DNA short read mapping. Bioinformatics 2018; 33:3355-3363. PMID: 28575161. DOI: 10.1093/bioinformatics/btx342.
Abstract
Motivation High throughput DNA sequencing (HTS) technologies generate an excessive number of small DNA segments (called short reads) that cause a significant computational burden. To analyze the entire genome, each of the billions of short reads must be mapped to a reference genome based on the similarity between a read and 'candidate' locations in that reference genome. The similarity measurement, called alignment, formulated as an approximate string matching problem, is the computational bottleneck because: (i) it is implemented using quadratic-time dynamic programming algorithms and (ii) the majority of candidate locations in the reference genome do not align with a given read due to high dissimilarity. Calculating the alignment of such incorrect candidate locations consumes an overwhelming majority of a modern read mapper's execution time. Therefore, it is crucial to develop a fast and effective filter that can detect incorrect candidate locations and eliminate them before invoking computationally costly alignment algorithms. Results We propose GateKeeper, a new hardware accelerator that functions as a pre-alignment step that quickly filters out most incorrect candidate locations. GateKeeper is the first design to accelerate pre-alignment using Field-Programmable Gate Arrays (FPGAs), which can perform pre-alignment much faster than software. When implemented on a single FPGA chip, GateKeeper maintains high accuracy (on average >96%) while providing, on average, 90-fold and 130-fold speedup over the state-of-the-art software pre-alignment techniques, Adjacency Filter and Shifted Hamming Distance (SHD), respectively. The addition of GateKeeper as a pre-alignment step can reduce the verification time of the mrFAST mapper by a factor of 10. Availability and implementation https://github.com/BilkentCompGen/GateKeeper. Contact mohammedalser@bilkent.edu.tr or onur.mutlu@inf.ethz.ch or calkan@cs.bilkent.edu.tr.
Supplementary information Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Mohammed Alser
- Department of Computer Engineering, Bilkent University, Bilkent, Ankara 06800, Turkey
- Hasan Hassan
- TOBB University of Economics & Technology, Sogutozu, Ankara, Turkey
- Department of Computer Science, ETH Zürich, 8092 Zürich, Switzerland
- Hongyi Xin
- Department of Computer Science, Carnegie Mellon University, Pittsburgh, PA 15213, USA
- Oguz Ergin
- TOBB University of Economics & Technology, Sogutozu, Ankara, Turkey
- Onur Mutlu
- Department of Computer Science, ETH Zürich, 8092 Zürich, Switzerland
- Can Alkan
- Department of Computer Engineering, Bilkent University, Bilkent, Ankara 06800, Turkey
30
Kim JS, Senol Cali D, Xin H, Lee D, Ghose S, Alser M, Hassan H, Ergin O, Alkan C, Mutlu O. GRIM-Filter: Fast seed location filtering in DNA read mapping using processing-in-memory technologies. BMC Genomics 2018; 19:89. PMID: 29764378. PMCID: PMC5954284. DOI: 10.1186/s12864-018-4460-0.
Abstract
Background Seed location filtering is critical in DNA read mapping, a process where billions of DNA fragments (reads) sampled from a donor are mapped onto a reference genome to identify genomic variants of the donor. State-of-the-art read mappers 1) quickly generate possible mapping locations for seeds (i.e., smaller segments) within each read, 2) extract reference sequences at each of the mapping locations, and 3) check similarity between each read and its associated reference sequences with a computationally-expensive algorithm (i.e., sequence alignment) to determine the origin of the read. A seed location filter comes into play before alignment, discarding seed locations that alignment would deem a poor match. The ideal seed location filter would discard all poor match locations prior to alignment such that there is no wasted computation on unnecessary alignments. Results We propose a novel seed location filtering algorithm, GRIM-Filter, optimized to exploit 3D-stacked memory systems that integrate computation within a logic layer stacked under memory layers, to perform processing-in-memory (PIM). GRIM-Filter quickly filters seed locations by 1) introducing a new representation of coarse-grained segments of the reference genome, and 2) using massively-parallel in-memory operations to identify read presence within each coarse-grained segment. Our evaluations show that for a sequence alignment error tolerance of 0.05, GRIM-Filter 1) reduces the false negative rate of filtering by 5.59x–6.41x, and 2) provides an end-to-end read mapper speedup of 1.81x–3.65x, compared to a state-of-the-art read mapper employing the best previous seed location filtering algorithm. Conclusion GRIM-Filter exploits 3D-stacked memory, which enables the efficient use of processing-in-memory, to overcome the memory bandwidth bottleneck in seed location filtering. We show that GRIM-Filter significantly improves the performance of a state-of-the-art read mapper. 
GRIM-Filter is a universal seed location filter that can be applied to any read mapper. We hope that our results provide inspiration for new works to design other bioinformatics algorithms that take advantage of emerging technologies and new processing paradigms, such as processing-in-memory using 3D-stacked memory devices.
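The bin-and-bitvector scheme can be sketched in Python, with a set standing in for the per-bin token-existence bitvector that the paper operates on inside 3D-stacked memory. All parameters below (token length, bin size, tolerance) are illustrative:

```python
Q = 4  # token (q-gram) length

def build_bins(reference, bin_size):
    # One existence set per coarse-grained reference segment (the paper packs
    # these as bitvectors in 3D-stacked memory; a Python set stands in here).
    bins = []
    for start in range(0, len(reference), bin_size):
        seg = reference[start:start + bin_size + Q - 1]  # overlap bin edges
        bins.append({seg[i:i + Q] for i in range(len(seg) - Q + 1)})
    return bins

def passes(read, bins, bin_index, error_tolerance=0.05):
    # Keep the seed location only if nearly all of the read's tokens occur
    # somewhere in the bin; each sequencing error can destroy up to Q tokens.
    tokens = (read[i:i + Q] for i in range(len(read) - Q + 1))
    misses = sum(1 for t in tokens if t not in bins[bin_index])
    return misses <= error_tolerance * len(read) * Q

ref = "ACGTACGTTTGACCA" * 20
bins = build_bins(ref, bin_size=50)
print(passes("ACGTACGTTTGA", bins, 0))  # read drawn from bin 0: kept
print(passes("GGGGCCCCAAAA", bins, 0))  # read absent from bin 0: filtered
```

The processing-in-memory win comes from evaluating the per-bin membership tests for many bins in parallel next to the stored bitvectors, rather than streaming them to the CPU.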
Affiliation(s)
- Jeremie S Kim
- Department of Electrical and Computer Engineering, Carnegie Mellon University, Pittsburgh, PA, USA
- Department of Computer Science, ETH Zürich, Zürich, Switzerland
- Damla Senol Cali
- Department of Electrical and Computer Engineering, Carnegie Mellon University, Pittsburgh, PA, USA
- Hongyi Xin
- Department of Computer Science, Carnegie Mellon University, Pittsburgh, PA, USA
- Saugata Ghose
- Department of Electrical and Computer Engineering, Carnegie Mellon University, Pittsburgh, PA, USA
- Mohammed Alser
- Department of Computer Engineering, Bilkent University, Bilkent, Ankara, Turkey
- Hasan Hassan
- Department of Computer Science, ETH Zürich, Zürich, Switzerland
- Oguz Ergin
- Department of Computer Engineering, TOBB University of Economics and Technology, Sogutozu, Ankara, Turkey
- Can Alkan
- Department of Computer Engineering, Bilkent University, Bilkent, Ankara, Turkey
- Onur Mutlu
- Department of Electrical and Computer Engineering, Carnegie Mellon University, Pittsburgh, PA, USA
- Department of Computer Science, ETH Zürich, Zürich, Switzerland
31
32
Yakarsonmez S, Cayir E, Mutlu O, Nural B, Sariyer E, Topuzogullari M, Milward MR, Cooper PR, Erdemir A, Turgut-Balik D. Cloning, expression and characterization of the gene encoding the enolase from Fusobacterium nucleatum. Appl Biochem Microbiol 2016. DOI: 10.1134/s0003683816010142.
33
Abstract
Main memory bandwidth is a critical bottleneck for modern GPU systems due to limited off-chip pin bandwidth. 3D-stacked memory architectures provide a promising opportunity to significantly alleviate this bottleneck by directly connecting a logic layer to the DRAM layers with high-bandwidth connections. Recent work has shown promising potential performance benefits from an architecture that connects multiple such 3D-stacked memories and offloads bandwidth-intensive computations to a GPU in each of the logic layers. An unsolved key challenge in such a system is how to enable computation offloading and data mapping to multiple 3D-stacked memories without burdening the programmer, such that any application can transparently benefit from near-data processing capabilities in the logic layer.
Our paper develops two new mechanisms to address this key challenge. First, a compiler-based technique that automatically identifies code to offload to a logic-layer GPU based on a simple cost-benefit analysis. Second, a software/hardware cooperative mechanism that predicts which memory pages will be accessed by offloaded code, and places those pages in the memory stack closest to the offloaded code, to minimize off-chip bandwidth consumption. We call the combination of these two programmer-transparent mechanisms TOM: Transparent Offloading and Mapping.
Our extensive evaluations across a variety of modern memory-intensive GPU workloads show that, without requiring any program modification, TOM significantly improves performance (by 30% on average, and up to 76%) compared to a baseline GPU system that cannot offload computation to 3D-stacked memories.
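The compiler's cost-benefit analysis can be caricatured in a few lines: offload a block when the memory traffic kept local to the stack exceeds the traffic added by shipping the block's context across the off-chip links. The byte counts and the exact form of the test below are hypothetical, for illustration only, not TOM's actual heuristic:

```python
def should_offload(bytes_loaded, bytes_stored, live_in_bytes, live_out_bytes):
    # Offload a candidate block to the near-memory GPU only when the memory
    # traffic kept inside the 3D stack outweighs the extra traffic of moving
    # the block's live-in/live-out context across the off-chip links.
    saved = bytes_loaded + bytes_stored
    added = live_in_bytes + live_out_bytes
    return saved > added

# A streaming loop: touches 2 MB of memory but carries little context.
print(should_offload(2**20, 2**20, 64, 64))
# A compute-heavy block: little memory traffic relative to its context.
print(should_offload(128, 0, 4096, 4096))
```

The companion mapping mechanism then tries to place the pages such a block touches in the stack whose logic-layer GPU will run it.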
34
Xin H, Nahar S, Zhu R, Emmons J, Pekhimenko G, Kingsford C, Alkan C, Mutlu O. Optimal seed solver: optimizing seed selection in read mapping. Bioinformatics 2015; 32:1632-42. PMID: 26568624. DOI: 10.1093/bioinformatics/btv670.
Abstract
MOTIVATION Optimizing seed selection is an important problem in read mapping. The number of non-overlapping seeds a mapper selects determines the sensitivity of the mapper while the total frequency of all selected seeds determines the speed of the mapper. Modern seed-and-extend mappers usually select seeds with either an equal and fixed-length scheme or with an inflexible placement scheme, both of which limit the ability of the mapper in selecting less frequent seeds to speed up the mapping process. Therefore, it is crucial to develop a new algorithm that can adjust both the individual seed length and the seed placement, as well as derive less frequent seeds. RESULTS We present the Optimal Seed Solver (OSS), a dynamic programming algorithm that discovers the least frequently-occurring set of x seeds in an L-base-pair read in [Formula: see text] operations on average and in [Formula: see text] operations in the worst case, while generating a maximum of [Formula: see text] seed frequency database lookups. We compare OSS against four state-of-the-art seed selection schemes and observe that OSS provides a 3-fold reduction in average seed frequency over the best previous seed selection optimizations. AVAILABILITY AND IMPLEMENTATION We provide an implementation of the Optimal Seed Solver in C++ at: https://github.com/CMU-SAFARI/Optimal-Seed-Solver CONTACT hxin@cmu.edu, calkan@cs.bilkent.edu.tr or onur@cmu.edu SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
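The dynamic program can be made concrete with a simplified recurrence (not the authors' optimized solver, and the toy frequency table below is invented): opt[j][i] is the best total frequency of j non-overlapping seeds placed within the first i bases, extended either by leaving base i uncovered or by ending a seed at position i.

```python
def optimal_seeds(read, freq, x, min_len=2):
    """Pick x non-overlapping seeds (each at least min_len bases) from the
    read, minimizing total frequency. freq maps a substring to its database
    count (a plain dict here; the real solver queries a seed-frequency
    database). Returns the minimum total frequency, or None if infeasible."""
    n, inf = len(read), float("inf")
    # opt[j][i]: best total frequency of j seeds placed within read[:i]
    opt = [[0] * (n + 1)] + [[inf] * (n + 1) for _ in range(x)]
    for j in range(1, x + 1):
        for i in range(1, n + 1):
            best = opt[j][i - 1]  # base i-1 left uncovered by any seed
            for l in range(min_len, i + 1):  # seed of length l ending at i
                cand = opt[j - 1][i - l] + freq.get(read[i - l:i], inf)
                best = min(best, cand)
            opt[j][i] = best
    return None if opt[x][n] == inf else opt[x][n]

freq = {"ACGT": 5, "TTAC": 1, "ACGTTT": 2, "TTTT": 40, "GTTT": 3}  # invented
print(optimal_seeds("ACGTTTAC", freq, x=2))  # -> 6: ACGT (5) + TTAC (1)
```

Allowing variable seed lengths and placements is exactly what lets the solver trade a slightly longer seed for a much rarer one.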
Affiliation(s)
- John Emmons
- Department of Computer Science and Engineering, Washington University, St. Louis, MO 63130, USA
- Carl Kingsford
- Computational Biology Department, Carnegie Mellon University, Pittsburgh, PA 15213, USA
- Can Alkan
- Department of Computer Engineering, Bilkent University, Bilkent, Ankara 06800, Turkey
- Onur Mutlu
- Computer Science Department and Department of Electrical and Computer Engineering, Carnegie Mellon University, Pittsburgh, PA, USA
35
Xin H, Greth J, Emmons J, Pekhimenko G, Kingsford C, Alkan C, Mutlu O. Shifted Hamming distance: a fast and accurate SIMD-friendly filter to accelerate alignment verification in read mapping. Bioinformatics 2015; 31:1553-60. PMID: 25577434. DOI: 10.1093/bioinformatics/btu856.
Abstract
MOTIVATION Calculating the edit-distance (i.e. minimum number of insertions, deletions and substitutions) between short DNA sequences is the primary task performed by seed-and-extend based mappers, which compare billions of sequences. In practice, only sequence pairs with a small edit-distance provide useful scientific data. However, the majority of sequence pairs analyzed by seed-and-extend based mappers differ by significantly more errors than what is typically allowed. Such error-abundant sequence pairs needlessly waste resources and severely hinder the performance of read mappers. Therefore, it is crucial to develop a fast and accurate filter that can rapidly and efficiently detect error-abundant string pairs and remove them from consideration before more computationally expensive methods are used. RESULTS We present a simple and efficient algorithm, Shifted Hamming Distance (SHD), which accelerates the alignment verification procedure in read mapping, by quickly filtering out error-abundant sequence pairs using bit-parallel and SIMD-parallel operations. SHD only filters string pairs that contain more errors than a user-defined threshold, making it fully comprehensive. It also maintains high accuracy with moderate error threshold (up to 5% of the string length) while achieving a 3-fold speedup over the best previous algorithm (Gene Myers's bit-vector algorithm). SHD is compatible with all mappers that perform sequence alignment for verification.
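The bit-parallel filter can be sketched with Python integers standing in for SIMD registers; this version omits the paper's speculative streak-amending refinement, so it is a simplification rather than the published algorithm:

```python
def mismatch_mask(read, ref, shift):
    # Bit i is set when read[i] does not match ref[i + shift] (one bit per
    # base; the SIMD version computes this with wide XORs over packed bases).
    mask = 0
    for i, base in enumerate(read):
        j = i + shift
        if not (0 <= j < len(ref) and base == ref[j]):
            mask |= 1 << i
    return mask

def shd_pass(read, ref, e):
    # A position counts as an error only if it mismatches under *every*
    # shift in [-e, e]; pass the pair on to full alignment verification
    # if the surviving error positions number at most e.
    combined = (1 << len(read)) - 1
    for shift in range(-e, e + 1):
        combined &= mismatch_mask(read, ref, shift)  # one AND per shift
    return bin(combined).count("1") <= e

print(shd_pass("ACGTACGT", "ACGTTCGT", e=1))  # one substitution: passes
print(shd_pass("ACGTACGT", "GGGGGGGG", e=1))  # dissimilar: filtered out
```

Because a position that matches under some shift can never contribute to an edit-distance-e rejection, the AND across shifts keeps the filter fully comprehensive while staying purely bitwise.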
Affiliation(s)
- Hongyi Xin, John Greth, John Emmons, Gennady Pekhimenko, Carl Kingsford, Can Alkan, Onur Mutlu
- All authors: Computer Science Department, Department of Electrical and Computer Engineering, and Computational Biology Department, Carnegie Mellon University, Pittsburgh, PA 15213, USA; and Department of Computer Engineering, Bilkent University, Bilkent, Ankara 06800, Turkey
36
Lee D, Hormozdiari F, Xin H, Hach F, Mutlu O, Alkan C. Fast and accurate mapping of Complete Genomics reads. Methods 2014; 79-80:3-10. PMID: 25461772. DOI: 10.1016/j.ymeth.2014.10.012.
Abstract
Many recent advances in genomics, and the expectations of personalized medicine, are made possible by the power of high-throughput sequencing (HTS) in sequencing large collections of human genomes. There are tens of different sequencing technologies currently available, and each HTS platform has different strengths and biases. This diversity makes it possible to use different technologies to correct for each other's shortcomings, but it also requires developing different algorithms for each platform due to the differences in data types and error models. The first problem to tackle in analyzing HTS data for resequencing applications is the read mapping stage, for which many tools have been developed for the most popular HTS methods, but publicly available and open-source aligners are still lacking for the Complete Genomics (CG) platform. Unfortunately, Burrows-Wheeler based methods are not practical for CG data due to the gapped nature of the reads generated by this method. Here we provide a sensitive read mapper (sirFAST) for the CG technology based on the seed-and-extend paradigm that can quickly map CG reads to a reference genome. We evaluate the performance and accuracy of sirFAST using both simulated and publicly available real data sets, showing high precision and recall rates.
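The seed-and-extend paradigm the mapper builds on can be sketched generically in Python (substitution-only verification; the real sirFAST handles the gapped read structure of the CG platform, and all names and sequences here are illustrative):

```python
def index_reference(ref, k):
    # Seed index: hash every k-mer of the reference to its positions.
    idx = {}
    for i in range(len(ref) - k + 1):
        idx.setdefault(ref[i:i + k], []).append(i)
    return idx

def map_read(read, ref, idx, k, max_err):
    # Seed: look up candidate locations for the read's first k-mer.
    # Extend: verify each candidate with a (substitution-only) error count.
    hits = []
    for pos in idx.get(read[:k], []):
        cand = ref[pos:pos + len(read)]
        errors = sum(a != b for a, b in zip(read, cand))
        if len(cand) == len(read) and errors <= max_err:
            hits.append(pos)
    return hits

ref = "TTGACCAACGTACGTTTGACCA"
idx = index_reference(ref, k=4)
print(map_read("ACGTACGA", ref, idx, k=4, max_err=1))  # -> [7]
```

Of the two candidate locations the seed lookup yields, only one survives verification, which is exactly the filtering structure a sensitive mapper exploits at genome scale.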
Affiliation(s)
- Donghyuk Lee
- Department of Electrical and Computer Engineering, Carnegie Mellon University, Pittsburgh, PA, USA
- Farhad Hormozdiari
- Department of Computer Science, University of California Los Angeles, Los Angeles, CA, USA
- Hongyi Xin
- Department of Electrical and Computer Engineering, Carnegie Mellon University, Pittsburgh, PA, USA
- Faraz Hach
- School of Computing Science, Simon Fraser University, Burnaby, BC, Canada
- Onur Mutlu
- Department of Electrical and Computer Engineering, Carnegie Mellon University, Pittsburgh, PA, USA
- Can Alkan
- Department of Computer Engineering, Bilkent University, Ankara, Turkey
|
37
|
Abstract
Memory isolation is a key property of a reliable and secure computing system: an access to one memory address should not have unintended side effects on data stored in other addresses. However, as DRAM process technology scales down to smaller dimensions, it becomes more difficult to prevent DRAM cells from electrically interacting with each other. In this paper, we expose the vulnerability of commodity DRAM chips to disturbance errors. By repeatedly reading from the same address in DRAM, we show that it is possible to corrupt data in nearby addresses. More specifically, repeatedly activating the same row in DRAM corrupts data in nearby rows. We demonstrate this phenomenon on Intel and AMD systems using a malicious program that generates many DRAM accesses. We induce errors in most DRAM modules (110 out of 129) from three major DRAM manufacturers. From this we conclude that many deployed systems are likely to be at risk. We identify the root cause of disturbance errors as the repeated toggling of a DRAM row's wordline, which stresses inter-cell coupling effects that accelerate charge leakage from nearby rows. We provide an extensive characterization study of disturbance errors and their behavior using an FPGA-based testing platform. Among our key findings, we show that (i) it takes as few as 139K accesses to induce an error and (ii) up to one in every 1.7K cells is susceptible to errors. After examining various potential ways of addressing the problem, we propose a low-overhead solution to prevent the errors.
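The disturbance mechanism described above (each aggressor-row activation accelerating charge leakage from victim-row cells) can be captured in a toy model. The leakage rate and sense threshold below are made-up numbers chosen only so the model reproduces the paper's headline figure of roughly 139K activations; they are not measured device parameters, and this is not real hammering code.

```python
def hammer(victim_charge, activations, leak_per_activation=3.6e-6, threshold=0.5):
    """Model repeated aggressor-row activations draining charge from a victim
    cell; the stored bit flips once charge falls below the sense threshold.
    All constants are illustrative, not measured values."""
    charge = victim_charge - activations * leak_per_activation
    return charge, charge < threshold
```

With these toy constants, 139,000 activations push a fully charged cell just below the threshold, while 100,000 do not.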
Collapse
|
38
|
Abstract
Continued scaling of NAND flash memory to smaller process technology nodes decreases its reliability, necessitating more sophisticated mechanisms to correctly read stored data values. To distinguish between different potential stored values, conventional techniques to read data from flash memory employ a single set of reference voltage values, which are determined based on the overall threshold voltage distribution of flash cells. Unfortunately, the phenomenon of program interference, in which a cell's threshold voltage unintentionally changes when a neighboring cell is programmed, makes this conventional approach increasingly inaccurate in determining the values of cells.
This paper makes the new empirical observation that identifying the value stored in the immediate-neighbor cell makes it easier to determine the data value stored in the cell that is being read. We provide a detailed statistical and experimental characterization of the threshold voltage distribution of flash memory cells conditional upon the immediate-neighbor cell values, and show that such conditional distributions can be used to determine a set of read reference voltages that lead to error rates much lower than when a single set of reference voltage values based on the overall distribution is used. Based on our analyses, we propose a new method for correcting errors in a flash memory page, neighbor-cell assisted correction (NAC). The key idea is to re-read a flash memory page that fails error correction codes (ECC) with the set of read reference voltage values corresponding to the conditional threshold voltage distribution for a given neighbor cell value, and to use the re-read values to correct the cells that have neighbors with that value. Our simulations show that NAC effectively improves flash memory lifetime by 33% while having no performance overhead at nominal lifetime and very modest overhead (less than 5%) at extended lifetime.
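The core idea (picking a read reference voltage conditioned on the neighbor's value rather than one global reference) can be sketched as follows. The voltage numbers are invented purely for illustration: programming a neighbor shifts a cell's threshold voltage upward, so the neighbor-conditioned reference classifies the disturbed cell correctly where the single reference does not.

```python
SINGLE_VREF = 1.0                      # conventional single read reference
CONDITIONAL_VREF = {0: 1.0, 1: 1.3}    # read reference given the neighbor's bit

def read_cell(vth, vref):
    """A flash cell reads as 1 when its threshold voltage is below the reference."""
    return 1 if vth < vref else 0
```

For example, a cell storing 1 with a nominal threshold of 0.8 V that is shifted to 1.2 V by program interference from a neighbor storing 1 misreads as 0 under the single reference, but reads correctly under the conditional reference.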
Collapse
Affiliation(s)
- Yu Cai
- Carnegie Mellon University, Pittsburgh, PA, USA
| | - Gulay Yalcin
- Barcelona Supercomputing Center, Barcelona, Spain
| | - Onur Mutlu
- Carnegie Mellon University, Pittsburgh, PA, USA
| | | | - Osman Unsal
- Barcelona Supercomputing Center, Barcelona, Spain
| | | | - Ken Mai
- Carnegie Mellon University, Pittsburgh, PA, USA
| |
Collapse
|
39
|
Abstract
As DRAM cells continue to shrink, they become more susceptible to retention failures. DRAM cells that permanently exhibit short retention times are fairly easy to identify and repair through the use of memory tests and row and column redundancy. However, the retention time of many cells may vary over time due to a property called Variable Retention Time (VRT). Since these cells intermittently transition between failing and non-failing states, they are particularly difficult to identify through memory tests alone. In addition, the high temperature packaging process may aggravate this problem as the susceptibility of cells to VRT increases after the assembly of DRAM chips. A promising alternative to manufacture-time testing is to detect and mitigate retention failures after the system has become operational. Such a system would require mechanisms to detect and mitigate retention failures in the field, but would be responsive to retention failures introduced after system assembly and could dramatically reduce the cost of testing, enabling much longer tests than are practical with manufacturer testing equipment.
In this paper, we analyze the efficacy of three common error mitigation techniques (memory tests, guardbands, and error correcting codes (ECC)) in real DRAM chips exhibiting both intermittent and permanent retention failures. Our analysis allows us to quantify the efficacy of recent system-level error mitigation mechanisms that build upon these techniques. We revisit prior works in the context of the experimental data we present, showing that our measured results significantly impact these works' conclusions. We find that mitigation techniques that rely on run-time testing alone [38, 27, 50, 26] are unable to ensure reliable operation even after many months of testing. Techniques that incorporate ECC [4, 52], however, can ensure reliable DRAM operation after only a few hours of testing. For example, VS-ECC [4], which couples testing with variable-strength codes to allocate the strongest codes to the most error-prone memory regions, can ensure reliable operation for 10 years after only 19 minutes of testing. We conclude that the viability of these mitigation techniques depends on efficient online profiling of DRAM performed without disrupting system operation.
Collapse
Affiliation(s)
- Samira Khan
- Carnegie Mellon University & Intel Labs, Pittsburgh, USA
| | | | - Yoongu Kim
- Carnegie Mellon University, Pittsburgh, USA
| | | | | | - Onur Mutlu
- Carnegie Mellon University, Pittsburgh, USA
| |
Collapse
|
40
|
Akar F, Mutlu O, Celikyurt IK, Ulak G, Erden F, Bektas E, Tanyeri P. Effects of rolipram and zaprinast on learning and memory in the Morris water maze and radial arm maze tests in naive mice. Drug Res (Stuttg) 2014; 65:86-90. [PMID: 24764251 DOI: 10.1055/s-0034-1372646] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/25/2022]
Abstract
In recent studies, inhibition of phosphodiesterase 5 (PDE5) improved recognition memory and counteracted spatial learning impairment induced by nitric oxide synthase (NOS) inhibition. The aim of this study was to investigate the effects of rolipram, a PDE4 inhibitor, and zaprinast, a PDE5 inhibitor, on learning and memory in the Morris water maze (MWM) and radial arm maze (RAM) tests in naive mice. Male Balb-c mice were treated subchronically with zaprinast (3 and 10 mg/kg) and rolipram (0.05 and 0.1 mg/kg) for 6 days in the MWM test and acutely before the retention trial of the RAM test. Rolipram (0.05 and 0.1 mg/kg) significantly decreased escape latency between the 2nd and 5th sessions, while zaprinast (10 mg/kg) significantly decreased escape latency only in the 2nd session. Rolipram (0.05 and 0.1 mg/kg) and zaprinast (10 mg/kg) significantly increased the time spent in the escape platform's quadrant in the probe trial of the MWM test; only rolipram decreased the mean distance to the platform, while zaprinast had no effect on it. Zaprinast (3 and 10 mg/kg) significantly decreased the number of errors compared to the control group, while rolipram (0.05 and 0.1 mg/kg) had no effect on the number of errors in the retention trial of the RAM test. Rolipram (0.05 and 0.1 mg/kg) and zaprinast (10 mg/kg) significantly decreased the time needed to complete the retention trial (latency) compared to the control group. Our study revealed that both zaprinast and rolipram enhanced spatial memory in the MWM, while zaprinast appears to have stronger memory-enhancing effects than rolipram in the RAM test.
Collapse
Affiliation(s)
- F Akar
- Department of Pharmacology, Kocaeli University Medical Faculty, Kocaeli, Turkey
| | - O Mutlu
- Department of Pharmacology, Kocaeli University Medical Faculty, Kocaeli, Turkey
| | - I K Celikyurt
- Department of Pharmacology, Kocaeli University Medical Faculty, Kocaeli, Turkey
| | - G Ulak
- Department of Pharmacology, Kocaeli University Medical Faculty, Kocaeli, Turkey
| | - F Erden
- Department of Pharmacology, Kocaeli University Medical Faculty, Kocaeli, Turkey
| | - E Bektas
- Department of Pharmacology, Kocaeli University Medical Faculty, Kocaeli, Turkey
| | - P Tanyeri
- Department of Pharmacology, Faculty of Medicine, Sakarya University, Sakarya, Turkey
| |
Collapse
|
41
|
Abstract
DRAM cells store data in the form of charge on a capacitor. This charge leaks off over time, eventually causing data to be lost. To prevent this data loss from occurring, DRAM cells must be periodically refreshed. Unfortunately, DRAM refresh operations waste energy and also degrade system performance by interfering with memory requests. These problems are expected to worsen as DRAM density increases.
The amount of time that a DRAM cell can safely retain data without being refreshed is called the cell's retention time. In current systems, all DRAM cells are refreshed at the rate required to guarantee the integrity of the cell with the shortest retention time, resulting in unnecessary refreshes for cells with longer retention times. Prior work has proposed to reduce unnecessary refreshes by exploiting differences in retention time among DRAM cells; however, such mechanisms require knowledge of each cell's retention time.
In this paper, we present a comprehensive quantitative study of retention behavior in modern DRAMs. Using a temperature-controlled FPGA-based testing platform, we collect retention time information from 248 commodity DDR3 DRAM chips from five major DRAM vendors. We observe two significant phenomena: data pattern dependence, where the retention time of each DRAM cell is significantly affected by the data stored in other DRAM cells, and variable retention time, where the retention time of some DRAM cells changes unpredictably over time. We discuss possible physical explanations for these phenomena, how their magnitude may be affected by DRAM technology scaling, and their ramifications for DRAM retention time profiling mechanisms.
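One practical ramification of data pattern dependence for profiling is that a cell's retention time must be taken as the worst case over several stored patterns. A minimal sketch, assuming a hypothetical `measure(pattern)` callback that returns per-cell retention times observed under one pattern:

```python
def profile_retention(measure, patterns):
    """Profile each cell under several data patterns and keep the worst
    (minimum) observed retention time, since retention is pattern-dependent.
    `measure(pattern)` is assumed to return {cell_id: retention_time_seconds}."""
    worst = {}
    for pattern in patterns:
        for cell, t in measure(pattern).items():
            worst[cell] = min(worst.get(cell, float("inf")), t)
    return worst
```

A profiler that tested only one pattern would overestimate retention for any cell whose worst case arises under a different pattern.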
Collapse
Affiliation(s)
- Jamie Liu
- Carnegie Mellon University, Pittsburgh, PA
| | - Ben Jaiyen
- Carnegie Mellon University, Pittsburgh, PA
| | - Yoongu Kim
- Carnegie Mellon University, Pittsburgh, PA
| | | | - Onur Mutlu
- Carnegie Mellon University, Pittsburgh, PA
| |
Collapse
|
42
|
Gumuslu E, Mutlu O, Sunnetci D, Ulak G, Celikyurt I, Cine N, Akar F. The Effects of Tianeptine, Olanzapine and Fluoxetine on the Cognitive Behaviors of Unpredictable Chronic Mild Stress-exposed Mice. Drug Res (Stuttg) 2013; 63:532-9. [DOI: 10.1055/s-0033-1347237] [Citation(s) in RCA: 24] [Impact Index Per Article: 2.2] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/26/2022]
Affiliation(s)
- E. Gumuslu
- Department of Medical Genetics, Kocaeli University Medical Faculty, Kocaeli, Turkey
| | - O. Mutlu
- Department of Pharmacology, Kocaeli University Medical Faculty, Kocaeli, Turkey
| | - D. Sunnetci
- Department of Medical Genetics, Kocaeli University Medical Faculty, Kocaeli, Turkey
| | - G. Ulak
- Department of Pharmacology, Kocaeli University Medical Faculty, Kocaeli, Turkey
| | - I. Celikyurt
- Department of Pharmacology, Kocaeli University Medical Faculty, Kocaeli, Turkey
| | - N. Cine
- Department of Medical Genetics, Kocaeli University Medical Faculty, Kocaeli, Turkey
| | - F. Akar
- Department of Pharmacology, Kocaeli University Medical Faculty, Kocaeli, Turkey
| |
Collapse
|
43
|
Abstract
With the introduction of next-generation sequencing (NGS) technologies, we are facing an exponential increase in the amount of genomic sequence data. The success of all medical and genetic applications of next-generation sequencing critically depends on the existence of computational techniques that can process and analyze the enormous amount of sequence data quickly and accurately. Unfortunately, current read mapping algorithms have difficulty coping with the massive amounts of data generated by NGS. We propose a new algorithm, FastHASH, which drastically improves the performance of seed-and-extend type, hash-table-based read mapping algorithms while maintaining the high sensitivity and comprehensiveness of such methods. FastHASH is a generic algorithm compatible with all seed-and-extend class read mapping algorithms. It introduces two main techniques, namely Adjacency Filtering and Cheap K-mer Selection. We implemented FastHASH and merged it into the codebase of the popular read mapping program mrFAST. Depending on the edit distance cutoffs, we observed up to 19-fold speedup while still maintaining 100% sensitivity and high comprehensiveness.
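The intuition behind Cheap K-mer Selection can be shown in a few lines: among a read's non-overlapping k-mers, prefer the ones with the fewest candidate locations in the reference index, so the expensive extension step verifies as little as possible. This is a sketch of the idea as described above, not FastHASH's actual implementation; the index layout and return format are assumptions.

```python
def cheap_kmers(read, index, k, n_select):
    """Cheap K-mer Selection (sketch): rank the read's non-overlapping k-mers
    by how many candidate locations each has in the reference index, and
    return the n_select cheapest as (hit_count, read_offset) pairs."""
    counts = [(len(index.get(read[i:i + k], [])), i)
              for i in range(0, len(read) - k + 1, k)]
    return sorted(counts)[:n_select]
```

For a read whose three k-mers occur 3, 1, and 2 times in the reference, selecting two seeds yields the k-mers with 1 and 2 hits, cutting verification work roughly in half versus using the most frequent seed.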
Collapse
Affiliation(s)
- Hongyi Xin
- Depts. of Computer Science and Electrical and Computer Engineering, Carnegie Mellon University, Pittsburgh, PA 15213, USA
| |
Collapse
|
44
|
Abstract
In this paper, we present network-on-chip (NoC) design and contrast it to traditional network design, highlighting similarities and differences between the two. As an initial case study, we examine network congestion in bufferless NoCs. We show that congestion manifests itself differently in a NoC than in traditional networks. Network congestion reduces system throughput in congested workloads for smaller NoCs (16 and 64 nodes), and limits the scalability of larger bufferless NoCs (256 to 4096 nodes) even when traffic has locality (e.g., when an application's required data is mapped nearby to its core in the network). We propose a new source throttling-based congestion control mechanism with application-level awareness that reduces network congestion to improve system performance. Our mechanism improves system performance by up to 28% (15% on average in congested workloads) in smaller NoCs, achieves linear throughput scaling in NoCs up to 4096 cores (attaining similar performance scalability to a NoC with large buffers), and reduces power consumption by up to 20%. Thus, we show an effective application of a network-level concept, congestion control, to a class of networks -- bufferless on-chip networks -- that has not been studied before by the networking community.
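The application-aware source throttling described above can be sketched as a simple controller: when measured congestion exceeds a threshold, scale back the injection rate of the most network-intensive application only. The threshold, scaling factor, and single-victim policy below are illustrative assumptions, not the paper's actual mechanism.

```python
def throttle(injection_rates, intensity, congestion, threshold=0.7, factor=0.5):
    """Application-aware source throttling (sketch): under high congestion,
    halve the injection rate of the most network-intensive application;
    otherwise leave all rates unchanged."""
    rates = dict(injection_rates)
    if congestion > threshold:
        heavy = max(intensity, key=intensity.get)  # most network-intensive app
        rates[heavy] *= factor
    return rates
```

Throttling only the heaviest injector frees network capacity for lighter applications, which is why such schemes can raise overall system throughput despite slowing one application.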
Collapse
Affiliation(s)
| | | | | | - Onur Mutlu
- Carnegie Mellon University, Pittsburgh, PA, USA
| |
Collapse
|
45
|
Abstract
Dynamic random-access memory (DRAM) is the building block of modern main memory systems. DRAM cells must be periodically refreshed to prevent loss of data. These refresh operations waste energy and degrade system performance by interfering with memory accesses. The negative effects of DRAM refresh increase as DRAM device capacity increases. Existing DRAM devices refresh all cells at a rate determined by the leakiest cell in the device. However, most DRAM cells can retain data for significantly longer. Therefore, many of these refreshes are unnecessary.
In this paper, we propose RAIDR (Retention-Aware Intelligent DRAM Refresh), a low-cost mechanism that can identify and skip unnecessary refreshes using knowledge of cell retention times. Our key idea is to group DRAM rows into retention time bins and apply a different refresh rate to each bin. As a result, rows containing leaky cells are refreshed as frequently as normal, while most rows are refreshed less frequently. RAIDR uses Bloom filters to efficiently implement retention time bins. RAIDR requires no modification to DRAM and minimal modification to the memory controller. In an 8-core system with 32 GB DRAM, RAIDR achieves a 74.6% refresh reduction, an average DRAM power reduction of 16.1%, and an average system performance improvement of 8.6% over existing systems, at a modest storage overhead of 1.25 KB in the memory controller. RAIDR's benefits are robust to variation in DRAM system configuration, and increase as memory capacity increases.
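The Bloom-filter-based binning at the heart of RAIDR can be sketched in Python. A Bloom filter never produces false negatives, so a row that truly contains a leaky cell is always refreshed at the conservative rate; a false positive only causes harmless extra refreshes. The sizes, hash count, and two-bin intervals below are illustrative assumptions, not RAIDR's hardware parameters.

```python
import hashlib

class BloomFilter:
    """Minimal Bloom filter: a compact membership test with no false negatives,
    used here the way RAIDR uses one, to flag rows containing a leaky cell."""
    def __init__(self, m=1024, k=3):
        self.m, self.k, self.bits = m, k, 0
    def _hashes(self, item):
        for i in range(self.k):
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.m
    def add(self, item):
        for h in self._hashes(item):
            self.bits |= 1 << h
    def __contains__(self, item):
        return all(self.bits >> h & 1 for h in self._hashes(item))

def refresh_interval_ms(row, leaky_rows, slow=256, fast=64):
    """Rows flagged as leaky keep the conservative (fast) refresh rate;
    all other rows are refreshed less often (illustrative intervals)."""
    return fast if row in leaky_rows else slow
```

Because the filter is a fixed-size bit vector, the controller's storage cost stays constant regardless of how many rows are flagged, which is what makes the approach cheap.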
Collapse
|
46
|
Abstract
Modern DRAMs have multiple banks to serve multiple memory requests in parallel. However, when two requests go to the same bank, they have to be served serially, exacerbating the high latency of off-chip memory. Adding more banks to the system to mitigate this problem incurs high system cost. Our goal in this work is to achieve the benefits of increasing the number of banks with a low cost approach. To this end, we propose three new mechanisms that overlap the latencies of different requests that go to the same bank. The key observation exploited by our mechanisms is that a modern DRAM bank is implemented as a collection of subarrays that operate largely independently while sharing few global peripheral structures.
Our proposed mechanisms (SALP-1, SALP-2, and MASA) mitigate the negative impact of bank serialization by overlapping different components of the bank access latencies of multiple requests that go to different subarrays within the same bank. SALP-1 requires no changes to the existing DRAM structure and only needs reinterpretation of some DRAM timing parameters. SALP-2 and MASA require only modest changes (< 0.15% area overhead) to the DRAM peripheral structures, which are much less design constrained than the DRAM core. Evaluations show that all our schemes significantly improve performance for both single-core systems and multi-core systems. Our schemes also interact positively with application-aware memory request scheduling in multi-core systems.
Collapse
|
47
|
Mutlu O, Celikyurt I, Ulak G, Tanyeri P, Akar F, Erden F. Effects of Olanzapine and Clozapine on Radial Maze Performance in Naive and MK-801-Treated Mice. ACTA ACUST UNITED AC 2012; 62:4-8. [DOI: 10.1055/s-0031-1291360] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/14/2022]
Affiliation(s)
- O. Mutlu
- Kocaeli University Medical Faculty, Pharmacology Department, Kocaeli, Turkey
| | - I. Celikyurt
- Kocaeli University Medical Faculty, Pharmacology Department, Kocaeli, Turkey
| | - G. Ulak
- Kocaeli University Medical Faculty, Pharmacology Department, Kocaeli, Turkey
| | - P. Tanyeri
- Kocaeli University Medical Faculty, Pharmacology Department, Kocaeli, Turkey
| | - F. Akar
- Kocaeli University Medical Faculty, Pharmacology Department, Kocaeli, Turkey
| | - F. Erden
- Kocaeli University Medical Faculty, Pharmacology Department, Kocaeli, Turkey
| |
Collapse
|
48
|
Bolkent S, Yanardag R, Bolkent S, Mutlu O, Yildirim S, Kangawa K, Minegishi Y, Suzuki H. The Effect of Zinc supplementation on Ghrelin-Immunoreactive Cells and Lipid Parameters in Gastrointestinal Tissue of Streptozotocin-Induced Female Diabetic Rats. Mol Cell Biochem 2006; 286:77-85. [PMID: 16479319 DOI: 10.1007/s11010-005-9095-1] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/16/2005] [Accepted: 11/28/2005] [Indexed: 10/25/2022]
Abstract
Zinc is an essential nutrient with a wide range of functions and is closely involved in a variety of enzymatic processes important in glucose, protein and lipid metabolism. Ghrelin is the endogenous ligand of the G-protein-coupled growth hormone secretagogue receptor. The regulatory mechanisms that explain the biosynthesis and secretion of ghrelin in the gastrointestinal tract have not been clarified. This study was undertaken to examine the effect of zinc supplementation on ghrelin production and secretion and on lipid metabolism in the gastrointestinal tract of streptozotocin (STZ)-induced diabetic rats. The animals were divided into four groups. Group I: non-diabetic untreated animals. Group II: zinc-treated non-diabetic rats. Group III: STZ-induced diabetic untreated animals. Group IV: zinc-treated diabetic animals. Zinc sulfate was given to some of the experimental animals by gavage at a dose of 100 mg/kg body weight every day for 60 days. In the zinc-treated diabetic group, blood glucose levels decreased and body weight increased compared to the diabetic untreated group. Zinc supplementation of STZ-diabetic rats revealed a protective effect of zinc on lipid parameters such as total lipid, cholesterol, HDL-cholesterol and atherogenic index. There was no statistically significant change in ghrelin-immunoreactive cells in gastrointestinal tissue overall, but zinc supplementation caused a significant reduction in the density of ghrelin-producing cells in the fundic mucosa of zinc-treated diabetic animals compared to untreated, non-diabetic controls. Zinc supplementation may thus contribute to preventing some biochemical complications in diabetic rats.
Collapse
Affiliation(s)
- S Bolkent
- Department of Medical Biology, Cerrahpasa Faculty of Medicine, Istanbul University, 34098 Cerrahpasa, Istanbul, Turkey.
| |
Collapse
|