1
Gong B, Li D, Łabaj PP, Pan B, Novoradovskaya N, Thierry-Mieg D, Thierry-Mieg J, Chen G, Bergstrom Lucas A, LoCoco JS, Richmond TA, Tseng E, Kusko R, Happe S, Mercer TR, Pabón-Peña C, Salmans M, Tilgner HU, Xiao W, Johann DJ, Jones W, Tong W, Mason CE, Kreil DP, Xu J. Targeted DNA-seq and RNA-seq of Reference Samples with Short-read and Long-read Sequencing. Sci Data 2024; 11:892. [PMID: 39152166; PMCID: PMC11329654; DOI: 10.1038/s41597-024-03741-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/05/2024] [Accepted: 08/05/2024] [Indexed: 08/19/2024] [Open Access]
Abstract
Next-generation sequencing (NGS) has revolutionized genomic research by enabling high-throughput, cost-effective genome and transcriptome sequencing, accelerating personalized medicine for complex diseases, including cancer. Whole genome/transcriptome sequencing (WGS/WTS) provides comprehensive insights, while targeted sequencing is more cost-effective and sensitive. In comparison to short-read sequencing, which still dominates the field due to high speed and cost-effectiveness, long-read sequencing can overcome alignment limitations and better discriminate similar sequences from alternative transcripts or repetitive regions. Hybrid sequencing combines the strengths of different technologies for a more comprehensive view of genomic/transcriptomic variations. Understanding each technology's strengths and limitations is critical for translating cutting-edge technologies into clinical applications. In this study, we sequenced DNA and RNA libraries of reference samples using various targeted DNA and RNA panels and the whole transcriptome on both short-read and long-read platforms. This study design enables a comprehensive analysis of sequencing technologies, targeting protocols, and library preparation methods. Our expanded profiling landscape establishes a reference point for assessing current sequencing technologies, facilitating informed decision-making in genomic research and precision medicine.
Affiliation(s)
- Binsheng Gong
  - Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, US Food and Drug Administration, Jefferson, AR, 72079, USA
- Dan Li
  - Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, US Food and Drug Administration, Jefferson, AR, 72079, USA
- Paweł P Łabaj
  - Małopolska Centre of Biotechnology, Jagiellonian University, Krakow, Poland
  - Bioinformatics Research, Institute of Molecular Biotechnology, Boku University Vienna, Vienna, Austria
- Bohu Pan
  - Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, US Food and Drug Administration, Jefferson, AR, 72079, USA
- Danielle Thierry-Mieg
  - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, 8600 Rockville Pike, Bethesda, MD, 20894, USA
- Jean Thierry-Mieg
  - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, 8600 Rockville Pike, Bethesda, MD, 20894, USA
- Guangchun Chen
  - Department of Immunology, Genomics and Microarray Core Facility, University of Texas Southwestern Medical Center, 5323 Harry Hine Blvd., Dallas, TX, 75390, USA
- Anne Bergstrom Lucas
  - Agilent Technologies, Inc., 5301 Stevens Creek Blvd., Santa Clara, CA, 95051, USA
- Todd A Richmond
  - Market & Application Development Bioinformatics, Roche Sequencing Solutions Inc., 4300 Hacienda Dr., Pleasanton, CA, 94588, USA
- Rebecca Kusko
  - Cellino Bio, 750 Main Street, Cambridge, MA, 02143, USA
- Scott Happe
  - Agilent Technologies, Inc., 1834 State Hwy 71 West, Cedar Creek, TX, 78612, USA
- Timothy R Mercer
  - Australian Institute for Bioengineering and Nanotechnology, The University of Queensland, St Lucia, QLD, Australia
- Carlos Pabón-Peña
  - Agilent Technologies, Inc., 5301 Stevens Creek Blvd., Santa Clara, CA, 95051, USA
- Hagen U Tilgner
  - Brain and Mind Research Institute, Weill Cornell Medicine, New York, NY, USA
  - Center for Neurogenetics, Weill Cornell Medicine, New York, NY, USA
- Wenzhong Xiao
  - Stanford Genome Technology Center, Stanford University, Palo Alto, CA, 94304, USA
  - Massachusetts General Hospital, Harvard Medical School, Boston, MA, 02114, USA
- Donald J Johann
  - Winthrop P Rockefeller Cancer Institute, University of Arkansas for Medical Sciences, 4301 W Markham St., Little Rock, AR, 72205, USA
- Wendell Jones
  - Q squared Solutions Genomics, 2400 Elis Road, Durham, NC, 27703, USA
- Weida Tong
  - Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, US Food and Drug Administration, Jefferson, AR, 72079, USA
- Christopher E Mason
  - Department of Physiology and Biophysics, Weill Cornell Medicine, Cornell University, New York, NY, 10065, USA
  - The HRH Prince Alwaleed Bin Talal Bin Abdulaziz Alsaud Institute for Computational Biomedicine, Weill Cornell Medicine, New York, NY, USA
  - The WorldQuant Initiative for Quantitative Prediction, Weill Cornell Medicine, New York, NY, USA
- David P Kreil
  - Bioinformatics Research, Institute of Molecular Biotechnology, Boku University Vienna, Vienna, Austria
- Joshua Xu
  - Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, US Food and Drug Administration, Jefferson, AR, 72079, USA

2
Deshpande D, Chhugani K, Chang Y, Karlsberg A, Loeffler C, Zhang J, Muszyńska A, Munteanu V, Yang H, Rotman J, Tao L, Balliu B, Tseng E, Eskin E, Zhao F, Mohammadi P, Łabaj PP, Mangul S. RNA-seq data science: From raw data to effective interpretation. Front Genet 2023; 14:997383. [PMID: 36999049; PMCID: PMC10043755; DOI: 10.3389/fgene.2023.997383] [Citation(s) in RCA: 34] [Impact Index Per Article: 17.0] [Received: 07/18/2022] [Accepted: 02/24/2023] [Indexed: 03/14/2023] [Open Access]
Abstract
RNA sequencing (RNA-seq) has become an exemplary technology in modern biology and clinical science. Its immense popularity is due in large part to the continuous efforts of the bioinformatics community to develop accurate and scalable computational tools to analyze the enormous amounts of transcriptomic data that it produces. RNA-seq analysis enables genes and their corresponding transcripts to be probed for a variety of purposes, such as detecting novel exons or whole transcripts, assessing expression of genes and alternative transcripts, and studying alternative splicing structure. It can be a challenge, however, to obtain meaningful biological signals from raw RNA-seq data because of the enormous scale of the data as well as the inherent limitations of different sequencing technologies, such as amplification bias or biases of library preparation. The need to overcome these technical challenges has pushed the rapid development of novel computational tools, which have evolved and diversified in accordance with technological advancements, leading to the current myriad of RNA-seq tools. These tools, combined with the diverse computational skill sets of biomedical researchers, help to unlock the full potential of RNA-seq. The purpose of this review is to explain basic concepts in the computational analysis of RNA-seq data and define discipline-specific jargon.
Affiliation(s)
- Dhrithi Deshpande
  - Department of Pharmacology and Pharmaceutical Sciences, USC Alfred E. Mann School of Pharmacy and Pharmaceutical Sciences, Los Angeles, CA, United States
- Karishma Chhugani
  - Department of Pharmacology and Pharmaceutical Sciences, USC Alfred E. Mann School of Pharmacy and Pharmaceutical Sciences, Los Angeles, CA, United States
- Yutong Chang
  - Department of Pharmacology and Pharmaceutical Sciences, USC Alfred E. Mann School of Pharmacy and Pharmaceutical Sciences, Los Angeles, CA, United States
- Aaron Karlsberg
  - Department of Clinical Pharmacy, USC Alfred E. Mann School of Pharmacy and Pharmaceutical Sciences, Los Angeles, CA, United States
- Caitlin Loeffler
  - Department of Computer Science, University of California, Los Angeles, CA, United States
- Jinyang Zhang
  - Beijing Institutes of Life Science, Chinese Academy of Sciences, Beijing, China
- Agata Muszyńska
  - Małopolska Centre of Biotechnology, Jagiellonian University, Krakow, Poland
  - Institute of Automatic Control, Electronics and Computer Science, Silesian University of Technology, Gliwice, Poland
- Viorel Munteanu
  - Department of Computers, Informatics and Microelectronics, Technical University of Moldova, Chisinau, Moldova
- Harry Yang
  - Department of Microbiology, Immunology and Molecular Genetics, University of California Los Angeles, Los Angeles, CA, United States
- Jeremy Rotman
  - Department of Clinical Pharmacy, USC Alfred E. Mann School of Pharmacy and Pharmaceutical Sciences, Los Angeles, CA, United States
- Laura Tao
  - Department of Computational Medicine, David Geffen School of Medicine at UCLA, CHS, Los Angeles, CA, United States
- Brunilda Balliu
  - Department of Computational Medicine, David Geffen School of Medicine at UCLA, CHS, Los Angeles, CA, United States
- Eleazar Eskin
  - Department of Computer Science, University of California, Los Angeles, CA, United States
  - Department of Computational Medicine, David Geffen School of Medicine at UCLA, CHS, Los Angeles, CA, United States
  - Department of Human Genetics, David Geffen School of Medicine at UCLA, Los Angeles, CA, United States
- Fangqing Zhao
  - Beijing Institutes of Life Science, Chinese Academy of Sciences, Beijing, China
  - Key Laboratory of Systems Biology, Hangzhou Institute for Advanced Study, University of Chinese Academy of Sciences, Hangzhou, China
- Pejman Mohammadi
  - Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA, United States
- Paweł P. Łabaj
  - Małopolska Centre of Biotechnology, Jagiellonian University, Krakow, Poland
  - Department of Biotechnology, Boku University Vienna, Vienna, Austria
- Serghei Mangul
  - Department of Clinical Pharmacy, USC Alfred E. Mann School of Pharmacy and Pharmaceutical Sciences, Los Angeles, CA, United States
  - Department of Quantitative and Computational Biology, USC Dornsife College of Letters, Arts and Sciences, Los Angeles, CA, United States
  - *Correspondence: Serghei Mangul,

3
Gudur VY, Maheshwari S, Acharyya A, Shafik R. An FPGA Based Energy-Efficient Read Mapper With Parallel Filtering and In-Situ Verification. IEEE/ACM Trans Comput Biol Bioinform 2022; 19:2697-2711. [PMID: 34415836; DOI: 10.1109/tcbb.2021.3106311] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/13/2023]
Abstract
In the assembly pipeline of Whole Genome Sequencing (WGS), read mapping is a widely used method to re-assemble the genome. It employs approximate string matching and dynamic programming-based algorithms on a large volume of data and associated structures, making it a computationally intensive process. Currently, the state-of-the-art data centers for genome sequencing incur substantial setup and energy costs for maintaining hardware, data storage and cooling systems. To enable low-cost genomics, we propose an energy-efficient architectural methodology for read mapping using a single system-on-chip (SoC) platform. The proposed methodology is based on the q-gram lemma and designed using a novel architecture for filtering and verification. The filtering algorithm is designed using a parallel sorted q-gram lemma based method for the first time, and it is complemented by an in-situ verification routine using parallel Myers bit-vector algorithm. We have implemented our design on the Zynq Ultrascale+ XCZU9EG MPSoC platform. It is then extensively validated using real genomic data to demonstrate up to 7.8× energy reduction and up to 13.3× less resource utilization when compared with the state-of-the-art software and hardware approaches.
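The filter-then-verify idea named in this abstract can be illustrated in software. The following is a minimal Python sketch, not the authors' FPGA design: it applies the q-gram lemma (a read of length m that matches with at most k edits shares at least m + 1 - (k + 1)q q-grams with the matching region) as a filter, then verifies surviving candidates with a plain dynamic-programming edit distance. The plain DP is a stand-in for the Myers bit-vector algorithm, and the function names `qgrams`, `min_edit_distance_in_window`, and `map_read` are hypothetical.

```python
from collections import Counter


def qgrams(s, q):
    """Multiset of all overlapping substrings of length q in s."""
    return Counter(s[i:i + q] for i in range(len(s) - q + 1))


def min_edit_distance_in_window(read, window):
    """Smallest edit distance between read and any substring of window.

    Plain O(m * n) dynamic programming with a free start in the window
    (first row all zeros). Used here as a simple stand-in for the
    Myers bit-vector algorithm mentioned in the abstract.
    """
    prev = [0] * (len(window) + 1)
    for i, rc in enumerate(read, start=1):
        cur = [i]
        for j, wc in enumerate(window, start=1):
            cur.append(min(prev[j] + 1,                # gap in the window
                           cur[j - 1] + 1,             # gap in the read
                           prev[j - 1] + (rc != wc)))  # match or mismatch
        prev = cur
    return min(prev)


def map_read(read, ref, k, q):
    """Return window start positions where read occurs with <= k edits."""
    m = len(read)
    threshold = m + 1 - (k + 1) * q  # q-gram lemma lower bound
    read_grams = qgrams(read, q)
    hits = []
    for p in range(len(ref)):
        window = ref[p:p + m + k]
        if len(window) < m - k:
            break  # remaining windows are too short to hold a match
        shared = sum((read_grams & qgrams(window, q)).values())
        if threshold > 0 and shared < threshold:
            continue  # filtered out without running the costly verification
        if min_edit_distance_in_window(read, window) <= k:
            hits.append(p)
    return hits


if __name__ == "__main__":
    reference = "ACGTACGTTTGACCAGTACGGA"
    print(map_read("TTGACGAGT", reference, k=1, q=3))
```

A reported position is the start of a window of length m + k that contains the match, so neighbouring positions may be reported for the same occurrence. When the threshold is not positive the lemma gives no guarantee, and the sketch falls back to verifying every window.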

4
Bagal UR, Phan J, Welsh RM, Misas E, Wagner D, Gade L, Litvintseva AP, Cuomo CA, Chow NA. MycoSNP: A Portable Workflow for Performing Whole-Genome Sequencing Analysis of Candida auris. Methods Mol Biol 2022; 2517:215-228. [PMID: 35674957; DOI: 10.1007/978-1-0716-2417-3_17] [Citation(s) in RCA: 15] [Impact Index Per Article: 5.0] [Indexed: 06/15/2023]
Abstract
Candida auris is an urgent public health threat characterized by high rates of drug resistance and rapid spread in healthcare settings worldwide. As part of the C. auris response, molecular surveillance has helped public health officials track the global spread and investigate local outbreaks. Here, we describe whole-genome sequencing analysis methods used for routine C. auris molecular surveillance in the United States; methods include reference selection, reference preparation, quality assessment and control of sequencing reads, read alignment, and single-nucleotide polymorphism calling and filtration. We also describe the newly developed pipeline MycoSNP, a portable workflow for performing whole-genome sequencing analysis of fungal organisms including C. auris.
Affiliation(s)
- Ujwal R Bagal
  - Mycotic Diseases Branch, Centers for Disease Control and Prevention, Atlanta, GA, USA
- John Phan
  - Centers for Disease Control and Prevention, Atlanta, GA, USA
- Rory M Welsh
  - Mycotic Diseases Branch, Centers for Disease Control and Prevention, Atlanta, GA, USA
- Elizabeth Misas
  - Mycotic Diseases Branch, Centers for Disease Control and Prevention, Atlanta, GA, USA
- Lalitha Gade
  - Mycotic Diseases Branch, Centers for Disease Control and Prevention, Atlanta, GA, USA
- Christina A Cuomo
  - Infectious Disease and Microbiome Program, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Nancy A Chow
  - Mycotic Diseases Branch, Centers for Disease Control and Prevention, Atlanta, GA, USA

5
Pavlovich PV, Cauchy P. Sequences to Differences in Gene Expression: Analysis of RNA-Seq Data. Methods Mol Biol 2022; 2508:279-318. [PMID: 35737247; DOI: 10.1007/978-1-0716-2376-3_20] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/15/2023]
Abstract
RNA-Seq is now a routinely employed assay to measure gene expression. As the technique matured over the last decade, so have dedicated analytic tools. In this chapter, we first describe the mainstream as well as the most up-to-date protocols and their implications for downstream analysis. We then detail the steps involved in RNA-Seq analysis in three main stages: (i) preprocessing and data preparation, (ii) upstream processing, and (iii) high-level analyses. We review the most recent and relevant tools as one workflow following a stepwise order. The chapter further covers in-depth features of these tools. Details of the required code, as well as of the underlying statistics, are made available throughout the chapter. We illustrate these steps with analysis of publicly available RNA-Seq data.
Affiliation(s)
- Pierre Cauchy
  - Universitätsklinikum Freiburg, Freiburg, Germany
  - Max Planck Institute of Immunobiology and Epigenetics, Freiburg, Germany

6
Claes KBM, Rosseel T, De Leeneer K. Dealing with Pseudogenes in Molecular Diagnostics in the Next Generation Sequencing Era. Methods Mol Biol 2021; 2324:363-381. [PMID: 34165726; DOI: 10.1007/978-1-0716-1503-4_22] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Indexed: 11/28/2022]
Abstract
Presence of pseudogenes is a dreadful issue in next generation sequencing (NGS), because their contamination can interfere with the detection of variants in the genuine gene and generate false positive and false negative variants. In this chapter we focus on issues related to the application of NGS strategies for analysis of genes with pseudogenes in a clinical setting. The degree to which a pseudogene impacts the ability to accurately detect and map variants in its parent gene depends on the degree of similarity (homology) with the parent gene itself. Hereby, target enrichment and mapping strategies are crucial factors to avoid "contaminating" pseudogene sequences. For target enrichment, we describe advantages and disadvantages of PCR- and capture-based strategies. For mapping strategies, we discuss crucial parameters that need to be considered to accurately distinguish sequences of functional genes from pseudogenic sequences. Finally, we discuss some examples of genes associated with Mendelian disorders, for which interesting NGS approaches are described to avoid interference with pseudogene sequences.
Affiliation(s)
- Toon Rosseel
  - Center for Medical Genetics, Ghent University Hospital, Ghent, Belgium
- Kim De Leeneer
  - Center for Medical Genetics, Ghent University Hospital, Ghent, Belgium

7
Werner S, Galliot A, Pichot F, Kemmer T, Marchand V, Sednev MV, Lence T, Roignant JY, König J, Höbartner C, Motorin Y, Hildebrandt A, Helm M. NOseq: amplicon sequencing evaluation method for RNA m6A sites after chemical deamination. Nucleic Acids Res 2021; 49:e23. [PMID: 33313868; PMCID: PMC7913672; DOI: 10.1093/nar/gkaa1173] [Citation(s) in RCA: 24] [Impact Index Per Article: 6.0] [Received: 04/23/2020] [Revised: 11/13/2020] [Accepted: 11/20/2020] [Indexed: 12/26/2022] [Open Access]
Abstract
Methods for the detection of m6A by RNA-Seq technologies are increasingly sought after. We here present NOseq, a method to detect m6A residues in defined amplicons by virtue of their resistance to chemical deamination, effected by nitrous acid. Partial deamination in NOseq affects all exocyclic amino groups present in nucleobases and thus also changes sequence information. The method uses a mapping algorithm specifically adapted to the sequence degeneration caused by deamination events. Thus, m6A sites with partial modification levels of ∼50% were detected in defined amplicons, and this threshold can be lowered to ∼10% by combination with m6A immunoprecipitation. NOseq faithfully detected known m6A sites in human rRNA, and the long non-coding RNA MALAT1, and positively validated several m6A candidate sites, drawn from miCLIP data with an m6A antibody, in the transcriptome of Drosophila melanogaster. Conceptually related to bisulfite sequencing, NOseq presents a novel amplicon-based sequencing approach for the validation of m6A sites in defined sequences.
Affiliation(s)
- Stephan Werner
  - Institute of Pharmaceutical and Biomedical Sciences, Johannes Gutenberg-University Mainz, Staudingerweg 5, 55128 Mainz, Germany
- Aurellia Galliot
  - Institute of Pharmaceutical and Biomedical Sciences, Johannes Gutenberg-University Mainz, Staudingerweg 5, 55128 Mainz, Germany
- Florian Pichot
  - Institute of Pharmaceutical and Biomedical Sciences, Johannes Gutenberg-University Mainz, Staudingerweg 5, 55128 Mainz, Germany
- Thomas Kemmer
  - Institute of Computer Science, Johannes Gutenberg-University Mainz, Staudingerweg 9, 55128 Mainz, Germany
- Virginie Marchand
  - Université de Lorraine, CNRS, INSERM, Epitranscriptomics and Sequencing (EpiRNA-Seq) Core Facility, UMS2008/US40 IBSLor, Biopôle UL, F-54000 Nancy, France
- Maksim V Sednev
  - Institute of Organic Chemistry, Julius Maximilian University Würzburg, Am Hubland, 97074 Würzburg, Germany
- Tina Lence
  - Institute of Molecular Biology, Ackermannweg 4, 55128 Mainz, Germany
- Jean-Yves Roignant
  - Institute of Pharmaceutical and Biomedical Sciences, Johannes Gutenberg-University Mainz, Staudingerweg 5, 55128 Mainz, Germany
  - Institute of Molecular Biology, Ackermannweg 4, 55128 Mainz, Germany
  - Génopode - Center for Integrative Genomics, Université de Lausanne, 1015 Lausanne, Switzerland
- Julian König
  - Institute of Molecular Biology, Ackermannweg 4, 55128 Mainz, Germany
- Claudia Höbartner
  - Institute of Organic Chemistry, Julius Maximilian University Würzburg, Am Hubland, 97074 Würzburg, Germany
- Yuri Motorin
  - Université de Lorraine, CNRS, UMR7365 IMoPA, Biopôle UL, F-54000 Nancy, France
- Andreas Hildebrandt
  - Institute of Computer Science, Johannes Gutenberg-University Mainz, Staudingerweg 9, 55128 Mainz, Germany
- Mark Helm
  - Institute of Pharmaceutical and Biomedical Sciences, Johannes Gutenberg-University Mainz, Staudingerweg 5, 55128 Mainz, Germany

8
Kumar S, Agarwal S, Ranvijay. Fast and memory efficient approach for mapping NGS reads to a reference genome. J Bioinform Comput Biol 2020; 17:1950008. [PMID: 31057068; DOI: 10.1142/s0219720019500082] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Indexed: 12/22/2022]
Abstract
New-generation sequencing machines such as Illumina and Solexa can generate millions of short reads from a given genome sequence in a single run. Alignment of these reads to a reference genome is a core step in next-generation sequencing data analyses such as genetic variation detection and genome re-sequencing. There is therefore a need for a new approach that is efficient in both memory and time for aligning this enormous number of reads to the reference genome. Existing techniques such as MAQ, Bowtie, BWA, BWBBLE, Subread, Kart, and Minimap2 require huge memory for whole reference genome indexing and read alignment. Gapped alignment versions of these techniques are also 20-40% slower than their respective normal versions. In this paper, an efficient approach, WIT, is proposed for reference genome indexing and read alignment using the Burrows-Wheeler Transform (BWT) and Wavelet Tree (WT). It supports both exact and approximate alignment. Experimental work shows that the proposed approach WIT performs the best in the case of protein sequence indexing. For indexing, the space required by WIT is 0.6 N (N is the size of the reference genome), whereas the existing techniques BWA, Subread, Kart, and Minimap2 require between 1.25 N and 5 N. Experimentally, it is also observed that even with such a small index size, the alignment time of the proposed approach is comparable to that of BWA, Subread, Kart, and Minimap2. Other alignment parameters, accuracy and confidentiality, are also experimentally shown to be better than those of Minimap2. The source code of the proposed approach WIT is available at http://www.algorithm-skg.com/wit/home.html.
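To make the BWT-based indexing mentioned in this abstract concrete, here is a minimal Python sketch of exact matching by backward search over a Burrows-Wheeler Transform. It is not the WIT implementation: the suffix array is built naively, and the rank query that a wavelet tree would answer efficiently is replaced by a simple linear count. The names `bwt_index`, `rank`, and `backward_search` are hypothetical.

```python
from collections import Counter


def bwt_index(text):
    """Build a naive suffix array, the BWT string, and the C table."""
    text += "$"  # sentinel that sorts before A, C, G, T
    sa = sorted(range(len(text)), key=lambda i: text[i:])
    # BWT character = character preceding each sorted suffix (wraps to "$").
    bwt = "".join(text[i - 1] for i in sa)
    # C[c] = number of characters in text that are strictly smaller than c.
    counts = Counter(text)
    c_table, total = {}, 0
    for ch in sorted(counts):
        c_table[ch] = total
        total += counts[ch]
    return sa, bwt, c_table


def rank(bwt, ch, i):
    """Occurrences of ch in bwt[:i]; a wavelet tree would answer this fast."""
    return bwt[:i].count(ch)


def backward_search(pattern, sa, bwt, c_table):
    """Return sorted start positions of exact occurrences of pattern."""
    lo, hi = 0, len(bwt)  # current suffix-array interval [lo, hi)
    for ch in reversed(pattern):
        if ch not in c_table:
            return []
        lo = c_table[ch] + rank(bwt, ch, lo)
        hi = c_table[ch] + rank(bwt, ch, hi)
        if lo >= hi:
            return []  # interval became empty: no occurrence
    return sorted(sa[lo:hi])


if __name__ == "__main__":
    sa, bwt, c_table = bwt_index("ACGTACGT")
    print(bwt)
    print(backward_search("CGT", sa, bwt, c_table))
```

The sketch only covers exact matching. Approximate alignment, which the abstract also mentions, would need backtracking or seeding on top of this search.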
Affiliation(s)
- Ranvijay
  - CSED, NIT Allahabad, 211004, India

9
Lakin SM, Kuhnle A, Alipanahi B, Noyes NR, Dean C, Muggli M, Raymond R, Abdo Z, Prosperi M, Belk KE, Morley PS, Boucher C. Hierarchical Hidden Markov models enable accurate and diverse detection of antimicrobial resistance sequences. Commun Biol 2019; 2:294. [PMID: 31396574; PMCID: PMC6684577; DOI: 10.1038/s42003-019-0545-9] [Citation(s) in RCA: 21] [Impact Index Per Article: 3.5] [Received: 03/20/2018] [Accepted: 07/08/2019] [Indexed: 12/13/2022] [Open Access]
Abstract
The characterization of antimicrobial resistance genes from high-throughput sequencing data has become foundational in public health research and regulation. This requires mapping sequence reads to databases of known antimicrobial resistance genes to determine the genes present in the sample. Mapping sequence reads to known genes is traditionally accomplished using alignment. Alignment methods have high specificity but are limited in their ability to detect sequences that are divergent from the reference database, which can result in a substantial false negative rate. We address this shortcoming through the creation of Meta-MARC, which enables detection of diverse resistance sequences using hierarchical, DNA-based Hidden Markov Models. We first describe Meta-MARC and then demonstrate its efficacy on simulated and functional metagenomic datasets. Meta-MARC has higher sensitivity relative to competing methods. This sensitivity allows for detection of sequences that are divergent from known antimicrobial resistance genes. This functionality is imperative to expanding existing antimicrobial gene databases.
Affiliation(s)
- Steven M Lakin
  - Department of Microbiology, Immunology, and Pathology, Colorado State University, Fort Collins, CO 80523 USA
- Alan Kuhnle
  - Department of Computer and Information Science and Engineering, University of Florida, Gainesville, FL 32611 USA
- Bahar Alipanahi
  - Department of Computer and Information Science and Engineering, University of Florida, Gainesville, FL 32611 USA
- Noelle R Noyes
  - Department of Veterinary Population Medicine, University of Minnesota, St. Paul, MN 55108 USA
- Chris Dean
  - Department of Microbiology, Immunology, and Pathology, Colorado State University, Fort Collins, CO 80523 USA
- Martin Muggli
  - Department of Computer Science, Colorado State University, Fort Collins, CO 80523 USA
- Rob Raymond
  - Department of Computer Science, Colorado State University, Fort Collins, CO 80523 USA
- Zaid Abdo
  - Department of Microbiology, Immunology, and Pathology, Colorado State University, Fort Collins, CO 80523 USA
- Mattia Prosperi
  - Department of Epidemiology, University of Florida, Gainesville, FL 32611 USA
- Keith E Belk
  - Department of Animal Sciences, Colorado State University, Fort Collins, CO 80523 USA
- Paul S Morley
  - VERO Center, Texas A&M University and West Texas A&M University, Canyon, TX 79016 USA
- Christina Boucher
  - Department of Computer and Information Science and Engineering, University of Florida, Gainesville, FL 32611 USA

10
Prousalis K, Konofaos N. A Quantum Pattern Recognition Method for Improving Pairwise Sequence Alignment. Sci Rep 2019; 9:7226. [PMID: 31076611; PMCID: PMC6510764; DOI: 10.1038/s41598-019-43697-3] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.3] [Received: 08/16/2018] [Accepted: 04/29/2019] [Indexed: 12/22/2022] [Open Access]
Abstract
Quantum pattern recognition techniques have recently attracted attention as potential candidates for analyzing vast amounts of data. Faster ways to process data are imperative where data generation is rapid. The growth of sequence databases driven by high-throughput sequencing is unprecedented. Current alignment methods have blossomed overnight, but there is still a need for more efficient methods that preserve a high level of accuracy. In this work, a complex method is proposed to treat the alignment problem better than its classical counterparts by means of quantum computation. The basic principle of the standard dot-plot method is combined with a quantum algorithm, giving insight into the effect of quantum pattern recognition on pairwise alignment. The central feature of quantum algorithms, quantum parallelism, and the diffraction patterns of x-rays are synthesized to provide a clever array indexing structure on the growing sequence databases. A completely different approach is considered in contrast to contemporary conventional aligners, and a variety of competitive classical counterparts are classified and organized for comparison with the quantum setting. The proposed method seems to exhibit high alignment quality and to prevail over the others in terms of time and space complexity.
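For readers unfamiliar with the standard dot-plot method this abstract builds on, the classical version can be sketched in a few lines of Python. This is only the conventional, non-quantum baseline, written under the assumption of exact character matches with no scoring matrix; the function names `dot_plot` and `longest_diagonal` are hypothetical and nothing here reflects the authors' quantum algorithm.

```python
def dot_plot(a, b):
    """Boolean matrix with a dot (True) wherever a[i] equals b[j]."""
    return [[x == y for y in b] for x in a]


def longest_diagonal(a, b):
    """Longest run of dots along a diagonal of the dot plot.

    Returns (length, start_in_a, start_in_b). A long diagonal run is the
    visual signature of a shared ungapped segment between the sequences.
    """
    best = (0, 0, 0)
    prev = [0] * (len(b) + 1)
    for i, x in enumerate(a, start=1):
        cur = [0] * (len(b) + 1)
        for j, y in enumerate(b, start=1):
            if x == y:
                cur[j] = prev[j - 1] + 1  # extend the run on this diagonal
                if cur[j] > best[0]:
                    best = (cur[j], i - cur[j], j - cur[j])
        prev = cur
    return best


if __name__ == "__main__":
    for row in dot_plot("GATTACA", "ATTAC"):
        print("".join("*" if cell else "." for cell in row))
    print(longest_diagonal("GATTACA", "ATTAC"))
```

Both functions take time proportional to the product of the two sequence lengths, which is the cost that motivates the search for faster alternatives on large databases.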
Affiliation(s)
- Nikos Konofaos
  - Department of Informatics, Aristotle University of Thessaloniki, Thessaloniki, Greece

11
Abstract
Since the discovery that DNA alterations initiate tumorigenesis, scientists and clinicians have been exploring ways to counter these changes with targeted therapeutics. The sequencing of tumor DNA was initially limited to highly actionable hot spots, areas of the genome that are frequently altered and have an approved matched therapy in a specific tumor type. Large-scale genome sequencing programs quickly developed technological improvements that enabled the deployment of whole-exome and whole-genome sequencing technologies at scale for pristine sample materials in research environments. However, the turning point for precision medicine in oncology was the innovations in clinical laboratories that improved turnaround time, depth of coverage, and the ability to reliably sequence archived, clinically available samples. Today, tumor genome sequencing no longer suffers from significant technical or financial hurdles, and the next opportunity for improvement lies in the optimal utilization of the technologies and data for many different tumor types.
Affiliation(s)
- Kenna R Mills Shaw
  - Khalifa Bin Zayed Institute for Personalized Cancer Therapy and Sheikh Ahmed Center for Pancreatic Cancer Research, University of Texas MD Anderson Cancer Center, Houston, Texas 77030, USA
- Anirban Maitra
  - Khalifa Bin Zayed Institute for Personalized Cancer Therapy and Sheikh Ahmed Center for Pancreatic Cancer Research, University of Texas MD Anderson Cancer Center, Houston, Texas 77030, USA

12
Pan B, Kusko R, Xiao W, Zheng Y, Liu Z, Xiao C, Sakkiah S, Guo W, Gong P, Zhang C, Ge W, Shi L, Tong W, Hong H. Similarities and differences between variants called with human reference genome HG19 or HG38. BMC Bioinformatics 2019; 20:101. [PMID: 30871461; PMCID: PMC6419332; DOI: 10.1186/s12859-019-2620-0] [Citation(s) in RCA: 31] [Impact Index Per Article: 5.2] [Indexed: 12/30/2022] [Open Access]
Abstract
Background: Reference genome selection is a prerequisite for successful analysis of next generation sequencing (NGS) data. Current practice employs one of the two most recent human reference genome versions: HG19 or HG38. To date, the impact of genome version on SNV identification has not been rigorously assessed.
Methods: We conducted an analysis comparing the SNVs identified based on HG19 vs HG38, leveraging whole genome sequencing (WGS) data from the genome-in-a-bottle (GIAB) project. First, SNVs were called using 26 different bioinformatics pipelines with either HG19 or HG38. Next, two tools were used to convert the called SNVs between HG19 and HG38. Lastly, we calculated conversion rates, analyzed discordant rates between SNVs called with HG19 or HG38, and characterized the discordant SNVs.
Results: The conversion rates from HG38 to HG19 (average 95%) were lower than the conversion rates from HG19 to HG38 (average 99%). The conversion rates varied slightly among the various calling pipelines. Around 1.5% of SNVs were discordantly converted between HG19 and HG38. The conversions from HG38 to HG19 had more SNVs that failed conversion and more discordant SNVs than the opposite conversion (HG19 to HG38). Most of the discordant SNVs had low read depth, were low confidence SNVs as defined by GIAB, and/or were predominated by G/C alleles (52% observed versus 42% expected).
Conclusion: A significant number of SNVs could not be converted between HG19 and HG38. Based on careful review of our comparisons, we recommend HG38 (the newer version) for NGS SNV analysis. To summarize, our findings suggest caution when translating identified SNVs between different versions of the human reference genome.
Electronic supplementary material: The online version of this article (10.1186/s12859-019-2620-0) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Bohu Pan
  - Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, AR, 72079, USA
- Wenming Xiao
  - Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, AR, 72079, USA
- Yuanting Zheng
  - Center for Pharmacogenomics, Fudan University, Shanghai, China
- Zhichao Liu
  - Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, AR, 72079, USA
- Chunlin Xiao
  - National Center for Biotechnological Information, National Institutes of Health, Bethesda, MD, 20894, USA
- Sugunadevi Sakkiah
  - Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, AR, 72079, USA
- Wenjing Guo
  - Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, AR, 72079, USA
- Ping Gong
  - Environmental Laboratory, US Army Engineer Research and Development Center, Vicksburg, MS, 39180, USA
- Chaoyang Zhang
  - School of Computing, The University of Southern Mississippi, Hattiesburg, MS, 39406, USA
- Weigong Ge
  - Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, AR, 72079, USA
- Leming Shi
  - Center for Pharmacogenomics, Fudan University, Shanghai, China
- Weida Tong
  - Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, AR, 72079, USA
- Huixiao Hong
  - Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, AR, 72079, USA
13
Sun S, Murray SS. Bioinformatics Basics for High-Throughput Hybridization-Based Targeted DNA Sequencing from FFPE-Derived Tumor Specimens: From Reads to Variants. Methods Mol Biol 2019; 1908:37-48. [PMID: 30649719 DOI: 10.1007/978-1-4939-9004-7_3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/09/2023]
Abstract
The use of next-generation sequencing and hybridization-based capture for target enrichment has enabled the interrogation of coding regions of several clinically significant cancer genes in tumor specimens, using panels that range from targeted panels of a few to hundreds of genes to whole-exome panels encompassing the coding regions of all genes in the genome. Next-generation sequencing (NGS) technologies produce millions of relatively short sequence segments, or reads. Bioinformatics tools are required to map these reads back to a reference genome with read alignment tools and to call variants by determining differences in single bases (single nucleotide variants, or SNVs) or multiple bases (insertions and deletions, or indels) between the aligned reads and the reference genome. In addition to single nucleotide changes and small insertions and deletions, high copy gains and losses can also be gleaned from NGS data to call gene amplifications and deletions. Throughout these processes, numerous quality control metrics can be assessed at each step to ensure that the resulting called variants are accurate and of high quality. In this chapter we review common tools used to generate reads from Illumina-derived sequence data, align reads, and call variants from hybridization-based targeted NGS panel data generated from tumor FFPE-derived DNA specimens, as well as basic quality metrics to assess for each assayed specimen.
Affiliation(s)
- Shulei Sun
  - Center for Advanced Laboratory Medicine, University of California San Diego Health, La Jolla, CA, USA
- Sarah S Murray
  - Center for Advanced Laboratory Medicine, University of California San Diego Health, La Jolla, CA, USA
  - Department of Pathology, University of California San Diego, La Jolla, CA, USA
14
Fuentes-Pardo AP, Ruzzante DE. Whole-genome sequencing approaches for conservation biology: Advantages, limitations and practical recommendations. Mol Ecol 2017; 26:5369-5406. [PMID: 28746784 DOI: 10.1111/mec.14264] [Citation(s) in RCA: 160] [Impact Index Per Article: 20.0] [Received: 04/19/2017] [Revised: 06/23/2017] [Accepted: 06/28/2017] [Indexed: 12/14/2022]
Abstract
Whole-genome resequencing (WGR) is a powerful method for addressing fundamental evolutionary biology questions that have not been fully resolved using traditional methods. WGR includes four approaches: the sequencing of individuals to a high depth of coverage with either unresolved or resolved haplotypes, the sequencing of population genomes to a high depth by mixing equimolar amounts of unlabelled-individual DNA (Pool-seq) and the sequencing of multiple individuals from a population to a low depth (lcWGR). These techniques require the availability of a reference genome. This, along with the still high cost of shotgun sequencing and the large demand for computing resources and storage, has limited their implementation in nonmodel species with scarce genomic resources and in fields such as conservation biology. Our goal here is to describe the various WGR methods, their pros and cons and potential applications in conservation biology. WGR offers an unprecedented marker density and surveys a wide diversity of genetic variations not limited to single nucleotide polymorphisms (e.g., structural variants and mutations in regulatory elements), increasing its power for the detection of signatures of selection and local adaptation as well as for the identification of the genetic basis of phenotypic traits and diseases. Currently, though, no single WGR approach fulfils all requirements of conservation genetics, and each method has its own limitations and sources of potential bias. We discuss proposed ways to minimize such biases. We envision a not-too-distant future where the analysis of whole genomes becomes a routine task in many nonmodel species and fields including conservation biology.
15
White SJ, Laros JF, Bakker E, Cambon-Thomsen A, Eden M, Leonard S, Lochmüller H, Matthijs G, Mattocks C, Patton S, Payne K, Scheffer H, Souche E, Thomassen E, Thompson R, Traeger-Synodinos J, Vooren S, Janssen B, den Dunnen JT. Critical points for an accurate human genome analysis. Hum Mutat 2017; 38:912-921. [DOI: 10.1002/humu.23238] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Received: 09/12/2016] [Revised: 04/13/2017] [Accepted: 04/23/2017] [Indexed: 12/16/2022]
Affiliation(s)
- Stefan J. White
  - Department of Human Genetics, Leiden University Medical Center, The Netherlands
- Jeroen F.J. Laros
  - Department of Human Genetics, Leiden University Medical Center, The Netherlands
  - Clinical Genetics, Leiden University Medical Center, The Netherlands
  - GenomeScan, Leiden, The Netherlands
- Egbert Bakker
  - Clinical Genetics, Leiden University Medical Center, The Netherlands
- Anne Cambon-Thomsen
  - Epidemiology and Public Health Analyses, Inserm and Université Toulouse III Paul Sabatier, Toulouse, UMR 1027, France
- Martin Eden
  - Manchester Centre for Health Economics, University of Manchester, Manchester, UK
- Samantha Leonard
  - Epidemiology and Public Health Analyses, Inserm and Université Toulouse III Paul Sabatier, Toulouse, UMR 1027, France
- Hanns Lochmüller
  - Institute of Genetic Medicine, Newcastle University, Newcastle upon Tyne, UK
- Simon Patton
  - Central Manchester University Hospitals Foundation Trust, EMQN, Manchester, UK
- Katherine Payne
  - Manchester Centre for Health Economics, University of Manchester, Manchester, UK
- Ellen Thomassen
  - Department of Human Genetics, Leiden University Medical Center, The Netherlands
- Rachel Thompson
  - Institute of Genetic Medicine, Newcastle University, Newcastle upon Tyne, UK
- Johan T. den Dunnen
  - Department of Human Genetics, Leiden University Medical Center, The Netherlands
  - Clinical Genetics, Leiden University Medical Center, The Netherlands
16
Jiménez C, Jara-Acevedo M, Corchete LA, Castillo D, Ordóñez GR, Sarasquete ME, Puig N, Martínez-López J, Prieto-Conde MI, García-Álvarez M, Chillón MC, Balanzategui A, Alcoceba M, Oriol A, Rosiñol L, Palomera L, Teruel AI, Lahuerta JJ, Bladé J, Mateos MV, Orfão A, San Miguel JF, González M, Gutiérrez NC, García-Sanz R. A Next-Generation Sequencing Strategy for Evaluating the Most Common Genetic Abnormalities in Multiple Myeloma. J Mol Diagn 2016; 19:99-106. [PMID: 27863261 DOI: 10.1016/j.jmoldx.2016.08.004] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.1] [Received: 04/22/2016] [Revised: 08/04/2016] [Accepted: 08/12/2016] [Indexed: 12/16/2022] Open
Abstract
Identification and characterization of genetic alterations are essential for diagnosis of multiple myeloma and may guide therapeutic decisions. Currently, genomic analysis of myeloma to cover the diverse range of alterations with prognostic impact requires fluorescence in situ hybridization (FISH), single nucleotide polymorphism arrays, and sequencing techniques, which are costly and labor intensive and require large numbers of plasma cells. To overcome these limitations, we designed a targeted-capture next-generation sequencing approach for one-step identification of IGH translocations, V(D)J clonal rearrangements, the IgH isotype, and somatic mutations to rapidly identify risk groups and specific targetable molecular lesions. Forty-eight newly diagnosed myeloma patients were tested with the panel, which included IGH and six genes that are recurrently mutated in myeloma: NRAS, KRAS, HRAS, TP53, MYC, and BRAF. We identified 14 of 17 IGH translocations previously detected by FISH and three confirmed translocations not detected by FISH, with the additional advantage of breakpoint identification, which can be used as a target for evaluating minimal residual disease. IgH subclass and V(D)J rearrangements were identified in 77% and 65% of patients, respectively. Mutation analysis revealed the presence of missense protein-coding alterations in at least one of the evaluated genes in 16 of 48 patients (33%). This method may represent a time- and cost-effective diagnostic method for the molecular characterization of multiple myeloma.
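As a small worked check of the counts reported in this abstract, the Python sketch below converts them to percentages. The helper `percent` and the variable names are illustrative assumptions. Only the 33% figure is stated in the abstract; the share of FISH-detected translocations recovered by the panel is simple arithmetic on the reported 14 of 17 and is not a value quoted by the authors.

```python
def percent(count: int, total: int) -> float:
    """Return count as a percentage of total."""
    return 100.0 * count / total


# Counts taken from the abstract.
fish_translocations = 17      # IGH translocations previously detected by FISH
ngs_recovered = 14            # of those, also identified by the NGS panel
patients = 48                 # newly diagnosed patients tested
patients_with_missense = 16   # patients with at least one missense alteration

recovery_pct = percent(ngs_recovered, fish_translocations)
missense_pct = percent(patients_with_missense, patients)

print(f"FISH-detected translocations recovered by NGS: {recovery_pct:.1f}%")
print(f"Patients with a missense alteration: {missense_pct:.1f}%")
```

Rounded to a whole number, the second value agrees with the 33% given in the abstract.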
Affiliation(s)
- Cristina Jiménez
  - Hematology Department, University Hospital of Salamanca, Research Biomedical Institute of Salamanca (IBSAL), Salamanca, Spain
- María Jara-Acevedo
  - DNA Sequencing Service, University of Salamanca, Research Biomedical Institute of Salamanca (IBSAL), Salamanca, Spain
- Luis A Corchete
  - Hematology Department, University Hospital of Salamanca, Research Biomedical Institute of Salamanca (IBSAL), Salamanca, Spain
- María E Sarasquete
  - Hematology Department, University Hospital of Salamanca, Research Biomedical Institute of Salamanca (IBSAL), Salamanca, Spain
- Noemí Puig
  - Hematology Department, University Hospital of Salamanca, Research Biomedical Institute of Salamanca (IBSAL), Salamanca, Spain
- Joaquín Martínez-López
  - Hematology Department, 12 de Octubre Hospital, Unit of Cancer Research Innovation Spain (CRIS), Spanish National Cancer Research Center (CNIO), University of Madrid, Madrid, Spain
- María I Prieto-Conde
  - Hematology Department, University Hospital of Salamanca, Research Biomedical Institute of Salamanca (IBSAL), Salamanca, Spain
- María García-Álvarez
  - Hematology Department, University Hospital of Salamanca, Research Biomedical Institute of Salamanca (IBSAL), Salamanca, Spain
- María C Chillón
  - Hematology Department, University Hospital of Salamanca, Research Biomedical Institute of Salamanca (IBSAL), Salamanca, Spain
- Ana Balanzategui
  - Hematology Department, University Hospital of Salamanca, Research Biomedical Institute of Salamanca (IBSAL), Salamanca, Spain
- Miguel Alcoceba
  - Hematology Department, University Hospital of Salamanca, Research Biomedical Institute of Salamanca (IBSAL), Salamanca, Spain
- Albert Oriol
  - Catalan Institute of Oncology, Josep Carreras Institute, Germans Trias i Pujol Hospital, Barcelona, Spain
- Laura Rosiñol
  - Research Biomedical Institute August Pi i Sunyer, Clinical Hospital of Barcelona, Barcelona, Spain
- Juan J Lahuerta
  - Hematology Department, 12 de Octubre Hospital, Unit of Cancer Research Innovation Spain (CRIS), Spanish National Cancer Research Center (CNIO), University of Madrid, Madrid, Spain
- Joan Bladé
  - Research Biomedical Institute August Pi i Sunyer, Clinical Hospital of Barcelona, Barcelona, Spain
- María V Mateos
  - Hematology Department, University Hospital of Salamanca, Research Biomedical Institute of Salamanca (IBSAL), Salamanca, Spain
- Alberto Orfão
  - DNA Sequencing Service, University of Salamanca, Research Biomedical Institute of Salamanca (IBSAL), Salamanca, Spain
- Jesús F San Miguel
  - Center for Applied Medical Research, University of Navarra Hospital, Institute of Health Research of Navarra (IDISNA), Pamplona, Spain
- Marcos González
  - Hematology Department, University Hospital of Salamanca, Research Biomedical Institute of Salamanca (IBSAL), Salamanca, Spain
- Norma C Gutiérrez
  - Hematology Department, University Hospital of Salamanca, Research Biomedical Institute of Salamanca (IBSAL), Salamanca, Spain
- Ramón García-Sanz
  - Hematology Department, University Hospital of Salamanca, Research Biomedical Institute of Salamanca (IBSAL), Salamanca, Spain
17
Icay K, Chen P, Cervera A, Rantanen V, Lehtonen R, Hautaniemi S. SePIA: RNA and small RNA sequence processing, integration, and analysis. BioData Min 2016; 9:20. [PMID: 27213017 PMCID: PMC4875694 DOI: 10.1186/s13040-016-0099-z] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.4] [Received: 10/06/2015] [Accepted: 05/08/2016] [Indexed: 02/07/2023] Open
Abstract
Background: Large-scale sequencing experiments are complex and require a wide spectrum of computational tools to extract and interpret relevant biological information. This is especially true in projects where individual processing and integrated analysis of both small RNA and complementary RNA data are needed. Such studies would benefit from a computational workflow that is easy to implement and standardizes the processing and analysis of both sequenced data types. Results: We developed SePIA (Sequence Processing, Integration, and Analysis), a comprehensive small RNA and RNA workflow. It provides ready execution for over 20 commonly known RNA-seq tools on top of an established workflow engine and provides a dynamic pipeline architecture to manage, individually analyze, and integrate both small RNA and RNA data. Implementation with Docker makes SePIA portable and easy to run. We demonstrate the workflow's extensive utility with two case studies involving three breast cancer datasets. SePIA is straightforward to configure and organizes results into a perusable HTML report. Furthermore, the underlying pipeline engine supports computational resource management for optimal performance. Conclusion: SePIA is an open-source workflow introducing standardized processing and analysis of RNA and small RNA data. SePIA's modular design enables robust customization to a given experiment while maintaining overall workflow structure. It is available at http://anduril.org/sepia.
Affiliation(s)
- Katherine Icay
  - Research Programs Unit, Genome-Scale Biology, Medicum and Department of Biochemistry and Developmental Biology, Faculty of Medicine, University of Helsinki, POB 63, Helsinki, 00014 Finland
- Ping Chen
  - Research Programs Unit, Genome-Scale Biology, Medicum and Department of Biochemistry and Developmental Biology, Faculty of Medicine, University of Helsinki, POB 63, Helsinki, 00014 Finland
- Alejandra Cervera
  - Research Programs Unit, Genome-Scale Biology, Medicum and Department of Biochemistry and Developmental Biology, Faculty of Medicine, University of Helsinki, POB 63, Helsinki, 00014 Finland
- Ville Rantanen
  - Research Programs Unit, Genome-Scale Biology, Medicum and Department of Biochemistry and Developmental Biology, Faculty of Medicine, University of Helsinki, POB 63, Helsinki, 00014 Finland
- Rainer Lehtonen
  - Research Programs Unit, Genome-Scale Biology, Medicum and Department of Biochemistry and Developmental Biology, Faculty of Medicine, University of Helsinki, POB 63, Helsinki, 00014 Finland
- Sampsa Hautaniemi
  - Research Programs Unit, Genome-Scale Biology, Medicum and Department of Biochemistry and Developmental Biology, Faculty of Medicine, University of Helsinki, POB 63, Helsinki, 00014 Finland