1
Bulashevska A, Nacsa Z, Lang F, Braun M, Machyna M, Diken M, Childs L, König R. Artificial intelligence and neoantigens: paving the path for precision cancer immunotherapy. Front Immunol 2024; 15:1394003. [PMID: 38868767 PMCID: PMC11167095 DOI: 10.3389/fimmu.2024.1394003] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/29/2024] [Accepted: 05/13/2024] [Indexed: 06/14/2024]
Abstract
Cancer immunotherapy has witnessed rapid advancement in recent years, with a particular focus on neoantigens as promising targets for personalized treatments. The convergence of immunogenomics, bioinformatics, and artificial intelligence (AI) has propelled the development of innovative neoantigen discovery tools and pipelines. These tools have revolutionized our ability to identify tumor-specific antigens, providing the foundation for precision cancer immunotherapy. AI-driven algorithms can process extensive amounts of data, identify patterns, and make predictions that were once challenging to achieve. However, the integration of AI comes with its own set of challenges, leaving space for further research. Focusing on computational approaches, this article explores the current landscape of neoantigen prediction, the fundamental concepts behind it, and the challenges and their potential solutions, providing a comprehensive overview of this rapidly evolving field.
Affiliation(s)
- Alla Bulashevska
  - Host-Pathogen-Interactions, Paul-Ehrlich-Institut, Langen, Germany
- Zsófia Nacsa
  - Host-Pathogen-Interactions, Paul-Ehrlich-Institut, Langen, Germany
- Franziska Lang
  - TRON - Translational Oncology at the University Medical Center of the Johannes Gutenberg University gGmbH, Mainz, Germany
- Markus Braun
  - Host-Pathogen-Interactions, Paul-Ehrlich-Institut, Langen, Germany
- Martin Machyna
  - Host-Pathogen-Interactions, Paul-Ehrlich-Institut, Langen, Germany
- Mustafa Diken
  - TRON - Translational Oncology at the University Medical Center of the Johannes Gutenberg University gGmbH, Mainz, Germany
- Liam Childs
  - Host-Pathogen-Interactions, Paul-Ehrlich-Institut, Langen, Germany
- Renate König
  - Host-Pathogen-Interactions, Paul-Ehrlich-Institut, Langen, Germany
2
Cabello-Aguilar S, Vendrell JA, Solassol J. A Bioinformatics Toolkit for Next-Generation Sequencing in Clinical Oncology. Curr Issues Mol Biol 2023; 45:9737-9752. [PMID: 38132454 PMCID: PMC10741970 DOI: 10.3390/cimb45120608] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/06/2023] [Revised: 11/28/2023] [Accepted: 12/02/2023] [Indexed: 12/23/2023]
Abstract
Next-generation sequencing (NGS) has taken on major importance in clinical oncology practice. With the advent of targeted therapies capable of effectively targeting specific genomic alterations in cancer patients, the development of bioinformatics processes has become crucial. Thus, bioinformatics pipelines play an essential role not only in the detection and identification of molecular alterations from NGS data but also in the analysis and interpretation of variants, making it possible to transform raw sequencing data into meaningful and clinically useful information. In this review, we examine the multiple steps of a bioinformatics pipeline as used in current clinical practice, and we also provide an updated list of the necessary bioinformatics tools. This resource is intended to assist researchers and clinicians in their genetic data analyses, improving the precision and efficiency of these processes in clinical research and patient care.
Affiliation(s)
- Simon Cabello-Aguilar
  - Montpellier BioInformatics for Clinical Diagnosis (MOBIDIC), Molecular Medicine and Genomics Platform (PMMG), CHU Montpellier, 34295 Montpellier, France
  - Laboratoire de Biologie des Tumeurs Solides, Département de Pathologie et Oncobiologie, CHU Montpellier, Université de Montpellier, 34295 Montpellier, France
- Julie A. Vendrell
  - Laboratoire de Biologie des Tumeurs Solides, Département de Pathologie et Oncobiologie, CHU Montpellier, Université de Montpellier, 34295 Montpellier, France
- Jérôme Solassol
  - Laboratoire de Biologie des Tumeurs Solides, Département de Pathologie et Oncobiologie, CHU Montpellier, Université de Montpellier, 34295 Montpellier, France
3
Xiang X, Lu B, Song D, Li J, Shu K, Pu D. Evaluating the performance of low-frequency variant calling tools for the detection of variants from short-read deep sequencing data. Sci Rep 2023; 13:20444. [PMID: 37993475 PMCID: PMC10665316 DOI: 10.1038/s41598-023-47135-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/11/2023] [Accepted: 11/09/2023] [Indexed: 11/24/2023]
Abstract
Detection of low-frequency variants with high accuracy plays an important role in biomedical research and clinical practice. However, it is challenging to do so with next-generation sequencing (NGS) approaches due to the high error rates of NGS. To accurately distinguish low-level true variants from these errors, many statistical variant-calling tools for low-frequency variants have been proposed, but a systematic performance comparison of these tools has not yet been performed. Here, we evaluated four raw-reads-based variant callers (SiNVICT, outLyzer, Pisces, and LoFreq) and four UMI-based variant callers (DeepSNVMiner, MAGERI, smCounter2, and UMI-VarCal) considering their capability to call single nucleotide variants (SNVs) with allelic frequency as low as 0.025% in deep sequencing data. We analyzed a total of 54 simulated datasets with various sequencing depths and variant allele frequencies (VAFs), two reference datasets, and Horizon Tru-Q sample data. The results showed that the UMI-based callers, except smCounter2, outperformed the raw-reads-based callers regarding detection limit. Sequencing depth had almost no effect on the UMI-based callers but significantly influenced the raw-reads-based callers. Regardless of the sequencing depth, MAGERI showed the fastest analysis, while smCounter2 consistently took the longest to finish the variant calling process. Overall, DeepSNVMiner and UMI-VarCal performed the best, with sensitivity and precision of 88% and 100% for DeepSNVMiner and 84% and 100% for UMI-VarCal. In conclusion, the UMI-based callers, except smCounter2, outperformed the raw-reads-based callers in terms of sensitivity and precision. We recommend using DeepSNVMiner and UMI-VarCal for low-frequency variant detection. The results provide important information regarding future directions for reliable low-frequency variant detection and algorithm development, which is critical in genetics-based medical research and clinical applications.
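To make the scale of a 0.025% allelic frequency concrete, the arithmetic behind a variant allele frequency can be sketched as below. This is an illustrative calculation only, not part of the cited study: the function names and the example depths are assumptions, and the simple "depth × VAF" expectation ignores sequencing error, UMI collapsing, and sampling noise.

```python
def observed_vaf(alt_reads: int, depth: int) -> float:
    """Variant allele frequency: fraction of reads at a site that carry the variant."""
    if depth <= 0:
        raise ValueError("depth must be positive")
    return alt_reads / depth


def expected_alt_reads(depth: int, vaf: float) -> float:
    """Expected number of variant-supporting reads at a given depth (no noise model)."""
    return depth * vaf


# 0.025% written as a fraction; this is the lowest VAF mentioned in the abstract.
LOWEST_VAF = 0.025 / 100

# Hypothetical depths, chosen only to illustrate the scaling.
for depth in (1_000, 10_000, 100_000):
    print(f"depth={depth:>7}  expected alt reads={expected_alt_reads(depth, LOWEST_VAF):.2f}")
```

Under these assumptions, a site needs several thousand reads before even one variant-supporting read is expected at this frequency, which is one way to see why very deep sequencing is a precondition for this kind of detection.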
Affiliation(s)
- Xudong Xiang
  - Chongqing Key Laboratory of Big Data for Bio Intelligence, Chongqing University of Posts and Telecommunications, Chongqing, 400065, China
- Bowen Lu
  - Chongqing Key Laboratory of Big Data for Bio Intelligence, Chongqing University of Posts and Telecommunications, Chongqing, 400065, China
- Dongyang Song
  - Chongqing Key Laboratory of Big Data for Bio Intelligence, Chongqing University of Posts and Telecommunications, Chongqing, 400065, China
- Jie Li
  - Chongqing Key Laboratory of Big Data for Bio Intelligence, Chongqing University of Posts and Telecommunications, Chongqing, 400065, China
- Kunxian Shu
  - Chongqing Key Laboratory of Big Data for Bio Intelligence, Chongqing University of Posts and Telecommunications, Chongqing, 400065, China
- Dan Pu
  - Chongqing Key Laboratory of Big Data for Bio Intelligence, Chongqing University of Posts and Telecommunications, Chongqing, 400065, China
4
De Paolis E, Perrucci A, Marchetti C, Pietragalla A, Scambia G, Urbani A, Fagotti A, Minucci A. BRCA testing on buccal swab to improve access to healthcare and cancer prevention: a performance evaluation. Int J Gynecol Cancer 2022; 32:ijgc-2022-003718. [PMID: 36028233 DOI: 10.1136/ijgc-2022-003718] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/03/2022]
Abstract
OBJECTIVE: BRCA1/2 (BRCA) genetic testing allows patients with high-grade serous ovarian cancer to receive appropriate medical management with molecular target therapy and prevention strategies. Most of the BRCA sequencing methods use blood as the primary source of germline DNA. Buccal swab emerged as an alternative collection device due to its convenient and non-invasive characteristics. This study assessed the suitability of buccal swabs as the DNA source in next-generation sequencing-based BRCA genotyping. METHODS: Matched buccal swabs and blood samples were collected from 51 patients with high-grade serous ovarian cancer, including 29 BRCA-mutated patients, from June to December 2021. Buccal swabs were self-collected using COPAN FLOQSwabs hDNA Free. BRCA genes were amplified using Devyser's BRCA next-generation sequencing kit and sequenced on the Illumina MiSeq platform. We evaluated collection and extraction procedures, amplification and sequencing performances, coverage data, blood/swab variant calling concordance, and interpretation. RESULTS: Comparable sequencing parameters were observed between the two sample types in terms of mean total number of reads passing filter per indexed sample (p>0.05) and sequencing coverage distribution, with a widespread overlap of mean depth of coverage per target region between blood and swab samples. An overall concordance of 100% in calling both polymorphisms and pathogenic variants was observed between the two DNA sources, including the copy number variation prediction. CONCLUSIONS: Data from this study support the use of buccal swabs as an alternative source of DNA for BRCA evaluation. The use of this alternative delivery mode of BRCA testing may facilitate access to care without compromising patient outcomes.
Affiliation(s)
- Elisa De Paolis
  - Fondazione Policlinico Universitario Agostino Gemelli IRCCS, Rome, Italy
  - Departmental Unit of Molecular and Genomic Diagnostics, Fondazione Policlinico Universitario Agostino Gemelli IRCCS, Rome, Italy
- Alessia Perrucci
  - Fondazione Policlinico Universitario Agostino Gemelli IRCCS, Rome, Italy
  - Departmental Unit of Molecular and Genomic Diagnostics, Fondazione Policlinico Universitario Agostino Gemelli IRCCS, Rome, Italy
- Claudia Marchetti
  - Fondazione Policlinico Universitario Agostino Gemelli IRCCS, Rome, Italy
  - Università Cattolica del Sacro Cuore, Rome, Italy
- Giovanni Scambia
  - Fondazione Policlinico Universitario Agostino Gemelli IRCCS, Rome, Italy
  - Università Cattolica del Sacro Cuore, Rome, Italy
- Andrea Urbani
  - Fondazione Policlinico Universitario Agostino Gemelli IRCCS, Rome, Italy
  - Università Cattolica del Sacro Cuore, Rome, Italy
- Anna Fagotti
  - Fondazione Policlinico Universitario Agostino Gemelli IRCCS, Rome, Italy
  - Università Cattolica del Sacro Cuore, Rome, Italy
- Angelo Minucci
  - Fondazione Policlinico Universitario Agostino Gemelli IRCCS, Rome, Italy
  - Departmental Unit of Molecular and Genomic Diagnostics, Fondazione Policlinico Universitario Agostino Gemelli IRCCS, Rome, Italy
5
Al Salameen F, Habibi N, Al Amad S, Al Doaij B. Genetic Diversity of Rhanterium eppaposum Oliv. Populations in Kuwait as Revealed by GBS. PLANTS (BASEL, SWITZERLAND) 2022; 11:1435. [PMID: 35684208 PMCID: PMC9183190 DOI: 10.3390/plants11111435] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 04/10/2022] [Revised: 05/14/2022] [Accepted: 05/24/2022] [Indexed: 06/15/2023]
Abstract
Natural populations of Rhanterium eppaposum Oliv. (Arfaj), a perennial forage shrub, have been depleted in Kuwait due to unethical human interventions and climate change. Therefore, there is an urgent need to conserve this native plant through the assessment of its genetic diversity and population structure. Genotyping by sequencing (GBS) has recently emerged as a powerful tool for the molecular diversity analysis of higher plants without prior knowledge of their genome. This study represents the first effort in using GBS to discover genome-wide single nucleotide polymorphisms (SNPs) of local Rhanterium plants to assess the genetic diversity present in landraces collected from six different locations in Kuwait. The study generated a novel set of 11,231 SNPs and indels (insertions and deletions) in 98 genotypes of Rhanterium. The analysis of molecular variance (AMOVA) revealed ~1.5% variation residing among the six populations, ~5% among the individuals within the population and 93% variation present within the populations (FST = 0.029; p = 0.0). Bayesian and UPGMA analyses identified two admixed clusters of the tested samples; however, the principal coordinates analysis returned the complete population as a single group. Mantel's test returned a very weak correlation coefficient of r² = 0.101 (p = 0.00) between the geographic and genetic distance. These findings are useful for formulating conservation strategies for this native species, supporting its sustainable management and desert rehabilitation.
6
Cagirici HB, Akpinar BA, Sen TZ, Budak H. Multiple Variant Calling Pipelines in Wheat Whole Exome Sequencing. Int J Mol Sci 2021; 22:10400. [PMID: 34638743 PMCID: PMC8509018 DOI: 10.3390/ijms221910400] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 08/22/2021] [Revised: 09/11/2021] [Accepted: 09/23/2021] [Indexed: 11/30/2022]
Abstract
The highly challenging hexaploid wheat (Triticum aestivum) genome is becoming ever more accessible due to the continued development of multiple reference genomes, a factor which aids the effort to better understand variation in important traits. Although the process of variant calling is relatively straightforward, selecting the best combination of computational tools for the read alignment and variant calling stages of the analysis, and efficiently filtering false variant calls, are not always easy tasks. Previous studies have analyzed the impact of methods on the quality metrics in diploid organisms. Given that variant identification in wheat largely relies on accurate mining of exome data, there is a critical need to better understand how different methods affect the analysis of whole exome sequencing (WES) data in polyploid species. This study aims to address this by performing whole exome sequencing of 48 wheat cultivars and assessing the performance of various variant calling pipelines at their suggested settings. The results show that all the pipelines require filtering to eliminate false-positive calls. The high consensus among the reference SNPs called by the best-performing pipelines suggests that filtering provides accurate and reproducible results. This study also provides detailed comparisons for high sensitivity and precision at individual and population levels for the raw and filtered SNP calls.
Affiliation(s)
- H. Busra Cagirici
  - Crop Improvement and Genetics Research Unit, Western Regional Research Center, U.S. Department of Agriculture, Agricultural Research Service, Albany, CA 94710, USA
- Bala Ani Akpinar
  - Department of Genomics and Genome Editing, Montana BioAgriculture Inc., Missoula, MT 59802, USA
- Taner Z. Sen
  - Crop Improvement and Genetics Research Unit, Western Regional Research Center, U.S. Department of Agriculture, Agricultural Research Service, Albany, CA 94710, USA
- Hikmet Budak
  - Department of Genomics and Genome Editing, Montana BioAgriculture Inc., Missoula, MT 59802, USA
7
Targeted Sequencing Identifies the Genetic Variants Associated with High-altitude Polycythemia in the Tibetan Population. Indian J Hematol Blood Transfus 2021; 38:556-565. [PMID: 35747576 PMCID: PMC9209555 DOI: 10.1007/s12288-021-01474-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/05/2021] [Accepted: 07/12/2021] [Indexed: 12/01/2022]
Abstract
High-altitude polycythemia (HAPC) is characterized by excessive proliferation of erythrocytes, resulting from the hypobaric hypoxia of high altitude. The genetic variants and molecular mechanisms of HAPC remain unclear in highlanders. We recruited 141 Tibetan dwellers, including 70 HAPC patients and 71 healthy controls, to detect possible genetic variants associated with the disease. We performed targeted sequencing of 529 genes associated with oxygen metabolism and erythrocyte regulation, and used unconditional logistic regression analysis and GO (gene ontology) analysis to investigate the genetic variations of HAPC. We identified 12 single nucleotide variants, harbored in 12 genes, associated with the risk of HAPC (4.7 ≤ odds ratio ≤ 13.6; 7.6E-08 ≤ p-value ≤ 1E-04). The pathway enrichment study of these genes indicated that three pathways, the PI3K-AKT pathway, JAK-STAT pathway, and HIF-1 pathway, are essential, with p-values of 3.70E-08, 1.28E-07, and 3.98E-06, respectively. We are hopeful that our results will provide a reference for the etiology research of HAPC. However, additional genetic risk factors and functional investigations are necessary to confirm our results further.
8
Zanti M, Michailidou K, Loizidou MA, Machattou C, Pirpa P, Christodoulou K, Spyrou GM, Kyriacou K, Hadjisavvas A. Performance evaluation of pipelines for mapping, variant calling and interval padding, for the analysis of NGS germline panels. BMC Bioinformatics 2021; 22:218. [PMID: 33910496 PMCID: PMC8080428 DOI: 10.1186/s12859-021-04144-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/16/2021] [Accepted: 04/15/2021] [Indexed: 11/10/2022]
Abstract
Background: Next-generation sequencing (NGS) represents a significant advancement in clinical genetics. However, its use creates several technical, data interpretation and management challenges. It is essential to follow a consistent data analysis pipeline to achieve the highest possible accuracy and avoid false variant calls. Herein, we aimed to compare the performance of twenty-eight combinations of NGS data analysis pipeline compartments, including short-read mapping (BWA-MEM, Bowtie2, Stampy), variant calling (GATK-HaplotypeCaller, GATK-UnifiedGenotyper, SAMtools) and interval padding (null, 50 bp, 100 bp) methods, along with a commercially available pipeline (BWA Enrichment, Illumina®). Fourteen germline DNA samples from breast cancer patients were sequenced using a targeted NGS panel approach and subjected to data analysis. Results: We highlight that interval padding is required for the accurate detection of intronic variants, including spliceogenic pathogenic variants (PVs). In addition, using nearly default parameters, the BWA Enrichment algorithm failed to detect these spliceogenic PVs and a missense PV in the TP53 gene. We also recommend the BWA-MEM algorithm for sequence alignment, whereas variant calling should be performed using a combination of variant calling algorithms: GATK-HaplotypeCaller and SAMtools for the accurate detection of insertions/deletions, and GATK-UnifiedGenotyper for the efficient detection of single nucleotide variants. Conclusions: These findings have important implications for the identification of clinically actionable variants through panel testing in a clinical laboratory setting, where dedicated bioinformatics personnel might not always be available. The results also reveal the necessity of improving the existing tools and/or developing new pipelines to generate more reliable and more consistent data.
Supplementary Information The online version contains supplementary material available at 10.1186/s12859-021-04144-1.
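The idea of interval padding mentioned in this abstract (extending each target region by, for example, 50 bp or 100 bp so that flanking intronic positions are also analyzed) can be illustrated with a minimal sketch. This is not the cited pipeline's implementation: the `pad_intervals` helper, the tuple layout, and the coordinates are hypothetical, intervals are assumed to be 0-based half-open as in BED files, and clipping to chromosome ends is omitted.

```python
from typing import List, Tuple

# (chromosome, start, end), assumed 0-based half-open as in the BED convention
Interval = Tuple[str, int, int]


def pad_intervals(intervals: List[Interval], padding: int) -> List[Interval]:
    """Extend each interval by `padding` bp on both sides, then merge overlaps."""
    # Pad, clamping the start at 0 (chromosome-end clipping is not handled here).
    padded = sorted((chrom, max(0, start - padding), end + padding)
                    for chrom, start, end in intervals)
    merged: List[Interval] = []
    for chrom, start, end in padded:
        if merged and merged[-1][0] == chrom and start <= merged[-1][2]:
            # Overlaps or touches the previous interval on the same chromosome.
            prev_chrom, prev_start, prev_end = merged[-1]
            merged[-1] = (prev_chrom, prev_start, max(prev_end, end))
        else:
            merged.append((chrom, start, end))
    return merged


# Hypothetical exon targets, for illustration only.
targets = [("chr17", 100, 200), ("chr17", 260, 300)]
print(pad_intervals(targets, 0))   # unpadded: two separate targets
print(pad_intervals(targets, 50))  # padded: the flanks overlap and merge
```

With 50 bp of padding, the two hypothetical exons above merge into a single region that covers the short intron between them, which is the kind of flanking sequence where splice-site variants would otherwise fall outside the analyzed intervals.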
Affiliation(s)
- Maria Zanti
  - Department of Electron Microscopy/Molecular Pathology, The Cyprus Institute of Neurology and Genetics, 2371, Nicosia, Cyprus
  - Cyprus School of Molecular Medicine, 2371, Nicosia, Cyprus
  - Bioinformatics Department, The Cyprus Institute of Neurology and Genetics, 2371, Nicosia, Cyprus
- Kyriaki Michailidou
  - Cyprus School of Molecular Medicine, 2371, Nicosia, Cyprus
  - Biostatistics Unit, The Cyprus Institute of Neurology and Genetics, 2371, Nicosia, Cyprus
- Maria A Loizidou
  - Department of Electron Microscopy/Molecular Pathology, The Cyprus Institute of Neurology and Genetics, 2371, Nicosia, Cyprus
  - Cyprus School of Molecular Medicine, 2371, Nicosia, Cyprus
- Christina Machattou
  - Department of Electron Microscopy/Molecular Pathology, The Cyprus Institute of Neurology and Genetics, 2371, Nicosia, Cyprus
- Panagiota Pirpa
  - Department of Electron Microscopy/Molecular Pathology, The Cyprus Institute of Neurology and Genetics, 2371, Nicosia, Cyprus
- Kyproula Christodoulou
  - Cyprus School of Molecular Medicine, 2371, Nicosia, Cyprus
  - Neurogenetics Department, The Cyprus Institute of Neurology and Genetics, 2371, Nicosia, Cyprus
- George M Spyrou
  - Cyprus School of Molecular Medicine, 2371, Nicosia, Cyprus
  - Bioinformatics Department, The Cyprus Institute of Neurology and Genetics, 2371, Nicosia, Cyprus
- Kyriacos Kyriacou
  - Department of Electron Microscopy/Molecular Pathology, The Cyprus Institute of Neurology and Genetics, 2371, Nicosia, Cyprus
  - Cyprus School of Molecular Medicine, 2371, Nicosia, Cyprus
- Andreas Hadjisavvas
  - Department of Electron Microscopy/Molecular Pathology, The Cyprus Institute of Neurology and Genetics, 2371, Nicosia, Cyprus
  - Cyprus School of Molecular Medicine, 2371, Nicosia, Cyprus
9
Molina-Mora JA, Solano-Vargas M. Set-theory based benchmarking of three different variant callers for targeted sequencing. BMC Bioinformatics 2021; 22:20. [PMID: 33413082 PMCID: PMC7791862 DOI: 10.1186/s12859-020-03926-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/24/2019] [Accepted: 12/09/2020] [Indexed: 12/05/2022]
Abstract
Background: Next generation sequencing (NGS) technologies have improved the study of hereditary diseases. Since the evaluation of bioinformatics pipelines is not straightforward, NGS demands effective strategies to analyze data that is of paramount relevance for decision making under a clinical scenario. According to the benchmarking framework of the Global Alliance for Genomics and Health (GA4GH), we implemented a new simple and user-friendly set-theory based method to assess variant callers using a gold standard variant set and high confidence regions. As a model, we used TruSight Cardio kit sequencing data of the reference genome NA12878. This targeted sequencing kit is used to identify variants in key genes related to Inherited Cardiac Conditions (ICCs), a group of cardiovascular diseases with high rates of morbidity and mortality. Results: We implemented and compared three variant calling pipelines (Isaac, Freebayes, and VarScan). Performance metrics using our set-theory approach showed high-resolution pipelines and revealed: (1) a perfect recall of 1.000 for all three pipelines, (2) very high precision values, i.e. 0.987 for Freebayes, 0.928 for VarScan, and 1.000 for Isaac, when compared with the reference material, and (3) a ROC curve analysis with AUC > 0.94 for all cases. Moreover, significant differences were obtained between the three pipelines. In general, results indicate that the three pipelines were able to recognize the expected variants in the gold standard data set. Conclusions: Our set-theory approach to calculating metrics showed that the three selected pipelines identified the expected ICC-related variants, but results were completely dependent on the algorithms. We emphasize the importance of assessing pipelines using gold standard materials to achieve the most reliable results for clinical application.
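The set-theory view of benchmarking described in this abstract can be illustrated with plain set operations: true positives are the intersection of called and gold-standard variants, false positives and false negatives are the two set differences. The sketch below is a generic illustration and not the authors' implementation; the `benchmark_calls` function, the variant tuple layout, and the toy variants are hypothetical, and both sets are assumed to be already restricted to the high-confidence regions.

```python
from typing import Dict, Set, Tuple

# A variant keyed as (chromosome, position, reference allele, alternate allele)
Variant = Tuple[str, int, str, str]


def benchmark_calls(called: Set[Variant], truth: Set[Variant]) -> Dict[str, float]:
    """Compare a call set with a gold-standard set using set operations."""
    true_pos = called & truth    # called and present in the gold standard
    false_pos = called - truth   # called but absent from the gold standard
    false_neg = truth - called   # in the gold standard but not called
    precision = len(true_pos) / len(called) if called else 0.0
    recall = len(true_pos) / len(truth) if truth else 0.0
    return {
        "TP": len(true_pos),
        "FP": len(false_pos),
        "FN": len(false_neg),
        "precision": precision,
        "recall": recall,
    }


# Toy data, invented purely for illustration.
truth = {("chr1", 100, "A", "G"), ("chr1", 250, "C", "T"), ("chr2", 75, "G", "A")}
called = truth | {("chr2", 900, "T", "C")}  # every true variant plus one extra call

print(benchmark_calls(called, truth))
```

In this toy case the caller recovers every gold-standard variant (recall 1.0) but makes one extra call (precision 0.75), the same pattern of perfect recall with slightly lower precision that the abstract reports for some pipelines.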
Affiliation(s)
- Jose Arturo Molina-Mora
  - Centro de Investigación en Enfermedades Tropicales (CIET) and Facultad de Microbiología, Universidad de Costa Rica (UCR), San José, Costa Rica
  - Centro de Investigaciones en Hematología y Transtornos Afines (CIHATA), Universidad de Costa Rica (UCR), San José, Costa Rica
- Mariela Solano-Vargas
  - Centro de Investigaciones en Hematología y Transtornos Afines (CIHATA), Universidad de Costa Rica (UCR), San José, Costa Rica
10
Gopanenko AV, Kosobokova EN, Kosorukov VS. Main Strategies for the Identification of Neoantigens. Cancers (Basel) 2020; 12:E2879. [PMID: 33036391 PMCID: PMC7600129 DOI: 10.3390/cancers12102879] [Citation(s) in RCA: 27] [Impact Index Per Article: 6.8] [Received: 09/04/2020] [Revised: 10/01/2020] [Accepted: 10/05/2020] [Indexed: 12/24/2022]
Abstract
Genetic instability of tumors leads to the appearance of numerous tumor-specific somatic mutations that could potentially result in the production of mutated peptides that are presented on the cell surface by the MHC molecules. Peptides of this kind are commonly called neoantigens. Their presence on the cell surface specifically distinguishes tumors from healthy tissues. This feature makes neoantigens a promising target for immunotherapy. The rapid evolution of high-throughput genomics and proteomics makes it possible to implement these techniques in clinical practice. In particular, they provide useful tools for the investigation of neoantigens. The most valuable genomic approach to this problem is whole-exome sequencing coupled with RNA-seq. High-throughput mass-spectrometry is another option for direct identification of MHC-bound peptides, which is capable of revealing the entire MHC-bound peptidome. Finally, structure-based predictions could significantly improve the understanding of physicochemical and structural features that affect the immunogenicity of peptides. The development of pipelines combining such tools could improve the accuracy of the peptide selection process and decrease the required time. Here we present a review of the main existing approaches to investigating the neoantigens and suggest a possible ideal pipeline that takes into account all modern trends in the context of neoantigen discovery.
Affiliation(s)
- Vyacheslav S. Kosorukov
  - N.N. Blokhin National Medical Research Center of Oncology, Ministry of Health of the Russian Federation, 115478 Moscow, Russia
11
Pei S, Liu T, Ren X, Li W, Chen C, Xie Z. Benchmarking variant callers in next-generation and third-generation sequencing analysis. Brief Bioinform 2020; 22:5875142. [PMID: 32698196 DOI: 10.1093/bib/bbaa148] [Citation(s) in RCA: 37] [Impact Index Per Article: 9.3] [Received: 03/11/2020] [Revised: 06/11/2020] [Accepted: 06/12/2020] [Indexed: 12/15/2022]
Abstract
DNA variants represent an important source of genetic variations among individuals. Next-generation sequencing (NGS) is the most popular technology for genome-wide variant calling. Third-generation sequencing (TGS) has also recently been used in genetic studies. Although many variant callers are available, no single caller can call both types of variants on NGS or TGS data with high sensitivity and specificity. In this study, we systematically evaluated 11 variant callers on 12 NGS and TGS datasets. For germline variant calling, we tested DNAseq and DNAscope modes from Sentieon, HaplotypeCaller mode from GATK and WGS mode from DeepVariant. All four callers had comparable performance on NGS data, and 30× coverage of WGS data was recommended. For germline variant calling on TGS data, we tested DNAseq mode from Sentieon, HaplotypeCaller mode from GATK and PACBIO mode from DeepVariant. All three callers had similar performance in SNP calling, while DeepVariant outperformed the others in InDel calling. TGS detected more variants than NGS, particularly in complex and repetitive regions. For somatic variant calling on NGS, we tested TNscope and TNseq modes from Sentieon, Mutect2 mode from GATK, NeuSomatic, VarScan2, and Strelka2. TNscope and Mutect2 outperformed the other callers. A higher tumor sample purity (from 10% to 20%) significantly increased the recall of calling. Finally, the computational costs of the callers were compared, and Sentieon required the least computational cost. These results suggest that careful selection of a tool and parameters is needed for accurate SNP or InDel calling under different scenarios.
Affiliation(s)
- Surui Pei
  - Zhongshan Ophthalmic Center at Sun Yat-sen University and Annoroad Gene Technology (Beijing) Co., Ltd
- Tao Liu
  - Annoroad Gene Technology (Beijing) Co., Ltd
- Xue Ren
  - Annoroad Gene Technology (Beijing) Co., Ltd
- Weizhong Li
  - Zhongshan School of Medicine at Sun Yat-sen University
- Zhi Xie
  - Zhongshan Ophthalmic Center at Sun Yat-sen University
12
Dissanayake R, Braich S, Cogan NOI, Smith K, Kaur S. Characterization of Genetic and Allelic Diversity Amongst Cultivated and Wild Lentil Accessions for Germplasm Enhancement. Front Genet 2020; 11:546. [PMID: 32587602 PMCID: PMC7298104 DOI: 10.3389/fgene.2020.00546] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Received: 12/09/2019] [Accepted: 05/06/2020] [Indexed: 12/13/2022]
Abstract
Intensive breeding of cultivated lentil has resulted in a relatively narrow genetic base, which limits the options to increase crop productivity through selection. Assessment of genetic diversity in the wild gene pool of lentil, as well as characterization of useful and novel alleles/genes that can be introgressed into elite germplasm, presents new opportunities and pathways for germplasm enhancement, followed by successful crop improvement. In the current study, a lentil collection consisting of 467 wild and cultivated accessions that originated from 10 diverse geographical regions was assessed, to understand genetic relationships among different lentil species/subspecies. A total of 422,101 high-confidence SNP markers were identified against the reference lentil genome (cv. CDC Redberry). Phylogenetic analysis clustered the germplasm collection into four groups, namely, Lens culinaris/Lens orientalis, Lens lamottei/Lens odemensis, Lens ervoides, and Lens nigricans. A weak correlation was observed between geographical origin and genetic relationship, except for some accessions of L. culinaris and L. ervoides. Genetic distance matrices revealed a comparable level of variation within the gene pools of L. culinaris (Nei’s coefficient 0.01468–0.71163), L. ervoides (Nei’s coefficient 0.01807–0.71877), and L. nigricans (Nei’s coefficient 0.02188–1.2219). In order to understand any genic differences at species/subspecies level, allele frequencies were calculated from a subset of 263 lentil accessions. Among all cultivated and wild lentil species, L. nigricans exhibited the greatest allelic differentiation across the genome compared to all other species/subspecies. Major differences were observed on six genomic regions with the largest being on Chromosome 1 (c. 1 Mbp). These results indicate that L. nigricans is the most distantly related to L. culinaris and additional structural variations are likely to be identified from genome sequencing studies. 
This would provide further insights into evolutionary relationships between cultivated and wild lentil germplasm, for germplasm improvement and introgression.
Affiliation(s)
- Ruwani Dissanayake
- Agriculture Victoria, AgriBio, Centre for AgriBioscience, Bundoora, VIC, Australia; Faculty of Veterinary and Agricultural Sciences, The University of Melbourne, Parkville, VIC, Australia
| | - Shivraj Braich
- Agriculture Victoria, AgriBio, Centre for AgriBioscience, Bundoora, VIC, Australia; School of Applied Systems Biology, La Trobe University, Melbourne, VIC, Australia
| | - Noel O I Cogan
- Agriculture Victoria, AgriBio, Centre for AgriBioscience, Bundoora, VIC, Australia; School of Applied Systems Biology, La Trobe University, Melbourne, VIC, Australia
| | - Kevin Smith
- Faculty of Veterinary and Agricultural Sciences, The University of Melbourne, Parkville, VIC, Australia; Agriculture Victoria, Hamilton, VIC, Australia
| | - Sukhjiwan Kaur
- Agriculture Victoria, AgriBio, Centre for AgriBioscience, Bundoora, VIC, Australia
| |
|
13
|
Roudko V, Greenbaum B, Bhardwaj N. Computational Prediction and Validation of Tumor-Associated Neoantigens. Front Immunol 2020; 11:27. [PMID: 32117226 PMCID: PMC7025577 DOI: 10.3389/fimmu.2020.00027] [Citation(s) in RCA: 67] [Impact Index Per Article: 16.8] [Received: 08/23/2019] [Accepted: 01/08/2020] [Indexed: 12/30/2022] Open
Abstract
Tumor progression is typically accompanied by an accumulation of driver and passenger somatic mutations. A handful of those mutations occur in protein coding genes which introduce non-synonymous polymorphisms. Certain substitutions may give rise to novel, tumor-associated antigens or neoantigens, presentable by cancer cells to the host adaptive immune system. As antigen recognition is the core of an effective immune response, the identification of patient tumor specific antigens derived from transformed cells is of importance for immunotherapeutic approaches. Recent technological advances in DNA sequencing of tumor genomes, advances in gene expression analysis, algorithm development for antigen predictions and methods for T-cell receptor (TCR) repertoire sequencing have facilitated the selection of candidate immunogenic neoantigens. In this regard, multiple research groups have reported encouraging results of neoantigen-based cancer vaccines that generate tumor antigen specific immune responses, both in mouse models and clinical trials. Additionally, both the quantity and quality of neoantigens have been shown to have predictive value for clinical outcomes in checkpoint-blockade immunotherapy in certain tumor types. Neoantigen recognition by vaccination or through adoptive T cell therapy may have unprecedented potential to advance cancer immunotherapy in combination with other approaches. In our review we discuss three parameters regarding neoantigens: computational methods for epitope prediction, experimental methods for epitope immunogenicity validation and future directions for improvement of those methods. Within each section, we will describe the advantages and limitations of existing methods as well as highlight pressing fundamental problems to be addressed.
Affiliation(s)
- Vladimir Roudko
- Department of Hematology and Medical Oncology, Icahn School of Medicine at Mount Sinai Hospital, New York, NY, United States
- Center for Computational Immunology, Tisch Cancer Institute, Icahn School of Medicine at Mount Sinai Hospital, New York, NY, United States
| | - Benjamin Greenbaum
- Center for Computational Immunology, Tisch Cancer Institute, Icahn School of Medicine at Mount Sinai Hospital, New York, NY, United States
- Department of Pathology, Icahn School of Medicine at Mount Sinai Hospital, New York, NY, United States
- Department of Oncological Sciences, Icahn School of Medicine at Mount Sinai Hospital, New York, NY, United States
| | - Nina Bhardwaj
- Department of Hematology and Medical Oncology, Icahn School of Medicine at Mount Sinai Hospital, New York, NY, United States
| |
|
14
|
Şener LT, Aktan M, Albeniz G, Şener A, Üstek D, Albeniz I. Identification of red blood cell membrane defects in a patient with hereditary spherocytosis using next‑generation sequencing technology and matrix‑assisted laser desorption/ionization time‑of‑flight mass spectrometry. Mol Med Rep 2019; 19:3912-3922. [PMID: 30896804 DOI: 10.3892/mmr.2019.10036] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 06/22/2018] [Accepted: 02/13/2019] [Indexed: 11/05/2022] Open
Abstract
Hereditary spherocytosis (HS) is characterized by the morphological transformation of erythrocytes into a spherical shape due to a hereditary defect in cell membrane proteins (ghosts) associated with disruption of erythrocyte skeletal structures. Contrary to the literature, pores were detected in the erythrocytes of a patient with HS. The aim of the present study was to determine the affected proteins and genes that were responsible for the pores. Ghost isolation was performed to determine the proteins responsible for the pores observed on the erythrocytes of the patient. Erythrocyte membrane proteins were visualized using SDS‑PAGE. Exome and matrix‑assisted laser desorption/ionization time‑of‑flight mass spectrometry (MALDI TOF MS) analyses were used to identify the genes and proteins responsible for the observed defect. Quantitative protein assessments were performed using MALDI TOF MS. A difference was detected in the components of the erythrocyte membrane proteins. Band 3 and protein 4.2, which serve a particular role in membrane structure, decreased 4.573 and 4.106 fold, respectively. Through proteomic analyses, a non‑synonymous exonic mutation region was identified in the Golgi membrane protein 1 (GOLM1) gene (Chr9 rs142242230). Sorting Intolerant From Tolerant and Polymorphism Phenotyping Scores, Likelihood Ratio Tests and MutationTaster revealed that the mutation was deleterious. The pores observed in the morphology of the erythrocytes may have developed due to the decrease in these proteins, which reside in the erythrocyte membrane structure. Furthermore, genetic profiling of the patient with HS and her family was conducted in the present study. Next‑generation sequencing was used, and the genetic source of HS was identified as a GOLM1 gene mutation. The assessment of specific molecular defects is often not performed as the majority of mutations are unique to a family. 
However, molecular analyses should be performed in severe cases where prenatal diagnosis is required, or for unique HS phenotypes to aid scientific investigation.
Affiliation(s)
- Leyla Türker Şener
- Department of Biophysics, Istanbul Faculty of Medicine, Istanbul University, 34093 Istanbul, Turkey
| | - Melih Aktan
- Department of Hematology, Istanbul Faculty of Medicine, Istanbul University, 34093 Istanbul, Turkey
| | - Gürcan Albeniz
- Department of General Surgery, Cerrahpaşa Faculty of Medicine, Istanbul University Cerrahpaşa, 34096 Istanbul, Turkey
| | - Aziz Şener
- Department of General Surgery, Kanuni Sultan Suleyman Training and Research Hospital, 34303 Istanbul, Turkey
| | - Duran Üstek
- Department of Medical Genetics and REMER, Medipol University, 34810 Istanbul, Turkey
| | - Işıl Albeniz
- Department of Biophysics, Istanbul Faculty of Medicine, Istanbul University, 34093 Istanbul, Turkey
| |
|
15
|
Dębniak T, Scott RJ, Górski B, Masojć B, Kram A, Maleszka R, Cybulski C, Paszkowska-Szczur K, Kashyap A, Murawa D, Malińska K, Kiedrowicz M, Rogoża-Janiszewska E, Rudnicka H, Deptuła J, Domagała P, Kluźniak W, Lener MR, Lubiński J. BRCA1/2 mutations are not a common cause of malignant melanoma in the Polish population. PLoS One 2018; 13:e0204768. [PMID: 30286154 PMCID: PMC6171837 DOI: 10.1371/journal.pone.0204768] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Received: 06/08/2018] [Accepted: 09/13/2018] [Indexed: 12/11/2022] Open
Abstract
The association of BRCA1/2 mutations with melanoma is not completely determined; the interpretation of variants of unknown significance is also problematic. To evaluate these issues we explored the molecular basis of melanoma risk by performing whole-exome sequencing on a cohort of 96 unrelated Polish early-onset melanoma patients and targeted sequencing of BRCA1/2 genes on an additional 30 melanoma patients with familial aggregation of breast and other cancers. Sequencing was performed on peripheral blood. We evaluated the MutationTaster, Polyphen2, SIFT, and PROVEAN algorithms, analyzed segregation with cancer disease (in both families with identified BRCA2 variants), and in one family performed LOH analysis (based on 2 primary tumors). We found neither pathogenic mutations nor variants of unknown significance within BRCA1. We identified two BRCA2 variants of unknown significance: c.9334G>A and c.4534C>T. Disease allele frequency was evaluated by genotyping of 1230 consecutive melanoma cases, 5000 breast cancer patients, 3500 prostate cancers and 9900 controls. Both variants were found to be absent among unselected cancer patients and healthy controls. The MutationTaster, Polyphen2 and SIFT algorithms indicate that c.9334G>A is a damaging variant. Due to lack of tumour tissue, LOH analysis could not be performed for this variant. The variant segregated with the disease. The c.4534C>T variant did not segregate with disease, and there was no LOH of the variant. The c.9334G>A variant, classified as a rare variant of unknown significance, may on current evidence predispose to cancers of the breast and prostate and to melanoma. Functional studies to describe how the DNA change affects the protein function and a large multi-center study to evaluate its penetrance are required.
Affiliation(s)
- Tadeusz Dębniak
- Department of Genetics and Pathomorphology, International Hereditary Cancer Center, Pomeranian Medical University in Szczecin, Szczecin, Poland
| | - Rodney J. Scott
- School of Biomedical Sciences and Pharmacy, Faculty of Health, University of Newcastle and the Hunter Medical Research Institute, Newcastle, New South Wales, Australia
| | - Bohdan Górski
- Department of Genetics and Pathomorphology, International Hereditary Cancer Center, Pomeranian Medical University in Szczecin, Szczecin, Poland
| | | | - Andrzej Kram
- West Pomeranian Oncology Center, Szczecin, Poland
| | | | - Cezary Cybulski
- Department of Genetics and Pathomorphology, International Hereditary Cancer Center, Pomeranian Medical University in Szczecin, Szczecin, Poland
| | - Katarzyna Paszkowska-Szczur
- Department of Genetics and Pathomorphology, International Hereditary Cancer Center, Pomeranian Medical University in Szczecin, Szczecin, Poland
| | - Aniruddh Kashyap
- Department of Genetics and Pathomorphology, International Hereditary Cancer Center, Pomeranian Medical University in Szczecin, Szczecin, Poland
| | - Dawid Murawa
- I Department of Oncological and General Surgery, Greater Poland Cancer Center, Poznań, Poland
| | - Karolina Malińska
- Department of Genetics and Pathomorphology, International Hereditary Cancer Center, Pomeranian Medical University in Szczecin, Szczecin, Poland
| | | | - Emilia Rogoża-Janiszewska
- Department of Genetics and Pathomorphology, International Hereditary Cancer Center, Pomeranian Medical University in Szczecin, Szczecin, Poland
| | - Helena Rudnicka
- Department of Genetics and Pathomorphology, International Hereditary Cancer Center, Pomeranian Medical University in Szczecin, Szczecin, Poland
| | - Jakub Deptuła
- Department of Genetics and Pathomorphology, International Hereditary Cancer Center, Pomeranian Medical University in Szczecin, Szczecin, Poland
| | - Paweł Domagała
- Department of Pathology, Pomeranian Medical University in Szczecin, Szczecin, Poland
| | - Wojciech Kluźniak
- Department of Genetics and Pathomorphology, International Hereditary Cancer Center, Pomeranian Medical University in Szczecin, Szczecin, Poland
| | - Marcin R. Lener
- Department of Genetics and Pathomorphology, International Hereditary Cancer Center, Pomeranian Medical University in Szczecin, Szczecin, Poland
| | - Jan Lubiński
- Department of Genetics and Pathomorphology, International Hereditary Cancer Center, Pomeranian Medical University in Szczecin, Szczecin, Poland
| |
|
16
|
Pranckėnienė L, Jakaitienė A, Ambrozaitytė L, Kavaliauskienė I, Kučinskas V. Insights Into de novo Mutation Variation in Lithuanian Exome. Front Genet 2018; 9:315. [PMID: 30154829 PMCID: PMC6102505 DOI: 10.3389/fgene.2018.00315] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Received: 04/03/2018] [Accepted: 07/24/2018] [Indexed: 01/23/2023] Open
Abstract
In the last decade, one of the biggest challenges in genomics research has been to distinguish definitive pathogenic variants from all likely pathogenic variants identified by next-generation sequencing. This task is particularly complex because of our lack of knowledge regarding overall genome variation and pathogenicity of the variants. Therefore, obtaining sufficient information about genome variants in the general population is necessary as such data could be used for the interpretation of de novo mutations (DNMs) in the context of patient's phenotype in cases of sporadic genetic disease. In this study, data from whole-exome sequencing of the general population in Lithuania were directly examined. In total, 84 (VarScan) and 95 (VarSeq™) DNMs were identified and validated using different algorithms. Thirty-nine of these mutations were considered likely to be pathogenic based on gene function, evolutionary conservation, and mutation impact. The mutation rate estimated per position pair per generation was 2.74 × 10⁻⁸ [95% CI: 2.24 × 10⁻⁸–3.35 × 10⁻⁸] (VarScan) and 2.4 × 10⁻⁸ [95% CI: 1.96 × 10⁻⁸–2.99 × 10⁻⁸] (VarSeq™), with 1.77 × 10⁻⁸ [95% CI: 6.03 × 10⁻⁹–5.2 × 10⁻⁸] de novo indels per position per generation. The rate of germline DNMs in the Lithuanian population and the effects of the genomic and epigenetic context on DNM formation were calculated for the first time in this study, providing a basis for further analysis of DNMs in individuals with genetic diseases. Considering these findings, additional studies in patient groups with genetic diseases with unclear etiology may facilitate our ability to distinguish certain pathogenic or adaptive DNMs from tolerated background DNMs and to reliably identify disease-causing DNMs by their properties through direct observation.
Affiliation(s)
- Laura Pranckėnienė
- Department of Human and Medical Genetics, Institute of Biomedical Sciences, Faculty of Medicine, Vilnius University, Vilnius, Lithuania
| | | | | | | | | |
|
17
|
Einarsdottir BO, Karlsson J, Söderberg EMV, Lindberg MF, Funck-Brentano E, Jespersen H, Brynjolfsson SF, Olofsson Bagge R, Carstam L, Scobie M, Koolmeister T, Wallner O, Stierner U, Berglund UW, Ny L, Nilsson LM, Larsson E, Helleday T, Nilsson JA. A patient-derived xenograft pre-clinical trial reveals treatment responses and a resistance mechanism to karonudib in metastatic melanoma. Cell Death Dis 2018; 9:810. [PMID: 30042422 PMCID: PMC6057880 DOI: 10.1038/s41419-018-0865-6] [Citation(s) in RCA: 33] [Impact Index Per Article: 5.5] [Received: 03/28/2018] [Revised: 06/29/2018] [Accepted: 07/05/2018] [Indexed: 12/19/2022]
Abstract
Karonudib (TH1579) is a novel compound that exerts anti-tumor activities and has recently entered phase I clinical testing. The aim of this study was to conduct a pre-clinical trial in patient-derived xenografts to identify possible biomarkers of response or resistance that could guide inclusion of patients suffering from metastatic melanoma in phase II clinical trials. Patient-derived xenografts from 31 melanoma patients with metastatic disease were treated with karonudib or a vehicle for 18 days. Treatment responses were followed by measuring tumor sizes, and the models were categorized into response groups. Tumors were harvested and processed for RNA sequencing and protein analysis. To investigate the effect of karonudib on T-cell-mediated anti-tumor activities, tumor-infiltrating T cells were injected into mice carrying autologous tumors, and the mice were treated with karonudib. We show that karonudib has a heterogeneous anti-tumor effect on metastatic melanoma. Based on the treatment responses, we could divide the 31 patient-derived xenografts into three treatment groups: progression group (32%), suppression group (42%), and regression group (26%). Furthermore, we show that karonudib has an anti-tumor effect irrespective of major melanoma driver mutations. We also identify high expression of ABCB1, which codes for p-gp pumps, as a resistance biomarker. Finally, we show that karonudib treatment does not hamper T-cell-mediated anti-tumor responses. These findings can be used to guide the future clinical use of karonudib, potentially as a precision medicine approach.
Affiliation(s)
- Berglind O Einarsdottir
- Sahlgrenska Translational Melanoma Group, Sahlgrenska Cancer Center, Departments of Surgery and Oncology, Institute of Clinical Sciences, University of Gothenburg and Sahlgrenska University Hospital, Gothenburg, Sweden
| | - Joakim Karlsson
- Department of Medical Chemistry, Institute of Biomedicine, University of Gothenburg, Gothenburg, Sweden
| | - Elin M V Söderberg
- Sahlgrenska Translational Melanoma Group, Sahlgrenska Cancer Center, Departments of Surgery and Oncology, Institute of Clinical Sciences, University of Gothenburg and Sahlgrenska University Hospital, Gothenburg, Sweden
| | - Mattias F Lindberg
- Sahlgrenska Translational Melanoma Group, Sahlgrenska Cancer Center, Departments of Surgery and Oncology, Institute of Clinical Sciences, University of Gothenburg and Sahlgrenska University Hospital, Gothenburg, Sweden
| | - Elisa Funck-Brentano
- Sahlgrenska Translational Melanoma Group, Sahlgrenska Cancer Center, Departments of Surgery and Oncology, Institute of Clinical Sciences, University of Gothenburg and Sahlgrenska University Hospital, Gothenburg, Sweden
| | - Henrik Jespersen
- Sahlgrenska Translational Melanoma Group, Sahlgrenska Cancer Center, Departments of Surgery and Oncology, Institute of Clinical Sciences, University of Gothenburg and Sahlgrenska University Hospital, Gothenburg, Sweden
| | - Siggeir F Brynjolfsson
- Department of Microbiology and Immunology, Institute for Biomedicine, Sahlgrenska Academy, University of Gothenburg, Gothenburg, Sweden
| | - Roger Olofsson Bagge
- Sahlgrenska Translational Melanoma Group, Sahlgrenska Cancer Center, Departments of Surgery and Oncology, Institute of Clinical Sciences, University of Gothenburg and Sahlgrenska University Hospital, Gothenburg, Sweden
| | - Louise Carstam
- Department of Neurosurgery, Sahlgrenska University Hospital, Gothenburg, Sweden
| | - Martin Scobie
- Science for Life Laboratory, Division of Translational Medicine and Chemical Biology, Department of Medical Biochemistry and Biophysics, Karolinska Institutet, Stockholm, Sweden
| | - Tobias Koolmeister
- Science for Life Laboratory, Division of Translational Medicine and Chemical Biology, Department of Medical Biochemistry and Biophysics, Karolinska Institutet, Stockholm, Sweden
| | - Olof Wallner
- Science for Life Laboratory, Division of Translational Medicine and Chemical Biology, Department of Medical Biochemistry and Biophysics, Karolinska Institutet, Stockholm, Sweden
| | - Ulrika Stierner
- Sahlgrenska Translational Melanoma Group, Sahlgrenska Cancer Center, Departments of Surgery and Oncology, Institute of Clinical Sciences, University of Gothenburg and Sahlgrenska University Hospital, Gothenburg, Sweden
| | - Ulrika Warpman Berglund
- Science for Life Laboratory, Division of Translational Medicine and Chemical Biology, Department of Medical Biochemistry and Biophysics, Karolinska Institutet, Stockholm, Sweden
| | - Lars Ny
- Sahlgrenska Translational Melanoma Group, Sahlgrenska Cancer Center, Departments of Surgery and Oncology, Institute of Clinical Sciences, University of Gothenburg and Sahlgrenska University Hospital, Gothenburg, Sweden
| | - Lisa M Nilsson
- Sahlgrenska Translational Melanoma Group, Sahlgrenska Cancer Center, Departments of Surgery and Oncology, Institute of Clinical Sciences, University of Gothenburg and Sahlgrenska University Hospital, Gothenburg, Sweden
| | - Erik Larsson
- Department of Medical Chemistry, Institute of Biomedicine, University of Gothenburg, Gothenburg, Sweden
| | - Thomas Helleday
- Science for Life Laboratory, Division of Translational Medicine and Chemical Biology, Department of Medical Biochemistry and Biophysics, Karolinska Institutet, Stockholm, Sweden
| | - Jonas A Nilsson
- Sahlgrenska Translational Melanoma Group, Sahlgrenska Cancer Center, Departments of Surgery and Oncology, Institute of Clinical Sciences, University of Gothenburg and Sahlgrenska University Hospital, Gothenburg, Sweden.
| |
|
18
|
Bogema DR, Micallef ML, Liu M, Padula MP, Djordjevic SP, Darling AE, Jenkins C. Analysis of Theileria orientalis draft genome sequences reveals potential species-level divergence of the Ikeda, Chitose and Buffeli genotypes. BMC Genomics 2018; 19:298. [PMID: 29703152 PMCID: PMC5921998 DOI: 10.1186/s12864-018-4701-2] [Citation(s) in RCA: 22] [Impact Index Per Article: 3.7] [Received: 09/04/2017] [Accepted: 04/18/2018] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Theileria orientalis (Apicomplexa: Piroplasmida) has caused clinical disease in cattle of Eastern Asia for many years and its recent rapid spread throughout Australian and New Zealand herds has caused substantial economic losses to production through cattle deaths, late term abortion and morbidity. Disease outbreaks have been linked to the detection of a pathogenic genotype of T. orientalis, genotype Ikeda, which is also responsible for disease outbreaks in Asia. Here, we sequenced and compared the draft genomes of one pathogenic (Ikeda) and two apathogenic (Chitose, Buffeli) isolates of T. orientalis sourced from Australian herds. RESULTS Using de novo assembled sequences and a single nucleotide variant (SNV) analysis pipeline, we found extensive genetic divergence between the T. orientalis genotypes. A genome-wide phylogeny reconstructed to address continued confusion over nomenclature of this species displayed concordance with prior phylogenetic studies based on the major piroplasm surface protein (MPSP) gene. However, average nucleotide identity (ANI) values revealed that the divergence between isolates is comparable to that observed between other theilerias which represent distinct species. Analysis of SNVs revealed putative recombination between the Chitose and Buffeli genotypes and also between Australian and Japanese Ikeda isolates. Finally, to inform future vaccine studies, dN/dS ratios and surface location predictions were analysed. Six predicted surface protein targets were confirmed to be expressed during the piroplasm phase of the parasite by mass spectrometry. CONCLUSIONS We used whole genome sequencing to demonstrate that the T. orientalis Ikeda, Chitose and Buffeli variants show substantial genetic divergence. Our data indicates that future researchers could potentially consider disease-associated Ikeda and closely related genotypes as a separate species from non-pathogenic Chitose and Buffeli.
Affiliation(s)
- Daniel R Bogema
- NSW Department of Primary Industries, Elizabeth Macarthur Agricultural Institute, Menangle, NSW, Australia
| | - Melinda L Micallef
- NSW Department of Primary Industries, Elizabeth Macarthur Agricultural Institute, Menangle, NSW, Australia
| | - Michael Liu
- The ithree institute, University of Technology Sydney, Ultimo, NSW, Australia
| | - Matthew P Padula
- The ithree institute, University of Technology Sydney, Ultimo, NSW, Australia
| | - Steven P Djordjevic
- The ithree institute, University of Technology Sydney, Ultimo, NSW, Australia
| | - Aaron E Darling
- The ithree institute, University of Technology Sydney, Ultimo, NSW, Australia
| | - Cheryl Jenkins
- NSW Department of Primary Industries, Elizabeth Macarthur Agricultural Institute, Menangle, NSW, Australia.
| |
|
19
|
Mozere M, Tekman M, Kari J, Bockenhauer D, Kleta R, Stanescu H. OVAS: an open-source variant analysis suite with inheritance modelling. BMC Bioinformatics 2018; 19:46. [PMID: 29422027 PMCID: PMC5806474 DOI: 10.1186/s12859-018-2030-8] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 08/22/2017] [Accepted: 01/17/2018] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND The advent of modern high-throughput genetics continually broadens the gap between the rising volume of sequencing data and the tools required to process them. The need to pinpoint a small subset of functionally important variants has now shifted towards identifying the critical differences between normal variants and disease-causing ones. The ever-increasing reliance on cloud-based services for sequence analysis and the non-transparent methods they utilize have prompted the need for more in-situ services that can provide a safer and more accessible environment to process patient data, especially in circumstances where continuous internet usage is limited. RESULTS To address these issues, we herein propose our standalone Open-source Variant Analysis Sequencing (OVAS) pipeline, consisting of three key stages of processing that pertain to the separate modes of annotation, filtering, and interpretation. Core annotation performs variant-mapping to gene-isoforms at the exon/intron level, appends functional data pertaining to the type of variant mutation, and determines hetero/homozygosity. An extensive inheritance-modelling module in conjunction with 11 other filtering components can be used in sequence, ranging from single quality control to multi-file penetrance model specifics such as X-linked recessive or mosaicism. Depending on the type of interpretation required, additional annotation is performed to identify organ specificity through gene expression and protein domains. In the course of this paper we analysed an autosomal recessive case study. OVAS made effective use of the filtering modules to recapitulate the results of the study by identifying the prescribed compound-heterozygous disease pattern from exome-capture sequence input samples. 
CONCLUSION OVAS is an offline open-source modular-driven analysis environment designed to annotate and extract useful variants from Variant Call Format (VCF) files, and process them under an inheritance context through a top-down filtering schema of swappable modules, run entirely off a live bootable medium and accessed locally through a web-browser.
Affiliation(s)
- Monika Mozere
- Division of Medicine, University College London, London, NW3 2PF UK
| | - Mehmet Tekman
- Division of Medicine, University College London, London, NW3 2PF UK
| | - Jameela Kari
- Pediatric Nephrology Center of Excellence and Pediatric Department, Faculty of Medicine, King Abdulaziz University, Jeddah, Kingdom of Saudi Arabia
| | | | - Robert Kleta
- Division of Medicine, University College London, London, NW3 2PF UK
| | - Horia Stanescu
- Division of Medicine, University College London, London, NW3 2PF UK
| |
|
20
|
Rudewicz J, Soueidan H, Uricaru R, Bonnefoi H, Iggo R, Bergh J, Nikolski M. MICADo - Looking for Mutations in Targeted PacBio Cancer Data: An Alignment-Free Method. Front Genet 2016; 7:214. [PMID: 28008336 PMCID: PMC5143680 DOI: 10.3389/fgene.2016.00214] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Received: 10/07/2016] [Accepted: 11/23/2016] [Indexed: 12/11/2022] Open
Abstract
Targeted sequencing is commonly used in clinical applications of NGS technology since it enables generation of sufficient sequencing depth in the targeted genes of interest and thus ensures the best possible downstream analysis. This notwithstanding, the accurate discovery and annotation of disease-causing mutations remains a challenging problem even in such a favorable context. The difficulty is particularly salient in the case of third-generation sequencing technology, such as PacBio. We present MICADo, a de Bruijn graph based method, implemented in Python, that makes it possible to distinguish between patient-specific mutations and other alterations for targeted sequencing of a cohort of patients. MICADo analyses NGS reads for each sample within the context of the data of the whole cohort in order to capture the specificities of the sample with respect to the cohort. MICADo is particularly suitable for sequencing data from highly heterogeneous samples, especially when it involves high rates of non-uniform sequencing errors. It was validated on PacBio sequencing datasets from several cohorts of patients. The comparison with two widely used available tools, namely VarScan and GATK, shows that MICADo is more accurate, especially when true mutations have frequencies close to background noise. The source code is available at http://github.com/cbib/MICADo.
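As a rough illustration of the sample-versus-cohort idea described in this abstract, the sketch below collects the k-mers (the building blocks of a de Bruijn graph) of each sample and reports those seen in one sample but in no other. It is a simplified assumption for illustration only and is not MICADo's actual algorithm or API; the function names, the toy reads, and the `max_other_samples` threshold are all hypothetical.

```python
from collections import Counter


def kmers(seq, k):
    """Return the set of all length-k substrings of a read."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def sample_specific_kmers(cohort_reads, sample_id, k=4, max_other_samples=0):
    """k-mers of one sample that occur in at most `max_other_samples` other samples.

    cohort_reads: dict mapping a (hypothetical) sample id to a list of read strings.
    """
    per_sample = {}
    presence = Counter()  # number of samples in which each k-mer is seen
    for sid, reads in cohort_reads.items():
        seen = set()
        for read in reads:
            seen |= kmers(read, k)
        per_sample[sid] = seen
        presence.update(seen)
    # presence[km] - 1 is the number of *other* samples containing the k-mer
    return {km for km in per_sample[sample_id]
            if presence[km] - 1 <= max_other_samples}


# Toy cohort (made-up reads): sample s3 carries a single-base change.
cohort = {"s1": ["ACGTACGT"], "s2": ["ACGTACGT"], "s3": ["ACGTTCGT"]}
specific = sample_specific_kmers(cohort, "s3", k=4)
```

A real tool would also need to separate such k-mers from sequencing errors, which this sketch does not attempt.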
Affiliation(s)
- Justine Rudewicz
- Centre de BioInformatique de Bordeaux, University of Bordeaux, Bordeaux, France; Laboratoire Bordelais de Recherche en Informatique, Centre National de la Recherche Scientifique, University of Bordeaux, Bordeaux, France; Bergonié Cancer Institute, Institut National de la Santé et de la Recherche Médicale U1218, University of Bordeaux, Bordeaux, France
| | - Hayssam Soueidan
- Centre de BioInformatique de Bordeaux, University of Bordeaux, Bordeaux, France; Laboratoire Bordelais de Recherche en Informatique, Centre National de la Recherche Scientifique, University of Bordeaux, Bordeaux, France
| | - Raluca Uricaru
- Centre de BioInformatique de Bordeaux, University of Bordeaux, Bordeaux, France; Laboratoire Bordelais de Recherche en Informatique, Centre National de la Recherche Scientifique, University of Bordeaux, Bordeaux, France
| | - Hervé Bonnefoi
- Bergonié Cancer Institute, Institut National de la Santé et de la Recherche Médicale U1218, University of Bordeaux, Bordeaux, France
| | - Richard Iggo
- Bergonié Cancer Institute, Institut National de la Santé et de la Recherche Médicale U1218, University of Bordeaux, Bordeaux, France
| | - Jonas Bergh
- Karolinska Institute and University Hospital, Stockholm, Sweden
| | - Macha Nikolski
- Centre de BioInformatique de Bordeaux, University of Bordeaux, Bordeaux, France; Laboratoire Bordelais de Recherche en Informatique, Centre National de la Recherche Scientifique, University of Bordeaux, Bordeaux, France
| |
|
21
|
Performance Characterization and Validation of Saliva as an Alternative Specimen Source for Detecting Hereditary Breast Cancer Mutations by Next Generation Sequencing. Int J Genomics 2016; 2016:2059041. [PMID: 27818992 PMCID: PMC5081504 DOI: 10.1155/2016/2059041] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.3] [Received: 06/28/2016] [Revised: 08/22/2016] [Accepted: 09/26/2016] [Indexed: 12/12/2022] Open
Abstract
Identification of pathogenic germline mutations by next generation sequencing is a widely accepted tool for predicting the risk of hereditary cancer development. Blood is the most common source of DNA for such tests. However, blood as a sample type has many drawbacks, including the invasive collection method, poor sample stability, and a relatively high cost of collection. Therefore, in the current study we have assessed the suitability of saliva as an alternative source of genomic DNA for the identification of germline mutations in the BRCA1/2 genes by next generation sequencing (NGS). Our results show that all of the samples yielded DNA concentrations sufficient for library preparation. The concentrations of the final libraries, which were generated by PCR using target-specific primers, fall into the expected range with no notable difference between libraries generated from DNA derived from saliva or blood. Quality parameters indicate that sequencing performance is comparable across sample sources. An average of (98 ± 0.02)% variant calling concordance was obtained between the two specimen sources. Our data support saliva as a potential alternative specimen for detecting germline mutations by next generation sequencing.
|
22
|
Tian S, Yan H, Kalmbach M, Slager SL. Impact of post-alignment processing in variant discovery from whole exome data. BMC Bioinformatics 2016; 17:403. [PMID: 27716037 PMCID: PMC5048557 DOI: 10.1186/s12859-016-1279-z] [Citation(s) in RCA: 25] [Impact Index Per Article: 3.1] [Received: 05/02/2016] [Accepted: 09/26/2016] [Indexed: 01/11/2023] Open
Abstract
Background: GATK Best Practices workflows are widely used in large-scale sequencing projects and recommend post-alignment processing before variant calling. Two key post-processing steps include the computationally intensive local realignment around known INDELs and base quality score recalibration (BQSR). Both have been shown to reduce erroneous calls; however, the findings are mainly supported by the analytical pipeline that incorporates BWA and GATK UnifiedGenotyper. It is not known whether there is any benefit of post-processing and to what extent the benefit might be for pipelines implementing other methods, especially given that both mappers and callers are typically updated. Moreover, because sequencing platforms are upgraded regularly and the new platforms provide better estimations of read quality scores, the need for post-processing is also unknown. Finally, some regions in the human genome show high sequence divergence from the reference genome; it is unclear whether there is benefit from post-processing in these regions.
Results: We used both simulated and NA12878 exome data to comprehensively assess the impact of post-processing for five or six popular mappers together with five callers. Focusing on chromosome 6p21.3, which is a region of high sequence divergence harboring the human leukocyte antigen (HLA) system, we found that local realignment had little or no impact on SNP calling, but increased sensitivity was observed in INDEL calling for the Stampy + GATK UnifiedGenotyper pipeline. No or only a modest effect of local realignment was detected on the three haplotype-based callers and no evidence of effect on Novoalign. BQSR had virtually negligible effect on INDEL calling and generally reduced sensitivity for SNP calling that depended on caller, coverage and level of divergence. Specifically, for SAMtools and FreeBayes calling in the regions with low divergence, BQSR reduced the SNP calling sensitivity but improved the precision when the coverage is insufficient. However, in regions of high divergence (e.g., the HLA region), BQSR reduced the sensitivity of both callers with little gain in precision rate. For the other three callers, BQSR reduced the sensitivity without increasing the precision rate regardless of coverage and divergence level.
Conclusions: We demonstrated that the gain from post-processing is not universal; rather, it depends on mapper and caller combination, and the benefit is influenced further by sequencing depth and divergence level. Our analysis highlights the importance of considering these key factors in deciding to apply the computationally intensive post-processing to Illumina exome data.
Affiliation(s)
- Shulan Tian
- Division of Biomedical Statistics and Informatics, Department of Health Sciences Research, Mayo Clinic, 200 1st St SW, Rochester, MN, 55905, USA
| | - Huihuang Yan
- Division of Biomedical Statistics and Informatics, Department of Health Sciences Research, Mayo Clinic, 200 1st St SW, Rochester, MN, 55905, USA
| | - Michael Kalmbach
- Division of Research and Education Support Systems, Department of Information Technology Mayo Clinic, Rochester, MN, 55905, USA
| | - Susan L Slager
- Division of Biomedical Statistics and Informatics, Department of Health Sciences Research, Mayo Clinic, 200 1st St SW, Rochester, MN, 55905, USA
|
23
|
Laurie S, Fernandez-Callejo M, Marco-Sola S, Trotta JR, Camps J, Chacón A, Espinosa A, Gut M, Gut I, Heath S, Beltran S. From Wet-Lab to Variations: Concordance and Speed of Bioinformatics Pipelines for Whole Genome and Whole Exome Sequencing. Hum Mutat 2016; 37:1263-1271. [PMID: 27604516 PMCID: PMC5129537 DOI: 10.1002/humu.23114] [Citation(s) in RCA: 36] [Impact Index Per Article: 4.5] [Received: 04/11/2016] [Accepted: 09/01/2016] [Indexed: 12/21/2022]
Abstract
As whole genome sequencing becomes cheaper and faster, it will progressively substitute targeted next‐generation sequencing as standard practice in research and diagnostics. However, computing cost–performance ratio is not advancing at an equivalent rate. Therefore, it is essential to evaluate the robustness of the variant detection process taking into account the computing resources required. We have benchmarked six combinations of state‐of‐the‐art read aligners (BWA‐MEM and GEM3) and variant callers (FreeBayes, GATK HaplotypeCaller, SAMtools) on whole genome and whole exome sequencing data from the NA12878 human sample. Results have been compared between them and against the NIST Genome in a Bottle (GIAB) variants reference dataset. We report differences in speed of up to 20 times in some steps of the process and have observed that SNV, and to a lesser extent InDel, detection is highly consistent in 70% of the genome. SNV, and especially InDel, detection is less reliable in 20% of the genome, and almost unfeasible in the remaining 10%. These findings will aid in choosing the appropriate tools bearing in mind objectives, workload, and computing infrastructure available.
Affiliation(s)
- Steve Laurie
- CNAG-CRG, Centre for Genomic Regulation (CRG), Barcelona Institute of Science and Technology (BIST), Barcelona, Spain; Universitat Pompeu Fabra (UPF), Barcelona, Spain
| | - Marcos Fernandez-Callejo
- CNAG-CRG, Centre for Genomic Regulation (CRG), Barcelona Institute of Science and Technology (BIST), Barcelona, Spain; Universitat Pompeu Fabra (UPF), Barcelona, Spain
| | - Santiago Marco-Sola
- CNAG-CRG, Centre for Genomic Regulation (CRG), Barcelona Institute of Science and Technology (BIST), Barcelona, Spain; Universitat Pompeu Fabra (UPF), Barcelona, Spain
| | - Jean-Remi Trotta
- CNAG-CRG, Centre for Genomic Regulation (CRG), Barcelona Institute of Science and Technology (BIST), Barcelona, Spain; Universitat Pompeu Fabra (UPF), Barcelona, Spain
| | - Jordi Camps
- CNAG-CRG, Centre for Genomic Regulation (CRG), Barcelona Institute of Science and Technology (BIST), Barcelona, Spain; Universitat Pompeu Fabra (UPF), Barcelona, Spain
| | - Marta Gut
- CNAG-CRG, Centre for Genomic Regulation (CRG), Barcelona Institute of Science and Technology (BIST), Barcelona, Spain; Universitat Pompeu Fabra (UPF), Barcelona, Spain
| | - Ivo Gut
- CNAG-CRG, Centre for Genomic Regulation (CRG), Barcelona Institute of Science and Technology (BIST), Barcelona, Spain; Universitat Pompeu Fabra (UPF), Barcelona, Spain
| | - Simon Heath
- CNAG-CRG, Centre for Genomic Regulation (CRG), Barcelona Institute of Science and Technology (BIST), Barcelona, Spain; Universitat Pompeu Fabra (UPF), Barcelona, Spain
| | - Sergi Beltran
- CNAG-CRG, Centre for Genomic Regulation (CRG), Barcelona Institute of Science and Technology (BIST), Barcelona, Spain; Universitat Pompeu Fabra (UPF), Barcelona, Spain
|
24
|
Wang J, Skoog T, Einarsdottir E, Kaartokallio T, Laivuori H, Grauers A, Gerdhem P, Hytönen M, Lohi H, Kere J, Jiao H. Investigation of rare and low-frequency variants using high-throughput sequencing with pooled DNA samples. Sci Rep 2016; 6:33256. [PMID: 27633116 PMCID: PMC5025741 DOI: 10.1038/srep33256] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.4] [Received: 05/27/2016] [Accepted: 08/24/2016] [Indexed: 11/09/2022] Open
Abstract
High-throughput sequencing using pooled DNA samples can facilitate genome-wide studies on rare and low-frequency variants in a large population. Some major questions concerning the pooling sequencing strategy are whether rare and low-frequency variants can be detected reliably, and whether estimated minor allele frequencies (MAFs) can represent the actual values obtained from individually genotyped samples. In this study, we evaluated MAF estimates using three variant detection tools with two sets of pooled whole exome sequencing (WES) and one set of pooled whole genome sequencing (WGS) data. Both GATK and Freebayes displayed high sensitivity, specificity and accuracy when detecting rare or low-frequency variants. For the WGS study, 56% of the low-frequency variants in Illumina array have identical MAFs and 26% have one allele difference between sequencing and individual genotyping data. The MAF estimates from WGS correlated well (r = 0.94) with those from Illumina arrays. The MAFs from the pooled WES data also showed high concordance (r = 0.88) with those from the individual genotyping data. In conclusion, the MAFs estimated from pooled DNA sequencing data reflect the MAFs in individually genotyped samples well. The pooling strategy can thus be a rapid and cost-effective approach for the initial screening in large-scale association studies.
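The comparison described in this abstract, a pooled-sequencing MAF estimate checked against individually genotyped allele counts, can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names are hypothetical, the pooled frequency is assumed to be the plain fraction of reads carrying the allele, and the individuals are assumed to be diploid and to contribute equally to the pool.

```python
def pooled_maf(alt_reads: int, total_reads: int) -> float:
    """Estimate the allele frequency in a pool as the fraction of reads
    carrying the allele (assumes every individual contributes equally)."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return alt_reads / total_reads


def allele_count_difference(pooled_freq: float,
                            genotyped_alleles: int,
                            n_individuals: int) -> int:
    """Difference, in whole alleles, between the pooled estimate and the
    count obtained by genotyping each individual (diploid assumption)."""
    n_chromosomes = 2 * n_individuals
    # Convert the pooled frequency to the nearest whole allele count.
    estimated_alleles = round(pooled_freq * n_chromosomes)
    return abs(estimated_alleles - genotyped_alleles)


# Hypothetical numbers: 100 individuals pooled, 30 of 2000 reads carry the allele,
# and individual genotyping found 4 copies of the allele.
freq = pooled_maf(30, 2000)
diff = allele_count_difference(freq, genotyped_alleles=4, n_individuals=100)
```

Under these assumptions, a difference of 0 would correspond to an "identical MAF" and a difference of 1 to a "one allele difference" between the sequencing and genotyping data.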
Affiliation(s)
- Jingwen Wang
- Department of Biosciences and Nutrition, Karolinska Institutet, SE-14183 Huddinge, Sweden; Science for Life Laboratory, Stockholm, Sweden
| | - Tiina Skoog
- Department of Biosciences and Nutrition, Karolinska Institutet, SE-14183 Huddinge, Sweden
| | - Elisabet Einarsdottir
- Department of Biosciences and Nutrition, Karolinska Institutet, SE-14183 Huddinge, Sweden; Molecular Neurology Research Program, University of Helsinki and Folkhälsan Institute of Genetics, Helsinki, Finland
| | - Tea Kaartokallio
- Medical and Clinical Genetics, University of Helsinki and Helsinki University Hospital, Helsinki, Finland
| | - Hannele Laivuori
- Medical and Clinical Genetics, University of Helsinki and Helsinki University Hospital, Helsinki, Finland; Obstetrics and Gynecology, University of Helsinki and Helsinki University Hospital, Helsinki, Finland; Institute for Molecular Medicine Finland, University of Helsinki, Helsinki, Finland
| | - Anna Grauers
- Department of Orthopedics, Karolinska University Hospital and Department of Clinical Sciences, Intervention and Technology (CLINTEC) Karolinska Institutet, Stockholm, Sweden; Department of Orthopaedics, Sundsvall and Harnosand County Hospital, Sundsvall, Sweden
| | - Paul Gerdhem
- Department of Orthopedics, Karolinska University Hospital and Department of Clinical Sciences, Intervention and Technology (CLINTEC) Karolinska Institutet, Stockholm, Sweden
| | - Marjo Hytönen
- Department of Veterinary Biosciences, and Research Programs Unit, Molecular Neurology, University of Helsinki and Folkhälsan Research Center, Helsinki, Finland
| | - Hannes Lohi
- Department of Veterinary Biosciences, and Research Programs Unit, Molecular Neurology, University of Helsinki and Folkhälsan Research Center, Helsinki, Finland
| | - Juha Kere
- Department of Biosciences and Nutrition, Karolinska Institutet, SE-14183 Huddinge, Sweden; Science for Life Laboratory, Stockholm, Sweden; Molecular Neurology Research Program, University of Helsinki and Folkhälsan Institute of Genetics, Helsinki, Finland
| | - Hong Jiao
- Department of Biosciences and Nutrition, Karolinska Institutet, SE-14183 Huddinge, Sweden; Science for Life Laboratory, Stockholm, Sweden
|
25
|
Dickson DJ, Pfeifer JD. Real-world data in the molecular era-finding the reality in the real world. Clin Pharmacol Ther 2016; 99:186-97. [PMID: 26565654 DOI: 10.1002/cpt.300] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.0] [Received: 10/13/2015] [Accepted: 11/10/2015] [Indexed: 01/06/2023]
Abstract
Real-world data (RWD) promises to provide a pivotal element to the understanding of personalized medicine. However, without true representation (or the reality) of the patient-disease biosystem and its molecular contributors, RWD may hamper rather than help this advancement. In this review article, we discuss RWD vs. clinical reality and the disconnects that exist currently (emphasizing molecular medicine), and methods of closing the gaps between RWD and reality.
Affiliation(s)
- D J Dickson
- Molecular Evidence Development Consortium, Rexburg, Idaho, USA
| | - J D Pfeifer
- Department of Pathology, Washington University School of Medicine, St. Louis, Missouri, USA
|
26
|
Scarcelli N, Mariac C, Couvreur TLP, Faye A, Richard D, Sabot F, Berthouly‐Salazar C, Vigouroux Y. Intra‐individual polymorphism in chloroplasts from NGS data: where does it come from and how to handle it? Mol Ecol Resour 2015; 16:434-45. [DOI: 10.1111/1755-0998.12462] [Citation(s) in RCA: 51] [Impact Index Per Article: 5.7] [Received: 05/29/2015] [Revised: 08/07/2015] [Accepted: 08/21/2015] [Indexed: 01/11/2023]
Affiliation(s)
- N. Scarcelli
- UMR DIADE, IRD Montpellier, 911 avenue Agropolis, 34394 Montpellier Cedex 5, France
| | - C. Mariac
- UMR DIADE, IRD Montpellier, 911 avenue Agropolis, 34394 Montpellier Cedex 5, France
| | - T. L. P. Couvreur
- UMR DIADE, IRD Montpellier, 911 avenue Agropolis, 34394 Montpellier Cedex 5, France
- Département des Sciences Biologiques, Laboratoire de Botanique Systématique et d'Ecologie, Ecole Normale Supérieure, Université de Yaoundé I, BP 047, Yaoundé, Cameroon
| | - A. Faye
- UMR DIADE, IRD Montpellier, 911 avenue Agropolis, 34394 Montpellier Cedex 5, France
| | - D. Richard
- UMR DIADE, IRD Montpellier, 911 avenue Agropolis, 34394 Montpellier Cedex 5, France
| | - F. Sabot
- UMR DIADE, IRD Montpellier, 911 avenue Agropolis, 34394 Montpellier Cedex 5, France
| | - C. Berthouly‐Salazar
- UMR DIADE, IRD Montpellier, 911 avenue Agropolis, 34394 Montpellier Cedex 5, France
- Route des Hydrocarbures, Centre de Recherche de Bel‐Air, IRD/ISRA, BP 1386 – 18524 Dakar, Senegal
| | - Y. Vigouroux
- UMR DIADE, IRD Montpellier, 911 avenue Agropolis, 34394 Montpellier Cedex 5, France
|
27
|
Krasnov GS, Dmitriev AA, Kudryavtseva AV, Shargunov AV, Karpov DS, Uroshlev LA, Melnikova NV, Blinov VM, Poverennaya EV, Archakov AI, Lisitsa AV, Ponomarenko EA. PPLine: An Automated Pipeline for SNP, SAP, and Splice Variant Detection in the Context of Proteogenomics. J Proteome Res 2015; 14:3729-37. [DOI: 10.1021/acs.jproteome.5b00490] [Citation(s) in RCA: 51] [Impact Index Per Article: 5.7] [Indexed: 11/29/2022]
Affiliation(s)
- George Sergeevich Krasnov
- Engelhardt Institute of Molecular Biology, Russian Academy of Sciences, Moscow, 111991 Russia
- Orekhovich Institute of Biomedical Chemistry, Russian Academy of Medical Sciences, Moscow, 119121 Russia
- Mechnikov Research Institute of Vaccines and Sera, Moscow, 105064 Russia
| | - Anna Viktorovna Kudryavtseva
- Engelhardt Institute of Molecular Biology, Russian Academy of Sciences, Moscow, 111991 Russia
- Herzen Moscow Cancer Research Institute, Ministry of Healthcare of the Russian Federation, Moscow, 125284 Russia
| | - Alexander Valerievich Shargunov
- Orekhovich Institute of Biomedical Chemistry, Russian Academy of Medical Sciences, Moscow, 119121 Russia
- Mechnikov Research Institute of Vaccines and Sera, Moscow, 105064 Russia
| | - Dmitry Sergeevich Karpov
- Engelhardt Institute of Molecular Biology, Russian Academy of Sciences, Moscow, 111991 Russia
- Orekhovich Institute of Biomedical Chemistry, Russian Academy of Medical Sciences, Moscow, 119121 Russia
| | - Vladimir Mikhailovich Blinov
- Orekhovich Institute of Biomedical Chemistry, Russian Academy of Medical Sciences, Moscow, 119121 Russia
- Mechnikov Research Institute of Vaccines and Sera, Moscow, 105064 Russia
| | - Andrey Valerievich Lisitsa
- Orekhovich Institute of Biomedical Chemistry, Russian Academy of Medical Sciences, Moscow, 119121 Russia
|
28
|
Kelly BJ, Fitch JR, Hu Y, Corsmeier DJ, Zhong H, Wetzel AN, Nordquist RD, Newsom DL, White P. Churchill: an ultra-fast, deterministic, highly scalable and balanced parallelization strategy for the discovery of human genetic variation in clinical and population-scale genomics. Genome Biol 2015; 16:6. [PMID: 25600152 PMCID: PMC4333267 DOI: 10.1186/s13059-014-0577-x] [Citation(s) in RCA: 80] [Impact Index Per Article: 8.9] [Received: 10/21/2014] [Accepted: 12/23/2014] [Indexed: 12/18/2022] Open
Abstract
While advances in genome sequencing technology make population-scale genomics a possibility, current approaches for analysis of these data rely upon parallelization strategies that have limited scalability, complex implementation and lack reproducibility. Churchill, a balanced regional parallelization strategy, overcomes these challenges, fully automating the multiple steps required to go from raw sequencing reads to variant discovery. Through implementation of novel deterministic parallelization techniques, Churchill allows computationally efficient analysis of a high-depth whole genome sample in less than two hours. The method is highly scalable, enabling full analysis of the 1000 Genomes raw sequence dataset in a week using cloud resources. http://churchill.nchri.org/.
Affiliation(s)
- Peter White
- Center for Microbial Pathogenesis, The Research Institute at Nationwide Children's Hospital, 700 Children's Drive, Columbus 43205, OH, USA; Department of Pediatrics, College of Medicine, The Ohio State University, Columbus, Ohio, USA
|