1
Phumiphanjarphak W, Aiewsakun P. Entourage: all-in-one sequence analysis software for genome assembly, virus detection, virus discovery, and intrasample variation profiling. BMC Bioinformatics 2024; 25:222. [PMID: 38914932 PMCID: PMC11197340 DOI: 10.1186/s12859-024-05846-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/08/2023] [Accepted: 06/14/2024] [Indexed: 06/26/2024] Open
Abstract
BACKGROUND Pan-virus detection, and virome investigation in general, can be challenging, mainly due to the lack of universally conserved genetic elements in viruses. Metagenomic next-generation sequencing can offer a promising solution to this problem by providing an unbiased overview of the microbial community, enabling detection of any viruses without prior target selection. However, a major challenge in utilising metagenomic next-generation sequencing for virome investigation is that data analysis can be highly complex, involving numerous data processing steps. RESULTS Here, we present Entourage to address this challenge. Entourage enables short-read sequence assembly, viral sequence search with or without reference virus targets using contig-based approaches, and intrasample sequence variation quantification. Several workflows are implemented in Entourage to facilitate end-to-end virus sequence detection analysis through a single command line, from read cleaning, sequence assembly, to virus sequence searching. The results generated are comprehensive, allowing for thorough quality control, reliability assessment, and interpretation. We illustrate Entourage's utility as a streamlined workflow for virus detection by employing it to comprehensively search for target virus sequences and beyond in raw sequence read data generated from HeLa cell culture samples spiked with viruses. Furthermore, we showcase its flexibility and performance on a real-world dataset by analysing a preassembled Tara Oceans dataset. Overall, our results show that Entourage performs well even with low virus sequencing depth in single digits, and it can be used to discover novel viruses effectively. Additionally, by using sequence data generated from a patient with chronic SARS-CoV-2 infection, we demonstrate Entourage's capability to quantify virus intrasample genetic variations, and generate publication-quality figures illustrating the results. 
CONCLUSIONS Entourage is an all-in-one, versatile, and streamlined bioinformatics software for virome investigation, developed with a focus on ease of use. Entourage is available at https://codeberg.org/CENMIG/Entourage under the MIT license.
Affiliation(s)
- Worakorn Phumiphanjarphak
- Department of Microbiology, Faculty of Science, Mahidol University, Ratchathewi District, 272 Rama VI Road, Bangkok, 10400, Thailand
- Pornchai Matangkasombut Center for Microbial Genomics, Department of Microbiology, Faculty of Science, Mahidol University, Bangkok, Thailand
- Pakorn Aiewsakun
- Department of Microbiology, Faculty of Science, Mahidol University, Ratchathewi District, 272 Rama VI Road, Bangkok, 10400, Thailand.
- Pornchai Matangkasombut Center for Microbial Genomics, Department of Microbiology, Faculty of Science, Mahidol University, Bangkok, Thailand.
2
Pan H, Fang H, Zhu C, Li S, Yi H, Zhang X, Yin X, Song Y, Chen D, Yin C. Molecular and immunological characteristics of postoperative relapse in lymph node-positive esophageal squamous cell cancer. Cancer Med 2024; 13:e7228. [PMID: 38733174 PMCID: PMC11087845 DOI: 10.1002/cam4.7228] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/14/2023] [Revised: 04/07/2024] [Accepted: 04/18/2024] [Indexed: 05/13/2024] Open
Abstract
BACKGROUND The molecular and immunological characteristics of primary tumors and positive lymph nodes in esophageal squamous cell carcinoma (ESCC), and their relationship with recurrence, remain unclear; this study set out to explore them. METHODS A total of 30 patients with lymph node-positive ESCC (IIB-IVA) were enrolled. Primary tumor and lymph node specimens were collected from each patient and subjected to 551-tumor-targeted DNA sequencing and 289-immuno-oncology RNA panel sequencing to identify the molecular basis and immunological features, respectively. RESULTS The primary tumors exhibited a higher mutation burden than lymph nodes (p < 0.001). One-year recurrent ESCC exhibited a higher Mucin16 (MUC16) mutation rate (p = 0.038), and univariate and multivariate analyses showed that MUC16 mutation is an independent genetic factor associated with reduced relapse-free survival (univariate, HR: 5.39, 95% CI: 1.67-17.4, p = 0.005; multivariate, HR: 7.36, 95% CI: 1.79-30.23, p = 0.006). Transcriptomic results showed that the non-relapse group had a higher cytolytic activity (CYT) score (p = 0.025) and was enriched in the IFN-α pathway (p = 0.036), while the relapsed group was enriched in the TNF-α/NF-κB (p = 0.001) and PI3K/Akt (p = 0.014) pathways. CONCLUSION The difference in molecular characteristics between primary lesions and lymph nodes may be the cause of the inconsistent clinical outcomes. MUC16 mutations and poor immune infiltration are associated with rapid relapse of node-positive ESCC.
Affiliation(s)
- Hua‐guang Pan
- Department of Thoracic Surgery, The First Affiliated Hospital of Anhui Medical University, Hefei, Anhui, China
- Han‐lin Fang
- Department of Thoracic Surgery, The First Affiliated Hospital of Anhui Medical University, Hefei, Anhui, China
- Chan Zhu
- Jiangsu Simcere Diagnostics Co., Ltd., Nanjing Simcere Medical Laboratory Science Co., Ltd., The State Key Lab of Translational Medicine and Innovative Drug Development, Nanjing, China
- Si Li
- Jiangsu Simcere Diagnostics Co., Ltd., Nanjing Simcere Medical Laboratory Science Co., Ltd., The State Key Lab of Translational Medicine and Innovative Drug Development, Nanjing, China
- Huan Yi
- Jiangsu Simcere Diagnostics Co., Ltd., Nanjing Simcere Medical Laboratory Science Co., Ltd., The State Key Lab of Translational Medicine and Innovative Drug Development, Nanjing, China
- Xing Zhang
- Jiangsu Simcere Diagnostics Co., Ltd., Nanjing Simcere Medical Laboratory Science Co., Ltd., The State Key Lab of Translational Medicine and Innovative Drug Development, Nanjing, China
- Xiang‐yu Yin
- Jiangsu Simcere Diagnostics Co., Ltd., Nanjing Simcere Medical Laboratory Science Co., Ltd., The State Key Lab of Translational Medicine and Innovative Drug Development, Nanjing, China
- Department of Biological Sciences, Xi'an Jiaotong‐Liverpool University, Suzhou, China
- Yun‐jie Song
- Jiangsu Simcere Diagnostics Co., Ltd., Nanjing Simcere Medical Laboratory Science Co., Ltd., The State Key Lab of Translational Medicine and Innovative Drug Development, Nanjing, China
- Dongsheng Chen
- Jiangsu Simcere Diagnostics Co., Ltd., Nanjing Simcere Medical Laboratory Science Co., Ltd., The State Key Lab of Translational Medicine and Innovative Drug Development, Nanjing, China
- Chun‐tong Yin
- Department of Thoracic Surgery, The First Affiliated Hospital of Anhui Medical University, Hefei, Anhui, China
3
Park H, Gim J. A comparative investigation of single nucleotide variant calling for a personal non-Caucasian sequencing sample. Genes Genomics 2023; 45:1527-1536. [PMID: 37651066 DOI: 10.1007/s13258-023-01439-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/03/2023] [Accepted: 08/04/2023] [Indexed: 09/01/2023]
Abstract
BACKGROUND The falling cost and growing clinical use of whole genome sequencing (WGS) create a need for efficient (accurate and rapid) variant calling procedures for personal WGS data (n = 1). A number of variant calling pipelines have been introduced that use the human reference genome GRCh38 as a reference and 'NA12878' as a benchmark dataset, both of which are 'standard' but of limited ethnic origin. Given the nature of variant calling algorithms and recent updates in sequencing protocols, however, it is necessary to revisit the efficiency of the current best pipelines for personal WGS data from diverse ethnicities. OBJECTIVE We discuss the most efficient practices for variant calling from personal WGS reads, with particular emphasis on whether (1) an ethnic match or mismatch between the reference genome and the WGS data produces a distinct result and, more importantly, (2) there is an ethnicity-specific optimal workflow. METHODS We generated WGS data, DNA array data, and a sufficient number of Sanger-validated variants from a single Korean subject to perform such a comprehensive comparison. We applied these WGS reads and the 'NA12878' reads to 8 different variant calling pipelines with 2 different reference genomes (GRCh38 and KOREF, a Korean reference genome), to which the WGS reads from different ethnic origins were aligned. RESULTS We evaluated the performance of the pipelines against the matched array genotype data and Sanger sequencing validation and demonstrated that, regardless of ethnic match or mismatch, (1) Novoalign-GATK4 showed the most efficient performance, with exceptional calls in the MHC region; and (2) overall performance was better with GRCh38, while a significant difference in recall was observed. In addition, we found that removing the 'markduplication' step for PCR-free WGS data greatly reduced computing cost while maintaining performance.
CONCLUSION For variant calling from personal PCR-free WGS data, regardless of ethnicity, we recommend Novoalign + GATK4 with GRCh38 and without 'markduplication'.
Affiliation(s)
- HyeonSeul Park
- BK21 FOUR, Department of Integrative Biological Sciences, Chosun University, Gwangju, Republic of Korea
- JungSoo Gim
- BK21 FOUR, Department of Integrative Biological Sciences, Chosun University, Gwangju, Republic of Korea.
- Department of Biomedical Science, Chosun University, Gwangju, Republic of Korea.
- Asian Dementia Research Initiative, Chosun University, Gwangju, Republic of Korea.
4
Wilton R, Szalay AS. Short-read aligner performance in germline variant identification. Bioinformatics 2023; 39:btad480. [PMID: 37527006 PMCID: PMC10421969 DOI: 10.1093/bioinformatics/btad480] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 03/07/2023] [Revised: 06/01/2023] [Accepted: 07/31/2023] [Indexed: 08/03/2023] Open
Abstract
MOTIVATION Read alignment is an essential first step in the characterization of DNA sequence variation. The accuracy of variant-calling results depends not only on the quality of read alignment and variant-calling software but also on the interaction between these complex software tools. RESULTS In this review, we evaluate short-read aligner performance with the goal of optimizing germline variant-calling accuracy. We examine the performance of three general-purpose short-read aligners (BWA-MEM, Bowtie 2, and Arioc) in conjunction with three germline variant callers: DeepVariant, FreeBayes, and GATK HaplotypeCaller. We discuss the behavior of the read aligners with regard to the data elements on which the variant callers rely, and illustrate how the runtime configurations of these software tools combine to affect variant-calling performance.
Affiliation(s)
- Richard Wilton
- Department of Physics and Astronomy, Johns Hopkins University, Baltimore, MD 21218, United States
- Alexander S Szalay
- Department of Physics and Astronomy, Johns Hopkins University, Baltimore, MD 21218, United States
- Department of Computer Science, Johns Hopkins University, Baltimore, MD 21218, United States
5
Zhang Q, Yang Y, You X, Ju Y, Zhang Q, Sun T, Liu W. Comprehensive genomic analysis of primary bone sarcomas reveals different genetic patterns compared with soft tissue sarcomas. Front Oncol 2023; 13:1173275. [PMID: 37546405 PMCID: PMC10401477 DOI: 10.3389/fonc.2023.1173275] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 02/24/2023] [Accepted: 07/03/2023] [Indexed: 08/08/2023] Open
Abstract
Introduction Sarcomas are classified into two types, bone sarcoma and soft tissue sarcoma (STS), which account for approximately 1% of adult solid malignancies and 20% of pediatric solid malignancies. There exist more than 50 subtypes within the two types of sarcoma. Each subtype is highly diverse and characterized by significant variations in morphology and phenotypes. Understanding tumor molecular genetics is helpful in improving the diagnostic accuracy of tumors that have been difficult to classify based on morphology alone or that have overlapping morphological features. The different molecular characteristics of bone sarcoma and STS in China remain poorly understood. Therefore, this study aimed to analyze genomic landscapes and actionable genomic alterations (GAs) as well as tumor mutational burden (TMB), microsatellite instability (MSI), and programmed death ligand-1 (PD-L1) expression among Chinese individuals diagnosed with primary bone sarcomas and STS. Methods This retrospective study included 145 patients with primary bone sarcomas (n = 75) and STS (n = 70), who were categorized based on the 2020 World Health Organization classification system. Results Patients diagnosed with bone sarcomas were significantly younger than those diagnosed with STS (p < 0.01). The top 10 frequently altered genes in bone sarcoma and STS were TP53, CDKN2A, CDKN2B, MAP3K1, LRP1B, MDM2, RB1, PTEN, MYC, and CDK4. The EWSR1 fusions exhibited statistically significant differences (p < 0.01) between primary bone sarcoma and STS in terms of their altered genes. Based on the actionable genes defined by OncoKB, actionable GAs were found in 30.7% (23/75) of the patients with bone sarcomas and 35.7% (25/70) of those with STS. High tumor mutational burden (TMB-H; TMB ≥ 10) was observed in 4.0% (3/75) of patients with bone sarcoma and 4.3% (3/70) of patients with STS. Only one patient with STS exhibited MSI-L, while the remaining cases were microsatellite stable.
The positive rate of PD-L1 expression was slightly higher in STS (35.2%) than in bone sarcoma (33.3%); however, this difference did not reach statistical significance. The expression of PD-L1 in STS patients was associated with a poorer prognosis (p = 0.007). Patients with STS had a better prognosis than those with bone sarcoma, but the observed difference did not attain statistical significance (p = 0.21). Amplification of the MET and MYC genes was negatively correlated with clinical prognosis in bone tumors (p < 0.01). Discussion In conclusion, bone sarcoma and STS have significantly different clinical and molecular characteristics, suggesting that accurate diagnosis is vital for clinical treatment. Additionally, a comprehensive genetic landscape can provide novel treatment perspectives for primary bone sarcoma and STS. Taking TMB, MSI, PD-L1 expression, and OncoKB definition together into consideration, there are still many patients who have the potential to respond to targeted therapy or immunotherapy.
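The proportions quoted in this abstract can be re-derived from the patient counts it gives. The following is a minimal sketch of that arithmetic, not part of the original study; the labels and the `percent` helper are illustrative names introduced here, and only the counts and stated percentages come from the abstract.

```python
# Counts quoted in the abstract: (patients with the finding, cohort size, stated %).
# The dictionary keys are illustrative labels, not terms defined by the authors.
reported = {
    "actionable GAs, bone sarcoma": (23, 75, 30.7),
    "actionable GAs, STS": (25, 70, 35.7),
    "TMB-H, bone sarcoma": (3, 75, 4.0),
    "TMB-H, STS": (3, 70, 4.3),
}


def percent(numerator, denominator):
    """Return numerator/denominator as a percentage rounded to one decimal."""
    return round(100 * numerator / denominator, 1)


# Recompute each percentage from its counts and print it next to the stated value.
recomputed = {label: percent(n, d) for label, (n, d, _) in reported.items()}
for label, (n, d, stated) in reported.items():
    print(f"{label}: {n}/{d} = {recomputed[label]}% (abstract states {stated}%)")
```

Each recomputed value agrees with the figure stated in the abstract to one decimal place.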
Affiliation(s)
- Qing Zhang
- Department of Orthopaedic Oncology, Beijing Ji Shui Tan Hospital, Peking University, Beijing, China
- Yongkun Yang
- Department of Orthopaedic Oncology, Beijing Ji Shui Tan Hospital, Peking University, Beijing, China
- Xia You
- The Medical Department, Jiangsu Simcere Diagnostics Co., Ltd, Nanjing, Jiangsu, China
- Nanjing Simcere Medical Laboratory Science Co., Ltd, Nanjing, Jiangsu, China
- The State Key Lab of Translational Medicine and Innovative Drug Development, Jiangsu Simcere Diagnostics Co., Ltd, Nanjing, Jiangsu, China
- Yongzhi Ju
- The Medical Department, Jiangsu Simcere Diagnostics Co., Ltd, Nanjing, Jiangsu, China
- Nanjing Simcere Medical Laboratory Science Co., Ltd, Nanjing, Jiangsu, China
- The State Key Lab of Translational Medicine and Innovative Drug Development, Jiangsu Simcere Diagnostics Co., Ltd, Nanjing, Jiangsu, China
- Qin Zhang
- The Medical Department, Jiangsu Simcere Diagnostics Co., Ltd, Nanjing, Jiangsu, China
- Nanjing Simcere Medical Laboratory Science Co., Ltd, Nanjing, Jiangsu, China
- The State Key Lab of Translational Medicine and Innovative Drug Development, Jiangsu Simcere Diagnostics Co., Ltd, Nanjing, Jiangsu, China
- Tingting Sun
- The Medical Department, Jiangsu Simcere Diagnostics Co., Ltd, Nanjing, Jiangsu, China
- Nanjing Simcere Medical Laboratory Science Co., Ltd, Nanjing, Jiangsu, China
- The State Key Lab of Translational Medicine and Innovative Drug Development, Jiangsu Simcere Diagnostics Co., Ltd, Nanjing, Jiangsu, China
- Weifeng Liu
- Department of Orthopaedic Oncology, Beijing Ji Shui Tan Hospital, Peking University, Beijing, China
6
Gong J, Dong L, Wang C, Luo N, Han T, Li M, Sun T, Ding R, Han B, Li G. Molecular genomic landscape of pediatric solid tumors in Chinese patients: implications for clinical significance. J Cancer Res Clin Oncol 2023:10.1007/s00432-023-04756-5. [PMID: 37140698 DOI: 10.1007/s00432-023-04756-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/09/2023] [Accepted: 04/08/2023] [Indexed: 05/05/2023]
Abstract
PURPOSE Pediatric solid tumors are significantly different from adult tumors. Studies have revealed genomic aberrations in pediatric solid tumors, but these analyses were based on Western populations. Currently, it is not known to what extent the existing genomic findings represent differences in ethnic backgrounds. EXPERIMENTAL DESIGN We retrospectively analyzed the basic clinical characteristics of the patients, including age, cancer type, and sex distribution, and further analyzed the somatic and germline mutations of cancer-related genes in a Chinese pediatric cohort. In addition, we investigated the clinical significance of genomic mutations for therapeutic, prognostic, diagnostic, and preventive actions. RESULTS Our study enrolled 318 pediatric patients, including 234 patients with CNS tumors and 84 patients with non-CNS tumors. Somatic mutation analysis showed that there were significant differences in mutation types between CNS tumors and non-CNS tumors. P/LP germline variants were identified in 8.49% of patients. In total, genomic findings prompted diagnostic actions in 42.8% of patients, prognostic actions in 37.7%, therapeutic actions in 58.2%, and tumor-predisposing and preventive actions in 8.5%, and we found that genomic findings might improve clinical management. CONCLUSIONS Our study is the first large-scale study to analyze the landscape of genetic mutations in pediatric patients with solid tumors in China. Genomic findings in CNS and non-CNS solid pediatric tumors provide evidence for the clinical classification and individualized treatment of pediatric tumors, and they will facilitate improvement of clinical management. Data presented in this study should serve as a reference to guide the future design of clinical trials.
Affiliation(s)
- Jie Gong
- Department of Neurosurgery, Qilu Hospital, Cheeloo College of Medicine, Shandong University, Jinan, 250012, Shandong, China
- Liujian Dong
- Department of Neurosurgery, Children's Hospital Affiliated to Zhengzhou University; Henan Children's Hospital; Zhengzhou Children's Hospital, Zhengzhou, 450000, Henan, China
- Chuanwei Wang
- Department of Neurosurgery, Qilu Hospital, Cheeloo College of Medicine, Shandong University, Jinan, 250012, Shandong, China
- Ningning Luo
- The Medical Department, The State Key Lab of Translational Medicine and Innovative Drug Development, Jiangsu Simcere Diagnostics Co., Ltd, Nanjing Simcere Medical Laboratory Science Co., Ltd, Jiangsu Simcere Diagnostics Co., Ltd, Nanjing, 210000, Jiangsu, China
- Tiantian Han
- The Medical Department, The State Key Lab of Translational Medicine and Innovative Drug Development, Jiangsu Simcere Diagnostics Co., Ltd, Nanjing Simcere Medical Laboratory Science Co., Ltd, Jiangsu Simcere Diagnostics Co., Ltd, Nanjing, 210000, Jiangsu, China
- Mengmeng Li
- The Medical Department, The State Key Lab of Translational Medicine and Innovative Drug Development, Jiangsu Simcere Diagnostics Co., Ltd, Nanjing Simcere Medical Laboratory Science Co., Ltd, Jiangsu Simcere Diagnostics Co., Ltd, Nanjing, 210000, Jiangsu, China
- Tingting Sun
- The Medical Department, The State Key Lab of Translational Medicine and Innovative Drug Development, Jiangsu Simcere Diagnostics Co., Ltd, Nanjing Simcere Medical Laboratory Science Co., Ltd, Jiangsu Simcere Diagnostics Co., Ltd, Nanjing, 210000, Jiangsu, China
- Ran Ding
- The Medical Department, The State Key Lab of Translational Medicine and Innovative Drug Development, Jiangsu Simcere Diagnostics Co., Ltd, Nanjing Simcere Medical Laboratory Science Co., Ltd, Jiangsu Simcere Diagnostics Co., Ltd, Nanjing, 210000, Jiangsu, China
- Bo Han
- The Key Laboratory of Experimental Teratology, Ministry of Education and Department of Pathology, School of Basic Medical Sciences, Cheeloo College of Medicine, Shandong University, Jinan, 250012, Shandong, China.
- Department of Pathology, Qilu Hospital, Cheeloo College of Medicine, Shandong University, Jinan, 250012, Shandong, China.
- Gang Li
- Department of Neurosurgery, Qilu Hospital, Cheeloo College of Medicine, Shandong University, Jinan, 250012, Shandong, China.
- Institute of Brain and Brain-Inspired Science, Shandong University, Jinan, 250012, Shandong, China.
- Shandong Key Laboratory of Brain Function Remodeling, Jinan, 250012, Shandong, China.
7
Zhai Y, Bardel C, Vallée M, Iwaz J, Roy P. Performance comparisons between clustering models for reconstructing NGS results from technical replicates. Front Genet 2023; 14:1148147. [PMID: 37007945 PMCID: PMC10060969 DOI: 10.3389/fgene.2023.1148147] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/19/2023] [Accepted: 03/06/2023] [Indexed: 03/18/2023] Open
Abstract
To improve the performance of individual DNA sequencing results, researchers often use replicates from the same individual and various statistical clustering models to reconstruct a high-performance callset. Here, three technical replicates of genome NA12878 were considered and five model types were compared (consensus, latent class, Gaussian mixture, Kamila–adapted k-means, and random forest) regarding four performance indicators: sensitivity, precision, accuracy, and F1-score. In comparison with no use of a combination model, i) the consensus model improved precision by 0.1%; ii) the latent class model brought 1% precision improvement (97%–98%) without compromising sensitivity (= 98.9%); iii) the Gaussian mixture model and random forest provided callsets with higher precisions (both >99%) but lower sensitivities; iv) Kamila increased precision (>99%) and kept a high sensitivity (98.8%); it showed the best overall performance. According to precision and F1-score indicators, the compared non-supervised clustering models that combine multiple callsets are able to improve sequencing performance vs. previously used supervised models. Among the models compared, the Gaussian mixture model and Kamila offered non-negligible precision and F1-score improvements. These models may be thus recommended for callset reconstruction (from either biological or technical replicates) for diagnostic or precision medicine purposes.
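The simplest of the combination models named in this abstract, the consensus model, and three of its four performance indicators can be illustrated with a toy example. This is a minimal sketch under stated assumptions, not the authors' implementation: the majority-vote rule (`min_support=2`), the function names, and the toy variants are all hypothetical. Accuracy is omitted because it also needs a count of true negatives, which the toy data does not define.

```python
from collections import Counter


def consensus_callset(replicates, min_support=2):
    """Keep variants called in at least `min_support` replicates (assumed majority vote)."""
    counts = Counter(variant for calls in replicates for variant in set(calls))
    return {variant for variant, n in counts.items() if n >= min_support}


def performance(calls, truth):
    """Sensitivity, precision and F1-score of a callset against a truth set."""
    tp = len(calls & truth)   # called and true
    fp = len(calls - truth)   # called but not true
    fn = len(truth - calls)   # true but missed
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    denom = precision + sensitivity
    f1 = 2 * precision * sensitivity / denom if denom else 0.0
    return {"sensitivity": sensitivity, "precision": precision, "f1": f1}


# Toy data (hypothetical): variants as (chromosome, position, ref, alt) tuples.
A, B, C, D = ("chr1", 100, "A", "G"), ("chr1", 200, "C", "T"), ("chr2", 50, "G", "A"), ("chr2", 75, "T", "C")
X, Y = ("chr3", 10, "A", "T"), ("chr3", 20, "G", "C")  # false positives
truth = {A, B, C, D}
replicates = [{A, B, C, X}, {A, B, D, Y}, {A, C, D, X}]

single = performance(replicates[0], truth)
combined = performance(consensus_callset(replicates), truth)
print("replicate 1:", single)
print("consensus  :", combined)
```

In this toy case the consensus callset recovers every true variant and drops the false positive seen in only one replicate, so both sensitivity and precision rise relative to a single replicate. Real replicates need not behave this way; the abstract reports only a 0.1% precision gain for the consensus model.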
Affiliation(s)
- Yue Zhai
- Université Lyon 1, Lyon, France
- Université de Lyon, Lyon, France
- Laboratoire de Biométrie et Biologie Évolutive, Villeurbanne, France
- *Correspondence: Yue Zhai,
- Claire Bardel
- Université Lyon 1, Lyon, France
- Université de Lyon, Lyon, France
- Laboratoire de Biométrie et Biologie Évolutive, Villeurbanne, France
- Service de Biostatistique-Bioinformatique, Hospices Civils de Lyon, Lyon, France
- Service de Génétique, Hospices Civils de Lyon, Bron, France
- Maxime Vallée
- Cellule Bioinformatique de La Plateforme de Séquençage Haut Débit NGS-HCL, Hospices Civils de Lyon, Bron, France
- Jean Iwaz
- Université Lyon 1, Lyon, France
- Université de Lyon, Lyon, France
- Laboratoire de Biométrie et Biologie Évolutive, Villeurbanne, France
- Service de Biostatistique-Bioinformatique, Hospices Civils de Lyon, Lyon, France
- Pascal Roy
- Université Lyon 1, Lyon, France
- Université de Lyon, Lyon, France
- Laboratoire de Biométrie et Biologie Évolutive, Villeurbanne, France
- Service de Biostatistique-Bioinformatique, Hospices Civils de Lyon, Lyon, France
8
Park H, Gim J. A comparative investigation of variant calling and genotyping for a single non-Caucasian whole genome. RESEARCH SQUARE 2023:rs.3.rs-2580940. [PMID: 36945432 PMCID: PMC10029055 DOI: 10.21203/rs.3.rs-2580940/v1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/08/2023]
Abstract
Most genome benchmark studies utilize hg38 as a reference genome (based on Caucasian and African samples) and 'NA12878' (a Caucasian sequencing read) for comparison. Here, we aimed to elucidate whether 1) ethnic match or mismatch between the reference genome and sequencing reads produces a distinct result; 2) there is an optimal work flow for single genome data. We assessed the performance of variant calling pipelines using hg38 and a Korean genome (reference genomes) and two whole-genome sequencing (WGS) reads from different ethnic origins: Caucasian (NA12878) and Korean. The pipelines used BWA-mem and Novoalign as mapping tools and GATK4, Strelka2, DeepVariant, and Samtools as variant callers. Using hg38 led to better performance (based on precision and recall), regardless of the ethnic origin of the WGS reads. Novoalign + GATK4 demonstrated best performance when using both WGS data. We assessed pipeline efficiency by removing the markduplicate process, and all pipelines, except Novoalign + DeepVariant, maintained their performance. Novoalign identified more variants overall and in MHC of chr6 when combined with GATK4. No evidence suggested improved variant calling performance from single WGS reads with a different ethnic reference, re-validating hg38 utility. We recommend using Novoalign + GATK4 without markduplication for single PCR-free WGS data.
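The comparison described in this abstract is a grid of mapping tools, variant callers, reference genomes, and read sets. The sketch below enumerates that grid to show how the individual runs could be organised. It assumes every combination is evaluated (a full factorial design, which the abstract does not state explicitly), and the list labels and dictionary keys are illustrative. The separate with/without duplicate-marking comparison is not modelled.

```python
from itertools import product

# Tools and inputs named in the abstract; a full cross of all factors is an assumption.
aligners = ["BWA-mem", "Novoalign"]
callers = ["GATK4", "Strelka2", "DeepVariant", "Samtools"]
references = ["hg38", "Korean reference genome"]
read_sets = ["NA12878 (Caucasian)", "Korean WGS"]

# A pipeline is one aligner paired with one variant caller.
pipelines = [f"{aligner} + {caller}" for aligner, caller in product(aligners, callers)]

# A run is one pipeline applied to one read set aligned against one reference.
runs = [
    {"pipeline": pipeline, "reference": reference, "reads": reads}
    for pipeline, reference, reads in product(pipelines, references, read_sets)
]

print(f"{len(pipelines)} pipelines, {len(runs)} runs")
for run in runs[:4]:
    print(run)
```

Two aligners and four callers give the 8 pipelines mentioned in the published version of this study; crossing them with two references and two read sets gives 32 runs under the full-factorial assumption.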
9
Lázaro-Guevara JM, Flores-Robles BJ, Garrido-Lopez KM, McKeown RJ, Flores-Morán AE, Labrador-Sánchez E, Pinillos-Aransay V, Trasahedo EA, López-Martín JA, Soberanis LSR, Melgar MY, Téllez-Arreola JL, Thébault SC. Identification of RP1 as the genetic cause of retinitis pigmentosa in a multi-generational pedigree using Extremely Low-Coverage Whole Genome Sequencing (XLC-WGS). Gene X 2023; 851:146956. [DOI: 10.1016/j.gene.2022.146956] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/10/2022] [Revised: 09/25/2022] [Accepted: 10/03/2022] [Indexed: 11/04/2022] Open
10
Pandya PH, Jannu AJ, Bijangi-Vishehsaraei K, Dobrota E, Bailey BJ, Barghi F, Shannon HE, Riyahi N, Damayanti NP, Young C, Malko R, Justice R, Albright E, Sandusky GE, Wurtz LD, Collier CD, Marshall MS, Gallagher RI, Wulfkuhle JD, Petricoin EF, Coy K, Trowbridge M, Sinn AL, Renbarger JL, Ferguson MJ, Huang K, Zhang J, Saadatzadeh MR, Pollok KE. Integrative Multi-OMICs Identifies Therapeutic Response Biomarkers and Confirms Fidelity of Clinically Annotated, Serially Passaged Patient-Derived Xenografts Established from Primary and Metastatic Pediatric and AYA Solid Tumors. Cancers (Basel) 2022; 15:259. [PMID: 36612255 PMCID: PMC9818438 DOI: 10.3390/cancers15010259] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/04/2022] [Revised: 12/22/2022] [Accepted: 12/23/2022] [Indexed: 01/04/2023] Open
Abstract
Establishment of clinically annotated, molecularly characterized, patient-derived xenografts (PDXs) from treatment-naïve and pretreated patients provides a platform to test precision genomics-guided therapies. An integrated multi-OMICS pipeline was developed to identify cancer-associated pathways and evaluate stability of molecular signatures in a panel of pediatric and AYA PDXs following serial passaging in mice. Original solid tumor samples and their corresponding PDXs were evaluated by whole-genome sequencing, RNA-seq, immunoblotting, pathway enrichment analyses, and the drug−gene interaction database to identify as well as cross-validate actionable targets in patients with sarcomas or Wilms tumors. While some divergence between original tumor and the respective PDX was evident, majority of alterations were not functionally impactful, and oncogenic pathway activation was maintained following serial passaging. CDK4/6 and BETs were prioritized as biomarkers of therapeutic response in osteosarcoma PDXs with pertinent molecular signatures. Inhibition of CDK4/6 or BETs decreased osteosarcoma PDX growth (two-way ANOVA, p < 0.05) confirming mechanistic involvement in growth. Linking patient treatment history with molecular and efficacy data in PDX will provide a strong rationale for targeted therapy and improve our understanding of which therapy is most beneficial in patients at diagnosis and in those already exposed to therapy.
Affiliation(s)
- Pankita H. Pandya
- Department of Pediatrics, Hematology/Oncology, Indiana University School of Medicine, Indianapolis, IN 46202, USA
- Department of Pediatrics, Herman B Wells Center for Pediatric Research, Indiana University School of Medicine, Indianapolis, IN 46202, USA
- Asha Jacob Jannu
- Department of Biostatistics & Health Data Science, Indiana University School of Medicine, Indianapolis, IN 46202, USA
- Khadijeh Bijangi-Vishehsaraei
- Department of Pediatrics, Hematology/Oncology, Indiana University School of Medicine, Indianapolis, IN 46202, USA
- Department of Pediatrics, Herman B Wells Center for Pediatric Research, Indiana University School of Medicine, Indianapolis, IN 46202, USA
- Erika Dobrota
- Department of Pediatrics, Herman B Wells Center for Pediatric Research, Indiana University School of Medicine, Indianapolis, IN 46202, USA
- Barbara J. Bailey
- Department of Pediatrics, Herman B Wells Center for Pediatric Research, Indiana University School of Medicine, Indianapolis, IN 46202, USA
- Farinaz Barghi
- Department of Pediatrics, Herman B Wells Center for Pediatric Research, Indiana University School of Medicine, Indianapolis, IN 46202, USA
- Department of Medical and Molecular Genetics, Indiana University School of Medicine, Indianapolis, IN 46202, USA
- Harlan E. Shannon
- Department of Pediatrics, Herman B Wells Center for Pediatric Research, Indiana University School of Medicine, Indianapolis, IN 46202, USA
- Niknam Riyahi
- Department of Pediatrics, Herman B Wells Center for Pediatric Research, Indiana University School of Medicine, Indianapolis, IN 46202, USA
- Department of Pharmacology and Toxicology, Indiana University School of Medicine, Indianapolis, IN 46202, USA
| | - Nur P. Damayanti
- Department of Neurological Surgery, Indiana University School of Medicine, Indianapolis, IN 46202, USA
| | - Courtney Young
- Department of Pediatrics, Herman B Wells Center for Pediatric Research, Indiana University School of Medicine, Indianapolis, IN 46202, USA
| | - Rada Malko
- Department of Pediatrics, Herman B Wells Center for Pediatric Research, Indiana University School of Medicine, Indianapolis, IN 46202, USA
| | - Ryli Justice
- Department of Pediatrics, Herman B Wells Center for Pediatric Research, Indiana University School of Medicine, Indianapolis, IN 46202, USA
| | - Eric Albright
- Department of Pathology, Indiana University School of Medicine, Indianapolis, IN 46202, USA
| | - George E. Sandusky
- Department of Pathology, Indiana University School of Medicine, Indianapolis, IN 46202, USA
| | - L. Daniel Wurtz
- Department of Orthopedics Surgery, Indiana University School of Medicine, Indianapolis, IN 46202, USA
| | - Christopher D. Collier
- Department of Orthopedics Surgery, Indiana University School of Medicine, Indianapolis, IN 46202, USA
| | - Mark S. Marshall
- Department of Pediatrics, Hematology/Oncology, Indiana University School of Medicine, Indianapolis, IN 46202, USA
| | - Rosa I. Gallagher
- Center for Applied Proteomics and Molecular Medicine, Institute for Biomedical Innovation, George Mason University, Manassas, VA 20110, USA
| | - Julia D. Wulfkuhle
- Center for Applied Proteomics and Molecular Medicine, Institute for Biomedical Innovation, George Mason University, Manassas, VA 20110, USA
| | - Emanuel F. Petricoin
- Center for Applied Proteomics and Molecular Medicine, Institute for Biomedical Innovation, George Mason University, Manassas, VA 20110, USA
| | - Kathy Coy
- Preclinical Modeling and Therapeutics Core, Indiana University Melvin and Bren Simon Comprehensive Cancer Center, Indiana University School of Medicine, Indianapolis, IN 46202, USA
| | - Melissa Trowbridge
- Preclinical Modeling and Therapeutics Core, Indiana University Melvin and Bren Simon Comprehensive Cancer Center, Indiana University School of Medicine, Indianapolis, IN 46202, USA
| | - Anthony L. Sinn
- Preclinical Modeling and Therapeutics Core, Indiana University Melvin and Bren Simon Comprehensive Cancer Center, Indiana University School of Medicine, Indianapolis, IN 46202, USA
| | - Jamie L. Renbarger
- Department of Pediatrics, Hematology/Oncology, Indiana University School of Medicine, Indianapolis, IN 46202, USA
| | - Michael J. Ferguson
- Department of Pediatrics, Hematology/Oncology, Indiana University School of Medicine, Indianapolis, IN 46202, USA
| | - Kun Huang
- Department of Biostatistics & Health Data Science, Indiana University School of Medicine, Indianapolis, IN 46202, USA
| | - Jie Zhang
- Department of Medical and Molecular Genetics, Indiana University School of Medicine, Indianapolis, IN 46202, USA
| | - M. Reza Saadatzadeh
- Department of Pediatrics, Hematology/Oncology, Indiana University School of Medicine, Indianapolis, IN 46202, USA
- Department of Pediatrics, Herman B Wells Center for Pediatric Research, Indiana University School of Medicine, Indianapolis, IN 46202, USA
| | - Karen E. Pollok
- Department of Pediatrics, Hematology/Oncology, Indiana University School of Medicine, Indianapolis, IN 46202, USA
- Department of Pediatrics, Herman B Wells Center for Pediatric Research, Indiana University School of Medicine, Indianapolis, IN 46202, USA
| |
|
11
|
Betschart RO, Thiéry A, Aguilera-Garcia D, Zoche M, Moch H, Twerenbold R, Zeller T, Blankenberg S, Ziegler A. Comparison of calling pipelines for whole genome sequencing: an empirical study demonstrating the importance of mapping and alignment. Sci Rep 2022; 12:21502. [PMID: 36513709 PMCID: PMC9748128 DOI: 10.1038/s41598-022-26181-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/06/2022] [Accepted: 12/12/2022] [Indexed: 12/14/2022] Open
Abstract
Rapid advances in high-throughput DNA sequencing technologies have enabled the conduct of whole genome sequencing (WGS) studies, and several bioinformatics pipelines have become available. The aim of this study was to compare 6 WGS data pre-processing pipelines, involving two mapping and alignment approaches (GATK utilizing BWA-MEM2 2.2.1, and DRAGEN 3.8.4) and three variant calling pipelines (GATK 4.2.4.1, DRAGEN 3.8.4 and DeepVariant 1.1.0). We sequenced one Genome in a Bottle (GIAB) sample 70 times in different runs, and one GIAB trio in triplicate. The GIAB truth sets were used for comparison, and performance was assessed by computation time, F1 score, precision, and recall. In the mapping and alignment step, the DRAGEN pipeline was faster than the GATK with BWA-MEM2 pipeline. DRAGEN showed systematically higher F1 score, precision, and recall values than GATK for single nucleotide variations (SNVs) and Indels in simple-to-map, complex-to-map, coding and non-coding regions. In the variant calling step, DRAGEN was fastest. In terms of accuracy, DRAGEN and DeepVariant performed similarly and were both superior to GATK, with slight advantages for DRAGEN for Indels and for DeepVariant for SNVs. The DRAGEN pipeline showed the lowest Mendelian inheritance error fraction for the GIAB trios. Mapping and alignment played a key role in variant calling of WGS, with DRAGEN outperforming GATK.
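The accuracy metrics named in this abstract (precision, recall, F1 score) follow from counts of true-positive, false-positive, and false-negative variant calls against a truth set. The sketch below is only an illustration of those standard definitions; the class name, function names, and example counts are hypothetical and are not taken from the study or from any benchmarking tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CallCounts:
    """Variant-call counts against a truth set (hypothetical container)."""
    tp: int  # calls that match the truth set
    fp: int  # calls absent from the truth set
    fn: int  # truth-set variants that were not called


def precision(c: CallCounts) -> float:
    """Fraction of reported calls that are correct."""
    called = c.tp + c.fp
    return c.tp / called if called else 0.0


def recall(c: CallCounts) -> float:
    """Fraction of truth-set variants that were recovered."""
    truth = c.tp + c.fn
    return c.tp / truth if truth else 0.0


def f1_score(c: CallCounts) -> float:
    """Harmonic mean of precision and recall."""
    p, r = precision(c), recall(c)
    return 2 * p * r / (p + r) if (p + r) else 0.0


# Made-up counts for illustration only; not data from the paper.
example = CallCounts(tp=90, fp=10, fn=30)
print(f"precision={precision(example):.3f} "
      f"recall={recall(example):.3f} F1={f1_score(example):.3f}")
```

Because F1 is a harmonic mean, a pipeline cannot reach a high F1 by trading many false positives for few false negatives or the reverse, which is why it is a common single-number summary when pipelines are compared.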
Affiliation(s)
- Raphael O. Betschart
- Cardio-CARE, Medizincampus Davos, Herman-Burchard-Str. 1, 7265 Davos Wolfgang, Switzerland
| | - Alexandre Thiéry
- Cardio-CARE, Medizincampus Davos, Herman-Burchard-Str. 1, 7265 Davos Wolfgang, Switzerland
| | - Domingo Aguilera-Garcia
- Institute of Pathology and Molecular Pathology, University Hospital Zurich, Schmelzbergstrasse 12, 8091 Zurich, Switzerland
| | - Martin Zoche
- Institute of Pathology and Molecular Pathology, University Hospital Zurich, Schmelzbergstrasse 12, 8091 Zurich, Switzerland
| | - Holger Moch
- Institute of Pathology and Molecular Pathology, University Hospital Zurich, Schmelzbergstrasse 12, 8091 Zurich, Switzerland
| | - Raphael Twerenbold
- Department of Cardiology, University Heart & Vascular Center, University Medical Center Hamburg Eppendorf, Martinistr. 52, 20251 Hamburg, Germany
- University Center of Cardiovascular Research Hamburg, University Medical Center Hamburg Eppendorf, Martinistr. 52, 20251 Hamburg, Germany
- German Center for Cardiovascular Research (DZHK), Partner Site Hamburg/Kiel/Lübeck, Hamburg, Germany
| | - Tanja Zeller
- Department of Cardiology, University Heart & Vascular Center, University Medical Center Hamburg Eppendorf, Martinistr. 52, 20251 Hamburg, Germany
- University Center of Cardiovascular Research Hamburg, University Medical Center Hamburg Eppendorf, Martinistr. 52, 20251 Hamburg, Germany
- German Center for Cardiovascular Research (DZHK), Partner Site Hamburg/Kiel/Lübeck, Hamburg, Germany
| | - Stefan Blankenberg
- Cardio-CARE, Medizincampus Davos, Herman-Burchard-Str. 1, 7265 Davos Wolfgang, Switzerland
- Department of Cardiology, University Heart & Vascular Center, University Medical Center Hamburg Eppendorf, Martinistr. 52, 20251 Hamburg, Germany
- University Center of Cardiovascular Research Hamburg, University Medical Center Hamburg Eppendorf, Martinistr. 52, 20251 Hamburg, Germany
- German Center for Cardiovascular Research (DZHK), Partner Site Hamburg/Kiel/Lübeck, Hamburg, Germany
| | - Andreas Ziegler
- Cardio-CARE, Medizincampus Davos, Herman-Burchard-Str. 1, 7265 Davos Wolfgang, Switzerland
- Department of Cardiology, University Heart & Vascular Center, University Medical Center Hamburg Eppendorf, Martinistr. 52, 20251 Hamburg, Germany
- School of Mathematics, Statistics and Computer Science, Scottsville, Private Bag X01, Pietermaritzburg, 3209 South Africa
| |
|
12
|
Qin Y, Li F, Tan Y, Duan Q, Zhang Q. Case report: Dramatic response to alectinib in a lung adenosquamous carcinoma patient harbouring a novel CPE-ALK fusion. Front Oncol 2022; 12:998545. [PMID: 37082099 PMCID: PMC10111186 DOI: 10.3389/fonc.2022.998545] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/20/2022] [Accepted: 11/18/2022] [Indexed: 12/03/2022] Open
Abstract
Lung adenosquamous carcinoma (ASC) is a rare histological subtype of lung cancer accounting for 0.4%–4% of all lung cancers. ASC is generally considered to be an aggressive cancer with poor prognosis. There is no specific standard treatment for ASC, and current treatment of ASC relies on the guidelines for non-small cell lung cancer (NSCLC). To date, only sporadic canonical EML4-ALK fusions have been reported in ASC patients, and the efficacy of ALK-TKIs is still unclear in non-canonical ALK fusion-positive ASC patients. Here we describe the case of a stage IV ASC patient harboring a novel CPE-ALK fusion detected via a 74-gene panel analysis. Interestingly, TP53 was wild-type and no other somatic mutation was found among the 74 genes. In addition, immunohistochemical (IHC) staining also supported an oncogenic role for the CPE-ALK fusion. Based on these findings, the patient received alectinib 600 mg twice daily. After 4 months on treatment, the patient achieved a radiological partial response (PR) and his symptoms were significantly relieved. Imaging showed that the patient's lesions were reduced, and the clinical evaluation was PR. To the best of our knowledge, this is the first report of a dramatic tumor response to alectinib in a patient with ASC harboring a CPE-ALK fusion. In addition, targeted NGS analysis may improve detection of ALK fusions in routine practice.
Affiliation(s)
- Yanyan Qin
- Department of Respiratory and Critical Care Medicine, Shanxi Provincial People’s Hospital, Shanxi, China
- *Correspondence: Yanyan Qin
| | - Fei Li
- Department of Respiratory and Critical Care Medicine, Shanxi Provincial People’s Hospital, Shanxi, China
| | - Yuan Tan
- The Medical Department, Jiangsu Simcere Diagnostics Co., Ltd, Nanjing, China
- Nanjing Simcere Medical Laboratory Science Co., Ltd, Nanjing, China
- The State Key Lab of Translational Medicine and Innovative Drug Development, Jiangsu Simcere Diagnostics Co., Ltd, Nanjing, China
| | - Qianqian Duan
- The Medical Department, Jiangsu Simcere Diagnostics Co., Ltd, Nanjing, China
- Nanjing Simcere Medical Laboratory Science Co., Ltd, Nanjing, China
- The State Key Lab of Translational Medicine and Innovative Drug Development, Jiangsu Simcere Diagnostics Co., Ltd, Nanjing, China
| | - Qin Zhang
- The Medical Department, Jiangsu Simcere Diagnostics Co., Ltd, Nanjing, China
- Nanjing Simcere Medical Laboratory Science Co., Ltd, Nanjing, China
- The State Key Lab of Translational Medicine and Innovative Drug Development, Jiangsu Simcere Diagnostics Co., Ltd, Nanjing, China
| |
|
13
|
Samlali K, Thornbury M, Venter A. Community-led risk analysis of direct-to-consumer whole-genome sequencing. Biochem Cell Biol 2022; 100:499-509. [PMID: 35939839 DOI: 10.1139/bcb-2021-0506] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/14/2022] Open
Abstract
Direct-to-consumer (DTC) genetic testing is cheaper and more accessible than ever before; however, the intention to combine, reuse, and resell this genetic information as powerful data sets is generally hidden from the consumer. This financial gain is creating a competitive DTC market, reducing the price of whole-genome sequencing (WGS) to under 300 USD. Entering this transition from single-nucleotide polymorphism-based DTC testing to WGS DTC testing, individuals looking for access to their whole-genomic information face new privacy and security risks. Differences between WGS and other methods of consumer genetic tests are left unexplored by regulation, leading to the application of legal data anonymization methods on whole-genome data, and questionable consent methods. Large representative genomic data sets are important for research and improve the standard of medicine and personalized care. However, these data can also be used by market players, law enforcement, and governments for surveillance, population analyses, marketing purposes, and discrimination. Here, we present a summary of the state of WGS DTC genetic testing and its current regulation, through a community-based lens to expose dual-use risks in consumer-facing biotechnologies.
Affiliation(s)
- Kenza Samlali
- BricoBio Community Biology Lab, Montréal, QC, Canada
- Centre for Applied Synthetic Biology, Concordia University, Montréal, QC, Canada
- Department of Electrical and Computer Engineering, Concordia University, Montréal, QC, Canada
| | - Mackenzie Thornbury
- BricoBio Community Biology Lab, Montréal, QC, Canada
- Centre for Applied Synthetic Biology, Concordia University, Montréal, QC, Canada
- Department of Biology, Concordia University, Montréal, QC, Canada
| | - Andrei Venter
- BricoBio Community Biology Lab, Montréal, QC, Canada
| |
|
14
|
Dong C, Cheng W, Zhang M, Li S, Zhao L, Chen D, Qin Y, Xiao M, Fang S. Genomic profiling of non-small cell lung cancer with the rare pulmonary lymphangitic carcinomatosis and clinical outcome of the exploratory anlotinib treatment. Front Oncol 2022; 12:992596. [PMID: 36324591 PMCID: PMC9620420 DOI: 10.3389/fonc.2022.992596] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/12/2022] [Accepted: 09/30/2022] [Indexed: 11/17/2022] Open
Abstract
Background To evaluate potential treatments for patients with non-small cell lung cancer (NSCLC) and rare malignant pulmonary lymphangitic carcinomatosis (PLC), our study provided a genomic profile and clinical outcomes of this group of patients. Methods We retrospectively reviewed patients with NSCLC who developed PLC. The genomic alterations, tumor mutation burden (TMB), and microsatellite instability (MSI) based on DNA-based next-generation sequencing were reviewed and compared with those of a Chinese population with lung adenocarcinomas (Chinese-LUAD cohort). Clinical outcomes after exploratory anlotinib treatment and factors influencing survival are summarized. Results A total of 564 patients with stage IV NSCLC were reviewed, and 39 patients with PLC were included. Genomic profiling of 17 adenocarcinoma patients with PLC (PLC-LUAD cohort) revealed TP53, EGFR, and LRP1B as the three most frequently altered genes. EGFR was mutated less often in the PLC-LUAD cohort than in the Chinese-LUAD cohort of 778 patients (35.3% vs. 60.9%, P = 0.043). BRIP1 was mutated more often in the PLC-LUAD cohort (11.8% vs. 1.8%, P = 0.043). Two patients presented with high tumor mutational burden (TMB-H, 10 mutations/Mb). Combining alterations in the patient with squamous cell carcinoma, the most altered pathways of PLC included cell cycle/DNA damage, chromatin modification, the RTK/Ras/MAPK pathway and VEGF signaling changes. Fourteen of the participants received anlotinib treatment. The ORR and DCR were 57.1% and 92.9%, respectively. Patients achieved a median progression-free survival of 4.9 months and a median overall survival of 7 months. The adverse effects were manageable. In patients with adenocarcinoma, the mPFS (5.3 months vs. 2.6 months) and mOS (9.9 months vs. 4.5 months) were prolonged in patients receiving anlotinib treatment compared to those receiving other treatment strategies (P < 0.05). Conclusion Patients with PLC in NSCLC demonstrated distinct genetic alterations. The results improve our understanding of the plausible genetic underpinnings of tumorigenesis in PLC and potential treatment strategies. Exploratory anlotinib treatment achieved considerable benefits and demonstrated manageable safety.
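The ORR and DCR figures in this abstract can be related to patient counts through the usual definitions: objective response rate is the share of patients with a complete or partial response, and disease control rate additionally counts stable disease. The sketch below is a minimal illustration under assumptions: the abstract reports only the percentages and the 14 treated patients, so the totals of 8 responders and 13 patients with disease control are back-calculated, and the split between complete and partial responses is purely hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BestResponses:
    """Best overall response counts (hypothetical container)."""
    complete: int
    partial: int
    stable: int
    progressive: int

    @property
    def total(self) -> int:
        return self.complete + self.partial + self.stable + self.progressive


def objective_response_rate(r: BestResponses) -> float:
    """ORR (%) = (complete + partial responses) / evaluable patients."""
    return 100.0 * (r.complete + r.partial) / r.total


def disease_control_rate(r: BestResponses) -> float:
    """DCR (%) = (complete + partial + stable) / evaluable patients."""
    return 100.0 * (r.complete + r.partial + r.stable) / r.total


# Assumed breakdown: only the sums 8/14 and 13/14 are implied by the
# reported percentages; the CR/PR split is not given in the abstract.
anlotinib = BestResponses(complete=0, partial=8, stable=5, progressive=1)
print(f"ORR = {objective_response_rate(anlotinib):.1f}%")
print(f"DCR = {disease_control_rate(anlotinib):.1f}%")
```

With 14 evaluable patients, one patient more or fewer shifts either rate by about 7 percentage points, which is worth keeping in mind when reading rates from a cohort of this size.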
Affiliation(s)
- Changqing Dong
- Department of Thoracic Surgery, Nanjing Chest hospital, The Affiliated Brain Hospital of Nanjing Medical University, Nanjing, China
| | - Wanwan Cheng
- Department of Respiratory Medicine, Nanjing Chest hospital, The Affiliated Brain Hospital of Nanjing Medical University, Nanjing, China
| | - Meiling Zhang
- Department of Oncology, The First Affiliated Hospital of Nanjing Medical University, Nanjing, China
| | - Si Li
- Nanjing Simcere Medical Laboratory Science Co., Ltd, The State Key Laboratory of Translational Medicine and Innovative Drug Development, Jiangsu Simcere Diagnostics Co., Ltd, Nanjing, China
| | - Lele Zhao
- Nanjing Simcere Medical Laboratory Science Co., Ltd, The State Key Laboratory of Translational Medicine and Innovative Drug Development, Jiangsu Simcere Diagnostics Co., Ltd, Nanjing, China
| | - Dongsheng Chen
- Nanjing Simcere Medical Laboratory Science Co., Ltd, The State Key Laboratory of Translational Medicine and Innovative Drug Development, Jiangsu Simcere Diagnostics Co., Ltd, Nanjing, China
| | - Yong Qin
- Nanjing Simcere Medical Laboratory Science Co., Ltd, The State Key Laboratory of Translational Medicine and Innovative Drug Development, Jiangsu Simcere Diagnostics Co., Ltd, Nanjing, China
| | - Mingzhe Xiao
- Nanjing Simcere Medical Laboratory Science Co., Ltd, The State Key Laboratory of Translational Medicine and Innovative Drug Development, Jiangsu Simcere Diagnostics Co., Ltd, Nanjing, China
| | - Shencun Fang
- Department of Respiratory Medicine, Nanjing Chest hospital, The Affiliated Brain Hospital of Nanjing Medical University, Nanjing, China
- *Correspondence: Shencun Fang
| |
|
15
|
Guan Y, Wang Y, Li H, Meng J, You X, Zhu X, Zhang Q, Sun T, Qi C, An G, Fan Y, Xu B. Molecular and clinicopathological characteristics of ERBB2 gene fusions in 32,131 Chinese patients with solid tumors. Front Oncol 2022; 12:986674. [PMID: 36276102 PMCID: PMC9582139 DOI: 10.3389/fonc.2022.986674] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 07/05/2022] [Accepted: 09/15/2022] [Indexed: 12/04/2022] Open
Abstract
ERBB2 amplification is one of the most important and mature targets for HER2-targeted drug therapy. Somatic mutations of ERBB2 in the tyrosine kinase domain have been studied extensively and play a role in response to anti-HER2 therapy across different cancer types. However, ERBB2 fusions have received little attention, and their relevance to HER2-targeted therapy is unclear. We comprehensively characterized ERBB2 fusions from next-generation sequencing (NGS) data collected between May 2018 and October 2021 in 32,131 various solid tumors. Among the tumors, 0.28% harbored ERBB2 fusions, which occurred more commonly in gastroesophageal junction cancer (3.12%; 3/96), breast cancer (1.89%; 8/422), urothelial carcinoma (1.72%; 1/58), and gastric cancer (1.60%; 23/1,437). Our population had a median age of 65 years (range 28 to 88 years) and a high proportion of men (55 men vs 34 women; 61.80%). Among the patients with ERBB2 fusions, TP53 (82%), APC (18%), and CDK4 (15%) were the top three co-mutated genes. Moreover, most patients with ERBB2 fusions also had ERBB2 amplification (75.28%; 67/89), which was similar to the data in the TCGA database (88.00%; 44/50). Furthermore, the TCGA database shows that patients with ERBB2 fusions had a worse prognosis than those without ERBB2 fusions, both pan-cancer and in breast cancer. In addition, patients with ERBB2 amplification combined with ERBB2 fusion had a worse prognosis than those with only ERBB2 amplification. ERBB2 fusion may interfere with the effect of anti-HER2-targeted antibody drugs and influence the prognosis of patients with ERBB2 amplification. Prospective clinical trials are warranted to confirm these results in the future.
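The abstract reports several frequencies together with their numerators and denominators, so the percentages can be recomputed directly. The sketch below is a simple consistency check, not part of the study's analysis; the dictionary layout, function names, and the 0.01 tolerance are assumptions made for illustration. The overall count of 89 fusion-positive patients is taken from the "67/89" and "55 men vs 34 women" figures in the abstract.

```python
# Each entry: label -> (numerator, denominator, percentage as reported).
REPORTED = {
    "gastroesophageal junction cancer": (3, 96, 3.12),
    "breast cancer": (8, 422, 1.89),
    "urothelial carcinoma": (1, 58, 1.72),
    "gastric cancer": (23, 1437, 1.60),
    "men among fusion-positive patients": (55, 89, 61.80),
    "co-occurring ERBB2 amplification": (67, 89, 75.28),
    "co-occurring amplification (TCGA)": (44, 50, 88.00),
    "overall fusion prevalence": (89, 32131, 0.28),
}


def percent(numerator: int, denominator: int) -> float:
    """Return numerator / denominator expressed as a percentage."""
    return 100.0 * numerator / denominator


def deviations(reported: dict) -> dict:
    """Absolute gap between each recomputed and reported percentage."""
    return {
        label: abs(percent(n, d) - pct)
        for label, (n, d, pct) in reported.items()
    }


for label, gap in deviations(REPORTED).items():
    n, d, pct = REPORTED[label]
    print(f"{label}: {n}/{d} = {percent(n, d):.3f}% "
          f"(reported {pct:.2f}%, gap {gap:.3f})")
```

Under these inputs every recomputed value lies within 0.01 percentage points of the reported figure; the small gaps that remain are the kind left by rounding or truncation to two decimals.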
Affiliation(s)
- Yin Guan
- Department of Medical Oncology, Beijing Chao-Yang Hospital, Capital Medical University, Beijing, China
| | - Yutong Wang
- Department of Oncology, Shengjing Hospital of China Medical University, Shenyang, China
| | - Hongxia Li
- Department of Oncology, Shanxi Provincial People’s Hospital, Taiyuan, China
| | - Jing Meng
- Department of Medical Oncology, The Chinese People's Liberation Army (PLA) General Hospital, Beijing, China
| | - Xia You
- The Medical Department, Jiangsu Simcere Diagnostics Co., Ltd, Nanjing, China
- Medical Department, Nanjing Simcere Medical Laboratory Science Co., Ltd, Nanjing, China
- The State Key Lab of Translational Medicine and Innovative Drug Development, Jiangsu Simcere Diagnostics Co., Ltd, Nanjing, China
| | - Xiaofeng Zhu
- The Medical Department, Jiangsu Simcere Diagnostics Co., Ltd, Nanjing, China
- Medical Department, Nanjing Simcere Medical Laboratory Science Co., Ltd, Nanjing, China
- The State Key Lab of Translational Medicine and Innovative Drug Development, Jiangsu Simcere Diagnostics Co., Ltd, Nanjing, China
| | - Qin Zhang
- The Medical Department, Jiangsu Simcere Diagnostics Co., Ltd, Nanjing, China
- Medical Department, Nanjing Simcere Medical Laboratory Science Co., Ltd, Nanjing, China
- The State Key Lab of Translational Medicine and Innovative Drug Development, Jiangsu Simcere Diagnostics Co., Ltd, Nanjing, China
| | - Tingting Sun
- The Medical Department, Jiangsu Simcere Diagnostics Co., Ltd, Nanjing, China
- Medical Department, Nanjing Simcere Medical Laboratory Science Co., Ltd, Nanjing, China
- The State Key Lab of Translational Medicine and Innovative Drug Development, Jiangsu Simcere Diagnostics Co., Ltd, Nanjing, China
| | - Chuang Qi
- The Medical Department, Jiangsu Simcere Diagnostics Co., Ltd, Nanjing, China
- Medical Department, Nanjing Simcere Medical Laboratory Science Co., Ltd, Nanjing, China
- The State Key Lab of Translational Medicine and Innovative Drug Development, Jiangsu Simcere Diagnostics Co., Ltd, Nanjing, China
| | - Guangyu An
- Department of Medical Oncology, Beijing Chao-Yang Hospital, Capital Medical University, Beijing, China
- *Correspondence: Guangyu An; Binghe Xu; Ying Fan
| | - Ying Fan
- Department of Medical Oncology, National Cancer Center/National Clinical Research Center for Cancer/Cancer Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China
- *Correspondence: Guangyu An; Binghe Xu; Ying Fan
| | - Binghe Xu
- Department of Medical Oncology, National Cancer Center/National Clinical Research Center for Cancer/Cancer Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China
- *Correspondence: Guangyu An; Binghe Xu; Ying Fan
| |
|
16
|
Zeng TM, Pan YF, Yuan ZG, Chen DS, Song YJ, Gao Y. Immune-related RNA signature predicts outcome of PD-1 inhibitor-combined GEMCIS therapy in advanced intrahepatic cholangiocarcinoma. Front Immunol 2022; 13:943066. [PMID: 36159865 PMCID: PMC9501891 DOI: 10.3389/fimmu.2022.943066] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/13/2022] [Accepted: 08/16/2022] [Indexed: 11/13/2022] Open
Abstract
Background Immune checkpoint inhibitor (ICI)-combined chemotherapy in advanced intrahepatic cholangiocarcinoma (ICC) has been shown to be more efficacious in a series of clinical trials. However, whether the tumor microenvironment (TME) plays a vital role in immune-combined therapy has not been rigorously evaluated. Methods First, we assayed the immunogenic properties of GEM-based chemotherapy. Then, 12 ICC patients treated with a PD-1 inhibitor (sintilimab) combined with gemcitabine and cisplatin (GemCis) from a phase 2 clinical trial (ChiCTR2000036652) were included, and their immune-related gene expression profiles were analyzed using RNA from baseline tumor samples. An immune-related signature correlating with clinical outcome was identified from the 12 ICC patients, and its predictive value was validated in an ICC cohort of 26 patients. Multiplexed immunofluorescence (mIF) and flow cytometry (FCM) analyses were performed to evaluate the relationship between immune-related molecules and therapeutic outcomes. Results GEM-based chemotherapy induced immunogenic cell death of cholangiocarcinoma cells, together with increased CD274 expression. In an ICC cohort, we found that upregulation of immune-checkpoint molecules and immune response-related pathways was significantly related to better clinical outcome. In contrast, baseline immune-cell proportions in tumor tissues did not differ between responders and non-responders. An immune-related signature (including six genes) correlating with clinical outcome was identified from the 12 ICC patients, and its predictive value was validated in a small ICC cohort of 26 patients. Conclusion An immune-related RNA signature predicts the outcome of PD-1 inhibitor-combined GEMCIS therapy in advanced intrahepatic cholangiocarcinoma and could be tested as a biomarker for immune-chemotherapy in the future.
Affiliation(s)
- Tian-mei Zeng
- School of Medicine, Tongji University, Shanghai, China
- Department of Oncology, Eastern Hepatobiliary Surgery Hospital, Shanghai, China
| | - Yu-fei Pan
- International Cooperation Laboratory on Signal Transduction, Eastern Hepatobiliary Surgery Hospital, Shanghai, China
| | - Zhen-gang Yuan
- Department of Oncology, Eastern Hepatobiliary Surgery Hospital, Shanghai, China
| | - Dong-sheng Chen
- Jiangsu Simcere Diagnostics Co., Ltd, The State Key Laboratory of Translational Medicine and Innovative Drug Development, Nanjing, China
| | - Yun-jie Song
- Jiangsu Simcere Diagnostics Co., Ltd, The State Key Laboratory of Translational Medicine and Innovative Drug Development, Nanjing, China
| | - Yong Gao
- School of Medicine, Tongji University, Shanghai, China
- Department of Oncology, Shanghai East Hospital, Shanghai, China
- *Correspondence: Yong Gao
| |
|
17
|
Ma H, Zhang Q, Zhao Y, Zhang Y, Zhang J, Chen G, Tan Y, Zhang Q, Duan Q, Sun T, Qi C, Li F. Molecular and Clinicopathological Characteristics of Lung Cancer Concomitant Chronic Obstructive Pulmonary Disease (COPD). Int J Chron Obstruct Pulmon Dis 2022; 17:1601-1612. [PMID: 35860812 PMCID: PMC9293488 DOI: 10.2147/copd.s363482] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/22/2022] [Accepted: 06/25/2022] [Indexed: 11/23/2022] Open
Abstract
Introduction Chronic obstructive pulmonary disease (COPD) and lung cancer often coexist, but the pathophysiology and genomic features of this combination are still unclear. Methods In this study, we retrospectively collected patients with lung cancer concomitant with COPD (COPD-LC) and patients with non-COPD lung cancer (non-COPD-LC) who had undergone next-generation sequencing (NGS) and had clinicopathological information available. The COPD-LC data from the TCGA cohort were collected for further analysis. Results A total of 51 COPD-LC patients and 88 non-COPD-LC patients were included in the study. Clinicopathological analysis showed that the proportions of male, older, and smoking patients were all substantially higher in the COPD-LC group than in the non-COPD-LC group (all P<0.01). Comparing the genomic data of the two groups in our cohort, COPD-LC had a higher mutation frequency of LRP1B (43% vs 9%, P = 0.001), EPHA5 (24% vs 1%, P = 0.002), PRKDC (14% vs 1%, P = 0.039), PREX2 (14% vs 0%, P = 0.012), and FAT1 (14% vs 0%, P = 0.012), genes that have been related to improved tumor immunity. The immunotherapy biomarkers PD-L1 positive expression (62.5% vs 52.0%, P = 0.397) and tumor mutation burden (TMB, median TMB: 7.09 vs 2.94, P = 0.004) were also higher in COPD-LC. In addition, RNA data from TCGA further indicated that tumor immunity was increased in COPD-LC. However, COPD-LC had a lower frequency of EGFR mutation (19% vs 50%, P = 0.013), and EGFR-mutant COPD-LC treated with EGFR-TKI had worse progression-free survival (PFS) (HR = 3.52, 95% CI: 1.27–9.80, P = 0.01). Conclusion In this retrospective study, we explored for the first time the molecular features of COPD-LC in a Chinese population. Although COPD-LC had a lower EGFR mutation frequency and worse PFS with targeted treatment, high PD-L1 expression and TMB indicated that these patients may benefit from immunotherapy.
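A reported hazard ratio and its 95% confidence interval can be checked for internal consistency, because a Wald-type interval is symmetric on the log scale. The sketch below assumes that the interval for the PFS comparison (HR = 3.52, 95% CI: 1.27–9.80) was constructed that way, which the abstract does not state; the function names are hypothetical and only the Python standard library is used.

```python
import math

Z_95 = 1.959964  # standard normal quantile for a two-sided 95% interval


def log_scale_midpoint(lower: float, upper: float) -> float:
    """Geometric mean of the CI bounds; should be close to the reported HR."""
    return math.sqrt(lower * upper)


def standard_error(lower: float, upper: float, z_crit: float = Z_95) -> float:
    """Standard error of log(HR) implied by the CI width on the log scale."""
    return (math.log(upper) - math.log(lower)) / (2.0 * z_crit)


def wald_p_value(hr: float, lower: float, upper: float) -> float:
    """Two-sided p-value for log(HR) = 0 under a normal approximation."""
    z = math.log(hr) / standard_error(lower, upper)
    return math.erfc(abs(z) / math.sqrt(2.0))


hr, lower, upper = 3.52, 1.27, 9.80  # values reported in the abstract
print(f"midpoint of CI on log scale: {log_scale_midpoint(lower, upper):.2f}")
print(f"implied SE of log(HR):       {standard_error(lower, upper):.3f}")
print(f"approximate two-sided p:     {wald_p_value(hr, lower, upper):.3f}")
```

The approximate p-value from this reconstruction can be compared with the reported P = 0.01; a modest difference is expected if the original figure came from a log-rank or likelihood-ratio test rather than a Wald test.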
Affiliation(s)
- Hongxia Ma
- Pneumology Department, The Fourth Affiliated Hospital of Xinjiang Medical University, Urumqi, The Xinjiang Uygur Autonomous Region, People's Republic of China
| | - Qian Zhang
- Pneumology Department, The Fourth Affiliated Hospital of Xinjiang Medical University, Urumqi, The Xinjiang Uygur Autonomous Region, People's Republic of China
| | - Yanwen Zhao
- Pneumology Department, The Fourth Affiliated Hospital of Xinjiang Medical University, Urumqi, The Xinjiang Uygur Autonomous Region, People's Republic of China
| | - Yaohui Zhang
- Pneumology Department, The Fourth Affiliated Hospital of Xinjiang Medical University, Urumqi, The Xinjiang Uygur Autonomous Region, People's Republic of China
| | - Jingjing Zhang
- Pneumology Department, The Fourth Affiliated Hospital of Xinjiang Medical University, Urumqi, The Xinjiang Uygur Autonomous Region, People's Republic of China
| | - Guoqing Chen
- Pneumology Department, The Fourth Affiliated Hospital of Xinjiang Medical University, Urumqi, The Xinjiang Uygur Autonomous Region, People's Republic of China
| | - Yuan Tan
- The Medical Department, Jiangsu Simcere Diagnostics Co., Ltd, Nanjing, Jiangsu Province, People's Republic of China
- Nanjing Simcere Medical Laboratory Science Co., Ltd, Nanjing, Jiangsu Province, People's Republic of China
- The State Key Laboratory of Translational Medicine and Innovative Drug Development, Jiangsu Simcere Diagnostics Co., Ltd, Nanjing, Jiangsu Province, People's Republic of China
| | - Qin Zhang
- The Medical Department, Jiangsu Simcere Diagnostics Co., Ltd, Nanjing, Jiangsu Province, People's Republic of China
- Nanjing Simcere Medical Laboratory Science Co., Ltd, Nanjing, Jiangsu Province, People's Republic of China
- The State Key Laboratory of Translational Medicine and Innovative Drug Development, Jiangsu Simcere Diagnostics Co., Ltd, Nanjing, Jiangsu Province, People's Republic of China
| | - Qianqian Duan
- The Medical Department, Jiangsu Simcere Diagnostics Co., Ltd, Nanjing, Jiangsu Province, People's Republic of China
- Nanjing Simcere Medical Laboratory Science Co., Ltd, Nanjing, Jiangsu Province, People's Republic of China
- The State Key Laboratory of Translational Medicine and Innovative Drug Development, Jiangsu Simcere Diagnostics Co., Ltd, Nanjing, Jiangsu Province, People's Republic of China
| | - Tingting Sun
- The Medical Department, Jiangsu Simcere Diagnostics Co., Ltd, Nanjing, Jiangsu Province, People's Republic of China
- Nanjing Simcere Medical Laboratory Science Co., Ltd, Nanjing, Jiangsu Province, People's Republic of China
- The State Key Laboratory of Translational Medicine and Innovative Drug Development, Jiangsu Simcere Diagnostics Co., Ltd, Nanjing, Jiangsu Province, People's Republic of China
| | - Chuang Qi
- The Medical Department, Jiangsu Simcere Diagnostics Co., Ltd, Nanjing, Jiangsu Province, People's Republic of China
- Nanjing Simcere Medical Laboratory Science Co., Ltd, Nanjing, Jiangsu Province, People's Republic of China
- The State Key Laboratory of Translational Medicine and Innovative Drug Development, Jiangsu Simcere Diagnostics Co., Ltd, Nanjing, Jiangsu Province, People's Republic of China
| | - Fengsen Li
- Pneumology Department, The Fourth Affiliated Hospital of Xinjiang Medical University, Urumqi, The Xinjiang Uygur Autonomous Region, People's Republic of China
|
18
|
Age-dependent genomic characteristics and their impact on immunotherapy in lung adenocarcinoma. J Cancer Res Clin Oncol 2022:10.1007/s00432-022-04195-8. [PMID: 35838838 DOI: 10.1007/s00432-022-04195-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/23/2022] [Accepted: 07/06/2022] [Indexed: 10/17/2022]
Abstract
BACKGROUND The age at onset of lung cancer is trending younger, and adenocarcinoma is the main histological type. Even patients with the same tumor type may differ significantly in clinical features, tumor microenvironment and genomic background at different ages. Immune checkpoint inhibitors (ICIs) have been shown to improve clinical outcomes in patients with lung adenocarcinoma (LUAD). However, differences in ICI efficacy between older and younger patients are unknown. Our study aimed to explore the relationship between age and immunotherapy in LUAD. METHODS In our study, 1313 resected LUAD patients in our hospital were divided into young (age ≤ 50) and old (age > 50) groups, and the differences in clinical characteristics between them were analyzed. Of these, 311 cases underwent next-generation sequencing (NGS). In addition, immune-related signatures of 508 LUAD patients were analyzed using TCGA RNA expression data. We then validated the findings using genomic and clinical information from 270 LUAD samples in the MSKCC cohort. RESULTS ERBB2 and EGFR gene mutations differed significantly between the two groups, and the number of gene mutations in the old group was significantly higher than that in the young group. Analysis of immune-related signatures from TCGA RNA expression data indicated that patients in the old group might have a better immune microenvironment. In the MSKCC validation cohort, the TMB of the old group was significantly higher than that of the young group, and OS after immunotherapy was longer in the old group. CONCLUSION Our study was the first to analyze the differences in the genomic landscape and immune-related biomarkers between young and old LUAD patients and found that immunotherapy was more effective in the old group, providing a reference for the study design and treatment of patients with LUAD.
|
19
|
Cherukuri PF, Soe MM, Condon DE, Bartaria S, Meis K, Gu S, Frost FG, Fricke LM, Lubieniecki KP, Lubieniecka JM, Pyatt RE, Hajek C, Boerkoel CF, Carmichael L. Establishing analytical validity of BeadChip array genotype data by comparison to whole-genome sequence and standard benchmark datasets. BMC Med Genomics 2022; 15:56. [PMID: 35287663 PMCID: PMC8919546 DOI: 10.1186/s12920-022-01199-8] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 04/08/2021] [Accepted: 02/28/2022] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Clinical use of genotype data requires high positive predictive value (PPV) and thorough understanding of the genotyping platform characteristics. BeadChip arrays, such as the Global Screening Array (GSA), potentially offer a high-throughput, low-cost clinical screen for known variants. We hypothesize that quality assessment and comparison to whole-genome sequence and benchmark data establish the analytical validity of GSA genotyping. METHODS To test this hypothesis, we selected 263 samples from Coriell, generated GSA genotypes in triplicate, generated whole genome sequence (rWGS) genotypes, assessed the quality of each set of genotypes, and compared each set of genotypes to each other and to the 1000 Genomes Phase 3 (1KG) genotypes, a performance benchmark. For 59 genes (MAP59), we also performed theoretical and empirical evaluation of variants deemed medically actionable predispositions. RESULTS Quality analyses detected sample contamination and increased assay failure along the chip margins. Comparison to benchmark data demonstrated that > 82% of the GSA assays had a PPV of 1. GSA assays targeting transitions, genomic regions of high complexity, and common variants performed better than those targeting transversions, regions of low complexity, and rare variants. Comparison of GSA data to rWGS and 1KG data showed > 99% performance across all measured parameters. Consistent with predictions from prior studies, the GSA detection of variation within the MAP59 genes was 3/261. CONCLUSION We establish the analytical validity of GSA assays using quality analytics and comparison to benchmark and rWGS data. GSA assays meet the standards of a clinical screen although assays interrogating rare variants, transversions, and variants within low-complexity regions require careful evaluation.
Affiliation(s)
- Praveen F Cherukuri
- Imagenetics, Sanford Health, 1410 W 25th St. Room #302, Sioux Falls, SD, 57105, USA.,Sanford School of Medicine, University of South Dakota, Sioux Falls, SD, USA.,Sanford Research Center, Sioux Falls, SD, USA
| | - Melissa M Soe
- Imagenetics, Sanford Health, 1410 W 25th St. Room #302, Sioux Falls, SD, 57105, USA
| | - David E Condon
- Imagenetics, Sanford Health, 1410 W 25th St. Room #302, Sioux Falls, SD, 57105, USA.,Sanford School of Medicine, University of South Dakota, Sioux Falls, SD, USA
| | - Shubhi Bartaria
- Imagenetics, Sanford Health, 1410 W 25th St. Room #302, Sioux Falls, SD, 57105, USA
| | - Kaitlynn Meis
- Imagenetics, Sanford Health, 1410 W 25th St. Room #302, Sioux Falls, SD, 57105, USA
| | - Shaopeng Gu
- Imagenetics, Sanford Health, 1410 W 25th St. Room #302, Sioux Falls, SD, 57105, USA
| | - Frederick G Frost
- Imagenetics, Sanford Health, 1410 W 25th St. Room #302, Sioux Falls, SD, 57105, USA
| | - Lindsay M Fricke
- Imagenetics, Sanford Health, 1410 W 25th St. Room #302, Sioux Falls, SD, 57105, USA
| | - Krzysztof P Lubieniecki
- Imagenetics, Sanford Health, 1410 W 25th St. Room #302, Sioux Falls, SD, 57105, USA.,Sanford School of Medicine, University of South Dakota, Sioux Falls, SD, USA.,Sanford Research Center, Sioux Falls, SD, USA
| | - Joanna M Lubieniecka
- Imagenetics, Sanford Health, 1410 W 25th St. Room #302, Sioux Falls, SD, 57105, USA.,Sanford School of Medicine, University of South Dakota, Sioux Falls, SD, USA.,Sanford Research Center, Sioux Falls, SD, USA
| | - Robert E Pyatt
- Imagenetics, Sanford Health, 1410 W 25th St. Room #302, Sioux Falls, SD, 57105, USA.,Sanford School of Medicine, University of South Dakota, Sioux Falls, SD, USA
| | - Catherine Hajek
- Imagenetics, Sanford Health, 1410 W 25th St. Room #302, Sioux Falls, SD, 57105, USA.,Sanford School of Medicine, University of South Dakota, Sioux Falls, SD, USA
| | - Cornelius F Boerkoel
- Imagenetics, Sanford Health, 1410 W 25th St. Room #302, Sioux Falls, SD, 57105, USA
| | - Lynn Carmichael
- Imagenetics, Sanford Health, 1410 W 25th St. Room #302, Sioux Falls, SD, 57105, USA
|
20
|
Liu J, Shen Q, Bao H. Comparison of seven SNP calling pipelines for the next-generation sequencing data of chickens. PLoS One 2022; 17:e0262574. [PMID: 35100292 PMCID: PMC8803190 DOI: 10.1371/journal.pone.0262574] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/25/2021] [Accepted: 12/29/2021] [Indexed: 11/18/2022] Open
Abstract
Single nucleotide polymorphisms (SNPs) are widely used in genome-wide association studies and population genetics analyses. Next-generation sequencing (NGS) has become convenient, and many SNP-calling pipelines have been developed for human NGS data. However, there is a knowledge gap in selecting an appropriate SNP-calling pipeline for high-throughput NGS data. To fill this gap, we studied and compared seven SNP-calling pipelines, namely 16GT, Genome Analysis Toolkit (GATK), Bcftools-single (Bcftools single sample mode), Bcftools-multiple (Bcftools multiple sample mode), VarScan2-single (VarScan2 single sample mode), VarScan2-multiple (VarScan2 multiple sample mode) and Freebayes, using 96 NGS datasets with depth gradients of approximately 5X, 10X, 20X, 30X, 40X, and 50X coverage from 16 Rhode Island Red chickens. The sixteen chickens were also genotyped with a 50K SNP array, and the sensitivity and specificity of each pipeline were assessed by comparison to the SNP array results. For each pipeline, except Freebayes, the number of detected SNPs increased as the input read depth increased. In comparison with other pipelines, 16GT, followed by Bcftools-multiple, obtained the most SNPs when the input coverage exceeded 10X, and Bcftools-multiple obtained the most when the input was 5X and 10X. The sensitivity and specificity of each pipeline increased with increasing input. Bcftools-multiple had the numerically highest sensitivity when the input ranged from 5X to 30X, and 16GT showed the highest sensitivity when the input was 40X and 50X. Bcftools-multiple also had the highest specificity, followed by GATK, at almost all input levels. For most calling pipelines, there were no obvious changes in SNP numbers, sensitivities or specificities beyond 20X.
In conclusion, (1) if only SNPs are to be detected, the sequencing depth does not need to exceed 20X; (2) Bcftools-multiple may be the best choice for detecting SNPs from chicken NGS data, but for a single sample or a sequencing depth greater than 20X, 16GT is recommended. Our findings provide a reference for researchers to select suitable pipelines to obtain SNPs from the NGS data of chickens or nonhuman animals.
Affiliation(s)
- Jing Liu
- National Engineering Laboratory for Animal Breeding, Beijing Key Laboratory for Animal Genetic Improvement, College of Animal Science and Technology, China Agricultural University, Beijing, China
| | - Qingmiao Shen
- National Engineering Laboratory for Animal Breeding, Beijing Key Laboratory for Animal Genetic Improvement, College of Animal Science and Technology, China Agricultural University, Beijing, China
| | - Haigang Bao
- National Engineering Laboratory for Animal Breeding, Beijing Key Laboratory for Animal Genetic Improvement, College of Animal Science and Technology, China Agricultural University, Beijing, China
|
21
|
Pan B, Ren L, Onuchic V, Guan M, Kusko R, Bruinsma S, Trigg L, Scherer A, Ning B, Zhang C, Glidewell-Kenney C, Xiao C, Donaldson E, Sedlazeck FJ, Schroth G, Yavas G, Grunenwald H, Chen H, Meinholz H, Meehan J, Wang J, Yang J, Foox J, Shang J, Miclaus K, Dong L, Shi L, Mohiyuddin M, Pirooznia M, Gong P, Golshani R, Wolfinger R, Lababidi S, Sahraeian SME, Sherry S, Han T, Chen T, Shi T, Hou W, Ge W, Zou W, Guo W, Bao W, Xiao W, Fan X, Gondo Y, Yu Y, Zhao Y, Su Z, Liu Z, Tong W, Xiao W, Zook JM, Zheng Y, Hong H. Assessing reproducibility of inherited variants detected with short-read whole genome sequencing. Genome Biol 2022; 23:2. [PMID: 34980216 PMCID: PMC8722114 DOI: 10.1186/s13059-021-02569-8] [Citation(s) in RCA: 13] [Impact Index Per Article: 6.5] [Received: 11/22/2021] [Accepted: 12/06/2021] [Indexed: 12/15/2022] Open
Abstract
BACKGROUND Reproducible detection of inherited variants with whole genome sequencing (WGS) is vital for the implementation of precision medicine and is a complicated process in which each step affects variant call quality. Systematically assessing reproducibility of inherited variants with WGS and impact of each step in the process is needed for understanding and improving quality of inherited variants from WGS. RESULTS To dissect the impact of factors involved in detection of inherited variants with WGS, we sequence triplicates of eight DNA samples representing two populations on three short-read sequencing platforms using three library kits in six labs and call variants with 56 combinations of aligners and callers. We find that bioinformatics pipelines (callers and aligners) have a larger impact on variant reproducibility than WGS platform or library preparation. Single-nucleotide variants (SNVs), particularly outside difficult-to-map regions, are more reproducible than small insertions and deletions (indels), which are least reproducible when > 5 bp. Increasing sequencing coverage improves indel reproducibility but has limited impact on SNVs above 30×. CONCLUSIONS Our findings highlight sources of variability in variant detection and the need for improvement of bioinformatics pipelines in the era of precision medicine with WGS.
Affiliation(s)
- Bohu Pan
- Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, US Food and Drug Administration, Jefferson, AR, 72079, USA
| | - Luyao Ren
- State Key Laboratory of Genetic Engineering, School of Life Sciences and Shanghai Cancer Center, Fudan University, Shanghai, 200438, China
- Human Phenome Institute, Fudan University, Shanghai, 200438, China
| | - Len Trigg
- Real Time Genomics, Hamilton, New Zealand
| | - Andreas Scherer
- Institute for Molecular Medicine Finland (FIMM), University of Helsinki, Helsinki, Finland
- EATRIS ERIC- European Infrastructure for Translational Medicine, Amsterdam, the Netherlands
| | - Baitang Ning
- Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, US Food and Drug Administration, Jefferson, AR, 72079, USA
| | - Chaoyang Zhang
- School of Computing Sciences and Computer Engineering, University of Southern Mississippi, Hattiesburg, MS, 39406, USA
| | - Chunlin Xiao
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, 20894, USA
| | - Eric Donaldson
- Center for Drug Evaluation and Research, Food and Drug Administration, Silver Spring, MD, 20993, USA
| | - Fritz J Sedlazeck
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, 77030, USA
| | - Gokhan Yavas
- Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, US Food and Drug Administration, Jefferson, AR, 72079, USA
| | - Joe Meehan
- Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, US Food and Drug Administration, Jefferson, AR, 72079, USA
| | - Jing Wang
- Center for Advanced Measurement Science, National Institute of Metrology, Beijing, 100013, China
| | - Jingcheng Yang
- State Key Laboratory of Genetic Engineering, School of Life Sciences and Shanghai Cancer Center, Fudan University, Shanghai, 200438, China
- Human Phenome Institute, Fudan University, Shanghai, 200438, China
| | - Jonathan Foox
- Department of Physiology and Biophysics, Weill Cornell Medicine, New York, NY, 10021, USA
| | - Jun Shang
- State Key Laboratory of Genetic Engineering, School of Life Sciences and Shanghai Cancer Center, Fudan University, Shanghai, 200438, China
- Human Phenome Institute, Fudan University, Shanghai, 200438, China
| | - Lianhua Dong
- Center for Advanced Measurement Science, National Institute of Metrology, Beijing, 100013, China
| | - Leming Shi
- State Key Laboratory of Genetic Engineering, School of Life Sciences and Shanghai Cancer Center, Fudan University, Shanghai, 200438, China
- Human Phenome Institute, Fudan University, Shanghai, 200438, China
| | - Mehdi Pirooznia
- Bioinformatics and Computational Biology Laboratory, National Heart Lung and Blood Institute, National Institutes of Health, Bethesda, MD, 20892, USA
| | - Ping Gong
- Environmental Laboratory, U.S. Army Engineer Research and Development Center, Vicksburg, MS, 39180, USA
| | - Samir Lababidi
- Office of Health Informatics, Office of the Commissioner, US Food and Drug Administration, Silver Spring, MD, 20993, USA
| | - Steve Sherry
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, 20894, USA
| | - Tao Han
- Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, US Food and Drug Administration, Jefferson, AR, 72079, USA
| | - Tao Chen
- Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, US Food and Drug Administration, Jefferson, AR, 72079, USA
| | - Tieliu Shi
- The Center for Bioinformatics and Computational Biology, Shanghai Key Laboratory of Regulatory Biology, Institute of Biomedical Sciences and School of Life Sciences, East China Normal University, Shanghai, 200241, China
| | - Wanwan Hou
- State Key Laboratory of Genetic Engineering, School of Life Sciences and Shanghai Cancer Center, Fudan University, Shanghai, 200438, China
- Human Phenome Institute, Fudan University, Shanghai, 200438, China
| | - Weigong Ge
- Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, US Food and Drug Administration, Jefferson, AR, 72079, USA
| | - Wen Zou
- Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, US Food and Drug Administration, Jefferson, AR, 72079, USA
| | - Wenjing Guo
- Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, US Food and Drug Administration, Jefferson, AR, 72079, USA
| | - Wenjun Bao
- SAS Institute Inc., Cary, NC, 27513, USA
| | - Wenzhong Xiao
- Stanford Genome Technology Center, Stanford University School of Medicine, Palo Alto, CA, 94305, USA
| | - Xiaohui Fan
- Pharmaceutical Informatics Institute, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, 310058, China
| | - Yoichi Gondo
- Department of Molecular Life Sciences, Tokai University School of Medicine, 143 Shimokasuya, Isehara, 259-1193, Japan
| | - Ying Yu
- State Key Laboratory of Genetic Engineering, School of Life Sciences and Shanghai Cancer Center, Fudan University, Shanghai, 200438, China
- Human Phenome Institute, Fudan University, Shanghai, 200438, China
| | - Yongmei Zhao
- CCR-SF Bioinformatics Group, Advanced Biomedical and Computational Sciences, Biomedical Informatics and Data Science, Frederick National Laboratory for Cancer Research, Frederick, MD, 21701, USA
| | - Zhenqiang Su
- Takeda Pharmaceuticals, Cambridge, MA, 02139, USA
| | - Zhichao Liu
- Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, US Food and Drug Administration, Jefferson, AR, 72079, USA
| | - Weida Tong
- Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, US Food and Drug Administration, Jefferson, AR, 72079, USA
| | - Wenming Xiao
- Division of Molecular Genetics and Pathology, Center for Device and Radiological Health, US Food and Drug Administration, Silver Spring, MD, 20993, USA
| | - Justin M Zook
- Material Measurement Laboratory, National Institute of Standards and Technology, Gaithersburg, MD, 20899, USA.
| | - Yuanting Zheng
- State Key Laboratory of Genetic Engineering, School of Life Sciences and Shanghai Cancer Center, Fudan University, Shanghai, 200438, China.
- Human Phenome Institute, Fudan University, Shanghai, 200438, China.
| | - Huixiao Hong
- Division of Bioinformatics and Biostatistics, National Center for Toxicological Research, US Food and Drug Administration, Jefferson, AR, 72079, USA.
|
22
|
Späth GF, Bussotti G. GIP: an open-source computational pipeline for mapping genomic instability from protists to cancer cells. Nucleic Acids Res 2021; 50:e36. [PMID: 34928370 PMCID: PMC8989552 DOI: 10.1093/nar/gkab1237] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 07/08/2021] [Revised: 11/01/2021] [Accepted: 12/03/2021] [Indexed: 11/25/2022] Open
Abstract
Genome instability has been recognized as a key driver for microbial and cancer adaptation and thus plays a central role in many diseases. Genome instability encompasses different types of genomic alterations, yet most available genome analysis software is limited to just one type of mutation. To overcome this limitation and better understand the role of genetic changes in enhancing pathogenicity, we established GIP, a novel, powerful bioinformatic pipeline for comparative genome analysis. Here, we show its application to whole genome sequencing datasets of Leishmania, Plasmodium, Candida and cancer. Applying GIP to available data sets validated our pipeline and demonstrated the power of our tool to drive biological discovery. Applied to Plasmodium vivax genomes, our pipeline uncovered the convergent amplification of erythrocyte binding proteins and identified a nullisomic strain. Re-analyzing genomes of drug-adapted Candida albicans strains revealed correlated copy number variations of functionally related genes, strongly supporting a mechanism of epistatic adaptation through interacting gene-dosage changes. Our results illustrate how GIP can be used for the identification of aneuploidy, gene copy number variations, changes in nucleic acid sequences, and chromosomal rearrangements. Altogether, GIP can shed light on the genetic bases of cell adaptation and drive disease biomarker discovery.
Affiliation(s)
- Gerald F Späth
- Institut Pasteur, Université de Paris, INSERM U1201, Unité de Parasitologie moléculaire et Signalisation, Paris, France
| | - Giovanni Bussotti
- Institut Pasteur, Université de Paris, INSERM U1201, Unité de Parasitologie moléculaire et Signalisation, Paris, France.,Institut Pasteur, Université de Paris, Bioinformatics and Biostatistics Hub, F-75015 Paris, France
|
23
|
Liu X, Zhang Y, Liu W, Li Y, Pan J, Pu Y, Han J, Orlando L, Ma Y, Jiang L. A single-nucleotide mutation within the TBX3 enhancer increased body size in Chinese horses. Curr Biol 2021; 32:480-487.e6. [PMID: 34906355 PMCID: PMC8796118 DOI: 10.1016/j.cub.2021.11.052] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Received: 06/15/2021] [Revised: 09/23/2021] [Accepted: 11/22/2021] [Indexed: 01/15/2023]
Abstract
Chinese ponies are endemic to the mountainous areas of southwestern China and were first reported in the archaeological record at the Royal Tomb of Zhongshan King, Mancheng, dated to approximately 2,100 YBP [1]. Previous work has started uncovering the genetic basis of size variation in western ponies and horses, revealing a limited number of loci, including HMGA2 [2], LCORL/NCAPG [3], ZFAT, and LASP1 [4,5]. Whether the same genetic pathways also drive the small body size of Chinese ponies, which show striking anatomical differences to Shetland ponies [6], remains unclear [2,7]. To test this, we combined whole-genome sequences of 187 horses across China. Statistical analyses revealed the top association between genetic variation at T-box transcription factor 3 (TBX3) and body size. Fine-scale analysis across an extended population of 189 ponies and 574 horses narrowed down the association to one A/G SNP at an enhancer region upstream of TBX3 (ECA8:20,644,555, p = 2.34e−39). Luciferase assays confirmed that the single-nucleotide G mutation upregulates TBX3 expression, and enhancer-knockout mice exhibited shorter limbs than wild-type littermates (p < 0.01). Re-analysis of ancient DNA data showed that the G allele, which is most frequent in modern horses, first occurred ∼2,300 years ago and has risen in frequency since. This supports selection for larger size in Asia from approximately the beginning of the Chinese Empire. Overall, this study characterized the causal regulatory mutation underlying small body size in Chinese ponies and revealed size as one of the main selection targets of past Chinese breeders.
- One single A/G SNP in the TBX3 enhancer region drives size variation in Chinese horses
- The frequency of the G variant correlates positively with size in 763 horses
- Cellular and mouse models confirm it affects TBX3 transcription and limb length
- The G variant first occurred ∼2,300 years ago and has risen in frequency since
Affiliation(s)
- Xuexue Liu
- Laboratory of Animal (Poultry) Genetics Breeding and Reproduction, Ministry of Agriculture, Institute of Animal Science, Chinese Academy of Agricultural Sciences (CAAS), Beijing 100193, P.R. China; Centre d'Anthropobiologie et de Génomique de Toulouse, Université Paul Sabatier, 37 allées Jules Guesde, 31000 Toulouse, France; CAAS-ILRI Joint Laboratory on Livestock and Forage Genetic Resources, Institute of Animal Science, Chinese Academy of Agricultural Sciences (CAAS), Beijing 100193, P.R. China
| | - Yanli Zhang
- Laboratory of Animal (Poultry) Genetics Breeding and Reproduction, Ministry of Agriculture, Institute of Animal Science, Chinese Academy of Agricultural Sciences (CAAS), Beijing 100193, P.R. China; CAAS-ILRI Joint Laboratory on Livestock and Forage Genetic Resources, Institute of Animal Science, Chinese Academy of Agricultural Sciences (CAAS), Beijing 100193, P.R. China
| | - Wujun Liu
- College of Animal Science, Xinjiang Agriculture University, Urumqi, Xinjiang, China
| | - Yefang Li
- Laboratory of Animal (Poultry) Genetics Breeding and Reproduction, Ministry of Agriculture, Institute of Animal Science, Chinese Academy of Agricultural Sciences (CAAS), Beijing 100193, P.R. China; CAAS-ILRI Joint Laboratory on Livestock and Forage Genetic Resources, Institute of Animal Science, Chinese Academy of Agricultural Sciences (CAAS), Beijing 100193, P.R. China
| | - Jianfei Pan
- Laboratory of Animal (Poultry) Genetics Breeding and Reproduction, Ministry of Agriculture, Institute of Animal Science, Chinese Academy of Agricultural Sciences (CAAS), Beijing 100193, P.R. China; CAAS-ILRI Joint Laboratory on Livestock and Forage Genetic Resources, Institute of Animal Science, Chinese Academy of Agricultural Sciences (CAAS), Beijing 100193, P.R. China
| | - Yabin Pu
- Laboratory of Animal (Poultry) Genetics Breeding and Reproduction, Ministry of Agriculture, Institute of Animal Science, Chinese Academy of Agricultural Sciences (CAAS), Beijing 100193, P.R. China; CAAS-ILRI Joint Laboratory on Livestock and Forage Genetic Resources, Institute of Animal Science, Chinese Academy of Agricultural Sciences (CAAS), Beijing 100193, P.R. China
| | - Jianlin Han
- CAAS-ILRI Joint Laboratory on Livestock and Forage Genetic Resources, Institute of Animal Science, Chinese Academy of Agricultural Sciences (CAAS), Beijing 100193, P.R. China; International Livestock Research Institute (ILRI), Nairobi 00100, Kenya
| | - Ludovic Orlando
- Centre d'Anthropobiologie et de Génomique de Toulouse, Université Paul Sabatier, 37 allées Jules Guesde, 31000 Toulouse, France.
| | - Yuehui Ma
- Laboratory of Animal (Poultry) Genetics Breeding and Reproduction, Ministry of Agriculture, Institute of Animal Science, Chinese Academy of Agricultural Sciences (CAAS), Beijing 100193, P.R. China; CAAS-ILRI Joint Laboratory on Livestock and Forage Genetic Resources, Institute of Animal Science, Chinese Academy of Agricultural Sciences (CAAS), Beijing 100193, P.R. China.
| | - Lin Jiang
- Laboratory of Animal (Poultry) Genetics Breeding and Reproduction, Ministry of Agriculture, Institute of Animal Science, Chinese Academy of Agricultural Sciences (CAAS), Beijing 100193, P.R. China; CAAS-ILRI Joint Laboratory on Livestock and Forage Genetic Resources, Institute of Animal Science, Chinese Academy of Agricultural Sciences (CAAS), Beijing 100193, P.R. China.
|
24
|
He JC, Li SY, He WZ, Xian JJ, Ma XY, Wang YC, Zhang MC, Ye GX, Liang B, Xia Q, Li Q. Application of Restriction Site-Associated DNA Sequencing (RAD-Seq) for Copy Number Variation and Triploidy Detection in Human. Cytogenet Genome Res 2021; 161:406-413. [PMID: 34657031 DOI: 10.1159/000518930] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/05/2021] [Accepted: 08/06/2021] [Indexed: 11/19/2022] Open
Abstract
At present, low-pass whole-genome sequencing (WGS) is frequently used in clinical research and in the screening of copy number variations (CNVs). However, there are still some challenges in the detection of triploids. Restriction site-associated DNA sequencing (RAD-Seq) technology is a reduced-representation genome sequencing technology developed based on next-generation sequencing. Here, we verified whether RAD-Seq could be employed to detect CNVs and triploids. In this study, genomic DNA of 11 samples was extracted employing a routine method and used to build libraries. Five cell lines of known karyotypes and 6 triploid abortion tissue samples were included for RAD-Seq testing. The triploid samples were confirmed by STR analysis and also tested by low-pass WGS. The accuracy and efficiency of detecting CNVs and triploids by RAD-Seq were then assessed, compared with low-pass WGS. In our results, RAD-Seq detected 11 out of 11 (100%) chromosomal abnormalities, including 4 deletions and 1 aneuploidy in the purchased cell lines and all triploid samples. By contrast, these triploids were missed by low-pass WGS. Furthermore, RAD-Seq showed a higher resolution and more accurate allele frequency in the detection of triploids than low-pass WGS. Our study shows that, compared with low-pass WGS, RAD-Seq has relatively higher accuracy in CNV detection at a similar cost and is capable of identifying triploids. Therefore, the application of this technique in medical genetics has a significant potential value.
Affiliation(s)
- Jian-Chun He
- Key Laboratory for Major Obstetric Diseases of Guangdong Province, Key Laboratory of Reproduction and Genetics of Guangdong Higher Education Institutes, The Third Affiliated Hospital of Guangzhou Medical University, Guangzhou, China
| | - Shao-Ying Li
- Key Laboratory for Major Obstetric Diseases of Guangdong Province, Key Laboratory of Reproduction and Genetics of Guangdong Higher Education Institutes, The Third Affiliated Hospital of Guangzhou Medical University, Guangzhou, China
| | - Wen-Zhi He
- Key Laboratory for Major Obstetric Diseases of Guangdong Province, Key Laboratory of Reproduction and Genetics of Guangdong Higher Education Institutes, The Third Affiliated Hospital of Guangzhou Medical University, Guangzhou, China
| | - Jia-Jia Xian
- Key Laboratory for Major Obstetric Diseases of Guangdong Province, Key Laboratory of Reproduction and Genetics of Guangdong Higher Education Institutes, The Third Affiliated Hospital of Guangzhou Medical University, Guangzhou, China
| | - Xiao-Yan Ma
- Key Laboratory for Major Obstetric Diseases of Guangdong Province, Key Laboratory of Reproduction and Genetics of Guangdong Higher Education Institutes, The Third Affiliated Hospital of Guangzhou Medical University, Guangzhou, China
| | - Yan-Chao Wang
- Key Laboratory for Major Obstetric Diseases of Guangdong Province, Key Laboratory of Reproduction and Genetics of Guangdong Higher Education Institutes, The Third Affiliated Hospital of Guangzhou Medical University, Guangzhou, China
| | - Min-Cong Zhang
- Key Laboratory for Major Obstetric Diseases of Guangdong Province, Key Laboratory of Reproduction and Genetics of Guangdong Higher Education Institutes, The Third Affiliated Hospital of Guangzhou Medical University, Guangzhou, China
| | - Guo-Xin Ye
- Key Laboratory for Major Obstetric Diseases of Guangdong Province, Key Laboratory of Reproduction and Genetics of Guangdong Higher Education Institutes, The Third Affiliated Hospital of Guangzhou Medical University, Guangzhou, China
| | - Bo Liang
- Basecare Medical Device Co., Ltd, Suzhou, China
| | - Qin Xia
- Basecare Medical Device Co., Ltd, Suzhou, China,
| | - Qing Li
- Key Laboratory for Major Obstetric Diseases of Guangdong Province, Key Laboratory of Reproduction and Genetics of Guangdong Higher Education Institutes, The Third Affiliated Hospital of Guangzhou Medical University, Guangzhou, China
| |
Collapse
|
25
|
Breton G, Johansson ACV, Sjödin P, Schlebusch CM, Jakobsson M. Comparison of sequencing data processing pipelines and application to underrepresented African human populations. BMC Bioinformatics 2021; 22:488. [PMID: 34627144 PMCID: PMC8502359 DOI: 10.1186/s12859-021-04407-x] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 03/31/2020] [Accepted: 09/27/2021] [Indexed: 11/10/2022] Open
Abstract
Background Population genetic studies of humans make increasing use of high-throughput sequencing in order to capture diversity in an unbiased way. There is an abundance of sequencing technologies, bioinformatic tools and the available genomes are increasing in number. Studies have evaluated and compared some of these technologies and tools, such as the Genome Analysis Toolkit (GATK) and its “Best Practices” bioinformatic pipelines. However, studies often focus on a few genomes of Eurasian origin in order to detect technical issues. We instead surveyed the use of the GATK tools and established a pipeline for processing high coverage full genomes from a diverse set of populations, including Sub-Saharan African groups, in order to reveal challenges from human diversity and stratification. Results We surveyed 29 studies using high-throughput sequencing data, and compared their strategies for data pre-processing and variant calling. We found that processing of data is very variable across studies and that the GATK “Best Practices” are seldom followed strictly. We then compared three versions of a GATK pipeline, differing in the inclusion of an indel realignment step and with a modification of the base quality score recalibration step. We applied the pipelines on a diverse set of 28 individuals. We compared the pipelines in terms of count of called variants and overlap of the callsets. We found that the pipelines resulted in similar callsets, in particular after callset filtering. We also ran one of the pipelines on a larger dataset of 179 individuals. We noted that including more individuals at the joint genotyping step resulted in different counts of variants. At the individual level, we observed that the average genome coverage was correlated to the number of variants called. 
Conclusions We conclude that applying the GATK “Best Practices” pipeline, including its recommended reference datasets, to underrepresented populations does not lead to a decrease in the number of called variants compared to alternative pipelines. We recommend aiming for coverage of >30X if identifying most variants is important, and working with large sample sizes at the variant calling stage, including for underrepresented individuals and populations. Supplementary Information The online version contains supplementary material available at 10.1186/s12859-021-04407-x.
Affiliation(s)
- Gwenna Breton
- Human Evolution, Department of Organismal Biology, Evolutionary Biology Centre, Uppsala University, Norbyvägen 18C, 752 36, Uppsala, Sweden
- Anna C V Johansson
- Department of Cell and Molecular Biology, National Bioinformatics Infrastructure Sweden, Science for Life Laboratory, Uppsala University, Husargatan 3, 752 37, Uppsala, Sweden
- Per Sjödin
- Human Evolution, Department of Organismal Biology, Evolutionary Biology Centre, Uppsala University, Norbyvägen 18C, 752 36, Uppsala, Sweden
- Carina M Schlebusch
- Human Evolution, Department of Organismal Biology, Evolutionary Biology Centre, Uppsala University, Norbyvägen 18C, 752 36, Uppsala, Sweden; Palaeo-Research Institute, University of Johannesburg, P.O. Box 524, Auckland Park, 2006, South Africa; Science for Life Laboratory, Uppsala, Sweden
- Mattias Jakobsson
- Human Evolution, Department of Organismal Biology, Evolutionary Biology Centre, Uppsala University, Norbyvägen 18C, 752 36, Uppsala, Sweden; Palaeo-Research Institute, University of Johannesburg, P.O. Box 524, Auckland Park, 2006, South Africa; Science for Life Laboratory, Uppsala, Sweden
26
Karimi MR, Karimi AH, Abolmaali S, Sadeghi M, Schmitz U. Prospects and challenges of cancer systems medicine: from genes to disease networks. Brief Bioinform 2021; 23:6361045. [PMID: 34471925 PMCID: PMC8769701 DOI: 10.1093/bib/bbab343] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 05/04/2021] [Revised: 08/02/2021] [Accepted: 08/03/2021] [Indexed: 12/20/2022] Open
Abstract
It is becoming evident that holistic perspectives toward cancer are crucial in deciphering the overwhelming complexity of tumors. Single-layer analysis of genome-wide data has greatly contributed to our understanding of cellular systems and their perturbations. However, fundamental gaps in our knowledge persist and hamper the design of effective interventions. It is becoming more apparent than ever, that cancer should not only be viewed as a disease of the genome but as a disease of the cellular system. Integrative multilayer approaches are emerging as vigorous assets in our endeavors to achieve systemic views on cancer biology. Herein, we provide a comprehensive review of the approaches, methods and technologies that can serve to achieve systemic perspectives of cancer. We start with genome-wide single-layer approaches of omics analyses of cellular systems and move on to multilayer integrative approaches in which in-depth descriptions of proteogenomics and network-based data analysis are provided. Proteogenomics is a remarkable example of how the integration of multiple levels of information can reduce our blind spots and increase the accuracy and reliability of our interpretations and network-based data analysis is a major approach for data interpretation and a robust scaffold for data integration and modeling. Overall, this review aims to increase cross-field awareness of the approaches and challenges regarding the omics-based study of cancer and to facilitate the necessary shift toward holistic approaches.
Affiliation(s)
- Mehdi Sadeghi
- Department of Cell & Molecular Biology, Semnan University, Semnan, Iran
- Ulf Schmitz
- Department of Molecular & Cell Biology, James Cook University, Townsville, QLD 4811, Australia
27
Ahmed Z, Renart EG, Mishra D, Zeeshan S. JWES: a new pipeline for whole genome/exome sequence data processing, management, and gene-variant discovery, annotation, prediction, and genotyping. FEBS Open Bio 2021; 11:2441-2452. [PMID: 34370400 PMCID: PMC8409305 DOI: 10.1002/2211-5463.13261] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 06/02/2021] [Revised: 07/18/2021] [Accepted: 08/02/2021] [Indexed: 01/07/2023] Open
Abstract
Whole genome and exome sequencing (WGS/WES) are the most popular next‐generation sequencing (NGS) methodologies and are at present often used to detect rare and common genetic variants of clinical significance. We emphasize that automated sequence data processing, management, and visualization should be an indispensable component of modern WGS and WES data analysis for sequence assembly, variant detection (SNPs, SVs), imputation, and resolution of haplotypes. In this manuscript, we present a newly developed findable, accessible, interoperable, and reusable (FAIR) bioinformatics‐genomics pipeline Java based Whole Genome/Exome Sequence Data Processing Pipeline (JWES) for efficient variant discovery and interpretation, and big data modeling and visualization. JWES is a cross‐platform, user‐friendly, product line application, that entails three modules: (a) data processing, (b) storage, and (c) visualization. The data processing module performs a series of different tasks for variant calling, the data storage module efficiently manages high‐volume gene‐variant data, and the data visualization module supports variant data interpretation with Circos graphs. The performance of JWES was tested and validated in‐house with different experiments, using Microsoft Windows, macOS Big Sur, and UNIX operating systems. JWES is an open‐source and freely available pipeline, allowing scientists to take full advantage of all the computing resources available, without requiring much computer science knowledge. We have successfully applied JWES for processing, management, and gene‐variant discovery, annotation, prediction, and genotyping of WGS and WES data to analyze variable complex disorders. In summary, we report the performance of JWES with some reproducible case studies, using open access and in‐house generated, high‐quality datasets.
Affiliation(s)
- Zeeshan Ahmed
- Institute for Health, Health Care Policy and Aging Research, Rutgers, The State University of New Jersey, New Brunswick, NJ, USA; Department of Medicine, Rutgers Robert Wood Johnson Medical School, Rutgers Biomedical and Health Sciences, New Brunswick, NJ, USA
- Eduard Gibert Renart
- Institute for Health, Health Care Policy and Aging Research, Rutgers, The State University of New Jersey, New Brunswick, NJ, USA
- Deepshikha Mishra
- Institute for Health, Health Care Policy and Aging Research, Rutgers, The State University of New Jersey, New Brunswick, NJ, USA
- Saman Zeeshan
- Rutgers Cancer Institute of New Jersey, Rutgers, The State University of New Jersey, New Brunswick, NJ, USA
28
Genomic landscape and tumor mutation burden analysis of Chinese patients with sarcomatoid carcinoma of the head and neck. Oral Oncol 2021; 121:105436. [PMID: 34371452 DOI: 10.1016/j.oraloncology.2021.105436] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 06/21/2021] [Accepted: 06/23/2021] [Indexed: 01/23/2023]
Abstract
BACKGROUND Sarcomatoid carcinoma (SC) of the head and neck (HN) is a rare disease that has both sarcomatoid and cancerous components. The genetic background and mechanisms of tumorigenesis remain largely unknown, and progress in precision therapy has been limited. METHODS Targeted DNA-based next-generation sequencing (NGS) with a 539-gene pan-cancer panel was performed in 12 patients with SC of the HN to identify their genetic alterations and investigate clinically actionable mutations for use in precision treatment. RESULTS TP53 was identified as the most frequently mutated gene. Genes related to the cell cycle, chromatin remodeling and histone modification were found to be frequently mutated in patients with SC of the HN. Alterations in receptor tyrosine kinases (RTKs) were also found in six patients. In addition, four patients had mutations in members of the downstream RAS and PI3-kinase pathways; PIK3CA was the most frequently mutated gene in these pathways. The tumor mutation burden (TMB) ranged from 0.71 to 14.71 mutations per megabase, with a median of 4.34. The TMB of patients with PIK3CA mutations was significantly higher than that of PIK3CA wild-type patients. CONCLUSIONS This was the first study to investigate genomic alterations specifically in Chinese patients with SC of the HN. Our results showed that 10 out of 12 patients could be matched to targeted therapies or immunotherapies currently available in clinical practice or active clinical trials, suggesting that precision therapy has the potential to improve the long-term prognosis of patients with this rare disease. Due to the small number of patients in this study, the findings need to be validated in a larger cohort.
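The TMB figures quoted in this abstract are expressed in mutations per megabase. As a minimal sketch of how such a value is usually derived, the Python below divides a mutation count by the sequenced territory in megabases and takes a cohort median. The patient identifiers, mutation counts, and panel size are hypothetical placeholders; the abstract does not report the covered panel size or per-patient counts, and the study's actual filtering rules are not described here.

```python
from statistics import median


def tumor_mutation_burden(mutation_count: int, covered_bases: int) -> float:
    """Return mutations per megabase of covered (sequenced) territory."""
    if covered_bases <= 0:
        raise ValueError("covered_bases must be positive")
    return mutation_count / (covered_bases / 1_000_000)


# Hypothetical inputs: neither value comes from the cited study.
PANEL_BASES = 2_000_000  # assumed covered panel size, in bases
example_counts = {"patient_1": 2, "patient_2": 9, "patient_3": 30}

# Per-patient TMB and the cohort median.
tmb_values = {
    patient: tumor_mutation_burden(count, PANEL_BASES)
    for patient, count in example_counts.items()
}
cohort_median = median(tmb_values.values())
```

In practice the numerator depends on which variant classes are counted (for example, whether synonymous or known germline variants are excluded), so values from different panels are not directly comparable.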
29
Ahmed Z, Renart EG, Zeeshan S. Genomics pipelines to investigate susceptibility in whole genome and exome sequenced data for variant discovery, annotation, prediction and genotyping. PeerJ 2021; 9:e11724. [PMID: 34395068 PMCID: PMC8320519 DOI: 10.7717/peerj.11724] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 03/29/2021] [Accepted: 06/14/2021] [Indexed: 12/12/2022] Open
Abstract
Over the last few decades, genomics has been heading toward an audacious future and has been changing our views about conducting biomedical research, studying diseases, and understanding diversity across the human species. Whole genome and exome sequencing (WGS/WES) are two of the most popular next-generation sequencing (NGS) methodologies currently used to detect genetic variations of clinical significance. Investigating WGS/WES data for variant discovery and genotyping relies on a nexus of different data analytic applications. Although several bioinformatics applications have been developed, and many of those are freely available and published, finding and interpreting genetic variants in a timely manner remain challenging tasks for diagnostic laboratories and clinicians. In this study, we are interested in understanding, evaluating, and reporting the current state of solutions available to process NGS data of variable lengths and types for the identification of variants, alleles, and haplotypes. Within this scope, we consulted high-quality peer-reviewed literature published in the last 10 years. We focused on the standalone and networked bioinformatics applications proposed to efficiently process WGS and WES data and support downstream analysis for gene-variant discovery, annotation, prediction, and interpretation. We discuss our findings in this manuscript, which include but are not limited to the set of operations, workflow, data handling, involved tools, technologies and algorithms, and limitations of the assessed applications.
Affiliation(s)
- Zeeshan Ahmed
- Institute for Health, Health Care Policy and Aging Research, Rutgers, The State University of New Jersey, New Brunswick, NJ, USA; Department of Medicine, Robert Wood Johnson Medical School, Rutgers, The State University of New Jersey, New Brunswick, NJ, USA
- Eduard Gibert Renart
- Institute for Health, Health Care Policy and Aging Research, Rutgers, The State University of New Jersey, New Brunswick, NJ, USA
- Saman Zeeshan
- Cancer Institute of New Jersey, Rutgers, The State University of New Jersey, New Brunswick, NJ, USA
30
Behl T, Kaur I, Sehgal A, Singh S, Bhatia S, Al-Harrasi A, Zengin G, Babes EE, Brisc C, Stoicescu M, Toma MM, Sava C, Bungau SG. Bioinformatics Accelerates the Major Tetrad: A Real Boost for the Pharmaceutical Industry. Int J Mol Sci 2021; 22:6184. [PMID: 34201152 PMCID: PMC8227524 DOI: 10.3390/ijms22126184] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Received: 05/16/2021] [Revised: 06/03/2021] [Accepted: 06/05/2021] [Indexed: 02/01/2023] Open
Abstract
With advanced technology and its development, bioinformatics is one of the avant-garde fields that has managed to make amazing progress in the pharmaceutical-medical field by modeling the infrastructural dimensions of healthcare and integrating computing tools in drug innovation, facilitating prevention, detection/more accurate diagnosis, and treatment of disorders, while saving time and money. By association, bioinformatics and pharmacovigilance promoted both sample analyses and interpretation of drug side effects, also focusing on drug discovery and development (DDD), in which systems biology, a personalized approach, and drug repositioning were considered together with translational medicine. The role of bioinformatics has been highlighted in DDD, proteomics, genetics, modeling, miRNA discovery and assessment, and clinical genome sequencing. The authors have collated significant data from the most known online databases and publishers, also narrowing the diversified applications, in order to target four major areas (tetrad): DDD, anti-microbial research, genomic sequencing, and miRNA research and its significance in the management of current pandemic context. Our analysis aims to provide optimal data in the field by stratification of the information related to the published data in key sectors and to capture the attention of researchers interested in bioinformatics, a field that has succeeded in advancing the healthcare paradigm by introducing developing techniques and multiple database platforms, addressed in the manuscript.
Affiliation(s)
- Tapan Behl
- Department of Pharmacology, Chitkara College of Pharmacy, Chitkara University, Punjab 140401, India
- Ishnoor Kaur
- Department of Pharmacology, Chitkara College of Pharmacy, Chitkara University, Punjab 140401, India
- Aayush Sehgal
- Department of Pharmacology, Chitkara College of Pharmacy, Chitkara University, Punjab 140401, India
- Sukhbir Singh
- Department of Pharmacology, Chitkara College of Pharmacy, Chitkara University, Punjab 140401, India
- Saurabh Bhatia
- Amity Institute of Pharmacy, Amity University, Gurugram 122413, India
- Natural & Medical Sciences Research Centre, University of Nizwa, Birkat Al Mauz, Nizwa 616, Oman
- Ahmed Al-Harrasi
- Natural & Medical Sciences Research Centre, University of Nizwa, Birkat Al Mauz, Nizwa 616, Oman
- Gokhan Zengin
- Department of Biology, Faculty of Science, Selcuk University Campus, 42130 Konya, Turkey
- Elena Emilia Babes
- Department of Medical Disciplines, Faculty of Medicine and Pharmacy, University of Oradea, 410073 Oradea, Romania
- Ciprian Brisc
- Department of Medical Disciplines, Faculty of Medicine and Pharmacy, University of Oradea, 410073 Oradea, Romania
- Manuela Stoicescu
- Department of Medical Disciplines, Faculty of Medicine and Pharmacy, University of Oradea, 410073 Oradea, Romania
- Mirela Marioara Toma
- Department of Pharmacy, Faculty of Medicine and Pharmacy, University of Oradea, 410028 Oradea, Romania
- Doctoral School of Biomedical Sciences, University of Oradea, 410087 Oradea, Romania
- Cristian Sava
- Department of Medical Disciplines, Faculty of Medicine and Pharmacy, University of Oradea, 410073 Oradea, Romania
- Simona Gabriela Bungau
- Department of Pharmacy, Faculty of Medicine and Pharmacy, University of Oradea, 410028 Oradea, Romania
- Doctoral School of Biomedical Sciences, University of Oradea, 410087 Oradea, Romania
31
Bogaerts B, Delcourt T, Soetaert K, Boarbi S, Ceyssens PJ, Winand R, Van Braekel J, De Keersmaecker SCJ, Roosens NHC, Marchal K, Mathys V, Vanneste K. A Bioinformatics Whole-Genome Sequencing Workflow for Clinical Mycobacterium tuberculosis Complex Isolate Analysis, Validated Using a Reference Collection Extensively Characterized with Conventional Methods and In Silico Approaches. J Clin Microbiol 2021; 59:e00202-21. [PMID: 33789960 PMCID: PMC8316078 DOI: 10.1128/jcm.00202-21] [Citation(s) in RCA: 14] [Impact Index Per Article: 4.7] [Received: 01/27/2021] [Accepted: 03/27/2021] [Indexed: 01/18/2023] Open
Abstract
The use of whole-genome sequencing (WGS) for routine typing of bacterial isolates has increased substantially in recent years. For Mycobacterium tuberculosis (MTB), in particular, WGS has the benefit of drastically reducing the time required to generate results compared to most conventional phenotypic methods. Consequently, a multitude of solutions for analyzing WGS MTB data have been developed, but their successful integration in clinical and national reference laboratories is hindered by the requirement for their validation, for which a consensus framework is still largely absent. We developed a bioinformatics workflow for (Illumina) WGS-based routine typing of MTB complex (MTBC) member isolates allowing complete characterization, including (sub)species confirmation and identification (16S, csb/RD, hsp65), single nucleotide polymorphism (SNP)-based antimicrobial resistance (AMR) prediction, and pathogen typing (spoligotyping, SNP barcoding, and core genome multilocus sequence typing). Workflow performance was validated on a per-assay basis using a collection of 238 in-house-sequenced MTBC isolates, extensively characterized with conventional molecular biology-based approaches supplemented with public data. For SNP-based AMR prediction, results from molecular genotyping methods were supplemented with in silico modified data sets, allowing us to greatly increase the set of evaluated mutations. The workflow demonstrated very high performance with performance metrics of >99% for all assays, except for spoligotyping, where sensitivity dropped to ∼90%. The validation framework for our WGS-based bioinformatics workflow can aid in the standardization of bioinformatics tools by the MTB community and other SNP-based applications regardless of the targeted pathogen(s). The bioinformatics workflow is available for academic and nonprofit use through the Galaxy instance of our institute at https://galaxy.sciensano.be.
Affiliation(s)
- Bert Bogaerts
- Transversal Activities in Applied Genomics, Sciensano, Brussels, Belgium
- Department of Plant Biotechnology and Bioinformatics, Ghent University, Ghent, Belgium
- Thomas Delcourt
- Transversal Activities in Applied Genomics, Sciensano, Brussels, Belgium
- Raf Winand
- Transversal Activities in Applied Genomics, Sciensano, Brussels, Belgium
- Julien Van Braekel
- Transversal Activities in Applied Genomics, Sciensano, Brussels, Belgium
- Nancy H C Roosens
- Transversal Activities in Applied Genomics, Sciensano, Brussels, Belgium
- Kathleen Marchal
- Department of Information Technology, Internet Technology and Data Science Lab (IDLab), Interuniversity Microelectronics Centre (IMEC), Ghent University, Ghent, Belgium
- Department of Plant Biotechnology and Bioinformatics, Ghent University, Ghent, Belgium
- Department of Genetics, University of Pretoria, Pretoria, South Africa
- Kevin Vanneste
- Transversal Activities in Applied Genomics, Sciensano, Brussels, Belgium
32
Sundercombe SL, Berbic M, Evans CA, Cliffe C, Elakis G, Temple SEL, Selvanathan A, Ewans L, Quayum N, Nixon CY, Dias KR, Lang S, Richards A, Goh S, Wilson M, Mowat D, Sachdev R, Sandaradura S, Walsh M, Farrar MA, Walsh R, Fletcher J, Kirk EP, Teunisse GM, Schofield D, Buckley MF, Zhu Y, Roscioli T. Clinically Responsive Genomic Analysis Pipelines: Elements to Improve Detection Rate and Efficiency. J Mol Diagn 2021; 23:894-905. [PMID: 33962052 DOI: 10.1016/j.jmoldx.2021.04.007] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 09/09/2020] [Revised: 03/27/2021] [Accepted: 04/21/2021] [Indexed: 11/25/2022] Open
Abstract
Massively parallel sequencing has markedly improved mendelian diagnostic rates. This study assessed the effects of custom alterations to a diagnostic genomic bioinformatic pipeline in response to clinical need and derived practice recommendations relative to diagnostic rates and efficiency. The Genomic Annotation and Interpretation Application (GAIA) bioinformatics pipeline was designed to detect panel, exome, and genome sample integrity and prioritize gene variants in mendelian disorders. Reanalysis of selected negative cases was performed after improvements to the pipeline. GAIA improvements and their effect on sensitivity are described, including addition of a PubMed search for gene-disease associations not in the Online Mendelian Inheritance in Man database, inclusion of a process for calling low-quality variants (known as QPatch), and gene symbol nomenclature consistency checking. The new pipeline increased the diagnostic rate and reduced staff costs, resulting in a saving of US$844.34 per additional diagnosis. Recommendations for genomic analysis pipeline requirements are summarized. Clinically responsive bioinformatics pipeline improvements increase diagnostic sensitivity and increase cost-effectiveness.
Affiliation(s)
- Marina Berbic
- NSW Health Pathology Randwick Genomics, Prince of Wales Hospital, Randwick, New South Wales, Australia; School of Women's and Children's Health, University of New South Wales Sydney, Kensington, New South Wales, Australia
- Carey-Anne Evans
- Neuroscience Research Australia (NeuRA), Randwick, New South Wales, Australia
- Corrina Cliffe
- NSW Health Pathology Randwick Genomics, Prince of Wales Hospital, Randwick, New South Wales, Australia
- George Elakis
- NSW Health Pathology Randwick Genomics, Prince of Wales Hospital, Randwick, New South Wales, Australia
- Suzanna E L Temple
- NSW Health Pathology Randwick Genomics, Prince of Wales Hospital, Randwick, New South Wales, Australia; Neuroscience Research Australia (NeuRA), Randwick, New South Wales, Australia; Centre for Clinical Genetics, Sydney Children's Hospital, Sydney, Randwick, New South Wales, Australia
- Arthavan Selvanathan
- NSW Health Pathology Randwick Genomics, Prince of Wales Hospital, Randwick, New South Wales, Australia; Neuroscience Research Australia (NeuRA), Randwick, New South Wales, Australia; Discipline of Child and Adolescent Health, The University of Sydney, New South Wales, Australia
- Lisa Ewans
- Department of Medical Genomics, Royal Prince Alfred Hospital, Sydney, New South Wales, Australia; Central Clinical School, Sydney Medical School, The University of Sydney, New South Wales, Australia
- Nila Quayum
- NSW Health Pathology Randwick Genomics, Prince of Wales Hospital, Randwick, New South Wales, Australia
- Cheng-Yee Nixon
- NSW Health Pathology Randwick Genomics, Prince of Wales Hospital, Randwick, New South Wales, Australia; Neuroscience Research Australia (NeuRA), Randwick, New South Wales, Australia
- Kerith-Rae Dias
- Neuroscience Research Australia (NeuRA), Randwick, New South Wales, Australia; Prince of Wales Clinical School, Faculty of Medicine, University of New South Wales Sydney, Kensington, New South Wales, Australia
- Sarah Lang
- NSW Health Pathology Randwick Genomics, Prince of Wales Hospital, Randwick, New South Wales, Australia
- Anna Richards
- NSW Health Pathology Randwick Genomics, Prince of Wales Hospital, Randwick, New South Wales, Australia
- Shuxiang Goh
- NSW Health Pathology Randwick Genomics, Prince of Wales Hospital, Randwick, New South Wales, Australia; Neuroscience Research Australia (NeuRA), Randwick, New South Wales, Australia
- Meredith Wilson
- Department of Clinical Genetics, Children's Hospital at Westmead, Sydney, Westmead, New South Wales, Australia
- David Mowat
- Centre for Clinical Genetics, Sydney Children's Hospital, Sydney, Randwick, New South Wales, Australia
- Rani Sachdev
- Centre for Clinical Genetics, Sydney Children's Hospital, Sydney, Randwick, New South Wales, Australia
- Sarah Sandaradura
- Department of Clinical Genetics, Children's Hospital at Westmead, Sydney, Westmead, New South Wales, Australia
- Maie Walsh
- Genetic Medicine Department, Royal Melbourne Hospital, Parkville, Victoria, Australia
- Michelle A Farrar
- School of Women's and Children's Health, University of New South Wales Sydney, Kensington, New South Wales, Australia; Neurology Department, Sydney Children's Hospital, Sydney, Randwick, New South Wales, Australia
- Rebecca Walsh
- NSW Health Pathology Randwick Genomics, Prince of Wales Hospital, Randwick, New South Wales, Australia
- Janice Fletcher
- NSW Health Pathology Randwick Genomics, Prince of Wales Hospital, Randwick, New South Wales, Australia
- Edwin P Kirk
- NSW Health Pathology Randwick Genomics, Prince of Wales Hospital, Randwick, New South Wales, Australia; School of Women's and Children's Health, University of New South Wales Sydney, Kensington, New South Wales, Australia; Centre for Clinical Genetics, Sydney Children's Hospital, Sydney, Randwick, New South Wales, Australia
- Guus M Teunisse
- Neuroscience Research Australia (NeuRA), Randwick, New South Wales, Australia
- Deborah Schofield
- Centre for Economic Impacts of Genomic Medicine, Macquarie Business School, Macquarie University, Macquarie Park, New South Wales, Australia
- Michael Francis Buckley
- NSW Health Pathology Randwick Genomics, Prince of Wales Hospital, Randwick, New South Wales, Australia
- Ying Zhu
- NSW Health Pathology Randwick Genomics, Prince of Wales Hospital, Randwick, New South Wales, Australia; Neuroscience Research Australia (NeuRA), Randwick, New South Wales, Australia; Genetics of Learning Disability Service, Hunter Genetics, Waratah Newcastle, New South Wales, Australia
- Tony Roscioli
- NSW Health Pathology Randwick Genomics, Prince of Wales Hospital, Randwick, New South Wales, Australia; Neuroscience Research Australia (NeuRA), Randwick, New South Wales, Australia; Centre for Clinical Genetics, Sydney Children's Hospital, Sydney, Randwick, New South Wales, Australia
33
Kısakol B, Sarıhan Ş, Ergün MA, Baysan M. Detailed evaluation of cancer sequencing pipelines in different microenvironments and heterogeneity levels. Turk J Biol 2021; 45:114-126. [PMID: 33907494 PMCID: PMC8068765 DOI: 10.3906/biy-2008-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/06/2020] [Accepted: 02/03/2021] [Indexed: 11/25/2022]
Abstract
The importance of next-generation sequencing (NGS) in cancer research is rising as access to this key technology becomes easier for researchers. The sequence data created by NGS technologies must be processed by various bioinformatics algorithms within a pipeline in order to convert raw data into meaningful information. Mapping and variant calling are the two main steps of these analysis pipelines, and many algorithms are available for these steps. Therefore, detailed benchmarking of these algorithms in different scenarios is crucial for the efficient utilization of sequencing technologies. In this study, we compared the performance of twelve pipelines (three mapping and four variant discovery algorithms) with recommended settings to capture single nucleotide variants. We observed significant discrepancies in variant calls among the tested pipelines for different heterogeneity levels in real and simulated samples, with overall high specificity and low sensitivity. In addition to the individual evaluation of pipelines, we also constructed pipeline combinations and tested their performance. In these analyses, we observed that certain pipelines complement each other much better than others and outperform individual pipelines. This suggests that adhering to a single pipeline is not optimal for cancer sequencing analysis and that sample heterogeneity should be considered in algorithm optimization.
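The benchmark design summarised above (three mappers crossed with four variant callers, then combinations of pipelines) can be illustrated with a short Python sketch. The tool names and variant sets are placeholders, since the abstract does not name the algorithms or give call sets, and union/intersection is only one assumed way of combining pipelines. Sensitivity and precision are used because both can be computed from call sets alone; the specificity reported in the abstract would additionally require the set of true-negative positions.

```python
from itertools import product

# Placeholder tool names: the abstract does not list the algorithms used.
MAPPERS = ["mapper_A", "mapper_B", "mapper_C"]
CALLERS = ["caller_W", "caller_X", "caller_Y", "caller_Z"]

# Every mapper paired with every caller gives 3 x 4 = 12 pipelines.
pipelines = list(product(MAPPERS, CALLERS))


def sensitivity(called: set, truth: set) -> float:
    """Fraction of true variants that were called."""
    return len(called & truth) / len(truth)


def precision(called: set, truth: set) -> float:
    """Fraction of called variants that are true."""
    return len(called & truth) / len(called) if called else 0.0


# Toy variants as (chromosome, position, ref, alt); all values are invented.
truth = {("chr1", 100, "A", "G"), ("chr1", 250, "C", "T"),
         ("chr2", 40, "G", "A"), ("chr3", 7, "T", "C")}
calls_1 = {("chr1", 100, "A", "G"), ("chr1", 250, "C", "T"), ("chr5", 9, "A", "C")}
calls_2 = {("chr1", 250, "C", "T"), ("chr2", 40, "G", "A")}

# Two simple ways to combine the call sets of two pipelines.
union_calls = calls_1 | calls_2          # favours sensitivity
intersection_calls = calls_1 & calls_2   # favours precision
```

With these toy sets, the union recovers three of the four true variants while keeping one false call, and the intersection keeps a single call that is true, which mirrors the trade-off a combination strategy has to balance.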
Affiliation(s)
- Batuhan Kısakol
- Department of Physiology and Medical Physics, Centre for Systems Medicine, Royal College of Surgeons in Ireland, Dublin, Ireland
| | - Şahin Sarıhan
- Computer Engineering Department, Faculty of Engineering, Marmara University, İstanbul, Turkey
| | - Mehmet Arif Ergün
- Computer Engineering Department, Faculty of Computer and Informatics Engineering, İstanbul Technical University, İstanbul, Turkey
| | - Mehmet Baysan
- Computer Engineering Department, Faculty of Computer and Informatics Engineering, İstanbul Technical University, İstanbul, Turkey
| |
|
34
|
Next Generation Sequencing Technology in the Clinic and Its Challenges. Cancers (Basel) 2021; 13:cancers13081751. [PMID: 33916923 PMCID: PMC8067551 DOI: 10.3390/cancers13081751] [Citation(s) in RCA: 16] [Impact Index Per Article: 5.3] [Received: 03/09/2021] [Revised: 03/30/2021] [Accepted: 04/05/2021] [Indexed: 12/12/2022] Open
Abstract
Simple Summary: Precise identification and annotation of mutations are of utmost importance in clinical oncology. Insights into the DNA sequence can provide meaningful knowledge to unravel the underlying genetics of disease. Hence, tailoring of personalized medicine often relies on specific genomic alterations for treatment efficacy. The aim of this review is to highlight that sequencing harbors much more than just four nucleotides. Moreover, the gradual transition from first to second generation sequencing technologies has led to awareness of the need to choose the most appropriate bioinformatic analytic tools based on the aim, quality and demand for a specific purpose. Thus, the same raw data can lead to various results reflecting the intrinsic features of different datamining pipelines.
Abstract: Data analysis has become a crucial aspect in clinical oncology to interpret output from next-generation sequencing-based testing. The ability of NGS to resolve billions of sequencing reactions in a few days has consequently increased the demand for tools to handle and analyze such large data sets. Many tools have been developed since the advent of NGS, featuring their own peculiarities. Increased awareness when interpreting alterations in the genome is therefore of utmost importance, as the same data using different tools can provide diverse outcomes. Hence, it is crucial to evaluate and validate bioinformatic pipelines in clinical settings. Moreover, personalized medicine implies treatment targeting efficacy of biological drugs for specific genomic alterations. Here, we focused on different sequencing technologies, features underlying the genome complexity, and bioinformatic tools that can impact the final annotation. Additionally, we discuss the clinical demand and design for implementing NGS.
|
35
|
Weißbach S, Sys S, Hewel C, Todorov H, Schweiger S, Winter J, Pfenninger M, Torkamani A, Evans D, Burger J, Everschor-Sitte K, May-Simera HL, Gerber S. Reliability of genomic variants across different next-generation sequencing platforms and bioinformatic processing pipelines. BMC Genomics 2021; 22:62. [PMID: 33468057 PMCID: PMC7814447 DOI: 10.1186/s12864-020-07362-8] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 07/30/2020] [Accepted: 12/30/2020] [Indexed: 12/14/2022] Open
Abstract
Background Next Generation Sequencing (NGS) is the fundament of various studies, providing insights into questions from biology and medicine. Nevertheless, integrating data from different experimental backgrounds can introduce strong biases. In order to methodically investigate the magnitude of systematic errors in single nucleotide variant calls, we performed a cross-sectional observational study on a genomic cohort of 99 subjects each sequenced via (i) Illumina HiSeq X, (ii) Illumina HiSeq, and (iii) Complete Genomics and processed with the respective bioinformatic pipeline. We also repeated variant calling for the Illumina cohorts with GATK, which allowed us to investigate the effect of the bioinformatics analysis strategy separately from the sequencing platform’s impact. Results The number of detected variants/variant classes per individual was highly dependent on the experimental setup. We observed a statistically significant overrepresentation of variants uniquely called by a single setup, indicating potential systematic biases. Insertion/deletion polymorphisms (indels) were associated with decreased concordance compared to single nucleotide polymorphisms (SNPs). The discrepancies in indel absolute numbers were particularly prominent in introns, Alu elements, simple repeats, and regions with medium GC content. Notably, reprocessing sequencing data following the best practice recommendations of GATK considerably improved concordance between the respective setups. Conclusion We provide empirical evidence of systematic heterogeneity in variant calls between alternative experimental and data analysis setups. Furthermore, our results demonstrate the benefit of reprocessing genomic data with harmonized pipelines when integrating data from different studies. Supplementary Information The online version contains supplementary material available at 10.1186/s12864-020-07362-8.
Affiliation(s)
- Stephan Weißbach
- Institute of Human Genetics, University Medical Center of the Johannes Gutenberg-University Mainz, Mainz, Germany; Institute of Developmental Biology and Neurobiology, Johannes Gutenberg-University Mainz, Mainz, Germany
| | - Stanislav Sys
- Institute of Human Genetics, University Medical Center of the Johannes Gutenberg-University Mainz, Mainz, Germany
| | - Charlotte Hewel
- Institute of Human Genetics, University Medical Center of the Johannes Gutenberg-University Mainz, Mainz, Germany
| | - Hristo Todorov
- Institute of Human Genetics, University Medical Center of the Johannes Gutenberg-University Mainz, Mainz, Germany
| | - Susann Schweiger
- Institute of Human Genetics, University Medical Center of the Johannes Gutenberg-University Mainz, Mainz, Germany; Leibniz Institute for Resilience Research, Mainz, Germany
| | - Jennifer Winter
- Institute of Human Genetics, University Medical Center of the Johannes Gutenberg-University Mainz, Mainz, Germany; Leibniz Institute for Resilience Research, Mainz, Germany
| | - Markus Pfenninger
- Department of Molecular Ecology, Senckenberg Biodiversity and Climate Research Centre, Senckenberganlage 25, 60325, Frankfurt am Main, Germany; Institute for Molecular and Organismic Evolution, Johannes Gutenberg-University Mainz, Johann-Joachim-Becher-Weg 7, 55128, Mainz, Germany; LOEWE Centre for Translational Biodiversity Genomics, Senckenberg Biodiversity and Climate Research Centre, Senckenberganlage 25, 60325, Frankfurt am Main, Germany
| | - Ali Torkamani
- Department of Integrative Structural and Computational Biology, Scripps Research Translational Institute, California Campus, San Diego, USA
| | - Doug Evans
- Department of Integrative Structural and Computational Biology, Scripps Research Translational Institute, California Campus, San Diego, USA
| | - Joachim Burger
- Institute of Anthropology, Johannes Gutenberg-University Mainz, Mainz, Germany
| | | | | | - Susanne Gerber
- Institute of Human Genetics, University Medical Center of the Johannes Gutenberg-University Mainz, Mainz, Germany.
| |
|
36
|
Zhang R, Gao P, Han Y, Zhang R, Tan P, Zhou L, Zhang J, Xie J, Li J. Reliable assessment of BRCA1 and BRCA2 germline variants by next-generation sequencing: a multicenter study. Breast Cancer 2021; 28:672-683. [PMID: 33400207 DOI: 10.1007/s12282-020-01204-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/10/2020] [Accepted: 12/09/2020] [Indexed: 11/24/2022]
Abstract
BACKGROUND BRCA1/2 gene mutation testing, based on next-generation sequencing (NGS), has been gradually applied in the clinic to serve as preventive early screening for predisposed individuals or to provide treatment options for patients with hereditary breast or ovarian cancers. Here, we evaluated the accuracy of NGS-based mutation detection in BRCA1/2 and the consistency in variant interpretation among clinical laboratories to find the possible reasons underlying inaccurate results and discrepant variant interpretation. METHODS Laboratories were asked to use their routine procedures to analyze six mimetic DNA samples with different BRCA1/2 germline variants. The results of variant detection were required to be submitted via a web-based evaluation system and were automatically scored according to predefined criteria. The variant interpretation report, including the detailed clinical evidence, was summarized and analyzed for reasons underlying inconsistent results. RESULTS Overall, only 55.2% (16/29) of laboratories achieved a detection score higher than 90 points and were considered to have an acceptable detection capability. 82.9% (29/35) of the errors were genotype errors. The variant classification results were generally consistent, and 77.8% (7/9) of the variants were given a consistent classification. Only two single nucleotide variants (SNVs) had a discrepant classification across laboratories. CONCLUSIONS BRCA1/2 variant detection performance should be further improved, especially in reporting the correct genome coordinates. Inconsistent variant classification may result from the different pieces of clinical evidence collected by the laboratories. However, discordant clinical evidence also appeared within the same classification results. Therefore, our study provided clear clinical evidence assessment strategies for BRCA1/2 variants, aimed at obtaining a consistent variant classification strategy for providing accurate clinical reports to clinicians.
Affiliation(s)
- Rui Zhang
- National Center for Clinical Laboratories, Beijing Hospital, National Center of Gerontology, Institute of Geriatric Medicine, Chinese Academy of Medical Sciences, Beijing, People's Republic of China
- Beijing Engineering Research Center of Laboratory Medicine, Beijing Hospital, Beijing, People's Republic of China
| | - Peng Gao
- National Center for Clinical Laboratories, Beijing Hospital, National Center of Gerontology, Institute of Geriatric Medicine, Chinese Academy of Medical Sciences, Beijing, People's Republic of China
- Graduate School, Peking Union Medical College, Chinese Academy of Medical Sciences, Beijing, People's Republic of China
- Beijing Engineering Research Center of Laboratory Medicine, Beijing Hospital, Beijing, People's Republic of China
| | - Yanxi Han
- National Center for Clinical Laboratories, Beijing Hospital, National Center of Gerontology, Institute of Geriatric Medicine, Chinese Academy of Medical Sciences, Beijing, People's Republic of China
- Beijing Engineering Research Center of Laboratory Medicine, Beijing Hospital, Beijing, People's Republic of China
| | - Runling Zhang
- National Center for Clinical Laboratories, Beijing Hospital, National Center of Gerontology, Institute of Geriatric Medicine, Chinese Academy of Medical Sciences, Beijing, People's Republic of China
- Graduate School, Peking Union Medical College, Chinese Academy of Medical Sciences, Beijing, People's Republic of China
- Beijing Engineering Research Center of Laboratory Medicine, Beijing Hospital, Beijing, People's Republic of China
| | - Ping Tan
- National Center for Clinical Laboratories, Beijing Hospital, National Center of Gerontology, Institute of Geriatric Medicine, Chinese Academy of Medical Sciences, Beijing, People's Republic of China
- Graduate School, Peking Union Medical College, Chinese Academy of Medical Sciences, Beijing, People's Republic of China
- Beijing Engineering Research Center of Laboratory Medicine, Beijing Hospital, Beijing, People's Republic of China
| | - Li Zhou
- National Center for Clinical Laboratories, Beijing Hospital, National Center of Gerontology, Institute of Geriatric Medicine, Chinese Academy of Medical Sciences, Beijing, People's Republic of China
- Graduate School, Peking Union Medical College, Chinese Academy of Medical Sciences, Beijing, People's Republic of China
- Beijing Engineering Research Center of Laboratory Medicine, Beijing Hospital, Beijing, People's Republic of China
| | - Jiawei Zhang
- National Center for Clinical Laboratories, Beijing Hospital, National Center of Gerontology, Institute of Geriatric Medicine, Chinese Academy of Medical Sciences, Beijing, People's Republic of China
- Graduate School, Peking Union Medical College, Chinese Academy of Medical Sciences, Beijing, People's Republic of China
- Beijing Engineering Research Center of Laboratory Medicine, Beijing Hospital, Beijing, People's Republic of China
| | - Jiehong Xie
- National Center for Clinical Laboratories, Beijing Hospital, National Center of Gerontology, Institute of Geriatric Medicine, Chinese Academy of Medical Sciences, Beijing, People's Republic of China
- Beijing Engineering Research Center of Laboratory Medicine, Beijing Hospital, Beijing, People's Republic of China
| | - Jinming Li
- National Center for Clinical Laboratories, Beijing Hospital, National Center of Gerontology, Institute of Geriatric Medicine, Chinese Academy of Medical Sciences, Beijing, People's Republic of China.
- Graduate School, Peking Union Medical College, Chinese Academy of Medical Sciences, Beijing, People's Republic of China.
- Beijing Engineering Research Center of Laboratory Medicine, Beijing Hospital, Beijing, People's Republic of China.
| |
|
37
|
Ray M, Sable MN, Sarkar S, Hallur V. Essential interpretations of bioinformatics in COVID-19 pandemic. Meta Gene 2020; 27:100844. [PMID: 33349792 PMCID: PMC7744275 DOI: 10.1016/j.mgene.2020.100844] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 10/18/2020] [Revised: 12/02/2020] [Accepted: 12/14/2020] [Indexed: 02/06/2023] Open
Abstract
The emerging pathogen SARS-CoV-2 has produced a global pandemic crisis by causing COVID-19. The unique and novel genetic makeup of SARS-CoV-2 has created hurdles in biological research, and as a result potential drug and vaccine candidates have not yet been discovered by the scientific community. Meanwhile, bioinformatics has contributed major advances to viral research over the last few decades. Bioinformatics tools and techniques have been used successfully to interpret the genomic architecture of this virus. Major in silico approaches, including next-generation sequencing, genome-wide association studies, and computer-aided drug design, have been applied effectively in COVID-19 research and have yielded novel information on SARS-CoV-2 in several ways. In silico studies have not only sequenced the SARS-CoV-2 genome but also analyzed sequencing errors, evolutionary relationships, genetic variations, and putative drug candidates against SARS-CoV-2 viral genes within a very short time. These results will be valuable for further research on the COVID-19 pandemic and essential for developing vaccines against SARS-CoV-2 to protect public health.
Affiliation(s)
- Manisha Ray
- Department of Pathology & Lab Medicine, All India Institute of Medical Sciences, Bhubaneswar, Odisha 751019, India
| | - Mukund Namdev Sable
- Department of ENT, All India Institute of Medical Sciences, Bhubaneswar, Odisha 751019, India
| | - Saurav Sarkar
- Department of Microbiology, All India Institute of Medical Sciences, Bhubaneswar, Odisha 751019, India
| | - Vinaykumar Hallur
- Department of Microbiology, All India Institute of Medical Sciences, Bhubaneswar, Odisha 751019, India
| |
|
38
|
Hart RK, Prlić A. SeqRepo: A system for managing local collections of biological sequences. PLoS One 2020; 15:e0239883. [PMID: 33270643 PMCID: PMC7714221 DOI: 10.1371/journal.pone.0239883] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 09/13/2020] [Accepted: 11/13/2020] [Indexed: 11/19/2022] Open
Abstract
MOTIVATION Access to biological sequence data, such as genome, transcript, or protein sequence, is at the core of many bioinformatics analysis workflows. The National Center for Biotechnology Information (NCBI), Ensembl, and other sequence database maintainers provide methods to access sequences through network connections. For many users, the convenience and currency of remotely managed data are compelling, and the network latency is inconsequential. However, for high-throughput and clinical applications, local sequence collections are essential for performance, stability, privacy, and reproducibility. RESULTS Here we describe SeqRepo, a novel system for building a local, high-performance, non-redundant collection of biological sequences. SeqRepo enables clients to use primary database identifiers and several digests to identify sequences and sequence aliases. SeqRepo provides a native Python interface and a REST interface, which can run locally and enables access from other programming languages. SeqRepo also provides an alternative REST interface based on the GA4GH refget protocol. SeqRepo provides fast random access to sequence slices. We provide results that demonstrate that a local SeqRepo sequence collection yields significant performance benefits of up to 1300-fold over remote sequence collections. In our use case for a variant validation and normalization pipeline, SeqRepo improved throughput 50-fold relative to use with remote sequences. SeqRepo may be used with any species or sequence type. Regular snapshots of human sequence collections are available. It is often convenient or necessary to use a computed digest as a sequence identifier. For example, a digest-based identifier may be used to refer to proprietary reference genomes or segments of a graph genome, for which conventional identifiers will not be available.
Here we also introduce a convention for the application of the SHA-512 hashing algorithm with Base64 encoding to generate URL-safe identifiers. This convention, sha512t24u, combines a fast digest mechanism with a space-efficient representation that can be used for any object. Our report includes an analysis of timing and collision probabilities for sha512t24u. SeqRepo enables clients to use sha512t24u as identifiers, thereby seamlessly integrating public and private sequence sets. AVAILABILITY SeqRepo is released under the Apache License 2.0 and is available on GitHub and PyPI. Docker images and database snapshots are also available. See https://github.com/biocommons/biocommons.seqrepo.
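The digest convention named in this abstract can be sketched in a few lines of standard-library Python. This is a minimal illustration rather than SeqRepo's own code: it assumes, as the name sha512t24u suggests, that the SHA-512 digest is truncated to its first 24 bytes and then encoded with the URL-safe Base64 alphabet. The function name simply mirrors the convention's name.

```python
import base64
import hashlib


def sha512t24u(data: bytes) -> str:
    """Return a URL-safe identifier for `data`.

    Assumed steps: SHA-512 digest, truncate to the first 24 bytes,
    then Base64-encode with the URL-safe alphabet ("-" and "_").
    """
    digest = hashlib.sha512(data).digest()
    # 24 bytes encode to exactly 32 Base64 characters, so no "=" padding appears.
    return base64.urlsafe_b64encode(digest[:24]).decode("ascii")


# The same sequence bytes always produce the same 32-character identifier.
print(sha512t24u(b"ACGT"))
print(len(sha512t24u(b"ACGT")))
```

Truncating to 24 bytes keeps 192 bits of the digest, and because 24 is a multiple of 3 the Base64 output needs no padding characters, which is what makes the result compact and safe to embed in URLs.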
Affiliation(s)
- Reece K. Hart
- Biocommons, San Francisco, CA, United States of America
| | - Andreas Prlić
- Invitae, Inc., San Francisco, CA, United States of America
| |
|
40
|
Zhao S, Agafonov O, Azab A, Stokowy T, Hovig E. Accuracy and efficiency of germline variant calling pipelines for human genome data. Sci Rep 2020; 10:20222. [PMID: 33214604 PMCID: PMC7678823 DOI: 10.1038/s41598-020-77218-4] [Citation(s) in RCA: 42] [Impact Index Per Article: 10.5] [Received: 03/30/2020] [Accepted: 11/02/2020] [Indexed: 12/30/2022] Open
Abstract
Advances in next-generation sequencing technology have enabled whole genome sequencing (WGS) to be widely used for identification of causal variants in a spectrum of genetic-related disorders, and provided new insight into how genetic polymorphisms affect disease phenotypes. The development of different bioinformatics pipelines has continuously improved the variant analysis of WGS data. However, there is a necessity for a systematic performance comparison of these pipelines to provide guidance on the application of WGS-based scientific and clinical genomics. In this study, we evaluated the performance of three variant calling pipelines (GATK, DRAGEN and DeepVariant) using the Genome in a Bottle Consortium, "synthetic-diploid" and simulated WGS datasets. DRAGEN and DeepVariant show better accuracy in SNP and indel calling, with no significant differences in their F1-score. DRAGEN platform offers accuracy, flexibility and a highly-efficient execution speed, and therefore superior performance in the analysis of WGS data on a large scale. The combination of DRAGEN and DeepVariant also suggests a good balance of accuracy and efficiency as an alternative solution for germline variant detection in further applications. Our results facilitate the standardization of benchmarking analysis of bioinformatics pipelines for reliable variant detection, which is critical in genetics-based medical research and clinical applications.
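The F1-score used in this abstract to compare variant callers is the harmonic mean of precision and recall. The sketch below computes it from counts of true positives, false positives and false negatives; the counts are invented for illustration and are not results from the study.

```python
def f1_score(tp: int, fp: int, fn: int) -> float:
    """Harmonic mean of precision and recall from raw counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Invented example: 90 variants called correctly, 10 false calls, 30 missed.
score = f1_score(tp=90, fp=10, fn=30)
print(f"{score:.4f}")
```

Because F1 does not use true negatives, it suits variant calling, where the overwhelming majority of genome positions are correctly left uncalled and would otherwise dominate the score.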
Affiliation(s)
- Sen Zhao
- Department of Tumor Biology, Institute of Cancer Research, The Norwegian Radium Hospital, Oslo University Hospital, 0310, Oslo, Norway
| | | | - Abdulrahman Azab
- Center for Bioinformatics, Department of Informatics, University of Oslo, 0316, Oslo, Norway
- Division of Research Computing, University Center for Information Technology (USIT), University of Oslo, 0316, Oslo, Norway
| | - Tomasz Stokowy
- Computational Biology Unit, Institute of Informatics, University of Bergen, 5008, Bergen, Norway
- Department of Clinical Science, University of Bergen, 5021, Bergen, Norway
| | - Eivind Hovig
- Department of Tumor Biology, Institute of Cancer Research, The Norwegian Radium Hospital, Oslo University Hospital, 0310, Oslo, Norway.
- Center for Bioinformatics, Department of Informatics, University of Oslo, 0316, Oslo, Norway.
| |
|
41
|
Eghbalnia HR, Wilfinger WW, Mackey K, Chomczynski P. Coordinated analysis of exon and intron data reveals novel differential gene expression changes. Sci Rep 2020; 10:15669. [PMID: 32973253 PMCID: PMC7515875 DOI: 10.1038/s41598-020-72482-w] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 05/03/2020] [Accepted: 08/24/2020] [Indexed: 12/14/2022] Open
Abstract
RNA-Seq expression analysis currently relies primarily upon exon expression data. The recognized role of introns during translation, and the presence of substantial RNA-Seq counts attributable to introns, provide the rationale for the simultaneous consideration of both exon and intron data. We describe here a method for the coordinated analysis of exon and intron data by investigating their relationship within individual genes and across samples, while taking into account changes in both variability and expression level. This coordinated analysis of exon and intron data offers strong evidence for significant differences that distinguish the profiles of the exon-only expression data from the combined exon and intron data. One advantage of our proposed method, called matched change characterization for exons and introns (MEI), is its straightforward applicability to existing archived data using small modifications to standard RNA-Seq pipelines. Using MEI, we demonstrate that when data are examined for changes in variability across control and case conditions, novel differential changes can be detected. Notably, when MEI criteria were employed in the analysis of an archived data set involving polyarthritic subjects, the number of differentially expressed genes was expanded by sevenfold. More importantly, the observed changes in exon and intron variability with statistically significant false discovery rates could be traced to specific immune pathway gene networks. The application of MEI analysis provides a strategy for incorporating the significance of exon and intron variability and further developing the role of using both exons and intron sequencing counts in studies of gene regulatory processes.
Affiliation(s)
- Hamid R Eghbalnia
- University of Wisconsin-Madison, Madison, USA; University of Cincinnati, Cincinnati, USA.
| | | | - Karol Mackey
- Molecular Research Center, Inc., Cincinnati, USA
| | | |
|
42
|
An integrated Asian human SNV and indel benchmark established using multiple sequencing methods. Sci Rep 2020; 10:9821. [PMID: 32555294 PMCID: PMC7300012 DOI: 10.1038/s41598-020-66605-6] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 11/07/2019] [Accepted: 05/05/2020] [Indexed: 11/08/2022] Open
Abstract
Sequencing technologies have developed rapidly in recent years, leading to breakthroughs in sequencing-based clinical diagnosis, but an accurate and complete genome variation benchmark is required for further assessment of precision medicine applications. Although the human cell line NA12878 has been successfully developed into a variation benchmark, population-specific variation benchmarks are still lacking. Here, we established an Asian human variation benchmark by constructing and sequencing a stabilized cell line of a Chinese Han volunteer. Using seven different sequencing strategies, we obtained ~3.88 Tb of clean data from different laboratories, aiming for high sequencing depth and accurate variation detection. By combining variations identified with different sequencing strategies and different analysis pipelines, we identified 3.35 million SNVs and 348.65 thousand indels, which were well supported by our sequencing data and passed our strict quality control, and thus should constitute a high-confidence variation benchmark. In addition, we detected 5,913 high-quality SNVs, 969 of which were novel, located in highly homologous regions and supported by long-range information in both the co-barcoding single tube Long Fragment Read (stLFR) data and PacBio HiFi CCS data. Furthermore, using the long-read data (stLFR and HiFi CCS), we were able to phase more than 99% of heterozygous SNVs, which extends the benchmark to the haplotype level. Our study provides comprehensive sequencing data as well as an integrated variation benchmark for an Asian-derived cell line, which should be valuable for future sequencing-based clinical development.
|
43
|
Zhang Q, Luo M, Liu CJ, Guo AY. CCLA: an accurate method and web server for cancer cell line authentication using gene expression profiles. Brief Bioinform 2020; 22:5854406. [PMID: 32510568 DOI: 10.1093/bib/bbaa093] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 02/27/2020] [Revised: 04/26/2020] [Accepted: 04/28/2020] [Indexed: 01/28/2023] Open
Abstract
Cancer cell lines (CCLs) as important model systems play critical roles in cancer research. The misidentification and contamination of CCLs are serious problems, leading to unreliable results and waste of resources. Current methods for CCL authentication are mainly based on the CCL-specific genetic polymorphism, whereas no method is available for CCL authentication using gene expression profiles. Here, we developed a novel method and homonymic web server (CCLA, Cancer Cell Line Authentication, http://bioinfo.life.hust.edu.cn/web/CCLA/) to authenticate 1291 human CCLs of 28 tissues using gene expression profiles. CCLA showed an excellent speed advantage and high accuracy for CCL authentication, a top 1 accuracy of 96.58 or 92.15% (top 3 accuracy of 100 or 95.11%) for microarray or RNA-Seq validation data (719 samples, 461 CCLs), respectively. To the best of our knowledge, CCLA is the first approach to authenticate CCLs using gene expression data. Users can freely and conveniently authenticate CCLs using gene expression profiles or NCBI GEO accession on CCLA website.
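The top 1 and top 3 accuracies quoted in this abstract count a sample as correct when its true cell line appears among the first k ranked candidates. The sketch below shows that calculation on invented rankings; how CCLA actually scores and ranks candidates is not described here, so the ranked lists are simply assumed as input.

```python
def top_k_accuracy(ranked_predictions, true_labels, k):
    """Fraction of samples whose true label is among the first k predictions."""
    if len(ranked_predictions) != len(true_labels):
        raise ValueError("one ranked list is required per true label")
    if not true_labels:
        return 0.0
    hits = sum(
        1
        for ranked, truth in zip(ranked_predictions, true_labels)
        if truth in ranked[:k]
    )
    return hits / len(true_labels)


# Invented rankings for four samples, best candidate first.
ranked = [
    ["HeLa", "A549", "MCF7"],
    ["A549", "MCF7", "HeLa"],
    ["HeLa", "MCF7", "A549"],
    ["MCF7", "HeLa", "A549"],
]
truth = ["HeLa", "MCF7", "A549", "MCF7"]

print(top_k_accuracy(ranked, truth, k=1))
print(top_k_accuracy(ranked, truth, k=3))
```

Top-k accuracy can only stay the same or rise as k grows, which is why the reported top 3 figures are at least as high as the top 1 figures.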
|
44
|
Schilbert HM, Rempel A, Pucker B. Comparison of Read Mapping and Variant Calling Tools for the Analysis of Plant NGS Data. Plants (Basel) 2020; 9:E439. [PMID: 32252268 PMCID: PMC7238416 DOI: 10.3390/plants9040439] [Citation(s) in RCA: 23] [Impact Index Per Article: 5.8] [Received: 03/15/2020] [Revised: 03/28/2020] [Accepted: 03/30/2020] [Indexed: 12/30/2022]
Abstract
High-throughput sequencing technologies have rapidly developed during the past years and have become an essential tool in plant sciences. However, the analysis of genomic data remains challenging and relies mostly on the performance of automatic pipelines. Frequently applied pipelines involve the alignment of sequence reads against a reference sequence and the identification of sequence variants. Since most benchmarking studies of bioinformatics tools for this purpose have been conducted on human datasets, there is a lack of benchmarking studies in plant sciences. In this study, we evaluated the performance of 50 different variant calling pipelines, including five read mappers and ten variant callers, on six real plant datasets of the model organism Arabidopsis thaliana. Sets of variants were evaluated based on various parameters including sensitivity and specificity. We found that all investigated tools are suitable for analysis of NGS data in plant research. When looking at different performance metrics, BWA-MEM and Novoalign were the best mappers and GATK returned the best results in the variant calling step.
Affiliation(s)
- Hanna Marie Schilbert
- Genetics and Genomics of Plants, CeBiTec and Faculty of Biology, Bielefeld University, 33615 Bielefeld, Germany
| | - Andreas Rempel
- Genetics and Genomics of Plants, CeBiTec and Faculty of Biology, Bielefeld University, 33615 Bielefeld, Germany
- Graduate School DILS, Bielefeld Institute for Bioinformatics Infrastructure (BIBI), Faculty of Technology, Bielefeld University, 33615 Bielefeld, Germany
| | - Boas Pucker
- Genetics and Genomics of Plants, CeBiTec and Faculty of Biology, Bielefeld University, 33615 Bielefeld, Germany
- Molecular Genetics and Physiology of Plants, Faculty of Biology and Biotechnology, Ruhr-University Bochum, 44801 Bochum, Germany
|
45
|
Zelli V, Compagnoni C, Cannita K, Capelli R, Capalbo C, Di Vito Nolfi M, Alesse E, Zazzeroni F, Tessitore A. Applications of Next Generation Sequencing to the Analysis of Familial Breast/Ovarian Cancer. High Throughput 2020; 9:ht9010001. [PMID: 31936873 PMCID: PMC7151204 DOI: 10.3390/ht9010001] [Citation(s) in RCA: 22] [Impact Index Per Article: 5.5] [Received: 11/29/2019] [Revised: 01/02/2020] [Accepted: 01/07/2020] [Indexed: 12/24/2022] Open
Abstract
Next generation sequencing (NGS) provides a powerful tool in the field of medical genetics, allowing one to perform multi-gene analysis and to sequence entire exomes (WES), transcriptomes or genomes (WGS). The generated high-throughput data are particularly suitable for enhancing the understanding of the genetic bases of complex, multi-gene diseases, such as cancer. Among the various types of tumors, those with a familial predisposition are of great interest for the isolation of novel genes or gene variants, detectable at the germline level and involved in cancer pathogenesis. The identification of novel genetic factors would have great translational value, helping clinicians in defining risk and prevention strategies. In this regard, it is known that the majority of breast/ovarian cases with familial predisposition, lacking variants in the highly penetrant BRCA1 and BRCA2 genes (non-BRCA), remains unexplained, although several less penetrant genes (e.g., ATM, PALB2) have been identified. In this scenario, NGS technologies offer a powerful tool for the discovery of novel factors involved in familial breast/ovarian cancer. In this review, we summarize and discuss the state of the art applications of NGS gene panels, WES and WGS in the context of familial breast/ovarian cancer.
Affiliation(s)
- Veronica Zelli
- Department of Biotechnological and Applied Clinical Sciences, University of L’Aquila, Via Vetoio, Coppito 2, 67100 L’Aquila, Italy; (V.Z.); (C.C.); (R.C.); (M.D.V.N.); (E.A.); (F.Z.)
- Center for Molecular Diagnostics and Advanced Therapies, University of L’Aquila, Via Petrini, 67100 L’Aquila, Italy
| | - Chiara Compagnoni
- Department of Biotechnological and Applied Clinical Sciences, University of L’Aquila, Via Vetoio, Coppito 2, 67100 L’Aquila, Italy; (V.Z.); (C.C.); (R.C.); (M.D.V.N.); (E.A.); (F.Z.)
| | - Katia Cannita
- Medical Oncology Unit, St Salvatore Hospital, Via L. Natali 1, 67100 L’Aquila, Italy;
| | - Roberta Capelli
- Department of Biotechnological and Applied Clinical Sciences, University of L’Aquila, Via Vetoio, Coppito 2, 67100 L’Aquila, Italy; (V.Z.); (C.C.); (R.C.); (M.D.V.N.); (E.A.); (F.Z.)
| | - Carlo Capalbo
- Department of Molecular Medicine, University of Rome “La Sapienza”, Viale Regina Elena 324, 00161 Rome, Italy;
| | - Mauro Di Vito Nolfi
- Department of Biotechnological and Applied Clinical Sciences, University of L’Aquila, Via Vetoio, Coppito 2, 67100 L’Aquila, Italy; (V.Z.); (C.C.); (R.C.); (M.D.V.N.); (E.A.); (F.Z.)
| | - Edoardo Alesse
- Department of Biotechnological and Applied Clinical Sciences, University of L’Aquila, Via Vetoio, Coppito 2, 67100 L’Aquila, Italy; (V.Z.); (C.C.); (R.C.); (M.D.V.N.); (E.A.); (F.Z.)
| | - Francesca Zazzeroni
- Department of Biotechnological and Applied Clinical Sciences, University of L’Aquila, Via Vetoio, Coppito 2, 67100 L’Aquila, Italy; (V.Z.); (C.C.); (R.C.); (M.D.V.N.); (E.A.); (F.Z.)
| | - Alessandra Tessitore
- Department of Biotechnological and Applied Clinical Sciences, University of L’Aquila, Via Vetoio, Coppito 2, 67100 L’Aquila, Italy; (V.Z.); (C.C.); (R.C.); (M.D.V.N.); (E.A.); (F.Z.)
- Center for Molecular Diagnostics and Advanced Therapies, University of L’Aquila, Via Petrini, 67100 L’Aquila, Italy
|
46
|
De Vitis C, Corleone G, Salvati V, Ascenzi F, Pallocca M, De Nicola F, Fanciulli M, di Martino S, Bruschini S, Napoli C, Ricci A, Bassi M, Venuta F, Rendina EA, Ciliberto G, Mancini R. B4GALT1 Is a New Candidate to Maintain the Stemness of Lung Cancer Stem Cells. J Clin Med 2019; 8:E1928. [PMID: 31717588 PMCID: PMC6912435 DOI: 10.3390/jcm8111928] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Received: 09/16/2019] [Revised: 10/30/2019] [Accepted: 11/05/2019] [Indexed: 12/19/2022] Open
Abstract
BACKGROUND According to the cancer stem cells (CSCs) hypothesis, a population of cancer cells with stem cell properties is responsible for tumor propagation, drug resistance, and disease recurrence. Study of the mechanisms responsible for lung CSCs propagation is expected to provide better understanding of cancer biology and new opportunities for therapy. METHODS The Lung Adenocarcinoma (LUAD) NCI-H460 cell line was grown either as 2D or as 3D cultures. Transcriptomic and genome-wide chromatin accessibility studies of 2D vs. 3D cultures were carried out using RNA-sequencing and Assay for Transposase Accessible Chromatin with high-throughput sequencing (ATAC-seq), respectively. Reverse transcription polymerase chain reaction (RT-PCR) was also carried out on RNA extracted from primary cultures derived from malignant pleural effusions to validate RNA-seq results. RESULTS RNA-seq and ATAC-seq data disentangled transcriptional and genome accessibility variability of 3D vs. 2D cultures in NCI-H460 cells. The examination of the genomic landscape of genes upregulated in 3D vs. 2D cultures led to the identification of Beta-1,4-galactosyltransferase 1 (B4GALT1) as the top candidate. B4GALT1 was validated as a stemness factor, since its silencing caused strong inhibition of 3D spheroid formation. CONCLUSION Combined transcriptomic and chromatin accessibility study of 3D vs. 2D LUAD cultures led to the identification of B4GALT1 as a new factor involved in the propagation and maintenance of LUAD CSCs.
Affiliation(s)
- Claudia De Vitis
- Department of Clinical and Molecular Medicine, Sant’Andrea Hospital, “Sapienza” University of Rome, 00161 Rome, Italy; (C.D.V.); (R.M.)
| | - Giacomo Corleone
- SAFU Laboratory, Department of Research, Advanced Diagnostic, and Technological Innovation, IRCCS “Regina Elena” National Cancer Institute, 00144 Rome, Italy; (G.C.); (M.P.); (F.D.N.); (M.F.)
| | - Valentina Salvati
- Preclinical Models and New Therapeutic Agents Unit, IRCCS-Regina Elena National Cancer Institute, 00144 Rome, Italy;
| | - Francesca Ascenzi
- Tumor Immunology and Immunotherapy Unit, Department of Research, Advanced Diagnostic and Technological Innovation, IRCCS Regina Elena National Cancer Institute, 00144 Rome, Italy;
| | - Matteo Pallocca
- SAFU Laboratory, Department of Research, Advanced Diagnostic, and Technological Innovation, IRCCS “Regina Elena” National Cancer Institute, 00144 Rome, Italy; (G.C.); (M.P.); (F.D.N.); (M.F.)
| | - Francesca De Nicola
- SAFU Laboratory, Department of Research, Advanced Diagnostic, and Technological Innovation, IRCCS “Regina Elena” National Cancer Institute, 00144 Rome, Italy; (G.C.); (M.P.); (F.D.N.); (M.F.)
| | - Maurizio Fanciulli
- SAFU Laboratory, Department of Research, Advanced Diagnostic, and Technological Innovation, IRCCS “Regina Elena” National Cancer Institute, 00144 Rome, Italy; (G.C.); (M.P.); (F.D.N.); (M.F.)
| | - Simona di Martino
- Pathology Unit, IRCCS “Regina Elena” National Cancer Institute, 00144 Rome, Italy;
| | - Sara Bruschini
- Department of Experimental and Clinical Medicine, Magna Graecia University of Catanzaro, 88100 Catanzaro, Italy;
| | - Christian Napoli
- Department of Medical Surgical Sciences and Translational Medicine, Sant’Andrea Hospital, “Sapienza” University of Rome, 00189 Rome, Italy;
| | - Alberto Ricci
- Department of Clinical and Molecular Medicine, Division of Pneumology, Sapienza University of Rome, Sant’Andrea Hospital, 00189 Rome, Italy;
| | - Massimiliano Bassi
- Department of Thoracic Surgery, University of Rome Sapienza, 00161 Rome, Italy; (M.B.); (F.V.)
| | - Federico Venuta
- Department of Thoracic Surgery, University of Rome Sapienza, 00161 Rome, Italy; (M.B.); (F.V.)
| | - Erino Angelo Rendina
- Department of Thoracic Surgery, Sant’Andrea Hospital, “Sapienza” University of Rome, 00189 Rome, Italy
| | - Gennaro Ciliberto
- Scientific Direction, IRCCS “Regina Elena” National Cancer Institute, 00144 Rome, Italy
| | - Rita Mancini
- Department of Clinical and Molecular Medicine, Sant’Andrea Hospital, “Sapienza” University of Rome, 00161 Rome, Italy; (C.D.V.); (R.M.)
|
47
|
Bertolini F, Chinchilla-Vargas J, Khadse JR, Juneja A, Deshpande PD, Bhave K, Potdar V, Kakramkar PM, Karlekar AR, Pande AB, Fernando RL, Rothschild MF. Marker discovery and associations with β-carotene content in Indian dairy cattle and buffalo breeds. J Dairy Sci 2019; 102:10039-10055. [PMID: 31477308 PMCID: PMC7753891 DOI: 10.3168/jds.2019-16361] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 01/24/2019] [Accepted: 06/30/2019] [Indexed: 01/17/2023]
Abstract
Vitamin A is essential for human health, but current intake levels in many developing countries such as India are too low due to malnutrition. According to the World Health Organization, an estimated 250 million preschool children are vitamin A deficient globally. This number excludes pregnant women and nursing mothers, who are particularly vulnerable. Efforts to improve access to vitamin A are key because supplementation can reduce mortality rates in young children in developing countries by around 23%. Three key genes, BCMO1, BCO2, and SCARB1, have been shown to be associated with the amount of β-carotene (BC) in milk. Whole-genome sequencing reads from the coordinates of these 3 genes in 202 non-Indian cattle (141 Bos taurus, 61 Bos indicus) and 35 non-Indian buffalo (Bubalus bubalis) animals from several breeds were collected from data repositories. The number of SNP detected in the coding regions of these 3 genes ranged from 16 to 26 in the 3 species, with 5 overlapping SNP between B. taurus and B. indicus. All these SNP together with 2 SNP in the upstream part of the gene but already present in dbSNP (https://www.ncbi.nlm.nih.gov/projects/SNP/) were used to build a custom Sequenom array. Blood for DNA and milk samples for BC were obtained from 2,291 Indian cows of 5 different breeds (Gir, Holstein cross, Jersey Cross, Tharparkar, and Sahiwal) and 2,242 Indian buffaloes (Jafarabadi, Murrah, Pandharpuri, and Surti breeds). The DNA was extracted and genotyped with the Sequenom array. For each individual breed and the combined breeds, SNP with an association that had a P-value <0.3 in the first round of linear analysis were included in a second step of regression analyses to determine allele substitution effects to increase the content of BC in milk. Additionally, an F-test for all SNP within gene was performed with the objective of determining if overall the gene had a significant effect on the content of BC in milk. 
The analyses were repeated using a Bayesian approach to compare and validate the previous frequentist results. Multiple significant SNP were found using both methodologies with allele substitution effects ranging from 6.21 (3.13) to 9.10 (5.43) μg of BC per 100 mL of milk. Total gene effects exceeded the mean BC value for all breeds with both analysis approaches. The custom panel designed for genes related to BC production demonstrated applicability in genotyping of cattle and buffalo in India and may be used for cattle or buffalo from other developing countries. Moreover, the recommendation of selection for significant specific alleles of some gene markers provides a route to effectively increase the BC content in milk in the Indian cattle and buffalo populations.
Affiliation(s)
- F Bertolini
- Department of Animal Science, Iowa State University, 2255 Kildee Hall, 806 Stange Road, Ames 50011; National Institute of Aquatic Resources, Technical University of Denmark, Kemitorvet, 2800 Kgs. Lyngby, Denmark
| | - J Chinchilla-Vargas
- Department of Animal Science, Iowa State University, 2255 Kildee Hall, 806 Stange Road, Ames 50011
| | - J R Khadse
- Bharatiya Agro Industries Foundation, Development Research Foundation, Bhavan, Dr. Manibhai Desai Nagar Warje, Pune 411058, India
| | - A Juneja
- Bharatiya Agro Industries Foundation, Development Research Foundation, Bhavan, Dr. Manibhai Desai Nagar Warje, Pune 411058, India
| | - P D Deshpande
- Bharatiya Agro Industries Foundation, Development Research Foundation, Bhavan, Dr. Manibhai Desai Nagar Warje, Pune 411058, India
| | - K Bhave
- Bharatiya Agro Industries Foundation, Development Research Foundation, Bhavan, Dr. Manibhai Desai Nagar Warje, Pune 411058, India
| | - V Potdar
- Bharatiya Agro Industries Foundation, Development Research Foundation, Bhavan, Dr. Manibhai Desai Nagar Warje, Pune 411058, India
| | - P M Kakramkar
- Bharatiya Agro Industries Foundation, Development Research Foundation, Bhavan, Dr. Manibhai Desai Nagar Warje, Pune 411058, India
| | - A R Karlekar
- Bharatiya Agro Industries Foundation, Development Research Foundation, Bhavan, Dr. Manibhai Desai Nagar Warje, Pune 411058, India
| | - A B Pande
- Bharatiya Agro Industries Foundation, Development Research Foundation, Bhavan, Dr. Manibhai Desai Nagar Warje, Pune 411058, India
| | - Rohan L Fernando
- Department of Animal Science, Iowa State University, 2255 Kildee Hall, 806 Stange Road, Ames 50011
| | - M F Rothschild
- Department of Animal Science, Iowa State University, 2255 Kildee Hall, 806 Stange Road, Ames 50011.
|
48
|
Gao P, Zhang R, Li J. Comprehensive elaboration of database resources utilized in next-generation sequencing-based tumor somatic mutation detection. Biochim Biophys Acta Rev Cancer 2019; 1872:122-137. [PMID: 31265877 DOI: 10.1016/j.bbcan.2019.06.004] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 04/25/2019] [Revised: 06/16/2019] [Accepted: 06/26/2019] [Indexed: 12/20/2022]
Abstract
The rapid evolution of next-generation sequencing (NGS)-based tumor genomic profile detection and the emergence of molecularly targeted therapies have enabled precision oncology. In NGS-based analysis, various types of databases have been developed to perform different functions. However, many problems still exist when using these public databases. Therefore, it is important to better understand the characteristics and limitations of each database and have them complement each other to provide useful clinical evidence for NGS testing. In this review, we elaborate on the important role of databases and their concrete applications in NGS-based somatic mutation detection. We introduce the typically used databases for sequence alignment, variant filtration, and variant interpretation, and compare the differences between the databases with similar functions. Subsequently, we determine the limitations of each database and provide the corresponding solutions. Furthermore, we present an overview diagram to clearly illustrate the database used in the entire NGS-based somatic mutation detection pipeline.
Affiliation(s)
- Peng Gao
- National Center for Clinical Laboratories, Beijing Hospital, National Center of Gerontology, Beijing, People's Republic of China; Graduate School, Peking Union Medical College, Chinese Academy of Medical Sciences, Beijing, People's Republic of China; Beijing Engineering Research Center of Laboratory Medicine, Beijing Hospital, Beijing, People's Republic of China
| | - Rui Zhang
- National Center for Clinical Laboratories, Beijing Hospital, National Center of Gerontology, Beijing, People's Republic of China; Beijing Engineering Research Center of Laboratory Medicine, Beijing Hospital, Beijing, People's Republic of China.
| | - Jinming Li
- National Center for Clinical Laboratories, Beijing Hospital, National Center of Gerontology, Beijing, People's Republic of China; Graduate School, Peking Union Medical College, Chinese Academy of Medical Sciences, Beijing, People's Republic of China; Beijing Engineering Research Center of Laboratory Medicine, Beijing Hospital, Beijing, People's Republic of China.
|