1
González-Carracedo MA, Herrera-Luis E, Marco-Simancas M, Escuela-Escobar A, Martín-González E, Sardón-Prado O, Corcuera P, Hernández-Pérez JM, Lorenzo-Díaz F, Pérez-Pérez JA. Haplotype-Aware Detection of SERPINA1 Variants by Nanopore Sequencing. J Mol Diagn 2024:S1525-1578(24)00195-8. [PMID: 39276924 DOI: 10.1016/j.jmoldx.2024.08.002] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/04/2024] [Revised: 08/01/2024] [Accepted: 08/14/2024] [Indexed: 09/17/2024]
Abstract
α-1 Antitrypsin (AAT) is an acute-phase reactant with immunomodulatory properties that mainly inhibits neutrophil elastase. Low serum levels cause AAT deficiency (AATD), an underdiagnosed condition that predisposes to pulmonary and hepatic diseases. The SERPINA1 gene, which encodes AAT, contains >500 variants. PI∗Z and PI∗S alleles are the most diagnosed causes of AATD, but the role of the SERPINA1 haplotypes in AAT function remains unknown. SERPINA1 gene was PCR amplified from 94 patients with asthma, using primers with tails for indexing. Sequencing libraries were loaded into a MinION-Mk1C, and MinKNOW was used for basecalling and demultiplexing. Nanofilt and Minimap2 were used for filtering and mapping/alignment. Variant calling/phasing were performed with PEPPER-Margin-DeepVariant. SERPINA1 gene was 100% covered for all samples, with a minimum sequencing depth of 500×. A total of 75 single-nucleotide variants (SNVs) and 4 insertions/deletions were detected, with 45 and 2 of them highly polymorphic (minor allele frequency >0.1), respectively. Nine of the SNVs showed differences in allele frequencies when compared with the overall Spanish population. More than 90% of heterozygous SNVs were phased, yielding 91 and 58 different haplotypes for each SERPINA1 amplified region. Haplotype-based linkage disequilibrium analysis suggests that a recombination hotspot could generate variation in the SERPINA1 gene. The proposed workflow enables haplotype-aware genotyping of the SERPINA1 gene by nanopore sequencing, which will allow the development of novel AATD diagnostic strategies.
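To illustrate the "highly polymorphic" criterion quoted in the abstract (minor allele frequency >0.1), here is a minimal Python sketch that computes the minor allele frequency at one biallelic site from diploid genotypes. The function name and the toy genotypes are hypothetical and are not taken from the study.

```python
from collections import Counter


def minor_allele_frequency(genotypes):
    """Return the minor allele frequency for one biallelic site.

    `genotypes` is a list of (allele, allele) tuples, one per diploid sample.
    Assumption: at most two distinct alleles are present at the site.
    """
    counts = Counter(allele for genotype in genotypes for allele in genotype)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no alleles observed")
    if len(counts) == 1:
        return 0.0  # monomorphic site: no minor allele observed
    return min(counts.values()) / total


# Hypothetical toy data: 7 C/C, 2 C/T and 1 T/T samples (20 alleles in total).
toy_genotypes = [("C", "C")] * 7 + [("C", "T")] * 2 + [("T", "T")]
maf = minor_allele_frequency(toy_genotypes)

# Threshold taken from the abstract: minor allele frequency > 0.1.
highly_polymorphic = maf > 0.1
```

In this toy case the T allele is seen 4 times out of 20, so the site would pass the 0.1 threshold.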
Affiliation(s)
- Mario A González-Carracedo
- Genetics Laboratory, Institute of Tropical Diseases and Public Health of the Canary Islands, Universidad de La Laguna, Tenerife, Spain; Genomics and Health Group, Department of Biochemistry, Microbiology, Cell Biology and Genetics, Universidad de La Laguna, Tenerife, Spain
- Esther Herrera-Luis
- Department of Epidemiology, Bloomberg School of Public Health, Johns Hopkins University, Baltimore, Maryland
- María Marco-Simancas
- Genomics and Health Group, Department of Biochemistry, Microbiology, Cell Biology and Genetics, Universidad de La Laguna, Tenerife, Spain
- Ainhoa Escuela-Escobar
- Genetics Laboratory, Institute of Tropical Diseases and Public Health of the Canary Islands, Universidad de La Laguna, Tenerife, Spain
- Elena Martín-González
- Genomics and Health Group, Department of Biochemistry, Microbiology, Cell Biology and Genetics, Universidad de La Laguna, Tenerife, Spain
- Olaia Sardón-Prado
- Division of Pediatric Respiratory Medicine, Hospital Universitario Donostia, San Sebastián, Spain; Department of Pediatrics, University of the Basque Country, San Sebastián, Spain
- Paula Corcuera
- Division of Pediatric Respiratory Medicine, Hospital Universitario Donostia, San Sebastián, Spain
- Jose M Hernández-Pérez
- Department of Respiratory Medicine, Hospital Universitario de N.S. de Candelaria, Tenerife, Spain
- Fabián Lorenzo-Díaz
- Genetics Laboratory, Institute of Tropical Diseases and Public Health of the Canary Islands, Universidad de La Laguna, Tenerife, Spain; Genomics and Health Group, Department of Biochemistry, Microbiology, Cell Biology and Genetics, Universidad de La Laguna, Tenerife, Spain
- José A Pérez-Pérez
- Genetics Laboratory, Institute of Tropical Diseases and Public Health of the Canary Islands, Universidad de La Laguna, Tenerife, Spain; Genomics and Health Group, Department of Biochemistry, Microbiology, Cell Biology and Genetics, Universidad de La Laguna, Tenerife, Spain.
2
Modesto AAC, de Moraes MR, Valente CMD, Costa MSCR, Leal DFDVB, Pereira EEB, Fernandes MR, Pinheiro JADS, Pantoja KBCC, Moreira FC, Burbano RMR, de Assumpção PP, dos Santos NPC, dos Santos SEB. Association between INDELs in MicroRNAs and Susceptibility to Gastric Cancer in Amazonian Population. Genes (Basel) 2022; 14:genes14010060. [PMID: 36672804 PMCID: PMC9858651 DOI: 10.3390/genes14010060] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/26/2022] [Revised: 11/23/2022] [Accepted: 11/24/2022] [Indexed: 12/28/2022]
Abstract
Gastric cancer (GC) is a multifactorial, complex, and aggressive disease with a prevalence of one million new cases and high global mortality. Factors such as genetic, epigenetic, and environmental changes contribute to the onset and progression of the disease. Recent studies identifying INDELs in miRNAs and their target sites have shown that these variants play an important role in the development of cancer. In GC, miRNAs act as oncogenes or tumor suppressors, favoring important cancer pathways, such as cell proliferation and migration. This work aims to investigate INDELs in the coding region of miRNAs (hsa-miR-302c, hsa-miR-548AJ-2, hsa-miR-4274, hsa-miR-630, hsa-miR-516B-2, hsa-miR-4463, hsa-miR-3945, hsa-miR-548H_4, hsa-miR-920, hsa-miR-3171, and hsa-miR-3652) that may be associated with susceptibility and clinical variants of gastric cancer. For this study, 301 patients with GC and 145 individuals from the control group were selected from an admixed population in the Brazilian Amazon. The results showed that the hsa-miR-4463, hsa-miR-3945, hsa-miR-548H_4, hsa-miR-920, and hsa-miR-3652 variants were associated with gastric cancer susceptibility. The hsa-miR-4463 variant was significantly associated with clinical features of GC such as diffuse gastric tumor histological type, "non-cardia" localization region, and early onset. Our findings indicated that INDELs could be potentially functional genetic variants for gastric cancer risk.
Affiliation(s)
- Antonio A. C. Modesto
- Núcleo de Pesquisas em Oncologia, Universidade Federal do Pará, R. dos Mundurucus 4487, Guamá, Belém 66073-000, Brazil
- Milene R. de Moraes
- Laboratório de Genética Humana e Médica, Instituto de Ciências Biológicas, Universidade Federal do Pará, Belém 66073-000, Brazil
- Cristina M. D. Valente
- Laboratório de Genética Humana e Médica, Instituto de Ciências Biológicas, Universidade Federal do Pará, Belém 66073-000, Brazil
- Marta S. C. R. Costa
- Núcleo de Pesquisas em Oncologia, Universidade Federal do Pará, R. dos Mundurucus 4487, Guamá, Belém 66073-000, Brazil
- Diana F. da V. B. Leal
- Núcleo de Pesquisas em Oncologia, Universidade Federal do Pará, R. dos Mundurucus 4487, Guamá, Belém 66073-000, Brazil
- Esdras E. B. Pereira
- Núcleo de Pesquisas em Oncologia, Universidade Federal do Pará, R. dos Mundurucus 4487, Guamá, Belém 66073-000, Brazil
- Marianne R. Fernandes
- Núcleo de Pesquisas em Oncologia, Universidade Federal do Pará, R. dos Mundurucus 4487, Guamá, Belém 66073-000, Brazil
- Correspondence: ; Tel.: +91-99123-4727
- Jhully A. dos S. Pinheiro
- Laboratório de Genética Humana e Médica, Instituto de Ciências Biológicas, Universidade Federal do Pará, Belém 66073-000, Brazil
- Karla B. C. C. Pantoja
- Núcleo de Pesquisas em Oncologia, Universidade Federal do Pará, R. dos Mundurucus 4487, Guamá, Belém 66073-000, Brazil
- Fabiano C. Moreira
- Núcleo de Pesquisas em Oncologia, Universidade Federal do Pará, R. dos Mundurucus 4487, Guamá, Belém 66073-000, Brazil
- Paulo P. de Assumpção
- Núcleo de Pesquisas em Oncologia, Universidade Federal do Pará, R. dos Mundurucus 4487, Guamá, Belém 66073-000, Brazil
- Ney P. C. dos Santos
- Núcleo de Pesquisas em Oncologia, Universidade Federal do Pará, R. dos Mundurucus 4487, Guamá, Belém 66073-000, Brazil
- Sidney E. B. dos Santos
- Núcleo de Pesquisas em Oncologia, Universidade Federal do Pará, R. dos Mundurucus 4487, Guamá, Belém 66073-000, Brazil
- Laboratório de Genética Humana e Médica, Instituto de Ciências Biológicas, Universidade Federal do Pará, Belém 66073-000, Brazil
3
Craven KE, Fischer CG, Jiang L, Pallavajjala A, Lin MT, Eshleman JR. Optimizing Insertion and Deletion Detection Using Next-Generation Sequencing in the Clinical Laboratory. J Mol Diagn 2022; 24:1217-1231. [PMID: 36162758 PMCID: PMC9808503 DOI: 10.1016/j.jmoldx.2022.08.006] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 01/20/2022] [Revised: 07/18/2022] [Accepted: 08/31/2022] [Indexed: 01/13/2023]
Abstract
Detection of insertions and deletions (InDels) by short-read next-generation sequencing (NGS) technology can be challenging because of frequent misaligned reads. A systematic analysis of short InDels (1 to 30 bases) and fms-related receptor tyrosine kinase 3 (FLT3) internal tandem duplications (ITDs; 6 to 183 bases) from 46 clinical cases of solid or hematologic malignancy processed with a clinical NGS assay identified misaligned reads in every case, ranging from 3% to 100% of reads with the InDel showing mismapped bases. Mismaps also increased with InDel size. As a consequence, the clinical NGS bioinformatics pipeline undercalled the variant allele frequency by 1% to 84%, incorrectly called simultaneous single-base substitutions along with InDels, or did not report an FLT3 ITD that had been detected by capillary electrophoresis. To improve the ability of the pipeline to better detect and quantify InDels, we utilized a software program called Assembly-Based ReAligner (ABRA2) to more accurately remap reads. ABRA2 was able to correct 41% to 100% of the reads with mismapped bases and led to absolute increases in the variant allele frequency from 1% to 61% along with correction of all of the single-base substitutions except for two cases. ABRA2 could also detect multiple FLT3 ITD clones except for one 183-base ITD. Our analysis has found that ABRA2 performs well on short InDels as well as FLT3 ITDs that are <100 bases.
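The variant allele frequency (VAF) figures quoted in the abstract can be read as simple read-count ratios. The Python sketch below shows how an absolute VAF increase after realignment would be computed under that assumption. The function name and the read counts are hypothetical and are not data from the study.

```python
def variant_allele_frequency(variant_reads, total_reads):
    """Return the fraction of reads at a locus that support the variant."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    if not 0 <= variant_reads <= total_reads:
        raise ValueError("variant_reads must lie between 0 and total_reads")
    return variant_reads / total_reads


# Hypothetical counts for one InDel before and after realignment:
# misaligned reads are not counted as supporting the InDel until they are remapped.
vaf_before = variant_allele_frequency(120, 1000)
vaf_after = variant_allele_frequency(450, 1000)

# Absolute increase, expressed in percentage points.
absolute_increase_pct = (vaf_after - vaf_before) * 100
```

With these made-up counts the VAF rises from 12% to 45%, an absolute increase of 33 percentage points.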
Affiliation(s)
- Kelly E Craven
- Department of Pathology, Johns Hopkins University School of Medicine, Baltimore, Maryland
- Catherine G Fischer
- Department of Pathology, Johns Hopkins University School of Medicine, Baltimore, Maryland; Division of Cancer Prevention, National Cancer Institute, Rockville, Maryland
- LiQun Jiang
- Department of Pathology, Johns Hopkins University School of Medicine, Baltimore, Maryland
- Aparna Pallavajjala
- Department of Pathology, Johns Hopkins University School of Medicine, Baltimore, Maryland
- Ming-Tseh Lin
- Department of Pathology, Johns Hopkins University School of Medicine, Baltimore, Maryland
- James R Eshleman
- Department of Pathology, Johns Hopkins University School of Medicine, Baltimore, Maryland; Department of Oncology, Johns Hopkins University School of Medicine, Baltimore, Maryland; The Sol Goldman Pancreatic Cancer Research Center, Johns Hopkins University School of Medicine, Baltimore, Maryland.
4
Quinones-Valdez G, Fu T, Chan TW, Xiao X. scAllele: A versatile tool for the detection and analysis of variants in scRNA-seq. SCIENCE ADVANCES 2022; 8:eabn6398. [PMID: 36054357 DOI: 10.1126/sciadv.abn6398] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/12/2023]
Abstract
Single-cell RNA sequencing (scRNA-seq) data contain rich information at the gene, transcript, and nucleotide levels. Most analyses of scRNA-seq have focused on gene expression profiles, and it remains challenging to extract nucleotide variants and isoform-specific information. Here, we present scAllele, an integrative approach that detects single-nucleotide variants, insertions, deletions, and their allelic linkage with splicing patterns in scRNA-seq. We demonstrate that scAllele achieves better performance in identifying nucleotide variants than other commonly used tools. In addition, the read-specific variant calls by scAllele enable allele-specific splicing analysis, a unique feature not afforded by other methods. Applied to a lung cancer scRNA-seq dataset, scAllele identified variants with strong allelic linkage to alternative splicing, some of which are cancer specific and enriched in cancer-relevant pathways. scAllele represents a versatile tool to uncover multilayer information and previously unidentified biological insights from scRNA-seq data.
Affiliation(s)
- Ting Fu
- Molecular, Cellular, and Integrative Physiology Interdepartmental Program, University of California, Los Angeles, Los Angeles, CA 90095, USA
- Tracey W Chan
- Bioinformatics Interdepartmental Program, University of California, Los Angeles, Los Angeles, CA 90095, USA
- Xinshu Xiao
- Department of Bioengineering, University of California, Los Angeles, Los Angeles, CA 90095, USA
- Molecular, Cellular, and Integrative Physiology Interdepartmental Program, University of California, Los Angeles, Los Angeles, CA 90095, USA
- Bioinformatics Interdepartmental Program, University of California, Los Angeles, Los Angeles, CA 90095, USA
- Department of Integrative Biology and Physiology, University of California, Los Angeles, Los Angeles, CA 90095, USA
- Molecular Biology Institute, University of California, Los Angeles, Los Angeles, CA 90095, USA
- Institute for Quantitative and Computational Biosciences, University of California, Los Angeles, Los Angeles, CA 90095, USA
5
Chen J, Guo JT. Structural and functional analysis of somatic coding and UTR indels in breast and lung cancer genomes. Sci Rep 2021; 11:21178. [PMID: 34707120 PMCID: PMC8551294 DOI: 10.1038/s41598-021-00583-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/15/2021] [Accepted: 10/14/2021] [Indexed: 11/24/2022]
Abstract
Insertions and deletions (Indels) represent one of the major variation types in the human genome and have been implicated in diseases including cancer. To study the features of somatic indels in different cancer genomes, we investigated the indels from two large samples of cancer types: invasive breast carcinoma (BRCA) and lung adenocarcinoma (LUAD). Besides mapping somatic indels in both coding and untranslated regions (UTRs) from the cancer whole exome sequences, we investigated the overlap between these indels and transcription factor binding sites (TFBSs), the key elements for regulation of gene expression that have been found in both coding and non-coding sequences. Compared to the germline indels in healthy genomes, somatic indels contain more coding indels with higher than expected frame-shift (FS) indels in cancer genomes. LUAD has a higher ratio of deletions and higher coding and FS indel rates than BRCA. More importantly, these somatic indels in cancer genomes tend to locate in sequences with important functions, which can affect the core secondary structures of proteins and have a bigger overlap with predicted TFBSs in coding regions than the germline indels. The somatic CDS indels are also enriched in highly conserved nucleotides when compared with germline CDS indels.
Affiliation(s)
- Jing Chen
- Department of Bioinformatics and Genomics, University of North Carolina at Charlotte, Charlotte, NC, 28223, USA
- Jun-Tao Guo
- Department of Bioinformatics and Genomics, University of North Carolina at Charlotte, Charlotte, NC, 28223, USA.