76
Sonon P, Tokplonou L, Sadissou I, M'po KKG, Glitho SSC, Agniwo P, Ibikounlé M, Souza AS, Massaro JD, Gonzalez D, Tchégninougbo T, Ayitchédji A, Massougbodji A, Moreau P, Garcia A, Milet J, Sabbagh A, Mendes-Junior CT, Moutairou KA, Castelli EC, Courtin D, Donadi EA. Human leukocyte antigen (HLA)-F and -G gene polymorphisms and haplotypes are associated with malaria susceptibility in the Beninese Toffin children. INFECTION GENETICS AND EVOLUTION 2021; 92:104828. [PMID: 33781967 DOI: 10.1016/j.meegid.2021.104828] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/28/2020] [Revised: 02/05/2021] [Accepted: 03/24/2021] [Indexed: 11/19/2022]
Abstract
BACKGROUND Little attention has been devoted to the role of the immunoregulatory HLA-E/-F/-G genes in malaria. We evaluated the entire HLA-E/-F/-G variability in Beninese children highly exposed to Plasmodium falciparum (P.f.) malaria. METHODS A total of 154 unrelated children were followed up for six months and evaluated for the presence and number of malaria episodes. HLA-E/-F/-G genes were genotyped using massively parallel sequencing. Anti-P.f. antibodies were evaluated using ELISA. RESULTS Children carrying the G allele at HLA-F (-1499, rs183540921) showed an increased P.f. asymptomatic/symptomatic ratio, suggesting that these children experienced more asymptomatic than symptomatic P.f. episodes. Children carrying the HLA-G-UTR-03 haplotype exhibited an increased risk of symptomatic P.f. episodes and showed a lower IgG2 response against P.f. GLURP-R2 than non-carriers. No associations were observed for the HLA-E gene. CONCLUSION HLA-F associations may be related to the differential expression profiles of the encoded immunomodulatory molecules, and the regulatory sites at the HLA-G 3'UTR may be associated with posttranscriptional regulation of HLA-G and with the host humoral response against P.f.
77
Golubickaite I, Ugenskiene R, Korobeinikova E, Gudaitiene J, Vaitiekus D, Poskiene L, Juozaityte E. The impact of mitochondria-related POLG and TFAM variants on breast cancer pathomorphological characteristics and patient outcomes. Biomarkers 2021; 26:343-353. [PMID: 33715547 DOI: 10.1080/1354750x.2021.1900397] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 01/22/2023]
Abstract
PURPOSE Breast cancer is the most frequent female cancer, with approximately one-third of patients relapsing with distant metastasis. Cancer is usually considered a genetic disease involving mutations in nuclear DNA. However, genes coding for mitochondrial proteins or regulatory molecules are rarely considered. This study aimed to analyse 10 single nucleotide variants in the POLG and TFAM genes and assess their association with tumour phenotype and disease outcome. MATERIALS AND METHODS A total of 234 breast cancer patients were included in this study. Variations were determined with Real-Time PCR using TaqMan® probes. RESULTS We found that patients with POLG rs2307441 TT and CT genotypes had a lower probability of vascular invasion than those with the CC genotype (p = 0.001). Patients with the POLG rs2072267 AG genotype were predisposed to progression compared with the GG genotype (p = 0.015). The TFAM rs3900887 TT genotype was associated with a higher probability of positive oestrogen receptors (p = 0.003) and lymphatic invasion (p = 0.001) in comparison to the AA genotype; patients with the TT genotype were also more likely to have positive lymph nodes (p < 0.001). CONCLUSIONS Our data suggest that variations in the POLG and TFAM genes are important determinants of tumour phenotype and disease outcome in breast cancer patients.
78
Hawari MA, Hong CS, Biesecker LG. SomatoSim: precision simulation of somatic single nucleotide variants. BMC Bioinformatics 2021; 22:109. [PMID: 33676403 PMCID: PMC7936459 DOI: 10.1186/s12859-021-04024-8] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 07/20/2020] [Accepted: 02/14/2021] [Indexed: 12/26/2022] Open
Abstract
BACKGROUND Somatic single nucleotide variants have gained increased attention because of their role in cancer development and the widespread use of high-throughput sequencing techniques. The necessity to accurately identify these variants in sequencing data has led to a proliferation of somatic variant calling tools. Additionally, the use of simulated data to assess the performance of these tools has become common practice, as there is no gold standard dataset for benchmarking performance. However, many existing somatic variant simulation tools are limited because they rely on generating entirely synthetic reads derived from a reference genome or because they do not allow for the precise customizability that would enable a more focused understanding of single nucleotide variant calling performance. RESULTS SomatoSim is a tool that lets users simulate somatic single nucleotide variants in sequence alignment map (SAM/BAM) files with full control of the specific variant positions, number of variants, variant allele fractions, depth of coverage, read quality, and base quality, among other parameters. SomatoSim accomplishes this through a three-stage process: variant selection, where candidate positions are selected for simulation, variant simulation, where reads are selected and mutated, and variant evaluation, where SomatoSim summarizes the simulation results. CONCLUSIONS SomatoSim is a user-friendly tool that offers a high level of customizability for simulating somatic single nucleotide variants. SomatoSim is available at https://github.com/BieseckerLab/SomatoSim .
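As a rough illustration of the variant simulation stage described above, the following sketch mutates a random subset of the bases covering one position so that the resulting variant allele fraction approximates a requested value. It is not SomatoSim's implementation: the function name `simulate_snv`, its parameters, and the representation of reads as a plain list of bases are assumptions made for this example.

```python
import random


def simulate_snv(read_bases, target_vaf, alt_base, seed=0):
    """Mutate a random subset of bases at one position (hypothetical helper).

    read_bases: bases observed at the position, one per covering read.
    target_vaf: requested variant allele fraction, between 0 and 1.
    alt_base:   base to write into the selected reads.
    Assumes at least one covering read and that alt_base is not already present.
    """
    rng = random.Random(seed)
    depth = len(read_bases)
    # Number of reads to mutate so that mutated / depth is close to target_vaf.
    n_mutate = round(depth * target_vaf)
    chosen = set(rng.sample(range(depth), n_mutate))
    mutated = [alt_base if i in chosen else base
               for i, base in enumerate(read_bases)]
    achieved_vaf = mutated.count(alt_base) / depth
    return mutated, achieved_vaf


# Toy pileup: 200 reads carrying the reference base "A".
reads = ["A"] * 200
mutated_reads, achieved = simulate_snv(reads, target_vaf=0.05, alt_base="T")
print(f"achieved VAF: {achieved:.3f}")
```

Because the number of mutated reads is rounded to an integer, the achieved fraction can differ slightly from the requested one at low depth, which is why the sketch returns it alongside the mutated bases.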
79
Zengin T, Önal-Süzek T. Comprehensive Profiling of Genomic and Transcriptomic Differences between Risk Groups of Lung Adenocarcinoma and Lung Squamous Cell Carcinoma. J Pers Med 2021; 11:154. [PMID: 33672117 PMCID: PMC7926392 DOI: 10.3390/jpm11020154] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Received: 12/30/2020] [Revised: 02/11/2021] [Accepted: 02/19/2021] [Indexed: 12/17/2022] Open
Abstract
Lung cancer is the second most frequently diagnosed cancer type and responsible for the highest number of cancer deaths worldwide. Lung adenocarcinoma (LUAD) and lung squamous cell carcinoma (LUSC) are subtypes of non-small-cell lung cancer which has the highest frequency of lung cancer cases. We aimed to analyze genomic and transcriptomic variations including simple nucleotide variations (SNVs), copy number variations (CNVs) and differential expressed genes (DEGs) in order to find key genes and pathways for diagnostic and prognostic prediction for lung adenocarcinoma and lung squamous cell carcinoma. We performed a univariate Cox model and then lasso-regularized Cox model with leave-one-out cross-validation using The Cancer Genome Atlas (TCGA) gene expression data in tumor samples. We generated 35- and 33-gene signatures for prognostic risk prediction based on the overall survival time of the patients with LUAD and LUSC, respectively. When we clustered patients into high- and low-risk groups, the survival analysis showed highly significant results with high prediction power for both training and test datasets. Then, we characterized the differences including significant SNVs, CNVs, DEGs, active subnetworks, and the pathways. We described the results for the risk groups and cancer subtypes separately to identify specific genomic alterations between both high-risk groups and cancer subtypes. Both LUAD and LUSC high-risk groups have more downregulated immune pathways and upregulated metabolic pathways. On the other hand, low-risk groups have both up- and downregulated genes on cancer-related pathways. Both LUAD and LUSC have important gene alterations such as CDKN2A and CDKN2B deletions with different frequencies. SOX2 amplification occurs in LUSC and PSMD4 amplification in LUAD. EGFR and KRAS mutations are mutually exclusive in LUAD samples. EGFR, MGA, SMARCA4, ATM, RBM10, and KDM5C genes are mutated only in LUAD but not in LUSC. 
CDKN2A, PTEN, and HRAS genes are mutated only in LUSC samples. The low-risk groups of both LUAD and LUSC tend to have a higher number of SNVs, CNVs, and DEGs. The signature genes and altered genes have the potential to be used as diagnostic and prognostic biomarkers for personalized oncology.
80
Detecting Causal Variants in Mendelian Disorders Using Whole-Genome Sequencing. Methods Mol Biol 2021; 2243:1-25. [PMID: 33606250 DOI: 10.1007/978-1-0716-1103-6_1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/23/2023]
Abstract
Increasingly affordable sequencing technologies are revolutionizing the field of genomic medicine. It is now feasible to interrogate all major classes of variation in an individual across the entire genome for less than $1000 USD. While the generation of patient sequence information using these technologies has become routine, the analysis and interpretation of these data remain the greatest obstacle to widespread clinical implementation. This chapter summarizes the steps to identify, annotate, and prioritize variant information required for clinical report generation. We discuss methods to detect each variant class and describe strategies to increase the likelihood of detecting causal variant(s) in Mendelian disease. Lastly, we describe a sample workflow for synthesizing large amounts of genetic information into concise clinical reports.
81
Gao Y, Qiao H, Pan V, Wang Z, Li J, Wei Y, Ke Y, Qi H. Accurate genotyping of fragmented DNA using a toehold assisted padlock probe. Biosens Bioelectron 2021; 179:113079. [PMID: 33636500 DOI: 10.1016/j.bios.2021.113079] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 11/20/2020] [Revised: 01/25/2021] [Accepted: 02/03/2021] [Indexed: 11/15/2022]
Abstract
Fragmented DNA from blood plasma, i.e., cell-free DNA, has received great interest as a noninvasive diagnostic biomarker for "point-of-care" testing or liquid biopsy. Here, we present a new approach for accurate genotyping of highly fragmented DNA. Based on toehold-mediated strand displacement, a toehold-assisted padlock probe and toehold blocker were designed and demonstrated with new controllability in significantly suppressing undesired cross-reaction, promoting target recycling and point mutation detection by tuning the thermodynamic properties. Furthermore, toehold-assisted padlock probe systems were elaborately designed for 14 different single-nucleotide variants (SNVs) and were demonstrated to be able to detect low concentration of variant alleles (0.1%). In addition, a target, spanning a narrow sequence window of 29 nucleotides on average is sufficient for the toehold-assisted padlock probe system, which is valuable for the analysis of highly fragmented DNA molecules from clinical samples. We further demonstrated that the toehold-assisted padlock probe, in combination with a unique asymmetric PCR technique, could detect more target SNVs at low allele fractions (1%) in highly fragmented cfDNA. This allows accurate genotyping and provides a new commercial approach for high-resolution analysis of genetic variation.
82
Fortini BK, Tring S, Devall MA, Ali MW, Plummer SJ, Casey G. SNPs associated with colorectal cancer at 15q13.3 affect risk enhancers that modulate GREM1 gene expression. Hum Mutat 2021; 42:237-245. [PMID: 33476087 PMCID: PMC7898835 DOI: 10.1002/humu.24166] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 02/05/2020] [Revised: 08/12/2020] [Accepted: 11/03/2020] [Indexed: 12/21/2022]
Abstract
Several genome wide association studies of colorectal cancer (CRC) have identified single nucleotide polymorphisms (SNPs) on chromosome 15q13.3 associated with CRC risk. To identify functional variant(s) underlying this association, we investigated SNPs in linkage disequilibrium with the risk‐associated SNP rs4779584 that overlapped regulatory regions/enhancer elements characterized in colon‐related tissues and cells. We identified several SNP‐containing regulatory regions that exhibited enhancer activity in vitro, including one SNP (rs1406389) that correlated with allele‐specific effects on enhancer activity. Deletion of either this enhancer or another enhancer that had previously been reported in this region correlated with decreased expression of GREM1 following CRISPR/Cas9 genome editing. That GREM1 is one target of these enhancers was further supported by an expression quantitative trait loci correlation between rs1406389 and GREM1 expression in the transverse but not sigmoid colon in the Genotype‐Tissue Expression dataset. Taken together, we conclude that the 15q13.3 region contains at least two functional variants that map to distinct enhancers and impact CRC risk through modulation of GREM1 expression.
83
Liu H, Prashant NM, Spurr LF, Bousounis P, Alomran N, Ibeawuchi H, Sein J, Słowiński P, Tsaneva-Atanasova K, Horvath A. scReQTL: an approach to correlate SNVs to gene expression from individual scRNA-seq datasets. BMC Genomics 2021; 22:40. [PMID: 33419390 PMCID: PMC7791999 DOI: 10.1186/s12864-020-07334-y] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 08/23/2020] [Accepted: 12/16/2020] [Indexed: 12/13/2022] Open
Abstract
BACKGROUND Recently, pioneering expression quantitative trait loci (eQTL) studies on single cell RNA sequencing (scRNA-seq) data have revealed new and cell-specific regulatory single nucleotide variants (SNVs). Here, we present an alternative QTL-related approach applicable to transcribed SNV loci from scRNA-seq data: scReQTL. ScReQTL uses Variant Allele Fraction (VAF_RNA) at expressed biallelic loci, and correlates it to gene expression from the corresponding cell. RESULTS Our approach exploits the fact that, when estimated from multiple cells, VAF_RNA can be used to assess effects of SNVs in a single sample or individual. In this setting scReQTL operates in the context of identical genotypes, where it is likely to capture RNA-mediated genetic interactions with cell-specific and transient effects. Applying scReQTL on scRNA-seq data generated on the 10x Genomics Chromium platform using 26,640 mesenchymal cells derived from adipose tissue obtained from three healthy female donors, we identified 1272 unique scReQTLs. ScReQTLs common between individuals or cell types were consistent in terms of the directionality of the relationship and the effect size. Comparative assessment with eQTLs from bulk sequencing data showed that scReQTL analysis identifies a distinct set of SNV-gene correlations that are substantially enriched in known gene-gene interactions and significant genome-wide association studies (GWAS) loci. CONCLUSION ScReQTL is relevant to the rapidly growing source of scRNA-seq data and can be applied to outline SNVs potentially contributing to cell type-specific and/or dynamic genetic interactions from an individual scRNA-seq dataset. AVAILABILITY https://github.com/HorvathLab/NGS/tree/master/scReQTL.
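The core idea above, computing a per-cell variant allele fraction from RNA reads and relating it to expression across cells, can be sketched in a few lines. This is an illustration only and not the scReQTL code: the abstract does not name the statistic used, so Spearman rank correlation is an assumed stand-in here, and the per-cell read counts and expression values are made up.

```python
def vaf_rna(n_var, n_ref):
    """Variant allele fraction at one locus in one cell: variant / total reads."""
    return n_var / (n_var + n_ref)


def ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation; assumes neither input is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sum((a - mean_x) ** 2 for a in x) ** 0.5
    sd_y = sum((b - mean_y) ** 2 for b in y) ** 0.5
    return cov / (sd_x * sd_y)


def spearman(x, y):
    """Spearman correlation is the Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Hypothetical data: (variant reads, reference reads) per cell at one SNV locus,
# and the expression of one gene in the same four cells.
allele_counts = [(1, 9), (3, 7), (5, 5), (8, 2)]
expression = [2.0, 3.5, 4.1, 9.0]

vafs = [vaf_rna(v, r) for v, r in allele_counts]
rho = spearman(vafs, expression)
print(f"rho = {rho:.2f}")
```

In a real analysis each SNV-gene pair would be tested across many cells and the resulting p-values corrected for multiple testing; the sketch omits both steps.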
84
Gomes da Silva IIF, Lima CAD, Silva JEA, Rushansky E, Mariano MHQA, Rolim P, Oliveira RDR, Louzada-Júnior P, Souto FO, Crovella S, de Azevêdo Silva J, Sandrin-Garcia P. Is there an Inflammation Role for MYD88 in Rheumatoid Arthritis? Inflammation 2021; 44:1014-1022. [PMID: 33405020 DOI: 10.1007/s10753-020-01397-5] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 08/29/2020] [Revised: 10/27/2020] [Accepted: 12/07/2020] [Indexed: 10/22/2022]
Abstract
Rheumatoid arthritis (RA) is an autoimmune and inflammatory disease with strong genetic influence, especially upon immune response components. Several cytokines from the toll-like receptor activation pathway have a recognized role in RA establishment. However, few studies have verified the role of key mediators such as the MYD88 gene and its genetic variants. In the present study, we aimed to evaluate the role of the rs6853 functional single-nucleotide variation (SNV) in RA etiopathogenesis and clinical severity status, and its impact on MYD88 mRNA levels and IL-1β protein levels. For the association study, a total of 423 RA patients and 346 healthy individuals, enrolled as controls, from Northeast and Southeast Brazil were genotyped using a specific TaqMan probe. For the gene expression assays, we performed a MYD88 rs6853 genotype-guided monocyte cell culture divided into non-stimulated and lipopolysaccharide (LPS)-stimulated cells from healthy individuals. MYD88 gene expression was measured using specific primers, while IL-1β levels were evaluated by ELISA. We observed that the A allele and the AA genotype were associated with an increased risk of RA development (OR = 1.60; 95% CI 1.24-2.08; p = 0.0004/OR = 2.83; 95% CI 1.25-6.41; p = 0.0152). The AA genotype exhibited lower MYD88 mRNA levels than the GG genotype in non-stimulated monocyte cell culture (FC -3.83; p = 0.003). Additionally, we verified an increase in IL-1β levels when AA genotype non-stimulated monocytes were compared to AA genotype LPS-stimulated monocytes (p = 0.021). In summary, the MYD88 rs6853 polymorphism was associated with RA development in our Brazilian cohort and influenced MYD88 mRNA levels and IL-1β production.
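For reference, an odds ratio and a 95% confidence interval of the kind reported above can be computed from a 2x2 table of allele counts. The sketch below uses the Wald interval on the log odds ratio, which is an assumption (the abstract does not state how the intervals were obtained), and the counts are hypothetical, not the study's data.

```python
import math


def odds_ratio_wald_ci(case_exposed, case_unexposed,
                       control_exposed, control_unexposed, z=1.96):
    """Odds ratio with a Wald confidence interval (z = 1.96 gives about 95%).

    All four cell counts must be greater than zero.
    """
    odds_ratio = (case_exposed * control_unexposed) / (
        case_unexposed * control_exposed)
    # Standard error of ln(OR) from the four cell counts.
    se_log_or = math.sqrt(1 / case_exposed + 1 / case_unexposed
                          + 1 / control_exposed + 1 / control_unexposed)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts: carriers and non-carriers of a risk allele
# among cases and among controls.
or_value, ci_low, ci_high = odds_ratio_wald_ci(20, 10, 10, 20)
print(f"OR = {or_value:.2f}; 95% CI {ci_low:.2f}-{ci_high:.2f}")
```

An interval that excludes 1, as both intervals in the abstract do, corresponds to an association that is significant at roughly the 5% level under this approximation.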
85
Abstract
Applying next generation sequencing techniques to liquid biopsy, and to urine in particular, requires specific bioinformatics methods that deal with the peculiarities of this sample type. Many aspects of cancer can be explored starting from nucleic acids, especially cell-free DNA and circulating tumor DNA. It is possible to detect small mutations, such as single nucleotide variants and small insertions and deletions, as well as copy-number alterations and epigenetic profiles. Because circulating tumor DNA makes up only a small fraction of total cell-free DNA, dedicated methods have been developed. One of them is the application of unique barcodes to each DNA fragment in order to lower the limit of detection of cancer-related variants. Some bioinformatics workflows and tools are the same as in a classic analysis of tumor tissue, but some steps require specific algorithms.
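The barcode strategy mentioned above is usually applied by grouping reads that share a barcode and collapsing each group into one consensus call, so that isolated sequencing errors are outvoted. The following is a minimal sketch under stated assumptions: reads are reduced to (barcode, position, base) tuples, and the function name `umi_consensus` and both thresholds are hypothetical choices for illustration, not taken from any specific tool.

```python
from collections import Counter, defaultdict


def umi_consensus(reads, min_family_size=2, min_agreement=0.7):
    """Collapse reads sharing a barcode and position into one consensus base.

    reads: iterable of (barcode, position, base) tuples.
    Families smaller than min_family_size, or whose most common base has
    a share below min_agreement, are discarded.
    """
    families = defaultdict(list)
    for barcode, position, base in reads:
        families[(barcode, position)].append(base)

    consensus = {}
    for key, bases in families.items():
        if len(bases) < min_family_size:
            continue  # a single read cannot be error-corrected
        base, count = Counter(bases).most_common(1)[0]
        if count / len(bases) >= min_agreement:
            consensus[key] = base
    return consensus


# Toy input: three barcode families covering position 100.
example_reads = [
    ("AAC", 100, "T"), ("AAC", 100, "T"), ("AAC", 100, "T"), ("AAC", 100, "G"),
    ("GGT", 100, "C"),
    ("CCA", 100, "A"), ("CCA", 100, "C"),
]
calls = umi_consensus(example_reads)
print(calls)
```

Only the first family yields a call here: the second has a single read, and the third is split evenly between two bases.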
86
Zarubin A, Stepanov V, Markov A, Kolesnikov N, Marusin A, Khitrinskaya I, Swarovskaya M, Litvinov S, Ekomasova N, Dzhaubermezov M, Maksimova N, Sukhomyasova A, Shtygasheva O, Khusnutdinova E, Radzhabov M, Kharkov V. Structural Variability, Expression Profile, and Pharmacogenetic Properties of TMPRSS2 Gene as a Potential Target for COVID-19 Therapy. Genes (Basel) 2020; 12:genes12010019. [PMID: 33375616 PMCID: PMC7823984 DOI: 10.3390/genes12010019] [Citation(s) in RCA: 22] [Impact Index Per Article: 5.5] [Received: 11/16/2020] [Revised: 12/18/2020] [Accepted: 12/21/2020] [Indexed: 02/07/2023] Open
Abstract
The human transmembrane protease serine 2 (TMPRSS2) is involved in the priming of proteins of the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) and represents a possible target for COVID-19 therapy. The TMPRSS2 gene may be co-expressed with SARS-CoV-2 cell receptor genes angiotensin-converting enzyme 2 (ACE2) and Basigin (BSG), but only TMPRSS2 demonstrates tissue-specific expression in alveolar cells according to single-cell RNA sequencing data. Our analysis of the structural variability of the TMPRSS2 gene based on genome-wide data from 76 human populations demonstrates that a functionally significant missense mutation in exon 6/7 in the TMPRSS2 gene is found in many human populations at relatively high frequencies, with region-specific distribution patterns. The frequency of the missense mutation encoded by rs12329760, which has previously been found to be associated with prostate cancer, ranged between 10% and 63% and was significantly higher in populations of Asian origin compared with European populations. In addition to single-nucleotide polymorphisms, two copy number variants were detected in the TMPRSS2 gene. A number of microRNAs have been predicted to regulate TMPRSS2 and BSG expression levels, but none of them is enriched in lung or respiratory tract cells. Several well-studied drugs can downregulate the expression of TMPRSS2 in human cells, including acetaminophen (paracetamol) and curcumin. Thus, the interactions of TMPRSS2 with SARS-CoV-2, together with its structural variability, gene–gene interactions, expression regulation profiles, and pharmacogenomic properties, characterize this gene as a potential target for COVID-19 therapy.
87
O'Sullivan B, Seoighe C. vcfView: An Extensible Data Visualization and Quality Assurance Platform for Integrated Somatic Variant Analysis. Cancer Inform 2020; 19:1176935120972377. [PMID: 33239857 PMCID: PMC7672756 DOI: 10.1177/1176935120972377] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/27/2020] [Accepted: 10/19/2020] [Indexed: 11/21/2022] Open
Abstract
Motivation: Somatic mutations can have critical prognostic and therapeutic implications for cancer patients. Although targeted methods are often used to assay specific cancer driver mutations, high throughput sequencing is frequently applied to discover novel driver mutations and to determine the status of less-frequent driver mutations. The task of recovering somatic mutations from these data is nontrivial as somatic mutations must be distinguished from germline variants, sequencing errors, and other artefacts. Consequently, bioinformatics pipelines for recovery of somatic mutations from high throughput sequencing typically involve a large number of analytical choices in the form of quality filters. Results: We present vcfView, an interactive tool designed to support the evaluation of somatic mutation calls from cancer sequencing data. The tool takes as input a single variant call format (VCF) file and enables researchers to explore the impacts of analytical choices on the mutant allele frequency spectrum, on mutational signatures and on annotated somatic variants in genes of interest. It allows variants that have failed variant caller filters to be re-examined to improve sensitivity or guide the design of future experiments. It is extensible, allowing other algorithms to be incorporated easily. Availability: The shiny application can be downloaded from GitHub (https://github.com/BrianOSullivanGit/vcfView). All data processing is performed within R to ensure platform independence. The app has been tested on RStudio, version 1.1.456, with base R 3.6.2 and Shiny 1.4.0. A vignette based on a publicly available data set is also available on GitHub.
88
Almazni I, Stapley RJ, Khan AO, Morgan NV. A comprehensive bioinformatic analysis of 126 patients with an inherited platelet disorder to identify both sequence and copy number genetic variants. Hum Mutat 2020; 41:1848-1865. [PMID: 32935436 DOI: 10.1002/humu.24114] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 06/16/2020] [Revised: 07/28/2020] [Accepted: 09/04/2020] [Indexed: 12/25/2022]
Abstract
Inherited bleeding disorders (IBDs) comprise an extremely heterogeneous group of diseases that reflect abnormalities of blood vessels, coagulation proteins, and platelets. The UK-GAPP study previously used whole-exome sequencing in combination with deep platelet phenotyping to identify pathogenic genetic variants in both known and novel genes in approximately 40% of the patients. To interrogate the remaining "unknown" cohort and improve this detection rate, we employed an IBD-specific gene panel of 119 genes using the Congenica Clinical Interpretation Platform to detect both single-nucleotide variants and copy number variants in 126 patients. In total, 135 different heterozygous variants in genes implicated in bleeding disorders were identified. Of these, 22 were classified as pathogenic, 26 as likely pathogenic, and the remainder as of uncertain significance. There were marked differences in the number of reported variants in individuals between the four patient groups: platelet count (35), platelet function (43), combined platelet count and function (59), and normal count (17). Additionally, we report three novel copy number variations (CNVs) not previously detected. We show that a combined single-nucleotide variation (SNV)/CNV analysis using the Congenica platform improves detection rates for IBDs, suggesting that such an approach can be applied to other genetic disorders where there is a high degree of heterogeneity.
89
Moradifard S, Hoseinbeyki M, Emam MM, Parchiniparchin F, Ebrahimi-Rad M. Association of the Sp1 binding site and -1997 promoter variations in COL1A1 with osteoporosis risk: The application of meta-analysis and bioinformatics approaches offers a new perspective for future research. MUTATION RESEARCH. REVIEWS IN MUTATION RESEARCH 2020; 786:108339. [PMID: 33339581 DOI: 10.1016/j.mrrev.2020.108339] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 10/22/2019] [Revised: 08/11/2020] [Accepted: 10/06/2020] [Indexed: 12/21/2022]
Abstract
As a complex disease, osteoporosis is influenced by several genetic markers. Many studies have examined the link between the Sp1 binding site +1245 G > T (rs1800012) and -1997 G > T (rs1107946) variations in the COL1A1 gene and osteoporosis risk. However, the findings of these studies have been contradictory; therefore, we performed a meta-analysis to aggregate additional information and obtain increased statistical power to more efficiently estimate this correlation. A meta-analysis was conducted with studies published between 1991 and 2020 that were identified by a systematic electronic search of the Scopus and Clarivate Analytics databases. Studies with bone mineral density (BMD) data and complete genotypes of the single-nucleotide variations (SNVs) for the overall and postmenopausal female population were included in this meta-analysis and analyzed using the R metafor package. A relationship between rs1800012 and significantly decreased BMD values at the lumbar spine and femoral neck was found in individuals carrying the "ss" versus the "SS" genotype in the overall population according to a random effects model (p < 0.0001). Similar results were also found in the postmenopausal female population (p = 0.003 and 0.0002, respectively). Such findings might be an indication of increased osteoporosis risk in both studied groups in individuals with the "ss" genotype. Although no association was identified between the -1997 G > T and low BMD in the overall population, those individuals with the "GT" genotype showed a higher level of BMD than those with "GG" in the subgroup analysis (p = 0.007). To determine which transcription factor (TF) might bind to the -1997 G > T in COL1A1, 45 TFs were identified based on bioinformatics predictions. According to the GSE35958 microarray dataset, 16 of 45 TFs showed differential expression profiles in osteoporotic human mesenchymal stem cells relative to normal samples from elderly donors. 
By identifying candidate TFs for the -1997 G > T site, our study offers a new perspective for future research.
90
Zengin T, Önal-Süzek T. Analysis of genomic and transcriptomic variations as prognostic signature for lung adenocarcinoma. BMC Bioinformatics 2020; 21:368. [PMID: 32998690 PMCID: PMC7526001 DOI: 10.1186/s12859-020-03691-3] [Citation(s) in RCA: 30] [Impact Index Per Article: 7.5] [Indexed: 12/26/2022] Open
Abstract
Background Lung cancer is the leading cause of cancer death worldwide, and lung adenocarcinoma is the most common form of lung cancer. In order to understand the molecular basis of lung adenocarcinoma, integrative analyses have been performed using genomics, transcriptomics, epigenomics, proteomics and clinical data. In addition, molecular prognostic signatures have been generated for lung adenocarcinoma using gene expression levels in tumor samples. However, we need signatures that include different types of molecular data, and even cohort- or patient-based biomarkers that are candidates for molecular targeting. Results We built an R pipeline to carry out an integrated meta-analysis of genomic alterations, including single-nucleotide variations and copy number variations, transcriptomic variations from RNA-seq, and clinical data of patients with lung adenocarcinoma in The Cancer Genome Atlas project. We integrated significant genes, including those with single-nucleotide variations or copy number variations, differentially expressed genes and those in active subnetworks, to construct a prognosis signature. A Cox proportional hazards model with Lasso penalty and LOOCV was used to identify the best gene signature among different gene categories. We determined a 12-gene signature (BCHE, CCNA1, CYP24A1, DEPTOR, MASP2, MGLL, MYO1A, PODXL2, RAPGEF3, SGK2, TNNI2, ZBTB16) for prognostic risk prediction based on the overall survival time of the patients with lung adenocarcinoma. The patients in both training and test data were clustered into high-risk and low-risk groups using risk scores calculated from the selected gene signature. The overall survival probability of these risk groups was highly significantly different for both training and test datasets. 
Conclusions This 12-gene signature could predict the prognostic risk of the patients with lung adenocarcinoma in TCGA and they are potential predictors for the survival-based risk clustering of the patients with lung adenocarcinoma. These genes can be used to cluster patients based on molecular nature and the best candidates of drugs for the patient clusters can be proposed. These genes also have a high potential for targeted cancer therapy of patients with lung adenocarcinoma.
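To make the risk-score step concrete: once a penalized Cox model has selected genes and estimated their coefficients, a patient's risk score is typically the coefficient-weighted sum of that patient's expression values, and patients are then split into groups by a cutoff. The sketch below assumes a median cutoff, which the abstract does not specify. The two gene names come from the signature above, but the coefficients and expression values are invented for illustration and are not the published model.

```python
from statistics import median


def risk_score(expression, coefficients):
    """Linear predictor of a Cox model: sum of coefficient * expression."""
    return sum(coefficients[gene] * expression[gene] for gene in coefficients)


def split_by_median(scores):
    """Label patients 'high' if their score exceeds the cohort median, else 'low'."""
    cutoff = median(scores.values())
    return {patient: ("high" if score > cutoff else "low")
            for patient, score in scores.items()}


# Hypothetical coefficients for two of the twelve signature genes.
coefficients = {"BCHE": -0.2, "CCNA1": 0.3}

# Hypothetical normalized expression values for three patients.
patients = {
    "p1": {"BCHE": 1.0, "CCNA1": 2.0},
    "p2": {"BCHE": 3.0, "CCNA1": 1.0},
    "p3": {"BCHE": 0.0, "CCNA1": 3.0},
}

scores = {pid: risk_score(expr, coefficients) for pid, expr in patients.items()}
groups = split_by_median(scores)
print(groups)
```

The survival difference between the resulting groups would then be assessed separately, for example with Kaplan-Meier curves and a log-rank test, which this sketch leaves out.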
91
Cheung KM, Abendroth JM, Nakatsuka N, Zhu B, Yang Y, Andrews AM, Weiss PS. Detecting DNA and RNA and Differentiating Single-Nucleotide Variations via Field-Effect Transistors. NANO LETTERS 2020; 20:5982-5990. [PMID: 32706969 PMCID: PMC7439785 DOI: 10.1021/acs.nanolett.0c01971] [Citation(s) in RCA: 20] [Impact Index Per Article: 5.0] [Indexed: 05/08/2023]
Abstract
We detect short oligonucleotides and distinguish between sequences that differ by a single base, using label-free, electronic field-effect transistors (FETs). Our sensing platform utilizes ultrathin-film indium oxide FETs chemically functionalized with single-stranded DNA (ssDNA). The ssDNA-functionalized semiconducting channels in FETs detect fully complementary DNA sequences and differentiate these sequences from those having different types and locations of single base-pair mismatches. Changes in charge associated with surface-bound ssDNA vs double-stranded DNA (dsDNA) alter FET channel conductance to enable detection due to differences in DNA duplex stability. We illustrate the capability of ssDNA-FETs to detect complementary RNA sequences and to distinguish them from RNA sequences with single nucleotide variations. The development and implementation of electronic biosensors that rapidly and sensitively detect and differentiate oligonucleotides present new opportunities in the fields of disease diagnostics and precision medicine.
|
92
|
Field MA. Detecting pathogenic variants in autoimmune diseases using high-throughput sequencing. Immunol Cell Biol 2020; 99:146-156. [PMID: 32623783 PMCID: PMC7891608 DOI: 10.1111/imcb.12372] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 04/20/2020] [Revised: 06/22/2020] [Accepted: 07/02/2020] [Indexed: 12/12/2022]
Abstract
Sequencing the first human genome in 2003 took 15 years and cost $2.7 billion. Advances in sequencing technologies have since decreased costs to the point where it is now feasible to resequence a whole human genome for $1000 in a single day. These advances have allowed the generation of huge volumes of high‐quality human sequence data used to construct increasingly large catalogs of both population‐level and disease‐causing variation. The existence of such databases, coupled with a high‐quality human reference genome, means we are able to interrogate and annotate all types of genetic variation and identify pathogenic variants for many diseases. Increasingly, sequencing‐based approaches are being used to elucidate the underlying genetic cause of autoimmune diseases, a group of roughly 80 polygenic diseases characterized by abnormal immune responses where healthy tissue is attacked. Although sequence data generation has become routine and affordable, significant challenges remain with no gold‐standard methodology to identify pathogenic variants currently available. This review examines the latest methodologies used to identify pathogenic variants in autoimmune diseases and considers available sequencing options and subsequent bioinformatic methodologies and strategies. The development of reliable and robust sequencing and analytic workflows to detect pathogenic variants is critical to realize the potential of precision medicine programs where patient variant information is used to inform clinical practice.
|
93
|
Bellanné-Chantelot C, Rabadan Moraes G, Schmaltz-Panneau B, Marty C, Vainchenker W, Plo I. Germline genetic factors in the pathogenesis of myeloproliferative neoplasms. Blood Rev 2020; 42:100710. [PMID: 32532454 DOI: 10.1016/j.blre.2020.100710] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Received: 05/07/2019] [Revised: 04/08/2020] [Accepted: 05/05/2020] [Indexed: 02/06/2023]
Abstract
Myeloproliferative neoplasms (MPN) are clonal hematological malignancies that lead to overproduction of mature myeloid cells. They are due to acquired mutations in the genes encoding JAK2, MPL and CALR that result in activation of the cytokine receptor/JAK2 signaling pathway. In addition, there exist germline variants that can favor the initiation of the disease or affect its phenotype. First, they can be common risk alleles, which correspond to frequent single nucleotide variants that are present in the control population and contribute to the development of either sporadic or familial MPN. Second, some variants predispose to the onset of MPN with a higher penetrance and lead to familial clustering of MPN. Finally, some extremely rare genetic variants can induce MPN-like hereditary disease. We review these different subtypes of germline genetic variants and discuss how they impact the initiation and/or development of MPN.
|
94
|
Batista FM, Stapleton T, Lowther JA, Fonseca VG, Shaw R, Pond C, Walker DI, van Aerle R, Martinez-Urtaza J. Whole Genome Sequencing of Hepatitis A Virus Using a PCR-Free Single-Molecule Nanopore Sequencing Approach. Front Microbiol 2020; 11:874. [PMID: 32523561 PMCID: PMC7261825 DOI: 10.3389/fmicb.2020.00874] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 11/03/2019] [Accepted: 04/14/2020] [Indexed: 12/18/2022]
Abstract
Hepatitis A virus (HAV) is one of the most common causes of acute viral hepatitis in humans. Although HAV has a relatively small genome, several factors limit whole genome sequencing, such as PCR amplification artefacts and ambiguities in de novo assembly. The recently developed Oxford Nanopore Technologies (ONT) platform allows single-molecule sequencing of long fragments of DNA or RNA using PCR-free strategies. We have sequenced the whole genome of HAV using a PCR-free approach by direct reverse-transcribed sequencing. We were able to sequence HAV cDNA and obtain reads over 7 kilobases in length containing almost the whole genome of the virus. Comparison of these raw long nanopore reads with the HAV reference wild type revealed a nucleotide sequence identity between 81.1% and 96.6%. By de novo assembly of all HAV reads we obtained a consensus sequence of 7362 bases, with a nucleotide sequence identity of 99.0% with the genome of the HAV strain pHM175/18f. When the assembly was performed using the HAV strain pHM175/18f as reference, a consensus with a sequence similarity of 99.8% was obtained. We also used an ONT amplicon-based assay to sequence two fragments of the VP3 and VP1 regions, which showed a sequence similarity of 100% with the matching regions of the consensus sequence obtained using the direct cDNA sequencing approach. This study showed the applicability of ONT sequencing technologies to obtain the whole genome of HAV by direct cDNA nanopore sequencing, highlighting the utility of this PCR-free approach for characterizing HAV and potentially other viruses of the Picornaviridae family.
|
95
|
Fleming JR, Rigden DJ, Mayans O. The importance of chain context in assessing small nucleotide variants in titin: in silico case study of the I10-I11 tandem and its arrhythmic right ventricular cardiomyopathy linked position T2580. J Biomol Struct Dyn 2020; 39:3480-3490. [PMID: 32396765 DOI: 10.1080/07391102.2020.1768148] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Indexed: 02/06/2023]
Abstract
Non-synonymous small nucleotide variations (nsSNVs) in the giant muscle protein, titin, have key roles in the development of several myopathologies. Although there is considerable motive to screen at-risk individuals for nsSNVs, to identify patients in early disease stages while therapeutic intervention is still possible, the clinical significance of most titin variations remains unclear. Therefore, there is a growing need to establish methods to classify nsSNVs in a simple, economic and rapid manner. Due to its strong correlation to arrhythmogenic right ventricular cardiomyopathy (ARVC), one particular mutation in titin, T2580I, located in the I10 immunoglobulin domain, has received considerable attention. Here, we use the I10-I11 tandem as a case study to explore the possible benefits of considering the titin chain context (i.e. domain interfaces) in the assessment of titin nsSNVs. Specifically, we investigate which exchanges mimic the conformational molecular phenotype of the T2580I mutation at the I10-I11 domain interface. We then compute a residue stability landscape for the domains alone and in tandem to define a Domain Interface Score (DIS), which identifies several hotspot residues. Our findings suggest that the T2580 position is highly sensitive to exchange and that any variant found in this position should be considered with care. Furthermore, we conclude that the consideration of the higher order structure of the titin chain is important to gain accurate insights into the vulnerability of positions in linker regions and that titin nsSNV prediction benefits from a contextual analysis. Communicated by Ramaswamy H. Sarma.
|
96
|
Guengerich FP. Cytochrome P450 2E1 and its roles in disease. Chem Biol Interact 2020; 322:109056. [PMID: 32198084 PMCID: PMC7217708 DOI: 10.1016/j.cbi.2020.109056] [Citation(s) in RCA: 53] [Impact Index Per Article: 13.3] [Received: 07/01/2019] [Revised: 12/12/2019] [Accepted: 03/10/2020] [Indexed: 12/27/2022]
Abstract
Cytochrome P450 (P450) 2E1 is the major P450 enzyme involved in ethanol metabolism. That role is shared with two other enzymes that oxidize ethanol, alcohol dehydrogenase and catalase. P450 2E1 is also involved in the bioactivation of a number of low molecular weight cancer suspects, as validated in vivo in mouse models where cancers could be attenuated by deletion of Cyp2e1. P450 2E1 does not have a role in global production of reactive oxygen species but localized roles are possible, e.g. in mitochondria. The structures, conformations, and catalytic mechanisms of P450 2E1 have some unusual features among P450s. The concentration of hepatic P450 varies ≥10-fold among humans, possibly in part due to single nucleotide variants. The level of P450 2E1 may have relevance in the rates of oxidation of drugs, particularly acetaminophen and anesthetics.
|
97
|
Jasinska AJ. Resources for functional genomic studies of health and development in nonhuman primates. AMERICAN JOURNAL OF PHYSICAL ANTHROPOLOGY 2020; 171 Suppl 70:174-194. [PMID: 32221967 DOI: 10.1002/ajpa.24051] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 09/01/2019] [Revised: 01/22/2020] [Accepted: 02/26/2020] [Indexed: 01/01/2023]
Abstract
Primates display a wide range of phenotypic variation underlain by complex, genetically regulated mechanisms. The links among DNA sequence, gene function, and phenotype have been of interest from an evolutionary perspective, to understand functional genome evolution and its phenotypic consequences, and from a biomedical perspective, to understand the shared and human-specific roots of health and disease. Progress in methods for characterizing genetic, transcriptomic, and DNA methylation (DNAm) variation is driving the rapid development of extensive omics resources, which are now increasingly available from humans as well as a growing number of nonhuman primates (NHPs). The fast growth of large-scale genomic data is driving the emergence of integrated tools and databases, thus facilitating studies of gene functionality across primates. This review describes NHP genomic resources that can aid in exploring how genes shape primate phenotypes. It focuses on gene expression trajectories across development in different tissues, the identification of functional genetic variation (including variants deleterious for protein function and regulatory variants modulating gene expression), and DNAm profiles as an emerging tool to understand the process of aging. These resources enable comparative functional genomics approaches to identify species-specific and primate-shared gene functionalities associated with health and development.
|
98
|
Murillo J, Spetale F, Guillaume S, Bulacio P, Garcia Labari I, Cailloux O, Destercke S, Tapia E. Consistency of the Tools That Predict the Impact of Single Nucleotide Variants (SNVs) on Gene Functionality: The BRCA1 Gene. Biomolecules 2020; 10:biom10030475. [PMID: 32244891 PMCID: PMC7175253 DOI: 10.3390/biom10030475] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/29/2019] [Revised: 01/15/2020] [Accepted: 01/29/2020] [Indexed: 11/16/2022]
Abstract
Single nucleotide variants (SNVs) occurring in a protein coding gene may disrupt its function in multiple ways. Predicting this disruption has been recognized as an important problem in bioinformatics research. Many tools, hereafter p-tools, have been designed to perform these predictions, and many of them are now in common use in scientific research and even in clinical applications. This highlights the importance of understanding the semantics of their outputs. To shed light on this issue, two questions are formulated: (i) do p-tools provide similar predictions (inner consistency), and (ii) are these predictions consistent with the literature (outer consistency)? To answer these, six p-tools are evaluated with exhaustive SNV datasets from the BRCA1 gene. Two indices, called K_all and K_strong, are proposed to quantify the inner consistency of pairs of p-tools, while the outer consistency is quantified by standard information retrieval metrics. The inner consistency analysis reveals that most of the p-tools are not consistent with each other, and the outer consistency analysis reveals that they are characterized by a low prediction performance. Although this result highlights the need to improve the prediction performance of individual p-tools, the inner consistency results pave the way to the systematic design of truly diverse ensembles of p-tools that can overcome the limitations of individual members.
|
99
|
Estimating the Allele-Specific Expression of SNVs From 10× Genomics Single-Cell RNA-Sequencing Data. Genes (Basel) 2020; 11:genes11030240. [PMID: 32106453 PMCID: PMC7140866 DOI: 10.3390/genes11030240] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 12/22/2019] [Revised: 02/10/2020] [Accepted: 02/19/2020] [Indexed: 12/15/2022]
Abstract
With the recent advances in single-cell RNA-sequencing (scRNA-seq) technologies, the estimation of allele expression from single cells is becoming increasingly reliable. Allele expression is both quantitative and dynamic and is an essential component of the genomic interactome. Here, we systematically estimate the allele expression from heterozygous single nucleotide variant (SNV) loci using scRNA-seq data generated on the 10× Genomics Chromium platform. We analyzed 26,640 human adipose-derived mesenchymal stem cells (from three healthy donors), sequenced to an average of 150K sequencing reads per cell (more than 4 billion scRNA-seq reads in total). High-quality SNV calls assessed in our study contained approximately 15% exonic and >50% intronic loci. To analyze the allele expression, we estimated the expressed variant allele fraction (VAF_RNA) from SNV-aware alignments and analyzed its variance and distribution (mono- and bi-allelic) at different minimum sequencing read thresholds. Our analysis shows that when assessing positions covered by a minimum of three unique sequencing reads, over 50% of the heterozygous SNVs show bi-allelic expression, while at a threshold of 10 reads, nearly 90% of the SNVs are bi-allelic. In addition, our analysis demonstrates the feasibility of scVAF_RNA estimation from current scRNA-seq datasets and shows that the 3′-based library generation protocol of 10× Genomics scRNA-seq data can be informative in SNV-based studies, including analyses of transcriptional kinetics.
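As a rough illustration of the quantity discussed in this abstract, the sketch below computes a variant allele fraction from read counts at one SNV locus and applies a minimum-read threshold. It assumes the conventional definition (variant reads divided by variant plus reference reads) and a simple rule in which any locus with both alleles observed counts as bi-allelic; the function names, the `mono_margin` parameter, and the classification rule are hypothetical and may differ from the authors' pipeline.

```python
def vaf_rna(ref_reads, var_reads):
    """Expressed variant allele fraction: variant reads / all reads covering the locus."""
    total = ref_reads + var_reads
    if total == 0:
        raise ValueError("locus has no covering reads")
    return var_reads / total


def classify_locus(ref_reads, var_reads, min_reads=3, mono_margin=0.0):
    """Classify a heterozygous SNV locus in one cell (assumed rule, for illustration).

    Loci with fewer than `min_reads` unique reads are not assessed. A locus is
    called mono-allelic when the fraction lies within `mono_margin` of 0 or 1.
    """
    if ref_reads + var_reads < min_reads:
        return "below-threshold"
    fraction = vaf_rna(ref_reads, var_reads)
    if fraction <= mono_margin or fraction >= 1.0 - mono_margin:
        return "mono-allelic"
    return "bi-allelic"


calls = [
    classify_locus(2, 0),                 # too few reads at the default threshold
    classify_locus(5, 0),                 # only the reference allele observed
    classify_locus(6, 4),                 # both alleles observed
    classify_locus(6, 3, min_reads=10),   # fails the stricter threshold
]
```

Raising `min_reads` discards low-coverage loci, which is one reason the share of bi-allelic calls can rise with the threshold: both alleles are more likely to be sampled when more reads cover a position.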
|
100
|
Coleman I, Corleone G, Arram J, Ng HC, Magnani L, Luk W. GeDi: applying suffix arrays to increase the repertoire of detectable SNVs in tumour genomes. BMC Bioinformatics 2020; 21:45. [PMID: 32024475 PMCID: PMC7003401 DOI: 10.1186/s12859-020-3367-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/25/2019] [Accepted: 01/14/2020] [Indexed: 11/10/2022]
Abstract
BACKGROUND Current popular variant calling pipelines rely on the mapping coordinates of each input read to a reference genome in order to detect variants. Since reads deriving from variant loci that diverge substantially in sequence from the reference are often assigned incorrect mapping coordinates, variant calling pipelines that rely on mapping coordinates can exhibit reduced sensitivity. RESULTS In this work we present GeDi, a suffix array-based somatic single nucleotide variant (SNV) calling algorithm that does not rely on read mapping coordinates to detect SNVs and is therefore capable of reference-free and mapping-free SNV detection. GeDi executes with practical runtime and memory resource requirements, is capable of SNV detection at very low allele frequency (<1%), and detects SNVs with high sensitivity at complex variant loci, dramatically outperforming MuTect, a well-established pipeline. CONCLUSION By designing novel suffix array-based SNV calling methods, we have developed a practical SNV calling tool, GeDi, that can characterise SNVs at complex variant loci and at low allele frequency, thus increasing the repertoire of detectable SNVs in tumour genomes. We expect GeDi to find use cases in targeted deep sequencing analysis, and to serve as a replacement for and improvement over previous suffix array-based SNV calling methods.
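To give a sense of how a suffix array supports mapping-free sequence queries, the sketch below builds a naive suffix array over a set of reads and counts occurrences of a short sequence by binary search. This is not GeDi's algorithm, which the abstract does not detail; the function names, the `$` read separator, and the toy reads are assumptions made purely for illustration.

```python
def build_suffix_array(text):
    """Return the start positions of all suffixes of `text` in lexicographic order.

    Naive construction, adequate for toy inputs only.
    """
    return sorted(range(len(text)), key=lambda i: text[i:])


def count_occurrences(text, suffix_array, pattern):
    """Count occurrences of `pattern` in `text` by binary search on the suffix array."""
    m = len(pattern)

    def first_index(strictly_greater):
        # First suffix whose length-m prefix is >= pattern (or > pattern).
        lo, hi = 0, len(suffix_array)
        while lo < hi:
            mid = (lo + hi) // 2
            prefix = text[suffix_array[mid]:suffix_array[mid] + m]
            if prefix < pattern or (strictly_greater and prefix == pattern):
                lo = mid + 1
            else:
                hi = mid
        return lo

    return first_index(True) - first_index(False)


def index_reads(reads):
    """Join reads with '$' so that no match can span two reads, then index them."""
    text = "$".join(reads)
    return text, build_suffix_array(text)


# Toy reads: the third tumour read carries a single-base change (T -> A).
tumour_text, tumour_sa = index_reads(["ACGTTGCA", "CGTTGCAT", "ACGATGCA"])
normal_text, normal_sa = index_reads(["ACGTTGCA", "CGTTGCAT"])

variant_in_tumour = count_occurrences(tumour_text, tumour_sa, "CGATG")
variant_in_normal = count_occurrences(normal_text, normal_sa, "CGATG")
wildtype_in_tumour = count_occurrences(tumour_text, tumour_sa, "CGTTG")
```

A sequence that appears in the tumour reads but not in the matched normal reads is a candidate somatic difference, and the lookup needs no reference genome or mapping coordinates.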
|