1
|
Uppili B, Faruq M. STRIDE-DB: a comprehensive database for exploration of instability and phenotypic relevance of short tandem repeats in the human genome. Database (Oxford) 2024; 2024:baae020. [PMID: 38602506 PMCID: PMC11008502 DOI: 10.1093/database/baae020] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/27/2023] [Revised: 11/10/2023] [Accepted: 03/07/2024] [Indexed: 04/12/2024]
Abstract
Short Tandem Repeats (STRs) are genetic markers made up of repeating DNA sequences. Variations in STRs are widely studied in forensic analysis, population studies and genetic testing for a variety of neuromuscular disorders. Understanding polymorphic STR variation and its cause is crucial for deciphering genetic information and finding links to various disorders. In this paper, we present STRIDE-DB, a novel and unique platform to explore STR Instability and its Phenotypic Relevance, and a comprehensive database of STRs in the human genome. We utilized RepeatMasker to identify all the STRs in the human genome (hg19) and combined them with frequency data from the 1000 Genomes Project. STRIDE-DB, a user-friendly resource, plays a pivotal role in investigating the relationship between STR variation, instability and phenotype. By harnessing data from genome-wide association studies (GWAS), the ClinVar database, Alu loci, haploblocks in the genome and conservation of the STRs, it serves as an important tool for researchers exploring the variability of STRs in the human genome and its direct impact on phenotypes. STRIDE-DB has broad applicability and significance in research domains such as forensic science and repeat expansion disorders. Database URL: https://stridedb.igib.res.in.
Affiliation(s)
- Bharathram Uppili
- Genomics and Molecular Medicine, CSIR-Institute of Genomics and Integrative Biology (CSIR-IGIB), Delhi 110007, India
- CSIR-HRDC Campus, Academy for Scientific and Innovative Research, Ghaziabad 201002, India
| | - Mohammed Faruq
- Genomics and Molecular Medicine, CSIR-Institute of Genomics and Integrative Biology (CSIR-IGIB), Delhi 110007, India
| |
|
2
|
Han L, Cui DJ, Huang B, Yang Q, Huang T, Lin GY, Chen SJ. CLDN5 identified as a biomarker for metastasis and immune infiltration in gastric cancer via pan-cancer analysis. Aging (Albany NY) 2023; 15:204776. [PMID: 37286335 PMCID: PMC10292893 DOI: 10.18632/aging.204776] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 03/29/2023] [Accepted: 05/23/2023] [Indexed: 06/09/2023]
Abstract
BACKGROUND CLDN5 protein is essential for the formation of tight junctions in epithelial cells, and has been associated with epithelial-mesenchymal transition. Research has indicated that CLDN5 is associated with tumor metastasis, the tumor microenvironment, and immunotherapy in multiple types of cancer. However, no comprehensive evaluation of the expression of CLDN5 and immunotherapy signatures through a pan-cancer analysis or immunoassay has been performed. METHODS We explored CLDN5's differential expression, survival analysis and clinicopathological staging through the TCGA database, and then corroborated the expression of CLDN5 by utilizing the GEO (Gene Expression Omnibus) database. To analyze CLDN5 KEGG, GO, and Hallmark mutations, as well as TIMER for immune infiltration, GSEA was utilized with ROC curve, mutation, and other factors such as survival, pathological stage, TME, MSI, TMB, immune cell infiltration, and DNA methylation. Immunohistochemistry was used to assess CLDN5 staining in gastric cancer tissues and paracancerous tissues. Visualization was done with R version 4.2.0 (http://www.rproject.org/). RESULTS According to the TCGA database, CLDN5 expression levels differed significantly between cancer and normal tissues, and the GEO database (GSE49051 and GSE64951) and tissue microarrays confirmed this result. Infiltrating cluster of differentiation 8+ (CD8+) T cells, CD4+ cells, neutrophils, dendritic cells, and macrophages correlated with CLDN5 expression. DNA methylation, TMB, and MSI were related to CLDN5 expression. Based on the ROC curve analysis, CLDN5 demonstrates outstanding diagnostic effectiveness for gastric cancer and is comparable to CA-199. CONCLUSIONS The findings suggest that CLDN5 is implicated in the oncogenesis of diverse cancer types, underscoring its potential significance in cancer biology. Notably, CLDN5 could have implications in immune infiltration and immune checkpoint inhibitor therapies; however, further research is needed to confirm this.
Affiliation(s)
- Lu Han
- Department of Gastroenterology, Guizhou Provincial People’s Hospital, Guiyang, Guizhou Province, China
- Department of Infectious Diseases, Affiliated Hospital of Guizhou Medical University, Guiyang, Guizhou Province, China
| | - De-Jun Cui
- Department of Gastroenterology, Guizhou Provincial People’s Hospital, Guiyang, Guizhou Province, China
| | - Bo Huang
- Department of Gastroenterology, Guizhou Provincial People’s Hospital, Guiyang, Guizhou Province, China
| | - Qian Yang
- Department of Gastroenterology, Guizhou Provincial People’s Hospital, Guiyang, Guizhou Province, China
| | - Tao Huang
- Department of Infectious Diseases, Affiliated Hospital of Guizhou Medical University, Guiyang, Guizhou Province, China
| | - Guo-Yuan Lin
- Department of Infectious Diseases, Affiliated Hospital of Guizhou Medical University, Guiyang, Guizhou Province, China
| | - Shao-Jie Chen
- Department of Hepatobiliary Surgery, The Affiliated Hospital of Guizhou Medical University, Guiyang, Guizhou, China
| |
|
3
|
Shi Y, Niu Y, Zhang P, Luo H, Liu S, Zhang S, Wang J, Li Y, Liu X, Song T, Xu T, He S. Characterization of genome-wide STR variation in 6487 human genomes. Nat Commun 2023; 14:2092. [PMID: 37045857 PMCID: PMC10097659 DOI: 10.1038/s41467-023-37690-8] [Citation(s) in RCA: 18] [Impact Index Per Article: 18.0] [Received: 08/04/2022] [Accepted: 03/27/2023] [Indexed: 04/14/2023]
Abstract
Short tandem repeats (STRs) are abundant and highly mutagenic in the human genome. Many STR loci have been associated with a range of human genetic disorders. However, most population-scale studies on STR variation in humans have focused on European ancestry cohorts or are limited by sequencing depth. Here, we depicted a comprehensive map of 366,013 polymorphic STRs (pSTRs) constructed from 6487 deeply sequenced genomes, comprising 3983 Chinese samples (~31.5x, NyuWa) and 2504 samples from the 1000 Genomes Project (~33.3x, 1KGP). We found that STR mutations were affected by motif length, chromosome context and epigenetic features. We identified 3273 and 1117 pSTRs whose repeat numbers were associated with gene expression and 3'UTR alternative polyadenylation, respectively. We also implemented population analysis, investigated population differentiated signatures, and genotyped 60 known disease-causing STRs. Overall, this study further extends the scale of STR variation in humans and propels our understanding of the semantics of STRs.
Affiliation(s)
- Yirong Shi
- Key Laboratory of RNA Biology, Center for Big Data Research in Health, Institute of Biophysics, Chinese Academy of Sciences, Beijing, 100101, China
- University of Chinese Academy of Sciences, Beijing, 100049, China
| | - Yiwei Niu
- Key Laboratory of RNA Biology, Center for Big Data Research in Health, Institute of Biophysics, Chinese Academy of Sciences, Beijing, 100101, China
- College of Life Sciences, University of Chinese Academy of Sciences, Beijing, 100049, China
| | - Peng Zhang
- Key Laboratory of RNA Biology, Center for Big Data Research in Health, Institute of Biophysics, Chinese Academy of Sciences, Beijing, 100101, China
| | - Huaxia Luo
- Key Laboratory of RNA Biology, Center for Big Data Research in Health, Institute of Biophysics, Chinese Academy of Sciences, Beijing, 100101, China
| | - Shuai Liu
- Key Laboratory of RNA Biology, Center for Big Data Research in Health, Institute of Biophysics, Chinese Academy of Sciences, Beijing, 100101, China
- College of Life Sciences, University of Chinese Academy of Sciences, Beijing, 100049, China
| | - Sijia Zhang
- Key Laboratory of RNA Biology, Center for Big Data Research in Health, Institute of Biophysics, Chinese Academy of Sciences, Beijing, 100101, China
- College of Life Sciences, University of Chinese Academy of Sciences, Beijing, 100049, China
| | - Jiajia Wang
- Key Laboratory of RNA Biology, Center for Big Data Research in Health, Institute of Biophysics, Chinese Academy of Sciences, Beijing, 100101, China
| | - Yanyan Li
- Key Laboratory of RNA Biology, Center for Big Data Research in Health, Institute of Biophysics, Chinese Academy of Sciences, Beijing, 100101, China
| | - Xinyue Liu
- Key Laboratory of RNA Biology, Center for Big Data Research in Health, Institute of Biophysics, Chinese Academy of Sciences, Beijing, 100101, China
- University of Chinese Academy of Sciences, Beijing, 100049, China
| | - Tingrui Song
- Key Laboratory of RNA Biology, Center for Big Data Research in Health, Institute of Biophysics, Chinese Academy of Sciences, Beijing, 100101, China
| | - Tao Xu
- National Laboratory of Biomacromolecules, CAS Center for Excellence in Biomacromolecules, Institute of Biophysics, Chinese Academy of Sciences, Beijing, 100101, China
- Shandong First Medical University & Shandong Academy of Medical Sciences, Jinan, 250117, Shandong, China
| | - Shunmin He
- Key Laboratory of RNA Biology, Center for Big Data Research in Health, Institute of Biophysics, Chinese Academy of Sciences, Beijing, 100101, China
- College of Life Sciences, University of Chinese Academy of Sciences, Beijing, 100049, China
| |
|
4
|
Assessment of Microsatellite Instability from Next-Generation Sequencing Data. Adv Exp Med Biol 2022; 1361:75-100. [DOI: 10.1007/978-3-030-91836-1_5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/19/2022]
|
5
|
Aprajita, Sharma R. Comprehending fibroblast growth factor receptor like 1: Oncogene or tumor suppressor? Cancer Treat Res Commun 2021; 29:100472. [PMID: 34689016 DOI: 10.1016/j.ctarc.2021.100472] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/07/2021] [Revised: 09/27/2021] [Accepted: 09/29/2021] [Indexed: 12/16/2022]
Abstract
Fibroblast Growth Factor Receptor Like 1 (FGFRL1) signaling has a crucial role in a multitude of processes during genetic diseases, embryonic development and various types of cancer. Due to its partial structural similarity with its classical Fibroblast Growth Factor Receptor (FGFR) counterparts and its lack of a tyrosine kinase domain, FGFRL1 was thought to work as a decoy receptor in FGF/FGFR signaling. Later, a growing body of evidence showed that expression of FGFRL1 affects major pathways such as ERK1/2 and Akt, which are dysfunctional in a wide range of human cancers. In this review, we provide an overview of the current understanding of FGFRL1 and its roles in cell differentiation, adhesion and proliferation pathways. Overexpression of FGFRL1 might lead to tumor progression and invasion. In this context, inhibitors for FGFRL1 might have therapeutic benefits in human cancer prognosis.
Affiliation(s)
- Aprajita
- University School of Biotechnology, Guru Gobind Singh Indraprastha University, Dwarka, New Delhi, India
| | - Rinu Sharma
- University School of Biotechnology, Guru Gobind Singh Indraprastha University, Dwarka, New Delhi, India
| |
|
6
|
Kinney N, Kang L, Bains H, Lawson E, Husain M, Husain K, Sandhu I, Shin Y, Carter JK, Anandakrishnan R, Michalak P, Garner H. Ethnically biased microsatellites contribute to differential gene expression and glutathione metabolism in Africans and Europeans. PLoS One 2021; 16:e0249148. [PMID: 33765058 PMCID: PMC7993785 DOI: 10.1371/journal.pone.0249148] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 01/14/2021] [Accepted: 03/11/2021] [Indexed: 12/28/2022]
Abstract
Approximately three percent of the human genome is occupied by microsatellites: a type of short tandem repeat (STR). Microsatellites have well established effects on (a) the genetic structure of diverse human populations and (b) expression of nearby genes. These lines of inquiry have uncovered 3,984 ethnically biased microsatellite loci (EBML) and 28,375 expression STRs (eSTRs), respectively. We hypothesize that a combination of EBML, eSTRs, and gene expression data (RNA-seq) can be used to show that microsatellites contribute to differential gene expression and phenotype in human populations. In fact, our previous study demonstrated a degree of mutual overlap between EBML and eSTRs but fell short of quantifying effects on gene expression. The present work aims to narrow the gap. First, we identify 313 overlapping EBML/eSTRs and recapitulate their mutual overlap. The 313 EBML/eSTRs are then characterized across ethnicity and tissue type. We use RNA-seq data to pursue validation of 49 regions that affect whole blood gene expression; 32 out of 54 affected genes are differentially expressed in Africans and Europeans. We quantify the relative contribution of these 32 genes to differential expression; fold change tends to be less than other differentially expressed genes. Repeat length correlates with expression for 15 of the 32 genes; two are conspicuously involved in glutathione metabolism. Finally, we repurpose a mathematical model of glutathione metabolism to investigate how a single polymorphic microsatellite affects phenotype. We conclude with a testable prediction that microsatellite polymorphisms affect GPX7 expression and oxidative stress in Africans and Europeans.
Affiliation(s)
- Nick Kinney
- Edward Via College of Osteopathic Medicine, Blacksburg, Virginia, United States of America
- Gibbs Cancer Center & Research Institute, Spartanburg, South Carolina, United States of America
| | - Lin Kang
- Edward Via College of Osteopathic Medicine, Blacksburg, Virginia, United States of America
- Gibbs Cancer Center & Research Institute, Spartanburg, South Carolina, United States of America
| | - Harpal Bains
- Edward Via College of Osteopathic Medicine, Blacksburg, Virginia, United States of America
| | - Elizabeth Lawson
- Edward Via College of Osteopathic Medicine, Blacksburg, Virginia, United States of America
| | - Mesam Husain
- Edward Via College of Osteopathic Medicine, Blacksburg, Virginia, United States of America
| | - Kumayl Husain
- Edward Via College of Osteopathic Medicine, Blacksburg, Virginia, United States of America
| | - Inderjit Sandhu
- Edward Via College of Osteopathic Medicine, Blacksburg, Virginia, United States of America
| | - Yongdeok Shin
- Edward Via College of Osteopathic Medicine, Blacksburg, Virginia, United States of America
| | - Javan K. Carter
- University of Colorado Boulder, Boulder, Colorado, United States of America
| | - Ramu Anandakrishnan
- Edward Via College of Osteopathic Medicine, Blacksburg, Virginia, United States of America
- Gibbs Cancer Center & Research Institute, Spartanburg, South Carolina, United States of America
| | - Pawel Michalak
- Edward Via College of Osteopathic Medicine, Blacksburg, Virginia, United States of America
- Gibbs Cancer Center & Research Institute, Spartanburg, South Carolina, United States of America
- Institute of Evolution, University of Haifa, Haifa, Israel
| | - Harold Garner
- Edward Via College of Osteopathic Medicine, Blacksburg, Virginia, United States of America
- Gibbs Cancer Center & Research Institute, Spartanburg, South Carolina, United States of America
| |
|
7
|
Hodgman MW, Miller JB, Meurs TE, Kauwe JSK. CUBAP: an interactive web portal for analyzing codon usage biases across populations. Nucleic Acids Res 2020; 48:11030-11039. [PMID: 33045750 PMCID: PMC7641757 DOI: 10.1093/nar/gkaa863] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 06/05/2020] [Revised: 08/18/2020] [Accepted: 09/22/2020] [Indexed: 12/19/2022]
Abstract
Synonymous codon usage significantly impacts translational and transcriptional efficiency, gene expression, and the secondary structure of both mRNA and proteins, and has been implicated in various diseases. However, population-specific differences in codon usage biases remain largely unexplored. Here, we present a web server, https://cubap.byu.edu, to facilitate analyses of codon usage biases across populations (CUBAP). Using the 1000 Genomes Project, we calculated and visually depicted population-specific differences in codon frequencies, codon aversion, identical codon pairing, co-tRNA codon pairing, ramp sequences, and nucleotide composition in 17,634 genes. We found that codon pairing significantly differs between populations in 35.8% of genes, allowing us to successfully predict the place of origin for African and East Asian individuals with 98.8% and 100% accuracy, respectively. We also used CUBAP to identify a significant bias toward decreased CTG pairing in the immunity related GTPase M (IRGM) gene in East Asian and African populations, which may contribute to the decreased association of rs10065172 with Crohn's disease in those populations. CUBAP facilitates in-depth gene-specific and codon-specific visualization that will aid in analyzing candidate genes identified in genome-wide association studies, identifying functional implications of synonymous variants, predicting population-specific impacts of synonymous variants and categorizing genetic biases unique to certain populations.
Affiliation(s)
- Matthew W Hodgman
- Department of Biology, Brigham Young University, Provo, UT 84602, USA
| | - Justin B Miller
- Department of Biology, Brigham Young University, Provo, UT 84602, USA
| | - Taylor E Meurs
- Department of Biology, Brigham Young University, Provo, UT 84602, USA
| | - John S K Kauwe
- Department of Biology, Brigham Young University, Provo, UT 84602, USA
| |
|
8
|
Balzano E, Pelliccia F, Giunta S. Genome (in)stability at tandem repeats. Semin Cell Dev Biol 2020; 113:97-112. [PMID: 33109442 DOI: 10.1016/j.semcdb.2020.10.003] [Citation(s) in RCA: 18] [Impact Index Per Article: 4.5] [Received: 04/06/2020] [Revised: 09/26/2020] [Accepted: 10/10/2020] [Indexed: 12/12/2022]
Abstract
Repeat sequences account for over half of the human genome and represent a significant source of variation that underlies physiological and pathological states. Yet, their study has been hindered by limitations in short-read sequencing technology and difficulties in assembly. An important category of repetitive DNA in the human genome comprises tandem repeats (TRs), where repetitive units are arranged in a head-to-tail pattern. Compared to other regions of the genome, TRs carry a 10- to 10,000-fold higher mutation rate. There are several mutagenic mechanisms that can give rise to this propensity toward instability, but their precise contribution remains speculative. Given the high degree of homology between these sequences and their arrangement in tandem, once damaged, TRs have an intrinsic propensity to undergo aberrant recombination with non-allelic exchange and generate harmful rearrangements that may undermine the stability of the entire genome. The dynamic mutagenesis at TRs has been found to underlie individual polymorphism associated with neurodegenerative and neuromuscular disorders, as well as complex genetic diseases like cancer and diabetes. Here, we review our current understanding of the surveillance and repair mechanisms operating within these regions, and we describe how alterations in these protective processes can readily trigger mutational signatures found at TRs, ultimately resulting in the pathological correlation between TR instability and human diseases. Finally, we provide a viewpoint to counter the detrimental effects that TRs pose in light of their selection and conservation, as important drivers of human evolution.
Affiliation(s)
- Elisa Balzano
- Dipartimento di Biologia e Biotecnologie "Charles Darwin", Sapienza Università di Roma, 00185 Roma, Italy
| | - Franca Pelliccia
- Dipartimento di Biologia e Biotecnologie "Charles Darwin", Sapienza Università di Roma, 00185 Roma, Italy
| | - Simona Giunta
- The Rockefeller University, 1230 York Avenue, New York, NY 10065, USA; Dipartimento di Biologia e Biotecnologie "Charles Darwin", Sapienza Università di Roma, 00185 Roma, Italy
| |
|
9
|
Chiara M, Zambelli F, Picardi E, Horner DS, Pesole G. Critical assessment of bioinformatics methods for the characterization of pathological repeat expansions with single-molecule sequencing data. Brief Bioinform 2019; 21:1971-1986. [DOI: 10.1093/bib/bbz099] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 05/09/2019] [Revised: 06/22/2019] [Accepted: 07/09/2019] [Indexed: 01/19/2023]
Abstract
Over the last 5 years, a number of studies have reported the successful application of single-molecule sequencing technologies to the determination of the size and sequence of pathologically expanded microsatellite repeats. However, different custom bioinformatics pipelines were employed in each study, preventing meaningful comparisons and somewhat limiting the reproducibility of the results. In this review, we provide a brief summary of state-of-the-art methods for the characterization of expanded repeat alleles, along with a detailed comparison of bioinformatics tools for the determination of repeat length and sequence, using both real and simulated data. Our reanalysis of publicly available human genome sequencing data suggests a modest, but statistically significant, increase in the error rate of single-molecule sequencing technologies at genomic regions containing short tandem repeats. However, we observe that all the methods tested herein, irrespective of the strategy used for the analysis of the data (based on either the alignment or the assembly of the reads), show high levels of sensitivity in both the detection of expanded tandem repeats and the estimation of the expansion size, suggesting that approaches based on single-molecule sequencing technologies are highly effective for the detection and quantification of tandem repeat expansions and contractions.
Affiliation(s)
- Matteo Chiara
- Department of Biosciences, University of Milan, via Celoria 26, 20133 Milan, Italy
- Institute of Biomembranes, Bioenergetics and Molecular Biotechnologies, National Research Council, Via Amendola e, 70126 Bari, Italy
| | - Federico Zambelli
- Department of Biosciences, University of Milan, via Celoria 26, 20133 Milan, Italy
- Institute of Biomembranes, Bioenergetics and Molecular Biotechnologies, National Research Council, Via Amendola e, 70126 Bari, Italy
| | - Ernesto Picardi
- Institute of Biomembranes, Bioenergetics and Molecular Biotechnologies, National Research Council, Via Amendola e, 70126 Bari, Italy
- Department of Biosciences, Biotechnology and Biopharmaceutics, University of Bari “A. Moro”, Via Orabona 4, 70126 Bari, Italy
| | - David S Horner
- Department of Biosciences, University of Milan, via Celoria 26, 20133 Milan, Italy
- Institute of Biomembranes, Bioenergetics and Molecular Biotechnologies, National Research Council, Via Amendola e, 70126 Bari, Italy
| | - Graziano Pesole
- Institute of Biomembranes, Bioenergetics and Molecular Biotechnologies, National Research Council, Via Amendola e, 70126 Bari, Italy
- Department of Biosciences, Biotechnology and Biopharmaceutics, University of Bari “A. Moro”, Via Orabona 4, 70126 Bari, Italy
| |
|