1. Chen R, Wu J, Che Y, Jiao Y, Sun H, Zhao Y, Chen P, Meng L, Zhao T. Machine learning-driven prognostic analysis of cuproptosis and disulfidptosis-related lncRNAs in clear cell renal cell carcinoma: a step towards precision oncology. Eur J Med Res 2024; 29:176. [PMID: 38491523 PMCID: PMC10943875 DOI: 10.1186/s40001-024-01763-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/01/2024] [Accepted: 03/01/2024] [Indexed: 03/18/2024]
Abstract
Studies of cuproptosis and disulfidptosis, two recently discovered mechanisms of cell death, have shown that differential expression of key genes and long non-coding RNAs (lncRNAs) profoundly influences tumor development and drug sensitivity. Clear cell renal cell carcinoma (ccRCC), the most common subtype of kidney cancer, currently lacks research using cuproptosis and disulfidptosis-related lncRNAs (CDRLRs) as prognostic markers. In this study, we analyzed RNA-seq data, clinical information, and mutation data on ccRCC from The Cancer Genome Atlas (TCGA) and cross-referenced them with known cuproptosis and disulfidptosis-related genes (CDRGs). Using the LASSO machine learning algorithm, we identified four CDRLRs (ACVR2B-AS1, AC095055.1, AL161782.1, and MANEA-DT) that are strongly associated with prognosis and used them to construct a prognostic risk model. To verify the model's reliability and validate these four CDRLRs as significant prognostic factors, we performed dataset grouping validation, followed by RT-qPCR and external database validation of the differential expression and prognostic value of the CDRLRs in ccRCC. Gene function and pathway analyses were conducted using Gene Ontology (GO) and Gene Set Enrichment Analysis (GSEA) for the high- and low-risk groups. Additionally, we analyzed the tumor mutation burden (TMB) and the tumor immune microenvironment (TME), employing the oncoPredict and Immunophenoscore (IPS) algorithms to assess the sensitivity of the different risk groups to targeted therapeutics and immunosuppressants. Our primary objective is to refine prognostic predictions for patients with ccRCC and inform treatment decisions through a comprehensive study of cuproptosis and disulfidptosis.
Affiliation(s)
- Ronghui Chen
  - School of Clinical Medicine, Shandong Second Medical University, Weifang, 261053, China
  - Department of Oncology, People's Hospital of Rizhao, Rizhao, 276826, China
- Jun Wu
  - Department of Oncology, People's Hospital of Rizhao, Rizhao, 276826, China
- Yinwei Che
  - Department of Central Laboratory, Shandong Provincial Key Medical and Health Laboratory, Rizhao Key Laboratory of Basic Research on Anesthesia and Respiratory Intensive Care, The People's Hospital of Rizhao, Rizhao, 276826, Shandong, China
- Yuzhuo Jiao
  - Department of Central Laboratory, Shandong Provincial Key Medical and Health Laboratory, Rizhao Key Laboratory of Basic Research on Anesthesia and Respiratory Intensive Care, The People's Hospital of Rizhao, Rizhao, 276826, Shandong, China
- Huashan Sun
  - Department of Central Laboratory, Shandong Provincial Key Medical and Health Laboratory, Rizhao Key Laboratory of Basic Research on Anesthesia and Respiratory Intensive Care, The People's Hospital of Rizhao, Rizhao, 276826, Shandong, China
- Yinuo Zhao
  - Department of Pathology, People's Hospital of Rizhao, Rizhao, 276826, China
- Pingping Chen
  - Department of Pathology, People's Hospital of Rizhao, Rizhao, 276826, China
- Lingxin Meng
  - Department of Oncology, People's Hospital of Rizhao, Rizhao, 276826, China
- Tao Zhao
  - Department of Central Laboratory, Shandong Provincial Key Medical and Health Laboratory, Rizhao Key Laboratory of Basic Research on Anesthesia and Respiratory Intensive Care, The People's Hospital of Rizhao, Rizhao, 276826, Shandong, China
2. Chen R, Wu J, Liu S, Sun Y, Liu G, Zhang L, Yu Q, Xu J, Meng L. Immune-related risk prognostic model for clear cell renal cell carcinoma: Implications for immunotherapy. Medicine (Baltimore) 2023; 102:e34786. [PMID: 37653791 PMCID: PMC10470711 DOI: 10.1097/md.0000000000034786] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/25/2023] [Revised: 07/25/2023] [Accepted: 07/26/2023] [Indexed: 09/02/2023]
Abstract
Clear cell renal cell carcinoma (ccRCC) is associated with complex immune interactions. We conducted a comprehensive analysis of immune-related differentially expressed genes in patients with ccRCC using data from The Cancer Genome Atlas and ImmPort databases. The immune-related differentially expressed genes underwent functional and pathway enrichment analysis, followed by Cox regression combined with LASSO regression to construct an immune-related risk prognostic model. The model comprised 4 immune-related genes (IRGs): CLDN4, SEMA3G, CAT, and UCN. Patients were stratified into high-risk and low-risk groups based on the median risk score, and the overall survival rate of the high-risk group was significantly lower than that of the low-risk group, confirming the reliability of the model from various perspectives. Further comparison of immune infiltration, tumor mutation load, and immunophenoscore (IPS) between the 2 groups indicated that the high-risk group could potentially demonstrate heightened sensitivity to immunotherapy targeting the checkpoints PD-1, CTLA-4, IL-6, and LAG3 in ccRCC patients. The proposed model not only applies to ccRCC but also shows potential for development into a prognostic model for renal cancer, thus introducing a novel approach for personalized immunotherapy in ccRCC.
Affiliation(s)
- Ronghui Chen
  - Clinical Medical College of Weifang Medical University, Weifang, China
- Jun Wu
  - Department of Oncology, People’s Hospital of Rizhao, Rizhao, China
- Shan Liu
  - Department of Oncology, People’s Hospital of Rizhao, Rizhao, China
- Yefeng Sun
  - Department of Emergency, People’s Hospital of Rizhao, Rizhao, China
- Guozhi Liu
  - Jining Medical University, Jining, China
- Lin Zhang
  - Jining Medical University, Jining, China
- Qing Yu
  - Clinical Medical College of Weifang Medical University, Weifang, China
- Juan Xu
  - Clinical Medical College of Weifang Medical University, Weifang, China
- Lingxin Meng
  - Department of Oncology, People’s Hospital of Rizhao, Rizhao, China
3. Wang H, Liu J, Yang J, Wang Z, Zhang Z, Peng J, Wang Y, Hong L. A novel tumor mutational burden-based risk model predicts prognosis and correlates with immune infiltration in ovarian cancer. Front Immunol 2022; 13:943389. [PMID: 36003381 PMCID: PMC9393426 DOI: 10.3389/fimmu.2022.943389] [Citation(s) in RCA: 20] [Impact Index Per Article: 10.0] [Received: 05/13/2022] [Accepted: 07/18/2022] [Indexed: 11/29/2022]
Abstract
Tumor mutational burden (TMB) has been reported to determine the response to immunotherapy, thus affecting the patient’s prognosis in many cancers. However, it is unclear whether TMB or a TMB-related signature could be used as a prognostic indicator for ovarian cancer (OC), as its potential association with immune infiltration remains poorly understood. Therefore, this study aimed to develop a novel TMB-related risk model (TMBrisk) to predict the prognosis of OC patients on the basis of TMB-related genes, and to explore the potential association between TMB/TMBrisk and immune infiltration. The mutational landscape, TMB scores, and correlations between TMB and clinical characteristics and immune infiltration were investigated in The Cancer Genome Atlas (TCGA)-OV cohort. Differentially expressed gene (DEG) analyses and weighted gene co-expression network analysis (WGCNA) were performed to derive TMB-related genes. TMBrisk was constructed by Cox regression and further validated in Gene Expression Omnibus (GEO) datasets. The mRNA and protein expression levels and biological functions of TMBrisk hub genes were verified through Gene Expression Profiling Interactive Analysis (GEPIA), GSCA Lite, the Human Protein Atlas (HPA) database, and RT-qPCR. TMBrisk-related biological phenotypes were analyzed through functional enrichment and tumor immune infiltration signatures. Potential therapeutic regimens were inferred using the Genomics of Drug Sensitivity in Cancer (GDSC) database and the connectivity map (CMap). According to our results, higher TMB was associated with better survival and higher CD8+ T cell, regulatory T cell, and NK cell infiltration. TMBrisk was developed based on CBWD1, ST7L, RFX5-AS1, C3orf38, LRFN1, LEMD1, and HMGB1. High TMBrisk was identified as a poor prognostic factor in the TCGA and GEO datasets; the high-TMBrisk group comprised more high-grade (G2 and G3) and advanced clinical stage (stage III/IV) tumors. Meanwhile, higher TMBrisk was associated with an immunosuppressive phenotype, with less infiltration of a majority of immunocytes and lower expression of several genes of the human leukocyte antigen (HLA) family. Moreover, a nomogram containing TMBrisk showed strong predictive ability, as demonstrated by time-dependent ROC analysis. Overall, this novel TMB-related risk model (TMBrisk) could predict prognosis, evaluate immune infiltration, and help discover new therapeutic regimens in OC, and it shows promise for clinical application.
4. Liu H, Xiong F, Zhai D, Duan X, Chen D, Chen Y, Wang Y, Xia M. Genetic Diversity and Population Differentiation of Chinese Lizard Gudgeon (Saurogobio dabryi) in the Upper Yangtze River. Front Ecol Evol 2022. [DOI: 10.3389/fevo.2022.890475] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/13/2022]
Abstract
Dam construction on the upper Yangtze River has dramatically altered riverine ecosystems and fragmented fish habitats, which might influence the genetic structure of fish populations. In this study, we examined the possible genetic effects of dam construction on Chinese lizard gudgeon (Saurogobio dabryi) populations in the upper Yangtze River, China. Seven populations were sampled, and genetic structure was analyzed using single nucleotide polymorphism (SNP) markers obtained through the specific locus amplified fragment sequencing (SLAF-seq) method. The numbers of SNPs were lower in the upstream populations than in the downstream populations. Genetic similarity increased from downstream to upstream. The upstream populations of S. dabryi might be more vulnerable to genetic drift than those downstream. Structure analysis indicated three distinct genetic groups of S. dabryi in the upper Yangtze River, among which the genetic differentiation values (Fst) were high. The genetic differentiation of S. dabryi was closely correlated with spatial distance. We did not detect a significant correlation between isolation time and genetic differentiation, suggesting that the impacts of dams on the genetic structure of S. dabryi can be relatively minimal on a short time scale. The results quantify the genetic diversity and population structure patterns of S. dabryi after habitat fragmentation caused by dams, which will provide a reference for resource protection and management of this species in the upper Yangtze River.
5. KIF2C Is a Novel Prognostic Biomarker and Correlated with Immune Infiltration in Endometrial Cancer. Stem Cells Int 2021; 2021:1434856. [PMID: 34650608 PMCID: PMC8510809 DOI: 10.1155/2021/1434856] [Citation(s) in RCA: 20] [Impact Index Per Article: 6.7] [Received: 06/20/2021] [Revised: 08/24/2021] [Accepted: 09/02/2021] [Indexed: 11/30/2022]
Abstract
Endometrial cancer (EC) is a commonly diagnosed cancer in women, and the prognosis of advanced types of EC is extremely poor. Kinesin family member 2C (KIF2C) has been reported as an oncogene in cancers. However, its pathophysiological roles and its correlation with tumor-infiltrating lymphocytes in EC remain unclear. The mRNA and protein levels of KIF2C in EC tissues were detected by qRT-PCR, Western blot (WB), and IHC. CCK8, Transwell, and colony formation assays were applied to assess the effects of KIF2C on cell proliferation, migration, and invasion. Cell apoptosis and cell cycle were analyzed by flow cytometry. The antitumor effect was further validated in a nude mouse xenograft cancer model and a humanized mouse model. KIF2C expression was higher in EC. Knockdown of KIF2C prolonged the G1 phase and inhibited EC cell proliferation, migration, and invasion in vitro. Bioinformatics analysis indicated that KIF2C is negatively correlated with the infiltration level of CD8+ T cells but positively correlated with poor prognosis in EC patients. The apoptosis of CD8+ T cells was inhibited after the knockdown of KIF2C and was further inhibited when the knockdown was combined with anti-PD1. Conversely, compared with knockdown of KIF2C expression alone, the combination with anti-PD1 further promoted the apoptosis of Ishikawa and RL95-2 cells. Moreover, the knockdown of KIF2C inhibited the expression of Ki-67 and the growth of tumors in the nude mouse xenograft cancer model. Our study further evaluated the antitumor efficacy of combining anti-PD1 with KIF2C knockdown in a humanized mouse model. This study indicated that KIF2C is a novel prognostic biomarker that determines cancer progression, a target for EC therapy, and correlated with tumor immune cell infiltration in EC.
6. Klees S, Heinrich F, Schmitt AO, Gültas M. agReg-SNPdb: A Database of Regulatory SNPs for Agricultural Animal Species. Biology 2021; 10:790. [PMID: 34440019 PMCID: PMC8389679 DOI: 10.3390/biology10080790] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 07/12/2021] [Revised: 08/12/2021] [Accepted: 08/12/2021] [Indexed: 12/13/2022]
Abstract
Transcription factors (TFs) govern transcriptional gene regulation by specifically binding to short DNA motifs, known as transcription factor binding sites (TFBSs), in regulatory regions, such as promoters. Today, it is well known that single nucleotide polymorphisms (SNPs) in TFBSs can dramatically affect the level of gene expression, since they can cause a change in the binding affinity of TFs. Such SNPs, referred to as regulatory SNPs (rSNPs), have gained attention in the life sciences due to their causality for specific traits or diseases. In this study, we present agReg-SNPdb, a database comprising rSNP data of seven agricultural and domestic animal species: cattle, pig, chicken, sheep, horse, goat, and dog. To identify the rSNPs, we constructed a bioinformatics pipeline and identified a total of 10,623,512 rSNPs, which are located within TFBSs and affect the binding affinity of putative TFs. Altogether, we implemented the first systematic analysis of SNPs in promoter regions and their impact on the binding affinity of TFs for livestock and made it usable via a web interface.
Affiliation(s)
- Selina Klees
  - Breeding Informatics Group, Department of Animal Sciences, Georg-August University, Margarethe von Wrangell-Weg 7, 37075 Göttingen, Germany
  - Center for Integrated Breeding Research (CiBreed), Georg-August University, Albrecht-Thaer-Weg 3, 37075 Göttingen, Germany
- Felix Heinrich
  - Breeding Informatics Group, Department of Animal Sciences, Georg-August University, Margarethe von Wrangell-Weg 7, 37075 Göttingen, Germany
- Armin Otto Schmitt
  - Breeding Informatics Group, Department of Animal Sciences, Georg-August University, Margarethe von Wrangell-Weg 7, 37075 Göttingen, Germany
  - Center for Integrated Breeding Research (CiBreed), Georg-August University, Albrecht-Thaer-Weg 3, 37075 Göttingen, Germany
- Mehmet Gültas
  - Center for Integrated Breeding Research (CiBreed), Georg-August University, Albrecht-Thaer-Weg 3, 37075 Göttingen, Germany
  - Faculty of Agriculture, South Westphalia University of Applied Sciences, Lübecker Ring 2, 59494 Soest, Germany
7. Zhang J, An L, Zhou X, Shi R, Wang H. Analysis of tumor mutation burden combined with immune infiltrates in endometrial cancer. Annals of Translational Medicine 2021; 9:551. [PMID: 33987249 PMCID: PMC8105813 DOI: 10.21037/atm-20-6049] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 08/23/2020] [Accepted: 01/03/2021] [Indexed: 01/10/2023]
Abstract
BACKGROUND: Tumor mutational burden (TMB) is widely regarded as a predictor of response to immunotherapy. Few researchers have focused on the activity and prognosis of TMB in endometrial cancer (EC) and immune cells. Our study aimed to identify the prognostic role of TMB in EC. METHODS: We downloaded transcriptome data from The Cancer Genome Atlas (TCGA) database. Kaplan-Meier analysis with the log-rank test was conducted to assess the difference in overall survival (OS) between the high and low TMB groups. The "CIBERSORT" scripts were run to evaluate the immune compositions of EC patients. Cox regression analysis and survival analysis were used to verify the prognostic value of TMB. RESULTS: We obtained the single nucleotide mutation data for 529 EC patients. Missense mutation was the most common mutation type. TMB was associated with survival outcome, tumor grades, and pathological types. We identified 10 hub TMB-related signatures and found that elevated infiltrating density of T-cell subsets in the high TMB group indicated improved survival outcomes. According to Kaplan-Meier analysis, gamma delta T cells and regulatory T cells were prognostic immune cells in EC samples. Moreover, many top gene set enrichment analysis (GSEA) results, including amino sugar and nucleotide sugar metabolism, nucleotide excision repair, and the p53 signaling pathway, were significantly enriched with TMB level as the phenotype. CONCLUSIONS: TMB is an important prognostic factor for EC, and TMB-related genes may be potential therapeutic targets for EC.
Affiliation(s)
- Jun Zhang
  - Department of Obstetrics and Gynecology, Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, China
- Lanfen An
  - Department of Obstetrics and Gynecology, Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, China
- Xing Zhou
  - Department of Obstetrics and Gynecology, Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, China
- Rui Shi
  - Department of Obstetrics and Gynecology, Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, China
- Hongbo Wang
  - Department of Obstetrics and Gynecology, Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, China
8. Hess JE, Zendt JS, Matala AR, Narum SR. Genetic basis of adult migration timing in anadromous steelhead discovered through multivariate association testing. Proc Biol Sci 2017; 283:rspb.2015.3064. [PMID: 27170720 DOI: 10.1098/rspb.2015.3064] [Citation(s) in RCA: 69] [Impact Index Per Article: 9.9] [Received: 12/21/2015] [Accepted: 04/14/2016] [Indexed: 01/21/2023]
Abstract
Migration traits are presumed to be complex and to involve interaction among multiple genes. We used both univariate analyses and a multivariate random forest (RF) machine learning algorithm to conduct association mapping of 15 239 single nucleotide polymorphisms (SNPs) for adult migration-timing phenotype in steelhead (Oncorhynchus mykiss). Our study focused on a model natural population of steelhead that exhibits two distinct migration-timing life histories with high levels of admixture in nature. Neutral divergence was limited between fish exhibiting summer- and winter-run migration owing to high levels of interbreeding, but a univariate mixed linear model found three SNPs from a major effect gene to be significantly associated with migration timing (p < 0.000005) that explained 46% of trait variation. Alignment to the annotated Salmo salar genome provided evidence that all three SNPs localize within a 46 kb region overlapping GREB1-like (an oestrogen target gene) on chromosome Ssa03. Additionally, multivariate analyses with RF identified that these three SNPs plus 15 additional SNPs explained up to 60% of trait variation. These candidate SNPs may provide the ability to predict adult migration timing of steelhead to facilitate conservation management of this species, and this study demonstrates the benefit of multivariate analyses for association studies.
Affiliation(s)
- Jon E Hess
  - Columbia River Inter-Tribal Fish Commission, 3059-F National Fish Hatchery Road, Hagerman, ID 83332, USA
- Joseph S Zendt
  - Yakama Nation Fisheries Program, Yakima/Klickitat Fisheries Project, PO Box 151, Toppenish, WA 98948, USA
- Amanda R Matala
  - Columbia River Inter-Tribal Fish Commission, 3059-F National Fish Hatchery Road, Hagerman, ID 83332, USA
- Shawn R Narum
  - Columbia River Inter-Tribal Fish Commission, 3059-F National Fish Hatchery Road, Hagerman, ID 83332, USA
9. Wang Y, Wang S, Zhou D, Yang S, Xu Y, Yang C, Yang L. CsSNP: A Web-Based Tool for the Detecting of Comparative Segments SNPs. J Comput Biol 2016; 23:597-602. [PMID: 27347883 DOI: 10.1089/cmb.2015.0215] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/13/2022]
Abstract
The SNP (single nucleotide polymorphism) is a popular tool for the study of genetic diversity, evolution, and other areas. It is therefore necessary to develop a convenient, practical, robust, rapid, and open-source SNP detection tool for all researchers. Because SNP detection needs special software and a series of steps including alignment, detection, analysis, and presentation, the study of SNPs is difficult for nonprofessional users. CsSNP (Comparative segments SNP, http://biodb.sdau.edu.cn/cssnp/) is a freely available web tool based on the Blat, Blast, and Perl programs that detects comparative segments SNPs and shows detailed information about them. The results are filtered and presented in a statistics figure and a Gbrowse map. The platform contains the reference genomic sequences and coding sequences of 60 plant species and gives users new opportunities to detect SNPs easily. CsSNP provides a convenient tool for nonprofessional users to find comparative segments SNPs in their own sequences, gives users information on and analysis of the SNPs, and displays these data in a dynamic map. It provides a new method to detect SNPs and may accelerate related studies.
Affiliation(s)
- Yi Wang
  - Key Laboratory of Crop Biology of China, Shandong Agricultural University, Taian, China
- Shuangshuang Wang
  - College of Plant Protection, Shandong Agricultural University, Taian, China
- Dongjie Zhou
  - College of Plant Protection, Shandong Agricultural University, Taian, China
- Shuai Yang
  - College of Plant Protection, Shandong Agricultural University, Taian, China
- Yongchao Xu
  - College of Plant Protection, Shandong Agricultural University, Taian, China
- Chao Yang
  - Key Laboratory of Crop Biology of China, Shandong Agricultural University, Taian, China
- Long Yang
  - College of Plant Protection, Shandong Agricultural University, Taian, China
  - Agricultural Big-Data Research Center, Shandong Agricultural University, Taian, China
10. Do DN, Janss LLG, Jensen J, Kadarmideen HN. SNP annotation-based whole genomic prediction and selection: an application to feed efficiency and its component traits in pigs. J Anim Sci 2016; 93:2056-63. [PMID: 26020301 DOI: 10.2527/jas.2014-8640] [Citation(s) in RCA: 28] [Impact Index Per Article: 3.5] [Indexed: 12/23/2022]
Abstract
The study investigated the genetic architecture and predictive ability, using genomic annotation, of residual feed intake (RFI) and its component traits (daily feed intake [DFI], ADG, and back fat [BF]). A total of 1,272 Duroc pigs had both genotypic and phenotypic records, and the records were split into a training dataset (968 pigs) and a validation dataset (304 pigs) by assigning records from before and after January 1, 2012, respectively. SNPs were annotated into 14 different classes using Ensembl variant effect prediction. Predictive accuracy and prediction bias were calculated using Bayesian Power LASSO, Bayesian A, B, and Cπ, and genomic BLUP (GBLUP) methods. Predictive accuracy ranged from 0.508 to 0.531, 0.506 to 0.532, 0.276 to 0.357, and 0.308 to 0.362 for DFI, RFI, ADG, and BF, respectively. BayesCπ100.1 increased accuracy slightly compared to the GBLUP model and other methods. The contribution per SNP to total genomic variance was similar among annotated classes across different traits. Predictive performance of SNP classes did not significantly differ from that of randomized SNP groups. Genomic prediction has accuracy comparable to observed phenotype, and use of genomic prediction can be cost effective by replacing feed intake measurement. Genomic annotation had less impact on predictive accuracy for the traits considered here, but this may be different for other traits. It is the first study to provide useful insights into the biological classes of SNPs driving whole genomic prediction for complex traits in pigs.
11. Lyon KF, Strong CL, Schooler SG, Young RJ, Roy N, Ozar B, Bachmeier M, Rajasekaran S, Schiller MR. Natural variability of minimotifs in 1092 people indicates that minimotifs are targets of evolution. Nucleic Acids Res 2015; 43:6399-412. [PMID: 26068475 PMCID: PMC4513861 DOI: 10.1093/nar/gkv580] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.7] [Received: 04/10/2014] [Revised: 04/17/2015] [Accepted: 05/21/2015] [Indexed: 01/05/2023]
Abstract
Since the function of a short contiguous peptide minimotif can be introduced or eliminated by a single point mutation, these functional elements may be a source of human variation and a target of selection. We analyzed the variability of ∼300 000 minimotifs in 1092 human genomes from the 1000 Genomes Project. Most minimotifs have been purified by selection, with a 94% invariance, which supports important functional roles for minimotifs. Minimotifs are generally under negative selection, possessing high genomic evolutionary rate profiling (GERP) and sitewise likelihood-ratio (SLR) scores. Some are subject to neutral drift or positive selection, similar to coding regions. Most SNPs in minimotifs were common variants, but with minor allele frequencies generally <10%. This was supported by low substitution rates and few newly derived minimotifs. Several minimotif alleles showed different intercontinental and regional geographic distributions, strongly suggesting a role for minimotifs in adaptive evolution. We also note that 4% of PTM minimotif sites in histone tails were common variants, which has the potential to differentially affect DNA packaging among individuals. In conclusion, minimotifs are a source of functional genetic variation in the human population; thus, they are likely to be an important target of selection and evolution.
Affiliation(s)
- Kenneth F Lyon
  - Nevada Institute of Personalized Medicine and School of Life Sciences, University of Nevada Las Vegas, 4505 Maryland Parkway, Las Vegas, NV 89154-4004, USA
- Christy L Strong
  - Nevada Institute of Personalized Medicine and School of Life Sciences, University of Nevada Las Vegas, 4505 Maryland Parkway, Las Vegas, NV 89154-4004, USA
- Steve G Schooler
  - Nevada Institute of Personalized Medicine and School of Life Sciences, University of Nevada Las Vegas, 4505 Maryland Parkway, Las Vegas, NV 89154-4004, USA
- Richard J Young
  - Nevada Institute of Personalized Medicine and School of Life Sciences, University of Nevada Las Vegas, 4505 Maryland Parkway, Las Vegas, NV 89154-4004, USA
  - Department of Computer Science and Engineering, University of Connecticut, Storrs, CT 06269-2155, USA
- Nervik Roy
  - Nevada Institute of Personalized Medicine and School of Life Sciences, University of Nevada Las Vegas, 4505 Maryland Parkway, Las Vegas, NV 89154-4004, USA
- Brittany Ozar
  - Nevada Institute of Personalized Medicine and School of Life Sciences, University of Nevada Las Vegas, 4505 Maryland Parkway, Las Vegas, NV 89154-4004, USA
- Mark Bachmeier
  - Nevada Institute of Personalized Medicine and School of Life Sciences, University of Nevada Las Vegas, 4505 Maryland Parkway, Las Vegas, NV 89154-4004, USA
- Sanguthevar Rajasekaran
  - Department of Computer Science and Engineering, University of Connecticut, Storrs, CT 06269-2155, USA
- Martin R Schiller
  - Nevada Institute of Personalized Medicine and School of Life Sciences, University of Nevada Las Vegas, 4505 Maryland Parkway, Las Vegas, NV 89154-4004, USA
12. Kadarmideen HN. Genomics to systems biology in animal and veterinary sciences: Progress, lessons and opportunities. Livest Sci 2014. [DOI: 10.1016/j.livsci.2014.04.028] [Citation(s) in RCA: 59] [Impact Index Per Article: 5.9] [Indexed: 11/30/2022]
13. Koufariotis L, Chen YPP, Bolormaa S, Hayes BJ. Regulatory and coding genome regions are enriched for trait associated variants in dairy and beef cattle. BMC Genomics 2014; 15:436. [PMID: 24903263 PMCID: PMC4070550 DOI: 10.1186/1471-2164-15-436] [Citation(s) in RCA: 36] [Impact Index Per Article: 3.6] [Received: 10/17/2013] [Accepted: 05/22/2014] [Indexed: 11/10/2022]
Abstract
BACKGROUND In livestock, as in humans, the number of genetic variants that can be tested for association with complex quantitative traits, or used in genomic predictions, is increasing exponentially as whole genome sequencing becomes more common. The power to identify variants associated with traits, particularly those of small effects, could be increased if certain regions of the genome were known a priori to be enriched for associations. Here, we investigate whether twelve genomic annotation classes were enriched or depleted for significant associations in genome wide association studies for complex traits in beef and dairy cattle. We also describe a variance component approach to determine the proportion of genetic variance captured by each annotation class. RESULTS P-values from large GWAS using 700K SNP in both dairy and beef cattle were available for 11 and 10 traits respectively. We found significant enrichment for trait associated variants (SNP significant in the GWAS) in the missense class along with regions 5 kilobases upstream and downstream of coding genes. We found that the non-coding conserved regions (across mammals) were not enriched for trait associated variants. The results from the enrichment or depletion analysis were not in complete agreement with the results from variance component analysis, where the missense and synonymous classes gave the greatest increase in variance explained, while the upstream and downstream classes showed a more modest increase in the variance explained. CONCLUSION Our results indicate that functional annotations could assist in prioritization of variants to a subset more likely to be associated with complex traits; including missense variants, and upstream and downstream regions. 
The differences in two sets of results (GWAS enrichment depletion versus variance component approaches) might be explained by the fact that the variance component approach has greater power to capture the cumulative effect of mutations of small effect, while the enrichment or depletion approach only captures the variants that are significant in GWAS, which is restricted to a limited number of common variants of moderate effects.
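As a rough illustration of the enrichment-or-depletion idea described in this abstract, the sketch below compares the number of significant SNPs observed in one annotation class with the number expected by chance. The abstract does not state which test the authors used; the one-sided hypergeometric test here is an assumption, and the function name and counts are hypothetical.

```python
from math import comb


def enrichment(n_total, n_sig, n_class, n_class_sig):
    """Fold enrichment and one-sided hypergeometric p-value for one annotation class.

    n_total      -- SNPs tested in the GWAS
    n_sig        -- SNPs significant in the GWAS
    n_class      -- SNPs falling in the annotation class (e.g. missense)
    n_class_sig  -- significant SNPs falling in that class
    All names are hypothetical; the test itself is an assumed choice.
    """
    # Significant SNPs expected in the class if significance ignored annotation.
    expected = n_class * n_sig / n_total
    fold = n_class_sig / expected
    # P(X >= n_class_sig) when n_class SNPs are drawn without replacement.
    upper = min(n_class, n_sig)
    tail = sum(
        comb(n_sig, k) * comb(n_total - n_sig, n_class - k)
        for k in range(n_class_sig, upper + 1)
    )
    return fold, tail / comb(n_total, n_class)


# Made-up counts: 1000 SNPs, 50 significant, 100 in the class, 15 of those significant.
fold, p_value = enrichment(1000, 50, 100, 15)
print(fold, p_value)
```

A fold value above 1 with a small p-value suggests enrichment. As the abstract notes, a count-based test like this only sees variants that pass the GWAS significance threshold, which is one reason it can disagree with a variance component analysis.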
Affiliation(s)
- Lambros Koufariotis
- Faculty of Science, Technology and Engineering, La Trobe University, Melbourne, Victoria 3086, Australia.
|
14
|
Snpdat: easy and rapid annotation of results from de novo snp discovery projects for model and non-model organisms. BMC Bioinformatics 2013; 14:45. [PMID: 23390980 PMCID: PMC3574845 DOI: 10.1186/1471-2105-14-45] [Citation(s) in RCA: 32] [Impact Index Per Article: 2.9] [Received: 09/18/2012] [Accepted: 02/05/2013] [Indexed: 11/14/2022] Open
Abstract
Background Single nucleotide polymorphisms (SNPs) are the most abundant genetic variant found in vertebrates and invertebrates. SNP discovery has become a highly automated, robust and relatively inexpensive process allowing the identification of many thousands of mutations for model and non-model organisms. Annotating large numbers of SNPs can be a difficult and complex process. Many tools available are optimised for use with organisms densely sampled for SNPs, such as humans. There are currently few tools available that are species non-specific or support non-model organism data. Results Here we present SNPdat, a high throughput analysis tool that can provide a comprehensive annotation of both novel and known SNPs for any organism with a draft sequence and annotation. Using a dataset of 4,566 SNPs identified in cattle using high-throughput DNA sequencing we demonstrate the annotations performed and the statistics that can be generated by SNPdat. Conclusions SNPdat provides users with a simple tool for annotation of genomes that are either not supported by other tools or have a small number of annotated SNPs available. SNPdat can also be used to analyse datasets from organisms which are densely sampled for SNPs. As a command line tool it can easily be incorporated into existing SNP discovery pipelines and fills a niche for analyses involving non-model organisms that are not supported by many available SNP annotation tools. SNPdat will be of great interest to scientists involved in SNP discovery and analysis projects, particularly those with limited bioinformatics experience.
|
15
|
Gray KA, Maltecca C, Bagnato A, Dolezal M, Rossoni A, Samore AB, Cassady JP. Estimates of marker effects for measures of milk flow in the Italian brown Swiss dairy cattle population. BMC Vet Res 2012; 8:199. [PMID: 23092401 PMCID: PMC3534398 DOI: 10.1186/1746-6148-8-199] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 02/20/2012] [Accepted: 10/05/2012] [Indexed: 01/25/2023] Open
Abstract
BACKGROUND Milkability is a complex trait that is characterized by milk flow traits including average milk flow rate, maximum milk flow rate and total milking time. Milkability has long been recognized as an economically important trait that can be improved through selection. By improving milkability, management costs of milking decrease through reduced labor and improved efficiency of the automatic milking system, which has been identified as an important factor affecting net profit. The objective of this study was to identify markers associated with electronically measured milk flow traits, in the Italian Brown Swiss population that could potentially improve selection based on genomic predictions. RESULTS Sires (n = 1351) of cows with milk flow information were genotyped for 33,074 single nucleotide polymorphism (SNP) markers distributed across 29 Bos taurus autosomes (BTA). Among the six milk flow traits collected, ascending time, time of plateau, descending time, total milking time, maximum milk flow and average milk flow, there were 6,929 (time of plateau) to 14,585 (maximum milk flow) significant SNP markers identified for each trait across all BTA. Unique regions were found for each of the 6 traits providing evidence that each individual milk flow trait offers distinct genetic information about milk flow. This study was also successful in identifying functional processes and genes associated with SNPs that influences milk flow. CONCLUSIONS In addition to verifying the presence of previously identified milking speed quantitative trait loci (QTL) within the Italian Brown Swiss population, this study revealed a number of genomic regions associated with milk flow traits that have never been reported as milking speed QTL. While several of these regions were not associated with a known gene or QTL, a number of regions were associated with QTL that have been formerly reported as regions associated with somatic cell count, somatic cell score and udder morphometrics. 
This provides further evidence of the complexity of milk flow traits and the underlying relationship it has with other economically important traits for dairy cattle. Improved understanding of the overall milking pattern will aid in identification of cows with lower management costs and improved udder health.
Affiliation(s)
- Kent A Gray
- Animal Breeding and Genetics, Department of Animal Science, North Carolina State University, Raleigh, NC, USA
|
16
|
NovelSNPer: A Fast Tool for the Identification and Characterization of Novel SNPs and InDels. Adv Bioinformatics 2011; 2011:657341. [PMID: 22110502 PMCID: PMC3206323 DOI: 10.1155/2011/657341] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Received: 06/27/2011] [Accepted: 08/11/2011] [Indexed: 02/06/2023] Open
Abstract
Typically, next-generation resequencing projects produce large lists of variants. NovelSNPer is a software tool that permits fast and efficient processing of such output lists. In a first step, NovelSNPer determines if a variant represents a known variant or a previously unknown variant. In a second step, each variant is classified into one of 15 SNP classes or 19 InDel classes. Beside the classes used by Ensembl, we introduce POTENTIAL_START_GAINED and START_LOST as new functional classes and present a classification scheme for InDels. NovelSNPer is based upon the gene structure information stored in Ensembl. It processes two million SNPs in six hours. The tool can be used online or downloaded.
|
17
|
Jiang J, Jiang L, Zhou B, Fu W, Liu JF, Zhang Q. Snat: a SNP annotation tool for bovine by integrating various sources of genomic information. BMC Genet 2011; 12:85. [PMID: 21982513 PMCID: PMC3224132 DOI: 10.1186/1471-2156-12-85] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.5] [Received: 04/15/2011] [Accepted: 10/07/2011] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Most recently, with maturing of bovine genome sequencing and high throughput SNP genotyping technologies, a large number of significant SNPs associated with economic important traits can be identified by genome-wide association studies (GWAS). To further determine true association findings in GWAS, the common strategy is to sift out most promising SNPs for follow-up replication studies. Hence it is crucial to explore the functional significance of the candidate SNPs in order to screen and select the potential functional ones. To systematically prioritize these statistically significant SNPs and facilitate follow-up replication studies, we developed a bovine SNP annotation tool (Snat) based on a web interface. RESULTS With Snat, various sources of genomic information are integrated and retrieved from several leading online databases, including SNP information from dbSNP, gene information from Entrez Gene, protein features from UniProt, linkage information from AnimalQTLdb, conserved elements from UCSC Genome Browser Database and gene functions from Gene Ontology (GO), KEGG PATHWAY and Online Mendelian Inheritance in Animals (OMIA). Snat provides two different applications, including a CGI-based web utility and a command-line version, to access the integrated database, target any single nucleotide loci of interest and perform multi-level functional annotations. For further validation of the practical significance of our study, SNPs involved in two commercial bovine SNP chips, i.e., the Affymetrix Bovine 10K chip array and the Illumina 50K chip array, have been annotated by Snat, and the corresponding outputs can be directly downloaded from Snat website. Furthermore, a real dataset involving 20 identified SNPs associated with milk yield in our recent GWAS was employed to demonstrate the practical significance of Snat. CONCLUSIONS To our best knowledge, Snat is one of first tools focusing on SNP annotation for livestock. 
Snat confers researchers with a convenient and powerful platform to aid functional analyses and accurate evaluation on genes/variants related to SNPs, and facilitates follow-up replication studies in the post-GWAS era.
Affiliation(s)
- Jicai Jiang
- Key Laboratory of Animal Genetics and Breeding of the Ministry of Agriculture, College of Animal Science and Technology, China Agricultural University, Beijing, P.R. China
|
18
|
Yang C, Zhou X, Wan X, Yang Q, Xue H, Yu W. Identifying disease-associated SNP clusters via contiguous outlier detection. Bioinformatics 2011; 27:2578-85. [PMID: 21784794 DOI: 10.1093/bioinformatics/btr424] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Indexed: 12/17/2022]
Abstract
MOTIVATION Although genome-wide association studies (GWAS) have identified many disease-susceptibility single-nucleotide polymorphisms (SNPs), these findings can only explain a small portion of genetic contributions to complex diseases, which is known as the missing heritability. A possible explanation is that genetic variants with small effects have not been detected. The chance is < 8% that a causal SNP will be directly genotyped. The effects of its neighboring SNPs may be too weak to be detected due to the effect decay caused by imperfect linkage disequilibrium. Moreover, it is still challenging to detect a causal SNP with a small effect even if it has been directly genotyped. RESULTS In order to increase the statistical power when detecting disease-associated SNPs with relatively small effects, we propose a method using neighborhood information. Since the disease-associated SNPs account for only a small fraction of the entire SNP set, we formulate this problem as Contiguous Outlier DEtection (CODE), which is a discrete optimization problem. In our formulation, we cast the disease-associated SNPs as outliers and further impose a spatial continuity constraint for outlier detection. We show that this optimization can be solved exactly using graph cuts. We also employ the stability selection strategy to control the false positive results caused by imperfect parameter tuning. We demonstrate its advantage in simulations and real experiments. In particular, the newly identified SNP clusters are replicable in two independent datasets. AVAILABILITY The software is available at: http://bioinformatics.ust.hk/CODE.zip. CONTACT eeyu@ust.hk SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Can Yang
- Laboratory for Bioinformatics and Computational Biology, Department of Electronic and Computer Engineering, The Hong Kong University of Science and Technology, Clear Water Bay, Kowloon, Hong Kong, China
|
19
|
Lecerf F, Bretaudeau A, Sallou O, Desert C, Blum Y, Lagarrigue S, Demeure O. AnnotQTL: a new tool to gather functional and comparative information on a genomic region. Nucleic Acids Res 2011; 39:W328-33. [PMID: 21596783 PMCID: PMC3125768 DOI: 10.1093/nar/gkr361] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.6] [Indexed: 11/15/2022] Open
Abstract
AnnotQTL is a web tool designed to aggregate functional annotations from different prominent web sites by minimizing the redundancy of information. Although thousands of QTL regions have been identified in livestock species, most of them are large and contain many genes. This tool was therefore designed to assist the characterization of genes in a QTL interval region as a step towards selecting the best candidate genes. It localizes the gene to a specific region (using NCBI and Ensembl data) and adds the functional annotations available from other databases (Gene Ontology, Mammalian Phenotype, HGNC and Pubmed). Both human genome and mouse genome can be aligned with the studied region to detect synteny and segment conservation, which is useful for running inter-species comparisons of QTL locations. Finally, custom marker lists can be included in the results display to select the genes that are closest to your most significant markers. We use examples to demonstrate that in just a couple of hours, AnnotQTL is able to identify all the genes located in regions identified by a full genome scan, with some highlighted based on both location and function, thus considerably increasing the chances of finding good candidate genes. AnnotQTL is available at http://annotqtl.genouest.org.
Affiliation(s)
- F Lecerf
- INRA, UMR598 Génétique Animale, F-35000 Rennes, France.
|
20
|
Palejev D, Hwang W, Landi N, Eastman M, Frost SJ, Fulbright RK, Kidd JR, Kidd KK, Mason GF, Mencl WE, Yrigollen C, Pugh KR, Grigorenko EL. An application of the elastic net for an endophenotype analysis. Behav Genet 2011; 41:120-4. [PMID: 21229297 DOI: 10.1007/s10519-011-9443-8] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 12/18/2010] [Accepted: 12/23/2010] [Indexed: 11/27/2022]
Abstract
We provide an illustration of an application of the elastic net to a large number of common genetic variants in the context of the search for the genetic bases of an endophenotype conceivably related to individual differences in learning. GABA concentration in the occipital cortex, a critical area for reading, was obtained in a group (n = 76) of children aged 6-10 years. Two extreme groups, high and low, were selected for genotyping with the 650Y Illumina array chip (Ilmn650Y). An elastic net approach was applied to the resulting SNP dataset; 100 SNPs were identified for each chromosome as "interesting" based on having the highest absolute value coefficients. The analyses highlighted chromosomes 15 and 20, which contained 55 candidate genes. The STRING partner analyses of the associated proteins pointed to a number of related genes, most notably, GABA and NTRK receptors.
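The selection step described in this abstract, keeping the SNPs with the largest absolute coefficients on each chromosome, can be sketched as follows. This is only an illustration of the ranking step and assumes that the elastic net coefficients have already been fitted elsewhere. The function name, the tuple layout and the example values are hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def top_snps_per_chromosome(coefs, k=100):
    """Return, per chromosome, the k SNP ids with the largest |coefficient|.

    coefs -- iterable of (snp_id, chromosome, coefficient) tuples; the
             coefficients are assumed to come from a fitted elastic net.
    """
    by_chrom = defaultdict(list)
    for snp_id, chrom, beta in coefs:
        by_chrom[chrom].append((snp_id, beta))
    return {
        chrom: [
            snp_id
            for snp_id, _ in sorted(snps, key=lambda t: abs(t[1]), reverse=True)[:k]
        ]
        for chrom, snps in by_chrom.items()
    }


# Made-up coefficients for two chromosomes.
example = [
    ("rs1", "15", 0.2),
    ("rs2", "15", -0.9),
    ("rs3", "15", 0.05),
    ("rs4", "20", -0.1),
    ("rs5", "20", 0.3),
]
print(top_snps_per_chromosome(example, k=2))
```

Ranking by absolute value treats negative and positive coefficients alike, so a SNP associated with low GABA concentration is ranked the same way as one associated with high concentration.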
|