1
|
Taş G, Westerdijk T, Postma E, Veldink JH, Schönhuth A, Balvert M. Computing linkage disequilibrium aware genome embeddings using autoencoders. Bioinformatics 2024; 40:btae326. [PMID: 38775680 PMCID: PMC11208726 DOI: 10.1093/bioinformatics/btae326] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/24/2023] [Revised: 04/23/2024] [Accepted: 05/17/2024] [Indexed: 06/28/2024] Open
Abstract
MOTIVATION: The completion of the genome has paved the way for genome-wide association studies (GWAS), which explained certain proportions of heritability. GWAS are not optimally suited to detect non-linear effects in disease risk, possibly hidden in non-additive interactions (epistasis). Alternative methods for epistasis detection using, e.g., deep neural networks (DNNs) are currently under active development. However, DNNs are constrained by finite computational resources, which can be rapidly depleted as complexity grows with the sheer size of the genome. Moreover, the curse of dimensionality complicates the task of capturing meaningful genetic patterns for DNNs and therefore necessitates dimensionality reduction. RESULTS: We propose a method to compress single nucleotide polymorphism (SNP) data, while leveraging the linkage disequilibrium (LD) structure and preserving potential epistasis. This method involves clustering correlated SNPs into haplotype blocks and training per-block autoencoders to learn a compressed representation of the block's genetic content. We provide an adjustable autoencoder design to accommodate diverse blocks and bypass extensive hyperparameter tuning. We applied this method to genotyping data from Project MinE, and achieved 99% average test reconstruction accuracy (i.e. minimal information loss) while compressing the input to nearly 10% of the original size. We demonstrate that haplotype-block based autoencoders outperform linear Principal Component Analysis (PCA) by approximately 3% chromosome-wide accuracy of reconstructed variants. To the best of our knowledge, our approach is the first to simultaneously leverage haplotype structure and DNNs for dimensionality reduction of genetic data. AVAILABILITY AND IMPLEMENTATION: Data are available for academic use through Project MinE at https://www.projectmine.com/research/data-sharing/, contingent upon terms and requirements specified by the source studies. Code is available at https://github.com/gizem-tas/haploblock-autoencoders.
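The two quantities this abstract relies on, grouping correlated SNPs into blocks and scoring reconstruction accuracy, can be illustrated with a small sketch. This is not the authors' implementation: the greedy adjacent-SNP rule, the `r2_min` threshold, and the rounding-based accuracy definition are all assumptions made here for illustration, and the autoencoder itself is not shown.

```python
def pearson_r2(x, y):
    """Squared Pearson correlation between two genotype vectors (0/1/2 dosages)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:  # monomorphic SNP: correlation undefined, treat as 0
        return 0.0
    return sxy * sxy / (sxx * syy)


def greedy_blocks(snps, r2_min=0.8):
    """Split an ordered list of SNP vectors into contiguous blocks of indices.

    Hypothetical rule: a new block starts whenever two adjacent SNPs
    have r^2 below r2_min.
    """
    if not snps:
        return []
    blocks = [[0]]
    for j in range(1, len(snps)):
        if pearson_r2(snps[j - 1], snps[j]) >= r2_min:
            blocks[-1].append(j)
        else:
            blocks.append([j])
    return blocks


def reconstruction_accuracy(original, reconstructed):
    """Fraction of genotypes recovered after rounding dosages to 0, 1 or 2."""
    total = hits = 0
    for row_o, row_r in zip(original, reconstructed):
        for o, r in zip(row_o, row_r):
            call = min(2, max(0, round(r)))  # clamp to a valid genotype call
            hits += (call == o)
            total += 1
    return hits / total


# Toy data: four SNPs genotyped in six individuals (made-up values).
snps = [
    [0, 1, 2, 0, 1, 2],
    [0, 1, 2, 0, 1, 2],
    [2, 0, 1, 1, 0, 2],
    [2, 0, 1, 1, 0, 2],
]
print(greedy_blocks(snps))  # [[0, 1], [2, 3]]
```

In a per-block compression scheme, each index list returned by `greedy_blocks` would be handed to its own encoder, and `reconstruction_accuracy` would compare the decoded dosages with the held-out genotypes.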
Affiliation(s)
- Gizem Taş
- Department of Econometrics and Operations Research, Tilburg University, Tilburg 5037AB, The Netherlands
| | - Timo Westerdijk
- Department of Neurology, University Medical Center Utrecht, Utrecht 3584CX, The Netherlands
| | - Eric Postma
- Department of Cognitive Science and Artificial Intelligence, Tilburg University, Tilburg 5037AB, The Netherlands
| | - Jan H Veldink
- Department of Neurology, University Medical Center Utrecht, Utrecht 3584CX, The Netherlands
| | | | - Marleen Balvert
- Department of Econometrics and Operations Research, Tilburg University, Tilburg 5037AB, The Netherlands
| |
|
2
|
Goraya SA, Ding S, Miller RC, Arif MK, Kong H, Masud A. Modeling of spatiotemporal dynamics of ligand-coated particle flow in targeted drug delivery processes. Proc Natl Acad Sci U S A 2024; 121:e2314533121. [PMID: 38776373 PMCID: PMC11145262 DOI: 10.1073/pnas.2314533121] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/22/2023] [Accepted: 04/18/2024] [Indexed: 05/25/2024] Open
Abstract
Nanoparticles tethered with vasculature-binding epitopes have been used to deliver drugs into injured or diseased tissues via the bloodstream. However, the extent to which blood flow dynamics affect nanoparticle retention at the target site after adhesion needs to be better understood. This knowledge gap potentially underlies significantly different therapeutic efficacies between animal models and humans. An experimentally validated mathematical model that accurately simulates the effects of blood flow on nanoparticle adhesion and retention, thus circumventing the limitations of conventional trial-and-error-based drug design in animal models, is lacking. This paper addresses this technical bottleneck and presents an integrated mathematical method that combines a mechanics-based dispersion model for nanoparticle transport and diffusion in the boundary layers, an asperity model to account for surface roughness of the endothelium, and an experimentally calibrated stochastic nanoparticle-cell adhesion model to describe nanoparticle adhesion and subsequent retention at the target site under external flow. PLGA-b-HA nanoparticles tethered with VHSPNKK peptides that specifically bind to vascular cell adhesion molecules on the inflamed vascular wall were investigated. The computational model revealed that larger particles perform better in adhesion and retention at the endothelium for the particle sizes suitable for drug delivery applications and within physiologically relevant shear rates. The computational model corresponded closely to the in vitro experiments, which demonstrates the impact that model-based simulations can have on optimizing nanocarriers in vascular microenvironments, thereby substantially reducing in vivo experimentation as well as development costs.
Affiliation(s)
- Shoaib A. Goraya
- Department of Civil and Environmental Engineering, University of Illinois Urbana-Champaign, Urbana, IL61801
| | - Shengzhe Ding
- Department of Chemical and Biomolecular Engineering, University of Illinois Urbana-Champaign, Urbana, IL61801
| | - Ryan C. Miller
- Department of Chemical and Biomolecular Engineering, University of Illinois Urbana-Champaign, Urbana, IL61801
| | - Mariam K. Arif
- Feinberg School of Medicine, Northwestern University, Chicago, IL60611
| | - Hyunjoon Kong
- Department of Chemical and Biomolecular Engineering, University of Illinois Urbana-Champaign, Urbana, IL61801
- Department of Biomedical and Translational Sciences, Carle Illinois College of Medicine, University of Illinois Urbana-Champaign, Urbana, IL61801
| | - Arif Masud
- Department of Civil and Environmental Engineering, University of Illinois Urbana-Champaign, Urbana, IL61801
- Department of Biomedical and Translational Sciences, Carle Illinois College of Medicine, University of Illinois Urbana-Champaign, Urbana, IL61801
| |
|
3
|
Shipilina D, Pal A, Stankowski S, Chan YF, Barton NH. On the origin and structure of haplotype blocks. Mol Ecol 2023; 32:1441-1457. [PMID: 36433653 PMCID: PMC10946714 DOI: 10.1111/mec.16793] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Received: 01/13/2022] [Revised: 11/16/2022] [Accepted: 11/18/2022] [Indexed: 11/27/2022]
Abstract
The term "haplotype block" is commonly used in the developing field of haplotype-based inference methods. We argue that the term should be defined based on the structure of the Ancestral Recombination Graph (ARG), which contains complete information on the ancestry of a sample. We use simulated examples to demonstrate key features of the relationship between haplotype blocks and ancestral structure, emphasizing the stochasticity of the processes that generate them. Even the simplest cases of neutrality or of a "hard" selective sweep produce a rich structure, often missed by commonly used statistics. We highlight a number of novel methods for inferring haplotype structure, based on the full ARG, or on a sequence of trees, and illustrate how they can be used to define haplotype blocks using an empirical data set. While the advent of new, computationally efficient methods makes it possible to apply these concepts broadly, they (and additional new methods) could benefit from adding features to explore haplotype blocks, as we define them. Understanding and applying the concept of the haplotype block will be essential to fully exploit long and linked-read sequencing technologies.
Affiliation(s)
- Daria Shipilina
- Evolutionary Biology Program, Department of Ecology and Genetics (IEG), Uppsala University, Uppsala, Sweden
- Institute of Science and Technology Austria, Klosterneuburg, Austria
- Swedish Collegium for Advanced Study, Uppsala, Sweden
| | - Arka Pal
- Institute of Science and Technology Austria, Klosterneuburg, Austria
| | - Sean Stankowski
- Institute of Science and Technology Austria, Klosterneuburg, Austria
| | | | - Nicholas H Barton
- Institute of Science and Technology Austria, Klosterneuburg, Austria
| |
|
4
|
Li X, Shi Z, Gao J, Wang X, Guo K. CandiHap: a haplotype analysis toolkit for natural variation study. Molecular Breeding: New Strategies in Plant Improvement 2023; 43:21. [PMID: 37313297 PMCID: PMC10248607 DOI: 10.1007/s11032-023-01366-4] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 06/09/2022] [Accepted: 02/22/2023] [Indexed: 06/15/2023]
Abstract
Haplotype blocks greatly assist association-based mapping of causal candidate genes by significantly reducing genotyping effort. The gene haplotype could be used to evaluate variants of affected traits captured from the gene region. While there is a rising interest in gene haplotypes, much of the corresponding analysis was carried out manually. CandiHap allows rapid and robust haplotype analysis and preselection of candidate causal single-nucleotide polymorphisms and InDels from Sanger or next-generation sequencing data. Investigators can use CandiHap to specify a gene or linkage sites based on genome-wide association studies and explore favorable haplotypes of candidate genes for target traits. CandiHap can be run on computers with Windows, Mac, or UNIX platforms in a graphical user interface or command line, and applied to any species, such as plant, animal, and microbial. The CandiHap software, user manual, and example datasets are freely available at BioCode (https://ngdc.cncb.ac.cn/biocode/tools/BT007080) or GitHub (https://github.com/xukaili/CandiHap). Supplementary information: The online version contains supplementary material available at 10.1007/s11032-023-01366-4.
Affiliation(s)
- Xukai Li
- Hou Ji Laboratory in Shanxi Province, Shanxi Agricultural University, Taigu, 030031 China
- College of Life Sciences, Shanxi Agricultural University, Taigu, 030801 China
| | - Zhiyong Shi
- College of Life Sciences, Shanxi Agricultural University, Taigu, 030801 China
| | - Jianhua Gao
- Hou Ji Laboratory in Shanxi Province, Shanxi Agricultural University, Taigu, 030031 China
- College of Life Sciences, Shanxi Agricultural University, Taigu, 030801 China
| | - Xingchun Wang
- Hou Ji Laboratory in Shanxi Province, Shanxi Agricultural University, Taigu, 030031 China
- College of Life Sciences, Shanxi Agricultural University, Taigu, 030801 China
| | - Kai Guo
- Department of Neurology, University of Michigan, Ann Arbor, MI 48109 USA
| |
|
5
|
Pevzner P, Vingron M, Reidys C, Sun F, Istrail S. Michael Waterman's Contributions to Computational Biology and Bioinformatics. J Comput Biol 2022; 29:601-615. [PMID: 35727100 DOI: 10.1089/cmb.2022.29066.pp] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/13/2022] Open
Abstract
On the occasion of Dr. Michael Waterman's 80th birthday, we review his major contributions to the field of computational biology and bioinformatics including the famous Smith-Waterman algorithm for sequence alignment, the probability and statistics theory related to sequence alignment, algorithms for sequence assembly, the Lander-Waterman model for genome physical mapping, combinatorics and predictions of ribonucleic acid structures, word counting statistics in molecular sequences, alignment-free sequence comparison, and algorithms for haplotype block partition and tagSNP selection related to the International HapMap Project. His books Introduction to Computational Biology: Maps, Sequences and Genomes for graduate students and Computational Genome Analysis: An Introduction geared toward undergraduate students played key roles in computational biology and bioinformatics education. We also highlight his efforts of building the computational biology and bioinformatics community as the founding editor of the Journal of Computational Biology and a founding member of the International Conference on Research in Computational Molecular Biology (RECOMB).
Affiliation(s)
- Pavel Pevzner
- Department of Computer Science and Engineering, University of California San Diego, San Diego, California, USA
| | - Martin Vingron
- Department of Computational Molecular Biology, Max Planck Institute for Molecular Genetics, Berlin, Germany
| | - Christian Reidys
- Department of Mathematics, Biocomplexity Institute & Initiative, University of Virginia, Charlottesville, Virginia, USA
| | - Fengzhu Sun
- Department of Quantitative and Computational Biology, University of Southern California, Los Angeles, California, USA
| | - Sorin Istrail
- Department of Computer Science, Center for Computational Molecular Biology, Brown University, Providence, Rhode Island, USA
| |
|
6
|
Toda-Oti KS, Stefano JT, Cavaleiro AM, Carrilho FJ, Correa-Gianella ML, Oliveira CPMDSD. Association of UCP3 Polymorphisms with Nonalcoholic Steatohepatitis and Metabolic Syndrome in Nonalcoholic Fatty Liver Disease Brazilian Patients. Metab Syndr Relat Disord 2022; 20:114-123. [PMID: 35020496 DOI: 10.1089/met.2020.0104] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/09/2022] Open
Abstract
Background: We investigated the possible association of uncoupling protein 3 gene (UCP3) single nucleotide polymorphisms (SNPs) with nonalcoholic steatohepatitis (NASH) and metabolic syndrome (MetS) in Brazilian patients with nonalcoholic fatty liver disease (NAFLD). Methods: UCP3 SNPs rs1726745, rs3781907, and rs11235972 were genotyped in 158 Brazilian patients with biopsy-proven NAFLD. Statistical analyses were performed with the JMP, R, and SHEsis software packages. Results: The TT genotype of rs1726745 was associated with a lower occurrence of MetS (P = 0.006) and with lower body mass index (BMI) in the entire NAFLD sample (P = 0.01) and in the NASH group (P = 0.02). The rs1726745-T allele was associated with lower values of AST (P = 0.001), ALT (P = 0.0002), triglycerides (P = 0.01), and total cholesterol (P = 0.02) in the entire NAFLD sample. Between groups, there were lower values of aminotransferases strictly in individuals with NASH (AST, P = 0.002; ALT, P = 0.0007) and with MetS (AST, P = 0.002; ALT, P = 0.001). The rs3781907-G allele was associated with lower GGT elevation values in the entire NAFLD sample (P = 0.002), in the NASH group (P = 0.004), and in the MetS group (P = 0.003), and with protection against advanced fibrosis (P = 0.01). The rs11235972-A allele was associated with lower GGT values in the entire NAFLD sample (P = 0.006), in the NASH group (P = 0.01), and in the MetS group (P = 0.005), with absence of fibrosis (P = 0.01), and with protection against advanced fibrosis (P = 0.01). The TAA haplotype was protective for NASH (P = 0.002), and the TGG haplotype was protective for MetS (P = 0.01). Conclusion: UCP3 gene variants were associated with protection against NASH and MetS, in addition to lower values of liver enzymes, lipid profile, and BMI, and lesser fibrosis severity in the studied population.
Affiliation(s)
- Karla Sawada Toda-Oti
- Departamento de Gastroenterologia, Faculdade de Medicina da, Universidade de São Paulo, São Paulo, Brazil
| | - José Tadeu Stefano
- Laboratório de Gastroenterologia Clínica e Experimental (LIM-07), Departamento de Gastroenterologia e Hepatologia, Faculdade de Medicina, Hospital das Clínicas HC-FMUSP, Universidade de São Paulo, São Paulo, Brazil
| | - Ana Mercedes Cavaleiro
- Laboratório de Carboidratos e Radioimunensaio (LIM-18), Hospital das Clínicas HC-FMUSP, Faculdade de Medicina, Universidade de São Paulo, São Paulo, Brazil
| | - Flair José Carrilho
- Departamento de Gastroenterologia, Faculdade de Medicina da, Universidade de São Paulo, São Paulo, Brazil.,Laboratório de Gastroenterologia Clínica e Experimental (LIM-07), Departamento de Gastroenterologia e Hepatologia, Faculdade de Medicina, Hospital das Clínicas HC-FMUSP, Universidade de São Paulo, São Paulo, Brazil
| | - Maria Lúcia Correa-Gianella
- Laboratório de Carboidratos e Radioimunensaio (LIM-18), Hospital das Clínicas HC-FMUSP, Faculdade de Medicina, Universidade de São Paulo, São Paulo, Brazil.,Programa de Pós-graduação em Medicina, Universidade Nove de Julho (UNINOVE), São Paulo, Brazil
| | - Cláudia Pinto Marques de Souza de Oliveira
- Departamento de Gastroenterologia, Faculdade de Medicina da, Universidade de São Paulo, São Paulo, Brazil.,Laboratório de Gastroenterologia Clínica e Experimental (LIM-07), Departamento de Gastroenterologia e Hepatologia, Faculdade de Medicina, Hospital das Clínicas HC-FMUSP, Universidade de São Paulo, São Paulo, Brazil
| |
|
7
|
Nyine M, Adhikari E, Clinesmith M, Aiken R, Betzen B, Wang W, Davidson D, Yu Z, Guo Y, He F, Akhunova A, Jordan KW, Fritz AK, Akhunov E. The Haplotype-Based Analysis of Aegilops tauschii Introgression Into Hard Red Winter Wheat and Its Impact on Productivity Traits. Frontiers in Plant Science 2021; 12:716955. [PMID: 34484280 PMCID: PMC8416154 DOI: 10.3389/fpls.2021.716955] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/29/2021] [Accepted: 07/20/2021] [Indexed: 05/13/2023]
Abstract
Introgression from wild relatives has great potential to broaden the availability of beneficial allelic diversity for crop improvement in breeding programs. Here, we assessed the impact of the introgression from 21 diverse accessions of Aegilops tauschii, the diploid ancestor of the wheat D genome, into 6 hard red winter wheat cultivars on yield and yield component traits. We used 5.2 million imputed D genome SNPs identified by the whole-genome sequencing of parental lines and the sequence-based genotyping of the introgression population, including 351 BC1F3:5 lines. Phenotyping data collected from the irrigated and non-irrigated field trials revealed that up to 23% of the introgression lines (ILs) produce more grain than the parents and check cultivars. Based on 16 yield stability statistics, the yield of 12 ILs (3.4%) was stable across treatments, years, and locations; 5 of these lines were also high-yielding lines, producing 9.8% more grain than the average yield of check cultivars. The most significant SNP- and haplotype-trait associations were identified on chromosome arms 2DS and 6DL for the spikelet number per spike (SNS), on chromosome arms 2DS, 3DS, 5DS, and 7DS for grain length (GL) and on chromosome arms 1DL, 2DS, 6DL, and 7DS for grain width (GW). The introgression of haplotypes from A. tauschii parents was associated with an increase in SNS, which was positively correlated with heading date (HD), whereas the haplotypes from hexaploid wheat parents were associated with an increase in GW. We show that the haplotypes on 2DS associated with an increase in the spikelet number and HD are linked with multiple introgressed alleles of Ppd-D1 identified by the whole-genome sequencing of A. tauschii parents. While some introgressed haplotypes exhibited significant pleiotropic effects, with the direction of effects on the yield component traits being largely consistent with the previously reported trade-offs, there were haplotype combinations associated with positive trends in yield. The characterized repertoire of the introgressed haplotypes derived from A. tauschii accessions with the combined positive effects on yield and yield component traits in elite germplasm provides a valuable source of alleles for improving the productivity of winter wheat by optimizing the contribution of component traits to yield.
Affiliation(s)
- Moses Nyine
- Department of Plant Pathology, Kansas State University, Manhattan, KS, United States
| | - Elina Adhikari
- Department of Plant Pathology, Kansas State University, Manhattan, KS, United States
| | - Marshall Clinesmith
- Department of Agronomy, Kansas State University, Manhattan, KS, United States
| | - Robert Aiken
- Department of Agronomy, Kansas State University, Manhattan, KS, United States
| | - Bliss Betzen
- Department of Plant Pathology, Kansas State University, Manhattan, KS, United States
| | - Wei Wang
- Department of Plant Pathology, Kansas State University, Manhattan, KS, United States
| | - Dwight Davidson
- Department of Plant Pathology, Kansas State University, Manhattan, KS, United States
| | - Zitong Yu
- Department of Plant Pathology, Kansas State University, Manhattan, KS, United States
| | - Yuanwen Guo
- Department of Plant Pathology, Kansas State University, Manhattan, KS, United States
| | - Fei He
- Department of Plant Pathology, Kansas State University, Manhattan, KS, United States
| | - Alina Akhunova
- Integrated Genomics Facility, Kansas State University, Manhattan, KS, United States
| | - Katherine W. Jordan
- Department of Plant Pathology, Kansas State University, Manhattan, KS, United States
- United States Department of Agriculture, Agricultural Research Service Hard Winter Wheat Genetics Research Unit, Manhattan, KS, United States
| | - Allan K. Fritz
- Department of Agronomy, Kansas State University, Manhattan, KS, United States
| | - Eduard Akhunov
- Department of Plant Pathology, Kansas State University, Manhattan, KS, United States
- *Correspondence: Eduard Akhunov
| |
|
8
|
Wu Y, Pan X, Jin X. Haplotype-based association study between PRCP gene polymorphisms and essential hypertension in Hani minority group from a remote region of China. J Renin Angiotensin Aldosterone Syst 2020; 21:1470320320981316. [PMID: 33319614 PMCID: PMC7745576 DOI: 10.1177/1470320320981316] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/23/2022] Open
Abstract
Objective: Prolylcarboxypeptidase (PRCP) is involved in both the Kallikrein-Kinin system (KKS) and the renin-angiotensin-aldosterone system (RAAS). This study aimed to determine the genetic impact of PRCP gene polymorphisms on essential hypertension (EH) in an isolated population from a remote region of China. Methods: A haplotype-based study was conducted in 346 EH patients and 346 normal subjects, all of whom were Hani minority residents of Southwest China. A total of 11 tag single nucleotide polymorphisms (SNPs) in the PRCP gene were tested by the polymerase chain reaction-restriction fragment length polymorphism method. Results: Single-site analysis found that the PRCP gene 3′UTR SNP rs3750931 was associated with EH. The minor allele G of rs3750931 was more prevalent in the EH patients than in control subjects after Bonferroni correction (p < 0.05). Moreover, the rs3750931 G allele carriers showed a higher average blood pressure (BP) level among the subjects. The H2 (GAGCACTAACA) haplotype without the rs3750931 G allele showed a protective effect for EH (OR = 0.68, 95% CI 0.54–0.85, p = 0.001). Conclusion: The present study indicated that PRCP gene rs3750931 was associated with the risk of EH. The G allele of this SNP could be considered one of the risk markers for EH in the Hani population.
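An odds ratio with a 95% confidence interval, as reported for the H2 haplotype, is commonly derived from a 2x2 table of carriers and non-carriers in cases and controls. The sketch below uses the standard Woolf (log-scale) interval; the abstract does not state which method the study used, and the counts in the example are hypothetical, not the study's data.

```python
from math import exp, log, sqrt


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio and Woolf (log-scale) confidence interval for a 2x2 table.

    a, b: carriers / non-carriers among cases
    c, d: carriers / non-carriers among controls
    z:    normal quantile (1.96 gives an approximate 95% interval)
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive")
    odds_ratio = (a * d) / (b * c)
    se_log_or = sqrt(1 / a + 1 / b + 1 / c + 1 / d)  # standard error of ln(OR)
    lower = exp(log(odds_ratio) - z * se_log_or)
    upper = exp(log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts, for illustration only.
or_, lo, hi = odds_ratio_ci(20, 80, 40, 60)
print(f"OR = {or_:.3f}, 95% CI {lo:.3f} to {hi:.3f}")
```

An interval that lies entirely below 1, as in the reported result, is what supports calling a haplotype protective.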
Affiliation(s)
- Yanrui Wu
- School of Basic Medical Sciences, Kunming Medical University, Kunming, Yunnan Province, P. R. China
| | - Xingming Pan
- Human Resources Department of Kunming Medical University, Kunming, Yunnan Province, P. R. China
| | - Xiaoxiao Jin
- School of Basic Medical Sciences, Kunming Medical University, Kunming, Yunnan Province, P. R. China
| |
|
9
|
Xiong Q, Jiao Y, Yang P, Liao Y, Gu X, Hu F, Chen B. The association study between CYP24A1 gene polymorphisms and risk of liver, lung and gastric cancer in a Chinese population. Pathol Res Pract 2020; 216:153237. [PMID: 33065483 DOI: 10.1016/j.prp.2020.153237] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 07/26/2020] [Revised: 09/24/2020] [Accepted: 09/25/2020] [Indexed: 12/29/2022]
Abstract
Recently, four single nucleotide polymorphisms (rs2585428, rs4809960, rs6022999 and rs6068816) in the CYP24A1 gene were extensively studied for their associations with cancer risk. However, these studies included only a few types of cancer, which calls for further investigation. We therefore conducted a case-control study to explore the associations between these four CYP24A1 gene polymorphisms and the risk of liver, lung and gastric cancer in a Chinese population. A total of 480 liver cancer patients, 550 lung cancer patients, 460 gastric cancer patients and 800 normal controls were recruited in this study. Genotyping of the CYP24A1 gene polymorphisms was performed by Sanger sequencing. Single-locus analysis demonstrated that rs6022999 was significantly associated with the risk of liver and lung cancer, while rs6068816 was significantly associated with the risk of gastric cancer. Haplotype analysis revealed that haplotype GTAT was associated with an increased risk of liver cancer and a decreased risk of lung cancer, and haplotype ATGC was associated with a decreased risk of lung cancer. A further meta-analysis of rs6068816 and lung cancer risk showed that rs6068816 was not associated with lung cancer risk in the Chinese population, which confirmed our present finding. In conclusion, rs6022999 may be a genetic biomarker for liver and lung cancer susceptibility in the Chinese population, and rs6068816 may be used to predict gastric cancer risk in the Chinese population.
Affiliation(s)
- Qiantao Xiong
- Department of Laboratory, Maternal and Child Health Hospital of Hubei Province, Wuhan, China
| | - Yuwei Jiao
- Department of Biological Science and Technology, School of Chemistry, Chemical Engineering and Life Sciences, Wuhan University of Technology, Wuhan, China
| | - Puyu Yang
- Department of Biological Science and Technology, School of Chemistry, Chemical Engineering and Life Sciences, Wuhan University of Technology, Wuhan, China
| | - Yuxiao Liao
- Department of Biological Science and Technology, School of Chemistry, Chemical Engineering and Life Sciences, Wuhan University of Technology, Wuhan, China
| | - Xiuli Gu
- Center of Reproductive Medicine, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, China; Department of Reproductive Genetics, Wuhan Tongji Reproductive Medicine Hospital, Wuhan, China
| | - Fuyan Hu
- Department of Statistics, Faculty of Science, Wuhan University of Technology, Wuhan, Hubei, China.
| | - Bifeng Chen
- Department of Biological Science and Technology, School of Chemistry, Chemical Engineering and Life Sciences, Wuhan University of Technology, Wuhan, China.
| |
|
10
|
Ramzan F, Gültas M, Bertram H, Cavero D, Schmitt AO. Combining Random Forests and a Signal Detection Method Leads to the Robust Detection of Genotype-Phenotype Associations. Genes (Basel) 2020; 11:E892. [PMID: 32764260 PMCID: PMC7465705 DOI: 10.3390/genes11080892] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Received: 07/09/2020] [Revised: 07/28/2020] [Accepted: 08/03/2020] [Indexed: 12/21/2022] Open
Abstract
Genome-wide association studies (GWAS) are a well-established methodology to identify genomic variants and genes that are responsible for traits of interest in all branches of the life sciences. Despite the long time this methodology has had to mature, the reliable detection of genotype-phenotype associations is still a challenge for many quantitative traits, mainly because of the large number of genomic loci with weak individual effects on the trait under investigation. Thus, it can be hypothesized that many genomic variants that have a small but real effect remain unnoticed in many GWAS approaches. Here, we propose a two-step procedure to address this problem. In a first step, cubic splines are fitted to the test statistic values, and genomic regions with spline peaks that are higher than expected by chance are considered as quantitative trait loci (QTL). Then the SNPs in these QTLs are prioritized with respect to the strength of their association with the phenotype using a Random Forests approach. As a case study, we apply our procedure to real data sets and find trustworthy numbers of partially novel genomic variants and genes involved in various egg quality traits.
Affiliation(s)
- Faisal Ramzan
- Breeding Informatics Group, Department of Animal Sciences, Georg-August University, Margarethe von Wrangell-Weg 7, 37075 Göttingen, Germany; (F.R.); (M.G.); (H.B.)
- Department of Animal Breeding and Genetics, University of Agriculture Faisalabad, 38000 Faisalabad, Pakistan
| | - Mehmet Gültas
- Breeding Informatics Group, Department of Animal Sciences, Georg-August University, Margarethe von Wrangell-Weg 7, 37075 Göttingen, Germany; (F.R.); (M.G.); (H.B.)
- Center for Integrated Breeding Research (CiBreed), Albrecht-Thaer-Weg 3, Georg-August University, 37075 Göttingen, Germany
| | - Hendrik Bertram
- Breeding Informatics Group, Department of Animal Sciences, Georg-August University, Margarethe von Wrangell-Weg 7, 37075 Göttingen, Germany; (F.R.); (M.G.); (H.B.)
| | | | - Armin Otto Schmitt
- Breeding Informatics Group, Department of Animal Sciences, Georg-August University, Margarethe von Wrangell-Weg 7, 37075 Göttingen, Germany; (F.R.); (M.G.); (H.B.)
- Center for Integrated Breeding Research (CiBreed), Albrecht-Thaer-Weg 3, Georg-August University, 37075 Göttingen, Germany
| |
|
11
|
Saravanan KA, Panigrahi M, Kumar H, Parida S, Bhushan B, Gaur GK, Kumar P, Dutt T, Mishra BP, Singh RK. Genome-wide assessment of genetic diversity, linkage disequilibrium and haplotype block structure in Tharparkar cattle breed of India. Anim Biotechnol 2020; 33:297-311. [PMID: 32730141 DOI: 10.1080/10495398.2020.1796696] [Citation(s) in RCA: 16] [Impact Index Per Article: 4.0] [Indexed: 02/08/2023]
Abstract
Knowledge about genetic diversity is essential for the management and sustainable utilization of livestock genetic resources. In this study, we present a comprehensive genome-wide analysis of genetic diversity, runs of homozygosity (ROH), inbreeding, linkage disequilibrium, effective population size and haplotype block structure in Tharparkar cattle of India. A total of 24 Tharparkar animals used in this study were genotyped with the Illumina BovineSNP50 array. After quality control, 22,825 biallelic SNPs were retained that were in Hardy-Weinberg equilibrium (HWE), with MAF > 0.05 and genotyping rate > 90%. The overall mean observed (HO) and expected heterozygosity (HE) were 0.339 ± 0.156 and 0.325 ± 0.129, respectively. The average minor allele frequency was 0.234 with a standard deviation of ± 0.131. We identified a total of 1832 ROH segments, and the highest autosomal coverage of 13.87% was observed on chromosome 23. The genomic inbreeding coefficients estimated by FROH, FHOM, FGRM and FUNI were 0.0589, 0.0215, 0.0532 and 0.0160, respectively. The overall mean linkage disequilibrium (LD) for a total of 133,532 SNP pairs measured by D' and r² was 0.6452 and 0.1339, respectively. In addition, we observed a gradual decline in effective population size over the past generations.
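The two LD measures reported here, D' and r², are both derived from the same disequilibrium coefficient D but normalize it differently, which is why their averages can differ so much. A minimal sketch follows, assuming phased haplotype counts for a pair of biallelic SNPs are already available; array genotypes are unphased, so in practice haplotype frequencies must first be estimated, and this is not the pipeline used in the study. The function name and the counts are illustrative.

```python
def ld_from_haplotypes(n_AB, n_Ab, n_aB, n_ab):
    """Return (D, D', r^2) for two biallelic SNPs from phased haplotype counts."""
    n = n_AB + n_Ab + n_aB + n_ab
    p_AB = n_AB / n
    p_A = (n_AB + n_Ab) / n  # frequency of allele A at the first SNP
    p_B = (n_AB + n_aB) / n  # frequency of allele B at the second SNP
    denom = p_A * (1 - p_A) * p_B * (1 - p_B)
    if denom == 0:
        raise ValueError("both SNPs must be polymorphic")
    d = p_AB - p_A * p_B
    r2 = d * d / denom
    # D' rescales |D| by the largest value it could take given the allele frequencies.
    if d >= 0:
        d_max = min(p_A * (1 - p_B), (1 - p_A) * p_B)
    else:
        d_max = min(p_A * p_B, (1 - p_A) * (1 - p_B))
    return d, abs(d) / d_max, r2


# Made-up counts: one of the four haplotypes is absent.
d, d_prime, r2 = ld_from_haplotypes(30, 0, 20, 50)
print(f"D = {d:.3f}, D' = {d_prime:.3f}, r2 = {r2:.3f}")
```

When one haplotype is missing, as in this example, D' reaches 1 while r² stays well below 1 unless the allele frequencies at the two SNPs match, so a mean D' far above the mean r² is not unusual.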
Affiliation(s)
- K A Saravanan
- Division of Animal Genetics, ICAR-Indian Veterinary Research Institute, Izatnagar, Bareilly, India
| | - Manjit Panigrahi
- Division of Animal Genetics, ICAR-Indian Veterinary Research Institute, Izatnagar, Bareilly, India
| | - Harshit Kumar
- Division of Animal Genetics, ICAR-Indian Veterinary Research Institute, Izatnagar, Bareilly, India
| | - Subhashree Parida
- Division of Pharmacology and Toxicology, ICAR-Indian Veterinary Research Institute, Izatnagar, Bareilly, India
| | - Bharat Bhushan
- Division of Animal Genetics, ICAR-Indian Veterinary Research Institute, Izatnagar, Bareilly, India
| | - G K Gaur
- Division of Animal Genetics, ICAR-Indian Veterinary Research Institute, Izatnagar, Bareilly, India
| | - Pushpendra Kumar
- Division of Animal Genetics, ICAR-Indian Veterinary Research Institute, Izatnagar, Bareilly, India
| | - Triveni Dutt
- Livestock Production and Management Section, ICAR-Indian Veterinary Research Institute, Izatnagar, Bareilly, India
| | - B P Mishra
- Division of Animal Biotechnology, ICAR-Indian Veterinary Research Institute, Izatnagar, Bareilly, India
| | - R K Singh
- Division of Animal Biotechnology, ICAR-Indian Veterinary Research Institute, Izatnagar, Bareilly, India
|
12
|
From molecules to populations: appreciating and estimating recombination rate variation. Nat Rev Genet 2020; 21:476-492. [DOI: 10.1038/s41576-020-0240-1] [Citation(s) in RCA: 45] [Impact Index Per Article: 11.3] [Accepted: 04/15/2020] [Indexed: 02/07/2023]
|
13
|
Wu Y, Yang H, Xiao C. Genetic association study of prolylcarboxypeptidase polymorphisms with susceptibility to essential hypertension in the Yi minority of China: A case-control study based on an isolated population. J Renin Angiotensin Aldosterone Syst 2020; 21:1470320320919586. [PMID: 32448049 PMCID: PMC7249571 DOI: 10.1177/1470320320919586] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Indexed: 12/17/2022] Open
Abstract
Objective: Prolylcarboxypeptidase (PRCP) is a negative regulator of the pressor actions of the renin–angiotensin–aldosterone system. It is also involved in the kallikrein–kinin system. This gene has an important role in blood pressure (BP) regulation. Methods: A case–control study was performed for 615 Yi participants (303 cases and 312 controls) from a remote mountainous area in Yunnan Province of China. For the PRCP gene, 11 tag single-nucleotide polymorphisms were genotyped using the polymerase chain reaction-restriction fragment length polymorphism method. Results: The PRCP gene rs12290550 was associated with the occurrence of essential hypertension (EH) and BP traits. Logistic regression analysis indicated that the rs12290550 T allele was significantly linked to the risk of EH (odds ratio (OR) = 1.85, 95% confidence interval (CI) 1.44–2.39, p = 0.2 × 10⁻⁵). Under Bonferroni correction, the H7 TAGCACTAACA haplotype containing the risk allele rs12290550 T increased the risk of EH (OR = 4.53, 95% CI 2.29–8.93, p = 0.2 × 10⁻⁵). Conclusions: The findings of this study demonstrate the strong association of the PRCP gene with EH. rs12290550 may be a useful genetic predictor of EH in the Yi minority.
Affiliation(s)
- Yanrui Wu
- Cell Biology and Genetics Department, Kunming Medical University, China.,School of Medicine, Yunnan University, China
| | - Hongju Yang
- The First Affiliated Hospital of Kunming Medical University, Kunming Medical University, China
|
14
|
Lan S, Zheng C, Hauck K, McCausland M, Duguid SD, Booker HM, Cloutier S, You FM. Genomic Prediction Accuracy of Seven Breeding Selection Traits Improved by QTL Identification in Flax. Int J Mol Sci 2020; 21:ijms21051577. [PMID: 32106624 PMCID: PMC7084455 DOI: 10.3390/ijms21051577] [Citation(s) in RCA: 14] [Impact Index Per Article: 3.5] [Received: 01/28/2020] [Revised: 02/23/2020] [Accepted: 02/23/2020] [Indexed: 01/21/2023] Open
Abstract
Molecular markers are one of the major factors affecting genomic prediction accuracy and the cost of genomic selection (GS). Previous studies have indicated that the use of quantitative trait loci (QTL) as markers in GS significantly increases prediction accuracy compared with genome-wide random single nucleotide polymorphism (SNP) markers. To optimize the selection of QTL markers in GS, a set of 260 lines from bi-parental populations with 17,277 genome-wide SNPs were used to evaluate the prediction accuracy for seed yield (YLD), days to maturity (DTM), iodine value (IOD), protein (PRO), oil (OIL), linoleic acid (LIO), and linolenic acid (LIN) contents. These seven traits were phenotyped over four years at two locations. Identification of quantitative trait nucleotides (QTNs) for the seven traits was performed using three types of statistical models for genome-wide association study: two SNP-based single-locus (SS), seven SNP-based multi-locus (SM), and one haplotype-block-based multi-locus (BM) models. The identified QTNs were then grouped into QTL based on haplotype blocks. For all seven traits, 133, 355, and 1208 unique QTL were identified by SS, SM, and BM, respectively. A total of 1420 unique QTL were obtained by SS+SM+BM, ranging from 254 (OIL, LIO) to 361 (YLD) for individual traits, whereas a total of 427 unique QTL were achieved by SS+SM, ranging from 56 (YLD) to 128 (LIO). SS models alone did not identify sufficient QTL for GS. The highest prediction accuracies were obtained using single-trait QTL identified by SS+SM+BM for OIL (0.929 ± 0.016), PRO (0.893 ± 0.023), YLD (0.892 ± 0.030), and DTM (0.730 ± 0.062), and by SS+SM for LIN (0.837 ± 0.053), LIO (0.835 ± 0.049), and IOD (0.835 ± 0.041). In terms of the number of QTL markers and prediction accuracy, SS+SM outperformed other models or combinations thereof. The use of all SNPs or QTL of all seven traits significantly reduced the prediction accuracy of traits. 
The results further validated that QTL outperformed high-density genome-wide random markers, and demonstrated that the combined use of single and multi-locus models can effectively identify a comprehensive set of QTL that improve prediction accuracy, but further studies on detection and removal of redundant or false-positive QTL to maximize prediction accuracy and minimize the number of QTL markers in GS are warranted.
Affiliation(s)
- Samuel Lan
- Ottawa Research and Development Centre, Agriculture and Agri-Food Canada, Ottawa, ON K1A 0C6, Canada; (S.L.); (C.Z.); (K.H.); (M.M.)
- Department of Mathematics and Statistics, University of Waterloo, Waterloo, ON N2L 3G1, Canada
| | - Chunfang Zheng
- Ottawa Research and Development Centre, Agriculture and Agri-Food Canada, Ottawa, ON K1A 0C6, Canada; (S.L.); (C.Z.); (K.H.); (M.M.)
| | - Kyle Hauck
- Ottawa Research and Development Centre, Agriculture and Agri-Food Canada, Ottawa, ON K1A 0C6, Canada; (S.L.); (C.Z.); (K.H.); (M.M.)
- Department of Mathematics and Statistics, University of Waterloo, Waterloo, ON N2L 3G1, Canada
| | - Madison McCausland
- Ottawa Research and Development Centre, Agriculture and Agri-Food Canada, Ottawa, ON K1A 0C6, Canada; (S.L.); (C.Z.); (K.H.); (M.M.)
- Department of Plant Sciences, University of Manitoba, Winnipeg, MB R3T 2N2, Canada
| | - Scott D. Duguid
- Morden Research and Development Centre, Agriculture and Agri-Food Canada, Morden, MB R6M 1Y5, Canada;
| | - Helen M. Booker
- Crop Development Centre, University of Saskatchewan, Saskatoon, SK S7N 5A8, Canada;
| | - Sylvie Cloutier
- Ottawa Research and Development Centre, Agriculture and Agri-Food Canada, Ottawa, ON K1A 0C6, Canada; (S.L.); (C.Z.); (K.H.); (M.M.)
- Correspondence: (F.M.Y.); (S.C); Tel.: +1-613-759-1539 (F.M.Y.); +1-613-759-1744 (S.C.)
| | - Frank M. You
- Ottawa Research and Development Centre, Agriculture and Agri-Food Canada, Ottawa, ON K1A 0C6, Canada; (S.L.); (C.Z.); (K.H.); (M.M.)
- Correspondence: (F.M.Y.); (S.C); Tel.: +1-613-759-1539 (F.M.Y.); +1-613-759-1744 (S.C.)
|
15
|
Genetic Screening of Plasticity Regulating Nogo-Type Signaling Genes in Migraine. Brain Sci 2019; 10:brainsci10010005. [PMID: 31861860 PMCID: PMC7016645 DOI: 10.3390/brainsci10010005] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/18/2019] [Revised: 12/12/2019] [Accepted: 12/17/2019] [Indexed: 11/26/2022] Open
Abstract
Migraine is the sixth most prevalent disease in the world and a substantial number of experiments have been conducted to analyze potential differences between the migraine brain and the healthy brain. Results from these investigations point to the possibility that development and aggravation of migraine may include grey matter plasticity. Nogo-type signaling is a potent plasticity-regulating system in the CNS and consists of ligands, receptors, co-receptors and modulators with a dynamic age- and activity-related expression in cortical and subcortical regions. Here we investigated a potential link between migraine and five key Nogo-type signaling genes (RTN4, OMGP, MAG, RTN4R and LINGO1) by screening 15 single nucleotide polymorphisms (SNPs) within these genes. In a large Swedish migraine cohort (749 migraine patients and 4032 controls), using logistic regression with sex as a covariate, we found no such association. In addition, a haplotype analysis revealed three haplotype blocks. These blocks had no significant association with migraine. However, to robustly conclude that Nogo-type signaling genotypes do not influence the prevalence of migraine, further studies are encouraged.
|
16
|
Variants in the 3' End of SLC6A3 in Northwest Han Population with Parkinson's. PARKINSONS DISEASE 2019; 2019:6452471. [PMID: 31565212 PMCID: PMC6745156 DOI: 10.1155/2019/6452471] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/29/2019] [Revised: 06/06/2019] [Accepted: 06/29/2019] [Indexed: 11/29/2022]
Abstract
Parkinson's disease (PD) is one of the most common neurodegenerative disorders. Multiple factors, including genetic ones, may be involved in its pathogenesis. Recently, SLC6A3 genetic variants have been reported to lead to PD. However, the role of the 3′ end of SLC6A3 in PD has been little studied across ethnic groups. To explore the role of the 3′ end of SLC6A3 in PD development, 17 SNP sites in the 3′ end of SLC6A3 were analyzed in 360 PD patients and 392 normal controls from the Han population residing in northwest China. A significant difference in genotype and allele frequencies between the PD and control groups was detected only for rs40184 (P = 0.013 and 0.004, respectively; odds ratio 2.529, 95% confidence interval 1.325–4.827). The genotype and allele frequencies of the other 16 SNP sites did not differ between the PD group and the control group. rs2550936, rs3776510, and rs429699 were selected to construct haplotypes; no significant difference was found in the frequencies of the 5 haplotypes between the PD group and the control group. These results suggest that the A allele of the SLC6A3 variant rs40184 may increase the risk of PD in the northwest Han population and may be a biomarker of PD.
|
17
|
Knisely MR, Conley YP, Smoot B, Paul SM, Levine JD, Miaskowski C. Associations Between Catecholaminergic and Serotonergic Genes and Persistent Arm Pain Severity Following Breast Cancer Surgery. THE JOURNAL OF PAIN 2019; 20:1100-1111. [PMID: 30904518 PMCID: PMC6736756 DOI: 10.1016/j.jpain.2019.03.008] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 12/26/2018] [Revised: 02/17/2019] [Accepted: 03/19/2019] [Indexed: 01/09/2023]
Abstract
Persistent arm pain is a common problem after breast cancer surgery. Little is known about genetic factors that contribute to this type of postsurgical pain. The purpose of this study was to explore associations between persistent arm pain phenotypes and genetic polymorphisms among 15 genes involved in catecholaminergic and serotonergic neurotransmission. Women (n = 398) rated the presence and intensity of arm pain monthly for 6 months after breast cancer surgery. Three distinct latent classes of patients were identified (ie, no arm pain [41.6%], mild arm pain [23.6%], and moderate arm pain [34.8%]). Logistic regression analyses were used to evaluate for differences in genotype or haplotype frequencies between the persistent arm pain classes. Compared with the no arm pain class, 3 single nucleotide polymorphisms and 1 haplotype, in 4 genes, were associated with membership in the mild arm pain class: COMT rs4633, HTR2A haplotype B02 (composed of rs1923886 and rs7330636), HTR3A rs1985242, and TH rs2070762. Compared with the no arm pain class, 4 single nucleotide polymorphisms in 3 genes were associated with membership in the moderate arm pain class: COMT rs165656, HTR2A rs2770298 and rs9534511, and HTR3A rs1985242. Findings suggest that variations in catecholaminergic and serotonergic genes play a role in the development of persistent arm pain. PERSPECTIVE: Limited information is available on genetic factors that contribute to persistent arm pain after breast cancer surgery. Genetic polymorphisms in genes involved in catecholaminergic and serotonergic neurotransmission were associated with 2 persistent arm pain phenotypes. Findings may be used to identify patients who are at higher risk for this common pain condition.
Affiliation(s)
| | - Yvette P Conley
- School of Nursing, University of Pittsburgh, Pittsburgh, Pennsylvania
| | - Betty Smoot
- Schools of Medicine, University of California, San Francisco, California
| | - Steven M Paul
- Schools of Nursing, University of California, San Francisco, California
| | - Jon D Levine
- Schools of Medicine, University of California, San Francisco, California
|
18
|
Ahsan T, Sajib AA. Drug-response related genetic architecture of Bangladeshi population. Meta Gene 2019. [DOI: 10.1016/j.mgene.2019.100585] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/26/2022] Open
|
19
|
Pook T, Schlather M, de Los Campos G, Mayer M, Schoen CC, Simianer H. HaploBlocker: Creation of Subgroup-Specific Haplotype Blocks and Libraries. Genetics 2019; 212:1045-1061. [PMID: 31152070 PMCID: PMC6707469 DOI: 10.1534/genetics.119.302283] [Citation(s) in RCA: 22] [Impact Index Per Article: 4.4] [Received: 05/06/2019] [Accepted: 05/30/2019] [Indexed: 11/18/2022] Open
Abstract
The concept of haplotype blocks has been shown to be useful in genetics. Fields of application range from the detection of regions under positive selection to statistical methods that make use of dimension reduction. We propose a novel approach ("HaploBlocker") for defining and inferring haplotype blocks that focuses on linkage instead of the commonly used population-wide measures of linkage disequilibrium. We define a haplotype block as a sequence of genetic markers that has a predefined minimum frequency in the population, and only haplotypes with a similar sequence of markers are considered to carry that block, effectively screening a dataset for group-wise identity-by-descent. From these haplotype blocks, we construct a haplotype library that represents a large proportion of genetic variability with a limited number of blocks. Our method is implemented in the associated R-package HaploBlocker, and provides flexibility not only to optimize the structure of the obtained haplotype library for subsequent analyses, but also to handle datasets of different marker density and genetic diversity. By using haplotype blocks instead of single nucleotide polymorphisms (SNPs), local epistatic interactions can be naturally modeled, and the reduced number of parameters enables a wide variety of new methods for further genomic analyses such as genomic prediction and the detection of selection signatures. We illustrate our methodology with a dataset comprising 501 doubled haploid lines in a European maize landrace genotyped at 501,124 SNPs. With the suggested approach, we identified 2991 haplotype blocks with an average length of 2685 SNPs that together represent 94% of the dataset.
Affiliation(s)
- Torsten Pook
- Department of Animal Sciences, Animal Breeding and Genetics Group, University of Goettingen, 37075, Germany
- Center for Integrated Breeding Research, University of Goettingen, 37075, Germany
| | - Martin Schlather
- Center for Integrated Breeding Research, University of Goettingen, 37075, Germany
- Stochastics and Its Applications Group, University of Mannheim, 68159, Germany
| | - Gustavo de Los Campos
- Departments of Epidemiology and Biostatistics and Statistics and Probability, Institute for Quantitative Health Science and Engineering, Michigan State University, Michigan 48824
| | - Manfred Mayer
- Plant Breeding, Technical University of Munich School of Life Sciences Weihenstephan, 85354 Freising, Germany
| | - Chris Carolin Schoen
- Plant Breeding, Technical University of Munich School of Life Sciences Weihenstephan, 85354 Freising, Germany
| | - Henner Simianer
- Department of Animal Sciences, Animal Breeding and Genetics Group, University of Goettingen, 37075, Germany
- Center for Integrated Breeding Research, University of Goettingen, 37075, Germany
|
20
|
Lin M, Griessenauer CJ, Starke RM, Tubbs RS, Shoja MM, Foreman PM, Vyas NA, Walters BC, Harrigan MR, Hendrix P, Fisher WS, Pittet JF, Mathru M, Lipsky RH. Haplotype analysis of SERPINE1 gene: Risk for aneurysmal subarachnoid hemorrhage and clinical outcomes. Mol Genet Genomic Med 2019; 7:e737. [PMID: 31268630 PMCID: PMC6687628 DOI: 10.1002/mgg3.737] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Received: 03/26/2019] [Accepted: 04/24/2019] [Indexed: 12/12/2022] Open
Abstract
Background: Aneurysmal subarachnoid hemorrhage (aSAH) has high fatality and permanent disability rates due to the severe damage to brain cells and inflammation. The SERPINE1 gene, which encodes PAI‐1 for the regulation of tissue plasminogen activator, is considered an important therapeutic target for aSAH. Methods: Six SNPs in the SERPINE1 gene (in the order rs2227631, rs1799889, rs6092, rs6090, rs2227684, rs7242) were investigated. Blood samples were genotyped with TaqMan genotyping assays and pyrosequencing. The experiment‐wide statistical significance threshold for single marker analysis was set at p < 0.01 after evaluation of independent markers. Haplotype analysis was performed in the Haplo.stats package with permutation tests. Bonferroni correction for multiple comparisons in dominant, additive, and recessive models was applied. Results: A total of 146 aSAH patients and 49 control subjects were involved in this study. The rs2227631 G allele was significantly associated (p = 0.01) with aSAH compared to controls. In the aSAH group, haplotype analysis showed that G5GGGT homozygotes in the recessive model were associated with delayed cerebral ischemia (p < 0.01, Odds Ratio = 5.14, 95% CI = 1.45–18.18), clinical vasospasm (p = 0.01, Odds Ratio = 4.58, 95% CI = 1.30–16.13), and longer intensive care unit stay (p = 0.01). By contrast, the G5GGAG carriers were associated with a lower incidence of cerebral edema (p < 0.01) and a higher Glasgow Coma Scale score (p < 0.01). The A4GGGT carriers were associated with a lower incidence of severe hypertension (>140/90) (p < 0.01). Conclusion: The results suggest an important regulatory role of SERPINE1 gene polymorphisms in clinical outcomes of aSAH.
Affiliation(s)
- Mingkuan Lin
- Department of Systems Biology, George Mason University, Fairfax, Virginia.,Department of Neuroscience, INOVA Health System, Fairfax, Virginia
| | - Christoph J Griessenauer
- Department of Neurosurgery, Geisinger, Danville, Pennsylvania.,Research Institute of Neurointervention, Paracelsus Medical University, Salzburg, Austria
| | - Robert M Starke
- Department of Neurosurgery and Radiology, University of Miami, Miami, Florida
| | | | | | - Paul M Foreman
- Department of Neurosurgery, University of Alabama at Birmingham, Birmingham, Alabama
| | - Nilesh A Vyas
- Department of Neuroscience, INOVA Health System, Fairfax, Virginia
| | | | - Mark R Harrigan
- Department of Neurosurgery, University of Alabama at Birmingham, Birmingham, Alabama
| | - Philipp Hendrix
- Department of Neurosurgery, Saarland University Medical Center, Saarland University, Homburg, Germany
| | - Winfield S Fisher
- Department of Neurosurgery, University of Alabama at Birmingham, Birmingham, Alabama
| | - Jean-Francois Pittet
- Department of Neurosurgery, University of Alabama at Birmingham, Birmingham, Alabama
| | - Mali Mathru
- Department of Neurosurgery, University of Alabama at Birmingham, Birmingham, Alabama
| | - Robert H Lipsky
- Department of Systems Biology, George Mason University, Fairfax, Virginia.,Department of Neuroscience, INOVA Health System, Fairfax, Virginia
|
21
|
Tangherloni A, Spolaor S, Rundo L, Nobile MS, Cazzaniga P, Mauri G, Liò P, Merelli I, Besozzi D. GenHap: a novel computational method based on genetic algorithms for haplotype assembly. BMC Bioinformatics 2019; 20:172. [PMID: 30999845 PMCID: PMC6471693 DOI: 10.1186/s12859-019-2691-y] [Citation(s) in RCA: 16] [Impact Index Per Article: 3.2] [Indexed: 02/06/2023] Open
Abstract
Background: In order to fully characterize the genome of an individual, the reconstruction of the two distinct copies of each chromosome, called haplotypes, is essential. The computational problem of inferring the full haplotype of a cell starting from read sequencing data is known as haplotype assembly, and consists in assigning all heterozygous Single Nucleotide Polymorphisms (SNPs) to exactly one of the two chromosomes. Indeed, the knowledge of complete haplotypes is generally more informative than analyzing single SNPs and plays a fundamental role in many medical applications. Results: To reconstruct the two haplotypes, we addressed the weighted Minimum Error Correction (wMEC) problem, which is a successful approach for haplotype assembly. This NP-hard problem consists in computing the two haplotypes that partition the sequencing reads into two disjoint sub-sets, with the least number of corrections to the SNP values. To this aim, we propose here GenHap, a novel computational method for haplotype assembly based on Genetic Algorithms, yielding optimal solutions by means of a global search process. In order to evaluate the effectiveness of our approach, we ran GenHap on two synthetic (yet realistic) datasets, based on the Roche/454 and PacBio RS II sequencing technologies. We compared the performance of GenHap against HapCol, an efficient state-of-the-art algorithm for haplotype phasing. Our results show that GenHap always obtains high accuracy solutions (in terms of haplotype error rate), and is up to 4× faster than HapCol in the case of Roche/454 instances and up to 20× faster when compared on the PacBio RS II dataset. Finally, we assessed the performance of GenHap on two different real datasets. Conclusions: Future-generation sequencing technologies, producing longer reads with higher coverage, can highly benefit from GenHap, thanks to its capability of efficiently solving large instances of the haplotype assembly problem.
Moreover, the optimization approach proposed in GenHap can be extended to the study of allele-specific genomic features, such as expression, methylation and chromatin conformation, by exploiting multi-objective optimization techniques. The source code and the full documentation are available at the following GitHub repository: https://github.com/andrea-tango/GenHap.
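The wMEC objective described in this abstract can be illustrated with a small sketch. This is not GenHap's code or its genetic-algorithm search. It only evaluates the cost of one candidate haplotype pair, under the assumption that reads are strings over '0', '1' and '-' (SNP not covered) and that each covered position carries a correction weight. The function name `wmec_cost` and the toy data are hypothetical.

```python
def wmec_cost(reads, weights, h1, h2):
    """Weighted Minimum Error Correction cost of a haplotype pair.

    reads   : list of strings over '0', '1', '-' ('-' = SNP not covered)
    weights : list of per-position weight lists, same shape as reads
    h1, h2  : candidate haplotypes as strings over '0', '1'

    Each read is assigned to whichever haplotype it disagrees with least;
    the cost is the total weight of the SNP values that must be corrected.
    """
    def corrections(read, w, hap):
        # Sum the weights of covered positions that disagree with the haplotype.
        return sum(wi for r, wi, h in zip(read, w, hap) if r != '-' and r != h)

    return sum(
        min(corrections(read, w, h1), corrections(read, w, h2))
        for read, w in zip(reads, weights)
    )


# Toy instance: five reads over three heterozygous SNPs, unit weights.
reads = ["01-", "011", "100", "-00", "110"]
unit = [[1, 1, 1] for _ in reads]
print(wmec_cost(reads, unit, "011", "100"))  # one correction, in the last read
```

A search method such as a genetic algorithm would then look for the read partition (equivalently, the haplotype pair) that minimises this cost. Evaluating a fixed candidate is cheap, and the difficulty lies in the size of the search space.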
Affiliation(s)
- Andrea Tangherloni
- Department of Informatics, Systems and Communication (DISCo), University of Milano-Bicocca, Viale Sarca 336, U14 Building, Milan, 20126, Italy.
| | - Simone Spolaor
- Department of Informatics, Systems and Communication (DISCo), University of Milano-Bicocca, Viale Sarca 336, U14 Building, Milan, 20126, Italy
| | - Leonardo Rundo
- Department of Informatics, Systems and Communication (DISCo), University of Milano-Bicocca, Viale Sarca 336, U14 Building, Milan, 20126, Italy.,Institute of Molecular Bioimaging and Physiology, Italian National Research Council, Contrada Pietrapollastra-Pisciotto, Cefalù (PA), 90015, Italy
| | - Marco S Nobile
- Department of Informatics, Systems and Communication (DISCo), University of Milano-Bicocca, Viale Sarca 336, U14 Building, Milan, 20126, Italy.,SYSBIO.IT Centre of Systems Biology, Piazza della Scienza 2, Milan, 20126, Italy
| | - Paolo Cazzaniga
- Department of Human and Social Sciences, University of Bergamo, Piazzale Sant'Agostino 2, Bergamo, 24129, Italy.,SYSBIO.IT Centre of Systems Biology, Piazza della Scienza 2, Milan, 20126, Italy
| | - Giancarlo Mauri
- Department of Informatics, Systems and Communication (DISCo), University of Milano-Bicocca, Viale Sarca 336, U14 Building, Milan, 20126, Italy.,SYSBIO.IT Centre of Systems Biology, Piazza della Scienza 2, Milan, 20126, Italy
| | - Pietro Liò
- Computer Laboratory, University of Cambridge, 15 JJ Thomson Avenue, Cambridge, CB3 0FD, UK
| | - Ivan Merelli
- Institute of Biomedical Technologies, Italian National Research Council, Via Fratelli Cervi 93, Segrate (MI), 20090, Italy
| | - Daniela Besozzi
- Department of Informatics, Systems and Communication (DISCo), University of Milano-Bicocca, Viale Sarca 336, U14 Building, Milan, 20126, Italy
|
22
|
Motazedi E, Maliepaard C, Finkers R, Visser R, de Ridder D. Family-Based Haplotype Estimation and Allele Dosage Correction for Polyploids Using Short Sequence Reads. Front Genet 2019; 10:335. [PMID: 31040862 PMCID: PMC6477055 DOI: 10.3389/fgene.2019.00335] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Received: 12/14/2018] [Accepted: 03/28/2019] [Indexed: 12/27/2022] Open
Abstract
DNA sequence reads contain information about the genomic variants located on a single chromosome. By extracting and extending this information using the overlaps between the reads, the haplotypes of an individual can be obtained. Using parent-offspring relationships in a population can considerably improve the quality of the haplotypes obtained from short reads, as pedigree information can be used to correct for spurious overlaps (due to sequencing errors) and insufficient overlaps (due to short read lengths, low genomic variation and shallow coverage). We developed a novel method, PopPoly, to estimate polyploid haplotypes in an F1-population from short sequence data by taking into consideration the transmission of the haplotypes from the parents to the offspring. In addition, this information is employed to improve genotype dosage estimation and to call missing genotypes in the population. Through simulations, we compare PopPoly to other haplotyping methods and show its better performance. We evaluate PopPoly by applying it to a tetraploid potato cross at nine genomic regions involved in tuber formation.
Affiliation(s)
- Ehsan Motazedi
- Bioinformatics Group, Wageningen University & Research, Wageningen, Netherlands.,Plant Breeding, Wageningen University & Research, Wageningen, Netherlands
| | - Chris Maliepaard
- Plant Breeding, Wageningen University & Research, Wageningen, Netherlands
| | - Richard Finkers
- Plant Breeding, Wageningen University & Research, Wageningen, Netherlands
| | - Richard Visser
- Plant Breeding, Wageningen University & Research, Wageningen, Netherlands
| | - Dick de Ridder
- Bioinformatics Group, Wageningen University & Research, Wageningen, Netherlands
|
23
|
Li Z, Kemppainen P, Rastas P, Merilä J. Linkage disequilibrium clustering‐based approach for association mapping with tightly linked genomewide data. Mol Ecol Resour 2018; 18:809-824. [DOI: 10.1111/1755-0998.12893] [Citation(s) in RCA: 21] [Impact Index Per Article: 3.5] [Received: 10/25/2017] [Revised: 04/05/2018] [Accepted: 04/06/2018] [Indexed: 02/05/2023]
Affiliation(s)
- Zitong Li
- Ecological Genetics Research Unit, Research Programme in Organismal and Evolutionary Biology, Faculty of Biological and Environmental Sciences, Department of Biosciences, University of Helsinki, Helsinki, Finland
| | - Petri Kemppainen
- Ecological Genetics Research Unit, Research Programme in Organismal and Evolutionary Biology, Faculty of Biological and Environmental Sciences, Department of Biosciences, University of Helsinki, Helsinki, Finland
| | - Pasi Rastas
- Ecological Genetics Research Unit, Research Programme in Organismal and Evolutionary Biology, Faculty of Biological and Environmental Sciences, Department of Biosciences, University of Helsinki, Helsinki, Finland
| | - Juha Merilä
- Ecological Genetics Research Unit, Research Programme in Organismal and Evolutionary Biology, Faculty of Biological and Environmental Sciences, Department of Biosciences, University of Helsinki, Helsinki, Finland
|
24
|
Bourke PM, Voorrips RE, Visser RGF, Maliepaard C. Tools for Genetic Studies in Experimental Populations of Polyploids. FRONTIERS IN PLANT SCIENCE 2018; 9:513. [PMID: 29720992 PMCID: PMC5915555 DOI: 10.3389/fpls.2018.00513] [Citation(s) in RCA: 55] [Impact Index Per Article: 9.2] [Received: 01/25/2018] [Accepted: 04/04/2018] [Indexed: 05/19/2023]
Abstract
Polyploid organisms carry more than two copies of each chromosome, a condition rarely tolerated in animals but which occurs relatively frequently in the plant kingdom. One of the principal challenges faced by polyploid organisms is to evolve stable meiotic mechanisms to faithfully transmit genetic information to the next generation upon which the study of inheritance is based. In this review we look at the tools available to the research community to better understand polyploid inheritance, many of which have only recently been developed. Most of these tools are intended for experimental populations (rather than natural populations), facilitating genomics-assisted crop improvement and plant breeding. This is hardly surprising given that a large proportion of domesticated plant species are polyploid. We focus on three main areas: (1) polyploid genotyping; (2) genetic and physical mapping; and (3) quantitative trait analysis and genomic selection. We also briefly review some miscellaneous topics such as the mode of inheritance and the availability of polyploid simulation software. The current polyploid analytic toolbox includes software for assigning marker genotypes (and in particular, estimating the dosage of marker alleles in the heterozygous condition), establishing chromosome-scale linkage phase among marker alleles, constructing (short-range) haplotypes, generating linkage maps, performing genome-wide association studies (GWAS) and quantitative trait locus (QTL) analyses, and simulating polyploid populations. These tools can also help elucidate the mode of inheritance (disomic, polysomic or a mixture of both as in segmental allopolyploids) or reveal whether double reduction and multivalent chromosomal pairing occur. An increasing number of polyploids (or associated diploids) are being sequenced, leading to publicly available reference genome assemblies. Much work remains in order to keep pace with developments in genomic technologies. 
However, such technologies also offer the promise of understanding polyploid genomes at a level which hitherto has remained elusive.
Affiliation(s)
- Chris Maliepaard
- Plant Breeding, Wageningen University & Research, Wageningen, Netherlands
|
25
|
Sayad A, Noroozi R, Khodamoradi Z, Omrani MD, Taheri M, Ghafouri-Fard S. Association Study of VMAT1 Polymorphisms and Suicide Behavior. J Mol Neurosci 2018. [PMID: 29536333 DOI: 10.1007/s12031-018-1047-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/17/2022]
Abstract
Genetic association studies have linked suicide behavior with genes encoding monoamine transporters. Variants in the vesicular monoamine transporter 1 (VMAT1) gene have previously been shown to be associated with several psychiatric disorders, including schizophrenia and bipolar disorder. However, their association with suicide behavior has not been explored. In the present study, we genotyped three single-nucleotide polymorphisms (rs2270637, rs1390938, and rs2279709) within this gene in 100 individuals who attempted suicide, 236 suicide victims, and 300 control subjects without any history of psychiatric disorders or suicide ideation. We found no difference in genotype, allele, or haplotype frequencies of these single-nucleotide polymorphisms between the study groups. Consequently, the contribution of VMAT1 to the risk of psychiatric disorders might be independent of suicide behavior. Future studies with larger sample sizes are needed to confirm our results.
Affiliation(s)
- Arezou Sayad
- Department of Medical Genetics, Shahid Beheshti University of Medical Sciences, Tehran, Iran
| | - Rezvan Noroozi
- Phytochemistry Research Center, Shahid Beheshti University of Medical Sciences, Tehran, Iran
| | - Zahra Khodamoradi
- School of Medicine, Hamadan University of Medical Sciences, Hamadan, Iran
| | - Mir Davood Omrani
- Department of Medical Genetics, Shahid Beheshti University of Medical Sciences, Tehran, Iran
- Urogenital Stem Cell Research Center, Shahid Beheshti University of Medical Sciences, Tehran, Iran
| | - Mohammad Taheri
- Department of Medical Genetics, Shahid Beheshti University of Medical Sciences, Tehran, Iran
- Urogenital Stem Cell Research Center, Shahid Beheshti University of Medical Sciences, Tehran, Iran
| | - Soudeh Ghafouri-Fard
- Department of Medical Genetics, Shahid Beheshti University of Medical Sciences, Tehran, Iran.
|
26
|
ITPR3 gene haplotype is associated with cervical squamous cell carcinoma risk in Taiwanese women. Oncotarget 2018; 8:10085-10090. [PMID: 28036301 PMCID: PMC5354643 DOI: 10.18632/oncotarget.14341] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.3] [Received: 06/15/2016] [Accepted: 12/15/2016] [Indexed: 11/25/2022] Open
Abstract
Host immunogenetic background plays an important role in human papillomavirus (HPV) infection and cervical cancer development. Inositol 1,4,5-triphosphate receptor type 3 (ITPR3) is essential for both immune activation and cancer pathogenesis. We aimed to investigate whether ITPR3 genetic polymorphisms are associated with the risk of cervical cancer in Taiwanese women. ITPR3 rs3748079 A/G and rs2229634 C/T polymorphisms were genotyped in a hospital-based study of 462 women with cervical squamous cell carcinoma (CSCC) and 921 age-matched healthy control women. The presence and genotypes of HPV in CSCC were determined. No significant association of individual ITPR3 variants was found among controls, CSCC, and HPV-16 positive CSCC. However, we found a significant association of haplotype AT with CSCC compared with controls (OR = 2.28, 95% CI 1.31-3.97, P = 2.83 × 10⁻³), and the OR increased further in CSCC patients infected with HPV-16 (OR = 2.89, 95% CI 1.55-5.37, P = 4.54 × 10⁻⁴). The linkage disequilibrium analysis demonstrated that the ITPR3 association with CSCC was independent of HLA-DRB1 alleles. In conclusion, these findings suggest that the AT haplotype in the ITPR3 gene may serve as a potential marker for genetic susceptibility to CSCC.
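As a reading aid for the odds ratios and 95% confidence intervals quoted in this abstract, the following is a minimal sketch of the standard log-odds (Woolf) approximation on a 2×2 table. The function name and the counts are hypothetical; the abstract does not report the underlying haplotype counts, and the authors' actual estimation procedure may differ.

```python
import math

def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio and approximate CI for a 2x2 table.

    a = cases carrying the haplotype,    b = cases not carrying it
    c = controls carrying the haplotype, d = controls not carrying it
    z = 1.96 gives an approximate 95% interval.
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the Woolf approximation.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper

# Hypothetical counts, chosen only for illustration.
or_, lo, hi = odds_ratio_ci(20, 80, 10, 90)
print(f"OR = {or_:.2f}, 95% CI {lo:.2f}-{hi:.2f}")
```

An interval that excludes 1, as both reported intervals do, is consistent with a nominally significant association at the 5% level.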
|
27
|
Chen B, Wang J, Gu X, Zhang J, Zhang J, Feng X. The DNMT3B -579G>T Polymorphism Is Significantly Associated With the Risk of Gastric Cancer but not Lung Cancer in Chinese Population. Technol Cancer Res Treat 2017; 16:1259-1265. [PMID: 29332452 PMCID: PMC5762089 DOI: 10.1177/1533034617740475] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.6] [Received: 06/21/2017] [Revised: 09/12/2017] [Accepted: 10/09/2017] [Indexed: 01/09/2023] Open
Abstract
The -149C>T and -579G>T single-nucleotide polymorphisms in the promoter of the de novo methyltransferase 3B (DNMT3B) gene have previously been reported to potentially alter promoter activity and to influence cancer risk. However, the results of previous studies remain conflicting rather than conclusive. In view of this, we conducted a case-control study and then a meta-analysis to examine the association of these 2 single-nucleotide polymorphisms with the risk of lung and gastric cancer in the Chinese population. Genotyping was performed by polymerase chain reaction-restriction fragment length polymorphism and confirmed by sequencing. In the case-control study, no significant association with lung or gastric cancer risk was observed for -149C>T, while -579G>T was significantly correlated with the risk of gastric cancer but not lung cancer. Moreover, haplotype analysis showed that haplotype -149T/-579T, which carried the risk -579T allele, significantly increased the susceptibility to gastric cancer. However, none of the haplotypes was associated with the risk of lung cancer. The subsequent meta-analysis involved only Chinese populations and further confirmed the significant association of -579G>T with gastric cancer but not lung cancer, and suggested no significant association between -149C>T and the risk of lung or gastric cancer. Collectively, the DNMT3B -579G>T polymorphism is associated with gastric cancer risk in the Chinese population, and -579G>T may be used as a genetic biomarker to predict the risk of gastric cancer in this population.
Affiliation(s)
- Bifeng Chen
- Department of Biological Science and Technology, School of Chemistry, Chemical Engineering and Life Sciences, Wuhan University of Technology, Wuhan, Hubei, China
| | - Jingdong Wang
- Department of Biological Science and Technology, School of Chemistry, Chemical Engineering and Life Sciences, Wuhan University of Technology, Wuhan, Hubei, China
| | - Xiuli Gu
- Center of Reproductive Medicine, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, Hubei, China
- Wuhan Tongji Reproductive Medicine Hospital, Wuhan, Hubei, China
| | - Jingli Zhang
- Department of Biological Science and Technology, School of Chemistry, Chemical Engineering and Life Sciences, Wuhan University of Technology, Wuhan, Hubei, China
| | - Jiankun Zhang
- Department of Biological Science and Technology, School of Chemistry, Chemical Engineering and Life Sciences, Wuhan University of Technology, Wuhan, Hubei, China
| | - Xianhong Feng
- Clinical Laboratory, Wuhan Xinzhou District People’s Hospital, Wuhan, China
|
28
|
Gonzalez S, Gupta J, Villa E, Mallawaarachchi I, Rodriguez M, Ramirez M, Zavala J, Armas R, Dassori A, Contreras J, Flores D, Jerez A, Ontiveros A, Nicolini H, Escamilla M. Replication of genome-wide association study (GWAS) susceptibility loci in a Latino bipolar disorder cohort. Bipolar Disord 2016; 18:520-527. [PMID: 27759212 PMCID: PMC5095871 DOI: 10.1111/bdi.12438] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.6] [Received: 11/09/2015] [Accepted: 09/02/2016] [Indexed: 12/23/2022]
Abstract
OBJECTIVES Recent genome-wide association studies (GWASs) have identified numerous putative genetic polymorphisms associated with bipolar disorder (BD) and/or schizophrenia (SC). We hypothesized that a portion of these polymorphisms would also be associated with BD in the Latino American population. To identify such regions, we tested previously identified genetic variants associated with BD and/or SC and ancestral haploblocks containing these single nucleotide polymorphisms (SNPs) in a sample of Latino subjects with BD. METHODS A total of 2254 Latino individuals were genotyped for 91 SNPs identified in previous BD and/or SC GWASs, along with selected SNPs in strong linkage disequilibrium with these markers. Family-based single marker and haplotype association testing was performed using the PBAT software package. Empirical P-values were derived from 10 000 permutations. RESULTS Associations of eight a priori GWAS SNPs with BD were replicated with nominal (P≤.05) levels of significance. These included SNPs within nuclear factor I A (NFIA), serologically defined colon cancer antigen 8 (SDCCAG8), lysosomal associated membrane protein 3 (LAMP3), nuclear factor kappa B subunit 1 (NFKB1), major histocompatibility complex, class I, B (HLA-B) and 5'-nucleotidase, cytosolic II (NT5C2) and SNPs within intragenic regions microRNA 6828 (MIR6828)-solute carrier family 7 member 14 (SLC7A14) and sonic hedgehog (SHH)-long intergenic non-protein coding RNA 1006 (LINC01006). Of the 76 ancestral haploblocks that were tested for associations with BD, our top associated haploblock was located in LAMP3; however, the association did not meet statistical thresholds of significance following Bonferroni correction. CONCLUSIONS These results indicate that some of the gene variants found to be associated with BD or SC in other populations are also associated with BD risk in Latinos. 
Variants in six genes and two intragenic regions were associated with BD in our Latino sample and provide additional evidence for overlap in genetic risk between SC and BD.
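The two statistical devices named in this abstract, empirical P-values from permutations and Bonferroni correction, can be sketched as follows. This is a generic illustration under stated assumptions: the study used family-based tests in the PBAT package, whereas this sketch shuffles group labels for a simple difference in means, and the "+1" correction in the empirical P-value is a common convention rather than something the abstract confirms. All function names are hypothetical.

```python
import random

def empirical_p_value(observed, permuted_stats):
    """Share of permuted statistics at least as extreme as the observed one.

    The +1 in numerator and denominator keeps the estimate above zero.
    """
    hits = sum(1 for s in permuted_stats if s >= observed)
    return (hits + 1) / (len(permuted_stats) + 1)

def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold after Bonferroni correction."""
    return alpha / n_tests

def permutation_test(group_a, group_b, n_perm=10_000, seed=0):
    """Label-shuffling test on the absolute difference in group means."""
    rng = random.Random(seed)

    def stat(a, b):
        return abs(sum(a) / len(a) - sum(b) / len(b))

    observed = stat(group_a, group_b)
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    permuted = []
    for _ in range(n_perm):
        rng.shuffle(pooled)  # reassign group labels at random
        permuted.append(stat(pooled[:n_a], pooled[n_a:]))
    return empirical_p_value(observed, permuted)

# With 76 haploblocks tested at alpha = 0.05, the per-test threshold
# is 0.05 / 76, i.e. roughly 6.6e-4.
print(bonferroni_threshold(0.05, 76))
```

Under this threshold a haploblock with a nominal P-value of, say, 0.01 would not be declared significant, which matches the abstract's statement that the top LAMP3 haploblock did not survive correction.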
Affiliation(s)
- Suzanne Gonzalez
- Center of Excellence in Neurosciences, Department of Biomedical Sciences, Paul L. Foster School of Medicine, Texas Tech University Health Sciences Center, El Paso, TX, USA.
| | - Jayanta Gupta
- Department of Health Sciences, College of Health Professions & Social Work, Florida Gulf Coast University, Fort Myers, FL, USA
| | - Erika Villa
- Center of Excellence in Neurosciences, Department of Biomedical Sciences, Paul L. Foster School of Medicine, Texas Tech University Health Sciences Center, El Paso, TX, USA
| | - Indika Mallawaarachchi
- Biostatistics and Epidemiology Consulting Lab, Paul L. Foster School of Medicine, Texas Tech University Health Sciences Center, El Paso, TX, USA
| | - Marco Rodriguez
- Center of Excellence in Neurosciences, Department of Biomedical Sciences, Paul L. Foster School of Medicine, Texas Tech University Health Sciences Center, El Paso, TX, USA
| | - Mercedes Ramirez
- Center of Excellence in Neurosciences, Department of Biomedical Sciences, Paul L. Foster School of Medicine, Texas Tech University Health Sciences Center, El Paso, TX, USA
- Department of Psychiatry, Paul L. Foster School of Medicine, Texas Tech University Health Sciences Center, El Paso, TX, USA
| | - Juan Zavala
- Center of Excellence in Neurosciences, Department of Biomedical Sciences, Paul L. Foster School of Medicine, Texas Tech University Health Sciences Center, El Paso, TX, USA
- Department of Psychiatry, Paul L. Foster School of Medicine, Texas Tech University Health Sciences Center, El Paso, TX, USA
| | - Regina Armas
- Langley Porter Psychiatric Institute, University of California at San Francisco, San Francisco, CA, USA
| | - Albana Dassori
- Department of Psychiatry, University of Texas Health Science Center at San Antonio, San Antonio, TX, USA
- South Texas Veterans Health Care System, San Antonio, TX, USA
| | - Javier Contreras
- Centro de Investigación en Biología Celular y Molecular y Escuela de Biologia, Universidad de Costa Rica, San Jose, Costa Rica
| | - Deborah Flores
- Los Angeles Biomedical Research Center at Harbor, University of California Los Angeles Medical Center, Torrance, CA, USA
| | - Alvaro Jerez
- Centro Internacional de Trastornos Afectivos y de la Conducta Adictiva, Guatemala City, Guatemala
| | - Alfonso Ontiveros
- Instituto de Información e Investigación en Salud Mental AC, Monterrey, Nuevo Leon, México
| | - Humberto Nicolini
- Grupo de Estudios Médicos y Familiares Carracci S.C., México D.F, México
| | - Michael Escamilla
- Center of Excellence in Neurosciences, Department of Biomedical Sciences, Paul L. Foster School of Medicine, Texas Tech University Health Sciences Center, El Paso, TX, USA
- Department of Psychiatry, Paul L. Foster School of Medicine, Texas Tech University Health Sciences Center, El Paso, TX, USA
|
29
|
Affiliation(s)
- Marco Dauriz
- Division of Endocrinology, Diabetes and Metabolism, Department of Medicine, University of Verona School of Medicine and Hospital Trust of Verona, Ospedale Civile Maggiore, P.le Stefani, 1 - Pad. 22, 37126, Verona, Italy.
| | - James B Meigs
- General Medicine Division, Massachusetts General Hospital, Boston, MA, USA
- Department of Medicine, Harvard Medical School, Boston, MA, USA
|
30
|
Cao XK, Zhan ZY, Huang YZ, Lan XY, Lei CZ, Qi XL, Chen H. Variants and haplotypes within MEF2C gene influence stature of Chinese native cattle including body dimensions and weight. Livest Sci 2016. [DOI: 10.1016/j.livsci.2016.01.008] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.1] [Indexed: 11/27/2022]
|
31
|
Dias H, Muc M, Padez C, Manco L. Association of polymorphisms in 5-HTT (SLC6A4) and MAOA genes with measures of obesity in young adults of Portuguese origin. Arch Physiol Biochem 2016; 122:8-13. [PMID: 26698543 DOI: 10.3109/13813455.2015.1111390] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.4] [Indexed: 12/16/2022]
Abstract
OBJECTIVES To investigate the association of polymorphisms in SLC6A4 and MAOA genes with overweight (including obesity). MATERIAL AND METHODS Young adults (n = 535) of Portuguese origin were genotyped for the SLC6A4 polymorphisms 5-HTTLPR and STin2 and a MAOA VNTR. BMI and body fat percentage were measured, and a questionnaire was used to assess individuals' sports practice habits. RESULTS In the whole study sample, haplotype-based analysis revealed a significant association with overweight/obesity for the individual 5-HTTLPR/STin2 haplotype L10 (p = 0.04). In men, the MAOA 3R genotype was nominally associated with body fat (p = 0.04). In inactive individuals, overweight/obesity was significantly associated with the 5-HTTLPR L-allele (p = 0.01) and nominally associated with the STin2 10-allele (p = 0.03). A significant association was also found when testing for all haplotype effects (χ² = 8.7; p = 0.03). CONCLUSIONS We found some evidence for the association of SLC6A4 and MAOA genes with measures of obesity. Our results suggest that physical inactivity accentuates the influence of SLC6A4 polymorphisms on obesity risk.
Affiliation(s)
- Helena Dias
- Research Centre for Anthropology and Health (CIAS), Department of Life Sciences, University of Coimbra, Portugal
| | - Magdalena Muc
- Research Centre for Anthropology and Health (CIAS), Department of Life Sciences, University of Coimbra, Portugal
| | - Cristina Padez
- Research Centre for Anthropology and Health (CIAS), Department of Life Sciences, University of Coimbra, Portugal
| | - Licínio Manco
- Research Centre for Anthropology and Health (CIAS), Department of Life Sciences, University of Coimbra, Portugal
|
32
|
Association of BDNF Polymorphisms with the Risk of Epilepsy: a Multicenter Study. Mol Neurobiol 2015; 53:2869-2877. [PMID: 25876511 DOI: 10.1007/s12035-015-9150-1] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.1] [Received: 01/16/2015] [Accepted: 03/19/2015] [Indexed: 12/23/2022]
Abstract
Epilepsy is a common neurological disease characterized by recurrent unprovoked seizures. Evidence suggests that abnormal activity of brain-derived neurotrophic factor (BDNF) contributes to the pathogenesis of epilepsy. Some previous studies identified an association between genetic variants of BDNF and the risk of epilepsy. In this study, this association was examined in the Hong Kong and Malaysian epilepsy cohorts. Genomic DNA of 6047 subjects (1640 patients with epilepsy and 4407 healthy individuals) was genotyped for the rs6265, rs11030104, rs7103411, and rs7127507 polymorphisms using Sequenom MassArray and Illumina HumanHap 610-Quad or 550-Duo BeadChip array techniques. Results showed a significant association between rs6265 T, rs7103411 C, and rs7127507 T and cryptogenic epilepsy risk (p = 0.00003, p = 0.0002, and p = 0.002, respectively) or between rs6265 and rs7103411 and symptomatic epilepsy risk in Malaysian Indians (TT vs. CC, p = 0.004 and T vs. C, p = 0.0002, respectively), as well as between rs6265 T and the risk of cryptogenic epilepsy in Malaysian Chinese (p = 0.005). The T(rs6265)-C(rs7103411)-T(rs7127507) haplotype was significantly associated with cryptogenic epilepsy in Malaysian Indians (p = 0.00005). In conclusion, our results suggest that BDNF polymorphisms might contribute to the risk of epilepsy in Malaysian Indians and Chinese.
|
33
|
Hung CL, Chen WP, Hua GJ, Zheng H, Tsai SJJ, Lin YL. Cloud computing-based TagSNP selection algorithm for human genome data. Int J Mol Sci 2015; 16:1096-110. [PMID: 25569088 PMCID: PMC4307292 DOI: 10.3390/ijms16011096] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.7] [Received: 09/16/2014] [Accepted: 12/04/2014] [Indexed: 12/31/2022] Open
Abstract
Single nucleotide polymorphisms (SNPs) play a fundamental role in human genetic variation and are used in medical diagnostics, phylogeny construction, and drug design. They provide the highest-resolution genetic fingerprint for identifying disease associations and human features. Haplotypes are regions of linked genetic variants that are closely spaced on the genome and tend to be inherited together. Genetics research has revealed SNPs within certain haplotype blocks that introduce few distinct common haplotypes into most of the population. Haplotype block structures are used in association-based methods to map disease genes. In this paper, we propose an efficient algorithm for identifying haplotype blocks in the genome. In chromosomal haplotype data retrieved from the HapMap project website, the proposed algorithm identified longer haplotype blocks than an existing algorithm. To enhance its performance, we extended the proposed algorithm into a parallel algorithm that copies data in parallel via the Hadoop MapReduce framework. The proposed MapReduce-paralleled combinatorial algorithm performed well on real-world data obtained from the HapMap dataset; the improvement in computational efficiency was proportional to the number of processors used.
Affiliation(s)
- Che-Lun Hung
- Department of Computer Science and Communication Engineering, Providence University, Taichung 43301, Taiwan.
| | - Wen-Pei Chen
- Department of Applied Chemistry, Providence University, Taichung 43301, Taiwan.
| | - Guan-Jie Hua
- Department of Computer Science, National Tsing Hua University, Hsinchu 30013, Taiwan.
| | - Huiru Zheng
- School of Computing and Mathematics, University of Ulster, Newtownabbey BT37 0QB, UK.
| | - Suh-Jen Jane Tsai
- Department of Applied Chemistry, Providence University, Taichung 43301, Taiwan.
| | - Yaw-Ling Lin
- Department of Applied Chemistry, Providence University, Taichung 43301, Taiwan.
|
34
|
Qin L, Zhao P, Liu Z, Chang P. Associations SELE gene haplotype variant and hypertension in Mongolian and Han populations. Intern Med 2015; 54:287-93. [PMID: 25748737 DOI: 10.2169/internalmedicine.54.2797] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Indexed: 11/06/2022] Open
Abstract
Genetic variation is thought to contribute to the etiology of hypertension, and E-selectin is a candidate essential hypertension-associated gene. OBJECTIVE In this study, we attempted to test the hypothesis that subtle haplotype variants of the SELE gene may be sources of essential hypertension in Mongolian and Han populations. MATERIALS A total of 429 unrelated Mongolian herdsmen and 416 Han farmers were enrolled, including 212 Mongolian essential hypertension (EH) patients, 217 Mongolian normotensives (controls), 200 Han EH patients and 216 Han normotensives (controls). METHODS All nine tag single-nucleotide polymorphisms (SNPs) within the SELE gene were retrieved from HapMap, and genotyping was performed using a polymerase chain reaction (PCR)/ligase detection reaction assay. RESULTS The distributions of the A-allele frequency of rs3917458 and the C-allele frequency of rs2179172 differed significantly between the hypertensive subjects and controls in the Han population. The frequency of haplotype GGC was significantly higher in the EH group than in the controls in the Mongolian population. In the Han population, a significant difference was observed in the haplotype frequency of TCC between the patients and controls, whereas haplotype ACA was detected significantly less often in the EH subjects than in the controls. CONCLUSION The haplotype TCC in the Han hypertensive patients and the haplotype GGC in the Mongolian patients had independent effects in increasing the risk for EH and may be used as risk factors for predicting high blood pressure. However, the haplotype ACA had an independent effect in decreasing the risk of hypertension and may be protective in normotensive subjects in the Han population. Therefore, multiple SNPs in combination in SELE may confer a risk of hypertension.
Affiliation(s)
- Li Qin
- Department of Pathophysiology, Inner Mongolia Medical University, China
|
35
|
Abstract
A genome-wide association study involves examining a large number of single-nucleotide polymorphisms (SNPs) to identify SNPs that are significantly associated with the given phenotype, while trying to reduce the false positive rate. Although haplotype-based association methods have been proposed to accommodate correlation information across nearby SNPs that are in linkage disequilibrium, none of these methods directly incorporates structural information such as recombination events along the chromosome. In this paper, we propose a new approach called stochastic block lasso for association mapping that exploits prior knowledge on linkage disequilibrium structure in the genome, such as recombination rates and distances between adjacent SNPs, in order to increase the power of detecting true associations while reducing false positives. Following a typical linear regression framework with the genotypes as inputs and the phenotype as output, our proposed method employs a sparsity-enforcing Laplacian prior for the regression coefficients, augmented by a first-order Markov process along the sequence of SNPs that incorporates the prior information on the linkage disequilibrium structure. The Markov-chain prior models the structural dependencies between a pair of adjacent SNPs, and allows us to look for associated SNPs in a coupled manner, combining strength from multiple nearby SNPs. Our results on HapMap-simulated datasets and mouse datasets show that there is a significant advantage in incorporating the prior knowledge on linkage disequilibrium structure for marker identification under whole-genome association.
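Under maximum a posteriori estimation with Gaussian noise, a Laplacian prior on regression coefficients corresponds to an L1 (lasso) penalty. The sketch below shows only that ingredient, as plain lasso fitted by coordinate descent; it deliberately omits the first-order Markov coupling between adjacent SNPs that distinguishes the stochastic block lasso. The function names and the toy design matrix are hypothetical.

```python
def soft_threshold(z, lam):
    """Shrink z toward zero by lam; values within [-lam, lam] become 0."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0

def lasso_coordinate_descent(X, y, lam, n_iter=200):
    """Minimise (1/(2n)) * ||y - X b||^2 + lam * ||b||_1 over b."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    resid = [float(v) for v in y]  # residual y - X b, with b = 0 initially
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0:
                continue  # constant-zero column carries no information
            # Correlation of column j with the residual that excludes b_j.
            rho = sum(X[i][j] * (resid[i] + X[i][j] * beta[j])
                      for i in range(n)) / n
            new_bj = soft_threshold(rho, lam) / col_sq[j]
            delta = new_bj - beta[j]
            if delta:
                for i in range(n):
                    resid[i] -= X[i][j] * delta
                beta[j] = new_bj
    return beta

# Toy design with three mutually orthogonal, +/-1 coded columns;
# only the first column influences the phenotype.
X = [[1, 1, 1], [-1, 1, -1], [1, -1, -1], [-1, -1, 1]]
y = [2.0, -2.0, 2.0, -2.0]
print(lasso_coordinate_descent(X, y, lam=0.5))  # [1.5, 0.0, 0.0]
```

The penalty shrinks the true effect from 2 to 1.5 and sets the two irrelevant coefficients exactly to zero, which is the sparsity behaviour the abstract attributes to the Laplacian prior.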
Affiliation(s)
- Seyoung Kim
- School of Computer Science, Carnegie Mellon University , Pittsburgh, Pennsylvania
|
36
|
The CRHR1 Gene Contributes to Genetic Susceptibility of Aggressive Behavior Towards Others in Chinese Southwest Han Population. J Mol Neurosci 2013; 52:481-6. [DOI: 10.1007/s12031-013-0160-z] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.6] [Received: 08/14/2013] [Accepted: 10/23/2013] [Indexed: 12/28/2022]
|
37
|
Chen WP, Hung CL, Lin YL. Efficient haplotype block partitioning and tag SNP selection algorithms under various constraints. BioMed Research International 2013; 2013:984014. [PMID: 24319694 PMCID: PMC3844216 DOI: 10.1155/2013/984014] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Received: 04/29/2013] [Accepted: 09/05/2013] [Indexed: 11/18/2022]
Abstract
Patterns of linkage disequilibrium play a central role in genome-wide association studies aimed at identifying genetic variation responsible for common human diseases. These patterns in human chromosomes show a block-like structure, and regions of high linkage disequilibrium are called haplotype blocks. A small subset of SNPs, called tag SNPs, is sufficient to capture the haplotype patterns in each haplotype block. Previously developed algorithms completely partition a haplotype sample into blocks while attempting to minimize the number of tag SNPs. However, when resource limitations prevent genotyping all the tag SNPs, it is desirable to restrict their number. We propose two dynamic programming algorithms, incorporating many diversity evaluation functions, for haplotype block partitioning using a limited number of tag SNPs. We use the proposed algorithms to partition the chromosome 21 haplotype data. When the sample is fully partitioned into blocks by our algorithms, the 2,266 blocks and 3,260 tag SNPs are fewer than those identified by previous studies. We also demonstrate that our algorithms find the optimal solution by exploiting the nonmonotonic property of a common haplotype-evaluation function.
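To make the notion of tag SNPs concrete, the sketch below finds, for a single haplotype block, a smallest set of SNP positions that still tells all distinct haplotypes apart. It is an exhaustive search for illustration only and is not the dynamic programming method of the paper; the function name and the toy haplotypes are hypothetical, and the search is exponential in block length, so it suits only short blocks.

```python
from itertools import combinations

def minimal_tag_snps(haplotypes):
    """Return a smallest list of column indices that distinguishes
    every distinct haplotype in the block.

    haplotypes: equal-length strings, one character per SNP allele.
    """
    distinct = sorted(set(haplotypes))
    n_snps = len(distinct[0])
    # Try subsets in order of increasing size, so the first hit is minimal.
    for k in range(n_snps + 1):
        for cols in combinations(range(n_snps), k):
            projected = {tuple(h[c] for c in cols) for h in distinct}
            if len(projected) == len(distinct):
                return list(cols)
    return list(range(n_snps))  # not reached: the full set always works

# Four sampled chromosomes, three distinct haplotypes, four SNPs.
block = ["0011", "0011", "1100", "1111"]
print(minimal_tag_snps(block))  # [0, 2]
```

Here SNPs 0 and 1 are perfectly correlated, as are SNPs 2 and 3, so one SNP from each pair is enough to recover which of the three haplotypes a chromosome carries.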
Affiliation(s)
- Wen-Pei Chen
- Department of Applied Chemistry, Providence University, Taichung 433, Taiwan
| | - Che-Lun Hung
- Department of Computer Science and Communication Engineering, Providence University, Taichung 433, Taiwan
| | - Yaw-Ling Lin
- Department of Computer Science and Information Engineering, Providence University, Taichung 433, Taiwan
|
38
|
Lin WY, Yi N, Lou XY, Zhi D, Zhang K, Gao G, Tiwari HK, Liu N. Haplotype kernel association test as a powerful method to identify chromosomal regions harboring uncommon causal variants. Genet Epidemiol 2013; 37:560-70. [PMID: 23740760 DOI: 10.1002/gepi.21740] [Citation(s) in RCA: 24] [Impact Index Per Article: 2.2] [Received: 12/13/2012] [Revised: 05/01/2013] [Accepted: 05/06/2013] [Indexed: 01/09/2023]
Abstract
For most complex diseases, the fraction of heritability that can be explained by the variants discovered from genome-wide association studies is minor. Although the so-called "rare variants" (minor allele frequency [MAF] < 1%) have attracted increasing attention, they are unlikely to account for much of the "missing heritability" because very few people may carry these rare variants. The genetic variants that are likely to fill in the "missing heritability" include uncommon causal variants (MAF < 5%), which are generally untyped in association studies using tagging single-nucleotide polymorphisms (SNPs) or commercial SNP arrays. Developing powerful statistical methods can help to identify chromosomal regions harboring uncommon causal variants, while bypassing the genome-wide or exome-wide next-generation sequencing. In this work, we propose a haplotype kernel association test (HKAT) that is equivalent to testing the variance component of random effects for distinct haplotypes. With an appropriate weighting scheme given to haplotypes, we can further enhance the ability of HKAT to detect uncommon causal variants. With scenarios simulated according to the population genetics theory, HKAT is shown to be a powerful method for detecting chromosomal regions harboring uncommon causal variants.
Affiliation(s)
- Wan-Yu Lin
- Institute of Epidemiology and Preventive Medicine, College of Public Health, National Taiwan University, Taipei, Taiwan
|
39
|
Tang LL, Chen FY, Wang H, Hu XL, Dai X, Mao J, Shen ZT, Wu YH, Wang SM, Hai J, Yan GJ, Li H, Huang J. Haplotype analysis of eight genes of the monoubiquitinated FANCD2–DNA damage–repair pathway in breast cancer patients. Cancer Epidemiol 2013; 37:311-7. [DOI: 10.1016/j.canep.2012.12.010] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.7] [Received: 09/12/2012] [Revised: 11/20/2012] [Accepted: 12/30/2012] [Indexed: 11/28/2022]
|
40
|
Baxter AG, Jordan MA. From markers to molecular mechanisms: type 1 diabetes in the post-GWAS era. Rev Diabet Stud 2012; 9:201-23. [PMID: 23804261 DOI: 10.1900/rds.2012.9.201] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.0] [Indexed: 12/14/2022] Open
Abstract
By the year 2000, a draft of the human genome sequence was completed. Millions of single-nucleotide polymorphisms (SNPs) had been deposited into public databases, and high throughput technologies were under development for SNP genotyping. At that time, it was predicted that large case control association studies would provide far better resolution and power than genome-wide linkage studies. Type 1 diabetes was one of the first phenotypes to be examined by genome-wide association studies (GWAS), and to date over 50 genomic regions have been associated with the disease. In general, the great majority of these loci individually contribute a relatively small degree of risk, and most loci lie outside of coding sequences. The identification of molecular mechanisms from these genomic data therefore remains a significant challenge. Here, we summarize genetic candidate, linkage, and association studies of type 1 diabetes and discuss a potential strategy to identify mechanisms of disease from genomic data.
Affiliation(s)
- Alan G Baxter
- Comparative Genomics Centre, Molecular Sciences Building 21, James Cook University, Townsville QLD 4811, Australia.
|
41
|
Jiang J, Nakayama T, Shimodaira M, Sato N, Aoi N, Sato M, Izumi Y, Kasamaki Y, Ohta M, Soma M, Matsumoto K, Kawamura H, Ozawa Y, Ma Y. Haplotype of smoothelin gene associated with essential hypertension. Hereditas 2012; 149:178-85. [PMID: 23121329 DOI: 10.1111/j.1601-5223.2012.02242.x] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Indexed: 01/30/2023] Open
Abstract
Smoothelin is a specific cytoskeletal protein that is associated with smooth muscle cells. The human SMTN gene encodes smoothelin-A and smoothelin-B, and studies using SMTN gene knockout mice have demonstrated that these animals develop hypertension. The aim of the present study was to investigate the association between the human SMTN gene and essential hypertension (EH) using a haplotype-based case-control study. This is the first study to assess the association between essential hypertension and this gene. A total of 255 EH patients and 225 controls were genotyped for the five single-nucleotide polymorphisms (rs2074738, rs5997872, rs56095120, rs9621187 and rs10304) used as genetic markers for the human SMTN gene. Data were analyzed for three separate groups: total subjects, men and women. Although there were no differences for genotype distributions, or the dominant and recessive model distributions noted for total subjects, men and women for all of the SNPs selected for the present study, for the total subjects group, the frequency of the G-C-A-C haplotype constructed with rs2074738-rs5997872-rs56095120-rs9621187 was significantly lower in the essential hypertension patients than in the controls (P = 0.002). The G-C-A-C haplotype appears to be a useful protective marker of essential hypertension in Japanese, and the SMTN gene might also be a genetic marker for essential hypertension.
Affiliation(s)
- Jie Jiang
- Division of Laboratory Medicine, Department of Pathology and Microbiology, Nihon University School of Medicine, JP-173-8610 Tokyo, Japan
|
42
|
Liao B, Li X, Zhu W, Cao Z. A novel method to select informative SNPs and their application in genetic association studies. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2012; 9:1529-1534. [PMID: 22585142 DOI: 10.1109/tcbb.2012.70] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Indexed: 05/31/2023]
Abstract
Association studies between complex diseases and single nucleotide polymorphisms (SNPs) or haplotypes have recently received great attention. However, these studies are limited by the cost of genotyping all SNPs. Therefore, it is essential to find a small subset of tag SNPs representing the rest of the SNPs. The presence of linkage disequilibrium between tag SNPs and the disease variant (genotyped or not) may allow fine-mapping studies. In this paper, we combine a nearest-means classifier (NMC) and an ant colony algorithm to select tags. Results show that our method (ACO/NMC) achieves a prediction accuracy similar to that of BPSO/SVM and better than that of BPSO/STAMPA for small data sets. For large data sets, although the prediction accuracy of our method is lower than that of BPSO/SVM, ACO/NMC can reach a high accuracy (>99 percent) in a relatively short time. When the number of tags increases, the time complexity of NMC grows nearly linearly. To assess the ability of tags to locate a disease locus, we simulate a case-control study and use two-locus haplotype analysis to quantitatively assess the power. The results show that when 20 percent of all SNPs are selected as tags by NMC, they have about 10 percent higher power than random tags, on average.
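As an illustration of the nearest-means idea mentioned in this abstract, the following Python sketch predicts the genotype of an untyped SNP from the genotypes at tag SNPs. It is a generic nearest-means classifier under simple assumptions (genotypes coded as 0/1/2 allele counts, squared Euclidean distance), not the authors' ACO/NMC implementation; the function names and the toy data are hypothetical.

```python
def fit_class_means(tag_genotypes, target_genotypes):
    """Return {target genotype: mean tag-SNP vector} over the training individuals."""
    sums, counts = {}, {}
    for row, label in zip(tag_genotypes, target_genotypes):
        acc = sums.setdefault(label, [0.0] * len(row))
        for j, value in enumerate(row):
            acc[j] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [s / counts[label] for s in acc] for label, acc in sums.items()}


def predict(class_means, row):
    """Assign the genotype whose class mean is closest to the tag-SNP vector."""
    def squared_distance(mean):
        return sum((a - b) ** 2 for a, b in zip(row, mean))
    return min(class_means, key=lambda label: squared_distance(class_means[label]))


# Hypothetical training data: two tag SNPs per individual, one target SNP.
tags = [[0, 0], [0, 1], [2, 2], [2, 1]]
target = [0, 0, 2, 2]
means = fit_class_means(tags, target)
print(predict(means, [1, 2]))
```

Prediction costs one distance computation per class and per tag, which is in keeping with the near-linear growth in the number of tags that the abstract reports for NMC.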
Affiliation(s)
- Bo Liao
- College of Information Science and Engineering, Hunan University, Changsha, Hunan 410082, China.
|
43
|
Lin WY, Tiwari HK, Gao G, Zhang K, Arcaroli JJ, Abraham E, Liu N. Similarity-based multimarker association tests for continuous traits. Ann Hum Genet 2012; 76:246-60. [PMID: 22497480 DOI: 10.1111/j.1469-1809.2012.00706.x] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Indexed: 11/30/2022]
Abstract
Testing multiple markers simultaneously not only can capture the linkage disequilibrium patterns but also can decrease the number of tests and thus alleviate the multiple-testing penalty. If a gene is associated with a phenotype, subjects with similar genotypes in this gene should also have similar phenotypes. Based on this concept, we have developed a general framework that is applicable to continuous traits. Two similarity-based tests (namely, SIMc and SIMp tests) were derived as special cases of the general framework. In our simulation study, we compared the power of the two tests with that of the single-marker analysis, a standard haplotype regression, and a popular and powerful kernel machine regression. Our SIMc test outperforms other tests when the average R² (a measure of linkage disequilibrium) between the causal variant and the surrounding markers is larger than 0.3 or when the causal allele is common (say, frequency = 0.3). Our SIMp test outperforms other tests when the causal variant was introduced at common haplotypes (the maximum frequency of risk haplotypes >0.4). We also applied our two tests to an adiposity data set to show their utility.
Affiliation(s)
- Wan-Yu Lin
- Department of Biostatistics, University of Alabama at Birmingham, USA
|
44
|
DNA microarray as a tool in establishing genetic relatedness—Current status and future prospects. Forensic Sci Int Genet 2012; 6:322-9. [PMID: 21813350 DOI: 10.1016/j.fsigen.2011.07.007] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.8] [Received: 03/21/2011] [Revised: 06/12/2011] [Accepted: 07/05/2011] [Indexed: 11/21/2022]
|
45
|
A two-stage association study identifies methyl-CpG-binding domain protein 2 gene polymorphisms as candidates for breast cancer susceptibility. Eur J Hum Genet 2012; 20:682-9. [PMID: 22258532 PMCID: PMC3355265 DOI: 10.1038/ejhg.2011.273] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.3] [Indexed: 12/01/2022] Open
Abstract
Genome-wide association studies for breast cancer have identified over 40 single-nucleotide polymorphisms (SNPs), a subset of which remains statistically significant after genome-wide correction. Improved strategies for mining genome-wide association data have been suggested to address the heritable component of genetic risk in breast cancer. In this study, we attempted a two-stage association design using markers from a genome-wide study (stage 1, Affymetrix Human SNP 6.0 array, cases=302, controls=321). We restricted our analysis to polymorphisms in genes related to DNA repair/modifications/metabolism pathways, because of their obvious role in carcinogenesis in general and their known protein–protein interactions vis-à-vis potential epistatic effects. We selected 22 SNPs based on linkage disequilibrium patterns and high statistical significance. Genotyping assays in an independent replication study of 1178 cases and 1314 controls were attempted using the Sequenom iPLEX Gold platform (stage 2). Six SNPs (rs8094493, rs4041245, rs7614, rs13250873, rs1556459 and rs2297381) showed consistent and statistically significant associations with breast cancer risk in both stages, with allelic odds ratios (and P-values) of 0.85 (0.0021), 0.86 (0.0026), 0.86 (0.0041), 1.17 (0.0043), 1.20 (0.0103) and 1.13 (0.0154), respectively, in combined analysis (N=3115). Of these, three polymorphisms were located in methyl-CpG-binding domain protein 2 gene regions and were in strong linkage disequilibrium. The remaining three SNPs were in proximity to RAD21 homolog (S. pombe), O-6-methylguanine-DNA methyltransferase and RNA polymerase II-associated protein 1. The identified markers may be relevant to breast cancer susceptibility in populations if these findings are confirmed in independent cohorts.
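The allelic odds ratios quoted in this abstract compare allele counts between cases and controls. A minimal Python sketch of that calculation is given here; the genotype counts are hypothetical (the abstract does not report them), and the Woolf log-scale 95% confidence interval is an assumption for illustration, not the authors' stated method.

```python
import math


def allele_counts(genotype_counts):
    """Convert (n_AA, n_Aa, n_aa) genotype counts to (count of 'a', count of 'A')."""
    n_AA, n_Aa, n_aa = genotype_counts
    return 2 * n_aa + n_Aa, 2 * n_AA + n_Aa


def allelic_odds_ratio(cases, controls, z=1.96):
    """Odds ratio for allele 'a' with a Woolf confidence interval.

    Assumes all four allele counts are non-zero.
    """
    a, b = allele_counts(cases)      # 'a' and 'A' alleles in cases
    c, d = allele_counts(controls)   # 'a' and 'A' alleles in controls
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical genotype counts (AA, Aa, aa) for cases and controls.
odds_ratio, lower, upper = allelic_odds_ratio((50, 40, 10), (40, 40, 20))
print(f"OR = {odds_ratio:.2f}, 95% CI {lower:.2f}-{upper:.2f}")
```

An odds ratio below 1, as for the first three SNPs in the abstract, means the counted allele is less frequent in cases than in controls.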
|
46
|
Zhang Y. A novel bayesian graphical model for genome-wide multi-SNP association mapping. Genet Epidemiol 2011; 36:36-47. [PMID: 22127647 DOI: 10.1002/gepi.20661] [Citation(s) in RCA: 33] [Impact Index Per Article: 2.5] [Received: 07/28/2011] [Revised: 09/20/2011] [Accepted: 10/05/2011] [Indexed: 11/10/2022]
Abstract
Most disease association mapping algorithms are based on hypothesis testing procedures that test one variant at a time. Those methods lose power when the disease mutations are jointly tagged by multiple variants, or when gene-gene interactions exist. Nearby variants are also correlated, so procedures that ignore the dependence between variants will inevitably produce redundant results. With a large number of variants genotyped in current genome-wide disease association studies, simultaneous multivariant association mapping algorithms are strongly desired. We present a novel Bayesian method for automatic detection of multivariant joint association in genome-wide case-control studies. Our method has improved power and specificity over existing tools. We fit a joint probabilistic model to the entire data and identify disease variants simultaneously. The method dynamically accounts for the strong linkage disequilibrium (LD) between variants. As a result, only the primary disease variants will be identified, with all secondary associations due to LD effects filtered out. Our method better pinpoints the disease variants with improved resolution. The method is also computationally efficient for genome-wide studies. When applied to a real data set of inflammatory bowel disease (IBD) containing 401,473 variants in 4,720 individuals, our method detected all previously reported IBD loci in the same data, and recovered two missed loci. We further detected two novel interchromosome interactions. The first is between STAT3 and PARD6G, and the second is between DLG5 and an intergenic region at 5p14. We further validated the two interactions in an independent study.
Affiliation(s)
- Yu Zhang
- Department of Statistics, The Pennsylvania State University, University Park, Pennsylvania 16802, USA.
|
47
|
Liu TF, Sung WK, Li Y, Liu JJ, Mittal A, Mao PL. Effective algorithms for tag SNP selection. J Bioinform Comput Biol 2011; 3:1089-106. [PMID: 16278949 DOI: 10.1142/s0219720005001521] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Received: 03/29/2004] [Revised: 04/25/2005] [Accepted: 05/16/2005] [Indexed: 11/18/2022]
Abstract
Single nucleotide polymorphisms (SNPs), due to their abundance and low mutation rate, are very useful genetic markers for genetic association studies. However, current genotyping technology cannot afford to genotype all common SNPs in all the genes. By making use of linkage disequilibrium, we can reduce the experimental cost by genotyping a subset of SNPs, called Tag SNPs, which have a strong association with the ungenotyped SNPs while being as independent of each other as possible. The problem of selecting Tag SNPs is NP-complete. When there is a large number of SNPs, most of the existing Tag SNP selection methods avoid extremely long computational times by first partitioning the SNPs into blocks based on certain block definitions and then selecting Tag SNPs in each block by brute-force search. The size of the Tag SNP set obtained in this way can usually be reduced further because of the inter-dependency among blocks. This paper proposes two algorithms, TSSA and TSSD, to tackle the block-independent Tag SNP selection problem. TSSA is based on the A* search algorithm, and TSSD is a heuristic algorithm. Experiments show that TSSA can find the optimal solutions for medium-sized problems in reasonable time, while TSSD can handle very large problems and report approximate solutions very close to the optimal ones.
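To make the tag SNP selection problem in this abstract concrete, the following Python sketch uses a simple greedy set-cover heuristic: a SNP "tags" another if their squared genotype correlation r² reaches a threshold, and tags are added until every SNP is covered. This is neither TSSA nor TSSD; the 0.8 threshold, the 0/1/2 genotype coding, the function names and the toy data are all assumptions made for illustration.

```python
from itertools import combinations


def r_squared(x, y):
    """Squared Pearson correlation between two genotype vectors (0/1/2 allele counts)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # monomorphic SNP: correlation undefined, treated as not tagged
    return cov * cov / (var_x * var_y)


def greedy_tag_snps(genotypes, threshold=0.8):
    """genotypes: dict mapping SNP name -> list of allele counts per individual."""
    names = list(genotypes)
    # covers[s] holds every SNP that s can tag, including s itself.
    covers = {s: {s} for s in names}
    for a, b in combinations(names, 2):
        if r_squared(genotypes[a], genotypes[b]) >= threshold:
            covers[a].add(b)
            covers[b].add(a)
    untagged, tags = set(names), []
    while untagged:
        # Pick the SNP that tags the most still-untagged SNPs.
        best = max(names, key=lambda s: len(covers[s] & untagged))
        tags.append(best)
        untagged -= covers[best]
    return tags


# Toy data: snp1, snp2 and snp3 are in perfect LD; snp4 is only weakly correlated.
demo = {
    "snp1": [0, 1, 2, 0, 1, 2],
    "snp2": [0, 1, 2, 0, 1, 2],
    "snp3": [2, 1, 0, 2, 1, 0],
    "snp4": [0, 0, 1, 1, 2, 2],
}
print(greedy_tag_snps(demo))
```

A greedy pass like this gives no optimality guarantee, which is the gap that exact search (as in TSSA) or a more careful heuristic (as in TSSD) is meant to address.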
Affiliation(s)
- Tie-Fei Liu
- School of Computing, National University of Singapore & Institute of Bioengineering and Nanotechnology, Singapore.
|
48
|
Xu HP, Zeng H, Zhang DX, Jia XL, Luo CL, Fang MX, Nie QH, Zhang XQ. Polymorphisms associated with egg number at 300 days of age in chickens. Genetics and Molecular Research 2011; 10:2279-89. [PMID: 22002122 DOI: 10.4238/2011.october.3.5] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.2] [Indexed: 11/03/2022]
Abstract
We looked for variations that could be associated with chicken egg number at 300 days of age (EN300) in seven genes of the hypothalamic-pituitary-gonadal axis, including gonadotrophin-releasing hormone-I (GnRH-I), GnRH receptor (GnRHR), neuropeptide Y (NPY), dopamine D2 receptor (DRD2), vasoactive intestinal polypeptide (VIP), VIP receptor-1 (VIPR-1), prolactin (PRL), and the QTL region between 87 and 105 cM of the Z chromosome. Ten mutations in the seven genes were chosen to do marker-trait association analyses in a population comprising 1310 chickens, which were obtained from a company located in Guangdong Province of China. The C1704887T of VIPR-1 was found to have a highly significant association with EN300. The T5841629C of DRD2 and the C1715301T of VIPR-1 were significantly associated with EN300. A highly significant association was also found between the C1704887T-C1715301T haplotypes of VIPR-1 and EN300. H1H3 had the highest EN300. Four PCR-RFLP variations in the candidate QTL region were selected to investigate their genetic effects on EN300. The haplotypes of T32742468C-G32742603A in this region showed a highly significant association with EN300. Bioinformatics analyses showed that both T32742468C and G32742603A were located in intron 1 of the SH3-domain GRB2-like 2 (SH3GL2) gene. We conclude that five SNPs, including C1704887T and C1715301T of VIPR-1, T5841629C of DRD2, and T32742468C and G32742603A of SH3GL2, would be useful as markers for breeding to increase chicken EN300.
Affiliation(s)
- H P Xu
- Department of Animal Genetics, Breeding and Reproduction, College of Animal Science, South China Agricultural University, Guangzhou, Guangdong, China
|
49
|
Zhang BY, Zhang J, Liu JS. Block-based Bayesian epistasis association mapping with application to WTCCC type 1 diabetes data. Ann Appl Stat 2011; 5:2052-2077. [PMID: 22140419 PMCID: PMC3226821 DOI: 10.1214/11-aoas469] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.7] [Indexed: 01/18/2023]
Abstract
Interactions among multiple genes across the genome may contribute to the risks of many complex human diseases. Whole-genome single nucleotide polymorphisms (SNPs) data collected for many thousands of SNP markers from thousands of individuals under the case-control design promise to shed light on our understanding of such interactions. However, nearby SNPs are highly correlated due to linkage disequilibrium (LD) and the number of possible interactions is too large for exhaustive evaluation. We propose a novel Bayesian method for simultaneously partitioning SNPs into LD-blocks and selecting SNPs within blocks that are associated with the disease, either individually or interactively with other SNPs. When applied to homogeneous population data, the method gives posterior probabilities for LD-block boundaries, which not only result in accurate block partitions of SNPs, but also provide measures of partition uncertainty. When applied to case-control data for association mapping, the method implicitly filters out SNP associations created merely by LD with disease loci within the same blocks. Simulation study showed that this approach is more powerful in detecting multi-locus associations than other methods we tested, including one of ours. When applied to the WTCCC type 1 diabetes data, the method identified many previously known T1D associated genes, including PTPN22, CTLA4, MHC, and IL2RA. The method also revealed some interesting two-way associations that are undetected by single SNP methods. Most of the significant associations are located within the MHC region. Our analysis showed that the MHC SNPs form long-distance joint associations over several known recombination hotspots. By controlling the haplotypes of the MHC class II region, we identified additional associations in both MHC class I (HLA-A, HLA-B) and class III regions (BAT1). We also observed significant interactions between genes PRSS16, ZNF184 in the extended MHC region and the MHC class II genes. 
The proposed method can be broadly applied to the classification problem with correlated discrete covariates.
Affiliation(s)
- Yu Zhang
- Department of Statistics, Pennsylvania State University, 422A Thomas, University Park, Pennsylvania 16802, USA
- Jing Zhang
- Department of Statistics, Yale University, 24 Hillhouse Ave., New Haven, Connecticut 06511, USA
- Jun S. Liu
- Department of Statistics, Harvard University, Science Center, 1 Oxford St., Cambridge, Massachusetts 02138, USA
|
50
|
Cui MF, Gao YF, Lv F, Li N, Zhang ZH, Li X, Su F. Association between polymorphisms of the T cell immunoglobulin musin-3 gene and outcome of hepatitis B virus infection. Shijie Huaren Xiaohua Zazhi 2011; 19:1506-1510. [DOI: 10.11569/wcjd.v19.i14.1506] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/06/2023] Open
Abstract
AIM: To investigate the relationship between single nucleotide polymorphisms (SNPs) of the T cell immunoglobulin mucin-3 (Tim-3) gene and outcome of hepatitis B virus (HBV) infection in a Chinese Han population.
METHODS: Two tagSNPs of the Tim-3 gene (rs11741184 and rs13170556) were genotyped using the SNaPshot method in 996 patients with chronic HBV infection and 301 patients with acute self-limiting HBV infection. The genotypes, allele frequencies and haplotypes of the two Tim-3 tagSNPs were compared between the two groups of patients.
RESULTS: The frequencies of CC, CG and GG genotypes at the rs11741184 locus were 84.39% (254/301), 15.28% (46/301) and 0.3% (1/301) in patients with acute self-limiting HBV infection, and 86.04% (857/996), 13.65% (136/996) and 0.3% (3/996) in patients with chronic HBV infection, respectively. There were no statistical differences in the genotype frequencies at the rs11741184 locus between the two groups of patients (all P > 0.05). The frequencies of AA, GA and GG genotypes at the rs13170556 locus were 68.77% (207/301), 28.57% (86/301) and 2.66% (8/301) in patients with acute self-limiting HBV infection, and 68.07% (678/996), 28.41% (283/996) and 3.51% (35/996) in patients with chronic HBV infection, respectively. There were also no statistical differences in the genotype frequencies at the rs13170556 locus between the two groups of patients (all P > 0.05). Three haplotypes for Tim-3 tagSNPs (C-A, C-G, G-A) were found in the Chinese Han population, and their frequencies were similar between patients with acute self-limiting HBV infection and those with chronic HBV infection (75.08% vs 75.15%, 16.94% vs 17.72%, and 7.97% vs 7.13%, respectively; all P > 0.05).
CONCLUSION: The two Tim-3 tagSNPs may not be associated with outcome of HBV infection in the Chinese Han population.
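The genotype comparison reported for rs11741184 can be reproduced approximately from the counts given in the abstract. The Python sketch below applies a Pearson chi-square test to the 2x3 table of genotype counts; the abstract does not say which test the authors used, so the choice of test is an assumption, and because the expected GG counts are very small an exact test may be more appropriate in practice.

```python
import math


def chi_square_2x3(row_a, row_b):
    """Pearson chi-square statistic and P value for a 2x3 contingency table."""
    rows = [row_a, row_b]
    row_totals = [sum(r) for r in rows]
    col_totals = [a + b for a, b in zip(row_a, row_b)]
    n = sum(row_totals)
    chi2 = 0.0
    for row, row_total in zip(rows, row_totals):
        for observed, col_total in zip(row, col_totals):
            expected = row_total * col_total / n
            chi2 += (observed - expected) ** 2 / expected
    # With 2 degrees of freedom the chi-square survival function is exp(-x / 2).
    p_value = math.exp(-chi2 / 2)
    return chi2, p_value


# CC, CG, GG genotype counts at rs11741184, taken from the abstract.
acute = (254, 46, 1)      # acute self-limiting HBV infection (n = 301)
chronic = (857, 136, 3)   # chronic HBV infection (n = 996)

chi2, p_value = chi_square_2x3(acute, chronic)
print(f"chi2 = {chi2:.3f}, P = {p_value:.3f}")
```

Under these assumptions the P value comes out well above 0.05, in line with the abstract's statement that genotype frequencies at this locus do not differ between the two groups.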
|