1
Sivabharathi RC, Rajagopalan VR, Suresh R, Sudha M, Karthikeyan G, Jayakanthan M, Raveendran M. Haplotype-based breeding: A new insight in crop improvement. Plant Science: An International Journal of Experimental Plant Biology 2024; 346:112129. [PMID: 38763472 DOI: 10.1016/j.plantsci.2024.112129] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/15/2024] [Revised: 05/09/2024] [Accepted: 05/15/2024] [Indexed: 05/21/2024]
Abstract
Haplotype-based breeding (HBB) is a cutting-edge approach to crop improvement, made possible by the increasing availability of single nucleotide polymorphisms (SNPs) identified by next-generation sequencing technologies. Combining thousands of SNPs into a few hundred haplotype blocks reduces the complexity of the data, requiring fewer statistical tests and lowering the probability of spurious associations. The presence of strong genomic regions in breeding lines of most crop species facilitates the use of haplotypes to improve the efficiency of genomic and marker-assisted selection. As a genomics-assisted breeding (GAB) approach, HBB harnesses genome sequence data to pinpoint allelic variation that can hasten the breeding cycle and circumvent the challenges associated with linkage drag. This review describes how to identify candidate genes and superior haplotypes, and how to perform haplo-pheno analysis and haplotype-based marker-assisted selection. Crop improvement strategies that use superior haplotypes will hasten breeding progress and help safeguard global food security.
Affiliation(s)
- R C Sivabharathi
  - Department of Genetics and Plant breeding, CPBG, Tamil Nadu Agricultural University, Coimbatore 641003, India
- Veera Ranjani Rajagopalan
  - Department of Plant Biotechnology, Centre for Plant Molecular Biology and Biotechnology, Tamil Nadu Agricultural University, Coimbatore, 641003, India
- R Suresh
  - Department of Rice, CPBG, Tamil Nadu Agricultural University, Coimbatore 641003, India
- M Sudha
  - Department of Plant Biotechnology, Centre for Plant Molecular Biology and Biotechnology, Tamil Nadu Agricultural University, Coimbatore, 641003, India.
- G Karthikeyan
  - Department of Plant Pathology, CPPS, Tamil Nadu Agricultural University, Coimbatore 641003, India
- M Jayakanthan
  - Department of Plant Molecular Biology and Bioinformatics, Centre for Plant Molecular Biology and Biotechnology, Tamil Nadu Agricultural University, Coimbatore 641003, India
- M Raveendran
  - Directorate of research, Tamil Nadu Agricultural University, Coimbatore 641003, India.
2
Pevzner P, Vingron M, Reidys C, Sun F, Istrail S. Michael Waterman's Contributions to Computational Biology and Bioinformatics. J Comput Biol 2022; 29:601-615. [PMID: 35727100 DOI: 10.1089/cmb.2022.29066.pp] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/13/2022] Open
Abstract
On the occasion of Dr. Michael Waterman's 80th birthday, we review his major contributions to the field of computational biology and bioinformatics including the famous Smith-Waterman algorithm for sequence alignment, the probability and statistics theory related to sequence alignment, algorithms for sequence assembly, the Lander-Waterman model for genome physical mapping, combinatorics and predictions of ribonucleic acid structures, word counting statistics in molecular sequences, alignment-free sequence comparison, and algorithms for haplotype block partition and tagSNP selection related to the International HapMap Project. His books Introduction to Computational Biology: Maps, Sequences and Genomes for graduate students and Computational Genome Analysis: An Introduction geared toward undergraduate students played key roles in computational biology and bioinformatics education. We also highlight his efforts of building the computational biology and bioinformatics community as the founding editor of the Journal of Computational Biology and a founding member of the International Conference on Research in Computational Molecular Biology (RECOMB).
Affiliation(s)
- Pavel Pevzner
  - Department of Computer Science and Engineering, University of California San Diego, San Diego, California, USA
- Martin Vingron
  - Department of Computational Molecular Biology, Max Planck Institute for Molecular Genetics, Berlin, Germany
- Christian Reidys
  - Department of Mathematics, Biocomplexity Institute & Initiative, University of Virginia, Charlottesville, Virginia, USA
- Fengzhu Sun
  - Department of Quantitative and Computational Biology, University of Southern California, Los Angeles, California, USA
- Sorin Istrail
  - Department of Computer Science, Center for Computational Molecular Biology, Brown University, Providence, Rhode Island, USA
3
Wu Y, Yang H, Xiao C. Genetic association study of prolylcarboxypeptidase polymorphisms with susceptibility to essential hypertension in the Yi minority of China: A case-control study based on an isolated population. J Renin Angiotensin Aldosterone Syst 2020; 21:1470320320919586. [PMID: 32448049 PMCID: PMC7249571 DOI: 10.1177/1470320320919586] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Indexed: 12/17/2022] Open
Abstract
Objective: Prolylcarboxypeptidase (PRCP) is a negative regulator of the pressor actions of the renin–angiotensin–aldosterone system. It is also involved in the kallikrein–kinin system. This gene has an important role in blood pressure (BP) regulation. Methods: A case–control study was performed for 615 Yi participants (303 cases and 312 controls) from a remote mountainous area in Yunnan Province of China. For the PRCP gene, 11 tag single-nucleotide polymorphisms were genotyped using the polymerase chain reaction-restriction fragment length polymorphism method. Results: The PRCP gene rs12290550 was associated with the occurrence of essential hypertension (EH) and BP traits. Logistic regression analysis indicated that the rs12290550 T allele was significantly linked to the risk of EH (odds ratio (OR) = 1.85, 95% confidence interval (CI) 1.44–2.39, p = 0.2 × 10⁻⁵). Under Bonferroni correction, the H7 TAGCACTAACA haplotype containing the risk allele rs12290550 T increased the risk of EH (OR = 4.53, 95% CI 2.29–8.93, p = 0.2 × 10⁻⁵). Conclusions: The findings of this study demonstrate the strong association of the PRCP gene with EH. rs12290550 may be a useful genetic predictor of EH in the Yi minority.
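The odds ratios quoted in this abstract come from logistic regression. As a rough illustration of how such an estimate is formed, the sketch below computes an unadjusted allelic odds ratio with a Woolf (log-scale) 95% interval from a 2×2 table, together with the Bonferroni-adjusted significance threshold for the 11 tag SNPs tested. This is a simpler stand-in for the study's regression model; the function names are illustrative, and any counts passed in are hypothetical rather than the study's data.

```python
import math


def odds_ratio_ci(case_risk, case_other, ctrl_risk, ctrl_other, z=1.96):
    """Unadjusted odds ratio and Woolf confidence interval from a 2x2 table.

    Arguments are allele counts: risk / other allele in cases and controls.
    z=1.96 gives an approximate 95% interval.
    """
    counts = (case_risk, case_other, ctrl_risk, ctrl_other)
    if any(c <= 0 for c in counts):
        raise ValueError("all four cell counts must be positive")
    odds_ratio = (case_risk * ctrl_other) / (case_other * ctrl_risk)
    # Standard error of log(OR) under the Woolf approximation.
    se_log_or = math.sqrt(sum(1.0 / c for c in counts))
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold after Bonferroni correction."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests


if __name__ == "__main__":
    # Hypothetical allele counts, for illustration only.
    print(odds_ratio_ci(20, 10, 10, 20))
    # 11 tag SNPs were genotyped in the study; alpha = 0.05 is an assumption.
    print(bonferroni_threshold(0.05, 11))
```

An adjusted odds ratio, as reported in association studies of this kind, would additionally condition on covariates, which a 2×2 table cannot do.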
Affiliation(s)
- Yanrui Wu
  - Cell Biology and Genetics Department, Kunming Medical University, China
  - School of Medicine, Yunnan University, China
- Hongju Yang
  - The First Affiliated Hospital of Kunming Medical University, Kunming Medical University, China
4
Wang X, Wang S, Meng X. A novel SNP-set analytical method without distinguishing common variants or rare variants in genome-wide association study. Int J Biomath 2018. [DOI: 10.1142/s1793524518500948] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Indexed: 11/18/2022]
Abstract
Single nucleotide polymorphism (SNP)-set analysis in genome-wide association studies (GWASs) has become a hot topic. Most existing SNP-set analytic methods are designed for, and work well with, either common or rare variants, depending on the nature of the associated disease. However, whether the disease-associated variants are common or rare cannot be known in advance. Therefore, in this research, we propose a new and powerful weight function method that selects a tagging SNP-set without distinguishing common from rare variants. We applied our selection method to the sequence kernel association test (SKAT) and compared its power with that of some existing methods. The simulation results showed that our method has higher power than SKAT both in the unweighted case and with other weight functions. Moreover, the power improves significantly when the minor allele frequency (MAF) of the causal SNP is relatively small.
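The abstract does not state the proposed weight function, so it is not reproduced here. The sketch below only illustrates the kind of existing MAF-based weights such a method is compared against: a flat (unweighted) scheme, the Beta(MAF; 1, 25) density that SKAT uses by default, and the Madsen-Browning weight 1/sqrt(MAF(1 − MAF)). Function names are illustrative assumptions.

```python
import math


def beta_density(x, a, b):
    """Density of the Beta(a, b) distribution at x, for 0 < x < 1."""
    if not 0.0 < x < 1.0:
        raise ValueError("x must lie strictly between 0 and 1")
    norm = math.gamma(a + b) / (math.gamma(a) * math.gamma(b))
    return norm * x ** (a - 1.0) * (1.0 - x) ** (b - 1.0)


def flat_weight(maf):
    """Unweighted case: every variant contributes equally."""
    return 1.0


def beta_1_25_weight(maf):
    """Beta(MAF; 1, 25) weight, which strongly up-weights rare variants."""
    return beta_density(maf, 1.0, 25.0)


def madsen_browning_weight(maf):
    """Madsen-Browning style weight, 1 / sqrt(MAF * (1 - MAF))."""
    if not 0.0 < maf < 1.0:
        raise ValueError("maf must lie strictly between 0 and 1")
    return 1.0 / math.sqrt(maf * (1.0 - maf))


if __name__ == "__main__":
    for maf in (0.005, 0.05, 0.3):
        print(maf, flat_weight(maf), beta_1_25_weight(maf),
              madsen_browning_weight(maf))
```

Both MAF-dependent schemes favour rare variants, which is why a method that performs well without knowing the variant frequency class in advance has to be compared against each of them.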
Affiliation(s)
- Xinzeng Wang
  - State Key Laboratory of Mining Disaster Prevention and Control Co-founded by Shandong Province and the Ministry of Science and Technology, Shandong University of Science and Technology, Qingdao 266590, P. R. China
  - College of Mathematics and Systems Science, Shandong University of Science and Technology, Qingdao 266510, P. R. China
- Shudong Wang
  - College of Computer and Communication Engineering, China University of Petroleum (East China), Qingdao, Shandong 266580, P. R. China
- Xinzhu Meng
  - State Key Laboratory of Mining Disaster Prevention and Control Co-founded by Shandong Province and the Ministry of Science and Technology, Shandong University of Science and Technology, Qingdao 266590, P. R. China
  - College of Mathematics and Systems Science, Shandong University of Science and Technology, Qingdao 266510, P. R. China
5
Wang MD, Dzama K, Hefer CA, Muchadeyi FC. Genomic population structure and prevalence of copy number variations in South African Nguni cattle. BMC Genomics 2015; 16:894. [PMID: 26531252 PMCID: PMC4632335 DOI: 10.1186/s12864-015-2122-z] [Citation(s) in RCA: 31] [Impact Index Per Article: 3.4] [Received: 06/21/2015] [Accepted: 10/22/2015] [Indexed: 12/21/2022] Open
Abstract
Background: Copy number variations (CNVs) are modifications in DNA structure comprising deletions, duplications, insertions and complex multi-site variants. Although CNVs are proven to be involved in a variety of phenotypic discrepancies, the full extent and consequence of CNVs is yet to be understood. To date, no such genomic characterization has been performed in indigenous South African Nguni cattle. Nguni cattle are recognized for their ability to sustain harsh environmental conditions while exhibiting enhanced resistance to disease and parasites and are thought to comprise up to nine different ecotypes. Methods: Illumina BovineSNP50 Beadchip data were used to investigate genomic population structure and the prevalence of CNVs in 492 South African Nguni cattle. PLINK, ADMIXTURE, R, gPLINK and Haploview software were used for quality control, population structure and haplotype block determination. The PennCNV hidden Markov model was used to identify CNVs and the genes contained within and 10 Mb downstream from reported CNVs. PANTHER and Ensembl databases were subsequently used for gene annotation analyses. Results: Population structure analyses on Nguni cattle revealed 5 sub-populations with a possible sub-structure evident at K equal to 8. Four hundred and thirty three CNVs that formed 334 CNV regions (CNVRs) ranging from 30 kb to 1 Mb in size are reported. Only 231 of the 492 animals demonstrated CNVRs. Two hundred and eighty nine genes were observed within the CNVRs identified. Of these, 149, 28, 44, 2 and 14 genes were unique to sub-populations A, B, C, D and E respectively. Gene ontology analyses demonstrated a number of pathways to be represented by the respective genes, including immune response, response to abiotic stress and biological regulation processes. Conclusions: CNVs may explain part of the phenotypic diversity and the enhanced adaptation evident in Nguni cattle. Genes involved in a number of cellular components, biological processes and molecular functions are reported within the CNVRs identified. The significance of such CNVRs and their possible effects need to be ascertained and may hold interesting insight into the functional and adaptive consequence of CNVs in cattle. Electronic supplementary material: The online version of this article (doi:10.1186/s12864-015-2122-z) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Magretha Diane Wang
  - Department of Animal Sciences, University of Stellenbosch, Private Bag X1, Matieland, Stellenbosch, 7602, South Africa.
  - Biotechnology Platform, Agricultural Research Council, Private Bag X5, Onderstepoort, 0110, South Africa.
- Kennedy Dzama
  - Department of Animal Sciences, University of Stellenbosch, Private Bag X1, Matieland, Stellenbosch, 7602, South Africa.
- Charles A Hefer
  - Biotechnology Platform, Agricultural Research Council, Private Bag X5, Onderstepoort, 0110, South Africa.
- Farai C Muchadeyi
  - Biotechnology Platform, Agricultural Research Council, Private Bag X5, Onderstepoort, 0110, South Africa.
6
Evaluating information content of SNPs for sample-tagging in re-sequencing projects. Sci Rep 2015; 5:10247. [PMID: 25975447 PMCID: PMC4432563 DOI: 10.1038/srep10247] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.0] [Received: 09/16/2014] [Accepted: 04/07/2015] [Indexed: 12/31/2022] Open
Abstract
Sample-tagging is designed for identification of accidental sample mix-up, which is a major issue in re-sequencing studies. In this work, we develop a model to measure the information content of SNPs, so that we can optimize a panel of SNPs that approaches the maximal information for discrimination. The analysis shows that as few as 60 optimized SNPs can differentiate the individuals in a population as large as the present world, and only 30 optimized SNPs are in practice sufficient for labeling up to 100 thousand individuals. In the simulated populations of 100 thousand individuals, the average Hamming distances generated by the optimized set of 30 SNPs are larger than 18, and the duality frequency is lower than 1 in 10 thousand. This strategy of sample discrimination proved robust across large sample sizes and different datasets. The optimized sets of SNPs are designed for Whole Exome Sequencing, and a program is provided for SNP selection, allowing for customized SNP numbers and genes of interest. The sample-tagging plan based on this framework will improve re-sequencing projects in terms of reliability and cost-effectiveness.
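The two panel statistics mentioned in this abstract can be sketched as follows. The code assumes each sample's tag is a tuple of genotype codes (0, 1 or 2) over the panel SNPs, and it reads "duality frequency" as the fraction of sample pairs whose tags are identical; both the encoding and that reading are assumptions, not definitions taken from the paper.

```python
from itertools import combinations


def hamming_distance(tag_a, tag_b):
    """Number of panel SNPs at which two genotype tags differ."""
    if len(tag_a) != len(tag_b):
        raise ValueError("tags must cover the same SNP panel")
    return sum(a != b for a, b in zip(tag_a, tag_b))


def panel_summary(tags):
    """Mean pairwise Hamming distance and fraction of identical pairs."""
    if len(tags) < 2:
        raise ValueError("need at least two samples")
    distances = [hamming_distance(a, b) for a, b in combinations(tags, 2)]
    mean_distance = sum(distances) / len(distances)
    identical_fraction = sum(d == 0 for d in distances) / len(distances)
    return mean_distance, identical_fraction


if __name__ == "__main__":
    # Hypothetical four-SNP tags for three samples; two of them collide.
    samples = [(0, 1, 2, 0), (0, 1, 2, 0), (2, 1, 0, 0)]
    print(panel_summary(samples))
```

The all-pairs loop is quadratic in the number of samples, so for populations of the size simulated in the study one would sample pairs or count identical tags with a hash map instead.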
7
Wu Z, Huang C, Zhou T, Lin J, Zhang K, Li W, Zheng J, Chen B, Wang B, Zhang X, Xing J. Association of polymorphisms in AGTR1 and AGTR2 genes with primary aldosteronism in the Chinese Han population. J Renin Angiotensin Aldosterone Syst 2014; 16:880-7. [PMID: 25172908 DOI: 10.1177/1470320314534511] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Indexed: 10/24/2022] Open
Abstract
HYPOTHESIS Polymorphisms in angiotensin II type-1/2 receptor genes (AGTR1/AGTR2) may be involved in the pathogenesis of primary aldosteronism. The present study aims to reveal some loci susceptible to the disease on the genes in a group of Chinese Han nationality. MATERIALS AND METHODS A case-control study was conducted in 202 patients and 188 controls. Ten tagging SNPs on AGTR1/AGTR2 were genotyped for all subjects via the method of multiplex PCR-ligase detection reaction. Statistical analysis was performed with chi-square test and logistic regression analysis. RESULTS rs3772616 on the AGTR1 gene was a factor for susceptibility to primary aldosteronism (p<0.001), and the TT genotype significantly decreased the risk of primary aldosteronism compared with the CC homozygote (p=0.008, adjusted OR=0.13; 95%CI: 0.03-0.59). The rs3772616 polymorphism was associated with primary aldosteronism under the additive and dominant models. The female carriers of the G allele in rs5193 showed a significant difference compared with the T allele. CONCLUSIONS The AGTR1 rs3772616 polymorphism can be considered as a hereditary marker for primary aldosteronism, and in the Chinese Han population the rs5193 G allele seems to predispose to it only in women.
Affiliation(s)
- Zhun Wu
  - Department of Urology, The First Affiliated Hospital of Xiamen University, Xiamen, Fujian, China
- Chao Huang
  - Department of Urology, The First Affiliated Hospital of Xiamen University, Xiamen, Fujian, China
- Tingting Zhou
  - Department of Urology, Chengdu Military General Hospital, Chengdu, Sichuan, China
- Jinglai Lin
  - Department of Urology, The First Affiliated Hospital of Xiamen University, Xiamen, Fujian, China
- Kaiyan Zhang
  - Department of Urology, The First Affiliated Hospital of Xiamen University, Xiamen, Fujian, China
- Wei Li
  - Department of Urology, The First Affiliated Hospital of Xiamen University, Xiamen, Fujian, China
- Jiaxin Zheng
  - Department of Urology, The First Affiliated Hospital of Xiamen University, Xiamen, Fujian, China
- Bin Chen
  - Department of Urology, The First Affiliated Hospital of Xiamen University, Xiamen, Fujian, China
- Baojun Wang
  - Department of Urology, Chinese PLA General Hospital, Beijing, China
- Xu Zhang
  - Department of Urology, Chinese PLA General Hospital, Beijing, China
- Jinchun Xing
  - Department of Urology, The First Affiliated Hospital of Xiamen University, Xiamen, Fujian, China
8
Corbin LJ, Kranis A, Blott SC, Swinburne JE, Vaudin M, Bishop SC, Woolliams JA. The utility of low-density genotyping for imputation in the Thoroughbred horse. Genet Sel Evol 2014; 46:9. [PMID: 24495673 PMCID: PMC3930001 DOI: 10.1186/1297-9686-46-9] [Citation(s) in RCA: 27] [Impact Index Per Article: 2.7] [Received: 03/15/2013] [Accepted: 12/20/2013] [Indexed: 12/21/2022] Open
Abstract
Background Despite the dramatic reduction in the cost of high-density genotyping that has occurred over the last decade, it remains one of the limiting factors for obtaining the large datasets required for genomic studies of disease in the horse. In this study, we investigated the potential for low-density genotyping and subsequent imputation to address this problem. Results Using the haplotype phasing and imputation program, BEAGLE, it is possible to impute genotypes from low- to high-density (50K) in the Thoroughbred horse with reasonable to high accuracy. Analysis of the sources of variation in imputation accuracy revealed dependence both on the minor allele frequency of the single nucleotide polymorphisms (SNPs) being imputed and on the underlying linkage disequilibrium structure. Whereas equidistant spacing of the SNPs on the low-density panel worked well, optimising SNP selection to increase their minor allele frequency was advantageous, even when the panel was subsequently used in a population of different geographical origin. Replacing base pair position with linkage disequilibrium map distance reduced the variation in imputation accuracy across SNPs. Whereas a 1K SNP panel was generally sufficient to ensure that more than 80% of genotypes were correctly imputed, other studies suggest that a 2K to 3K panel is more efficient to minimize the subsequent loss of accuracy in genomic prediction analyses. The relationship between accuracy and genotyping costs for the different low-density panels, suggests that a 2K SNP panel would represent good value for money. Conclusions Low-density genotyping with a 2K SNP panel followed by imputation provides a compromise between cost and accuracy that could promote more widespread genotyping, and hence the use of genomic information in horses. 
In addition to offering a low cost alternative to high-density genotyping, imputation provides a means to combine datasets from different genotyping platforms, which is becoming necessary since researchers are starting to use the recently developed equine 70K SNP chip. However, more work is needed to evaluate the impact of between-breed differences on imputation accuracy.
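The "percentage of genotypes correctly imputed" reported in this abstract is a concordance rate, and the abstract notes that accuracy depends on minor allele frequency. A minimal sketch of both quantities, assuming genotypes are coded as 0/1/2 copies of one allele and missing calls as None (these conventions and the function names are assumptions, not taken from the paper or from BEAGLE):

```python
def concordance(true_genotypes, imputed_genotypes):
    """Fraction of genotypes imputed correctly, ignoring missing calls."""
    if len(true_genotypes) != len(imputed_genotypes):
        raise ValueError("genotype lists must have equal length")
    compared = [
        (t, i)
        for t, i in zip(true_genotypes, imputed_genotypes)
        if t is not None and i is not None
    ]
    if not compared:
        raise ValueError("no genotype pairs to compare")
    return sum(t == i for t, i in compared) / len(compared)


def minor_allele_frequency(genotypes):
    """MAF of one SNP from 0/1/2 allele counts, ignoring missing calls."""
    called = [g for g in genotypes if g is not None]
    if not called:
        raise ValueError("no called genotypes")
    freq = sum(called) / (2 * len(called))
    return min(freq, 1.0 - freq)


if __name__ == "__main__":
    # Hypothetical genotypes for one SNP across five animals.
    truth = [0, 1, 2, 1, None]
    imputed = [0, 1, 1, 1, 2]
    print(concordance(truth, imputed), minor_allele_frequency(truth))
```

Raw concordance can look high for low-MAF SNPs simply because most animals share the common homozygote, which is one reason to report accuracy alongside MAF.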
Affiliation(s)
- John A Woolliams
  - Roslin Institute and Royal (Dick) School of Veterinary Studies, University of Edinburgh, Easter Bush, Midlothian EH25 9RG, UK.
9
Taliun D, Gamper J, Pattaro C. Efficient haplotype block recognition of very long and dense genetic sequences. BMC Bioinformatics 2014; 15:10. [PMID: 24423111 PMCID: PMC3898000 DOI: 10.1186/1471-2105-15-10] [Citation(s) in RCA: 29] [Impact Index Per Article: 2.9] [Received: 07/09/2013] [Accepted: 12/18/2013] [Indexed: 11/10/2022] Open
Abstract
Background: The new sequencing technologies make it possible to scan very long and dense genetic sequences, obtaining datasets of genetic markers that are an order of magnitude larger than previously available. Such genetic sequences are characterized by common alleles interspersed with multiple rarer alleles. This situation has renewed the interest in identifying haplotypes carrying the rare risk alleles. However, large scale explorations of the linkage-disequilibrium (LD) pattern to identify haplotype blocks are not easy to perform, because traditional algorithms have at least Θ(n²) time and memory complexity. Results: We derived three incremental optimizations of the widely used haplotype block recognition algorithm proposed by Gabriel et al. in 2002. Our most efficient solution, called MIG++, has only Θ(n) memory complexity and, on a genome-wide scale, it omits >80% of the calculations, which makes it an order of magnitude faster than the original algorithm. Unlike existing software, MIG++ analyzes the LD between SNPs at any distance, avoiding restrictions on the maximal block length. The haplotype block partition of the entire HapMap II CEPH dataset was obtained in 457 hours. By replacing the standard likelihood-based D′ variance estimator with an approximated estimator, the runtime was further improved. While producing a coarser partition, the approximate method made it possible to obtain the full-genome haplotype block partition of the entire 1000 Genomes Project CEPH dataset in 44 hours, with no restrictions on allele frequency or long-range correlations. These experiments showed that LD-based haplotype blocks can span more than one million base-pairs in both HapMap II and 1000 Genomes datasets. An application to the North American Rheumatoid Arthritis Consortium (NARAC) dataset shows how MIG++ can support genome-wide haplotype association studies. Conclusions: MIG++ makes it possible to perform LD-based haplotype block recognition on genetic sequences of any length and density. In the new generation sequencing era, this can help identify haplotypes that carry rare variants of interest. The low computational requirements open the possibility of including the haplotype block structure in genome-wide association scans, downstream analyses, and visual interfaces for online genome browsers.
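LD-based block recognition rests on a pairwise LD measure evaluated for every SNP pair, which is where the quadratic cost comes from. The sketch below computes point estimates of D′ and r² for one pair from phased haplotypes coded 0/1. It is a simplified illustration only: the Gabriel et al. procedure works with confidence bounds on D′ rather than point estimates, and neither those bounds nor the MIG++ optimizations are implemented here. The function name is an assumption.

```python
def ld_stats(snp_a, snp_b):
    """Point estimates of D' and r^2 for two biallelic SNPs.

    snp_a and snp_b list the allele (0 or 1) carried at each SNP by the
    same phased haplotypes, in the same order.
    """
    if len(snp_a) != len(snp_b) or not snp_a:
        raise ValueError("need two equally long, non-empty allele lists")
    n = len(snp_a)
    p_a = sum(snp_a) / n                      # frequency of allele 1 at SNP A
    p_b = sum(snp_b) / n                      # frequency of allele 1 at SNP B
    p_ab = sum(1 for a, b in zip(snp_a, snp_b) if a == 1 and b == 1) / n
    d = p_ab - p_a * p_b                      # raw disequilibrium coefficient
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    r_squared = d * d / denom if denom > 0 else 0.0
    return d_prime, r_squared


if __name__ == "__main__":
    # Hypothetical haplotypes: complete LD, then no LD.
    print(ld_stats([1, 1, 0, 0], [1, 1, 0, 0]))
    print(ld_stats([1, 1, 0, 0], [1, 0, 1, 0]))
```

Evaluating this for all pairs among n SNPs takes on the order of n² operations, which is the cost that the omitted calculations in the abstract are meant to reduce.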
Affiliation(s)
- Daniel Taliun
  - Center for Biomedicine, European Academy of Bolzano/Bozen (EURAC), Bozen-Bolzano, Italy.
10
Chen WP, Hung CL, Lin YL. Efficient haplotype block partitioning and tag SNP selection algorithms under various constraints. Biomed Res Int 2013; 2013:984014. [PMID: 24319694 PMCID: PMC3844216 DOI: 10.1155/2013/984014] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Received: 04/29/2013] [Accepted: 09/05/2013] [Indexed: 11/18/2022]
Abstract
Patterns of linkage disequilibrium play a central role in genome-wide association studies aimed at identifying genetic variation responsible for common human diseases. These patterns in human chromosomes show a block-like structure, and regions of high linkage disequilibrium are called haplotype blocks. A small subset of SNPs, called tag SNPs, is sufficient to capture the haplotype patterns in each haplotype block. Previously developed algorithms completely partition a haplotype sample into blocks while attempting to minimize the number of tag SNPs. However, when resource limitations prevent genotyping all the tag SNPs, it is desirable to restrict their number. We propose two dynamic programming algorithms, incorporating many diversity evaluation functions, for haplotype block partitioning using a limited number of tag SNPs. We use the proposed algorithms to partition the chromosome 21 haplotype data. When the sample is fully partitioned into blocks by our algorithms, the 2,266 blocks and 3,260 tag SNPs are fewer than those identified by previous studies. We also demonstrate that our algorithms find the optimal solution by exploiting the nonmonotonic property of a common haplotype-evaluation function.
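To make the notion of tag SNPs within a block concrete, the sketch below finds a smallest set of SNP positions that still distinguishes every distinct haplotype in one block. It is a brute-force illustration for small blocks only, not the dynamic programming algorithms proposed in the paper, and it uses all observed haplotypes rather than any particular diversity evaluation function. Haplotypes are assumed to be equal-length strings of allele codes; the function name is illustrative.

```python
from itertools import combinations


def min_tag_snps(haplotypes):
    """Smallest tuple of SNP indices that separates all distinct haplotypes.

    Exhaustive search over subsets in increasing size, so the cost grows
    exponentially with block length; intended for small blocks only.
    """
    distinct = set(haplotypes)
    if not distinct:
        raise ValueError("need at least one haplotype")
    lengths = {len(h) for h in distinct}
    if len(lengths) != 1:
        raise ValueError("all haplotypes must have the same length")
    n_snps = lengths.pop()
    for size in range(n_snps + 1):
        for cols in combinations(range(n_snps), size):
            # Project each haplotype onto the candidate tag positions.
            projected = {tuple(h[c] for c in cols) for h in distinct}
            if len(projected) == len(distinct):
                return cols
    return tuple(range(n_snps))


if __name__ == "__main__":
    # Hypothetical block of four SNPs observed on four chromosomes.
    block = ["0011", "0011", "1100", "1111"]
    print(min_tag_snps(block))
```

A partitioning algorithm of the kind described above would evaluate a cost such as this for candidate blocks and choose block boundaries that minimise the total, subject to the limit on the number of tag SNPs.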
Affiliation(s)
- Wen-Pei Chen
  - Department of Applied Chemistry, Providence University, Taichung 433, Taiwan
- Che-Lun Hung
  - Department of Computer Science and Communication Engineering, Providence University, Taichung 433, Taiwan
- Yaw-Ling Lin
  - Department of Computer Science and Information Engineering, Providence University, Taichung 433, Taiwan
11
Xiong Q, Chai J, Deng C, Jiang S, Liu Y, Huang T, Suo X, Zhang N, Li X, Yang Q, Chen M, Zheng R. Characterization of porcine SKIP gene in skeletal muscle development: Polymorphisms, association analysis, expression and regulation of cell growth in C2C12 cells. Meat Sci 2012; 92:490-7. [DOI: 10.1016/j.meatsci.2012.05.016] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 12/14/2011] [Revised: 04/02/2012] [Accepted: 05/18/2012] [Indexed: 10/28/2022]
12
Trifonova EA, Spiridonova MG, Gabidulina TV, Urnov FD, Puzyrev VP, Stepanov VA. Analysis of the MTHFR gene linkage disequilibrium structure and association of polymorphic gene variants with coronary atherosclerosis. Russ J Genet 2012. [DOI: 10.1134/s1022795412100122] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.8] [Indexed: 12/16/2022]
13
Kharrat N, Abdelmouleh W, Abdelhedi R, AlFadhli S, Rebai A. The linkage disequilibrium pattern of the Angiotensin Converting Enzyme gene in Arabic and Asian population groups. Ann Hum Biol 2012; 39:538-40. [DOI: 10.3109/03014460.2012.713509] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Indexed: 11/13/2022]
14
Wu Y, Yang H, Yang B, Yang K, Xiao C. Association of polymorphisms in prolylcarboxypeptidase and chymase genes with essential hypertension in the Chinese Han population. J Renin Angiotensin Aldosterone Syst 2012; 14:263-70. [PMID: 22679278 DOI: 10.1177/1470320312448949] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.8] [Indexed: 11/17/2022] Open
Abstract
INTRODUCTION The prolylcarboxypeptidase (PRCP) gene encodes a membrane protein that acts on angiotensin II (Ang II) and kallikrein to release vasoactive peptides. The chymase (CMA1) gene is important for Ang II generation. Therefore, the two genes might be involved in the pathogenesis of essential hypertension (EH). MATERIALS AND METHODS Eleven tag single nucleotide polymorphisms (SNPs) in the PRCP gene and four tag SNPs and G-1903A (rs1800875) polymorphism in the CMA1 gene were genotyped in the Chinese Han population (n=1020) using a polymerase chain reaction-restriction fragment length polymorphism method. RESULTS In the PRCP gene, single site analyses indicated that the rs7104980 G allele was a susceptible factor for EH (adjusted odds ratio (OR)=1.98, 95% confidence interval (CI) 1.62-2.43, p=0.3×10⁻¹⁰). The protective effect of Hap3 GAGCACTAACA was observed without carrying the susceptible rs7104980 G allele (OR=0.67, 95% CI 0.56-0.81, p=0.3×10⁻⁴) by haplotype analyses. In the case of the CMA1 gene, no associations with EH were found through single site analyses. However, haplotype analyses showed that Hap16 TTTA significantly increased the risk of EH with OR=3.15 (p=0.0002) which may be driven by interaction with a nearby SNP combination. CONCLUSIONS The present results indicated PRCP rs7104980 can be considered as a marker for EH and Hap3 GAGCACTAACA (PRCP) and Hap16 TTTA (CMA1) might be associated with EH in the Chinese Han population.
Affiliation(s)
- Yanrui Wu
  - Cell Biology and Genetics Department, Kunming Medical University, China
15
Zhou Y, Cao D, Lei Q, Han H, Li F, Li G, Huang B. Associations of Melanocortin-4 Receptor (MC4R) Gene Single Nucleotide Polymorphisms with Carcass Traits in a Synthetic Broiler Line. J Anim Vet Adv 2012. [DOI: 10.3923/javaa.2012.13.19] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 01/29/2023]
16
Cohen-Zinder M, Donthu R, Larkin DM, Kumar CG, Rodriguez-Zas SL, Andropolis KE, Oliveira R, Lewin HA. Multisite haplotype on cattle chromosome 3 is associated with quantitative trait locus effects on lactation traits. Physiol Genomics 2011; 43:1185-97. [DOI: 10.1152/physiolgenomics.00253.2010] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.8] [Indexed: 01/24/2023] Open
Abstract
The goal of this study was to identify candidate genes and DNA polymorphisms for quantitative trait loci (QTL) affecting milk yield (MY), fat yield (FY), and protein yield (PY) previously mapped to bovine chromosome 3 (BTA3). To accomplish this, 373 half-siblings sired by three bulls previously shown to be segregating for lactation trait QTL, and 263 additional sires in the U.S. Dairy Bull DNA Repository (DBDR) were genotyped for 2,500 SNPs within a 16.3 Mbp QTL critical region on BTA3. Targeted resequencing of ∼1.8 Mbp within the QTL critical region of one of the QTL heterozygous sires identified additional polymorphisms useful for association studies. Twenty-three single nucleotide polymorphisms (SNPs) within a fine-mapped region were associated with effects on breeding values for MY, FY, or PY in DBDR sires, of which five SNPs were in strong linkage disequilibrium in the population. This multisite haplotype included SNPs located within exons or promoters of four tightly linked genes: RAP1A, ADORA3, OVGP1, and C3H1orf88. An SNP within RAP1A showed strong evidence of a recent selective sweep based on integrated haplotype score and was also associated with breeding value for PY. Because of its known function in alveolar lumen formation in the mammary gland, RAP1A is thus a strong candidate gene for QTL effects on lactation traits. Our results provide a detailed assessment of a QTL region that will be a useful guide for complex traits analysis in humans and other noninbred species.
Affiliation(s)
- Ravikiran Donthu
  - Department of Animal Sciences, University of Illinois at Urbana-Champaign, Urbana, Illinois
- Denis M. Larkin
  - Department of Animal Sciences, University of Illinois at Urbana-Champaign, Urbana, Illinois
- Charu Gupta Kumar
  - Department of Animal Sciences, University of Illinois at Urbana-Champaign, Urbana, Illinois
- Sandra L. Rodriguez-Zas
  - Institute for Genomic Biology, and
  - Department of Animal Sciences, University of Illinois at Urbana-Champaign, Urbana, Illinois
- Kalista E. Andropolis
  - Department of Animal Sciences, University of Illinois at Urbana-Champaign, Urbana, Illinois
- Rosane Oliveira
  - Department of Animal Sciences, University of Illinois at Urbana-Champaign, Urbana, Illinois
- Harris A. Lewin
  - Institute for Genomic Biology, and
  - Department of Animal Sciences, University of Illinois at Urbana-Champaign, Urbana, Illinois
17
|
Javed A, Drineas P, Mahoney MW, Paschou P. Efficient genomewide selection of PCA-correlated tSNPs for genotype imputation. Ann Hum Genet 2011; 75:707-22. [PMID: 21902678 DOI: 10.1111/j.1469-1809.2011.00673.x] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Indexed: 11/29/2022]
Abstract
The linkage disequilibrium structure of the human genome allows identification of small sets of single nucleotide polymorphisms (SNPs) (tSNPs) that efficiently represent dense sets of markers. This structure can be translated into linear algebraic terms as evidenced by the well documented principal components analysis (PCA)-based methods. Here we apply, for the first time, PCA-based methodology for efficient genomewide tSNP selection; and explore the linear algebraic structure of the human genome. Our algorithm divides the genome into contiguous nonoverlapping windows of high linear structure. Coupling this novel window definition with a PCA-based tSNP selection method, we analyze 2.5 million SNPs from the HapMap phase 2 dataset. We show that 10-25% of these SNPs suffice to predict the remaining genotypes with over 95% accuracy. A comparison with other popular methods in the ENCODE regions indicates significant genotyping savings. We evaluate the portability of genome-wide tSNPs across a diverse set of populations (HapMap phase 3 dataset). Interestingly, African populations are good reference populations for the rest of the world. Finally, we demonstrate the applicability of our approach in a real genome-wide disease association study. The chosen tSNP panels can be used toward genotype imputation using either a simple regression-based algorithm or more sophisticated genotype imputation methods.
Affiliation(s)
- Asif Javed
- Computational Biology Center, IBM TJ Watson Research, Yorktown Heights, NY 10598, USA
|
18
|
Yuan X, Zhang J, Wang Y. Mutual information and linkage disequilibrium based SNP association study by grouping case-control. Genes Genomics 2011. [DOI: 10.1007/s13258-010-0094-6] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Indexed: 12/01/2022]
|
19
|
Zhang Y, Jiang B, Zhu J, Liu JS. Bayesian models for detecting epistatic interactions from genetic data. Ann Hum Genet 2010; 75:183-93. [PMID: 21091453 DOI: 10.1111/j.1469-1809.2010.00621.x] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.4] [Indexed: 01/02/2023]
Abstract
Current disease association studies are routinely conducted on a genome-wide scale, testing hundreds of thousands or millions of genetic markers. Besides detecting marginal associations of individual markers with the disease, it is also of interest to identify gene-gene and gene-environment interactions, which confer susceptibility to the disease risk. The astronomical number of possible combinations of markers and environmental factors, however, makes interaction mapping a daunting task both computationally and statistically. In this paper, we review and discuss a set of Bayesian partition methods developed recently for mapping single-nucleotide polymorphisms in case-control studies, their extension to quantitative traits, and further generalization to multiple traits. We use simulation and real data sets to demonstrate the performance of these methods, and we compare them with some existing interaction mapping algorithms. With the recent advance in high-throughput sequencing technologies, genome-wide measurements of epigenetic factor enrichment, structural variations, and transcription activities become available at the individual level. The tsunami of data creates more challenges for gene-gene interaction mapping, but at the same time provides new opportunities that, if utilized properly through sophisticated statistical means, can improve the power of mapping interactions at the genome scale.
Affiliation(s)
- Yu Zhang
- Department of Statistics, Penn State University, University Park, PA, USA
|
20
|
A novel efficient dynamic programming algorithm for haplotype block partitioning. J Theor Biol 2010; 267:164-70. [PMID: 20728452 DOI: 10.1016/j.jtbi.2010.08.019] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.5] [Received: 07/30/2009] [Revised: 08/10/2010] [Accepted: 08/16/2010] [Indexed: 11/24/2022]
Abstract
In this paper, a new efficient algorithm is presented for haplotype block partitioning based on haplotype diversity. In this algorithm, finding the largest meaningful block that satisfies the diversity condition is the main goal as an optimization problem. The algorithm can be performed in polynomial time complexity with regard to the number of haplotypes and SNPs. We apply our algorithm on three biological data sets from chromosome 21 in three different population data sets from HapMap data bulk; the obtained results show the efficiency and better performance of our algorithm in comparison with three other well known methods.
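The abstract does not state the exact diversity condition or the recurrence used, so the following is only a generic sketch of a minimum-block dynamic program, not the authors' algorithm. It assumes (hypothetically) that a block is acceptable when it contains at most `max_distinct` distinct haplotype patterns; the function name and the threshold are illustrative.

```python
def partition_blocks(haplotypes, max_distinct=2):
    """Split SNP columns into the fewest contiguous blocks such that each
    block shows at most `max_distinct` distinct haplotype patterns.

    haplotypes: list of rows, each a list of 0/1 alleles (one per SNP).
    Returns a list of half-open (start, end) column ranges.
    """
    m = len(haplotypes[0])
    inf = float("inf")
    best = [0] + [inf] * m      # best[j]: fewest blocks covering SNPs [0, j)
    prev = [0] * (m + 1)        # prev[j]: start of the last block ending at j
    for j in range(1, m + 1):
        for i in range(j - 1, -1, -1):
            patterns = {tuple(row[i:j]) for row in haplotypes}
            if len(patterns) > max_distinct:
                break           # widening the block leftwards never removes patterns
            if best[i] + 1 < best[j]:
                best[j] = best[i] + 1
                prev[j] = i
    if best[m] == inf:
        raise ValueError("no valid partition for this max_distinct")
    blocks = []
    j = m
    while j > 0:                # backtrack from the last SNP
        blocks.append((prev[j], j))
        j = prev[j]
    return blocks[::-1]


# Toy data (invented for illustration): SNPs 0-1 and SNPs 2-3 are each in perfect LD.
example = [
    [0, 0, 0, 1],
    [0, 0, 1, 0],
    [1, 1, 0, 1],
    [1, 1, 1, 0],
]
print(partition_blocks(example, max_distinct=2))
```

Recomputing the pattern set for every candidate block keeps the sketch short; a practical implementation would update it incrementally.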
|
21
|
Extreme evolutionary disparities seen in positive selection across seven complex diseases. PLoS One 2010; 5:e12236. [PMID: 20808933 PMCID: PMC2923198 DOI: 10.1371/journal.pone.0012236] [Citation(s) in RCA: 46] [Impact Index Per Article: 3.3] [Received: 01/13/2010] [Accepted: 07/12/2010] [Indexed: 12/22/2022] Open
Abstract
Positive selection is known to occur when the environment that an organism inhabits is suddenly altered, as is the case across recent human history. Genome-wide association studies (GWASs) have successfully illuminated disease-associated variation. However, whether human evolution is heading towards or away from disease susceptibility in general remains an open question. The genetic basis of common complex disease may partially be caused by positive selection events, which simultaneously increased fitness and susceptibility to disease. We analyze seven diseases studied by the Wellcome Trust Case Control Consortium to compare evidence for selection at every locus associated with disease. We take a large set of the most strongly associated SNPs in each GWA study in order to capture more hidden associations at the cost of introducing false positives into our analysis. We then search for signs of positive selection in this inclusive set of SNPs. There are striking differences between the seven studied diseases. We find alleles increasing susceptibility to Type 1 Diabetes (T1D), Rheumatoid Arthritis (RA), and Crohn's Disease (CD) underwent recent positive selection. There is more selection in alleles increasing, rather than decreasing, susceptibility to T1D. In the 80 SNPs most associated with T1D (p-value < 7.01 × 10⁻⁵) showing strong signs of positive selection, 58 alleles associated with disease susceptibility show signs of positive selection, while only 22 associated with disease protection show signs of positive selection. Alleles increasing susceptibility to RA are under selection as well. In contrast, selection in SNPs associated with CD favors protective alleles. These results inform the current understanding of disease etiology, shed light on potential benefits associated with the genetic basis of disease, and aid in the efforts to identify causal genetic factors underlying complex disease.
|
22
|
Zangerl B, Lindauer SJ, Acland GM, Aguirre GD. Identification of genetic variation and haplotype structure of the canine ABCA4 gene for retinal disease association studies. Mol Genet Genomics 2010; 284:243-50. [PMID: 20661590 DOI: 10.1007/s00438-010-0560-5] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Received: 02/18/2010] [Accepted: 07/07/2010] [Indexed: 11/28/2022]
Abstract
Over 200 mutations in the retina specific member of the ATP-binding cassette transporter superfamily (ABCA4) have been associated with a diverse group of human retinal diseases. The disease mechanisms, and genotype-phenotype associations, nonetheless, remain elusive in many cases. As orthologous genes are commonly mutated in canine models of human blinding disorders, canine ABCA4 appears to be an ideal candidate gene to identify and study sequence changes in dogs affected by various forms of inherited retinal degeneration. However, the size of the gene and lack of haplotype assignment significantly limit targeted association and/or linkage approaches. This study assessed the naturally observed sequence diversity of ABCA4 in the dog, identifying 80% of novel variations. While none of the observed polymorphisms have been associated with blinding disorders to date, breed and potentially disease specific haplotypes have been identified. Moreover, a tag SNP map of 17 (15) markers has been established that accurately predicts common ABCA4 haplotypes (frequency > 5%) explaining >85% (>80%) of the observed genetic diversity and will considerably advance future studies. Our sequence analysis of the complete canine ABCA4 coding region will clearly provide a baseline and tools for future association studies and comparative genomics to further delineate the role of ABCA4 in canine blinding disorders.
Affiliation(s)
- B Zangerl
- Section of Ophthalmology, School of Veterinary Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA.
|
23
|
Chuang LY, Yang CS, Ho CH, Yang CH. Tag SNP selection using particle swarm optimization. Biotechnol Prog 2010; 26:580-8. [PMID: 20039435 DOI: 10.1002/btpr.350] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Indexed: 11/09/2022]
Abstract
Single nucleotide polymorphisms (SNPs) are the most abundant form of genetic variations amongst species. With the genome-wide SNP discovery, many genome-wide association studies are likely to identify multiple genetic variants that are associated with complex diseases. However, genotyping all existing SNPs for a large number of samples is still challenging even though SNP arrays have been developed to facilitate the task. Therefore, it is essential to select only informative SNPs representing the original SNP distributions in the genome (tag SNP selection) for genome-wide association studies. These SNPs are usually chosen from haplotypes and called haplotype tag SNPs (htSNPs). Accordingly, the scale and cost of genotyping are expected to be largely reduced. We introduce binary particle swarm optimization (BPSO) with local search capability to improve the prediction accuracy of STAMPA. The proposed method does not rely on block partitioning of the genomic region, and consistently identified tag SNPs with higher prediction accuracy than either STAMPA or SVM/STSA. We compared the prediction accuracy and time complexity of BPSO to STAMPA and an SVM-based (SVM/STSA) method using publicly available data sets. Compared with STAMPA and SVM/STSA, BPSO effectively improved prediction accuracy for both smaller and larger scale data sets. These results demonstrate that the BPSO method selects tag SNPs with higher accuracy regardless of the scale of the data set.
Affiliation(s)
- Li-Yeh Chuang
- Dept. of Chemical Engineering, I-Shou University, Kaohsiung, Taiwan
|
24
|
Liu L, Wu Y, Lonardi S, Jiang T. Efficient genome-wide TagSNP selection across populations via the linkage disequilibrium criterion. J Comput Biol 2010; 17:21-37. [PMID: 20078395 DOI: 10.1089/cmb.2007.0228] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.6] [Indexed: 11/13/2022] Open
Abstract
In this article, we studied the tag single-nucleotide polymorphism (tagSNP) selection problem on multiple populations using the pairwise r² linkage disequilibrium criterion. We proposed a novel combinatorial optimization model for the tagSNP selection problem, called the minimum common tagSNP selection (MCTS) problem, and presented efficient solutions for MCTS. Our approach consists of the following three main steps: (i) partitioning the SNP markers into small disjoint components, (ii) applying some data reduction rules to simplify the problem, and (iii) applying either a fast greedy algorithm or a Lagrangian relaxation algorithm to solve the remaining (general) MCTS. These algorithms also provide lower bounds on tagging (i.e., the minimum number of tagSNPs needed). The lower bounds allow us to evaluate how far our solution is from the optimum. To the best of our knowledge, it is the first time the tagging lower bounds are discussed in the literature. We assessed the performance of our algorithms on real HapMap data for genome-wide tagging. The experiments demonstrated that our algorithms run 3-4 orders of magnitude faster than the existing single-population tagging programs such as FESTA, LD-Select, and the multiple-population tagging method MultiPop-TagSelect. Our method also greatly reduced the required tagSNPs compared with LD-Select on a single population and MultiPop-TagSelect on multiple populations. Moreover, the numbers of tagSNPs selected by our algorithms are almost optimal because they are very close to the corresponding lower bounds obtained by our method.
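As a rough illustration of tagging under a pairwise r² criterion, the sketch below runs a plain greedy set cover on a single population. It is not the MCTS method described above: there is no multi-population handling, data reduction, or Lagrangian relaxation, and the 0.8 threshold, the function names, and the toy data are assumptions made for the example.

```python
from itertools import combinations


def r_squared(a, b):
    """Pairwise r^2 between two biallelic SNPs given as 0/1 haplotype vectors."""
    n = len(a)
    pa = sum(a) / n
    pb = sum(b) / n
    pab = sum(1 for x, y in zip(a, b) if x == 1 and y == 1) / n
    denom = pa * (1 - pa) * pb * (1 - pb)
    if denom == 0:
        return 0.0              # monomorphic SNP: treat as carrying no LD information
    d = pab - pa * pb
    return d * d / denom


def greedy_tag_snps(haplotypes, threshold=0.8):
    """Pick tag SNPs so every SNP has r^2 >= threshold with at least one tag."""
    m = len(haplotypes[0])
    cols = [[row[j] for row in haplotypes] for j in range(m)]
    covers = {j: {j} for j in range(m)}          # each SNP tags itself
    for i, j in combinations(range(m), 2):
        if r_squared(cols[i], cols[j]) >= threshold:
            covers[i].add(j)
            covers[j].add(i)
    uncovered = set(range(m))
    tags = []
    while uncovered:
        # Take the SNP tagging the most still-uncovered SNPs (lowest index on ties).
        best = max(range(m), key=lambda j: (len(covers[j] & uncovered), -j))
        tags.append(best)
        uncovered -= covers[best]
    return tags


# Toy data (invented): SNP 1 duplicates SNP 0; SNP 3 is the complement of SNP 2.
example = [
    [0, 0, 0, 1],
    [0, 0, 1, 0],
    [1, 1, 0, 1],
    [1, 1, 1, 0],
]
print(greedy_tag_snps(example, threshold=0.8))
```

A greedy cover gives no optimality guarantee on its own, which is why lower bounds such as those mentioned in the abstract are useful for judging how far a solution is from the minimum.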
Affiliation(s)
- Lan Liu
- Department of Computer Science and Engineering, University of California, Riverside, California, USA.
|
25
|
Zhou Y, Liu Y, Jiang X, Du H, Li X, Zhu Q. Polymorphism of chicken myocyte-specific enhancer-binding factor 2A gene and its association with chicken carcass traits. Mol Biol Rep 2010; 37:587-94. [PMID: 19774488 DOI: 10.1007/s11033-009-9838-2] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.2] [Received: 01/24/2009] [Accepted: 09/15/2009] [Indexed: 01/04/2023]
Abstract
Myocyte-specific enhancer-binding factor 2A (MEF2A) gene is a member of the myocyte-specific enhancer-binding factor 2 (MEF2) protein family, which is involved in vertebrate skeletal muscle development and differentiation. The aim of the current study is to investigate the potential associations between MEF2A gene SNPs (single nucleotide polymorphisms) and the carcass traits in 471 chicken samples from four populations. Three new SNPs (T46023C, A72626G, and T89232G) were detected in the chicken MEF2A gene. The T46023C genotypes were associated with live body weight (BW), carcass weight (CW), eviscerated weight, semi-eviscerated weight (SEW), and leg muscle weight (LMW) (P < 0.05); the A72626G genotypes were associated with BW, CW, LMW (P < 0.01) and breast muscle weight (BMW), leg muscle percentage (LMP) (P < 0.05); whereas the T89232G genotypes were associated with carcass percentage (CP) and semi-eviscerated percentage (SEP) (P < 0.05). The haplotypes constructed on the three SNPs were associated with BW, CW, LMW (P < 0.01), SEW, BMW, CP (P < 0.05). Significant and suggestive dominant effects of diplotype H1H2 were observed for BW, CW, SEW, BMW and CP, whereas diplotype H5H5 had a negative effect on BW, CW, SEW, BMW and LMW. Our results suggest that the MEF2A gene may be a potential marker affecting the muscle trait of chickens.
Affiliation(s)
- Yan Zhou
- College of Animal Science and Technology, Sichuan Agriculture University, Ya'an, Sichuan, China
|
26
|
Xu H, Shen X, Zhou M, Luo C, Kang L, Liang Y, Zeng H, Nie Q, Zhang D, Zhang X. The dopamine D2 receptor gene polymorphisms associated with chicken broodiness. Poult Sci 2010; 89:428-38. [DOI: 10.3382/ps.2009-00428] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.4] [Indexed: 11/20/2022] Open
|
27
|
Hsiao CL, Lian IB, Hsieh AR, Fann CS. Modeling expression quantitative trait loci in data combining ethnic populations. BMC Bioinformatics 2010; 11:111. [PMID: 20187971 PMCID: PMC2844390 DOI: 10.1186/1471-2105-11-111] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.5] [Received: 09/03/2009] [Accepted: 02/27/2010] [Indexed: 12/18/2022] Open
Abstract
Background Combining data from different ethnic populations in a study can increase efficacy of methods designed to identify expression quantitative trait loci (eQTL) compared to analyzing each population independently. In such studies, however, the genetic diversity of minor allele frequencies among populations has rarely been taken into account. Because allele frequency diversity and population-level expression differences are present in populations, no consensus has been reached regarding the optimal statistical approach for analysis of eQTL in data combining different populations. Results In this report, we explored the applicability of a constrained two-way model to identify eQTL for combined ethnic data that might contain genetic diversity among ethnic populations. In addition, gene expression differences resulting from ethnic allele frequency diversity between populations were directly estimated and analyzed by the constrained two-way model. Through simulation, we investigated effects of genetic diversity on eQTL identification by examining gene expression data pooled from normal quantile transformation of each population. Using the constrained two-way model to reanalyze data from Caucasian and Asian individuals available from HapMap, a large number of eQTL were identified with similar genetic effects on the gene expression levels in these two populations. Furthermore, 19 single nucleotide polymorphisms with inter-population differences with respect to both genotype frequency and gene expression levels directed by genotypes were identified and reflected a clear distinction between Caucasian and Asian individuals. Conclusions This study illustrates the influence of minor allele frequencies on common eQTL identification using either separate or combined population data. Our findings are important for future eQTL studies in which different datasets are combined to increase the power of eQTL identification.
Affiliation(s)
- Ching-Lin Hsiao
- Division of Biostatistics, Institute & Department of Public Health, National Yang-Ming University, Taipei 112, Taiwan
|
28
|
Yang CH, Chuang LY, Cheng YH, Wen CH, Chang HW. Dynamic programming for single nucleotide polymorphism ID identification in systematic association studies. Kaohsiung J Med Sci 2010; 25:165-76. [PMID: 19502133 DOI: 10.1016/s1607-551x(09)70057-9] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 11/18/2022] Open
Abstract
Single nucleotide polymorphisms (SNPs) play an important role in personalized medicine. However, the SNP data reported in many association studies provide only the SNP nucleotide/amino acid position, without providing the SNP ID recorded in National Center for Biotechnology Information databases. A tool with the ability to provide SNP ID identification, with a user-friendly interface, is needed. In this paper, a dynamic programming algorithm was used to compare homologs when the processed input sequence is aligned with the SNP FASTA database. Our novel system provides a web-based tool that uses the National Center for Biotechnology Information dbSNP database, which provides SNP sequence identification and SNP FASTA formats. Freely selectable sequence formats for alignment can be used, including general sequence formats (ACGT, [dNTP1/dNTP2] or IUPAC formats) and orientation with bidirectional sequence matching. In contrast to the National Center for Biotechnology Information SNP-BLAST, the proposed system always provides the correct targeted SNP ID (SNP hit), as well as nearby SNPs (flanking hits), arranged in their chromosomal order and contig positions. The system also solves problems inherent in SNP-BLAST, which cannot always provide the correct SNP ID for a given input sequence. Therefore, this system constitutes a novel application which uses dynamic programming to identify SNP IDs from the literature and keyed-in sequences for systematic association studies. It is freely available at http://bio.kuas.edu.tw/SNPosition/.
Affiliation(s)
- Cheng-Hong Yang
- Department of Electronic Engineering, National Kaohsiung University of Applied Sciences, Kaohsiung, Taiwan
|
29
|
Nishizawa D, Hayashida M, Nagashima M, Koga H, Ikeda K. Genetic polymorphisms and human sensitivity to opioid analgesics. Methods Mol Biol 2010; 617:395-420. [PMID: 20336437 DOI: 10.1007/978-1-60327-323-7_29] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.6] [Indexed: 05/29/2023]
Abstract
Opioid analgesics are commonly used for the treatment of acute as well as chronic, moderate to severe pain. Well-known, however, is the wide interindividual variability in sensitivity to opioids that exists, which has often been a critical problem in pain treatment. To date, only a limited number of studies have addressed the relationship between human genetic variations and sensitivity to opioids, and such studies are still in their early stages. Therefore, revealing the relationship between genetic variations in many candidate genes and individual differences in sensitivity to opioids will provide valuable information for appropriate individualization of opioid doses required for adequate pain control. Although the methodologies for such association studies can be diverse, here we summarize protocols for investigating the association between genetic polymorphisms and sensitivity to opioids in human volunteers and patients undergoing painful surgery.
Affiliation(s)
- Daisuke Nishizawa
- Division of Psychobiology, Tokyo Institute of Psychiatry, Tokyo, Japan
|
30
|
Katanforoush A, Sadeghi M, Pezeshk H, Elahi E. Global haplotype partitioning for maximal associated SNP pairs. BMC Bioinformatics 2009; 10:269. [PMID: 19712447 PMCID: PMC2749056 DOI: 10.1186/1471-2105-10-269] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.5] [Received: 01/22/2009] [Accepted: 08/27/2009] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Global partitioning based on pairwise associations of SNPs has not previously been used to define haplotype blocks within genomes. Here, we define an association index based on LD between SNP pairs. We use the Fisher's exact test to assess the statistical significance of the LD estimator. By this test, each SNP pair is characterized as associated, independent, or not-statistically-significant. We set limits on the maximum acceptable proportion of independent pairs within all blocks and search for the partitioning with maximal proportion of associated SNP pairs. Essentially, this model is reduced to a constrained optimization problem, the solution of which is obtained by iterating a dynamic programming algorithm. RESULTS In comparison with other methods, our algorithm reports blocks of larger average size. Nevertheless, the haplotype diversity within the blocks is captured by a small number of tagSNPs. Resampling HapMap haplotypes under a block-based model of recombination showed that our algorithm is robust in reproducing the same partitioning for recombinant samples. Our algorithm performed better than previously reported models in a case-control association study aimed at mapping a single locus trait, based on simulation results that were evaluated by a block-based statistical test. Compared to methods of haplotype block partitioning, we performed best on detection of recombination hotspots. CONCLUSION Our proposed method divides chromosomes into the regions within which allelic associations of SNP pairs are maximized. This approach presents a native design for dimension reduction in genome-wide association studies. Our results show that the pairwise allelic association of SNPs can describe various features of genomic variation, in particular recombination hotspots.
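The pairwise test mentioned in the abstract can be illustrated with a two-sided Fisher's exact test on the 2×2 table of haplotype counts for one SNP pair. This is a minimal sketch using only the standard library; it shows the p-value computation alone, since the abstract does not specify the rule that separates "independent" from "not-statistically-significant" pairs. Function names and the toy data are assumptions.

```python
from math import comb


def haplotype_table(snp_a, snp_b):
    """Count the four two-locus haplotypes (00, 01, 10, 11) for a SNP pair."""
    counts = [[0, 0], [0, 0]]
    for x, y in zip(snp_a, snp_b):
        counts[x][y] += 1
    return counts[0][0], counts[0][1], counts[1][0], counts[1][1]


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of seeing x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / total

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum every table at least as unlikely as the observed one
    # (small relative tolerance guards against floating-point ties).
    return sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_obs * (1 + 1e-9))


# Toy data (invented): six haplotypes in which the two SNPs always co-occur.
snp_a = [0, 0, 0, 1, 1, 1]
snp_b = [0, 0, 0, 1, 1, 1]
print(fisher_exact_two_sided(*haplotype_table(snp_a, snp_b)))
```

With very small samples even perfectly correlated SNPs can fail to reach conventional significance, which is one reason a separate "not-statistically-significant" category is useful.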
Affiliation(s)
- Ali Katanforoush
- Institute of Biochemistry and Biophysics, University of Tehran, Tehran, Iran
- Mehdi Sadeghi
- National Institute of Genetics Engineering and Biotechnology, Tehran, Iran
- School of Computer Science, Institute for Studies in Theoretical Physics and Mathematics, Tehran, Iran
- Hamid Pezeshk
- School of Mathematics, Statistics and Computer Science, and Center of Excellence in Biomathematics, College of Science, University of Tehran, Tehran, Iran
- Elahe Elahi
- Department of Biology, College of Science, University of Tehran, Tehran, Iran
|
31
|
Bink MCAM, van Eeuwijk FA. A Bayesian QTL linkage analysis of the common dataset from the 12th QTLMAS workshop. BMC Proc 2009; 3 Suppl 1:S4. [PMID: 19278543 PMCID: PMC2654498 DOI: 10.1186/1753-6561-3-s1-s4] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.7] [Indexed: 05/25/2023] Open
Abstract
Background To compare the power of various QTL mapping methodologies, a dataset was simulated within the framework of 12th QTLMAS workshop. A total of 5865 diploid individuals was simulated, spanning seven generations, with known pedigree. Individuals were genotyped for 6000 SNPs across six chromosomes. We present an illustration of a Bayesian QTL linkage analysis, as implemented in the special purpose software FlexQTL. Most importantly, we treated the number of bi-allelic QTL as a random variable and used Bayes Factors to infer plausible QTL models. We investigated the power of our analysis in relation to the number of phenotyped individuals and SNPs. Results We report clear posterior evidence for 12 QTL that jointly explained 30% of the phenotypic variance, which was very close to the total of included simulation effects, when using all phenotypes and a set of 600 SNPs. Decreasing the number of phenotyped individuals from 4665 to 1665 and/or the number of SNPs in the analysis from 600 to 120 dramatically reduced the power to identify and locate QTL. Posterior estimates of genome-wide breeding values for a small set of individuals were given. Conclusion We presented a successful Bayesian linkage analysis of a simulated dataset with a pedigree spanning several generations. Our analysis identified all regions that contained QTL with effects explaining more than one percent of the phenotypic variance. We showed how the results of a Bayesian QTL mapping can be used in genomic prediction.
Affiliation(s)
- Marco C A M Bink
- Biometris, Wageningen University & Research centre, Bornsesteeg 47, 6708 PD, Wageningen, Netherlands.
|
32
|
Rakovic A, Stiller B, Djarmati A, Flaquer A, Freudenberg J, Toliat MR, Linnebank M, Kostic V, Lohmann K, Paus S, Nürnberg P, Kubisch C, Klein C, Wüllner U, Ramirez A. Genetic association study of the P-type ATPase ATP13A2 in late-onset Parkinson's disease. Mov Disord 2008; 24:429-33. [DOI: 10.1002/mds.22399] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.8] [Indexed: 11/08/2022] Open
|
33
|
Trifonova EA, Spiridonova MG, Stepanov VA. Genetic diversity and the structure of linkage disequilibrium in the methylenetetrahydrofolate reductase locus. RUSS J GENET+ 2008. [DOI: 10.1134/s102279540810013x] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Indexed: 11/23/2022]
|
34
|
Zhao Y, Xu Y, Wang Z, Zhang H, Chen G. A better block partition and ligation strategy for individual haplotyping. Bioinformatics 2008; 24:2720-5. [DOI: 10.1093/bioinformatics/btn519] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.6] [Indexed: 11/14/2022] Open
|
35
|
Pattaro C, Ruczinski I, Fallin DM, Parmigiani G. Haplotype block partitioning as a tool for dimensionality reduction in SNP association studies. BMC Genomics 2008; 9:405. [PMID: 18759977 PMCID: PMC2547855 DOI: 10.1186/1471-2164-9-405] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.1] [Received: 10/25/2007] [Accepted: 08/29/2008] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Identification of disease-related genes in association studies is challenged by the large number of SNPs typed. To address the dilution of power caused by high dimensionality, and to generate results that are biologically interpretable, it is critical to take into consideration spatial correlation of SNPs along the genome. With the goal of identifying true genetic associations, partitioning the genome according to spatial correlation can be a powerful and meaningful way to address this dimensionality problem. RESULTS We developed and validated an MCMC Algorithm To Identify blocks of Linkage DisEquilibrium (MATILDE) for clustering contiguous SNPs, and a statistical testing framework to detect association using partitions as units of analysis. We compared its ability to detect true SNP associations to that of the most commonly used algorithm for block partitioning, as implemented in the Haploview and HapBlock software. Simulations were based on artificially assigning phenotypes to individuals with SNPs corresponding to region 14q11 of the HapMap database. When block partitioning is performed using MATILDE, the ability to correctly identify a disease SNP is higher, especially for small effects, than it is with the alternatives considered. Advantages can be both in terms of true positive findings and limiting the number of false discoveries. Finer partitions provided by LD-based methods or by marker-by-marker analysis are efficient only for detecting big effects, or in presence of large sample sizes. The probabilistic approach we propose offers several additional advantages, including: a) adapting the estimation of blocks to the population, technology, and sample size of the study; b) probabilistic assessment of uncertainty about block boundaries and about whether any two SNPs are in the same block; c) user selection of the probability threshold for assigning SNPs to the same block. 
CONCLUSION We demonstrate that, in realistic scenarios, our adaptive, study-specific block partitioning approach is as or more efficient than currently available LD-based approaches in guiding the search for disease loci.
Affiliation(s)
- Cristian Pattaro
- Unit of Genetic Epidemiology and Biostatistics, Institute of Genetic Medicine, European Academy, Viale Druso 1, I-39100, Bolzano, Italy.
|
36
|
Canine polydactyl mutations with heterogeneous origin in the conserved intronic sequence of LMBR1. Genetics 2008; 179:2163-72. [PMID: 18689889 DOI: 10.1534/genetics.108.087114] [Citation(s) in RCA: 39] [Impact Index Per Article: 2.4] [Indexed: 11/18/2022]
Abstract
Canine preaxial polydactyly (PPD) in the hind limb is a developmental trait that restores the first digit lost during canine evolution. Using a linkage analysis, we previously demonstrated that the affected gene in a Korean breed is located on canine chromosome 16. The candidate locus was further limited to a linkage disequilibrium (LD) block of <213 kb composing the single gene, LMBR1, by LD mapping with single nucleotide polymorphisms (SNPs) for affected individuals from both Korean and Western breeds. The ZPA regulatory sequence (ZRS) in intron 5 of LMBR1 was implicated in mammalian polydactyly. An analysis of the LD haplotypes around the ZRS for various dog breeds revealed that only a subset is assigned to Western breeds. Furthermore, two distinct affected haplotypes for Asian and Western breeds were found, each containing different single-base changes in the upstream sequence (pZRS) of the ZRS. Unlike the previously characterized cases of PPD identified in the mouse and human ZRS regions, the canine mutations in pZRS lacked the ectopic expression of sonic hedgehog in the anterior limb bud, distinguishing its role in limb development from that of the ZRS.
|
37
|
Marques E, Schnabel RD, Stothard P, Kolbehdari D, Wang Z, Taylor JF, Moore SS. High density linkage disequilibrium maps of chromosome 14 in Holstein and Angus cattle. BMC Genet 2008; 9:45. [PMID: 18611270 PMCID: PMC2478670 DOI: 10.1186/1471-2156-9-45] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.1] [Received: 10/10/2007] [Accepted: 07/08/2008] [Indexed: 01/16/2023]
Abstract
BACKGROUND Linkage disequilibrium (LD) maps can provide a wealth of information on specific marker-phenotype relationships, especially in areas of the genome where positional candidate genes with similar functions are located. A recently published high resolution radiation hybrid map of bovine chromosome 14 (BTA14) together with the bovine physical map have enabled the creation of more accurate LD maps for BTA14 in both dairy and beef cattle. RESULTS Over 500 Single Nucleotide Polymorphism (SNP) markers from both Angus and Holstein animals had their phased haplotypes estimated using GENOPROB and their pairwise r² values compared. For both breeds, results showed that average LD extends at moderate levels up to 100 kilo base pairs (kbp) and falls to background levels after 500 kbp. Haplotype block structure analysis using HAPLOVIEW under the four gamete rule identified 122 haplotype blocks for both Angus and Holstein. In addition, SNP tagging analysis identified 410 SNPs and 420 SNPs in Holstein and Angus, respectively, for future whole genome association studies on BTA14. Correlation analysis for marker pairs common to these two breeds confirmed that there are no substantial correlations between r-values at distances over 10 kbp. Comparison of extended haplotype homozygosity (EHH), which calculates the LD decay away from a core haplotype, shows that in Holstein there is long range LD decay away from the DGAT1 region consistent with the selection for milk fat % in this population. Comparison of EHH values for Angus in the same region shows very little long range LD. CONCLUSION Overall, the results presented here can be applied in future single or haplotype association analysis for both populations, aiding in confirming or excluding potential polymorphisms as causative mutations, especially around Quantitative Trait Loci regions. 
In addition, knowledge of specific LD information among markers will aid the research community in selecting appropriate markers for whole genome association studies.
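The four gamete rule mentioned in this abstract can be illustrated with a short sketch. This is a simplified illustration and not Haploview's implementation: it assumes phased, biallelic haplotypes coded as strings of 0/1, grows blocks greedily from left to right, and uses a hypothetical `min_freq` parameter so that a fourth gamete only counts as evidence of recombination when it is observed at that frequency or above.

```python
from collections import Counter


def has_four_gametes(haplotypes, i, j, min_freq=0.01):
    """True if all four two-SNP gametes at positions i and j are observed
    at a frequency of at least min_freq (taken as evidence of recombination)."""
    counts = Counter((h[i], h[j]) for h in haplotypes)
    n = len(haplotypes)
    return sum(1 for c in counts.values() if c / n >= min_freq) == 4


def four_gamete_blocks(haplotypes, min_freq=0.01):
    """Greedily partition contiguous SNPs into blocks in which no pair of
    SNPs shows all four gametes. Returns (first_index, last_index) tuples."""
    n_snps = len(haplotypes[0])
    blocks, start = [], 0
    for k in range(1, n_snps + 1):
        # Close the current block at the end of the data, or when SNP k
        # shows four gametes with any SNP already in the block.
        breaks = k == n_snps or any(
            has_four_gametes(haplotypes, i, k, min_freq) for i in range(start, k)
        )
        if breaks:
            blocks.append((start, k - 1))
            start = k
    return blocks


# Toy data (hypothetical): four phased haplotypes over four SNPs.
haps = ["0000", "0011", "1101", "1110"]
print(four_gamete_blocks(haps))  # [(0, 1), (2, 2), (3, 3)]
```

In the toy data, SNPs 0 and 1 show only two gametes and stay together, while SNP 2 shows all four gametes with SNP 0 and therefore starts a new block.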
Affiliation(s)
- Elisa Marques
- Department of Agricultural, Food and Nutritional Science, University of Alberta, Edmonton, AB, T6G 2P5, Canada.
|
38
|
SNPAnalyzer 2.0: a web-based integrated workbench for linkage disequilibrium analysis and association analysis. BMC Bioinformatics 2008; 9:290. [PMID: 18570686 PMCID: PMC2453143 DOI: 10.1186/1471-2105-9-290] [Citation(s) in RCA: 80] [Impact Index Per Article: 5.0] [Received: 02/23/2008] [Accepted: 06/23/2008] [Indexed: 11/24/2022]
Abstract
Background Since the completion of the HapMap project, huge numbers of individual genotypes have been generated from many kinds of laboratories. The efforts of finding or interpreting genetic association between disease and SNPs/haplotypes have been on-going widely. So, the necessity of the capability to analyze huge data and diverse interpretation of the results are growing rapidly. Results We have developed an advanced tool to perform linkage disequilibrium analysis, and genetic association analysis between disease and SNPs/haplotypes in an integrated web interface. It comprises four main analysis modules: (i) data import and preprocessing, (ii) haplotype estimation, (iii) LD blocking and (iv) association analysis. Hardy-Weinberg Equilibrium test is implemented for each SNP in the data preprocessing. Haplotypes are reconstructed from unphased diploid genotype data, and linkage disequilibrium between pairwise SNPs is computed and represented by D', r² and LOD score. Tagging SNPs are determined by using the square of Pearson's correlation coefficient (r²). If genotypes from two different sample groups are available, diverse genetic association analyses are implemented using additive, codominant, dominant and recessive models. Multiple verified algorithms and statistics are implemented in parallel for the reliability of the analysis. Conclusion SNPAnalyzer 2.0 performs linkage disequilibrium analysis and genetic association analysis in an integrated web interface using multiple verified algorithms and statistics. Diverse analysis methods, capability of handling huge data and visual comparison of analysis results are very comprehensive and easy-to-use.
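The pairwise LD measures named in this abstract (D' and r²) can be computed from two-locus haplotype frequencies. The sketch below is a minimal illustration, not SNPAnalyzer's code: it assumes the haplotypes are already phased and biallelic (alleles coded 0/1), whereas the tool described above first reconstructs haplotypes from unphased genotypes. The function name `ld_stats` is hypothetical.

```python
def ld_stats(haplotypes):
    """Compute D, D' and r^2 for two biallelic SNPs.

    haplotypes: list of (a, b) tuples, one per phased chromosome,
    with alleles coded 0 or 1 at the first and second SNP.
    """
    n = len(haplotypes)
    p_a = sum(a for a, _ in haplotypes) / n   # frequency of allele 1 at SNP A
    p_b = sum(b for _, b in haplotypes) / n   # frequency of allele 1 at SNP B
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n

    d = p_ab - p_a * p_b                      # raw disequilibrium coefficient

    # D' rescales |D| by its maximum possible value given the allele frequencies.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0

    # r^2 is the squared Pearson correlation between the two allele indicators.
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    r2 = d * d / denom if denom > 0 else 0.0
    return d, d_prime, r2


# Toy data (hypothetical): 10 chromosomes in which allele 1 at SNP B
# only ever occurs together with allele 1 at SNP A.
toy = [(1, 1)] * 4 + [(1, 0)] * 2 + [(0, 0)] * 4
d, d_prime, r2 = ld_stats(toy)
print(round(d, 3), round(d_prime, 3), round(r2, 3))
```

The toy data show why both measures are usually reported: only three of the four gametes occur, so D' reaches its maximum, while r² stays well below 1 because the allele frequencies at the two SNPs differ.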
|
39
|
Snagger: a user-friendly program for incorporating additional information for tagSNP selection. BMC Bioinformatics 2008; 9:174. [PMID: 18371222 PMCID: PMC2375134 DOI: 10.1186/1471-2105-9-174] [Citation(s) in RCA: 36] [Impact Index Per Article: 2.3] [Received: 11/27/2007] [Accepted: 03/27/2008] [Indexed: 11/10/2022]
Abstract
Background There has been considerable effort focused on developing efficient programs for tagging single-nucleotide polymorphisms (SNPs). Many of these programs do not account for potential reduced genomic coverage resulting from genotyping failures nor do they preferentially select SNPs based on functionality, which may be more likely to be biologically important. Results We have developed a user-friendly and efficient software program, Snagger, as an extension to the existing open-source software, Haploview, which uses pairwise r² linkage disequilibrium between single nucleotide polymorphisms (SNPs) to select tagSNPs. Snagger distinguishes itself from existing SNP selection algorithms, including Tagger, by providing user options that allow for: (1) prioritization of tagSNPs based on certain characteristics, including platform-specific design scores, functionality (i.e., coding status), and chromosomal position, (2) efficient selection of SNPs across multiple populations, (3) selection of tagSNPs outside defined genomic regions to improve coverage and genotyping success, and (4) picking of surrogate tagSNPs that serve as backups for tagSNPs whose failure would result in a significant loss of data. Using HapMap genotype data from ten ENCODE regions and design scores for the Illumina platform, we show similar coverage and design score distribution and fewer total tagSNPs selected by Snagger compared to the web server Tagger. Conclusion Snagger improves upon current available tagSNP software packages by providing a means for researchers to select tagSNPs that reliably capture genetic variation across multiple populations while accounting for significant genotyping failure risk and prioritizing on SNP-specific characteristics.
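Pairwise r²-based tagSNP selection with prioritization, as described in this abstract, can be sketched as a generic greedy cover. This is an illustration only and does not reproduce the Snagger or Tagger algorithms: the function name, the `threshold` default of 0.8, and the optional per-SNP `score` list (standing in for something like a design score) are all assumptions made for the example.

```python
def greedy_tag_snps(r2, threshold=0.8, score=None):
    """Pick tag SNPs so that every SNP is either a tag or has
    r^2 >= threshold with at least one tag.

    r2: symmetric square matrix (list of lists) of pairwise r^2 values.
    score: optional per-SNP priority used only to break ties
           (higher is preferred); a hypothetical stand-in for design scores.
    """
    n = len(r2)
    score = score or [0.0] * n
    untagged = set(range(n))
    tags = []
    while untagged:
        def covered(s):
            # SNPs still untagged that candidate s would capture.
            return {t for t in untagged if t == s or r2[s][t] >= threshold}

        # Prefer the candidate covering the most untagged SNPs, then the
        # higher priority score, then the lower index for determinism.
        best = max(range(n), key=lambda s: (len(covered(s)), score[s], -s))
        tags.append(best)
        untagged -= covered(best)
    return tags


# Toy r^2 matrix (hypothetical values) for four SNPs.
r2 = [
    [1.00, 0.90, 0.85, 0.10],
    [0.90, 1.00, 0.50, 0.10],
    [0.85, 0.50, 1.00, 0.20],
    [0.10, 0.10, 0.20, 1.00],
]
print(greedy_tag_snps(r2))  # [0, 3]
```

SNP 0 captures SNPs 1 and 2 at the 0.8 threshold, so only SNP 3 needs its own tag. Using the score purely as a tie-breaker is one simple design choice; a real tool could weight it against coverage instead.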
|
40
|
Gu CC, Yu K, Rao DC. Characterization of LD structures and the utility of HapMap in genetic association studies. Advances in Genetics 2008; 60:407-35. [PMID: 18358328 DOI: 10.1016/s0065-2660(07)00415-4] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.6] [Indexed: 12/13/2022]
Abstract
Observed distribution of and variation in linkage disequilibrium (LD) with respect to the evolution history and disease transmission in a population is the driving force behind the current wave of genome-wide association (GWA) studies of complex human diseases. An extensive literature covers topics from haplotype analysis that utilizes local LD structures in candidate genes and regions to genome-wide organization of LD blocks (neighborhood) that led to the development of International HapMap Project and panels of "tagSNPs" used by current GWA studies. In this chapter, we examine the scenarios where each of the major types of analysis methods may be applicable and where the current popular genotyping platforms for GWA might come short. We discuss current association analysis methods by emphasizing their reliance on the local LD structures or the global organization of the LD structures, and highlight the need to consider individual marker information content in large-scale association mapping.
Affiliation(s)
- C Charles Gu
- Division of Biostatistics and Department of Genetics, Washington University School of Medicine, St. Louis, MO 63110, USA
|
41
|
Li Z, Zheng T, Califano A, Floratos A. Pattern-based mining strategy to detect multi-locus association and gene x environment interaction. BMC Proc 2007; 1 Suppl 1:S16. [PMID: 18466505 PMCID: PMC2367515 DOI: 10.1186/1753-6561-1-s1-s16] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.5] [Indexed: 11/10/2022]
Abstract
As genome-wide association studies grow in popularity for the identification of genetic factors for common and rare diseases, analytical methods to comb through large numbers of genetic variants efficiently to identify disease association are increasingly in demand. We have developed a pattern-based data-mining approach to discover unlinked multilocus genetic effects for complex disease and to detect genotype x phenotype/genotype x environment interactions. On a densely mapped chromosome 18 data set for rheumatoid arthritis that was made available by Genetic Analysis Workshop 15, this method detected two potential two-locus associations as well as a putative two-locus gene x gender interaction.
Affiliation(s)
- Zhong Li
- Department of Computational Genetics, High Throughput Biology Inc, 513 West Mount Pleasant Avenue, Livingston, New Jersey 07039, USA.
|
42
|
Mei H, Cuccaro ML, Martin ER. Multifactor dimensionality reduction-phenomics: a novel method to capture genetic heterogeneity with use of phenotypic variables. Am J Hum Genet 2007; 81:1251-61. [PMID: 17999363 DOI: 10.1086/522307] [Citation(s) in RCA: 49] [Impact Index Per Article: 2.9] [Received: 03/01/2007] [Accepted: 08/09/2007] [Indexed: 11/03/2022]
Abstract
Complex human diseases do not have a clear inheritance pattern, and it is expected that risk involves multiple genes with modest effects acting independently or interacting. Major challenges for the identification of genetic effects are genetic heterogeneity and difficulty in analyzing high-order interactions. To address these challenges, we present MDR-Phenomics, a novel approach based on the multifactor dimensionality reduction (MDR) method, to detect genetic effects in pedigree data by integration of phenotypic covariates (PCs) that may reflect genetic heterogeneity. The P value of the test is calculated using a permutation test adjusted for multiple tests. To validate MDR-Phenomics, we compared it with two MDR-based methods: (1) traditional MDR pedigree disequilibrium test (PDT) without consideration of PCs (MDR-PDT) and (2) stratified phenotype (SP) analysis based on PCs, with use of MDR-PDT with a Bonferroni adjustment (SP-MDR). Using computer simulations, we examined the statistical power and type I error of the different approaches under several genetic models and sampling scenarios. We conclude that MDR-Phenomics is more powerful than MDR-PDT and SP-MDR when there is genetic heterogeneity, and the statistical power is affected by sample size and the number of PC levels. We further compared MDR-Phenomics with conditional logistic regression (CLR) for testing interactions across single or multiple loci with consideration of PC. The results show that CLR with PC has only slightly smaller power than does MDR-Phenomics for single-locus analysis but has considerably smaller power for multiple loci. Finally, by applying MDR-Phenomics to autism, a complex disease in which multiple genes are believed to confer risk, we attempted to identify multiple gene effects in two candidate genes of interest--the serotonin transporter gene (SLC6A4) and the integrin beta 3 gene (ITGB3) on chromosome 17. 
Analyzing four markers in SLC6A4 and four markers in ITGB3 in 117 white family triads with autism and using sex of the proband as a PC, we found significant interaction between two markers--rs1042173 in SLC6A4 and rs3809865 in ITGB3.
Affiliation(s)
- H Mei
- Bioinformatics Research Center, North Carolina State University, Raleigh, NC, USA
|
43
|
Zhao W, Wang L, Lu X, Yang W, Huang J, Chen S, Gu D. A coding polymorphism of the kallikrein 1 gene is associated with essential hypertension: a tagging SNP-based association study in a Chinese Han population. J Hypertens 2007; 25:1821-7. [PMID: 17762646 DOI: 10.1097/hjh.0b013e328244e119] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.6] [Indexed: 10/23/2022]
Abstract
OBJECTIVE The aim of this study was to investigate the association between common variants in the human tissue kallikrein 1 (KLK1) gene and susceptibility to essential hypertension in Chinese Han. METHODS A tagging single nucleotide polymorphism (tSNP) approach was used for a case-control study in 2411 patients with essential hypertension and 2348 controls. All DNA samples and clinical data were collected from the International Collaborative Study of Cardiovascular Disease in Asia (InterASIA). RESULTS Based on the HapMap data of Han Chinese in Beijing (CHB) population, two non-synonymous polymorphisms, namely rs5517 (Glu162Lys) and rs5516 (Gln121Glu), were selected as tSNPs which could efficiently tag eight SNPs of the KLK1 gene with R larger than 90% for both haplotypes and single locus. Significant differences were found between groups for frequencies of rs5517 A allele (42.48% in cases versus 39.32% in controls, P=0.0019) and AA genotype [adjusted odds ratio (OR)=1.25 for AA versus AG/GG, P=0.0067]. The haplotype composed of the rs5517 A and rs5516 G allele significantly increased the risk of hypertension, with adjusted OR of 1.12 [95% confidence interval (CI), 1.04-1.28, P=0.0377] when compared with the common haplotype G-C. Diplotype analysis also showed a significant association between the diplotype of AG-AC and essential hypertension (OR=1.34, 95% CI, 1.07-1.68, P=0.0096). CONCLUSIONS The present study suggested that rs5517 in the KLK1 gene was significantly associated with essential hypertension in a Chinese Han population.
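As a worked illustration of the allele-frequency comparison reported in this abstract, the crude (unadjusted) allelic odds ratio can be computed directly from the two quoted frequencies of the rs5517 A allele. This is a back-of-the-envelope sketch: it is not the adjusted genotype or haplotype odds ratio that the authors report, and the function name is hypothetical.

```python
def allelic_odds_ratio(freq_cases, freq_controls):
    """Crude odds ratio for carrying an allele, from its frequency
    in cases and in controls (no covariate adjustment)."""
    odds_cases = freq_cases / (1.0 - freq_cases)
    odds_controls = freq_controls / (1.0 - freq_controls)
    return odds_cases / odds_controls


# rs5517 A allele: 42.48% in cases versus 39.32% in controls (from the abstract).
or_rs5517_a = allelic_odds_ratio(0.4248, 0.3932)
print(round(or_rs5517_a, 2))  # 1.14
```

The crude value of about 1.14 describes the allele-level contrast only; the odds ratios quoted in the abstract come from adjusted models on genotypes and haplotypes and are therefore not directly comparable.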
Affiliation(s)
- Weiyan Zhao
- Department of Evidence Based Medicine and Division of Population Genetics, Cardiovascular Institute and Fu Wai Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China
|
44
|
Levin AM, Zuhlke KA, Ray AM, Cooney KA, Douglas JA. Sequence variation in alpha-methylacyl-CoA racemase and risk of early-onset and familial prostate cancer. Prostate 2007; 67:1507-13. [PMID: 17683075 DOI: 10.1002/pros.20642] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.0] [Indexed: 11/11/2022]
Abstract
BACKGROUND Expression of the alpha-methylacyl-CoA racemase (AMACR) gene has been established as a sensitive and specific biomarker for the diagnosis of prostate cancer. An initial study has also suggested that the risk of familial (but not sporadic) prostate cancer may be associated with germline variation in the AMACR gene. METHODS In a study of brothers discordant for the diagnosis of prostate cancer (including 449 affected and 394 unaffected men) from 332 familial and early-onset prostate cancer families, we used conditional logistic regression and family-based association tests to investigate the association between prostate cancer and five single nucleotide polymorphisms (SNPs) tagging common haplotype variation within the coding and regulatory regions of AMACR. RESULTS The strongest evidence for prostate cancer association was for SNP rs3195676, with an estimated odds ratio of 0.58 (95% confidence interval = 0.38-0.90; P = 0.01 for a recessive model). This non-synonymous SNP (nsSNP) results in a methionine-to-valine substitution at codon 9 (M9V) in exon 2 of the AMACR gene. Three additional nsSNPs showed suggestive evidence for prostate cancer association (P ≤ 0.10). CONCLUSIONS Our results confirm an initial report of association between the AMACR gene and the risk of familial prostate cancer. These findings emphasize the value of studying early-onset and familial prostate cancer when attempting to identify genetic variation associated with prostate cancer.
Affiliation(s)
- Albert M Levin
- Department of Human Genetics, University of Michigan, Ann Arbor, Michigan 48109-0618, USA
|
45
|
Maekawa K, Saeki M, Saito Y, Ozawa S, Kurose K, Kaniwa N, Kawamoto M, Kamatani N, Kato K, Hamaguchi T, Yamada Y, Shirao K, Shimada Y, Muto M, Doi T, Ohtsu A, Yoshida T, Matsumura Y, Saijo N, Sawada JI. Genetic variations and haplotype structures of the DPYD gene encoding dihydropyrimidine dehydrogenase in Japanese and their ethnic differences. J Hum Genet 2007; 52:804-819. [PMID: 17828463 DOI: 10.1007/s10038-007-0186-6] [Citation(s) in RCA: 44] [Impact Index Per Article: 2.6] [Received: 05/30/2007] [Accepted: 07/26/2007] [Indexed: 01/10/2023]
Abstract
Dihydropyrimidine dehydrogenase (DPD) is an inactivating and rate-limiting enzyme for 5-fluorouracil (5-FU), and its deficiency is associated with a risk for developing a severe or fatal toxicity to 5-FU. In this study, to search for genetic variations of DPYD encoding DPD in Japanese, the putative promoter region, all exons, and flanking introns of DPYD were sequenced from 341 subjects including cancer patients treated with 5-FU. Fifty-five genetic variations, including 38 novel ones, were found and consisted of 4 in the 5'-flanking region, 21 (5 synonymous and 16 nonsynonymous) in the coding exons, and 30 in the introns. Nine novel nonsynonymous SNPs, 29C>A (Ala10Glu), 325T>A (Tyr109Asn), 451A>G (Asn151Asp), 733A>T (Ile245Phe), 793G>A (Glu265Lys), 1543G>A (Val515Ile), 1572T>G (Phe524Leu), 1666A>C (Ser556Arg), and 2678A>G (Asn893Ser), were found at allele frequencies between 0.15 and 0.88%. Two known nonsynonymous variations reported only in Japanese, 1003G>T (*11, Val335Leu) and 2303C>A (Thr768Lys), were found at allele frequencies of 0.15 and 2.8%, respectively. SNP and haplotype distributions in Japanese were quite different from those reported previously in Caucasians. This study provides fundamental information for pharmacogenetic studies for evaluating the efficacy and toxicity of 5-FU in Japanese and probably East Asians.
Affiliation(s)
- Keiko Maekawa
- Division of Biochemistry and Immunochemistry, National Institute of Health Sciences, 1-18-1 Kamiyoga, Setagaya-ku, Tokyo, 158-8501, Japan.
- Project Team for Pharmacogenetics, National Institute of Health Sciences, Tokyo, Japan.
- Mayumi Saeki
- Project Team for Pharmacogenetics, National Institute of Health Sciences, Tokyo, Japan
- Yoshiro Saito
- Division of Biochemistry and Immunochemistry, National Institute of Health Sciences, 1-18-1 Kamiyoga, Setagaya-ku, Tokyo, 158-8501, Japan
- Project Team for Pharmacogenetics, National Institute of Health Sciences, Tokyo, Japan
- Shogo Ozawa
- Project Team for Pharmacogenetics, National Institute of Health Sciences, Tokyo, Japan
- Division of Pharmacology, National Institute of Health Sciences, Tokyo, Japan
- Kouichi Kurose
- Project Team for Pharmacogenetics, National Institute of Health Sciences, Tokyo, Japan
- Division of Medicinal Safety Science, National Institute of Health Sciences, Tokyo, Japan
- Nahoko Kaniwa
- Project Team for Pharmacogenetics, National Institute of Health Sciences, Tokyo, Japan
- Division of Medicinal Safety Science, National Institute of Health Sciences, Tokyo, Japan
- Manabu Kawamoto
- Division of Genomic Medicine, Department of Advanced Biomedical Engineering and Science, Tokyo Women's Medical University, Tokyo, Japan
- Naoyuki Kamatani
- Division of Genomic Medicine, Department of Advanced Biomedical Engineering and Science, Tokyo Women's Medical University, Tokyo, Japan
- Ken Kato
- Gastrointestinal Oncology Division, National Cancer Center Hospital, National Cancer Center, Tokyo, Japan
- Tetsuya Hamaguchi
- Gastrointestinal Oncology Division, National Cancer Center Hospital, National Cancer Center, Tokyo, Japan
- Yasuhide Yamada
- Gastrointestinal Oncology Division, National Cancer Center Hospital, National Cancer Center, Tokyo, Japan
- Kuniaki Shirao
- Gastrointestinal Oncology Division, National Cancer Center Hospital, National Cancer Center, Tokyo, Japan
- Yasuhiro Shimada
- Gastrointestinal Oncology Division, National Cancer Center Hospital, National Cancer Center, Tokyo, Japan
- Manabu Muto
- Gastrointestinal Oncology Division, National Cancer Center Hospital East, Kashiwa, Japan
- Toshihiko Doi
- Division of GI Oncology/Digestive Endoscopy, National Cancer Center Hospital East, Kashiwa, Japan
- Atsushi Ohtsu
- Division of GI Oncology/Digestive Endoscopy, National Cancer Center Hospital East, Kashiwa, Japan
- Teruhiko Yoshida
- Genetics Division, National Cancer Center Research Institute, National Cancer Center, Tokyo, Japan
- Yasuhiro Matsumura
- Research Center of Innovative Oncology, National Cancer Center Hospital East, Kashiwa, Japan
- Nagahiro Saijo
- Deputy Director, National Cancer Center Hospital East, Kashiwa, Japan
- Jun-Ichi Sawada
- Division of Biochemistry and Immunochemistry, National Institute of Health Sciences, 1-18-1 Kamiyoga, Setagaya-ku, Tokyo, 158-8501, Japan
- Project Team for Pharmacogenetics, National Institute of Health Sciences, Tokyo, Japan
|
46
|
Nannya Y, Taura K, Kurokawa M, Chiba S, Ogawa S. Evaluation of genome-wide power of genetic association studies based on empirical data from the HapMap project. Hum Mol Genet 2007; 16:2494-505. [PMID: 17666406 DOI: 10.1093/hmg/ddm205] [Citation(s) in RCA: 26] [Impact Index Per Article: 1.5] [Indexed: 12/16/2022]
Abstract
With recent advances in high-throughput single nucleotide polymorphism (SNP) typing technologies, genome-wide association studies have become a realistic approach to identify the causative genes that are responsible for common diseases of complex genetic traits. In this strategy, a trade-off between the increased genome coverage and a chance of finding SNPs incidentally showing a large statistic becomes serious due to extreme multiple-hypothesis testing. We investigated the extent to which this trade-off limits the genome-wide power with this approach by simulating a large number of case-control panels based on the empirical data from the HapMap Project. In our simulations, statistical costs of multiple hypothesis testing were evaluated by empirically calculating distributions of the maximum value of the χ² statistic for a series of marker sets having increasing numbers of SNPs, which were used to determine a genome-wide threshold in the following power simulations. With a practical study size, the cost of multiple testing largely offsets the potential benefits from increased genome coverage given modest genetic effects and/or low frequencies of causal alleles. In most realistic scenarios, increasing genome coverage becomes less influential on the power, while sample size is the predominant determinant of the feasibility of genome-wide association tests. Increasing genome coverage without a corresponding increase in sample size will only consume resources with little gain in power. For common causal alleles with relatively large effect sizes [genotype relative risk ≥1.7], we can expect satisfactory power with currently available large-scale genotyping platforms using realistic sample size (approximately 1000 per arm).
Affiliation(s)
- Yasuhito Nannya
- Department of Hematology/Oncology, Graduate School of Medicine, University of Tokyo, Tokyo 113-8655, Japan
|
47
|
Sarkar Roy N, Farheen S, Roy N, Sengupta S, Majumder PP. Portability of tag SNPs across isolated population groups: an example from India. Ann Hum Genet 2007; 72:82-9. [PMID: 17627800 DOI: 10.1111/j.1469-1809.2006.00383.x] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.3] [Indexed: 11/28/2022]
Abstract
Isolated population groups are useful in conducting association studies of complex diseases to avoid various pitfalls, including those arising from population stratification. Since DNA resequencing is expensive, it is recommended that genotyping be carried out at tagSNP (tSNP) loci. For this, tSNPs identified in one isolated population need to be used in another. Unless tSNPs are highly portable across populations this strategy may result in loss of information in association studies. We examined the issue of tSNP portability by sampling individuals from 10 isolated ethnic groups from India. We generated DNA resequencing data pertaining to 3 genomic regions and identified tSNPs in each population. We defined an index of tSNP portability and showed that portability is low across isolated Indian ethnic groups. The extent of portability did not significantly correlate with genetic similarity among the populations studied here. We also analyzed our data with sequence data from individuals of African and European descent. Our results indicated that it may be necessary to carry out resequencing in a small number of individuals to discover SNPs and identify tSNPs in the specific isolated population in which a disease association study is to be conducted.
Affiliation(s)
- N Sarkar Roy
- TCG-ISI Centre for Population Genomics, Bengal Intelligent Park Ltd., Salt Lake Electronics Complex, Kolkata 700091, India
|
48
|
|
49
|
Douglas JA, Levin AM, Zuhlke KA, Ray AM, Johnson GR, Lange EM, Wood DP, Cooney KA. Common variation in the BRCA1 gene and prostate cancer risk. Cancer Epidemiol Biomarkers Prev 2007; 16:1510-6. [PMID: 17585057 PMCID: PMC3082399 DOI: 10.1158/1055-9965.epi-07-0137] [Citation(s) in RCA: 28] [Impact Index Per Article: 1.6] [Indexed: 12/17/2022]
Abstract
Rare inactivating mutations in the BRCA1 gene seem to play a limited role in prostate cancer. To our knowledge, however, no study has comprehensively assessed the role of other BRCA1 sequence variations (e.g., missense mutations) in prostate cancer. In a study of 817 men with and without prostate cancer from 323 familial and early-onset prostate cancer families, we used family-based association tests and conditional logistic regression to investigate the association between prostate cancer and single nucleotide polymorphisms (SNPs) tagging common haplotype variation in a 200-kb region surrounding (and including) the BRCA1 gene. We also used the Genotype-Identity-by-Descent Sharing Test to determine whether our most strongly associated SNP could account for prostate cancer linkage to chromosome 17q21 in a sample of 154 families from our previous genome-wide linkage study. The strongest evidence for prostate cancer association was for a glutamine-to-arginine substitution at codon 356 (Gln356Arg) in exon 11 of the BRCA1 gene. The minor (Arg) allele was preferentially transmitted to affected men (P = 0.005 for a dominant model), with an estimated odds ratio of 2.25 (95% confidence interval, 1.21-4.20). Notably, BRCA1 Gln356Arg is not in strong linkage disequilibrium with other BRCA1 coding SNPs or any known HapMap SNP on chromosome 17. In addition, Genotype-Identity-by-Descent Sharing Test results suggest that Gln356Arg accounts (in part) for our prior evidence of prostate cancer linkage to chromosome 17q21 (P = 0.022). Thus, we have identified a common, nonsynonymous substitution in the BRCA1 gene that is associated with and linked to prostate cancer.
Affiliation(s)
- Julie A Douglas
- Department of Human Genetics, University of Michigan, Room 5912, Buhl Building, Ann Arbor, MI 48109-0618, USA.
|
50
|
Khatkar MS, Zenger KR, Hobbs M, Hawken RJ, Cavanagh JAL, Barris W, McClintock AE, McClintock S, Thomson PC, Tier B, Nicholas FW, Raadsma HW. A primary assembly of a bovine haplotype block map based on a 15,036-single-nucleotide polymorphism panel genotyped in Holstein-Friesian cattle. Genetics 2007; 176:763-72. [PMID: 17435229 PMCID: PMC1894606 DOI: 10.1534/genetics.106.069369] [Citation(s) in RCA: 60] [Impact Index Per Article: 3.5] [Indexed: 01/11/2023]
Abstract
Analysis of data on 1000 Holstein-Friesian bulls genotyped for 15,036 single-nucleotide polymorphisms (SNPs) has enabled genomewide identification of haplotype blocks and tag SNPs. A final subset of 9195 SNPs in Hardy-Weinberg equilibrium and mapped on autosomes on the bovine sequence assembly (release Btau 3.1) was used in this study. The average intermarker spacing was 251.8 kb. The average minor allele frequency (MAF) was 0.29 (0.05-0.5). Following recent precedents in human HapMap studies, a haplotype block was defined where 95% of combinations of SNPs within a region are in very high linkage disequilibrium. A total of 727 haplotype blocks consisting of ≥3 SNPs were identified. The average block length was 69.7 ± 7.7 kb, which is approximately 5-10 times larger than in humans. These blocks comprised a total of 2964 SNPs and covered 50,638 kb of the sequence map, which constitutes 2.18% of the length of all autosomes. A set of tag SNPs, which will be useful for further fine-mapping studies, has been identified. Overall, the results suggest that as many as 75,000-100,000 tag SNPs would be needed to track all important haplotype blocks in the bovine genome. This would require approximately 250,000 SNPs in the discovery phase.
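The block statistics quoted in this abstract can be cross-checked with simple arithmetic. The sketch below only uses figures stated in the abstract; the variable names are illustrative, and the derived autosome length is an implied value, not one reported by the authors.

```python
# Figures quoted in the abstract.
n_blocks = 727            # haplotype blocks with at least 3 SNPs
mean_block_kb = 69.7      # average block length in kb
snps_in_blocks = 2964     # SNPs contained in those blocks
covered_kb = 50_638       # total sequence covered by blocks, in kb
covered_fraction = 0.0218 # share of total autosome length (2.18%)

# Blocks times mean length should roughly reproduce the reported coverage.
approx_covered_kb = n_blocks * mean_block_kb

# Coverage divided by its share gives the implied total autosome length.
implied_autosome_kb = covered_kb / covered_fraction

# Average number of SNPs per block.
snps_per_block = snps_in_blocks / n_blocks

print(round(approx_covered_kb), round(implied_autosome_kb), round(snps_per_block, 2))
```

The product of block count and mean length (about 50,672 kb) agrees with the reported 50,638 kb to within rounding of the mean, the implied autosome length is roughly 2.3 million kb, and blocks contain about four SNPs on average.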
Affiliation(s)
- Mehar S Khatkar
- Centre for Advanced Technologies in Animal Genetics and Reproduction (ReproGen), University of Sydney, Camden NSW 2570, Australia.
|