1
|
Abstract
We consider the emerging problem of measuring the similarity between (unlabeled) pedigrees. More specifically, we focus on the simplest pedigrees, namely, 2-generation pedigrees. We show that isomorphism testing for two 2-generation pedigrees is GI-hard. If the 2-generation pedigrees are monogamous (i.e., each individual at level-1 can mate with exactly one partner), then the isomorphism testing problem can be solved in polynomial time. We then relax the problem into an NP-complete decomposition problem, which can be formulated as the Minimum Common Integer Pair Partition (MCIPP) problem and which we show to be FPT by exploiting a property of the optimal solution. While some difficulties remain to be overcome, these results lay a solid foundation for this line of research.
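To illustrate why the monogamous case is easy, the following is a minimal sketch under a simplifying model that is an assumption here, not the paper's construction: each monogamous couple is summarised by a hypothetical integer pair (sons, daughters), and an unlabeled pedigree is just a collection of such pairs. Under that assumption, two pedigrees are isomorphic exactly when their multisets of pairs coincide, which can be checked in polynomial time. The function names are illustrative only.

```python
from collections import Counter


def canonical_form(families):
    """Return the multiset of (sons, daughters) pairs, one pair per couple.

    Assumption: a monogamous 2-generation pedigree is modelled as a
    disjoint collection of nuclear families, each described by one pair.
    """
    return Counter((int(sons), int(daughters)) for sons, daughters in families)


def monogamous_isomorphic(families_a, families_b):
    """Two unlabeled pedigrees match when their family multisets are equal."""
    return canonical_form(families_a) == canonical_form(families_b)


# Hypothetical example pedigrees.
ped_a = [(2, 1), (0, 3), (2, 1)]
ped_b = [(0, 3), (2, 1), (2, 1)]  # same families, different order
ped_c = [(2, 1), (1, 2), (2, 1)]  # one family differs

print(monogamous_isomorphic(ped_a, ped_b))
print(monogamous_isomorphic(ped_a, ped_c))
```

Comparing multisets rather than sorted lists keeps the check independent of the order in which families are listed.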
Affiliation(s)
- Haitao Jiang
- School of Computer Science and Technology, Shandong University, 1500 Shunhua Road, Jinan, Shandong, 250101, China
- Guohui Lin
- Department of Computing Science, University of Alberta, Edmonton, Alberta, T2G 2E6, Canada
- Weitian Tong
- Department of Computing Science, University of Alberta, Edmonton, Alberta, T2G 2E6, Canada
- Daming Zhu
- School of Computer Science and Technology, Shandong University, 1500 Shunhua Road, Jinan, Shandong, 250101, China
- Binhai Zhu
- Department of Computer Science, Montana State University, Bozeman, MT, 59717, USA
|
2
|
Improved maximum likelihood reconstruction of complex multi-generational pedigrees. Theor Popul Biol 2014; 97:11-9. [DOI: 10.1016/j.tpb.2014.07.002] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.1] [Received: 01/26/2014] [Revised: 07/11/2014] [Accepted: 07/16/2014] [Indexed: 11/17/2022]
|
3
|
He D, Furlotte NA, Hormozdiari F, Joo JWJ, Wadia A, Ostrovsky R, Sahai A, Eskin E. Identifying genetic relatives without compromising privacy. Genome Res 2014; 24:664-72. [PMID: 24614977 PMCID: PMC3975065 DOI: 10.1101/gr.153346.112] [Citation(s) in RCA: 27] [Impact Index Per Article: 2.7] [Indexed: 11/24/2022]
Abstract
The development of high-throughput genomic technologies has impacted many areas of genetic research. While many applications of these technologies focus on the discovery of genes involved in disease from population samples, applications of genomic technologies to an individual's genome or personal genomics have recently gained much interest. One such application is the identification of relatives from genetic data. In this application, genetic information from a set of individuals is collected in a database, and each pair of individuals is compared in order to identify genetic relatives. An inherent issue that arises in the identification of relatives is privacy. In this article, we propose a method for identifying genetic relatives without compromising privacy by taking advantage of novel cryptographic techniques customized for secure and private comparison of genetic information. We demonstrate the utility of these techniques by allowing a pair of individuals to discover whether or not they are related without compromising their genetic information or revealing it to a third party. The idea is that individuals only share enough special-purpose cryptographically protected information with each other to identify whether or not they are relatives, but not enough to expose any information about their genomes. We show in HapMap and 1000 Genomes data that our method can recover first- and second-order genetic relationships and, through simulations, show that our method can identify relationships as distant as third cousins while preserving privacy.
Affiliation(s)
- Dan He
- Department of Computer Science, University of California, Los Angeles, Los Angeles, California 90095, USA
|
4
|
Cussens J, Bartlett M, Jones EM, Sheehan NA. Maximum Likelihood Pedigree Reconstruction Using Integer Linear Programming. Genet Epidemiol 2012; 37:69-83. [DOI: 10.1002/gepi.21686] [Citation(s) in RCA: 52] [Impact Index Per Article: 4.3] [Received: 06/02/2012] [Revised: 08/30/2012] [Accepted: 09/07/2012] [Indexed: 11/10/2022]
Affiliation(s)
- James Cussens
- Department of Computer Science; University of York; York; North Yorkshire; United Kingdom
- Mark Bartlett
- Department of Computer Science; University of York; York; North Yorkshire; United Kingdom
- Elinor M. Jones
- Department of Health Sciences; University of Leicester; Leicester; Leicestershire; United Kingdom
- Nuala A. Sheehan
- Department of Health Sciences; University of Leicester; Leicester; Leicestershire; United Kingdom
|
5
|
Kirkpatrick B, Reshef Y, Finucane H, Jiang H, Zhu B, Karp RM. Comparing pedigree graphs. J Comput Biol 2012; 19:998-1014. [PMID: 22897201 DOI: 10.1089/cmb.2011.0254] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.8] [Indexed: 02/06/2023]
Abstract
Pedigree graphs, or family trees, are typically constructed by an expensive process of examining genealogical records to determine which pairs of individuals are parent and child. New methods to automate this process take as input genetic data from a set of extant individuals and reconstruct ancestral individuals. There is a great need to evaluate the quality of these methods by comparing the estimated pedigree to the true pedigree. In this article, we consider two main pedigree comparison problems. The first is the pedigree isomorphism problem, for which we present a linear-time algorithm for leaf-labeled pedigrees. The second is the pedigree edit distance problem, for which we present (1) several algorithms that are fast and exact in various special cases, and (2) a general, randomized heuristic algorithm. In the negative direction, we first prove that the pedigree isomorphism problem is as hard as the general graph isomorphism problem, and that the sub-pedigree isomorphism problem is NP-hard. We then show that the pedigree edit distance problem is APX-hard in general and NP-hard on leaf-labeled pedigrees. We use simulated pedigrees to compare our edit-distance algorithms to each other as well as to a branch-and-bound algorithm that always finds an optimal solution.
Affiliation(s)
- Bonnie Kirkpatrick
- Electrical Engineering and Computer Sciences, University of California, Berkeley, California, USA.
|
6
|
Kirkpatrick B, Li SC, Karp RM, Halperin E. Pedigree reconstruction using identity by descent. J Comput Biol 2011; 18:1481-93. [PMID: 22035331 DOI: 10.1089/cmb.2011.0156] [Citation(s) in RCA: 24] [Impact Index Per Article: 1.8] [Indexed: 02/04/2023]
Abstract
Can we find the family trees, or pedigrees, that relate the haplotypes of a group of individuals? Collecting the genealogical information for how individuals are related is a very time-consuming and expensive process. Methods for automating the construction of pedigrees could streamline this process. While constructing single-generation families is relatively easy given whole genome data, reconstructing multi-generational, possibly inbred, pedigrees is much more challenging. This article addresses the important question of reconstructing monogamous, regular pedigrees, where pedigrees are regular when individuals mate only with other individuals at the same generation. This article introduces two multi-generational pedigree reconstruction methods: one for inbreeding relationships and one for outbreeding relationships. In contrast to previous methods that focused on the independent estimation of relationship distances between every pair of typed individuals, here we present methods that aim at the reconstruction of the entire pedigree. We show that both our methods outperform the state-of-the-art and that the outbreeding method is capable of reconstructing pedigrees at least six generations back in time with high accuracy. The two programs are available at http://cop.icsi.berkeley.edu/cop/.
Affiliation(s)
- Bonnie Kirkpatrick
- Electrical Engineering and Computer Sciences, University of California, Berkeley, California 94720, USA.
|
7
|
Kyriazopoulou-Panagiotopoulou S, Kashef Haghighi D, Aerni SJ, Sundquist A, Bercovici S, Batzoglou S. Reconstruction of genealogical relationships with applications to Phase III of HapMap. Bioinformatics 2011; 27:i333-41. [PMID: 21685089 PMCID: PMC3117348 DOI: 10.1093/bioinformatics/btr243] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.1] [Indexed: 11/12/2022]
Abstract
MOTIVATION Accurate inference of genealogical relationships between pairs of individuals is paramount in association studies, forensics and evolutionary analyses of wildlife populations. Current methods for relationship inference consider only a small set of close relationships and have limited to no power to distinguish between relationships with the same number of meioses separating the individuals under consideration (e.g. aunt-niece versus niece-aunt or first cousins versus great aunt-niece). RESULTS We present CARROT (ClAssification of Relationships with ROTations), a novel framework for relationship inference that leverages linkage information to differentiate between rotated relationships, that is, between relationships with the same number of common ancestors and the same number of meioses separating the individuals under consideration. We demonstrate that CARROT clearly outperforms existing methods on simulated data. We also applied CARROT on four populations from Phase III of the HapMap Project and detected previously unreported pairs of third- and fourth-degree relatives. AVAILABILITY Source code for CARROT is freely available at http://carrot.stanford.edu. CONTACT sofiakp@stanford.edu.
|
8
|
Huff CD, Witherspoon DJ, Simonson TS, Xing J, Watkins WS, Zhang Y, Tuohy TM, Neklason DW, Burt RW, Guthery SL, Woodward SR, Jorde LB. Maximum-likelihood estimation of recent shared ancestry (ERSA). Genome Res 2011; 21:768-74. [PMID: 21324875 PMCID: PMC3083094 DOI: 10.1101/gr.115972.110] [Citation(s) in RCA: 107] [Impact Index Per Article: 8.2] [Received: 10/07/2010] [Accepted: 02/02/2011] [Indexed: 01/20/2023]
Abstract
Accurate estimation of recent shared ancestry is important for genetics, evolution, medicine, conservation biology, and forensics. Established methods estimate kinship accurately for first-degree through third-degree relatives. We demonstrate that chromosomal segments shared by two individuals due to identity by descent (IBD) provide much additional information about shared ancestry. We developed a maximum-likelihood method for the estimation of recent shared ancestry (ERSA) from the number and lengths of IBD segments derived from high-density SNP or whole-genome sequence data. We used ERSA to estimate relationships from SNP genotypes in 169 individuals from three large, well-defined human pedigrees. ERSA is accurate to within one degree of relationship for 97% of first-degree through fifth-degree relatives and 80% of sixth-degree and seventh-degree relatives. We demonstrate that ERSA's statistical power approaches the maximum theoretical limit imposed by the fact that distant relatives frequently share no DNA through a common ancestor. ERSA greatly expands the range of relationships that can be estimated from genetic data and is implemented in a freely available software package.
Affiliation(s)
- Chad D. Huff
- Department of Human Genetics, University of Utah Health Sciences Center, Salt Lake City, Utah 84112, USA
- David J. Witherspoon
- Department of Human Genetics, University of Utah Health Sciences Center, Salt Lake City, Utah 84112, USA
- Tatum S. Simonson
- Department of Human Genetics, University of Utah Health Sciences Center, Salt Lake City, Utah 84112, USA
- Jinchuan Xing
- Department of Human Genetics, University of Utah Health Sciences Center, Salt Lake City, Utah 84112, USA
- W. Scott Watkins
- Department of Human Genetics, University of Utah Health Sciences Center, Salt Lake City, Utah 84112, USA
- Yuhua Zhang
- Department of Human Genetics, University of Utah Health Sciences Center, Salt Lake City, Utah 84112, USA
- Therese M. Tuohy
- Huntsman Cancer Institute, University of Utah, Salt Lake City, Utah 84112, USA
- Deborah W. Neklason
- Huntsman Cancer Institute, University of Utah, Salt Lake City, Utah 84112, USA
- Randall W. Burt
- Huntsman Cancer Institute, University of Utah, Salt Lake City, Utah 84112, USA
- Stephen L. Guthery
- Department of Pediatrics, University of Utah School of Medicine, Salt Lake City, Utah 84108, USA
- Scott R. Woodward
- Sorenson Molecular Genealogy Foundation, Salt Lake City, Utah 84115, USA
- Lynn B. Jorde
- Department of Human Genetics, University of Utah Health Sciences Center, Salt Lake City, Utah 84112, USA
|
9
|
Kirkpatrick B, Li SC, Karp RM, Halperin E. Pedigree Reconstruction Using Identity by Descent. Lecture Notes in Computer Science 2011. [DOI: 10.1007/978-3-642-20036-6_15] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Indexed: 02/04/2023]
|
10
|
Berkovic SF, Dibbens LM, Oshlack A, Silver JD, Katerelos M, Vears DF, Lüllmann-Rauch R, Blanz J, Zhang KW, Stankovich J, Kalnins RM, Dowling JP, Andermann E, Andermann F, Faldini E, D'Hooge R, Vadlamudi L, Macdonell RA, Hodgson BL, Bayly MA, Savige J, Mulley JC, Smyth GK, Power DA, Saftig P, Bahlo M. Array-based gene discovery with three unrelated subjects shows SCARB2/LIMP-2 deficiency causes myoclonus epilepsy and glomerulosclerosis. Am J Hum Genet 2008; 82:673-84. [PMID: 18308289 PMCID: PMC2427287 DOI: 10.1016/j.ajhg.2007.12.019] [Citation(s) in RCA: 175] [Impact Index Per Article: 10.9] [Received: 10/25/2007] [Revised: 12/10/2007] [Accepted: 12/28/2007] [Indexed: 01/09/2023]
Abstract
Action myoclonus-renal failure syndrome (AMRF) is an autosomal-recessive disorder with the remarkable combination of focal glomerulosclerosis, frequently with glomerular collapse, and progressive myoclonus epilepsy associated with storage material in the brain. Here, we employed a novel combination of molecular strategies to find the responsible gene and show its effects in an animal model. Utilizing only three unrelated affected individuals and their relatives, we used homozygosity mapping with single-nucleotide polymorphism chips to localize AMRF. We then used microarray-expression analysis to prioritize candidates prior to sequencing. The disorder was mapped to 4q13-21, and microarray-expression analysis identified SCARB2/Limp2, which encodes a lysosomal-membrane protein, as the likely candidate. Mutations in SCARB2/Limp2 were found in all three families used for mapping and subsequently confirmed in two other unrelated AMRF families. The mutations were associated with lack of SCARB2 protein. Reanalysis of an existing Limp2 knockout mouse showed intracellular inclusions in cerebral and cerebellar cortex, and the kidneys showed subtle glomerular changes. This study highlights that recessive genes can be identified with a very small number of subjects. The ancestral lysosomal-membrane protein SCARB2/LIMP-2 is responsible for AMRF. The heterogeneous pathology in the kidney and brain suggests that SCARB2/Limp2 has pleiotropic effects that may be relevant to understanding the pathogenesis of other forms of glomerulosclerosis or collapse and myoclonic epilepsies.
Affiliation(s)
- Samuel F Berkovic
- Department of Medicine, Austin Health and Northern Health, Heidelberg, Victoria 3081, Australia.
|
11
|
Jessup B, Ward E, Cahill L, Keating D. Prevalence of speech and/or language impairment in preparatory students in northern Tasmania. International Journal of Speech-Language Pathology 2008; 10:364-377. [PMID: 20840035 DOI: 10.1080/17549500701871171] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.8] [Indexed: 05/29/2023]
Abstract
The purpose of this paper is to report the prevalence of speech and/or language impairment in a sample of preparatory students in northern Tasmania, Australia. A total of 308 preparatory students attending 30 public schools in northern Tasmania were administered assessments by a speech-language pathologist, and subsequently diagnosed with either typical or impaired speech and/or language skills. Overall, 41.2% of assessed preparatory students were identified as having either speech and/or language impairment. Specifically, 8.7% of students were found to have isolated speech impairment, 18.2% were diagnosed with isolated language impairment, and 14.3% were identified as having comorbid speech and language impairment. Compared to prior Australian and international research, the present data reflect one of the highest prevalence estimates for speech and/or language impairment reported to date. Given the relative paucity of Australian prevalence data, further epidemiological research specifically of Australian children is needed to validate the current findings.
|
12
|
Sheehan NA, Egeland T. Adjusting for founder relatedness in a linkage analysis using prior information. Hum Hered 2007; 65:221-31. [PMID: 18073492 DOI: 10.1159/000112369] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.4] [Received: 04/03/2007] [Accepted: 07/31/2007] [Indexed: 11/19/2022]
Abstract
In genetic linkage studies, while the pedigrees are generally known, background relatedness between the founding individuals, assumed by definition to be unrelated, can seriously affect the results of the analysis. Likelihood approaches to relationship estimation from genetic marker data can all be expressed in terms of finding the most likely pedigree connecting the individuals of interest. When the true relationship is the main focus, the set of all possible alternative pedigrees can be too large to consider. However, prior information is often available which, when incorporated in a formal and structured way, can restrict this set to a manageable size thus enabling the calculation of a posterior distribution from which inferences can be drawn. Here, the unknown relationships are more of a nuisance factor than of interest in their own right, so the focus is on adjusting the results of the analysis rather than on direct estimation. In this paper, we show how prior information on founder relationships can be exploited in some applications to generate a set of candidate extended pedigrees. We then weight the relevant pedigree-specific likelihoods by their posterior probabilities to adjust the lod score statistics.
Affiliation(s)
- N A Sheehan
- Department of Health Sciences and Department of Genetics, University of Leicester, Leicester, UK.
|
13
|
Thornton T, McPeek MS. Case-control association testing with related individuals: a more powerful quasi-likelihood score test. Am J Hum Genet 2007; 81:321-37. [PMID: 17668381 PMCID: PMC1950805 DOI: 10.1086/519497] [Citation(s) in RCA: 148] [Impact Index Per Article: 8.7] [Received: 03/26/2007] [Accepted: 05/07/2007] [Indexed: 01/23/2023]
Abstract
We consider the problem of genomewide association testing of a binary trait when some sampled individuals are related, with known relationships. This commonly arises when families sampled for a linkage study are included in an association study. Furthermore, power to detect association with complex traits can be increased when affected individuals with affected relatives are sampled, because they are more likely to carry disease alleles than are randomly sampled affected individuals. With related individuals, correlations among relatives must be taken into account, to ensure validity of the test, and consideration of these correlations can also improve power. We provide new insight into the use of pedigree-based weights to improve power, and we propose a novel test, the MQLS test, which, as we demonstrate, represents an overall, and in many cases, substantial, improvement in power over previous tests, while retaining a computational simplicity that makes it useful in genomewide association studies in arbitrary pedigrees. Other features of the MQLS are as follows: (1) it is applicable to completely general combinations of family and case-control designs, (2) it can incorporate both unaffected controls and controls of unknown phenotype into the same analysis, and (3) it can incorporate phenotype data about relatives with missing genotype data. The methods are applied to data from the Genetic Analysis Workshop 14 Collaborative Study of the Genetics of Alcoholism, where the MQLS detects genomewide significant association (after Bonferroni correction) with an alcoholism-related phenotype for four different single-nucleotide polymorphisms: tsc1177811 (P = 5.9 × 10^-7), tsc1750530 (P = 4.0 × 10^-7), tsc0046696 (P = 4.7 × 10^-7), and tsc0057290 (P = 5.2 × 10^-7) on chromosomes 1, 16, 18, and 18, respectively. Three of these four significant associations were not detected in previous studies analyzing these data.
Affiliation(s)
- Timothy Thornton
- Department of Statistics, University of Chicago, Chicago, IL 60637, USA
|
14
|
Rubio JP, Bahlo M, Stankovich J, Burfoot RK, Johnson LJ, Huxtable S, Butzkueven H, Lin L, Taylor BV, Speed TP, Kilpatrick TJ, Mignot E, Foote SJ. Analysis of extended HLA haplotypes in multiple sclerosis and narcolepsy families confirms a predisposing effect for the class I region in Tasmanian MS patients. Immunogenetics 2007; 59:177-86. [PMID: 17256150 DOI: 10.1007/s00251-006-0183-5] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.6] [Received: 08/05/2006] [Accepted: 11/14/2006] [Indexed: 10/23/2022]
Abstract
Human leucocyte antigen (HLA)-DRB1*15 is associated with predisposition to multiple sclerosis (MS), although conjecture surrounds the possible involvement of an alternate risk locus in the class I region of the HLA complex. We have shown previously that an alternate MS risk allele(s) may be encompassed by the telomerically extended DRB1*15 haplotype, and here, we have attempted to map the putative variant. Thirteen microsatellite markers encompassing a 6.79-megabase (D6S2236-G51152) region, and the DRB1 and DQB1 genes, were genotyped in 166 MS simplex families and 104 control families from the Australian State of Tasmania and 153 narcolepsy simplex families (trios) from the USA. Complementary approaches were used to investigate residual predisposing effects of microsatellite alleles comprising the extended DRB1*15 haplotype taking into account the strong predisposing effect of DRB1*15: (1) Disease association of the extended DRB1*15 haplotype was compared for MS and narcolepsy families--predisposing effects were observed for extended class I microsatellite marker alleles in MS families, but not narcolepsy families; (2) disease association of the extended DRB1*15 haplotype was investigated after conditioning MS and control haplotypes on the absence of DRB1*15--a significant predisposing effect was observed for a 627-kb haplotype (D6S258 allele 8-MOGCA allele 4; MOG, myelin oligodendrocyte glycoprotein) spanning the extended class I region. MOGCA allele 4 displayed the strongest predisposing effect in DRB1*15-conditioned haplotypes (p = 0.0006; OR 2.83 [1.54-5.19]). Together, these data confirm that an alternate MS risk locus exists in the extended class I region in Tasmanian MS patients independent of DRB1*15.
Affiliation(s)
- Justin P Rubio
- The Neurogenetics Laboratory, The Howard Florey Institute, University of Melbourne, Melbourne, Victoria, Australia.
|
15
|
Sheehan NA, Egeland T. Structured Incorporation of Prior Information in Relationship Identification Problems. Ann Hum Genet 2007; 71:501-18. [PMID: 17233753 DOI: 10.1111/j.1469-1809.2006.00345.x] [Citation(s) in RCA: 16] [Impact Index Per Article: 0.9] [Indexed: 11/25/2022]
Abstract
The objective of this paper is to show how various sources of information can be modelled and integrated to address relationship identification problems. Applications come from areas as diverse as evolution and conservation research, genealogical research in human, plant and animal populations, and forensic problems including paternity cases, identification following disasters, family reunions and immigration issues. We propose assigning a prior probability distribution to the sample space of pedigrees, calculating the likelihood based on DNA data using available software and posterior probabilities using Bayes' Theorem. Our emphasis here is on the modelling of this prior information in a formal and consistent manner. We introduce the distinction between local and global prior information, whereby local information usually applies to particular components of the pedigree and global prior information refers to more general features. When it is difficult to decide on a prior distribution, robustness to various choices should be studied. When suitable prior information is not available, a flat prior can be used which will then correspond to a strict likelihood approach. In practice, prior information is often considered for these problems, but in a generally ad hoc manner. This paper offers a consistent alternative. We emphasise that many practical problems can be addressed using freely available software.
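The Bayesian step described in this abstract (a prior over candidate pedigrees, a DNA-based likelihood, and a posterior via Bayes' Theorem) can be sketched as follows. This is a minimal illustration, not the authors' software: the function name, the pedigree labels, and the numeric values are all hypothetical, and the likelihoods are assumed to have been computed elsewhere.

```python
def pedigree_posterior(priors, likelihoods):
    """Posterior over candidate pedigrees: prior times likelihood, normalised.

    Both arguments are dicts keyed by a pedigree label (labels are
    hypothetical); likelihoods are assumed to come from external software.
    """
    if priors.keys() != likelihoods.keys():
        raise ValueError("priors and likelihoods must cover the same pedigrees")
    unnormalised = {k: priors[k] * likelihoods[k] for k in priors}
    total = sum(unnormalised.values())
    if total <= 0:
        raise ValueError("at least one pedigree needs positive prior and likelihood")
    return {k: v / total for k, v in unnormalised.items()}


# Made-up values for illustration only.
likelihoods = {"full_sibs": 1e-4, "half_sibs": 3e-4}
flat_prior = {"full_sibs": 0.5, "half_sibs": 0.5}
informative_prior = {"full_sibs": 0.9, "half_sibs": 0.1}

print(pedigree_posterior(flat_prior, likelihoods))
print(pedigree_posterior(informative_prior, likelihoods))
```

With the flat prior the posterior is simply the normalised likelihood, which matches the abstract's remark that a flat prior corresponds to a strict likelihood approach.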
Affiliation(s)
- N A Sheehan
- Department of Health Sciences, University of Leicester, University Road, Leicester LE1 7RH, UK.
|
16
|
Bahlo M, Stankovich J, Speed TP, Rubio JP, Burfoot RK, Foote SJ. Detecting genome wide haplotype sharing using SNP or microsatellite haplotype data. Hum Genet 2005; 119:38-50. [PMID: 16362347 DOI: 10.1007/s00439-005-0114-9] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.5] [Received: 09/22/2005] [Accepted: 11/23/2005] [Indexed: 11/26/2022]
Abstract
Genome wide association studies using high throughput technology are already being conducted despite the significant hurdles that need to be overcome (Nat Rev Genet 6:95-108, 2005; Nat Rev Genet 6:109-118, 2005). Methods for detecting haplotype association signals in genome wide haplotype datasets are as yet very limited. Much methodological research has already been devoted to linkage disequilibrium (LD) fine mapping where the focus is the identification of the disease locus rather than the detection of a disease signal. Applications of these approaches to genome wide scanning are limited by the strong model assumptions of the sharing process, which lead to computational complexity. We describe a new algorithm for the initial identification of disease susceptibility loci in genome wide haplotype association studies. Excess sharing of ancestral haplotypes, which indicates the presence of a disease locus, is detected with a simple, easy to interpret, χ²-based statistic. The method allows genome wide scanning for qualitative traits within reasonable computational timeframes and can serve as a first pass analysis prior to the usage of likelihood based methods, providing candidate regions and inferred susceptibility haplotypes. Our method makes no assumptions regarding the population history or the pattern of background LD. Statistical significance is evaluated with permutation tests. The method is illustrated on simulated and real data where it is applied to simple (cystic fibrosis) and complex disease (multiple sclerosis) examples. The statistic has low type I error and greater power to map disease loci over conventional single marker tests for low to moderate levels of LD.
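The abstract does not spell out the sharing statistic, so the following sketch only illustrates the general idea of assessing a chi-square statistic by permutation. It assumes a plain 2x2 table of haplotype carrier status against case status, which is a stand-in and not the authors' statistic; all names and the toy data are hypothetical.

```python
import random


def chi2_2x2(a, b, c, d):
    """Pearson chi-square for the 2x2 table [[a, b], [c, d]] (no correction)."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    return 0.0 if denom == 0 else n * (a * d - b * c) ** 2 / denom


def sharing_statistic(carrier, is_case):
    """Chi-square of carrier status against case status (assumed stand-in)."""
    pairs = list(zip(carrier, is_case))
    a = sum(1 for x, y in pairs if x and y)
    b = sum(1 for x, y in pairs if x and not y)
    c = sum(1 for x, y in pairs if not x and y)
    d = sum(1 for x, y in pairs if not x and not y)
    return chi2_2x2(a, b, c, d)


def permutation_p_value(carrier, is_case, n_perm=2000, seed=0):
    """Shuffle case labels to build the null distribution of the statistic."""
    rng = random.Random(seed)
    observed = sharing_statistic(carrier, is_case)
    labels = list(is_case)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(labels)
        if sharing_statistic(carrier, labels) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly positive.
    return (hits + 1) / (n_perm + 1)


# Toy data: every carrier is a case, every non-carrier is a control.
carrier = [True] * 10 + [False] * 10
is_case = [True] * 10 + [False] * 10
print(permutation_p_value(carrier, is_case))
```

Shuffling the case labels leaves the carrier counts fixed, so the null distribution needs no assumption about population history, in the spirit of the permutation approach the abstract mentions.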
Affiliation(s)
- Melanie Bahlo
- The Walter and Eliza Hall Institute of Medical Research, 1G Royal Parade, 3050 Parkville, VIC, Australia.
|