1
Vsevolozhskaya OA, Zaykin DV. Quantifying posterior effect size distribution of susceptibility loci by common summary statistics. Genet Epidemiol 2020; 44:339-351. [PMID: 32100375 DOI: 10.1002/gepi.22286] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/30/2019] [Revised: 12/25/2019] [Accepted: 01/27/2020] [Indexed: 11/06/2022]
Abstract
Testing millions of single nucleotide polymorphisms (SNPs) in genetic association studies has become a standard routine for disease gene discovery. In light of recent re-evaluation of statistical practice, it has been suggested that p-values are unfit as summaries of statistical evidence. Despite this criticism, p-values contain information that can be utilized to address the concerns about their flaws. We present a new method for utilizing evidence summarized by p-values for estimating odds ratio (OR) based on its approximate posterior distribution. In our method, only p-values, sample size, and standard deviation for ln(OR) are needed as summaries of data, accompanied by a suitable prior distribution for ln(OR) that can assume any shape. The parameter of interest, ln(OR), is the only parameter with a specified prior distribution, hence our model is a mix of classical and Bayesian approaches. We show that our method retains the main advantages of the Bayesian approach: it yields direct probability statements about hypotheses for OR and is resistant to biases caused by selection of top-scoring SNPs. Our method enjoys greater flexibility than similarly inspired methods in the assumed distribution for the summary statistic and in the form of the prior for the parameter of interest. We illustrate our method by presenting interval estimates of effect size for reported genetic associations with lung cancer. Although we focus on OR, the method is not limited to this particular measure of effect size and can be used broadly for assessing reliability of findings in studies testing multiple predictors.
Affiliation(s)
- Dmitri V Zaykin
  - Biostatistics and Computational Biology, National Institute of Environmental Health Sciences, National Institutes of Health, Research Triangle Park, North Carolina
2
Vilor-Tejedor N, Alemany S, Cáceres A, Bustamante M, Pujol J, Sunyer J, González JR. Strategies for integrated analysis in imaging genetics studies. Neurosci Biobehav Rev 2018; 93:57-70. [PMID: 29944960 DOI: 10.1016/j.neubiorev.2018.06.013] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Received: 11/29/2017] [Revised: 04/30/2018] [Accepted: 06/15/2018] [Indexed: 02/06/2023]
Abstract
Imaging Genetics (IG) integrates neuroimaging and genomic data from the same individual, deepening our knowledge of the biological mechanisms behind neurodevelopmental domains and neurological disorders. Although the literature on IG has exponentially grown over the past years, the majority of studies have mainly analyzed associations between candidate brain regions and individual genetic variants. However, this strategy is not designed to deal with the complexity of neurobiological mechanisms underlying behavioral and neurodevelopmental domains. Moreover, larger sample sizes and increased multidimensionality of this type of data represents a challenge for standardizing modeling procedures in IG research. This review provides a systematic update of the methods and strategies currently used in IG studies, and serves as an analytical framework for researchers working in this field. To complement the functionalities of the Neuroconductor framework, we also describe existing R packages that implement these methodologies. In addition, we present an overview of how these methodological approaches are applied in integrating neuroimaging and genetic data.
Affiliation(s)
- Natàlia Vilor-Tejedor
  - Barcelona Research Institute for Global Health (ISGlobal), Barcelona, Spain; Universitat Pompeu Fabra (UPF), Barcelona, Spain; CIBER Epidemiología y Salud Pública (CIBERESP), Barcelona, Spain; Center for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Barcelona, Spain; Barcelona Beta Brain Research Center (BBRC) - Pasqual Maragall Foundation, Barcelona, Spain
- Silvia Alemany
  - Barcelona Research Institute for Global Health (ISGlobal), Barcelona, Spain; Universitat Pompeu Fabra (UPF), Barcelona, Spain; CIBER Epidemiología y Salud Pública (CIBERESP), Barcelona, Spain
- Alejandro Cáceres
  - Barcelona Research Institute for Global Health (ISGlobal), Barcelona, Spain; Universitat Pompeu Fabra (UPF), Barcelona, Spain; CIBER Epidemiología y Salud Pública (CIBERESP), Barcelona, Spain
- Mariona Bustamante
  - Barcelona Research Institute for Global Health (ISGlobal), Barcelona, Spain; Universitat Pompeu Fabra (UPF), Barcelona, Spain; CIBER Epidemiología y Salud Pública (CIBERESP), Barcelona, Spain; Center for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Barcelona, Spain
- Jesús Pujol
  - MRI Research Unit, Hospital del Mar, Centro de Investigación Biomédica en Red de Salud Mental, CIBERSAM G21, Barcelona, Spain
- Jordi Sunyer
  - Barcelona Research Institute for Global Health (ISGlobal), Barcelona, Spain; Universitat Pompeu Fabra (UPF), Barcelona, Spain; CIBER Epidemiología y Salud Pública (CIBERESP), Barcelona, Spain; IMIM (Hospital del Mar Medical Research Institute), Barcelona, Spain
- Juan R González
  - Barcelona Research Institute for Global Health (ISGlobal), Barcelona, Spain; Universitat Pompeu Fabra (UPF), Barcelona, Spain; CIBER Epidemiología y Salud Pública (CIBERESP), Barcelona, Spain
3
Otani T, Noma H, Nishino J, Matsui S. Re-assessment of multiple testing strategies for more efficient genome-wide association studies. Eur J Hum Genet 2018. [PMID: 29523830 DOI: 10.1038/s41431-018-0125-3] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Indexed: 11/09/2022]
Abstract
Although enormous costs have been dedicated to discovering relevant disease-related genetic variants, especially in genome-wide association studies (GWASs), only a small fraction of estimated heritability can be explained by these results. This is the so-called missing heritability problem. The conventional use of overly conservative multiple testing strategies based on controlling the familywise error rate (FWER), in particular with a genome-wide significance threshold of P < 5 × 10⁻⁸, is one of the most important issues from a statistical perspective. To help resolve this problem, we performed comprehensive re-assessments of currently available strategies using recently published, extremely large-scale GWAS data sets of rheumatoid arthritis and schizophrenia (>50,000 subjects). The estimates of statistical power averaged for all disease-related genetic variants of the standard FWER-based strategy were only 0.09% for the rheumatoid arthritis data and 0.04% for the schizophrenia data. To design more efficient strategies, we also conducted an extensive comparison of multiple testing strategies by applying false discovery rate (FDR)-controlling procedures to these data sets and simulations, and found that the FDR-based procedures achieved higher power than the FWER-based strategy, even at a strict FDR level (e.g., FDR = 1%). We also discuss a useful alternative measure, namely "partial power," which is an averaged power for detecting the clinically and biologically meaningful genetic factors with the largest effects. Simulation results suggest that the FDR-based procedures can achieve sufficient partial power (>80%) for detecting these factors (odds ratios of >1.05) with 80,000 subjects, and thus this may be a useful measure for defining realistic objectives of future GWASs.
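As a rough illustration of the FWER-versus-FDR contrast this abstract describes, the sketch below compares a Bonferroni cut-off with the Benjamini-Hochberg step-up rule on a toy list of p-values. The abstract does not say which FDR-controlling procedures the authors used, so Benjamini-Hochberg is an assumption here, and the function names, thresholds, and p-values are hypothetical rather than taken from the paper.

```python
def bonferroni(pvalues, alpha=0.05):
    """Indices rejected when each test is held to alpha / m (FWER control)."""
    m = len(pvalues)
    return [i for i, p in enumerate(pvalues) if p <= alpha / m]


def benjamini_hochberg(pvalues, q=0.05):
    """Indices rejected by the Benjamini-Hochberg step-up rule (FDR control)."""
    m = len(pvalues)
    # Sort test indices by ascending p-value.
    order = sorted(range(m), key=lambda i: pvalues[i])
    # Find the largest rank k whose p-value is at most k * q / m.
    k = 0
    for rank, i in enumerate(order, start=1):
        if pvalues[i] <= rank * q / m:
            k = rank
    # Reject every hypothesis ranked at or below k.
    return sorted(order[:k])


# Hypothetical p-values for eight tests (illustration only, not study data).
pvals = [0.0001, 0.004, 0.015, 0.02, 0.2, 0.5, 0.7, 0.9]

print("Bonferroni rejects:", bonferroni(pvals))
print("Benjamini-Hochberg rejects:", benjamini_hochberg(pvals))
```

On this toy input the FDR rule rejects a superset of what the Bonferroni rule rejects, which is the qualitative pattern the abstract reports at genome scale.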
Affiliation(s)
- Takahiro Otani
  - Risk Analysis Research Center, The Institute of Statistical Mathematics, Tachikawa, Tokyo, 190-8562, Japan
- Hisashi Noma
  - Department of Data Science, The Institute of Statistical Mathematics, Tachikawa, Tokyo, 190-8562, Japan
- Jo Nishino
  - Department of Biostatistics, Nagoya University Graduate School of Medicine, Nagoya, Aichi, 466-8550, Japan
- Shigeyuki Matsui
  - Department of Biostatistics, Nagoya University Graduate School of Medicine, Nagoya, Aichi, 466-8550, Japan
4
Kuo KH. Multiple Testing in the Context of Gene Discovery in Sickle Cell Disease Using Genome-Wide Association Studies. Genomics Insights 2017; 10:1178631017721178. [PMID: 28811740 PMCID: PMC5542087 DOI: 10.1177/1178631017721178] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.1] [Received: 10/30/2016] [Accepted: 06/26/2017] [Indexed: 12/25/2022]
Abstract
The issue of multiple testing, also termed multiplicity, is ubiquitous in studies where multiple hypotheses are tested simultaneously. Genome-wide association study (GWAS), a type of genetic association study that has gained popularity in the past decade, is most susceptible to the issue of multiple testing. Different methodologies have been employed to address the issue of multiple testing in GWAS. The purpose of the review is to examine the methodologies employed in dealing with multiple testing in the context of gene discovery using GWAS in sickle cell disease complications.
Affiliation(s)
- Kevin H.M. Kuo
  - Departments of Medical Oncology and Hematology and Medicine, University Health Network, Toronto, ON, Canada
  - Division of Hematology, Department of Medicine, University of Toronto, Toronto, ON, Canada
5
Fu MR, Conley YP, Axelrod D, Guth AA, Yu G, Fletcher J, Zagzag D. Precision assessment of heterogeneity of lymphedema phenotype, genotypes and risk prediction. Breast 2016; 29:231-40. [PMID: 27460425 DOI: 10.1016/j.breast.2016.06.023] [Citation(s) in RCA: 37] [Impact Index Per Article: 4.6] [Received: 03/14/2016] [Revised: 06/17/2016] [Accepted: 06/23/2016] [Indexed: 12/11/2022]
Abstract
Lymphedema following breast cancer surgery is considered to be mainly due to the mechanical injury from surgery. Recent research identified that inflammation-infection and obesity may be the important predictors for lymphedema. The purpose of this exploratory research was to prospectively examine phenotype of arm lymphedema defined by limb volume and lymphedema symptoms in relation to inflammatory genes in women treated for breast cancer. A prospective, descriptive and repeated-measure design using candidate gene association method was used to enroll 140 women at pre-surgery and followed at 4-8 weeks and 12 months post-surgery. Arm lymphedema was determined by a perometer measurement of ≥5% limb volume increase from baseline of pre-surgery. Lymphedema symptom phenotype was evaluated using a reliable and valid instrument. Saliva samples were collected for DNA extraction. Genes known for inflammation were evaluated, including lymphatic specific growth factors (VEGF-C & VEGF-D), cytokines (IL1-α, IL-4, IL6, IL8, IL10, & IL13), and tumor necrosis factor-α (TNF-α). No significant associations were found between arm lymphedema phenotype and any inflammatory genetic variations. IL1-α rs17561 was marginally associated with symptom count phenotype of ≥8 symptoms. IL-4 rs2070874 was significantly associated with phenotype of impaired limb mobility and fluid accumulation. Phenotype of fluid accumulation was significantly associated with IL6 rs1800795, IL4 rs2243250 and IL4 rs2070874. Phenotype of discomfort was significantly associated with VEGF-C rs3775203 and IL13 rs1800925. Precision assessment of heterogeneity of lymphedema phenotype and understanding the biological mechanism of each phenotype through the exploration of inherited genetic susceptibility is essential for finding a cure. Further exploration of investigative intervention in the context of genotype and gene expressions would advance our understanding of heterogeneity of lymphedema phenotype.
Affiliation(s)
- Mei R Fu
  - NYU Rory Meyers College of Nursing, New York University, New York, NY, USA; NYU Laura and Isaac Perlmutter Cancer Center, New York, NY, USA
- Yvette P Conley
  - School of Nursing, University of Pittsburgh, Pittsburgh, PA, USA
- Deborah Axelrod
  - Department of Surgery, New York University School of Medicine, New York, NY, USA; NYU Laura and Isaac Perlmutter Cancer Center, New York, NY, USA
- Amber A Guth
  - Department of Surgery, New York University School of Medicine, New York, NY, USA; NYU Laura and Isaac Perlmutter Cancer Center, New York, NY, USA
- Gary Yu
  - NYU Rory Meyers College of Nursing, New York University, New York, NY, USA
- Jason Fletcher
  - NYU Rory Meyers College of Nursing, New York University, New York, NY, USA
- David Zagzag
  - Pathology and Neurosurgery, Division of Neuropathology, Microvascular and Molecular Neuro-Oncology Laboratory, NYU Langone Medical Center, New York, NY, USA
6
Jeng XJ, Daye ZJ, Lu W, Tzeng JY. Rare Variants Association Analysis in Large-Scale Sequencing Studies at the Single Locus Level. PLoS Comput Biol 2016; 12:e1004993. [PMID: 27355347 PMCID: PMC4927097 DOI: 10.1371/journal.pcbi.1004993] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.0] [Received: 10/20/2015] [Accepted: 05/21/2016] [Indexed: 11/24/2022]
Abstract
Genetic association analyses of rare variants in next-generation sequencing (NGS) studies are fundamentally challenging due to the presence of a very large number of candidate variants at extremely low minor allele frequencies. Recent developments often focus on pooling multiple variants to provide association analysis at the gene instead of the locus level. Nonetheless, pinpointing individual variants is a critical goal for genomic researches as such information can facilitate the precise delineation of molecular mechanisms and functions of genetic factors on diseases. Due to the extreme rarity of mutations and high-dimensionality, significances of causal variants cannot easily stand out from those of noncausal ones. Consequently, standard false-positive control procedures, such as the Bonferroni and false discovery rate (FDR), are often impractical to apply, as a majority of the causal variants can only be identified along with a few but unknown number of noncausal variants. To provide informative analysis of individual variants in large-scale sequencing studies, we propose the Adaptive False-Negative Control (AFNC) procedure that can include a large proportion of causal variants with high confidence by introducing a novel statistical inquiry to determine those variants that can be confidently dispatched as noncausal. The AFNC provides a general framework that can accommodate for a variety of models and significance tests. The procedure is computationally efficient and can adapt to the underlying proportion of causal variants and quality of significance rankings. Extensive simulation studies across a plethora of scenarios demonstrate that the AFNC is advantageous for identifying individual rare variants, whereas the Bonferroni and FDR are exceedingly over-conservative for rare variants association studies. In the analyses of the CoLaus dataset, AFNC has identified individual variants most responsible for gene-level significances. Moreover, single-variant results using the AFNC have been successfully applied to infer related genes with annotation information.
Next-generation sequencing technologies have allowed genetic association studies of complex traits at the single base-pair resolution, where most genetic variants have extremely low mutation frequencies. These rare variants have been the focus of modern statistical-computational genomics due to their potential to explain missing disease heritability. The identification of individual rare variants associated with diseases can provide new biological insights and enable the precise delineation of disease mechanisms. However, due to the extreme rarity of mutations and large numbers of variants, significances of causative variants tend to be mixed inseparably with a few noncausative ones, and standard multiple testing procedures controlling for false positives fail to provide a meaningful way to include a large proportion of the causative variants. To address the challenge of detecting weak biological signals, we propose a novel statistical procedure, based on false-negative control, to provide a practical approach for variant inclusion in large-scale sequencing studies. By determining those variants that can be confidently dispatched as noncausative, the proposed procedure offers an objective selection of a modest number of potentially causative variants at the single-locus level. Results can be further prioritized or used to infer disease-associated genes with annotation information.
Affiliation(s)
- Xinge Jessie Jeng
  - Department of Statistics, North Carolina State University, Raleigh, North Carolina, United States of America
- Zhongyin John Daye
  - Epidemiology and Biostatistics, University of Arizona, Tucson, Arizona, United States of America
- Wenbin Lu
  - Department of Statistics, North Carolina State University, Raleigh, North Carolina, United States of America
- Jung-Ying Tzeng
  - Department of Statistics, North Carolina State University, Raleigh, North Carolina, United States of America
  - Bioinformatics Research Center, North Carolina State University, Raleigh, North Carolina, United States of America
  - Department of Statistics, National Cheng-Kung University, Tainan, Taiwan
7
Carey CE, Agrawal A, Zhang B, Conley ED, Degenhardt L, Heath AC, Li D, Lynskey MT, Martin NG, Montgomery GW, Wang T, Bierut LJ, Hariri AR, Nelson EC, Bogdan R. Monoacylglycerol lipase (MGLL) polymorphism rs604300 interacts with childhood adversity to predict cannabis dependence symptoms and amygdala habituation: Evidence from an endocannabinoid system-level analysis. Journal of Abnormal Psychology 2015; 124:860-77. [PMID: 26595473 PMCID: PMC4700831 DOI: 10.1037/abn0000079] [Citation(s) in RCA: 31] [Impact Index Per Article: 3.4] [Indexed: 12/22/2022]
Abstract
Despite evidence for heritable variation in cannabis involvement and the discovery of cannabinoid receptors and their endogenous ligands, no consistent patterns have emerged from candidate endocannabinoid (eCB) genetic association studies of cannabis involvement. Given interactions between eCB and stress systems and associations between childhood stress and cannabis involvement, it may be important to consider childhood adversity in the context of eCB-related genetic variation. We employed a system-level gene-based analysis of data from the Comorbidity and Trauma Study (N = 1,558) to examine whether genetic variation in six eCB genes (anabolism: DAGLA, DAGLB, NAPEPLD; catabolism: MGLL, FAAH; binding: CNR1; SNPs N = 65) and childhood sexual abuse (CSA) predict cannabis dependence symptoms. Significant interactions with CSA emerged for MGLL at the gene level (p = .009), and for rs604300 within MGLL (ΔR² = .007, p < .001), the latter of which survived SNP-level Bonferroni correction and was significant in an additional sample with similar directional effects (N = 859; ΔR² = .005, p = .026). Furthermore, in a third sample (N = 312), there was evidence that rs604300 genotype interacts with early life adversity to predict threat-related basolateral amygdala habituation, a neural phenotype linked to the eCB system and addiction (ΔR² = .013, p = .047). Rs604300 may be related to epigenetic modulation of MGLL expression. These results are consistent with rodent models implicating 2-arachidonoylglycerol (2-AG), an endogenous cannabinoid metabolized by the enzyme encoded by MGLL, in the etiology of stress adaptation related to cannabis dependence, but require further replication.
Affiliation(s)
- Caitlin E Carey
  - Department of Psychology, Washington University in St. Louis
- Arpana Agrawal
  - Department of Psychiatry, Washington University in St. Louis
- Bo Zhang
  - Department of Genetics, Washington University in St. Louis
- Louisa Degenhardt
  - National Drug and Alcohol Research Centre, University of New South Wales
- Andrew C Heath
  - Department of Psychiatry, Washington University in St. Louis
- Daofeng Li
  - Department of Genetics, Washington University in St. Louis
- Ting Wang
  - Department of Genetics, Washington University in St. Louis
- Laura J Bierut
  - Department of Psychiatry, Washington University in St. Louis
- Ahmad R Hariri
  - Department of Psychology and Neuroscience, Duke University
- Elliot C Nelson
  - Department of Psychiatry, Washington University in St. Louis
- Ryan Bogdan
  - Department of Psychology, Washington University in St. Louis
8
Developing Peripheral Blood Gene Expression-Based Diagnostic Tests for Coronary Artery Disease: a Review. J Cardiovasc Transl Res 2015; 8:372-80. [DOI: 10.1007/s12265-015-9641-5] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.7] [Received: 03/30/2015] [Accepted: 06/10/2015] [Indexed: 12/16/2022]
9
Kuo CL, Vsevolozhskaya OA, Zaykin DV. Assessing the Probability that a Finding Is Genuine for Large-Scale Genetic Association Studies. PLoS One 2015; 10:e0124107. [PMID: 25955023 PMCID: PMC4425705 DOI: 10.1371/journal.pone.0124107] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Received: 08/19/2014] [Accepted: 02/26/2015] [Indexed: 11/25/2022]
Abstract
Genetic association studies routinely involve massive numbers of statistical tests accompanied by P-values. Whole genome sequencing technologies increased the potential number of tested variants to tens of millions. The more tests are performed, the smaller P-value is required to be deemed significant. However, a small P-value is not equivalent to small chances of a spurious finding and significance thresholds may fail to serve as efficient filters against false results. While the Bayesian approach can provide a direct assessment of the probability that a finding is spurious, its adoption in association studies has been slow, due in part to the ubiquity of P-values and the automated way they are, as a rule, produced by software packages. Attempts to design simple ways to convert an association P-value into the probability that a finding is spurious have been met with difficulties. The False Positive Report Probability (FPRP) method has gained increasing popularity. However, FPRP is not designed to estimate the probability for a particular finding, because it is defined for an entire region of hypothetical findings with P-values at least as small as the one observed for that finding. Here we propose a method that lets researchers extract probability that a finding is spurious directly from a P-value. Considering the counterpart of that probability, we term this method POFIG: the Probability that a Finding is Genuine. Our approach shares FPRP's simplicity, but gives a valid probability that a finding is spurious given a P-value. In addition to straightforward interpretation, POFIG has desirable statistical properties. The POFIG average across a set of tentative associations provides an estimated proportion of false discoveries in that set. POFIGs are easily combined across studies and are immune to multiple testing and selection bias. We illustrate an application of POFIG method via analysis of GWAS associations with Crohn's disease.
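For orientation, the FPRP criticized in this abstract is usually written as below. This is the standard textbook form of FPRP, not a formula quoted from the abstract, and the symbols are labels chosen here: $\pi$ is the prior probability that the association is real, $\alpha$ is the significance threshold (in practice set to the observed P-value), and $1-\beta$ is the power at that threshold.

```latex
\mathrm{FPRP}
  = \Pr(H_0 \mid P \le \alpha)
  = \frac{\alpha\,(1-\pi)}{\alpha\,(1-\pi) + (1-\beta)\,\pi}
```

Because the conditioning event is $P \le \alpha$ rather than $P = p$, the quantity refers to the whole region of P-values at least as small as the observed one, which is the limitation the abstract points to. The POFIG formula itself is not given in the abstract and is not reproduced here.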
Affiliation(s)
- Chia-Ling Kuo
  - Department of Community Medicine and Health Care, University of Connecticut, Farmington, Connecticut, United States of America
- Olga A. Vsevolozhskaya
  - Department of Epidemiology and Biostatistics, Michigan State University, East Lansing, Michigan, United States of America
- Dmitri V. Zaykin
  - National Institute of Environmental Health Sciences, National Institutes of Health, Research Triangle Park, North Carolina, United States of America
10
Genetic variations in the VEGF pathway as prognostic factors in metastatic colorectal cancer patients treated with oxaliplatin-based chemotherapy. The Pharmacogenomics Journal 2015; 15:397-404. [PMID: 25707392 DOI: 10.1038/tpj.2015.1] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.0] [Received: 07/17/2014] [Revised: 11/12/2014] [Accepted: 12/02/2014] [Indexed: 12/27/2022]
Abstract
Angiogenesis is a significant biological mechanism in the progression and metastasis of solid tumors. Vascular endothelial growth factor (VEGF), its receptors and signaling effectors have a central role in tumor-induced angiogenesis. Genetic variation in the VEGF pathway may impact on tumor angiogenesis and, hence, on clinical cancer outcomes. This study evaluates the influence of common genetic variations within the VEGF pathway in the clinical outcomes of 172 metastatic colorectal cancer (mCRC) patients treated with first-line oxaliplatin/5-fluorouracil chemotherapy. A total of 27 single-nucleotide polymorphisms (SNPs) in 16 genes in the VEGF-dependent angiogenesis process were genotyped using a dynamic array on the BioMark™ system. After assessing the KRAS mutational status, we found that four SNPs located in three genes (KISS1, KRAS and VEGFR2) were associated with progression-free survival. Five SNPs in three genes (ITGAV, KRAS and VEGFR2) correlated with overall survival. The gene-gene interactions identified in the survival tree analysis support the importance of VEGFR2 rs2071559 and KISS1 rs71745629 in modulating these outcomes. This study provides evidence that functional germline polymorphisms in the VEGF pathway may help to predict outcome in mCRC patients who undergo oxaliplatin/5-fluorouracil chemotherapy.
11
Combined analysis with copy number variation identifies risk loci in lung cancer. BioMed Research International 2014; 2014:469103. [PMID: 25093167 PMCID: PMC4100386 DOI: 10.1155/2014/469103] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 05/21/2014] [Revised: 06/11/2014] [Accepted: 06/11/2014] [Indexed: 12/26/2022]
Abstract
Background. Lung cancer is the most important cause of cancer mortality worldwide, but the underlying mechanisms of this disease are not fully understood. Copy number variations (CNVs) are promising genetic variations to study because of their potential effects on cancer.
Methodology/Principal Findings. Here we conducted a pilot study in which we systematically analyzed the association of CNVs in two lung cancer datasets: the Environment And Genetics in Lung cancer Etiology (EAGLE) and the Prostate, Lung, Colorectal and Ovarian (PLCO) Cancer Screening Trial datasets. We used a preestablished association method to test the datasets separately and conducted a combined analysis to test the association accordance between the two datasets. Finally, we identified 167 risk SNP loci and 22 CNVs associated with lung cancer and linked them with recombination hotspots. Functional annotation and biological relevance analyses implied that some of our predicted risk loci were supported by other studies and might be potential candidate loci for lung cancer studies.
Conclusions/Significance. Our results further emphasized the importance of copy number variations in cancer and might be a valuable complement to current genome-wide association studies on cancer.
12
Zhang Z, Wang J, He J, Zheng Z, Zeng X, Zhang C, Ye J, Zhang Y, Zhong N, Lu W. Genetic variants in MUC4 gene are associated with lung cancer risk in a Chinese population. PLoS One 2013; 8:e77723. [PMID: 24204934 PMCID: PMC3804582 DOI: 10.1371/journal.pone.0077723] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.3] [Received: 08/19/2013] [Accepted: 09/03/2013] [Indexed: 12/22/2022]
Abstract
Mucin MUC4, which is encoded by the MUC4 gene, plays an important role in epithelial cell proliferation and differentiation. Aberrant MUC4 overexpression is associated with invasive tumor proliferation and poor outcome in epithelial cancers. Collectively, the existing evidence suggests that MUC4 has tumor-promoter functions. In this study, we performed a case-control study of 1,048 incident lung cancer cases and 1,048 age- and sex frequency-matched cancer-free controls in a Chinese population to investigate the role of MUC4 gene polymorphism in lung cancer etiology. We identified nine SNPs that were significantly associated with increased lung cancer risk (P = 0.0425 for rs863582, 0.0333 for rs842226, 0.0294 for rs842225, 0.0010 for rs2550236, 0.0149 for rs2688515, 0.0191 for rs2641773, 0.0058 for rs3096337, 0.0077 for rs859769, and 0.0059 for rs842461 in an additive model). Consistent with these single-locus analysis results, the haplotype analyses revealed an adverse effect of the haplotype “GGC” of rs3096337, rs859769, and rs842461 on lung cancer. Both the haplotype and diplotype “CTGAGC” of rs863582, rs842226, rs2550236, rs842225, and rs2688515 had an adverse effect on lung cancer, which is also consistent with the single-locus analysis. Moreover, we observed statistically significant interactions for rs863582 and rs842461 in heavy smokers. Our results suggest that MUC4 gene polymorphisms and their interaction with smoking may contribute to lung cancer etiology.
Affiliation(s)
- Zili Zhang
  - State Key Laboratory of Respiratory Diseases, Guangzhou Institute of Respiratory Disease, The First Affiliated Hospital, Guangzhou Medical University, Guangzhou, Guangdong, China
- Jian Wang
  - State Key Laboratory of Respiratory Diseases, Guangzhou Institute of Respiratory Disease, The First Affiliated Hospital, Guangzhou Medical University, Guangzhou, Guangdong, China
- Jianxing He
  - State Key Laboratory of Respiratory Diseases, Guangzhou Institute of Respiratory Disease, The First Affiliated Hospital, Guangzhou Medical University, Guangzhou, Guangdong, China
- Zeguang Zheng
  - State Key Laboratory of Respiratory Diseases, Guangzhou Institute of Respiratory Disease, The First Affiliated Hospital, Guangzhou Medical University, Guangzhou, Guangdong, China
- Xiansheng Zeng
  - Department of Respiratory Medicine, Xiangyang Central Hospital, Xiangyang, Hubei, China
- Chenting Zhang
  - State Key Laboratory of Respiratory Diseases, Guangzhou Institute of Respiratory Disease, The First Affiliated Hospital, Guangzhou Medical University, Guangzhou, Guangdong, China
- Jinmei Ye
  - State Key Laboratory of Respiratory Diseases, Guangzhou Institute of Respiratory Disease, The First Affiliated Hospital, Guangzhou Medical University, Guangzhou, Guangdong, China
- Yajie Zhang
  - Department of Pathology, Guangzhou Medical University, Guangzhou, Guangdong, China
- Nanshan Zhong
  - State Key Laboratory of Respiratory Diseases, Guangzhou Institute of Respiratory Disease, The First Affiliated Hospital, Guangzhou Medical University, Guangzhou, Guangdong, China
- Wenju Lu
  - State Key Laboratory of Respiratory Diseases, Guangzhou Institute of Respiratory Disease, The First Affiliated Hospital, Guangzhou Medical University, Guangzhou, Guangdong, China
  - Department of Laboratory Medicine, The First Affiliated Hospital, Guangzhou Medical University, Guangzhou, Guangdong, China
13
|
Sinsheimer J. "Statistics 101"--a primer for the genetics of complex human disease. Cold Spring Harb Protoc 2011; 2011:1190-1199. [PMID: 21969626 DOI: 10.1101/pdb.top065870] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/31/2023]
Abstract
This article reviews the basis of probability and statistics used in the genetic analysis of complex human diseases and illustrates their use in several simple examples. Much of the material presented here is so fundamental to statistics that it has become common knowledge in the field and the originators are no longer cited (e.g., Gauss).
|
14
|
Abstract
Over the last few years, main effect genetic association analysis has proven to be a successful tool to unravel genetic risk components to a variety of complex diseases. In the quest for disease susceptibility factors and the search for the 'missing heritability', supplementary and complementary efforts have been undertaken. These include the inclusion of several genetic inheritance assumptions in model development, the consideration of different sources of information, and the acknowledgement of disease underlying pathways of networks. The search for epistasis or gene-gene interaction effects on traits of interest is marked by an exponential growth, not only in terms of methodological development, but also in terms of practical applications, translation of statistical epistasis to biological epistasis and integration of omics information sources. The current popularity of the field, as well as its attraction to interdisciplinary teams, each making valuable contributions with sometimes rather unique viewpoints, renders it impossible to give an exhaustive review of to-date available approaches for epistasis screening. The purpose of this work is to give a perspective view on a selection of currently active analysis strategies and concerns in the context of epistasis detection, and to provide an eye to the future of gene-gene interaction analysis.
Affiliation(s)
- Kristel Van Steen
- Department of Electrical Engineering and Computer Science (Montefiore Institute), Grande Traverse, Bioinformatique 4000 Liège 1, Belgium.
|
15
|
Association of a mineralocorticoid receptor gene polymorphism with hypertension in a Spanish population. Am J Hypertens 2009; 22:649-55. [PMID: 19325532 DOI: 10.1038/ajh.2009.39] [Citation(s) in RCA: 26] [Impact Index Per Article: 1.7] [Indexed: 11/08/2022] Open
Abstract
BACKGROUND To assess the association of polymorphisms and haplotypes of the mineralocorticoid receptor (MR) (NR3C2) gene with the risk of essential hypertension (HTN) in a Spanish population. METHODS This is a population-based study that included 1,502 subjects (748 women) >18 years old. Twenty-four polymorphisms of the NR3C2 gene were analyzed using SNPlex (a genotyping system based on OLA/PCR technology). RESULTS Alleles of the single-nucleotide polymorphism (SNP) rs5522 were significantly associated with the risk of HTN, in both the recessive and codominant models adjusted for age, gender, and body mass index (BMI). Genotype GG of rs5522 was protective against HTN, with an odds ratio (OR) of 0.10 (0.02-0.56), P < 0.01. One haplotype, which included the G allele of rs5522, was also associated with reduced risk of HTN, and four haplotypes that included the A allele were associated with increased risk of HTN. Adding the 24-h urinary sodium excretion and the estimated glomerular filtration rate (eGFR) to the model did not reduce the significance level. An interaction between genotypes of rs5522 and quartiles of 24-h sodium excretion was observed. Among subjects with the AA genotype, those with higher urinary sodium excretion had the lowest risk of being hypertensive. CONCLUSION A functional polymorphism of the NR3C2 gene was associated with risk of HTN. The data provided in this study seem to support the hypothesis that the MR gene participates in the development of HTN, although further studies are necessary to better assess its real impact.
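As a rough illustration of how the reported summary for rs5522 hangs together, the sketch below backs out a standard error for ln(OR) from the interval 0.02-0.56 and converts it to a two-sided Wald p-value. It assumes the interval is a 95% Wald interval on the log scale, which the abstract does not state; the function names `se_from_ci` and `wald_p_value` are hypothetical and are not taken from the study.

```python
import math

def se_from_ci(lower, upper, z_crit=1.959964):
    """Standard error of ln(OR), assuming (lower, upper) is a 95% Wald CI."""
    return (math.log(upper) - math.log(lower)) / (2.0 * z_crit)

def wald_p_value(odds_ratio, se):
    """Two-sided p-value for H0: ln(OR) = 0 under a normal approximation."""
    z = math.log(odds_ratio) / se
    # 2 * (1 - Phi(|z|)) equals erfc(|z| / sqrt(2))
    return math.erfc(abs(z) / math.sqrt(2.0))

# Values quoted in the abstract: OR 0.10 (0.02-0.56)
se_ln_or = se_from_ci(0.02, 0.56)
p_value = wald_p_value(0.10, se_ln_or)
print(f"SE of ln(OR) = {se_ln_or:.3f}, two-sided p = {p_value:.4f}")
```

Under these assumptions the p-value falls below 0.01, in line with the reported P < 0.01. The wide interval reflects a large standard error for ln(OR).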
|
16
|
Gene discovery through imaging genetics: identification of two novel genes associated with schizophrenia. Mol Psychiatry 2009; 14:416-28. [PMID: 19065146 PMCID: PMC3254586 DOI: 10.1038/mp.2008.127] [Citation(s) in RCA: 83] [Impact Index Per Article: 5.5] [Indexed: 12/31/2022]
Abstract
We have discovered two genes, RSRC1 and ARHGAP18, associated with schizophrenia and in an independent study provided additional support for this association. We have both discovered and verified the association of two genes, RSRC1 and ARHGAP18, with schizophrenia. We combined a genome-wide screening strategy with neuroimaging measures as the quantitative phenotype and identified the single nucleotide polymorphisms (SNPs) related to these genes as consistently associated with the phenotypic variation. To control for the risk of false positives, the empirical P-value for association significance was calculated using permutation testing. The quantitative phenotype was Blood-Oxygen-Level Dependent (BOLD) Contrast activation in the left dorsal lateral prefrontal cortex measured during a working memory task. The differential distribution of SNPs associated with these two genes in cases and controls was then corroborated in a larger, independent sample of patients with schizophrenia (n=82) and healthy controls (n=91), thus suggesting a putative etiological function for both genes in schizophrenia. Up until now these genes have not been linked to any neuropsychiatric illness, although both genes have a function in prenatal brain development. We introduce the use of functional magnetic resonance imaging activation as a quantitative phenotype in conjunction with genome-wide association as a gene discovery tool.
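The abstract mentions an empirical P-value obtained by permutation testing. A minimal sketch of that general idea follows; it is not the authors' pipeline. The test statistic (absolute covariance between genotype and phenotype), the function name `permutation_p_value`, and the toy data are all assumptions made for illustration.

```python
import random

def permutation_p_value(genotypes, phenotype, n_perm=2000, seed=0):
    """Empirical p-value for a genotype-phenotype association.

    The phenotype values are shuffled across subjects to break any real
    association, and the observed statistic is compared with the
    resulting null distribution.
    """
    rng = random.Random(seed)

    def statistic(g, y):
        mean_g = sum(g) / len(g)
        mean_y = sum(y) / len(y)
        # Absolute (unnormalized) covariance between genotype and phenotype
        return abs(sum((a - mean_g) * (b - mean_y) for a, b in zip(g, y)))

    observed = statistic(genotypes, phenotype)
    shuffled = list(phenotype)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if statistic(genotypes, shuffled) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero
    return (hits + 1) / (n_perm + 1)

# Toy data: allele counts (0, 1, 2) and a phenotype that rises with them
toy_genotypes = [0] * 10 + [1] * 10 + [2] * 10
toy_phenotype = [g + 0.1 * (i % 3) for i, g in enumerate(toy_genotypes)]
print(permutation_p_value(toy_genotypes, toy_phenotype))
```

The add-one correction is a common choice because an empirical p-value of exactly zero cannot be justified from a finite number of permutations.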
|
17
|
Aiyar RS, Gagneur J, Steinmetz LM. Identification of mitochondrial disease genes through integrative analysis of multiple datasets. Methods 2008; 46:248-55. [PMID: 18930150 PMCID: PMC2774125 DOI: 10.1016/j.ymeth.2008.10.002] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.6] [Received: 05/14/2008] [Revised: 10/03/2008] [Accepted: 10/08/2008] [Indexed: 11/24/2022] Open
Abstract
Determining the genetic factors in a disease is crucial to elucidating its molecular basis. This task is challenging due to a lack of information on gene function. The integration of large-scale functional genomics data has proven to be an effective strategy to prioritize candidate disease genes. Mitochondrial disorders are a prevalent and heterogeneous class of diseases that are particularly amenable to this approach. Here we explain the application of integrative approaches to the identification of mitochondrial disease genes. We first examine various datasets that can be used to evaluate the involvement of each gene in mitochondrial function. The data integration methodology is then described, accompanied by examples of common implementations. Finally, we discuss how gene networks are constructed using integrative techniques and applied to candidate gene prioritization. Relevant public data resources are indicated. This report highlights the success and potential of data integration as well as its applicability to the search for mitochondrial disease genes.
Affiliation(s)
- Raeka S. Aiyar
- European Molecular Biology Laboratory, Meyerhofstraße 1, 69117 Heidelberg, Germany
- Julien Gagneur
- European Molecular Biology Laboratory, Meyerhofstraße 1, 69117 Heidelberg, Germany
- Lars M. Steinmetz
- European Molecular Biology Laboratory, Meyerhofstraße 1, 69117 Heidelberg, Germany
|
18
|
Kimman TG, Banus S, Reijmerink N, Reimerink J, Stelma FF, Koppelman GH, Thijs C, Postma DS, Kerkhof M. Association of interacting genes in the toll-like receptor signaling pathway and the antibody response to pertussis vaccination. PLoS One 2008; 3:e3665. [PMID: 18987746 PMCID: PMC2573957 DOI: 10.1371/journal.pone.0003665] [Citation(s) in RCA: 41] [Impact Index Per Article: 2.6] [Received: 08/19/2008] [Accepted: 10/21/2008] [Indexed: 01/21/2023] Open
Abstract
BACKGROUND Activation of the Toll-like receptor (TLR) signaling pathway through TLR4 may be important in the induction of protective immunity against Bordetella pertussis with TLR4-mediated activation of dendritic and B cells, induction of cytokine expression, and reversal of tolerance as crucial steps. We examined whether single nucleotide polymorphisms (SNPs) in genes of the TLR4 pathway and their interaction are associated with the response to whole-cell vaccine (WCV) pertussis vaccination in 490 one-year-old children. METHODOLOGY/PRINCIPAL FINDINGS We analyzed associations of 75 haplotype-tagging SNPs in genes in the TLR4 signaling pathway with pertussis toxin (PT)-IgG titers. We found significant associations between the PT-IgG titer and SNPs in CD14, TLR4, TOLLIP, TIRAP, IRAK3, IRAK4, TICAM1, and TNFRSF4 in one or more of the analyses. The strongest evidence for association was found for two SNPs (rs5744034 and rs5743894) in TOLLIP that were almost completely in linkage disequilibrium, provided statistically significant associations in all tests with the lowest p-values, and displayed a dominant mode of inheritance. However, none of these single gene associations would withstand correction for multiple testing. In addition, Multifactor Dimensionality Reduction Analysis, an approach that does not need correction for multiple testing, showed significant and strong two and three locus interactions between SNPs in TOLLIP (rs4963060), TLR4 (rs6478317) and IRAK1 (rs1059703). CONCLUSIONS/SIGNIFICANCE We have identified significant interactions between genes in the TLR pathway in the induction of vaccine-induced immunity. These interactions underline that these genes are functionally related and together form a true biological relationship in a protein-protein interaction network. 
Practically all our findings may be explained by genetic variation in directly or indirectly interacting proteins at the extra- and intracytoplasmic sites of the cell membrane of antigen-presenting cells, B cells, or both. Fine tuning of interacting proteins in the TLR pathway appears important for the induction of an optimal vaccine response.
Affiliation(s)
- Tjeerd G Kimman
- Center for Infectious Disease Control, National Institute of Public Health and Environment, Bilthoven, The Netherlands.
|
19
|
Uhl GR, Drgon T, Johnson C, Li CY, Contoreggi C, Hess J, Naiman D, Liu QR. Molecular genetics of addiction and related heritable phenotypes: genome-wide association approaches identify "connectivity constellation" and drug target genes with pleiotropic effects. Ann N Y Acad Sci 2008; 1141:318-81. [PMID: 18991966 PMCID: PMC3922196 DOI: 10.1196/annals.1441.018] [Citation(s) in RCA: 131] [Impact Index Per Article: 8.2] [Indexed: 12/14/2022]
Abstract
Genome-wide association (GWA) can elucidate molecular genetic bases for human individual differences in complex phenotypes that include vulnerability to addiction. Here, we review (a) evidence that supports polygenic models with (at least) modest heterogeneity for the genetic architectures of addiction and several related phenotypes; (b) technical and ethical aspects of importance for understanding GWA data, including genotyping in individual samples versus DNA pools, analytic approaches, power estimation, and ethical issues in genotyping individuals with illegal behaviors; (c) the samples and the data that shape our current understanding of the molecular genetics of individual differences in vulnerability to substance dependence and related phenotypes; (d) overlaps between GWA data sets for dependence on different substances; and (e) overlaps between GWA data for addictions versus other heritable, brain-based phenotypes that include bipolar disorder, cognitive ability, frontal lobe brain volume, the ability to successfully quit smoking, neuroticism, and Alzheimer's disease. These convergent results identify potential targets for drugs that might modify addictions and play roles in these other phenotypes. They add to evidence that individual differences in the quality and quantity of brain connections make pleiotropic contributions to individual differences in vulnerability to addictions and to related brain disorders and phenotypes. A "connectivity constellation" of brain phenotypes and disorders appears to receive substantial pathogenic contributions from individual differences in a constellation of genes whose variants provide individual differences in the specification of brain connectivities during development and in adulthood. Heritable brain differences that underlie addiction vulnerability thus lie squarely in the midst of the repertoire of heritable brain differences that underlie vulnerability to other common brain disorders and phenotypes.
Affiliation(s)
- George R Uhl
- Molecular Neurobiology Branch, National Institutes of Health (NIH), Intramural Research Program (IRP), National Institute on Drug Abuse (NIDA), Baltimore, MD 21224, USA.
|
20
|
Zhang Z, Zhang S, Wong MY, Wareham NJ, Sha Q. An ensemble learning approach jointly modeling main and interaction effects in genetic association studies. Genet Epidemiol 2008; 32:285-300. [PMID: 18205210 DOI: 10.1002/gepi.20304] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.8] [Indexed: 12/14/2022]
Abstract
Complex diseases are presumed to be the results of interactions of several genes and environmental factors, with each gene only having a small effect on the disease. Thus, the methods that can account for gene-gene interactions to search for a set of marker loci in different genes or across genome and to analyze these loci jointly are critical. In this article, we propose an ensemble learning approach (ELA) to detect a set of loci whose main and interaction effects jointly have a significant association with the trait. In the ELA, we first search for "base learners" and then combine the effects of the base learners by a linear model. Each base learner represents a main effect or an interaction effect. The result of the ELA is easy to interpret. When the ELA is applied to analyze a data set, we can get a final model, an overall P-value of the association test between the set of loci involved in the final model and the trait, and an importance measure for each base learner and each marker involved in the final model. The final model is a linear combination of some base learners. We know which base learner represents a main effect and which one represents an interaction effect. The importance measure of each base learner or marker can tell us the relative importance of the base learner or marker in the final model. We used intensive simulation studies as well as a real data set to evaluate the performance of the ELA. Our simulation studies demonstrated that the ELA is more powerful than the single-marker test in all the simulation scenarios. The ELA also outperformed the other three existing multi-locus methods in almost all cases. 
In an application to a large-scale case-control study for Type 2 diabetes, the ELA identified 11 single nucleotide polymorphisms that have a significant multi-locus effect (P-value=0.01), while none of the single nucleotide polymorphisms showed significant marginal effects and none of the two-locus combinations showed significant two-locus interaction effects.
Affiliation(s)
- Zhaogong Zhang
- Department of Mathematical Sciences, Michigan Technological University, Houghton, Michigan 49931, USA
|
21
|
Uhl GR, Liu QR, Drgon T, Johnson C, Walther D, Rose JE, David SP, Niaura R, Lerman C. Molecular genetics of successful smoking cessation: convergent genome-wide association study results. Arch Gen Psychiatry 2008; 65:683-93. [PMID: 18519826 DOI: 10.1001/archpsyc.65.6.683] [Citation(s) in RCA: 192] [Impact Index Per Article: 12.0] [Indexed: 11/14/2022]
Abstract
CONTEXT Smoking remains a major public health problem. Twin studies indicate that the ability to quit smoking is substantially heritable, with genetics that overlap modestly with the genetics of vulnerability to dependence on addictive substances. OBJECTIVES To identify replicated genes that facilitate smokers' abilities to achieve and sustain abstinence from smoking (hereinafter referred to as quit-success genes) found in more than 2 genome-wide association (GWA) studies of successful vs unsuccessful abstainers, and, secondarily, to nominate genes for selective involvement in smoking cessation success with bupropion hydrochloride vs nicotine replacement therapy (NRT). DESIGN The GWA results in subjects from 3 centers, with secondary analyses of NRT vs bupropion responders. SETTING Outpatient smoking cessation trial participants from 3 centers. PARTICIPANTS European American smokers who successfully vs unsuccessfully abstain from smoking with biochemical confirmation in a smoking cessation trial using NRT, bupropion, or placebo (N = 550). MAIN OUTCOME MEASURES Quit-success genes, reproducibly identified by clustered nominally positive single-nucleotide polymorphisms (SNPs) in more than 2 independent samples with significant P values based on Monte Carlo simulation trials. The NRT-selective genes were nominated by clustered SNPs that display much larger t values for NRT vs placebo comparisons. The bupropion-selective genes were nominated by bupropion-selective results. RESULTS Variants in quit-success genes are likely to alter cell adhesion, enzymatic, transcriptional, structural, and DNA, RNA, and/or protein-handling functions. Quit-success genes are identified by clustered nominally positive SNPs from more than 2 samples and are unlikely to represent chance observations (Monte Carlo P < .0003). These genes display modest overlap with genes identified in GWA studies of dependence on addictive substances and memory. 
CONCLUSIONS These results support polygenic genetics for success in abstaining from smoking, overlap with genetics of substance dependence and memory, and nominate gene variants for selective influences on therapeutic responses to bupropion vs NRT. Molecular genetics should help match the types and/or intensity of antismoking treatments with the smokers most likely to benefit from them.
Affiliation(s)
- George R Uhl
- Molecular Neurobiology Research Branch, National Institutes of Health-Intramural Research Program, National Institute on Drug Abuse, 333 Cassell Dr, Ste 3510, Baltimore, MD 21224, USA.
|
22
|
Hoggart CJ, Clark TG, De Iorio M, Whittaker JC, Balding DJ. Genome-wide significance for dense SNP and resequencing data. Genet Epidemiol 2008; 32:179-85. [PMID: 18200594 DOI: 10.1002/gepi.20292] [Citation(s) in RCA: 161] [Impact Index Per Article: 10.1] [Indexed: 11/11/2022]
Abstract
The problem of multiple testing is an important aspect of genome-wide association studies, and will become more important as marker densities increase. The problem has been tackled with permutation and false discovery rate procedures and with Bayes factors, but each approach faces difficulties that we briefly review. In the current context of multiple studies on different genotyping platforms, we argue for the use of truly genome-wide significance thresholds, based on all polymorphisms whether or not typed in the study. We approximate genome-wide significance thresholds in contemporary West African, East Asian and European populations by simulating sequence data, based on all polymorphisms as well as for a range of single nucleotide polymorphism (SNP) selection criteria. Overall we find that significance thresholds vary by a factor of >20 over the SNP selection criteria and statistical tests that we consider and can be highly dependent on sample size. We compare our results for sequence data to those derived by the HapMap Consortium and find notable differences which may be due to the small sample sizes used in the HapMap estimate.
Affiliation(s)
- Clive J Hoggart
- Department of Epidemiology and Public Health, Imperial College London, Norfolk Place, London, UK.
|
23
|
Ziegler A, König IR, Thompson JR. Biostatistical Aspects of Genome-Wide Association Studies. Biom J 2008; 50:8-28. [DOI: 10.1002/bimj.200710398] [Citation(s) in RCA: 113] [Impact Index Per Article: 7.1] [Indexed: 01/17/2023]
|
24
|
Chen D, Jin G, Wang Y, Wang H, Liu H, Liu Y, Fan W, Ma H, Miao R, Hu Z, Sun W, Qian J, Jin L, Wei Q, Shen H, Huang W, Lu D. Genetic variants in peroxisome proliferator-activated receptor-γ gene are associated with risk of lung cancer in a Chinese population. Carcinogenesis 2008; 29:342-50. [DOI: 10.1093/carcin/bgm285] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.2] [Indexed: 12/21/2022] Open
|
25
|
Bantscheff M, Schirle M, Sweetman G, Rick J, Kuster B. Quantitative mass spectrometry in proteomics: a critical review. Anal Bioanal Chem 2007; 389:1017-31. [PMID: 17668192 DOI: 10.1007/s00216-007-1486-6] [Citation(s) in RCA: 1128] [Impact Index Per Article: 66.4] [Received: 04/30/2007] [Revised: 06/25/2007] [Accepted: 06/29/2007] [Indexed: 01/28/2023]
Abstract
The quantification of differences between two or more physiological states of a biological system is among the most important but also most challenging technical tasks in proteomics. In addition to the classical methods of differential protein gel or blot staining by dyes and fluorophores, mass-spectrometry-based quantification methods have gained increasing popularity over the past five years. Most of these methods employ differential stable isotope labeling to create a specific mass tag that can be recognized by a mass spectrometer and at the same time provide the basis for quantification. These mass tags can be introduced into proteins or peptides (i) metabolically, (ii) by chemical means, (iii) enzymatically, or (iv) provided by spiked synthetic peptide standards. In contrast, label-free quantification approaches aim to correlate the mass spectrometric signal of intact proteolytic peptides or the number of peptide sequencing events with the relative or absolute protein quantity directly. In this review, we critically examine the more commonly used quantitative mass spectrometry methods for their individual merits and discuss challenges in arriving at meaningful interpretations of quantitative proteomic data.
|
26
|
Abstract
High throughput DNA microarray technology has been broadly applied to the study of breast cancer to classify molecular subtypes, to predict outcome, survival, response to treatment, and for the identification of novel therapeutic targets. Although results are promising, this technology will not have a full impact on routine clinical practice until there is further standardization of techniques and optimal clinical trial design. Due to substantial disease heterogeneity and the number of genes being analyzed, collaborative, multi-institutional studies are required to accrue enough patients for sufficient statistical power. Newer bioinformatic approaches are being developed to assist with the analysis of this important data.
Affiliation(s)
- Jianjiang Fu
- Stanford University Medical Center, Stanford, CA 94305-5494, USA
|
27
|
Abstract
Although genetic association studies have been with us for many years, even for the simplest analyses there is little consensus on the most appropriate statistical procedures. Here I give an overview of statistical approaches to population association studies, including preliminary analyses (Hardy-Weinberg equilibrium testing, inference of phase and missing data, and SNP tagging), and single-SNP and multipoint tests for association. My goal is to outline the key methods with a brief discussion of problems (population structure and multiple testing), avenues for solutions and some ongoing developments.
Affiliation(s)
- David J Balding
- Department of Epidemiology and Public Health, Imperial College, St Marys Campus, Norfolk Place, London W2 1PG, UK.
|
28
|
Leschziner GD, Andrew T, Pirmohamed M, Johnson MR. ABCB1 genotype and PGP expression, function and therapeutic drug response: a critical review and recommendations for future research. Pharmacogenomics J 2006; 7:154-79. [PMID: 16969364 DOI: 10.1038/sj.tpj.6500413] [Citation(s) in RCA: 208] [Impact Index Per Article: 11.6] [Indexed: 12/17/2022]
Abstract
The product of the ABCB1 gene, P-glycoprotein (PGP), is a transmembrane active efflux pump for a variety of drugs. It is a putative mechanism of multidrug resistance in a range of diseases. It is postulated that ABCB1 polymorphisms contribute to variability in PGP function, and that therefore multidrug resistance is, at least in part, genetically determined. However, studies of ABCB1 genotype or haplotype and PGP expression, activity or drug response have produced inconsistent results. This critical review of ABCB1 genotype and PGP function, including mRNA expression, PGP-substrate drug pharmacokinetics and drug response, highlights methodological limitations of existing studies, including inadequate power, potential confounding by co-morbidity and co-medication, multiple testing, poor definition of disease phenotype and outcomes, and analysis of multiple drugs that might not be PGP substrates. We have produced recommendations for future research that will aid clarification of the association between ABCB1 genotypes and factors related to PGP activity.
Affiliation(s)
- G D Leschziner
- Division of Neurosciences, Imperial College, London, UK.
|