1
Petrykey K, Lippé S, Sultan S, Robaey P, Drouin S, Affret-Bertout L, Beaulieu P, St-Onge P, Baedke JL, Yasui Y, Hudson MM, Laverdière C, Sinnett D, Krajinovic M. Genetic Factors and Long-term Treatment-Related Neurocognitive Deficits, Anxiety, and Depression in Childhood Leukemia Survivors: An Exome-Wide Association Study. Cancer Epidemiol Biomarkers Prev 2024; 33:234-243. [PMID: 38051303 PMCID: PMC10903523 DOI: 10.1158/1055-9965.epi-23-0634] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/13/2023] [Revised: 09/23/2023] [Accepted: 11/30/2023] [Indexed: 12/07/2023]
Abstract
BACKGROUND An increased risk of neurocognitive deficits, anxiety, and depression has been reported in childhood cancer survivors.
METHODS We analyzed associations of neurocognitive deficits, as well as anxiety and depression, with common and rare genetic variants derived from whole-exome sequencing data of acute lymphoblastic leukemia (ALL) survivors from the PETALE cohort. In addition, significant associations were assessed using stratified and multivariable analyses. Next, top-ranking common associations were analyzed in an independent SJLIFE replication cohort of ALL survivors.
RESULTS Significant associations were identified in the entire discovery cohort (N = 229) between the AK8 gene and changes in neurocognitive function, whereas PTPRZ1, MUC16, and TNRC6C-AS1 were associated with anxiety. Following stratification according to sex, the ZNF382 gene was linked to a neurocognitive deficit in males, whereas APOL2 and C6orf165 were associated with anxiety and EXO5 with depression. Following stratification according to prognostic risk groups, the modulatory effect of rare variants on depression was additionally found in the CYP2W1 and PCMTD1 genes. In the replication SJLIFE cohort (N = 688), the male-specific association in the ZNF382 gene was not significant; however, a P value < 0.05 was observed when the entire SJLIFE cohort was analyzed. In meta-analyses of the combined cohorts, ZNF382 was significant in males, as was the depression-associated gene EXO5.
CONCLUSIONS Further research is needed to confirm whether the current findings, along with other known risk factors, may be valuable in identifying patients at increased risk of these long-term complications.
IMPACT Our results suggest that specific genes may be related to increased neuropsychological consequences.
Affiliation(s)
- Kateryna Petrykey
  - Sainte-Justine University Health Center (SJUHC), Montreal (Quebec), Canada
  - Department of Pharmacology and Physiology, Université de Montréal (Quebec), Canada
- Sarah Lippé
  - Sainte-Justine University Health Center (SJUHC), Montreal (Quebec), Canada
  - Department of Psychology, Université de Montréal (Quebec), Canada
- Serge Sultan
  - Sainte-Justine University Health Center (SJUHC), Montreal (Quebec), Canada
  - Department of Psychology, Université de Montréal (Quebec), Canada
- Philippe Robaey
  - Sainte-Justine University Health Center (SJUHC), Montreal (Quebec), Canada
  - Children’s Hospital of Eastern Ontario, Ottawa (Ontario), Canada
  - Department of Psychiatry, Université de Montréal (Quebec), Canada
  - Department of Psychiatry, University of Ottawa (Ontario), Canada
- Simon Drouin
  - Sainte-Justine University Health Center (SJUHC), Montreal (Quebec), Canada
- Patrick Beaulieu
  - Sainte-Justine University Health Center (SJUHC), Montreal (Quebec), Canada
- Pascal St-Onge
  - Sainte-Justine University Health Center (SJUHC), Montreal (Quebec), Canada
- Jessica L. Baedke
  - Department of Epidemiology and Cancer Control, St. Jude Children’s Research Hospital, Memphis (TN), USA
- Yutaka Yasui
  - Department of Epidemiology and Cancer Control, St. Jude Children’s Research Hospital, Memphis (TN), USA
- Melissa M. Hudson
  - Department of Epidemiology and Cancer Control, St. Jude Children’s Research Hospital, Memphis (TN), USA
  - Department of Oncology, St. Jude Children’s Research Hospital, Memphis (TN), USA
- Caroline Laverdière
  - Sainte-Justine University Health Center (SJUHC), Montreal (Quebec), Canada
  - Department of Pediatrics, Université de Montréal (Quebec), Canada
- Daniel Sinnett
  - Sainte-Justine University Health Center (SJUHC), Montreal (Quebec), Canada
  - Department of Pediatrics, Université de Montréal (Quebec), Canada
- Maja Krajinovic
  - Sainte-Justine University Health Center (SJUHC), Montreal (Quebec), Canada
  - Department of Pharmacology and Physiology, Université de Montréal (Quebec), Canada
  - Department of Pediatrics, Université de Montréal (Quebec), Canada
2
Miller A, Panneerselvam J, Liu L. A review of regression and classification techniques for analysis of common and rare variants and gene-environmental factors. Neurocomputing 2021. [DOI: 10.1016/j.neucom.2021.08.150] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/25/2022]
3
Petrykey K, Rezgui AM, Guern ML, Beaulieu P, St-Onge P, Drouin S, Bertout L, Wang F, Baedke JL, Yasui Y, Hudson MM, Raboisson MJ, Laverdière C, Sinnett D, Andelfinger GU, Krajinovic M. Genetic factors in treatment-related cardiovascular complications in survivors of childhood acute lymphoblastic leukemia. Pharmacogenomics 2021; 22:885-901. [PMID: 34505544 PMCID: PMC9043873 DOI: 10.2217/pgs-2021-0067] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 06/01/2021] [Accepted: 08/12/2021] [Indexed: 11/21/2022]
Abstract
Aim: Cardiovascular disease represents one of the main causes of secondary morbidity and mortality in patients with childhood cancer. Patients & methods: To further address this issue, we analyzed cardiovascular complications in relation to common and rare genetic variants derived through whole-exome sequencing from childhood acute lymphoblastic leukemia survivors (PETALE cohort). Results: Significant associations were detected among common variants in the TTN gene, left ventricular ejection fraction (p ≤ 0.0005), and fractional shortening (p ≤ 0.001). Rare variants enrichment in the NOS1, ABCG2 and NOD2 was observed in relation to left ventricular ejection fraction, and in NOD2 and ZNF267 genes in relation to fractional shortening. Following stratification according to risk groups, the modulatory effect of rare variants was additionally found in the CBR1, ABCC5 and AKR1C3 genes. None of the associations was replicated in St-Jude Lifetime Cohort Study. Conclusion: Further studies are needed to confirm whether the described genetic markers may be useful in identifying patients at increased risk of these complications.
Affiliation(s)
- Kateryna Petrykey
  - Immune Diseases and Cancer Research Axis, Sainte-Justine University Health Center (SJUHC), Montreal, QC H3T 1C5, Canada
  - Department of Pharmacology & Physiology, Université de Montréal, QC, H3T 1J4, Canada
- Aziz M Rezgui
  - Immune Diseases and Cancer Research Axis, Sainte-Justine University Health Center (SJUHC), Montreal, QC H3T 1C5, Canada
- Mathilde Le Guern
  - Immune Diseases and Cancer Research Axis, Sainte-Justine University Health Center (SJUHC), Montreal, QC H3T 1C5, Canada
- Patrick Beaulieu
  - Immune Diseases and Cancer Research Axis, Sainte-Justine University Health Center (SJUHC), Montreal, QC H3T 1C5, Canada
- Pascal St-Onge
  - Immune Diseases and Cancer Research Axis, Sainte-Justine University Health Center (SJUHC), Montreal, QC H3T 1C5, Canada
- Simon Drouin
  - Immune Diseases and Cancer Research Axis, Sainte-Justine University Health Center (SJUHC), Montreal, QC H3T 1C5, Canada
- Laurence Bertout
  - Immune Diseases and Cancer Research Axis, Sainte-Justine University Health Center (SJUHC), Montreal, QC H3T 1C5, Canada
- Fan Wang
  - Department of Epidemiology & Cancer Control, St. Jude Children’s Research Hospital, Memphis, TN 38105, USA
- Jessica L Baedke
  - Department of Epidemiology & Cancer Control, St. Jude Children’s Research Hospital, Memphis, TN 38105, USA
- Yutaka Yasui
  - Department of Epidemiology & Cancer Control, St. Jude Children’s Research Hospital, Memphis, TN 38105, USA
- Melissa M Hudson
  - Department of Epidemiology & Cancer Control, St. Jude Children’s Research Hospital, Memphis, TN 38105, USA
  - Department of Oncology, St. Jude Children’s Research Hospital, Memphis, TN 38105, USA
- Marie-Josée Raboisson
  - Department of Pediatrics, Université de Montréal, QC, H3T 1C5, Canada
  - Cardiology Unit, Sainte-Justine University Health Center (SJUHC), Montreal, QC, H3T 1C5, Canada
- Caroline Laverdière
  - Immune Diseases and Cancer Research Axis, Sainte-Justine University Health Center (SJUHC), Montreal, QC H3T 1C5, Canada
  - Department of Pediatrics, Université de Montréal, QC, H3T 1C5, Canada
- Daniel Sinnett
  - Immune Diseases and Cancer Research Axis, Sainte-Justine University Health Center (SJUHC), Montreal, QC H3T 1C5, Canada
  - Department of Pediatrics, Université de Montréal, QC, H3T 1C5, Canada
- Gregor U Andelfinger
  - Department of Pediatrics, Université de Montréal, QC, H3T 1C5, Canada
  - Fetomaternal and Neonatal Pathologies Research Axis, Sainte-Justine University Health Center (SJUHC), Montreal, QC, H3T 1C5, Canada
- Maja Krajinovic
  - Immune Diseases and Cancer Research Axis, Sainte-Justine University Health Center (SJUHC), Montreal, QC H3T 1C5, Canada
  - Department of Pharmacology & Physiology, Université de Montréal, QC, H3T 1J4, Canada
  - Department of Pediatrics, Université de Montréal, QC, H3T 1C5, Canada
4
Novel directions in data pre-processing and genome-wide association study (GWAS) methodologies to overcome ongoing challenges. Informatics in Medicine Unlocked 2021. [DOI: 10.1016/j.imu.2021.100586] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/18/2022]
5
Bocher O, Génin E. Rare variant association testing in the non-coding genome. Hum Genet 2020; 139:1345-1362. [PMID: 32500240 DOI: 10.1007/s00439-020-02190-y] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.8] [Received: 04/22/2020] [Accepted: 05/29/2020] [Indexed: 12/25/2022]
Abstract
The development of next-generation sequencing technologies has opened-up some new possibilities to explore the contribution of genetic variants to human diseases and in particular that of rare variants. Statistical methods have been developed to test for association with rare variants that require the definition of testing units and, in these testing units, the selection of qualifying variants to include in the test. In the coding regions of the genome, testing units are usually the different genes and qualifying variants are selected based on their functional effects on the encoded proteins. Extending these tests to the non-coding regions of the genome is challenging. Testing units are difficult to define as the non-coding genome organisation is still rather unknown. Qualifying variants are difficult to select as the functional impact of non-coding variants on gene expression is hard to predict. These difficulties could explain why very few investigators so far have analysed the non-coding parts of their whole genome sequencing data. These non-coding parts yet represent the vast majority of the genome and some studies suggest that they could play a major role in disease susceptibility. In this review, we discuss recent experimental and statistical developments to gain knowledge on the non-coding genome and how this knowledge could be used to include rare non-coding variants in association tests. We describe the few studies that have considered variants from the non-coding genome in association tests and how they managed to define testing units and select qualifying variants.
Affiliation(s)
- Ozvan Bocher
  - Génétique, Génomique Fonctionnelle Et Biotechnologies, Faculté de Médecine, Univ Brest, Inserm, Inserm UMR1078, Bâtiment E-IBRBS 2ieme étage, 22 avenue Camille Desmoulins, 29238, Brest Cedex 3, France
- Emmanuelle Génin
  - Génétique, Génomique Fonctionnelle Et Biotechnologies, Faculté de Médecine, Univ Brest, Inserm, Inserm UMR1078, Bâtiment E-IBRBS 2ieme étage, 22 avenue Camille Desmoulins, 29238, Brest Cedex 3, France
  - CHU Brest, Brest, France
6
Xu J, Xu W, Briollais L. A Bayes factor approach with informative prior for rare genetic variant analysis from next generation sequencing data. Biometrics 2020; 77:316-328. [PMID: 32277476 DOI: 10.1111/biom.13278] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 10/24/2018] [Revised: 02/15/2020] [Accepted: 04/01/2020] [Indexed: 11/28/2022]
Abstract
The discovery of rare genetic variants through next generation sequencing is a very challenging issue in the field of human genetics. We propose a novel region-based statistical approach based on a Bayes Factor (BF) to assess evidence of association between a set of rare variants (RVs) located on the same genomic region and a disease outcome in the context of case-control design. Marginal likelihoods are computed under the null and alternative hypotheses assuming a binomial distribution for the RV count in the region and a beta or mixture of Dirac and beta prior distribution for the probability of RV. We derive the theoretical null distribution of the BF under our prior setting and show that a Bayesian control of the false Discovery Rate can be obtained for genome-wide inference. Informative priors are introduced using prior evidence of association from a Kolmogorov-Smirnov test statistic. We use our simulation program, sim1000G, to generate RV data similar to the 1000 genomes sequencing project. Our simulation studies showed that the new BF statistic outperforms standard methods (SKAT, SKAT-O, Burden test) in case-control studies with moderate sample sizes and is equivalent to them under large sample size scenarios. Our real data application to a lung cancer case-control study found enrichment for RVs in known and novel cancer genes. It also suggests that using the BF with informative prior improves the overall gene discovery compared to the BF with noninformative prior.
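The idea of a region-based Bayes factor built from binomial counts and a beta prior can be illustrated with a simplified conjugate calculation. The sketch below is an illustration under stated assumptions, not the authors' method: it compares a shared rare-variant probability for cases and controls (null) against separate probabilities (alternative), and it uses a plain Beta(a, b) prior rather than the Dirac-beta mixture or the informative priors described in the abstract. The function names and the example counts are hypothetical.

```python
from math import lgamma


def log_beta(a, b):
    """Log of the Beta function B(a, b)."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)


def log_marginal(k, n, a, b):
    """Log marginal likelihood of k rare-variant counts in n trials under a
    Beta(a, b) prior on the per-trial probability. The binomial coefficient is
    omitted because it cancels in the Bayes factor computed below."""
    return log_beta(a + k, b + n - k) - log_beta(a, b)


def log_bayes_factor(k_case, n_case, k_ctrl, n_ctrl, a=1.0, b=1.0):
    """Log Bayes factor of H1 (separate probabilities in cases and controls)
    against H0 (one shared probability). Assumed model, for illustration only."""
    log_h1 = log_marginal(k_case, n_case, a, b) + log_marginal(k_ctrl, n_ctrl, a, b)
    log_h0 = log_marginal(k_case + k_ctrl, n_case + n_ctrl, a, b)
    return log_h1 - log_h0


# Hypothetical counts: equal rates favour H0, a strong imbalance favours H1.
print(log_bayes_factor(5, 100, 5, 100))
print(log_bayes_factor(20, 100, 0, 100))
```

A negative value indicates that the data are better explained by a shared probability, and a positive value indicates evidence for a difference between cases and controls.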
Affiliation(s)
- Jingxiong Xu
  - Dalla Lana School of Public Health, University of Toronto, Toronto, Canada
  - Lunenfeld-Tanenbaum Research Institute, Sinai Health System, Toronto, Canada
- Wei Xu
  - Dalla Lana School of Public Health, University of Toronto, Toronto, Canada
  - Princess Margaret Cancer Center, Toronto, Canada
- Laurent Briollais
  - Dalla Lana School of Public Health, University of Toronto, Toronto, Canada
  - Lunenfeld-Tanenbaum Research Institute, Sinai Health System, Toronto, Canada
7
Petrykey K, Lippé S, Robaey P, Sultan S, Laniel J, Drouin S, Bertout L, Beaulieu P, St-Onge P, Boulet-Craig A, Rezgui A, Yasui Y, Sapkota Y, Krull KR, Hudson MM, Laverdière C, Sinnett D, Krajinovic M. Influence of genetic factors on long-term treatment related neurocognitive complications, and on anxiety and depression in survivors of childhood acute lymphoblastic leukemia: The Petale study. PLoS One 2019; 14:e0217314. [PMID: 31181069 PMCID: PMC6557490 DOI: 10.1371/journal.pone.0217314] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.8] [Received: 02/04/2019] [Accepted: 05/08/2019] [Indexed: 01/04/2023]
Abstract
BACKGROUND A substantial number of survivors of childhood acute lymphoblastic leukemia suffer from treatment-related late adverse effects including neurocognitive impairment. While multiple studies have described neurocognitive outcomes in childhood acute lymphoblastic leukemia (ALL) survivors, relatively few have investigated their association with individual genetic constitution.
METHODS To further address this issue, genetic variants located in 99 genes relevant to the effects of anticancer drugs and in 360 genes implicated in nervous system function and predicted to affect protein function were pooled from whole exome sequencing data of childhood ALL survivors (PETALE cohort) and analyzed for an association with neurocognitive complications, as well as with anxiety and depression. Variants that sustained correction for multiple testing were genotyped in the entire cohort (n = 236) and analyzed with the same outcomes.
RESULTS Common variants in the MTR, PPARA, ABCC3, CALML5, CACNB2 and PCDHB10 genes were associated with deficits in neurocognitive test performance, whereas variants in the SLCO1B1 and EPHA5 genes were associated with anxiety and depression. The majority of associations were modulated by intensity of treatment. Associated variants were further analyzed in an independent SJLIFE cohort of 545 ALL survivors. Two variants, rs1805087 in the methionine synthase gene, MTR, and rs58225473 in the voltage-dependent calcium channel protein encoding gene, CACNB2, are of particular interest, since associations of borderline significance were found in the replication cohort and remain significant in the combined discovery and replication groups (OR = 1.5, 95% CI, 1-2.3; p = 0.04 and OR = 3.7, 95% CI, 1.25-11; p = 0.01, respectively). Variant rs4149056 in the SLCO1B1 gene also deserves further attention since it was previously shown to affect methotrexate clearance and short-term toxicity in ALL patients.
CONCLUSIONS The current findings can aid understanding of the influence of the genetic component on long-term neurocognitive impairment. Further studies are needed to confirm whether the identified variants may be useful in identifying survivors at increased risk of these complications.
Affiliation(s)
- Kateryna Petrykey
  - Sainte-Justine University Health Center (SJUHC), Montreal, Quebec, Canada
  - Department of Pharmacology and Physiology, Université de Montréal, Montreal, Quebec, Canada
- Sarah Lippé
  - Sainte-Justine University Health Center (SJUHC), Montreal, Quebec, Canada
  - Department of Psychology, Université de Montréal, Montreal, Quebec, Canada
- Philippe Robaey
  - Sainte-Justine University Health Center (SJUHC), Montreal, Quebec, Canada
  - Children’s Hospital of Eastern Ontario, Ottawa, Ontario, Canada
  - Department of Psychiatry, Université de Montréal, Montreal, Quebec, Canada
  - Department of Psychiatry, University of Ottawa, Ottawa, Ontario, Canada
- Serge Sultan
  - Sainte-Justine University Health Center (SJUHC), Montreal, Quebec, Canada
  - Department of Psychology, Université de Montréal, Montreal, Quebec, Canada
- Julie Laniel
  - Sainte-Justine University Health Center (SJUHC), Montreal, Quebec, Canada
  - Department of Psychology, Université de Montréal, Montreal, Quebec, Canada
- Simon Drouin
  - Sainte-Justine University Health Center (SJUHC), Montreal, Quebec, Canada
- Laurence Bertout
  - Sainte-Justine University Health Center (SJUHC), Montreal, Quebec, Canada
- Patrick Beaulieu
  - Sainte-Justine University Health Center (SJUHC), Montreal, Quebec, Canada
- Pascal St-Onge
  - Sainte-Justine University Health Center (SJUHC), Montreal, Quebec, Canada
- Aubrée Boulet-Craig
  - Sainte-Justine University Health Center (SJUHC), Montreal, Quebec, Canada
  - Department of Psychology, Université de Montréal, Montreal, Quebec, Canada
- Aziz Rezgui
  - Sainte-Justine University Health Center (SJUHC), Montreal, Quebec, Canada
- Yutaka Yasui
  - Epidemiology and Cancer Control Department, St. Jude Children’s Research Hospital, Memphis, TN, United States of America
- Yadav Sapkota
  - Epidemiology and Cancer Control Department, St. Jude Children’s Research Hospital, Memphis, TN, United States of America
- Kevin R. Krull
  - Epidemiology and Cancer Control Department, St. Jude Children’s Research Hospital, Memphis, TN, United States of America
- Melissa M. Hudson
  - Epidemiology and Cancer Control Department, St. Jude Children’s Research Hospital, Memphis, TN, United States of America
  - Oncology Department, St. Jude Children’s Research Hospital, Memphis, TN, United States of America
- Caroline Laverdière
  - Sainte-Justine University Health Center (SJUHC), Montreal, Quebec, Canada
  - Department of Pediatrics, Université de Montréal, Montreal, Quebec, Canada
- Daniel Sinnett
  - Sainte-Justine University Health Center (SJUHC), Montreal, Quebec, Canada
  - Department of Pediatrics, Université de Montréal, Montreal, Quebec, Canada
- Maja Krajinovic
  - Sainte-Justine University Health Center (SJUHC), Montreal, Quebec, Canada
  - Department of Pharmacology and Physiology, Université de Montréal, Montreal, Quebec, Canada
  - Department of Pediatrics, Université de Montréal, Montreal, Quebec, Canada
8
Investigation of novel variations of ORAI1 gene and their association with Kawasaki disease. J Hum Genet 2019; 64:511-519. [PMID: 30853710 DOI: 10.1038/s10038-019-0588-2] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Received: 11/19/2018] [Revised: 02/07/2019] [Accepted: 02/21/2019] [Indexed: 11/09/2022]
Abstract
ORAI1 encodes a calcium channel essential in the store-operated calcium entry mechanism. A previous genetic association study identified a rare in-frame insertion variant of ORAI1 conferring Kawasaki disease (KD). To deepen our understanding of the involvement of rare variants of ORAI1 in KD pathogenesis, we investigated 3812 patients with KD and 2644 healthy individuals for variations in the protein-coding region of ORAI1. By re-sequencing the study participants' DNA, 27 variants with minor allele frequencies (MAFs) < 0.01 that had not been examined in the previous study were identified. Although no significant association with KD was observed either in single-variant analyses or in a collapsing method analysis of the 27 variants, stratification by MAFs, variant types, and predicted deleteriousness revealed that six rare, deleterious, missense variants (MAF < 0.001, CADD C-score ≥ 20) were exclusively present in KD patients, including three refractory cases (OR = ∞, P = 0.046). The six missense variants include p.Gly98Asp, which has been demonstrated to result in gain of function leading to constitutive Ca2+ entry. Conversely, five types of frameshift variants, all identified near the N terminus and assumed to disrupt ORAI1 function, showed an opposite trend of association (OR = 0.35, P = 0.24). These findings support our hypothesis that genetic variations causing the upregulation of the Ca2+/NFAT pathway confer susceptibility to KD. Our findings also provide insights into the usefulness of stratifying the variants based on their MAFs and on the direction of the effects on protein function when conducting association studies using the gene-based collapsing method.
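The stratified collapsing result in this abstract (six carriers among patients, none among controls, OR = ∞) can be illustrated with a one-sided exact test on carrier counts. The sketch below is only an illustration: it assumes the denominators are the full 3812 patients and 2644 controls, which the abstract does not confirm for this stratum, and it does not claim to be the test the authors used, so it is not expected to reproduce the reported P value exactly. The function names are hypothetical.

```python
from math import comb


def one_sided_enrichment_p(case_carriers, control_carriers, n_cases, n_controls):
    """One-sided hypergeometric (Fisher-type) P value: the probability that at
    least `case_carriers` of all carriers fall among cases if carrier status is
    unrelated to disease status."""
    carriers = case_carriers + control_carriers
    total = n_cases + n_controls
    upper = min(carriers, n_cases)
    tail = 0.0
    for k in range(case_carriers, upper + 1):
        # k carriers among cases, the remaining carriers among controls
        tail += comb(n_cases, k) * comb(n_controls, carriers - k) / comb(total, carriers)
    return tail


def odds_ratio(case_carriers, control_carriers, n_cases, n_controls):
    """Sample odds ratio; infinite when no control (or no case non-carrier) exists."""
    a, b = case_carriers, n_cases - case_carriers
    c, d = control_carriers, n_controls - control_carriers
    if b * c == 0:
        return float("inf")
    return (a * d) / (b * c)


# Counts taken from the abstract; the denominators are an assumption.
print(odds_ratio(6, 0, 3812, 2644))
print(one_sided_enrichment_p(6, 0, 3812, 2644))
```

With zero carriers among controls the sample odds ratio is unbounded, which is why the abstract reports OR = ∞ alongside a modest P value.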
9
Lumley T, Brody J, Peloso G, Morrison A, Rice K. FastSKAT: Sequence kernel association tests for very large sets of markers. Genet Epidemiol 2018; 42:516-527. [PMID: 29932245 PMCID: PMC6129408 DOI: 10.1002/gepi.22136] [Citation(s) in RCA: 20] [Impact Index Per Article: 3.3] [Received: 01/25/2018] [Revised: 04/30/2018] [Accepted: 05/10/2018] [Indexed: 11/06/2022]
Abstract
The sequence kernel association test (SKAT) is widely used to test for associations between a phenotype and a set of genetic variants that are usually rare. Evaluating tail probabilities or quantiles of the null distribution for SKAT requires computing the eigenvalues of a matrix related to the genotype covariance between markers. Extracting the full set of eigenvalues of this matrix (an n × n matrix, for n subjects) has computational complexity proportional to n^3. As SKAT is often used when n > 10^4, this step becomes a major bottleneck in its use in practice. We therefore propose fastSKAT, a new, computationally inexpensive but accurate approximation to the tail probabilities, in which the k largest eigenvalues of a weighted genotype covariance matrix or the largest singular values of a weighted genotype matrix are extracted, and a single term based on the Satterthwaite approximation is used for the remaining eigenvalues. While the method is not particularly sensitive to the choice of k, we also describe how to choose its value, and show how fastSKAT can automatically alert users to the rare cases where the choice may affect results. As well as providing faster implementation of SKAT, the new method also enables entirely new applications of SKAT that were not possible before; we give examples grouping variants by topologically associating domains, and comparing chromosome-wide association by class of histone marker.
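The "single term based on the Satterthwaite approximation" can be made concrete with moment matching. A weighted sum of independent one-degree-of-freedom chi-squared variables with weights λ_i has mean Σλ_i and variance 2Σλ_i²; a scaled chi-squared a·χ²_ν with a = Σλ_i²/Σλ_i and ν = (Σλ_i)²/Σλ_i² has the same two moments. The sketch below is illustrative only and its function names are hypothetical: it takes the whole spectrum as given, whereas the point of the method in the abstract is to avoid computing all eigenvalues.

```python
def satterthwaite_term(eigenvalues):
    """Return (scale, df) such that scale * chi2_df matches the mean and
    variance of sum(lam_i * chi2_1) over the supplied eigenvalues."""
    s1 = sum(eigenvalues)                      # mean of the mixture
    s2 = sum(lam * lam for lam in eigenvalues)  # half the variance of the mixture
    if s1 == 0 or s2 == 0:
        return 0.0, 0.0                        # nothing left to approximate
    return s2 / s1, s1 * s1 / s2


def split_spectrum(eigenvalues, k):
    """Keep the k largest eigenvalues exactly and summarise the rest by a
    single moment-matched scaled chi-squared term."""
    ordered = sorted(eigenvalues, reverse=True)
    leading, remainder = ordered[:k], ordered[k:]
    return leading, satterthwaite_term(remainder)


# Hypothetical spectrum: two dominant eigenvalues and a flat tail.
leading, (scale, df) = split_spectrum([5.0, 1.0, 3.0, 1.0, 1.0, 1.0], k=2)
print(leading, scale, df)
```

The null distribution would then be treated as the exact mixture over the leading eigenvalues plus one scaled chi-squared term for the tail; evaluating the resulting tail probability is outside this sketch.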
Affiliation(s)
- Jennifer Brody
  - Cardiovascular Health Research Unit, University of Washington
- Gina Peloso
  - Department of Biostatistics, Boston University
- Kenneth Rice
  - Department of Biostatistics, University of Washington
10
Hecker J, Xu X, Townes FW, Loehlein Fier H, Corcoran C, Laird N, Lange C. Family-based tests for associating haplotypes with general phenotype data: Improving the FBAT-haplotype algorithm. Genet Epidemiol 2017; 42:123-126. [PMID: 29159827 DOI: 10.1002/gepi.22094] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Received: 08/22/2017] [Revised: 09/29/2017] [Accepted: 10/10/2017] [Indexed: 11/08/2022]
Abstract
For family-based association studies, Horvath et al. proposed an algorithm for the association analysis between haplotypes and arbitrary phenotypes when the phase of the haplotypes is unknown, that is, genotype data is given. Their approach to haplotype analysis maintains the original features of the TDT/FBAT-approach, that is, complete robustness against genetic confounding and misspecification of the phenotype. The algorithm has been implemented in the FBAT and PBAT software package and has been used in numerous substantive manuscripts. Here, we propose a simplification of the original algorithm that maintains the original approach but reduces the computational burden of the approach substantially and gives valuable insights regarding the conditional distribution. With the modified algorithm, the application to whole-genome sequencing (WGS) studies becomes feasible; for example, in sliding window approaches or spatial-clustering approaches. The reduction of the computational burden that our modification provides is especially dramatic when both parental genotypes are missing. For example, for eight variants and 441 nuclear families with mostly offspring-only families, in a WGS study at the APOE locus, the running time decreased from approximately 21 hr for the original algorithm to 0.11 sec after our modification.
Affiliation(s)
- Julian Hecker
  - Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, Massachusetts, United States of America
  - Department of Genomic Mathematics, University of Bonn, Bonn, Germany
- Xin Xu
  - Nanfang Hospital, Southern Medical University, Guangzhou, China
- F William Townes
  - Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, Massachusetts, United States of America
- Heide Loehlein Fier
  - Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, Massachusetts, United States of America
  - Department of Genomic Mathematics, University of Bonn, Bonn, Germany
- Chris Corcoran
  - Department of Mathematics and Statistics, Utah State University, Logan, UT, USA
- Nan Laird
  - Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, Massachusetts, United States of America
- Christoph Lange
  - Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, Massachusetts, United States of America
  - Channing Division of Network Medicine, Brigham and Women's Hospital, Boston, MA, USA
11
Longitudinal data analysis for rare variants detection with penalized quadratic inference function. Sci Rep 2017; 7:650. [PMID: 28381821 PMCID: PMC5429681 DOI: 10.1038/s41598-017-00712-9] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 12/09/2016] [Accepted: 03/08/2017] [Indexed: 11/08/2022]
Abstract
Longitudinal genetic data provide more information regarding genetic effects over time compared with cross-sectional data. Coupled with next-generation sequencing technologies, it becomes feasible to identify important genes containing both rare and common variants in a longitudinal design. In this work, we adopted a weighted sum statistic (WSS) to collapse multiple variants in a gene region to form a gene score. When multiple genes in a pathway were considered together, a penalized longitudinal model under the quadratic inference function (QIF) framework was applied for efficient gene selection. We evaluated the estimation accuracy and model selection performance under different model settings, then applied the method to a real dataset from the Genetic Analysis Workshop 18 (GAW18). Compared with the unpenalized QIF method, the penalized QIF (pQIF) method achieved better estimation accuracy and higher selection efficiency. The pQIF remained optimal even when the working correlation structure was mis-specified. The real data analysis identified one important gene, angiotensin II receptor type 1 (AGTR1), in the Ca2+/AT-IIR/α-AR signaling pathway. The estimated effect implied that AGTR1 may have a protective effect against hypertension. Our pQIF method provides a general tool for longitudinal sequencing studies involving large numbers of genetic variants.
12
Sun R, Weng H, Hu I, Guo J, Wu WKK, Zee BCY, Wang MH. A W-test collapsing method for rare-variant association testing in exome sequencing data. Genet Epidemiol 2016; 40:591-596. [PMID: 27531462 DOI: 10.1002/gepi.22000] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Received: 08/14/2015] [Revised: 06/06/2016] [Accepted: 07/17/2016] [Indexed: 12/20/2022]
Abstract
Advancement in sequencing technology enables the study of association between complex disorder phenotypes and single-nucleotide polymorphisms with rare mutations. However, rare genetic variants have extremely small variance, which impairs the testing power of traditional statistical methods. We introduce a W-test collapsing method to evaluate rare-variant association by measuring the distributional differences between cases and controls through the combined log of odds ratio within a genomic region. The method is model-free and inherits a chi-squared distribution with degrees of freedom estimated from bootstrapped samples of the data, and allows for fast and accurate P-value calculation without the need for permutations. The proposed method was compared with the Weighted-Sum Statistic and the Sequence Kernel Association Test on simulation datasets, and showed good performance and significantly faster computing speed. In an application to a real next-generation sequencing dataset of hypertensive disorder, it identified genes with interesting biological functions associated with metabolism disorder and inflammation, including MACROD1, NLRP7, AGK, PAK6, and APBB1. The proposed method offers an efficient and effective way of testing rare genetic variants in whole exome sequencing datasets.
Affiliation(s)
- Rui Sun
- Division of Biostatistics, Centre for Clinical Research and Biostatistics, JC School of Public Health and Primary Care, Chinese University of Hong Kong, Shatin, NT, Hong Kong SAR; Centre for Clinical Trials and Biostatistics, CUHK Shenzhen Research Institute, Shenzhen, China
- Haoyi Weng
- Division of Biostatistics, Centre for Clinical Research and Biostatistics, JC School of Public Health and Primary Care, Chinese University of Hong Kong, Shatin, NT, Hong Kong SAR; Centre for Clinical Trials and Biostatistics, CUHK Shenzhen Research Institute, Shenzhen, China
- Inchi Hu
- ISOM Department, Biomedical Engineering Division, Hong Kong University of Science and Technology, Kowloon, Hong Kong SAR
- Junfeng Guo
- Division of Biostatistics, Centre for Clinical Research and Biostatistics, JC School of Public Health and Primary Care, Chinese University of Hong Kong, Shatin, NT, Hong Kong SAR; Centre for Clinical Trials and Biostatistics, CUHK Shenzhen Research Institute, Shenzhen, China; Australian National University, Canberra, Australia
- William K K Wu
- Department of Anesthesia and Intensive Care, Chinese University of Hong Kong, Hong Kong, Hong Kong SAR
- Benny Chung-Ying Zee
- Division of Biostatistics, Centre for Clinical Research and Biostatistics, JC School of Public Health and Primary Care, Chinese University of Hong Kong, Shatin, NT, Hong Kong SAR; Centre for Clinical Trials and Biostatistics, CUHK Shenzhen Research Institute, Shenzhen, China
- Maggie Haitian Wang
- Division of Biostatistics, Centre for Clinical Research and Biostatistics, JC School of Public Health and Primary Care, Chinese University of Hong Kong, Shatin, NT, Hong Kong SAR; Centre for Clinical Trials and Biostatistics, CUHK Shenzhen Research Institute, Shenzhen, China
|
13
|
Abstract
In recent years, genome and exome sequencing studies have implicated a plethora of new disease genes with rare causal variants. Here, I review 150 exome sequencing studies that claim to have discovered that a disease can be caused by different rare variants in the same gene, and I determine whether their methods followed the current best-practice guidelines in the interpretation of their data. Specifically, I assess whether studies appropriately assess controls for rare variants throughout the entire gene or implicated region as opposed to only investigating the specific rare variants identified in the cases, and I assess whether studies present sufficient co-segregation data for statistically significant linkage. I find that the proportion of studies performing gene-based analyses has increased with time, but that even in 2015 fewer than 40% of the reviewed studies used this method, and only 10% presented statistically significant co-segregation data. Furthermore, I find that the genes reported in these papers are explaining a decreasing proportion of cases as the field moves past most of the low-hanging fruit, with 50% of the genes from studies in 2014 and 2015 having variants in fewer than 5% of cases. As more studies focus on genes explaining relatively few cases, the importance of performing appropriate gene-based analyses is increasing. It is becoming increasingly important for journal editors and reviewers to require stringent gene-based evidence to avoid an avalanche of misleading disease gene discovery papers.
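As a rough illustration of the gene-based comparison this abstract calls for, the sketch below counts carriers of any qualifying rare variant across the whole gene in cases and in controls, then applies a one-sided Fisher exact test. It is only one possible form of gene-based analysis and is not taken from the review. The function names, the toy genotypes, and the choice of a carrier-based Fisher test are all assumptions.

```python
from math import comb

def count_carriers(genotypes):
    """Count individuals carrying at least one qualifying rare variant in the gene.

    genotypes: list of rows, one per individual, with the allele count
    (0, 1 or 2) at every qualifying variant position in the gene.
    """
    return sum(1 for row in genotypes if any(g > 0 for g in row))

def fisher_one_sided(case_carriers, n_cases, control_carriers, n_controls):
    """P(at least `case_carriers` carriers among cases) under the hypergeometric null."""
    total_carriers = case_carriers + control_carriers
    n_total = n_cases + n_controls
    upper = min(total_carriers, n_cases)
    tail = 0
    for k in range(case_carriers, upper + 1):
        # Ways to place k carriers among the cases and the rest among controls.
        tail += comb(total_carriers, k) * comb(n_total - total_carriers, n_cases - k)
    return tail / comb(n_total, n_cases)

# Toy data (hypothetical): three of four cases carry a variant somewhere
# in the gene; none of the four controls do.
cases = [[1, 0, 0], [0, 0, 1], [0, 1, 0], [0, 0, 0]]
controls = [[0, 0, 0], [0, 0, 0], [0, 0, 0], [0, 0, 0]]
p_value = fisher_one_sided(count_carriers(cases), len(cases),
                           count_carriers(controls), len(controls))
```

The point of counting carriers across the whole gene, rather than only at the specific variants seen in cases, is that controls are then screened with the same criteria as cases.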
Affiliation(s)
- Elizabeth T Cirulli
- Center for Applied Genomics and Precision Medicine, Duke University School of Medicine, Durham, North Carolina, United States of America
|
14
|
Cordell HJ. Summary of results and discussions from the gene-based tests group at Genetic Analysis Workshop 18. Genet Epidemiol 2014; 38 Suppl 1:S44-8. [PMID: 25112187 PMCID: PMC4305206 DOI: 10.1002/gepi.21824] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.6] [Indexed: 12/16/2022]
Abstract
I present a summary of the results and discussions held within the working group on gene-based tests at Genetic Analysis Workshop 18 (GAW18). The main focus of interest in our working group was modeling the action of combinations or "groups" of genetic variants, with a group of variants most often defined as a set of single-nucleotide polymorphisms lying within a known gene. Some contributions investigated the performance of previously proposed methods (particularly rare-variant collapsing or burden-type methods) for addressing this question, applied to the GAW18 data, and other contributions developed novel approaches and addressed novel questions. Most approaches were successful in detecting significant effects at MAP4 in the simulated data. No other genetic effects were consistently detected across different analyses. Low power was noted, particularly for those methods that restricted analysis to the subset of unrelated individuals.
Affiliation(s)
- Heather J Cordell
- Institute of Genetic Medicine, Newcastle University, Newcastle upon Tyne, United Kingdom
|
15
|
Paterson AD. Drinking from the Holy Grail: analysis of whole-genome sequencing from the Genetic Analysis Workshop 18. Genet Epidemiol 2014; 38 Suppl 1:S1-4. [PMID: 25112182 DOI: 10.1002/gepi.21818] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Indexed: 01/29/2023]
Abstract
The Genetic Analysis Workshops distribute real and simulated human genetic data to allow the development and comparison of methods to detect genetic variants and genes related to biological traits; the results are then presented and discussed at a biennial meeting. The data made available for Genetic Analysis Workshop 18 (GAW18) included whole-genome sequence data for odd-numbered autosomes from 20 large Mexican American pedigrees selected through probands with type 2 diabetes. Real and simulated blood pressure phenotype data were provided to allow the comparison of methods to detect variants and genes associated with blood pressure. Some of the complexity present in the data includes related individuals, repeated quantitative trait outcomes, covariates, medication effects, pharmacokinetic effects, missing data, admixed population, and imputed genotypes. A wide range of analytic approaches were applied to the data. Contributions that focused only on a subset of up to 155 unrelated subjects from the pedigrees were faced with low power. One recommendation for future analysis is the use of the provided null phenotype to allow comparison of type I error across methods. Collaboration between statistical geneticists and molecular biologists or bioinformaticians would provide helpful input to place variants in genes for gene-based association tests.
Affiliation(s)
- Andrew D Paterson
- Genetics and Genome Biology Program, The Hospital for Sick Children Research Institute, Toronto, Ontario, Canada; Divisions of Epidemiology and Biostatistics, Dalla Lana School of Public Health, Department of Psychiatry, Institute of Medical Sciences, University of Toronto, Toronto, Ontario, Canada
|