1
Curtis D. Investigation of Recessive Effects of Coding Variants on Common Clinical Phenotypes in Exome-Sequenced UK Biobank Participants. Hum Hered 2024; 89:1-7. [PMID: 38342085 DOI: 10.1159/000537771] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/13/2023] [Accepted: 02/07/2024] [Indexed: 02/13/2024]
Abstract
INTRODUCTION Previous studies have demonstrated effects of rare coding variants on common, clinically relevant phenotypes although the additive burden of these variants makes only a small contribution to overall trait variance. Although recessive effects of individual homozygous variants have been studied, little work has been done to elucidate the impact of rare coding variants occurring together as compound heterozygotes. METHODS In this study, attempts were made to identify pairs of variants likely to be occurring as compound heterozygotes using 200,000 exome-sequenced subjects from the UK Biobank. Pairs of variants, which were seen together in the same subject more often than would be expected by chance, were excluded as it was assumed that these might be present in the same haplotype. Attention was restricted to variants with minor allele frequency ≤0.05 and to those predicted to alter amino acid sequence or prevent normal gene expression. For each gene, compound heterozygotes were assigned scores based on the rarity and predicted functional consequences of the constituent variants and the scores were used in a logistic regression analysis to test for association with hypertension, hyperlipidaemia, and type 2 diabetes. RESULTS No statistically significant associations were observed and the results conformed to the distribution, which would be expected under the null hypothesis. The average number of apparently compound heterozygous subjects for each gene was only 282.2. CONCLUSION It seems difficult to detect an effect of compound heterozygotes on the risk of these phenotypes. Even if recessive effects from compound heterozygotes do occur, they would only affect a small number of people and overall would not make a substantial contribution to phenotypic variance. This research has been conducted using the UK Biobank Resource.
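The pair-exclusion step described in the METHODS can be illustrated with a minimal sketch. It is not the author's implementation: the function names, the input layout, and the `ratio` threshold are all hypothetical, and the abstract does not say which statistical test was used to decide that a pair co-occurs "more often than would be expected by chance".

```python
from collections import Counter
from itertools import combinations


def likely_same_haplotype_pairs(carriers, n_subjects, ratio=2.0):
    """Flag variant pairs that co-occur more often than chance predicts.

    carriers: dict mapping subject id -> set of rare heterozygous variant ids
              in one gene (hypothetical input layout).
    ratio:    assumed threshold; a pair is flagged when its observed
              co-occurrence count exceeds ratio * expected count.
    """
    single = Counter()
    pair = Counter()
    for variants in carriers.values():
        single.update(variants)
        pair.update(combinations(sorted(variants), 2))

    flagged = set()
    for (a, b), observed in pair.items():
        # Expected co-occurrences if the two variants were independent.
        expected = single[a] * single[b] / n_subjects
        if observed > ratio * expected:
            flagged.add((a, b))
    return flagged


def candidate_compound_heterozygotes(carriers, flagged):
    """Return subjects carrying at least one variant pair that was not flagged."""
    result = {}
    for subject, variants in carriers.items():
        pairs = [p for p in combinations(sorted(variants), 2) if p not in flagged]
        if pairs:
            result[subject] = pairs
    return result
```

Pairs that survive the filter are only apparent compound heterozygotes; without phased data the two variants could still lie on the same haplotype.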
Affiliation(s)
- David Curtis
- UCL Genetics Institute, University College London, London, UK
2
Hudson A, Fournier M, Coulombe J, Daee D. Using existing pediatric cancer data from the Gabriella Miller Kids First Data Resource Program. JNCI Cancer Spectr 2023; 7:pkad079. [PMID: 37788089 PMCID: PMC10635640 DOI: 10.1093/jncics/pkad079] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/20/2023] [Revised: 09/07/2023] [Accepted: 09/25/2023] [Indexed: 10/05/2023]
Abstract
Childhood cancer and birth defects are leading causes of childhood mortality, and studies suggest that birth defects increase pediatric cancer risk. The Gabriella Miller Kids First Pediatric Research Program (Kids First) seeks to alleviate these conditions by building an expansive resource of genetic and clinical data from patients with pediatric cancer and birth defects and their families. This article describes the data and support provided by the Kids First Data Resource Center and the Kids First Data Resource Center Data Resource Portal, which enables the public to review Kids First studies and request access to individual data. The Kids First Portal contains data from more than 34 000 participants and connects with CAVATICA (Seven Bridges Genomics, Inc, now part of Velsera), a cloud-based analysis and sharing platform. Researchers have used Kids First data to investigate a variety of cancers and further funding opportunities are available. The Kids First Portal is a unique resource that unites pediatric cancer and birth defects to uncover their genetic etiology and improve patients' lives.
Affiliation(s)
- Alexandra Hudson
- Center for Research Strategy, National Cancer Institute, Bethesda, MD, USA
- Marcia Fournier
- Developmental Biology and Congenital Anomalies Branch, Eunice Kennedy Shriver National Institute of Child Health and Human Development, Bethesda, MD, USA
- James Coulombe
- Developmental Biology and Congenital Anomalies Branch, Eunice Kennedy Shriver National Institute of Child Health and Human Development, Bethesda, MD, USA
- Danielle Daee
- Genomic Epidemiology Branch, Division of Cancer Control and Population Sciences, National Cancer Institute, Bethesda, MD, USA
3
van Dijk EL, Naquin D, Gorrichon K, Jaszczyszyn Y, Ouazahrou R, Thermes C, Hernandez C. Genomics in the long-read sequencing era. Trends Genet 2023; 39:649-671. [PMID: 37230864 DOI: 10.1016/j.tig.2023.04.006] [Citation(s) in RCA: 11] [Impact Index Per Article: 11.0] [Received: 02/10/2023] [Revised: 04/21/2023] [Accepted: 04/25/2023] [Indexed: 05/27/2023]
Abstract
Long-read sequencing (LRS) technologies have provided extremely powerful tools to explore genomes. While in the early years these methods suffered technical limitations, they have recently made significant progress in terms of read length, throughput, and accuracy and bioinformatics tools have strongly improved. Here, we aim to review the current status of LRS technologies, the development of novel methods, and the impact on genomics research. We will explore the most impactful recent findings made possible by these technologies focusing on high-resolution sequencing of genomes and transcriptomes and the direct detection of DNA and RNA modifications. We will also discuss how LRS methods promise a more comprehensive understanding of human genetic variation, transcriptomics, and epigenetics for the coming years.
Affiliation(s)
- Erwin L van Dijk
- Université Paris-Saclay, CEA, CNRS, Institute for Integrative Biology of the Cell (I2BC), 91198, Gif-sur-Yvette, France
- Delphine Naquin
- Université Paris-Saclay, CEA, CNRS, Institute for Integrative Biology of the Cell (I2BC), 91198, Gif-sur-Yvette, France
- Kévin Gorrichon
- National Center of Human Genomics Research (CNRGH), 91000 Évry-Courcouronnes, France
- Yan Jaszczyszyn
- Université Paris-Saclay, CEA, CNRS, Institute for Integrative Biology of the Cell (I2BC), 91198, Gif-sur-Yvette, France
- Rania Ouazahrou
- Université Paris-Saclay, CEA, CNRS, Institute for Integrative Biology of the Cell (I2BC), 91198, Gif-sur-Yvette, France
- Claude Thermes
- Université Paris-Saclay, CEA, CNRS, Institute for Integrative Biology of the Cell (I2BC), 91198, Gif-sur-Yvette, France
- Céline Hernandez
- Université Paris-Saclay, CEA, CNRS, Institute for Integrative Biology of the Cell (I2BC), 91198, Gif-sur-Yvette, France
4
Hofmeister RJ, Ribeiro DM, Rubinacci S, Delaneau O. Accurate rare variant phasing of whole-genome and whole-exome sequencing data in the UK Biobank. Nat Genet 2023:10.1038/s41588-023-01415-w. [PMID: 37386248 DOI: 10.1038/s41588-023-01415-w] [Citation(s) in RCA: 22] [Impact Index Per Article: 22.0] [Received: 10/27/2022] [Accepted: 05/04/2023] [Indexed: 07/01/2023]
Abstract
Phasing involves distinguishing the two parentally inherited copies of each chromosome into haplotypes. Here, we introduce SHAPEIT5, a new phasing method that quickly and accurately processes large sequencing datasets, and apply it to UK Biobank (UKB) whole-genome and whole-exome sequencing data. We demonstrate that SHAPEIT5 phases rare variants with low switch error rates of below 5% for variants present in just 1 sample out of 100,000. Furthermore, we outline a method for phasing singletons, which, although less precise, constitutes an important step towards future developments. We then demonstrate that the use of UKB as a reference panel improves the accuracy of genotype imputation, which is even more pronounced when phased with SHAPEIT5 compared with other methods. Finally, we screen the UKB data for loss-of-function compound heterozygous events and identify 549 genes where both gene copies are knocked out. These genes complement current knowledge of gene essentiality in the human genome.
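The switch error rate quoted above can be made concrete with a small sketch. It uses the usual definition (the fraction of consecutive heterozygous-site pairs whose relative phase disagrees with a trusted phasing) and is not taken from SHAPEIT5; the function name and the 0/1 encoding are assumptions made for illustration.

```python
def switch_error_rate(truth, inferred):
    """Fraction of adjacent heterozygous-site pairs with a phase switch.

    truth, inferred: equal-length lists of 0/1, where each value says which
    haplotype carries the alternate allele at a heterozygous site
    (hypothetical encoding). Only relative phase between neighbouring sites
    matters, so a globally flipped phasing scores 0.
    """
    if len(truth) != len(inferred):
        raise ValueError("truth and inferred must have the same length")
    if len(truth) < 2:
        return 0.0

    switches = 0
    for i in range(1, len(truth)):
        true_same = truth[i] == truth[i - 1]
        inferred_same = inferred[i] == inferred[i - 1]
        if true_same != inferred_same:
            switches += 1
    return switches / (len(truth) - 1)
```

For example, `switch_error_rate([0, 0, 1, 1, 0], [0, 0, 0, 0, 1])` counts one switch among four adjacent pairs.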
Affiliation(s)
- Robin J Hofmeister
- Department of Computational Biology, University of Lausanne, Lausanne, Switzerland
- Diogo M Ribeiro
- Department of Computational Biology, University of Lausanne, Lausanne, Switzerland
- Simone Rubinacci
- Department of Computational Biology, University of Lausanne, Lausanne, Switzerland
- Olivier Delaneau
- Department of Computational Biology, University of Lausanne, Lausanne, Switzerland
5
Spendlove SJ, Bondhus L, Lluri G, Sul JH, Arboleda VA. Polygenic risk scores of endo-phenotypes identify the effect of genetic background in congenital heart disease. HGG ADVANCES 2022; 3:100112. [PMID: 35599848 PMCID: PMC9118152 DOI: 10.1016/j.xhgg.2022.100112] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/13/2021] [Accepted: 04/19/2022] [Indexed: 01/28/2023]
Abstract
Congenital heart disease (CHD) is a rare structural defect that occurs in ∼1% of live births. Studies on CHD genetic architecture have identified pathogenic single-gene mutations in less than 30% of cases. Single-gene mutations often show incomplete penetrance and variable expressivity. Therefore, we hypothesize that genetic background may play a role in modulating disease expression. Polygenic risk scores (PRSs) aggregate effects of common genetic variants to investigate whether, cumulatively, these variants are associated with disease penetrance or severity. However, the major limitations in this field have been in generating sufficient sample sizes for these studies. Here we used CHD-phenotype matched genome-wide association study (GWAS) summary statistics from the UK Biobank (UKBB) as our base study and whole-genome sequencing data from the CHD cohort (n₁ = 711 trios, n₂ = 362 European trios) of the Gabriella Miller Kids First dataset as our target study to develop PRSs for CHD. PRSs estimated using a GWAS for heart valve problems and heart murmur explain 2.5% of the variance in case-control status of CHD (all SNVs, p = 7.90 × 10⁻³; fetal cardiac SNVs, p = 8.00 × 10⁻³) and 1.8% of the variance in severity of CHD (fetal cardiac SNVs, p = 6.20 × 10⁻³; all SNVs, p = 0.015). These results show that common variants captured in CHD phenotype-matched GWASs have a modest but significant contribution to phenotypic expression of CHD. Further exploration of the cumulative effect of common variants is necessary for understanding the complex genetic etiology of CHD and other rare diseases.
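As a rough illustration of how a PRS aggregates common-variant effects, the sketch below computes the standard weighted sum of allele dosages and GWAS effect sizes. It is not the pipeline used in the study; the function name, the dictionary layout, and the variant identifiers are hypothetical, and real analyses also involve steps such as clumping and p-value thresholding that are omitted here.

```python
def polygenic_risk_score(dosages, effect_sizes):
    """Weighted sum of effect-allele dosages for one individual.

    dosages:      dict variant id -> effect-allele count (0, 1 or 2) in the
                  target sample (hypothetical layout).
    effect_sizes: dict variant id -> per-allele effect estimate (beta) from
                  the base GWAS summary statistics.
    Variants missing from the base GWAS are skipped.
    """
    return sum(
        effect_sizes[variant] * dosage
        for variant, dosage in dosages.items()
        if variant in effect_sizes
    )
```

Scores computed this way for cases and controls would then be entered into a regression model to estimate the variance explained.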
Affiliation(s)
- Sarah J Spendlove
- Interdepartmental Bioinformatics Program, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA, USA; Department of Pathology and Laboratory Medicine, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA, USA
- Leroy Bondhus
- Department of Pathology and Laboratory Medicine, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA, USA; Department of Human Genetics, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA, USA
- Gentian Lluri
- Ahmanson/UCLA Adult Congenital Heart Disease Center, Division of Cardiology, Department of Medicine, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA, USA
- Jae Hoon Sul
- Interdepartmental Bioinformatics Program, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA, USA; Department of Psychiatry, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA, USA
- Valerie A Arboleda
- Interdepartmental Bioinformatics Program, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA, USA; Department of Pathology and Laboratory Medicine, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA, USA; Department of Human Genetics, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA, USA; Department of Computational Medicine, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA, USA
6
Scafuri B, Verdino A, D'Arminio N, Marabotti A. Computational methods to assist in the discovery of pharmacological chaperones for rare diseases. Brief Bioinform 2022; 23:6590149. [PMID: 35595532 DOI: 10.1093/bib/bbac198] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 02/22/2022] [Revised: 04/13/2022] [Accepted: 04/28/2022] [Indexed: 12/21/2022]
Abstract
Pharmacological chaperones are chemical compounds able to bind proteins and stabilize them against denaturation and subsequent degradation. Some pharmacological chaperones have been approved, or are under investigation, for the treatment of rare inborn errors of metabolism, caused by genetic mutations that can often destabilize the structure of the proteins expressed by that gene. Given that, for rare diseases, there is a general lack of pharmacological treatments, expectations for this type of compound are high. However, their discovery is not straightforward. In this review, we focus on the computational methods that can assist and accelerate the search for these compounds, and show examples in which these methods were successfully applied to discover promising molecules belonging to this new category of pharmacologically active compounds.
Affiliation(s)
- Bernardina Scafuri
- Department of Chemistry and Biology "A. Zambelli", University of Salerno, via Giovanni Paolo II, 132, 84084 Fisciano (SA), Italy
- Anna Verdino
- Department of Chemistry and Biology "A. Zambelli", University of Salerno, via Giovanni Paolo II, 132, 84084 Fisciano (SA), Italy
- Nancy D'Arminio
- Department of Chemistry and Biology "A. Zambelli", University of Salerno, via Giovanni Paolo II, 132, 84084 Fisciano (SA), Italy
- Anna Marabotti
- Department of Chemistry and Biology "A. Zambelli", University of Salerno, via Giovanni Paolo II, 132, 84084 Fisciano (SA), Italy
7
Collen LV, Kim DY, Field M, Okoroafor I, Saccocia G, Whitcomb SD, Green J, Dong MD, Barends J, Carey B, Weatherly ME, Rockowitz S, Sliz P, Liu E, Eran A, Grushkin-Lerner L, Bousvaros A, Muise AM, Klein C, Mitsialis V, Ouahed J, Snapper SB. Clinical Phenotypes and Outcomes in Monogenic Versus Non-monogenic Very Early Onset Inflammatory Bowel Disease. J Crohns Colitis 2022; 16:1380-1396. [PMID: 35366317 PMCID: PMC9455789 DOI: 10.1093/ecco-jcc/jjac045] [Citation(s) in RCA: 16] [Impact Index Per Article: 8.0] [Received: 10/24/2021] [Revised: 01/31/2022] [Accepted: 03/31/2022] [Indexed: 12/24/2022]
Abstract
BACKGROUND AND AIMS Over 80 monogenic causes of very early onset inflammatory bowel disease [VEOIBD] have been identified. Prior reports of the natural history of VEOIBD have not considered monogenic disease status. The objective of this study is to describe clinical phenotypes and outcomes in a large single-centre cohort of patients with VEOIBD and universal access to whole exome sequencing [WES]. METHODS Patients receiving IBD care at a single centre were prospectively enrolled in a longitudinal data repository starting in 2012. WES was offered with enrollment. Enrolled patients were filtered by age of diagnosis <6 years to comprise a VEOIBD cohort. Monogenic disease was identified by filtering proband variants for rare, loss-of-function, or missense variants in known VEOIBD genes inherited according to standard Mendelian inheritance patterns. RESULTS This analysis included 216 VEOIBD patients, followed for a median of 5.8 years. Seventeen patients [7.9%] had monogenic disease. Patients with monogenic IBD were younger at diagnosis and were more likely to have Crohn's disease phenotype with higher rates of stricturing and penetrating disease and extraintestinal manifestations. Patients with monogenic disease were also more likely to experience outcomes of intensive care unit [ICU] hospitalisation, gastrostomy tube, total parenteral nutrition use, stunting at 3-year follow-up, haematopoietic stem cell transplant, and death. A total of 41 patients [19.0%] had infantile-onset disease. After controlling for monogenic disease, patients with infantile-onset IBD did not have increased risk for most severity outcomes. CONCLUSIONS Monogenic disease is an important driver of disease severity in VEOIBD. WES is a valuable tool in prognostication and management of VEOIBD.
Affiliation(s)
- Lauren V Collen
- Corresponding authors: Lauren V. Collen, 300 Longwood Avenue, Enders 670, Boston, MA 02115, USA. Tel.: 617-919-4973; fax: 617-730-0498
- David Y Kim
- Division of Gastroenterology, Hepatology, and Nutrition, Department of Pediatrics, Boston Children’s Hospital and Harvard Medical School, Boston, MA, USA
- Michael Field
- Division of Gastroenterology, Hepatology, and Nutrition, Department of Pediatrics, Boston Children’s Hospital and Harvard Medical School, Boston, MA, USA
- Ibeawuchi Okoroafor
- Division of Gastroenterology, Hepatology, and Nutrition, Department of Pediatrics, Boston Children’s Hospital and Harvard Medical School, Boston, MA, USA
- Gwen Saccocia
- Division of Gastroenterology, Hepatology, and Nutrition, Department of Pediatrics, Boston Children’s Hospital and Harvard Medical School, Boston, MA, USA
- Sydney Driscoll Whitcomb
- Division of Gastroenterology, Hepatology, and Nutrition, Department of Pediatrics, Boston Children’s Hospital and Harvard Medical School, Boston, MA, USA
- Julia Green
- Division of Gastroenterology, Hepatology, and Nutrition, Department of Pediatrics, Boston Children’s Hospital and Harvard Medical School, Boston, MA, USA
- Michelle Dao Dong
- Division of Gastroenterology, Hepatology, and Nutrition, Department of Pediatrics, Boston Children’s Hospital and Harvard Medical School, Boston, MA, USA
- Jared Barends
- Division of Gastroenterology, Hepatology, and Nutrition, Department of Pediatrics, Boston Children’s Hospital and Harvard Medical School, Boston, MA, USA
- Bridget Carey
- Division of Gastroenterology, Hepatology, and Nutrition, Department of Pediatrics, Boston Children’s Hospital and Harvard Medical School, Boston, MA, USA
- Madison E Weatherly
- Division of Gastroenterology, Hepatology, and Nutrition, Department of Pediatrics, Boston Children’s Hospital and Harvard Medical School, Boston, MA, USA
- Shira Rockowitz
- Manton centre for Orphan Disease Research, Boston Children’s Hospital, Boston, MA, USA
- Piotr Sliz
- Manton centre for Orphan Disease Research, Boston Children’s Hospital, Boston, MA, USA; Division of Molecular Medicine, Boston Children’s Hospital, Boston, MA, USA
- Enju Liu
- Division of Gastroenterology, Hepatology, and Nutrition, Department of Pediatrics, Boston Children’s Hospital and Harvard Medical School, Boston, MA, USA; Institutional centres for Clinical and Translational Research, Boston Children’s Hospital, Boston, MA, USA
- Alal Eran
- Computational Health Informatics Program, Boston Children’s Hospital, Boston, MA, USA; Harvard Medical School, Department of Biomedical Informatics, Boston, MA, USA; Department of Life Sciences and Zlotowski centre for Neuroscience, Ben Gurion University of the Negev, Beer-Sheva, Israel
- Leslie Grushkin-Lerner
- Division of Gastroenterology, Hepatology, and Nutrition, Department of Pediatrics, Boston Children’s Hospital and Harvard Medical School, Boston, MA, USA
- Athos Bousvaros
- Division of Gastroenterology, Hepatology, and Nutrition, Department of Pediatrics, Boston Children’s Hospital and Harvard Medical School, Boston, MA, USA
- Aleixo M Muise
- SickKids Inflammatory Bowel Disease centre, Research Institute, Hospital for Sick Children, Toronto, ON, Canada; Division of Gastroenterology, Hepatology and Nutrition, Department of Pediatrics, University of Toronto, Toronto, ON, Canada; Institute of Medical Science, University of Toronto, Toronto, ON, Canada
- Christoph Klein
- Department of Pediatrics, Dr. von Hauner Children’s Hospital, LMU Klinikum, and Gene centre, Ludwig Maximilians Universität München, München, Germany
- Vanessa Mitsialis
- Division of Gastroenterology, Hepatology, and Nutrition, Department of Pediatrics, Boston Children’s Hospital and Harvard Medical School, Boston, MA, USA; Division of Gastroenterology, Department of Medicine, Brigham & Women’s Hospital and Harvard Medical School, Boston, MA, USA
- Scott B Snapper
- Scott B. Snapper, 300 Longwood Avenue, Enders 670, Boston, MA 02115, USA. Tel: 617-919-4973; fax: 617-730-0498
8
Miller DB, Piccolo SR. trioPhaser: using Mendelian inheritance logic to improve genomic phasing of trios. BMC Bioinformatics 2021; 22:559. [PMID: 34809557 PMCID: PMC8607709 DOI: 10.1186/s12859-021-04470-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/18/2021] [Accepted: 11/08/2021] [Indexed: 11/10/2022]
Abstract
BACKGROUND When analyzing DNA sequence data of an individual, knowing which nucleotide was inherited from each parent can be beneficial when trying to identify certain types of DNA variants. Mendelian inheritance logic can be used to accurately phase (haplotype) the majority (67-83%) of an individual's heterozygous nucleotide positions when genotypes are available for both parents (trio). However, when all members of a trio are heterozygous at a position, Mendelian inheritance logic cannot be used to phase. For such positions, a computational phasing algorithm can be used. Existing phasing algorithms use a haplotype reference panel, sequencing reads, and/or parental genotypes to phase an individual; however, they are limited in that they can only phase certain types of variants, require a specific genotype build, require large amounts of storage capacity, and/or require long run times. We created trioPhaser to address these challenges. RESULTS trioPhaser uses gVCF files from an individual and their parents as initial input, and then outputs a phased VCF file. Input trio data are first phased using Mendelian inheritance logic. Then, the positions that cannot be phased using inheritance information alone are phased by the SHAPEIT4 phasing algorithm. Using whole-genome sequencing data of 52 trios, we show that trioPhaser, on average, increases the total number of phased positions by 21.0% and 10.5%, respectively, when compared to the number of positions that SHAPEIT4 or Mendelian inheritance logic can phase when either is used alone. In addition, we show that the accuracy of the phased calls output by trioPhaser is similar to that of linked-read and read-backed phasing. CONCLUSION trioPhaser is a containerized software tool that uses both Mendelian inheritance logic and SHAPEIT4 to phase trios when gVCF files are available. By implementing both phasing methods, more variant positions are phased compared to what either method is able to phase alone.
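The Mendelian inheritance logic described above, including the case where all three trio members are heterozygous, can be sketched for a single position as follows. This is an illustrative sketch, not trioPhaser's code; the function name and the tuple-based genotype encoding are assumptions.

```python
def phase_child_by_inheritance(child, mother, father):
    """Phase one child genotype using parental genotypes at the same position.

    Each genotype is an unordered pair of alleles, e.g. ("A", "G")
    (hypothetical encoding). Returns (maternal_allele, paternal_allele), or
    None when inheritance logic alone cannot decide, for example when all
    three individuals are heterozygous or the trio is Mendelian-inconsistent.
    """
    first, second = child
    if first == second:
        # A homozygous child needs no phasing.
        return (first, second)

    # Try both assignments of the child's alleles to the two parents.
    consistent = [
        (maternal, paternal)
        for maternal, paternal in ((first, second), (second, first))
        if maternal in mother and paternal in father
    ]
    return consistent[0] if len(consistent) == 1 else None
```

Positions for which the function returns None are the ones that would have to be passed on to a statistical phasing algorithm such as SHAPEIT4.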
Affiliation(s)
- Dustin B Miller
- Department of Biology, Brigham Young University, Provo, UT, 84602, USA
- Stephen R Piccolo
- Department of Biology, Brigham Young University, Provo, UT, 84602, USA