1
REEV: review, evaluate and explain variants. Nucleic Acids Res 2024:gkae366. [PMID: 38769069 DOI: 10.1093/nar/gkae366] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/13/2024] [Revised: 04/07/2024] [Accepted: 05/03/2024] [Indexed: 05/22/2024] Open
Abstract
In the era of high throughput sequencing, special software is required for the clinical evaluation of genetic variants. We developed REEV (Review, Evaluate and Explain Variants), a user-friendly platform for clinicians and researchers in the field of rare disease genetics. Supporting data was aggregated from public data sources. We compared REEV with seven other tools for clinical variant evaluation. REEV (semi-)automatically fills individual ACMG criteria facilitating variant interpretation. REEV can store disease and phenotype data related to a case to use these for phenotype similarity measures. Users can create public permanent links for individual variants that can be saved as browser bookmarks and shared. REEV may help in the fast diagnostic assessment of genetic variants in a clinical as well as in a research context. REEV (https://reev.bihealth.org/) is free and open to all users and there is no login requirement.
2
Loss-of-function variants affecting the STAGA complex component SUPT7L cause a developmental disorder with generalized lipodystrophy. Hum Genet 2024; 143:683-694. [PMID: 38592547 PMCID: PMC11098864 DOI: 10.1007/s00439-024-02669-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/30/2023] [Accepted: 03/11/2024] [Indexed: 04/10/2024]
Abstract
Generalized lipodystrophy is a feature of various hereditary disorders, often leading to a progeroid appearance. In the present study we identified a missense and a frameshift variant in a compound heterozygous state in SUPT7L in a boy with intrauterine growth retardation, generalized lipodystrophy, and additional progeroid features. SUPT7L encodes a component of the transcriptional coactivator complex STAGA. By transcriptome sequencing, we showed that the predicted missense variant causes aberrant splicing, leading to exon truncation and thereby to a complete absence of SUPT7L in dermal fibroblasts. In addition, we found altered expression of genes encoding DNA repair pathway components. This pathway was further investigated and an increased rate of DNA damage was detected in proband-derived fibroblasts and genome-edited HeLa cells. Finally, we performed transient overexpression of wildtype SUPT7L in both cellular systems, which normalized the number of DNA damage events. Our findings suggest SUPT7L as a novel disease gene and underline the link between genome instability and progeroid phenotypes.
3
TREX1 p.A129fs and p.Y305C variants in a large multi-ethnic cohort of CADASIL-like unrelated patients. Neurobiol Aging 2023; 123:208-215. [PMID: 36586737 DOI: 10.1016/j.neurobiolaging.2022.11.013] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/23/2022] [Revised: 11/16/2022] [Accepted: 11/22/2022] [Indexed: 11/27/2022]
Abstract
Cerebral autosomal dominant arteriopathy with subcortical infarcts and leukoencephalopathy (CADASIL) and retinal vasculopathy with cerebral leukodystrophy and systemic manifestations (RVCL-S) are the most common forms of rare monogenic early-onset cerebral small vessel disease and share clinical and, to different extents, neuroradiological and neuropathological features. However, whether the overlapping phenotype of CADASIL and RVCL-S may be explained by shared genetic risk or causative factors, such as TREX1 coding variants, remains poorly understood. To investigate this hypothesis, we used exome sequencing to screen TREX1 protein-coding variability in a large multi-ethnic cohort of 180 early-onset independent familial and apparently sporadic CADASIL-like Caucasian patients from the USA, Portugal, Finland, Serbia and Turkey. We report 2 very rare and likely pathogenic TREX1 mutations: a loss-of-function mutation (p.Ala129fs) clustering in the catalytic domain, in an apparently sporadic 46-year-old patient from the USA, and a missense mutation (p.Tyr305Cys) in the well-conserved C-terminal region, in a 57-year-old patient with positive family history from Serbia. In concert with recent findings, our study expands the clinical spectrum of diseases associated with TREX1 mutations.
4
SODAR: managing multiomics study data and metadata. Gigascience 2023; 12:giad052. [PMID: 37498129 PMCID: PMC10373112 DOI: 10.1093/gigascience/giad052] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/22/2022] [Revised: 03/30/2023] [Accepted: 06/27/2023] [Indexed: 07/28/2023] Open
Abstract
Scientists employing omics in life science studies face challenges such as the modeling of multiassay studies, recording of all relevant parameters, and managing many samples with their metadata. They must manage many large files that are the results of the assays or subsequent computation. Users with diverse backgrounds, ranging from computational scientists to wet-lab scientists, have dissimilar needs when it comes to data access, with programmatic interfaces being favored by the former and graphical ones by the latter. We introduce SODAR, the system for omics data access and retrieval. SODAR is a software package that addresses these challenges by providing a web-based graphical user interface for managing multiassay studies and describing them using the ISA (Investigation, Study, Assay) data model and the ISA-Tab file format. Data storage is handled using the iRODS data management system, which handles large quantities of files and substantial amounts of data. SODAR also offers programmable APIs and command-line access for metadata and file storage. SODAR supports complex omics integration studies and can be easily installed. The software is written in Python 3 and freely available at https://github.com/bihealth/sodar-server under the MIT license.
5
Prioritization of non-coding elements involved in non-syndromic cleft lip with/without cleft palate through genome-wide analysis of de novo mutations. HGG Adv 2022; 4:100166. [PMID: 36589413 PMCID: PMC9795529 DOI: 10.1016/j.xhgg.2022.100166] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/01/2022] [Accepted: 12/01/2022] [Indexed: 12/12/2022] Open
Abstract
Non-syndromic cleft lip with/without cleft palate (nsCL/P) is a highly heritable facial disorder. To date, systematic investigations of the contribution of rare variants in non-coding regions to nsCL/P etiology are sparse. Here, we re-analyzed available whole-genome sequence (WGS) data from 211 European case-parent trios with nsCL/P and identified 13,522 de novo mutations (DNMs) in nsCL/P cases, 13,055 of which mapped to non-coding regions. We integrated these data with DNMs from a reference cohort, with results of previous genome-wide association studies (GWASs), and functional and epigenetic datasets of relevance to embryonic facial development. A significant enrichment of nsCL/P DNMs was observed at two GWAS risk loci (4q28.1 (p = 8 × 10⁻⁴) and 2p21 (p = 0.02)), suggesting a convergence of both common and rare variants at these loci. We also mapped the DNMs to 810 position weight matrices indicative of transcription factor (TF) binding, and quantified the effect of the allelic changes in silico. This revealed a nominally significant overrepresentation of DNMs (p = 0.037), and a stronger effect on binding strength, for DNMs located in the sequence of the core binding region of the TF Musculin (MSC). Notably, MSC is involved in facial muscle development, together with a set of nsCL/P genes located at GWAS loci. Supported by additional results from single-cell transcriptomic data and molecular binding assays, this suggests that variation in MSC binding sites contributes to nsCL/P etiology. Our study describes a set of approaches that can be applied to increase the added value of WGS data.
6
ClearCNV: CNV calling from NGS panel data in the presence of ambiguity and noise. Bioinformatics 2022; 38:3871-3876. [PMID: 35751599 DOI: 10.1093/bioinformatics/btac418] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/06/2021] [Revised: 04/27/2022] [Accepted: 06/23/2022] [Indexed: 12/24/2022] Open
Abstract
MOTIVATION While the identification of small variants in panel sequencing data can be considered a solved problem, the identification of larger, multi-exon copy number variants (CNVs) still poses a considerable challenge. Thus, CNV calling has not been established in all laboratories performing panel sequencing. At the same time, such laboratories have accumulated large datasets and thus have the need to identify CNVs on their data to close the diagnostic gap. RESULTS In this article, we present our method clearCNV that addresses this need in two ways. First, it helps laboratories to properly assign datasets to enrichment kits. Based on homogeneous subsets of data, clearCNV identifies CNVs affecting the targeted regions. Using real-world datasets and validation, we show that our method is highly competitive with previous methods and preferable in terms of specificity. AVAILABILITY AND IMPLEMENTATION The software is available for free under a permissible license at https://github.com/bihealth/clear-cnv. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
7
Pathogenic Variants in Cardiomyopathy Disorder Genes Underlie Pediatric Myocarditis—Further Impact of Heterozygous Immune Disorder Gene Variants? J Cardiovasc Dev Dis 2022; 9:jcdd9070216. [PMID: 35877578 PMCID: PMC9321514 DOI: 10.3390/jcdd9070216] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/19/2022] [Revised: 06/23/2022] [Accepted: 07/01/2022] [Indexed: 12/04/2022] Open
Abstract
Myocarditis is an inflammatory disease of the heart. Pediatric myocarditis with the dilated cardiomyopathy (DCM) phenotype may be caused by likely pathogenic or pathogenic genetic variants [(L)P] in cardiomyopathy (CMP) genes. Systematic analysis of immune disorder gene defects has not been performed so far. We analyzed 12 patients with biopsy-proven myocarditis and the DCM phenotype together with their parents using whole-exome sequencing (WES). The WES data were filtered for rare pathogenic variants in CMP (n = 89) and immune disorder genes (n = 631). Twelve children with a median age of 2.9 (1.0–6.8) years had a mean left ventricular ejection fraction of 28% (22–32%) and myocarditis was confirmed by endomyocardial biopsy. Patients with primary immunodeficiency were excluded from the study. Four patients underwent implantation of a ventricular assist device and subsequent heart transplantation. Genetic analysis of the 12 families revealed an (L)P variant in the CMP gene in 8/12 index patients explaining DCM. Screening of recessive immune disorder genes identified a heterozygous (L)P variant in 3/12 index patients. This study supports the genetic impact of CMP genes for pediatric myocarditis with the DCM phenotype. Piloting the idea that additional immune-related genetic defects promote myocarditis suggests that the presence of heterozygous variants in these genes needs further investigation. Altered cilium function might play an additional role in inducing inflammation in the context of CMP.
8
Highly multiplexed immune repertoire sequencing links multiple lymphocyte classes with severity of response to COVID-19. EClinicalMedicine 2022; 48:101438. [PMID: 35600330 PMCID: PMC9106482 DOI: 10.1016/j.eclinm.2022.101438] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 11/02/2021] [Revised: 04/15/2022] [Accepted: 04/19/2022] [Indexed: 02/07/2023] Open
Abstract
BACKGROUND Disease progression of subjects with coronavirus disease 2019 (COVID-19) varies dramatically. Understanding the various types of immune response to SARS-CoV-2 is critical for better clinical management of coronavirus outbreaks and to potentially improve future therapies. Disease dynamics can be characterized by deciphering the adaptive immune response. METHODS In this cross-sectional study we analyzed 117 peripheral blood immune repertoires from healthy controls and subjects with mild to severe COVID-19 disease to elucidate the interplay between B and T cells. We used an immune repertoire Primer Extension Target Enrichment method (immunoPETE) to simultaneously sequence human leukocyte antigen (HLA) restricted T cell receptor beta chain (TRB), unrestricted T cell receptor delta chain (TRD), and immunoglobulin heavy chain (IgH) immune receptor repertoires. The distribution of TRB, TRD and IgH clones was compared between healthy and COVID-19 infected subjects. Variables were examined for a predictive model using McFadden's adjusted R². The aim of this study is to analyze the influence of the adaptive immune repertoire on the severity of the disease (value on the World Health Organization Clinical Progression Scale) in COVID-19. FINDINGS Combining clinical metadata with clonotypes of three immune receptor heavy chains (TRB, TRD, and IgH), we found significant associations between COVID-19 disease severity groups and immune receptor sequences of B and T cell compartments. Logistic regression showed an increase in shared IgH clonal types and a decrease of TRD in subjects with severe COVID-19. The probability of finding shared clones of TRD clonal types was highest in healthy subjects (controls). Some specific TRB clones seem to be present in severe COVID-19 (Figure S7b). The most informative models (McFadden's adjusted R² = 0.141) linked disease severity with immune repertoire measures across all three cell types, as well as receptor-specific cell counts, highlighting the importance of multiple lymphocyte classes in disease progression. INTERPRETATION Adaptive immune receptor peripheral blood repertoire measures are associated with COVID-19 disease severity. FUNDING The study was funded with grants from the Berlin Institute of Health (BIH).
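The abstract above reports McFadden's adjusted R² without defining it. As a rough illustration only, the statistic is commonly written as 1 − (ln L_model − K) / ln L_null, where K is the number of estimated parameters; conventions for counting K vary, and the abstract does not say which one the authors used. The function name and the log-likelihood values below are hypothetical and are not taken from the study.

```python
def mcfadden_adjusted_r2(ll_model: float, ll_null: float, n_params: int) -> float:
    """McFadden's adjusted pseudo-R^2 for a discrete-outcome model.

    ll_model : log-likelihood of the fitted model
    ll_null  : log-likelihood of the intercept-only model
    n_params : number of estimated parameters (counting convention is an assumption)
    """
    # For discrete outcomes such as logistic regression, log-likelihoods are negative.
    if ll_null >= 0:
        raise ValueError("expected a negative null log-likelihood")
    # Subtracting n_params penalizes models that add uninformative predictors.
    return 1.0 - (ll_model - n_params) / ll_null


# Hypothetical values, chosen only to show the arithmetic.
example_r2 = mcfadden_adjusted_r2(ll_model=-80.0, ll_null=-100.0, n_params=4)
print(round(example_r2, 3))
```

Because of the penalty term, the adjusted value can fall when a predictor adds little to the likelihood, which is why it is used to compare candidate models.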
9
Xq27.1 palindrome mediated interchromosomal insertion likely causes familial congenital bilateral laryngeal abductor paralysis (Plott syndrome). J Hum Genet 2022; 67:405-410. [PMID: 35095096 PMCID: PMC9233990 DOI: 10.1038/s10038-022-01018-z] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 11/03/2021] [Revised: 01/12/2022] [Accepted: 01/13/2022] [Indexed: 01/27/2023]
Abstract
Bilateral laryngeal abductor paralysis is a rare entity and the second most common cause of stridor in newborns. So far, no conclusive genetic or chromosomal aberration has been reported for X-linked isolated bilateral vocal cord paralysis, also referred to as Plott syndrome. Via whole genome sequencing (WGS), we identified a complex interchromosomal insertion in a large family with seven affected males. The 404 kb inserted fragment originates from chromosome 10q21.3, contains no genes and is inserted in inverted orientation into the intergenic chromosomal region Xq27.1, 82 kb centromeric to the nearest gene SOX3. The patterns found at the breakpoint junctions resemble typical characteristics that arise in replication-based mechanisms with long-distance template switching. Non-protein-coding insertions into the same genomic region have been described to result in different phenotypes, indicating that the phenotypic outcome likely depends on the introduction of regulatory elements. In conclusion, our data add Plott syndrome as another entity, likely caused by the insertion of non-coding DNA into the intergenic chromosomal region Xq27.1. In this regard, we demonstrate the importance of WGS as a powerful diagnostic test in unsolved genetic diseases, as this genomic rearrangement has not been detected by current first-line diagnostic tests, i.e., exome sequencing and chromosomal microarray analysis.
10
Increased risk of severe clinical course of COVID-19 in carriers of HLA-C*04:01. EClinicalMedicine 2021; 40:101099. [PMID: 34490415 PMCID: PMC8410317 DOI: 10.1016/j.eclinm.2021.101099] [Citation(s) in RCA: 42] [Impact Index Per Article: 14.0] [Received: 06/17/2021] [Revised: 08/01/2021] [Accepted: 08/04/2021] [Indexed: 01/08/2023] Open
Abstract
BACKGROUND Since the beginning of the coronavirus disease 2019 (COVID-19) pandemic, there has been increasing urgency to identify pathophysiological characteristics leading to severe clinical course in patients infected with severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). Human leukocyte antigen (HLA) alleles have been suggested as potential genetic host factors that affect individual immune response to SARS-CoV-2. We sought to evaluate this hypothesis by conducting a multicenter study using HLA sequencing. METHODS We analyzed the association between COVID-19 severity and HLAs in 435 individuals from Germany (n = 135), Spain (n = 133), Switzerland (n = 20) and the United States (n = 147), who had been enrolled from March 2020 to August 2020. This study included patients older than 18 years, diagnosed with COVID-19 and representing the full spectrum of the disease. Finally, we tested our results by meta-analysing data from prior genome-wide association studies (GWAS). FINDINGS We describe a potential association of HLA-C*04:01 with severe clinical course of COVID-19. Carriers of HLA-C*04:01 had twice the risk of intubation when infected with SARS-CoV-2 (risk ratio 1.5 [95% CI 1.1-2.1], odds ratio 3.5 [95% CI 1.9-6.6], adjusted p-value = 0.0074). These findings are based on data from four countries and corroborated by independent results from GWAS. Our findings are biologically plausible, as HLA-C*04:01 has fewer predicted binding sites for relevant SARS-CoV-2 peptides compared to other HLA alleles. INTERPRETATION HLA-C*04:01 carrier state is associated with severe clinical course in SARS-CoV-2. Our findings suggest that HLA class I alleles have a relevant role in immune defense against SARS-CoV-2. FUNDING Funded by Roche Sequencing Solutions, Inc.
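The findings above quote both a risk ratio and an odds ratio for the same association. As a minimal sketch of how the two measures differ, both can be computed from a 2×2 table; the counts below are hypothetical, are not the study's data, and make no attempt to reproduce the reported estimates, which may come from adjusted models.

```python
def risk_ratio(a: int, b: int, c: int, d: int) -> float:
    """Risk ratio from a 2x2 table.

    a: carriers with the outcome,     b: carriers without it
    c: non-carriers with the outcome, d: non-carriers without it
    """
    risk_carriers = a / (a + b)
    risk_non_carriers = c / (c + d)
    return risk_carriers / risk_non_carriers


def odds_ratio(a: int, b: int, c: int, d: int) -> float:
    """Odds ratio from the same 2x2 table (cross-product ratio)."""
    return (a * d) / (b * c)


# Hypothetical counts for illustration only.
counts = dict(a=40, b=60, c=20, d=80)
rr_example = risk_ratio(**counts)
or_example = odds_ratio(**counts)
print(f"risk ratio = {rr_example:.2f}, odds ratio = {or_example:.2f}")
```

When the outcome is common, the odds ratio lies further from 1 than the risk ratio, so the two figures for one association can differ noticeably.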
11
Abstract
Periodontitis is characterized by alveolar bone loss leading to tooth loss. A small proportion of patients develop severe periodontitis at the juvenile or adolescent age without exposure to the main risk factors of the disease. It is considered that these cases carry rare variants with large causal effects, but the specific variants are largely unknown. In this study, we performed exome sequencing of 5 families with children who developed stage IV, grade C, periodontitis between 3 and 18 y of age. In 1 family, we found compound heterozygous variants in the gene CTSC (p.R272H, p.G139R), 1 of which was previously identified in a family with prepubertal periodontitis. Subsequent targeted resequencing of the CTSC gene in 24 patients <25 y of age (stage IV, grade C) identified the known mutation p.I453V (odds ratio = 4.06, 95% CI = 1.6 to 10.3, P = 0.001), which was previously reported to increase the risk for adolescent periodontitis. An affected sibling of another family carried a homozygous deleterious mutation in the gene TUT7 (p.R560Q, CADD score >30 [Combined Annotation Dependent Depletion]), which is implicated in regulation of interleukin 6 expression. Two other affected siblings shared heterozygous deleterious mutations in the interacting genes PADI1 and FLG (both CADD = 36), which contribute to the integrity of the environment-tissue barrier interface. Additionally, we found predicted deleterious mutations in the periodontitis risk genes ABCA1, GLT6D1, and SIGLEC5. We conclude that the CTSC variants p.R272H and p.I453V have different expressivity and diagnostic relevance for prepubertal and adolescent periodontitis, respectively. We propose additional causal variants for early-onset periodontitis, which also locate within genes that carry known susceptibility variants for common forms. However, the genetic architecture of juvenile periodontitis is complex and differs among the affected siblings of the sequenced families.
12
GLI3 variants causing isolated polysyndactyly are not restricted to the protein's C-terminal third. Clin Genet 2021; 100:758-765. [PMID: 34482537 DOI: 10.1111/cge.14059] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 07/21/2021] [Revised: 08/30/2021] [Accepted: 09/01/2021] [Indexed: 02/03/2023]
Abstract
Loss of function variants of GLI3 are associated with a variety of forms of polysyndactyly: Pallister-Hall syndrome (PHS), Greig-Cephalopolysyndactyly syndrome (GCPS), and isolated polysyndactyly (IPD). Variants affecting the N-terminal and C-terminal thirds of the GLI3 protein have been associated with GCPS, those within the central third with PHS. Cases of IPD have been attributed to variants affecting the C-terminal third of the GLI3 protein. In this study, we further investigate these genotype-phenotype correlations. Sequencing of GLI3 was performed in patients with clinical findings suggestive of a GLI3-associated syndrome. Additionally, we searched the literature for reported cases of either manifestation with mutations in the GLI3 gene. Here, we report 48 novel cases from 16 families with polysyndactyly in whom we found causative variants in GLI3 and a review on 314 previously reported GLI3 variants. No differences in location of variants causing either GCPS or IPD were found. Review of published data confirmed the association of PHS and variants affecting the GLI3 protein's central third. We conclude that the observed manifestations of GLI3 variants as GCPS or IPD display different phenotypic severities of the same disorder and propose a binary division of GLI3-associated disorders in either PHS or GCPS/polysyndactyly.
13
Shared and oppositely regulated transcriptomic signatures in Huntington's disease and brain ischemia confirm known and unveil novel potential neuroprotective genes. Neurobiol Aging 2021; 104:122.e1-122.e17. [PMID: 33875290 DOI: 10.1016/j.neurobiolaging.2021.03.001] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/23/2020] [Revised: 02/13/2021] [Accepted: 03/02/2021] [Indexed: 11/20/2022]
Abstract
Huntington's disease and subcortical vascular dementia display similar dementing features, shaped by different degrees of striatal atrophy, deep white matter degeneration and tau pathology. To investigate the hypothesis that Huntington's disease transcriptomic hallmarks may provide a window into potential protective genes upregulated during acute and subacute brain ischemia, we compared RNA sequencing signatures in the most affected brain areas of 2 widely used experimental mouse models: Huntington's disease (R6/2, striatum and cortex; Q175, hippocampus) and brain ischemia-subcortical vascular dementia (BCCAS, striatum, cortex and hippocampus). We identified a cluster of 55 shared genes significantly differentially regulated in both models and we screened these in 2 different mouse models of Alzheimer's disease, and 96 early-onset familial and apparently sporadic small vessel ischemic disease patients. Our data support the prevalent role of transcriptional regulation upon genetic coding variability of known neuroprotective genes (Egr2, Fos, Ptgs2, Itga5, Cdkn1a, Gsn, Npas4, Btg2, Cebpb) and provide a list of potential additional ones likely implicated in different dementing disorders and worth further investigation.
14
Pathogenic Variants Associated With Dilated Cardiomyopathy Predict Outcome in Pediatric Myocarditis. Circ Genom Precis Med 2021; 14:e003250. [PMID: 34213952 PMCID: PMC8373449 DOI: 10.1161/circgen.120.003250] [Citation(s) in RCA: 21] [Impact Index Per Article: 7.0] [Indexed: 01/23/2023]
Abstract
BACKGROUND Myocarditis is one of the most common causes leading to heart failure in children and a possible genetic background has been postulated. We sought to characterize the clinical and genetic characteristics in patients with myocarditis ≤18 years of age to predict outcome. METHODS A cohort of 42 patients (Genetics in Pediatric Myocarditis) with biopsy-proven myocarditis underwent genetic testing with targeted panel sequencing of cardiomyopathy-associated genes. Genetics in Pediatric Myocarditis patients were divided into subgroups according to the phenotype of dilated cardiomyopathy (DCM) at presentation, resulting in 22 patients without DCM (myocarditis without phenotype of DCM) and 20 patients with DCM (myocarditis with phenotype of DCM). RESULTS Myocarditis with phenotype of DCM patients (median age 1.4 years) were younger than myocarditis without phenotype of DCM patients (median age 16.1 years; P<0.001) and corresponded to heart failure-like and coronary syndrome-like phenotypes, respectively. At least one likely pathogenic/pathogenic variant was identified in 9 out of 42 patients (22%), 8 of them were heterozygous, and 7 out of 9 were in myocarditis with phenotype of DCM. Likely pathogenic/pathogenic variants were found in genes validated for primary DCM (BAG3, DSP, LMNA, MYH7, TNNI3, TNNT2, and TTN). Rare variant enrichment analysis revealed significant accumulation of high-impact disease variants in myocarditis with phenotype of DCM versus healthy individuals (P=0.0003). Event-free survival was lower (P=0.008) in myocarditis with phenotype of DCM patients compared with myocarditis without phenotype of DCM and primary DCM. CONCLUSIONS We report heterozygous likely pathogenic/pathogenic variants in biopsy-proven pediatric myocarditis. Myocarditis patients with DCM phenotype were characterized by early-onset heart failure, significant enrichment of likely pathogenic/pathogenic variants, and poor outcome. These phenotype-specific and age group-specific findings will be useful for personalized management of these patients. Genetic evaluation in children newly diagnosed with myocarditis and DCM phenotype is warranted.
15
Expanding the clinical and molecular spectrum of ATP6V1A related metabolic cutis laxa. J Inherit Metab Dis 2021; 44:972-986. [PMID: 33320377 PMCID: PMC8638669 DOI: 10.1002/jimd.12341] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 08/18/2020] [Revised: 12/08/2020] [Accepted: 12/14/2020] [Indexed: 12/14/2022]
Abstract
Several inborn errors of metabolism show cutis laxa as a highly recognizable feature. One group of these metabolic cutis laxa conditions is autosomal recessive cutis laxa type 2 caused by defects in v-ATPase components or the mitochondrial proline cycle. Besides cutis laxa, muscular hypotonia and cardiac abnormalities are hallmarks of autosomal recessive cutis laxa type 2D (ARCL2D) due to pathogenic variants in ATP6V1A encoding subunit A of the v-ATPase. Here, we report on three affected individuals from two families with ARCL2D in whom we performed whole exome and Sanger sequencing. We performed functional studies in fibroblasts from one individual, summarized all known probands' clinical, molecular, and biochemical features and compared them, also to other metabolic forms of cutis laxa. We identified novel missense and the first nonsense variant strongly affecting ATP6V1A expression. All six ARCL2D affected individuals show equally severe cutis laxa and dysmorphism at birth. While for one no information was available, two died in infancy and three are now adolescents with mild or absent intellectual disability. Muscular weakness, ptosis, contractures, and elevated muscle enzymes indicated a persistent myopathy. In cellular studies, a fragmented Golgi compartment, a delayed Brefeldin A-induced retrograde transport and glycosylation abnormalities were present in fibroblasts from two individuals. This is the second and confirmatory report on pathogenic variants in ATP6V1A as the cause of this extremely rare condition and the first to describe a nonsense allele. Our data highlight the tremendous clinical variability of ATP6V1A related phenotypes even within the same family.
16
Biallelic truncating variants in ATP9A cause a novel neurodevelopmental disorder involving postnatal microcephaly and failure to thrive. J Med Genet 2021; 59:662-668. [PMID: 34379057 PMCID: PMC9252857 DOI: 10.1136/jmedgenet-2021-107843] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 03/24/2021] [Accepted: 05/20/2021] [Indexed: 12/04/2022]
Abstract
Background Genes implicated in the Golgi and endosomal trafficking machinery are crucial for brain development, and mutations in them are particularly associated with postnatal microcephaly (POM). Methods Exome sequencing was performed in three affected individuals from two unrelated consanguineous families presenting with delayed neurodevelopment, intellectual disability of variable degree, POM and failure to thrive. Patient-derived fibroblasts were tested for functional effects of the variants. Results We detected homozygous truncating variants in ATP9A. While the variant in family A is predicted to result in an early premature termination codon, the variant in family B affects a canonical splice site. Both variants lead to a substantial reduction of ATP9A mRNA expression. It has been shown previously that ATP9A localises to early and recycling endosomes, whereas its depletion leads to altered gene expression of components from this compartment. Consistent with previous findings, we also observed overexpression of ARPC3 and SNX3, genes strongly interacting with ATP9A. Conclusion In aggregate, our findings show that pathogenic variants in ATP9A cause a novel autosomal recessive neurodevelopmental disorder with POM. While the physiological function of endogenous ATP9A is still largely elusive, our results underline a crucial role of this gene in endosomal transport in brain tissue.
17
CDK19-related disorder results from both loss-of-function and gain-of-function de novo missense variants. Genet Med 2021; 23:1050-1057. [PMID: 33495529 DOI: 10.1038/s41436-020-01091-9] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 08/19/2020] [Revised: 12/18/2020] [Accepted: 12/22/2020] [Indexed: 11/08/2022] Open
Abstract
PURPOSE: To expand the recent description of a new neurodevelopmental syndrome related to alterations in CDK19. METHODS: Individuals were identified through international collaboration. Functional studies included autophosphorylation assays for the CDK19 Gly28Arg and Tyr32His variants and in vivo zebrafish assays of the CDK19 G28R and Y32H variants. RESULTS: We describe 11 unrelated individuals (age range: 9 months to 14 years) with de novo missense variants mapped to the kinase domain of CDK19, including two recurrent changes at residues Tyr32 and Gly28. In vitro autophosphorylation and substrate phosphorylation assays revealed that the kinase activity of the protein was lower for the p.Gly28Arg substitution and higher for the p.Tyr32His substitution than that of the wild-type protein. Injection of CDK19 messenger RNA (mRNA) carrying either the Tyr32His or the Gly28Arg variant in an in vivo zebrafish model significantly increased the fraction of embryos with morphological abnormalities. Overall, the phenotype of the now 14 individuals with CDK19-related disorder includes universal developmental delay and facial dysmorphism, hypotonia (79%), seizures (64%), ophthalmologic anomalies (64%), and autism/autistic traits (56%). CONCLUSION: CDK19 de novo missense variants are responsible for a novel neurodevelopmental disorder. Both kinase assay and zebrafish experiments showed that the pathogenetic mechanism may be more diverse than previously thought.
|
18
|
Abstract
Data analysis often entails a multitude of heterogeneous steps, from the application of various command line tools to the usage of scripting languages like R or Python for the generation of plots and tables. It is widely recognized that data analyses should ideally be conducted in a reproducible way. Reproducibility enables technical validation and regeneration of results on the original or even new data. However, reproducibility alone is by no means sufficient to deliver an analysis that is of lasting impact (i.e., sustainable) for the field, or even just one research group. We postulate that it is equally important to ensure adaptability and transparency. The former describes the ability to modify the analysis to answer extended or slightly different research questions. The latter describes the ability to understand the analysis in order to judge whether it is not only technically, but methodologically valid. Here, we analyze the properties needed for a data analysis to become reproducible, adaptable, and transparent. We show how the popular workflow management system Snakemake can be used to guarantee this, and how it enables an ergonomic, combined, unified representation of all steps involved in data analysis, ranging from raw data processing, to quality control and fine-grained, interactive exploration and plotting of final results.
|
20
|
Variants in the SK2 channel gene (KCNN2) lead to dominant neurodevelopmental movement disorders. Brain 2020; 143:3564-3573. [PMID: 33242881 DOI: 10.1093/brain/awaa346] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Received: 05/25/2020] [Revised: 07/17/2020] [Accepted: 09/08/2020] [Indexed: 11/14/2022]
Abstract
KCNN2 encodes the small conductance calcium-activated potassium channel 2 (SK2). Rodent models with spontaneous Kcnn2 mutations show abnormal gait and locomotor activity, tremor and memory deficits, but human disorders related to KCNN2 variants are largely unknown. Using exome sequencing, we identified a de novo KCNN2 frameshift deletion in a patient with learning disabilities, cerebellar ataxia and white matter abnormalities on brain MRI. This discovery prompted us to collect data from nine additional patients with de novo KCNN2 variants (one nonsense, one splice site, six missense variants and one in-frame deletion) and one family with a missense variant inherited from the affected mother. We investigated the functional impact of six selected variants on SK2 channel function using the patch-clamp technique. All variants tested but one, which was reclassified to uncertain significance, led to a loss-of-function of SK2 channels. Patients with KCNN2 variants had motor and language developmental delay, intellectual disability often associated with early-onset movement disorders comprising cerebellar ataxia and/or extrapyramidal symptoms. Altogether, our findings provide evidence that heterozygous variants, likely causing a haploinsufficiency of the KCNN2 gene, lead to novel autosomal dominant neurodevelopmental movement disorders mirroring phenotypes previously described in rodents.
|
21
|
VarFish: comprehensive DNA variant analysis for diagnostics and research. Nucleic Acids Res 2020; 48:W162-W169. [PMID: 32338743 PMCID: PMC7319464 DOI: 10.1093/nar/gkaa241] [Citation(s) in RCA: 31] [Impact Index Per Article: 7.8] [Received: 02/06/2020] [Revised: 03/27/2020] [Accepted: 04/16/2020] [Indexed: 12/18/2022]
Abstract
VarFish is a user-friendly web application for the quality control, filtering, prioritization, analysis, and user-based annotation of DNA variant data with a focus on rare disease genetics. It is capable of processing variant call files with single or multiple samples. The variants are automatically annotated with population frequencies, molecular impact, and presence in databases such as ClinVar. Further, it provides support for pathogenicity scores including CADD, MutationTaster, and phenotypic similarity scores. Users can filter variants based on these annotations and presumed inheritance pattern and sort the results by these scores. Variants passing the filter are listed with their annotations and many useful link-outs to genome browsers, other gene/variant data portals, and external tools for variant assessment. VarFish allows users to create their own annotations including support for variant assessment following ACMG-AMP guidelines. In close collaboration with medical practitioners, VarFish was designed for variant analysis and prioritization in diagnostic and research settings as described in the software's extensive manual. The user interface has been optimized for supporting these protocols. Users can install VarFish on their own in-house servers where it provides additional lab notebook features for collaborative analysis and allows re-analysis of cases, e.g. after update of genotype or phenotype databases.
|
22
|
Interpretable Clinical Genomics with a Likelihood Ratio Paradigm. Am J Hum Genet 2020; 107:403-417. [PMID: 32755546 PMCID: PMC7477017 DOI: 10.1016/j.ajhg.2020.06.021] [Citation(s) in RCA: 43] [Impact Index Per Article: 10.8] [Received: 01/25/2020] [Accepted: 06/26/2020] [Indexed: 10/23/2022]
Abstract
Human Phenotype Ontology (HPO)-based analysis has become standard for genomic diagnostics of rare diseases. Current algorithms use a variety of semantic and statistical approaches to prioritize the typically long lists of genes with candidate pathogenic variants. These algorithms do not provide robust estimates of the strength of the predictions beyond the placement in a ranked list, nor do they provide measures of how much any individual phenotypic observation has contributed to the prioritization result. However, given that the overall success rate of genomic diagnostics is only around 25%-50% or less in many cohorts, a good ranking cannot be taken to imply that the gene or disease at rank one is necessarily a good candidate. Here, we present an approach to genomic diagnostics that exploits the likelihood ratio (LR) framework to provide an estimate of (1) the posttest probability of candidate diagnoses, (2) the LR for each observed HPO phenotype, and (3) the predicted pathogenicity of observed genotypes. LIkelihood Ratio Interpretation of Clinical AbnormaLities (LIRICAL) placed the correct diagnosis within the first three ranks in 92.9% of 384 case reports comprising 262 Mendelian diseases, and the correct diagnosis had a mean posttest probability of 67.3%. Simulations show that LIRICAL is robust to many typically encountered forms of genomic and phenomic noise. In summary, LIRICAL provides accurate, clinically interpretable results for phenotype-driven genomic diagnostics.
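The likelihood ratio framework mentioned in this abstract can be illustrated with a minimal sketch. It is not LIRICAL's implementation: it only applies the textbook odds form of Bayes' rule, it assumes the individual likelihood ratios are conditionally independent so that they can be multiplied, and the function name and all numbers are hypothetical.

```python
from math import prod
from typing import Iterable


def posttest_probability(pretest_probability: float,
                         likelihood_ratios: Iterable[float]) -> float:
    """Combine a pretest probability with likelihood ratios (odds form of Bayes' rule).

    Assumption: the likelihood ratios are conditionally independent,
    so their product is the combined likelihood ratio.
    """
    if not 0.0 < pretest_probability < 1.0:
        raise ValueError("pretest probability must lie strictly between 0 and 1")
    pretest_odds = pretest_probability / (1.0 - pretest_probability)
    posttest_odds = pretest_odds * prod(likelihood_ratios)
    return posttest_odds / (1.0 + posttest_odds)


# Hypothetical case: 1% pretest probability for a disease, two phenotype
# observations supporting it (LR 20 and LR 5), one arguing against it (LR 0.5).
probability = posttest_probability(0.01, [20.0, 5.0, 0.5])
```

Each likelihood ratio enters as a separate factor, so the contribution of a single observation to the final estimate can be read off directly, which is the kind of interpretability the abstract describes.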
|
23
|
Variable pulmonary manifestations in Chitayat syndrome: Six additional affected individuals. Am J Med Genet A 2020; 182:2068-2076. [PMID: 32592542 DOI: 10.1002/ajmg.a.61735] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 01/30/2020] [Revised: 05/07/2020] [Accepted: 05/24/2020] [Indexed: 12/11/2022]
Abstract
Hand hyperphalangism leading to shortened index fingers with ulnar deviation, hallux valgus, mild facial dysmorphism and respiratory compromise requiring assisted ventilation are the key features of Chitayat syndrome. This condition results from the recurrent heterozygous missense variant NM_006494.2:c.266A>G; p.(Tyr89Cys) in ERF on chromosome 19q13.2, encoding the ETS2 repressor factor (ERF) protein. The pathomechanism of Chitayat syndrome is unknown. To date, seven individuals with Chitayat syndrome and the recurrent pathogenic ERF variant have been reported in the literature. Here, we describe six additional individuals, among them only one presenting with a history of assisted ventilation, and the remaining presenting with variable pulmonary phenotypes, including one individual without any obvious pulmonary manifestations. Our findings widen the phenotype spectrum caused by the recurrent pathogenic variant in ERF, underline Chitayat syndrome as a cause of isolated skeletal malformations and therefore contribute to the improvement of diagnostic strategies in individuals with hand hyperphalangism.
|
24
|
An intronic splice site alteration in combination with a large deletion affecting VPS13B (COH1) causes Cohen syndrome. Eur J Med Genet 2020; 63:103973. [PMID: 32505691 DOI: 10.1016/j.ejmg.2020.103973] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 12/23/2019] [Revised: 04/06/2020] [Accepted: 06/01/2020] [Indexed: 01/15/2023]
Abstract
Cohen syndrome (CS) is a rare, autosomal recessive disorder characterized by intellectual disability, postnatal microcephaly, facial abnormalities, abnormal truncal fat distribution, myopia, and pigmentary retinopathy. It is often considered an underdiagnosed condition, especially in children with developmental delay and intellectual disability. Here we report on four individuals from a large Jordanian family clinically diagnosed with CS. Using Trio Exome Sequencing (Trio-WES) and MLPA analyses we identified a maternally inherited novel intronic nucleotide substitution c.3446-23T>G leading to the activation of a cryptic splice site and a paternally inherited multi-exon deletion in VPS13B (previously termed COH1) in the index patient. Expression analysis showed a strong decrease of VPS13B mRNA levels and direct sequencing of cDNA confirmed splicing at a cryptic upstream splice acceptor site, resulting in the inclusion of 22 intronic bases. This extension results in a frameshift and a premature stop of translation (p.Gly1149Valfs*9). Segregation analysis revealed that three affected maternal cousins were homozygous for the intronic splice site variant. Our data show causality of both alterations and strongly suggest the expansion of the diagnostic strategy to search for intronic splice variants in molecularly unconfirmed patients affected by CS.
|
25
|
Hi-C Identifies Complex Genomic Rearrangements and TAD-Shuffling in Developmental Diseases. Am J Hum Genet 2020; 106:872-884. [PMID: 32470376 PMCID: PMC7273525 DOI: 10.1016/j.ajhg.2020.04.016] [Citation(s) in RCA: 68] [Impact Index Per Article: 17.0] [Received: 10/08/2019] [Accepted: 04/29/2020] [Indexed: 12/15/2022]
Abstract
Genome-wide analysis methods, such as array comparative genomic hybridization (CGH) and whole-genome sequencing (WGS), have greatly advanced the identification of structural variants (SVs) in the human genome. However, even with standard high-throughput sequencing techniques, complex rearrangements with multiple breakpoints are often difficult to resolve, and predicting their effects on gene expression and phenotype remains a challenge. Here, we address these problems by using high-throughput chromosome conformation capture (Hi-C) generated from cultured cells of nine individuals with developmental disorders (DDs). Three individuals had previously been identified as harboring duplications at the SOX9 locus and six had been identified with translocations. Hi-C resolved the positions of the duplications and was instructive in interpreting their distinct pathogenic effects, including the formation of new topologically associating domains (neo-TADs). Hi-C was very sensitive in detecting translocations, and it revealed previously unrecognized complex rearrangements at the breakpoints. In several cases, we observed the formation of fused-TADs promoting ectopic enhancer-promoter interactions that were likely to be involved in the disease pathology. In summary, we show that Hi-C is a sensible method for the detection of complex SVs in a clinical setting. The results help interpret the possible pathogenic effects of the SVs in individuals with DDs.
|
26
|
Biallelic variants in KYNU cause a multisystemic syndrome with hand hyperphalangism. Bone 2020; 133:115219. [PMID: 31923704 PMCID: PMC10521254 DOI: 10.1016/j.bone.2019.115219] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 11/18/2019] [Revised: 12/25/2019] [Accepted: 12/29/2019] [Indexed: 01/17/2023]
Abstract
Catel-Manzke syndrome is characterized by the combination of Pierre Robin sequence and radial deviation, shortening as well as clinodactyly of the index fingers, due to an accessory ossification center. Mutations in TGDS have been identified as one cause of Catel-Manzke syndrome, but cannot be found as causative in every patient with the clinical diagnosis. We performed a chromosome microarray and/or exome sequencing in three patients with hand hyperphalangism, heart defect, short stature, and mild to severe developmental delay, all of whom were initially given a clinical diagnosis of Catel-Manzke syndrome. In one patient, we detected a large deletion of exons 1-8 and the missense variant c.1282C>T (p.Arg428Trp) in KYNU (NM_003937.2), whereas homozygous missense variants in KYNU were found in the other two patients (c.989G>A (p.Arg330Gln) and c.326G>C (p.Trp109Ser)). Plasma and urine metabolomic analysis of two patients indicated a block along the tryptophan catabolic pathway and urine organic acid analysis showed excretion of xanthurenic acid. Biallelic loss-of-function mutations in KYNU were recently described as a cause of NAD deficiency with vertebral, cardiac, renal and limb defects; however, no hand hyperphalangism was described in those patients, and Catel-Manzke syndrome was not discussed as a differential diagnosis. In conclusion, we present unrelated patients identified with biallelic variants in KYNU leading to kynureninase deficiency and xanthurenic aciduria as a very likely cause of their hyperphalangism, heart defect, short stature, and developmental delay. We suggest performance of urine organic acid analysis in patients with suspected Catel-Manzke syndrome, particularly in those with cardiac or vertebral defects or without mutations in TGDS.
|
27
|
SCelVis: exploratory single cell data analysis on the desktop and in the cloud. PeerJ 2020; 8:e8607. [PMID: 32117635 PMCID: PMC7035868 DOI: 10.7717/peerj.8607] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 08/16/2019] [Accepted: 01/20/2020] [Indexed: 11/20/2022]
Abstract
Background: Single cell omics technologies present unique opportunities for biomedical and life sciences from lab to clinic, but the high dimensional nature of such data poses challenges for computational analysis and interpretation. Furthermore, FAIR data management as well as data privacy and security become crucial when working with clinical data, especially in cross-institutional and translational settings. Existing solutions are either bound to the desktop of one researcher or come with dependencies on vendor-specific technology for cloud storage or user authentication. Results: To facilitate analysis and interpretation of single-cell data by users without bioinformatics expertise, we present SCelVis, a flexible, interactive and user-friendly app for web-based visualization of pre-processed single-cell data. Users can survey multiple interactive visualizations of their single cell expression data and cell annotation, define cell groups by filtering or manual selection, perform differential gene expression analysis, and download raw or processed data for further offline analysis. SCelVis can be run both on the desktop and on cloud systems, accepts input from local and various remote sources using standard and open protocols, and allows for hosting data in the cloud and locally. We test and validate our visualization using publicly available scRNA-seq data. Methods: SCelVis is implemented in Python using Dash by Plotly. It is available as a standalone application, as a Python package, via Conda/Bioconda and as a Docker image. All components are available as open source under the permissive MIT license and are based on open standards and interfaces, enabling further development and integration with third party pipelines and analysis components. The GitHub repository is https://github.com/bihealth/scelvis.
|
28
|
Identification and ranking of recurrent neo-epitopes in cancer. BMC Med Genomics 2019; 12:171. [PMID: 31775766 PMCID: PMC6882202 DOI: 10.1186/s12920-019-0611-7] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 04/09/2019] [Accepted: 10/25/2019] [Indexed: 12/25/2022]
Abstract
Background: Immune escape is one of the hallmarks of cancer and several new treatment approaches attempt to modulate and restore the immune system’s capability to target cancer cells. At the heart of the immune recognition process lies antigen presentation from somatic mutations. These neo-epitopes are emerging as attractive targets for cancer immunotherapy and new strategies for rapid identification of relevant candidates have become a priority. Methods: We carefully screen TCGA data sets for recurrent somatic amino acid exchanges and apply MHC class I binding predictions. Results: We propose a method for in silico selection and prioritization of candidates which have a high potential for neo-antigen generation and are likely to appear in multiple patients. While the percentage of patients carrying a specific neo-epitope and HLA-type combination is relatively small, the sheer number of new patients leads to surprisingly high recurrence numbers. We identify 769 epitopes which are expected to occur in 77629 patients per year. Conclusion: While our candidate list will definitely contain false positives, the results provide an objective order for wet-lab testing of reusable neo-epitopes. Thus recurrent neo-epitopes may be suitable to supplement existing personalized T cell treatment approaches with precision treatment options.
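The point that a small carrier percentage can still translate into many patients can be sketched as a simple product of rates. This is an illustrative assumption, not the authors' method: it treats mutation status and HLA type as independent, and the class, field names and all numbers below are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """One neo-epitope/HLA-type combination (all values are hypothetical)."""
    name: str
    new_cases_per_year: float   # yearly number of new patients with the cancer entity
    mutation_frequency: float   # fraction of those patients carrying the amino acid exchange
    hla_frequency: float        # fraction of patients carrying the presenting HLA type

    def expected_patients(self) -> float:
        # Assumption: mutation status and HLA type are independent,
        # so the joint carrier fraction is the product of both frequencies.
        return self.new_cases_per_year * self.mutation_frequency * self.hla_frequency


def rank(candidates):
    """Order candidates by expected number of patients per year, highest first."""
    return sorted(candidates, key=lambda c: c.expected_patients(), reverse=True)


candidates = [
    Candidate("epitope_A", 100_000, 0.02, 0.25),
    Candidate("epitope_B", 20_000, 0.30, 0.10),
]
ranked = rank(candidates)
```

In this toy input each combination affects well under 5% of patients with the respective cancer, yet both correspond to several hundred patients per year, which mirrors the qualitative argument in the abstract.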
|
29
|
Abstract
Summary: Management of raw sequencing data and its pre-processing (conversion into sequences and demultiplexing) remains a challenging topic for groups running sequencing devices. Existing solutions range from manual management of spreadsheets to very complex and customized laboratory information management systems handling much more than just raw sequencing data. In this article, we describe the software package DigestiFlow that focuses on the management of Illumina flow cell sample sheets and raw data. It allows for automated extraction of information from flow cell data and management of sample sheets. Furthermore, it allows for the automated and reproducible conversion of Illumina base calls to sequences and the demultiplexing thereof using bcl2fastq and Picard Tools, followed by quality control report generation. Availability and implementation: The software is available under the MIT license at https://github.com/bihealth/digestiflow-server. The client software components are available via Bioconda. Supplementary information: Supplementary data are available at Bioinformatics online.
|
30
|
Targeted panel sequencing in pediatric primary cardiomyopathy supports a critical role of TNNI3. Clin Genet 2019; 96:549-559. [PMID: 31568572 DOI: 10.1111/cge.13645] [Citation(s) in RCA: 16] [Impact Index Per Article: 3.2] [Received: 07/22/2019] [Revised: 09/05/2019] [Accepted: 09/15/2019] [Indexed: 12/30/2022]
Abstract
The underlying genetic mechanisms and early pathological events of children with primary cardiomyopathy (CMP) are insufficiently characterized. In this study, we aimed to characterize the mutational spectrum of primary CMP in a large cohort of patients ≤18 years referred to a tertiary center. Eighty unrelated index patients with pediatric primary CMP underwent genetic testing with a panel-based next-generation sequencing approach of 89 genes. At least one pathogenic or probably pathogenic variant was identified in 30/80 (38%) index patients. In all CMP subgroups, patients most frequently carried variants of interest in sarcomere genes, suggesting them as a major contributor in pediatric primary CMP. In MYH7, MYBPC3, and TNNI3, we identified 18 pathogenic/probably pathogenic variants (MYH7 n = 7, MYBPC3 n = 6, TNNI3 n = 5), including one homozygous truncating variant (TNNI3 c.24+2T>A). Protein and transcript level analysis on heart biopsies from individuals with homozygous mutation of TNNI3 revealed that the TNNI3 protein is absent and associated with upregulation of the fetal isoform TNNI1. The present study further supports the clinical importance of sarcomeric mutations not only in adult but also in pediatric primary CMP. TNNI3 is the third most important disease gene in this cohort and complete loss of TNNI3 leads to severe pediatric CMP.
|
31
|
Pathogenic variants in USP7 cause a neurodevelopmental disorder with speech delays, altered behavior, and neurologic anomalies. Genet Med 2019; 21:1797-1807. [PMID: 30679821 PMCID: PMC6752677 DOI: 10.1038/s41436-019-0433-1] [Citation(s) in RCA: 35] [Impact Index Per Article: 7.0] [Received: 08/17/2018] [Accepted: 01/02/2019] [Indexed: 11/09/2022]
Abstract
PURPOSE: Haploinsufficiency of USP7, located at chromosome 16p13.2, has recently been reported in seven individuals with neurodevelopmental phenotypes, including developmental delay/intellectual disability (DD/ID), autism spectrum disorder (ASD), seizures, and hypogonadism. Further, USP7 was identified as a critical component of the MAGEL2-USP7-TRIM27 (MUST) complex, such that pathogenic variants in USP7 lead to altered endosomal F-actin polymerization and dysregulated protein recycling. METHODS: We report 16 newly identified individuals with heterozygous USP7 variants, identified by genome or exome sequencing or by chromosome microarray analysis. Clinical features were evaluated by review of medical records. Additional clinical information was obtained on the seven previously reported individuals to fully elucidate the phenotypic expression associated with USP7 haploinsufficiency. RESULTS: The clinical manifestations of these 23 individuals suggest a syndrome characterized by DD/ID, hypotonia, eye anomalies, feeding difficulties, GERD, behavioral anomalies, and ASD, and more specific phenotypes of speech delays including a nonverbal phenotype and abnormal brain magnetic resonance image findings including white matter changes based on neuroradiologic examination. CONCLUSION: The consistency of clinical features among all individuals presented regardless of de novo USP7 variant type supports haploinsufficiency as a mechanism for pathogenesis and refines the clinical impact faced by affected individuals and caregivers.
|
32
|
Multisite de novo mutations in human offspring after paternal exposure to ionizing radiation. Sci Rep 2018; 8:14611. [PMID: 30279461 PMCID: PMC6168503 DOI: 10.1038/s41598-018-33066-x] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.8] [Received: 05/25/2018] [Accepted: 09/12/2018] [Indexed: 12/30/2022]
Abstract
A genome-wide evaluation of the effects of ionizing radiation on mutation induction in the mouse germline has identified multisite de novo mutations (MSDNs) as a marker for previous exposure. Here we present the results of a small pilot study of whole genome sequencing in offspring of soldiers who served in radar units on weapon systems that were emitting high-frequency radiation. We found cases of exceptionally high MSDN rates as well as an increased mean in our cohort: while an MSDN is detected on average in 1 out of 5 offspring of unexposed controls, we observed 12 MSDNs in altogether 18 offspring, including a family with 6 MSDNs in 3 offspring. Moreover, we found two translocations, also resulting from neighboring mutations. Our findings indicate that MSDNs might be suited in principle for the assessment of DNA damage from ionizing radiation also in humans. However, as exact person-related dose values in risk groups are usually not available, the interpretation of MSDNs in single families would benefit from larger molecular epidemiologic studies on this new biomarker.
|
33
|
HLA-MA: simple yet powerful matching of samples using HLA typing results. Bioinformatics 2017; 33:2241-2242. [DOI: 10.1093/bioinformatics/btx132] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/02/2016] [Accepted: 03/09/2017] [Indexed: 11/14/2022]
|
34
|
Abstract
Exomiser is an application that prioritizes genes and variants in next-generation sequencing (NGS) projects for novel disease-gene discovery or differential diagnostics of Mendelian disease. Exomiser comprises a suite of algorithms for prioritizing exome sequences using random-walk analysis of protein interaction networks, clinical relevance and cross-species phenotype comparisons, as well as a wide range of other computational filters for variant frequency, predicted pathogenicity and pedigree analysis. In this protocol, we provide a detailed explanation of how to install Exomiser and use it to prioritize exome sequences in a number of scenarios. Exomiser requires ∼3 GB of RAM and roughly 15-90 s of computing time on a standard desktop computer to analyze a variant call format (VCF) file. Exomiser is freely available for academic use from http://www.sanger.ac.uk/science/tools/exomiser.
|
35
|
Methods for the detection and assembly of novel sequence in high-throughput sequencing data. Bioinformatics 2015; 31:1904-12. [PMID: 25649620 DOI: 10.1093/bioinformatics/btv051] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.7] [Received: 06/13/2014] [Accepted: 01/26/2015] [Indexed: 01/04/2023]
Abstract
MOTIVATION: Large insertions of novel sequence are an important type of structural variants. Previous studies used traditional de novo assemblers for assembling non-mapping high-throughput sequencing (HTS) or capillary reads and then tried to anchor them in the reference using paired read information. RESULTS: We present approaches for detecting insertion breakpoints and targeted assembly of large insertions from HTS paired data: BASIL and ANISE. On near-identity repeats that are hard for assemblers, ANISE employs a repeat resolution step. This results in far better reconstructions than obtained by the compared methods. On simulated data, we found our insert assembler to be competitive with the de novo assemblers ABYSS and SGA while yielding already anchored inserted sequence as opposed to unanchored contigs as from ABYSS/SGA. On real-world data, we detected novel sequence in a human individual and thoroughly validated the assembled sequence. ANISE was found to be superior to the competing tool MindTheGap on both simulated and real-world data. AVAILABILITY AND IMPLEMENTATION: ANISE and BASIL are available for download at http://www.seqan.de/projects/herbarium under a permissive open source license.
|
36
|
Abstract
Motivation: Automatic error correction of high-throughput sequencing data can have a dramatic impact on the amount of usable base pairs and their quality. It has been shown that the performance of tasks such as de novo genome assembly and SNP calling can be dramatically improved after read error correction. While a large number of methods specialized for correcting substitution errors as found in Illumina data exist, few methods for the correction of indel errors, common to technologies like 454 or Ion Torrent, have been proposed. Results: We present Fiona, a new stand-alone read error–correction method. Fiona provides a new statistical approach for sequencing error detection and optimal error correction and estimates its parameters automatically. Fiona is able to correct substitution, insertion and deletion errors and can be applied to any sequencing technology. It uses an efficient implementation of the partial suffix array to detect read overlaps with different seed lengths in parallel. We tested Fiona on several real datasets from a variety of organisms with different read lengths and compared its performance with state-of-the-art methods. Fiona shows a consistently higher correction accuracy over a broad range of datasets from 454 and Ion Torrent sequencers, without compromise in speed. Conclusion: Fiona is an accurate parameter-free read error–correction method that can be run on inexpensive hardware and can make use of multicore parallelization whenever available. Fiona was implemented using the SeqAn library for sequence analysis and is publicly available for download at http://www.seqan.de/projects/fiona. Contact: mschulz@mmci.uni-saarland.de or hugues.richard@upmc.fr Supplementary information: Supplementary data are available at Bioinformatics online.
|
37
|
Genome alignment with graph data structures: a comparison. BMC Bioinformatics 2014; 15:99. [PMID: 24712884 PMCID: PMC4020321 DOI: 10.1186/1471-2105-15-99] [Citation(s) in RCA: 29] [Impact Index Per Article: 2.9] [Received: 02/27/2013] [Accepted: 03/28/2014] [Indexed: 12/21/2022] Open
Abstract
Background Recent advances in rapid, low-cost sequencing have opened up the opportunity to study complete genome sequences. The computational approach of multiple genome alignment allows investigation of evolutionarily related genomes in an integrated fashion, providing a basis for downstream analyses such as rearrangement studies and phylogenetic inference. Graphs have proven to be a powerful tool for coping with the complexity of genome-scale sequence alignments. The potential of graphs to intuitively represent all aspects of genome alignments led to the development of graph-based approaches for genome alignment. These approaches construct a graph from a set of local alignments, and derive a genome alignment through identification and removal of graph substructures that indicate errors in the alignment. Results We compare the structures of commonly used graphs in terms of their abilities to represent alignment information. We describe how the graphs can be transformed into each other, and identify and classify graph substructures common to one or more graphs. Based on previous approaches, we compile a list of modifications that remove these substructures. Conclusion We show that crucial pieces of alignment information, associated with inversions and duplications, are not visible in the structure of all graphs. If we neglect vertex or edge labels, the graphs differ in their information content. Still, many ideas are shared among all graph-based approaches. Based on these findings, we outline a conceptual framework for graph-based genome alignment that can assist in the development of future genome alignment tools.
|
38
|
Abstract
MOTIVATION In recent years, next-generation sequencing has become a key technology for many applications in the biomedical sciences. Throughput continues to increase and new protocols provide longer reads than currently available. In almost all applications, read mapping is a first step. Hence, it is crucial to have algorithms and implementations that perform fast, with high sensitivity, and are able to deal with long reads and a large absolute number of insertions and deletions. RESULTS RazerS is a read mapping program with adjustable sensitivity based on counting q-grams. In this work, we propose the successor RazerS 3, which now supports shared-memory parallelism, an additional seed-based filter with adjustable sensitivity, a much faster, banded version of Myers' bit-vector algorithm for verification, memory-saving measures and support for the SAM output format. This leads to much improved performance for mapping reads, in particular long reads with many errors. We extensively compare RazerS 3 with other popular read mappers and show that its results are often superior in terms of sensitivity while exhibiting practical and often competitive run times. In addition, RazerS 3 works without a pre-computed index. AVAILABILITY AND IMPLEMENTATION Source code and binaries are freely available for download at http://www.seqan.de/projects/razers. RazerS 3 is implemented in C++ and OpenMP under a GPL license using the SeqAn library and supports Linux, Mac OS X and Windows.
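The abstract above describes RazerS as a filter based on counting q-grams. As a rough illustration of that general idea only, the following Python sketch applies the classical q-gram lemma: a read of length m that matches a reference window with at most k edit errors must share at least m + 1 - q(k + 1) q-grams with it. This is not the RazerS 3 implementation (which is written in C++ with SeqAn); the function names and the toy sequences are hypothetical and chosen for illustration.

```python
from collections import Counter


def qgram_threshold(read_len: int, q: int, max_errors: int) -> int:
    """Lower bound on shared q-grams given by the q-gram lemma."""
    return read_len + 1 - q * (max_errors + 1)


def qgrams(seq: str, q: int) -> Counter:
    """Multiset of all overlapping q-grams in seq."""
    return Counter(seq[i:i + q] for i in range(len(seq) - q + 1))


def passes_filter(read: str, window: str, q: int, max_errors: int) -> bool:
    """Return True if the window may contain a match of the read with at
    most max_errors edit operations (hypothetical helper, not a SeqAn API)."""
    threshold = qgram_threshold(len(read), q, max_errors)
    if threshold <= 0:
        # The lemma gives no guarantee here, so nothing can be discarded.
        return True
    # Multiset intersection counts each shared q-gram with its multiplicity.
    shared = sum((qgrams(read, q) & qgrams(window, q)).values())
    return shared >= threshold


if __name__ == "__main__":
    read = "ACGTACGTAC"
    print(passes_filter(read, "ACGTACCTAC", q=3, max_errors=1))  # one substitution
    print(passes_filter(read, "TTTTTTTTTT", q=3, max_errors=1))  # unrelated window
```

Windows that pass such a filter are only candidates; they would still need to be verified with an exact edit-distance computation, which is the role the abstract assigns to the banded Myers bit-vector algorithm.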
|