1. Karim S, Hussein IR, Schulten HJ, Alsaedi S, Mirza Z, Al-Qahtani M, Chaudhary A. Identification of Extremely Rare Pathogenic CNVs by Array CGH in Saudi Children with Developmental Delay, Congenital Malformations, and Intellectual Disability. Children 2023; 10:662. [PMID: 37189911 DOI: 10.3390/children10040662] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/26/2022] [Revised: 03/15/2023] [Accepted: 03/28/2023] [Indexed: 04/03/2023]
Abstract
Chromosomal imbalance is implicated in developmental delay (DD), congenital malformations (CM), and intellectual disability (ID); thus, precise identification of copy number variations (CNVs) is essential. We therefore aimed to investigate the genetic heterogeneity in Saudi children with DD/CM/ID. High-resolution array comparative genomic hybridization (array CGH) was used to detect disease-associated CNVs in 63 patients. Quantitative PCR was performed to confirm the detected CNVs. Giemsa banding-based karyotyping was also performed. Array CGH identified chromosomal abnormalities in 24 patients: distinct pathogenic CNVs and/or CNVs of uncertain significance were found in 19 patients, and aneuploidy was found in 5 patients, including 47,XXY (n = 2), 45,X (n = 2), and a patient with trisomy 18 who carried a balanced Robertsonian translocation. CNVs including 9p24p13, 16p13p11, and 18p11 showed gains/duplications; CNVs including 3p23p14, 10q26, 11p15, 11q24q25, 13q21.1q32.1, 16p13.3p11.2, and 20q11.1q13.2 showed losses/deletions only; and CNVs including 8q24, 11q12, 15q25q26, 16q21q23, and 22q11q13 were found as gains or losses in different individuals. In contrast, standard karyotyping detected chromosomal abnormalities in ten patients. The diagnosis rate of array CGH (28%, 18/63 patients) was around two-fold higher than that of conventional karyotyping (15.87%, 10/63 patients). We herein report, for the first time, extremely rare pathogenic CNVs in Saudi children with DD/CM/ID. The reported prevalence of CNVs in Saudi Arabia adds value to clinical cytogenetics.
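The diagnostic yields quoted in the abstract above can be checked with a few lines of arithmetic. The sketch below uses only the counts given in the abstract (18, 10, and 63); the helper name `yield_percent` is illustrative and not taken from the paper. Computed this way, 18/63 comes to about 28.57%, which the abstract reports as 28%.

```python
# Worked arithmetic for the diagnostic yields quoted in the abstract.
# The counts (18, 10, 63) come from the abstract; the helper name is illustrative.

def yield_percent(diagnosed: int, cohort: int) -> float:
    """Return the diagnostic yield as a percentage of the cohort."""
    return 100.0 * diagnosed / cohort

COHORT = 63
array_cgh = yield_percent(18, COHORT)   # patients diagnosed by array CGH
karyotype = yield_percent(10, COHORT)   # patients diagnosed by karyotyping
fold = array_cgh / karyotype            # relative yield of the two methods

print(f"array CGH:   {array_cgh:.2f}%")
print(f"karyotyping: {karyotype:.2f}%")
print(f"fold change: {fold:.2f}x")
```

The ratio works out to 1.8, which is consistent with the abstract's "around two-fold" wording.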
2. Adaptive Savitzky–Golay Filters for Analysis of Copy Number Variation Peaks from Whole-Exome Sequencing Data. Information 2023. [DOI: 10.3390/info14020128] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/18/2023]
Abstract
Copy number variation (CNV) is a form of structural variation in the human genome that provides medical insight into complex human diseases. While whole-genome sequencing is becoming more affordable, whole-exome sequencing (WES) remains an important tool in clinical diagnostics. Because sparse, target-enrichment-based WES data are discontinuous and have unique characteristics, the analysis and detection of CNV peaks remain difficult tasks. Savitzky–Golay (SG) smoothing is well known as a fast and efficient smoothing method. However, no study has documented the use of this technique for CNV peak detection. The effectiveness of the classical SG filter depends on the proper selection of the window length and polynomial degree, which should correspond to the scale of the peak; for peaks with a high rate of change, the effectiveness of the filter can be restricted. Based on the Savitzky–Golay algorithm, this paper introduces a novel adaptive method to smooth irregular peak distributions. The proposed method achieves high-precision noise reduction by using the results of the prior smoothing pass to adjust the filter parameters automatically. Our method also offers an additional feature extraction technique based on density and Euclidean distance. The performance evaluation demonstrates that adaptive Savitzky–Golay filtering performs better than classical Savitzky–Golay filtering and other peer filtering methods. According to experimental results, our method effectively detects CNV peaks across all genomic segments for both short and long tags, with minimal peak height fidelity values (i.e., low estimation bias). These results demonstrate how well the adaptive Savitzky–Golay filtering method works and how its use in the detection of CNV peaks can complement the existing techniques used in CNV peak analysis.
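For readers unfamiliar with the classical filter that the abstract above builds on, here is a minimal sketch of fixed-window Savitzky–Golay smoothing. It is not the adaptive method proposed in the paper: the window length (5) and polynomial degree (quadratic/cubic) are fixed, using the standard convolution coefficients (-3, 12, 17, 12, -3)/35. The function name `savgol5`, the edge handling (edge samples copied unchanged), and the read-depth values are assumptions made for illustration.

```python
# Classical (non-adaptive) Savitzky-Golay smoothing with a fixed 5-point
# quadratic/cubic kernel. This is NOT the adaptive method of the paper;
# it only illustrates the baseline filter the abstract refers to.

SG5_KERNEL = [-3 / 35, 12 / 35, 17 / 35, 12 / 35, -3 / 35]

def savgol5(signal):
    """Smooth `signal` with the 5-point SG kernel.

    The two samples at each edge are copied unchanged (a simplification;
    real implementations usually fit a polynomial at the edges).
    """
    half = len(SG5_KERNEL) // 2
    smoothed = list(signal)
    for i in range(half, len(signal) - half):
        window = signal[i - half : i + half + 1]
        smoothed[i] = sum(w * x for w, x in zip(SG5_KERNEL, window))
    return smoothed

# Hypothetical read-depth ratios along consecutive exome targets (made-up values)
depth = [1.0, 1.1, 0.9, 1.0, 1.6, 2.1, 1.9, 1.5, 1.0, 0.9, 1.1, 1.0]
print([round(v, 3) for v in savgol5(depth)])
```

Because the kernel comes from a local least-squares polynomial fit, it reproduces any quadratic signal exactly at interior points. A fixed window therefore tends to flatten peaks that are narrower than the window, which is the limitation the adaptive approach is meant to address.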
3. Xiao M, Ma F, Yu J, Xie J, Zhang Q, Liu P, Yu F, Jiang Y, Zhang L. A Computer Simulation of SARS-CoV-2 Mutation Spectra for Empirical Data Characterization and Analysis. Biomolecules 2022; 13:63. [PMID: 36671448 PMCID: PMC9855923 DOI: 10.3390/biom13010063] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/30/2022] [Revised: 12/21/2022] [Accepted: 12/23/2022] [Indexed: 12/31/2022]
Abstract
Computing mutation spectra and simulating intra-host mutation processes from sequencing data are important not only for understanding the genetic mechanism of SARS-CoV-2 but also for epidemic prediction and for vaccine and drug design. However, current intra-host mutation analysis algorithms are inaccurate, and current simulation methods cannot quickly and precisely predict new SARS-CoV-2 variants generated from the accumulation of mutations. Therefore, this study proposes a novel, accurate, strand-specific method for computing SARS-CoV-2 intra-host mutation spectra, develops an efficient and fast SARS-CoV-2 intra-host mutation simulation method based on mutation spectra, and establishes an online analysis and visualization platform. Our main results include: (1) there is significant variability in the SARS-CoV-2 intra-host mutation spectra across different lineages, with the major mutations being G->A, G->C, and G->U on the positive-sense strand and C->U, C->G, and C->A on the negative-sense strand; (2) our mutation simulation reveals that the simulated sequence starts to deviate from the base content percentage of Alpha-CoV/Delta-CoV after approximately 620 mutation steps; (3) 2019-NCSS provides an easy-to-use, visual online platform for SARS-CoV-2 analysis and mutation simulation.
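To make the term "mutation spectrum" in the abstract above concrete, the following sketch counts substitution types between a reference and a sample sequence. It is a deliberately simple illustration, not the strand-specific method proposed in the paper: it assumes the two sequences are already aligned and of equal length, and the function name `mutation_spectrum` and the toy sequences are invented for the example.

```python
from collections import Counter

def mutation_spectrum(reference: str, sample: str) -> Counter:
    """Count substitution types (e.g. 'G>A') between two aligned sequences.

    Assumes both sequences have the same length; this is a toy counter,
    not the strand-specific method described in the paper.
    """
    if len(reference) != len(sample):
        raise ValueError("sequences must be aligned to the same length")
    spectrum = Counter()
    for ref_base, alt_base in zip(reference.upper(), sample.upper()):
        # Skip matches and anything that is not an unambiguous RNA base (gaps, N)
        if ref_base != alt_base and ref_base in "ACGU" and alt_base in "ACGU":
            spectrum[f"{ref_base}>{alt_base}"] += 1
    return spectrum

# Toy sequences, invented for illustration only
reference = "GGCAUCGGAU"
sample = "AGCAUUGGCU"
print(dict(mutation_spectrum(reference, sample)))
```

Normalizing these counts by the number of occurrences of each reference base would give per-base substitution rates, which is closer to what is usually plotted as a spectrum.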
Affiliation(s)
- Ming Xiao
  - College of Computer Science, Sichuan University, Chengdu 610065, China
  - Med-X Center for Informatics, Sichuan University, Chengdu 610041, China
- Fubo Ma
  - West China Biomedical Big Data Center, West China Hospital, Sichuan University, Chengdu 610041, China
- Jun Yu
  - CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100049, China
  - College of Life Sciences, University of Chinese Academy of Sciences, Beijing 100049, China
- Jianghang Xie
  - College of Computer Science, Sichuan University, Chengdu 610065, China
- Qiaozhen Zhang
  - College of Computer Science, Sichuan University, Chengdu 610065, China
- Peng Liu
  - National Wildlife Health Center, Hebei Agricultural University, Baoding 071001, China
  - Hebei Key Laboratory of Analysis and Control of Zoonotic Pathogenic Microorganism, Hebei Agricultural University, Baoding 071001, China
- Fei Yu
  - Hebei Key Laboratory of Analysis and Control of Zoonotic Pathogenic Microorganism, Hebei Agricultural University, Baoding 071001, China
  - College of Life Sciences, Hebei Agricultural University, Baoding 071001, China
- Yuming Jiang
  - College of Computer Science, Sichuan University, Chengdu 610065, China
- Le Zhang
  - College of Computer Science, Sichuan University, Chengdu 610065, China
  - Med-X Center for Informatics, Sichuan University, Chengdu 610041, China
4. You Y, Zhang L, Tao P, Liu S, Chen L. Spatiotemporal Transformer Neural Network for Time-Series Forecasting. Entropy (Basel, Switzerland) 2022; 24:1651. [PMID: 36421506 PMCID: PMC9689721 DOI: 10.3390/e24111651] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 08/05/2022] [Revised: 11/05/2022] [Accepted: 11/08/2022] [Indexed: 06/16/2023]
Abstract
Predicting high-dimensional short-term time-series is a difficult task due to the lack of sufficient information and the curse of dimensionality. To overcome these problems, this study proposes a novel spatiotemporal transformer neural network (STNN) for efficient prediction of short-term time-series with four major features. Firstly, the STNN can accurately and robustly predict a high-dimensional short-term time-series in a multi-step-ahead manner by exploiting high-dimensional/spatial information based on the spatiotemporal information (STI) transformation equation. Secondly, the continuous attention mechanism makes the prediction results more accurate than those of previous studies. Thirdly, we developed continuous spatial self-attention, temporal self-attention, and transformation attention mechanisms to create a bridge between effective spatial information and future temporal evolution information. Fourthly, we show that the STNN model can reconstruct the phase space of the dynamical system, which is explored in the time-series prediction. The experimental results demonstrate that the STNN significantly outperforms the existing methods on various benchmarks and real-world systems in the multi-step-ahead prediction of a short-term time-series.
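As background for the attention mechanisms named in the abstract above, the sketch below implements standard scaled dot-product self-attention in plain Python. It is not the continuous spatial, temporal, or transformation attention developed in the paper: the learned query, key, and value projections are replaced by the identity (an assumption made to keep the example short), and the input values are made up.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(x):
    """Scaled dot-product self-attention with identity projections (Q = K = V = x).

    `x` is a list of equal-length vectors, one per time step.
    Returns (outputs, weights), where weights[i][j] is how strongly
    step i attends to step j.
    """
    dim = len(x[0])
    weights = []
    for query in x:
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(dim) for key in x
        ]
        weights.append(softmax(scores))
    # Each output is the attention-weighted average of all value vectors
    outputs = [
        [sum(w * value[j] for w, value in zip(row, x)) for j in range(dim)]
        for row in weights
    ]
    return outputs, weights

# Three time steps of a two-variable series (made-up values)
series = [[0.1, 0.2], [0.4, 0.1], [0.3, 0.5]]
outputs, weights = self_attention(series)
for row in weights:
    print([round(w, 3) for w in row])
```

Each row of `weights` sums to 1, so every output vector is a convex combination of the input vectors. A trained model would add learned projection matrices and, in the paper's case, the continuous formulation described in the abstract.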
Affiliation(s)
- Yujie You
  - College of Computer Science, Sichuan University, Chengdu 610065, China
- Le Zhang
  - College of Computer Science, Sichuan University, Chengdu 610065, China
  - Key Laboratory of Systems Biology, Hangzhou Institute for Advanced Study, University of Chinese Academy of Sciences, Hangzhou 310024, China
  - Key Laboratory of Systems Health Science of Zhejiang Province, Hangzhou Institute for Advanced Study, University of Chinese Academy of Sciences, Hangzhou 310024, China
- Peng Tao
  - Key Laboratory of Systems Health Science of Zhejiang Province, Hangzhou Institute for Advanced Study, University of Chinese Academy of Sciences, Hangzhou 310024, China
- Suran Liu
  - College of Computer Science, Sichuan University, Chengdu 610065, China
- Luonan Chen
  - Key Laboratory of Systems Health Science of Zhejiang Province, Hangzhou Institute for Advanced Study, University of Chinese Academy of Sciences, Hangzhou 310024, China
  - State Key Laboratory of Cell Biology, Institute of Biochemistry and Cell Biology, Center for Excellence in Molecular Cell Science, Chinese Academy of Sciences, Shanghai 200031, China
  - Guangdong Institute of Intelligence Science and Technology, Hengqin, Zhuhai 519031, China
  - West China Biomedical Big Data Center, Med-X Center for Informatics, West China Hospital, Sichuan University, Chengdu 610041, China
5. Pillay NS, Ross OA, Christoffels A, Bardien S. Current Status of Next-Generation Sequencing Approaches for Candidate Gene Discovery in Familial Parkinson's Disease. Front Genet 2022; 13:781816. [PMID: 35299952 PMCID: PMC8921601 DOI: 10.3389/fgene.2022.781816] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 09/23/2021] [Accepted: 01/12/2022] [Indexed: 11/13/2022]
Abstract
Parkinson's disease (PD) is a neurodegenerative disorder with a heterogeneous genetic etiology. The advent of next-generation sequencing (NGS) technologies has aided novel gene discovery in several complex diseases, including PD. This Perspective article aimed to explore the use of NGS approaches to identify novel loci in familial PD, and to consider their current relevance. A total of 17 studies, spanning various populations (including Asian, Middle Eastern and European ancestry), were identified. All the studies used whole-exome sequencing (WES), with only one study incorporating both WES and whole-genome sequencing. Notably, additional genetic analyses (including linkage analysis, haplotyping and homozygosity mapping) were incorporated to enhance the efficacy of some studies. Also, the use of consanguineous families and the specific search for de novo mutations appeared to facilitate the finding of causal mutations. Across the studies, similarities and differences in downstream analysis methods and the types of bioinformatic tools used were observed. Although these studies serve as a practical guide for novel gene discovery in familial PD, these approaches have not significantly resolved the "missing heritability" of PD. We speculate that what is needed is the use of third-generation sequencing technologies to identify complex genomic rearrangements and new sequence variation missed with existing methods. Additionally, the study of ancestrally diverse populations (in particular those of Black African ancestry), with the concomitant optimization and tailoring of sequencing and analytic workflows to these populations, is critical. Only then will the way be paved for exciting new discoveries in the field.
Affiliation(s)
- Nikita Simone Pillay
  - South African National Bioinformatics Institute (SANBI), South African Medical Research Council Bioinformatics Unit, University of the Western Cape, Bellville, South Africa
- Owen A. Ross
  - Department of Neuroscience, Mayo Clinic, Jacksonville, FL, United States
  - Department of Clinical Genomics, Mayo Clinic, Jacksonville, FL, United States
- Alan Christoffels
  - South African National Bioinformatics Institute (SANBI), South African Medical Research Council Bioinformatics Unit, University of the Western Cape, Bellville, South Africa
  - Africa Centres for Disease Control and Prevention, African Union Headquarters, Addis Ababa, Ethiopia
- Soraya Bardien
  - Division of Molecular Biology and Human Genetics, Department of Biomedical Sciences, Faculty of Medicine and Health Sciences, Stellenbosch University, Cape Town, South Africa
  - South African Medical Research Council/Stellenbosch University Genomics of Brain Disorders Research Unit, Cape Town, South Africa
6. Gao B, Baudis M. Signatures of Discriminative Copy Number Aberrations in 31 Cancer Subtypes. Front Genet 2021; 12:654887. [PMID: 34054918 PMCID: PMC8155688 DOI: 10.3389/fgene.2021.654887] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 01/17/2021] [Accepted: 04/15/2021] [Indexed: 12/13/2022]
Abstract
Copy number aberrations (CNA) are one of the most important classes of genomic mutations related to oncogenetic effects. In the past three decades, a vast amount of CNA data has been generated by molecular-cytogenetic and genome sequencing based methods. While this data has been instrumental in the identification of cancer-related genes and has promoted research into the relation between CNA and histo-pathologically defined cancer types, the heterogeneity of source data and derived CNV profiles poses great challenges for data integration and comparative analysis. Furthermore, a majority of existing studies have focused on the association of CNA with pre-selected "driver" genes, with limited application to rare drivers and other genomic elements. In this study, we developed a bioinformatics pipeline to integrate a highly diverse collection of 44,988 high-quality CNA profiles. Using a hybrid model of neural networks and an attention algorithm, we generated the CNA signatures of 31 cancer subtypes, depicting the uniqueness of their respective CNA landscapes. Finally, we constructed a multi-label classifier to identify the cancer type and the organ of origin from copy number profiling data. The investigation of the signatures suggested common patterns, not only among physiologically related cancer types but also among clinico-pathologically distant cancer types such as different cancers originating from the neural crest. Further experiments with classification models confirmed the effectiveness of the signatures in distinguishing different cancer types and demonstrated their potential in tumor classification.
Affiliation(s)
- Bo Gao
  - Department of Molecular Life Sciences, University of Zurich, Zurich, Switzerland
  - Swiss Institute of Bioinformatics, Zurich, Switzerland
- Michael Baudis
  - Department of Molecular Life Sciences, University of Zurich, Zurich, Switzerland
  - Swiss Institute of Bioinformatics, Zurich, Switzerland