Reference Citation Analysis

Cited by Other Article(s)

Shirguppe S, Gapinske M, Swami D, Gosstola N, Acharya P, Miskalis A, Joulani D, Szkwarek MG, Bhattacharjee A, Elias G, Stilger M, Winter J, Woods WS, Anand D, Lim CKW, Gaj T, Perez-Pinera P. In vivo CRISPR base editing for treatment of Huntington's disease. bioRxiv 2024:2024.07.05.602282. [PMID: 39005280 PMCID: PMC11245100 DOI: 10.1101/2024.07.05.602282] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/16/2024]

Babu HWS, Elangovan A, Iyer M, Kirola L, Muthusamy S, Jeeth P, Muthukumar S, Vanlalpeka H, Gopalakrishnan AV, Kadhirvel S, Kumar NS, Vellingiri B. Association Study Between Kynurenine 3-Monooxygenase (KMO) Gene and Parkinson's Disease Patients. Mol Neurobiol 2024;61:3867-3881. [PMID: 38040995 DOI: 10.1007/s12035-023-03815-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/26/2023] [Accepted: 11/18/2023] [Indexed: 12/03/2023]

Abstract

The influence of risk factors such as aging, intricate cellular and molecular processes, and lifestyle factors like smoking, alcohol consumption, caffeine intake, and occupational factors on the risk and development of Parkinson's disease (PD) has received increased attention. Limited research has been conducted on how lifestyle affects the kynurenine 3-monooxygenase (KMO) gene in PD. A total of 164 subjects, comprising 82 PD cases and 82 healthy individuals, were recruited based on specific inclusion and exclusion criteria. PD severity and clinical status were evaluated using the Unified Parkinson's Disease Rating Scale (UPDRS) and the Hoehn and Yahr (HY) scale. Sanger sequencing was performed to analyse the KMO gene in the recruited subjects, and case-control studies were conducted. The UPDRS assessment revealed significant impairments in smell, tremor, walking, and postural stability in the late-onset PD cohorts. The HY staging indicated a higher proportion of the late-onset cohorts in stage 2. Moreover, both alcoholic and non-alcoholic groups showed significantly increased levels of 3-hydroxykynurenine (3-HK) in late-onset PD. Gene analysis identified a missense variant at position g.241593373T>A (rs752312199) and intronic variants at positions g.241592623A>G (rs640718), g.241592800C>A (rs990388262), g.241592802A>C (rs1350160268), g.241592808T>C (rs1478255936), and g.241592812G>T (rs948928931). The alterations in the KMO gene were found to influence the levels of kynurenic acid (KYNA) and 3-HK. Genomic analysis revealed a high prevalence of missense mutations in the late-onset PD groups, leading to a decline in 3-HK levels in patients. This decline may slow disease progression in the late-onset groups, suggesting that the mutation has a protective effect in PD subjects. This study suggests the use of KYNA and 3-HK as potential biomarkers for analysing disease progression. This study is limited by its small sample size. A larger study involving a greater number of participants is needed to thoroughly investigate the KMO gene and kynurenine pathway (KP) metabolites, to improve our understanding of Parkinson's disease progression, and to enhance diagnostic capabilities.

Affiliation(s)

Harysh Winster Suresh Babu: Human Molecular Cytogenetics and Stem Cell Laboratory, Department of Human Genetics and Molecular Biology, Bharathiar University, Coimbatore, 641 046, Tamil Nadu, India; Stem Cell and Regenerative Medicine, Translational Research, Department of Zoology, School of Basic Sciences, Central University of Punjab, Bathinda, 151401, Punjab, India
Ajay Elangovan: Stem Cell and Regenerative Medicine, Translational Research, Department of Zoology, School of Basic Sciences, Central University of Punjab, Bathinda, 151401, Punjab, India
Mahalaxmi Iyer: Department of Microbiology, School of Basic Sciences, Central University of Punjab, Bathinda, 151401, Punjab, India; Centre for Neuroscience, Department of Biotechnology, Karpagam Academy of Higher Education (Deemed to be University), Coimbatore, India
Laxmi Kirola: Amity Institute of Biotechnology, Amity University, Noida, 201301, India; Department of Biotechnology, School of Health Sciences and Technology (SoHST), UPES University, Dehradun, 248007, Uttarakhand, India
Sureshan Muthusamy: School of Chemical & Biotechnology, SASTRA Deemed University, Thanjavur, 613401, India
Priyanka Jeeth: Structural and Computational Biology Laboratory, Department of Computational Sciences, Central University of Punjab, 151401, Bathinda, Punjab, India
Sindduja Muthukumar: Stem Cell and Regenerative Medicine, Translational Research, Department of Zoology, School of Basic Sciences, Central University of Punjab, Bathinda, 151401, Punjab, India
Harvey Vanlalpeka: Department of Obstetrics and Gynaecology, Zoram Medical College, Falkawn, 796005, India
Abilash Valsala Gopalakrishnan: Department of Biomedical Sciences, School of Biosciences and Technology, Vellore Institute of Technology, Tamil Nadu, Vellore, 632 014, India
Saraboji Kadhirvel: Structural and Computational Biology Laboratory, Department of Computational Sciences, Central University of Punjab, 151401, Bathinda, Punjab, India
Nachimuthu Senthil Kumar: Department of Biotechnology, Mizoram University, Aizawl, 796004, India
Balachandar Vellingiri: Stem Cell and Regenerative Medicine, Translational Research, Department of Zoology, School of Basic Sciences, Central University of Punjab, Bathinda, 151401, Punjab, India

Hwang H, Jeon H, Yeo N, Baek D. Big data and deep learning for RNA biology. Exp Mol Med 2024;56:1293-1321. [PMID: 38871816 PMCID: PMC11263376 DOI: 10.1038/s12276-024-01243-w] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 01/31/2024] [Revised: 02/27/2024] [Accepted: 03/05/2024] [Indexed: 06/15/2024]

Zhao J, Li J, Yao J, Lin G, Chen C, Ye H, He X, Qu S, Chen Y, Wang D, Liang Y, Gao Z, Wu F. Enhanced PSO feature selection with Runge-Kutta and Gaussian sampling for precise gastric cancer recurrence prediction. Comput Biol Med 2024;175:108437. [PMID: 38669732 DOI: 10.1016/j.compbiomed.2024.108437] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/23/2024] [Revised: 03/14/2024] [Accepted: 04/07/2024] [Indexed: 04/28/2024]

Bhattacharyya N, Chai N, Hafford-Tear NJ, Sadan AN, Szabo A, Zarouchlioti C, Jedlickova J, Leung SK, Liao T, Dudakova L, Skalicka P, Parekh M, Moghul I, Jeffries AR, Cheetham ME, Muthusamy K, Hardcastle AJ, Pontikos N, Liskova P, Tuft SJ, Davidson AE. Deciphering novel TCF4-driven mechanisms underlying a common triplet repeat expansion-mediated disease. PLoS Genet 2024;20:e1011230. [PMID: 38713708 PMCID: PMC11101122 DOI: 10.1371/journal.pgen.1011230] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/03/2023] [Revised: 05/17/2024] [Accepted: 03/19/2024] [Indexed: 05/09/2024]

Abstract

Fuchs endothelial corneal dystrophy (FECD) is an age-related cause of vision loss, and the most common repeat expansion-mediated disease in humans characterised to date. Up to 80% of European FECD cases have been attributed to expansion of a non-coding CTG repeat element (termed CTG18.1) located within the ubiquitously expressed transcription factor encoding gene, TCF4. The non-coding nature of the repeat and the transcriptomic complexity of TCF4 have made it extremely challenging to experimentally decipher the molecular mechanisms underlying this disease. Here we comprehensively describe CTG18.1 expansion-driven molecular components of disease within primary patient-derived corneal endothelial cells (CECs), generated from a large cohort of individuals with CTG18.1-expanded (Exp+) and CTG18.1-independent (Exp-) FECD. We employ long-read, short-read, and spatial transcriptomic techniques to interrogate expansion-specific transcriptomic biomarkers. Interrogation of long-read sequencing and alternative splicing analysis of short-read transcriptomic data together reveal the global extent of altered splicing occurring within Exp+ FECD, and unique transcripts associated with CTG18.1 expansions. Similarly, differential gene expression analysis highlights the total transcriptomic consequences of Exp+ FECD within CECs. Furthermore, differential exon usage, pathway enrichment and spatial transcriptomics reveal TCF4 isoform ratio skewing solely in Exp+ FECD with potential downstream functional consequences. Lastly, exome data from 134 Exp- FECD cases identified rare (minor allele frequency <0.005) and potentially deleterious (CADD>15) TCF4 variants in 7/134 Exp- FECD cases, suggesting that TCF4 variants independent of CTG18.1 may increase FECD risk. In summary, our study supports the hypothesis that at least two distinct pathogenic mechanisms, RNA toxicity and TCF4 isoform-specific dysregulation, underpin the pathophysiology of FECD. We anticipate these data will inform and guide the development of translational interventions for this common triplet-repeat mediated disease.
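
The rare-and-deleterious filter quoted in the abstract above (minor allele frequency <0.005, CADD>15) can be sketched in a few lines of Python. This is only an illustration of the thresholding step, not the authors' pipeline: the `Variant` record, its field names, and the example rows are all hypothetical, and only the two cut-offs come from the abstract.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    """One annotated variant. The field names are illustrative assumptions."""
    variant_id: str
    gene: str
    minor_allele_frequency: float
    cadd_score: float


# The two thresholds stated in the abstract.
MAX_MAF = 0.005
MIN_CADD = 15.0


def is_rare_and_deleterious(variant, gene="TCF4"):
    """True if the variant is in `gene`, rare (MAF < 0.005) and has CADD > 15."""
    return (
        variant.gene == gene
        and variant.minor_allele_frequency < MAX_MAF
        and variant.cadd_score > MIN_CADD
    )


def filter_variants(variants, gene="TCF4"):
    """Keep only the variants that pass both thresholds, preserving order."""
    return [v for v in variants if is_rare_and_deleterious(v, gene)]


# Made-up records for demonstration only; they are not data from the study.
EXAMPLE_VARIANTS = [
    Variant("var_a", "TCF4", 0.0001, 24.3),   # passes both thresholds
    Variant("var_b", "TCF4", 0.02, 28.0),     # too common
    Variant("var_c", "TCF4", 0.001, 9.5),     # CADD score too low
    Variant("var_d", "OTHER", 0.0001, 30.0),  # different gene
]

if __name__ == "__main__":
    for v in filter_variants(EXAMPLE_VARIANTS):
        print(v.variant_id)
```

Both comparisons are strict, matching the "<" and ">" signs in the abstract, so a variant sitting exactly on a threshold is excluded.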

Affiliation(s)

Nihar Bhattacharyya: University College London Institute of Ophthalmology, London, United Kingdom
Niuzheng Chai: University College London Institute of Ophthalmology, London, United Kingdom
Nathaniel J. Hafford-Tear: University College London Institute of Ophthalmology, London, United Kingdom
Amanda N. Sadan: University College London Institute of Ophthalmology, London, United Kingdom
Anita Szabo: University College London Institute of Ophthalmology, London, United Kingdom
Christina Zarouchlioti: University College London Institute of Ophthalmology, London, United Kingdom
Jana Jedlickova: Department of Paediatrics and Inherited Metabolic Disorders, First Faculty of Medicine, Charles University and General University Hospital in Prague, Prague, Czech Republic
Szi Kay Leung: Faculty of Health and Life Sciences, University of Exeter, Exeter, United Kingdom
Tianyi Liao: University College London Institute of Ophthalmology, London, United Kingdom
Lubica Dudakova: Department of Paediatrics and Inherited Metabolic Disorders, First Faculty of Medicine, Charles University and General University Hospital in Prague, Prague, Czech Republic
Pavlina Skalicka: Department of Paediatrics and Inherited Metabolic Disorders, First Faculty of Medicine, Charles University and General University Hospital in Prague, Prague, Czech Republic; Department of Ophthalmology, First Faculty of Medicine, Charles University and General University Hospital in Prague, Prague, Czech Republic
Mohit Parekh: University College London Institute of Ophthalmology, London, United Kingdom
Ismail Moghul: University College London Institute of Ophthalmology, London, United Kingdom; Moorfields Eye Hospital, London, United Kingdom
Aaron R. Jeffries: Faculty of Health and Life Sciences, University of Exeter, Exeter, United Kingdom
Michael E. Cheetham: University College London Institute of Ophthalmology, London, United Kingdom
Kirithika Muthusamy: Moorfields Eye Hospital, London, United Kingdom
Alison J. Hardcastle: University College London Institute of Ophthalmology, London, United Kingdom; Moorfields Eye Hospital, London, United Kingdom
Nikolas Pontikos: University College London Institute of Ophthalmology, London, United Kingdom; Moorfields Eye Hospital, London, United Kingdom
Petra Liskova: Department of Paediatrics and Inherited Metabolic Disorders, First Faculty of Medicine, Charles University and General University Hospital in Prague, Prague, Czech Republic; Department of Ophthalmology, First Faculty of Medicine, Charles University and General University Hospital in Prague, Prague, Czech Republic
Stephen J. Tuft: University College London Institute of Ophthalmology, London, United Kingdom; Moorfields Eye Hospital, London, United Kingdom
Alice E. Davidson: University College London Institute of Ophthalmology, London, United Kingdom; Moorfields Eye Hospital, London, United Kingdom

Lee H, Ozbulak U, Park H, Depuydt S, De Neve W, Vankerschaver J. Assessing the reliability of point mutation as data augmentation for deep learning with genomic data. BMC Bioinformatics 2024;25:170. [PMID: 38689247 PMCID: PMC11059627 DOI: 10.1186/s12859-024-05787-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/03/2024] [Accepted: 04/15/2024] [Indexed: 05/02/2024]

Xu C, Bao S, Chen H, Jiang T, Zhang C. Reference-informed prediction of alternative splicing and splicing-altering mutations from sequences. bioRxiv 2024:2024.03.22.586363. [PMID: 38586002 PMCID: PMC10996483 DOI: 10.1101/2024.03.22.586363] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/09/2024]

Abstract

Alternative splicing plays a crucial role in protein diversity and gene expression regulation in higher eukaryotes and mutations causing dysregulated splicing underlie a range of genetic diseases. Computational prediction of alternative splicing from genomic sequences not only provides insight into gene-regulatory mechanisms but also helps identify disease-causing mutations and drug targets. However, the current methods for the quantitative prediction of splice site usage still have limited accuracy. Here, we present DeltaSplice, a deep neural network model optimized to learn the impact of mutations on quantitative changes in alternative splicing from the comparative analysis of homologous genes. The model architecture enables DeltaSplice to perform "reference-informed prediction" by incorporating the known splice site usage of a reference gene sequence to improve its prediction on splicing-altering mutations. We benchmarked DeltaSplice and several other state-of-the-art methods on various prediction tasks, including evolutionary sequence divergence on lineage-specific splicing and splicing-altering mutations in human populations and neurodevelopmental disorders, and demonstrated that DeltaSplice outperformed consistently. DeltaSplice predicted ~15% of splicing quantitative trait loci (sQTLs) in the human brain as causal splicing-altering variants. It also predicted splicing-altering de novo mutations outside the splice sites in a subset of patients affected by autism and other neurodevelopmental disorders, including 19 genes with recurrent splicing-altering mutations. Among the new candidate disease risk genes, MFN1 is involved in mitochondria fusion, which is frequently disrupted in autism patients. Our work expanded the capacity of in silico splicing models with potential applications in genetic diagnosis and the development of splicing-based precision medicine.

Speakman E, Gunaratne GH. On a kneading theory for gene-splicing. Chaos 2024;34:043125. [PMID: 38579148 DOI: 10.1063/5.0199364] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/22/2024] [Accepted: 03/05/2024] [Indexed: 04/07/2024]

Liu X, Zhang H, Zeng Y, Zhu X, Zhu L, Fu J. DRANetSplicer: A Splice Site Prediction Model Based on Deep Residual Attention Networks. Genes (Basel) 2024;15:404. [PMID: 38674339 PMCID: PMC11048956 DOI: 10.3390/genes15040404] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/20/2024] [Revised: 03/20/2024] [Accepted: 03/23/2024] [Indexed: 04/28/2024]

Abstract

The precise identification of splice sites is essential for unraveling the structure and function of genes, constituting a pivotal step in the gene annotation process. In this study, we developed a novel deep learning model, DRANetSplicer, that integrates residual learning and attention mechanisms for enhanced accuracy in capturing the intricate features of splice sites. We constructed multiple datasets using the most recent versions of genomic data from three different organisms, Oryza sativa japonica, Arabidopsis thaliana and Homo sapiens. This approach allows us to train models with a richer set of high-quality data. DRANetSplicer outperformed benchmark methods on donor and acceptor splice site datasets, achieving an average accuracy of (96.57%, 95.82%) across the three organisms. Comparative analyses with benchmark methods, including SpliceFinder, Splice2Deep, Deep Splicer, EnsembleSplice, and DNABERT, revealed DRANetSplicer's superior predictive performance, resulting in at least a (4.2%, 11.6%) relative reduction in average error rate. We utilized the DRANetSplicer model trained on O. sativa japonica data to predict splice sites in A. thaliana, achieving accuracies for donor and acceptor sites of (94.89%, 94.25%). These results indicate that DRANetSplicer possesses excellent cross-organism predictive capabilities, with its performance in cross-organism predictions even surpassing that of benchmark methods in non-cross-organism predictions. Cross-organism validation showcased DRANetSplicer's excellence in predicting splice sites across similar organisms, supporting its applicability in gene annotation for understudied organisms. We employed multiple methods to visualize the decision-making process of the model. The visualization results indicate that DRANetSplicer can learn and interpret well-known biological features, further validating its overall performance. 
Our study systematically examined and confirmed the predictive ability of DRANetSplicer from various levels and perspectives, indicating that its practical application in gene annotation is justified.
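
The abstract above reports results both as accuracies and as a "relative reduction in average error rate". A minimal sketch of how those two quantities relate is shown below, assuming the usual definition (error rate = 100% minus accuracy, and relative reduction measured against the baseline's error rate). The abstract does not spell out its formula, and the baseline figure in the usage example is hypothetical.

```python
def error_rate(accuracy_percent):
    """Error rate in percent, taken as 100 minus the accuracy in percent."""
    if not 0.0 <= accuracy_percent <= 100.0:
        raise ValueError("accuracy must be between 0 and 100 percent")
    return 100.0 - accuracy_percent


def relative_error_reduction(model_accuracy, baseline_accuracy):
    """Percent reduction in error rate of a model relative to a baseline.

    Both arguments are accuracies in percent. A positive result means the
    model makes proportionally fewer errors than the baseline.
    """
    baseline_error = error_rate(baseline_accuracy)
    if baseline_error == 0.0:
        raise ValueError("baseline has no errors; relative reduction is undefined")
    model_error = error_rate(model_accuracy)
    return (baseline_error - model_error) / baseline_error * 100.0


if __name__ == "__main__":
    # Hypothetical baseline at 90% accuracy versus a model at 95% accuracy:
    # the error rate halves from 10% to 5%.
    print(relative_error_reduction(95.0, 90.0))
```

Under this definition, the reported average donor-site accuracy of 96.57% corresponds to an error rate of 3.43%, so small gains in accuracy near the top of the scale translate into comparatively large relative reductions in error.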

Ferese R, Scala S, Suppa A, Campopiano R, Asci F, Zampogna A, Chiaravalloti MA, Griguoli A, Storto M, Pardo AD, Giardina E, Zampatti S, Fornai F, Novelli G, Fanelli M, Zecca C, Logroscino G, Centonze D, Gambardella S. Cohort analysis of novel SPAST variants in SPG4 patients and implementation of in vitro and in vivo studies to identify the pathogenic mechanism caused by splicing mutations. Front Neurol 2023;14:1296924. [PMID: 38145127 PMCID: PMC10748595 DOI: 10.3389/fneur.2023.1296924] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/19/2023] [Accepted: 11/14/2023] [Indexed: 12/26/2023]

Affiliation(s)

Rosangela Ferese: IRCCS Neuromed, Pozzilli, Italy
Simona Scala: IRCCS Neuromed, Pozzilli, Italy
Antonio Suppa: IRCCS Neuromed, Pozzilli, Italy; Department of Human Neurosciences, Sapienza University of Rome, Rome, Italy
Rosa Campopiano: IRCCS Neuromed, Pozzilli, Italy
Francesco Asci: IRCCS Neuromed, Pozzilli, Italy
Alessandro Zampogna: Department of Human Neurosciences, Sapienza University of Rome, Rome, Italy
Maria Antonietta Chiaravalloti: IRCCS Neuromed, Pozzilli, Italy
Annamaria Griguoli: IRCCS Neuromed, Pozzilli, Italy
Marianna Storto: IRCCS Neuromed, Pozzilli, Italy
Alba Di Pardo: IRCCS Neuromed, Pozzilli, Italy
Emiliano Giardina: Genomic Medicine Laboratory, IRCCS Fondazione Santa Lucia, Rome, Italy
Stefania Zampatti: Genomic Medicine Laboratory, IRCCS Fondazione Santa Lucia, Rome, Italy
Francesco Fornai: IRCCS Neuromed, Pozzilli, Italy; Department of Translational Research and New Technologies in Medicine and Surgery, University of Pisa, Pisa, Italy
Giuseppe Novelli: IRCCS Neuromed, Pozzilli, Italy; Department of Biomedicine and Prevention, University of Rome “Tor Vergata”, Rome, Italy
Mirco Fanelli: Department of Biomolecular Sciences, University of Urbino “Carlo Bo”, Urbino, Italy
Chiara Zecca: Center for Neurodegenerative Diseases and the Aging Brain, Department of Clinical Research in Neurology of the University of Bari “Aldo Moro” at “Pia Fondazione Card G. Panico” Hospital Tricase, Lecce, Italy
Giancarlo Logroscino: Center for Neurodegenerative Diseases and the Aging Brain, Department of Clinical Research in Neurology of the University of Bari “Aldo Moro” at “Pia Fondazione Card G. Panico” Hospital Tricase, Lecce, Italy
Diego Centonze: IRCCS Neuromed, Pozzilli, Italy; Department of Systems Medicine, Tor Vergata University, Rome, Italy
Stefano Gambardella: IRCCS Neuromed, Pozzilli, Italy; Department of Biomolecular Sciences, University of Urbino “Carlo Bo”, Urbino, Italy

Toussaint PA, Leiser F, Thiebes S, Schlesner M, Brors B, Sunyaev A. Explainable artificial intelligence for omics data: a systematic mapping study. Brief Bioinform 2023;25:bbad453. [PMID: 38113073 PMCID: PMC10729786 DOI: 10.1093/bib/bbad453] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/23/2022] [Revised: 07/28/2023] [Accepted: 11/08/2023] [Indexed: 12/21/2023]

Weklak D, Tisborn J, Mangold MH, Scheu R, Wodrich H, Hagedorn C, Jönsson F, Kreppel F. Insights from the Construction of Adenovirus-Based Vaccine Candidates against SARS-CoV-2: Expecting the Unexpected. Viruses 2023;15:2155. [PMID: 38005833 PMCID: PMC10675337 DOI: 10.3390/v15112155] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/29/2023] [Revised: 10/20/2023] [Accepted: 10/23/2023] [Indexed: 11/26/2023]

Ditz JC, Reuter B, Pfeifer N. Inherently interpretable position-aware convolutional motif kernel networks for biological sequencing data. Sci Rep 2023;13:17216. [PMID: 37821530 PMCID: PMC10567796 DOI: 10.1038/s41598-023-44175-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/16/2023] [Accepted: 10/04/2023] [Indexed: 10/13/2023]

Shen F, Hu C, Huang X, He H, Yang D, Zhao J, Yang X. Advances in alternative splicing identification: deep learning and pantranscriptome. Front Plant Sci 2023;14:1232466. [PMID: 37790793 PMCID: PMC10544900 DOI: 10.3389/fpls.2023.1232466] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/31/2023] [Accepted: 08/28/2023] [Indexed: 10/05/2023]

Chao KH, Mao A, Salzberg SL, Pertea M. Splam: a deep-learning-based splice site predictor that improves spliced alignments. bioRxiv 2023:2023.07.27.550754. [PMID: 37546880 PMCID: PMC10402160 DOI: 10.1101/2023.07.27.550754] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/08/2023]

Zabardast A, Tamer EG, Son YA, Yılmaz A. An automated framework for evaluation of deep learning models for splice site predictions. Sci Rep 2023;13:10221. [PMID: 37353532 PMCID: PMC10290104 DOI: 10.1038/s41598-023-34795-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/02/2022] [Accepted: 05/08/2023] [Indexed: 06/25/2023]

Abstract

A novel framework for the automated evaluation of various deep learning-based splice site detectors is presented. The framework eliminates time-consuming development and experimentation across different codebases, architectures, and configurations when searching for the best models for a given RNA splice site dataset. RNA splicing is a cellular process in which pre-mRNAs are processed into mature mRNAs, and it can produce multiple mRNA transcripts from a single gene sequence. With the advancement of sequencing technologies, many splice site variants have been identified and associated with diseases. RNA splice site prediction is therefore essential for gene finding, genome annotation, detection of disease-causing variants, and identification of potential biomarkers. Recently, deep learning models have classified genomic signals with high accuracy. Convolutional Neural Network (CNN), Long Short-Term Memory (LSTM) and its bidirectional version (BLSTM), and Gated Recurrent Unit (GRU) and its bidirectional version (BGRU) are promising models. In genomic data analysis, CNN's locality feature helps because each nucleotide correlates with other bases in its vicinity. In contrast, BLSTM can be trained bidirectionally, allowing sequential data to be processed in both forward and reverse directions, so it can process 1-D encoded genomic data effectively. Even though both methods have been used in the literature, a performance comparison was missing. To compare selected models under similar conditions, we have created a blueprint for a series of networks with five different levels. As a case study, we compared the learning capabilities of CNN and BLSTM models as building blocks for RNA splice site prediction on two different datasets. Overall, CNN performed better with [Formula: see text] accuracy ([Formula: see text] improvement), [Formula: see text] F1 score ([Formula: see text] improvement), and [Formula: see text] AUC-PR ([Formula: see text] improvement) in human splice site prediction. Likewise, it performed better in C. elegans splice site prediction, with [Formula: see text] accuracy ([Formula: see text] improvement), [Formula: see text] F1 score ([Formula: see text] improvement), and [Formula: see text] AUC-PR ([Formula: see text] improvement). Overall, our results showed that CNN learns faster than BLSTM and BGRU. Moreover, CNN performs better at extracting sequence patterns than BLSTM and BGRU. To our knowledge, no other framework has been developed explicitly for evaluating splice detection models to select the best possible model in an automated manner. The proposed framework and blueprint should therefore help in choosing among deep learning models, such as CNN vs. BLSTM and BGRU, for splice site analysis and similar classification tasks.
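
The abstract above refers to "1-D encoded genomic data" as the input to the CNN and BLSTM models without naming the encoding. One common choice is one-hot encoding, sketched below in plain Python. This is an assumption for illustration and may differ from the encoding the authors used. The function name, the A/C/G/T channel order, and the all-zero treatment of ambiguous bases such as N are likewise illustrative choices.

```python
NUCLEOTIDES = "ACGT"  # assumed channel order; any fixed order works


def one_hot_encode(sequence):
    """Encode a DNA string as a list of 4-element 0/1 rows, one per base.

    Bases outside A, C, G, T (for example N) become an all-zero row.
    Lowercase input is accepted.
    """
    index = {base: i for i, base in enumerate(NUCLEOTIDES)}
    encoded = []
    for base in sequence.upper():
        row = [0] * len(NUCLEOTIDES)
        if base in index:
            row[index[base]] = 1
        encoded.append(row)
    return encoded


if __name__ == "__main__":
    # A made-up fragment containing the GT dinucleotide typical of donor sites.
    for base, row in zip("CAGGTN", one_hot_encode("CAGGTN")):
        print(base, row)
```

The result has one row per nucleotide and one column per base, which is the shape a 1-D convolution slides over along the sequence axis and a recurrent layer reads one position at a time.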

McBeath E, Fujiwara K, Hofmann MC. Evidence-Based Guide to Using Artificial Introns for Tissue-Specific Knockout in Mice. Int J Mol Sci 2023;24:10258. [PMID: 37373404 PMCID: PMC10299402 DOI: 10.3390/ijms241210258] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/29/2023] [Revised: 06/09/2023] [Accepted: 06/10/2023] [Indexed: 06/29/2023]

Akpokiro V, Chowdhury HMAM, Olowofila S, Nusrat R, Oluwadare O. CNNSplice: Robust models for splice site prediction using convolutional neural networks. Comput Struct Biotechnol J 2023;21:3210-3223. [PMID: 37304005 PMCID: PMC10250157 DOI: 10.1016/j.csbj.2023.05.031] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/01/2023] [Revised: 05/25/2023] [Accepted: 05/28/2023] [Indexed: 06/13/2023]

Lin BC, Katneni U, Jankowska KI, Meyer D, Kimchi-Sarfaty C. In silico methods for predicting functional synonymous variants. Genome Biol 2023;24:126. [PMID: 37217943 PMCID: PMC10204308 DOI: 10.1186/s13059-023-02966-1] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 08/03/2022] [Accepted: 05/10/2023] [Indexed: 05/24/2023]

Rogalska ME, Vivori C, Valcárcel J. Regulation of pre-mRNA splicing: roles in physiology and disease, and therapeutic prospects. Nat Rev Genet 2023;24:251-269. [PMID: 36526860 DOI: 10.1038/s41576-022-00556-8] [Citation(s) in RCA: 50] [Impact Index Per Article: 50.0] [Accepted: 11/10/2022] [Indexed: 12/23/2022]

Patterson A, Elbasir A, Tian B, Auslander N. Computational Methods Summarizing Mutational Patterns in Cancer: Promise and Limitations for Clinical Applications. Cancers (Basel) 2023;15:1958. [PMID: 37046619 PMCID: PMC10093138 DOI: 10.3390/cancers15071958] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/27/2022] [Revised: 02/24/2023] [Accepted: 03/09/2023] [Indexed: 03/29/2023]

A deep intronic TCTN2 variant activating a cryptic exon predicted by SpliceRover in a patient with Joubert syndrome. J Hum Genet 2023:10.1038/s10038-023-01143-3. [PMID: 36894704 DOI: 10.1038/s10038-023-01143-3] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 09/14/2022] [Revised: 01/26/2023] [Accepted: 02/27/2023] [Indexed: 03/11/2023]

Barbosa P, Savisaar R, Carmo-Fonseca M, Fonseca A. Computational prediction of human deep intronic variation. Gigascience 2022;12:giad085. [PMID: 37878682 PMCID: PMC10599398 DOI: 10.1093/gigascience/giad085] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/21/2023] [Revised: 06/07/2023] [Accepted: 09/20/2023] [Indexed: 10/27/2023]

Ding W, Abdel-Basset M, Hawash H, Ali AM. Explainability of artificial intelligence methods, applications and challenges: A comprehensive survey. Inf Sci (N Y) 2022. [DOI: 10.1016/j.ins.2022.10.013] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/17/2022]

Comparison of In Silico Tools for Splice-Altering Variant Prediction Using Established Spliceogenic Variants: An End-User’s Point of View. Int J Genomics 2022;2022:5265686. [PMID: 36275637 PMCID: PMC9584665 DOI: 10.1155/2022/5265686] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 02/28/2022] [Revised: 07/18/2022] [Accepted: 08/10/2022] [Indexed: 11/18/2022]

Park HM, Park Y, Berani U, Bang E, Vankerschaver J, Van Messem A, De Neve W, Shim H. In silico optimization of RNA-protein interactions for CRISPR-Cas13-based antimicrobials. Biol Direct 2022;17:27. [PMID: 36207756 PMCID: PMC9547417 DOI: 10.1186/s13062-022-00339-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/04/2022] [Accepted: 09/19/2022] [Indexed: 12/04/2022]

Abstract

RNA–protein interactions are crucial for diverse biological processes. In prokaryotes, RNA–protein interactions enable adaptive immunity through CRISPR-Cas systems. These defence systems utilize CRISPR RNA (crRNA) templates acquired from past infections to destroy foreign genetic elements through crRNA-mediated nuclease activities of Cas proteins. Thanks to the programmability and specificity of CRISPR-Cas systems, CRISPR-based antimicrobials have the potential to be repurposed as new types of antibiotics. Unlike traditional antibiotics, these CRISPR-based antimicrobials can be designed to target specific bacteria and minimize detrimental effects on the human microbiome during antibacterial therapy. In this study, we explore the potential of CRISPR-based antimicrobials by optimizing the RNA–protein interactions of crRNAs and Cas13 proteins. CRISPR-Cas13 systems are unique as they degrade specific foreign RNAs using the crRNA template, which leads to non-specific RNase activities and cell cycle arrest. We show that a high proportion of the Cas13 systems have no colocalized CRISPR arrays, and the lack of direct association between crRNAs and Cas proteins may result in suboptimal RNA–protein interactions in the current tools. Here, we investigate the RNA–protein interactions of the Cas13-based systems by curating the validation dataset of Cas13 protein and CRISPR repeat pairs that are experimentally validated to interact, and the candidate dataset of CRISPR repeats that reside on the same genome as the currently known Cas13 proteins. To find optimal CRISPR-Cas13 interactions, we first validate the 3-D structure prediction of crRNAs based on their experimental structures. Next, we test a number of RNA–protein interaction programs to optimize the in silico docking of crRNAs with the Cas13 proteins. From this optimized pipeline, we find a number of candidate crRNAs that have comparable or better in silico docking with the Cas13 proteins of the current tools. 
This study fully automates the in silico optimization of RNA–protein interactions as an efficient preliminary step for designing effective CRISPR-Cas13-based antimicrobials.


Akpokiro V, Martin T, Oluwadare O. EnsembleSplice: ensemble deep learning model for splice site prediction. BMC Bioinformatics 2022;23:413. [PMID: 36203144 PMCID: PMC9535948 DOI: 10.1186/s12859-022-04971-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/31/2022] [Accepted: 09/29/2022] [Indexed: 11/10/2022] Open

Abstract

BACKGROUND

Identifying splice site regions is an important step in the genomic DNA sequencing pipelines of biomedical and pharmaceutical research. Within this research purview, efficient and accurate splice site detection is highly desirable, and a variety of computational models have been developed toward this end. Neural network architectures have recently been shown to outperform classical machine learning approaches for the task of splice site prediction. Despite these advances, there is still considerable potential for improvement, especially regarding model prediction accuracy and error rate.

RESULTS

Given these deficits, we propose EnsembleSplice, an ensemble learning architecture that combines four (4) distinct convolutional neural network (CNN) model architectures and outperforms existing splice site detection methods on the evaluation metrics considered, including accuracy and error rate. We trained and tested a variety of ensembles made up of CNNs and DNNs using five-fold cross-validation to identify the model that performed best across the evaluation and diversity metrics. As a result, we developed our diverse and highly effective splice site (SS) detection model, which we evaluated using two (2) genomic Homo sapiens datasets and the Arabidopsis thaliana dataset. The results showed that, for one of the Homo sapiens datasets, EnsembleSplice achieved accuracies of 94.16% for acceptor splice sites and 95.97% for donor splice sites, with error rates on the same dataset of 5.84% for acceptor splice sites and 4.03% for donor splice sites.

CONCLUSIONS

Our five-fold cross-validation ensured that the prediction accuracy of our models is consistent. For reproducibility, all the datasets used, models generated, and results in our work are publicly available in our GitHub repository here: https://github.com/OluwadareLab/EnsembleSplice.
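The figures quoted in this abstract can be checked with a few lines of Python. The sketch below is not taken from the EnsembleSplice repository: it assumes that the reported error rate is simply the complement of accuracy, and the helper `k_fold_indices` is a hypothetical, minimal illustration of how a five-fold split partitions sample indices.

```python
# Accuracies reported in the abstract (percent) for one Homo sapiens dataset.
reported_accuracy = {"acceptor": 94.16, "donor": 95.97}


def error_rate(accuracy_percent):
    """Error rate as the complement of accuracy, in percent (assumed definition)."""
    return round(100.0 - accuracy_percent, 2)


error_rates = {site: error_rate(acc) for site, acc in reported_accuracy.items()}


def k_fold_indices(n_samples, k=5):
    """Yield (train, test) index lists for k contiguous, non-overlapping folds.

    Hypothetical helper for illustration; real pipelines usually shuffle first.
    """
    # Spread any remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = [i for i in range(n_samples) if i < start or i >= start + size]
        yield train, test
        start += size


if __name__ == "__main__":
    print(error_rates)
    for train, test in k_fold_indices(10):
        print("test fold:", test)
```

Under that assumption the computed error rates match the 5.84% (acceptor) and 4.03% (donor) values stated in the abstract.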


Liu Q, Fang H, Wang X, Wang M, Li S, Coin LJM, Li F, Song J. DeepGenGrep: a general deep learning-based predictor for multiple genomic signals and regions. Bioinformatics 2022;38:4053-4061. [PMID: 35799358 DOI: 10.1093/bioinformatics/btac454] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/12/2022] [Revised: 04/11/2022] [Accepted: 07/06/2022] [Indexed: 12/24/2022] Open

Alharbi WS, Rashid M. A review of deep learning applications in human genomics using next-generation sequencing data. Hum Genomics 2022;16:26. [PMID: 35879805 PMCID: PMC9317091 DOI: 10.1186/s40246-022-00396-x] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/24/2021] [Accepted: 07/12/2022] [Indexed: 12/02/2022] Open

Lee J, Jeong H, Won D, Shin S, Lee ST, Choi JR, Byeon SH, Kuht HJ, Thomas MG, Han J. Noncanonical Splice Site and Deep Intronic FRMD7 Variants Activate Cryptic Exons in X-linked Infantile Nystagmus. Transl Vis Sci Technol 2022;11:25. [PMID: 35762937 PMCID: PMC9251792 DOI: 10.1167/tvst.11.6.25] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/26/2022] Open

Abstract

Purpose

We aim to report noncoding pathogenic variants in patients with FRMD7-related infantile nystagmus (FIN).

Methods

Genome sequencing (n = 2 families) and reanalysis of targeted panel next-generation sequencing (n = 2 families) were performed in genetically unsolved cases of suspected FIN. Previous sequence analysis showed no pathogenic coding variants in genes associated with infantile nystagmus. SpliceAI, SpliceRover, and Alamut consensus programs were used to annotate noncoding variants. A minigene splicing assay was performed to confirm aberrant splicing. In silico analysis of exonic splicing enhancers and silencers was also performed.

Results

FRMD7 intronic variants were identified based on genome sequencing and targeted next-generation sequencing analysis. These included c.285-12A>G (pedigree 1), c.284+63T>A (pedigrees 2 and 3), and c.383-1368A>G (pedigree 4). All variants were absent in gnomAD, and both the c.285-12A>G and c.284+63T>A variants were predicted to result in new splice acceptor gains by the SpliceAI, SpliceRover, and Alamut consensus approaches. However, the c.383-1368A>G variant had a significant impact score only in the SpliceRover program. The c.383-1368A>G variant was predicted to promote pseudoexon inclusion through binding of an exonic splicing enhancer. Aberrant exonizations were validated through minigene constructs, and all variants segregated in the families.

Conclusions

Deep learning–based annotation of noncoding variants facilitates the discovery of hidden genetic variations in patients with FIN. This study provides evidence of the effectiveness of combined deep learning–based splicing tools for identifying hidden pathogenic variants in previously unsolved patients with infantile nystagmus.

Translational Relevance

These results demonstrate that robust analysis using two deep learning splicing prediction tools and an in vitro functional study can lead to finding hidden genetic variations in unsolved patients.
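The intronic variants named in this abstract follow the coding-DNA pattern "c.", nearest exonic position, signed intronic offset, then the substitution. As a minimal sketch (not part of the study), the hypothetical helper `parse_intronic_variant` below splits such a string into its parts. It only handles single-nucleotide substitutions of this shape and tolerates stray spaces inside the variant string.

```python
import re

# c.<exonic position><+|-><intronic offset><ref base>><alt base>
_INTRONIC_SUBSTITUTION = re.compile(r"^c\.(\d+)([+-])(\d+)([ACGT])>([ACGT])$")


def parse_intronic_variant(text):
    """Parse an intronic substitution such as 'c.285-12A>G' into its components.

    Hypothetical helper for illustration; it does not cover the full notation.
    """
    compact = "".join(text.split())  # drop stray whitespace, e.g. 'c. 383-1368A>G'
    match = _INTRONIC_SUBSTITUTION.match(compact)
    if match is None:
        raise ValueError(f"not a simple intronic substitution: {text!r}")
    exon_pos, sign, offset, ref, alt = match.groups()
    return {
        "anchor": int(exon_pos),  # nearest exonic nucleotide
        "offset": int(offset) if sign == "+" else -int(offset),
        "ref": ref,
        "alt": alt,
    }


if __name__ == "__main__":
    for variant in ("c.285-12A>G", "c.284+63T>A", "c. 383-1368A>G"):
        print(variant, "->", parse_intronic_variant(variant))
```

A negative offset counts upstream from the start of the following exon and a positive offset counts downstream from the end of the preceding exon, so c.383-1368A>G lies 1,368 nucleotides into the intron before position 383.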


Fernandez-Castillo E, Barbosa-Santillán LI, Falcon-Morales L, Sánchez-Escobar JJ. Deep Splicer: A CNN Model for Splice Site Prediction in Genetic Sequences. Genes (Basel) 2022;13:907. [PMID: 35627292 PMCID: PMC9141016 DOI: 10.3390/genes13050907] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/06/2022] [Revised: 05/12/2022] [Accepted: 05/13/2022] [Indexed: 02/05/2023] Open

Abstract

Many living organisms have DNA in their cells that is responsible for their biological features. DNA is an organic molecule of two complementary strands of four different nucleotides wound up in a double helix. These nucleotides are adenine (A), thymine (T), guanine (G), and cytosine (C). Genes are DNA sequences containing the information to synthesize proteins. The genes of higher eukaryotic organisms contain coding sequences, known as exons, and non-coding sequences, known as introns, which are removed at splice sites after the DNA is transcribed into RNA. Genome annotation is the process of identifying the location of coding regions and determining their function. This process is fundamental for understanding gene structure; however, it is time-consuming and expensive when done by biochemical methods. With technological advances, splice site detection can be done computationally. Although various software tools have been developed to predict splice sites, they need to improve accuracy and reduce false-positive rates. The main goal of this research was to generate Deep Splicer, a deep learning model to identify splice sites in the genomes of humans and other species. This model has good performance metrics and a lower false-positive rate than the currently existing tools. Deep Splicer achieved an accuracy between 93.55% and 99.66% on the genetic sequences of different organisms, while Splice2Deep, another splice site detection tool, had an accuracy between 90.52% and 98.08%. Splice2Deep surpassed Deep Splicer on the accuracy obtained after evaluating C. elegans genomic sequences (97.88% vs. 93.62%) and A. thaliana (95.40% vs. 94.93%); however, Deep Splicer's accuracy was better for H. sapiens (98.94% vs. 97.15%) and D. melanogaster (97.14% vs. 92.30%). The rate of false positives was 0.11% for human genetic sequences and 0.25% for other species' genetic sequences. Another splice prediction tool, Splice Finder, had between 1% and 3% false positives for human sequences and between about 4% and 10% for other species' sequences.
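CNN-based splice site predictors typically take a numeric encoding of the nucleotide sequence as input. The abstract does not state which encoding Deep Splicer uses, so the following is only a sketch of one common choice, one-hot encoding over the four nucleotides named above. The function name `one_hot_encode` and the A, T, G, C column order are assumptions made for illustration.

```python
# Column order is an arbitrary assumption; it follows the order in the abstract.
NUCLEOTIDES = "ATGC"


def one_hot_encode(sequence):
    """Encode a DNA string as a list of 4-element rows, one row per base.

    Bases outside A/T/G/C (for example the ambiguity code N) become all-zero rows.
    """
    return [
        [1 if base == nucleotide else 0 for nucleotide in NUCLEOTIDES]
        for base in sequence.upper()
    ]


if __name__ == "__main__":
    # 'GT' is the canonical dinucleotide at the start of most introns.
    for base, row in zip("AGGTN", one_hot_encode("AGGTN")):
        print(base, row)
```

Each sequence window becomes a length-by-4 matrix, which is the kind of input a one-dimensional convolutional layer can scan for splice site motifs.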


A systems genomics approach to uncover patient-specific pathogenic pathways and proteins in ulcerative colitis. Nat Commun 2022;13:2299. [PMID: 35484353 PMCID: PMC9051123 DOI: 10.1038/s41467-022-29998-8] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/20/2019] [Accepted: 04/06/2022] [Indexed: 12/11/2022] Open

Jankovic B, Gojobori T. From shallow to deep: some lessons learned from application of machine learning for recognition of functional genomic elements in human genome. Hum Genomics 2022;16:7. [PMID: 35180894 PMCID: PMC8855580 DOI: 10.1186/s40246-022-00376-1] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/26/2021] [Accepted: 01/02/2022] [Indexed: 11/25/2022] Open

Abstract

Identification of genomic signals as indicators for functional genomic elements is one of the areas that received early and widespread application of machine learning methods. With time, the methods applied grew in variety and generally exhibited a tendency to improve their ability to identify some major genomic and transcriptomics signals. The evolution of machine learning in genomics followed a similar path to applications of machine learning in other fields. These were impacted in a major way by three dominant developments, namely an enormous increase in availability and quality of data, a significant increase in computational power available to machine learning applications, and finally, new machine learning paradigms, of which deep learning is the most well-known example. It is not easy in general to distinguish factors leading to improvements in results of applications of machine learning. This is even more so in the field of genomics, where the advent of next-generation sequencing and the increased ability to perform functional analysis of raw data have had a major effect on the applicability of machine learning in OMICS fields. In this paper, we survey the results from a subset of published work on the application of machine learning to the recognition of genomic signals and regions in the human genome and summarize some lessons learnt from this endeavor. There is no doubt that significant progress has been made both in terms of accuracy and reliability of models. Questions remain, however, as to whether the progress has been sufficient and what these developments bring to the field of genomics in general and human genomics in particular. Improving usability, interpretability and accuracy of models remains an important open challenge for current and future research in application of machine learning and more generally of artificial intelligence methods in genomics.


Oligonucleotide correction of an intronic TIMMDC1 variant in cells of patients with severe neurodegenerative disorder. NPJ Genom Med 2022;7:9. [PMID: 35091571 PMCID: PMC8799713 DOI: 10.1038/s41525-021-00277-7] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/17/2021] [Accepted: 12/09/2021] [Indexed: 11/08/2022] Open

Kowarz E, Krutzke L, Külp M, Streb P, Larghero P, Reis J, Bracharz S, Engler T, Kochanek S, Marschalek R. Vaccine-induced COVID-19 mimicry syndrome. eLife 2022;11:74974. [PMID: 35084333 PMCID: PMC8846585 DOI: 10.7554/elife.74974] [Citation(s) in RCA: 36] [Impact Index Per Article: 18.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/25/2021] [Accepted: 01/21/2022] [Indexed: 12/02/2022] Open

Malard F, Mackereth CD, Campagne S. Principles and correction of 5'-splice site selection. RNA Biol 2022;19:943-960. [PMID: 35866748 PMCID: PMC9311317 DOI: 10.1080/15476286.2022.2100971] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/04/2022] Open

Yang G, Ye Q, Xia J. Unbox the black-box for the medical explainable AI via multi-modal and multi-centre data fusion: A mini-review, two showcases and beyond. AN INTERNATIONAL JOURNAL ON INFORMATION FUSION 2022;77:29-52. [PMID: 34980946 PMCID: PMC8459787 DOI: 10.1016/j.inffus.2021.07.016] [Citation(s) in RCA: 140] [Impact Index Per Article: 70.0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 01/27/2021] [Revised: 05/25/2021] [Accepted: 07/25/2021] [Indexed: 05/04/2023]

Scalzitti N, Kress A, Orhand R, Weber T, Moulinier L, Jeannin-Girardon A, Collet P, Poch O, Thompson JD. Spliceator: multi-species splice site prediction using convolutional neural networks. BMC Bioinformatics 2021;22:561. [PMID: 34814826 PMCID: PMC8609763 DOI: 10.1186/s12859-021-04471-3] [Citation(s) in RCA: 21] [Impact Index Per Article: 7.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/04/2021] [Accepted: 11/09/2021] [Indexed: 12/14/2022] Open

Decoding disease: from genomes to networks to phenotypes. Nat Rev Genet 2021;22:774-790. [PMID: 34341555 DOI: 10.1038/s41576-021-00389-x] [Citation(s) in RCA: 35] [Impact Index Per Article: 11.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Accepted: 06/18/2021] [Indexed: 02/06/2023]

Riepe TV, Khan M, Roosing S, Cremers FPM, 't Hoen PAC. Benchmarking deep learning splice prediction tools using functional splice assays. Hum Mutat 2021;42:799-810. [PMID: 33942434 PMCID: PMC8360004 DOI: 10.1002/humu.24212] [Citation(s) in RCA: 50] [Impact Index Per Article: 16.7] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/21/2020] [Revised: 03/16/2021] [Accepted: 04/17/2021] [Indexed: 12/21/2022]

Dasari CM, Bhukya R. Explainable deep neural networks for novel viral genome prediction. APPL INTELL 2021;52:3002-3017. [PMID: 34764607 PMCID: PMC8232563 DOI: 10.1007/s10489-021-02572-3] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Accepted: 05/26/2021] [Indexed: 11/27/2022]

Zrimec J, Buric F, Kokina M, Garcia V, Zelezniak A. Learning the Regulatory Code of Gene Expression. Front Mol Biosci 2021;8:673363. [PMID: 34179082 PMCID: PMC8223075 DOI: 10.3389/fmolb.2021.673363] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/27/2021] [Accepted: 05/24/2021] [Indexed: 11/13/2022] Open

Dutta A, Singh KK, Anand A. SpliceViNCI: Visualizing the splicing of non-canonical introns through recurrent neural networks. J Bioinform Comput Biol 2021;19:2150014. [PMID: 34088258 DOI: 10.1142/s0219720021500141] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/18/2022]

MET Exon 14 Skipping: A Case Study for the Detection of Genetic Variants in Cancer Driver Genes by Deep Learning. Int J Mol Sci 2021;22:ijms22084217. [PMID: 33921709 PMCID: PMC8072630 DOI: 10.3390/ijms22084217] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Abstract] [Key Words] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/25/2021] [Revised: 04/13/2021] [Accepted: 04/17/2021] [Indexed: 11/17/2022] Open

Clauwaert J, Menschaert G, Waegeman W. Explainability in transformer models for functional genomics. Brief Bioinform 2021;22:6214646. [PMID: 33834200 PMCID: PMC8425421 DOI: 10.1093/bib/bbab060] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [Key Words] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/28/2020] [Revised: 01/28/2021] [Accepted: 02/05/2021] [Indexed: 11/16/2022] Open

Kong L, Chen Y, Xu F, Xu M, Li Z, Fang J, Zhang L, Pian C. Mining influential genes based on deep learning. BMC Bioinformatics 2021;22:27. [PMID: 33482718 PMCID: PMC7821411 DOI: 10.1186/s12859-021-03972-5] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/29/2020] [Accepted: 01/15/2021] [Indexed: 11/17/2022] Open

Wei C, Zhang J, Yuan X, He Z, Liu G, Wu J. NeuroTIS: Enhancing the prediction of translation initiation sites in mRNA sequences via a hybrid dependency network and deep learning framework. Knowl Based Syst 2021. [DOI: 10.1016/j.knosys.2020.106459] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/20/2022]

Nonsense-associated altered splicing of MAP3K1 in two siblings with 46,XY disorders of sex development. Sci Rep 2020;10:17375. [PMID: 33060765 PMCID: PMC7567082 DOI: 10.1038/s41598-020-74405-1] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/10/2020] [Accepted: 09/29/2020] [Indexed: 01/31/2023] Open

Thanapattheerakul T, Engchuan W, Chan JH. Predicting the effect of variants on splicing using Convolutional Neural Networks. PeerJ 2020;8:e9470. [PMID: 32704450 PMCID: PMC7346860 DOI: 10.7717/peerj.9470] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/25/2019] [Accepted: 06/11/2020] [Indexed: 11/23/2022] Open

Abstract

Mutations that cause an error in the splicing of a messenger RNA (mRNA) can lead to diseases in humans. Various computational models have been developed to recognize the sequence pattern of the splice sites. In recent studies, Convolutional Neural Network (CNN) architectures were shown to outperform other existing models in predicting the splice sites. However, insufficient effort has been put into extending the CNN model to predict the effect of genomic variants on the splicing of mRNAs. This study proposes a framework to elaborate on the utility of CNNs to assess the effect of splice variants on the identification of potential disease-causing variants that disrupt the RNA splicing process. Five models, including three CNN-based and two non-CNN machine learning-based models, were trained and compared using two existing splice site datasets, Genome Wide Human splice sites (GWH) and a dataset provided at the Deep Learning and Artificial Intelligence winter school 2018 (DLAI). The donor sites were also tested with the HSplice tool to evaluate the predictive models. To improve the effectiveness of the predictive models, the two datasets were combined. The CNN model with four convolutional layers showed the best splice site prediction performance with an AUPRC of 93.4% and 88.8% for donor and acceptor sites, respectively. The effects of variants on splicing were estimated by applying the best model to variant data from the ClinVar database. Based on the estimation, the framework could effectively differentiate pathogenic variants from benign variants (p = 5.9 × 10⁻⁷). These promising results suggest that the proposed framework could be applied in future genetic studies to identify disease-causing loci involving the splicing mechanism. The datasets and Python scripts used in this study are available on the GitHub repository at https://github.com/smiile8888/rna-splice-sites-recognition.
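The AUPRC figure quoted in this abstract summarizes how well predicted scores rank true splice sites above non-sites. The abstract does not say how the authors computed it. The sketch below uses average precision, a common step-wise estimate of the area under the precision-recall curve, and the function name `average_precision` and the toy labels and scores are illustrative assumptions rather than data from the study.

```python
def average_precision(labels, scores):
    """Estimate AUPRC as the mean precision at the rank of each true positive.

    labels: iterable of 0/1 ground-truth values (1 = true splice site).
    scores: iterable of model scores, higher meaning more confident.
    Tied scores are kept in input order; this sketch does not treat ties specially.
    """
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    total_positives = sum(label for _, label in ranked)
    if total_positives == 0:
        raise ValueError("at least one positive label is required")
    hits = 0
    precision_sum = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            precision_sum += hits / rank  # precision at this recall step
    return precision_sum / total_positives


if __name__ == "__main__":
    toy_labels = [1, 0, 1, 0]          # illustrative values only
    toy_scores = [0.9, 0.8, 0.7, 0.1]  # illustrative values only
    print(round(average_precision(toy_labels, toy_scores), 4))
```

In the toy example the two positives are ranked first and third, giving precisions of 1/1 and 2/3 and an average of 5/6. Unlike accuracy, this metric stays informative when true splice sites are far rarer than non-sites.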


Payrovnaziri SN, Chen Z, Rengifo-Moreno P, Miller T, Bian J, Chen JH, Liu X, He Z. Explainable artificial intelligence models using real-world electronic health record data: a systematic scoping review. J Am Med Inform Assoc 2020;27:1173-1185. [PMID: 32417928 PMCID: PMC7647281 DOI: 10.1093/jamia/ocaa053] [Citation(s) in RCA: 87] [Impact Index Per Article: 21.8] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/22/2020] [Revised: 04/01/2020] [Accepted: 04/07/2020] [Indexed: 01/08/2023] Open