1
Alanazi AE, Alhumaidy AA, Almutairi H, Awadalla ME, Alkathiri A, Alarjani M, Aldawsari MA, Maniah K, Alahmadi RM, Alanazi BS, Eifan S, Alosaimi B. Evolutionary analysis of LMP-1 genetic diversity in EBV-associated nasopharyngeal carcinoma: Bioinformatic insights into oncogenic potential. Infect Genet Evol 2024; 120:105586. [PMID: 38508363 DOI: 10.1016/j.meegid.2024.105586] [Received: 01/01/2024] [Revised: 03/07/2024] [Accepted: 03/16/2024] [Indexed: 03/22/2024]
Abstract
EBV latent membrane protein 1 (LMP-1) is an important oncogene involved in the induction and maintenance of EBV infection and the activation of several cell survival and proliferative pathways. The genetic diversity of LMP-1 plays an important role in immunogenicity and tumorigenicity, allowing LMP-1 variants to escape host cell immunity and increasing their metastatic potential. This study explored the evolution of LMP-1 in EBV-infected patients at an advanced stage of nasopharyngeal carcinoma (NPC). Genetic variability in LMP-1 genes was detected using Sanger sequencing. Bioinformatic analysis was conducted for translation and nucleotide alignment. Phylogenetic analysis was used to construct a Bayesian tree for a deeper understanding of the genetic relationships, evolutionary connections, and variations between sequences. Genetic characterization of LMP-1 in NPC patients revealed polymorphisms in LMP-1 sequences. Motifs were identified within three critical LMP-1 domains, such as PQQAT within CTAR1 and YYD within CTAR2. The presence of the JACK3 region at specific sites within CTAR3, as well as repeat regions at positions (122-132) and (133-143) within CTAR3, was also annotated. Additionally, several mutations were detected, including 30 and 69 bp deletions, 33 bp repeats, and a 15 bp insertion. Although LMP-1 strains appear to be genetically diverse, they are closely related to 3 reference strains: prototype B95.8, Med- 30 bp deletion, and Med + 30 bp deletion. In our study, one of the strains harboring the 30 bp deletion came from a patient with both bone and bone marrow metastasis, which could be attributed to the fact that LMP-1 is involved in tumor metastasis, evasion and migration of NPC cells. This study provided valuable insights into genetic variability in LMP-1 sequences of EBV in NPC patients. Further functional studies would provide a more comprehensive understanding of the molecular characteristics, epidemiology, and clinical implications of LMP-1 polymorphisms in EBV-related malignancies.
Affiliation(s)
- Abdullah E Alanazi
- Comprehensive Cancer Center, King Fahad Medical City, Riyadh Second Health Cluster, Riyadh 11525, Saudi Arabia
- Hatim Almutairi
- Bioinformatics Laboratory, Public Health Authority, Riyadh 11451, Saudi Arabia
- Maaweya E Awadalla
- Research Center, King Fahad Medical City, Riyadh Second Health Cluster, Riyadh 11525, Saudi Arabia
- Abdulrahman Alkathiri
- Botany and Microbiology Department, College of Science, King Saud University, Riyadh 11451, Saudi Arabia
- Modhi Alarjani
- Research Center, King Fahad Medical City, Riyadh Second Health Cluster, Riyadh 11525, Saudi Arabia
- Mesfer Abdullah Aldawsari
- Department of Health Education, Alyamamah Hospital, Riyadh Second Health Cluster, Riyadh 11525, Saudi Arabia
- Khalid Maniah
- Department of Biology, King Khalid Military Academy, Riyadh 22140, Saudi Arabia
- Reham M Alahmadi
- Botany and Microbiology Department, College of Science, King Saud University, Riyadh 11451, Saudi Arabia
- Bader S Alanazi
- Research Center, King Fahad Medical City, Riyadh Second Health Cluster, Riyadh 11525, Saudi Arabia
- Saleh Eifan
- Botany and Microbiology Department, College of Science, King Saud University, Riyadh 11451, Saudi Arabia
- Bandar Alosaimi
- Research Center, King Fahad Medical City, Riyadh Second Health Cluster, Riyadh 11525, Saudi Arabia
2
Hoda A, Bixheku X, Lika Çekani M. Computational analysis of non-synonymous single nucleotide polymorphism in the bovine PKLR gene. J Biomol Struct Dyn 2024; 42:4155-4168. [PMID: 37278385 DOI: 10.1080/07391102.2023.2219315] [Received: 03/03/2023] [Accepted: 05/23/2023] [Indexed: 06/07/2023]
Abstract
Pyruvate kinase (PKLR) is a potential candidate gene for milk production traits in cows. The main aim of this work is to investigate the potentially deleterious non-synonymous single nucleotide polymorphisms (nsSNPs) in the PKLR gene by using several computational tools. In silico tools including SIFT, PolyPhen-2, SNAP2 and PANTHER indicated that only 18 of 170 nsSNPs were deleterious. Analysis of the change in protein stability caused by amino acid substitution, performed with I-Mutant, MUpro, CUPSAT, SDM and DynaMut, confirmed that 9 nsSNPs decreased protein stability. ConSurf analysis predicted that all 18 nsSNPs were evolutionarily moderately or highly conserved. Two different domains of PKLR protein were revealed by the InterPro tool, with 12 nsSNPs positioned in the Pyruvate Kinase barrel domain and 6 nsSNPs present in the Pyruvate Kinase C Terminal. The PKLR 3D model was predicted by MODELLER software and validated via Ramachandran plot and ProSA, which indicated a good quality model. The analysis of energy minimizations for the native and mutated structures was performed by SWISS PDB viewer with GROMOS 96 program and showed that 3 structural and 4 functional residues had total energy higher than the native model. These findings indicate that these mutant structures (rs441424814, rs449326723, rs476805413, rs472263384, rs474320860, rs475521477, rs441633284) were less stable than the native model. Molecular Dynamics simulations were performed to confirm the impact of nsSNPs on the protein structure and function. The present study provides useful information about functional SNPs that have an impact on PKLR protein in cattle. Communicated by Ramaswamy H. Sarma.
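The abstract above narrows 170 nsSNPs to 18 by combining several predictors but does not state the combination rule. The sketch below assumes a unanimous-vote rule (a variant is kept only if every tool flags it); the variant names and calls are hypothetical placeholders, not data from the study.

```python
from typing import Dict, List

# Tool names are taken from the abstract; the calls below are invented for illustration.
TOOLS: List[str] = ["SIFT", "PolyPhen-2", "SNAP2", "PANTHER"]


def consensus_deleterious(
    predictions: Dict[str, Dict[str, bool]], tools: List[str]
) -> List[str]:
    """Return the variants that every tool in `tools` flags as deleterious.

    A missing call is treated as "not deleterious" (an assumption of this sketch).
    """
    return sorted(
        variant
        for variant, calls in predictions.items()
        if all(calls.get(tool, False) for tool in tools)
    )


# Hypothetical per-tool calls for three made-up variants.
example = {
    "variant_A": {"SIFT": True, "PolyPhen-2": True, "SNAP2": True, "PANTHER": True},
    "variant_B": {"SIFT": True, "PolyPhen-2": False, "SNAP2": True, "PANTHER": True},
    "variant_C": {"SIFT": True, "PolyPhen-2": True, "SNAP2": True},  # no PANTHER call
}

print(consensus_deleterious(example, TOOLS))  # ['variant_A']
```

A majority-vote rule would be an equally plausible reading; swapping `all(...)` for a count threshold is enough to try it.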
Affiliation(s)
- Anila Hoda
- Agricultural University of Tirana, Tirana, Albania
3
Lee Y, Xu Y, Gao P, Chen J. TENET: Triple-enhancement based graph neural network for cell-cell interaction network reconstruction from spatial transcriptomics. J Mol Biol 2024; 436:168543. [PMID: 38508302 DOI: 10.1016/j.jmb.2024.168543] [Received: 01/11/2024] [Revised: 03/03/2024] [Accepted: 03/13/2024] [Indexed: 03/22/2024]
Abstract
Cellular communication relies on the intricate interplay of signaling molecules, forming the Cell-cell Interaction network (CCI) that coordinates tissue behavior. Researchers have shown that shallow neural networks can reconstruct the CCI from molecule abundance in Spatial Transcriptomics (ST) data. However, shallow networks are susceptible to sparse connections in the CCI and to excessive noise, which significantly reduces the accuracy of CCI reconstruction. To reconstruct a more comprehensive and accurate CCI, we propose a novel method named Triple-Enhancement based Graph Neural Network (TENET). In TENET, three progressive enhancement mechanisms build upon each other, creating a cumulative effect. This approach can ensure the ability to capture valuable features in limited data and amplify the noise signal to facilitate the denoising effect. Additionally, the whole architecture guides the decoding reconstruction phase with integrated knowledge, which leverages the accumulated insights from each stage of enhancement to ensure a refined and comprehensive CCI reconstruction. TENET has been implemented and tested on both real and synthetic ST datasets. On average, CCI reconstruction using TENET achieves a 9.61% improvement in Average Precision (AP) and a 7.32% improvement in Area Under the Receiver Operating Characteristic curve (AUROC) compared to the existing state-of-the-art (SOTA) method. The source code and data are available at https://github.com/Yujian-Lee/TENET.
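The two metrics reported above, AP and AUROC, can be computed from edge labels and predicted edge scores. The following is a minimal pure-Python sketch of the standard definitions, not code from the TENET repository; the labels and scores are hypothetical.

```python
from typing import Sequence


def average_precision(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Mean of the precision values taken at the rank of each true edge.

    Candidates are ranked by descending score; tied scores keep input order.
    """
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits = 0
    precision_sum = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / hits if hits else 0.0


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Probability that a random true edge outscores a random non-edge.

    Ties between a true edge and a non-edge count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical candidate edges: 1 = true interaction, 0 = no interaction.
labels = [1, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.1]

print(round(average_precision(labels, scores), 4))  # 0.8333
print(auroc(labels, scores))  # 0.75
```

The pairwise AUROC loop is quadratic in the number of edges, which is fine for illustration but too slow for large graphs; a rank-based formulation would be the usual choice there.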
Affiliation(s)
- Yujian Lee
- Guangdong Provincial Key Laboratory IRADS, Beijing Normal University-Hong Kong Baptist University United International College, Zhuhai, China; Department of Computer Science, Hong Kong Baptist University, Hong Kong Special Administrative Region; Beijing Normal University-Hong Kong Baptist University United International College, Zhuhai, China
- Yongqi Xu
- Department of Computer Science and Technology, Guangdong University of Technology, Guangzhou, China
- Peng Gao
- Department of Computer Science, Hong Kong Baptist University, Hong Kong Special Administrative Region; Beijing Normal University-Hong Kong Baptist University United International College, Zhuhai, China
- Jiaxing Chen
- Guangdong Provincial Key Laboratory IRADS, Beijing Normal University-Hong Kong Baptist University United International College, Zhuhai, China; Beijing Normal University-Hong Kong Baptist University United International College, Zhuhai, China
4
Hills S, Li Q, Madden JA, Genetti CA, Brownstein CA, Schmitz-Abe K, Beggs AH, Agrawal PB. High number of candidate gene variants are identified as disease-causing in a period of 4 years. Am J Med Genet A 2024; 194:e63509. [PMID: 38158391 DOI: 10.1002/ajmg.a.63509] [Received: 05/05/2023] [Revised: 09/15/2023] [Accepted: 12/09/2023] [Indexed: 01/03/2024]
Abstract
Advances in bioinformatic tools paired with the ongoing accumulation of genetic knowledge and periodic reanalysis of genomic sequencing data have led to an improvement in genetic diagnostic rates. Candidate gene variants (CGVs) identified during sequencing or on reanalysis but not yet implicated in human disease or associated with a phenotypically distinct condition are often not revisited, leading to missed diagnostic opportunities. Here, we revisited 33 such CGVs from our previously published study and determined that 16 of them have since been established as disease-causing (novel or phenotype expansion). These results emphasize the need to revisit CGVs previously identified during sequencing or reanalysis and the importance of sharing that information with researchers around the world, including relevant functional analysis to establish disease causality.
Affiliation(s)
- Sonia Hills
- Division of Genetics and Genomics, Boston Children's Hospital, Boston, Massachusetts, USA
- The Manton Center for Orphan Disease Research, Boston Children's Hospital, Boston, Massachusetts, USA
- Qifei Li
- Division of Genetics and Genomics, Boston Children's Hospital, Boston, Massachusetts, USA
- The Manton Center for Orphan Disease Research, Boston Children's Hospital, Boston, Massachusetts, USA
- Division of Neonatology, Department of Pediatrics, University of Miami Miller School of Medicine and Jackson Health System, Miami, Florida, USA
- Jill A Madden
- Division of Genetics and Genomics, Boston Children's Hospital, Boston, Massachusetts, USA
- The Manton Center for Orphan Disease Research, Boston Children's Hospital, Boston, Massachusetts, USA
- Casie A Genetti
- Division of Genetics and Genomics, Boston Children's Hospital, Boston, Massachusetts, USA
- The Manton Center for Orphan Disease Research, Boston Children's Hospital, Boston, Massachusetts, USA
- Catherine A Brownstein
- Division of Genetics and Genomics, Boston Children's Hospital, Boston, Massachusetts, USA
- The Manton Center for Orphan Disease Research, Boston Children's Hospital, Boston, Massachusetts, USA
- Department of Pediatrics, Harvard Medical School, Boston, Massachusetts, USA
- Klaus Schmitz-Abe
- Division of Genetics and Genomics, Boston Children's Hospital, Boston, Massachusetts, USA
- The Manton Center for Orphan Disease Research, Boston Children's Hospital, Boston, Massachusetts, USA
- Division of Neonatology, Department of Pediatrics, University of Miami Miller School of Medicine and Jackson Health System, Miami, Florida, USA
- Department of Pediatrics, Harvard Medical School, Boston, Massachusetts, USA
- Broad Institute of MIT and Harvard, Cambridge, Massachusetts, USA
- Alan H Beggs
- Division of Genetics and Genomics, Boston Children's Hospital, Boston, Massachusetts, USA
- The Manton Center for Orphan Disease Research, Boston Children's Hospital, Boston, Massachusetts, USA
- Department of Pediatrics, Harvard Medical School, Boston, Massachusetts, USA
- Broad Institute of MIT and Harvard, Cambridge, Massachusetts, USA
- Pankaj B Agrawal
- Division of Genetics and Genomics, Boston Children's Hospital, Boston, Massachusetts, USA
- The Manton Center for Orphan Disease Research, Boston Children's Hospital, Boston, Massachusetts, USA
- Division of Neonatology, Department of Pediatrics, University of Miami Miller School of Medicine and Jackson Health System, Miami, Florida, USA
5
Hong H, Yu L, Cong W, Kang K, Gao Y, Guan Q, Meng X, Zhang H, Zhou Z. Cross-Talking Pathways of Rapidly Accelerated Fibrosarcoma-1 (RAF-1) in Alzheimer's Disease. Mol Neurobiol 2024; 61:2798-2807. [PMID: 37940778 DOI: 10.1007/s12035-023-03765-2] [Received: 12/05/2022] [Accepted: 11/01/2023] [Indexed: 11/10/2023]
Abstract
Alzheimer's disease (AD) has become one of the main contributors to the global disease burden as the population ages. This study investigated the potential molecular mechanisms of rapidly accelerated fibrosarcoma-1 (RAF-1) in AD through bioinformatics analysis. Differential gene expression analysis was performed on the GSE132903 dataset. We used weighted gene co-expression network analysis (WGCNA) to evaluate the relations among co-expression modules and construct a global regulatory network. Cross-talking pathways of RAF-1 in AD were identified by functional enrichment analysis. In total, 2700 differentially expressed genes (DEGs) were selected between AD versus non-dementia control and RAF-1-high versus low group. Among them, DEGs in the turquoise module, which was strongly associated with AD and high expression of RAF-1, were enriched in vascular endothelial growth factor (VEGF), neurotrophin, mitogen-activated protein kinase (MAPK) signaling pathway, oxidative phosphorylation, GABAergic synapse, and axon guidance. Moreover, cross-talking pathways of RAF-1, including MAPK, VEGF, neurotrophin signaling pathways, and axon guidance, were identified by the global regulatory network. The area under the curve (AUC) in the performance evaluation was 84.2%. Gene set enrichment analysis (GSEA) indicated that oxidative phosphorylation and synapse-related biological processes were enriched in the RAF-1-high and AD groups. Our findings support a potential role of high RAF-1 levels in AD pathogenesis, mediated by MAPK, VEGF, neurotrophin signaling pathways, and axon guidance.
Affiliation(s)
- Hong Hong
- Department of Geriatrics, The First Hospital of China Medical University, Shenyang, 110001, Liaoning, China
- Lujiao Yu
- Department of Geriatrics, The First Hospital of China Medical University, Shenyang, 110001, Liaoning, China
- Wenqiang Cong
- Department of Geriatrics, The First Hospital of China Medical University, Shenyang, 110001, Liaoning, China
- Kexin Kang
- Department of Geriatrics, The First Hospital of China Medical University, Shenyang, 110001, Liaoning, China
- Yazhu Gao
- Department of Geriatrics, The First Hospital of China Medical University, Shenyang, 110001, Liaoning, China
- Qing Guan
- Department of Geriatrics, The First Hospital of China Medical University, Shenyang, 110001, Liaoning, China
- Xin Meng
- Department of Biochemistry and Molecular Biology, College of Life Science, China Medical University, Shenyang, 110001, Liaoning, China
- Haiyan Zhang
- Department of Geriatrics, The First Hospital of China Medical University, Shenyang, 110001, Liaoning, China
- Zhike Zhou
- Department of Geriatrics, The First Hospital of China Medical University, Shenyang, 110001, Liaoning, China
6
Kao HJ, Weng TH, Chen CH, Chen YC, Huang KY, Weng SL. iDVEIP: A computer-aided approach for the prediction of viral entry inhibitory peptides. Proteomics 2024; 24:e2300257. [PMID: 38263811 DOI: 10.1002/pmic.202300257] [Received: 06/21/2023] [Revised: 01/03/2024] [Accepted: 01/05/2024] [Indexed: 01/25/2024]
Abstract
With the notable surge in therapeutic peptide development, various peptides have emerged as potential agents against virus-induced diseases. Viral entry inhibitory peptides (VEIPs), a subset of antiviral peptides (AVPs), offer a promising avenue as entry inhibitors (EIs) with distinct advantages over chemical counterparts. Despite this, a comprehensive analytical platform for characterizing these peptides and their effectiveness in blocking viral entry remains lacking. In this study, we introduce a groundbreaking in silico approach that leverages bioinformatics analysis and machine learning to characterize and identify novel VEIPs. Cross-validation results demonstrate the efficacy of a model combining sequence-based features in predicting VEIPs with high accuracy, validated through independent testing. Additionally, an EI type model has been developed to distinguish peptides specifically acting as EIs from AVPs with alternative activities. Notably, we present iDVEIP, a web-based tool accessible at http://mer.hc.mmh.org.tw/iDVEIP/, designed for automatic analysis and prediction of VEIPs. The tool facilitates comprehensive analyses of peptide characteristics, providing detailed amino acid composition data for each prediction. Furthermore, we showcase the tool's utility in identifying EIs against severe acute respiratory syndrome coronavirus-2 (SARS-CoV-2).
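Amino acid composition, mentioned above as one of the reported peptide characteristics, is a common sequence-based feature. The sketch below shows the usual definition (the fraction of each of the 20 standard residues); it is an illustration only and makes no claim about how iDVEIP computes or uses its features. The example peptide is made up.

```python
from typing import Dict

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues, one-letter codes


def amino_acid_composition(peptide: str) -> Dict[str, float]:
    """Return the fraction of each standard amino acid in `peptide`.

    Raises ValueError for empty input or non-standard residue codes.
    """
    seq = peptide.strip().upper()
    if not seq:
        raise ValueError("empty peptide sequence")
    unknown = set(seq) - set(AMINO_ACIDS)
    if unknown:
        raise ValueError(f"non-standard residues: {sorted(unknown)}")
    return {aa: seq.count(aa) / len(seq) for aa in AMINO_ACIDS}


# Hypothetical four-residue peptide, chosen only to keep the arithmetic obvious.
composition = amino_acid_composition("ACDA")
print(composition["A"], composition["C"], composition["D"])  # 0.5 0.25 0.25
```

The resulting 20-element vector can be fed to any standard classifier; which model the authors used is not stated in the abstract.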
Affiliation(s)
- Hui-Ju Kao
- Department of Medical Research, Hsinchu MacKay Memorial Hospital, Hsinchu City, Taiwan
- Tzu-Hsiang Weng
- Department of Obstetrics and Gynecology, MacKay Memorial Hospital, Taipei City, Taiwan
- Chia-Hung Chen
- Department of Medical Research, Hsinchu MacKay Memorial Hospital, Hsinchu City, Taiwan
- Yu-Chi Chen
- Department of Medical Research, Hsinchu MacKay Memorial Hospital, Hsinchu City, Taiwan
- Kai-Yao Huang
- Department of Medical Research, Hsinchu MacKay Memorial Hospital, Hsinchu City, Taiwan
- Department of Medicine, MacKay Medical College, New Taipei City, Taiwan
- Institute of Biomedical Sciences, MacKay Medical College, New Taipei City, Taiwan
- Shun-Long Weng
- Department of Medicine, MacKay Medical College, New Taipei City, Taiwan
- Department of Obstetrics and Gynecology, Hsinchu MacKay Memorial Hospital, Hsinchu City, Taiwan
- MacKay Junior College of Medicine, Nursing and Management, Taipei City, Taiwan
7
Abdulabbas HT, Mohammad Ali AN, Farjadfar A, Arabfard M, Najafipour S, Kouhpayeh A, Ghasemian A, Behmard E. Design of a novel multi-epitope vaccine candidate against Chlamydia trachomatis using structural and nonstructural proteins: an immunoinformatics study. J Biomol Struct Dyn 2024; 42:4356-4369. [PMID: 37288800 DOI: 10.1080/07391102.2023.2220812] [Received: 03/12/2023] [Accepted: 05/28/2023] [Indexed: 06/09/2023]
Abstract
Chlamydia trachomatis (C. trachomatis) is an obligate intracellular bacterium which causes eye and sexually transmitted infections. During pregnancy, the bacterium is associated with preterm complications, low weight of neonates, fetal demise and endometritis leading to infertility. The aim of our study was to design a multi-epitope vaccine (MEV) candidate against C. trachomatis. After protein sequences were retrieved from the NCBI, the toxicity, antigenicity, allergenicity, MHC-I and MHC-II binding, cytotoxic T lymphocyte (CTL) and helper T lymphocyte (HTL) recognition, and interferon-γ (IFN-γ) induction of potential epitopes were predicted. The selected epitopes were fused together using appropriate linkers. In the next step, the MEV structural mapping and characterization, three-dimensional (3D) structure homology modeling and refinement were also performed. The MEV candidate was also docked with the toll-like receptor 4 (TLR4). The immune response simulation was performed using the C-ImmSim server. Molecular dynamics (MD) simulation verified the structural stability of the TLR4-MEV complex. The Molecular Mechanics Poisson-Boltzmann Surface Area (MMPBSA) approach demonstrated the high binding affinity of the MEV for TLR4, MHC-I and MHC-II. The MEV construct was also stable and water soluble, was sufficiently antigenic, lacked allergenicity, and stimulated T cells, B cells and IFN-γ release. The immune simulation confirmed acceptable responses of both the humoral and cellular arms. In vitro and in vivo studies are needed to evaluate the findings of this study. Communicated by Ramaswamy H. Sarma.
Affiliation(s)
- Hussein T Abdulabbas
- Department of Medical Microbiology, Medical College, Al Muthanna University, Al Muthanna, Iraq
- Akbar Farjadfar
- Department of Medical Biotechnology, Fasa University of Medical Sciences, Fasa, Iran
- Masoud Arabfard
- Chemical Injuries Research Center, Systems Biology and Poisonings Institute, Baqiyatallah University of Medical Sciences, Tehran, Iran
- Sohrab Najafipour
- School of Advanced Technologies in Medicine, Fasa University of Medical Sciences, Fasa, Iran
- Amin Kouhpayeh
- Department of Pharmacology, Faculty of Medicine, Fasa University of Medical Sciences, Fasa, Iran
- Abdolmajid Ghasemian
- Noncommunicable Diseases Research Center, Fasa University of Medical Sciences, Fasa, Iran
- Esmaeil Behmard
- School of Advanced Technologies in Medicine, Fasa University of Medical Sciences, Fasa, Iran
8
Santos JD, Sobral D, Pinheiro M, Isidro J, Bogaardt C, Pinto M, Eusébio R, Santos A, Mamede R, Horton DL, Gomes JP, Borges V. INSaFLU-TELEVIR: an open web-based bioinformatics suite for viral metagenomic detection and routine genomic surveillance. Genome Med 2024; 16:61. [PMID: 38659008 DOI: 10.1186/s13073-024-01334-3] [Received: 11/06/2023] [Accepted: 04/15/2024] [Indexed: 04/26/2024] [Open access]
Abstract
BACKGROUND Implementation of clinical metagenomics and pathogen genomic surveillance can be particularly challenging due to the lack of bioinformatics tools and/or expertise. To address this challenge, we previously developed INSaFLU, a free web-based bioinformatics platform for virus next-generation sequencing data analysis. Here, we considerably expanded its genomic surveillance component and developed a new module (TELEVIR) for metagenomic virus identification.
RESULTS The routine genomic surveillance component was strengthened with new workflows and functionalities, including (i) a reference-based genome assembly pipeline for Oxford Nanopore Technologies (ONT) data; (ii) automated SARS-CoV-2 lineage classification; (iii) Nextclade analysis; (iv) Nextstrain phylogeographic and temporal analysis (SARS-CoV-2, human and avian influenza, monkeypox, respiratory syncytial virus (RSV A/B), as well as a "generic" build for other viruses); and (v) algn2pheno for screening mutations of interest. Both INSaFLU pipelines for reference-based consensus generation (Illumina and ONT) were benchmarked against commonly used command line bioinformatics workflows for SARS-CoV-2, and an INSaFLU Snakemake version was released. In parallel, a new module (TELEVIR) for virus detection was developed, after extensive benchmarking of state-of-the-art metagenomics software and following up-to-date recommendations and practices in the field. TELEVIR allows running complex workflows, covering several combinations of steps (e.g., with/without viral enrichment or host depletion), classification software (e.g., Kaiju, Kraken2, Centrifuge, FastViromeExplorer), and databases (RefSeq viral genome, Virosaurus, etc.), while culminating in user- and diagnosis-oriented reports. Finally, to enable real-time virus detection during ONT runs, we developed findONTime, a tool aimed at reducing costs and the time between sample reception and diagnosis.
CONCLUSIONS The accessibility, versatility, and functionality of INSaFLU-TELEVIR are expected to supply public and animal health laboratories and researchers with a user-oriented and pan-viral bioinformatics framework that promotes strengthened and timely viral metagenomic detection and routine genomic surveillance. INSaFLU-TELEVIR is compatible with Illumina, Ion Torrent, and ONT data and is freely available at https://insaflu.insa.pt/ (online tool) and https://github.com/INSaFLU (code).
Affiliation(s)
- João Dourado Santos
- Genomics and Bioinformatics Unit, Department of Infectious Diseases, National Institute of Health Doutor Ricardo Jorge (INSA), Lisbon, Portugal
- Daniel Sobral
- Genomics and Bioinformatics Unit, Department of Infectious Diseases, National Institute of Health Doutor Ricardo Jorge (INSA), Lisbon, Portugal
- Miguel Pinheiro
- Institute of Biomedicine-iBiMED, Department of Medical Sciences, University of Aveiro, Aveiro, Portugal
- Joana Isidro
- Genomics and Bioinformatics Unit, Department of Infectious Diseases, National Institute of Health Doutor Ricardo Jorge (INSA), Lisbon, Portugal
- Carlijn Bogaardt
- Department of Comparative Biomedical Sciences, School of Veterinary Medicine, University of Surrey, Surrey, UK
- Miguel Pinto
- Genomics and Bioinformatics Unit, Department of Infectious Diseases, National Institute of Health Doutor Ricardo Jorge (INSA), Lisbon, Portugal
- Rodrigo Eusébio
- Genomics and Bioinformatics Unit, Department of Infectious Diseases, National Institute of Health Doutor Ricardo Jorge (INSA), Lisbon, Portugal
- André Santos
- Genomics and Bioinformatics Unit, Department of Infectious Diseases, National Institute of Health Doutor Ricardo Jorge (INSA), Lisbon, Portugal
- Rafael Mamede
- Faculdade de Medicina, Instituto de Microbiologia, Instituto de Medicina Molecular, Universidade de Lisboa, Lisbon, Portugal
- Daniel L Horton
- Department of Comparative Biomedical Sciences, School of Veterinary Medicine, University of Surrey, Surrey, UK
- João Paulo Gomes
- Genomics and Bioinformatics Unit, Department of Infectious Diseases, National Institute of Health Doutor Ricardo Jorge (INSA), Lisbon, Portugal
- Veterinary and Animal Research Centre (CECAV), Faculty of Veterinary Medicine, Lusófona University, Lisbon, Portugal
- Vítor Borges
- Genomics and Bioinformatics Unit, Department of Infectious Diseases, National Institute of Health Doutor Ricardo Jorge (INSA), Lisbon, Portugal
9
Wang C, Li Y, Huang J, Yan H, Zhao B. Mutation of neurotrophic tyrosine receptor kinase can promote pan-cancer immunity and the efficacy of immunotherapy. Mol Cancer 2024; 23:81. [PMID: 38658978 DOI: 10.1186/s12943-024-01986-0] [Received: 02/20/2024] [Accepted: 03/21/2024] [Indexed: 04/26/2024] [Open access]
Abstract
The neurotrophic tyrosine receptor kinase (NTRK) family plays important roles in tumor progression and is involved in tumor immunogenicity. Here, we conducted a comprehensive bioinformatic and clinical analysis to investigate the characteristics of NTRK mutations and their association with the outcomes in pan-cancer immunotherapy. In 3888 patients across 12 cancer types, patients with NTRK-mutant tumors showed more benefit from immunotherapy in terms of objective response rate (ORR; 41.7% vs. 27.5%; P < 0.001), progression-free survival (PFS; HR = 0.80; 95% CI, 0.68-0.96; P = 0.01), and overall survival (OS; HR = 0.71; 95% CI, 0.61-0.82; P < 0.001). We further constructed and validated a nomogram to estimate survival probabilities after the initiation of immunotherapy. Multi-omics analysis on intrinsic and extrinsic immune landscapes indicated that NTRK mutation was associated with enhanced tumor immunogenicity, enriched infiltration of immune cells, and improved immune responses. In summary, NTRK mutation may promote cancer immunity and indicate favorable outcomes in immunotherapy. Our results have implications for treatment decision-making and developing immunotherapy for personalized care.
Affiliation(s)
- Congren Wang
- Quanzhou First Hospital Affiliated to Fujian Medical University, Quanzhou, 362000, China
- Yingying Li
- Quanzhou First Hospital Affiliated to Fujian Medical University, Quanzhou, 362000, China
- Second Affiliated Hospital, Yuying Children's Hospital, Wenzhou Medical University, Wenzhou, 325035, China
- Jinyuan Huang
- Quanzhou First Hospital Affiliated to Fujian Medical University, Quanzhou, 362000, China
- Second Affiliated Hospital, Yuying Children's Hospital, Wenzhou Medical University, Wenzhou, 325035, China
- Huimeng Yan
- Quanzhou First Hospital Affiliated to Fujian Medical University, Quanzhou, 362000, China
- Second Affiliated Hospital, Yuying Children's Hospital, Wenzhou Medical University, Wenzhou, 325035, China
- Bin Zhao
- Quanzhou First Hospital Affiliated to Fujian Medical University, Quanzhou, 362000, China
- Second Affiliated Hospital, Yuying Children's Hospital, Wenzhou Medical University, Wenzhou, 325035, China
10
Chen Y, Qiu M, Hu R, Cao J, Liang W, Yan S. [Uncovering the molecular mechanisms behind steroidal saponin accumulation in Liriope muscari (Decne.) Baily through transcriptome sequencing and bioinformatics analysis]. Sheng Wu Gong Cheng Xue Bao 2024; 40:1120-1137. [PMID: 38658153 DOI: 10.13345/j.cjb.230492] [Indexed: 04/26/2024]
Abstract
The leaves and roots of Liriope muscari (Decne.) Baily were subjected to high-throughput Illumina transcriptome sequencing. Bioinformatics analysis was used to investigate the enzyme genes and key transcription factors involved in regulating the accumulation of steroidal saponins, which are the main active ingredients in L. muscari. These analyses aimed to reveal the molecular mechanism behind steroidal saponin accumulation. The sequencing results of L. muscari revealed 31 enzymes, including AACT, CAS, DXS and DXR, that are involved in the synthesis of steroidal saponins. Among these enzymes, 16 were involved in the synthesis of the terpenoid skeleton, 3 were involved in the synthesis of sesquiterpenes and triterpenes, and 12 were involved in the synthesis of steroidal compounds. Differential gene expression analysis identified 15 metabolic enzymes encoded by 34 differentially expressed genes (DEGs) in the leaves and roots, which were associated with steroidal saponin synthesis. Further analysis of gene co-expression patterns showed that 14 metabolic enzymes encoded by 31 DEGs were co-expressed. In addition, gene co-expression analysis and PlantTFDB's transcription factor analysis tool predicted the involvement of 8 transcription factors, including GAI, PIF4, PIL6, ERF8, SVP, LHCA4, NF-YB3 and DOF2.4, in regulating 6 metabolic enzymes, namely DXS, DXR, HMGR, DHCR7, DHCR24, and CAS. These eight transcription factors were predicted to play important roles in regulating steroidal saponin accumulation in L. muscari. Promoter analysis of these transcription factors indicated that their main regulatory mechanisms involve processes such as abscisic acid response, drought-induction stress response and light response, especially abscisic acid responsive elements (ABRE) response and MYB binding site involved in drought-inducibility (MBS) response pathway. Furthermore, qRT-PCR analysis of these eight key transcription factors demonstrated their specific differences in the leaves and roots.
Affiliation(s)
- Ying Chen
  - College of Landscape Architecture and Art, Fujian Agriculture and Forestry University, Fuzhou 350100, Fujian, China
- Mingyue Qiu
  - College of Landscape Architecture and Art, Fujian Agriculture and Forestry University, Fuzhou 350100, Fujian, China
- Ruoqun Hu
  - College of Landscape Architecture and Art, Fujian Agriculture and Forestry University, Fuzhou 350100, Fujian, China
- Jiayu Cao
  - College of Landscape Architecture and Art, Fujian Agriculture and Forestry University, Fuzhou 350100, Fujian, China
- Wanfeng Liang
  - College of Landscape Architecture and Art, Fujian Agriculture and Forestry University, Fuzhou 350100, Fujian, China
- Shujun Yan
  - College of Landscape Architecture and Art, Fujian Agriculture and Forestry University, Fuzhou 350100, Fujian, China
|
11
|
Lu C, Jiang J, Chen Q, Liu H, Ju X, Wang H. Analysis and prediction of interactions between transmembrane and non-transmembrane proteins. BMC Genomics 2024; 25:401. [PMID: 38658824 DOI: 10.1186/s12864-024-10251-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/22/2022] [Accepted: 03/25/2024] [Indexed: 04/26/2024] Open
Abstract
BACKGROUND Most of the important biological mechanisms and functions of transmembrane proteins (TMPs) are realized through their interactions with non-transmembrane proteins (nonTMPs). The interactions between TMPs and nonTMPs in cells play vital roles in intracellular signaling, energy metabolism, membrane-crossing mechanisms, and the relationships between diseases and drugs. RESULTS Despite the importance of TMP-nonTMP interactions, their study remains at the wet-lab experimental stage and lacks specific, comprehensive bioinformatics studies. To fill this gap, we performed a comprehensive statistical analysis of known TMP-nonTMP interactions and constructed a deep learning-based predictor to identify potential interactions. The statistical analysis describes known TMP-nonTMP interactions from various perspectives, such as distributions of species and protein families, enrichment of GO and KEGG pathways, as well as hub proteins and subnetwork modules in the PPI network. The predictor, implemented as an end-to-end deep learning model, can identify potential interactions from protein primary sequence information. Experimental results on the independent validation set demonstrated considerable prediction performance, with an MCC of 0.541. CONCLUSIONS To our knowledge, we were the first to focus on TMP-nonTMP interactions. We comprehensively analyzed them using bioinformatics methods and predicted them via deep learning based solely on their sequences. This research completes a key link in the protein network, benefits the understanding of protein functions, and helps in pathogenesis studies of diseases and associated drug development.
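The abstract reports performance as a Matthews correlation coefficient (MCC). As a reminder of what that number measures, the following is a minimal sketch that computes the MCC from the four cells of a binary confusion matrix. The function name and the example counts are hypothetical and are not taken from the paper.

```python
import math


def matthews_corrcoef(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient from confusion-matrix counts."""
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Common convention: report 0.0 when any marginal total is zero,
    # because the coefficient is undefined in that case.
    return numerator / denominator if denominator else 0.0


if __name__ == "__main__":
    # Hypothetical counts for an interaction / non-interaction classifier.
    print(round(matthews_corrcoef(tp=80, tn=75, fp=25, fn=20), 3))
```

The MCC ranges from -1 to 1 and stays informative when the positive and negative classes are imbalanced, which is one reason it is often reported alongside accuracy.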
Affiliation(s)
- Chang Lu
  - School of Psychology, School of Information Science and Technology, Institute of Computational Biology, Northeast Normal University, Changchun, China
- Jiuhong Jiang
  - School of Psychology, School of Information Science and Technology, Institute of Computational Biology, Northeast Normal University, Changchun, China
- Qiufen Chen
  - School of Psychology, School of Information Science and Technology, Institute of Computational Biology, Northeast Normal University, Changchun, China
- Huanhuan Liu
  - School of Psychology, School of Information Science and Technology, Institute of Computational Biology, Northeast Normal University, Changchun, China
- Xingda Ju
  - School of Psychology, School of Information Science and Technology, Institute of Computational Biology, Northeast Normal University, Changchun, China.
- Han Wang
  - School of Psychology, School of Information Science and Technology, Institute of Computational Biology, Northeast Normal University, Changchun, China.
|
12
|
Wei W, Xia X, Li T, Chen Q, Feng X. Shaoxia: a web-based interactive analysis platform for single cell RNA sequencing data. BMC Genomics 2024; 25:402. [PMID: 38658838 DOI: 10.1186/s12864-024-10322-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/09/2024] [Accepted: 04/18/2024] [Indexed: 04/26/2024] Open
Abstract
BACKGROUND In recent years, single-cell RNA sequencing (scRNA-seq) has become increasingly accessible to researchers in many fields. However, interpreting its data demands proficiency in multiple programming languages and bioinformatic skills, which limits the ability of researchers without such expertise to explore scRNA-seq data. Therefore, there is a tremendous need to develop easy-to-use software covering all aspects of scRNA-seq data analysis. RESULTS We proposed a clear analysis framework for scRNA-seq data that emphasizes the fundamental and crucial role of cell identity annotation and abstracts the analysis process into three stages: upstream analysis, cell annotation and downstream analysis. The framework can equip researchers with a comprehensive understanding of the analysis procedure and facilitate effective data interpretation. Leveraging the developed framework, we engineered Shaoxia, an analysis platform designed to democratize scRNA-seq analysis by accelerating processing through high-performance computing capabilities and offering a user-friendly interface accessible even to wet-lab researchers without programming expertise. CONCLUSION Shaoxia stands as a powerful and user-friendly open-source software for automated scRNA-seq analysis, offering comprehensive functionality for streamlined functional genomics studies. Shaoxia is freely accessible at http://www.shaoxia.cloud , and its source code is publicly available at https://github.com/WiedenWei/shaoxia .
Affiliation(s)
- Weideng Wei
  - State Key Laboratory of Oral Diseases & National Center for Stomatology & National Clinical Research Center for Oral Diseases & Research Unit of Oral Carcinogenesis and Management & Chinese Academy of Medical Sciences, West China Hospital of Stomatology, Sichuan University, No. 14, 3rd Section of Ren Min Nan Rd., Chengdu, Sichuan, 610041, China
- Xiaoqiang Xia
  - State Key Laboratory of Oral Diseases & National Center for Stomatology & National Clinical Research Center for Oral Diseases & Research Unit of Oral Carcinogenesis and Management & Chinese Academy of Medical Sciences, West China Hospital of Stomatology, Sichuan University, No. 14, 3rd Section of Ren Min Nan Rd., Chengdu, Sichuan, 610041, China
- Taiwen Li
  - State Key Laboratory of Oral Diseases & National Center for Stomatology & National Clinical Research Center for Oral Diseases & Research Unit of Oral Carcinogenesis and Management & Chinese Academy of Medical Sciences, West China Hospital of Stomatology, Sichuan University, No. 14, 3rd Section of Ren Min Nan Rd., Chengdu, Sichuan, 610041, China
- Qianming Chen
  - Key Laboratory of Oral Biomedical Research of Zhejiang Province, Affiliated Stomatology Hospital, Zhejiang University School of Stomatology, Hangzhou, Zhejiang, 310006, China
- Xiaodong Feng
  - State Key Laboratory of Oral Diseases & National Center for Stomatology & National Clinical Research Center for Oral Diseases & Research Unit of Oral Carcinogenesis and Management & Chinese Academy of Medical Sciences, West China Hospital of Stomatology, Sichuan University, No. 14, 3rd Section of Ren Min Nan Rd., Chengdu, Sichuan, 610041, China.
|
13
|
Ernst TR, Blischak JD, Nordlund P, Dalen J, Moore J, Bhamidipati A, Dwivedi P, LoGrasso J, Curado MR, Engelmann BW. OmicNavigator: open-source software for the exploration, visualization, and archival of omic studies. BMC Bioinformatics 2024; 25:162. [PMID: 38658834 DOI: 10.1186/s12859-024-05743-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/28/2023] [Accepted: 03/13/2024] [Indexed: 04/26/2024] Open
Abstract
BACKGROUND The results of high-throughput biology ('omic') experiments provide insight into biological mechanisms but can be challenging to explore, archive and share. The scale of these challenges continues to grow as omic research volume expands and multiple analytical technologies, bioinformatic pipelines, and visualization preferences have emerged. Multiple software applications exist that support omic study exploration and/or archival. However, an opportunity remains for open-source software that can archive and present the results of omic analyses with broad accommodation of study-specific analytical approaches and visualizations with useful exploration features. RESULTS We present OmicNavigator, an R package for the archival, visualization and interactive exploration of omic studies. OmicNavigator enables bioinformaticians to create web applications that interactively display their custom visualizations and analysis results linked with app-derived analytical tools, graphics, and tables. Studies created with OmicNavigator can be viewed within an interactive R session or hosted on a server for shared access. CONCLUSIONS OmicNavigator can be found at https://github.com/abbvie-external/OmicNavigator.
Affiliation(s)
- John D Blischak
  - AbbVie Inc., 1 North Waukegan Rd, North Chicago, IL, 60064, USA
- Paul Nordlund
  - AbbVie Inc., 1 North Waukegan Rd, North Chicago, IL, 60064, USA
- Joe Dalen
  - AbbVie Inc., 1 North Waukegan Rd, North Chicago, IL, 60064, USA
- Justin Moore
  - AbbVie Inc., 1 North Waukegan Rd, North Chicago, IL, 60064, USA
  - Current Address: Program in Quantitative and Computational Biosciences, Baylor College of Medicine, Houston, TX, USA
- Pankaj Dwivedi
  - AbbVie Inc., 1 North Waukegan Rd, North Chicago, IL, 60064, USA
  - Proteovant Therapeutics, 2500 Renaissance Blvd, King of Prussia, PA, USA
- Joe LoGrasso
  - AbbVie Inc., 1 North Waukegan Rd, North Chicago, IL, 60064, USA
|
14
|
Abbasi AF, Asim MN, Ahmed S, Dengel A. Long extrachromosomal circular DNA identification by fusing sequence-derived features of physicochemical properties and nucleotide distribution patterns. Sci Rep 2024; 14:9466. [PMID: 38658614 DOI: 10.1038/s41598-024-57457-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/19/2023] [Accepted: 03/18/2024] [Indexed: 04/26/2024] Open
Abstract
Long extrachromosomal circular DNA (leccDNA) regulates several biological processes such as genomic instability, gene amplification, and oncogenesis. Identifying leccDNA is important for investigating its potential associations with cancer, autoimmune, cardiovascular, and neurological diseases. In addition, understanding these associations can provide valuable insights into disease mechanisms and potential therapeutic approaches. Conventionally, wet lab-based methods are used to identify leccDNA; these are hindered by the need for prior knowledge and by resource-intensive processes, which potentially limits their broader applicability. To enable leccDNA identification across multiple species, this paper presents the first computational predictor. The proposed iLEC-DNA predictor uses an SVM classifier along with features based on sequence-derived nucleotide distribution patterns and physicochemical properties. In addition, the study introduces a set of 12 benchmark leccDNA datasets related to three species, namely Homo sapiens (HM), Arabidopsis thaliana (AT), and Saccharomyces cerevisiae (SC/YS). It performs large-scale experimentation across the 12 benchmark datasets under different experimental settings using the proposed predictor, more than 140 baseline predictors, and 858 encoder ensembles. The proposed predictor outperforms the baseline predictors and encoder ensembles across diverse leccDNA datasets, producing average performance values of 81.09%, 62.2% and 81.08% in terms of ACC, MCC and AUC-ROC across all the datasets. The source code of the proposed and baseline predictors is available at https://github.com/FAhtisham/Extrachrosmosomal-DNA-Prediction . To facilitate the scientific community, a web application for leccDNA identification is available at https://sds_genetic_analysis.opendfki.de/iLEC_DNA/.
Affiliation(s)
- Ahtisham Fazeel Abbasi
  - Department of Computer Science, Rhineland-Palatinate Technical University of Kaiserslautern-Landau, 67663, Kaiserslautern, Germany.
  - German Research Center for Artificial Intelligence GmbH, 67663, Kaiserslautern, Germany.
- Muhammad Nabeel Asim
  - German Research Center for Artificial Intelligence GmbH, 67663, Kaiserslautern, Germany.
- Sheraz Ahmed
  - German Research Center for Artificial Intelligence GmbH, 67663, Kaiserslautern, Germany
- Andreas Dengel
  - Department of Computer Science, Rhineland-Palatinate Technical University of Kaiserslautern-Landau, 67663, Kaiserslautern, Germany
  - German Research Center for Artificial Intelligence GmbH, 67663, Kaiserslautern, Germany
|
15
|
Schultheis H, Bentsen M, Heger V, Looso M. Uncovering uncharacterized binding of transcription factors from ATAC-seq footprinting data. Sci Rep 2024; 14:9275. [PMID: 38654130 DOI: 10.1038/s41598-024-59989-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/20/2023] [Accepted: 04/17/2024] [Indexed: 04/25/2024] Open
Abstract
Transcription factors (TFs) are crucial epigenetic regulators, which enable cells to dynamically adjust gene expression in response to environmental signals. Computational procedures like digital genomic footprinting on chromatin accessibility assays such as ATAC-seq can be used to identify bound TFs on a genome-wide scale. This method utilizes short regions of low accessibility signal caused by steric hindrance of DNA-bound proteins, called footprints (FPs), which are combined with motif databases for TF identification. However, while over 1600 TFs have been described in the human genome, only ~700 of these have a known binding motif. Thus, a substantial number of FPs that do not overlap a known DNA motif are normally discarded from FP analysis. In addition, the FP method is restricted to organisms with a substantial number of known TF motifs. Here we present DENIS (DE Novo motIf diScovery), a framework to generate and systematically investigate the potential of de novo TF motif discovery from FPs. DENIS includes functionality (1) to isolate FPs without binding motifs, (2) to perform de novo motif generation and (3) to characterize novel motifs. We show that the framework rediscovers artificially removed TF motifs, quantifies de novo motif usage in an example dataset of early embryonic development, and is able to analyze and uncover TF activity in organisms lacking canonical motifs. The latter task is exemplified by an investigation of a scATAC-seq dataset in zebrafish which covers different cell types during hematopoiesis.
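Step (1) of the framework, isolating footprints that do not overlap a known motif, amounts to an interval-overlap filter. The sketch below illustrates that idea only; it is not the DENIS implementation, and the interval representation, function names, and coordinates are all hypothetical.

```python
from typing import List, Tuple

# (chromosome, start, end) with half-open coordinates, as in BED files.
Interval = Tuple[str, int, int]


def overlaps(a: Interval, b: Interval) -> bool:
    """True if two half-open intervals on the same chromosome share a base."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]


def footprints_without_motif(
    footprints: List[Interval], motif_hits: List[Interval]
) -> List[Interval]:
    """Keep footprints that overlap none of the known motif hits."""
    # Quadratic scan for clarity; a sorted sweep or interval tree
    # would be the usual choice for genome-scale data.
    return [
        fp for fp in footprints
        if not any(overlaps(fp, hit) for hit in motif_hits)
    ]


if __name__ == "__main__":
    fps = [("chr1", 100, 120), ("chr1", 200, 220), ("chr2", 100, 120)]
    hits = [("chr1", 110, 130)]
    print(footprints_without_motif(fps, hits))
```

The retained intervals are the candidates that a de novo motif discovery step would then work on.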
Affiliation(s)
- Hendrik Schultheis
  - Bioinformatics Core Unit (BCU), Max Planck Institute for Heart and Lung Research, Bad Nauheim, Germany
- Mette Bentsen
  - Bioinformatics Core Unit (BCU), Max Planck Institute for Heart and Lung Research, Bad Nauheim, Germany
- Vanessa Heger
  - Bioinformatics Core Unit (BCU), Max Planck Institute for Heart and Lung Research, Bad Nauheim, Germany
- Mario Looso
  - Bioinformatics Core Unit (BCU), Max Planck Institute for Heart and Lung Research, Bad Nauheim, Germany.
  - Cardio-Pulmonary Institute (CPI), Bad Nauheim, Germany.
|
16
|
Hou Q, Jiang J, Na K, Zhang X, Liu D, Jing Q, Yan C, Han Y. Potential therapeutic targets for COVID-19 complicated with pulmonary hypertension: a bioinformatics and early validation study. Sci Rep 2024; 14:9294. [PMID: 38653779 DOI: 10.1038/s41598-024-60113-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/12/2023] [Accepted: 04/18/2024] [Indexed: 04/25/2024] Open
Abstract
Coronavirus disease (COVID-19) and pulmonary hypertension (PH) are closely correlated. However, the mechanism is still poorly understood. In this article, we analyzed the molecular action network driving the emergence of this event. Two datasets (GSE113439 and GSE147507) from the GEO database were used for the identification of differentially expressed genes (DEGs). Common DEGs were selected by VennDiagram and their enrichment in biological pathways was analyzed. Candidate gene biomarkers were selected using three different machine-learning algorithms (SVM-RFE, LASSO, RF). The diagnostic efficacy of these foundational genes was validated using independent datasets. Eventually, we validated molecular docking and medication prediction. We found 62 common DEGs, several of which were enriched for immune response and inflammation. Two DEGs (SELE and CCL20) were identified by the machine-learning algorithms. They performed well in diagnostic tests on independent datasets. In particular, we observed an upregulation of functions associated with the adaptive immune response, the leukocyte-lymphocyte-driven immunological response, and the proinflammatory response. Moreover, by ssGSEA, natural killer T cells, activated dendritic cells, activated CD4 T cells, neutrophils, and plasmacytoid dendritic cells were correlated with COVID-19 and PH, with SELE and CCL20 showing the strongest correlation with dendritic cells. Potential therapeutic compounds like FENRETINIDE, AFLATOXIN B1 and 1-nitropyrene were predicted. Further molecular docking and molecular dynamics simulations showed that 1-nitropyrene had the most stable binding with SELE and CCL20. The findings identified SELE and CCL20 as novel diagnostic biomarkers for COVID-19 complicated with PH, and FENRETINIDE and 1-nitropyrene, which target these two key genes, were predicted to be potential therapeutic compounds, thus providing new insights into the prediction and treatment of COVID-19 complicated with PH in clinical practice.
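Selecting the DEGs common to two datasets, which the abstract does with a Venn diagram, is a set intersection. The sketch below shows that operation under stated assumptions: the function name is hypothetical, and the gene lists are placeholders apart from SELE and CCL20, which the abstract names. It does not reproduce the study's actual DEG lists.

```python
from typing import Iterable, List


def common_degs(degs_a: Iterable[str], degs_b: Iterable[str]) -> List[str]:
    """Return gene symbols present in both DEG lists, sorted by name."""
    return sorted(set(degs_a) & set(degs_b))


if __name__ == "__main__":
    # Placeholder DEG lists; GENE_A and GENE_B are made-up symbols.
    ph_degs = ["SELE", "CCL20", "GENE_A"]
    covid_degs = ["CCL20", "GENE_B", "SELE"]
    print(common_degs(ph_degs, covid_degs))
```

In practice each input list would first be filtered by a fold-change and adjusted p-value cutoff before the intersection is taken.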
Affiliation(s)
- Qingbin Hou
  - State Key Laboratory of Frigid Zone Cardiovascular Disease, Cardiovascular Research Institute and Department of Cardiology, General Hospital of Northern Theater Command, Shenyang, China
- Jinping Jiang
  - Department of Cardiology, Shengjing Hospital Affiliated to China Medical University, Shenyang, China
- Kun Na
  - State Key Laboratory of Frigid Zone Cardiovascular Disease, Cardiovascular Research Institute and Department of Cardiology, General Hospital of Northern Theater Command, Shenyang, China
- Xiaolin Zhang
  - State Key Laboratory of Frigid Zone Cardiovascular Disease, Cardiovascular Research Institute and Department of Cardiology, General Hospital of Northern Theater Command, Shenyang, China
- Dan Liu
  - State Key Laboratory of Frigid Zone Cardiovascular Disease, Cardiovascular Research Institute and Department of Cardiology, General Hospital of Northern Theater Command, Shenyang, China
- Quanmin Jing
  - State Key Laboratory of Frigid Zone Cardiovascular Disease, Cardiovascular Research Institute and Department of Cardiology, General Hospital of Northern Theater Command, Shenyang, China
- Chenghui Yan
  - State Key Laboratory of Frigid Zone Cardiovascular Disease, Cardiovascular Research Institute and Department of Cardiology, General Hospital of Northern Theater Command, Shenyang, China.
- Yaling Han
  - State Key Laboratory of Frigid Zone Cardiovascular Disease, Cardiovascular Research Institute and Department of Cardiology, General Hospital of Northern Theater Command, Shenyang, China.
|
17
|
Beg A, Parveen R, Fouad H, Yahia ME, Hassanein AS. Unravelling driver genes as potential therapeutic targets in ovarian cancer via integrated bioinformatics approach. J Ovarian Res 2024; 17:86. [PMID: 38654363 DOI: 10.1186/s13048-024-01402-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/06/2023] [Accepted: 03/29/2024] [Indexed: 04/25/2024] Open
Abstract
Target-driven cancer therapy is a notable advancement in precision oncology that has been accompanied by substantial medical accomplishments. Ovarian cancer is a highly frequent neoplasm in women and exhibits significant genomic and clinical heterogeneity. In a previous publication, we presented an extensive bioinformatics study aimed at identifying specific biomarkers associated with ovarian cancer. The findings of the network analysis indicated a cluster of nine dysregulated hub genes that were significant in the underlying biological processes and contributed to the initiation of ovarian cancer. In this article, we extend our previous research by taking all hub genes into consideration for further analysis. GEPIA2 was used to identify patterns in the expression of critical genes. The KM plotter analysis indicated that 5 of these genes were statistically significant. The cBioPortal platform was further used to investigate the frequency of genetic mutations across the board and how they affected the survival of the patients. ELAVL2 showed the highest mutation frequency. High-throughput screening and docking studies were then used to discover viable small-molecule candidates for competitive inhibition of ELAVL2. Five compounds were identified. Overall, our results suggest that the ELAV-like protein 2-ZINC03830554 complex was relatively stable during the molecular dynamics simulation. The five compounds identified can also be examined further as potential therapeutic candidates. The combined findings suggest that ELAVL2, together with its genetic changes, can be investigated in therapeutic interventions for precision oncology, leveraging early diagnostics and target-driven therapy.
Affiliation(s)
- Anam Beg
  - Department of Computer Science, Jamia Millia Islamia, New Delhi, 110025, India
- Rafat Parveen
  - Department of Computer Science, Jamia Millia Islamia, New Delhi, 110025, India.
- Hassan Fouad
  - Applied Medical Science Department, CC, King Saud University, Riyadh, 11433, Saudi Arabia
- M E Yahia
  - Abu Dhabi Polytechnic, Institute of Applied Technology, Abu Dhabi, 111499, United Arab Emirates
- Azza S Hassanein
  - Biomedical Engineering Department, Faculty of Engineering, Helwan University, Cairo, Egypt
|
18
|
Baghdassarian HM, Dimitrov D, Armingol E, Saez-Rodriguez J, Lewis NE. Combining LIANA and Tensor-cell2cell to decipher cell-cell communication across multiple samples. Cell Rep Methods 2024; 4:100758. [PMID: 38631346 DOI: 10.1016/j.crmeth.2024.100758] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/10/2023] [Revised: 12/22/2023] [Accepted: 03/22/2024] [Indexed: 04/19/2024]
Abstract
In recent years, data-driven inference of cell-cell communication has helped reveal coordinated biological processes across cell types. Here, we integrate two tools, LIANA and Tensor-cell2cell, which, when combined, can deploy multiple existing methods and resources to enable the robust and flexible identification of cell-cell communication programs across multiple samples. In this work, we show how the integration of our tools facilitates the choice of method to infer cell-cell communication and subsequently perform an unsupervised deconvolution to obtain and summarize biological insights. We explain how to perform the analysis step by step in both Python and R and provide online tutorials with detailed instructions available at https://ccc-protocols.readthedocs.io/. This workflow typically takes ∼1.5 h to complete from installation to downstream visualizations on a graphics processing unit-enabled computer for a dataset of ∼63,000 cells, 10 cell types, and 12 samples.
Affiliation(s)
- Hratch M Baghdassarian
  - Bioinformatics and Systems Biology Graduate Program, University of California, San Diego, La Jolla, CA 92093, USA; Department of Pediatrics, University of California, San Diego, La Jolla, CA 92093, USA
- Daniel Dimitrov
  - Heidelberg University, Faculty of Medicine, and Heidelberg University Hospital, Institute for Computational Biomedicine, 69120 Heidelberg, Germany
- Erick Armingol
  - Bioinformatics and Systems Biology Graduate Program, University of California, San Diego, La Jolla, CA 92093, USA; Department of Pediatrics, University of California, San Diego, La Jolla, CA 92093, USA
- Julio Saez-Rodriguez
  - Heidelberg University, Faculty of Medicine, and Heidelberg University Hospital, Institute for Computational Biomedicine, 69120 Heidelberg, Germany.
- Nathan E Lewis
  - Department of Pediatrics, University of California, San Diego, La Jolla, CA 92093, USA; Department of Bioengineering, University of California, San Diego, La Jolla, CA 92093, USA.
|
19
|
Safadi A, Lovell SC, Doig AJ. Essentiality, protein-protein interactions and evolutionary properties are key predictors for identifying cancer-associated genes using machine learning. Sci Rep 2024; 14:9199. [PMID: 38649399 PMCID: PMC11035574 DOI: 10.1038/s41598-023-44118-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/25/2023] [Accepted: 10/04/2023] [Indexed: 04/25/2024] Open
Abstract
The distinctive nature of cancer as a disease prompts an exploration of the special characteristics that genes implicated in cancer exhibit. Identifying cancer-associated genes and their characteristics is crucial to furthering our understanding of this disease and to enhancing the likelihood of success for therapeutic drug targets. However, the rate at which cancer genes are being identified experimentally is slow. Applying predictive analysis techniques, through the building of accurate machine learning models, is potentially a useful approach to enhancing the identification rate of these genes and their characteristics. Here, we investigated gene essentiality scores and found that they tend to be higher for cancer-associated genes compared to other protein-coding human genes. We built a dataset of extended gene properties linked to essentiality and used it to train a machine-learning model; this model reached 89% accuracy and > 0.85 for the Area Under Curve (AUC). The model showed that essentiality, evolutionary-related properties, and properties arising from protein-protein interaction networks are particularly effective in predicting cancer-associated genes. We were able to use the model to identify potential candidate genes that have not been previously linked to cancer. Prioritising genes that score highly by our methods could aid scientists in their cancer gene research.
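The abstract summarises the model with accuracy and AUC. The sketch below shows how the two metrics are computed from labels and predicted scores, under stated assumptions: the function names, the 0.5 threshold, and the toy data are hypothetical, and this is not the authors' evaluation code. The AUC is computed as the probability that a randomly chosen positive outscores a randomly chosen negative, with ties counted as one half.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via pairwise comparison of scores."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos for n in neg
    )
    return wins / (len(pos) * len(neg))


def accuracy(labels: Sequence[int], scores: Sequence[float],
             threshold: float = 0.5) -> float:
    """Fraction of examples classified correctly at the given threshold."""
    correct = sum(int(s >= threshold) == y for y, s in zip(labels, scores))
    return correct / len(labels)


if __name__ == "__main__":
    # Toy data: 1 = cancer-associated gene, 0 = other gene.
    y = [1, 1, 0, 0]
    s = [0.9, 0.3, 0.4, 0.1]
    print(roc_auc(y, s), accuracy(y, s))
```

Accuracy depends on the chosen threshold, whereas AUC summarises ranking quality across all thresholds, which is why the two are usually reported together.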
Affiliation(s)
- Amro Safadi
  - Division of Evolution and Genomic Sciences, School of Biological Sciences, Faculty of Biology, Medicine and Health, The University of Manchester, Manchester, M13 9PT, UK
- Simon C Lovell
  - Division of Evolution and Genomic Sciences, School of Biological Sciences, Faculty of Biology, Medicine and Health, The University of Manchester, Manchester, M13 9PT, UK
- Andrew J Doig
  - Division of Neuroscience, School of Biological Sciences, Faculty of Biology, Medicine and Health, The University of Manchester, Manchester, M13 9BL, UK.
|
20
|
Yang F, Shen J, Zhao Z, Shang W, Cai H. Unveiling the link between lactate metabolism and rheumatoid arthritis through integration of bioinformatics and machine learning. Sci Rep 2024; 14:9166. [PMID: 38644410 PMCID: PMC11033278 DOI: 10.1038/s41598-024-59907-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/26/2024] [Accepted: 04/16/2024] [Indexed: 04/23/2024] Open
Abstract
Rheumatoid arthritis (RA) is a persistent autoimmune condition characterized by synovitis and joint damage. Recent findings suggest a potential link to abnormal lactate metabolism. This study aims to identify lactate metabolism-related genes (LMRGs) in RA and investigate their correlation with the molecular mechanisms of RA immunity. Gene expression profiles of RA synovial tissue samples were acquired from the Gene Expression Omnibus (GEO) database. The RA gene set was derived by identifying the common LMRDEGs and then selecting genes with an SVM model. Functional enrichment analysis was then conducted, followed by immune infiltration analysis and construction of protein-protein interaction networks. The results revealed that, as possible markers associated with lactate metabolism in RA, KCNN4 and SLC25A4 may be involved in regulating macrophage function in the immune response to RA, whereas GATA2 is involved in the immune mechanism of DC cells. In conclusion, this study utilized bioinformatics analysis and machine learning to identify biomarkers associated with lactate metabolism in RA and examined their relationship with immune cell infiltration. These findings offer novel perspectives on potential diagnostic and therapeutic targets for RA.
Affiliation(s)
- Fan Yang
  - Department of Chinese Medicine, Jinling Hospital, Affiliated Hospital of Medical School, Nanjing University, Nanjing, 210002, China
- Junyi Shen
  - Department of Chinese Medicine, Jinling Hospital, Affiliated Hospital of Medical School, Nanjing University, Nanjing, 210002, China
- Zhiming Zhao
  - Department of Chinese Medicine, Jinling Hospital, Affiliated Hospital of Medical School, Nanjing University, Nanjing, 210002, China
- Wei Shang
  - Department of Chinese Medicine, Jinling Hospital, Affiliated Hospital of Medical School, Nanjing University, Nanjing, 210002, China.
- Hui Cai
  - Department of Chinese Medicine, Jinling Hospital, Affiliated Hospital of Medical School, Nanjing University, Nanjing, 210002, China
|
21
|
Gaston JM, Alm EJ, Zhang AN. Fast and accurate variant identification tool for sequencing-based studies. BMC Biol 2024; 22:90. [PMID: 38644496 PMCID: PMC11034086 DOI: 10.1186/s12915-024-01891-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/03/2023] [Accepted: 04/17/2024] [Indexed: 04/23/2024] Open
Abstract
BACKGROUND Accurate identification of genetic variants, such as point mutations and insertions/deletions (indels), is crucial for various genetic studies into epidemic tracking, population genetics, and disease diagnosis. Genetic studies into microbiomes often require processing numerous sequencing datasets, necessitating variant identifiers with high speed, accuracy, and robustness. RESULTS We present QuickVariants, a bioinformatics tool that effectively summarizes variant information from read alignments and identifies variants. When tested on diverse bacterial sequencing data, QuickVariants demonstrates a ninefold higher median speed than bcftools, a widely used variant identifier, with higher accuracy in identifying both point mutations and indels. This accuracy extends to variant identification in virus samples, including SARS-CoV-2, particularly with significantly fewer false negative indels than bcftools. The high accuracy of QuickVariants is further demonstrated by its detection of a greater number of Omicron-specific indels (5 versus 0) and point mutations (61 versus 48-54) than bcftools in sewage metagenomes predominated by Omicron variants. Much of the reduced accuracy of bcftools was attributable to its misinterpretation of indels, often producing false negative indels and false positive point mutations at the same locations. CONCLUSIONS We introduce QuickVariants, a fast, accurate, and robust bioinformatics tool designed for identifying genetic variants for microbial studies. QuickVariants is available at https://github.com/caozhichongchong/QuickVariants .
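The idea of summarizing variant information from read alignments can be illustrated with a minimal pileup sketch. This is not the QuickVariants algorithm; it assumes gapless alignments given as hypothetical `(start, sequence)` pairs, and the function names and the `min_depth` and `min_fraction` thresholds are illustrative choices only.

```python
from collections import Counter, defaultdict


def summarize_pileup(reference, reads):
    """Tally the bases observed at each reference position.

    reads: iterable of (start, sequence) pairs with 0-based starts.
    Alignments are assumed gapless, so indels are NOT handled here.
    """
    counts = defaultdict(Counter)
    for start, seq in reads:
        for offset, base in enumerate(seq):
            pos = start + offset
            if 0 <= pos < len(reference):
                counts[pos][base] += 1
    return counts


def call_point_mutations(reference, reads, min_depth=3, min_fraction=0.8):
    """Report positions where a non-reference base dominates the pileup."""
    counts = summarize_pileup(reference, reads)
    calls = []
    for pos in sorted(counts):
        depth = sum(counts[pos].values())
        base, n = counts[pos].most_common(1)[0]
        if depth >= min_depth and base != reference[pos] and n / depth >= min_fraction:
            calls.append((pos, reference[pos], base, depth))
    return calls


# Hypothetical toy data: three reads support T>A at position 3.
reference = "ACGTACGT"
reads = [(0, "ACGAAC"), (1, "CGAACG"), (2, "GAACGT"), (4, "ACGT")]
print(call_point_mutations(reference, reads))
```

A real variant identifier must also parse indels from the alignment records, which is where the abstract locates most of the accuracy differences between tools.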
Affiliation(s)
- Eric J Alm: Department of Biological Engineering, Massachusetts Institute of Technology, Cambridge, USA; Department of Biological Engineering, Center for Microbiome Informatics and Therapeutics, Massachusetts Institute of Technology, Cambridge, USA
- An-Ni Zhang: Department of Biological Engineering, Center for Microbiome Informatics and Therapeutics, Massachusetts Institute of Technology, Cambridge, USA
|
22
|
Zhang Y, Li X. Empowering Graph Neural Networks with Block-Based Dual Adaptive Deep Adjustment for Drug Resistance-Related NcRNA Discovery. J Chem Inf Model 2024; 64:3537-3547. [PMID: 38523272 DOI: 10.1021/acs.jcim.3c01973] [Indexed: 03/26/2024]
Abstract
Drug resistance to chemotherapeutic agents remains a formidable challenge in cancer treatment, significantly impacting treatment efficacy. Extensive research has exposed the intimate involvement of noncoding RNAs (ncRNAs) in conferring resistance to cancer drugs. Understanding the intricate associations between ncRNAs and drug resistance is of pivotal importance in advancing clinical interventions and expediting drug development. However, traditional biological experimental methods are hampered by limitations, such as labor intensiveness, time consumption, and constraints in scalability. Addressing these challenges necessitates the development of efficient computational methods for the accurate prediction of potential ncRNA-drug resistance associations (NDRA). However, most existing predictive models primarily focus on known ncRNA-drug resistance associations, often neglecting the critical aspect of similarity information between ncRNAs and drug resistance. This oversight may hinder the accuracy of characterizing these associations. To overcome the limitations of existing computational models, we proposed B-NDRA, a computational framework designed for the discovery of drug resistance-related ncRNA. Initially, we constructed a heterogeneous graph that integrates ncRNA-drug resistance pairs, leveraging both known associations and similarity fusion information between ncRNAs and drug resistance. Subsequently, we employed an attention mechanism to aggregate local features of graph nodes following a dimensionality reduction of node features. Further, a graph neural network (GNN) facilitated the learning of global node embeddings. Notably, the integration of dual adaptive deep adjustment architectures, encompassing intrablock and interblock methodologies, enabled efficient extraction of global features while balancing local and global features. Finally, B-NDRA employed a multilayer perceptron to predict associations between ncRNAs and drug resistance. 
Through rigorous 5-fold cross-validation, B-NDRA achieved average AUC, AUPR, Accuracy, Precision, Recall, and F1-score values of 92.2%, 91.9%, 84.88%, 86.9%, 82.37%, and 84.44%, respectively. Furthermore, comparative evaluations were conducted on established models, namely, GAEMDA, GRPAMDA, and LRGCPND. The results, obtained through three distinct 5-fold cross-validation strategies, demonstrated a notable performance improvement across almost all metrics for our B-NDRA. Specific case studies targeting Doxorubicin and Imatinib further validated the practicality of our B-NDRA in discovering potential NDRA. These results confirm the potential of our B-NDRA as a valuable tool in advancing cancer research and therapeutic development. The source code and data set of B-NDRA can be found at https://github.com/XuanLi1145/B-NDRA.
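The evaluation protocol described above (5-fold cross-validation scored by accuracy, precision, recall, and F1) can be sketched as follows. This is a generic illustration, not the B-NDRA code: the function names are hypothetical, the folds are built without shuffling, and the confusion counts in the example are made-up numbers.

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute standard binary metrics from confusion-matrix counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


def k_fold_indices(n_samples, k=5):
    """Yield (train, test) index lists; fold sizes differ by at most one.

    Samples are assigned round-robin without shuffling (a simplification).
    """
    folds = [list(range(i, n_samples, k)) for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield train, test


# Hypothetical confusion counts for a single fold.
fold_metrics = classification_metrics(tp=80, fp=20, tn=90, fn=10)
print(fold_metrics)

# Every sample lands in exactly one test fold.
splits = list(k_fold_indices(10, k=5))
print([test for _, test in splits])
```

In a full evaluation, the metrics from each of the five folds would be averaged, which is how average values such as those quoted in the abstract are normally obtained.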
Affiliation(s)
- Yi Zhang: Guilin University of Technology, Guilin 541004, China; Guangxi Key Laboratory of Embedded Technology and Intelligent System, Guilin University of Technology, Guilin 541004, China
- Xuanzhao Li: Guilin University of Technology, Guilin 541004, China; Guangxi Key Laboratory of Embedded Technology and Intelligent System, Guilin University of Technology, Guilin 541004, China
|
23
|
Xuan P, Lu S, Cui H, Wang S, Nakaguchi T, Zhang T. Learning Association Characteristics by Dynamic Hypergraph and Gated Convolution Enhanced Pairwise Attributes for Prediction of Disease-Related lncRNAs. J Chem Inf Model 2024; 64:3569-3578. [PMID: 38523267 DOI: 10.1021/acs.jcim.4c00245] [Indexed: 03/26/2024]
Abstract
As the long non-coding RNAs (lncRNAs) play important roles during the incurrence and development of various human diseases, identifying disease-related lncRNAs can contribute to clarifying the pathogenesis of diseases. Most of the recent lncRNA-disease association prediction methods utilized the multi-source data about the lncRNAs and diseases. A single lncRNA may participate in multiple disease processes, and multiple lncRNAs usually are involved in the same disease process synergistically. However, the previous methods did not completely exploit the biological characteristics to construct the informative prediction models. We construct a prediction model based on adaptive hypergraph and gated convolution for lncRNA-disease association prediction (AGLDA), to embed and encode the biological characteristics about lncRNA-disease associations, the topological features from the entire heterogeneous graph perspective, and the gated enhanced pairwise features. First, the strategy for constructing hyperedges is designed to reflect the biological characteristic that multiple lncRNAs are involved in multiple disease processes. Furthermore, each hyperedge has its own biological perspective, and multiple hyperedges are beneficial for revealing the diverse relationships among multiple lncRNAs and diseases. Second, we encode the biological features of each lncRNA (disease) node using a strategy based on dynamic hypergraph convolutional networks. The strategy may adaptively learn the features of the hyperedges and formulate the dynamically evolved hypergraph topological structure. Third, a group convolutional network is established to integrate the entire heterogeneous topological structure and multiple types of node attributes within an lncRNA-disease-miRNA graph. Finally, a gated convolutional strategy is proposed to enhance the informative features of the lncRNA-disease node pairs. The comparison experiments indicate that AGLDA outperforms seven advanced prediction methods. 
The ablation studies confirm the effectiveness of major innovations, and the case studies validate AGLDA's ability in application for discovering potential disease-related lncRNA candidates.
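The hyperedge idea in this abstract can be made concrete with a small sketch. It assumes, purely for illustration, that each disease defines one hyperedge joining all lncRNAs associated with it, and it performs a single static mean-aggregation pass over scalar features. AGLDA's dynamic hypergraph and gated convolution are not reproduced, and all names and values are hypothetical.

```python
def build_incidence(associations, lncrnas, diseases):
    """Incidence matrix: rows are lncRNA nodes, columns are hyperedges (one per disease)."""
    row = {name: i for i, name in enumerate(lncrnas)}
    col = {name: j for j, name in enumerate(diseases)}
    H = [[0] * len(diseases) for _ in lncrnas]
    for lnc, dis in associations:
        H[row[lnc]][col[dis]] = 1
    return H


def hypergraph_smooth(H, features):
    """One mean-aggregation pass: nodes -> hyperedges -> nodes."""
    n_nodes, n_edges = len(H), len(H[0])
    edge_feat = []
    for j in range(n_edges):
        members = [i for i in range(n_nodes) if H[i][j]]
        edge_feat.append(sum(features[i] for i in members) / len(members) if members else 0.0)
    smoothed = []
    for i in range(n_nodes):
        edges = [j for j in range(n_edges) if H[i][j]]
        # Isolated nodes keep their own feature.
        smoothed.append(sum(edge_feat[j] for j in edges) / len(edges) if edges else features[i])
    return smoothed


# Hypothetical toy associations.
lncrnas = ["lncA", "lncB", "lncC"]
diseases = ["d1", "d2"]
associations = [("lncA", "d1"), ("lncB", "d1"), ("lncB", "d2"), ("lncC", "d2")]
H = build_incidence(associations, lncrnas, diseases)
print(H)
print(hypergraph_smooth(H, [1.0, 3.0, 5.0]))
```

Because a hyperedge can join any number of nodes, it can represent several lncRNAs acting on the same disease process, which an ordinary pairwise edge cannot.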
Affiliation(s)
- Ping Xuan: School of Computer Science and Technology, Heilongjiang University, Harbin 150080, China; Department of Computer Science, Shantou University, Shantou 515063, China
- Siyuan Lu: School of Computer Science and Technology, Heilongjiang University, Harbin 150080, China
- Hui Cui: Department of Computer Science and Information Technology, La Trobe University, Melbourne 3083, Australia
- Shuai Wang: School of Information Science and Engineering, Yanshan University, Qinhuangdao 066004, China
- Toshiya Nakaguchi: Center for Frontier Medical Engineering, Chiba University, Chiba 2638522, Japan
- Tiangang Zhang: School of Computer Science and Technology, Heilongjiang University, Harbin 150080, China; School of Mathematical Science, Heilongjiang University, Harbin 150080, China
|
24
|
Ledesma-Dominguez L, Carbajal-Degante E, Moreno-Hagelsieb G, Perez-Rueda E. DeepReg: a deep learning hybrid model for predicting transcription factors in eukaryotic and prokaryotic genomes. Sci Rep 2024; 14:9155. [PMID: 38644393 DOI: 10.1038/s41598-024-59487-5] [Received: 09/13/2023] [Accepted: 04/11/2024] [Indexed: 04/23/2024] Open
Abstract
Deep learning models (DLMs) have gained importance in predicting, detecting, translating, and classifying a diversity of inputs. In bioinformatics, DLMs have been used to predict protein structures, transcription factor-binding sites, and promoters. In this work, we propose a hybrid model to identify transcription factors (TFs) among prokaryotic and eukaryotic protein sequences, named Deep Regulation (DeepReg) model. Two architectures were used in the DL model: a convolutional neural network (CNN), and a bidirectional long-short-term memory (BiLSTM). DeepReg reached a precision of 0.99, a recall of 0.97, and an F1-score of 0.98. The quality of our predictions, the bias-variance trade-off approach, and the characterization of new TF predictions were evaluated and compared against those produced by DeepTFactor, as well as against experimental data from three model organisms. Predictions based on our DLM tended to exhibit less variance and bias than those from DeepTFactor, thus increasing reliability and decreasing overfitting.
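As a quick consistency check on the reported metrics, the F1-score is the harmonic mean of precision P and recall R. Using the rounded values quoted above (the authors' unrounded values may differ slightly):

```latex
F_1 = \frac{2PR}{P + R}
    = \frac{2 \times 0.99 \times 0.97}{0.99 + 0.97}
    = \frac{1.9206}{1.96}
    \approx 0.98
```

This agrees with the reported F1-score of 0.98.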
Affiliation(s)
- Leonardo Ledesma-Dominguez: Posgrado en Ciencia en Ingeniería de la Computación, Universidad Nacional Autónoma de México, 04510, Mexico City, Mexico; Instituto de Investigaciones en Matemáticas Aplicadas y en Sistemas, UNAM, 04510, Mexico City, México
- Erik Carbajal-Degante: Coordinación de Universidad Abierta y Educación Digital (CUAED), Universidad Nacional Autónoma de México, 04510, Mexico City, México
- Ernesto Perez-Rueda: Instituto de Investigaciones en Matemáticas Aplicadas y en Sistemas, Unidad Académica del Estado de Yucatán, Universidad Nacional Autónoma de México, Mérida, Yucatán, México
|
25
|
Pan L, Wang H, Yang B, Li W. A protein network refinement method based on module discovery and biological information. BMC Bioinformatics 2024; 25:157. [PMID: 38643108 PMCID: PMC11031909 DOI: 10.1186/s12859-024-05772-z] [Received: 02/21/2024] [Accepted: 04/10/2024] [Indexed: 04/22/2024] Open
Abstract
BACKGROUND The identification of essential proteins can help in understanding the minimum requirements for cell survival and development to discover drug targets and prevent disease. Nowadays, node ranking methods are a common way to identify essential proteins, but the poor data quality of the underlying PIN has somewhat hindered the identification accuracy of essential proteins for these methods in the PIN. Therefore, researchers constructed refinement networks by considering certain biological properties of interacting protein pairs to improve the performance of node ranking methods in the PIN. Studies show that proteins in a complex are more likely to be essential than proteins not present in the complex. However, the modularity is usually ignored for the refinement methods of the PINs. METHODS Based on this, we proposed a network refinement method based on module discovery and biological information. The idea is, first, to extract the maximal connected subgraph in the PIN, and to divide it into different modules by using Fast-unfolding algorithm; then, to detect critical modules according to the orthologous information, subcellular localization information and topology information within each module; finally, to construct a more refined network (CM-PIN) by using the identified critical modules. RESULTS To evaluate the effectiveness of the proposed method, we used 12 typical node ranking methods (LAC, DC, DMNC, NC, TP, LID, CC, BC, PR, LR, PeC, WDC) to compare the overall performance of the CM-PIN with those on the S-PIN, D-PIN and RD-PIN. The experimental results showed that the CM-PIN was optimal in terms of the identification number of essential proteins, precision-recall curve, Jackknifing method and other criteria, and can help to identify essential proteins more accurately.
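Two of the steps named in this abstract, extracting the maximal connected subgraph of the PIN and ranking nodes by degree centrality (DC), can be sketched in a few lines. This is an illustration under simplifying assumptions, not the authors' pipeline: module detection by Fast-unfolding and the biological scoring of modules are omitted, and the protein names are placeholders.

```python
from collections import defaultdict, deque


def build_graph(edges):
    """Undirected adjacency sets from a list of protein-protein interactions."""
    graph = defaultdict(set)
    for a, b in edges:
        graph[a].add(b)
        graph[b].add(a)
    return graph


def largest_component(graph):
    """Return the node set of the largest connected component (BFS)."""
    seen, best = set(), set()
    for start in list(graph):
        if start in seen:
            continue
        component, queue = {start}, deque([start])
        while queue:
            node = queue.popleft()
            for neighbor in graph[node]:
                if neighbor not in component:
                    component.add(neighbor)
                    queue.append(neighbor)
        seen |= component
        if len(component) > len(best):
            best = component
    return best


def degree_ranking(graph, nodes):
    """Rank nodes by degree within the given node set, highest first."""
    degree = {n: len(graph[n] & nodes) for n in nodes}
    return sorted(degree, key=lambda n: (-degree[n], n))


# Hypothetical toy network with two components.
edges = [("P1", "P2"), ("P2", "P3"), ("P2", "P4"), ("P5", "P6")]
graph = build_graph(edges)
core = largest_component(graph)
print(sorted(core))
print(degree_ranking(graph, core))
```

Node ranking methods such as DC treat the top-ranked proteins as essential-protein candidates, so a cleaner underlying network directly affects which proteins rise to the top.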
Affiliation(s)
- Li Pan: Hunan Institute of Science and Technology, Yueyang, 414006, China; Hunan Engineering Research Center of Multimodal Health Sensing and Intelligent Analysis, Yueyang, 414006, China
- Haoyue Wang: Hunan Institute of Science and Technology, Yueyang, 414006, China
- Bo Yang: Hunan Institute of Science and Technology, Yueyang, 414006, China; Hunan Engineering Research Center of Multimodal Health Sensing and Intelligent Analysis, Yueyang, 414006, China
- Wenbin Li: Hunan Institute of Science and Technology, Yueyang, 414006, China
|
26
|
Li Y, Du L, Meng L, Lv C, Tian X. High expression of CASP1 induces atherosclerosis. Medicine (Baltimore) 2024; 103:e37616. [PMID: 38640260 PMCID: PMC11030018 DOI: 10.1097/md.0000000000037616] [Received: 01/26/2024] [Revised: 02/17/2024] [Accepted: 02/23/2024] [Indexed: 04/21/2024] Open
Abstract
Atherosclerosis is a chronic, progressive vascular disease. The relationship between CASP1 gene expression and atherosclerosis remains unclear. The atherosclerosis datasets GSE132651 and GSE202625 were downloaded from the Gene Expression Omnibus. Differentially expressed genes (DEGs) were screened. Construction and analysis of a protein-protein interaction network, functional enrichment analysis, gene set enrichment analysis, and Comparative Toxicogenomics Database (CTD) analysis were performed. A gene expression heatmap was drawn. TargetScan was used to screen miRNAs that regulate the central DEGs. In total, 47 DEGs were identified. According to gene ontology analysis, they were mainly enriched in the regulation of stimulus response, response to organic matter, extracellular region, and the same protein binding. Kyoto Encyclopedia of Genes and Genomes analysis showed that the target cells were mainly enriched in the PI3K-Akt signaling pathway, Ras signaling pathway, and PPAR signaling pathway. In the Metascape enrichment analysis, vascular development, regulation of body fluid levels, and positive regulation of cell motility appeared among the enriched gene ontology terms. Eleven core genes (CASP1, NLRP3, MRC1, IRS1, PPARG, APOE, IL13, FGF2, CCR2, ICAM1, HIF1A) were obtained. IRS1, PPARG, APOE, FGF2, CCR2, and HIF1A were identified as core genes. The gene expression heatmap showed that CASP1 was highly expressed in atherosclerosis samples and expressed at low levels in normal samples. NLRP3, MRC1, IRS1, PPARG, APOE, IL13, FGF2, CCR2, ICAM1, and HIF1A were expressed at low levels in atherosclerosis samples. CTD analysis showed that 5 genes (CASP1, NLRP3, CCR2, ICAM1, HIF1A) were associated with pneumonia, inflammation, cardiac enlargement, and tumor invasiveness. The CASP1 gene is highly expressed in atherosclerosis. The higher the CASP1 expression, the worse the prognosis.
Affiliation(s)
- Yongchao Li: Department of Cardiac Surgery, Beijing Tsinghua Changgung Hospital, School of Clinical Medicine, Tsinghua University, Beijing, China
- Lihong Du: Department of Rheumatology and Clinical Immunology, Peking Union Medical College Hospital, Chinese Academy of Medical Sciences, Peking Union Medical College, National Clinical Research Center for Dermatologic and Immunologic Diseases (NCRC-DID), Ministry of Science & Technology, State Key Laboratory of Complex Severe and Rare Diseases, Peking Union Medical College Hospital, Key Laboratory of Rheumatology and Clinical Immunology, Ministry of Education, Beijing, China
- Lingbing Meng: Department of Cardiology, Beijing Tsinghua Changgung Hospital, School of Clinical Medicine, Tsinghua University, Beijing, China
- Chao Lv: Department of neurology, Pizhou Hospital Affiliated to Xuzhou Medical University, Pizhou People's Hospital, Pizhou, Jiangsu Province, China
- Xinping Tian: Department of Rheumatology and Clinical Immunology, Peking Union Medical College Hospital, Chinese Academy of Medical Sciences, Peking Union Medical College, National Clinical Research Center for Dermatologic and Immunologic Diseases (NCRC-DID), Ministry of Science & Technology, State Key Laboratory of Complex Severe and Rare Diseases, Peking Union Medical College Hospital, Key Laboratory of Rheumatology and Clinical Immunology, Ministry of Education, Beijing, China
|
27
|
Dua R, Bhardwaj T, Ahmad I, Somvanshi P. Investigating the potential of Juglans regia phytoconstituents for the treatment of cervical cancer utilizing network biology and molecular docking approach. PLoS One 2024; 19:e0287864. [PMID: 38626166 PMCID: PMC11020953 DOI: 10.1371/journal.pone.0287864] [Received: 06/13/2023] [Accepted: 01/22/2024] [Indexed: 04/18/2024] Open
Abstract
Cervical cancer is the fourth most frequent type of cancer in women and the leading cause of mortality for females worldwide. Traditionally, medicinal plants have been utilized to treat various illnesses and ailments. The current study uses molecular docking to investigate the possible anticancer effects of Juglans regia phytoconstituents on cervical cancer target proteins. This work analyzes the microarray dataset GSE63678 from the NCBI Gene Expression Omnibus database to find differentially expressed genes. Furthermore, a protein-protein interaction network of the differentially expressed genes was constructed using network biology techniques. The top five hub genes (IGF1, FGF2, ESR1, MYL9, and MYH11) were then determined by computing topological parameters with cytoHubba. In addition, molecular docking was performed between Juglans regia phytocompounds extracted from the IMPPAT database and the identified hub genes. Molecular dynamics simulation confirmed that the prioritized docked complexes with low binding energies were stable.
Affiliation(s)
- Riya Dua: School of Computational & Integrative Sciences (SCIS), Jawaharlal Nehru University, JNU Campus, New Delhi, India
- Tulika Bhardwaj: Department of Agricultural, Food and Nutritional Sciences, University of Alberta, Edmonton, Alberta, Canada
- Irshad Ahmad: College of Applied Medical Sciences, Department of Medical Rehabilitation Sciences, King Khalid University, Abha, Saudi Arabia
- Pallavi Somvanshi: School of Computational & Integrative Sciences (SCIS), Jawaharlal Nehru University, JNU Campus, New Delhi, India
|
28
|
Kou J, Bie Y, Liu M, Wang L, Liu X, Sun Y, Zheng X. Identification and bioinformatics analysis of lncRNAs in serum of patients with ankylosing spondylitis. BMC Musculoskelet Disord 2024; 25:291. [PMID: 38622662 PMCID: PMC11017588 DOI: 10.1186/s12891-024-07396-z] [Received: 03/08/2023] [Accepted: 03/29/2024] [Indexed: 04/17/2024] Open
Abstract
OBJECTIVES The aim of this study was to explore the long non-coding RNA (lncRNA) expression profiles in serum of patients with ankylosing spondylitis (AS). The role of these lncRNAs in this complex autoimmune situation needs to be evaluated. METHODS We used high-throughput whole-transcriptome sequencing to generate sequencing data from three patients with AS and three normal controls (NC). Then, we performed bioinformatics analyses to identify the functional and biological processes associated with differentially expressed lncRNAs (DElncRNAs). We confirmed the validity of our RNA-seq data by assessing the expression of eight lncRNAs via quantitative reverse transcription polymerase chain reaction (qRT-PCR) in 20 AS and 20 NC samples. We measured the correlation between the expression levels of lncRNAs and patient clinical index values using the Spearman correlation test. RESULTS We identified 72 significantly upregulated and 73 significantly downregulated lncRNAs in AS patients compared to NC. qRT-PCR was performed to validate the expression of selected DElncRNAs; the results demonstrated that the expression levels of MALAT1:24, NBR2:9, lnc-DLK1-35:13, lnc-LARP1-1:1, lnc-AIPL1-1:7, and lnc-SLC12A7-1:16 were consistent with the sequencing analysis results. Enrichment analysis showed that DElncRNAs mainly participated in the immune and inflammatory responses pathways, such as regulation of protein ubiquitination, major histocompatibility complex class I-mediated antigen processing and presentation, MAPkinase activation, and interleukin-17 signaling pathways. In addition, a competing endogenous RNA network was constructed to determine the interaction among the lncRNAs, microRNAs, and mRNAs based on the confirmed lncRNAs (MALAT1:24 and NBR2:9). We further found the expression of MALAT1:24 and NBR2:9 to be positively correlated with disease severity. 
CONCLUSION Taken together, our study presents a comprehensive overview of lncRNAs in the serum of AS patients, thereby contributing novel perspectives on the underlying pathogenic mechanisms of this condition. In addition, our study predicted MALAT1 has the potential to be deeply involved in the pathogenesis of AS.
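The Spearman correlation test used in this study to relate lncRNA expression to clinical index values can be sketched without external libraries. The implementation below ranks both variables (averaging ranks for ties) and computes the Pearson correlation of the ranks. The expression and clinical values are hypothetical, not data from the study, and no p-value is computed.

```python
def average_ranks(values):
    """Assign 1-based ranks, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)


# Hypothetical values: relative lncRNA expression vs. a clinical index.
expression = [1.2, 2.5, 3.1, 4.8, 5.0]
clinical_index = [10, 20, 15, 40, 35]
rho = spearman(expression, clinical_index)
print(round(rho, 3))
```

Because it works on ranks, the coefficient captures monotonic rather than strictly linear relationships, which suits small clinical samples.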
Affiliation(s)
- Jianqiang Kou: Department of Spinal Surgery, The Affiliated Hospital of Qingdao University, Qingdao, 266000, Shandong, China
- Yongchen Bie: Department of Spinal Surgery, The Affiliated Hospital of Qingdao University, Qingdao, 266000, Shandong, China
- Mingquan Liu: Department of Operating Room, The Affiliated Hospital of Qingdao University, Qingdao, 266000, Shandong, China
- Liqin Wang: Department of Rheumatology, The Affiliated Hospital of Qingdao University, Qingdao, 266000, Shandong, China
- Xiangyun Liu: Department of Spinal Surgery, The Affiliated Hospital of Qingdao University, Qingdao, 266000, Shandong, China
- Yuanliang Sun: Department of Spinal Surgery, The Affiliated Hospital of Qingdao University, Qingdao, 266000, Shandong, China
- Xiujun Zheng: Department of Spinal Surgery, The Affiliated Hospital of Qingdao University, Qingdao, 266000, Shandong, China
|
29
|
Qing KX, Lo ACY, Lu S, Zhou Y, Yang D, Yang D. Integrated bioinformatics analysis of retinal ischemia/reperfusion injury in rats with potential key genes. BMC Genomics 2024; 25:367. [PMID: 38622534 PMCID: PMC11017533 DOI: 10.1186/s12864-024-10288-0] [Received: 12/29/2023] [Accepted: 04/07/2024] [Indexed: 04/17/2024] Open
Abstract
The tissue damage caused by transient ischemic injury is an essential component of the pathogenesis of retinal ischemia, which mainly hinges on the degree and duration of interruption of the blood supply and the subsequent damage caused by tissue reperfusion. Some research has indicated that the retinal injury induced by ischemia-reperfusion (I/R) is related to reperfusion time. In this study, we used whole transcriptome sequencing to screen the differentially expressed circRNAs, lncRNAs, and mRNAs between the control and model groups and at different reperfusion times (24h, 72h, and 7d), and obtained the trend changes in time-varying mRNAs, lncRNAs, and circRNAs by chronological analysis. Candidate circRNAs, lncRNAs, and mRNAs were then obtained as the intersection of the differentially expressed genes and the trend-change genes. Key genes whose expression changed with increasing reperfusion time were selected by gene importance scores. In addition, the characteristic differentially expressed genes specific to each reperfusion time were analyzed, and key genes specific to reperfusion time were selected to show the change in biological process with increasing reperfusion time. As a result, 316 candidate mRNAs, 137 candidate lncRNAs, and 31 candidate circRNAs were obtained by intersecting the differentially expressed mRNAs, lncRNAs, and circRNAs with the trend mRNAs, trend lncRNAs, and trend circRNAs. Five key genes (Cd74, RT1-Da, RT1-CE5, RT1-Bb, RT1-DOa) were selected by gene importance scores. GSEA showed that the key genes play vital roles in antigen processing and presentation, regulation of the actin cytoskeleton, and the ribosome. 
Two networks were constructed: one comprising 4 key genes (Cd74, RT1-Da, RT1-Bb, RT1-DOa), 34 miRNAs, 48 lncRNAs, and 81 regulatory relationship axes, and one comprising the same 4 key genes, 9 miRNAs, 3 circRNAs (circRNA_10572, circRNA_03219, circRNA_11359), and 12 regulatory relationship axes. The subcellular location, transcription factors, signaling network, targeted drugs, and relationship to eye diseases of the key genes were predicted. In total, 1370 characteristic differentially expressed mRNAs (spec_24h mRNA), 558 characteristic differentially expressed mRNAs (spec_72h mRNA), and 92 characteristic differentially expressed mRNAs (spec_7d mRNA) were found, and their key genes and regulation networks were analyzed. In summary, we screened the differentially expressed circRNAs, lncRNAs, and mRNAs between the control and model groups and at different reperfusion times (24h, 72h, and 7d). Five key genes, Cd74, RT1-Da, RT1-CE5, RT1-Bb, and RT1-DOa, were selected. Key genes specific to reperfusion time were selected to show the change in biological process with increasing reperfusion time. These results provide theoretical support and a reference basis for clinical treatment.
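The candidate-selection logic in this abstract amounts to set operations, sketched below with placeholder gene identifiers. Two assumptions are made for illustration: candidates are taken as genes that are differentially expressed at any time point and also show a trend, and a gene counts as specific to a time point when it is differentially expressed at that time point only. The study may define these sets differently.

```python
def candidate_genes(de_by_time, trend_genes):
    """Genes differentially expressed at any time point that also show a trend."""
    all_de = set().union(*de_by_time.values())
    return sorted(all_de & set(trend_genes))


def time_specific_genes(de_by_time):
    """Genes differentially expressed at exactly one time point (assumed definition)."""
    specific = {}
    for time_point, genes in de_by_time.items():
        others = set().union(*(g for t, g in de_by_time.items() if t != time_point))
        specific[time_point] = sorted(set(genes) - others)
    return specific


# Placeholder identifiers, not genes from the study.
de_by_time = {
    "24h": {"g1", "g2", "g3"},
    "72h": {"g2", "g3", "g4"},
    "7d": {"g3", "g5"},
}
trend_genes = {"g2", "g3", "g6"}

print(candidate_genes(de_by_time, trend_genes))
print(time_specific_genes(de_by_time))
```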
Affiliation(s)
- Kai-Xiong Qing: Department of Cardiac & Vascular Surgery, First Affiliated Hospital of Kunming Medical University, Kunming Medical University, Kunming, Yunnan Province, China
- Amy C Y Lo: Department of Ophthalmology, Li Ka Shing Faculty of Medicine, The University of Hong Kong, Hong Kong, China
- Siduo Lu: Department of Ophthalmology, First Affiliated Hospital of Kunming Medical University, Kunming Medical University, Kunming, Yunnan Province, China
- You Zhou: Department of Ophthalmology, First Affiliated Hospital of Kunming Medical University, Kunming Medical University, Kunming, Yunnan Province, China
- Dan Yang: Department of Ophthalmology, First Affiliated Hospital of Kunming Medical University, Kunming Medical University, Kunming, Yunnan Province, China
- Di Yang: Department of Ophthalmology, First Affiliated Hospital of Kunming Medical University, Kunming Medical University, Kunming, Yunnan Province, China
|
30
|
Wu LY, Wijesekara Y, Piedade GJ, Pappas N, Brussaard CPD, Dutilh BE. Benchmarking bioinformatic virus identification tools using real-world metagenomic data across biomes. Genome Biol 2024; 25:97. [PMID: 38622738 PMCID: PMC11020464 DOI: 10.1186/s13059-024-03236-4] [Received: 06/06/2023] [Accepted: 04/01/2024] [Indexed: 04/17/2024] Open
Abstract
BACKGROUND As most viruses remain uncultivated, metagenomics is currently the main method for virus discovery. Detecting viruses in metagenomic data is not trivial. In the past few years, many bioinformatic virus identification tools have been developed for this task, making it challenging to choose the right tools, parameters, and cutoffs. As all these tools measure different biological signals, and use different algorithms and training and reference databases, it is imperative to conduct an independent benchmarking to give users objective guidance. RESULTS We compare the performance of nine state-of-the-art virus identification tools in thirteen modes on eight paired viral and microbial datasets from three distinct biomes, including a new complex dataset from Antarctic coastal waters. The tools have highly variable true positive rates (0-97%) and false positive rates (0-30%). PPR-Meta best distinguishes viral from microbial contigs, followed by DeepVirFinder, VirSorter2, and VIBRANT. Different tools identify different subsets of the benchmarking data and all tools, except for Sourmash, find unique viral contigs. Performance of tools improved with adjusted parameter cutoffs, indicating that adjustment of parameter cutoffs before usage should be considered. CONCLUSIONS Together, our independent benchmarking facilitates selecting choices of bioinformatic virus identification tools and gives suggestions for parameter adjustments to viromics researchers.
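The effect of parameter cutoffs on true and false positive rates, which this benchmark highlights, can be illustrated with paired viral and microbial contig scores. The sketch below assumes a tool that emits one score per contig and calls a contig viral when the score meets the cutoff. The scores are invented for illustration and do not come from any of the benchmarked tools.

```python
def rates_at_cutoff(viral_scores, microbial_scores, cutoff):
    """Return (true positive rate, false positive rate) at a score cutoff.

    Contigs from the viral dataset are treated as positives and contigs
    from the paired microbial dataset as negatives.
    """
    true_positives = sum(score >= cutoff for score in viral_scores)
    false_positives = sum(score >= cutoff for score in microbial_scores)
    return true_positives / len(viral_scores), false_positives / len(microbial_scores)


def sweep_cutoffs(viral_scores, microbial_scores, cutoffs):
    """Tabulate TPR and FPR for each candidate cutoff."""
    return {c: rates_at_cutoff(viral_scores, microbial_scores, c) for c in cutoffs}


# Hypothetical per-contig scores.
viral_scores = [0.95, 0.90, 0.70, 0.40]
microbial_scores = [0.80, 0.30, 0.20, 0.10, 0.05]

table = sweep_cutoffs(viral_scores, microbial_scores, [0.5, 0.85])
for cutoff, (tpr, fpr) in table.items():
    print(f"cutoff={cutoff}: TPR={tpr:.2f}, FPR={fpr:.2f}")
```

Raising the cutoff in this toy example removes the false positive at the cost of one true positive, which is the kind of trade-off a user weighs when adjusting cutoffs before use.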
Collapse
Affiliation(s)
- Ling-Yi Wu
- Theoretical Biology and Bioinformatics, Science4Life, Utrecht University, Padualaan 8, Utrecht, 3584 CH, The Netherlands
| | - Yasas Wijesekara
- Institute of Bioinformatics, University Medicine Greifswald, Felix Hausdorff Str. 8, 17475, Greifswald, Germany
| | - Gonçalo J Piedade
- Department Marine Microbiology and Biogeochemistry, NIOZ Royal Netherlands Institute for Sea Research, Den Burg, PO Box 59, Texel, 1790 AB, The Netherlands
- Institute for Biodiversity and Ecosystem Dynamics, University of Amsterdam, Amsterdam, The Netherlands
| | - Nikolaos Pappas
- Theoretical Biology and Bioinformatics, Science4Life, Utrecht University, Padualaan 8, Utrecht, 3584 CH, The Netherlands
- Corina P D Brussaard
- Department Marine Microbiology and Biogeochemistry, NIOZ Royal Netherlands Institute for Sea Research, Den Burg, PO Box 59, Texel, 1790 AB, The Netherlands
- Institute for Biodiversity and Ecosystem Dynamics, University of Amsterdam, Amsterdam, The Netherlands
- Bas E Dutilh
- Theoretical Biology and Bioinformatics, Science4Life, Utrecht University, Padualaan 8, Utrecht, 3584 CH, The Netherlands.
- Institute of Biodiversity, Faculty of Biological Sciences, Cluster of Excellence Balance of the Microverse, Friedrich Schiller University Jena, 07743, Jena, Germany.
31
Boob AG, Zhu Z, Intasian P, Jain M, Petrov V, Lane ST, Tan SI, Xun G, Zhao H. CRISPR-COPIES: an in silico platform for discovery of neutral integration sites for CRISPR/Cas-facilitated gene integration. Nucleic Acids Res 2024; 52:e30. [PMID: 38346683 PMCID: PMC11014336 DOI: 10.1093/nar/gkae062] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/15/2023] [Revised: 01/09/2024] [Accepted: 01/19/2024] [Indexed: 04/14/2024] Open
Abstract
The CRISPR/Cas system has emerged as a powerful tool for genome editing in metabolic engineering and human gene therapy. However, locating the optimal site on the chromosome to integrate heterologous genes using the CRISPR/Cas system remains an open question. Selecting a suitable site for gene integration involves considering multiple complex criteria, including factors related to CRISPR/Cas-mediated integration, genetic stability, and gene expression. Consequently, identifying such sites on specific or different chromosomal locations typically requires extensive characterization efforts. To address these challenges, we have developed CRISPR-COPIES, a COmputational Pipeline for the Identification of CRISPR/Cas-facilitated intEgration Sites. This tool leverages ScaNN, a state-of-the-art model on the embedding-based nearest neighbor search for fast and accurate off-target search, and can identify genome-wide intergenic sites for most bacterial and fungal genomes within minutes. As a proof of concept, we utilized CRISPR-COPIES to characterize neutral integration sites in three diverse species: Saccharomyces cerevisiae, Cupriavidus necator, and HEK293T cells. In addition, we developed a user-friendly web interface for CRISPR-COPIES (https://biofoundry.web.illinois.edu/copies/). We anticipate that CRISPR-COPIES will serve as a valuable tool for targeted DNA integration and aid in the characterization of synthetic biology toolkits, enable rapid strain construction to produce valuable biochemicals, and support human gene and cell therapy applications.
Affiliation(s)
- Aashutosh Girish Boob
- Department of Chemical and Biomolecular Engineering, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA
- Carl R. Woese Institute for Genomic Biology, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA
- DOE Center for Advanced Bioenergy and Bioproducts Innovation, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA
- Zhixin Zhu
- Department of Chemical and Biomolecular Engineering, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA
- Carl R. Woese Institute for Genomic Biology, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA
- Pattarawan Intasian
- Department of Chemical and Biomolecular Engineering, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA
- School of Biomolecular Science and Engineering, Vidyasirimedhi Institute of Science and Technology, Wangchan Valley, Rayong 21210, Thailand
- Manan Jain
- Carl R. Woese Institute for Genomic Biology, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA
- DOE Center for Advanced Bioenergy and Bioproducts Innovation, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA
- Vassily Andrew Petrov
- Carl R. Woese Institute for Genomic Biology, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA
- DOE Center for Advanced Bioenergy and Bioproducts Innovation, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA
- Stephan Thomas Lane
- Carl R. Woese Institute for Genomic Biology, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA
- DOE Center for Advanced Bioenergy and Bioproducts Innovation, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA
- Shih-I Tan
- Department of Chemical and Biomolecular Engineering, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA
- DOE Center for Advanced Bioenergy and Bioproducts Innovation, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA
- Guanhua Xun
- Carl R. Woese Institute for Genomic Biology, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA
- Department of Bioengineering, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA
- Huimin Zhao
- Department of Chemical and Biomolecular Engineering, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA
- Carl R. Woese Institute for Genomic Biology, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA
- Department of Bioengineering, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA
- DOE Center for Advanced Bioenergy and Bioproducts Innovation, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA
32
Dhaka B, Zimmerli M, Hanhart D, Moser M, Guillen-Ramirez H, Mishra S, Esposito R, Polidori T, Widmer M, García-Pérez R, Julio MKD, Pervouchine D, Melé M, Chouvardas P, Johnson R. Functional identification of cis-regulatory long noncoding RNAs at controlled false discovery rates. Nucleic Acids Res 2024; 52:2821-2835. [PMID: 38348970 PMCID: PMC11014264 DOI: 10.1093/nar/gkae075] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/31/2023] [Revised: 01/15/2024] [Accepted: 01/26/2024] [Indexed: 03/09/2024] Open
Abstract
A key attribute of some long noncoding RNAs (lncRNAs) is their ability to regulate expression of neighbouring genes in cis. However, such 'cis-lncRNAs' are presently defined using ad hoc criteria that, we show, are prone to false-positive predictions. The resulting lack of cis-lncRNA catalogues hinders our understanding of their extent, characteristics and mechanisms. Here, we introduce TransCistor, a framework for defining and identifying cis-lncRNAs based on enrichment of targets amongst proximal genes. TransCistor's simple and conservative statistical models are compatible with functionally defined target gene maps generated by existing and future technologies. Using transcriptome-wide perturbation experiments for 268 human and 134 mouse lncRNAs, we provide the first large-scale survey of cis-lncRNAs. Known cis-lncRNAs are correctly identified, including XIST, LINC00240 and UMLILO, and predictions are consistent across analysis methods, perturbation types and independent experiments. We detect cis-activity in a minority of lncRNAs, primarily involving activators over repressors. Cis-lncRNAs are detected by both RNA interference and antisense oligonucleotide perturbations. Mechanistically, cis-lncRNA transcripts are observed to physically associate with their target genes and are weakly enriched with enhancer elements. In summary, TransCistor establishes a quantitative foundation for cis-lncRNAs, opening a path to elucidating their molecular mechanisms and biological significance.
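The idea of testing for "enrichment of targets amongst proximal genes" can be illustrated with a one-sided hypergeometric test. This is only a sketch under stated assumptions: the abstract does not name the statistical model, so the choice of a hypergeometric tail, the function name `hypergeom_enrichment_p`, and all counts below are hypothetical and should not be read as TransCistor's actual implementation.

```python
from math import comb


def hypergeom_enrichment_p(n_genes, n_targets, n_proximal, n_proximal_targets):
    """One-sided P(X >= n_proximal_targets) for X ~ Hypergeometric.

    n_genes: all genes tested in the perturbation experiment.
    n_targets: genes whose expression changed after perturbing the lncRNA.
    n_proximal: genes lying within the chosen window around the lncRNA.
    n_proximal_targets: proximal genes that are also targets.
    """
    upper = min(n_targets, n_proximal)
    total = comb(n_genes, n_proximal)
    tail = sum(
        comb(n_targets, k) * comb(n_genes - n_targets, n_proximal - k)
        for k in range(n_proximal_targets, upper + 1)
    )
    return tail / total


# Hypothetical counts: 20 genes, 5 targets, 4 proximal genes, 3 of them targets.
p_value = hypergeom_enrichment_p(20, 5, 4, 3)
print(f"enrichment p-value: {p_value:.4f}")
```

A small p-value here would mean that targets are over-represented near the lncRNA relative to chance, which is the sense in which a lncRNA would be called cis-acting under this assumed model.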
Affiliation(s)
- Bhavya Dhaka
- School of Biology and Environmental Science, University College Dublin, Dublin D04 V1W8, Ireland
- Conway Institute for Biomolecular and Biomedical Research, University College Dublin, Dublin D04 V1W8, Ireland
- Marc Zimmerli
- Department of Medical Oncology, Inselspital, Bern University Hospital, University of Bern, Bern 3010, Switzerland
- Department for BioMedical Research, University of Bern, Bern 3008, Switzerland
- Daniel Hanhart
- Department of Medical Oncology, Inselspital, Bern University Hospital, University of Bern, Bern 3010, Switzerland
- Department for BioMedical Research, University of Bern, Bern 3008, Switzerland
- Mario B Moser
- Department of Medical Oncology, Inselspital, Bern University Hospital, University of Bern, Bern 3010, Switzerland
- Department for BioMedical Research, University of Bern, Bern 3008, Switzerland
- Hugo Guillen-Ramirez
- School of Biology and Environmental Science, University College Dublin, Dublin D04 V1W8, Ireland
- Conway Institute for Biomolecular and Biomedical Research, University College Dublin, Dublin D04 V1W8, Ireland
- Department of Medical Oncology, Inselspital, Bern University Hospital, University of Bern, Bern 3010, Switzerland
- Department for BioMedical Research, University of Bern, Bern 3008, Switzerland
- Department of Visceral Surgery and Medicine, Inselspital, Bern University Hospital, University of Bern, Bern, Switzerland
- Sanat Mishra
- Indian Institute of Science Education and Research, Mohali, India
- Roberta Esposito
- Department of Medical Oncology, Inselspital, Bern University Hospital, University of Bern, Bern 3010, Switzerland
- Department for BioMedical Research, University of Bern, Bern 3008, Switzerland
- Taisia Polidori
- Department of Medical Oncology, Inselspital, Bern University Hospital, University of Bern, Bern 3010, Switzerland
- Department for BioMedical Research, University of Bern, Bern 3008, Switzerland
- Maro Widmer
- Department of Medical Oncology, Inselspital, Bern University Hospital, University of Bern, Bern 3010, Switzerland
- Department for BioMedical Research, University of Bern, Bern 3008, Switzerland
- Raquel García-Pérez
- Department of Life Sciences, Barcelona Supercomputing Centre, Barcelona 08034, Spain
- Marianna Kruithof-de Julio
- Department of Urology, Inselspital, Bern University Hospital, University of Bern, Bern, Switzerland
- Urology Research Laboratory, Department for BioMedical Research, University of Bern, 3008, Bern, Switzerland
- Dmitri Pervouchine
- Center for Cellular and Molecular Biology, Skolkovo Institute of Science and Technology, Moscow, Russia
- Marta Melé
- Department of Life Sciences, Barcelona Supercomputing Centre, Barcelona 08034, Spain
- Panagiotis Chouvardas
- Department of Medical Oncology, Inselspital, Bern University Hospital, University of Bern, Bern 3010, Switzerland
- Department for BioMedical Research, University of Bern, Bern 3008, Switzerland
- Department of Urology, Inselspital, Bern University Hospital, University of Bern, Bern, Switzerland
- Urology Research Laboratory, Department for BioMedical Research, University of Bern, 3008, Bern, Switzerland
- Rory Johnson
- School of Biology and Environmental Science, University College Dublin, Dublin D04 V1W8, Ireland
- Conway Institute for Biomolecular and Biomedical Research, University College Dublin, Dublin D04 V1W8, Ireland
- Department of Medical Oncology, Inselspital, Bern University Hospital, University of Bern, Bern 3010, Switzerland
- Department for BioMedical Research, University of Bern, Bern 3008, Switzerland
- FutureNeuro SFI Research Centre, University College Dublin, Dublin D04 V1W8, Ireland
33
Yang M, Zhang S, Zheng Z, Zhang P, Liang Y, Tang S. Employing bimodal representations to predict DNA bendability within a self-supervised pre-trained framework. Nucleic Acids Res 2024; 52:e33. [PMID: 38375921 PMCID: PMC11014357 DOI: 10.1093/nar/gkae099] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/15/2023] [Revised: 01/10/2024] [Accepted: 02/01/2024] [Indexed: 02/21/2024] Open
Abstract
The bendability of genomic DNA, which measures the DNA looping rate, is crucial for numerous biological processes of DNA. Recently, an advanced high-throughput technique known as 'loop-seq' has made it possible to measure the inherent cyclizability of DNA fragments. However, quantifying the bendability of large-scale DNA is costly, laborious, and time-consuming. To close the gap between rapidly evolving large language models and expanding genomic sequence information, and to elucidate the impact of DNA bendability on critical regulatory sequence motifs such as super-enhancers in the human genome, we introduce an innovative computational model, named MIXBend, to forecast DNA bendability utilizing both nucleotide sequences and physicochemical properties. In MIXBend, a pre-trained language model, DNABERT, and a convolutional neural network with an attention mechanism are utilized to construct both sequence- and physicochemical-based extractors for the sophisticated refinement of DNA sequence representations. These bimodal DNA representations are then fed to a k-mer sequence-physicochemistry matching module to minimize the semantic gap between each modality. Lastly, a self-attention fusion layer is employed for the prediction of DNA bendability. In conclusion, the experimental results validate MIXBend's superior performance relative to other state-of-the-art methods. Additionally, MIXBend reveals both novel and known motifs from yeast. Moreover, MIXBend discovers significant bendability fluctuations within super-enhancer regions and transcription factor binding sites in the human genome.
Affiliation(s)
- Minghao Yang
- Bioscience and Biomedical Engineering Thrust, System Hub, Hong Kong University of Science and Technology (Guangzhou), Guangzhou 511466, China
- Shichen Zhang
- Bioscience and Biomedical Engineering Thrust, System Hub, Hong Kong University of Science and Technology (Guangzhou), Guangzhou 511466, China
- Zhihang Zheng
- Bioscience and Biomedical Engineering Thrust, System Hub, Hong Kong University of Science and Technology (Guangzhou), Guangzhou 511466, China
- Pengfei Zhang
- Bioscience and Biomedical Engineering Thrust, System Hub, Hong Kong University of Science and Technology (Guangzhou), Guangzhou 511466, China
- Yan Liang
- School of Artificial Intelligence, South China Normal University, Foshan 528225, China
- Shaojun Tang
- Bioscience and Biomedical Engineering Thrust, System Hub, Hong Kong University of Science and Technology (Guangzhou), Guangzhou 511466, China
- Division of Life Science, Hong Kong University of Science and Technology, Hong Kong SAR 999077, China
34
Ponne S, Kumar R, Vanmathi SM, Brilhante RSN, Kumar CR. Reverse engineering protection: A comprehensive survey of reverse vaccinology-based vaccines targeting viral pathogens. Vaccine 2024; 42:2503-2518. [PMID: 38523003 DOI: 10.1016/j.vaccine.2024.02.087] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/20/2023] [Revised: 01/30/2024] [Accepted: 02/27/2024] [Indexed: 03/26/2024]
Abstract
Vaccines have significantly reduced the impact of numerous deadly viral infections. However, there is an increasing need to expedite vaccine development in light of the recurrent pandemics and epidemics. Also, identifying vaccines against certain viruses is challenging due to various factors, notably the inability to culture certain viruses in cell cultures and the wide-ranging diversity of MHC profiles in humans. Fortunately, reverse vaccinology (RV) efficiently overcomes these limitations and has simplified the identification of epitopes from antigenic proteins across the entire proteome, streamlining the vaccine development process. Furthermore, it enables the creation of multiepitope vaccines that can effectively account for the variations in MHC profiles within the human population. The RV approach offers numerous advantages in developing precise and effective vaccines against viral pathogens, including extensive proteome coverage, accurate epitope identification, cross-protection capabilities, and MHC compatibility. With the introduction of RV, there is a growing emphasis among researchers on creating multiepitope-based vaccines aiming to stimulate the host's immune responses against multiple serotypes, as opposed to single-component monovalent alternatives. Regardless of how promising the RV-based vaccine candidates may appear, they must undergo experimental validation to probe their protection efficacy for real-world applications. The time, effort, and resources allocated to the laborious epitope identification process can now be redirected toward validating vaccine candidates identified through the RV approach. However, to overcome failures in the RV-based approach, efforts must be made to incorporate immunological principles and consider targeting the epitope regions involved in disease pathogenesis, immune responses, and neutralizing antibody maturation. 
Integrating multi-omics and incorporating artificial intelligence and machine learning-based tools and techniques in RV would increase the chances of developing an effective vaccine. This review thoroughly explains the RV approach, ideal RV-based vaccine construct components, RV-based vaccines designed to combat viral pathogens, its challenges, and future perspectives.
Affiliation(s)
- Saravanaraman Ponne
- Department of Medical Biotechnology, Aarupadai Veedu Medical College and Hospital, Vinayaka Mission's Research Foundation (Deemed to be University), Kirumampakkam, Puducherry 607402, India
- Rajender Kumar
- Division of Glycoscience, Department of Chemistry, School of Engineering Sciences in Chemistry, Biotechnology and Health, KTH Royal Institute of Technology, AlbaNova University Center, Stockholm 106 91, Sweden
- S M Vanmathi
- Mahatma Gandhi Medical Advanced Research Institute, Sri Balaji Vidyapeeth (Deemed to be University), Pondicherry 607402, India
- Raimunda Sâmia Nogueira Brilhante
- Medical Mycology Specialized Center, Department of Pathology and Legal Medicine, Federal University of Ceará, Fortaleza, Ceará, Brazil
- Chinnadurai Raj Kumar
- Mahatma Gandhi Medical Advanced Research Institute, Sri Balaji Vidyapeeth (Deemed to be University), Pondicherry 607402, India.
35
Li H, Meng J, Wang Z, Tang Y, Xia S, Wang Y, Qin Z, Luan Y. miPEPPred-FRL: A Novel Method for Predicting Plant MiRNA-Encoded Peptides Using Adaptive Feature Representation Learning. J Chem Inf Model 2024; 64:2889-2900. [PMID: 37733290 DOI: 10.1021/acs.jcim.3c01020] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/22/2023]
Abstract
MicroRNAs (miRNAs) are an essential type of small molecule RNAs that play significant regulatory roles in organisms. Recent studies have demonstrated that small open reading frames (sORFs) harbored in primary miRNAs (pri-miRNAs) can encode small peptides, known as miPEPs. Plant miPEPs can increase the abundance and activity of cognate miRNAs by promoting the transcription of their corresponding pri-miRNAs, thereby modulating plant traits. Biological experiments are the most effective way to accurately identify miPEPs; however, they are time-consuming and expensive. Hence, an efficient computational method for the identification of miPEPs on a large scale is highly desirable. Up to now, there have been no specialized computational tools for identifying miPEPs. In this work, a novel predictor named miPEPPred-FRL based on an adaptive feature representation learning framework that consists of the feature transformation module and the cascade architecture has been proposed. The feature transformation module integrating a newly designed feature selection method and classifier selection rule is developed to convert sequence-based features into primary class and probabilistic features, which are then fed into the improved cascade architecture to obtain more stable and discriminative augmented features. Finally, the augmented features are utilized to construct the final predictor. Cross-validation experiments illustrate that the novel feature selection method and classifier selection rule contribute to boosting the feature representation ability of the framework. Furthermore, the high accuracy of miPEPPred-FRL on independent testing data suggests that it is a trustworthy and valuable tool for the identification of miPEPs.
Affiliation(s)
- Haibin Li
- School of Computer Science and Technology, Dalian University of Technology, Dalian, Liaoning 116024, China
- Jun Meng
- School of Computer Science and Technology, Dalian University of Technology, Dalian, Liaoning 116024, China
- Zhaowei Wang
- School of Computer Science and Technology, Dalian University of Technology, Dalian, Liaoning 116024, China
- Youwei Tang
- School of Computer Science and Technology, Dalian University of Technology, Dalian, Liaoning 116024, China
- Shihao Xia
- School of Computer Science and Technology, Dalian University of Technology, Dalian, Liaoning 116024, China
- Yu Wang
- School of Computer Science and Technology, Dalian University of Technology, Dalian, Liaoning 116024, China
- Zhaojing Qin
- School of Computer Science and Technology, Dalian University of Technology, Dalian, Liaoning 116024, China
- Yushi Luan
- School of Bioengineering, Dalian University of Technology, Dalian, Liaoning 116024, China
36
Rafiei F, Zeraati H, Abbasi K, Razzaghi P, Ghasemi JB, Parsaeian M, Masoudi-Nejad A. CFSSynergy: Combining Feature-Based and Similarity-Based Methods for Drug Synergy Prediction. J Chem Inf Model 2024; 64:2577-2585. [PMID: 38514966 DOI: 10.1021/acs.jcim.3c01486] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/23/2024]
Abstract
Drug synergy prediction plays a vital role in cancer treatment. Because experimental approaches are labor-intensive and expensive, computational approaches are receiving more attention. There are two types of computational methods for drug synergy prediction: feature-based and similarity-based. In feature-based methods, the main focus is to extract more discriminative features from drug pairs and cell lines to pass to the task predictor. In similarity-based methods, the similarities among all drugs and cell lines are utilized as features and fed into the task predictor. In this work, a novel approach, called CFSSynergy, that combines these two viewpoints is proposed. First, a discriminative representation is extracted for paired drugs and cell lines as input. We have utilized a transformer-based architecture for drugs. For cell lines, we have created a similarity matrix between proteins using the Node2Vec algorithm. Then, the new cell line representation is computed by multiplying the protein-protein similarity matrix and the initial cell line representation. Next, we compute the similarity between unique drugs and unique cells using the learned representation for paired drugs and cell lines. Then, we compute a new representation for paired drugs and cell lines based on the similarity-based features and the learned features. Finally, these features are fed to XGBoost as a task predictor. Two well-known data sets were used to evaluate the performance of our proposed method: DrugCombDB and OncologyScreen. The CFSSynergy approach consistently outperformed existing methods in comparative evaluations. This substantiates the efficacy of our approach in capturing complex synergistic interactions between drugs and cell lines, setting it apart from conventional similarity-based or feature-based methods.
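The cell-line step described in this abstract, multiplying a protein-protein similarity matrix by the initial cell line representation, can be sketched as a plain matrix-vector product. This is an illustration only: the abstract does not give the shapes or the multiplication order, so the sketch assumes a square similarity matrix applied to a per-protein feature vector. The name `refine_cell_line` and all numbers are hypothetical.

```python
def refine_cell_line(similarity, cell_line):
    """Multiply a protein-protein similarity matrix by a cell-line vector.

    similarity: square list of lists, similarity[i][j] between proteins i and j.
    cell_line: one initial feature value per protein for a single cell line.
    """
    if any(len(row) != len(cell_line) for row in similarity):
        raise ValueError("each similarity row needs one entry per protein")
    return [
        sum(weight * value for weight, value in zip(row, cell_line))
        for row in similarity
    ]


# Hypothetical 3-protein example: proteins 0 and 1 are similar, protein 2 is not.
similarity = [
    [1.0, 0.5, 0.0],
    [0.5, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]
cell_line = [2.0, 0.0, 4.0]

refined = refine_cell_line(similarity, cell_line)
print(refined)
```

Under these assumptions, protein 1 has no signal of its own but inherits part of protein 0's value through their similarity, which is the smoothing effect such a product would have.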
Affiliation(s)
- Fatemeh Rafiei
- Department of Epidemiology and Biostatistics, School of Health, Tehran University of Medical Sciences, Tehran 14167-53955, Iran
- Hojjat Zeraati
- Department of Epidemiology and Biostatistics, School of Health, Tehran University of Medical Sciences, Tehran 14167-53955, Iran
- Karim Abbasi
- Laboratory of System Biology, Bioinformatics & Artificial Intelligence in Medicine (LBB&AI), Faculty of Mathematics and Computer Science, Kharazmi University, Tehran 14588-89694, Iran
- Parvin Razzaghi
- Department of Computer Science and Information Technology, Institute for Advanced Studies in Basic Sciences (IASBS), Zanjan 45137-66731, Iran
- Jahan B Ghasemi
- Chemistry Department, Faculty of Chemistry, School of Sciences, University of Tehran, Tehran 14174-66191, Iran
- Mahboubeh Parsaeian
- Department of Epidemiology and Biostatistics, School of Health, Tehran University of Medical Sciences, Tehran 14167-53955, Iran
- Cancer Epidemiology Unit, Nuffield Department of Population Health, University of Oxford, Oxford OX3 7LF, U.K
- Ali Masoudi-Nejad
- Laboratory of Systems Biology and Bioinformatics (LBB), Institute of Biochemistry and Biophysics, University of Tehran, Tehran 13145-1365, Iran
37
He J, Li M, Qiu J, Pu X, Guo Y. HOPEXGB: A Consensual Model for Predicting miRNA/lncRNA-Disease Associations Using a Heterogeneous Disease-miRNA-lncRNA Information Network. J Chem Inf Model 2024; 64:2863-2877. [PMID: 37604142 DOI: 10.1021/acs.jcim.3c00856] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Indexed: 08/23/2023]
Abstract
Predicting disease-related microRNAs (miRNAs) and long noncoding RNAs (lncRNAs) is crucial to find new biomarkers for the prevention, diagnosis, and treatment of complex human diseases. Computational predictions for miRNA/lncRNA-disease associations are of great practical significance, since traditional experimental detection is expensive and time-consuming. In this paper, we proposed a consensual machine-learning technique-based prediction approach to identify disease-related miRNAs and lncRNAs by high-order proximity preserved embedding (HOPE) and eXtreme Gradient Boosting (XGB), named HOPEXGB. By connecting lncRNA, miRNA, and disease nodes based on their correlations and relationships, we first created a heterogeneous disease-miRNA-lncRNA (DML) information network to achieve an effective fusion of information on similarities, correlations, and interactions among miRNAs, lncRNAs, and diseases. In addition, a more rational negative data set was generated based on the similarities of unknown associations with the known ones, so as to effectively reduce the false negative rate in the data set for model construction. By 10-fold cross-validation, HOPE shows better performance than other graph embedding methods. The final consensual HOPEXGB model yields robust performance with a mean prediction accuracy of 0.9569 and also demonstrates high sensitivity and specificity advantages compared to lncRNA/miRNA-specific predictions. Moreover, it is superior to other existing methods and gives promising performance on the external testing data, indicating that integrating the information on lncRNA-miRNA interactions and the similarities of lncRNAs/miRNAs is beneficial for improving the prediction performance of the model. Finally, case studies on lung, stomach, and breast cancers indicate that HOPEXGB could be a powerful tool for preclinical biomarker detection and bioexperiment preliminary screening for the diagnosis and prognosis of cancers. 
HOPEXGB is publicly available at https://github.com/airpamper/HOPEXGB.
Affiliation(s)
- Jian He
- College of Chemistry, Sichuan University, Chengdu 610064, China
- Menglong Li
- College of Chemistry, Sichuan University, Chengdu 610064, China
- Jiangguo Qiu
- College of Chemistry, Sichuan University, Chengdu 610064, China
- Xuemei Pu
- College of Chemistry, Sichuan University, Chengdu 610064, China
- Yanzhi Guo
- College of Chemistry, Sichuan University, Chengdu 610064, China
38
Yang TH, Chen JC, Lee YH, Lu SY, Wu SH, Chang FY, Huang YC, Lee MH, Tseng YY, Wu WS. Identifying Human miRNA Target Sites via Learning the Interaction Patterns between miRNA and mRNA Segments. J Chem Inf Model 2024; 64:2445-2453. [PMID: 37903033 DOI: 10.1021/acs.jcim.3c01150] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/01/2023]
Abstract
miRNAs (microRNAs) target specific mRNA (messenger RNA) sites to regulate their translation expression. Although miRNA targeting can rely on seed region base pairing, animal miRNAs, including human miRNAs, typically cooperate with several cofactors, leading to various noncanonical pairing rules. Therefore, identifying the binding sites of animal miRNAs remains challenging. Because experiments for mapping miRNA targets are costly, computational methods are preferred for extracting potential miRNA-mRNA fragment binding pairs first. However, existing prediction tools can have significant false positives due to the prevalent noncanonical miRNA binding behaviors and the information-biased training negative sets that were used while constructing these tools. To overcome these obstacles, we first prepared an information-balanced miRNA binding pair ground-truth data set. A miRNA-mRNA interaction-aware model was then designed to help identify miRNA binding events. On the test set, our model (auROC = 94.4%) outperformed existing models by at least 2.8% in auROC. Furthermore, we showed that this model can suggest potential binding patterns for miRNA-mRNA sequence interacting pairs. Finally, we made the prepared data sets and the designed model available at http://cosbi2.ee.ncku.edu.tw/mirna_binding/download.
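For context on the "seed region base pairing" mentioned in this abstract, the following minimal sketch scans an mRNA fragment for canonical seed matches. It is not the authors' model: it assumes the common convention that the seed spans miRNA nucleotides 2 to 8 and allows only Watson-Crick pairs, so it deliberately misses the noncanonical pairings the abstract highlights. The function name `seed_sites` and the mRNA fragment are hypothetical.

```python
# Watson-Crick complement table for RNA.
COMPLEMENT = str.maketrans("AUGC", "UACG")


def seed_sites(mirna, mrna):
    """Return 0-based mRNA positions where a canonical seed match starts.

    Both sequences are written 5' to 3'. The seed is taken as miRNA
    nucleotides 2-8; the matching mRNA site is its reverse complement.
    """
    seed = mirna[1:8]
    site = seed.translate(COMPLEMENT)[::-1]
    return [
        start
        for start in range(len(mrna) - len(site) + 1)
        if mrna.startswith(site, start)
    ]


# let-7a-5p sequence; the mRNA fragment is invented for illustration.
mirna = "UGAGGUAGUAGGUUGUAUAGUU"
mrna = "AAGCUACCUCAGG"

print(seed_sites(mirna, mrna))
```

A rule this strict yields no candidates for sites that pair outside the seed or through wobble pairs, which is one reason learned interaction models are proposed as an alternative.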
Affiliation(s)
- Tzu-Hsien Yang
- Department of Biomedical Engineering, National Cheng Kung University, No.1, University Road, Tainan 701, Taiwan
- Medical Device Innovation Center, National Cheng Kung University, No.1 University Road, Tainan 701, Taiwan
- Jhih-Cheng Chen
- Department of Electrical Engineering, National Cheng Kung University, No.1, University Road, Tainan 701, Taiwan
- Yuan-Han Lee
- Department of Electrical Engineering, National Cheng Kung University, No.1, University Road, Tainan 701, Taiwan
- Shang-Yi Lu
- Department of Electrical Engineering, National Cheng Kung University, No.1, University Road, Tainan 701, Taiwan
- Sheng-Hang Wu
- Department of Information Management, National University of Kaohsiung, Kaohsiung University Rd, Kaohsiung 811, Taiwan
- Fang-Yuan Chang
- Department of Information Management, National University of Kaohsiung, Kaohsiung University Rd, Kaohsiung 811, Taiwan
- Yan-Cheng Huang
- Department of Electrical Engineering, National Cheng Kung University, No.1, University Road, Tainan 701, Taiwan
- Mei-Hsien Lee
- Department of Mathematics, University of Taipei, No.1, Ai-Guo West Road, Taipei 100234, Taiwan
- Yan-Yuan Tseng
- Center for Molecular Medicine and Genetics, Wayne State University, School of Medicine, Detroit, Michigan 48201, United States
- Wei-Sheng Wu
- Department of Electrical Engineering, National Cheng Kung University, No.1, University Road, Tainan 701, Taiwan
| |
Collapse
|
39
|
Lin W, Wells J, Wang Z, Orengo C, Martin ACR. Enhancing missense variant pathogenicity prediction with protein language models using VariPred. Sci Rep 2024; 14:8136. [PMID: 38584172 PMCID: PMC10999449 DOI: 10.1038/s41598-024-51489-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/20/2023] [Accepted: 01/05/2024] [Indexed: 04/09/2024] Open
Abstract
Computational approaches for predicting the pathogenicity of genetic variants have advanced in recent years. These methods enable researchers to determine the possible clinical impact of rare and novel variants. Historically these prediction methods used hand-crafted features based on structural, evolutionary, or physiochemical properties of the variant. In this study we propose a novel framework that leverages the power of pre-trained protein language models to predict variant pathogenicity. We show that our approach VariPred (Variant impact Predictor) outperforms current state-of-the-art methods by using an end-to-end model that only requires the protein sequence as input. Using one of the best-performing protein language models (ESM-1b), we establish a robust classifier that requires no calculation of structural features or multiple sequence alignments. We compare the performance of VariPred with other representative models including 3Cnet, Polyphen-2, REVEL, MetaLR, FATHMM and ESM variant. VariPred performs as well as, or in most cases better than these other predictors using six variant impact prediction benchmarks despite requiring only sequence data and no pre-processing of the data.
Affiliation(s)
- Weining Lin
  - Division of Biosciences, Institute of Structural and Molecular Biology, University College London, London, UK
- Jude Wells
  - Department of Computer Science, University College London, London, UK
- Zeyuan Wang
  - College of Computer Science and Technology, Zhejiang University, Zhejiang, China
- Christine Orengo
  - Division of Biosciences, Institute of Structural and Molecular Biology, University College London, London, UK.
- Andrew C R Martin
  - Division of Biosciences, Institute of Structural and Molecular Biology, University College London, London, UK.
|
40
|
Zhao LZ, Liang Y, Yin T, Liao HL, Liang B. Identification of Potential Crucial Biomarkers in STEMI Through Integrated Bioinformatic Analysis. Arq Bras Cardiol 2024; 121:e20230462. [PMID: 38597542 DOI: 10.36660/abc.20230462] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/12/2023] [Accepted: 11/14/2023] [Indexed: 04/11/2024] Open
Abstract
BACKGROUND ST-segment elevation myocardial infarction (STEMI) is one of the leading causes of fatal cardiovascular diseases, which have been the prime cause of mortality worldwide. Diagnosis in the early phase would benefit clinical intervention and prognosis, but the exploration of the biomarkers of STEMI is still lacking.
OBJECTIVES In this study, we conducted a bioinformatics analysis to identify potential crucial biomarkers in the progression of STEMI.
METHODS We obtained GSE59867 for STEMI and stable coronary artery disease (SCAD) patients. Differentially expressed genes (DEGs) were screened with the thresholds |log2 fold change| > 0.5 and p < 0.05. Based on these genes, we conducted enrichment analysis to explore the potential relevance between genes and to screen hub genes. Subsequently, hub genes were analyzed to detect related miRNAs, and DAVID was used to detect transcription factors for further analysis. Finally, GSE62646 was utilized to assess DEG specificity; genes with AUC values exceeding 75% were considered potential candidate biomarkers.
RESULTS A total of 133 DEGs between SCAD and STEMI were obtained. Then, the PPI network of DEGs was constructed using STRING and Cytoscape, and further analysis determined hub genes and 6 molecular complexes. Functional enrichment analysis of the DEGs suggests that pathways related to inflammation, metabolism, and immunity play a pivotal role in the progression from SCAD to STEMI. In addition, related miRNAs were predicted; hsa-miR-124, hsa-miR-130a/b, and hsa-miR-301a/b regulated the expression of the largest number of genes. Meanwhile, transcription factor analysis indicated that EVI1, AML1, GATA1, and PPARG were the most enriched. Finally, ROC curves demonstrated that MS4A3, KLRC4, KLRD1, AQP9, and CD14 exhibit both high sensitivity and specificity in predicting STEMI.
CONCLUSIONS This study revealed that immunity, metabolism, and inflammation are involved in the development of STEMI derived from SCAD, and 6 genes, including MS4A3, KLRC4, KLRD1, AQP9, CD14, and CCR1, could be employed as candidate biomarkers for STEMI.
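The DEG screening rule stated in the methods (|log2 fold change| > 0.5 and p < 0.05) can be written as a small filter. This is a minimal sketch, not the authors' pipeline: the function name, the tuple layout, and the toy rows are hypothetical, and the abstract does not say whether raw or adjusted p-values were used.

```python
# Minimal DEG filter using the thresholds quoted in the abstract.
# Rows are (gene, log2_fold_change, p_value); the data below are made up.
def filter_degs(rows, lfc_cutoff=0.5, p_cutoff=0.05):
    """Keep genes whose |log2FC| exceeds lfc_cutoff and whose p is below p_cutoff."""
    return [
        gene
        for gene, log2_fc, p_value in rows
        if abs(log2_fc) > lfc_cutoff and p_value < p_cutoff
    ]


if __name__ == "__main__":
    toy_rows = [
        ("GENE_A", 1.2, 0.001),   # passes both thresholds
        ("GENE_B", -0.8, 0.03),   # down-regulated, passes
        ("GENE_C", 0.3, 0.0001),  # effect too small
        ("GENE_D", 2.0, 0.2),     # not significant
        ("GENE_E", 0.5, 0.01),    # exactly at the cutoff, excluded by the strict ">"
    ]
    print(filter_degs(toy_rows))
```

The strict inequality follows the abstract's wording; a study using "≥" would keep the boundary case.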
Affiliation(s)
- Li-Zhi Zhao
  - The Affiliated Traditional Chinese Medicine Hospital, Southwest Medical University, Luzhou - China
  - College of Integration of Traditional Chinese and Western Medicine, Southwest Medical University, Luzhou - China
- Yi Liang
  - Department of Geriatrics, Sichuan Second Hospital of T.C.M., Chengdu - China
- Ting Yin
  - Department of Cardiology, The Second Affiliated Hospital, School of Medicine, Zhejiang University, Hangzhou - China
- Hui-Ling Liao
  - The Affiliated Traditional Chinese Medicine Hospital, Southwest Medical University, Luzhou - China
  - College of Integration of Traditional Chinese and Western Medicine, Southwest Medical University, Luzhou - China
- Bo Liang
  - Department of Nephrology, The Key Laboratory for the Prevention and Treatment of Chronic Kidney Disease of Chongqing, Chongqing Clinical Research Center of Kidney and Urology Diseases, Xinqiao Hospital, Army Medical University (Third Military Medical University), Chongqing - China
|
41
|
Alam MJ, Rahman MH, Hossain MA, Hoque MR, Aktaruzzaman M. Bioinformatics and Systems Biology Approaches to Identify the Synergistic Effects of Alcohol Use Disorder on the Progression of Neurological Diseases. Neuroscience 2024; 543:65-82. [PMID: 38401711 DOI: 10.1016/j.neuroscience.2024.02.015] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/12/2023] [Revised: 02/14/2024] [Accepted: 02/16/2024] [Indexed: 02/26/2024]
Abstract
Clinical investigations showed that individuals with Alcohol Use Disorder (AUD) have worse Neurological Disease (ND) development, pointing to possible pathogenic relationships between AUD and NDs. It remains difficult to identify risk factors that predispose to both AUD and NDs. To address these issues, we created a bioinformatics pipeline and network-based approaches that use unbiased methods to discover genes abnormally expressed in both AUD and NDs and to pinpoint some of the common molecular pathways that might underlie the AUD and ND interaction. We found 100 differentially expressed genes (DEGs) in both the AUD and ND patients' tissue samples. The most important Gene Ontology (GO) terms and metabolic pathways, including positive control of cytotoxicity caused by T cells, proinflammatory responses, antigen processing and presentation, and platelet-triggered interactions with vascular and circulating cell pathways, were then extracted using the overlapping DEGs. Protein-protein interaction analysis was used to identify hub proteins, including CCL2, IL1B, TH, MYCN, HLA-DRB1, SLC17A7, and HNF4A, in the pathways that have been reported as playing a role in these disorders. We determined that several TFs (HNF4A, C4A, HLA-B, SNCA, HLA-DMB, SLC17A7, HLA-DRB1, HLA-C, HLA-A, and HLA-DPB1) and potential miRNAs (hsa-mir-34a-5p, hsa-mir-34c-5p, hsa-mir-449a, hsa-mir-155-5p, and hsa-mir-1-3p) were crucial for regulating gene expression in AUD and NDs and could serve as prospective targets for treatment. Our methodologies discovered unique putative biomarkers that point to the interaction between AUD and various neurological disorders, as well as pathways that could one day be the focus of therapeutic intervention.
Affiliation(s)
- Md Jahangir Alam
  - Department of Computer Science and Engineering, Islamic University, Kushtia 7003, Bangladesh; Center for Advanced Bioinformatics and Artificial Intelligence Research, Islamic University, Kushtia 7003, Bangladesh
- Md Habibur Rahman
  - Department of Computer Science and Engineering, Islamic University, Kushtia 7003, Bangladesh; Center for Advanced Bioinformatics and Artificial Intelligence Research, Islamic University, Kushtia 7003, Bangladesh.
- Md Arju Hossain
  - Department of Biotechnology and Genetic Engineering, Mawlana Bhashani Science and Technology University, Santosh, Tangail 1902, Bangladesh; Department of Microbiology, Primeasia University, Banani, Dhaka 1213, Bangladesh
- Md Robiul Hoque
  - Department of Computer Science and Engineering, Islamic University, Kushtia 7003, Bangladesh
- Md Aktaruzzaman
  - Department of Computer Science and Engineering, Islamic University, Kushtia 7003, Bangladesh
|
42
|
Arif M, Fang G, Ghulam A, Musleh S, Alam T. DPI_CDF: druggable protein identifier using cascade deep forest. BMC Bioinformatics 2024; 25:145. [PMID: 38580921 DOI: 10.1186/s12859-024-05744-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/01/2023] [Accepted: 03/13/2024] [Indexed: 04/07/2024] Open
Abstract
BACKGROUND Drug targets in living beings play pivotal roles in the discovery of potential drugs. Conventional wet-lab characterization of drug targets, although accurate, is generally expensive, slow, and resource intensive. Therefore, computational methods are highly desirable as an alternative to expedite the large-scale identification of druggable proteins (DPs); however, the performance of existing in silico predictors is still not satisfactory.
METHODS In this study, we developed a novel deep learning-based model DPI_CDF for predicting DPs based on protein sequence only. DPI_CDF utilizes evolutionary-based (i.e., histograms of oriented gradients for position-specific scoring matrix), physiochemical-based (i.e., component protein sequence representation), and compositional-based (i.e., normalized qualitative characteristic) properties of protein sequence to generate features. Then a hierarchical deep forest model fuses these three encoding schemes to build the proposed model DPI_CDF.
RESULTS The empirical outcomes of 10-fold cross-validation demonstrate that the proposed model achieved 99.13% accuracy and a Matthews correlation coefficient (MCC) of 0.982 on the training dataset. The generalization power of the trained model was further examined on an independent dataset, where it achieved a maximum accuracy of 95.01% and an MCC of 0.900. When compared to current state-of-the-art methods, DPI_CDF improves accuracy by 4.27% and 4.31% on the training and testing datasets, respectively. We believe DPI_CDF will support the research community in identifying druggable proteins and accelerating the drug discovery process.
AVAILABILITY The benchmark datasets and source codes are available on GitHub: http://github.com/Muhammad-Arif-NUST/DPI_CDF.
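For readers unfamiliar with the two reported metrics, accuracy and the Matthews correlation coefficient (MCC) can both be computed from a binary confusion matrix. The sketch below is a generic illustration under that standard definition; the function names and the toy counts are hypothetical and are not taken from the DPI_CDF code or results.

```python
import math


def accuracy(tp, tn, fp, fn):
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts.

    Returns 0.0 when any marginal is empty, a common convention for the
    otherwise undefined 0/0 case.
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


if __name__ == "__main__":
    # Toy counts: 45 true positives, 45 true negatives, 5 errors of each kind.
    print(accuracy(45, 45, 5, 5), mcc(45, 45, 5, 5))
```

Unlike accuracy, MCC uses all four cells of the confusion matrix, which is why it is often reported alongside accuracy for classifiers trained on unbalanced data.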
Affiliation(s)
- Muhammad Arif
  - College of Science and Engineering, Hamad Bin Khalifa University, Doha, Qatar
- Ge Fang
  - State Key Laboratory for Organic Electronics and Information Displays, Institute of Advanced Materials (IAM), Nanjing 210023, P. R. China
  - Center for Research Innovation and Biomedical Informatics, Faculty of Medical Technology, Mahidol University, Bangkok, 10700, Thailand
- Ali Ghulam
  - Information Technology Centre, Sindh Agriculture University, Sindh, Pakistan
- Saleh Musleh
  - College of Science and Engineering, Hamad Bin Khalifa University, Doha, Qatar
- Tanvir Alam
  - College of Science and Engineering, Hamad Bin Khalifa University, Doha, Qatar.
|
43
|
Zhang H, Mo Y, Wang L, Zhang H, Wu S, Sandai D, Shuid AN, Chen X. Potential shared pathogenic mechanisms between endometriosis and inflammatory bowel disease indicate a strong initial effect of immune factors. Front Immunol 2024; 15:1339647. [PMID: 38660311 PMCID: PMC11041628 DOI: 10.3389/fimmu.2024.1339647] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/16/2023] [Accepted: 02/12/2024] [Indexed: 04/26/2024] Open
Abstract
Introduction Over the past decades, immune dysregulation has been consistently demonstrated to be a common characteristic of endometriosis (EM) and Inflammatory Bowel Disease (IBD) in numerous studies. However, the underlying pathological mechanisms remain unknown. In this study, bioinformatics techniques were used to screen large-scale gene expression data for plausible correlations at the molecular level in order to identify common pathogenic pathways between EM and IBD.
Methods Based on the EM transcriptomic datasets GSE7305 and GSE23339, as well as the IBD transcriptomic datasets GSE87466 and GSE126124, differential gene analysis was performed using the limma package in the R environment. Co-expressed differentially expressed genes were identified, and a protein-protein interaction (PPI) network for the differentially expressed genes was constructed using version 11.5 of the STRING database. The MCODE tool in Cytoscape facilitated filtering out protein interaction subnetworks. Key genes in the PPI network were identified through two topological analysis algorithms (MCC and Degree) from the CytoHubba plugin. UpSet was used for visualization of these key genes. The diagnostic value of gene expression levels for these key genes was assessed using the Receiver Operating Characteristic (ROC) curve and Area Under the Curve (AUC). The CIBERSORT algorithm determined the infiltration status of 22 immune cell subtypes, exploring differences between EM and IBD patients in both control and disease groups. Finally, different gene expression trends shared by EM and IBD were input into CMap to identify small molecule compounds with potential therapeutic effects.
Results A total of 113 differentially expressed genes (DEGs) that were co-expressed in EM and IBD were identified, comprising 28 down-regulated genes and 86 up-regulated genes. Functional enrichment analyses of the co-expressed DEGs of EM and IBD focused on immune response activation, circulating immunoglobulin-mediated humoral immune response and humoral immune response. Five hub genes (SERPING1, VCAM1, CLU, C3, CD55) were identified through the protein-protein interaction network and MCODE. High Area Under the Curve (AUC) values of the Receiver Operating Characteristic (ROC) curves for the 5 hub genes indicate predictive ability for disease occurrence. These hub genes could be used as potential biomarkers for the development of EM and IBD. Furthermore, the CMap database identified a total of 9 small molecule compounds (TTNPB, CAY-10577, PD-0325901, etc.) targeting therapeutic genes for EM and IBD.
Discussion Our research revealed common pathogenic mechanisms between EM and IBD, particularly emphasizing immune regulation and cell signalling, indicating the significance of immune factors in the occurrence and progression of both diseases. By elucidating shared mechanisms, our study provides novel avenues for the prevention and treatment of EM and IBD.
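The ROC/AUC assessment of a single gene's diagnostic value can be illustrated with the rank-based definition of AUC: the probability that a randomly chosen disease sample has a higher expression value than a randomly chosen control, with ties counted as one half. This is a generic sketch, not the study's code; the function name and the toy expression values are hypothetical.

```python
def roc_auc(scores_pos, scores_neg):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties contribute 0.5. The pairwise loop is quadratic, which is fine for
    the small sample sizes typical of a single expression dataset.
    """
    if not scores_pos or not scores_neg:
        raise ValueError("both groups need at least one sample")
    wins = 0.0
    for pos in scores_pos:
        for neg in scores_neg:
            if pos > neg:
                wins += 1.0
            elif pos == neg:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


if __name__ == "__main__":
    # Hypothetical expression values of one gene in disease vs control samples.
    disease = [0.9, 0.4]
    control = [0.5, 0.1]
    print(roc_auc(disease, control))
```

An AUC near 0.5 means the gene does not separate the groups, while values near 1.0 (or near 0.0 for a down-regulated gene) indicate good separation.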
Affiliation(s)
- Haolong Zhang
  - Department of Biomedical Sciences, Advanced Medical & Dental Institute, Universiti Sains Malaysia, Penang, Malaysia
- Yaxin Mo
  - Department of Biomedical Sciences, Advanced Medical & Dental Institute, Universiti Sains Malaysia, Penang, Malaysia
- Ling Wang
  - Department of TCM Gynecology, Hangzhou TCM Hospital Affiliated to Zhejiang Chinese Medical University, Hangzhou, China
- Haoling Zhang
  - Department of Biomedical Sciences, Advanced Medical & Dental Institute, Universiti Sains Malaysia, Penang, Malaysia
- Sen Wu
  - Department of Biomedical Sciences, Advanced Medical & Dental Institute, Universiti Sains Malaysia, Penang, Malaysia
- Doblin Sandai
  - Department of Biomedical Sciences, Advanced Medical & Dental Institute, Universiti Sains Malaysia, Penang, Malaysia
- Ahmad Naqib Shuid
  - Department of Biomedical Sciences, Advanced Medical & Dental Institute, Universiti Sains Malaysia, Penang, Malaysia
- Xingbei Chen
  - Department of Gynecology and Obstetrics, The First Affiliated Hospital of Zhejiang Chinese Medical University (Zhejiang Provincial Hospital of Chinese Medicine), Hangzhou, China
|
44
|
Upadhyay M, Pogorevc N, Medugorac I. scalepopgen: Bioinformatic Workflow Resources Implemented in Nextflow for Comprehensive Population Genomic Analyses. Mol Biol Evol 2024; 41:msae057. [PMID: 38507648 PMCID: PMC10994858 DOI: 10.1093/molbev/msae057] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/26/2023] [Revised: 02/07/2024] [Accepted: 03/04/2024] [Indexed: 03/22/2024] Open
Abstract
Population genomic analyses such as inference of population structure and identifying signatures of selection usually involve the application of a plethora of tools. The installation of tools and their dependencies, data transformation, or series of data preprocessing in a particular order sometimes makes the analyses challenging. While the usage of container-based technologies has significantly resolved the problems associated with the installation of tools and their dependencies, population genomic analyses requiring multistep pipelines or complex data transformation can greatly be facilitated by the application of workflow management systems such as Nextflow and Snakemake. Here, we present scalepopgen, a collection of fully automated workflows that can carry out widely used population genomic analyses on the biallelic single nucleotide polymorphism data stored in either variant calling format files or the plink-generated binary files. scalepopgen is developed in Nextflow and can be run locally or on high-performance computing systems using either Conda, Singularity, or Docker. The automated workflow includes procedures such as (i) filtering of individuals and genotypes; (ii) principal component analysis, admixture with identifying optimal K-values; (iii) running TreeMix analysis with or without bootstrapping and migration edges, followed by identification of an optimal number of migration edges; (iv) implementing single-population and pair-wise population comparison-based procedures to identify genomic signatures of selection. The pipeline uses various open-source tools; additionally, several Python and R scripts are also provided to collect and visualize the results. The tool is freely available at https://github.com/Popgen48/scalepopgen.
Affiliation(s)
- Maulik Upadhyay
  - Population Genomics Group, Department of Veterinary Sciences, LMU Munich, Martinsried 82152, Germany
- Neža Pogorevc
  - Population Genomics Group, Department of Veterinary Sciences, LMU Munich, Martinsried 82152, Germany
- Ivica Medugorac
  - Population Genomics Group, Department of Veterinary Sciences, LMU Munich, Martinsried 82152, Germany
|
45
|
Zhao Y, Yang Z, Wang L, Zhang Y, Lin H, Wang J. Predicting Protein Functions Based on Heterogeneous Graph Attention Technique. IEEE J Biomed Health Inform 2024; 28:2408-2415. [PMID: 38319781 DOI: 10.1109/jbhi.2024.3357834] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/08/2024]
Abstract
In bioinformatics, protein function prediction stands as a fundamental area of research and plays a crucial role in addressing various biological challenges, such as the identification of potential targets for drug discovery and the elucidation of disease mechanisms. However, known functional annotation databases usually provide positive experimental annotations that proteins carry out a given function, and rarely record negative experimental annotations that proteins do not carry out a given function. Therefore, existing computational methods based on deep learning models focus on these positive annotations for prediction and ignore these scarce but informative negative annotations, leading to an underestimation of precision. To address this issue, we introduce a deep learning method that utilizes a heterogeneous graph attention technique. The method first constructs a heterogeneous graph that covers the protein-protein interaction network, ontology structure, and positive and negative annotation information. Then, it learns embedding representations of proteins and ontology terms by using the heterogeneous graph attention technique. Finally, it leverages these learned representations to reconstruct the positive protein-term associations and score unobserved functional annotations. It can enhance the predictive performance by incorporating these known limited negative annotations into the constructed heterogeneous graph. Experimental results on three species (i.e., Human, Mouse, and Arabidopsis) demonstrate that our method can achieve better performance in predicting new protein annotations than state-of-the-art methods.
|
46
|
Sunila BG, Dhanushkumar T, Dasegowda KR, Vasudevan K, Rambabu M. Unraveling the molecular landscape of Ataxia Telangiectasia: Insights into Neuroinflammation, immune dysfunction, and potential therapeutic target. Neurosci Lett 2024; 828:137764. [PMID: 38582325 DOI: 10.1016/j.neulet.2024.137764] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/24/2024] [Revised: 03/23/2024] [Accepted: 04/03/2024] [Indexed: 04/08/2024]
Abstract
BACKGROUND Ataxia Telangiectasia (AT) is a genetic disorder characterized by compromised DNA repair, cerebellar degeneration, and immune dysfunction. Understanding the molecular mechanisms driving AT pathology is crucial for developing targeted therapies.
METHODS In this study, we conducted a comprehensive analysis to elucidate the molecular mechanisms underlying AT pathology. Using publicly available RNA-seq datasets comparing control and AT samples, we employed in silico transcriptomics to identify potential genes and pathways. We performed differential gene expression analysis with DESeq2 to reveal dysregulated genes associated with AT. Additionally, we constructed a Protein-Protein Interaction (PPI) network to explore the interactions between proteins implicated in AT.
RESULTS The network analysis identified hub genes, including TYROBP and PCP2, crucial in immune regulation and cerebellar function, respectively. Furthermore, pathway enrichment analysis unveiled dysregulated pathways linked to AT pathology, providing insights into disease progression.
CONCLUSION Our integrated approach offers a holistic understanding of the complex molecular landscape of AT and identifies potential targets for therapeutic intervention. By combining transcriptomic analysis with network-based methods, we provide valuable insights into the underlying mechanisms of AT pathogenesis.
Affiliation(s)
- B G Sunila
  - Department of Biotechnology, School of Applied Sciences, REVA University, Bengaluru 560064, India
- T Dhanushkumar
  - Department of Biotechnology, School of Applied Sciences, REVA University, Bengaluru 560064, India
- K R Dasegowda
  - Department of Biotechnology, School of Applied Sciences, REVA University, Bengaluru 560064, India
- Karthick Vasudevan
  - Department of Biotechnology, School of Applied Sciences, REVA University, Bengaluru 560064, India
- Majji Rambabu
  - Department of Biotechnology, School of Applied Sciences, REVA University, Bengaluru 560064, India.
|
47
|
Hashemi Gheinani A, Kim J, You S, Adam RM. Bioinformatics in urology - molecular characterization of pathophysiology and response to treatment. Nat Rev Urol 2024; 21:214-242. [PMID: 37604982 DOI: 10.1038/s41585-023-00805-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 07/13/2023] [Indexed: 08/23/2023]
Abstract
The application of bioinformatics has revolutionized the practice of medicine in the past 20 years. From early studies that uncovered subtypes of cancer to broad efforts spearheaded by the Cancer Genome Atlas initiative, the use of bioinformatics strategies to analyse high-dimensional data has provided unprecedented insights into the molecular basis of disease. In addition to the identification of disease subtypes - which enables risk stratification - informatics analysis has facilitated the identification of novel risk factors and drivers of disease, biomarkers of progression and treatment response, as well as possibilities for drug repurposing or repositioning; moreover, bioinformatics has guided research towards precision and personalized medicine. Implementation of specific computational approaches such as artificial intelligence, machine learning and molecular subtyping has yet to become widespread in urology clinical practice for reasons of cost, disruption of clinical workflow and need for prospective validation of informatics approaches in independent patient cohorts. Solving these challenges might accelerate routine integration of bioinformatics into clinical settings.
Affiliation(s)
- Ali Hashemi Gheinani
  - Department of Urology, Boston Children's Hospital, Boston, MA, USA
  - Department of Surgery, Harvard Medical School, Boston, MA, USA
  - Broad Institute of MIT and Harvard, Cambridge, MA, USA
  - Department of Urology, Inselspital, Bern, Switzerland
  - Department for BioMedical Research, University of Bern, Bern, Switzerland
- Jina Kim
  - Department of Urology, Cedars-Sinai Medical Center, Los Angeles, CA, USA
  - Department of Computational Biomedicine, Cedars-Sinai Medical Center, Los Angeles, CA, USA
  - Samuel Oschin Comprehensive Cancer Institute, Cedars-Sinai Medical Center, Los Angeles, CA, USA
- Sungyong You
  - Department of Urology, Cedars-Sinai Medical Center, Los Angeles, CA, USA
  - Department of Computational Biomedicine, Cedars-Sinai Medical Center, Los Angeles, CA, USA
  - Samuel Oschin Comprehensive Cancer Institute, Cedars-Sinai Medical Center, Los Angeles, CA, USA
- Rosalyn M Adam
  - Department of Urology, Boston Children's Hospital, Boston, MA, USA.
  - Department of Surgery, Harvard Medical School, Boston, MA, USA.
  - Broad Institute of MIT and Harvard, Cambridge, MA, USA.
|
48
|
Choong YS, Mancera R, Lee VS. Special issue: Biomolecular Modelling and Simulation. Mol Biotechnol 2024; 66:567. [PMID: 38337130 DOI: 10.1007/s12033-024-01073-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 01/10/2024] [Indexed: 02/12/2024]
Affiliation(s)
- Yee Siew Choong
  - Institute for Research in Molecular Medicine (INFORMM), Universiti Sains Malaysia, 11800, Minden, Penang, Malaysia.
- Ricardo Mancera
  - Curtin Medical School, Curtin Health Innovation Research Institute, Curtin University, GPO Box U1987, Perth, WA, 6845, Australia
- Vannajan Sanghiran Lee
  - Department of Chemistry, Centre for Quantum Information Science and Technology (QIST), Faculty of Science, University of Malaya, 50603, Kuala Lumpur, Malaysia
|
49
|
Ng CL, Lim TS, Choong YS. Application of Computational Techniques in Antibody Fc-Fused Molecule Design for Therapeutics. Mol Biotechnol 2024; 66:568-581. [PMID: 37742298 DOI: 10.1007/s12033-023-00885-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/17/2022] [Accepted: 08/23/2023] [Indexed: 09/26/2023]
Abstract
Since the advent of hybridoma technology in the year 1975, it took a decade to witness the first approved monoclonal antibody, Orthoclone OKT3 (muromonab-CD3), in the year 1986. Since then, continuous strides have been made to engineer antibodies for specific desired effects. The engineering efforts were not confined to only the variable domains of the antibody but also included the fragment crystallizable (Fc) region that influences the immune response and serum half-life. Engineering of the Fc fragment would have a profound effect on the therapeutic dose, antibody-dependent cell-mediated cytotoxicity as well as antibody-dependent cellular phagocytosis. The integration of computational techniques into antibody engineering designs has allowed for the generation of testable hypotheses and guided the rational antibody design framework prior to further experimental evaluations. In this article, we discuss recent work in Fc-fused molecule design that involves computational techniques. We also summarize the usefulness of in silico techniques in aiding Fc-fused molecule design and analysis for therapeutic applications.
Affiliation(s)
- Chong Lee Ng
  - Institute for Research in Molecular Medicine (INFORMM), Universiti Sains Malaysia, Minden, Penang, Malaysia
- Theam Soon Lim
  - Institute for Research in Molecular Medicine (INFORMM), Universiti Sains Malaysia, Minden, Penang, Malaysia
- Yee Siew Choong
  - Institute for Research in Molecular Medicine (INFORMM), Universiti Sains Malaysia, Minden, Penang, Malaysia.
|
50
|
Guo H, Guo L, Li L, Li N, Lin X, Wang Y. Identification of key genes and molecular mechanisms of chronic urticaria based on bioinformatics. Skin Res Technol 2024; 30:e13624. [PMID: 38558219 PMCID: PMC10982677 DOI: 10.1111/srt.13624] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/11/2024] [Accepted: 02/05/2024] [Indexed: 04/04/2024]
Abstract
Chronic urticaria (CU) is characterized by persistent skin hives, redness, and itching, enhanced by immune dysregulation and inflammation. Our main objective was to identify key genes and molecular mechanisms of chronic urticaria based on bioinformatics. We used the Gene Expression Omnibus (GEO) database and retrieved two GEO datasets, GSE57178 and GSE72540. The raw data were extracted, pre-processed, and analyzed using the GEO2R tool to identify the differentially expressed genes (DEGs). The samples were divided into two groups: healthy samples and CU samples. We defined cut-off values of log2 fold change ≥1 and p < .05. Analyses were performed using the Kyoto Encyclopaedia of Genes and Genomes (KEGG), the Database for Annotation, Visualization and Integrated Discovery (DAVID), Metascape, the Search Tool for the Retrieval of Interacting Genes/Proteins (STRING), and CIBERSORT. We obtained 1613 differentially expressed genes. There were 114 overlapping genes in both datasets, of which 102 were up-regulated and 12 were down-regulated. The biological processes included activation of myeloid leukocytes, response to inflammation, and response to organic substances. Moreover, the KEGG pathways of CU were enriched in the Nuclear Factor-Kappa B (NF-kB) signaling pathway, Tumor Necrosis Factor (TNF) signaling pathway, and Janus kinase/signal transducers and activators of transcription (JAK-STAT) signaling pathway. We identified 27 hub genes that were implicated in the pathogenesis of CU, such as interleukin-6 (IL-6), Prostaglandin-endoperoxide synthase 2 (PTGS2), and intercellular adhesion molecule-1 (ICAM1). The complex interplay between immune responses, inflammatory pathways, cytokine networks, and specific genes enhances CU. Understanding these mechanisms paves the way for potential interventions to mitigate symptoms and improve the quality of life of CU patients.
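The step of finding genes that are differentially expressed in both datasets, split into up- and down-regulated, can be sketched as a per-dataset call followed by an intersection. This is an illustration only, not the GEO2R workflow the authors ran. It assumes the "log2 fold change ≥1" cut-off applies to the absolute value (the abstract reports down-regulated genes, so a signed cut-off seems unlikely) and that a shared gene must change in the same direction in both datasets; the function names and toy tables are hypothetical.

```python
def call_degs(table, lfc_cutoff=1.0, p_cutoff=0.05):
    """Map gene -> 'up' or 'down' for genes passing both cut-offs.

    table: dict of gene -> (log2_fold_change, p_value).
    Assumption: the fold-change cut-off is applied to the absolute value.
    """
    calls = {}
    for gene, (log2_fc, p_value) in table.items():
        if p_value < p_cutoff and abs(log2_fc) >= lfc_cutoff:
            calls[gene] = "up" if log2_fc > 0 else "down"
    return calls


def shared_degs(calls_a, calls_b):
    """Genes called in both datasets with the same direction of change."""
    return {gene: d for gene, d in calls_a.items() if calls_b.get(gene) == d}


if __name__ == "__main__":
    # Made-up tables standing in for two expression datasets.
    dataset_1 = {"G1": (2.1, 0.001), "G2": (-1.5, 0.01), "G3": (1.2, 0.04), "G4": (0.4, 0.001)}
    dataset_2 = {"G1": (1.3, 0.02), "G2": (-1.1, 0.03), "G3": (-1.4, 0.01), "G4": (1.8, 0.001)}
    print(shared_degs(call_degs(dataset_1), call_degs(dataset_2)))
```

In the toy data, G3 passes in both tables but in opposite directions, so it is excluded under the same-direction assumption.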
Affiliation(s)
- Haichao Guo
  - Department of Acupuncture and Moxibustion, The First Affiliated Hospital of Hebei University of Chinese Medicine, Shijiazhuang, Hebei, China
  - Department of Dermatology, Xingtai Hospital of Traditional Chinese Medicine, Xingtai, Hebei, China
- Lifang Guo
  - Department of Dermatology, Xingtai Hospital of Traditional Chinese Medicine, Xingtai, Hebei, China
- Li Li
  - Department of Dermatology, Xingtai Hospital of Traditional Chinese Medicine, Xingtai, Hebei, China
- Na Li
  - Department of Psychiatry, The First Affiliated Hospital of Hebei University of Chinese Medicine, Shijiazhuang, Hebei, China
- Xiaoyun Lin
  - Department of Acupuncture and Moxibustion, The First Affiliated Hospital of Hebei University of Chinese Medicine, Shijiazhuang, Hebei, China
- Yanjun Wang
  - Department of Acupuncture and Moxibustion, The First Affiliated Hospital of Hebei University of Chinese Medicine, Shijiazhuang, Hebei, China
|