1
Patiyal S, Tiwari P, Ghai M, Dhapola A, Dhall A, Raghava GPS. A hybrid approach for predicting transcription factors. FRONTIERS IN BIOINFORMATICS 2024; 4:1425419. [PMID: 39119181 PMCID: PMC11306938 DOI: 10.3389/fbinf.2024.1425419] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/29/2024] [Accepted: 07/03/2024] [Indexed: 08/10/2024]
Abstract
Transcription factors are essential DNA-binding proteins that regulate the transcription rate of several genes and control the expression of genes inside a cell. The prediction of transcription factors with high precision is important for understanding biological processes such as cell differentiation, intracellular signaling, and cell-cycle control. In this study, we developed a hybrid method that combines alignment-based and alignment-free methods for predicting transcription factors with higher accuracy. All models have been trained, tested, and evaluated on a large dataset that contains 19,406 transcription factors and 523,560 non-transcription factor protein sequences. To avoid biases in evaluation, the datasets were divided into training and validation/independent datasets, where 80% of the data was used for training, and the remaining 20% was used for external validation. In the case of alignment-free methods, models were developed using machine learning techniques and the composition-based features of a protein. Our best alignment-free model obtained an AUC of 0.97 on an independent dataset. In the case of the alignment-based method, we used BLAST at different cut-offs to predict the transcription factors. Although the alignment-based method demonstrated excellent performance, it was unable to cover all transcription factors due to instances of no hits. To combine the strengths of both methods, we developed a hybrid method that combines alignment-free and alignment-based methods. In the hybrid method, we added the scores of the alignment-free and alignment-based methods and achieved a maximum AUC of 0.99 on the independent dataset. The method proposed in this study performs better than existing methods. We incorporated the best models in the webserver/Python Package Index/standalone package of "TransFacPred" (https://webs.iiitd.edu.in/raghava/transfacpred).
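The abstract above describes composition-based features and a hybrid score formed by adding the alignment-free and alignment-based scores. The following is a minimal Python sketch of that idea, not the TransFacPred implementation. The function names, the `"TF"`/`"non-TF"` labels, and the ±0.5 weight assigned to a BLAST hit are assumptions made purely for illustration; the abstract does not state how the BLAST score is defined.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def amino_acid_composition(sequence):
    """Return the percentage of each standard residue in a protein sequence."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    return {aa: 100.0 * seq.count(aa) / len(seq) for aa in AMINO_ACIDS}


def blast_score(top_hit_label, weight=0.5):
    """Map the label of the top BLAST hit to a score.

    Assumption (hypothetical scheme): +weight if the top hit is a known
    transcription factor, -weight if it is a known non-transcription factor,
    and 0.0 when BLAST returns no hit.
    """
    if top_hit_label == "TF":
        return weight
    if top_hit_label == "non-TF":
        return -weight
    return 0.0


def hybrid_score(ml_probability, top_hit_label=None):
    """Add the alignment-free (ML) score and the alignment-based (BLAST) score."""
    return ml_probability + blast_score(top_hit_label)
```

With this scheme a sequence with no BLAST hit falls back to the machine learning score alone, which is how a hybrid method can still cover sequences that the alignment-based method misses.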
Affiliation(s)
- Gajendra P. S. Raghava
- Department of Computational Biology, Indraprastha Institute of Information Technology, New Delhi, India
2
Moreno E, Ciordia S, Fátima SM, Jiménez D, Martínez-Sanz J, Vizcarra P, Ron R, Sánchez-Conde M, Bargiela R, Sanchez-Carrillo S, Moreno S, Corrales F, Ferrer M, Serrano-Villar S. Proteomic snapshot of saliva samples predicts new pathways implicated in SARS-CoV-2 pathogenesis. Clin Proteomics 2024; 21:37. [PMID: 38778280 PMCID: PMC11112864 DOI: 10.1186/s12014-024-09482-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/13/2023] [Accepted: 04/15/2024] [Indexed: 05/25/2024]
Abstract
BACKGROUND Information on the microbiome's human pathways and active members that can affect SARS-CoV-2 susceptibility and pathogenesis in the salivary proteome is very scarce. Here, we studied a unique collection of samples harvested from April to June 2020 from unvaccinated patients. METHODS We compared 10 infected and hospitalized patients with severe (n = 5) and moderate (n = 5) coronavirus disease (COVID-19) with 10 uninfected individuals, including non-COVID-19 but susceptible individuals (n = 5) and non-COVID-19 and nonsusceptible healthcare workers with repeated high-risk exposures (n = 5). RESULTS By performing high-throughput proteomic profiling in saliva samples, we detected 226 unique differentially expressed (DE) human proteins between groups (q-value ≤ 0.05) out of 3376 unambiguously identified proteins (false discovery rate ≤ 1%). Major differences were observed between the non-COVID-19 and nonsusceptible groups. Bioinformatics analysis of DE proteins revealed human proteomic signatures related to inflammatory responses, central cellular processes, and antiviral activity associated with the saliva of SARS-CoV-2-infected patients (p-value ≤ 0.0004). Discriminatory biomarker signatures from human saliva include cystatins, protective molecules present in the oral cavity, calprotectins, involved in cell cycle progression, and histones, related to nucleosome functions. The expression levels of two human proteins related to protein transport in the cytoplasm, DYNC1 (p-value, 0.0021) and MAPRE1 (p-value, 0.047), correlated with angiotensin-converting enzyme 2 (ACE2) plasma activity. Finally, the proteomes of microorganisms present in the saliva samples showed 4 main microbial functional features related to ribosome functioning that were overrepresented in the infected group. CONCLUSION Our study explores potential candidates involved in pathways implicated in SARS-CoV-2 susceptibility, although further studies in larger cohorts will be necessary.
Affiliation(s)
- Elena Moreno
- Department of Infectious Diseases, Facultad de Medicina, Hospital Universitario Ramón y Cajal, IRYCIS, Carretera de Colmenar Viejo, Km 9.100, 28034, Madrid, Spain.
- CIBERINFEC, Instituto de Salud Carlos III, 28029, Madrid, Spain.
- Sergio Ciordia
- Functional Proteomics Laboratory, Centro Nacional de Biotecnología (CNB), CSIC, 28049, Madrid, Spain
- Santos Milhano Fátima
- Functional Proteomics Laboratory, Centro Nacional de Biotecnología (CNB), CSIC, 28049, Madrid, Spain
- Daniel Jiménez
- Department of Infectious Diseases, Facultad de Medicina, Hospital Universitario Ramón y Cajal, IRYCIS, Carretera de Colmenar Viejo, Km 9.100, 28034, Madrid, Spain
- Javier Martínez-Sanz
- Department of Infectious Diseases, Facultad de Medicina, Hospital Universitario Ramón y Cajal, IRYCIS, Carretera de Colmenar Viejo, Km 9.100, 28034, Madrid, Spain
- CIBERINFEC, Instituto de Salud Carlos III, 28029, Madrid, Spain
- Pilar Vizcarra
- Department of Infectious Diseases, Facultad de Medicina, Hospital Universitario Ramón y Cajal, IRYCIS, Carretera de Colmenar Viejo, Km 9.100, 28034, Madrid, Spain
- CIBERINFEC, Instituto de Salud Carlos III, 28029, Madrid, Spain
- Raquel Ron
- Department of Infectious Diseases, Facultad de Medicina, Hospital Universitario Ramón y Cajal, IRYCIS, Carretera de Colmenar Viejo, Km 9.100, 28034, Madrid, Spain
- CIBERINFEC, Instituto de Salud Carlos III, 28029, Madrid, Spain
- Matilde Sánchez-Conde
- Department of Infectious Diseases, Facultad de Medicina, Hospital Universitario Ramón y Cajal, IRYCIS, Carretera de Colmenar Viejo, Km 9.100, 28034, Madrid, Spain
- CIBERINFEC, Instituto de Salud Carlos III, 28029, Madrid, Spain
- Rafael Bargiela
- Centre for Environmental Biotechnology, School of Natural Sciences, Bangor University, Bangor, LL57 2UW, UK
- Sergio Sanchez-Carrillo
- Instituto de Catalisis y Petroleoquimica (ICP), CSIC, 28049, Madrid, Spain
- Centro de Biologia Molecular Severo Ochoa (CBM), CSIC-UAM, 28049, Madrid, Spain
- Santiago Moreno
- Department of Infectious Diseases, Facultad de Medicina, Hospital Universitario Ramón y Cajal, IRYCIS, Carretera de Colmenar Viejo, Km 9.100, 28034, Madrid, Spain
- CIBERINFEC, Instituto de Salud Carlos III, 28029, Madrid, Spain
- Facultad de Medicina, Universidad de Alcalá de Henares, 28801, Alcalá de Henares, Madrid, Spain
- Fernando Corrales
- Functional Proteomics Laboratory, Centro Nacional de Biotecnología (CNB), CSIC, 28049, Madrid, Spain
- Manuel Ferrer
- Instituto de Catalisis y Petroleoquimica (ICP), CSIC, 28049, Madrid, Spain
- Sergio Serrano-Villar
- Department of Infectious Diseases, Facultad de Medicina, Hospital Universitario Ramón y Cajal, IRYCIS, Carretera de Colmenar Viejo, Km 9.100, 28034, Madrid, Spain
- CIBERINFEC, Instituto de Salud Carlos III, 28029, Madrid, Spain
3
Tomer R, Patiyal S, Dhall A, Raghava GPS. Prediction of celiac disease associated epitopes and motifs in a protein. Front Immunol 2023; 14:1056101. [PMID: 36742312 PMCID: PMC9893285 DOI: 10.3389/fimmu.2023.1056101] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 09/28/2022] [Accepted: 01/02/2023] [Indexed: 01/20/2023]
Abstract
Introduction Celiac disease (CD) is an autoimmune gastrointestinal disorder that causes immune-mediated enteropathy in response to gluten. Gluten immunogenic peptides can trigger immune responses that damage the small intestine. HLA-DQ2/DQ8 are the major alleles that bind to epitopes/antigenic regions of gluten and induce celiac disease. There is a need to identify CD-associated epitopes in protein-based foods and therapeutics. Methods In this study, computational tools were developed to predict CD-associated epitopes and motifs. The dataset used for training, testing, and evaluation contains experimentally validated CD-associated and non-CD-associated peptides. We performed positional analysis to identify the most significant position of an amino acid residue in the peptide and checked the frequency of HLA alleles. We also computed amino acid composition to develop machine learning-based models, and developed an ensemble method that combines a motif-based approach with machine learning-based models. Results and Discussion Our analysis supports the existing hypothesis that proline (P) and glutamine (Q) are highly abundant in CD-associated peptides. A model based on the density of P and Q in peptides was developed for predicting CD-associated peptides and achieved a maximum AUROC of 0.98 on independent data. We discovered motifs (e.g., QPF, QPQ, PYP) that occur specifically in CD-associated peptides. We also developed machine learning-based models using peptide composition and achieved a maximum AUROC of 0.99. Finally, we developed an ensemble method that combines the motif-based approach and machine learning-based models. The ensemble model predicts CD-associated motifs with 100% accuracy on an independent dataset not used for training. The best models and motifs have been integrated into a web server and standalone software package, "CDpred". We hope this server will assist the scientific community in the prediction, design, and scanning of CD-associated peptides as well as CD-associated motifs in a protein/peptide sequence (https://webs.iiitd.edu.in/raghava/cdpred/).
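The two simplest ideas in this abstract, the density of proline and glutamine in a peptide and the scanning of a sequence for known motifs, can be sketched in a few lines of Python. This is an illustrative sketch only and not the CDpred code. The function names are hypothetical, and the motif list contains just the three examples quoted in the abstract; no classification threshold is given in the source, so none is assumed here.

```python
# Example motifs quoted in the abstract; the full CDpred motif set is not given.
CD_MOTIFS = ("QPF", "QPQ", "PYP")


def pq_density(peptide):
    """Return the fraction of residues in a peptide that are proline or glutamine."""
    pep = peptide.upper()
    if not pep:
        raise ValueError("peptide must not be empty")
    return (pep.count("P") + pep.count("Q")) / len(pep)


def find_motifs(sequence, motifs=CD_MOTIFS):
    """Return (motif, start_index) pairs for every occurrence, including overlaps."""
    seq = sequence.upper()
    hits = []
    for motif in motifs:
        start = seq.find(motif)
        while start != -1:
            hits.append((motif, start))
            # Advance by one position so overlapping occurrences are also found.
            start = seq.find(motif, start + 1)
    return sorted(hits, key=lambda hit: hit[1])
```

A peptide could then be ranked by `pq_density` or flagged when `find_motifs` returns any hit; how the published method weighs these signals is not stated in the abstract.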
4
Al‐Kuraishy HM, Al‐Gareeb AI, Mohammed AA, Alexiou A, Papadakis M, Batiha GE. The potential link between Covid-19 and multiple myeloma: A new saga. Immun Inflamm Dis 2022; 10:e701. [PMID: 36444620 PMCID: PMC9673426 DOI: 10.1002/iid3.701] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 07/30/2022] [Revised: 08/26/2022] [Accepted: 08/29/2022] [Indexed: 11/19/2022]
Abstract
BACKGROUND Covid-19 is considered primarily a respiratory disease that causes viral pneumonia and, in severe cases, leads to acute lung injury and acute respiratory distress syndrome (ARDS). However, extra-pulmonary manifestations of Covid-19 have also been shown. Furthermore, severe acute respiratory syndrome coronavirus type 2 (SARS-CoV-2) infection may coexist with several malignancies, including multiple myeloma (MM). METHODS This critical literature review aimed to find the potential association between SARS-CoV-2 infection and MM in Covid-19 patients with underlying MM. A narrative search of the literature and databases revealed that ARDS develops in both MM and Covid-19 due to hypercalcemia and proteasome dysfunction. RESULTS Notably, the expression of angiogenic factors and glutamine deficiency could link Covid-19 severity and MM in the pathogenesis of cardiovascular complications. MM and Covid-19 share thrombosis as a typical complication; however, thrombosis reflects disease severity in Covid-19 but not in MM. In both conditions, thromboprophylaxis is essential to prevent pulmonary thrombosis and other thromboembolic disorders. Moreover, Covid-19 may exacerbate the development of acute kidney injury and neurological complications in MM patients. CONCLUSION These findings highlight that MM patients might be a risk group for severe Covid-19 due to underlying immunosuppression, and most of these patients need specific management in the Covid-19 era.
Affiliation(s)
- Hayder M. Al‐Kuraishy
- Department of Clinical Pharmacology and Medicine, College of Medicine, ALmustansiriyia University, Baghdad, Iraq
- Ali I. Al‐Gareeb
- Department of Clinical Pharmacology and Medicine, College of Medicine, ALmustansiriyia University, Baghdad, Iraq
- Ali A Mohammed
- The Chest Clinic, Barts Health NHS Trust, Whipps Cross University Hospital, London, UK
- Athanasios Alexiou
- Department of Science and Engineering, Novel Global Community Educational Foundation, Hebersham, Australia
- AFNP Med, Wien, Austria
- Marios Papadakis
- Department of Surgery II, University Hospital Witten‐Herdecke, University of Witten‐Herdecke, Wuppertal, Germany
- Gaber El‐Saber Batiha
- Department of Pharmacology and Therapeutics, Faculty of Veterinary Medicine, Damanhour University, Damanhour, Egypt
5
Roy T, Sharma K, Dhall A, Patiyal S, Raghava GPS. In silico method for predicting infectious strains of influenza A virus from its genome and protein sequences. J Gen Virol 2022; 103. [PMID: 36318663 DOI: 10.1099/jgv.0.001802] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/16/2023]
Abstract
Influenza A is a contagious viral disease responsible for four pandemics in the past and remains a major public health concern. Being zoonotic in nature, the virus can cross the species barrier and transmit from wild aquatic bird reservoirs to humans via intermediate hosts. In this study, we developed a computational method for the prediction of human-associated and non-human-associated influenza A virus sequences. The models were trained and validated on protein and genome sequences of influenza A virus. First, we developed prediction models for 15 types of influenza A proteins using composition-based and one-hot-encoding features. We achieved a highest AUC of 0.98 for the HA protein on a validation dataset using dipeptide composition-based features. Of note, we obtained a maximum AUC of 0.99 using one-hot-encoding features for protein-based models on a validation dataset. Second, we built models using whole-genome sequences, which achieved an AUC of 0.98 on a validation dataset. In addition, we showed that our method outperforms a similarity-based approach (i.e., BLAST) on the same validation dataset. Finally, we integrated our best models into a user-friendly web server, 'FluSPred' (https://webs.iiitd.edu.in/raghava/fluspred/index.html), and a standalone version (https://github.com/raghavagps/FluSPred) for the prediction of human-associated/non-human-associated influenza A virus strains.
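The two feature types named in this abstract, dipeptide composition and one-hot encoding, can be illustrated with a short Python sketch. It is not taken from FluSPred: the function names are hypothetical, and the choices to skip pairs containing non-standard residues and to encode unknown residues as all-zero rows are assumptions made for illustration.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
# All 400 ordered residue pairs, in a fixed order so feature vectors align.
DIPEPTIDES = ["".join(pair) for pair in product(AMINO_ACIDS, repeat=2)]


def dipeptide_composition(sequence):
    """Return a 400-element list: percentage of each adjacent residue pair."""
    seq = sequence.upper()
    total = len(seq) - 1  # number of adjacent pairs in the sequence
    if total < 1:
        raise ValueError("sequence must contain at least two residues")
    counts = dict.fromkeys(DIPEPTIDES, 0)
    for i in range(total):
        pair = seq[i:i + 2]
        if pair in counts:  # assumption: pairs with non-standard residues are skipped
            counts[pair] += 1
    return [100.0 * counts[dp] / total for dp in DIPEPTIDES]


def one_hot_encode(sequence):
    """Return one 20-element 0/1 row per residue (all zeros for unknown residues)."""
    index = {aa: i for i, aa in enumerate(AMINO_ACIDS)}
    encoded = []
    for residue in sequence.upper():
        row = [0] * len(AMINO_ACIDS)
        if residue in index:
            row[index[residue]] = 1
        encoded.append(row)
    return encoded
```

Dipeptide composition gives a fixed-length vector regardless of sequence length, whereas one-hot encoding grows with the sequence and typically needs padding or truncation before it is passed to a model.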
Affiliation(s)
- Trinita Roy
- Department of Computational Biology, Indraprastha Institute of Information Technology, Okhla Phase 3, New Delhi-110020, India
- Khushal Sharma
- Department of Computational Biology, Indraprastha Institute of Information Technology, Okhla Phase 3, New Delhi-110020, India
- Anjali Dhall
- Department of Computational Biology, Indraprastha Institute of Information Technology, Okhla Phase 3, New Delhi-110020, India
- Sumeet Patiyal
- Department of Computational Biology, Indraprastha Institute of Information Technology, Okhla Phase 3, New Delhi-110020, India
- Gajendra Pal Singh Raghava
- Department of Computational Biology, Indraprastha Institute of Information Technology, Okhla Phase 3, New Delhi-110020, India
6
Ursolic acid and SARS-CoV-2 infection: a new horizon and perspective. Inflammopharmacology 2022; 30:1493-1501. [PMID: 35922738 PMCID: PMC9362167 DOI: 10.1007/s10787-022-01038-3] [Citation(s) in RCA: 14] [Impact Index Per Article: 7.0] [Received: 06/26/2022] [Accepted: 07/14/2022] [Indexed: 12/11/2022]
Abstract
SARS-CoV-2 (severe acute respiratory syndrome coronavirus type 2) has been identified as the cause of the worldwide coronavirus pandemic that emerged in 2019. Covid-19 is considered mainly a respiratory disease that causes viral pneumonia and, in severe cases, leads to acute lung injury (ALI) and acute respiratory distress syndrome (ARDS). However, extrapulmonary manifestations of Covid-19, such as neurological, cardiovascular, and gastrointestinal manifestations, have been confirmed. An exaggerated immune response and the release of large amounts of pro-inflammatory cytokines may progress to a cytokine storm. Consequently, the direct and indirect effects of SARS-CoV-2 infection can evolve into systemic complications due to the progression of hyperinflammation, oxidative stress, and dysregulation of the renin-angiotensin system (RAS). Therefore, anti-inflammatory and antioxidant agents could be effective in alleviating these disorders. Ursolic acid has anti-inflammatory, antioxidant, and antiviral effects; it reduces the release of pro-inflammatory cytokines, increases anti-inflammatory cytokines, and inhibits the production of reactive oxygen species (ROS). By virtue of its anti-inflammatory and antioxidant effects, ursolic acid may minimize SARS-CoV-2 infection-induced complications. Also, by regulating RAS and inflammatory signaling pathways, ursolic acid might effectively reduce the development of ALI in ARDS in Covid-19. In this context, this perspective discusses how ursolic acid can mitigate hyperinflammation and oxidative stress in Covid-19.
7
Dhall A, Jain S, Sharma N, Naorem LD, Kaur D, Patiyal S, Raghava GPS. In silico tools and databases for designing cancer immunotherapy. ADVANCES IN PROTEIN CHEMISTRY AND STRUCTURAL BIOLOGY 2021; 129:1-50. [PMID: 35305716 DOI: 10.1016/bs.apcsb.2021.11.008] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/14/2023]
Abstract
Immunotherapy is a rapidly growing cancer therapy that has numerous benefits over conventional treatments such as surgery, chemotherapy, and radiation. Overall survival of cancer patients has improved significantly due to the use of immunotherapy. It acts as a novel pillar for treating different malignancies from the primary to the metastatic stage. Recent advances in high-throughput sequencing and computational immunology have led to the development of targeted immunotherapy for precision oncology. In the last few decades, several computational methods and resources have been developed for designing immunotherapy against cancer. In this review, we summarize cancer-associated genomic, transcriptomic, and mutation profile repositories. We also list in silico methods for the prediction of vaccine candidates, HLA binders, cytokine-inducing peptides, and potential neoepitopes. Of note, we have incorporated the most important bioinformatics pipelines and resources for designing cancer immunotherapy. Moreover, to assist the scientific community, we have developed a web portal entitled ImmCancer (https://webs.iiitd.edu.in/raghava/immcancer/), which comprises cancer immunotherapy tools and repositories.
Affiliation(s)
- Anjali Dhall
- Department of Computational Biology, Indraprastha Institute of Information Technology, Okhla Phase 3, New Delhi, India
- Shipra Jain
- Department of Computational Biology, Indraprastha Institute of Information Technology, Okhla Phase 3, New Delhi, India
- Neelam Sharma
- Department of Computational Biology, Indraprastha Institute of Information Technology, Okhla Phase 3, New Delhi, India
- Leimarembi Devi Naorem
- Department of Computational Biology, Indraprastha Institute of Information Technology, Okhla Phase 3, New Delhi, India
- Dilraj Kaur
- Department of Computational Biology, Indraprastha Institute of Information Technology, Okhla Phase 3, New Delhi, India
- Sumeet Patiyal
- Department of Computational Biology, Indraprastha Institute of Information Technology, Okhla Phase 3, New Delhi, India
- Gajendra P S Raghava
- Department of Computational Biology, Indraprastha Institute of Information Technology, Okhla Phase 3, New Delhi, India