1
Kumari M, Chauhan R, Garg P. MedKG: enabling drug discovery through a unified biomedical knowledge graph. Mol Divers 2025:10.1007/s11030-025-11164-z. [PMID: 40085402 DOI: 10.1007/s11030-025-11164-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/26/2024] [Accepted: 03/07/2025] [Indexed: 03/16/2025]
Abstract
Biomedical knowledge graphs have emerged as powerful tools for drug discovery, but existing platforms often suffer from outdated information, limited accessibility, and insufficient integration of complex data. This study presents MedKG, a comprehensive and continuously updated knowledge graph designed to address these challenges in precision medicine and drug discovery. MedKG integrates data from 35 authoritative sources, encompassing 34 node types and 79 relationships. A Continuous Integration/Continuous Update pipeline keeps MedKG current, addressing a critical limitation of static knowledge bases. The integration of molecular embeddings enhances semantic analysis capabilities, bridging the gap between chemical structures and biological entities. To demonstrate MedKG's utility, MedLINK, a novel hybrid Relational Graph Convolutional Network for disease-drug link prediction, was developed and applied in case studies on clinical trial data. Furthermore, a web-based application with user-friendly APIs and visualization tools was built, making MedKG accessible to both technical and non-technical users; it is freely available at http://pitools.niper.ac.in/medkg/.
Affiliation(s)
- Madhavi Kumari
  - Department of Pharmacoinformatics, National Institute of Pharmaceutical Education and Research (NIPER), S.A.S. Nagar, Sector 67, S.A.S. Nagar, Mohali, Punjab, 160062, India
- Rohit Chauhan
  - Department of Computer Science, National Institute of Technology (NIT), Durgapur, MG Road, Durgapur, West Bengal, 713209, India
- Prabha Garg
  - Department of Pharmacoinformatics, National Institute of Pharmaceutical Education and Research (NIPER), S.A.S. Nagar, Sector 67, S.A.S. Nagar, Mohali, Punjab, 160062, India
2
Suvarna E, Setlur AS, K C, M S, Niranjan V. Computational molecular perspectives on novel carbazole derivative as an anti-cancer molecule against CDK1 of breast and colorectal cancers via gene expression studies, novel two-way docking strategies, molecular mechanics and dynamics. Comput Biol Chem 2024; 108:107979. [PMID: 37989072 DOI: 10.1016/j.compbiolchem.2023.107979] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/17/2023] [Revised: 10/19/2023] [Accepted: 10/30/2023] [Indexed: 11/23/2023]
Abstract
With cancer incidence increasing, alternative strategies for disease management are of utmost importance. Carbazole is a compound that is being studied extensively as an anti-cancer agent. In this work, we aimed to investigate a carbazole derivative against specific cancer types, namely breast and colorectal cancer, based on off-target analyses of the carbazole derivative. The present work shortlisted 6 proteins that are associated with both cancer types, and then employed two different molecular docking strategies to examine the binding stability of the carbazole derivative: a blind-docking state, where the pockets were undefined, and a mutation-docking state, where possible mutations were induced within the proteins. The results showed that CDK1 bound best to the carbazole derivative in both states, and performed better than an array of positive controls. Molecular dynamics simulations at 100 ns further supported its stability, with the carbazole derivative-CDK1 blind and mutated complexes having RMSD values of 3.2-3.6 Å and 2.8-3.2 Å, respectively. Molecular mechanics generalized Born surface area (MM-GBSA) calculations gave binding free energies for the complexes of -28.79 ± 3.97 kcal/mol and -31.86 ± 5.09 kcal/mol, respectively, with the carbazole derivative bound stably within the binding pocket at every 10 ns of the 100 ns trajectory. Radial distribution functions showed that the bell curve was well within 6 Å, indicating that the carbazole derivative and its atoms do not deviate away from the pocket and suggesting its potential as an anti-cancer compound against breast and colorectal cancer.
Affiliation(s)
- Eashita Suvarna
  - Amity Institute of Biotechnology, Amity University, Mumbai, Maharashtra 410206, India
- Anagha S Setlur
  - Department of Biotechnology, RV College of Engineering, Bangalore 560059, India
- Chandrashekar K
  - Department of Biotechnology, RV College of Engineering, Bangalore 560059, India
- Sridharan M
  - Department of Chemistry, RV College of Engineering, Bangalore 560059, India
- Vidya Niranjan
  - Department of Biotechnology, RV College of Engineering, Bangalore 560059, India
3
Guttapadu R, Katte T, Sayeeram D, Bhatia S, Abraham AR, Rajeev K, Amara ARR, Siri S, Bommana K, Rasalkar AA, Malempati R, Mustak MS, Narayanan P, Reddy SDN. Identification of novel biomarkers for lung squamous cell carcinoma. 3 Biotech 2023; 13:72. [PMID: 36742449 PMCID: PMC9895444 DOI: 10.1007/s13205-023-03489-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/21/2022] [Accepted: 01/19/2023] [Indexed: 02/05/2023] [Open Access]
Abstract
Lung squamous cell carcinoma (LUSC) is the second most common subtype of lung cancer, accounting for a majority of lung cancer-related deaths. Detection or diagnosis of cancer at an early stage is an unmet clinical need that is being actively explored. In this study, we aimed to identify potential biomarkers for LUSC by screening the expression status of all human genes against LUSC patient samples available from The Cancer Genome Atlas (TCGA). This led to the identification of several genes that are upregulated in LUSC. Further analysis revealed that many of these genes also show higher expression at the protein level, not only in lung cancer but also in other cancers. Additionally, some of these genes show stage-dependent higher expression and are associated with statistically significant poor survival of LUSC patients. As per our results, more than 60 genes are overexpressed in LUSC at the mRNA level and some at the protein level. Thus, we identified genes such as MCC1, MRPL47, CRYGS, HSP40, DNAJC19, GMPS and PARL as novel potential biomarkers for LUSC in this study. We believe that these genes hold great potential as LUSC biomarkers for early detection, as the data are derived from patient samples. Supplementary Information: The online version contains supplementary material available at 10.1007/s13205-023-03489-z.
Affiliation(s)
- Ranjitha Guttapadu
  - Department of Biotechnology, BMS College of Engineering, Bull Temple Road, Basavanagudi, Bengaluru, 560019 India
- Teesta Katte
  - Department of Biotechnology, BMS College of Engineering, Bull Temple Road, Basavanagudi, Bengaluru, 560019 India
- Deepak Sayeeram
  - Department of Biotechnology, BMS College of Engineering, Bull Temple Road, Basavanagudi, Bengaluru, 560019 India
- Saloni Bhatia
  - Department of Biotechnology, BMS College of Engineering, Bull Temple Road, Basavanagudi, Bengaluru, 560019 India
- Anika Rachel Abraham
  - Department of Biotechnology, BMS College of Engineering, Bull Temple Road, Basavanagudi, Bengaluru, 560019 India
- Kiran Rajeev
  - Department of Biotechnology, BMS College of Engineering, Bull Temple Road, Basavanagudi, Bengaluru, 560019 India
- Anish Raju R. Amara
  - Department of Biotechnology, BMS College of Engineering, Bull Temple Road, Basavanagudi, Bengaluru, 560019 India
- Sharadhi Siri
  - Department of Biotechnology, BMS College of Engineering, Bull Temple Road, Basavanagudi, Bengaluru, 560019 India
- Kavitha Bommana
  - Department of Botany, Rayalaseema University, Kurnool, India
- Avinash Arvind Rasalkar
  - in-DNA Life Science Pvt LtD, Plot, No. 368, Infocity Ave, Infocity, Sishu Vihar, Patia, Bhubaneswar, Odisha 751024 India
- Rajyalakshmi Malempati
  - Department of Biotechnology, BMS College of Engineering, Bull Temple Road, Basavanagudi, Bengaluru, 560019 India
- Mohammed S. Mustak
  - Department of Applied Zoology, Mangalore University, Mangalagangothri, Mangalore, 574199 India
- Prathibha Narayanan
  - Department of Biotechnology, BMS College of Engineering, Bull Temple Road, Basavanagudi, Bengaluru, 560019 India
- S. Divijendra Natha Reddy
  - Department of Biotechnology, BMS College of Engineering, Bull Temple Road, Basavanagudi, Bengaluru, 560019 India
4
Lefranc MP, Lefranc G. Antibody Sequence and Structure Analyses Using IMGT®: 30 Years of Immunoinformatics. Methods Mol Biol 2023; 2552:3-59. [PMID: 36346584 DOI: 10.1007/978-1-0716-2609-2_1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/16/2023]
Abstract
IMGT®, the international ImMunoGeneTics information system® (http://www.imgt.org), the global reference in immunogenetics and immunoinformatics, was created in 1989 by Marie-Paule Lefranc (Université de Montpellier and CNRS) to manage the huge diversity of the antigen receptors, immunoglobulins (IG) or antibodies, and T cell receptors (TR) of the adaptive immune responses. The founding of IMGT® marked the advent of immunoinformatics, which emerged at the interface between immunogenetics and bioinformatics. IMGT® standardized analysis of the IG, TR, and major histocompatibility (MH) genes and proteins bridges the gap between sequences and three-dimensional (3D) structures, for all jawed vertebrates from fish to humans. This is achieved through the IMGT Scientific chart rules, based on the IMGT-ONTOLOGY axioms, and primarily CLASSIFICATION (IMGT gene and allele nomenclature) and NUMEROTATION (IMGT unique numbering and IMGT Colliers de Perles). IMGT® comprises seven databases (IMGT/LIGM-DB for nucleotide sequences, IMGT/GENE-DB for genes and alleles, etc.), 17 tools (IMGT/V-QUEST, IMGT/JunctionAnalysis, IMGT/HighV-QUEST for NGS, etc.), and more than 20,000 Web resources. In this chapter, the focus is on the tools for amino acid sequences per domain (IMGT/DomainGapAlign and IMGT/Collier-de-Perles), and on the databases for receptors (IMGT/2Dstructure-DB and IMGT/3D-structure-DB) described per receptor, chain, and domain and, for 3D, with contact analysis, paratope, and epitope. The IMGT/mAb-DB is the query interface for monoclonal antibodies (mAb), fusion proteins for immune applications (FPIA), composite proteins for clinical applications (CPCA), and related proteins of interest (RPI), with links to IMGT® 2D and 3D databases and to the World Health Organization (WHO) International Nonproprietary Names (INN) program lists. The chapter includes the human IG allotypes and antibody engineered variants for effector properties used in the description of therapeutic mAb.
Affiliation(s)
- Marie-Paule Lefranc
  - IMGT®, the international ImMunoGeneTics information system®, Laboratoire d'ImmunoGénétique Moléculaire LIGM, Institut de Génétique Humaine IGH, UMR 9002 CNRS, Université de Montpellier, Montpellier cedex 5, France
- Gérard Lefranc
  - IMGT®, the international ImMunoGeneTics information system®, Laboratoire d'ImmunoGénétique Moléculaire LIGM, Institut de Génétique Humaine IGH, UMR 9002 CNRS, Université de Montpellier, Montpellier cedex 5, France
5
Integrative network analysis interweaves the missing links in cardiomyopathy diseasome. Sci Rep 2022; 12:19670. [PMID: 36385157 PMCID: PMC9668833 DOI: 10.1038/s41598-022-24246-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/18/2022] [Accepted: 11/11/2022] [Indexed: 11/17/2022] [Open Access]
Abstract
Cardiomyopathies are progressive disease conditions that give rise to an abnormal heart phenotype and are a leading cause of heart failures in the general population. These are complex diseases that show co-morbidity with other diseases. The molecular interaction network in the localised disease neighbourhood is an important step toward deciphering molecular mechanisms underlying these complex conditions. In this pursuit, we employed network medicine techniques to systematically investigate cardiomyopathy's genetic interplay with other diseases and uncover the molecular players underlying these associations. We predicted a set of candidate genes in cardiomyopathy by exploring the DIAMOnD algorithm on the human interactome. We next revealed how these candidate genes form association across different diseases and highlighted the predominant association with brain, cancer and metabolic diseases. Through integrative systems analysis of molecular pathways, heart-specific mouse knockout data and disease tissue-specific transcriptomic data, we screened and ascertained prominent candidates that show abnormal heart phenotype, including NOS3, MMP2 and SIRT1. Our computational analysis broadens the understanding of the genetic associations of cardiomyopathies with other diseases and holds great potential in cardiomyopathy research.
6
Liu S, Chen L, Zhang Y, Zhou Y, He Y, Chen Z, Qi S, Zhu J, Chen X, Zhang H, Luo Y, Qiu Y, Tao L, Zhu F. M6AREG: m6A-centered regulation of disease development and drug response. Nucleic Acids Res 2022; 51:D1333-D1344. [PMID: 36134713 PMCID: PMC9825441 DOI: 10.1093/nar/gkac801] [Citation(s) in RCA: 18] [Impact Index Per Article: 6.0] [Received: 08/09/2022] [Revised: 08/27/2022] [Accepted: 09/06/2022] [Indexed: 01/30/2023] [Open Access]
Abstract
As the most prevalent internal modification in eukaryotic RNAs, N6-methyladenosine (m6A) has been discovered to play an essential role in cellular proliferation, metabolic homeostasis, embryonic development, etc. With the rapid accumulation of research interest in m6A, its crucial roles in the regulations of disease development and drug response are gaining more and more attention. Thus, a database offering such valuable data on m6A-centered regulation is greatly needed; however, no such database is as yet available. Herein, a new database named 'M6AREG' is developed to (i) systematically cover, for the first time, data on the effects of m6A-centered regulation on both disease development and drug response, (ii) explicitly describe the molecular mechanism underlying each type of regulation and (iii) fully reference the collected data by cross-linking to existing databases. Since the accumulated data are valuable for researchers in diverse disciplines (such as pathology and pathophysiology, clinical laboratory diagnostics, medicinal biochemistry and drug design), M6AREG is expected to have many implications for the future conduct of m6A-based regulation studies. It is currently accessible by all users at: https://idrblab.org/m6areg/.
Affiliation(s)
- Shuiping Liu
  - Correspondence may also be addressed to Shuiping Liu.
- Ying He
  - Key Laboratory of Elemene Class Anti-Cancer Chinese Medicines, Engineering Laboratory of Development and Application of Traditional Chinese Medicines, Collaborative Innovation Center of Traditional Chinese Medicines of Zhejiang Province, School of Pharmacy, Hangzhou Normal University, Hangzhou 311121, China
- Zhen Chen
  - College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, Zhejiang University, Hangzhou 310058, China; Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, Alibaba-Zhejiang University Joint Research Center of Future Digital Healthcare, Hangzhou 330110, China
- Shasha Qi
  - Key Laboratory of Elemene Class Anti-Cancer Chinese Medicines, Engineering Laboratory of Development and Application of Traditional Chinese Medicines, Collaborative Innovation Center of Traditional Chinese Medicines of Zhejiang Province, School of Pharmacy, Hangzhou Normal University, Hangzhou 311121, China
- Jinyu Zhu
  - Key Laboratory of Elemene Class Anti-Cancer Chinese Medicines, Engineering Laboratory of Development and Application of Traditional Chinese Medicines, Collaborative Innovation Center of Traditional Chinese Medicines of Zhejiang Province, School of Pharmacy, Hangzhou Normal University, Hangzhou 311121, China
- Xudong Chen
  - Key Laboratory of Elemene Class Anti-Cancer Chinese Medicines, Engineering Laboratory of Development and Application of Traditional Chinese Medicines, Collaborative Innovation Center of Traditional Chinese Medicines of Zhejiang Province, School of Pharmacy, Hangzhou Normal University, Hangzhou 311121, China
- Hao Zhang
  - Key Laboratory of Elemene Class Anti-Cancer Chinese Medicines, Engineering Laboratory of Development and Application of Traditional Chinese Medicines, Collaborative Innovation Center of Traditional Chinese Medicines of Zhejiang Province, School of Pharmacy, Hangzhou Normal University, Hangzhou 311121, China
- Yongchao Luo
  - College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, Zhejiang University, Hangzhou 310058, China; Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, Alibaba-Zhejiang University Joint Research Center of Future Digital Healthcare, Hangzhou 330110, China
- Yunqing Qiu
  - State Key Laboratory for Diagnosis and Treatment of Infectious Disease, Collaborative Innovation Center for Diagnosis and Treatment of Infectious Diseases, Zhejiang Provincial Key Laboratory for Drug Clinical Research and Evaluation, The First Affiliated Hospital, Zhejiang University, Hangzhou, 310000, China
- Lin Tao
  - Correspondence may also be addressed to Lin Tao.
- Feng Zhu
  - To whom correspondence should be addressed. Tel: +86 189 8946 6518; Fax: +86 571 8820 8444
7
Prokaryotic Na+/H+ Exchangers—Transport Mechanism and Essential Residues. Int J Mol Sci 2022; 23:ijms23169156. [PMID: 36012428 PMCID: PMC9408914 DOI: 10.3390/ijms23169156] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Received: 07/26/2022] [Revised: 08/09/2022] [Accepted: 08/13/2022] [Indexed: 11/16/2022] [Open Access]
Abstract
Na+/H+ exchangers are essential for Na+ and pH homeostasis in all organisms. Human Na+/H+ exchangers are of high medical interest, and insights into their structure and function are aided by the investigation of prokaryotic homologues. Most prokaryotic Na+/H+ exchangers belong to either the Cation/Proton Antiporter (CPA) superfamily, the Ion Transport (IT) superfamily, or the Na+-translocating Mrp transporter superfamily. Several structures have been solved so far for CPA and Mrp members, but none for the IT members. NhaA from E. coli has served as the prototype of Na+/H+ exchangers due to the high amount of structural and functional data available. Recent structures from other CPA exchangers, together with diverse functional information, have allowed elucidation of some common working principles shared by Na+/H+ exchangers from different families, such as the type of residues involved in the substrate binding and even a simple mechanism sufficient to explain the pH regulation in the CPA and IT superfamilies. Here, we review several aspects of prokaryotic Na+/H+ exchanger structure and function, discussing the similarities and differences between different transporters, with a focus on the CPA and IT exchangers. We also discuss the proposed transport mechanisms for Na+/H+ exchangers that explain their highly pH-regulated activity profile.
8
Ran Z, Yang J, Liu Y, Chen X, Ma Z, Wu S, Huang Y, Song Y, Gu Y, Zhao S, Fa M, Lu J, Chen Q, Cao Z, Li X, Sun S, Yang T. GlioMarker: An integrated database for knowledge exploration of diagnostic biomarkers in gliomas. Front Oncol 2022; 12:792055. [PMID: 36081550 PMCID: PMC9446481 DOI: 10.3389/fonc.2022.792055] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 10/09/2021] [Accepted: 07/15/2022] [Indexed: 11/23/2022] [Open Access]
Abstract
Gliomas are the most frequent malignant and aggressive tumors in the central nervous system. Early and effective diagnosis of glioma using diagnostic biomarkers can prolong patients' lives and aid in the development of new personalized treatments. Therefore, a thorough and comprehensive understanding of the diagnostic biomarkers in gliomas is of great significance. To this end, we developed the integrated and web-based database GlioMarker (http://gliomarker.prophetdb.org/), the first comprehensive database for knowledge exploration of glioma diagnostic biomarkers. In GlioMarker, accurate information on 406 glioma diagnostic biomarkers from 1559 publications was manually extracted, including biomarker descriptions, clinical information, associated literature, experimental records, associated diseases, statistical indicators, etc. Importantly, we integrated many external resources to provide clinicians and researchers with the capability to further explore knowledge on these diagnostic biomarkers in three respects: (1) obtaining more ontology annotations of the biomarker; (2) identifying the relationship between any two or more components of diseases, drugs, genes, and variants to explore the knowledge related to precision medicine; and (3) exploring the clinical application value of a specific diagnostic biomarker through online analysis of genomic and expression data from glioma cohort studies. GlioMarker provides a powerful, practical, and user-friendly web-based tool that may serve as a specialized platform for clinicians and researchers by providing rapid and comprehensive knowledge of glioma diagnostic biomarkers to facilitate high-quality research and applications.
Affiliation(s)
- Zihan Ran
  - Department of Research, Shanghai University of Medicine & Health Sciences Affiliated Zhoupu Hospital, Shanghai, China
  - Inspection and Quarantine Department, The College of Medical Technology, Shanghai University of Medicine & Health Sciences, Shanghai, China
  - The Genius Medicine Consortium (TGMC), Shanghai, China
- Jingcheng Yang
  - The Genius Medicine Consortium (TGMC), Shanghai, China
  - State Key Laboratory of Genetic Engineering, Human Phenome Institute, School of Life Sciences and Shanghai Cancer Center, Fudan University, Shanghai, China
  - Center for Intelligent Medicine Research, Greater Bay Area Institute of Precision Medicine, Guangzhou, China
- Yaqing Liu
  - The Genius Medicine Consortium (TGMC), Shanghai, China
  - State Key Laboratory of Genetic Engineering, Human Phenome Institute, School of Life Sciences and Shanghai Cancer Center, Fudan University, Shanghai, China
- XiuWen Chen
  - Inspection and Quarantine Department, The College of Medical Technology, Shanghai University of Medicine & Health Sciences, Shanghai, China
- Zijing Ma
  - Inspection and Quarantine Department, The College of Medical Technology, Shanghai University of Medicine & Health Sciences, Shanghai, China
- Shaobo Wu
  - Department of Laboratory Medicine, Tinglin Hospital of Jinshan District, Shanghai, China
- Yechao Huang
  - The Genius Medicine Consortium (TGMC), Shanghai, China
  - State Key Laboratory of Genetic Engineering, Human Phenome Institute, School of Life Sciences and Shanghai Cancer Center, Fudan University, Shanghai, China
- Yueqiang Song
  - The Genius Medicine Consortium (TGMC), Shanghai, China
  - State Key Laboratory of Genetic Engineering, Human Phenome Institute, School of Life Sciences and Shanghai Cancer Center, Fudan University, Shanghai, China
- Yu Gu
  - Inspection and Quarantine Department, The College of Medical Technology, Shanghai University of Medicine & Health Sciences, Shanghai, China
- Shuo Zhao
  - Inspection and Quarantine Department, The College of Medical Technology, Shanghai University of Medicine & Health Sciences, Shanghai, China
- Mengqi Fa
  - Inspection and Quarantine Department, The College of Medical Technology, Shanghai University of Medicine & Health Sciences, Shanghai, China
- Jiangjie Lu
  - Inspection and Quarantine Department, The College of Medical Technology, Shanghai University of Medicine & Health Sciences, Shanghai, China
- Qingwang Chen
  - The Genius Medicine Consortium (TGMC), Shanghai, China
  - State Key Laboratory of Genetic Engineering, Human Phenome Institute, School of Life Sciences and Shanghai Cancer Center, Fudan University, Shanghai, China
- Zehui Cao
  - The Genius Medicine Consortium (TGMC), Shanghai, China
  - State Key Laboratory of Genetic Engineering, Human Phenome Institute, School of Life Sciences and Shanghai Cancer Center, Fudan University, Shanghai, China
- Xiaofei Li
  - The Genius Medicine Consortium (TGMC), Shanghai, China
  - Department of Toxicology, School of Public Health, Guangxi Medical University, Nanning, China
- Shanyue Sun
  - The Genius Medicine Consortium (TGMC), Shanghai, China
  - State Key Laboratory of Genetic Engineering, Human Phenome Institute, School of Life Sciences and Shanghai Cancer Center, Fudan University, Shanghai, China
- Tao Yang
  - Department of Radiology, Shanghai University of Medicine & Health Sciences Affiliated Zhoupu Hospital, Shanghai, China
9
Crow M, Suresh H, Lee J, Gillis J. Coexpression reveals conserved gene programs that co-vary with cell type across kingdoms. Nucleic Acids Res 2022; 50:4302-4314. [PMID: 35451481 PMCID: PMC9071420 DOI: 10.1093/nar/gkac276] [Citation(s) in RCA: 21] [Impact Index Per Article: 7.0] [Received: 05/29/2021] [Revised: 03/30/2022] [Accepted: 04/08/2022] [Indexed: 12/24/2022] [Open Access]
Abstract
What makes a mouse a mouse, and not a hamster? Differences in gene regulation between the two organisms play a critical role. Comparative analysis of gene coexpression networks provides a general framework for investigating the evolution of gene regulation across species. Here, we compare coexpression networks from 37 species and quantify the conservation of gene activity 1) as a function of evolutionary time, 2) across orthology prediction algorithms, and 3) with reference to cell- and tissue-specificity. We find that ancient genes are expressed in multiple cell types and have well-conserved coexpression patterns; however, they are expressed at different levels across cell types. Thus, differential regulation of ancient gene programs contributes to transcriptional cell identity. We propose that this differential regulation may play a role in cell diversification in both the animal and plant kingdoms.
Affiliation(s)
- Megan Crow
  - Stanley Institute for Cognitive Genomics, Cold Spring Harbor Laboratory, 1 Bungtown Road, Cold Spring Harbor NY, USA
- Hamsini Suresh
  - Stanley Institute for Cognitive Genomics, Cold Spring Harbor Laboratory, 1 Bungtown Road, Cold Spring Harbor NY, USA
- John Lee
  - Stanley Institute for Cognitive Genomics, Cold Spring Harbor Laboratory, 1 Bungtown Road, Cold Spring Harbor NY, USA
- Jesse Gillis
  - Stanley Institute for Cognitive Genomics, Cold Spring Harbor Laboratory, 1 Bungtown Road, Cold Spring Harbor NY, USA
10
Munjal NS, Sapra D, Parthasarathi KTS, Goyal A, Pandey A, Banerjee M, Sharma J. Deciphering the Interactions of SARS-CoV-2 Proteins with Human Ion Channels Using Machine-Learning-Based Methods. Pathogens 2022; 11:pathogens11020259. [PMID: 35215201 PMCID: PMC8874499 DOI: 10.3390/pathogens11020259] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 12/30/2021] [Revised: 01/31/2022] [Accepted: 02/08/2022] [Indexed: 01/04/2023] [Open Access]
Abstract
Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) is responsible for the protracted COVID-19 pandemic. Its high transmission rate and pathogenicity led to health emergencies and economic crisis. Recent studies on the molecular pathogenesis of SARS-CoV-2 infection have shown the indispensable role of ion channels in viral infection inside the host. Moreover, machine learning (ML)-based algorithms are providing higher accuracy for host-SARS-CoV-2 protein–protein interactions (PPIs). In this study, PPIs of SARS-CoV-2 proteins with human ion channels (HICs) were trained on the PPI-MetaGO algorithm. PPI networks (PPINs) and a signaling pathway map of HICs with SARS-CoV-2 proteins were generated. Additionally, various U.S. Food and Drug Administration (FDA)-approved drugs interacting with the potential HICs were identified. The PPIs were predicted with 82.71% accuracy, 84.09% precision, 84.09% sensitivity, 0.89 AUC-ROC, 65.17% Matthews correlation coefficient (MCC) score and 84.09% F1 score. Several host pathways were found to be altered, including the calcium signaling and taste transduction pathways. The potential HICs could serve as an initial set for experimentalists to validate further. The study also reinforces the drug repurposing approach for the development of host-directed antiviral drugs that may provide a better therapeutic management strategy for infection caused by SARS-CoV-2.
Affiliation(s)
- Nupur S. Munjal
  - Institute of Bioinformatics, International Technology Park, Bangalore 560066, India; (N.S.M.); (D.S.); (K.T.S.P.); (A.G.)
- Dikscha Sapra
  - Institute of Bioinformatics, International Technology Park, Bangalore 560066, India; (N.S.M.); (D.S.); (K.T.S.P.); (A.G.)
- K. T. Shreya Parthasarathi
  - Institute of Bioinformatics, International Technology Park, Bangalore 560066, India; (N.S.M.); (D.S.); (K.T.S.P.); (A.G.)
- Abhishek Goyal
  - Institute of Bioinformatics, International Technology Park, Bangalore 560066, India; (N.S.M.); (D.S.); (K.T.S.P.); (A.G.)
- Akhilesh Pandey
  - Center for Molecular Medicine, National Institute of Mental Health and Neurosciences (NIMHANS), Hosur Road, Bangalore 560029, India
  - Department of Laboratory Medicine and Pathology, Mayo Clinic, Rochester, MN 55905, USA
  - Center for Individualized Medicine, Mayo Clinic, Rochester, MN 55905, USA
- Manidipa Banerjee
  - Kusuma School of Biological Sciences, Indian Institute of Technology Delhi, Hauz Khas, New Delhi 110016, India
- Jyoti Sharma
  - Institute of Bioinformatics, International Technology Park, Bangalore 560066, India; (N.S.M.); (D.S.); (K.T.S.P.); (A.G.)
  - Manipal Academy of Higher Education (MAHE), Udupi 576104, India
  - Correspondence:
11
Wang Y, Tong Y, Zhang Z, Zheng R, Huang D, Yang J, Zong H, Tan F, Xie Y, Huang H, Zhang X. ViMIC: a database of human disease-related virus mutations, integration sites and cis-effects. Nucleic Acids Res 2022; 50:D918-D927. [PMID: 34500462 PMCID: PMC8728280 DOI: 10.1093/nar/gkab779] [Citation(s) in RCA: 19] [Impact Index Per Article: 6.3] [Received: 06/15/2021] [Revised: 08/10/2021] [Accepted: 08/26/2021] [Indexed: 02/06/2023] [Open Access]
Abstract
Molecular mechanisms of virus-related diseases involve multiple factors, including viral mutation accumulation and integration of a viral genome into the host DNA. With increasing attention being paid to virus-mediated pathogenesis and the development of many useful technologies to identify virus mutations (VMs) and viral integration sites (VISs), much research on these topics is available in PubMed. However, knowledge of VMs and VISs is widely scattered across numerous published papers, which lack standardization, integration and curation. To address these challenges, we built a pilot database of human disease-related Virus Mutations, Integration sites and Cis-effects (ViMIC), which specializes in three features: virus mutation sites, viral integration sites and target genes. In total, ViMIC provides information on 31 712 VM entries, 105 624 VISs, 16 310 viral target genes and 1 110 015 virus sequences of eight viruses in 77 human diseases obtained from the public domain. Furthermore, ViMIC allows users to explore the cis-effects of virus-host interactions by surveying 78 histone modifications, binding of 1358 transcription regulators and chromatin accessibility on these VISs. We believe ViMIC will become a valuable resource for the virus research community. The database is available at http://bmtongji.cn/ViMIC/index.php.
Affiliation(s)
- Ying Wang
- Research Center for Translational Medicine, Shanghai East Hospital, School of Life Sciences and Technology, Tongji University, Shanghai 200092, China
- Department of Laboratory Medicine, Shanghai Eastern Hepatobiliary Surgery Hospital, Shanghai 200438, China
| | - Yuantao Tong
- Research Center for Translational Medicine, Shanghai East Hospital, School of Life Sciences and Technology, Tongji University, Shanghai 200092, China
| | - Zeyu Zhang
- Research Center for Translational Medicine, Shanghai East Hospital, School of Life Sciences and Technology, Tongji University, Shanghai 200092, China
| | - Rongbin Zheng
- Research Center for Translational Medicine, Shanghai East Hospital, School of Life Sciences and Technology, Tongji University, Shanghai 200092, China
| | - Danqi Huang
- Research Center for Translational Medicine, Shanghai East Hospital, School of Life Sciences and Technology, Tongji University, Shanghai 200092, China
| | - Jinxuan Yang
- Research Center for Translational Medicine, Shanghai East Hospital, School of Life Sciences and Technology, Tongji University, Shanghai 200092, China
| | - Hui Zong
- Research Center for Translational Medicine, Shanghai East Hospital, School of Life Sciences and Technology, Tongji University, Shanghai 200092, China
| | - Fanglin Tan
- Research Center for Translational Medicine, Shanghai East Hospital, School of Life Sciences and Technology, Tongji University, Shanghai 200092, China
| | - Yujia Xie
- Research Center for Translational Medicine, Shanghai East Hospital, School of Life Sciences and Technology, Tongji University, Shanghai 200092, China
| | - Honglian Huang
- Research Center for Translational Medicine, Shanghai East Hospital, School of Life Sciences and Technology, Tongji University, Shanghai 200092, China
| | - Xiaoyan Zhang
- Research Center for Translational Medicine, Shanghai East Hospital, School of Life Sciences and Technology, Tongji University, Shanghai 200092, China
|
12
|
Parthasarathi KTS, Munjal NS, Dey G, Kumar A, Pandey A, Balakrishnan L, Sharma J. A pathway map of signaling events triggered upon SARS-CoV infection. J Cell Commun Signal 2021; 15:595-600. [PMID: 34487344 PMCID: PMC8419830 DOI: 10.1007/s12079-021-00642-2] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 08/11/2021] [Accepted: 08/15/2021] [Indexed: 12/11/2022] Open
Abstract
Severe acute respiratory syndrome coronaviruses (SARS-CoVs) caused worldwide epidemics over the past few decades. Extensive studies on various strains of coronaviruses provided a basic understanding of the pathogenesis of the disease. Presently, severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) is leading a global pandemic with unprecedented challenges. This is the third coronavirus outbreak of this century. A signaling pathway map of signaling events induced by SARS-CoV infection is not yet available. In this study, we present a literature-annotated signaling pathway map of reactions induced by SARS-CoV-infected cells. Multiple signaling modules were found to be orchestrated, including PI3K-AKT, Ras-MAPK, JAK-STAT, Type 1 IFN and NFκB. The signaling pathway map of SARS-CoV consists of 110 molecules and 101 reactions mediated by SARS-CoV proteins. The pathway reaction data are available in various community standard data exchange formats including Systems Biology Graphical Notation (SBGN). The pathway map is publicly available through the GitHub repository and data in various formats can be freely downloaded.
Affiliation(s)
| | - Nupur S Munjal
- Institute of Bioinformatics, International Technology Park, Bangalore, 560066, India
| | - Gourav Dey
- Institute of Bioinformatics, International Technology Park, Bangalore, 560066, India
| | - Abhishek Kumar
- Institute of Bioinformatics, International Technology Park, Bangalore, 560066, India
- Manipal Academy of Higher Education (MAHE), Manipal, Karnataka, 576104, India
| | - Akhilesh Pandey
- Center for Molecular Medicine, National Institute of Mental Health and Neurosciences (NIMHANS), Hosur Road, Bangalore, 560029, India
- Department of Laboratory Medicine and Pathology, Mayo Clinic, Rochester, MN, 55905, USA
- Center for Individualized Medicine, Mayo Clinic, Rochester, MN, 55905, USA
| | - Lavanya Balakrishnan
- Mazumdar Shaw Center for Translational Research, Narayana Hrudayalaya Health City, Bangalore, India.
| | - Jyoti Sharma
- Institute of Bioinformatics, International Technology Park, Bangalore, 560066, India.
- Manipal Academy of Higher Education (MAHE), Manipal, Karnataka, 576104, India.
|
13
|
Venkatraman DL, Pulimamidi D, Shukla HG, Hegde SR. Tumor relevant protein functional interactions identified using bipartite graph analyses. Sci Rep 2021; 11:21530. [PMID: 34728699 PMCID: PMC8563864 DOI: 10.1038/s41598-021-00879-2] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 08/26/2020] [Accepted: 09/30/2021] [Indexed: 12/02/2022] Open
Abstract
A surge of -omics data for diseases such as cancer allows for deriving insights into the affiliated protein interactions. We used bipartite network principles to build protein functional associations of the differentially regulated genes in 18 cancer types. This approach allowed us to combine expression data with functional associations in many cancers simultaneously. Further, graph centrality measures suggested the importance of upregulated genes such as BIRC5, UBE2C, BUB1B, KIF20A and PTH1R in cancer. Pathway analysis of the high centrality network nodes suggested the importance of the upregulation of cell cycle and replication associated proteins in cancer. Some of the downregulated high centrality proteins include actins, myosins and ATPase subunits. Among the transcription factors, mini-chromosome maintenance proteins (MCMs) and E2F family proteins appeared prominently in regulating many differentially regulated genes. The projected unipartite networks of the up- and downregulated genes comprised 37,411 and 41,756 interactions, respectively. The conclusions obtained by collating these interactions revealed pan-cancer as well as subtype-specific protein complexes and clusters. Therefore, we demonstrate that incorporating expression data from multiple cancers into bipartite graphs validates existing cancer-associated mechanisms and points to novel interactions and pathways.
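The general idea of projecting a bipartite network onto one node set, then ranking nodes by centrality, can be sketched as follows. This is an illustration of the generic technique only, not the authors' pipeline. The function names, the gene labels (`geneA` to `geneD`) and the annotation labels are all hypothetical, and degree centrality stands in for whichever centrality measures the study actually used.

```python
from collections import defaultdict
from itertools import combinations


def project_bipartite(edges):
    """Project (gene, annotation) edges onto the gene side.

    Two genes are linked if they share at least one annotation; the edge
    weight counts how many annotations they share.
    """
    members = defaultdict(set)
    for gene, annotation in edges:
        members[annotation].add(gene)
    weights = defaultdict(int)
    for genes in members.values():
        for a, b in combinations(sorted(genes), 2):
            weights[(a, b)] += 1
    return dict(weights)


def degree_centrality(projected):
    """Degree of each node divided by (number of nodes - 1)."""
    degree = defaultdict(int)
    for a, b in projected:
        degree[a] += 1
        degree[b] += 1
    denom = max(len(degree) - 1, 1)
    return {node: deg / denom for node, deg in degree.items()}


# Hypothetical toy input, invented for illustration.
toy_edges = [
    ("geneA", "cell_cycle"),
    ("geneB", "cell_cycle"),
    ("geneC", "cell_cycle"),
    ("geneC", "replication"),
    ("geneD", "replication"),
]
projected = project_bipartite(toy_edges)
centrality = degree_centrality(projected)
for gene in sorted(centrality, key=centrality.get, reverse=True):
    print(gene, round(centrality[gene], 3))
```

In this toy input `geneC` sits in both annotation groups, so it connects to every other gene in the projected network and receives the highest centrality.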
Affiliation(s)
| | - Deepshika Pulimamidi
- Institute of Bioinformatics and Applied Biotechnology (IBAB), Bengaluru, 560 100, India
| | - Harsh G Shukla
- Institute of Bioinformatics and Applied Biotechnology (IBAB), Bengaluru, 560 100, India
| | - Shubhada R Hegde
- Institute of Bioinformatics and Applied Biotechnology (IBAB), Bengaluru, 560 100, India.
|
14
|
Yazdani B, Jazini M, Jabbari N, Karami M, Rahimirad S, Azadeh M, Mahdevar M, Ghaedi K. Altered expression level of ACSM5 in breast cancer: An integrative analysis of tissue biomarkers with diagnostic potential. Gene Reports 2021. [DOI: 10.1016/j.genrep.2020.100992] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Indexed: 01/23/2023]
|
15
|
Genetic variants in levodopa-induced dyskinesia (LID): A systematic review and meta-analysis. Parkinsonism Relat Disord 2021; 84:52-60. [DOI: 10.1016/j.parkreldis.2021.01.020] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 05/28/2020] [Revised: 01/25/2021] [Accepted: 01/25/2021] [Indexed: 12/17/2022]
|
16
|
Wu J, Li D, Liu X, Li Q, He X, Wei J, Li X, Li M, Rehman AU, Xia Y, Wu C, Zhang J, Lu X. IDDB: a comprehensive resource featuring genes, variants and characteristics associated with infertility. Nucleic Acids Res 2021; 49:D1218-D1224. [PMID: 32941628 PMCID: PMC7779019 DOI: 10.1093/nar/gkaa753] [Citation(s) in RCA: 26] [Impact Index Per Article: 6.5] [Received: 07/31/2020] [Revised: 08/24/2020] [Accepted: 08/29/2020] [Indexed: 12/26/2022] Open
Abstract
Infertility is a complex multifactorial disease that affects up to 10% of couples across the world. However, many mechanisms of infertility remain unclear due to the lack of studies based on systematic knowledge, leading to ineffective treatment and/or transmission of genetic defects to offspring. Here, we developed an infertility disease database to provide a comprehensive resource featuring various factors involved in infertility. Features in the current IDDB version were manually curated as follows: (i) a total of 307 infertility-associated genes in human and 1348 genes associated with reproductive disorder in 9 model organisms; (ii) a total of 202 chromosomal abnormalities leading to human infertility, including aneuploidies and structural variants; and (iii) a total of 2078 pathogenic variants from infertility patients’ samples across 60 different diseases causing infertility. Additionally, the characteristics of clinically diagnosed infertility patients (i.e. causative variants, laboratory indexes and clinical manifestations) were collected. To the best of our knowledge, the IDDB is the first infertility database serving as a systematic resource for biologists to decipher infertility mechanisms and for clinicians to achieve better diagnosis/treatment of patients from disease phenotype to genetic factors. The IDDB is freely available at http://mdl.shsmu.edu.cn/IDDB/.
Affiliation(s)
- Jing Wu
- Department of Assisted Reproduction, Shanghai Ninth People's Hospital, Shanghai Jiao Tong University School of Medicine (SJTU-SM), Shanghai 200011, China; Medicinal Bioinformatics Center, Shanghai Jiao Tong University School of Medicine (SJTU-SM), Shanghai 200025, China
| | - Danjun Li
- Department of Assisted Reproduction, Shanghai Ninth People's Hospital, Shanghai Jiao Tong University School of Medicine (SJTU-SM), Shanghai 200011, China; Medicinal Bioinformatics Center, Shanghai Jiao Tong University School of Medicine (SJTU-SM), Shanghai 200025, China
| | - Xinyi Liu
- Medicinal Bioinformatics Center, Shanghai Jiao Tong University School of Medicine (SJTU-SM), Shanghai 200025, China
| | - Qian Li
- Medicinal Bioinformatics Center, Shanghai Jiao Tong University School of Medicine (SJTU-SM), Shanghai 200025, China
| | - Xinheng He
- Medicinal Bioinformatics Center, Shanghai Jiao Tong University School of Medicine (SJTU-SM), Shanghai 200025, China
| | - Jiale Wei
- Medicinal Bioinformatics Center, Shanghai Jiao Tong University School of Medicine (SJTU-SM), Shanghai 200025, China
| | - Xinyi Li
- Medicinal Bioinformatics Center, Shanghai Jiao Tong University School of Medicine (SJTU-SM), Shanghai 200025, China
| | - Mingyu Li
- Medicinal Bioinformatics Center, Shanghai Jiao Tong University School of Medicine (SJTU-SM), Shanghai 200025, China
| | - Ashfaq Ur Rehman
- Medicinal Bioinformatics Center, Shanghai Jiao Tong University School of Medicine (SJTU-SM), Shanghai 200025, China
| | - Yujia Xia
- Medicinal Bioinformatics Center, Shanghai Jiao Tong University School of Medicine (SJTU-SM), Shanghai 200025, China
| | - Chengwei Wu
- Medicinal Bioinformatics Center, Shanghai Jiao Tong University School of Medicine (SJTU-SM), Shanghai 200025, China
| | - Jian Zhang
- Medicinal Bioinformatics Center, Shanghai Jiao Tong University School of Medicine (SJTU-SM), Shanghai 200025, China; School of Pharmaceutical Sciences, Zhengzhou University, Zhengzhou 450001, China
| | - Xuefeng Lu
- Department of Assisted Reproduction, Shanghai Ninth People's Hospital, Shanghai Jiao Tong University School of Medicine (SJTU-SM), Shanghai 200011, China
|
17
|
Vivek-Ananth RP, Sahoo AK, Kumaravel K, Mohanraj K, Samal A. MeFSAT: a curated natural product database specific to secondary metabolites of medicinal fungi. RSC Adv 2021; 11:2596-2607. [PMID: 35424258 PMCID: PMC8693784 DOI: 10.1039/d0ra10322e] [Citation(s) in RCA: 16] [Impact Index Per Article: 4.0] [Received: 12/07/2020] [Accepted: 01/04/2021] [Indexed: 01/25/2023] Open
Abstract
Fungi are a rich source of secondary metabolites, which constitute a valuable and diverse chemical space of natural products. Medicinal fungi have been used in traditional medicine to treat human ailments for centuries. To date, there is no dedicated resource on secondary metabolites and therapeutic uses of medicinal fungi. Such a dedicated resource compiling dispersed information on medicinal fungi across published literature will facilitate ongoing efforts towards natural product-based drug discovery. Here, we present the first comprehensive manually curated database on Medicinal Fungi Secondary metabolites And Therapeutics (MeFSAT) that compiles information on 184 medicinal fungi, 1830 secondary metabolites and 149 therapeutic uses. Importantly, MeFSAT contains a non-redundant in silico natural product library of 1830 secondary metabolites along with information on their chemical structures, computed physicochemical properties, drug-likeness properties, predicted ADMET properties, molecular descriptors and predicted human target proteins. By comparing the physicochemical properties of secondary metabolites in MeFSAT with other small-molecule collections, we find that fungal secondary metabolites have high stereochemical complexity and shape complexity similar to other natural product libraries. Based on multiple scoring schemes, we have filtered a subset of 228 drug-like secondary metabolites in the MeFSAT database. By constructing and analyzing chemical similarity networks, we show that the chemical space of secondary metabolites in MeFSAT is highly diverse. The compiled information in the MeFSAT database is openly accessible at: https://cb.imsc.res.in/mefsat/.
Affiliation(s)
- R P Vivek-Ananth
- The Institute of Mathematical Sciences (IMSc) Chennai 600113 India
- Homi Bhabha National Institute (HBNI) Mumbai 400094 India
| | - Ajaya Kumar Sahoo
- The Institute of Mathematical Sciences (IMSc) Chennai 600113 India
- Homi Bhabha National Institute (HBNI) Mumbai 400094 India
| | - Kavyaa Kumaravel
- The Institute of Mathematical Sciences (IMSc) Chennai 600113 India
| | - Karthikeyan Mohanraj
- The Institute of Mathematical Sciences (IMSc) Chennai 600113 India
- Institute for Clinical Chemistry and Laboratory Medicine, Technische Universität Dresden Dresden 01307 Germany
| | - Areejit Samal
- The Institute of Mathematical Sciences (IMSc) Chennai 600113 India
- Homi Bhabha National Institute (HBNI) Mumbai 400094 India
|
18
|
Wang C, Chen L, Zhang M, Yang Y, Wong G. PDmethDB: A curated Parkinson's disease associated methylation information database. Comput Struct Biotechnol J 2020; 18:3745-3749. [PMID: 33304468 PMCID: PMC7714663 DOI: 10.1016/j.csbj.2020.11.015] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 05/26/2020] [Revised: 11/10/2020] [Accepted: 11/10/2020] [Indexed: 01/12/2023] Open
Abstract
Parkinson's disease (PD) is the second most common neurodegenerative disease, of which the histopathological hallmark is the formation of Lewy bodies consisting of α-synuclein as the major component. α-Synuclein can sequester DNA Methyltransferase 1 (DNMT1), the maintenance DNA methylation enzyme, from the nucleus and into the cytoplasm, leading to global DNA hypomethylation in the human brain. As DNA methylation is a major epigenetic modification that regulates gene expression and there is no specific database storing PD associated methylation information, PDmethDB (Parkinson's Disease Methylation Database) aims to curate PD associated methylation information from literature to facilitate the study of the relationship between PD and methylation. Currently, PDmethDB contains 97,077 PD methylation associated entries among 12,308 molecules, 37,944 CpG sites, 31 tissues and 3 species through a review of about 1600 published papers. This includes information concerning the gene/molecule name, CpG site, methylation alteration, expression alteration, tissue, PMID, experimental method, and a brief description of the entry. PDmethDB provides a user-friendly interface to search, browse, download and submit data. PDmethDB supports browsing by molecule, species, tissue, gene region, methylation alteration and experimental methods. PDmethDB also shows the entry gene interaction network, including protein-protein interactions and miRNA-target interactions, with a highlight of PD associated genes from the DisGeNET database. PDmethDB aims to facilitate the understanding of the relationship between PD and methylation. Database URL: https://ageing.shinyapps.io/pdmethdb/.
Affiliation(s)
- Changliang Wang
- Cancer Centre, Centre of Reproduction, Development and Aging, Faculty of Health Sciences, University of Macau, Macau S.A.R., China
- Guangzhou Regenerative Medicine and Health Guangdong Laboratory, Guangzhou, China
| | - Liang Chen
- Department of Computer Science, College of Engineering, Shantou University, Shantou, China
- Key Laboratory of Intelligent Manufacturing Technology of Ministry of Education, Shantou University, Shantou, China
| | - Menglei Zhang
- Cancer Centre, Centre of Reproduction, Development and Aging, Faculty of Health Sciences, University of Macau, Macau S.A.R., China
| | - Yang Yang
- Cancer Centre, Centre of Reproduction, Development and Aging, Faculty of Health Sciences, University of Macau, Macau S.A.R., China
| | - Garry Wong
- Cancer Centre, Centre of Reproduction, Development and Aging, Faculty of Health Sciences, University of Macau, Macau S.A.R., China
|
19
|
Sayeeram D, Katte TV, Bhatia S, Jai Kumar A, Kumar A, Jayashree G, Rachana D, Nalla Reddy HV, Arvind Rasalkar A, Malempati RL, Reddy S DN. Identification of potential biomarkers for lung adenocarcinoma. Heliyon 2020; 6:e05452. [PMID: 33251353 PMCID: PMC7677689 DOI: 10.1016/j.heliyon.2020.e05452] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Received: 02/25/2020] [Revised: 04/21/2020] [Accepted: 09/21/2020] [Indexed: 12/14/2022] Open
Abstract
Lung adenocarcinoma (LUAD) is the most predominant subtype of lung cancers and is one of the leading causes of cancer-related mortality worldwide. Despite the advancements in the field of cancer diagnostics and therapeutics, detection at an early stage using reliable biomarkers is an unmet clinical need for a plethora of cancers, including LUAD, thus contributing to poor prognosis. In view of this, to identify potential biomarkers and therapeutic candidate genes, the expression of all known human genes was screened in the publicly available 'The Cancer Genome Atlas' (TCGA) samples of LUAD patients, which resulted in the identification of overexpressed genes. Further analysis of these genes across various patient sample datasets revealed that ZNF687, ODR4, PBXIP1, PYGO2, METTL3, PIGM and RAD1 are consistently more highly expressed in LUAD. Higher expression of these genes either alone or in combination is correlated with poor survival of LUAD patients. Hence, in this study we propose that these identified genes could serve as potential candidates as gene signatures or biomarkers for LUAD that require further investigation in large cohorts of LUAD samples.
Affiliation(s)
- Deepak Sayeeram
- Department of Biotechnology, BMS College of Engineering, Bengaluru, India
| | - Teesta V. Katte
- Department of Biotechnology, BMS College of Engineering, Bengaluru, India
| | - Saloni Bhatia
- Department of Biotechnology, BMS College of Engineering, Bengaluru, India
| | - Anushree Jai Kumar
- Department of Biotechnology, BMS College of Engineering, Bengaluru, India
| | - Avinesh Kumar
- Department of Biotechnology, BMS College of Engineering, Bengaluru, India
| | - G. Jayashree
- Department of Biotechnology, BMS College of Engineering, Bengaluru, India
| | - D.S. Rachana
- Department of Biotechnology, BMS College of Engineering, Bengaluru, India
| | - Avinash Arvind Rasalkar
- inDNA Life Sciences Private Limited, Plot 368, 3 Floor, North View, Infocity Avenue, Patia, Bhubaneswar, Odisha 751024, India
|
20
|
Liany H, Jeyasekharan A, Rajan V. Predicting synthetic lethal interactions using heterogeneous data sources. Bioinformatics 2020; 36:2209-2216. [PMID: 31782759 DOI: 10.1093/bioinformatics/btz893] [Citation(s) in RCA: 29] [Impact Index Per Article: 5.8] [Received: 06/04/2019] [Revised: 10/31/2019] [Accepted: 11/27/2019] [Indexed: 11/14/2022] Open
Abstract
MOTIVATION: A synthetic lethal (SL) interaction is a relationship between two functional entities where the loss of either one of the entities is viable but the loss of both entities is lethal to the cell. Such pairs can be used as drug targets in targeted anticancer therapies, and so many methods have been developed to identify potential candidate SL pairs. However, these methods use only a subset of available data from multiple platforms, at genomic, epigenomic and transcriptomic levels, and hence are limited in their ability to learn from complex associations in heterogeneous data sources. RESULTS: In this article, we develop techniques that can seamlessly integrate multiple heterogeneous data sources to predict SL interactions. Our approach obtains latent representations by collective matrix factorization-based techniques, which in turn are used for prediction through matrix completion. Our experiments, on a variety of biological datasets, illustrate the efficacy and versatility of our approach, which outperforms state-of-the-art methods for predicting SL interactions and can be used with heterogeneous data sources with minimal feature engineering. AVAILABILITY AND IMPLEMENTATION: Software available at https://github.com/lianyh. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Herty Liany
- Department of Computer Science, School of Computing, National University of Singapore, Singapore, Singapore
| | - Anand Jeyasekharan
- Department of Medicine, Yong Loo Lin School of Medicine, National University of Singapore, Singapore, Singapore
| | - Vaibhav Rajan
- Department of Information Systems and Analytics, School of Computing, National University of Singapore, Singapore, Singapore
|
21
|
Systems Biology Approach Identifies Prognostic Signatures of Poor Overall Survival and Guides the Prioritization of Novel BET-CHK1 Combination Therapy for Osteosarcoma. Cancers (Basel) 2020; 12:cancers12092426. [PMID: 32859084 PMCID: PMC7564419 DOI: 10.3390/cancers12092426] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Received: 05/24/2020] [Revised: 08/01/2020] [Accepted: 08/14/2020] [Indexed: 12/12/2022] Open
Abstract
Osteosarcoma (OS) patients exhibit poor overall survival, partly due to copy number variations (CNVs) resulting in dysregulated gene expression and therapeutic resistance. To identify actionable prognostic signatures of poor overall survival, we employed a systems biology approach using public databases to integrate CNVs, gene expression, and survival outcomes in pediatric, adolescent, and young adult OS patients. Chromosome 8 was a hotspot for poor prognostic signatures. The MYC-RAD21 copy number gain (8q24) correlated with increased gene expression and poor overall survival in 90% of the patients (n = 85). MYC and RAD21 play a role in replication-stress, which is a therapeutically actionable network. We prioritized replication-stress regulators, bromodomain and extra-terminal proteins (BETs), and CHK1, in order to test the hypothesis that the inhibition of BET + CHK1 in MYC-RAD21+ pediatric OS models would be efficacious and safe. We demonstrate that MYC-RAD21+ pediatric OS cell lines were sensitive to the inhibition of BET (BETi) and CHK1 (CHK1i) at clinically achievable concentrations. While the potentiation of CHK1i-mediated effects by BETi was BET-BRD4-dependent, MYC expression was BET-BRD4-independent. In MYC-RAD21+ pediatric OS xenografts, BETi + CHK1i significantly decreased tumor growth, increased survival, and was well tolerated. Therefore, targeting replication stress is a promising strategy to pursue as a therapeutic option for this devastating disease.
|
22
|
Shah SG, Mandloi T, Kunte P, Natu A, Rashid M, Reddy D, Gadewal N, Gupta S. HISTome2: a database of histone proteins, modifiers for multiple organisms and epidrugs. Epigenetics Chromatin 2020; 13:31. [PMID: 32746900 PMCID: PMC7398201 DOI: 10.1186/s13072-020-00354-8] [Citation(s) in RCA: 23] [Impact Index Per Article: 4.6] [Received: 05/27/2020] [Accepted: 07/28/2020] [Indexed: 01/06/2023] Open
Abstract
BACKGROUND: Epigenetics research is progressing in basic, pre-clinical and clinical studies using various model systems. Hence, updating the knowledge and integrating the biological data emerging from in silico, in vitro and in vivo studies for different epigenetic factors is essential. Moreover, new drugs that target various epigenetic proteins are being discovered, tested in pre-clinical studies and clinical trials, and approved by the FDA. This brings distinct challenges as well as opportunities to update the existing HIstome database for implementing and applying enormous data for biomedical research. RESULTS: HISTome2 focuses on the sub-classification of histone proteins as variants and isoforms, post-translational modifications (PTMs) and modifying enzymes for human (Homo sapiens), rat (Rattus norvegicus) and mouse (Mus musculus) on one interface for integrative analysis. It contains 232, 267 and 350 entries for histone proteins (non-canonical/variants and canonical/isoforms), PTMs and modifying enzymes respectively for human, rat, and mouse. Around 200 EpiDrugs for various classes of epigenetic modifiers, their clinical trial status, and pharmacological relevance have been provided in HISTome2. Additional features such as 'Clustal omega' for multiple sequence alignment, a link to 'FireBrowse' to visualize TCGA expression data and 'TargetScanHuman' for miRNA targets have been included in the database. CONCLUSION: The information for multiple organisms and EpiDrugs on a common platform will accelerate the understanding and future development of drugs. Overall, HISTome2 has significantly increased the extent and diversity of its content, which will serve as a 'knowledge Infobase' for biologists, pharmacologists, and clinicians. HISTome2: The HISTone Infobase is freely available at http://www.actrec.gov.in/histome2/.
Affiliation(s)
- Sanket G. Shah
- Epigenetics and Chromatin Biology Group, Gupta Laboratory, Cancer Research Institute, Advanced Centre for Treatment, Research and Education in Cancer, Tata Memorial Centre, Kharghar, Navi Mumbai, MH 410210 India
- Homi Bhabha National Institute, Training School Complex, Anushakti Nagar, Mumbai, MH 400085 India
| | - Tushar Mandloi
- Bioinformatics Centre, Advanced Centre for Treatment, Research and Education in Cancer, Tata Memorial Centre, Kharghar, Navi Mumbai, MH 410210 India
| | - Pooja Kunte
- Epigenetics and Chromatin Biology Group, Gupta Laboratory, Cancer Research Institute, Advanced Centre for Treatment, Research and Education in Cancer, Tata Memorial Centre, Kharghar, Navi Mumbai, MH 410210 India
- Present Address: Diabetes Unit, King Edward Memorial Hospital Research Centre, Rasta Peth, Pune, Maharashtra 411 011 India
| | - Abhiram Natu
- Epigenetics and Chromatin Biology Group, Gupta Laboratory, Cancer Research Institute, Advanced Centre for Treatment, Research and Education in Cancer, Tata Memorial Centre, Kharghar, Navi Mumbai, MH 410210 India
- Homi Bhabha National Institute, Training School Complex, Anushakti Nagar, Mumbai, MH 400085 India
| | - Mudasir Rashid
- Epigenetics and Chromatin Biology Group, Gupta Laboratory, Cancer Research Institute, Advanced Centre for Treatment, Research and Education in Cancer, Tata Memorial Centre, Kharghar, Navi Mumbai, MH 410210 India
- Homi Bhabha National Institute, Training School Complex, Anushakti Nagar, Mumbai, MH 400085 India
| | - Divya Reddy
- Epigenetics and Chromatin Biology Group, Gupta Laboratory, Cancer Research Institute, Advanced Centre for Treatment, Research and Education in Cancer, Tata Memorial Centre, Kharghar, Navi Mumbai, MH 410210 India
- Homi Bhabha National Institute, Training School Complex, Anushakti Nagar, Mumbai, MH 400085 India
- Present Address: Stowers Institute for Medical Research, Kansas City, MO 64110 USA
| | - Nikhil Gadewal
- Bioinformatics Centre, Advanced Centre for Treatment, Research and Education in Cancer, Tata Memorial Centre, Kharghar, Navi Mumbai, MH 410210 India
| | - Sanjay Gupta
- Epigenetics and Chromatin Biology Group, Gupta Laboratory, Cancer Research Institute, Advanced Centre for Treatment, Research and Education in Cancer, Tata Memorial Centre, Kharghar, Navi Mumbai, MH 410210 India
- Homi Bhabha National Institute, Training School Complex, Anushakti Nagar, Mumbai, MH 400085 India
|
23
|
Turek C, Wróbel S, Piwowar M. OmicsON - Integration of omics data with molecular networks and statistical procedures. PLoS One 2020; 15:e0235398. [PMID: 32726348 PMCID: PMC7390260 DOI: 10.1371/journal.pone.0235398] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 11/11/2019] [Accepted: 06/15/2020] [Indexed: 12/05/2022] Open
Abstract
The large volume of fragmented biological data collected in various databases, and the need to describe the relations among them with theoretical methods, drives the development of data integration methods. This paper presents omics data analysis that integrates biological knowledge with mathematical procedures implemented in the OmicsON R library. OmicsON is a tool for integrating two sets of data: transcriptomics and metabolomics. The library's workflow applies functional grouping and statistical analysis. Subgroups within the transcriptomic and metabolomic sets are created based on the biological knowledge stored in the Reactome and STRING databases. This makes it possible to analyze such data sets with multivariate statistical procedures such as Canonical Correlation Analysis (CCA) or Partial Least Squares (PLS). Integrating metabolomic and transcriptomic data with the OmicsON methodology makes it easy to obtain information on the connections between the two data sets. This information can significantly help in assessing the relationship between gene expression and metabolite concentrations, which in turn facilitates the biological interpretation of the analyzed process.
Affiliation(s)
- Cezary Turek
- Department of Bioinformatics and Telemedicine, Jagiellonian University–Medical College, Krakow, Poland
| | - Sonia Wróbel
- Department of Medical Physics, Jagiellonian University, Marian Smoluchowski Institute of Physics, Krakow, Poland
| | - Monika Piwowar
- Department of Bioinformatics and Telemedicine, Jagiellonian University–Medical College, Krakow, Poland
| |
|
24
|
Cervellati C, Trentini A, Rosta V, Passaro A, Bosi C, Sanz JM, Bonazzi S, Pacifico S, Seripa D, Valacchi G, Guerini R, Zuliani G. Serum beta-secretase 1 (BACE1) activity as candidate biomarker for late-onset Alzheimer's disease. GeroScience 2019; 42:159-167. [PMID: 31745860 DOI: 10.1007/s11357-019-00127-6] [Citation(s) in RCA: 27] [Impact Index Per Article: 4.5] [Received: 07/24/2019] [Accepted: 10/17/2019] [Indexed: 01/02/2023] Open
Abstract
Beta-secretase (BACE1) is a key enzyme in the formation of amyloid-β; its activity/concentration is increased in the brain and cerebrospinal fluid of patients with late-onset Alzheimer's disease (LOAD). Since BACE1 is also found in blood, we evaluated its potential as a peripheral biomarker. To this end, serum BACE1 activity was assessed in 115 subjects with LOAD and 151 controls. We found that BACE1 activity differed across groups (p < 0.001), with a 25% increase in LOAD versus controls. High levels of BACE1 (IV quartile) were independently associated with the diagnosis of LOAD (OR 2.8; 1.4-5.7). Diagnostic accuracy was 76% for LOAD. Our data suggest that increased BACE1 activity in serum may represent a potential biomarker for LOAD. Additional studies are needed to confirm the usefulness of BACE1, alone or in combination with other markers, in discriminating patients and predicting LOAD onset and progression.
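As an illustration of how an odds ratio with a confidence interval, like the one quoted in this abstract, is typically obtained from a 2x2 table, here is a minimal Python sketch. The function name `odds_ratio_ci` and the counts are hypothetical and are not taken from the study. The abstract describes an independent association, which suggests an adjusted model; this unadjusted Wald calculation would not reproduce such an estimate.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Unadjusted odds ratio and Wald confidence interval for a 2x2 table.

    a: cases with high marker level      b: controls with high marker level
    c: cases with lower marker level     d: controls with lower marker level
    z: normal quantile (1.96 gives an approximate 95% interval)
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive for the Wald interval")
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the usual large-sample approximation.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts chosen only for illustration (NOT the study's data).
or_value, ci_low, ci_high = odds_ratio_ci(a=40, b=30, c=75, d=121)
print(f"OR {or_value:.2f} ({ci_low:.2f}-{ci_high:.2f})")
```

An interval that excludes 1 is what supports calling the association statistically significant at the chosen level.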
Affiliation(s)
- Carlo Cervellati
- Department of Biomedical and Specialist Surgical Sciences, University of Ferrara, Via Luigi Borsari 46, 44121, Ferrara, Italy.
| | - Alessandro Trentini
- Department of Biomedical and Specialist Surgical Sciences, University of Ferrara, Via Luigi Borsari 46, 44121, Ferrara, Italy
| | - Valentina Rosta
- Department of Biomedical and Specialist Surgical Sciences, University of Ferrara, Via Luigi Borsari 46, 44121, Ferrara, Italy
| | - Angelina Passaro
- Department of Morphology, Surgery, and Medical Sciences, University of Ferrara, Via Luigi Borsari 46, 44121, Ferrara, Italy
| | - Cristina Bosi
- Department of Morphology, Surgery, and Medical Sciences, University of Ferrara, Via Luigi Borsari 46, 44121, Ferrara, Italy
| | - Juana Maria Sanz
- Department of Morphology, Surgery, and Medical Sciences, University of Ferrara, Via Luigi Borsari 46, 44121, Ferrara, Italy
| | - Stefania Bonazzi
- Department of Morphology, Surgery, and Medical Sciences, University of Ferrara, Via Luigi Borsari 46, 44121, Ferrara, Italy
| | - Salvatore Pacifico
- Department of Chemical and Pharmaceutical Sciences, University of Ferrara, Via Luigi Borsari 46, 44121, Ferrara, Italy
| | - Davide Seripa
- Research Laboratory, Complex Structure of Geriatrics, Department of Medical Sciences, Fondazione IRCCS Casa Sollievo della Sofferenza, Viale Cappuccini, 1, 71013, San Giovanni Rotondo, Italy
| | - Giuseppe Valacchi
- Department of Biomedical and Specialist Surgical Sciences, University of Ferrara, Via Luigi Borsari 46, 44121, Ferrara, Italy; Plants for Human Health Institute, Animal Science Department, NC State University, 600 Laureate Way, Kannapolis, NC, 28081, USA; Department of Food and Nutrition, Kyung Hee University, 26, Kyungheedae-ro, Dongdaemun-gu, Seoul, 02447, Republic of Korea
| | - Remo Guerini
- Department of Chemical and Pharmaceutical Sciences, University of Ferrara, Via Luigi Borsari 46, 44121, Ferrara, Italy
| | - Giovanni Zuliani
- Department of Morphology, Surgery, and Medical Sciences, University of Ferrara, Via Luigi Borsari 46, 44121, Ferrara, Italy
| |
|
25
|
Mobasheri L, Moossavi SZ, Esmaeili A, Mohammadoo-Khorasani M, Sarab GA. Association between vitamin D receptor gene FokI and TaqI variants with autism spectrum disorder predisposition in Iranian population. Gene 2019; 723:144133. [PMID: 31589956 DOI: 10.1016/j.gene.2019.144133] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Received: 03/03/2019] [Revised: 09/12/2019] [Accepted: 09/16/2019] [Indexed: 01/18/2023]
Abstract
BACKGROUND AND AIM Autism spectrum disorder (ASD) is a neurodevelopmental and cognitive condition that affects 1 in 160 children worldwide. Several studies have shown a relationship between vitamin D receptor (VDR) gene polymorphisms and neurodevelopmental behavioral disorders. In the current study, we aimed to examine the association of VDR gene polymorphisms (FokI and TaqI) with the risk of autism in the Birjand population. MATERIAL AND METHODS In this case-control study, 81 patients diagnosed with ASD and 108 healthy controls were recruited from 2017 to 2018. Genotyping was carried out for all subjects by polymerase chain reaction followed by restriction fragment length polymorphism (PCR-RFLP) analysis. RESULTS The calculated odds ratios and P-values for the alleles of the VDR gene FokI and TaqI variants did not show a significant difference between autistic patients and controls (P > 0.05). However, the homozygous recessive genotype (tt) of the TaqI polymorphism was statistically significant (P = 0.015) in the control group, and the ft haplotype also differed significantly between the case and control groups (P = 0.04). CONCLUSION These results provide preliminary evidence that genetic variants of the VDR gene (FokI and TaqI) might be associated with a reduced risk of ASD in children. Further studies are needed to obtain more decisive and precise results in this area.
Affiliation(s)
- Leila Mobasheri
- Student Research Committee, Birjand University of Medical Sciences, Birjand, Iran; Department of Medical Immunology, Faculty of Medicine, Birjand University of Medical Sciences, Birjand, Iran
| | | | - Aliakbar Esmaeili
- Psychiatry and Behavioral Science Research Center, Birjand University of Medical Sciences, Birjand, Iran
| | - Milad Mohammadoo-Khorasani
- Department of Clinical Biochemistry, Faculty of Medical Sciences, Tarbiat Modares University, Tehran, Iran
| | - Gholamreza Anani Sarab
- Cellular and Molecular Research Center, Birjand University of Medical Sciences, Birjand, Iran; Department of Medical Immunology, Faculty of Medicine, Birjand University of Medical Sciences, Birjand, Iran.
| |
|
26
|
Kumar A, Bansal A, Singh TR. ABCD: Alzheimer's disease Biomarkers Comprehensive Database. 3 Biotech 2019; 9:351. [PMID: 31501752 DOI: 10.1007/s13205-019-1888-0] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 03/13/2019] [Accepted: 08/27/2019] [Indexed: 11/24/2022] Open
Abstract
Alzheimer's disease (AD) is an age-related, irreversible, and progressive brain disorder. Memory loss, confusion, and personality changes are its major symptoms. AD ultimately leads to a severe loss of mental function. Owing to the lack of effective biomarkers, no effective medication is available for the complete treatment of AD. There is a need to provide all essential AD-related information to the scientific community. Our resource, the Alzheimer's disease Biomarkers Comprehensive Database (ABCD), is intended to accomplish this objective. ABCD is a large collection of AD-related molecular marker data. The web interface contains information on the proteins, genes, transcription factors, SNPs, miRNAs, mitochondrial genes, and expressed genes implicated in AD pathogenesis. In addition to the molecular-level data, the database has information on animal models, medicinal candidates, and pathways involved in AD, as well as some image data for AD patients. ABCD is linked to some major external resources where the user can retrieve additional general information about the disease. The database was designed so that users can extract meaningful information through gene-, protein-, pathway-, and regulatory element-based search options. This database is unique in that it is completely dedicated to a specific neurological disorder, i.e., AD. Further advanced options, such as AD-affected brain image data of patients and structural compound-level information, add value to our database. Features of this database enable users to extract, analyze, and display information related to the disease in many different ways. The database is available for academic purposes and is accessible at http://www.bioinfoindia.org/abcd.
Affiliation(s)
- Ashwani Kumar
- Department of Biotechnology and Bioinformatics, Jaypee University of Information Technology, Waknaghat, Solan, Himachal Pradesh 173234 India
| | - Ankush Bansal
- Department of Biotechnology and Bioinformatics, Jaypee University of Information Technology, Waknaghat, Solan, Himachal Pradesh 173234 India
| | - Tiratha Raj Singh
- Department of Biotechnology and Bioinformatics, Jaypee University of Information Technology, Waknaghat, Solan, Himachal Pradesh 173234 India
| |
|
27
|
Qian Z, Zhang Z, Wang Y. T cell receptor signaling pathway and cytokine-cytokine receptor interaction affect the rehabilitation process after respiratory syncytial virus infection. PeerJ 2019; 7:e7089. [PMID: 31223533 PMCID: PMC6571000 DOI: 10.7717/peerj.7089] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.7] [Received: 01/14/2019] [Accepted: 05/06/2019] [Indexed: 11/20/2022] Open
Abstract
Background Respiratory syncytial virus (RSV) is the main cause of respiratory tract infection, which seriously threatens the health and life of children. This study was conducted to reveal the rehabilitation mechanisms of RSV infection. Methods The E-MTAB-5195 dataset was downloaded from the EBI ArrayExpress database; it includes 39 samples from the acute phase of infection and 21 samples from the recovery period. Differentially expressed RNAs (DE-RNAs) were identified using the limma package. Significant modules were identified using the WGCNA package, and the mRNAs in them were subjected to enrichment analysis using the DAVID tool. Afterwards, a co-expression network for the RNAs involved in the significant modules was built with Cytoscape software. Additionally, RSV-correlated pathways were retrieved from the Comparative Toxicogenomics Database, and the pathway network was then constructed. Results There were 2,489 DE-RNAs between the two groups, including 2,386 DE-mRNAs and 103 DE-lncRNAs. The RNAs in the black, salmon, blue, tan and turquoise modules, which correlated with stage, were taken as RNA set1. Meanwhile, the RNAs in the brown, blue, magenta and pink modules, which were related to disease severity, were defined as RNA set2. In the pathway networks, CD40LG and RASGRP1 co-expressed with LINC00891/LINC00526/LINC01215 were involved in the T cell receptor signaling pathway, and IL1B, IL1R2, IL18, and IL18R1 co-expressed with BAIAP2-AS1/CRNDE/LINC01503/SMIM25 were implicated in cytokine-cytokine receptor interaction. Conclusion LINC00891/LINC00526/LINC01215 co-expressed with CD40LG and RASGRP1 might affect the rehabilitation process of RSV infection through the T cell receptor signaling pathway. In addition, BAIAP2-AS1/CRNDE/LINC01503/SMIM25 co-expressed with the IL1 and IL18 families might function in the clearance process after RSV infection via cytokine-cytokine receptor interaction.
Affiliation(s)
- Zuanhao Qian
- Department of Pediatrics, Taikang Xianlin Drum Tower Hospital, Nanjing, China
| | - Zhenglei Zhang
- Department of Pediatrics, Taikang Xianlin Drum Tower Hospital, Nanjing, China
| | - Yingying Wang
- Department of Pediatrics, Taikang Xianlin Drum Tower Hospital, Nanjing, China
| |
|
28
|
Cheng L, Pandya PH, Liu E, Chandra P, Wang L, Murray ME, Carter J, Ferguson M, Saadatzadeh MR, Bijangi-Visheshsaraei K, Marshall M, Li L, Pollok KE, Renbarger JL. Integration of genomic copy number variations and chemotherapy-response biomarkers in pediatric sarcoma. BMC Med Genomics 2019; 12:23. [PMID: 30704460 PMCID: PMC6357363 DOI: 10.1186/s12920-018-0456-5] [Citation(s) in RCA: 28] [Impact Index Per Article: 4.7] [Indexed: 12/12/2022] Open
Abstract
Background While most pediatric sarcomas respond to front-line therapy, some bone sarcomas do not show a radiographic response like soft-tissue sarcomas (rhabdomyosarcomas) but do show 90% necrosis. Nevertheless, new therapies are urgently needed to improve survival and quality of life in pediatric patients with sarcomas. Complex chromosomal aberrations such as amplifications and deletions of DNA sequences are frequently observed in pediatric sarcomas. Evaluation of copy number variations (CNVs) associated with pediatric sarcoma patients at the time of diagnosis or following therapy offers an opportunity to assess dysregulated molecular targets and signaling pathways that may drive sarcoma development, progression, or relapse. The objective of this study was to use publicly available data sets to identify potential predictive biomarkers of chemotherapeutic response in pediatric Osteosarcoma (OS), Rhabdomyosarcoma (RMS) and Ewing's Sarcoma Family of Tumors (ESFTs) based on CNVs following chemotherapy (OS n = 117, RMS n = 64, ESFTs n = 25 tumor biopsies). Methods A total of 206 CNV profiles derived from pediatric sarcoma biopsies were collected from the public databases TARGET and NCBI-Gene Expression Omnibus (GEO). Through comparative genomic analyses of OS, RMS, and ESFTs and 22,255 healthy individuals retrieved from the Database of Genomic Variants (DGV), we identified CNV (amplification and deletion) patterns of genomic instability in these pediatric sarcomas. By integrating Cancer Cell Line Encyclopedia (CCLE) CNVs identified in the pool of genes with drug-response data from sarcoma cell lines (n = 27) in the Cancer Therapeutics Response Portal (CTRP) Version 2, potential predictive biomarkers of therapeutic response were identified.
Results Genes associated with survival and/or recurrence of these sarcomas with statistical significance were found on the long arm of chromosome 8, and smaller aberrations were also identified at chromosomes 1q, 12q and X in OS, RMS, and ESFTs. A pool of 63 genes that harbored amplifications and/or deletions was frequently associated with recurrence across OS, RMS, and ESFTs. In a correlation analysis of CNVs from CCLE with drug-response data from CTRP in 27 sarcoma cell lines, CNVs in 33 of the 63 genes correlated with either sensitivity or resistance to 17 chemotherapies, from which actionable CNV signatures such as IGF1R, MYC, MAPK1, ATF1, and MDM2 were identified. These CNV signatures could potentially be used to delineate patient populations that will respond versus those that will not respond to a particular chemotherapy. Conclusions The large-scale analyses of CNV-drug screening provide a platform to evaluate genetic alterations across aggressive pediatric sarcomas. Additionally, this study provides novel insights into the potential use of CNVs not only as prognostic but also as predictive biomarkers of therapeutic response. Information obtained in this study may help guide and prioritize patient-specific therapeutic options in pediatric bone and soft-tissue sarcomas. Electronic supplementary material The online version of this article (10.1186/s12920-018-0456-5) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Lijun Cheng
- Department of Biomedical Informatics, College of Medicine, Ohio State University, Columbus, OH, 43210, USA
| | - Pankita H Pandya
- Herman B Wells Center for Pediatric Research, Department of Pediatrics, School of Medicine, Indiana University, Indianapolis, IN, 46202, USA; Division of Hematology/Oncology, Department of Pediatrics, School of Medicine, Indiana University, Indianapolis, IN, 46202, USA
| | - Enze Liu
- Department of Biomedical Informatics, College of Medicine, Ohio State University, Columbus, OH, 43210, USA; Center for Computational Biology and Bioinformatics, School of Medicine, Indiana University, Indianapolis, IN, 46202, USA
| | - Pooja Chandra
- Center for Computational Biology and Bioinformatics, School of Medicine, Indiana University, Indianapolis, IN, 46202, USA
| | - Limei Wang
- Department of Biomedical Informatics, College of Medicine, Ohio State University, Columbus, OH, 43210, USA
| | - Mary E Murray
- Division of Hematology/Oncology, Department of Pediatrics, School of Medicine, Indiana University, Indianapolis, IN, 46202, USA
| | - Jacquelyn Carter
- Division of Hematology/Oncology, Department of Pediatrics, School of Medicine, Indiana University, Indianapolis, IN, 46202, USA
| | - Michael Ferguson
- Division of Hematology/Oncology, Department of Pediatrics, School of Medicine, Indiana University, Indianapolis, IN, 46202, USA
| | - Mohammad Reza Saadatzadeh
- Herman B Wells Center for Pediatric Research, Department of Pediatrics, School of Medicine, Indiana University, Indianapolis, IN, 46202, USA; Division of Hematology/Oncology, Department of Pediatrics, School of Medicine, Indiana University, Indianapolis, IN, 46202, USA
| | - Khadijeh Bijangi-Visheshsaraei
- Herman B Wells Center for Pediatric Research, Department of Pediatrics, School of Medicine, Indiana University, Indianapolis, IN, 46202, USA; Division of Hematology/Oncology, Department of Pediatrics, School of Medicine, Indiana University, Indianapolis, IN, 46202, USA
| | - Mark Marshall
- Division of Hematology/Oncology, Department of Pediatrics, School of Medicine, Indiana University, Indianapolis, IN, 46202, USA
| | - Lang Li
- Department of Biomedical Informatics, College of Medicine, Ohio State University, Columbus, OH, 43210, USA; Herman B Wells Center for Pediatric Research, Department of Pediatrics, School of Medicine, Indiana University, Indianapolis, IN, 46202, USA
| | - Karen E Pollok
- Herman B Wells Center for Pediatric Research, Department of Pediatrics, School of Medicine, Indiana University, Indianapolis, IN, 46202, USA; Division of Hematology/Oncology, Department of Pediatrics, School of Medicine, Indiana University, Indianapolis, IN, 46202, USA; Indiana University Melvin and Bren Simon Cancer Center, Indianapolis, IN, 46202, USA
| | - Jamie L Renbarger
- Herman B Wells Center for Pediatric Research, Department of Pediatrics, School of Medicine, Indiana University, Indianapolis, IN, 46202, USA; Division of Hematology/Oncology, Department of Pediatrics, School of Medicine, Indiana University, Indianapolis, IN, 46202, USA; Center for Computational Biology and Bioinformatics, School of Medicine, Indiana University, Indianapolis, IN, 46202, USA; Indiana University Melvin and Bren Simon Cancer Center, Indianapolis, IN, 46202, USA; Indiana Institute of Personalized Medicine, Indiana University, Indianapolis, IN, 46202, USA
| |
|
29
|
Wang Y, Wang Y, Liu F. A 44-gene set constructed for predicting the prognosis of clear cell renal cell carcinoma. Int J Mol Med 2018; 42:3105-3114. [PMID: 30272265 PMCID: PMC6202093 DOI: 10.3892/ijmm.2018.3899] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.1] [Received: 05/01/2018] [Accepted: 09/07/2018] [Indexed: 12/14/2022] Open
Abstract
Clear cell renal cell carcinoma (ccRCC) is the most frequent type of renal cell carcinoma (RCC). The present study aimed to examine prognostic markers and construct a prognostic prediction system for ccRCC. The mRNA sequencing data of ccRCC were downloaded from The Cancer Genome Atlas (TCGA) database, and the GSE40435 dataset was obtained from the Gene Expression Omnibus database. Using the limma package, the differentially expressed genes (DEGs) in the TCGA dataset and the GSE40435 dataset were obtained separately, and the overlapping DEGs were selected. Subsequently, Cox regression analysis was applied to screen prognosis-associated genes. Following visualization of the co-expression network using Cytoscape software, the network modules were examined using the GraphWeb tool. Functional annotation for genes in the network was performed using the clusterProfiler package. Finally, a prognostic prediction system was constructed through Bayes discriminant analysis and confirmed with the GSE29609 validation dataset. The results revealed a total of 263 overlapping DEGs and 161 prognosis-associated genes. Following construction of the co-expression network, 16 functional terms and three pathways were obtained for genes in the network. In addition, red, yellow [involving chemokine ligand 10 (CXCL10), CD27 molecule (CD27) and runt-related transcription factor 3 (RUNX3)], green [involving angiopoietin-like 4 (ANGPTL4), stanniocalcin 2 (STC2), and sperm associated antigen 4 (SPAG4)], and cyan modules were extracted from the co-expression network. Additionally, the prognostic prediction system involving 44 signature genes, including ANGPTL4, STC2, CXCL10, SPAG4, CD27, matrix metalloproteinase 9 (MMP9) and RUNX3, was identified and confirmed. In conclusion, the 44-gene prognostic prediction system involving ANGPTL4, STC2, CXCL10, SPAG4, CD27, MMP9 and RUNX3 may be used for predicting the prognosis of patients with ccRCC.
Affiliation(s)
- Yonggang Wang
- Department of Urology, China‑Japan Union Hospital of Jilin University, Changchun, Jilin 130033, P.R. China
| | - Yao Wang
- Department of Urology, China‑Japan Union Hospital of Jilin University, Changchun, Jilin 130033, P.R. China
| | - Feng Liu
- Department of Urology, China‑Japan Union Hospital of Jilin University, Changchun, Jilin 130033, P.R. China
| |
|
30
|
Bhasuran B, Natarajan J. Automatic extraction of gene-disease associations from literature using joint ensemble learning. PLoS One 2018; 13:e0200699. [PMID: 30048465 PMCID: PMC6061985 DOI: 10.1371/journal.pone.0200699] [Citation(s) in RCA: 30] [Impact Index Per Article: 4.3] [Received: 01/30/2018] [Accepted: 07/02/2018] [Indexed: 12/26/2022] Open
Abstract
A wealth of knowledge concerning relations between genes and their associated diseases is present in the biomedical literature. Mining these biological associations from the literature can provide immense support to research ranging from drug-targetable pathways to biomarker discovery. However, the time and cost of manual curation heavily slow it down. In this scenario, biomedical text mining is one of the crucial technologies, and relation extraction shows promising results for exploring genes associated with diseases. We addressed this problem from a text mining perspective by developing automatic extraction of gene-disease associations from the literature using joint ensemble learning. In the proposed work, we employ a supervised machine learning approach in which a rich feature set covering conceptual, syntactic and semantic properties, jointly learned with word embeddings, is used to train an ensemble of support vector machines for extracting gene-disease relations from four gold standard corpora. On evaluation, the machine learning approach shows promising results, with F-measures of 85.34%, 83.93%, 87.39% and 85.57% on the EUADR, GAD, CoMAGC and PolySearch corpora, respectively. We strongly believe that the presented novel approach, which combines a rich syntactic and semantic feature set with domain-specific word embeddings through ensemble support vector machines and was evaluated on four gold standard corpora, can act as a new baseline for future work in gene-disease relation extraction from the literature.
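For readers unfamiliar with the F-measure scores quoted in this abstract, the following Python sketch shows how the metric is computed from true positives, false positives, and false negatives. The function name `f_measure` and the counts are hypothetical and illustrate the formula only; they are not results from the paper.

```python
def f_measure(tp, fp, fn, beta=1.0):
    """F-beta score from confusion counts for the positive (relation) class.

    tp: relations correctly extracted
    fp: relations extracted but not in the gold standard
    fn: gold-standard relations that were missed
    beta: weight of recall relative to precision (1.0 gives the usual F1)
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision == 0.0 and recall == 0.0:
        return 0.0
    beta_sq = beta ** 2
    return (1 + beta_sq) * precision * recall / (beta_sq * precision + recall)


# Hypothetical counts chosen only for illustration (NOT from the paper).
score = f_measure(tp=80, fp=20, fn=10)
print(f"F1 = {score:.4f}")
```

Because F1 is the harmonic mean of precision and recall, a system cannot score well by maximizing only one of the two.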
Affiliation(s)
- Balu Bhasuran
- DRDO-BU Center for Life Sciences, Bharathiar University Campus, Coimbatore, Tamilnadu, India
| | - Jeyakumar Natarajan
- DRDO-BU Center for Life Sciences, Bharathiar University Campus, Coimbatore, Tamilnadu, India
- Data mining and Text mining Laboratory, Department of Bioinformatics, Bharathiar University, Coimbatore, Tamilnadu, India
| |
|
31
|
Lacroix M. Poor Usage of HUGO Standard Gene Nomenclature in Cancer Marker Studies. Int J Biol Markers 2018; 23:123-6. [DOI: 10.1177/172460080802300210] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Indexed: 11/15/2022]
Affiliation(s)
- M. Lacroix
- InTextoResearch, Baelen, Wallonia - Belgium
| |
|
32
|
Abstract
IMGT®, the international ImMunoGeneTics information system® ( http://www.imgt.org ), was created in 1989 by Marie-Paule Lefranc (Université de Montpellier and CNRS) to manage the huge diversity of the antigen receptors, immunoglobulins (IG) or antibodies, and T cell receptors (TR). The founding of IMGT® marked the advent of immunoinformatics, which emerged at the interface between immunogenetics and bioinformatics. Standardized sequence and structure analysis of antibody using IMGT® databases and tools allow one to bridge, for the first time, the gap between antibody sequences and three-dimensional (3D) structures. This is achieved through the IMGT Scientific chart rules, based on the IMGT-ONTOLOGY concepts of classification (IMGT gene and allele nomenclature), description (IMGT standardized labels), and numerotation (IMGT unique numbering and IMGT Collier de Perles). IMGT® is acknowledged as the global reference for immunogenetics and immunoinformatics, and its standards are particularly useful for antibody engineering and humanization. IMGT® databases for antibody nucleotide sequences and genes include IMGT/LIGM-DB and IMGT/GENE-DB, respectively, and nucleotide sequence analysis is performed by the IMGT/V-QUEST and IMGT/JunctionAnalysis tools and for NGS by IMGT/HighV-QUEST. In this chapter, we focus on IMGT® databases and tools for amino acid sequences, two-dimensional (2D) and three-dimensional (3D) structures: the IMGT/DomainGapAlign and IMGT Collier de Perles tools and the IMGT/2Dstructure-DB and IMGT/3Dstructure-DB database. 
IMGT/mAb-DB provides the query interface for monoclonal antibodies (mAb), fusion proteins for immune applications (FPIA), and composite proteins for clinical applications (CPCA) and related proteins of interest (RPI) and links to the proposed and recommended lists of the World Health Organization International Nonproprietary Name (WHO INN) programme, to IMGT/2Dstructure-DB for amino acid sequences, and to IMGT/3Dstructure-DB and its associated tools (IMGT/StructuralQuery, IMGT/DomainSuperimpose) for crystallized antibodies.
|
33
|
PMTDS: a computational method based on genetic interaction networks for Precision Medicine Target-Drug Selection in cancer. QUANTITATIVE BIOLOGY 2017. [DOI: 10.1007/s40484-017-0126-1] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.5] [Indexed: 12/12/2022]
|
34
|
A Novel Statistical Method to Diagnose, Quantify and Correct Batch Effects in Genomic Studies. Sci Rep 2017; 7:10849. [PMID: 28883548 PMCID: PMC5589920 DOI: 10.1038/s41598-017-11110-6] [Citation(s) in RCA: 34] [Impact Index Per Article: 4.3] [Received: 01/20/2017] [Accepted: 08/18/2017] [Indexed: 01/01/2023] Open
Abstract
Genome projects now generate large-scale data often produced at various time points by different laboratories using multiple platforms. This increases the potential for batch effects. Currently there are several batch evaluation methods, such as principal component analysis (PCA; mostly based on visual inspection), and sometimes they fail to reveal all of the underlying batch effects. These methods also carry the risk of unintentionally correcting biologically interesting factors attributed to batch effects. Here we propose a novel statistical method, finding batch effect (findBATCH), to evaluate batch effect based on probabilistic principal component and covariates analysis (PPCCA). The same framework also provides a new approach to batch correction, correcting batch effect (correctBATCH), which we have shown to be a better approach than traditional PCA-based correction. We demonstrate the utility of these methods using two different examples (breast and colorectal cancers) by merging gene expression data from different studies after diagnosing and correcting for batch effects and retaining the biological effects. These methods, along with conventional visual inspection-based PCA, are available as part of an R package, exploring batch effect (exploBATCH; https://github.com/syspremed/exploBATCH).
|
35
|
Tobón-Arroyave SI, Isaza-Guzmán DM, Pineda-Trujillo N. Association Study of Vitamin D Receptor (VDR) - Related Genetic Polymorphisms and their Haplotypes with Chronic Periodontitis in Colombian Population. J Clin Diagn Res 2017; 11:ZC60-ZC66. [PMID: 28384983 DOI: 10.7860/jcdr/2017/23967.9451] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 09/06/2016] [Accepted: 11/19/2016] [Indexed: 11/24/2022]
Abstract
INTRODUCTION There is strong evidence that both genetic and environmental factors may affect periodontal clinical status. However, epidemiological evidence on the association between Vitamin D Receptor (VDR) polymorphisms and Chronic Periodontitis (CP) has been inconsistent. AIM The focus of this study was to determine whether VDR Single-Nucleotide Polymorphisms (SNPs) may be implicated in the aetiopathogenesis of CP in a Colombian population. MATERIALS AND METHODS One hundred and ten CP patients and 50 Healthy Controls (HC) were recruited. Periodontal status was assessed based on probing depth, clinical attachment level, extent, and severity of periodontal breakdown. The polymerase chain reaction-restriction fragment length polymorphism method was used to identify the VDR rs7975232, rs1544410, rs2228570, and rs731236 SNPs from saliva samples. Odds Ratios (ORs) along with their 95% Confidence Intervals (CIs) were computed to compare the distribution of genotypes/alleles between HC and CP patients, alongside analysis of Linkage Disequilibrium (LD) and haplotype associations between SNPs. In addition, the interaction between the genetic findings and the significant demographic factors was analyzed for all SNPs. RESULTS Neither the genotype/allele frequencies nor the haplotypes were associated with CP. Similarly, no significant differences in extent or severity amongst genotype/allele groups were observed. Even so, interaction analysis revealed significant synergistic interactions between each SNP and age associated with the disease status. CONCLUSION Although these results do not support VDR SNPs as independent risk predictor variables for CP in the Colombian population, synergistic biological interactive effects of all these SNPs related to age might play a significant role in the pathogenic pathways of CP.
Affiliation(s)
- Sergio Iván Tobón-Arroyave
- Professor, Laboratory of Immunodetection and Bioanalysis, Faculty of Dentistry, University of Antioquia , Medellín, Antioquia, Colombia
| | - Diana María Isaza-Guzmán
- Professor, Laboratory of Immunodetection and Bioanalysis, Faculty of Dentistry, University of Antioquia , Medellín, Antioquia, Colombia
| | - Nicolás Pineda-Trujillo
- Professor, Gene Mapping Research Group, Faculty of Medicine, University of Antioquia , Medellín, Antioquia, Colombia
|
36
|
Annotating the Function of the Human Genome with Gene Ontology and Disease Ontology. BIOMED RESEARCH INTERNATIONAL 2016; 2016:4130861. [PMID: 27635398 PMCID: PMC5011202 DOI: 10.1155/2016/4130861] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.6] [Received: 06/02/2016] [Revised: 07/24/2016] [Accepted: 07/27/2016] [Indexed: 01/08/2023]
Abstract
Increasing evidence indicates that functional annotation of the human genome at the molecular and phenotype levels is very important for systematic analysis of genes. In this study, we present a framework named Gene2Function to annotate Gene Reference into Functions (GeneRIFs), in which each functional description of GeneRIFs could be annotated by the text mining tool Open Biomedical Annotator (OBA), and each Entrez gene could be mapped to a Human Genome Organisation Gene Nomenclature Committee (HGNC) gene symbol. After annotating all the records about human genes of GeneRIFs, 288,869 associations between 13,148 mRNAs and 7,182 terms, 9,496 associations between 948 microRNAs and 533 terms, and 901 associations between 139 long noncoding RNAs (lncRNAs) and 297 terms were obtained as a comprehensive annotation resource of the human genome. High consistency of term frequency of individual gene (Pearson correlation = 0.6401, p = 2.2e-16) and gene frequency of individual term (Pearson correlation = 0.1298, p = 3.686e-14) in GeneRIFs and GOA shows our annotation resource is very reliable.
|
37
|
Uzun A, Triche EW, Schuster J, Dewan AT, Padbury JF. dbPEC: a comprehensive literature-based database for preeclampsia related genes and phenotypes. Database (Oxford) 2016; 2016:baw006. [PMID: 26946289 PMCID: PMC4779341 DOI: 10.1093/database/baw006] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.9] [Received: 11/19/2015] [Revised: 12/28/2015] [Accepted: 01/12/2016] [Indexed: 01/08/2023]
Abstract
Preeclampsia is one of the most common causes of fetal and maternal morbidity and mortality in the world. We built a Database for Preeclampsia (dbPEC) consisting of the clinical features, concurrent conditions, published literature and genes associated with Preeclampsia. We included gene sets associated with severity, concurrent conditions, tissue sources and networks. The published scientific literature is the primary repository for all information documenting human disease. We used semantic data mining to retrieve and extract the articles pertaining to preeclampsia-associated genes and performed manual curation. We deposited the articles, genes, preeclampsia phenotypes and other supporting information into the dbPEC. It is publicly available and freely accessible. Previously, we developed a database for preterm birth (dbPTB) using a similar approach. Using the gene sets in dbPTB, we were able to successfully analyze a genome-wide study of preterm birth including 4000 women and children. We identified important genes and pathways associated with preterm birth that were not otherwise demonstrable using genome-wide approaches. dbPEC serves not only as a resource for genes and articles associated with preeclampsia but also as a robust source of gene sets to analyze a wide range of high-throughput data for gene set enrichment analysis. Database URL: http://ptbdb.cs.brown.edu/dbpec/.
Affiliation(s)
- Alper Uzun
- Department of Pediatrics, Women & Infants Hospital of Rhode Island, Providence, RI, USA Department of Pediatrics, Brown Alpert Medical School, Providence, RI, USA
| | - Elizabeth W Triche
- The Mandell Center for Multiple Sclerosis, Mount Sinai Rehabilitation Hospital, Hartford, CT, USA
| | - Jessica Schuster
- Department of Pediatrics, Women & Infants Hospital of Rhode Island, Providence, RI, USA Department of Pediatrics, Brown Alpert Medical School, Providence, RI, USA
| | - Andrew T Dewan
- Department of Epidemiology and Public Health, Yale University, New Haven, CT, USA
| | - James F Padbury
- Department of Pediatrics, Women & Infants Hospital of Rhode Island, Providence, RI, USA Department of Pediatrics, Brown Alpert Medical School, Providence, RI, USA Center for Computational Molecular Biology, Brown University, Providence, RI, USA
|
38
|
Hettne KM, Thompson M, van Haagen HHHBM, van der Horst E, Kaliyaperumal R, Mina E, Tatum Z, Laros JFJ, van Mulligen EM, Schuemie M, Aten E, Li TS, Bruskiewich R, Good BM, Su AI, Kors JA, den Dunnen J, van Ommen GJB, Roos M, ‘t Hoen PA, Mons B, Schultes EA. The Implicitome: A Resource for Rationalizing Gene-Disease Associations. PLoS One 2016; 11:e0149621. [PMID: 26919047 PMCID: PMC4769089 DOI: 10.1371/journal.pone.0149621] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.1] [Received: 09/22/2015] [Accepted: 02/03/2016] [Indexed: 11/19/2022]
Abstract
High-throughput experimental methods such as medical sequencing and genome-wide association studies (GWAS) identify increasingly large numbers of potential relations between genetic variants and diseases. Both biological complexity (millions of potential gene-disease associations) and the accelerating rate of data production necessitate computational approaches to prioritize and rationalize potential gene-disease relations. Here, we use concept profile technology to expose from the biomedical literature both explicitly stated gene-disease relations (the explicitome) and a much larger set of implied gene-disease associations (the implicitome). Implicit relations are largely unknown to, or are even unintended by the original authors, but they vastly extend the reach of existing biomedical knowledge for identification and interpretation of gene-disease associations. The implicitome can be used in conjunction with experimental data resources to rationalize both known and novel associations. We demonstrate the usefulness of the implicitome by rationalizing known and novel gene-disease associations, including those from GWAS. To facilitate the re-use of implicit gene-disease associations, we publish our data in compliance with FAIR Data Publishing recommendations [https://www.force11.org/group/fairgroup] using nanopublications. An online tool (http://knowledge.bio) is available to explore established and potential gene-disease associations in the context of other biomedical relations.
Affiliation(s)
- Kristina M. Hettne
- Department of Human Genetics, Leiden University Medical Center, Leiden, The Netherlands
| | - Mark Thompson
- Department of Human Genetics, Leiden University Medical Center, Leiden, The Netherlands
| | | | - Eelke van der Horst
- Department of Human Genetics, Leiden University Medical Center, Leiden, The Netherlands
| | - Rajaram Kaliyaperumal
- Department of Human Genetics, Leiden University Medical Center, Leiden, The Netherlands
| | - Eleni Mina
- Department of Human Genetics, Leiden University Medical Center, Leiden, The Netherlands
| | - Zuotian Tatum
- Department of Human Genetics, Leiden University Medical Center, Leiden, The Netherlands
| | - Jeroen F. J. Laros
- Department of Human Genetics, Leiden University Medical Center, Leiden, The Netherlands
| | - Erik M. van Mulligen
- Department of Human Genetics, Leiden University Medical Center, Leiden, The Netherlands
- Department of Medical Informatics, Erasmus University Medical Center Rotterdam, Rotterdam, The Netherlands
| | - Martijn Schuemie
- Department of Medical Informatics, Erasmus University Medical Center Rotterdam, Rotterdam, The Netherlands
| | - Emmelien Aten
- Department of Human Genetics, Leiden University Medical Center, Leiden, The Netherlands
| | - Tong Shu Li
- Department of Molecular and Experimental Medicine, The Scripps Research Institute, La Jolla, CA, United States of America
| | | | - Benjamin M. Good
- Department of Molecular and Experimental Medicine, The Scripps Research Institute, La Jolla, CA, United States of America
| | - Andrew I. Su
- Department of Molecular and Experimental Medicine, The Scripps Research Institute, La Jolla, CA, United States of America
| | - Jan A. Kors
- Department of Medical Informatics, Erasmus University Medical Center Rotterdam, Rotterdam, The Netherlands
| | - Johan den Dunnen
- Department of Human Genetics, Leiden University Medical Center, Leiden, The Netherlands
| | - Gert-Jan B. van Ommen
- Department of Human Genetics, Leiden University Medical Center, Leiden, The Netherlands
| | - Marco Roos
- Department of Human Genetics, Leiden University Medical Center, Leiden, The Netherlands
| | - Peter A.C. ‘t Hoen
- Department of Human Genetics, Leiden University Medical Center, Leiden, The Netherlands
| | - Barend Mons
- Department of Human Genetics, Leiden University Medical Center, Leiden, The Netherlands
- Dutch Techcentre for Life Sciences, Utrecht, The Netherlands
| | - Erik A. Schultes
- Department of Human Genetics, Leiden University Medical Center, Leiden, The Netherlands
- Leiden Institute for Advanced Computer Science, Leiden, The Netherlands
|
39
|
César-Razquin A, Snijder B, Frappier-Brinton T, Isserlin R, Gyimesi G, Bai X, Reithmeier RA, Hepworth D, Hediger MA, Edwards AM, Superti-Furga G. A Call for Systematic Research on Solute Carriers. Cell 2015; 162:478-87. [PMID: 26232220 DOI: 10.1016/j.cell.2015.07.022] [Citation(s) in RCA: 405] [Impact Index Per Article: 40.5] [Received: 03/18/2015] [Indexed: 01/10/2023]
Abstract
Solute carrier (SLC) membrane transport proteins control essential physiological functions, including nutrient uptake, ion transport, and waste removal. SLCs interact with several important drugs, and a quarter of the more than 400 SLC genes are associated with human diseases. Yet, compared to other gene families of similar stature, SLCs are relatively understudied. The time is right for a systematic attack on SLC structure, specificity, and function, taking into account kinship and expression, as well as the dependencies that arise from the common metabolic space.
Affiliation(s)
- Adrián César-Razquin
- CeMM Research Center for Molecular Medicine of the Austrian Academy of Sciences, 1090 Vienna, Austria
| | - Berend Snijder
- CeMM Research Center for Molecular Medicine of the Austrian Academy of Sciences, 1090 Vienna, Austria
| | | | - Ruth Isserlin
- The Donnelly Centre, University of Toronto, Toronto, Ontario, M5S 3E1, Canada
| | - Gergely Gyimesi
- Institute of Biochemistry and Molecular Medicine and Swiss National Center of Competence in Research, NCCR TransCure, University of Bern, 3012 Bern, Switzerland
| | - Xiaoyun Bai
- Department of Biochemistry, University of Toronto, Toronto, Ontario, M5S 1A8 Canada
| | | | - David Hepworth
- Worldwide Medicinal Chemistry, Pfizer Worldwide Research and Development, Cambridge, MA 02139, USA
| | - Matthias A Hediger
- Institute of Biochemistry and Molecular Medicine and Swiss National Center of Competence in Research, NCCR TransCure, University of Bern, 3012 Bern, Switzerland.
| | - Aled M Edwards
- Structural Genomics Consortium, University of Toronto, Toronto, Ontario M5G 1L7, Canada.
| | - Giulio Superti-Furga
- CeMM Research Center for Molecular Medicine of the Austrian Academy of Sciences, 1090 Vienna, Austria; Center for Physiology and Pharmacology, Medical University of Vienna, 1090 Vienna, Austria.
|
40
|
Lefranc MP. Immunoglobulins: 25 years of immunoinformatics and IMGT-ONTOLOGY. Biomolecules 2014; 4:1102-39. [PMID: 25521638 PMCID: PMC4279172 DOI: 10.3390/biom4041102] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.0] [Received: 11/05/2014] [Revised: 12/02/2014] [Accepted: 12/03/2014] [Indexed: 11/17/2022]
Abstract
IMGT®, the international ImMunoGeneTics information system® (CNRS and Montpellier University) is the global reference in immunogenetics and immunoinformatics. By its creation in 1989, IMGT® marked the advent of immunoinformatics, which emerged at the interface between immunogenetics and bioinformatics. IMGT® is specialized in the immunoglobulins (IG) or antibodies, T cell receptors (TR), major histocompatibility (MH), and IgSF and MhSF superfamilies. IMGT® has been built on the IMGT-ONTOLOGY axioms and concepts, which bridged the gap between genes, sequences and three-dimensional (3D) structures. The concepts include the IMGT® standardized keywords (identification), IMGT® standardized labels (description), IMGT® standardized nomenclature (classification), IMGT unique numbering and IMGT Colliers de Perles (numerotation). IMGT® comprises seven databases, 15,000 pages of web resources and 17 tools. IMGT® tools and databases provide a high-quality analysis of the IG from fish to humans, for basic, veterinary and medical research, and for antibody engineering and humanization. They include, as examples: IMGT/V-QUEST and IMGT/JunctionAnalysis for nucleotide sequence analysis and their high-throughput version IMGT/HighV-QUEST for next generation sequencing, IMGT/DomainGapAlign for amino acid sequence analysis of IG domains, IMGT/3Dstructure-DB for 3D structures, contact analysis and paratope/epitope interactions of IG/antigen complexes, and the IMGT/mAb-DB interface for therapeutic antibodies and fusion proteins for immunological applications (FPIA).
Affiliation(s)
- Marie-Paule Lefranc
- IMGT®, the international ImMunoGenetics information system®, Laboratoire d'ImmunoGénétique Moléculaire LIGM, Institut de Génétique Humaine IGH, UPR CNRS 1142, Montpellier University, 141 rue de la Cardonille, 34396 Montpellier cedex 5, France.
|
41
|
Alamyar E, Giudicelli V, Duroux P, Lefranc MP. Antibody V and C domain sequence, structure, and interaction analysis with special reference to IMGT®. Methods Mol Biol 2014; 1131:337-81. [PMID: 24515476 DOI: 10.1007/978-1-62703-992-5_21] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.7] [Indexed: 11/24/2022]
Abstract
IMGT®, the international ImMunoGeneTics information system® (http://www.imgt.org), created in 1989 (Centre National de la Recherche Scientifique, Montpellier University), is acknowledged as the global reference in immunogenetics and immunoinformatics. The accuracy and the consistency of the IMGT® data are based on IMGT-ONTOLOGY which bridges the gap between genes, sequences, and three-dimensional (3D) structures. Thus, receptors, chains, and domains are characterized with the same IMGT® rules and standards (IMGT standardized labels, IMGT gene and allele nomenclature, IMGT unique numbering, IMGT Collier de Perles), independently from the molecule type (genomic DNA, complementary DNA, transcript, or protein) or from the species. More particularly, IMGT® tools and databases provide a highly standardized analysis of the immunoglobulin (IG) or antibody and T cell receptor (TR) V and C domains. IMGT/V-QUEST analyzes the V domains of IG or TR rearranged nucleotide sequences, integrates the IMGT/JunctionAnalysis and IMGT/Automat tools, and provides IMGT Collier de Perles. IMGT/HighV-QUEST analyzes sequences from high-throughput sequencing (HTS) (up to 150,000 sequences per batch) and performs statistical analysis on up to 450,000 results, with the same resolution and high quality as IMGT/V-QUEST online. IMGT/DomainGapAlign analyzes amino acid sequences of V and C domains and IMGT/3Dstructure-DB and associated tools provide information on 3D structures, contact analysis, and paratope/epitope interactions. These IMGT® tools and databases, and the IMGT/mAb-DB interface with access to therapeutical antibody data, provide an invaluable help for antibody engineering and antibody humanization.
Affiliation(s)
- Eltaf Alamyar
- The International ImMunoGenetics information system, Laboratoire d'ImmunoGénétique Moléculaire, Institut de Génétique Humaine IGH, Université Montpellier 2, Montpellier, France
|
42
|
Ulveling D, Dinger ME, Francastel C, Hubé F. Identification of a dinucleotide signature that discriminates coding from non-coding long RNAs. Front Genet 2014; 5:316. [PMID: 25250049 PMCID: PMC4158813 DOI: 10.3389/fgene.2014.00316] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Received: 06/26/2014] [Accepted: 08/22/2014] [Indexed: 11/13/2022]
Abstract
To date, the main criterion by which long ncRNAs (lncRNAs) are discriminated from mRNAs is based on the capacity of the transcripts to encode a protein. However, it becomes important to identify non-ORF-based sequence characteristics that can be used to parse between ncRNAs and mRNAs. In this study, we first established an extremely selective workflow to define a highly refined database of lncRNAs which was used for comparison with mRNAs. Then using this highly selective collection of lncRNAs, we found the CG dinucleotide frequencies were clearly distinct. In addition, we showed that the bias in CG dinucleotide frequency was conserved in human and mouse genomes. We propose that this sequence feature will serve as a useful classifier in transcript classification pipelines. We also suggest that our refined database of "bona fide" lncRNAs will be valuable for the discovery of other sequence characteristics distinct to lncRNAs.
Affiliation(s)
- Damien Ulveling
- CNRS UMR7216, Epigenetics and Cell Fate, Université Paris Diderot, Sorbonne Paris Cité Paris, France
| | - Marcel E Dinger
- The University of Queensland Diamantina Institute, The University of Queensland Brisbane, QLD, Australia
| | - Claire Francastel
- CNRS UMR7216, Epigenetics and Cell Fate, Université Paris Diderot, Sorbonne Paris Cité Paris, France
| | - Florent Hubé
- CNRS UMR7216, Epigenetics and Cell Fate, Université Paris Diderot, Sorbonne Paris Cité Paris, France
|
43
|
Thomas JK, Kim MS, Balakrishnan L, Nanjappa V, Raju R, Marimuthu A, Radhakrishnan A, Muthusamy B, Khan AA, Sakamuri S, Tankala SG, Singal M, Nair B, Sirdeshmukh R, Chatterjee A, Prasad TSK, Maitra A, Gowda H, Hruban RH, Pandey A. Pancreatic Cancer Database: an integrative resource for pancreatic cancer. Cancer Biol Ther 2014; 15:963-7. [PMID: 24839966 DOI: 10.4161/cbt.29188] [Citation(s) in RCA: 43] [Impact Index Per Article: 3.9] [Indexed: 12/18/2022]
Abstract
Pancreatic cancer is the fourth leading cause of cancer-related death in the world. The etiology of pancreatic cancer is heterogeneous with a wide range of alterations that have already been reported at the level of the genome, transcriptome, and proteome. The past decade has witnessed a large number of experimental studies using high-throughput technology platforms to identify genes whose expression at the transcript or protein levels is altered in pancreatic cancer. Based on expression studies, a number of molecules have also been proposed as potential biomarkers for diagnosis and prognosis of this deadly cancer. Currently, there are no repositories which provide an integrative view of multiple Omics data sets from published research on pancreatic cancer. Here, we describe the development of a web-based resource, Pancreatic Cancer Database (http://www.pancreaticcancerdatabase.org), as a unified platform for pancreatic cancer research. PCD contains manually curated information pertaining to quantitative alterations in miRNA, mRNA, and proteins obtained from small-scale as well as high-throughput studies of pancreatic cancer tissues and cell lines. We believe that PCD will serve as an integrative platform for the scientific community involved in pancreatic cancer research.
Affiliation(s)
- Joji Kurian Thomas
- Institute of Bioinformatics; International Technology Park; Bangalore, India; Amrita School of Biotechnology; Amrita Vishwa Vidyapeetham; Kollam, Kerala India
| | - Min-Sik Kim
- McKusick-Nathans Institute of Genetic Medicine; Johns Hopkins University School of Medicine; Baltimore, MD USA; Department of Biological Chemistry; Johns Hopkins University School of Medicine; Baltimore, MD USA
| | | | - Vishalakshi Nanjappa
- Institute of Bioinformatics; International Technology Park; Bangalore, India; Amrita School of Biotechnology; Amrita Vishwa Vidyapeetham; Kollam, Kerala India
| | - Rajesh Raju
- Institute of Bioinformatics; International Technology Park; Bangalore, India
| | | | - Aneesha Radhakrishnan
- Institute of Bioinformatics; International Technology Park; Bangalore, India; Department of Biochemistry and Molecular Biology; School of Life Sciences; Pondicherry University; Puducherry, India
| | - Babylakshmi Muthusamy
- Institute of Bioinformatics; International Technology Park; Bangalore, India; Bioinformatics Centre; School of Life Sciences; Pondicherry University; Puducherry, India
| | - Aafaque Ahmad Khan
- Institute of Bioinformatics; International Technology Park; Bangalore, India
| | - Sruthi Sakamuri
- McKusick-Nathans Institute of Genetic Medicine; Johns Hopkins University School of Medicine; Baltimore, MD USA
| | | | - Mukul Singal
- Government Medical College and Hospital; Chandigarh, India
| | - Bipin Nair
- Amrita School of Biotechnology; Amrita Vishwa Vidyapeetham; Kollam, Kerala India
| | - Ravi Sirdeshmukh
- Institute of Bioinformatics; International Technology Park; Bangalore, India
| | - Aditi Chatterjee
- Institute of Bioinformatics; International Technology Park; Bangalore, India
| | - T S Keshava Prasad
- Institute of Bioinformatics; International Technology Park; Bangalore, India; Amrita School of Biotechnology; Amrita Vishwa Vidyapeetham; Kollam, Kerala India
| | - Anirban Maitra
- Departments of Pathology and Translational Molecular Pathology; Sheikh Ahmed Bin Zayed Al Nahyan Center for Pancreatic Cancer Research; UT MD Anderson Cancer Center; Houston, TX USA
| | - Harsha Gowda
- Institute of Bioinformatics; International Technology Park; Bangalore, India
| | - Ralph H Hruban
- Department of Pathology; Sol Goldman Pancreatic Cancer Research Center; Johns Hopkins University School of Medicine; Baltimore, MD USA; Department of Oncology; Johns Hopkins University School of Medicine; Baltimore, MD USA
| | - Akhilesh Pandey
- McKusick-Nathans Institute of Genetic Medicine; Johns Hopkins University School of Medicine; Baltimore, MD USA; Department of Biological Chemistry; Johns Hopkins University School of Medicine; Baltimore, MD USA; Department of Pathology; Sol Goldman Pancreatic Cancer Research Center; Johns Hopkins University School of Medicine; Baltimore, MD USA; Department of Oncology; Johns Hopkins University School of Medicine; Baltimore, MD USA
|
44
|
Abstract
snoRNAs (small nucleolar RNAs) constitute one of the largest and best-studied classes of non-coding RNAs that confer enzymatic specificity. With associated proteins, these snoRNAs form ribonucleoprotein complexes that can direct 2'-O-methylation or pseudouridylation of target non-coding RNAs. Aided by computational methods and high-throughput sequencing, new studies have expanded the diversity of known snoRNA functions. Complexes incorporating snoRNAs have dynamic specificity, and include diverse roles in RNA silencing, telomerase maintenance and regulation of alternative splicing. Evidence that dysregulation of snoRNAs can cause human disease, including cancer, indicates that the full scope of snoRNA roles remains an unfinished story. The diversity in structure, genomic origin and function between snoRNAs found in different complexes and among different phyla illustrates the surprising plasticity of snoRNAs in evolution. The ability of snoRNAs to direct highly specific interactions with other RNAs is a consistent thread in their newly discovered functions. Because they are ubiquitous throughout Eukarya and Archaea, it is likely they were a feature of the last common ancestor of these two domains, placing their origin over two billion years ago. In the present chapter, we focus on recent advances in our understanding of these ancient, but functionally dynamic RNA-processing machines.
|
45
|
Abstract
Antibody informatics, a part of immunoinformatics, refers to the concepts, databases, and tools developed and used to explore and to analyze the particular properties of the immunoglobulins (IG) or antibodies, compared with conventional genes and proteins. Antibody informatics is based on a unique ontology, IMGT-ONTOLOGY, created in 1989 by IMGT, the international ImMunoGeneTics information system (http://www.imgt.org). IMGT-ONTOLOGY defined, for the first time, the concept of ‘genes’ for the IG and the T cell receptors (TR), which led to their gene and allele nomenclature and allowed their entry in databases and tools. A second IMGT-ONTOLOGY revolutionizing and definitive concept was the IMGT unique numbering that bridged the gap between sequences and structures for the variable (V) and constant (C) domains of the IG and TR, and for the groove (G) domains of the major histocompatibility (MH). These breakthroughs contributed to the development of IMGT databases and tools for antibody informatics and its diverse applications, such as repertoire analysis in infectious diseases, antibody engineering and humanization, and study of antibody/antigen interactions. Nucleotide sequences of antibody V domains from deep sequencing (Next Generation Sequencing or High Throughput Sequencing) are analyzed with IMGT/HighV-QUEST, the high-throughput version of IMGT/V-QUEST and IMGT/JunctionAnalysis. Amino acid sequences of V and C domains are represented with the IMGT/Collier-de-Perles tool and analyzed with IMGT/DomainGapAlign. Three-dimensional (3D) structures (including contact analysis and paratope/epitope) are described in IMGT/3Dstructure-DB. Based on a friendly interface, IMGT/mAb-DB contains therapeutic monoclonal antibodies (INN suffix–mab) that can be queried on their specificity, for example, in infectious diseases, on bacterial or viral targets.
|
46
|
Lefranc MP. Immunoglobulin and T Cell Receptor Genes: IMGT® and the Birth and Rise of Immunoinformatics. Front Immunol 2014; 5:22. [PMID: 24600447 PMCID: PMC3913909 DOI: 10.3389/fimmu.2014.00022] [Citation(s) in RCA: 176] [Impact Index Per Article: 16.0] [Received: 12/17/2013] [Accepted: 01/15/2014] [Indexed: 11/13/2022]
Abstract
IMGT®, the international ImMunoGeneTics information system® (1), (CNRS and Université Montpellier 2) is the global reference in immunogenetics and immunoinformatics. By its creation in 1989, IMGT® marked the advent of immunoinformatics, which emerged at the interface between immunogenetics and bioinformatics. IMGT® is specialized in the immunoglobulins (IG) or antibodies, T cell receptors (TR), major histocompatibility (MH), and proteins of the IgSF and MhSF superfamilies. IMGT® has been built on the IMGT-ONTOLOGY axioms and concepts, which bridged the gap between genes, sequences, and three-dimensional (3D) structures. The concepts include the IMGT® standardized keywords (concepts of identification), IMGT® standardized labels (concepts of description), IMGT® standardized nomenclature (concepts of classification), IMGT unique numbering, and IMGT Colliers de Perles (concepts of numerotation). IMGT® comprises seven databases, 15,000 pages of web resources, and 17 tools, and provides a high-quality and integrated system for the analysis of the genomic and expressed IG and TR repertoire of the adaptive immune responses. Tools and databases are used in basic, veterinary, and medical research, in clinical applications (mutation analysis in leukemia and lymphoma) and in antibody engineering and humanization. They include, for example IMGT/V-QUEST and IMGT/JunctionAnalysis for nucleotide sequence analysis and their high-throughput version IMGT/HighV-QUEST for next-generation sequencing (500,000 sequences per batch), IMGT/DomainGapAlign for amino acid sequence analysis of IG and TR variable and constant domains and of MH groove domains, IMGT/3Dstructure-DB for 3D structures, contact analysis and paratope/epitope interactions of IG/antigen and TR/peptide-MH complexes and IMGT/mAb-DB interface for therapeutic antibodies and fusion proteins for immune applications (FPIA).
Affiliation(s)
- Marie-Paule Lefranc
- The International ImMunoGenetics Information System (IMGT), Laboratoire d’ImmunoGénétique Moléculaire (LIGM), Institut de Génétique Humaine, UPR CNRS, Université Montpellier 2, Montpellier, France
|
47
|
Lovell PV, Wirthlin M, Wilhelm L, Minx P, Lazar NH, Carbone L, Warren WC, Mello CV. Conserved syntenic clusters of protein coding genes are missing in birds. Genome Biol 2014; 15:565. [PMID: 25518852 PMCID: PMC4290089 DOI: 10.1186/s13059-014-0565-1] [Citation(s) in RCA: 87] [Impact Index Per Article: 7.9] [Received: 10/29/2014] [Accepted: 12/08/2014] [Indexed: 12/12/2022]
Abstract
BACKGROUND Birds are one of the most highly successful and diverse groups of vertebrates, having evolved a number of distinct characteristics, including feathers and wings, a sturdy lightweight skeleton and unique respiratory and urinary/excretion systems. However, the genetic basis of these traits is poorly understood. RESULTS Using comparative genomics based on extensive searches of 60 avian genomes, we have found that birds lack approximately 274 protein coding genes that are present in the genomes of most vertebrate lineages and are for the most part organized in conserved syntenic clusters in non-avian sauropsids and in humans. These genes are located in regions associated with chromosomal rearrangements, and are largely present in crocodiles, suggesting that their loss occurred subsequent to the split of dinosaurs/birds from crocodilians. Many of these genes are associated with lethality in rodents, human genetic disorders, or biological functions targeting various tissues. Functional enrichment analysis combined with orthogroup analysis and paralog searches revealed enrichments that were shared by non-avian species, present only in birds, or shared between all species. CONCLUSIONS Together these results provide a clearer definition of the genetic background of extant birds, extend the findings of previous studies on missing avian genes, and provide clues about molecular events that shaped avian evolution. They also have implications for fields that largely benefit from avian studies, including development, immune system, oncogenesis, and brain function and cognition. With regards to the missing genes, birds can be considered ‘natural knockouts’ that may become invaluable model organisms for several human diseases.
Affiliation(s)
- Peter V Lovell
- Department of Behavioral Neuroscience, Oregon Health and Science University, Portland, OR USA
| | - Morgan Wirthlin
- Department of Behavioral Neuroscience, Oregon Health and Science University, Portland, OR USA
| | - Larry Wilhelm
- Department of Behavioral Neuroscience, Oregon Health and Science University, Portland, OR USA
- Oregon National Primate Research Center, West Campus, Oregon Health and Science University, Portland, OR USA
| | - Patrick Minx
- The Genome Institute, Washington University School of Medicine, St. Louis, MO USA
| | - Nathan H Lazar
- Oregon National Primate Research Center, West Campus, Oregon Health and Science University, Portland, OR USA
- Bioinformatics and Computational Biology Division, Oregon Health & Science University, Portland, OR USA
| | - Lucia Carbone
- Department of Behavioral Neuroscience, Oregon Health and Science University, Portland, OR USA
- Oregon National Primate Research Center, West Campus, Oregon Health and Science University, Portland, OR USA
| | - Wesley C Warren
- The Genome Institute, Washington University School of Medicine, St. Louis, MO USA
| | - Claudio V Mello
- Department of Behavioral Neuroscience, Oregon Health and Science University, Portland, OR USA
|
48
|
van Haagen HHHBM, 't Hoen PAC, Mons B, Schultes EA. Generic information can retrieve known biological associations: implications for biomedical knowledge discovery. PLoS One 2013; 8:e78665. [PMID: 24260124 PMCID: PMC3834066 DOI: 10.1371/journal.pone.0078665] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.8] [Received: 05/25/2013] [Accepted: 09/13/2013] [Indexed: 02/01/2023]
Abstract
Motivation Weighted semantic networks built from text-mined literature can be used to retrieve known protein-protein or gene-disease associations, and have been shown to anticipate associations years before they are explicitly stated in the literature. Our text-mining system recognizes over 640,000 biomedical concepts: some are specific (i.e., names of genes or proteins) others generic (e.g., ‘Homo sapiens’). Generic concepts may play important roles in automated information retrieval, extraction, and inference but may also result in concept overload and confound retrieval and reasoning with low-relevance or even spurious links. Here, we attempted to optimize the retrieval performance for protein-protein interactions (PPI) by filtering generic concepts (node filtering) or links to generic concepts (edge filtering) from a weighted semantic network. First, we defined metrics based on network properties that quantify the specificity of concepts. Then using these metrics, we systematically filtered generic information from the network while monitoring retrieval performance of known protein-protein interactions. We also systematically filtered specific information from the network (inverse filtering), and assessed the retrieval performance of networks composed of generic information alone. Results Filtering generic or specific information induced a two-phase response in retrieval performance: initially the effects of filtering were minimal but beyond a critical threshold network performance suddenly drops. Contrary to expectations, networks composed exclusively of generic information demonstrated retrieval performance comparable to unfiltered networks that also contain specific concepts. Furthermore, an analysis using individual generic concepts demonstrated that they can effectively support the retrieval of known protein-protein interactions. For instance the concept “binding” is indicative for PPI retrieval and the concept “mutation abnormality” is indicative for gene-disease associations. Conclusion Generic concepts are important for information retrieval and cannot be removed from semantic networks without negative impact on retrieval performance.
Affiliation(s)
| | - Peter A. C. 't Hoen
- Department of Human Genetics, Leiden University Medical Center, Leiden, The Netherlands
| | - Barend Mons
- Department of Human Genetics, Leiden University Medical Center, Leiden, The Netherlands
| | - Erik A. Schultes
- Department of Human Genetics, Leiden University Medical Center, Leiden, The Netherlands
|
49
|
Bouatra S, Aziat F, Mandal R, Guo AC, Wilson MR, Knox C, Bjorndahl TC, Krishnamurthy R, Saleem F, Liu P, Dame ZT, Poelzer J, Huynh J, Yallou FS, Psychogios N, Dong E, Bogumil R, Roehring C, Wishart DS. The human urine metabolome. PLoS One 2013; 8:e73076. [PMID: 24023812 PMCID: PMC3762851 DOI: 10.1371/journal.pone.0073076] [Citation(s) in RCA: 996] [Impact Index Per Article: 83.0] [Received: 06/07/2013] [Accepted: 07/09/2013] [Indexed: 02/07/2023]
Abstract
Urine has long been a "favored" biofluid among metabolomics researchers. It is sterile, easy-to-obtain in large volumes, largely free from interfering proteins or lipids and chemically complex. However, this chemical complexity has also made urine a particularly difficult substrate to fully understand. As a biological waste material, urine typically contains metabolic breakdown products from a wide range of foods, drinks, drugs, environmental contaminants, endogenous waste metabolites and bacterial by-products. Many of these compounds are poorly characterized and poorly understood. In an effort to improve our understanding of this biofluid we have undertaken a comprehensive, quantitative, metabolome-wide characterization of human urine. This involved both computer-aided literature mining and comprehensive, quantitative experimental assessment/validation. The experimental portion employed NMR spectroscopy, gas chromatography mass spectrometry (GC-MS), direct flow injection mass spectrometry (DFI/LC-MS/MS), inductively coupled plasma mass spectrometry (ICP-MS) and high performance liquid chromatography (HPLC) experiments performed on multiple human urine samples. This multi-platform metabolomic analysis allowed us to identify 445 and quantify 378 unique urine metabolites or metabolite species. The different analytical platforms were able to identify (quantify) a total of: 209 (209) by NMR, 179 (85) by GC-MS, 127 (127) by DFI/LC-MS/MS, 40 (40) by ICP-MS and 10 (10) by HPLC. Our use of multiple metabolomics platforms and technologies allowed us to identify several previously unknown urine metabolites and to substantially enhance the level of metabolome coverage. It also allowed us to critically assess the relative strengths and weaknesses of different platforms or technologies. The literature review led to the identification and annotation of another 2206 urinary compounds and was used to help guide the subsequent experimental studies. An online database containing the complete set of 2651 confirmed human urine metabolite species, their structures (3079 in total), concentrations, related literature references and links to their known disease associations is freely available at http://www.urinemetabolome.ca.
Affiliation(s)
- Souhaila Bouatra
- Department of Biological Sciences, University of Alberta, Edmonton, Alberta, Canada
| | - Farid Aziat
- Department of Biological Sciences, University of Alberta, Edmonton, Alberta, Canada
| | - Rupasri Mandal
- Department of Biological Sciences, University of Alberta, Edmonton, Alberta, Canada
| | - An Chi Guo
- Department of Computing Sciences, University of Alberta, Edmonton, Alberta, Canada
| | - Michael R. Wilson
- Department of Computing Sciences, University of Alberta, Edmonton, Alberta, Canada
| | - Craig Knox
- Department of Computing Sciences, University of Alberta, Edmonton, Alberta, Canada
| | - Trent C. Bjorndahl
- Department of Biological Sciences, University of Alberta, Edmonton, Alberta, Canada
| | - Fozia Saleem
- Department of Biological Sciences, University of Alberta, Edmonton, Alberta, Canada
| | - Philip Liu
- Department of Biological Sciences, University of Alberta, Edmonton, Alberta, Canada
| | - Zerihun T. Dame
- Department of Biological Sciences, University of Alberta, Edmonton, Alberta, Canada
| | - Jenna Poelzer
- Department of Biological Sciences, University of Alberta, Edmonton, Alberta, Canada
| | - Jessica Huynh
- Department of Biological Sciences, University of Alberta, Edmonton, Alberta, Canada
| | - Faizath S. Yallou
- Department of Biological Sciences, University of Alberta, Edmonton, Alberta, Canada
| | - Nick Psychogios
- Cardiovascular Research Center, Massachusetts General Hospital, Harvard Medical School, Boston, Massachusetts, United States of America
| | - Edison Dong
- Department of Biological Sciences, University of Alberta, Edmonton, Alberta, Canada
| | - David S. Wishart
- Department of Biological Sciences, University of Alberta, Edmonton, Alberta, Canada
- Department of Computing Sciences, University of Alberta, Edmonton, Alberta, Canada
- National Institute for Nanotechnology, Edmonton, Alberta, Canada
|
50
|
Lovell PV, Carleton JB, Mello CV. Genomics analysis of potassium channel genes in songbirds reveals molecular specializations of brain circuits for the maintenance and production of learned vocalizations. BMC Genomics 2013; 14:470. [PMID: 23845108 PMCID: PMC3711925 DOI: 10.1186/1471-2164-14-470] [Citation(s) in RCA: 29] [Impact Index Per Article: 2.4] [Received: 12/10/2012] [Accepted: 06/19/2013] [Indexed: 02/08/2023]
Abstract
Background A fundamental question in molecular neurobiology is how genes that determine basic neuronal properties shape the functional organization of brain circuits underlying complex learned behaviors. Given the growing availability of complete vertebrate genomes, comparative genomics represents a promising approach to address this question. Here we used genomics and molecular approaches to study how ion channel genes influence the properties of the brain circuitry that regulates birdsong, a learned vocal behavior with important similarities to human speech acquisition. We focused on potassium (K-)Channels, which are major determinants of neuronal cell excitability. Starting with the human gene set of K-Channels, we used cross-species mRNA/protein alignments, and syntenic analysis to define the full complement of orthologs, paralogs, allelic variants, as well as novel loci not previously predicted in the genome of zebra finch (Taeniopygia guttata). We also compared protein coding domains in chicken and zebra finch orthologs to identify genes under positive selective pressure, and those that contained lineage-specific insertions/deletions in functional domains. Finally, we conducted comprehensive in situ hybridizations to determine the extent of brain expression, and identify K-Channel gene enrichments in nuclei of the avian song system. Results We identified 107 K-Channel finch genes, including 6 novel genes common to non-mammalian vertebrate lineages. Twenty human genes are absent in songbirds, birds, or sauropsids, or unique to mammals, suggesting K-Channel properties may be lineage-specific. We also identified specific family members with insertions/deletions and/or high dN/dS ratios compared to chicken, a non-vocal learner. In situ hybridization revealed that while most K-Channel genes are broadly expressed in the brain, a subset is selectively expressed in song nuclei, representing molecular specializations of the vocal circuitry. 
Conclusions Together, these findings shed new light on genes that may regulate biophysical and excitable properties of the song circuitry, identify potential targets for the manipulation of the song system, and reveal genomic specializations that may relate to the emergence of vocal learning and associated brain areas in birds.
|