1
|
Callahan TJ, Tripodi IJ, Stefanski AL, Cappelletti L, Taneja SB, Wyrwa JM, Casiraghi E, Matentzoglu NA, Reese J, Silverstein JC, Hoyt CT, Boyce RD, Malec SA, Unni DR, Joachimiak MP, Robinson PN, Mungall CJ, Cavalleri E, Fontana T, Valentini G, Mesiti M, Gillenwater LA, Santangelo B, Vasilevsky NA, Hoehndorf R, Bennett TD, Ryan PB, Hripcsak G, Kahn MG, Bada M, Baumgartner WA, Hunter LE. An open source knowledge graph ecosystem for the life sciences. Sci Data 2024; 11:363. [PMID: 38605048 PMCID: PMC11009265 DOI: 10.1038/s41597-024-03171-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/26/2023] [Accepted: 03/21/2024] [Indexed: 04/13/2024] Open
Abstract
Translational research requires data at multiple scales of biological organization. Advancements in sequencing and multi-omics technologies have increased the availability of these data, but researchers face significant integration challenges. Knowledge graphs (KGs) are used to model complex phenomena, and methods exist to construct them automatically. However, tackling complex biomedical integration problems requires flexibility in the way knowledge is modeled. Moreover, existing KG construction methods provide robust tooling at the cost of fixed or limited choices among knowledge representation models. PheKnowLator (Phenotype Knowledge Translator) is a semantic ecosystem for automating the FAIR (Findable, Accessible, Interoperable, and Reusable) construction of ontologically grounded KGs with fully customizable knowledge representation. The ecosystem includes KG construction resources (e.g., data preparation APIs), analysis tools (e.g., SPARQL endpoint resources and abstraction algorithms), and benchmarks (e.g., prebuilt KGs). We evaluated the ecosystem by systematically comparing it to existing open-source KG construction methods and by analyzing its computational performance when used to construct 12 different large-scale KGs. With flexible knowledge representation, PheKnowLator enables fully customizable KGs without compromising performance or usability.
Affiliation(s)
- Tiffany J Callahan
- Computational Bioscience Program, University of Colorado Anschutz Medical Campus, Aurora, CO, 80045, USA.
- Department of Biomedical Informatics, Columbia University Irving Medical Center, New York, NY, 10032, USA.
| | - Ignacio J Tripodi
- Computer Science Department, Interdisciplinary Quantitative Biology, University of Colorado Boulder, Boulder, CO, 80301, USA
| | - Adrianne L Stefanski
- Computational Bioscience Program, University of Colorado Anschutz Medical Campus, Aurora, CO, 80045, USA
| | - Luca Cappelletti
- AnacletoLab, Dipartimento di Informatica, Università degli Studi di Milano, Via Celoria 18, 20133, Milan, Italy
| | - Sanya B Taneja
- Intelligent Systems Program, University of Pittsburgh, Pittsburgh, PA, 15260, USA
| | - Jordan M Wyrwa
- Department of Physical Medicine and Rehabilitation, School of Medicine, University of Colorado Anschutz Medical Campus, Aurora, CO, 80045, USA
| | - Elena Casiraghi
- AnacletoLab, Dipartimento di Informatica, Università degli Studi di Milano, Via Celoria 18, 20133, Milan, Italy
- Division of Environmental Genomics and Systems Biology, Lawrence Berkeley National Laboratory, Berkeley, CA, 94720, USA
| | | | - Justin Reese
- Division of Environmental Genomics and Systems Biology, Lawrence Berkeley National Laboratory, Berkeley, CA, 94720, USA
| | - Jonathan C Silverstein
- Department of Biomedical Informatics, University of Pittsburgh School of Medicine, Pittsburgh, PA, 15206, USA
| | - Charles Tapley Hoyt
- Laboratory of Systems Pharmacology, Harvard Medical School, Boston, MA, 02115, USA
| | - Richard D Boyce
- Department of Biomedical Informatics, University of Pittsburgh School of Medicine, Pittsburgh, PA, 15206, USA
| | - Scott A Malec
- Division of Translational Informatics, University of New Mexico School of Medicine, Albuquerque, NM, 87131, USA
| | - Deepak R Unni
- SIB Swiss Institute of Bioinformatics, Basel, Switzerland
| | - Marcin P Joachimiak
- Division of Environmental Genomics and Systems Biology, Lawrence Berkeley National Laboratory, Berkeley, CA, 94720, USA
| | - Peter N Robinson
- Berlin Institute of Health at Charité-Universitätsmedizin, 10117, Berlin, Germany
| | - Christopher J Mungall
- Division of Environmental Genomics and Systems Biology, Lawrence Berkeley National Laboratory, Berkeley, CA, 94720, USA
| | - Emanuele Cavalleri
- AnacletoLab, Dipartimento di Informatica, Università degli Studi di Milano, Via Celoria 18, 20133, Milan, Italy
| | - Tommaso Fontana
- AnacletoLab, Dipartimento di Informatica, Università degli Studi di Milano, Via Celoria 18, 20133, Milan, Italy
| | - Giorgio Valentini
- AnacletoLab, Dipartimento di Informatica, Università degli Studi di Milano, Via Celoria 18, 20133, Milan, Italy
- ELLIS, European Laboratory for Learning and Intelligent Systems, Milan Unit, Italy
| | - Marco Mesiti
- AnacletoLab, Dipartimento di Informatica, Università degli Studi di Milano, Via Celoria 18, 20133, Milan, Italy
| | - Lucas A Gillenwater
- Computational Bioscience Program, University of Colorado Anschutz Medical Campus, Aurora, CO, 80045, USA
- Department of Biomedical Informatics, University of Colorado School of Medicine, Aurora, CO, 80045, USA
| | - Brook Santangelo
- Computational Bioscience Program, University of Colorado Anschutz Medical Campus, Aurora, CO, 80045, USA
- Department of Biomedical Informatics, University of Colorado School of Medicine, Aurora, CO, 80045, USA
| | - Nicole A Vasilevsky
- Data Collaboration Center, Critical Path Institute, 1840 E River Rd. Suite 100, Tucson, AZ, 85718, USA
| | - Robert Hoehndorf
- Computer, Electrical and Mathematical Sciences & Engineering Division, Computational Bioscience Research Center, King Abdullah University of Science and Technology, Thuwal, 23955-6900, Kingdom of Saudi Arabia
| | - Tellen D Bennett
- Department of Biomedical Informatics, University of Colorado School of Medicine, Aurora, CO, 80045, USA
- Department of Pediatrics, University of Colorado School of Medicine, Aurora, CO, 80045, USA
| | - Patrick B Ryan
- Janssen Research and Development, Raritan, NJ, 08869, USA
| | - George Hripcsak
- Department of Biomedical Informatics, Columbia University Irving Medical Center, New York, NY, 10032, USA
| | - Michael G Kahn
- Department of Biomedical Informatics, University of Colorado School of Medicine, Aurora, CO, 80045, USA
| | - Michael Bada
- Division of General Internal Medicine, University of Colorado School of Medicine, Aurora, CO, 80045, USA
| | - William A Baumgartner
- Division of General Internal Medicine, University of Colorado School of Medicine, Aurora, CO, 80045, USA.
| | - Lawrence E Hunter
- Computational Bioscience Program, University of Colorado Anschutz Medical Campus, Aurora, CO, 80045, USA.
- Department of Biomedical Informatics, University of Colorado School of Medicine, Aurora, CO, 80045, USA.
|
2
|
Santangelo BE, Apgar M, Colorado ASB, Martin CG, Sterrett J, Wall E, Joachimiak MP, Hunter LE, Lozupone CA. Integrating biological knowledge for mechanistic inference in the host-associated microbiome. Front Microbiol 2024; 15:1351678. [PMID: 38638909 PMCID: PMC11024261 DOI: 10.3389/fmicb.2024.1351678] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/06/2023] [Accepted: 02/26/2024] [Indexed: 04/20/2024] Open
Abstract
Advances in high-throughput technologies have enhanced our ability to describe microbial communities as they relate to human health and disease. Alongside the growth in sequencing data has come an influx of resources that synthesize knowledge surrounding microbial traits, functions, and metabolic potential with knowledge of how they may impact host pathways to influence disease phenotypes. These knowledge bases can enable the development of mechanistic explanations that may underlie correlations detected between microbial communities and disease. In this review, we survey existing resources and methodologies for the computational integration of broad classes of microbial and host knowledge. We evaluate these knowledge bases in their access methods, content, and source characteristics. We discuss challenges of the creation and utilization of knowledge bases including inconsistency of nomenclature assignment of taxa and metabolites across sources, whether the biological entities represented are rooted in ontologies or taxonomies, and how the structure and accessibility limit the diversity of applications and user types. We make this information available in a code and data repository at: https://github.com/lozuponelab/knowledge-source-mappings. Addressing these challenges will allow for the development of more effective tools for drawing from abundant knowledge to find new insights into microbial mechanisms in disease by fostering a systematic and unbiased exploration of existing information.
Affiliation(s)
- Brook E. Santangelo
- Department of Biomedical Informatics, University of Colorado School of Medicine, Aurora, CO, United States
| | - Madison Apgar
- Department of Biomedical Informatics, University of Colorado School of Medicine, Aurora, CO, United States
| | | | - Casey G. Martin
- Department of Biomedical Informatics, University of Colorado School of Medicine, Aurora, CO, United States
| | - John Sterrett
- Department of Integrative Physiology, University of Colorado, Boulder, CO, United States
| | - Elena Wall
- Department of Biomedical Informatics, University of Colorado School of Medicine, Aurora, CO, United States
| | - Marcin P. Joachimiak
- Lawrence Berkeley National Laboratory, Environmental Genomics and Systems Biology Division, Biosystems Data Science Department, Berkeley, CA, United States
| | - Lawrence E. Hunter
- Department of Biomedical Informatics, University of Colorado School of Medicine, Aurora, CO, United States
| | - Catherine A. Lozupone
- Department of Biomedical Informatics, University of Colorado School of Medicine, Aurora, CO, United States
|
3
|
Kilicoglu H, Ensan F, McInnes B, Wang LL. Semantics-enabled biomedical literature analytics. J Biomed Inform 2024; 150:104588. [PMID: 38244957 DOI: 10.1016/j.jbi.2024.104588] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/08/2024] [Accepted: 01/10/2024] [Indexed: 01/22/2024]
Affiliation(s)
- Halil Kilicoglu
- School of Information Sciences, University of Illinois Urbana Champaign, Champaign, IL, USA.
| | - Faezeh Ensan
- Department of Electrical, Computer, and Biomedical Engineering, Toronto Metropolitan University, Toronto, ON, Canada.
| | - Bridget McInnes
- Department of Computer Science, Virginia Commonwealth University, Richmond, VA, USA.
| | - Lucy Lu Wang
- Information School, University of Washington, Seattle, WA, USA.
|
4
|
Otero-Carrasco B, Ugarte Carro E, Prieto-Santamaría L, Diaz Uzquiano M, Caraça-Valente Hernández JP, Rodríguez-González A. Identifying patterns to uncover the importance of biological pathways on known drug repurposing scenarios. BMC Genomics 2024; 25:43. [PMID: 38191292 PMCID: PMC10775474 DOI: 10.1186/s12864-023-09913-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/19/2023] [Accepted: 12/15/2023] [Indexed: 01/10/2024] Open
Abstract
BACKGROUND Drug repurposing plays a significant role in providing effective treatments for certain diseases faster and more cost-effectively. Successful repurposing cases are mostly supported by a classical paradigm that stems from de novo drug development. This paradigm is based on the "one-drug-one-target-one-disease" idea. It consists of designing drugs specifically for a single disease and its drug's gene target. In this article, we investigated the use of biological pathways as potential elements to achieve effective drug repurposing. METHODS Considering a total of 4214 successful cases of drug repurposing, we identified cases in which biological pathways serve as the underlying basis for successful repurposing, referred to as DREBIOP. Once the repurposing cases based on pathways were identified, we studied their inherent patterns by considering the different biological elements associated with this dataset, as well as the pathways involved in these cases. Furthermore, we obtained gene-disease association values to demonstrate the diminished significance of the drug's gene target in these repurposing cases. To achieve this, we compared the values obtained for the DREBIOP set with the overall association values found in DISNET, as well as with the drug's target gene (DREGE) based repurposing cases using the Mann-Whitney U Test. RESULTS A collection of drug repurposing cases, known as DREBIOP, was identified as a result. DREBIOP cases exhibit distinct characteristics compared with DREGE cases. Notably, DREBIOP cases are associated with a higher number of biological pathways, with Vitamin D Metabolism and ACE inhibitors being the most prominent pathways. Additionally, it was observed that the association values of GDAs in DREBIOP cases were significantly lower than those in DREGE cases (p-value < 0.05). CONCLUSIONS Biological pathways assume a pivotal role in drug repurposing cases. This investigation successfully revealed patterns that distinguish drug repurposing instances associated with biological pathways. These identified patterns can be applied to any known repurposing case, enabling the detection of pathway-based repurposing scenarios or the classical paradigm.
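The Mann-Whitney U comparison named in the abstract above can be illustrated with a minimal sketch. The scores are made-up placeholders rather than DISNET values, and the function name `mann_whitney_u` is hypothetical. It computes only the U statistics by direct pair counting (ties count one half) and does not compute a p-value.

```python
def mann_whitney_u(sample_a, sample_b):
    """Return (U_a, U_b), where U_a counts pairs in which a value
    from sample_a exceeds a value from sample_b (ties count 0.5)."""
    u_a = 0.0
    for a in sample_a:
        for b in sample_b:
            if a > b:
                u_a += 1.0
            elif a == b:
                u_a += 0.5
    # The two statistics always sum to the number of pairs.
    u_b = len(sample_a) * len(sample_b) - u_a
    return u_a, u_b


# Hypothetical gene-disease association scores, for illustration only.
pathway_based = [0.10, 0.15, 0.20, 0.30]
target_based = [0.25, 0.40, 0.55, 0.60]

u_pathway, u_target = mann_whitney_u(pathway_based, target_based)
print(u_pathway, u_target)
```

A U value for the first sample that is small relative to the number of pairs indicates that its values tend to be lower than those of the second sample. A statistics library would normally be used to obtain the corresponding p-value.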
Affiliation(s)
- Belén Otero-Carrasco
- Centro de Tecnología Biomédica, Universidad Politécnica de Madrid, Pozuelo de Alarcón, 28223, Spain
- ETS Ingenieros Informáticos, Universidad Politécnica de Madrid, Boadilla del Monte, 28660, Spain
| | - Esther Ugarte Carro
- Centro de Tecnología Biomédica, Universidad Politécnica de Madrid, Pozuelo de Alarcón, 28223, Spain
| | - Lucía Prieto-Santamaría
- Centro de Tecnología Biomédica, Universidad Politécnica de Madrid, Pozuelo de Alarcón, 28223, Spain
- ETS Ingenieros Informáticos, Universidad Politécnica de Madrid, Boadilla del Monte, 28660, Spain
| | - Marina Diaz Uzquiano
- Centro de Tecnología Biomédica, Universidad Politécnica de Madrid, Pozuelo de Alarcón, 28223, Spain
| | | | - Alejandro Rodríguez-González
- Centro de Tecnología Biomédica, Universidad Politécnica de Madrid, Pozuelo de Alarcón, 28223, Spain.
- ETS Ingenieros Informáticos, Universidad Politécnica de Madrid, Boadilla del Monte, 28660, Spain.
|
5
|
Wishart DS, Kruger R, Sivakumaran A, Harford K, Sanford S, Doshi R, Khetarpal N, Fatokun O, Doucet D, Zubkowski A, Jackson H, Sykes G, Ramirez-Gaona M, Marcu A, Li C, Yee K, Garros C, Rayat D, Coleongco J, Nandyala T, Gautam V, Oler E. PathBank 2.0-the pathway database for model organism metabolomics. Nucleic Acids Res 2024; 52:D654-D662. [PMID: 37962386 PMCID: PMC10767802 DOI: 10.1093/nar/gkad1041] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/15/2023] [Revised: 10/19/2023] [Accepted: 10/31/2023] [Indexed: 11/15/2023] Open
Abstract
PathBank (https://pathbank.org) and its predecessor database, the Small Molecule Pathway Database (SMPDB), have been providing comprehensive metabolite pathway information for the metabolomics community since 2010. Over the past 14 years, these pathway databases have grown and evolved significantly to meet the needs of the metabolomics community and respond to continuing changes in computing technology. This year's update, PathBank 2.0, brings a number of important improvements and upgrades that should make the database more useful and more appealing to a larger cross-section of users. In particular, these improvements include: (i) a significant increase in the number of primary or canonical pathways (from 1720 to 6951); (ii) a massive increase in the total number of pathways (from 110 234 to 605 359); (iii) significant improvements to the quality of pathway diagrams and pathway descriptions; (iv) a strong emphasis on drug metabolism and drug mechanism pathways; (v) making most pathway images more slide-compatible and manuscript-compatible; (vi) adding tools to support better pathway filtering and selecting through a more complete pathway taxonomy; (vii) adding pathway analysis tools for visualizing and calculating pathway enrichment. Many other minor improvements and updates to the content, the interface and general performance of the PathBank website have also been made. Overall, we believe these upgrades and updates should greatly enhance PathBank's ease of use and its potential applications for interpreting metabolomics data.
Affiliation(s)
- David S Wishart
- Department of Biological Sciences, University of Alberta, Edmonton, AB T6G 2E9, Canada
- Department of Computing Science, University of Alberta, Edmonton, AB T6G 2E8, Canada
- Department of Laboratory Medicine and Pathology, University of Alberta, Edmonton, AB T6G 2B7, Canada
- Faculty of Pharmacy and Pharmaceutical Sciences, University of Alberta, Edmonton, AB T6G 2H7, Canada
| | - Ray Kruger
- Department of Biological Sciences, University of Alberta, Edmonton, AB T6G 2E9, Canada
| | - Aadhavya Sivakumaran
- Department of Biological Sciences, University of Alberta, Edmonton, AB T6G 2E9, Canada
| | - Karxena Harford
- Department of Biological Sciences, University of Alberta, Edmonton, AB T6G 2E9, Canada
| | - Selena Sanford
- Department of Biological Sciences, University of Alberta, Edmonton, AB T6G 2E9, Canada
| | - Rahil Doshi
- Department of Biological Sciences, University of Alberta, Edmonton, AB T6G 2E9, Canada
| | - Nitya Khetarpal
- Department of Biological Sciences, University of Alberta, Edmonton, AB T6G 2E9, Canada
| | - Omolola Fatokun
- Department of Biological Sciences, University of Alberta, Edmonton, AB T6G 2E9, Canada
| | - Daphnee Doucet
- Department of Biological Sciences, University of Alberta, Edmonton, AB T6G 2E9, Canada
| | - Ashley Zubkowski
- Department of Biological Sciences, University of Alberta, Edmonton, AB T6G 2E9, Canada
| | - Hayley Jackson
- Department of Biological Sciences, University of Alberta, Edmonton, AB T6G 2E9, Canada
| | - Gina Sykes
- Department of Biological Sciences, University of Alberta, Edmonton, AB T6G 2E9, Canada
| | - Miguel Ramirez-Gaona
- Department of Plant Breeding, Wageningen University and Research, 6708 PB Wageningen, Gelderland, Netherlands
| | - Ana Marcu
- Department of Biological Sciences, University of Alberta, Edmonton, AB T6G 2E9, Canada
| | - Carin Li
- Department of Biological Sciences, University of Alberta, Edmonton, AB T6G 2E9, Canada
| | - Kristen Yee
- Department of Biological Sciences, University of Alberta, Edmonton, AB T6G 2E9, Canada
| | - Christiana Garros
- Department of Biological Sciences, University of Alberta, Edmonton, AB T6G 2E9, Canada
| | - Dorsa Yahya Rayat
- Department of Biological Sciences, University of Alberta, Edmonton, AB T6G 2E9, Canada
| | - Jeanne Coleongco
- Department of Biological Sciences, University of Alberta, Edmonton, AB T6G 2E9, Canada
| | - Tharuni Nandyala
- Department of Biological Sciences, University of Alberta, Edmonton, AB T6G 2E9, Canada
| | - Vasuk Gautam
- Department of Biological Sciences, University of Alberta, Edmonton, AB T6G 2E9, Canada
| | - Eponine Oler
- Department of Biological Sciences, University of Alberta, Edmonton, AB T6G 2E9, Canada
|
6
|
Pastrello C, Kotlyar M, Abovsky M, Lu R, Jurisica I. PathDIP 5: improving coverage and making enrichment analysis more biologically meaningful. Nucleic Acids Res 2024; 52:D663-D671. [PMID: 37994706 PMCID: PMC10767947 DOI: 10.1093/nar/gkad1027] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/14/2023] [Revised: 10/16/2023] [Accepted: 10/20/2023] [Indexed: 11/24/2023] Open
Abstract
Pathway Data Integration Portal (PathDIP) is an integrated pathway database that was developed to increase functional gene annotation coverage and reduce bias in pathway enrichment analysis. PathDIP 5 provides multiple improvements to enable more interpretable analysis: users can perform enrichment analysis using all sources, separate sources or by combining specific pathway subsets; they can select the types of sources to use or the types of pathways for the analysis, reducing the number of resulting generic pathways or pathways not related to users' research question; users can use API. All pathways have been mapped to seven representative types. The results of pathway enrichment can be summarized through knowledge-based pathway consolidation. All curated pathways were mapped to 53 pathway ontology-based categories. In addition to genes, pathDIP 5 now includes metabolites. We updated existing databases, included two new sources, PathBank and MetabolicAtlas, and removed outdated databases. We enable users to analyse their results using Drugst.One, where a drug-gene network is created using only the user's genes in a specific pathway. Interpreting the results of any analysis is now improved by multiple charts on all the results pages. PathDIP 5 is freely available at https://ophid.utoronto.ca/pathDIP.
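Pathway enrichment analysis of the kind described above is commonly framed as an over-representation test. The sketch below uses a one-sided hypergeometric test, which is an assumption made here for illustration: the abstract does not state which statistic pathDIP applies. The function name and all gene counts are hypothetical.

```python
from math import comb


def enrichment_p_value(universe_size, pathway_size, query_size, overlap):
    """One-sided hypergeometric p-value: the probability of seeing at
    least `overlap` pathway genes when drawing `query_size` genes at
    random from a universe containing `pathway_size` pathway genes."""
    upper = min(pathway_size, query_size)
    total = comb(universe_size, query_size)
    tail = sum(
        comb(pathway_size, k) * comb(universe_size - pathway_size, query_size - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


# Hypothetical numbers: 10 annotated genes, 4 of them in the pathway,
# a 3-gene query list, 2 of which fall in the pathway.
p = enrichment_p_value(universe_size=10, pathway_size=4, query_size=3, overlap=2)
print(round(p, 4))
```

In practice the p-values from many pathways would also be corrected for multiple testing, which this sketch omits.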
Affiliation(s)
- Chiara Pastrello
- Osteoarthritis Research Program, Division of Orthopedic Surgery, Schroeder Arthritis Institute, Toronto, Ontario M5T 0S8, Canada
- Data Science Discovery Centre for Chronic Diseases, Krembil Research Institute, Krembil Discovery Tower, Toronto, ON M5T 0S8, Canada
| | - Max Kotlyar
- Osteoarthritis Research Program, Division of Orthopedic Surgery, Schroeder Arthritis Institute, Toronto, Ontario M5T 0S8, Canada
- Data Science Discovery Centre for Chronic Diseases, Krembil Research Institute, Krembil Discovery Tower, Toronto, ON M5T 0S8, Canada
| | - Mark Abovsky
- Osteoarthritis Research Program, Division of Orthopedic Surgery, Schroeder Arthritis Institute, Toronto, Ontario M5T 0S8, Canada
- Data Science Discovery Centre for Chronic Diseases, Krembil Research Institute, Krembil Discovery Tower, Toronto, ON M5T 0S8, Canada
| | - Richard Lu
- Osteoarthritis Research Program, Division of Orthopedic Surgery, Schroeder Arthritis Institute, Toronto, Ontario M5T 0S8, Canada
- Data Science Discovery Centre for Chronic Diseases, Krembil Research Institute, Krembil Discovery Tower, Toronto, ON M5T 0S8, Canada
| | - Igor Jurisica
- Osteoarthritis Research Program, Division of Orthopedic Surgery, Schroeder Arthritis Institute, Toronto, Ontario M5T 0S8, Canada
- Data Science Discovery Centre for Chronic Diseases, Krembil Research Institute, Krembil Discovery Tower, Toronto, ON M5T 0S8, Canada
- Departments of Medical Biophysics and Computer Science, and Faculty of Dentistry, University of Toronto, Toronto, ON M5G 1L7, Canada
- Institute of Neuroimmunology, Slovak Academy of Sciences, Bratislava, Slovakia
|
7
|
Kaldunski ML, Smith JR, Brodie KC, De Pons JL, Demos WM, Gibson AC, Hayman GT, Lamers L, Laulederkind SJF, Thorat K, Thota J, Tutaj MA, Tutaj M, Vedi M, Wang SJ, Zacher S, Dwinell MR, Kwitek AE. Rare disease research resources at the Rat Genome Database. Genetics 2023; 224:iyad078. [PMID: 37119810 PMCID: PMC10411567 DOI: 10.1093/genetics/iyad078] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/13/2023] [Revised: 04/05/2023] [Accepted: 04/19/2023] [Indexed: 05/01/2023] Open
Abstract
Rare diseases individually affect relatively few people, but as a group they impact considerable numbers of people. The Rat Genome Database (https://rgd.mcw.edu) is a knowledgebase that offers resources for rare disease research. This includes disease definitions, genes, quantitative trait loci (QTLs), genetic variants, annotations to published literature, links to external resources, and more. One important resource is identifying relevant cell lines and rat strains that serve as models for disease research. Diseases, genes, and strains have report pages with consolidated data, and links to analysis tools. Utilizing these globally accessible resources for rare disease research, potentiating discovery of mechanisms and new treatments, can point researchers toward solutions to alleviate the suffering of those afflicted with these diseases.
Affiliation(s)
- Mary L Kaldunski
- The Rat Genome Database, Department of Physiology, Medical College of Wisconsin, Milwaukee, WI 53226, USA
| | - Jennifer R Smith
- The Rat Genome Database, Department of Physiology, Medical College of Wisconsin, Milwaukee, WI 53226, USA
| | - Kent C Brodie
- Clinical and Translational Science Institute, Medical College of Wisconsin, Milwaukee, WI 53226, USA
| | - Jeffrey L De Pons
- The Rat Genome Database, Department of Physiology, Medical College of Wisconsin, Milwaukee, WI 53226, USA
| | - Wendy M Demos
- The Rat Genome Database, Department of Physiology, Medical College of Wisconsin, Milwaukee, WI 53226, USA
| | - Adam C Gibson
- The Rat Genome Database, Department of Physiology, Medical College of Wisconsin, Milwaukee, WI 53226, USA
| | - G Thomas Hayman
- The Rat Genome Database, Department of Physiology, Medical College of Wisconsin, Milwaukee, WI 53226, USA
| | - Logan Lamers
- The Rat Genome Database, Department of Physiology, Medical College of Wisconsin, Milwaukee, WI 53226, USA
| | - Stanley J F Laulederkind
- The Rat Genome Database, Department of Physiology, Medical College of Wisconsin, Milwaukee, WI 53226, USA
| | - Ketaki Thorat
- The Rat Genome Database, Department of Physiology, Medical College of Wisconsin, Milwaukee, WI 53226, USA
| | - Jyothi Thota
- The Rat Genome Database, Department of Physiology, Medical College of Wisconsin, Milwaukee, WI 53226, USA
| | - Marek A Tutaj
- The Rat Genome Database, Department of Physiology, Medical College of Wisconsin, Milwaukee, WI 53226, USA
| | - Monika Tutaj
- The Rat Genome Database, Department of Physiology, Medical College of Wisconsin, Milwaukee, WI 53226, USA
| | - Mahima Vedi
- The Rat Genome Database, Department of Physiology, Medical College of Wisconsin, Milwaukee, WI 53226, USA
| | - Shur-Jen Wang
- The Rat Genome Database, Department of Physiology, Medical College of Wisconsin, Milwaukee, WI 53226, USA
| | - Stacy Zacher
- Finance and Administration, Medical College of Wisconsin, Milwaukee, WI 53226, USA
| | - Melinda R Dwinell
- The Rat Genome Database, Department of Physiology, Medical College of Wisconsin, Milwaukee, WI 53226, USA
| | - Anne E Kwitek
- The Rat Genome Database, Department of Physiology, Medical College of Wisconsin, Milwaukee, WI 53226, USA
- Joint Department of Biomedical Engineering, Marquette University & Medical College of Wisconsin, Milwaukee, WI 53226, USA
|
8
|
Laulederkind SJF, Hayman GT, Wang SJ, Kaldunski ML, Vedi M, Demos WM, Tutaj M, Smith JR, Lamers L, Gibson AC, Thorat K, Thota J, Tutaj MA, De Pons JL, Dwinell MR, Kwitek AE. The Rat Genome Database: Genetic, Genomic, and Phenotypic Data Across Multiple Species. Curr Protoc 2023; 3:e804. [PMID: 37347557 PMCID: PMC10335880 DOI: 10.1002/cpz1.804] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/24/2023]
Abstract
The laboratory rat, Rattus norvegicus, is an important model of human health and disease, and experimental findings in the rat have relevance to human physiology and disease. The Rat Genome Database (RGD, https://rgd.mcw.edu) is a model organism database that provides access to a wide variety of curated rat data including disease associations, phenotypes, pathways, molecular functions, biological processes, cellular components, and chemical interactions for genes, quantitative trait loci, and strains. We present an overview of the database followed by specific examples that can be used to gain experience in employing RGD to explore the wealth of functional data available for the rat and other species. © 2023 Wiley Periodicals LLC. Basic Protocol 1: Navigating the Rat Genome Database (RGD) home page Basic Protocol 2: Using the RGD search functions Basic Protocol 3: Searching for quantitative trait loci Basic Protocol 4: Using the RGD genome browser (JBrowse) to find phenotypic annotations Basic Protocol 5: Using OntoMate to find gene-disease data Basic Protocol 6: Using MOET to find gene-ontology enrichment Basic Protocol 7: Using OLGA to generate gene lists for analysis Basic Protocol 8: Using the GA tool to analyze ontology annotations for genes Basic Protocol 9: Using the RGD InterViewer tool to find protein interaction data Basic Protocol 10: Using the RGD Variant Visualizer tool to find genetic variant data Basic Protocol 11: Using the RGD Disease Portals to find disease, phenotype, and other information Basic Protocol 12: Using the RGD Phenotypes & Models Portal to find qualitative and quantitative phenotype data and other rat strain-related information Basic Protocol 13: Using the RGD Pathway Portal to find disease and phenotype data via molecular pathways.
Affiliation(s)
| | - G. Thomas Hayman
- Department of Physiology, Medical College of Wisconsin, Milwaukee, Wisconsin
| | - Shur-Jen Wang
- Department of Physiology, Medical College of Wisconsin, Milwaukee, Wisconsin
| | - Mary L. Kaldunski
- Department of Physiology, Medical College of Wisconsin, Milwaukee, Wisconsin
| | - Mahima Vedi
- Department of Physiology, Medical College of Wisconsin, Milwaukee, Wisconsin
| | - Wendy M. Demos
- Department of Physiology, Medical College of Wisconsin, Milwaukee, Wisconsin
| | - Monika Tutaj
- Department of Physiology, Medical College of Wisconsin, Milwaukee, Wisconsin
| | - Jennifer R. Smith
- Department of Physiology, Medical College of Wisconsin, Milwaukee, Wisconsin
| | - Logan Lamers
- Department of Physiology, Medical College of Wisconsin, Milwaukee, Wisconsin
| | - Adam C. Gibson
- Department of Physiology, Medical College of Wisconsin, Milwaukee, Wisconsin
| | - Ketaki Thorat
- Department of Physiology, Medical College of Wisconsin, Milwaukee, Wisconsin
| | - Jyothi Thota
- Department of Physiology, Medical College of Wisconsin, Milwaukee, Wisconsin
| | - Marek A. Tutaj
- Department of Physiology, Medical College of Wisconsin, Milwaukee, Wisconsin
| | - Jeffrey L. De Pons
- Department of Physiology, Medical College of Wisconsin, Milwaukee, Wisconsin
| | - Melinda R. Dwinell
- Department of Physiology, Medical College of Wisconsin, Milwaukee, Wisconsin
| | - Anne E. Kwitek
- Department of Physiology, Medical College of Wisconsin, Milwaukee, Wisconsin
|
9
|
Manicka S, Johnson K, Levin M, Murrugarra D. The nonlinearity of regulation in biological networks. NPJ Syst Biol Appl 2023; 9:10. [PMID: 37015937 PMCID: PMC10073134 DOI: 10.1038/s41540-023-00273-w] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Received: 07/06/2022] [Accepted: 03/23/2023] [Indexed: 04/06/2023] Open
Abstract
The extent to which the components of a biological system are (non)linearly regulated determines how amenable they are to therapy and control. To better understand this property, termed "regulatory nonlinearity", we analyzed a suite of 137 published Boolean network models, containing a variety of complex nonlinear regulatory interactions, using a probabilistic generalization of Boolean logic that George Boole himself had proposed. Leveraging the continuous nature of this formulation, we used Taylor decomposition to approximate the models with various levels of regulatory nonlinearity. A comparison of the resulting series of approximations of the biological models with appropriate random ensembles revealed that biological regulation tends to be less nonlinear than expected, meaning that higher-order interactions among the regulatory inputs tend to be less pronounced. A further categorical analysis of the biological models revealed that the regulatory nonlinearity of cancer and disease networks could not only be sometimes higher than expected but also be relatively more variable. We show that this variation is caused by differences in the apportioning of information among the various orders of regulatory nonlinearity. Our results suggest that there may have been a weak but discernible selection pressure for biological systems to evolve linear regulation on average, but for certain systems such as cancer, on the other hand, to simultaneously evolve more nonlinear rules.
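A minimal sketch may help make the idea of a probabilistic generalization of Boolean logic and its Taylor approximation concrete. It assumes independent inputs, so that AND becomes the product of the input probabilities, and it expands around 0.5 for every input. Both the expansion point and the function names are choices made here for illustration, not details taken from the paper.

```python
from itertools import product


def probabilistic_extension(truth_table, probs):
    """Probability that the Boolean function outputs 1 when input i is
    independently 1 with probability probs[i]."""
    total = 0.0
    for bits in product((0, 1), repeat=len(probs)):
        weight = 1.0
        for bit, p in zip(bits, probs):
            weight *= p if bit else (1.0 - p)
        total += truth_table[bits] * weight
    return total


def linear_approximation(truth_table, probs, center=0.5):
    """First-order Taylor approximation of the extension around `center`."""
    n = len(probs)
    base = [center] * n
    value = probabilistic_extension(truth_table, base)
    for i in range(n):
        # The extension is linear in each input separately, so the partial
        # derivative equals the difference between input i set to 1 and 0.
        high = base[:i] + [1.0] + base[i + 1:]
        low = base[:i] + [0.0] + base[i + 1:]
        slope = (probabilistic_extension(truth_table, high)
                 - probabilistic_extension(truth_table, low))
        value += slope * (probs[i] - center)
    return value


AND = {(0, 0): 0, (0, 1): 0, (1, 0): 0, (1, 1): 1}
XOR = {(0, 0): 0, (0, 1): 1, (1, 0): 1, (1, 1): 0}

for name, table in (("AND", AND), ("XOR", XOR)):
    exact = probabilistic_extension(table, [1.0, 1.0])
    approx = linear_approximation(table, [1.0, 1.0])
    print(name, exact, approx)
```

Under these assumptions the linear term recovers part of AND but none of XOR, whose first-order slopes vanish at the expansion point. The gap between the exact value and the truncated approximation is one simple way to see how much of a rule lives in higher-order interactions.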
Affiliation(s)
- Santosh Manicka
- Department of Biology, Tufts University, Medford, MA, 02155, USA
- Kathleen Johnson
- Department of Mathematics, University of Kentucky, Lexington, KY, 40506, USA
- Michael Levin
- Department of Biology, Tufts University, Medford, MA, 02155, USA
- David Murrugarra
- Department of Mathematics, University of Kentucky, Lexington, KY, 40506, USA
10
Taneja SB, Callahan TJ, Paine MF, Kane-Gill SL, Kilicoglu H, Joachimiak MP, Boyce RD. Developing a Knowledge Graph for Pharmacokinetic Natural Product-Drug Interactions. J Biomed Inform 2023; 140:104341. [PMID: 36933632 PMCID: PMC10150409 DOI: 10.1016/j.jbi.2023.104341] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 09/24/2022] [Revised: 01/09/2023] [Accepted: 03/13/2023] [Indexed: 03/17/2023]
Abstract
BACKGROUND Pharmacokinetic natural product-drug interactions (NPDIs) occur when botanical or other natural products are co-consumed with pharmaceutical drugs. With the growing use of natural products, the risk for potential NPDIs and consequent adverse events has increased. Understanding mechanisms of NPDIs is key to preventing or minimizing adverse events. Although biomedical knowledge graphs (KGs) have been widely used for drug-drug interaction applications, computational investigation of NPDIs is novel. We constructed NP-KG as a first step toward computational discovery of plausible mechanistic explanations for pharmacokinetic NPDIs that can be used to guide scientific research. METHODS We developed a large-scale, heterogeneous KG with biomedical ontologies, linked data, and full texts of the scientific literature. To construct the KG, biomedical ontologies and drug databases were integrated with the Phenotype Knowledge Translator framework. The semantic relation extraction systems, SemRep and Integrated Network and Dynamic Reasoning Assembler, were used to extract semantic predications (subject-relation-object triples) from full texts of the scientific literature related to the exemplar natural products green tea and kratom. A literature-based graph constructed from the predications was integrated into the ontology-grounded KG to create NP-KG. NP-KG was evaluated with case studies of pharmacokinetic green tea- and kratom-drug interactions through KG path searches and meta-path discovery to determine congruent and contradictory information in NP-KG compared to ground truth data. We also conducted an error analysis to identify knowledge gaps and incorrect predications in the KG. RESULTS The fully integrated NP-KG consisted of 745,512 nodes and 7,249,576 edges. 
Evaluation of NP-KG resulted in congruent (38.98% for green tea, 50% for kratom), contradictory (15.25% for green tea, 21.43% for kratom), and both congruent and contradictory (15.25% for green tea, 21.43% for kratom) information compared to ground truth data. Potential pharmacokinetic mechanisms for several purported NPDIs, including the green tea-raloxifene, green tea-nadolol, kratom-midazolam, kratom-quetiapine, and kratom-venlafaxine interactions were congruent with the published literature. CONCLUSION NP-KG is the first KG to integrate biomedical ontologies with full texts of the scientific literature focused on natural products. We demonstrate the application of NP-KG to identify known pharmacokinetic interactions between natural products and pharmaceutical drugs mediated by drug metabolizing enzymes and transporters. Future work will incorporate context, contradiction analysis, and embedding-based methods to enrich NP-KG. NP-KG is publicly available at https://doi.org/10.5281/zenodo.6814507. The code for relation extraction, KG construction, and hypothesis generation is available at https://github.com/sanyabt/np-kg.
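A path search over subject-relation-object triples, of the kind the abstract describes, can be sketched as a breadth-first search. This is a minimal illustration with placeholder node and relation names (`natural_product_A`, `enzyme_X`, `drug_B`, and the function `find_path` are all hypothetical); it is not the NP-KG code, which is available at the repository cited above.

```python
from collections import deque


def find_path(triples, start, goal):
    """Return the shortest directed chain of triples from start to goal, or None."""
    graph = {}
    for subj, rel, obj in triples:
        graph.setdefault(subj, []).append((rel, obj))
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        node, path = queue.popleft()
        if node == goal:
            return path
        for rel, obj in graph.get(node, []):
            if obj not in seen:
                seen.add(obj)
                queue.append((obj, path + [(node, rel, obj)]))
    return None


# Placeholder triples for illustration only; they are not taken from NP-KG.
TRIPLES = [
    ("natural_product_A", "inhibits", "enzyme_X"),
    ("enzyme_X", "metabolizes", "drug_B"),
    ("natural_product_A", "interacts_with", "transporter_Y"),
]
PATH = find_path(TRIPLES, "natural_product_A", "drug_B")
```

Each returned chain is a candidate mechanistic explanation that would still need to be checked against ground truth data, as the evaluation above does.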
Affiliation(s)
- Sanya B Taneja
- Intelligent Systems Program, University of Pittsburgh, Pittsburgh, PA 15206, USA
- Tiffany J Callahan
- Department of Biomedical Informatics, Columbia University, New York, NY 10032, USA
- Mary F Paine
- Department of Pharmaceutical Sciences, College of Pharmacy and Pharmaceutical Sciences, Washington State University, Spokane, WA 99202, USA
- Halil Kilicoglu
- School of Information Sciences, University of Illinois at Urbana-Champaign, Champaign, IL 61820, USA
- Marcin P Joachimiak
- Environmental Genomics and Systems Biology Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
- Richard D Boyce
- Department of Biomedical Informatics, University of Pittsburgh, Pittsburgh, PA 15206, USA
11
Dhuli K, Bonetti G, Anpilogov K, Herbst KL, Connelly ST, Bellinato F, Gisondi P, Bertelli M. Validating methods for testing natural molecules on molecular pathways of interest in silico and in vitro. J Prev Med Hyg 2022; 63:E279-E288. [PMID: 36479497 PMCID: PMC9710400 DOI: 10.15167/2421-4248/jpmh2022.63.2s3.2770] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/25/2023]
Abstract
Differentially expressed genes can serve as drug targets and are used to predict drug response and disease progression. In silico drug analysis based on the expression of these genetic biomarkers allows the detection of putative therapeutic agents, which could be used to reverse a pathological gene expression signature. Indeed, a set of bioinformatics tools can increase the accuracy of drug discovery, helping in biomarker identification. Once a drug target is identified, in vitro cell line models of disease are used to evaluate and validate the therapeutic potential of putative drugs and novel natural molecules. This study describes the development of efficacious PCR primers that can be used to identify gene expression of specific genetic pathways, which can lead to the identification of natural molecules as therapeutic agents in specific molecular pathways. For this study, genes involved in health conditions and processes were considered. In particular, the expression of genes involved in obesity, xenobiotics metabolism, endocannabinoid pathway, leukotriene B4 metabolism and signaling, inflammation, endocytosis, hypoxia, lifespan, and neurotrophins was evaluated. Exploiting the expression of specific genes in different cell lines can be useful in vitro for evaluating the therapeutic effects of small natural molecules.
Affiliation(s)
- Kristjana Dhuli
- MAGI’S LAB, Rovereto (TN), Italy
- Correspondence: Kristjana Dhuli, MAGI’S LAB, Rovereto (TN), 38068, Italy. E-mail:
- Karen L. Herbst
- Total Lipedema Care, Beverly Hills California and Tucson Arizona, USA
- Stephen Thaddeus Connelly
- San Francisco Veterans Affairs Health Care System, Department of Oral & Maxillofacial Surgery, University of California, San Francisco, CA, USA
- Francesco Bellinato
- Section of Dermatology and Venereology, Department of Medicine, University of Verona, Verona, Italy
- Paolo Gisondi
- Section of Dermatology and Venereology, Department of Medicine, University of Verona, Verona, Italy
- Matteo Bertelli
- MAGI’S LAB, Rovereto (TN), Italy
- MAGI EUREGIO, Bolzano, BZ, Italy
- MAGISNAT, Peachtree Corners (GA), USA
12
Agapito G, Pastrello C, Niu Y, Jurisica I. Pathway integration and annotation: building a puzzle with non-matching pieces and no reference picture. Brief Bioinform 2022; 23:6691914. [PMID: 36063560 DOI: 10.1093/bib/bbac368] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/16/2022] [Revised: 07/25/2022] [Accepted: 08/05/2022] [Indexed: 11/13/2022] Open
Abstract
Biological pathways are a broadly used formalism for representing and interpreting the cascade of biochemical reactions underlying cellular and biological mechanisms. Pathway representation provides an ontological link among biomolecules such as RNA, DNA, small molecules, proteins, protein complexes, hormones and genes. Frequently, pathway annotations are used to identify mechanisms linked to genes within affected biological contexts. This important role and the simplicity and elegance in representing complex interactions led to an explosion of pathway representations and databases. Unfortunately, the lack of overlap across databases results in inconsistent enrichment analysis results, unless databases are integrated. However, due to absence of consensus, guidelines or gold standards in pathway definition and representation, integration of data across pathway databases is not straightforward. Despite multiple attempts to provide consolidated pathways, highly related, redundant, poorly overlapping or ambiguous pathways continue to render pathways analysis inconsistent and hard to interpret. Ontology-based integration will promote unbiased, comprehensive yet streamlined analysis of experiments, and will reduce the number of enriched pathways when performing pathway enrichment analysis. Moreover, appropriate and consolidated pathways provide better training data for pathway prediction algorithms. In this manuscript, we describe the current methods for pathway consolidation, their strengths and pitfalls, and highlight directions for future improvements to this research area.
Affiliation(s)
- Giuseppe Agapito
- Department of Law, Economics and Social Sciences, University Magna Græcia of Catanzaro, Italy
- Data Analytic Research Center, University Magna Græcia of Catanzaro, Italy
- Chiara Pastrello
- Osteoarthritis Research Program, Division of Orthopaedics, Schroeder Arthritis Institute, University Health Network, Toronto, Canada
- Yun Niu
- Osteoarthritis Research Program, Division of Orthopaedics, Schroeder Arthritis Institute, University Health Network, Toronto, Canada
- Igor Jurisica
- Osteoarthritis Research Program, Division of Orthopaedics, Schroeder Arthritis Institute, University Health Network, Toronto, Canada
- Departments of Medical Biophysics and Computer Science Canada, University of Toronto, Toronto, Canada
- Data Science Discovery Centre for Chronic Diseases, Krembil Research Institute, Toronto, Canada
- Institute of Neuroimmunology, Slovak Academy of Sciences, Bratislava, Slovakia
13
Martens M, Kreidl F, Ehrhart F, Jean D, Mei M, Mortensen HM, Nash A, Nymark P, Evelo CT, Cerciello F. A Community-Driven, Openly Accessible Molecular Pathway Integrating Knowledge on Malignant Pleural Mesothelioma. Front Oncol 2022; 12:849640. [PMID: 35558518 PMCID: PMC9088009 DOI: 10.3389/fonc.2022.849640] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 01/06/2022] [Accepted: 03/29/2022] [Indexed: 12/28/2022] Open
Abstract
Malignant pleural mesothelioma (MPM) is a highly aggressive malignancy mainly triggered by exposure to asbestos and characterized by complex biology. A significant body of knowledge has been generated over the decades by the research community which has improved our understanding of the disease toward prevention, diagnostic opportunities and new treatments. Omics technologies are opening for additional levels of information and hypotheses. Given the growing complexity and technological spread of biological knowledge in MPM, there is an increasing need for an integrating tool that may allow scientists to access the information and analyze data in a simple and interactive way. We envisioned that a platform to capture this widespread and fast-growing body of knowledge in a machine-readable and simple visual format together with tools for automated large-scale data analysis could be an important support for the work of the general scientist in MPM and for the community to share, critically discuss, distribute and eventually advance scientific results. Toward this goal, with the support of experts in the field and informed by existing literature, we have developed the first version of a molecular pathway model of MPM in the biological pathway database WikiPathways. This provides a visual and interactive overview of interactions and connections between the most central genes, proteins and molecular pathways known to be involved or altered in MPM. Currently, 455 unique genes and 247 interactions are included, derived after stringent manual curation of an initial 39 literature references. The pathway model provides a directly employable research tool with links to common databases and repositories for the exploration and the analysis of omics data. 
The resource is publicly available in the WikiPathways database (WikiPathways: WP5087) and continues to be under development and curation by the community, enabling scientists working on MPM to actively participate in the prioritization of shared biological knowledge.
Affiliation(s)
- Marvin Martens
- Department of Bioinformatics - BiGCaT, NUTRIM, Maastricht University, Maastricht, Netherlands
- Franziska Kreidl
- Department of Bioinformatics - BiGCaT, NUTRIM, Maastricht University, Maastricht, Netherlands
- Friederike Ehrhart
- Department of Bioinformatics - BiGCaT, NUTRIM, Maastricht University, Maastricht, Netherlands
- Department of Bioinformatics - BiGCaT, MHeNs, Maastricht University, Maastricht, Netherlands
- Didier Jean
- Centre de Recherche des Cordeliers, Inserm, Sorbonne Université, Université de Paris, Functional Genomics of Solid Tumors, Paris, France
- Merlin Mei
- Oak Ridge Associated Universities, Research Triangle Park, Durham, NC, United States
- Center for Public Health and Environmental Assessment, Office of Research and Development, U.S. Environmental Protection Agency, Research Triangle Park, Durham, NC, United States
- Holly M Mortensen
- Center for Public Health and Environmental Assessment, Office of Research and Development, U.S. Environmental Protection Agency, Research Triangle Park, Durham, NC, United States
- Alistair Nash
- National Centre for Asbestos Related Diseases, University of Western Australia, Perth, WA, Australia
- Penny Nymark
- Institute of Environmental Medicine, Karolinska Institute, Stockholm, Sweden
- Chris T Evelo
- Department of Bioinformatics - BiGCaT, NUTRIM, Maastricht University, Maastricht, Netherlands
- Maastricht Centre for Systems Biology (MaCSBio), Maastricht University, Maastricht, Netherlands
- Ferdinando Cerciello
- Department of Medical Oncology, Inselspital, Bern University Hospital, University of Bern, Bern, Switzerland
14
Mubeen S, Tom Kodamullil A, Hofmann-Apitius M, Domingo-Fernández D. On the influence of several factors on pathway enrichment analysis. Brief Bioinform 2022; 23:bbac143. [PMID: 35453140 PMCID: PMC9116215 DOI: 10.1093/bib/bbac143] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Received: 01/17/2022] [Revised: 03/21/2022] [Accepted: 03/30/2022] [Indexed: 02/01/2023] Open
Abstract
Pathway enrichment analysis has become a widely used knowledge-based approach for the interpretation of biomedical data. Its popularity has led to an explosion of both enrichment methods and pathway databases. While the elegance of pathway enrichment lies in its simplicity, multiple factors can impact the results of such an analysis, which may not be accounted for. Researchers may fail to give influential aspects their due, resorting instead to popular methods and gene set collections, or default settings. Despite ongoing efforts to establish set guidelines, meaningful results are still hampered by a lack of consensus or gold standards around how enrichment analysis should be conducted. Nonetheless, such concerns have prompted a series of benchmark studies specifically focused on evaluating the influence of various factors on pathway enrichment results. In this review, we organize and summarize the findings of these benchmarks to provide a comprehensive overview on the influence of these factors. Our work covers a broad spectrum of factors, spanning from methodological assumptions to those related to prior biological knowledge, such as pathway definitions and database choice. In doing so, we aim to shed light on how these aspects can lead to insignificant, uninteresting or even contradictory results. Finally, we conclude the review by proposing future benchmarks as well as solutions to overcome some of the challenges, which originate from the outlined factors.
Affiliation(s)
- Sarah Mubeen
- Department of Bioinformatics, Fraunhofer Institute for Algorithms and Scientific Computing, Sankt Augustin 53757, Germany
- Bonn-Aachen International Center for Information Technology (B-IT), University of Bonn, 53115 Bonn, Germany
- Fraunhofer Center for Machine Learning, Germany
- Alpha Tom Kodamullil
- Department of Bioinformatics, Fraunhofer Institute for Algorithms and Scientific Computing, Sankt Augustin 53757, Germany
- Martin Hofmann-Apitius
- Department of Bioinformatics, Fraunhofer Institute for Algorithms and Scientific Computing, Sankt Augustin 53757, Germany
- Bonn-Aachen International Center for Information Technology (B-IT), University of Bonn, 53115 Bonn, Germany
- Daniel Domingo-Fernández
- Department of Bioinformatics, Fraunhofer Institute for Algorithms and Scientific Computing, Sankt Augustin 53757, Germany
- Fraunhofer Center for Machine Learning, Germany
- Enveda Biosciences, Boulder, CO, 80301, USA
15
Iatsiuk V, Malinka F, Pickova M, Tureckova J, Klema J, Spoutil F, Novosadova V, Prochazka J, Sedlacek R. Semantic clustering analysis of E3-ubiquitin ligases in gastrointestinal tract defines genes ontology clusters with tissue expression patterns. BMC Gastroenterol 2022; 22:186. [PMID: 35413796 PMCID: PMC9006408 DOI: 10.1186/s12876-022-02265-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/25/2021] [Accepted: 04/01/2022] [Indexed: 11/20/2022] Open
Abstract
Background Ubiquitin ligases (Ub-ligases) are essential intracellular enzymes responsible for the regulation of proteome homeostasis, signaling pathway crosstalk, cell differentiation and stress responses. Individual Ub-ligases exhibit their unique functions based on the nature of their substrates. They create a complex regulatory network with alternative and feedback pathways to maintain cell homeostasis, being thus important players in many physiological and pathological conditions. However, the functional classification of Ub-ligases needs to be revised and extended. Methods In the current study, we used a novel semantic biclustering technique for expression profiling of Ub-ligases and ubiquitination-related genes in the murine gastrointestinal tract (GIT). We accommodated a general framework of the algorithm for finding tissue-specific gene expression clusters in GIT. In order to test identified clusters in a biological system, we used a model of epithelial regeneration. For this purpose, a dextran sulfate sodium (DSS) mouse model, following with in situ hybridization, was used to expose genes with possible compensatory features. To determine cell-type specific distribution of Ub-ligases and ubiquitination-related genes, principal component analysis (PCA) and Uniform Manifold Approximation and Projection technique (UMAP) were used to analyze the Tabula Muris scRNA-seq data of murine colon followed by comparison with our clustering results. Results Our established clustering protocol, that incorporates the semantic biclustering algorithm, demonstrated the potential to reveal interesting expression patterns. In this manner, we statistically defined gene clusters consisting of the same genes involved in distinct regulatory pathways vs distinct genes playing roles in functionally similar signaling pathways. This allowed us to uncover the potentially redundant features of GIT-specific Ub-ligases and ubiquitination-related genes. 
Testing the statistically obtained results on the mouse model showed that genes clustered to the same ontology group simultaneously alter their expression pattern after induced epithelial damage, illustrating their complementary role during tissue regeneration. Conclusions An optimized semantic clustering protocol demonstrates the potential to reveal a readable and unique pattern in the expression profiling of GIT-specific Ub-ligases, exposing ontologically relevant gene clusters with potentially redundant features. This extends our knowledge of ontological relationships among Ub-ligases and ubiquitination-related genes, providing an alternative and more functional gene classification. In a similar way, semantic cluster analysis could be used for studying other enzyme families, tissues and systems. Supplementary Information The online version contains supplementary material available at 10.1186/s12876-022-02265-2.
Affiliation(s)
- Veronika Iatsiuk
- Laboratory of Transgenic Models of Diseases and Czech Centre for Phenogenomics, Institute of Molecular Genetics of the Czech Academy of Sciences, Prague, Czech Republic
- Frantisek Malinka
- Laboratory of Transgenic Models of Diseases and Czech Centre for Phenogenomics, Institute of Molecular Genetics of the Czech Academy of Sciences, Prague, Czech Republic
- Department of Computer Science, Czech Technical University in Prague, Prague, Czech Republic
- Marketa Pickova
- Laboratory of Transgenic Models of Diseases and Czech Centre for Phenogenomics, Institute of Molecular Genetics of the Czech Academy of Sciences, Prague, Czech Republic
- Jolana Tureckova
- Laboratory of Transgenic Models of Diseases and Czech Centre for Phenogenomics, Institute of Molecular Genetics of the Czech Academy of Sciences, Prague, Czech Republic
- Jiri Klema
- Department of Computer Science, Czech Technical University in Prague, Prague, Czech Republic
- Frantisek Spoutil
- Laboratory of Transgenic Models of Diseases and Czech Centre for Phenogenomics, Institute of Molecular Genetics of the Czech Academy of Sciences, Prague, Czech Republic
- Vendula Novosadova
- Laboratory of Transgenic Models of Diseases and Czech Centre for Phenogenomics, Institute of Molecular Genetics of the Czech Academy of Sciences, Prague, Czech Republic
- Jan Prochazka
- Laboratory of Transgenic Models of Diseases and Czech Centre for Phenogenomics, Institute of Molecular Genetics of the Czech Academy of Sciences, Prague, Czech Republic
- Radislav Sedlacek
- Laboratory of Transgenic Models of Diseases and Czech Centre for Phenogenomics, Institute of Molecular Genetics of the Czech Academy of Sciences, Prague, Czech Republic
16
Mortensen HM, Martens M, Senn J, Levey T, Evelo CT, Willighagen EL, Exner T. The AOP-DB RDF: Applying FAIR Principles to the Semantic Integration of AOP Data Using the Research Description Framework. Front Toxicol 2022; 4:803983. [PMID: 35295213 PMCID: PMC8915825 DOI: 10.3389/ftox.2022.803983] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 10/28/2021] [Accepted: 01/13/2022] [Indexed: 01/12/2023] Open
Abstract
Computational toxicology is central to the current transformation occurring in toxicology and chemical risk assessment. There is a need for more efficient use of existing data to characterize human toxicological response data for environmental chemicals in the US and Europe. The Adverse Outcome Pathway (AOP) framework helps to organize existing mechanistic information and contributes to what is currently being described as New Approach Methodologies (NAMs). AOP knowledge and data are currently submitted directly by users and stored in the AOP-Wiki (https://aopwiki.org/). Automatic and systematic parsing of AOP-Wiki data is challenging, so we have created the EPA Adverse Outcome Pathway Database. The AOP-DB, developed by the US EPA to assist in the biological and mechanistic characterization of AOP data, provides a broad, systems-level overview of the biological context of AOPs. Here we describe the recent semantic mapping efforts for the AOP-DB, and how this process facilitates the integration of AOP-DB data with other toxicologically relevant datasets through a use case example.
Affiliation(s)
- Holly M. Mortensen
- United States Environmental Protection Agency, Office of Research and Development, Center for Public Health and Environmental Assessment, Research Triangle Park, Durham, NC, United States
- *Correspondence: Holly M. Mortensen,
- Marvin Martens
- Department of Bioinformatics (BiGCaT), Maastricht University, Maastricht, Netherlands
- Jonathan Senn
- Oak Ridge Associated Universities, Oak Ridge, TN, United States
- Trevor Levey
- Oak Ridge Associated Universities, Oak Ridge, TN, United States
- SAS Institute, Cary, NC, United States
- Chris T. Evelo
- Department of Bioinformatics (BiGCaT), Maastricht University, Maastricht, Netherlands
- Maastricht Centre for Systems Biology, Maastricht University, Maastricht, Netherlands
- Egon L. Willighagen
- Department of Bioinformatics (BiGCaT), Maastricht University, Maastricht, Netherlands
17
Prashanth G, Vastrad B, Vastrad C, Kotrashetti S. Potential Molecular Mechanisms and Remdesivir Treatment for Acute Respiratory Syndrome Corona Virus 2 Infection/COVID 19 Through RNA Sequencing and Bioinformatics Analysis. Bioinform Biol Insights 2022; 15:11779322211067365. [PMID: 34992355 PMCID: PMC8725226 DOI: 10.1177/11779322211067365] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/16/2021] [Accepted: 11/29/2021] [Indexed: 11/27/2022] Open
Abstract
Introduction: Severe acute respiratory syndrome corona virus 2 (SARS-CoV-2) infection (COVID 19) is a progressive viral infection that has been investigated extensively. However, genetic features and molecular pathogenesis underlying remdesivir treatment for SARS-CoV-2 infection remain unclear. Here, we used bioinformatics to investigate the candidate genes associated with the molecular pathogenesis of remdesivir-treated SARS-CoV-2-infected patients. Methods: Expression profiling by high-throughput sequencing dataset (GSE149273) was downloaded from the Gene Expression Omnibus, and the differentially expressed genes (DEGs) in remdesivir-treated SARS-CoV-2 infection samples and nontreated SARS-CoV-2 infection samples with an adjusted P value of <.05 and a |log fold change| > 1.3 were first identified with the limma package in R. Next, pathway and gene ontology (GO) enrichment analysis of these DEGs was performed. Then, the hub genes were identified by the NetworkAnalyzer plugin and the other bioinformatics approaches including protein-protein interaction network analysis, module analysis, target gene-miRNA regulatory network, and target gene-TF regulatory network. Finally, a receiver-operating characteristic analysis was performed for diagnostic values associated with hub genes. Results: A total of 909 DEGs were identified, including 453 upregulated genes and 457 downregulated genes. As for the pathway and GO enrichment analysis, the upregulated genes were mainly linked with influenza A and defense response, whereas downregulated genes were mainly linked with drug metabolism-cytochrome P450 and reproductive process. In addition, 10 hub genes (VCAM1, IKBKE, STAT1, IL7R, ISG15, E2F1, ZBTB16, TFAP4, ATP6V1B1, and APBB1) were identified. Receiver-operating characteristic analysis showed that hub genes (CIITA, HSPA6, MYD88, SOCS3, TNFRSF10A, ADH1A, CACNA2D2, DUSP9, FMO5, and PDE1A) had good diagnostic values. Conclusion: This study provided insights into the molecular mechanism of remdesivir-treated SARS-CoV-2 infection that might be useful in further investigations.
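The DEG selection rule in the Methods (adjusted P value below .05 and absolute log fold change above 1.3) amounts to a two-threshold filter. The sketch below shows that filter alone in Python, assuming the statistics have already been computed (the study used limma in R); the function name `select_degs` and the example rows are hypothetical and are not data from the study.

```python
def select_degs(results, p_cutoff=0.05, lfc_cutoff=1.3):
    """Split (gene, log_fc, adj_p) rows into up- and downregulated gene lists."""
    up, down = [], []
    for gene, log_fc, adj_p in results:
        # Both thresholds must hold for a gene to count as differentially expressed.
        if adj_p < p_cutoff and abs(log_fc) > lfc_cutoff:
            (up if log_fc > 0 else down).append(gene)
    return up, down


# Hypothetical rows: (gene label, log fold change, adjusted P value).
ROWS = [
    ("gene_a", 2.1, 0.001),   # passes both thresholds, upregulated
    ("gene_b", -1.8, 0.020),  # passes both thresholds, downregulated
    ("gene_c", 1.5, 0.200),   # fails the adjusted P threshold
    ("gene_d", 0.9, 0.001),   # fails the fold-change threshold
]
UP, DOWN = select_degs(ROWS)
```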
Affiliation(s)
- G Prashanth
- Department of General Medicine, Basaveshwara Medical College, Chitradurga, India
- Basavaraj Vastrad
- Department of Biochemistry, Basaveshwar College of Pharmacy, Gadag, India
18
Muscolino A, Di Maria A, Rapicavoli RV, Alaimo S, Bellomo L, Billeci F, Borzì S, Ferragina P, Ferro A, Pulvirenti A. NETME: on-the-fly knowledge network construction from biomedical literature. Appl Netw Sci 2022; 7:1. [PMID: 35013714 PMCID: PMC8733431 DOI: 10.1007/s41109-021-00435-x] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 03/31/2021] [Accepted: 09/21/2021] [Indexed: 06/14/2023]
Abstract
BACKGROUND The rapidly increasing biological literature is a key resource to automatically extract and gain knowledge concerning biological elements and their relations. Knowledge Networks are helpful tools in the context of biological knowledge discovery and modeling. RESULTS We introduce a novel system called NETME, which, starting from a set of full-texts obtained from PubMed, through an easy-to-use web interface, interactively extracts biological elements from ontological databases and then synthesizes a network inferring relations among such elements. The results clearly show that our tool is capable of inferring comprehensive and reliable biological networks. SUPPLEMENTARY INFORMATION The online version contains supplementary material available at 10.1007/s41109-021-00435-x.
Affiliation(s)
- Antonio Di Maria
- Department of Clinical and Experimental Medicine, University of Catania, Catania, Italy
- Salvatore Alaimo
- Department of Clinical and Experimental Medicine, University of Catania, Catania, Italy
- Lorenzo Bellomo
- Department of Computer Science, University of Pisa, Pisa, Italy
- Fabrizio Billeci
- Department of Maths and Computer Science, University of Catania, Catania, Italy
- Stefano Borzì
- Department of Maths and Computer Science, University of Catania, Catania, Italy
- Paolo Ferragina
- Department of Computer Science, University of Pisa, Pisa, Italy
- Alfredo Ferro
- Department of Clinical and Experimental Medicine, University of Catania, Catania, Italy
- Alfredo Pulvirenti
- Department of Clinical and Experimental Medicine, University of Catania, Catania, Italy
19
Pastrello C, Niu Y, Jurisica I. Pathway Enrichment Analysis of Microarray Data. Methods Mol Biol 2022; 2401:147-159. [PMID: 34902127 DOI: 10.1007/978-1-0716-1839-4_10] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/14/2023]
Abstract
Microarray analyses usually result in a list of differential genes that need to be annotated to link them to the phenotype being studied and to help plan validation experiments and interpret the results. Pathway enrichment analyses are frequently used for this purpose, where pathways are human-created models of molecular activities and processes. While different types of pathway enrichment are available, we focus this protocol on the most frequent type: overrepresentation analysis. Many databases collect different sets of pathways and curate different sets of genes for the same pathways, so it is important to carefully choose the most suitable pathway source to perform enrichment analysis. To provide a comprehensive pathway analysis, in this protocol we will use pathDIP, which supports comprehensive enrichment analysis by integrating 22 main pathway databases. We will also describe the steps needed to visualize the enriched pathways using GSOAP.
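Overrepresentation analysis is commonly computed as a one-sided hypergeometric test on the overlap between a gene list and a pathway. The sketch below shows that generic calculation using only the Python standard library; it is an assumption that this matches the statistic used by any particular tool, and the function name `ora_p_value` and the example numbers are hypothetical.

```python
from math import comb


def ora_p_value(universe, pathway_size, list_size, overlap):
    """P(X >= overlap) when drawing list_size genes without replacement.

    universe:     number of genes in the background set
    pathway_size: number of background genes annotated to the pathway
    list_size:    number of genes in the differential gene list
    overlap:      number of list genes that fall in the pathway
    """
    total = comb(universe, list_size)
    upper = min(pathway_size, list_size)
    hits = sum(
        comb(pathway_size, i) * comb(universe - pathway_size, list_size - i)
        for i in range(overlap, upper + 1)
    )
    return hits / total


# Toy numbers for illustration: 10 background genes, 3 in the pathway,
# a list of 2 genes, both of which are in the pathway.
P_TOY = ora_p_value(universe=10, pathway_size=3, list_size=2, overlap=2)
```

When many pathways are tested, the resulting P values would normally be corrected for multiple testing before interpretation.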
Affiliation(s)
- Chiara Pastrello
- Osteoarthritis Research Program, Division of Orthopedic Surgery, Schroeder Arthritis Institute, University Health Network, Toronto, ON, Canada
- Data Science Discovery Centre for Chronic Diseases, Krembil Research Institute, Toronto Western Hospital, University Health Network, Toronto, ON, Canada
| | - Yun Niu
- Osteoarthritis Research Program, Division of Orthopedic Surgery, Schroeder Arthritis Institute, University Health Network, Toronto, ON, Canada
- Data Science Discovery Centre for Chronic Diseases, Krembil Research Institute, Toronto Western Hospital, University Health Network, Toronto, ON, Canada
| | - Igor Jurisica
- Osteoarthritis Research Program, Division of Orthopedic Surgery, Schroeder Arthritis Institute, University Health Network, Toronto, ON, Canada.
- Data Science Discovery Centre for Chronic Diseases, Krembil Research Institute, Toronto Western Hospital, University Health Network, Toronto, ON, Canada.
- Departments of Medical Biophysics and Computer Science, University of Toronto, Toronto, ON, Canada.
- Institute of Neuroimmunology, Slovak Academy of Sciences, Bratislava, Slovakia.
|
20
|
Xiao Y, Zheng X, Song W, Tong F, Mao Y, Liu S, Zhao D. CIDO-COVID-19: An Ontology for COVID-19 Based on CIDO. ANNUAL INTERNATIONAL CONFERENCE OF THE IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY. IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY. ANNUAL INTERNATIONAL CONFERENCE 2021; 2021:2119-2122. [PMID: 34891707 DOI: 10.1109/embc46164.2021.9629555] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/14/2023]
Abstract
To realize integration, organization and reusability of knowledge related to COVID-19, an ontology for COVID-19 (CIDO-COVID-19) was constructed which extended the Coronavirus Infectious Disease Ontology (CIDO) by adding terms of COVID-19 related to symptoms, prevention, drugs and clinical domains. First, terms from the existing ontologies, literature, clinical guidelines and other resources about COVID-19 were merged. Then, the Stanford seven-step approach was used to define and organize the acquired terms. Finally, the CIDO-COVID-19 was built on the basis of the terms mentioned above using Protégé. The CIDO-COVID-19 is a more comprehensive ontology for COVID-19, covering multiple areas in the domain of COVID-19, including disease, diagnosis, etiology, virus, transmission, symptom, treatment, drug and prevention. Clinical Relevance: The CIDO-COVID-19 covers multiple areas related to COVID-19, including diseases, diagnosis, etiology, virus, transmission, symptoms, treatment, drugs and prevention. Compared with the CIDO, it is expanded to cover drugs, prevention, and the clinical domain. The definitions of terms in CIDO-COVID-19 refer to biomedical ontologies, clinical glossaries and clinical guidelines for COVID-19, which can provide clinicians with standard terminology in the clinical domain.
|
21
|
Mubeen S, Bharadhwaj VS, Gadiya Y, Hofmann-Apitius M, Kodamullil AT, Domingo-Fernández D. DecoPath: a web application for decoding pathway enrichment analysis. NAR Genom Bioinform 2021; 3:lqab087. [PMID: 34568823 PMCID: PMC8459727 DOI: 10.1093/nargab/lqab087] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/10/2021] [Revised: 08/31/2021] [Accepted: 09/14/2021] [Indexed: 12/16/2022] Open
Abstract
The past decades have brought a steady growth of pathway databases and enrichment methods. However, the advent of pathway data has not been accompanied by an improvement in interoperability across databases, hampering the use of pathway knowledge from multiple databases for enrichment analysis. While integrative databases have attempted to address this issue, they often do not account for redundant information across resources. Furthermore, the majority of studies that employ pathway enrichment analysis still rely upon a single database or enrichment method, though the use of another could yield differing results. These shortcomings call for approaches that investigate the differences and agreements across databases and methods as their selection in the design of a pathway analysis can be a crucial step in ensuring the results of such an analysis are meaningful. Here we present DecoPath, a web application to assist in the interpretation of the results of pathway enrichment analysis. DecoPath provides an ecosystem to run enrichment analysis or directly upload results and facilitate the interpretation of results with custom visualizations that highlight the consensus and/or discrepancies at the pathway- and gene-levels. DecoPath is available at https://decopath.scai.fraunhofer.de, and its source code and documentation can be found on GitHub at https://github.com/DecoPath/DecoPath.
Affiliation(s)
- Sarah Mubeen
- Department of Bioinformatics, Fraunhofer Institute for Algorithms and Scientific Computing, Sankt Augustin 53757, Germany
- Bonn-Aachen International Center for IT, Rheinische Friedrich-Wilhelms-Universität Bonn, Bonn 53115, Germany
- Fraunhofer Center for Machine Learning, Germany
| | - Vinay S Bharadhwaj
- Department of Bioinformatics, Fraunhofer Institute for Algorithms and Scientific Computing, Sankt Augustin 53757, Germany
- Bonn-Aachen International Center for IT, Rheinische Friedrich-Wilhelms-Universität Bonn, Bonn 53115, Germany
| | - Yojana Gadiya
- Department of Bioinformatics, Fraunhofer Institute for Algorithms and Scientific Computing, Sankt Augustin 53757, Germany
- Bonn-Aachen International Center for IT, Rheinische Friedrich-Wilhelms-Universität Bonn, Bonn 53115, Germany
| | - Martin Hofmann-Apitius
- Department of Bioinformatics, Fraunhofer Institute for Algorithms and Scientific Computing, Sankt Augustin 53757, Germany
- Bonn-Aachen International Center for IT, Rheinische Friedrich-Wilhelms-Universität Bonn, Bonn 53115, Germany
| | - Alpha T Kodamullil
- Department of Bioinformatics, Fraunhofer Institute for Algorithms and Scientific Computing, Sankt Augustin 53757, Germany
| | - Daniel Domingo-Fernández
- Department of Bioinformatics, Fraunhofer Institute for Algorithms and Scientific Computing, Sankt Augustin 53757, Germany
- Fraunhofer Center for Machine Learning, Germany
- Enveda Biosciences, Boulder, CO 80301, USA
|
22
|
Hanspers K, Kutmon M, Coort SL, Digles D, Dupuis LJ, Ehrhart F, Hu F, Lopes EN, Martens M, Pham N, Shin W, Slenter DN, Waagmeester A, Willighagen EL, Winckers LA, Evelo CT, Pico AR. Ten simple rules for creating reusable pathway models for computational analysis and visualization. PLoS Comput Biol 2021; 17:e1009226. [PMID: 34411100 PMCID: PMC8375987 DOI: 10.1371/journal.pcbi.1009226] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/23/2022] Open
Affiliation(s)
- Kristina Hanspers
- Institute of Data Science and Biotechnology, Gladstone Institutes, San Francisco, California, United States of America
| | - Martina Kutmon
- Department of Bioinformatics—BiGCaT, NUTRIM, Maastricht University, Maastricht, the Netherlands
- Maastricht Centre for Systems Biology (MaCSBio), Maastricht University, Maastricht, the Netherlands
| | - Susan L. Coort
- Department of Bioinformatics—BiGCaT, NUTRIM, Maastricht University, Maastricht, the Netherlands
| | - Daniela Digles
- Department of Pharmaceutical Sciences, Division of Pharmaceutical Sciences, University of Vienna, Vienna, Austria
| | - Lauren J. Dupuis
- Department of Bioinformatics—BiGCaT, NUTRIM, Maastricht University, Maastricht, the Netherlands
| | - Friederike Ehrhart
- Department of Bioinformatics—BiGCaT, NUTRIM, Maastricht University, Maastricht, the Netherlands
| | - Finterly Hu
- Department of Bioinformatics—BiGCaT, NUTRIM, Maastricht University, Maastricht, the Netherlands
- Maastricht Centre for Systems Biology (MaCSBio), Maastricht University, Maastricht, the Netherlands
| | - Elisson N. Lopes
- Instituto de Ciencias Biologicas, Departamento de Bioquimica e Imunologia, Universidade Federal de Minas Gerais, Belo Horizonte, Brazil
| | - Marvin Martens
- Department of Bioinformatics—BiGCaT, NUTRIM, Maastricht University, Maastricht, the Netherlands
| | - Nhung Pham
- Department of Bioinformatics—BiGCaT, NUTRIM, Maastricht University, Maastricht, the Netherlands
- Maastricht Centre for Systems Biology (MaCSBio), Maastricht University, Maastricht, the Netherlands
| | - Woosub Shin
- Department of Bioinformatics—BiGCaT, NUTRIM, Maastricht University, Maastricht, the Netherlands
| | - Denise N. Slenter
- Department of Bioinformatics—BiGCaT, NUTRIM, Maastricht University, Maastricht, the Netherlands
| | | | - Egon L. Willighagen
- Department of Bioinformatics—BiGCaT, NUTRIM, Maastricht University, Maastricht, the Netherlands
| | - Laurent A. Winckers
- Department of Bioinformatics—BiGCaT, NUTRIM, Maastricht University, Maastricht, the Netherlands
| | - Chris T. Evelo
- Department of Bioinformatics—BiGCaT, NUTRIM, Maastricht University, Maastricht, the Netherlands
- Maastricht Centre for Systems Biology (MaCSBio), Maastricht University, Maastricht, the Netherlands
| | - Alexander R. Pico
- Institute of Data Science and Biotechnology, Gladstone Institutes, San Francisco, California, United States of America
|
23
|
Mapping gene and gene pathways associated with coronary artery disease: a CARDIoGRAM exome and multi-ancestry UK biobank analysis. Sci Rep 2021; 11:16461. [PMID: 34385509 PMCID: PMC8361107 DOI: 10.1038/s41598-021-95637-9] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/26/2020] [Accepted: 07/28/2021] [Indexed: 02/07/2023] Open
Abstract
Coronary artery disease (CAD) genome-wide association studies typically focus on single nucleotide variants (SNVs), and many potentially associated SNVs fail to reach the GWAS significance threshold. We performed gene and pathway-based association (GBA) tests on publicly available Coronary ARtery DIsease Genome wide Replication and Meta-analysis consortium Exome (n = 120,575) and multi ancestry pan UK Biobank study (n = 442,574) summary data using versatile gene-based association study (VEGAS2) and Multi-marker analysis of genomic annotation (MAGMA) to identify novel genes and pathways associated with CAD. We included only exonic SNVs and excluded regulatory regions. VEGAS2 and MAGMA ranked genes and pathways based on aggregated SNV test statistics. We used Bonferroni corrected gene and pathway significance threshold at 3.0 × 10⁻⁶ and 1.0 × 10⁻⁵, respectively. We also report the top one percent of ranked genes and pathways. We identified 17 top enriched genes with four genes (PCSK9, FAM177, LPL, ARGEF26), reaching statistical significance (p ≤ 3.0 × 10⁻⁶) using both GBA tests in two GWAS studies. In addition, our analyses identified ten genes (DUSP13, KCNJ11, CD300LF/RAB37, SLCO1B1, LRRFIP1, QSER1, UBR2, MOB3C, MST1R, and ABCC8) with previously unreported associations with CAD, although none of the single SNV associations within the genes were genome-wide significant. Among the top 1% non-lipid pathways, we detected pathways regulating coagulation, inflammation, neuronal aging, and wound healing.
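The Bonferroni thresholds quoted in this abstract follow the usual rule of dividing the family-wise error rate by the number of tests. The sketch below works backwards from the reported thresholds under the assumption of a family-wise alpha of 0.05, which the abstract does not state; under that assumption the thresholds correspond to roughly 16,667 gene-level tests and 5,000 pathway-level tests. The function names are illustrative.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold under Bonferroni correction."""
    return alpha / n_tests


def implied_number_of_tests(alpha: float, threshold: float) -> int:
    """Number of tests implied by a reported Bonferroni threshold."""
    return round(alpha / threshold)


# Assumption: family-wise alpha of 0.05 (not stated in the abstract).
ASSUMED_ALPHA = 0.05

# Thresholds as reported in the abstract.
gene_tests = implied_number_of_tests(ASSUMED_ALPHA, 3.0e-6)
pathway_tests = implied_number_of_tests(ASSUMED_ALPHA, 1.0e-5)
```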
|
24
|
Roychowdhury D, Gupta S, Qin X, Arighi CN, Vijay-Shanker K. emiRIT: a text-mining-based resource for microRNA information. DATABASE-THE JOURNAL OF BIOLOGICAL DATABASES AND CURATION 2021; 2021:6287648. [PMID: 34048547 PMCID: PMC8163238 DOI: 10.1093/database/baab031] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 11/23/2020] [Revised: 03/15/2021] [Accepted: 05/04/2021] [Indexed: 01/18/2023]
Abstract
microRNAs (miRNAs) are essential gene regulators, and their dysregulation often leads to diseases. Easy access to miRNA information is crucial for interpreting generated experimental data, connecting facts across publications and developing new hypotheses built on previous knowledge. Here, we present extracting miRNA Information from Text (emiRIT), a text-mining-based resource, which presents miRNA information mined from the literature through a user-friendly interface. We collected 149,233 miRNA–PubMed ID pairs from Medline between January 1997 and May 2020. emiRIT currently contains ‘miRNA–gene regulation’ (69,152 relations), ‘miRNA–disease (cancer)’ (12,300 relations), ‘miRNA–biological process and pathways’ (23,390 relations) and circulatory ‘miRNAs in extracellular locations’ (3782 relations). Biological entities and their relation to miRNAs were extracted from Medline abstracts using publicly available and in-house developed text-mining tools, and the entities were normalized to facilitate querying and integration. We built a database and an interface to store and access the integrated data, respectively. We provide an up-to-date and user-friendly resource to facilitate access to comprehensive miRNA information from the literature on a large scale, enabling users to navigate through different roles of miRNA and examine them in a context specific to their information needs. To assess our resource’s information coverage, we have conducted two case studies focusing on the target and differential expression information of miRNAs in the context of cancer and a third case study to assess the usage of emiRIT in the curation of miRNA information. Database URL: https://research.bioinformatics.udel.edu/emirit/
Affiliation(s)
- Debarati Roychowdhury
- Department of Computer and Information Sciences, University of Delaware, 101 Smith Hall, 18 Amstel Ave, Newark, DE 19716, USA
| | - Samir Gupta
- Department of Computer and Information Sciences, University of Delaware, 101 Smith Hall, 18 Amstel Ave, Newark, DE 19716, USA
| | - Xihan Qin
- Department of Computer and Information Sciences, Center of Bioinformatics and Computational Biology, University of Delaware, 15 Innovation Way, Room 205, Newark, DE 19711, USA
| | - Cecilia N Arighi
- Department of Computer and Information Sciences, Center of Bioinformatics and Computational Biology, University of Delaware, 15 Innovation Way, Room 205, Newark, DE 19711, USA
| | - K Vijay-Shanker
- Department of Computer and Information Sciences, University of Delaware, 101 Smith Hall, 18 Amstel Ave, Newark, DE 19716, USA
|
25
|
Application of multiple omics and network projection analyses to drug repositioning for pathogenic mosquito-borne viruses. Sci Rep 2021; 11:10136. [PMID: 33980888 PMCID: PMC8115341 DOI: 10.1038/s41598-021-89171-x] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/30/2020] [Accepted: 04/19/2021] [Indexed: 12/22/2022] Open
Abstract
Pathogenic mosquito-borne viruses are a serious public health issue in tropical and subtropical regions and are increasingly becoming a problem in other climate zones. Drug repositioning is a rapid, pharmaco-economic approach that can be used to identify compounds that target these neglected tropical diseases. We have applied a computational drug repositioning method to five mosquito-borne viral infections: dengue virus (DENV), zika virus (ZIKV), West Nile virus (WNV), Japanese encephalitis virus (JEV) and Chikungunya virus (CHIV). We identified signature molecules and pathways for each virus infection based on omics analyses, and determined 77 drug candidates and 146 proteins for those diseases by using a filtering method. Based on the omics analyses, we analyzed the relationship among drugs, target proteins and the five viruses by projecting the signature molecules onto a human protein-protein interaction network. We have classified the drug candidates according to the degree of target proteins in the protein-protein interaction network for the five infectious diseases.
|
26
|
Chen Y, Verbeek FJ, Wolstencroft K. Establishing a consensus for the hallmarks of cancer based on gene ontology and pathway annotations. BMC Bioinformatics 2021; 22:178. [PMID: 33823788 PMCID: PMC8025515 DOI: 10.1186/s12859-021-04105-8] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/01/2020] [Accepted: 03/22/2021] [Indexed: 12/29/2022] Open
Abstract
BACKGROUND The hallmarks of cancer provide a highly cited and well-used conceptual framework for describing the processes involved in cancer cell development and tumourigenesis. However, methods for translating these high-level concepts into data-level associations between hallmarks and genes (for high throughput analysis), vary widely between studies. The examination of different strategies to associate and map cancer hallmarks reveals significant differences, but also consensus. RESULTS Here we present the results of a comparative analysis of cancer hallmark mapping strategies, based on Gene Ontology and biological pathway annotation, from different studies. By analysing the semantic similarity between annotations, and the resulting gene set overlap, we identify emerging consensus knowledge. In addition, we analyse the differences between hallmark and gene set associations using Weighted Gene Co-expression Network Analysis and enrichment analysis. CONCLUSIONS Reaching a community-wide consensus on how to identify cancer hallmark activity from research data would enable more systematic data integration and comparison between studies. These results highlight the current state of the consensus and offer a starting point for further convergence. In addition, we show how a lack of consensus can lead to large differences in the biological interpretation of downstream analyses and discuss the challenges of annotating changing and accumulating biological data, using intermediate knowledge resources that are also changing over time.
Affiliation(s)
- Yi Chen
- The Leiden Institute of Advanced Computer Science (LIACS), Snellius Gebouw, Niels Bohrweg 1, Leiden, The Netherlands
| | - Fons J. Verbeek
- The Leiden Institute of Advanced Computer Science (LIACS), Snellius Gebouw, Niels Bohrweg 1, Leiden, The Netherlands
| | - Katherine Wolstencroft
- The Leiden Institute of Advanced Computer Science (LIACS), Snellius Gebouw, Niels Bohrweg 1, Leiden, The Netherlands
|
27
|
Ghatnatti V, Vastrad B, Patil S, Vastrad C, Kotturshetti I. Identification of potential and novel target genes in pituitary prolactinoma by bioinformatics analysis. AIMS Neurosci 2021; 8:254-283. [PMID: 33709028 PMCID: PMC7940115 DOI: 10.3934/neuroscience.2021014] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/05/2020] [Accepted: 01/29/2021] [Indexed: 02/05/2023] Open
Abstract
Pituitary prolactinoma is one of the most complicated and fatally pathogenic pituitary adenomas. Therefore, there is an urgent need to improve our understanding of the underlying molecular mechanism that drives the initiation, progression, and metastasis of pituitary prolactinoma. The aim of the present study was to identify the key genes and signaling pathways associated with pituitary prolactinoma using bioinformatics analysis. Transcriptome microarray dataset GSE119063 was downloaded from the Gene Expression Omnibus (GEO) database. The limma package in R was used to screen DEGs. Pathway and Gene Ontology (GO) enrichment analyses were conducted to identify the biological role of DEGs. A protein-protein interaction (PPI) network was constructed and analyzed using the HIPPIE database and Cytoscape software. Module analysis was performed. In addition, a target gene-miRNA regulatory network and a target gene-TF regulatory network were constructed using NetworkAnalyst and Cytoscape software. Finally, hub genes were validated by receiver operating characteristic (ROC) curve analysis. A total of 989 DEGs were identified, including 461 up-regulated genes and 528 down-regulated genes. Pathway enrichment analysis showed that the DEGs were significantly enriched in retinoate biosynthesis II, signaling pathways regulating pluripotency of stem cells, ALK2 signaling events, vitamin D3 biosynthesis, cell cycle and aurora B signaling. GO enrichment analysis showed that the DEGs were significantly enriched in sensory organ morphogenesis, extracellular matrix, hormone activity, nuclear division, condensed chromosome and microtubule binding. In the PPI network and modules, SOX2, PRSS45, CLTC, PLK1, B4GALT6, RUNX1 and GTSE1 were considered hub genes. In the target gene-miRNA regulatory network and target gene-TF regulatory network, LINC00598, SOX4, IRX1 and UNC13A were considered hub genes. Using integrated bioinformatics analysis, we identified candidate genes in pituitary prolactinoma, which might improve our understanding of the molecular mechanisms of pituitary prolactinoma.
Affiliation(s)
- Vikrant Ghatnatti
- Department of Endocrinology, J N Medical College, Belagavi and KLE Academy of Higher Education & Research 590010, Karnataka, India
| | - Basavaraj Vastrad
- Department of Biochemistry, Basaveshwar College of Pharmacy, Gadag, Karnataka 582103, India
| | - Swetha Patil
- Department of Obstetrics and Gynaecology, J N Medical College, Belagavi and KLE Academy of Higher Education & Research 590010, Karnataka, India
| | - Chanabasayya Vastrad
- Biostatistics and Bioinformatics, Chanabasava Nilaya, Bharthinagar, Dharwad 580001, Karnataka, India
| | - Iranna Kotturshetti
- Department of Ayurveda, Rajiv Gandhi Education Society's Ayurvedic Medical College, Ron 562209, Karnataka, India
|
28
|
Kanza S, Graham Frey J. Semantic Technologies in Drug Discovery. SYSTEMS MEDICINE 2021. [DOI: 10.1016/b978-0-12-801238-3.11520-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/24/2022] Open
|
29
|
Hanspers K, Riutta A, Summer-Kutmon M, Pico AR. Pathway information extracted from 25 years of pathway figures. Genome Biol 2020; 21:273. [PMID: 33168034 PMCID: PMC7649569 DOI: 10.1186/s13059-020-02181-2] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/16/2020] [Accepted: 10/16/2020] [Indexed: 12/16/2022] Open
Abstract
Thousands of pathway diagrams are published each year as static figures inaccessible to computational queries and analyses. Using a combination of machine learning, optical character recognition, and manual curation, we identified 64,643 pathway figures published between 1995 and 2019 and extracted 1,112,551 instances of human genes, comprising 13,464 unique NCBI genes, participating in a wide variety of biological processes. This collection represents an order of magnitude more genes than found in the text of the same papers, and thousands of genes missing from other pathway databases, thus presenting new opportunities for discovery and research.
Affiliation(s)
- Kristina Hanspers
- Institute of Data Science and Biotechnology, Gladstone Institutes, San Francisco, CA, USA
| | - Anders Riutta
- Institute of Data Science and Biotechnology, Gladstone Institutes, San Francisco, CA, USA
| | - Martina Summer-Kutmon
- Maastricht Centre for Systems Biology (MaCSBio), Maastricht University, Maastricht, The Netherlands
- Department of Bioinformatics - BiGCaT, NUTRIM, Maastricht University, Maastricht, The Netherlands
| | - Alexander R Pico
- Institute of Data Science and Biotechnology, Gladstone Institutes, San Francisco, CA, USA.
|
30
|
Bioinformatics analyses of significant genes, related pathways, and candidate diagnostic biomarkers and molecular targets in SARS-CoV-2/COVID-19. GENE REPORTS 2020; 21:100956. [PMID: 33553808 PMCID: PMC7854084 DOI: 10.1016/j.genrep.2020.100956] [Citation(s) in RCA: 21] [Impact Index Per Article: 5.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/01/2020] [Accepted: 10/31/2020] [Indexed: 12/12/2022]
Abstract
Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) infection is a leading cause of pneumonia and death. The aim of this investigation is to identify the key genes in SARS-CoV-2 infection and uncover their potential functions. We downloaded the high-throughput sequencing expression profile GSE152075 from the Gene Expression Omnibus database. Normalization of the data from primary SARS-CoV-2 infected samples and negative control samples in the database was conducted using R software. Then, joint analysis of the data was performed. Pathway and Gene Ontology (GO) enrichment analyses were performed, and the protein-protein interaction (PPI) network, target gene-miRNA regulatory network and target gene-TF regulatory network of the differentially expressed genes (DEGs) were constructed using Cytoscape software. Identification of diagnostic biomarkers was conducted using receiver operating characteristic (ROC) curve analysis. In total, 994 DEGs (496 up-regulated and 498 down-regulated genes) were identified. Pathway and GO enrichment analysis showed that up- and down-regulated genes were mainly enriched in the NOD-like receptor signaling pathway, Ribosome, response to external biotic stimulus and viral transcription in SARS-CoV-2 infection. Down- and up-regulated genes were selected to establish the PPI network, modules, target gene-miRNA regulatory network and target gene-TF regulatory network, which revealed that these genes were involved in adaptive immune system, fluid shear stress and atherosclerosis, influenza A and protein processing in endoplasmic reticulum. In total, ten genes (CBL, ISG15, NEDD4, PML, REL, CTNNB1, ERBB2, JUN, RPS8 and STUB1) were identified as good diagnostic biomarkers. In conclusion, the identified DEGs, hub genes and target genes contribute to the understanding of the molecular mechanisms underlying the advancement of SARS-CoV-2 infection and they may be used as diagnostic and molecular targets for the treatment of patients with SARS-CoV-2 infection in the future.
Key Words
- Bioinformatics
- CBL, Cbl proto-oncogene
- DEGs, differentially expressed genes
- Diagnosis
- GO, Gene ontology
- ISG15, ISG15 ubiquitin like modifier
- Key genes
- NEDD4, NEDD4 E3 ubiquitin protein ligase
- PML, promyelocytic leukemia
- PPI, protein-protein interaction
- Pathways
- REL, REL proto-oncogene, NF-kB subunit
- ROC, receiver operating characteristic
- SARS-CoV-2 infection
- SARS-CoV-2, Severe acute respiratory syndrome corona virus 2
|
31
|
Identification of potential mRNA panels for severe acute respiratory syndrome coronavirus 2 (COVID-19) diagnosis and treatment using microarray dataset and bioinformatics methods. 3 Biotech 2020; 10:422. [PMID: 33251083 PMCID: PMC7679428 DOI: 10.1007/s13205-020-02406-y] [Citation(s) in RCA: 25] [Impact Index Per Article: 6.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/21/2020] [Accepted: 08/20/2020] [Indexed: 12/15/2022] Open
Abstract
The goal of the present investigation is to identify the differentially expressed genes (DEGs) between SARS-CoV-2 infected and normal control samples to investigate the molecular mechanisms of infection with SARS-CoV-2. The microarray data of the dataset E-MTAB-8871 were retrieved from the ArrayExpress database. Pathway and Gene Ontology (GO) enrichment study, protein–protein interaction (PPI) network, modules, target gene–miRNA regulatory network, and target gene–TF regulatory network have been performed. Subsequently, the key genes were validated using an analysis of the receiver operating characteristic (ROC) curve. In SARS-CoV-2 infection, a total of 324 DEGs (76 up- and 248 down-regulated genes) were identified and enriched in a number of associated SARS-CoV-2 infection pathways and GO terms. Hub and target genes such as TP53, HRAS, MAPK11, RELA, IKZF3, IFNAR2, SKI, TNFRSF13C, JAK1, TRAF6, KLRF2, CD1A were identified from PPI network, target gene–miRNA regulatory network, and target gene–TF regulatory network. Study of the ROC showed that ten genes (CCL5, IFNAR2, JAK2, MX1, STAT1, BID, CD55, CD80, HAL-B, and HLA-DMA) were substantially involved in SARS-CoV-2 patients. The present investigation identified key genes and pathways that deepen our understanding of the molecular mechanisms of SARS-CoV-2 infection, and could be used for SARS-CoV-2 infection as diagnostic and therapeutic biomarkers.
|
32
|
Screening and identification of potential prognostic biomarkers in bladder urothelial carcinoma: Evidence from bioinformatics analysis. GENE REPORTS 2020. [DOI: 10.1016/j.genrep.2020.100658] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/24/2022]
|
33
|
Balashanmugam MV, Shivanandappa TB, Nagarethinam S, Vastrad B, Vastrad C. Analysis of Differentially Expressed Genes in Coronary Artery Disease by Integrated Microarray Analysis. Biomolecules 2019; 10:biom10010035. [PMID: 31881747 PMCID: PMC7022900 DOI: 10.3390/biom10010035] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/16/2019] [Revised: 12/13/2019] [Accepted: 12/20/2019] [Indexed: 12/31/2022] Open
Abstract
Coronary artery disease (CAD) is a major cause of end-stage cardiac disease. Although profound efforts have been made to illuminate the pathogenesis, the molecular mechanisms of CAD remain to be analyzed. To identify the candidate genes in the advancement of CAD, microarray dataset GSE23766 was downloaded from the Gene Expression Omnibus database. The differentially expressed genes (DEGs) were identified, and pathway and gene ontology (GO) enrichment analyses were performed. The protein-protein interaction network was constructed and the module analysis was performed using the Biological General Repository for Interaction Datasets (BioGRID) and Cytoscape. Additionally, a target genes-miRNA regulatory network and a target genes-TF regulatory network were constructed and analyzed. There were 894 DEGs between male human CAD samples and female human CAD samples, including 456 up-regulated genes and 438 down-regulated genes. Pathway enrichment analyses revealed that DEGs (up- and down-regulated) were mostly enriched in the superpathway of steroid hormone biosynthesis, ABC transporters, oxidative ethanol degradation III and Complement and coagulation cascades. Similarly, gene ontology enrichment analyses revealed that DEGs (up- and down-regulated) were mostly enriched in forebrain neuron differentiation, filopodium membrane, platelet degranulation and blood microparticle. In the PPI network and modules (up- and down-regulated), MYC, NPM1, TRPC7, UBC, FN1, HEMK1, IFT74 and VHL were hub genes. In the target genes-miRNA regulatory network and target genes-TF regulatory network (up- and down-regulated), TAOK1, KHSRP, HSD17B11 and PAH were target genes. In conclusion, the pathways and GO terms enriched by DEGs may reveal the molecular mechanism of CAD. Its hub and target genes, MYC, NPM1, TRPC7, UBC, FN1, HEMK1, IFT74, VHL, TAOK1, KHSRP, HSD17B11 and PAH, are expected to be new targets for CAD. Our findings provide clues for exploring the molecular mechanism and developing new prognostic, diagnostic and therapeutic strategies for CAD.
Affiliation(s)
- Meenashi Vanathi Balashanmugam
  - Department of Biomedical Sciences, College of Pharmacy, Shaqra University, Al Dawadmi 11911, Saudi Arabia
- Thippeswamy Boreddy Shivanandappa
  - Department of Biomedical Sciences, College of Pharmacy, Shaqra University, Al Dawadmi 11911, Saudi Arabia
- Sivagurunathan Nagarethinam
  - Department of Biomedical Sciences, College of Pharmacy, Shaqra University, Al Dawadmi 11911, Saudi Arabia
- Basavaraj Vastrad
  - Department of Pharmaceutics, SET'S College of Pharmacy, Dharwad, Karnataka 580002, India
- Chanabasayya Vastrad
  - Biostatistics and Bioinformatics, Chanabasava Nilaya, Bharthinagar, Dharwad 580001, Karnataka
  - Correspondence: Tel.: +91-9480-073398
|
34
|
Identification of important invasion and proliferation related genes in adrenocortical carcinoma. Med Oncol 2019; 36:73. [DOI: 10.1007/s12032-019-1296-7] [Citation(s) in RCA: 16] [Impact Index Per Article: 3.2] [Received: 05/07/2019] [Accepted: 07/01/2019] [Indexed: 12/17/2022]
|
35
|
Alshabi AM, Shaikh IA, Vastrad C. Exploring the Molecular Mechanism of the Drug-Treated Breast Cancer Based on Gene Expression Microarray. Biomolecules 2019; 9:biom9070282. [PMID: 31311202 PMCID: PMC6681318 DOI: 10.3390/biom9070282] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Received: 04/22/2019] [Revised: 06/24/2019] [Accepted: 07/09/2019] [Indexed: 02/07/2023] Open
Abstract
Breast cancer (BRCA) remains the leading cause of cancer morbidity and mortality worldwide. In the present study, we identified novel biomarkers expressed during estradiol and tamoxifen treatment of BRCA. The microarray dataset E-MTAB-4975 was downloaded from the ArrayExpress database, and the differentially expressed genes (DEGs) between estradiol-treated BRCA samples and tamoxifen-treated BRCA samples were identified with the limma package. Pathway and gene ontology (GO) enrichment analysis, construction of a protein-protein interaction (PPI) network, module analysis, and construction of a target genes-miRNA interaction network and a target genes-transcription factor (TF) interaction network were performed using bioinformatics tools. The expression, prognostic values, and mutation of hub genes were validated using the SurvExpress database, cBioPortal, and the Human Protein Atlas (HPA) database. A total of 856 genes (421 up-regulated genes and 435 down-regulated genes) were identified in T47D (overexpressing Split Ends (SPEN) + estradiol) samples compared to T47D (overexpressing Split Ends (SPEN) + tamoxifen) samples. Pathway and GO enrichment analysis revealed that the DEGs were mainly enriched in response to lysine degradation II (pipecolate pathway), cholesterol biosynthesis pathway, cell cycle pathway, and response to cytokine pathway. DEGs (MCM2, TCF4, OLR1, HSPA5, MAP1LC3B, SQSTM1, NEU1, HIST1H1B, RAD51, RFC3, MCM10, ISG15, TNFRSF10B, GBP2, IGFBP5, SOD2, DHF and MT1H), which were significantly up- and down-regulated in estradiol and tamoxifen-treated BRCA samples, were selected as hub genes according to the results of the PPI network, module analysis, target genes-miRNA interaction network, and target genes-TF interaction network analysis. The SurvExpress database, cBioPortal, and the HPA database further confirmed that patients with higher expression levels of these hub genes experienced a shorter overall survival. A comprehensive bioinformatics analysis was performed, and potential therapeutic applications of estradiol and tamoxifen were predicted in BRCA samples. The data may help unravel the molecular mechanisms of BRCA.
Affiliation(s)
- Ali Mohamed Alshabi
  - Department of Clinical Pharmacy, College of Pharmacy, Najran University, Najran, 66237, Saudi Arabia
- Ibrahim Ahmed Shaikh
  - Department of Pharmacology, College of Pharmacy, Najran University, Najran, 66237, Saudi Arabia
- Chanabasayya Vastrad
  - Biostatistics and Bioinformatics, Chanabasava Nilaya, Bharthinagar, Dharwad 580001, Karnataka, India
|
36
|
Kim S. Pathway Interactions Based on Drug-Induced Datasets. Cancer Inform 2019; 18:1176935119851518. [PMID: 31205412 PMCID: PMC6535899 DOI: 10.1177/1176935119851518] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 04/19/2019] [Accepted: 04/23/2019] [Indexed: 11/16/2022] Open
Abstract
In this study, we identified enrichment pathway connections from MCF7 breast cancer epithelial cells that were treated with 87 drugs. We extracted drug-treated samples, where the sample size was greater than or equal to 5. The drugs included 17-allylamino-geldanamycin, LY294002, trichostatin A, valproic acid, sirolimus, and wortmannin, which had sample sizes of 11, 8, 7, 7, 7, and 5, respectively. We found meaningful pathways using gene set enrichment analysis and identified intradrug and interdrug pathway interactions, which implied the influence of drug combination. Among the top 20 enrichment pathways that were wortmannin induced, there were a total of 37 intradrug pathway interactions via common genes. Thirty-seven pathway interactions were induced by valproic acid, 11 induced by trichostatin A, 20 induced by LY294002, and 59 induced by sirolimus, all via common genes. The number of interdrug-induced pathway interactions ranged from one pair of pathways to 23. The pair of ERBB_SIGNALING and INSULIN_SIGNALING pathways showed the highest score from a pair of 2 individual drugs. The highest number of pathway interactions was observed between the drugs 17-allylamino-geldanamycin and LY294002.
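The idea of a pathway interaction "via common genes" can be illustrated with a minimal sketch, assuming that two pathways are counted as interacting whenever their gene sets overlap. The function name `pathway_interactions` and the toy gene sets are hypothetical and are not taken from the study; the abstract does not state how interactions were scored.

```python
from itertools import combinations


def pathway_interactions(pathways):
    """Return pathway pairs that share at least one gene.

    `pathways` maps a pathway name to a set of gene symbols.
    The result maps each (name_a, name_b) pair to the shared genes.
    """
    interactions = {}
    # Sort names so every pair is reported in a stable order.
    for name_a, name_b in combinations(sorted(pathways), 2):
        shared = pathways[name_a] & pathways[name_b]
        if shared:
            interactions[(name_a, name_b)] = shared
    return interactions


# Hypothetical toy gene sets, for illustration only (not data from the study).
toy_pathways = {
    "ERBB_SIGNALING": {"AKT1", "PIK3CA", "EGFR"},
    "INSULIN_SIGNALING": {"AKT1", "PIK3CA", "INSR"},
    "CELL_CYCLE": {"CDK1", "CCNB1"},
}

result = pathway_interactions(toy_pathways)
for pair, genes in result.items():
    print(pair, sorted(genes))
```

Counting the entries of `result` for the top enriched pathways of a single drug would give a tally in the spirit of the intradrug counts reported above; comparing pathways enriched under two different drugs would give the interdrug analogue.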
Affiliation(s)
- Shinuk Kim
  - Department of Civil Engineering, Sangmyung University, Cheonan, Republic of Korea
|
37
|
Wang LL, Thomas Hayman G, Smith JR, Tutaj M, Shimoyama ME, Gennari JH. Predicting instances of pathway ontology classes for pathway integration. J Biomed Semantics 2019; 10:11. [PMID: 31196182 PMCID: PMC6567466 DOI: 10.1186/s13326-019-0202-8] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 10/02/2018] [Accepted: 05/22/2019] [Indexed: 12/16/2022] Open
Abstract
BACKGROUND: To improve the outcomes of biological pathway analysis, a better way of integrating pathway data is needed. Ontologies can be used to organize data from disparate sources, and we leverage the Pathway Ontology as a unifying ontology for organizing pathway data. We aim to associate pathway instances from different databases to the appropriate class in the Pathway Ontology. RESULTS: Using a supervised machine learning approach, we trained neural networks to predict mappings between Reactome pathways and Pathway Ontology (PW) classes. For 2222 Reactome classes, the neural network (NN) model generated 10,952 class recommendations. We compared against a baseline bag-of-words (BOW) model for predicting correct PW classes. A 5% subset of Reactome pathways (111 pathways) was randomly selected, and the corresponding class recommendations from both models were evaluated by two curators. The precision of the BOW model was higher (0.49 for BOW and 0.39 for NN), but the recall was lower (0.42 for BOW and 0.78 for NN). Around 78% of Reactome pathways received pertinent recommendations from the NN model. CONCLUSIONS: The neural predictive model produced meaningful class recommendations that assisted PW curators in selecting appropriate class mappings for Reactome pathways. Our methods can be used to reduce the manual effort associated with ontology curation, and more broadly, for augmenting the curators' ability to organize and integrate data from pathway databases using the Pathway Ontology.
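The precision and recall figures quoted above follow the standard definitions, which can be sketched as below. This is an illustrative calculation only: the function `precision_recall` and the class identifiers are hypothetical, and the exact evaluation protocol used by the curators is an assumption not described in the abstract.

```python
def precision_recall(recommended, relevant):
    """Compute precision and recall for one set of recommendations.

    precision = correct recommendations / all recommendations
    recall    = correct recommendations / all relevant classes
    """
    recommended, relevant = set(recommended), set(relevant)
    true_positives = len(recommended & relevant)
    precision = true_positives / len(recommended) if recommended else 0.0
    recall = true_positives / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical identifiers, for illustration only.
recommended = ["class_a", "class_b", "class_c", "class_d"]  # model output
relevant = ["class_a", "class_b", "class_e"]                # curator-approved

precision, recall = precision_recall(recommended, relevant)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

A model that recommends more classes per pathway tends to raise recall at the cost of precision, which is consistent with the NN model's lower precision and higher recall relative to the BOW baseline.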
Affiliation(s)
- Lucy Lu Wang
  - Department of Biomedical Informatics and Medical Education, University of Washington, 850 Republican St, Seattle, 98109, WA, USA
- G Thomas Hayman
  - Department of Biomedical Engineering, Medical College of Wisconsin, 8701 W Watertown Plank Rd, Milwaukee, 53226, WI, USA
- Jennifer R Smith
  - Department of Biomedical Engineering, Medical College of Wisconsin, 8701 W Watertown Plank Rd, Milwaukee, 53226, WI, USA
- Monika Tutaj
  - Department of Biomedical Engineering, Medical College of Wisconsin, 8701 W Watertown Plank Rd, Milwaukee, 53226, WI, USA
- Mary E Shimoyama
  - Department of Biomedical Engineering, Medical College of Wisconsin, 8701 W Watertown Plank Rd, Milwaukee, 53226, WI, USA
- John H Gennari
  - Department of Biomedical Informatics and Medical Education, University of Washington, 850 Republican St, Seattle, 98109, WA, USA
|
38
|
Alshabi AM, Vastrad B, Shaikh IA, Vastrad C. Identification of Crucial Candidate Genes and Pathways in Glioblastoma Multiform by Bioinformatics Analysis. Biomolecules 2019; 9:biom9050201. [PMID: 31137733 PMCID: PMC6571969 DOI: 10.3390/biom9050201] [Citation(s) in RCA: 29] [Impact Index Per Article: 5.8] [Received: 04/09/2019] [Revised: 05/17/2019] [Accepted: 05/23/2019] [Indexed: 02/07/2023] Open
Abstract
The present study aimed to investigate the molecular mechanisms underlying glioblastoma multiform (GBM) and its biomarkers. The differentially expressed genes (DEGs) were diagnosed using the limma software package. The ToppGene (ToppFun) was used to perform pathway and Gene Ontology (GO) enrichment analysis of the DEGs. Protein-protein interaction (PPI) networks, extracted modules, miRNA-target genes regulatory network and TF-target genes regulatory network were used to obtain insight into the actions of DEGs. Survival analysis for DEGs was carried out. A total of 590 DEGs, including 243 up regulated and 347 down regulated genes, were diagnosed between scrambled shRNA expression and Lin7A knock down. The up-regulated genes were enriched in ribosome, mitochondrial translation termination, translation, and peptide biosynthetic process. The down-regulated genes were enriched in focal adhesion, VEGFR3 signaling in lymphatic endothelium, extracellular matrix organization, and extracellular matrix. The current study screened the genes in the PPI network, extracted modules, miRNA-target genes regulatory network, and TF-target genes regulatory network with higher degrees as hub genes, which included NPM1, CUL4A, YIPF1, SHC1, AKT1, VLDLR, RPL14, P3H2, DTNA, FAM126B, RPL34, and MYL5. Survival analysis indicated that the high expression of RPL36A and MRPL35 were predicting longer survival of GBM, while high expression of AP1S1 and AKAP12 were predicting shorter survival of GBM. High expression of RPL36A and AP1S1 were associated with pathogenesis of GBM, while low expression of ALPL was associated with pathogenesis of GBM. In conclusion, the current study diagnosed DEGs between scrambled shRNA expression and Lin7A knock down samples, which could improve our understanding of the molecular mechanisms in the progression of GBM, and these crucial as well as new diagnostic markers might be used as therapeutic targets for GBM.
Affiliation(s)
- Ali Mohamed Alshabi
  - Department of Clinical Pharmacy, College of Pharmacy, Najran University, Najran 61441, Saudi Arabia
- Basavaraj Vastrad
  - Department of Pharmaceutics, SET'S College of Pharmacy, Dharwad, Karnataka 580002, India
- Ibrahim Ahmed Shaikh
  - Department of Pharmacology, College of Pharmacy, Najran University, Najran 61441, Saudi Arabia
- Chanabasayya Vastrad
  - Biostatistics and Bioinformatics, Chanabasava Nilaya, Bharthinagar, Dharwad 580001, Karnataka, India
|
39
|
Joshi H, Vastrad B, Vastrad C. Identification of Important Invasion-Related Genes in Non-functional Pituitary Adenomas. J Mol Neurosci 2019; 68:565-589. [PMID: 30982163 DOI: 10.1007/s12031-019-01318-8] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 11/15/2018] [Accepted: 03/29/2019] [Indexed: 12/18/2022]
Abstract
Non-functioning pituitary adenomas (NFPAs) are locally invasive with high morbidity. The objective of this study was to diagnose important genes and pathways related to the invasiveness of NFPAs and gain more insights into the underlying molecular mechanisms of NFPAs. The gene expression profiles of GSE51618 were downloaded from the Gene Expression Omnibus database with 4 non-invasive NFPA samples, 3 invasive NFPA samples, and 3 normal pituitary gland samples. Differentially expressed genes (DEGs) were screened between invasive NFPA samples and normal pituitary gland samples, followed by pathway and Gene Ontology (GO) enrichment analyses. Subsequently, a protein-protein interaction (PPI) network was constructed and analyzed for these DEGs, and module analysis was performed. In addition, a target gene-miRNA network and target gene-TF (transcription factor) network were analyzed for these DEGs. A total of 879 DEGs were obtained. Among them, 439 genes were upregulated and 440 genes were downregulated. Pathway enrichment analysis indicated that the upregulated genes were significantly enriched in cysteine biosynthesis/homocysteine degradation (trans-sulfuration) and PI3K-Akt signaling pathway, while the downregulated genes were mainly associated with docosahexaenoate biosynthesis III (mammals) and chemokine signaling pathway. GO enrichment analysis indicated that the upregulated genes were significantly enriched in animal organ morphogenesis, extracellular matrix, and hormone activity, while the downregulated genes were mainly associated with leukocyte chemotaxis, dendrites, and RAGE receptor binding. Subsequently, ESR1, SOX2, TTN, GFAP, WIF1, TTR, XIST, SPAG5, PPBP, AR, IL1R2, and HIST1H1C were diagnosed as the top hub genes in the upregulated and downregulated PPI networks and modules. In addition, HS3ST1, GPC4, CCND2, and SCD were diagnosed as the top hub genes in the upregulated and downregulated target gene-miRNA networks, while CISH, ISLR, UBE2E3, and CCNG2 were diagnosed as the top hub genes in the upregulated and downregulated target gene-TF networks. The new important DEGs and pathways diagnosed in this study may serve key roles in the invasiveness of NFPAs and indicate more molecular targets for the treatment of NFPAs.
Affiliation(s)
- Harish Joshi
  - Endocrine and Diabetes Care Center, Hubli, Karnataka, 5800029, India
- Basavaraj Vastrad
  - Department of Pharmaceutics, SET'S College of Pharmacy, Dharwad, Karnataka, 580002, India
- Chanabasayya Vastrad
  - Biostatistics and Bioinformatics, Chanabasava Nilaya, Bharthinagar, Dharwad, Karnataka, 580001, India
|
40
|
Alur VC, Raju V, Vastrad B, Vastrad C. Mining Featured Biomarkers Linked with Epithelial Ovarian Cancer Based on Bioinformatics. Diagnostics (Basel) 2019; 9:diagnostics9020039. [PMID: 30970615 PMCID: PMC6628368 DOI: 10.3390/diagnostics9020039] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/24/2019] [Revised: 03/31/2019] [Accepted: 04/05/2019] [Indexed: 11/16/2022] Open
Abstract
Epithelial ovarian cancer (EOC) is the 18th most common cancer worldwide and the 8th most common in women. The aim of this study was to diagnose the potential importance of, as well as novel genes linked with, EOC and to provide valid biological information for further research. The gene expression profiles of E-MTAB-3706, which contained four high-grade ovarian epithelial cancer samples, four normal fallopian tube samples and four normal ovarian epithelium samples, were downloaded from the ArrayExpress database. Pathway enrichment and Gene Ontology (GO) enrichment analysis of differentially expressed genes (DEGs) were performed, and the protein-protein interaction (PPI) network, microRNA-target gene regulatory network and TF (transcription factor)-target gene regulatory network for up- and down-regulated genes were analyzed using Cytoscape. In total, 552 DEGs were found, including 276 up-regulated and 276 down-regulated DEGs. Pathway enrichment analysis demonstrated that most DEGs were significantly enriched in chemical carcinogenesis, urea cycle, cell adhesion molecules and creatine biosynthesis. GO enrichment analysis showed that most DEGs were significantly enriched in translation, nucleosome, extracellular matrix organization and extracellular matrix. In the PPI network analysis, modules, microRNA-target gene regulatory network and TF-target gene regulatory network for up- and down-regulated genes, the top hub genes such as E2F4, SRPK2, A2M, CDH1, MAP1LC3A, UCHL1, HLA-C (major histocompatibility complex, class I, C), VAT1, ECM1 and SNRPN (small nuclear ribonucleoprotein polypeptide N) were associated with the pathogenesis of EOC. The high expression levels of hub genes such as CEBPD (CCAAT enhancer binding protein delta) and MID2 in stages 3 and 4 were validated in the TCGA (The Cancer Genome Atlas) database. CEBPD and MID2 were associated with the worst overall survival rates in EOC. In conclusion, the current study diagnosed DEGs between normal and EOC samples, which could improve our understanding of the molecular mechanisms in the progression of EOC. These new key biomarkers might be used as therapeutic targets for EOC.
Affiliation(s)
- Varun Chandra Alur
  - Department of Endocrinology, J.J. M Medical College, Davanagere, Karnataka 577004, India
- Varshita Raju
  - Department of Obstetrics and Gynecology, J.J. M Medical College, Davanagere, Karnataka 577004, India
- Basavaraj Vastrad
  - Department of Pharmaceutics, SET'S College of Pharmacy, Dharwad, Karnataka 580002, India
- Chanabasayya Vastrad
  - Biostatistics and Bioinformatics, Chanabasava Nilaya, Bharthinagar, Dharwad, Karnataka 580001, India
|
41
|
Drug repositioning for dengue haemorrhagic fever by integrating multiple omics analyses. Sci Rep 2019; 9:523. [PMID: 30679503 PMCID: PMC6346040 DOI: 10.1038/s41598-018-36636-1] [Citation(s) in RCA: 16] [Impact Index Per Article: 3.2] [Received: 04/17/2018] [Accepted: 11/22/2018] [Indexed: 12/16/2022] Open
Abstract
To detect drug candidates for dengue haemorrhagic fever (DHF), we employed a computational drug repositioning method to perform an integrated multiple omics analysis based on transcriptomic, proteomic, and interactomic data. We identified 3,892 significant genes, 389 proteins, and 221 human proteins by transcriptomic analysis, proteomic analysis, and human–dengue virus protein–protein interactions, respectively. The drug candidates were selected using gene expression profiles for inverse drug–disease relationships compared with DHF patients and healthy controls as well as interactomic relationships between the signature proteins and chemical compounds. Integrating the results of the multiple omics analysis, we identified eight candidates for drug repositioning to treat DHF that targeted five proteins (ACTG1, CALR, ERC1, HSPA5, SYNE2) involved in human–dengue virus protein–protein interactions, and the signature proteins in the proteomic analysis mapped to significant pathways. Interestingly, five of these drug candidates, valproic acid, sirolimus, resveratrol, vorinostat, and Y-27632, have been reported previously as effective treatments for flavivirus-induced diseases. The computational approach using multiple omics data for drug repositioning described in this study can be used effectively to identify novel drug candidates.
|
42
|
Abstract
Resources for rat researchers are extensive, including strain repositories and databases all around the world. The Rat Genome Database (RGD) serves as the primary rat data repository, providing both manual and computationally collected data from other databases.
|
43
|
Domingo-Fernández D, Hoyt CT, Bobis-Álvarez C, Marín-Llaó J, Hofmann-Apitius M. ComPath: an ecosystem for exploring, analyzing, and curating mappings across pathway databases. NPJ Syst Biol Appl 2018; 5:3. [PMID: 30564458 PMCID: PMC6292919 DOI: 10.1038/s41540-018-0078-8] [Citation(s) in RCA: 27] [Impact Index Per Article: 4.5] [Received: 07/05/2018] [Revised: 10/31/2018] [Accepted: 11/02/2018] [Indexed: 11/09/2022] Open
Abstract
Although pathways are widely used for the analysis and representation of biological systems, their lack of clear boundaries, their dispersion across numerous databases, and the lack of interoperability impedes the evaluation of the coverage, agreements, and discrepancies between them. Here, we present ComPath, an ecosystem that supports curation of pathway mappings between databases and fosters the exploration of pathway knowledge through several novel visualizations. We have curated mappings between three of the major pathway databases and present a case study focusing on Parkinson’s disease that illustrates how ComPath can generate new biological insights by identifying pathway modules, clusters, and cross-talks with these mappings. The ComPath source code and resources are available at https://github.com/ComPath and the web application can be accessed at https://compath.scai.fraunhofer.de/.
Affiliation(s)
- Daniel Domingo-Fernández
  - Department of Bioinformatics, Fraunhofer Institute for Algorithms and Scientific Computing, 53754 Sankt Augustin, Germany; Bonn-Aachen International Center for IT, Rheinische Friedrich-Wilhelms-Universität Bonn, 53115 Bonn, Germany
- Charles Tapley Hoyt
  - Department of Bioinformatics, Fraunhofer Institute for Algorithms and Scientific Computing, 53754 Sankt Augustin, Germany; Bonn-Aachen International Center for IT, Rheinische Friedrich-Wilhelms-Universität Bonn, 53115 Bonn, Germany
- Carlos Bobis-Álvarez
  - Faculty of Medicine and Health Sciences, University of Oviedo, 33006 Oviedo, Spain
- Josep Marín-Llaó
  - Department of Bioinformatics, Fraunhofer Institute for Algorithms and Scientific Computing, 53754 Sankt Augustin, Germany; Rovira i Virgili University, 43003 Tarragona, Spain
- Martin Hofmann-Apitius
  - Department of Bioinformatics, Fraunhofer Institute for Algorithms and Scientific Computing, 53754 Sankt Augustin, Germany; Bonn-Aachen International Center for IT, Rheinische Friedrich-Wilhelms-Universität Bonn, 53115 Bonn, Germany
|
44
|
Lau E, Venkatraman V, Thomas CT, Wu JC, Van Eyk JE, Lam MPY. Identifying High-Priority Proteins Across the Human Diseasome Using Semantic Similarity. J Proteome Res 2018; 17:4267-4278. [PMID: 30256117 DOI: 10.1021/acs.jproteome.8b00393] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.7] [Indexed: 12/25/2022]
Abstract
Identifying the genes and proteins associated with a biological process or disease is a central goal of the biomedical research enterprise. However, relatively few systematic approaches are available that provide objective evaluation of the genes or proteins known to be important to a research topic, and hence researchers often rely on subjective evaluation of domain experts and laborious manual literature review. Computational bibliometric analysis, in conjunction with text mining and data curation, attempts to automate this process and return prioritized proteins in any given research topic. We describe here a method to identify and rank protein-topic relationships by calculating the semantic similarity between a protein and a query term in the biomedical literature while adjusting for the impact and immediacy of associated research articles. We term the calculated metric the weighted copublication distance (WCD) and show that it compares well to related approaches in predicting benchmark protein lists in multiple biological processes. We used WCD to extract prioritized "popular proteins" across multiple cell types, subanatomical regions, and standardized vocabularies containing over 20 000 human disease terms. The collection of protein-disease associations across the resulting human "diseasome" supports data analytical workflows to perform reverse protein-to-disease queries and functional annotation of experimental protein lists. We envision that the described improvement to the popular proteins strategy will be useful for annotating protein lists and guiding method development efforts as well as generating new hypotheses on understudied disease proteins using bibliometric information.
Affiliation(s)
- Edward Lau
  - Stanford Cardiovascular Institute, Stanford University, Stanford, California 94305, United States
- Vidya Venkatraman
  - Advanced Clinical Biosystems Research Institute, Department of Medicine and The Heart Institute, Cedars-Sinai Medical Center, Los Angeles, California 90048, United States
- Cody T Thomas
  - Department of Medicine, Division of Cardiology, Consortium for Fibrosis Research and Translation, Anschutz Medical Campus, University of Colorado Denver, Aurora, Colorado 80045, United States
- Joseph C Wu
  - Stanford Cardiovascular Institute, Stanford University, Stanford, California 94305, United States
- Jennifer E Van Eyk
  - Advanced Clinical Biosystems Research Institute, Department of Medicine and The Heart Institute, Cedars-Sinai Medical Center, Los Angeles, California 90048, United States
- Maggie P Y Lam
  - Department of Medicine, Division of Cardiology, Consortium for Fibrosis Research and Translation, Anschutz Medical Campus, University of Colorado Denver, Aurora, Colorado 80045, United States
|
45
|
Vastrad C, Vastrad B. Bioinformatics analysis of gene expression profiles to diagnose crucial and novel genes in glioblastoma multiform. Pathol Res Pract 2018; 214:1395-1461. [PMID: 30097214 DOI: 10.1016/j.prp.2018.07.015] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Received: 05/25/2018] [Revised: 06/27/2018] [Accepted: 07/22/2018] [Indexed: 02/07/2023]
Abstract
Therefore, the current study aimed to diagnose the genes associated with the pathogenesis of GBM. The differentially expressed genes (DEGs) were diagnosed using the limma software package. ToppFun was used to perform pathway and Gene Ontology (GO) enrichment analysis of the DEGs. Protein-protein interaction (PPI) networks, extracted modules, miRNA-target genes regulatory network and miRNA-target genes regulatory network were used to obtain insight into the actions of DEGs. Survival analysis for DEGs was carried out. A total of 701 DEGs, including 413 upregulated and 288 downregulated genes, were diagnosed between U1118MG cell line (PK 11195 treated with 1 h exposure) and U1118MG cell line (PK 11195 treated with 24 h exposure). The up-regulated genes were enriched in superpathway of pyrimidine deoxyribonucleotides de novo biosynthesis, cell cycle, cell cycle process and chromosome. The down-regulated genes were enriched in folate transformations I, biosynthesis of amino acids, cellular amino acid metabolic process and vacuolar membrane. The current study screened the genes in PPI network, extracted modules, miRNA-target genes regulatory network and miRNA-target genes regulatory network with higher degrees as hub genes, which included MYC, TERF2IP, CDK1, EEF1G, TXNIP, SLC1A5, RGS4 and IER5L. Survival analysis suggested that low expression of NR4A2, SLC7A5, CYR61 and ID1 in patients with GBM was linked with a positive prognosis for overall survival. In conclusion, the current study could improve our understanding of the molecular mechanisms in the progression of GBM, and these crucial as well as new molecular markers might be used as therapeutic targets for GBM.
Affiliation(s)
- Chanabasayya Vastrad
  - Biostatistics and Bioinformatics, Chanabasava Nilaya, Bharthinagar, Dharwad, 580001, Karnataka, India
- Basavaraj Vastrad
  - Department of Pharmaceutics, SET'S College of Pharmacy, Dharwad, Karnataka, 580002, India
|
46
|
Chowdhury S, Sinha N, Ganguli P, Bhowmick R, Singh V, Nandi S, Sarkar RR. BIOPYDB: A Dynamic Human Cell Specific Biochemical Pathway Database with Advanced Computational Analyses Platform. J Integr Bioinform 2018; 15:/j/jib.ahead-of-print/jib-2017-0072/jib-2017-0072.xml. [PMID: 29547394 PMCID: PMC6340122 DOI: 10.1515/jib-2017-0072] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 08/14/2017] [Accepted: 01/29/2018] [Indexed: 12/03/2022] Open
Abstract
BIOPYDB: BIOchemical PathwaY DataBase is developed as a manually curated, readily updatable, dynamic resource of human cell specific pathway information along with an integrated computational platform to perform various pathway analyses. Presently, it comprises 46 pathways, 3189 molecules, 5742 reactions and 6897 different types of diseases linked with pathway proteins, which are referenced by 520 literature sources and 17 other pathway databases. With its repertoire of biochemical pathway data and computational tools for performing Topological, Logical and Dynamic analyses, BIOPYDB enables both experimental and computational biologists to acquire a comprehensive understanding of signaling cascades in cells. Automated pathway image reconstruction, cross referencing of pathway molecules and interactions with other databases and literature sources, complex search operations to extract information from other similar resources, and an integrated platform for pathway data sharing and computation are among the novel and useful features included in this database to make it more acceptable and attractive to the users of pathway research communities. A RESTful API service is also made available to advanced users and developers for accessing this database more conveniently through their own computer programmes.
Collapse
Affiliation(s)
- Saikat Chowdhury
- CSIR-National Chemical Laboratory, Chemical Engineering and Process Development Division, Pune, Maharashtra 411008, India; Academy of Scientific and Innovative Research (AcSIR), CSIR-NCL Campus, Pune, Maharashtra 411008, India
| | - Noopur Sinha
- CSIR-National Chemical Laboratory, Chemical Engineering and Process Development Division, Pune, Maharashtra 411008, India; Academy of Scientific and Innovative Research (AcSIR), CSIR-NCL Campus, Pune, Maharashtra 411008, India
| | - Piyali Ganguli
- CSIR-National Chemical Laboratory, Chemical Engineering and Process Development Division, Pune, Maharashtra 411008, India; Academy of Scientific and Innovative Research (AcSIR), CSIR-NCL Campus, Pune, Maharashtra 411008, India
| | - Rupa Bhowmick
- CSIR-National Chemical Laboratory, Chemical Engineering and Process Development Division, Pune, Maharashtra 411008, India; Academy of Scientific and Innovative Research (AcSIR), CSIR-NCL Campus, Pune, Maharashtra 411008, India
| | - Vidhi Singh
- CSIR-National Chemical Laboratory, Chemical Engineering and Process Development Division, Pune, Maharashtra 411008, India
| | - Sutanu Nandi
- CSIR-National Chemical Laboratory, Chemical Engineering and Process Development Division, Pune, Maharashtra 411008, India; Academy of Scientific and Innovative Research (AcSIR), CSIR-NCL Campus, Pune, Maharashtra 411008, India
| | - Ram Rup Sarkar
- CSIR-National Chemical Laboratory, Chemical Engineering and Process Development Division, Pune, Maharashtra 411008, India; Academy of Scientific and Innovative Research (AcSIR), CSIR-NCL Campus, Pune, Maharashtra 411008, India
| |
|
47
|
Shimoyama M, Smith JR, Bryda E, Kuramoto T, Saba L, Dwinell M. Rat Genome and Model Resources. ILAR J 2017; 58:42-58. [PMID: 28838068 PMCID: PMC6057551 DOI: 10.1093/ilar/ilw041] [Citation(s) in RCA: 29] [Impact Index Per Article: 4.1] [Received: 07/28/2016] [Indexed: 11/25/2022] Open
Abstract
Rats remain a major model for studying disease mechanisms and discovery, validation, and testing of new compounds to improve human health. The rat’s value continues to grow as indicated by the more than 1.4 million publications (second to human) at PubMed documenting important discoveries using this model. Advanced sequencing technologies, genome modification techniques, and the development of embryonic stem cell protocols ensure the rat remains an important mammalian model for disease studies. The 2004 release of the reference genome has been followed by the production of complete genomes for more than two dozen individual strains utilizing NextGen sequencing technologies; their analyses have identified over 80 million variants. This explosion in genomic data has been accompanied by the ability to selectively edit the rat genome, leading to hundreds of new strains through multiple technologies. A number of resources have been developed to provide investigators with access to precision rat models, comprehensive datasets, and sophisticated software tools necessary for their research. Those profiled here include the Rat Genome Database, PhenoGen, Gene Editing Rat Resource Center, Rat Resource and Research Center, and the National BioResource Project for the Rat in Japan.
Affiliation(s)
- Mary Shimoyama
- Department of Biomedical Engineering, Marquette University and the Medical College of Wisconsin, Milwaukee, Wisconsin. Rat Genome Database, Department of Biomedical Engineering at Marquette University and the Medical College of Wisconsin, Milwaukee, Wisconsin. Department of Veterinary Pathobiology, College of Veterinary Medicine, University of Missouri, Columbia, Missouri. Institute of Laboratory Animals, Graduate School of Medicine, Kyoto University, Kyoto, Japan. Department of Pharmaceutical Sciences, Skaggs School of Pharmacy and Pharmaceutical Sciences, University of Colorado Anschutz Medical Campus, Aurora, Colorado. Department of Physiology, Medical College of Wisconsin, Milwaukee, Wisconsin
| | - Jennifer R Smith
- Department of Biomedical Engineering, Marquette University and the Medical College of Wisconsin, Milwaukee, Wisconsin. Rat Genome Database, Department of Biomedical Engineering at Marquette University and the Medical College of Wisconsin, Milwaukee, Wisconsin. Department of Veterinary Pathobiology, College of Veterinary Medicine, University of Missouri, Columbia, Missouri. Institute of Laboratory Animals, Graduate School of Medicine, Kyoto University, Kyoto, Japan. Department of Pharmaceutical Sciences, Skaggs School of Pharmacy and Pharmaceutical Sciences, University of Colorado Anschutz Medical Campus, Aurora, Colorado. Department of Physiology, Medical College of Wisconsin, Milwaukee, Wisconsin
| | - Elizabeth Bryda
- Department of Biomedical Engineering, Marquette University and the Medical College of Wisconsin, Milwaukee, Wisconsin. Rat Genome Database, Department of Biomedical Engineering at Marquette University and the Medical College of Wisconsin, Milwaukee, Wisconsin. Department of Veterinary Pathobiology, College of Veterinary Medicine, University of Missouri, Columbia, Missouri. Institute of Laboratory Animals, Graduate School of Medicine, Kyoto University, Kyoto, Japan. Department of Pharmaceutical Sciences, Skaggs School of Pharmacy and Pharmaceutical Sciences, University of Colorado Anschutz Medical Campus, Aurora, Colorado. Department of Physiology, Medical College of Wisconsin, Milwaukee, Wisconsin
| | - Takashi Kuramoto
- Department of Biomedical Engineering, Marquette University and the Medical College of Wisconsin, Milwaukee, Wisconsin. Rat Genome Database, Department of Biomedical Engineering at Marquette University and the Medical College of Wisconsin, Milwaukee, Wisconsin. Department of Veterinary Pathobiology, College of Veterinary Medicine, University of Missouri, Columbia, Missouri. Institute of Laboratory Animals, Graduate School of Medicine, Kyoto University, Kyoto, Japan. Department of Pharmaceutical Sciences, Skaggs School of Pharmacy and Pharmaceutical Sciences, University of Colorado Anschutz Medical Campus, Aurora, Colorado. Department of Physiology, Medical College of Wisconsin, Milwaukee, Wisconsin
| | - Laura Saba
- Department of Biomedical Engineering, Marquette University and the Medical College of Wisconsin, Milwaukee, Wisconsin. Rat Genome Database, Department of Biomedical Engineering at Marquette University and the Medical College of Wisconsin, Milwaukee, Wisconsin. Department of Veterinary Pathobiology, College of Veterinary Medicine, University of Missouri, Columbia, Missouri. Institute of Laboratory Animals, Graduate School of Medicine, Kyoto University, Kyoto, Japan. Department of Pharmaceutical Sciences, Skaggs School of Pharmacy and Pharmaceutical Sciences, University of Colorado Anschutz Medical Campus, Aurora, Colorado. Department of Physiology, Medical College of Wisconsin, Milwaukee, Wisconsin
| | - Melinda Dwinell
- Department of Biomedical Engineering, Marquette University and the Medical College of Wisconsin, Milwaukee, Wisconsin. Rat Genome Database, Department of Biomedical Engineering at Marquette University and the Medical College of Wisconsin, Milwaukee, Wisconsin. Department of Veterinary Pathobiology, College of Veterinary Medicine, University of Missouri, Columbia, Missouri. Institute of Laboratory Animals, Graduate School of Medicine, Kyoto University, Kyoto, Japan. Department of Pharmaceutical Sciences, Skaggs School of Pharmacy and Pharmaceutical Sciences, University of Colorado Anschutz Medical Campus, Aurora, Colorado. Department of Physiology, Medical College of Wisconsin, Milwaukee, Wisconsin
| |
|
48
|
Shimoyama M, Laulederkind SJF, De Pons J, Nigam R, Smith JR, Tutaj M, Petri V, Hayman GT, Wang SJ, Ghiasvand O, Thota J, Dwinell MR. Exploring human disease using the Rat Genome Database. Dis Model Mech 2017; 9:1089-1095. [PMID: 27736745 PMCID: PMC5087824 DOI: 10.1242/dmm.026021] [Citation(s) in RCA: 25] [Impact Index Per Article: 3.6] [Indexed: 01/08/2023] Open
Abstract
Rattus norvegicus, the laboratory rat, has been a crucial model for studies of the environmental and genetic factors associated with human diseases for over 150 years. It is the primary model organism for toxicology and pharmacology studies, and has features that make it the model of choice in many complex-disease studies. Since 1999, the Rat Genome Database (RGD; http://rgd.mcw.edu) has been the premier resource for genomic, genetic, phenotype and strain data for the laboratory rat. The primary role of RGD is to curate rat data and validate orthologous relationships with human and mouse genes, and make these data available for incorporation into other major databases such as NCBI, Ensembl and UniProt. RGD also provides official nomenclature for rat genes, quantitative trait loci, strains and genetic markers, as well as unique identifiers. The RGD team adds enormous value to these basic data elements through functional and disease annotations, the analysis and visual presentation of pathways, and the integration of phenotype measurement data for strains used as disease models. Because much of the rat research community focuses on understanding human diseases, RGD provides a number of datasets and software tools that allow users to easily explore and make disease-related connections among these datasets. RGD also provides comprehensive human and mouse data for comparative purposes, illustrating the value of the rat in translational research. This article introduces RGD and its suite of tools and datasets to researchers - within and beyond the rat community - who are particularly interested in leveraging rat-based insights to understand human diseases.
Affiliation(s)
- Mary Shimoyama
- Medical College of Wisconsin, Department of Surgery, Milwaukee, WI 53226, USA
| | | | - Jeff De Pons
- Medical College of Wisconsin, Department of Surgery, Milwaukee, WI 53226, USA
| | - Rajni Nigam
- Medical College of Wisconsin, Department of Surgery, Milwaukee, WI 53226, USA
| | - Jennifer R Smith
- Medical College of Wisconsin, Department of Surgery, Milwaukee, WI 53226, USA
| | - Marek Tutaj
- Medical College of Wisconsin, Department of Surgery, Milwaukee, WI 53226, USA
| | - Victoria Petri
- Medical College of Wisconsin, Department of Surgery, Milwaukee, WI 53226, USA
| | - G Thomas Hayman
- Medical College of Wisconsin, Department of Surgery, Milwaukee, WI 53226, USA
| | - Shur-Jen Wang
- Medical College of Wisconsin, Department of Surgery, Milwaukee, WI 53226, USA
| | - Omid Ghiasvand
- Medical College of Wisconsin, Department of Surgery, Milwaukee, WI 53226, USA
| | - Jyothi Thota
- Medical College of Wisconsin, Department of Surgery, Milwaukee, WI 53226, USA
| | - Melinda R Dwinell
- Medical College of Wisconsin, Department of Physiology, Milwaukee, WI 53226, USA
| |
|
49
|
Henry VJ, Goelzer A, Ferré A, Fischer S, Dinh M, Loux V, Froidevaux C, Fromion V. The bacterial interlocked process ONtology (BiPON): a systemic multi-scale unified representation of biological processes in prokaryotes. J Biomed Semantics 2017; 8:53. [PMID: 29169408 PMCID: PMC5701433 DOI: 10.1186/s13326-017-0165-6] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Received: 11/25/2016] [Accepted: 11/10/2017] [Indexed: 01/10/2023] Open
Abstract
BACKGROUND: High-throughput technologies produce huge amounts of heterogeneous biological data at all cellular levels. Structuring these data together with biological knowledge is a critical issue in biology and requires integrative tools and methods such as bio-ontologies to extract and share valuable information. In parallel, the recent development of whole-cell models using a systemic cell description has opened alternatives for data integration. Integrating a systemic cell description within a bio-ontology would help whole-cell data integration and modeling progress synergistically. RESULTS: We present BiPON, an ontology integrating a multi-scale systemic representation of bacterial cellular processes. BiPON consists of two sub-ontologies, bioBiPON and modelBiPON. bioBiPON organizes the systemic description of biological information, while modelBiPON describes the mathematical models (including parameters) associated with biological processes. bioBiPON and modelBiPON are related using bridge rules on classes during automatic reasoning. Biological processes are thus automatically related to mathematical models. 37% of BiPON classes stem from different well-established bio-ontologies, while the others have been manually defined and curated. Currently, BiPON integrates the main processes involved in bacterial gene expression. CONCLUSIONS: BiPON is a proof of concept of how to formally combine systems biology and bio-ontology. The knowledge formalization is highly flexible and generic. Most of the known cellular processes, new participants or new mathematical models could be inserted in BiPON. Altogether, BiPON opens up promising perspectives for knowledge integration and sharing and can be used by biologists, systems and computational biologists, and the emerging community of whole-cell modeling.
Affiliation(s)
- Vincent J. Henry
- Laboratoire de Recherche en Informatique (LRI), UMR 8623, CNRS, Université Paris-Sud/Université Paris-Saclay, Orsay, France
- INRA, UR1404, MaIAGE, Université Paris-Saclay, Jouy-en-Josas, France
| | - Anne Goelzer
- INRA, UR1404, MaIAGE, Université Paris-Saclay, Jouy-en-Josas, France
| | - Arnaud Ferré
- Laboratoire de Recherche en Informatique (LRI), UMR 8623, CNRS, Université Paris-Sud/Université Paris-Saclay, Orsay, France
| | - Stephan Fischer
- INRA, UR1404, MaIAGE, Université Paris-Saclay, Jouy-en-Josas, France
| | - Marc Dinh
- INRA, UR1404, MaIAGE, Université Paris-Saclay, Jouy-en-Josas, France
| | - Valentin Loux
- INRA, UR1404, MaIAGE, Université Paris-Saclay, Jouy-en-Josas, France
| | - Christine Froidevaux
- Laboratoire de Recherche en Informatique (LRI), UMR 8623, CNRS, Université Paris-Sud/Université Paris-Saclay, Orsay, France
| | - Vincent Fromion
- INRA, UR1404, MaIAGE, Université Paris-Saclay, Jouy-en-Josas, France
| |
|
50
|
Pengelly RJ, Alom T, Zhang Z, Hunt D, Ennis S, Collins A. Evaluating phenotype-driven approaches for genetic diagnoses from exomes in a clinical setting. Sci Rep 2017; 7:13509. [PMID: 29044180 PMCID: PMC5647373 DOI: 10.1038/s41598-017-13841-y] [Citation(s) in RCA: 22] [Impact Index Per Article: 3.1] [Received: 05/05/2017] [Accepted: 10/02/2017] [Indexed: 12/27/2022] Open
Abstract
Next generation sequencing is transforming clinical medicine and genome research, providing a powerful route to establishing molecular diagnoses for genetic conditions; however, challenges remain given the volume and complexity of genetic variation. A number of methods integrate patient phenotype and genotypic data to prioritise variants as potentially causal. Some methods have a clinical focus while others are more research-oriented. With clinical applications in mind, we compare results from alternative methods using 21 exomes for which the disease causal variant has been previously established through traditional clinical evaluation. In this case series we find that the PhenIX program is the most effective, ranking the true causal variant between 1 and 10 in 85% of these cases. This is a significantly higher proportion than the combined results from five alternative methods tested (p = 0.003). The next best method is Exomiser (hiPHIVE), in which the causal variant is ranked 1–10 in 25% of cases. The widely different targets of these methods (a more clinical focus, considering known Mendelian genes, in PhenIX, versus gene discovery in Exomiser) are perhaps not fully appreciated but may impact strongly on their utility for molecular diagnosis using clinical exome data.
Affiliation(s)
- Reuben J Pengelly
- Genetic Epidemiology and Genomic Informatics, Faculty of Medicine, University of Southampton, Duthie Building, Mailpoint 808, Tremona Road, Southampton, SO16 6YD, UK.
| | - Thahmina Alom
- Genetic Epidemiology and Genomic Informatics, Faculty of Medicine, University of Southampton, Duthie Building, Mailpoint 808, Tremona Road, Southampton, SO16 6YD, UK
| | - Zijian Zhang
- Genetic Epidemiology and Genomic Informatics, Faculty of Medicine, University of Southampton, Duthie Building, Mailpoint 808, Tremona Road, Southampton, SO16 6YD, UK
| | - David Hunt
- Wessex Clinical Genetics Service, Level G, Mailpoint 105, Princess Anne Hospital, Coxford Road, Southampton, SO16 5YA, UK
| | - Sarah Ennis
- Genetic Epidemiology and Genomic Informatics, Faculty of Medicine, University of Southampton, Duthie Building, Mailpoint 808, Tremona Road, Southampton, SO16 6YD, UK
| | - Andrew Collins
- Genetic Epidemiology and Genomic Informatics, Faculty of Medicine, University of Southampton, Duthie Building, Mailpoint 808, Tremona Road, Southampton, SO16 6YD, UK
| |
|