1. Oishi M, Sayama H, Toshimoto K, Nakayama T, Nagasaka Y. Practical QSP application from the preclinical phase to enhance the probability of clinical success: Insights from case studies in oncology. Drug Metab Pharmacokinet 2024; 56:101020. [PMID: 38797089 DOI: 10.1016/j.dmpk.2024.101020] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/06/2023] [Revised: 02/02/2024] [Accepted: 05/06/2024] [Indexed: 05/29/2024]
Abstract
Quantitative Systems Pharmacology (QSP) has emerged as a promising modeling and simulation (M&S) approach in drug development, with potential to improve clinical success rates. While conventional M&S has significantly contributed to quantitative understanding in late preclinical and clinical phases, it falls short in explaining unexpected phenomena and testing hypotheses in the early research phase. QSP presents a solution to these limitations. To harness the full potential of QSP in early preclinical stages, preclinical modelers who are familiar with conventional M&S need to update their understanding of the differences between conventional M&S and QSP. This review focuses on QSP applications during the preclinical stage, citing case examples and sharing our experiences in oncology. We emphasize the critical role of QSP in increasing the probability of success for clinical proof of concept (PoC) when applied from the early preclinical stage. Enhancing the quality of both hypotheses and QSP models from the early preclinical stage is of critical importance. Once a QSP model achieves credibility, it facilitates predictions of clinical responses and potential biomarkers. We propose that sequential QSP applications from preclinical stages can improve success rates of clinical PoC, and emphasize the importance of refining both hypotheses and QSP models throughout the process.
Affiliation(s)
- Masayo Oishi
  - Systems Pharmacology, Non-Clinical Biomedical Science, Applied Research & Operations, Astellas Pharma Inc., Tsukuba, Ibaraki, 305-8585, Japan
- Hiroyuki Sayama
  - Systems Pharmacology, Non-Clinical Biomedical Science, Applied Research & Operations, Astellas Pharma Inc., Tsukuba, Ibaraki, 305-8585, Japan
- Kota Toshimoto
  - Systems Pharmacology, Non-Clinical Biomedical Science, Applied Research & Operations, Astellas Pharma Inc., Tsukuba, Ibaraki, 305-8585, Japan
- Takeshi Nakayama
  - Systems Pharmacology, Non-Clinical Biomedical Science, Applied Research & Operations, Astellas Pharma Inc., Tsukuba, Ibaraki, 305-8585, Japan
- Yasuhisa Nagasaka
  - Non-Clinical Biomedical Science, Applied Research & Operations, Astellas Pharma Inc., Tsukuba, Ibaraki, 305-8585, Japan
2. Huang HH, Li J, Cho WC. Editorial: Integrative analysis for complex disease biomarker discovery. Front Bioeng Biotechnol 2023; 11:1273084. [PMID: 37671188 PMCID: PMC10476627 DOI: 10.3389/fbioe.2023.1273084] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/05/2023] [Accepted: 08/14/2023] [Indexed: 09/07/2023] Open
Affiliation(s)
- Hai-Hui Huang
  - Provincial Demonstration Software Institute, Shaoguan University, Shaoguan, China
  - Faculty of Information Technology, Macau University of Science and Technology, Macau, China
- Jie Li
  - School of Computer Science and Technology, Harbin Institute of Technology, Harbin, China
- William C. Cho
  - Department of Clinical Oncology, Queen Elizabeth Hospital, Kowloon, Hong Kong SAR, China
3. Charoenkwan P, Schaduangrat N, Lio’ P, Moni MA, Shoombuatong W, Manavalan B. Computational prediction and interpretation of druggable proteins using a stacked ensemble-learning framework. iScience 2022; 25:104883. [PMID: 36046193 PMCID: PMC9421381 DOI: 10.1016/j.isci.2022.104883] [Citation(s) in RCA: 13] [Impact Index Per Article: 6.5] [Received: 04/24/2022] [Revised: 07/08/2022] [Accepted: 08/02/2022] [Indexed: 11/22/2022] Open
Abstract
Discovery of potential drugs requires rapid and precise identification of drug targets. Although traditional experimental methodologies can accurately identify drug targets, they are time-consuming and inappropriate for high-throughput screening. Computational approaches based on machine learning (ML) algorithms can expedite the prediction of druggable proteins; however, the performance of the existing computational methods remains unsatisfactory. This study proposes a computational tool, SPIDER, to enhance the accurate prediction of druggable proteins. SPIDER employs various feature descriptors pertaining to several aspects, including physicochemical properties, compositional information, and composition-transition-distribution information, coupled with well-known ML algorithms to facilitate the construction of the final meta-predictor. The experimental results showed that SPIDER enabled more precise and robust prediction of druggable proteins than the baseline models and current existing methods in terms of the independent test dataset. An online web server was established and made freely available online.
Affiliation(s)
- Phasit Charoenkwan
  - Modern Management and Information Technology, College of Arts, Media and Technology, Chiang Mai University, Chiang Mai 50200, Thailand
- Nalini Schaduangrat
  - Center of Data Mining and Biomedical Informatics, Faculty of Medical Technology, Mahidol University, Bangkok 10700, Thailand
- Pietro Lio’
  - Department of Computer Science and Technology, University of Cambridge, Cambridge CB3 0FD, UK
- Mohammad Ali Moni
  - Artificial Intelligence & Digital Health, School of Health and Rehabilitation Sciences, Faculty of Health and Behavioural Sciences, The University of Queensland, St Lucia, QLD 4072, Australia
- Watshara Shoombuatong
  - Center of Data Mining and Biomedical Informatics, Faculty of Medical Technology, Mahidol University, Bangkok 10700, Thailand
- Balachandran Manavalan
  - Computational Biology and Bioinformatics Laboratory, Department of Integrative Biotechnology, College of Biotechnology and Bioengineering, Sungkyunkwan University, Suwon 16419, Gyeonggi-do, Republic of Korea
4. Fernández-Torras A, Duran-Frigola M, Bertoni M, Locatelli M, Aloy P. Integrating and formatting biomedical data as pre-calculated knowledge graph embeddings in the Bioteque. Nat Commun 2022; 13:5304. [PMID: 36085310 PMCID: PMC9463154 DOI: 10.1038/s41467-022-33026-0] [Citation(s) in RCA: 17] [Impact Index Per Article: 8.5] [Received: 05/06/2022] [Accepted: 08/30/2022] [Indexed: 12/25/2022] Open
Abstract
Biomedical data is accumulating at a fast pace and integrating it into a unified framework is a major challenge, so that multiple views of a given biological event can be considered simultaneously. Here we present the Bioteque, a resource of unprecedented size and scope that contains pre-calculated biomedical descriptors derived from a gigantic knowledge graph, displaying more than 450 thousand biological entities and 30 million relationships between them. The Bioteque integrates, harmonizes, and formats data collected from over 150 data sources, including 12 biological entities (e.g., genes, diseases, drugs) linked by 67 types of associations (e.g., 'drug treats disease', 'gene interacts with gene'). We show how Bioteque descriptors facilitate the assessment of high-throughput protein-protein interactome data, the prediction of drug response and new repurposing opportunities, and demonstrate that they can be used off-the-shelf in downstream machine learning tasks without loss of performance with respect to using original data. The Bioteque thus offers a thoroughly processed, tractable, and highly optimized assembly of the biomedical knowledge available in the public domain.
Affiliation(s)
- Adrià Fernández-Torras
  - Institute for Research in Biomedicine (IRB Barcelona), The Barcelona Institute of Science and Technology, Barcelona, Catalonia, Spain
- Miquel Duran-Frigola
  - Institute for Research in Biomedicine (IRB Barcelona), The Barcelona Institute of Science and Technology, Barcelona, Catalonia, Spain
  - Ersilia Open Source Initiative, Cambridge, UK
- Martino Bertoni
  - Institute for Research in Biomedicine (IRB Barcelona), The Barcelona Institute of Science and Technology, Barcelona, Catalonia, Spain
- Martina Locatelli
  - Institute for Research in Biomedicine (IRB Barcelona), The Barcelona Institute of Science and Technology, Barcelona, Catalonia, Spain
- Patrick Aloy
  - Institute for Research in Biomedicine (IRB Barcelona), The Barcelona Institute of Science and Technology, Barcelona, Catalonia, Spain
  - Institució Catalana de Recerca i Estudis Avançats (ICREA), Barcelona, Catalonia, Spain
5. Erdem C, Mutsuddy A, Bensman EM, Dodd WB, Saint-Antoine MM, Bouhaddou M, Blake RC, Gross SM, Heiser LM, Feltus FA, Birtwistle MR. A scalable, open-source implementation of a large-scale mechanistic model for single cell proliferation and death signaling. Nat Commun 2022; 13:3555. [PMID: 35729113 PMCID: PMC9213456 DOI: 10.1038/s41467-022-31138-1] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 07/16/2021] [Accepted: 06/07/2022] [Indexed: 02/01/2023] Open
Abstract
Mechanistic models of how single cells respond to different perturbations can help integrate disparate big data sets or predict response to varied drug combinations. However, the construction and simulation of such models have proved challenging. Here, we developed a Python-based model creation and simulation pipeline that converts a few structured text files into an SBML standard and is high-performance- and cloud-computing ready. We applied this pipeline to our large-scale, mechanistic pan-cancer signaling model (named SPARCED) and demonstrate it by adding an IFNγ pathway submodel. We then investigated whether a putative crosstalk mechanism could be consistent with experimental observations from the LINCS MCF10A Data Cube that IFNγ acts as an anti-proliferative factor. The analyses suggested this observation can be explained by IFNγ-induced SOCS1 sequestering activated EGF receptors. This work forms a foundational recipe for increased mechanistic model-based data integration on a single-cell level, an important building block for clinically-predictive mechanistic models.
Affiliation(s)
- Cemal Erdem
  - Department of Chemical & Biomolecular Engineering, Clemson University, Clemson, SC, USA
- Arnab Mutsuddy
  - Department of Chemical & Biomolecular Engineering, Clemson University, Clemson, SC, USA
- Ethan M Bensman
  - Computer Science, School of Computing, Clemson University, Clemson, SC, USA
- William B Dodd
  - Department of Chemical & Biomolecular Engineering, Clemson University, Clemson, SC, USA
- Michael M Saint-Antoine
  - Center for Bioinformatics and Computational Biology, University of Delaware, Newark, DE, USA
- Mehdi Bouhaddou
  - Department of Cellular and Molecular Pharmacology, University of California San Francisco, San Francisco, CA, USA
- Robert C Blake
  - Center for Applied Scientific Computing, Lawrence Livermore National Laboratory, Livermore, CA, USA
- Sean M Gross
  - Department of Biomedical Engineering, Oregon Health & Science University, Portland, OR, USA
- Laura M Heiser
  - Department of Biomedical Engineering, Oregon Health & Science University, Portland, OR, USA
- F Alex Feltus
  - Department of Genetics and Biochemistry, Clemson University, Clemson, SC, USA
  - Biomedical Data Science and Informatics Program, Clemson University, Clemson, SC, USA
  - Center for Human Genetics, Clemson University, Clemson, SC, USA
- Marc R Birtwistle
  - Department of Chemical & Biomolecular Engineering, Clemson University, Clemson, SC, USA
  - Department of Bioengineering, Clemson University, Clemson, SC, USA
6. Huang H, Wu N, Liang Y, Peng X, Jun S. SLNL: A novel method for gene selection and phenotype classification. Int J Intell Syst 2022. [DOI: 10.1002/int.22844] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Indexed: 01/27/2023]
Affiliation(s)
- HaiHui Huang
  - School of Information Engineering, Shaoguan University, Shaoguan, China
- NaiQi Wu
  - Macau Institute of Systems Engineering and Collaborative Laboratory of Intelligent Science and Systems, Macau University of Science and Technology, Macau, China
- Yong Liang
  - The Peng Cheng Laboratory, Shenzhen, China
- XinDong Peng
  - School of Information Engineering, Shaoguan University, Shaoguan, China
- Shu Jun
  - School of Mathematics and Statistics, Xi'an Jiaotong University, Xi'an, China
7. Zhu C, Wang Z, Cai J, Pan C, Lin S, Zhang Y, Chen Y, Leng M, He C, Zhou P, Wu C, Fang Y, Li Q, Li A, Liu S, Lai Q. VDR Signaling via the Enzyme NAT2 Inhibits Colorectal Cancer Progression. Front Pharmacol 2021; 12:727704. [PMID: 34867333 PMCID: PMC8635240 DOI: 10.3389/fphar.2021.727704] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 06/19/2021] [Accepted: 10/04/2021] [Indexed: 12/31/2022] Open
Abstract
Recent epidemiological and preclinical evidence indicates that vitamin D3 inhibits colorectal cancer (CRC) progression, but the mechanism has not been completely elucidated. This study was designed to determine the protective effects of vitamin D3 and identify crucial targets and regulatory mechanisms in CRC. First, we confirmed that 1,25(OH)2D3, the active form of vitamin D3, suppressed the aggressive phenotype of CRC in vitro and in vivo. Based on a network pharmacological analysis, N-acetyltransferase 2 (NAT2) was identified as a potential target of vitamin D3 against CRC. Clinical data of CRC patients from our hospital and bioinformatics analysis by online databases indicated that NAT2 was downregulated in CRC specimens and that the lower expression of NAT2 was correlated with a higher metastasis risk and lower survival rate of CRC patients. Furthermore, we found that NAT2 suppressed the proliferation and migration capacity of CRC cells, and the JAK1/STAT3 signaling pathway might be the underlying mechanism. Moreover, Western blot and immunofluorescence staining assays demonstrated that 1,25(OH)2D3 promoted NAT2 expression, and the chromatin immunoprecipitation assay indicated that the vitamin D receptor (VDR) transcriptionally regulated NAT2. These findings expand the potential uses of vitamin D3 against CRC and introduce VDR signaling via the enzyme NAT2 as a potential diagnostic and therapeutic target for CRC.
Affiliation(s)
- Chaojun Zhu
  - Guangdong Provincial Key Laboratory of Gastroenterology, Department of Gastroenterology, Nanfang Hospital, Southern Medical University, Guangzhou, China
- Zihuan Wang
  - Guangdong Provincial Key Laboratory of Gastroenterology, Department of Gastroenterology, Nanfang Hospital, Southern Medical University, Guangzhou, China
- Jianqun Cai
  - Guangdong Provincial Key Laboratory of Gastroenterology, Department of Gastroenterology, Nanfang Hospital, Southern Medical University, Guangzhou, China
- Chunqiu Pan
  - Department of Emergency Medicine, Nanfang Hospital, Southern Medical University, Guangzhou, China
- Simin Lin
  - Guangdong Provincial Key Laboratory of Gastroenterology, Department of Gastroenterology, Nanfang Hospital, Southern Medical University, Guangzhou, China
- Yue Zhang
  - Guangdong Provincial Key Laboratory of Gastroenterology, Department of Gastroenterology, Nanfang Hospital, Southern Medical University, Guangzhou, China
- Yuting Chen
  - Guangdong Provincial Key Laboratory of Gastroenterology, Department of Gastroenterology, Nanfang Hospital, Southern Medical University, Guangzhou, China
- Mengxin Leng
  - Guangdong Provincial Key Laboratory of Gastroenterology, Department of Gastroenterology, Nanfang Hospital, Southern Medical University, Guangzhou, China
- Chengcheng He
  - Guangdong Provincial Key Laboratory of Gastroenterology, Department of Gastroenterology, Nanfang Hospital, Southern Medical University, Guangzhou, China
  - Department of Gastroenterology, Third Affiliated Hospital of Guangzhou Medical University, Guangzhou, China
- Peirong Zhou
  - Guangdong Provincial Key Laboratory of Gastroenterology, Department of Gastroenterology, Nanfang Hospital, Southern Medical University, Guangzhou, China
  - Department of Gastroenterology, Third Affiliated Hospital of Guangzhou Medical University, Guangzhou, China
- Changjie Wu
  - Guangdong Provincial Key Laboratory of Gastroenterology, Department of Gastroenterology, Nanfang Hospital, Southern Medical University, Guangzhou, China
- Yuxin Fang
  - Guangdong Provincial Key Laboratory of Gastroenterology, Department of Gastroenterology, Nanfang Hospital, Southern Medical University, Guangzhou, China
- Qingyuan Li
  - Guangdong Provincial Key Laboratory of Gastroenterology, Department of Gastroenterology, Nanfang Hospital, Southern Medical University, Guangzhou, China
- Aimin Li
  - Guangdong Provincial Key Laboratory of Gastroenterology, Department of Gastroenterology, Nanfang Hospital, Southern Medical University, Guangzhou, China
- Side Liu
  - Guangdong Provincial Key Laboratory of Gastroenterology, Department of Gastroenterology, Nanfang Hospital, Southern Medical University, Guangzhou, China
- Qiuhua Lai
  - Guangdong Provincial Key Laboratory of Gastroenterology, Department of Gastroenterology, Nanfang Hospital, Southern Medical University, Guangzhou, China
8. Akhter R, Sofi SA. Precision agriculture using IoT data analytics and machine learning. Journal of King Saud University - Computer and Information Sciences 2021. [DOI: 10.1016/j.jksuci.2021.05.013] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Indexed: 10/21/2022]
9. Spjuth O, Frid J, Hellander A. The machine learning life cycle and the cloud: implications for drug discovery. Expert Opin Drug Discov 2021; 16:1071-1079. [PMID: 34057379 DOI: 10.1080/17460441.2021.1932812] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 12/30/2022]
Abstract
Introduction: Artificial intelligence (AI) and machine learning (ML) are increasingly used in many aspects of drug discovery. Larger data sizes and methods such as Deep Neural Networks contribute to challenges in data management, the required software stack, and computational infrastructure. There is an increasing need in drug discovery to continuously re-train models and make them available in production environments. Areas covered: This article describes how cloud computing can aid the ML life cycle in drug discovery. The authors discuss opportunities with containerization and scientific workflows and introduce the concept of MLOps and describe how it can facilitate reproducible and robust ML modeling in drug discovery organizations. They also discuss ML on private, sensitive and regulated data. Expert opinion: Cloud computing offers a compelling suite of building blocks to sustain the ML life cycle integrated in iterative drug discovery. Containerization and platforms such as Kubernetes together with scientific workflows can enable reproducible and resilient analysis pipelines, and the elasticity and flexibility of cloud infrastructures enables scalable and efficient access to compute resources. Drug discovery commonly involves working with sensitive or private data, and cloud computing and federated learning can contribute toward enabling collaborative drug discovery within and between organizations. Abbreviations: AI = Artificial Intelligence; DL = Deep Learning; GPU = Graphics Processing Unit; IaaS = Infrastructure as a Service; K8S = Kubernetes; ML = Machine Learning; MLOps = Machine Learning and Operations; PaaS = Platform as a Service; QC = Quality Control; SaaS = Software as a Service.
Affiliation(s)
- Ola Spjuth
  - Department of Pharmaceutical Biosciences and Science for Life Laboratory, Uppsala University, Uppsala, Sweden
  - Scaleout Systems AB, Sweden
- Andreas Hellander
  - Scaleout Systems AB, Sweden
  - Department of Information Technology, Uppsala University, Sweden
10. Pin C, Collins T, Gibbs M, Kimko H. Systems Modeling to Quantify Safety Risks in Early Drug Development: Using Bifurcation Analysis and Agent-Based Modeling as Examples. AAPS Journal 2021; 23:77. [PMID: 34018069 PMCID: PMC8137611 DOI: 10.1208/s12248-021-00580-2] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 10/24/2020] [Accepted: 03/09/2021] [Indexed: 11/30/2022]
Abstract
Quantitative Systems Toxicology (QST) models, recapitulating pharmacokinetics and mechanism of action together with the organic response at multiple levels of biological organization, can provide predictions on the magnitude of injury and recovery dynamics to support study design and decision-making during drug development. Here, we highlight the application of QST models to predict toxicities of cancer treatments, such as cytopenia(s) and gastrointestinal adverse effects, where narrow therapeutic indexes need to be actively managed. The importance of bifurcation analysis is demonstrated in QST models of hematologic toxicity to understand how different regions of the parameter space generate different behaviors following cancer treatment, which results in asymptotically stable predictions, yet highly irregular for specific schedules, or oscillating predictions of blood cell levels. In addition, an agent-based model of the intestinal crypt was used to simulate how the spatial location of the injury within the crypt affects the villus disruption severity. We discuss the value of QST modeling approaches to support drug development and how they align with technological advances impacting trial design including patient selection, dose/regimen selection, and ultimately patient safety.
Affiliation(s)
- Carmen Pin
  - Clinical Pharmacology and Quantitative Pharmacology, Clinical Pharmacology and Safety Sciences, R&D, AstraZeneca, Cambridge Science Park, Milton Road, Cambridge, UK
- Teresa Collins
  - Clinical Pharmacology and Quantitative Pharmacology, Clinical Pharmacology and Safety Sciences, R&D, AstraZeneca, Cambridge Science Park, Milton Road, Cambridge, UK
- Megan Gibbs
  - Clinical Pharmacology and Quantitative Pharmacology, Clinical Pharmacology and Safety Sciences, R&D, AstraZeneca, Gaithersburg, Maryland, USA
- Holly Kimko
  - Clinical Pharmacology and Quantitative Pharmacology, Clinical Pharmacology and Safety Sciences, R&D, AstraZeneca, Gaithersburg, Maryland, USA
11. Kafita D, Daka V, Nkhoma P, Zulu M, Zulu E, Tembo R, Ngwira Z, Mwaba F, Sinkala M, Munsaka S. High ELF4 expression in human cancers is associated with worse disease outcomes and increased resistance to anticancer drugs. PLoS One 2021; 16:e0248984. [PMID: 33836003 PMCID: PMC8034723 DOI: 10.1371/journal.pone.0248984] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Received: 10/31/2020] [Accepted: 03/09/2021] [Indexed: 12/12/2022] Open
Abstract
The malignant phenotype of tumour cells is fuelled by changes in the expression of various transcription factors, including some of the well-studied proteins such as p53 and Myc. Despite significant progress made, little is known about several other transcription factors, including ELF4, and how they help shape the oncogenic processes in cancer cells. To this end, we performed a bioinformatics analysis to facilitate a detailed understanding of how the expression variations of ELF4 in human cancers are related to disease outcomes and the cancer cell drug responses. Here, using ELF4 mRNA expression data of 9,350 samples from the Cancer Genome Atlas pan-cancer project, we identify two groups of patients' tumours: those that expressed high ELF4 transcripts and those that expressed low ELF4 transcripts across 32 different human cancers. We uncover that patients segregated into these two groups are associated with different clinical outcomes. Further, we find that tumours that express high ELF4 mRNA levels tend to be of a higher grade, afflict a significantly older patient population and have a significantly higher mutation burden. By analysing dose-response profiles to 397 anti-cancer drugs of 612 well-characterised human cancer cell lines, we discover that cell lines that expressed high ELF4 mRNA transcript are significantly less responsive to 129 anti-cancer drugs, and significantly more responsive to only three drugs: dasatinib, WH-4-023, and Ponatinib, all of which remarkably target the proto-oncogene tyrosine-protein kinase SRC and tyrosine-protein kinase ABL1. Collectively, our analyses have shown that, across the 32 different human cancers, the patients afflicted with tumours that overexpress ELF4 tended to have a more aggressive disease that is also more likely to be refractory to most anti-cancer drugs, a finding upon which we could devise novel categorisation of patient tumours, treatment, and prognostic strategies.
Affiliation(s)
- Doris Kafita
  - Department of Biomedical Sciences, School of Health Sciences, University of Zambia, Lusaka, Zambia
- Victor Daka
  - Department of Pathology and Microbiology, School of Medicine, University of Zambia, Lusaka, Zambia
- Panji Nkhoma
  - Department of Biomedical Sciences, School of Health Sciences, University of Zambia, Lusaka, Zambia
- Mildred Zulu
  - Department of Clinical Sciences, School of Medicine, Copperbelt University, Ndola, Zambia
- Ephraim Zulu
  - Department of Biomedical Sciences, School of Health Sciences, University of Zambia, Lusaka, Zambia
- Rabecca Tembo
  - Department of Clinical Sciences, School of Medicine, Copperbelt University, Ndola, Zambia
- Zifa Ngwira
  - Department of Clinical Sciences, School of Medicine, Copperbelt University, Ndola, Zambia
- Florence Mwaba
  - Department of Clinical Sciences, School of Medicine, Copperbelt University, Ndola, Zambia
- Musalula Sinkala
  - Department of Biomedical Sciences, School of Health Sciences, University of Zambia, Lusaka, Zambia
- Sody Munsaka
  - Department of Biomedical Sciences, School of Health Sciences, University of Zambia, Lusaka, Zambia
12. Lima DB, Zhu Y, Liu F. XlinkCyNET: A Cytoscape Application for Visualization of Protein Interaction Networks Based on Cross-Linking Mass Spectrometry Identifications. J Proteome Res 2021; 20:1943-1950. [PMID: 33689356 DOI: 10.1021/acs.jproteome.0c00957] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Indexed: 12/22/2022]
Abstract
Software tools that allow the visualization and analysis of protein interaction networks are essential for studies in systems biology. One of the most popular network visualization tools in biology is Cytoscape, which offers a great selection of plug-ins for the interpretation of network data. Chemical cross-linking coupled to mass spectrometry (XL-MS) is an increasingly important source for protein interaction data; however, to date, no Cytoscape tools are available to analyze XL-MS results. In light of the suitability of the Cytoscape platform and to expand its toolbox, here we introduce XlinkCyNET, an open-source Cytoscape Java plug-in for exploring large-scale XL-MS-based protein interaction networks. XlinkCyNET offers the rapid and easy visualization of intra- and interprotein cross-links in a rectangular-bar style as well as on the 3D structure, allowing the interrogation of protein interaction networks at the residue level. XlinkCyNET is freely available from the Cytoscape App Store (http://apps.cytoscape.org/apps/xlinkcynet) and at the Liu lab webpage (https://www.theliulab.com/software/xlinkcynet).
Affiliation(s)
- Diogo Borges Lima
  - Department of Chemical Biology, Leibniz - Forschungsinstitut für Molekulare Pharmakologie (FMP), Robert-Rössle-Str. 10, Berlin 13125, Germany
- Ying Zhu
  - Department of Chemical Biology, Leibniz - Forschungsinstitut für Molekulare Pharmakologie (FMP), Robert-Rössle-Str. 10, Berlin 13125, Germany
- Fan Liu
  - Department of Chemical Biology, Leibniz - Forschungsinstitut für Molekulare Pharmakologie (FMP), Robert-Rössle-Str. 10, Berlin 13125, Germany
13. From the Digital Data Revolution toward a Digital Society: Pervasiveness of Artificial Intelligence. Machine Learning and Knowledge Extraction 2021. [DOI: 10.3390/make3010014] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Indexed: 01/01/2023]
Abstract
Technological progress has led to powerful computers and communication technologies that now penetrate all areas of science, industry and our private lives. As a consequence, all these areas are generating digital traces of data amounting to big data resources. This opens unprecedented opportunities but also challenges toward the analysis, management, interpretation and responsible usage of such data. In this paper, we discuss these developments and the fields that have been particularly affected by the digital revolution. Our discussion is AI-centered, showing domain-specific prospects but also intricacies for the method development in artificial intelligence. For instance, we discuss recent breakthroughs in deep learning algorithms and artificial intelligence as well as advances in text mining and natural language processing, e.g., word-embedding methods that enable the processing of large amounts of text data from diverse sources such as governmental reports, blog entries in social media or clinical health records of patients. Furthermore, we discuss the necessity of further improving general artificial intelligence approaches and of utilizing advanced learning paradigms. This leads to arguments for the establishment of statistical artificial intelligence. Finally, we provide an outlook on important aspects of future challenges that are of crucial importance for the development of all fields, including ethical AI and the influence of bias on AI systems. As a potential end-point of this development, we define digital society as the asymptotic limiting state of digital economy that emerges from fully connected information and communication technologies enabling the pervasiveness of AI. Overall, our discussion provides a perspective on the elaborate relatedness of digital data and AI systems.
14. Karapiperis C, Chasapi A, Angelis L, Scouras ZG, Mastroberardino PG, Tapio S, Atkinson MJ, Ouzounis CA. The Coming of Age for Big Data in Systems Radiobiology, an Engineering Perspective. Big Data 2021; 9:63-71. [PMID: 32991205 DOI: 10.1089/big.2019.0144] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 06/11/2023]
Abstract
As high-throughput approaches in biological and biomedical research are transforming the life sciences into information-driven disciplines, modern analytics platforms for big data have started to address the need for efficient and systematic data analysis and interpretation. We observe that radiobiology is following this general trend, with -omics information providing unparalleled depth into the biomolecular mechanisms of radiation response, defined as systems radiobiology. We outline the design of computational frameworks and discuss the analysis of big data in low-dose ionizing radiation (LDIR) responses of the mammalian brain. Following successful examples and best practices of approaches for the analysis of big data in life sciences and health care, we present the needs and requirements for radiation research. Our goal is to raise awareness in the radiobiology community of the new technological possibilities that can capture complex information and execute data analytics on a large scale. The production of large data sets from genome-wide experiments (quantity) and the complexity of radiation research with multidimensional experimental designs (quality) will necessitate the adoption of the latest information technologies. The main objective is to translate research results into applied clinical and epidemiological practice and to understand the responses of biological tissues to LDIR in order to define new radiation protection policies. We envisage a future where multidisciplinary teams include data scientists, artificial intelligence experts, DevOps engineers, and of course radiation experts to fulfill the augmented needs of the radiobiology community, accelerate research, and devise new strategies.
Affiliation(s)
- Christos Karapiperis
- School of Informatics, Aristotle University of Thessalonica (AUTH), Thessalonica, Greece
| | - Anastasia Chasapi
- Biological Computation & Process Laboratory (BCPL), Chemical Process & Energy Resources Institute (CPERI), Centre for Research & Technology Hellas (CERTH), Thessalonica, Greece
| | - Lefteris Angelis
- School of Informatics, Aristotle University of Thessalonica (AUTH), Thessalonica, Greece
| | - Zacharias G Scouras
- School of Biology, Aristotle University of Thessalonica (AUTH), Thessalonica, Greece
| | | | - Soile Tapio
- Institute of Radiation Biology, Helmholtz Zentrum Muenchen, German Research Center for Environmental Health (HMGU), Neuherberg, Germany
| | - Michael J Atkinson
- Institute of Radiation Biology, Helmholtz Zentrum Muenchen, German Research Center for Environmental Health (HMGU), Neuherberg, Germany
| | - Christos A Ouzounis
- School of Informatics, Aristotle University of Thessalonica (AUTH), Thessalonica, Greece
- Biological Computation & Process Laboratory (BCPL), Chemical Process & Energy Resources Institute (CPERI), Centre for Research & Technology Hellas (CERTH), Thessalonica, Greece
|
15
|
Dezső Z, Ceccarelli M. Machine learning prediction of oncology drug targets based on protein and network properties. BMC Bioinformatics 2020; 21:104. [PMID: 32171238 PMCID: PMC7071582 DOI: 10.1186/s12859-020-3442-9] [Citation(s) in RCA: 18] [Impact Index Per Article: 4.5] [Received: 09/24/2019] [Accepted: 03/04/2020] [Indexed: 01/12/2023]
Abstract
Background: The selection and prioritization of drug targets is a central problem in drug discovery. Computational approaches can leverage the growing amount of large-scale human genomics and proteomics data to perform in silico target identification, reducing the cost and the time needed. Results: We developed a machine learning approach that scores proteins to generate a druggability score for novel targets. In our model we incorporated 70 protein features, which included properties derived from the sequence, features characterizing protein functions, as well as network properties derived from the protein-protein interaction network. The advantage of this approach is that it is unbiased, and even less studied proteins with limited information about their function can score well, as most of the features are independent of the accumulated literature. We built models on a training set which consists of targets with approved drugs and a negative set of non-drug targets. The machine learning techniques help to identify the most important combination of features differentiating validated targets from non-targets. We validated our predictions on an independent set of clinical trial drug targets, achieving a high accuracy characterized by an Area Under the Curve (AUC) of 0.89. Our most predictive features included biological function of proteins, network centrality measures, protein essentiality, tissue specificity, localization and solvent accessibility. Our predictions, based on a small set of 102 validated oncology targets, recovered the majority of known drug targets and identified a novel set of proteins as drug target candidates. Conclusions: We developed a machine learning approach to prioritize proteins according to their similarity to approved drug targets. We have shown that the proposed method is highly predictive on a validation dataset consisting of 277 targets of clinical trial drugs, confirming that our computational approach is an efficient and cost-effective tool for drug target discovery and prioritization. Our predictions were based on oncology targets and cancer-relevant biological functions, resulting in significantly higher scores for targets of oncology clinical trial drugs compared to the scores of targets of trial drugs for other indications. Our approach can be used to make indication-specific drug-target predictions by combining generic druggability features with indication-specific biological functions.
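As a reading aid for the AUC figure quoted in this abstract, the following is a minimal sketch of how an area under the ROC curve can be computed from classifier scores using the pairwise (Mann-Whitney) formulation. The function name `roc_auc` and the toy labels and scores are illustrative assumptions; they are not the authors' code or data and do not reproduce the reported 0.89.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of positives and negatives."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    # A positive scored above a negative counts as 1, a tie counts as 0.5.
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = validated target, 0 = non-target.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.6, 0.4, 0.5, 0.7, 0.1]
auc = roc_auc(labels, scores)
print(f"AUC = {auc:.3f}")
```

The AUC is the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one, which is why it suits ranking tasks such as target prioritization.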
Affiliation(s)
- Zoltán Dezső
- Computational Biology-Genomic Research Center, ABBVIE, Redwood City, CA, USA.
| | - Michele Ceccarelli
- Computational Biology-Genomic Research Center, ABBVIE, Redwood City, CA, USA.,Department of Electrical Engineering and Information Technology (DIETI), University of Naples "Federico II", 80128, Naples, Italy.,Istituto di Ricerche Genetiche "G. Salvatore", Biogem s.c.ar.l, 83031, Ariano Irpino, Italy.
|
16
|
Barneh F, Mirzaie M, Nickchi P, Tan TZ, Thiery JP, Piran M, Salimi M, Goshadrou F, Aref AR, Jafari M. Integrated use of bioinformatic resources reveals that co-targeting of histone deacetylases, IKBK and SRC inhibits epithelial-mesenchymal transition in cancer. Brief Bioinform 2020; 20:717-731. [PMID: 29726962 DOI: 10.1093/bib/bby030] [Citation(s) in RCA: 14] [Impact Index Per Article: 3.5] [Received: 01/03/2018] [Revised: 03/04/2018] [Indexed: 02/07/2023]
Abstract
With the advent of high-throughput technologies leading to big data generation, an increasing number of gene signatures are being published to predict various features of diseases such as prognosis and patient survival. However, to use these signatures for identifying therapeutic targets, the use of additional bioinformatic tools is an indispensable part of research. Here, we have generated a pipeline comprising nearly 15 bioinformatic tools and enrichment statistical methods to propose and validate a drug combination strategy from already approved drugs, and we present our approach using published pan-cancer epithelial-mesenchymal transition (EMT) signatures as a case study. We observed that histone deacetylases were critical targets to tune expression of multiple epithelial versus mesenchymal genes. Moreover, SRC and IKBK were the principal intracellular kinases regulating multiple signaling pathways. To confirm the anti-EMT efficacy of the proposed target combination in silico, we validated expression of targets in mesenchymal versus epithelial subtypes of ovarian cancer. Additionally, we inhibited the pinpointed proteins in vitro using an invasive lung cancer cell line. We found that whereas low-dose monotherapy failed to limit cell dispersion from collagen spheroids in a microfluidic device as a metric of EMT, the combination fully inhibited dissociation and invasion of cancer cells toward cocultured endothelial cells. Given the approval status and safety profiles of the suggested drugs, the proposed combination set can be considered in clinical trials.
Affiliation(s)
- Farnaz Barneh
- Department of Basic Sciences, Faculty of Paramedical Sciences, Shahid Beheshti University of Medical Sciences, Tehran, Iran.,Drug Design and Bioinformatics Unit, Medical Biotechnology Department, Biotechnology Research Center, Pasteur Institute of Iran, Tehran, Iran
| | - Mehdi Mirzaie
- Department of Applied Mathematics, Faculty of Mathematical Sciences, Tarbiat Modares University, Tehran, Iran
| | - Payman Nickchi
- Drug Design and Bioinformatics Unit, Medical Biotechnology Department, Biotechnology Research Center, Pasteur Institute of Iran, Tehran, Iran.,Department of Statistics and Actuarial Science, Simon Fraser University, Burnaby, BC, Canada
| | - Tuan Zea Tan
- Cancer Science Institute of Singapore, National University of Singapore, 14 Medical Drive, Singapore 117599, Singapore, Translational Centre for Development and Research, National University Health System, MD11, #03-10, 10 Medical Drive, Singapore 117597, Singapore
| | - Jean Paul Thiery
- Department of Biochemistry, Yong Loo Lin School of Medicine, National University of Singapore, Singapore 117596, Singapore.,Institut Gustave Roussy, Inserm Unit 1186 Comprehensive Cancer Center, Villejuif, France.,CNRS UMR 7057 Matter and Complex Systems, University Paris Denis Diderot, Paris, France
| | - Mehran Piran
- Drug Design and Bioinformatics Unit, Medical Biotechnology Department, Biotechnology Research Center, Pasteur Institute of Iran, Tehran, Iran
| | - Mona Salimi
- Department of Physiology and Pharmacology, Pasteur Institute of Iran, Tehran, Iran
| | - Fatemeh Goshadrou
- Department of Basic Sciences, Faculty of Paramedical Sciences, Shahid Beheshti University of Medical Sciences, Tehran, Iran
| | - Amir R Aref
- Department of Medical Oncology, Belfer Center for Applied Cancer Science, Dana-Farber Cancer Institute, Harvard Medical School, Boston 02215, USA
| | - Mohieddin Jafari
- Drug Design and Bioinformatics Unit, Medical Biotechnology Department, Biotechnology Research Center, Pasteur Institute of Iran, Tehran, Iran
|
17
|
Bradshaw EL, Spilker ME, Zang R, Bansal L, He H, Jones RD, Le K, Penney M, Schuck E, Topp B, Tsai A, Xu C, Nijsen MJ, Chan JR. Applications of Quantitative Systems Pharmacology in Model-Informed Drug Discovery: Perspective on Impact and Opportunities. CPT Pharmacometrics Syst Pharmacol 2019; 8:777-791. [PMID: 31535440 PMCID: PMC6875708 DOI: 10.1002/psp4.12463] [Citation(s) in RCA: 34] [Impact Index Per Article: 6.8] [Received: 04/15/2019] [Accepted: 07/19/2019] [Indexed: 12/15/2022]
Abstract
Quantitative systems pharmacology (QSP) approaches have been increasingly applied in the pharmaceutical industry since the landmark white paper published in 2011 by a National Institutes of Health working group brought attention to the discipline. In this perspective, we discuss QSP in the context of other modeling approaches and highlight the impact of QSP across various stages of drug development and therapeutic areas. We discuss challenges to the field as well as future opportunities.
Affiliation(s)
| | - Mary E. Spilker
- Pfizer Worldwide Research and Development, San Diego, California, USA
| | | | | | - Handan He
- Novartis Institutes for Biomedical Research, East Hanover, New Jersey, USA
| | | | - Kha Le
- Agios, Cambridge, Massachusetts, USA
| | | | | | | | - Alice Tsai
- Vertex Pharmaceuticals Incorporated, Boston, Massachusetts, USA
|
18
|
Koleti A, Terryn R, Stathias V, Chung C, Cooper DJ, Turner JP, Vidovic D, Forlin M, Kelley TT, D'Urso A, Allen BK, Torre D, Jagodnik KM, Wang L, Jenkins SL, Mader C, Niu W, Fazel M, Mahi N, Pilarczyk M, Clark N, Shamsaei B, Meller J, Vasiliauskas J, Reichard J, Medvedovic M, Ma'ayan A, Pillai A, Schürer SC. Data Portal for the Library of Integrated Network-based Cellular Signatures (LINCS) program: integrated access to diverse large-scale cellular perturbation response data. Nucleic Acids Res 2019; 46:D558-D566. [PMID: 29140462 PMCID: PMC5753343 DOI: 10.1093/nar/gkx1063] [Citation(s) in RCA: 107] [Impact Index Per Article: 21.4] [Received: 08/15/2017] [Accepted: 10/19/2017] [Indexed: 11/21/2022]
Abstract
The Library of Integrated Network-based Cellular Signatures (LINCS) program is a national consortium funded by the NIH to generate a diverse and extensive reference library of cell-based perturbation-response signatures, along with novel data analytics tools to improve our understanding of human diseases at the systems level. In contrast to other large-scale data generation efforts, LINCS Data and Signature Generation Centers (DSGCs) employ a wide range of assay technologies cataloging diverse cellular responses. Integration of, and unified access to, LINCS data has therefore been particularly challenging. The Big Data to Knowledge (BD2K) LINCS Data Coordination and Integration Center (DCIC) has developed data standards specifications, data processing pipelines, and a suite of end-user software tools to integrate and annotate LINCS-generated data and to make LINCS signatures searchable and usable for different types of users. Here, we describe the LINCS Data Portal (LDP) (http://lincsportal.ccs.miami.edu/), a unified web interface to access datasets generated by the LINCS DSGCs, and its underlying database, the LINCS Data Registry (LDR). LINCS data served on the LDP contain extensive metadata and curated annotations. We highlight the features of the LDP user interface, which is designed to enable search, browsing, exploration, download and analysis of LINCS data and related curated content.
Affiliation(s)
- Amar Koleti
- Center for Computational Science, University of Miami, FL, USA.,BD2K LINCS Data Coordination and Integration Center, Icahn School of Medicine at Mount Sinai, University of Miami, University of Cincinnati, New York NY, Miami FL, Cincinnati OH, USA
| | - Raymond Terryn
- Center for Computational Science, University of Miami, FL, USA.,BD2K LINCS Data Coordination and Integration Center, Icahn School of Medicine at Mount Sinai, University of Miami, University of Cincinnati, New York NY, Miami FL, Cincinnati OH, USA.,Department of Molecular and Cellular Pharmacology, Miller School of Medicine, University of Miami, FL, USA
| | - Vasileios Stathias
- BD2K LINCS Data Coordination and Integration Center, Icahn School of Medicine at Mount Sinai, University of Miami, University of Cincinnati, New York NY, Miami FL, Cincinnati OH, USA.,Department of Molecular and Cellular Pharmacology, Miller School of Medicine, University of Miami, FL, USA.,Department of Human Genetics and Genomics, Miller School of Medicine, University of Miami, FL, USA
| | - Caty Chung
- Center for Computational Science, University of Miami, FL, USA.,BD2K LINCS Data Coordination and Integration Center, Icahn School of Medicine at Mount Sinai, University of Miami, University of Cincinnati, New York NY, Miami FL, Cincinnati OH, USA
| | - Daniel J Cooper
- BD2K LINCS Data Coordination and Integration Center, Icahn School of Medicine at Mount Sinai, University of Miami, University of Cincinnati, New York NY, Miami FL, Cincinnati OH, USA.,Department of Molecular and Cellular Pharmacology, Miller School of Medicine, University of Miami, FL, USA
| | - John P Turner
- Center for Computational Science, University of Miami, FL, USA.,BD2K LINCS Data Coordination and Integration Center, Icahn School of Medicine at Mount Sinai, University of Miami, University of Cincinnati, New York NY, Miami FL, Cincinnati OH, USA.,Department of Molecular and Cellular Pharmacology, Miller School of Medicine, University of Miami, FL, USA
| | - Dušica Vidovic
- Center for Computational Science, University of Miami, FL, USA.,BD2K LINCS Data Coordination and Integration Center, Icahn School of Medicine at Mount Sinai, University of Miami, University of Cincinnati, New York NY, Miami FL, Cincinnati OH, USA.,Department of Molecular and Cellular Pharmacology, Miller School of Medicine, University of Miami, FL, USA
| | - Michele Forlin
- Center for Computational Science, University of Miami, FL, USA.,BD2K LINCS Data Coordination and Integration Center, Icahn School of Medicine at Mount Sinai, University of Miami, University of Cincinnati, New York NY, Miami FL, Cincinnati OH, USA.,Department of Molecular and Cellular Pharmacology, Miller School of Medicine, University of Miami, FL, USA
| | - Tanya T Kelley
- BD2K LINCS Data Coordination and Integration Center, Icahn School of Medicine at Mount Sinai, University of Miami, University of Cincinnati, New York NY, Miami FL, Cincinnati OH, USA.,Department of Molecular and Cellular Pharmacology, Miller School of Medicine, University of Miami, FL, USA
| | - Alessandro D'Urso
- Center for Computational Science, University of Miami, FL, USA.,BD2K LINCS Data Coordination and Integration Center, Icahn School of Medicine at Mount Sinai, University of Miami, University of Cincinnati, New York NY, Miami FL, Cincinnati OH, USA
| | - Bryce K Allen
- Center for Computational Science, University of Miami, FL, USA.,BD2K LINCS Data Coordination and Integration Center, Icahn School of Medicine at Mount Sinai, University of Miami, University of Cincinnati, New York NY, Miami FL, Cincinnati OH, USA.,Department of Molecular and Cellular Pharmacology, Miller School of Medicine, University of Miami, FL, USA
| | - Denis Torre
- BD2K LINCS Data Coordination and Integration Center, Icahn School of Medicine at Mount Sinai, University of Miami, University of Cincinnati, New York NY, Miami FL, Cincinnati OH, USA.,Department of Pharmacological Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, USA
| | - Kathleen M Jagodnik
- BD2K LINCS Data Coordination and Integration Center, Icahn School of Medicine at Mount Sinai, University of Miami, University of Cincinnati, New York NY, Miami FL, Cincinnati OH, USA.,Department of Pharmacological Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, USA
| | - Lily Wang
- BD2K LINCS Data Coordination and Integration Center, Icahn School of Medicine at Mount Sinai, University of Miami, University of Cincinnati, New York NY, Miami FL, Cincinnati OH, USA.,Department of Pharmacological Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, USA
| | - Sherry L Jenkins
- BD2K LINCS Data Coordination and Integration Center, Icahn School of Medicine at Mount Sinai, University of Miami, University of Cincinnati, New York NY, Miami FL, Cincinnati OH, USA.,Department of Pharmacological Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, USA
| | - Christopher Mader
- Center for Computational Science, University of Miami, FL, USA.,BD2K LINCS Data Coordination and Integration Center, Icahn School of Medicine at Mount Sinai, University of Miami, University of Cincinnati, New York NY, Miami FL, Cincinnati OH, USA
| | - Wen Niu
- BD2K LINCS Data Coordination and Integration Center, Icahn School of Medicine at Mount Sinai, University of Miami, University of Cincinnati, New York NY, Miami FL, Cincinnati OH, USA.,Division of Biostatistics and Bioinformatics, Department of Environmental Health, University of Cincinnati, Cincinnati, OH, USA
| | - Mehdi Fazel
- BD2K LINCS Data Coordination and Integration Center, Icahn School of Medicine at Mount Sinai, University of Miami, University of Cincinnati, New York NY, Miami FL, Cincinnati OH, USA.,Division of Biostatistics and Bioinformatics, Department of Environmental Health, University of Cincinnati, Cincinnati, OH, USA
| | - Naim Mahi
- BD2K LINCS Data Coordination and Integration Center, Icahn School of Medicine at Mount Sinai, University of Miami, University of Cincinnati, New York NY, Miami FL, Cincinnati OH, USA.,Division of Biostatistics and Bioinformatics, Department of Environmental Health, University of Cincinnati, Cincinnati, OH, USA
| | - Marcin Pilarczyk
- BD2K LINCS Data Coordination and Integration Center, Icahn School of Medicine at Mount Sinai, University of Miami, University of Cincinnati, New York NY, Miami FL, Cincinnati OH, USA.,Division of Biostatistics and Bioinformatics, Department of Environmental Health, University of Cincinnati, Cincinnati, OH, USA
| | - Nicholas Clark
- BD2K LINCS Data Coordination and Integration Center, Icahn School of Medicine at Mount Sinai, University of Miami, University of Cincinnati, New York NY, Miami FL, Cincinnati OH, USA.,Division of Biostatistics and Bioinformatics, Department of Environmental Health, University of Cincinnati, Cincinnati, OH, USA
| | - Behrouz Shamsaei
- BD2K LINCS Data Coordination and Integration Center, Icahn School of Medicine at Mount Sinai, University of Miami, University of Cincinnati, New York NY, Miami FL, Cincinnati OH, USA.,Division of Biostatistics and Bioinformatics, Department of Environmental Health, University of Cincinnati, Cincinnati, OH, USA
| | - Jarek Meller
- BD2K LINCS Data Coordination and Integration Center, Icahn School of Medicine at Mount Sinai, University of Miami, University of Cincinnati, New York NY, Miami FL, Cincinnati OH, USA.,Division of Biostatistics and Bioinformatics, Department of Environmental Health, University of Cincinnati, Cincinnati, OH, USA
| | - Juozas Vasiliauskas
- BD2K LINCS Data Coordination and Integration Center, Icahn School of Medicine at Mount Sinai, University of Miami, University of Cincinnati, New York NY, Miami FL, Cincinnati OH, USA.,Division of Biostatistics and Bioinformatics, Department of Environmental Health, University of Cincinnati, Cincinnati, OH, USA
| | - John Reichard
- BD2K LINCS Data Coordination and Integration Center, Icahn School of Medicine at Mount Sinai, University of Miami, University of Cincinnati, New York NY, Miami FL, Cincinnati OH, USA.,Division of Biostatistics and Bioinformatics, Department of Environmental Health, University of Cincinnati, Cincinnati, OH, USA
| | - Mario Medvedovic
- BD2K LINCS Data Coordination and Integration Center, Icahn School of Medicine at Mount Sinai, University of Miami, University of Cincinnati, New York NY, Miami FL, Cincinnati OH, USA.,Division of Biostatistics and Bioinformatics, Department of Environmental Health, University of Cincinnati, Cincinnati, OH, USA
| | - Avi Ma'ayan
- BD2K LINCS Data Coordination and Integration Center, Icahn School of Medicine at Mount Sinai, University of Miami, University of Cincinnati, New York NY, Miami FL, Cincinnati OH, USA.,Department of Pharmacological Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, USA
| | - Ajay Pillai
- Division of Genome Sciences, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD, USA
| | - Stephan C Schürer
- Center for Computational Science, University of Miami, FL, USA.,BD2K LINCS Data Coordination and Integration Center, Icahn School of Medicine at Mount Sinai, University of Miami, University of Cincinnati, New York NY, Miami FL, Cincinnati OH, USA.,Department of Molecular and Cellular Pharmacology, Miller School of Medicine, University of Miami, FL, USA
|
19
|
Musa A, Tripathi S, Dehmer M, Emmert-Streib F. L1000 Viewer: A Search Engine and Web Interface for the LINCS Data Repository. Front Genet 2019; 10:557. [PMID: 31258549 PMCID: PMC6588157 DOI: 10.3389/fgene.2019.00557] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Received: 04/13/2019] [Accepted: 05/28/2019] [Indexed: 12/12/2022]
Abstract
The LINCS L1000 data repository contains almost two million gene expression profiles for thousands of small molecules and drugs. However, due to the complexity and size of the data repository and the lack of an interoperable interface, the creation of pharmacologically meaningful workflows utilizing these data is severely hampered. In order to overcome this limitation, we developed the L1000 Viewer, a search engine and graphical web interface for the LINCS data repository. The web interface serves as an interactive platform allowing the user to select different forms of perturbation profiles, e.g., for specific cell lines, drugs, dosages, time points and combinations thereof. At its core, our method relies on a database that we created by inferring and utilizing the intricate dependency graph structure among the data files. The L1000 Viewer is accessible via http://L1000viewer.bio-complexity.com/.
Affiliation(s)
- Aliyu Musa
- Predictive Society and Data Analytics Lab, Faculty of Information Technology and Communication Sciences, Tampere University, Tampere, Finland.,Institute of Biosciences and Medical Technology, Tampere, Finland
| | - Shailesh Tripathi
- Predictive Society and Data Analytics Lab, Faculty of Information Technology and Communication Sciences, Tampere University, Tampere, Finland.,Institute for Intelligent Production, Faculty for Management, University of Applied Sciences Upper Austria, Linz, Austria
| | - Matthias Dehmer
- Institute for Intelligent Production, Faculty for Management, University of Applied Sciences Upper Austria, Linz, Austria.,Department of Mechatronics and Biomedical Computer Science, UMIT, Hall in Tyrol, Austria.,College of Computer and Control Engineering, Nankai University, Tianjin, China
| | - Frank Emmert-Streib
- Predictive Society and Data Analytics Lab, Faculty of Information Technology and Communication Sciences, Tampere University, Tampere, Finland.,Institute of Biosciences and Medical Technology, Tampere, Finland
|
20
|
Barrot CC, Woillard JB, Picard N. Big data in pharmacogenomics: current applications, perspectives and pitfalls. Pharmacogenomics 2019; 20:609-620. [PMID: 31190620 DOI: 10.2217/pgs-2018-0184] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Indexed: 01/19/2023]
Abstract
The efficiency of new-generation sequencing methods and the reduction of their cost have led pharmacogenomics to gradually supplant pharmacogenetics, leading to new applications in personalized medicine along with new perspectives in drug design or identification of drug response factors. The amount of data generated in genomics fits the definition of big data and needs specific bioinformatics processing following standard steps: data collection, processing, analysis and interpretation. Pitfalls of pharmacogenomics studies are directly related to these steps. This review aims to describe these steps from a pharmacogenomic point of view, focusing on bioinformatics aspects.
Affiliation(s)
- Claire-Cécile Barrot
- INSERM, IPPRITT, U1248, F-87000, Limoges, France; Univ. Limoges, IPPRITT, F-87000 Limoges, France
| | - Jean-Baptiste Woillard
- INSERM, IPPRITT, U1248, F-87000, Limoges, France; Univ. Limoges, IPPRITT, F-87000 Limoges, France
| | - Nicolas Picard
- INSERM, IPPRITT, U1248, F-87000, Limoges, France; Univ. Limoges, IPPRITT, F-87000 Limoges, France
|
21
|
Schneider MV, Griffin PC, Tyagi S, Flannery M, Dayalan S, Gladman S, Watson-Haigh N, Bayer PE, Charleston M, Cooke I, Cook R, Edwards RJ, Edwards D, Gorse D, McConville M, Powell D, Wilkins MR, Lonie A. Establishing a distributed national research infrastructure providing bioinformatics support to life science researchers in Australia. Brief Bioinform 2019; 20:384-389. [PMID: 29106479 PMCID: PMC6433737 DOI: 10.1093/bib/bbx071] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Indexed: 11/15/2022]
Abstract
EMBL Australia Bioinformatics Resource (EMBL-ABR) is a developing national research infrastructure, providing bioinformatics resources and support to life science and biomedical researchers in Australia. EMBL-ABR comprises 10 geographically distributed national nodes with one coordinating hub, with current funding provided through Bioplatforms Australia and the University of Melbourne for its initial 2-year development phase. The EMBL-ABR mission is to: (1) increase Australia's capacity in bioinformatics and data sciences; (2) contribute to the development of training in bioinformatics skills; (3) showcase Australian data sets at an international level; and (4) enable engagement in international programs. The activities of EMBL-ABR are focussed in six key areas, aligning with comparable international initiatives such as ELIXIR, CyVerse and NIH Commons. These key areas (Tools, Data, Standards, Platforms, Compute and Training) are described in this article.
Affiliation(s)
| | - Philippa C Griffin
- EMBL Australia Bioinformatics Resource, EMBL-ABR Hub, Melbourne, Victoria, Australia
| | - Sonika Tyagi
- Australian Genome Research Facility, Bioinformatics, 1G royal Pde Parkville, Victoria, Australia
| | - Madison Flannery
- EMBL Australia Bioinformatics Resource, EMBL-ABR Hub, Melbourne, Victoria, Australia
| | - Saravanan Dayalan
- University of Melbourne Bio21 Molecular Science and Biotechnology Institute, Metabolomics Platform, Parkville Victoria, Australia
| | - Simon Gladman
- EMBL Australia Bioinformatics Resource, EMBL-ABR Hub, Melbourne, Victoria, Australia
| | | | - Philipp E Bayer
- University of Western Australia, School of Plant Biology, Crawley, Western Australia, Australia
| | - Michael Charleston
- University of Tasmania Menzies Institute for Medical Research, Hobart Tasmania, Australia
| | - Ira Cooke
- James Cook University, College of Public Health, Medical & Vet Sciences, Townsville, Queensland, Australia
| | - Rob Cook
- University of New South Wales, Sydney, Australia
| | | | - David Edwards
- University of Western Australia, School of Plant Biology, Crawley, Western Australia, Australia
| | - Dominique Gorse
- Queensland Facility for Advanced Bioinformatics, Brisbane, Queensland, Australia
| | - Malcolm McConville
- University of Melbourne Bio21 Molecular Science and Biotechnology Institute, Parkville Victoria, Australia
| | | | - Marc R Wilkins
- University of New South Wales, School of Biotechnology and Biomolecular Sciences, Sydney, Australia
| | - Andrew Lonie
- University of Melbourne Department of General Practice and Primary Health Care, Melbourne Bioinformatics, Carlton Victoria, Australia
|
22
|
Bouzaglo D, Chasida I, Ezra Tsur E. Distributed retrieval engine for the development of cloud-deployed biological databases. BioData Min 2018; 11:26. [PMID: 30459848 PMCID: PMC6233384 DOI: 10.1186/s13040-018-0185-5] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 08/10/2018] [Accepted: 10/12/2018] [Indexed: 11/10/2022]
Abstract
The integration of cloud resources with federated data retrieval has the potential to improve the maintenance, accessibility and performance of specialized databases in the biomedical field. However, such an integrative approach requires technical expertise in cloud computing, usage of a data retrieval engine and development of a unified data model that can encapsulate the heterogeneity of biological data. Here, a framework for the development of cloud-based specialized biological databases is proposed. It is powered by a distributed biodata retrieval system that is able to interface with different data formats and provides an integrated way to explore data. The proposed framework was implemented using Java as the development environment and MongoDB as the database manager. Syntactic analysis was based on the BSON, jsoup, Apache Commons and w3c.dom open libraries. The framework is available at http://nbel-lab.com and is distributed under the Creative Commons agreement.
Affiliation(s)
- David Bouzaglo
- Neuro-biomorphic Engineering Lab, Faculty of Engineering, Jerusalem College of Technology, Jerusalem, Israel
- Israel Chasida
- Neuro-biomorphic Engineering Lab, Faculty of Engineering, Jerusalem College of Technology, Jerusalem, Israel
- Elishai Ezra Tsur
- Neuro-biomorphic Engineering Lab, Faculty of Engineering, Jerusalem College of Technology, Jerusalem, Israel
|
23
|
Wang A, Lim H, Cheng SY, Xie L. ANTENNA, a Multi-Rank, Multi-Layered Recommender System for Inferring Reliable Drug-Gene-Disease Associations: Repurposing Diazoxide as a Targeted Anti-Cancer Therapy. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2018; 15:1960-1967. [PMID: 29993812 PMCID: PMC6139288 DOI: 10.1109/tcbb.2018.2812189] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Indexed: 06/08/2023]
Abstract
Existing drug discovery processes follow a reductionist model of "one-drug-one-gene-one-disease," which is inadequate to tackle complex diseases involving multiple malfunctioning genes. The availability of big omics data offers opportunities to transform the drug discovery process into a new paradigm of systems pharmacology that focuses on designing drugs to target molecular interaction networks instead of a single gene. Here, we develop a reliable multi-rank, multi-layered recommender system, ANTENNA, to mine large-scale chemical genomics and disease association data for the prediction of novel drug-gene-disease associations. ANTENNA integrates tREMAP, a novel tri-factorization-based, dual-regularized, weighted and imputed One Class Collaborative Filtering (OCCF) algorithm, with a statistical framework based on Random Walk with Restart, and assesses the reliability of specific predictions. In the benchmark, tREMAP clearly outperforms the single-rank OCCF. We apply ANTENNA to a real-world problem: repurposing old drugs for new clinical indications without effective treatments. We discover that the FDA-approved drug diazoxide can inhibit multiple kinase genes responsible for many diseases, including cancer, and kill triple negative breast cancer (TNBC) cells efficiently [Formula: see text]. TNBC is a deadly disease without effective targeted therapies. Our finding demonstrates the power of big data analytics in drug discovery and in developing a targeted therapy for TNBC.
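The Random Walk with Restart named in this abstract can be illustrated in a few lines. The sketch below is a generic version of the technique on a small undirected graph, not the ANTENNA implementation: the function name `random_walk_with_restart`, the restart probability of 0.3 and the four-node path graph are all assumptions made for illustration.

```python
def random_walk_with_restart(adjacency, seeds, restart=0.3, tol=1e-10, max_iter=1000):
    """Return steady-state visiting probabilities for a walk that restarts at `seeds`.

    adjacency: square list of lists of non-negative edge weights (hypothetical input).
    seeds: set of node indices the walker jumps back to with probability `restart`.
    """
    n = len(adjacency)
    # Column-normalise so that column j holds the transition probabilities out of node j.
    col_sums = [sum(adjacency[i][j] for i in range(n)) for j in range(n)]
    transition = [
        [adjacency[i][j] / col_sums[j] if col_sums[j] else 0.0 for j in range(n)]
        for i in range(n)
    ]
    # The restart distribution is uniform over the seed nodes.
    start = [1.0 / len(seeds) if i in seeds else 0.0 for i in range(n)]
    current = start[:]
    for _ in range(max_iter):
        updated = [
            (1 - restart) * sum(transition[i][j] * current[j] for j in range(n))
            + restart * start[i]
            for i in range(n)
        ]
        # Stop once the L1 change between iterations is negligible.
        if sum(abs(a - b) for a, b in zip(updated, current)) < tol:
            return updated
        current = updated
    return current


# Toy path graph 0 - 1 - 2 - 3 with node 0 as the only seed (illustrative data).
path_graph = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
proximity = random_walk_with_restart(path_graph, {0})
```

On this toy graph the resulting probabilities fall off with distance from the seed, which is the property that makes the walk usable as a proximity measure between a query node and the rest of a network.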
|
24
|
Wu Z, Li W, Liu G, Tang Y. Network-Based Methods for Prediction of Drug-Target Interactions. Front Pharmacol 2018; 9:1134. [PMID: 30356768 PMCID: PMC6189482 DOI: 10.3389/fphar.2018.01134] [Citation(s) in RCA: 116] [Impact Index Per Article: 19.3] [Received: 04/01/2018] [Accepted: 09/18/2018] [Indexed: 01/10/2023] Open
Abstract
Drug-target interaction (DTI) is the basis of drug discovery. However, it is time-consuming and costly to determine DTIs experimentally. Over the past decade, various computational methods have been proposed to predict potential DTIs with high efficiency and low cost. These methods can be roughly divided into several categories, such as molecular docking-based, pharmacophore-based, similarity-based, machine learning-based, and network-based methods. Among them, network-based methods, which do not rely on three-dimensional structures of targets or on negative samples, have shown great advantages over the others. In this article, we focus on network-based methods for DTI prediction, in particular our network-based inference (NBI) methods, which were derived from recommendation algorithms. We first introduce the methodologies and evaluation of network-based methods, and then emphasize their applications in a wide range of fields, including target prediction and elucidation of molecular mechanisms of therapeutic effects or safety problems. Finally, limitations and perspectives of network-based methods are discussed. In short, network-based methods provide alternative tools for studies in drug repurposing, new drug discovery, systems pharmacology and systems toxicology.
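The resource-spreading idea behind recommendation-derived methods such as NBI can be sketched as a two-step diffusion on a bipartite drug-target graph. The code below is a minimal illustration under stated assumptions: the function name `nbi_scores`, the toy interaction matrix and the uniform splitting rule are chosen here for illustration and are not taken from the authors' software.

```python
def nbi_scores(adjacency):
    """Two-step resource diffusion on a bipartite drug-target graph.

    adjacency[d][t] is 1 if drug d is known to interact with target t, else 0.
    Returns scores[d][t]: the resource reaching target t when drug d's known
    targets each start with one unit.
    """
    n_drugs = len(adjacency)
    n_targets = len(adjacency[0])
    drug_degree = [sum(row) for row in adjacency]
    target_degree = [sum(adjacency[d][t] for d in range(n_drugs)) for t in range(n_targets)]

    scores = []
    for d in range(n_drugs):
        # Step 1: every known target of drug d splits its unit evenly
        # among all drugs linked to that target.
        on_drugs = [
            sum(
                adjacency[other][t] * adjacency[d][t] / target_degree[t]
                for t in range(n_targets)
                if target_degree[t]
            )
            for other in range(n_drugs)
        ]
        # Step 2: every drug splits what it received evenly among its own targets.
        on_targets = [
            sum(
                adjacency[other][t] * on_drugs[other] / drug_degree[other]
                for other in range(n_drugs)
                if drug_degree[other]
            )
            for t in range(n_targets)
        ]
        scores.append(on_targets)
    return scores


# Hypothetical toy network: drug 0 hits targets 0 and 1, drug 1 hits 1 and 2, drug 2 hits 2.
known = [
    [1, 1, 0],
    [0, 1, 1],
    [0, 0, 1],
]
scores = nbi_scores(known)
# Candidate new targets for drug 0 are those it is not yet linked to, ranked by score.
candidates = sorted(
    (t for t in range(len(known[0])) if not known[0][t]),
    key=lambda t: -scores[0][t],
)
```

Only the known interaction matrix is needed as input, which matches the abstract's point that such methods require neither target structures nor negative samples.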
Affiliation(s)
- Yun Tang
- Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, Shanghai, China
|
25
|
De Bastiani MA, Pfaffenseller B, Klamt F. Master Regulators Connectivity Map: A Transcription Factors-Centered Approach to Drug Repositioning. Front Pharmacol 2018; 9:697. [PMID: 30034338 PMCID: PMC6043797 DOI: 10.3389/fphar.2018.00697] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.3] [Received: 03/23/2018] [Accepted: 06/08/2018] [Indexed: 01/09/2023] Open
Abstract
Drug discovery is a very expensive and time-consuming endeavor. Fortunately, recent omics technologies and Systems Biology approaches have introduced new tools for this task, facilitating the repurposing of known drugs for new therapeutic indications using gene expression data and bioinformatics. The inherent role of transcription factors in modulating gene expression makes them strong candidates for master regulators of phenotypic transitions. However, the expression of a transcription factor usually does not reflect changes in its activity, owing to post-transcriptional modifications and other complications. High-throughput transcriptomic data may therefore be employed to infer transcription factor-target interactions and to assess transcription factor activity through co-expression networks, which can then be used to search for drugs capable of reverting the gene expression profile of pathological phenotypes, following the connectivity map paradigm. Following this idea, we argue that a module-oriented connectivity map approach using transcription factor-centered networks would aid the search for new repositioning candidates. Through a brief case study, we explored this idea in bipolar disorder, retrieving known drugs used in usual clinical practice as well as new candidates with potential therapeutic application in this disease. Indeed, the results of the case study indicate how promising our approach may be for drug repositioning.
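The core of the connectivity map paradigm, ranking drugs by how strongly they reverse a disease expression signature, can be sketched with a simple signed-overlap score. This is a deliberately simplified stand-in and not the statistic used by the authors or by the Connectivity Map itself; the function names, gene labels and drug names below are hypothetical.

```python
def reversal_score(disease_up, disease_down, drug_up, drug_down):
    """Score in [-1, 1]: positive when the drug moves disease genes the opposite way."""
    # Genes the drug pushes against the disease direction.
    reversed_genes = len(disease_up & drug_down) + len(disease_down & drug_up)
    # Genes the drug pushes in the same direction as the disease.
    mimicked_genes = len(disease_up & drug_up) + len(disease_down & drug_down)
    total = len(disease_up) + len(disease_down)
    return (reversed_genes - mimicked_genes) / total if total else 0.0


def rank_drugs(disease_up, disease_down, drug_signatures):
    """Return drug names ordered from strongest reversal to strongest mimicry."""
    scored = [
        (reversal_score(disease_up, disease_down, up, down), name)
        for name, (up, down) in drug_signatures.items()
    ]
    # Sort by descending score, breaking ties alphabetically for a stable order.
    return [name for _, name in sorted(scored, key=lambda pair: (-pair[0], pair[1]))]


# Hypothetical module: genes attributed to one transcription factor's regulatory unit.
module_up = {"GENE_A", "GENE_B"}
module_down = {"GENE_C"}
signatures = {
    "drug_x": ({"GENE_C"}, {"GENE_A", "GENE_B"}),  # fully reverses the module
    "drug_y": ({"GENE_A"}, {"GENE_C"}),            # partly mimics the module
    "drug_z": (set(), {"GENE_A"}),                 # partly reverses the module
}
ranking = rank_drugs(module_up, module_down, signatures)
```

In a module-oriented setting, the disease gene sets would be restricted to the targets of one candidate master regulator, so each query asks whether a drug reverses that regulator's module rather than the whole transcriptome.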
Affiliation(s)
- Marco A De Bastiani
- Laboratory of Cellular Biochemistry, Department of Biochemistry, Federal University of Rio Grande do Sul, Porto Alegre, Brazil; National Institute of Science and Technology for Translational Medicine, Porto Alegre, Brazil
- Bianca Pfaffenseller
- Laboratory of Cellular Biochemistry, Department of Biochemistry, Federal University of Rio Grande do Sul, Porto Alegre, Brazil; Laboratory of Molecular Psychiatry, Clinicas Hospital of Porto Alegre, Federal University of Rio Grande do Sul, Porto Alegre, Brazil
- Fabio Klamt
- Laboratory of Cellular Biochemistry, Department of Biochemistry, Federal University of Rio Grande do Sul, Porto Alegre, Brazil; National Institute of Science and Technology for Translational Medicine, Porto Alegre, Brazil
|
26
|
Abstract
Complex systems theory is concerned with identifying and characterizing common design elements that are observed across diverse natural, technological and social complex systems. Systems biology, a more holistic approach to study molecules and cells in biology, has advanced rapidly in the past two decades. However, not much appreciation has been granted to the realization that the human cell is an exemplary complex system. Here, I outline general design principles identified in many complex systems, and then describe the human cell as a prototypical complex system. Considering concepts of complex systems theory in systems biology can illuminate our overall understanding of normal cell physiology and the alterations that lead to human disease.
Affiliation(s)
- Avi Ma'ayan
- BD2K-LINCS Data Coordination and Integration Center; Mount Sinai Center for Bioinformatics; Department of Pharmacological Sciences, Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place, Box 1603, New York, NY 10029, USA
|
27
|
Lachmann A, Torre D, Keenan AB, Jagodnik KM, Lee HJ, Wang L, Silverstein MC, Ma'ayan A. Massive mining of publicly available RNA-seq data from human and mouse. Nat Commun 2018; 9:1366. [PMID: 29636450 PMCID: PMC5893633 DOI: 10.1038/s41467-018-03751-6] [Citation(s) in RCA: 404] [Impact Index Per Article: 67.3] [Received: 01/14/2018] [Accepted: 03/08/2018] [Indexed: 02/06/2023] Open
Abstract
RNA sequencing (RNA-seq) is the leading technology for genome-wide transcript quantification. However, publicly available RNA-seq data is currently provided mostly in raw form, a significant barrier for global and integrative retrospective analyses. ARCHS4 is a web resource that makes the majority of published RNA-seq data from human and mouse available at the gene and transcript levels. For developing ARCHS4, available FASTQ files from RNA-seq experiments from the Gene Expression Omnibus (GEO) were aligned using a cloud-based infrastructure. In total, 187,946 samples are accessible through ARCHS4: 103,083 from mouse and 84,863 from human. Additionally, the ARCHS4 web interface provides intuitive exploration of the processed data through querying tools, interactive visualization, and gene pages that provide average expression across cell lines and tissues, top co-expressed genes for each gene, and predicted biological functions and protein-protein interactions for each gene based on prior knowledge combined with co-expression.
Affiliation(s)
- Alexander Lachmann
- Department of Pharmacological Sciences; Mount Sinai Center for Bioinformatics; Big Data to Knowledge, Library of Integrated Network-based Cellular Signatures, Data Coordination and Integration Center (BD2K-LINCS DCIC); Knowledge Management Center for Illuminating the Druggable Genome (KMC-IDG), Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place, Box 1603, New York, NY, 10029, USA
- Denis Torre
- Department of Pharmacological Sciences; Mount Sinai Center for Bioinformatics; Big Data to Knowledge, Library of Integrated Network-based Cellular Signatures, Data Coordination and Integration Center (BD2K-LINCS DCIC); Knowledge Management Center for Illuminating the Druggable Genome (KMC-IDG), Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place, Box 1603, New York, NY, 10029, USA
- Alexandra B Keenan
- Department of Pharmacological Sciences; Mount Sinai Center for Bioinformatics; Big Data to Knowledge, Library of Integrated Network-based Cellular Signatures, Data Coordination and Integration Center (BD2K-LINCS DCIC); Knowledge Management Center for Illuminating the Druggable Genome (KMC-IDG), Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place, Box 1603, New York, NY, 10029, USA
- Kathleen M Jagodnik
- Department of Pharmacological Sciences; Mount Sinai Center for Bioinformatics; Big Data to Knowledge, Library of Integrated Network-based Cellular Signatures, Data Coordination and Integration Center (BD2K-LINCS DCIC); Knowledge Management Center for Illuminating the Druggable Genome (KMC-IDG), Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place, Box 1603, New York, NY, 10029, USA
- Hoyjin J Lee
- Department of Pharmacological Sciences; Mount Sinai Center for Bioinformatics; Big Data to Knowledge, Library of Integrated Network-based Cellular Signatures, Data Coordination and Integration Center (BD2K-LINCS DCIC); Knowledge Management Center for Illuminating the Druggable Genome (KMC-IDG), Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place, Box 1603, New York, NY, 10029, USA
- Lily Wang
- Department of Pharmacological Sciences; Mount Sinai Center for Bioinformatics; Big Data to Knowledge, Library of Integrated Network-based Cellular Signatures, Data Coordination and Integration Center (BD2K-LINCS DCIC); Knowledge Management Center for Illuminating the Druggable Genome (KMC-IDG), Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place, Box 1603, New York, NY, 10029, USA
- Moshe C Silverstein
- Department of Pharmacological Sciences; Mount Sinai Center for Bioinformatics; Big Data to Knowledge, Library of Integrated Network-based Cellular Signatures, Data Coordination and Integration Center (BD2K-LINCS DCIC); Knowledge Management Center for Illuminating the Druggable Genome (KMC-IDG), Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place, Box 1603, New York, NY, 10029, USA
- Avi Ma'ayan
- Department of Pharmacological Sciences; Mount Sinai Center for Bioinformatics; Big Data to Knowledge, Library of Integrated Network-based Cellular Signatures, Data Coordination and Integration Center (BD2K-LINCS DCIC); Knowledge Management Center for Illuminating the Druggable Genome (KMC-IDG), Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place, Box 1603, New York, NY, 10029, USA.
|
28
|
The Big Picture: Systems Biology Approach to Antiepileptic Drug Discovery. Epilepsy Curr 2017; 17:232-234. [PMID: 29225529 DOI: 10.5698/1535-7597.17.4.232] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/18/2022] Open
|
29
|
Systematic discovery of drug action mechanisms by an integrated chemical genomics approach: identification of functional disparities between azacytidine and decitabine. Oncotarget 2017; 7:27363-78. [PMID: 27036028 PMCID: PMC5053656 DOI: 10.18632/oncotarget.8455] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.1] [Received: 12/05/2015] [Accepted: 03/16/2016] [Indexed: 01/22/2023] Open
Abstract
Polypharmacology (the ability of a drug to affect more than one molecular target) is considered a basic property of many therapeutic small molecules. Herein, we used a chemical genomics approach to systematically analyze polypharmacology by integrating several analytical tools, including LINCS (Library of Integrated Network-based Cellular Signatures), STITCH (Search Tool for Interactions of Chemicals), and WebGestalt (WEB-based GEne SeT AnaLysis Toolkit). We applied this approach to identify functional disparities between two cytidine nucleoside analogs: azacytidine (AZA) and decitabine (DAC). AZA and DAC are structurally and mechanistically similar DNA-hypomethylating agents. However, their metabolism and destinations in cells are distinct. Due to their differential incorporation into RNA or DNA, functional disparities between AZA and DAC are expected. Indeed, different cytotoxicities of AZA and DAC toward human colorectal cancer cell lines were observed, with cells being more sensitive to AZA. Based on a polypharmacological analysis, we found that AZA transiently blocked protein synthesis and induced an acute apoptotic response that was antagonized by concurrently induced cytoprotective autophagy. In contrast, DAC caused cell cycle arrest at the G2/M phase associated with p53 induction. Therefore, our study discriminated functional disparities between AZA and DAC, and also demonstrated the value of this chemical genomics approach, which can be applied to discover novel drug action mechanisms.
|
30
|
Kang M, Park J, Kim DC, Biswas AK, Liu C, Gao J. Multi-Block Bipartite Graph for Integrative Genomic Analysis. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2017; 14:1350-1358. [PMID: 27429442 DOI: 10.1109/tcbb.2016.2591521] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 06/06/2023]
Abstract
Human diseases involve a sequence of complex interactions between multiple biological processes. In particular, multiple types of genomic data, such as Single Nucleotide Polymorphism (SNP), Copy Number Variation (CNV), and DNA Methylation (DM), and their interactions simultaneously play an important role in human diseases. However, despite the widely known complex multi-layer biological processes and the increased availability of heterogeneous genomic data, most research has considered only a single type of genomic data. Furthermore, recent integrative studies of multiple genomic data types have also faced difficulties due to high dimensionality and complexity, especially when considering intra- and inter-block interactions. In this paper, we introduce a novel multi-block bipartite graph and its inference methods, MB2I and sMB2I, for integrative genomic study. The proposed methods not only integrate multiple genomic data types but also incorporate intra- and inter-block interactions by using a multi-block bipartite graph. In addition, the methods can be used to predict quantitative traits (e.g., gene expression, survival time) from the multi-block genomic data. The performance was assessed by simulation experiments that implement practical situations. We also applied the method to human brain data from psychiatric disorders. The experimental results were analyzed by maximum edge biclique and biclustering, and biological findings were discussed.
|
31
|
El-Hachem N, Ba-Alawi W, Smith I, Mer AS, Haibe-Kains B. Integrative cancer pharmacogenomics to establish drug mechanism of action: drug repurposing. Pharmacogenomics 2017; 18:1469-1472. [PMID: 29057710 DOI: 10.2217/pgs-2017-0132] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Indexed: 12/21/2022] Open
Affiliation(s)
- Nehme El-Hachem
- Princess Margaret Cancer Centre, University Health Network, Toronto, Ontario, Canada
- Wail Ba-Alawi
- Princess Margaret Cancer Centre, University Health Network, Toronto, Ontario, Canada; Department of Medical Biophysics, University of Toronto, Toronto, Ontario, Canada
- Ian Smith
- Princess Margaret Cancer Centre, University Health Network, Toronto, Ontario, Canada; Department of Medical Biophysics, University of Toronto, Toronto, Ontario, Canada
- Arvind Singh Mer
- Princess Margaret Cancer Centre, University Health Network, Toronto, Ontario, Canada; Department of Medical Biophysics, University of Toronto, Toronto, Ontario, Canada
- Benjamin Haibe-Kains
- Princess Margaret Cancer Centre, University Health Network, Toronto, Ontario, Canada; Department of Medical Biophysics, University of Toronto, Toronto, Ontario, Canada; Department of Computer Science, University of Toronto, Toronto, Ontario, Canada; Ontario Institute of Cancer Research, Toronto, Ontario, Canada
|
32
|
Deng Z, Tu W, Deng Z, Hu QN. PhID: An Open-Access Integrated Pharmacology Interactions Database for Drugs, Targets, Diseases, Genes, Side-Effects, and Pathways. J Chem Inf Model 2017; 57:2395-2400. [PMID: 28906116 DOI: 10.1021/acs.jcim.7b00175] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.1] [Indexed: 11/29/2022]
Abstract
Current network pharmacology studies face a bottleneck: a large amount of public data is scattered across different databases, and there is no open-access, consolidated platform that integrates this information for systemic research. To address this issue, we have developed PhID, an integrated pharmacology database that integrates >400 000 pharmacology elements (drug, target, disease, gene, side-effect, and pathway) and >200 000 element interactions from a number of public databases. PhID has three major applications: (1) assisting scientists in searching through the overwhelming amount of pharmacology element interaction data by names, public IDs, molecule structures, or molecular substructures; (2) helping to visualize pharmacology elements and their interactions with a web-based network graph; and (3) providing prediction of drug-target interactions through two modules, PreDPI-ki and FIM, with which users can predict drug-target interactions of PhID entities or of drug-target pairs of their own interest. To provide a systems-level understanding of drug action and disease complexity, PhID was established as a network pharmacology tool comprising a data layer, a visualization layer, and a prediction model layer, presenting information untapped by current databases.
Affiliation(s)
- Zhe Deng
- Key Laboratory of Combinatorial Biosynthesis and Drug Discovery (Wuhan University), Ministry of Education, and Wuhan University School of Pharmaceutical Sciences, Wuhan, 430071, China
- Weizhong Tu
- Key Laboratory of Combinatorial Biosynthesis and Drug Discovery (Wuhan University), Ministry of Education, and Wuhan University School of Pharmaceutical Sciences, Wuhan, 430071, China
- Zixin Deng
- Key Laboratory of Combinatorial Biosynthesis and Drug Discovery (Wuhan University), Ministry of Education, and Wuhan University School of Pharmaceutical Sciences, Wuhan, 430071, China
- Qian-Nan Hu
- Key Laboratory of Combinatorial Biosynthesis and Drug Discovery (Wuhan University), Ministry of Education, and Wuhan University School of Pharmaceutical Sciences, Wuhan, 430071, China; Tianjin Institute of Industrial Biotechnology, Chinese Academy of Sciences, 300308, Tianjin, China
|
33
|
Gomez-Cabrero D, Marabita F, Tarazona S, Cano I, Roca J, Conesa A, Sabatier P, Tegnér J. Guidelines for Developing Successful Short Advanced Courses in Systems Medicine and Systems Biology. Cell Syst 2017; 5:168-175. [PMID: 28843483 DOI: 10.1016/j.cels.2017.05.013] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Received: 10/05/2016] [Revised: 02/21/2017] [Accepted: 05/31/2017] [Indexed: 12/15/2022]
Abstract
Systems medicine and systems biology have inherent educational challenges. These have largely been addressed either by providing new masters programs or by redesigning undergraduate programs. In contrast, short courses can respond to a different need: they can provide condensed updates for professionals across academia, the clinic, and industry. These courses have received less attention. Here, we share our experiences in developing and providing such courses to current and future leaders in systems biology and systems medicine. We present guidelines for how to reproduce our courses, and we offer suggestions for how to select students who will nurture an interdisciplinary learning environment and thrive there.
Affiliation(s)
- David Gomez-Cabrero
- Unit of Computational Medicine, Department of Medicine, Karolinska Institutet, 171 77 Stockholm, Sweden; Center for Molecular Medicine, Karolinska Institutet, 171 77 Stockholm, Sweden; Unit of Clinical Epidemiology, Department of Medicine, Karolinska University Hospital, L8, 17176 Stockholm, Sweden; Science for Life Laboratory, 17121 Solna, Sweden; Mucosal and Salivary Biology Division, King's College London Dental Institute, London SE1 9RT, UK.
- Francesco Marabita
- Unit of Computational Medicine, Department of Medicine, Karolinska Institutet, 171 77 Stockholm, Sweden; Center for Molecular Medicine, Karolinska Institutet, 171 77 Stockholm, Sweden; Unit of Clinical Epidemiology, Department of Medicine, Karolinska University Hospital, L8, 17176 Stockholm, Sweden; Science for Life Laboratory, 17121 Solna, Sweden
- Sonia Tarazona
- Centro de Investigacion Principe Felipe, 46012 Valencia, Spain; Department of Applied Statistics, Operations Research and Quality, Universitat Politècnica de València, Camí de Vera, 46022 Valencia, Spain
- Isaac Cano
- Hospital Clinic de Barcelona, Institut d'Investigacions Biomèdiques August Pi i Sunyer (IDIBAPS), Universitat de Barcelona, 08007 Barcelona, Spain; Center for Biomedical Network Research in Respiratory Diseases (CIBERES), 28029 Madrid, Spain
- Josep Roca
- Hospital Clinic de Barcelona, Institut d'Investigacions Biomèdiques August Pi i Sunyer (IDIBAPS), Universitat de Barcelona, 08007 Barcelona, Spain; Center for Biomedical Network Research in Respiratory Diseases (CIBERES), 28029 Madrid, Spain
- Ana Conesa
- Centro de Investigacion Principe Felipe, 46012 Valencia, Spain; Microbiology and Cell Science Department, Institute for Food and Agricultural Sciences, University of Florida, Gainesville, FL 32603, USA
- Philippe Sabatier
- TIMC-IMAG Laboratory, UMR 5525, Centre National de la Recherche Scientifique, Vetagro Sup, Université Grenoble-Alpes, 38400 Saint-Martin-d'Hères, France
- Jesper Tegnér
- Unit of Computational Medicine, Department of Medicine, Karolinska Institutet, 171 77 Stockholm, Sweden; Center for Molecular Medicine, Karolinska Institutet, 171 77 Stockholm, Sweden; Unit of Clinical Epidemiology, Department of Medicine, Karolinska University Hospital, L8, 17176 Stockholm, Sweden; Science for Life Laboratory, 17121 Solna, Sweden; Biological and Environmental Sciences and Engineering Division (BESE), Computer, Electrical and Mathematical Sciences and Engineering Division (CEMSE), King Abdullah University of Science and Technology (KAUST), Thuwal 23955-6900, Kingdom of Saudi Arabia.
|
34
|
Liu TP, Hong YH, Yang PM. In silico and in vitro identification of inhibitory activities of sorafenib on histone deacetylases in hepatocellular carcinoma cells. Oncotarget 2017; 8:86168-86180. [PMID: 29156785 PMCID: PMC5689675 DOI: 10.18632/oncotarget.21030] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.3] [Received: 05/07/2017] [Accepted: 08/02/2017] [Indexed: 12/18/2022] Open
Abstract
Although sorafenib has been approved for treating hepatocellular carcinoma (HCC), clinical results are not satisfactory. Polypharmacology (one drug with multiple molecular targets) is viewed as an attractive strategy for identifying novel mechanisms of a drug and then rationally designing more effective next-generation therapeutic agents. In this study, a polypharmacological study of sorafenib was performed by mining the next-generation Connectivity Map (CMap) database, CLUE (https://clue.io/). We found that sorafenib may act as a histone deacetylase (HDAC) inhibitor based on similar gene expression profiles. In vitro experimental analyses demonstrated that sorafenib indirectly inhibited HDAC activity in both sorafenib-sensitive and -resistant HCC cells. A cancer genomics analysis using the cBioPortal online tool showed frequent upregulation of HDAC mRNAs. Furthermore, HCC patients with higher expression of HDAC1 and HDAC2 had worse overall survival. Taken together, our study suggests that inhibition of HDAC by sorafenib may provide clinical benefits against HCC, and that enhancing the HDAC-inhibitory activity of sorafenib may improve its therapeutic efficacy. In addition, our study provides a novel strategy to study polypharmacology.
Affiliation(s)
- Tsang-Pai Liu
- PhD Program for Cancer Biology and Drug Discovery, College of Medical Science and Technology, Taipei Medical University and Academia Sinica, Taipei, Taiwan; Department of Surgery, Mackay Memorial Hospital, Taipei, Taiwan; Mackay Junior College of Medicine, Nursing and Management, New Taipei City, Taiwan; Department of Medicine, Mackay Medical College, New Taipei City, Taiwan; Liver Medical Center, Mackay Memorial Hospital, Taipei, Taiwan
- Yi-Han Hong
- Department of Surgery, Mackay Memorial Hospital, Taipei, Taiwan
- Pei-Ming Yang
- PhD Program for Cancer Biology and Drug Discovery, College of Medical Science and Technology, Taipei Medical University and Academia Sinica, Taipei, Taiwan; Graduate Institute of Cancer Biology and Drug Discovery, College of Medical Science and Technology, Taipei Medical University, Taipei, Taiwan
|
35
|
Kiyosawa N, Manabe S. Data-intensive drug development in the information age: applications of Systems Biology/Pharmacology/Toxicology. J Toxicol Sci 2017; 41:SP15-SP25. [PMID: 28003636 DOI: 10.2131/jts.41.sp15] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Indexed: 11/02/2022]
Abstract
Pharmaceutical companies continuously face challenges to deliver new drugs with true medical value. R&D productivity of drug development projects depends on 1) the value of the drug concept and 2) data and in-depth knowledge that are used rationally to evaluate the drug concept's validity. A model-based data-intensive drug development approach is a key competitive factor used by innovative pharmaceutical companies to reduce information bias and rationally demonstrate the value of drug concepts. Owing to the accumulation of publicly available biomedical information, our understanding of the pathophysiological mechanisms of diseases has developed considerably; it is the basis for identifying the right drug target and creating a drug concept with true medical value. Our understanding of the pathophysiological mechanisms of disease animal models can also be improved; it can thus support rational extrapolation of animal experiment results to clinical settings. The Systems Biology approach, which leverages publicly available transcriptome data, is useful for these purposes. Furthermore, applying Systems Pharmacology enables dynamic simulation of drug responses, from which key research questions to be addressed in the subsequent studies can be adequately informed. Application of Systems Biology/Pharmacology to toxicology research, namely Systems Toxicology, should considerably improve the predictability of drug-induced toxicities in clinical situations that are difficult to predict from conventional preclinical toxicology studies. Systems Biology/Pharmacology/Toxicology models can be continuously improved using iterative learn-confirm processes throughout preclinical and clinical drug discovery and development processes. Successful implementation of data-intensive drug development approaches requires cultivation of an adequate R&D culture to appreciate this approach.
Affiliation(s)
- Naoki Kiyosawa
- Translational Medicine & Clinical Pharmacology Department, Daiichi Sankyo Co. Ltd
|
36
|
Mechanism-based biomarker discovery. Drug Discov Today 2017; 22:1209-1215. [DOI: 10.1016/j.drudis.2017.04.013] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.3] [Received: 12/30/2016] [Revised: 04/12/2017] [Accepted: 04/20/2017] [Indexed: 11/22/2022]
|
37
|
Ruggles KV, Krug K, Wang X, Clauser KR, Wang J, Payne SH, Fenyö D, Zhang B, Mani DR. Methods, Tools and Current Perspectives in Proteogenomics. Mol Cell Proteomics 2017; 16:959-981. [PMID: 28456751 DOI: 10.1074/mcp.mr117.000024] [Citation(s) in RCA: 95] [Impact Index Per Article: 13.6] [Received: 04/13/2017] [Indexed: 12/20/2022] Open
Abstract
With combined technological advancements in high-throughput next-generation sequencing and deep mass spectrometry-based proteomics, proteogenomics, i.e. the integrative analysis of proteomic and genomic data, has emerged as a new research field. Early efforts in the field focused on improving protein identification using sample-specific genomic and transcriptomic sequencing data. More recently, integrative analysis of quantitative measurements from genomic and proteomic studies has yielded novel insights into gene expression regulation, cell signaling, and disease. Many methods and tools have been developed or adapted to enable an array of integrative proteogenomic approaches. In this article, we systematically classify published methods and tools into four major categories: (1) sequence-centric proteogenomics; (2) analysis of proteogenomic relationships; (3) integrative modeling of proteogenomic data; and (4) data sharing and visualization. We provide a comprehensive review of methods and available tools in each category and highlight their typical applications.
Affiliation(s)
- Kelly V Ruggles
- From the ‡Department of Medicine, New York University School of Medicine, New York, New York 10016
| | - Karsten Krug
- §The Broad Institute of MIT and Harvard, Cambridge, Massachusetts 02142
| | - Xiaojing Wang
- ¶Lester and Sue Smith Breast Center, Baylor College of Medicine, Houston, Texas 77030.,‖Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas 77030
| | - Karl R Clauser
- §The Broad Institute of MIT and Harvard, Cambridge, Massachusetts 02142
| | - Jing Wang
- ¶Lester and Sue Smith Breast Center, Baylor College of Medicine, Houston, Texas 77030.,‖Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas 77030
| | - Samuel H Payne
- **Biological Sciences Division, Pacific Northwest National Laboratory, Richland, Washington 99354
| | - David Fenyö
- ‡‡Department of Biochemistry and Molecular Pharmacology, New York University School of Medicine, New York, New York 10016; .,§§Institute for Systems Genetics, New York University School of Medicine, New York, New York 10016
| | - Bing Zhang
- ¶Lester and Sue Smith Breast Center, Baylor College of Medicine, Houston, Texas 77030; .,‖Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas 77030
| | - D R Mani
- §The Broad Institute of MIT and Harvard, Cambridge, Massachusetts 02142;
| |
|
38
|
El-Hachem N, Gendoo DMA, Ghoraie LS, Safikhani Z, Smirnov P, Chung C, Deng K, Fang A, Birkwood E, Ho C, Isserlin R, Bader GD, Goldenberg A, Haibe-Kains B. Integrative Cancer Pharmacogenomics to Infer Large-Scale Drug Taxonomy. Cancer Res 2017; 77:3057-3069. [PMID: 28314784 DOI: 10.1158/0008-5472.can-17-0096] [Citation(s) in RCA: 23] [Impact Index Per Article: 3.3] [Received: 01/17/2017] [Revised: 02/27/2017] [Accepted: 03/13/2017] [Indexed: 11/16/2022]
Abstract
Identification of drug targets and mechanism of action (MoA) for new and uncharacterized anticancer drugs is important for optimization of treatment efficacy. Current MoA prediction largely relies on prior information including side effects, therapeutic indication, and chemoinformatics. Such information is not transferable or applicable for newly identified, previously uncharacterized small molecules. Therefore, a shift in the paradigm of MoA predictions is necessary toward development of unbiased approaches that can elucidate drug relationships and efficiently classify new compounds with basic input data. We propose here a new integrative computational pharmacogenomic approach, referred to as Drug Network Fusion (DNF), to infer scalable drug taxonomies that rely only on basic drug characteristics toward elucidating drug-drug relationships. DNF is the first framework to integrate drug structural information, high-throughput drug perturbation, and drug sensitivity profiles, enabling drug classification of new experimental compounds with minimal prior information. DNF taxonomy succeeded in identifying pertinent and novel drug-drug relationships, making it suitable for investigating experimental drugs with potential new targets or MoA. The scalability of DNF facilitated identification of key drug relationships across different drug categories, providing a flexible tool for potential clinical applications in precision medicine. Our results support DNF as a valuable resource to the cancer research community by providing new hypotheses on compound MoA and potential insights for drug repurposing. Cancer Res; 77(11); 3057-69. ©2017 AACR.
Affiliation(s)
- Nehme El-Hachem
- Integrative Computational Systems Biology, Institut de Recherches Cliniques de Montréal, Montreal, Quebec, Canada.,Department of Biomedical Sciences. Université de Montréal, Montreal, Quebec, Canada
| | - Deena M A Gendoo
- Princess Margaret Cancer Centre, University Health Network, Toronto, Ontario, Canada.,Department of Medical Biophysics, University of Toronto, Toronto, Ontario, Canada
| | - Laleh Soltan Ghoraie
- Princess Margaret Cancer Centre, University Health Network, Toronto, Ontario, Canada.,Department of Medical Biophysics, University of Toronto, Toronto, Ontario, Canada
| | - Zhaleh Safikhani
- Princess Margaret Cancer Centre, University Health Network, Toronto, Ontario, Canada.,Department of Medical Biophysics, University of Toronto, Toronto, Ontario, Canada
| | - Petr Smirnov
- Princess Margaret Cancer Centre, University Health Network, Toronto, Ontario, Canada
| | - Christina Chung
- Department of Computer Science, University of Toronto, Toronto, Ontario, Canada
| | - Kenan Deng
- Department of Computer Science, University of Toronto, Toronto, Ontario, Canada
| | - Ailsa Fang
- Department of Computer Science, University of Toronto, Toronto, Ontario, Canada
| | - Erin Birkwood
- School of Computer Science, McGill University, Montreal, Quebec, Canada
| | - Chantal Ho
- Department of Computer Science, University of Toronto, Toronto, Ontario, Canada
| | - Ruth Isserlin
- Department of Computer Science, University of Toronto, Toronto, Ontario, Canada
| | - Gary D Bader
- Department of Computer Science, University of Toronto, Toronto, Ontario, Canada.,The Donnelly Centre, Toronto, Ontario, Canada.,The Lunenfeld-Tanenbaum Research Institute, Mount Sinai Hospital, Toronto, Ontario, Canada
| | - Anna Goldenberg
- Department of Computer Science, University of Toronto, Toronto, Ontario, Canada.,Hospital for Sick Children, Toronto, Ontario, Canada
| | - Benjamin Haibe-Kains
- Princess Margaret Cancer Centre, University Health Network, Toronto, Ontario, Canada. .,Department of Medical Biophysics, University of Toronto, Toronto, Ontario, Canada.,Department of Computer Science, University of Toronto, Toronto, Ontario, Canada.,Ontario Institute of Cancer Research, Toronto, Ontario, Canada
| |
|
39
|
Nim HT, Furtado MB, Ramialison M, Boyd SE. Combinatorial Ranking of Gene Sets to Predict Disease Relapse: The Retinoic Acid Pathway in Early Prostate Cancer. Front Oncol 2017; 7:30. [PMID: 28361034 PMCID: PMC5350134 DOI: 10.3389/fonc.2017.00030] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Received: 12/09/2016] [Accepted: 02/20/2017] [Indexed: 11/24/2022] Open
Abstract
Background Quantitative high-throughput data deposited in consortia such as International Cancer Genome Consortium (ICGC) and The Cancer Genome Atlas (TCGA) present opportunities and challenges for computational analyses. Methods We present a computational strategy to systematically rank and investigate a large number (2^10–2^20) of clinically testable gene sets, using combinatorial gene subset generation and disease-free survival (DFS) analyses. This approach integrates protein–protein interaction networks, gene expression, DNA methylation, and copy number data, in association with DFS profiles from patient clinical records. Results As a case study, we applied this pipeline to systematically analyze the role of ALDH1A2 in prostate cancer (PCa). We have previously found this gene to have multiple roles in disease and homeostasis, and here we investigate the role of the associated ALDH1A2 gene/protein networks in PCa, using our methodology in combination with PCa patient clinical profiles from ICGC and TCGA databases. Relationships between gene signatures and relapse were analyzed using Kaplan–Meier (KM) log-rank analysis and multivariable Cox regression. Relative expression versus pooled mean from diploid population was used for z-statistics calculation. Gene/protein interaction network analyses generated 11 core genes associated with ALDH1A2; combinatorial ranking of the power set of these core genes identified two gene sets (out of 2^11 − 1 = 2,047 combinations) with significant correlation with disease relapse (KM log rank p < 0.05). For the more significant of these two sets, referred to as the optimal gene set (OGS), patients have median survival 62.7 months with OGS alterations compared to >150 months without OGS alterations (p = 0.0248, hazard ratio = 2.213, 95% confidence interval = 1.1–4.098).
Two genes comprising OGS (CYP26A1 and RDH10) are strongly associated with ALDH1A2 in the retinoic acid (RA) pathways, suggesting a major role of RA signaling in early PCa progression. Our pipeline complements human expertise in the search for prognostic biomarkers in large-scale datasets.
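The count of 2,047 combinations quoted in the abstract is the number of non-empty subsets of an 11-element set. The following is a minimal sketch of that enumeration only, not the authors' pipeline; the gene labels are placeholders (the abstract names only ALDH1A2, CYP26A1 and RDH10), and the helper name `nonempty_subsets` is hypothetical.

```python
from itertools import combinations


def nonempty_subsets(genes):
    """Yield every non-empty subset of `genes` as a tuple."""
    for size in range(1, len(genes) + 1):
        yield from combinations(genes, size)


# Placeholder labels standing in for the 11 core genes (assumption:
# the abstract does not list all of them).
core_genes = [f"gene_{i}" for i in range(1, 12)]

subsets = list(nonempty_subsets(core_genes))
n_subsets = len(subsets)
print(n_subsets)  # 2047, i.e. 2**11 - 1
```

In a ranking workflow of the kind the abstract describes, each tuple in `subsets` would be scored against survival data; that scoring step is outside the scope of this sketch.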
Affiliation(s)
- Hieu T Nim
- Faculty of Information Technology, Monash University, Melbourne, VIC, Australia; Australian Regenerative Medicine Institute, Monash University, Melbourne, VIC, Australia
| | | | - Mirana Ramialison
- Australian Regenerative Medicine Institute, Monash University, Melbourne, VIC, Australia; EMBL - Australia Collaborating Group, Systems Biology Institute Australia, Monash University, Melbourne, VIC, Australia
| | - Sarah E Boyd
- Faculty of Information Technology, Monash University , Melbourne, VIC , Australia
| |
|
40
|
Edwards LM. Metabolic systems biology: a brief primer. J Physiol 2017; 595:2849-2855. [PMID: 28028815 DOI: 10.1113/jp272275] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.4] [Received: 08/17/2016] [Accepted: 11/23/2016] [Indexed: 12/25/2022] Open
Abstract
In the early to mid-20th century, reductionism as a concept in biology was challenged by key thinkers, including Ludwig von Bertalanffy. He proposed that living organisms were specific examples of complex systems and, as such, they should display characteristics including hierarchical organisation and emergent behaviour. Yet the true study of complete biological systems (for example, metabolism) was not possible until technological advances that occurred 60 years later. Technology now exists that permits the measurement of complete levels of the biological hierarchy, for example the genome and transcriptome. The complexity and scale of these data require computational models for their interpretation. The combination of these - systems thinking, high-dimensional data and computation - defines systems biology, typically accompanied by some notion of iterative model refinement. Only sequencing-based technologies, however, offer full coverage. Other 'omics' platforms trade coverage for sensitivity, although the densely connected nature of biological networks suggests that full coverage may not be necessary. Systems biology models are often characterised as either 'bottom-up' (mechanistic) or 'top-down' (statistical). This distinction can mislead, as all models rely on data and all are, to some degree, 'middle-out'. Systems biology has matured as a discipline, and its methods are commonplace in many laboratories. However, many challenges remain, especially those related to large-scale data integration.
Affiliation(s)
- Lindsay M Edwards
- Respiratory Data Sciences Group, GlaxoSmithKline Medicines Research, Stevenage, Hertfordshire, UK
| |
|
41
|
Kim SW, Md Hasanuzzaman, Cho M, Kim NH, Choi HY, Han JW, Park HJ, Oh JW, Shin JG. Role of 14-3-3 sigma in over-expression of P-gp by rifampin and paclitaxel stimulation through interaction with PXR. Cell Signal 2017; 31:124-134. [PMID: 28077325 DOI: 10.1016/j.cellsig.2017.01.001] [Citation(s) in RCA: 23] [Impact Index Per Article: 3.3] [Received: 07/20/2016] [Revised: 01/01/2017] [Accepted: 01/03/2017] [Indexed: 01/04/2023]
Abstract
In this study, we examined the role of 14-3-3σ in activating the CK2-Hsp90β-PXR-MDR1 pathway in rifampin- and paclitaxel-treated LS174T cells and in an in vivo LS174T cell-xenografted nude mouse model. In several in vitro and in vivo experiments, rifampin and paclitaxel were found to stimulate the CK2-Hsp90β-PXR-MDR1 pathway. Of the proteins in this pathway, pregnane X receptor (PXR) is a representative transcription factor of multidrug resistance protein 1 (MDR1). We constructed FLAG-PXR-LS174T stable cell lines and discovered 22 proteins that interacted with PXR on rifampin treatment. Among them, Hsp90β and 14-3-3σ were isolated for further study. Confocal microscopy showed that both proteins localized to the cytoplasm on rifampin treatment. In contrast, a cell fractionation assay showed that PXR localized to the nucleus after rifampin and paclitaxel treatment. In Western blot analysis, rifampin did not influence the expression of 14-3-3σ protein. Transient transfection of 14-3-3σ into LS174T cells induced overexpression of PXR; however, P-glycoprotein (P-gp) expression did not change significantly. P-gp overexpression was induced only when 14-3-3σ-transfected LS174T cells were treated with rifampin and paclitaxel, whereas 14-3-3σ inhibition by the nonpeptidic inhibitor BV02 and by 14-3-3σ siRNA reduced rifampin-induced PXR and P-gp expression. Cell survival rates following paclitaxel and vincristine treatment were much higher in 14-3-3σ-LS174T stable cell lines than in LS174T cells. These data indicate that 14-3-3σ contributes to P-gp overexpression through interaction with PXR under rifampin and paclitaxel treatment.
Affiliation(s)
- So Won Kim
- Department of Pharmacology, Catholic Kwandong University College of Medicine, Gangneung 25601, Republic of Korea; The Institute for Clinical and Translational Research, Catholic Kwandong University College of Medicine, Gangneung 25601, Republic of Korea.
| | - Md Hasanuzzaman
- Department of Pharmacology and PharmacoGenomics Research Center, Inje University College of Medicine, Busan 47392, Republic of Korea; Department of Pharmacy, Noakhali Science and Technology University, Sonapur, Noakhali 3814, Bangladesh
| | - Munju Cho
- Department of Pharmacology and PharmacoGenomics Research Center, Inje University College of Medicine, Busan 47392, Republic of Korea
| | - Nam Hyun Kim
- Department of Pharmacology, Catholic Kwandong University College of Medicine, Gangneung 25601, Republic of Korea
| | - Hye-Young Choi
- Department of Pharmacology, Catholic Kwandong University College of Medicine, Gangneung 25601, Republic of Korea
| | - Jung Woo Han
- Department of Pharmacology, Yonsei University College of Medicine, Seoul 03722, Republic of Korea
| | - Hyun June Park
- Department of Chemistry, University of Chicago, Chicago, IL 60637, USA
| | - Ji Won Oh
- Department of Anatomy, School of Medicine, Kyungpook National University, Daegu 41944, Republic of Korea; Bio-Medical Research Institute, Kyungpook National University Hospital, Daegu 41944, Republic of Korea
| | - Jae-Gook Shin
- Department of Pharmacology and PharmacoGenomics Research Center, Inje University College of Medicine, Busan 47392, Republic of Korea; Department of Clinical Pharmacology, Inje University Busan Paik Hospital, Busan 47392, Republic of Korea.
| |
|
42
|
Abstract
Systems pharmacology aims to holistically understand mechanisms of drug actions to support drug discovery and clinical practice. Systems pharmacology modeling (SPM) is data driven. It integrates an exponentially growing amount of data at multiple scales (genetic, molecular, cellular, organismal, and environmental). The goal of SPM is to develop mechanistic or predictive multiscale models that are interpretable and actionable. The current explosions in genomics and other omics data, as well as the tremendous advances in big data technologies, have already enabled biologists to generate novel hypotheses and gain new knowledge through computational models of genome-wide, heterogeneous, and dynamic data sets. More work is needed to interpret and predict a drug response phenotype, which is dependent on many known and unknown factors. To gain a comprehensive understanding of drug actions, SPM requires close collaborations between domain experts from diverse fields and integration of heterogeneous models from biophysics, mathematics, statistics, machine learning, and semantic webs. This creates challenges in model management, model integration, model translation, and knowledge integration. In this review, we discuss several emergent issues in SPM and potential solutions using big data technology and analytics. The concurrent development of high-throughput techniques, cloud computing, data science, and the semantic web will likely allow SPM to be findable, accessible, interoperable, reusable, reliable, interpretable, and actionable.
Affiliation(s)
- Lei Xie
- Department of Computer Science, Hunter College, The City University of New York, New York, NY 10065; .,The Graduate Center, The City University of New York, New York, NY 10016
| | - Eli J Draizen
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland 20894; .,Program in Bioinformatics, Boston University, Boston, Massachusetts 02215
| | - Philip E Bourne
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland 20894; .,Office of the Director, National Institutes of Health, Bethesda, Maryland 20894
| |
|
43
|
Taglang G, Jackson DB. Use of "big data" in drug discovery and clinical trials. Gynecol Oncol 2016; 141:17-23. [PMID: 27016224 DOI: 10.1016/j.ygyno.2016.02.022] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.5] [Received: 09/30/2015] [Revised: 02/08/2016] [Accepted: 02/21/2016] [Indexed: 12/31/2022]
Abstract
Oncology is undergoing a data-driven metamorphosis. Armed with new and ever more efficient molecular and information technologies, we have entered an era where data is helping us spearhead the fight against cancer. This technology driven data explosion, often referred to as "big data", is not only expediting biomedical discovery, but it is also rapidly transforming the practice of oncology into an information science. This evolution is critical, as results to-date have revealed the immense complexity and genetic heterogeneity of patients and their tumors, a sobering reminder of the challenge facing every patient and their oncologist. This can only be addressed through development of clinico-molecular data analytics that provide a deeper understanding of the mechanisms controlling the biological and clinical response to available therapeutic options. Beyond the exciting implications for improved patient care, such advancements in predictive and evidence-based analytics stand to profoundly affect the processes of cancer drug discovery and associated clinical trials.
|
44
|
Cai X, Chen Y, Gao Z, Xu R. Explore Small Molecule-induced Genome-wide Transcriptional Profiles for Novel Inflammatory Bowel Disease Drug. AMIA JOINT SUMMITS ON TRANSLATIONAL SCIENCE PROCEEDINGS. AMIA JOINT SUMMITS ON TRANSLATIONAL SCIENCE 2016; 2016:22-31. [PMID: 27570643 PMCID: PMC5001780] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/06/2023]
Abstract
Inflammatory Bowel Disease (IBD) is a chronic and relapsing disorder that affects millions of people worldwide. Current drug options cannot cure the disease and may cause severe side effects. We developed a systematic framework to identify novel IBD drugs by exploiting millions of genomic signatures for chemical compounds. Specifically, we searched all FDA-approved drugs for candidates that share similar genomic profiles with IBD. In the evaluation experiments, our approach ranked approved IBD drugs within the top 26% on average among 858 candidates, significantly outperforming a state-of-the-art genomics-based drug repositioning method (p-value < e-8). Our approach also achieved significantly higher average precision than the state-of-the-art approach in predicting potential IBD drugs from clinical trials (0.072 vs. 0.043, p<0.1) and off-label IBD drugs (0.198 vs. 0.138, p<0.1). Furthermore, we found evidence supporting the therapeutic potential of the top-ranked drugs, such as Naloxone, in the literature and through analysis of target genes and pathways.
Affiliation(s)
- Xiaoshu Cai
- Department of Electrical Engineering and Computer Science, School of Engineering, Case Western Reserve University, Cleveland, Ohio, USA
| | - Yang Chen
- Department of Epidemiology & Biostatistics, School of Medicine, Case Western Reserve University, Cleveland, Ohio, USA
| | - Zhen Gao
- Department of Epidemiology & Biostatistics, School of Medicine, Case Western Reserve University, Cleveland, Ohio, USA
| | - Rong Xu
- Department of Epidemiology & Biostatistics, School of Medicine, Case Western Reserve University, Cleveland, Ohio, USA
| |
|
45
|
Rouillard AD, Gundersen GW, Fernandez NF, Wang Z, Monteiro CD, McDermott MG, Ma'ayan A. The harmonizome: a collection of processed datasets gathered to serve and mine knowledge about genes and proteins. Database (Oxford) 2016; 2016:baw100. [PMID: 27374120 PMCID: PMC4930834 DOI: 10.1093/database/baw100] [Citation(s) in RCA: 921] [Impact Index Per Article: 115.1] [Received: 02/13/2016] [Revised: 05/15/2016] [Accepted: 05/31/2016] [Indexed: 12/18/2022]
Abstract
Genomics, epigenomics, transcriptomics, proteomics and metabolomics efforts rapidly generate a plethora of data on the activity and levels of biomolecules within mammalian cells. At the same time, curation projects that organize knowledge from the biomedical literature into online databases are expanding. Hence, there is a wealth of information about genes, proteins and their associations, with an urgent need for data integration to achieve better knowledge extraction and data reuse. For this purpose, we developed the Harmonizome: a collection of processed datasets gathered to serve and mine knowledge about genes and proteins from over 70 major online resources. We extracted, abstracted and organized data into ∼72 million functional associations between genes/proteins and their attributes. Such attributes could be physical relationships with other biomolecules, expression in cell lines and tissues, genetic associations with knockout mouse or human phenotypes, or changes in expression after drug treatment. We stored these associations in a relational database along with rich metadata for the genes/proteins, their attributes and the original resources. The freely available Harmonizome web portal provides a graphical user interface, a web service and a mobile app for querying, browsing and downloading all of the collected data. To demonstrate the utility of the Harmonizome, we computed and visualized gene-gene and attribute-attribute similarity networks, and through unsupervised clustering, identified many unexpected relationships by combining pairs of datasets such as the association between kinase perturbations and disease signatures. We also applied supervised machine learning methods to predict novel substrates for kinases, endogenous ligands for G-protein coupled receptors, mouse phenotypes for knockout genes, and classified unannotated transmembrane proteins for likelihood of being ion channels. 
The Harmonizome is a comprehensive resource of knowledge about genes and proteins, and as such, it enables researchers to discover novel relationships between biological entities, as well as form novel data-driven hypotheses for experimental validation. Database URL: http://amp.pharm.mssm.edu/Harmonizome.
Affiliation(s)
- Andrew D Rouillard
- Department of Pharmacology and Systems Therapeutics, Department of Genetics and Genomic Sciences, BD2K-LINCS Data Coordination and Integration Center (DCIC), Mount Sinai's Knowledge Management Center for Illuminating the Druggable Genome (KMC-IDG), Icahn School of Medicine at Mount Sinai, New York, NY, USA
| | - Gregory W Gundersen
- Department of Pharmacology and Systems Therapeutics, Department of Genetics and Genomic Sciences, BD2K-LINCS Data Coordination and Integration Center (DCIC), Mount Sinai's Knowledge Management Center for Illuminating the Druggable Genome (KMC-IDG), Icahn School of Medicine at Mount Sinai, New York, NY, USA
| | - Nicolas F Fernandez
- Department of Pharmacology and Systems Therapeutics, Department of Genetics and Genomic Sciences, BD2K-LINCS Data Coordination and Integration Center (DCIC), Mount Sinai's Knowledge Management Center for Illuminating the Druggable Genome (KMC-IDG), Icahn School of Medicine at Mount Sinai, New York, NY, USA
| | - Zichen Wang
- Department of Pharmacology and Systems Therapeutics, Department of Genetics and Genomic Sciences, BD2K-LINCS Data Coordination and Integration Center (DCIC), Mount Sinai's Knowledge Management Center for Illuminating the Druggable Genome (KMC-IDG), Icahn School of Medicine at Mount Sinai, New York, NY, USA
| | - Caroline D Monteiro
- Department of Pharmacology and Systems Therapeutics, Department of Genetics and Genomic Sciences, BD2K-LINCS Data Coordination and Integration Center (DCIC), Mount Sinai's Knowledge Management Center for Illuminating the Druggable Genome (KMC-IDG), Icahn School of Medicine at Mount Sinai, New York, NY, USA
| | - Michael G McDermott
- Department of Pharmacology and Systems Therapeutics, Department of Genetics and Genomic Sciences, BD2K-LINCS Data Coordination and Integration Center (DCIC), Mount Sinai's Knowledge Management Center for Illuminating the Druggable Genome (KMC-IDG), Icahn School of Medicine at Mount Sinai, New York, NY, USA
| | - Avi Ma'ayan
- Department of Pharmacology and Systems Therapeutics, Department of Genetics and Genomic Sciences, BD2K-LINCS Data Coordination and Integration Center (DCIC), Mount Sinai's Knowledge Management Center for Illuminating the Druggable Genome (KMC-IDG), Icahn School of Medicine at Mount Sinai, New York, NY, USA
| |
|
46
|
Yin JY, Li X, Li XP, Xiao L, Zheng W, Chen J, Mao CX, Fang C, Cui JJ, Guo CX, Zhang W, Gao Y, Zhang CF, Chen ZH, Zhou H, Zhou HH, Liu ZQ. Prediction models for platinum-based chemotherapy response and toxicity in advanced NSCLC patients. Cancer Lett 2016; 377:65-73. [DOI: 10.1016/j.canlet.2016.04.029] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.5] [Received: 02/26/2016] [Revised: 04/14/2016] [Accepted: 04/20/2016] [Indexed: 12/23/2022]
|
47
|
Repositioning of a cyclin-dependent kinase inhibitor GW8510 as a ribonucleotide reductase M2 inhibitor to treat human colorectal cancer. Cell Death Discov 2016; 2:16027. [PMID: 27551518 PMCID: PMC4979501 DOI: 10.1038/cddiscovery.2016.27] [Citation(s) in RCA: 34] [Impact Index Per Article: 4.3] [Received: 03/23/2016] [Accepted: 04/03/2016] [Indexed: 01/25/2023] Open
Abstract
Colorectal cancer (CRC) is the second leading cause of cancer-related death in males and females in the world. It is of immediate importance to develop novel therapeutics. Human ribonucleotide reductase (RRM1/RRM2) has an essential role in converting ribonucleoside diphosphate to 2'-deoxyribonucleoside diphosphate to maintain the homeostasis of nucleotide pools. RRM2 is a prognostic biomarker and predicts poor survival of CRC. In addition, increased RRM2 activity is associated with malignant transformation and tumor cell growth. Bioinformatics analyses show that RRM2 was overexpressed in CRC and might be an attractive target for treating CRC. Therefore, we searched for novel RRM2 inhibitors using a gene expression signature-based approach, connectivity MAP (CMAP). The result predicted GW8510, a cyclin-dependent kinase inhibitor, as a potential RRM2 inhibitor. Western blot analysis indicated that GW8510 inhibited RRM2 expression by promoting its proteasomal degradation. In addition, GW8510 induced autophagic cell death. Moreover, the sensitivities of CRC cells to GW8510 were associated with the levels of RRM2 and endogenous autophagic flux. Taken together, our study indicates that GW8510 could be a potential anti-CRC agent through targeting RRM2.
|
48
|
Ballouz S, Gillis J. AuPairWise: A Method to Estimate RNA-Seq Replicability through Co-expression. PLoS Comput Biol 2016; 12:e1004868. [PMID: 27082953 PMCID: PMC4833304 DOI: 10.1371/journal.pcbi.1004868] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.0] [Received: 04/22/2015] [Accepted: 03/14/2016] [Indexed: 11/23/2022] Open
Abstract
In addition to detecting novel transcripts and higher dynamic range, a principal claim for RNA-sequencing has been greater replicability, typically measured in sample-sample correlations of gene expression levels. Through a re-analysis of ENCODE data, we show that replicability of transcript abundances will provide misleading estimates of the replicability of conditional variation in transcript abundances (i.e., most expression experiments). Heuristics which implicitly address this problem have emerged in quality control measures to obtain ‘good’ differential expression results. However, these methods involve strict filters such as discarding low expressing genes or using technical replicates to remove discordant transcripts, and are costly or simply ad hoc. As an alternative, we model gene-level replicability of differential activity using co-expressing genes. We find that sets of housekeeping interactions provide a sensitive means of estimating the replicability of expression changes, where the co-expressing pair can be regarded as pseudo-replicates of one another. We model the effects of noise that perturbs a gene’s expression within its usual distribution of values and show that perturbing expression by only 5% within that range is readily detectable (AUROC~0.73). We have made our method available as a set of easily implemented R scripts. RNA-sequencing has become a popular means to detect the expression levels of genes. However, quality control is still challenging, requiring both extreme measures and rules which are set in stone from extensive previous analysis. Instead of relying on these rules, we show that co-expression can be used to measure biological replicability with extremely high precision. Co-expression is a well-studied phenomenon in which two genes that are known to form a functional unit are also expressed at similar levels, and change in similar ways across conditions. 
Using this concept, we can detect how well an experiment replicates by measuring how well it has retained the co-expression pattern across defined gene-pairs. We do this by measuring how easy it is to detect a sample to which some noise has been added. We show this method is a useful tool for quality control.
Affiliation(s)
- Sara Ballouz
- Stanley Institute for Cognitive Genomics, Cold Spring Harbor Laboratory, Cold Spring Harbor, New York, United States of America
| | - Jesse Gillis
- Stanley Institute for Cognitive Genomics, Cold Spring Harbor Laboratory, Cold Spring Harbor, New York, United States of America
| |
|
49
|
Sarntivijai S, Vasant D, Jupp S, Saunders G, Bento AP, Gonzalez D, Betts J, Hasan S, Koscielny G, Dunham I, Parkinson H, Malone J. Linking rare and common disease: mapping clinical disease-phenotypes to ontologies in therapeutic target validation. J Biomed Semantics 2016; 7:8. [PMID: 27011785 PMCID: PMC4804633 DOI: 10.1186/s13326-016-0051-7] [Citation(s) in RCA: 28] [Impact Index Per Article: 3.5] [Received: 10/25/2015] [Accepted: 02/02/2016] [Indexed: 12/29/2022] Open
Abstract
BACKGROUND The Centre for Therapeutic Target Validation (CTTV - https://www.targetvalidation.org/) was established to generate therapeutic target evidence from genome-scale experiments and analyses. CTTV aims to support the validity of therapeutic targets by integrating existing and newly-generated data. Data integration has been achieved in some resources by mapping metadata such as disease and phenotypes to the Experimental Factor Ontology (EFO). Additionally, the relationship between ontology descriptions of rare and common diseases and their phenotypes can offer insights into shared biological mechanisms and potential drug targets. Ontologies are not ideal for representing the 'sometimes associated' type of relationship required. This work addresses two challenges: annotation of diverse big data, and representation of complex, 'sometimes associated' relationships between concepts. METHODS Semantic mapping uses a combination of custom scripting, our annotation tool 'Zooma', and expert curation. Disease-phenotype associations were generated using literature mining on Europe PubMed Central abstracts, which were manually verified by experts for validity. Representation of the disease-phenotype association was achieved by the Ontology of Biomedical AssociatioN (OBAN), a generic association representation model. OBAN represents associations between a subject and an object (i.e., a disease and its associated phenotypes) together with the source of evidence for that association. The indirect disease-to-disease associations are exposed through shared phenotypes. This was applied to the use case of linking rare to common diseases at the CTTV. RESULTS EFO yields an average mapping coverage of over 80% across all data sources. A precision of 42% is obtained from the manual verification of the text-mined disease-phenotype associations. This results in 1452 and 2810 disease-phenotype pairs for IBD and autoimmune disease and contributes towards 11,338 rare disease associations (merged with existing published work [Am J Hum Genet 97:111-24, 2015]). An OBAN result file is downloadable at http://sourceforge.net/p/efo/code/HEAD/tree/trunk/src/efoassociations/. Twenty common diseases are linked to 85 rare diseases by shared phenotypes. A generalizable OBAN model for association representation is presented in this study. CONCLUSIONS Here we present solutions to large-scale annotation-ontology mapping in the CTTV knowledge base, a process for disease-phenotype mining, and propose a generic association model, 'OBAN', as a means to integrate disease using shared phenotypes. AVAILABILITY EFO is released monthly and available for download at http://www.ebi.ac.uk/efo/.
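The subject-object-evidence pattern described in this abstract, and the way shared phenotypes expose indirect disease-to-disease links, can be sketched as follows. This is a plain-Python illustration only, not the OBAN ontology itself or any CTTV code: the `Association` class, the function name, and the placeholder identifiers are all hypothetical and carry no biomedical meaning.

```python
from collections import defaultdict
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Association:
    """One subject-object link plus where the evidence came from."""
    subject: str    # e.g. a disease identifier
    obj: str        # e.g. a phenotype identifier
    evidence: str   # provenance of the assertion


def shared_phenotype_links(associations):
    """Map each disease pair to the set of phenotypes they share."""
    diseases_by_phenotype = defaultdict(set)
    for assoc in associations:
        diseases_by_phenotype[assoc.obj].add(assoc.subject)

    links = defaultdict(set)
    for phenotype, diseases in diseases_by_phenotype.items():
        # Every pair of diseases annotated with this phenotype is linked.
        for first, second in combinations(sorted(diseases), 2):
            links[(first, second)].add(phenotype)
    return dict(links)


# Placeholder identifiers, invented for the example.
associations = [
    Association("disease:rare_A", "phenotype:P1", "text mining, curator-verified"),
    Association("disease:common_B", "phenotype:P1", "text mining, curator-verified"),
    Association("disease:common_B", "phenotype:P2", "existing database"),
    Association("disease:rare_C", "phenotype:P3", "existing database"),
]
links = shared_phenotype_links(associations)
print(links)
```

Keeping the evidence on the association rather than on the disease or phenotype is what allows the same pair to be asserted by several sources with different levels of support.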
Affiliation(s)
- Sirarat Sarntivijai
- European Bioinformatics Institute (EMBL-EBI), European Molecular Biology Laboratory, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD UK ; Centre for Therapeutic Target Validation, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD UK
- Drashtti Vasant
- European Bioinformatics Institute (EMBL-EBI), European Molecular Biology Laboratory, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD UK ; Centre for Therapeutic Target Validation, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD UK
- Simon Jupp
- European Bioinformatics Institute (EMBL-EBI), European Molecular Biology Laboratory, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD UK
- Gary Saunders
- European Bioinformatics Institute (EMBL-EBI), European Molecular Biology Laboratory, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD UK ; Centre for Therapeutic Target Validation, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD UK
- A Patrícia Bento
- European Bioinformatics Institute (EMBL-EBI), European Molecular Biology Laboratory, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD UK ; Centre for Therapeutic Target Validation, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD UK
- Daniel Gonzalez
- European Bioinformatics Institute (EMBL-EBI), European Molecular Biology Laboratory, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD UK ; Centre for Therapeutic Target Validation, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD UK
- Joanna Betts
- Centre for Therapeutic Target Validation, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD UK ; GSK, Medicine Research Centre, Stevenage, SG1 2NY UK
- Samiul Hasan
- Centre for Therapeutic Target Validation, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD UK ; GSK, Medicine Research Centre, Stevenage, SG1 2NY UK
- Gautier Koscielny
- Centre for Therapeutic Target Validation, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD UK ; GSK, Medicine Research Centre, Stevenage, SG1 2NY UK
- Ian Dunham
- European Bioinformatics Institute (EMBL-EBI), European Molecular Biology Laboratory, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD UK ; Centre for Therapeutic Target Validation, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD UK
- Helen Parkinson
- European Bioinformatics Institute (EMBL-EBI), European Molecular Biology Laboratory, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD UK
- James Malone
- European Bioinformatics Institute (EMBL-EBI), European Molecular Biology Laboratory, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD UK ; Centre for Therapeutic Target Validation, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD UK
|
50
|
Lachmann A, Giorgi FM, Alvarez MJ, Califano A. Detection and removal of spatial bias in multiwell assays. Bioinformatics 2016; 32:1959-65. [PMID: 27153732 DOI: 10.1093/bioinformatics/btw092] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.6] [Received: 08/06/2015] [Accepted: 02/14/2016] [Indexed: 01/05/2023] Open
Abstract
MOTIVATION Multiplex readout assays are now increasingly being performed using microfluidic automation in multiwell format. For instance, the Library of Integrated Network-based Cellular Signatures (LINCS) has produced gene expression measurements for tens of thousands of distinct cell perturbations using a 384-well plate format. This dataset is by far the largest 384-well gene expression measurement assay ever performed. We investigated the gene expression profiles of a million samples from the LINCS dataset and found that the vast majority (96%) of the tested plates were affected by a significant 2D spatial bias. RESULTS Using a novel algorithm combining spatial autocorrelation detection and principal component analysis, we could remove most of the spatial bias from the LINCS dataset and show in parallel a dramatic improvement of similarity between biological replicates assayed in different plates. The proposed methodology is fully general and can be applied to any highly multiplexed assay performed in multiwell format. CONTACT ac2248@columbia.edu SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
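To make the notion of 2D spatial bias on a plate concrete, the sketch below computes Moran's I, one common spatial autocorrelation statistic, over a grid of well readouts. This is an assumption-laden illustration and not the authors' algorithm, which combines autocorrelation detection with principal component analysis. The choice of Moran's I, the rook (up/down/left/right) neighbourhood, the function name, and the 4x4 toy plates are all ours.

```python
def morans_i(plate):
    """Moran's I for a rectangular grid with rook adjacency.

    Values near +1 mean neighbouring wells are alike (smooth spatial
    trend), values near -1 mean neighbours alternate, and values near
    0 suggest no spatial structure.
    """
    rows, cols = len(plate), len(plate[0])
    n = rows * cols
    mean = sum(v for row in plate for v in row) / n
    dev = [[v - mean for v in row] for row in plate]
    denom = sum(d * d for row in dev for d in row)
    if denom == 0:
        return 0.0  # constant plate: no variation to correlate

    cross_sum = 0.0   # sum of dev_i * dev_j over neighbouring wells
    weight_sum = 0    # number of directed neighbour pairs
    for r in range(rows):
        for c in range(cols):
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                rr, cc = r + dr, c + dc
                if 0 <= rr < rows and 0 <= cc < cols:
                    cross_sum += dev[r][c] * dev[rr][cc]
                    weight_sum += 1
    return (n / weight_sum) * cross_sum / denom


# Toy plates (hypothetical): a left-to-right gradient and a checkerboard.
gradient = [[float(c) for c in range(4)] for _ in range(4)]
checkerboard = [[float((r + c) % 2) for c in range(4)] for r in range(4)]
print(morans_i(gradient), morans_i(checkerboard))
```

On a real 384-well plate the same function would be applied to a 16x24 grid per measured feature, with a permutation test or similar procedure to decide whether the observed value is significant.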
Affiliation(s)
- Alexander Lachmann
- Department of Biomedical Informatics (DBMI), Department of Systems Biology, Center for Computational Biology and Bioinformatics (C2B2), Columbia University, New York, NY, USA
- Federico M Giorgi
- Department of Systems Biology, Center for Computational Biology and Bioinformatics (C2B2), Columbia University, New York, NY, USA ; Scuola Superiore Sant'Anna, Pisa, Italy
- Andrea Califano
- Department of Biomedical Informatics (DBMI), Department of Systems Biology, Center for Computational Biology and Bioinformatics (C2B2), Columbia University, New York, NY, USA ; Department of Biochemistry and Molecular Biophysics, Institute for Cancer Genetics, Herbert Irving Comprehensive Cancer Center, Columbia University, New York, NY, USA
|