1. Mou X, Jamil HM. Visual Life Sciences Workflow Design Using Distributed and Heterogeneous Resources. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2020; 17:1459-1473. [PMID: 30561349 DOI: 10.1109/tcbb.2018.2886185] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/09/2023]
Abstract
Programming or querying usually presupposes some degree of technical familiarity with the syntax of a language and the peculiarity of the objects it manipulates to produce useful information. The degree of abstraction supported in a language lessens the depth of familiarity needed and improves access to and usability of these resources. To help biologists concentrate on their science questions rather than on how to compute answers to them, several successful workflow orchestration languages and systems have been proposed. Despite their popularity, significant limitations reduce their usability and limit their applicability to novel applications. In this paper, we present a visual language, called VisFlow, for workflow orchestration using heterogeneous and distributed resources. We advance the idea that once resources are minimally described and abstracted, arbitrary workflows can be designed solely using query primitives supported in VisFlow. Its capabilities can be augmented by including computational artifacts in the form of library functions written in R, Python, and Java, or even in SQL and XQuery, making it a truly extensible system. We discuss its salient features and illustrate its capabilities using a substantial set of examples.
2. Eicher T, Kinnebrew G, Patt A, Spencer K, Ying K, Ma Q, Machiraju R, Mathé EA. Metabolomics and Multi-Omics Integration: A Survey of Computational Methods and Resources. Metabolites 2020; 10:E202. [PMID: 32429287 PMCID: PMC7281435 DOI: 10.3390/metabo10050202] [Citation(s) in RCA: 60] [Impact Index Per Article: 15.0] [Received: 04/15/2020] [Revised: 05/07/2020] [Accepted: 05/13/2020] [Indexed: 02/06/2023]
Abstract
As researchers are increasingly able to collect data on a large scale from multiple clinical and omics modalities, multi-omics integration is becoming a critical component of metabolomics research. This introduces a need for increased understanding by the metabolomics researcher of computational and statistical analysis methods relevant to multi-omics studies. In this review, we discuss common types of analyses performed in multi-omics studies and the computational and statistical methods that can be used for each type of analysis. We pinpoint the caveats and considerations for analysis methods, including required parameters, sample size and data distribution requirements, sources of a priori knowledge, and techniques for the evaluation of model accuracy. Finally, for the types of analyses discussed, we provide examples of the applications of corresponding methods to clinical and basic research. We intend that our review may be used as a guide for metabolomics researchers to choose effective techniques for multi-omics analyses relevant to their field of study.
Affiliation(s)
- Tara Eicher
  - Biomedical Informatics Department, The Ohio State University College of Medicine, Columbus, OH 43210, USA
  - Computer Science and Engineering Department, The Ohio State University College of Engineering, Columbus, OH 43210, USA
- Garrett Kinnebrew
  - Biomedical Informatics Department, The Ohio State University College of Medicine, Columbus, OH 43210, USA
  - Comprehensive Cancer Center, The Ohio State University and James Cancer Hospital, Columbus, OH 43210, USA
  - Bioinformatics Shared Resource Group, The Ohio State University, Columbus, OH 43210, USA
- Andrew Patt
  - Division of Preclinical Innovation, National Center for Advancing Translational Sciences, NIH, 9800 Medical Center Dr., Rockville, MD, 20892, USA
  - Biomedical Sciences Graduate Program, The Ohio State University, Columbus, OH 43210, USA
- Kyle Spencer
  - Biomedical Informatics Department, The Ohio State University College of Medicine, Columbus, OH 43210, USA
  - Biomedical Sciences Graduate Program, The Ohio State University, Columbus, OH 43210, USA
  - Nationwide Children’s Research Hospital, Columbus, OH 43210, USA
- Kevin Ying
  - Comprehensive Cancer Center, The Ohio State University and James Cancer Hospital, Columbus, OH 43210, USA
  - Molecular, Cellular and Developmental Biology Program, The Ohio State University, Columbus, OH 43210, USA
- Qin Ma
  - Biomedical Informatics Department, The Ohio State University College of Medicine, Columbus, OH 43210, USA
- Raghu Machiraju
  - Biomedical Informatics Department, The Ohio State University College of Medicine, Columbus, OH 43210, USA
  - Computer Science and Engineering Department, The Ohio State University College of Engineering, Columbus, OH 43210, USA
  - Department of Pathology, Wexner Medical Center, The Ohio State University, Columbus, OH 43210, USA
  - Translational Data Analytics Institute, The Ohio State University, Columbus, OH 43210, USA
- Ewy A. Mathé
  - Biomedical Informatics Department, The Ohio State University College of Medicine, Columbus, OH 43210, USA
  - Division of Preclinical Innovation, National Center for Advancing Translational Sciences, NIH, 9800 Medical Center Dr., Rockville, MD, 20892, USA
3. Csabai L, Ölbei M, Budd A, Korcsmáros T, Fazekas D. SignaLink: Multilayered Regulatory Networks. Methods Mol Biol 2018; 1819:53-73. [PMID: 30421399 DOI: 10.1007/978-1-4939-8618-7_3] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.7] [Indexed: 12/19/2022]
Abstract
Biological networks are graphs used to represent the inner workings of a biological system. Networks describe the relationships of the elements of biological systems using edges and nodes. However, the resulting representation of the system can sometimes be too simplistic to usefully model reality. By combining several different interaction types within one larger multilayered biological network, tools such as SignaLink provide a more nuanced view than those relying on single-layer networks (where edges only describe one kind of interaction). Multilayered networks display connections between multiple networks (i.e., protein-protein interactions and their transcriptional and posttranscriptional regulators), each one of them describing a specific set of connections. Multilayered networks also allow us to depict cross talk between cellular systems, which is a more realistic way of describing molecular interactions. They can be used to collate networks from different sources into one multilayered structure, which makes them useful as an analytic tool as well.
Affiliation(s)
- Márton Ölbei
  - Earlham Institute, Norwich Research Park, Norwich, UK
  - Quadram Institute, Norwich Research Park, Norwich, UK
- Aidan Budd
  - Earlham Institute, Norwich Research Park, Norwich, UK
- Tamás Korcsmáros
  - Eötvös Loránd University, Budapest, Hungary
  - Earlham Institute, Norwich Research Park, Norwich, UK
  - Quadram Institute, Norwich Research Park, Norwich, UK
- Dávid Fazekas
  - Eötvös Loránd University, Budapest, Hungary
  - Earlham Institute, Norwich Research Park, Norwich, UK