1. Sen S, Woodhouse MR, Portwood JL, Andorf CM. Maize Feature Store: a centralized resource to manage and analyze curated maize multi-omics features for machine learning applications. Database (Oxford) 2023; 2023:baad078. PMID: 37935586; PMCID: PMC10634621; DOI: 10.1093/database/baad078.
Abstract
The big-data analysis of complex data associated with maize genomes accelerates genetic research and improves agronomic traits. As a result, efforts have increased to integrate diverse datasets and extract meaning from these measurements. Machine learning models are a powerful tool for gaining knowledge from large and complex datasets, but these models must be trained on high-quality features to succeed. Currently, no resource hosts maize multi-omics datasets together with end-to-end tooling for evaluating and linking features to target gene annotations. Our work presents the Maize Feature Store (MFS), a versatile application that combines features built on complex data to facilitate exploration, modeling and analysis. Feature stores allow researchers to rapidly deploy machine learning applications by managing and providing access to frequently used features. We populated the MFS for the maize reference genome with over 14 000 gene-based features based on published genomic, transcriptomic, epigenomic, variomic and proteomic datasets. Using the MFS, we created an accurate pan-genome classification model with an AUC-ROC score of 0.87. The MFS is publicly available through the Maize Genetics and Genomics Database (MaizeGDB). Database URL: https://mfs.maizegdb.org/.
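The AUC-ROC metric reported in this abstract can be computed without any machine learning framework. A minimal sketch using the rank-based (Mann-Whitney) formulation, with made-up gene labels and classifier scores rather than real MFS features:

```python
def auc_roc(labels, scores):
    """AUC-ROC via the Mann-Whitney U statistic: the probability that a
    randomly chosen positive example scores above a random negative one."""
    # Assign 1-based ranks, averaging ranks within tied score blocks.
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average 1-based rank of the tied block
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    pos_ranks = [r for r, y in zip(ranks, labels) if y == 1]
    n_pos, n_neg = len(pos_ranks), len(labels) - len(pos_ranks)
    u = sum(pos_ranks) - n_pos * (n_pos + 1) / 2
    return u / (n_pos * n_neg)

# Hypothetical labels (1 = core gene, 0 = dispensable) and model scores.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.1]
print(auc_roc(labels, scores))
```

Here the single misranked positive/negative pairing out of nine pairs gives an AUC of 8/9, matching a pairwise count of correctly ordered positive-negative pairs.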
Affiliation(s)
- Shatabdi Sen: Department of Plant Pathology & Microbiology, Iowa State University, 1344 Advanced Teaching & Research Bldg, 2213 Pammel Dr, Ames, IA 50011, USA
- Margaret R Woodhouse: USDA-ARS, Corn Insects and Crop Genetics Research Unit, 819 Wallace Road, Ames, IA 50011, USA
- John L Portwood: USDA-ARS, Corn Insects and Crop Genetics Research Unit, 819 Wallace Road, Ames, IA 50011, USA
- Carson M Andorf: USDA-ARS, Corn Insects and Crop Genetics Research Unit, 819 Wallace Road, Ames, IA 50011, USA; Department of Computer Science, Iowa State University, Atanasoff Hall, 2434 Osborn Dr, Ames, IA 50011, USA
2. Big Data in Laboratory Medicine—FAIR Quality for AI? Diagnostics (Basel) 2022; 12:1923. PMID: 36010273; PMCID: PMC9406962; DOI: 10.3390/diagnostics12081923.
Abstract
Laboratory medicine is a digital science. Every large hospital produces a wealth of data each day, from simple numerical results such as sodium measurements to the highly complex output of "-omics" analyses, as well as quality control results and metadata. Processing, connecting, storing, and ordering extensive parts of these individual data requires Big Data techniques. Whereas novel technologies such as artificial intelligence and machine learning have exciting applications in augmenting laboratory medicine, the Big Data concept remains fundamental for any sophisticated data analysis in large databases. To make laboratory medicine data optimally usable for clinical and research purposes, they need to be FAIR: findable, accessible, interoperable, and reusable. This can be achieved, for example, by automated recording, connection of devices, efficient ETL (Extract, Transform, Load) processes, careful data governance, and modern data security solutions. Enriched with clinical data, laboratory medicine data allow a gain in pathophysiological insights, can improve patient care, or can be used to develop reference intervals for diagnostic purposes. Nevertheless, Big Data in laboratory medicine do not come without challenges: the growing number of analyses, and the data derived from them, demand careful stewardship. Laboratory medicine experts are and will be needed to drive this development, take an active role in the ongoing digitalization, and provide guidance for their clinical colleagues engaging with the laboratory data in research.
3. Pal S, Mondal S, Das G, Khatua S, Ghosh Z. Big data in biology: the hope and present-day challenges in it. Gene Reports 2020. DOI: 10.1016/j.genrep.2020.100869.
4. Liu J, Liu Q, Zhang L, Su S, Liu Y. Enabling massive XML-based biological data management in HBase. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2020; 17:1994-2004. PMID: 31094692; DOI: 10.1109/tcbb.2019.2915811.
Abstract
Publishing biological data in XML formats is attractive for organizations that would like to provide their bioinformatics resources in an extensible and machine-readable format. In the era of big data, managing massive XML-based biological data has emerged as a challenging problem. As XML-based biological datasets continue to grow, traditional declarative query languages often cannot provide efficient query capabilities in terms of processing speed and scale. In this study, we report a novel platform to store and query massive XML-based biological data collections. A prototype tool for constructing HBase tables from XML-based biological data collections is first developed, and then a formal approach to transform the XML query model into the MapReduce query model is proposed. Finally, an evaluation of the query performance of the proposed approach on existing XML-based biological databases is presented, showing the performance advantages of the proposed solution. The source code of the massive XML-based biological data management platform is freely available at https://github.com/lyotvincent/X2H.
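The core idea of mapping XML records into an HBase table can be sketched in a few lines: each record becomes a row, and element paths become column qualifiers. This is only an illustration of the flattening step, not the paper's X2H tool; the document, tags, and the "d" column family are invented for the example:

```python
import xml.etree.ElementTree as ET

def xml_to_rows(xml_text, record_tag, id_attr):
    """Flatten each record element into an HBase-style
    {row key: {column qualifier: value}} mapping."""
    rows = {}
    root = ET.fromstring(xml_text)
    for rec in root.iter(record_tag):
        cells = {}

        def walk(elem, path):
            for child in elem:
                qual = f"{path}{child.tag}"
                if list(child):                       # nested element: recurse
                    walk(child, qual + ".")
                elif child.text and child.text.strip():  # leaf: store text
                    cells[qual] = child.text.strip()

        walk(rec, "d:")  # "d" stands in for an HBase column family
        rows[rec.get(id_attr)] = cells
    return rows

doc = """<proteins>
  <protein id="P1"><name>Kinase A</name><organism><taxon>9606</taxon></organism></protein>
  <protein id="P2"><name>Kinase B</name></protein>
</proteins>"""
rows = xml_to_rows(doc, "protein", "id")
```

With this layout, a sparse record (P2 above) simply stores fewer cells, which is exactly the property that makes column-oriented stores like HBase a good fit for heterogeneous XML.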
5. Nti-Addae Y, Matthews D, Ulat VJ, Syed R, Sempéré G, Pétel A, Renner J, Larmande P, Guignon V, Jones E, Robbins K. Benchmarking database systems for Genomic Selection implementation. Database (Oxford) 2019; 2019:baz096. PMID: 31508797; PMCID: PMC6737464; DOI: 10.1093/database/baz096.
Abstract
MOTIVATION With high-throughput genotyping systems now available, it has become feasible to fully integrate genotyping information into breeding programs. Using this information effectively requires DNA extraction and marker production facilities that can deploy the desired set of markers across samples with a turnaround rapid enough to allow selection before crosses need to be made. In reality, breeders often have a short window of time to make decisions by the time they are able to collect all their phenotyping data and receive corresponding genotyping data. This presents a challenge to organize information and utilize it in downstream analyses to support decisions made by breeders. Implementing genomic selection routinely as part of a breeding program therefore requires an efficient genotyping data storage system. We selected and benchmarked six popular open-source data storage systems, including relational database management and columnar storage systems. RESULTS We found that data extraction times are greatly influenced by the orientation in which genotype data are stored in a system. HDF5 consistently performed best, in part because it can work efficiently with both orientations of the allele matrix. AVAILABILITY http://gobiin1.bti.cornell.edu:6083/projects/GBM/repos/benchmarking/browse.
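The orientation effect described in the results can be illustrated without HDF5 at all: whichever axis of the allele matrix is stored contiguously is cheap to extract, while the other axis requires a strided gather. Plain Python lists stand in for on-disk datasets here (the real gains come from HDF5's chunked storage layout); the genotype values are invented:

```python
# Toy allele matrix: 3 samples x 4 markers, genotype calls coded 0/1/2.
by_sample = [
    [0, 1, 2, 0],   # sample 0
    [1, 1, 0, 2],   # sample 1
    [2, 0, 0, 1],   # sample 2
]

# The transposed orientation: markers as rows.
by_marker = [list(col) for col in zip(*by_sample)]

# "All markers for one sample" is one contiguous row in sample orientation...
sample1 = by_sample[1]
# ...but a strided gather across every row in marker orientation.
sample1_strided = [row[1] for row in by_marker]
assert sample1 == sample1_strided == [1, 1, 0, 2]

# Conversely, "one marker across all samples" (the typical genomic-selection
# query) is contiguous only in marker orientation.
marker2 = by_marker[2]
assert marker2 == [2, 0, 0]
```

A store fixed to one orientation pays the strided-gather cost for half of the query types, which is consistent with the benchmark's observation that HDF5 wins by handling both orientations efficiently.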
Affiliation(s)
- Victor Jun Ulat: Centro Internacional de Mejoramiento de Maíz y Trigo (CIMMYT)
- Raza Syed: Institute of Biotechnology, Cornell University
- Kelly Robbins: Section of Plant Breeding and Genetics, School of Integrative Plant Sciences, Cornell University
6. Wang X, Williams C, Liu ZH, Croghan J. Big data management challenges in health research: a literature review. Brief Bioinform 2019; 20:156-167. PMID: 28968677; DOI: 10.1093/bib/bbx086.
Abstract
Big data management for information centralization (i.e. making data of interest findable) and integration (i.e. making related data connectable) in health research is a defining challenge in biomedical informatics. While essential to create a foundation for knowledge discovery, optimized solutions to deliver high-quality and easy-to-use information resources are not thoroughly explored. In this review, we identify the gaps between current data management approaches and the need for new capacity to manage big data generated in advanced health research. Focusing on these unmet needs and well-recognized problems, we introduce state-of-the-art concepts, approaches and technologies for data management from computing academia and industry to explore improvement solutions. We explain the potential and significance of these advances for biomedical informatics. In addition, we discuss specific issues that have a great impact on technical solutions for developing the next generation of digital products (tools and data) to facilitate the raw-data-to-knowledge process in health research.
Affiliation(s)
- Xiaoming Wang: National Institute of Allergy and Infectious Diseases, NIH, Rockville, Maryland, USA
- Carolyn Williams: National Institute of Allergy and Infectious Diseases, NIH, Rockville, Maryland, USA
- Joe Croghan: National Institute of Allergy and Infectious Diseases, NIH, Rockville, Maryland, USA
7. Paris N, Mendis M, Daniel C, Murphy S, Tannier X, Zweigenbaum P. i2b2 implemented over SMART-on-FHIR. AMIA Jt Summits Transl Sci Proc 2018; 2017:369-378. PMID: 29888095; PMCID: PMC5961782.
Abstract
Integrating Biology and the Bedside (i2b2) is the de facto open-source medical tool for cohort discovery. Fast Healthcare Interoperability Resources (FHIR) is a new standard for exchanging health care information electronically. Substitutable Modular third-party Applications (SMART) defines the SMART-on-FHIR specification for how applications shall interface with Electronic Health Records (EHR) through FHIR. Related work has made it possible to produce FHIR data from an i2b2 instance, or to let i2b2 store FHIR datasets. In this paper, we extend i2b2 to search remotely into one or multiple SMART-on-FHIR Application Programming Interfaces (APIs). This enables federated queries, security, and terminology mapping, and bridges the gap between i2b2 and modern big-data technologies.
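A cohort criterion of the kind i2b2 would federate translates into a FHIR REST search request. A minimal sketch of building such a request URL; the endpoint is hypothetical, while the `gender` and `birthdate` search parameters (with the `ge` "greater-or-equal" date prefix) follow the FHIR search specification:

```python
from urllib.parse import urlencode

def fhir_search_url(base, resource, **params):
    """Build a FHIR REST search URL from a base endpoint, a resource type,
    and search parameters."""
    return f"{base}/{resource}?{urlencode(params)}"

# Hypothetical SMART-on-FHIR endpoint; i2b2-style criterion:
# female patients born on or after 1970-01-01.
url = fhir_search_url("https://ehr.example.org/fhir", "Patient",
                      gender="female", birthdate="ge1970-01-01")
print(url)
```

In a real SMART-on-FHIR deployment the request would additionally carry an OAuth2 bearer token obtained through the SMART authorization flow.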
Affiliation(s)
- Nicolas Paris: WIND-DSI, AP-HP, Paris, France; LIMSI, CNRS, Université Paris-Saclay, Orsay, France; INSERM, UMR S 1142, LIMICS, Paris, France
- Christel Daniel: WIND-DSI, AP-HP, Paris, France; INSERM, UMR S 1142, LIMICS, Paris, France
- Xavier Tannier: INSERM, UMR S 1142, LIMICS, Paris, France; Sorbonne Universités, UPMC Univ Paris 06, France
8. Christoph J, Knell C, Bosserhoff A, Naschberger E, Stürzl M, Rübner M, Seuss H, Ruh M, Prokosch HU, Sedlmayr B. Usability and suitability of the omics-integrating analysis platform tranSMART for translational research and education. Appl Clin Inform 2017; 8:1173-1183. PMID: 29270954; DOI: 10.4338/aci-2017-05-ra-0085.
Abstract
BACKGROUND Platforms like tranSMART assist researchers in analyzing clinical and corresponding omics data. Usability is an important, yet often overlooked, factor affecting adoption and meaningful use. Analyses of the specific needs of translational researchers, and considerations about applying such platforms in education, are rare. OBJECTIVES The aims of this study were to test whether tranSMART can be used in education and how well medical students and professional researchers can handle it; to identify which translational researchers, in terms of skills, experienced limitations, and available data, can take advantage of tranSMART; and to evaluate the usability and generate recommendations for improvement. METHODS An online test was completed by medical students (N = 109) and researchers (N = 26). The test comprised 13 tasks in the context of four typical research scenarios based on experimental and clinical data. A web questionnaire was provided to identify both the needs and the conditions of research, and to evaluate the system's usability based on the System Usability Scale (SUS). RESULTS Students and researchers were able to handle tranSMART well and coped with most scenarios: cohort identification, data exploration, hypothesis generation, and hypothesis validation were answered with a rate of correctness between 82 and 100%. Of the total, 72.2% of the teaching researchers considered tranSMART suitable for their lessons, and 84.6% of the researchers considered the platform useful for their daily work; 65.4% of the researchers named the nonavailability of a platform like tranSMART as a restriction on their research. The usability was rated "acceptable" with a SUS of 70.8. CONCLUSION tranSMART is potentially suitable for education purposes and fits most of the needs of translational researchers. Improvements are needed in the presentation of analysis results and in guiding users through the analysis, especially to ensure that analyses comply with the requirements of statistical testing.
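The SUS value of 70.8 cited above comes from the standard System Usability Scale formula: ten items rated 1-5, odd (positively worded) items contribute their score minus 1, even (negatively worded) items contribute 5 minus their score, and the sum is scaled by 2.5 to a 0-100 range. A sketch with one hypothetical participant's ratings:

```python
def sus_score(responses):
    """System Usability Scale score from ten 1-5 ratings.
    Odd items (index 0, 2, ...) contribute (score - 1);
    even items contribute (5 - score); sum is scaled by 2.5."""
    assert len(responses) == 10
    total = sum((r - 1) if i % 2 == 0 else (5 - r)
                for i, r in enumerate(responses))
    return total * 2.5

# Hypothetical single participant's ratings for items 1..10.
print(sus_score([4, 2, 4, 1, 5, 2, 4, 2, 4, 3]))
```

A study-level SUS, like the 70.8 reported here, is the mean of the per-participant scores.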
Affiliation(s)
- J Christoph: Department of Medical Informatics, Friedrich-Alexander-Universität Erlangen-Nürnberg, Erlangen, Germany
- C Knell: Department of Medical Informatics, Friedrich-Alexander-Universität Erlangen-Nürnberg, Erlangen, Germany
- A Bosserhoff: Institute of Biochemistry (Emil-Fischer-Center), Friedrich-Alexander-Universität Erlangen-Nürnberg, Erlangen, Germany
- E Naschberger: Division of Molecular and Experimental Surgery, Department of Surgery, Translational Research Center Erlangen, University Hospital Erlangen, Friedrich-Alexander-Universität Erlangen-Nürnberg, Erlangen, Germany
- M Stürzl: Division of Molecular and Experimental Surgery, Department of Surgery, Translational Research Center Erlangen, University Hospital Erlangen, Friedrich-Alexander-Universität Erlangen-Nürnberg, Erlangen, Germany
- M Rübner: Department of Gynecology and Obstetrics, Comprehensive Cancer Center Erlangen-EMN, University Hospital Erlangen, Friedrich-Alexander-Universität Erlangen-Nürnberg, Erlangen, Germany
- H Seuss: Department of Radiology, University Hospital Erlangen, Friedrich-Alexander-Universität Erlangen-Nürnberg, Erlangen, Germany
- M Ruh: Department of Experimental Medicine 1, Nikolaus-Fiebiger-Center for Molecular Medicine, Friedrich-Alexander-Universität Erlangen-Nürnberg, Erlangen, Germany
- H-U Prokosch: Department of Medical Informatics, Friedrich-Alexander-Universität Erlangen-Nürnberg, Erlangen, Germany
- B Sedlmayr: Department of Medical Informatics, Friedrich-Alexander-Universität Erlangen-Nürnberg, Erlangen, Germany
9. Using distributed data over HBase in big data analytics platform for clinical services. Comput Math Methods Med 2017; 2017:6120820. PMID: 29375652; PMCID: PMC5742497; DOI: 10.1155/2017/6120820.
Abstract
Big data analytics (BDA) is important for reducing healthcare costs, but it faces many challenges in data aggregation, maintenance, integration, translation, analysis, and security/privacy. The study objective, to establish an interactive BDA platform with simulated patient data using open-source software technologies, was achieved by constructing a platform framework on the Hadoop Distributed File System (HDFS) with HBase (a key-value NoSQL database). Distributed data structures were generated from benchmarked hospital-specific metadata of nine billion patient records. At optimized iteration, HDFS ingestion of HFiles into HBase store files showed sustained availability over hundreds of iterations; however, completing MapReduce into HBase required a week for 10 TB and a month for three billion (30 TB) indexed patient records, respectively. Inconsistencies found in MapReduce limited the capacity to generate and replicate data efficiently. Apache Spark and Drill showed high performance and high usability for technical support, but poor usability for clinical services. A hospital system based on patient-centric data was challenging to model in HBase, as not all data profiles could be fully integrated with the complex patient-to-hospital relationships. Nevertheless, we recommend HBase for securing patient data while querying entire hospital volumes in a simplified clinical event model across clinical services.
10. Kulkarni P, Frommolt P. Challenges in the setup of large-scale next-generation sequencing analysis workflows. Comput Struct Biotechnol J 2017; 15:471-477. PMID: 29158876; PMCID: PMC5683667; DOI: 10.1016/j.csbj.2017.10.001.
Abstract
While Next-Generation Sequencing (NGS) can now be considered an established analysis technology for research applications across the life sciences, the analysis workflows still require substantial bioinformatics expertise. Typical challenges include the appropriate selection of analytical software tools, the speedup of the overall procedure using HPC parallelization and acceleration technology, the development of automation strategies, data storage solutions and finally the development of methods for full exploitation of the analysis results across multiple experimental conditions. Recently, NGS has begun to expand into clinical environments, where it facilitates diagnostics enabling personalized therapeutic approaches, but is also accompanied by new technological, legal and ethical challenges. There are probably as many overall concepts for the analysis of the data as there are academic research institutions. Among these concepts are, for instance, complex IT architectures developed in-house, ready-to-use technologies installed on-site as well as comprehensive Everything as a Service (XaaS) solutions. In this mini-review, we summarize the key points to consider in the setup of the analysis architectures, mostly for scientific rather than diagnostic purposes, and provide an overview of the current state of the art and challenges of the field.
Affiliation(s)
- Pranav Kulkarni: Bioinformatics Core Facility, CECAD Research Center, University of Cologne, Germany
- Peter Frommolt: Bioinformatics Core Facility, CECAD Research Center, University of Cologne, Germany
11. Bao S, Plassard AJ, Landman BA, Gokhale A. Cloud engineering principles and technology enablers for medical image processing-as-a-service. Proc IEEE Int Conf Cloud Eng 2017; 2017:127-137. PMID: 28884169; PMCID: PMC5584067; DOI: 10.1109/ic2e.2017.23.
Abstract
Traditional in-house, laboratory-based medical imaging studies use hierarchical data structures (e.g., NFS file stores) or databases (e.g., COINS, XNAT) for storage and retrieval. The resulting performance is, however, impeded by standard network switches, which can saturate network bandwidth during transfer from storage to processing nodes for even moderate-sized studies. To that end, a cloud-based "medical image processing-as-a-service" offers promise by utilizing the ecosystem of Apache Hadoop, a flexible framework providing distributed, scalable, fault-tolerant storage and parallel computational modules, and HBase, a NoSQL database built atop Hadoop's distributed file system. Despite this promise, HBase's load distribution strategy of region split and merge is detrimental to the hierarchical organization of imaging data (e.g., project, subject, session, scan, slice). This paper makes two contributions to address these concerns by describing key cloud engineering principles and technology enhancements we made to the Apache Hadoop ecosystem for medical imaging applications. First, we propose a row-key design for HBase, a necessary step driven by the hierarchical organization of imaging data. Second, we propose a novel data allocation policy within HBase to strongly enforce collocation of hierarchically related imaging data. The proposed enhancements accelerate data processing by minimizing network usage and localizing processing to machines where the data already exist. Moreover, our approach is amenable to the traditional scan-, subject-, and project-level analysis procedures, and is compatible with standard command-line/scriptable image processing software. Experimental results for an illustrative sample of imaging data reveal that our new HBase policy yields a three-fold improvement in conversion time from classic DICOM to NIfTI file formats compared with the default HBase region split policy, and nearly a six-fold improvement over a commonly available network file system (NFS) approach, even for relatively small file sets. Moreover, file access latency is lower than with network-attached storage.
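The general idea behind a hierarchy-preserving HBase row key can be sketched directly: concatenate fixed-width, zero-padded fields so that byte-lexicographic key order matches the project/subject/session/scan/slice hierarchy, keeping related rows adjacent. This is only an illustration of the principle, not the paper's actual key layout; the field widths and the "brainmri" project name are invented:

```python
def image_row_key(project, subject, session, scan, slice_idx):
    """Composite row key: fixed-width, zero-padded fields make the
    lexicographic sort of keys identical to the hierarchical order,
    so hierarchically related rows stay collocated in one region."""
    return f"{project:>8}-{subject:05d}-{session:03d}-{scan:03d}-{slice_idx:04d}"

# Enumerate a small hierarchy in its natural (hierarchical) order...
keys = [image_row_key("brainmri", subj, 1, scan, sl)
        for subj in (1, 2) for scan in (1, 2) for sl in (1, 2)]
# ...and confirm the byte order of the keys matches it exactly.
assert keys == sorted(keys)
print(keys[0])
```

Without the zero padding, subject 10 would sort between subjects 1 and 2 and break collocation, which is why fixed-width fields matter for byte-ordered stores like HBase.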
Affiliation(s)
- Shunxing Bao: Dept of EECS, Vanderbilt University, Nashville, TN 37235, USA
12.
Abstract
In various biomedical applications that collect, handle, and manipulate data, the amounts of data tend to build up into the range identified as big data. In such cases, a design decision has to be made as to what type of database will be used to handle the data. According to past research, the default and classical solution in the biomedical domain has more often than not been the relational database. While this was the norm for a long time, there is an evident trend away from relational databases in favor of other database types and paradigms. Nevertheless, it remains of paramount importance to understand the interrelation between biomedical big data and relational databases. This chapter reviews the pros and cons, discussed and applied in previous research, of using relational databases to store biomedical big data.
Affiliation(s)
- N H Nisansa D de Silva: Department of Computer and Information Science, University of Oregon, 224 Deschutes Hall, 1477 E 13th Ave., Eugene, OR, 97403, USA
13. Schulz WL, Nelson BG, Felker DK, Durant TJS, Torres R. Evaluation of relational and NoSQL database architectures to manage genomic annotations. J Biomed Inform 2016; 64:288-295. PMID: 27810480; DOI: 10.1016/j.jbi.2016.10.015.
Abstract
While the adoption of next generation sequencing has rapidly expanded, the informatics infrastructure used to manage the data generated by this technology has not kept pace. Historically, relational databases have provided much of the framework for data storage and retrieval. Newer technologies based on NoSQL architectures may provide significant advantages in storage and query efficiency, thereby reducing the cost of data management. But their relative advantage when applied to biomedical data sets, such as genetic data, has not been characterized. To this end, we compared the storage, indexing, and query efficiency of a common relational database (MySQL), a document-oriented NoSQL database (MongoDB), and a relational database with NoSQL support (PostgreSQL). When used to store genomic annotations from the dbSNP database, we found the NoSQL architectures to outperform traditional, relational models for speed of data storage, indexing, and query retrieval in nearly every operation. These findings strongly support the use of novel database technologies to improve the efficiency of data management within the biological sciences.
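The relational-versus-document contrast this study benchmarks can be shown compactly. The sketch below uses SQLite in place of MySQL/MongoDB/PostgreSQL, and the dbSNP-like record (rsid, position, gene) is invented; it illustrates only the modeling difference, not the performance comparison:

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Relational model: one typed column per annotation field.
cur.execute("CREATE TABLE snp_rel (rsid TEXT PRIMARY KEY, chrom TEXT, "
            "pos INTEGER, gene TEXT)")
# Document model: the whole annotation kept as a single JSON blob.
cur.execute("CREATE TABLE snp_doc (rsid TEXT PRIMARY KEY, doc TEXT)")

record = {"rsid": "rs12345", "chrom": "7", "pos": 117559590, "gene": "CFTR"}
cur.execute("INSERT INTO snp_rel VALUES (?, ?, ?, ?)",
            (record["rsid"], record["chrom"], record["pos"], record["gene"]))
cur.execute("INSERT INTO snp_doc VALUES (?, ?)",
            (record["rsid"], json.dumps(record)))

# Relational query: the engine filters on a typed, indexable column.
gene_rel = cur.execute("SELECT gene FROM snp_rel WHERE rsid = ?",
                       ("rs12345",)).fetchone()[0]
# Document query: fetch the blob and decode it in the application.
row = cur.execute("SELECT doc FROM snp_doc WHERE rsid = ?",
                  ("rs12345",)).fetchone()
gene_doc = json.loads(row[0])["gene"]
assert gene_rel == gene_doc == "CFTR"
```

The trade-off the paper measures follows from this difference: the document model absorbs schema changes (new annotation fields) without migrations, while the relational model exposes every field to the query planner.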
Affiliation(s)
- Wade L Schulz: Yale University, Department of Laboratory Medicine, New Haven, CT, United States
- Brent G Nelson: University of Minnesota, Department of Psychiatry, Minneapolis, MN, United States
- Thomas J S Durant: Yale University, Department of Laboratory Medicine, New Haven, CT, United States
- Richard Torres: Yale University, Department of Laboratory Medicine, New Haven, CT, United States
14. Wang S, Mares MA, Guo YK. CGDM: collaborative genomic data model for molecular profiling data using NoSQL. Bioinformatics 2016; 32:3654-3660. PMID: 27522085; DOI: 10.1093/bioinformatics/btw531.
Abstract
MOTIVATION High-throughput molecular profiling has greatly improved patient stratification and mechanistic understanding of diseases. With the increasing amount of data used in translational medicine studies in recent years, there is a need to improve the performance of data warehouses in terms of data retrieval and statistical processing. Both relational and key-value models have been used for managing molecular profiling data. Key-value models such as SeqWare have been shown to be particularly advantageous in terms of query processing speed for large datasets. However, further improvement can be achieved, particularly through better indexing techniques in the key-value models that take advantage of the query types specific to high-throughput molecular profiling data. RESULTS In this article, we introduce a Collaborative Genomic Data Model (CGDM), aimed at significantly increasing the query processing speed for the main classes of queries on genomic databases. CGDM creates three Collaborative Global Clustering Index Tables (CGCITs) to address the velocity and variety issues at the cost of limited extra volume. Several benchmarking experiments were carried out, comparing CGDM implemented on HBase to the traditional SQL data model (TDM) implemented on both HBase and MySQL Cluster, using large publicly available molecular profiling datasets taken from NCBI and HapMap. In the microarray case, CGDM on HBase performed up to 246 times faster than TDM on HBase and 7 times faster than TDM on MySQL Cluster. In the single-nucleotide polymorphism case, CGDM on HBase outperformed TDM on HBase by up to 351 times and TDM on MySQL Cluster by up to 9 times. AVAILABILITY AND IMPLEMENTATION The CGDM source code is available at https://github.com/evanswang/CGDM. CONTACT y.guo@imperial.ac.uk.
Affiliation(s)
- Shicai Wang: Data Science Institute, Imperial College London, London, UK
- Yi-Ke Guo: Data Science Institute, Imperial College London, London, UK; School of Computer Science, Shanghai University, Shanghai, China
15. Sempéré G, Philippe F, Dereeper A, Ruiz M, Sarah G, Larmande P. Gigwa: genotype investigator for genome-wide analyses. Gigascience 2016; 5:25. PMID: 27267926; PMCID: PMC4897896; DOI: 10.1186/s13742-016-0131-8.
Abstract
BACKGROUND Exploring the structure of genomes and analyzing their evolution is essential to understanding the ecological adaptation of organisms. However, with the large amounts of data being produced by next-generation sequencing, computational challenges arise in terms of storage, search, sharing, analysis and visualization. This is particularly true with regards to studies of genomic variation, which are currently lacking scalable and user-friendly data exploration solutions. DESCRIPTION Here we present Gigwa, a web-based tool that provides an easy and intuitive way to explore large amounts of genotyping data by filtering it not only on the basis of variant features, including functional annotations, but also on genotype patterns. The data storage relies on MongoDB, which offers good scalability properties. Gigwa can handle multiple databases and may be deployed in either single- or multi-user mode. In addition, it provides a wide range of popular export formats. CONCLUSIONS The Gigwa application is suitable for managing large amounts of genomic variation data. Its user-friendly web interface makes such processing widely accessible. It can either be simply deployed on a workstation or be used to provide a shared data portal for a given community of researchers.
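The two-level filtering described here, on variant features and on genotype patterns, can be sketched with plain Python dicts standing in for MongoDB documents. This mimics the kind of query Gigwa exposes but is not its actual schema; the variant IDs, annotations, and genotypes are invented:

```python
# Hypothetical variant documents as they might sit in a MongoDB collection.
variants = [
    {"id": "chr1:1012", "annotation": "missense",
     "genotypes": {"s1": "0/1", "s2": "1/1", "s3": "0/0"}},
    {"id": "chr1:2044", "annotation": "synonymous",
     "genotypes": {"s1": "0/0", "s2": "0/1", "s3": "0/0"}},
    {"id": "chr2:0333", "annotation": "missense",
     "genotypes": {"s1": "0/0", "s2": "0/0", "s3": "0/0"}},
]

def matches(variant, annotation, min_carriers):
    """Variant-feature filter (functional annotation) combined with a
    genotype-pattern filter: at least `min_carriers` samples carry a
    non-reference (non-0/0) genotype."""
    carriers = sum(gt != "0/0" for gt in variant["genotypes"].values())
    return variant["annotation"] == annotation and carriers >= min_carriers

hits = [v["id"] for v in variants if matches(v, "missense", 1)]
print(hits)
```

In Gigwa's actual MongoDB backend, both conditions would be pushed down into a query document and served by indexes rather than evaluated in application code.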
Collapse
Affiliation(s)
- Guilhem Sempéré
- UMR InterTryp (CIRAD), Campus International de Baillarguet, 34398, Montpellier, Cedex 5, France.
- South Green Bioinformatics Platform, 1000 Avenue Agropolis, 34934, Montpellier, Cedex 5, France.
| | - Florian Philippe
- UMR DIADE (IRD), 911 Avenue Agropolis, 34934, Montpellier, Cedex 5, France
| | - Alexis Dereeper
- South Green Bioinformatics Platform, 1000 Avenue Agropolis, 34934, Montpellier, Cedex 5, France
- UMR IPME (IRD), 911 Avenue Agropolis, 34394, Montpellier, Cedex 5, France
| | - Manuel Ruiz
- South Green Bioinformatics Platform, 1000 Avenue Agropolis, 34934, Montpellier, Cedex 5, France
- UMR AGAP, CIRAD, 34398, Montpellier, Cedex 5, France
- Institut de Biologie Computationnelle, Université de Montpellier, 860 Rue de St Priest, 34095, Montpellier, Cedex 5, France
- Agrobiodiversity Research Area, International Center for Tropical Agriculture (CIAT), 6713, Cali, Colombia
- Gautier Sarah
- South Green Bioinformatics Platform, 1000 Avenue Agropolis, 34934, Montpellier, Cedex 5, France
- INRA, UMR AGAP, 34398, Montpellier, Cedex 5, France
- Pierre Larmande
- South Green Bioinformatics Platform, 1000 Avenue Agropolis, 34934, Montpellier, Cedex 5, France
- UMR DIADE (IRD), 911 Avenue Agropolis, 34934, Montpellier, Cedex 5, France
- Institut de Biologie Computationnelle, Université de Montpellier, 860 Rue de St Priest, 34095, Montpellier, Cedex 5, France
- INRIA Zenith Team, LIRMM, 161 Rue Ada, 34095, Montpellier, Cedex 5, France
16
Satagopam V, Gu W, Eifes S, Gawron P, Ostaszewski M, Gebel S, Barbosa-Silva A, Balling R, Schneider R. Integration and Visualization of Translational Medicine Data for Better Understanding of Human Diseases. BIG DATA 2016; 4:97-108. [PMID: 27441714 PMCID: PMC4932659 DOI: 10.1089/big.2015.0057] [Citation(s) in RCA: 29] [Impact Index Per Article: 3.6] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Indexed: 06/06/2023]
Abstract
Translational medicine is a domain that turns results of basic life science research into new tools and methods in a clinical environment, for example as new diagnostics or therapies. Nowadays, the process of translation is supported by large amounts of heterogeneous data, ranging from medical data to a whole range of -omics data. This is not only a great opportunity but also a great challenge, as translational medicine big data is difficult to integrate and analyze, and requires the involvement of biomedical experts for data processing. We show here that visualization and interoperable workflows, combining multiple complex steps, can address at least parts of this challenge. In this article, we present an integrated workflow for the exploration, analysis and interpretation of translational medicine data in the context of human health. Three web services (tranSMART, a Galaxy Server and a MINERVA platform) are combined into one big data pipeline. Native visualization capabilities give biomedical experts a comprehensive overview of, and control over, the separate steps of the workflow. The capabilities of tranSMART enable flexible filtering of multidimensional integrated data sets to create subsets suitable for downstream processing. A Galaxy Server offers visually aided construction of analytical pipelines, with the use of existing or custom components. A MINERVA platform supports the exploration of health- and disease-related mechanisms in a contextualized analytical visualization system. We demonstrate the utility of our workflow by illustrating its subsequent steps on an existing data set, for which we propose a filtering scheme, an analytical pipeline and a corresponding visualization of analytical results. The workflow is available as a sandbox environment where readers can work with the described setup themselves. Overall, our work shows how visualization and interfacing of big data processing services facilitate the exploration, analysis and interpretation of translational medicine data.
Affiliation(s)
- Venkata Satagopam
- Luxembourg Centre for Systems Biomedicine, University of Luxembourg, Esch-Belval, Luxembourg
- Wei Gu
- Luxembourg Centre for Systems Biomedicine, University of Luxembourg, Esch-Belval, Luxembourg
- Serge Eifes
- Luxembourg Centre for Systems Biomedicine, University of Luxembourg, Esch-Belval, Luxembourg
- Information Technology for Translational Medicine (ITTM) S.A., Esch-Belval, Luxembourg
- Piotr Gawron
- Luxembourg Centre for Systems Biomedicine, University of Luxembourg, Esch-Belval, Luxembourg
- Marek Ostaszewski
- Luxembourg Centre for Systems Biomedicine, University of Luxembourg, Esch-Belval, Luxembourg
- Stephan Gebel
- Luxembourg Centre for Systems Biomedicine, University of Luxembourg, Esch-Belval, Luxembourg
- Adriano Barbosa-Silva
- Luxembourg Centre for Systems Biomedicine, University of Luxembourg, Esch-Belval, Luxembourg
- Rudi Balling
- Luxembourg Centre for Systems Biomedicine, University of Luxembourg, Esch-Belval, Luxembourg
- Reinhard Schneider
- Luxembourg Centre for Systems Biomedicine, University of Luxembourg, Esch-Belval, Luxembourg
17
Gabetta M, Limongelli I, Rizzo E, Riva A, Segagni D, Bellazzi R. BigQ: a NoSQL based framework to handle genomic variants in i2b2. BMC Bioinformatics 2015; 16:415. [PMID: 26714792 PMCID: PMC4696314 DOI: 10.1186/s12859-015-0861-0] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/26/2015] [Accepted: 12/15/2015] [Indexed: 12/25/2022] Open
Abstract
Background Precision medicine requires the tight integration of clinical and molecular data. To this end, it is mandatory to define proper technological solutions able to manage the overwhelming amounts of high-throughput genomic data needed to test associations between genomic signatures and human phenotypes. The i2b2 Center (Informatics for Integrating Biology and the Bedside) has developed a framework, widely adopted internationally, for using existing clinical data in discovery research; coupled with genetic data, it can help define precision medicine interventions. i2b2 can be significantly advanced by designing efficient management solutions for Next Generation Sequencing data. Results We developed BigQ, an extension of the i2b2 framework that integrates patient clinical phenotypes with genomic variant profiles generated by Next Generation Sequencing. A visual programming i2b2 plugin allows retrieving variants belonging to the patients in a cohort by applying filters on genomic variant annotations. We report an evaluation of the query performance of our system on more than 11 million variants, showing that the implemented solution scales linearly in query time and disk space with the number of variants. Conclusions In this paper we describe a new i2b2 web service, composed of an efficient and scalable document-based database that manages annotations of genomic variants, and a visual programming plug-in designed to dynamically perform queries on clinical and genetic data. The system therefore supports the fast-growing volume of genomic variants and can be used to integrate heterogeneous genomic annotations. Electronic supplementary material The online version of this article (doi:10.1186/s12859-015-0861-0) contains supplementary material, which is available to authorized users.
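BigQ's actual web service, plug-in and document schema are not reproduced in the abstract; the sketch below, with invented patient IDs, field names and annotation values, only illustrates the core operation it describes: retrieving annotated variants restricted to the patients of a clinical cohort.

```python
# Minimal sketch under assumed data shapes; BigQ's real i2b2 integration
# and document model are not shown in the abstract.
cohort = {"p1", "p3"}  # patients selected by a clinical (phenotype) query

# One record per (patient, variant), as a document store might hold them
variant_docs = [
    {"patient": "p1", "variant": "chr1:1052A>G", "impact": "HIGH"},
    {"patient": "p2", "variant": "chr1:1052A>G", "impact": "HIGH"},
    {"patient": "p3", "variant": "chr7:880C>T", "impact": "LOW"},
]

def cohort_variants(docs, cohort, impact="HIGH"):
    """Variants carried by cohort members, filtered on an annotation."""
    return sorted({d["variant"] for d in docs
                   if d["patient"] in cohort and d["impact"] == impact})

result = cohort_variants(variant_docs, cohort)
```

The point of pushing this join into a document database, as the paper does, is that the annotation filter and the cohort restriction can both be answered from indexes, which is what makes the linear scaling in variant count reported above plausible.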
Affiliation(s)
- Matteo Gabetta
- Dipartimento di Ingegneria Industriale e dell'Informazione and Center for Health Technologies, Università di Pavia, Pavia, Italy
- Biomeris s.r.l., Pavia, Italy
- Ivan Limongelli
- Dipartimento di Ingegneria Industriale e dell'Informazione and Center for Health Technologies, Università di Pavia, Pavia, Italy
- IRCCS Fondazione Policlinico S. Matteo, Pavia, Italy
- Ettore Rizzo
- Dipartimento di Ingegneria Industriale e dell'Informazione and Center for Health Technologies, Università di Pavia, Pavia, Italy
- Dipartimento di Medicina Molecolare, Università di Pavia, Pavia, Italy
- Alberto Riva
- Interdisciplinary Center for Biotechnology Research, University of Florida, Gainesville, FL, USA.
- Riccardo Bellazzi
- Dipartimento di Ingegneria Industriale e dell'Informazione and Center for Health Technologies, Università di Pavia, Pavia, Italy
- IRCCS Fondazione S. Maugeri, Pavia, Italy
18
Noor AM, Holmberg L, Gillett C, Grigoriadis A. Big Data: the challenge for small research groups in the era of cancer genomics. Br J Cancer 2015; 113:1405-12. [PMID: 26492224 PMCID: PMC4815885 DOI: 10.1038/bjc.2015.341] [Citation(s) in RCA: 41] [Impact Index Per Article: 4.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/02/2015] [Revised: 08/04/2015] [Accepted: 08/09/2015] [Indexed: 01/06/2023] Open
Abstract
In the past decade, cancer research has seen an increasing trend towards high-throughput techniques and translational approaches. The increasing availability of assays that utilise smaller quantities of source material and produce higher volumes of data output has created a need for data storage solutions beyond those previously used. Multifactorial data, both large in sample size and heterogeneous in context, needs to be integrated in a standardised, cost-effective and secure manner. This requires technical solutions and administrative support not normally budgeted for in small- to moderate-sized research groups. In this review, we highlight the Big Data challenges faced by translational research groups in the precision medicine era; an era in which the genomes of over 75 000 patients will be sequenced by the National Health Service over the next 3 years to advance healthcare. In particular, we look at three main themes of data management in relation to cancer research, namely (1) cancer ontology management, (2) IT infrastructures that have been developed to support data management and (3) the unique ethical challenges introduced by utilising Big Data in research.
Affiliation(s)
- Aisyah Mohd Noor
- Research Oncology, Faculty of Life Sciences and Medicine, King's College London, Guy's Hospital, London SE1 9RT, UK
- Lars Holmberg
- Research Oncology, Faculty of Life Sciences and Medicine, King's College London, Guy's Hospital, London SE1 9RT, UK
- Department of Surgical Sciences, Uppsala University, Uppsala 751 85, Sweden
- Cheryl Gillett
- Research Oncology, Faculty of Life Sciences and Medicine, King's College London, Guy's Hospital, London SE1 9RT, UK
- Faculty of Life Sciences and Medicine, King's Health Partners Cancer Biobank, King's College London, Research Oncology, Guy's Hospital, London SE1 9RT, UK
- Anita Grigoriadis
- Research Oncology, Faculty of Life Sciences and Medicine, King's College London, Guy's Hospital, London SE1 9RT, UK
- Breast Cancer Now Research Unit, Research Oncology, Faculty of Life Sciences and Medicine, King's College London, Guy's Hospital, London SE1 9RT, UK
19
Trends in IT Innovation to Build a Next Generation Bioinformatics Solution to Manage and Analyse Biological Big Data Produced by NGS Technologies. BIOMED RESEARCH INTERNATIONAL 2015; 2015:904541. [PMID: 26125026 PMCID: PMC4466500 DOI: 10.1155/2015/904541] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.9] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 12/31/2014] [Revised: 04/01/2015] [Accepted: 04/01/2015] [Indexed: 02/07/2023]
Abstract
Sequencing the human genome began in 1994, and 10 years of work were necessary to provide a nearly complete sequence. Nowadays, NGS technologies allow sequencing of a whole human genome in a few days. This deluge of data challenges scientists in many ways: they face data management issues as well as analysis and visualization drawbacks due to the limitations of current bioinformatics tools. In this paper, we describe how the NGS Big Data revolution changes the way data are managed and analysed. We show how biologists are confronted with an abundance of methods, tools and data formats. To overcome these problems, we focus on Big Data information technology innovations from the web and business intelligence. We underline the interest of NoSQL databases, which are much more efficient than relational databases for this kind of data. Since Big Data leads to a loss of interactivity with data during analysis, owing to high processing times, we describe solutions from business intelligence that restore interactivity whatever the volume of data, illustrating this point with a focus on the Amadea platform. Finally, we discuss the visualization challenges posed by Big Data and present the latest innovations in JavaScript graphic libraries.