Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Almugbel R, Hung LH, Hu J, Almutairy A, Ortogero N, Tamta Y, Yeung KY. Reproducible Bioconductor workflows using browser-based interactive notebooks and containers. J Am Med Inform Assoc 2018;25:4-12. [PMID: 29092073 PMCID: PMC6381817 DOI: 10.1093/jamia/ocx120] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.2] [Received: 05/30/2017] [Revised: 08/31/2017] [Accepted: 09/28/2017] [Indexed: 11/14/2022] Open

Cited by Other Article(s)

Weberpals J, Wang SV. The FAIRification of research in real-world evidence: A practical introduction to reproducible analytic workflows using Git and R. Pharmacoepidemiol Drug Saf 2024;33:e5740. [PMID: 38173166 DOI: 10.1002/pds.5740] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/12/2023] [Revised: 11/29/2023] [Accepted: 11/30/2023] [Indexed: 01/05/2024]

Mallya P, Stevens LM, Zhao J, Hong C, Henao R, Economou-Zavlanos N, Wojdyla DM, Schibler T, Manchanda V, Pencina MJ, Hall JL. Facilitating Harmonization of Variables in Framingham, MESA, ARIC, and REGARDS Studies Through a Metadata Repository. Circ Cardiovasc Qual Outcomes 2023;16:e009938. [PMID: 37850400 PMCID: PMC10841164 DOI: 10.1161/circoutcomes.123.009938] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/19/2023]

Abstract

BACKGROUND

High-quality research in cardiovascular prevention, as in other fields, requires inclusion of a broad range of data sets from different sources. Integrating and harmonizing different data sources are essential to increase generalizability, sample size, and representation of understudied populations, strengthening the evidence for the scientific questions being addressed.

METHODS

Here, we describe an effort to build an open-access repository and interactive online portal for researchers to access the metadata and code harmonizing data from 4 well-known cohort studies: the REGARDS (Reasons for Geographic and Racial Differences in Stroke) study, FHS (Framingham Heart Study), MESA (Multi-Ethnic Study of Atherosclerosis), and ARIC (Atherosclerosis Risk in Communities) study. We introduce a methodology and a framework used for preprocessing and harmonizing variables from multiple studies.

RESULTS

We provide a real-case study and step-by-step guidance to demonstrate the practical utility of our repository and interactive web page. In addition to our successful development of such an open-access repository and interactive web page, this exercise in harmonizing data from multiple cohort studies has revealed several key themes. These themes include the importance of careful preprocessing and harmonization of variables, the value of creating an open-access repository to facilitate collaboration and reproducibility, and the potential for using harmonized data to address important scientific questions and disparities in cardiovascular disease research.

CONCLUSIONS

By integrating and harmonizing these large-scale cohort studies, such a repository may improve the statistical power and representation of understudied cohorts, enabling development and validation of risk prediction models, identification and investigation of risk factors, and creation of a platform for racial disparities research.

REGISTRATION

URL: https://precision.heart.org/duke-ninds.

Serret-Larmande A, Kaltman JR, Avillach P. Streamlining statistical reproducibility: NHLBI ORCHID clinical trial results reproduction. JAMIA Open 2022;5:ooac001. [PMID: 35156003 PMCID: PMC8826998 DOI: 10.1093/jamiaopen/ooac001] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/26/2021] [Revised: 12/23/2021] [Accepted: 01/07/2022] [Indexed: 11/24/2022] Open

Abstract

Reproducibility in medical research has been a long-standing issue. More recently, the COVID-19 pandemic has publicly underlined this fact, as the retraction of several studies reached general media audiences. A significant number of these retractions occurred after in-depth scrutiny of the methodology and results by the scientific community. Consequently, these retractions have undermined confidence in the peer-review process, which is not considered sufficiently reliable to generate trust in the published results. This partly stems from opacity in published results, the practical implementation of the statistical analysis often remaining undisclosed. We present a workflow that uses a combination of informatics tools to foster statistical reproducibility: an open-source programming language, Jupyter Notebook, a cloud-based data repository, and an application programming interface can streamline an analysis and help to kick-start new analyses. We illustrate this principle by (1) reproducing the results of the ORCHID clinical trial, which evaluated the efficacy of hydroxychloroquine in COVID-19 patients, and (2) expanding on the analyses conducted in the original trial by investigating the association of premedication with biological laboratory results. Such workflows will be encouraged for future publications from National Heart, Lung, and Blood Institute-funded studies.

The COVID-19 pandemic has seen several articles published in high-profile journals being retracted. These retractions further undermined confidence in the peer-review process, which is not considered sufficiently reliable to generate trust in the published results. A significant number of these retractions occurred after in-depth scrutiny of the methodology and results by the scientific community. This partly stems from opacity in published results, the practical implementation of the statistical analysis often remaining undisclosed. This article presents a simple workflow that leverages a combination of preexisting and newly developed biomedical informatics tools to promote transparent statistical analysis in biomedical research, and that relies on the National Heart, Lung, and Blood Institute (NHLBI) BioData Catalyst platform. By streamlining access to data and analysis source code, it eases results reproduction and accelerates supplemental analyses. Such workflows will be encouraged for future publications from NHLBI-funded studies. We illustrate it by reproducing the results of the ORCHID clinical trial, which evaluated the efficacy of hydroxychloroquine in COVID-19 patients.

Vesteghem C, Brøndum RF, Sønderkær M, Sommer M, Schmitz A, Bødker JS, Dybkær K, El-Galaly TC, Bøgsted M. Implementing the FAIR Data Principles in precision oncology: review of supporting initiatives. Brief Bioinform 2021;21:936-945. [PMID: 31263868 PMCID: PMC7299292 DOI: 10.1093/bib/bbz044] [Citation(s) in RCA: 21] [Impact Index Per Article: 7.0] [Received: 02/11/2019] [Revised: 03/13/2019] [Accepted: 03/21/2019] [Indexed: 12/26/2022] Open

Nishiwaki H, Hamaguchi T, Ito M, Ishida T, Maeda T, Kashihara K, Tsuboi Y, Ueyama J, Shimamura T, Mori H, Kurokawa K, Katsuno M, Hirayama M, Ohno K. Short-Chain Fatty Acid-Producing Gut Microbiota Is Decreased in Parkinson's Disease but Not in Rapid-Eye-Movement Sleep Behavior Disorder. mSystems 2020;5:e00797-20. [PMID: 33293403 PMCID: PMC7771407 DOI: 10.1128/msystems.00797-20] [Citation(s) in RCA: 61] [Impact Index Per Article: 15.3] [Received: 08/11/2020] [Accepted: 11/16/2020] [Indexed: 02/07/2023] Open

Abstract

Gut dysbiosis has been repeatedly reported in Parkinson's disease (PD) but only once in idiopathic rapid-eye-movement sleep behavior disorder (iRBD) from Germany. Abnormal aggregation of α-synuclein fibrils causing PD possibly starts from the intestine, although this is still currently under debate. iRBD patients frequently develop PD. Early-stage gut dysbiosis that is causally associated with PD is thus expected to be observed in iRBD. We analyzed gut microbiota in 26 iRBD patients and 137 controls by 16S rRNA sequencing (16S rRNA-seq). Our iRBD data set was meta-analyzed with the German iRBD data set and was compared with gut microbiota in 223 PD patients. Unsupervised clustering of gut microbiota by LIGER, a topic model-based tool for single-cell RNA sequencing (RNA-seq) analysis, revealed four enterotypes in controls, iRBD, and PD. Short-chain fatty acid (SCFA)-producing bacteria were conserved in an enterotype observed in controls and iRBD, whereas they were less conserved in enterotypes observed in PD. Genus Akkermansia and family Akkermansiaceae were consistently increased in both iRBD in two countries and PD in five countries. Short-chain fatty acid (SCFA)-producing bacteria were not significantly decreased in iRBD in two countries. In contrast, we previously reported that recognized or putative SCFA-producing genera Faecalibacterium, Roseburia, and Lachnospiraceae ND3007 group were consistently decreased in PD in five countries. In α-synucleinopathy, increase of mucin-layer-degrading genus Akkermansia is observed at the stage of iRBD, whereas decrease of SCFA-producing genera becomes obvious with development of PD.

IMPORTANCE

Twenty studies on gut microbiota in PD have been reported, whereas only one study has been reported on iRBD from Germany. iRBD has the highest likelihood ratio to develop PD. Our meta-analysis of iRBD in Japan and Germany revealed increased mucin-layer-degrading genus Akkermansia in iRBD. Genus Akkermansia may increase the intestinal permeability, as we previously observed in PD patients, and may make the intestinal neural plexus exposed to oxidative stress, which can lead to abnormal aggregation of prion-like α-synuclein fibrils in the intestine. In contrast to PD, SCFA-producing bacteria were not decreased in iRBD. As SCFA induces regulatory T (Treg) cells, a decrease of SCFA-producing bacteria may be a prerequisite for the development of PD. We propose that prebiotic and/or probiotic therapeutic strategies to increase the intestinal mucin layer and to increase intestinal SCFA potentially retard the development of iRBD and PD.

Virkus S, Garoufallou E. Data science and its relationship to library and information science: a content analysis. DATA TECHNOLOGIES AND APPLICATIONS 2020. [DOI: 10.1108/dta-07-2020-0167] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Indexed: 11/17/2022]

Stevens L, Kao D, Hall J, Görg C, Abdo K, Linstead E. ML-MEDIC: A Preliminary Study of an Interactive Visual Analysis Tool Facilitating Clinical Applications of Machine Learning for Precision Medicine. APPLIED SCIENCES (BASEL, SWITZERLAND) 2020;10:3309. [PMID: 33664984 PMCID: PMC7928533 DOI: 10.3390/app10093309] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/17/2022]

Ulfenborg B. Vertical and horizontal integration of multi-omics data with miodin. BMC Bioinformatics 2019;20:649. [PMID: 31823712 PMCID: PMC6902525 DOI: 10.1186/s12859-019-3224-4] [Citation(s) in RCA: 34] [Impact Index Per Article: 6.8] [Received: 03/14/2019] [Accepted: 11/14/2019] [Indexed: 11/10/2022] Open

Coiera E, Ammenwerth E, Georgiou A, Magrabi F. Does health informatics have a replication crisis? J Am Med Inform Assoc 2019;25:963-968. [PMID: 29669066 PMCID: PMC6077781 DOI: 10.1093/jamia/ocy028] [Citation(s) in RCA: 57] [Impact Index Per Article: 11.4] [Received: 12/22/2017] [Accepted: 03/13/2018] [Indexed: 01/27/2023] Open

Building Containerized Workflows Using the BioDepot-Workflow-Builder. Cell Syst 2019;9:508-514.e3. [PMID: 31521606 DOI: 10.1016/j.cels.2019.08.007] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.0] [Received: 12/29/2018] [Revised: 05/21/2019] [Accepted: 08/16/2019] [Indexed: 11/22/2022]

Rodríguez-Pérez H, Hernández-Beeftink T, Lorenzo-Salazar JM, Roda-García JL, Pérez-González CJ, Colebrook M, Flores C. NanoDJ: a Dockerized Jupyter notebook for interactive Oxford Nanopore MinION sequence manipulation and genome assembly. BMC Bioinformatics 2019;20:234. [PMID: 31072312 PMCID: PMC6509807 DOI: 10.1186/s12859-019-2860-z] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 06/22/2018] [Accepted: 04/29/2019] [Indexed: 12/23/2022] Open

Brennan PF, Chiang MF, Ohno-Machado L. Biomedical informatics and data science: evolving fields with significant overlap. J Am Med Inform Assoc 2019;25:2-3. [PMID: 29267964 DOI: 10.1093/jamia/ocx146] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Indexed: 11/14/2022] Open

Kulkarni N, Alessandrì L, Panero R, Arigoni M, Olivero M, Ferrero G, Cordero F, Beccuti M, Calogero RA. Reproducible bioinformatics project: a community for reproducible bioinformatics analysis pipelines. BMC Bioinformatics 2018;19:349. [PMID: 30367595 PMCID: PMC6191970 DOI: 10.1186/s12859-018-2296-x] [Citation(s) in RCA: 40] [Impact Index Per Article: 6.7] [Indexed: 12/11/2022] Open

Abstract

Background

Reproducibility of research is a key element of modern science and is mandatory for any industrial application. It represents the ability to replicate an experiment independently of the location and the operator. Therefore, a study can be considered reproducible only if all the data used are available and the computational analysis workflow is clearly described. However, when reproducing a complex bioinformatics analysis today, the raw data and the list of tools used in the workflow may not be enough to guarantee the reproducibility of the results obtained. Indeed, different releases of the same tools and/or of the system libraries (used by such tools) might lead to subtle reproducibility issues.

Results

To address this challenge, we established the Reproducible Bioinformatics Project (RBP), a non-profit, open-source project that aims to provide a schema and an infrastructure, based on Docker images and an R package, for producing reproducible results in bioinformatics. One or more Docker images are defined for a workflow (typically one for each task), while the workflow implementation is handled via R functions embedded in a package available in a GitHub repository. Thus, a bioinformatician participating in the project first has to integrate her/his workflow modules into Docker image(s), using an Ubuntu Docker image developed ad hoc by RBP to make this task easier. Second, the workflow must be implemented in R according to an R skeleton function made available by RBP, to guarantee homogeneity and reusability among different RBP functions. Moreover, she/he has to provide an R vignette explaining the package functionality, together with an example dataset that can be used to improve user confidence in the workflow.

Conclusions

The Reproducible Bioinformatics Project provides a general schema and an infrastructure to distribute robust and reproducible workflows. Thus, it guarantees final users the ability to consistently repeat any analysis independently of the UNIX-like architecture used.
