1
Chao-Ecija A, Dawid-Milner MS. BaroWavelet: An R-based tool for dynamic baroreflex evaluation through wavelet analysis techniques. Comput Methods Programs Biomed 2023; 242:107758. [PMID: 37688995 DOI: 10.1016/j.cmpb.2023.107758] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/11/2023] [Revised: 07/28/2023] [Accepted: 08/07/2023] [Indexed: 09/11/2023]
Abstract
BACKGROUND AND OBJECTIVE: Baroreflex sensitivity is an indicator of the function of the baroreceptor mechanism that controls blood pressure. It can be computed after estimating heart rate and blood pressure variability. We propose a novel tool for the evaluation of baroreflex sensitivity using wavelet analysis methods. This tool, known as BaroWavelet, incorporates a proposed algorithm based on the analysis methodology of the RHRV software package, as well as other conventional techniques. Our objectives are to develop the tool and to evaluate it by testing its ability to detect changes in baroreflex sensitivity in humans. METHODS: The code for this tool was written in the R programming environment and organized into two analysis routines and a graphical interface. Simulated recordings of blood pressure and inter-beat intervals were used for an initial evaluation of the tool in a controlled environment. Similar recordings obtained during supine and orthostatic postural evaluations, from patients in the open-access EUROBAVAR data set, were then analyzed. RESULTS: BaroWavelet identified the scripted changes in baroreflex sensitivity in the simulated data. The proposed algorithm also retained more information about the dynamics of the baroreflex. In the EUROBAVAR subjects, baroreflex sensitivity components were significantly smaller during orthostatism than in the supine position. CONCLUSIONS: BaroWavelet characterized baroreflex dynamics from the recordings, and the results were consistent with findings reported in the literature. This demonstrates its effectiveness in performing these analyses. We suggest that this tool may be of use in research and in the evaluation of baroreflex sensitivity for clinical and therapeutic purposes.
The new tool is available at the official GitHub repository of the Autonomic Nervous System Unit of the University of Málaga (https://github.com/CIMES-USNA-UMA/BaroWavelet).
Affiliation(s)
- A Chao-Ecija
  - Autonomic Nervous System Unit, CIMES, School of Medicine, University of Málaga, Spain
- M S Dawid-Milner
  - Autonomic Nervous System Unit, CIMES, School of Medicine, University of Málaga, Spain; Instituto de Investigación Biomédica de Málaga (IBIMA), Málaga, Spain
2
Aboul-Ata MA, Qonsua FT, Saadi IAA. Personality Pathology and Suicide Risk: Examining the Relationship Between DSM-5 Alternative Model Traits and Suicidal Ideation and Behavior in College-Aged Individuals. Psychol Rep 2023:332941231218940. [PMID: 38029776 DOI: 10.1177/00332941231218940] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/01/2023]
Abstract
BACKGROUND: This study examines the link between personality pathology and suicide risk under the DSM-5 alternative model of personality disorders. METHOD: The study investigates the facets, domains, and internalizing and externalizing dimensions of personality pathology, and their correlation with and predictive significance for suicidal ideation and behavior. It examined a diverse and balanced sample of 1,398 college students aged 18 to 29 years from nine colleges at Kafrelsheikh University, with nearly equal representation of both genders (687 males, 711 females), a mix of rural and urban residents (807 rural, 591 urban), and a wide range of socioeconomic backgrounds (15 very low SES, 84 low SES, 878 moderate SES, 364 high SES, and 57 very high SES). The Personality Inventory for the DSM-5 (PID-5) was used to assess personality pathology, and the Columbia-Suicide Severity Rating Scale (C-SSRS) was used to evaluate suicidal ideation and behavior. RESULTS AND DISCUSSION: Logistic regression reveals significant associations between personality traits and suicidal ideation (e.g., Anhedonia, Suspiciousness) and behavior (e.g., Risk Taking, Depressivity). Negative Affect and Detachment are significantly linked to suicidal ideation, while Detachment, Disinhibition, and Psychoticism are linked to suicidal behavior. Internalizing personality pathology predicts both ideation and behavior, indicating a contribution to suicidal thoughts and self-destructive acts. Externalizing is a significant predictor of suicidal behavior.
Affiliation(s)
- Faten T Qonsua
  - Department of Psychology, Kafrelsheikh University, Kafr el-Sheikh, Egypt
- Ibrahim A A Saadi
  - Department of Psychology, University of Jeddah, Jeddah, Saudi Arabia
3
Riza LS, Zain MI, Izzuddin A, Prasetyo Y, Hidayat T, Abu Samah KAF. Implementation of machine learning in DNA barcoding for determining the plant family taxonomy. Heliyon 2023; 9:e20161. [PMID: 37767518 PMCID: PMC10520734 DOI: 10.1016/j.heliyon.2023.e20161] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/09/2022] [Revised: 09/05/2023] [Accepted: 09/13/2023] [Indexed: 09/29/2023]
Abstract
The DNA barcoding approach has been used extensively in taxonomy and phylogenetics. Differences in certain DNA sequences can differentiate organisms and help classify them into taxa. The approach has been used in taxonomic disputes where morphology by itself is insufficient. This research aimed to use hierarchical clustering, an unsupervised machine learning method, to resolve disputes in plant family taxonomy. We take as a case study the Leguminosae, which some authors have historically classified into three families (Fabaceae, Caesalpiniaceae, and Mimosaceae) and others into one family (Leguminosae). The study is divided into four phases: (i) data collection, (ii) data preprocessing, (iii) finding the best distance method, and (iv) determining the disputed family. The data were collected from several sources, including the National Center for Biotechnology Information (NCBI), journals, and websites. The data for validating the methods were collected from NCBI and were used to determine the best distance method for differentiating families or genera. The data for the Leguminosae case study were collected from journals and a website. In our experiments, the Pearson method was the best distance method for clustering ITS sequences of plants, in both accuracy and computational cost. We used the Pearson method to settle the disputed family classification within Leguminosae and found that, based on our research, the group should be treated as one family.
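As a rough illustration of the clustering step described in the abstract, the sketch below runs agglomerative clustering with a Pearson correlation distance (1 - r). Python is used for illustration only. The numeric profiles, the sequence labels, the average-linkage choice, and all function names are hypothetical stand-ins, because the abstract does not state how the ITS sequences were encoded or which linkage was used.

```python
import math


def pearson_distance(x, y):
    """Return 1 - Pearson correlation of two equal-length numeric vectors.

    Assumes neither vector is constant (zero variance would divide by zero).
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return 1.0 - cov / math.sqrt(var_x * var_y)


def average_linkage(cluster_a, cluster_b, vectors):
    """Mean pairwise Pearson distance between the members of two clusters."""
    dists = [
        pearson_distance(vectors[i], vectors[j])
        for i in cluster_a
        for j in cluster_b
    ]
    return sum(dists) / len(dists)


def agglomerate(vectors, n_clusters):
    """Merge the closest pair of clusters until n_clusters remain."""
    clusters = [[name] for name in vectors]
    while len(clusters) > n_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = average_linkage(clusters[i], clusters[j], vectors)
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return [sorted(c) for c in clusters]


# Hypothetical numeric profiles (for example, k-mer counts) for four sequences.
toy_profiles = {
    "seq_A": [5, 1, 0, 3],
    "seq_B": [6, 2, 1, 4],
    "seq_C": [0, 4, 5, 1],
    "seq_D": [1, 5, 6, 2],
}
groups = agglomerate(toy_profiles, n_clusters=2)
print(groups)
```

In this toy input, seq_A and seq_B have perfectly correlated profiles, as do seq_C and seq_D, so the two pairs end up in separate clusters.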
Affiliation(s)
- Lala Septem Riza
  - Department of Computer Science Education, Universitas Pendidikan Indonesia, Bandung, Indonesia
- Muhammad Iqbal Zain
  - Department of Computer Science Education, Universitas Pendidikan Indonesia, Bandung, Indonesia
- Ahmad Izzuddin
  - Department of Computer Science Education, Universitas Pendidikan Indonesia, Bandung, Indonesia
- Yudi Prasetyo
  - Department of Computer Science Education, Universitas Pendidikan Indonesia, Bandung, Indonesia
- Topik Hidayat
  - Department of Biology Education, Universitas Pendidikan Indonesia, Bandung, Indonesia
4
Martinez-Blasco M, Serrano V, Prior F, Cuadros J. Analysis of an event study using the Fama-French five-factor model: teaching approaches including spreadsheets and the R programming language. Financ Innov 2023; 9:76. [PMID: 37063168 PMCID: PMC10088769 DOI: 10.1186/s40854-023-00477-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/29/2022] [Accepted: 03/09/2023] [Indexed: 06/19/2023]
Abstract
The current financial education framework has an increasing need for tools that facilitate the application of theoretical models to real-world data and contexts. However, only a limited number of free tools are available for this purpose. To address this gap, the present study provides two approaches that facilitate the implementation of an event study. The first approach consists of a set of MS Excel files based on the Fama-French five-factor model, which allows the event study methodology to be applied in a semi-automatic manner. The second approach is an open-source tool programmed in R through which event study results can be obtained without programming knowledge. This tool extends the calculation possibilities of the first approach and offers the option to apply not only the Fama-French five-factor model but also other models that are common in the financial literature. It is a user-friendly tool that enables reproducibility of the analysis and ensures that the calculations are free of manipulation errors. Both approaches are freely available and ready to use.
Affiliation(s)
- Monica Martinez-Blasco
  - IQS School of Management-Universitat Ramon Llull, Via Augusta 390, 08017 Barcelona, Spain
- Vanessa Serrano
  - IQS School of Engineering-Universitat Ramon Llull, Via Augusta 390, 08017 Barcelona, Spain
- Francesc Prior
  - IQS School of Management-Universitat Ramon Llull, Via Augusta 390, 08017 Barcelona, Spain
- Jordi Cuadros
  - IQS School of Engineering-Universitat Ramon Llull, Via Augusta 390, 08017 Barcelona, Spain
5
Mobini M, Matic N, Gugten JGVD, Ritchie G, Lowe CF, Holmes DT. End-to-End Data Automation for Pooled Sample SARS-CoV-2 Using R and Other Open-Source Tools. J Appl Lab Med 2023; 8:41-52. [PMID: 36610407 DOI: 10.1093/jalm/jfac109] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 07/09/2022] [Accepted: 10/17/2022] [Indexed: 01/09/2023]
Abstract
BACKGROUND: Due to supply chain shortages of reagents for real-time (RT)-PCR for SARS-CoV-2 and increasing demand on technical staff, an end-to-end data automation strategy for SARS-CoV-2 sample pooling and singleton analysis became necessary in the summer of 2020. METHODS: Using entirely open-source software tools (Linux, bash, R, RShiny, ShinyProxy, and Docker), we developed a modular software application stack to manage the preanalytical, analytical, and postanalytical processes for singleton and pooled testing in a 5-week time frame. RESULTS: Pooling was operationalized for 81 days, during which time 64 pooled runs were performed for a total of 5320 sample pools and approximately 21 280 patient samples in 4:1 format. A total of 17 580 negative pooled results were released in bulk. After pooling was discontinued, the application stack was used for singleton analysis and modified to release all viral RT-PCR results from our laboratory. To date, 236 109 samples have been processed, avoiding over 610 000 transcriptions. CONCLUSIONS: We present an end-to-end data automation strategy connecting 11 devices, one network attached storage, 2 Linux servers, and the laboratory information system.
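The pooling figures reported in this abstract can be cross-checked with simple arithmetic. The sketch below (Python, for illustration only; the function names are hypothetical and are not part of the authors' application stack) computes the number of patient samples implied by 5320 pools in 4:1 format and the average number of pools per run over 64 runs.

```python
def samples_in_pools(n_pools, pool_size):
    """Number of patient samples when every pool holds pool_size samples."""
    return n_pools * pool_size


def mean_pools_per_run(n_pools, n_runs):
    """Average number of pools processed in one pooled run."""
    return n_pools / n_runs


# Figures taken from the abstract: 5320 pools, 4:1 format, 64 pooled runs.
total_samples = samples_in_pools(5320, 4)
pools_per_run = mean_pools_per_run(5320, 64)

print(total_samples)  # 21280, matching the "approximately 21 280" reported
print(pools_per_run)  # 83.125
```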
Affiliation(s)
- Mahdi Mobini
  - St. Paul's Hospital Department of Pathology and Laboratory Medicine, Vancouver, BC, Canada
  - Providence Health Emerging Technologies, Vancouver, BC, Canada
- Nancy Matic
  - St. Paul's Hospital Department of Pathology and Laboratory Medicine, Vancouver, BC, Canada
  - University of British Columbia Department of Pathology and Laboratory Medicine, Vancouver, BC, Canada
- J Grace Van Der Gugten
  - St. Paul's Hospital Department of Pathology and Laboratory Medicine, Vancouver, BC, Canada
- Gordon Ritchie
  - St. Paul's Hospital Department of Pathology and Laboratory Medicine, Vancouver, BC, Canada
  - University of British Columbia Department of Pathology and Laboratory Medicine, Vancouver, BC, Canada
- Christopher F Lowe
  - St. Paul's Hospital Department of Pathology and Laboratory Medicine, Vancouver, BC, Canada
  - University of British Columbia Department of Pathology and Laboratory Medicine, Vancouver, BC, Canada
- Daniel T Holmes
  - St. Paul's Hospital Department of Pathology and Laboratory Medicine, Vancouver, BC, Canada
  - University of British Columbia Department of Pathology and Laboratory Medicine, Vancouver, BC, Canada
6
Coleman A, Bose A, Mitra S. Metagenomics Data Visualization Using R. Methods Mol Biol 2023; 2649:359-392. [PMID: 37258873 DOI: 10.1007/978-1-0716-3072-3_19] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/02/2023]
Abstract
Communicating key findings is a crucial part of the research process. Data visualization is the field of graphically representing data to help communicate key findings. Building on previous chapters on data manipulation using the R programming language, this chapter will explore how to use R to plot data and generate high-quality graphics. It will cover plotting using the base R plotting functionality and introduce the famous ggplot2 package [2] that is widely used for data visualization in R. After this general introduction to data visualization tools, the chapter will explore more specific data visualization techniques for metagenomics data and their use cases using these basic packages.
Affiliation(s)
- Alex Coleman
  - Research Computing, IT Services, University of Leeds, Leeds, UK
- Anupam Bose
  - Department of Mathematics, University of Leeds, Leeds, UK
- Suparna Mitra
  - Leeds Institute of Medical Research, University of Leeds, Leeds General Infirmary, Leeds, UK
7
Matthiesen R, Rodriguez MS, Carvalho AS. A Computational Tool for Analysis of Mass Spectrometry Data of Ubiquitin-Enriched Samples. Methods Mol Biol 2023; 2602:205-214. [PMID: 36446977 DOI: 10.1007/978-1-0716-2859-1_15] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/16/2023]
Abstract
Mass spectrometry data on ubiquitin and ubiquitin-like modifiers are becoming increasingly accessible, and the coverage progressively deepens as methodologies mature. This type of mass spectrometry data is linked to specific data analysis pipelines for ubiquitin. This chapter describes a computational tool to facilitate analysis of mass spectrometry data obtained on ubiquitin-enriched samples. For example, the analysis of ubiquitin branch site statistics and functional enrichment analysis against ubiquitin proteasome system protein sets are completed with a few function calls. We foresee that the proposed computational methodology can aid in proximity drug design by, for example, elucidating the expression of E3 ligases and other factors related to the ubiquitin proteasome system.
Affiliation(s)
- Rune Matthiesen
  - iNOVA4Health, NOVA Medical School, Faculdade de Ciências Médicas, NMS, FCM, Universidade Nova de Lisboa, Lisbon, Portugal
- Ana Sofia Carvalho
  - iNOVA4Health, NOVA Medical School, Faculdade de Ciências Médicas, NMS, FCM, Universidade Nova de Lisboa, Lisbon, Portugal
8
Coleman A, Callaghan M. Manipulating and Basic Analysis of Tabular Metagenomics Datasets Using R. Methods Mol Biol 2023; 2649:339-357. [PMID: 37258872 DOI: 10.1007/978-1-0716-3072-3_18] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/02/2023]
Abstract
Handling and manipulating tabular datasets is a critical step in every metagenomics analysis pipeline. The R statistical programming language offers a variety of versatile tools for working with tabular data that allow for the development of computationally efficient and reproducible workflows. Here we outline the basics of the R programming language and showcase a number of tools for data manipulation and basic analysis of metagenomics datasets.
Affiliation(s)
- Alex Coleman
  - Research Computing, IT Services, University of Leeds, Leeds, UK
9
Lortie CJ, Vargas Poulsen C, Brun J, Kui L. Tabular strategies for metadata in ecology, evolution, and the environmental sciences. Ecol Evol 2022; 12:e9245. [PMID: 36035265 PMCID: PMC9405493 DOI: 10.1002/ece3.9245] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/29/2022] [Accepted: 08/04/2022] [Indexed: 11/07/2022]
Abstract
Data support knowledge development and theory advances in ecology and evolution. We are increasingly reusing data within our teams and projects and through the global, openly archived datasets of others. Metadata can be challenging to write and interpret, but it is always crucial for reuse. The value of metadata cannot be overstated, even as a relatively independent research object, because it describes the work that has been done in a structured format. We advance a new perspective and classify methods for metadata curation and development with tables. Tables with templates can be effectively used to capture all components of an experiment or project in a single, easy-to-read file familiar to most scientists. If coupled with the R programming language, metadata from tables can then be rapidly and reproducibly converted to publication formats, including extensible markup language files suitable for data repositories. Tables can also be used to summarize existing metadata and store metadata across many datasets. A case study is provided, and the added benefits of tables for metadata, a priori, are developed to ensure a more streamlined publishing process for many data repositories used in ecology, evolution, and the environmental sciences. In ecology and evolution, researchers are often highly tabular thinkers, from experimental data collection in the lab and/or field onward, and representations of metadata as a table will provide novel research and reuse insights.
Affiliation(s)
- C. J. Lortie
  - National Center for Ecological Analysis and Synthesis, UCSB, Santa Barbara, California, USA
  - Department of Biology, York University, Toronto, Ontario, Canada
- Julien Brun
  - National Center for Ecological Analysis and Synthesis, UCSB, Santa Barbara, California, USA
- Li Kui
  - Marine Science Institute, UCSB, Santa Barbara, California, USA
10
Feretzakis G, Sakagianni A, Kalles D, Loupelis E, Panteris V, Tzelves L, Chatzikyriakou R, Trakas N, Kolokytha S, Batiani P, Rakopoulou Z, Tika A, Petropoulou S, Dalainas I, Kaldis V. Using Machine Learning for Predicting the Hospitalization of Emergency Department Patients. Stud Health Technol Inform 2022; 295:405-408. [PMID: 35773897 DOI: 10.3233/shti220751] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/15/2023]
Abstract
Artificial intelligence processes are increasingly being used in emergency medicine, notably for supporting clinical decisions and potentially improving healthcare services. This study investigated how demographics, coagulation tests, and biochemical markers routinely used for patients seen in the Emergency Department (ED) relate to hospitalization. This retrospective observational study included 13,991 emergency department visits to a tertiary public hospital in Greece during 2020 by patients who had undergone biomarker testing. After applying five well-known classifiers from the caret machine learning package for the R programming language to the whole data set and to each ED unit separately, the best performance in terms of AUC ROC was observed in the Pulmonology ED unit. Furthermore, among the five classification techniques evaluated, a random forest classifier outperformed the other models.
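For readers unfamiliar with the AUC ROC metric used in this abstract, the sketch below computes it from its rank interpretation: the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half. Python is used for illustration only. The labels and scores are invented toy values, and the function name is hypothetical; this is not the study's data or code.

```python
def auc_roc(labels, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly.

    labels: 1 for the positive class (e.g., hospitalized), 0 otherwise.
    scores: predicted risk for each case, higher meaning more likely positive.
    """
    positives = [s for l, s in zip(labels, scores) if l == 1]
    negatives = [s for l, s in zip(labels, scores) if l == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a correct ranking
    return wins / (len(positives) * len(negatives))


# Invented toy data: 1 = admitted, 0 = discharged.
admitted = [1, 0, 1, 1, 0, 0]
predicted_risk = [0.9, 0.2, 0.6, 0.4, 0.5, 0.1]
auc = auc_roc(admitted, predicted_risk)
print(round(auc, 3))  # 8 of 9 pairs are ranked correctly
```

The pairwise loop is quadratic in the number of cases, which is fine for a toy example; a rank-based formula would be the usual choice for a data set of the size described in the abstract.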
Affiliation(s)
- Georgios Feretzakis
  - Hellenic Open University, Patra, Greece
  - Sismanogleio General Hospital of Attica, Marousi, Greece
- Lazaros Tzelves
  - Sismanogleio General Hospital of Attica, Marousi, Greece
  - National and Kapodistrian University of Athens, Athens, Greece
- Zoi Rakopoulou
  - Sismanogleio General Hospital of Attica, Marousi, Greece
- Ilias Dalainas
  - Sismanogleio General Hospital of Attica, Marousi, Greece
11
Abstract
Gene expression profiling is a useful way to measure the activity of genes in molecular biology and, because of its effectiveness, researchers have released thousands of gene expression datasets publicly in online databases and repositories, such as Gene Expression Omnibus (GEO). To read and analyze gene expression data, the computational biology community has developed several tools and platforms, including Bioconductor, an open-source R platform of software packages that can be used to analyze these data. Despite the usefulness of Bioconductor and its packages, it is still difficult to read gene expression data from GEO and to assign gene symbols to the probesets of datasets. To alleviate this problem, we introduce a new R software package, geneExpressionFromGEO, which lets users easily download gene expression data from GEO and associate gene symbols with probesets. In this short chapter, we describe the assets of our software package and report an example of its usage. We believe that geneExpressionFromGEO can be very useful for the R community of bioinformaticians working on gene expression data.
12
Kaplan BA, Franck CT, McKee K, Gilroy SP, Koffarnus MN. Applying Mixed-Effects Modeling to Behavioral Economic Demand: An Introduction. Perspect Behav Sci 2021; 44:333-58. [PMID: 34632281 DOI: 10.1007/s40614-021-00299-7] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Accepted: 05/19/2021] [Indexed: 01/04/2023]
Abstract
Behavioral economic demand methodology is increasingly being used in various fields such as substance use and consumer behavior analysis. Traditional analytical techniques for fitting demand data have proven useful, yet some of these approaches require preprocessing of data, ignore dependence in the data, and present statistical limitations. We term these approaches "fit to group" and "two stage", with the former concerned with group or population level estimates and the latter with individual subject estimates. As an extension of these regression techniques, mixed-effects (or multilevel) modeling can serve as an improvement over these traditional methods. Notable benefits include providing simultaneous group (i.e., population) level estimates (with more accurate standard errors) and individual level predictions while accommodating the inclusion of "nonsystematic" response sets and covariates. These models can also accommodate complex experimental designs, including repeated measures. The goal of this article is to introduce and provide a high-level overview of mixed-effects modeling techniques applied to behavioral economic demand data. We compare and contrast results from traditional techniques with those of the mixed-effects models across two datasets differing in species and experimental design. We discuss the relative benefits and drawbacks of these approaches and provide access to statistical code and data to support the analytical replicability of the comparisons. Supplementary Information: The online version contains supplementary material available at 10.1007/s40614-021-00299-7.
13
Nicolotti L, Hack J, Herderich M, Lloyd N. MStractor: R Workflow Package for Enhancing Metabolomics Data Pre-Processing and Visualization. Metabolites 2021; 11:metabo11080492. [PMID: 34436433 PMCID: PMC8398219 DOI: 10.3390/metabo11080492] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/14/2021] [Revised: 07/23/2021] [Accepted: 07/23/2021] [Indexed: 11/16/2022]
Abstract
Untargeted metabolomics experiments for characterizing complex biological samples, conducted with chromatography/mass spectrometry technology, generate large datasets containing very complex and highly variable information. Many data-processing options are available; however, both commercial and open-source solutions have limitations, such as vendor platform exclusivity and/or requiring familiarity with diverse programming languages. Data processing of untargeted metabolite data is a particular problem for laboratories that specialize in non-routine mass spectrometry analysis of diverse sample types across humans, animals, plants, fungi, and microorganisms. Here, we present MStractor, an R workflow package developed to streamline and enhance pre-processing of metabolomics mass spectrometry data and visualization. MStractor combines functions for molecular feature extraction with user-friendly dedicated GUIs for chromatographic and mass spectrometry (MS) parameter input, graphical quality-control outputs, and descriptive statistics. MStractor performance was evaluated through a detailed comparison with XCMS Online. The MStractor package is freely available on GitHub at the MetabolomicsSA repository.
Affiliation(s)
- Luca Nicolotti
  - The Australian Wine Research Institute, Adelaide, SA 5064, Australia
  - Metabolomics Australia, The Australian Wine Research Institute, Adelaide, SA 5064, Australia
- Jeremy Hack
  - The Australian Wine Research Institute, Adelaide, SA 5064, Australia
  - Metabolomics Australia, The Australian Wine Research Institute, Adelaide, SA 5064, Australia
- Markus Herderich
  - The Australian Wine Research Institute, Adelaide, SA 5064, Australia
  - Metabolomics Australia, The Australian Wine Research Institute, Adelaide, SA 5064, Australia
- Natoiya Lloyd
  - The Australian Wine Research Institute, Adelaide, SA 5064, Australia
  - Metabolomics Australia, The Australian Wine Research Institute, Adelaide, SA 5064, Australia
14
Zhao QY, Luo JC, Su Y, Zhang YJ, Tu GW, Luo Z. Propensity score matching with R: conventional methods and new features. Ann Transl Med 2021; 9:812. [PMID: 34268425 PMCID: PMC8246231 DOI: 10.21037/atm-20-3998] [Citation(s) in RCA: 75] [Impact Index Per Article: 25.0] [Received: 05/16/2020] [Accepted: 10/29/2020] [Indexed: 02/05/2023]
Abstract
It is increasingly important to accurately and comprehensively estimate the effects of particular clinical treatments. Although randomization is the current gold standard, randomized controlled trials (RCTs) are often limited in practice due to ethical and cost issues. Observational studies have also attracted a great deal of attention because large historical datasets are often available for them. However, observational studies have their drawbacks, chiefly systematic differences between treatment and control groups in baseline covariates that relate to outcomes, which can bias results. Propensity score methods, a family of balancing methods for these studies, have become increasingly popular by virtue of two major advantages: dimension reduction and design separation. Within this approach, propensity score matching (PSM) has been shown empirically to perform well across observational datasets. While PSM tutorials are available in the literature, there is still room for improvement. Some PSM tutorials provide step-by-step guidance but cover only one or two packages, which limits their scope and practicality. Several articles and books have expounded upon propensity scores in detail, exploring statistical principles and theories; however, the lack of explanations of function usage in a programming language has made it difficult for researchers to understand and follow these materials. To this end, this tutorial was developed around a six-step PSM framework, in which we summarize recent updates and provide step-by-step guidance in the R programming language. The tutorial offers researchers a broad survey of PSM, ranging from data preprocessing to estimation of propensity scores, and from matching to analyses. We also explain generalized propensity scoring for multiple or continuous treatments, as well as time-dependent PSM. Lastly, we discuss the advantages and disadvantages of propensity score methods.
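To make the matching step of PSM concrete, the sketch below shows greedy 1:1 nearest-neighbour matching without replacement, with a caliper, on propensity scores that are assumed to have been estimated already (for example, by logistic regression). Python is used for illustration only. The subject identifiers, scores, caliper value, and function name are hypothetical, and this is one common matching variant rather than the specific procedure or R packages covered by the tutorial.

```python
def match_nearest(treated, control, caliper):
    """Greedy 1:1 nearest-neighbour matching without replacement.

    treated, control: dicts mapping subject id -> propensity score.
    caliper: largest allowed absolute score difference within a pair.
    Treated subjects are processed from highest to lowest score.
    """
    available = dict(control)  # controls not yet used in a pair
    pairs = []
    ordered = sorted(treated.items(), key=lambda kv: kv[1], reverse=True)
    for t_id, t_score in ordered:
        if not available:
            break
        # Closest remaining control by absolute score difference.
        c_id = min(available, key=lambda cid: abs(available[cid] - t_score))
        if abs(available[c_id] - t_score) <= caliper:
            pairs.append((t_id, c_id))
            del available[c_id]  # each control is matched at most once
    return pairs


# Hypothetical, pre-estimated propensity scores.
treated_scores = {"T1": 0.80, "T2": 0.55, "T3": 0.30}
control_scores = {"C1": 0.78, "C2": 0.50, "C3": 0.10, "C4": 0.57}
matches = match_nearest(treated_scores, control_scores, caliper=0.05)
print(matches)
```

In this toy input, T3 is left unmatched because no remaining control lies within the caliper, which illustrates how a caliper trades sample size for closer matches.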
Affiliation(s)
- Qin-Yu Zhao
  - College of Engineering and Computer Science, Australian National University, Canberra, ACT, Australia
- Jing-Chao Luo
  - Department of Critical Care Medicine, Zhongshan Hospital, Fudan University, Shanghai, China
- Ying Su
  - Department of Critical Care Medicine, Zhongshan Hospital, Fudan University, Shanghai, China
- Yi-Jie Zhang
  - Department of Critical Care Medicine, Zhongshan Hospital, Fudan University, Shanghai, China
- Guo-Wei Tu
  - Department of Critical Care Medicine, Zhongshan Hospital, Fudan University, Shanghai, China
- Zhe Luo
  - Department of Critical Care Medicine, Xiamen Branch, Zhongshan Hospital, Fudan University, Xiamen, China
15
Wu C, Cai X, Yan J, Deng A, Cao Y, Zhu X. Identification of Novel Glycolysis-Related Gene Signatures Associated With Prognosis of Patients With Clear Cell Renal Cell Carcinoma Based on TCGA. Front Genet 2020; 11:589663. [PMID: 33391344 PMCID: PMC7775602 DOI: 10.3389/fgene.2020.589663] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 08/03/2020] [Accepted: 11/16/2020] [Indexed: 12/14/2022]
Abstract
Objective: The purpose of the present study was to detect novel glycolysis-related gene signatures of prognostic value for patients with clear cell renal cell carcinoma (ccRCC). Methods: Glycolysis-related gene sets were acquired from the Molecular Signatures Database (V7.0). Gene Set Enrichment Analysis (GSEA) software (4.0.3) was applied to analyze glycolysis-related gene sets. The Perl programming language (5.32.0) was used to extract glycolysis-related genes and clinical information of patients with ccRCC. The receiver operating characteristic (ROC) curve and Kaplan-Meier curve were drawn with the R programming language (3.6.3). Results: Four glycolysis-related genes (B3GAT3, CENPA, AGL, and ALDH3A2) associated with prognosis were identified using Cox proportional regression analysis. A risk score staging system was established to predict the outcomes of patients with ccRCC. The patients with ccRCC were classified into a low-risk group and a high-risk group. Conclusions: We have successfully constructed a risk staging model for ccRCC. The model has a better performance in predicting the prognosis of patients, which may have positive reference value for the treatment and curative effect evaluation of ccRCC.
Affiliation(s)
- Chengjiang Wu
- Department of Clinical Laboratory, The Second Affiliated Hospital of Soochow University, Suzhou, China
- Xiaojie Cai
- Department of Radiology, Affiliated Changshu Hospital of Soochow University, First People's Hospital of Changshu City, Suzhou, China
- Jie Yan
- Department of Clinical Laboratory, The Second Affiliated Hospital of Soochow University, Suzhou, China
- Anyu Deng
- Department of Clinical Laboratory, The Second Affiliated Hospital of Soochow University, Suzhou, China
- Yun Cao
- Department of Clinical Laboratory, The Second Affiliated Hospital of Soochow University, Suzhou, China
- Xueming Zhu
- Department of Clinical Laboratory, The Second Affiliated Hospital of Soochow University, Suzhou, China
16
Lortie CJ, Braun J, Filazzola A, Miguel F. A checklist for choosing between R packages in ecology and evolution. Ecol Evol 2020; 10:1098-1105. [PMID: 32076500 PMCID: PMC7029065 DOI: 10.1002/ece3.5970] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 07/23/2019] [Revised: 11/18/2019] [Accepted: 12/05/2019] [Indexed: 11/12/2022] Open
Abstract
The open source and free programming language R is a phenomenal mechanism to address a multiplicity of challenges in ecology and evolution. It is also a complex ecosystem because of the diversity of solutions available to the analyst. Packages for R enhance and specialize the capacity to explore both niche data/experiments and more common needs. However, the paradox of choice, or how we select between many seemingly similar options, can be overwhelming and lead to different potential outcomes. There is extensive choice in ecology and evolution between packages for both fundamental statistics and for more specialized domain-level analyses. Here, we provide a checklist to inform these decisions based on the principles of resilience, need, and integration with scientific workflows for evidence. It is important to explore choices in any analytical coding environment (not just R) for solutions to challenges in ecology and evolution, and to document this process because it advances reproducible science, promotes a deeper understanding of the scientific evidence, and ensures that the outcomes are correct, representative, and robust.
Affiliation(s)
- Christopher J. Lortie
- Department of Biology, York University, Toronto, ON, Canada
- The National Center for Ecological Analysis and Synthesis, UCSB, Santa Barbara, CA, USA
- Jenna Braun
- Department of Biology, York University, Toronto, ON, Canada
- Florencia Miguel
- National Scientific and Technical Research Council, CONICET, Buenos Aires, Argentina
17
Zisi C, Pappa-Louisi A, Nikitas P. Separation optimization in HPLC analysis implemented in R programming language. J Chromatogr A 2019; 1617:460823. [PMID: 31932085 DOI: 10.1016/j.chroma.2019.460823] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 10/20/2019] [Revised: 12/18/2019] [Accepted: 12/20/2019] [Indexed: 11/30/2022]
Abstract
A complete package of functions in the R language has been written for professional separation optimization of complex mixtures of ionized and/or non-ionized solutes. The package includes functions for (a) baseline correction of experimentally recorded chromatograms, (b) modeling of chromatographic peak shapes and retention data, (c) prediction of the retention time of the test analytes and/or their chromatograms, and (d) separation optimization under either isocratic or single and/or double gradient elution conditions by changing the organic modifier(s) content and/or eluent pH. The optimization functions presented in this study offer two different modes for selecting optimal separation conditions: automatic and manual. In the automatic mode, the optimal separation conditions are determined by maximizing the resolution within a separation time preset by the analyst. In the manual mode, the optimal separation conditions are selected via scatter or contour plots. The foreknowledge of the precise dependence of resolution and separation time upon one or two retention parameters of interest, provided by the proposed computer-assisted separation optimization method, gives chromatographers a feeling of confidence in selecting the optimal conditions for a desired separation. An illustrative video given in the Supplementary material may encourage a novice practitioner of the R programming language to follow the proposed separation optimization procedure in a real HPLC analysis.
Affiliation(s)
- Ch Zisi
- Department of Chemistry, Aristotle University of Thessaloniki, 54124 Thessaloniki, Greece
- A Pappa-Louisi
- Department of Chemistry, Aristotle University of Thessaloniki, 54124 Thessaloniki, Greece
- P Nikitas
- Department of Chemistry, Aristotle University of Thessaloniki, 54124 Thessaloniki, Greece
18
Abstract
beezdemand: Behavioral Economic Easy Demand, a novel package for performing behavioral economic analyses, is introduced and evaluated. beezdemand extends the R statistical program to facilitate many of the analyses performed in studies of behavioral economic demand. The package supports commonly used options for modeling operant demand and performs data screening, fits models of demand, and calculates numerous measures relevant to applied behavioral economists. The free and open source beezdemand package is compared to commercially available software (i.e., GraphPad Prism™) using peer-reviewed and simulated data. The results of this study indicated that beezdemand provides results consistent with commonly used commercial software but provides a wider range of methods and functionality desirable to behavioral economic researchers. A brief overview of the package is presented, its functionality is demonstrated, and considerations for its use are discussed.
Affiliation(s)
- Brent A. Kaplan
- Virginia Tech Carilion Research Institute, Virginia Polytechnic Institute and State University, 1 Riverside Circle, Roanoke, VA 24016 USA
- Shawn P. Gilroy
- Department of Psychology, Louisiana State University, Baton Rouge, LA USA
- Derek D. Reed
- Department of Applied Behavioral Science, University of Kansas, Lawrence, KS USA
- Mikhail N. Koffarnus
- Virginia Tech Carilion Research Institute, Virginia Polytechnic Institute and State University, 1 Riverside Circle, Roanoke, VA 24016 USA
- Steven R. Hursh
- Institutes for Behavior Resources, Inc., Baltimore, MD USA
- Department of Psychiatry and Behavioral Sciences, Johns Hopkins University School of Medicine, Baltimore, MD USA
19
Shams S, Amlani S, Scicluna M, Gerlai R. Argus: An open-source and flexible software application for automated quantification of behavior during social interaction in adult zebrafish. Behav Res Methods 2019; 51:727-46. [PMID: 30105442 DOI: 10.3758/s13428-018-1083-y] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 11/08/2022]
Abstract
Zebrafish show great potential for behavioral neuroscience. Promising lines of research, however, require the development and validation of software tools that will allow automated and cost-effective behavioral analysis. Building on our previous work with the RealFishTracker (in-house-developed tracking system), we present Argus, a data extraction and analysis tool built in the open-source R language for behavioral researchers without any expertise in R. Argus includes a new, user-friendly, and efficient graphical user interface, instead of a command-line interface, and offers simplicity and flexibility in measuring complex zebrafish behavior through customizable parameters. In this article, we compare Argus with Noldus EthoVision and Noldus The Observer, to validate this new system. All three software applications were originally designed to quantify the behavior of a single subject. We first performed an analysis of the movement of individual fish and compared the performance of the three software applications. Next, we computed and quantified the behavioral variables that characterize dyadic interactions between zebrafish. We found that Argus and EthoVision extract similar absolute values and patterns of changes in these values for several behavioral measures, including speed, freezing, erratic movement, and interindividual distance. In contrast, the manual coding of behavior in The Observer showed weaker correlations with the two tracking methods (EthoVision and Argus). Thus, Argus is a novel, cost-effective, and customizable method for the analysis of adult zebrafish behavior that may be utilized for the behavioral quantification of both single and dyadic interacting subjects, but further sophistication will be needed for the proper identification of complex motor patterns, measures that a human observer can easily detect.
20
Abstract
The measurement of concentrations of drugs and endogenous substances is widely used in basic and clinical pharmacology research and service tasks. Using data science-derived visualizations of laboratory data, it is demonstrated with a real-life example that basic statistical exploration of laboratory assay results or advised standard visual methods of data inspection may fall short in detecting systematic laboratory errors. For example, data pathologies such as always generating the same value in all probes of a particular assay run may pass undetected when using standard methods of data quality checking. It is shown that the use of different data visualizations that emphasize different views of the data may enhance the detection of systematic laboratory errors. A dotplot of single data points in the order of assay is proposed that provides an overview of the data range, outliers, and a particular type of systematic error where similar values are wrongly measured in all probes.
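The data pathology described above, where every probe of one assay run yields the same value, can also be checked programmatically alongside the proposed dotplot. The following is a minimal Python sketch under stated assumptions, not the authors' method: the function name `flag_constant_runs`, the `(run_id, value)` input layout, and the `min_probes` threshold are all hypothetical, and the example values are made up for illustration.

```python
from collections import defaultdict


def flag_constant_runs(measurements, min_probes=3):
    """Return the assay runs in which every probe has the same value.

    measurements: iterable of (run_id, value) pairs in assay order.
    min_probes: hypothetical threshold; runs with fewer probes are skipped
    because a repeated value in a tiny run is weak evidence of an error.
    """
    by_run = defaultdict(list)
    for run_id, value in measurements:
        by_run[run_id].append(value)
    return sorted(
        run_id
        for run_id, values in by_run.items()
        if len(values) >= min_probes and len(set(values)) == 1
    )


# Made-up illustration data, not taken from the study.
example = [
    ("run1", 4.2), ("run1", 5.1), ("run1", 3.9),
    ("run2", 2.0), ("run2", 2.0), ("run2", 2.0),
    ("run3", 6.3), ("run3", 6.1), ("run3", 7.0),
]
print(flag_constant_runs(example))
```

Such a check only catches exactly repeated values; the abstract's point is that a visualization in assay order can also reveal runs with suspiciously similar, not only identical, values.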
Affiliation(s)
- Jörn Lötsch
- Institute of Clinical Pharmacology, Goethe-University, Frankfurt am Main, Germany
- Fraunhofer Institute of Molecular Biology and Applied Ecology, Project Group Translational Medicine and Pharmacology, IME-TMP, Frankfurt am Main, Germany
21
Abstract
Plasma samples from 177 control and type 2 diabetes patients collected at three Australian hospitals are screened for 14 analytes using six custom-made multiplex kits across 60 96-well plates. In total, 354 samples were collected from the patients, representing one baseline and one end point sample from each patient. R methods and source code for analyzing the analyte fluorescence response obtained from these samples by Luminex Bio-Plex® xMap multiplexed immunoassay technology are disclosed. Techniques and R procedures for reading Bio-Plex® result files for statistical analysis and data visualization are also presented. The need for technical replicates and the number of technical replicates are addressed, as well as plate layout design strategies. Multinomial regression is used to determine plate to sample covariate balance. Methods for matching clinical covariate information to Bio-Plex® results and vice versa are given. Methods for measuring and inspecting the quality of the fluorescence responses are also presented. Both fixed and mixed-effect approaches for immunoassay statistical differential analysis are presented and discussed. A random effect approach to outlier analysis and detection is also shown. The bioinformatics R methodology presented here provides a foundation for rigorous and reproducible analysis of the fluorescence response obtained from multiplexed immunoassays.
Affiliation(s)
- Edmond J Breen
- Australian Proteome Analysis Facility (APAF), Macquarie University, Level 4, Building F7B, Research Park Drive, Sydney, NSW, 2109, Australia
22
Komenda M, Ščavnický J, Růžičková P, Karolyi M, Štourač P, Schwarz D. Similarity Detection Between Virtual Patients and Medical Curriculum Using R. Stud Health Technol Inform 2018; 255:222-226. [PMID: 30306941] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/08/2023]
Abstract
This paper, situated in the domains of information sciences, applied informatics and biomedical engineering, proposes to develop methods for the automated detection of similarities between two particular virtual learning environments (virtual patients at Akutne.cz and the OPTIMED curriculum management system) in order to provide support to clinically oriented stages of medical and healthcare studies. For this purpose, the authors used large amounts of text-based data collected by the system for mapping medical curricula and through the system for virtual patient authoring and delivery. The proposed text-mining algorithm for the automated detection of links between content entities of these systems has been successfully implemented by means of a web-based toolbox.
Affiliation(s)
- Martin Komenda
- Institute of Biostatistics and Analyses, Faculty of Medicine, Masaryk University, Czech Republic
- Jakub Ščavnický
- Institute of Biostatistics and Analyses, Faculty of Medicine, Masaryk University, Czech Republic
- Petra Růžičková
- Institute of Biostatistics and Analyses, Faculty of Medicine, Masaryk University, Czech Republic
- Matěj Karolyi
- Institute of Biostatistics and Analyses, Faculty of Medicine, Masaryk University, Czech Republic
- Petr Štourač
- Department of Paediatric Anaesthesiology and Intensive Care Medicine, Faculty of Medicine, Masaryk University, Czech Republic
- Daniel Schwarz
- Institute of Biostatistics and Analyses, Faculty of Medicine, Masaryk University, Czech Republic
23
Shamsara J. Ezqsar: An R Package for Developing QSAR Models Directly From Structures. Open Med Chem J 2017; 11:212-221. [PMID: 29387275 PMCID: PMC5748834 DOI: 10.2174/1874104501711010212] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Received: 08/18/2017] [Revised: 11/01/2017] [Accepted: 11/12/2017] [Indexed: 02/01/2023]
Abstract
Background: Quantitative Structure Activity Relationship (QSAR) is a difficult computational chemistry approach for beginner scientists and a time-consuming one even for more experienced researchers. Method and Materials: Ezqsar, which is introduced here, addresses both issues. It covers the important steps needed to obtain a reliable QSAR model. Besides calculation of descriptors using the CDK library, highly correlated descriptors are removed, a provided data set is divided into training and test sets, descriptors are selected by a statistical method, statistical parameters for the model are presented, and the applicability domain is investigated. Results: Finally, the model can be applied to predict the activities for an extra set of molecules for the purpose of either lead optimization or virtual screening. The performance is demonstrated by an example. Conclusion: The R package, ezqsar, is freely available via https://github.com/shamsaraj/ezqsar, and it runs on Linux and MS-Windows.
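One of the steps named above, the removal of highly correlated descriptors, can be illustrated with a short sketch. The abstract does not state how ezqsar performs this step, so the following Python code is an assumption-laden illustration rather than the package's implementation: it uses a greedy filter on the absolute Pearson correlation, and the function names, the 0.9 threshold, and the descriptor values are all hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        # A constant descriptor has no defined correlation; treated as 0 here.
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def drop_correlated(descriptors, threshold=0.9):
    """Greedily keep descriptors whose |r| with every kept one is below threshold.

    descriptors: dict mapping descriptor name -> list of values per molecule.
    threshold: hypothetical cutoff; not taken from the ezqsar paper.
    """
    kept = []
    for name, values in descriptors.items():
        if all(abs(pearson(values, descriptors[k])) < threshold for k in kept):
            kept.append(name)
    return kept


# Made-up descriptor table for four molecules, for illustration only.
example = {
    "mol_weight": [100, 150, 200, 250],
    "heavy_atoms": [7, 11, 14, 18],
    "logp": [1.2, 3.5, 0.4, 2.0],
}
print(drop_correlated(example))
```

In this toy table, `heavy_atoms` tracks `mol_weight` almost linearly and is dropped, while `logp` is retained. The result of a greedy filter depends on the order of the descriptors, which is one reason the choice of method matters in practice.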
Affiliation(s)
- Jamal Shamsara
- Pharmaceutical Research Center, Pharmaceutical Technology Institute, Mashhad University of Medical Sciences, Mashhad, Iran
24
Abstract
Metabolomics allows for the investigation of the small molecules found within living systems. Based on the design of the experiments, it is not uncommon for these analyses to include matrices of thousands of variables. In order to handle such large datasets, many have turned to multivariate statistical analyses to analyze and understand their data. Herein, we present protocols for using R to analyze metabolomic data using some of the more common multivariate statistical techniques.
25
Lakshmanan K, Peter AP, Mohandass S, Varadharaj S, Lakshmanan U, Dharmar P. SynRio: R and Shiny based application platform for cyanobacterial genome analysis. Bioinformation 2015; 11:422-5. [PMID: 26527850 PMCID: PMC4620618 DOI: 10.6026/97320630011422] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 08/06/2015] [Accepted: 08/28/2015] [Indexed: 11/23/2022] Open
Abstract
SynRio is a Shiny and R based web analysis portal for viewing the Synechocystis PCC 6803 genome, a cyanobacterial genome, with data analysis capabilities. The web-based user interface is created using the R programming language powered by the Shiny package. This web interface helps in creating interactive genome visualization based on user-provided data selection along with selective data download options.
Affiliation(s)
- Karthick Lakshmanan
- National Facility for Marine Cyanobacteria, Sub-Distributed Bioinformatics Center, Department of Marine Biotechnology, School of Marine Sciences, Bharathidasan University, Tiruchirappalli, Tamil Nadu, India
- Arul Prakasham Peter
- National Facility for Marine Cyanobacteria, Sub-Distributed Bioinformatics Center, Department of Marine Biotechnology, School of Marine Sciences, Bharathidasan University, Tiruchirappalli, Tamil Nadu, India
- Shylajanaciyar Mohandass
- National Facility for Marine Cyanobacteria, Sub-Distributed Bioinformatics Center, Department of Marine Biotechnology, School of Marine Sciences, Bharathidasan University, Tiruchirappalli, Tamil Nadu, India
- Sangeetha Varadharaj
- National Facility for Marine Cyanobacteria, Sub-Distributed Bioinformatics Center, Department of Marine Biotechnology, School of Marine Sciences, Bharathidasan University, Tiruchirappalli, Tamil Nadu, India
- Uma Lakshmanan
- National Facility for Marine Cyanobacteria, Sub-Distributed Bioinformatics Center, Department of Marine Biotechnology, School of Marine Sciences, Bharathidasan University, Tiruchirappalli, Tamil Nadu, India
- Prabaharan Dharmar
- National Facility for Marine Cyanobacteria, Sub-Distributed Bioinformatics Center, Department of Marine Biotechnology, School of Marine Sciences, Bharathidasan University, Tiruchirappalli, Tamil Nadu, India
26
Feng D, Baumgartner R, Svetnik V. A Bayesian estimate of the concordance correlation coefficient with skewed data. Pharm Stat 2015; 14:350-8. [PMID: 26033433 DOI: 10.1002/pst.1692] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 07/16/2014] [Revised: 03/16/2015] [Accepted: 05/04/2015] [Indexed: 11/11/2022]
Abstract
Concordance correlation coefficient (CCC) is one of the most popular scaled indices used to evaluate agreement. Most commonly, it is used under the assumption that data is normally distributed. This assumption, however, does not apply to skewed data sets. While methods for the estimation of the CCC of skewed data sets have been introduced and studied, the Bayesian approach and its comparison with the previous methods has been lacking. In this study, we propose a Bayesian method for the estimation of the CCC of skewed data sets and compare it with the best method previously investigated. The proposed method has certain advantages. It tends to outperform the best method studied before when the variation of the data is mainly from the random subject effect instead of error. Furthermore, it allows for greater flexibility in application by enabling incorporation of missing data, confounding covariates, and replications, which was not considered previously. The superiority of this new approach is demonstrated using simulation as well as real-life biomarker data sets used in an electroencephalography clinical study. The implementation of the Bayesian method is accessible through the Comprehensive R Archive Network.
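For readers unfamiliar with the index, the classical sample CCC (Lin's estimator) can be sketched as follows. This is explicitly not the Bayesian method proposed in the article, whose details the abstract does not give; it is only the standard moment-based formula, 2·s_xy / (s_x² + s_y² + (x̄ − ȳ)²), and the function name `concordance_correlation` is a hypothetical choice.

```python
def concordance_correlation(x, y):
    """Classical sample concordance correlation coefficient (Lin's CCC).

    Uses 1/n moment estimates of the variances and covariance.
    This is the standard non-Bayesian estimator, shown for orientation only.
    """
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length, at least 2")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    var_x = sum((a - mean_x) ** 2 for a in x) / n
    var_y = sum((b - mean_y) ** 2 for b in y) / n
    cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    denom = var_x + var_y + (mean_x - mean_y) ** 2
    if denom == 0:
        raise ValueError("CCC is undefined when both series are the same constant")
    return 2 * cov_xy / denom


# Perfect agreement gives 1; a constant shift lowers the CCC even though
# the Pearson correlation of these two series is still exactly 1.
print(concordance_correlation([1, 2, 3], [1, 2, 3]))
print(concordance_correlation([1, 2, 3], [2, 3, 4]))
```

The second call illustrates why the CCC is used to evaluate agreement rather than mere association: the shifted series are perfectly correlated, yet the mean-difference term in the denominator penalizes the systematic offset.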
Affiliation(s)
- Dai Feng
- Merck & Co., Inc., Rahway, NJ, USA
27
Khomtchouk BB, Van Booven DJ, Wahlestedt C. HeatmapGenerator: high performance RNAseq and microarray visualization software suite to examine differential gene expression levels using an R and C++ hybrid computational pipeline. Source Code Biol Med 2014; 9:30. [PMID: 25550709 PMCID: PMC4279803 DOI: 10.1186/s13029-014-0030-2] [Citation(s) in RCA: 41] [Impact Index Per Article: 4.1] [Received: 04/19/2014] [Accepted: 12/04/2014] [Indexed: 11/25/2022]
Abstract
Background: The graphical visualization of gene expression data using heatmaps has become an integral component of modern-day medical research. Heatmaps are used extensively to plot quantitative differences in gene expression levels, such as those measured with RNAseq and microarray experiments, to provide qualitative large-scale views of the transcriptomic landscape. Creating high-quality heatmaps is a computationally intensive task, often requiring considerable programming experience, particularly for customizing features to a specific dataset at hand. Methods: Software to create publication-quality heatmaps is developed with the R programming language, C++ programming language, and OpenGL application programming interface (API) to create industry-grade high performance graphics. Results: We create a graphical user interface (GUI) software package called HeatmapGenerator for Windows OS and Mac OS X as an intuitive, user-friendly alternative for researchers with minimal prior coding experience, allowing them to create publication-quality heatmaps using R graphics without sacrificing their desired level of customization. The simplicity of HeatmapGenerator is that it only requires the user to upload a preformatted input file and download the publicly available R software language, among a few other operating system-specific requirements. Advanced features such as color, text labels, scaling, legend construction, and even database storage can be easily customized with no prior programming knowledge. Conclusion: We provide an intuitive and user-friendly software package, HeatmapGenerator, to create high-quality, customizable heatmaps generated using the high-resolution color graphics capabilities of R. The software is available for Microsoft Windows and Apple Mac OS X. HeatmapGenerator is released under the GNU General Public License and publicly available at: http://sourceforge.net/projects/heatmapgenerator/. The Mac OS X direct download is available at: http://sourceforge.net/projects/heatmapgenerator/files/HeatmapGenerator_MAC_OSX.tar.gz/download. The Windows OS direct download is available at: http://sourceforge.net/projects/heatmapgenerator/files/HeatmapGenerator_WINDOWS.zip/download.
Affiliation(s)
- Bohdan B Khomtchouk
- Center for Therapeutic Innovation and Department of Psychiatry and Behavioral Sciences, University of Miami Miller School of Medicine, 1120 NW 14th ST, Miami, 33136 FL USA
- Derek J Van Booven
- John P. Hussman Institute for Human Genomics, University of Miami Miller School of Medicine, 1501 NW 10th Avenue, Miami, 33136 FL USA
- Claes Wahlestedt
- Center for Therapeutic Innovation and Department of Psychiatry and Behavioral Sciences, University of Miami Miller School of Medicine, 1120 NW 14th ST, Miami, 33136 FL USA