1
Žlahtič B, Kokol P, Blažun Vošner H, Završnik J. The role of correspondence analysis in medical research. Front Public Health 2024; 12:1362699. [PMID: 38584915 PMCID: PMC10995278 DOI: 10.3389/fpubh.2024.1362699] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/28/2023] [Accepted: 03/07/2024] [Indexed: 04/09/2024]
Abstract
Correspondence analysis (CA) is a multivariate statistical and visualization technique. CA is extremely useful in analyzing either two- or multi-way contingency tables, representing some degree of correspondence between columns and rows. The CA results are visualized in easy-to-interpret "bi-plots," where the proximity of items (values of categorical variables) represents the degree of association between presented items. In other words, items positioned near each other are more associated than those located farther away. Each bi-plot has two dimensions, named during the analysis. The naming of dimensions adds a qualitative aspect to the analysis. Correspondence analysis may support medical professionals in finding answers to many important questions related to health, wellbeing, quality of life, and similar topics in a simpler but more informal way than by using more complex statistical or machine learning approaches. In that way, it can be used for dimension reduction and data simplification, clustering, classification, feature selection, knowledge extraction, visualization of adverse effects, or pattern detection.
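The core computation behind a CA bi-plot can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `ca_first_axis`, the power-iteration approach, and the example counts are all assumptions made for this sketch, and it assumes every row and column of the table has a positive total. It computes the standardized residuals of a two-way contingency table, the total inertia (chi-square divided by the sample size), and the row coordinates on the first dimension only.

```python
import math

def ca_first_axis(table, iters=200):
    """First correspondence-analysis axis of a two-way contingency table.

    Returns (total_inertia, first_principal_inertia, row_coordinates).
    """
    n = sum(sum(row) for row in table)
    P = [[x / n for x in row] for row in table]   # correspondence matrix
    r = [sum(row) for row in P]                   # row masses
    c = [sum(col) for col in zip(*P)]             # column masses
    rows, cols = len(r), len(c)

    # Standardized residuals from the independence model r_i * c_j.
    S = [[(P[i][j] - r[i] * c[j]) / math.sqrt(r[i] * c[j])
          for j in range(cols)] for i in range(rows)]
    total_inertia = sum(s * s for row in S for s in row)  # chi-square / n

    # Power iteration on S^T S for the leading right singular vector.
    v = [1.0 / (j + 1) for j in range(cols)]
    for _ in range(iters):
        u = [sum(S[i][j] * v[j] for j in range(cols)) for i in range(rows)]
        w = [sum(S[i][j] * u[i] for i in range(rows)) for j in range(cols)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:  # rows and columns are exactly independent
            return total_inertia, 0.0, [0.0] * rows
        v = [x / norm for x in w]

    # u = S v has length sigma (the leading singular value).
    u = [sum(S[i][j] * v[j] for j in range(cols)) for i in range(rows)]
    sigma_sq = sum(x * x for x in u)
    # Row principal coordinates on the first axis: (S v)_i / sqrt(r_i).
    row_coords = [u[i] / math.sqrt(r[i]) for i in range(rows)]
    return total_inertia, sigma_sq, row_coords

# Hypothetical counts: rows = three patient groups, columns = three outcomes.
example = [[40, 15, 5], [20, 25, 15], [5, 15, 40]]
total, first, coords = ca_first_axis(example)
print(f"total inertia={total:.4f}, first axis share={first / total:.1%}")
print("row coordinates on axis 1:", [round(x, 3) for x in coords])
```

Rows whose coordinates lie close together on the axis have similar column profiles, which is the proximity that a bi-plot displays. A full analysis would extract a second axis and the column coordinates as well, typically with a library SVD routine.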
Affiliation(s)
- Bojan Žlahtič
  - Faculty of Electrical Engineering and Computer Science, University of Maribor, Maribor, Slovenia
- Peter Kokol
  - Faculty of Electrical Engineering and Computer Science, University of Maribor, Maribor, Slovenia
  - Community Healthcare Center dr. Adolf Drolc, Maribor, Slovenia
- Helena Blažun Vošner
  - Community Healthcare Center dr. Adolf Drolc, Maribor, Slovenia
  - Faculty of Health and Social Sciences Slovenj Gradec, Slovenj Gradec, Slovenia
- Jernej Završnik
  - Community Healthcare Center dr. Adolf Drolc, Maribor, Slovenia
  - Alma Mater Europaea, Maribor, Slovenia
2
Song CJ, Park JY. Design of Fire Risk Estimation Method Based on Facility Data for Thermal Power Plants. Sensors (Basel) 2023; 23:8967. [PMID: 37960666 PMCID: PMC10650879 DOI: 10.3390/s23218967] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/06/2023] [Revised: 10/30/2023] [Accepted: 11/01/2023] [Indexed: 11/15/2023]
Abstract
In this paper, we propose a data classification and analysis method to estimate fire risk using facility data of thermal power plants. To estimate fire risk based on facility data, we divided facilities into three states (Steady, Transient, and Anomaly), categorized by their purposes and operational conditions. This method is designed to satisfy three requirements of fire protection systems for thermal power plants. For example, areas with fire risk must be identified, and fire risks should be classified and integrated into existing systems. We classified thermal power plants into turbine, boiler, and indoor coal shed zones. Each zone was subdivided into small pieces of equipment. The turbine, generator, oil-related equipment, hydrogen (H2), and boiler feed pump (BFP) were selected for the turbine zone, while the pulverizer and ignition oil were chosen for the boiler zone. We selected fire-related tags from Supervisory Control and Data Acquisition (SCADA) data and acquired sample data during a specific period for two thermal power plants based on inspection of fire and explosion scenarios in thermal power plants over many years. We focused on crucial fire cases such as pool fires, 3D fires, and jet fires and organized three fire hazard levels for each zone. Experimental analysis was conducted on these data sets using the proposed method for 500 MW and 100 MW thermal power plants. The data classification and analysis methods presented in this paper can provide indirect experience for data analysts who do not have domain knowledge about power plant fires and can also offer good inspiration for data analysts who need to understand power plant facilities.
Affiliation(s)
- Chai-Jong Song
  - Information Media Research Center, Korea Electronics Technology Institute, Seoul 03924, Republic of Korea
3
Elvas LB, Nunes M, Ferreira JC, Dias MS, Rosário LB. AI-Driven Decision Support for Early Detection of Cardiac Events: Unveiling Patterns and Predicting Myocardial Ischemia. J Pers Med 2023; 13:1421. [PMID: 37763188 PMCID: PMC10533089 DOI: 10.3390/jpm13091421] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/30/2023] [Revised: 09/18/2023] [Accepted: 09/19/2023] [Indexed: 09/29/2023]
Abstract
Cardiovascular diseases (CVDs) account for a significant portion of global mortality, emphasizing the need for effective strategies. This study focuses on myocardial infarction, pulmonary thromboembolism, and aortic stenosis, aiming to empower medical practitioners with tools for informed decision making and timely interventions. Drawing from data at Hospital Santa Maria, our approach combines exploratory data analysis (EDA) and predictive machine learning (ML) models, guided by the Cross-Industry Standard Process for Data Mining (CRISP-DM) methodology. EDA reveals intricate patterns and relationships specific to cardiovascular diseases. ML models achieve accuracies above 80%, providing a 13 min window to predict myocardial ischemia incidents and intervene proactively. This paper presents a Proof of Concept for real-time data and predictive capabilities in enhancing medical strategies.
Affiliation(s)
- Luís B. Elvas
  - ISTAR, Instituto Universitário de Lisboa (ISCTE-IUL), 1649-026 Lisbon, Portugal
  - Inov Inesc Inovação - Instituto de Novas Tecnologias, 1000-029 Lisbon, Portugal
- Miguel Nunes
  - ISTAR, Instituto Universitário de Lisboa (ISCTE-IUL), 1649-026 Lisbon, Portugal
- Joao C. Ferreira
  - ISTAR, Instituto Universitário de Lisboa (ISCTE-IUL), 1649-026 Lisbon, Portugal
  - Inov Inesc Inovação - Instituto de Novas Tecnologias, 1000-029 Lisbon, Portugal
- Miguel Sales Dias
  - ISTAR, Instituto Universitário de Lisboa (ISCTE-IUL), 1649-026 Lisbon, Portugal
- Luís Brás Rosário
  - Faculty of Medicine, Lisbon University, Hospital Santa Maria/CHULN, CCUL, 1649-028 Lisbon, Portugal
4
Tanyel T, Nadarajan C, Duc NM, Keserci B. Deciphering Machine Learning Decisions to Distinguish between Posterior Fossa Tumor Types Using MRI Features: What Do the Data Tell Us? Cancers (Basel) 2023; 15:4015. [PMID: 37627043 PMCID: PMC10452543 DOI: 10.3390/cancers15164015] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/25/2023] [Revised: 07/22/2023] [Accepted: 08/02/2023] [Indexed: 08/27/2023]
Abstract
Machine learning (ML) models have become capable of making critical decisions on our behalf. Nevertheless, due to the complexity of these models, interpreting their decisions can be challenging, and humans cannot always control them. This paper provides explanations of decisions made by ML models in diagnosing four types of posterior fossa tumors: medulloblastoma, ependymoma, pilocytic astrocytoma, and brainstem glioma. The proposed methodology involves data analysis using kernel density estimations with Gaussian distributions to examine individual MRI features, conducting an analysis on the relationships between these features, and performing a comprehensive analysis of ML model behavior. This approach offers a simple yet informative and reliable means of identifying and validating distinguishable MRI features for the diagnosis of pediatric brain tumors. By presenting a comprehensive analysis of the responses of the four pediatric tumor types to each other and to ML models in a single source, this study aims to bridge the knowledge gap in the existing literature concerning the relationship between ML and medical outcomes. The results highlight that employing a simplistic approach in the absence of very large datasets leads to significantly more pronounced and explainable outcomes, as expected. Additionally, the study also demonstrates that the pre-analysis results consistently align with the outputs of the ML models and the clinical findings reported in the existing literature.
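The kernel density estimation step mentioned above can be illustrated with a short sketch. It is not the authors' code: the function name `gaussian_kde`, the normal-reference rule used for the default bandwidth, and the feature values are assumptions chosen for illustration. The estimate places a Gaussian bump on each observed value and averages the bumps.

```python
import math
import statistics

def gaussian_kde(samples, bandwidth=None):
    """Return a function that evaluates a Gaussian kernel density estimate."""
    n = len(samples)
    if bandwidth is None:
        # Normal-reference rule of thumb; an assumed default, not the paper's choice.
        bandwidth = 1.06 * statistics.stdev(samples) * n ** (-1 / 5)
    h = bandwidth

    def density(x):
        # Average of Gaussian kernels centred on each sample, scaled by h.
        total = sum(math.exp(-0.5 * ((x - xi) / h) ** 2) for xi in samples)
        return total / (n * h * math.sqrt(2 * math.pi))

    return density

# Hypothetical values of one MRI feature for two tumor types (made-up numbers).
type_a = [0.62, 0.70, 0.66, 0.74, 0.68, 0.71]
type_b = [1.10, 1.25, 1.18, 1.05, 1.30, 1.21]
kde_a, kde_b = gaussian_kde(type_a), gaussian_kde(type_b)

# A feature separates the two groups well where one density dominates the other.
for x in (0.7, 0.9, 1.2):
    print(f"x={x}: density A={kde_a(x):.3f}, density B={kde_b(x):.3f}")
```

Plotting the two density curves on the same axis gives a quick visual check of how much the groups overlap on a single feature before any classifier is trained.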
Affiliation(s)
- Toygar Tanyel
  - Department of Computer Engineering, Yildiz Technical University, Istanbul 34349, Türkiye
- Chandran Nadarajan
  - Department of Radiology, Gleneagles Hospital Kota Kinabalu, Kota Kinabalu 88100, Sabah, Malaysia
- Nguyen Minh Duc
  - Department of Radiology, Pham Ngoc Thach University of Medicine, Ho Chi Minh City 700000, Vietnam
- Bilgin Keserci
  - Department of Biomedical Engineering, Yildiz Technical University, Istanbul 34349, Türkiye
5
Jakab-Nácsa A, Garami A, Fiser B, Farkas L, Viskolcz B. Towards Machine Learning in Heterogeneous Catalysis - A Case Study of 2,4-Dinitrotoluene Hydrogenation. Int J Mol Sci 2023; 24:11461. [PMID: 37511224 PMCID: PMC10380742 DOI: 10.3390/ijms241411461] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/16/2023] [Revised: 06/22/2023] [Accepted: 07/11/2023] [Indexed: 07/30/2023]
Abstract
Utilization of multivariate data analysis in catalysis research has extraordinary importance. The aim of the MIRA21 (MIskolc RAnking 21) model is to characterize heterogeneous catalysts with bias-free quantifiable data from 15 different variables to standardize catalyst characterization and provide an easy tool to compare, rank, and classify catalysts. The present work introduces and mathematically validates the MIRA21 model by identifying fundamentals affecting catalyst comparison and provides support for catalyst design. Literature data of 2,4-dinitrotoluene hydrogenation catalysts for toluene diamine synthesis were analyzed by using the descriptor system of MIRA21. In this study, exploratory data analysis (EDA) has been used to understand the relationships between individual variables such as catalyst performance, reaction conditions, catalyst compositions, and sustainable parameters. The results will be applicable in catalyst design, and using machine learning tools will also be possible.
Affiliation(s)
- Alexandra Jakab-Nácsa
  - BorsodChem Ltd., Bolyai tér 1, H-3700 Kazincbarcika, Hungary
  - Institute of Chemistry, Faculty of Materials Science and Engineering, University of Miskolc, H-3515 Miskolc-Egyetemváros, Hungary
- Attila Garami
  - Institute of Energy, Ceramics and Polymer Technology, University of Miskolc, H-3515 Miskolc, Hungary
- Béla Fiser
  - Higher Education and Industrial Cooperation Centre, University of Miskolc, H-3515 Miskolc, Hungary
  - Ferenc Rakoczi II Transcarpathian Hungarian College of Higher Education, 90200 Beregszász, Transcarpathia, Ukraine
  - Department of Physical Chemistry, Faculty of Chemistry, University of Lodz, 90-236 Lodz, Poland
- László Farkas
  - BorsodChem Ltd., Bolyai tér 1, H-3700 Kazincbarcika, Hungary
  - Institute of Chemistry, Faculty of Materials Science and Engineering, University of Miskolc, H-3515 Miskolc-Egyetemváros, Hungary
- Béla Viskolcz
  - Institute of Chemistry, Faculty of Materials Science and Engineering, University of Miskolc, H-3515 Miskolc-Egyetemváros, Hungary
  - Higher Education and Industrial Cooperation Centre, University of Miskolc, H-3515 Miskolc, Hungary
6
La Corte JC. "Classifying D/s Profiles Without Prior Assumptions: An Application of Cluster Analysis to Social Data". J Homosex 2023; 70:1549-1584. [PMID: 35166194 DOI: 10.1080/00918369.2022.2036534] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/12/2023]
Abstract
Dominant/submissive role-play (D/s) is associated with specialized roles including Mistress, Master, Slave, Switch, Sadist, and Masochist. The current study uses cluster analysis to provide empirical evidence that no binary opposition or single spectrum constitutes a workable typology of individuals based on their affinities for these roles. The optimality of a particular choice of clustering scheme, including the number of clusters, is established using a replication technique which is presented in detail. A large number (n = 236,353) of individualized results (profiles) generated by the BDSM Test, a popular anonymous web survey, were analyzed. We hypothesize a two-dimensional typology of D/s profiles as the inferential result of our cluster analyses.
Affiliation(s)
- Julie C La Corte
  - Department of Mathematics, Computer Science & Engineering, Georgia State University, Atlanta, Georgia, USA
7
Alalayah KM, Senan EM, Atlam HF, Ahmed IA, Shatnawi HSA. Automatic and Early Detection of Parkinson's Disease by Analyzing Acoustic Signals Using Classification Algorithms Based on Recursive Feature Elimination Method. Diagnostics (Basel) 2023; 13:1924. [PMID: 37296776 DOI: 10.3390/diagnostics13111924] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/11/2023] [Revised: 05/23/2023] [Accepted: 05/27/2023] [Indexed: 06/12/2023]
Abstract
Parkinson's disease (PD) is a neurodegenerative condition caused by the dysfunction of brain cells and a 60-80% loss of their ability to produce dopamine, an organic chemical responsible for controlling a person's movement. This condition causes PD symptoms to appear. Diagnosis involves many physical and psychological tests and specialist examinations of the patient's nervous system, which causes several issues. The method of early diagnosis of PD is based on analysing voice disorders. This method extracts a set of features from a recording of the person's voice. Then machine-learning (ML) methods are used to analyse and diagnose the recorded voice to distinguish Parkinson's cases from healthy ones. This paper proposes novel techniques to optimize the early diagnosis of PD by evaluating selected features and tuning the hyperparameters of ML algorithms for diagnosing PD based on voice disorders. The dataset was balanced by the synthetic minority oversampling technique (SMOTE) and features were arranged according to their contribution to the target characteristic by the recursive feature elimination (RFE) algorithm. We applied two algorithms, t-distributed stochastic neighbour embedding (t-SNE) and principal component analysis (PCA), to reduce the dimensions of the dataset. The features resulting from both t-SNE and PCA were then fed into the following classifiers: support-vector machine (SVM), K-nearest neighbours (KNN), decision tree (DT), random forest (RF), and multilayer perceptron (MLP). Experimental results indicated that the proposed techniques outperformed existing studies: RF with the t-SNE algorithm yielded an accuracy of 97%, precision of 96.50%, recall of 94%, and F1-score of 95%. In addition, MLP with the PCA algorithm yielded an accuracy of 98%, precision of 97.66%, recall of 96%, and F1-score of 96.66%.
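The class-balancing step can be sketched as follows. This is a simplified illustration of the interpolation idea behind SMOTE, not the authors' pipeline or a library implementation: the function name `smote_like_oversample`, its parameters, and the feature vectors are hypothetical. Each synthetic sample is placed at a random point on the segment between a minority-class sample and one of its nearest minority-class neighbours.

```python
import random

def smote_like_oversample(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points by interpolating between minority samples.

    Assumes at least two minority samples, each a list of floats.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # The k nearest other minority samples by squared Euclidean distance.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment, in [0, 1)
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic

# Hypothetical two-feature voice measurements for the minority class.
minority_class = [[0.20, 1.10], [0.25, 1.00], [0.30, 1.20], [0.22, 1.15]]
new_points = smote_like_oversample(minority_class, n_new=3)
for point in new_points:
    print([round(v, 3) for v in point])
```

Because every synthetic point lies between two real minority samples, the new data stays inside the region the minority class already occupies. Oversampling is normally applied to the training split only, so that synthetic points do not leak into the evaluation data.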
Affiliation(s)
- Khaled M Alalayah
  - Department of Computer Science, Faculty of Science and Arts, Najran University, Sharurah 68341, Saudi Arabia
- Ebrahim Mohammed Senan
  - Department of Artificial Intelligence, Faculty of Computer Science and Information Technology, Alrazi University, Sana'a, Yemen
- Hany F Atlam
  - Cyber Security Centre, WMG, University of Warwick, Coventry CV4 7AL, UK
8
Ramirez-Camba CD, Levesque CL. The Linear-Logistic Model: A Novel Paradigm for Estimating Dietary Amino Acid Requirements. Animals (Basel) 2023; 13:1708. [PMID: 37238138 DOI: 10.3390/ani13101708] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/28/2023] [Revised: 05/16/2023] [Accepted: 05/20/2023] [Indexed: 05/28/2023]
Abstract
This study aimed to determine whether current methods for estimating AA requirements for animal health and welfare are sufficient. An exploratory data analysis (EDA) was conducted, which involved a review of assumptions underlying AA requirements research, a data mining approach to identify animal responses to dietary AA levels exceeding those for maximum protein retention, and a literature review to assess the physiological relevance of the linear-logistic model developed through the data mining approach. The results showed that AA dietary levels above those for maximum growth resulted in improvements in key physiological responses, and the linear-logistic model depicted the AA level at which growth and protein retention rates were maximized, along with key metabolic functions related to milk yield, litter size, immune response, intestinal permeability, and plasma AA concentrations. The results suggest that current methods based solely on growth and protein retention measurements are insufficient for optimizing key physiological responses associated with health, survival, and reproduction. The linear-logistic model could be used to estimate AA doses that optimize these responses and, potentially, survival rates.
Affiliation(s)
- Christian D Ramirez-Camba
  - Department of Animal Science, University of Minnesota, St. Paul, MN 55108, USA
  - Department of Animal Science, South Dakota State University, Brookings, SD 57007, USA
- Crystal L Levesque
  - Department of Animal Science, South Dakota State University, Brookings, SD 57007, USA
9
Alsabhan W. Student Cheating Detection in Higher Education by Implementing Machine Learning and LSTM Techniques. Sensors (Basel) 2023; 23:4149. [PMID: 37112489 PMCID: PMC10142698 DOI: 10.3390/s23084149] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/12/2023] [Revised: 04/07/2023] [Accepted: 04/17/2023] [Indexed: 06/19/2023]
Abstract
Both paper-based and computerized exams have a high level of cheating. It is, therefore, desirable to be able to detect cheating accurately. Keeping the academic integrity of student evaluations intact is one of the biggest issues in online education. There is a substantial possibility of academic dishonesty during final exams since teachers are not directly monitoring students. In this study, we suggest a novel method for identifying possible exam-cheating incidents using Machine Learning (ML) approaches. The 7WiseUp behavior dataset compiles data from surveys, sensor data, and institutional records to improve student well-being and academic performance. It offers information on academic achievement, student attendance, and behavior in general. The dataset is designed for use in research on student behavior and performance, in order to build models for predicting academic accomplishment, identifying at-risk students, and detecting problematic behavior. Our approach surpassed all three prior reference efforts, achieving an accuracy of 90%; it used a long short-term memory (LSTM) network with a dropout layer, dense layers, and the Adam optimizer. The increased accuracy is credited to a more intricate and optimized architecture and hyperparameters. In addition, the increased accuracy could have been caused by how we cleaned and prepared our data. More investigation and analysis are required to determine the precise elements that led to our model's superior performance.
Affiliation(s)
- Waleed Alsabhan
  - College of Engineering, Al Faisal University, P.O. Box 50927, Riyadh 11533, Saudi Arabia
10
Marshall ADA, Hasdianda MA, Miyawaki S, Jambaulikar GD, Cao C, Chen P, Baugh CW, Zhang H, McCabe J, Steinbach L, King S, Friedman J, Su J, Landman AB, Chai PR. A Pilot of Digital Whiteboards for Improving Patient Satisfaction in the Emergency Department: Nonrandomized Controlled Trial. JMIR Form Res 2023; 7:e44725. [PMID: 36943360 PMCID: PMC10131606 DOI: 10.2196/44725] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/01/2022] [Revised: 01/27/2023] [Accepted: 02/10/2023] [Indexed: 02/12/2023]
Abstract
BACKGROUND: Electronic paper (E-paper) screens use electrophoretic ink to provide paper-like low-power displays with advanced networking capabilities that may potentially serve as an alternative to traditional whiteboards and television display screens in hospital settings. E-paper may be leveraged in the emergency department (ED) to facilitate communication. Providing ED patient status updates on E-paper screens could improve patient satisfaction and overall experience and provide more equitable access to their health information.
OBJECTIVE: We aimed to pilot a patient-facing digital whiteboard using E-paper to display relevant orienting and clinical information in real time to ED patients. We also sought to assess patients' satisfaction after our intervention and understand our patients' overall perception of the impact of the digital whiteboards on their stay.
METHODS: We deployed a 41-inch E-paper digital whiteboard in 4 rooms in an urban, tertiary care, and academic ED and enrolled 110 patients to understand and evaluate their experience. Participants completed a modified Hospital Consumer Assessment of Health Care Provider and Systems satisfaction questionnaire about their ED stay. We compared responses to a matched control group of patients triaged to ED rooms without digital whiteboards. We designed the digital whiteboard based on iterative feedback from various departmental stakeholders. After establishing IT infrastructure to support the project, we enrolled patients on a convenience basis into a control and an intervention (digital whiteboard) group. Enrollees were given a baseline survey to evaluate their comfort with technology and an exit survey to evaluate their opinions of the digital whiteboard and overall ED satisfaction. Statistical analysis was performed to compare baseline characteristics as well as satisfaction.
RESULTS: After the successful prototyping and implementation of 4 digital whiteboards, we screened 471 patients for inclusion. We enrolled 110 patients, and 50 patients in each group (control and intervention) completed the study protocol. Age, gender, and racial and ethnic composition were similar between groups. We saw significant increases in satisfaction on postvisit surveys when patients were asked about communication regarding delays (P=.03) and what to do after discharge (P=.02). We found that patients in the intervention group were more likely to recommend the facility to family and friends (P=.04). Additionally, 96% (48/50) stated that they preferred a room with a digital whiteboard, and 70% (35/50) found the intervention "quite a bit" or "extremely" helpful in understanding their ED stay.
CONCLUSIONS: Digital whiteboards are a feasible and acceptable method of displaying patient-facing data in the ED. Our pilot suggested that E-paper screens coupled with relevant, real-time clinical data and packaged together as a digital whiteboard may positively impact patient satisfaction and the perception of the facility during ED visits. Further study is needed to fully understand the impact on patient satisfaction and experience.
TRIAL REGISTRATION: ClinicalTrials.gov NCT04497922; https://clinicaltrials.gov/ct2/show/NCT04497922.
Affiliation(s)
- Andrew D A Marshall
  - Department of Emergency Medicine, Harvard Medical School, Boston, MA, United States
  - Department of Emergency Medicine, Brigham and Women's Hospital, Boston, MA, United States
- Mohammad Adrian Hasdianda
  - Department of Emergency Medicine, Harvard Medical School, Boston, MA, United States
  - Department of Emergency Medicine, Brigham and Women's Hospital, Boston, MA, United States
- Steven Miyawaki
  - Department of Emergency Medicine, Brigham and Women's Hospital, Boston, MA, United States
- Chenze Cao
  - Brigham Digital Innovation Hub, Brigham and Women's Hospital, Boston, MA, United States
- Paul Chen
  - Department of Emergency Medicine, Harvard Medical School, Boston, MA, United States
  - Department of Emergency Medicine, Brigham and Women's Hospital, Boston, MA, United States
- Christopher W Baugh
  - Department of Emergency Medicine, Harvard Medical School, Boston, MA, United States
  - Department of Emergency Medicine, Brigham and Women's Hospital, Boston, MA, United States
- Haipeng Zhang
  - Brigham Digital Innovation Hub, Brigham and Women's Hospital, Boston, MA, United States
  - Department of Psychosocial Oncology and Palliative Care, Dana Farber Cancer Institute, Boston, MA, United States
- Jonathan McCabe
  - Department of Emergency Medicine, Brigham and Women's Hospital, Boston, MA, United States
- Lee Steinbach
  - eVideon Corporation, Grand Rapids, MI, United States
- Scott King
  - eVideon Corporation, Grand Rapids, MI, United States
- Jennifer Su
  - E Ink Corporation, Billerica, MA, United States
- Adam B Landman
  - Department of Emergency Medicine, Harvard Medical School, Boston, MA, United States
  - Department of Emergency Medicine, Brigham and Women's Hospital, Boston, MA, United States
  - Brigham Digital Innovation Hub, Brigham and Women's Hospital, Boston, MA, United States
  - Mass General Brigham Digital, Somerville, MA, United States
- Peter Ray Chai
  - Department of Emergency Medicine, Harvard Medical School, Boston, MA, United States
  - Department of Emergency Medicine, Brigham and Women's Hospital, Boston, MA, United States
  - Department of Psychosocial Oncology and Palliative Care, Dana Farber Cancer Institute, Boston, MA, United States
  - The Koch Institute for Integrated Cancer Research, Massachusetts Institute of Technology, Cambridge, MA, United States
  - The Fenway Institute, Boston, MA, United States
11
Abstract
Post hoc power estimates are often requested by reviewers and/or performed by researchers after a study has been conducted. The purpose of this commentary is to provide a heuristic explanation of why post hoc power should not be used. To illustrate our point, we provide a detailed simulation study of two essentially identical research experiments hypothetically conducted in parallel at two separate universities. The simulation demonstrates that post hoc power calculations are misleading and simply not informative for data interpretation. As such, we encourage authors and peer-reviewers to avoid using or requesting post hoc power calculations.
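One way to see why post hoc power adds no information is the following sketch. It is an analytic illustration, not the simulation study described in the commentary, and it assumes a two-sided z-test with the observed effect plugged in as if it were the true effect; the function name `observed_power` is hypothetical. Under that assumption, "observed power" is a fixed function of the p-value alone.

```python
from statistics import NormalDist

def observed_power(p_value, alpha=0.05):
    """Post hoc ("observed") power of a two-sided z-test, given only its p-value."""
    std = NormalDist()
    z_obs = std.inv_cdf(1 - p_value / 2)   # |z| implied by the p-value
    z_crit = std.inv_cdf(1 - alpha / 2)    # rejection threshold
    # Probability of rejecting if the true standardized effect equalled z_obs.
    return (1 - std.cdf(z_crit - z_obs)) + std.cdf(-z_crit - z_obs)

for p in (0.01, 0.05, 0.20, 0.50):
    print(f"p = {p:.2f} -> observed power = {observed_power(p):.3f}")
```

A result that sits exactly at the significance threshold always has observed power of about one half, and any non-significant result has observed power below that. The calculation therefore restates the p-value and cannot explain why a study failed to reach significance.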
Affiliation(s)
- Lacey W. Heinsberg
  - Department of Human Genetics, School of Public Health, University of Pittsburgh, Pittsburgh, Pennsylvania, USA
- Daniel E. Weeks
  - Department of Human Genetics, School of Public Health, University of Pittsburgh, Pittsburgh, Pennsylvania, USA
  - Department of Biostatistics, School of Public Health, University of Pittsburgh, Pittsburgh, Pennsylvania, USA
12
Simões RF, Oliveira PJ, Cunha-Oliveira T, Pereira FB. Evaluation of 6-Hydroxydopamine and Rotenone In Vitro Neurotoxicity on Differentiated SH-SY5Y Cells Using Applied Computational Statistics. Int J Mol Sci 2022; 23:3009. [PMID: 35328430 DOI: 10.3390/ijms23063009] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/10/2022] [Revised: 03/04/2022] [Accepted: 03/07/2022] [Indexed: 11/22/2022]
Abstract
With the increase in life expectancy and consequent aging of the world’s population, the prevalence of many neurodegenerative diseases is increasing, without concomitant improvement in diagnostics and therapeutics. These diseases share neuropathological hallmarks, including mitochondrial dysfunction. In fact, as mitochondrial alterations appear prior to neuronal cell death at an early phase of a disease’s onset, the study and modulation of mitochondrial alterations have emerged as promising strategies to predict and prevent neurotoxicity and neuronal cell death before the onset of cell viability alterations. In this work, differentiated SH-SY5Y cells were treated with the mitochondrial-targeted neurotoxicants 6-hydroxydopamine and rotenone. These compounds were used at different concentrations and for different time points to understand the similarities and differences in their mechanisms of action. To accomplish this, data on mitochondrial parameters were acquired and analyzed using unsupervised (hierarchical clustering) and supervised (decision tree) machine learning methods. Both biochemical and computational analyses resulted in an evident distinction between the neurotoxic effects of 6-hydroxydopamine and rotenone, specifically for the highest concentrations of both compounds.
13
Behera J, Pasayat AK, Behera H. COVID-19 Vaccination Effect on Stock Market and Death Rate in India. Asia-Pac Financ Markets 2022; 29:651-673. [PMCID: PMC8913195 DOI: 10.1007/s10690-022-09364-w] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Accepted: 02/18/2022] [Indexed: 06/16/2023]
Abstract
The COVID-19 epidemic has brought attention to our vulnerability to new illnesses, and immunization remains a viable option for resuming normal life. This paper examines the influence of COVID-19 vaccination on the death rate and the performance of the stock market in India. For this study, COVID-19 vaccination and death rate data were gathered from the Ministry of Health and Family Welfare (MoHFW) portal, and the data for the stock index were taken from the Bombay Stock Exchange (BSE), India. In order to achieve a precise representation of feature significance and distribution, exploratory data analysis (EDA) is utilized in this study. The impact of COVID-19 immunization on the mortality rate and stock market index is investigated using both statistical analysis and machine learning regression-based models. The models are remarkably accurate in reproducing the actual results. The empirical study suggests that vaccination has a strong positive impact on the stock market and on reducing the death rate. Furthermore, the policies recommended by government and monetary authorities, coupled with COVID-19 vaccination, supported the stock market recovery during the pandemic.
Affiliation(s)
- Jyotirmayee Behera
  - Department of Mathematics, SRM Institute of Science and Technology, Kattankulathur, Chengalpattu, Tamil Nadu 603203 India
- Ajit Kumar Pasayat
  - Indian Institute of Technology, Kharagpur, Kharagpur, West Bengal 721302 India
- Harekrushna Behera
  - Department of Mathematics, SRM Institute of Science and Technology, Kattankulathur, Chengalpattu, Tamil Nadu 603203 India
|
14
|
Brown BC, Knowles DA. Welch-weighted Egger regression reduces false positives due to correlated pleiotropy in Mendelian randomization. Am J Hum Genet 2021; 108:2319-2335. [PMID: 34861175 DOI: 10.1016/j.ajhg.2021.10.006] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 04/20/2021] [Accepted: 10/19/2021] [Indexed: 02/01/2023] Open
Abstract
Modern population-scale biobanks contain simultaneous measurements of many phenotypes, providing unprecedented opportunity to study the relationship between biomarkers and disease. However, inferring causal effects from observational data is notoriously challenging. Mendelian randomization (MR) has recently received increased attention as a class of methods for estimating causal effects using genetic associations. However, standard methods result in pervasive false positives when two traits share a heritable, unobserved common cause. This is the problem of correlated pleiotropy. Here, we introduce a flexible framework for simulating traits with a common genetic confounder that generalizes recently proposed models, as well as a simple approach we call Welch-weighted Egger regression (WWER) for estimating causal effects. We show in comprehensive simulations that our method substantially reduces false positives due to correlated pleiotropy while being fast enough to apply to hundreds of phenotypes. We apply our method first to a subset of the UK Biobank consisting of blood traits and inflammatory disease, and then to a broader set of 411 heritable phenotypes. We detect many effects with strong literature support, as well as numerous behavioral effects that appear to stem from physician advice given to people at high risk for disease. We conclude that WWER is a powerful tool for exploratory data analysis in ever-growing databases of genotypes and phenotypes.
Affiliation(s)
- Brielin C Brown
  - Data Science Institute, Columbia University, New York, NY 10027, USA; New York Genome Center, New York, NY 10013, USA.
- David A Knowles
  - Data Science Institute, Columbia University, New York, NY 10027, USA; New York Genome Center, New York, NY 10013, USA; Department of Computer Science, Columbia University, New York, NY 10027, USA; Department of Systems Biology, Columbia University, New York, NY 10027, USA.
|
15
|
Guerra-Urzola R, Van Deun K, Vera JC, Sijtsma K. A Guide for Sparse PCA: Model Comparison and Applications. Psychometrika 2021; 86:893-919. [PMID: 34185214 PMCID: PMC8636462 DOI: 10.1007/s11336-021-09773-2] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 08/26/2020] [Revised: 05/17/2021] [Indexed: 05/14/2023]
Abstract
PCA is a popular tool for exploring and summarizing multivariate data, especially those consisting of many variables. PCA, however, is often not simple to interpret, as the components are a linear combination of the variables. To address this issue, numerous methods have been proposed to sparsify the nonzero coefficients in the components, including rotation-thresholding methods and, more recently, PCA methods subject to sparsity-inducing penalties or constraints. Here, we offer guidelines on how to choose among the different sparse PCA methods. The current literature lacks clear guidance on the properties and performance of the different sparse PCA methods, often relying on the misconception that the equivalence of the formulations for ordinary PCA also holds for sparse PCA. To guide potential users of sparse PCA methods, we first discuss several popular sparse PCA methods in terms of whether the sparseness is imposed on the loadings or on the weights, the assumed model, and the optimization criterion used to impose sparseness. Second, using an extensive simulation study, we assess each of these methods by means of performance measures such as squared relative error, misidentification rate, and percentage of explained variance for several data generating models and conditions for the population model. Finally, two examples using empirical data are considered.
Affiliation(s)
- Rosember Guerra-Urzola
  - Department of Methodology and Statistics, Tilburg University, Prof. Cobbenhagenlaan 225, Simon Building, Room S 820, 5037 DB Tilburg, The Netherlands
- Katrijn Van Deun
  - Department of Methodology and Statistics, Tilburg University, Tilburg, The Netherlands
- Juan C. Vera
  - Department of Econometrics and OR, Tilburg University, Tilburg, Netherlands
|
16
|
Simões RF, Pino R, Moreira-Soares M, Kovarova J, Neuzil J, Travasso R, Oliveira PJ, Cunha-Oliveira T, Pereira FB. Quantitative analysis of neuronal mitochondrial movement reveals patterns resulting from neurotoxicity of rotenone and 6-hydroxydopamine. FASEB J 2021; 35:e22024. [PMID: 34751984 DOI: 10.1096/fj.202100899r] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 05/31/2021] [Revised: 09/24/2021] [Accepted: 10/19/2021] [Indexed: 01/31/2023]
Abstract
Alterations in mitochondrial dynamics, including their intracellular trafficking, are common early manifestations of neuronal degeneration. However, current methodologies used to study mitochondrial trafficking events rely on parameters that are primarily altered in later stages of neurodegeneration. Our objective was to establish a reliable applied statistical analysis to detect early alterations in neuronal mitochondrial trafficking. We propose a novel quantitative analysis of mitochondria trajectories based on innovative movement descriptors, including straightness, efficiency, anisotropy, and kurtosis. We evaluated time- and dose-dependent alterations in trajectory descriptors using biological data from differentiated SH-SY5Y cells treated with the mitochondrial toxicants 6-hydroxydopamine and rotenone. MitoTracker Red CMXRos-labelled mitochondria movement was analyzed by total internal reflection fluorescence microscopy followed by computational modelling to describe the process. Based on the aforementioned trajectory descriptors, this innovative analysis of mitochondria trajectories provides insights into mitochondrial movement characteristics and can be a consistent and sensitive method to detect alterations in mitochondrial trafficking occurring in the earliest time points of neurodegeneration.
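The abstract names straightness as one of the movement descriptors but does not define it. As a rough illustration only, the sketch below assumes a common definition (net displacement divided by total path length); the function name and the toy trajectories are hypothetical and are not taken from the paper.

```python
import math

def straightness(points):
    """Net displacement divided by total path length (assumed definition).

    `points` is a sequence of (x, y) positions of one tracked object.
    Returns a value in [0, 1]; 1 means a perfectly straight trajectory.
    """
    # Sum of the lengths of consecutive steps along the trajectory.
    path_length = sum(math.dist(a, b) for a, b in zip(points, points[1:]))
    if path_length == 0:
        return 0.0  # stationary object: define straightness as 0
    # Straight-line distance between the first and last position.
    net_displacement = math.dist(points[0], points[-1])
    return net_displacement / path_length

# Hypothetical trajectories: a straight run and a closed loop.
straight_run = [(0, 0), (1, 0), (2, 0)]
closed_loop = [(0, 0), (1, 0), (1, 1), (0, 1), (0, 0)]
print(straightness(straight_run), straightness(closed_loop))
```

Under this assumed definition, directed transport scores close to 1 and confined or oscillating movement scores close to 0.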
Affiliation(s)
- Rui F Simões
  - CNC, Center for Neuroscience and Cell Biology, UC Biotech, Cantanhede, Portugal
- Rute Pino
  - CISUC, Department of Informatics Engineering, University of Coimbra, Coimbra, Portugal
- Maurício Moreira-Soares
  - OCBE, Faculty of Medicine, University of Oslo, Oslo, Norway; Centre for Bioinformatics, Faculty of Mathematics and Natural Sciences, University of Oslo, Oslo, Norway
- Jaromira Kovarova
  - Institute of Biotechnology, Czech Academy of Sciences, Prague-West, Czech Republic
- Jiri Neuzil
  - Institute of Biotechnology, Czech Academy of Sciences, Prague-West, Czech Republic; School of Medical Science, Griffith University, Southport, Queensland, Australia
- Rui Travasso
  - CFisUC, Department of Physics, University of Coimbra, Coimbra, Portugal
- Paulo J Oliveira
  - CNC, Center for Neuroscience and Cell Biology, UC Biotech, Cantanhede, Portugal
- Francisco B Pereira
  - CISUC, Department of Informatics Engineering, University of Coimbra, Coimbra, Portugal; Coimbra Polytechnic - ISEC, Coimbra, Portugal
|
17
|
Yue A, Chauve C, Libbrecht MW, Brinkman RR. Automated identification of maximal differential cell populations in flow cytometry data. Cytometry A 2021; 101:177-184. [PMID: 34559446 PMCID: PMC8810629 DOI: 10.1002/cyto.a.24503] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/12/2020] [Revised: 08/14/2021] [Accepted: 09/14/2021] [Indexed: 11/21/2022]
Abstract
We introduce a new cell population score called SpecEnr (specific enrichment) and describe a method that discovers robust and accurate candidate biomarkers from flow cytometry data. Our approach identifies a new class of candidate biomarkers we define as driver cell populations, whose abundance is associated with a sample class (e.g., disease), but not as a result of a change in a related population. We show that the driver cell populations we find are also easily interpretable using a lattice-based visualization tool. Our method is implemented in the R package flowGraph, freely available on GitHub (github.com/aya49/flowGraph) and on Bioconductor.
Affiliation(s)
- Alice Yue
  - Department of Computing Science, Simon Fraser University, Burnaby, British Columbia, Canada
- Cedric Chauve
  - Department of Mathematics, Simon Fraser University, Burnaby, British Columbia, Canada; LaBRI, University of Bordeaux, Bordeaux, France
- Maxwell W Libbrecht
  - Department of Computing Science, Simon Fraser University, Burnaby, British Columbia, Canada
- Ryan R Brinkman
  - Terry Fox Laboratory, BC Cancer Research Centre, BC Cancer Agency, Vancouver, British Columbia, Canada; Department of Medical Genetics, University of British Columbia, Vancouver, British Columbia, Canada
|
18
|
Sushkova OS, Morozov AA, Gabova AV, Karabanov AV, Illarioshkin SN. A Statistical Method for Exploratory Data Analysis Based on 2D and 3D Area under Curve Diagrams: Parkinson's Disease Investigation. Sensors (Basel) 2021; 21:s21144700. [PMID: 34300440 PMCID: PMC8309570 DOI: 10.3390/s21144700] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 05/07/2021] [Revised: 07/01/2021] [Accepted: 07/06/2021] [Indexed: 12/31/2022]
Abstract
A statistical method for exploratory data analysis based on 2D and 3D area under curve (AUC) diagrams was developed. The method was designed to analyze electroencephalogram (EEG), electromyogram (EMG), and tremorogram data collected from patients with Parkinson's disease. The idea of the method of wave train electrical activity analysis is that we consider the biomedical signal as a combination of the wave trains. The wave train is the increase in the power spectral density of the signal localized in time, frequency, and space. We detect the wave trains as the local maxima in the wavelet spectrograms. We do not consider wave trains as a special kind of signal. The wave train analysis method is different from standard signal analysis methods such as Fourier analysis and wavelet analysis in the following way. Existing methods for analyzing EEG, EMG, and tremor signals, such as wavelet analysis, focus on local time-frequency changes in the signal and therefore do not reveal the generalized properties of the signal. Other methods such as standard Fourier analysis ignore the local time-frequency changes in the characteristics of the signal and, consequently, lose a large amount of information that existed in the signal. The method of wave train electrical activity analysis resolves the contradiction between these two approaches because it addresses the generalized characteristics of the biomedical signal based on local time-frequency changes in the signal. We investigate the following wave train parameters: wave train central frequency, wave train maximal power spectral density, wave train duration in periods, and wave train bandwidth. We have developed special graphical diagrams, named AUC diagrams, to determine what wave trains are characteristic of neurodegenerative diseases. In this paper, we consider the following types of AUC diagrams: 2D and 3D diagrams. 
The technique of working with AUC diagrams is illustrated by examples of analysis of EMG in patients with Parkinson's disease and healthy volunteers. It is demonstrated that new regularities useful for the high-accuracy diagnosis of Parkinson's disease can be revealed using the method of analyzing the wave train electrical activity and AUC diagrams.
Affiliation(s)
- Olga Sergeevna Sushkova
  - Kotel’nikov Institute of Radio Engineering and Electronics of RAS, Mokhovaya 11-7, 125009 Moscow, Russia
  - Correspondence:
|
19
|
Amorim R, Simões ICM, Veloso C, Carvalho A, Simões RF, Pereira FB, Thiel T, Normann A, Morais C, Jurado AS, Wieckowski MR, Teixeira J, Oliveira PJ. Exploratory Data Analysis of Cell and Mitochondrial High-Fat, High-Sugar Toxicity on Human HepG2 Cells. Nutrients 2021; 13:nu13051723. [PMID: 34069635 PMCID: PMC8161147 DOI: 10.3390/nu13051723] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 04/12/2021] [Revised: 05/06/2021] [Accepted: 05/17/2021] [Indexed: 12/13/2022] Open
Abstract
Non-alcoholic steatohepatitis (NASH), one of the deleterious stages of non-alcoholic fatty liver disease, remains a significant cause of liver-related morbidity and mortality worldwide. In the current work, we used an exploratory data analysis to investigate time-dependent cellular and mitochondrial effects of different supra-physiological fatty acid (FA) overload strategies, in the presence or absence of fructose (F), on human hepatoma-derived HepG2 cells. We measured intracellular neutral lipid content and reactive oxygen species (ROS) levels, mitochondrial respiration and morphology, and caspase activity and cell death. FA treatments induced a time-dependent increase in neutral lipid content, which was paralleled by an increase in ROS. Fructose, by itself, neither increased intracellular lipid content nor aggravated the effects of palmitic acid (PA) or a free fatty acid mixture (FFA), although it led to increased expression of hepatic fructokinase. Instead, F decreased mitochondrial phospholipid content, as well as the levels of OXPHOS subunits. Increased lipid accumulation and ROS in FA treatments preceded mitochondrial dysfunction, comprising altered mitochondrial membrane potential (ΔΨm) and morphology, and decreased oxygen consumption rates, especially with PA. Consequently, supra-physiological PA alone or combined with F prompted the activation of caspase pathways, leading to a time-dependent decrease in cell viability. Exploratory data analysis methods support this conclusion by clearly identifying the effects of FA treatments. In fact, unsupervised learning algorithms created homogeneous and cohesive clusters, with a clear separation between PA- and FFA-treated samples, allowing the identification of a minimal subset of critical mitochondrial markers with which to build a feasible model to predict cell death in NAFLD, or for high-throughput screening of possible therapeutic agents, with a particular focus on measuring mitochondrial function.
Affiliation(s)
- Ricardo Amorim
  - CNC-Center for Neuroscience and Cell Biology, CIBB-Centre for Innovative Biomedicine and Biotechnology, University of Coimbra, UC-Biotech, Biocant Park, 3060-197 Cantanhede, Portugal; (R.A.); (C.V.); (A.C.); (R.F.S.); (J.T.)
  - CIQUP/Department of Chemistry and Biochemistry, Faculty of Sciences, University of Porto, 4169-007 Porto, Portugal
  - PhD Programme in Experimental Biology and Biomedicine (PDBEB), Institute for Interdisciplinary Research (IIIUC), University of Coimbra, 3004-531 Coimbra, Portugal
- Inês C. M. Simões
  - Laboratory of Mitochondrial Biology and Metabolism, Nencki Institute of Experimental Biology of Polish Academy of Sciences, 02-093 Warsaw, Poland; (I.C.M.S.); (M.R.W.)
- Caroline Veloso
  - CNC-Center for Neuroscience and Cell Biology, CIBB-Centre for Innovative Biomedicine and Biotechnology, University of Coimbra, UC-Biotech, Biocant Park, 3060-197 Cantanhede, Portugal; (R.A.); (C.V.); (A.C.); (R.F.S.); (J.T.)
- Adriana Carvalho
  - CNC-Center for Neuroscience and Cell Biology, CIBB-Centre for Innovative Biomedicine and Biotechnology, University of Coimbra, UC-Biotech, Biocant Park, 3060-197 Cantanhede, Portugal; (R.A.); (C.V.); (A.C.); (R.F.S.); (J.T.)
  - PhD Programme in Experimental Biology and Biomedicine (PDBEB), Institute for Interdisciplinary Research (IIIUC), University of Coimbra, 3004-531 Coimbra, Portugal
- Rui F. Simões
  - CNC-Center for Neuroscience and Cell Biology, CIBB-Centre for Innovative Biomedicine and Biotechnology, University of Coimbra, UC-Biotech, Biocant Park, 3060-197 Cantanhede, Portugal; (R.A.); (C.V.); (A.C.); (R.F.S.); (J.T.)
  - PhD Programme in Experimental Biology and Biomedicine (PDBEB), Institute for Interdisciplinary Research (IIIUC), University of Coimbra, 3004-531 Coimbra, Portugal
- Francisco B. Pereira
  - Center for Informatics and Systems, University of Coimbra, Polo II, Pinhal de Marrocos, 3030-290 Coimbra, Portugal
  - Coimbra Polytechnic-ISEC, 3030-190 Coimbra, Portugal
- Theresa Thiel
  - Mediagnostic, D-72770 Reutlingen, Germany; (T.T.); (A.N.)
- Andrea Normann
  - Mediagnostic, D-72770 Reutlingen, Germany; (T.T.); (A.N.)
- Catarina Morais
  - Center for Neuroscience and Cell Biology, Department of Life Sciences, University of Coimbra, Calçada Martim de Freitas, 3000-456 Coimbra, Portugal; (C.M.); (A.S.J.)
- Amália S. Jurado
  - Center for Neuroscience and Cell Biology, Department of Life Sciences, University of Coimbra, Calçada Martim de Freitas, 3000-456 Coimbra, Portugal; (C.M.); (A.S.J.)
- Mariusz R. Wieckowski
  - Laboratory of Mitochondrial Biology and Metabolism, Nencki Institute of Experimental Biology of Polish Academy of Sciences, 02-093 Warsaw, Poland; (I.C.M.S.); (M.R.W.)
- José Teixeira
  - CNC-Center for Neuroscience and Cell Biology, CIBB-Centre for Innovative Biomedicine and Biotechnology, University of Coimbra, UC-Biotech, Biocant Park, 3060-197 Cantanhede, Portugal; (R.A.); (C.V.); (A.C.); (R.F.S.); (J.T.)
  - CIQUP/Department of Chemistry and Biochemistry, Faculty of Sciences, University of Porto, 4169-007 Porto, Portugal
- Paulo J. Oliveira
  - CNC-Center for Neuroscience and Cell Biology, CIBB-Centre for Innovative Biomedicine and Biotechnology, University of Coimbra, UC-Biotech, Biocant Park, 3060-197 Cantanhede, Portugal; (R.A.); (C.V.); (A.C.); (R.F.S.); (J.T.)
  - Correspondence:
|
20
|
Robeva R, Nedyalkova M, Kirilov G, Elenkova A, Zacharieva S, Kudłak B, Jatkowska N, Simeonov V. Multivariate Statistical Approach for Nephrines in Women with Obesity. Molecules 2021; 26:molecules26051393. [PMID: 33807567 PMCID: PMC7961883 DOI: 10.3390/molecules26051393] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 01/19/2021] [Revised: 02/19/2021] [Accepted: 03/02/2021] [Indexed: 12/22/2022] Open
Abstract
Catecholamines are physiological regulators of carbohydrate and lipid metabolism during stress, but their chronic influence on metabolic changes in obese patients is still not clarified. The present study aimed to establish the associations between the catecholamine metabolites and metabolic syndrome (MS) components in obese women, as well as to reveal possible hidden subgroups of patients through hierarchical cluster analysis and principal component analysis. The 24-h urine excretion of metanephrine and normetanephrine was investigated in 150 obese women (54 non-diabetic without MS, 70 non-diabetic with MS, and 26 with type 2 diabetes). The interrelations between carbohydrate disturbances, metabolic syndrome components, and stress response hormones were studied. Exploratory data analysis was used to determine different patterns of similarities among the patients. Normetanephrine concentrations were significantly increased in postmenopausal patients and in women with morbid obesity, type 2 diabetes, and hypertension, but not in those with prediabetes. Both metanephrine and normetanephrine levels were positively associated with glucose concentrations one hour after glucose load, irrespective of the insulin levels. The exploratory data analysis showed different risk subgroups among the investigated obese women. The development of predictive tools that include not only traditional metabolic risk factors but also markers of stress response systems might help with specific risk estimation in patients with obesity.
Affiliation(s)
- Ralitsa Robeva
  - Department of Endocrinology, Faculty of Medicine, Medical University—Sofia, USHATE “Acad. Iv. Penchev”, 2, Zdrave Str., 1431 Sofia, Bulgaria; (R.R.); (G.K.); (A.E.); (S.Z.)
- Miroslava Nedyalkova
  - Department of Inorganic Chemistry, Faculty of Chemistry and Pharmacy, University of Sofia “St. Kl. Ohridski”, 1164 Sofia, Bulgaria
  - Correspondence:
- Georgi Kirilov
  - Department of Endocrinology, Faculty of Medicine, Medical University—Sofia, USHATE “Acad. Iv. Penchev”, 2, Zdrave Str., 1431 Sofia, Bulgaria; (R.R.); (G.K.); (A.E.); (S.Z.)
- Atanaska Elenkova
  - Department of Endocrinology, Faculty of Medicine, Medical University—Sofia, USHATE “Acad. Iv. Penchev”, 2, Zdrave Str., 1431 Sofia, Bulgaria; (R.R.); (G.K.); (A.E.); (S.Z.)
- Sabina Zacharieva
  - Department of Endocrinology, Faculty of Medicine, Medical University—Sofia, USHATE “Acad. Iv. Penchev”, 2, Zdrave Str., 1431 Sofia, Bulgaria; (R.R.); (G.K.); (A.E.); (S.Z.)
- Błażej Kudłak
  - Department of Analytical Chemistry, Faculty of Chemistry, Gdańsk University of Technology, 11/12 Narutowicza, 80-233 Gdańsk, Poland; (B.K.); (N.J.)
- Natalia Jatkowska
  - Department of Analytical Chemistry, Faculty of Chemistry, Gdańsk University of Technology, 11/12 Narutowicza, 80-233 Gdańsk, Poland; (B.K.); (N.J.)
- Vasil Simeonov
  - Department of Analytical Chemistry, Faculty of Chemistry and Pharmacy, University of Sofia “St. Kl. Ohridski”, 1164 Sofia, Bulgaria
|
21
|
Epskamp S, Fried EI, van Borkulo CD, Robinaugh DJ, Marsman M, Dalege J, Rhemtulla M, Cramer AOJ. Investigating the Utility of Fixed-margin Sampling in Network Psychometrics. Multivariate Behav Res 2021; 56:314-328. [PMID: 30463456 DOI: 10.1080/00273171.2018.1489771] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Received: 12/09/2017] [Revised: 06/04/2018] [Accepted: 06/05/2018] [Indexed: 06/09/2023]
Abstract
Steinley, Hoffman, Brusco, and Sher (2017) proposed a new method for evaluating the performance of psychological network models: fixed-margin sampling. The authors investigated LASSO regularized Ising models (eLasso) by generating random datasets with the same margins as the original binary dataset, and concluded that many estimated eLasso parameters are not distinguishable from those that would be expected if the data were generated by chance. We argue that fixed-margin sampling cannot be used for this purpose, as it generates data under a particular null-hypothesis: a unidimensional factor model with interchangeable indicators (i.e., the Rasch model). We show this by discussing relevant psychometric literature and by performing simulation studies. Results indicate that while eLasso correctly estimated network models and estimated almost no edges due to chance, fixed-margin sampling performed poorly in classifying true effects as "interesting" (Steinley et al. 2017, p. 1004). Further simulation studies indicate that fixed-margin sampling offers a powerful method for highlighting local misfit from the Rasch model, but performs only moderately in identifying global departures from the Rasch model. We conclude that fixed-margin sampling is not up to the task of assessing if results from estimated Ising models or other multivariate psychometric models are due to chance.
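The idea of generating random binary datasets with the same margins as the observed one can be illustrated with a checkerboard swap, which flips 2x2 submatrices in a way that leaves every row and column total unchanged. This is a generic sketch under that assumption, not the procedure used by Steinley et al. or by the authors; the function names, the number of swap attempts, and the toy matrix are hypothetical.

```python
import random

def margins(matrix):
    """Return (row sums, column sums) of a binary matrix given as a list of lists."""
    rows = [sum(row) for row in matrix]
    cols = [sum(col) for col in zip(*matrix)]
    return rows, cols

def fixed_margin_sample(matrix, n_swaps=1000, seed=0):
    """Randomize a binary matrix while keeping all row and column sums fixed.

    Repeatedly picks two rows and two columns; if the 2x2 submatrix is a
    checkerboard ([[1, 0], [0, 1]] or [[0, 1], [1, 0]]), it is flipped.
    The matrix needs at least two rows and two columns.
    """
    rng = random.Random(seed)
    m = [row[:] for row in matrix]  # work on a copy, leave the input intact
    n_rows, n_cols = len(m), len(m[0])
    for _ in range(n_swaps):
        r1, r2 = rng.sample(range(n_rows), 2)
        c1, c2 = rng.sample(range(n_cols), 2)
        is_checkerboard = (
            m[r1][c1] == m[r2][c2]
            and m[r1][c2] == m[r2][c1]
            and m[r1][c1] != m[r1][c2]
        )
        if is_checkerboard:
            # Flipping the checkerboard keeps both row sums and both column sums.
            m[r1][c1], m[r1][c2] = m[r1][c2], m[r1][c1]
            m[r2][c1], m[r2][c2] = m[r2][c2], m[r2][c1]
    return m

# Hypothetical binary item-response data (rows = persons, columns = items).
observed = [
    [1, 0, 1, 0],
    [0, 1, 1, 0],
    [1, 1, 0, 0],
    [0, 0, 1, 1],
]
shuffled = fixed_margin_sample(observed)
print(margins(observed) == margins(shuffled))
```

Because only the cell pattern changes while the totals stay fixed, any statistic computed on many such samples describes what is expected given the margins alone, which is the null hypothesis the abstract argues corresponds to a Rasch-type model.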
Affiliation(s)
- Sacha Epskamp
  - Department of Psychological Methods, University of Amsterdam, Amsterdam, The Netherlands
- Eiko I Fried
  - Department of Psychological Methods, University of Amsterdam, Amsterdam, The Netherlands
- Claudia D van Borkulo
  - Department of Psychological Methods, University of Amsterdam, Amsterdam, The Netherlands
- Donald J Robinaugh
  - Department of Psychological Methods, University of Amsterdam, Amsterdam, The Netherlands
  - Department of Psychiatry, Massachusetts General Hospital, Cambridge, MA, USA
- Maarten Marsman
  - Department of Psychological Methods, University of Amsterdam, Amsterdam, The Netherlands
- Jonas Dalege
  - Department of Social Psychology, University of Amsterdam, Amsterdam, The Netherlands
- Mijke Rhemtulla
  - Department of Psychology, University of California, Davis, CA, USA
- Angélique O J Cramer
  - Social and Behavioral Sciences, Department of Methodology and Statistics, Tilburg University, Tilburg, The Netherlands
|
22
|
Abstract
Structural equation model (SEM) trees are data-driven tools for finding variables that predict group differences in SEM parameters. SEM trees build upon the decision tree paradigm by growing tree structures that divide a data set recursively into homogeneous subsets. In past research, SEM trees have been estimated predominantly with the R package semtree. The original algorithm in the semtree package selects split variables among covariates by calculating a likelihood ratio for each possible split of each covariate. Obtaining these likelihood ratios is computationally demanding. As a remedy, we propose to guide the construction of SEM trees by a family of score-based tests that have recently been popularized in psychometrics (Merkle and Zeileis, 2013; Merkle et al., 2014). These score-based tests monitor fluctuations in case-wise derivatives of the likelihood function to detect parameter differences between groups. Compared to the likelihood-ratio approach, score-based tests are computationally efficient because they do not require refitting the model for every possible split. In this paper, we introduce score-guided SEM trees, implement them in semtree, and evaluate their performance by means of a Monte Carlo simulation.
Affiliation(s)
- Manuel Arnold
  - Psychological Research Methods, Department of Psychology, Humboldt-Universität zu Berlin, Berlin, Germany; Max Planck UCL Centre for Computational Psychiatry and Ageing Research, Berlin, Germany
- Manuel C Voelkle
  - Psychological Research Methods, Department of Psychology, Humboldt-Universität zu Berlin, Berlin, Germany
- Andreas M Brandmaier
  - Max Planck UCL Centre for Computational Psychiatry and Ageing Research, Berlin, Germany; Center for Lifespan Psychology, Max Planck Institute for Human Development, Berlin, Germany
|
23
|
Meier R, Pahud de Mortanges A, Wiest R, Knecht U. Exploratory Analysis of Qualitative MR Imaging Features for the Differentiation of Glioblastoma and Brain Metastases. Front Oncol 2020; 10:581037. [PMID: 33425734 PMCID: PMC7793795 DOI: 10.3389/fonc.2020.581037] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 07/07/2020] [Accepted: 11/09/2020] [Indexed: 12/12/2022] Open
Abstract
OBJECTIVES To identify qualitative VASARI (Visually AcceSIble Rembrandt Images) Magnetic Resonance (MR) Imaging features for differentiation of glioblastoma (GBM) and brain metastasis (BM) of different primary tumors. MATERIALS AND METHODS T1-weighted pre- and post-contrast, T2-weighted, and T2-weighted, fluid attenuated inversion recovery (FLAIR) MR images of a total of 239 lesions from 109 patients with either GBM or BM (breast cancer, non-small cell (NSCLC) adenocarcinoma, NSCLC squamous cell carcinoma, small-cell lung cancer (SCLC)) were included. A set of adapted, qualitative VASARI MR features describing tumor appearance and location was scored (binary; 1 = presence of feature, 0 = absence of feature). Exploratory data analysis was performed on binary scores using a combination of descriptive statistics (proportions with 95% binomial confidence intervals), unsupervised methods and supervised methods including multivariate feature ranking using either repeated fitting or recursive feature elimination with Support Vector Machines (SVMs). RESULTS GBMs were found to involve all lobes of the cerebrum with a fronto-occipital gradient, often affected the corpus callosum (32.4%, 95% CI 19.1-49.2), and showed a strong preference for the right hemisphere (79.4%, 95% CI 63.2-89.7). BMs occurred most frequently in the frontal lobe (35.1%, 95% CI 28.9-41.9) and cerebellum (28.3%, 95% CI 22.6-34.8). The appearance of GBMs was characterized by preference for well-defined non-enhancing tumor margin (100%, 89.8-100), ependymal extension (52.9%, 36.7-68.5) and substantially less enhancing foci than BMs (44.1%, 28.9-60.6 vs. 75.1%, 68.8-80.5). Unsupervised and supervised analyses showed that GBMs are distinctively different from BMs and that this difference is driven by definition of non-enhancing tumor margin, ependymal extension and features describing laterality. 
Differentiation of histological subtypes of BMs was driven by the presence of well-defined enhancing and non-enhancing tumor margins and localization in the vision center. SVM models with optimal hyperparameters led to weighted F1-score of 0.865 for differentiation of GBMs from BMs and weighted F1-score of 0.326 for differentiation of BM subtypes. CONCLUSION VASARI MR imaging features related to definition of non-enhancing margin, ependymal extension, and tumor localization may serve as potential imaging biomarkers to differentiate GBMs from BMs.
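The proportions above are reported with 95% binomial confidence intervals, but the abstract does not say which interval was used. As a hedged illustration, the sketch below computes a Wilson score interval; the choice of Wilson, the fixed z = 1.96, and the counts (11 of 34 lesions, picked only because 11/34 is about 32.4%) are assumptions, not figures from the paper.

```python
import math

def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (z = 1.96 for about 95%)."""
    p = successes / n
    denom = 1 + z**2 / n
    # Centre of the interval is pulled from p toward 0.5 for small n.
    center = (p + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half_width, center + half_width

# Hypothetical counts: a feature present in 11 of 34 lesions.
low, high = wilson_interval(11, 34)
print(f"{11 / 34:.1%} (95% CI {low:.1%} to {high:.1%})")
# roughly 32.4% (95% CI 19.1% to 49.2%) under these assumed counts
```

Unlike the simple normal-approximation interval, the Wilson interval stays inside 0 to 100% even when the observed proportion is exactly 0% or 100%, which matters for small groups.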
Collapse
Affiliation(s)
- Raphael Meier
- University Institute of Diagnostic and Interventional Neuroradiology, University Hospital Bern, Inselspital, University of Bern, Bern, Switzerland
- Support Center for Advanced Neuroimaging, University Hospital Bern, Inselspital, University of Bern, Bern, Switzerland
| | - Aurélie Pahud de Mortanges
- University Institute of Diagnostic and Interventional Neuroradiology, University Hospital Bern, Inselspital, University of Bern, Bern, Switzerland
| | - Roland Wiest
- University Institute of Diagnostic and Interventional Neuroradiology, University Hospital Bern, Inselspital, University of Bern, Bern, Switzerland
- Support Center for Advanced Neuroimaging, University Hospital Bern, Inselspital, University of Bern, Bern, Switzerland
| | - Urspeter Knecht
- ARTORG Center for Biomedical Research, University of Bern, Bern, Switzerland
- Department of Diagnostic Radiology and Neuroradiology, Regional Hospital Emmental, Burgdorf, Switzerland
| |
|
24
|
Abstract
We discuss the validation of machine learning models, which is standard practice in determining model efficacy and generalizability. We argue that internal validation approaches, such as cross-validation and bootstrap, cannot guarantee the quality of a machine learning model because of potentially biased training data and the complexity of the validation procedure itself. To better evaluate the generalization ability of a learned model, we suggest using data from external sources as validation datasets, an approach known as external validation. Because external validation has attracted little research attention, and a well-structured, comprehensive study is lacking, we discuss the necessity for external validation and propose two extensions of the external validation approach that may help reveal the true domain-relevant model from a candidate set. We also suggest a procedure to check whether a set of validation datasets is valid and introduce statistical reference points for detecting external data problems.
Affiliation(s)
- Sung Yang Ho
- School of Biological Sciences, Nanyang Technological University, Singapore 637551, Singapore
| | - Kimberly Phua
- School of Biological Sciences, Nanyang Technological University, Singapore 637551, Singapore
| | - Limsoon Wong
- Department of Computer Science, National University of Singapore, Singapore 117417, Singapore
| | - Wilson Wen Bin Goh
- School of Biological Sciences, Nanyang Technological University, Singapore 637551, Singapore
| |
|
25
|
Bjorgan A, Pukstad BS, Randeberg LL. Hyperspectral characterization of re-epithelialization in an in vitro wound model. J Biophotonics 2020; 13:e202000108. [PMID: 32558341 DOI: 10.1002/jbio.202000108] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 03/24/2020] [Revised: 05/27/2020] [Accepted: 06/11/2020] [Indexed: 06/11/2023]
Abstract
In vitro wound models are useful for research on wound re-epithelialization. Hyperspectral imaging represents a non-destructive alternative to histology analysis for detection of re-epithelialization. This study aims to characterize the main optical behavior of a wound model in order to enable development of detection algorithms. K-Means clustering and agglomerative analysis were used to group spatial regions based on the spectral behavior, and an inverse photon transport model was used to explain differences in optical properties. Six samples of the wound model were prepared from human tissue and followed over 22 days. Re-epithelialization occurred at a mean rate of 0.24 mm²/day after day 8 to 10. Suppression of wound spectral features was the main feature characterizing re-epithelialized and intact tissue. Modeling the photon transport through a diffuse layer placed on top of wound tissue properties reproduced the spectral behavior. The missing top layer represented by wounds is thus optically detectable using hyperspectral imaging.
Affiliation(s)
- Asgeir Bjorgan
- Department of Electronic Systems, NTNU Norwegian University of Science and Technology, Trondheim, Norway
| | - Brita S Pukstad
- Department of Clinical and Molecular Medicine, NTNU Norwegian University of Science and Technology, Trondheim, Norway
- Department of Dermatology, St. Olavs Hospital, Trondheim University Hospital, Trondheim, Norway
| | - Lise L Randeberg
- Department of Electronic Systems, NTNU Norwegian University of Science and Technology, Trondheim, Norway
| |
|
26
|
McVey C, Hsieh F, Manriquez D, Pinedo P, Horback K. Mind the Queue: A Case Study in Visualizing Heterogeneous Behavioral Patterns in Livestock Sensor Data Using Unsupervised Machine Learning Techniques. Front Vet Sci 2020; 7:523. [PMID: 33134329 PMCID: PMC7518149 DOI: 10.3389/fvets.2020.00523] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 03/03/2020] [Accepted: 07/07/2020] [Indexed: 12/28/2022] Open
Abstract
Sensor technologies allow ethologists to continuously monitor the behaviors of large numbers of animals over extended periods of time. This creates new opportunities to study livestock behavior in commercial settings, but also new methodological challenges. Densely sampled behavioral data from large heterogeneous groups can contain a range of complex patterns and stochastic structures that may be difficult to visualize using conventional exploratory data analysis techniques. The goal of this research was to assess the efficacy of unsupervised machine learning tools in recovering complex behavioral patterns from such datasets to better inform subsequent statistical modeling. This methodological case study was carried out using records on milking order, or the sequence in which cows arrange themselves as they enter the milking parlor. Data was collected over a 6-month period from a closed group of 200 mixed-parity Holstein cattle on an organic dairy. Cows at the front and rear of the queue proved more consistent in their entry position than animals at the center of the queue, a systematic pattern of heterogeneity more clearly visualized using entropy estimates, a scale and distribution-free alternative to variance robust to outliers. Dimension reduction techniques were then used to visualize relationships between cows. No evidence of social cohesion was recovered, but Diffusion Map embeddings proved more adept than PCA at revealing the underlying linear geometry of this data. Median parlor entry positions from the pre- and post-pasture subperiods were highly correlated (R = 0.91), suggesting a surprising degree of temporal stationarity. Data Mechanics visualizations, however, revealed heterogeneous non-stationarity among subgroups of animals in the center of the group and herd-level temporal outliers. A repeated measures model recovered inconsistent evidence of a relationship between entry position and cow attributes. 
Mutual conditional entropy tests, a permutation-based approach to assessing bivariate correlations robust to non-independence, confirmed a significant but non-linear association with peak milk yield, but revealed the age effect to be potentially confounded by health status. Finally, queueing records were related back to behaviors recorded via ear tag accelerometers using linear models and mutual conditional entropy tests. Both approaches recovered consistent evidence of differences in home pen behaviors across subsections of the queue.
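The idea of using entropy as a scale-free measure of how consistently a cow enters the parlor can be sketched as follows. This is a minimal illustration only: it assumes plain Shannon entropy over equal-width queue bins, which may differ from the estimator used in the study, and the function names and example positions are hypothetical.

```python
import math
from collections import Counter


def position_bin(position, herd_size, n_bins=10):
    """Map a 1-based queue position to a bin index in [0, n_bins - 1]."""
    return min((position - 1) * n_bins // herd_size, n_bins - 1)


def shannon_entropy(labels):
    """Shannon entropy in bits of the empirical distribution of labels."""
    n = len(labels)
    counts = Counter(labels)
    return sum(-(c / n) * math.log2(c / n) for c in counts.values())


def entry_entropy(positions, herd_size, n_bins=10):
    """Entropy of one cow's binned entry positions across milkings."""
    return shannon_entropy([position_bin(p, herd_size, n_bins) for p in positions])


# Hypothetical records for a herd of 200 cows.
front_cow = [1, 3, 2, 5, 4]            # always near the front of the queue
middle_cow = [40, 95, 120, 150, 70]    # wanders through the middle
print(entry_entropy(front_cow, 200), entry_entropy(middle_cow, 200))
```

A cow that always lands in the same bin scores zero, while a cow spread evenly over many bins scores high, regardless of how far apart those bins are; this is what makes the measure insensitive to outliers in a way that variance is not.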
Affiliation(s)
- Catherine McVey
- Department of Animal Science, University of California, Davis, Davis, CA, United States
| | - Fushing Hsieh
- Department of Statistics, University of California, Davis, Davis, CA, United States
| | - Diego Manriquez
- Department of Animal Science, Colorado State University, Fort Collins, CO, United States
| | - Pablo Pinedo
- Department of Animal Science, Colorado State University, Fort Collins, CO, United States
| | - Kristina Horback
- Department of Animal Science, University of California, Davis, Davis, CA, United States
| |
|
27
|
Shimose S, Kawaguchi T, Iwamoto H, Niizeki T, Shirono T, Tanaka M, Koga H, Torimura T. Indication of suitable transarterial chemoembolization and multikinase inhibitors for intermediate stage hepatocellular carcinoma. Oncol Lett 2020; 19:2667-2676. [PMID: 32218817 PMCID: PMC7068224 DOI: 10.3892/ol.2020.11399] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 06/05/2019] [Accepted: 11/06/2019] [Indexed: 02/06/2023] Open
Abstract
Prognosis of patients with intermediate stage hepatocellular carcinoma (HCC) treated with transcatheter arterial chemoembolization (TACE) is unsatisfactory. The present study analyzed the indications for suitable TACE in patients with intermediate stage HCC. Additionally, it was investigated whether further TACE or switching to multi-kinase inhibitors (MKIs) was more beneficial for patients with HCC recurrence following initial TACE. The present retrospective study included 238 patients with intermediate stage HCC who were initially treated with TACE (median age, 74 years). A decision-tree analysis was employed to investigate the therapeutic effect profiles and overall survival (OS) rates. In the decision-tree analysis for OS, complete response (CR) by initial TACE was selected as the most important variable. In the decision-tree analysis for CR, <3 liver segments with nodule, simple nodular type and within the up-to-seven criteria were selected as the first, second and third variables associated with a high CR rate (35–64%), respectively. In patients with HCC recurrence having ≥3 liver segments with nodule, out of the up-to-seven criteria, and Child-Pugh class A, the median survival time was significantly longer in those who were treated by switching to MKIs compared with further TACE (44.9 vs. 21.9 months; P=0.003). In intermediate stage HCC, the indications for suitable TACE criteria may be ‘<3 liver segments with nodule’, ‘simple nodular type’, and ‘within the up-to-seven criteria’. Additionally, in patients who were ineligible for TACE criteria, the switch to MKIs may improve the prognosis compared with further TACE in cases of HCC recurrence following first TACE.
Affiliation(s)
- Shigeo Shimose
- Division of Gastroenterology, Department of Medicine, Kurume University School of Medicine, Kurume 830-0011, Japan
| | - Takumi Kawaguchi
- Division of Gastroenterology, Department of Medicine, Kurume University School of Medicine, Kurume 830-0011, Japan
| | - Hideki Iwamoto
- Division of Gastroenterology, Department of Medicine, Kurume University School of Medicine, Kurume 830-0011, Japan; Division of Liver Cancer Research, Research Center for Innovative Cancer Therapy, Kurume University School of Medicine, Kurume 830-0011, Japan
| | - Takashi Niizeki
- Division of Gastroenterology, Department of Medicine, Kurume University School of Medicine, Kurume 830-0011, Japan
| | - Tomotake Shirono
- Division of Gastroenterology, Department of Medicine, Kurume University School of Medicine, Kurume 830-0011, Japan
| | - Masatoshi Tanaka
- Department of Gastroenterology and Hepatology, Yokokura Hospital, Miyama 839-0295, Japan
| | - Hironori Koga
- Division of Gastroenterology, Department of Medicine, Kurume University School of Medicine, Kurume 830-0011, Japan; Division of Liver Cancer Research, Research Center for Innovative Cancer Therapy, Kurume University School of Medicine, Kurume 830-0011, Japan
| | - Takuji Torimura
- Division of Gastroenterology, Department of Medicine, Kurume University School of Medicine, Kurume 830-0011, Japan; Division of Liver Cancer Research, Research Center for Innovative Cancer Therapy, Kurume University School of Medicine, Kurume 830-0011, Japan
| |
|
28
|
Noda Y, Kawaguchi T, Korenaga M, Yoshio S, Komukai S, Nakano M, Niizeki T, Koga H, Kawaguchi A, Kanto T, Torimura T. High serum interleukin-34 level is a predictor of poor prognosis in patients with non-viral hepatocellular carcinoma. Hepatol Res 2019; 49:1046-1053. [PMID: 30993774 DOI: 10.1111/hepr.13350] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.8] [Received: 01/20/2019] [Revised: 04/10/2019] [Accepted: 04/11/2019] [Indexed: 12/30/2022]
Abstract
AIMS We aimed to investigate the impact of interleukin (IL)-34 and YKL-40, regulators of hepatic fibrosis and tumor growth, on the prognosis of patients with non-viral hepatocellular carcinoma (HCC). METHODS We enrolled 159 non-viral HCC patients (age, 70.8 ± 8.5 years; female/male, 43/116). Of these, 86 patients were alive and 73 patients had died at the censor time point. Serum IL-34 and YKL-40 levels were quantified by enzyme-linked immunosorbent assay. Patients were stratified by the median level of serum IL-34 to examine its effect on survival. Multivariate analysis and random forest analysis were used to evaluate the impact of IL-34 and YKL-40 on the prognosis of non-viral HCC patients. RESULTS Interleukin-34 (hazard ratio [HR] 1.30; 95% confidence interval [CI], 1.13-1.49; P ≤ 0.01), tumor size (HR 1.63; 95% CI, 1.37-1.94; P ≤ 0.01), and tumor number (HR 1.53; 95% CI, 1.25-1.87; P ≤ 0.01) were independent predictive factors for survival. Furthermore, the survival rates were significantly lower in the high IL-34 group than in the low IL-34 group (5-year survival rates, 34.7% vs. 59.8%, respectively; P < 0.05). In the random forest analysis for survival, IL-34 was the third-highest ranking factor, following tumor size and number. In a stratification analysis, serum α-fetoprotein level and Fibrosis-4 index were independent positive risk factors for high serum IL-34 level. YKL-40 was not associated with prognosis in either the multivariate or random forest analysis. CONCLUSION Interleukin-34 was an independent factor for survival of non-viral HCC patients. Interleukin-34 might be associated with prognosis through tumor and hepatic fibrosis factors.
Affiliation(s)
- Yu Noda
- Division of Gastroenterology, Kurume University School of Medicine, Kurume, Japan
| | - Takumi Kawaguchi
- Division of Gastroenterology, Kurume University School of Medicine, Kurume, Japan
| | - Masaaki Korenaga
- The Research Center for Hepatitis and Immunology, National Center for Global Health and Medicine, Ichikawa, Japan
| | - Sachiyo Yoshio
- The Research Center for Hepatitis and Immunology, National Center for Global Health and Medicine, Ichikawa, Japan
| | - Sho Komukai
- Division of Biomedical Statistics, Department of Integrated Medicine, Graduate School of Medicine, Osaka University, Osaka, Japan
| | - Masahito Nakano
- Division of Gastroenterology, Kurume University School of Medicine, Kurume, Japan
| | - Takashi Niizeki
- Division of Gastroenterology, Kurume University School of Medicine, Kurume, Japan
| | - Hironori Koga
- Division of Gastroenterology, Kurume University School of Medicine, Kurume, Japan; Liver Cancer Research Division, Research Center for Innovative Cancer Therapy, Kurume University, Kurume, Japan
| | - Atsushi Kawaguchi
- Center for Comprehensive Community Medicine, Faculty of Medicine, Saga University, Saga, Japan
| | - Tatsuya Kanto
- The Research Center for Hepatitis and Immunology, National Center for Global Health and Medicine, Ichikawa, Japan
| | - Takuji Torimura
- Division of Gastroenterology, Kurume University School of Medicine, Kurume, Japan; Liver Cancer Research Division, Research Center for Innovative Cancer Therapy, Kurume University, Kurume, Japan
| |
|
29
|
Shimose S, Tanaka M, Iwamoto H, Niizeki T, Shirono T, Aino H, Noda Y, Kamachi N, Okamura S, Nakano M, Kuromatsu R, Kawaguchi T, Kawaguchi A, Koga H, Yokokura Y, Torimura T. Prognostic impact of transcatheter arterial chemoembolization (TACE) combined with radiofrequency ablation in patients with unresectable hepatocellular carcinoma: Comparison with TACE alone using decision-tree analysis after propensity score matching. Hepatol Res 2019; 49:919-928. [PMID: 30969006 DOI: 10.1111/hepr.13348] [Citation(s) in RCA: 39] [Impact Index Per Article: 7.8] [Received: 12/28/2018] [Revised: 03/24/2019] [Accepted: 04/07/2019] [Indexed: 02/06/2023]
Abstract
AIMS The prognosis of hepatocellular carcinoma (HCC) patients treated with transcatheter arterial chemoembolization (TACE) is still poor. We aimed to evaluate the impact of TACE combined with radiofrequency ablation (TACE+RFA) on the prognosis of HCC patients using decision-tree analysis after propensity score matching. METHODS This was a retrospective study. We enrolled 420 patients with HCC treated with TACE alone (n = 311) or TACE+RFA (n = 109) between 1998 and 2016 (median age, 72 years; male / female, 272/148; Barcelona Clinic Liver Cancer (BCLC) stage A / B, 215/205). The prognosis of patients who underwent TACE+RFA was compared to patients who underwent TACE alone after propensity score matching. Decision-tree analysis was used to investigate the profile for prognosis of the patients. RESULTS After propensity score matching, there was no significant difference in age, sex, BCLC stage, or albumin-bilirubin (ALBI) score between both groups. The survival rate of the TACE+RFA group was significantly higher than the TACE alone group (median survival time [MST] 57.9 months vs. 33.1 months, P < 0.001). In a stratification analysis according to BCLC stage, the overall survival rate of the TACE+RFA group was significantly higher than the TACE alone group in BCLC stage A and B (MST 57.9 and 50.7 months vs. 39.8 and 24.5 months [P = 0.007 and 0.001], respectively). Decision-tree analysis showed that TACE+RFA was the third distinguishable factor for survival in patients with α-fetoprotein level >7 ng/mL and ALBI <-2.08. CONCLUSION Decision-tree analysis after propensity score matching showed that TACE+RFA could prolong the survival of HCC patients compared to TACE alone.
Affiliation(s)
- Shigeo Shimose
- Division of Gastroenterology, Department of Medicine, Kurume University School of Medicine, Kurume, Japan
| | | | - Hideki Iwamoto
- Division of Gastroenterology, Department of Medicine, Kurume University School of Medicine, Kurume, Japan
| | - Takashi Niizeki
- Division of Gastroenterology, Department of Medicine, Kurume University School of Medicine, Kurume, Japan
| | - Tomotake Shirono
- Division of Gastroenterology, Department of Medicine, Kurume University School of Medicine, Kurume, Japan
| | - Hajime Aino
- Division of Gastroenterology, Department of Medicine, Kurume University School of Medicine, Kurume, Japan
| | - Yu Noda
- Division of Gastroenterology, Department of Medicine, Kurume University School of Medicine, Kurume, Japan
| | - Naoki Kamachi
- Division of Gastroenterology, Department of Medicine, Kurume University School of Medicine, Kurume, Japan
| | - Shusuke Okamura
- Division of Gastroenterology, Department of Medicine, Kurume University School of Medicine, Kurume, Japan
| | - Masahito Nakano
- Division of Gastroenterology, Department of Medicine, Kurume University School of Medicine, Kurume, Japan
| | - Ryoko Kuromatsu
- Division of Gastroenterology, Department of Medicine, Kurume University School of Medicine, Kurume, Japan
| | - Takumi Kawaguchi
- Division of Gastroenterology, Department of Medicine, Kurume University School of Medicine, Kurume, Japan
| | - Atsushi Kawaguchi
- Center for Comprehensive Community Medicine, Faculty of Medicine, Saga University, Saga, Japan
| | - Hironori Koga
- Division of Gastroenterology, Department of Medicine, Kurume University School of Medicine, Kurume, Japan
| | | | - Takuji Torimura
- Division of Gastroenterology, Department of Medicine, Kurume University School of Medicine, Kurume, Japan
| |
|
30
|
Bezerra A, Silva I, Guedes LA, Silva D, Leitão G, Saito K. Extracting Value from Industrial Alarms and Events: A Data-Driven Approach Based on Exploratory Data Analysis. Sensors (Basel) 2019; 19:s19122772. [PMID: 31226811 PMCID: PMC6631682 DOI: 10.3390/s19122772] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.8] [Received: 03/15/2019] [Revised: 04/25/2019] [Accepted: 05/01/2019] [Indexed: 11/16/2022]
Abstract
Alarm and event logs are an immense but latent source of knowledge commonly undervalued in industry. However, the current landscape of massive data exchange, high efficiency, and strong competitiveness, boosted by the Industry 4.0 and IIoT (Industrial Internet of Things) paradigms, does not accommodate such misuse of data and demands more incisive approaches to analyzing industrial data. Advances in Data Science and Big Data (or more precisely, Industrial Big Data) have been enabling novel approaches in data analysis which can be great allies in extracting hitherto hidden information from plant operation data. In response, this work proposes the use of Exploratory Data Analysis (EDA) as a promising data-driven approach to industrial alarm and event analysis. This approach proved able to increase industrial perception by extracting insights and valuable information from real-world industrial data without making prior assumptions.
Affiliation(s)
- Aguinaldo Bezerra
- Postgraduate Program in Electrical and Computer Engineering, Federal University of Rio Grande do Norte, Natal 59078-970, Rio Grande do Norte, Brazil.
| | - Ivanovitch Silva
- Digital Metropolis Institute, Federal University of Rio Grande do Norte, Natal 59078-970, Rio Grande do Norte, Brazil.
| | - Luiz Affonso Guedes
- Postgraduate Program in Electrical and Computer Engineering, Federal University of Rio Grande do Norte, Natal 59078-970, Rio Grande do Norte, Brazil.
| | - Diego Silva
- School of Sciences and Technology, Federal University of Rio Grande do Norte, Natal 59078-970, Rio Grande do Norte, Brazil.
| | - Gustavo Leitão
- Digital Metropolis Institute, Federal University of Rio Grande do Norte, Natal 59078-970, Rio Grande do Norte, Brazil.
| | - Kaku Saito
- Petróleo Brasileiro S.A., Rio de Janeiro 21941-915, Brazil.
| |
|
31
|
Bramer LM, Stratton KG, White AM, Bleeker AH, Kobold MA, Waters KM, Metz TO, Rodland KD, Webb-Robertson BJM. P-Mart: Interactive Analysis of Ion Abundance Global Proteomics Data. J Proteome Res 2019; 18:1426-1432. [PMID: 30667224 DOI: 10.1021/acs.jproteome.8b00840] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Indexed: 12/27/2022]
Abstract
The use of mass-spectrometry-based techniques for global protein profiling of biomedical or environmental experiments has become a major focus in research centered on biomarker discovery; however, one of the most important issues recently highlighted in the new era of omics data generation is the ability to perform analyses in a robust and reproducible manner. This has been hypothesized to be one of the issues hindering the ability of clinical proteomics to successfully identify clinical diagnostic and prognostic biomarkers of disease. P-Mart (https://pmart.labworks.org) is a new interactive web-based software environment that enables domain scientists to perform quality-control processing, statistics, and exploration of large, complex proteomics data sets without requiring statistical programming. P-Mart is developed in a manner that allows researchers to perform analyses via a series of modules, explore the results using interactive visualization, and finalize the analyses with a collection of output files documenting all stages of the analysis and a report to allow reproduction of the analysis.
Affiliation(s)
- Lisa M Bramer
- Computing & Analytics Division, Pacific Northwest National Laboratory, 902 Battelle Boulevard, Richland, Washington 99352, United States
| | - Kelly G Stratton
- Computing & Analytics Division, Pacific Northwest National Laboratory, 902 Battelle Boulevard, Richland, Washington 99352, United States
| | - Amanda M White
- Computing & Analytics Division, Pacific Northwest National Laboratory, 902 Battelle Boulevard, Richland, Washington 99352, United States
| | - Ameila H Bleeker
- Computing & Analytics Division, Pacific Northwest National Laboratory, 902 Battelle Boulevard, Richland, Washington 99352, United States
| | - Markus A Kobold
- Computing & Analytics Division, Pacific Northwest National Laboratory, 902 Battelle Boulevard, Richland, Washington 99352, United States
| | - Katrina M Waters
- Biological Sciences Division, Pacific Northwest National Laboratory, 902 Battelle Boulevard, Richland, Washington 99352, United States
| | - Thomas O Metz
- Biological Sciences Division, Pacific Northwest National Laboratory, 902 Battelle Boulevard, Richland, Washington 99352, United States
| | - Karin D Rodland
- Biological Sciences Division, Pacific Northwest National Laboratory, 902 Battelle Boulevard, Richland, Washington 99352, United States; Department of Cell, Developmental, and Cancer Biology, Oregon Health & Science University, Portland, Oregon 97221, United States
| | - Bobbie-Jo M Webb-Robertson
- Computing & Analytics Division, Pacific Northwest National Laboratory, 902 Battelle Boulevard, Richland, Washington 99352, United States
| |
|
32
|
Abstract
Post-hoc power estimates (power calculated for hypothesis tests after performing them) are sometimes requested by reviewers in an attempt to promote more rigorous designs. However, they should never be requested or reported because they have been shown to be logically invalid and practically misleading. We review the problems associated with post-hoc power, particularly the fact that the resulting calculated power is a monotone function of the p-value and therefore contains no additional helpful information. We then discuss some situations that seem at first to call for post-hoc power analysis, such as attempts to decide on the practical implications of a null finding, or attempts to determine whether the sample size of a secondary data analysis is adequate for a proposed analysis, and consider possible approaches to achieving these goals. We make recommendations for practice in situations in which clear recommendations can be made, and point out other situations where further methodological research and discussion are required.
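The claim that post-hoc power is a monotone function of the p-value can be made concrete with a short sketch. It assumes the simplest setting, a two-sided z-test in which the observed effect is treated as the true effect; the function name `observed_power` is illustrative and does not come from the source.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def observed_power(p_value, alpha=0.05):
    """Post-hoc ("observed") power of a two-sided z-test, computed from p alone."""
    # Recover the absolute test statistic implied by the p-value.
    z_obs = _STD_NORMAL.inv_cdf(1 - p_value / 2)
    z_crit = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    # Probability of rejecting if the true standardized effect equalled z_obs.
    return _STD_NORMAL.cdf(z_obs - z_crit) + _STD_NORMAL.cdf(-z_obs - z_crit)


for p in (0.001, 0.01, 0.05, 0.2, 0.5):
    print(f"p = {p:<5}  observed power = {observed_power(p):.3f}")
```

Because the function takes nothing but the p-value and the significance level, the computed power cannot add information beyond the p-value itself. In this setting a result sitting exactly at p = alpha always yields an observed power of about one half.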
|
33
|
Silva EJ, Bezerra-Souza A, Passero LF, Laurenti MD, Ferreira GM, Fujii DG, Trossini GH, Raminelli C. Synthesis, leishmanicidal activity, structural descriptors and structure-activity relationship of quinoline derivatives. Future Med Chem 2018; 10:2069-85. [PMID: 30066582 DOI: 10.4155/fmc-2018-0124] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Indexed: 11/17/2022] Open
Abstract
AIM Considering the epidemiology of leishmaniasis, the emergence of resistant parasites to the approved drugs, and severe clinical manifestations, the development of novel leishmanicidal molecules has become of considerable importance. RESULTS In this work, three commercially available and 19 synthesized quinoline derivatives were evaluated against promastigote and amastigote forms of Leishmania (Leishmania) amazonensis. In addition, structural parameters and molecular electrostatic potentials were obtained by theoretical calculations, allowing statistical (principal component analyses and hierarchical cluster analyses) and comparative (molecular electrostatic potentials vs leishmanicidal activities) studies, respectively. CONCLUSION Principal component analyses and hierarchical cluster analyses suggested volume and polar surface area as possible structural descriptors for the leishmanicidal activity. Furthermore, a comparison between molecular electrostatic potentials and leishmanicidal activities afforded a reasonable structure-activity relationship.
|
34
|
Abstract
The difference between the pth quantiles of 2 survival functions can be used to compare patients' survival between 2 therapies. Setting p = 0.5 yields the median survival time difference. Varying p between 0 and 1 defines the quantile survival time difference curve which can be straightforwardly estimated by the horizontal differences between 2 Kaplan-Meier curves. The estimate's variability can be visualized by adding either a bundle of resampled bootstrap step functions or, alternatively, approximate bootstrap confidence bands. The user-friendly SAS software macro %kmdiff enables the straightforward application of this exploratory graphical approach. The macro is described, and its application is exemplified with breast cancer data. The advantages and limitations of the approach are discussed.
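The horizontal difference between two Kaplan-Meier curves can be sketched in a few lines. This is an illustration in Python, not the SAS macro %kmdiff described above: the function names, the toy data, and the convention of taking the smallest time at which the estimated survival drops to 1 - p or below are all assumptions, and no bootstrap bands are computed.

```python
from fractions import Fraction


def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct event time (events: 1 = event, 0 = censored)."""
    data = sorted(zip(times, events))
    at_risk = len(data)
    surv = Fraction(1)
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        tied = [e for (u, e) in data if u == t]
        deaths = sum(tied)
        if deaths:
            # Exact product-limit update, avoiding floating-point drift.
            surv *= Fraction(at_risk - deaths, at_risk)
            curve.append((t, surv))
        at_risk -= len(tied)
        i += len(tied)
    return curve


def survival_quantile(curve, p):
    """Smallest time with S(t) <= 1 - p, or None if the curve never gets there."""
    for t, s in curve:
        if s <= 1 - p:
            return t
    return None


def quantile_difference(curve_a, curve_b, p):
    """Horizontal distance between two curves at quantile p (B minus A)."""
    qa, qb = survival_quantile(curve_a, p), survival_quantile(curve_b, p)
    return None if qa is None or qb is None else qb - qa


# Toy data, not from the source: two therapies with four patients each.
therapy_a = kaplan_meier([1, 2, 3, 4], [1, 1, 1, 1])
therapy_b = kaplan_meier([2, 4, 6, 8], [1, 1, 1, 1])
print(quantile_difference(therapy_a, therapy_b, Fraction(1, 2)))
```

Evaluating `quantile_difference` over a grid of p values traces out the quantile survival time difference curve; the function returns `None` wherever heavy censoring keeps either curve from reaching the requested quantile.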
Affiliation(s)
- Harald Heinzl
- Section for Clinical Biometrics, Center for Medical Statistics, Informatics, and Intelligent Systems, Medical University of Vienna, Vienna, Austria
| | | |
|
35
|
Pastore M, Lionetti F, Altoè G. When One Shape Does Not Fit All: A Commentary Essay on the Use of Graphs in Psychological Research. Front Psychol 2017; 8:1666. [PMID: 28993749 PMCID: PMC5622191 DOI: 10.3389/fpsyg.2017.01666] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.1] [Received: 08/18/2017] [Accepted: 09/11/2017] [Indexed: 11/27/2022] Open
Affiliation(s)
- Massimiliano Pastore
- Department of Developmental and Social Psychology, University of Padova, Padova, Italy
| | - Francesca Lionetti
- Department of Biological and Experimental Psychology, Queen Mary University of London, London, United Kingdom
| | - Gianmarco Altoè
- Department of Developmental and Social Psychology, University of Padova, Padova, Italy
| |
|
36
|
Taylor M, Bickel A, Mannion R, Bell E, Harrigan GG. Dicamba-Tolerant Soybeans (Glycine max L.) MON 87708 and MON 87708 × MON 89788 Are Compositionally Equivalent to Conventional Soybean. J Agric Food Chem 2017; 65:8037-8045. [PMID: 28825823 DOI: 10.1021/acs.jafc.7b03844] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Indexed: 06/07/2023]
Abstract
Herbicide-tolerant crops can expand both tools for and timing of weed control strategies. MON 87708 soybean has been developed through genetic modification and confers tolerance to the dicamba herbicide. As part of the safety assessment conducted for new genetically modified (GM) crop varieties, a compositional assessment of MON 87708 was performed. Levels of key soybean nutrients and anti-nutrients in harvested MON 87708 were compared to levels of those components in a closely related non-GM variety as well as to levels measured in other conventional soybean varieties. From this analysis, MON 87708 was shown to be compositionally equivalent to its comparator. A similar analysis conducted for a stacked trait product produced by conventional breeding, MON 87708 × MON 89788, which confers tolerance to both dicamba and glyphosate herbicides, reached the same conclusion. These results are consistent with other results that demonstrate no compositional impact of genetic modification, except in those cases where an impact was an intended outcome.
Affiliation(s)
- Mary Taylor
- Monsanto Company, 800 North Lindbergh Boulevard, St. Louis, Missouri 63167, United States
| | - Anna Bickel
- Monsanto Company, 800 North Lindbergh Boulevard, St. Louis, Missouri 63167, United States
| | - Rhonda Mannion
- Monsanto Company, 800 North Lindbergh Boulevard, St. Louis, Missouri 63167, United States
| | - Erin Bell
- Monsanto Company, 800 North Lindbergh Boulevard, St. Louis, Missouri 63167, United States
| | - George G Harrigan
- Monsanto Company, 800 North Lindbergh Boulevard, St. Louis, Missouri 63167, United States
| |
|
37
|
Rakotonirina JC, Csősz S, Fisher BL. Revision of the Malagasy Camponotus edmondi species group (Hymenoptera, Formicidae, Formicinae): integrating qualitative morphology and multivariate morphometric analysis. Zookeys 2017:81-154. [PMID: 28050160 PMCID: PMC4843987 DOI: 10.3897/zookeys.572.7177] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Received: 11/12/2015] [Accepted: 02/20/2016] [Indexed: 11/16/2022] Open
Abstract
The Malagasy Camponotus edmondi species group is revised based on both qualitative morphological traits and multivariate analysis of continuous morphometric data. To minimize the effect of the scaling properties of diverse traits due to worker caste polymorphism, and to achieve the desired near-linearity of data, morphometric analyses were done only on minor workers. The majority of traits exhibit broken scaling on head size, dividing Camponotus workers into two discrete subcastes, minors and majors. This broken scaling prevents the application of algorithms that use linear combinations of data to the entire dataset, hence only minor workers were analyzed statistically. The elimination of major workers resulted in linearity and the data meet required assumptions. However, morphometric ratios for the subsets of minor and major workers were used in species descriptions and redefinitions. Prior species hypotheses and the goodness of clusters were tested on raw data by confirmatory linear discriminant analysis. Due to the small sample size available for some species, a factor known to reduce statistical reliability, hypotheses generated by exploratory analyses were tested with extreme care and species delimitations were inferred via the combined evidence of both qualitative (morphology and biology) and quantitative data. Altogether, 15 species are recognized, of which 11 are new to science: Camponotus alamaina sp. n., Camponotus androy sp. n., Camponotus bevohitra sp. n., Camponotus galoko sp. n., Camponotus matsilo sp. n., Camponotus mifaka sp. n., Camponotus orombe sp. n., Camponotus tafo sp. n., Camponotus tratra sp. n., Camponotus varatra sp. n., and Camponotus zavo sp. n. Four species are redescribed: Camponotus echinoploides Forel, Camponotus edmondi André, Camponotus ethicus Forel, and Camponotus robustus Roger. Camponotus edmondi ernesti Forel, syn. n. is synonymized under Camponotus edmondi.
This revision also includes an identification key to species for both minor and major castes, information on geographic distribution and biology, taxonomic discussions, and descriptions of intraspecific variation. Traditional taxonomy and multivariate morphometric analysis are independent sources of information which, in combination, allow more precise species delimitation. Moreover, quantitative characters included in identification keys improve accuracy of determination in difficult cases.
Affiliation(s)
- Jean Claude Rakotonirina
- Madagascar Biodiversity Center, BP 6257, Parc Botanique et Zoologique de Tsimbazaza, Antananarivo, Madagascar
| | - Sándor Csősz
- Entomology, California Academy of Sciences, 55 Music Concourse Drive, San Francisco, CA 94118, U.S.A
| | - Brian L Fisher
- Entomology, California Academy of Sciences, 55 Music Concourse Drive, San Francisco, CA 94118, U.S.A
| |
|
38
|
Tomitaka S, Kawasaki Y, Ide K, Akutagawa M, Yamada H, Yutaka O, Furukawa TA. Item Response Patterns on the Patient Health Questionnaire-8 in a Nationally Representative Sample of US Adults. Front Psychiatry 2017; 8:251. [PMID: 29225583 PMCID: PMC5705613 DOI: 10.3389/fpsyt.2017.00251] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.1] [Received: 08/13/2017] [Accepted: 11/09/2017] [Indexed: 11/13/2022] Open
Abstract
Recent studies have shown that item responses on the Center for Epidemiologic Studies Depression Scale (CES-D) and Kessler Screening Scale for Psychological Distress (K6) exhibit the same characteristic item response patterns among the general population. However, the distributional patterns of responses on the Patient Health Questionnaire-8 (PHQ-8) among the general population have not been adequately studied. Thus, we conducted a pattern analysis of PHQ-8 item responses among US adults. Data (18,446 individuals) were obtained from the 2015 Behavioral Risk Factor Surveillance Survey (BRFSS). Item responses on the BRFSS version of the PHQ-8 were scored using the number of days response set and then converted to the original 4-point scale. The patterns of item responses were analyzed through graphical analysis. Lines of item responses scored using the number of days response set showed the same pattern among the eight items, characterized by crossing at a single point between "0 days" and "1 day," and parallel fluctuation from "1 day" to "14 days" on a semi-logarithmic scale. Lines of item responses converted to the 4-point scale also showed the same characteristic pattern among the eight items. The present results demonstrate that the item responses on the PHQ-8 show the same characteristic patterns among items, consistent with the CES-D and the K6.
Affiliation(s)
- Shinichiro Tomitaka
- Department of Mental Health, Panasonic Health Center, Tokyo, Japan; Department of Health Promotion and Human Behavior, Kyoto University Graduate School of Medicine/School of Public Health, Kyoto, Japan; Department of Drug Evaluation and Informatics, School of Pharmaceutical Sciences, University of Shizuoka, Shizuoka, Japan
| | - Yohei Kawasaki
- Clinical Research Center, Chiba University Hospital, Chiba, Japan
| | - Kazuki Ide
- Department of Drug Evaluation and Informatics, School of Pharmaceutical Sciences, University of Shizuoka, Shizuoka, Japan; Department of Pharmacoepidemiology, Graduate School of Medicine and Public Health, Kyoto University, Kyoto, Japan; Center for the Promotion of Interdisciplinary Education and Research, Kyoto University, Kyoto, Japan
| | - Maiko Akutagawa
- Department of Drug Evaluation and Informatics, School of Pharmaceutical Sciences, University of Shizuoka, Shizuoka, Japan
| | - Hiroshi Yamada
- Department of Drug Evaluation and Informatics, School of Pharmaceutical Sciences, University of Shizuoka, Shizuoka, Japan
| | - Ono Yutaka
- Center for the Development of Cognitive Behavior Therapy Training, Tokyo, Japan
| | - Toshiaki A Furukawa
- Department of Health Promotion and Human Behavior, Kyoto University Graduate School of Medicine/School of Public Health, Kyoto, Japan
| |
|
39
|
Meng C, Zeleznik OA, Thallinger GG, Kuster B, Gholami AM, Culhane AC. Dimension reduction techniques for the integrative analysis of multi-omics data. Brief Bioinform 2016; 17:628-41. [PMID: 26969681 PMCID: PMC4945831 DOI: 10.1093/bib/bbv108] [Citation(s) in RCA: 190] [Impact Index Per Article: 23.8] [Received: 06/08/2015] [Revised: 10/26/2015] [Indexed: 01/16/2023] Open
Abstract
State-of-the-art next-generation sequencing, transcriptomics, proteomics and other high-throughput 'omics' technologies enable the efficient generation of large experimental data sets. These data may yield unprecedented knowledge about molecular pathways in cells and their role in disease. Dimension reduction approaches have been widely used in exploratory analysis of single omics data sets. This review will focus on dimension reduction approaches for simultaneous exploratory analyses of multiple data sets. These methods extract the linear relationships that best explain the correlated structure across data sets, the variability both within and between variables (or observations) and may highlight data issues such as batch effects or outliers. We explore dimension reduction techniques as one of the emerging approaches for data integration, and how these can be applied to increase our understanding of biological systems in normal physiological function and disease.
|
40
|
He Y, Shimizu I, Schappert S, Xu J, Beresovsky V, Khan D, Valverde R, Schenker N. A Note on the Effect of Data Clustering on the Multiple-Imputation Variance Estimator: A Theoretical Addendum to , JOS. J Off Stat 2016; 32:147-164. [PMID: 30948863 PMCID: PMC6444354 DOI: 10.1515/jos-2016-0007] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Indexed: 06/09/2023]
Abstract
Multiple imputation is a popular approach to handling missing data. Although it was originally motivated by survey nonresponse problems, it has been readily applied to other data settings. However, its general behavior still remains unclear when applied to survey data with complex sample designs, including clustering. Recently, Lewis et al. (2014) compared single- and multiple-imputation analyses for certain incomplete variables in the 2008 National Ambulatory Medical Care Survey, which has a nationally representative, multistage, and clustered sampling design. Their study results suggested that the increase of the variance estimate due to multiple imputation compared with single imputation largely disappears for estimates with large design effects. We complement their empirical research by providing some theoretical reasoning. We consider data sampled from an equally weighted, single-stage cluster design and characterize the process using a balanced, one-way normal random-effects model. Assuming that the missingness is completely at random, we derive analytic expressions for the within- and between-multiple-imputation variance estimators for the mean estimator, and thus conveniently reveal the impact of design effects on these variance estimators. We propose approximations for the fraction of missing information in clustered samples, extending previous results for simple random samples. We discuss some generalizations of this research and its practical implications for data release by statistical agencies.
Affiliation(s)
- Yulei He
- National Center for Health Statistics, Centers for Disease Control and Prevention, Hyattsville, MD, 20782, U.S.A
| | - Iris Shimizu
- National Center for Health Statistics, Centers for Disease Control and Prevention, Hyattsville, MD, 20782, U.S.A
| | - Susan Schappert
- National Center for Health Statistics, Centers for Disease Control and Prevention, Hyattsville, MD, 20782, U.S.A
| | - Jianmin Xu
- National Center for Health Statistics, Centers for Disease Control and Prevention, Hyattsville, MD, 20782, U.S.A
| | - Vladislav Beresovsky
- National Center for Health Statistics, Centers for Disease Control and Prevention, Hyattsville, MD, 20782, U.S.A
| | - Diba Khan
- National Center for Health Statistics, Centers for Disease Control and Prevention, Hyattsville, MD, 20782, U.S.A
| | - Roberto Valverde
- National Center for Health Statistics, Centers for Disease Control and Prevention, Hyattsville, MD, 20782, U.S.A
| | - Nathaniel Schenker
- National Center for Health Statistics, Centers for Disease Control and Prevention, Hyattsville, MD, 20782, U.S.A
| |
|
41
|
Abstract
In this article, I argue for greater use of exploratory techniques to identify dynamics in social interactions. I describe several approaches as they are applied to multivariate time series data. The first approach is an algorithm that searches for periods of variability and stability at the individual level as well as for patterns of overlap in such periods between the two individuals in a couple. These patterns describe the daily ups and downs in the couples' affect and are predictive of the state of the couples 1 to 2 years later. The second approach, hierarchical segmentation, is based on the idea of partitioning the time series into segments with distinct data patterns. In the case of data from dyads, as in the illustration, the patterns can be compared in terms of coherence between the two individuals in the dyad. The third approach is based on network analysis, and its use is shown as a method to examine data transitions at the individual and dyadic level as well as system-wide coherence in multivariate systems. For each approach, I provide examples of its use with empirical data. The article ends with general guidelines and recommendations for researchers interested in using exploratory methods as a way to examine psychological processes.
Affiliation(s)
- Emilio Ferrer
- Department of Psychology, University of California, Davis
| |
|
42
|
Abstract
Pattern recognition is a key element in pharmacodynamic analyses as a first step to identify drug action and selection of a pharmacodynamic model. The essence of this process is going from data to insight through exploratory data analysis. There are few formal strategies that scientists typically use when the experiment has been done and data collected. This report attempts to ameliorate this deficit by identifying the properties of a pharmacodynamic model via dissection of the pattern revealed in response-time data. Pattern recognition in pharmacodynamic analyses contrasts with pharmacokinetic analyses with respect to time course. Thus, the time course of drug in plasma usually differs markedly from the time course of the biomarker response, as a consequence of a myriad of interactions (transport to biophase, binding to target, activation of target and downstream mediators, physiological response, cascade and amplification of biosignals, homeostatic feedback) between the events of exposure to test compound and the occurrence of the biomarker response. Homing in on this important, but less often addressed, element, 20 datasets of varying complexity were analyzed, and from this, we summarize a set of points to consider, specifically addressing baseline behavior, number of phases in the response-time course, time delays between concentration- and response-time courses, peak shifts in response with increasing doses, saturation, and other potential nonlinearities. These strategies will hopefully give a better understanding of the complete pharmacodynamic response-time profile.
Affiliation(s)
- Johan Gabrielsson
- Division of Pharmacology and Toxicology, Department of Biomedical Sciences and Veterinary Public Health, SLU, Box 7028, SE-750 07, Uppsala, Sweden.
| | - Stephan Hjorth
- Department of Molecular and Clinical Medicine, Institute of Medicine, The Sahlgrenska Academy at Gothenburg University, SE-413 45, Gothenburg, Sweden
- PharmaLot Consulting AB, V. Bäckvägen 21B, SE-434 92, Vallda, Sweden
| |
|
43
|
Abstract
Improvements in DNA sequencing technology have increased the amount and quality of sequences that can be obtained from metagenomic samples, making it practical to extract individual microbial genomes from metagenomic assemblies (“binning”). However, while many tools and methods exist for unsupervised binning with various statistical algorithms, there are few options for visualizing the results, even though visualization is vital to exploratory data analysis. We have developed gbtools, a software package that allows users to visualize metagenomic assemblies by plotting coverage (sequencing depth) and GC values of contigs, and also to annotate the plots with taxonomic information. Different sets of annotations, including taxonomic assignments from conserved marker genes or SSU rRNA genes, can be imported simultaneously; users can choose which annotations to plot. Bins can be manually defined from plots, or be imported from third-party binning tools and overlaid onto plots, such that results from different methods can be compared side-by-side. gbtools reports summary statistics of bins including marker gene completeness, and allows the user to add or subtract bins with each other. We illustrate some of the functions available in gbtools with two examples: the metagenome of Olavius algarvensis, a marine oligochaete worm that has up to five bacterial symbionts, and the metagenome of a synthetic mock community comprising 64 bacterial and archaeal strains. We show how instances of poor automated binning, sequencer GC% bias, and variation between samples can be quickly diagnosed by visualization, and demonstrate how the results from different binning tools can be combined and refined to yield manually curated bins with higher completeness. gbtools is open-source and written in R. The software package, documentation, and example data are available freely online at https://github.com/kbseah/genome-bin-tools.
Affiliation(s)
- Brandon K B Seah
- Department of Symbiosis, Max Planck Institute for Marine Microbiology, Bremen, Germany
| | | |
|
44
|
Ho AD, Yu CC. Descriptive Statistics for Modern Test Score Distributions: Skewness, Kurtosis, Discreteness, and Ceiling Effects. Educ Psychol Meas 2015; 75:365-388. [PMID: 29795825 PMCID: PMC5965643 DOI: 10.1177/0013164414548576] [Citation(s) in RCA: 51] [Impact Index Per Article: 5.7] [Indexed: 05/24/2023]
Abstract
Many statistical analyses benefit from the assumption that unconditional or conditional distributions are continuous and normal. More than 50 years ago in this journal, Lord and Cook chronicled departures from normality in educational tests, and Micceri similarly showed that the normality assumption is met rarely in educational and psychological practice. In this article, the authors extend these previous analyses to state-level educational test score distributions that are an increasingly common target of high-stakes analysis and interpretation. Among 504 scale-score and raw-score distributions from state testing programs from recent years, nonnormal distributions are common and are often associated with particular state programs. The authors explain how scaling procedures from item response theory lead to nonnormal distributions as well as unusual patterns of discreteness. The authors recommend that distributional descriptive statistics be calculated routinely to inform model selection for large-scale test score data, and they illustrate consequences of nonnormality using sensitivity studies that compare baseline results to those from normalized score scales.
Affiliation(s)
- Andrew D. Ho
- Harvard Graduate School of Education, Cambridge, MA, USA
| | - Carol C. Yu
- Harvard Graduate School of Education, Cambridge, MA, USA
| |
|
45
|
Pamplona GSP, Santos Neto GS, Rosset SRE, Rogers BP, Salmon CEG. Analyzing the association between functional connectivity of the brain and intellectual performance. Front Hum Neurosci 2015; 9:61. [PMID: 25713528 PMCID: PMC4322636 DOI: 10.3389/fnhum.2015.00061] [Citation(s) in RCA: 41] [Impact Index Per Article: 4.6] [Received: 10/10/2014] [Accepted: 01/23/2015] [Indexed: 11/13/2022] Open
Abstract
Measurements of functional connectivity support the hypothesis that the brain is composed of distinct networks with anatomically separated nodes but common functionality. A few studies have suggested that intellectual performance may be associated with greater functional connectivity in the fronto-parietal network and enhanced global efficiency. In this fMRI study, we performed an exploratory analysis of the relationship between the brain's functional connectivity and intelligence scores derived from the Portuguese language version of the Wechsler Adult Intelligence Scale (WAIS-III) in a sample of 29 people, born and raised in Brazil. We examined functional connectivity between 82 regions, including graph theoretic properties of the overall network. Some previous findings were extended to the Portuguese-speaking population, specifically the presence of small-world organization of the brain and relationships of intelligence with connectivity of frontal, pre-central, parietal, occipital, fusiform and supramarginal gyrus, and caudate nucleus. Verbal comprehension was associated with global network efficiency, a new finding.
Affiliation(s)
- Gustavo S P Pamplona
- InBrain Lab, Department of Physics, Faculty of Philosophy, Sciences and Letters of Ribeirão Preto, University of São Paulo, São Paulo, Brazil
| | - Gérson S Santos Neto
- Faculty of Medicine of Ribeirão Preto, University of São Paulo, São Paulo, Brazil
| | - Sara R E Rosset
- Faculty of Medicine of Ribeirão Preto, University of São Paulo, São Paulo, Brazil
| | - Baxter P Rogers
- Department of Radiology and Radiological Sciences, Department of Biomedical Engineering, Institute of Imaging Science, Vanderbilt University, Nashville, TN, USA
| | - Carlos E G Salmon
- InBrain Lab, Department of Physics, Faculty of Philosophy, Sciences and Letters of Ribeirão Preto, University of São Paulo, São Paulo, Brazil
| |
|
46
|
Van Gassen S, Callebaut B, Van Helden MJ, Lambrecht BN, Demeester P, Dhaene T, Saeys Y. FlowSOM: Using self-organizing maps for visualization and interpretation of cytometry data. Cytometry A 2015; 87:636-45. [PMID: 25573116 DOI: 10.1002/cyto.a.22625] [Citation(s) in RCA: 1007] [Impact Index Per Article: 111.9] [Indexed: 12/12/2022]
Abstract
The number of markers measured in both flow and mass cytometry keeps increasing steadily. Although this provides a wealth of information, it becomes infeasible to analyze these datasets manually. When using 2D scatter plots, the number of possible plots increases exponentially with the number of markers and therefore, relevant information that is present in the data might be missed. In this article, we introduce a new visualization technique, called FlowSOM, which analyzes flow or mass cytometry data using a Self-Organizing Map. Using a two-level clustering and star charts, our algorithm helps to obtain a clear overview of how all markers are behaving on all cells, and to detect subsets that might be missed otherwise. R code is available at https://github.com/SofieVG/FlowSOM and will be made available at Bioconductor.
Affiliation(s)
- Sofie Van Gassen
- Department of Information Technology, Ghent University, iMinds, Ghent, Belgium; Inflammation Research Center, VIB, Ghent, Belgium; Department of Respiratory Medicine, Ghent University Hospital, Ghent, Belgium
| | - Britt Callebaut
- Department of Information Technology, Ghent University, iMinds, Ghent, Belgium
| | - Mary J Van Helden
- Inflammation Research Center, VIB, Ghent, Belgium; Department of Respiratory Medicine, Ghent University Hospital, Ghent, Belgium
| | - Bart N Lambrecht
- Inflammation Research Center, VIB, Ghent, Belgium; Department of Respiratory Medicine, Ghent University Hospital, Ghent, Belgium
| | - Piet Demeester
- Department of Information Technology, Ghent University, iMinds, Ghent, Belgium
| | - Tom Dhaene
- Department of Information Technology, Ghent University, iMinds, Ghent, Belgium
| | - Yvan Saeys
- Inflammation Research Center, VIB, Ghent, Belgium; Department of Respiratory Medicine, Ghent University Hospital, Ghent, Belgium
| |
|
47
|
Abstract
Respondent-driven sampling (RDS) is a widely used method for sampling from hard-to-reach human populations, especially populations at higher risk for HIV. Data are collected through peer-referral over social networks. RDS has proven practical for data collection in many difficult settings and is widely used. Inference from RDS data requires many strong assumptions because the sampling design is partially beyond the control of the researcher and partially unobserved. We introduce diagnostic tools for most of these assumptions and apply them in 12 high risk populations. These diagnostics empower researchers to better understand their data and encourage future statistical research on RDS.
Affiliation(s)
| | - Lisa G Johnston
- Tulane University, New Orleans, LA, USA and University of California, San Francisco, San Francisco, CA, USA
| | - Matthew J Salganik
- Microsoft Research, New York, NY USA and Princeton University, Princeton, NJ, USA
| |
|
48
|
Abstract
BACKGROUND: Biomolecular pathways and networks are dynamic and complex, and the perturbations to them which cause disease are often multiple, heterogeneous and contingent. Pathway and network visualizations, rendered on a computer or published on paper, however, tend to be static, lacking in detail, and ill-equipped to explore the variety and quantities of data available today, and the complex causes we seek to understand. RESULTS: RCytoscape integrates R (an open-ended programming environment rich in statistical power and data-handling facilities) and Cytoscape (powerful network visualization and analysis software). RCytoscape extends Cytoscape's functionality beyond what is possible with the Cytoscape graphical user interface. To illustrate the power of RCytoscape, a portion of the Glioblastoma multiforme (GBM) data set from the Cancer Genome Atlas (TCGA) is examined. Network visualization reveals previously unreported patterns in the data suggesting heterogeneous signaling mechanisms active in GBM Proneural tumors, with possible clinical relevance. CONCLUSIONS: Progress in bioinformatics and computational biology depends upon exploratory and confirmatory data analysis, upon inference, and upon modeling. These activities will eventually permit the prediction and control of complex biological systems. Network visualizations (molecular maps) created from an open-ended programming environment rich in statistical power and data-handling facilities, such as RCytoscape, will play an essential role in this progression.
Affiliation(s)
- Paul T Shannon
- Fred Hutchinson Cancer Research Institute, Seattle, Washington, and the Institute for Systems Biology, 401 Terry Ave. N, Seattle, WA, USA
- Institute for Systems Biology, 401 Terry Ave. N, Seattle, WA, USA
| | - Mark Grimes
- Division of Biological Sciences, Center for Structural and Functional Neuroscience, University of Montana, Missoula, MT, USA
| | - Burak Kutlu
- Institute for Systems Biology, 401 Terry Ave. N, Seattle, WA, USA
| | - Jan J Bot
- Delft University of Technology, Delft Bioinformatics Lab, Delft, The Netherlands
| | - David J Galas
- Pacific Northwest Diabetes Research Institute, 720 Broadway, Seattle, WA 98120, USA
| |
|
49
|
Aghaeepour N, Jalali A, O’Neill K, Chattopadhyay PK, Roederer M, Hoos HH, Brinkman RR. RchyOptimyx: cellular hierarchy optimization for flow cytometry. Cytometry A 2012; 81:1022-30. [PMID: 23044634 PMCID: PMC3726344 DOI: 10.1002/cyto.a.22209] [Citation(s) in RCA: 51] [Impact Index Per Article: 4.3] [Received: 05/15/2012] [Revised: 08/07/2012] [Accepted: 09/05/2012] [Indexed: 12/19/2022]
Abstract
Analysis of high-dimensional flow cytometry datasets can reveal novel cell populations with poorly understood biology. Following discovery, characterization of these populations in terms of the critical markers involved is an important step, as this can help to both better understand the biology of these populations and aid in designing simpler marker panels to identify them on simpler instruments and with fewer reagents (i.e., in resource poor or highly regulated clinical settings). However, current tools to design panels based on the biological characteristics of the target cell populations work exclusively based on technical parameters (e.g., instrument configurations, spectral overlap, and reagent availability). To address this shortcoming, we developed RchyOptimyx (cellular hieraRCHY OPTIMization), a computational tool that constructs cellular hierarchies by combining automated gating with dynamic programming and graph theory to provide the best gating strategies to identify a target population to a desired level of purity or correlation with a clinical outcome, using the simplest possible marker panels. RchyOptimyx can assess and graphically present the trade-offs between marker choice and population specificity in high-dimensional flow or mass cytometry datasets. We present three proof-of-concept use cases for RchyOptimyx that involve 1) designing a panel of surface markers for identification of rare populations that are primarily characterized using their intracellular signature; 2) simplifying the gating strategy for identification of a target cell population; 3) identification of a non-redundant marker set to identify a target cell population.
Affiliation(s)
- Nima Aghaeepour
- Terry Fox Laboratory, British Columbia Cancer Agency, Vancouver, British Columbia, Canada
| | - Adrin Jalali
- Terry Fox Laboratory, British Columbia Cancer Agency, Vancouver, British Columbia, Canada
| | - Kieran O’Neill
- Terry Fox Laboratory, British Columbia Cancer Agency, Vancouver, British Columbia, Canada
| | | | - Mario Roederer
- Vaccine Research Center, National Institutes of Health, Bethesda, Maryland
| | - Holger H. Hoos
- Department of Computer Science, University of British Columbia, British Columbia, Canada
| | - Ryan R. Brinkman
- Terry Fox Laboratory, British Columbia Cancer Agency, Vancouver, British Columbia, Canada
- Department of Medical Genetics, University of British Columbia, British Columbia, Canada
| |
|
50
|
Abstract
Graphics are widely used in modern applied statistics because they are easy to create, convenient to use, and they can present information effectively. Static plots do not allow interacting with graphics. User interaction, on the other hand, is crucial in exploring data. It gives flexibility and control. One can experiment with the data and the displays. One can investigate the data from different perspectives to produce views that are easily interpretable and informative. In this paper, we try to explain interactive graphics and advocate their use as a practical tool. The benefits and strengths of interactive graphics for data exploration and data quality analyses are illustrated systematically with three complex real datasets.
|