1
Data-Driven Battery Characterization and Prognosis: Recent Progress, Challenges, and Prospects. SMALL METHODS 2024:e2301021. [PMID: 38213008 DOI: 10.1002/smtd.202301021] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/07/2023] [Revised: 12/11/2023] [Indexed: 01/13/2024]
Abstract
Battery characterization and prognosis are essential for analyzing underlying electrochemical mechanisms and ensuring safe operation, especially with the assistance of data-driven artificial intelligence systems. This review offers a perspective on recent progress in data-driven battery characterization and prognosis methods. First, recent informative image characterization and impedance spectrum approaches, as well as high-throughput screening approaches, for revealing battery electrochemical mechanisms at multiple scales are summarized. Thereafter, battery prognosis tasks and strategies are described, with a comparison of various physics-informed modeling strategies. With a view to extracting mechanisms from large volumes of battery data, the dominant role of physics-informed interpretable learning in accelerating energy device development is presented. Finally, challenges and prospects for data-driven characterization and prognosis are discussed, toward accelerating energy device development with much-enhanced electrochemical transparency and generalization. This review is intended to supply new ideas and inspiration for next-generation battery development.
2
Probing quantum correlations in many-body systems: a review of scalable methods. REPORTS ON PROGRESS IN PHYSICS. PHYSICAL SOCIETY (GREAT BRITAIN) 2023; 86. [PMID: 37699388 DOI: 10.1088/1361-6633/acf8d7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/14/2023] [Accepted: 09/12/2023] [Indexed: 09/14/2023]
Abstract
We review methods that allow one to detect and characterize quantum correlations in many-body systems, with a special focus on approaches that are scalable, namely those applicable to systems with many degrees of freedom without requiring a number of measurements, or computational resources to analyze the data, that scale exponentially with the system size. We begin by introducing the concepts of quantum entanglement, Einstein-Podolsky-Rosen steering, and Bell nonlocality in the bipartite scenario, and then present their multipartite generalization. We review recent progress on characterizing these quantum correlations from partial information on the system state, such as through data-driven methods or witnesses based on low-order moments of collective observables. We then review state-of-the-art experiments that demonstrate the preparation, manipulation and detection of highly entangled many-body systems. For each platform (e.g. atoms, ions, photons, superconducting circuits) we illustrate the available toolbox for state preparation and measurement, emphasizing the challenges that each system poses. To conclude, we present a list of timely open problems in the field.
3
Environmental susceptibility for all: A data-driven approach suggests individual differences in domain-general and domain-specific patterns of environmental susceptibility. Dev Psychopathol 2023:1-17. [PMID: 37466086 DOI: 10.1017/s0954579423000779] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/20/2023]
Abstract
How we are influenced by our environment is a fundamental question in developmental science. Theories and empirical research have claimed that some individuals are susceptible to environmental influences and others are much less susceptible. The present study addressed four questions: (1) Is environmental susceptibility a continuous or categorical construct? (2) Is environmental susceptibility unidimensional (i.e., domain general) or multidimensional (i.e., domain specific)? (3) Are there genetic contributions to individual differences in environmental susceptibility? (4) What are the temperamental characteristics of different environmental susceptibility patterns? We used child- and mother-report data from a sample of 11-year-old twins (N = 1,507) and applied a novel data-driven approach to assess an environmental susceptibility space, based on simultaneous associations between multiple environmental exposures (18 measures relating to parenting, parent, peer, and twin relationships) and developmental outcomes (10 measures relating to empathy, prosocial behavior, aggression, and self-esteem). The results suggest that the environmental susceptibility space we assessed is better conceptualized as continuous and multidimensional. Different children showed susceptibility to different contexts and variation in domain-general versus domain-specific patterns. A comparison of distances between monozygotic and dizygotic twins within the space demonstrated genetic contributions. Finally, susceptibility patterns could not be differentiated based on a specific temperament trait, but rather related to temperament profiles.
4
Real-Time Forecasting of Subsurface Inclusion Defects for Continuous Casting Slabs: A Data-Driven Comparative Study. SENSORS (BASEL, SWITZERLAND) 2023; 23:5415. [PMID: 37420581 DOI: 10.3390/s23125415] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/08/2023] [Revised: 05/30/2023] [Accepted: 06/06/2023] [Indexed: 07/09/2023]
Abstract
Subsurface inclusions are one of the most common defects that affect the inner quality of continuous casting slabs. They increase the defects in the final products, complicate the hot charge rolling process, and may even cause breakout accidents. The defects are, however, hard to detect online by traditional mechanism-model-based and physics-based methods. In the present paper, a comparative study is carried out based on data-driven methods, which are only sporadically discussed in the literature. As a further contribution, a scatter-regularized kernel discriminative least squares (SR-KDLS) model and a stacked defect-related autoencoder back propagation neural network (SDAE-BPNN) model are developed to improve the forecasting performance. SR-KDLS is designed as a coherent framework that directly provides forecasting information instead of low-dimensional embeddings. SDAE-BPNN extracts deep defect-related features layer by layer for higher feasibility and accuracy. The feasibility and efficiency of the data-driven methods are demonstrated through case studies based on a real-life continuous casting process, where the degree of imbalance varies drastically across categories, showing that the defects are forecasted promptly (within 0.01 ms) and accurately. Moreover, experiments illustrate the merits of the developed SR-KDLS and SDAE-BPNN methods regarding the computational burden; the F1 scores of the developed methods are clearly higher than those of common methods.
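The abstract above reports F1 scores on data whose class imbalance varies by category. As a rough illustration of why F1 is more informative than accuracy in that setting, the sketch below computes per-class F1 in plain Python. The function name `f1_per_class` and the toy labels (95 defect-free slabs, 5 defective) are assumptions made for illustration; they are not the paper's data or its SR-KDLS or SDAE-BPNN models.

```python
def f1_per_class(y_true, y_pred):
    """Return {label: F1} from parallel lists of true and predicted labels."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = {}
    for label in labels:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != label and p == label)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == label and p != label)
        denom = 2 * tp + fp + fn
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the class never appears.
        scores[label] = 2 * tp / denom if denom else 0.0
    return scores


# Hypothetical toy data: 0 = defect-free slab, 1 = slab with an inclusion defect.
y_true = [0] * 95 + [1] * 5
always_ok = [0] * 100  # a trivial forecaster that never flags a defect

accuracy = sum(t == p for t, p in zip(y_true, always_ok)) / len(y_true)
scores = f1_per_class(y_true, always_ok)
print(f"accuracy={accuracy:.2f}, F1(defect)={scores[1]:.2f}")
```

The trivial forecaster reaches high accuracy while its F1 for the defect class is zero, which is the kind of failure a per-class F1 comparison exposes.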
5
Deep Clinical Phenotyping of Schizophrenia Spectrum Disorders Using Data-Driven Methods: Marching towards Precision Psychiatry. J Pers Med 2023; 13:954. [PMID: 37373943 DOI: 10.3390/jpm13060954] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/23/2023] [Revised: 05/30/2023] [Accepted: 06/01/2023] [Indexed: 06/29/2023]
Abstract
Heterogeneity is the main challenge in the traditional classification of mental disorders, including schizophrenia spectrum disorders (SSD). This can be partly attributed to the absence of objective diagnostic criteria and the multidimensional nature of symptoms and their associated factors. This article provides an overview of findings from the Genetic Risk and Outcome of Psychosis (GROUP) cohort study on the deep clinical phenotyping of schizophrenia spectrum disorders targeting positive and negative symptoms, cognitive impairments and psychosocial functioning. Three to four latent subtypes of positive and negative symptoms were identified in patients, siblings and controls, whereas four to six latent cognitive subtypes were identified. Five latent subtypes of psychosocial function (multidimensional social inclusion and premorbid adjustment) were also identified in patients. We discovered that the identified subtypes had mixed profiles and exhibited stable, deteriorating, relapsing and ameliorating longitudinal courses over time. Baseline positive and negative symptoms, premorbid adjustment, psychotic-like experiences, health-related quality of life and PRSSCZ were found to be strong predictors of the identified subtypes. Our findings are comprehensive, novel and of clinical interest for precisely identifying high-risk population groups and patients with good or poor disease prognosis, and for selecting the optimal intervention, ultimately fostering precision psychiatry by tackling diagnostic and treatment selection challenges pertaining to heterogeneity.
6
Oscillatory ERK Signaling and Morphology Determine Heterogeneity of Breast Cancer Cell Chemotaxis via MEK-ERK and p38-MAPK Signaling Pathways. Bioengineering (Basel) 2023; 10:bioengineering10020269. [PMID: 36829763 PMCID: PMC9952091 DOI: 10.3390/bioengineering10020269] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/26/2022] [Revised: 01/24/2023] [Accepted: 02/12/2023] [Indexed: 02/22/2023]
Abstract
Chemotaxis, regulated by oscillatory signals, drives critical processes in cancer metastasis. Crucial chemoattractant molecules in breast cancer, CXCL12 and EGF, drive the activation of ERK and Akt. Regulated by feedback and crosstalk mechanisms, oscillatory signals in ERK and Akt control resultant changes in cell morphology and chemotaxis. While commonly studied at the population scale, metastasis arises from small numbers of cells that successfully disseminate, underscoring the need to analyze processes that cancer cells use to connect oscillatory signaling to chemotaxis at single-cell resolution. Furthermore, little is known about how to successfully target fast-migrating cells to block metastasis. We investigated to what extent oscillatory networks in single cells associate with heterogeneous chemotactic responses and how targeted inhibitors block signaling processes in chemotaxis. We integrated live, single-cell imaging with time-dependent data processing to discover oscillatory signal processes defining heterogeneous chemotactic responses. We identified that short ERK and Akt waves, regulated by MEK-ERK and p38-MAPK signaling pathways, determine the heterogeneous random migration of cancer cells. By comparison, long ERK waves and the morphological changes regulated by MEK-ERK signaling determine heterogeneous directed motion. This study indicates that treatments against chemotaxis must interrupt oscillatory signaling.
7
Deep learning for centre manifold reduction and stability analysis in nonlinear systems. PHILOSOPHICAL TRANSACTIONS. SERIES A, MATHEMATICAL, PHYSICAL, AND ENGINEERING SCIENCES 2022; 380:20210212. [PMID: 35719074 DOI: 10.1098/rsta.2021.0212] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 12/03/2021] [Accepted: 03/10/2022] [Indexed: 06/15/2023]
Abstract
Bifurcations cause large qualitative and quantitative changes in the dynamics of nonlinear systems with slowly varying parameters. These changes most often are due to modifications that occur in a low-dimensional subspace of the overall system dynamics. The key challenge is to determine what that low-dimensional subspace is, and construct a low-order model that governs the dynamics in that subspace. Centre manifold theory can provide a theoretical means to construct such low-order models for strongly nonlinear systems that undergo bifurcations. Performing a centre manifold analysis, however, is particularly challenging when the system dimensionality is high or impossible when an accurate model of the system is not available. This paper introduces a data-driven approach for identifying a reduced order model of the system based on centre manifold theory. The approach does not require a model of the full order system. Instead, a deep learning approach capable of identifying the centre manifold and the transformation to the centre space is created using measurements of the system dynamics from random perturbations. This approach unravels the characteristics of the system dynamics in the vicinity of bifurcations, providing critical information regarding the behaviour of the system. This article is part of the theme issue 'Data-driven prediction in dynamical systems'.
8
Facial expressions elicit multiplexed perceptions of emotion categories and dimensions. Curr Biol 2022; 32:200-209.e6. [PMID: 34767768 PMCID: PMC8751635 DOI: 10.1016/j.cub.2021.10.035] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 06/08/2021] [Revised: 09/07/2021] [Accepted: 10/14/2021] [Indexed: 11/22/2022]
Abstract
Human facial expressions are complex, multi-component signals that can communicate rich information about emotions [1-5], including specific categories, such as "anger," and broader dimensions, such as "negative valence, high arousal" [6-8]. An enduring question is how this complex signaling is achieved. Communication theory predicts that multi-component signals could transmit each type of emotion information (i.e., specific categories and broader dimensions) via the same or different facial signal components, with implications for elucidating the system and ontology of facial expression communication [9]. We addressed this question using a communication-systems-based method that agnostically generates facial expressions and uses the receiver's perceptions to model the specific facial signal components that represent emotion category and dimensional information to them [10-12]. First, we derived the facial expressions that elicit the perception of emotion categories (i.e., the six classic emotions [13] plus 19 complex emotions [3]) and dimensions (i.e., valence and arousal) separately, in 60 individual participants. Comparison of these facial signals showed that they share subsets of components, suggesting that specific latent signals jointly represent (i.e., multiplex) categorical and dimensional information. Further examination revealed these specific latent signals and the joint information they represent. Our results, based on white Western participants, same-ethnicity face stimuli, and commonly used English emotion terms, show that facial expressions can jointly represent specific emotion categories and broad dimensions to perceivers via multiplexed facial signal components. Our results provide insights into the ontology and system of facial expression communication and a new information-theoretic framework that can characterize its complexities.
9
Knowledge-based radiation treatment planning: A data-driven method survey. J Appl Clin Med Phys 2021; 22:16-44. [PMID: 34231970 PMCID: PMC8364264 DOI: 10.1002/acm2.13337] [Citation(s) in RCA: 26] [Impact Index Per Article: 8.7] [Received: 01/19/2021] [Revised: 04/26/2021] [Accepted: 06/02/2021] [Indexed: 12/18/2022]
Abstract
This paper surveys the data-driven dose prediction methods investigated for knowledge-based planning (KBP) in the last decade. These methods were classified into two major categories, traditional KBP methods and deep-learning (DL) methods, according to how they use previous knowledge. Traditional KBP methods include studies that require geometric or anatomical features either to find the best-matched case(s) from a repository of prior treatment plans or to build dose prediction models. DL methods include studies that train neural networks to make dose predictions. A comprehensive review of each category is presented, highlighting key features, methods, and their advancements over the years. We separated the cited works according to the framework and cancer site in each category. Finally, we briefly discuss the performance of both traditional KBP methods and DL methods, then discuss future trends for both families of data-driven KBP methods for dose prediction.
10
A Data-Driven Scheme for Fault Detection of Discrete-Time Switched Systems. SENSORS 2021; 21:s21124138. [PMID: 34208628 PMCID: PMC8235235 DOI: 10.3390/s21124138] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 05/11/2021] [Revised: 06/09/2021] [Accepted: 06/11/2021] [Indexed: 11/16/2022]
Abstract
This paper is concerned with the fault detection issue for a class of discrete-time switched systems via the data-driven approach. For the fault detection of switched systems, it is inevitable to consider the mode matching problem between the activated subsystem and the executed residual generator since the mode mismatching may cause a false fault alarm in all probability. Frequently, studies assume that the switching laws are available to the residual generator, by which the residual generator keeps the same mode as the system plant and then the mode mismatching is excluded. However, this assumption is conservative and impractical because many switching laws are hard to acquire in practical applications. This work focuses on the case of switched systems with unavailable switching laws. In view of the unavailability of switching information, the mode recognition is considered for the fault detection process and meanwhile, sufficient conditions are presented for the mode distinguishability. Moreover, a novel decision logic for the fault detection is proposed, based on which new algorithms are established for the data-driven realization. Finally, a benchmark case on a three-tank system is used to illustrate the feasibility and usefulness of the obtained results.
11
Dynamical landscape and multistability of a climate model. Proc Math Phys Eng Sci 2021; 477:20210019. [PMID: 35153562 PMCID: PMC8299554 DOI: 10.1098/rspa.2021.0019] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 01/08/2021] [Accepted: 05/04/2021] [Indexed: 12/15/2022]
Abstract
We apply two independent data analysis methodologies to locate stable climate states in an intermediate complexity climate model and analyse their interplay. First, drawing from the theory of quasi-potentials, and viewing the state space as an energy landscape with valleys and mountain ridges, we infer the relative likelihood of the identified multistable climate states and investigate the most likely transition trajectories as well as the expected transition times between them. Second, harnessing techniques from data science, and specifically manifold learning, we characterize the data landscape of the simulation output to find climate states and basin boundaries within a fully agnostic and unsupervised framework. Both approaches show remarkable agreement, and reveal, apart from the well-known warm and snowball earth states, a third intermediate stable state in one of the two versions of PLASIM, the climate model used in this study. The combination of our approaches allows us to identify how the negative feedback of ocean heat transport and entropy production via the hydrological cycle drastically change the topography of the dynamical landscape of Earth’s climate.
12
Polymer informatics with multi-task learning. PATTERNS (NEW YORK, N.Y.) 2021; 2:100238. [PMID: 33982028 PMCID: PMC8085610 DOI: 10.1016/j.patter.2021.100238] [Citation(s) in RCA: 25] [Impact Index Per Article: 8.3] [Received: 01/07/2021] [Revised: 02/12/2021] [Accepted: 03/16/2021] [Indexed: 12/25/2022]
Abstract
Modern data-driven tools are transforming application-specific polymer development cycles. Surrogate models that can be trained to predict properties of polymers are becoming commonplace. Nevertheless, these models do not utilize the full breadth of the knowledge available in datasets, which are oftentimes sparse; inherent correlations between different property datasets are disregarded. Here, we demonstrate the potency of multi-task learning approaches that exploit such inherent correlations effectively. Data pertaining to 36 different properties of over 13,000 polymers are supplied to deep-learning multi-task architectures. Compared to conventional single-task learning models, the multi-task approach is accurate, efficient, scalable, and amenable to transfer learning as more data on the same or different properties become available. Moreover, these models are interpretable. Chemical rules that explain how certain features control trends in property values emerge from the present work, paving the way for the rational design of application-specific polymers meeting desired property or performance objectives.
We overcome data scarcity in polymer datasets using multi-task models. Our approach is expected to become the preferred training method for materials data. We derive chemical guidelines for the design of application-specific polymers.
Polymers display extraordinary diversity in their chemistry, structure, and applications. However, finding the ideal polymer possessing the right combination of properties for a given application is non-trivial as the chemical space of polymers is practically infinite. This daunting search problem can be mitigated by surrogate models, trained using machine learning algorithms on available property data, that can make instantaneous predictions of polymer properties. In this work, we present a versatile, interpretable, and scalable scheme to build such predictive models. Our “multi-task learning” approach is used for the first time within materials informatics and efficiently, effectively, and simultaneously learns and predicts multiple polymer properties. This development is expected to have a significant impact on data-driven materials discovery.
13
Industrial Control under Non-Ideal Measurements: Data-Based Signal Processing as an Alternative to Controller Retuning. SENSORS 2021; 21:s21041237. [PMID: 33578649 PMCID: PMC7916400 DOI: 10.3390/s21041237] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 01/14/2021] [Revised: 02/01/2021] [Accepted: 02/06/2021] [Indexed: 11/23/2022]
Abstract
Industrial environments are characterised by the non-linear and highly complex processes they perform. Different control strategies are considered to ensure that these processes are correctly performed. Nevertheless, these strategies are sensitive to noise-corrupted and delayed measurements. For that reason, denoising techniques and delay correction methodologies should be considered, but most of these techniques require a complex design and optimisation process that depends on the scenario where they are applied. To alleviate this, a complete data-based approach devoted to denoising and correcting the delay of measurements is proposed here with a two-fold objective: to simplify the solution design process and to decouple it from the considered control strategy as well as from the scenario. Here the scenario is a Wastewater Treatment Plant (WWTP). However, the proposed solution can be adopted in any industrial environment, since neither an optimisation nor a design focused on the scenario is required, only pairs of input and output data. Results show that a minimum Root Mean Squared Error (RMSE) improvement of 63.87% is achieved when the proposed data-based denoising approach is considered. In addition, the whole-system performance shows that similar and even better results are obtained when compared to scenario-optimised methodologies.
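The RMSE improvement quoted above is a relative figure. A minimal sketch of one common way to compute such a percentage is given below; the abstract does not state the exact formula, so the definition used here (reduction in RMSE relative to the noisy measurement) is an assumption, and the function names and toy signals are hypothetical.

```python
import math


def rmse(reference, estimate):
    """Root mean squared error between two equal-length sequences."""
    if len(reference) != len(estimate):
        raise ValueError("sequences must have the same length")
    squared = [(r - e) ** 2 for r, e in zip(reference, estimate)]
    return math.sqrt(sum(squared) / len(squared))


def rmse_improvement_pct(reference, noisy, denoised):
    """Percentage reduction in RMSE after denoising (assumed definition)."""
    before = rmse(reference, noisy)
    after = rmse(reference, denoised)
    return 100.0 * (before - after) / before


# Hypothetical toy signals, not WWTP data.
reference = [1.0, 2.0, 3.0, 4.0]
noisy = [1.4, 1.6, 3.4, 3.6]      # every sample is off by 0.4
denoised = [1.1, 1.9, 3.1, 3.9]   # every sample is off by 0.1

improvement = rmse_improvement_pct(reference, noisy, denoised)
print(f"RMSE improvement: {improvement:.2f}%")
```

In this toy case the RMSE falls from about 0.4 to about 0.1, so the improvement is about 75%.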
14
Deep learning in photoacoustic tomography: current approaches and future directions. JOURNAL OF BIOMEDICAL OPTICS 2020; 25:112903. [PMCID: PMC7593654 DOI: 10.1117/1.jbo.25.11.112903] [Citation(s) in RCA: 49] [Impact Index Per Article: 12.3] [Received: 06/30/2020] [Accepted: 09/24/2020] [Indexed: 05/18/2023]
Abstract
Biomedical photoacoustic tomography, which can provide high-resolution 3D soft tissue images based on optical absorption, has advanced to the stage at which translation from the laboratory to clinical settings is becoming possible. The need for rapid image formation and the practical restrictions on data acquisition that arise from the constraints of a clinical workflow are presenting new image reconstruction challenges. There are many classical approaches to image reconstruction, but ameliorating the effects of incomplete or imperfect data through the incorporation of accurate priors is challenging and leads to slow algorithms. Recently, the application of deep learning (DL), or deep neural networks, to this problem has received a great deal of attention. We review the literature on learned image reconstruction, summarizing the current trends and explaining how these approaches fit within, and to some extent have arisen from, a framework that encompasses classical reconstruction methods. In particular, the review shows how these techniques can be understood from a Bayesian perspective, providing useful insights. We also provide a concise tutorial demonstration of three prototypical approaches to learned image reconstruction. The code and data sets for these demonstrations are available to researchers. It is anticipated that it is in in vivo applications, where data may be sparse, fast imaging critical, and priors difficult to construct by hand, that DL will have the most impact. With this in mind, we conclude with some indications of possible future research directions.
15
Abstract
Real-world studies show that the facial expressions produced during pain and orgasm, two different and intense affective experiences, are virtually indistinguishable. However, this finding is counterintuitive, because facial expressions are widely considered to be a powerful tool for social interaction. Consequently, debate continues as to whether the facial expressions of these extreme positive and negative affective states serve a communicative function. Here, we address this debate from a novel angle by modeling the mental representations of dynamic facial expressions of pain and orgasm in 40 observers in each of two cultures (Western, East Asian) using a data-driven method. Using a complementary approach of machine learning, an information-theoretic analysis, and a human perceptual discrimination task, we show that mental representations of pain and orgasm are physically and perceptually distinct in each culture. Cross-cultural comparisons also revealed that pain is represented by similar face movements across cultures, whereas orgasm showed distinct cultural accents. Together, our data show that mental representations of the facial expressions of pain and orgasm are distinct, which questions their nondiagnosticity and instead suggests they could be used for communicative purposes. Our results also highlight the potential role of cultural and perceptual factors in shaping the mental representation of these facial expressions. We discuss new research directions to further explore their relationship to the production of facial expressions.
16
Data-Driven Methods to Diversify Knowledge of Human Psychology. Trends Cogn Sci 2017; 22:1-5. [PMID: 29126772 DOI: 10.1016/j.tics.2017.10.002] [Citation(s) in RCA: 32] [Impact Index Per Article: 4.6] [Received: 07/17/2017] [Revised: 10/07/2017] [Accepted: 10/09/2017] [Indexed: 01/14/2023]
Abstract
Psychology aims to understand real human behavior. However, cultural biases in the scientific process can constrain knowledge. We describe here how data-driven methods can relax these constraints to reveal new insights that theories can overlook. To advance knowledge we advocate a symbiotic approach that better combines data-driven methods with theory.
17
Abstract
Introduction: Asthma is no longer thought of as a single disease, but rather a collection of varying symptoms expressing different disease patterns. One of the ongoing challenges is understanding the underlying pathophysiological mechanisms that may be responsible for the varying responses to treatment. Areas covered: This review provides an overview of our current understanding of the asthma phenotype concept in childhood and describes key findings from both conventional and data-driven methods. Expert commentary: With the vast amounts of data generated from cohorts, there is hope that we can elucidate distinct pathophysiological mechanisms, or endotypes. In turn, this would lead to better patient stratification and disease management, thereby providing true personalised medicine.
18
Learning Data-Driven Patient Risk Stratification Models for Clostridium difficile. Open Forum Infect Dis 2014; 1:ofu045. [PMID: 25734117 PMCID: PMC4281796 DOI: 10.1093/ofid/ofu045] [Citation(s) in RCA: 35] [Impact Index Per Article: 3.5] [Received: 04/16/2014] [Accepted: 06/05/2014] [Indexed: 12/21/2022]
Abstract
BACKGROUND: Although many risk factors are well known, Clostridium difficile infection (CDI) continues to be a significant problem throughout the world. The purpose of this study was to develop and validate a data-driven, hospital-specific risk stratification procedure for estimating the probability that an inpatient will test positive for C difficile. METHODS: We consider electronic medical record (EMR) data from patients admitted for ≥24 hours to a large urban hospital in the U.S. between April 2011 and April 2013. Predictive models were constructed using L2-regularized logistic regression and data from the first year. The number of observational variables considered varied from a small set of well-known risk factors readily available to a physician to over 10 000 variables automatically extracted from the EMR. Each model was evaluated on holdout admission data from the following year. A total of 34 846 admissions with 372 cases of CDI was used to train the model. RESULTS: Applied to the separate validation set of 34 722 admissions with 355 cases of CDI, the model that made use of the additional EMR data yielded an area under the receiver operating characteristic curve (AUROC) of 0.81 (95% confidence interval [CI], .79-.83), and it significantly outperformed the model that considered only the small set of known clinical risk factors, AUROC of 0.71 (95% CI, .69-.75). CONCLUSIONS: Automated risk stratification of patients based on the contents of their EMRs can be used to accurately identify a high-risk population of patients. The proposed method holds promise for enabling the selective allocation of interventions aimed at reducing the rate of CDI.
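The workflow described in this abstract (fit an L2-regularized logistic regression on one set of admissions, then score a separate holdout set by AUROC) can be sketched in plain Python. Everything below is an illustrative assumption rather than the authors' implementation: the synthetic two-feature data from `make_admissions`, the batch gradient-descent fit, and the hyperparameters `lam`, `lr` and `epochs` are all hypothetical. The study split its data by year, whereas this sketch simply draws two independent synthetic samples.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def fit_l2_logistic(X, y, lam=0.1, lr=0.5, epochs=500):
    """Batch gradient descent on mean log-loss + (lam/2)*||w||^2 (bias unpenalized)."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * (gw / n + lam * wj) for wj, gw in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict_proba(X, w, b):
    return [sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) for xi in X]


def auroc(y_true, scores):
    """Probability that a random positive outranks a random negative (ties count 0.5)."""
    pos = [s for s, t in zip(scores, y_true) if t == 1]
    neg = [s for s, t in zip(scores, y_true) if t == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def make_admissions(n, rng):
    """Synthetic stand-in for admissions: one informative feature, one noise feature."""
    X, y = [], []
    for _ in range(n):
        label = 1 if rng.random() < 0.2 else 0
        X.append([rng.gauss(1.5 * label, 1.0), rng.gauss(0.0, 1.0)])
        y.append(label)
    return X, y


rng = random.Random(0)
X_train, y_train = make_admissions(400, rng)   # plays the role of the training year
X_test, y_test = make_admissions(400, rng)     # plays the role of the holdout year
w, b = fit_l2_logistic(X_train, y_train)
test_auc = auroc(y_test, predict_proba(X_test, w, b))
print(f"holdout AUROC: {test_auc:.2f}")
```

The pairwise AUROC used here is quadratic in the number of cases, which is fine for a sketch but would be replaced by a rank-based computation at the scale of tens of thousands of admissions.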