1
|
A practical treatment of sensitivity analyses in activity level evaluations. Forensic Sci Int 2024; 355:111944. [PMID: 38277913 DOI: 10.1016/j.forsciint.2024.111944] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/08/2023] [Revised: 01/09/2024] [Accepted: 01/15/2024] [Indexed: 01/28/2024]
Abstract
Evaluations of forensic observations considering activity level propositions are becoming more commonplace in forensic institutions. A measure that can be taken to interrogate the evaluation for robustness is called sensitivity analysis. A sensitivity analysis explores the sensitivity of the evaluation to the data used when assigning probabilities, or to the level of uncertainty surrounding a probability assignment, or to the choice of various assumptions within the model. There have been a number of publications that describe sensitivity analysis in technical terms and demonstrate its use, but there is limited literature on how that theory can be applied in practice. In this work we provide some simplified examples of how sensitivity analyses can be carried out, when they are likely to show that the evaluation is sensitive to underlying data, knowledge or assumptions, how to interpret the results of a sensitivity analysis, and how the outcome can be reported. We also provide access to an application to conduct sensitivity analysis.
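As a rough illustration of the idea in this abstract (not the authors' application), the sketch below runs a one-way sensitivity analysis on a toy likelihood ratio. The model, the names `likelihood_ratio`, `t` (transfer probability) and `b` (background probability), and all numeric values are assumptions made for the example only.

```python
# Hypothetical toy model: the finding E is "matching DNA recovered".
# Under Hp the DNA may come from transfer (probability t) or, failing that,
# from background (probability b). Under Hd only background can explain E.

def likelihood_ratio(t: float, b: float) -> float:
    """Return P(E | Hp) / P(E | Hd) for the toy model described above."""
    p_e_given_hp = t + (1.0 - t) * b
    p_e_given_hd = b
    return p_e_given_hp / p_e_given_hd

# One-way sensitivity analysis: hold b fixed and vary t over a plausible range.
b = 0.05                       # assumed background probability
t_grid = [0.2, 0.4, 0.6, 0.8]  # assumed range for the transfer probability
sweep = [(t, likelihood_ratio(t, b)) for t in t_grid]

lr_min = min(lr for _, lr in sweep)
lr_max = max(lr for _, lr in sweep)

for t, lr in sweep:
    print(f"t={t:.1f}  LR={lr:.2f}")
print(f"LR range over the sweep: {lr_min:.2f} to {lr_max:.2f}")
```

If the reported LR range stays within one verbal category, the evaluation could be described as insensitive to `t` over that range; a wide range suggests the probability assignment deserves more data or more careful reporting.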
2
|
Predicting lung cancer survival prognosis based on the conditional survival bayesian network. BMC Med Res Methodol 2024; 24:16. [PMID: 38254038 PMCID: PMC10801949 DOI: 10.1186/s12874-023-02043-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/05/2022] [Accepted: 09/25/2023] [Indexed: 01/24/2024] Open
Abstract
Lung cancer is a leading cause of cancer deaths and imposes an enormous economic burden on patients. It is important to develop an accurate risk assessment model to determine the appropriate treatment for patients after an initial lung cancer diagnosis. The Cox proportional hazards model is mainly employed in survival analysis. However, real-world medical data are usually incomplete, posing a great challenge to the application of this model. Commonly used imputation methods cannot achieve sufficient accuracy when data are missing, so we investigated novel methods for the development of clinical prediction models. In this article, we present a novel model for survival prediction in missing scenarios. We collected data from 5,240 patients diagnosed with lung cancer at the Weihai Municipal Hospital, China. Then, we applied a joint model that combined a BN and a Cox model to predict mortality risk in individual patients with lung cancer. The established prognostic model achieved good predictive performance in discrimination and calibration. We showed that combining the BN with the Cox proportional hazards model is highly beneficial and provides a more efficient tool for risk prediction.
3
|
Spectral Bayesian network theory. LINEAR ALGEBRA AND ITS APPLICATIONS 2023; 674:282-303. [PMID: 37520305 PMCID: PMC10373448 DOI: 10.1016/j.laa.2023.06.003] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/01/2023]
Abstract
A Bayesian Network (BN) is a probabilistic model that represents a set of variables using a directed acyclic graph (DAG). Current algorithms for learning BN structures from data focus on estimating the edges of a specific DAG, and often lead to many 'likely' network structures. In this paper, we lay the groundwork for an approach that focuses on learning global properties of the DAG rather than exact edges. This is done by defining the structural hypergraph of a BN, which is shown to be related to the inverse-covariance matrix of the network. Spectral bounds are derived for the normalized inverse-covariance matrix, which are shown to be closely related to the maximum indegree of the associated BN.
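For readers unfamiliar with the definition used in this abstract, the standard factorization that a BN over variables $X_1, \dots, X_n$ encodes can be written as follows. The symbols $\mathrm{pa}(X_i)$ (the parents of $X_i$ in the DAG) and $d_{\max}$ are notation assumed here, not taken from the paper.

```latex
% Joint distribution implied by a DAG: each variable depends only on its parents.
P(X_1, \dots, X_n) \;=\; \prod_{i=1}^{n} P\bigl(X_i \mid \mathrm{pa}(X_i)\bigr)

% Maximum indegree of the DAG, the global property the abstract refers to.
d_{\max} \;=\; \max_{1 \le i \le n} \bigl|\mathrm{pa}(X_i)\bigr|
```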
4
|
Understanding multimorbidity requires sign-disease networks and higher-order interactions, a perspective. FRONTIERS IN SYSTEMS BIOLOGY 2023; 3:1155599. [PMID: 37810371 PMCID: PMC10557993 DOI: 10.3389/fsysb.2023.1155599] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/10/2023]
Abstract
Background: Count scores, disease clustering, and pairwise associations between diseases remain ubiquitous in multimorbidity research despite two major shortcomings: they yield no insight into plausible mechanisms underlying multimorbidity, and they ignore higher-order interactions such as effect modification. Objectives: We argue that two components are currently missing but vital to develop novel multimorbidity metrics. Firstly, networks should be constructed that consist simultaneously of signs, symptoms, and diseases, since only then can they yield insight into plausible shared biological mechanisms underlying diseases. Secondly, learning pairwise associations is insufficient to fully characterize the correlations in a system. That is, synergistic (e.g., cooperative or antagonistic) effects are widespread in complex systems, where two or more elements combined give a larger or smaller effect than the sum of their individual effects. It can even occur that pairs of symptoms have no pairwise associations whatsoever, but in combination have a significant association. Therefore, higher-order interactions should be included in networks used to study multimorbidity, resulting in so-called hypergraphs. Methods: We illustrate our argument using a synthetic Bayesian Network model of symptoms, signs and diseases, composed of pairwise and higher-order interactions. We simulate network interventions on both individual and population levels and compare the ground-truth outcomes with the predictions from pairwise associations. Conclusion: We find that, when judged purely from the pairwise associations, interventions can have unexpected 'side-effects' or the most opportune intervention could be missed. The hypergraph uncovers links missed in pairwise networks, giving a more complete overview of sign and disease associations.
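The claim that variables can show no pairwise association yet be strongly associated in combination can be checked on a minimal synthetic case. The sketch below is an assumption-laden toy (two independent binary "symptoms" A and B and a "disease" D defined as A XOR B), not the synthetic network used in the paper.

```python
from itertools import product

# Exact joint distribution over (A, B, D): A and B are independent fair coins,
# and D is fully determined by the pair as D = A XOR B.
joint = {(a, b, a ^ b): 0.25 for a, b in product((0, 1), repeat=2)}

def prob(pred):
    """Total probability of all outcomes (a, b, d) satisfying the predicate."""
    return sum(p for outcome, p in joint.items() if pred(outcome))

def covariance(i, j):
    """Covariance between the binary variables at positions i and j."""
    e_i = prob(lambda o: o[i] == 1)
    e_j = prob(lambda o: o[j] == 1)
    e_ij = prob(lambda o: o[i] == 1 and o[j] == 1)
    return e_ij - e_i * e_j

cov_a_d = covariance(0, 2)  # pairwise association between A and D
cov_b_d = covariance(1, 2)  # pairwise association between B and D

# Higher-order view: conditioning on both A and B determines D completely.
p_d_given_a1_b0 = prob(lambda o: o == (1, 0, 1)) / prob(
    lambda o: o[0] == 1 and o[1] == 0
)

print("cov(A, D) =", cov_a_d)
print("cov(B, D) =", cov_b_d)
print("P(D=1 | A=1, B=0) =", p_d_given_a1_b0)
```

A network built only from the two covariances would contain no edge into D, while a hyperedge over {A, B, D} captures the dependence.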
5
|
Learning EKG Diagnostic Models with Hierarchical Class Label Dependencies. ARTIFICIAL INTELLIGENCE IN MEDICINE. CONFERENCE ON ARTIFICIAL INTELLIGENCE IN MEDICINE (2005- ) 2023; 13897:260-270. [PMID: 37303465 PMCID: PMC10256236 DOI: 10.1007/978-3-031-34344-5_31] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/13/2023]
Abstract
The electrocardiogram (EKG/ECG) is a key diagnostic tool to assess a patient's cardiac condition and is widely used in clinical applications such as patient monitoring, surgery support, and heart medicine research. With recent advances in machine learning (ML) technology there has been a growing interest in the development of models supporting automatic EKG interpretation and diagnosis based on past EKG data. The problem can be modeled as multi-label classification (MLC), where the objective is to learn a function that maps each EKG reading to a vector of diagnostic class labels reflecting the underlying patient condition at different levels of abstraction. In this paper, we propose and investigate an ML model that considers the class-label dependency embedded in the hierarchical organization of EKG diagnoses to improve EKG classification performance. Our model first transforms the EKG signals into a low-dimensional vector, and then uses the vector to predict different class labels with the help of a conditional tree-structured Bayesian network (CTBN) that is able to capture hierarchical dependencies among class variables. We evaluate our model on the publicly available PTB-XL dataset. Our experiments demonstrate that modeling hierarchical dependencies among class variables improves diagnostic model performance under multiple classification performance metrics compared to classification models that predict each class label independently.
6
|
A Bayesian Network Approach to Explainable Reinforcement Learning with Distal Information. SENSORS (BASEL, SWITZERLAND) 2023; 23:2013. [PMID: 36850617 PMCID: PMC9961455 DOI: 10.3390/s23042013] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/19/2022] [Revised: 01/27/2023] [Accepted: 02/07/2023] [Indexed: 06/18/2023]
Abstract
Nowadays, Artificial Intelligence systems have expanded their field of competence from research to industry and daily life, so understanding how they make decisions is becoming fundamental to reducing the lack of trust between users and machines and increasing the transparency of the model. This paper aims to automate the generation of explanations for model-free Reinforcement Learning algorithms by answering "why" and "why not" questions. To this end, we use Bayesian Networks in combination with the NOTEARS algorithm for automatic structure learning. This approach complements an existing framework very well and thus represents a step towards generating explanations with as little user input as possible. The approach is computationally evaluated in three benchmarks using different Reinforcement Learning methods to highlight that it is independent of the type of model used, and the explanations are then rated through a human study. The results obtained are compared to other baseline explanation models to underline the satisfying performance of the framework presented in terms of increasing the understanding, transparency and trust in the action chosen by the agent.
7
|
Comparison of genotyping and weight of evidence results when applying different genotyping strategies on samples from a DNA transfer experiment. Int J Legal Med 2023; 137:47-56. [PMID: 36416964 DOI: 10.1007/s00414-022-02918-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/18/2022] [Accepted: 11/15/2022] [Indexed: 11/24/2022]
Abstract
In this study, we assessed to what extent data on the subject of TPPR (transfer, persistence, prevalence, recovery) that are obtained through an older STR typing kit can be used in an activity-level evaluation for a case profiled with a more modern STR kit. Newer kits generally hold more loci and may show higher sensitivity especially when reduced reaction volumes are used, and this could increase the evidential value at the source level. On the other hand, the increased genotyping information may invoke a higher number of contributors in the weight of evidence calculations, which could affect the evidential values as well. An activity scenario well explored in earlier studies [1,2] was redone using volunteers with known DNA profiles. DNA extracts were analyzed with three different approaches, namely using the optimal DNA input for (1) an older and (2) a newer STR typing system, and (3) using a standard, volume-based input combined with replicate PCR analysis with only the newer STR kit. The genotyping results were analyzed for various aspects such as percentage detected alleles and relative peak height contribution for background and the contributors known to be involved in the activity. Next, source-level LRs were calculated and the same trends were observed with regard to inclusionary and exclusionary LRs for persons who had or had not been in direct contact with the sampled areas. We subsequently assessed the impact on the outcome of the activity-level evaluation in an exemplary case by applying the assigned probabilities to a Bayesian network. We infer that data from different STR kits can be combined in the activity-level evaluations.
8
|
Integrated regulatory and metabolic networks of the tumor microenvironment for therapeutic target prioritization. Stat Appl Genet Mol Biol 2023; 22:sagmb-2022-0054. [PMID: 37988745 DOI: 10.1515/sagmb-2022-0054] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/05/2022] [Accepted: 09/28/2023] [Indexed: 11/23/2023]
Abstract
Translation of genomic discovery, such as single-cell sequencing data, to clinical decisions remains a longstanding bottleneck in the field. Meanwhile, computational systems biological models, such as cellular metabolism models and cell signaling pathways, have emerged as powerful approaches to provide efficient predictions in metabolites and gene expression levels, respectively. However, there has been limited research on the integration between these two models. This work develops a methodology for integrating computational models of probabilistic gene regulatory networks with a constraint-based metabolism model. By using probabilistic reasoning with Bayesian Networks, we aim to predict cell-specific changes under different interventions, which are embedded into the constraint-based models of metabolism. Applications to single-cell sequencing data of glioblastoma brain tumors generate predictions about the effects of pharmaceutical interventions on the regulatory network and downstream metabolisms in different cell types from the tumor microenvironment. The model presents possible insights into treatments that could potentially suppress anaerobic metabolism in malignant cells with minimal impact on other cell types' metabolism. The proposed integrated model can guide therapeutic target prioritization, the formulation of combination therapies, and future drug discovery. This model integration framework is also generalizable to other applications, such as different cell types, organisms, and diseases.
9
|
An Integrated Intelligent System for Breast Cancer Detection at Early Stages Using IR Images and Machine Learning Methods with Explainability. SN COMPUTER SCIENCE 2023; 4:184. [PMID: 36742416 PMCID: PMC9888345 DOI: 10.1007/s42979-022-01536-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/27/2022] [Accepted: 11/30/2022] [Indexed: 02/01/2023]
Abstract
Breast cancer is the second most common cause of death among women. An early diagnosis is vital for reducing the fatality rate in the fight against breast cancer. Thermography could be suggested as a safe, non-invasive, non-contact supplementary method to diagnose breast cancer and can be the most promising method for breast self-examination as envisioned by the World Health Organization (WHO). Moreover, thermography could be combined with artificial intelligence and automated diagnostic methods towards a diagnosis with a negligible number of false positive or false negative results. In the current study, a novel intelligent integrated diagnosis system is proposed using IR thermal images with Convolutional Neural Networks and Bayesian Networks to achieve good diagnostic accuracy from a relatively small dataset of images and data. We compare transfer learning models such as ResNet50 with the proposed combination of BNs and artificial neural network methods such as CNNs, which provides a state-of-the-art expert system with explainability. The novelties of our methodology include: (i) the construction of a diagnostic tool with high accuracy from a small number of training images; (ii) the features extracted from the images are found to be appropriate ones, leading to very good diagnosis; (iii) our expert model exhibits interpretability, i.e., a physician can understand which factors/features play critical roles in the diagnosis. Across the most successful models among the four implemented approaches, accuracy ranged from approximately 91% to 93%, precision from 91% to 95%, sensitivity from 91% to 92%, and specificity from 91% to 97%. In conclusion, we have achieved accurate diagnosis with understandability with the novel integrated approach.
10
|
Adoption of a Data-Driven Bayesian Belief Network Investigating Organizational Factors that Influence Patient Safety. RISK ANALYSIS : AN OFFICIAL PUBLICATION OF THE SOCIETY FOR RISK ANALYSIS 2022; 42:1277-1293. [PMID: 33070320 PMCID: PMC9291329 DOI: 10.1111/risa.13610] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 08/24/2020] [Revised: 09/26/2020] [Accepted: 09/30/2020] [Indexed: 06/01/2023]
Abstract
Medical errors pose high risks to patients. Several organizational factors may impact the high rate of medical errors in complex and dynamic healthcare systems. However, limited research is available regarding probabilistic interdependencies between the organizational factors and patient safety errors. To explore this, we adopt a data-driven Bayesian Belief Network (BBN) model to represent a class of probabilistic models, using the hospital-level aggregate survey data from U.K. hospitals. Leveraging the use of probabilistic dependence models and visual features in the BBN model, the results shed new light on relationships existing among eight organizational factors and patient safety errors. With the high prediction capability, the data-driven approach results suggest that "health and well-being" and "bullying and harassment in the work environment" are the two leading factors influencing the number of reported errors and near misses affecting patient safety. This study provides significant insights to understand organizational factors' role and their relative importance in supporting decision-making and safety improvements.
11
|
Exploiting the Capabilities of Bayesian Networks for Engineering Risk Assessment: Causal Reasoning through Interventions. RISK ANALYSIS : AN OFFICIAL PUBLICATION OF THE SOCIETY FOR RISK ANALYSIS 2022; 42:1306-1324. [PMID: 33687077 PMCID: PMC9290605 DOI: 10.1111/risa.13711] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/12/2023]
Abstract
In the last decade, Bayesian networks (BNs) have been widely used in engineering risk assessment due to the benefits that they provide over other methods. Among these, the most significant is the ability to model systems, causal factors, and their dependencies in a probabilistic manner. This capability has enabled the community to do causal reasoning through associations, which answers questions such as: "How does new evidence $x'$ about the occurrence of event $X$ change my belief about the occurrence of event $Y$?" Associative reasoning has helped risk analysts to identify relevant risk-contributing factors and perform scenario analysis by evidence propagation. However, engineering risk assessment has yet to explore other features of BNs, such as the ability to reason through interventions, which enables the BN model to support answering questions of the form "How does doing $X=x'$ change my belief about the occurrence of event $Y$?" In this article, we propose to expand the scope of use of BN models in engineering risk assessment to support intervention reasoning. This will provide more robust risk-informed decision support by enabling the modeling of policies and actions before being implemented. To do this, we provide the formal mathematical background and tools to model interventions in BNs and propose a framework that enables its use in engineering risk assessment. This is demonstrated in an illustrative case study on third-party damage of natural gas pipelines, showing how BNs can be used to inform decision-makers about the effect that new actions/policies can have on a system.
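The difference between the two questions quoted in this abstract (observing $X=x'$ versus doing $X=x'$) can be seen in a three-node toy network. The sketch below is not the paper's pipeline case study: the structure Z -> X, Z -> Y, X -> Y, the variable names, and every probability are assumptions chosen for illustration.

```python
# Hypothetical conditional probability tables for binary variables.
P_Z1 = 0.3                                # P(Z=1), Z is a common cause of X and Y
P_X1_GIVEN_Z = {0: 0.2, 1: 0.9}           # P(X=1 | Z=z)
P_Y1_GIVEN_XZ = {(0, 0): 0.1, (1, 0): 0.3,
                 (0, 1): 0.5, (1, 1): 0.8}  # P(Y=1 | X=x, Z=z)

def p_z(z: int) -> float:
    return P_Z1 if z == 1 else 1.0 - P_Z1

def p_y1_given_x1_observed() -> float:
    """Associative query P(Y=1 | X=1): condition on seeing X=1."""
    joint_y1_x1 = sum(p_z(z) * P_X1_GIVEN_Z[z] * P_Y1_GIVEN_XZ[(1, z)] for z in (0, 1))
    marginal_x1 = sum(p_z(z) * P_X1_GIVEN_Z[z] for z in (0, 1))
    return joint_y1_x1 / marginal_x1

def p_y1_given_do_x1() -> float:
    """Interventional query P(Y=1 | do(X=1)): cut the edge Z -> X and set X=1."""
    return sum(p_z(z) * P_Y1_GIVEN_XZ[(1, z)] for z in (0, 1))

p_obs = p_y1_given_x1_observed()
p_do = p_y1_given_do_x1()
print(f"P(Y=1 | X=1)     = {p_obs:.4f}")
print(f"P(Y=1 | do(X=1)) = {p_do:.4f}")
```

In this toy setup the observational probability exceeds the interventional one because seeing X=1 also raises the belief in Z=1, whereas forcing X=1 leaves the distribution of Z unchanged.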
12
|
What are the leading causes of fatal and severe injury crashes involving older pedestrian? Evidence from Bayesian network model. JOURNAL OF SAFETY RESEARCH 2022; 80:281-292. [PMID: 35249608 DOI: 10.1016/j.jsr.2021.12.011] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 09/07/2020] [Revised: 06/16/2021] [Accepted: 12/13/2021] [Indexed: 06/14/2023]
Abstract
INTRODUCTION Identifying factors contributing to the risk of older pedestrian fatal/severe injuries, along with their possible interdependency, is the first step towards improving safety. Several previous studies focused on identifying the influence of individual factors while ignoring their interdependencies. This study investigated the leading risk factors associated with older pedestrian fatalities/severe injuries by identifying the interdependency relationship among variables. METHOD A Bayesian Logistic Regression (BLR) model was developed to identify significant factors influencing pedestrian fatalities and severe injuries, followed by a Bayesian Network (BN) model to reveal the interdependency relationship among the statistically significant variables and crash severity. Furthermore, the probabilistic inference was conducted to identify the leading cause of fatal and severe injuries involving older pedestrians. The models were developed with data from 913 pedestrian crashes involving older pedestrians at signalized intersections in Florida from 2016 through 2018. RESULTS Vehicle maneuver, lighting condition, road type, and shoulder type were directly associated with older pedestrian fatality/severe injury. Vehicle maneuver (going straight ahead) was the most significant factor in influencing the severity of crashes involving older pedestrians. The interdependency of vehicle moving straight, nighttime condition, and two-way divided roadway with curbed shoulders was associated with the highest likelihood of fatal and severe-injury crashes involving older pedestrians. CONCLUSIONS The Bayesian Network revealed the interdependency between variables associated with fatal and severe injury-crashes involving older pedestrians. The interdependency relationship with the highest likelihood to cause fatalities/severe-injuries comprised factors with the significant individual contribution to the severity of crashes involving older pedestrians. 
Practical applications: The interdependencies among variables identified in this research could help devise targeted engineering, education, and enforcement strategies that could potentially have a greater effect on improving the safety of older pedestrians.
13
|
Regulatory Relationships of Demographic, Clinical Characteristics and Quality of Care for Heart Failure Patients in Southern China. Int J Qual Health Care 2021; 34:6468985. [PMID: 34919681 DOI: 10.1093/intqhc/mzab159] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/19/2021] [Revised: 10/21/2021] [Accepted: 12/17/2021] [Indexed: 11/13/2022] Open
Abstract
BACKGROUND Quality of care for Chinese patients with heart failure was substandard. It is of utmost value to ascertain the characteristics related to quality of care to narrow the gap. METHODS Data from 2,064 heart failure patients between 1 January 2012 and 31 December 2015 at a hospital in Fujian Province were analyzed. A Bayesian network was used to assess the regulatory relationships between demographic and clinical characteristics and compliance with quality indicators. RESULTS Compliance with quality indicators ranged from 42.5% to 90.2%. Compliance with recommended doses reached or approached 100% for all medications except indapamide. In the Bayesian network, place of residence, hypertension, troponin, B-type natriuretic peptide, heart rate, lung disease, number of emergency treatments, and ejection fraction directly regulated compliance, while gender, age, medical payment method, myocardiopathy, coronary heart disease, and arrhythmia had indirect effects. Lower compliance was found in patients under emergency treatment, patients with abnormal testing indicators, patients without specific comorbidities, and patients covered by NRCMS or self-paying. Patients with lung disease and those who lived in urban areas had a longer length of stay. CONCLUSIONS Compliance with medication indicators for heart failure was suboptimal, but recommended doses were prescribed in patients who received medications. A series of strategies should be developed to improve the quality of care, such as expanding the scope and depth of knowledge of guidelines and clinical pathways, integrating a reminder and quality assessment model into the hospital medical record information system, paying more attention to vulnerable populations, and improving the medical security system.
14
|
Fusion-Learning of Bayesian Network Models for Fault Diagnostics. SENSORS 2021; 21:s21227633. [PMID: 34833709 PMCID: PMC8622961 DOI: 10.3390/s21227633] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 10/05/2021] [Revised: 11/13/2021] [Accepted: 11/14/2021] [Indexed: 11/21/2022]
Abstract
Bayesian Network (BN) models are being successfully applied to improve fault diagnosis, which in turn can improve equipment uptime and customer service. Most of these BN models are essentially trained using quantitative data obtained from sensors. However, sensors may not be able to cover all faults and therefore such BN models would be incomplete. Furthermore, many systems have maintenance logs that can serve as qualitative data, potentially containing historic causation information in unstructured natural language replete with technical terms. The motivation of this paper is to leverage all of the data available to improve BN learning. Specifically, we propose a method for fusion-learning of BNs: one BN is learned from quantitative data (sensor and metrology data), another from qualitative data (maintenance logs and corrective and preventive action reports), and the two BNs are then fused. Furthermore, we propose a human-in-the-loop approach for expert knowledge elicitation of the BN structure, in which experts are aided by logged natural language data instead of relying exclusively on their anecdotal memory. The resulting fused BN model can be expected to provide improved diagnostics as it has a wider fault coverage than the individual BNs. We demonstrate the efficacy of our proposed method using real-world data from uninterruptible power supply (UPS) fault diagnostics.
15
|
RFU derived LRs for activity level assignments using Bayesian Networks. Forensic Sci Int Genet 2021; 56:102608. [PMID: 34735938 DOI: 10.1016/j.fsigen.2021.102608] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 07/03/2021] [Revised: 08/20/2021] [Accepted: 10/12/2021] [Indexed: 01/28/2023]
Abstract
A comparative study has been carried out, comparing two different methods to estimate activity level likelihood ratios (LRa) using Bayesian Networks. The first method uses the sub-source likelihood ratio (log10LRϕ) as a 'quality indicator'. However, this has been criticised as introducing potential bias from population differences in allelic proportions. An alternative method has been introduced that is based upon the total RFU of a DNA profile that is adjusted using the mixture proportion (Mx) which is calculated from quantitative probabilistic genotyping software (EuroForMix). Bayesian logistic regressions of direct transfer data showed that the two methods were comparable. Differences were attributed to sampling error, and small sample sizes of secondary transfer data. The Bayesian approach facilitates comparative studies by taking account of sampling error; it can easily be extended to compare different methods.
16
|
Radiation Response Prediction Model based on Integrated Clinical and Genomic Data Analysis. Cancer Res Treat 2021; 54:383-395. [PMID: 34425668 PMCID: PMC9016297 DOI: 10.4143/crt.2021.759] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/29/2021] [Accepted: 08/23/2021] [Indexed: 11/21/2022] Open
Abstract
Purpose: The value of genomic profiling by targeted gene sequencing for predicting radiation therapy response was evaluated through an integrated analysis that included clinical information. A radiation response prediction model was constructed based on the findings. Materials and Methods: Patients whose tumors were sequenced using the institutional cancer panel after informed consent and who received radiotherapy for measurable disease served as the target cohort. Patients whose irradiated tumor was locally controlled for more than 6 months after radiotherapy were defined as the durable local control (DLC) group; all others formed the non-durable local control (NDLC) group. Significant genomic factors and domain knowledge were used to develop the Bayesian Network model to predict radiotherapy response. Results: Altogether, 88 patients were included in the analysis. Of those, 41 (43.6%) and 47 (54.4%) patients were classified as the NDLC and DLC group, respectively. Somatic mutations of NOTCH2 and BCL were enriched in the NDLC group, whereas mutations of CHEK2, MSH2, and NOTCH1 were more frequently found in the DLC group. An altered DNA repair pathway was associated with better local failure-free survival (HR 0.40, 95% CI 0.19-0.86, p=0.014). The smoking somatic signature was found more frequently in the DLC group. The AUC of the Bayesian Network model predicting the probability of 6-month local control was 0.83. Conclusion: Durable radiation response was associated with alterations of the DNA repair pathway and the smoking somatic signature. The Bayesian network model could provide helpful insights for high-precision radiotherapy. However, these findings should be verified in a prospective cohort for further individualization.
17
|
Ocular Trauma in Operation Iraqi Freedom and Operation Enduring Freedom from 2001 to 2011: A Bayesian Network Analysis. Ophthalmic Epidemiol 2020; 28:312-321. [PMID: 32998604 DOI: 10.1080/09286586.2020.1828494] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 10/23/2022]
Abstract
PURPOSE To update the epidemiology of ocular injuries in soldiers admitted to Walter Reed Army Medical Center (WRAMC) from 2001 to 2011 after sustaining combat injuries in Operation Iraqi Freedom (OIF) and Operation Enduring Freedom (OEF). METHODS Data were collected in the Walter Reed Ocular Trauma Database. A Bayesian Network Analysis was completed to better understand the relationships between different ocular demographic variables, injuries, surgeries, ocular trauma scores (OTS) and visual outcomes. RESULTS There were 890 consecutive globe or adnexal combat injuries, or both, sustained by 652 United States soldiers treated at WRAMC between 2001 and 2011. The primary mechanism of injury was improvised explosive device (62.47%). Many patients (62.0%) had final visual acuity (VA) grades of 1-2 (20/15 - 20/200), while 29.9% of patients had final VA grades of 3-5 (less than 20/200), and 8.1% had unknown final VA grades. Bayesian Network Analysis revealed that the injury variables of Retina (47.9%), Lens (44.6%), Posterior Segment (43.7%) and Anterior Segment (40.3%), and the surgical variables of Enucleation (97.6%) and cataract extraction and posterior capsule intraocular lens placement (CEPCIOL; 43.3%) all had probabilities greater than 40% for a poor final VA, while all other variables were less than 40%. CONCLUSION Modern-day combat trauma results in complicated ocular injuries causing 30% of patients to be left legally blind in their injured eye. It is critical to maintain a wide variety of deployable, specialty trained ophthalmologists to ensure the best visual outcomes for wounded warriors and to maintain mission readiness.
|
18
|
A framework using topological pathways for deeper analysis of transcriptome data. BMC Genomics 2020; 21:834. [PMID: 32138666 PMCID: PMC7057456 DOI: 10.1186/s12864-019-6155-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/11/2019] [Accepted: 09/30/2019] [Indexed: 12/03/2022] Open
Abstract
Background Pathway analysis is one of the later-stage data analysis steps essential in interpreting high-throughput gene expression data. We propose a set of algorithms that, given gene expression data, can recognize which portions of sub-pathways are actively utilized in the biological system being studied. The degree of activation is measured by the conditional probability of the input expression data based on the Bayesian Network model constructed from the topological pathway. Results We demonstrate the effectiveness of our pathway analysis method by conducting two case studies. The first applies our method to a well-studied temporal microarray data set for the cell cycle using the KEGG Cell Cycle pathway. Our method closely reproduces the biological claims associated with the data sets, but unlike the original work, ours can show how pathway routes interact with each other, above and beyond merely identifying which pathway routes are involved in the process. The second study applies the method to the p53 mutation microarray data to perform a comparative study. Conclusions We show that our method achieves performance comparable to that of all other pathway analysis systems included in this study in identifying p53 altered pathways. Our method could pave a new way of carrying out next generation pathway analysis.
|
19
|
Enriching Analytics Models with Domain Knowledge for Smart Manufacturing Data Analysis. INTERNATIONAL JOURNAL OF PRODUCTION RESEARCH 2020; 58:10.1080/00207543.2019.1680895. [PMID: 33304000 PMCID: PMC7722260 DOI: 10.1080/00207543.2019.1680895] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/12/2018] [Accepted: 05/28/2019] [Indexed: 06/12/2023]
Abstract
Today, data analytics plays an important role in Smart Manufacturing decision making. Domain knowledge is very important to support the development of analytics models. However, in today's data analytics projects, domain knowledge is only documented, but not properly captured and integrated with analytics models. This raises problems in interoperability and traceability of the relevant domain knowledge that is used to develop analytics models. To address these problems, this paper proposes a methodology to enrich analytics models with domain knowledge. To illustrate the proposed methodology, a case study is introduced to demonstrate the utilization of the enriched analytics model to support the development of a Bayesian Network model. The case study shows that the utilization of an enriched analytics model improves the efficiency in developing the Bayesian Network model.
|
20
|
Spatiotemporal prediction of Escherichia coli and Enterococci for the Commonwealth Games triathlon event using Bayesian Networks. MARINE POLLUTION BULLETIN 2019; 146:11-21. [PMID: 31426138 DOI: 10.1016/j.marpolbul.2019.05.066] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 01/16/2019] [Revised: 05/28/2019] [Accepted: 05/29/2019] [Indexed: 06/10/2023]
Abstract
A number of Bayesian Networks were developed in order to nowcast and forecast, up to 4 days ahead and in different locations, the likelihood of water quality within the 2018 Commonwealth Games Triathlon swim course exceeding the critical limits for Enterococci and Escherichia coli. The models are data-driven, but the identification of potential inputs and optimal model structure was performed through the parallel contribution of several stakeholders and experts, consulted through workshops. The models, whose main nodes were discretised with a customised discretisation algorithm, were validated over a test set of data and deployed in real-time during the Commonwealth Games in support of a traditional water quality monitoring program. The proposed modelling framework proved to be cost-effective and less time-consuming than process-based models while still achieving high accuracy; in addition, the added value of continuous stakeholder engagement guarantees a shared understanding of the model outputs and their future deployment.
|
21
|
Evaluating the Safety Risk of Rural Roadsides Using a Bayesian Network Method. INTERNATIONAL JOURNAL OF ENVIRONMENTAL RESEARCH AND PUBLIC HEALTH 2019; 16:ijerph16071166. [PMID: 30939766 PMCID: PMC6480398 DOI: 10.3390/ijerph16071166] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 02/25/2019] [Revised: 03/28/2019] [Accepted: 03/29/2019] [Indexed: 11/16/2022]
Abstract
Evaluating the safety risk of rural roadsides is critical for achieving reasonable allocation of a limited budget and avoiding excessive installation of safety facilities. To assess the safety risk of rural roadsides when the crash data are unavailable or missing, this study proposed a Bayesian Network (BN) method that uses the experts’ judgments on the conditional probability of different safety risk factors to evaluate the safety risk of rural roadsides. Eight factors were considered, including seven factors identified in the literature and a new factor named access point density. To validate the effectiveness of the proposed method, a case study was conducted using 19.42 km long road networks in the rural area of Nantong, China. By comparing the results of the proposed method and run-off-road (ROR) crash data from 2015–2016 in the study area, the road segments with higher safety risk levels identified by the proposed method were found to be statistically significantly correlated with higher crash severity based on the crash data. In addition, by comparing the respective results evaluated by eight factors and seven factors (a new factor removed), we also found that access point density significantly contributed to the safety risk of rural roadsides. These results show that the proposed method can be considered as a low-cost solution to evaluating the safety risk of rural roadsides with relatively high accuracy, especially for areas with large rural road networks and incomplete ROR crash data due to budget limitation, human errors, negligence, or inconsistent crash recordings.
|
22
|
Identification of conservation and restoration priority areas in the Danube River based on the multi-functionality of river-floodplain systems. THE SCIENCE OF THE TOTAL ENVIRONMENT 2019; 654:763-777. [PMID: 30448667 DOI: 10.1016/j.scitotenv.2018.10.322] [Citation(s) in RCA: 24] [Impact Index Per Article: 4.8] [Received: 07/06/2018] [Revised: 10/23/2018] [Accepted: 10/24/2018] [Indexed: 06/09/2023]
Abstract
Large river-floodplain systems are hotspots of biodiversity and ecosystem services but are also used for multiple human activities, making them one of the most threatened ecosystems worldwide. There is wide evidence that reconnecting river channels with their floodplains is an effective measure to increase their multi-functionality, i.e., ecological integrity, habitats for multiple species and the multiple functions and services of river-floodplain systems, although the selection of promising sites for restoration projects can be a demanding task. In the case of the Danube River in Europe, planning and implementation of restoration projects are substantially hampered by the complexity and heterogeneity of the environmental problems, lack of data and strong differences in socio-economic conditions as well as inconsistencies in legislation related to river management. We take a quantitative approach based on best-available data to assess biodiversity using selected species and three ecosystem services (flood regulation, crop pollination, and recreation), focused on the navigable main stem of the Danube River and its floodplains. We spatially prioritize river-floodplain segments for conservation and restoration based on (1) multi-functionality related to biodiversity and ecosystem services, (2) availability of remaining semi-natural areas and (3) reversibility as it relates to multiple human activities (e.g. flood protection, hydropower and navigation). Our approach can thus serve as a strategic planning tool for the Danube and provide a method for similar analyses in other large river-floodplain systems.
|
23
|
Correction Workers' Burnout and Outcomes: A Bayesian Network Approach. INTERNATIONAL JOURNAL OF ENVIRONMENTAL RESEARCH AND PUBLIC HEALTH 2019; 16:ijerph16020282. [PMID: 30669527 PMCID: PMC6352158 DOI: 10.3390/ijerph16020282] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.4] [Received: 12/03/2018] [Revised: 01/07/2019] [Accepted: 01/17/2019] [Indexed: 12/18/2022]
Abstract
The present study seeks to demonstrate how Bayesian Network analysis can be used to support Total Worker Health® research on correction workers by (1) revealing the most probable scenario of how psychosocial and behavioral outcome variables in corrections work are interrelated and (2) identifying the key contributing factors of this interdependency relationship within the unique occupational context of corrections work. The data from 353 correction workers from a state department of corrections in the United States were utilized. A Bayesian Network analysis approach was used to probabilistically sort out potential interrelations among various psychosocial and behavioral variables. The identified model revealed that work-related exhaustion may serve as a primary driver of occupational stress and impaired workability, and also that exhaustion limits the ability of correction workers to get regular physical exercise, while their interrelations with depressed mood, a lack of work engagement, and poor work-family balance were also noted. The results suggest the importance of joint consideration of psychosocial and behavioral factors when investigating variables that may impact health and wellbeing of correction workers. Also, they supported the value of adopting the Total Worker Health® framework, a holistic strategy to integrate prevention of work-related injury and illness and the facilitation of worker well-being, when considering integrated health protection and promotion interventions for workers in high-risk occupations.
|
24
|
Evaluating Flight Crew Performance by a Bayesian Network Model. ENTROPY 2018; 20:e20030178. [PMID: 33265269 PMCID: PMC7512694 DOI: 10.3390/e20030178] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.8] [Received: 11/28/2017] [Revised: 02/22/2018] [Accepted: 02/24/2018] [Indexed: 01/01/2023]
Abstract
Flight crew performance is of great significance in keeping flights safe and sound. When evaluating crew performance, detailed quantitative behavior information may not be available. The present paper introduces a Bayesian Network (BN) to evaluate flight crew performance, which permits the use of multidisciplinary sources of objective and subjective information despite sparse behavioral data. The causal factors are selected based on an analysis of 484 aviation accidents caused by human factors. Then, a network termed the Flight Crew Performance Model is constructed. The Delphi technique helps to gather subjective data as a supplement to objective data from accident reports. The conditional probabilities are elicited by the leaky noisy MAX model. Two modes of BN inference, probability prediction and probabilistic diagnosis, are used, and some interesting conclusions are drawn, which could provide data support for interventions in human error management in aviation safety.
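As an illustration of the elicitation idea mentioned in this abstract, the sketch below shows a leaky noisy-OR gate, which is the binary special case of the leaky noisy-MAX model. It is not taken from the paper: the function name, the three link probabilities and the leak value are hypothetical, and the leak is treated here as an extra, always-present cause (one of the common parameterisations). The point is only that a handful of elicited numbers can be expanded into a full conditional probability table.

```python
from itertools import product

def leaky_noisy_or(link_probs, leak, active):
    """P(effect = True | causes) under a leaky noisy-OR gate.

    link_probs[i] : probability that cause i alone produces the effect
    leak          : probability of the effect when no modelled cause is active
    active[i]     : True if cause i is present
    """
    p_not = 1.0 - leak  # effect is absent only if the leak fails ...
    for p, on in zip(link_probs, active):
        if on:
            p_not *= 1.0 - p  # ... and every active cause fails independently
    return 1.0 - p_not

# Hypothetical elicited parameters for three binary causes.
link_probs = [0.6, 0.3, 0.2]
leak = 0.05

# Expand the 3 + 1 elicited numbers into the full 2**3-row table.
cpt = {combo: leaky_noisy_or(link_probs, leak, combo)
       for combo in product([False, True], repeat=len(link_probs))}
```

With n binary causes, experts supply n + 1 numbers instead of 2**n table rows, which is what makes this family of models attractive when behavioral data are sparse.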
|
25
|
Risk assessment model to prioritize sewer pipes inspection in wastewater collection networks. JOURNAL OF ENVIRONMENTAL MANAGEMENT 2017; 190:91-101. [PMID: 28040592 DOI: 10.1016/j.jenvman.2016.12.052] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Received: 04/01/2016] [Revised: 12/17/2016] [Accepted: 12/21/2016] [Indexed: 06/06/2023]
Abstract
In wastewater systems, one of the most important urban infrastructures, the adverse consequences of unsuitable performance and failure events can sometimes disrupt part of a city's functioning. By identifying high failure risk areas, inspections can be implemented based on the system status and can thus significantly increase sewer network performance. In this study, a new risk assessment model is developed to prioritize sewer pipe inspection, using Bayesian Networks (BNs) as a probabilistic approach for computing the probability of failure and a weighted average method to calculate the consequences of failure values. Finally, to consider uncertainties, the risk of a sewer pipe is obtained by integrating the probability and consequences of failure values using a fuzzy inference system (FIS). As a case study, sewer pipes of a local wastewater collection network in Iran are prioritized for inspection based on their criticality. Results show that the majority of sewers (about 62%) have moderate risk, but 12% of sewers are in a critical situation. Given budgetary constraints, the proposed model and resultant risk values are expected to assist wastewater agencies in repairing or replacing risky sewer pipelines, especially when dealing with incomplete and uncertain datasets.
|
26
|
Identification of genetic interaction networks via an evolutionary algorithm evolved Bayesian network. BioData Min 2016; 9:18. [PMID: 27168765 PMCID: PMC4862166 DOI: 10.1186/s13040-016-0094-4] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.0] [Received: 10/05/2015] [Accepted: 04/18/2016] [Indexed: 12/01/2022] Open
Abstract
Background The future of medicine is moving towards the phase of precision medicine, with the goal to prevent and treat diseases by taking inter-individual variability into account. A large part of the variability lies in our genetic makeup. With the fast-paced improvement of high-throughput methods for genome sequencing, a tremendous amount of genetics data have already been generated. The next hurdle for precision medicine is to have sufficient computational tools for analyzing large sets of data. Genome-Wide Association Studies (GWAS) have been the primary method to assess the relationship between single nucleotide polymorphisms (SNPs) and disease traits. While GWAS is sufficient in finding individual SNPs with strong main effects, it does not capture potential interactions among multiple SNPs. In many traits, a large proportion of variation remains unexplained by using main effects alone, leaving the door open for exploring the role of genetic interactions. However, identifying genetic interactions in large-scale genomics data poses a challenge even for modern computing. Results For this study, we present a new algorithm, Grammatical Evolution Bayesian Network (GEBN), that utilizes Bayesian Networks to identify interactions in the data and, at the same time, uses an evolutionary algorithm to reduce the computational cost associated with network optimization. GEBN excelled in simulation studies where the data contained main effects and interaction effects. We also applied GEBN to a Type 2 diabetes (T2D) dataset obtained from the Marshfield Personalized Medicine Research Project (PMRP). We were able to identify genetic interactions for T2D cases and controls and use information from those interactions to classify T2D samples. We obtained an average testing area under the curve (AUC) of 86.8%. We also identified several interacting genes such as INADL and LPP that are known to be associated with T2D.
Conclusions Developing the computational tools to explore genetic associations beyond main effects remains a critically important challenge in human genetics. Methods, such as GEBN, demonstrate the utility of considering genetic interactions, as they likely explain some of the missing heritability.
|
27
|
Risk analysis of emergent water pollution accidents based on a Bayesian Network. JOURNAL OF ENVIRONMENTAL MANAGEMENT 2016; 165:199-205. [PMID: 26433361 DOI: 10.1016/j.jenvman.2015.09.024] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.4] [Received: 12/24/2014] [Revised: 09/02/2015] [Accepted: 09/17/2015] [Indexed: 05/24/2023]
Abstract
To guarantee the security of water quality in water transfer channels, especially in open channels, analysis of potential emergent pollution sources in the water transfer process is critical. It is also indispensable for forewarning of and protection from emergent pollution accidents. Bridges above open channels with large amounts of truck traffic are the main locations where emergent accidents could occur. A Bayesian Network model, consisting of six root nodes and three middle layer nodes, was developed in this paper and employed to identify the possibility of potential pollution risk. Dianbei Bridge is reviewed as a typical bridge on an open channel of the Middle Route of the South to North Water Transfer Project where emergent traffic accidents could occur. This study focuses on the risk of water pollution caused by leakage of pollutants into the water. The risk of potential traffic accidents at the Dianbei Bridge implies a risk of water pollution in the canal. The Bayesian Network model was established based on survey data, statistical analysis, and domain specialist knowledge, and it considers the human factor in emergent accidents. Additionally, the model has been employed to describe the probability of accidents and the risk level. The factors to which pollution accidents are most sensitive have been deduced, and the case in which these sensitive factors are in the state most likely to lead to accidents has also been simulated.
|
28
|
A sequential decision-theoretic model for medical diagnostic system. Technol Health Care 2015; 23 Suppl 1:S37-42. [PMID: 26410326 DOI: 10.3233/thc-150926] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/15/2022]
Abstract
Although diagnostic expert systems that use a knowledge base modelling the decision-making of traditional experts can provide important information to non-experts, they tend to duplicate the errors made by experts. A Decision-Theoretic Model (DTM) is therefore very useful in an expert system, since it guards experts against incorrect reasoning under uncertainty. For the diagnostic expert system, the corresponding DTM and arithmetic are studied, and a sequential diagnostic decision-theoretic model based on a Bayesian Network is given. In the model, the alternative features are categorized into two classes (disease features and test features), and an arithmetic for the prior of tests is provided. How different features affect the weights of other features is also discussed. A Bayesian Network is adopted to handle uncertainty representation and propagation. The model can help knowledge engineers model the knowledge involved in sequential diagnosis and decide the priority of alternative evidence. A practical example of the model is also presented: at any time in the diagnostic process, the expert is provided with a dynamically updated list of suggested tests to support the decision about which test to execute next. The results show that it performs better than the traditional diagnostic model, which is based on experience.
|
29
|
Identification of reciprocal causality between non-alcoholic fatty liver disease and metabolic syndrome by a simplified Bayesian network in a Chinese population. BMJ Open 2015; 5:e008204. [PMID: 26395497 PMCID: PMC4593152 DOI: 10.1136/bmjopen-2015-008204] [Citation(s) in RCA: 54] [Impact Index Per Article: 6.0] [Indexed: 12/15/2022] Open
Abstract
OBJECTIVES It remains unclear whether non-alcoholic fatty liver disease (NAFLD) is a cause or a consequence of metabolic syndrome (MetS). We proposed a simplified Bayesian network (BN) and attempted to confirm their reciprocal causality. SETTING Bidirectional longitudinal cohorts (subcohorts A and B) were designed and followed up from 2005 to 2011 based on a large-scale health check-up in a Chinese population. PARTICIPANTS Subcohort A (from NAFLD to MetS, n=8426) included the participants with or without NAFLD at baseline to follow-up the incidence of MetS, while subcohort B (from MetS to NAFLD, n=16,110) included the participants with or without MetS at baseline to follow-up the incidence of NAFLD. RESULTS Incidence densities were 2.47 and 17.39 per 100 person-years in subcohorts A and B, respectively. Generalised estimating equation analyses demonstrated that NAFLD was a potential causal factor for MetS (relative risk, RR, 95% CI 5.23, 3.50 to 7.81), while MetS was also a factor for NAFLD (2.55, 2.23 to 2.92). A BN with 5 simplification strategies was used for the reciprocal causal inference. The BN's causal inference illustrated that the total effect of NAFLD on MetS (attributable risks, AR%) was 2.49%, while it was 19.92% for MetS on NAFLD. The total effect of NAFLD on MetS components was different, with dyslipidemia having the greatest (AR%, 10.15%), followed by obesity (7.63%), diabetes (3.90%) and hypertension (3.51%). Similar patterns were inferred for MetS components on NAFLD, with obesity having the greatest (16.37%) effect, followed by diabetes (10.85%), dyslipidemia (10.74%) and hypertension (7.36%). Furthermore, the most important causal pathway from NAFLD to MetS was that NAFLD led to elevated GGT, then to MetS components, while the dominant causal pathway from MetS to NAFLD began with dyslipidaemia. 
CONCLUSIONS The findings suggest a reciprocal causality between NAFLD and MetS, and the effect of MetS on NAFLD is significantly greater than that of NAFLD on MetS.
|
30
|
Modeling signal transduction from protein phosphorylation to gene expression. Cancer Inform 2014; 13:59-67. [PMID: 25392684 PMCID: PMC4216050 DOI: 10.4137/cin.s13883] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Received: 03/20/2014] [Revised: 05/04/2014] [Accepted: 05/04/2014] [Indexed: 11/24/2022] Open
Abstract
BACKGROUND Signaling networks are of great importance for us to understand the cell’s regulatory mechanism. The rise of large-scale genomic and proteomic data, together with prior biological knowledge, has paved the way for the reconstruction and discovery of novel signaling pathways in a data-driven manner. In this study, we investigate computational methods that integrate proteomic and transcriptomic data to identify signaling pathways transmitting signals in response to specific stimuli. Such methods can be applied to cancer genomic data to infer perturbed signaling pathways. METHOD We proposed a novel Bayesian Network (BN) framework to integrate transcriptomic data with proteomic data reflecting protein phosphorylation states for the purpose of identifying the pathways transmitting the signal of diverse stimuli in rat and human cells. We represented the proteins and genes as nodes in a BN in which edges reflect the regulatory relationship between signaling proteins. We designed an efficient inference algorithm that incorporated the prior knowledge of pathways and searched for a network structure in a data-driven manner. RESULTS We applied our method to infer rat and human specific networks given gene expression and proteomic datasets. We were able to effectively identify sparse signaling networks that modeled the observed transcriptomic and proteomic data. Our methods were able to identify distinct signaling pathways for rat and human cells in a data-driven manner, based on the fact that rat and human cells exhibited distinct transcriptomic and proteomic responses to a common set of stimuli. Our model performed well in the SBV IMPROVER challenge in comparison to other models addressing the same task. The capability of inferring signaling pathways in a data-driven fashion may contribute to cancer research by identifying distinct aberrations in signaling pathways underlying heterogeneous cancer subtypes.
|
31
|
SAGA: a hybrid search algorithm for Bayesian Network structure learning of transcriptional regulatory networks. J Biomed Inform 2014; 53:27-35. [PMID: 25181467 DOI: 10.1016/j.jbi.2014.08.010] [Citation(s) in RCA: 34] [Impact Index Per Article: 3.4] [Received: 01/31/2014] [Revised: 08/17/2014] [Accepted: 08/22/2014] [Indexed: 11/16/2022]
Abstract
Bayesian Networks have been used for the inference of transcriptional regulatory relationships among genes, and are valuable for obtaining biological insights. However, finding an optimal Bayesian Network (BN) structure is NP-hard. Thus, heuristic approaches have sought to effectively solve this problem. In this work, we develop a hybrid search method combining Simulated Annealing with a Greedy Algorithm (SAGA). SAGA explores most of the search space by undergoing a two-phase search: first with a Simulated Annealing search and then with a Greedy search. Three sets of background-corrected and normalized microarray datasets were used to test the algorithm. For comparison, BN structure learning was also conducted on the same datasets using other established search methods as implemented in BANJO (Bayesian Network Inference with Java Objects). The Bayesian Dirichlet Equivalence (BDe) metric was used to score the networks produced with SAGA. SAGA predicted transcriptional regulatory relationships among genes in networks that evaluated to higher BDe scores with high sensitivities and specificities. Thus, the proposed method competes well with existing search algorithms for Bayesian Network structure learning of transcriptional regulatory networks.
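The two-phase idea described in this abstract (a simulated annealing pass followed by a greedy pass) can be sketched on a toy problem. The code below is not the SAGA implementation and does not score Bayesian network structures: the bit-vector search space, the `score` function, the cooling schedule and all parameter values are assumptions chosen only to keep the example small and runnable.

```python
import math
import random

def score(bits):
    """Toy objective standing in for a network score (higher is better)."""
    target = [1, 0, 1, 1, 0, 0, 1, 0]
    return sum(b == t for b, t in zip(bits, target))

def flip(bits, i):
    """Return a copy of bits with position i toggled (a single local move)."""
    out = list(bits)
    out[i] = 1 - out[i]
    return out

def simulated_annealing(start, rng, steps=200, t0=2.0, cooling=0.98):
    """Phase 1: random local moves, occasionally accepting worse states."""
    current, best = list(start), list(start)
    temp = t0
    for _ in range(steps):
        cand = flip(current, rng.randrange(len(current)))
        delta = score(cand) - score(current)
        # Accept improvements always; accept worse moves with prob exp(delta / temp).
        if delta >= 0 or rng.random() < math.exp(delta / temp):
            current = cand
            if score(current) > score(best):
                best = list(current)
        temp *= cooling
    return best

def greedy(start):
    """Phase 2: hill climbing that keeps only strictly improving moves."""
    current = list(start)
    improved = True
    while improved:
        improved = False
        for i in range(len(current)):
            cand = flip(current, i)
            if score(cand) > score(current):
                current, improved = cand, True
    return current

rng = random.Random(0)          # fixed seed so the run is repeatable
start = [0] * 8
after_sa = simulated_annealing(start, rng)
final = greedy(after_sa)
```

The annealing phase is meant to escape poor local optima early on, while the greedy phase only polishes the result, so the final score can never be lower than the score after annealing.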
|
32
|
Integrated modelling for Sustainability Appraisal of urban river corridors: going beyond compartmentalised thinking. WATER RESEARCH 2013; 47:7221-7234. [PMID: 24200012 DOI: 10.1016/j.watres.2013.10.034] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 04/01/2013] [Revised: 10/09/2013] [Accepted: 10/11/2013] [Indexed: 06/02/2023]
Abstract
Sustainability Appraisal (SA) is a complex task that involves integration of social, environmental and economic considerations and often requires trade-offs between multiple stakeholders that may not easily be brought to consensus. Classical SA, often compartmentalised in the rigid boundary of disciplines, can facilitate discussion, but can only partially inform decision makers as many important aspects of sustainability remain abstract and not interlinked. A fully integrated model can overcome compartmentality in the assessment process and provides an opportunity for a better integrative exploratory planning process. The objective of this paper is to explore the benefit of an integrated modelling approach to SA and how a structured integrated model can be used to provide a coherent, consistent and deliberative platform to assess policy or planning proposals. The paper discusses a participative and integrative modelling approach to urban river corridor development, incorporating the principle of sustainability. The paper uses a case study site in Sheffield, UK, with three alternative development scenarios, incorporating a number of possible riverside design features. An integrated SA model is used to develop better design by optimising different design elements and delivering a more sustainable (re)development plan. We conclude that participatory integrated modelling has strong potential for supporting the SA processes. A high degree of integration provides the opportunity for more inclusive and informed decision-making regarding issues of urban development. It also provides the opportunity to reflect on their long-term dynamics, and to gain insights on the interrelationships underlying persistent sustainability problems. Thus the ability to address economic, social and environmental interdependencies within policies, plans, and legislations is enhanced.
|
33
|
Comparing models for quantitative risk assessment: an application to the European Registry of foreign body injuries in children. Stat Methods Med Res 2013; 25:1244-59. [PMID: 23427223 DOI: 10.1177/0962280213476167] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.8] [Indexed: 11/17/2022]
Abstract
Risk Assessment is the systematic study of decisions subject to uncertain consequences. Increasing interest has focused on modeling techniques like Bayesian Networks because of their capability of (1) combining different types of evidence, including both expert judgments and objective data, in a probabilistic framework; (2) overturning previous beliefs in the light of new information; and (3) making predictions even with incomplete data. In this work, we propose a comparison between Bayesian Networks and other classical Quantitative Risk Assessment techniques such as Neural Networks, Classification Trees, Random Forests and Logistic Regression models. Hybrid approaches, combining both Classification Trees and Bayesian Networks, are also considered. Among Bayesian Networks, a clear distinction is made between a purely data-driven approach and a combination of expert knowledge with objective data. The aim of this paper is to evaluate which of these models can best be applied, in the framework of Quantitative Risk Assessment, to assess the safety of children who are exposed to the risk of inhalation/insertion/aspiration of consumer products. Preventing injuries in children is of paramount importance, in particular where product design is involved: quantifying the risk associated with product characteristics can be very useful in informing product safety design regulation. Data from the European Registry of Foreign Bodies Injuries formed the starting evidence for risk assessment. Results showed that Bayesian Networks combined ease of interpretability with accuracy in making predictions, even though simpler models like logistic regression still performed well.
|
34
|
Bayesian Method for Causal Discovery of Latent-Variable Models from a Mixture of Experimental and Observational Data. Comput Stat Data Anal 2012; 56:2183-2205. [PMID: 32831439 DOI: 10.1016/j.csda.2012.01.010] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Indexed: 11/25/2022]
Abstract
This paper describes a Bayesian method for learning causal Bayesian networks through networks that contain latent variables from an arbitrary mixture of observational and experimental data. The paper presents Bayesian methods (including a new method) for learning the causal structure and parameters of the underlying causal process that is generating the data, given that the data contain a mixture of observational and experimental cases. These learning methods were applied using as input various mixtures of experimental and observational data that were generated from the ALARM causal Bayesian network. The paper reports how these structure predictions and parameter estimates compare with the true causal structures and parameters as given by the ALARM network. The paper shows that (1) the new method for learning Bayesian network structure from a mixture of data that this paper introduces, the Gibbs Volume method, best estimates the probability of the data given the latent variable model and (2) with large data sets (>10,000 cases), another model, the implicit latent variable method, is asymptotically correct and efficient.
|
35
|
An automated bayesian framework for integrative gene expression analysis and predictive medicine. AMIA Joint Summits on Translational Science Proceedings 2012; 2012:95-104. [PMID: 22779059 PMCID: PMC3392067] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/01/2023]
Abstract
MOTIVATION: This work constructs a closed-loop Bayesian Network framework for predictive medicine via integrative analysis of publicly available gene expression findings pertaining to various diseases. RESULTS: An automated pipeline was successfully constructed. Integrative models were made based on gene expression data obtained from GEO experiments relating to four different diseases using Bayesian statistical methods. Many of these models demonstrated a high level of accuracy and predictive ability. The approach described in this paper can be applied to any complex disorder and can include any number and type of genome-scale studies.
|