1
|
Zeimet AG, Reimer D, Huszar M, Winterhoff B, Puistola U, Abdel Azim S, Müller-Holzner E, Ben-Arie A, van Kempen LC, Petru E, Jahn S, Geels YP, Massuger LF, Amant F, Polterauer S, Lappi-Blanco E, Bulten J, Meuter A, Tanouye S, Oppelt P, Stroh-Weigert M, Reinthaller A, Mariani A, Hackl W, Netzer M, Schirmer U, Vergote I, Altevogt P, Marth C, Fogel M. L1CAM in Early-Stage Type I Endometrial Cancer: Results of a Large Multicenter Evaluation. J Natl Cancer Inst 2013; 105:1142-50. [DOI: 10.1093/jnci/djt144] [Citation(s) in RCA: 162] [Impact Index Per Article: 13.5] [Indexed: 02/05/2023]
|
|
12 |
162 |
2
|
Netzer M, Millonig G, Osl M, Pfeifer B, Praun S, Villinger J, Vogel W, Baumgartner C. A new ensemble-based algorithm for identifying breath gas marker candidates in liver disease using ion molecule reaction mass spectrometry. Bioinformatics 2009; 25:941-7. [PMID: 19223453 DOI: 10.1093/bioinformatics/btp093] [Citation(s) in RCA: 58] [Impact Index Per Article: 3.6] [Indexed: 11/12/2022]
Abstract
MOTIVATION: Alcoholic fatty liver disease (AFLD) and non-AFLD (NAFLD) can progress to severe liver diseases such as steatohepatitis, cirrhosis and cancer. Thus, the detection of early liver disease is essential; however, minimally invasive diagnostic methods in clinical hepatology still lack specificity. RESULTS: Ion molecule reaction mass spectrometry (IMR-MS) was applied to a total of 126 human breath gas samples comprising 91 cases (AFLD, NAFLD and cirrhosis) and 35 healthy controls. A new feature selection modality termed Stacked Feature Ranking (SFR) was developed to identify potential liver disease marker candidates in breath gas samples. SFR combines different entropy- and correlation-based feature ranking methods, including statistical hypothesis testing, in a two-level architecture with a suggestion layer and a decision layer. We benchmarked SFR against four single feature selection methods, a wrapper and a recently described ensemble method; the SFR-selected gas compounds showed a significantly higher discriminatory ability, by up to 10-15%, with an area under the ROC curve (AUC) of 0.85-0.95. Using this approach, we were able to identify unexpected breath gas marker candidates of high predictive value in liver disease. A literature study further supports an association of the top-ranked markers with liver disease. We propose SFR as a powerful tool for biomarker search in breath gas and other biological samples using mass spectrometry. AVAILABILITY: The algorithm SFR and IMR-MS datasets are available under http://biomed.umit.at/page.cfm?pageid=526.
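The AUC used above as the benchmark criterion can be illustrated with a short sketch. This is not the SFR algorithm itself, only a single-feature AUC ranking of the kind that could serve as one ingredient of an ensemble; the function names and the dictionary-based data layout are assumptions made for illustration.

```python
def auc(scores_pos, scores_neg):
    """AUC as the probability that a random case scores higher than a
    random control (ties count as one half)."""
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


def rank_features_by_auc(cases, controls):
    """Rank features by single-feature discriminatory ability.

    cases / controls: dict mapping a (hypothetical) feature name to the list
    of measured values in that group. max(a, 1 - a) is used so that a
    feature that is lower in cases ranks as high as one that is higher.
    """
    ranking = []
    for name in cases:
        a = auc(cases[name], controls[name])
        ranking.append((name, max(a, 1.0 - a)))
    return sorted(ranking, key=lambda item: item[1], reverse=True)
```

The pairwise definition is quadratic in the number of samples, which is acceptable for a dataset of the size described here (126 samples) but would be replaced by a rank-based computation for larger cohorts.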
|
Research Support, Non-U.S. Gov't |
16 |
58 |
3
|
Baumgartner C, Osl M, Netzer M, Baumgartner D. Bioinformatic-driven search for metabolic biomarkers in disease. J Clin Bioinforma 2011; 1:2. [PMID: 21884622 PMCID: PMC3143899 DOI: 10.1186/2043-9113-1-2] [Citation(s) in RCA: 46] [Impact Index Per Article: 3.3] [Received: 06/12/2010] [Accepted: 01/20/2011] [Indexed: 02/06/2023]
Abstract
The search for and validation of novel disease biomarkers require the complementary power of professional study planning and execution, modern profiling technologies and related bioinformatics tools for data analysis and interpretation. Biomarkers have considerable impact on the care of patients and are urgently needed for advancing diagnostics, prognostics and treatment of disease. This survey article highlights emerging bioinformatics methods for biomarker discovery in clinical metabolomics, focusing on the problem of data preprocessing and consolidation, and on the data-driven search, verification, prioritization and biological interpretation of putative metabolic candidate biomarkers in disease. In particular, data mining tools suitable for omics data gathered from the most frequently used types of experimental designs, such as case-control or longitudinal biomarker cohort studies, are reviewed, and case examples of selected discovery steps are described in more detail. This review demonstrates that clinical bioinformatics has evolved into an essential element of biomarker discovery, translating innovations and successes in profiling technologies and bioinformatics into clinical application.
|
review-article |
14 |
46 |
4
|
Baumgartner C, Lewis GD, Netzer M, Pfeifer B, Gerszten RE. A new data mining approach for profiling and categorizing kinetic patterns of metabolic biomarkers after myocardial injury. Bioinformatics 2010; 26:1745-51. [PMID: 20483816 DOI: 10.1093/bioinformatics/btq254] [Citation(s) in RCA: 32] [Impact Index Per Article: 2.1] [Indexed: 11/14/2022]
Abstract
MOTIVATION: The discovery of new and unexpected biomarkers in cardiovascular disease is a highly data-driven process that requires the complementary power of modern metabolite profiling technologies, bioinformatics and biostatistics. Clinical biomarkers of early myocardial injury are lacking. A prospective biomarker cohort study was carried out to identify, categorize and profile kinetic patterns of early metabolic biomarkers of planned (PMI) and spontaneous (SMI) myocardial infarction. We applied a targeted mass spectrometry (MS)-based metabolite profiling platform to serial blood samples drawn from carefully phenotyped patients undergoing alcohol septal ablation for hypertrophic obstructive cardiomyopathy, which served as a human model of PMI. Patients with SMI and patients undergoing catheterization without induction of myocardial infarction served as positive and negative controls to assess the generalizability of markers identified in PMI. RESULTS: To identify metabolites of high predictive value in tandem mass spectrometry data, we introduced a new feature selection method that categorizes metabolic signatures into three classes of weak, moderate and strong predictors and can be easily applied to both paired and unpaired samples. Our paradigm outperformed standard null-hypothesis significance testing and other popular feature selection methods in terms of the area under the receiver operating characteristic curve and the product of sensitivity and specificity. Our results emphasize that this new method was able to identify, classify and validate alterations in the levels of multiple metabolites participating in pathways associated with myocardial injury as early as 10 min after PMI. AVAILABILITY: The algorithm as well as supplementary material is available for download at: www.umit.at/page.cfm?vpath=departments/technik/iebe/tools/bi
|
Research Support, Non-U.S. Gov't |
15 |
32 |
5
|
Steffens J, Netzer M, Isenberg E, Alloussi S, Ziegler M. Vasopressin deficiency in primary nocturnal enuresis. Results of a controlled prospective study. Eur Urol 1993; 24:366-70. [PMID: 8262104 DOI: 10.1159/000474330] [Citation(s) in RCA: 29] [Impact Index Per Article: 0.9] [Indexed: 01/29/2023]
Abstract
Some children with primary nocturnal enuresis (PNE) are known to lack a circadian rhythm of plasma arginine vasopressin (AVP). The original test protocol is time-consuming and needs excellent compliance by children and parents. The goals of the present study are to introduce a simple screening test and to evaluate the response to treatment with intranasal synthetic vasopressin. Fifty-five children (aged 8.2 ± 3.1 years) with PNE and 15 children (aged 7.9 ± 2.4 years) in a control group were investigated. Using a standardized protocol, AVP levels were measured by radioimmunoassay (RIA) under controlled water intake 3 times per day over a period of 72 h. Fourteen of 55 tested children (25.5%) with PNE had a significant decrease in nocturnal AVP when compared to the control group. We also measured an increased nocturnal urine volume and a lower urine osmolality in this enuretic group. Eight of 14 patients (57.1%) with plasma AVP deficiency (AVPD) also had bladder instability. Nine of 14 patients (64.3%) with AVPD, with or without concomitant bladder instability, were totally dry during desmopressin treatment, but only 2 (14.3%) remained dry after discontinuation of treatment. Our data suggest that nocturnal urine osmolality measurement may reflect AVPD and predict a positive treatment outcome.
|
Clinical Trial |
32 |
29 |
6
|
Netzer M, Weinberger KM, Handler M, Seger M, Fang X, Kugler KG, Graber A, Baumgartner C. Profiling the human response to physical exercise: a computational strategy for the identification and kinetic analysis of metabolic biomarkers. J Clin Bioinforma 2011; 1:34. [PMID: 22182709 PMCID: PMC3320562 DOI: 10.1186/2043-9113-1-34] [Citation(s) in RCA: 29] [Impact Index Per Article: 2.1] [Received: 08/03/2011] [Accepted: 12/19/2011] [Indexed: 12/23/2022]
Abstract
BACKGROUND: In metabolomics, biomarker discovery is a highly data-driven process and requires sophisticated computational methods for the search and prioritization of novel and unforeseen biomarkers in data typically gathered in preclinical or clinical studies. In particular, the discovery of biomarker candidates from longitudinal cohort studies is crucial for kinetic analysis to better understand complex metabolic processes in the organism during physical activity. FINDINGS: In this work we introduce a novel computational strategy for identifying and studying kinetic changes of putative biomarkers using targeted MS/MS profiling data from time series cohort studies or other cross-over designs. We propose a prioritization model that classifies biomarker candidates according to their discriminatory ability, and couple this discovery step with a novel network-based approach to visualize, review and interpret key metabolites and their dynamic interactions within the network. Applying our method to longitudinal stress test data revealed a panel of metabolic signatures, i.e., lactate, alanine, glycine and the short-chain fatty acids C2 and C3, in trained and physically fit persons during bicycle exercise. CONCLUSIONS: We propose a new computational method for the discovery of new signatures in dynamic metabolic profiling data, which revealed known and unexpected candidate biomarkers in physical activity. Many of them could be confirmed in the literature. Our computational approach is freely available as an R package termed BiomarkeR under LGPL via CRAN http://cran.r-project.org/web/packages/BiomarkeR/.
|
Journal Article |
14 |
29 |
7
|
Pfeifer B, Kugler K, Tejada MM, Baumgartner C, Seger M, Osl M, Netzer M, Handler M, Dander A, Wurz M, Graber A, Tilg B. A cellular automaton framework for infectious disease spread simulation. Open Med Inform J 2008; 2:70-81. [PMID: 19415136 PMCID: PMC2666960 DOI: 10.2174/1874431100802010070] [Citation(s) in RCA: 28] [Impact Index Per Article: 1.6] [Received: 01/18/2008] [Revised: 03/27/2008] [Accepted: 04/09/2008] [Indexed: 12/02/2022]
Abstract
In this paper, a cellular automaton framework for processing the spatiotemporal spread of infectious diseases is presented. The developed environment simulates and visualizes how infectious diseases might spread, and hence provides a powerful instrument for health care organizations to generate disease prevention and contingency plans. In this study, the outbreak of an avian flu-like virus was modeled in the state of Tyrol, and various scenarios such as quarantine, the effect of different medications on viral spread and changes in social behavior were simulated. The proposed framework is implemented in the programming language Java. Setting up the simulation environment requires specification of the disease parameters and of the geographical information, using a population-density-colored map enriched with demographic data. The results of the numerical simulations and the analysis of the computed parameters will be used to gain a deeper understanding of how the disease spreading mechanisms work and how to protect the population from contracting the disease. Strategies for optimization of medical treatment and vaccination regimens will also be investigated using our cellular automaton framework. In this study, six different scenarios were simulated. The simulations showed that geographical barriers may help to slow down the spread of an infectious disease. However, when an aggressive and deadly communicable disease spreads, only quarantine and controlled medical treatment are able to stop the outbreak, if at all.
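To make the cellular automaton idea concrete, here is a minimal sketch in Python (the framework described above is written in Java and is not reproduced here). It assumes a rectangular grid, three states (susceptible, infectious, recovered), a four-cell neighbourhood and two probabilities; all of these choices, and the names `step` and `count`, are illustrative assumptions rather than the published model.

```python
import random

S, I, R = 0, 1, 2  # susceptible, infectious, recovered


def step(grid, p_infect, p_recover, rng):
    """Advance the automaton by one synchronous time step.

    A susceptible cell with k infectious neighbours (up, down, left, right)
    becomes infectious with probability 1 - (1 - p_infect) ** k.
    An infectious cell recovers with probability p_recover.
    """
    rows, cols = len(grid), len(grid[0])
    new = [row[:] for row in grid]  # all updates read from the old grid
    for r in range(rows):
        for c in range(cols):
            state = grid[r][c]
            if state == S:
                neighbours = [(r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)]
                k = sum(
                    1
                    for nr, nc in neighbours
                    if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == I
                )
                if k and rng.random() < 1 - (1 - p_infect) ** k:
                    new[r][c] = I
            elif state == I and rng.random() < p_recover:
                new[r][c] = R
    return new


def count(grid, state):
    """Number of cells currently in the given state."""
    return sum(row.count(state) for row in grid)
```

In this toy setting, a scenario such as quarantine could be approximated by lowering `p_infect`, and geographical barriers by cells that never change state; a realistic model would additionally weight cells by population density, as the abstract describes.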
|
Journal Article |
17 |
28 |
8
|
Millonig G, Praun S, Netzer M, Baumgartner C, Dornauer A, Mueller S, Villinger J, Vogel W. Non-invasive diagnosis of liver diseases by breath analysis using an optimized ion-molecule reaction-mass spectrometry approach: a pilot study. Biomarkers 2010; 15:297-306. [PMID: 20151876 DOI: 10.3109/13547501003624512] [Citation(s) in RCA: 24] [Impact Index Per Article: 1.6] [Indexed: 12/26/2022]
Abstract
Breath composition is altered in liver diseases. We tested whether ion-molecule-reaction mass spectrometry (IMR-MS) combined with a new statistical modality improves the diagnostic accuracy of breath analysis in liver diseases. We analysed 114 molecules in the breath of 126 individuals (healthy controls, and patients with non-alcoholic and alcoholic fatty liver disease and liver cirrhosis) by IMR-MS. Characteristic exhalation patterns were identified for each group. Combining two to seven molecules in the new stacked feature ranking model reached a diagnostic accuracy (area under the curve) between 0.88 and 0.97 for individual liver diseases. IMR-MS followed by sophisticated statistical analysis is a promising tool for liver diagnostics by breath analysis.
|
Research Support, Non-U.S. Gov't |
15 |
24 |
9
|
Fang X, Netzer M, Baumgartner C, Bai C, Wang X. Genetic network and gene set enrichment analysis to identify biomarkers related to cigarette smoking and lung cancer. Cancer Treat Rev 2012; 39:77-88. [PMID: 22789435 DOI: 10.1016/j.ctrv.2012.06.001] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.5] [Received: 04/24/2012] [Revised: 06/03/2012] [Accepted: 06/06/2012] [Indexed: 10/28/2022]
Abstract
OBJECTIVES: Cigarette smoking is the best-documented risk factor for the development of lung cancer, while the related genetic mechanisms are still unclear. METHODS: The preprocessed microarray expression dataset was downloaded from the Gene Expression Omnibus database. Samples were classified according to disease state, stage and smoking state. A new computational strategy was applied for the identification and biological interpretation of new candidate genes in lung cancer and smoking by coupling a network-based approach with gene set enrichment analysis. MEASUREMENTS: Network analysis was performed by pair-wise comparison according to the disease states (tumor or normal), smoking states (current smokers, nonsmokers or former smokers), or the disease stage (stages I-IV). The most activated metabolic pathways were identified by gene set enrichment analysis. RESULTS: Panels of top-ranked gene candidates in smoking or cancer development were identified, including genes involved in cell proliferation and drug metabolism such as cytochrome P450 and WW domain containing transcription regulator 1. Semaphorin 5A and protein phosphatase 1F are the common genes represented as major hubs in both the smoking-related and the cancer-related network. Six pathways, namely cell cycle, DNA replication, RNA transport, protein processing in endoplasmic reticulum, vascular smooth muscle contraction and endocytosis, were commonly involved in smoking and lung cancer when comparing the top ten selected pathways. CONCLUSION: A new bioinformatics approach to biomarker identification and validation can probe deep genetic relationships between cigarette smoking and lung cancer. Our studies indicate that disease-specific network biomarkers, interactions between genes/proteins, or cross-talk between pathways provide more specific value for the development of precision therapies for lung cancer.
|
Research Support, Non-U.S. Gov't |
13 |
19 |
10
|
Osl M, Dreiseitl S, Cerqueira F, Netzer M, Pfeifer B, Baumgartner C. Demoting redundant features to improve the discriminatory ability in cancer data. J Biomed Inform 2009; 42:721-5. [PMID: 19460463 DOI: 10.1016/j.jbi.2009.05.006] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.9] [Received: 11/20/2008] [Revised: 03/09/2009] [Accepted: 05/13/2009] [Indexed: 11/27/2022]
Abstract
The identification of a set of relevant but not redundant features is an important first step in building predictive and diagnostic models from biomedical data sets. Most commonly, individual features are ranked in terms of a quality criterion, out of which the best (first) k features are selected. However, feature ranking methods do not sufficiently account for interactions and correlations between the features. Thus, redundancy is likely to be encountered in the selected features. We present a new algorithm, termed Redundancy Demoting (RD), that takes an arbitrary feature ranking as input, and improves this ranking by identifying redundant features and demoting them to positions in the ranking in which they are not redundant. Redundant features are those that are correlated with other features and not relevant in the sense that they do not improve the discriminatory ability of a set of features. Experiments on two cancer data sets, one melanoma image data set and one lung cancer microarray data set, show that our algorithm greatly improves the feature rankings provided by the methods information gain, ReliefF and Student's t-test in terms of predictive power.
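The general idea of demoting redundant features in an existing ranking can be sketched as follows. This is a deliberately simplified illustration and not the published RD algorithm: it treats a feature as redundant purely on the basis of Pearson correlation with a higher-ranked feature (the abstract additionally requires that the feature does not improve discriminatory ability), and it moves redundant features to the end of the ranking rather than to the first position at which they are no longer redundant. The threshold of 0.9 and all function names are assumptions.

```python
def pearson(x, y):
    """Pearson correlation coefficient of two equally long value lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0  # a constant feature is treated as uncorrelated
    return cov / (vx * vy) ** 0.5


def demote_redundant(ranking, data, threshold=0.9):
    """Return a new ranking in which correlated features are demoted.

    ranking: feature names, best first (from any ranking method).
    data: dict mapping feature name to its list of values.
    A feature is demoted if |r| >= threshold with any feature kept so far.
    """
    kept, demoted = [], []
    for name in ranking:
        if any(abs(pearson(data[name], data[k])) >= threshold for k in kept):
            demoted.append(name)
        else:
            kept.append(name)
    return kept + demoted
```

Because the input is just an ordered list of names, the same post-processing step can follow any ranking method, which mirrors the "arbitrary feature ranking as input" property stated in the abstract.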
|
Journal Article |
16 |
14 |
11
|
Kusonmano K, Netzer M, Baumgartner C, Dehmer M, Liedl KR, Graber A. Effects of pooling samples on the performance of classification algorithms: a comparative study. ScientificWorldJournal 2012; 2012:278352. [PMID: 22654582 PMCID: PMC3361225 DOI: 10.1100/2012/278352] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.9] [Received: 12/18/2011] [Accepted: 01/10/2012] [Indexed: 12/19/2022]
Abstract
A pooling design can be used as a powerful strategy to compensate for limited amounts of samples or high biological variation. In this paper, we perform a comparative study to model and quantify the effects of virtual pooling on the performance of the widely applied classifiers, support vector machines (SVMs), random forest (RF), k-nearest neighbors (k-NN), penalized logistic regression (PLR), and prediction analysis for microarrays (PAMs). We evaluate a variety of experimental designs using mock omics datasets with varying levels of pool sizes and considering effects from feature selection. Our results show that feature selection significantly improves classifier performance for non-pooled and pooled data. All investigated classifiers yield lower misclassification rates with smaller pool sizes. RF mainly outperforms other investigated algorithms, while accuracy levels are comparable among all the remaining ones. Guidelines are derived to identify an optimal pooling scheme for obtaining adequate predictive power and, hence, to motivate a study design that meets best experimental objectives and budgetary conditions, including time constraints.
|
Research Support, Non-U.S. Gov't |
13 |
12 |
12
|
Mueller LAJ, Kugler KG, Netzer M, Graber A, Dehmer M. A network-based approach to classify the three domains of life. Biol Direct 2011; 6:53. [PMID: 21995640 PMCID: PMC3226542 DOI: 10.1186/1745-6150-6-53] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.7] [Received: 05/27/2011] [Accepted: 10/13/2011] [Indexed: 11/22/2022]
Abstract
BACKGROUND: Identifying group-specific characteristics in metabolic networks can provide better insight into evolutionary developments. Here, we present an approach to classify the three domains of life using topological information about the underlying metabolic networks. These networks have been shown to share domain-independent structural similarities, which pose a special challenge for our endeavour. We quantify specific structural information by using topological network descriptors to classify this set of metabolic networks. Such measures quantify the structural complexity of the underlying networks. In this study, we use such measures to capture domain-specific structural features of the metabolic networks to classify the data set. So far, it has been a challenging undertaking to examine what kind of structural complexity such measures do detect. In this paper, we apply two groups of topological network descriptors to metabolic networks and evaluate their classification performance. Moreover, we combine the two groups to perform a feature selection to estimate the structural features with the highest classification ability in order to optimize the classification performance. RESULTS: By combining the two groups, we can identify seven topological network descriptors that show a group-specific characteristic by ANOVA. A multivariate analysis using feature selection and supervised machine learning leads to a reasonable classification performance with a weighted F-score of 83.7% and an accuracy of 83.9%. We further demonstrate that our approach outperforms alternative methods. Also, our results reveal that entropy-based descriptors show the highest classification ability for this set of networks. CONCLUSIONS: Our results show that these particular topological network descriptors are able to capture domain-specific structural characteristics for classifying metabolic networks between the three domains of life.
|
Research Support, Non-U.S. Gov't |
14 |
10 |
13
|
Netzer M, Kugler KG, Müller LAJ, Weinberger KM, Graber A, Baumgartner C, Dehmer M. A network-based feature selection approach to identify metabolic signatures in disease. J Theor Biol 2012; 310:216-22. [PMID: 22771628 DOI: 10.1016/j.jtbi.2012.06.003] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.7] [Received: 08/31/2011] [Revised: 04/16/2012] [Accepted: 06/03/2012] [Indexed: 12/17/2022]
Abstract
The identification and interpretation of metabolic biomarkers is a challenging task. In this context, network-based approaches have increasingly become a key technology in systems biology, making it possible to capture complex interactions in biological systems. In this work, we introduce a novel network-based method to identify highly predictive biomarker candidates for disease. First, we infer two different types of networks: (i) correlation networks, and (ii) a new type of network called ratio networks. Based on these networks, we introduce scores to prioritize features using topological descriptors of the vertices. To evaluate our method, we use an example dataset in which quantitative targeted MS/MS analysis was applied to a total of 52 blood samples from 22 persons with obesity (BMI >30) and 30 healthy controls. Using our network-based feature selection approach, we identified highly discriminating metabolites for obesity (F-score >0.85, accuracy >85%), some of which could be confirmed in the literature.
|
Research Support, Non-U.S. Gov't |
13 |
9 |
14
|
Abstract
Nephrocalcin, an acidic glycoprotein that inhibits calcium oxalate crystal growth, has previously been localized in the proximal tubules of kidneys by an immunohistochemical staining method and purified from the tissue culture media of 2 renal carcinoma cell lines. A polyclonal antibody specific to nephrocalcin was raised in rabbits, and the level of nephrocalcin was quantitatively determined in the urine of 19 renal cell carcinoma patients (0.241 ± 0.341 µg nephrocalcin per mg creatinine) and compared to that of healthy controls (0.022 ± 0.012 µg nephrocalcin per mg creatinine). Nephrocalcin levels after tumor nephrectomy decreased dramatically in 5 patients and to a lesser degree in 7. A specific nephrocalcin fraction that was eluted from an anion exchange column at low ionic strength was detected in the urine of the renal cell carcinoma patients, and this fraction decreased or disappeared after tumor nephrectomy in 6 of 9 patients studied. Amino acid composition, phosphate content and dissociation constants toward calcium oxalate monohydrate crystals were investigated in the nephrocalcin from tumor patients and compared to those from healthy controls. Our studies demonstrate that nephrocalcin in patients with renal cell carcinoma is atypical and usually present in much higher quantity. Further studies are needed to determine the clinical significance of these observations.
|
|
31 |
5 |
15
|
Mattes J, Chemelli A, Wick M, Soimu D, Pontow C, Lopez A, Netzer M, Chemelli-Steingruber IE. Evaluation of a new computerized analysis system developed for the processing of CT follow-up scans after EVR of infrarenal aneurysm. Eur J Radiol 2011; 81:496-501. [PMID: 21300491 DOI: 10.1016/j.ejrad.2010.12.070] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Received: 07/07/2010] [Revised: 12/27/2010] [Accepted: 12/28/2010] [Indexed: 11/24/2022]
Abstract
PURPOSE: The aim of this retrospective study was to present a new computerized analysis system developed for the evaluation of follow-up CT scans after endovascular repair (EVR) of infrarenal aneurysm and to compare it to the conventional evaluation method with regard to precision and ease of application. The system is based on extracting the surface of the stent-graft (SG) and that of the spinal canal and overlaying surfaces obtained at different points in time. MATERIALS AND METHODS: A total of 116 CT follow-up data sets obtained from 49 patients after EVR of infrarenal aneurysm were evaluated using both the conventional method and the new computerized system. Two parameters were analyzed: SG length and the distance between the most ventral point of the SG and the vertebral column. The correlation between the results of the two methods and the correlation between the results obtained by two independent observers (radiologist and lay person) using the new system were assessed by statistical analysis. RESULTS: Comparison of the two methods yielded a very high correlation for both parameters (correlation coefficients of around 0.9 and p<0.001). Comparison of the results obtained by the two observers yielded an equally high correlation (correlation coefficients of around 0.9 and p<0.001). CONCLUSION: Our results show that the new computerized system is as precise and reliable as the conventional method, but allows better visualization and quantification of SG changes by surface overlay. Moreover, it is easier to apply, less time-consuming and can be easily integrated into existing systems.
|
Journal Article |
14 |
4 |
16
|
Hanser F, Seger M, Netzer M, Osl M, Modre-Osprian R, Schreier G, Baumgartner C, Pfeifer B, Wurz M. An Epidemiological Modeling and Data Integration Framework. Methods Inf Med 2018; 49:290-6. [DOI: 10.3414/me09-02-0025] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 09/11/2009] [Accepted: 03/08/2010] [Indexed: 11/09/2022]
Abstract
Summary
Objectives: In this work, we propose a cellular automaton software package for simulating different infectious diseases, storing the simulation results in a data warehouse system and analyzing the obtained results to generate prediction models as well as contingency plans. The Brisbane H3N2 flu virus, which spread during the 2009 winter season, was used for simulation in the federal state of Tyrol, Austria.
Methods: The simulation-modeling framework consists of an underlying cellular automaton. The cellular automaton model is parameterized by known disease parameters, and geographical as well as demographic conditions are included for simulating the spread. The data generated by simulation are stored in the back room of the data warehouse using the Talend Open Studio software package, and subsequent statistical and data mining tasks are performed using a tool termed Knowledge Discovery in Database Designer (KD3).
Results: The obtained simulation results were used for generating prediction models for all nine federal states of Austria.
Conclusion: The proposed framework provides a powerful and easy-to-handle interface for parameterizing and simulating different infectious diseases in order to generate prediction models and improve contingency plans for future events.
|
|
7 |
2 |
17
|
Netzer M, Hackl WO, Schaller M, Alber L, Marksteiner J, Ammenwerth E. Evaluating Performance and Interpretability of Machine Learning Methods for Predicting Delirium in Gerontopsychiatric Patients. Stud Health Technol Inform 2020; 271:121-128. [PMID: 32578554 DOI: 10.3233/shti200087] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Indexed: 11/15/2022]
Abstract
Delirium is an acute mental disturbance that occurs particularly during hospital stays. Current clinical assessment instruments include the Delirium Observation Screening Scale (DOSS) and the Confusion Assessment Method (CAM). The aim of this work is to analyze the performance of machine learning approaches in detecting delirium based on DOSS and CAM information obtained from two geropsychiatric wards in Tyrol. From a machine learning perspective, the questions of these two assessment instruments represent the features, and the ICD-10 diagnosis of delirium (yes/no) is the corresponding class variable. We compare seven popular classification methods and analyze the performance and interpretability of the learning models. As our dataset is highly imbalanced, we also evaluate the effect of common sampling methods, including down- and up-sampling as well as hybrid methods. Our results indicate a high predictive ability of advanced methods such as Random Forest that can handle even unbalanced datasets. Overall, combining good performance of a prediction model with the ability of users to understand the prediction is challenging. However, for clinical application in fully electronic settings, good performance seems to be more important than easy interpretation of the prediction by the user. On the other hand, explanations of decisions are often needed to assess other criteria such as safety.
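The down- and up-sampling mentioned above can be illustrated with a minimal sketch of random resampling. It assumes that records and class labels are given as two parallel lists; the function names are hypothetical, and the specific sampling and hybrid methods evaluated in the study are not stated in the abstract and are not reproduced here.

```python
import random


def _group_by_class(records, labels):
    """Collect records per class label, preserving input order."""
    groups = {}
    for record, label in zip(records, labels):
        groups.setdefault(label, []).append(record)
    return groups


def down_sample(records, labels, rng):
    """Randomly drop records so every class has the minority-class size."""
    groups = _group_by_class(records, labels)
    n = min(len(recs) for recs in groups.values())
    return [(rec, lab) for lab, recs in groups.items() for rec in rng.sample(recs, n)]


def up_sample(records, labels, rng):
    """Randomly duplicate records so every class has the majority-class size."""
    groups = _group_by_class(records, labels)
    n = max(len(recs) for recs in groups.values())
    balanced = []
    for lab, recs in groups.items():
        extra = [rng.choice(recs) for _ in range(n - len(recs))]
        balanced.extend((rec, lab) for rec in recs + extra)
    return balanced
```

Resampling of this kind should be applied to the training portion only; balancing the test data as well would distort the performance estimate for the imbalanced population.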
|
Journal Article |
5 |
2 |
18
|
Visvanathan M, Netzer M, Seger M, Adagarla BS, Baumgartner C, Sittampalam S, Lushington GH. Oncogenes and pathway identification using filter-based approaches between various carcinoma types in lung. International Journal of Computational Biology and Drug Design 2009; 2:236-51. [PMID: 20090162 PMCID: PMC2825752 DOI: 10.1504/ijcbdd.2009.030115] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 01/09/2023]
Abstract
Lung cancer accounts for the most cancer-related deaths. The identification of cancer-associated genes and the related pathways is essential to prevent many types of cancer. In this paper, a more systematic approach is considered. First, we performed pathway analysis using the hypergeometric distribution (HGD) and identified significantly overrepresented sets of reactions. Second, we used feature selection based on Particle Swarm Optimisation (PSO), Information Gain (IG) and the Biomarker Identifier (BMI) to identify different types of lung cancer. We also evaluated PSO and developed a new method to determine the BMI thresholds to prioritize genes. We were able to identify sets of key genes that can be found in several pathways. Experimental results show that our method simplifies features effectively and obtains higher classification accuracy than the other methods from the literature.
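Pathway over-representation with the hypergeometric distribution can be sketched as follows. This is a generic illustration of the test, not the authors' implementation; the function name `hypergeom_sf` and all gene counts are hypothetical.

```python
from math import comb


def hypergeom_sf(k, N, K, n):
    """Return P(X >= k) for X ~ Hypergeometric(N, K, n).

    N: genes in the background set
    K: background genes annotated to the pathway
    n: genes in the selected (e.g. differentially expressed) list
    k: selected genes that fall into the pathway
    """
    upper = min(K, n)
    favourable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favourable / comb(N, n)


# Toy numbers (illustrative only): 20 background genes, 5 in the pathway,
# 4 selected genes, 2 of which belong to the pathway.
p_value = hypergeom_sf(k=2, N=20, K=5, n=4)
p_at_least_zero = hypergeom_sf(k=0, N=20, K=5, n=4)
```

A small `p_value` suggests that the pathway contains more selected genes than expected by chance. When many pathways are tested, a multiple-testing correction would usually be applied on top of this.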
|
Research Support, N.I.H., Extramural |
16 |
1 |
19
|
Netzer M, Hanser F, Ledochowski M, Baumgarten D. Supervised Machine Learning for Predicting Carbohydrate Malabsorptions Using Hydrogen Breath Tests. Current Directions in Biomedical Engineering 2022. [DOI: 10.1515/cdbme-2022-1073] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/15/2022] Open
Abstract
Introduction: Symptoms of carbohydrate malabsorption include intestinal fluid retention, causing diarrhea and abdominal distention. The aim of this work is to create a machine learning model that predicts carbohydrate malabsorption using H2 measurements from lactose and fructose tolerance tests. Methods: We compare the predictive ability of popular classifiers with classifiers that are specifically designed for time series data. Our approach was implemented using the sklearn and sktime Python machine learning libraries. Results: The highest predictive ability for the fructose dataset was achieved using a Random Forest Classifier (balanced accuracy = 0.91). In contrast, the highest predictive ability for the lactose dataset (balanced accuracy = 0.81) was obtained using an IndividualTDE time series classifier. Conclusion: Our results indicate a high predictive ability for distinguishing between carbohydrate malabsorptions. The detection of small intestinal bacterial overgrowth (SIBO) remains challenging, but adapted time series classifiers could reach higher performance than standard methods. Our results could establish the basis of an expert system for diagnosing carbohydrate malabsorptions and SIBO, respectively.
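Balanced accuracy, the metric reported in this abstract, is the mean of the per-class recalls. The sketch below computes it with the standard library only, so that it runs without sklearn; the labels and predictions are toy values and not data from the study.

```python
def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall over all classes present in y_true."""
    recalls = []
    for cls in sorted(set(y_true)):
        idx = [i for i, t in enumerate(y_true) if t == cls]
        hits = sum(1 for i in idx if y_pred[i] == cls)
        recalls.append(hits / len(idx))
    return sum(recalls) / len(recalls)


# Toy example (illustrative only): 8 positives, 2 negatives,
# with one of the two negatives misclassified as positive.
y_true = [1] * 8 + [0] * 2
y_pred = [1] * 8 + [1, 0]

plain_accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
bal_accuracy = balanced_accuracy(y_true, y_pred)
```

Here plain accuracy is 0.9 while balanced accuracy is 0.75, which shows why the latter is the more informative figure when classes are unevenly represented. In sklearn the same definition is available as `sklearn.metrics.balanced_accuracy_score`.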
|
|
3 |
|
20
|
Hackl WO, Netzer M, Nantschev R, Schaller M, Ammenwerth E. Visual Analytics in Delirium Management. Stud Health Technol Inform 2021; 279:147-148. [PMID: 33965932 DOI: 10.3233/shti210102] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/12/2023]
Abstract
BACKGROUND Delirium is a patient safety issue that often occurs within the population of elderly people. As delirium may be characterized by fluctuating progress, the aim of this work is to find methods to visualize the occurrence of delirium over time in different patient stays in gerontopsychiatric settings. METHODS We analyzed current data mining visualization techniques for clinical research using a delirium data set collected in a gerontopsychiatric setting. RESULTS We identified heatmaps and dendrograms resulting from hierarchical clustering as a suitable visualization method. CONCLUSION Heatmaps with hierarchical clustering are a suitable data mining tool or visualization technique to study delirium cases in the time course of patient stays.
|
|
4 |
|
21
|
Siemer S, Graf N, Netzer M, Steffens J. Nebennierenrindenkarzinome im Kindesalter [Adrenocortical carcinomas in childhood]. Aktuelle Urol 2008. [DOI: 10.1055/s-2008-1057832] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/21/2022]
|
|
17 |
|
22
|
Heitmann J, Grote L, Netzer M, Krzyzanek E, Ploch T, Peter JH. [Use of discontinuous long-term blood pressure measurement (Spacelabs 90207) in patients with sleep apnea--a comparison with intra-arterial data]. Pneumologie 1997; 51 Suppl 3:747-9. [PMID: 9340631] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/05/2023]
Abstract
Circadian blood pressure (BP) profile is an important determinant of cardiovascular risk. Patients with Sleep-Related Breathing Disorders (SRBD) often suffer from arterial hypertension and an altered circadian blood pressure profile. The aim of this study was to compare intraarterial blood pressure recordings with DBPR (Discontinuous Blood Pressure Recorder using Spacelabs 90207) in 20 patients with mild to moderate arterial hypertension and mild to moderate SRBD. Our results of overnight measurements show that the mean arterial pressure seems to be the most reliable value (mean difference Spacelabs-Part: +1.7 mmHg). In contrast, the systolic BP was systematically underestimated (mean: -17.5 mmHg) while the diastolic value was systematically overestimated (mean: +9.3 mmHg) by Spacelabs. Our data show that neither systolic nor diastolic BP from the Spacelabs measurement reflected the real cardiovascular load. Since the mean arterial pressure proved to be the most reliable value, this value should be used to distinguish between dipper and non-dipper in circadian BP profiles. Further, reliable noninvasive continuous measurement of the BP is required to assess the real vascular load in patients with SRBD during the night.
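The "mean difference" figures in this abstract are the average of paired differences between the noninvasive and the intra-arterial reading. A minimal sketch of that calculation follows; the function name `mean_difference` and all pressure values are hypothetical and are not the study's measurements.

```python
from statistics import mean, stdev


def mean_difference(noninvasive, invasive):
    """Return (mean, sample standard deviation) of the paired differences.

    A negative mean indicates that the noninvasive device reads lower
    than the invasive reference on average (systematic underestimation).
    """
    diffs = [a - b for a, b in zip(noninvasive, invasive)]
    return mean(diffs), stdev(diffs)


# Toy systolic readings in mmHg (illustrative only).
cuff = [120.0, 130.0, 125.0, 135.0]
arterial = [130.0, 142.0, 133.0, 145.0]

bias, spread = mean_difference(cuff, arterial)
```

Reporting the spread alongside the bias helps to distinguish a constant offset, which could in principle be corrected, from scatter that cannot.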
|
Comparative Study |
28 |
|
23
|
Ammenwerth E, Netzer M, Hackl WO. Learning Analytics and the Community of Inquiry: Indicators to Analyze and Visualize Online-Based Learning. Stud Health Technol Inform 2020; 271:67-68. [PMID: 32578543 DOI: 10.3233/shti200076] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/11/2023]
Abstract
BACKGROUND The Community of Inquiry (CoI) describes success factors for online-based learning. OBJECTIVES To develop approaches for automatic analysis of CoI to be visualized within student and teacher dashboards. METHODS Extending indicators from social network analysis and linguistics; evaluation within a case study. RESULTS The project is just starting. CONCLUSION Results will help to better understand and improve cooperative online-based learning in higher education.
|
|
5 |
|
24
|
Dornauer V, Netzer M, Kaczkó É, Norz LM, Ammenwerth E. Automatic Classification of Online Discussions and Other Learning Traces to Detect Cognitive Presence. International Journal of Artificial Intelligence in Education 2023; 34:395-415. [PMID: 38827645 PMCID: PMC11139697 DOI: 10.1007/s40593-023-00335-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 03/20/2023] [Indexed: 06/04/2024]
Abstract
Cognitive presence is a core construct of the Community of Inquiry (CoI) framework. It is considered crucial for deep and meaningful online-based learning. CoI-based real-time dashboards visualizing students' cognitive presence may help instructors to monitor and support students' learning progress. Such real-time classifiers are often based on the linguistic analysis of the content of posts made by students. It is unclear whether these classifiers could be improved by considering other learning traces, such as files attached to students' posts. We aimed to develop a German-language cognitive presence classifier that includes linguistic analysis using the Linguistic Inquiry and Word Count (LIWC) tool and other learning traces based on 1,521 manually coded meaningful units from an online-based university course. As learning traces, we included not only the linguistic features from the LIWC tool, but also features such as attaching files to a post, tagging, or using terms from the course glossary. We used the k-nearest neighbor method, a random forest model, and a multilayer perceptron as classifiers. The results showed an accuracy of up to 82% and a Cohen's κ of 0.76 for the cognitive presence classifier for German posts. Including learning traces did not improve the predictive ability. In conclusion, we developed an automatic classifier for German-language courses based on a linguistic analysis of students' posts. This classifier is a step toward a teacher dashboard. Our work also provides the first fully CoI-coded German dataset for future research on cognitive presence.
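Cohen's κ, reported next to accuracy in this abstract, corrects the observed agreement for the agreement expected by chance. The sketch below implements the standard two-rater formula with the standard library; the labels are toy values and do not come from the coded dataset.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa: (p_o - p_e) / (1 - p_e) for two label sequences."""
    n = len(labels_a)
    # Observed agreement: share of items with identical labels.
    p_o = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Expected agreement: product of marginal label frequencies, summed over labels.
    count_a, count_b = Counter(labels_a), Counter(labels_b)
    p_e = sum(count_a[c] * count_b[c] for c in set(labels_a) | set(labels_b)) / (n * n)
    return (p_o - p_e) / (1 - p_e)


# Toy example (illustrative only): manual coding versus classifier output.
manual = ["A", "A", "B", "B"]
predicted = ["A", "A", "B", "A"]

kappa = cohens_kappa(manual, predicted)
```

In this toy case the raw agreement is 0.75, but half of that is expected by chance, so κ is 0.5. This is why κ is usually lower than accuracy on the same predictions.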
|
research-article |
2 |
|
25
|
Netzer M, Hanser F, Breit M, Weinberger KM, Baumgartner C, Baumgarten D. Ensemble Based Approach for Time Series Classification in Metabolomics. Stud Health Technol Inform 2019; 260:89-96. [PMID: 31118323] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/09/2023]
Abstract
BACKGROUND Machine learning is an important application in the area of health informatics; however, classification methods for longitudinal data are still rare. OBJECTIVES The aim of this work is to analyze and classify differences in metabolite time series data between groups of individuals regarding their athletic activity. METHODS We propose a new ensemble-based 2-tier approach to classify metabolite time series data. The first tier uses polynomial fitting to generate a class prediction for each metabolite. An induced classifier (k-nearest-neighbor or naïve Bayes) combines the results to produce a final prediction. Metabolite levels of 47 individuals undergoing a cycle ergometry test were measured using mass spectrometry. RESULTS In accordance with our previous work, the statistical results indicate strong changes over time. We found only small but systematic differences between the groups. However, our proposed stacking approach obtained a mean accuracy of 78% using 10-fold cross-validation. CONCLUSION Our proposed classification approach achieves considerable classification performance for time series data with small differences between the groups.
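The 2-tier idea in this abstract can be sketched roughly as follows. Several simplifications are assumptions made for brevity and are not the authors' method: tier 1 fits a degree-1 polynomial per class and metabolite (the abstract does not state the degree), tier 2 is a plain majority vote instead of the k-nearest-neighbor or naïve Bayes combiner, and the metabolite names, group labels and values are hypothetical.

```python
from collections import Counter


def fit_line(ts, ys):
    """Least-squares fit of y = a + b * t; returns (a, b)."""
    n = len(ts)
    t_mean = sum(ts) / n
    y_mean = sum(ys) / n
    sxx = sum((t - t_mean) ** 2 for t in ts)
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in zip(ts, ys))
    b = sxy / sxx
    return y_mean - b * t_mean, b


def tier1_fit(times, curves_by_class):
    """Fit one line per class from all training curves of one metabolite."""
    models = {}
    for label, curves in curves_by_class.items():
        ts = [t for _ in curves for t in times]
        ys = [y for curve in curves for y in curve]
        models[label] = fit_line(ts, ys)
    return models


def tier1_predict(models, times, curve):
    """Choose the class whose fitted line has the smallest squared error."""
    def sse(model):
        a, b = model
        return sum((y - (a + b * t)) ** 2 for t, y in zip(times, curve))
    return min(models, key=lambda label: sse(models[label]))


def tier2_vote(votes):
    """Combine per-metabolite predictions by majority vote (simplified tier 2)."""
    return Counter(votes).most_common(1)[0][0]


times = [0, 1, 2, 3]
train = {  # toy training curves per metabolite and group (illustrative only)
    "metabolite_1": {"active": [[1, 2, 3, 4], [1.2, 2.1, 3.1, 4.2]],
                     "inactive": [[1, 1.5, 2, 2.5], [0.9, 1.4, 2.1, 2.4]]},
    "metabolite_2": {"active": [[5, 4, 3, 2]], "inactive": [[5, 4.5, 4, 3.5]]},
    "metabolite_3": {"active": [[2, 2, 2, 2]], "inactive": [[3, 3, 3, 3]]},
}
new_sample = {
    "metabolite_1": [1.1, 2.0, 3.2, 4.0],
    "metabolite_2": [5, 4.1, 3.1, 2.0],
    "metabolite_3": [2.9, 3.0, 3.1, 3.0],
}

votes = [tier1_predict(tier1_fit(times, train[m]), times, new_sample[m]) for m in train]
prediction = tier2_vote(votes)
```

Replacing `tier2_vote` with a classifier trained on the vector of tier-1 predictions would bring the sketch closer to the stacking design described in the abstract.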
|
|
6 |
|