1
Optimizing Crowdsourced Land Use and Land Cover Data Collection: A Two-Stage Approach. LAND 2022. [DOI: 10.3390/land11070958] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 02/04/2023]
Abstract
Citizen science has become an increasingly popular approach to scientific data collection. One prominent area of application is classification tasks involving visual interpretation of images, e.g., to support the production of land cover and land-use maps. Achieving a minimum accuracy in these classification tasks at a minimum cost is the subject of this study. A Bayesian approach provides an intuitive and reasonably straightforward solution to achieve this objective. However, its application requires additional information, such as the relative frequency of the classes and the accuracy of each user. While the former is often available, the latter requires additional data collection. In this paper, we present a two-stage approach to gathering this additional information. We demonstrate its application using a hypothetical two-class example and then apply it to an actual crowdsourced dataset with five classes, which was taken from a previous Geo-Wiki crowdsourcing campaign on identifying the size of agricultural fields from very high-resolution satellite imagery. We also attach the R code for the implementation of the newly presented approach.
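The Bayesian idea in this abstract can be illustrated with a minimal Python sketch. It is not the authors' R implementation. It assumes that users vote independently, that each user has a single known accuracy, and that a wrong vote is spread uniformly over the remaining classes. The function names `posterior` and `collect_until_confident` and all numbers are hypothetical.

```python
from math import prod


def posterior(votes, accuracies, priors):
    """Posterior class probabilities given independent user votes.

    votes      -- class index chosen by each user
    accuracies -- probability that each user labels correctly (assumed known)
    priors     -- relative frequency of each class
    Assumption: a wrong vote is uniform over the other classes.
    """
    k = len(priors)
    unnormalized = []
    for c in range(k):
        likelihood = prod(
            a if v == c else (1 - a) / (k - 1)
            for v, a in zip(votes, accuracies)
        )
        unnormalized.append(priors[c] * likelihood)
    total = sum(unnormalized)
    return [u / total for u in unnormalized]


def collect_until_confident(vote_stream, priors, threshold=0.95):
    """Request votes one at a time and stop once one class is probable enough."""
    votes, accuracies = [], []
    post = list(priors)
    for vote, accuracy in vote_stream:
        votes.append(vote)
        accuracies.append(accuracy)
        post = posterior(votes, accuracies, priors)
        if max(post) >= threshold:
            break
    return post, len(votes)


# Hypothetical two-class example: class 1 is rarer, three users of varying accuracy.
priors = [0.7, 0.3]
stream = [(1, 0.9), (1, 0.8), (0, 0.6)]
post, used = collect_until_confident(stream, priors, threshold=0.9)
print(post, used)
```

Stopping as soon as the posterior crosses the threshold is one simple way to trade accuracy against the number of votes requested; other stopping rules are possible.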
2
Miettinen T, Mäntyselkä P, Hagelberg N, Mustola S, Kalso E, Lötsch J. Machine learning suggests sleep as a core factor in chronic pain. Pain 2021; 162:109-123. [PMID: 32694382 DOI: 10.1097/j.pain.0000000000002002] [Citation(s) in RCA: 20] [Impact Index Per Article: 6.7] [Indexed: 12/26/2022]
Abstract
Patients with chronic pain have complex pain profiles and associated problems. Subgroup analysis can help identify key problems. We used a data-based approach to define pain phenotypes and their most relevant associated problems in 320 patients undergoing tertiary pain management. Unsupervised machine learning analysis of parameters "pain intensity," "number of pain areas," "pain duration," "activity pain interference," and "affective pain interference," implemented as emergent self-organizing maps, identified 3 patient phenotype clusters. Supervised analyses, implemented as different types of decision rules, identified "affective pain interference" and the "number of pain areas" as most relevant for cluster assignment. These appeared 698 and 637 times, respectively, in 1000 cross-validation runs among the most relevant characteristics in an item categorization approach in a computed ABC analysis. Cluster assignment was achieved with a median balanced accuracy of 79.9%, a sensitivity of 74.1%, and a specificity of 87.7%. In addition, among 59 demographic, pain etiology, comorbidity, lifestyle, psychological, and treatment-related variables, sleep problems appeared 638 and 439 times among the most important characteristics in 1000 cross-validation runs where patients were assigned to the 2 extreme pain phenotype clusters. Also important were the parameters "fear of pain," "self-rated poor health," and "systolic blood pressure." Decision trees trained with this information assigned patients to the extreme pain phenotype with an accuracy of 67%. Machine learning suggested sleep problems as key factors in the most difficult pain presentations, therefore deserving priority in the treatment of chronic pain.
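For readers unfamiliar with the reported metrics, the following Python sketch shows how sensitivity, specificity, balanced accuracy and plain accuracy relate for a two-class confusion matrix. The counts are hypothetical and are not taken from the study, which works with three clusters and reports medians over cross-validation runs. The function name `diagnostic_metrics` is an assumption for illustration.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Standard two-class metrics from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)  # true positive rate
    specificity = tn / (tn + fp)  # true negative rate
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "balanced_accuracy": (sensitivity + specificity) / 2,
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
    }


# Hypothetical counts, for illustration only.
metrics = diagnostic_metrics(tp=45, fn=5, tn=40, fp=10)
print(metrics)
```

Within a single run, balanced accuracy is the mean of sensitivity and specificity. A median balanced accuracy across many runs need not equal the mean of the median sensitivity and the median specificity, because medians do not combine that way.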
Affiliation(s)
- Teemu Miettinen
- Pain Clinic, Department of Anesthesiology, Intensive Care, and Pain Medicine, University of Helsinki, Helsinki University Central Hospital, Helsinki, Finland
- Pekka Mäntyselkä
- Institute of Public Health and Clinical Nutrition, University of Eastern Finland, Kuopio, Finland and Primary Health Care Unit, Kuopio University Hospital, Kuopio, Finland
- Seppo Mustola
- Department of Anesthesia, Intensive Care, and Pain, South Karelia Central Hospital, Lappeenranta, Finland
- Eija Kalso
- Pain Clinic, Department of Anesthesiology, Intensive Care, and Pain Medicine, University of Helsinki, Helsinki University Central Hospital, Helsinki, Finland
- Sleepwell Research Programme, University of Helsinki, Helsinki, Finland
- Jörn Lötsch
- Institute of Clinical Pharmacology, Goethe-University, Frankfurt am Main, Germany
- Fraunhofer Institute for Molecular Biology and Applied Ecology IME, Project Group Translational Medicine and Pharmacology TMP, Frankfurt am Main, Germany
3
Lausser L, Schäfer LM, Kühlwein SD, Kestler AMR, Kestler HA. Detecting Ordinal Subcascades. Neural Process Lett 2020. [DOI: 10.1007/s11063-020-10362-0] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Indexed: 10/23/2022]
Abstract
Ordinal classifier cascades are constrained by a hypothesised order of the semantic class labels of a dataset. This order determines the overall structure of the decision regions in feature space. Assuming the correct order on these class labels will allow a high generalisation performance, while an incorrect one will lead to diminished results. In this way, ordinal classifier systems can facilitate explorative data analysis by making it possible to screen for potential candidate orders of the class labels. Previously, we have shown that screening is possible for total orders of all class labels. However, as datasets might comprise samples of ordinal as well as non-ordinal classes, the assumption of a total ordering might not be appropriate. An analysis of subsets of classes is required to detect such hidden ordinal substructures. In this work, we devise a novel screening procedure for exhaustive evaluations of all order permutations of all subsets of classes by bounding the number of enumerations we have to examine. Experiments with multi-class data from diverse applications revealed ordinal substructures that generate new and support known relations.
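To give a sense of the search space, the sketch below enumerates every ordered arrangement of every subset of class labels with plain Python. It is a brute-force illustration only and does not reproduce the bounding scheme of the paper. The minimum subset size of 2 and the names `candidate_orders` and `count_candidate_orders` are assumptions.

```python
from itertools import permutations
from math import perm


def candidate_orders(classes, min_size=2):
    """Yield every ordered arrangement of every subset with at least min_size classes."""
    for size in range(min_size, len(classes) + 1):
        # permutations(classes, size) already covers all subsets of that size in all orders
        yield from permutations(classes, size)


def count_candidate_orders(n, min_size=2):
    """Closed-form count: sum over k of n! / (n - k)!."""
    return sum(perm(n, k) for k in range(min_size, n + 1))


labels = ["a", "b", "c", "d"]  # hypothetical class labels
orders = list(candidate_orders(labels))
print(len(orders), count_candidate_orders(len(labels)))
```

The count grows factorially with the number of classes, which is why an exhaustive screen needs some way of discarding candidates early.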
4
Völkel G, Laban S, Fürstberger A, Kühlwein SD, Ikonomi N, Hoffmann TK, Brunner C, Neuberg DS, Gaidzik V, Döhner H, Kraus JM, Kestler HA. Analysis, identification and visualization of subgroups in genomics. Brief Bioinform 2020; 22:5909009. [PMID: 32954413 PMCID: PMC8138884 DOI: 10.1093/bib/bbaa217] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 06/04/2020] [Revised: 08/14/2020] [Accepted: 08/17/2020] [Indexed: 12/22/2022]
Abstract
Motivation: Cancer is a complex and heterogeneous disease involving multiple somatic mutations that accumulate during its progression. In the past years, the wide availability of genomic data from patients’ samples opened new perspectives in the analysis of gene mutations and alterations. Hence, visualizing and further identifying genes mutated in massive sets of patients is nowadays a critical task that sheds light on more personalized intervention approaches.
Results: Here, we extensively review existing tools for visualization and analysis of alteration data. We compare different approaches to study mutual exclusivity and sample coverage in large-scale omics data. We complement our review with the standalone software AVAtar (‘analysis and visualization of alteration data’) that integrates diverse aspects known from different tools into a comprehensive platform. AVAtar supplements customizable alteration plots by a multi-objective evolutionary algorithm for subset identification and provides an innovative and user-friendly interface for the evaluation of concurrent solutions. A use case from personalized medicine demonstrates its unique features showing an application on vaccination target selection.
Availability: AVAtar is available at: https://github.com/sysbio-bioinf/avatar
Contact: hans.kestler@uni-ulm.de, phone: +49 (0) 731 500 24 500, fax: +49 (0) 731 500 24 502
Affiliation(s)
- Thomas K Hoffmann
- Department of Otorhinolaryngology, Head and Neck Surgery, Ulm University Medical Center, Germany
- Cornelia Brunner
- Department of Otorhinolaryngology, Head and Neck Surgery, Ulm University Medical Center, Germany
- Donna S Neuberg
- Department of Biostatistics, Dana-Farber Cancer Institute, Boston, Massachusetts, USA
- Verena Gaidzik
- Department of Internal Medicine III, Ulm University Medical Center, Germany
- Hartmut Döhner
- Department of Internal Medicine III, Ulm University Medical Center, Germany
5
Lausser L, Szekely R, Klimmek A, Schmid F, Kestler HA. Constraining classifiers in molecular analysis: invariance and robustness. J R Soc Interface 2020; 17:20190612. [PMID: 32019472 PMCID: PMC7061712 DOI: 10.1098/rsif.2019.0612] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 01/03/2020] [Accepted: 01/09/2020] [Indexed: 12/02/2022]
Abstract
Analysing molecular profiles requires the selection of classification models that can cope with the high dimensionality and variability of these data. Also, improper reference point choice and scaling pose additional challenges. Often model selection is somewhat guided by ad hoc simulations rather than by sophisticated considerations on the properties of a categorization model. Here, we derive and report four linked linear concept classes/models with distinct invariance properties for high-dimensional molecular classification. We can further show that these concept classes also form a half-order of complexity classes in terms of Vapnik-Chervonenkis dimensions, which also implies increased generalization abilities. We implemented support vector machines with these properties. Surprisingly, we were able to attain comparable or even superior generalization abilities to the standard linear one on the 27 investigated RNA-Seq and microarray datasets. Our results indicate that a priori chosen invariant models can replace ad hoc robustness analysis by interpretable and theoretically guaranteed properties in molecular categorization.
Affiliation(s)
- Ludwig Lausser
- Institute of Medical Systems Biology, Ulm University, Ulm, Germany
- Robin Szekely
- Institute of Medical Systems Biology, Ulm University, Ulm, Germany
- Attila Klimmek
- Institute of Medical Systems Biology, Ulm University, Ulm, Germany
- Florian Schmid
- Institute of Medical Systems Biology, Ulm University, Ulm, Germany
- Hans A. Kestler
- Institute of Medical Systems Biology, Ulm University, Ulm, Germany
- Leibniz Institute on Aging, Jena, Germany
6
De Lellis P, Nakayama S, Porfiri M. Using demographics toward efficient data classification in citizen science: a Bayesian approach. PeerJ Comput Sci 2019; 5:e239. [PMID: 33816892 PMCID: PMC7924415 DOI: 10.7717/peerj-cs.239] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 06/07/2019] [Accepted: 10/26/2019] [Indexed: 06/12/2023]
Abstract
Public participation in scientific activities, often called citizen science, offers a possibility to collect and analyze an unprecedentedly large amount of data. However, the diversity of volunteers makes it challenging to obtain accurate information when these data are aggregated. To overcome this problem, we propose a classification algorithm using Bayesian inference that harnesses the diversity of volunteers to improve data accuracy. In the algorithm, each volunteer is grouped into a distinct class based on a survey regarding either their level of education or their motivation for citizen science. We obtained the behavior of each class through a training set, which was then used as prior information to estimate the performance of new volunteers. By applying this approach to an existing citizen science dataset to classify images into categories, we demonstrate an improvement in data accuracy compared to traditional majority voting. Our algorithm offers a simple, yet powerful, way to improve data accuracy under limited effort of volunteers by predicting the behavior of a class of individuals, rather than attempting a granular description of each of them.
Affiliation(s)
- Pietro De Lellis
- Department of Electrical Engineering and Information Technology, University of Naples Federico II, Naples, Italy
- Department of Mechanical and Aerospace Engineering, New York University Tandon School of Engineering, Brooklyn, NY, USA
- Shinnosuke Nakayama
- Department of Mechanical and Aerospace Engineering, New York University Tandon School of Engineering, Brooklyn, NY, USA
- Maurizio Porfiri
- Department of Mechanical and Aerospace Engineering, New York University Tandon School of Engineering, Brooklyn, NY, USA
- Department of Biomedical Engineering, New York University Tandon School of Engineering, Brooklyn, NY, USA
7
Abstract
Biological entities are key elements of biomedical research. Their definition and their relationships are important in areas such as phylogenetic reconstruction, developmental processes or tumor evolution. Hypotheses about relationships like phenotype order are often postulated based on prior knowledge or belief. Evidence on a molecular level is typically unknown and whether total orders are reflected in the molecular measurements is unclear or not assessed. In this work we propose a method that allows a fast and exhaustive screening for total orders in large datasets. We utilise ordinal classifier cascades to identify discriminable molecular representations of the phenotypes. These classifiers are constrained by an order hypothesis and are highly sensitive to incorrect assumptions. Two new error bounds, which are introduced and theoretically proven, lead to a substantial speed-up and allow the application to large collections of many phenotypes. In our experiments we show that by exhaustively evaluating all possible candidate orders, we are able to identify phenotype orders that best coincide with the high-dimensional molecular profiles.
8
Torre M, Nakayama S, Tolbert TJ, Porfiri M. Producing knowledge by admitting ignorance: Enhancing data quality through an "I don't know" option in citizen science. PLoS One 2019; 14:e0211907. [PMID: 30811452 PMCID: PMC6392254 DOI: 10.1371/journal.pone.0211907] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Received: 09/04/2018] [Accepted: 01/22/2019] [Indexed: 11/18/2022]
Abstract
The "noisy labeler problem" in crowdsourced data has attracted great attention in recent years, with important ramifications in citizen science, where non-experts must produce high-quality data. Particularly relevant to citizen science is dynamic task allocation, in which the level of agreement among labelers can be progressively updated through the information-theoretic notion of entropy. Under dynamic task allocation, we hypothesized that providing volunteers with an "I don't know" option would contribute to enhancing data quality, by introducing further, useful information about the level of agreement among volunteers. We investigated the influence of an "I don't know" option on the data quality in a citizen science project that entailed classifying the image of a highly polluted canal into "threat" or "no threat" to the environment. Our results show that an "I don't know" option can enhance accuracy, compared to the case without the option; such an improvement mostly affects the true negative rather than the true positive rate. In an information-theoretic sense, these seemingly meaningless blank votes constitute a meaningful piece of information to help enhance accuracy of data in citizen science.
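The entropy-based notion of agreement mentioned here can be sketched in a few lines of Python. This is a generic Shannon entropy over the observed vote shares, with "I don't know" treated as just another answer. The study's actual task-allocation rule is not reproduced, and the function name `vote_entropy` and the example votes are hypothetical.

```python
from collections import Counter
from math import log2


def vote_entropy(votes):
    """Shannon entropy (in bits) of the empirical distribution of votes."""
    counts = Counter(votes)
    n = len(votes)
    return -sum((c / n) * log2(c / n) for c in counts.values())


unanimous = ["threat"] * 4
split = ["threat", "no threat", "threat", "no threat"]
with_blank = ["threat", "no threat", "I don't know"]

for votes in (unanimous, split, with_blank):
    print(votes, round(vote_entropy(votes), 3))
```

Low entropy indicates agreement, so an image could be retired early, while high entropy suggests that more volunteers should see it. The sketch also shows how a blank vote shifts the measured level of agreement.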
Affiliation(s)
- Marina Torre
- Department of Mechanical and Aerospace Engineering, New York University Tandon School of Engineering, Brooklyn, New York, United States of America
- Shinnosuke Nakayama
- Department of Mechanical and Aerospace Engineering, New York University Tandon School of Engineering, Brooklyn, New York, United States of America
- Tyrone J. Tolbert
- Department of Mechanical and Aerospace Engineering, New York University Tandon School of Engineering, Brooklyn, New York, United States of America
- Maurizio Porfiri
- Department of Mechanical and Aerospace Engineering, New York University Tandon School of Engineering, Brooklyn, New York, United States of America
- Department of Biomedical Engineering, New York University Tandon School of Engineering, Brooklyn, New York, United States of America
9
Gress TM, Lausser L, Schirra LR, Ortmüller L, Diels R, Kong B, Michalski CW, Hackert T, Strobel O, Giese NA, Schenk M, Lawlor RT, Scarpa A, Kestler HA, Buchholz M. Combined microRNA and mRNA microfluidic TaqMan array cards for the diagnosis of malignancy of multiple types of pancreatico-biliary tumors in fine-needle aspiration material. Oncotarget 2017; 8:108223-108237. [PMID: 29296236 PMCID: PMC5746138 DOI: 10.18632/oncotarget.22601] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Received: 08/31/2017] [Accepted: 10/30/2017] [Indexed: 02/07/2023]
Abstract
Pancreatic ductal adenocarcinoma (PDAC) continues to carry the lowest survival rates among all solid tumors. A marked resistance against available therapies, late clinical presentation and insufficient means for early diagnosis contribute to the dismal prognosis. Novel biomarkers are thus required to aid treatment decisions and improve patient outcomes. We describe here a multi-omics molecular platform that allows, for the first time, simultaneous analysis of miRNA and mRNA expression patterns from minimal amounts of biopsy material on a single microfluidic TaqMan Array card. Expression profiles were generated from 113 prospectively collected fine needle aspiration biopsies (FNAB) from patients undergoing surgery for suspect masses in the pancreas. Molecular classifiers were constructed using support vector machines, and rigorously evaluated for diagnostic performance using 10×10-fold cross validation. The final combined miRNA/mRNA classifier demonstrated a sensitivity of 91.7%, a specificity of 94.5%, and an overall diagnostic accuracy of 93.0% for the differentiation between PDAC and benign pancreatic masses, clearly outperforming miRNA-only classifiers. The classification algorithm also performed very well in the diagnosis of other types of solid tumors (acinar cell carcinomas, ampullary cancer and distal bile duct carcinomas), but was less suited for the diagnostic analysis of cystic lesions. We thus demonstrate that simultaneous analysis of miRNA and mRNA biomarkers from FNAB samples using multi-omics TaqMan Array cards is suitable to differentiate suspect solid pancreatic masses with high precision.
Affiliation(s)
- Thomas M Gress
- Clinic for Gastroenterology, Endocrinology and Metabolism, University Hospital, Philipps-Universität Marburg, Marburg, Germany
- Ludwig Lausser
- Institute of Medical Systems Biology, University of Ulm, Ulm, Germany
- Lisa Ortmüller
- Clinic for Gastroenterology, Endocrinology and Metabolism, University Hospital, Philipps-Universität Marburg, Marburg, Germany
- Ramona Diels
- Clinic for Gastroenterology, Endocrinology and Metabolism, University Hospital, Philipps-Universität Marburg, Marburg, Germany
- Bo Kong
- Department of Surgery, Technical University of Munich, Munich, Germany
- Christoph W Michalski
- Department of Surgery, Technical University of Munich, Munich, Germany; Department of Surgery, University of Heidelberg, Heidelberg, Germany
- Thilo Hackert
- Department of Surgery, University of Heidelberg, Heidelberg, Germany
- Oliver Strobel
- Department of Surgery, University of Heidelberg, Heidelberg, Germany
- Nathalia A Giese
- Department of Surgery, University of Heidelberg, Heidelberg, Germany
- Miriam Schenk
- Department of Surgery, University of Heidelberg, Heidelberg, Germany
- Rita T Lawlor
- ARC-Net Centre for Applied Research on Cancer and Department of Pathology, University of Verona, Verona, Italy
- Aldo Scarpa
- ARC-Net Centre for Applied Research on Cancer and Department of Pathology, University of Verona, Verona, Italy
- Hans A Kestler
- Institute of Medical Systems Biology, University of Ulm, Ulm, Germany
- Malte Buchholz
- Clinic for Gastroenterology, Endocrinology and Metabolism, University Hospital, Philipps-Universität Marburg, Marburg, Germany
11
Taudien S, Lausser L, Giamarellos-Bourboulis EJ, Sponholz C, Schöneweck F, Felder M, Schirra LR, Schmid F, Gogos C, Groth S, Petersen BS, Franke A, Lieb W, Huse K, Zipfel PF, Kurzai O, Moepps B, Gierschik P, Bauer M, Scherag A, Kestler HA, Platzer M. Genetic Factors of the Disease Course After Sepsis: Rare Deleterious Variants Are Predictive. EBioMedicine 2016; 12:227-238. [PMID: 27639823 PMCID: PMC5078585 DOI: 10.1016/j.ebiom.2016.08.037] [Citation(s) in RCA: 28] [Impact Index Per Article: 3.5] [Received: 05/24/2016] [Revised: 08/19/2016] [Accepted: 08/24/2016] [Indexed: 12/20/2022]
Abstract
Sepsis is a life-threatening organ dysfunction caused by dysregulated host response to infection. For its clinical course, host genetic factors are important and rare genomic variants are suspected to contribute. We sequenced the exomes of 59 Greek and 15 German patients with bacterial sepsis divided into two groups with extremely different disease courses. Variant analysis focused on rare deleterious single nucleotide variants (SNVs). We identified significant differences in the number of rare deleterious SNVs per patient between the ethnic groups. Classification experiments based on the data of the Greek patients allowed discrimination between the disease courses with estimated sensitivity and specificity > 75%. By application of the trained model to the German patients we observed comparable discriminatory properties despite lower population-specific rare SNV load. Furthermore, rare SNVs in genes of cell signaling and innate immunity related pathways were identified as classifiers discriminating between the sepsis courses. Sepsis patients with favorable disease course after sepsis, even in the case of unfavorable preconditions, seem to be affected more often by rare deleterious SNVs in cell signaling and innate immunity related pathways, suggesting a protective role of impairments in these processes against a poor disease course.
Rare SNV load is higher in the Greek vs. German population. Subsets of rare deleterious SNVs are predictive for the disease course after sepsis. Patients with favorable disease course seem to carry protective deleterious variants in sepsis related pathways.
Sepsis is a life-threatening disease caused by improper response to infection. Only little is known about the role of genetic factors. From > 4000 patients we selected the most extreme cases showing either a favorable or adverse disease course. We determined rare (< 1/200) protein-damaging genetic variants, as they may have a large effect. Using a computational model that includes knowledge on genes we can predict the disease course with > 75% accuracy. Surprisingly, favorable courses can be expected if defense mechanisms are damaged and prevented from overshooting. This underlines the relevance of rare variants for better understanding of sepsis and may offer new treatment options.
Affiliation(s)
- Stefan Taudien
- Integrated Research and Treatment Center, Center for Sepsis Control and Care (CSCC), Jena University Hospital, Jena, Germany; Leibniz Institute on Aging - Fritz Lipmann Institute, Jena, Germany
- Ludwig Lausser
- Leibniz Institute on Aging - Fritz Lipmann Institute, Jena, Germany; Institute of Medical Systems Biology, Ulm University, Germany
- Evangelos J Giamarellos-Bourboulis
- Integrated Research and Treatment Center, Center for Sepsis Control and Care (CSCC), Jena University Hospital, Jena, Germany; 4th Department of Internal Medicine, National and Kapodistrian University of Athens, Athens, Greece
- Christoph Sponholz
- Integrated Research and Treatment Center, Center for Sepsis Control and Care (CSCC), Jena University Hospital, Jena, Germany; Leibniz Institute on Aging - Fritz Lipmann Institute, Jena, Germany; Department of Anaesthesiology and Intensive Care Therapy, Jena University Hospital, Jena, Germany
- Franziska Schöneweck
- Integrated Research and Treatment Center, Center for Sepsis Control and Care (CSCC), Jena University Hospital, Jena, Germany; Research group Clinical Epidemiology, CSCC, Jena University Hospital, Jena, Germany
- Marius Felder
- Leibniz Institute on Aging - Fritz Lipmann Institute, Jena, Germany
- Florian Schmid
- Institute of Medical Systems Biology, Ulm University, Germany
- Charalambos Gogos
- Department of Internal Medicine, University of Patras, Medical School, Greece
- Susann Groth
- Leibniz Institute on Aging - Fritz Lipmann Institute, Jena, Germany
- Britt-Sabina Petersen
- Institute of Clinical Molecular Biology, Christian-Albrechts-Universität Kiel, Kiel, Germany
- Andre Franke
- Institute of Clinical Molecular Biology, Christian-Albrechts-Universität Kiel, Kiel, Germany
- Wolfgang Lieb
- Institute of Epidemiology, Christian-Albrechts-Universität Kiel, Kiel, Germany
- Klaus Huse
- Leibniz Institute on Aging - Fritz Lipmann Institute, Jena, Germany
- Peter F Zipfel
- Leibniz Institute for Natural Product Research and Infection Biology - Hans-Knöll-Institute, Jena, Germany; Friedrich Schiller University Jena, Jena, Germany
- Oliver Kurzai
- Integrated Research and Treatment Center, Center for Sepsis Control and Care (CSCC), Jena University Hospital, Jena, Germany; Septomics Research Center Jena, Leibniz Institute for Natural Product Research and Infection Biology - Hans-Knöll-Institute, Jena, Germany
- Barbara Moepps
- Institute of Pharmacology and Toxicology, Ulm University Medical Center, Ulm, Germany
- Peter Gierschik
- Institute of Pharmacology and Toxicology, Ulm University Medical Center, Ulm, Germany
- Michael Bauer
- Integrated Research and Treatment Center, Center for Sepsis Control and Care (CSCC), Jena University Hospital, Jena, Germany; Department of Anaesthesiology and Intensive Care Therapy, Jena University Hospital, Jena, Germany
- André Scherag
- Integrated Research and Treatment Center, Center for Sepsis Control and Care (CSCC), Jena University Hospital, Jena, Germany; Research group Clinical Epidemiology, CSCC, Jena University Hospital, Jena, Germany
- Hans A Kestler
- Leibniz Institute on Aging - Fritz Lipmann Institute, Jena, Germany; Institute of Medical Systems Biology, Ulm University, Germany; Friedrich Schiller University Jena, Jena, Germany
- Matthias Platzer
- Leibniz Institute on Aging - Fritz Lipmann Institute, Jena, Germany
12
Müssel C, Schmid F, Blätte TJ, Hopfensitz M, Lausser L, Kestler HA. BiTrinA--multiscale binarization and trinarization with quality analysis. Bioinformatics 2015; 32:465-8. [PMID: 26468003 DOI: 10.1093/bioinformatics/btv591] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.7] [Received: 03/10/2015] [Accepted: 10/08/2015] [Indexed: 11/14/2022]
Abstract
Motivation: When processing gene expression profiles or other biological data, it is often required to assign measurements to distinct categories (e.g. 'high' and 'low' and possibly 'intermediate'). Subsequent analyses strongly depend on the results of this quantization. Poor quantization will have potentially misleading effects on further investigations. We propose the BiTrinA package that integrates different multiscale algorithms for binarization and for trinarization of one-dimensional data with methods for quality assessment and visualization of the results. By identifying measurements that show large variations over different time points or conditions, this quality assessment can determine candidates that are related to the specific experimental setting.
Availability and implementation: BiTrinA is freely available on CRAN.
Contact: hans.kestler@leibniz-fli.de or hans.kestler@uni-ulm.de
Supplementary information: Supplementary data are available at Bioinformatics online.
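As a rough illustration of what binarizing one-dimensional data means, the Python sketch below splits a series at its largest gap between sorted values. This is a deliberately simple stand-in and is not one of the multiscale algorithms in BiTrinA, nor does it include any quality assessment. The function name `binarize_largest_gap` and the sample values are hypothetical.

```python
def binarize_largest_gap(values):
    """Split 1-D measurements into 'low' (0) and 'high' (1) at the widest gap.

    Returns the threshold and the 0/1 label of each value in input order.
    """
    ordered = sorted(values)
    if len(ordered) < 2:
        raise ValueError("need at least two measurements")
    # Pair each gap width with the index of its lower neighbour, then pick the widest.
    gaps = [(ordered[i + 1] - ordered[i], i) for i in range(len(ordered) - 1)]
    _, i = max(gaps)
    threshold = (ordered[i] + ordered[i + 1]) / 2
    return threshold, [int(v > threshold) for v in values]


# Hypothetical expression values of one gene across six conditions.
expression = [0.1, 0.2, 0.15, 0.9, 1.0, 0.95]
threshold, labels = binarize_largest_gap(expression)
print(threshold, labels)
```

A single-gap rule like this is sensitive to noise and outliers, which is the kind of weakness that multiscale methods with an accompanying quality check are meant to address.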
Affiliation(s)
- Florian Schmid
- Medical Systems Biology, Ulm University, 89069 Ulm, Germany
- Tamara J Blätte
- Medical Systems Biology, Ulm University, 89069 Ulm, Germany; Section of Oncology, Internal Medicine III, Ulm University, 89069 Ulm, Germany
- Ludwig Lausser
- Leibniz Institute on Aging-Fritz Lipmann Institute, 07745 Jena, Germany
- Hans A Kestler
- Medical Systems Biology, Ulm University, 89069 Ulm, Germany; Leibniz Institute on Aging-Fritz Lipmann Institute, 07745 Jena, Germany
13
Three Transductive Set Covering Machines. STUDIES IN CLASSIFICATION, DATA ANALYSIS, AND KNOWLEDGE ORGANIZATION 2014. [DOI: 10.1007/978-3-319-01595-8_33] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 01/30/2023]