2. Prüfer K, Muetzel B, Do HH, Weiss G, Khaitovich P, Rahm E, Pääbo S, Lachmann M, Enard W. FUNC: a package for detecting significant associations between gene sets and ontological annotations. BMC Bioinformatics 2007; 8:41. [PMID: 17284313 PMCID: PMC1800870 DOI: 10.1186/1471-2105-8-41] [Citation(s) in RCA: 139] [Impact Index Per Article: 7.7] [Received: 09/04/2006] [Accepted: 02/06/2007] [Indexed: 11/17/2022] Open
Abstract
Background Genome-wide expression, sequence and association studies typically yield large sets of gene candidates, which must then be further analysed and interpreted. Information about these genes is increasingly being captured and organized in ontologies, such as the Gene Ontology. Relationships between the gene sets identified by experimental methods and biological knowledge can be made explicit and used in the interpretation of results. However, it is often difficult to assess the statistical significance of such analyses since many inter-dependent categories are tested simultaneously. Results We developed the program package FUNC that includes and expands on currently available methods to identify significant associations between gene sets and ontological annotations. It implements several tests particularly well suited to genome-wide sequence comparisons, estimates of the family-wise error rate and the false discovery rate, a sensitive estimator of the global significance of the results, and an algorithm to reduce the complexity of the results. Conclusion FUNC is a versatile and useful tool for the analysis of genome-wide data. It is freely available under the GPL and also accessible via a web service.
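The abstract does not name the individual tests, so the following is only an illustrative sketch and not FUNC's code. It assumes a one-sided hypergeometric test, a common choice for asking whether an ontology category is over-represented in a selected gene set. The function name, parameter names, and example numbers are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(total_genes, annotated, selected, overlap):
    """One-sided hypergeometric p-value for category over-representation.

    total_genes: number of genes in the background set
    annotated:   background genes annotated with the category
    selected:    size of the candidate gene set
    overlap:     candidate genes annotated with the category

    Returns P(X >= overlap) when `selected` genes are drawn without
    replacement from the background.
    """
    upper = min(annotated, selected)
    denominator = comb(total_genes, selected)
    # Sum the probability mass of every outcome at least as extreme as `overlap`.
    tail = sum(
        comb(annotated, k) * comb(total_genes - annotated, selected - k)
        for k in range(overlap, upper + 1)
    )
    return tail / denominator


# Hypothetical numbers: 20000 background genes, 150 in the category,
# 300 candidate genes, 12 of them in the category.
p_value = hypergeom_enrichment_p(20000, 150, 300, 12)
print(f"p = {p_value:.3g}")
```

Because many inter-dependent categories are tested at once, a raw p-value like this one would still need a multiple-testing correction such as the family-wise error rate or false discovery rate estimates the abstract mentions.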
Research Support, Non-U.S. Gov't
3. Do HH, Melnik S, Rahm E. Comparison of Schema Matching Evaluations. WEB, WEB-SERVICES, AND DATABASE SYSTEMS 2003. [DOI: 10.1007/3-540-36560-5_17] [Citation(s) in RCA: 129] [Impact Index Per Article: 5.9] [Indexed: 12/12/2022]
5. Müller R, Greiner U, Rahm E. AgentWork: a workflow system supporting rule-based workflow adaptation. DATA KNOWL ENG 2004. [DOI: 10.1016/j.datak.2004.03.010] [Citation(s) in RCA: 57] [Impact Index Per Article: 2.7] [Indexed: 10/26/2022]
6. Winter A, Stäubert S, Ammon D, Aiche S, Beyan O, Bischoff V, Daumke P, Decker S, Funkat G, Gewehr JE, de Greiff A, Haferkamp S, Hahn U, Henkel A, Kirsten T, Klöss T, Lippert J, Löbe M, Lowitsch V, Maassen O, Maschmann J, Meister S, Mikolajczyk R, Nüchter M, Pletz MW, Rahm E, Riedel M, Saleh K, Schuppert A, Smers S, Stollenwerk A, Uhlig S, Wendt T, Zenker S, Fleig W, Marx G, Scherag A, Löffler M. Smart Medical Information Technology for Healthcare (SMITH). Methods Inf Med 2018; 57:e92-e105. [PMID: 30016815 PMCID: PMC6193398 DOI: 10.3414/me18-02-0004] [Citation(s) in RCA: 55] [Impact Index Per Article: 7.9] [Indexed: 02/03/2023]
Abstract
INTRODUCTION This article is part of the Focus Theme of Methods of Information in Medicine on the German Medical Informatics Initiative. "Smart Medical Information Technology for Healthcare (SMITH)" is one of four consortia funded by the German Medical Informatics Initiative (MI-I) to create an alliance of universities, university hospitals, research institutions and IT companies. SMITH's goals are to establish Data Integration Centers (DICs) at each SMITH partner hospital and to implement use cases which demonstrate the usefulness of the approach. OBJECTIVES To give insight into architectural design issues underlying SMITH data integration and to introduce the use cases to be implemented. GOVERNANCE AND POLICIES SMITH implements a federated approach for both its governance structure and its information system architecture. SMITH has designed a generic concept for its data integration centers. They share identical services and functionalities to take best advantage of the interoperability architectures and of the planned data use and access process. The DICs provide access to the local hospitals' Electronic Medical Records (EMR). This is based on data trustee and privacy management services. DIC staff will curate and amend EMR data in the Health Data Storage. METHODOLOGY AND ARCHITECTURAL FRAMEWORK To share medical and research data, SMITH's information system is based on communication and storage standards. We use the Reference Model of the Open Archival Information System and will consistently implement profiles of Integrating the Health Care Enterprise (IHE) and Health Level Seven (HL7) standards. Standard terminologies will be applied. The SMITH Market Place will be used for devising agreements on data access and distribution.
3LGM2 for enterprise architecture modeling supports a consistent development process. The DIC reference architecture determines the services, applications and the standards-based communication links needed for efficiently supporting the ingesting, data nourishing, trustee, privacy management and data transfer tasks of the SMITH DICs. The reference architecture is adopted at the local sites. Data sharing services and the market place enable interoperability. USE CASES The methodological use case "Phenotype Pipeline" (PheP) constructs algorithms for annotations and analyses of patient-related phenotypes according to classification rules or statistical models based on structured data. Unstructured textual data will be subject to natural language processing to permit integration into the phenotyping algorithms. The clinical use case "Algorithmic Surveillance of ICU Patients" (ASIC) focusses on patients in Intensive Care Units (ICU) with the acute respiratory distress syndrome (ARDS). A model-based decision-support system will give advice for mechanical ventilation. The clinical use case HELP develops a "hospital-wide electronic medical record-based computerized decision support system to improve outcomes of patients with blood-stream infections" (HELP). ASIC and HELP use the PheP. The clinical benefit of the use cases ASIC and HELP will be demonstrated in a change-of-care clinical trial based on a stepped-wedge design. DISCUSSION SMITH's strength is the modular, reusable IT architecture based on interoperability standards, the integration of the hospitals' information management departments and the public-private partnership. The project aims at sustainability beyond the first 4-year funding period.
Grants
- German Federal Ministry of Education and Research Grant No's. 01ZZ1609A, 01ZZ1609B, 01ZZ1609C, 01ZZ1803A, 01ZZ1803B, 01ZZ1803C, 01ZZ1803D, 01ZZ1803E, 01ZZ1803F, 01ZZ1803G, 01ZZ1803H, 01ZZ1803I, 01ZZ1803J, 01ZZ1803K, 01ZZ1803L, 01ZZ1803M, 01ZZ1803N
Research Support, Non-U.S. Gov't
7. Kirsten T, Gross A, Hartung M, Rahm E. GOMMA: a component-based infrastructure for managing and analyzing life science ontologies and their evolution. J Biomed Semantics 2011; 2:6. [PMID: 21914205 PMCID: PMC3198872 DOI: 10.1186/2041-1480-2-6] [Citation(s) in RCA: 39] [Impact Index Per Article: 2.8] [Received: 12/17/2010] [Accepted: 09/13/2011] [Indexed: 01/20/2023] Open
Abstract
BACKGROUND Ontologies are increasingly used to structure and semantically describe entities of domains, such as genes and proteins in life sciences. Their increasing size and the high frequency of updates, which result in a large set of ontology versions, necessitate efficient management and analysis of these data. RESULTS We present GOMMA, a generic infrastructure for managing and analyzing life science ontologies and their evolution. GOMMA utilizes a generic repository to uniformly and efficiently manage ontology versions and different kinds of mappings. Furthermore, it provides components for ontology matching and for determining evolutionary ontology changes. These components are used by analysis tools, such as the Ontology Evolution Explorer (OnEX) and the detection of unstable ontology regions. We introduce the component-based infrastructure and show analysis results for selected components and life science applications. GOMMA is available at http://dbs.uni-leipzig.de/GOMMA. CONCLUSIONS GOMMA provides a comprehensive and scalable infrastructure to manage large life science ontologies and analyze their evolution. Key functions include generic storage of ontology versions and mappings, and support for ontology matching and for determining ontology changes. The supported features for analyzing ontology changes help to assess their impact on ontology-dependent applications such as term enrichment. GOMMA complements OnEX by providing functionalities to manage various versions of mappings between two ontologies and allows combining different match approaches.
Journal Article
8. Müller R, Rahm E. Dealing with Logical Failures for Collaborating Workflows. COOPERATIVE INFORMATION SYSTEMS 2000. [DOI: 10.1007/10722620_21] [Citation(s) in RCA: 26] [Impact Index Per Article: 1.0] [Indexed: 01/14/2023]
9. Groß A, Hartung M, Prüfer K, Kelso J, Rahm E. Impact of ontology evolution on functional analyses. Bioinformatics 2012; 28:2671-7. [PMID: 22954631 DOI: 10.1093/bioinformatics/bts498] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.0] [Indexed: 11/14/2022]
Abstract
MOTIVATION Ontologies are used in the annotation and analysis of biological data. As knowledge accumulates, ontologies and annotation undergo constant modifications to reflect this new knowledge. These modifications may influence the results of statistical applications such as functional enrichment analyses that describe experimental data in terms of ontological groupings. Here, we investigate to what degree modifications of the Gene Ontology (GO) impact these statistical analyses for both experimental and simulated data. The analysis is based on new measures for the stability of result sets and considers different ontology and annotation changes. RESULTS Our results show that past changes in the GO are non-uniformly distributed over different branches of the ontology. Considering the semantic relatedness of significant categories in analysis results allows a more realistic stability assessment for functional enrichment studies. We observe that the results of term-enrichment analyses tend to be surprisingly stable despite changes in ontology and annotation.
Research Support, Non-U.S. Gov't
10. Hartung M, Kirsten T, Gross A, Rahm E. OnEX: Exploring changes in life science ontologies. BMC Bioinformatics 2009; 10:250. [PMID: 19678926 PMCID: PMC2746816 DOI: 10.1186/1471-2105-10-250] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.3] [Received: 04/20/2009] [Accepted: 08/13/2009] [Indexed: 12/01/2022] Open
Abstract
Background Numerous ontologies have recently been developed in life sciences to support a consistent annotation of biological objects, such as genes or proteins. These ontologies are subject to continuous changes, which can impact existing annotations. Therefore, it is valuable for users of ontologies to study the stability of ontologies and to see how many and what kind of ontology changes occurred. Results We present OnEX (Ontology Evolution EXplorer), a system for exploring ontology changes. Currently, OnEX provides access to about 560 versions of 16 well-known life science ontologies. The system is based on a three-tier architecture including an ontology version repository, a middleware component and the OnEX web application. Interactive workflows allow a systematic and explorative change analysis of ontologies and their concepts as well as the semi-automatic migration of outdated annotations to the current version of an ontology. Conclusion OnEX provides a user-friendly web interface to explore information about changes in current life science ontologies. It is available at .
12. Groß A, Pruski C, Rahm E. Evolution of biomedical ontologies and mappings: Overview of recent approaches. Comput Struct Biotechnol J 2016; 14:333-40. [PMID: 27642503 PMCID: PMC5018063 DOI: 10.1016/j.csbj.2016.08.002] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.9] [Received: 05/02/2016] [Revised: 08/19/2016] [Accepted: 08/23/2016] [Indexed: 11/16/2022] Open
Abstract
Biomedical ontologies are heavily used to annotate data, and different ontologies are often interlinked by ontology mappings. These ontology-based mappings and annotations are used in many applications and analysis tasks. Since biomedical ontologies are continuously updated, dependent artifacts can become outdated and need to undergo evolution as well. Hence there is a need for largely automated approaches to keep ontology-based mappings up-to-date in the presence of evolving ontologies. In this article, we survey current approaches and novel directions in the context of ontology and mapping evolution. We will discuss requirements for mapping adaptation and provide a comprehensive overview of existing approaches. We will further identify open challenges and outline ideas for future developments.
Review
13. Hartung M, Gross A, Rahm E. CODEX: exploration of semantic changes between ontology versions. Bioinformatics 2012; 28:895-6. [DOI: 10.1093/bioinformatics/bts029] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.0] [Indexed: 11/13/2022] Open
14. Müller R, Rahm E. Rule-Based Dynamic Modification of Workflows in a Medical Domain. INFORMATIK AKTUELL 1999. [DOI: 10.1007/978-3-642-60119-4_26] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.5] [Indexed: 01/06/2023]
15. Saeedi A, Nentwig M, Peukert E, Rahm E. Scalable Matching and Clustering of Entities with FAMER. COMPLEX SYSTEMS INFORMATICS AND MODELING QUARTERLY 2018. [DOI: 10.7250/csimq.2018-16.04] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.6] [Indexed: 11/20/2022] Open
16. Lee LH, Groß A, Hartung M, Liou DM, Rahm E. A multi-part matching strategy for mapping LOINC with laboratory terminologies. J Am Med Inform Assoc 2013; 21:792-800. [PMID: 24363318 DOI: 10.1136/amiajnl-2013-002139] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.8] [Indexed: 11/04/2022] Open
Abstract
OBJECTIVE To address the problem of mapping local laboratory terminologies to Logical Observation Identifiers Names and Codes (LOINC). To study different ontology matching algorithms and investigate how the probability of term combinations in LOINC helps to increase match quality and reduce manual effort. MATERIALS AND METHODS We proposed two matching strategies: full name and multi-part. The multi-part approach also considers the occurrence probability of combined concept parts. It can further recommend possible combinations of concept parts to allow more local terms to be mapped. Three real-world laboratory databases from Taiwanese hospitals were used to validate the proposed strategies with respect to different quality measures and execution run time. A comparison with the commonly used tool, Regenstrief LOINC Mapping Assistant (RELMA) Lab Auto Mapper (LAM), was also carried out. RESULTS The new multi-part strategy yields the best match quality, with F-measure values between 89% and 96%. It can automatically match 70-85% of the laboratory terminologies to LOINC. The recommendation step can further propose mapping to (proposed) LOINC concepts for 9-20% of the local terminology concepts. On average, 91% of the local terminology concepts can be correctly mapped to existing or newly proposed LOINC concepts. CONCLUSIONS The mapping quality of the multi-part strategy is significantly better than that of LAM. It enables domain experts to perform LOINC matching with little manual work. The probability of term combinations proved to be a valuable strategy for increasing the quality of match results, providing recommendations for proposed LOINC concepts, and decreasing the run time for match processing.
Research Support, Non-U.S. Gov't
17.
Abstract
We introduce a novel approach to extract semantic relations (e.g., is-a and part-of relations) from Wikipedia articles. These relations are used to build up a large and up-to-date thesaurus providing background knowledge for tasks such as determining semantic ontology mappings. Our automatic approach uses a comprehensive set of semantic patterns, finite state machines and NLP techniques to extract millions of relations between concepts. An evaluation for different domains shows the high quality and effectiveness of the proposed approach. We also illustrate the value of the newly found relations for improving existing ontology mappings.
18. Vatsalan D, Christen P, Rahm E. Incremental clustering techniques for multi-party Privacy-Preserving Record Linkage. DATA KNOWL ENG 2020. [DOI: 10.1016/j.datak.2020.101809] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Indexed: 11/17/2022]
20. Hartung M, Loebe F, Herre H, Rahm E. Management of evolving semantic grid metadata within a collaborative platform. Inf Sci (N Y) 2010. [DOI: 10.1016/j.ins.2009.08.008] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Indexed: 11/28/2022]
21. Rohde F, Franke M, Sehili Z, Lablans M, Rahm E. Optimization of the Mainzelliste software for fast privacy-preserving record linkage. J Transl Med 2021; 19:33. [PMID: 33451317 PMCID: PMC7809773 DOI: 10.1186/s12967-020-02678-1] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 05/05/2020] [Accepted: 12/14/2020] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Data analysis for biomedical research often requires a record linkage step to identify records from multiple data sources referring to the same person. Due to the lack of unique personal identifiers across these sources, record linkage relies on the similarity of personal data such as first and last names or birth dates. However, the exchange of such identifying data with a third party, as is the case in record linkage, is generally subject to strict privacy requirements. This problem is addressed by privacy-preserving record linkage (PPRL) and pseudonymization services. Mainzelliste is an open-source record linkage and pseudonymization service used to carry out PPRL processes in real-world use cases. METHODS We evaluate the linkage quality and performance of the linkage process using several real and near-real datasets with different properties w.r.t. size and error-rate of matching records. We conduct a comparison between (plaintext) record linkage and PPRL based on encoded records (Bloom filters). Furthermore, since the Mainzelliste software offers no blocking mechanism, we extend it by phonetic blocking as well as novel blocking schemes based on locality-sensitive hashing (LSH) to improve runtime for both standard and privacy-preserving record linkage. RESULTS The Mainzelliste achieves high linkage quality for PPRL using field-level Bloom filters due to the use of an error-tolerant matching algorithm that can handle variances in names, in particular missing or transposed name compounds. However, due to the absence of blocking, the runtimes are unacceptable for real use cases with larger datasets. The newly implemented blocking approaches improve runtimes by orders of magnitude while retaining high linkage quality. CONCLUSION We conduct the first comprehensive evaluation of the record linkage facilities of the Mainzelliste software and extend it with blocking methods to improve its runtime. 
We observed very high linkage quality for both plaintext and encoded data, even in the presence of errors. The added blocking methods improve runtime by orders of magnitude, thus facilitating use in research projects with large datasets and many participants.
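The Bloom-filter encoding mentioned in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not Mainzelliste's implementation: the character-bigram tokenization, the filter size, the number of hash functions, the HMAC-SHA256 keying, the Dice similarity measure, and all names including `SECRET_KEY` are hypothetical choices made for the example.

```python
import hashlib
import hmac

# Hypothetical shared secret; in a real deployment this would be agreed
# between the data owners and never hard-coded.
SECRET_KEY = b"example-key-for-illustration-only"


def bigrams(value):
    """Split a padded, lower-cased string into its set of character bigrams."""
    padded = f"_{value.lower()}_"
    return {padded[i:i + 2] for i in range(len(padded) - 1)}


def bloom_encode(value, size=1024, num_hashes=4):
    """Encode one field as the set of bit positions set in its Bloom filter."""
    positions = set()
    for gram in bigrams(value):
        for seed in range(num_hashes):
            message = f"{seed}:{gram}".encode("utf-8")
            digest = hmac.new(SECRET_KEY, message, hashlib.sha256).digest()
            positions.add(int.from_bytes(digest, "big") % size)
    return positions


def dice_similarity(bits_a, bits_b):
    """Dice coefficient of two bit-position sets (1.0 means identical)."""
    if not bits_a and not bits_b:
        return 1.0
    return 2 * len(bits_a & bits_b) / (len(bits_a) + len(bits_b))


# Similar names share most bigrams, so their encodings overlap strongly.
close = dice_similarity(bloom_encode("Meier"), bloom_encode("Meyer"))
far = dice_similarity(bloom_encode("Meier"), bloom_encode("Schmidt"))
print(f"Meier/Meyer: {close:.2f}  Meier/Schmidt: {far:.2f}")
```

Comparing encoded fields this way tolerates small spelling differences without exchanging plaintext names. Without blocking, every record pair still has to be compared, which is the runtime problem the abstract describes.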
Research Support, Non-U.S. Gov't
22. Mueller R, Rahm E, Ramsch J, Heller B, Loeffler M, Greiner U. AdaptFlow: Protocol-based Medical Treatment Using Adaptive Workflows. Methods Inf Med 2018. [DOI: 10.1055/s-0038-1633926] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Indexed: 10/17/2022]
Abstract
Summary
Objectives:
In many medical domains investigator-initiated clinical trials are used to introduce new treatments and hence act as implementations of guideline-based therapies. Trial protocols contain detailed instructions to conduct the therapy and additionally specify reactions to exceptional situations (for instance an infection or a toxicity). To increase quality in health care and raise the number of patients treated according to trial protocols, a consultation system is needed that supports the handling of the complex trial therapy processes efficiently. Our objective was to design and evaluate a consultation system that should 1) observe the status of the therapies currently being applied, 2) offer automatic recognition of exceptional situations and appropriate decision support and 3) provide an automatic adaptation of affected therapy processes to handle exceptional situations.
Methods:
We applied a hybrid approach that combines process support for the timely and efficient execution of therapy processes, as offered by workflow management systems, with a knowledge and rule base and a mechanism for dynamic workflow adaptation that changes running therapy processes when a changed patient condition requires it.
Results and Conclusions:
This approach has been implemented in the AdaptFlow prototype. We performed several evaluation studies on the practicability of the approach and the usefulness of the system. These studies show that the AdaptFlow prototype offers adequate support for the execution of real-world investigator-initiated trial protocols and is able to handle a large number of exceptions.
23. Rahm E, Kirsten T, Lange J. The GeWare data warehouse platform for the analysis of molecular-biological and clinical data. J Integr Bioinform 2007. [DOI: 10.1515/jib-2007-47] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Indexed: 11/15/2022] Open
Abstract
We introduce the GeWare data warehouse platform for the integrated analysis of clinical information, microarray data and annotations within large biomedical research studies. Clinical data is obtained from a commercial study management system while publicly available data is integrated using a mediator approach. The platform utilizes a generic approach to manage different types of annotations. We outline the overall architecture of the platform, its implementation as well as the main processing and analysis workflows.
24. Hartung M, Gross A, Kirsten T, Rahm E. Discovering Evolving Regions in Life Science Ontologies. LECTURE NOTES IN COMPUTER SCIENCE 2010. [DOI: 10.1007/978-3-642-15120-0_3] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Indexed: 01/26/2023]
25. Kirsten T, Lange J, Rahm E. An Integrated Platform for Analyzing Molecular-Biological Data Within Clinical Studies. CURRENT TRENDS IN DATABASE TECHNOLOGY – EDBT 2006 2006. [DOI: 10.1007/11896548_31] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Indexed: 01/15/2023]