1
Mayer C, Vogt A, Uslu T, Scalzitti N, Chennen K, Poch O, Thompson JD. CeGAL: Redefining a Widespread Fungal-Specific Transcription Factor Family Using an In Silico Error-Tracking Approach. J Fungi (Basel) 2023; 9:jof9040424. [PMID: 37108879 PMCID: PMC10141177 DOI: 10.3390/jof9040424] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/27/2023] [Revised: 03/21/2023] [Accepted: 03/28/2023] [Indexed: 03/31/2023]
Abstract
In fungi, the most abundant transcription factor (TF) class contains a fungal-specific ‘GAL4-like’ Zn2C6 DNA binding domain (DBD), while the second class contains another fungal-specific domain, known as ‘fungal_trans’ or middle homology domain (MHD), whose function remains largely uncharacterized. Remarkably, almost a third of MHD-containing TFs in public sequence databases apparently lack DNA binding activity, since they are not predicted to contain a DBD. Here, we reassess the domain organization of these ‘MHD-only’ proteins using an in silico error-tracking approach. In a large-scale analysis of ~17,000 MHD-only TF sequences present in all fungal phyla except Microsporidia and Cryptomycota, we show that the vast majority (>90%) result from genome annotation errors and we are able to predict a new DBD sequence for 14,261 of them. Most of these sequences correspond to a Zn2C6 domain (82%), with a small proportion of C2H2 domains (4%) found only in Dikarya. Our results contradict previous findings that the MHD-only TF are widespread in fungi. In contrast, we show that they are exceptional cases, and that the fungal-specific Zn2C6–MHD domain pair represents the canonical domain signature defining the most predominant fungal TF family. We call this family CeGAL, after the highly characterized members: Cep3, whose 3D structure is determined, and GAL4, a eukaryotic TF archetype. We believe that this will not only improve the annotation and classification of the Zn2C6 TF but will also provide critical guidance for future fungal gene regulatory network analyses.
Affiliation(s)
- Claudine Mayer
  - Complex Systems and Translational Bioinformatics (CSTB), ICube Laboratory, UMR7357, University of Strasbourg, 1 rue Eugène Boeckel, 67000 Strasbourg, France
  - Faculté des Sciences, Université Paris Cité, UFR Sciences du Vivant, 75013 Paris, France
  - Correspondence: (C.M.); (J.D.T.)
- Arthur Vogt
  - Complex Systems and Translational Bioinformatics (CSTB), ICube Laboratory, UMR7357, University of Strasbourg, 1 rue Eugène Boeckel, 67000 Strasbourg, France
- Tuba Uslu
  - Complex Systems and Translational Bioinformatics (CSTB), ICube Laboratory, UMR7357, University of Strasbourg, 1 rue Eugène Boeckel, 67000 Strasbourg, France
- Nicolas Scalzitti
  - Complex Systems and Translational Bioinformatics (CSTB), ICube Laboratory, UMR7357, University of Strasbourg, 1 rue Eugène Boeckel, 67000 Strasbourg, France
- Kirsley Chennen
  - Complex Systems and Translational Bioinformatics (CSTB), ICube Laboratory, UMR7357, University of Strasbourg, 1 rue Eugène Boeckel, 67000 Strasbourg, France
- Olivier Poch
  - Complex Systems and Translational Bioinformatics (CSTB), ICube Laboratory, UMR7357, University of Strasbourg, 1 rue Eugène Boeckel, 67000 Strasbourg, France
- Julie D. Thompson
  - Complex Systems and Translational Bioinformatics (CSTB), ICube Laboratory, UMR7357, University of Strasbourg, 1 rue Eugène Boeckel, 67000 Strasbourg, France
  - Correspondence: (C.M.); (J.D.T.)
2
Green S, Hillersdal L, Holt J, Hoeyer K, Wadmann S. The practical ethics of repurposing health data: how to acknowledge invisible data work and the need for prioritization. Medicine, Health Care and Philosophy 2023; 26:119-132. [PMID: 36402853 PMCID: PMC9676846 DOI: 10.1007/s11019-022-10128-6] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Accepted: 11/04/2022] [Indexed: 06/16/2023]
Abstract
Throughout the Global North, policymakers invest in large-scale integration of health-data infrastructures to facilitate the reuse of clinical data for administration, research, and innovation. Debates about the ethical implications of data repurposing have focused extensively on issues of patient autonomy and privacy. We suggest that it is time to scrutinize also how the everyday work of healthcare staff is affected by political ambitions of data reuse for an increasing number of purposes, and how different purposes are prioritized. Our analysis builds on ethnographic studies within the Danish healthcare system, which is internationally known for its high degree of digitalization and well-connected data infrastructures. Although data repurposing ought to be relatively seamless in this context, we demonstrate how it involves costs and trade-offs for those who produce and use health data. Even when IT systems and automation strategies are introduced to enhance efficiency and reduce data work, they can end up generating new forms of data work and fragmentation of clinically relevant information. We identify five types of data work related to the production, completion, validation, sorting, and recontextualization of health data. Each of these requires medical expertise and clinical resources. We propose that the implications for these forms of data work should be considered early in the planning stages of initiatives for large-scale data sharing and reuse, such as the European Health Data Space. We believe that political awareness of clinical costs and trade-offs related to such data work can provide better and more informed decisions about data repurposing.
Affiliation(s)
- Sara Green
  - Section for History and Philosophy of Science, Department of Science Education, University of Copenhagen, Niels Bohr Building (NBB), Universitetsparken 5, 2100 Copenhagen Ø, Denmark
- Line Hillersdal
  - Department of Anthropology, University of Copenhagen, Øster Farimagsgade 5, 1353 Copenhagen K, Denmark
- Jette Holt
  - Infectious Disease Epidemiology & Prevention, The National Center for Infection Control (CEI), Artillerivej 5, 2300 Copenhagen S, Denmark
- Klaus Hoeyer
  - Centre for Medical Science and Technology Studies, Department of Public Health, University of Copenhagen, Øster Farimagsgade 5, 1014 Copenhagen K, Denmark
- Sarah Wadmann
  - The Danish Center for Social Science Research, VIVE, Herluf Trolles Gade 11, 1052 Copenhagen, Denmark
3
Fitzpatrick R, Stefan MI. Validation Through Collaboration: Encouraging Team Efforts to Ensure Internal and External Validity of Computational Models of Biochemical Pathways. Neuroinformatics 2022; 20:277-284. [PMID: 35543917 PMCID: PMC9537119 DOI: 10.1007/s12021-022-09584-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 03/17/2022] [Indexed: 01/09/2023]
Abstract
Computational modelling of biochemical reaction pathways is an increasingly important part of neuroscience research. In order to be useful, computational models need to be valid in two senses: First, they need to be consistent with experimental data and able to make testable predictions (external validity). Second, they need to be internally consistent and independently reproducible (internal validity). Here, we discuss both types of validity and provide a brief overview of tools and technologies used to ensure they are met. We also suggest the introduction of new collaborative technologies to ensure model validity: an incentivised experimental database for external validity and reproducibility audits for internal validity. Both rely on FAIR principles and on collaborative science practices.
Affiliation(s)
- Richard Fitzpatrick
  - Centre for Discovery Brain Sciences, University of Edinburgh, Edinburgh, UK
  - School of Biological Sciences, University of Edinburgh, Edinburgh, UK
- Melanie I. Stefan
  - Centre for Discovery Brain Sciences, University of Edinburgh, Edinburgh, UK
  - ZJU-UoE Institute, Zhejiang University, Haining, China
4
Chatterjee A, Swierstra T, Kuiper M. Dealing with different conceptions of pollution in the Gene Regulation Knowledge Commons. Biochimica et Biophysica Acta. Gene Regulatory Mechanisms 2022; 1865:194779. [PMID: 34971789 DOI: 10.1016/j.bbagrm.2021.194779] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/21/2021] [Revised: 11/28/2021] [Accepted: 11/29/2021] [Indexed: 06/14/2023]
Abstract
Current research of gene regulatory mechanisms is increasingly dependent on the availability of high-quality information from manually curated databases. Biocurators undertake the task of extracting knowledge claims from scholarly publications, organizing these claims in a meaningful format and making them computable. In doing so, they enhance the value of existing scientific knowledge by making it accessible to the users of their databases. In this capacity, biocurators are well positioned to identify and weed out information that is of insufficient quality. The criteria that define information quality are typically outlined in curation guidelines developed by biocurators. These guidelines have been prudently developed to reflect the needs of the user community the database caters to. The guidelines depict the standard evidence that this community recognizes as sufficient justification for trustworthy data. Additionally, these guidelines determine the process by which data should be organized and maintained to be valuable to users. Following these guidelines, biocurators assess the quality, reliability, and validity of the information they encounter. In this article we explore to what extent different use cases agree with the inclusion criteria that define positive and negative data, implemented by the database. What are the drawbacks to users who have queries that would be well served by results that fall just short of the criteria used by a database? Finally, how can databases (and biocurators) accommodate the needs of such more explorative use cases?
Affiliation(s)
- Anamika Chatterjee
  - Department of Philosophy and Religious Studies, Norwegian University of Science and Technology (NTNU), Trondheim, Norway
- Tsjalling Swierstra
  - Department of Philosophy, Maastricht University, Maastricht, the Netherlands
- Martin Kuiper
  - Department of Biology, Norwegian University of Science and Technology (NTNU), Trondheim, Norway
5
Swierstra T, Efstathiou S. Knowledge repositories. In digital knowledge we trust. Medicine, Health Care and Philosophy 2020; 23:543-547. [PMID: 32944868 DOI: 10.1007/s11019-020-09978-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/11/2023]
Affiliation(s)
- Tsjalling Swierstra
  - Department of Philosophy, Maastricht University, Maastricht, The Netherlands
  - Department of Philosophy, Norwegian University of Science and Technology, Trondheim, Norway
- Sophia Efstathiou
  - Department of Philosophy, Norwegian University of Science and Technology, Trondheim, Norway