1
|
Concha-Eloko R, Stock M, De Baets B, Briers Y, Sanjuán R, Domingo-Calap P, Boeckaerts D. DepoScope: Accurate phage depolymerase annotation and domain delineation using large language models. PLoS Comput Biol 2024; 20:e1011831. [PMID: 39102416 PMCID: PMC11326577 DOI: 10.1371/journal.pcbi.1011831] [Received: 01/15/2024] [Revised: 08/15/2024] [Accepted: 07/20/2024] [Indexed: 08/07/2024]
Abstract
Bacteriophages (phages) are viruses that infect bacteria. Many of them produce specific enzymes called depolymerases to break down external polysaccharide structures. Accurate annotation and domain identification of these depolymerases are challenging due to their inherent sequence diversity. Hence, we present DepoScope, a machine learning tool that combines a fine-tuned ESM-2 model with a convolutional neural network to identify depolymerase sequences and their enzymatic domains precisely. To accomplish this, we curated a dataset from the INPHARED phage genome database, created a polysaccharide-degrading domain database, and applied sequential filters to construct a high-quality dataset, which is subsequently used to train DepoScope. Our work is the first approach that combines sequence-level predictions with amino-acid-level predictions for accurate depolymerase detection and functional domain identification. In that way, we believe that DepoScope can greatly enhance our understanding of phage-host interactions at the level of depolymerases.
|
2
|
De Brouwer E, Becker T, Werthen-Brabants L, Dewulf P, Iliadis D, Dekeyser C, Laureys G, Van Wijmeersch B, Popescu V, Dhaene T, Deschrijver D, Waegeman W, De Baets B, Stock M, Horakova D, Patti F, Izquierdo G, Eichau S, Girard M, Prat A, Lugaresi A, Grammond P, Kalincik T, Alroughani R, Grand’Maison F, Skibina O, Terzi M, Lechner-Scott J, Gerlach O, Khoury SJ, Cartechini E, Van Pesch V, Sà MJ, Weinstock-Guttman B, Blanco Y, Ampapa R, Spitaleri D, Solaro C, Maimone D, Soysal A, Iuliano G, Gouider R, Castillo-Triviño T, Sánchez-Menoyo JL, Laureys G, van der Walt A, Oh J, Aguera-Morales E, Altintas A, Al-Asmi A, de Gans K, Fragoso Y, Csepany T, Hodgkinson S, Deri N, Al-Harbi T, Taylor B, Gray O, Lalive P, Rozsa C, McGuigan C, Kermode A, Sempere AP, Mihaela S, Simo M, Hardy T, Decoo D, Hughes S, Grigoriadis N, Sas A, Vella N, Moreau Y, Peeters L. Machine-learning-based prediction of disability progression in multiple sclerosis: An observational, international, multi-center study. PLOS DIGITAL HEALTH 2024; 3:e0000533. [PMID: 39052668 PMCID: PMC11271865 DOI: 10.1371/journal.pdig.0000533] [Received: 06/29/2023] [Accepted: 05/14/2024] [Indexed: 07/27/2024]
Abstract
BACKGROUND Disability progression is a key milestone in the disease evolution of people with multiple sclerosis (PwMS). Prediction models of the probability of disability progression have not yet reached the level of trust needed to be adopted in the clinic. A common benchmark to assess model development in multiple sclerosis is also currently lacking.
METHODS Data from adult PwMS with a follow-up of at least three years, collected by the MSBase consortium from 146 MS centers spread over 40 countries, were used. After applying basic inclusion criteria for quality requirements, the dataset comprises a total of 15,240 PwMS. External validation was performed and repeated five times to assess the significance of the results. Transparent Reporting for Individual Prognosis Or Diagnosis (TRIPOD) guidelines were followed. Confirmed disability progression after two years was predicted, with a confirmation window of six months. Only routinely collected variables were used, such as the expanded disability status scale, treatment, relapse information, and MS course. To learn the probability of disability progression, state-of-the-art machine learning models were investigated. The discrimination performance of the models was evaluated with the area under the receiver operating characteristic curve (ROC-AUC) and the area under the precision-recall curve (AUC-PR), and their calibration via the Brier score and the expected calibration error. All our preprocessing and model code are available at https://gitlab.com/edebrouwer/ms_benchmark, making this task an ideal benchmark for predicting disability progression in MS.
FINDINGS Machine learning models achieved a ROC-AUC of 0.71 ± 0.01, an AUC-PR of 0.26 ± 0.02, a Brier score of 0.1 ± 0.01 and an expected calibration error of 0.07 ± 0.04. The history of disability progression was identified as being more predictive of future disability progression than the treatment or relapse history.
CONCLUSIONS Good discrimination and calibration performance on an external validation set is achieved, using only routinely collected variables. This suggests machine-learning models can reliably inform clinicians about the future occurrence of progression and are mature for a clinical impact study.
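The two calibration metrics named in this abstract can be illustrated with a minimal Python sketch. It is not taken from the authors' repository: the function names, the equal-width binning, and the default of 10 bins are assumptions made here for illustration only.

```python
from typing import List, Sequence, Tuple


def brier_score(probs: Sequence[float], labels: Sequence[int]) -> float:
    """Mean squared difference between predicted probability and 0/1 outcome."""
    return sum((p - y) ** 2 for p, y in zip(probs, labels)) / len(probs)


def expected_calibration_error(
    probs: Sequence[float], labels: Sequence[int], n_bins: int = 10
) -> float:
    """Weighted mean gap between mean predicted probability and observed
    event rate, computed over equal-width probability bins (an assumption)."""
    bins: List[List[Tuple[float, int]]] = [[] for _ in range(n_bins)]
    for p, y in zip(probs, labels):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 goes in the last bin
        bins[idx].append((p, y))
    total = len(probs)
    ece = 0.0
    for members in bins:
        if not members:
            continue  # empty bins contribute nothing
        mean_prob = sum(p for p, _ in members) / len(members)
        event_rate = sum(y for _, y in members) / len(members)
        ece += len(members) / total * abs(mean_prob - event_rate)
    return ece
```

Lower values are better for both metrics. A model that always predicts 0.5 on a balanced sample has a Brier score of 0.25, which gives a rough reference point for reading reported scores.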
|
3
|
Boeckaerts D, Stock M, Ferriol-González C, Oteo-Iglesias J, Sanjuán R, Domingo-Calap P, De Baets B, Briers Y. Prediction of Klebsiella phage-host specificity at the strain level. Nat Commun 2024; 15:4355. [PMID: 38778023 PMCID: PMC11111740 DOI: 10.1038/s41467-024-48675-6] [Received: 12/14/2023] [Accepted: 05/08/2024] [Indexed: 05/25/2024]
Abstract
Phages are increasingly considered promising alternatives to target drug-resistant bacterial pathogens. However, their often-narrow host range can make it challenging to find matching phages against bacteria of interest. Current computational tools do not accurately predict interactions at the strain level in a way that is relevant and properly evaluated for practical use. We present PhageHostLearn, a machine learning system that predicts strain-level interactions between receptor-binding proteins and bacterial receptors for Klebsiella phage-bacteria pairs. We evaluate this system both in silico and in the laboratory, in the clinically relevant setting of finding matching phages against bacterial strains. PhageHostLearn reaches a cross-validated ROC AUC of up to 81.8% in silico and maintains this performance in laboratory validation. Our approach provides a framework for developing and evaluating phage-host prediction methods that are useful in practice, which we believe to be a meaningful contribution to the machine-learning-guided development of phage therapeutics and diagnostics.
|
4
|
Castle SD, Stock M, Gorochowski TE. Engineering is evolution: a perspective on design processes to engineer biology. Nat Commun 2024; 15:3640. [PMID: 38684714 PMCID: PMC11059173 DOI: 10.1038/s41467-024-48000-1] [Received: 09/11/2023] [Accepted: 04/18/2024] [Indexed: 05/02/2024]
Abstract
Careful consideration of how we approach design is crucial to all areas of biotechnology. However, choosing or developing an effective design methodology is not always easy as biology, unlike most areas of engineering, is able to adapt and evolve. Here, we put forward that design and evolution follow a similar cyclic process and therefore all design methods, including traditional design, directed evolution, and even random trial and error, exist within an evolutionary design spectrum. This contrasts with conventional views that often place these methods at odds and provides a valuable framework for unifying engineering approaches for challenging biological design problems.
|
5
|
Stock M, De Swaef T, wyffels F. Editorial: Plant sensing and computing - PlantComp 2022. FRONTIERS IN PLANT SCIENCE 2024; 15:1384726. [PMID: 38476694 PMCID: PMC10927963 DOI: 10.3389/fpls.2024.1384726] [Received: 02/10/2024] [Accepted: 02/13/2024] [Indexed: 03/14/2024]
|
6
|
Taillieu E, Taelman S, De Bruyckere S, Goossens E, Chantziaras I, Van Steenkiste C, Yde P, Hanssens S, De Meyer D, Van Criekinge W, Stock M, Maes D, Chiers K, Haesebrouck F. The role of Helicobacter suis, Fusobacterium gastrosuis, and the pars oesophageal microbiota in gastric ulceration in slaughter pigs receiving meal or pelleted feed. Vet Res 2024; 55:15. [PMID: 38317242 PMCID: PMC10845778 DOI: 10.1186/s13567-024-01274-1] [Received: 10/16/2023] [Accepted: 01/04/2024] [Indexed: 02/07/2024]
Abstract
This study investigated the role of causative infectious agents in ulceration of the non-glandular part of the porcine stomach (pars oesophagea). In total, 150 stomachs from slaughter pigs were included, 75 from pigs that received a meal feed, 75 from pigs that received an equivalent pelleted feed with a smaller particle size. The pars oesophagea was macroscopically examined after slaughter. (q)PCR assays for H. suis, F. gastrosuis and H. pylori-like organisms were performed, as well as 16S rRNA sequencing for pars oesophagea microbiome analyses. All 150 pig stomachs showed lesions. F. gastrosuis was detected in 115 cases (77%) and H. suis in 117 cases (78%), with 92 cases (61%) of co-infection; H. pylori-like organisms were detected in one case. Higher infectious loads of H. suis increased the odds of severe gastric lesions (OR = 1.14, p = 0.038), while the presence of H. suis infection in the pyloric gland zone increased the probability of pars oesophageal erosions [16.4% (95% CI 0.6 to 32.2%)]. The causal effect of H. suis was mediated by decreased pars oesophageal microbiome diversity [-1.9% (95% CI -5.0 to 1.2%)], increased abundances of Veillonella and Campylobacter spp., and decreased abundances of Lactobacillus, Escherichia-Shigella, and Enterobacteriaceae spp. Higher infectious loads of F. gastrosuis in the pars oesophagea decreased the odds of severe gastric lesions (OR = 0.8, p = 0.0014). Feed pelleting had no significant impact on the prevalence of severe gastric lesions (OR = 1.72, p = 0.28). H. suis infections are a risk factor for ulceration of the porcine pars oesophagea, probably mediated through alterations in pars oesophageal microbiome diversity and composition.
|
7
|
Stock M, Gorochowski TE. Open-endedness in synthetic biology: A route to continual innovation for biological design. SCIENCE ADVANCES 2024; 10:eadi3621. [PMID: 38241375 DOI: 10.1126/sciadv.adi3621] [Received: 04/20/2023] [Accepted: 12/20/2023] [Indexed: 01/21/2024]
Abstract
Design in synthetic biology is typically goal oriented, aiming to repurpose or optimize existing biological functions, augmenting biology with new-to-nature capabilities, or creating life-like systems from scratch. While the field has seen many advances, bottlenecks in the complexity of the systems built are emerging and designs that function in the lab often fail when used in real-world contexts. Here, we propose an open-ended approach to biological design, with the novelty of designed biology being at least as important as how well it fulfils its goal. Rather than solely focusing on optimization toward a single best design, designing with novelty in mind may allow us to move beyond the diminishing returns we see in performance for most engineered biology. Research from the artificial life community has demonstrated that embracing novelty can automatically generate innovative and unexpected solutions to challenging problems beyond local optima. Synthetic biology offers the ideal playground to explore more creative approaches to biological design.
|
8
|
Stock M, Pieters O, De Swaef T, wyffels F. Plant science in the age of simulation intelligence. FRONTIERS IN PLANT SCIENCE 2024; 14:1299208. [PMID: 38293629 PMCID: PMC10824965 DOI: 10.3389/fpls.2023.1299208] [Received: 09/22/2023] [Accepted: 12/07/2023] [Indexed: 02/01/2024]
Abstract
Historically, plant and crop sciences have been quantitative fields that intensively use measurements and modeling. Traditionally, researchers choose between two dominant modeling approaches: mechanistic plant growth models or data-driven, statistical methodologies. At the intersection of both paradigms, a novel approach referred to as "simulation intelligence", has emerged as a powerful tool for comprehending and controlling complex systems, including plants and crops. This work explores the transformative potential for the plant science community of the nine simulation intelligence motifs, from understanding molecular plant processes to optimizing greenhouse control. Many of these concepts, such as surrogate models and agent-based modeling, have gained prominence in plant and crop sciences. In contrast, some motifs, such as open-ended optimization or program synthesis, still need to be explored further. The motifs of simulation intelligence can potentially revolutionize breeding and precision farming towards more sustainable food production.
|
9
|
Knäusl B, Langgartner L, Stock M, Janson M, Furutani KM, Beltran CJ, Georg D, Resch AF. Requirements for dose calculation on an active scanned proton beamline for small, shallow fields. Phys Med 2023; 113:102659. [PMID: 37598612 DOI: 10.1016/j.ejmp.2023.102659] [Received: 03/24/2023] [Revised: 06/18/2023] [Accepted: 08/05/2023] [Indexed: 08/22/2023]
Abstract
INTRODUCTION Interest in using proton pencil beam scanning (PBS) in combination with collimators for the treatment of small, shallow targets, such as ocular melanoma, or for pre-clinical research has grown recently. This study aims to demonstrate that the dose of a synchrotron-based PBS system with a dedicated small, shallow field nozzle can be accurately predicted by a commercial treatment planning system (TPS) following appropriate tuning of both the nozzle and the TPS. MATERIALS A removable extension to the clinical nozzle was developed to modify the beam shape passively. Five circular apertures with diameters between 5 and 34 mm, mounted 72 cm downstream of a range shifter, were used. For each collimator, treatment plans with spread-out Bragg peaks (SOBP) with a modulation of 3 to 30 mm were measured and calculated with GATE/Geant4 and the research TPS RayStation (RS11B-R). The dose grid, multiple Coulomb scattering and block discretization resolution were varied to find the optimal balance between accuracy and performance. RESULTS For SOBPs deeper than 10 mm, the dose in the target agreed within 1% between RS11B-R, GATE/Geant4 and measurements for aperture diameters between 8 and 34 mm, but deviated by up to 5% for smaller apertures. A plastic taper was introduced, reducing scatter contributions to the patient (from the pipe) and improving the dose calculation accuracy of the TPS to a 5% level in the entrance region for large apertures. CONCLUSION The commercial TPS and GATE/Geant4 can accurately calculate the dose for shallow, small proton fields using a collimator and pencil beam scanning.
|
10
|
Van Haeverbeke M, De Baets B, Stock M. Plant impedance spectroscopy: a review of modeling approaches and applications. FRONTIERS IN PLANT SCIENCE 2023; 14:1187573. [PMID: 37588419 PMCID: PMC10426379 DOI: 10.3389/fpls.2023.1187573] [Received: 03/16/2023] [Accepted: 06/20/2023] [Indexed: 08/18/2023]
Abstract
Electrochemical impedance spectroscopy has emerged over the past decade as an efficient, non-destructive method to investigate various (eco-)physiological and morphological properties of plants. This work reviews the state-of-the-art of impedance spectra modeling for plant applications. In addition to covering the traditional, widely-used representations of electrochemical impedance spectra, we also consider the more recent machine-learning-based approaches.
|
11
|
Tubin S, Vozenin M, Prezado Y, Durante M, Prise K, Lara P, Greco C, Massaccesi M, Guha C, Wu X, Mohiuddin M, Vestergaard A, Bassler N, Gupta S, Stock M, Timmerman R. Novel unconventional radiotherapy techniques: Current status and future perspectives - Report from the 2nd international radiation oncology online seminar. Clin Transl Radiat Oncol 2023; 40:100605. [PMID: 36910025 PMCID: PMC9996385 DOI: 10.1016/j.ctro.2023.100605] [Received: 12/14/2022] [Revised: 02/16/2023] [Accepted: 02/19/2023] [Indexed: 02/25/2023]
Abstract
• Improvement of therapeutic ratio by novel unconventional radiotherapy approaches.
• Immunomodulation using high-dose spatially fractionated radiotherapy.
• Boosting radiation anti-tumor effects by adding an immune-mediated cell killing.
|
12
|
Lütgendorf-Caucig C, Flechl B, Konrath L, Pelak M, Fraller A, Mock U, Fossati P, Stock M, Georg P, Hug E. JS09.6.A Low incidence of radiation-induced brain lesions and stable QoL following proton irradiation for CNS and Skull Base tumors- results from the prospective MedAustron register REGI-MA-002015. Neuro Oncol 2022. [DOI: 10.1093/neuonc/noac174.028] [Indexed: 11/13/2022]
Abstract
Background
Irradiation of intracranial tumors may induce endothelial damage in the surrounding normal brain tissues, resulting in an increase of capillary permeability. These changes can be depicted on magnetic resonance imaging (MRI) as a new contrast medium uptake - not associated with tumor. Radiation-induced brain lesions (RIBL) occur after photon as well as proton irradiation. This study evaluated the incidence of RIBL after proton irradiation and their impact on Quality of Life (QoL).
Material and Methods
A total of 421 patients treated between 01/2017 and 06/2021 were included. All patients participated in a prospective registry study (ClinicalTrials.gov Identifier: NCT03049072). Follow-up evaluations, including MRIs, took place at 3, 6, and 12 months after treatment completion and annually thereafter. QoL parameters were assessed by EORTC-CTC30 and BN20 questionnaires.
Results
48.9% (n=206) of patients received therapy for intracranial non-CNS tumors (meningioma, pituitary adenoma, and other), 26.8% (n=113) for head and neck cancer with skull base involvement, 14.5% (n=61) for primary CNS tumors and 9.7% (n=41) for skull base tumor. Median follow-up was 24 months (range 6-54). 352 (86%) patients had proton therapy as primary treatment, and 59 (14%) had salvage treatment with proton re-irradiation (ReRT). Median prescribed dose was 58.5 Gy (RBE) (range 40-78 Gy (RBE)), and median D1% of brain tissue was 54.3 Gy (RBE) (range 30-76 Gy (RBE)). Local control and overall survival were 91% and 95% at 2 years. The cumulative RIBL incidence was 15.0% (n=63), with significantly lower occurrence in the primary RT group vs. the ReRT group (12.9% vs. 27.1%; p<0.001). By grade, 10.5% (n=44) of patients had Grade 1 RIBL (asymptomatic, MRI finding only), 3.1% (n=13) had Grade 2 RIBL (moderate symptoms), and 1.4% (n=6) developed Grade 3 toxicity. Actuarial 2-year RIBL incidence was 18.2% (95% CI: 14.1-23.2) for all grades and the entire cohort, 15.7% (95% CI: 11.6-21) following primary radiation and 34.2% (95% CI: 21.9-50.9) after ReRT. All RIBL developed outside the residual tumor but inside the Planning Target Volume (PTV); median D1% was 60.3 Gy (RBE) (range 46.1-122.3 Gy (RBE)). Median time to development was 11.8 months (range 2.7-37 months) in the total cohort, 14.2 months (range 4.3-37.1 months) for primary RT, and 6.0 months (range 2.7-19.3 months) following ReRT. At the time of analysis, 26 of the 63 RIBL had resolved (41.3%). General QoL was not compromised. In a matched-pair analysis of 54/50 patients with/without RIBL, a significant difference in the global health score in favour of non-RIBL patients was observed only at 12 months. At 24 months, the score for RIBL patients had improved, with no difference between the groups.
Conclusion
The overall incidence of RIBL after proton radiotherapy is very low, even for skull base tumors requiring high total doses, and it had no significant negative impact on long-term QoL.
|
13
|
Janssens LK, Boeckaerts D, Hudson S, Morozova D, Cannaert A, Wood DM, Wolfe C, De Baets B, Stock M, Dargan PI, Stove CP. Large-scale activity-based SCRA screening on patient serum samples: CB1 bioassay supported by machine learning. TOXICOLOGIE ANALYTIQUE ET CLINIQUE 2022. [DOI: 10.1016/j.toxac.2022.06.060] [Indexed: 10/15/2022]
|
14
|
Orenstein EC, Ayata S, Maps F, Becker ÉC, Benedetti F, Biard T, de Garidel‐Thoron T, Ellen JS, Ferrario F, Giering SLC, Guy‐Haim T, Hoebeke L, Iversen MH, Kiørboe T, Lalonde J, Lana A, Laviale M, Lombard F, Lorimer T, Martini S, Meyer A, Möller KO, Niehoff B, Ohman MD, Pradalier C, Romagnan J, Schröder S, Sonnet V, Sosik HM, Stemmann LS, Stock M, Terbiyik‐Kurt T, Valcárcel‐Pérez N, Vilgrain L, Wacquet G, Waite AM, Irisson J. Machine learning techniques to characterize functional traits of plankton from image data. LIMNOLOGY AND OCEANOGRAPHY 2022; 67:1647-1669. [PMID: 36247386 PMCID: PMC9543351 DOI: 10.1002/lno.12101] [Received: 07/16/2021] [Revised: 04/21/2022] [Accepted: 04/27/2022] [Indexed: 06/16/2023]
Abstract
Plankton imaging systems supported by automated classification and analysis have improved ecologists' ability to observe aquatic ecosystems. Today, we are on the cusp of reliably tracking plankton populations with a suite of lab-based and in situ tools, collecting imaging data at unprecedentedly fine spatial and temporal scales. But these data have potential well beyond examining the abundances of different taxa; the individual images themselves contain a wealth of information on functional traits. Here, we outline traits that could be measured from image data, suggest machine learning and computer vision approaches to extract functional trait information from the images, and discuss promising avenues for novel studies. The approaches we discuss are data agnostic and are broadly applicable to imagery of other aquatic or terrestrial organisms.
|
15
|
Becker DJ, Albery GF, Sjodin AR, Poisot T, Bergner LM, Chen B, Cohen LE, Dallas TA, Eskew EA, Fagre AC, Farrell MJ, Guth S, Han BA, Simmons NB, Stock M, Teeling EC, Carlson CJ. Optimising predictive models to prioritise viral discovery in zoonotic reservoirs. THE LANCET. MICROBE 2022; 3:e625-e637. [PMID: 35036970 PMCID: PMC8747432 DOI: 10.1016/s2666-5247(21)00245-7] [Indexed: 01/03/2023]
Abstract
Despite the global investment in One Health disease surveillance, it remains difficult and costly to identify and monitor the wildlife reservoirs of novel zoonotic viruses. Statistical models can guide sampling target prioritisation, but the predictions from any given model might be highly uncertain; moreover, systematic model validation is rare, and the drivers of model performance are consequently under-documented. Here, we use the bat hosts of betacoronaviruses as a case study for the data-driven process of comparing and validating predictive models of probable reservoir hosts. In early 2020, we generated an ensemble of eight statistical models that predicted host-virus associations and developed priority sampling recommendations for potential bat reservoirs of betacoronaviruses and bridge hosts for SARS-CoV-2. During a time frame of more than a year, we tracked the discovery of 47 new bat hosts of betacoronaviruses, validated the initial predictions, and dynamically updated our analytical pipeline. We found that ecological trait-based models performed well at predicting these novel hosts, whereas network methods consistently performed approximately as well as, or worse than, expected at random. These findings illustrate the importance of ensemble modelling as a buffer against mixed-model quality and highlight the value of including host ecology in predictive models. Our revised models showed an improved performance compared with the initial ensemble, and predicted more than 400 bat species globally that could be undetected betacoronavirus hosts. We show, through systematic validation, that machine learning models can help to optimise wildlife sampling for undiscovered viruses and illustrate how such approaches are best implemented through a dynamic process of prediction, data collection, validation, and updating.
|
16
|
Fernandez J, Fitzgerald C, Rouzard K, Tamura M, Healy J, Tao K, Guo L, Hu X, Stock M, Stock J, Perez E. 817 Encapsulated activated-grape seed extract (E-AGSE): A novel liposome-based formulation that promotes anti-aging, brightening and hydration in human skin. J Invest Dermatol 2022. [DOI: 10.1016/j.jid.2022.05.831] [Indexed: 10/17/2022]
|
17
|
Van Huffel K, Stock M, Ruttink T, De Baets B. Covering the Combinatorial Design Space of Multiplex CRISPR/Cas Experiments in Plants. FRONTIERS IN PLANT SCIENCE 2022; 13:907095. [PMID: 35795354 PMCID: PMC9251496 DOI: 10.3389/fpls.2022.907095] [Received: 03/29/2022] [Accepted: 05/16/2022] [Indexed: 06/15/2023]
Abstract
Over the past years, CRISPR/Cas-mediated genome editing has revolutionized plant genetic studies and crop breeding. Specifically, due to its ability to simultaneously target multiple genes, the multiplex CRISPR/Cas system has emerged as a powerful technology for functional analysis of genetic pathways. As such, it holds great potential for application in plant systems to discover genetic interactions and to improve polygenic agronomic traits in crop breeding. However, optimal experimental design regarding coverage of the combinatorial design space in multiplex CRISPR/Cas screens remains largely unexplored. To contribute to well-informed experimental design of such screens in plants, we first establish a representation of the design space at different stages of a multiplex CRISPR/Cas experiment. We provide two independent computational approaches yielding insights into the plant library size guaranteeing full coverage of all relevant multiplex combinations of gene knockouts in a specific multiplex CRISPR/Cas screen. These frameworks take into account several design parameters (e.g., the number of target genes, the number of gRNAs designed per gene, and the number of elements in the combinatorial array) and efficiencies at subsequent stages of a multiplex CRISPR/Cas experiment (e.g., the distribution of gRNA/Cas delivery, gRNA-specific mutation efficiency, and knockout efficiency). With this work, we intend to raise awareness about the limitations regarding the number of target genes and order of genetic interaction that can be realistically analyzed in multiplex CRISPR/Cas experiments with a given number of plants. Finally, we establish guidelines for designing multiplex CRISPR/Cas experiments with an optimal coverage of the combinatorial design space at minimal plant library size.
|
18
|
Boeckaerts D, Stock M, De Baets B, Briers Y. Identification of Phage Receptor-Binding Protein Sequences with Hidden Markov Models and an Extreme Gradient Boosting Classifier. Viruses 2022; 14:1329. [PMID: 35746800 PMCID: PMC9230537 DOI: 10.3390/v14061329] [Received: 04/30/2022] [Revised: 06/09/2022] [Accepted: 06/16/2022] [Indexed: 11/30/2022]
Abstract
Receptor-binding proteins (RBPs) of bacteriophages initiate the infection of their corresponding bacterial host and act as the primary determinant for host specificity. The ever-increasing amount of sequence data enables the development of predictive models for the automated identification of RBP sequences. However, the development of such models is challenged by the inconsistent or missing annotation of many phage proteins. Recently developed tools have started to bridge this gap but are not specifically focused on RBP sequences, for which many different annotations are available. We have developed two parallel approaches to alleviate the complex identification of RBP sequences in phage genomic data. The first combines known RBP-related hidden Markov models (HMMs) from the Pfam database with custom-built HMMs to identify phage RBPs based on protein domains. The second approach consists of training an extreme gradient boosting classifier that can accurately discriminate between RBPs and other phage proteins. We explained how these complementary approaches can reinforce each other in identifying RBP sequences. In addition, we benchmarked our methods against the recently developed PhANNs tool. Our best performing model reached a precision-recall area-under-the-curve of 93.8% and outperformed PhANNs on an independent test set, reaching an F1-score of 84.0% compared to 69.8%.
|
19
|
Lebbink F, Stocchiero S, Engwall E, Stock M, Georg D, Knäusl B. PO-1714 parameter vs logfile based 4D proton dose tracking for small movers. Radiother Oncol 2022. [DOI: 10.1016/s0167-8140(22)03678-7] [Indexed: 11/26/2022]
|
20
|
Barna S, Meouchi C, Magrin G, Conte V, Stock M, Resch A, Georg D, Palmans H. PD-0815 Microdosimetry with tissue-equivalent proportional counters at an ion beam therapy facility. Radiother Oncol 2022. [DOI: 10.1016/s0167-8140(22)02956-5] [Indexed: 10/18/2022]
|
21
|
Van Huffel K, Stock M, De Baets B. BioCCP.jl: collecting coupons in combinatorial biotechnology. Bioinformatics 2022; 38:1144-1145. [PMID: 34788379 DOI: 10.1093/bioinformatics/btab775] [Received: 07/27/2021] [Revised: 10/15/2021] [Accepted: 11/08/2021] [Indexed: 02/03/2023]
Abstract
Summary: In combinatorial biotechnology, it is crucial for screening experiments to sufficiently cover the design space. In the BioCCP.jl package (Julia), we provide functions for minimum sample size determination based on the mathematical framework coined the Coupon Collector Problem. Availability and implementation: BioCCP.jl, including source code, documentation and Pluto notebooks, is available at https://github.com/kirstvh/BioCCP.jl.
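To illustrate the Coupon Collector Problem mentioned above, here is a minimal Python sketch for the simplest textbook case, in which all n design variants are equally likely to be sampled. It is not the BioCCP.jl API (that package is written in Julia); the function names are hypothetical and the equal-probability setting is an assumption made for illustration. The coverage probability uses the standard inclusion-exclusion formula, which is numerically reliable only for small n.

```python
from math import comb


def expected_draws(n):
    """Expected number of samples needed to see all n equally likely variants.

    Classic result: n * H_n, where H_n is the n-th harmonic number.
    """
    return n * sum(1.0 / k for k in range(1, n + 1))


def coverage_probability(n, t):
    """Probability that t samples cover all n equally likely variants.

    Inclusion-exclusion over the number k of variants that are missed.
    Alternating sums lose precision for large n; use small n only.
    """
    return sum((-1) ** k * comb(n, k) * (1.0 - k / n) ** t for k in range(n + 1))


def min_sample_size(n, p):
    """Smallest t such that all n variants are covered with probability >= p."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    t = n  # fewer than n samples can never cover n variants
    while coverage_probability(n, t) < p:
        t += 1
    return t


if __name__ == "__main__":
    n_variants = 10  # illustrative design-space size
    print(expected_draws(n_variants))
    print(min_sample_size(n_variants, 0.95))
```

The expectation alone can be misleading when planning a screen, since roughly half of all experiments of that size would still miss at least one variant. A sample size chosen for a target coverage probability is the more conservative planning number.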
|
22
|
Mey F, Clauwaert J, Van Brempt M, Stock M, Maertens J, Waegeman W, De Mey M. ProD: A Tool for Predictive Design of Tailored Promoters in Escherichia coli. Methods Mol Biol 2022; 2516:51-59. [PMID: 35922621 DOI: 10.1007/978-1-0716-2413-5_4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/15/2023]
Abstract
A major goal in synthetic biology is the engineering of synthetic gene circuits with a predictable, controlled and designed outcome. This creates a need for building blocks that can modulate gene expression without interfering with the native cell system. A tool allowing forward engineering of promoters with predictable transcription initiation frequency is still lacking. Promoter libraries specific for σ70, chosen to ensure the orthogonality of gene expression, were built in Escherichia coli and labeled using fluorescence-activated cell sorting, yielding high-throughput DNA sequencing data used to train a convolutional neural network. We confirmed in vivo that the model can predict the transcription initiation frequency (TIF) of new promoter sequences. Here, we provide an online tool for promoter design (ProD) in E. coli, which can be used to tailor output sequences to a desired promoter TIF or to predict the TIF of a custom sequence.
|
23
|
Lood C, Boeckaerts D, Stock M, De Baets B, Lavigne R, van Noort V, Briers Y. Digital phagograms: predicting phage infectivity through a multilayer machine learning approach. Curr Opin Virol 2021; 52:174-181. [PMID: 34952265 DOI: 10.1016/j.coviro.2021.12.004] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 09/14/2021] [Revised: 11/26/2021] [Accepted: 12/04/2021] [Indexed: 12/19/2022]
Abstract
Machine learning has been broadly implemented to investigate biological systems. In this regard, the field of phage biology has embraced machine learning to elucidate and predict phage-host interactions, based on receptor-binding proteins, (anti-)defense systems, prophage detection, and life cycle recognition. Here, we highlight the enormous potential of integrating information from omics data with insights from systems biology to better understand phage-host interactions. We conceptualize and discuss the potential of a multilayer model that mirrors the phage infection process, integrating adsorption, bacterial pan-immune components and hijacking of the bacterial metabolism to predict phage infectivity. In the future, this model can offer insights into the underlying mechanisms of the infection process, and digital phagograms can support phage cocktail design and phage engineering.
|
24
|
Knäusl B, Zimmermann L, Stock M, Lütgendorf-Caucig C, Georg D, Kuess P. PO-1674 An MRI sequence independent Convolutional Neural Network for head sCT generation in proton therapy. Radiother Oncol 2021. [DOI: 10.1016/s0167-8140(21)08125-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/20/2022]
|
25
|
Grevillot L, Dreindl R, Fayos-Solà Capilla R, Elia A, Bolsa-Ferruz M, Gora J, Amico A, Padilla-Cabal F, Carlino A, Stock M. PH-0597 Commissioning and clinical implementation of myQAiON for proton independent dose calculation (IDC). Radiother Oncol 2021. [DOI: 10.1016/s0167-8140(21)07369-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/20/2022]
|