Lucia-Sanz A, Peng S, Leung CY(J), Gupta A, Meyer JR, Weitz JS. Inferring strain-level mutational drivers of phage-bacteria interaction phenotypes. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2024:2024.01.08.574707. [PMID: 38260415 PMCID: PMC10802490 DOI: 10.1101/2024.01.08.574707]
[Indexed: 01/24/2024]
Abstract
The enormous diversity of bacteriophages and their bacterial hosts makes it challenging to predict which phages infect a focal set of bacteria. Infection is largely determined by the complementary, and largely uncharacterized, genetics of adsorption, injection, and cell take-over. Here we present a machine learning (ML) approach to predicting phage-bacteria interactions, trained on the genome sequences of, and phenotypic interactions among, 51 Escherichia coli strains and 45 phage λ strains that coevolved under laboratory conditions for 37 days. Leveraging multiple inference strategies and without a priori knowledge of driver mutations, this framework predicts both who infects whom and the quantitative levels of infection across a suite of 2,295 potential interactions. The most effective ML approach inferred interaction phenotypes from independent contributions of phage and bacterial mutations, predicting phage host range with 86% mean classification accuracy while reducing the relative error in the estimated strength of the infection phenotype by 40%. Further, transparent feature selection in the predictive model revealed 18 of 176 phage λ mutations and 6 of 18 E. coli mutations that significantly influence the outcome of phage-bacteria interactions, corroborating sites previously known to affect phage λ infections and identifying mutations in genes of unknown function not previously shown to influence bacterial resistance. While the genetic variation studied was limited to a focal, coevolved phage-bacteria system, the method's success at recapitulating strain-level infection outcomes provides a path toward strategies for inferring interactions in non-model systems, including those of therapeutic significance.
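To make the "independent contributions" idea concrete, the following is a minimal sketch of how a feature table for such a model might be laid out. It is not the authors' implementation. The strain and mutation counts come from the abstract; the presence/absence encoding, the function names (`random_profiles`, `additive_design_matrix`), and the randomly generated mutation profiles are assumptions made purely for illustration and carry no biological meaning.

```python
import random

# Counts taken from the abstract.
N_HOSTS, N_PHAGES = 51, 45            # E. coli strains, phage lambda strains
N_HOST_MUTS, N_PHAGE_MUTS = 18, 176   # candidate mutations per genome

def random_profiles(n_strains, n_muts, rng, p=0.1):
    """Synthetic presence/absence mutation profiles (placeholder, not real data)."""
    return [[1 if rng.random() < p else 0 for _ in range(n_muts)]
            for _ in range(n_strains)]

def additive_design_matrix(hosts, phages):
    """Build one row per host-phage pair: host mutations, then phage mutations.

    Plain concatenation, with no host-by-phage product terms, encodes the
    assumption that the two genomes contribute independently to the phenotype.
    """
    return [h + p for h in hosts for p in phages]

rng = random.Random(0)  # fixed seed so the sketch is reproducible
hosts = random_profiles(N_HOSTS, N_HOST_MUTS, rng)
phages = random_profiles(N_PHAGES, N_PHAGE_MUTS, rng)
X = additive_design_matrix(hosts, phages)

# 51 x 45 = 2295 pairs, each described by 18 + 176 = 194 features.
print(len(X), len(X[0]))  # prints "2295 194"
```

A classifier (who infects whom) or a regressor (infection strength) could then be fit on the rows of `X`. Under this additive layout each fitted coefficient corresponds to a single mutation, which is what would make feature selection readable.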