Faure E, Ayata SD, Bittner L. Towards omics-based predictions of planktonic functional composition from environmental data. Nat Commun 2021;12:4361. [PMID: 34272373 PMCID: PMC8285379 DOI: 10.1038/s41467-021-24547-1]
[Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 04/24/2020] [Accepted: 05/25/2021] [Indexed: 02/06/2023]
Abstract
Marine microbes play a crucial role in climate regulation, biogeochemical cycles, and trophic networks. Unprecedented amounts of data on planktonic communities were recently collected, sparking a need for innovative data-driven methodologies to quantify and predict their ecosystemic functions. We reanalyze 885 marine metagenome-assembled genomes through a network-based approach and detect 233,756 protein functional clusters, of which 15% are functionally unannotated. We investigate all clusters' distributions across the global ocean through machine learning, identifying biogeographical provinces as the best predictors of protein functional clusters' abundance. The abundances of 14,585 clusters are predictable from the environmental context, including 1,347 functionally unannotated clusters. We analyze the biogeography of these 14,585 clusters, identifying the Mediterranean Sea as an outlier in terms of protein functional cluster composition. Applicable to any set of sequences, our approach constitutes a step towards quantitative predictions of functional composition from the environmental context.