1
Tang X, Shang J, Ji Y, Sun Y. PLASMe: a tool to identify PLASMid contigs from short-read assemblies using transformer. Nucleic Acids Res 2023; 51:e83. [PMID: 37427782 PMCID: PMC10450166 DOI: 10.1093/nar/gkad578] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Received: 12/16/2022] [Revised: 06/19/2023] [Accepted: 06/26/2023] [Indexed: 07/11/2023]
Abstract
Plasmids are mobile genetic elements that carry important accessory genes. Cataloging plasmids is a fundamental step to elucidate their roles in promoting horizontal gene transfer between bacteria. Next-generation sequencing (NGS) is the main source for discovering new plasmids today. However, NGS assembly programs tend to return contigs, making plasmid detection difficult. This problem is particularly grave for metagenomic assemblies, which contain short contigs of heterogeneous origins. Available tools for plasmid contig detection still suffer from some limitations. In particular, alignment-based tools tend to miss diverged plasmids, while learning-based tools often have lower precision. In this work, we develop a plasmid detection tool, PLASMe, that capitalizes on the strengths of both alignment-based and learning-based methods. Closely related plasmids can be easily identified using the alignment component in PLASMe, while diverged plasmids can be predicted using order-specific Transformer models. By encoding plasmid sequences as a language defined on a protein cluster-based token set, the Transformer can learn the importance of proteins and their correlations through positional token embedding and the attention mechanism. We compared PLASMe and other tools on detecting complete plasmids, plasmid contigs, and contigs assembled from CAMI2 simulated data. PLASMe achieved the highest F1-score. After validating PLASMe on data with known labels, we also tested it on real metagenomic and plasmidome data. The examination of some commonly used marker genes shows that PLASMe exhibits more reliable performance than other tools.
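As a reading aid for the F1-score reported in this abstract, the following is a minimal sketch of how the metric combines precision and recall. The function name `f1_score` and all counts are hypothetical and chosen only for illustration; they are not taken from the paper or from PLASMe's code.

```python
def f1_score(tp: int, fp: int, fn: int) -> float:
    """Harmonic mean of precision and recall, computed from raw counts."""
    if tp == 0:
        # No true positives: precision and recall are both zero (or undefined).
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts (NOT from the paper), for two styles of tool.
# A precise tool that misses many diverged plasmids (low recall):
alignment_like = f1_score(tp=60, fp=2, fn=40)
# A sensitive tool that raises more false alarms (low precision):
learning_like = f1_score(tp=90, fp=45, fn=10)

print(f"alignment-like F1: {alignment_like:.3f}")
print(f"learning-like F1:  {learning_like:.3f}")
```

Because F1 is a harmonic mean, a tool scores well only when precision and recall are both high, which is why the abstract treats missed diverged plasmids and low precision as separate limitations.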
Affiliation(s)
- Xubo Tang
  - Department of Electrical Engineering, City University of Hong Kong, Tat Chee Avenue, Kowloon, Hong Kong SAR, China
- Jiayu Shang
  - Department of Electrical Engineering, City University of Hong Kong, Tat Chee Avenue, Kowloon, Hong Kong SAR, China
- Yongxin Ji
  - Department of Electrical Engineering, City University of Hong Kong, Tat Chee Avenue, Kowloon, Hong Kong SAR, China
- Yanni Sun
  - Department of Electrical Engineering, City University of Hong Kong, Tat Chee Avenue, Kowloon, Hong Kong SAR, China
2
Cai Z, Li P, Zhu W, Wei J, Lu J, Song X, Li K, Li S, Li M. Metagenomic analysis reveals gut plasmids as diagnosis markers for colorectal cancer. Front Microbiol 2023; 14:1130446. [PMID: 37283932 PMCID: PMC10239823 DOI: 10.3389/fmicb.2023.1130446] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/23/2022] [Accepted: 05/09/2023] [Indexed: 06/08/2023]
Abstract
Background: Colorectal cancer (CRC) is linked to distinct gut microbiome patterns. The efficacy of gut bacteria as diagnostic biomarkers for CRC has been confirmed. Despite its potential to influence microbiome physiology and evolution, the set of plasmids in the gut microbiome remains understudied.
Methods: We investigated the essential features of gut plasmids using metagenomic data from 1,242 samples across eight distinct geographic cohorts. We identified 198 plasmid-related sequences that differed in abundance between CRC patients and controls and screened 21 markers for the CRC diagnosis model. We used these plasmid markers, combined with bacterial markers, to construct a random forest classifier for diagnosing CRC.
Results: The plasmid markers were able to distinguish between CRC patients and controls (mean area under the receiver operating characteristic curve, AUC = 0.70) and maintained accuracy in two independent cohorts. Compared with the bacteria-only model, the composite panel combining plasmid and bacterial features performed significantly better in all training cohorts (mean AUC 0.804 for the composite panel vs. 0.787 for bacteria only) and maintained high accuracy in all independent cohorts (mean AUC 0.839 for the composite panel vs. 0.821 for bacteria only). Compared with controls, the bacteria-plasmid correlation strength was weaker in CRC patients. Additionally, the KEGG orthology (KO) genes in plasmids that are independent of bacteria or plasmids significantly correlated with CRC.
Conclusion: We identified plasmid features associated with CRC and showed how plasmid and bacterial markers could be combined to further enhance CRC diagnosis accuracy.
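To make the AUC values quoted in this abstract easier to interpret, here is a minimal sketch of the metric as a pairwise ranking probability. The function name `roc_auc`, the labels, and the scores are all hypothetical illustrations; none of them come from the study's data or code.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a random positive outranks a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical classifier outputs (NOT from the study): 1 = CRC, 0 = control.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.6, 0.4, 0.7, 0.3, 0.2]
auc = roc_auc(labels, scores)
print(f"AUC = {auc:.3f}")
```

Under this reading, an AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation of patients from controls, so the gap between the composite and bacteria-only panels reflects a modest improvement in ranking quality.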
Affiliation(s)
- Zhiyuan Cai
  - Guangdong Provincial Key Laboratory of Biomedical Imaging and Guangdong Provincial Engineering Research Center of Molecular Imaging, The Fifth Affiliated Hospital of Sun Yat-sen University, Zhuhai, China
- Ping Li
  - Guangdong Provincial Key Laboratory of Biomedical Imaging and Guangdong Provincial Engineering Research Center of Molecular Imaging, The Fifth Affiliated Hospital of Sun Yat-sen University, Zhuhai, China
- Wen Zhu
  - Guangdong Provincial Key Laboratory of Biomedical Imaging and Guangdong Provincial Engineering Research Center of Molecular Imaging, The Fifth Affiliated Hospital of Sun Yat-sen University, Zhuhai, China
- Jingyue Wei
  - Guangdong Provincial Key Laboratory of Biomedical Imaging and Guangdong Provincial Engineering Research Center of Molecular Imaging, The Fifth Affiliated Hospital of Sun Yat-sen University, Zhuhai, China
- Jieyu Lu
  - Guangdong Provincial Key Laboratory of Biomedical Imaging and Guangdong Provincial Engineering Research Center of Molecular Imaging, The Fifth Affiliated Hospital of Sun Yat-sen University, Zhuhai, China
- Xiaoyi Song
  - Guangdong Provincial Key Laboratory of Biomedical Imaging and Guangdong Provincial Engineering Research Center of Molecular Imaging, The Fifth Affiliated Hospital of Sun Yat-sen University, Zhuhai, China
- Kunwei Li
  - Radiology Department, The Fifth Affiliated Hospital of Sun Yat-sen University, Zhuhai, China
- Sikai Li
  - Guangdong Provincial Key Laboratory of Biomedical Imaging and Guangdong Provincial Engineering Research Center of Molecular Imaging, The Fifth Affiliated Hospital of Sun Yat-sen University, Zhuhai, China
- Man Li
  - Guangdong Provincial Key Laboratory of Biomedical Imaging and Guangdong Provincial Engineering Research Center of Molecular Imaging, The Fifth Affiliated Hospital of Sun Yat-sen University, Zhuhai, China
3
Srinivas M, O’Sullivan O, Cotter PD, van Sinderen D, Kenny JG. The Application of Metagenomics to Study Microbial Communities and Develop Desirable Traits in Fermented Foods. Foods 2022; 11:3297. [PMCID: PMC9601669 DOI: 10.3390/foods11203297] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Indexed: 11/18/2022]
Abstract
The microbial communities present within fermented foods are diverse and dynamic, producing a variety of metabolites responsible for the fermentation processes, imparting characteristic organoleptic qualities and health-promoting traits, and maintaining microbiological safety of fermented foods. In this context, it is crucial to study these microbial communities to characterise fermented foods and the production processes involved. High Throughput Sequencing (HTS)-based methods such as metagenomics enable microbial community studies through amplicon and shotgun sequencing approaches. As the field constantly develops, sequencing technologies are becoming more accessible, affordable and accurate, with a further shift from short-read to long-read sequencing being observed. Metagenomics is enjoying widespread application in fermented food studies and in recent years is also being employed in concert with synthetic biology techniques to help tackle problems with the large amounts of waste generated in the food sector. This review presents an introduction to current sequencing technologies and the benefits of their application in fermented foods.
Affiliation(s)
- Meghana Srinivas
  - Food Biosciences Department, Teagasc Food Research Centre, Moorepark, P61 C996 Cork, Ireland
  - APC Microbiome Ireland, University College Cork, T12 CY82 Cork, Ireland
  - School of Microbiology, University College Cork, T12 CY82 Cork, Ireland
- Orla O’Sullivan
  - Food Biosciences Department, Teagasc Food Research Centre, Moorepark, P61 C996 Cork, Ireland
  - APC Microbiome Ireland, University College Cork, T12 CY82 Cork, Ireland
  - VistaMilk SFI Research Centre, Fermoy, P61 C996 Cork, Ireland
- Paul D. Cotter
  - Food Biosciences Department, Teagasc Food Research Centre, Moorepark, P61 C996 Cork, Ireland
  - APC Microbiome Ireland, University College Cork, T12 CY82 Cork, Ireland
  - VistaMilk SFI Research Centre, Fermoy, P61 C996 Cork, Ireland
- Douwe van Sinderen
  - APC Microbiome Ireland, University College Cork, T12 CY82 Cork, Ireland
  - School of Microbiology, University College Cork, T12 CY82 Cork, Ireland
- John G. Kenny
  - Food Biosciences Department, Teagasc Food Research Centre, Moorepark, P61 C996 Cork, Ireland
  - APC Microbiome Ireland, University College Cork, T12 CY82 Cork, Ireland
  - VistaMilk SFI Research Centre, Fermoy, P61 C996 Cork, Ireland
  - Correspondence:
4
Berbers B, Ceyssens PJ, Bogaerts P, Vanneste K, Roosens NHC, Marchal K, De Keersmaecker SCJ. Development of an NGS-Based Workflow for Improved Monitoring of Circulating Plasmids in Support of Risk Assessment of Antimicrobial Resistance Gene Dissemination. Antibiotics (Basel) 2020; 9:E503. [PMID: 32796589 PMCID: PMC7460218 DOI: 10.3390/antibiotics9080503] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 07/17/2020] [Revised: 08/07/2020] [Accepted: 08/08/2020] [Indexed: 11/29/2022]
Abstract
Antimicrobial resistance (AMR) is one of the most prominent public health threats. AMR genes localized on plasmids can be easily transferred between bacterial isolates by horizontal gene transfer, thereby contributing to the spread of AMR. Next-generation sequencing (NGS) technologies are ideal for the detection of AMR genes; however, reliable reconstruction of plasmids is still a challenge due to large repetitive regions. This study proposes a workflow to reconstruct plasmids with NGS data in view of AMR gene localization, i.e., chromosomal or on a plasmid. Whole-genome and plasmid DNA extraction methods were compared, as were assemblies consisting of short reads (Illumina MiSeq), long reads (Oxford Nanopore Technologies) and a combination of both (hybrid). Furthermore, the added value of conjugation of a plasmid to a known host was evaluated. As a case study, an isolate harboring a large, low-copy mcr-1-carrying plasmid (>200 kb) was used. Hybrid assemblies of NGS data obtained from whole-genome DNA extractions of the original isolates resulted in the most complete reconstruction of plasmids. The optimal workflow was successfully applied to multidrug-resistant Salmonella Kentucky isolates, where the transfer of an ESBL-gene-containing fragment from a plasmid to the chromosome was detected. This study highlights a strategy including wet and dry lab parameters that allows accurate plasmid reconstruction, which will contribute to an improved monitoring of circulating plasmids and the assessment of their risk of transfer.
Affiliation(s)
- Bas Berbers
  - Transversal Activities in Applied Genomics, Sciensano, 1050 Brussels, Belgium
  - Department of Information Technology, IDLab, Ghent University, IMEC, 9052 Ghent, Belgium
- Pierre Bogaerts
  - National Reference Center for Antimicrobial Resistance in Gram-Negative Bacteria, CHU UCL Namur, 5530 Yvoir, Belgium
- Kevin Vanneste
  - Transversal Activities in Applied Genomics, Sciensano, 1050 Brussels, Belgium
- Nancy H. C. Roosens
  - Transversal Activities in Applied Genomics, Sciensano, 1050 Brussels, Belgium
- Kathleen Marchal
  - Department of Information Technology, IDLab, Ghent University, IMEC, 9052 Ghent, Belgium
  - Department of Plant Biotechnology and Bioinformatics, Ghent University, 9052 Ghent, Belgium
  - Department of Genetics, University of Pretoria, Pretoria 0083, South Africa