1
Tominello-Ramirez CS, Muñoz Hoyos L, Oubounyt M, Stam R. Network analyses predict major regulators of resistance to early blight disease complex in tomato. BMC Plant Biology 2024; 24:641. [PMID: 38971719; PMCID: PMC11227178; DOI: 10.1186/s12870-024-05366-0] [Received: 05/13/2024] [Accepted: 07/01/2024] [Indexed: 07/08/2024]
Abstract
BACKGROUND: Early blight and brown leaf spot are often cited as the most problematic pathogens of tomato in many agricultural regions. Their causal agents are Alternaria spp., a genus of Ascomycota containing numerous necrotrophic pathogens. Breeding programs have yielded quantitatively resistant commercial cultivars, but fungicide application remains necessary to mitigate yield losses. A major hindrance to resistance breeding is the complexity of the genetic determinants of resistance and susceptibility. In the absence of sufficiently resistant germplasm, we sequenced the transcriptomes of Heinz 1706 tomatoes treated with strongly virulent and weakly virulent isolates of Alternaria spp. 3 h post infection. We expanded existing functional gene annotations in tomato and, using network statistics, analyzed the transcriptional modules associated with defense and susceptibility.
RESULTS: The induced responses are very distinct. The weakly virulent isolate induced a defense response of calcium signaling, hormone responses, and transcription factors. These defense-associated processes were found in a single transcriptional module alongside secondary metabolite biosynthesis genes and other defense responses. Co-expression and gene regulatory networks independently predicted several D clade ethylene response factors to be early regulators of the defense transcriptional module, as well as other transcription factors both known and novel in pathogen defense, including several JA-associated genes. In contrast, the strongly virulent isolate elicited a much weaker response and a separate transcriptional module bereft of hormone signaling.
CONCLUSIONS: Our findings have predicted major defense regulators and several targets for downstream functional analyses. Combined with our improved gene functional annotation, they suggest that defense is achieved through induction of Alternaria-specific immune pathways, and susceptibility is mediated by modulating hormone responses. The implication of multiple specific clade D ethylene response factors and upregulation of JA-associated genes suggests that host defense in this pathosystem involves ethylene response factors to modulate jasmonic acid signaling.
Affiliation(s)
- Christopher S Tominello-Ramirez
- Department of Phytopathology and Crop Protection, Institute for Phytopathology, Christian Albrechts University, Kiel, Germany
- Phytopathology, TUM School of Life Sciences Weihenstephan, Technical University of Munich, Freising, Germany
- Lina Muñoz Hoyos
- Phytopathology, TUM School of Life Sciences Weihenstephan, Technical University of Munich, Freising, Germany
- Mhaned Oubounyt
- Institute for Computational Systems Biology, University of Hamburg, Hamburg, Germany
- Remco Stam
- Department of Phytopathology and Crop Protection, Institute for Phytopathology, Christian Albrechts University, Kiel, Germany.
- Phytopathology, TUM School of Life Sciences Weihenstephan, Technical University of Munich, Freising, Germany.
2
Slyusarev GS, Skalon EK, Starunov VV. Evolution of Orthonectida body plan. Evol Dev 2024; 26:e12462. [PMID: 37889073; DOI: 10.1111/ede.12462] [Received: 03/08/2023] [Revised: 07/18/2023] [Accepted: 10/15/2023] [Indexed: 10/28/2023]
Abstract
Orthonectida is an enigmatic group of animals with a still uncertain phylogenetic position. Orthonectids parasitize various marine invertebrates. Their life cycle comprises a parasitic plasmodium and free-living males and females. Sexual individuals develop inside the plasmodium; after egress from the host they copulate in the external environment, and the larva, which has developed inside the female, infects a new host. Across the orthonectid species studied so far, simplification of the free-living sexual individuals can be clearly traced. The number of longitudinal and transverse muscle fibers is gradually reduced. In the nervous system, simplification is even more pronounced. The number of neurons constituting the ganglion is dramatically reduced from 200 in Rhopalura ophiocomae to 4-6 in Intoshia variabilis. The peripheral nervous system undergoes gradual simplification as well. The morphological simplification is accompanied by genome reduction. Not only are genes lost from the genome; it also undergoes compaction through extreme reduction of intergenic distances, short intron sizes, and elimination of repetitive elements. The main trend in orthonectid evolution is simplification and miniaturization of free-living sexual individuals coupled with reduction and compaction of the genome.
Affiliation(s)
- George S Slyusarev
- Department of Invertebrate Zoology, Faculty of Biology, Saint-Petersburg State University, St-Petersburg, Russia
- Elizaveta K Skalon
- Department of Invertebrate Zoology, Faculty of Biology, Saint-Petersburg State University, St-Petersburg, Russia
- Victor V Starunov
- Department of Invertebrate Zoology, Faculty of Biology, Saint-Petersburg State University, St-Petersburg, Russia
- Zoological Institute RAS, St-Petersburg, Russia
3
Singh KP, Kumari P, Rai PK. GWAS for the identification of introgressed candidate genes of Sinapis alba with increased branching numbers in backcross lines of the allohexaploid Brassica. Frontiers in Plant Science 2024; 15:1381387. [PMID: 38978520; PMCID: PMC11228338; DOI: 10.3389/fpls.2024.1381387] [Received: 02/03/2024] [Accepted: 06/11/2024] [Indexed: 07/10/2024]
Abstract
Plant architecture is a crucial determinant of crop yield. The number of primary branches (PB) and secondary branches (SB) is particularly significant in shaping the architecture of Indian mustard. In this study, we analyzed a panel of 86 backcross introgression lines (BCILs) derived from the first stable allohexaploid Brassicas with 170 Sinapis alba genome-specific SSR markers to identify, through association mapping, markers associated with higher PB and SB. The structure analysis revealed three subpopulations in the association panel, P1, P2, and P3, containing 11, 33, and 42 BCILs, respectively. We identified five novel SSR markers linked to higher PB and SB. Subsequently, we explored the 20 kb up- and downstream regions of these SSR markers to predict candidate genes for improved branching and annotated them through BLASTN. As a result, we predicted 47 complete genes within the 40 kb regions of all trait-linked markers, among which 35 were identified as candidate genes for higher PB and SB numbers in BCILs. These candidate genes were orthologous to branching genes such as ANT, RAMOSUS, RAX, MAX, MP, SEU, and REV. The remaining 12 genes were annotated for additional roles using BLASTP with protein databases. This study identified five novel S. alba genome-specific SSR markers associated with increased PB and SB, as well as 35 candidate genes contributing to plant architecture through improved branching numbers. To the best of our knowledge, this is the first report of introgressed genes for higher branching numbers in B. juncea from S. alba.
Affiliation(s)
- Kaushal Pratap Singh
- Plant Protection Unit, Indian Council of Agricultural Research (ICAR)-Directorate of Rapeseed Mustard Research, Sewar, Bharatpur, India
- Preetesh Kumari
- Genetics Division, ICAR-Indian Agricultural Research Institute, New Delhi, India
- School of Agriculture, Sanskriti University, Mathura - Delhi Highway, Chhata, Mathura, India
- Pramod Kumar Rai
- Plant Protection Unit, Indian Council of Agricultural Research (ICAR)-Directorate of Rapeseed Mustard Research, Sewar, Bharatpur, India
4
Ulusoy E, Doğan T. Mutual annotation-based prediction of protein domain functions with Domain2GO. Protein Sci 2024; 33:e4988. [PMID: 38757367; PMCID: PMC11099699; DOI: 10.1002/pro.4988] [Received: 11/07/2023] [Revised: 02/25/2024] [Accepted: 03/30/2024] [Indexed: 05/18/2024]
Abstract
Identifying unknown functional properties of proteins is essential for understanding their roles in both health and disease states. The domain composition of a protein can reveal critical information in this context, as domains are structural and functional units that dictate how the protein should act at the molecular level. The expensive and time-consuming nature of wet-lab experimental approaches prompted researchers to develop computational strategies for predicting the functions of proteins. In this study, we proposed a new method called Domain2GO that infers associations between protein domains and function-defining gene ontology (GO) terms, thus redefining the problem as domain function prediction. Domain2GO uses documented protein-level GO annotations together with proteins' domain annotations. Co-annotation patterns of domains and GO terms in the same proteins are examined using statistical resampling to obtain reliable associations. As a use-case study, we evaluated the biological relevance of examples selected from the Domain2GO-generated domain-GO term mappings via literature review. Then, we applied Domain2GO to predict unknown protein functions by propagating domain-associated GO terms to proteins annotated with these domains. For function prediction performance evaluation and comparison against other methods, we employed Critical Assessment of Function Annotation 3 (CAFA3) challenge datasets. The results demonstrated the high potential of Domain2GO, particularly for predicting molecular function and biological process terms, along with advantages such as producing interpretable results and having an exceptionally low computational cost. The approach presented here can be extended to other ontologies and biological entities to investigate unknown relationships in complex and large-scale biological data. The source code, datasets, results, and user instructions for Domain2GO are available at https://github.com/HUBioDataLab/Domain2GO. 
Additionally, we offer a user-friendly online tool at https://huggingface.co/spaces/HUBioDataLab/Domain2GO, which simplifies the prediction of functions of previously unannotated proteins solely using amino acid sequences.
Affiliation(s)
- Erva Ulusoy
- Biological Data Science Lab, Department of Computer Engineering, Hacettepe University, Ankara, Turkey
- Department of Bioinformatics, Graduate School of Health Sciences, Hacettepe University, Ankara, Turkey
- Tunca Doğan
- Biological Data Science Lab, Department of Computer Engineering, Hacettepe University, Ankara, Turkey
- Department of Bioinformatics, Graduate School of Health Sciences, Hacettepe University, Ankara, Turkey
5
Graci S, Cigliano RA, Barone A. Exploring the gene expression network involved in the heat stress response of a thermotolerant tomato genotype. BMC Genomics 2024; 25:509. [PMID: 38783170; PMCID: PMC11112777; DOI: 10.1186/s12864-024-10393-0] [Received: 02/06/2024] [Accepted: 05/08/2024] [Indexed: 05/25/2024]
Abstract
BACKGROUND: The increase in temperatures due to current climate change dramatically affects crop cultivation, resulting in yield losses and altered fruit quality. Tomato is one of the most extensively grown and consumed horticultural products, and although it can withstand a wide range of climatic conditions, heat stress can affect plant growth and development, especially at the reproductive stage, severely influencing the final yield. In the present work, the heat stress response mechanisms of one thermotolerant genotype (E42) were investigated by exploring its regulatory gene network. This was achieved through a promoter analysis based on the identification of the heat stress elements (HSEs) mapping in the promoters, combined with a gene co-expression network (GCN) analysis aimed at identifying interactions among heat-related genes.
RESULTS: Results highlighted 82 genes presenting HSEs in the promoter and belonging to one of the 52 gene networks obtained by the GCN analysis; 61 of these also interact with heat shock factors (Hsfs). Finally, a list of 13 candidate genes, including two Hsfs, nine heat shock proteins (Hsps), and two GDSL esterase/lipases (GELPs), was retrieved by focusing on those E42 genes exhibiting HSEs in the promoters, interacting with Hsfs, and showing variants, compared to the Heinz reference genome, with HIGH and/or MODERATE impact on the translated protein. Among these, the Gene Ontology annotation analysis showed that only LeHsp100 (Solyc02g088610) belongs to a network specifically involved in the response to heat stress.
CONCLUSIONS: As a whole, the combination of bioinformatic analyses carried out on genomic and transcriptomic data available for tomato, together with polymorphisms detected in HS-related genes of the thermotolerant E42, made it possible to determine a subset of candidate genes involved in the HS response in tomato. This study provides a novel approach to the investigation of abiotic stress response mechanisms, and further studies will be conducted to validate the role of the highlighted genes.
Affiliation(s)
- Salvatore Graci
- Department of Agricultural Sciences, University of Naples Federico II, Portici, Naples, Italy
- Amalia Barone
- Department of Agricultural Sciences, University of Naples Federico II, Portici, Naples, Italy.
6
Calderón L, Carbonell-Bejerano P, Muñoz C, Bree L, Sola C, Bergamin D, Tulle W, Gomez-Talquenca S, Lanz C, Royo C, Ibáñez J, Martinez-Zapater JM, Weigel D, Lijavetzky D. Diploid genome assembly of the Malbec grapevine cultivar enables haplotype-aware analysis of transcriptomic differences underlying clonal phenotypic variation. Horticulture Research 2024; 11:uhae080. [PMID: 38766532; PMCID: PMC11101320; DOI: 10.1093/hr/uhae080] [Received: 12/01/2023] [Accepted: 03/08/2024] [Indexed: 05/22/2024]
Abstract
To preserve their varietal attributes, established grapevine cultivars (Vitis vinifera L. ssp. vinifera) must be clonally propagated, due to their highly heterozygous genomes. Malbec is a cultivar of French origin appreciated for producing high-quality wines and is the offspring of cultivars Prunelard and Magdeleine Noire des Charentes. Here, we have built a diploid genome assembly of Malbec, after trio binning of PacBio long reads into the two haploid complements inherited from either parent. After haplotype-aware deduplication and corrections, complete assemblies for the two haplophases were obtained with a very low haplotype switch-error rate (<0.025). The haplophase alignment identified >25% of polymorphic regions. Gene annotation including RNA-seq transcriptome assembly and ab initio prediction evidence resulted in similar gene model numbers for both haplophases. The annotated diploid assembly was exploited in the transcriptomic comparison of four clonal accessions of Malbec that exhibited variation in berry composition traits. Analysis of the ripening pericarp transcriptome using either haplophase as a reference yielded similar results, although some differences were observed. In particular, among the differentially expressed genes identified only with the Magdeleine-inherited haplotype as reference, we observed an over-representation of hypothetically hemizygous genes. The higher berry anthocyanin content of clonal accession 595 was associated with increased abscisic acid responses, possibly leading to the observed overexpression of phenylpropanoid metabolism genes and deregulation of genes associated with abiotic stress response. Overall, the results highlight the importance of producing diploid assemblies to fully represent the genomic diversity of highly heterozygous woody crop cultivars and unveil the molecular bases of clonal phenotypic variation.
Affiliation(s)
- Luciano Calderón
- Instituto de Biología Agrícola de Mendoza (CONICET-UNCuyo), Genetica y Genomica de Vid, Chacras de Coria 5505, Mendoza, Argentina
- Pablo Carbonell-Bejerano
- Instituto de Ciencias de la Vid y del Vino, ICVV, CSIC - Universidad de La Rioja - Gobierno de La Rioja, Logroño 26007, La Rioja, Spain
- Department of Molecular Biology, Max Planck Institute for Biology Tübingen, 72076 Tübingen, Germany
- Claudio Muñoz
- Instituto de Biología Agrícola de Mendoza (CONICET-UNCuyo), Genetica y Genomica de Vid, Chacras de Coria 5505, Mendoza, Argentina
- Facultad de Ciencias Agrarias (UNCuyo), Cátedra Fitopatología, Chacras de Coria 5505, Mendoza, Argentina
- Laura Bree
- Vivero Mercier Argentina, Perdriel 5500, Mendoza, Argentina
- Cristobal Sola
- Vivero Mercier Argentina, Perdriel 5500, Mendoza, Argentina
- Walter Tulle
- Instituto de Biología Agrícola de Mendoza (CONICET-UNCuyo), Genetica y Genomica de Vid, Chacras de Coria 5505, Mendoza, Argentina
- Sebastian Gomez-Talquenca
- Plant Virology Laboratory, Instituto Nacional de Tecnología Agropecuaria, Luján de Cuyo 5534, Mendoza, Argentina
- Christa Lanz
- Department of Molecular Biology, Max Planck Institute for Biology Tübingen, 72076 Tübingen, Germany
- Carolina Royo
- Instituto de Ciencias de la Vid y del Vino, ICVV, CSIC - Universidad de La Rioja - Gobierno de La Rioja, Logroño 26007, La Rioja, Spain
- Javier Ibáñez
- Instituto de Ciencias de la Vid y del Vino, ICVV, CSIC - Universidad de La Rioja - Gobierno de La Rioja, Logroño 26007, La Rioja, Spain
- José Miguel Martinez-Zapater
- Instituto de Ciencias de la Vid y del Vino, ICVV, CSIC - Universidad de La Rioja - Gobierno de La Rioja, Logroño 26007, La Rioja, Spain
- Detlef Weigel
- Department of Molecular Biology, Max Planck Institute for Biology Tübingen, 72076 Tübingen, Germany
- Diego Lijavetzky
- Instituto de Biología Agrícola de Mendoza (CONICET-UNCuyo), Genetica y Genomica de Vid, Chacras de Coria 5505, Mendoza, Argentina
7
Gajda Ł, Daszkowska-Golec A, Świątek P. Trophic Position of the White Worm (Enchytraeus albidus) in the Context of Digestive Enzyme Genes Revealed by Transcriptomics Analysis. Int J Mol Sci 2024; 25:4685. [PMID: 38731903; PMCID: PMC11083476; DOI: 10.3390/ijms25094685] [Received: 03/21/2024] [Revised: 04/20/2024] [Accepted: 04/23/2024] [Indexed: 05/13/2024]
Abstract
To assess the impact of Enchytraeidae (potworms) on the functioning of the decomposer system, knowledge of the feeding preferences of enchytraeid species is required. Different food preferences can be explained by variations in enzymatic activities among different enchytraeid species, as there are no significant differences in the morphology or anatomy of their alimentary tracts. However, it is crucial to distinguish between the contribution of microbial enzymes and the animal's digestive capacity. Here, we computationally analyzed the endogenous digestive enzyme genes in Enchytraeus albidus. The analysis was based on RNA-Seq of COI-monohaplotype culture (PL-A strain) specimens, utilizing transcriptome profiling to determine the trophic position of the species. We also corroborated the results obtained using transcriptomics data from genetically heterogeneous freeze-tolerant strains. Our results revealed that E. albidus expresses a wide range of glycosidases, including GH9 cellulases and a specific digestive SH3b-domain-containing i-type lysozyme, previously described in the earthworm Eisenia andrei. Therefore, E. albidus combines traits of both primary decomposers (primary saprophytophages) and secondary decomposers (sapro-microphytophages/microbivores) and can be defined as an intermediate decomposer. Based on assemblies of publicly available RNA-Seq reads, we found close homologs for these cellulases and i-type lysozymes in various clitellate taxa, including Crassiclitellata and Enchytraeidae.
Affiliation(s)
- Piotr Świątek
- Institute of Biology, Biotechnology and Environmental Protection, Faculty of Natural Sciences, University of Silesia in Katowice, 9 Bankowa St., 40-007 Katowice, Poland; (Ł.G.); (A.D.-G.)
8
Tyagi R, Rosa BA, Swain A, Artyomov MN, Jasmer DP, Mitreva M. Intestinal cell diversity and treatment responses in a parasitic nematode at single cell resolution. BMC Genomics 2024; 25:341. [PMID: 38575858; PMCID: PMC10996262; DOI: 10.1186/s12864-024-10203-7] [Received: 12/08/2023] [Accepted: 03/08/2024] [Indexed: 04/06/2024]
Abstract
BACKGROUND: Parasitic nematodes, significant pathogens for humans, animals, and plants, depend on diverse organ systems for intra-host survival. Understanding the cellular diversity and molecular variations underlying these functions holds promise for developing novel therapeutics, with specific emphasis on the neuromuscular system's functional diversity. The nematode intestine, crucial for anthelmintic therapies, exhibits diverse cellular phenotypes, and unraveling this diversity at the single-cell level is essential for advancing knowledge in anthelmintic research across various organ systems.
RESULTS: Here, using novel single-cell transcriptomics datasets, we delineate cellular diversity within the intestine of adult female Ascaris suum, a parasitic nematode species that infects animals and people. Gene transcripts expressed in individual nuclei of untreated intestinal cells resolved three phenotypic clusters, while lower stringency resolved additional subclusters and more potential diversity. Cluster 1 and 3 phenotypes displayed variable congruence with scRNA phenotypes of C. elegans intestinal cells, whereas the A. suum cluster 2 phenotype was markedly unique. Distinct functional pathway enrichment characterized each A. suum intestinal cell cluster. Cluster 2 was distinctly enriched for Clade III-associated genes, suggesting it evolved within clade III nematodes. Clusters also demonstrated differential transcriptional responsiveness to nematode intestinal toxic treatments, with Cluster 2 displaying the least response to short-term intra-pseudocoelomic nematode intestinal toxin treatments.
CONCLUSIONS: This investigation presents advances in knowledge related to biological differences among major cell populations of adult A. suum intestinal cells. For the first time, diverse nematode intestinal cell populations were characterized, and associated biological markers of these cells were identified to support tracking of constituent cells under experimental conditions. These advances will promote better understanding of this and other parasitic nematodes of global importance, and will help to guide future anthelmintic treatments.
Affiliation(s)
- Rahul Tyagi
- Division of Infectious Diseases, Department of Internal Medicine, Washington University School of Medicine, 63110, St. Louis, MO, USA
- Bruce A Rosa
- Division of Infectious Diseases, Department of Internal Medicine, Washington University School of Medicine, 63110, St. Louis, MO, USA
- Amanda Swain
- Department of Pathology and Immunology, Washington University School of Medicine, 63110, Saint Louis, MO, USA
- Maxim N Artyomov
- Department of Pathology and Immunology, Washington University School of Medicine, 63110, Saint Louis, MO, USA
- Douglas P Jasmer
- Department of Veterinary Microbiology and Pathology, Washington State University, 99164, Pullman, WA, USA.
- Makedonka Mitreva
- Division of Infectious Diseases, Department of Internal Medicine, Washington University School of Medicine, 63110, St. Louis, MO, USA.
- Department of Genetics, Washington University School of Medicine, 63110, St. Louis, MO, USA.
- McDonnell Genome Institute, Washington University School of Medicine, 63110, St Louis, MO, USA.
9
Nagy NA, Tóth GE, Kurucz K, Kemenesi G, Laczkó L. The updated genome of the Hungarian population of Aedes koreicus. Sci Rep 2024; 14:7545. [PMID: 38555322; PMCID: PMC10981705; DOI: 10.1038/s41598-024-58096-6] [Received: 12/11/2023] [Accepted: 03/25/2024] [Indexed: 04/02/2024]
Abstract
Vector-borne diseases pose a potential risk to human and animal welfare, and understanding their spread requires genomic resources. The mosquito Aedes koreicus is an emerging vector that was introduced into Europe more than 15 years ago, but only a low-quality, fragmented genome was available. In this study, we carried out additional sequencing and assembled and characterized the genome of the species to provide a background for understanding its evolution and biology. The updated genome was 1.1 Gbp long and consisted of 6099 contigs with an N50 value of 329,610 bp and a BUSCO score of 84%. We identified 22,580 genes that could be functionally annotated and paid particular attention to the identification of potential insecticide resistance genes. The assessment of the orthology of the genes indicates a high turnover at the terminal branches of the species tree of mosquitoes with complete genomes, which could contribute to the adaptation and evolutionary success of the species. These results could form the basis for numerous downstream analyses to develop targets for the control of mosquito populations.
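For readers unfamiliar with the N50 statistic quoted in this abstract, the following is a minimal sketch of how it is conventionally computed: the contig length at which the running total of lengths, sorted longest first, reaches half of the assembly size. The function name `n50` and the toy contig lengths are illustrative assumptions; the abstract does not say which tool the authors used to obtain the reported value.

```python
def n50(lengths):
    """Return the N50 of a list of contig lengths (conventional definition)."""
    if not lengths:
        raise ValueError("need at least one contig length")
    ordered = sorted(lengths, reverse=True)  # longest contigs first
    half_total = sum(ordered) / 2            # half of the assembly size
    running = 0
    for length in ordered:
        running += length
        # The first contig that pushes the running sum to half the total
        # defines the N50.
        if running >= half_total:
            return length


# Hypothetical toy lengths, not data from the paper.
toy_contigs = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_contigs))  # 80 + 70 = 150 reaches half of 300, so this prints 70
```

A higher N50 for the same total assembly size indicates that more of the sequence sits in long contigs, which is why the value is reported alongside the contig count.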
Affiliation(s)
- Nikoletta Andrea Nagy
- Department of Evolutionary Zoology and Human Biology, University of Debrecen, Debrecen, Hungary.
- HUN-REN-UD Behavioural Ecology Research Group, University of Debrecen, Debrecen, Hungary.
- Institute of Metagenomics, University of Debrecen, Debrecen, Hungary.
- Gábor Endre Tóth
- National Laboratory of Virology, Szentágothai Research Centre, University of Pécs, Pecs, Hungary
- Bernhard Nocht Institute for Tropical Medicine, WHO Collaborating Centre for Arbovirus and Hemorrhagic Fever Reference and Research, Hamburg, Germany
- Kornélia Kurucz
- National Laboratory of Virology, Szentágothai Research Centre, University of Pécs, Pecs, Hungary
- Institute of Biology, Faculty of Sciences, University of Pécs, Pecs, Hungary
- Gábor Kemenesi
- National Laboratory of Virology, Szentágothai Research Centre, University of Pécs, Pecs, Hungary
- Institute of Biology, Faculty of Sciences, University of Pécs, Pecs, Hungary
- Levente Laczkó
- HUN-REN-UD Conservation Biology Research Group, University of Debrecen, Debrecen, Hungary
- One Health Institute, University of Debrecen, Debrecen, Hungary
10
Laczkó L, Jordán S, Póliska S, Rácz HV, Nagy NA, Molnár V A, Sramkó G. The draft genome of Spiraea crenata L. (Rosaceae) - the first complete genome in tribe Spiraeeae. Sci Data 2024; 11:219. [PMID: 38368431; PMCID: PMC10874383; DOI: 10.1038/s41597-024-03046-0] [Received: 09/04/2023] [Accepted: 02/05/2024] [Indexed: 02/19/2024]
Abstract
Spiraea crenata L. is a deciduous shrub distributed across the Eurasian steppe zone. The species is of cultural and horticultural importance and occurs in scattered populations throughout its westernmost range. Currently, there is no genomic information on the tribe of Spiraeeae. Therefore we sequenced and assembled the whole genome of S. crenata using second- and third-generation sequencing and a hybrid assembly approach to expand genomic resources for conservation and support research on this horticulturally important lineage. In addition to the organellar genomes (the plastome and the mitochondrion), we present the first draft genome of the species with an estimated size of 220 Mbp, an N50 value of 7.7 Mbp, and a BUSCO score of 96.0%. Being the first complete genome in tribe Spiraeeae, this may not only be the first step in the genomic study of a rare plant but also a contribution to genomic resources supporting the study of biodiversity and evolutionary history of Rosaceae.
Affiliation(s)
- Levente Laczkó
- Department of Metagenomics, University of Debrecen, Debrecen, Hungary
- HUN-REN-UD Conservation Biology Research Group, University of Debrecen, Debrecen, Hungary
- Sándor Jordán
- Department of Metagenomics, University of Debrecen, Debrecen, Hungary
- HUN-REN-UD Conservation Biology Research Group, University of Debrecen, Debrecen, Hungary
- Juhász-Nagy Pál Doctoral School, University of Debrecen, Debrecen, Hungary
- Szilárd Póliska
- Department of Biochemistry and Molecular Biology, Faculty of Medicine, University of Debrecen, Debrecen, Hungary
- Hanna Viktória Rácz
- Department of Biotechnology and Microbiology, Faculty of Science and Technology, University of Debrecen, Debrecen, Hungary
- Nikoletta Andrea Nagy
- Department of Evolutionary Zoology and Human Biology, Faculty of Science and Technology, University of Debrecen, Debrecen, Hungary
- HUN-REN-UD Behavioural Ecology Research Group, University of Debrecen, Debrecen, Hungary
- Attila Molnár V
- HUN-REN-UD Conservation Biology Research Group, University of Debrecen, Debrecen, Hungary
- Evolutionary Genomics Research Group, Department of Botany, Faculty of Science and Technology, University of Debrecen, Debrecen, Hungary
- Gábor Sramkó
- HUN-REN-UD Conservation Biology Research Group, University of Debrecen, Debrecen, Hungary.
- Evolutionary Genomics Research Group, Department of Botany, Faculty of Science and Technology, University of Debrecen, Debrecen, Hungary.
11
Dhiman V, Biswas S, Shekhawat RS, Sadhukhan A, Yadav P. In silico characterization of five novel disease-resistance proteins in Oryza sativa sp. japonica against bacterial leaf blight and rice blast diseases. 3 Biotech 2024; 14:48. [PMID: 38268986; PMCID: PMC10803709; DOI: 10.1007/s13205-023-03893-5] [Received: 09/11/2023] [Accepted: 12/16/2023] [Indexed: 01/26/2024]
Abstract
In the current study, gene network analysis revealed five novel disease-resistance proteins against bacterial leaf blight (BB) and rice blast (RB) diseases caused by Xanthomonas oryzae pv. oryzae (Xoo) and Magnaporthe oryzae (M. oryzae), respectively. In silico modeling, refinement, and model quality assessment were performed to predict the best structures of these five proteins, which were submitted to ModelArchive for future use. In silico annotation indicated that the five proteins functioned in signal transduction pathways as kinases, phospholipases, transcription factors, and DNA-modifying enzymes. The proteins were localized in the nucleus and plasma membrane. Phylogenetic analysis showed the evolutionary relation of the five proteins with disease-resistance proteins (XA21, OsTRX1, PLD, and HKD-motif-containing proteins). This indicates similar disease-resistance properties between the five unknown proteins and their evolutionarily related proteins. Furthermore, gene expression profiling of these proteins using public microarray data showed their differential expression under Xoo and M. oryzae infection. This study provides insight into developing disease-resistant rice varieties by predicting novel candidate resistance proteins, which will assist rice breeders in improving crop yield to address future food security through molecular breeding and biotechnology. Supplementary Information: The online version contains supplementary material available at 10.1007/s13205-023-03893-5.
Affiliation(s)
- Vedikaa Dhiman
  - Department of Bioscience and Bioengineering, Indian Institute of Technology, Jodhpur, 342030 Rajasthan India
- Soham Biswas
  - Department of Biotechnology and Bioinformatics, University of Hyderabad, Hyderabad, Telangana India
- Rajveer Singh Shekhawat
  - Department of Bioscience and Bioengineering, Indian Institute of Technology, Jodhpur, 342030 Rajasthan India
- Ayan Sadhukhan
  - Department of Bioscience and Bioengineering, Indian Institute of Technology, Jodhpur, 342030 Rajasthan India
- Pankaj Yadav
  - Department of Bioscience and Bioengineering, Indian Institute of Technology, Jodhpur, 342030 Rajasthan India
  - School of Artificial Intelligence and Data Science, Indian Institute of Technology, Jodhpur, Rajasthan India
|
12
|
O'Meara MJ, Rapala JR, Nichols CB, Alexandre AC, Billmyre RB, Steenwyk JL, Alspaugh JA, O'Meara TR. CryptoCEN: A Co-Expression Network for Cryptococcus neoformans reveals novel proteins involved in DNA damage repair. PLoS Genet 2024; 20:e1011158. [PMID: 38359090 PMCID: PMC10901339 DOI: 10.1371/journal.pgen.1011158] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/05/2024] [Revised: 02/28/2024] [Accepted: 01/30/2024] [Indexed: 02/17/2024] Open
Abstract
Elucidating gene function is a major goal in biology, especially among non-model organisms. However, doing so is complicated by the fact that molecular conservation does not always mirror functional conservation, and that complex relationships among genes are responsible for encoding pathways and higher-order biological processes. Co-expression, a promising approach for predicting gene function, relies on the general principle that genes with similar expression patterns across multiple conditions will likely be involved in the same biological process. For Cryptococcus neoformans, a prevalent human fungal pathogen greatly diverged from model yeasts, approximately 60% of the predicted genes in the genome lack functional annotations. Here, we leveraged a large amount of publicly available transcriptomic data to generate a C. neoformans Co-Expression Network (CryptoCEN), successfully recapitulating known protein networks, predicting gene function, and enabling insights into the principles influencing co-expression. With 100% predictive accuracy, we used CryptoCEN to identify 13 new DNA damage response genes, underscoring the utility of guilt-by-association for determining gene function. Overall, co-expression is a powerful tool for uncovering gene function, and it reduces the number of experimental tests needed to identify functions for currently under-annotated genes.
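The guilt-by-association principle summarized in this abstract can be illustrated with a minimal sketch. It is not the CryptoCEN pipeline: the gene names, expression values, annotations, the Pearson-correlation measure, and the 0.9 threshold are all hypothetical choices made for illustration only.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0

def coexpressed_partners(expr, gene, threshold=0.9):
    """Genes whose profile correlates with `gene` at or above the threshold."""
    return sorted(
        g for g, profile in expr.items()
        if g != gene and pearson(expr[gene], profile) >= threshold
    )

def guilt_by_association(expr, annotations, gene, threshold=0.9):
    """Rank candidate labels for `gene` by votes from annotated partners."""
    votes = {}
    for partner in coexpressed_partners(expr, gene, threshold):
        for label in annotations.get(partner, ()):
            votes[label] = votes.get(label, 0) + 1
    return sorted(votes, key=lambda label: (-votes[label], label))

# Toy data (hypothetical): rows are genes, columns are conditions.
expr = {
    "geneA": [1, 2, 3, 4, 5],
    "geneB": [2, 4, 6, 8, 10],
    "geneC": [5, 4, 3, 2, 1],
    "geneX": [1.1, 2.0, 2.9, 4.2, 5.0],  # unannotated gene of interest
}
annotations = {
    "geneA": {"DNA repair"},
    "geneB": {"DNA repair"},
    "geneC": {"cell wall"},
}

print(guilt_by_association(expr, annotations, "geneX"))
```

In this toy case the unannotated gene inherits the label shared by its two co-expressed partners; a real network would be built from many more conditions and would need a statistically justified cutoff.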
Affiliation(s)
- Matthew J O'Meara
  - Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, Michigan, United States of America
- Jackson R Rapala
  - Department of Microbiology and Immunology, University of Michigan, Ann Arbor, Michigan, United States of America
- Connie B Nichols
  - Departments of Medicine and Molecular Genetics/Microbiology; and Cell Biology, Duke University School of Medicine, Durham, North Carolina, United States of America
- A Christina Alexandre
  - Department of Microbiology and Immunology, University of Michigan, Ann Arbor, Michigan, United States of America
- R Blake Billmyre
  - Departments of Pharmaceutical and Biomedical Sciences/Infectious Disease, College of Pharmacy/College of Veterinary Medicine, University of Georgia, Athens, Georgia, United States of America
- Jacob L Steenwyk
  - Howard Hughes Medical Institute and the Department of Molecular and Cell Biology, University of California, Berkeley, Berkeley, California, United States of America
- J Andrew Alspaugh
  - Departments of Medicine and Molecular Genetics/Microbiology; and Cell Biology, Duke University School of Medicine, Durham, North Carolina, United States of America
- Teresa R O'Meara
  - Department of Microbiology and Immunology, University of Michigan, Ann Arbor, Michigan, United States of America
|
13
|
Poorinmohammad N, Salavati R. Prioritization of Trypanosoma brucei editosome protein interactions interfaces at residue resolution through proteome-scale network analysis. BMC Mol Cell Biol 2024; 25:3. [PMID: 38279116 PMCID: PMC10811811 DOI: 10.1186/s12860-024-00499-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/26/2023] [Accepted: 01/19/2024] [Indexed: 01/28/2024] Open
Abstract
BACKGROUND Trypanosoma brucei is the causative agent for trypanosomiasis in humans and livestock, which presents a growing challenge due to drug resistance. While identifying novel drug targets is vital, the process is delayed due to a lack of functional information on many of the pathogen's proteins. Accordingly, this paper presents a computational framework for prioritizing drug targets within the editosome, a vital molecular machinery responsible for mitochondrial RNA processing in T. brucei. Importantly, this framework may eliminate the need for prior gene or protein characterization, potentially accelerating drug discovery efforts. RESULTS By integrating protein-protein interaction (PPI) network analysis, PPI structural modeling, and residue interaction network (RIN) analysis, we quantitatively ranked and identified top hub editosome proteins, their key interaction interfaces, and hotspot residues. Our findings were cross-validated and further prioritized by incorporating them into gene set analysis and differential expression analysis of existing quantitative proteomics data across various life stages of T. brucei. In doing so, we highlighted PPIs such as KREL2-KREPA1, RESC2-RESC1, RESC12A-RESC13, and RESC10-RESC6 as top candidates for further investigation. This includes examining their interfaces and hotspot residues, which could guide drug candidate selection and functional studies. CONCLUSION RNA editing offers promise for target-based drug discovery, particularly with proteins and interfaces that play central roles in the pathogen's life cycle. This study introduces an integrative drug target identification workflow that combines information from the PPI network, the PPI 3D structure, and residue-level information on their interfaces, and that can be applied to diverse pathogens. In the case of T. brucei, this pipeline suggested potential drug targets at residue resolution within the RNA editing machinery. However, experimental validation is needed to fully realize its potential in advancing urgently needed antiparasitic drug development.
Affiliation(s)
- Naghmeh Poorinmohammad
  - Institute of Parasitology, McGill University, Ste. Anne de Bellevue, Montreal, Quebec, H9X 3V9, Canada
- Reza Salavati
  - Institute of Parasitology, McGill University, Ste. Anne de Bellevue, Montreal, Quebec, H9X 3V9, Canada
  - Department of Biochemistry, McGill University, Montreal, Quebec, H3G 1Y6, Canada
|
14
|
Jacob F, Hamid R, Ghorbanzadeh Z, Valsalan R, Ajinath LS, Mathew D. Genome-wide identification, characterization, and expression analysis of MIPS family genes in legume species. BMC Genomics 2024; 25:95. [PMID: 38262915 PMCID: PMC10804463 DOI: 10.1186/s12864-023-09937-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/06/2023] [Accepted: 12/23/2023] [Indexed: 01/25/2024] Open
Abstract
BACKGROUND Evolutionarily conserved in plants, the enzyme D-myo-inositol-3-phosphate synthase (MIPS; EC 5.5.1.4) regulates the initial, rate-limiting reaction in the phytic acid biosynthetic pathway. MIPS genes are reported to be transcriptional regulators involved in various physiological functions in plants, including growth and biotic/abiotic stress responses. Even though the genomes of most legumes are fully sequenced and available, an all-inclusive study of the MIPS family members in legumes is still ongoing. RESULTS We found 24 MIPS genes in ten legumes: Arachis hypogaea, Cicer arietinum, Cajanus cajan, Glycine max, Lablab purpureus, Medicago truncatula, Pisum sativum, Phaseolus vulgaris, Trifolium pratense and Vigna unguiculata. The total number of MIPS genes found in each species ranged from two to three. The MIPS genes were classified into five clades based on their evolutionary relationships with Arabidopsis genes. The intron/exon structural patterns and the protein motifs conserved in each gene were highly group-specific. In legumes, MIPS genes were inconsistently distributed across their genomes. A comparison of genomes and gene sequences showed that this family was subjected to purifying selection and that gene expansion in the MIPS family in legumes was mainly caused by segmental duplication. Expression patterns of MIPS genes in response to various abiotic stresses were studied by quantitative PCR in the vegetative tissues of various legumes. The expression patterns show that MIPS genes control the development and differentiation of various organs and respond significantly to salinity and drought stress. CONCLUSION The MIPS genes in the genomes of legumes have been identified and characterized, and their expression has been analysed. The findings pave the way for understanding their molecular functions and evolution, and help to identify the putative MIPS genes associated with different cell and tissue development.
Affiliation(s)
- Feba Jacob
  - Centre for Plant Biotechnology and Molecular Biology, Kerala Agricultural University, Thrissur, India
- Rasmieh Hamid
  - Department of Plant Breeding, Cotton Research Institute of Iran (CRII), Agricultural Research, Education and Extension Organization (AREEO), Gorgan, Iran
- Zahra Ghorbanzadeh
  - Department of Systems Biology, Agricultural Biotechnology Research Institute of Iran (ABRII), Agricultural Research, Education and Extension Organization (AREEO), Karaj, Iran
- Ravisankar Valsalan
  - Centre for Plant Biotechnology and Molecular Biology, Kerala Agricultural University, Thrissur, India
- Lavale Shivaji Ajinath
  - Centre for Plant Biotechnology and Molecular Biology, Kerala Agricultural University, Thrissur, India
- Deepu Mathew
  - Centre for Plant Biotechnology and Molecular Biology, Kerala Agricultural University, Thrissur, India
|
15
|
Iacovelli R, He T, Allen JL, Hackl T, Haslinger K. Genome sequencing and molecular networking analysis of the wild fungus Anthostomella pinea reveal its ability to produce a diverse range of secondary metabolites. Fungal Biol Biotechnol 2024; 11:1. [PMID: 38172933 PMCID: PMC10763133 DOI: 10.1186/s40694-023-00170-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/26/2023] [Accepted: 12/07/2023] [Indexed: 01/05/2024] Open
Abstract
BACKGROUND Filamentous fungi are prolific producers of bioactive molecules and enzymes with important applications in industry. Yet, the vast majority of fungal species remain undiscovered or uncharacterized. Here we focus our attention on a wild fungal isolate that we identified as Anthostomella pinea. The fungus belongs to a complex polyphyletic genus in the family Xylariaceae, which is known to comprise endophytic and pathogenic fungi that produce a plethora of interesting secondary metabolites. Despite that, Anthostomella is largely understudied and only two species have been fully sequenced and characterized at a genomic level. RESULTS In this work, we used long-read sequencing to obtain the complete 53.7 Mb genome sequence including the full mitochondrial DNA. We performed extensive structural and functional annotation of coding sequences, including genes encoding enzymes with potential applications in biotechnology. Among others, we found that the genome of A. pinea encodes 91 biosynthetic gene clusters, more than 600 CAZymes, and 164 P450s. Furthermore, untargeted metabolomics and molecular networking analysis of the cultivation extracts revealed a rich secondary metabolism, and in particular an abundance of sesquiterpenoids and sesquiterpene lactones. We also identified the polyketide antibiotic xanthoepocin, to which we attribute the anti-Gram-positive effect of the extracts that we observed in antibacterial plate assays. CONCLUSIONS Taken together, our results provide a first glimpse into the potential of Anthostomella pinea to provide new bioactive molecules and biocatalysts and will facilitate future research into these valuable metabolites.
Affiliation(s)
- R Iacovelli
  - Department of Chemical and Pharmaceutical Biology, Groningen Research Institute of Pharmacy, University of Groningen, 9713 AV, Groningen, The Netherlands
- T He
  - Department of Chemical and Pharmaceutical Biology, Groningen Research Institute of Pharmacy, University of Groningen, 9713 AV, Groningen, The Netherlands
- J L Allen
  - Department of Biology, Eastern Washington University, Cheney, WA, 99004, USA
- T Hackl
  - Groningen Institute for Evolutionary Life Sciences, University of Groningen, 9747 AG, Groningen, The Netherlands
- K Haslinger
  - Department of Chemical and Pharmaceutical Biology, Groningen Research Institute of Pharmacy, University of Groningen, 9713 AV, Groningen, The Netherlands
|
16
|
Schelkunov MI, Shtratnikova VY, Klepikova AV, Makarenko MS, Omelchenko DO, Novikova LA, Obukhova EN, Bogdanov VP, Penin AA, Logacheva MD. The genome of the toxic invasive species Heracleum sosnowskyi carries an increased number of genes despite absence of recent whole-genome duplications. THE PLANT JOURNAL : FOR CELL AND MOLECULAR BIOLOGY 2024; 117:449-463. [PMID: 37846604 DOI: 10.1111/tpj.16500] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/16/2023] [Revised: 09/26/2023] [Accepted: 10/04/2023] [Indexed: 10/18/2023]
Abstract
Heracleum sosnowskyi, belonging to a group of giant hogweeds, is a plant with large effects on ecosystems and human health. It is an invasive species that contributes to the deterioration of grassland ecosystems. The ability of H. sosnowskyi to produce linear furanocoumarins (FCs), photosensitizing compounds, makes it very dangerous. At the same time, linear FCs are compounds with high pharmaceutical value used in skin disease therapies. Despite this high importance, the species has not been the focus of genetic and genomic studies. Here, we report a chromosome-scale assembly of Sosnowsky's hogweed genome. Genomic analysis revealed an unusually high number of genes (55106) in the hogweed genome, in contrast to the 25-35 thousand found in most plants. However, we did not find any traces of recent whole-genome duplications not shared with its confamiliar, Daucus carota (carrot), which has approximately thirty thousand genes. The analysis of the genomic proximity of duplicated genes indicates tandem duplications as a main reason for this increase. We performed a genome-wide search of the genes of the FC biosynthesis pathway and surveyed their expression in aboveground plant parts. Using a combination of expression data and phylogenetic analysis, we found candidate genes for psoralen synthase and experimentally showed the activity of one of them using a heterologous yeast expression system. These findings expand our knowledge on the evolution of gene space in plants and lay a foundation for further analysis of hogweed as an invasive plant and as a source of FCs.
Affiliation(s)
- Mikhail I Schelkunov
  - Institute for Information Transmission Problems of the Russian Academy of Sciences, Moscow, Russia
  - Skolkovo Institute of Science and Technology, Moscow, Russia
- Viktoria Yu Shtratnikova
  - Belozersky Institute of Physico-Chemical Biology, Lomonosov Moscow State University, Moscow, Russia
- Anna V Klepikova
  - Institute for Information Transmission Problems of the Russian Academy of Sciences, Moscow, Russia
- Maksim S Makarenko
  - Institute for Information Transmission Problems of the Russian Academy of Sciences, Moscow, Russia
- Denis O Omelchenko
  - Institute for Information Transmission Problems of the Russian Academy of Sciences, Moscow, Russia
- Lyudmila A Novikova
  - Belozersky Institute of Physico-Chemical Biology, Lomonosov Moscow State University, Moscow, Russia
- Viktor P Bogdanov
  - Life Sciences Research Center, Moscow Institute of Physics and Technology, Dolgoprudniy, Russia
- Aleksey A Penin
  - Institute for Information Transmission Problems of the Russian Academy of Sciences, Moscow, Russia
- Maria D Logacheva
  - Institute for Information Transmission Problems of the Russian Academy of Sciences, Moscow, Russia
  - Skolkovo Institute of Science and Technology, Moscow, Russia
|
17
|
Chavarro-Carrero EA, Snelders NC, Torres DE, Kraege A, López-Moral A, Petti GC, Punt W, Wieneke J, García-Velasco R, López-Herrera CJ, Seidl MF, Thomma BPHJ. The soil-borne white root rot pathogen Rosellinia necatrix expresses antimicrobial proteins during host colonization. PLoS Pathog 2024; 20:e1011866. [PMID: 38236788 PMCID: PMC10796067 DOI: 10.1371/journal.ppat.1011866] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/10/2023] [Accepted: 11/27/2023] [Indexed: 01/22/2024] Open
Abstract
Rosellinia necatrix is a prevalent soil-borne plant-pathogenic fungus that is the causal agent of white root rot disease in a broad range of host plants. The limited availability of genomic resources for R. necatrix has complicated a thorough understanding of its infection biology. Here, we sequenced nine R. necatrix strains with Oxford Nanopore sequencing technology, and with DNA proximity ligation we generated a gapless assembly of one of the genomes into ten chromosomes. Whereas many filamentous pathogens display a so-called two-speed genome with more dynamic and more conserved compartments, the R. necatrix genome does not display such genome compartmentalization. It has recently been proposed that fungal plant pathogens may employ effectors with antimicrobial activity to manipulate the host microbiota to promote infection. In the predicted secretome of R. necatrix, 26 putative antimicrobial effector proteins were identified, nine of which are expressed during plant colonization. Two of the candidates were tested, both of which were found to possess selective antimicrobial activity. Intriguingly, some of the inhibited bacteria are antagonists of R. necatrix growth in vitro and can alleviate R. necatrix infection on cotton plants. Collectively, our data show that R. necatrix encodes antimicrobials that are expressed during host colonization and that may contribute to modulation of host-associated microbiota to stimulate disease development.
Affiliation(s)
- Edgar A. Chavarro-Carrero
  - Laboratory of Phytopathology, Wageningen University & Research, Wageningen, The Netherlands
  - Institute for Plant Sciences, Cluster of Excellence on Plant Sciences (CEPLAS), University of Cologne, Cologne, Germany
- Nick C. Snelders
  - Institute for Plant Sciences, Cluster of Excellence on Plant Sciences (CEPLAS), University of Cologne, Cologne, Germany
  - Theoretical Biology & Bioinformatics Group, Department of Biology, Utrecht University, Utrecht, The Netherlands
- David E. Torres
  - Laboratory of Phytopathology, Wageningen University & Research, Wageningen, The Netherlands
  - Theoretical Biology & Bioinformatics Group, Department of Biology, Utrecht University, Utrecht, The Netherlands
- Anton Kraege
  - Institute for Plant Sciences, Cluster of Excellence on Plant Sciences (CEPLAS), University of Cologne, Cologne, Germany
- Ana López-Moral
  - Institute for Plant Sciences, Cluster of Excellence on Plant Sciences (CEPLAS), University of Cologne, Cologne, Germany
- Gabriella C. Petti
  - Institute for Plant Sciences, Cluster of Excellence on Plant Sciences (CEPLAS), University of Cologne, Cologne, Germany
- Wilko Punt
  - Institute for Plant Sciences, Cluster of Excellence on Plant Sciences (CEPLAS), University of Cologne, Cologne, Germany
- Jan Wieneke
  - Institute for Plant Sciences, Cluster of Excellence on Plant Sciences (CEPLAS), University of Cologne, Cologne, Germany
- Rómulo García-Velasco
  - Laboratory of Phytopathology, Tenancingo University Center, Autonomous University of the State of Mexico, Tenancingo, State of Mexico, Mexico
- Carlos J. López-Herrera
  - CSIC, Instituto de Agricultura Sostenible, Dept. Protección de Cultivos, C/Alameda del Obispo s/n, Córdoba, Spain
- Michael F. Seidl
  - Theoretical Biology & Bioinformatics Group, Department of Biology, Utrecht University, Utrecht, The Netherlands
- Bart P. H. J. Thomma
  - Laboratory of Phytopathology, Wageningen University & Research, Wageningen, The Netherlands
  - Institute for Plant Sciences, Cluster of Excellence on Plant Sciences (CEPLAS), University of Cologne, Cologne, Germany
|
18
|
Belcher LJ, Dewar AE, Hao C, Katz Z, Ghoul M, West SA. SOCfinder: a genomic tool for identifying social genes in bacteria. Microb Genom 2023; 9:001171. [PMID: 38117204 PMCID: PMC10763506 DOI: 10.1099/mgen.0.001171] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/12/2023] [Accepted: 12/08/2023] [Indexed: 12/21/2023] Open
Abstract
Bacteria cooperate by working collaboratively to defend their colonies, share nutrients, and resist antibiotics. Nevertheless, our understanding of these remarkable behaviours primarily comes from studying a few well-characterized species. Consequently, there is a significant gap in our understanding of microbial social traits, particularly in natural environments. To address this gap, we can use bioinformatic tools to identify genes that control cooperative or otherwise social traits. Existing tools address this challenge through two approaches. One approach is to identify genes that encode extracellular proteins, which can provide benefits to neighbouring cells. An alternative approach is to predict gene function using annotation tools. However, these tools have several limitations. Not all extracellular proteins are cooperative, and not all cooperative behaviours are controlled by extracellular proteins. Furthermore, existing functional annotation methods frequently miss known cooperative genes. We introduce SOCfinder as a new tool to find bacterial genes that control cooperative or otherwise social traits. SOCfinder combines information from several methods, considering whether a gene is likely to (1) code for an extracellular protein, (2) have a cooperative functional annotation, or (3) be part of the biosynthesis of a cooperative secondary metabolite. We use data on two extensively studied species (P. aeruginosa and B. subtilis) to show that SOCfinder is better at finding known cooperative genes than existing tools. We also use theory from population genetics to identify a signature of kin selection in SOCfinder cooperative genes, which is lacking in genes identified by existing tools. SOCfinder opens up a number of exciting directions for future research, and is available to download from https://github.com/lauriebelch/SOCfinder.
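The three lines of evidence listed in this abstract can be combined as in the following minimal sketch. It does not use SOCfinder's code or interface: the `GeneEvidence` record, its field names, and the rule that any single line of evidence is enough to flag a gene are assumptions made purely for illustration.

```python
from dataclasses import dataclass

@dataclass
class GeneEvidence:
    """Hypothetical per-gene summary of the three evidence types."""
    gene_id: str
    extracellular: bool           # predicted to encode an extracellular protein
    cooperative_annotation: bool  # functional annotation in a cooperative category
    in_cooperative_bgc: bool      # in a cluster making a cooperative secondary metabolite

def social_evidence(gene):
    """Return the names of the evidence types that support this gene."""
    labels = []
    if gene.extracellular:
        labels.append("extracellular")
    if gene.cooperative_annotation:
        labels.append("annotation")
    if gene.in_cooperative_bgc:
        labels.append("secondary_metabolite")
    return labels

def is_candidate_social_gene(gene):
    """Assumed rule: flag the gene if at least one evidence type supports it."""
    return bool(social_evidence(gene))

# Toy genes (hypothetical identifiers and calls).
genes = [
    GeneEvidence("gene_0001", True, False, False),
    GeneEvidence("gene_0002", False, False, True),
    GeneEvidence("gene_0003", False, False, False),
]
candidates = [g.gene_id for g in genes if is_candidate_social_gene(g)]
print(candidates)
```

Keeping the supporting evidence types alongside the yes/no call makes it possible to see which method contributed each candidate, which matters because the abstract notes that no single method is sufficient on its own.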
Affiliation(s)
- Anna E. Dewar
  - Department of Biology, University of Oxford, Oxford, OX1 3SZ, UK
- Chunhui Hao
  - Department of Biology, University of Oxford, Oxford, OX1 3SZ, UK
- Zohar Katz
  - Department of Biology, University of Oxford, Oxford, OX1 3SZ, UK
- Melanie Ghoul
  - Department of Biology, University of Oxford, Oxford, OX1 3SZ, UK
- Stuart A. West
  - Department of Biology, University of Oxford, Oxford, OX1 3SZ, UK
|
19
|
Chen J, Gu Z, Lai L, Pei J. In silico protein function prediction: the rise of machine learning-based approaches. MEDICAL REVIEW (2021) 2023; 3:487-510. [PMID: 38282798 PMCID: PMC10808870 DOI: 10.1515/mr-2023-0038] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/14/2023] [Accepted: 10/11/2023] [Indexed: 01/30/2024]
Abstract
Proteins function as integral actors in essential life processes, rendering the realm of protein research a fundamental domain that possesses the potential to propel advancements in pharmaceuticals and disease investigation. Within the context of protein research, an imperious demand arises to uncover protein functionalities and untangle intricate mechanistic underpinnings. Due to the exorbitant costs and limited throughput inherent in experimental investigations, computational models offer a promising alternative to accelerate protein function annotation. In recent years, protein pre-training models have exhibited noteworthy advancement across multiple prediction tasks. This advancement highlights a notable prospect for effectively tackling the intricate downstream task associated with protein function prediction. In this review, we elucidate the historical evolution and research paradigms of computational methods for predicting protein function. Subsequently, we summarize the progress in protein and molecule representation as well as feature extraction techniques. Furthermore, we assess the performance of machine learning-based algorithms across various objectives in protein function prediction, thereby offering a comprehensive perspective on the progress within this field.
Affiliation(s)
- Jiaxiao Chen
  - Center for Quantitative Biology, Academy for Advanced Interdisciplinary Studies, Peking University, Beijing, China
- Zhonghui Gu
  - Peking-Tsinghua Center for Life Sciences, Academy for Advanced Interdisciplinary Studies, Peking University, Beijing, China
- Luhua Lai
  - Center for Quantitative Biology, Academy for Advanced Interdisciplinary Studies, Peking University, Beijing, China
  - Peking-Tsinghua Center for Life Sciences, Academy for Advanced Interdisciplinary Studies, Peking University, Beijing, China
  - BNLMS, College of Chemistry and Molecular Engineering, Peking University, Beijing, China
  - Research Unit of Drug Design Method, Chinese Academy of Medical Sciences (2021RU014), Beijing, China
- Jianfeng Pei
  - Center for Quantitative Biology, Academy for Advanced Interdisciplinary Studies, Peking University, Beijing, China
  - Research Unit of Drug Design Method, Chinese Academy of Medical Sciences (2021RU014), Beijing, China
|
20
|
Ghorbel M, Zribi I, Chihaoui M, Alghamidi A, Mseddi K, Brini F. Genome-Wide Investigation and Expression Analysis of the Catalase Gene Family in Oat Plants (Avena sativa L.). PLANTS (BASEL, SWITZERLAND) 2023; 12:3694. [PMID: 37960051 PMCID: PMC10650400 DOI: 10.3390/plants12213694] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 08/24/2023] [Revised: 09/25/2023] [Accepted: 10/23/2023] [Indexed: 11/15/2023]
Abstract
Through the degradation of reactive oxygen species (ROS), different antioxidant enzymes, such as catalase (CAT), defend organisms against oxidative stress. These enzymes are crucial to numerous biological functions, like plant development and defense against several biotic and abiotic stresses. However, despite the major economic importance of Avena sativa around the globe, little is known about the CAT gene's structure and organization in this crop. Thus, a genome-wide investigation of the CAT gene family in oat plants has been carried out to characterize the potential roles of those genes under different stressors. Bioinformatic approaches were used in this study to predict the AvCAT gene's structure, secondary and tertiary protein structures, physicochemical properties, phylogenetic tree, and expression profiling under diverse developmental and biological conditions. A local Saudi oat variety (AlShinen) was used in this work. Here, ten AvCAT genes that belong to three groups (Groups I-III) were identified. All identified CATs harbor the two conserved domains (pfam00199 and pfam06628), a heme-binding domain, and a catalase activity motif. Moreover, identified AvCAT proteins were located in different compartments in the cell, such as the peroxisome, mitochondrion, and cytoplasm. By analyzing their promoters, different cis-elements were identified as being related to plant development, maturation, and response to different environmental stresses. Gene expression analysis revealed that three different AvCAT genes belonging to three different subgroups showed noticeable modifications in response to various stresses, such as mannitol, salt, and ABA. As far as we know, this is the first report describing the genome-wide analysis of the oat catalase gene family, and these data will help further study the roles of catalase genes during stress responses, leading to crop improvement.
Affiliation(s)
- Mouna Ghorbel
  - Department of Biology, College of Sciences, University of Hail, Ha’il City 81451, Saudi Arabia
- Ikram Zribi
  - Laboratory of Biotechnology and Plant Improvement, Center of Biotechnology of Sfax, Sfax 3018, Tunisia
- Mejda Chihaoui
  - Computer Science Department, Applied College, University of Ha’il, Ha’il City 81451, Saudi Arabia
- Ahmad Alghamidi
  - Department of Biology, College of Sciences, University of Hail, Ha’il City 81451, Saudi Arabia
  - National Center for Vegetation Cover & Combating Desertification, Riyadh 13312, Saudi Arabia
- Khalil Mseddi
  - Department of Biology, Faculty of Science of Sfax, University of Sfax, Sfax 3000, Tunisia
- Faiçal Brini
  - Laboratory of Biotechnology and Plant Improvement, Center of Biotechnology of Sfax, Sfax 3018, Tunisia
|
21
|
Ramos-Lizardo GN, Mucherino-Muñoz JJ, Aguiar ERGR, Pirovani CP, Corrêa RX. A repertoire of candidate effector proteins of the fungus Ceratocystis cacaofunesta. Sci Rep 2023; 13:16368. [PMID: 37773261 PMCID: PMC10542334 DOI: 10.1038/s41598-023-43117-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/31/2023] [Accepted: 09/20/2023] [Indexed: 10/01/2023] Open
Abstract
The genus Ceratocystis includes many phytopathogenic fungi that affect different plant species. One of these is Ceratocystis cacaofunesta, which is pathogenic to the cocoa tree and causes Ceratocystis wilt, a lethal disease for the crop. However, little is known about how this pathogen interacts with its host. The knowledge and identification of possible genes encoding effector proteins are essential to understanding this pathosystem. The present work aimed to predict genes that code effector proteins of C. cacaofunesta from a comparative analysis of the genomes of five Ceratocystis species available in databases. We performed a new genome annotation through an in-silico analysis. We analyzed the secretome and effectorome of C. cacaofunesta using the characteristics of the peptides, such as the presence of signal peptide for secretion, absence of transmembrane domain, and richness of cysteine residues. We identified 160 candidate effector proteins in the C. cacaofunesta proteome that could be classified as cytoplasmic (102) or apoplastic (58). Of the total number of candidate effector proteins, 146 were expressed, presenting an average of 206.56 transcripts per million. Our database was created using a robust bioinformatics strategy, followed by manual curation, generating information on pathogenicity-related genes involved in plant interactions, including CAZymes, hydrolases, lyases, and oxidoreductases. Comparing proteins already characterized as effectors in Sordariomycetes species revealed five groups of protein sequences homologous to C. cacaofunesta. These data provide a valuable resource for studying the infection mechanisms of these pathogens in their hosts.
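The three peptide characteristics named in this abstract (signal peptide present, no transmembrane domain, cysteine richness) can be combined into a simple filter, sketched below. This is not the authors' pipeline: the `Protein` record, the toy sequences, and the 3% cysteine cutoff are hypothetical, and the signal-peptide and transmembrane calls are assumed to come from external predictors run beforehand.

```python
from dataclasses import dataclass

@dataclass
class Protein:
    """Hypothetical record holding precomputed predictions for one protein."""
    protein_id: str
    sequence: str
    has_signal_peptide: bool  # assumed output of a signal-peptide predictor
    n_transmembrane: int      # assumed count from a transmembrane predictor

def cysteine_fraction(sequence):
    """Fraction of residues that are cysteine (C)."""
    return sequence.upper().count("C") / len(sequence) if sequence else 0.0

def is_candidate_effector(protein, min_cys_fraction=0.03):
    """Secreted, not membrane-anchored, and cysteine-rich (assumed cutoff)."""
    return (
        protein.has_signal_peptide
        and protein.n_transmembrane == 0
        and cysteine_fraction(protein.sequence) >= min_cys_fraction
    )

# Toy proteome (hypothetical sequences, far shorter than real proteins).
proteome = [
    Protein("p1", "MKCCLLACSTCA", True, 0),   # secreted and cysteine-rich
    Protein("p2", "MKCCLLACSTCA", True, 2),   # membrane protein
    Protein("p3", "MKLLLLASSTAA", True, 0),   # secreted but no cysteines
    Protein("p4", "MKCCLLACSTCA", False, 0),  # no signal peptide
]
effectors = [p.protein_id for p in proteome if is_candidate_effector(p)]
print(effectors)
```

A filter like this only narrows the list; the abstract describes further steps (expression support, manual curation, homology to characterized effectors) that a real analysis would still need.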
Affiliation(s)
- Gabriela N Ramos-Lizardo
- Departamento de Ciências Biológicas (DCB), Centro de Biotecnologia e Genética (CBG), Universidade Estadual de Santa Cruz (UESC), Ilhéus, BA, 45662-900, Brazil
- Jonathan J Mucherino-Muñoz
- Departamento de Ciências Biológicas (DCB), Centro de Biotecnologia e Genética (CBG), Universidade Estadual de Santa Cruz (UESC), Ilhéus, BA, 45662-900, Brazil
- Eric R G R Aguiar
- Departamento de Ciências Biológicas (DCB), Centro de Biotecnologia e Genética (CBG), Universidade Estadual de Santa Cruz (UESC), Ilhéus, BA, 45662-900, Brazil
- Carlos Priminho Pirovani
- Departamento de Ciências Biológicas (DCB), Centro de Biotecnologia e Genética (CBG), Universidade Estadual de Santa Cruz (UESC), Ilhéus, BA, 45662-900, Brazil
- Ronan Xavier Corrêa
- Departamento de Ciências Biológicas (DCB), Centro de Biotecnologia e Genética (CBG), Universidade Estadual de Santa Cruz (UESC), Ilhéus, BA, 45662-900, Brazil.
22
Bianca F, Ispano E, Gazzola E, Lavezzo E, Fontana P, Toppo S. FunTaxIS-lite: a simple and light solution to investigate protein functions in all living organisms. Bioinformatics 2023; 39:btad549. [PMID: 37672040 PMCID: PMC10500080 DOI: 10.1093/bioinformatics/btad549] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/13/2023] [Revised: 07/27/2023] [Accepted: 09/05/2023] [Indexed: 09/07/2023]
Abstract
MOTIVATION Defining the full domain of protein functions belonging to an organism is a complex challenge that is due to the huge heterogeneity of the taxonomy, where single or small groups of species can bear unique functional characteristics. FunTaxIS-lite provides a solution to this challenge by determining taxon-based constraints on Gene Ontology (GO) terms, which specify the functions that an organism can or cannot perform. The tool employs a set of rules to generate and spread the constraints across both the taxon hierarchy and the GO graph. RESULTS The taxon-based constraints produced by FunTaxIS-lite extend those provided by the Gene Ontology Consortium by an average of 300%. The implementation of these rules significantly reduces errors in function predictions made by automatic algorithms and can assist in correcting inconsistent protein annotations in databases. AVAILABILITY AND IMPLEMENTATION FunTaxIS-lite is available on https://www.medcomp.medicina.unipd.it/funtaxis-lite and from https://github.com/MedCompUnipd/FunTaxIS-lite.
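To make the idea of spreading a constraint across both hierarchies concrete, the following Python sketch checks an annotation against "never in taxon" constraints in the general style of Gene Ontology taxon constraints. It does not reproduce FunTaxIS-lite's own rules, which the abstract does not detail; the term names, taxon names, and the `violates` helper are hypothetical.

```python
def ancestors(node, parents):
    """Return all ancestors of `node` in a graph given as child -> list of parents."""
    seen = set()
    stack = [node]
    while stack:
        current = stack.pop()
        for parent in parents.get(current, ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def violates(term, taxon, never_in, go_parents, taxon_parents):
    """True if (term, taxon) conflicts with a constraint set on the term or any
    of its ancestor terms, for the taxon or any of its ancestor taxa."""
    terms = ancestors(term, go_parents) | {term}
    taxa = ancestors(taxon, taxon_parents) | {taxon}
    return any((t, x) in never_in for t in terms for x in taxa)


# Toy hierarchies with hypothetical labels.
go_parents = {"child_term": ["parent_term"]}
taxon_parents = {"species_x": ["clade_y"], "clade_y": ["root"]}
never_in = {("parent_term", "clade_y")}

flagged = violates("child_term", "species_x", never_in, go_parents, taxon_parents)
```

Checking ancestors at query time is equivalent to pushing each constraint down to every descendant term and descendant taxon in advance; the lookup form simply avoids materializing the expanded constraint set.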
Affiliation(s)
- Federico Bianca
- Computational Medicine Group (MedComp), Department of Molecular Medicine, University of Padova, Padova, Italy
- Emilio Ispano
- Computational Medicine Group (MedComp), Department of Molecular Medicine, University of Padova, Padova, Italy
- Ermanno Gazzola
- Computational Medicine Group (MedComp), Department of Molecular Medicine, University of Padova, Padova, Italy
- Enrico Lavezzo
- Computational Medicine Group (MedComp), Department of Molecular Medicine, University of Padova, Padova, Italy
- Paolo Fontana
- Research and Innovation Center, Edmund Mach Foundation, San Michele all'Adige, Trento, Italy
- Stefano Toppo
- Computational Medicine Group (MedComp), Department of Molecular Medicine, University of Padova, Padova, Italy
23
Otero-Ruiz A, Rodriguez-Anaya LZ, Lares-Villa F, Lozano Aguirre Beltrán LF, Lares-Jiménez LF, Gonzalez-Galaviz JR, Cruz-Mendívil A. Functional annotation and comparative genomics analysis of Balamuthia mandrillaris reveals potential virulence-related genes. Sci Rep 2023; 13:14318. [PMID: 37653073 PMCID: PMC10471605 DOI: 10.1038/s41598-023-41657-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/20/2023] [Accepted: 08/29/2023] [Indexed: 09/02/2023]
Abstract
Balamuthia mandrillaris is a pathogenic protozoan that causes a rare but almost always fatal infection of the central nervous system and, in some cases, cutaneous lesions. Currently, the genomic data for this free-living amoeba include the description of several complete mitochondrial genomes. In contrast, two complete genomes with draft quality are available in GenBank, but neither has a functional annotation. In the present study, the complete genome of B. mandrillaris isolated from a freshwater artificial lagoon was sequenced and assembled, yielding better assembly quality parameter values than the currently available genomes. Afterward, this genome, along with those of strains V039 and 2046, was subjected to functional annotation. Finally, comparative genomics analysis was performed, and homologous genes potentially involved in the virulence of Acanthamoeba spp. and Trypanosoma cruzi were found in the core genome. Moreover, eleven of fifteen genes described as potential target genes for developing new treatment approaches for B. mandrillaris infections were identified in the three strains. These results describe proteins in this protozoan's complete genome and help prioritize which target genes could be used to develop new treatments.
Affiliation(s)
- Alejandro Otero-Ruiz
- Programa de Doctorado en Ciencias Especialidad en Biotecnología, Departamento de Biotecnología y Ciencias Alimentarias, Instituto Tecnológico de Sonora, 85000, Ciudad Obregón, Sonora, Mexico
- Fernando Lares-Villa
- Departamento de Ciencias Agronómicas y Veterinarias, Instituto Tecnológico de Sonora, 85000, Ciudad Obregón, Sonora, Mexico
- Luis Fernando Lozano Aguirre Beltrán
- Unidad de Análisis Bioinformáticos, Centro de Ciencias Genómicas de la Universidad Nacional Autónoma de México (UNAM), 62210, Cuernavaca, Morelos, Mexico
- Luis Fernando Lares-Jiménez
- Departamento de Ciencias Agronómicas y Veterinarias, Instituto Tecnológico de Sonora, 85000, Ciudad Obregón, Sonora, Mexico
- Abraham Cruz-Mendívil
- CONAHCYT-Instituto Politécnico Nacional, CIIDIR Unidad Sinaloa, 81101, Guasave, Sinaloa, Mexico
24
Di Maggio LS, Fischer K, Yates D, Curtis KC, Rosa BA, Martin J, Erdmann-Gilmore P, Sprung RSW, Mitreva M, Townsend RR, Weil GJ, Fischer PU. The proteome of extracellular vesicles of the lung fluke Paragonimus kellicotti produced in vitro and in the lung cyst. Sci Rep 2023; 13:13726. [PMID: 37608002 PMCID: PMC10444896 DOI: 10.1038/s41598-023-39966-x] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 04/19/2023] [Accepted: 08/02/2023] [Indexed: 08/24/2023]
Abstract
Paragonimiasis is a zoonotic, food-borne trematode infection that affects 21 million people globally. Trematodes interact with their hosts via extracellular vesicles (EV) that carry protein and RNA cargo. We analyzed EV in excretory-secretory products (ESP) released by Paragonimus kellicotti adult worms cultured in vitro (EV ESP) and EV isolated from lung cyst fluid (EV CFP) recovered from infected gerbils. The majority of EV were approximately 30-50 nm in diameter. We identified 548 P. kellicotti-derived proteins in EV ESP by mass spectrometry and 8 proteins in EV CFP of which 7 were also present in EV ESP. No parasite-derived proteins were reliably detected in EV isolated from plasma samples. A cysteine protease (MK050848, CP-6) was the most abundant protein found in EV CFP in all technical and biological replicates. Immunolocalization of CP-6 showed strong labeling in the tegument of P. kellicotti and in the adjacent cyst and lung tissue that contained worm eggs. It is likely that CP-6 present in EV is involved in parasite-host interactions. These results provide new insights into interactions between Paragonimus and their mammalian hosts, and they provide potential clues for development of novel diagnostic tools and treatments.
Affiliation(s)
- Lucia S Di Maggio
- Division of Infectious Diseases, Department of Medicine, Washington University School of Medicine, St. Louis, MO, USA.
- Kerstin Fischer
- Division of Infectious Diseases, Department of Medicine, Washington University School of Medicine, St. Louis, MO, USA
- Devyn Yates
- Division of Infectious Diseases, Department of Medicine, Washington University School of Medicine, St. Louis, MO, USA
- Kurt C Curtis
- Division of Infectious Diseases, Department of Medicine, Washington University School of Medicine, St. Louis, MO, USA
- Bruce A Rosa
- Department of Internal Medicine, Washington University of St. Louis School of Medicine, St. Louis, MO, USA
- John Martin
- Department of Internal Medicine, Washington University of St. Louis School of Medicine, St. Louis, MO, USA
- Petra Erdmann-Gilmore
- Division of Endocrinology, Metabolism and Lipid Research, Department of Medicine, Washington University School of Medicine, St. Louis, MO, USA
- Robert S W Sprung
- Division of Endocrinology, Metabolism and Lipid Research, Department of Medicine, Washington University School of Medicine, St. Louis, MO, USA
- Makedonka Mitreva
- Department of Internal Medicine, Washington University of St. Louis School of Medicine, St. Louis, MO, USA
- R Reid Townsend
- Division of Endocrinology, Metabolism and Lipid Research, Department of Medicine, Washington University School of Medicine, St. Louis, MO, USA
- Department of Cell Biology and Physiology, Washington University School of Medicine, St. Louis, MO, USA
- Gary J Weil
- Division of Infectious Diseases, Department of Medicine, Washington University School of Medicine, St. Louis, MO, USA
- Peter U Fischer
- Division of Infectious Diseases, Department of Medicine, Washington University School of Medicine, St. Louis, MO, USA
25
O’Meara MJ, Rapala JR, Nichols CB, Alexandre C, Billmyre RB, Steenwyk JL, Alspaugh JA, O’Meara TR. CryptoCEN: A Co-Expression Network for Cryptococcus neoformans reveals novel proteins involved in DNA damage repair. bioRxiv 2023:2023.08.17.553567. [PMID: 37645941 PMCID: PMC10462067 DOI: 10.1101/2023.08.17.553567] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/31/2023]
Abstract
Elucidating gene function is a major goal in biology, especially among non-model organisms. However, doing so is complicated by the fact that molecular conservation does not always mirror functional conservation, and that complex relationships among genes are responsible for encoding pathways and higher-order biological processes. Co-expression, a promising approach for predicting gene function, relies on the general principle that genes with similar expression patterns across multiple conditions will likely be involved in the same biological process. For Cryptococcus neoformans, a prevalent human fungal pathogen greatly diverged from model yeasts, approximately 60% of the predicted genes in the genome lack functional annotations. Here, we leveraged a large amount of publicly available transcriptomic data to generate a C. neoformans Co-Expression Network (CryptoCEN), successfully recapitulating known protein networks, predicting gene function, and enabling insights into the principles influencing co-expression. With 100% predictive accuracy, we used CryptoCEN to identify 13 new DNA damage response genes, underscoring the utility of guilt-by-association for determining gene function. Overall, co-expression is a powerful tool for uncovering gene function, and it decreases the number of experimental tests needed to identify functions for currently under-annotated genes.
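The guilt-by-association principle described here can be sketched in a few lines of Python: correlate an unannotated gene's expression profile with those of annotated genes, then let the most strongly co-expressed neighbours vote on a function. This is a generic illustration and not the CryptoCEN method; the gene names, expression values, and the `predict_function` helper with its `k` parameter are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length profiles (0.0 if either is constant)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return 0.0
    return cov / (sd_x * sd_y)


def predict_function(query, expression, annotations, k=3):
    """Return the annotation with the largest summed correlation among the
    k most positively co-expressed annotated genes, or None if none qualify."""
    scores = sorted(
        ((pearson(expression[query], expression[g]), g)
         for g in annotations if g != query),
        reverse=True,
    )
    votes = {}
    for r, gene in scores[:k]:
        if r > 0:  # only positively co-expressed neighbours vote
            label = annotations[gene]
            votes[label] = votes.get(label, 0.0) + r
    return max(votes, key=votes.get) if votes else None


# Toy expression profiles across four conditions (hypothetical values).
expression = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.0, 6.0, 8.5],
    "geneC": [4.0, 3.0, 2.0, 1.0],
    "geneD": [1.0, 2.0, 3.0, 5.0],
    "query": [1.1, 2.0, 3.2, 3.9],
}
annotations = {
    "geneA": "DNA repair",
    "geneB": "DNA repair",
    "geneC": "ribosome",
    "geneD": "DNA repair",
}
prediction = predict_function("query", expression, annotations, k=3)
```

A real network would be built from many more conditions and would usually normalize and threshold the correlations before any function is assigned.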
Affiliation(s)
- Matthew J. O’Meara
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, USA
- Jackson R. Rapala
- Department of Microbiology and Immunology, University of Michigan, Ann Arbor, MI, USA
- Connie B. Nichols
- Departments of Medicine and Molecular Genetics/Microbiology; and Cell Biology, Duke University School of Medicine, Durham, North Carolina, USA
- Christina Alexandre
- Department of Microbiology and Immunology, University of Michigan, Ann Arbor, MI, USA
- R. Blake Billmyre
- Departments of Pharmaceutical and Biomedical Sciences/Infectious Disease, College of Pharmacy/College of Veterinary Medicine, University of Georgia, Athens, Georgia, USA
- Jacob L Steenwyk
- Howard Hughes Medical Institute and the Department of Molecular and Cell Biology, University of California, Berkeley, Berkeley, CA, USA
- J. Andrew Alspaugh
- Departments of Medicine and Molecular Genetics/Microbiology; and Cell Biology, Duke University School of Medicine, Durham, North Carolina, USA
- Teresa R. O’Meara
- Department of Microbiology and Immunology, University of Michigan, Ann Arbor, MI, USA
26
Bowman JP. Genome-wide and constrained ordination-based analyses of EC code data support reclassification of the species of Massilia La Scola et al. 2000 into Telluria Bowman et al. 1993, Mokoshia gen. nov. and Zemynaea gen. nov. Int J Syst Evol Microbiol 2023; 73. [PMID: 37589187 DOI: 10.1099/ijsem.0.005991] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/18/2023]
Abstract
Based on genome-wide data, Massilia species belonging to the clade including Telluria mixta LMG 11547T should be entirely transferred to the genus Telluria owing to the nomenclatural priority of the type species Telluria mixta. This results in the transfer of 35 Massilia species to the genus Telluria. The presented data also support the creation of two new genera since peripherally branching Massilia species are distinct from Telluria and other related genera. It is proposed that 13 Massilia species are transferred to Mokoshia gen. nov. with the type species designated Mokoshia eurypsychrophila comb. nov. The species Massilia arenosa is proposed to belong to the genus Zemynaea gen. nov. as the type species Zemynaea arenosa comb. nov. The genome-wide analysis was well supported by canonical ordination analysis of Enzyme Commission (EC) codes annotated from genomes via pannzer2. This new approach was performed to assess the conclusions of the genome-based data and reduce possible ambiguity in the taxonomic decision making. Cross-validation of EC code data compared within canonical plots validated the reclassifications and correctly visualized the expected genus-level taxonomic relationships. The approach is complementary to genome-wide methodology and could be used for testing sequence alignment based data across genetically related genera. In addition to the proposed broader reclassifications, invalidly described species 'Massilia antibiotica', 'Massilia aromaticivorans', 'Massilia cellulosilytica' and 'Massilia humi' are described as Telluria antibiotica sp. nov., Telluria aromaticivorans sp. nov., Telluria cellulosilytica sp. nov. and Pseudoduganella humi sp. nov., respectively. In addition, Telluria chitinolytica is reclassified as Pseudoduganella chitinolytica comb. nov.
The use of combined genome-wide and annotation descriptors compared using canonical ordination clarifies the taxonomy of Telluria and its sibling genera and provides another way to evaluate complex taxonomic data.
Affiliation(s)
- John P Bowman
- Tasmanian Institute of Agriculture, University of Tasmania, Sandy Bay, Hobart, Tasmania, 7005, Australia
27
MacCready JS, Roggenkamp EM, Gdanetz K, Chilvers MI. Elucidating the Obligate Nature and Biological Capacity of an Invasive Fungal Corn Pathogen. Mol Plant Microbe Interact 2023; 36:411-424. [PMID: 36853195 DOI: 10.1094/mpmi-10-22-0213-r] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/18/2023]
Abstract
Tar spot is a devastating corn disease caused by the obligate fungal pathogen Phyllachora maydis. Since its initial identification in the United States in 2015, P. maydis has become an increasing threat to corn production. Despite this, P. maydis has remained largely understudied at the molecular level, due to difficulties surrounding its obligate lifestyle. Here, we generated a significantly improved P. maydis nuclear and mitochondrial genome, using a combination of long- and short-read technologies, and also provide the first transcriptomic analysis of primary tar spot lesions. Our results show that P. maydis is deficient in inorganic nitrogen utilization, is likely heterothallic, and encodes significantly more protein-coding genes, including secreted enzymes and effectors, than previously determined. Furthermore, our expression analysis suggests that, following primary tar spot lesion formation, P. maydis might reroute carbon flux away from DNA replication and cell division pathways and towards pathways previously implicated in having significant roles in pathogenicity, such as autophagy and secretion. Together, our results identified several highly expressed unique secreted factors that likely contribute to host recognition and subsequent infection, greatly increasing our knowledge of the biological capacity of P. maydis, which has much broader implications for mitigating tar spot of corn. Copyright © 2023 The Author(s). This is an open access article distributed under the CC BY-NC-ND 4.0 International license.
Affiliation(s)
- Joshua S MacCready
- Department of Plant, Soil and Microbial Sciences, Michigan State University, East Lansing, MI 48824, U.S.A
- Emily M Roggenkamp
- Department of Plant, Soil and Microbial Sciences, Michigan State University, East Lansing, MI 48824, U.S.A
- Kristi Gdanetz
- Department of Plant, Soil and Microbial Sciences, Michigan State University, East Lansing, MI 48824, U.S.A
- Martin I Chilvers
- Department of Plant, Soil and Microbial Sciences, Michigan State University, East Lansing, MI 48824, U.S.A
28
Mokrousov I, Vyazovaya A, Shitikov E, Badleeva M, Belopolskaya O, Bespiatykh D, Gerasimova A, Ioannidis P, Jiao W, Khromova P, Masharsky A, Naizabayeva D, Papaventsis D, Pasechnik O, Perdigão J, Rastogi N, Shen A, Sinkov V, Skiba Y, Solovieva N, Tafaj S, Valcheva V, Kostyukova I, Zhdanova S, Zhuravlev V, Ogarkov O. Insight into pathogenomics and phylogeography of hypervirulent and highly-lethal Mycobacterium tuberculosis strain cluster. BMC Infect Dis 2023; 23:426. [PMID: 37353765 DOI: 10.1186/s12879-023-08413-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/11/2023] [Accepted: 06/21/2023] [Indexed: 06/25/2023]
Abstract
BACKGROUND The Mycobacterium tuberculosis Beijing genotype is a globally spread lineage with important medical properties that, however, vary among its subtypes. The M. tuberculosis Beijing 14717-15-cluster was recently discovered as a multidrug-resistant, hypervirulent, and highly lethal strain circulating in the Far Eastern region of Russia. Here, we aimed to analyze its pathogenomic features and phylogeographic pattern. RESULTS The study collection included M. tuberculosis DNA collected between 1996 and 2020 in different world regions. The bacterial DNA was subjected to genotyping and whole genome sequencing followed by bioinformatics and phylogenetic analysis. A PCR-based assay to detect specific SNPs of the Beijing 14717-15-cluster was developed and used for its screening in the global collections. Phylogenomic and phylogeographic analysis confirmed endemic prevalence of the Beijing 14717-15-cluster in the Asian part of Russia, and a distant common ancestor with isolates from Korea (> 115 SNPs). The Beijing 14717-15-cluster isolates had two common resistance mutations, RpsL Lys88Arg and KatG Ser315Thr, and belonged to spoligotype SIT269. The Russian isolates of this cluster were from Asian Russia, while 4 isolates were from the Netherlands and Spain. The cluster-specific SNPs that significantly affect protein function were identified in silico in genes within different categories (lipid metabolism, regulatory proteins, intermediary metabolism and respiration, PE/PPE, cell wall and cell processes). CONCLUSIONS We developed a simple method based on real-time PCR to detect the clinically significant MDR and hypervirulent Beijing 14717-15-cluster. Most of the identified cluster-specific mutations were previously unreported and could potentially be associated with increased pathogenic properties of this hypervirulent M. tuberculosis strain. Further experimental study to assess the pathobiological role of these mutations is warranted.
Affiliation(s)
- Igor Mokrousov
- Laboratory of Molecular Epidemiology and Evolutionary Genetics, St. Petersburg Pasteur Institute, St. Petersburg, Russia.
- Henan International Joint Laboratory of Children's Infectious Diseases, Henan Children's Hospital, Children's Hospital, Zhengzhou University, Zhengzhou Children's Hospital, Zhengzhou, China.
- Anna Vyazovaya
- Laboratory of Molecular Epidemiology and Evolutionary Genetics, St. Petersburg Pasteur Institute, St. Petersburg, Russia
- Egor Shitikov
- Department of Biomedicine and Genomics, Lopukhin Federal Research and Clinical Center of Physical-Chemical Medicine of Federal Medical Biological Agency, Moscow, 119435, Russia
- Maria Badleeva
- Department of Infectious Diseases, Dorji Banzarov Buryat State University, Ulan-Ude, Buryatia, Russia
- Olesya Belopolskaya
- Resource Center Bio-bank Center, Research Park of St. Petersburg State University, St. Petersburg, Russia
- Laboratory of Genogeography, Vavilov Institute of General Genetics Russian Academy of Sciences Moscow, Moscow, Russia
- Dmitry Bespiatykh
- Department of Biomedicine and Genomics, Lopukhin Federal Research and Clinical Center of Physical-Chemical Medicine of Federal Medical Biological Agency, Moscow, 119435, Russia
- Alena Gerasimova
- Laboratory of Molecular Epidemiology and Evolutionary Genetics, St. Petersburg Pasteur Institute, St. Petersburg, Russia
- Panayotis Ioannidis
- National Reference Laboratory for Mycobacteria, Sotiria Chest Diseases Hospital, Athens, Greece
- Weiwei Jiao
- National Clinical Research Center for Respiratory Diseases, Beijing Key Laboratory of Pediatric Respiratory Infection Disease, Beijing Children's Hospital, Beijing Pediatric Research Institute, Capital Medical University, National Center for Children's Health, Beijing, China
- Polina Khromova
- Department of Epidemiology and Microbiology, Scientific Centre of the Family Health and Human Reproduction Problems, Irkutsk, Russia
- Aleksey Masharsky
- Resource Center Bio-bank Center, Research Park of St. Petersburg State University, St. Petersburg, Russia
- Dinara Naizabayeva
- Laboratory of Molecular Biology, Almaty Branch of National Center for Biotechnology in Central Reference Laboratory, Almaty, Kazakhstan
- Department of Biotechnology, Al-Farabi Kazakh National University, Almaty, Kazakhstan
- Dimitrios Papaventsis
- National Reference Laboratory for Mycobacteria, Sotiria Chest Diseases Hospital, Athens, Greece
- Oksana Pasechnik
- Department of Public Health, Omsk State Medical University, Omsk, Russia
- João Perdigão
- iMed.ULisboa - Instituto de Investigação do Medicamento, Faculdade de Farmácia, Universidade de Lisboa, Lisbon, Portugal
- Nalin Rastogi
- WHO Supranational TB Reference Laboratory, Unité de la Tuberculose et des Mycobactéries, Institut Pasteur de la Guadeloupe, Abymes, Guadeloupe, France
- Adong Shen
- National Clinical Research Center for Respiratory Diseases, Beijing Key Laboratory of Pediatric Respiratory Infection Disease, Beijing Children's Hospital, Beijing Pediatric Research Institute, Capital Medical University, National Center for Children's Health, Beijing, China
- Henan Children's Hospital, Children's Hospital Affiliated to Zhengzhou University, Zhengzhou Children's Hospital, Zhengzhou, China
- Viacheslav Sinkov
- Department of Epidemiology and Microbiology, Scientific Centre of the Family Health and Human Reproduction Problems, Irkutsk, Russia
- Yuriy Skiba
- Laboratory of Molecular Biology, Almaty Branch of National Center for Biotechnology in Central Reference Laboratory, Almaty, Kazakhstan
- Natalia Solovieva
- St. Petersburg Research Institute of Phthisiopulmonology, St. Petersburg, Russia
- Silva Tafaj
- National Mycobacteria Reference Laboratory, University Hospital Shefqet Ndroqi, Tirana, Albania
- Violeta Valcheva
- Laboratory of Molecular Genetics of Mycobacteria, The Stephan Angeloff Institute of Microbiology, Bulgarian Academy of Sciences, Sofia, Bulgaria
- Irina Kostyukova
- Bacteriology laboratory, Clinical Tuberculosis Dispensary, Omsk, Russia
- Svetlana Zhdanova
- Department of Epidemiology and Microbiology, Scientific Centre of the Family Health and Human Reproduction Problems, Irkutsk, Russia
- Viacheslav Zhuravlev
- St. Petersburg Research Institute of Phthisiopulmonology, St. Petersburg, Russia
- Oleg Ogarkov
- Department of Epidemiology and Microbiology, Scientific Centre of the Family Health and Human Reproduction Problems, Irkutsk, Russia
29
Singh NK, Wood JM, Patane J, Moura LMS, Lombardino J, Setubal JC, Venkateswaran K. Characterization of metagenome-assembled genomes from the International Space Station. Microbiome 2023; 11:125. [PMID: 37264385 DOI: 10.1186/s40168-023-01545-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/05/2022] [Accepted: 04/07/2023] [Indexed: 06/03/2023]
Abstract
BACKGROUND Several investigations on the microbial diversity and functional properties of the International Space Station (ISS) environment were carried out to understand the influence of spaceflight conditions on the microbial population. However, metagenome-assembled genomes (MAGs) of ISS samples are yet to be generated and subjected to various genomic analyses, including phylogenetic affiliation, predicted functional pathways, antimicrobial resistance, and virulence characteristics. RESULTS In total, 46 MAGs were assembled from 21 ISS environmental metagenomes, in which metaSPAdes yielded 20 MAGs and metaWRAP generated 26 MAGs. Among 46 MAGs retrieved, 18 bacterial species were identified, including one novel genus/species combination (Kalamiella piersonii) and one novel bacterial species (Methylobacterium ajmalii). In addition, four bins exhibited fungal genomes; this is the first time fungal genomes were assembled from ISS metagenomes. Phylogenetic analyses of five bacterial species showed ISS-specific evolution. The genes pertaining to cell membranes, such as transmembrane transport, cell wall organization, and regulation of cell shape, were enriched. Variations in the antimicrobial-resistant (AMR) and virulence genes of the selected 20 MAGs were characterized to predict the ecology and evolution of biosafety level (BSL) 2 microorganisms in space. Since microbial virulence increases in microgravity, AMR gene sequences of MAGs were compared with genomes of respective ISS isolates and corresponding type strains. Among these 20 MAGs characterized, AMR genes were more prevalent in the Enterobacter bugandensis MAG, which has been predominantly isolated from clinical samples. MAGs were further used to analyze whether genes involved in AMR and biofilm formation of viable microbes in the ISS vary due to generational evolution in microgravity and radiation pressure.
CONCLUSIONS Comparative analyses of MAGs and whole-genome sequences of related ISS isolates and their type strains were performed to understand the variation related to microbial evolution under microgravity. The Pantoea/Kalamiella strains have the maximum single-nucleotide polymorphisms found within the ISS strains examined. This may suggest that Pantoea/Kalamiella strains are much more subject to microgravity changes. The reconstructed genomes will enable researchers to study the evolution of genomes under microgravity and low-dose irradiation compared to the evolution of microbes here on Earth.
Affiliation(s)
- Nitin K Singh
- Biotechnology and Planetary Protection Group, Jet Propulsion Laboratory, California Institute of Technology, Pasadena, CA, 91109, USA
- Jason M Wood
- Biotechnology and Planetary Protection Group, Jet Propulsion Laboratory, California Institute of Technology, Pasadena, CA, 91109, USA
- Jose Patane
- Departamento de Bioquímica, Instituto de Química, Universidade de São Paulo, São Paulo, SP, Brazil
- Livia Maria Silva Moura
- Departamento de Bioquímica, Instituto de Química, Universidade de São Paulo, São Paulo, SP, Brazil
- Jonathan Lombardino
- Department of Bacteriology, University of Wisconsin-Madison, Madison, WI, USA
- João Carlos Setubal
- Departamento de Bioquímica, Instituto de Química, Universidade de São Paulo, São Paulo, SP, Brazil
- Kasthuri Venkateswaran
- Biotechnology and Planetary Protection Group, Jet Propulsion Laboratory, California Institute of Technology, Pasadena, CA, 91109, USA.
30
Monir MM, Islam MT, Mazumder R, Mondal D, Nahar KS, Sultana M, Morita M, Ohnishi M, Huq A, Watanabe H, Qadri F, Rahman M, Thomson N, Seed K, Colwell RR, Ahmed T, Alam M. Genomic attributes of Vibrio cholerae O1 responsible for 2022 massive cholera outbreak in Bangladesh. Nat Commun 2023; 14:1154. [PMID: 36859426 PMCID: PMC9977884 DOI: 10.1038/s41467-023-36687-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/10/2022] [Accepted: 02/09/2023] [Indexed: 03/03/2023]
Abstract
In 2022, one of its worst cholera outbreaks began in Bangladesh, and the icddr,b Dhaka hospital treated more than 1300 patients and ca. 42,000 diarrheal cases from March 1 to April 10, 2022. Here, we present genomic attributes of V. cholerae O1 responsible for the 2022 Dhaka outbreak and 960 7th pandemic El Tor (7PET) strains from 88 countries. Results show that strains isolated during the Dhaka outbreak cluster with 7PET wave-3 global clade strains, but comprise subclade BD-1.2, for which the most recent common ancestor appears to be that responsible for recent endemic cholera in India. BD-1.2 strains have been present in Bangladesh since 2016, but did not establish dominance over BD-2 lineage strains until 2018, and are predominantly associated with endemic cholera. In conclusion, the recent shift in lineage and genetic attributes, including serotype switching of BD-1.2 from Ogawa to Inaba, may explain the increasing number of cholera cases in Bangladesh.
Affiliation(s)
- Md Mamun Monir
- Infectious diseases division, icddr,b (International Centre for Diarrhoeal Disease Research, Bangladesh), Dhaka, Bangladesh
- Mohammad Tarequl Islam
- Infectious diseases division, icddr,b (International Centre for Diarrhoeal Disease Research, Bangladesh), Dhaka, Bangladesh
- Razib Mazumder
- Laboratory Sciences and Services Division, icddr,b (International Centre for Diarrhoeal Disease Research, Bangladesh), Dhaka, Bangladesh
- Dinesh Mondal
- Laboratory Sciences and Services Division, icddr,b (International Centre for Diarrhoeal Disease Research, Bangladesh), Dhaka, Bangladesh
- Kazi Sumaita Nahar
- Infectious diseases division, icddr,b (International Centre for Diarrhoeal Disease Research, Bangladesh), Dhaka, Bangladesh
- Marzia Sultana
- Infectious diseases division, icddr,b (International Centre for Diarrhoeal Disease Research, Bangladesh), Dhaka, Bangladesh
- Masatomo Morita
- Department of Bacteriology, National Institute of Infectious Diseases (NIID), Tokyo, Japan
- Makoto Ohnishi
- Department of Bacteriology, National Institute of Infectious Diseases (NIID), Tokyo, Japan
- Anwar Huq
- Maryland Pathogen Research Institute, University of Maryland, College Park, MD, USA
- Haruo Watanabe
- Department of Bacteriology, National Institute of Infectious Diseases (NIID), Tokyo, Japan
- Firdausi Qadri
- Infectious diseases division, icddr,b (International Centre for Diarrhoeal Disease Research, Bangladesh), Dhaka, Bangladesh
- Mustafizur Rahman
- Infectious diseases division, icddr,b (International Centre for Diarrhoeal Disease Research, Bangladesh), Dhaka, Bangladesh
- Nicholas Thomson
- Parasites and Microbes Programme, Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SA, UK
- London School of Hygiene and Tropical Medicine, London, WC1E 7HT, United Kingdom
- Kimberley Seed
- Department of Plant and Microbial Biology, University of California, Berkeley, CA, USA
- Rita R Colwell
- Maryland Pathogen Research Institute, University of Maryland, College Park, MD, USA
- Johns Hopkins Bloomberg School of Public Health, Baltimore, MD, USA
- Tahmeed Ahmed
- Nutrition and Clinical Services Division, icddr,b (International Centre for Diarrhoeal Disease Research, Bangladesh), Dhaka, Bangladesh
- Munirul Alam
- Infectious diseases division, icddr,b (International Centre for Diarrhoeal Disease Research, Bangladesh), Dhaka, Bangladesh.
31
Yan TC, Yue ZX, Xu HQ, Liu YH, Hong YF, Chen GX, Tao L, Xie T. A systematic review of state-of-the-art strategies for machine learning-based protein function prediction. Comput Biol Med 2023; 154:106446. [PMID: 36680931 DOI: 10.1016/j.compbiomed.2022.106446] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 11/22/2022] [Revised: 12/07/2022] [Accepted: 12/19/2022] [Indexed: 12/24/2022]
Abstract
New drug discovery is inseparable from the discovery of drug targets, and the vast majority of the known targets are proteins. At the same time, proteins are essential structural and functional elements of living cells necessary for the maintenance of all forms of life. Therefore, protein functions have become the focus of many pharmacological and biological studies. Traditional experimental techniques are no longer adequate for rapidly growing annotation of protein sequences, and approaches to protein function prediction using computational methods have emerged and flourished. A significant trend has been to use machine learning to achieve this goal. In this review, approaches to protein function prediction based on the sequence, structure, protein-protein interaction (PPI) networks, and fusion of multi-information sources are discussed. The current status of research on protein function prediction using machine learning is considered, and existing challenges and prominent breakthroughs are discussed to provide ideas and methods for future studies.
Affiliation(s)
- Tian-Ci Yan
  - Key Laboratory of Elemene Class Anti-cancer Chinese Medicines, School of Pharmacy, Hangzhou Normal University, Hangzhou, 311121, China
- Zi-Xuan Yue
  - Key Laboratory of Elemene Class Anti-cancer Chinese Medicines, School of Pharmacy, Hangzhou Normal University, Hangzhou, 311121, China
- Hong-Quan Xu
  - Key Laboratory of Elemene Class Anti-cancer Chinese Medicines, School of Pharmacy, Hangzhou Normal University, Hangzhou, 311121, China
- Yu-Hong Liu
  - Key Laboratory of Elemene Class Anti-cancer Chinese Medicines, School of Pharmacy, Hangzhou Normal University, Hangzhou, 311121, China
- Yan-Feng Hong
  - Key Laboratory of Elemene Class Anti-cancer Chinese Medicines, School of Pharmacy, Hangzhou Normal University, Hangzhou, 311121, China
- Gong-Xing Chen
  - Key Laboratory of Elemene Class Anti-cancer Chinese Medicines, School of Pharmacy, Hangzhou Normal University, Hangzhou, 311121, China
- Lin Tao
  - Key Laboratory of Elemene Class Anti-cancer Chinese Medicines, School of Pharmacy, Hangzhou Normal University, Hangzhou, 311121, China
- Tian Xie
  - Key Laboratory of Elemene Class Anti-cancer Chinese Medicines, School of Pharmacy, Hangzhou Normal University, Hangzhou, 311121, China
32
Plyusnin I, Vapalahti O, Sironen T, Kant R, Smura T. Enhanced Viral Metagenomics with Lazypipe 2. Viruses 2023; 15:v15020431. [PMID: 36851645 PMCID: PMC9960287 DOI: 10.3390/v15020431] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 12/22/2022] [Revised: 01/29/2023] [Accepted: 01/31/2023] [Indexed: 02/08/2023]
Abstract
Viruses are the main agents causing emerging and re-emerging infectious diseases. It is therefore important to screen for and detect them and uncover the evolutionary processes that support their ability to jump species boundaries and establish themselves in new hosts. Metagenomic next-generation sequencing (mNGS) is a high-throughput, impartial technology that has enabled virologists to detect either known or novel, divergent viruses from clinical, animal, wildlife and environmental samples, with little a priori assumptions. mNGS is heavily dependent on bioinformatic analysis, with an emerging demand for integrated bioinformatic workflows. Here, we present Lazypipe 2, an updated mNGS pipeline with, as compared to Lazypipe1, significant improvements in code stability and transparency, with added functionality and support for new software components. We also present extensive benchmarking results, including evaluation of a novel canine simulated metagenome, precision and recall of virus detection at varying sequencing depth, and a low to extremely low proportion of viral genetic material. Additionally, we report accuracy of virus detection with two strategies: homology searches using nucleotide or amino acid sequences. We show that Lazypipe 2 with nucleotide-based annotation approaches near perfect detection for eukaryotic viruses and, in terms of accuracy, outperforms the compared pipelines. We also discuss the importance of homology searches with amino acid sequences for the detection of highly divergent novel viruses.
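The precision and recall figures mentioned in this abstract can be illustrated with a minimal sketch. It assumes that detection is scored at the level of taxon labels, comparing the set of taxa a pipeline reports against the set known to be present in a simulated metagenome. The function name and the example taxa are hypothetical and are not taken from Lazypipe 2 or its benchmark.

```python
def precision_recall(reported, truth):
    """Return (precision, recall) for reported taxa scored against the known truth set."""
    reported, truth = set(reported), set(truth)
    true_positives = len(reported & truth)
    # Guard against empty sets so the ratios are always defined.
    precision = true_positives / len(reported) if reported else 0.0
    recall = true_positives / len(truth) if truth else 0.0
    return precision, recall


# Hypothetical example: taxa spiked into a simulated sample vs. taxa a pipeline reported.
truth = {"Canine parvovirus", "Canine distemper virus", "Rabies lyssavirus", "Canine coronavirus"}
reported = {"Canine parvovirus", "Canine distemper virus", "Rabies lyssavirus", "Torque teno virus"}

precision, recall = precision_recall(reported, truth)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Here three of the four reported taxa are correct and three of the four true taxa are recovered, so both values are 0.75. Repeating the calculation at several sequencing depths would give the kind of depth-dependent curve the abstract refers to.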
Affiliation(s)
- Ilya Plyusnin
  - Department of Veterinary Biosciences, University of Helsinki, 00014 Helsinki, Finland
  - Department of Virology, University of Helsinki, 00014 Helsinki, Finland
- Olli Vapalahti
  - Department of Veterinary Biosciences, University of Helsinki, 00014 Helsinki, Finland
  - Department of Virology, University of Helsinki, 00014 Helsinki, Finland
  - HUS Diagnostic Center, Clinical Microbiology, Helsinki University Hospital, University of Helsinki, 00029 Helsinki, Finland
- Tarja Sironen
  - Department of Veterinary Biosciences, University of Helsinki, 00014 Helsinki, Finland
  - Department of Virology, University of Helsinki, 00014 Helsinki, Finland
- Ravi Kant
  - Department of Veterinary Biosciences, University of Helsinki, 00014 Helsinki, Finland
  - Department of Virology, University of Helsinki, 00014 Helsinki, Finland
  - Department of Tropical Parasitology, Institute of Maritime and Tropical Medicine, Medical University of Gdansk, 81-519 Gdynia, Poland
- Teemu Smura
  - Department of Virology, University of Helsinki, 00014 Helsinki, Finland
  - HUS Diagnostic Center, Clinical Microbiology, Helsinki University Hospital, University of Helsinki, 00029 Helsinki, Finland
33
Uzoechi SC, Rosa BA, Singh KS, Choi YJ, Bracken BK, Brindley PJ, Townsend RR, Sprung R, Zhan B, Bottazzi ME, Hawdon JM, Wong Y, Loukas A, Djuranovic S, Mitreva M. Excretory/Secretory Proteome of Females and Males of the Hookworm Ancylostoma ceylanicum. Pathogens 2023; 12:95. [PMID: 36678443 PMCID: PMC9865600 DOI: 10.3390/pathogens12010095] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/21/2022] [Revised: 12/20/2022] [Accepted: 01/04/2023] [Indexed: 01/09/2023]
Abstract
The dynamic host-parasite mechanisms underlying hookworm infection establishment and maintenance in mammalian hosts remain poorly understood but are primarily mediated by hookworm's excretory/secretory products (ESPs), which have a wide spectrum of biological functions. We used ultra-high performance mass spectrometry to comprehensively profile and compare female and male ESPs from the zoonotic human hookworm Ancylostoma ceylanicum, which is a natural parasite of dogs, cats, and humans. We improved the genome annotation, decreasing the number of protein-coding genes by 49% while improving completeness from 92 to 96%. Compared to the previous genome annotation, we detected 11% and 10% more spectra in female and male ESPs, respectively, using this improved version, identifying a total of 795 ESPs (70% in both sexes, with the remaining sex-specific). Using functional databases (KEGG, GO and Interpro), common and sex-specific enriched functions were identified. Comparisons with the exclusively human-infective hookworm Necator americanus identified species-specific and conserved ESPs. This is the first study identifying ESPs from female and male A. ceylanicum. The findings provide a deeper understanding of hookworm protein functions that assure long-term host survival and facilitate future engineering of transgenic hookworms and analysis of regulatory elements mediating the high-level expression of ESPs. Furthermore, the findings expand the list of potential vaccine and diagnostic targets and identify biologics that can be explored for anti-inflammatory potential.
Affiliation(s)
- Samuel C. Uzoechi
  - Division of Infectious Diseases, Department of Medicine, Washington University School of Medicine, St. Louis, MO 63110, USA
- Bruce A. Rosa
  - Division of Infectious Diseases, Department of Medicine, Washington University School of Medicine, St. Louis, MO 63110, USA
- Kumar Sachin Singh
  - Division of Infectious Diseases, Department of Medicine, Washington University School of Medicine, St. Louis, MO 63110, USA
- Young-Jun Choi
  - Division of Infectious Diseases, Department of Medicine, Washington University School of Medicine, St. Louis, MO 63110, USA
- Paul J. Brindley
  - Department of Microbiology, Immunology & Tropical Medicine, Research Center for Neglected Diseases of Poverty, School of Medicine and Health Sciences, George Washington University, Washington, DC 20037, USA
- R. Reid Townsend
  - Division of Endocrinology, Metabolism and Lipid Research, Department of Medicine, Washington University School of Medicine, St. Louis, MO 63110, USA
- Robert Sprung
  - Division of Endocrinology, Metabolism and Lipid Research, Department of Medicine, Washington University School of Medicine, St. Louis, MO 63110, USA
- Bin Zhan
  - Department of Pediatric Tropical Medicine, National School of Tropical Medicine, Baylor College of Medicine, Houston, TX 77030, USA
- Maria-Elena Bottazzi
  - Department of Pediatric Tropical Medicine, National School of Tropical Medicine, Baylor College of Medicine, Houston, TX 77030, USA
- John M. Hawdon
  - Department of Microbiology, Immunology & Tropical Medicine, Research Center for Neglected Diseases of Poverty, School of Medicine and Health Sciences, George Washington University, Washington, DC 20037, USA
- Yide Wong
  - Centre for Molecular Therapeutics, Australian Institute of Tropical Health and Medicine, James Cook University, Cairns 4878, Australia
- Alex Loukas
  - Centre for Molecular Therapeutics, Australian Institute of Tropical Health and Medicine, James Cook University, Cairns 4878, Australia
- Sergej Djuranovic
  - Department of Cell Biology and Physiology, Internal Medicine, Washington University School of Medicine, St. Louis, MO 63110, USA
- Makedonka Mitreva
  - Division of Infectious Diseases, Department of Medicine, Washington University School of Medicine, St. Louis, MO 63110, USA
34
Whole genome sequence characterization of Aspergillus terreus ATCC 20541 and genome comparison of the fungi A. terreus. Sci Rep 2023; 13:194. [PMID: 36604572 PMCID: PMC9814666 DOI: 10.1038/s41598-022-27311-7] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 09/20/2022] [Accepted: 12/29/2022] [Indexed: 01/06/2023]
Abstract
Aspergillus terreus is well-known for lovastatin and itaconic acid production with biomedical and commercial importance. The mechanisms of metabolite formation have been extensively studied to improve their yield through genetic engineering. However, the combined repertoire of carbohydrate-active enzymes (CAZymes), cytochrome P450s (CYP) enzymes, and secondary metabolites (SMs) in the different A. terreus strains has not been well studied yet, especially with respect to the presence of biosynthetic gene clusters (BGCs). Here we present a 30 Mb whole genome sequence of A. terreus ATCC 20541 in which we predicted 10,410 protein-coding genes. We compared the CAZymes, CYPs enzyme, and SMs across eleven A. terreus strains, and the results indicate that all strains have rich pectin degradation enzyme and CYP52 families. The lovastatin BGC of lovI was linked with lovF in A. terreus ATCC 20541, and the phenomenon was not found in the other strains. A. terreus ATCC 20541 lacked a non-ribosomal peptide synthetase (AnaPS) participating in acetylaszonalenin production, which was a conserved protein in the ten other strains. Our results present a comprehensive analysis of CAZymes, CYPs enzyme, and SM diversities in A. terreus strains and will facilitate further research in the function of BGCs associated with valuable SMs.
35
Maia GA, Filho VB, Kawagoe EK, Teixeira Soratto TA, Moreira RS, Grisard EC, Wagner G. AnnotaPipeline: An integrated tool to annotate eukaryotic proteins using multi-omics data. Front Genet 2022; 13:1020100. [DOI: 10.3389/fgene.2022.1020100] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/15/2022] [Accepted: 11/11/2022] [Indexed: 11/23/2022]
Abstract
Assignment of gene function has been a crucial, laborious, and time-consuming step in genomics. Due to a variety of sequencing platforms that generates increasing amounts of data, manual annotation is no longer feasible. Thus, the need for an integrated, automated pipeline allowing the use of experimental data towards validation of in silico prediction of gene function is of utmost relevance. Here, we present a computational workflow named AnnotaPipeline that integrates distinct software and data types on a proteogenomic approach to annotate and validate predicted features in genomic sequences. Based on FASTA (i) nucleotide or (ii) protein sequences or (iii) structural annotation files (GFF3), users can input FASTQ RNA-seq data, MS/MS data from mzXML or similar formats, as the pipeline uses both transcriptomic and proteomic information to corroborate annotations and validate gene prediction, providing transcription and expression evidence for functional annotation. Reannotation of the available Arabidopsis thaliana, Caenorhabditis elegans, Candida albicans, Trypanosoma cruzi, and Trypanosoma rangeli genomes was performed using the AnnotaPipeline, resulting in a higher proportion of annotated proteins and a reduced proportion of hypothetical proteins when compared to the annotations publicly available for these organisms. AnnotaPipeline is a Unix-based pipeline developed using Python and is available at: https://github.com/bioinformatics-ufsc/AnnotaPipeline.
36
Anuntasomboon P, Siripattanapipong S, Unajak S, Choowongkomon K, Burchmore R, Leelayoova S, Mungthin M, E-kobon T. Making the Most of Its Short Reads: A Bioinformatics Workflow for Analysing the Short-Read-Only Data of Leishmania orientalis (Formerly Named Leishmania siamensis) Isolate PCM2 in Thailand. BIOLOGY 2022; 11:biology11091272. [PMID: 36138751 PMCID: PMC9495971 DOI: 10.3390/biology11091272] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/26/2022] [Revised: 08/23/2022] [Accepted: 08/24/2022] [Indexed: 11/17/2022]
Abstract
Simple Summary: Leishmaniasis is a parasitic disease caused by flagellated protozoa of the genus Leishmania. Multiple genome sequencing platforms have been employed to complete Leishmania genomes, but at high cost. This study proposes an integrative bioinformatic workflow for assembling only the short-read data of Leishmania orientalis isolate PCM2 from Thailand to produce an acceptable-quality genome for further genomic analysis. This workflow gives extensive information required for identifying strain-specific markers and virulence-associated genes useful for drug and vaccine development before a more exhaustive and expensive investigation.
Abstract: Background: Leishmania orientalis (formerly named Leishmania siamensis) has been neglected for years in Thailand. The genomic study of L. orientalis has gained much attention recently after the release of the first high-quality reference genome of the isolate LSCM4. The integrative approach of multiple sequencing platforms for whole-genome sequencing has proven effective, but at considerable cost. This study presents a preliminary bioinformatic workflow including the use of multi-step de novo assembly coupled with the reference-based assembly method to produce high-quality genomic drafts from the short-read Illumina sequence data of L. orientalis isolate PCM2. Results: Integrating multi-step de novo assembly by MEGAHIT and SPAdes with the reference-based method using the L. enriettii genome, and salvaging the unmapped reads, resulted in a 30.27 Mb genomic draft of L. orientalis isolate PCM2 with 3367 contigs and 8887 predicted genes. The results from the integrated approach showed the best integrity, coverage, and contig alignment when compared to the genome of L. orientalis isolate LSCM4 collected from the northern province of Thailand. Similar patterns of gene ratios and frequency were observed from the GO biological process annotation. Fifty GO terms were assigned to the assembled genomes, and 23 of these (accounting for 61.6% of the annotated genes) showed higher gene counts and ratios when results from our workflow were compared to those of the LSCM4 isolate. Conclusions: These results indicated that our proposed bioinformatic workflow produced an acceptable-quality genome of L. orientalis strain PCM2 for functional genomic analysis, maximising the usage of the short-read data. This workflow would give extensive information required for identifying strain-specific markers and virulence-associated genes useful for drug and vaccine development before a more exhaustive and expensive investigation.
Affiliation(s)
- Pornchai Anuntasomboon
  - Department of Genetics, Faculty of Science, Kasetsart University, Bangkok 10900, Thailand
  - Omics Center for Agriculture, Bioresources, Food, and Health, Kasetsart University (OmiKU), Bangkok 10900, Thailand
- Sasimanas Unajak
  - Department of Biochemistry, Faculty of Science, Kasetsart University, Bangkok 10900, Thailand
- Kiattawee Choowongkomon
  - Department of Biochemistry, Faculty of Science, Kasetsart University, Bangkok 10900, Thailand
- Richard Burchmore
  - Glasgow Polyomics, College of Medical, Veterinary and Life Sciences, University of Glasgow, Glasgow G12 8QQ, UK
- Saovanee Leelayoova
  - Department of Parasitology, Phramongkutklao College of Medicine, Bangkok 10400, Thailand
- Mathirut Mungthin
  - Department of Parasitology, Phramongkutklao College of Medicine, Bangkok 10400, Thailand
- Teerasak E-kobon
  - Department of Genetics, Faculty of Science, Kasetsart University, Bangkok 10900, Thailand
  - Omics Center for Agriculture, Bioresources, Food, and Health, Kasetsart University (OmiKU), Bangkok 10900, Thailand
  - Correspondence: Tel.: +66-812-85-4672
37
Identification and Characterization of Jasmonic Acid Biosynthetic Genes in Salvia miltiorrhiza Bunge. Int J Mol Sci 2022; 23:ijms23169384. [PMID: 36012649 PMCID: PMC9409215 DOI: 10.3390/ijms23169384] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/21/2022] [Revised: 08/12/2022] [Accepted: 08/15/2022] [Indexed: 11/22/2022]
Abstract
Jasmonic acid (JA) is a vital plant hormone that performs a variety of critical functions for plants. Salvia miltiorrhiza Bunge (S. miltiorrhiza), also known as Danshen, is a renowned traditional Chinese medicinal herb. However, no thorough and systematic analysis of JA biosynthesis genes in S. miltiorrhiza exists. Through genome-wide prediction and molecular cloning, 23 candidate genes related to JA biosynthesis were identified in S. miltiorrhiza. These genes belong to four families that encode lipoxygenase (LOX), allene oxide synthase (AOS), allene oxide cyclase (AOC), and 12-OPDA reductase 3 (OPR3). Evaluation of their gene structures, protein characteristics, and phylogenetic trees showed that the candidate JA biosynthesis genes of S. miltiorrhiza are both distinct from and conserved relative to related genes in other plants. These genes displayed specific expression patterns across tissues and in methyl jasmonate (MeJA) and wound tests. Overall, this comprehensive and methodical examination provides valuable information for elucidating the JA biosynthesis pathway in S. miltiorrhiza.
38
Saroha A, Pal D, Gomashe SS, Akash, Kaur V, Ujjainwal S, Rajkumar S, Aravind J, Radhamani J, Kumar R, Chand D, Sengupta A, Wankhede DP. Identification of QTNs Associated With Flowering Time, Maturity, and Plant Height Traits in Linum usitatissimum L. Using Genome-Wide Association Study. Front Genet 2022; 13:811924. [PMID: 35774513 PMCID: PMC9237403 DOI: 10.3389/fgene.2022.811924] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 11/09/2021] [Accepted: 05/02/2022] [Indexed: 12/21/2022]
Abstract
Early flowering, maturity, and plant height are important traits for linseed to fit in rice fallows, for rainfed agriculture, and for economically viable cultivation. Here, Multi-Locus Genome-Wide Association Study (ML-GWAS) was undertaken in an association mapping panel of 131 accessions, genotyped using 68,925 SNPs identified by genotyping by sequencing approach. Phenotypic evaluation data of five environments comprising 3 years and two locations were used. GWAS was performed for three flowering time traits including days to 5%, 50%, and 95% flowering, days to maturity, and plant height by employing five ML-GWAS methods: FASTmrEMMA, FASTmrMLM, ISIS EM-BLASSO, mrMLM, and pLARmEB. A total of 335 unique QTNs have been identified for five traits across five environments. 109 QTNs were stable as observed in ≥2 methods and/or environments, explaining up to 36.6% phenotypic variance. For three flowering time traits, days to maturity, and plant height, 53, 30, and 27 stable QTNs, respectively, were identified. Candidate genes having roles in flower, pollen, embryo, seed and fruit development, and xylem/phloem histogenesis have been identified. Gene expression of candidate genes for flowering and plant height were studied using transcriptome of an early maturing variety Sharda (IC0523807). The present study unravels QTNs/candidate genes underlying complex flowering, days to maturity, and plant height traits in linseed.
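The stability criterion in this abstract (a QTN counts as stable when observed in ≥2 methods and/or environments) can be sketched as a simple filter. This is a minimal illustration under assumptions: the record layout `(qtn_id, method, environment)`, the function name, and the example QTN identifiers are hypothetical; only the method names come from the abstract.

```python
from collections import defaultdict


def stable_qtns(detections, min_support=2):
    """Return QTN ids seen with at least `min_support` distinct methods or environments.

    `detections` is an iterable of (qtn_id, method, environment) tuples.
    """
    methods = defaultdict(set)
    environments = defaultdict(set)
    for qtn, method, env in detections:
        methods[qtn].add(method)
        environments[qtn].add(env)
    return sorted(
        qtn
        for qtn in methods
        if len(methods[qtn]) >= min_support or len(environments[qtn]) >= min_support
    )


# Hypothetical detections; E1..E3 stand for environments.
detections = [
    ("QTN_A", "mrMLM", "E1"),
    ("QTN_A", "FASTmrMLM", "E1"),   # two methods, one environment: stable
    ("QTN_B", "pLARmEB", "E1"),
    ("QTN_B", "pLARmEB", "E3"),     # one method, two environments: stable
    ("QTN_C", "FASTmrEMMA", "E2"),  # single detection: not stable
]

print(stable_qtns(detections))
```

With these hypothetical records the filter keeps `QTN_A` and `QTN_B` and drops `QTN_C`.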
39
Holm L. Dali server: structural unification of protein families. Nucleic Acids Res 2022; 50:W210-W215. [PMID: 35610055 PMCID: PMC9252788 DOI: 10.1093/nar/gkac387] [Citation(s) in RCA: 364] [Impact Index Per Article: 182.0] [Received: 04/11/2022] [Revised: 04/27/2022] [Accepted: 05/02/2022] [Indexed: 12/26/2022]
Abstract
Protein structure is key to understanding biological function. Structure comparison deciphers deep phylogenies, providing insight into functional conservation and functional shifts during evolution. Until recently, structural coverage of the protein universe was limited by the cost and labour involved in experimental structure determination. Recent breakthroughs in deep learning revolutionized structural bioinformatics by providing accurate structural models of numerous protein families for which no structural information existed. The Dali server for 3D protein structure comparison is widely used by crystallographers to relate new structures to pre-existing ones. Here, we report two most recent upgrades to the web server: (i) the foldomes of key organisms in the AlphaFold Database (version 1) are searchable by Dali, (ii) structural alignments are annotated with protein families. Using these new features, we discovered a novel functionally diverse subgroup within the WRKY/GCM1 clan. This was accomplished by linking the structurally characterized SWI/SNF and NAM families as well as the structural models of the CG-1 family and uncharacterized proteins to the structure of Gti1/Pac2, a previously known member of the WRKY/GCM1 clan. The Dali server is available at http://ekhidna2.biocenter.helsinki.fi/dali. This website is free and open to all users and there is no login requirement.
Affiliation(s)
- Liisa Holm
  - Institute of Biotechnology, Helsinki Institute of Life Sciences, and Organismal and Evolutionary Biology Research Program, Faculty of Biosciences, University of Helsinki, Finland
40
Guo W, Coulter M, Waugh R, Zhang R. The value of genotype-specific reference for transcriptome analyses in barley. Life Sci Alliance 2022; 5:5/8/e202101255. [PMID: 35459738 PMCID: PMC9034525 DOI: 10.26508/lsa.202101255] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 10/05/2021] [Revised: 04/10/2022] [Accepted: 04/11/2022] [Indexed: 12/31/2022]
Abstract
We demonstrate in this study that using a common reference genome may lead to loss of genotype-specific information in the assembled Reference Transcript Dataset (RTD) and the generation of erroneous, incomplete, or misleading transcriptomics analysis results in barley. It is increasingly apparent that although different genotypes within a species share “core” genes, they also contain variable numbers of “specific” genes and different structures of “core” genes that are only present in a subset of individuals. Using a common reference genome may thus lead to a loss of genotype-specific information in the assembled Reference Transcript Dataset (RTD) and the generation of erroneous, incomplete or misleading transcriptomics analysis results. In this study, we assembled genotype-specific RTD (sRTD) and common reference–based RTD (cRTD) from RNA-seq data of cultivated Barke and Morex barley, respectively. Our quantitative evaluation showed that the sRTD has a significantly higher diversity of transcripts and alternative splicing events, whereas the cRTD missed 40% of transcripts present in the sRTD and it only has ∼70% accurate transcript assemblies. We found that the sRTD is more accurate for transcript quantification as well as differential expression analysis. However, gene-level quantification is less affected, which may be a reasonable compromise when a high-quality genotype-specific reference is not available.
Affiliation(s)
- Wenbin Guo
  - Information and Computational Sciences, James Hutton Institute, Dundee, UK
- Max Coulter
  - Plant Sciences Division, School of Life Sciences, University of Dundee at The James Hutton Institute, Dundee, UK
- Robbie Waugh
  - Plant Sciences Division, School of Life Sciences, University of Dundee at The James Hutton Institute, Dundee, UK
  - Cell and Molecular Sciences, James Hutton Institute, Dundee, UK
- Runxuan Zhang
  - Information and Computational Sciences, James Hutton Institute, Dundee, UK
41
Kabir MN, Wong L. EnsembleFam: towards more accurate protein family prediction in the twilight zone. BMC Bioinformatics 2022; 23:90. [PMID: 35287576 PMCID: PMC8919565 DOI: 10.1186/s12859-022-04626-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/11/2021] [Accepted: 03/02/2022] [Indexed: 11/30/2022]
Abstract
Background Current protein family modeling methods like profile Hidden Markov Model (pHMM), k-mer based methods, and deep learning-based methods do not provide very accurate protein function prediction for proteins in the twilight zone, due to low sequence similarity to reference proteins with known functions. Results We present a novel method EnsembleFam, aiming at better function prediction for proteins in the twilight zone. EnsembleFam extracts the core characteristics of a protein family using similarity and dissimilarity features calculated from sequence homology relations. EnsembleFam trains three separate Support Vector Machine (SVM) classifiers for each family using these features, and an ensemble prediction is made to classify novel proteins into these families. Extensive experiments are conducted using the Clusters of Orthologous Groups (COG) dataset and G Protein-Coupled Receptor (GPCR) dataset. EnsembleFam not only outperforms state-of-the-art methods on the overall dataset but also provides a much more accurate prediction for twilight zone proteins. Conclusions EnsembleFam, a machine learning method to model protein families, can be used to better identify members with very low sequence homology. Using EnsembleFam protein functions can be predicted using just sequence information with better accuracy than state-of-the-art methods.
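The per-family ensemble described in this abstract can be pictured with a minimal sketch. It rests on two assumptions that the abstract does not state: the three classifiers for a family are combined by majority vote, and each classifier is represented by a plain threshold function standing in for a trained SVM. All names (`ensemble_predict`, `threshold`, `family_1`, `family_2`) are hypothetical and do not come from EnsembleFam.

```python
from typing import Callable, Dict, List, Sequence

# A classifier maps a feature vector to True (member) or False (non-member).
Classifier = Callable[[Sequence[float]], bool]


def ensemble_predict(features: Sequence[float],
                     family_models: Dict[str, List[Classifier]]) -> List[str]:
    """Return the families for which a strict majority of classifiers vote 'member'."""
    predicted = []
    for family, classifiers in family_models.items():
        votes = sum(1 for clf in classifiers if clf(features))
        if votes * 2 > len(classifiers):
            predicted.append(family)
    return predicted


def threshold(index: int, cutoff: float) -> Classifier:
    """Stand-in for a trained classifier: accept when one feature reaches a cutoff."""
    return lambda features: features[index] >= cutoff


# Hypothetical setup: feature i is a similarity score to family i+1,
# and each family has three classifiers with different strictness.
family_models = {
    "family_1": [threshold(0, 0.5), threshold(0, 0.6), threshold(0, 0.9)],
    "family_2": [threshold(1, 0.5), threshold(1, 0.6), threshold(1, 0.9)],
}

print(ensemble_predict([0.7, 0.4], family_models))
```

In this example two of the three `family_1` stand-ins accept the query and none of the `family_2` stand-ins do, so only `family_1` is returned. A query that wins no majority gets an empty list, which is one way to leave a protein unassigned.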
Affiliation(s)
- Mohammad Neamul Kabir
  - Department of Computer Science, National University of Singapore, 13 Computing Drive, 117417, Singapore, Singapore
- Limsoon Wong
  - Department of Computer Science, National University of Singapore, 13 Computing Drive, 117417, Singapore, Singapore
42
Shahrear S, Afroj Zinnia M, Sany MRU, Islam ABMMK. Functional Analysis of Hypothetical Proteins of Vibrio parahaemolyticus Reveals the Presence of Virulence Factors and Growth-Related Enzymes With Therapeutic Potential. Bioinform Biol Insights 2022; 16:11779322221136002. [PMID: 36386863 PMCID: PMC9661560 DOI: 10.1177/11779322221136002] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/01/2022] [Accepted: 09/30/2022] [Indexed: 11/11/2022]
Abstract
Vibrio parahaemolyticus, an aquatic pathogen, is a major concern in the shrimp aquaculture industry. Several strains of this pathogen are responsible for causing acute hepatopancreatic necrosis disease as well as other serious illnesses, both of which result in severe economic losses. The genome sequences of two pathogenic strains of V. parahaemolyticus, MSR16 and MSR17, isolated from Bangladesh, have been reported to gain a better understanding of their diversity and virulence. However, the prevalence of hypothetical proteins (HPs) makes it challenging to obtain a comprehensive understanding of the pathogenesis of V. parahaemolyticus. The aim of the present study is to provide a functional annotation of the HPs, using several in silico tools, to elucidate their role in pathogenesis. The exploration of protein domains and families, similarity searches against proteins with known function, gene ontology enrichment, and protein-protein interaction analysis of the HPs led to high-confidence functional assignments for 656 proteins out of a pool of 2631 proteins. The in silico approach used in this study was important for accurately assigning function to HPs and inferring interactions with proteins with previously described functions. The HPs with predicted functions were categorized into various groups, such as enzymes involved in small-compound biosynthesis pathways, iron-binding proteins, antibiotic resistance proteins, and other proteins. Several proteins with potential druggability were identified among them. In addition, the HPs were screened for virulence factors, which led to the identification of proteins that have the potential to be exploited as vaccine candidates. The findings of the study will help in gaining a better understanding of the molecular mechanisms of bacterial pathogenesis. They may also provide insight into the process of evaluating promising targets for the development of drugs and vaccines against V. parahaemolyticus.
Affiliation(s)
- Sazzad Shahrear
- Department of Genetic Engineering and Biotechnology, University of Dhaka, Dhaka, Bangladesh
- Md. Rabi Us Sany
- Department of Genetic Engineering and Biotechnology, University of Dhaka, Dhaka, Bangladesh
|