76
Effect of Prenatal Opioid Exposure on the Human Placental Methylome. Biomedicines 2022; 10:biomedicines10051150. [PMID: 35625888 PMCID: PMC9138340 DOI: 10.3390/biomedicines10051150] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 03/23/2022] [Revised: 05/10/2022] [Accepted: 05/11/2022] [Indexed: 11/17/2022] Open
Abstract
Prenatal exposure to addictive drugs can lead to placental epigenetic modifications, but a methylome-wide evaluation of placental DNA methylation changes after prenatal opioid exposure has not yet been performed. Placental tissue samples were collected at delivery from 19 opioid-exposed and 20 unexposed control full-term pregnancies. Placental DNA methylomes were profiled using the Illumina Infinium HumanMethylationEPIC BeadChip. Differentially methylated CpG sites associated with opioid exposure were identified with a linear model using the ‘limma’ R package. To identify differentially methylated regions (DMRs) spanning multiple CpG sites, the ‘DMRcate’ R package was used. The functions of genes mapped by differentially methylated CpG sites and DMRs were further annotated using Enrichr. Differentially methylated CpGs (n = 684, unadjusted p < 0.005 and |∆β| ≥ 0.05) were mapped to 258 genes (including PLD1, MGAM, and ALCS2). Differentially methylated regions (n = 199) were located in 174 genes (including KCNMA1). Enrichment analysis of the top differentially methylated CpG sites and regions indicated disrupted epigenetic regulation of genes involved in synaptic structure, chemical synaptic transmission, and nervous system development. Our findings imply that placental epigenetic changes due to prenatal opioid exposure could result in placental dysfunction, leading to abnormal fetal brain development and the symptoms of opioid withdrawal in neonates.
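The selection rule stated in this abstract (unadjusted p < 0.005 and |∆β| ≥ 0.05) can be illustrated with a minimal Python sketch. The record type, probe names, and values below are hypothetical and only show how the two thresholds combine; the study itself fitted its models with the 'limma' R package, which this sketch does not reproduce.

```python
from dataclasses import dataclass


@dataclass
class CpGResult:
    probe_id: str      # hypothetical probe identifier
    p_value: float     # unadjusted p-value from a per-site model
    delta_beta: float  # methylation difference (exposed minus control)


def is_differentially_methylated(result, p_cutoff=0.005, delta_cutoff=0.05):
    """Apply both thresholds quoted in the abstract to one CpG site."""
    return result.p_value < p_cutoff and abs(result.delta_beta) >= delta_cutoff


# Made-up values for illustration only.
results = [
    CpGResult("cg_example_1", 0.001, 0.08),   # passes both thresholds
    CpGResult("cg_example_2", 0.001, 0.02),   # effect size too small
    CpGResult("cg_example_3", 0.020, -0.10),  # p-value too large
    CpGResult("cg_example_4", 0.004, -0.05),  # passes (negative direction)
]

selected = [r.probe_id for r in results if is_differentially_methylated(r)]
print(selected)
```

Taking the absolute value of ∆β keeps sites that lose methylation as well as sites that gain it, which matches the |∆β| notation used in the abstract.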
77
Singh G, Gupta D. In-Silico Functional Annotation of Plasmodium falciparum Hypothetical Proteins to Identify Novel Drug Targets. Front Genet 2022; 13:821516. [PMID: 35444689 PMCID: PMC9013929 DOI: 10.3389/fgene.2022.821516] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/24/2021] [Accepted: 03/07/2022] [Indexed: 11/16/2022] Open
Abstract
Plasmodium falciparum is one of the Plasmodium species responsible for the majority of life-threatening malaria cases. The current antimalarial therapies are becoming less effective due to growing drug resistance, leading to the urgent requirement for alternative and more effective antimalarial drugs or vaccines. To facilitate novel drug discovery and vaccine development efforts, recent advances in sequencing technologies provide valuable information about the whole genome of the parasite, yet a lot more needs to be deciphered due to its incomplete proteome annotation. Surprisingly, out of the 5,389 proteins currently annotated in the Plasmodium falciparum 3D7 strain, 1,626 proteins (∼30%) are annotated as hypothetical proteins. In parasite genomic studies, the challenge of annotating hypothetical proteins is often ignored, which may obscure crucial information related to the pathogenicity of the parasite. In this study, we attempt to characterize hypothetical proteins of the parasite to identify novel drug targets using a computational pipeline. The study reveals that out of the overall pool of hypothetical proteins, 266 proteins have conserved functional signatures. Furthermore, pathway analysis of these proteins revealed that 23 proteins have an essential role in various biochemical, signalling and metabolic pathways. Additionally, all 266 proteins were subjected to computational structure analysis. We could successfully model 11 proteins. We validated and checked the structural stability of the models by performing molecular dynamics simulation. Interestingly, eight proteins show stable conformations, and seven proteins are specific for Plasmodium falciparum, based on homology analysis. Lastly, mapping the seven shortlisted hypothetical proteins on the Plasmodium falciparum protein-protein interaction network revealed 3,299 nodes and 2,750,692 edges. Our study revealed interesting functional details of seven hypothetical proteins of the parasite, which help us learn more about these less-studied molecules and their interactions, providing valuable clues to unravel the role of these proteins via future experimental validation.
78
Kim J, You S. Comprehensive analysis of miRNA-mRNA interactions in ovaries of aged mice. Anim Sci J 2022; 93:e13721. [PMID: 35417047 PMCID: PMC9285582 DOI: 10.1111/asj.13721] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 05/17/2021] [Revised: 03/04/2022] [Accepted: 03/09/2022] [Indexed: 01/01/2023]
Abstract
Advanced maternal age and ovarian aging are deleterious to the quantity and quality of oocytes and epigenetic modifications, which can affect the health of offspring. However, relatively little is known about the regulation of microRNA-mediated transcription during ovarian aging. We therefore aimed to identify age-related mRNA and microRNA changes and their interactions in the ovaries of aged mice. We performed QuantSeq 3'mRNA and small RNA sequencing to compare their expression patterns in post-ovulation ovaries from young (12-week-old) and old (44-week-old) mice. Functional annotation and integrative analyses were performed to identify the potential functions of differentially expressed genes and identify binding sites for critical microRNAs. We found 343 differentially expressed genes and 9 microRNAs in our comparison of the two mouse groups, with fold changes >2.0 (P < 0.01). Furthermore, we identified possible direct interactions between 24 differentially expressed mRNAs and 8 microRNAs. The differentially expressed genes are involved in fat digestion and absorption, the PI3K-Akt signaling pathway, serotonergic synapse, and ovarian steroidogenesis, which are important for folliculogenesis and oocyte growth. During ovarian aging, changes in gene expression induce alterations in folliculogenesis, oocyte growth, and steroidogenesis, resulting in decreased oocyte quality and reproductive outcomes.
79
Mathieu A, Leclercq M, Sanabria M, Perin O, Droit A. Machine Learning and Deep Learning Applications in Metagenomic Taxonomy and Functional Annotation. Front Microbiol 2022; 13:811495. [PMID: 35359727 PMCID: PMC8964132 DOI: 10.3389/fmicb.2022.811495] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 11/08/2021] [Accepted: 02/02/2022] [Indexed: 12/12/2022] Open
Abstract
Shotgun sequencing of environmental DNA (i.e., metagenomics) has revolutionized the field of environmental microbiology, allowing the characterization of all microorganisms in a sequencing experiment. To identify the microbes in terms of taxonomy and biological activity, the sequenced reads must necessarily be aligned on known microbial genomes/genes. However, current alignment methods are limited in terms of speed and can produce a significant number of false positives when detecting bacterial species or false negatives in specific cases (virus, plasmids, and gene detection). Moreover, recent advances in metagenomics have enabled the reconstruction of new genomes using de novo binning strategies, but these genomes, not yet fully characterized, are not used in classic approaches, whereas machine and deep learning methods can use them as models. In this article, we attempted to review the different methods and their efficiency to improve the annotation of metagenomic sequences. Deep learning models have reached the performance of the widely used k-mer alignment-based tools, with better accuracy in certain cases; however, they still must demonstrate their robustness across the variety of environmental samples and across the rapid expansion of accessible genomes in databases.
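As a rough illustration of what k-mer-based read classification means, the following Python sketch builds a lookup from k-mers to taxa and assigns a read by majority vote. It is a toy under stated assumptions: the taxon names, sequences, and the voting rule are hypothetical, and it does not reproduce the algorithm of any specific tool covered by the review.

```python
from collections import Counter


def kmers(seq, k):
    """Return all overlapping substrings of length k."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def build_index(references, k):
    """Map each k-mer to the single taxon it occurs in; drop shared k-mers."""
    index = {}
    ambiguous = set()
    for taxon, seq in references.items():
        for km in set(kmers(seq, k)):
            if km in ambiguous:
                continue
            if km in index and index[km] != taxon:
                del index[km]          # k-mer seen in two taxa: not informative
                ambiguous.add(km)
            else:
                index[km] = taxon
    return index


def classify_read(read, index, k):
    """Assign the taxon with the most matching k-mers, or None if no match."""
    votes = Counter(index[km] for km in kmers(read, k) if km in index)
    if not votes:
        return None
    return votes.most_common(1)[0][0]


# Hypothetical reference sequences, far shorter than real genomes.
references = {"taxon_A": "ACGTACGTGG", "taxon_B": "TTGACCATGA"}
index = build_index(references, k=4)

print(classify_read("GTACGTG", index, k=4))
print(classify_read("AAAAAA", index, k=4))
```

Reads with no indexed k-mer return `None`, which corresponds loosely to the unclassified reads that exact-match approaches leave behind when a genome is absent from the database.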
80
Bacterial Communities of Forest Soils along Different Elevations: Diversity, Structure, and Functional Composition with Potential Impacts on CO2 Emission. Microorganisms 2022; 10:microorganisms10040766. [PMID: 35456816 PMCID: PMC9032212 DOI: 10.3390/microorganisms10040766] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 03/07/2022] [Revised: 03/23/2022] [Accepted: 03/29/2022] [Indexed: 11/17/2022] Open
Abstract
Soil bacteria are important components of forest ecosystems, and their composition, structure, and functions are sensitive to environmental conditions along elevation gradients. Using 16S rRNA gene amplicon sequencing followed by FAPROTAX function prediction, we examined the diversity, composition, and functional potentials of soil bacterial communities at three sites at elevations of 1400 m, 1600 m, and 2200 m in a temperate forest. We showed that microbial taxonomic composition did not change with elevation (p = 0.311), though soil bacterial α-diversities did. Proteobacteria, Acidobacteria, Actinobacteria, and Verrucomicrobia were abundant phyla in almost all soil samples, while Nitrospirae, closely associated with soil nitrogen cycling, was the fourth most abundant phylum in soils at 2200 m. Chemoheterotrophy and aerobic chemoheterotrophy were the two most abundant functions performed in soils at 1400 m and 1600 m, while nitrification (25.59% on average) and aerobic nitrite oxidation (19.38% on average) were higher in soils at 2200 m. Soil CO2 effluxes decreased (p < 0.050) with increasing elevation, while they were positively correlated (r = 0.55, p = 0.035) with the abundances of bacterial functional groups associated with carbon degradation. Moreover, bacterial functional composition, rather than taxonomic composition, was significantly associated with soil CO2 effluxes, suggesting a decoupling of taxonomy and function, with the latter being a better predictor of ecosystem functions. Annual temperature, annual precipitation, and pH shaped (p < 0.050) both bacterial taxonomic and functional communities. By establishing linkages between bacterial taxonomic communities, abundances of bacterial functional groups, and soil CO2 fluxes, we provide novel insights into how soil bacterial communities could serve as potential proxies of ecosystem functions.
81
Kader MA, Ahammed A, Khan MS, Ashik SAA, Islam MS, Hossain MU. Hypothetical protein predicted to be tumor suppressor: a protein functional analysis. Genomics Inform 2022; 20:e6. [PMID: 35399005 PMCID: PMC9002001 DOI: 10.5808/gi.21073] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 11/26/2021] [Accepted: 01/08/2022] [Indexed: 12/22/2022] Open
Abstract
Litorilituus sediminis, a Gram-negative, aerobic, novel bacterium of the family Colwelliaceae, has a hypothetical protein containing a von Hippel-Lindau domain, which has significant tumor suppressor activity. Therefore, this study was designed to elucidate the structure and function of the biologically important hypothetical protein EMK97_00595 (QBG34344.1) using several bioinformatics tools. The functional annotation revealed that the hypothetical protein is an extracellular secretory soluble signal peptide and contains the von Hippel-Lindau (VHL; VHL beta) domain that has a significant role in tumor suppression. This domain is conserved throughout evolution, as its homologs are found in various organisms such as mammals, insects, and nematodes. The gene product of VHL has a critical regulatory activity in the ubiquitous oxygen-sensing pathway. This domain has a significant role in inhibiting cell proliferation, angiogenesis progression, kidney cancer, breast cancer, and colon cancer. Finally, the current study indicates that the annotated hypothetical protein is linked with tumor suppressor activity, which might be of great interest to future research in higher organisms.
82
Long L, Liu Z, Deng C, Li C, Wu L, Hou B, Lin Q. Genomic sequence and transcriptome analysis of the medicinal fungus Keithomyces neogunnii. Genome Biol Evol 2022; 14:6535711. [PMID: 35201278 PMCID: PMC8907406 DOI: 10.1093/gbe/evac033] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 02/18/2022] [Indexed: 11/19/2022] Open
Abstract
The filamentous fungus Keithomyces neogunnii can infect the larvae of Lepidoptera (Hepialus sp.) and form an insect–fungi complex, which is utilized as an important traditional Chinese medicine. As a valuable medicinal fungus, K. neogunnii produces diverse bioactive substances (e.g., polysaccharide, vitamins, cordycepic acid, and adenosine) under cultivation conditions. Herein, we report the first high-quality genome of the K. neogunnii single-spore isolate Cg7.2a using single-molecule real-time sequencing technology in combination with Illumina sequencing. The assembled genome was 32.6 Mb in size, containing 8,641 predicted genes and having a GC content of 52.16%. RNA sequencing analysis revealed the maximum number of differentially expressed genes in the fungus during the stroma formation stage compared with those during the mycelium stage. These data are valuable to enhance our understanding of the biology, development, evolution, and physiological metabolism of K. neogunnii.
83
Wehrspan ZJ, McDonnell RT, Elcock AH. Identification of Iron-Sulfur (Fe-S) Cluster and Zinc (Zn) Binding Sites Within Proteomes Predicted by DeepMind's AlphaFold2 Program Dramatically Expands the Metalloproteome. J Mol Biol 2022; 434:167377. [PMID: 34838520 PMCID: PMC8785651 DOI: 10.1016/j.jmb.2021.167377] [Citation(s) in RCA: 23] [Impact Index Per Article: 11.5] [Received: 10/12/2021] [Revised: 11/17/2021] [Accepted: 11/18/2021] [Indexed: 02/01/2023]
Abstract
DeepMind's AlphaFold2 software has ushered in a revolution in high quality, 3D protein structure prediction. In very recent work by the DeepMind team, structure predictions have been made for entire proteomes of twenty-one organisms, with >360,000 structures made available for download. Here we show that thousands of novel binding sites for iron-sulfur (Fe-S) clusters and zinc (Zn) ions can be identified within these predicted structures by exhaustive enumeration of all potential ligand-binding orientations. We demonstrate that AlphaFold2 routinely makes highly specific predictions of ligand binding sites: for example, binding sites that are comprised exclusively of four cysteine sidechains fall into three clusters, representing binding sites for 4Fe-4S clusters, 2Fe-2S clusters, or individual Zn ions. We show further: (a) that the majority of known Fe-S cluster and Zn binding sites documented in UniProt are recovered by the AlphaFold2 structures, (b) that there are occasional disputes between AlphaFold2 and UniProt with AlphaFold2 predicting highly plausible alternative binding sites, (c) that the Fe-S cluster binding sites that we identify in E. coli agree well with previous bioinformatics predictions, (d) that cysteines predicted here to be part of ligand binding sites show little overlap with those shown via chemoproteomics techniques to be highly reactive, and (e) that AlphaFold2 occasionally appears to build erroneous disulfide bonds between cysteines that should instead coordinate a ligand. These results suggest that AlphaFold2 could be an important tool for the functional annotation of proteomes, and the methodology presented here is likely to be useful for predicting other ligand-binding sites.
84
Mazumder L, Hasan M, Rus'd AA, Islam MA. In-silico characterization and structure-based functional annotation of a hypothetical protein from Campylobacter jejuni involved in propionate catabolism. Genomics Inform 2022; 19:e43. [PMID: 35012287 PMCID: PMC8752978 DOI: 10.5808/gi.21043] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 08/05/2021] [Accepted: 12/09/2021] [Indexed: 11/20/2022] Open
Abstract
Campylobacter jejuni is one of the most prevalent organisms associated with foodborne illness across the globe, causing campylobacteriosis and gastritis. Many proteins of C. jejuni are still unidentified. The purpose of this study was to determine the structure and function of a non-annotated hypothetical protein (HP) from C. jejuni. A number of properties such as physicochemical characteristics, 3D structure, and functional annotation of the HP (accession No. CAG2129885.1) were predicted using various bioinformatics tools, followed by further validation and quality assessment. Moreover, the protein-protein interactions and active site were obtained from the STRING and CASTp servers, respectively. The hypothetical protein possesses various characteristics including an acidic pH, thermal stability, water solubility, and cytoplasmic distribution. Alpha-helix and random coil structures are the most prominent structural components of this protein. Along with expected quality, the 3D model has been found to be novel. This study has identified the potential role of the HP in the 2-methylcitric acid cycle and propionate catabolism. Furthermore, protein-protein interactions revealed several significant functional partners. The in-silico characterization of this protein will help to better understand its molecular mechanism of action. The methodology of this study would also serve as the basis for additional research into proteomic and genomic data for functional potential identification.
85
Savojardo C, Babbi G, Baldazzi D, Martelli PL, Casadio R. A Glance into MTHFR Deficiency at a Molecular Level. Int J Mol Sci 2021; 23:167. [PMID: 35008593 PMCID: PMC8745156 DOI: 10.3390/ijms23010167] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 10/11/2021] [Revised: 12/03/2021] [Accepted: 12/21/2021] [Indexed: 12/16/2022] Open
Abstract
MTHFR deficiency still deserves an investigation to associate the phenotype to protein structure variations. To this aim, considering the MTHFR wild type protein structure, with a catalytic and a regulatory domain and taking advantage of state-of-the-art computational tools, we explore the properties of 72 missense variations known to be disease associated. By computing the thermodynamic ΔΔG change according to a consensus method that we recently introduced, we find that 61% of the disease-related variations destabilize the protein, are present both in the catalytic and regulatory domain and correspond to known biochemical deficiencies. The propensity of solvent accessible residues to be involved in protein-protein interaction sites indicates that most of the interacting residues are located in the regulatory domain, and that only three of them, located at the interface of the functional protein homodimer, are both disease-related and destabilizing. Finally, we compute the protein architecture with Hidden Markov Models, one from Pfam for the catalytic domain and the second computed in house for the regulatory domain. We show that patterns of disease-associated, physicochemical variation types, both in the catalytic and regulatory domains, are unique for the MTHFR deficiency when mapped into the protein architecture.
86
Whole Genome Sequencing and Annotation of Naematelia aurantialba (Basidiomycota, Edible-Medicinal Fungi). J Fungi (Basel) 2021; 8:jof8010006. [PMID: 35049946 PMCID: PMC8777972 DOI: 10.3390/jof8010006] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 11/11/2021] [Revised: 12/18/2021] [Accepted: 12/21/2021] [Indexed: 12/26/2022] Open
Abstract
Naematelia aurantialba is a rare edible fungus with both nutritional and medicinal value and is especially rich in bioactive polysaccharides. However, due to the lack of genomic information, research on the mining of active compounds, artificial breeding and cultivation, genetics, and molecular biology is limited. To facilitate the medicinal and food applications of N. aurantialba, we sequenced and analyzed the whole genome of N. aurantialba for the first time. The 21-Mb genome contained 15 contigs, and a total of 5860 protein-coding genes were predicted. The genome sequence shows that 296 genes are related to polysaccharide synthesis, including 15 genes related to nucleoside-activated sugar synthesis and 11 genes related to glucan synthesis. The genome also contains genes and gene clusters for the synthesis of other active substances, including terpenoids, unsaturated fatty acids, and bioactive proteins. In addition, N. aurantialba was found to be more closely related to Naematelia encephala than to Tremella fuciformis. In short, this study provides a reference for the molecular understanding of N. aurantialba and for related research.
87
Mhade S, Panse S, Tendulkar G, Awate R, Narasimhan Y, Kadam S, Yennamalli RM, Kaushik KS. AMPing Up the Search: A Structural and Functional Repository of Antimicrobial Peptides for Biofilm Studies, and a Case Study of Its Application to Corynebacterium striatum, an Emerging Pathogen. Front Cell Infect Microbiol 2021; 11:803774. [PMID: 34976872 PMCID: PMC8716830 DOI: 10.3389/fcimb.2021.803774] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 10/28/2021] [Accepted: 11/24/2021] [Indexed: 11/13/2022] Open
Abstract
Antimicrobial peptides (AMPs) have been recognized for their ability to target processes important for biofilm formation. Given the vast array of AMPs, identifying potential anti-biofilm candidates remains a significant challenge, and prompts the need for preliminary in silico investigations prior to extensive in vitro and in vivo studies. We have developed Biofilm-AMP (B-AMP), a curated 3D structural and functional repository of AMPs relevant to biofilm studies. In its current version, B-AMP contains predicted 3D structural models of 5544 AMPs (from the DRAMP database) developed using a suite of molecular modeling tools. The repository supports a user-friendly search, using source, name, DRAMP ID, and PepID (unique to B-AMP). Further, AMPs are annotated to existing biofilm literature, consisting of a vast library of over 10,000 articles, enhancing the functional capabilities of B-AMP. To provide an example of the usability of B-AMP, we use the sortase C biofilm target of the emerging pathogen Corynebacterium striatum as a case study. For this, 100 structural AMP models from B-AMP were subject to in silico protein-peptide molecular docking against the catalytic site residues of the C. striatum sortase C protein. Based on docking scores and interacting residues, we suggest a preference scale using which candidate AMPs could be taken up for further in silico, in vitro and in vivo testing. The 3D protein-peptide interaction models and preference scale are available in B-AMP. B-AMP is a comprehensive structural and functional repository of AMPs, and will serve as a starting point for future studies exploring AMPs for biofilm studies. B-AMP is freely available to the community at https://b-amp.karishmakaushiklab.com and will be regularly updated with AMP structures, interaction models with potential biofilm targets, and annotations to biofilm literature.
88
Rophina M, Pandhare K, Jadhao S, Nagaraj SH, Scaria V. BGvar: A comprehensive resource for blood group immunogenetics. Transfus Med 2021; 32:229-236. [PMID: 34897852 DOI: 10.1111/tme.12844] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/24/2021] [Revised: 11/11/2021] [Accepted: 12/01/2021] [Indexed: 11/29/2022]
Abstract
BACKGROUND Blood groups form the basis of effective and safe blood transfusion. There are about 43 well-recognised human blood group systems presently known. Blood groups are molecularly determined by the presence of specific antigens on the red blood cells and are genetically determined and inherited following Mendelian principles. The lack of a comprehensive, relevant, manually compiled and genome-ready dataset of red cell antigens has limited the widespread application of genomic technologies to characterise and interpret the blood group complement of an individual from genomic datasets. MATERIALS AND METHODS A range of public datasets was used to systematically annotate the variation compendium for its functionality and allele frequencies across global populations. Details on phenotype or relevant clinical importance were collated from reported literature evidence. RESULTS We have compiled the Blood Group Associated Genomic Variant Resource (BGvar), a manually curated online resource comprising all known human blood group related allelic variants, including a total of 1700 International Society of Blood Transfusion approved alleles and 1706 alleles predicted and curated from literature reports. This repository includes 1682 single nucleotide variations (SNVs), 310 Insertions, Deletions (InDels) and Duplications (Copy Number Variations) and about 1360 combination mutations corresponding to 43 human blood group systems and 2 transcription factors. This compendium also encompasses gene fusion and rearrangement events occurring in human blood group genes. CONCLUSION To the best of our knowledge, BGvar is a comprehensive and user-friendly resource with the most relevant collation of blood group alleles in humans. BGvar is accessible online at URL: http://clingen.igib.res.in/bgvar/.
89
Cantalapiedra CP, Hernández-Plaza A, Letunic I, Bork P, Huerta-Cepas J. eggNOG-mapper v2: Functional Annotation, Orthology Assignments, and Domain Prediction at the Metagenomic Scale. Mol Biol Evol 2021; 38:5825-5829. [PMID: 34597405 PMCID: PMC8662613 DOI: 10.1093/molbev/msab293] [Citation(s) in RCA: 980] [Impact Index Per Article: 326.7] [Indexed: 12/13/2022] Open
Abstract
Even though automated functional annotation of genes represents a fundamental step in most genomic and metagenomic workflows, it remains challenging at large scales. Here, we describe a major upgrade to eggNOG-mapper, a tool for functional annotation based on precomputed orthology assignments, now optimized for vast (meta)genomic data sets. Improvements in version 2 include a full update of both the genomes and functional databases to those from eggNOG v5, as well as several efficiency enhancements and new features. Most notably, eggNOG-mapper v2 now allows for: 1) de novo gene prediction from raw contigs, 2) built-in pairwise orthology prediction, 3) fast protein domain discovery, and 4) automated GFF decoration. eggNOG-mapper v2 is available as a standalone tool or as an online service at http://eggnog-mapper.embl.de.
90
Cantalapiedra CP, Hernández-Plaza A, Letunic I, Bork P, Huerta-Cepas J. eggNOG-mapper v2: Functional Annotation, Orthology Assignments, and Domain Prediction at the Metagenomic Scale. Mol Biol Evol 2021; 38:5825-5829. [PMID: 34597405 DOI: 10.1101/2021.06.03.446934] [Citation(s) in RCA: 48] [Impact Index Per Article: 16.0] [Indexed: 05/23/2023] Open
Abstract
Even though automated functional annotation of genes represents a fundamental step in most genomic and metagenomic workflows, it remains challenging at large scales. Here, we describe a major upgrade to eggNOG-mapper, a tool for functional annotation based on precomputed orthology assignments, now optimized for vast (meta)genomic data sets. Improvements in version 2 include a full update of both the genomes and functional databases to those from eggNOG v5, as well as several efficiency enhancements and new features. Most notably, eggNOG-mapper v2 now allows for: 1) de novo gene prediction from raw contigs, 2) built-in pairwise orthology prediction, 3) fast protein domain discovery, and 4) automated GFF decoration. eggNOG-mapper v2 is available as a standalone tool or as an online service at http://eggnog-mapper.embl.de.
91
Shen Y, Wang H, Xie J, Wang Z, Ma Y. Trait-specific Selection Signature Detection Reveals Novel Loci of Meat Quality in Large White Pigs. Front Genet 2021; 12:761252. [PMID: 34868241 PMCID: PMC8635012 DOI: 10.3389/fgene.2021.761252] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 08/19/2021] [Accepted: 10/18/2021] [Indexed: 11/24/2022] Open
Abstract
In past decades, meat quality traits have been shaped by human-driven selection in the process of genetic improvement programs. Exploring the potential genetic basis of artificial selection and mapping functional candidate genes for economic traits are of great significance in the genetic improvement of pigs. In this study, we focus on investigating the genetic basis of five meat quality traits, including intramuscular fat content (IMF), drip loss, water binding capacity, pH at 45 min (pH45min), and ultimate pH (pH24h). By constructing population pairs that differ along a phenotypic gradient, Wright’s fixation index (FST) and the cross-population extended haplotype homozygosity (XPEHH) were applied to detect selection signatures for these five traits. Finally, a total of 427 and 307 trait-specific selection signatures were revealed by FST and XPEHH, respectively. Further bioinformatics analysis indicates that some genes, such as USF1, NDUFS2, PIGM, IGSF8, CASQ1, and ACBD6, overlapping with the trait-specific selection signatures are responsible for phenotypes including fat metabolism and muscle development. Among them, a series of promising trait-specific selection signatures that were detected in the high IMF subpopulation are located in the region of 93544042-95179724 bp on SSC4, and the genes located in this region are all related to lipids and muscle development. Overall, these candidate genes of meat quality traits identified in this analysis may provide some fundamental information for further exploring the genetic basis of this complex trait.
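For readers unfamiliar with Wright's fixation index, the following Python sketch computes a textbook single-locus FST for two populations from allele frequencies, using FST = (HT - HS) / HT with equal population weights. This is an illustrative assumption: the abstract does not state which FST estimator or windowing scheme the study used, and the frequencies below are made up.

```python
def expected_heterozygosity(p):
    """Expected heterozygosity 2p(1 - p) at a biallelic locus."""
    return 2.0 * p * (1.0 - p)


def fst_biallelic(p1, p2):
    """Wright's FST for two equally weighted populations at one biallelic locus.

    p1 and p2 are the frequencies of the same allele in each population.
    """
    hs = (expected_heterozygosity(p1) + expected_heterozygosity(p2)) / 2.0
    p_bar = (p1 + p2) / 2.0
    ht = expected_heterozygosity(p_bar)
    if ht == 0.0:
        return 0.0  # locus is monomorphic overall, so no differentiation
    return (ht - hs) / ht


# Hypothetical allele frequencies in a high-trait and a low-trait subpopulation.
print(fst_biallelic(0.9, 0.1))  # strongly differentiated locus
print(fst_biallelic(0.5, 0.5))  # identical frequencies
```

Values near 0 indicate similar allele frequencies in the two groups, while values near 1 indicate that the groups are nearly fixed for different alleles, which is the pattern a selection scan looks for.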
92
Jalal K, Khan K, Hassam M, Abbas MN, Uddin R, Khusro A, Sahibzada MUK, Gajdács M. Identification of a Novel Therapeutic Target against XDR Salmonella Typhi H58 Using Genomics Driven Approach Followed Up by Natural Products Virtual Screening. Microorganisms 2021; 9:2512. [PMID: 34946114 PMCID: PMC8708826 DOI: 10.3390/microorganisms9122512] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 09/29/2021] [Revised: 11/25/2021] [Accepted: 11/30/2021] [Indexed: 11/17/2022] Open
Abstract
Typhoid fever is caused by a pathogenic, rod-shaped, flagellated, Gram-negative bacterium known as Salmonella Typhi. It features a polysaccharide capsule that acts as a virulence factor and deceives the host immune system by protecting against phagocytosis. Typhoid fever remains a major health concern in low- and middle-income countries, with an estimated death rate of ~200,000 per annum. The situation is exacerbated by the emergence of the extensively drug-resistant (XDR) strain of S. Typhi designated H58. The emergence of the XDR strain is alarming, and it poses serious threats to public health due to the failure of the current therapeutic regimen. A relatively new computational method called subtractive genomics analysis has been widely applied to discover novel drug targets against pathogens, particularly drug-resistant ones. The method involves the gradual reduction of the complete proteome of the pathogen, leading to a few potential and novel drug targets. Thus, in the current study, a subtractive genomics approach was applied against the Salmonella XDR strain to identify potential drug targets. The current study predicted four prioritized proteins (i.e., Colanic acid biosynthesis acetyltransferase wcaB, Shikimate dehydrogenase aroE, multidrug efflux RND transporter permease subunit MdtC, and pantothenate synthetase panC) as potential drug targets. Though a few of the prioritized proteins are treated in the literature as established drug targets against other pathogenic bacteria, these drug targets are identified here for the first time against S. Typhi (i.e., S. Typhi XDR). The current study aimed at drawing attention to new drug targets against S. Typhi that remain largely unexplored. One of the prioritized drug targets, i.e., Colanic acid biosynthesis acetyltransferase, was predicted as a unique, new drug target against S. Typhi XDR. Therefore, Colanic acid biosynthesis acetyltransferase was further explored using structure-based techniques. Additionally, ~1000 natural compounds were docked with Colanic acid biosynthesis acetyltransferase, resulting in the prediction of seven compounds as potential lead candidates against the S. Typhi XDR strain. The ADMET properties and docking-derived binding energies of these seven compounds characterized them as novel drug candidates. They may potentially be used for the development of future drugs in the treatment of Typhoid fever.
|
93
|
Hermankova K, Kourilova X, Pernicova I, Bezdicek M, Lengerova M, Obruca S, Sedlar K. Complete Genome Sequence of the Type Strain Tepidimonas taiwanensis LMG 22826T, a Thermophilic Alkaline Protease and Polyhydroxyalkanoate Producer. Genome Biol Evol 2021; 13:6462190. [PMID: 34908127 PMCID: PMC8715522 DOI: 10.1093/gbe/evab280] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 12/09/2021] [Indexed: 11/20/2022] Open
Abstract
Tepidimonas taiwanensis is a moderately thermophilic, Gram-negative, rod-shaped, chemoorganoheterotrophic, motile bacterium. The alkaline protease-producing type strain T. taiwanensis LMG 22826T was recently reported to also be a promising producer of polyhydroxyalkanoates (PHAs), renewable and biodegradable polymers representing an alternative to conventional plastics. Here, we present its first complete genome sequence, which is also the first complete genome sequence of the whole species. The genome consists of a single 2,915,587-bp-long circular chromosome with a GC content of 68.75%. Genome annotation identified 2,764 genes in total, of which 2,634 open reading frames belonged to protein-coding genes. Although functional annotation of the genome and division of genes into Clusters of Orthologous Groups (COGs) revealed a relatively high number of 694 genes with unknown function or unknown COG, the majority of genes were assigned a function. Most of the genes, 406 in total, were involved in energy production and conversion, and amino acid transport and metabolism. Moreover, particular key genes involved in the metabolism of PHA were identified. Knowledge of the genome, in connection with the recently reported ability to produce bioplastics from the waste stream of wine production, makes T. taiwanensis LMG 22826T an ideal candidate for further genome engineering as a bacterium with high biotechnological potential.
|
94
|
Claussnitzer M, Susztak K. Gaining insight into metabolic diseases from human genetic discoveries. Trends Genet 2021; 37:1081-1094. [PMID: 34315631 PMCID: PMC8578350 DOI: 10.1016/j.tig.2021.07.005] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 05/23/2021] [Revised: 06/29/2021] [Accepted: 07/05/2021] [Indexed: 12/30/2022]
Abstract
Large-scale human genetic association studies have identified sequence variations at thousands of genetic risk loci that are more common in patients with diverse metabolic diseases than in healthy controls. While these genetic associations have been replicated in multiple large cohorts and can sometimes explain up to 50% of heritability, the molecular and cellular mechanisms affected by common genetic variation associated with metabolic disease remain mostly unknown. A variety of new genome-wide data types, in conjunction with novel biostatistical and computational analytical methodologies and foundational experimental technologies, are paving the way for a principled approach to systematic variant-to-function (V2F) studies for metabolic diseases, turning associated regions into causal variants, cell types and states of action, effector genes, and cellular and physiological mechanisms. Identification of new target genes and cellular programs for metabolic risk loci will improve mechanistic understanding of disease biology and identification of novel therapeutic strategies.
|
95
|
Sharma D, Aswal M, Ahmad N, Kumar M, Khan AU. Proteomic analysis of the colistin-resistant E. coli clinical isolate: Explorations of the resistome. Protein Pept Lett 2021; 29:184-198. [PMID: 34844531 DOI: 10.2174/0929866528666211129095001] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/27/2021] [Revised: 10/12/2021] [Accepted: 10/20/2021] [Indexed: 11/22/2022]
Abstract
BACKGROUND Antimicrobial resistance is a worldwide problem after the emergence of colistin resistance since it was the last option left to treat carbapenemase-resistant bacterial infections. The mcr gene and its variants are among the causes of colistin resistance. Besides mcr genes, some other intrinsic genes are also involved in colistin resistance but still need to be explored. OBJECTIVE The aim of this study was to investigate differential protein expression in a colistin-resistant E. coli clinical isolate and to understand the interactive partners of these proteins as future drug targets. METHODS In this study, we employed whole proteome analysis through LC-MS/MS. Advanced proteomics tools were used to find differentially expressed proteins in the colistin-resistant Escherichia coli clinical isolate compared to a susceptible isolate. Gene ontology and STRING were used for functional annotation and protein-protein interaction networks, respectively. RESULTS LC-MS/MS analysis showed overexpression of 47 proteins and underexpression of 74 proteins in colistin-resistant E. coli. These proteins belong to DNA replication, transcription and translational processes; defense and stress-related proteins; and proteins of the phosphoenolpyruvate phosphotransferase system (PTS) and sugar metabolism. Functional annotation and protein-protein interaction showed translational and cellular metabolic processes, sugar metabolism and metabolite interconversion. CONCLUSION We conclude that these protein targets and their pathways might be used to develop novel therapeutics against colistin-resistant infections. These proteins could unveil the mechanism of colistin resistance.
|
96
|
Peng S, Petersen JL, Bellone RR, Kalbfleisch T, Kingsley NB, Barber AM, Cappelletti E, Giulotto E, Finno CJ. Decoding the Equine Genome: Lessons from ENCODE. Genes (Basel) 2021; 12:genes12111707. [PMID: 34828313 PMCID: PMC8625040 DOI: 10.3390/genes12111707] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 09/30/2021] [Revised: 10/24/2021] [Accepted: 10/26/2021] [Indexed: 12/23/2022] Open
Abstract
The horse reference genome assemblies, EquCab2.0 and EquCab3.0, have enabled great advancements in the equine genomics field, from tools to novel discoveries. However, significant gaps in knowledge regarding genome function remain, hindering the study of complex traits in horses. In an effort to address these gaps and with inspiration from the Encyclopedia of DNA Elements (ENCODE) project, the equine Functional Annotation of Animal Genome (FAANG) initiative was proposed to bridge the gap between genome and gene expression, providing further insights into functional regulation within the horse genome. Three years after launching the initiative, the equine FAANG group has generated data from more than 400 experiments using over 50 tissues, targeting a variety of regulatory features of the equine genome. In this review, we examine how valuable lessons learned from the ENCODE project informed our decisions in the equine FAANG project. We report the current state of the equine FAANG project and discuss how FAANG can serve as a template for future expansion of functional annotation in the equine genome and be used as a reference for studies of complex traits in horses. A well-annotated reference functional atlas will also help advance equine genetics in the pan-genome and precision medicine era.
|
97
|
Vlasova A, Hermoso Pulido T, Camara F, Ponomarenko J, Guigó R. FA-nf: A Functional Annotation Pipeline for Proteins from Non-Model Organisms Implemented in Nextflow. Genes (Basel) 2021; 12:genes12101645. [PMID: 34681040 PMCID: PMC8535801 DOI: 10.3390/genes12101645] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/24/2021] [Revised: 10/12/2021] [Accepted: 10/14/2021] [Indexed: 11/17/2022] Open
Abstract
Functional annotation allows adding biologically relevant information to predicted features in genomic sequences, and it is, therefore, an important procedure of any de novo genome sequencing project. It is also useful for proofreading and improving gene structural annotation. Here, we introduce FA-nf, a pipeline implemented in Nextflow, a versatile computational workflow management engine. The pipeline integrates different annotation approaches, such as NCBI BLAST+, DIAMOND, InterProScan, and KEGG. It starts from a protein sequence FASTA file and, optionally, a structural annotation file in GFF format, and produces several files, such as GO assignments, output summaries of the abovementioned programs and final annotation reports. The pipeline can be broken easily into smaller processes for the purpose of parallelization and easily deployed in a Linux computational environment, thanks to software containerization, thus helping to ensure full reproducibility.
|
98
|
Li B, Ritchie MD. From GWAS to Gene: Transcriptome-Wide Association Studies and Other Methods to Functionally Understand GWAS Discoveries. Front Genet 2021; 12:713230. [PMID: 34659337 PMCID: PMC8515949 DOI: 10.3389/fgene.2021.713230] [Citation(s) in RCA: 41] [Impact Index Per Article: 13.7] [Received: 05/22/2021] [Accepted: 07/27/2021] [Indexed: 12/12/2022] Open
Abstract
Since their inception, genome-wide association studies (GWAS) have identified more than a hundred thousand single nucleotide polymorphism (SNP) loci that are associated with various complex human diseases or traits. The majority of GWAS discoveries are located in non-coding regions of the human genome and have unknown functions. The valley between non-coding GWAS discoveries and downstream affected genes hinders the investigation of complex disease mechanisms and the utilization of human genetics for the improvement of clinical care. Meanwhile, advances in high-throughput sequencing technologies reveal important genomic regulatory roles that non-coding regions play in the transcriptional activities of genes. In this review, we focus on data integrative bioinformatics methods that combine GWAS with functional genomics knowledge to identify genetically regulated genes. We categorize and describe two types of data integrative methods. First, we describe fine-mapping methods. Fine-mapping is an exploratory approach that calibrates likely causal variants underneath GWAS signals. Fine-mapping methods connect GWAS signals to potentially causal genes through statistical methods and/or functional annotations. Second, we discuss gene-prioritization methods. These are hypothesis-generating approaches that evaluate whether genetic variants regulate genes via certain genetic regulatory mechanisms to influence complex traits, including colocalization, Mendelian randomization, and the transcriptome-wide association study (TWAS). TWAS is a gene-based association approach that investigates associations between genetically regulated gene expression and complex diseases or traits. TWAS has gained popularity over the years due to its ability to reduce the multiple testing burden in comparison to other variant-based analytic approaches. Multiple types of TWAS methods have been developed with varied methodological designs and biological hypotheses over the past 5 years. We discuss how TWAS methods differ in many aspects and the challenges that different TWAS methods face. Overall, TWAS is a powerful tool for identifying complex trait-associated genes. With the advent of single-cell sequencing, chromosome conformation capture, gene editing technologies, and multiplexing reporter assays, we are expecting a more comprehensive understanding of genomic regulation and genetically regulated genes underlying complex human diseases and traits in the future.
|
99
|
Adikusuma W, Irham LM, Chou WH, Wong HSC, Mugiyanto E, Ting J, Perwitasari DA, Chang WP, Chang WC. Drug Repurposing for Atopic Dermatitis by Integration of Gene Networking and Genomic Information. Front Immunol 2021; 12:724277. [PMID: 34721386 PMCID: PMC8548825 DOI: 10.3389/fimmu.2021.724277] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Received: 06/12/2021] [Accepted: 09/15/2021] [Indexed: 12/02/2022] Open
Abstract
Atopic Dermatitis (AD) is a chronic and relapsing skin disease. The medications for treating AD are still limited; most are topical corticosteroid creams or antibiotics. The current study attempted to discover potential AD treatments by integrating gene network and genomic analytic approaches. Herein, the Single Nucleotide Polymorphisms (SNPs) associated with AD were extracted from the GWAS catalog. We identified 70 AD-associated loci, and then 94 AD risk genes were found by extending to proximal SNPs based on r² > 0.8 in Asian populations using HaploReg v4.1. Next, we prioritized the AD risk genes using in silico pipelines of bioinformatic analysis based on six functional annotations to identify biological AD risk genes. Finally, we expanded them according to the molecular interactions using the STRING database to find the drug target genes. Our analysis showed 27 biological AD risk genes, and they were mapped to 76 drug target genes. According to DrugBank and the Therapeutic Target Database, 25 drug target genes overlapping with 53 drugs were identified. Importantly, dupilumab, which is approved for AD, was successfully identified in this bioinformatic analysis. Furthermore, ten drugs were found to be potentially useful for AD with clinical or preclinical evidence. In particular, we identified filgotinib and fedratinib, targeting the gene JAK1, as potential drugs for AD. Furthermore, four monoclonal antibody drugs (lebrikizumab, tralokinumab, tocilizumab, and canakinumab) were successfully identified as promising for AD repurposing. In sum, the results showed the feasibility of gene networking and genomic information as a potential drug discovery resource.
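To make the r2 > 0.8 proxy-SNP criterion in this abstract concrete, the following minimal sketch computes the standard linkage disequilibrium statistic r2 = D^2 / (pA(1 - pA) pB(1 - pB)), with D = pAB - pA pB, from haplotype and allele frequencies. The function names and the example frequencies are hypothetical; the authors obtained their LD values from HaploReg v4.1 rather than computing them this way.

```python
def ld_r_squared(p_ab: float, p_a: float, p_b: float) -> float:
    """Squared correlation (r^2) between alleles A and B at two biallelic SNPs.

    p_ab: frequency of the haplotype carrying both allele A and allele B.
    p_a, p_b: frequencies of allele A and allele B, respectively.
    """
    # Linkage disequilibrium coefficient D.
    d = p_ab - p_a * p_b
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("r^2 is undefined when either SNP is monomorphic")
    return d * d / denom


def is_proxy(p_ab: float, p_a: float, p_b: float, threshold: float = 0.8) -> bool:
    """Return True if the second SNP passes the r^2 cutoff (0.8, as in the abstract)."""
    return ld_r_squared(p_ab, p_a, p_b) > threshold


if __name__ == "__main__":
    # Hypothetical frequencies: the two alleles almost always travel together.
    print(is_proxy(p_ab=0.29, p_a=0.30, p_b=0.30))
```

A high r2 means that genotypes at one SNP predict genotypes at the other well, which is the rationale for extending a GWAS hit to nearby SNPs before mapping them to genes.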
|
100
|
Cao Z, Huang Y, Duan R, Jin P, Qin ZS, Zhang S. Disease category-specific annotation of variants using an ensemble learning framework. Brief Bioinform 2021; 23:6394995. [PMID: 34643213 DOI: 10.1093/bib/bbab438] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 08/09/2021] [Revised: 09/03/2021] [Accepted: 09/22/2021] [Indexed: 02/01/2023] Open
Abstract
Understanding the impact of non-coding sequence variants on complex diseases is an essential problem. We present a novel ensemble learning framework, CASAVA, to predict the disease category-specific risk of genomic loci. Using disease-associated variants identified by GWAS as training data, and diverse sequencing-based genomics and epigenomics profiles as features, CASAVA provides risk prediction for 24 major categories of diseases throughout the human genome. Our studies showed that CASAVA scores at a genomic locus provide a reasonable prediction of the disease-specific and disease category-specific risk for non-coding variants located within the locus. Taking MHC2TA and immune system diseases as an example, we demonstrate the potential of CASAVA in revealing variant-disease associations. A website (http://zhanglabtools.org/CASAVA) has been built to facilitate easy access to CASAVA scores.
|