Reference Citation Analysis

For: Gligorijević V, Renfrew PD, Kosciolek T, Leman JK, Berenberg D, Vatanen T, Chandler C, Taylor BC, Fisk IM, Vlamakis H, Xavier RJ, Knight R, Cho K, Bonneau R. Structure-based protein function prediction using graph convolutional networks. Nat Commun 2021;12:3168. [PMID: 34039967 DOI: 10.1038/s41467-021-23303-9] [Citation(s) in RCA: 217] [Impact Index Per Article: 72.3] [Received: 09/21/2020] [Accepted: 04/22/2021] [Indexed: 02/04/2023] Open

Number

Cited by Other Article(s)

151

Kondo HX, Iizuka H, Masumoto G, Kabaya Y, Kanematsu Y, Takano Y. Prediction of Protein Function from Tertiary Structure of the Active Site in Heme Proteins by Convolutional Neural Network. Biomolecules 2023;13:biom13010137. [PMID: 36671521 PMCID: PMC9855806 DOI: 10.3390/biom13010137] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 12/27/2022] [Accepted: 01/07/2023] [Indexed: 01/11/2023] Open

152

George A, Kim DN, Moser T, Gildea IT, Evans JE, Cheung MS. Graph identification of proteins in tomograms (GRIP-Tomo). Protein Sci 2023;32:e4538. [PMID: 36482866 PMCID: PMC9798246 DOI: 10.1002/pro.4538] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/13/2022] [Revised: 11/23/2022] [Accepted: 12/03/2022] [Indexed: 12/14/2022]

153

Lim PK, Julca I, Mutwil M. Redesigning plant specialized metabolism with supervised machine learning using publicly available reactome data. Comput Struct Biotechnol J 2023;21:1639-1650. [PMID: 36874159 PMCID: PMC9976193 DOI: 10.1016/j.csbj.2023.01.013] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 10/27/2022] [Revised: 01/12/2023] [Accepted: 01/12/2023] [Indexed: 01/19/2023] Open

154

Dehnavi A, Nazem F, Ghasemi F, Fassihi A, Rasti R. A GU-Net-based architecture predicting ligand–Protein-binding atoms. JOURNAL OF MEDICAL SIGNALS & SENSORS 2023. [DOI: 10.4103/jmss.jmss_142_21] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/29/2023]

155

Zhang J, Lin X, Chen Y, Li T, Lee AC, Chow EY, Cho WC, Chan T. LAFITE Reveals the Complexity of Transcript Isoforms in Subcellular Fractions. ADVANCED SCIENCE (WEINHEIM, BADEN-WURTTEMBERG, GERMANY) 2023;10:e2203480. [PMID: 36461702 PMCID: PMC9875686 DOI: 10.1002/advs.202203480] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/15/2022] [Revised: 10/28/2022] [Indexed: 06/17/2023]

156

Durairaj J, de Ridder D, van Dijk AD. Beyond sequence: Structure-based machine learning. Comput Struct Biotechnol J 2022;21:630-643. [PMID: 36659927 PMCID: PMC9826903 DOI: 10.1016/j.csbj.2022.12.039] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/26/2022] [Revised: 12/21/2022] [Accepted: 12/21/2022] [Indexed: 12/31/2022] Open

157

Fischer S, Gillis J. Defining the extent of gene function using ROC curvature. Bioinformatics 2022;38:5390-5397. [PMID: 36271855 PMCID: PMC9750128 DOI: 10.1093/bioinformatics/btac692] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/18/2022] [Revised: 09/19/2022] [Accepted: 10/20/2022] [Indexed: 12/25/2022] Open

Abstract

MOTIVATION

Interactions between proteins help us understand how genes are functionally related and how they contribute to phenotypes. Experiments provide imperfect 'ground truth' information about a small subset of potential interactions in a specific biological context, which can then be extended to the whole genome across different contexts, such as conditions, tissues or species, through machine learning methods. However, evaluating the performance of these methods remains a critical challenge. Here, we propose to evaluate the generalizability of gene characterizations through the shape of performance curves.

RESULTS

We identify Functional Equivalence Classes (FECs), subsets of annotated and unannotated genes that jointly drive performance, by assessing the presence of straight lines in ROC curves built from gene-centric prediction tasks, such as function or interaction predictions. FECs are widespread across data types and methods, they can be used to evaluate the extent and context-specificity of functional annotations in a data-driven manner. For example, FECs suggest that B cell markers can be decomposed into shared primary markers (10-50 genes), and tissue-specific secondary markers (100-500 genes). In addition, FECs suggest the existence of functional modules that span a wide range of the genome, with marker sets spanning at most 5% of the genome and data-driven extensions of Gene Ontology sets spanning up to 40% of the genome. Simple to assess visually and statistically, the identification of FECs in performance curves paves the way for novel functional characterization and increased robustness in the definition of functional gene sets.

AVAILABILITY AND IMPLEMENTATION

Code for analyses and figures is available at https://github.com/yexilein/pyroc.

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.

158

Jokinen E, Dumitrescu A, Huuhtanen J, Gligorijević V, Mustjoki S, Bonneau R, Heinonen M, Lähdesmäki H. TCRconv: predicting recognition between T cell receptors and epitopes using contextualized motifs. Bioinformatics 2022;39:6881078. [PMID: 36477794 PMCID: PMC9825763 DOI: 10.1093/bioinformatics/btac788] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 06/21/2022] [Revised: 11/01/2022] [Accepted: 12/06/2022] [Indexed: 12/12/2022] Open

159

Zacharias HU, Kaleta C, Cossais F, Schaeffer E, Berndt H, Best L, Dost T, Glüsing S, Groussin M, Poyet M, Heinzel S, Bang C, Siebert L, Demetrowitsch T, Leypoldt F, Adelung R, Bartsch T, Bosy-Westphal A, Schwarz K, Berg D. Microbiome and Metabolome Insights into the Role of the Gastrointestinal-Brain Axis in Parkinson's and Alzheimer's Disease: Unveiling Potential Therapeutic Targets. Metabolites 2022;12:metabo12121222. [PMID: 36557259 PMCID: PMC9786685 DOI: 10.3390/metabo12121222] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 10/31/2022] [Revised: 11/25/2022] [Accepted: 11/28/2022] [Indexed: 12/12/2022] Open

Affiliation(s)

Helena U. Zacharias Peter L. Reichertz Institute for Medical Informatics of TU Braunschweig and Hannover Medical School, 30625 Hannover, Germany Department of Internal Medicine I, University Medical Center Schleswig-Holstein, Campus Kiel, 24105 Kiel, Germany Institute of Clinical Molecular Biology, Kiel University and University Medical Center Schleswig-Holstein, Campus Kiel, 24105 Kiel, Germany Correspondence: (H.U.Z.); (C.K.)
Christoph Kaleta Research Group Medical Systems Biology, Institute for Experimental Medicine, Kiel University, 24105 Kiel, Germany Kiel Nano, Surface and Interface Science—KiNSIS, Kiel University, 24118 Kiel, Germany Correspondence: (H.U.Z.); (C.K.)
François Cossais Institute of Anatomy, Kiel University, 24118 Kiel, Germany
Eva Schaeffer Department of Neurology, Kiel University and University Medical Center Schleswig-Holstein, Campus Kiel, 24105 Kiel, Germany
Henry Berndt Research Group Comparative Immunobiology, Zoological Institute, Kiel University, 24118 Kiel, Germany
Lena Best Research Group Medical Systems Biology, Institute for Experimental Medicine, Kiel University, 24105 Kiel, Germany
Thomas Dost Research Group Medical Systems Biology, Institute for Experimental Medicine, Kiel University, 24105 Kiel, Germany
Svea Glüsing Institute of Human Nutrition and Food Science, Food Technology, Kiel University, 24118 Kiel, Germany
Mathieu Groussin Institute of Clinical Molecular Biology, Kiel University and University Medical Center Schleswig-Holstein, Campus Kiel, 24105 Kiel, Germany
Mathilde Poyet Department of Biological Engineering, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
Sebastian Heinzel Department of Neurology, Kiel University and University Medical Center Schleswig-Holstein, Campus Kiel, 24105 Kiel, Germany Institute of Medical Informatics and Statistics, Kiel University and University Medical Center Schleswig-Holstein, Campus Kiel, 24105 Kiel, Germany
Corinna Bang Institute of Clinical Molecular Biology, Kiel University and University Medical Center Schleswig-Holstein, Campus Kiel, 24105 Kiel, Germany
Leonard Siebert Kiel Nano, Surface and Interface Science—KiNSIS, Kiel University, 24118 Kiel, Germany Functional Nanomaterials, Department of Materials Science, Kiel University, 24143 Kiel, Germany
Tobias Demetrowitsch Institute of Human Nutrition and Food Science, Food Technology, Kiel University, 24118 Kiel, Germany Kiel Network of Analytical Spectroscopy and Mass Spectrometry, Kiel University, 24118 Kiel, Germany
Frank Leypoldt Department of Neurology, Kiel University and University Medical Center Schleswig-Holstein, Campus Kiel, 24105 Kiel, Germany Neuroimmunology, Institute of Clinical Chemistry, University Medical Center Schleswig-Holstein, 24105 Kiel, Germany
Rainer Adelung Kiel Nano, Surface and Interface Science—KiNSIS, Kiel University, 24118 Kiel, Germany Functional Nanomaterials, Department of Materials Science, Kiel University, 24143 Kiel, Germany
Thorsten Bartsch Kiel Nano, Surface and Interface Science—KiNSIS, Kiel University, 24118 Kiel, Germany Department of Neurology, Kiel University and University Medical Center Schleswig-Holstein, Campus Kiel, 24105 Kiel, Germany
Anja Bosy-Westphal Institute of Human Nutrition and Food Science, Kiel University, 24107 Kiel, Germany
Karin Schwarz Kiel Nano, Surface and Interface Science—KiNSIS, Kiel University, 24118 Kiel, Germany Institute of Human Nutrition and Food Science, Food Technology, Kiel University, 24118 Kiel, Germany Kiel Network of Analytical Spectroscopy and Mass Spectrometry, Kiel University, 24118 Kiel, Germany
Daniela Berg Kiel Nano, Surface and Interface Science—KiNSIS, Kiel University, 24118 Kiel, Germany Department of Neurology, Kiel University and University Medical Center Schleswig-Holstein, Campus Kiel, 24105 Kiel, Germany

160

Petrovsky DV, Rudnev VR, Nikolsky KS, Kulikova LI, Malsagova KM, Kopylov AT, Kaysheva AL. PSSNet-An Accurate Super-Secondary Structure for Protein Segmentation. Int J Mol Sci 2022;23:ijms232314813. [PMID: 36499138 PMCID: PMC9740782 DOI: 10.3390/ijms232314813] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/28/2022] [Revised: 11/18/2022] [Accepted: 11/24/2022] [Indexed: 12/03/2022] Open

161

Singh D, Roy J. A large-scale benchmark study of tools for the classification of protein-coding and non-coding RNAs. Nucleic Acids Res 2022;50:12094-12111. [PMID: 36420898 PMCID: PMC9757047 DOI: 10.1093/nar/gkac1092] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/03/2022] [Revised: 10/22/2022] [Accepted: 10/28/2022] [Indexed: 11/27/2022] Open

162

Marchetti L, Nifosì R, Martelli PL, Da Pozzo E, Cappello V, Banterle F, Trincavelli ML, Martini C, D’Elia M. Quantum computing algorithms: getting closer to critical problems in computational biology. Brief Bioinform 2022;23:6758194. [PMID: 36220772 PMCID: PMC9677474 DOI: 10.1093/bib/bbac437] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/16/2022] [Revised: 08/15/2022] [Accepted: 09/08/2022] [Indexed: 12/14/2022] Open

163

Wu L, Yin C, Zhu J, Wu Z, He L, Xia Y, Xie S, Qin T, Liu TY. SPRoBERTa: protein embedding learning with local fragment modeling. Brief Bioinform 2022;23:6711410. [PMID: 36136367 DOI: 10.1093/bib/bbac401] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 05/04/2022] [Revised: 07/18/2022] [Accepted: 08/18/2022] [Indexed: 12/14/2022] Open

164

Yan K, Lv H, Guo Y, Peng W, Liu B. sAMPpred-GAT: prediction of antimicrobial peptide by graph attention network and predicted peptide structure. Bioinformatics 2022;39:6808615. [PMID: 36342186 PMCID: PMC9805557 DOI: 10.1093/bioinformatics/btac715] [Citation(s) in RCA: 35] [Impact Index Per Article: 17.5] [Received: 05/05/2022] [Revised: 10/24/2022] [Accepted: 11/04/2022] [Indexed: 11/09/2022] Open

165

Li L, Peng S, Wang Z, Zhang T, Li H, Xiao Y, Li J, Liu Y, Yin H. Genome mining reveals abiotic stress resistance genes in plant genomes acquired from microbes via HGT. FRONTIERS IN PLANT SCIENCE 2022;13:1025122. [PMID: 36407614 PMCID: PMC9667741 DOI: 10.3389/fpls.2022.1025122] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 08/22/2022] [Accepted: 09/07/2022] [Indexed: 06/16/2023]

Abstract

Colonization by beneficial microbes can enhance plant tolerance to abiotic stresses. However, there are still many unknown fields regarding the beneficial plant-microbe interactions. In this study, we have assessed the amount or impact of horizontal gene transfer (HGT)-derived genes in plants that have potentials to confer abiotic stress resistance. We have identified a total of 235 gene entries in fourteen high-quality plant genomes belonging to phyla Chlorophyta and Streptophyta that confer resistance against a wide range of abiotic pressures acquired from microbes through independent HGTs. These genes encode proteins contributed to toxic metal resistance (e.g., ChrA, CopA, CorA), osmotic and drought stress resistance (e.g., Na⁺/proline symporter, potassium/proton antiporter), acid resistance (e.g., PcxA, ArcA, YhdG), heat and cold stress resistance (e.g., DnaJ, Hsp20, CspA), oxidative stress resistance (e.g., GST, PoxA, glutaredoxin), DNA damage resistance (e.g., Rad25, Rad51, UvrD), and organic pollutant resistance (e.g., CytP450, laccase, CbbY). Phylogenetic analyses have supported the HGT inferences as the plant lineages are all clustering closely with distant microbial lineages. Deep-learning-based protein structure prediction and analyses, in combination with expression assessment based on codon adaption index (CAI) further corroborated the functionality and expressivity of the HGT genes in plant genomes. A case-study applying fold comparison and molecular dynamics (MD) of the HGT-driven CytP450 gave a more detailed illustration on the resemblance and evolutionary linkage between the plant recipient and microbial donor sequences. Together, the microbe-originated HGT genes identified in plant genomes and their participation in abiotic pressures resistance indicate a more profound impact of HGT on the adaptive evolution of plants.

166

Wu F, Jin S, Jiang Y, Jin X, Tang B, Niu Z, Liu X, Zhang Q, Zeng X, Li SZ. Pre-Training of Equivariant Graph Matching Networks with Conformation Flexibility for Drug Binding. ADVANCED SCIENCE (WEINHEIM, BADEN-WURTTEMBERG, GERMANY) 2022;9:e2203796. [PMID: 36202759 PMCID: PMC9685463 DOI: 10.1002/advs.202203796] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 07/01/2022] [Revised: 09/07/2022] [Indexed: 05/16/2023]

167

Zhao S, Martin-Vicente A, Colabardini AC, Pereira Silva L, Rinker DC, Fortwendel JR, Goldman GH, Gibbons JG. Genomic and Molecular Identification of Genes Contributing to the Caspofungin Paradoxical Effect in Aspergillus fumigatus. Microbiol Spectr 2022;10:e0051922. [PMID: 36094204 PMCID: PMC9603777 DOI: 10.1128/spectrum.00519-22] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/09/2022] [Accepted: 08/17/2022] [Indexed: 11/25/2022] Open

Abstract

Aspergillus fumigatus is a deadly opportunistic fungal pathogen responsible for ~100,000 annual deaths. Azoles are the first line antifungal agent used against A. fumigatus, but azole resistance has rapidly evolved making treatment challenging. Caspofungin is an important second-line therapy against invasive pulmonary aspergillosis, a severe A. fumigatus infection. Caspofungin functions by inhibiting β-1,3-glucan synthesis, a primary and essential component of the fungal cell wall. A phenomenon termed the caspofungin paradoxical effect (CPE) has been observed in several fungal species where at higher concentrations of caspofungin, chitin replaces β-1,3-glucan, morphology returns to normal, and growth rate increases. CPE appears to occur in vivo, and it is therefore clinically important to better understand the genetic contributors to CPE. We applied genomewide association (GWA) analysis and molecular genetics to identify and validate candidate genes involved in CPE. We quantified CPE across 67 clinical isolates and conducted three independent GWA analyses to identify genetic variants associated with CPE. We identified 48 single nucleotide polymorphisms (SNPs) associated with CPE. We used a CRISPR/Cas9 approach to generate gene deletion mutants for seven genes harboring candidate SNPs. Two null mutants, ΔAfu3g13230 and ΔAfu4g07080 (dscP), resulted in reduced basal growth rate and a loss of CPE. We further characterized the dscP phosphatase-null mutant and observed a significant reduction in conidia production and extremely high sensitivity to caspofungin at both low and high concentrations. Collectively, our work reveals the contribution of Afu3g13230 and dscP in CPE and sheds new light on the complex genetic interactions governing this phenotype.

IMPORTANCE

This is one of the first studies to apply genomewide association (GWA) analysis to identify genes involved in an Aspergillus fumigatus phenotype. A. fumigatus is an opportunistic fungal pathogen that causes hundreds of thousands of infections and ~100,000 deaths each year, and antifungal resistance has rapidly evolved in this species. A phenomenon called the caspofungin paradoxical effect (CPE) occurs in some isolates, where high concentrations of the drug lead to increased growth rate. There is clinical relevance in understanding the genetic basis of this phenotype, since caspofungin concentrations could lead to unintended adverse clinical outcomes in certain cases. Using GWA analysis, we identified several interesting candidate polymorphisms and genes and then generated gene deletion mutants to determine whether these genes were important for CPE. Two of these mutant strains (ΔAfu3g13230 and ΔAfu4g07080/ΔdscP) displayed a loss of the CPE. This study sheds light on the genes involved in clinically important phenotype CPE.

168

Jiang Y, Renata H. Finding Superior Biocatalysts via Homolog Screening. CHEM CATALYSIS 2022;2:2471-2480. [PMID: 36406237 PMCID: PMC9667982 DOI: 10.1016/j.checat.2022.09.038] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Indexed: 06/16/2023]

169

Greenrod STE, Stoycheva M, Elphinstone J, Friman VP. Global diversity and distribution of prophages are lineage-specific within the Ralstonia solanacearum species complex. BMC Genomics 2022;23:689. [PMID: 36199029 PMCID: PMC9535894 DOI: 10.1186/s12864-022-08909-7] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Received: 05/18/2022] [Accepted: 09/23/2022] [Indexed: 11/17/2022] Open

Abstract

Background

Ralstonia solanacearum species complex (RSSC) strains are destructive plant pathogenic bacteria and the causative agents of bacterial wilt disease, infecting over 200 plant species worldwide. In addition to chromosomal genes, their virulence is mediated by mobile genetic elements including integrated DNA of bacteriophages, i.e., prophages, which may carry fitness-associated auxiliary genes or modulate host gene expression. Although experimental studies have characterised several prophages that shape RSSC virulence, the global diversity, distribution, and wider functional gene content of RSSC prophages are unknown. In this study, prophages were identified in a diverse collection of 192 RSSC draft genome assemblies originating from six continents.

Results

Prophages were identified bioinformatically and their diversity investigated using genetic distance measures, gene content, GC, and total length. Prophage distributions were characterised using metadata on RSSC strain geographic origin and lineage classification (phylotypes), and their functional gene content was assessed by identifying putative prophage-encoded auxiliary genes. In total, 313 intact prophages were identified, forming ten genetically distinct clusters. These included six prophage clusters with similarity to the Inoviridae, Myoviridae, and Siphoviridae phage families, and four uncharacterised clusters, possibly representing novel, previously undescribed phages. The prophages had broad geographical distributions, being present across multiple continents. However, they were generally host phylogenetic lineage-specific, and overall, prophage diversity was proportional to the genetic diversity of their hosts. The prophages contained many auxiliary genes involved in metabolism and virulence of both phage and bacteria.

Conclusions

Our results show that while RSSC prophages are highly diverse globally, they make lineage-specific contributions to the RSSC accessory genome, which could have resulted from shared coevolutionary history.

Supplementary Information

The online version contains supplementary material available at 10.1186/s12864-022-08909-7.

170

Structural host immune-microbiota interactions. Curr Opin Struct Biol 2022;76:102445. [PMID: 36063760 DOI: 10.1016/j.sbi.2022.102445] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/23/2022]

171

Zhu YH, Zhang C, Liu Y, Omenn GS, Freddolino PL, Yu DJ, Zhang Y. TripletGO: Integrating Transcript Expression Profiles with Protein Homology Inferences for Gene Function Prediction. GENOMICS, PROTEOMICS & BIOINFORMATICS 2022;20:1013-1027. [PMID: 35568117 PMCID: PMC10025770 DOI: 10.1016/j.gpb.2022.03.001] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 08/20/2021] [Revised: 03/02/2022] [Accepted: 04/16/2022] [Indexed: 01/13/2023]

172

Sengupta K, Saha S, Halder AK, Chatterjee P, Nasipuri M, Basu S, Plewczynski D. PFP-GO: Integrating protein sequence, domain and protein-protein interaction information for protein function prediction using ranked GO terms. Front Genet 2022;13:969915. [PMID: 36246645 PMCID: PMC9556876 DOI: 10.3389/fgene.2022.969915] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/15/2022] [Accepted: 08/31/2022] [Indexed: 11/13/2022] Open

173

Two Conserved Amino Acids Characterized in the Island Domain Are Essential for the Biological Functions of Brassinolide Receptors. Int J Mol Sci 2022;23:ijms231911454. [PMID: 36232750 PMCID: PMC9570414 DOI: 10.3390/ijms231911454] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 08/16/2022] [Revised: 09/19/2022] [Accepted: 09/26/2022] [Indexed: 11/16/2022] Open

174

Hu JX, Yang Y, Xu YY, Shen HB. GraphLoc: a graph neural network model for predicting protein subcellular localization from immunohistochemistry images. Bioinformatics 2022;38:4941-4948. [DOI: 10.1093/bioinformatics/btac634] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/23/2022] [Revised: 09/07/2022] [Accepted: 09/15/2022] [Indexed: 11/14/2022] Open

Abstract

MOTIVATION

Recognition of protein subcellular distribution patterns and identification of location biomarker proteins in cancer tissues are important for understanding protein functions and related diseases. Immunohistochemical (IHC) images enable visualizing the distribution of proteins at the tissue level, providing an important resource for the protein localization studies. In the past decades, several image-based protein subcellular location prediction methods have been developed, but the prediction accuracies still have much space to improve due to the complexity of protein patterns resulting from multi-label proteins and variation of location patterns across cell types or states.

RESULTS

Here, we propose a multi-label multi-instance model based on deep graph convolutional neural networks, GraphLoc, to recognize protein subcellular location patterns. GraphLoc builds a graph of multiple IHC images for one protein, learns protein-level representations by graph convolutions, and predicts multi-label information by a dynamic threshold method. Our results show that GraphLoc is a promising model for image-based protein subcellular location prediction with model interpretability. Furthermore, we apply GraphLoc to the identification of candidate location biomarkers and potential members for protein networks. A large portion of the predicted results have supporting evidence from the existing literatures and the new candidates also provide guidance for further experimental screening.

AVAILABILITY

The dataset and code are available at: www.csbio.sjtu.edu.cn/bioinf/GraphLoc.

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.

175

Szydlowski L, Ehlich J, Szczerbiak P, Shibata N, Goryanin I. Novel species identification and deep functional annotation of electrogenic biofilms, selectively enriched in a microbial fuel cell array. Front Microbiol 2022;13:951044. [PMID: 36188001 PMCID: PMC9517587 DOI: 10.3389/fmicb.2022.951044] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/23/2022] [Accepted: 08/17/2022] [Indexed: 11/13/2022] Open

176

A pocket-based 3D molecule generative model fueled by experimental electron density. Sci Rep 2022;12:15100. [PMID: 36068257 PMCID: PMC9448726 DOI: 10.1038/s41598-022-19363-6] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 07/17/2022] [Accepted: 08/29/2022] [Indexed: 11/08/2022] Open

177

Han Y, Wennersten SA, Wright JM, Ludwig RW, Lau E, Lam MPY. Proteogenomics reveals sex-biased aging genes and coordinated splicing in cardiac aging. Am J Physiol Heart Circ Physiol 2022;323:H538-H558. [PMID: 35930447 PMCID: PMC9448281 DOI: 10.1152/ajpheart.00244.2022] [Citation(s) in RCA: 13] [Impact Index Per Article: 6.5] [Received: 05/19/2022] [Revised: 07/20/2022] [Accepted: 07/31/2022] [Indexed: 01/24/2023]

Abstract

The risks of heart diseases are significantly modulated by age and sex, but how these factors influence baseline cardiac gene expression remains incompletely understood. Here, we used RNA sequencing and mass spectrometry to compare gene expression in female and male young adult (4 mo) and early aging (20 mo) mouse hearts, identifying thousands of age- and sex-dependent gene expression signatures. Sexually dimorphic cardiac genes are broadly distributed, functioning in mitochondrial metabolism, translation, and other processes. In parallel, we found over 800 genes with differential aging response between male and female, including genes in cAMP and PKA signaling. Analysis of the sex-adjusted aging cardiac transcriptome revealed a widespread remodeling of exon usage patterns that is largely independent from differential gene expression, concomitant with upstream changes in RNA-binding protein and splice factor transcripts. To evaluate the impact of the splicing events on cardiac proteoform composition, we applied an RNA-guided proteomics computational pipeline to analyze the mass spectrometry data and detected hundreds of putative splice variant proteins that have the potential to rewire the cardiac proteome. Taken together, the results here suggest that cardiac aging is associated with 1) widespread sex-biased aging genes and 2) a rewiring of RNA splicing programs, including sex- and age-dependent changes in exon usages and splice patterns that have the potential to influence cardiac protein structure and function. These changes contribute to the emerging evidence for considerable sexual dimorphism in the cardiac aging process that should be considered in the search for disease mechanisms.

NEW & NOTEWORTHY

Han et al. used proteogenomics to compare male and female mouse hearts at 4 and 20 mo. Sex-biased cardiac genes function in mitochondrial metabolism, translation, autophagy, and other processes. Hundreds of cardiac genes show sex-by-age interactions, that is, sex-biased aging genes. Cardiac aging is accompanied with a remodeling of exon usage in functionally coordinated genes, concomitant with differential expression of RNA-binding proteins and splice factors. These features represent an underinvestigated aspect of cardiac aging that may be relevant to the search for disease mechanisms.

178

Newaz K, Piland J, Clark PL, Emrich SJ, Li J, Milenković T. Multi-layer sequential network analysis improves protein 3D structural classification. Proteins 2022;90:1721-1731. [PMID: 35441395 PMCID: PMC9356989 DOI: 10.1002/prot.26349] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/22/2021] [Revised: 03/04/2022] [Accepted: 03/30/2022] [Indexed: 11/08/2022]

179

Yuvaraj I, Chaudhary SK, Jeyakanthan J, Sekar K. Structure of the hypothetical protein TTHA1873 from Thermus thermophilus. Acta Crystallogr F Struct Biol Commun 2022;78:338-346. [PMID: 36048084 PMCID: PMC9435673 DOI: 10.1107/s2053230x22008457] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/01/2022] [Accepted: 08/23/2022] [Indexed: 11/10/2022] Open

180

Ma W, Zhang S, Li Z, Jiang M, Wang S, Lu W, Bi X, Jiang H, Zhang H, Wei Z. Enhancing Protein Function Prediction Performance by Utilizing AlphaFold-Predicted Protein Structures. J Chem Inf Model 2022;62:4008-4017. [PMID: 36006049 DOI: 10.1021/acs.jcim.2c00885] [Citation(s) in RCA: 11] [Impact Index Per Article: 5.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/28/2022]

181

ML helps predict enzyme turnover rates. Nat Catal 2022. [DOI: 10.1038/s41929-022-00827-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/08/2022]

182

Li W, Zhang H, Li M, Han M, Yin Y. MGEGFP: a multi-view graph embedding method for gene function prediction based on adaptive estimation with GCN. Brief Bioinform 2022;23:6659744. [PMID: 35947989 DOI: 10.1093/bib/bbac333] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/11/2022] [Revised: 07/02/2022] [Accepted: 07/21/2022] [Indexed: 11/14/2022] Open

183

I-TASSER-MTD: a deep-learning-based platform for multi-domain protein structure and function prediction. Nat Protoc 2022;17:2326-2353. [PMID: 35931779 DOI: 10.1038/s41596-022-00728-0] [Citation(s) in RCA: 104] [Impact Index Per Article: 52.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/04/2022] [Accepted: 05/24/2022] [Indexed: 01/17/2023]

184

Qiu XY, Wu H, Shao J. TALE-cmap: Protein function prediction based on a TALE-based architecture and the structure information from contact map. Comput Biol Med 2022;149:105938. [DOI: 10.1016/j.compbiomed.2022.105938] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/26/2022] [Revised: 07/26/2022] [Accepted: 08/06/2022] [Indexed: 11/03/2022]

185

Du H, Jiang D, Gao J, Zhang X, Jiang L, Zeng Y, Wu Z, Shen C, Xu L, Cao D, Hou T, Pan P. Proteome-Wide Profiling of the Covalent-Druggable Cysteines with a Structure-Based Deep Graph Learning Network. Research (Wash D C) 2022;2022:9873564. [PMID: 35958111 PMCID: PMC9343084 DOI: 10.34133/2022/9873564] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/06/2022] [Accepted: 06/27/2022] [Indexed: 11/06/2022] Open

Affiliation(s)

Hongyan Du: Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, 310058 Zhejiang, China; State Key Lab of CAD&CG, Zhejiang University, Hangzhou, 310058 Zhejiang, China
Dejun Jiang: Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, 310058 Zhejiang, China; State Key Lab of CAD&CG, Zhejiang University, Hangzhou, 310058 Zhejiang, China
Junbo Gao: Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, 310058 Zhejiang, China
Xujun Zhang: Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, 310058 Zhejiang, China
Lingxiao Jiang: Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, 310058 Zhejiang, China
Yundian Zeng: Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, 310058 Zhejiang, China
Zhenxing Wu: Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, 310058 Zhejiang, China
Chao Shen: Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, 310058 Zhejiang, China
Lei Xu: Institute of Bioinformatics and Medical Engineering, School of Electrical and Information Engineering, Jiangsu University of Technology, Changzhou 213001, China
Dongsheng Cao: Xiangya School of Pharmaceutical Sciences, Central South University, Changsha, 410004 Hunan, China
Tingjun Hou: Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, 310058 Zhejiang, China; State Key Lab of CAD&CG, Zhejiang University, Hangzhou, 310058 Zhejiang, China
Peichen Pan: Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, 310058 Zhejiang, China

186

Xiao X, Jin Z, Wang S, Xu J, Peng Z, Wang R, Shao W, Hui Y. A dual-path dynamic directed graph convolutional network for air quality prediction. Sci Total Environ 2022;827:154298. [PMID: 35271925 DOI: 10.1016/j.scitotenv.2022.154298] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 10/31/2021] [Revised: 02/28/2022] [Accepted: 02/28/2022] [Indexed: 06/14/2023]

187

Sharma VS, Fossati A, Ciuffa R, Buljan M, Williams EG, Chen Z, Shao W, Pedrioli PGA, Purcell AW, Martínez MR, Song J, Manica M, Aebersold R, Li C. PCfun: a hybrid computational framework for systematic characterization of protein complex function. Brief Bioinform 2022;23:6611913. [PMID: 35724564 PMCID: PMC9310514 DOI: 10.1093/bib/bbac239] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/24/2022] [Revised: 05/05/2022] [Accepted: 05/21/2022] [Indexed: 11/14/2022] Open

Affiliation(s)

Varun S Sharma: Department of Biology, Institute of Molecular Systems Biology, ETH Zurich, Switzerland; CeMM Research Center for Molecular Medicine of the Austrian Academy of Sciences, Vienna, Austria
Andrea Fossati: Quantitative Biosciences Institute (QBI) and Department of Cellular and Molecular Pharmacology, University of California, San Francisco, CA 94158, USA; J. David Gladstone Institutes, San Francisco, CA 94158, USA
Rodolfo Ciuffa: Department of Biology, Institute of Molecular Systems Biology, ETH Zurich, Switzerland
Marija Buljan: Empa - Swiss Federal Laboratories for Materials Science and Technology, St. Gallen, Switzerland; Swiss Institute of Bioinformatics (SIB), Lausanne, Switzerland
Evan G Williams: Luxembourg Centre for Systems Biomedicine, University of Luxembourg, Esch-sur-Alzette, Luxembourg
Zhen Chen: Collaborative Innovation Center of Henan Grain Crops, Henan Agricultural University, Zhengzhou 450046, China
Wenguang Shao: Department of Biology, Institute of Molecular Systems Biology, ETH Zurich, Switzerland
Patrick G A Pedrioli: Department of Biology, Institute of Molecular Systems Biology, ETH Zurich, Switzerland
Anthony W Purcell: Monash Biomedicine Discovery Institute and Department of Biochemistry and Molecular Biology, Monash University, Melbourne, VIC 3800, Australia
María Rodríguez Martínez: IBM Research Europe, Zurich, Switzerland
Jiangning Song: Monash Biomedicine Discovery Institute and Department of Biochemistry and Molecular Biology, Monash University, Melbourne, VIC 3800, Australia; Monash Data Futures Institute, Monash University, Melbourne, VIC 3800, Australia
Matteo Manica: IBM Research Europe, Zurich, Switzerland
Ruedi Aebersold: Department of Biology, Institute of Molecular Systems Biology, ETH Zurich, Switzerland; Faculty of Science, University of Zurich, Switzerland
Chen Li: Department of Biology, Institute of Molecular Systems Biology, ETH Zurich, Switzerland; Monash Biomedicine Discovery Institute and Department of Biochemistry and Molecular Biology, Monash University, Melbourne, VIC 3800, Australia

188

Odrzywolek K, Karwowska Z, Majta J, Byrski A, Milanowska-Zabel K, Kosciolek T. Deep embeddings to comprehend and visualize microbiome protein space. Sci Rep 2022;12:10332. [PMID: 35725732 PMCID: PMC9209496 DOI: 10.1038/s41598-022-14055-7] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/15/2022] [Accepted: 05/31/2022] [Indexed: 12/13/2022] Open

189

Kagaya Y, Flannery ST, Jain A, Kihara D. ContactPFP: Protein Function Prediction Using Predicted Contact Information. Front Bioinform 2022;2. [PMID: 35875419 PMCID: PMC9302406 DOI: 10.3389/fbinf.2022.896295] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/13/2022] Open

Abstract

Computational function prediction is one of the most important problems in bioinformatics as elucidating the function of genes is a central task in molecular biology and genomics. Most of the existing function prediction methods use protein sequences as the primary source of input information because the sequence is the most available information for query proteins. There are attempts to consider other attributes of query proteins. Among these attributes, the three-dimensional (3D) structure of proteins is known to be very useful in identifying the evolutionary relationship of proteins, from which functional similarity can be inferred. Here, we report a novel protein function prediction method, ContactPFP, which uses predicted residue-residue contact maps as input structural features of query proteins. Although 3D structure information is known to be useful, it has not been routinely used in function prediction because the 3D structure is not experimentally determined for many proteins. In ContactPFP, we overcome this limitation by using residue-residue contact prediction, which has become increasingly accurate due to rapid development in the protein structure prediction field. ContactPFP takes a query protein sequence as input and uses predicted residue-residue contact as a proxy for the 3D protein structure. To characterize how predicted contacts contribute to function prediction accuracy, we compared the performance of ContactPFP with several well-established sequence-based function prediction methods. The comparative study revealed the advantages and weaknesses of ContactPFP compared to contemporary sequence-based methods. There were many cases where it showed higher prediction accuracy. We examined factors that affected the accuracy of ContactPFP using several illustrative cases that highlight the strength of our method.

190

Ihalage A, Hao Y. Formula Graph Self-Attention Network for Representation-Domain Independent Materials Discovery. Adv Sci (Weinh) 2022;9:e2200164. [PMID: 35475548 PMCID: PMC9218748 DOI: 10.1002/advs.202200164] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Received: 01/09/2022] [Revised: 03/05/2022] [Indexed: 06/14/2023]

191

Hu S, Zhang Z, Xiong H, Jiang M, Luo Y, Yan W, Zhao B. A tensor-based bi-random walks model for protein function prediction. BMC Bioinformatics 2022;23:199. [PMID: 35637427 PMCID: PMC9150346 DOI: 10.1186/s12859-022-04747-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/25/2022] [Accepted: 05/24/2022] [Indexed: 11/26/2022] Open

Abstract

Background

The accurate characterization of protein functions is critical to understanding life at the molecular level and has a huge impact on biomedicine and pharmaceuticals. Computational prediction of protein function has been studied for decades. Because protein–protein interaction (PPI) networks are plagued by noise and errors, researchers have in recent years focused on the fusion of multi-omics data. A data model that appropriately integrates network topologies with biological data and preserves their intrinsic characteristics is still a bottleneck and an aspirational goal for protein function prediction.

Results

In this paper, we propose the RWRT (Random Walks with Restart on Tensor) method to accomplish protein function prediction by applying bi-random walks on the tensor. RWRT first constructs a functional similarity tensor by combining protein interaction networks with multi-omics data derived from domain annotation and protein complex information. RWRT then extends the bi-random walks algorithm from a two-dimensional matrix to the tensor for scoring functional similarity between proteins. Finally, RWRT filters out possible pretenders based on the concept of cohesiveness coefficient and annotates target proteins with functions of the remaining functional partners. Experimental results indicate that RWRT performs significantly better than the state-of-the-art methods and improves the area under the receiver-operating curve (AUROC) by no less than 18%.

Conclusions

The functional similarity tensor offers us an alternative, in that it is a collection of networks sharing the same nodes; however, the edges belong to different categories or represent interactions of different nature. We demonstrate that the tensor-based random walk model not only discovers more partners with similar functions but is also effectively free from the constraints of errors in protein interaction networks. We believe that the performance of function prediction depends greatly on whether we can extract and exploit proper functional similarity information on protein correlations.

Supplementary Information

The online version contains supplementary material available at 10.1186/s12859-022-04747-2.

192

Wang S, Wu R, Lu J, Jiang Y, Huang T, Cai YD. Protein-protein interaction networks as miners of biological discovery. Proteomics 2022;22:e2100190. [PMID: 35567424 DOI: 10.1002/pmic.202100190] [Citation(s) in RCA: 13] [Impact Index Per Article: 6.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/28/2021] [Revised: 03/28/2022] [Accepted: 04/29/2022] [Indexed: 11/12/2022]

193

Chen Z, Liu X, Zhao P, Li C, Wang Y, Li F, Akutsu T, Bain C, Gasser RB, Li J, Yang Z, Gao X, Kurgan L, Song J. iFeatureOmega: an integrative platform for engineering, visualization and analysis of features from molecular sequences, structural and ligand data sets. Nucleic Acids Res 2022;50:W434-W447. [PMID: 35524557 PMCID: PMC9252729 DOI: 10.1093/nar/gkac351] [Citation(s) in RCA: 22] [Impact Index Per Article: 11.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/21/2022] [Revised: 04/22/2022] [Accepted: 04/25/2022] [Indexed: 01/07/2023] Open

Abstract

The rapid accumulation of molecular data motivates development of innovative approaches to computationally characterize sequences, structures and functions of biological and chemical molecules in an efficient, accessible and accurate manner. Notwithstanding several computational tools that characterize protein or nucleic acids data, there are no one-stop computational toolkits that comprehensively characterize a wide range of biomolecules. We address this vital need by developing a holistic platform that generates features from sequence and structural data for a diverse collection of molecule types. Our freely available and easy-to-use iFeatureOmega platform generates, analyzes and visualizes 189 representations for biological sequences, structures and ligands. To the best of our knowledge, iFeatureOmega provides the largest scope when directly compared to the current solutions, in terms of the number of feature extraction and analysis approaches and coverage of different molecules. We release three versions of iFeatureOmega including a webserver, command line interface and graphical interface to satisfy needs of experienced bioinformaticians and less computer-savvy biologists and biochemists. With the assistance of iFeatureOmega, users can encode their molecular data into representations that facilitate construction of predictive models and analytical studies. We highlight benefits of iFeatureOmega based on three research applications, demonstrating how it can be used to accelerate and streamline research in bioinformatics, computational biology, and cheminformatics areas. The iFeatureOmega webserver is freely available at http://ifeatureomega.erc.monash.edu and the standalone versions can be downloaded from https://github.com/Superzchen/iFeatureOmega-GUI/ and https://github.com/Superzchen/iFeatureOmega-CLI/.

Affiliation(s)

Zhen Chen: Collaborative Innovation Center of Henan Grain Crops, Henan Agricultural University, Zhengzhou 450046, China; Center for Crop Genome Engineering, Henan Agricultural University, Zhengzhou 450046, China
Xuhan Liu: Drug Discovery and Safety, Leiden Academic Centre for Drug Research, Einsteinweg 55, Leiden 2333 CC, The Netherlands
Pei Zhao: State Key Laboratory of Cotton Biology, Institute of Cotton Research of Chinese Academy of Agricultural Sciences (CAAS), Anyang 455000, China
Chen Li: Monash Biomedicine Discovery Institute and Department of Biochemistry and Molecular Biology, Monash University, Melbourne, Victoria 3800, Australia
Yanan Wang: Monash Biomedicine Discovery Institute and Department of Biochemistry and Molecular Biology, Monash University, Melbourne, Victoria 3800, Australia
Fuyi Li: Monash Biomedicine Discovery Institute and Department of Biochemistry and Molecular Biology, Monash University, Melbourne, Victoria 3800, Australia
Tatsuya Akutsu: Bioinformatics Center, Institute for Chemical Research, Kyoto University, Kyoto 611-0011, Japan
Chris Bain: Monash Data Future Institutes, Monash University, Melbourne, Victoria 3800, Australia
Robin B Gasser: Department of Veterinary Biosciences, Melbourne Veterinary School, The University of Melbourne, Parkville, Victoria 3010, Australia
Junzhou Li: Collaborative Innovation Center of Henan Grain Crops, Henan Agricultural University, Zhengzhou 450046, China
Zuoren Yang: State Key Laboratory of Cotton Biology, Institute of Cotton Research of Chinese Academy of Agricultural Sciences (CAAS), Anyang 455000, China
Xin Gao: Computational Bioscience Research Center (CBRC), Computer, Electrical and Mathematical Sciences and Engineering Division, King Abdullah University of Science and Technology (KAUST), Thuwal 23955, Saudi Arabia
Lukasz Kurgan: Department of Computer Science, Virginia Commonwealth University, Richmond, VA, USA
Jiangning Song: Monash Biomedicine Discovery Institute and Department of Biochemistry and Molecular Biology, Monash University, Melbourne, Victoria 3800, Australia; Monash Data Future Institutes, Monash University, Melbourne, Victoria 3800, Australia

194

Prediction of GPCR activity using Machine Learning. Comput Struct Biotechnol J 2022;20:2564-2573. [PMID: 35685352 PMCID: PMC9163700 DOI: 10.1016/j.csbj.2022.05.016] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/10/2022] [Revised: 05/08/2022] [Accepted: 05/09/2022] [Indexed: 11/20/2022] Open

195

Newton MAH, Rahman J, Zaman R, Sattar A. Enhancing Protein Contact Map Prediction Accuracy via Ensembles of Inter-Residue Distance Predictors. Comput Biol Chem 2022;99:107700. [DOI: 10.1016/j.compbiolchem.2022.107700] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/16/2022] [Revised: 05/19/2022] [Accepted: 05/19/2022] [Indexed: 11/03/2022]

196

LM-GVP: an extensible sequence and structure informed deep learning framework for protein property prediction. Sci Rep 2022;12:6832. [PMID: 35477726 PMCID: PMC9046255 DOI: 10.1038/s41598-022-10775-y] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/13/2022] [Accepted: 04/11/2022] [Indexed: 11/27/2022] Open

197

Gu J, Zhang T, Wu C, Liang Y, Shi X. Refined Contact Map Prediction of Peptides Based on GCN and ResNet. Front Genet 2022;13:859626. [PMID: 35571037 PMCID: PMC9092020 DOI: 10.3389/fgene.2022.859626] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/21/2022] [Accepted: 03/23/2022] [Indexed: 11/13/2022] Open

198

Ghorbani M, Prasad S, Klauda J, Brooks B. GraphVAMPNet, using graph neural networks and variational approach to Markov processes for dynamical modeling of biomolecules. J Chem Phys 2022;156:184103. [PMID: 35568532 PMCID: PMC9094994 DOI: 10.1063/5.0085607] [Citation(s) in RCA: 11] [Impact Index Per Article: 5.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/14/2022] Open

199

Detlefsen NS, Hauberg S, Boomsma W. Learning meaningful representations of protein sequences. Nat Commun 2022;13:1914. [PMID: 35395843 PMCID: PMC8993921 DOI: 10.1038/s41467-022-29443-w] [Citation(s) in RCA: 35] [Impact Index Per Article: 17.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/25/2020] [Accepted: 03/15/2022] [Indexed: 01/27/2023] Open

200

Pi J, Jiao P, Zhang Y, Li J. MDGNN: Microbial Drug Prediction Based on Heterogeneous Multi-Attention Graph Neural Network. Front Microbiol 2022;13:819046. [PMID: 35464940 PMCID: PMC9021438 DOI: 10.3389/fmicb.2022.819046] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/20/2021] [Accepted: 03/07/2022] [Indexed: 11/14/2022] Open