Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Yang Y, Fang Q, Shen HB. Predicting gene regulatory interactions based on spatial gene expression data and deep learning. PLoS Comput Biol 2019;15:e1007324. [PMID: 31527870 PMCID: PMC6764701 DOI: 10.1371/journal.pcbi.1007324] [Citation(s) in RCA: 33] [Impact Index Per Article: 6.6] [Received: 11/23/2018] [Revised: 09/27/2019] [Accepted: 08/08/2019] [Indexed: 11/23/2022] Open

Cited by Other Article(s)

Cahill R, Wang Y, Xian RP, Lee AJ, Zeng H, Yu B, Tasic B, Abbasi-Asl R. Unsupervised pattern identification in spatial gene expression atlas reveals mouse brain regions beyond established ontology. Proc Natl Acad Sci U S A 2024;121:e2319804121. [PMID: 39226356 DOI: 10.1073/pnas.2319804121] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/12/2023] [Accepted: 07/24/2024] [Indexed: 09/05/2024] Open

Cui W, Long Q, Xiao M, Wang X, Feng G, Li X, Wang P, Zhou Y. Refining computational inference of gene regulatory networks: integrating knockout data within a multi-task framework. Brief Bioinform 2024;25:bbae361. [PMID: 39082651 PMCID: PMC11289685 DOI: 10.1093/bib/bbae361] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/23/2024] [Revised: 05/09/2024] [Accepted: 07/16/2024] [Indexed: 08/03/2024] Open

Affiliation(s)

Wentao Cui: Computer Network Information Center, Chinese Academy of Sciences, CAS Informatization Plaza No. 2 Dong Sheng Nan Lu, Haidian District, Beijing, 100083, China; University of Chinese Academy of Sciences, No. 19A Yuquan Road, Shijingshan District, Beijing, 100049, China
Qingqing Long: Computer Network Information Center, Chinese Academy of Sciences, CAS Informatization Plaza No. 2 Dong Sheng Nan Lu, Haidian District, Beijing, 100083, China
Meng Xiao: Computer Network Information Center, Chinese Academy of Sciences, CAS Informatization Plaza No. 2 Dong Sheng Nan Lu, Haidian District, Beijing, 100083, China; University of Chinese Academy of Sciences, No. 19A Yuquan Road, Shijingshan District, Beijing, 100049, China
Xuezhi Wang: Computer Network Information Center, Chinese Academy of Sciences, CAS Informatization Plaza No. 2 Dong Sheng Nan Lu, Haidian District, Beijing, 100083, China; University of Chinese Academy of Sciences, No. 19A Yuquan Road, Shijingshan District, Beijing, 100049, China
Guihai Feng: University of Chinese Academy of Sciences, No. 19A Yuquan Road, Shijingshan District, Beijing, 100049, China; State Key Laboratory of Stem Cell and Reproductive Biology, Institute of Zoology, Chinese Academy of Sciences, 1 Beichen West Road, Chaoyang District, Beijing, 100101, China
Xin Li: University of Chinese Academy of Sciences, No. 19A Yuquan Road, Shijingshan District, Beijing, 100049, China; State Key Laboratory of Stem Cell and Reproductive Biology, Institute of Zoology, Chinese Academy of Sciences, 1 Beichen West Road, Chaoyang District, Beijing, 100101, China
Pengfei Wang: Computer Network Information Center, Chinese Academy of Sciences, CAS Informatization Plaza No. 2 Dong Sheng Nan Lu, Haidian District, Beijing, 100083, China; University of Chinese Academy of Sciences, No. 19A Yuquan Road, Shijingshan District, Beijing, 100049, China
Yuanchun Zhou: Computer Network Information Center, Chinese Academy of Sciences, CAS Informatization Plaza No. 2 Dong Sheng Nan Lu, Haidian District, Beijing, 100083, China; University of Chinese Academy of Sciences, No. 19A Yuquan Road, Shijingshan District, Beijing, 100049, China

Wu S, Jin K, Tang M, Xia Y, Gao W. Inference of Gene Regulatory Networks Based on Multi-view Hierarchical Hypergraphs. Interdiscip Sci 2024;16:318-332. [PMID: 38342857 DOI: 10.1007/s12539-024-00604-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/16/2023] [Revised: 11/26/2023] [Accepted: 01/03/2024] [Indexed: 02/13/2024]

Wang Y, Chen X, Zheng Z, Huang L, Xie W, Wang F, Zhang Z, Wong KC. scGREAT: Transformer-based deep-language model for gene regulatory network inference from single-cell transcriptomics. iScience 2024;27:109352. [PMID: 38510148 PMCID: PMC10951644 DOI: 10.1016/j.isci.2024.109352] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/01/2023] [Revised: 12/29/2023] [Accepted: 02/23/2024] [Indexed: 03/22/2024] Open

Mousavi R, Lobo D. Automatic design of gene regulatory mechanisms for spatial pattern formation. NPJ Syst Biol Appl 2024;10:35. [PMID: 38565850 PMCID: PMC10987498 DOI: 10.1038/s41540-024-00361-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/21/2023] [Accepted: 03/19/2024] [Indexed: 04/04/2024] Open

Huang Y, Yu G, Yang Y. MIGGRI: A multi-instance graph neural network model for inferring gene regulatory networks for Drosophila from spatial expression images. PLoS Comput Biol 2023;19:e1011623. [PMID: 37939200 PMCID: PMC10659162 DOI: 10.1371/journal.pcbi.1011623] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/02/2023] [Revised: 11/20/2023] [Accepted: 10/22/2023] [Indexed: 11/10/2023] Open

Abstract

Recent breakthroughs in spatial transcriptomics have brought great opportunities for exploring gene regulatory networks (GRNs) from a brand-new perspective. In particular, the local expression patterns and spatio-temporal regulation mechanisms captured by spatial expression images allow a finer delineation of the interplay between transcription factors and their target genes. However, the complexity and size of spatial image collections pose significant challenges to GRN inference using image-based methods. Extracting regulatory information from expression images is difficult due to the lack of supervision and the multi-instance nature of the problem, where a gene often corresponds to multiple images captured from different views. While graph models, particularly graph neural networks, have emerged as a promising method for leveraging underlying structure information from known GRNs, incorporating expression images into graphs is not straightforward. To address these challenges, we propose a two-stage approach, MIGGRI, for capturing comprehensive regulatory patterns from image collections for each gene and known interactions. Our approach involves a multi-instance graph neural network (GNN) model for GRN inference, which first extracts gene regulatory features from spatial expression images via contrastive learning, and then feeds them to a multi-instance GNN for semi-supervised learning. We apply our approach to a large set of Drosophila embryonic spatial gene expression images. MIGGRI achieves outstanding performance in the inference of GRNs for early eye development and mesoderm development of Drosophila, and shows robustness in the scenarios of missing image information. Additionally, we perform interpretable analysis on image reconstruction and functional subgraphs that may reveal potential pathways or coordinate regulations. By leveraging the power of graph neural networks and the information contained in spatial expression images, our approach has the potential to advance our understanding of gene regulation in complex biological systems.

Wu Y, Qian B, Wang A, Dong H, Zhu E, Ma B. iLSGRN: inference of large-scale gene regulatory networks based on multi-model fusion. Bioinformatics 2023;39:btad619. [PMID: 37851379 PMCID: PMC10589915 DOI: 10.1093/bioinformatics/btad619] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/17/2023] [Revised: 10/04/2023] [Accepted: 10/17/2023] [Indexed: 10/19/2023] Open

Mousavi R, Lobo D. Automatic design of gene regulatory mechanisms for spatial pattern formation. bioRxiv 2023:2023.07.26.550573. [PMID: 37546866 PMCID: PMC10402059 DOI: 10.1101/2023.07.26.550573] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/08/2023]

Wu YH, Huang YA, Li JQ, You ZH, Hu PW, Hu L, Leung VCM, Du ZH. Knowledge graph embedding for profiling the interaction between transcription factors and their target genes. PLoS Comput Biol 2023;19:e1011207. [PMID: 37339154 PMCID: PMC10313080 DOI: 10.1371/journal.pcbi.1011207] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/28/2022] [Revised: 06/30/2023] [Accepted: 05/23/2023] [Indexed: 06/22/2023] Open

Fang Z, Ford AJ, Hu T, Zhang N, Mantalaris A, Coskun AF. Subcellular spatially resolved gene neighborhood networks in single cells. Cell Reports Methods 2023;3:100476. [PMID: 37323566 PMCID: PMC10261906 DOI: 10.1016/j.crmeth.2023.100476] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 07/30/2022] [Revised: 02/18/2023] [Accepted: 04/18/2023] [Indexed: 06/17/2023]

A survey on gene expression data analysis using deep learning methods for cancer diagnosis. Progress in Biophysics and Molecular Biology 2023;177:1-13. [PMID: 35988771 DOI: 10.1016/j.pbiomolbio.2022.08.004] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 05/27/2022] [Revised: 08/09/2022] [Accepted: 08/12/2022] [Indexed: 02/07/2023]

Inference of gene regulatory networks based on the Light Gradient Boosting Machine. Comput Biol Chem 2022;101:107769. [DOI: 10.1016/j.compbiolchem.2022.107769] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/22/2022] [Revised: 08/12/2022] [Accepted: 09/06/2022] [Indexed: 11/23/2022]

Lei J, Cai Z, He X, Zheng W, Liu J. An approach of gene regulatory network construction using mixed entropy optimizing context-related likelihood mutual information. Bioinformatics 2022;39:6808612. [PMID: 36342190 PMCID: PMC9805593 DOI: 10.1093/bioinformatics/btac717] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/25/2022] [Revised: 09/18/2022] [Accepted: 11/04/2022] [Indexed: 11/09/2022] Open

Abstract

MOTIVATION

The question of how to construct gene regulatory networks has long been a focus of biological research. Mutual information can be used to measure nonlinear relationships, and it has been widely used in the construction of gene regulatory networks. However, this method cannot measure indirect regulatory relationships under the influence of multiple genes, which reduces the accuracy of inferring gene regulatory networks.

APPROACH

This work proposes a method for constructing gene regulatory networks based on mixed entropy optimizing context-related likelihood mutual information (MEOMI). First, two entropy estimators were combined to calculate the mutual information between genes. Then, distribution optimization was performed using a context-related likelihood algorithm to eliminate some indirect regulatory relationships and obtain the initial gene regulatory network. To obtain the complex interaction between genes and eliminate redundant edges in the network, the initial gene regulatory network was further optimized by calculating the conditional mutual inclusive information (CMI2) between gene pairs under the influence of multiple genes. The network was iteratively updated to reduce the impact of mutual information on the overestimation of the direct regulatory intensity.

RESULTS

The experimental results show that the MEOMI method performed better than several other kinds of gene network construction methods on DREAM challenge simulated datasets (DREAM3 and DREAM5), three real Escherichia coli datasets (E. coli SOS pathway network, E. coli SOS DNA repair network and E. coli community network) and two human datasets.

AVAILABILITY AND IMPLEMENTATION

Source code and dataset are available at https://github.com/Dalei-Dalei/MEOMI/ and http://122.205.95.139/MEOMI/.

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.

Tan H, Qiu S, Wang J, Yu G, Guo W, Guo M. Weighted deep factorizing heterogeneous molecular network for genome-phenome association prediction. Methods 2022;205:18-28. [PMID: 35690250 DOI: 10.1016/j.ymeth.2022.05.008] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 03/20/2022] [Revised: 05/14/2022] [Accepted: 05/26/2022] [Indexed: 11/18/2022] Open

Abstract

Genome-phenome association (GPA) prediction can promote the understanding of the biological mechanisms underlying the complex pathology of phenotypes (i.e., traits and diseases). Traditional heterogeneous network-based GPA approaches overwhelmingly need to project heterogeneous data onto a homogeneous network for data fusion and prediction; such projections result in the loss of heterogeneous network structure information. Matrix factorization-based data fusion methods can avoid such projection by integrating multi-type data in a coherent way, but they typically perform linear factorization and cannot mine the nonlinear relationships between molecules, which compromises the accuracy of GPA analysis. Furthermore, most of them cannot selectively combine network topology and node attribute information in a principled way. In this paper, we propose a weighted deep matrix factorization-based solution (WDGPA) to predict GPAs by selectively and differentially fusing a heterogeneous molecular network and diverse attributes of nodes. WDGPA first assigns weights to inter/intra-relational data matrices and attribute data matrices, and performs deep matrix factorization on these matrices of the heterogeneous network in a cooperative manner to obtain the nonlinear representations of different nodes. In addition, it performs low-rank representation learning on the attribute data with the shared nonlinear representations. In this way, both the network topology and node attributes are jointly mined to explore the representations of molecules and complex interplays between molecules and phenotypes. WDGPA then uses the representational vectors of gene and phenotype nodes to predict GPAs. Experimental results on maize and human datasets confirm that WDGPA outperforms competitive methods by a large margin under different evaluation protocols.

Du ZH, Wu YH, Huang YA, Chen J, Pan GQ, Hu L, You ZH, Li JQ. GraphTGI: an attention-based graph embedding model for predicting TF-target gene interactions. Brief Bioinform 2022;23:6576453. [PMID: 35511108 DOI: 10.1093/bib/bbac148] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 12/28/2021] [Revised: 03/25/2022] [Accepted: 03/31/2022] [Indexed: 12/26/2022] Open

Huang YA, Pan GQ, Wang J, Li JQ, Chen J, Wu YH. Heterogeneous graph embedding model for predicting interactions between TF and target gene. Bioinformatics 2022;38:2554-2560. [PMID: 35266510 DOI: 10.1093/bioinformatics/btac148] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 12/09/2021] [Revised: 02/13/2022] [Accepted: 03/09/2022] [Indexed: 11/15/2022] Open

Li X, Ma S, Liu J, Tang J, Guo F. Inferring gene regulatory network via fusing gene expression image and RNA-seq data. Bioinformatics 2022;38:1716-1723. [PMID: 34999771 DOI: 10.1093/bioinformatics/btac008] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 09/19/2021] [Revised: 12/09/2021] [Accepted: 01/04/2022] [Indexed: 02/04/2023] Open

Kazempour A, Kazempoor R. The effect of Lacticaseibacillus casei on inflammatory cytokine (IL-8) gene expression induced by exposure to Shigella sonnei in Zebrafish (Danio rerio). Arq Bras Med Vet Zoo 2022. [DOI: 10.1590/1678-4162-12513] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/22/2022] Open

Inference on the structure of gene regulatory networks. J Theor Biol 2022;539:111055. [PMID: 35150721 DOI: 10.1016/j.jtbi.2022.111055] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 12/07/2021] [Revised: 01/29/2022] [Accepted: 02/03/2022] [Indexed: 11/20/2022]

Zhao M, He W, Tang J, Zou Q, Guo F. A hybrid deep learning framework for gene regulatory network inference from single-cell transcriptomic data. Brief Bioinform 2022;23:6513730. [DOI: 10.1093/bib/bbab568] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 09/28/2021] [Revised: 12/09/2021] [Accepted: 12/11/2021] [Indexed: 12/21/2022] Open

Abstract

Inferring gene regulatory networks (GRNs) based on gene expression profiles can provide insight into a number of cellular phenotypes at the genomic level and reveal the essential laws underlying various life phenomena. Unlike bulk expression data, single-cell transcriptomic data embody cell-to-cell variance and diverse biological information, such as tissue characteristics, transformation of cell types, etc. Inferring GRNs based on such data offers unprecedented advantages for making a profound study of cell phenotypes, revealing gene functions and exploring potential interactions. However, the high sparsity, noise and dropout events of single-cell transcriptomic data pose new challenges for regulation identification. We develop a hybrid deep learning framework for GRN inference from single-cell transcriptomic data, DGRNS, which encodes the raw data and fuses recurrent neural network and convolutional neural network (CNN) to train a model capable of distinguishing related gene pairs from unrelated gene pairs. To overcome the limitations of such datasets, it applies sliding windows to extract valuable features while preserving the direction of regulation. DGRNS is constructed as a deep learning model containing a gated recurrent unit network for exploring time-dependent information and a CNN for learning spatially related information. Our comprehensive and detailed comparative analysis on the dataset of mouse hematopoietic stem cells illustrates that DGRNS outperforms state-of-the-art methods. For the networks inferred by DGRNS, the area under the receiver operating characteristic curve is about 16% higher than that of other unsupervised methods, and the area under the precision-recall curve is about 10% higher than that of other supervised methods. Experiments on human datasets show the strong robustness and excellent generalization of DGRNS. By comparing the predictions with the standard network, we discover a series of novel interactions which are proved to be true in some specific cell types. Importantly, DGRNS identifies a series of regulatory relationships with high confidence and functional consistency, which have not yet been experimentally confirmed and merit further research.

Grisanti Canozo FJ, Zuo Z, Martin JF, Samee MAH. Cell-type modeling in spatial transcriptomics data elucidates spatially variable colocalization and communication between cell-types in mouse brain. Cell Syst 2022;13:58-70.e5. [PMID: 34626538 PMCID: PMC8776574 DOI: 10.1016/j.cels.2021.09.004] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 03/31/2021] [Revised: 08/06/2021] [Accepted: 09/10/2021] [Indexed: 01/21/2023]

Monti M, Fiorentino J, Milanetti E, Gosti G, Tartaglia GG. Prediction of Time Series Gene Expression and Structural Analysis of Gene Regulatory Networks Using Recurrent Neural Networks. Entropy (Basel, Switzerland) 2022;24:141. [PMID: 35205437 PMCID: PMC8871363 DOI: 10.3390/e24020141] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/13/2021] [Revised: 01/14/2022] [Accepted: 01/15/2022] [Indexed: 11/17/2022]

Zheng L, Liu Z, Yang Y, Shen HB. Accurate inference of gene regulatory interactions from spatial gene expression with deep contrastive learning. Bioinformatics 2022;38:746-753. [PMID: 34664632 DOI: 10.1093/bioinformatics/btab718] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 07/18/2021] [Revised: 09/19/2021] [Accepted: 10/15/2021] [Indexed: 02/03/2023] Open

Krishnakumar R, Ruffing AM. OperonSEQer: A set of machine-learning algorithms with threshold voting for detection of operon pairs using short-read RNA-sequencing data. PLoS Comput Biol 2022;18:e1009731. [PMID: 34986143 PMCID: PMC8765615 DOI: 10.1371/journal.pcbi.1009731] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 07/29/2021] [Revised: 01/18/2022] [Accepted: 12/07/2021] [Indexed: 11/19/2022] Open

Abstract

Operon prediction in prokaryotes is critical not only for understanding the regulation of endogenous gene expression, but also for exogenous targeting of genes using newly developed tools such as CRISPR-based gene modulation. A number of methods have used transcriptomics data to predict operons, based on the premise that contiguous genes in an operon will be expressed at similar levels. While promising results have been observed using these methods, most of them do not address uncertainty caused by technical variability between experiments, which is especially relevant when the amount of data available is small. In addition, many existing methods do not provide the flexibility to determine the stringency with which genes should be evaluated for being in an operon pair. We present OperonSEQer, a set of machine learning algorithms that uses the statistic and p-value from a non-parametric analysis of variance test (Kruskal-Wallis) to determine the likelihood that two adjacent genes are expressed from the same RNA molecule. We implement a voting system to allow users to choose the stringency of operon calls depending on whether their priority is high recall or high specificity. In addition, we provide the code so that users can retrain the algorithm and re-establish hyperparameters based on any data they choose, allowing for this method to be expanded as additional data is generated. We show that our approach detects operon pairs that are missed by current methods by comparing our predictions to publicly available long-read sequencing data. OperonSEQer therefore improves on existing methods in terms of accuracy, flexibility, and adaptability.

Bacteria and archaea, single-cell organisms collectively known as prokaryotes, live in all imaginable environments and comprise the majority of living organisms on this planet. Prokaryotes play a critical role in the homeostasis of multicellular organisms (such as animals and plants) and ecosystems. In addition, bacteria can be pathogenic and cause a variety of diseases in these same hosts and ecosystems. In short, understanding the biology and molecular functions of bacteria and archaea and devising mechanisms to engineer and optimize their properties are critical scientific endeavors with significant implications in healthcare, agriculture, manufacturing, and climate science among others. One major molecular difference between unicellular and multicellular organisms is the way they express genes: multicellular organisms make individual RNA molecules for each gene, while prokaryotes express operons (i.e., a group of genes coding functionally related proteins) in contiguous polycistronic RNA molecules. Understanding which genes exist within operons is critical for elucidating basic biology and for engineering organisms. In this work, we use a combination of statistical and machine learning-based methods to use next-generation sequencing data to predict operon structure across a range of prokaryotes. Our method provides an easily implemented, robust, accurate, and flexible way to determine operon structure in an organism-agnostic manner using readily available data.

Farahmand S, Fernandez AI, Ahmed FS, Rimm DL, Chuang JH, Reisenbichler E, Zarringhalam K. Deep learning trained on hematoxylin and eosin tumor region of interest predicts HER2 status and trastuzumab treatment response in HER2+ breast cancer. Mod Pathol 2022;35:44-51. [PMID: 34493825 DOI: 10.1038/s41379-021-00911-w] [Citation(s) in RCA: 55] [Impact Index Per Article: 27.5] [Received: 06/12/2021] [Revised: 08/13/2021] [Accepted: 08/13/2021] [Indexed: 12/19/2022]

Huminiecki Ł. Virtual Gene Concept and a Corresponding Pragmatic Research Program in Genetical Data Science. Entropy (Basel, Switzerland) 2021;24:17. [PMID: 35052043 PMCID: PMC8774939 DOI: 10.3390/e24010017] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/30/2021] [Revised: 12/02/2021] [Accepted: 12/14/2021] [Indexed: 06/14/2023]

Novakovsky G, Saraswat M, Fornes O, Mostafavi S, Wasserman WW. Biologically relevant transfer learning improves transcription factor binding prediction. Genome Biol 2021;22:280. [PMID: 34579793 PMCID: PMC8474956 DOI: 10.1186/s13059-021-02499-5] [Citation(s) in RCA: 16] [Impact Index Per Article: 5.3] [Received: 12/21/2020] [Accepted: 09/15/2021] [Indexed: 12/27/2022] Open

Lee JY, Nguyen B, Orosco C, Styczynski MP. SCOUR: a stepwise machine learning framework for predicting metabolite-dependent regulatory interactions. BMC Bioinformatics 2021;22:365. [PMID: 34238207 PMCID: PMC8268592 DOI: 10.1186/s12859-021-04281-7] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 03/08/2021] [Accepted: 06/30/2021] [Indexed: 11/22/2022] Open

Abstract

BACKGROUND

The topology of metabolic networks is both well-studied and remarkably well-conserved across many species. The regulation of these networks, however, is much more poorly characterized, though it is known to be divergent across organisms; together, these two characteristics make it difficult to model metabolic networks accurately. While many computational methods have been built to unravel transcriptional regulation, there have been few approaches developed for systems-scale analysis and study of metabolic regulation. Here, we present a stepwise machine learning framework that applies established algorithms to identify regulatory interactions in metabolic systems based on metabolic data: stepwise classification of unknown regulation, or SCOUR.

RESULTS

We evaluated our framework on both noiseless and noisy data, using several models of varying sizes and topologies to show that our approach is generalizable. We found that, when testing on data under the most realistic conditions (low sampling frequency and high noise), SCOUR could identify reaction fluxes controlled only by the concentration of a single metabolite (its primary substrate) with high accuracy. The positive predictive value (PPV) for identifying reactions controlled by the concentration of two metabolites ranged from 32 to 88% for noiseless data, 9.2 to 49% for either low sampling frequency/low noise or high sampling frequency/high noise data, and 6.6 to 27% for low sampling frequency/high noise data, with results typically sufficiently high for lab validation to be a practical endeavor. While the PPVs for reactions controlled by three metabolites were lower, they were still in most cases significantly better than random classification.

CONCLUSIONS

SCOUR uses a novel approach to synthetically generate the training data needed to identify regulators of reaction fluxes in a given metabolic system, enabling metabolomics and fluxomics data to be leveraged for regulatory structure inference. By identifying and triaging the most likely candidate regulatory interactions, SCOUR can drastically reduce the amount of time needed to identify and experimentally validate metabolic regulatory interactions. As high-throughput experimental methods for testing these interactions are further developed, SCOUR will provide critical impact in the development of predictive metabolic models in new organisms and pathways.

Westerman EL, Bowman SEJ, Davidson B, Davis MC, Larson ER, Sanford CPJ. Deploying Big Data to Crack the Genotype to Phenotype Code. Integr Comp Biol 2021;60:385-396. [PMID: 32492136 DOI: 10.1093/icb/icaa055] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 12/25/2022] Open

Zhang M, Sheffield T, Zhan X, Li Q, Yang DM, Wang Y, Wang S, Xie Y, Wang T, Xiao G. Spatial molecular profiling: platforms, applications and analysis tools. Brief Bioinform 2021;22:bbaa145. [PMID: 32770205 PMCID: PMC8138878 DOI: 10.1093/bib/bbaa145] [Citation(s) in RCA: 21] [Impact Index Per Article: 7.0] [Received: 03/23/2020] [Revised: 05/26/2020] [Accepted: 06/09/2020] [Indexed: 12/24/2022] Open

Sajid M, Channakesavula CN, Stone SR, Kaur P. Synthetic Biology towards Improved Flavonoid Pharmacokinetics. Biomolecules 2021;11:biom11050754. [PMID: 34069975 PMCID: PMC8157843 DOI: 10.3390/biom11050754] [Citation(s) in RCA: 23] [Impact Index Per Article: 7.7] [Received: 04/16/2021] [Revised: 05/13/2021] [Accepted: 05/17/2021] [Indexed: 12/14/2022] Open

Mousavi R, Konuru SH, Lobo D. Inference of dynamic spatial GRN models with multi-GPU evolutionary computation. Brief Bioinform 2021;22:6217729. [PMID: 33834216 DOI: 10.1093/bib/bbab104] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 12/10/2020] [Revised: 02/15/2021] [Accepted: 03/09/2021] [Indexed: 02/06/2023] Open

Zhao M, He W, Tang J, Zou Q, Guo F. A comprehensive overview and critical evaluation of gene regulatory network inference technologies. Brief Bioinform 2021;22:6128842. [PMID: 33539514 DOI: 10.1093/bib/bbab009] [Citation(s) in RCA: 27] [Impact Index Per Article: 9.0] [Received: 09/28/2020] [Revised: 12/11/2020] [Accepted: 01/06/2021] [Indexed: 12/12/2022] Open

Kwon MS, Lee BT, Lee SY, Kim HU. Modeling regulatory networks using machine learning for systems metabolic engineering. Curr Opin Biotechnol 2020;65:163-170. [DOI: 10.1016/j.copbio.2020.02.014] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Received: 01/20/2020] [Revised: 02/23/2020] [Accepted: 02/26/2020] [Indexed: 12/18/2022]

Buono L, Martinez-Morales JR. Retina Development in Vertebrates: Systems Biology Approaches to Understanding Genetic Programs: On the Contribution of Next-Generation Sequencing Methods to the Characterization of the Regulatory Networks Controlling Vertebrate Eye Development. Bioessays 2020;42:e1900187. [PMID: 31997389 DOI: 10.1002/bies.201900187] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Received: 10/11/2019] [Revised: 01/16/2020] [Indexed: 12/18/2022]