1
Wen S, Lv X, Li P, Li J, Qin D. Analysis of cancer-associated fibroblasts in cervical cancer by single-cell RNA sequencing. Aging (Albany NY) 2023; 15:15340-15359. [PMID: 38157264 PMCID: PMC10781451 DOI: 10.18632/aging.205353] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/10/2023] [Accepted: 11/10/2023] [Indexed: 01/03/2024]
Abstract
OBJECTIVE Because scRNA-seq is an effective tool for studying tumor heterogeneity, this paper uses it to reveal differences in cervical cancer among patients at the single-cell level, focusing on the biological functions of cancer-associated fibroblasts (CAFs) in cervical cancer. The aim is to offer a new interpretation of the heterogeneity of the cervical cancer microenvironment, to explore the pathogenesis of cervical cancer in depth, and to pursue effective treatments. METHODS Three cervical cancer specimens were collected during clinical surgery for single-cell RNA sequencing. Cell suspensions were prepared from fresh cervical cancer tissue, and cDNA libraries were constructed and sequenced. The sequencing data were then analyzed bioinformatically, including dimensionality reduction and clustering of cells, identification of cell populations, pseudotime analysis, inferCNV, cell communication analysis, and identification of transcription factors. RESULTS A total of 9 cell types were identified: T cells, epithelial cells, smooth muscle cells, CAFs, endothelial cells, macrophages, B cells, lymphocytes, and plasma cells. CAFs were further divided into three subtypes, named type1, type2, and type3 cells. The key transcription factors obtained for the three subtypes were TCF21, ZC3H11A, and MYEF2. The study also revealed the communication relationships between CAFs and several other cell types and found that CAFs play an important role in the MK signaling pathway. CONCLUSIONS scRNA-seq enabled a deeper exploration of tumor heterogeneity in cervical cancer and provided further insight into the biological functions of CAFs in cervical cancer.
Affiliation(s)
- Shuang Wen
- Reproductive Center, The First Affiliated Hospital of Zhengzhou University, Zhengzhou, Henan, China
- Xuefeng Lv
- Department of Laboratory Medicine, The Third Affiliated Hospital of Zhengzhou University, Zhengzhou, Henan, China
- Pengxiang Li
- Henan Provincial Chest Hospital, Zhengzhou, Henan, China
- Jinpeng Li
- Department of Laboratory Medicine, The Third Affiliated Hospital of Zhengzhou University, Zhengzhou, Henan, China
- Dongchun Qin
- Department of Laboratory Medicine, The First Affiliated Hospital of Zhengzhou University, Zhengzhou, Henan, China
2
Jiang J, Lyu P, Li J, Huang S, Tao J, Blackshaw S, Qian J, Wang J. IReNA: Integrated regulatory network analysis of single-cell transcriptomes and chromatin accessibility profiles. iScience 2022; 25:105359. [PMID: 36325073 PMCID: PMC9619378 DOI: 10.1016/j.isci.2022.105359] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/29/2022] [Revised: 09/19/2022] [Accepted: 10/12/2022] [Indexed: 11/16/2022]
Abstract
Recently, single-cell RNA sequencing (scRNA-seq) and single-cell assay for transposase-accessible chromatin using sequencing (scATAC-seq) have been developed to separately measure transcriptomes and chromatin accessibility profiles at single-cell resolution. However, few methods can reliably integrate these data to perform regulatory network analysis. Here, we developed integrated regulatory network analysis (IReNA) for network inference through the integrated analysis of scRNA-seq and scATAC-seq data, network modularization, transcription factor enrichment, and construction of simplified intermodular regulatory networks. Using public datasets, we showed that integrated network analysis of scRNA-seq data with scATAC-seq data identifies known regulators more precisely than analysis of scRNA-seq data alone. Moreover, IReNA outperformed currently available methods in identifying known regulators. IReNA facilitates the systems-level understanding of biological regulatory mechanisms and is available at https://github.com/jiang-junyao/IReNA.
Affiliation(s)
- Junyao Jiang
- CAS Key Laboratory of Regenerative Biology, Guangdong Provincial Key Laboratory of Biocomputing, Guangzhou Institutes of Biomedicine and Health, Chinese Academy of Sciences, Guangzhou 510530, China
- Pin Lyu
- Department of Ophthalmology, Johns Hopkins University School of Medicine, Baltimore, MD 21205, USA
- Jinlian Li
- CAS Key Laboratory of Regenerative Biology, Guangdong Provincial Key Laboratory of Biocomputing, Guangzhou Institutes of Biomedicine and Health, Chinese Academy of Sciences, Guangzhou 510530, China
- Sunan Huang
- CAS Key Laboratory of Regenerative Biology, Guangdong Provincial Key Laboratory of Biocomputing, Guangzhou Institutes of Biomedicine and Health, Chinese Academy of Sciences, Guangzhou 510530, China
- Jiawang Tao
- CAS Key Laboratory of Regenerative Biology, Guangdong Provincial Key Laboratory of Biocomputing, Guangzhou Institutes of Biomedicine and Health, Chinese Academy of Sciences, Guangzhou 510530, China
- Seth Blackshaw
- Solomon H. Snyder Department of Neuroscience, Johns Hopkins University School of Medicine, Baltimore, MD 21205, USA
- Jiang Qian
- Department of Ophthalmology, Johns Hopkins University School of Medicine, Baltimore, MD 21205, USA
- Jie Wang
- CAS Key Laboratory of Regenerative Biology, Guangdong Provincial Key Laboratory of Biocomputing, Guangzhou Institutes of Biomedicine and Health, Chinese Academy of Sciences, Guangzhou 510530, China
- State Key Laboratory of Respiratory Disease, Guangzhou Institutes of Biomedicine and Health, Chinese Academy of Sciences, Guangzhou 510530, China
- China-New Zealand Joint Laboratory on Biomedicine and Health, Guangzhou 510530, China
- Corresponding author
3
Wang C, Shi J, Cai J, Zhang Y, Zheng X, Zhang N. DriverRWH: discovering cancer driver genes by random walk on a gene mutation hypergraph. BMC Bioinformatics 2022; 23:277. [PMID: 35831792 PMCID: PMC9281118 DOI: 10.1186/s12859-022-04788-7] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 12/21/2021] [Accepted: 06/08/2022] [Indexed: 12/24/2022]
Abstract
Background Recent advances in next-generation sequencing technologies have helped investigators generate massive amounts of cancer genomic data. A critical challenge in cancer genomics is the identification of the few cancer driver genes whose mutations cause tumor growth. However, the majority of existing computational approaches underuse the co-occurrence mutation information of individuals, which is deemed important in tumorigenesis and tumor progression, resulting in a high rate of false positives. Results To make full use of co-mutation information, we present a random walk algorithm, referred to as DriverRWH, on a weighted gene mutation hypergraph model, using somatic mutation data and molecular interaction network data to prioritize candidate driver genes. Applied to tumor samples of different cancer types from The Cancer Genome Atlas, DriverRWH shows significantly better performance than state-of-the-art prioritization methods in terms of the area under the curve scores and the cumulative number of known driver genes recovered among top-ranked candidate genes. In addition, DriverRWH discovers several potential drivers, which are enriched in cancer-related pathways. DriverRWH recovers approximately 50% of known driver genes in the top 30 ranked candidate genes for more than half of the cancer types. DriverRWH is also highly robust to perturbations in the mutation data and gene functional network data. Conclusion DriverRWH is effective across various cancer types in prioritizing cancer driver genes and provides considerable improvement over other tools, with a better balance of precision and sensitivity. It can be a useful tool for detecting potential driver genes and can facilitate targeted cancer therapies. Supplementary Information The online version contains supplementary material available at 10.1186/s12859-022-04788-7.
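Note: the idea of a random walk over a gene mutation hypergraph can be sketched as follows. This is a minimal illustration under stated assumptions, not the DriverRWH algorithm itself: each tumor sample is treated as an unweighted hyperedge containing its mutated genes, the walker restarts to a uniform distribution, and the abstract's hyperedge weighting and interaction network data are omitted. The function name `hypergraph_rwr`, the parameter values, and the toy samples are hypothetical.

```python
from collections import defaultdict


def hypergraph_rwr(hyperedges, restart=0.3, tol=1e-10, max_iter=1000):
    """Random walk with restart on an unweighted hypergraph.

    hyperedges maps a sample id to the set of genes mutated in that sample.
    One step: from a gene, pick one of its samples uniformly at random,
    then pick a gene inside that sample uniformly at random.
    """
    genes = sorted({g for members in hyperedges.values() for g in members})
    incident = defaultdict(list)  # gene -> samples in which it is mutated
    for edge, members in hyperedges.items():
        for gene in members:
            incident[gene].append(edge)

    start = {g: 1.0 / len(genes) for g in genes}  # uniform restart vector
    score = dict(start)
    for _ in range(max_iter):
        nxt = {g: restart * start[g] for g in genes}
        for g in genes:
            # Probability mass leaving g, split evenly over its samples.
            share = (1.0 - restart) * score[g] / len(incident[g])
            for edge in incident[g]:
                members = hyperedges[edge]
                for h in members:
                    nxt[h] += share / len(members)
        delta = sum(abs(nxt[g] - score[g]) for g in genes)
        score = nxt
        if delta < tol:
            break
    return score


if __name__ == "__main__":
    # Toy data only: three hypothetical samples and their mutated genes.
    samples = {"s1": {"TP53", "KRAS"}, "s2": {"TP53", "PIK3CA"}, "s3": {"TP53"}}
    ranking = sorted(hypergraph_rwr(samples).items(), key=lambda kv: -kv[1])
    for gene, value in ranking:
        print(gene, round(value, 4))
```

Genes that co-occur with many others across samples accumulate more probability mass, which is the intuition behind ranking candidates by their stationary score.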
Affiliation(s)
- Chenye Wang
- School of Mathematics and Statistics, Shandong University, Weihai, 264209, China
- Junhan Shi
- School of Mathematics and Statistics, Shandong University, Weihai, 264209, China
- Jiansheng Cai
- Department of Mathematics, Weifang University, Weifang, 261061, Shandong, China
- Yusen Zhang
- School of Mathematics and Statistics, Shandong University, Weihai, 264209, China
- Xiaoqi Zheng
- Department of Mathematics, Shanghai Normal University, Shanghai, 200234, China
- Naiqian Zhang
- School of Mathematics and Statistics, Shandong University, Weihai, 264209, China.
4
Peng W, Tang Q, Dai W, Chen T. Improving cancer driver gene identification using multi-task learning on graph convolutional network. Brief Bioinform 2021; 23:6394994. [PMID: 34643232 DOI: 10.1093/bib/bbab432] [Citation(s) in RCA: 38] [Impact Index Per Article: 12.7] [Received: 08/13/2021] [Revised: 09/08/2021] [Accepted: 09/18/2021] [Indexed: 01/18/2023]
Abstract
Cancer is thought to be caused by the accumulation of driver genetic mutations. Therefore, identifying cancer driver genes plays a crucial role in understanding the molecular mechanism of cancer and developing precision therapies and biomarkers. In this work, we propose a Multi-Task learning method, called MTGCN, based on the Graph Convolutional Network to identify cancer driver genes. First, we augment gene features by introducing their features on the protein-protein interaction (PPI) network. After that, the multi-task learning framework propagates and aggregates node and graph features from the input to the next layer to learn node embedding features, simultaneously optimizing the node prediction task and the link prediction task. Finally, we use a Bayesian task weight learner to balance the two tasks automatically. The outputs of MTGCN assign each gene a probability of being a cancer driver gene. Our method and four existing methods are applied to predict cancer drivers for pan-cancer and some single cancer types. The experimental results show that our model shows outstanding performance compared with the state-of-the-art methods in terms of the area under the Receiver Operating Characteristic (ROC) curves and the area under the precision-recall curves. MTGCN is freely available via https://github.com/weiba/MTGCN.
Affiliation(s)
- Wei Peng
- Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming, Yunnan 650500, P. R. China.,Technology Application Key Lab of Yunnan Province, Kunming University of Science and Technology, Kunming, Yunnan 650500, P. R. China
- Qi Tang
- Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming, Yunnan 650500, P. R. China
- Wei Dai
- Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming, Yunnan 650500, P. R. China.,Technology Application Key Lab of Yunnan Province, Kunming University of Science and Technology, Kunming, Yunnan 650500, P. R. China
- Tielin Chen
- Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming, Yunnan 650500, P. R. China
5
Song J, Peng W, Wang F. An Entropy-Based Method for Identifying Mutual Exclusive Driver Genes in Cancer. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2020; 17:758-768. [PMID: 30763245 DOI: 10.1109/tcbb.2019.2897931] [Citation(s) in RCA: 21] [Impact Index Per Article: 5.3] [Indexed: 06/09/2023]
Abstract
Cancer is in essence a complex disease of genomic alteration, caused by somatic mutations acquired during a lifetime. According to previous research, the first step toward overcoming cancer is to identify the driver genes that promote carcinogenesis. However, extracting cancer-related driver genes precisely and efficiently remains a major challenge, because cancer is heterogeneous and there are vast numbers of irrelevant passenger mutations that have no functional impact on the cancer's development. In this work, we propose a novel entropy-based method, EntroRank, that identifies driver genes by integrating subcellular localization information and the mutual exclusivity of variation frequency into the network. EntroRank takes full account of different properties of driver genes. Considering the modularity of driver genes, the mutated genes in the network are first clustered into subgroups according to the compartments in which they are located. The structural entropy of a gene within its subgroup is then used to measure its indispensability. Considering the mutual exclusivity between driver genes within modules, relative entropy is used to measure the degree of mutual exclusivity between two mutated genes in terms of their variation frequency. We applied our method to three different cancers: lung, prostate, and breast cancer. The results show that our method not only detects well-known important drivers but also prioritizes rare unknown driver genes. In addition, EntroRank can identify driver genes that are mutually exclusive. Compared with other existing methods, our method achieves better performance for most cancer types in terms of Precision, Recall, and Fscore.
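Note: relative entropy (Kullback-Leibler divergence) as a rough signal of mutual exclusivity can be sketched as follows. The abstract does not specify how EntroRank builds the distributions it compares, so this sketch assumes each gene is represented by its mutation profile across the same ordered samples, normalized to a probability distribution. The function name `relative_entropy`, the smoothing constant, and the toy profiles are hypothetical.

```python
import math


def relative_entropy(p, q, eps=1e-9):
    """Kullback-Leibler divergence D(p || q) in bits.

    p and q are non-negative profiles over the same ordered samples.
    A small constant eps is added to every entry before normalizing,
    so that zero entries in q do not produce a division by zero.
    """
    if len(p) != len(q):
        raise ValueError("profiles must cover the same samples")
    p_s = [x + eps for x in p]
    q_s = [x + eps for x in q]
    zp, zq = sum(p_s), sum(q_s)
    return sum(
        (a / zp) * math.log2((a / zp) / (b / zq)) for a, b in zip(p_s, q_s)
    )


if __name__ == "__main__":
    # Toy binary mutation profiles over four hypothetical samples.
    gene_a = [1, 1, 0, 0]
    gene_b = [0, 0, 1, 1]  # never mutated in the same sample as gene_a
    gene_c = [1, 1, 1, 0]  # overlaps with gene_a in two samples
    print(relative_entropy(gene_a, gene_b))
    print(relative_entropy(gene_a, gene_c))
```

Under this representation, two genes mutated in disjoint sets of samples yield a much larger divergence than two genes whose mutated samples overlap. The measure is asymmetric, so a symmetric variant may be preferable in practice.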
6
Chen X, Xu C, Hong S, Xia X, Cao Y, McDermott J, Mu Y, Han JDJ. Immune Cell Types and Secreted Factors Contributing to Inflammation-to-Cancer Transition and Immune Therapy Response. Cell Rep 2020; 26:1965-1977.e4. [PMID: 30759403 DOI: 10.1016/j.celrep.2019.01.080] [Citation(s) in RCA: 25] [Impact Index Per Article: 6.3] [Received: 04/25/2018] [Revised: 09/17/2018] [Accepted: 01/22/2019] [Indexed: 12/23/2022]
Abstract
Although chronic inflammation increases many cancers' risk, how inflammation facilitates cancer development is still not well studied. Recognizing whether and when inflamed tissues transition to cancerous tissues is of utmost importance. To unbiasedly infer molecular events, immune cell types, and secreted factors contributing to the inflammation-to-cancer (I2C) transition, we develop a computational package called "SwitchDetector" based on liver, gastric, and colon cancer I2C data. Using it, we identify angiogenesis associated with a common critical transition stage for multiple I2C events. Furthermore, we infer infiltrated immune cell type composition and their secreted or suppressed extracellular proteins to predict expression of important transition stage genes. This identifies extracellular proteins that may serve as early-detection biomarkers for pre-cancer and early-cancer stages. They alone or together with I2C hallmark angiogenesis genes are significantly related to cancer prognosis and can predict immune therapy response. The SwitchDetector and I2C database are publicly available at www.inflammation2cancer.org.
Affiliation(s)
- Xingwei Chen
- Key Laboratory of Computational Biology, CAS Center for Excellence in Molecular Cell Science, Collaborative Innovation Center for Genetics and Developmental Biology, Chinese Academy of Sciences-Max Planck Partner Institute for Computational Biology, Shanghai Institute of Nutrition and Health, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, 320 Yue Yang Road, Shanghai 200031, China; University of Chinese Academy of Sciences, Beijing 100049, China
- Chi Xu
- Key Laboratory of Computational Biology, CAS Center for Excellence in Molecular Cell Science, Collaborative Innovation Center for Genetics and Developmental Biology, Chinese Academy of Sciences-Max Planck Partner Institute for Computational Biology, Shanghai Institute of Nutrition and Health, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, 320 Yue Yang Road, Shanghai 200031, China; University of Chinese Academy of Sciences, Beijing 100049, China
- Shengjun Hong
- Key Laboratory of Computational Biology, CAS Center for Excellence in Molecular Cell Science, Collaborative Innovation Center for Genetics and Developmental Biology, Chinese Academy of Sciences-Max Planck Partner Institute for Computational Biology, Shanghai Institute of Nutrition and Health, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, 320 Yue Yang Road, Shanghai 200031, China; University of Chinese Academy of Sciences, Beijing 100049, China
- Xian Xia
- Key Laboratory of Computational Biology, CAS Center for Excellence in Molecular Cell Science, Collaborative Innovation Center for Genetics and Developmental Biology, Chinese Academy of Sciences-Max Planck Partner Institute for Computational Biology, Shanghai Institute of Nutrition and Health, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, 320 Yue Yang Road, Shanghai 200031, China; University of Chinese Academy of Sciences, Beijing 100049, China
- Yaqiang Cao
- Key Laboratory of Computational Biology, CAS Center for Excellence in Molecular Cell Science, Collaborative Innovation Center for Genetics and Developmental Biology, Chinese Academy of Sciences-Max Planck Partner Institute for Computational Biology, Shanghai Institute of Nutrition and Health, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, 320 Yue Yang Road, Shanghai 200031, China; University of Chinese Academy of Sciences, Beijing 100049, China
- Joseph McDermott
- Key Laboratory of Computational Biology, CAS Center for Excellence in Molecular Cell Science, Collaborative Innovation Center for Genetics and Developmental Biology, Chinese Academy of Sciences-Max Planck Partner Institute for Computational Biology, Shanghai Institute of Nutrition and Health, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, 320 Yue Yang Road, Shanghai 200031, China
- Yonglin Mu
- Key Laboratory of Computational Biology, CAS Center for Excellence in Molecular Cell Science, Collaborative Innovation Center for Genetics and Developmental Biology, Chinese Academy of Sciences-Max Planck Partner Institute for Computational Biology, Shanghai Institute of Nutrition and Health, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, 320 Yue Yang Road, Shanghai 200031, China; University of Chinese Academy of Sciences, Beijing 100049, China
- Jing-Dong J Han
- Key Laboratory of Computational Biology, CAS Center for Excellence in Molecular Cell Science, Collaborative Innovation Center for Genetics and Developmental Biology, Chinese Academy of Sciences-Max Planck Partner Institute for Computational Biology, Shanghai Institute of Nutrition and Health, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, 320 Yue Yang Road, Shanghai 200031, China; University of Chinese Academy of Sciences, Beijing 100049, China.
7
Zhang W, Wang SL. A Novel Method for Identifying the Potential Cancer Driver Genes Based on Molecular Data Integration. Biochem Genet 2019; 58:16-39. [PMID: 31115714 DOI: 10.1007/s10528-019-09924-2] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Received: 01/06/2019] [Accepted: 05/02/2019] [Indexed: 12/17/2022]
Abstract
The identification of the cancer driver genes is essential for personalized therapy. The mutation frequency of most driver genes is in the middle (2-20%) or even lower range, which makes it difficult to find the driver genes with low-frequency mutations. Other forms of genomic aberrations, such as copy number variations (CNVs) and epigenetic changes, may also reflect cancer progression. In this work, a method for identifying the potential cancer driver genes (iPDG) based on molecular data integration is proposed. DNA copy number variation, somatic mutation, and gene expression data of matched cancer samples are integrated. In combination with the method of iKEEG, the "key genes" of cancer are identified, and the change in their expression levels is used for auxiliary evaluation of whether the mutated genes are potential drivers. For a mutated gene, the concept of mutational effect is defined, which takes into account the effects of copy number variation, mutation gene itself, and its neighbor genes. The method mainly includes two steps: the first step is data preprocessing. First, DNA copy number variation and somatic mutation data are integrated. Then, the integrated data are mapped to a given interaction network, and the diffusion kernel is used to form the mutation effect matrix. The second step is to obtain the key genes by using the iKGGE method, and construct the connection matrix by means of the gene expression data of the key genes and mutation impact matrix of the mutated genes. Experiments on TCGA breast cancer and Glioblastoma multiforme datasets demonstrate that iPDG is effective not only to identify the known cancer driver genes but also to discover the rare potential driver genes. When measured by functional enrichment analysis, we find that these genes are clearly associated with these two types of cancers.
Affiliation(s)
- Wei Zhang
- College of Computer Science and Electronics Engineering, Hunan University, Changsha, 410082, Hunan, China
- Shu-Lin Wang
- College of Computer Science and Electronics Engineering, Hunan University, Changsha, 410082, Hunan, China.
8
Song J, Peng W, Wang F. A random walk-based method to identify driver genes by integrating the subcellular localization and variation frequency into bipartite graph. BMC Bioinformatics 2019; 20:238. [PMID: 31088372 PMCID: PMC6518800 DOI: 10.1186/s12859-019-2847-9] [Citation(s) in RCA: 26] [Impact Index Per Article: 5.2] [Received: 04/14/2019] [Accepted: 04/24/2019] [Indexed: 12/19/2022]
Abstract
BACKGROUND Cancer, a worldwide problem, is driven by genomic alterations. With the advent of high-throughput sequencing technology, huge amounts of genomic data are generated every second; these data offer valuable information about cancer but also pose a major challenge to investigators. The major characteristic of cancer is heterogeneity, and most alterations are thought to be passenger mutations that make no contribution to cancer progression. Hence, identifying the driver genes that confer a selective growth advantage on tumor cells from such vast and noisy data remains an urgent task. RESULTS Because previous network-based methods ignore some important biological properties of driver genes and gene interaction networks have low reliability, we propose a random walk method named Subdyquency that integrates subcellular localization, variation frequency, and interaction with other dysregulated genes to improve the accuracy of driver gene prediction. We applied our model to three different cancers: lung, prostate, and breast cancer. The results show that our model can not only identify well-known important driver genes but also prioritize rare unknown driver genes. In addition, compared with other existing methods, our method improves precision, recall, and F-score for most cancer types. CONCLUSIONS The final results imply that driver genes tend to have higher variation frequency and to affect more dysregulated genes in the common significant compartment. AVAILABILITY The source code can be obtained at https://github.com/weiba/Subdyquency.
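Note: the precision, recall, and F-score reported for ranked driver gene predictions can be computed as sketched below. This assumes the common top-k evaluation against a reference list of known drivers; the abstract does not state the cutoff or the reference set used, so the function name `top_k_metrics`, the value of k, and the gene lists are hypothetical.

```python
def top_k_metrics(ranked_genes, known_drivers, k):
    """Precision, recall, and F-score of the top-k ranked genes.

    ranked_genes is ordered from most to least likely driver;
    known_drivers is the reference set used as ground truth.
    """
    top = set(ranked_genes[:k])
    hits = len(top & set(known_drivers))
    precision = hits / len(top) if top else 0.0
    recall = hits / len(known_drivers) if known_drivers else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f_score = 2 * precision * recall / (precision + recall)
    return precision, recall, f_score


if __name__ == "__main__":
    # Toy ranking and reference set; GENE_X and GENE_Y are placeholders.
    ranking = ["TP53", "GENE_X", "KRAS", "GENE_Y"]
    reference = {"TP53", "KRAS", "EGFR"}
    print(top_k_metrics(ranking, reference, k=2))
```

Because precision and recall move in opposite directions as k grows, comparisons between methods are only meaningful at the same cutoff and against the same reference set.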
Affiliation(s)
- Junrong Song
- Faculty of Management and Economics/Computer center/Faculty of Information Engineering and Automation/Technology Application Key Lab of Yunnan Province, Kunming University of Science and Technology, Lianhua Road, 650050, Kunming, People's Republic of China
- Wei Peng
- Faculty of Management and Economics/Computer center/Faculty of Information Engineering and Automation/Technology Application Key Lab of Yunnan Province, Kunming University of Science and Technology, Lianhua Road, 650050, Kunming, People's Republic of China.
- Feng Wang
- Faculty of Management and Economics/Computer center/Faculty of Information Engineering and Automation/Technology Application Key Lab of Yunnan Province, Kunming University of Science and Technology, Lianhua Road, 650050, Kunming, People's Republic of China
9
Hu H, Liu JM, Hu Z, Jiang X, Yang X, Li J, Zhang Y, Yu H, Khaitovich P. Recently Evolved Tumor Suppressor Transcript TP73-AS1 Functions as Sponge of Human-Specific miR-941. Mol Biol Evol 2019; 35:1063-1077. [PMID: 29474580 PMCID: PMC5913670 DOI: 10.1093/molbev/msy022] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.6] [Indexed: 02/07/2023]
Abstract
MicroRNA (miRNA) sponges are vital components of posttranscriptional gene regulation. Yet, only a limited number of miRNA sponges have been identified. Here, we show that the recently evolved noncoding tumor suppressor transcript, antisense RNA to TP73 gene (TP73-AS1), functions as a natural sponge of the human-specific miRNA miR-941. Unusually, we find nine high-affinity miR-941 binding sites clustered within a 1 kb region of TP73-AS1, which forms the miR-941 sponge region. This sponge region displays increased sequence constraint only in humans, and its formation can be traced to the tandem expansion of a 71-nt-long sequence containing a single miR-941 binding site in Old World monkeys. We further confirm that TP73-AS1 functions as an efficient miR-941 sponge based on massive transcriptome data analyses, wound-healing assays, and Argonaute protein immunoprecipitation experiments conducted in cell lines. The expression of miR-941 and its sponge correlate inversely across multiple healthy and cancerous tissues, with miR-941 being highly expressed in tumors and preferentially repressing tumor suppressors. Thus, the TP73-AS1 and miR-941 duo represents an unusual case of the extremely rapid evolution of noncoding regulators controlling cell migration, proliferation, and tumorigenesis.
Affiliation(s)
- Haiyang Hu
- School of Life Science and Technology, China Pharmaceutical University, Nanjing, China.,CAS Key Laboratory of Computational Biology, CAS-MPG Partner Institute for Computational Biology, Shanghai, China
- Jian-Mei Liu
- State Key Laboratory of Natural Resource Conservation and Utilization in Yunnan and Center for Life Science, School of Life Sciences, Yunnan University, Kunming, China
- Zhenyu Hu
- State Key Laboratory of Natural Resource Conservation and Utilization in Yunnan and Center for Life Science, School of Life Sciences, Yunnan University, Kunming, China
- Xi Jiang
- CAS Key Laboratory of Computational Biology, CAS-MPG Partner Institute for Computational Biology, Shanghai, China
- Xiaode Yang
- CAS Key Laboratory of Computational Biology, CAS-MPG Partner Institute for Computational Biology, Shanghai, China.,University of Chinese Academy of Sciences, Beijing, China
- Jiangxia Li
- State Key Laboratory of Natural Resource Conservation and Utilization in Yunnan and Center for Life Science, School of Life Sciences, Yunnan University, Kunming, China
- Yao Zhang
- State Key Laboratory of Natural Resource Conservation and Utilization in Yunnan and Center for Life Science, School of Life Sciences, Yunnan University, Kunming, China
- Haijing Yu
- State Key Laboratory of Natural Resource Conservation and Utilization in Yunnan and Center for Life Science, School of Life Sciences, Yunnan University, Kunming, China
- Philipp Khaitovich
- Skolkovo Institute of Science and Technology, Skolkovo, Russia.,Center for Excellence in Animal Evolution and Genetics, Chinese Academy of Sciences, Kunming, China.,Comparative Biology Group, CAS-MPG Partner Institute for Computational Biology, Shanghai, China.,Max Planck Institute for Evolutionary Anthropology, Leipzig, Germany.,School of Life Science and Technology, ShanghaiTech University, Shanghai, China
10
Suo S, Zhu Q, Saadatpour A, Fei L, Guo G, Yuan GC. Revealing the Critical Regulators of Cell Identity in the Mouse Cell Atlas. Cell Rep 2018; 25:1436-1445.e3. [PMID: 30404000 PMCID: PMC6281296 DOI: 10.1016/j.celrep.2018.10.045] [Citation(s) in RCA: 167] [Impact Index Per Article: 27.8] [Received: 07/09/2018] [Revised: 09/06/2018] [Accepted: 10/11/2018] [Indexed: 12/19/2022]
Abstract
Recent progress in single-cell technologies has enabled the identification of all major cell types in mouse. However, for most cell types, the regulatory mechanism underlying their identity remains poorly understood. By computational analysis of the recently published mouse cell atlas data, we have identified 202 regulons whose activities are highly variable across different cell types, and more importantly, predicted a small set of essential regulators for each major cell type in mouse. Systematic validation by automated literature and data mining provides strong additional support for our predictions. Thus, these predictions serve as a valuable resource that would be useful for the broad biological community. Finally, we have built a user-friendly, interactive web portal to enable users to navigate this mouse cell network atlas.
Affiliation(s)
- Shengbao Suo
- Department of Biostatistics and Computational Biology, Dana-Farber Cancer Institute and Harvard T.H. Chan School of Public Health, Boston, MA 02215, USA
- Qian Zhu
- Department of Biostatistics and Computational Biology, Dana-Farber Cancer Institute and Harvard T.H. Chan School of Public Health, Boston, MA 02215, USA
- Assieh Saadatpour
- Department of Biostatistics and Computational Biology, Dana-Farber Cancer Institute and Harvard T.H. Chan School of Public Health, Boston, MA 02215, USA
- Lijiang Fei
- Center for Stem Cell and Regenerative Medicine, Zhejiang University School of Medicine, Hangzhou 310058, China
- Guoji Guo
- Center for Stem Cell and Regenerative Medicine, Zhejiang University School of Medicine, Hangzhou 310058, China
- Guo-Cheng Yuan
- Department of Biostatistics and Computational Biology, Dana-Farber Cancer Institute and Harvard T.H. Chan School of Public Health, Boston, MA 02215, USA.
11
Song M, Kang K, Young An J. Investigating drug-disease interactions in drug-symptom-disease triples via citation relations. J Assoc Inf Sci Technol 2018. [DOI: 10.1002/asi.24060] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Indexed: 01/27/2023]
Affiliation(s)
- Min Song
- Department of Library and Information Science; Yonsei University; Republic of Korea
| | - Keunyoung Kang
- Department of Library and Information Science; Yonsei University; Republic of Korea
| | - Ju Young An
- Department of Library and Information Science; Yonsei University; Republic of Korea
| |
|
12
|
Abstract
SIGNIFICANCE Reductionist studies have contributed greatly to our understanding of the basic biology of aging in recent years, but we still do not understand fundamental mechanisms for many identified drugs and pathways. Use of systems approaches will help us move forward in our understanding of aging. RECENT ADVANCES Recent work described here has illustrated the power of systems biology to inform our understanding of aging through the study of (i) diet restriction, (ii) neurodegenerative disease, and (iii) biomarkers of aging. CRITICAL ISSUES Although we do not understand all of the individual genes and pathways that affect aging, as we continue to uncover more of them, we have now also begun to synthesize existing data using systems-level approaches, often to great effect. The three examples noted here all benefit from computational approaches that were unknown a few years ago, and from biological insights gleaned from multiple model systems, from aging laboratories as well as many other areas of biology. FUTURE DIRECTIONS Many new technologies, such as single-cell sequencing, advances in epigenetics beyond the methylome (specifically, assay for transposase-accessible chromatin with high throughput sequencing), and multiomic network studies, will increase the reach of systems biologists. This suggests that approaches similar to those described here will continue to lead to striking findings, and to interventions that may allow us to delay some of the many age-associated diseases in humans, perhaps sooner than we expect. Antioxid. Redox Signal. 29, 973-984.
Affiliation(s)
| | - Daniel E L Promislow
- Department of Pathology, University of Washington, Seattle, Washington; Department of Biology, University of Washington, Seattle, Washington
| |
|
13
|
Xu C, Ai D, Shi D, Suo S, Chen X, Yan Y, Cao Y, Zhang R, Sun N, Chen W, McDermott J, Zhang S, Zeng Y, Han JDJ. Accurate Drug Repositioning through Non-tissue-Specific Core Signatures from Cancer Transcriptomes. Cell Rep 2018; 25:523-535.e5. [DOI: 10.1016/j.celrep.2018.09.031] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.0] [Received: 01/26/2018] [Revised: 08/03/2018] [Accepted: 09/10/2018] [Indexed: 10/28/2022] Open
|
14
|
Chen W, Liu Y, Zhu S, Chen G, Han JDJ. Inter-nucleosomal communication between histone modifications for nucleosome phasing. PLoS Comput Biol 2018; 14:e1006416. [PMID: 30188887 PMCID: PMC6126837 DOI: 10.1371/journal.pcbi.1006416] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 03/12/2018] [Accepted: 08/02/2018] [Indexed: 11/20/2022] Open
Abstract
Combinatorial effects of epigenetic modifications on transcription activity have been proposed as “histone codes”. However, it is unclear whether there also exist inter-nucleosomal communications among epigenetic modifications at single nucleosome level, and if so, what functional roles they play. Meanwhile, how clear nucleosome patterns, such as nucleosome phasing and depletion, are formed at functional regions remains an intriguing enigma. To address these questions, we developed a Bayesian network model for interactions among different histone modifications across neighboring nucleosomes, based on the framework of dynamic Bayesian network (DBN). From this model, we found that robust inter-nucleosomal interactions exist around transcription start sites (TSS), transcription termination sites (TTS) or around CTCF binding sites; and these inter-nucleosomal interactions are often involved in transcription regulation. In addition to these general principles, DBN also uncovered a novel specific epigenetic interaction between H2A.Z and H4K20me1 on neighboring nucleosomes, involved in nucleosome free region (NFR) and nucleosome phasing establishment or maintenance. The level of negative correlation between neighboring H2A.Z and H4K20me1 strongly correlates with the size of NFR and the strength of nucleosome phasing around TSS. Our study revealed inter-nucleosomal communications as important players in signal propagation, chromatin remodeling and transcription regulation.
Nucleosomes are the basic unit of chromatin organization. At a global level, they fold up to form chromatin fibers in higher order structure to control the activation/repression states of chromatin. At a local level, especially around transcription start sites (TSSs), nucleosomes play an important role in regulating gene expression by dynamically positioning to affect the recruitment of RNA polymerase II and transcription factors. In particular, around actively transcribed TSSs, nucleosomes are regularly positioned to form a typical pattern of nucleosome phasing. Because this suggests that the formation of nucleosome phasing is a synergistic behavior across the nucleosomes around TSSs, we hypothesize that communication, probably through propagation of histone modifications, exists between neighboring nucleosomes, since nucleosome functions are essentially due to histone modifications. Here, to address the question, we investigated the correlations of histone modifications across neighboring nucleosomes, and revealed a negative correlation between H2A.Z and H4K20me1 across neighboring nucleosomes. This extends the well-accepted knowledge that H2A.Z and H4K20me1 are positively correlated at genome-wide level. In addition, we revealed a probable contribution of H2A.Z-H4K20me1 anti-correlation to nucleosome phasing around active TSSs, thereby shedding light on the formation of nucleosome phasing.
Affiliation(s)
- Weizhong Chen
- Key Laboratory of Computational Biology, CAS Center for Excellence in Molecular Cell Science, Collaborative Innovation Center for Genetics and Developmental Biology, Chinese Academy of Sciences-Max Planck Partner Institute for Computational Biology, Shanghai Institutes for Biological Sciences, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai, China
| | - Yi Liu
- Key Laboratory of Computational Biology, CAS Center for Excellence in Molecular Cell Science, Collaborative Innovation Center for Genetics and Developmental Biology, Chinese Academy of Sciences-Max Planck Partner Institute for Computational Biology, Shanghai Institutes for Biological Sciences, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai, China
- Beijing Key Lab of Traffic Data Analysis and Mining, School of Computer and Information Technology, Beijing Jiaotong University, Beijing, China
| | - Shanshan Zhu
- Key Laboratory of Computational Biology, CAS Center for Excellence in Molecular Cell Science, Collaborative Innovation Center for Genetics and Developmental Biology, Chinese Academy of Sciences-Max Planck Partner Institute for Computational Biology, Shanghai Institutes for Biological Sciences, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai, China
| | - Guoyu Chen
- Key Laboratory of Computational Biology, CAS Center for Excellence in Molecular Cell Science, Collaborative Innovation Center for Genetics and Developmental Biology, Chinese Academy of Sciences-Max Planck Partner Institute for Computational Biology, Shanghai Institutes for Biological Sciences, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai, China
| | - Jing-Dong J. Han
- Key Laboratory of Computational Biology, CAS Center for Excellence in Molecular Cell Science, Collaborative Innovation Center for Genetics and Developmental Biology, Chinese Academy of Sciences-Max Planck Partner Institute for Computational Biology, Shanghai Institutes for Biological Sciences, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai, China
| |
|
15
|
Wang S, Ma J, Yu MK, Zheng F, Huang EW, Han J, Peng J, Ideker T. Annotating gene sets by mining large literature collections with protein networks. Pacific Symposium on Biocomputing 2018; 23:602-613. [PMID: 29218918 PMCID: PMC5806628] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/03/2022]
Abstract
Analysis of patient genomes and transcriptomes routinely recognizes new gene sets associated with human disease. Here we present an integrative natural language processing system which infers common functions for a gene set through automatic mining of the scientific literature with biological networks. This system links genes with associated literature phrases and combines these links with protein interactions in a single heterogeneous network. Multiscale functional annotations are inferred based on network distances between phrases and genes and then visualized as an ontology of biological concepts. To evaluate this system, we predict functions for gene sets representing known pathways and find that our approach achieves substantial improvement over the conventional text-mining baseline method. Moreover, our system discovers novel annotations for gene sets or pathways without previously known functions. Two case studies demonstrate how the system is used in discovery of new cancer-related pathways with ontological annotations.
Affiliation(s)
- Sheng Wang
- Department of Computer Science, University of Illinois at Urbana-Champaign, Urbana, IL, USA
| | - Jianzhu Ma
- School of Medicine, University of California San Diego, San Diego, CA, USA
| | - Michael Ku Yu
- School of Medicine, University of California San Diego, San Diego, CA, USA
| | - Fan Zheng
- School of Medicine, University of California San Diego, San Diego, CA, USA
| | - Edward W Huang
- Department of Computer Science, University of Illinois at Urbana-Champaign, Urbana, IL, USA
| | - Jiawei Han
- Department of Computer Science, University of Illinois at Urbana-Champaign, Urbana, IL, USA
| | - Jian Peng
- Department of Computer Science, University of Illinois at Urbana-Champaign, Urbana, IL, USA
| | - Trey Ideker
- School of Medicine, University of California San Diego, San Diego, CA, USA
| |
|
16
|
Sun N, Yu X, Li F, Liu D, Suo S, Chen W, Chen S, Song L, Green CD, McDermott J, Shen Q, Jing N, Han JDJ. Inference of differentiation time for single cell transcriptomes using cell population reference data. Nat Commun 2017; 8:1856. [PMID: 29187729 PMCID: PMC5707349 DOI: 10.1038/s41467-017-01860-2] [Citation(s) in RCA: 21] [Impact Index Per Article: 3.0] [Received: 06/22/2017] [Accepted: 10/18/2017] [Indexed: 12/17/2022] Open
Abstract
Single-cell RNA sequencing (scRNA-seq) is a powerful method for dissecting intercellular heterogeneity during development. Conventional trajectory analysis provides only a pseudotime of development, and often discards cell-cycle events as confounding factors. Here using matched cell population RNA-seq (cpRNA-seq) as a reference, we developed an “iCpSc” package for integrative analysis of cpRNA-seq and scRNA-seq data. By generating a computational model for reference “biological differentiation time” using cell population data and applying it to single-cell data, we unbiasedly associated cell-cycle checkpoints to the internal molecular timer of single cells. Through inferring a network flow from cpRNA-seq to scRNA-seq data, we predicted a role of M phase in controlling the speed of neural differentiation of mouse embryonic stem cells, and validated it through gene knockout (KO) experiments. By linking temporally matched cpRNA-seq and scRNA-seq data, our approach provides an effective and unbiased approach for identifying developmental trajectory and timing-related regulatory events. Single cell transcriptome data can be used to determine developmental lineage trajectories. Here the authors map single cell transcriptomes onto a differentiation trajectory defined by cell population transcriptomes and show that cell cycle regulators have a role in differentiation timing.
Affiliation(s)
- Na Sun
- Chinese Academy of Sciences Key Laboratory of Computational Biology, Chinese Academy of Sciences-Max Planck Partner Institute for Computational Biology, CAS Center for Excellence in Molecular Cell Science, Collaborative Innovation Center for Genetics and Developmental Biology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai, 200031, China
| | - Xiaoming Yu
- Chinese Academy of Sciences Key Laboratory of Computational Biology, Chinese Academy of Sciences-Max Planck Partner Institute for Computational Biology, CAS Center for Excellence in Molecular Cell Science, Collaborative Innovation Center for Genetics and Developmental Biology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai, 200031, China.,Tsinghua-Peking Center for Life Sciences, Tsinghua University, School of Medicine, Tsinghua University, Beijing, 100084, China
| | - Fang Li
- Chinese Academy of Sciences Key Laboratory of Computational Biology, Chinese Academy of Sciences-Max Planck Partner Institute for Computational Biology, CAS Center for Excellence in Molecular Cell Science, Collaborative Innovation Center for Genetics and Developmental Biology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai, 200031, China
| | - Denghui Liu
- Chinese Academy of Sciences Key Laboratory of Computational Biology, Chinese Academy of Sciences-Max Planck Partner Institute for Computational Biology, CAS Center for Excellence in Molecular Cell Science, Collaborative Innovation Center for Genetics and Developmental Biology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai, 200031, China
| | - Shengbao Suo
- Chinese Academy of Sciences Key Laboratory of Computational Biology, Chinese Academy of Sciences-Max Planck Partner Institute for Computational Biology, CAS Center for Excellence in Molecular Cell Science, Collaborative Innovation Center for Genetics and Developmental Biology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai, 200031, China
| | - Weiyang Chen
- Chinese Academy of Sciences Key Laboratory of Computational Biology, Chinese Academy of Sciences-Max Planck Partner Institute for Computational Biology, CAS Center for Excellence in Molecular Cell Science, Collaborative Innovation Center for Genetics and Developmental Biology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai, 200031, China
| | - Shirui Chen
- State Key Laboratory of Cell Biology, Institute of Biochemistry and Cell Biology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai, 200031, China
| | - Lu Song
- State Key Laboratory of Cell Biology, Institute of Biochemistry and Cell Biology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai, 200031, China
| | - Christopher D Green
- Chinese Academy of Sciences Key Laboratory of Computational Biology, Chinese Academy of Sciences-Max Planck Partner Institute for Computational Biology, CAS Center for Excellence in Molecular Cell Science, Collaborative Innovation Center for Genetics and Developmental Biology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai, 200031, China
| | - Joseph McDermott
- Chinese Academy of Sciences Key Laboratory of Computational Biology, Chinese Academy of Sciences-Max Planck Partner Institute for Computational Biology, CAS Center for Excellence in Molecular Cell Science, Collaborative Innovation Center for Genetics and Developmental Biology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai, 200031, China
| | - Qin Shen
- Tsinghua-Peking Center for Life Sciences, Tsinghua University, School of Medicine, Tsinghua University, Beijing, 100084, China
| | - Naihe Jing
- State Key Laboratory of Cell Biology, Institute of Biochemistry and Cell Biology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai, 200031, China
| | - Jing-Dong J Han
- Chinese Academy of Sciences Key Laboratory of Computational Biology, Chinese Academy of Sciences-Max Planck Partner Institute for Computational Biology, CAS Center for Excellence in Molecular Cell Science, Collaborative Innovation Center for Genetics and Developmental Biology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai, 200031, China.
| |
|
17
|
Fujii C, Kuwahara H, Yu G, Guo L, Gao X. Learning gene regulatory networks from gene expression data using weighted consensus. Neurocomputing 2017. [DOI: 10.1016/j.neucom.2016.02.087] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/21/2022]
|
18
|
Lam MPY, Venkatraman V, Xing Y, Lau E, Cao Q, Ng DCM, Su AI, Ge J, Van Eyk JE, Ping P. Data-Driven Approach To Determine Popular Proteins for Targeted Proteomics Translation of Six Organ Systems. J Proteome Res 2016; 15:4126-4134. [PMID: 27356587 DOI: 10.1021/acs.jproteome.6b00095] [Citation(s) in RCA: 39] [Impact Index Per Article: 4.9] [Indexed: 12/25/2022]
Abstract
Amidst the proteomes of human tissues lie subsets of proteins that are closely involved in conserved pathophysiological processes. Much of biomedical research concerns interrogating disease signature proteins and defining their roles in disease mechanisms. With advances in proteomics technologies, it is now feasible to develop targeted proteomics assays that can accurately quantify protein abundance as well as post-translational modifications; however, with a rapidly accumulating number of studies implicating proteins in diseases, current resources are insufficient to target every protein without judiciously prioritizing the proteins with high significance and impact for assay development. We describe here a data science method to prioritize and expedite assay development on high-impact proteins across research fields by leveraging the biomedical literature record to rank and normalize proteins that are popularly and preferentially published by biomedical researchers. We demonstrate this method by finding priority proteins across six major physiological systems (cardiovascular, cerebral, hepatic, renal, pulmonary, and intestinal). The described method is data-driven and builds upon the collective knowledge of previous publications referenced on PubMed to lend objectivity to target selection. The method and resulting popular protein lists may also be useful for exploring biological processes associated with various physiological systems and research topics, in addition to benefiting ongoing efforts to facilitate the broad translation of proteomics technologies.
Affiliation(s)
| | - Vidya Venkatraman
- Advanced Clinical Biosystems Research Institute, Department of Medicine and The Heart Institute, Cedars-Sinai Medical Center , Los Angeles, California 90048, United States
| | | | | | - Quan Cao
- Department of Cardiology, Shanghai Institute of Cardiovascular Diseases, Zhongshan Hospital, Fudan University , Shanghai, 200433, China
| | | | - Andrew I Su
- Department of Molecular and Experimental Medicine, The Scripps Research Institute , La Jolla, California 92037, United States
| | - Junbo Ge
- Department of Cardiology, Shanghai Institute of Cardiovascular Diseases, Zhongshan Hospital, Fudan University , Shanghai, 200433, China
| | - Jennifer E Van Eyk
- Advanced Clinical Biosystems Research Institute, Department of Medicine and The Heart Institute, Cedars-Sinai Medical Center , Los Angeles, California 90048, United States
| | | |
|
19
|
Hou L, Wang D, Chen D, Liu Y, Zhang Y, Cheng H, Xu C, Sun N, McDermott J, Mair WB, Han JDJ. A Systems Approach to Reverse Engineer Lifespan Extension by Dietary Restriction. Cell Metab 2016; 23:529-40. [PMID: 26959186 PMCID: PMC5110149 DOI: 10.1016/j.cmet.2016.02.002] [Citation(s) in RCA: 48] [Impact Index Per Article: 6.0] [Received: 07/21/2015] [Revised: 12/14/2015] [Accepted: 02/03/2016] [Indexed: 12/16/2022]
Abstract
Dietary restriction (DR) is the most powerful natural means to extend lifespan. Although several genes can mediate responses to alternate DR regimens, no single genetic intervention has recapitulated the full effects of DR, and no unified system is known for different DR regimens. Here we obtain temporally resolved transcriptomes during calorie restriction and intermittent fasting in Caenorhabditis elegans and find that early and late responses involve metabolism and cell cycle/DNA damage, respectively. We uncover three network modules of DR regulators by their target specificity. By genetic manipulations of nodes representing discrete modules, we induce transcriptomes that progressively resemble DR as multiple nodes are perturbed. Targeting all three nodes simultaneously results in extremely long-lived animals that are refractory to DR. These results and dynamic simulations demonstrate that extensive feedback controls among regulators may be leveraged to drive the regulatory circuitry to a younger steady state, recapitulating the full effect of DR.
Affiliation(s)
- Lei Hou
- Key Laboratory of Computational Biology, CAS Center for Excellence in Molecular Cell Science, Collaborative Innovation Center for Genetics and Developmental Biology, Chinese Academy of Sciences-Max Planck Partner Institute for Computational Biology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, 320 Yue Yang Road, Shanghai 200031, China
| | - Dan Wang
- Key Laboratory of Computational Biology, CAS Center for Excellence in Molecular Cell Science, Collaborative Innovation Center for Genetics and Developmental Biology, Chinese Academy of Sciences-Max Planck Partner Institute for Computational Biology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, 320 Yue Yang Road, Shanghai 200031, China; Graduate University of Chinese Academy of Sciences, Beijing 100049, China
| | - Di Chen
- State Key Laboratory of Pharmaceutical Biotechnology and MOE Key Laboratory of Model Animals for Disease Study, Model Animal Research Center, Nanjing University, Nanjing, Jiangsu 210061, China
| | - Yi Liu
- Key Laboratory of Computational Biology, CAS Center for Excellence in Molecular Cell Science, Collaborative Innovation Center for Genetics and Developmental Biology, Chinese Academy of Sciences-Max Planck Partner Institute for Computational Biology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, 320 Yue Yang Road, Shanghai 200031, China; Beijing Key Lab of Traffic Data Analysis and Mining, School of Computer and Information Technology, Beijing Jiaotong University, Beijing 100044, China
| | - Yue Zhang
- Harvard T.H. Chan School of Public Health, Boston, MA 02115, USA
| | - Hao Cheng
- Key Laboratory of Computational Biology, CAS Center for Excellence in Molecular Cell Science, Collaborative Innovation Center for Genetics and Developmental Biology, Chinese Academy of Sciences-Max Planck Partner Institute for Computational Biology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, 320 Yue Yang Road, Shanghai 200031, China; Graduate University of Chinese Academy of Sciences, Beijing 100049, China
| | - Chi Xu
- Key Laboratory of Computational Biology, CAS Center for Excellence in Molecular Cell Science, Collaborative Innovation Center for Genetics and Developmental Biology, Chinese Academy of Sciences-Max Planck Partner Institute for Computational Biology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, 320 Yue Yang Road, Shanghai 200031, China; Graduate University of Chinese Academy of Sciences, Beijing 100049, China
| | - Na Sun
- Key Laboratory of Computational Biology, CAS Center for Excellence in Molecular Cell Science, Collaborative Innovation Center for Genetics and Developmental Biology, Chinese Academy of Sciences-Max Planck Partner Institute for Computational Biology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, 320 Yue Yang Road, Shanghai 200031, China
| | - Joseph McDermott
- Key Laboratory of Computational Biology, CAS Center for Excellence in Molecular Cell Science, Collaborative Innovation Center for Genetics and Developmental Biology, Chinese Academy of Sciences-Max Planck Partner Institute for Computational Biology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, 320 Yue Yang Road, Shanghai 200031, China
| | - William B Mair
- Harvard T.H. Chan School of Public Health, Boston, MA 02115, USA
| | - Jing-Dong J Han
- Key Laboratory of Computational Biology, CAS Center for Excellence in Molecular Cell Science, Collaborative Innovation Center for Genetics and Developmental Biology, Chinese Academy of Sciences-Max Planck Partner Institute for Computational Biology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, 320 Yue Yang Road, Shanghai 200031, China.
| |
|
20
|
Shi K, Gao L, Wang B. Discovering potential cancer driver genes by an integrated network-based approach. Molecular BioSystems 2016; 12:2921-31. [DOI: 10.1039/c6mb00274a] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.8] [Indexed: 12/20/2022]
Abstract
An integrated network-based approach is proposed to nominate driver genes. It consists of two steps, a network diffusion step and an aggregated ranking step, which together fuse the correlation between gene mutations and gene expression, the relationships among mutated genes, and the heterogeneity of patient mutations.
Affiliation(s)
- Kai Shi
- School of Computer Science and Technology, Xidian University, Xi'an, China
- College of Science
| | - Lin Gao
- School of Computer Science and Technology, Xidian University, Xi'an, China
| | - Bingbo Wang
- School of Computer Science and Technology, Xidian University, Xi'an, China
| |
|
21
|
Cui X, Naveed H, Gao X. Finding optimal interaction interface alignments between biological complexes. Bioinformatics 2015; 31:i133-41. [PMID: 26072475 PMCID: PMC4765866 DOI: 10.1093/bioinformatics/btv242] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.6] [Indexed: 12/02/2022] Open
Abstract
Motivation: Biological molecules perform their functions through interactions with other molecules. Structure alignment of interaction interfaces between biological complexes is an indispensable step in detecting their structural similarities, which are keys to understanding their evolutionary histories and functions. Although various structure alignment methods have been developed to successfully assess the similarities of protein structures or certain types of interaction interfaces, existing alignment tools cannot directly align arbitrary types of interfaces formed by protein, DNA or RNA molecules. Specifically, they require a ‘blackbox preprocessing’ to standardize interface types and chain identifiers. Yet their performance is limited and sometimes unsatisfactory. Results: Here we introduce a novel method, PROSTA-inter, that automatically determines and aligns interaction interfaces between two arbitrary types of complex structures. Our method uses sequentially remote fragments to search for the optimal superimposition. The optimal residue matching problem is then formulated as a maximum weighted bipartite matching problem to detect the optimal sequence order-independent alignment. Benchmark evaluation on all non-redundant protein–DNA complexes in PDB shows significant performance improvement of our method over TM-align and iAlign (with the ‘blackbox preprocessing’). Two case studies where our method discovers, for the first time, structural similarities between two pairs of functionally related protein–DNA complexes are presented. We further demonstrate the power of our method on detecting structural similarities between a protein–protein complex and a protein–RNA complex, which is biologically known as a protein–RNA mimicry case. Availability and implementation: The PROSTA-inter web-server is publicly available at http://www.cbrc.kaust.edu.sa/prosta/. Contact: xin.gao@kaust.edu.sa
Affiliation(s)
- Xuefeng Cui
- Computer, Electrical and Mathematical Sciences and Engineering Division, King Abdullah University of Science and Technology (KAUST), Thuwal 23955-6900, Saudi Arabia
| | - Hammad Naveed
- Computer, Electrical and Mathematical Sciences and Engineering Division, King Abdullah University of Science and Technology (KAUST), Thuwal 23955-6900, Saudi Arabia
| | - Xin Gao
- Computer, Electrical and Mathematical Sciences and Engineering Division, King Abdullah University of Science and Technology (KAUST), Thuwal 23955-6900, Saudi Arabia
| |
|
22
|
Gray MM, Parmenter MD, Hogan CA, Ford I, Cuthbert RJ, Ryan PG, Broman KW, Payseur BA. Genetics of Rapid and Extreme Size Evolution in Island Mice. Genetics 2015; 201:213-28. [PMID: 26199233 PMCID: PMC4566264 DOI: 10.1534/genetics.115.177790] [Citation(s) in RCA: 31] [Impact Index Per Article: 3.4] [Received: 05/01/2015] [Accepted: 07/18/2015] [Indexed: 12/21/2022] Open
Abstract
Organisms on islands provide a revealing window into the process of adaptation. Populations that colonize islands often evolve substantial differences in body size from their mainland relatives. Although the ecological drivers of this phenomenon have received considerable attention, its genetic basis remains poorly understood. We use house mice (subspecies: Mus musculus domesticus) from remote Gough Island to provide a genetic portrait of rapid and extreme size evolution. In just a few hundred generations, Gough Island mice evolved the largest body size among wild house mice from around the world. Through comparisons with a smaller-bodied wild-derived strain from the same subspecies (WSB/EiJ), we demonstrate that Gough Island mice achieve their exceptional body weight primarily by growing faster during the 6 weeks after birth. We use genetic mapping in large F(2) intercrosses between Gough Island mice and WSB/EiJ to identify 19 quantitative trait loci (QTL) responsible for the evolution of 16-week weight trajectories: 8 QTL for body weight and 11 QTL for growth rate. QTL exhibit modest effects that are mostly additive. We conclude that body size evolution on islands can be genetically complex, even when substantial size changes occur rapidly. In comparisons to published studies of laboratory strains of mice that were artificially selected for divergent body sizes, we discover that the overall genetic profile of size evolution in nature and in the laboratory is similar, but many contributing loci are distinct. Our results underscore the power of genetically characterizing the entire growth trajectory in wild populations and lay the foundation necessary for identifying the mutations responsible for extreme body size evolution in nature.
Affiliation(s)
- Melissa M Gray
- Laboratory of Genetics, University of Wisconsin, Madison, Wisconsin 53706
- Caley A Hogan
- Laboratory of Genetics, University of Wisconsin, Madison, Wisconsin 53706
- Irene Ford
- Laboratory of Genetics, University of Wisconsin, Madison, Wisconsin 53706
- Richard J Cuthbert
- Royal Society for the Protection of Birds, The Lodge, Sandy, Bedfordshire, SG19 2DL, United Kingdom
- Peter G Ryan
- Percy FitzPatrick Institute of African Ornithology, DST-NRF Centre of Excellence, University of Cape Town, Rondebosch 7701, South Africa
- Karl W Broman
- Department of Biostatistics and Medical Informatics, University of Wisconsin, Madison, Wisconsin 53706
- Bret A Payseur
- Laboratory of Genetics, University of Wisconsin, Madison, Wisconsin 53706
|
23
|
Naveed H, Hameed US, Harrus D, Bourguet W, Arold ST, Gao X. An integrated structure- and system-based framework to identify new targets of metabolites and known drugs. Bioinformatics 2015; 31:3922-9. [PMID: 26286808 PMCID: PMC4673972 DOI: 10.1093/bioinformatics/btv477] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.7] [Received: 04/18/2015] [Accepted: 08/08/2015] [Indexed: 02/07/2023] Open
Abstract
Motivation: The inherent promiscuity of small molecules towards protein targets impedes our understanding of healthy versus diseased metabolism. This promiscuity also poses a challenge for the pharmaceutical industry, as identifying all protein targets is important for assessing (side) effects and repositioning opportunities for a drug. Results: Here, we present a novel integrated structure- and system-based approach of drug-target prediction (iDTP) to enable the large-scale discovery of new targets for small molecules, such as pharmaceutical drugs, co-factors and metabolites (collectively called ‘drugs’). For a given drug, our method uses sequence order–independent structure alignment, hierarchical clustering and probabilistic sequence similarity to construct a probabilistic pocket ensemble (PPE) that captures promiscuous structural features of different binding sites on known targets. A drug’s PPE is combined with an approximation of its delivery profile to reduce false positives. In our cross-validation study, we use iDTP to predict the known targets of 11 drugs, with 63% sensitivity and 81% specificity. We then predicted novel targets for these drugs; two that are of high pharmacological interest, the peroxisome proliferator-activated receptor gamma and the oncogene B-cell lymphoma 2, were successfully validated through in vitro binding experiments. Our method is broadly applicable for the prediction of protein-small molecule interactions with several novel applications to biological research and drug development. Availability and implementation: The program, datasets and results are freely available to academic users at http://sfb.kaust.edu.sa/Pages/Software.aspx. Contact: xin.gao@kaust.edu.sa and stefan.arold@kaust.edu.sa. Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Hammad Naveed
- Computer, Electrical and Mathematical Sciences and Engineering Division, Computational Bioscience Research Center
- Umar S Hameed
- Biological and Environmental Sciences and Engineering Division, King Abdullah University of Science and Technology, Thuwal, Saudi Arabia
- Deborah Harrus
- Inserm U1054, Centre de Biochimie Structurale and CNRS UMR5048, Universités Montpellier 1 & 2, Montpellier, France
- William Bourguet
- Inserm U1054, Centre de Biochimie Structurale and CNRS UMR5048, Universités Montpellier 1 & 2, Montpellier, France
- Stefan T Arold
- Computational Bioscience Research Center, Biological and Environmental Sciences and Engineering Division, King Abdullah University of Science and Technology, Thuwal, Saudi Arabia
- Xin Gao
- Computer, Electrical and Mathematical Sciences and Engineering Division, Computational Bioscience Research Center
|
24
|
Huang Y, Linsen SEV. Partial depletion of yolk during zebrafish embryogenesis changes the dynamics of methionine cycle and metabolic genes. BMC Genomics 2015; 16:427. [PMID: 26040990 PMCID: PMC4455928 DOI: 10.1186/s12864-015-1654-6] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Received: 07/31/2014] [Accepted: 05/21/2015] [Indexed: 12/21/2022] Open
Abstract
Background: Limited nutrient availability during development is associated with metabolic diseases in adulthood. The molecular cause for these defects is unclear. Here, we investigate whether transcriptional changes caused by developmental malnutrition reveal an early response that can be linked to metabolism and metabolic diseases. Results: We limited nutrient availability by removing yolk from zebrafish (Danio rerio) embryos. We then measured genome expression at 8, 24, and 32 h post-fertilization (hpf) by RNA sequencing and at 48 hpf by microarray profiling. We assessed the functional impact of deregulated genes by enrichment analysis of gene ontologies, pathways and CpG sites around the transcription start sites. Nutrient depletion during embryogenesis does not affect viability, but induces a bias towards female development. It induces subtle expression changes of metabolic genes: lipid transport, oxidative signaling, and glycolysis are affected during earlier stages, and hormonal signaling at 48 hpf. Co-citation analysis indicates an association of deregulated genes with the metabolic syndrome, a known outcome of early-life nutrient depletion. Notably, deregulated methionine cycle genes indicate altered methyl donor availability. We find that the regulation of deregulated genes may be less dependent on methyl donor availability. Conclusions: The systemic response to reduced nutrient availability in zebrafish embryos affects metabolic pathways and can be linked to metabolic diseases. Further exploration of the reported zebrafish model system may elucidate the consequences of reduced nutrient availability during embryogenesis. Electronic supplementary material: The online version of this article (doi:10.1186/s12864-015-1654-6) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Yunxian Huang
- Key Laboratory of Computational Biology, CAS-MPG Partner Institute for Computational Biology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai, China.
- Sam E V Linsen
- Key Laboratory of Computational Biology, CAS-MPG Partner Institute for Computational Biology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai, China.
|
25
|
Babur Ö, Gönen M, Aksoy BA, Schultz N, Ciriello G, Sander C, Demir E. Systematic identification of cancer driving signaling pathways based on mutual exclusivity of genomic alterations. Genome Biol 2015; 16:45. [PMID: 25887147 PMCID: PMC4381444 DOI: 10.1186/s13059-015-0612-6] [Citation(s) in RCA: 106] [Impact Index Per Article: 11.8] [Received: 11/19/2014] [Accepted: 02/10/2015] [Indexed: 12/21/2022] Open
Abstract
We present a novel method for the identification of sets of mutually exclusive gene alterations in a given set of genomic profiles. We scan the groups of genes with a common downstream effect on the signaling network, using a mutual exclusivity criterion that ensures that each gene in the group significantly contributes to the mutual exclusivity pattern. We test the method on all available TCGA cancer genomics datasets, and detect multiple previously unreported alterations that show significant mutual exclusivity and are likely to be driver events.
Affiliation(s)
- Özgün Babur
- Computational Biology Center, Memorial Sloan Kettering Cancer Center, 1275 York Avenue, Box 460, New York, 10065, USA.
- Mithat Gönen
- Department of Epidemiology and Biostatistics, Memorial Sloan Kettering Cancer Center, 1275 York Avenue, New York, 10065, USA.
- Bülent Arman Aksoy
- Computational Biology Center, Memorial Sloan Kettering Cancer Center, 1275 York Avenue, Box 460, New York, 10065, USA.
- Tri-Institutional Training Program in Computational Biology and Medicine, 1275 York Avenue, New York, 10065, USA.
- Nikolaus Schultz
- Computational Biology Center, Memorial Sloan Kettering Cancer Center, 1275 York Avenue, Box 460, New York, 10065, USA.
- Giovanni Ciriello
- Computational Biology Center, Memorial Sloan Kettering Cancer Center, 1275 York Avenue, Box 460, New York, 10065, USA.
- Chris Sander
- Computational Biology Center, Memorial Sloan Kettering Cancer Center, 1275 York Avenue, Box 460, New York, 10065, USA.
- Emek Demir
- Computational Biology Center, Memorial Sloan Kettering Cancer Center, 1275 York Avenue, Box 460, New York, 10065, USA.
|
26
|
Laukens K, Naulaerts S, Berghe WV. Bioinformatics approaches for the functional interpretation of protein lists: from ontology term enrichment to network analysis. Proteomics 2015; 15:981-96. [PMID: 25430566 DOI: 10.1002/pmic.201400296] [Citation(s) in RCA: 27] [Impact Index Per Article: 3.0] [Received: 06/30/2014] [Revised: 10/16/2014] [Accepted: 11/24/2014] [Indexed: 12/24/2022]
Abstract
The main result of many published proteomics studies is a list of identified proteins, which then needs to be interpreted in relation to the research question and existing knowledge. In the early days of proteomics this interpretation was based only on expert insights, acquired by digesting a large amount of relevant literature. With the growing size and complexity of the experimental datasets, many computational techniques, databases, and tools have claimed a central role in this task. In this review we discuss commonly and less commonly used methods to functionally interpret experimental proteome lists and compare them with available knowledge. We first address several functional analysis and enrichment techniques based on ontologies and literature. Then we outline how various types of network and pathway information can be used. While the problem of functional interpretation of proteome data is to an extent equivalent to the interpretation of transcriptome or other "omics" data, this paper addresses some of the specific challenges and solutions of the proteomics field.
Affiliation(s)
- Kris Laukens
- Department of Mathematics and Computer Science, University of Antwerp, Middelheimlaan, Antwerp, Belgium; Biomedical Informatics Research Center Antwerp (biomina), University of Antwerp / Antwerp University Hospital, Antwerp, Belgium
|
27
|
Huang Y, Yu X, Sun N, Qiao N, Cao Y, Boyd-Kirkup JD, Shen Q, Han JDJ. Single-cell-level spatial gene expression in the embryonic neural differentiation niche. Genome Res 2015; 25:570-81. [PMID: 25575549 PMCID: PMC4381528 DOI: 10.1101/gr.181966.114] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.7] [Received: 09/29/2014] [Accepted: 01/08/2015] [Indexed: 11/29/2022]
Abstract
With the rapidly increasing availability of high-throughput in situ hybridization images, how to effectively analyze these images at high resolution for global patterns and testable hypotheses has become an urgent challenge. Here we developed a semi-automated image analysis pipeline to analyze in situ hybridization images of E14.5 mouse embryos at single-cell resolution for more than 1600 telencephalon-expressed genes from the Eurexpress database. Using this pipeline, we derived the spatial gene expression profiles at single-cell resolution across the cortical layers to gain insight into the key processes occurring during cerebral cortex development. These profiles displayed high spatial modularity in gene expression, precisely recapitulated known differentiation zones, and uncovered additional unknown transition zones or cellular states. In particular, they revealed a distinctive spatial transition phase dedicated to chromatin remodeling events during neural differentiation, which can be validated by genomic clustering patterns, epigenetic modification switches, and network modules. Our analysis further revealed a role of mitotic checkpoints during spatial gene expression state transition. As a novel way of analyzing, at the single-cell level, the spatial modularity, dynamic trajectory, and transient states of gene expression during embryonic neural differentiation and of inferring regulatory events, our approach will be useful and applicable in many different systems for understanding the dynamic differentiation processes in vivo and at high resolution.
Affiliation(s)
- Yi Huang
- Chinese Academy of Sciences Key Laboratory of Computational Biology, Chinese Academy of Sciences-Max Planck Partner Institute for Computational Biology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai 200031, China; University of Chinese Academy of Sciences, Beijing 100049, China
- Xiaoming Yu
- Chinese Academy of Sciences Key Laboratory of Computational Biology, Chinese Academy of Sciences-Max Planck Partner Institute for Computational Biology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai 200031, China; Center for Molecular Systems Biology, Institute of Genetics and Developmental Biology, Chinese Academy of Sciences, Beijing 100101, China
- Na Sun
- Chinese Academy of Sciences Key Laboratory of Computational Biology, Chinese Academy of Sciences-Max Planck Partner Institute for Computational Biology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai 200031, China
- Nan Qiao
- Chinese Academy of Sciences Key Laboratory of Computational Biology, Chinese Academy of Sciences-Max Planck Partner Institute for Computational Biology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai 200031, China
- Yaqiang Cao
- Chinese Academy of Sciences Key Laboratory of Computational Biology, Chinese Academy of Sciences-Max Planck Partner Institute for Computational Biology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai 200031, China; University of Chinese Academy of Sciences, Beijing 100049, China
- Jerome D Boyd-Kirkup
- Chinese Academy of Sciences Key Laboratory of Computational Biology, Chinese Academy of Sciences-Max Planck Partner Institute for Computational Biology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai 200031, China
- Qin Shen
- Tsinghua-Peking Center for Life Sciences, Tsinghua University, Beijing 100084, China; Center for Stem Cell Biology and Regenerative Medicine, School of Medicine, Tsinghua University, Beijing 100084, China
- Jing-Dong J Han
- Chinese Academy of Sciences Key Laboratory of Computational Biology, Chinese Academy of Sciences-Max Planck Partner Institute for Computational Biology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai 200031, China; Collaborative Innovation Center for Genetics and Developmental Biology
|
28
|
Chen W, Liu Y, Zhu S, Green CD, Wei G, Han JDJ. Improved nucleosome-positioning algorithm iNPS for accurate nucleosome positioning from sequencing data. Nat Commun 2014; 5:4909. [DOI: 10.1038/ncomms5909] [Citation(s) in RCA: 46] [Impact Index Per Article: 4.6] [Received: 11/29/2013] [Accepted: 08/04/2014] [Indexed: 11/09/2022] Open
|
29
|
How to learn about gene function: text-mining or ontologies? Methods 2014; 74:3-15. [PMID: 25088781 DOI: 10.1016/j.ymeth.2014.07.004] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.1] [Received: 04/03/2014] [Revised: 07/01/2014] [Accepted: 07/09/2014] [Indexed: 12/31/2022] Open
Abstract
As the amount of genome information increases rapidly, there is a correspondingly greater need for methods that provide accurate and automated annotation of gene function. For example, many high-throughput technologies (e.g., next-generation sequencing) are being used today to generate lists of genes associated with specific conditions. However, their functional interpretation remains a challenge, and many tools exist that try to characterize the function of gene lists. Such systems typically rely on enrichment analysis and aim to give a quick insight into the underlying biology by presenting it in the form of a summary report. While the load of annotation may be alleviated by such computational approaches, the main challenge in modern annotation remains to develop a systems form of analysis in which a pipeline can effectively analyze gene lists quickly and identify aggregated annotations through computerized resources. In this article we survey some of the many such tools and methods that have been developed to automatically interpret the biological functions underlying gene lists. We overview current functional annotation aspects from the perspective of their epistemology (i.e., the underlying theories used to organize information about gene function into a body of verified and documented knowledge) and find that most of the currently used functional annotation methods fall broadly into one of two categories: they are based either on 'known' formally-structured ontology annotations created by 'experts' (e.g., the GO terms used to describe the function of Entrez Gene entries), or, perhaps more adventurously, on annotations inferred from literature (e.g., many text-mining methods use computer-aided reasoning to acquire knowledge represented in natural languages). Overall, however, deriving detailed and accurate insight from such gene lists remains a challenging task, and improved methods are called for.
In particular, future methods need to (1) provide more holistic insight into the underlying molecular systems; (2) provide better follow-up experimental testing and treatment options; and (3) better manage gene lists derived from organisms that are not well-studied. We discuss some promising approaches that may help achieve these advances, especially the use of extended dictionaries of biomedical concepts and molecular mechanisms, as well as greater use of annotation benchmarks.
|
30
|
Nolte MJ, Wang Y, Deng JM, Swinton PG, Wei C, Guindani M, Schwartz RJ, Behringer RR. Functional analysis of limb transcriptional enhancers in the mouse. Evol Dev 2014; 16:207-23. [PMID: 24920384 DOI: 10.1111/ede.12084] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.7] [Indexed: 11/26/2022]
Abstract
Transcriptional enhancers are genomic sequences bound by transcription factors that act together with basal transcriptional machinery to regulate gene transcription. Several high-throughput methods have generated large datasets of tissue-specific enhancer sequences with putative roles in developmental processes. However, few enhancers have been deleted from the genome to determine their roles in development. To understand the roles of two enhancers active in the mouse embryonic limb bud, we deleted them from the genome. Although the genes regulated by these enhancers are unknown, they were selected because they were identified in a screen for putative limb bud-specific enhancers associated with p300, an acetyltransferase that participates in protein complexes that promote active transcription, and because the orthologous human enhancers (H1442 and H280) drive distinct lacZ expression patterns in limb buds of embryonic day (E) 11.5 transgenic mice. We show that the orthologous mouse sequences, M1442 and M280, regulate dynamic expression in the developing limb. Although significant transcriptional differences in enhancer-proximal genes in embryonic limb buds accompany the deletion of M1442 and M280, no gross limb malformations were observed during embryonic development, demonstrating that M1442 and M280 are not required for mouse limb development. However, M280 is required for the development and/or maintenance of body size; M280 mice are significantly smaller than controls. M280 also harbors an "ultraconserved" sequence that is identical between human, rat, and mouse. This is the first report of a phenotype resulting from the deletion of an ultraconserved element. These studies highlight the importance of determining enhancer regulatory function by experiments that manipulate them in situ and suggest that some of an enhancer's regulatory capacities may be developmentally tolerated rather than developmentally required.
Affiliation(s)
- Mark J Nolte
- Graduate Program in Genes and Development, University of Texas Graduate School of Biomedical Sciences at Houston, Houston, TX, USA; Department of Genetics, University of Texas M.D. Anderson Cancer Center, Houston, TX, USA
|
31
|
Hu HY, He L, Khaitovich P. Deep sequencing reveals a novel class of bidirectional promoters associated with neuronal genes. BMC Genomics 2014; 15:457. [PMID: 24916849 PMCID: PMC4094773 DOI: 10.1186/1471-2164-15-457] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.0] [Received: 11/14/2013] [Accepted: 05/27/2014] [Indexed: 12/22/2022] Open
Abstract
Background: Comprehensive annotation of transcripts expressed in a given tissue is a critical step towards the understanding of regulatory and functional pathways that shape the transcriptome. Results: Here, we reconstructed a cumulative transcriptome of the human prefrontal cortex (PFC) based on approximately 300 million strand-specific RNA sequence (RNA-seq) reads collected at different stages of postnatal development. We find that more than 50% of reconstructed transcripts represent novel transcriptome elements, including 8,343 novel exons and exon extensions of annotated coding genes, 11,217 novel antisense transcripts and 29,541 novel intergenic transcripts or their fragments showing canonical features of long non-coding RNAs (lncRNAs). Our analysis further led to a surprising discovery of a novel class of bidirectional promoters (NBiPs) driving divergent transcription of mRNA and novel lncRNA pairs and displaying a distinct set of sequence and epigenetic features. In contrast to known bidirectional and unidirectional promoters, NBiPs are strongly associated with genes involved in neuronal functions and regulated by neuron-associated transcription factors. Conclusions: Taken together, our results demonstrate that large portions of the human transcriptome remain uncharacterized. The distinct sequence and epigenetic features of NBiPs, as well as their specific association with neuronal genes, further suggest the existence of regulatory pathways specific to the human brain. Electronic supplementary material: The online version of this article (doi:10.1186/1471-2164-15-457) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Hai Yang Hu
- CAS Key Laboratory of Computational Biology, CAS-MPG Partner Institute for Computational Biology, 320 Yue Yang Road, 200031 Shanghai, China.
|