Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Zhou H, Jin J, Zhang H, Yi B, Wozniak M, Wong L. IntPath--an integrated pathway gene relationship database for model organisms and important pathogens. BMC Syst Biol 2012;6 Suppl 2:S2. [PMID: 23282057 PMCID: PMC3521174 DOI: 10.1186/1752-0509-6-s2-s2] [Citation(s) in RCA: 36] [Impact Index Per Article: 3.0] [Indexed: 01/07/2023]

Cited by Other Article(s)

Zhang Y, Wu L, Wen X, Lv X. Identification and validation of risk score model based on gene set activity as a diagnostic biomarker for endometriosis. Heliyon 2023;9:e18277. [PMID: 37539146 PMCID: PMC10395533 DOI: 10.1016/j.heliyon.2023.e18277] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/18/2022] [Revised: 06/28/2023] [Accepted: 07/13/2023] [Indexed: 08/05/2023]

Abstract

Objective

Because the pathogenesis of endometriosis (EMS) remains poorly understood, investigating alterations in signaling pathway activity may improve our understanding of the disease's characteristics.

Methods

Three published gene expression profiles (GSE11691, GSE25628, and GSE7305 datasets) were downloaded and batch-corrected with the ComBat algorithm, followed by differential gene expression analysis and differential pathway enrichment analysis. The protein-protein interaction (PPI) network was constructed to identify core genes, and the relative enrichment degree of gene sets was evaluated. The Lasso regression model identified candidate gene sets with diagnostic value, and a risk scoring diagnostic model was constructed for further validation on the GSE86534 and GSE5108 datasets. CIBERSORT was used to assess the composition of immune cells in EMS, and the correlation between EMS diagnostic value gene sets and immune cells was evaluated.

Results

A total of 568 differentially expressed genes were identified between eutopic and ectopic endometrium, with 10 core genes in the PPI network associated with cell cycle regulation. Inflammation-related pathways, including cytokine-receptor signaling and chemokine signaling pathways, were significantly more active in ectopic endometrium compared to eutopic endometrium. Diagnostic gene sets for EMS, such as homologous recombination, base excision repair, DNA replication, P53 signaling pathway, adherens junction, and SNARE interactions in vesicular transport, were identified. The risk score's area under the curve (AUC) was 0.854, as indicated by the receiver operating characteristic (ROC) curve, and the risk score's diagnostic value was validated by the validation cohort. Immune cell infiltration analysis revealed correlations between the risk score and Macrophages M2, Plasma cells, resting NK cells, activated NK cells, and regulatory T cells.

Conclusion

The risk scoring diagnostic model, based on pathway activity, demonstrates high diagnostic value and offers novel insights and strategies for the clinical diagnosis and treatment of endometriosis.

Alharbi F, Vakanski A. Machine Learning Methods for Cancer Classification Using Gene Expression Data: A Review. Bioengineering (Basel) 2023;10:bioengineering10020173. [PMID: 36829667 PMCID: PMC9952758 DOI: 10.3390/bioengineering10020173] [Citation(s) in RCA: 16] [Impact Index Per Article: 16.0] [Received: 12/22/2022] [Revised: 01/24/2023] [Accepted: 01/26/2023] [Indexed: 01/31/2023]

Castro-Mondragon JA, Aure M, Lingjærde O, Langerød A, Martens JWM, Børresen-Dale AL, Kristensen V, Mathelier A. Cis-regulatory mutations associate with transcriptional and post-transcriptional deregulation of gene regulatory programs in cancers. Nucleic Acids Res 2022;50:12131-12148. [PMID: 36477895 PMCID: PMC9757053 DOI: 10.1093/nar/gkac1143] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/18/2022] [Revised: 11/03/2022] [Accepted: 11/17/2022] [Indexed: 12/13/2022]

Yang Q, Liu T, Wu T, Lei T, Li Y, Wang X. GGDB: A Grameneae genome alignment database of homologous genes hierarchically related to evolutionary events. Plant Physiol 2022;190:340-351. [PMID: 35789395 PMCID: PMC9434254 DOI: 10.1093/plphys/kiac297] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/07/2022] [Accepted: 06/01/2022] [Indexed: 06/15/2023]

Tan Y, Neto FBL, Neto UB. PALLAS: Penalized mAximum LikeLihood and pArticle Swarms for Inference of Gene Regulatory Networks From Time Series Data. IEEE/ACM Trans Comput Biol Bioinform 2022;19:1807-1816. [PMID: 33170782 DOI: 10.1109/tcbb.2020.3037090] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 06/11/2023]

Analyzing RNA-Seq Gene Expression Data Using Deep Learning Approaches for Cancer Classification. Appl Sci (Basel) 2022. [DOI: 10.3390/app12041850] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Indexed: 12/12/2022]

Li H, Xiao X, Wu X, Ye L, Ji G. scLINE: A multi-network integration framework based on network embedding for representation of single-cell RNA-seq data. J Biomed Inform 2021;122:103899. [PMID: 34481921 DOI: 10.1016/j.jbi.2021.103899] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 04/26/2021] [Revised: 08/22/2021] [Accepted: 08/24/2021] [Indexed: 01/18/2023]

Huckstep H, Fearnley LG, Davis MJ. Measuring pathway database coverage of the phosphoproteome. PeerJ 2021;9:e11298. [PMID: 34113485 PMCID: PMC8162239 DOI: 10.7717/peerj.11298] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/10/2020] [Accepted: 03/29/2021] [Indexed: 12/02/2022]

Gautam US, Mehra S, Kumari P, Alvarez X, Niu T, Tyagi JS, Kaushal D. Mycobacterium tuberculosis sensor kinase DosS modulates the autophagosome in a DosR-independent manner. Commun Biol 2019;2:349. [PMID: 31552302 PMCID: PMC6754383 DOI: 10.1038/s42003-019-0594-0] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Received: 01/21/2018] [Accepted: 09/03/2019] [Indexed: 01/03/2023]

Pasala C, Chilamakuri CSR, Katari SK, Nalamolu RM, Bitla AR, Umamaheswari A. An in silico study: Novel targets for potential drug and vaccine design against drug resistant H. pylori. Microb Pathog 2018;122:156-161. [DOI: 10.1016/j.micpath.2018.05.037] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.8] [Received: 07/21/2017] [Revised: 05/19/2018] [Accepted: 05/22/2018] [Indexed: 02/08/2023]

Jiang S, Zhou H, Liang J, Gerdt C, Wang C, Ke L, Schmidt SCS, Narita Y, Ma Y, Wang S, Colson T, Gewurz B, Li G, Kieff E, Zhao B. The Epstein-Barr Virus Regulome in Lymphoblastoid Cells. Cell Host Microbe 2018;22:561-573.e4. [PMID: 29024646 DOI: 10.1016/j.chom.2017.09.001] [Citation(s) in RCA: 77] [Impact Index Per Article: 12.8] [Received: 04/28/2017] [Revised: 06/21/2017] [Accepted: 08/30/2017] [Indexed: 01/01/2023]

Affiliation(s)

Sizun Jiang Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA 02115, USA; Department of Microbiology and Immunobiology, Harvard Medical School, Boston, MA 02115, USA
Hufeng Zhou Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA 02115, USA; Department of Microbiology and Immunobiology, Harvard Medical School, Boston, MA 02115, USA
Jun Liang Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA 02115, USA; Department of Microbiology and Immunobiology, Harvard Medical School, Boston, MA 02115, USA
Catherine Gerdt Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA 02115, USA
Chong Wang Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA 02115, USA; Department of Microbiology and Immunobiology, Harvard Medical School, Boston, MA 02115, USA
Liangru Ke Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA 02115, USA; Department of Nasopharyngeal Carcinoma, Sun Yat-Sen Cancer Center, Sun Yat-Sen University, Guangzhou 510060, China
Stefanie C S Schmidt Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA 02115, USA; Department of Microbiology and Immunobiology, Harvard Medical School, Boston, MA 02115, USA
Yohei Narita Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA 02115, USA; Department of Microbiology and Immunobiology, Harvard Medical School, Boston, MA 02115, USA
Yijie Ma Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA 02115, USA; Department of Microbiology and Immunobiology, Harvard Medical School, Boston, MA 02115, USA
Shuangqi Wang National Key Laboratory of Crop Genetic Improvement, College of Life Sciences and Technology, Huazhong Agricultural University, Wuhan, Hubei 430070, China
Tyler Colson Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA 02115, USA
Benjamin Gewurz Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA 02115, USA
Guoliang Li National Key Laboratory of Crop Genetic Improvement, College of Life Sciences and Technology, Huazhong Agricultural University, Wuhan, Hubei 430070, China
Elliott Kieff Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA 02115, USA; Department of Microbiology and Immunobiology, Harvard Medical School, Boston, MA 02115, USA.
Bo Zhao Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA 02115, USA.

Li X, Chen W, Chen Y, Zhang X, Gu J, Zhang MQ. Network embedding-based representation learning for single cell RNA-seq data. Nucleic Acids Res 2017;45:e166. [PMID: 28977434 PMCID: PMC5737094 DOI: 10.1093/nar/gkx750] [Citation(s) in RCA: 28] [Impact Index Per Article: 4.0] [Received: 05/17/2017] [Accepted: 08/17/2017] [Indexed: 11/13/2022]

Hüls A, Ickstadt K, Schikowski T, Krämer U. Detection of gene-environment interactions in the presence of linkage disequilibrium and noise by using genetic risk scores with internal weights from elastic net regression. BMC Genet 2017;18:55. [PMID: 28606108 PMCID: PMC5469185 DOI: 10.1186/s12863-017-0519-1] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.4] [Received: 01/20/2017] [Accepted: 05/23/2017] [Indexed: 12/17/2022]

Abstract

BACKGROUND

Analyses of gene-environment (GxE) interactions commonly use single nucleotide polymorphisms (SNPs) to characterize genetic susceptibility, an approach that mostly lacks power and has poor reproducibility. One promising approach to overcome this problem might be the use of weighted genetic risk scores (GRS), which are defined as weighted sums of risk alleles of gene variants. The gold standard is to use external weights from published meta-analyses.

METHODS

In this study, we used internal weights from the marginal genetic effects of the SNPs estimated by a multivariate elastic net regression and thereby provided a method that can be used if there are no external weights available. We conducted a simulation study for the detection of GxE interactions and compared power and type I error of single SNPs analyses with Bonferroni correction and corresponding analysis with unweighted and our weighted GRS approach in scenarios with six risk SNPs and an increasing number of highly correlated (up to 210) and noise SNPs (up to 840).

RESULTS

Applying weighted GRS increased the power enormously in comparison to the common single-SNP approach (e.g. 94.2% vs. 35.4%, respectively, to detect a weak interaction with an OR ≈ 1.04 for six uncorrelated risk SNPs and n = 700 with a well-controlled type I error). Furthermore, weighted GRS outperformed the unweighted GRS, in particular in the presence of SNPs without any effect on the phenotype (e.g. 90.1% vs. 43.9%, respectively, when 20 noise SNPs were added to the six risk SNPs). The superior performance of the weighted GRS was confirmed in a real data application on lung inflammation in the SALIA cohort (n = 402). However, in scenarios with a high number of noise SNPs (>200 vs. 6 risk SNPs), larger sample sizes are needed to avoid an increased type I error, whereas a high number of correlated SNPs can be handled even in small samples (e.g. n = 400).

CONCLUSION

In conclusion, weighted GRS with weights from the marginal genetic effects of the SNPs estimated by a multivariate elastic net regression were shown to be a powerful tool to detect gene-environment interactions in scenarios of high linkage disequilibrium and noise.
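
The abstract above defines a genetic risk score as a weighted sum of risk alleles. A minimal sketch of that arithmetic follows; the function name `genetic_risk_score`, the 0/1/2 allele-count encoding, and the example weights are assumptions made for illustration, and the elastic net step that estimates the internal weights in the study is not reproduced here.

```python
from typing import Optional, Sequence


def genetic_risk_score(
    allele_counts: Sequence[int],
    weights: Optional[Sequence[float]] = None,
) -> float:
    """Return the weighted sum of risk-allele counts for one individual.

    allele_counts: risk alleles carried at each SNP (assumed coding: 0, 1, or 2).
    weights: one weight per SNP; None gives every SNP weight 1 (unweighted GRS).
    """
    if weights is None:
        weights = [1.0] * len(allele_counts)
    if len(weights) != len(allele_counts):
        raise ValueError("need exactly one weight per SNP")
    return float(sum(w * a for w, a in zip(weights, allele_counts)))


# Hypothetical individual genotyped at three SNPs.
counts = [0, 1, 2]
unweighted = genetic_risk_score(counts)
weighted = genetic_risk_score(counts, [0.5, 0.2, 0.1])
```

With near-zero weights, SNPs that have no effect contribute little to the score, which is one plausible reading of why the weighted score tolerates noise SNPs better than the unweighted one.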

Rezabakhsh A, Cheraghi O, Nourazarian A, Hassanpour M, Kazemi M, Ghaderi S, Faraji E, Rahbarghazi R, Avci ÇB, Bagca BG, Garjani A. Type 2 Diabetes Inhibited Human Mesenchymal Stem Cells Angiogenic Response by Over-Activity of the Autophagic Pathway. J Cell Biochem 2017;118:1518-1530. [DOI: 10.1002/jcb.25814] [Citation(s) in RCA: 39] [Impact Index Per Article: 5.6] [Received: 11/05/2016] [Accepted: 11/28/2016] [Indexed: 12/18/2022]

Goh WWB, Wong L. Integrating Networks and Proteomics: Moving Forward. Trends Biotechnol 2016;34:951-959. [DOI: 10.1016/j.tibtech.2016.05.015] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.9] [Received: 04/19/2016] [Revised: 05/23/2016] [Accepted: 05/24/2016] [Indexed: 11/28/2022]

Kruppa J, Kramer F, Beißbarth T, Jung K. A simulation framework for correlated count data of features subsets in high-throughput sequencing or proteomics experiments. Stat Appl Genet Mol Biol 2016;15:401-414. [PMID: 27655448 DOI: 10.1515/sagmb-2015-0082] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 11/15/2022]

Goh WWB, Wong L. Advancing Clinical Proteomics via Analysis Based on Biological Complexes: A Tale of Five Paradigms. J Proteome Res 2016;15:3167-79. [DOI: 10.1021/acs.jproteome.6b00402] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.5] [Indexed: 01/03/2023]

Stable Gene Regulatory Network Modeling From Steady-State Data. Bioengineering (Basel) 2016;3:bioengineering3020012. [PMID: 28952574 PMCID: PMC5597136 DOI: 10.3390/bioengineering3020012] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.3] [Received: 11/19/2015] [Revised: 03/09/2016] [Accepted: 04/06/2016] [Indexed: 12/19/2022]

Wu X, Wu G, Yao X, Hou G, Jiang F. The clinicopathological significance and ethnic difference of FHIT hypermethylation in non-small-cell lung carcinoma: a meta-analysis and literature review. Drug Des Devel Ther 2016;10:699-709. [PMID: 26929601 PMCID: PMC4760666 DOI: 10.2147/dddt.s85253] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Indexed: 12/21/2022]

Liu Q, Song R, Li J. Inference of gene interaction networks using conserved subsequential patterns from multiple time course gene expression datasets. BMC Genomics 2015;16 Suppl 12:S4. [PMID: 26681650 PMCID: PMC4682423 DOI: 10.1186/1471-2164-16-s12-s4] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 12/02/2022]

Abstract

Motivation

Deciphering gene interaction networks (GINs) from time-course gene expression (TCGx) data is highly valuable for understanding gene behaviors (e.g., activation, inhibition, time-lagged causality) at the system level. Existing methods usually use a global or local proximity measure to infer GINs from a single dataset. Because the noise in a single dataset can hardly be resolved from that dataset alone, the results are sometimes unreliable. In addition, these proximity measures cannot handle the co-existence of the various in vivo positive, negative and time-lagged gene interactions.

Methods and results

We propose to infer reliable GINs from multiple TCGx datasets using a novel conserved subsequential pattern of gene expression. A subsequential pattern is a maximal subset of genes sharing positive, negative or time-lagged correlations of one expression template on their own subsets of time points. Based on these patterns, a GIN can be built from each of the datasets. It is assumed that reliable gene interactions would be detected repeatedly. We thus use conserved gene pairs from the individual GINs of the multiple TCGx datasets to construct a reliable GIN for a species. We apply our method to six TCGx datasets related to the yeast cell cycle, and validate the reliable GINs using protein interaction networks, biopathways and transcription factor-gene regulations. We also compare the reliable GINs with GINs reconstructed from single datasets by the Pearson correlation coefficient, a global proximity measure. The reliable GINs achieve much better prediction performance, especially much higher precision. The functional enrichment analysis also suggests that gene sets in a reliable GIN are more functionally significant. Our method is especially useful for deciphering GINs from multiple TCGx datasets related to less studied organisms where little knowledge is available except gene expression data.
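
The step of keeping gene pairs that are detected repeatedly across the per-dataset networks can be sketched as below. This is only an illustration under stated assumptions: the function name `conserved_pairs` and the `min_support` threshold are hypothetical, edges are treated as undirected, and the abstract does not say how many datasets a pair must appear in or how the subsequential patterns themselves are mined.

```python
from collections import Counter
from typing import FrozenSet, Iterable, Set, Tuple

Edge = Tuple[str, str]


def conserved_pairs(
    networks: Iterable[Iterable[Edge]], min_support: int
) -> Set[FrozenSet[str]]:
    """Keep gene pairs found in at least `min_support` of the input networks.

    Each network is an iterable of (gene_a, gene_b) edges. Edges are treated
    as undirected, and a pair is counted at most once per network.
    """
    counts: Counter = Counter()
    for network in networks:
        unique_pairs = {frozenset(edge) for edge in network}
        counts.update(unique_pairs)
    return {pair for pair, n in counts.items() if n >= min_support}


# Three toy networks with made-up gene names, one per dataset.
toy_networks = [
    [("A", "B"), ("B", "C")],
    [("B", "A"), ("C", "D")],
    [("A", "B"), ("B", "C")],
]
reliable = conserved_pairs(toy_networks, min_support=2)
```

Raising `min_support` trades coverage for precision: fewer pairs survive, but each surviving pair is supported by more datasets.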

Takahashi H, Kaniwa N, Saito Y, Sai K, Hamaguchi T, Shirao K, Shimada Y, Matsumura Y, Ohtsu A, Yoshino T, Doi T, Takahashi A, Odaka Y, Okuyama M, Sawada JI, Sakamoto H, Yoshida T. Construction of possible integrated predictive index based on EGFR and ANXA3 polymorphisms for chemotherapy response in fluoropyrimidine-treated Japanese gastric cancer patients using a bioinformatic method. BMC Cancer 2015;15:718. [PMID: 26475168 PMCID: PMC4609065 DOI: 10.1186/s12885-015-1721-z] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.7] [Received: 04/17/2015] [Accepted: 10/08/2015] [Indexed: 12/23/2022]

Abstract

Background

Variability in drug response between individual patients is a serious concern in medicine. To identify single-nucleotide polymorphisms (SNPs) related to drug response variability, many genome-wide association studies have been conducted.

Methods

We previously applied a knowledge-based bioinformatic approach to a pharmacogenomics study in which 119 fluoropyrimidine-treated gastric cancer patients were genotyped at 109,365 SNPs using the Illumina Human-1 BeadChip. We identified the SNP rs2293347 in the human epidermal growth factor receptor (EGFR) gene as a novel genetic factor related to chemotherapeutic response. In the present study, we reanalyzed these hypothesis-free genomic data using extended knowledge.

Results

We identified rs2867461 in the annexin A3 (ANXA3) gene as another candidate. Using logistic regression, we confirmed that the performance of the rs2867461 + rs2293347 model was superior to those of the single factor models. Furthermore, we propose a novel integrated predictive index (iEA) based on these two polymorphisms in EGFR and ANXA3. The p value for iEA was 1.47 × 10⁻⁸ by Fisher’s exact test. Recent studies showed that mutations in EGFR are associated with high expression of dihydropyrimidine dehydrogenase, which is an inactivating and rate-limiting enzyme for fluoropyrimidine, and suggested that the combination of chemotherapy with fluoropyrimidine and EGFR-targeting agents is effective against EGFR-overexpressing gastric tumors, while ANXA3 overexpression confers resistance to tyrosine kinase inhibitors targeting the EGFR pathway.

Conclusions

These results suggest that the iEA index or a combination of polymorphisms in EGFR and ANXA3 may serve as predictive factors of drug response, and therefore could be useful for optimal selection of chemotherapy regimens.

Affiliation(s)

Hiro Takahashi Graduate School of Horticulture, Chiba University, 648 Matsudo, Matsudo, Chiba, 271-8510, Japan; Plant Biology Research Center, Chubu University, Matsumoto-cho 1200, Kasugai, Aichi, 487-8501, Japan; Division of Genetics, National Cancer Center Research Institute, 5-1-1 Tsukiji, Chuo-ku, Tokyo, 104-0045, Japan.
Nahoko Kaniwa Division of Medicinal Safety Science, National Institute of Health Sciences, 1-18-1 Kamiyoga, Setagaya-ku, Tokyo, 158-8501, Japan.
Yoshiro Saito Division of Medicinal Safety Science, National Institute of Health Sciences, 1-18-1 Kamiyoga, Setagaya-ku, Tokyo, 158-8501, Japan.
Kimie Sai Division of Medicinal Safety Science, National Institute of Health Sciences, 1-18-1 Kamiyoga, Setagaya-ku, Tokyo, 158-8501, Japan.
Tetsuya Hamaguchi Gastrointestinal Medical Oncology Division, National Cancer Center Hospital, 5-1-1 Tsukiji, Chuo-ku, Tokyo, 104-0045, Japan.
Kuniaki Shirao Gastrointestinal Medical Oncology Division, National Cancer Center Hospital, 5-1-1 Tsukiji, Chuo-ku, Tokyo, 104-0045, Japan.
Yasuhiro Shimada Gastrointestinal Medical Oncology Division, National Cancer Center Hospital, 5-1-1 Tsukiji, Chuo-ku, Tokyo, 104-0045, Japan.
Yasuhiro Matsumura Division of Developmental Therapeutics, Research Center for Innovative Oncology, National Cancer Center Hospital East, 6-5-1, Kashiwanoha, Kashiwa, Chiba, 277-8577, Japan.
Atsushi Ohtsu Department of Gastrointestinal Oncology, National Cancer Center Hospital East, 6-5-1, Kashiwanoha, Kashiwa, Chiba, 277-8577, Japan.
Takayuki Yoshino Department of Gastrointestinal Oncology, National Cancer Center Hospital East, 6-5-1, Kashiwanoha, Kashiwa, Chiba, 277-8577, Japan.
Toshihiko Doi Department of Gastrointestinal Oncology, National Cancer Center Hospital East, 6-5-1, Kashiwanoha, Kashiwa, Chiba, 277-8577, Japan.
Anna Takahashi Plant Biology Research Center, Chubu University, Matsumoto-cho 1200, Kasugai, Aichi, 487-8501, Japan.
Yoko Odaka Division of Genetics, National Cancer Center Research Institute, 5-1-1 Tsukiji, Chuo-ku, Tokyo, 104-0045, Japan.
Misuzu Okuyama Division of Genetics, National Cancer Center Research Institute, 5-1-1 Tsukiji, Chuo-ku, Tokyo, 104-0045, Japan.
Jun-Ichi Sawada Division of Functional Biochemistry and Genomics, National Institute of Health Sciences, 1-18-1 Kamiyoga, Setagaya-ku, Tokyo, 158-8501, Japan; Present address: Pharmaceutical and Medical Devices Agency, Shinkasumigaseki-building, 3-3-2 Kasumigaseki, Chiyoda-ku, Tokyo, 100-0013, Japan.
Hiromi Sakamoto Division of Genetics, National Cancer Center Research Institute, 5-1-1 Tsukiji, Chuo-ku, Tokyo, 104-0045, Japan.
Teruhiko Yoshida Division of Genetics, National Cancer Center Research Institute, 5-1-1 Tsukiji, Chuo-ku, Tokyo, 104-0045, Japan.

Systematic analysis of somatic mutations impacting gene expression in 12 tumour types. Nat Commun 2015;6:8554. [PMID: 26436532 PMCID: PMC4600750 DOI: 10.1038/ncomms9554] [Citation(s) in RCA: 83] [Impact Index Per Article: 9.2] [Received: 11/07/2014] [Accepted: 09/04/2015] [Indexed: 12/27/2022]

Evasion of affinity-based selection in germinal centers by Epstein-Barr virus LMP2A. Proc Natl Acad Sci U S A 2015;112:11612-7. [PMID: 26305967 DOI: 10.1073/pnas.1514484112] [Citation(s) in RCA: 39] [Impact Index Per Article: 4.3] [Indexed: 12/27/2022]

Gautam US, Mehra S, Kaushal D. In-Vivo Gene Signatures of Mycobacterium tuberculosis in C3HeB/FeJ Mice. PLoS One 2015;10:e0135208. [PMID: 26270051 PMCID: PMC4535907 DOI: 10.1371/journal.pone.0135208] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.4] [Received: 04/27/2015] [Accepted: 07/19/2015] [Indexed: 11/28/2022]

Abstract

Despite considerable progress in understanding the pathogenesis of Mycobacterium tuberculosis (Mtb), development of new therapeutics and vaccines against it has proven difficult. This is at least in part due to the use of less than optimal models of in-vivo Mtb infection, which has precluded a study of the physiology of the pathogen in niches where it actually persists. C3HeB/FeJ (Kramnik) mice develop human-like lesions when experimentally infected with Mtb and thus provide a faithful and highly tractable system to study the physiology of the pathogen in-vivo. We compared the transcriptomics of Mtb and various mutants in the DosR (DevR) regulon derived from Kramnik mouse granulomas to those cultured in-vitro. We recently showed that mutant ΔdosS is attenuated in C3HeB/FeJ mice. Aerosol exposure of mice to the mutant mycobacteria resulted in a substantially different and relatively weaker transcriptional response (≤ 20 genes were induced) for the functional category ‘Information Pathways’ in Mtb:ΔdosR; ‘Lipid Metabolism’ in Mtb:ΔdosT; ‘Virulence, Detoxification, Adaptation’ in both Mtb:ΔdosR and Mtb:ΔdosT; and the ‘PE/PPE’ family in all mutant strains compared to wild-type Mtb H37Rv, suggesting that the inability to induce DosR functions to different levels can modulate the interaction of the pathogen with the host. The Mtb genes expressed during growth in C3HeB/FeJ mice appear to reflect adaptation to differential nutrient utilization for survival in mouse lungs. The downregulation of genes such as glnB, Rv0744c, Rv3281, sdhD/B, mce4A, dctA etc. in mutant ΔdosS indicates their requirement for bacterial growth and for the flow of carbon/energy sources from host cells. We conclude that genes expressed in Mtb during the in-vivo chronic phase of infection in Kramnik mice mainly contribute to growth, cell wall processes, lipid metabolism, and virulence.

Dai H, Charnigo R. Compound hierarchical correlated beta mixture with an application to cluster mouse transcription factor DNA binding data. Biostatistics 2015;16:641-54. [PMID: 25964663 DOI: 10.1093/biostatistics/kxv016] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 05/28/2014] [Accepted: 04/10/2015] [Indexed: 11/12/2022]

Zhou H, Schmidt SCS, Jiang S, Willox B, Bernhardt K, Liang J, Johannsen EC, Kharchenko P, Gewurz BE, Kieff E, Zhao B. Epstein-Barr virus oncoprotein super-enhancers control B cell growth. Cell Host Microbe 2015;17:205-16. [PMID: 25639793 DOI: 10.1016/j.chom.2014.12.013] [Citation(s) in RCA: 123] [Impact Index Per Article: 13.7] [Received: 06/11/2014] [Revised: 10/16/2014] [Accepted: 11/15/2014] [Indexed: 01/11/2023]

Affiliation(s)

Hufeng Zhou Department of Medicine, Brigham and Women's Hospital, Boston, MA 02115, USA; Department of Microbiology and Immunobiology, Harvard Medical School, Boston, MA 02115, USA
Stefanie C S Schmidt Department of Medicine, Brigham and Women's Hospital, Boston, MA 02115, USA; Department of Microbiology and Immunobiology, Harvard Medical School, Boston, MA 02115, USA
Sizun Jiang Department of Medicine, Brigham and Women's Hospital, Boston, MA 02115, USA; Department of Microbiology and Immunobiology, Harvard Medical School, Boston, MA 02115, USA
Bradford Willox Department of Medicine, Brigham and Women's Hospital, Boston, MA 02115, USA
Katharina Bernhardt Department of Medicine, Brigham and Women's Hospital, Boston, MA 02115, USA; Department of Microbiology and Immunobiology, Harvard Medical School, Boston, MA 02115, USA
Jun Liang Department of Medicine, Brigham and Women's Hospital, Boston, MA 02115, USA; Department of Microbiology and Immunobiology, Harvard Medical School, Boston, MA 02115, USA
Eric C Johannsen Department of Medicine and McArdle Laboratory for Cancer Research, University of Wisconsin-Madison, Madison, WI 53706, USA
Peter Kharchenko Center for Biomedical Informatics, Harvard Medical School and Division of Hematology, Children's Hospital, Boston, MA 02115, USA
Benjamin E Gewurz Department of Medicine, Brigham and Women's Hospital, Boston, MA 02115, USA; Department of Microbiology and Immunobiology, Harvard Medical School, Boston, MA 02115, USA
Elliott Kieff Department of Medicine, Brigham and Women's Hospital, Boston, MA 02115, USA; Department of Microbiology and Immunobiology, Harvard Medical School, Boston, MA 02115, USA.
Bo Zhao Department of Medicine, Brigham and Women's Hospital, Boston, MA 02115, USA; Department of Microbiology and Immunobiology, Harvard Medical School, Boston, MA 02115, USA.

Yu G, Zhu H, Domeniconi C. Predicting protein functions using incomplete hierarchical labels. BMC Bioinformatics 2015;16:1. [PMID: 25591917 PMCID: PMC4384381 DOI: 10.1186/s12859-014-0430-y] [Citation(s) in RCA: 83] [Impact Index Per Article: 9.2] [Received: 07/24/2014] [Accepted: 12/11/2014] [Indexed: 02/07/2023]

Abstract

BACKGROUND

Protein function prediction aims to assign biological or biochemical functions to proteins, and it is a challenging computational problem characterized by several factors: (1) the number of function labels (annotations) is large; (2) a protein may be associated with multiple labels; (3) the function labels are structured in a hierarchy; and (4) the labels are incomplete. Current predictive models often assume that the labels of the labeled proteins are complete, i.e. no label is missing. But in real scenarios, we may be aware of only some hierarchical labels of a protein, and we may not know whether additional ones are actually present. The scenario of incomplete hierarchical labels, a challenging and practical problem, is seldom studied in protein function prediction.

RESULTS

In this paper, we propose an algorithm to Predict protein functions using Incomplete hierarchical LabeLs (PILL in short). PILL takes into account the hierarchical and the flat taxonomy similarity between function labels, and defines a Combined Similarity (ComSim) to measure the correlation between labels. PILL estimates the missing labels for a protein based on ComSim and the known labels of the protein, and uses a regularization to exploit the interactions between proteins for function prediction. PILL is shown to outperform other related techniques in replenishing the missing labels and in predicting the functions of completely unlabeled proteins on publicly available PPI datasets annotated with MIPS Functional Catalogue and Gene Ontology labels.

CONCLUSION

The empirical study shows that it is important to consider the incomplete annotation for protein function prediction. The proposed method (PILL) can serve as a valuable tool for protein function prediction using incomplete labels. The Matlab code of PILL is available upon request.

Abelin ACT, Marinov GK, Williams BA, McCue K, Wold BJ. A ratiometric-based measure of gene co-expression. BMC Bioinformatics 2014;15:331. [PMID: 25411051 PMCID: PMC4289233 DOI: 10.1186/1471-2105-15-331] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/12/2014] [Accepted: 07/18/2014] [Indexed: 12/02/2022] Open

Abstract

Background

Gene co-expression analysis has previously been based on measures that include correlation coefficients and mutual information, as well as newcomers such as MIC. These measures depend primarily on the degree of association between the RNA levels of two genes and to a lesser extent on their variability. They focus on the similarity of expression value trajectories that change in like manner across samples. However, there are relationships of biological interest for which these classical measures are expected to be insensitive. These include genes whose expression levels are ratiometrically stable and genes whose variance is tightly constrained. Large-scale studies of relatively homogeneous samples, including single cell RNA-seq, are experimental settings in which such relationships might be especially pertinent.

Results

We develop and implement a ratiometric approach for detecting gene associations (abbreviated RA). It is based on the coefficient of variation of the measured expression ratio of each pair of genes. We apply it to a collection of lymphoblastoid RNA-seq data from the 1000 Genomes Project Consortium, a typical sample set with high overall homogeneity. RA is a selective method, reporting in this case ~1/4 of all possible gene pairs, yet these relationships include a distilled picture of biological relationships previously found by other methods. In addition, RA reveals expression relationships that are not detected by traditional correlation and mutual information methods. We also analyze data from individual lymphoblastoid cells and show that desirable properties of the RA method extend to single-cell RNA-seq.
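
As a rough illustration of the quantity described above, the following Python sketch computes the coefficient of variation of the per-sample expression ratio for every gene pair. The function name `ratio_cv`, the optional pseudocount and the toy data are assumptions made for illustration; the abstract does not specify the authors' implementation or how RA scores are thresholded.

```python
from itertools import combinations
from statistics import mean, pstdev


def ratio_cv(expr, pseudocount=1.0):
    """Return {(gene_a, gene_b): CV of the per-sample ratio a/b}.

    expr maps a gene name to a list of expression values, one per sample.
    The pseudocount is an assumption (not from the abstract) that avoids
    division by zero for unexpressed genes.
    """
    scores = {}
    for a, b in combinations(sorted(expr), 2):
        ratios = [(x + pseudocount) / (y + pseudocount)
                  for x, y in zip(expr[a], expr[b])]
        # Coefficient of variation: standard deviation divided by the mean.
        scores[(a, b)] = pstdev(ratios) / mean(ratios)
    return scores


# Toy data: g2 is always twice g1, so their ratio is constant across samples.
toy = {
    "g1": [10.0, 20.0, 40.0],
    "g2": [20.0, 40.0, 80.0],
    "g3": [5.0, 50.0, 7.0],
}
scores = ratio_cv(toy, pseudocount=0.0)
```

In this toy case the pair (g1, g2) has a coefficient of variation of zero even though both genes vary strongly across samples, which is the kind of ratiometrically stable relationship the abstract describes.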

Conclusion

We show that our ratiometric method identifies biologically significant relationships that are often missed or low-ranked by conventional association-based methods when applied to a relatively homogeneous dataset. The results open new questions about the regulatory mechanisms that produce strong RA relationships. RA is scalable and potentially well suited for the analysis of thousands of bulk-RNA or single-cell transcriptomes.

Electronic supplementary material

The online version of this article (doi:10.1186/1471-2105-15-331) contains supplementary material, which is available to authorized users.

Design pattern mining using distributed learning automata and DNA sequence alignment. PLoS One 2014;9:e106313. [PMID: 25243670 PMCID: PMC4171372 DOI: 10.1371/journal.pone.0106313] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/11/2014] [Accepted: 07/30/2014] [Indexed: 11/19/2022] Open

Abstract

Context

Over the last decade, design patterns have been used extensively to generate reusable solutions to frequently encountered problems in software engineering and object oriented programming. A design pattern is a repeatable software design solution that provides a template for solving various instances of a general problem.

Objective

This paper describes a new method for pattern mining that isolates design patterns and the relationships between them, and a related tool, DLA-DNA, for all implemented patterns and all projects used for evaluation. Based on distributed learning automata (DLA) and deoxyribonucleic acid (DNA) sequence alignment, DLA-DNA achieves acceptable precision and recall compared with the other evaluated tools.

Method

The proposed method mines structural design patterns in object-oriented source code and extracts the strong and weak relationships between them, enabling analyzers and programmers to determine the dependency rate of each object, component, and other sections of the code for parameter passing and modular programming. The proposed model detects design patterns, and the strengths of their relationships, better than the other available tools (Pinot, PTIDEJ and DPJF).

Results

The results demonstrate that, whether the source code is built in a standard or non-standard way with respect to the design patterns, the proposed method performs close to DPJF and better than Pinot and PTIDEJ. The proposed model was tested on several source codes and compared with other related models and available tools. On average, its precision and recall are 20% and 9.6% higher than those of Pinot, 27% and 31% higher than those of PTIDEJ, and 3.3% and 2% higher than those of DPJF, respectively.

Conclusion

The proposed method is organized in two steps: in the first step, elemental design patterns are identified; in the second step, they are composed to recognize actual design patterns.

Analysis of gene expression profiles of soft tissue sarcoma using a combination of knowledge-based filtering with integration of multiple statistics. PLoS One 2014;9:e106801. [PMID: 25188299 PMCID: PMC4154757 DOI: 10.1371/journal.pone.0106801] [Citation(s) in RCA: 27] [Impact Index Per Article: 2.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/29/2014] [Accepted: 08/01/2014] [Indexed: 12/21/2022] Open

Jadhav A, Shanmugham B, Rajendiran A, Pan A. Unraveling novel broad-spectrum antibacterial targets in food and waterborne pathogens using comparative genomics and protein interaction network analysis. Infect Genet Evol 2014;27:300-8. [PMID: 25128740 DOI: 10.1016/j.meegid.2014.08.007] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.1] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Subscribe] [Scholar Register] [Received: 03/20/2014] [Revised: 07/31/2014] [Accepted: 08/07/2014] [Indexed: 02/04/2023]

Abstract

Food and waterborne diseases are a growing concern in terms of human morbidity and mortality worldwide, even in the 21st century, emphasizing the need for new therapeutic interventions for these diseases. The current study aims at prioritizing broad-spectrum antibacterial targets, present in multiple food and waterborne bacterial pathogens, through a comparative genomics strategy coupled with a protein interaction network analysis. The pathways unique and common to all the pathogens under study (viz., methane metabolism, d-alanine metabolism, peptidoglycan biosynthesis, bacterial secretion system, two-component system, C5-branched dibasic acid metabolism), identified by comparative metabolic pathway analysis, were considered for the analysis. The proteins/enzymes involved in these pathways were prioritized following host non-homology analysis, essentiality analysis, gut flora non-homology analysis and protein interaction network analysis. The analyses revealed a set of promising broad-spectrum antibacterial targets, present in multiple food and waterborne pathogens, which are essential for bacterial survival, non-homologous to host and gut flora, and functionally important in the metabolic network. The identified broad-spectrum candidates, namely, integral membrane protein/virulence factor (MviN), preprotein translocase subunits SecB and SecG, carbon storage regulator (CsrA), and nitrogen regulatory protein P-II 1 (GlnB), contributed by the peptidoglycan pathway, bacterial secretion systems and two-component systems, were also found to be present in a wide range of other disease-causing bacteria. Cytoplasmic proteins SecG, CsrA and GlnB were considered as drug targets, while membrane proteins MviN and SecB were classified as vaccine targets. The identified broad-spectrum targets can aid in the design and development of antibacterial agents not only against food and waterborne pathogens but also against other pathogens.

Koo I, Yao S, Zhang X, Kim S. Comparative analysis of false discovery rate methods in constructing metabolic association networks. J Bioinform Comput Biol 2014;12:1450018. [PMID: 25152043 DOI: 10.1142/s0219720014500188] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 02/03/2023]

Lee S, Kim JY, Hwang J, Kim S, Lee JH, Han DH. Investigation of pathogenic genes in peri-implantitis from implant clustering failure patients: a whole-exome sequencing pilot study. PLoS One 2014;9:e99360. [PMID: 24921256 PMCID: PMC4055653 DOI: 10.1371/journal.pone.0099360] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/25/2014] [Accepted: 05/13/2014] [Indexed: 01/21/2023] Open

Abstract

Peri-implantitis is a frequently occurring gum disease linked to multi-factorial traits with various environmental and genetic causalities and no known concrete pathogenesis. The varying severity of peri-implantitis among patients with relatively similar environments suggests a genetic aspect which needs to be investigated to understand and regulate the pathogenesis of the disease. Six unrelated individuals with multiple clusterization implant failure due to severe peri-implantitis were chosen for this study. These six individuals had relatively healthy lifestyles, with minimal environmental causalities affecting peri-implantitis. Research was undertaken to investigate pathogenic genes in peri-implantitis, albeit with a small number of subjects and incomplete elimination of environmental causalities. Whole-exome sequencing was performed on saliva samples collected via a self DNA collection kit. Common variants with minor allele frequencies (MAF) ≥ 0.05 from all control datasets were eliminated, and variants having high and moderate impact and loss of function were used for comparison. Gene set enrichment analysis was performed to reveal functional groups associated with the genetic variants. 2,022 genes were left after filtering against dbSNP, the 1000 Genomes East Asian population, and healthy Korean randomized subsample data (GSK project). 175 (p-value <0.05) out of 927 gene sets were obtained via GSEA (DAVID). The top 10 were chosen (p-value <0.05) from cluster enrichment, showing significance of cytoskeleton, cell adhesion, and metal ion binding. Network analysis was applied to find relationships between functional clusters. Among the functional groups, metal ion binding was located in the center of all clusters, indicating that dysfunction in the regulation of metal ion concentration might affect cell morphology or cell adhesion, resulting in implant failure. This result may demonstrate the feasibility of and provide pilot data for a larger research project aimed at discovering biomarkers for early diagnosis of peri-implantitis.
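
The variant filtering step mentioned in this abstract can be sketched as follows. This is only an illustration under stated assumptions: the function name `filter_rare_variants`, the dictionary-based inputs and the variant identifiers are hypothetical, a variant is dropped when it is common in any one control dataset (one possible reading of the abstract), and a variant missing from a control dataset is treated as having a MAF of 0 there.

```python
def filter_rare_variants(variants, control_mafs, threshold=0.05):
    """Keep variants whose MAF is below `threshold` in every control dataset.

    variants: iterable of variant identifiers found in the cases.
    control_mafs: list of dicts, each mapping variant identifier -> MAF
        in one control dataset.
    Assumption: a variant absent from a control dataset has MAF 0 there.
    """
    kept = []
    for variant in variants:
        # Drop the variant if it is common (MAF >= threshold) in any control set.
        if all(dataset.get(variant, 0.0) < threshold for dataset in control_mafs):
            kept.append(variant)
    return kept


# Hypothetical identifiers and frequencies, for illustration only.
case_variants = ["var_a", "var_b", "var_c"]
controls = [
    {"var_a": 0.20, "var_b": 0.01},  # first control dataset
    {"var_b": 0.04, "var_c": 0.05},  # second control dataset
]
rare = filter_rare_variants(case_variants, controls)
```

Here `var_a` and `var_c` are removed because each reaches the 0.05 threshold in at least one control dataset, leaving only `var_b`.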

Chen YA, Tripathi LP, Dessailly BH, Nyström-Persson J, Ahmad S, Mizuguchi K. Integrated pathway clusters with coherent biological themes for target prioritisation. PLoS One 2014;9:e99030. [PMID: 24918583 PMCID: PMC4053319 DOI: 10.1371/journal.pone.0099030] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/20/2014] [Accepted: 05/07/2014] [Indexed: 12/15/2022] Open

Zhou H, Gao S, Nguyen NN, Fan M, Jin J, Liu B, Zhao L, Xiong G, Tan M, Li S, Wong L. Stringent homology-based prediction of H. sapiens-M. tuberculosis H37Rv protein-protein interactions. Biol Direct 2014;9:5. [PMID: 24708540 PMCID: PMC4022245 DOI: 10.1186/1745-6150-9-5] [Citation(s) in RCA: 58] [Impact Index Per Article: 5.8] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/01/2013] [Accepted: 03/26/2014] [Indexed: 12/31/2022] Open

Abstract

BACKGROUND

H. sapiens-M. tuberculosis H37Rv protein-protein interaction (PPI) data are essential for understanding the infection mechanism of the formidable pathogen M. tuberculosis H37Rv. Computational prediction is an important strategy to fill the gap in experimental H. sapiens-M. tuberculosis H37Rv PPI data. Homology-based prediction is frequently used in predicting both intra-species and inter-species PPIs. However, some limitations are not properly resolved in several published works that predict eukaryote-prokaryote inter-species PPIs using intra-species template PPIs.
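
For orientation, the conventional homology-based idea referred to here (transferring a template interaction to the homologs of its two partners) can be sketched in a few lines of Python. This is a generic illustration, not the stringent approach developed in the paper; the function name `transfer_interactions`, the input dictionaries and all protein identifiers are hypothetical.

```python
def transfer_interactions(template_ppis, host_homologs, pathogen_homologs):
    """Predict host-pathogen pairs from template PPIs by homology transfer.

    template_ppis: iterable of (protein_a, protein_b) template interactions.
    host_homologs: dict mapping a template protein -> host proteins
        homologous to it.
    pathogen_homologs: dict mapping a template protein -> pathogen proteins
        homologous to it.
    Returns a set of (host_protein, pathogen_protein) predictions.
    """
    predicted = set()
    for a, b in template_ppis:
        # Template interactions are undirected, so try both orientations.
        for x, y in ((a, b), (b, a)):
            for host in host_homologs.get(x, ()):
                for pathogen in pathogen_homologs.get(y, ()):
                    predicted.add((host, pathogen))
    return predicted


# Hypothetical identifiers, for illustration only.
templates = [("T1", "T2")]
host_map = {"T1": ["H1"]}
pathogen_map = {"T2": ["P1", "P2"]}
pairs = transfer_interactions(templates, host_map, pathogen_map)
```

A plain transfer of this kind ignores the differences between eukaryotic and prokaryotic proteins and between inter-species and intra-species interfaces, which are the limitations the abstract goes on to address.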

RESULTS

We develop a stringent homology-based prediction approach by taking into account (i) differences between eukaryotic and prokaryotic proteins and (ii) differences between inter-species and intra-species PPI interfaces. We compare our stringent homology-based approach to a conventional homology-based approach for predicting host-pathogen PPIs, based on cellular compartment distribution analysis, disease gene list enrichment analysis, pathway enrichment analysis and functional category enrichment analysis. These analyses support the validity of our prediction result, and clearly show that our approach has better performance in predicting H. sapiens-M. tuberculosis H37Rv PPIs. Using our stringent homology-based approach, we have predicted a set of highly plausible H. sapiens-M. tuberculosis H37Rv PPIs which might be useful for many related studies. Based on our analysis of the H. sapiens-M. tuberculosis H37Rv PPI network predicted by our stringent homology-based approach, we have discovered several interesting properties which are reported here for the first time. We find that both host proteins and pathogen proteins involved in the host-pathogen PPIs tend to be hubs in their own intra-species PPI network. Also, both host and pathogen proteins involved in host-pathogen PPIs tend to have longer primary sequences, tend to have more domains, tend to be more hydrophilic, etc. In addition, the protein domains from both host and pathogen proteins involved in host-pathogen PPIs tend to have lower charge, and tend to be more hydrophilic.

CONCLUSIONS

Our stringent homology-based prediction approach provides a better strategy in predicting PPIs between eukaryotic hosts and prokaryotic pathogens than a conventional homology-based approach. The properties we have observed from the predicted H. sapiens-M. tuberculosis H37Rv PPI network are useful for understanding inter-species host-pathogen PPI networks and provide novel insights for host-pathogen interaction studies.

Zhou H, Rezaei J, Hugo W, Gao S, Jin J, Fan M, Yong CH, Wozniak M, Wong L. Stringent DDI-based prediction of H. sapiens-M. tuberculosis H37Rv protein-protein interactions. BMC Syst Biol 2013;7 Suppl 6:S6. [PMID: 24564941 PMCID: PMC4029759 DOI: 10.1186/1752-0509-7-s6-s6] [Citation(s) in RCA: 32] [Impact Index Per Article: 2.9] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Indexed: 11/10/2022]

Abstract

BACKGROUND

H. sapiens-M. tuberculosis H37Rv protein-protein interaction (PPI) data are very important information to illuminate the infection mechanism of M. tuberculosis H37Rv. But current H. sapiens-M. tuberculosis H37Rv PPI data are very scarce. This seriously limits the study of the interaction between this important pathogen and its host H. sapiens. Computational prediction of H. sapiens-M. tuberculosis H37Rv PPIs is an important strategy to fill in the gap. Domain-domain interaction (DDI) based prediction is one of the frequently used computational approaches in predicting both intra-species and inter-species PPIs. However, the performance of DDI-based host-pathogen PPI prediction has been rather limited.

RESULTS

We develop a stringent DDI-based prediction approach with emphasis on (i) differences between the specific domain sequences on annotated regions of proteins under the same domain ID and (ii) calculation of the interaction strength of predicted PPIs based on the interacting residues in their interaction interfaces. We compare our stringent DDI-based approach to a conventional DDI-based approach for predicting PPIs based on gold standard intra-species PPIs and coherent informative Gene Ontology terms assessment. The assessment results show that our stringent DDI-based approach achieves much better performance in predicting PPIs than the conventional approach. Using our stringent DDI-based approach, we have predicted a small set of reliable H. sapiens-M. tuberculosis H37Rv PPIs which could be very useful for a variety of related studies. We also analyze the H. sapiens-M. tuberculosis H37Rv PPIs predicted by our stringent DDI-based approach using cellular compartment distribution analysis, functional category enrichment analysis and pathway enrichment analysis. The analyses support the validity of our prediction result. Also, based on an analysis of the H. sapiens-M. tuberculosis H37Rv PPI network predicted by our stringent DDI-based approach, we have discovered some important properties of domains involved in host-pathogen PPIs. We find that both host and pathogen proteins involved in host-pathogen PPIs tend to have more domains than proteins involved in intra-species PPIs, and these domains have more interaction partners than domains on proteins involved in intra-species PPI.

CONCLUSIONS

The stringent DDI-based prediction approach reported in this work provides a stringent strategy for predicting host-pathogen PPIs. It also performs better than a conventional DDI-based approach in predicting PPIs. We have predicted a small set of accurate H. sapiens-M. tuberculosis H37Rv PPIs which could be very useful for a variety of related studies.
