51
Chen Y, Shi JX, Pan XF, Feng J, Zhao H. Identification of candidate genes for lung cancer somatic mutation test kits. Genet Mol Biol 2013; 36:455-64. [PMID: 24130455 PMCID: PMC3795175 DOI: 10.1590/s1415-47572013000300022] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.5] [Received: 01/15/2013] [Accepted: 06/04/2013] [Indexed: 11/25/2022] Open
Abstract
Over the past three decades, mortality from lung cancer has increased sharply and continuously in China, making it the leading cause of death among all types of cancer. The ability to identify the actual sequence of gene mutations, especially using next-generation sequencing (NGS) technology, may help doctors determine which mutations lead to precancerous lesions and which produce invasive carcinomas. In this study, we analyzed the latest lung cancer data in the COSMIC database in order to find genomic “hotspots” that are frequently mutated in human lung cancer genomes. The results revealed that the most frequently mutated lung cancer genes are EGFR, KRAS and TP53. In recent years, EGFR and KRAS lung cancer test kits have been used to test lung cancer patients, but they have many disadvantages: they proved to be of low sensitivity, labor-intensive and time-consuming. In this study, we constructed a more complete catalogue of lung cancer mutation events including 145 mutated genes. With the genes on this list, it may be feasible to develop an NGS kit for lung cancer mutation detection.
Affiliation(s)
- Yong Chen
- Thoracic Department, Shanghai Chest Hospital, Shanghai, China

52
Gulati S, Cheng TMK, Bates PA. Cancer networks and beyond: interpreting mutations using the human interactome and protein structure. Semin Cancer Biol 2013; 23:219-26. [PMID: 23680723 DOI: 10.1016/j.semcancer.2013.05.002] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.6] [Received: 04/26/2013] [Revised: 04/30/2013] [Accepted: 05/03/2013] [Indexed: 01/08/2023]
Abstract
Over recent years, with the advances in next-generation sequencing, a large number of cancer mutations have been identified and accumulated in public repositories. Coupled to this is our increased ability to generate detailed interactome maps that help to enrich our knowledge of the biological implications of cancer mutations. As a result, network analysis approaches have become an invaluable tool to predict and interpret mutations that are associated with tumour survival and progression. Our understanding of cancer mechanisms is further enhanced by mapping protein structure information to such networks. Here we review the current methodologies for annotating the functional impacts of cancer mutations, which range from analysis of protein structures to protein-protein interaction network studies.
Affiliation(s)
- Sakshi Gulati
- Biomolecular Modelling Laboratory, Cancer Research UK London Research Institute, London, United Kingdom

53
Schramm SJ, Li SS, Jayaswal V, Fung DCY, Campain AE, Pang CNI, Scolyer RA, Yang YH, Mann GJ, Wilkins MR. Disturbed protein-protein interaction networks in metastatic melanoma are associated with worse prognosis and increased functional mutation burden. Pigment Cell Melanoma Res 2013; 26:708-22. [PMID: 23738911 DOI: 10.1111/pcmr.12126] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.1] [Received: 01/29/2013] [Accepted: 05/30/2013] [Indexed: 12/15/2022]
Abstract
For disseminated melanoma, new prognostic biomarkers and therapeutic targets are urgently needed. The organization of protein-protein interaction networks was assessed via the transcriptomes of four independent studies of metastatic melanoma and related to clinical outcome and MAP-kinase pathway mutations (BRAF/NRAS). We also examined patient outcome-related differences in a predicted network of microRNAs and their targets. The 32 hub genes with the most reproducible survival-related disturbances in co-expression with their protein partner genes included oncogenes and tumor suppressors, previously known correlates of prognosis, and other proteins not previously associated with melanoma outcome. Notably, this network-based gene set could classify patients according to clinical outcomes with 67-80% accuracy among cohorts. Reproducibly disturbed networks were also more likely to have a higher functional mutation burden than would be expected by chance. The disturbed regions of networks are therefore markers of clinically relevant, selectable tumor evolution in melanoma which may carry driver mutations.
Affiliation(s)
- Sarah-Jane Schramm
- Sydney Medical School, The University of Sydney at Westmead Millennium Institute for Medical Research, Sydney, NSW, Australia

54
Basu M, Bhattacharyya NP, Mohanty PK. Comparison of modules of wild type and mutant Huntingtin and TP53 protein interaction networks: implications in biological processes and functions. PLoS One 2013; 8:e64838. [PMID: 23741403 PMCID: PMC3669416 DOI: 10.1371/journal.pone.0064838] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.6] [Received: 11/24/2012] [Accepted: 04/19/2013] [Indexed: 02/07/2023] Open
Abstract
Disease-causing mutations usually change the interacting partners of mutant proteins. In this article, we propose that the biological consequences of mutation are directly related to the alteration of the corresponding protein-protein interaction networks (PPIN). Mutation of Huntingtin (HTT), which causes Huntington's disease (HD), and mutations of TP53, which are associated with different cancers, are studied as two example cases. We construct the PPIN of wild type and mutant proteins separately and identify the structural modules of each of the networks. The functional roles of these modules are then assessed by Gene Ontology (GO) enrichment analysis for biological processes (BPs). We find that a large number of significantly enriched () GO terms in mutant PPIN were absent in the wild type PPIN, indicating the gain of BPs due to mutation. Similarly, some of the GO terms enriched in wild type PPIN cease to exist in the modules of mutant PPIN, representing the loss. GO terms common in modules of mutant and wild type networks indicate both loss and gain of BPs. We further assign relevant biological function(s) to each module by classifying the enriched GO terms associated with it. It turns out that most of these biological functions in HTT networks are already known to be altered in HD and those of TP53 networks are altered in cancers. We argue that the gain of BPs, and of the corresponding biological functions, is due to new interacting partners acquired by mutant proteins. The methodology we adopt here could be applied to genetic diseases where mutations alter the ability of the protein to interact with other proteins.
Affiliation(s)
- Mahashweta Basu
- Theoretical Condensed Matter Physics Division, Saha Institute of Nuclear Physics, Bidhan Nagar, Kolkata, India
- Nitai P. Bhattacharyya
- Crystallography and Molecular Biology Division, Saha Institute of Nuclear Physics, Bidhan Nagar, Kolkata, India
- Pradeep K. Mohanty
- Theoretical Condensed Matter Physics Division, Saha Institute of Nuclear Physics, Bidhan Nagar, Kolkata, India

55
Lee CH, Kuo WH, Lin CC, Oyang YJ, Huang HC, Juan HF. MicroRNA-regulated protein-protein interaction networks and their functions in breast cancer. Int J Mol Sci 2013; 14:11560-606. [PMID: 23722663 PMCID: PMC3709748 DOI: 10.3390/ijms140611560] [Citation(s) in RCA: 48] [Impact Index Per Article: 4.4] [Received: 04/17/2013] [Revised: 05/21/2013] [Accepted: 05/22/2013] [Indexed: 12/13/2022] Open
Abstract
MicroRNAs, which are small endogenous RNA regulators, have been associated with various types of cancer. Breast cancer is a major health threat for women worldwide. Many miRNAs were reported to be associated with the progression and carcinogenesis of breast cancer. In this study, we aimed to discover novel breast cancer-related miRNAs and to elucidate their functions. First, we identified confident miRNA-target pairs by combining data from miRNA target prediction databases and expression profiles of miRNA and mRNA. Then, miRNA-regulated protein interaction networks (PINs) were constructed with confident pairs and known interaction data in the human protein reference database (HPRD). Finally, the functions of miRNA-regulated PINs were elucidated by functional enrichment analysis. From the results, we identified some previously reported breast cancer-related miRNAs and functions of the PINs, e.g., miR-125b, miR-125a, miR-21, and miR-497. Some novel miRNAs without known association to breast cancer were also found, and the putative functions of their PINs were also elucidated. These include miR-139 and miR-383. Furthermore, we validated our results by receiver operating characteristic (ROC) curve analysis using our miRNA expression profile data, gene expression-based outcome for breast cancer online (GOBO) survival analysis, and a literature search. Our results may provide new insights for research in breast cancer-associated miRNAs.
Affiliation(s)
- Chia-Hsien Lee
- Graduate Institute of Biomedical Electronics and Bioinformatics, National Taiwan University, Taipei 106, Taiwan
- Wen-Hong Kuo
- Department of Physiology, College of Medicine, National Taiwan University, Taipei 100, Taiwan
- Chen-Ching Lin
- Graduate Institute of Biomedical Electronics and Bioinformatics, National Taiwan University, Taipei 106, Taiwan
- Yen-Jen Oyang
- Graduate Institute of Biomedical Electronics and Bioinformatics, National Taiwan University, Taipei 106, Taiwan
- Hsuan-Cheng Huang
- Institute of Biomedical Informatics and Center for Systems and Synthetic Biology, National Yang-Ming University, Taipei 112, Taiwan
- Authors to whom correspondence should be addressed; Tel.: +886-2-2826-7357 (H.-C.H.); +886-2-3366-4536 (H.-F.J.); Fax: +886-2-2820-2508 (H.-C.H.); +886-2-2367-3374 (H.-F.J.)
- Hsueh-Fen Juan
- Graduate Institute of Biomedical Electronics and Bioinformatics, National Taiwan University, Taipei 106, Taiwan
- Institute of Molecular and Cellular Biology and Department of Life Science, National Taiwan University, Taipei 106, Taiwan
- Authors to whom correspondence should be addressed; Tel.: +886-2-2826-7357 (H.-C.H.); +886-2-3366-4536 (H.-F.J.); Fax: +886-2-2820-2508 (H.-C.H.); +886-2-2367-3374 (H.-F.J.)

56
Pattin KA, Moore JH. Addressing the Challenges of Detecting Epistasis in Genome-Wide Association Studies of Common Human Diseases Using Biological Expert Knowledge. Bioinformatics 2013. [DOI: 10.4018/978-1-4666-3604-0.ch038] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/09/2022] Open
Abstract
Recent technological developments in the field of genetics have given rise to an abundance of research tools, such as genome-wide genotyping, that allow researchers to conduct genome-wide association studies (GWAS) for detecting genetic variants that confer increased or decreased susceptibility to disease. However, discovering epistatic, or gene-gene, interactions in high dimensional datasets is a problem due to the computational complexity that results from the analysis of all possible combinations of single-nucleotide polymorphisms (SNPs). A recently explored approach to this problem employs biological expert knowledge, such as pathway or protein-protein interaction information, to guide an analysis by the selection or weighting of SNPs based on this knowledge. Narrowing the evaluation to gene combinations that have been shown to interact experimentally provides a biologically concise reason why those two genes may be detected together statistically. This chapter discusses the challenges of discovering epistatic interactions in GWAS and how biological expert knowledge can be used to facilitate genome-wide genetic studies.
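The combinatorial problem described in this abstract, and the idea of restricting the search to SNP pairs whose genes are known to interact, can be illustrated with a minimal Python sketch. The SNP identifiers, gene names, and the single interaction below are hypothetical toy values, and the function names are illustrative; nothing here is taken from the chapter itself.

```python
from itertools import combinations
from math import comb


def count_snp_sets(n_snps: int, order: int = 2) -> int:
    """Number of distinct SNP combinations of a given interaction order."""
    return comb(n_snps, order)


def knowledge_guided_pairs(snp_to_gene, interacting_genes):
    """Keep only SNP pairs whose genes form a known interacting pair.

    snp_to_gene: dict mapping SNP id -> gene name (hypothetical annotation).
    interacting_genes: iterable of (gene, gene) tuples from prior knowledge.
    """
    known = {frozenset(pair) for pair in interacting_genes}
    return [
        (a, b)
        for a, b in combinations(sorted(snp_to_gene), 2)
        if frozenset((snp_to_gene[a], snp_to_gene[b])) in known
    ]


# Toy data, for illustration only.
snp_to_gene = {"rs1": "GENE_A", "rs2": "GENE_B", "rs3": "GENE_C", "rs4": "GENE_A"}
interacting_genes = [("GENE_A", "GENE_B")]

exhaustive = count_snp_sets(len(snp_to_gene))  # all pairwise tests
guided = knowledge_guided_pairs(snp_to_gene, interacting_genes)

print(exhaustive, guided)
# A genome-wide panel of 500,000 SNPs would need this many pairwise tests:
print(count_snp_sets(500_000))
```

The exhaustive count grows quadratically with the number of SNPs for pairs and faster for higher orders, which is why a knowledge-based filter of this kind can shrink the search space so much.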

57
Uhart M, Bustos DM. Human 14-3-3 paralogs differences uncovered by cross-talk of phosphorylation and lysine acetylation. PLoS One 2013; 8:e55703. [PMID: 23418452 PMCID: PMC3572099 DOI: 10.1371/journal.pone.0055703] [Citation(s) in RCA: 41] [Impact Index Per Article: 3.7] [Received: 10/16/2012] [Accepted: 12/28/2012] [Indexed: 01/24/2023] Open
Abstract
The 14-3-3 protein family interacts with more than 700 different proteins in mammals, in part as a result of its specific phospho-serine/phospho-threonine binding activity. Upon binding to 14-3-3, the stability, subcellular localization and/or catalytic activity of the ligands are modified. Seven paralogs are strictly conserved in mammalian species. Although they were initially thought to be redundant, the number of studies showing specialization is growing. We created a protein-protein interaction network for 14-3-3, kinases and their substrates signaling in human cells. We included information on phosphorylation, acetylation and other PTM sites, obtaining a complete representation of the 14-3-3 binding partners and their modifications. Using a computational systems approach, we found that the networks of each 14-3-3 isoform are statistically different. It was remarkable to find that Tyr was the most phosphorylatable amino acid in domains of 14-3-3 epsilon partners. This, together with the over-representation of SH3 and Tyr_Kinase domains, suggests that epsilon could be involved particularly in growth factor receptor signaling pathways. We also found that within zeta's network, the number of acetylated partners (and the number of modified lysines) is significantly higher compared with each of the other isoforms. Our results imply previously unreported hidden differences between the interaction networks of the 14-3-3 isoforms. The phosphoproteome and lysine acetylome within each network revealed post-transcriptional regulation intertwining phosphorylation and lysine acetylation. A global understanding of these networks will help to predict what could occur when regulatory circuits become dysfunctional or are modified in response to external stimuli.
Affiliation(s)
- Marina Uhart
- Laboratorio de Biología Estructural y Celular de Modificaciones post-traduccionales, Instituto de Investigaciones Biotecnológicas-Instituto Tecnológico de Chascomus (IIB-INTECH), Universidad Nacional de San Martín (UNSAM), Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET), Int. Marino Km 8.2, Chascomus, Argentina
- Diego M. Bustos
- Laboratorio de Biología Estructural y Celular de Modificaciones post-traduccionales, Instituto de Investigaciones Biotecnológicas-Instituto Tecnológico de Chascomus (IIB-INTECH), Universidad Nacional de San Martín (UNSAM), Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET), Int. Marino Km 8.2, Chascomus, Argentina

58
Inder KL, Davis M, Hill MM. Ripples in the pond--using a systems approach to decipher the cellular functions of membrane microdomains. MOLECULAR BIOSYSTEMS 2013; 9:330-8. [PMID: 23322173 DOI: 10.1039/c2mb25300c] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.9] [Indexed: 02/03/2023]
Abstract
Membrane microdomains such as lipid rafts and caveolae regulate a myriad of cellular functions including cell signalling, protein trafficking, cell viability, and cell movement. They have been implicated in diseases such as cancer, diabetes and Alzheimer's disease, highlighting the essential role they play in cell processes. Despite much research and debate on the size, composition and dynamics of membrane microdomains, the molecular mechanism(s) of their action remain poorly understood. Most studies have dealt solely with the content and properties of the membrane microdomain as an entity in itself. However, recent work shows that membrane microdomain disruption has wide-ranging effects on other subcellular compartments, and on the cell as a whole. Hence we propose that a systems approach incorporating many cellular attributes, such as subcellular localisation, is required in order to understand the global impact of microdomains on cell function. Although analysis of sub-proteome changes already provides additional insight, we further propose biological network analysis of functional proteomics data to capture effects at the systems level. In this review, we highlight the use of protein-protein interaction networks and mixed networks to portray and visualize the relationships between proteins within and between subcellular fractions. Such a systems analysis will be required to improve our understanding of the full cellular function of membrane microdomains.

59
Adding protein context to the human protein-protein interaction network to reveal meaningful interactions. PLoS Comput Biol 2013; 9:e1002860. [PMID: 23300433 PMCID: PMC3536619 DOI: 10.1371/journal.pcbi.1002860] [Citation(s) in RCA: 63] [Impact Index Per Article: 5.7] [Received: 06/28/2012] [Accepted: 11/09/2012] [Indexed: 01/31/2023] Open
Abstract
Interactions of proteins regulate signaling, catalysis, gene expression and many other cellular functions. Therefore, characterizing the entire human interactome is a key effort in current proteomics research. This challenge is complicated by the dynamic nature of protein-protein interactions (PPIs), which are conditional on the cellular context: both interacting proteins must be expressed in the same cell and localized in the same organelle to meet. Additionally, interactions underlie a delicate control of signaling pathways, e.g. by post-translational modifications of the protein partners; hence, many diseases are caused by the perturbation of these mechanisms. Despite the high degree of cell-state specificity of PPIs, many interactions are measured under artificial conditions (e.g. yeast cells are transfected with human genes in yeast two-hybrid assays), and even when they are detected in a physiological context, this information is missing from the common PPI databases. To overcome these problems, we developed a method that assigns context information to PPIs inferred from various attributes of the interacting proteins: gene expression, functional and disease annotations, and inferred pathways. We demonstrate that context consistency correlates with the experimental reliability of PPIs, which allows us to generate high-confidence tissue- and function-specific subnetworks. We illustrate how these context-filtered networks are enriched in bona fide pathways and disease proteins to prove the ability of context-filters to highlight meaningful interactions with respect to various biological questions. We use this approach to study the lung-specific pathways used by the influenza virus, pointing to IRAK1, BHLHE40 and TOLLIP as potential regulators of influenza virus pathogenicity, and to study the signalling pathways that play a role in Alzheimer's disease, identifying a pathway involving the altered phosphorylation of the Tau protein. Finally, we provide the annotated human PPI network via a web frontend that allows the construction of context-specific networks in several ways.

Protein-protein interactions (PPIs) participate in virtually all biological processes. However, the PPI map is not static: the pairs of proteins that interact depend on the type of cell, the subcellular localization and modifications of the participating proteins, among many other factors. Therefore, it is important to understand the specific conditions under which a PPI happens. Unfortunately, experimental methods often do not provide this information or, even worse, measure PPIs under artificial conditions not found in biological systems. We developed a method to infer this missing information from properties of the interacting proteins, such as in which cell types the proteins are found, which functions they fulfill and whether they are known to play a role in disease. We show that PPIs for which we can infer conditions under which they happen have a higher experimental reliability. Also, our inference agrees well with known pathways and disease proteins. Since diseases usually affect specific cell types, we study PPI networks of influenza proteins in lung tissues and of Alzheimer's disease proteins in neural tissues. In both cases, we can highlight interesting interactions potentially playing a role in disease progression.

60
Pérez-Bercoff Å, Hudson CM, Conant GC. A conserved mammalian protein interaction network. PLoS One 2013; 8:e52581. [PMID: 23320073 PMCID: PMC3539715 DOI: 10.1371/journal.pone.0052581] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.9] [Received: 06/13/2012] [Accepted: 11/20/2012] [Indexed: 11/19/2022] Open
Abstract
Physical interactions between proteins mediate a variety of biological functions, including signal transduction, physical structuring of the cell and regulation. While extensive catalogs of such interactions are known from model organisms, their evolutionary histories are difficult to study given the lack of interaction data from phylogenetic outgroups. Using phylogenomic approaches, we infer an upper bound on the time of origin for a large set of human protein-protein interactions, showing that most such interactions appear relatively ancient, dating no later than the radiation of placental mammals. By analyzing paired alignments of orthologous and putatively interacting protein-coding genes from eight mammals, we find evidence for weak but significant co-evolution, as measured by relative selective constraint, between pairs of genes with interacting proteins. However, we find no strong evidence for shared instances of directional selection within an interacting pair. Finally, we use a network approach to show that the distribution of selective constraint across the protein interaction network is non-random, with a clear tendency for interacting proteins to share similar selective constraints. Collectively, the results suggest that, on the whole, protein interactions in mammals are under selective constraint, presumably due to their functional roles.
Affiliation(s)
- Åsa Pérez-Bercoff
- Smurfit Institute of Genetics, University of Dublin, Trinity College, Dublin, Ireland
- Corey M. Hudson
- Informatics Institute, University of Missouri, Columbia, Missouri, United States of America
- Gavin C. Conant
- Informatics Institute, University of Missouri, Columbia, Missouri, United States of America
- Division of Animal Sciences, University of Missouri, Columbia, Missouri, United States of America

61
De Biasio A, Blanco FJ. Proliferating Cell Nuclear Antigen Structure and Interactions. PROTEIN-NUCLEIC ACIDS INTERACTIONS 2013; 91:1-36. [DOI: 10.1016/b978-0-12-411637-5.00001-9] [Citation(s) in RCA: 57] [Impact Index Per Article: 5.2] [Indexed: 12/17/2022]

62
Vashisht S, Bagler G. An approach for the identification of targets specific to bone metastasis using cancer genes interactome and gene ontology analysis. PLoS One 2012; 7:e49401. [PMID: 23166660 PMCID: PMC3498148 DOI: 10.1371/journal.pone.0049401] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.3] [Received: 07/05/2012] [Accepted: 10/11/2012] [Indexed: 12/20/2022] Open
Abstract
Metastasis is one of the most enigmatic aspects of cancer pathogenesis and is a major cause of cancer-associated mortality. Secondary bone cancer (SBC) is a complex disease caused by metastasis of tumor cells from their primary site and is characterized by intricate interplay of molecular interactions. Identification of targets for multifactorial diseases such as SBC, the most frequent complication of breast and prostate cancers, is a challenge. Towards achieving our aim of identification of targets specific to SBC, we constructed a 'Cancer Genes Network', a representative protein interactome of cancer genes. Using graph theoretical methods, we obtained a set of key genes that are relevant for generic mechanisms of cancers and have a role in biological essentiality. We also compiled a curated dataset of 391 SBC genes from published literature which serves as a basis of ontological correlates of secondary bone cancer. Building on these results, we implement a strategy based on generic cancer genes, SBC genes and gene ontology enrichment method, to obtain a set of targets that are specific to bone metastasis. Through this study, we present an approach for probing one of the major complications in cancers, namely, metastasis. The results on genes that play generic roles in cancer phenotype, obtained by network analysis of 'Cancer Genes Network', have broader implications in understanding the role of molecular regulators in mechanisms of cancers. Specifically, our study provides a set of potential targets that are of ontological and regulatory relevance to secondary bone cancer.
Affiliation(s)
- Shikha Vashisht
- Biotechnology Division, Institute of Himalayan Bioresource Technology, Council of Scientific and Industrial Research, Palampur, India
- Ganesh Bagler
- Biotechnology Division, Institute of Himalayan Bioresource Technology, Council of Scientific and Industrial Research, Palampur, India

63
Tang H, Zhong F, Xie H. A quick guide to biomolecular network studies: construction, analysis, applications, and resources. Biochem Biophys Res Commun 2012; 424:7-11. [PMID: 22732414 DOI: 10.1016/j.bbrc.2012.06.085] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.8] [Received: 06/05/2012] [Accepted: 06/18/2012] [Indexed: 10/28/2022]
Abstract
Over the past decade, a rapid increase in network data, including signaling, transcription regulation, metabolic reaction, protein-protein interaction and genetic interaction data, has been observed. Many biological questions have been investigated by analyzing these diverse networks, providing new insights into biology. Networks also play an important role in disease studies, including disease gene screening and clinical diagnosis. Many databases and software tools have been developed to facilitate the storage, exchange, integration, and analysis of network data, and network analysis is becoming a routine procedure for biologists to infer biological information. In this review, several main aspects of network studies are discussed, including network construction, analysis, application, and resources.
Affiliation(s)
- Hailin Tang
- College of Mechanical & Electronic Engineering and Automatization, National University of Defense Technology, Changsha 410073, China

64
Garcia-Garcia J, Schleker S, Klein-Seetharaman J, Oliva B. BIPS: BIANA Interolog Prediction Server. A tool for protein-protein interaction inference. Nucleic Acids Res 2012; 40:W147-51. [PMID: 22689642 PMCID: PMC3394316 DOI: 10.1093/nar/gks553] [Citation(s) in RCA: 46] [Impact Index Per Article: 3.8] [Indexed: 01/03/2023] Open
Abstract
Protein–protein interactions (PPIs) play a crucial role in biology, and high-throughput experiments have greatly increased the coverage of known interactions. Still, the identification of inter- and intraspecies interactomes is far from complete. Experimental data can be complemented by the prediction of PPIs within an organism or between two organisms based on the known interactions of the orthologous genes of other organisms (interologs). Here, we present the BIANA (Biologic Interactions and Network Analysis) Interolog Prediction Server (BIPS), which offers a web-based interface to facilitate PPI predictions based on interolog information. BIPS benefits from the capabilities of the BIANA framework to integrate several PPI-related databases. Additional metadata can be used to improve the reliability of the predicted interactions. Sensitivity and specificity of the server have been calculated using known PPIs from different interactomes using a leave-one-out approach. The specificity is between 72 and 98%, whereas sensitivity varies between 1 and 59%, depending on the sequence identity cut-off used to calculate similarities between sequences. BIPS is freely accessible at http://sbi.imim.es/BIPS.php.
Affiliation(s)
- Javier Garcia-Garcia
- Structural Bioinformatics Laboratory (GRIB-IMIM), Universitat Pompeu Fabra, Barcelona Research Park of Biomedicine (PRBB), 08003 Barcelona, Catalonia, Spain
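The sensitivity and specificity figures quoted in the BIPS abstract follow the standard definitions for a binary predictor. The Python sketch below shows only those definitions applied to a toy set of protein pairs; it does not reproduce the server's leave-one-out procedure, and the protein labels, the known and predicted pairs, and the function names are all hypothetical.

```python
from itertools import combinations


def confusion_counts(predicted, known, candidates):
    """Count TP, FP, FN, TN for predicted pairs against known pairs.

    All three arguments are sets of frozensets, each holding two protein ids.
    """
    tp = len(predicted & known)
    fp = len(predicted - known)
    fn = len(known - predicted)
    tn = len(candidates - predicted - known)
    return tp, fp, fn, tn


def sensitivity(tp: int, fn: int) -> float:
    """Fraction of known interactions that were predicted."""
    return tp / (tp + fn)


def specificity(tn: int, fp: int) -> float:
    """Fraction of non-interacting pairs that were correctly left out."""
    return tn / (tn + fp)


def pair(a: str, b: str) -> frozenset:
    """Unordered protein pair."""
    return frozenset((a, b))


# Toy example: five hypothetical proteins give ten candidate pairs.
candidates = {frozenset(p) for p in combinations("ABCDE", 2)}
known = {pair("A", "B"), pair("A", "C"), pair("B", "D"), pair("C", "E")}
predicted = {pair("A", "B"), pair("A", "C"), pair("D", "E")}

tp, fp, fn, tn = confusion_counts(predicted, known, candidates)
print(tp, fp, fn, tn)
print(sensitivity(tp, fn), specificity(tn, fp))
```

In this toy case the predictor recovers half of the known pairs while wrongly adding one, which mirrors the trade-off the abstract reports between high specificity and lower, cut-off-dependent sensitivity.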

65
Tseng CW, Huang HC, Shih ACC, Chang YY, Hsu CC, Chang JY, Li WH, Juan HF. Revealing the anti-tumor effect of artificial miRNA p-27-5p on human breast carcinoma cell line T-47D. Int J Mol Sci 2012; 13:6352-6369. [PMID: 22754369 PMCID: PMC3382822 DOI: 10.3390/ijms13056352] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Received: 04/09/2012] [Revised: 05/09/2012] [Accepted: 05/18/2012] [Indexed: 02/06/2023] Open
Abstract
microRNAs (miRNAs) cause mRNA degradation or translation suppression of their target genes. Previous studies have found direct involvement of miRNAs in cancer initiation and progression. Artificial miRNAs, designed to target single or multiple genes of interest, provide a new therapeutic strategy for cancer. This study investigates the anti-tumor effect of a novel artificial miRNA, miR P-27-5p, on breast cancer. In this study, we reveal that miR P-27-5p downregulates the differential gene expressions associated with the protein modification process and regulation of cell cycle in T-47D cells. Introduction of this novel artificial miRNA, miR P-27-5p, into breast cell lines inhibits cell proliferation and induces the first “gap” phase (G1) cell cycle arrest in cancer cell lines but does not affect normal breast cells. We further show that miR P-27-5p targets the 3′-untranslated mRNA region (3′-UTR) of cyclin-dependent kinase 4 (CDK4) and reduces both the mRNA and protein level of CDK4, which in turn, interferes with phosphorylation of the retinoblastoma protein (RB1). Overall, our data suggest that the effects of miR p-27-5p on cell proliferation and G1 cell cycle arrest are through the downregulation of CDK4 and the suppression of RB1 phosphorylation. This study opens avenues for future therapies targeting breast cancer.
Affiliation(s)
- Chien-Wei Tseng
- Department of Life Science, Institute of Molecular and Cellular Biology, National Taiwan University, Taipei 106, Taiwan
- Hsuan-Cheng Huang
- Institute of Biomedical Informatics, Center for Systems and Synthetic Biology, National Yang-Ming University, Taipei 112, Taiwan
- Arthur Chun-Chieh Shih
- Institute of Information Science, Research Center for Information Technology Innovation, Academia Sinica, Taipei 115, Taiwan
- Ya-Ya Chang
- Department of Life Science, Institute of Molecular and Cellular Biology, National Taiwan University, Taipei 106, Taiwan
- Chung-Cheng Hsu
- Department of Life Science, Institute of Molecular and Cellular Biology, National Taiwan University, Taipei 106, Taiwan
- Jen-Yun Chang
- Department of Life Science, Institute of Molecular and Cellular Biology, National Taiwan University, Taipei 106, Taiwan
- Wen-Hsiung Li
- Biodiversity Research Center and Genomics Research Center, Academia Sinica, Taipei 115, Taiwan
- Department of Ecology and Evolution, University of Chicago, Chicago, IL 60637, USA
- Author to whom correspondence should be addressed; Tel.: +1-773-702-3104; Fax: +1-773-702-9740
- Hsueh-Fen Juan
- Department of Life Science, Institute of Molecular and Cellular Biology, National Taiwan University, Taipei 106, Taiwan
- Author to whom correspondence should be addressed; Tel.: +886-2-33664536; Fax: +886-2-23673374
|
66
|
Schleker S, Garcia-Garcia J, Klein-Seetharaman J, Oliva B. Prediction and comparison of Salmonella-human and Salmonella-Arabidopsis interactomes. Chem Biodivers 2012; 9:991-1018. [PMID: 22589098 PMCID: PMC3407687 DOI: 10.1002/cbdv.201100392] [Citation(s) in RCA: 31] [Impact Index Per Article: 2.6] [Indexed: 01/01/2023]
Abstract
Salmonellosis caused by Salmonella bacteria is a food-borne disease and a worldwide health threat causing millions of infections and thousands of deaths every year. This pathogen infects an unusually broad range of host organisms including humans and plants. A better understanding of the mechanisms of communication between Salmonella and its hosts requires identifying the interactions between Salmonella and host proteins. Protein-protein interactions (PPIs) are the fundamental building blocks of communication. Here, we utilize the prediction platform BIANA to obtain the putative Salmonella-human and Salmonella-Arabidopsis interactomes based on sequence and domain similarity to known PPIs. A gold standard list of Salmonella-host PPIs served to validate the quality of the human model. 24,726 and 10,926 PPIs comprising interactions between 38 and 33 Salmonella effectors and virulence factors with 9,740 human and 4,676 Arabidopsis proteins, respectively, were predicted. Putative hub proteins could be identified, and parallels between the two interactomes were discovered. This approach can provide insight into possible biological functions of so far uncharacterized proteins. The predicted interactions are available via a web interface at http://sbi.imim.es/web/SHIPREC.php, which allows filtering of the database according to parameters provided by the user to narrow down the list of suspected interactions.
Affiliation(s)
- Sylvia Schleker
- Forschungszentrum Jülich, Institute of Complex Systems (ICS-5), 52425 Jülich, Germany
- Javier Garcia-Garcia
- Structural Bioinformatics Group (GRIB-IMIM), Universitat Pompeu Fabra, Barcelona Research Park of Biomedicine (PRBB), Barcelona 08003, Catalonia, Spain (phone: +34 933 160 509; fax: +34 933 160 550)
- Judith Klein-Seetharaman
- Forschungszentrum Jülich, Institute of Complex Systems (ICS-5), 52425 Jülich, Germany
- Department of Structural Biology, University of Pittsburgh, Pittsburgh, PA 15260, USA (phone: +1 412 383 7325; fax: +1 412 648 8998)
- Baldo Oliva
- Structural Bioinformatics Group (GRIB-IMIM), Universitat Pompeu Fabra, Barcelona Research Park of Biomedicine (PRBB), Barcelona 08003, Catalonia, Spain (phone: +34 933 160 509; fax: +34 933 160 550)
|
67
|
Garcia-Garcia J, Bonet J, Guney E, Fornes O, Planas J, Oliva B. Networks of Protein-Protein Interactions: From Uncertainty to Molecular Details. Mol Inform 2012; 31:342-62. [PMID: 27477264 DOI: 10.1002/minf.201200005] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.8] [Received: 01/16/2012] [Accepted: 03/09/2012] [Indexed: 11/08/2022]
Abstract
Proteins are the bricks and mortar of cells. The work of proteins is structural and functional, as they are the principal element of the organization of the cell architecture, but they also play a relevant role in its metabolism and regulation. To perform all these functions, proteins need to interact with each other and with other bio-molecules, either to form complexes or to recognize precise targets of their action. For instance, a particular transcription factor may activate one gene or another depending on its interactions with other proteins and not only with DNA. Hence, the ability of a protein to interact with other bio-molecules, and the partners they have at each particular time and location can be crucial to characterize the role of a protein. Proteins rarely act alone; they rather constitute a mingled network of physical interactions or other types of relationships (such as metabolic and regulatory) or signaling cascades. In this context, understanding the function of a protein implies to recognize the members of its neighborhood and to grasp how they associate, both at the systemic and atomic level. The network of physical interactions between the proteins of a system, cell or organism, is defined as the interactome. The purpose of this review is to deepen the description of interactomes at different levels of detail: from the molecular structure of complexes to the global topology of the network of interactions. The approaches and techniques applied experimentally and computationally to attain each level are depicted. The limits of each technique and its integration into a model network, the challenges and actual problems of completeness of an interactome, and the reliability of the interactions are reviewed and summarized. Finally, the application of the current knowledge of protein-protein interactions on modern network medicine and protein function annotation is also explored.
Affiliation(s)
- Javier Garcia-Garcia
- Structural Bioinformatics Group, GRIB-IMIM, Universitat Pompeu Fabra, Barcelona Research Park of Biomedicine (PRBB), Catalonia, Spain
- Jaume Bonet
- Structural Bioinformatics Group, GRIB-IMIM, Universitat Pompeu Fabra, Barcelona Research Park of Biomedicine (PRBB), Catalonia, Spain
- Emre Guney
- Structural Bioinformatics Group, GRIB-IMIM, Universitat Pompeu Fabra, Barcelona Research Park of Biomedicine (PRBB), Catalonia, Spain
- Oriol Fornes
- Structural Bioinformatics Group, GRIB-IMIM, Universitat Pompeu Fabra, Barcelona Research Park of Biomedicine (PRBB), Catalonia, Spain
- Joan Planas
- Structural Bioinformatics Group, GRIB-IMIM, Universitat Pompeu Fabra, Barcelona Research Park of Biomedicine (PRBB), Catalonia, Spain
- Baldo Oliva
- Structural Bioinformatics Group, GRIB-IMIM, Universitat Pompeu Fabra, Barcelona Research Park of Biomedicine (PRBB), Catalonia, Spain
|
68
|
James K, Wipat A, Hallinan J. Is newer better?--evaluating the effects of data curation on integrated analyses in Saccharomyces cerevisiae. Integr Biol (Camb) 2012; 4:715-27. [PMID: 22526920 DOI: 10.1039/c2ib00123c] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Indexed: 11/21/2022]
Abstract
Recent high-throughput experiments have produced a wealth of heterogeneous datasets, each of which provides information about different aspects of the cell. Consequently, integration of diverse data types is essential in order to address many biological questions. The quality of any integrated analysis system is dependent upon the quality of its component data, and upon the Gold Standard data used to evaluate it. It is commonly assumed that the quality of data improves as databases grow and change, particularly for manually curated databases. However, the validity of this assumption can be questioned, given the constant changes in the data coupled with the high level of noise associated with high-throughput experimental techniques. One of the most powerful approaches to data integration is the use of Probabilistic Functional Integrated Networks (PFINs). Here, we systematically analyse the changes in four highly-curated and widely-used online databases and evaluate the extent to which these changes affect the protein function prediction performance of PFINs in the yeast Saccharomyces cerevisiae. We find that the global trend in network performance improves over time. Where individual areas of biology are concerned, however, the most recent files do not always produce the best results. Individual datasets have unique biases towards different biological processes and by selecting and integrating relevant datasets performance can be improved. When using any type of integrated system to answer a specific biological question careful selection of raw data and Gold Standard is vital, since the most recent data may not be the most appropriate.
Affiliation(s)
- Katherine James
- School of Computing Science, Newcastle University, Newcastle upon Tyne, NE1 7RU, United Kingdom
|
69
|
Massanet-Vila R, Padró T, Cardús A, Badimon L, Caminal P, Perera A. Analysis of incomplete gene expression dataset through protein-protein interaction information. Annu Int Conf IEEE Eng Med Biol Soc 2012; 2011:6845-8. [PMID: 22255911 DOI: 10.1109/iembs.2011.6091688] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/08/2022]
Abstract
This paper presents a graph-based method to analyze proteomic expression data. The method allows the prediction of the expression of genes not measured by the gene expression technology based on the local connectivity properties of the measured differentially expressed gene set. The prediction of the expression is computed jointly with the stability of this prediction as a function of the variation of the initial expressed set. The method is able to correctly predict one third of the proteins independently of variations in the selection of the initial set. The algorithm is validated through a Matrix-Assisted Laser Desorption/Ionization Time of Flight Mass Spectrometer (MALDI-TOF) protein expression experiment aimed at studying the protein expression patterns and post-translational modifications in human endothelial vascular cells exposed to atherosclerotic levels of Low Density Lipoproteins (LDL).
Affiliation(s)
- Raimon Massanet-Vila
- Department of ESAII, Technical University of Catalonia, Pau Gargallo 5, 08028 Barcelona, Spain.
|
70
|
Massanet-Vila R, Albert FF, Caminal P, Perera A. Network-based enrichment analysis of gene expression through protein-protein interaction data. Annu Int Conf IEEE Eng Med Biol Soc 2012; 2012:6317-6320. [PMID: 23367373 DOI: 10.1109/embc.2012.6347438] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/01/2023]
Abstract
High-throughput analysis of gene expression data is subject to technological and statistical issues that confuse the underlying expression-condition associations. In this contribution a network-based candidate gene prioritization strategy was applied to the enrichment of a publicly available gene expression dataset, focused on the study of the mechanosensitivity of genes exposed to altered pulmonary matrix stiffness. Results suggested that some genes which had not been taken into account in the original study could have an important role in the processes causing, or affected by, pulmonary fibrosis.
Affiliation(s)
- Raimon Massanet-Vila
- Dept. of Systems Engineering, Automatics and Industrial Informatics, Technical University of Catalonia (UPC), Pau Gargallo 5, 08028, Barcelona, Spain.
|
71
|
Davis MJ, Shin CJ, Jing N, Ragan MA. Rewiring the dynamic interactome. Mol Biosyst 2012; 8:2054-66, 2013. [DOI: 10.1039/c2mb25050k] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.9] [Indexed: 11/21/2022]
|
72
|
Kwofie SK, Schaefer U, Sundararajan VS, Bajic VB, Christoffels A. HCVpro: Hepatitis C virus protein interaction database. Infect Genet Evol 2011; 11:1971-7. [DOI: 10.1016/j.meegid.2011.09.001] [Citation(s) in RCA: 65] [Impact Index Per Article: 5.0] [Received: 05/13/2011] [Revised: 08/24/2011] [Accepted: 09/02/2011] [Indexed: 02/07/2023]
|
73
|
Protein network study of human AF4 reveals its central role in RNA Pol II-mediated transcription and in phosphorylation-dependent regulatory mechanisms. Biochem J 2011; 438:121-31. [PMID: 21574958 PMCID: PMC3174057 DOI: 10.1042/bj20101633] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.7] [Indexed: 11/30/2022]
Abstract
AF4 belongs to a family of proteins implicated in childhood lymphoblastic leukaemia, FRAXE (Fragile X E site) mental retardation and ataxia. AF4 is a transcriptional activator that is involved in transcriptional elongation. Although AF4 has been implicated in MLL (mixed-lineage leukaemia)-related leukaemogenesis, AF4-dependent physiological mechanisms have not been clearly defined. Proteins that interact with AF4 may also play important roles in mediating oncogenesis, and are potential targets for novel therapies. Using a functional proteomic approach involving tandem MS and bioinformatics, we identified 51 AF4-interacting proteins of various Gene Ontology categories. Approximately 60% participate in transcription regulatory mechanisms, including the Mediator complex in eukaryotic cells. In the present paper we report one of the first extensive proteomic studies aimed at elucidating AF4 protein cross-talk. Moreover, we found that the AF4 residues Thr220 and Ser212 are phosphorylated, which suggests that AF4 function depends on phosphorylation mechanisms. We also mapped the AF4-interaction site with CDK9 (cyclin-dependent kinase 9), which is a direct interactor crucial for the function and regulation of the protein. The findings of the present study significantly expand the number of putative members of the multiprotein complex formed by AF4, which is instrumental in promoting the transcription/elongation of specific genes in human cells.
|
74
|
Zoumaro-Djayoon AD, Heck AJR, Muñoz J. Targeted analysis of tyrosine phosphorylation by immuno-affinity enrichment of tyrosine phosphorylated peptides prior to mass spectrometric analysis. Methods 2011; 56:268-74. [PMID: 21945579 DOI: 10.1016/j.ymeth.2011.09.003] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.4] [Received: 05/31/2011] [Revised: 09/08/2011] [Accepted: 09/09/2011] [Indexed: 01/18/2023] Open
Abstract
Tyrosine phosphorylation is a key process that regulates seminal biological functions, hence, deregulation of this mechanism is an underlying cause of several diseases including cancer and immunological disorders. Due to its low abundance, tyrosine phosphorylation is typically under-represented in most of the global MS-based phosphoproteomic studies. Here, we describe a selective approach based on immuno-affinity purification using specific antibodies to enrich tyrosine phosphorylated peptides from a complex proteolytic digest. LC-MS/MS analysis is subsequently used for peptide identification allowing the exact localization of the phosphorylated residue within the sequence. Using this approach more than 1000 non-redundant phosphotyrosine peptides can be identified in less than 6h of MS analysis, reflecting the high sensitivity and specificity of the technique. The identified tyrosine phosphorylated peptides can be used to study different biological aspects of tyrosine signaling and disease.
Affiliation(s)
- Adja D Zoumaro-Djayoon
- Biomolecular Mass Spectrometry and Proteomics Group, Bijvoet Centre for Biomolecular Research and Utrecht Institute for Pharmaceutical Sciences, University of Utrecht, Padualaan 8, 3584 CH Utrecht, The Netherlands
|
75
|
Lopes TJS, Schaefer M, Shoemaker J, Matsuoka Y, Fontaine JF, Neumann G, Andrade-Navarro MA, Kawaoka Y, Kitano H. Tissue-specific subnetworks and characteristics of publicly available human protein interaction databases. Bioinformatics 2011; 27:2414-21. [PMID: 21798963 DOI: 10.1093/bioinformatics/btr414] [Citation(s) in RCA: 41] [Impact Index Per Article: 3.2] [Indexed: 11/14/2022]
Abstract
MOTIVATION: Protein-protein interaction (PPI) databases are widely used tools to study cellular pathways and networks; however, there are several databases available that still do not account for cell type-specific differences. Here, we evaluated the characteristics of six interaction databases, incorporated tissue-specific gene expression information and finally, investigated if the most popular proteins of scientific literature are involved in good quality interactions. RESULTS: We found that the evaluated databases are comparable in terms of node connectivity (i.e. proteins with few interaction partners also have few interaction partners in other databases), but may differ in the identity of interaction partners. We also observed that the incorporation of tissue-specific expression information significantly altered the interaction landscape and finally, we demonstrated that many of the most intensively studied proteins are engaged in interactions associated with low confidence scores. In summary, interaction databases are valuable research tools but may lead to different predictions on interactions or pathways. The accuracy of predictions can be improved by incorporating datasets on organ- and cell type-specific gene expression, and by obtaining additional interaction evidence for the most 'popular' proteins. CONTACT: kitano@sbi.jp SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
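One way to picture the tissue-specific filtering described in this abstract is the following minimal sketch. It assumes the simplest possible rule (keep an interaction only when both partners are expressed in the tissue), which the abstract does not confirm as the authors' procedure. The function names `tissue_subnetwork` and `degree`, and the toy protein labels, are hypothetical.

```python
from collections import Counter


def tissue_subnetwork(interactions, expressed_proteins):
    """Keep only interactions whose two partners are both expressed.

    Assumed rule for illustration; not taken from the cited paper.
    """
    expressed = set(expressed_proteins)
    return [(a, b) for a, b in interactions if a in expressed and b in expressed]


def degree(interactions):
    """Count interaction partners per protein (node connectivity)."""
    counts = Counter()
    for a, b in interactions:
        counts[a] += 1
        counts[b] += 1
    return counts


# Toy, made-up network and expression set (placeholder labels, not real data).
ppi = [("A", "B"), ("A", "C"), ("B", "D"), ("C", "E")]
expressed_in_tissue = {"A", "B", "C"}

tissue_ppi = tissue_subnetwork(ppi, expressed_in_tissue)
print(tissue_ppi)
print(degree(ppi)["A"], degree(tissue_ppi)["A"])
```

Comparing `degree` on the full and filtered edge lists shows how node connectivity can change once expression information is applied.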
Affiliation(s)
- Tiago J S Lopes
- JST ERATO KAWAOKA Infection-induced Host Responses Project, Tokyo, Japan
|
76
|
Goel R, Muthusamy B, Pandey A, Prasad TSK. Human protein reference database and human proteinpedia as discovery resources for molecular biotechnology. Mol Biotechnol 2011; 48:87-95. [PMID: 20927658 DOI: 10.1007/s12033-010-9336-8] [Citation(s) in RCA: 65] [Impact Index Per Article: 5.0] [Indexed: 01/22/2023]
Abstract
In the recent years, research in molecular biotechnology has transformed from being small scale studies targeted at a single or a small set of molecule(s) into a combination of high throughput discovery platforms and extensive validations. Such a discovery platform provided an unbiased approach which resulted in the identification of several novel genetic and protein biomarkers. High throughput nature of these investigations coupled with higher sensitivity and specificity of Next Generation technologies provided qualitatively and quantitatively richer biological data. These developments have also revolutionized biological research and speed of data generation. However, it is becoming difficult for individual investigators to directly benefit from this data because they are not easily accessible. Data resources became necessary to assimilate, store and disseminate information that could allow future discoveries. We have developed two resources--Human Protein Reference Database (HPRD) and Human Proteinpedia, which integrate knowledge relevant to human proteins. A number of protein features including protein-protein interactions, post-translational modifications, subcellular localization, and tissue expression, which have been studied using different strategies were incorporated in these databases. Human Proteinpedia also provides a portal for community participation to annotate and share proteomic data and uses HPRD as the scaffold for data processing. Proteomic investigators can even share unpublished data in Human Proteinpedia, which provides a meaningful platform for data sharing. As proteomic information reflects a direct view of cellular systems, proteomics is expected to complement other areas of biology such as genomics, transcriptomics, molecular biology, cloning, and classical genetics in understanding the relationships among multiple facets of biological systems.
Affiliation(s)
- Renu Goel
- Institute of Bioinformatics, International Technology Park, Bangalore 560066, India
|
77
|
Bell L, Chowdhary R, Liu JS, Niu X, Zhang J. Integrated bio-entity network: a system for biological knowledge discovery. PLoS One 2011; 6:e21474. [PMID: 21738677 PMCID: PMC3124513 DOI: 10.1371/journal.pone.0021474] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.6] [Received: 02/28/2011] [Accepted: 06/01/2011] [Indexed: 01/26/2023] Open
Abstract
A significant part of our biological knowledge is centered on relationships between biological entities (bio-entities) such as proteins, genes, small molecules, pathways, gene ontology (GO) terms and diseases. Accumulated at an increasing speed, the information on bio-entity relationships is archived in different forms at scattered places. Most of such information is buried in scientific literature as unstructured text. Organizing heterogeneous information in a structured form not only facilitates study of biological systems using integrative approaches, but also allows discovery of new knowledge in an automatic and systematic way. In this study, we performed a large scale integration of bio-entity relationship information from both databases containing manually annotated, structured information and automatic information extraction of unstructured text in scientific literature. The relationship information we integrated in this study includes protein–protein interactions, protein/gene regulations, protein–small molecule interactions, protein–GO relationships, protein–pathway relationships, and pathway–disease relationships. The relationship information is organized in a graph data structure, named integrated bio-entity network (IBN), where the vertices are the bio-entities and edges represent their relationships. Under this framework, graph theoretic algorithms can be designed to perform various knowledge discovery tasks. We designed breadth-first search with pruning (BFSP) and most probable path (MPP) algorithms to automatically generate hypotheses—the indirect relationships with high probabilities in the network. We show that IBN can be used to generate plausible hypotheses, which not only help to better understand the complex interactions in biological systems, but also provide guidance for experimental designs.
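To make the "most probable path" idea concrete, here is a minimal sketch, not the authors' MPP implementation. It assumes an undirected graph whose edges carry independent probabilities in (0, 1], so that maximizing the product of edge probabilities equals running Dijkstra's shortest-path search on weights of -log(p). The function name `most_probable_path` and the toy edge list are hypothetical.

```python
import heapq
import math


def most_probable_path(edges, source, target):
    """Return (probability, path) of the highest-probability path.

    edges: iterable of (u, v, p) with 0 < p <= 1 (assumed independent).
    Returns (0.0, []) when target is unreachable from source.
    """
    graph = {}
    for u, v, p in edges:
        graph.setdefault(u, []).append((v, p))
        graph.setdefault(v, []).append((u, p))

    best = {source: 0.0}        # lowest accumulated -log(p) per node
    prev = {}                   # predecessor on the best path found so far
    heap = [(0.0, source)]
    while heap:
        cost, node = heapq.heappop(heap)
        if cost > best.get(node, math.inf):
            continue            # stale heap entry
        if node == target:
            break
        for nbr, p in graph.get(node, []):
            if p <= 0.0:
                continue        # an impossible edge can never be on a path
            new_cost = cost - math.log(p)
            if new_cost < best.get(nbr, math.inf):
                best[nbr] = new_cost
                prev[nbr] = node
                heapq.heappush(heap, (new_cost, nbr))

    if target not in best:
        return 0.0, []
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    path.reverse()
    return math.exp(-best[target]), path


# Toy, made-up relationships between placeholder bio-entities.
toy_edges = [("A", "B", 0.9), ("B", "C", 0.8), ("A", "C", 0.5)]
prob, path = most_probable_path(toy_edges, "A", "C")
print(path, round(prob, 2))
```

In the toy graph the indirect route A-B-C (0.9 × 0.8) outscores the direct edge A-C (0.5), which is the kind of indirect relationship such a search would surface as a hypothesis.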
Affiliation(s)
- Lindsey Bell
- Department of Statistics, Florida State University, Tallahassee, Florida, United States of America
|
78
|
Li Z, Li F, Ni M, Li P, Bo X, Wang S. Characterization the regulation of herpesvirus miRNAs from the view of human protein interaction network. BMC Syst Biol 2011; 5:93. [PMID: 21668952 PMCID: PMC3125315 DOI: 10.1186/1752-0509-5-93] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Received: 01/21/2011] [Accepted: 06/13/2011] [Indexed: 11/10/2022]
Abstract
Background: miRNAs are a class of non-coding RNA molecules that play crucial roles in the regulation of virus-host interactions. The ever-increasing data of known viral miRNAs and human protein interaction network (PIN) has made it possible to study the targeting characteristics of viral miRNAs in the context of these networks. Results: We performed topological analysis to explore the targeting propensities of herpesvirus miRNAs from the view of human PIN and found that (1) herpesvirus miRNAs significantly target more hubs, moreover, compared with non-hubs (non-bottlenecks), hubs (bottlenecks) are targeted by much more virus miRNAs and virus types. (2) There are significant differences in the degree and betweenness centrality between common and specific targets, specifically we observed a significant positive correlation between virus types targeting these nodes and the proportion of hubs, and (3) K-core and ER analysis determined that common targets are closer to the global PIN center. Compared with random conditions, the giant connected component (GCC) and the density of the sub-network formed by common targets have significantly higher values, indicating the module characteristic of these targets. Conclusions: Herpesvirus miRNAs preferentially target hubs and bottlenecks. There are significant differences between common and specific targets. Moreover, common targets are more intensely connected and occupy the central part of the network. These results will help unravel the complex mechanism of herpesvirus-host interactions and may provide insight into the development of novel anti-herpesvirus drugs.
Affiliation(s)
- Zhenpeng Li
- Department of Biotechnology, Beijing Institute of Radiation Medicine, No. 27, Taiping Road, Haidian District, Beijing 100850, China
|
79
|
Rinaldi F, Kaljurand K, Sætre R. Terminological resources for text mining over biomedical scientific literature. Artif Intell Med 2011; 52:107-14. [PMID: 21652190 DOI: 10.1016/j.artmed.2011.04.011] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.8] [Received: 07/09/2010] [Revised: 04/18/2011] [Accepted: 04/18/2011] [Indexed: 11/30/2022]
Abstract
OBJECTIVE: We present a combined terminological resource for text mining over biomedical literature. The purpose of the resource is to allow the detection of mentions of specific biological entities in scientific publications, and their grounding to widely accepted identifiers. This is an essential process, useful in itself, and necessary as an intermediate step for almost every type of complex text mining application. METHODS: We discuss some of the properties of the terminology for this domain, in particular the degree of ambiguity, which constitutes a peculiar problem for text mining applications. Without a correct recognition and disambiguation of the domain entities no reliable results can be produced. RESULTS: We also discuss an application that makes use of the resulting terminological knowledge base. We annotate an existing corpus of sentences about protein interactions. The annotation consists of a normalization step that matches the terms in our resource with their actual representation in the corpus, and a disambiguation step that resolves the ambiguity of matched terms. CONCLUSION: In this paper we present a large terminological resource, compiled through the aggregation of a number of different manually curated sources. We discuss the lexical properties of such resources, specifically the degree of ambiguity of the terms, and we inspect the causes of such ambiguity, in particular for protein names. This information is of vital importance for the implementation of an efficient term normalization and grounding algorithm.
Affiliation(s)
- Fabio Rinaldi
- Institute of Computational Linguistics, University of Zurich, Binzmühlestrasse 14, CH-8050 Zurich, Switzerland.
|
80
|
Tseng CW, Yang JC, Chen CN, Huang HC, Chuang KN, Lin CC, Lai HS, Lee PH, Chang KJ, Juan HF. Identification of 14-3-3β in human gastric cancer cells and its potency as a diagnostic and prognostic biomarker. Proteomics 2011; 11:2423-39. [PMID: 21598387 DOI: 10.1002/pmic.201000449] [Citation(s) in RCA: 33] [Impact Index Per Article: 2.5] [Received: 07/20/2010] [Revised: 03/07/2011] [Accepted: 03/08/2011] [Indexed: 02/06/2023]
Abstract
Gastric cancer is the second most common cause of cancer deaths worldwide and due to its poor prognosis, it is important that specific biomarkers are identified to enable its early detection. Through 2-D gel electrophoresis and MALDI-TOF-TOF-based proteomics approaches, we found that 14-3-3β, which was one of the proteins that were differentially expressed by 5-fluorouracil-treated gastric cancer SC-M1 cells, was upregulated in gastric cancer cells. 14-3-3β levels in tissues and serum were further validated in gastric cancer patients and controls. The results showed that 14-3-3β levels were elevated in tumor tissues (n=40) in comparison to normal tissues (n=40; p<0.01), and serum 14-3-3β levels in cancer patients (n=145) were also significantly higher than those in controls (n=63; p<0.0001). Elevated serum 14-3-3β levels highly correlated with the number of lymph node metastases, tumor size and a reduced survival rate. Moreover, overexpression of 14-3-3β enhanced the growth, invasiveness and migratory activities of tumor cells. Twenty-eight proteins involved in anti-apoptosis and tumor progression were also found to be differentially expressed in 14-3-3β-overexpressing gastric cancer cells. Overall, these results highlight the significance of 14-3-3β in gastric cancer cell progression and suggest that it has the potential to be used as a diagnostic and prognostic biomarker in gastric cancer.
Affiliation(s)
- Chien-Wei Tseng
- Department of Life Science, Institute of Molecular and Cellular Biology, National Taiwan University, Taipei, Taiwan
|
81
|
Sualp M, Can T. Using network context as a filter for miRNA target prediction. Biosystems 2011; 105:201-9. [PMID: 21524683 DOI: 10.1016/j.biosystems.2011.04.002] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.8] [Received: 12/04/2010] [Revised: 04/08/2011] [Accepted: 04/08/2011] [Indexed: 01/04/2023]
Affiliation(s)
- M Sualp
- Department of Computer Engineering, Middle East Technical University, Ankara, Turkey.
|
82
|
Abstract
Current limitations in proteome analysis by high-throughput mass spectrometry (MS) approaches have sometimes led to incomplete (or inconclusive) data sets being published or unpublished. In this work, we used an iTRAQ reference data set on hepatocellular carcinoma (HCC) to design a two-stage functional analysis pipeline to widen and improve the proteome coverage and, subsequently, to unveil the molecular changes that occur during HCC progression in human tumorous tissue. The first stage involved functional cluster analysis by incorporating an expansion step on a cleaned integrated network. The second stage used an in-house developed pathway database where recovery of shared neighbors was followed by pathway enrichment analysis. In the original MS data set, over 500 proteins were detected from the tumors of 12 male patients, but in this paper we reported an additional 1000 proteins after application of our bioinformatics pipeline. Through an integrative effort of network cleaning, community finding methods, and network analysis, we also uncovered several biologically interesting clusters implicated in HCC. We established that HCC transition from a moderate to poor stage involved densely connected clusters comprising PCNA, XRCC5, XRCC6, PARP1, PRKDC, and WRN. From our pathway enrichment analyses, it appeared that the HCC moderate stage, unlike the poor stage, is enriched in proteins involved in immune responses, thus suggesting the acquisition of immuno-evasion. Our strategy illustrates how an original oncoproteome could be expanded to one of a larger dynamic range where current technology limitations prevent/limit comprehensive proteome characterization. Comprehensive proteome coverage by mass spectrometry remains a challenge. We used an integrated analysis pipeline to improve proteome coverage and functional analysis of hepatocellular carcinoma progression. The expanded proteome includes low-abundance proteins involved in transcription and signaling. With it, we established that the HCC transition from the moderate to the poor stage involved densely connected clusters, implicating DNA repair and immune dysregulation.
|
83
|
Wang YC, Chen BS. A network-based biomarker approach for molecular investigation and diagnosis of lung cancer. BMC Med Genomics 2011; 4:2. [PMID: 21211025 PMCID: PMC3027087 DOI: 10.1186/1755-8794-4-2] [Citation(s) in RCA: 58] [Impact Index Per Article: 4.5] [Received: 07/29/2010] [Accepted: 01/06/2011] [Indexed: 12/24/2022] Open
Abstract
Background Lung cancer is the leading cause of cancer deaths worldwide. Many studies have investigated the carcinogenic process and identified the biomarkers for signature classification. However, based on the research dedicated to this field, there is no highly sensitive network-based method for carcinogenesis characterization and diagnosis from the systems perspective. Methods In this study, a systems biology approach integrating microarray gene expression profiles and protein-protein interaction information was proposed to develop a network-based biomarker for molecular investigation into the network mechanism of lung carcinogenesis and diagnosis of lung cancer. The network-based biomarker consists of two protein association networks constructed for cancer samples and non-cancer samples. Results Based on the network-based biomarker, a total of 40 significant proteins in lung carcinogenesis were identified with carcinogenesis relevance values (CRVs). In addition, the network-based biomarker, acting as the screening test, proved to be effective in diagnosing smokers with signs of lung cancer. Conclusions A network-based biomarker using constructed protein association networks is a useful tool to highlight the pathways and mechanisms of the lung carcinogenic process and, more importantly, provides potential therapeutic targets to combat cancer.
Affiliation(s)
- Yu-Chao Wang
- Laboratory of Control and Systems Biology, Department of Electrical Engineering, National Tsing Hua University, Hsinchu 30013, Taiwan
|
84
|
Zhang GL, DeLuca DS, Brusic V. Database resources for proteomics-based analysis of cancer. Methods Mol Biol 2011; 723:349-64. [PMID: 21370076 DOI: 10.1007/978-1-61779-043-0_22] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Indexed: 12/31/2022]
Abstract
Biological/bioinformatics databases are essential for medical and biological studies. They integrate and organize biologically related information in a structured format and provide researchers with easy access to a variety of relevant data. This review presents an overview of publicly available databases relevant to proteomics studies in cancer research. They include gene/protein expression databases, gene mutation and single nucleotide polymorphisms databases, tumor antigen databases, protein-protein interaction, and biological pathway databases. Automated information retrieval from these databases enables efficient large-scale proteomics data analysis.
Affiliation(s)
- Guang Lan Zhang
- Cancer Vaccine Center, Department of Medical Oncology, Dana-Farber Cancer Institute, Harvard Medical School, Boston, MA, USA
|
85
|
Doderer MS, Yoon K, Robbins KA. SIDEKICK: Genomic data driven analysis and decision-making framework. BMC Bioinformatics 2010; 11:611. [PMID: 21192813 PMCID: PMC3022632 DOI: 10.1186/1471-2105-11-611] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Received: 06/16/2010] [Accepted: 12/30/2010] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Scientists striving to unlock mysteries within complex biological systems face myriad barriers in effectively integrating available information to enhance their understanding. While experimental techniques and available data sources are rapidly evolving, useful information is dispersed across a variety of sources, and sources of the same information often do not use the same format or nomenclature. To harness these expanding resources, scientists need tools that bridge nomenclature differences and allow them to integrate, organize, and evaluate the quality of information without extensive computation. RESULTS Sidekick, a genomic data driven analysis and decision making framework, is a web-based tool that provides a user-friendly intuitive solution to the problem of information inaccessibility. Sidekick enables scientists without training in computation and data management to pursue answers to research questions like "What are the mechanisms for disease X" or "Does the set of genes associated with disease X also influence other diseases." Sidekick enables the process of combining heterogeneous data, finding and maintaining the most up-to-date data, evaluating data sources, quantifying confidence in results based on evidence, and managing the multi-step research tasks needed to answer these questions. We demonstrate Sidekick's effectiveness by showing how to accomplish a complex published analysis in a fraction of the original time with no computational effort using Sidekick. CONCLUSIONS Sidekick is an easy-to-use web-based tool that organizes and facilitates complex genomic research, allowing scientists to explore genomic relationships and formulate hypotheses without computational effort. Possible analysis steps include gene list discovery, gene-pair list discovery, various enrichments for both types of lists, and convenient list manipulation. 
Further, Sidekick's ability to characterize pairs of genes offers new ways to approach genomic analysis that traditional single gene lists do not, particularly in areas such as interaction discovery.
Affiliation(s)
- Mark S Doderer
- Department of Computer Science, The University of Texas at San Antonio, San Antonio, TX 78249, USA.
|
86
|
Turinsky AL, Razick S, Turner B, Donaldson IM, Wodak SJ. Literature curation of protein interactions: measuring agreement across major public databases. Database: The Journal of Biological Databases and Curation 2010; 2010:baq026. [PMID: 21183497 PMCID: PMC3011985 DOI: 10.1093/database/baq026] [Citation(s) in RCA: 51] [Impact Index Per Article: 3.6] [Indexed: 01/20/2023]
Abstract
Literature curation of protein interaction data faces a number of challenges. Although curators increasingly adhere to standard data representations, the data that various databases actually record from the same published information may differ significantly. Some of the reasons underlying these differences are well known, but their global impact on the interactions collectively curated by major public databases has not been evaluated. Here we quantify the agreement between curated interactions from 15 471 publications shared across nine major public databases. Results show that on average, two databases fully agree on 42% of the interactions and 62% of the proteins curated from the same publication. Furthermore, a sizable fraction of the measured differences can be attributed to divergent assignments of organism or splice isoforms, different organism focus and alternative representations of multi-protein complexes. Our findings highlight the impact of divergent curation policies across databases, and should be relevant to both curators and data consumers interested in analyzing protein-interaction data generated by the scientific community. Database URL: http://wodaklab.org/iRefWeb
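As a rough illustration of how agreement between two databases on a shared publication could be scored, the sketch below uses the Sørensen–Dice coefficient over sets of unordered protein pairs. The abstract does not say which similarity measure the authors used, so the Dice choice is an assumption, and the database names, publication IDs, and protein labels (`db_a`, `pub1`, `P1`, ...) are hypothetical toy values.

```python
from itertools import combinations
from statistics import mean


def interaction(p, q):
    """Represent a binary interaction as an unordered pair of protein IDs."""
    return frozenset((p, q))


def dice(a, b):
    """Sørensen–Dice similarity of two sets (1.0 when both are empty)."""
    if not a and not b:
        return 1.0
    return 2 * len(a & b) / (len(a) + len(b))


def mean_pairwise_agreement(curation):
    """Average Dice score over every database pair and every publication
    that both databases curated.

    curation: {database_name: {publication_id: set of interactions}}
    Returns None when no two databases share a publication.
    """
    scores = []
    for db1, db2 in combinations(sorted(curation), 2):
        shared_pubs = curation[db1].keys() & curation[db2].keys()
        for pub in shared_pubs:
            scores.append(dice(curation[db1][pub], curation[db2][pub]))
    return mean(scores) if scores else None


# Hypothetical toy data: two databases curating the same publication differently.
curation = {
    "db_a": {"pub1": {interaction("P1", "P2"), interaction("P2", "P3")}},
    "db_b": {
        "pub1": {interaction("P2", "P1")},
        "pub2": {interaction("P4", "P5")},
    },
}

print(mean_pairwise_agreement(curation))
```

Only `pub1` is shared here, so the score reflects one overlapping interaction out of three recorded in total. The same function could be applied to sets of protein IDs instead of interactions to obtain a protein-level agreement figure.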
Affiliation(s)
- Andrei L Turinsky
- Molecular Structure and Function Program, Hospital for Sick Children, 555 University Avenue, Toronto, Ontario, Canada
|
87
|
Agrawal P, Yu K, Salomon AR, Sedivy JM. Proteomic profiling of Myc-associated proteins. Cell Cycle 2010; 9:4908-21. [PMID: 21150319 DOI: 10.4161/cc.9.24.14199] [Citation(s) in RCA: 53] [Impact Index Per Article: 3.8] [Indexed: 12/16/2022] Open
Abstract
Mammalian c-Myc is a member of a small family of three closely related transcription factors. The Myc family of proto-oncogenes are among the most potent activators of tumorigenesis, and are frequently overexpressed in diverse cancers. c-Myc has an unusually broad array of regulatory functions, which include, in addition to roles in the cell cycle and apoptosis, effects on a variety of metabolic functions, cell differentiation, senescence, and stem cell maintenance. A significant number of c-Myc interacting proteins have already been defined, but it is widely believed that the c-Myc interactome is vastly larger than currently documented. In addition to interactions with components of the transcription machinery, transcription independent nuclear interactions with the DNA replication and RNA processing pathways have been reported. Cytoplasmic roles of c-Myc have also been recently substantiated. Recent advances in proteomics have opened new possibilities for the isolation of protein complexes under native conditions and confidently identifying the components using ultrasensitive, high mass accuracy and high resolution mass spectrometry techniques. In this communication we report a new tandem affinity purification (TAP) c-Myc interaction screen that employed new cell lines with near-physiological levels of c-Myc expression with multi-dimensional protein identification techniques (MudPIT) for the detection and quantification of proteins. Both label-free and the recently developed stable isotope labeling with amino acids in cell culture (SILAC) methodologies were used. Combined data from multiple biological replicates provided a dataset of 418 non-redundant proteins, 389 of which are putative novel interactors. This new information should significantly advance our understanding of this interesting and important master regulator.
Affiliation(s)
- Pooja Agrawal
- Department of Molecular Biology, Cell Biology and Biochemistry, Brown University, Providence, RI, USA
|
88
|
Schaefer U, Schmeier S, Bajic VB. TcoF-DB: dragon database for human transcription co-factors and transcription factor interacting proteins. Nucleic Acids Res 2010; 39:D106-10. [PMID: 20965969 PMCID: PMC3013796 DOI: 10.1093/nar/gkq945] [Citation(s) in RCA: 55] [Impact Index Per Article: 3.9] [Indexed: 01/07/2023] Open
Abstract
The initiation and regulation of transcription in eukaryotes is complex and involves a large number of transcription factors (TFs), which are known to bind to the regulatory regions of eukaryotic DNA. Apart from TF–DNA binding, protein–protein interaction involving TFs is an essential component of the machinery facilitating transcriptional regulation. Proteins that interact with TFs in the context of transcription regulation but do not bind to the DNA themselves, we consider transcription co-factors (TcoFs). The influence of TcoFs on transcriptional regulation and initiation, although indirect, has been shown to be significant with the functionality of TFs strongly influenced by the presence of TcoFs. While the role of TFs and their interaction with regulatory DNA regions has been well-studied, the association between TFs and TcoFs has so far been given less attention. Here, we present a resource that is comprised of a collection of human TFs and the TcoFs with which they interact. Other proteins that have a proven interaction with a TF, but are not considered TcoFs are also included. Our database contains 157 high-confidence TcoFs and additionally 379 hypothetical TcoFs. These have been identified and classified according to the type of available evidence for their involvement in transcriptional regulation and their presence in the cell nucleus. We have divided TcoFs into four groups, one of which contains high-confidence TcoFs and three others contain TcoFs which are hypothetical to different extents. We have developed the Dragon Database for Human Transcription Co-Factors and Transcription Factor Interacting Proteins (TcoF-DB). A web-based interface for this resource can be freely accessed at http://cbrc.kaust.edu.sa/tcof/ and http://apps.sanbi.ac.za/tcof/.
Affiliation(s)
- Ulf Schaefer
- Computational Bioscience Research Center, 4700 King Abdullah University of Science and Technology, Thuwal 23955-6900, Kingdom of Saudi Arabia
|
89
|
Rajasekaran S, Merlin JC, Kundeti V, Mi T, Oommen A, Vyas J, Alaniz I, Chung K, Chowdhury F, Deverasatty S, Irvey TM, Lacambacal D, Lara D, Panchangam S, Rathnayake V, Watts P, Schiller MR. A computational tool for identifying minimotifs in protein-protein interactions and improving the accuracy of minimotif predictions. Proteins 2010; 79:153-64. [PMID: 20938975 DOI: 10.1002/prot.22868] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.6] [Received: 04/23/2010] [Revised: 07/22/2010] [Accepted: 08/13/2010] [Indexed: 01/12/2023]
Abstract
Protein-protein interactions are important to understanding cell functions; however, our theoretical understanding is limited. There is a general discontinuity between the well-accepted physical and chemical forces that drive protein-protein interactions and the large collections of identified protein-protein interactions in various databases. Minimotifs are short functional peptide sequences that provide a basis to bridge this gap in knowledge. However, there is no systematic way to study minimotifs in the context of protein-protein interactions or vice versa. Here we have engineered a set of algorithms that can be used to identify minimotifs in known protein-protein interactions and implemented this for use by scientists in Minimotif Miner. By globally testing these algorithms on verified data and on 100 individual proteins as test cases, we demonstrate the utility of these new computation tools. This tool also can be used to reduce false-positive predictions in the discovery of novel minimotifs. The statistical significance of these algorithms is demonstrated by an ROC analysis (P = 0.001).
Affiliation(s)
- Sanguthevar Rajasekaran
- Department of Computer Science and Engineering, University of Connecticut, Storrs, Connecticut 06269-2155, USA.
|
90
|
Functional variation of alternative splice forms in their protein interaction networks: a literature mining approach. BMC Bioinformatics 2010. [PMCID: PMC2956394 DOI: 10.1186/1471-2105-11-s5-p1] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Indexed: 11/23/2022] Open
|
91
|
Kerrigan JJ, Xie Q, Ames RS, Lu Q. Production of protein complexes via co-expression. Protein Expr Purif 2010; 75:1-14. [PMID: 20692346 DOI: 10.1016/j.pep.2010.07.015] [Citation(s) in RCA: 36] [Impact Index Per Article: 2.6] [Received: 06/08/2010] [Revised: 07/22/2010] [Accepted: 07/31/2010] [Indexed: 12/21/2022]
Abstract
Multi-protein complexes are involved in essentially all cellular processes. A protein's function is defined by a combination of its own properties, its interacting partners, and the stoichiometry of each. Depending on binding partners, a transcription factor can function as an activator in one instance and a repressor in another. The study of protein function or malfunction is best performed in the relevant context. While many protein complexes can be reconstituted from individual component proteins after being produced individually, many others require co-expression of their native partners in the host cells for proper folding, stability, and activity. Protein co-expression has led to the production of a variety of biological active complexes in sufficient quantities for biochemical, biophysical, structural studies, and high throughput screens. This article summarizes examples of such cases and discusses critical considerations in selecting co-expression partners, and strategies to achieve successful production of protein complexes.
Affiliation(s)
- John J Kerrigan
- Biological Reagents & Assay Development, Platform Technology & Science, GlaxoSmithKline R&D, 1250 South Collegeville Road, Collegeville, PA 19426, USA
|
92
|
Xu JZ, Wong CW. Hunting for robust gene signature from cancer profiling data: sources of variability, different interpretations, and recent methodological developments. Cancer Lett 2010; 296:9-16. [PMID: 20579805 DOI: 10.1016/j.canlet.2010.05.008] [Citation(s) in RCA: 26] [Impact Index Per Article: 1.9] [Received: 02/08/2010] [Revised: 05/05/2010] [Accepted: 05/18/2010] [Indexed: 12/24/2022]
Abstract
Gene microarray is a powerful platform to investigate the expression patterns of thousands of genes simultaneously. One central objective of such analysis is to select sets of genes (i.e., gene signatures) which correlate with clinical characteristics, such as disease subtype diagnosis, response to drug treatment and prognosis. However, previous studies have found that mRNA signatures are highly unstable and strongly depend on the selection of patient samples. Based on five large microRNA profiling datasets, we empirically found that microRNA signatures are also generally unstable. Therefore, concerns arise regarding the reproducibility and clinical applicability of these derived gene signatures. Here, we first provide a brief review on the sources of variability and different interpretations of multiple distinct gene signatures. We then focus on those recent methodological progresses aimed at developing more stable gene signatures.
Affiliation(s)
- Jian-Zhen Xu
- College of Bioengineering, Henan University of Technology, Zhengzhou 450001, China.
|
93
|
Ooi HS, Schneider G, Chan YL, Lim TT, Eisenhaber B, Eisenhaber F. Databases of protein-protein interactions and complexes. Methods Mol Biol 2010; 609:145-59. [PMID: 20221918 DOI: 10.1007/978-1-60327-241-4_9] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.3] [Indexed: 02/12/2023]
Abstract
In the current understanding, translation of genomic sequences into proteins is the most important path for realization of genome information. In exercising their intended function, proteins work together through various forms of direct (physical) or indirect interaction mechanisms. For a variety of basic functions, many proteins form a large complex representing a molecular machine or a macromolecular super-structural building block. After several high-throughput techniques for detection of protein-protein interactions had matured, protein interaction data became available in a large scale and curated databases for protein-protein interactions (PPIs) are a new necessity for efficient research. Here, their scope, annotation quality, and retrieval tools are reviewed. In addition, attention is paid to portals that provide unified access to a variety of such databases with added annotation value.
Affiliation(s)
- Hong Sain Ooi
- Bioinformatics Institute, Agency for Science, Technology, and Research, Singapore
|
94
|
Lee HJ, Zheng JJ. PDZ domains and their binding partners: structure, specificity, and modification. Cell Commun Signal 2010; 8:8. [PMID: 20509869 PMCID: PMC2891790 DOI: 10.1186/1478-811x-8-8] [Citation(s) in RCA: 397] [Impact Index Per Article: 28.4] [Received: 12/26/2009] [Accepted: 05/28/2010] [Indexed: 02/07/2023] Open
Abstract
PDZ domains are abundant protein interaction modules that often recognize short amino acid motifs at the C-termini of target proteins. They regulate multiple biological processes such as transport, ion channel signaling, and other signal transduction systems. This review discusses the structural characterization of PDZ domains and the use of recently emerging technologies such as proteomic arrays and peptide libraries to study the binding properties of PDZ-mediated interactions. Regulatory mechanisms responsible for PDZ-mediated interactions, such as phosphorylation in the PDZ ligands or PDZ domains, are also discussed. A better understanding of PDZ protein-protein interaction networks and regulatory mechanisms will improve our knowledge of many cellular and biological processes.
Affiliation(s)
- Ho-Jin Lee
- Department of Structural Biology, St. Jude Children's Research Hospital, Memphis, TN 38105, USA.
|
95
|
Börner K, Hermle J, Sommer C, Brown NP, Knapp B, Glass B, Kunkel J, Torralba G, Reymann J, Beil N, Beneke J, Pepperkok R, Schneider R, Ludwig T, Hausmann M, Hamprecht F, Erfle H, Kaderali L, Kräusslich HG, Lehmann MJ. From experimental setup to bioinformatics: an RNAi screening platform to identify host factors involved in HIV-1 replication. Biotechnol J 2010; 5:39-49. [PMID: 20013946 DOI: 10.1002/biot.200900226] [Citation(s) in RCA: 35] [Impact Index Per Article: 2.5] [Indexed: 12/21/2022]
Abstract
RNA interference (RNAi) has emerged as a powerful technique for studying loss-of-function phenotypes by specific down-regulation of gene expression, allowing the investigation of virus-host interactions by large-scale high-throughput RNAi screens. Here we present a robust and sensitive small interfering RNA screening platform consisting of an experimental setup, single-cell image and statistical analysis as well as bioinformatics. The workflow has been established to elucidate host gene functions exploited by viruses, monitoring both suppression and enhancement of viral replication simultaneously by fluorescence microscopy. The platform comprises a two-stage procedure in which potential host factors are first identified in a primary screen and afterwards re-tested in a validation screen to confirm true positive hits. Subsequent bioinformatics allows the identification of cellular genes participating in metabolic pathways and cellular networks utilised by viruses for efficient infection. Our workflow has been used to investigate host factor usage by the human immunodeficiency virus-1 (HIV-1), but can also be adapted to other viruses. Importantly, we expect that the description of the platform will guide further screening approaches for virus-host interactions. The ViroQuant-CellNetworks RNAi Screening core facility is an integral part of the recently founded BioQuant centre for systems biology at the University of Heidelberg and will provide service to external users in the near future.
Affiliation(s)
- Kathleen Börner
- Department of Infectious Diseases, Virology, University of Heidelberg, Heidelberg, Germany
|
96
|
Gehlenborg N, O'Donoghue SI, Baliga NS, Goesmann A, Hibbs MA, Kitano H, Kohlbacher O, Neuweger H, Schneider R, Tenenbaum D, Gavin AC. Visualization of omics data for systems biology. Nat Methods 2010; 7:S56-68. [DOI: 10.1038/nmeth.1436] [Citation(s) in RCA: 474] [Impact Index Per Article: 33.9] [Indexed: 12/14/2022]
|
97
|
Qi Y, Dhiman HK, Bhola N, Budyak I, Kar S, Man D, Dutta A, Tirupula K, Carr BI, Grandis J, Bar-Joseph Z, Klein-Seetharaman J. Systematic prediction of human membrane receptor interactions. Proteomics 2010; 9:5243-55. [PMID: 19798668 DOI: 10.1002/pmic.200900259] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.2] [Indexed: 12/12/2022]
Abstract
Membrane receptor-activated signal transduction pathways are integral to cellular functions and disease mechanisms in humans. Identification of the full set of proteins interacting with membrane receptors by high-throughput experimental means is difficult because methods to directly identify protein interactions are largely not applicable to membrane proteins. Unlike prior approaches that attempted to predict the global human interactome, we used a computational strategy that only focused on discovering the interacting partners of human membrane receptors leading to improved results for these proteins. We predict specific interactions based on statistical integration of biological data containing highly informative direct and indirect evidences together with feedback from experts. The predicted membrane receptor interactome provides a system-wide view, and generates new biological hypotheses regarding interactions between membrane receptors and other proteins. We have experimentally validated a number of these interactions. The results suggest that a framework of systematically integrating computational predictions, global analyses, biological experimentation and expert feedback is a feasible strategy to study the human membrane receptor interactome.
Affiliation(s)
- Yanjun Qi
- School of Computer Science, Carnegie Mellon University, Pittsburgh, PA, USA
|
98
|
Suthram S, Dudley JT, Chiang AP, Chen R, Hastie TJ, Butte AJ. Network-based elucidation of human disease similarities reveals common functional modules enriched for pluripotent drug targets. PLoS Comput Biol 2010; 6:e1000662. [PMID: 20140234 PMCID: PMC2816673 DOI: 10.1371/journal.pcbi.1000662] [Citation(s) in RCA: 221] [Impact Index Per Article: 15.8] [Received: 08/11/2009] [Accepted: 12/30/2009] [Indexed: 11/18/2022] Open
Abstract
Current work in elucidating relationships between diseases has largely been based on pre-existing knowledge of disease genes. Consequently, these studies are limited in their discovery of new and unknown disease relationships. We present the first quantitative framework to compare and contrast diseases by an integrated analysis of disease-related mRNA expression data and the human protein interaction network. We identified 4,620 functional modules in the human protein network and provided a quantitative metric to record their responses in 54 diseases leading to 138 significant similarities between diseases. Fourteen of the significant disease correlations also shared common drugs, supporting the hypothesis that similar diseases can be treated by the same drugs, allowing us to make predictions for new uses of existing drugs. Finally, we also identified 59 modules that were dysregulated in at least half of the diseases, representing a common disease-state "signature". These modules were significantly enriched for genes that are known to be drug targets. Interestingly, drugs known to target these genes/proteins are already known to treat significantly more diseases than drugs targeting other genes/proteins, highlighting the importance of these core modules as prime therapeutic opportunities.
Affiliation(s)
- Silpa Suthram
- Stanford Center for Biomedical Informatics Research, Stanford University, Stanford, California, United States of America
- Department of Pediatrics, Stanford University, Stanford, California, United States of America
- Lucile Packard Children's Hospital, Palo Alto, California, United States of America
| | - Joel T. Dudley
- Stanford Center for Biomedical Informatics Research, Stanford University, Stanford, California, United States of America
- Department of Pediatrics, Stanford University, Stanford, California, United States of America
- Lucile Packard Children's Hospital, Palo Alto, California, United States of America
| | - Annie P. Chiang
- Stanford Center for Biomedical Informatics Research, Stanford University, Stanford, California, United States of America
- Department of Pediatrics, Stanford University, Stanford, California, United States of America
- Lucile Packard Children's Hospital, Palo Alto, California, United States of America
| | - Rong Chen
- Stanford Center for Biomedical Informatics Research, Stanford University, Stanford, California, United States of America
- Department of Pediatrics, Stanford University, Stanford, California, United States of America
- Lucile Packard Children's Hospital, Palo Alto, California, United States of America
| | - Trevor J. Hastie
- Department of Statistics, Stanford University, Stanford, California, United States of America
| | - Atul J. Butte
- Stanford Center for Biomedical Informatics Research, Stanford University, Stanford, California, United States of America
- Department of Pediatrics, Stanford University, Stanford, California, United States of America
- Lucile Packard Children's Hospital, Palo Alto, California, United States of America
|
99
|
Pattin KA, Moore JH. Role for protein-protein interaction databases in human genetics. Expert Rev Proteomics 2010; 6:647-59. [PMID: 19929610 DOI: 10.1586/epr.09.86] [Citation(s) in RCA: 41] [Impact Index Per Article: 2.9] [Indexed: 12/30/2022]
Abstract
Proteomics and the study of protein-protein interactions are becoming increasingly important in our effort to understand human diseases on a system-wide level. Thanks to the development and curation of protein-interaction databases, up-to-date information on these interaction networks is accessible and publicly available to the scientific community. As our knowledge of protein-protein interactions increases, it is important to give thought to the different ways that these resources can impact biomedical research. In this article, we highlight the importance of protein-protein interactions in human genetics and genetic epidemiology. Since protein-protein interactions demonstrate one of the strongest functional relationships between genes, combining genomic data with available proteomic data may provide us with a more in-depth understanding of common human diseases. In this review, we will discuss some of the fundamentals of protein interactions, the databases that are publicly available and how information from these databases can be used to facilitate genome-wide genetic studies.
Affiliation(s)
- Kristine A Pattin
- Computational Genetics Laboratory and Department of Genetics, Dartmouth Medical School, Lebanon, NH, USA.
|
100
|
Predicting Protein-Protein Interactions with K-Nearest Neighbors Classification Algorithm. Computational Intelligence Methods for Bioinformatics and Biostatistics 2010. [DOI: 10.1007/978-3-642-14571-1_10] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Indexed: 12/12/2022]
|