1
Dayebgadoh G, Sardiu ME, Florens L, Washburn MP. Biochemical Reduction of the Topology of the Diverse WDR76 Protein Interactome. J Proteome Res 2019; 18:3479-3491. [PMID: 31353912 DOI: 10.1021/acs.jproteome.9b00373] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Indexed: 01/17/2023]
Abstract
A hub protein in protein interaction networks will typically have a large number of diverse interactions. Determining the core interactions and the function of such a hub protein remains a significant challenge in the study of networks. Proteins with WD40 repeats represent a large class of proteins that can be hub proteins. WDR76 is a poorly characterized WD40 repeat protein with possible involvement in DNA damage repair, cell-cycle progression, apoptosis, gene expression regulation, and protein quality control. WDR76 has a large and diverse interaction network that has made its study challenging. Here we rigorously carry out a series of affinity purification coupled to mass spectrometry (AP-MS) analyses to map out the WDR76 interactome through different biochemical conditions. We apply AP-MS analysis coupled to size-exclusion chromatography to resolve WDR76-based protein complexes. Furthermore, we also show that WDR76 interacts with the CCT complex via its WD40 repeat domain and with DNA-PK-KU, PARP1, GAN, SIRT1, and histones outside of the WD40 domain. An evaluation of the stability of WDR76 interactions led to focused and streamlined reciprocal analyses that validate the interactions with GAN and SIRT1. Overall, the approaches used to study WDR76 would be valuable to study other proteins containing WD40 repeat domains, which are conserved in a large number of proteins in many organisms.
Affiliation(s)
- Gerald Dayebgadoh
  - Stowers Institute for Medical Research, Kansas City, Missouri 64110, United States
- Mihaela E Sardiu
  - Stowers Institute for Medical Research, Kansas City, Missouri 64110, United States
- Laurence Florens
  - Stowers Institute for Medical Research, Kansas City, Missouri 64110, United States
- Michael P Washburn
  - Stowers Institute for Medical Research, Kansas City, Missouri 64110, United States
  - Department of Pathology and Laboratory Medicine, The University of Kansas Medical Center, 3901 Rainbow Boulevard, Kansas City, Kansas 66160, United States
2
Manners HN, Roy S, Kalita JK. Intrinsic-overlapping co-expression module detection with application to Alzheimer's Disease. Comput Biol Chem 2018; 77:373-389. [PMID: 30466046 DOI: 10.1016/j.compbiolchem.2018.10.014] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Received: 03/07/2018] [Revised: 10/28/2018] [Accepted: 10/29/2018] [Indexed: 11/18/2022]
Abstract
Genes interact with each other and may cause perturbation in the molecular pathways leading to complex diseases. Often, instead of any single gene, a subset of genes interact, forming a network, to share common biological functions. Such a subnetwork is called a functional module or motif. Identifying such modules, and the central key genes in them that may be responsible for a disease, may help design patient-specific drugs. In this study, we consider the neurodegenerative Alzheimer's Disease (AD) and identify potentially responsible genes from functional motif analysis. We start from the hypothesis that central genes in genetic modules are more relevant to the disease under investigation and identify hub genes from the modules as potential marker genes. Motifs or modules are often non-exclusive or overlapping in nature. Moreover, they sometimes show intrinsic or hierarchical distributions with overlapping functional roles. To the best of our knowledge, no prior work handles both situations in an integrated way. We propose a non-exclusive clustering approach, CluViaN (Clustering Via Network), that can detect intrinsic as well as overlapping modules from gene co-expression networks constructed using microarray expression profiles. We compare our method with existing methods to evaluate the quality of the modules extracted. CluViaN reports the presence of intrinsic and overlapping motifs in different species not reported by any other research. We further apply CluViaN to extract significant AD-specific modules and rank them based on the number of genes from a module involved in the disease pathways. Finally, top central genes are identified by topological analysis of the modules. We use two different AD phenotype datasets for experimentation. We observe that central genes, namely PSEN1, APP, NDUFB2, NDUFA1, UQCR10, PPP3R1 and a few more, play significant roles in AD. Interestingly, our experiments also find a hub gene, PML, which has recently been reported to play a role in plasticity, circadian rhythms and the response to proteins which can cause neurodegenerative disorders. MUC4, another hub gene that we find experimentally, is yet to be investigated for its potential role in AD. A software implementation of CluViaN in Java is available for download at https://sites.google.com/site/swarupnehu/publications/resources/CluViaN Software.rar.
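The final step described in this abstract, ranking genes by how central they are within a module, can be illustrated with a small sketch. It is not the CluViaN algorithm: it assumes a module is already given, builds a co-expression graph by thresholding the absolute Pearson correlation, and uses plain node degree as the centrality measure. The function names, the threshold value, and the gene labels A to D are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    # A constant profile has no defined correlation; treat it as 0.
    return cov / (sd_x * sd_y) if sd_x and sd_y else 0.0


def rank_by_degree(expr, threshold):
    """Rank genes by degree in a thresholded co-expression graph.

    expr maps gene name -> expression profile (list of floats).
    Two genes are linked when |Pearson r| >= threshold (an assumed rule).
    Returns (gene, degree) pairs, highest degree first, ties by name.
    """
    genes = sorted(expr)
    degree = {g: 0 for g in genes}
    for i, g in enumerate(genes):
        for h in genes[i + 1:]:
            if abs(pearson(expr[g], expr[h])) >= threshold:
                degree[g] += 1
                degree[h] += 1
    return sorted(degree.items(), key=lambda item: (-item[1], item[0]))


# Toy module with hypothetical genes: A is the sum of B and C.
toy_module = {
    "A": [2.0, 0.0, 0.0, -2.0],
    "B": [1.0, -1.0, 1.0, -1.0],
    "C": [1.0, 1.0, -1.0, -1.0],
    "D": [1.0, -1.0, -1.0, 1.0],
}
ranked = rank_by_degree(toy_module, threshold=0.7)
```

On this toy module, gene A is correlated with both B and C (r ≈ 0.71) while B, C, and D are mutually uncorrelated, so A is ranked first with degree 2.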
Affiliation(s)
- Hazel Nicolette Manners
  - Department of Information Technology, North Eastern Hill University, Shillong, Meghalaya, India
- Swarup Roy
  - Department of Computer Applications, Sikkim University, Gangtok, Sikkim, India
  - Department of Information Technology, North Eastern Hill University, Shillong, Meghalaya, India
- Jugal K Kalita
  - Department of Computer Science, University of Colorado, Colorado Springs, USA
3
Liu C, Brattico E, Abu-Jamous B, Pereira CS, Jacobsen T, Nandi AK. Effect of Explicit Evaluation on Neural Connectivity Related to Listening to Unfamiliar Music. Front Hum Neurosci 2017; 11:611. [PMID: 29311874 PMCID: PMC5742221 DOI: 10.3389/fnhum.2017.00611] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.4] [Received: 08/21/2017] [Accepted: 11/30/2017] [Indexed: 12/26/2022] Open
Abstract
People can experience different emotions when listening to music. A growing number of studies have investigated the brain structures and neural connectivities associated with perceived emotions. However, very little is known about the effect of an explicit act of judgment on the neural processing of emotionally-valenced music. In this study, we adopted the novel consensus clustering paradigm, called binarisation of consensus partition matrices (Bi-CoPaM), to study whether and how the conscious aesthetic evaluation of the music would modulate brain connectivity networks related to emotion and reward processing. Participants listened to music under three conditions - one involving a non-evaluative judgment, one involving an explicit evaluative aesthetic judgment, and one involving no judgment at all (passive listening only). During non-evaluative attentive listening we obtained auditory-limbic connectivity whereas when participants were asked to decide explicitly whether they liked or disliked the music excerpt, only two clusters of intercommunicating brain regions were found: one including areas related to auditory processing and action observation, and the other comprising higher-order structures involved with visual processing. Results indicate that explicit evaluative judgment has an impact on the neural auditory-limbic connectivity during affective processing of music.
Affiliation(s)
- Chao Liu
  - Department of Electronic and Computer Engineering, Brunel University London, Uxbridge, United Kingdom
- Elvira Brattico
  - Department of Clinical Medicine, Center for Music in the Brain, Aarhus University & Royal Academy of Music Aarhus/Aalborg, Aarhus, Denmark
  - AMI Centre, School of Science, Aalto University, Espoo, Finland
- Basel Abu-Jamous
  - Department of Electronic and Computer Engineering, Brunel University London, Uxbridge, United Kingdom
- Thomas Jacobsen
  - Experimental Psychology Unit, Helmut Schmidt University, University of Federal Armed Forces, Hamburg, Germany
- Asoke K Nandi
  - Department of Electronic and Computer Engineering, Brunel University London, Uxbridge, United Kingdom
  - The Key Laboratory of Embedded Systems and Service Computing, College of Electronic and Information Engineering, Tongji University, Shanghai, China
4
Abu-Jamous B, Buffa FM, Harris AL, Nandi AK. In vitro downregulated hypoxia transcriptome is associated with poor prognosis in breast cancer. Mol Cancer 2017; 16:105. [PMID: 28619028 PMCID: PMC5472949 DOI: 10.1186/s12943-017-0673-0] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.7] [Received: 11/18/2016] [Accepted: 06/02/2017] [Indexed: 12/18/2022] Open
Abstract
BACKGROUND: Hypoxia is a characteristic of breast tumours indicating poor prognosis. Based on the assumption that those genes which are up-regulated under hypoxia in cell lines are expected to be predictors of poor prognosis in clinical data, many signatures of poor prognosis were identified. However, it was observed that cell-line data do not always concur with clinical data, and therefore conclusions from cell-line analysis should be considered with caution. As many transcriptomic cell-line datasets from hypoxia-related contexts are available, integrative approaches which investigate these datasets collectively, while not ignoring clinical data, are required.
RESULTS: We analyse sixteen heterogeneous breast cancer cell-line transcriptomic datasets in hypoxia-related conditions collectively by employing the unique capabilities of the method, UNCLES, which integrates clustering results from multiple datasets and can address questions that cannot be answered by existing methods. This has been demonstrated by comparison with the state-of-the-art iCluster method. From this collection of genome-wide datasets, which include 15,588 genes, UNCLES identified a relatively high number of genes (>1000 overall) which are consistently co-regulated over all of the datasets, some of which are still poorly understood and represent new potential HIF targets, such as RSBN1 and KIAA0195. Two main, anti-correlated clusters were identified; the first is enriched with MYC targets participating in growth and proliferation, while the other is enriched with HIF targets directly participating in the hypoxia response. Surprisingly, in six clinical datasets, some sub-clusters of growth genes are found consistently positively correlated with hypoxia response genes, unlike the observation in cell lines. Moreover, the ability to predict bad prognosis by a combined signature of one sub-cluster of growth genes and one sub-cluster of hypoxia-induced genes appears to be comparable to, and perhaps greater than, that of known hypoxia signatures.
CONCLUSIONS: We present a clustering approach suitable to integrate data from diverse experimental set-ups. Its application to breast cancer cell-line datasets reveals new hypoxia-regulated signatures of genes which behave differently when in vitro (cell-line) data is compared with in vivo (clinical) data, and which are of a prognostic value comparable to or exceeding the state-of-the-art hypoxia signatures.
Affiliation(s)
- Basel Abu-Jamous
  - Department of Electronic and Computer Engineering, Brunel University London, Uxbridge, Middlesex, UB8 3PH UK
  - Department of Plant Sciences, University of Oxford, Oxford, OX1 3RB UK
- Francesca M. Buffa
  - Cancer Research UK, Department of Oncology, Weatherall Institute of Molecular Medicine, Oxford, OX3 9DS UK
- Adrian L. Harris
  - Cancer Research UK, Department of Oncology, Weatherall Institute of Molecular Medicine, Oxford, OX3 9DS UK
- Asoke K. Nandi
  - Department of Electronic and Computer Engineering, Brunel University London, Uxbridge, Middlesex, UB8 3PH UK
  - The Key Laboratory of Embedded Systems and Service Computing, College of Electronic and Information Engineering, Tongji University, Shanghai, People's Republic of China
5
Gilmore JM, Sardiu ME, Groppe BD, Thornton JL, Liu X, Dayebgadoh G, Banks CA, Slaughter BD, Unruh JR, Workman JL, Florens L, Washburn MP. WDR76 Co-Localizes with Heterochromatin Related Proteins and Rapidly Responds to DNA Damage. PLoS One 2016; 11:e0155492. [PMID: 27248496 PMCID: PMC4889050 DOI: 10.1371/journal.pone.0155492] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.4] [Received: 01/12/2016] [Accepted: 04/30/2016] [Indexed: 12/21/2022] Open
Abstract
Proteins that respond to DNA damage play critical roles in normal and diseased states in human biology. Studies have suggested that the S. cerevisiae protein CMR1/YDL156w is associated with histones and is possibly associated with DNA repair and replication processes. Through a quantitative proteomic analysis of affinity purifications here we show that the human homologue of this protein, WDR76, shares multiple protein associations with the histones H2A, H2B, and H4. Furthermore, our quantitative proteomic analysis of WDR76 associated proteins demonstrated links to proteins in the DNA damage response like PARP1 and XRCC5 and heterochromatin related proteins like CBX1, CBX3, and CBX5. Co-immunoprecipitation studies validated these interactions. Next, quantitative imaging studies demonstrated that WDR76 was recruited to laser induced DNA damage immediately after induction, and we compared the recruitment of WDR76 to laser induced DNA damage to known DNA damage proteins like PARP1, XRCC5, and RPA1. In addition, WDR76 co-localizes to puncta with the heterochromatin proteins CBX1 and CBX5, which are also recruited to DNA damage but much less intensely than WDR76. This work demonstrates the chromatin and DNA damage protein associations of WDR76 and demonstrates the rapid response of WDR76 to laser induced DNA damage.
Affiliation(s)
- Joshua M. Gilmore
  - Stowers Institute for Medical Research, Kansas City, MO, 64110, United States of America
- Mihaela E. Sardiu
  - Stowers Institute for Medical Research, Kansas City, MO, 64110, United States of America
- Brad D. Groppe
  - Stowers Institute for Medical Research, Kansas City, MO, 64110, United States of America
- Janet L. Thornton
  - Stowers Institute for Medical Research, Kansas City, MO, 64110, United States of America
- Xingyu Liu
  - Stowers Institute for Medical Research, Kansas City, MO, 64110, United States of America
- Gerald Dayebgadoh
  - Stowers Institute for Medical Research, Kansas City, MO, 64110, United States of America
- Charles A. Banks
  - Stowers Institute for Medical Research, Kansas City, MO, 64110, United States of America
- Brian D. Slaughter
  - Stowers Institute for Medical Research, Kansas City, MO, 64110, United States of America
- Jay R. Unruh
  - Stowers Institute for Medical Research, Kansas City, MO, 64110, United States of America
- Jerry L. Workman
  - Stowers Institute for Medical Research, Kansas City, MO, 64110, United States of America
- Laurence Florens
  - Stowers Institute for Medical Research, Kansas City, MO, 64110, United States of America
- Michael P. Washburn
  - Stowers Institute for Medical Research, Kansas City, MO, 64110, United States of America
  - Department of Pathology and Laboratory Medicine, The University of Kansas Medical Center, 3901 Rainbow Boulevard, Kansas City, Kansas, 66160, United States of America
6
Recruitment of Saccharomyces cerevisiae Cmr1/Ydl156w to Coding Regions Promotes Transcription Genome Wide. PLoS One 2016; 11:e0148897. [PMID: 26848854 PMCID: PMC4744024 DOI: 10.1371/journal.pone.0148897] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.8] [Received: 12/28/2015] [Accepted: 01/25/2016] [Indexed: 12/03/2022] Open
Abstract
Cmr1 (changed mutation rate 1) is a largely uncharacterized nuclear protein that has recently emerged in several global genetic interaction and protein localization studies. It clusters with proteins involved in DNA damage and replication stress response, suggesting a role in maintaining genome integrity. Under conditions of proteasome inhibition or replication stress, this protein localizes to distinct sub-nuclear foci termed as intranuclear quality control (INQ) compartments, which sequester proteins for their subsequent degradation. Interestingly, it also interacts with histones, chromatin remodelers and modifiers, as well as with proteins involved in transcription including subunits of RNA Pol I and Pol III, but not with those of Pol II. It is not known whether Cmr1 plays a role in regulating transcription of Pol II target genes. Here, we show that Cmr1 is recruited to the coding regions of transcribed genes of S. cerevisiae. Cmr1 occupancy correlates with the Pol II occupancy genome-wide, indicating that it is recruited to coding sequences in a transcription-dependent manner. Cmr1-enriched genes include Gcn4 targets and ribosomal protein genes. Furthermore, our results show that Cmr1 recruitment to coding sequences is stimulated by Pol II CTD kinase, Kin28, and the histone deacetylases, Rpd3 and Hos2. Finally, our genome-wide analyses implicate Cmr1 in regulating Pol II occupancy at transcribed coding sequences. However, it is dispensable for maintaining co-transcriptional histone occupancy and histone modification (acetylation and methylation). Collectively, our results show that Cmr1 facilitates transcription by directly engaging with transcribed coding regions.
7
Abu-Jamous B, Fa R, Roberts DJ, Nandi AK. UNCLES: method for the identification of genes differentially consistently co-expressed in a specific subset of datasets. BMC Bioinformatics 2015; 16:184. [PMID: 26040489 PMCID: PMC4453228 DOI: 10.1186/s12859-015-0614-0] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.9] [Received: 03/16/2015] [Accepted: 05/16/2015] [Indexed: 12/13/2022] Open
Abstract
Background: Collective analysis of the growing number of gene expression datasets is required. The recently proposed binarisation of consensus partition matrices (Bi-CoPaM) method can combine clustering results from multiple datasets to identify the subsets of genes which are consistently co-expressed in all of the provided datasets in a tuneable manner. However, results validation and parameter setting are issues that complicate the design of such methods. Moreover, although it is a common practice to test methods by application to synthetic datasets, the mathematical models used to synthesise such datasets are usually based on approximations which may not always be sufficiently representative of real datasets.
Results: Here, we propose an unsupervised method for the unification of clustering results from multiple datasets using external specifications (UNCLES). This method has the ability to identify the subsets of genes consistently co-expressed in a subset of datasets while being poorly co-expressed in another subset of datasets, and to identify the subsets of genes consistently co-expressed in all given datasets. We also propose the M-N scatter plots validation technique and adopt it to set the parameters of UNCLES, such as the number of clusters, automatically. Additionally, we propose an approach for the synthesis of gene expression datasets using real data profiles in a way which combines the ground-truth knowledge of synthetic data and the realistic expression values of real data, and therefore overcomes the problem of faithfulness of synthetic expression data modelling. By application to those datasets, we validate UNCLES while comparing it with other conventional clustering methods and, of particular relevance, biclustering methods. We further validate UNCLES by application to a set of 14 real genome-wide yeast datasets, as it produces focused clusters that conform well to known biological facts. Furthermore, in-silico-based hypotheses regarding the function of a few previously unknown genes in those focused clusters are drawn.
Conclusions: The UNCLES method, the M-N scatter plots technique, and the expression data synthesis approach will have wide application for the comprehensive analysis of genomic and other sources of multiple complex biological datasets. Moreover, the derived in-silico-based biological hypotheses represent subjects for future functional studies.
Affiliation(s)
- Basel Abu-Jamous
  - Department of Electronic and Computer Engineering, Brunel University London, Uxbridge, Middlesex, UB8 3PH, UK
- Rui Fa
  - Department of Electronic and Computer Engineering, Brunel University London, Uxbridge, Middlesex, UB8 3PH, UK
- David J Roberts
  - National Health Service Blood and Transplant, Oxford, OX3 9BQ, UK
  - Radcliffe Department of Medicine, University of Oxford, John Radcliffe Hospital, Oxford, OX3 9DU, UK
- Asoke K Nandi
  - Department of Electronic and Computer Engineering, Brunel University London, Uxbridge, Middlesex, UB8 3PH, UK
  - Department of Mathematical Information Technology, University of Jyväskylä, Jyväskylä, Finland
8
Gallina I, Colding C, Henriksen P, Beli P, Nakamura K, Offman J, Mathiasen DP, Silva S, Hoffmann E, Groth A, Choudhary C, Lisby M. Cmr1/WDR76 defines a nuclear genotoxic stress body linking genome integrity and protein quality control. Nat Commun 2015; 6:6533. [PMID: 25817432 PMCID: PMC4389229 DOI: 10.1038/ncomms7533] [Citation(s) in RCA: 69] [Impact Index Per Article: 7.7] [Received: 03/14/2014] [Accepted: 02/05/2015] [Indexed: 11/09/2022] Open
Abstract
DNA replication stress is a source of genomic instability. Here we identify changed mutation rate 1 (Cmr1) as a factor involved in the response to DNA replication stress in Saccharomyces cerevisiae and show that Cmr1--together with Mrc1/Claspin, Pph3, the chaperonin containing TCP1 (CCT) and 25 other proteins--define a novel intranuclear quality control compartment (INQ) that sequesters misfolded, ubiquitylated and sumoylated proteins in response to genotoxic stress. The diversity of proteins that localize to INQ indicates that other biological processes such as cell cycle progression, chromatin and mitotic spindle organization may also be regulated through INQ. Similar to Cmr1, its human orthologue WDR76 responds to proteasome inhibition and DNA damage by relocalizing to nuclear foci and physically associating with CCT, suggesting an evolutionarily conserved biological function. We propose that Cmr1/WDR76 plays a role in the recovery from genotoxic stress through regulation of the turnover of sumoylated and phosphorylated proteins.
Affiliation(s)
- Irene Gallina
  - Department of Biology, University of Copenhagen, Room 4.1.07, Copenhagen N DK-2200, Denmark
- Camilla Colding
  - Department of Biology, University of Copenhagen, Room 4.1.07, Copenhagen N DK-2200, Denmark
- Peter Henriksen
  - The Novo Nordisk Foundation Center for Protein Research, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen N DK-2200, Denmark
- Petra Beli
  - The Novo Nordisk Foundation Center for Protein Research, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen N DK-2200, Denmark
- Kyosuke Nakamura
  - Biotech Research and Innovation Centre (BRIC) and Centre for Epigenetics, University of Copenhagen, Copenhagen N DK-2200, Denmark
- Judith Offman
  - MRC, Centre for Genome Damage and Stability, School of Life Sciences, University of Sussex, Brighton BN1 9RH, UK
- David P Mathiasen
  - Department of Biology, University of Copenhagen, Room 4.1.07, Copenhagen N DK-2200, Denmark
- Sonia Silva
  - Department of Biology, University of Copenhagen, Room 4.1.07, Copenhagen N DK-2200, Denmark
- Eva Hoffmann
  - MRC, Centre for Genome Damage and Stability, School of Life Sciences, University of Sussex, Brighton BN1 9RH, UK
- Anja Groth
  - Biotech Research and Innovation Centre (BRIC) and Centre for Epigenetics, University of Copenhagen, Copenhagen N DK-2200, Denmark
- Chunaram Choudhary
  - The Novo Nordisk Foundation Center for Protein Research, Faculty of Health and Medical Sciences, University of Copenhagen, Copenhagen N DK-2200, Denmark
- Michael Lisby
  - Department of Biology, University of Copenhagen, Room 4.1.07, Copenhagen N DK-2200, Denmark
9
Abu-Jamous B, Fa R, Roberts DJ, Nandi AK. Comprehensive analysis of forty yeast microarray datasets reveals a novel subset of genes (APha-RiB) consistently negatively associated with ribosome biogenesis. BMC Bioinformatics 2014; 15:322. [PMID: 25267386 PMCID: PMC4262117 DOI: 10.1186/1471-2105-15-322] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.9] [Received: 06/10/2014] [Accepted: 09/22/2014] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND: The scale and complexity of genomic data lend themselves to analysis using sophisticated mathematical techniques to yield information that can generate new hypotheses and so guide further experimental investigations. An ensemble clustering method has the ability to perform consensus clustering over the same set of genes from different microarray datasets by combining results from different clustering methods into a single consensus result.
RESULTS: In this paper we have performed a comprehensive analysis of forty yeast microarray datasets. The recently described Bi-CoPaM method can analyse expressions of the same set of genes from various microarray datasets while using different clustering methods, and then combine these results into a single consensus result whose clusters' tightness is tunable from tight, specific clusters to wide, overlapping clusters. This has been adopted in a novel way over genome-wide data from forty yeast microarray datasets to discover two clusters of genes that are consistently co-expressed over all of these datasets from different biological contexts and various experimental conditions. Most strikingly, average expression profiles of those clusters are consistently negatively correlated in all of the forty datasets while neither profile leads nor lags the other.
CONCLUSIONS: The first cluster is enriched with ribosomal biogenesis genes. The biological processes of most of the genes in the second cluster are either unknown or apparently unrelated, although they show high connectivity in protein-protein and genetic interaction networks. Therefore, it is possible that this mostly uncharacterised cluster and the ribosomal biogenesis cluster are transcriptionally oppositely regulated by some common machinery. Moreover, we anticipate that the genes included in this previously unknown cluster participate in generic, in contrast to specific, stress response processes. These novel findings illuminate coordinated gene expression in yeast and suggest several hypotheses for future experimental functional work. Additionally, we have demonstrated the usefulness of the Bi-CoPaM-based approach, which may be helpful for the analysis of other groups of (microarray) datasets from other species and systems for the exploration of global genetic co-expression.
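The idea of a consensus result whose tightness can be tuned can be sketched in a few lines. The example below is a simplified illustration, not the published Bi-CoPaM procedure: it assumes the individual partitions are already aligned (cluster k refers to the same group in every partition), averages them into a consensus membership matrix, and binarises that matrix with a single hypothetical threshold `delta`. All function and variable names are made up for illustration.

```python
def consensus_matrix(partitions):
    """Average aligned binary partition matrices.

    Each partition is a list of clusters; each cluster is a list of 0/1
    flags, one per gene. The result holds, for every cluster and gene,
    the fraction of partitions that place the gene in that cluster.
    """
    n_partitions = len(partitions)
    n_clusters = len(partitions[0])
    n_genes = len(partitions[0][0])
    return [
        [sum(p[c][g] for p in partitions) / n_partitions for g in range(n_genes)]
        for c in range(n_clusters)
    ]


def binarise(consensus, delta):
    """Keep gene g in cluster c when its consensus membership is >= delta.

    A high delta gives tight clusters and may leave genes unassigned;
    a low delta gives wide clusters that may overlap.
    """
    return [[1 if u >= delta else 0 for u in row] for row in consensus]


# Three aligned partitions of four genes into two clusters (toy data).
partitions = [
    [[1, 1, 0, 0], [0, 0, 1, 1]],
    [[1, 1, 1, 0], [0, 0, 0, 1]],
    [[1, 0, 0, 0], [0, 1, 1, 1]],
]
consensus = consensus_matrix(partitions)
tight = binarise(consensus, delta=1.0)
wide = binarise(consensus, delta=0.3)
```

With `delta = 1.0` only genes that every partition places in a cluster are kept, so the two middle genes stay unassigned; with `delta = 0.3` the clusters widen and overlap on those two genes.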
Affiliation(s)
- Basel Abu-Jamous
  - Department of Electronic and Computer Engineering, Brunel University, Uxbridge, Middlesex, UB8 3PH UK
- Rui Fa
  - Department of Electronic and Computer Engineering, Brunel University, Uxbridge, Middlesex, UB8 3PH UK
- David J Roberts
  - National Health Service Blood and Transplant, Oxford, UK
  - Radcliffe Department of Medicine, University of Oxford, John Radcliffe Hospital, Oxford, UK
- Asoke K Nandi
  - Department of Electronic and Computer Engineering, Brunel University, Uxbridge, Middlesex, UB8 3PH UK
  - Department of Mathematical Information Technology, University of Jyväskylä, Jyväskylä, Finland
10
Fa R, Nandi AK. Noise Resistant Generalized Parametric Validity Index of Clustering for Gene Expression Data. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2014; 11:741-752. [PMID: 26356344 DOI: 10.1109/tcbb.2014.2312006] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Indexed: 06/05/2023]
Abstract
Validity indices have been investigated for decades. However, since there is no study of noise-resistance performance of these indices in the literature, there is no guideline for determining the best clustering in noisy data sets, especially microarray data sets. In this paper, we propose a generalized parametric validity (GPV) index which employs two tunable parameters α and β to control the proportions of objects being considered to calculate the dissimilarities. The greatest advantage of the proposed GPV index is its noise-resistance ability, which results from the flexibility of tuning the parameters. Several rules are set to guide the selection of parameter values. To illustrate the noise-resistance performance of the proposed index, we evaluate the GPV index for assessing five clustering algorithms in two gene expression data simulation models with different noise levels and compare the ability of determining the number of clusters with eight existing indices. We also test the GPV in three groups of real gene expression data sets. The experimental results suggest that the proposed GPV index has superior noise-resistance ability and provides fairly accurate judgements.
11
Fa R, Roberts DJ, Nandi AK. SMART: unique splitting-while-merging framework for gene clustering. PLoS One 2014; 9:e94141. [PMID: 24714159 PMCID: PMC3979766 DOI: 10.1371/journal.pone.0094141] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.6] [Received: 11/06/2013] [Accepted: 03/14/2014] [Indexed: 11/18/2022] Open
Abstract
Successful clustering algorithms are highly dependent on parameter settings. The clustering performance degrades significantly unless parameters are properly set, and yet it is difficult to set these parameters a priori. To address this issue, in this paper, we propose a unique splitting-while-merging clustering framework, named "splitting merging awareness tactics" (SMART), which does not require any a priori knowledge of either the number of clusters or even the possible range of this number. Unlike existing self-splitting algorithms, which over-cluster the dataset into a large number of clusters and then merge some similar clusters, our framework has the ability to split and merge clusters automatically during the process and produces the most reliable clustering results by intrinsically integrating many clustering techniques and tasks. The SMART framework is implemented with two distinct clustering paradigms in two algorithms: competitive learning and finite mixture model. Nevertheless, within the proposed SMART framework, many other algorithms can be derived for different clustering paradigms. The minimum message length algorithm is integrated into the framework as the clustering selection criterion. The usefulness of the SMART framework and its algorithms is tested on demonstration datasets and simulated gene expression datasets. Moreover, two real microarray gene expression datasets are studied using this approach. Across many performance metrics, all numerical results show that SMART is superior to the existing self-splitting algorithms and traditional algorithms it was compared with. Three main properties of the proposed SMART framework are summarized as: (1) needing no parameters dependent on the respective dataset or a priori knowledge about the datasets, (2) extendible to many different applications, (3) offering superior performance compared with counterpart algorithms.
Affiliation(s)
- Rui Fa
  - Department of Electronic and Computer Engineering, Brunel University, Uxbridge, Middlesex, United Kingdom
- David J. Roberts
  - National Health Service Blood and Transplant, Oxford, United Kingdom
  - The University of Oxford, John Radcliffe Hospital, Oxford, United Kingdom
- Asoke K. Nandi
  - Department of Electronic and Computer Engineering, Brunel University, Uxbridge, Middlesex, United Kingdom
  - Department of Mathematical Information Technology, University of Jyväskylä, Jyväskylä, Finland