Reference Citation Analysis

For: Getz G, Levine E, Domany E. Coupled two-way clustering analysis of gene microarray data. Proc Natl Acad Sci U S A 2000;97:12079-84. [PMID: 11035779 PMCID: PMC17297 DOI: 10.1073/pnas.210134797] [Citation(s) in RCA: 309] [Impact Index Per Article: 12.9] [Indexed: 11/18/2022]

Cited by Other Article(s)

Jain N, Ghosh S, Ghosh A. A parameter free relative density based biclustering method for identifying non-linear feature relations. Heliyon 2024;10:e34736. [PMID: 39157398 PMCID: PMC11327522 DOI: 10.1016/j.heliyon.2024.e34736] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/16/2023] [Revised: 07/09/2024] [Accepted: 07/16/2024] [Indexed: 08/20/2024]

Abstract

Existing biclustering algorithms often depend on assumptions like monotonicity or linearity of feature relations for finding biclusters. Though a few algorithms overcome this problem using density-based methods, they tend to miss many biclusters because they use global criteria for identifying dense regions. The proposed method, PF-RelDenBi, uses local variations in marginal and joint densities for each pair of features to find the subset of observations that forms the basis of the relation between them. It then finds the set of features connected by a common set of observations using a non-linear feature relation index, resulting in a bicluster. This approach allows us to find biclusters based on feature relations, even if the relations are non-linear or non-monotonic. Additionally, the proposed method does not require the user to provide any parameters, allowing its application to datasets from different domains. To study the behaviour of PF-RelDenBi on datasets with different properties, experiments were carried out on sixteen simulated datasets and the performance was compared with eleven state-of-the-art algorithms. The proposed method is seen to produce better results for most of the simulated datasets. Experiments were also conducted with five benchmark datasets, in which biclusters were detected using PF-RelDenBi. For the first two datasets, the detected biclusters were used to generate additional features that improved classification performance. For the other three datasets, the performance of PF-RelDenBi was compared with the eleven state-of-the-art methods in terms of accuracy, NMI and ARI. The proposed method is seen to detect biclusters with greater accuracy. The proposed technique has also been applied to the COVID-19 dataset to identify some demographic features that are likely to affect the spread of COVID-19.

Chekouo T, Mukherjee H. A Bayesian hierarchical hidden Markov model for clustering and gene selection: Application to kidney cancer gene expression data. Biom J 2024;66:e2300173. [PMID: 38817110 PMCID: PMC11239327 DOI: 10.1002/bimj.202300173] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/25/2023] [Revised: 02/18/2024] [Accepted: 03/02/2024] [Indexed: 06/01/2024]

Castanho EN, Aidos H, Madeira SC. Biclustering data analysis: a comprehensive survey. Brief Bioinform 2024;25:bbae342. [PMID: 39007596 PMCID: PMC11247412 DOI: 10.1093/bib/bbae342] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/28/2023] [Revised: 05/16/2024] [Accepted: 07/01/2024] [Indexed: 07/16/2024]

Liu F, Yang Y, Xu XS, Yuan M. MESBC: A novel mutually exclusive spectral biclustering method for cancer subtyping. Comput Biol Chem 2024;109:108009. [PMID: 38219419 DOI: 10.1016/j.compbiolchem.2023.108009] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/11/2023] [Revised: 12/22/2023] [Accepted: 12/24/2023] [Indexed: 01/16/2024]

Abstract

Many soft biclustering algorithms have been developed and applied to various biological and biomedical data analyses. However, few mutually exclusive (hard) biclustering algorithms have been proposed, which could better identify disease or molecular subtypes with survival significance based on genomic or transcriptomic data. In this study, we developed a novel mutually exclusive spectral biclustering (MESBC) algorithm based on spectral method to detect mutually exclusive biclusters. MESBC simultaneously detects relevant features (genes) and corresponding conditions (patients) subgroups and, therefore, automatically uses the signature features for each subtype to perform the clustering. Extensive simulations revealed that MESBC provided superior accuracy in detecting pre-specified biclusters compared with the non-negative matrix factorization (NMF) and Dhillon's algorithm, particularly in very noisy data. Further analysis of the algorithm on real datasets obtained from the TCGA database showed that MESBC provided more accurate (i.e., smaller p-value) overall survival prediction in patients with lung adenocarcinoma (LUAD) and lung squamous cell carcinoma (LUSC) cancers when compared to the existing, gold-standard subtypes for lung cancers (integrative clustering). Furthermore, MESBC detected several genes with significant prognostic value in both LUAD and LUSC patients. External validation on an independent, unseen GEO dataset of LUAD showed that MESBC-derived clusters based on TCGA data still exhibited clear biclustering patterns and consistent, outstanding prognostic predictability, demonstrating robust generalizability of MESBC. Therefore, MESBC could potentially be used as a risk stratification tool to optimize the treatment for the patient, improve the selection of patients for clinical trials, and contribute to the development of novel therapeutic agents.

Han W, Zhang S, Gao H, Bu D. Clustering on hierarchical heterogeneous data with prior pairwise relationships. BMC Bioinformatics 2024;25:40. [PMID: 38262930 PMCID: PMC10807103 DOI: 10.1186/s12859-024-05652-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/14/2023] [Accepted: 01/12/2024] [Indexed: 01/25/2024]

Abstract

BACKGROUND

Clustering is a fundamental problem in statistics and has broad applications in various areas. Traditional clustering methods treat features equally and ignore the potential structure brought by the characteristic difference of features. Especially in cancer diagnosis and treatment, several types of biological features are collected and analyzed together. Treating these features equally fails to identify the heterogeneity of both data structure and cancer itself, which leads to incompleteness and inefficacy of current anti-cancer therapies.

OBJECTIVES

In this paper, we propose a clustering framework based on hierarchical heterogeneous data with prior pairwise relationships. The proposed clustering method fully characterizes the difference of features and identifies potential hierarchical structure by rough and refined clusters.

RESULTS

The refined clustering further divides the clusters obtained by the rough clustering into different subtypes. It thus provides deeper insight into cancer that cannot be detected by existing clustering methods. The proposed method is also flexible with respect to prior information: additional pairwise relationships between samples can be incorporated to improve clustering performance. Finally, well-grounded statistical consistency properties of the proposed method are rigorously established, including the accurate estimation of parameters and determination of clustering structures.

CONCLUSIONS

Our proposed method achieves better clustering performance than other methods in simulation studies, and the clustering accuracy increases with prior information incorporated. Meaningful biological findings are obtained in the analysis of lung adenocarcinoma with clinical imaging data and omics data, showing that hierarchical structure produced by rough and refined clustering is necessary and reasonable.

Pauk J, Daunoraviciene K, Ziziene J, Minta-Bielecka K, Dzieciol-Anikiej Z. Classification of muscle activity patterns in healthy children using biclustering algorithm. Biomed Signal Process Control 2023. [DOI: 10.1016/j.bspc.2023.104731] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/08/2023]

Zhang W, Wendt C, Bowler R, Hersh CP, Safo SE. Robust integrative biclustering for multi-view data. Stat Methods Med Res 2022;31:2201-2216. [PMID: 36113157 PMCID: PMC10153449 DOI: 10.1177/09622802221122427] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 11/16/2022]

Liu T, Yu H, Blair RH. Stability estimation for unsupervised clustering: A review. WILEY INTERDISCIPLINARY REVIEWS. COMPUTATIONAL STATISTICS 2022;14:e1575. [PMID: 36583207 PMCID: PMC9787023 DOI: 10.1002/wics.1575] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 09/01/2021] [Revised: 11/24/2021] [Accepted: 12/08/2021] [Indexed: 01/01/2023]

Michalak M. Hierarchical heuristics for Boolean-reasoning-based binary bicluster induction. ACTA INFORM 2022. [DOI: 10.1007/s00236-021-00415-9] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 11/01/2022]

Maisog JM, DeMarco AT, Devarajan K, Young SS, Fogel P, Luta G. Assessing Methods for Evaluating the Number of Components in Non-Negative Matrix Factorization. MATHEMATICS (BASEL, SWITZERLAND) 2021;9:2840. [PMID: 35694180 PMCID: PMC9181460 DOI: 10.3390/math9222840] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 05/09/2023]

Fang Q, Su D, Ng W, Feng J. An Effective Biclustering-Based Framework for Identifying Cell Subpopulations From scRNA-seq Data. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2021;18:2249-2260. [PMID: 32167906 DOI: 10.1109/tcbb.2020.2979717] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 06/10/2023]

A Holistic Performance Comparison for Lung Cancer Classification Using Swarm Intelligence Techniques. JOURNAL OF HEALTHCARE ENGINEERING 2021;2021:6680424. [PMID: 34373776 PMCID: PMC8349254 DOI: 10.1155/2021/6680424] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 12/14/2020] [Accepted: 07/17/2021] [Indexed: 12/22/2022]

Salcedo EC, Winter MB, Khuri N, Knudsen GM, Sali A, Craik CS. Global Protease Activity Profiling Identifies HER2-Driven Proteolysis in Breast Cancer. ACS Chem Biol 2021;16:712-723. [PMID: 33765766 DOI: 10.1021/acschembio.0c01000] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Indexed: 02/07/2023]

Abstract

Differential expression of extracellular proteases and endogenous protease inhibitors has been associated with distinct molecular subtypes of breast cancer. However, due to the tight post-translational regulation of protease activity, protease expression-level data alone are not sufficient to understand the role of proteases in malignant transformation. Therefore, we hypothesized that global profiles of extracellular protease activity could more completely reflect differences observed at the transcriptional level in breast cancer and that subtype-associated protease activity may be leveraged to identify specific proteases that play a functional role in cancer signaling. Here, we used a global peptide library-based approach to profile the activities of proteases within distinct breast cancer subtypes. Analysis of 3651 total peptide cleavages from a panel of well-characterized breast cancer cell lines demonstrated differences in proteolytic signatures between cell lines. Cell line clustering based on protease cleavages within the peptide library expanded upon the expected classification derived from transcriptional profiling. An isogenic cell line model developed to further interrogate proteolysis in the HER2 subtype revealed a proteolytic signature consistent with activation of TGF-β signaling. Specifically, we determined that a metalloprotease involved in TGF-β signaling, BMP1, was upregulated at both the protein (2-fold, P = 0.001) and activity (P = 0.0599) levels. Inhibition of BMP1 and HER2 suppressed invasion of HER2-expressing cells by 35% (P < 0.0001), compared to 15% (P = 0.0086) observed in cells where only HER2 was inhibited. In summary, through global identification of extracellular proteolysis in breast cancer cell lines, we demonstrate subtype-specific differences in protease activity and elucidate proteolysis associated with HER2-mediated signaling.

Li Y, Bandyopadhyay D, Xie F, Xu Y. BAREB: A Bayesian repulsive biclustering model for periodontal data. Stat Med 2020;39:2139-2151. [PMID: 32246534 PMCID: PMC7272289 DOI: 10.1002/sim.8536] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 07/22/2019] [Revised: 02/12/2020] [Accepted: 03/07/2020] [Indexed: 11/11/2022]

Flynn C, Perry P. Profile likelihood biclustering. Electron J Stat 2020. [DOI: 10.1214/19-ejs1667] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Indexed: 11/19/2022]

Biswal BS, Mohapatra A, Vipsita S. Triclustering of gene expression microarray data using coarse grained and dynamic deme based parallel genetic approach. EVOLUTIONARY INTELLIGENCE 2019. [DOI: 10.1007/s12065-019-00330-6] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 11/29/2022]

Saini N, Saha S, Soni C, Bhattacharyya P. Automatic evolution of bi-clusters from microarray data using self-organized multi-objective evolutionary algorithm. APPL INTELL 2019. [DOI: 10.1007/s10489-019-01554-w] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 10/25/2022]

Cirrincione G, Ciravegna G, Barbiero P, Randazzo V, Pasero E. The GH-EXIN neural network for hierarchical clustering. Neural Netw 2019;121:57-73. [PMID: 31536900 DOI: 10.1016/j.neunet.2019.07.018] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 01/15/2019] [Revised: 06/11/2019] [Accepted: 07/21/2019] [Indexed: 10/26/2022]

Acharya S, Saha S, Sahoo P. Bi-clustering of microarray data using a symmetry-based multi-objective optimization framework. Soft comput 2019. [DOI: 10.1007/s00500-018-3227-5] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Indexed: 10/16/2022]

Zhang L, Wei Y, Yan X, Li N, Song H, Yang L, Wu Y, Xi YF, Weng HW, Li JH, Lin EH, Zou LQ. Survivin is a prognostic marker and therapeutic target for extranodal, nasal-type natural killer/T cell lymphoma. ANNALS OF TRANSLATIONAL MEDICINE 2019;7:316. [PMID: 31475186 DOI: 10.21037/atm.2019.06.53] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Indexed: 02/05/2023]

Singh A, Bhanot G, Khiabanian H. TuBA: Tunable biclustering algorithm reveals clinically relevant tumor transcriptional profiles in breast cancer. Gigascience 2019;8:giz064. [PMID: 31216036 PMCID: PMC6582332 DOI: 10.1093/gigascience/giz064] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 02/20/2019] [Revised: 04/17/2019] [Accepted: 05/06/2019] [Indexed: 11/13/2022]

Abstract

BACKGROUND

Traditional clustering approaches for gene expression data are not well adapted to address the complexity and heterogeneity of tumors, where small sets of genes may be aberrantly co-expressed in specific subsets of tumors. Biclustering algorithms that perform local clustering on subsets of genes and conditions help address this problem. We propose a graph-based Tunable Biclustering Algorithm (TuBA) based on a novel pairwise proximity measure, examining the relationship of samples at the extremes of genes' expression profiles to identify similarly altered signatures.

RESULTS

TuBA's predictions are consistent in 3,940 breast invasive carcinoma samples from 3 independent sources, using different technologies for measuring gene expression (RNA sequencing and Microarray). More than 60% of biclusters identified independently in each dataset had significant agreement in their gene sets, as well as similar clinical implications. Approximately 50% of biclusters were enriched in the estrogen receptor-negative/HER2-negative (or basal-like) subtype, while >50% were associated with transcriptionally active copy number changes. Biclusters representing gene co-expression patterns in stromal tissue were also identified in tumor specimens.

CONCLUSIONS

TuBA offers a simple biclustering method that can identify biologically relevant gene co-expression signatures not captured by traditional unsupervised clustering approaches. It complements biclustering approaches that are designed to identify constant or coherent submatrices in gene expression datasets, and outperforms them in identifying a multitude of altered transcriptional profiles that are associated with observed genomic heterogeneity of diseased states in breast cancer, both within and across tumor subtypes, a promising step in understanding disease heterogeneity, and a necessary first step in individualized therapy.

Wang T, Zhang J, Huang K. Generalized gene co-expression analysis via subspace clustering using low-rank representation. BMC Bioinformatics 2019;20:196. [PMID: 31074376 PMCID: PMC6509871 DOI: 10.1186/s12859-019-2733-5] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Indexed: 11/26/2022]

Probabilistic density-based estimation of the number of clusters using the DBSCAN-martingale process. Pattern Recognit Lett 2019. [DOI: 10.1016/j.patrec.2019.03.002] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Indexed: 11/19/2022]

Rahaman MA, Turner JA, Gupta CN, Rachakonda S, Chen J, Liu J, van Erp TGM, Potkin S, Ford J, Mathalon D, Lee HJ, Jiang W, Mueller BA, Andreassen O, Agartz I, Sponheim SR, Mayer AR, Stephen J, Jung RE, Canive J, Bustillo J, Calhoun VD. N-BiC: A Method for Multi-Component and Symptom Biclustering of Structural MRI Data: Application to Schizophrenia. IEEE Trans Biomed Eng 2019;67:110-121. [PMID: 30946659 PMCID: PMC7906485 DOI: 10.1109/tbme.2019.2908815] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.8] [Indexed: 11/07/2022]

Kalisky T, Oriel S, Bar-Lev TH, Ben-Haim N, Trink A, Wineberg Y, Kanter I, Gilad S, Pyne S. A brief review of single-cell transcriptomic technologies. Brief Funct Genomics 2019;17:64-76. [PMID: 28968725 DOI: 10.1093/bfgp/elx019] [Citation(s) in RCA: 29] [Impact Index Per Article: 5.8] [Indexed: 02/04/2023]

Gupta MK, Vadde R. Identification and characterization of differentially expressed genes in Type 2 Diabetes using in silico approach. Comput Biol Chem 2019;79:24-35. [PMID: 30708140 DOI: 10.1016/j.compbiolchem.2019.01.010] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.6] [Received: 03/29/2018] [Revised: 12/26/2018] [Accepted: 01/23/2019] [Indexed: 12/14/2022]

Franco M, Vivo JM. Cluster Analysis of Microarray Data. Methods Mol Biol 2019;1986:153-183. [PMID: 31115888 DOI: 10.1007/978-1-4939-9442-7_7] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Indexed: 12/12/2022]

Single-cell analyses demonstrate that a heme-GATA1 feedback loop regulates red cell differentiation. Blood 2018;133:457-469. [PMID: 30530752 DOI: 10.1182/blood-2018-05-850412] [Citation(s) in RCA: 32] [Impact Index Per Article: 5.3] [Received: 05/15/2018] [Accepted: 12/01/2018] [Indexed: 01/07/2023]

Abstract

Erythropoiesis is the complex, dynamic, and tightly regulated process that generates all mature red blood cells. To understand this process, we mapped the developmental trajectories of progenitors from wild-type, erythropoietin-treated, and Flvcr1-deleted mice at single-cell resolution. Importantly, we linked the quantity of each cell's surface proteins to its total transcriptome, which is a novel method. Deletion of Flvcr1 results in high levels of intracellular heme, allowing us to identify heme-regulated circuitry. Our studies demonstrate that in early erythroid cells (CD71⁺Ter119^neg-lo), heme increases ribosomal protein transcripts, suggesting that heme, in addition to upregulating globin transcription and translation, guarantees ample ribosomes for globin synthesis. In later erythroid cells (CD71⁺Ter119^lo-hi), heme decreases GATA1, GATA1-target gene, and mitotic spindle gene expression. These changes occur quickly. For example, in confirmatory studies using human marrow erythroid cells, ribosomal protein transcripts and proteins increase, and GATA1 transcript and protein decrease, within 15 to 30 minutes of amplifying endogenous heme synthesis with aminolevulinic acid. Because GATA1 initiates heme synthesis, GATA1 and heme together direct red cell maturation, and heme stops GATA1 synthesis, our observations reveal a GATA1-heme autoregulatory loop and implicate GATA1 and heme as the comaster regulators of the normal erythroid differentiation program. In addition, as excessive heme could amplify ribosomal protein imbalance, prematurely lower GATA1, and impede mitosis, these data may help explain the ineffective (early termination of) erythropoiesis in Diamond Blackfan anemia and del(5q) myelodysplasia, disorders with excessive heme in colony-forming unit-erythroid/proerythroblasts, explain why these anemias are macrocytic, and show why children with GATA1 mutations have DBA-like clinical phenotypes.

Clustering, Pathway Enrichment, and Protein-Protein Interaction Analysis of Gene Expression in Neurodevelopmental Disorders. Adv Pharmacol Sci 2018;2018:3632159. [PMID: 30598663 PMCID: PMC6288580 DOI: 10.1155/2018/3632159] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Received: 08/09/2018] [Accepted: 10/30/2018] [Indexed: 12/21/2022]

Abstract

Neuronal developmental disorder is a class of diseases in which there is impairment of the central nervous system and brain function. The brain in its developmental phase undergoes tremendous changes depending upon the stage and environmental factors. Neurodevelopmental disorders include abnormalities associated with cognitive, speech, reading, writing, linguistic, communication, and growth disorders with lifetime effects. Computational methods provide great potential for betterment of research and insight into the molecular mechanism of diseases. In this study, we have used four samples of microarray neuronal developmental data: control, RV (resveratrol), NGF (nerve growth factor), and RV + NGF. By using computational methods, we have identified genes that are expressed in the early stage of neuronal development and also involved in neuronal diseases. We have used MeV application to cluster the raw data using distance metric Pearson correlation coefficient. Finally, 60 genes were selected on the basis of coexpression analysis. Further pathway analysis was done using the Metascape tool, and the biological process was studied using gene ontology database. A total of 13 genes AKT1, BAD, BAX, BCL2, BDNF, CASP3, CASP8, CASP9, MYC, PIK3CD, MAPK1, MAPK10, and CYCS were identified that are common in all clusters. These genes are involved in neuronal developmental disorders and cancers like colorectal cancer, apoptosis, tuberculosis, amyotrophic lateral sclerosis (ALS), neuron death, and prostate cancer pathway. A protein-protein interaction study was done to identify proteins that belong to the same pathway. These genes can be used to design potential inhibitors against neurological disorders at the early stage of neuronal development. The microarray samples discussed in this publication are part of the data deposited in NCBI's Gene Expression Omnibus (Yadav et al., 2018) and are accessible through GEO Series (accession number GSE121261).
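The abstract above clusters expression profiles using a Pearson-correlation distance metric. As a rough illustration of what such a metric computes, the sketch below uses the common definition d = 1 - r, where r is the Pearson correlation coefficient. This is an assumption: the exact formula used by the MeV application is not stated in the abstract. The function name `pearson_distance` and the four-value profiles are hypothetical, not data from the study.

```python
import math


def pearson_distance(x, y):
    """Return 1 - r, where r is the Pearson correlation of x and y.

    Assumed definition for illustration; 0 means perfectly correlated
    profiles, 2 means perfectly anti-correlated profiles.
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("profiles must have equal length of at least 2")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and squared deviations from the means.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant profile")
    r = cov / math.sqrt(var_x * var_y)
    return 1.0 - r


# Hypothetical profiles across four samples (e.g. control, RV, NGF, RV + NGF).
gene_a = [1.0, 2.0, 3.0, 4.0]
gene_b = [2.0, 4.0, 6.0, 8.0]   # same shape as gene_a, different scale
gene_c = [4.0, 3.0, 2.0, 1.0]   # opposite trend to gene_a

dist_ab = pearson_distance(gene_a, gene_b)
dist_ac = pearson_distance(gene_a, gene_c)
```

Because the distance depends only on the shape of a profile and not on its scale, `gene_a` and `gene_b` end up close together while `gene_a` and `gene_c` end up far apart, which is why this kind of metric is popular for co-expression clustering.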

Yin L, Qiu J, Gao S. Biclustering of Gene Expression Data Using Cuckoo Search and Genetic Algorithm. INT J PATTERN RECOGN 2018. [DOI: 10.1142/s0218001418500398] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Indexed: 11/18/2022]

Abstract

Biclustering analysis of gene expression data can reveal a large number of biologically significant local gene expression patterns. Therefore, many biclustering algorithms apply meta-heuristic algorithms such as the genetic algorithm (GA) and cuckoo search (CS) to find biclusters. However, different meta-heuristic algorithms have different applicability and characteristics. For example, the CS algorithm can obtain high-quality biclusters and has strong global search ability, but its local search ability is relatively poor. In contrast, the GA has strong local search ability, but its global search ability is poor. To improve both the global search ability and coverage and the local search ability and quality of biclustering, this paper proposes a meta-heuristic algorithm based on the GA and CS algorithms (GA-CS Biclustering, GACSB) to solve the problem of gene expression data clustering. The algorithm uses the CS algorithm as the main framework, and uses the tournament strategy and the elite retention strategy based on the GA to generate the next generation of the population. Compared with the experimental results of common biclustering algorithms such as CC, FLOC, ISA, SEBI, SSB and CSB, the GACSB algorithm not only obtains biclusters of high quality, but also obtains biclusters of high biological significance. In addition, we use different bicluster evaluation indicators, such as Average Correlation Value (ACV), Mean-Squared Residue (MSR) and Virtual Error (VE), and verify that the GACSB algorithm has strong scalability.
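Of the evaluation indicators named in this abstract, the mean-squared residue (MSR) has a compact standard definition, introduced by Cheng and Church: for a bicluster with entries a_ij, the residue of each entry is a_ij minus its row mean, minus its column mean, plus the overall mean, and the MSR is the average of the squared residues. The sketch below illustrates that definition only. It is not the GACSB authors' code, and the function name `mean_squared_residue` and the example matrices are made up for illustration.

```python
def mean_squared_residue(bicluster):
    """Mean-squared residue of a bicluster given as a list of equal-length rows.

    A value of 0 means every row differs from the others by a constant
    shift (a perfectly additive, coherent bicluster).
    """
    n_rows = len(bicluster)
    n_cols = len(bicluster[0])
    if n_rows == 0 or n_cols == 0:
        raise ValueError("bicluster must be non-empty")
    if any(len(row) != n_cols for row in bicluster):
        raise ValueError("all rows must have the same length")

    row_means = [sum(row) / n_cols for row in bicluster]
    col_means = [
        sum(bicluster[i][j] for i in range(n_rows)) / n_rows
        for j in range(n_cols)
    ]
    overall_mean = sum(row_means) / n_rows

    total = 0.0
    for i in range(n_rows):
        for j in range(n_cols):
            # Residue: what is left after removing row and column effects.
            residue = bicluster[i][j] - row_means[i] - col_means[j] + overall_mean
            total += residue ** 2
    return total / (n_rows * n_cols)


# Hypothetical examples: rows shifted by constants versus a checkerboard.
coherent = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [4.0, 5.0, 6.0]]
incoherent = [[1.0, 0.0], [0.0, 1.0]]

msr_coherent = mean_squared_residue(coherent)
msr_incoherent = mean_squared_residue(incoherent)
```

A lower MSR indicates a more coherent bicluster, which is why MSR-based methods typically pair it with a size or coverage term: a single row or column trivially scores 0.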

Ding KF, Finlay D, Yin H, Hendricks WPD, Sereduk C, Kiefer J, Sekulic A, LoRusso PM, Vuori K, Trent JM, Schork NJ. Network Rewiring in Cancer: Applications to Melanoma Cell Lines and the Cancer Genome Atlas Patients. Front Genet 2018;9:228. [PMID: 30042785 PMCID: PMC6048451 DOI: 10.3389/fgene.2018.00228] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Received: 01/17/2018] [Accepted: 06/08/2018] [Indexed: 01/21/2023]

A review of conceptual clustering algorithms. Artif Intell Rev 2018. [DOI: 10.1007/s10462-018-9627-1] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.0] [Indexed: 10/17/2022]

A contiguous column coherent evolution biclustering algorithm for time-series gene expression data. INT J MACH LEARN CYB 2018. [DOI: 10.1007/s13042-015-0487-6] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Indexed: 10/22/2022]

Bentham RB, Bryson K, Szabadkai G. MCbiclust: a novel algorithm to discover large-scale functionally related gene sets from massive transcriptomics data collections. Nucleic Acids Res 2017;45:8712-8730. [PMID: 28911113 PMCID: PMC5587796 DOI: 10.1093/nar/gkx590] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.3] [Received: 09/12/2016] [Accepted: 07/01/2017] [Indexed: 12/16/2022]

Drug–target interaction prediction by integrating multiview network data. Comput Biol Chem 2017. [DOI: 10.1016/j.compbiolchem.2017.03.011] [Citation(s) in RCA: 22] [Impact Index Per Article: 3.1] [Indexed: 12/11/2022]

Shi F, Huang H. Identifying Cell Subpopulations and Their Genetic Drivers from Single-Cell RNA-Seq Data Using a Biclustering Approach. J Comput Biol 2017;24:663-674. [PMID: 28657835 PMCID: PMC5510693 DOI: 10.1089/cmb.2017.0049] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Indexed: 11/13/2022]

Huang Y. Clustering multi-typed objects in extended star-structured heterogeneous data. INTELL DATA ANAL 2017. [DOI: 10.3233/ida-150416] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Indexed: 11/15/2022]

Henriques R, Ferreira FL, Madeira SC. BicPAMS: software for biological data analysis with pattern-based biclustering. BMC Bioinformatics 2017;18:82. [PMID: 28153040 PMCID: PMC5290636 DOI: 10.1186/s12859-017-1493-3] [Citation(s) in RCA: 21] [Impact Index Per Article: 3.0] [Received: 03/22/2016] [Accepted: 01/21/2017] [Indexed: 12/21/2022]

Abstract

BACKGROUND

Biclustering has been largely applied for the unsupervised analysis of biological data, being recognised today as a key technique to discover putative modules in both expression data (subsets of genes correlated in subsets of conditions) and network data (groups of coherently interconnected biological entities). However, given its computational complexity, only recent breakthroughs on pattern-based biclustering enabled efficient searches without the restrictions that state-of-the-art biclustering algorithms place on the structure and homogeneity of biclusters. As a result, pattern-based biclustering provides the unprecedented opportunity to discover non-trivial yet meaningful biological modules with putative functions, whose coherency and tolerance to noise can be tuned and made problem-specific.

METHODS

To enable the effective use of pattern-based biclustering by the scientific community, we developed BicPAMS (Biclustering based on PAttern Mining Software), a software that: 1) makes available state-of-the-art pattern-based biclustering algorithms (BicPAM (Henriques and Madeira, Alg Mol Biol 9:27, 2014), BicNET (Henriques and Madeira, Alg Mol Biol 11:23, 2016), BicSPAM (Henriques and Madeira, BMC Bioinforma 15:130, 2014), BiC2PAM (Henriques and Madeira, Alg Mol Biol 11:1-30, 2016), BiP (Henriques and Madeira, IEEE/ACM Trans Comput Biol Bioinforma, 2015), DeBi (Serin and Vingron, AMB 6:1-12, 2011) and BiModule (Okada et al., IPSJ Trans Bioinf 48(SIG5):39-48, 2007)); 2) consistently integrates their dispersed contributions; 3) further explores additional accuracy and efficiency gains; and 4) makes available graphical and application programming interfaces.

RESULTS

Results on both synthetic and real data confirm the relevance of BicPAMS for biological data analysis, highlighting its essential role for the discovery of putative modules with non-trivial yet biologically significant functions from expression and network data.

CONCLUSIONS

BicPAMS is the first biclustering tool offering the possibility to: 1) parametrically customize the structure, coherency and quality of biclusters; 2) analyze large-scale biological networks; and 3) tackle the restrictive assumptions placed by state-of-the-art biclustering algorithms. These contributions are shown to be key for an adequate, complete and user-assisted unsupervised analysis of biological data.

SOFTWARE

BicPAMS and its tutorial are available at http://www.bicpams.com.


Oyelade J, Isewon I, Oladipupo F, Aromolaran O, Uwoghiren E, Ameh F, Achas M, Adebiyi E. Clustering Algorithms: Their Application to Gene Expression Data. Bioinform Biol Insights 2016;10:237-253. [PMID: 27932867 PMCID: PMC5135122 DOI: 10.4137/bbi.s38316] [Citation(s) in RCA: 69] [Impact Index Per Article: 8.6] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/18/2016] [Revised: 09/05/2016] [Accepted: 09/09/2016] [Indexed: 12/17/2022] Open

Fushing H, Hsueh CH, Heitkamp C, Matthews MA, Koehl P. Unravelling the geometry of data matrices: effects of water stress regimes on winemaking. J R Soc Interface 2016;12:20150753. [PMID: 26468072 DOI: 10.1098/rsif.2015.0753] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [Key Words] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/12/2022] Open

Van Mechelen I, Bock HH, De Boeck P. Two-mode clustering methods: a structured overview. Stat Methods Med Res 2016;13:363-94. [PMID: 15516031 DOI: 10.1191/0962280204sm373ra] [Citation(s) in RCA: 134] [Impact Index Per Article: 16.8] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/05/2022]

Ensemble Feature Learning of Genomic Data Using Support Vector Machine. PLoS One 2016;11:e0157330. [PMID: 27304923 PMCID: PMC4909287 DOI: 10.1371/journal.pone.0157330] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/25/2015] [Accepted: 05/28/2016] [Indexed: 11/29/2022] Open

Abstract

The identification of a subset of genes having the ability to capture the necessary information to distinguish classes of patients is crucial in bioinformatics applications. Ensemble and bagging methods have been shown to work effectively in the process of gene selection and classification. Testament to that is random forest which combines random decision trees with bagging to improve overall feature selection and classification accuracy. Surprisingly, the adoption of these methods in support vector machines has only recently received attention but mostly on classification not gene selection. This paper introduces an ensemble SVM-Recursive Feature Elimination (ESVM-RFE) for gene selection that follows the concepts of ensemble and bagging used in random forest but adopts the backward elimination strategy which is the rationale of RFE algorithm. The rationale behind this is, building ensemble SVM models using randomly drawn bootstrap samples from the training set, will produce different feature rankings which will be subsequently aggregated as one feature ranking. As a result, the decision for elimination of features is based upon the ranking of multiple SVM models instead of choosing one particular model. Moreover, this approach will address the problem of imbalanced datasets by constructing a nearly balanced bootstrap sample. Our experiments show that ESVM-RFE for gene selection substantially increased the classification performance on five microarray datasets compared to state-of-the-art methods. Experiments on the childhood leukaemia dataset show that an average 9% better accuracy is achieved by ESVM-RFE over SVM-RFE, and 5% over random forest based approach. The selected genes by the ESVM-RFE algorithm were further explored with Singular Value Decomposition (SVD) which reveals significant clusters with the selected data.
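The procedure described in this abstract (balanced bootstrap samples, one linear SVM per sample, aggregated feature rankings, backward elimination) can be sketched in plain Python. This is an illustrative sketch under stated assumptions, not the authors' implementation: the names `train_linear_svm` and `ensemble_rfe`, the toy data, the Pegasos-style solver standing in for a full SVM library, the use of summed squared weights to aggregate rankings, and the removal of one feature per round are all assumptions made here.

```python
import random


def train_linear_svm(X, y, epochs=200, lam=0.01, seed=0):
    """Stand-in linear SVM (no bias term) trained with Pegasos-style
    subgradient descent on the hinge loss. Labels must be -1 or +1."""
    rng = random.Random(seed)
    n, d = len(X), len(X[0])
    w = [0.0] * d
    t = 0
    for _ in range(epochs):
        for i in rng.sample(range(n), n):  # one shuffled pass over the data
            t += 1
            eta = 1.0 / (lam * t)
            margin = y[i] * sum(wj * xj for wj, xj in zip(w, X[i]))
            w = [(1.0 - eta * lam) * wj for wj in w]  # regularisation shrinkage
            if margin < 1.0:  # hinge loss is active for this sample
                w = [wj + eta * y[i] * xj for wj, xj in zip(w, X[i])]
    return w


def ensemble_rfe(X, y, n_keep=1, n_models=5, seed=0):
    """Hypothetical ensemble RFE: each round trains n_models SVMs on
    class-balanced bootstrap samples, sums the squared weights per feature
    (assumed aggregation rule), and drops the lowest-scoring feature."""
    rng = random.Random(seed)
    active = list(range(len(X[0])))
    pos = [i for i, label in enumerate(y) if label == 1]
    neg = [i for i, label in enumerate(y) if label == -1]
    m = min(len(pos), len(neg))  # equal draws per class balance the sample
    while len(active) > n_keep:
        score = [0.0] * len(active)
        for _ in range(n_models):
            idx = [rng.choice(pos) for _ in range(m)]
            idx += [rng.choice(neg) for _ in range(m)]
            Xb = [[X[i][j] for j in active] for i in idx]
            yb = [y[i] for i in idx]
            w = train_linear_svm(Xb, yb, seed=rng.randrange(10**9))
            for j, wj in enumerate(w):
                score[j] += wj * wj
        worst = min(range(len(active)), key=score.__getitem__)
        del active[worst]  # backward elimination step
    return active


# Toy imbalanced data (hypothetical): feature 0 carries the class signal,
# features 1 and 2 are pure noise.
_data_rng = random.Random(1)
y = [1] * 30 + [-1] * 10
X = [[3.0 * label + _data_rng.gauss(0, 0.5),
      _data_rng.gauss(0, 1),
      _data_rng.gauss(0, 1)] for label in y]

selected = ensemble_rfe(X, y, n_keep=1)
print("selected feature indices:", selected)
```

Basing each elimination on scores pooled across several bootstrap models, rather than on a single fit, is what distinguishes this from single-model SVM-RFE. In practice a tested SVM library would replace the stand-in solver.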


Wagner JR, Lee CT, Durrant JD, Malmstrom RD, Feher VA, Amaro RE. Emerging Computational Methods for the Rational Discovery of Allosteric Drugs. Chem Rev 2016;116:6370-90. [PMID: 27074285 PMCID: PMC4901368 DOI: 10.1021/acs.chemrev.5b00631] [Citation(s) in RCA: 158] [Impact Index Per Article: 19.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/17/2022]

Wang Z, Li G, Robinson RW, Huang X. UniBic: Sequential row-based biclustering algorithm for analysis of gene expression data. Sci Rep 2016;6:23466. [PMID: 27001340 PMCID: PMC4802312 DOI: 10.1038/srep23466] [Citation(s) in RCA: 24] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/18/2015] [Accepted: 03/08/2016] [Indexed: 11/29/2022] Open

Mortlock SA, Booth R, Mazrier H, Khatkar MS, Williamson P. Visualization of Genome Diversity in German Shepherd Dogs. Bioinform Biol Insights 2016;9:37-42. [PMID: 26884680 PMCID: PMC4750897 DOI: 10.4137/bbi.s30524] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/23/2015] [Revised: 12/06/2015] [Accepted: 12/11/2015] [Indexed: 12/16/2022] Open

Fuzzy soft subspace clustering method for gene co-expression network analysis. INT J MACH LEARN CYB 2016. [DOI: 10.1007/s13042-015-0486-7] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/22/2022]

Discovery of bidirectional contiguous column coherent bicluster in time-series gene expression data. INT J MACH LEARN CYB 2015. [DOI: 10.1007/s13042-015-0464-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/22/2022]

Cheon M, Kim C, Chang I. Uncovering multiloci-ordering by algebraic property of Laplacian matrix and its Fiedler vector. Bioinformatics 2015;32:801-7. [PMID: 26568627 DOI: 10.1093/bioinformatics/btv669] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/31/2015] [Accepted: 11/09/2015] [Indexed: 11/13/2022]

Nepomuceno JA, Troncoso A, Aguilar-Ruiz JS. Scatter search-based identification of local patterns with positive and negative correlations in gene expression data. Appl Soft Comput 2015. [DOI: 10.1016/j.asoc.2015.06.019] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.1] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/08/2023]

Pontes B, Giráldez R, Aguilar-Ruiz JS. Biclustering on expression data: A review. J Biomed Inform 2015;57:163-80. [PMID: 26160444 DOI: 10.1016/j.jbi.2015.06.028] [Citation(s) in RCA: 165] [Impact Index Per Article: 18.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/22/2015] [Revised: 06/22/2015] [Accepted: 06/30/2015] [Indexed: 11/28/2022]