Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Wang J, Delabie J, Aasheim HC, Smeland E, Myklebost O. Clustering of the SOM easily reveals distinct gene expression patterns: results of a reanalysis of lymphoma study. BMC Bioinformatics 2002;3:36. [PMID: 12445336 PMCID: PMC138792 DOI: 10.1186/1471-2105-3-36] [Citation(s) in RCA: 58] [Impact Index Per Article: 2.6] [Received: 06/14/2002] [Accepted: 11/24/2002] [Indexed: 12/03/2022] Open

Cited by Other Article(s)

Wu X, Han M, Song X, He S, Bo X, Zhu Y. COMMO: a web server for the identification and analysis of consensus gene modules across multiple methods. Bioinformatics 2023;39:btad708. [PMID: 37995293 PMCID: PMC10713113 DOI: 10.1093/bioinformatics/btad708] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/19/2023] [Revised: 11/05/2023] [Accepted: 11/22/2023] [Indexed: 11/25/2023] Open

Manganaro L, Bianco S, Bironzo P, Cipollini F, Colombi D, Corà D, Corti G, Doronzo G, Errico L, Falco P, Gandolfi L, Guerrera F, Monica V, Novello S, Papotti M, Parab S, Pittaro A, Primo L, Righi L, Sabbatini G, Sandri A, Vattakunnel S, Bussolino F, Scagliotti GV. Consensus clustering methodology to improve molecular stratification of non-small cell lung cancer. Sci Rep 2023;13:7759. [PMID: 37173325 PMCID: PMC10182023 DOI: 10.1038/s41598-023-33954-x] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 07/14/2022] [Accepted: 04/21/2023] [Indexed: 05/15/2023] Open

Affiliation(s)

L Manganaro: aizoOn Technology Consulting S.R.L, Torino, Italy
S Bianco: aizoOn Technology Consulting S.R.L, Torino, Italy
P Bironzo: Medical Oncology Division at San Luigi Hospital, Department of Oncology, University of Torino, Orbassano (TO), Italy
F Cipollini: aizoOn Technology Consulting S.R.L, Torino, Italy
D Colombi: aizoOn Technology Consulting S.R.L, Torino, Italy
D Corà: Department of Translational Medicine, Piemonte Orientale University, Novara, Italy; Center for Translational Research on Autoimmune and Allergic Diseases-CAAD, Novara, Italy
G Corti: Department of Oncology, University of Torino, 10060, Candiolo, Italy; Candiolo Cancer Institute-IRCCS-FPO, 10060, Candiolo, Italy
G Doronzo: Department of Oncology, University of Torino, 10060, Candiolo, Italy; Candiolo Cancer Institute-IRCCS-FPO, 10060, Candiolo, Italy
L Errico: Division of Thoracic Surgery at AOU San Luigi, Department of Oncology, University of Torino, Orbassano (TO), Italy
P Falco: aizoOn Technology Consulting S.R.L, Torino, Italy
L Gandolfi: Department of Oncology, University of Torino, 10060, Candiolo, Italy; Candiolo Cancer Institute-IRCCS-FPO, 10060, Candiolo, Italy
F Guerrera: Division of Thoracic Surgery at AOU Città della Salute e della Scienza, Department of Surgical Sciences, University of Torino, Torino, Italy
V Monica: Department of Oncology, University of Torino, 10060, Candiolo, Italy; Candiolo Cancer Institute-IRCCS-FPO, 10060, Candiolo, Italy
S Novello: Medical Oncology Division at San Luigi Hospital, Department of Oncology, University of Torino, Orbassano (TO), Italy
M Papotti: Pathology Division at AOU Città della Salute e della Scienza, Department of Oncology, University of Torino, Torino, Italy
S Parab: Department of Oncology, University of Torino, 10060, Candiolo, Italy; Candiolo Cancer Institute-IRCCS-FPO, 10060, Candiolo, Italy
A Pittaro: Pathology Division at AOU Città della Salute e della Scienza, Department of Oncology, University of Torino, Torino, Italy
L Primo: Department of Oncology, University of Torino, 10060, Candiolo, Italy; Candiolo Cancer Institute-IRCCS-FPO, 10060, Candiolo, Italy
L Righi: Pathology Division at AOU San Luigi, Department of Oncology, University of Torino, Orbassano (TO), Italy
G Sabbatini: aizoOn Technology Consulting S.R.L, Torino, Italy
A Sandri: Division of Thoracic Surgery at AOU San Luigi, Department of Oncology, University of Torino, Orbassano (TO), Italy
S Vattakunnel: aizoOn Technology Consulting S.R.L, Torino, Italy
F Bussolino: Department of Oncology, University of Torino, 10060, Candiolo, Italy; Candiolo Cancer Institute-IRCCS-FPO, 10060, Candiolo, Italy
G V Scagliotti: Medical Oncology Division at San Luigi Hospital, Department of Oncology, University of Torino, Orbassano (TO), Italy

Banoei MM, Mahé E, Mansoor A, Stewart D, Winston BW, Habibi HR, Shabani-Rad MT. NMR-based metabolomic profiling can differentiate follicular lymphoma from benign lymph node tissues and may be predictive of outcome. Sci Rep 2022;12:8294. [PMID: 35585165 PMCID: PMC9117304 DOI: 10.1038/s41598-022-12445-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/02/2022] [Accepted: 05/10/2022] [Indexed: 11/10/2022] Open

Yan Z, Shen Z, Li Z, Chao Q, Kong L, Gao ZF, Li QW, Zheng HY, Zhao CF, Lu CM, Wang YW, Wang BC. Genome-wide transcriptome and proteome profiles indicate an active role of alternative splicing during de-etiolation of maize seedlings. PLANTA 2020;252:60. [PMID: 32964359 DOI: 10.1007/s00425-020-03464-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/22/2020] [Accepted: 09/12/2020] [Indexed: 06/11/2023]

Affiliation(s)

Zhen Yan: Photosynthesis Research Center, Key Laboratory of Photobiology, Institute of Botany, Chinese Academy of Sciences, No. 20 Nanxincun, Xiangshan, Beijing, 100093, China; University of Chinese Academy of Sciences, 100049, Beijing, China
Zhuo Shen: Vegetable Research Institute, Guangdong Academy of Agricultural Sciences, Guangdong Key Laboratory for New Technology Research of Vegetables, Guangzhou, 510640, China
Zhe Li: Precision Scientific (Beijing) Co., Ltd., Beijing, 100085, China
Qing Chao: Photosynthesis Research Center, Key Laboratory of Photobiology, Institute of Botany, Chinese Academy of Sciences, No. 20 Nanxincun, Xiangshan, Beijing, 100093, China; University of Chinese Academy of Sciences, 100049, Beijing, China; The Innovative Academy of Seed Design, Chinese Academy of Sciences, Beijing, 100039, China
Lei Kong: State Key Laboratory of Protein and Plant Gene Research, Center for Bioinformatics, College of Life Sciences, Peking University, Beijing, 100871, China
Zhi-Fang Gao: Photosynthesis Research Center, Key Laboratory of Photobiology, Institute of Botany, Chinese Academy of Sciences, No. 20 Nanxincun, Xiangshan, Beijing, 100093, China
Qing-Wei Li: Beijing Botanical Garden, Institute of Botany, Chinese Academy of Sciences, Beijing, 100093, China
Hai-Yan Zheng: Center for Advanced Biotechnology and Medicine, Biological Mass Spectrometry Facility, Rutgers University, Piscataway, NJ, 08855, USA
Cai-Feng Zhao: Center for Advanced Biotechnology and Medicine, Biological Mass Spectrometry Facility, Rutgers University, Piscataway, NJ, 08855, USA
Cong-Ming Lu: State Key Laboratory of Crop Biology, College of Life Sciences, Shandong Agricultural University, Taian, 271018, Shandong, China
Ying-Wei Wang: Beijing Botanical Garden, Institute of Botany, Chinese Academy of Sciences, Beijing, 100093, China
Bai-Chen Wang: Photosynthesis Research Center, Key Laboratory of Photobiology, Institute of Botany, Chinese Academy of Sciences, No. 20 Nanxincun, Xiangshan, Beijing, 100093, China; University of Chinese Academy of Sciences, 100049, Beijing, China; The Innovative Academy of Seed Design, Chinese Academy of Sciences, Beijing, 100039, China

Li HM, Liu P, Zhang XJ, Li LM, Jiang HY, Yan H, Hou FH, Chen JP. Combined proteomics and transcriptomics reveal the genetic basis underlying the differentiation of skin appendages and immunity in pangolin. Sci Rep 2020;10:14566. [PMID: 32884035 PMCID: PMC7471334 DOI: 10.1038/s41598-020-71513-w] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 05/22/2020] [Accepted: 08/17/2020] [Indexed: 11/18/2022] Open

Tao J, Han Q, Zhou H, Diao X. Transcriptomic responses of regenerating earthworms (Eisenia foetida) to retinoic acid reveals the role of pluripotency genes. CHEMOSPHERE 2019;226:47-59. [PMID: 30913427 DOI: 10.1016/j.chemosphere.2019.03.111] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 01/14/2019] [Revised: 03/16/2019] [Accepted: 03/16/2019] [Indexed: 06/09/2023]

Manners HN, Roy S, Kalita JK. Intrinsic-overlapping co-expression module detection with application to Alzheimer's Disease. Comput Biol Chem 2018;77:373-389. [PMID: 30466046 DOI: 10.1016/j.compbiolchem.2018.10.014] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Received: 03/07/2018] [Revised: 10/28/2018] [Accepted: 10/29/2018] [Indexed: 11/18/2022]

Abstract

Genes interact with each other and may cause perturbation in the molecular pathways leading to complex diseases. Often, instead of any single gene, a subset of genes interact, forming a network, to share common biological functions. Such a subnetwork is called a functional module or motif. Identifying such modules and central key genes in them, that may be responsible for a disease, may help design patient-specific drugs. In this study, we consider the neurodegenerative Alzheimer's Disease (AD) and identify potentially responsible genes from functional motif analysis. We start from the hypothesis that central genes in genetic modules are more relevant to a disease that is under investigation and identify hub genes from the modules as potential marker genes. Motifs or modules are often non-exclusive or overlapping in nature. Moreover, they sometimes show intrinsic or hierarchical distributions with overlapping functional roles. To the best of our knowledge, no prior work handles both the situations in an integrated way. We propose a non-exclusive clustering approach, CluViaN (Clustering Via Network), that can detect intrinsic as well as overlapping modules from gene co-expression networks constructed using microarray expression profiles. We compare our method with existing methods to evaluate the quality of modules extracted. CluViaN reports the presence of intrinsic and overlapping motifs in different species not reported by any other research. We further apply our method to extract significant AD-specific modules using CluViaN and rank them based on the number of genes from a module involved in the disease pathways. Finally, top central genes are identified by topological analysis of the modules. We use two different AD phenotype data for experimentation. We observe that central genes, namely PSEN1, APP, NDUFB2, NDUFA1, UQCR10, PPP3R1 and a few more, play significant roles in AD. Interestingly, our experiments also find a hub gene, PML, which has recently been reported to play a role in plasticity, circadian rhythms and the response to proteins which can cause neurodegenerative disorders. MUC4, another hub gene that we find experimentally, is yet to be investigated for its potential role in AD. A software implementation of CluViaN in Java is available for download at https://sites.google.com/site/swarupnehu/publications/resources/CluViaN Software.rar.

Bioinformatics and Translation Elongation. BIOINFORMATICS AND THE CELL 2018:197-238. [PMCID: PMC7121122 DOI: 10.1007/978-3-319-90684-3_9] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 10/11/2023]

A Survey of Data Mining and Deep Learning in Bioinformatics. J Med Syst 2018;42:139. [DOI: 10.1007/s10916-018-1003-9] [Citation(s) in RCA: 81] [Impact Index Per Article: 13.5] [Received: 03/28/2018] [Accepted: 06/21/2018] [Indexed: 12/13/2022]

Aschenbrenner AC, Bassler K, Brondolin M, Bonaguro L, Carrera P, Klee K, Ulas T, Schultze JL, Hoch M. A cross-species approach to identify transcriptional regulators exemplified for Dnajc22 and Hnf4a. Sci Rep 2017;7:4056. [PMID: 28642491 PMCID: PMC5481429 DOI: 10.1038/s41598-017-04370-9] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 12/16/2016] [Accepted: 05/05/2017] [Indexed: 12/03/2022] Open

Kanzaki N, Kataoka T, Etani R, Sasaoka K, Kanagawa A, Yamaoka K. Analysis of liver damage from radon, X-ray, or alcohol treatments in mice using a self-organizing map. JOURNAL OF RADIATION RESEARCH 2017;58:33-40. [PMID: 27614200 PMCID: PMC5321189 DOI: 10.1093/jrr/rrw083] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 04/27/2016] [Revised: 06/24/2016] [Accepted: 06/30/2016] [Indexed: 05/30/2023]

Seno A, Kasai T, Ikeda M, Vaidyanath A, Masuda J, Mizutani A, Murakami H, Ishikawa T, Seno M. Characterization of Gene Expression Patterns among Artificially Developed Cancer Stem Cells Using Spherical Self-Organizing Map. Cancer Inform 2016;15:163-78. [PMID: 27559294 PMCID: PMC4988459 DOI: 10.4137/cin.s39839] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.0] [Received: 04/04/2016] [Revised: 05/15/2016] [Accepted: 05/30/2016] [Indexed: 12/20/2022] Open

Udatha DBRKG, Topakas E, Salazar M, Olsson L, Andersen MR, Panagiotou G. Deciphering the signaling mechanisms of the plant cell wall degradation machinery in Aspergillus oryzae. BMC SYSTEMS BIOLOGY 2015;9:77. [PMID: 26573537 PMCID: PMC4647334 DOI: 10.1186/s12918-015-0224-5] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Received: 06/29/2015] [Accepted: 10/28/2015] [Indexed: 12/30/2022]

Abstract

Background

The gene expression and secretion of fungal lignocellulolytic enzymes are tightly controlled at the transcription level using independent mechanisms to respond to distinct inducers from plant biomass. An advanced systems-level understanding of transcriptional regulatory networks is required to rationally engineer filamentous fungi for more efficient bioconversion of different types of biomass.

Results

In this study we focused on ten chemically defined inducers to drive expression of cellulases, hemicellulases and accessory enzymes in the model filamentous fungus Aspergillus oryzae and shed light on the complex network of transcriptional activators required. The chemical diversity analysis of the inducers, based on 186 chemical descriptors calculated from the structure, resulted in three clusters; however, the global, metabolic and extracellular protein transcription of the A. oryzae genome were only partially explained by the chemical similarity of the enzyme inducers. Genes encoding enzymes that have attracted considerable interest, such as cellobiose dehydrogenases and copper-dependent polysaccharide mono-oxygenases, presented a substrate-specific induction. Several homology-model structures were derived using ab-initio multiple threading alignment in our effort to elucidate the interplay of transcription factors involved in regulating plant-deconstructing enzymes and metabolites. Systematic investigation of metabolite-protein interactions, using the 814 unique reactants involved in 2360 reactions in the genome-scale metabolic network of A. oryzae, was performed through a two-step molecular docking against the binding pockets of the transcription factors AoXlnR and AoAmyR. A total of six metabolites, viz., sulfite (H₂SO₃), sulfate (SLF), uroporphyrinogen III (UPGIII), ethanolamine phosphate (PETHM), D-glyceraldehyde 3-phosphate (T3P1) and taurine (TAUR), were found to be strong binders, whereas the genes involved in the metabolic reactions in which these metabolites appear were found to be significantly differentially expressed when comparing the inducers with glucose.

Conclusions

Based on our observations, we believe that specific binding of sulfite to the regulator of the cellulase gene expression, AoXlnR, may be the molecular basis for the connection of sulfur metabolism and cellulase gene expression in filamentous fungi. Further characterization and manipulation of the regulatory network components identified in this study will enable rational engineering of industrial strains for improved production of the sophisticated set of enzymes necessary to break down chemically divergent plant biomass.

Electronic supplementary material

The online version of this article (doi:10.1186/s12918-015-0224-5) contains supplementary material, which is available to authorized users.

LacSubPred: predicting subtypes of Laccases, an important lignin metabolism-related enzyme class, using in silico approaches. BMC Bioinformatics 2014;15 Suppl 11:S15. [PMID: 25350584 PMCID: PMC4251044 DOI: 10.1186/1471-2105-15-s11-s15] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.9] [Indexed: 11/18/2022] Open

Abstract

Background

Laccases (E.C. 1.10.3.2) are multi-copper oxidases that have gained importance in many industries such as biofuels, pulp production, textile dye bleaching, bioremediation, and food production. Their usefulness stems from the ability to act on a diverse range of phenolic compounds such as o-/p-quinols, aminophenols, polyphenols, polyamines, aryl diamines, and aromatic thiols. Despite acting on a wide range of compounds as a family, individual Laccases often exhibit distinctive and varied substrate ranges. This is likely due to Laccases' involvement in many metabolic roles across diverse taxa. Classification systems for multi-copper oxidases have been developed using multiple sequence alignments; however, these systems seem to largely follow species taxonomy rather than substrate ranges, enzyme properties, or specific function. It has been suggested that the roles and substrates of various Laccases are related to their optimal pH. This is consistent with the observation that fungal Laccases usually prefer acidic conditions, whereas plant and bacterial Laccases prefer basic conditions. Based on these observations, we hypothesize that a descriptor-based unsupervised learning system could generate a homology-independent classification system for better describing the functional properties of Laccases.

Results

In this study, we first utilized an unsupervised learning approach to develop a novel homology-independent Laccase classification system. From the descriptors considered, physicochemical properties showed the best performance. Physicochemical properties divided the Laccases into twelve subtypes. Analysis of the clusters using a t-test revealed that the majority of the physicochemical descriptors had statistically significant differences between the classes. Feature selection identified the most important features as negatively charged residues, the peptide isoelectric point, and acidic or amidic residues. Secondly, to allow for classification of new Laccases, a supervised learning system was developed from the clusters. The models showed high performance with an overall accuracy of 99.03%, error of 0.49%, MCC of 0.9367, precision of 94.20%, sensitivity of 94.20%, and specificity of 99.47% in a 5-fold cross-validation test. In an independent test, our models still provide a high accuracy of 97.98%, error rate of 1.02%, MCC of 0.8678, precision of 87.88%, sensitivity of 87.88% and specificity of 98.90%.

Conclusion

This study provides a useful classification system for better understanding of Laccases from the perspective of their physicochemical properties. We also developed a publicly available web tool for the characterization of Laccase protein sequences (http://lacsubpred.bioinfo.ucr.edu/). Finally, the programs used in the study are made available for researchers interested in applying the system to other enzyme classes (https://github.com/tweirick/SubClPred).

Inano R, Oishi N, Kunieda T, Arakawa Y, Yamao Y, Shibata S, Kikuchi T, Fukuyama H, Miyamoto S. Voxel-based clustered imaging by multiparameter diffusion tensor images for glioma grading. NEUROIMAGE-CLINICAL 2014;5:396-407. [PMID: 25180159 PMCID: PMC4145535 DOI: 10.1016/j.nicl.2014.08.001] [Citation(s) in RCA: 32] [Impact Index Per Article: 3.2] [Received: 06/02/2014] [Revised: 07/15/2014] [Accepted: 08/05/2014] [Indexed: 11/26/2022]

Abstract

Gliomas are the most common intra-axial primary brain tumour; therefore, predicting glioma grade would influence therapeutic strategies. Although several methods based on single or multiple parameters from diagnostic images exist, a definitive method for pre-operatively determining glioma grade remains unknown. We aimed to develop an unsupervised method using multiple parameters from pre-operative diffusion tensor images for obtaining a clustered image that could enable visual grading of gliomas. Fourteen patients with low-grade gliomas and 19 with high-grade gliomas underwent diffusion tensor imaging and three-dimensional T1-weighted magnetic resonance imaging before tumour resection. Seven features including diffusion-weighted imaging, fractional anisotropy, first eigenvalue, second eigenvalue, third eigenvalue, mean diffusivity and raw T2 signal with no diffusion weighting, were extracted as multiple parameters from diffusion tensor imaging. We developed a two-level clustering approach for a self-organizing map followed by the K-means algorithm to enable unsupervised clustering of a large number of input vectors with the seven features for the whole brain. The vectors were grouped by the self-organizing map as protoclusters, which were classified into the smaller number of clusters by K-means to make a voxel-based diffusion tensor-based clustered image. Furthermore, we also determined if the diffusion tensor-based clustered image was really helpful for predicting pre-operative glioma grade in a supervised manner. The ratio of each class in the diffusion tensor-based clustered images was calculated from the regions of interest manually traced on the diffusion tensor imaging space, and the common logarithmic ratio scales were calculated. We then applied support vector machine as a classifier for distinguishing between low- and high-grade gliomas. 
Consequently, the sensitivity, specificity, accuracy and area under the curve of receiver operating characteristic curves from the 16-class diffusion tensor-based clustered images that showed the best performance for differentiating high- and low-grade gliomas were 0.848, 0.745, 0.804 and 0.912, respectively. Furthermore, the log-ratio value of each class of the 16-class diffusion tensor-based clustered images was compared between low- and high-grade gliomas, and the log-ratio values of classes 14, 15 and 16 in the high-grade gliomas were significantly higher than those in the low-grade gliomas (p < 0.005, p < 0.001 and p < 0.001, respectively). These classes comprised different patterns of the seven diffusion tensor imaging-based parameters. The results suggest that the multiple diffusion tensor imaging-based parameters from the voxel-based diffusion tensor-based clustered images can help differentiate between low- and high-grade gliomas.

• We have developed a novel unsupervised method for voxel-based clustered imaging.
• Each class ratio in clustered images differentiated high from low-grade gliomas.
• The 16-class clustered images showed the best performance for the differentiation.
• Each class comprised different patterns of the seven diffusion tensor-based features.
• Multiple parameters from diffusion tensor images are useful for glioma grading.

Milone DH, Stegmayer G, López M, Kamenetzky L, Carrari F. Improving clustering with metabolic pathway data. BMC Bioinformatics 2014;15:101. [PMID: 24717120 PMCID: PMC4002909 DOI: 10.1186/1471-2105-15-101] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.7] [Received: 04/30/2013] [Accepted: 03/25/2014] [Indexed: 11/10/2022] Open

Abstract

BACKGROUND

It is a common practice in bioinformatics to validate each group returned by a clustering algorithm through manual analysis, according to a-priori biological knowledge. This procedure helps find functionally related patterns to propose hypotheses for their behavior and the biological processes involved. Therefore, this knowledge is used only as a second step, after the data have been clustered according to their expression patterns alone. Thus, it could be very useful to be able to improve the clustering of biological data by incorporating prior knowledge into the cluster formation itself, in order to enhance the biological value of the clusters.

RESULTS

A novel training algorithm for clustering is presented, which evaluates the biological internal connections of the data points while the clusters are being formed. Within this training algorithm, the calculation of distances among data points and neuron centroids includes a new term based on information from well-known metabolic pathways. The standard self-organizing map (SOM) training and the biologically-inspired SOM (bSOM) training were tested with two real data sets of transcripts and metabolites from Solanum lycopersicum and Arabidopsis thaliana species. Classical data mining validation measures were used to evaluate the clustering solutions obtained by both algorithms. Moreover, a new measure that takes into account the biological connectivity of the clusters was applied. The results of bSOM show important improvements in the convergence and performance for the proposed clustering method in comparison to standard SOM training, in particular, from the application point of view.

CONCLUSIONS

Analyses of the clusters obtained with bSOM indicate that including biological information during training can certainly increase the biological value of the clusters found with the proposed method. It is worth highlighting that this fact has effectively improved the results, which can simplify their further analysis. The algorithm is available as a web-demo at http://fich.unl.edu.ar/sinc/web-demo/bsom-lite/. The source code and the data sets supporting the results of this article are available at http://sourceforge.net/projects/sourcesinc/files/bsom.

Maksimov P, Zerweck J, Dubey JP, Pantchev N, Frey CF, Maksimov A, Reimer U, Schutkowski M, Hosseininejad M, Ziller M, Conraths FJ, Schares G. Serotyping of Toxoplasma gondii in cats (Felis domesticus) reveals predominance of type II infections in Germany. PLoS One 2013;8:e80213. [PMID: 24244652 PMCID: PMC3820565 DOI: 10.1371/journal.pone.0080213] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.5] [Received: 02/06/2013] [Accepted: 10/01/2013] [Indexed: 11/19/2022] Open

Abstract

Background

Cats are definitive hosts of Toxoplasma gondii and play an essential role in the epidemiology of this parasite. The study aims at clarifying whether cats are able to develop specific antibodies against different clonal types of T. gondii and to determine by serotyping the T. gondii clonal types prevailing in cats as intermediate hosts in Germany.

Methodology

To establish a peptide-microarray serotyping test, we identified 24 suitable peptides using serological T. gondii positive (n=21) and negative cat sera (n=52). To determine the clonal type-specific antibody response of cats in Germany, 86 field sera from T. gondii seropositive naturally infected cats were tested. In addition, we analyzed the antibody response in cats experimentally infected with non-canonical T. gondii types (n=7).

Findings

Positive cat reference sera reacted predominantly with peptides harbouring amino acid sequences specific for the clonal T. gondii type the cats were infected with. When the array was applied to field sera from Germany, 98.8% (85/86) of naturally infected cats recognized peptide patterns similar to those of T. gondii type II reference sera and showed the strongest reaction intensities with clonal type II-specific peptides. In addition, naturally infected cats recognized type II-specific peptides significantly more frequently than peptides of other type-specificities. Cats infected with non-canonical types showed the strongest reactivity with peptides presenting amino acid sequences specific for both type I and type III.

Conclusions

Cats are able to mount a clonal type-specific antibody response against T. gondii. Serotyping revealed for most seropositive field sera patterns resembling those observed after clonal type II-T. gondii infection. This finding is in accord with our previous results on the occurrence of T. gondii clonal types in oocysts shed by cats in Germany.

Kutsuna N, Higaki T, Matsunaga S, Otsuki T, Yamaguchi M, Fujii H, Hasezawa S. Active learning framework with iterative clustering for bioimage classification. Nat Commun 2013;3:1032. [PMID: 22929789 PMCID: PMC3432472 DOI: 10.1038/ncomms2030] [Citation(s) in RCA: 30] [Impact Index Per Article: 2.7] [Received: 05/24/2012] [Accepted: 07/30/2012] [Indexed: 11/19/2022] Open

Stegmayer G, Gerard M, Milone D. Data Mining Over Biological Datasets: An Integrated Approach Based on Computational Intelligence. IEEE COMPUT INTELL M 2012. [DOI: 10.1109/mci.2012.2215122] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.3] [Indexed: 11/09/2022]

Song WM, Di Matteo T, Aste T. Hierarchical information clustering by means of topologically embedded graphs. PLoS One 2012;7:e31929. [PMID: 22427814 PMCID: PMC3302882 DOI: 10.1371/journal.pone.0031929] [Citation(s) in RCA: 70] [Impact Index Per Article: 5.8] [Received: 08/25/2011] [Accepted: 01/17/2012] [Indexed: 11/19/2022] Open

MALDI-typing of infectious algae of the genus Prototheca using SOM portraits. J Microbiol Methods 2012;88:83-97. [DOI: 10.1016/j.mimet.2011.10.013] [Citation(s) in RCA: 28] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/08/2011] [Revised: 10/17/2011] [Accepted: 10/20/2011] [Indexed: 01/13/2023]

Gillet JP, Wang J, Calcagno AM, Green LJ, Varma S, Elstrand MB, Trope CG, Ambudkar SV, Davidson B, Gottesman MM. Clinical relevance of multidrug resistance gene expression in ovarian serous carcinoma effusions. Mol Pharm 2011;8:2080-8. [PMID: 21761824 PMCID: PMC3224865 DOI: 10.1021/mp200240a] [Citation(s) in RCA: 33] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/29/2022]

Zheng CH, Zhang L, Ng VTY, Shiu SCK, Huang DS. Molecular pattern discovery based on penalized matrix decomposition. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2011;8:1592-1603. [PMID: 21519114 DOI: 10.1109/tcbb.2011.79] [Citation(s) in RCA: 38] [Impact Index Per Article: 2.9] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/30/2023]

Lamb PF, Mündermann A, Bartlett RM, Robins A. Visualizing changes in lower body coordination with different types of foot orthoses using self-organizing maps (SOM). Gait Posture 2011;34:485-9. [PMID: 21821418 DOI: 10.1016/j.gaitpost.2011.06.024] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 09/21/2010] [Revised: 06/06/2011] [Accepted: 06/11/2011] [Indexed: 02/02/2023]

Wirth H, Löffler M, von Bergen M, Binder H. Expression cartography of human tissues using self organizing maps. BMC Bioinformatics 2011;12:306. [PMID: 21794127 PMCID: PMC3161046 DOI: 10.1186/1471-2105-12-306] [Citation(s) in RCA: 65] [Impact Index Per Article: 5.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/21/2011] [Accepted: 07/27/2011] [Indexed: 12/02/2022] Open

Abstract

Background

Parallel high-throughput microarray and sequencing experiments produce vast quantities of multidimensional data, which must be arranged and analyzed in a concerted way. One approach to addressing this challenge is the machine learning technique known as self-organizing maps (SOMs). SOMs enable a parallel sample- and gene-centered view of genomic data combined with strong visualization and second-level analysis capabilities. The paper aims to bridge the gap between the power of SOM machine learning to reduce the dimension of high-dimensional data on the one hand and practical applications, with special emphasis on gene expression analysis, on the other.

Results

The method was applied to generate a SOM characterizing the whole genome expression profiles of 67 healthy human tissues selected from ten tissue categories (adipose, endocrine, homeostasis, digestion, exocrine, epithelium, sexual reproduction, muscle, immune system and nervous tissues). SOM mapping reduces the dimension of expression data from tens of thousands of genes to a few thousand metagenes, each representing a minicluster of co-regulated single genes. Tissue-specific and common properties shared between groups of tissues emerge as a handful of localized spots in the tissue maps collecting groups of co-regulated and co-expressed metagenes. The functional context of the spots was discovered using overrepresentation analysis with respect to pre-defined gene sets of known functional impact. We found that tissue-related spots typically contain enriched populations of genes related to specific molecular processes in the respective tissue. Analysis techniques normally used at the gene level, such as two-way hierarchical clustering, are better represented and provide better signal-to-noise ratios if applied to the metagenes. Metagene-based clustering analyses aggregate the tissues broadly into three clusters containing nervous, immune system and the remaining tissues.

Conclusions

The SOM technique provides a more intuitive and informative global view of the behavior of a few well-defined modules of correlated and differentially expressed genes than the separate discovery of the expression levels of hundreds or thousands of individual genes. The program is available as R-package 'oposSOM'.
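
As an illustration of how a SOM compresses many gene profiles into a small grid of metagenes, the following is a minimal pure-Python sketch. It is not the oposSOM implementation or its API; the function names (`train_som`, `best_matching_unit`, `assign_genes`), the linear decay schedules, and the Gaussian neighbourhood are assumptions chosen for brevity.

```python
import math
import random


def best_matching_unit(units, x):
    """Return the grid position of the unit whose weights are closest to x."""
    return min(units, key=lambda pos: sum((a - b) ** 2 for a, b in zip(units[pos], x)))


def train_som(profiles, rows, cols, epochs=50, lr0=0.5, seed=0):
    """Train a small rectangular SOM.

    profiles: list of equal-length expression vectors (one per gene).
    Returns a dict mapping (row, col) -> metagene vector.
    """
    rng = random.Random(seed)
    dim = len(profiles[0])
    # Initialise each unit (metagene) from a randomly chosen input profile.
    units = {(r, c): list(rng.choice(profiles))
             for r in range(rows) for c in range(cols)}
    sigma0 = max(rows, cols) / 2.0
    order = list(range(len(profiles)))
    for epoch in range(epochs):
        frac = epoch / epochs
        lr = lr0 * (1.0 - frac)                   # learning rate decays linearly
        sigma = max(sigma0 * (1.0 - frac), 0.5)   # neighbourhood radius shrinks
        rng.shuffle(order)
        for i in order:
            x = profiles[i]
            bmu = best_matching_unit(units, x)
            for pos, w in units.items():
                grid_d2 = (pos[0] - bmu[0]) ** 2 + (pos[1] - bmu[1]) ** 2
                h = lr * math.exp(-grid_d2 / (2.0 * sigma * sigma))
                # Pull the unit towards the presented profile.
                for k in range(dim):
                    w[k] += h * (x[k] - w[k])
    return units


def assign_genes(units, profiles):
    """Group gene indices by their best-matching unit (the metagene minicluster)."""
    clusters = {}
    for i, x in enumerate(profiles):
        clusters.setdefault(best_matching_unit(units, x), []).append(i)
    return clusters
```

Each trained unit plays the role of a metagene, and `assign_genes` returns the minicluster of single genes that it represents. Downstream analyses such as hierarchical clustering could then operate on the unit vectors rather than on the individual genes.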

Dalton L, Ballarin V, Brun M. Clustering algorithms: on learning, validation, performance, and applications to genomics. Curr Genomics 2011;10:430-45. [PMID: 20190957 PMCID: PMC2766793 DOI: 10.2174/138920209789177601] [Citation(s) in RCA: 56] [Impact Index Per Article: 4.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/03/2009] [Revised: 04/20/2009] [Accepted: 05/11/2009] [Indexed: 11/22/2022] Open

Abstract

The development of microarray technology has enabled scientists to measure the expression of thousands of genes simultaneously, resulting in a surge of interest in several disciplines throughout biology and medicine. While data clustering has been used for decades in image processing and pattern recognition, in recent years it has joined this wave of activity as a popular technique to analyze microarrays. To illustrate its application to genomics, clustering applied to genes from a set of microarray data groups together those genes whose expression levels exhibit similar behavior throughout the samples, and when applied to samples it offers the potential to discriminate pathologies based on their differential patterns of gene expression. Although clustering has now been used for many years in the context of gene expression microarrays, it has remained highly problematic. The choice of a clustering algorithm and validation index is not a trivial one, more so when applying them to high throughput biological or medical data. Factors to consider when choosing an algorithm include the nature of the application, the characteristics of the objects to be analyzed, the expected number and shape of the clusters, and the complexity of the problem versus computational power available. In some cases a very simple algorithm may be appropriate to tackle a problem, but many situations may require a more complex and powerful algorithm better suited for the job at hand. In this paper, we will cover the theoretical aspects of clustering, including error and learning, followed by an overview of popular clustering algorithms and classical validation indices. We also discuss the relative performance of these algorithms and indices and conclude with examples of the application of clustering to computational biology.

The interplay of descriptor-based computational analysis with pharmacophore modeling builds the basis for a novel classification scheme for feruloyl esterases. Biotechnol Adv 2011;29:94-110. [DOI: 10.1016/j.biotechadv.2010.09.003] [Citation(s) in RCA: 61] [Impact Index Per Article: 4.7] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/27/2010] [Revised: 08/27/2010] [Accepted: 09/06/2010] [Indexed: 11/18/2022]

Elgaaen BV, Haug KBF, Wang J, Olstad OK, Fortunati D, Onsrud M, Staff AC, Sauer T, Gautvik KM. POLD2 and KSP37 (FGFBP2) correlate strongly with histology, stage and outcome in ovarian carcinomas. PLoS One 2010;5:e13837. [PMID: 21079801 PMCID: PMC2973954 DOI: 10.1371/journal.pone.0013837] [Citation(s) in RCA: 24] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/06/2010] [Accepted: 10/01/2010] [Indexed: 12/16/2022] Open

Abstract

BACKGROUND

Epithelial ovarian cancer (EOC) constitutes more than 90% of ovarian cancers and is associated with high mortality. EOC comprises a heterogeneous group of tumours, and the causes and molecular pathology are essentially unknown. Improved insight into the molecular characteristics of the different subgroups of EOC is urgently needed, and should eventually lead to earlier diagnosis as well as more individualized and effective treatments. Previously, we reported a limited number of mRNAs strongly upregulated in human osteosarcomas and other malignancies, and six were selected to be tested for a possible association with three subgroups of ovarian carcinomas and clinical parameters.

METHODOLOGY/PRINCIPAL FINDINGS

The six selected mRNAs were quantified by RT-qPCR in biopsies from eleven poorly differentiated serous carcinomas (PDSC, stage III-IV), twelve moderately differentiated serous carcinomas (MDSC, stage III-IV) and eight clear cell carcinomas (CCC, stage I-IV) of the ovary. Superficial scrapings from six normal ovaries (SNO), as well as biopsies from three normal ovaries (BNO) and three benign ovarian cysts (BBOC), were analyzed for comparison. The gene expression level was related to the histological and clinical parameters of human ovarian carcinoma samples. One of the mRNAs, DNA polymerase delta 2 small subunit (POLD2), was increased on average 2.5- to almost 20-fold in MDSC and PDSC, respectively, paralleling the degree of dedifferentiation and concordant with a poor prognosis. Except for POLD2, the serous carcinomas showed a similar transcription profile, being clearly different from CCC. Another mRNA, Killer-specific secretory protein of 37 kDa (KSP37), showed six- to eight-fold higher levels in CCC stage I compared with the more advanced staged carcinomas, and correlated positively with an improved clinical outcome.

CONCLUSIONS/SIGNIFICANCE

We have identified two biomarkers which are markedly upregulated in two subgroups of ovarian carcinomas and are also associated with stage and outcome. The results suggest that POLD2 and KSP37 might be potential prognostic biomarkers.

Milone DH, Stegmayer GS, Kamenetzky L, López M, Lee JM, Giovannoni JJ, Carrari F. *omeSOM: a software for clustering and visualization of transcriptional and metabolite data mined from interspecific crosses of crop plants. BMC Bioinformatics 2010;11:438. [PMID: 20796314 PMCID: PMC2942854 DOI: 10.1186/1471-2105-11-438] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/23/2010] [Accepted: 08/26/2010] [Indexed: 11/10/2022] Open

Sun J, Masterman-Smith MD, Graham NA, Jiao J, Mottahedeh J, Laks DR, Ohashi M, DeJesus J, Kamei KI, Lee KB, Wang H, Yu ZTF, Lu YT, Hou S, Li K, Liu M, Zhang N, Wang S, Angenieux B, Panosyan E, Samuels ER, Park J, Williams D, Konkankit V, Nathanson D, van Dam RM, Phelps ME, Wu H, Liau LM, Mischel PS, Lazareff JA, Kornblum HI, Yong WH, Graeber TG, Tseng HR. A microfluidic platform for systems pathology: multiparameter single-cell signaling measurements of clinical brain tumor specimens. Cancer Res 2010;70:6128-38. [PMID: 20631065 DOI: 10.1158/0008-5472.can-10-0076] [Citation(s) in RCA: 94] [Impact Index Per Article: 6.7] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/09/2023]

Guo Y, Eichler GS, Feng Y, Ingber DE, Huang S. Towards a holistic, yet gene-centered analysis of gene expression profiles: a case study of human lung cancers. J Biomed Biotechnol 2010;2006:69141. [PMID: 17489018 PMCID: PMC1698264 DOI: 10.1155/jbb/2006/69141] [Citation(s) in RCA: 37] [Impact Index Per Article: 2.6] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/17/2022] Open

Newman AM, Cooper JB. AutoSOME: a clustering method for identifying gene expression modules without prior knowledge of cluster number. BMC Bioinformatics 2010;11:117. [PMID: 20202218 PMCID: PMC2846907 DOI: 10.1186/1471-2105-11-117] [Citation(s) in RCA: 68] [Impact Index Per Article: 4.9] [Reference Citation Analysis] [Abstract] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/27/2009] [Accepted: 03/04/2010] [Indexed: 12/25/2022] Open

Reexamination of risk criteria in dengue patients using the self-organizing map. Med Biol Eng Comput 2009;48:293-301. [PMID: 20016950 DOI: 10.1007/s11517-009-0561-x] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/02/2009] [Accepted: 11/17/2009] [Indexed: 10/20/2022]

Application of Kohonen maps for solving the classification puzzle in AGC kinase protein sequences. Interdiscip Sci 2009;1:173-8. [DOI: 10.1007/s12539-009-0032-1] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/09/2008] [Revised: 02/25/2009] [Accepted: 03/16/2009] [Indexed: 10/20/2022]

Blazadonakis ME, Zervakis M. The linear neuron as marker selector and clinical predictor in cancer gene analysis. COMPUTER METHODS AND PROGRAMS IN BIOMEDICINE 2008;91:22-35. [PMID: 18423925 DOI: 10.1016/j.cmpb.2008.02.009] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 06/11/2007] [Revised: 02/22/2008] [Accepted: 02/23/2008] [Indexed: 05/26/2023]

Sivaraksa M, Lowe D. Predictive gene lists for breast cancer prognosis: a topographic visualisation study. BMC Med Genomics 2008;1:8. [PMID: 18419801 PMCID: PMC2375896 DOI: 10.1186/1755-8794-1-8] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/26/2007] [Accepted: 04/17/2008] [Indexed: 11/10/2022] Open

Abstract

Background

The controversy surrounding the non-uniqueness of predictive gene lists (PGL) of small selected subsets of genes from very large potential candidates as available in DNA microarray experiments is now widely acknowledged [1]. Many of these studies have focused on constructing discriminative semi-parametric models and as such are also subject to the issue of random correlations of sparse model selection in high dimensional spaces. In this work we outline a different approach based around an unsupervised patient-specific nonlinear topographic projection in predictive gene lists.

Methods

We construct nonlinear topographic projection maps based on inter-patient gene-list relative dissimilarities. The Neuroscale, the Stochastic Neighbor Embedding (SNE) and the Locally Linear Embedding (LLE) techniques have been used to construct two-dimensional projective visualisation plots of 70-dimensional PGLs per patient. Classifiers are also constructed to identify the prognosis indicator of each patient using the resulting projections from those visualisation techniques, and to investigate whether, a posteriori, two prognosis groups are separable on the evidence of the gene lists.

A literature-proposed predictive gene list for breast cancer is benchmarked against a separate gene list using the above methods. Generalisation ability is investigated by using the mapping capability of Neuroscale to visualise the follow-up study, but based on the projections derived from the original dataset.

Results

The results indicate that small subsets of patient-specific PGLs have insufficient prognostic dissimilarity to permit a distinction between the two prognosis groups of patients. Uncertainty and diversity across multiple gene expressions prevent unambiguous or even confident patient grouping. Comparative projections across different PGLs provide similar results.

Conclusion

The random correlation effect to an arbitrary outcome induced by small subset selection from very high dimensional interrelated gene expression profiles leads to an outcome with associated uncertainty. This continuum and uncertainty preclude any attempts at constructing discriminative classifiers.

However, a patient's gene expression profile could possibly be used in treatment planning, based on knowledge of other patients' responses.

We conclude that many of the patients involved in such medical studies are intrinsically unclassifiable on the basis of provided PGL evidence. This additional category of 'unclassifiable' should be accommodated within medical decision support systems if serious errors and unnecessary adjuvant therapy are to be avoided.

Fernandez EA, Balzarini M. Improving cluster visualization in self-organizing maps: Application in gene expression data analysis. Comput Biol Med 2007;37:1677-89. [PMID: 17544390 DOI: 10.1016/j.compbiomed.2007.04.003] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.2] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/22/2006] [Revised: 04/03/2007] [Accepted: 04/10/2007] [Indexed: 11/26/2022]

Mueller CG, Boix C, Kwan WH, Daussy C, Fournier E, Fridman WH, Molina TJ. Critical role of monocytes to support normal B cell and diffuse large B cell lymphoma survival and proliferation. J Leukoc Biol 2007;82:567-75. [PMID: 17575267 DOI: 10.1189/jlb.0706481] [Citation(s) in RCA: 57] [Impact Index Per Article: 3.4] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/31/2022] Open

Kaput J, Dawson K. Complexity of type 2 diabetes mellitus data sets emerging from nutrigenomic research: a case for dimensionality reduction? Mutat Res 2007;622:19-32. [PMID: 17559889 PMCID: PMC1994901 DOI: 10.1016/j.mrfmmm.2007.02.033] [Citation(s) in RCA: 16] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/29/2006] [Accepted: 02/13/2007] [Indexed: 02/07/2023]

Belacel N, Wang Q, Cuperlovic-Culf M. Clustering methods for microarray gene expression data. OMICS-A JOURNAL OF INTEGRATIVE BIOLOGY 2007;10:507-31. [PMID: 17233561 DOI: 10.1089/omi.2006.10.507] [Citation(s) in RCA: 40] [Impact Index Per Article: 2.4] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/13/2022]

Weeraratna AT, Taub DD. Microarray data analysis: an overview of design, methodology, and analysis. Methods Mol Biol 2007;377:1-16. [PMID: 17634607 DOI: 10.1007/978-1-59745-390-5_1] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/17/2022]

Microarray expression technology in clinical research of non-Hodgkin lymphoma. ARCHIVE OF ONCOLOGY 2007. [DOI: 10.2298/aoo0702028b] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/30/2022] Open

Sjödin A, Bylesjö M, Skogström O, Eriksson D, Nilsson P, Rydén P, Jansson S, Karlsson J. UPSC-BASE--Populus transcriptomics online. THE PLANT JOURNAL : FOR CELL AND MOLECULAR BIOLOGY 2006;48:806-17. [PMID: 17092314 DOI: 10.1111/j.1365-313x.2006.02920.x] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.2] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/12/2023]

Moorman C, Sun LV, Wang J, de Wit E, Talhout W, Ward LD, Greil F, Lu XJ, White KP, Bussemaker HJ, van Steensel B. Hotspots of transcription factor colocalization in the genome of Drosophila melanogaster. Proc Natl Acad Sci U S A 2006;103:12027-32. [PMID: 16880385 PMCID: PMC1567692 DOI: 10.1073/pnas.0605003103] [Citation(s) in RCA: 158] [Impact Index Per Article: 8.8] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/31/2006] [Indexed: 11/18/2022] Open

Dunphy CH. Gene expression profiling data in lymphoma and leukemia: review of the literature and extrapolation of pertinent clinical applications. Arch Pathol Lab Med 2006;130:483-520. [PMID: 16594743 DOI: 10.5858/2006-130-483-gepdil] [Citation(s) in RCA: 26] [Impact Index Per Article: 1.4] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/06/2022]

Kaderali L, Zander T, Faigle U, Wolf J, Schultze JL, Schrader R. CASPAR: a hierarchical bayesian approach to predict survival times in cancer from gene expression data. Bioinformatics 2006;22:1495-502. [PMID: 16554338 DOI: 10.1093/bioinformatics/btl103] [Citation(s) in RCA: 33] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/13/2022] Open

Abstract

MOTIVATION

DNA microarrays allow the simultaneous measurement of thousands of gene expression levels in any given patient sample. Gene expression data have been shown to correlate with survival in several cancers; however, analysis of the data is difficult, since typically at most a few hundred patients are available, resulting in severely underdetermined regression or classification models. Several approaches exist to classify patients into different risk classes; however, relatively little has been done with respect to the prediction of actual survival times. We introduce CASPAR, a novel method to predict true survival times for the individual patient based on microarray measurements. CASPAR is based on a multivariate Cox regression model that is embedded in a Bayesian framework. A hierarchical prior distribution on the regression parameters is specifically designed to deal with high dimensionality (large number of genes) and low sample size settings that are typical for microarray measurements. This enables CASPAR to automatically select small, most informative subsets of genes for prediction.

RESULTS

Validity of the method is demonstrated on two publicly available datasets on diffuse large B-cell lymphoma (DLBCL) and on adenocarcinoma of the lung. The method successfully identifies long and short survivors, with high sensitivity and specificity. We compare our method with two alternative methods from the literature, demonstrating superior results of our approach. In addition, we show that CASPAR can further refine predictions made using clinical scoring systems such as the International Prognostic Index (IPI) for DLBCL and clinical staging for lung cancer, thus providing an additional tool for the clinician. An analysis of the genes identified confirms previously published results, and furthermore, new candidate genes correlated with survival are identified.
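
To make the Cox regression component concrete, the sketch below computes the negative log partial likelihood of a standard Cox proportional hazards model in pure Python. It is not CASPAR itself: the hierarchical Bayesian prior described in the abstract is omitted because its form is not given here, tied event times are assumed absent, and the function name `cox_neg_log_partial_likelihood` is hypothetical.

```python
import math


def cox_neg_log_partial_likelihood(beta, X, times, events):
    """Negative log partial likelihood of a Cox model (no tied event times assumed).

    beta:   regression coefficients, one per gene.
    X:      list of per-patient expression vectors.
    times:  observed survival or censoring time per patient.
    events: 1 if the event (death) was observed, 0 if the patient was censored.
    """
    # Linear risk score for every patient.
    scores = [sum(b * x for b, x in zip(beta, row)) for row in X]
    nll = 0.0
    for i, (t_i, d_i) in enumerate(zip(times, events)):
        if not d_i:
            continue  # censored patients contribute only through risk sets
        # Risk set: every patient still under observation at time t_i.
        risk = sum(math.exp(scores[j]) for j, t_j in enumerate(times) if t_j >= t_i)
        nll -= scores[i] - math.log(risk)
    return nll
```

In a Bayesian setting such as the one the abstract describes, a quantity like this would be combined with a log-prior on `beta` before optimisation or sampling; that step is left out here.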

Carmona-Saez P, Pascual-Marqui RD, Tirado F, Carazo JM, Pascual-Montano A. Biclustering of gene expression data by Non-smooth Non-negative Matrix Factorization. BMC Bioinformatics 2006;7:78. [PMID: 16503973 PMCID: PMC1434777 DOI: 10.1186/1471-2105-7-78] [Citation(s) in RCA: 94] [Impact Index Per Article: 5.2] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/15/2005] [Accepted: 02/17/2006] [Indexed: 12/01/2022] Open

Gulmann C, Espina V, Petricoin E, Longo DL, Santi M, Knutsen T, Raffeld M, Jaffe ES, Liotta LA, Feldman AL. Proteomic Analysis of Apoptotic Pathways Reveals Prognostic Factors in Follicular Lymphoma. Clin Cancer Res 2005;11:5847-55. [PMID: 16115925 DOI: 10.1158/1078-0432.ccr-05-0637] [Citation(s) in RCA: 86] [Impact Index Per Article: 4.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/16/2022]

Jiang YM, Yamamoto M, Kobayashi Y, Yoshihara T, Liang Y, Terao S, Takeuchi H, Ishigaki S, Katsuno M, Adachi H, Niwa JI, Tanaka F, Doyu M, Yoshida M, Hashizume Y, Sobue G. Gene expression profile of spinal motor neurons in sporadic amyotrophic lateral sclerosis. Ann Neurol 2005;57:236-51. [PMID: 15668976 DOI: 10.1002/ana.20379] [Citation(s) in RCA: 191] [Impact Index Per Article: 10.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/11/2022]

Atalay V, Cetin-Atalay R. Implicit motif distribution based hybrid computational kernel for sequence classification. Bioinformatics 2004;21:1429-36. [PMID: 15598837 DOI: 10.1093/bioinformatics/bti212] [Citation(s) in RCA: 15] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/14/2022] Open