Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Caldas J, Gehlenborg N, Faisal A, Brazma A, Kaski S. Probabilistic retrieval and visualization of biologically relevant microarray experiments. Bioinformatics 2009;25:i145-53. [PMID: 19477980 PMCID: PMC2687969 DOI: 10.1093/bioinformatics/btp215] [Citation(s) in RCA: 39] [Impact Index Per Article: 2.6] [Reference Citation Analysis] [What about the content of this article? (0)] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/30/2022] Open

Cited by Other Article(s)

Zhao Z, Zucknick M, Aittokallio T. EnrichIntersect: an R package for custom set enrichment analysis and interactive visualization of intersecting sets. BIOINFORMATICS ADVANCES 2022;2:vbac073. [PMID: 36699400 PMCID: PMC9710586 DOI: 10.1093/bioadv/vbac073] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Received: 09/15/2022] [Accepted: 09/26/2022] [Indexed: 02/01/2023]

Text-based experiment retrieval in genomic databases. J Inf Sci 2022. [DOI: 10.1177/01655515221118670] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/15/2022]

Chen S, Andrienko N, Andrienko G, Adilova L, Barlet J, Kindermann J, Nguyen PH, Thonnard O, Turkay C. LDA Ensembles for Interactive Exploration and Categorization of Behaviors. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS 2020;26:2775-2792. [PMID: 30869622 DOI: 10.1109/tvcg.2019.2904069] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/09/2023]

Yang G, Ma A, Qin ZS, Chen L. Application of topic models to a compendium of ChIP-Seq datasets uncovers recurrent transcriptional regulatory modules. Bioinformatics 2020;36:2352-2358. [PMID: 31899481 DOI: 10.1093/bioinformatics/btz975] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/03/2019] [Revised: 10/29/2019] [Accepted: 12/30/2019] [Indexed: 11/14/2022] Open

Lekschas F, Gehlenborg N. SATORI: a system for ontology-guided visual exploration of biomedical data repositories. Bioinformatics 2018;34:1200-1207. [PMID: 29186292 PMCID: PMC6031061 DOI: 10.1093/bioinformatics/btx739] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/31/2017] [Accepted: 11/22/2017] [Indexed: 01/14/2023] Open

Heinonen M, Milliat F, Benadjaoud MA, François A, Buard V, Tarlet G, d’Alché-Buc F, Guipaud O. Temporal clustering analysis of endothelial cell gene expression following exposure to a conventional radiotherapy dose fraction using Gaussian process clustering. PLoS One 2018;13:e0204960. [PMID: 30281653 PMCID: PMC6169916 DOI: 10.1371/journal.pone.0204960] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/25/2018] [Accepted: 09/15/2018] [Indexed: 12/31/2022] Open

Abstract

The vascular endothelium is considered as a key cell compartment for the response to ionizing radiation of normal tissues and tumors, and as a promising target to improve the differential effect of radiotherapy in the future. Following radiation exposure, the global endothelial cell response covers a wide range of gene, miRNA, protein and metabolite expression modifications. Changes occur at the transcriptional, translational and post-translational levels and impact cell phenotype as well as the microenvironment by the production and secretion of soluble factors such as reactive oxygen species, chemokines, cytokines and growth factors. These radiation-induced dynamic modifications of molecular networks may control the endothelial cell phenotype and govern recruitment of immune cells, stressing the importance of clearly understanding the mechanisms which underlie these temporal processes. A wide variety of time series data is commonly used in bioinformatics studies, including gene expression, protein concentrations and metabolomics data. The use of clustering of these data is still an unclear problem. Here, we introduce kernels between Gaussian processes modeling time series, and subsequently introduce a spectral clustering algorithm. We apply the methods to the study of human primary endothelial cells (HUVECs) exposed to a radiotherapy dose fraction (2 Gy). Time windows of differential expressions of 301 genes involved in key cellular processes such as angiogenesis, inflammation, apoptosis, immune response and protein kinase were determined from 12 hours to 3 weeks post-irradiation. Then, 43 temporal clusters corresponding to profiles of similar expressions, including 49 genes out of 301 initially measured, were generated according to the proposed method. Forty-seven transcription factors (TFs) responsible for the expression of clusters of genes were predicted from sequence regulatory elements using the MotifMap system. Their temporal profiles of occurrences were established and clustered. Dynamic network interactions and molecular pathways of TFs and differential genes were finally explored, revealing key node genes and putative important cellular processes involved in tissue infiltration by immune cells following exposure to a radiotherapy dose fraction.

Rauber PE, Falcão AX, Telea AC. Projections as visual aids for classification system design. INFORMATION VISUALIZATION 2018;17:282-305. [PMID: 30263012 PMCID: PMC6131729 DOI: 10.1177/1473871617713337] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Indexed: 06/08/2023]

Kohonen P, Parkkinen JA, Willighagen EL, Ceder R, Wennerberg K, Kaski S, Grafström RC. A transcriptomics data-driven gene space accurately predicts liver cytopathology and drug-induced liver injury. Nat Commun 2017;8:15932. [PMID: 28671182 PMCID: PMC5500850 DOI: 10.1038/ncomms15932] [Citation(s) in RCA: 71] [Impact Index Per Article: 10.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/22/2016] [Accepted: 05/15/2017] [Indexed: 01/17/2023] Open

MetaTopics: an integration tool to analyze microbial community profile by topic model. BMC Genomics 2017;18:962. [PMID: 28198670 PMCID: PMC5310276 DOI: 10.1186/s12864-016-3257-2] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.9] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/29/2022] Open

Söderholm S, Fu Y, Gaelings L, Belanov S, Yetukuri L, Berlinkov M, Cheltsov AV, Anders S, Aittokallio T, Nyman TA, Matikainen S, Kainov DE. Multi-Omics Studies towards Novel Modulators of Influenza A Virus-Host Interaction. Viruses 2016;8:v8100269. [PMID: 27690086 PMCID: PMC5086605 DOI: 10.3390/v8100269] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.4] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/13/2016] [Revised: 09/13/2016] [Accepted: 09/22/2016] [Indexed: 12/20/2022] Open

Liu L, Tang L, Dong W, Yao S, Zhou W. An overview of topic modeling and its current applications in bioinformatics. SPRINGERPLUS 2016;5:1608. [PMID: 27652181 PMCID: PMC5028368 DOI: 10.1186/s40064-016-3252-8] [Citation(s) in RCA: 94] [Impact Index Per Article: 11.8] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 03/24/2016] [Accepted: 09/08/2016] [Indexed: 11/10/2022]

González J, Muñoz A, Martos G. Asymmetric latent semantic indexing for gene expression experiments visualization. J Bioinform Comput Biol 2016;14:1650023. [PMID: 27427382 DOI: 10.1142/s0219720016500232] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/18/2022]

Şener DD, Oğul H. Retrieving relevant time-course experiments: a study on Arabidopsis microarrays. IET Syst Biol 2016;10:87-93. [PMID: 27187987 DOI: 10.1049/iet-syb.2015.0042] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/19/2022] Open

Blomstedt P, Dutta R, Seth S, Brazma A, Kaski S. Modelling-based experiment retrieval: a case study with gene expression clustering. Bioinformatics 2016;32:1388-94. [PMID: 26740526 DOI: 10.1093/bioinformatics/btv762] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/15/2015] [Accepted: 12/28/2015] [Indexed: 12/18/2022] Open

Abstract

MOTIVATION

Public and private repositories of experimental data are growing to sizes that require dedicated methods for finding relevant data. To improve on the state of the art of keyword searches from annotations, methods for content-based retrieval have been proposed. In the context of gene expression experiments, most methods retrieve gene expression profiles, requiring each experiment to be expressed as a single profile, typically of case versus control. A more general, recently suggested alternative is to retrieve experiments whose models are good for modelling the query dataset. However, for very noisy and high-dimensional query data, this retrieval criterion turns out to be very noisy as well.

RESULTS

We propose doing retrieval using a denoised model of the query dataset, instead of the original noisy dataset itself. To this end, we introduce a general probabilistic framework, where each experiment is modelled separately and the retrieval is done by finding related models. For retrieval of gene expression experiments, we use a probabilistic model called product partition model, which induces a clustering of genes that show similar expression patterns across a number of samples. The suggested metric for retrieval using clusterings is the normalized information distance. Empirical results finally suggest that inference for the full probabilistic model can be approximated with good performance using computationally faster heuristic clustering approaches (e.g. k-means). The method is highly scalable and straightforward to apply to construct a general-purpose gene expression experiment retrieval method.

AVAILABILITY AND IMPLEMENTATION

The method can be implemented using standard clustering algorithms and normalized information distance, available in many statistical software packages.

CONTACT

paul.blomstedt@aalto.fi or samuel.kaski@aalto.fi

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.
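
The retrieval step described in the abstract above (cluster the genes of each experiment, then compare clusterings with a normalized information distance) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes the distance takes the common form 1 - I(U;V) / max(H(U), H(V)), and the function names and toy cluster labels are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in nats) of a clustering given as a list of labels."""
    n = len(labels)
    return -sum((c / n) * math.log(c / n) for c in Counter(labels).values())


def mutual_information(a, b):
    """Mutual information (in nats) between two clusterings of the same genes."""
    n = len(a)
    count_a, count_b = Counter(a), Counter(b)
    joint = Counter(zip(a, b))
    return sum(
        (c / n) * math.log(c * n / (count_a[x] * count_b[y]))
        for (x, y), c in joint.items()
    )


def normalized_information_distance(a, b):
    """Assumed form: 1 - I(a;b) / max(H(a), H(b)); 0 = identical, 1 = independent."""
    if len(a) != len(b):
        raise ValueError("both clusterings must label the same genes")
    denom = max(entropy(a), entropy(b))
    if denom == 0:
        # Both clusterings put every gene in a single cluster.
        return 0.0
    return 1.0 - mutual_information(a, b) / denom


def rank_experiments(query, stored):
    """Return stored experiment names ordered from most to least similar."""
    return sorted(
        stored, key=lambda name: normalized_information_distance(query, stored[name])
    )


# Hypothetical gene-cluster labels, e.g. produced by k-means per experiment.
query_clusters = [0, 0, 1, 1, 2, 2]
stored_clusters = {
    "exp_A": [1, 1, 0, 0, 2, 2],  # same partition, different label names
    "exp_B": [0, 1, 0, 1, 0, 1],  # unrelated partition
}
ranking = rank_experiments(query_clusters, stored_clusters)
```

Because the distance depends only on how genes are grouped, not on the label names, clusterings produced independently for each experiment can be compared directly.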

miSEA: microRNA set enrichment analysis. Biosystems 2015;134:37-42. [DOI: 10.1016/j.biosystems.2015.05.004] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/16/2015] [Revised: 05/11/2015] [Accepted: 05/12/2015] [Indexed: 11/20/2022]

Açıcı K, Terzi YK, Oğul H. Retrieving relevant experiments: The case of microRNA microarrays. Biosystems 2015;134:71-8. [PMID: 26116091 DOI: 10.1016/j.biosystems.2015.06.003] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/21/2015] [Revised: 06/15/2015] [Accepted: 06/17/2015] [Indexed: 01/06/2023]

Uziela K, Honkela A. Probe Region Expression Estimation for RNA-Seq Data for Improved Microarray Comparability. PLoS One 2015;10:e0126545. [PMID: 25966034 PMCID: PMC4429080 DOI: 10.1371/journal.pone.0126545] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.4] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/12/2014] [Accepted: 04/03/2015] [Indexed: 01/25/2023] Open

Faisal A, Peltonen J, Georgii E, Rung J, Kaski S. Toward computational cumulative biology by combining models of biological datasets. PLoS One 2014;9:e113053. [PMID: 25427176 PMCID: PMC4245117 DOI: 10.1371/journal.pone.0113053] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/04/2014] [Accepted: 10/17/2014] [Indexed: 11/21/2022] Open

Jalanka-Tuovinen J, Salojärvi J, Salonen A, Immonen O, Garsed K, Kelly FM, Zaitoun A, Palva A, Spiller RC, de Vos WM. Faecal microbiota composition and host-microbe cross-talk following gastroenteritis and in postinfectious irritable bowel syndrome. Gut 2014;63:1737-45. [PMID: 24310267 DOI: 10.1136/gutjnl-2013-305994] [Citation(s) in RCA: 233] [Impact Index Per Article: 23.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Indexed: 12/17/2022]

Abstract

BACKGROUND

About 10% of patients with IBS report the start of the syndrome after infectious enteritis. The clinical features of postinfectious IBS (PI-IBS) resemble those of diarrhoea-predominant IBS (IBS-D). While altered faecal microbiota has been identified in other IBS subtypes, composition of the microbiota in patients with PI-IBS remains uncharacterised.

OBJECTIVE

To characterise the microbial composition of patients with PI-IBS, and to examine the associations between the faecal microbiota and a patient's clinical features.

DESIGN

Using a phylogenetic microarray and selected qPCR assays, we analysed differences in the faecal microbiota of 57 subjects from five study groups: patients with diagnosed PI-IBS, patients who 6 months after gastroenteritis had either persisting bowel dysfunction or no IBS symptoms, benchmarked against patients with IBS-D and healthy controls. In addition, the associations between the faecal microbiota and health were investigated by correlating the microbial profiles to immunological markers, quality of life indicators and host gene expression in rectal biopsies.

RESULTS

Microbiota analysis revealed a bacterial profile of 27 genus-like groups, providing an Index of Microbial Dysbiosis (IMD), which significantly separated patient groups and controls. Within this profile, several members of Bacteroidetes phylum were increased 12-fold in patients, while healthy controls had 35-fold more uncultured Clostridia. We showed correlations between the IMD and expression of several host gene pathways, including amino acid synthesis, cell junction integrity and inflammatory response, suggesting an impaired epithelial barrier function in IBS.

CONCLUSIONS

The faecal microbiota of patients with PI-IBS differs from that of healthy controls and resembles that of patients with IBS-D, suggesting a common pathophysiology. Moreover, our analysis suggests a variety of host-microbe associations that may underlie intestinal symptoms, initiated by gastroenteritis.

Information retrieval approach to meta-visualization. Mach Learn 2014. [DOI: 10.1007/s10994-014-5464-x] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/27/2022]

Seth S, Välimäki N, Kaski S, Honkela A. Exploration and retrieval of whole-metagenome sequencing samples. Bioinformatics 2014;30:2471-9. [PMID: 24845653 PMCID: PMC4230234 DOI: 10.1093/bioinformatics/btu340] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/14/2022] Open

Affiliation(s)

Sohan Seth Helsinki Institute for Information Technology HIIT, Department of Information and Computer Science, Aalto University, Espoo, Finland, Genome-Scale Biology Program and Department of Medical Genetics, University of Helsinki, Helsinki, Finland, and Helsinki Institute for Information Technology HIIT, Department of Computer Science, University of Helsinki, Helsinki, Finland
Niko Välimäki Helsinki Institute for Information Technology HIIT, Department of Information and Computer Science, Aalto University, Espoo, Finland, Genome-Scale Biology Program and Department of Medical Genetics, University of Helsinki, Helsinki, Finland, and Helsinki Institute for Information Technology HIIT, Department of Computer Science, University of Helsinki, Helsinki, Finland
Samuel Kaski Helsinki Institute for Information Technology HIIT, Department of Information and Computer Science, Aalto University, Espoo, Finland, Genome-Scale Biology Program and Department of Medical Genetics, University of Helsinki, Helsinki, Finland, and Helsinki Institute for Information Technology HIIT, Department of Computer Science, University of Helsinki, Helsinki, Finland
Antti Honkela Helsinki Institute for Information Technology HIIT, Department of Information and Computer Science, Aalto University, Espoo, Finland, Genome-Scale Biology Program and Department of Medical Genetics, University of Helsinki, Helsinki, Finland, and Helsinki Institute for Information Technology HIIT, Department of Computer Science, University of Helsinki, Helsinki, Finland

Wang V, Xi L, Enayetallah A, Fauman E, Ziemek D. GeneTopics--interpretation of gene sets via literature-driven topic models. BMC SYSTEMS BIOLOGY 2013;7 Suppl 5:S10. [PMID: 24564875 PMCID: PMC4029197 DOI: 10.1186/1752-0509-7-s5-s10] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Indexed: 12/17/2022]

Abstract

Background

Annotation of a set of genes is often accomplished through comparison to a library of labelled gene sets such as biological processes or canonical pathways. However, this approach might fail if the employed libraries are not up to date with the latest research, don't capture relevant biological themes or are curated at a different level of granularity than is required to appropriately analyze the input gene set. At the same time, the vast biomedical literature offers an unstructured repository of the latest research findings that can be tapped to provide thematic sub-groupings for any input gene set.

Methods

Our proposed method relies on a gene-specific text corpus and extracts commonalities between documents in an unsupervised manner using a topic model approach. We automatically determine the number of topics summarizing the corpus and calculate a gene relevancy score for each topic allowing us to eliminate non-specific topics. As a result we obtain a set of literature topics in which each topic is associated with a subset of the input genes providing directly interpretable keywords and corresponding documents for literature research.

Results

We validate our method based on labelled gene sets from the KEGG metabolic pathway collection and the genetic association database (GAD) and show that the approach is able to detect topics consistent with the labelled annotation. Furthermore, we discuss the results on three different types of experimentally derived gene sets, (1) differentially expressed genes from a cardiac hypertrophy experiment in mice, (2) altered transcript abundance in human pancreatic beta cells, and (3) genes implicated by GWA studies to be associated with metabolite levels in a healthy population. In all three cases, we are able to replicate findings from the original papers in a quick and semi-automated manner.

Conclusions

Our approach provides a novel way of automatically generating meaningful annotations for gene sets that are directly tied to relevant articles in the literature. Extending a general topic model method, the approach introduced here establishes a workflow for the interpretation of gene sets generated from diverse experimental scenarios that can complement the classical approach of comparison to reference gene sets.

Georgii E, Salojärvi J, Brosché M, Kangasjärvi J, Kaski S. Targeted retrieval of gene expression measurements using regulatory models. Bioinformatics 2012;28:2349-56. [DOI: 10.1093/bioinformatics/bts361] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.4] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/21/2023] Open

Khan SA, Faisal A, Mpindi JP, Parkkinen JA, Kalliokoski T, Poso A, Kallioniemi OP, Wennerberg K, Kaski S. Comprehensive data-driven analysis of the impact of chemoinformatic structure on the genome-wide biological response profiles of cancer cells to 1159 drugs. BMC Bioinformatics 2012;13:112. [PMID: 22646858 PMCID: PMC3532323 DOI: 10.1186/1471-2105-13-112] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/31/2011] [Accepted: 04/09/2012] [Indexed: 11/16/2022] Open

Abstract

Background

Detailed and systematic understanding of the biological effects of millions of available compounds on living cells is a significant challenge. As most compounds impact multiple targets and pathways, traditional methods for analyzing structure-function relationships are not comprehensive enough. Therefore more advanced integrative models are needed for predicting biological effects elicited by specific chemical features. As a step towards creating such computational links we developed a data-driven chemical systems biology approach to comprehensively study the relationship of 76 structural 3D-descriptors (VolSurf, chemical space) of 1159 drugs with the microarray gene expression responses (biological space) they elicited in three cancer cell lines. The analysis covering 11350 genes was based on data from the Connectivity Map. We decomposed the biological response profiles into components, each linked to a characteristic chemical descriptor profile.

Results

Integrated analysis of both the chemical and biological space was more informative than either dataset alone in predicting drug similarity as measured by shared protein targets. We identified ten major components that link distinct VolSurf chemical features across multiple compounds to specific cellular responses. For example, component 2 (hydrophobic properties) strongly linked to DNA damage response, while component 3 (hydrogen bonding) was associated with metabolic stress. Individual structural and biological features were often linked to one cell line only, such as leukemia cells (HL-60) specifically responding to cardiac glycosides.

Conclusions

In summary, our approach identified several novel links between specific chemical structure properties and distinct biological responses in cells incubated with these drugs. Importantly, the analysis focused on chemical-biological properties that emerge across multiple drugs. The decoding of such systematic relationships is necessary to build better models of drug effects, including unanticipated types of molecular properties having strong biological effects.

Corander J, Aittokallio T, Ripatti S, Kaski S. The rocky road to personalized medicine: computational and statistical challenges. Per Med 2012;9:109-114. [DOI: 10.2217/pme.12.1] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 02/04/2023]

Caldas J, Gehlenborg N, Kettunen E, Faisal A, Rönty M, Nicholson AG, Knuutila S, Brazma A, Kaski S. Data-driven information retrieval in heterogeneous collections of transcriptomics data links SIM2s to malignant pleural mesothelioma. Bioinformatics 2011;28:246-53. [PMID: 22106335 PMCID: PMC3259436 DOI: 10.1093/bioinformatics/btr634] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/21/2023]

Caldas J, Kaski S. Hierarchical generative biclustering for microRNA expression analysis. J Comput Biol 2011;18:251-61. [PMID: 21385032 DOI: 10.1089/cmb.2010.0256] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/12/2022] Open

Kilpinen SK, Ojala KA, Kallioniemi OP. Alignment of gene expression profiles from test samples against a reference database: New method for context-specific interpretation of microarray data. BioData Min 2011;4:5. [PMID: 21453538 PMCID: PMC3080808 DOI: 10.1186/1756-0381-4-5] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/15/2010] [Accepted: 03/31/2011] [Indexed: 02/07/2023] Open

Abstract

Background

Gene expression microarray data have been organized and made available as public databases, but the utilization of such highly heterogeneous reference datasets in the interpretation of data from individual test samples is not as developed as e.g. in the field of nucleotide sequence comparisons. We have created a rapid and powerful approach for the alignment of microarray gene expression profiles (AGEP) from test samples with those contained in a large annotated public reference database and demonstrate here how this can facilitate interpretation of microarray data from individual samples.

Methods

AGEP is based on the calculation of kernel density distributions for the levels of expression of each gene in each reference tissue type and provides a quantitation of the similarity between the test sample and the reference tissue types as well as the identity of the typical and atypical genes in each comparison. As a reference database, we used 1654 samples from 44 normal tissues (extracted from the Genesapiens database).

Results

Using leave-one-out validation, AGEP correctly defined the tissue of origin for 1521 (93.6%) of all the 1654 samples in the original database. Independent validation of 195 external normal tissue samples resulted in 87% accuracy for the exact tissue type and 97% accuracy with related tissue types. AGEP analysis of 10 Duchenne muscular dystrophy (DMD) samples provided quantitative description of the key pathogenetic events, such as the extent of inflammation, in individual samples and pinpointed tissue-specific genes whose expression changed (SAMD4A) in DMD. AGEP analysis of microarray data from adipocytic differentiation of mesenchymal stem cells and from normal myeloid cell types and leukemias provided quantitative characterization of the transcriptomic changes during normal and abnormal cell differentiation.

Conclusions

The AGEP method is a widely applicable method for the rapid comprehensive interpretation of microarray data, as proven here by the definition of tissue- and disease-specific changes in gene expression as well as during cellular differentiation. The capability to quantitatively compare data from individual samples against a large-scale annotated reference database represents a widely applicable paradigm for the analysis of all types of high-throughput data. AGEP enables systematic and quantitative comparison of gene expression data from test samples against a comprehensive collection of different cell/tissue types previously studied by the entire research community.
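
The idea in the Methods paragraph above (a kernel density per gene per reference tissue, then a similarity between a test sample and each tissue) can be illustrated with a rough sketch. The abstract does not specify AGEP's actual scoring, so the Gaussian kernel, the fixed bandwidth, the mean log-density score, and all names and toy values below are assumptions made for illustration only.

```python
import math


def gaussian_kde(values, bandwidth):
    """Return a density function: a Gaussian kernel density estimate over `values`."""
    norm = len(values) * bandwidth * math.sqrt(2.0 * math.pi)

    def density(x):
        return sum(math.exp(-0.5 * ((x - v) / bandwidth) ** 2) for v in values) / norm

    return density


def tissue_scores(reference, sample, bandwidth=0.5):
    """Score each reference tissue by the mean log-density of the sample's genes.

    reference: {tissue: {gene: [expression values in reference samples]}}
    sample:    {gene: expression value in the test sample}
    The mean log-density score is an assumed stand-in, not AGEP's published score.
    """
    scores = {}
    for tissue, genes in reference.items():
        log_densities = [
            # A tiny constant guards against log(0) for far-off values.
            math.log(gaussian_kde(genes[g], bandwidth)(x) + 1e-300)
            for g, x in sample.items()
            if g in genes
        ]
        if log_densities:  # skip tissues sharing no genes with the sample
            scores[tissue] = sum(log_densities) / len(log_densities)
    return scores


# Hypothetical toy reference: two tissues, two genes, three samples each.
reference = {
    "muscle": {"G1": [8.0, 8.5, 9.0], "G2": [2.0, 2.2, 1.8]},
    "liver": {"G1": [2.0, 2.5, 1.5], "G2": [7.0, 7.5, 8.0]},
}
test_sample = {"G1": 8.4, "G2": 2.1}
scores = tissue_scores(reference, test_sample)
best_tissue = max(scores, key=scores.get)
```

Per-gene log-densities also indicate which genes are typical or atypical for a given tissue comparison, which is the kind of gene-level readout the abstract mentions.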

Engreitz JM, Morgan AA, Dudley JT, Chen R, Thathoo R, Altman RB, Butte AJ. Content-based microarray search using differential expression profiles. BMC Bioinformatics 2010;11:603. [PMID: 21172034 PMCID: PMC3022631 DOI: 10.1186/1471-2105-11-603] [Citation(s) in RCA: 36] [Impact Index Per Article: 2.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/30/2010] [Accepted: 12/21/2010] [Indexed: 12/20/2022] Open

Freudenberg JM, Sivaganesan S, Phatak M, Shinde K, Medvedovic M. Generalized random set framework for functional enrichment analysis using primary genomics datasets. Bioinformatics 2010;27:70-7. [PMID: 20971985 DOI: 10.1093/bioinformatics/btq593] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/01/2023]

Abeel T, de Ridder J, Peixoto L. Highlights from the 5th International Society for Computational Biology Student Council Symposium at the 17th Annual International Conference on Intelligent Systems for Molecular Biology and the 8th European Conference on Computational Biology. BMC Bioinformatics 2009;10 Suppl 13:I1. [PMID: 19840405 PMCID: PMC2764124 DOI: 10.1186/1471-2105-10-s13-i1] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [MESH Headings] [Track Full Text] [Download PDF] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/24/2022] Open

Caldas J, Gehlenborg N, Faisal A, Brazma A, Kaski S. Probabilistic retrieval and visualization of biologically relevant microarray experiments. BMC Bioinformatics 2009. [PMCID: PMC2764132 DOI: 10.1186/1471-2105-10-s13-p1] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/26/2022] Open