For: Caldas J, Gehlenborg N, Kettunen E, Faisal A, Rönty M, Nicholson AG, Knuutila S, Brazma A, Kaski S. Data-driven information retrieval in heterogeneous collections of transcriptomics data links SIM2s to malignant pleural mesothelioma. Bioinformatics 2011;28:246-53. [PMID: 22106335 PMCID: PMC3259436 DOI: 10.1093/bioinformatics/btr634] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.0] [Indexed: 01/21/2023]

Cited by Other Article(s)

Cakiroglu E, Senturk S. Genomics and Functional Genomics of Malignant Pleural Mesothelioma. Int J Mol Sci 2020;21:ijms21176342. [PMID: 32882916 PMCID: PMC7504302 DOI: 10.3390/ijms21176342] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Received: 07/22/2020] [Revised: 08/20/2020] [Accepted: 08/20/2020] [Indexed: 12/17/2022]

Lekschas F, Gehlenborg N. SATORI: a system for ontology-guided visual exploration of biomedical data repositories. Bioinformatics 2018;34:1200-1207. [PMID: 29186292 PMCID: PMC6031061 DOI: 10.1093/bioinformatics/btx739] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Received: 05/31/2017] [Accepted: 11/22/2017] [Indexed: 01/14/2023]

Blomstedt P, Dutta R, Seth S, Brazma A, Kaski S. Modelling-based experiment retrieval: a case study with gene expression clustering. Bioinformatics 2016;32:1388-94. [PMID: 26740526 DOI: 10.1093/bioinformatics/btv762] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.3] [Received: 05/15/2015] [Accepted: 12/28/2015] [Indexed: 12/18/2022]

Abstract

MOTIVATION

Public and private repositories of experimental data are growing to sizes that require dedicated methods for finding relevant data. To improve on the state of the art of keyword searches from annotations, methods for content-based retrieval have been proposed. In the context of gene expression experiments, most methods retrieve gene expression profiles, requiring each experiment to be expressed as a single profile, typically of case versus control. A more general, recently suggested alternative is to retrieve experiments whose models are good for modelling the query dataset. However, for very noisy and high-dimensional query data, this retrieval criterion turns out to be very noisy as well.

RESULTS

We propose doing retrieval using a denoised model of the query dataset, instead of the original noisy dataset itself. To this end, we introduce a general probabilistic framework, where each experiment is modelled separately and the retrieval is done by finding related models. For retrieval of gene expression experiments, we use a probabilistic model called the product partition model, which induces a clustering of genes that show similar expression patterns across a number of samples. The suggested metric for retrieval using clusterings is the normalized information distance. Finally, empirical results suggest that inference for the full probabilistic model can be approximated with good performance using computationally faster heuristic clustering approaches (e.g. k-means). The method is highly scalable and can be applied directly to construct a general-purpose gene expression experiment retrieval method.

AVAILABILITY AND IMPLEMENTATION

The method can be implemented using standard clustering algorithms and normalized information distance, available in many statistical software packages.
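As a rough illustration of the retrieval idea described in this abstract, the sketch below ranks stored experiments by the normalized information distance between gene clusterings. It is not the authors' implementation. It assumes the common definition NID(U, V) = 1 - I(U; V) / max(H(U), H(V)), and it assumes the cluster labels have already been produced by some clustering step (e.g. k-means) over the same genes in the same order. The function names and the toy labels are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in nats) of a clustering given as a list of labels."""
    n = len(labels)
    return -sum((c / n) * math.log(c / n) for c in Counter(labels).values())


def mutual_information(a, b):
    """Mutual information (in nats) between two clusterings of the same items."""
    n = len(a)
    count_a, count_b = Counter(a), Counter(b)
    joint = Counter(zip(a, b))
    return sum(
        (c / n) * math.log(c * n / (count_a[x] * count_b[y]))
        for (x, y), c in joint.items()
    )


def normalized_information_distance(a, b):
    """Assumed definition: 1 - I(a; b) / max(H(a), H(b))."""
    if len(a) != len(b):
        raise ValueError("clusterings must label the same items")
    h = max(entropy(a), entropy(b))
    if h == 0:
        # Both clusterings put every item in one cluster, so they are identical.
        return 0.0
    return 1.0 - mutual_information(a, b) / h


def retrieve(query, collection):
    """Return experiment names sorted from most to least similar to the query."""
    return sorted(
        collection,
        key=lambda name: normalized_information_distance(query, collection[name]),
    )


# Hypothetical toy data: cluster labels for four genes in each experiment.
query_clustering = [0, 0, 1, 1]
stored = {
    "experiment_A": [1, 1, 0, 0],  # same partition, different label names
    "experiment_B": [0, 1, 0, 1],  # unrelated partition
}
ranking = retrieve(query_clustering, stored)
```

Because the distance depends only on how items are partitioned, relabelling the clusters does not change the ranking.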

CONTACT

paul.blomstedt@aalto.fi or samuel.kaski@aalto.fi

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.

Uziela K, Honkela A. Probe Region Expression Estimation for RNA-Seq Data for Improved Microarray Comparability. PLoS One 2015;10:e0126545. [PMID: 25966034 PMCID: PMC4429080 DOI: 10.1371/journal.pone.0126545] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.4] [Received: 09/12/2014] [Accepted: 04/03/2015] [Indexed: 01/25/2023]

Faisal A, Peltonen J, Georgii E, Rung J, Kaski S. Toward computational cumulative biology by combining models of biological datasets. PLoS One 2014;9:e113053. [PMID: 25427176 PMCID: PMC4245117 DOI: 10.1371/journal.pone.0113053] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.7] [Received: 07/04/2014] [Accepted: 10/17/2014] [Indexed: 11/21/2022]

Tun AW, Chaiyarit S, Kaewsutthi S, Katanyoo W, Chuenkongkaew W, Kuwano M, Tomonaga T, Peerapittayamongkol C, Thongboonkerd V, Lertrit P. Profiling the mitochondrial proteome of Leber's Hereditary Optic Neuropathy (LHON) in Thailand: down-regulation of bioenergetics and mitochondrial protein quality control pathways in fibroblasts with the 11778G>A mutation. PLoS One 2014;9:e106779. [PMID: 25215595 PMCID: PMC4162555 DOI: 10.1371/journal.pone.0106779] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.5] [Received: 02/28/2014] [Accepted: 08/08/2014] [Indexed: 12/24/2022]

Abstract

Leber's Hereditary Optic Neuropathy (LHON) is one of the commonest mitochondrial diseases. It causes total blindness, and predominantly affects young males. For the disease to develop, it is necessary for an individual to carry one of the primary mtDNA mutations 11778G>A, 14484T>C or 3460G>A. However, these mutations are not sufficient to cause disease, and they do not explain the characteristic features of LHON such as the higher prevalence in males, incomplete penetrance, and relatively later age of onset. In order to explore the roles of nuclear encoded mitochondrial proteins in development of LHON, we applied a proteomic approach to samples from affected and unaffected individuals from 3 pedigrees and from 5 unrelated controls. Two-dimensional electrophoresis followed by MS/MS analysis in the mitochondrial lysate identified 17 proteins which were differentially expressed between LHON cases and unrelated controls, and 24 proteins which were differentially expressed between unaffected relatives and unrelated controls. The proteomic data were successfully validated by western blot analysis of 3 selected proteins. All of the proteins identified in the study were mitochondrial proteins and most of them were downregulated in 11778G>A mutant fibroblasts. These proteins included: subunits of OXPHOS enzyme complexes, proteins involved in intermediary metabolic processes, nucleoid related proteins, chaperones, cristae remodelling proteins and an anti-oxidant enzyme. The protein profiles of both the affected and unaffected 11778G>A carriers shared many features which differed from those of the unrelated control group, revealing similar proteomic responses to the 11778G>A mutation in both affected and unaffected individuals. Differentially expressed proteins revealed two broad groups: a cluster of bioenergetic pathway proteins and a cluster involved in the protein quality control system. Defects in these systems are likely to impede the function of retinal ganglion cells, and may lead to the development of LHON in synergy with the primary mtDNA mutation.

Seth S, Välimäki N, Kaski S, Honkela A. Exploration and retrieval of whole-metagenome sequencing samples. Bioinformatics 2014;30:2471-9. [PMID: 24845653 PMCID: PMC4230234 DOI: 10.1093/bioinformatics/btu340] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.2] [Indexed: 11/14/2022]

Affiliation(s)

Sohan Seth: Helsinki Institute for Information Technology HIIT, Department of Information and Computer Science, Aalto University, Espoo, Finland, Genome-Scale Biology Program and Department of Medical Genetics, University of Helsinki, Helsinki, Finland, and Helsinki Institute for Information Technology HIIT, Department of Computer Science, University of Helsinki, Helsinki, Finland
Niko Välimäki: Helsinki Institute for Information Technology HIIT, Department of Information and Computer Science, Aalto University, Espoo, Finland, Genome-Scale Biology Program and Department of Medical Genetics, University of Helsinki, Helsinki, Finland, and Helsinki Institute for Information Technology HIIT, Department of Computer Science, University of Helsinki, Helsinki, Finland
Samuel Kaski: Helsinki Institute for Information Technology HIIT, Department of Information and Computer Science, Aalto University, Espoo, Finland, Genome-Scale Biology Program and Department of Medical Genetics, University of Helsinki, Helsinki, Finland, and Helsinki Institute for Information Technology HIIT, Department of Computer Science, University of Helsinki, Helsinki, Finland
Antti Honkela: Helsinki Institute for Information Technology HIIT, Department of Information and Computer Science, Aalto University, Espoo, Finland, Genome-Scale Biology Program and Department of Medical Genetics, University of Helsinki, Helsinki, Finland, and Helsinki Institute for Information Technology HIIT, Department of Computer Science, University of Helsinki, Helsinki, Finland

Global meta-analysis of transcriptomics studies. PLoS One 2014;9:e89318. [PMID: 24586684 PMCID: PMC3935861 DOI: 10.1371/journal.pone.0089318] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.0] [Received: 09/24/2013] [Accepted: 01/18/2014] [Indexed: 12/18/2022]

Abstract

Transcriptomics meta-analysis aims at re-using existing data to derive novel biological hypotheses, and is motivated by the public availability of a large number of independent studies. Current methods are based on breaking down studies into multiple comparisons between phenotypes (e.g. disease vs. healthy), based on the studies' experimental designs, followed by computing the overlap between the resulting differential expression signatures. While useful, in this methodology each study yields multiple independent phenotype comparisons, and connections are established not between studies, but rather between subsets of the studies corresponding to phenotype comparisons. We propose a rank-based statistical meta-analysis framework that establishes global connections between transcriptomics studies without breaking down studies into sets of phenotype comparisons. By using a rank product method, our framework extracts global features from each study, corresponding to genes that are consistently among the most expressed or differentially expressed genes in that study. Those features are then statistically modelled via a term-frequency inverse-document frequency (TF-IDF) model, which is then used for connecting studies. Our framework is fast and parameter-free; when applied to large collections of Homo sapiens and Streptococcus pneumoniae transcriptomics studies, it performs better than similarity-based approaches in retrieving related studies, using a Medical Subject Headings gold standard. Finally, we highlight via case studies how the framework can be used to derive novel biological hypotheses regarding related studies and the genes that drive those connections. Our proposed statistical framework shows that it is possible to perform a meta-analysis of transcriptomics studies with arbitrary experimental designs by deriving global expression features rather than decomposing studies into multiple phenotype comparisons.
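The abstract does not spell out how the TF-IDF model connects studies, so the following is only a generic sketch of the idea, not the published framework. It assumes each study has already been reduced to a list of feature genes (the rank product step is not shown), weights genes by term frequency times inverse document frequency, and compares studies by cosine similarity. The similarity measure, the function names and the gene and study identifiers are all assumptions made for illustration.

```python
import math
from collections import Counter


def tfidf_vectors(study_genes):
    """Map each study to a sparse {gene: tf-idf weight} vector.

    study_genes: dict of study name -> list of feature genes (hypothetical input).
    """
    n_studies = len(study_genes)
    # Document frequency: in how many studies each gene appears as a feature.
    doc_freq = Counter(g for genes in study_genes.values() for g in set(genes))
    vectors = {}
    for study, genes in study_genes.items():
        term_freq = Counter(genes)
        vectors[study] = {
            g: (c / len(genes)) * math.log(n_studies / doc_freq[g])
            for g, c in term_freq.items()
        }
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(g, 0.0) for g, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical toy collection: geneX is a feature of every study, so its
# inverse document frequency is zero and it cannot drive a connection.
studies = {
    "study_1": ["geneA", "geneB", "geneX"],
    "study_2": ["geneA", "geneB", "geneX"],
    "study_3": ["geneC", "geneX"],
}
vectors = tfidf_vectors(studies)
sim_12 = cosine(vectors["study_1"], vectors["study_2"])
sim_13 = cosine(vectors["study_1"], vectors["study_3"])
```

The point of the weighting is visible in the toy data: genes that are features of nearly every study contribute little, while genes shared by only a few studies dominate the similarity.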

Faisal A, Gillberg J, Leen G, Peltonen J. Transfer learning using a nonparametric sparse topic model. Neurocomputing 2013. [DOI: 10.1016/j.neucom.2012.12.038] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.9] [Indexed: 01/30/2023]

Georgii E, Salojärvi J, Brosché M, Kangasjärvi J, Kaski S. Targeted retrieval of gene expression measurements using regulatory models. Bioinformatics 2012;28:2349-56. [DOI: 10.1093/bioinformatics/bts361] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.4] [Indexed: 01/21/2023]