1
Kim J, Sandri BJ, Rao RB, Lock EF. Bayesian predictive modeling of multi-source multi-way data. Comput Stat Data Anal 2023; 186:107783. [PMID: 37274461 PMCID: PMC10237362 DOI: 10.1016/j.csda.2023.107783] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 06/06/2023]
Abstract
A Bayesian approach to predict a continuous or binary outcome from data that are collected from multiple sources with a multi-way (i.e., multidimensional tensor) structure is described. As a motivating example, molecular data from multiple 'omics sources, each measured over multiple developmental time points, as predictors of early-life iron deficiency (ID) in a rhesus monkey model are considered. The method uses a linear model with a low-rank structure on the coefficients to capture multi-way dependence and models the variance of the coefficients separately across each source to infer their relative contributions. Conjugate priors facilitate an efficient Gibbs sampling algorithm for posterior inference, assuming a continuous outcome with normal errors or a binary outcome with a probit link. Simulations demonstrate that the model performs as expected in terms of misclassification rates and correlation of estimated coefficients with true coefficients, with large gains in performance by incorporating multi-way structure and modest gains when accounting for differing signal sizes across the different sources. Moreover, it provides robust classification of ID monkeys for the motivating application.
Affiliation(s)
- Jonathan Kim
  - Division of Biostatistics, University of Minnesota, Minneapolis, 55455, USA
- Brian J. Sandri
  - Division of Neonatology, Department of Pediatrics, University of Minnesota, Minneapolis, MN, USA
  - Masonic Institute for the Developing Brain, University of Minnesota, Minneapolis, MN, USA
- Raghavendra B. Rao
  - Division of Neonatology, Department of Pediatrics, University of Minnesota, Minneapolis, MN, USA
  - Masonic Institute for the Developing Brain, University of Minnesota, Minneapolis, MN, USA
- Eric F. Lock
  - Division of Biostatistics, University of Minnesota, Minneapolis, 55455, USA
2
Mandal A, Maji P. Multiview Regularized Discriminant Canonical Correlation Analysis: Sequential Extraction of Relevant Features From Multiblock Data. IEEE Trans Cybern 2023; 53:5497-5509. [PMID: 35417362 DOI: 10.1109/tcyb.2022.3155875] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/14/2023]
Abstract
One of the important issues associated with real-life high-dimensional data analysis is how to extract significant and relevant features from multiview data. The multiset canonical correlation analysis (MCCA) is a well-known statistical method for multiview data integration. It finds a linear subspace that maximizes the correlations among different views. However, the existing methods to find the multiset canonical variables are computationally very expensive, which restricts the application of the MCCA in real-life big data analysis. The covariance matrix of each high-dimensional view may also suffer from the singularity problem due to the limited number of samples. Moreover, the existing MCCA-based feature extraction algorithms are, in general, unsupervised in nature. In this regard, a new supervised feature extraction algorithm is proposed, which integrates multimodal multidimensional data sets by solving the maximal correlation problem of the MCCA. A new block matrix representation is introduced to reduce the computational complexity of computing the canonical variables of the MCCA. The analytical formulation enables efficient computation of the multiset canonical variables under a supervised ridge regression optimization technique. It deals with the "curse of dimensionality" problem associated with high-dimensional data and facilitates the sequential generation of relevant features with significantly lower computational cost. The effectiveness of the proposed multiblock data integration algorithm, along with a comparison with other existing methods, is demonstrated on several benchmark and real-life cancer data sets.
3
Bertinetto C, Engel J, Jansen J. ANOVA simultaneous component analysis: A tutorial review. Anal Chim Acta X 2020; 6:100061. [PMID: 33392497 PMCID: PMC7772684 DOI: 10.1016/j.acax.2020.100061] [Citation(s) in RCA: 23] [Impact Index Per Article: 5.8] [Received: 06/22/2020] [Revised: 09/29/2020] [Accepted: 10/02/2020] [Indexed: 12/27/2022] Open
Abstract
When analyzing experimental chemical data, it is often necessary to incorporate the structure of the study design into the chemometric/statistical models to effectively address the research questions of interest. ANOVA-Simultaneous Component Analysis (ASCA) is one of the most prominent methods to include such information in the quantitative analysis of multivariate data, especially when the number of variables is large. This tutorial review intends to explain in a simple way how ASCA works, how it is operated and how to correctly interpret ASCA results, with approachable mathematical and visual descriptions. Two examples are given: the first, a simulated chemical reaction, serves to illustrate the ASCA steps and the second, from a real chemical ecology data set, the interpretation of results. An overview of methods closely related to ASCA is also provided, pointing out their differences and scope, to give a wide-ranging picture of the available options to build multivariate models that take experimental design into account.
- ASCA is a multivariate method for analysis of multi-factor data.
- An overview of the (mathematical) principles of ASCA is presented.
- Key aspects for practical application of ASCA are discussed.
- Detailed explanation of ASCA output in terms of score and loading plots is given.
- Literature review of other multivariate techniques for analysis of multi-factor data.
Affiliation(s)
- Carlo Bertinetto
  - Department of Analytical Chemistry, Institute of Molecular Materials, Radboud University, the Netherlands
- Jasper Engel
  - Biometris, Wageningen UR, Droevendaalsesteeg 1, 6708 PB, Wageningen, the Netherlands
- Jeroen Jansen
  - Department of Analytical Chemistry, Institute of Molecular Materials, Radboud University, the Netherlands
4
Mandal A, Maji P. FaRoC: Fast and Robust Supervised Canonical Correlation Analysis for Multimodal Omics Data. IEEE Trans Cybern 2018; 48:1229-1241. [PMID: 28391216 DOI: 10.1109/tcyb.2017.2685625] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Indexed: 06/07/2023]
Abstract
One of the main problems associated with high-dimensional multimodal real-life data sets is how to extract relevant and significant features. In this regard, a fast and robust feature extraction algorithm, termed FaRoC, is proposed, judiciously integrating the merits of canonical correlation analysis (CCA) and rough sets. The proposed method extracts new features sequentially from two multidimensional data sets by maximizing their relevance with respect to the class label and their significance with respect to already-extracted features. To generate canonical variables sequentially, an analytical formulation is introduced to establish the relation between regularization parameters and CCA. The formulation enables the proposed method to extract the required number of correlated features sequentially at lower computational cost than existing methods. To compute both the significance and relevance measures of a feature, the concept of the hypercuboid equivalence partition matrix of the rough hypercuboid approach is used. It also provides an efficient way to find the optimum regularization parameters employed in CCA. The efficacy of the proposed FaRoC algorithm, along with a comparison with other existing methods, is extensively established on several real-life data sets.
5
Hu W, Lin D, Cao S, Liu J, Chen J, Calhoun VD, Wang YP. Adaptive Sparse Multiple Canonical Correlation Analysis With Application to Imaging (Epi)Genomics Study of Schizophrenia. IEEE Trans Biomed Eng 2018; 65:390-399. [PMID: 29364120 PMCID: PMC5826588 DOI: 10.1109/tbme.2017.2771483] [Citation(s) in RCA: 20] [Impact Index Per Article: 3.3] [Indexed: 01/05/2023]
Abstract
Finding correlations across multiple data sets in imaging and (epi)genomics is a common challenge. Sparse multiple canonical correlation analysis (SMCCA) is a multivariate model widely used to extract contributing features from each data set while maximizing the cross-modality correlation. The model is achieved by using the combination of pairwise covariances between any two data sets. However, the scales of different pairwise covariances could be quite different and the direct combination of pairwise covariances in SMCCA is unfair. The problem of "unfair combination of pairwise covariances" restricts the power of SMCCA for feature selection. In this paper, we propose a novel formulation of SMCCA, called adaptive SMCCA, to overcome the problem by introducing adaptive weights when combining pairwise covariances. Both simulation and real-data analysis show that adaptive SMCCA outperforms conventional SMCCA and SMCCA with fixed weights in terms of feature selection. Large-scale numerical experiments show that adaptive SMCCA converges as fast as conventional SMCCA. When applying it to an imaging (epi)genetics study of schizophrenia subjects, we can detect significant (epi)genetic variants and brain regions, which are consistent with other existing reports. In addition, several significant brain-development related pathways, e.g., neural tube development, are detected by our model, demonstrating that imaging epigenetic associations may be overlooked by conventional SMCCA. All these results demonstrate that adaptive SMCCA is well suited for detecting three-way or multiway correlations and thus can find widespread applications in multiple omics and imaging data integration.
Affiliation(s)
- Wenxing Hu
  - Biomedical Engineering Department, Tulane University, New Orleans, LA 70118, USA
- Dongdong Lin
  - Mind Research Network and Dept. of ECE, University of New Mexico, Albuquerque, NM, 87106
- Shaolong Cao
  - Department of Bioinformatics & Computational Biology, UT MD Anderson Cancer Center, Houston, TX
- Jingyu Liu
  - Mind Research Network and Dept. of ECE, University of New Mexico, Albuquerque, NM, 87106
- Jiayu Chen
  - Mind Research Network and Dept. of ECE, University of New Mexico, Albuquerque, NM, 87106
- Vince D. Calhoun
  - Mind Research Network and Dept. of ECE, University of New Mexico, Albuquerque, NM, 87106
- Yu-Ping Wang
  - Biomedical Engineering Department, Tulane University, New Orleans, LA 70118, USA
6
Abstract
The lung develops from a very simple outpouching of the foregut into a highly complex, finely structured organ with multiple specialized cell types that are required for its normal physiological function. During both the development of the lung and its remodeling in the context of disease or response to injury, gene expression must be activated and silenced in a coordinated manner to achieve the tremendous phenotypic heterogeneity of cell types required for homeostasis and pathogenesis. Epigenetic mechanisms, consisting of DNA base modifications such as methylation, alteration of histones resulting in chromatin modification, and the action of noncoding RNA, control the regulation of information "beyond the genome" required for both lung modeling and remodeling. Epigenetic regulation is subject to modification by environmental stimuli, such as oxidative stress, infection, and aging, and is thus critically important in chronic remodeling disorders such as idiopathic pulmonary fibrosis (IPF), chronic obstructive pulmonary disease (COPD), bronchopulmonary dysplasia (BPD), and pulmonary hypertension (PH). Technological advances have made it possible to evaluate genome-wide epigenetic changes (epigenomics) in diseases of lung remodeling, clarifying existing pathophysiological paradigms and uncovering novel mechanisms of disease. Many of these represent new therapeutic targets. Advances in epigenomic technology will accelerate our understanding of lung development and remodeling, and lead to novel treatments for chronic lung diseases.
Affiliation(s)
- James S Hagood
  - Department of Pediatrics, Division of Respiratory Medicine, University of California-San Diego and Rady Children's Hospital of San Diego, San Diego, California
7
Acar E, Papalexakis EE, Gürdeniz G, Rasmussen MA, Lawaetz AJ, Nilsson M, Bro R. Structure-revealing data fusion. BMC Bioinformatics 2014; 15:239. [PMID: 25015427 PMCID: PMC4117975 DOI: 10.1186/1471-2105-15-239] [Citation(s) in RCA: 73] [Impact Index Per Article: 7.3] [Received: 12/31/2013] [Accepted: 06/26/2014] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Analysis of data from multiple sources has the potential to enhance knowledge discovery by capturing underlying structures, which are, otherwise, difficult to extract. Fusing data from multiple sources has already proved useful in many applications in social network analysis, signal processing and bioinformatics. However, data fusion is challenging since data from multiple sources are often (i) heterogeneous (i.e., in the form of higher-order tensors and matrices), (ii) incomplete, and (iii) have both shared and unshared components. In order to address these challenges, in this paper, we introduce a novel unsupervised data fusion model based on joint factorization of matrices and higher-order tensors.
RESULTS While the traditional formulation of coupled matrix and tensor factorizations modeling only shared factors fails to capture the underlying structures in the presence of both shared and unshared factors, the proposed data fusion model has the potential to automatically reveal shared and unshared components through modeling constraints. Using numerical experiments, we demonstrate the effectiveness of the proposed approach in terms of identifying shared and unshared components. Furthermore, we measure a set of mixtures with known chemical composition using both LC-MS (Liquid Chromatography - Mass Spectrometry) and NMR (Nuclear Magnetic Resonance) and demonstrate that the structure-revealing data fusion model can (i) successfully capture the chemicals in the mixtures and extract the relative concentrations of the chemicals accurately, (ii) provide promising results in terms of identifying shared and unshared chemicals, and (iii) reveal the relevant patterns in LC-MS by coupling with the diffusion NMR data.
CONCLUSIONS We have proposed a structure-revealing data fusion model that can jointly analyze heterogeneous, incomplete data sets with shared and unshared components and demonstrated its promising performance as well as potential limitations on both simulated and real data.
Affiliation(s)
- Evrim Acar
  - Department of Food Science, Faculty of Science, University of Copenhagen, Frederiksberg C, Denmark.
8
Systems biology strategies to study lipidomes in health and disease. Prog Lipid Res 2014; 55:43-60. [DOI: 10.1016/j.plipres.2014.06.001] [Citation(s) in RCA: 64] [Impact Index Per Article: 6.4] [Received: 08/11/2013] [Revised: 06/18/2014] [Accepted: 06/21/2014] [Indexed: 12/14/2022]
9
Parkkinen JA, Kaski S. Probabilistic drug connectivity mapping. BMC Bioinformatics 2014; 15:113. [PMID: 24742351 PMCID: PMC4011783 DOI: 10.1186/1471-2105-15-113] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.3] [Received: 12/18/2013] [Accepted: 04/14/2014] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND The aim of connectivity mapping is to match drugs using drug-treatment gene expression profiles from multiple cell lines. This can be viewed as an information retrieval task, with the goal of finding the most relevant profiles for a given query drug. We infer the relevance for retrieval by data-driven probabilistic modeling of the drug responses, resulting in probabilistic connectivity mapping, and further consider the available cell lines as different data sources. We use a special type of probabilistic model to separate what is shared and specific between the sources, in contrast to earlier connectivity mapping methods that have intentionally aggregated all available data, neglecting information about the differences between the cell lines.
RESULTS We show that the probabilistic multi-source connectivity mapping method is superior to alternatives in finding functionally and chemically similar drugs from the Connectivity Map data set. We also demonstrate that an extension of the method is capable of retrieving combinations of drugs that match different relevant parts of the query drug response profile.
CONCLUSIONS The probabilistic modeling-based connectivity mapping method provides a promising alternative to earlier methods. Principled integration of data from different cell lines helps to identify relevant responses for specific drug repositioning applications.
Affiliation(s)
- Samuel Kaski
  - Helsinki Institute for Information Technology HIIT, Department of Information and Computer Science, Aalto University, Espoo, Finland.
10
Abstract
Inferring microRNA (miRNA) functions and activities has been extremely important to understand their system-level roles and the mechanisms behind the cellular behaviors of their target genes. This chapter first details methodologies necessary for prediction of function and activity. It then introduces the computational methods available for investigation of sequence and experimental data and for analysis of the information flow mediated through miRNAs.
Affiliation(s)
- Hasan Oğul
  - Department of Computer Engineering, Baskent University, Ankara, Turkey
11
Seoane JA, Day INM, Gaunt TR, Campbell C. A pathway-based data integration framework for prediction of disease progression. Bioinformatics 2013; 30:838-45. [PMID: 24162466 PMCID: PMC3957070 DOI: 10.1093/bioinformatics/btt610] [Citation(s) in RCA: 56] [Impact Index Per Article: 5.1] [Indexed: 11/17/2022]
Abstract
Motivation: Within medical research there is an increasing trend toward deriving multiple types of data from the same individual. The most effective prognostic prediction methods should use all available data, as this maximizes the amount of information used. In this article, we consider a variety of learning strategies to boost prediction performance based on the use of all available data.
Implementation: We consider data integration via the use of multiple kernel learning supervised learning methods. We propose a scheme in which feature selection by statistical score is performed separately per data type and by pathway membership. We further consider the introduction of a confidence measure for the class assignment, both to remove some ambiguously labeled datapoints from the training data and to implement a cautious classifier that only makes predictions when the associated confidence is high.
Results: We use the METABRIC dataset for breast cancer, with prediction of survival at 2000 days from diagnosis. Predictive accuracy is improved by using kernels that exclusively use those genes, as features, which are known members of particular pathways. We show that yet further improvements can be made by using a range of additional kernels based on clinical covariates such as Estrogen Receptor (ER) status. Using this range of measures to improve prediction performance, we show that the test accuracy on new instances is nearly 80%, though predictions are only made on 69.2% of the patient cohort.
Availability: https://github.com/jseoane/FSMKL
Contact: J.Seoane@bristol.ac.uk
Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- José A Seoane
  - MRC Centre for Causal Analyses in Translational Epidemiology, MRC Integrative Epidemiology Unit, School of Social and Community Medicine, University of Bristol, Clifton BS8 2BN, UK and Intelligent Systems Laboratory, University of Bristol, Bristol BS8 1UB, UK
12
Xiang S, Yuan L, Fan W, Wang Y, Thompson PM, Ye J. Bi-level multi-source learning for heterogeneous block-wise missing data. Neuroimage 2013; 102 Pt 1:192-206. [PMID: 23988272 DOI: 10.1016/j.neuroimage.2013.08.015] [Citation(s) in RCA: 55] [Impact Index Per Article: 5.0] [Received: 04/01/2013] [Revised: 06/10/2013] [Accepted: 08/09/2013] [Indexed: 11/19/2022] Open
Abstract
Bio-imaging technologies allow scientists to collect large amounts of high-dimensional data from multiple heterogeneous sources for many biomedical applications. In the study of Alzheimer's Disease (AD), neuroimaging data, gene/protein expression data, etc., are often analyzed together to improve predictive power. Joint learning from multiple complementary data sources is advantageous, but feature-pruning and data source selection are critical to learn interpretable models from high-dimensional data. Often, the data collected has block-wise missing entries. In the Alzheimer's Disease Neuroimaging Initiative (ADNI), most subjects have MRI and genetic information, but only half have cerebrospinal fluid (CSF) measures, a different half has FDG-PET; only some have proteomic data. Here we propose how to effectively integrate information from multiple heterogeneous data sources when data is block-wise missing. We present a unified "bi-level" learning model for complete multi-source data, and extend it to incomplete data. Our major contributions are: (1) our proposed models unify feature-level and source-level analysis, including several existing feature learning approaches as special cases; (2) the model for incomplete data avoids imputing missing data and offers superior performance; it generalizes to other applications with block-wise missing data sources; (3) we present efficient optimization algorithms for modeling complete and incomplete data. We comprehensively evaluate the proposed models including all ADNI subjects with at least one of four data types at baseline: MRI, FDG-PET, CSF and proteomics. Our proposed models compare favorably with existing approaches.
Affiliation(s)
- Shuo Xiang
  - School of Computing, Informatics, and Decision Systems Engineering, Arizona State University, Tempe, AZ, USA; Center for Evolutionary Medicine and Informatics, The Biodesign Institute, Arizona State University, Tempe, AZ, USA
- Lei Yuan
  - School of Computing, Informatics, and Decision Systems Engineering, Arizona State University, Tempe, AZ, USA; Center for Evolutionary Medicine and Informatics, The Biodesign Institute, Arizona State University, Tempe, AZ, USA
- Wei Fan
  - Huawei Noah's Ark Lab, Hong Kong
- Yalin Wang
  - School of Computing, Informatics, and Decision Systems Engineering, Arizona State University, Tempe, AZ, USA
- Paul M Thompson
  - Imaging Genetics Center, Laboratory of Neuro Imaging, Department of Neurology & Psychiatry, UCLA School of Medicine, Los Angeles, CA, USA
- Jieping Ye
  - School of Computing, Informatics, and Decision Systems Engineering, Arizona State University, Tempe, AZ, USA; Center for Evolutionary Medicine and Informatics, The Biodesign Institute, Arizona State University, Tempe, AZ, USA.
13
Hubberten HM, Klie S, Caldana C, Degenkolbe T, Willmitzer L, Hoefgen R. Additional role of O-acetylserine as a sulfur status-independent regulator during plant growth. Plant J 2012; 70:666-77. [PMID: 22243437 DOI: 10.1111/j.1365-313x.2012.04905.x] [Citation(s) in RCA: 65] [Impact Index Per Article: 5.4] [Indexed: 05/18/2023]
Abstract
O-acetylserine (OAS) is one of the most prominent metabolites whose levels are altered upon sulfur starvation. However, its putative role as a signaling molecule in higher plants is controversial. This paper provides further evidence that OAS is a signaling molecule, based on computational analysis of time-series experiments and on studies of transgenic plants conditionally displaying increased OAS levels. Transcripts whose levels correlated with the transient and specific increase in OAS levels observed in leaves of Arabidopsis thaliana plants 5-10 min after transfer to darkness and with diurnal oscillation of the OAS content, showing a characteristic peak during the night, were identified. Induction of a serine-O-acetyltransferase gene (SERAT) in transgenic A. thaliana plants expressing the genes under the control of an inducible promoter resulted in a specific time-dependent increase in OAS levels. Monitoring the transcriptome response at time points at which no changes in sulfur-related metabolites except OAS were observed and correlating this with the light/dark transition and diurnal experiments resulted in identification of six genes whose expression was highly correlated with that of OAS (adenosine-5'-phosphosulfate reductase 3, sulfur-deficiency-induced 1, sulfur-deficiency-induced 2, low-sulfur-induced 1, serine hydroxymethyltransferase 7 and ChaC-like protein). These data suggest that OAS displays a signalling function leading to changes in transcript levels of a specific gene set irrespective of the sulfur status of the plant. Additionally, a role for OAS in a specific part of the sulfate response can be deduced.
Affiliation(s)
- Hans-Michael Hubberten
  - Max Planck Institut für Molekulare Pflanzenphysiologie, Am Mühlenberg 1, Potsdam-Golm, Germany.
14
Hanash S, Schliekelman M, Zhang Q, Taguchi A. Integration of proteomics into systems biology of cancer. Wiley Interdiscip Rev Syst Biol Med 2012; 4:327-37. [PMID: 22407608 DOI: 10.1002/wsbm.1169] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.7] [Indexed: 12/21/2022]
Abstract
Deciphering the complexity and heterogeneity of cancer benefits from integration of proteomic-level data into systems biology efforts. The opportunities available as a result of advances in proteomic technologies, the successes to date, and the challenges involved in integrating diverse datasets are addressed in this review.
Affiliation(s)
- S Hanash
  - Molecular Diagnostics Program, Fred Hutchinson Cancer Research Center, Seattle, WA, USA.
15
Abstract
PURPOSE OF REVIEW Lipidomics characterizes the composition of intact lipid molecular species in biological systems and the field has been driven by some spectacular advances in mass spectrometry instrumentation and applications. This review will highlight these advances and outline their recent application to address clinical issues.
RECENT FINDINGS This review first identifies recent advances in lipid detection and analysis by a variety of mass spectrometry techniques, then reviews specific applications, including stable isotope labelling of lipids, lipid mass spectrometry imaging, data analysis and bioinformatics, and finally presents examples of the application of lipidomics to selected disease states.
SUMMARY Lipidomics so far has been principally concerned with identifying novel methodologies, but recent advances demonstrating applications in diabetes, neurodegenerative diseases, cystic fibrosis and other respiratory diseases clearly indicate the potential usefulness of lipidomics both to generate biomarkers of disease and to probe signalling and metabolic processes.
Affiliation(s)
- Anthony D Postle
  - Clinical and Experimental Sciences, University of Southampton, Hampshire, UK.
16
Orešič M. Informatics and computational strategies for the study of lipids. Biochim Biophys Acta Mol Cell Biol Lipids 2011; 1811:991-9. [DOI: 10.1016/j.bbalip.2011.06.012] [Citation(s) in RCA: 29] [Impact Index Per Article: 2.2] [Received: 03/26/2011] [Revised: 05/23/2011] [Accepted: 06/07/2011] [Indexed: 12/29/2022]
17
Yetukuri L, Huopaniemi I, Koivuniemi A, Maranghi M, Hiukka A, Nygren H, Kaski S, Taskinen MR, Vattulainen I, Jauhiainen M, Orešič M. High density lipoprotein structural changes and drug response in lipidomic profiles following the long-term fenofibrate therapy in the FIELD substudy. PLoS One 2011; 6:e23589. [PMID: 21887280 PMCID: PMC3160907 DOI: 10.1371/journal.pone.0023589] [Citation(s) in RCA: 33] [Impact Index Per Article: 2.5] [Received: 06/20/2011] [Accepted: 07/20/2011] [Indexed: 11/26/2022] Open
Abstract
In a recent FIELD study the fenofibrate therapy surprisingly failed to achieve significant benefit over placebo in the primary endpoint of coronary heart disease events. Increased levels of atherogenic homocysteine were observed in some patients assigned to fenofibrate therapy but the molecular mechanisms behind this are poorly understood. Herein we investigated HDL lipidomic profiles associated with fenofibrate treatment and the drug-induced Hcy levels in the FIELD substudy. We found that fenofibrate leads to complex HDL compositional changes including increased apoA-II, diminishment of lysophosphatidylcholines and increase of sphingomyelins. Ethanolamine plasmalogens were diminished only in a subgroup of fenofibrate-treated patients with elevated homocysteine levels. Finally we performed molecular dynamics simulations to qualitatively reconstitute HDL particles in silico. We found that increased number of apoA-II excludes neutral lipids from HDL surface and apoA-II is more deeply buried in the lipid matrix than apoA-I. In conclusion, a detailed molecular characterization of HDL may provide surrogates for predictors of drug response and thus help identify the patients who might benefit from fenofibrate treatment.
Affiliation(s)
- Ilkka Huopaniemi
  - Aalto University School of Science, Department of Information and Computer Science, Helsinki Institute for Information Technology, Espoo, Finland
- Marianna Maranghi
  - Department of Internal Medicine and Medical Specialties, Sapienza University, Rome, Italy
  - Division of Cardiology, Department of Medicine, University of Helsinki, Helsinki, Finland
- Anne Hiukka
  - Division of Cardiology, Department of Medicine, University of Helsinki, Helsinki, Finland
- Heli Nygren
  - Technical Research Centre of Finland, Espoo, Finland
- Samuel Kaski
  - Aalto University School of Science, Department of Information and Computer Science, Helsinki Institute for Information Technology, Espoo, Finland
- Marja-Riitta Taskinen
  - Division of Cardiology, Department of Medicine, University of Helsinki, Helsinki, Finland
- Ilpo Vattulainen
  - Department of Physics, Tampere University of Technology, Tampere, Finland
  - Department of Applied Physics, Aalto University School of Science and Technology, Espoo, Finland
  - MEMPHYS – Center for Biomembrane Physics, University of Southern Denmark, Odense, Denmark
- Matti Jauhiainen
  - National Institute for Health and Welfare, Helsinki, Finland
  - Institute for Molecular Medicine Finland, Helsinki, Finland
- Matej Orešič
  - Technical Research Centre of Finland, Espoo, Finland
  - Institute for Molecular Medicine Finland, Helsinki, Finland
18
Xia J, Sinelnikov IV, Wishart DS. MetATT: a web-based metabolomics tool for analyzing time-series and two-factor datasets. Bioinformatics 2011; 27:2455-6. [DOI: 10.1093/bioinformatics/btr392] [Citation(s) in RCA: 56] [Impact Index Per Article: 4.3] [Indexed: 11/14/2022] Open
19
|
|