1
Chin JL, Tan ZC, Chan LC, Ruffin F, Parmar R, Ahn R, Taylor SD, Bayer AS, Hoffmann A, Fowler VG, Reed EF, Yeaman MR, Meyer AS. Tensor modeling of MRSA bacteremia cytokine and transcriptional patterns reveals coordinated, outcome-associated immunological programs. PNAS Nexus 2024; 3:pgae185. [PMID: 38779114 PMCID: PMC11109816 DOI: 10.1093/pnasnexus/pgae185] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/23/2023] [Accepted: 04/17/2024] [Indexed: 05/25/2024]
Abstract
Methicillin-resistant Staphylococcus aureus (MRSA) bacteremia is a common and life-threatening infection that imposes up to 30% mortality even when appropriate therapy is used. Despite in vitro efficacy determined by minimum inhibitory concentration breakpoints, antibiotics often fail to resolve these infections in vivo, resulting in persistent MRSA bacteremia. Recently, several genetic, epigenetic, and proteomic correlates of persistent outcomes have been identified. However, the extent to which single variables or their composite patterns operate as independent predictors of outcome or reflect shared underlying mechanisms of persistence is unknown. To explore this question, we employed a tensor-based integration of host transcriptional and cytokine datasets across a well-characterized cohort of patients with persistent or resolving MRSA bacteremia outcomes. This method yielded high correlative accuracy with outcomes and immunologic signatures united by transcriptomic and cytokine datasets. Results reveal that patients with persistent MRSA bacteremia (PB) exhibit signals of granulocyte dysfunction, suppressed antigen presentation, and deviated lymphocyte polarization. In contrast, patients with resolving bacteremia (RB) heterogeneously exhibit correlates of robust antigen-presenting cell trafficking and enhanced neutrophil maturation corresponding to appropriate T lymphocyte polarization and B lymphocyte response. These results suggest that transcriptional and cytokine correlates of PB vs. RB outcomes are complex and may not be disclosed by conventional modeling. In this respect, a tensor-based integration approach may help to reveal consensus molecular and cellular mechanisms and their biological interpretation.
Affiliation(s)
- Jackson L Chin
  - Department of Bioengineering, University of California, Los Angeles, Los Angeles, CA 90024, USA
- Zhixin Cyrillus Tan
  - Bioinformatics Interdepartmental Program, University of California, Los Angeles, Los Angeles, CA 90024, USA
- Liana C Chan
  - The Lundquist Institute for Biomedical Innovation, Harbor-UCLA Medical Center, Torrance, CA 90502, USA
  - Department of Medicine, David Geffen School of Medicine, University of California, Los Angeles, CA 90095, USA
  - Division of Infectious Diseases, Department of Medicine, Harbor-UCLA Medical Center, Torrance, CA 90502, USA
  - Division of Molecular Medicine, Department of Medicine, Harbor-UCLA Medical Center, Torrance, CA 90502, USA
- Felicia Ruffin
  - Division of Infectious Diseases, Duke University School of Medicine, Durham, NC 27710, USA
- Rajesh Parmar
  - Department of Pathology and Laboratory Medicine, University of California, Los Angeles, Los Angeles, CA 90095, USA
- Richard Ahn
  - Institute for Quantitative and Computational Biosciences, David Geffen School of Medicine at UCLA, Los Angeles, CA 90095, USA
- Scott D Taylor
  - Department of Bioengineering, University of California, Los Angeles, Los Angeles, CA 90024, USA
- Arnold S Bayer
  - The Lundquist Institute for Biomedical Innovation, Harbor-UCLA Medical Center, Torrance, CA 90502, USA
  - Department of Medicine, David Geffen School of Medicine, University of California, Los Angeles, CA 90095, USA
- Alexander Hoffmann
  - Institute for Quantitative and Computational Biosciences, David Geffen School of Medicine at UCLA, Los Angeles, CA 90095, USA
- Vance G Fowler
  - Division of Infectious Diseases, Duke University School of Medicine, Durham, NC 27710, USA
- Elaine F Reed
  - Department of Pathology and Laboratory Medicine, University of California, Los Angeles, Los Angeles, CA 90095, USA
- Michael R Yeaman
  - The Lundquist Institute for Biomedical Innovation, Harbor-UCLA Medical Center, Torrance, CA 90502, USA
  - Department of Medicine, David Geffen School of Medicine, University of California, Los Angeles, CA 90095, USA
  - Division of Infectious Diseases, Department of Medicine, Harbor-UCLA Medical Center, Torrance, CA 90502, USA
  - Division of Molecular Medicine, Department of Medicine, Harbor-UCLA Medical Center, Torrance, CA 90502, USA
  - Division of Infectious Diseases, Duke University School of Medicine, Durham, NC 27710, USA
- Aaron S Meyer
  - Department of Bioengineering, University of California, Los Angeles, Los Angeles, CA 90024, USA
  - Bioinformatics Interdepartmental Program, University of California, Los Angeles, Los Angeles, CA 90024, USA
  - Jonsson Comprehensive Cancer Center, University of California, Los Angeles, Los Angeles, CA 90024, USA
  - Eli and Edythe Broad Center of Regenerative Medicine and Stem Cell Research, University of California, Los Angeles, Los Angeles, CA 90024, USA

2
Braga KBN, Maciel LÍL, Vaz BG, Pinto L, Santos JM. A rapid and direct method for dating blue pen ink in documents using multiset modeling of infrared spectroscopy and mass spectrometry data. Analytical Methods: Advancing Methods and Applications 2023; 15:6523-6530. [PMID: 37987504 DOI: 10.1039/d3ay01732j] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/22/2023]
Abstract
The dating of documents is crucial in forensic chemistry, particularly for verifying their authenticity. This study aimed to develop a rapid and direct method for the dating of pen ink in documents, using a combination of Fourier transform infrared spectroscopy in attenuated total reflectance mode (FTIR-ATR), desorption electrospray ionization mass spectrometry (DESI-MS) and multiple ensemble data modeling. Two sets of paper document samples containing writing in blue pen ink were investigated: (I) artificially aged documents and (II) real documents dating from 1960 to 2022. The FTIR-ATR spectra of both sets of samples showed a decrease in absorbance at ∼1584 cm⁻¹, related to the chemical modification of the CN bond in the molecular structure of Basic Violet 3 (BV3), one of the main dyes used in blue pen ink. DESI-MS confirmed the presence of BV3 and its degradation by-products in all the samples, indicating its widespread use in blue pen ink production. Moreover, DESI-MS detected combinations of dyes within the ink composition. Models were first built using the DESI-MS and FTIR-ATR data separately, but the error and trend were significantly reduced when both sets of data were used. The combination of DESI-MS and FTIR-ATR spectral information resulted in a final predictive model with low error for pen inks from real documents written between 1960 and 2022. These analyses proved to be effective for the dating of pen inks and are suitable for use in routine forensic analysis, providing a direct and rapid method that allows for accurate prediction.
Affiliation(s)
- Kauanny B N Braga
  - Grupo de Pesquisa em Petróleo, Energia e Espectrometria de Massas (PEM), Departamento de Química, Universidade Federal Rural de Pernambuco - UFRPE, Rua Dom Manoel de Medeiros, s/n, Recife, Pernambuco, 52171-131, Brazil
- Lanaia Í L Maciel
  - Instituto de Química, Universidade Federal de Goiás, Goiânia, GO, 74690-900, Brazil
- Boniek G Vaz
  - Instituto de Química, Universidade Federal de Goiás, Goiânia, GO, 74690-900, Brazil
- Licarion Pinto
  - Departamento de Química Analítica, Instituto de Química, Universidade do Estado do Rio de Janeiro, R. São Francisco Xavier, 524, Maracanã, Rio de Janeiro, RJ, 20550-013, Brazil
- Jandyson M Santos
  - Grupo de Pesquisa em Petróleo, Energia e Espectrometria de Massas (PEM), Departamento de Química, Universidade Federal Rural de Pernambuco - UFRPE, Rua Dom Manoel de Medeiros, s/n, Recife, Pernambuco, 52171-131, Brazil

3
Orcutt-Jahns B, Junior JRL, Rockne RC, Matache A, Branciamore S, Hung E, Rodin AS, Lee PP, Meyer AS. Systems profiling reveals recurrently dysregulated cytokine signaling responses in ER+ breast cancer patients' blood. bioRxiv: The Preprint Server for Biology 2023:2023.10.31.564987. [PMID: 37961682 PMCID: PMC10635026 DOI: 10.1101/2023.10.31.564987] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/15/2023]
Abstract
Cytokines mediate cell-to-cell communication across the immune system and therefore are critical to immunosurveillance in cancer and other diseases. Several cytokines show dysregulated abundance or signaling responses in breast cancer, associated with the disease and differences in survival and progression. Cytokines operate in a coordinated manner to affect immune surveillance and regulate one another, necessitating a systems approach for a complete picture of this dysregulation. Here, we profiled cytokine signaling responses of peripheral immune cells from breast cancer patients as compared to healthy controls in a multidimensional manner across ligands, cell populations, and responsive pathways. We find alterations in cytokine responsiveness across pathways and cell types that are best defined by integrated signatures across dimensions. Alterations in the abundance of a cytokine's cognate receptor do not explain differences in responsiveness. Rather, alterations in baseline signaling and receptor abundance suggesting immune cell reprogramming are associated with altered responses. These integrated features suggest a global reprogramming of immune cell communication in breast cancer.
Affiliation(s)
- Brian Orcutt-Jahns
  - Department of Bioengineering, University of California, Los Angeles (UCLA), USA
- Russell C. Rockne
  - Department of Computational and Quantitative Medicine, Beckman Research Institute of the City of Hope, Duarte, CA, USA
- Adina Matache
  - Department of Computational and Quantitative Medicine, Beckman Research Institute of the City of Hope, Duarte, CA, USA
- Sergio Branciamore
  - Department of Computational and Quantitative Medicine, Beckman Research Institute of the City of Hope, Duarte, CA, USA
- Ethan Hung
  - Department of Bioengineering, University of California, Los Angeles (UCLA), USA
- Andrei S. Rodin
  - Department of Computational and Quantitative Medicine, Beckman Research Institute of the City of Hope, Duarte, CA, USA
- Peter P. Lee
  - Department of Immuno-Oncology, Beckman Research Institute of the City of Hope, Duarte, CA, USA
- Aaron S. Meyer
  - Department of Bioengineering, University of California, Los Angeles (UCLA), USA
  - Jonsson Comprehensive Cancer Center, UCLA, USA
  - Eli and Edythe Broad Center of Regenerative Medicine and Stem Cell Research, UCLA, USA

4
Mishra P. Sequentially orthogonalized canonical partial least squares for improved multiple responses modeling in multiblock data sets. Anal Chim Acta 2023; 1250:340957. [PMID: 36898815 DOI: 10.1016/j.aca.2023.340957] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/15/2022] [Revised: 12/22/2022] [Accepted: 02/08/2023] [Indexed: 02/11/2023]
Abstract
Multiblock data sets and modeling techniques are widely encountered in the chemometric community. However, currently available techniques, such as sequential orthogonalized partial least squares (SO-PLS) regression, focus mainly on predicting a single response and handle the multiple-response case with a PLS2-type approach. Recently, a new approach called canonical PLS (CPLS) was proposed for extracting subspaces efficiently in multiple-response cases, supporting both regression and classification. 'Efficiently' here means more information in fewer latent variables. This work proposes a combination of SO-PLS and CPLS, sequential orthogonalized canonical partial least squares (SO-CPLS), to model multiple responses for multiblock data sets. SO-CPLS is demonstrated for multiple-response regression and classification on several data sets. Its capability to incorporate meta-information related to samples for efficient subspace extraction is also demonstrated. Furthermore, a comparison with the commonly used sequential modeling technique SO-PLS is presented. The SO-CPLS approach can benefit both multiple-response regression and classification modeling and can be particularly valuable when meta-information such as experimental design or sample classes is available.
Affiliation(s)
- Puneet Mishra
  - Wageningen Food and Biobased Research, Bornse Weilanden 9, P.O. Box 17, 6700AA, Wageningen, the Netherlands

5
Liu Y, Zhang Y, Jiang Z, Kong W, Zou L. Exploring Neural Mechanisms of Reward Processing Using Coupled Matrix Tensor Factorization: A Simultaneous EEG-fMRI Investigation. Brain Sci 2023; 13:485. [PMID: 36979295 PMCID: PMC10046863 DOI: 10.3390/brainsci13030485] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/22/2023] [Revised: 03/06/2023] [Accepted: 03/08/2023] [Indexed: 03/14/2023] Open
Abstract
BACKGROUND: It is crucial to understand the neural feedback mechanisms and the cognitive decision-making of the brain during the processing of rewards. Here, we report the first attempt at a simultaneous electroencephalography (EEG)-functional magnetic resonance imaging (fMRI) study of a gambling task using tensor decomposition. METHODS: First, the single-subject EEG data are represented as a third-order spectrogram tensor to extract frequency features. Next, the EEG and fMRI data are jointly decomposed into a superposition of multiple sources characterized by space-time-frequency profiles using coupled matrix tensor factorization (CMTF). Finally, graph-structured clustering is used to select the most appropriate model according to four quantitative indices. RESULTS: The results show activation not only in the regions of interest (ROIs) reported in other literature, but also in the olfactory cortex and fusiform gyrus, which are usually ignored. Regions including the orbitofrontal cortex and insula are activated for both winning and losing stimuli. Meanwhile, regions such as the superior orbital frontal gyrus and anterior cingulate cortex are activated upon winning stimuli, whereas the inferior frontal gyrus, cingulate cortex, and medial superior frontal gyrus are activated upon losing stimuli. CONCLUSION: This work sheds light on the reward-processing process, provides a deeper understanding of brain function, and opens a new avenue in the investigation of neurovascular coupling via CMTF.
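The joint decomposition described in the methods can be illustrated with a minimal NumPy sketch of coupled matrix tensor factorization by alternating least squares. This is a generic textbook-style CMTF under stated assumptions, not the authors' pipeline: the function names `khatri_rao` and `cmtf_als`, the choice of coupling on the first mode, the equal weighting of the two data sets, and the synthetic data are all hypothetical.

```python
import numpy as np


def khatri_rao(U, V):
    """Column-wise Kronecker product; row index runs as u * V.shape[0] + v."""
    rank = U.shape[1]
    return (U[:, None, :] * V[None, :, :]).reshape(-1, rank)


def cmtf_als(X, Y, rank, n_iter=50, seed=0):
    """Fit X[i,j,k] ~ sum_r A[i,r] B[j,r] C[k,r] and Y ~ A @ V.T with a shared A.

    X is a third-order tensor (e.g. time x channel x frequency) and Y a matrix
    (e.g. time x voxel); the first mode is assumed to be the coupled one.
    """
    rng = np.random.default_rng(seed)
    I, J, K = X.shape
    M = Y.shape[1]
    A = rng.standard_normal((I, rank))
    B = rng.standard_normal((J, rank))
    C = rng.standard_normal((K, rank))
    V = rng.standard_normal((M, rank))

    # Mode-n unfoldings, laid out to match khatri_rao's row ordering.
    X1 = X.reshape(I, J * K)
    X2 = np.moveaxis(X, 1, 0).reshape(J, I * K)
    X3 = np.moveaxis(X, 2, 0).reshape(K, I * J)

    losses = []
    for _ in range(n_iter):
        # The shared factor sees both data sets, stacked side by side.
        W = np.vstack([khatri_rao(B, C), V])
        A = np.linalg.lstsq(W, np.hstack([X1, Y]).T, rcond=None)[0].T
        # The remaining factors each see only their own data set.
        B = np.linalg.lstsq(khatri_rao(A, C), X2.T, rcond=None)[0].T
        C = np.linalg.lstsq(khatri_rao(A, B), X3.T, rcond=None)[0].T
        V = np.linalg.lstsq(A, Y, rcond=None)[0].T

        tensor_resid = X1 - A @ khatri_rao(B, C).T
        matrix_resid = Y - A @ V.T
        losses.append(float(np.sum(tensor_resid**2) + np.sum(matrix_resid**2)))
    return A, B, C, V, losses


# Synthetic rank-2 example (made-up sizes, no real EEG or fMRI data).
rng = np.random.default_rng(1)
A0, B0, C0, V0 = (rng.standard_normal((n, 2)) for n in (6, 5, 4, 3))
X_demo = np.einsum("ir,jr,kr->ijk", A0, B0, C0)
Y_demo = A0 @ V0.T
factors = cmtf_als(X_demo, Y_demo, rank=2, n_iter=30)
print("final joint squared error:", factors[-1][-1])
```

Because every update is an exact least-squares solve for one block of the joint objective, the summed squared error cannot increase from one sweep to the next, which makes the loss history a convenient convergence check.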
Affiliation(s)
- Yuchao Liu
  - School of Computer and Artificial Intelligence, Changzhou University, Changzhou 213164, China
- Yin Zhang
  - School of Microelectronics and Control Engineering, Changzhou University, Changzhou 213164, China
- Zhongyi Jiang
  - School of Computer and Artificial Intelligence, Changzhou University, Changzhou 213164, China
- Wanzeng Kong
  - College of Computer Science, Hangzhou Dianzi University, Hangzhou 310018, China
  - Key Laboratory of Brain Machine Collaborative Intelligence Foundation of Zhejiang Province, Hangzhou 310018, China
- Ling Zou
  - School of Computer and Artificial Intelligence, Changzhou University, Changzhou 213164, China
  - School of Microelectronics and Control Engineering, Changzhou University, Changzhou 213164, China
  - Key Laboratory of Brain Machine Collaborative Intelligence Foundation of Zhejiang Province, Hangzhou 310018, China

6
Hayes E, Greene D, O’Donnell C, O’Shea N, Fenelon MA. Spectroscopic technologies and data fusion: Applications for the dairy industry. Front Nutr 2023; 9:1074688. [PMID: 36712542 PMCID: PMC9875022 DOI: 10.3389/fnut.2022.1074688] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 10/19/2022] [Accepted: 12/05/2022] [Indexed: 01/12/2023] Open
Abstract
Increasing consumer awareness, scale of manufacture, and demand to ensure safety, quality and sustainability have accelerated the need for rapid, reliable, and accurate analytical techniques for food products. Spectroscopy, coupled with Artificial Intelligence-enabled sensors and chemometric techniques, has led to the fusion of data sources for dairy analytical applications. This article provides an overview of the current spectroscopic technologies used in the dairy industry, with an introduction to data fusion and the associated methodologies used in spectroscopy-based data fusion. The relevance of data fusion in the dairy industry is considered, focusing on its potential to improve predictions for processing traits by chemometric techniques, such as principal component analysis (PCA), partial least squares regression (PLS), and other machine learning algorithms.
Affiliation(s)
- Elena Hayes
  - University College Dublin (UCD) School of Biosystems and Food Engineering, University College Dublin, Dublin, Ireland
  - Teagasc Food Research Centre, Moorepark, Fermoy, Ireland
- Derek Greene
  - University College Dublin (UCD) School of Computer Science, University College Dublin, Dublin, Ireland
- Colm O’Donnell
  - University College Dublin (UCD) School of Biosystems and Food Engineering, University College Dublin, Dublin, Ireland
- Norah O’Shea
  - Teagasc Food Research Centre, Moorepark, Fermoy, Ireland
- Mark A. Fenelon
  - University College Dublin (UCD) School of Biosystems and Food Engineering, University College Dublin, Dublin, Ireland
  - Teagasc Food Research Centre, Moorepark, Fermoy, Ireland
  - *Correspondence: Mark A. Fenelon

7
Xu Y, Zhang J, Wang Y. Recent trends of multi-source and non-destructive information for quality authentication of herbs and spices. Food Chem 2023; 398:133939. [DOI: 10.1016/j.foodchem.2022.133939] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/27/2022] [Revised: 07/19/2022] [Accepted: 08/10/2022] [Indexed: 11/15/2022]

8
New Constructed EEM Spectra Combined with N-PLS Analysis Approach as an Effective Way to Determine Multiple Target Compounds in Complex Samples. Molecules (Basel, Switzerland) 2022; 27:molecules27238378. [PMID: 36500471 PMCID: PMC9740148 DOI: 10.3390/molecules27238378] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/09/2022] [Revised: 08/26/2022] [Accepted: 09/20/2022] [Indexed: 12/02/2022]
Abstract
Excitation-emission matrix (EEM) fluorescence spectroscopy has been applied to many fields. In this study, a simple method was proposed to obtain the new constructed three-dimensional (3D) EEM spectra based on the original EEM spectra. Then, the application of the N-PLS method to the new constructed 3D EEM spectra was proposed to quantify target compounds in two complex data sets. The quantitative models were established on external sample sets and validated using statistical parameters. For validation purposes, the obtained results were compared with those obtained by applying the N-PLS method to the original EEM spectra and applying the PLS method to the extracted maximum spectra in the concatenated mode. The comparison of the results demonstrated that, given the advantages of less useless information and a high calculating speed of the new constructed 3D EEM spectra, N-PLS on the new constructed 3D EEM spectra obtained better quantitative analysis results with a correlation coefficient of prediction above 0.9906 and recovery values in the range of 85.6-95.6%. Therefore, one can conclude that the N-PLS method combined with the new constructed 3D EEM spectra is expected to be broadened as an alternative strategy for the simultaneous determination of multiple target compounds.

9
Huang L, Zhang L, Chen X. Updated review of advances in microRNAs and complex diseases: experimental results, databases, webservers and data fusion. Brief Bioinform 2022; 23:6696143. [PMID: 36094095 DOI: 10.1093/bib/bbac397] [Citation(s) in RCA: 26] [Impact Index Per Article: 13.0] [Received: 06/20/2022] [Revised: 07/19/2022] [Accepted: 08/15/2022] [Indexed: 12/14/2022] Open
Abstract
MicroRNAs (miRNAs) are gene regulators involved in the pathogenesis of complex diseases such as cancers, and thus serve as potential diagnostic markers and therapeutic targets. The prerequisite for designing effective miRNA therapies is accurate discovery of miRNA-disease associations (MDAs), which has attracted substantial research interests during the last 15 years, as reflected by more than 55 000 related entries available on PubMed. Abundant experimental data gathered from the wealth of literature could effectively support the development of computational models for predicting novel associations. In 2017, Chen et al. published the first-ever comprehensive review on MDA prediction, presenting various relevant databases, 20 representative computational models, and suggestions for building more powerful ones. In the current review, as the continuation of the previous study, we revisit miRNA biogenesis, detection techniques and functions; summarize recent experimental findings related to common miRNA-associated diseases; introduce recent updates of miRNA-relevant databases and novel database releases since 2017, present mainstream webservers and new webserver releases since 2017 and finally elaborate on how fusion of diverse data sources has contributed to accurate MDA prediction.
Affiliation(s)
- Li Huang
  - Academy of Arts and Design, Tsinghua University, Beijing, 10084, China
  - The Future Laboratory, Tsinghua University, Beijing, 10084, China
- Li Zhang
  - School of Information and Control Engineering, China University of Mining and Technology, Xuzhou, 221116, China
- Xing Chen
  - School of Information and Control Engineering, China University of Mining and Technology, Xuzhou, 221116, China
  - Artificial Intelligence Research Institute, China University of Mining and Technology, Xuzhou, 221116, China

10
Mosayebi R, Dehghani A, Hossein-Zadeh GA. Dynamic functional connectivity estimation for neurofeedback emotion regulation paradigm with simultaneous EEG-fMRI analysis. Front Hum Neurosci 2022; 16:933538. [PMID: 36188168 PMCID: PMC9524189 DOI: 10.3389/fnhum.2022.933538] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/01/2022] [Accepted: 08/22/2022] [Indexed: 11/15/2022] Open
Abstract
Joint Analysis of EEG and fMRI datasets can bring new insight into brain mechanisms. In this paper, we employed the recently introduced Correlated Coupled Tensor Matrix Factorization (CCMTF) method for analysis of the emotion regulation paradigm based on EEG frontal asymmetry neurofeedback in the alpha frequency band with simultaneous fMRI. CCMTF method assumes that the co-variations of the common dimension (temporal dimension) between EEG and fMRI are correlated and not necessarily identical. The results of the CCMTF method suggested that EEG and fMRI had similar covariations during the transition of brain activities from resting states to task (view and upregulation) states and these covariations followed an increasing trend. The fMRI shared spatial component showed activations in the limbic system, DLPFC, OFC, and VLPC regions, which were consistent with the previous studies and were linked to EEG frequency patterns in the range of 1–15 Hz with a correlation value close to 0.75. The estimated regions from the CCMTF method were then used as the candidate nodes for dynamic functional connectivity (dFC) analysis, in which the changes in connectivity from view to upregulation states were examined. The results of the dFC analysis were compared with a Normalized Mutual information (NMI) based approach in two different frequency ranges (1–15 and 15–40 Hz) as the NMI method was applied to the vectors of dFC nodes of EEG and fMRI data. The results of the two methods illustrated that the relation between EEG and fMRI datasets was mostly in the frequency range of 1–15 Hz. These relations were both in the brain activations and the dFCs between the two modalities. This paper suggests that the CCMTF method is a capable approach for extracting the shared information between EEG and fMRI data and can reveal new information about brain functions and their connectivity without solving the EEG inverse problem or analyzing different frequency bands.
Affiliation(s)
- Raziyeh Mosayebi
  - School of Electrical and Computer Engineering, University of Tehran, Tehran, Iran
  - *Correspondence: Raziyeh Mosayebi
- Amin Dehghani
  - School of Electrical and Computer Engineering, University of Tehran, Tehran, Iran
- Gholam-Ali Hossein-Zadeh
  - School of Electrical and Computer Engineering, University of Tehran, Tehran, Iran
  - School of Cognitive Sciences, Institute for Research in Fundamental Sciences (IPM), Tehran, Iran

11
Li P, Sofuoglu SE, Aviyente S, Maiti T. Coupled support tensor machine classification for multimodal neuroimaging data. Stat Anal Data Min 2022. [DOI: 10.1002/sam.11587] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/06/2022]
Affiliation(s)
- Peide Li
  - Boehringer Ingelheim Pharmaceuticals, Duluth, Georgia, USA
- Selin Aviyente
  - College of Engineering, Michigan State University, East Lansing, Michigan, USA
- Tapabrata Maiti
  - College of Natural Science, Michigan State University, East Lansing, Michigan, USA

12
Watson ER, Taherian Fard A, Mar JC. Computational Methods for Single-Cell Imaging and Omics Data Integration. Front Mol Biosci 2022; 8:768106. [PMID: 35111809 PMCID: PMC8801747 DOI: 10.3389/fmolb.2021.768106] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 09/14/2021] [Accepted: 11/29/2021] [Indexed: 12/12/2022] Open
Abstract
Integrating single cell omics and single cell imaging allows for a more effective characterisation of the underlying mechanisms that drive a phenotype at the tissue level, creating a comprehensive profile at the cellular level. Although the use of imaging data is well established in biomedical research, its primary application has been to observe phenotypes at the tissue or organ level, often using medical imaging techniques such as MRI, CT, and PET. These imaging technologies complement omics-based data in biomedical research because they are helpful for identifying associations between genotype and phenotype, along with functional changes occurring at the tissue level. Single cell imaging can act as an intermediary between these levels. Meanwhile new technologies continue to arrive that can be used to interrogate the genome of single cells and its related omics datasets. As these two areas, single cell imaging and single cell omics, each advance independently with the development of novel techniques, the opportunity to integrate these data types becomes more and more attractive. This review outlines some of the technologies and methods currently available for generating, processing, and analysing single-cell omics- and imaging data, and how they could be integrated to further our understanding of complex biological phenomena like ageing. We include an emphasis on machine learning algorithms because of their ability to identify complex patterns in large multidimensional data.
Affiliation(s)
- Atefeh Taherian Fard
  - Australian Institute for Bioengineering and Nanotechnology, The University of Queensland, Brisbane, QLD, Australia
- Jessica Cara Mar
  - Australian Institute for Bioengineering and Nanotechnology, The University of Queensland, Brisbane, QLD, Australia

13
Mishra P, Roger JM, Jouan-Rimbaud-Bouveresse D, Biancolillo A, Marini F, Nordon A, Rutledge DN. Recent trends in multi-block data analysis in chemometrics for multi-source data integration. Trends Analyt Chem 2021. [DOI: 10.1016/j.trac.2021.116206] [Citation(s) in RCA: 18] [Impact Index Per Article: 6.0] [Indexed: 01/23/2023]

14
PCovR2: A flexible principal covariates regression approach to parsimoniously handle multiple criterion variables. Behav Res Methods 2021; 53:1648-1668. [PMID: 33420716 DOI: 10.3758/s13428-020-01508-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 11/02/2020] [Indexed: 11/08/2022]
Abstract
Principal covariates regression (PCovR) allows one to deal with the interpretational and technical problems associated with running ordinary regression using many predictor variables. In PCovR, the predictor variables are reduced to a limited number of components, and simultaneously, criterion variables are regressed on these components. By means of a weighting parameter, users can flexibly choose how much they want to emphasize reconstruction and prediction. However, when datasets contain many criterion variables, PCovR users face new interpretational problems, because many regression weights will be obtained and because some criteria might be unrelated to the predictors. We therefore propose PCovR2, which extends PCovR by also reducing the criteria to a few components. These criterion components are predicted based on the predictor components. The PCovR2 weighting parameter can again be flexibly used to focus on the reconstruction of the predictors and criteria, or on filtering out relevant predictor components and predictable criterion components. We compare PCovR2 to two other approaches, based on partial least squares (PLS) and principal components regression (PCR), that also reduce the criteria and are therefore called PLS2 and PCR2. By means of a simulated example, we show that PCovR2 outperforms PLS2 and PCR2 when one aims to recover all relevant predictor components and predictable criterion components. Moreover, we conduct a simulation study to evaluate how well PCovR2, PLS2 and PCR2 succeed in finding (1) all underlying components and (2) the subset of relevant predictor and predictable criterion components. Finally, we illustrate the use of PCovR2 by means of empirical data.
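The role of the weighting parameter can be sketched for plain PCovR (not PCovR2) with a short NumPy example. It assumes the commonly used closed-form formulation in which the component scores are the leading eigenvectors of a weighted sum of the normalized predictor cross-product and the normalized projected criterion cross-product; the function name `pcovr`, the symbol `alpha`, the normalization, and the simulated data are illustrative assumptions rather than details taken from the abstract.

```python
import numpy as np


def pcovr(X, Y, n_components, alpha):
    """Principal covariates regression sketch for column-centered X and Y.

    alpha = 1 puts all weight on reconstructing X (PCA-like scores);
    smaller alpha shifts weight toward predicting Y from the same scores.
    """
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    X_pinv = np.linalg.pinv(X)
    # Part of Y that is linearly reachable from X (projection on col-space of X).
    Y_hat = X @ (X_pinv @ Y)
    G = alpha * (X @ X.T) / np.sum(X**2) + (1.0 - alpha) * (Y_hat @ Y_hat.T) / np.sum(Y**2)
    eigvals, eigvecs = np.linalg.eigh(G)           # ascending eigenvalues
    T = eigvecs[:, ::-1][:, :n_components]         # orthonormal component scores
    W = X_pinv @ T                                 # component weights, T = X @ W
    Px = T.T @ X                                   # loadings reconstructing X
    Py = T.T @ Y                                   # regression weights for Y
    return T, W, Px, Py


# Simulated data (hypothetical sizes): 30 samples, 5 predictors, 2 criteria.
rng = np.random.default_rng(0)
X_demo = rng.standard_normal((30, 5))
X_demo -= X_demo.mean(axis=0)
Y_demo = X_demo @ rng.standard_normal((5, 2)) + 0.5 * rng.standard_normal((30, 2))
Y_demo -= Y_demo.mean(axis=0)

for alpha in (1.0, 0.5, 0.1):
    T, W, Px, Py = pcovr(X_demo, Y_demo, n_components=2, alpha=alpha)
    x_err = np.sum((X_demo - T @ Px) ** 2) / np.sum(X_demo**2)
    y_err = np.sum((Y_demo - T @ Py) ** 2) / np.sum(Y_demo**2)
    print(f"alpha={alpha}: X error {x_err:.3f}, Y error {y_err:.3f}")
```

Under this formulation, lowering `alpha` can only keep or reduce the criterion error, at the cost of a larger predictor reconstruction error. PCovR2, as the abstract describes it, additionally reduces the criteria to components, which this sketch does not attempt.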

15
Van Eyndhoven S, Dupont P, Tousseyn S, Vervliet N, Van Paesschen W, Van Huffel S, Hunyadi B. Augmenting interictal mapping with neurovascular coupling biomarkers by structured factorization of epileptic EEG and fMRI data. Neuroimage 2020; 228:117652. [PMID: 33359347 PMCID: PMC7903163 DOI: 10.1016/j.neuroimage.2020.117652] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Received: 04/15/2020] [Revised: 11/28/2020] [Accepted: 12/04/2020] [Indexed: 12/20/2022] Open
Abstract
EEG-correlated fMRI analysis is widely used to detect regional BOLD fluctuations that are synchronized to interictal epileptic discharges, which can provide evidence for localizing the ictal onset zone. However, the typical, asymmetrical and mass-univariate approach cannot capture the inherent, higher order structure in the EEG data, nor multivariate relations in the fMRI data, and it is nontrivial to accurately handle varying neurovascular coupling over patients and brain regions. We aim to overcome these drawbacks in a data-driven manner by means of a novel structured matrix-tensor factorization: the single-subject EEG data (represented as a third-order spectrogram tensor) and fMRI data (represented as a spatiotemporal BOLD signal matrix) are jointly decomposed into a superposition of several sources, characterized by space-time-frequency profiles. In the shared temporal mode, Toeplitz-structured factors account for a spatially specific, neurovascular 'bridge' between the EEG and fMRI temporal fluctuations, capturing the hemodynamic response's variability over brain regions. By analyzing interictal data from twelve patients, we show that the extracted source signatures provide a sensitive localization of the ictal onset zone (10/12). Moreover, complementary parts of the IOZ can be uncovered by inspecting those regions with the most deviant neurovascular coupling, as quantified by two entropy-like metrics of the hemodynamic response function waveforms (9/12). Hence, this multivariate, multimodal factorization provides two useful sets of EEG-fMRI biomarkers, which can assist the presurgical evaluation of epilepsy. We make all code required to perform the computations available at https://github.com/svaneynd/structured-cmtf.
Affiliation(s)
- Simon Van Eyndhoven
  - Department of Electrical Engineering (ESAT), STADIUS Center for Dynamical Systems, Signal Processing and Data Analytics, KU Leuven, Belgium
- Patrick Dupont
  - Laboratory for Cognitive Neurology, Department of Neurosciences, KU Leuven, Leuven, Belgium; Leuven Brain Institute, Leuven, Belgium
- Simon Tousseyn
  - Academic Center for Epileptology, Kempenhaeghe and Maastricht UMC+, Heeze, the Netherlands
- Nico Vervliet
  - Department of Electrical Engineering (ESAT), STADIUS Center for Dynamical Systems, Signal Processing and Data Analytics, KU Leuven, Belgium
- Wim Van Paesschen
  - Laboratory for Epilepsy Research, KU Leuven, Leuven, Belgium; Department of Neurology, University Hospitals Leuven, Leuven, Belgium
- Sabine Van Huffel
  - Department of Electrical Engineering (ESAT), STADIUS Center for Dynamical Systems, Signal Processing and Data Analytics, KU Leuven, Belgium
- Borbála Hunyadi
  - Circuits and Systems Group (CAS), Department of Microelectronics, Delft University of Technology, Delft, the Netherlands
|
16
|
Lu SH, Zhai HL, Zhao BQ, Yin B, Zhu L. Novel Approach to the Analysis of Chemical Third-Order Data. J Chem Inf Model 2020; 60:4750-4756. [PMID: 32955255 DOI: 10.1021/acs.jcim.0c00554] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Indexed: 11/28/2022]
Abstract
For more complex samples, chemical higher-order data can be collected from various information sources and become the necessary foundation for accurate analysis. In this article, the Tchebichef cubic moment (TCM) was developed for the analysis of chemical third-order data for the first time. The proposed TCM approach was then applied to fluorescence excitation-emission-time data for the analysis of adrenaline and noradrenaline in urinary samples (Data I) and to the data fusion of excitation-emission matrix (EEM), NMR, and liquid chromatography-mass spectrometry (LC-MS) spectra for the determination of five target components (Data II). For Data I, all of the cross-validation correlation coefficients (R²cv) of the obtained linear models on the calibration set were above 0.9937, and the prediction root-mean-square errors (RMSEp) of the external independent test samples were below 0.0250 μM. For Data II, all R²cv values were higher than 0.9846 and all RMSEp values were below 0.2267 μM. Compared with several conventional methods, the proposed method was more convenient and accurate. This study provides another effective approach to the analysis of complex samples based on their chemical third-order data.
Affiliation(s)
- Shao Hua Lu
  - College of Chemistry and Chemical Engineering, Lanzhou University, Lanzhou 730000, P. R. China
- Hong Lin Zhai
  - College of Chemistry and Chemical Engineering, Lanzhou University, Lanzhou 730000, P. R. China
- Bing Qiang Zhao
  - College of Chemistry and Chemical Engineering, Lanzhou University, Lanzhou 730000, P. R. China
- Bo Yin
  - College of Chemistry and Chemical Engineering, Lanzhou University, Lanzhou 730000, P. R. China
- Ling Zhu
  - College of Chemistry and Chemical Engineering, Lanzhou University, Lanzhou 730000, P. R. China
|
17
|
Mosayebi R, Hossein-Zadeh GA. Correlated coupled matrix tensor factorization method for simultaneous EEG-fMRI data fusion. Biomed Signal Process Control 2020. [DOI: 10.1016/j.bspc.2020.102071] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Indexed: 11/25/2022]
|
18
|
Jonmohamadi Y, Muthukumaraswamy S, Chen J, Roberts J, Crawford R, Pandey A. Extraction of Common Task Features in EEG-fMRI Data Using Coupled Tensor-Tensor Decomposition. Brain Topogr 2020; 33:636-650. [PMID: 32728794 DOI: 10.1007/s10548-020-00787-0] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Received: 10/17/2019] [Accepted: 07/23/2020] [Indexed: 01/20/2023]
Abstract
The fusion of simultaneously recorded EEG and fMRI data is of great value to neuroscience research due to the complementary properties of the individual modalities. Traditionally, techniques such as PCA and ICA, which rely on strong non-physiological assumptions such as orthogonality and statistical independence, have been used for this purpose. Recently, tensor decomposition techniques such as parallel factor analysis have gained popularity in neuroimaging applications, as they inherently accommodate the multidimensionality of neuroimaging data and achieve uniqueness in decomposition without making strong assumptions. Previously, coupled matrix-tensor decomposition (CMTD) has been applied to the fusion of EEG and fMRI, and only recently has coupled tensor-tensor decomposition (CTTD) been proposed. Here, for the first time, we propose the use of CTTD of a 4th-order EEG tensor (space, time, frequency, and participant) and a 3rd-order fMRI tensor (space, time, participant), coupled partially in the time and participant domains, for the extraction of task-related features in both modalities. We used both sensor-level and source-level EEG for the coupling. The phase-shifted paradigm signals were incorporated as the temporal initializers of the CTTD to extract the task-related features. The approach is validated on simultaneous EEG-fMRI recordings from six participants performing an N-back memory task. The EEG and fMRI tensors were coupled in nine components, seven of which had a high correlation (more than 0.85) with the task. The result of the fusion recapitulates the well-known attention network as positively time-locked, and the default mode network as negatively time-locked, to the memory task.
Affiliation(s)
- Yaqub Jonmohamadi
  - School of Electrical Engineering and Robotics, Queensland University of Technology, Brisbane, Australia
- Joseph Chen
  - School of Pharmacy, The University of Auckland, Auckland, New Zealand
- Jonathan Roberts
  - School of Electrical Engineering and Robotics, Queensland University of Technology, Brisbane, Australia
- Ross Crawford
  - Institute of Health and Biomedical Innovation, Queensland University of Technology, Brisbane, Australia
- Ajay Pandey
  - School of Electrical Engineering and Robotics, Queensland University of Technology, Brisbane, Australia
|
19
|
Gloaguen A, Philippe C, Frouin V, Gennari G, Dehaene-Lambertz G, Le Brusquet L, Tenenhaus A. Multiway generalized canonical correlation analysis. Biostatistics 2020; 23:240-256. [PMID: 32451525 DOI: 10.1093/biostatistics/kxaa010] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 05/06/2019] [Revised: 11/07/2020] [Accepted: 01/20/2020] [Indexed: 11/13/2022]
Abstract
Regularized generalized canonical correlation analysis (RGCCA) is a general multiblock data analysis framework that encompasses several important multivariate analysis methods such as principal component analysis, partial least squares regression, and several versions of generalized canonical correlation analysis. In this article, we extend RGCCA to the case where at least one block has a tensor structure. This method is called multiway generalized canonical correlation analysis (MGCCA). Convergence properties of the MGCCA algorithm are studied, and the computation of higher-level components is discussed. The usefulness of MGCCA is shown on simulation and on the analysis of a cognitive study in human infants using electroencephalography (EEG).
Affiliation(s)
- Arnaud Gloaguen
  - Laboratoire des Signaux et Systèmes (L2S), CNRS-CentraleSupélec, Université Paris-Saclay, 3 rue Joliot-Curie, 91192 Gif-sur-Yvette cedex, France and Université Paris-Saclay, CEA, Neurospin, 91191, Gif-sur-Yvette, France
- Cathy Philippe
  - Université Paris-Saclay, CEA, Neurospin, 91191, Gif-sur-Yvette, France
- Vincent Frouin
  - Université Paris-Saclay, CEA, Neurospin, 91191, Gif-sur-Yvette, France
- Giulia Gennari
  - Cognitive Neuroimaging Unit, CEA, INSERM U992, NeuroSpin Center, 91191 Gif-sur-Yvette, France
- Laurent Le Brusquet
  - Laboratoire des Signaux et Systèmes (L2S), CNRS-CentraleSupélec, Université Paris-Saclay, 3 rue Joliot-Curie, 91192 Gif-sur-Yvette cedex, France
- Arthur Tenenhaus
  - Laboratoire des Signaux et Systèmes (L2S), CNRS-CentraleSupélec, Université Paris-Saclay, 3 rue Joliot-Curie, 91192 Gif-sur-Yvette cedex, France and Institut du Cerveau, INSERM U1127, CNRS UMR 7225, Sorbonne Université, F-75013, Paris, France
|
20
|
Second-order universal calibration. Talanta 2020; 212:120787. [PMID: 32113550 DOI: 10.1016/j.talanta.2020.120787] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 12/13/2019] [Revised: 01/24/2020] [Accepted: 01/25/2020] [Indexed: 11/22/2022]
Abstract
Quantification and qualification of an analyte of interest in pharmaceutical tablets from different manufacturers is a hard task because of the potential presence of various interfering molecules. Indeed, the composition of the tablets covers a wide range of interferents, some of which may even be unknown. Consequently, we propose to determine the concentration of an analyte of interest regardless of the interferents using the concept of universal calibration. Universal calibration paves the way to the quantification of a specific chemical entity in samples with various compositions and different interferents. This is possible because of the trilinear structure of the analyte's signal; in fact, the second-order advantage resulting from second-order universal calibration models is exploited. In this work, a new second-order calibration strategy was implemented using Trilinear Factor Extraction (TFE). A simulated data set was used to highlight the ability of the proposed procedure to accurately extract the analyte's concentration profile. Two real data sets were also explored to test the TFE method. In the first, acetaminophen was quantified using fluorescence spectroscopy in tablets with different formulations from 6 companies. In the second, a peptide (valine-tyrosine-valine) was successfully quantified in different samples using spectrofluorimetric data. Finally, these real data sets were analyzed by Multivariate Curve Resolution - Alternating Least Squares (MCR-ALS) under non-negativity and trilinearity constraints for the sake of comparison. The calculated root mean square errors of prediction (RMSEP) for acetaminophen were 0.028 and 0.026 for the MCR-ALS and TFE models, respectively. For the second experimental data set, the RMSEP values were 0.216 and 0.165, respectively. Based on a paired t-test, the results of MCR-ALS and TFE were not significantly different.
|
21
|
Wimalawarne K, Yamada M, Mamitsuka H. Scaled Coupled Norms and Coupled Higher-Order Tensor Completion. Neural Comput 2019; 32:447-484. [PMID: 31835002 DOI: 10.1162/neco_a_01254] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 11/04/2022]
Abstract
Recently, a set of tensor norms known as coupled norms has been proposed as a convex solution to coupled tensor completion. Coupled norms have been designed by combining low-rank inducing tensor norms with the matrix trace norm. Though coupled norms have shown good performances, they have two major limitations: they do not have a method to control the regularization of coupled modes and uncoupled modes, and they are not optimal for couplings among higher-order tensors. In this letter, we propose a method that scales the regularization of coupled components against uncoupled components to properly induce the low-rankness on the coupled mode. We also propose coupled norms for higher-order tensors by combining the square norm to coupled norms. Using the excess risk-bound analysis, we demonstrate that our proposed methods lead to lower risk bounds compared to existing coupled norms. We demonstrate the robustness of our methods through simulation and real-data experiments.
Affiliation(s)
- Kishan Wimalawarne
  - Bioinformatics Center, Institute for Chemical Research, Kyoto University, Uji, Kyoto 611-0011, Japan
- Makoto Yamada
  - Graduate School of Informatics, Kyoto University, Yoshida-honmachi, Sakyo-ku, Kyoto 606-8501, Japan; RIKEN, Center for Advanced Intelligence Project, Tokyo 103-0027, Japan; Institute of Statistical Mathematics, Tokyo 190-8562, Japan; and PRESTO, Japan Science and Technological Agency, Japan
- Hiroshi Mamitsuka
  - Bioinformatics Center, Institute for Chemical Research, Kyoto University, Gokasho, Uji, Kyoto 611-0011, Japan, and Department of Computer Science, Aalto University, Espoo FI-00076, Finland
|
22
|
Fusing data of different orders for environmental monitoring. Anal Chim Acta 2019; 1085:48-60. [PMID: 31522730 DOI: 10.1016/j.aca.2019.08.005] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 02/26/2019] [Revised: 06/25/2019] [Accepted: 08/05/2019] [Indexed: 11/20/2022]
Abstract
In the present work, a novel application of data fusion to an environmental monitoring study is proposed. This paper involves the joint analysis of zeroth-, first- and second-order data measured on a particular environmental system. The main advantage of this methodology is the possibility of analyzing the relationships among data of different orders provided by several analytical techniques. This approach yields new knowledge that would not be accessible if the information were considered individually. Environmental monitoring databases usually generate large amounts of data, and multivariate statistical techniques are necessary to process all this information and obtain a correct interpretation. The Ludueña Stream, located in Argentina, was chosen as the study system. Samples from different sites of the basin were taken periodically. Conductivity and pH (zeroth-order data) were fused with near-infrared (NIR) spectra of suspended particulate material (first-order data) and with fluorescence emission-excitation matrices of dissolved organic matter (second-order data). Different chemometric algorithms made it possible to extract and merge all the information in a new database, enabling its later analysis as a whole. This methodology made it possible to study the behavior of dissolved organic matter together with suspended particulate material and other specific variables, showing links between them. Their distributions along the basin and their evolution over time could be obtained. Therefore, a simpler interpretation for evaluating the system status was achieved. This model made it possible to differentiate the variables affected by anthropic activities from those with a natural origin.
|
23
|
Hériché JK, Alexander S, Ellenberg J. Integrating Imaging and Omics: Computational Methods and Challenges. Annu Rev Biomed Data Sci 2019. [DOI: 10.1146/annurev-biodatasci-080917-013328] [Citation(s) in RCA: 24] [Impact Index Per Article: 4.8] [Indexed: 11/09/2022]
Abstract
Fluorescence microscopy imaging has long been complementary to DNA sequencing- and mass spectrometry–based omics in biomedical research, but these approaches are now converging. On the one hand, omics methods are moving from in vitro methods that average across large cell populations to in situ molecular characterization tools with single-cell sensitivity. On the other hand, fluorescence microscopy imaging has moved from a morphological description of tissues and cells to quantitative molecular profiling with single-molecule resolution. Recent technological developments underpinned by computational methods have started to blur the lines between imaging and omics and have made their direct correlation and seamless integration an exciting possibility. As this trend continues rapidly, it will allow us to create comprehensive molecular profiles of living systems with spatial and temporal context and subcellular resolution. Key to achieving this ambitious goal will be novel computational methods and successfully dealing with the challenges of data integration and sharing as well as cloud-enabled big data analysis.
Affiliation(s)
- Jean-Karim Hériché
  - Cell Biology and Biophysics Unit, European Molecular Biology Laboratory (EMBL), 69117 Heidelberg, Germany
- Stephanie Alexander
  - Cell Biology and Biophysics Unit, European Molecular Biology Laboratory (EMBL), 69117 Heidelberg, Germany
- Jan Ellenberg
  - Cell Biology and Biophysics Unit, European Molecular Biology Laboratory (EMBL), 69117 Heidelberg, Germany
|
24
|
Acar E, Schenker C, Levin-Schwartz Y, Calhoun VD, Adali T. Unraveling Diagnostic Biomarkers of Schizophrenia Through Structure-Revealing Fusion of Multi-Modal Neuroimaging Data. Front Neurosci 2019; 13:416. [PMID: 31130835 PMCID: PMC6509223 DOI: 10.3389/fnins.2019.00416] [Citation(s) in RCA: 20] [Impact Index Per Article: 4.0] [Received: 12/14/2018] [Accepted: 04/11/2019] [Indexed: 11/13/2022]
Abstract
Fusing complementary information from different modalities can lead to the discovery of more accurate diagnostic biomarkers for psychiatric disorders. However, biomarker discovery through data fusion is challenging since it requires extracting interpretable and reproducible patterns from data sets, consisting of shared/unshared patterns and of different orders. For example, multi-channel electroencephalography (EEG) signals from multiple subjects can be represented as a third-order tensor with modes: subject, time, and channel, while functional magnetic resonance imaging (fMRI) data may be in the form of subject by voxel matrices. Traditional data fusion methods rearrange higher-order tensors, such as EEG, as matrices to use matrix factorization-based approaches. In contrast, fusion methods based on coupled matrix and tensor factorizations (CMTF) exploit the potential multi-way structure of higher-order tensors. The CMTF approach has been shown to capture underlying patterns more accurately without imposing strong constraints on the latent neural patterns, i.e., biomarkers. In this paper, EEG, fMRI, and structural MRI (sMRI) data collected during an auditory oddball task (AOD) from a group of subjects consisting of patients with schizophrenia and healthy controls, are arranged as matrices and higher-order tensors coupled along the subject mode, and jointly analyzed using structure-revealing CMTF methods [also known as advanced CMTF (ACMTF)] focusing on unique identification of underlying patterns in the presence of shared/unshared patterns. We demonstrate that joint analysis of the EEG tensor and fMRI matrix using ACMTF reveals significant and biologically meaningful components in terms of differentiating between patients with schizophrenia and healthy controls while also providing spatial patterns with high resolution and improving the clustering performance compared to the analysis of only the EEG tensor. 
We also show that these patterns are reproducible, and study reproducibility for different model parameters. In comparison to the joint independent component analysis (jICA) data fusion approach, ACMTF provides easier interpretation of EEG data by revealing a single summary map of the topography for each component. Furthermore, fusion of sMRI data with EEG and fMRI through an ACMTF model provides structural patterns; however, we also show that when fusing data sets from multiple modalities, hence of very different nature, preprocessing plays a crucial role.
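The coupling idea behind CMTF, a tensor and a matrix sharing one factor along the subject mode, can be sketched with plain alternating least squares. This is a minimal illustration under stated assumptions: it fits the basic CMTF objective with an unweighted sum of squared errors, not the structure-revealing ACMTF model used in the article, and the function names `cmtf_als` and `cmtf_loss`, the argument names, and the fixed iteration count are hypothetical.

```python
import numpy as np

def cmtf_als(X, Y, rank, n_iter=200, seed=0):
    """Basic coupled matrix-tensor factorization by alternating least squares.

    X: (I, J, K) tensor, modeled as sum_r a_r o b_r o c_r.
    Y: (I, M) matrix, modeled as A @ V.T, sharing A with the tensor.
    """
    rng = np.random.default_rng(seed)
    I, J, K = X.shape
    M = Y.shape[1]
    A = rng.standard_normal((I, rank))
    B = rng.standard_normal((J, rank))
    C = rng.standard_normal((K, rank))
    V = rng.standard_normal((M, rank))
    for _ in range(n_iter):
        # Shared factor: least squares over the tensor and matrix terms jointly.
        rhs = np.einsum('ijk,jr,kr->ir', X, B, C) + Y @ V
        A = rhs @ np.linalg.pinv((B.T @ B) * (C.T @ C) + V.T @ V)
        # Tensor-only factors: standard CP least-squares updates.
        B = np.einsum('ijk,ir,kr->jr', X, A, C) @ np.linalg.pinv((A.T @ A) * (C.T @ C))
        C = np.einsum('ijk,ir,jr->kr', X, A, B) @ np.linalg.pinv((A.T @ A) * (B.T @ B))
        # Matrix-only factor.
        V = Y.T @ A @ np.linalg.pinv(A.T @ A)
    return A, B, C, V

def cmtf_loss(X, Y, A, B, C, V):
    """Sum of squared reconstruction errors for the tensor and the matrix."""
    X_hat = np.einsum('ir,jr,kr->ijk', A, B, C)
    return np.sum((X - X_hat) ** 2) + np.sum((Y - A @ V.T) ** 2)
```

Because every update solves its block exactly, the loss cannot increase from one sweep to the next, although convergence to the global optimum is not guaranteed. In a setting like the one described, `X` would play the role of the subject by time by channel EEG tensor and `Y` the subject by voxel fMRI matrix, with the rows of `A` acting as per-subject component scores.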
Affiliation(s)
- Evrim Acar
  - Machine Intelligence Department, Simula Metropolitan Center for Digital Engineering, Oslo, Norway
- Carla Schenker
  - Machine Intelligence Department, Simula Metropolitan Center for Digital Engineering, Oslo, Norway
- Yuri Levin-Schwartz
  - Department of Environmental Medicine and Public Health, Icahn School of Medicine at Mount Sinai, New York, NY, United States
- Vince D. Calhoun
  - The Mind Research Network, Albuquerque, NM, United States
  - Department of Electrical and Computer Engineering, University of New Mexico, Albuquerque, NM, United States
- Tülay Adali
  - Department of Computer Science and Electrical Engineering, University of Maryland Baltimore County, Baltimore, MD, United States
|
25
|
Kinney-Lang E, Ebied A, Auyeung B, Chin RFM, Escudero J. Introducing the Joint EEG-Development Inference (JEDI) Model: A Multi-Way, Data Fusion Approach for Estimating Paediatric Developmental Scores via EEG. IEEE Trans Neural Syst Rehabil Eng 2019; 27:348-357. [DOI: 10.1109/tnsre.2019.2891827] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Indexed: 11/05/2022]
|
26
|
Li Y, Wu FX, Ngom A. A review on machine learning principles for multi-view biological data integration. Brief Bioinform 2019; 19:325-340. [PMID: 28011753 DOI: 10.1093/bib/bbw113] [Citation(s) in RCA: 126] [Impact Index Per Article: 25.2] [Received: 07/26/2016] [Indexed: 01/08/2023]
Abstract
Driven by high-throughput sequencing techniques, modern genomic and clinical studies are in a strong need of integrative machine learning models for better use of vast volumes of heterogeneous information in the deep understanding of biological systems and the development of predictive models. How data from multiple sources (called multi-view data) are incorporated in a learning system is a key step for successful analysis. In this article, we provide a comprehensive review on omics and clinical data integration techniques, from a machine learning perspective, for various analyses such as prediction, clustering, dimension reduction and association. We shall show that Bayesian models are able to use prior information and model measurements with various distributions; tree-based methods can either build a tree with all features or collectively make a final decision based on trees learned from each view; kernel methods fuse the similarity matrices learned from individual views together for a final similarity matrix or learning model; network-based fusion methods are capable of inferring direct and indirect associations in a heterogeneous network; matrix factorization models have potential to learn interactions among features from different views; and a range of deep neural networks can be integrated in multi-modal learning for capturing the complex mechanism of biological systems.
Affiliation(s)
- Yifeng Li
  - Information and Communications Technologies, National Research Council Canada, Ottawa, Ontario, Canada
- Fang-Xiang Wu
  - Department of Mechanical Engineering, University of Saskatchewan, Saskatoon, Saskatchewan, Canada
- Alioune Ngom
  - School of Computer Science, University of Windsor, Windsor, Ontario, Canada
|
27
|
Smolinska A, Engel J, Szymanska E, Buydens L, Blanchet L. General Framing of Low-, Mid-, and High-Level Data Fusion With Examples in the Life Sciences. DATA HANDLING IN SCIENCE AND TECHNOLOGY 2019. [DOI: 10.1016/b978-0-444-63984-4.00003-x] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Indexed: 12/13/2022]
|
28
|
Wünsch UJ, Acar E, Koch BP, Murphy KR, Schmitt-Kopplin P, Stedmon CA. The Molecular Fingerprint of Fluorescent Natural Organic Matter Offers Insight into Biogeochemical Sources and Diagenetic State. Anal Chem 2018; 90:14188-14197. [DOI: 10.1021/acs.analchem.8b02863] [Citation(s) in RCA: 31] [Impact Index Per Article: 5.2] [Indexed: 12/22/2022]
Affiliation(s)
- Urban J. Wünsch
  - Chalmers University of Technology, Architecture and Civil Engineering, Water Environment Technology, Sven Hultins Gata 6, 41296 Gothenburg, Sweden
  - National Institute of Aquatic Resources, Technical University of Denmark, Kemitorvet, 2800 Kgs. Lyngby, Denmark
- Evrim Acar
  - Simula Metropolitan Center for Digital Engineering, Pilestredet 52, 0167 Oslo, Norway
- Boris P. Koch
  - Alfred Wegener Institute Helmholtz Centre for Polar and Marine Research, Am Handelshafen 12, 27570 Bremerhaven, Germany
  - University of Applied Sciences, An der Karlstadt 8, 27568 Bremerhaven, Germany
- Kathleen R. Murphy
  - Chalmers University of Technology, Architecture and Civil Engineering, Water Environment Technology, Sven Hultins Gata 6, 41296 Gothenburg, Sweden
- Philippe Schmitt-Kopplin
  - Research Unit Analytical Biogeochemistry (BGC), Helmholtz Zentrum München, German Research Center for Environmental Health, Ingolstädter Landstrasse 1, 85764 Neuherberg, Germany
- Colin A. Stedmon
  - National Institute of Aquatic Resources, Technical University of Denmark, Kemitorvet, 2800 Kgs. Lyngby, Denmark
|
29
|
Wimalawarne K, Yamada M, Mamitsuka H. Convex Coupled Matrix and Tensor Completion. Neural Comput 2018; 30:3095-3127. [PMID: 30148706 DOI: 10.1162/neco_a_01123] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Indexed: 11/04/2022]
Abstract
We propose a set of convex low-rank inducing norms for coupled matrices and tensors (hereafter referred to as coupled tensors), in which information is shared between the matrices and tensors through common modes. More specifically, we first propose a mixture of the overlapped trace norm and the latent norms with the matrix trace norm, and then, propose a completion model regularized using these norms to impute coupled tensors. A key advantage of the proposed norms is that they are convex and can be used to find a globally optimal solution, whereas existing methods for coupled learning are nonconvex. We also analyze the excess risk bounds of the completion model regularized using our proposed norms and show that they can exploit the low-rankness of coupled tensors, leading to better bounds compared to those obtained using uncoupled norms. Through synthetic and real-data experiments, we show that the proposed completion model compares favorably with existing ones.
Affiliation(s)
- Kishan Wimalawarne
  - Bioinformatics Center, Institute for Chemical Research, Kyoto University, Gokasho, Uji 611-004, Japan
- Makoto Yamada
  - RIKEN, Center for Advanced Intelligence Project, Chuo-ku, Tokyo 103-0027, Japan; Institute of Statistical Mathematics, Tachikawa, Tokyo 190-8562, Japan; and PRESTO, Japan Science and Technological Agency, Kawaguchi-shi, Saitama 332-0012, Japan
- Hiroshi Mamitsuka
  - Bioinformatics Center, Institute for Chemical Research, Kyoto University, Gokasho, Uji 611-0011, Japan, and Department of Computer Science, Aalto University, Espoo 02150, Finland
|
30
|
Castañar L, Poggetto GD, Colbourne AA, Morris GA, Nilsson M. The GNAT: A new tool for processing NMR data. MAGNETIC RESONANCE IN CHEMISTRY : MRC 2018; 56:546-558. [PMID: 29396867 PMCID: PMC6001793 DOI: 10.1002/mrc.4717] [Citation(s) in RCA: 33] [Impact Index Per Article: 5.5] [Received: 11/29/2017] [Revised: 01/15/2018] [Accepted: 01/22/2018] [Indexed: 05/31/2023]
Abstract
The GNAT (General NMR Analysis Toolbox) is a free and open-source software package for processing, visualising, and analysing NMR data. It supersedes the popular DOSY Toolbox, which has a narrower focus on diffusion NMR. Data import of most common formats from the major NMR platforms is supported, as well as a GNAT generic format. Key basic processing of NMR data (e.g., Fourier transformation, baseline correction, and phasing) is catered for within the program, as well as more advanced techniques (e.g., reference deconvolution and pure shift FID reconstruction). Analysis tools include DOSY and SCORE for diffusion data, ROSY T1/T2 estimation for relaxation data, and PARAFAC for multilinear analysis. The GNAT is written for the MATLAB® language and comes with a user-friendly graphical user interface. The standard version is intended to run with a MATLAB installation, but completely free-standing compiled versions for Windows, Mac, and Linux are also freely available.
Affiliation(s)
- Laura Castañar
  - School of Chemistry, University of Manchester, Oxford Road, Manchester M13 9PL, UK
- Adam A. Colbourne
  - School of Chemistry, University of Manchester, Oxford Road, Manchester M13 9PL, UK
- Gareth A. Morris
  - School of Chemistry, University of Manchester, Oxford Road, Manchester M13 9PL, UK
- Mathias Nilsson
  - School of Chemistry, University of Manchester, Oxford Road, Manchester M13 9PL, UK
|
31
|
Li BQ, Wang X, Xu ML, Zhai HL, Chen J, Liu JJ. The multi-resolution capability of Tchebichef moments and its applications to the analysis of fluorescence excitation-emission spectra. Methods Appl Fluoresc 2017; 6:015008. [PMID: 28933348 DOI: 10.1088/2050-6120/aa8e1e] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Indexed: 11/11/2022]
Abstract
Fluorescence spectroscopy with an excitation-emission matrix (EEM) is a fast and inexpensive technique and has been applied to the detection of a very wide range of analytes. However, serious scattering and overlapping signals hinder the applications of EEM spectra. In this contribution, the multi-resolution capability of Tchebichef moments was investigated in depth and applied for the first time to the analysis of two EEM data sets (data set 1 consisted of valine-tyrosine-valine, tryptophan-glycine and phenylalanine, and data set 2 included vitamin B1, vitamin B2 and vitamin B6). Tchebichef moments of different orders represent different information in the EEM spectra. Owing to this multi-resolution capability, the overlapping problem was solved and the chemical information was separated from the scattering information. The results demonstrated that the Tchebichef moment method is very effective and provides a promising tool for the analysis of EEM spectra. The method is expected to extend to other spectra and to complex systems such as biological fluids, food, and environmental samples, where it could address practical problems (overlapped peaks, unknown interferences, baseline drifts, and so on).
Affiliation(s)
- Bao Qiong Li
- College of Chemistry & Chemical Engineering, Lanzhou University, Lanzhou, 730000, People's Republic of China
|
32
|
Bruno C, Patin F, Bocca C, Nadal-Desbarats L, Bonnier F, Reynier P, Emond P, Vourc'h P, Joseph-Delafont K, Corcia P, Andres CR, Blasco H. The combination of four analytical methods to explore skeletal muscle metabolomics: Better coverage of metabolic pathways or a marketing argument? J Pharm Biomed Anal 2017; 148:273-279. [PMID: 29059617 DOI: 10.1016/j.jpba.2017.10.013] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.0] [Received: 07/01/2017] [Revised: 10/12/2017] [Accepted: 10/13/2017] [Indexed: 01/11/2023]
Abstract
OBJECTIVES: Metabolomics is an emerging science based on diverse high-throughput methods that are rapidly evolving to improve metabolic coverage of biological fluids and tissues. Technical progress has led researchers to combine several analytical methods without reporting the impact of such a strategy on metabolic coverage. The objective of our study was to develop and validate several analytical techniques (mass spectrometry coupled to gas or liquid chromatography and nuclear magnetic resonance) for the metabolomic analysis of small muscle samples and evaluate the impact of combining methods for more exhaustive metabolite coverage. DESIGN AND METHODS: We evaluated the muscle metabolome from the same pool of mouse muscle samples after 2 metabolite extraction protocols. Four analytical methods were used: targeted flow injection analysis coupled with mass spectrometry (FIA-MS/MS), gas chromatography coupled with mass spectrometry (GC-MS), liquid chromatography coupled with high-resolution mass spectrometry (LC-HRMS), and nuclear magnetic resonance (NMR) analysis. We evaluated the global variability of each compound, i.e., analytical variability (from quality controls) and extraction variability (from muscle extracts). We determined the best extraction method and reported the common and distinct metabolites identified based on the number and identity of the compounds detected with low analytical variability (coefficient of variation < 30%) for each method. Finally, we assessed the coverage of muscle metabolic pathways obtained. RESULTS: Methanol/chloroform/water and water/methanol were the best extraction solvents for muscle metabolome analysis by NMR and MS, respectively. We identified 38 metabolites by nuclear magnetic resonance, 37 by FIA-MS/MS, 18 by GC-MS, and 80 by LC-HRMS. The combination led us to identify a total of 132 metabolites with low variability partitioned into 58 metabolic pathways, such as amino acid, nitrogen, purine, and pyrimidine metabolism, and the citric acid cycle. This combination also showed that the contribution of GC-MS was low when used in combination with other mass spectrometry methods and nuclear magnetic resonance to explore muscle samples. CONCLUSION: This study reports the validation of several analytical methods, based on nuclear magnetic resonance and several mass spectrometry methods, to explore the muscle metabolome from a small amount of tissue, comparable to that obtained during a clinical trial. The combination of several techniques may be relevant for the exploration of muscle metabolism, with acceptable analytical variability and overlap between methods. However, the difficult and time-consuming data pre-processing, processing, and statistical analysis steps do not justify systematically combining analytical methods.
Affiliation(s)
- C Bruno
- CHRU de Tours, Laboratoire de Biochimie et Biologie Moléculaire, Tours, France; UMR INSERM U930, Université François Rabelais de Tours, France
- F Patin
- CHRU de Tours, Laboratoire de Biochimie et Biologie Moléculaire, Tours, France; UMR INSERM U930, Université François Rabelais de Tours, France
- C Bocca
- Institut MITOVASC, CNRS 6015, INSERM U1083, Université d'Angers, Angers, France
- F Bonnier
- Université François-Rabelais de Tours, Faculté de Pharmacie, EA 6295 Nanomédicaments et Nanosondes, Tours, France
- P Reynier
- Institut MITOVASC, CNRS 6015, INSERM U1083, Université d'Angers, Angers, France
- P Emond
- UMR INSERM U930, Université François Rabelais de Tours, France
- P Vourc'h
- CHRU de Tours, Laboratoire de Biochimie et Biologie Moléculaire, Tours, France; UMR INSERM U930, Université François Rabelais de Tours, France
- K Joseph-Delafont
- CHRU de Tours, Laboratoire de Biochimie et Biologie Moléculaire, Tours, France
- P Corcia
- UMR INSERM U930, Université François Rabelais de Tours, France; Centre de Ressources et de Compétences SLA, CHU Tours, France; Fédération des Centres de Ressources et de Compétences de Tours et Limoges, Litorals, France
- C R Andres
- CHRU de Tours, Laboratoire de Biochimie et Biologie Moléculaire, Tours, France; UMR INSERM U930, Université François Rabelais de Tours, France
- H Blasco
- CHRU de Tours, Laboratoire de Biochimie et Biologie Moléculaire, Tours, France; UMR INSERM U930, Université François Rabelais de Tours, France
|
33
|
Papalexakis EE, Faloutsos C, Sidiropoulos ND. Tensors for Data Mining and Data Fusion. ACM T INTEL SYST TEC 2017. [DOI: 10.1145/2915921] [Citation(s) in RCA: 51] [Impact Index Per Article: 7.3] [Indexed: 10/20/2022]
Abstract
Tensors and tensor decompositions are very powerful and versatile tools that can model a wide variety of heterogeneous, multiaspect data. As a result, tensor decompositions, which extract useful latent information out of multiaspect data tensors, have witnessed increasing popularity and adoption by the data mining community. In this survey, we present some of the most widely used tensor decompositions, providing the key insights behind them, and summarizing them from a practitioner’s point of view. We then provide an overview of a very broad spectrum of applications where tensors have been instrumental in achieving state-of-the-art performance, ranging from social network analysis to brain data analysis, and from web mining to healthcare. Subsequently, we present recent algorithmic advances in scaling tensor decompositions up to today’s big data, outlining the existing systems and summarizing the key ideas behind them. Finally, we conclude with a list of challenges and open problems that outline exciting future research directions.
|
34
|
Preprocessing and Pretreatment of Metabolomics Data for Statistical Analysis. ADVANCES IN EXPERIMENTAL MEDICINE AND BIOLOGY 2017; 965:145-161. [DOI: 10.1007/978-3-319-47656-8_6] [Citation(s) in RCA: 29] [Impact Index Per Article: 4.1] [Indexed: 12/17/2022]
|
35
|
|
36
|
Rivet B, Duda M, Guérin-Dugué A, Jutten C, Comon P. Multimodal approach to estimate the ocular movements during EEG recordings: A coupled tensor factorization method. 2015 37TH ANNUAL INTERNATIONAL CONFERENCE OF THE IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY (EMBC) 2015; 2015:6983-6. [PMID: 26737899 DOI: 10.1109/embc.2015.7319999] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.4] [Indexed: 11/10/2022]
|
37
|
Whelan NV, Kocot KM, Halanych KM. Employing Phylogenomics to Resolve the Relationships among Cnidarians, Ctenophores, Sponges, Placozoans, and Bilaterians. Integr Comp Biol 2015; 55:1084-95. [PMID: 25972566 DOI: 10.1093/icb/icv037] [Citation(s) in RCA: 34] [Impact Index Per Article: 3.8] [Indexed: 11/14/2022]
Abstract
Despite an explosion in the amount of sequence data, phylogenomics has failed to settle controversy regarding some critical nodes on the animal tree of life. Understanding relationships among Bilateria, Ctenophora, Cnidaria, Placozoa, and Porifera is essential for studying how complex traits such as neurons, muscles, and gastrulation have evolved. Recent studies have cast doubt on the historical viewpoint that sponges are sister to all other animal lineages with recent studies recovering ctenophores as sister. However, the ctenophore-sister hypothesis has been criticized as unrealistic and caused by systematic error. We review past phylogenomic studies and potential causes of systematic error in an effort to identify areas that can be improved in future studies. Increased sampling of taxa, less missing data, and a priori removal of sequences and taxa that may cause systematic error in phylogenomic inference will likely be the most fruitful areas of focus when assembling future datasets. Ultimately, we foresee metazoan relationships being resolved with higher support in the near future, and we caution against dismissing novel hypotheses merely because they conflict with historical viewpoints of animal evolution.
Affiliation(s)
- Nathan V Whelan
- Department of Biological Sciences, Molette Biology Laboratory for Environmental and Climate Change Studies, Auburn University, 101 Life Sciences Building, Auburn, AL 36849, USA
- Kevin M Kocot
- School of Biological Sciences, The University of Queensland, 325 Goddard Building, St Lucia, QLD 4101, Australia
- Kenneth M Halanych
- Department of Biological Sciences, Molette Biology Laboratory for Environmental and Climate Change Studies, Auburn University, 101 Life Sciences Building, Auburn, AL 36849, USA
|