1
Odriozola I, Rasmussen JA, Gilbert MTP, Limborg MT, Alberdi A. A practical introduction to holo-omics. Cell Reports Methods 2024; 4:100820. [PMID: 38986611 PMCID: PMC11294832 DOI: 10.1016/j.crmeth.2024.100820] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/14/2023] [Revised: 04/17/2024] [Accepted: 06/20/2024] [Indexed: 07/12/2024]
Abstract
Holo-omics refers to the joint study of non-targeted molecular data layers from host-microbiota systems or holobionts, which is increasingly employed to disentangle the complex interactions between the elements that compose them. We navigate through the generation, analysis, and integration of omics data, focusing on the commonalities and main differences to generate and analyze the various types of omics, with a special focus on optimizing data generation and integration. We advocate for careful generation and distillation of data, followed by independent exploration and analyses of the single omic layers to obtain a better understanding of the study system, before the integration of multiple omic layers in a final model is attempted. We highlight critical decision points to achieve this aim and flag the main challenges to address complex biological questions regarding the integrative study of host-microbiota relationships.
Affiliation(s)
- Iñaki Odriozola
- Center for Evolutionary Hologenomics, Globe Institute, University of Copenhagen, Copenhagen, Denmark
| | - Jacob A Rasmussen
- Center for Evolutionary Hologenomics, Globe Institute, University of Copenhagen, Copenhagen, Denmark
| | - M Thomas P Gilbert
- Center for Evolutionary Hologenomics, Globe Institute, University of Copenhagen, Copenhagen, Denmark; University Museum, NTNU, Trondheim, Norway
| | - Morten T Limborg
- Center for Evolutionary Hologenomics, Globe Institute, University of Copenhagen, Copenhagen, Denmark
| | - Antton Alberdi
- Center for Evolutionary Hologenomics, Globe Institute, University of Copenhagen, Copenhagen, Denmark.
2
Curion F, Theis FJ. Machine learning integrative approaches to advance computational immunology. Genome Med 2024; 16:80. [PMID: 38862979 PMCID: PMC11165829 DOI: 10.1186/s13073-024-01350-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/29/2023] [Accepted: 05/23/2024] [Indexed: 06/13/2024] Open
Abstract
The study of immunology, traditionally reliant on proteomics to evaluate individual immune cells, has been revolutionized by single-cell RNA sequencing. Computational immunologists play a crucial role in analysing these datasets, moving beyond traditional protein marker identification to encompass a more detailed view of cellular phenotypes and their functional roles. Recent technological advancements allow the simultaneous measurements of multiple cellular components-transcriptome, proteome, chromatin, epigenetic modifications and metabolites-within single cells, including in spatial contexts within tissues. This has led to the generation of complex multiscale datasets that can include multimodal measurements from the same cells or a mix of paired and unpaired modalities. Modern machine learning (ML) techniques allow for the integration of multiple "omics" data without the need for extensive independent modelling of each modality. This review focuses on recent advancements in ML integrative approaches applied to immunological studies. We highlight the importance of these methods in creating a unified representation of multiscale data collections, particularly for single-cell and spatial profiling technologies. Finally, we discuss the challenges of these holistic approaches and how they will be instrumental in the development of a common coordinate framework for multiscale studies, thereby accelerating research and enabling discoveries in the computational immunology field.
Affiliation(s)
- Fabiola Curion
- Institute of Computational Biology, Helmholtz Center Munich, Munich, Germany
- Department of Mathematics, School of Computation, Information and Technology, Technical University of Munich, Munich, Germany
| | - Fabian J Theis
- Institute of Computational Biology, Helmholtz Center Munich, Munich, Germany.
- Department of Mathematics, School of Computation, Information and Technology, Technical University of Munich, Munich, Germany.
- School of Life Sciences Weihenstephan, Technical University of Munich, Munich, Germany.
3
Yang H, Zhu D, He S, Xu Z, Liu Z, Zhang W, Cai J. Enhancing psychiatric rehabilitation outcomes through a multimodal multitask learning model based on BERT and TabNet: An approach for personalized treatment and improved decision-making. Psychiatry Res 2024; 336:115896. [PMID: 38626625 DOI: 10.1016/j.psychres.2024.115896] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/26/2023] [Revised: 04/03/2024] [Accepted: 04/05/2024] [Indexed: 04/18/2024]
Abstract
Evaluating the rehabilitation status of individuals with serious mental illnesses (SMI) necessitates a comprehensive analysis of multimodal data, including unstructured text records and structured diagnostic data. However, progress in the effective assessment of rehabilitation status remains limited. Our study develops a deep learning model integrating Bidirectional Encoder Representations from Transformers (BERT) and TabNet through a late fusion strategy to enhance rehabilitation prediction, including referral risk, dangerous behaviors, self-awareness, and medication adherence, in patients with SMI. BERT processes unstructured textual data, such as doctor's notes, whereas TabNet manages structured diagnostic information. The model's interpretability function serves to assist healthcare professionals in understanding the model's predictive decisions, improving patient care. Our model exhibited excellent predictive performance for all four tasks, with an accuracy exceeding 0.78 and an area under the curve of 0.70. In addition, a series of tests proved the model's robustness, fairness, and interpretability. This study combines multimodal and multitask learning strategies into a model and applies it to rehabilitation assessment tasks, offering a promising new tool that can be seamlessly integrated with the clinical workflow to support the provision of optimized patient care.
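The late fusion idea mentioned in this abstract can be illustrated with a minimal sketch. It assumes that each modality-specific model (a text model and a tabular model) has already produced a probability per task, and that fusion is a simple weighted average of those outputs. The function name `late_fusion`, the task keys, and the weight are hypothetical; the abstract does not state how the authors combine the BERT and TabNet outputs, so this may differ from their model.

```python
def late_fusion(text_probs, tabular_probs, text_weight=0.5):
    """Combine per-task probabilities from two modality-specific models.

    Assumption: fusion is a weighted average of the two models' outputs.
    text_probs and tabular_probs map task name -> probability in [0, 1].
    """
    if not 0.0 <= text_weight <= 1.0:
        raise ValueError("text_weight must lie in [0, 1]")
    if text_probs.keys() != tabular_probs.keys():
        raise ValueError("both models must score the same tasks")
    tabular_weight = 1.0 - text_weight
    return {
        task: text_weight * text_probs[task] + tabular_weight * tabular_probs[task]
        for task in text_probs
    }


# Hypothetical scores for two of the four tasks named in the abstract.
fused = late_fusion(
    {"referral_risk": 0.8, "medication_adherence": 0.3},
    {"referral_risk": 0.4, "medication_adherence": 0.5},
)
```

Because each model is trained and scored separately, either modality can be swapped or reweighted without retraining the other, which is the usual reason for choosing late over early fusion.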
Affiliation(s)
- Hongyi Yang
- School of Design, Shanghai Jiao Tong University, Shanghai, China
| | - Dian Zhu
- School of Design, Shanghai Jiao Tong University, Shanghai, China
| | - Siyuan He
- Shanghai Mental Health Center, Shanghai Jiao Tong University School of Medicine, Shanghai, China
| | - Zhiqi Xu
- School of Design, Shanghai Jiao Tong University, Shanghai, China
| | - Zhao Liu
- School of Design, Shanghai Jiao Tong University, Shanghai, China.
| | - Weibo Zhang
- Shanghai Mental Health Center, Shanghai Jiao Tong University School of Medicine, Shanghai, China; Shanghai Institute of Infectious Disease and Biosecurity, Fudan University, Shanghai, China; Mental Health Branch, China Hospital Development Institute, Shanghai Jiao Tong University, Shanghai, China.
| | - Jun Cai
- Shanghai Mental Health Center, Shanghai Jiao Tong University School of Medicine, Shanghai, China; Mental Health Branch, China Hospital Development Institute, Shanghai Jiao Tong University, Shanghai, China.
4
Rey N, Ebrahimian T, Gloaguen C, Kereselidze D, Christelle E, Brizais C, Bachelot F, Riazi G, Monceau V, Demarquay C, Zineddine IG, Klokov D, Lehoux S, Ebrahimian TG. Low to moderate dose 137Cs (γ) radiation promotes M2 type macrophage skewing and reduces atherosclerotic plaque CD68+ cell content in ApoE (-/-) mice. Sci Rep 2024; 14:12450. [PMID: 38816571 PMCID: PMC11139881 DOI: 10.1038/s41598-024-63084-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/12/2024] [Accepted: 05/24/2024] [Indexed: 06/01/2024] Open
Abstract
The effects of low doses of ionizing radiation on atherosclerosis remain uncertain, particularly as regards the generation of pro- or anti-inflammatory responses, and the time scale at which such effects can occur following irradiation. To explore these phenomena, we exposed atheroprone ApoE(-/-) mice to a single dose of 0, 0.05, 0.5 or 1 Gy of 137Cs (γ) administered at a 10.35 mGy min⁻¹ dose rate and evaluated short-term (1-10 days) and long-term consequences (100 days). Bone marrow-derived macrophages were derived from mice 1 day after exposure. Irradiation was associated with a significant skewing of M0 and M2 polarized macrophages towards the M2 phenotype, as demonstrated by an increased mRNA expression of Retnla, Arg1, and Chil3 in cells from mice exposed to 0.5 or 1 Gy compared with non-irradiated animals. Minimal effects were noted in M1 cells or M1 marker mRNA. Concurrently, we observed a reduced secretion of IL-1β but enhanced IL-10 release from M0 and M2 macrophages. Effects of irradiation on circulating monocytes were most marked at day 10 post-exposure, when the 1 Gy dose was associated with enhanced numbers of both Ly6CHigh and Ly6CLow cells. By day 100, levels of circulating monocytes in irradiated and non-irradiated mice were equivalent, but anti-inflammatory Ly6CLow monocytes were significantly increased in the spleen of mice exposed to 0.05 or 1 Gy. Long term exposures did not affect atherosclerotic plaque size or lipid content, as determined by Oil red O staining, whatever the dose applied. Similarly, irradiation did not affect atherosclerotic plaque collagen or smooth muscle cell content. However, we found that lesion CD68+ cell content tended to decrease with rising doses of radioactivity exposure, culminating in a significant reduction of plaque macrophage content at 1 Gy.
Taken together, our results show that short- and long-term exposures to low to moderate doses of ionizing radiation drive an anti-inflammatory response, skewing bone marrow-derived macrophages towards an IL-10-secreting M2 phenotype and decreasing plaque macrophage content. These results suggest a low-grade athero-protective effect of low and moderate doses of ionizing radiation.
Affiliation(s)
- N Rey
- Institut de Radioprotection et de Sûreté Nucléaire, Laboratoire de Radiotoxicologie et de Radiobiologie Expérimentale, 92262, Fontenay-Aux Roses, France
| | - T Ebrahimian
- Department of Medicine, Lady Davis Institute for Biomedical Research, McGill University, Montreal, Canada
| | - C Gloaguen
- Institut de Radioprotection et de Sûreté Nucléaire, Laboratoire de Radiotoxicologie et de Radiobiologie Expérimentale, 92262, Fontenay-Aux Roses, France
| | - D Kereselidze
- Institut de Radioprotection et de Sûreté Nucléaire, Laboratoire de Radiotoxicologie et de Radiobiologie Expérimentale, 92262, Fontenay-Aux Roses, France
| | - E Christelle
- Institut de Radioprotection et de Sûreté Nucléaire, Laboratoire de Radiotoxicologie et de Radiobiologie Expérimentale, 92262, Fontenay-Aux Roses, France
| | - C Brizais
- Institut de Radioprotection et de Sûreté Nucléaire, Laboratoire de Radiotoxicologie et de Radiobiologie Expérimentale, 92262, Fontenay-Aux Roses, France
| | - F Bachelot
- Institut de Radioprotection et de Sûreté Nucléaire, Laboratoire de Radiotoxicologie et de Radiobiologie Expérimentale, 92262, Fontenay-Aux Roses, France
| | - G Riazi
- Institut de Radioprotection et de Sûreté Nucléaire, Laboratoire de Radiotoxicologie et de Radiobiologie Expérimentale, 92262, Fontenay-Aux Roses, France
| | - V Monceau
- Institut de Radioprotection et de Sûreté Nucléaire, Laboratoire de Radiotoxicologie et de Radiobiologie Expérimentale, 92262, Fontenay-Aux Roses, France
| | - C Demarquay
- Institut de Radioprotection et de Sûreté Nucléaire, Laboratoire de Radiotoxicologie et de Radiobiologie Expérimentale, 92262, Fontenay-Aux Roses, France
| | - I Garali Zineddine
- Institut de Radioprotection et de Sûreté Nucléaire, Laboratoire de Radiotoxicologie et de Radiobiologie Expérimentale, 92262, Fontenay-Aux Roses, France
| | - D Klokov
- Institut de Radioprotection et de Sûreté Nucléaire, Laboratoire de Radiotoxicologie et de Radiobiologie Expérimentale, 92262, Fontenay-Aux Roses, France
| | - S Lehoux
- Department of Medicine, Lady Davis Institute for Biomedical Research, McGill University, Montreal, Canada.
| | - Teni G Ebrahimian
- Institut de Radioprotection et de Sûreté Nucléaire, Laboratoire de Radiotoxicologie et de Radiobiologie Expérimentale, 92262, Fontenay-Aux Roses, France.
5
d'Humières C, Delavy M, Alla L, Ichou F, Gauliard E, Ghozlane A, Levenez F, Galleron N, Quinquis B, Pons N, Mullaert J, Bridier-Nahmias A, Condamine B, Touchon M, Rainteau D, Lamazière A, Lesnik P, Ponnaiah M, Lhomme M, Sertour N, Devente S, Docquier JD, Bougnoux ME, Tenaillon O, Magnan M, Ruppé E, Grall N, Duval X, Ehrlich D, Mentré F, Denamur E, Rocha EPC, Le Chatelier E, Burdet C. Perturbation and resilience of the gut microbiome up to 3 months after β-lactams exposure in healthy volunteers suggest an important role of microbial β-lactamases. Microbiome 2024; 12:50. [PMID: 38468305 DOI: 10.1186/s40168-023-01746-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/14/2023] [Accepted: 12/20/2023] [Indexed: 03/13/2024]
Abstract
BACKGROUND: Antibiotics notoriously perturb the gut microbiota. We treated healthy volunteers either with cefotaxime or ceftriaxone for 3 days, and collected in each subject 12 faecal samples up to day 90. Using untargeted and targeted phenotypic and genotypic approaches, we studied the changes in the bacterial, phage and fungal components of the microbiota as well as the metabolome and the β-lactamase activity of the stools. This allowed assessing their degrees of perturbation and resilience. RESULTS: While only two subjects had detectable concentrations of antibiotics in their faeces, suggesting important antibiotic degradation in the gut, the intravenous treatment perturbed very significantly the bacterial and phage microbiota, as well as the composition of the metabolome. In contrast, treatment impact was relatively low on the fungal microbiota. At the end of the surveillance period, we found evidence of resilience across the gut system since most components returned to a state like the initial one, even if the structure of the bacterial microbiota changed and the dynamics of the different components over time were rarely correlated. The observed richness of the antibiotic resistance genes repertoire was significantly reduced up to day 30, while a significant increase in the relative abundance of β-lactamase encoding genes was observed up to day 10, consistent with a concomitant increase in the β-lactamase activity of the microbiota. The level of β-lactamase activity at baseline was positively associated with the resilience of the metabolome content of the stools. CONCLUSIONS: In healthy adults, antibiotics perturb many components of the microbiota, which return close to the baseline state within 30 days. These data suggest an important role of endogenous β-lactamase-producing anaerobes in protecting the functions of the microbiota by de-activating the antibiotics reaching the colon.
Affiliation(s)
- Camille d'Humières
- Université Paris Cité, IAME, INSERM, Paris, F-75018, France
- Institut Pasteur, Université Paris Cité, CNRS UMR3525, Microbial Evolutionary Genomics, Paris, 75015, France
| | - Margot Delavy
- Institut Pasteur, Université Paris Cité, INRAE, USC2019, Unité Biologie Et Pathogénicité Fongiques, Paris, F-75015, France
| | - Laurie Alla
- Université Paris-Saclay, INRAE, MetaGenoPolis, Jouy-en-Josas, F-78350, France
| | - Farid Ichou
- ICANomics, Foundation of Innovation in Cardiometabolism and Nutrition (IHU ICAN), Paris, F-75013, France
| | - Emilie Gauliard
- Sorbonne Université, INSERM U938, Centre de Recherche Saint-Antoine, Paris, F-75012, France
| | - Amine Ghozlane
- Institut Pasteur, Université Paris Cité, Bioinformatics and Biostatistics Hub, Paris, F-75015, France
| | - Florence Levenez
- Université Paris-Saclay, INRAE, MetaGenoPolis, Jouy-en-Josas, F-78350, France
| | - Nathalie Galleron
- Université Paris-Saclay, INRAE, MetaGenoPolis, Jouy-en-Josas, F-78350, France
| | - Benoit Quinquis
- Université Paris-Saclay, INRAE, MetaGenoPolis, Jouy-en-Josas, F-78350, France
| | - Nicolas Pons
- Université Paris-Saclay, INRAE, MetaGenoPolis, Jouy-en-Josas, F-78350, France
| | - Jimmy Mullaert
- Université Paris Cité, IAME, INSERM, Paris, F-75018, France
- AP-HP, Département d'Epidemiologie, Biostatistique and Recherche Clinique, Hôpital Bichat, Paris, F-75018, France
| | | | | | - Marie Touchon
- Institut Pasteur, Université Paris Cité, CNRS UMR3525, Microbial Evolutionary Genomics, Paris, 75015, France
| | - Dominique Rainteau
- Sorbonne Université, INSERM U938, Centre de Recherche Saint-Antoine, Paris, F-75012, France
| | - Antonin Lamazière
- Sorbonne Université, INSERM U938, Centre de Recherche Saint-Antoine, Paris, F-75012, France
| | - Philippe Lesnik
- INSERM UMR-S 1166, Institute of Cardiometabolism and Nutrition, Sorbonne Université, Hôpital Pitié-Salpêtrière, Paris, F-75013, France
- ICANomics, Foundation of Innovation in Cardiometabolism and Nutrition (IHU ICAN), Paris, F-75013, France
| | - Maharajah Ponnaiah
- ICANomics, Foundation of Innovation in Cardiometabolism and Nutrition (IHU ICAN), Paris, F-75013, France
| | - Marie Lhomme
- ICANomics, Foundation of Innovation in Cardiometabolism and Nutrition (IHU ICAN), Paris, F-75013, France
| | - Natacha Sertour
- Institut Pasteur, Université Paris Cité, INRAE, USC2019, Unité Biologie Et Pathogénicité Fongiques, Paris, F-75015, France
| | - Savannah Devente
- Dipartimento di Biotecnologie Mediche, Università di Siena, Siena, I-53100, Italy
| | - Jean-Denis Docquier
- Dipartimento di Biotecnologie Mediche, Università di Siena, Siena, I-53100, Italy
| | - Marie-Elisabeth Bougnoux
- Institut Pasteur, Université Paris Cité, INRAE, USC2019, Unité Biologie Et Pathogénicité Fongiques, Paris, F-75015, France
- AP-HP, Unité de Parasitologie-Mycologie, Service de Microbiologie Clinique, Hôpital Necker-Enfants-Malades, Paris, F-75015, France
| | | | - Mélanie Magnan
- Université Paris Cité, IAME, INSERM, Paris, F-75018, France
| | - Etienne Ruppé
- Université Paris Cité, IAME, INSERM, Paris, F-75018, France
- AP-HP, Laboratoire de Bactériologie, Hôpital Bichat, Paris, F-75018, France
| | - Nathalie Grall
- Université Paris Cité, IAME, INSERM, Paris, F-75018, France
- AP-HP, Laboratoire de Bactériologie, Hôpital Bichat, Paris, F-75018, France
| | - Xavier Duval
- Université Paris Cité, IAME, INSERM, Paris, F-75018, France
- AP-HP, Centre d'Investigation Clinique, INSERM CIC 1425, Hôpital Bichat, Paris, F-75018, France
| | - Dusko Ehrlich
- Université Paris-Saclay, INRAE, MetaGenoPolis, Jouy-en-Josas, F-78350, France
- University College London, Institute for Neurology, London, UK
| | - France Mentré
- Université Paris Cité, IAME, INSERM, Paris, F-75018, France
- AP-HP, Département d'Epidemiologie, Biostatistique and Recherche Clinique, Hôpital Bichat, Paris, F-75018, France
| | - Erick Denamur
- Université Paris Cité, IAME, INSERM, Paris, F-75018, France
- AP-HP, Laboratoire de Génétique Moléculaire, Hôpital Bichat, Paris, F-75018, France
| | - Eduardo P C Rocha
- Institut Pasteur, Université Paris Cité, CNRS UMR3525, Microbial Evolutionary Genomics, Paris, 75015, France
| | | | - Charles Burdet
- Université Paris Cité, IAME, INSERM, Paris, F-75018, France.
- AP-HP, Département d'Epidemiologie, Biostatistique and Recherche Clinique, Hôpital Bichat, Paris, F-75018, France.
6
Yildirim-Balatan C, Fenyi A, Besnault P, Gomez L, Sepulveda-Diaz JE, Michel PP, Melki R, Hunot S. Parkinson's disease-derived α-synuclein assemblies combined with chronic-type inflammatory cues promote a neurotoxic microglial phenotype. J Neuroinflammation 2024; 21:54. [PMID: 38383421 PMCID: PMC10882738 DOI: 10.1186/s12974-024-03043-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/08/2023] [Accepted: 02/12/2024] [Indexed: 02/23/2024] Open
Abstract
Parkinson's disease (PD) is a common age-related neurodegenerative disorder characterized by the aggregation of α-Synuclein (αSYN) building up intraneuronal inclusions termed Lewy pathology. Mounting evidence suggests that neuron-released αSYN aggregates could be central to microglial activation, which in turn mounts and orchestrates neuroinflammatory processes potentially harmful to neurons. Therefore, understanding the mechanisms that drive microglial cell activation, polarization and function in PD might have important therapeutic implications. Here, using primary microglia, we investigated the inflammatory potential of pure αSYN fibrils derived from PD patients. We further explored and characterized microglial cell responses to a chronic-type inflammatory stimulation combining PD patient-derived αSYN fibrils (FPD), Tumor necrosis factor-α (TNFα) and prostaglandin E2 (PGE2) (TPFPD). We showed that FPD hold stronger inflammatory potency than pure αSYN fibrils generated de novo. When combined with TNFα and PGE2, FPD polarizes microglia toward a particular functional phenotype departing from FPD-treated cells and featuring lower inflammatory cytokine and higher glutamate release. Whereas metabolomic studies showed that TPFPD-exposed microglia were closely related to classically activated M1 proinflammatory cells, notably with similar tricarboxylic acid cycle disruption, transcriptomic analysis revealed that TPFPD-activated microglia assume a unique molecular signature highlighting upregulation of genes involved in glutathione and iron metabolisms. In particular, TPFPD-specific upregulation of Slc7a11 (which encodes the cystine-glutamate antiporter xCT) was consistent with the increased glutamate response and cytotoxic activity of these cells toward midbrain dopaminergic neurons in vitro. 
Together, these data further extend the structure-pathological relationship of αSYN fibrillar polymorphs to their innate immune properties and demonstrate that PD-derived αSYN fibrils, TNFα and PGE2 act in concert to drive microglial cell activation toward a specific and highly neurotoxic chronic-type inflammatory phenotype characterized by robust glutamate release and iron retention.
Affiliation(s)
- Cansu Yildirim-Balatan
- Sorbonne Université, Paris, France
- Institut du Cerveau - Paris Brain Institute - ICM, Hôpital de la Pitié-Salpêtrière, 91 Bd de l'Hôpital, 75013, Paris, France
- Inserm UMRS 1127, Paris, France
- CNRS UMR 7225, Paris, France
| | - Alexis Fenyi
- CEA and Laboratory of Neurodegenerative Diseases, CNRS, Institut François Jacob, MIRCen, 92265, Fontenay-aux-Roses, France
| | - Pierre Besnault
- Sorbonne Université, Paris, France
- Institut du Cerveau - Paris Brain Institute - ICM, Hôpital de la Pitié-Salpêtrière, 91 Bd de l'Hôpital, 75013, Paris, France
- Inserm UMRS 1127, Paris, France
- CNRS UMR 7225, Paris, France
| | - Lina Gomez
- Sorbonne Université, Paris, France
- Institut du Cerveau - Paris Brain Institute - ICM, Hôpital de la Pitié-Salpêtrière, 91 Bd de l'Hôpital, 75013, Paris, France
- Inserm UMRS 1127, Paris, France
- CNRS UMR 7225, Paris, France
| | - Julia E Sepulveda-Diaz
- Sorbonne Université, Paris, France
- Institut du Cerveau - Paris Brain Institute - ICM, Hôpital de la Pitié-Salpêtrière, 91 Bd de l'Hôpital, 75013, Paris, France
- Inserm UMRS 1127, Paris, France
- CNRS UMR 7225, Paris, France
| | - Patrick P Michel
- Sorbonne Université, Paris, France
- Institut du Cerveau - Paris Brain Institute - ICM, Hôpital de la Pitié-Salpêtrière, 91 Bd de l'Hôpital, 75013, Paris, France
- Inserm UMRS 1127, Paris, France
- CNRS UMR 7225, Paris, France
| | - Ronald Melki
- CEA and Laboratory of Neurodegenerative Diseases, CNRS, Institut François Jacob, MIRCen, 92265, Fontenay-aux-Roses, France
| | - Stéphane Hunot
- Sorbonne Université, Paris, France.
- Institut du Cerveau - Paris Brain Institute - ICM, Hôpital de la Pitié-Salpêtrière, 91 Bd de l'Hôpital, 75013, Paris, France.
- Inserm UMRS 1127, Paris, France.
- CNRS UMR 7225, Paris, France.
7
Chen Y, Zheng R, Liu J, Li M. scMLC: an accurate and robust multiplex community detection method for single-cell multi-omics data. Brief Bioinform 2024; 25:bbae101. [PMID: 38493339 PMCID: PMC10944569 DOI: 10.1093/bib/bbae101] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/05/2023] [Revised: 01/03/2024] [Accepted: 02/15/2024] [Indexed: 03/18/2024] Open
Abstract
Clustering cells based on single-cell multi-modal sequencing technologies provides an unprecedented opportunity to create high-resolution cell atlases, reveal critical cellular states and study health and disease. However, effectively integrating different sequencing data for cell clustering remains a challenging task. Motivated by the successful application of Louvain in scRNA-seq data, we propose a single-cell multi-modal Louvain clustering framework, called scMLC, to tackle this problem. scMLC builds multiplex single- and cross-modal cell-to-cell networks to capture modal-specific and consistent information between modalities and then adopts a robust multiplex community detection method to obtain reliable cell clusters. In comparison with 15 state-of-the-art clustering methods on seven real datasets simultaneously measuring gene expression and chromatin accessibility, scMLC achieves better accuracy and stability in most datasets. Synthetic results also indicate that the cell-network-based integration strategy of multi-omics data is superior to other strategies in terms of generalization. Moreover, scMLC is flexible and can be extended to single-cell sequencing data with more than two modalities.
Affiliation(s)
- Yuxuan Chen
- School of Computer Science and Engineering, Central South University, Changsha 410083, China
| | - Ruiqing Zheng
- School of Computer Science and Engineering, Central South University, Changsha 410083, China
| | - Jin Liu
- School of Computer Science and Engineering, Central South University, Changsha 410083, China
| | - Min Li
- School of Computer Science and Engineering, Central South University, Changsha 410083, China
8
Álvarez-Cuesta JA, Mora-Batista C, Reyes-Carreto R, Carrillo-Rodes FJ, Fitz SJT, González-Zaldivar Y, Vargas-De-León C. On the Cut-Off Value of the Anteroposterior Diameter of the Midbrain Atrophy in Spinocerebellar Ataxia Type 2 Patients. Brain Sci 2024; 14:53. [PMID: 38248268 PMCID: PMC10813098 DOI: 10.3390/brainsci14010053] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/22/2023] [Revised: 12/23/2023] [Accepted: 01/02/2024] [Indexed: 01/23/2024] Open
Abstract
(1) Background: Spinocerebellar ataxias (SCA) is a term that refers to a group of hereditary ataxias, which are neurological diseases characterized by degeneration of the cells that constitute the cerebellum. Studies suggest that magnetic resonance imaging (MRI) supports diagnoses of ataxias, and linear measurements of the anteroposterior diameter of the midbrain (ADM) have been investigated using MRI. These measurements correspond to studies in spinocerebellar ataxia type 2 (SCA2) patients and in healthy subjects. Our goal was to obtain the cut-off value for ADM atrophy in SCA2 patients. (2) Methods: This study evaluated 99 participants (66 SCA2 patients and 33 healthy controls). The sample was divided into estimation (80%) and validation (20%) samples. Using the estimation sample, we fitted a logistic model using the ADM and obtained the cut-off value through the inverse of regression. (3) Results: The optimal cut-off value of ADM was found to be 18.21 mm. The area under the curve (AUC) of the atrophy risk score was 0.957 (95% CI: 0.895-0.991). Using this cut-off on the validation sample, we found a sensitivity of 100.00% (95% CI: 76.84%-100.00%) and a specificity of 85.71% (95% CI: 42.13%-99.64%). (4) Conclusions: We obtained a cut-off value that has an excellent discriminatory capacity to identify SCA2 patients.
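The inverse-regression step described in the Methods can be sketched as follows. For a one-predictor logistic model p = 1 / (1 + exp(-(b0 + b1·x))), the value of x at a chosen probability p is x = (logit(p) - b0) / b1. The coefficients, the 0.5 probability threshold, and the toy measurements below are illustrative assumptions only; the abstract reports neither the fitted coefficients nor the threshold used, and the rule "smaller diameter means atrophy" is assumed here.

```python
import math


def logistic_cutoff(intercept, slope, prob=0.5):
    """Invert a one-predictor logistic model to get the predictor value
    at which the predicted probability equals prob."""
    if slope == 0:
        raise ValueError("slope must be non-zero")
    if not 0.0 < prob < 1.0:
        raise ValueError("prob must lie strictly between 0 and 1")
    logit = math.log(prob / (1.0 - prob))
    return (logit - intercept) / slope


def sensitivity_specificity(values, labels, cutoff):
    """Score the rule 'value below cutoff => positive'.

    labels: 1 = patient, 0 = control (assumed coding).
    """
    pairs = list(zip(values, labels))
    tp = sum(1 for v, y in pairs if y == 1 and v < cutoff)
    fn = sum(1 for v, y in pairs if y == 1 and v >= cutoff)
    tn = sum(1 for v, y in pairs if y == 0 and v >= cutoff)
    fp = sum(1 for v, y in pairs if y == 0 and v < cutoff)
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical coefficients: a negative slope means risk rises as the
# diameter shrinks.
cutoff = logistic_cutoff(intercept=36.0, slope=-2.0)

# Toy validation data in mm, not taken from the study.
sens, spec = sensitivity_specificity(
    [15.0, 19.0, 17.0, 20.0], [1, 1, 0, 0], cutoff
)
```

Choosing a probability other than 0.5 shifts the cut-off along the same fitted curve, so the trade-off between sensitivity and specificity can be explored without refitting the model.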
Affiliation(s)
- José Alberto Álvarez-Cuesta
- Centro de Investigación y Rehabilitación de las Ataxias Hereditarias, VPWP+RM5, Holguín 80100, Cuba; (J.A.Á.-C.); (F.J.C.-R.); (Y.G.-Z.)
| | - Camilo Mora-Batista
- Facultad de Matemáticas, Universidad Autónoma de Guerrero, Chilpancingo de los Bravo 39087, Mexico;
| | - Ramón Reyes-Carreto
- Facultad de Matemáticas, Universidad Autónoma de Guerrero, Chilpancingo de los Bravo 39087, Mexico;
| | - Frank Jesus Carrillo-Rodes
- Centro de Investigación y Rehabilitación de las Ataxias Hereditarias, VPWP+RM5, Holguín 80100, Cuba; (J.A.Á.-C.); (F.J.C.-R.); (Y.G.-Z.)
| | | | - Yanetza González-Zaldivar
- Centro de Investigación y Rehabilitación de las Ataxias Hereditarias, VPWP+RM5, Holguín 80100, Cuba; (J.A.Á.-C.); (F.J.C.-R.); (Y.G.-Z.)
| | - Cruz Vargas-De-León
- División de Investigación, Hospital Juárez de México, Ciudad de México 07760, Mexico
- Laboratorio de Modelación Bioestadística para la Salud, Sección de Estudios de Posgrado e Investigación, Escuela Superior de Medicina, Instituto Politécnico Nacional, Ciudad de México 11340, Mexico
9
Urbain F, Ponnaiah M, Ichou F, Lhomme M, Materne C, Galier S, Haroche J, Frisdal E, Mathian A, Durand H, Pha M, Hie M, Kontush A, Cluzel P, Lesnik P, Amoura Z, Guerin M, Cohen Aubart F, Le Goff W. Impaired metabolism predicts coronary artery calcification in women with systemic lupus erythematosus. EBioMedicine 2023; 96:104802. [PMID: 37725854 PMCID: PMC10518349 DOI: 10.1016/j.ebiom.2023.104802] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 04/04/2023] [Revised: 08/23/2023] [Accepted: 09/03/2023] [Indexed: 09/21/2023] Open
Abstract
BACKGROUND: Patients with systemic lupus erythematosus (SLE) exhibit a high risk for cardiovascular diseases (CVD) which is not fully explained by the classical Framingham risk factors. SLE is characterized by major metabolic alterations which can contribute to the elevated prevalence of CVD. METHODS: A comprehensive analysis of the circulating metabolome and lipidome was conducted in a large cohort of 211 women with SLE who underwent a multi-detector computed tomography scan for quantification of coronary artery calcium (CAC), a robust predictor of coronary heart disease (CHD). FINDINGS: Beyond traditional risk factors, including age and hypertension, disease activity and duration were independent risk factors for developing CAC in women with SLE. The presence of coronary calcium was associated with major alterations of circulating lipidome dominated by an elevated abundance of ceramides with very long chain fatty acids. Alterations in multiple metabolic pathways, including purine, arginine and proline metabolism, and microbiota-derived metabolites, were also associated with CAC in women with SLE. Logistic regression with bootstrapping of lipidomic and metabolomic variables was used to develop prognostic scores. Strikingly, combining metabolic and lipidomic variables with clinical and biological parameters markedly improved the prediction (area under the curve: 0.887, p < 0.001) of the presence of coronary calcium in women with SLE. INTERPRETATION: The present study uncovers the contribution of disturbed metabolism to the presence of coronary artery calcium and the associated risk of CHD in SLE. Identification of novel lipid and metabolite biomarkers may help stratify patients for reducing CVD morbidity and mortality in SLE. FUNDING: INSERM and Sorbonne Université.
Affiliation(s)
- Fanny Urbain
  - Sorbonne Université, Assistance Publique-Hôpitaux de Paris (AP-HP), Groupement Hospitalier Pitié-Salpêtrière, Centre de Référence pour le Lupus, le Syndrome des Anti-phospholipides et Autres Maladies Auto-immunes Rares, Service de Médecine Interne 2, Paris, France
- Maharajah Ponnaiah
  - Foundation for Innovation in Cardiometabolism and Nutrition (IHU ICAN), ICAN I/O Data Science (MPo), ICAN Omics (FI and ML), 75013, Paris, France
- Farid Ichou
  - Foundation for Innovation in Cardiometabolism and Nutrition (IHU ICAN), ICAN I/O Data Science (MPo), ICAN Omics (FI and ML), 75013, Paris, France
- Marie Lhomme
  - Foundation for Innovation in Cardiometabolism and Nutrition (IHU ICAN), ICAN I/O Data Science (MPo), ICAN Omics (FI and ML), 75013, Paris, France
- Clément Materne
  - Sorbonne Université, INSERM, Foundation for Innovation in Cardiometabolism and Nutrition (IHU ICAN), UMR_S1166, F-75013, Paris, France
- Sophie Galier
  - Sorbonne Université, INSERM, Foundation for Innovation in Cardiometabolism and Nutrition (IHU ICAN), UMR_S1166, F-75013, Paris, France
- Julien Haroche
  - Sorbonne Université, Assistance Publique-Hôpitaux de Paris (AP-HP), Groupement Hospitalier Pitié-Salpêtrière, Centre de Référence pour le Lupus, le Syndrome des Anti-phospholipides et Autres Maladies Auto-immunes Rares, Service de Médecine Interne 2, Paris, France
- Eric Frisdal
  - Sorbonne Université, INSERM, Foundation for Innovation in Cardiometabolism and Nutrition (IHU ICAN), UMR_S1166, F-75013, Paris, France
- Alexis Mathian
  - Sorbonne Université, Assistance Publique-Hôpitaux de Paris (AP-HP), Groupement Hospitalier Pitié-Salpêtrière, Centre de Référence pour le Lupus, le Syndrome des Anti-phospholipides et Autres Maladies Auto-immunes Rares, Service de Médecine Interne 2, Paris, France
- Herve Durand
  - Sorbonne Université, INSERM, Foundation for Innovation in Cardiometabolism and Nutrition (IHU ICAN), UMR_S1166, F-75013, Paris, France
- Micheline Pha
  - Sorbonne Université, Assistance Publique-Hôpitaux de Paris (AP-HP), Groupement Hospitalier Pitié-Salpêtrière, Centre de Référence pour le Lupus, le Syndrome des Anti-phospholipides et Autres Maladies Auto-immunes Rares, Service de Médecine Interne 2, Paris, France
- Miguel Hie
  - Sorbonne Université, Assistance Publique-Hôpitaux de Paris (AP-HP), Groupement Hospitalier Pitié-Salpêtrière, Centre de Référence pour le Lupus, le Syndrome des Anti-phospholipides et Autres Maladies Auto-immunes Rares, Service de Médecine Interne 2, Paris, France
- Anatol Kontush
  - Sorbonne Université, INSERM, Foundation for Innovation in Cardiometabolism and Nutrition (IHU ICAN), UMR_S1166, F-75013, Paris, France
- Philippe Cluzel
  - Cardiovascular and Interventional Radiology Department, Sorbonne Université, Assistance Publique-Hôpitaux de Paris (AP-HP), Hôpital Pitié-Salpêtrière, Paris, F-75013, France
- Philippe Lesnik
  - Sorbonne Université, INSERM, Foundation for Innovation in Cardiometabolism and Nutrition (IHU ICAN), UMR_S1166, F-75013, Paris, France
- Zahir Amoura
  - Sorbonne Université, Assistance Publique-Hôpitaux de Paris (AP-HP), Groupement Hospitalier Pitié-Salpêtrière, Centre de Référence pour le Lupus, le Syndrome des Anti-phospholipides et Autres Maladies Auto-immunes Rares, Service de Médecine Interne 2, Paris, France
  - Sorbonne Université, Inserm, Centre d'Immunologie et des Maladies Infectieuses (CIMI-Paris), 75013, Paris, France
- Maryse Guerin
  - Sorbonne Université, INSERM, Foundation for Innovation in Cardiometabolism and Nutrition (IHU ICAN), UMR_S1166, F-75013, Paris, France
- Fleur Cohen Aubart
  - Sorbonne Université, Assistance Publique-Hôpitaux de Paris (AP-HP), Groupement Hospitalier Pitié-Salpêtrière, Centre de Référence pour le Lupus, le Syndrome des Anti-phospholipides et Autres Maladies Auto-immunes Rares, Service de Médecine Interne 2, Paris, France.
- Wilfried Le Goff
  - Sorbonne Université, INSERM, Foundation for Innovation in Cardiometabolism and Nutrition (IHU ICAN), UMR_S1166, F-75013, Paris, France.
10
Fang Z, Ford AJ, Hu T, Zhang N, Mantalaris A, Coskun AF. Subcellular spatially resolved gene neighborhood networks in single cells. CELL REPORTS METHODS 2023; 3:100476. [PMID: 37323566 PMCID: PMC10261906 DOI: 10.1016/j.crmeth.2023.100476] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 07/30/2022] [Revised: 02/18/2023] [Accepted: 04/18/2023] [Indexed: 06/17/2023]
Abstract
Image-based spatial omics methods such as fluorescence in situ hybridization (FISH) generate molecular profiles of single cells at single-molecule resolution. Current spatial transcriptomics methods focus on the distribution of single genes. However, the spatial proximity of RNA transcripts can play an important role in cellular function. We demonstrate a spatially resolved gene neighborhood network (spaGNN) pipeline for the analysis of subcellular gene proximity relationships. In spaGNN, machine-learning-based clustering of subcellular spatial transcriptomics data yields subcellular density classes of multiplexed transcript features. The nearest-neighbor analysis produces heterogeneous gene proximity maps in distinct subcellular regions. We illustrate the cell-type-distinguishing capability of spaGNN using multiplexed error-robust FISH data of fibroblast and U2-OS cells and sequential FISH data of mesenchymal stem cells (MSCs), revealing tissue-source-specific MSC transcriptomics and spatial distribution characteristics. Overall, the spaGNN approach expands the spatial features that can be used for cell-type classification tasks.
Affiliation(s)
- Zhou Fang
  - Wallace H. Coulter Department of Biomedical Engineering, Georgia Institute of Technology and Emory University, Atlanta, GA, USA
  - Machine Learning Graduate Program, Georgia Institute of Technology, Atlanta, GA, USA
- Adam J. Ford
  - Wallace H. Coulter Department of Biomedical Engineering, Georgia Institute of Technology and Emory University, Atlanta, GA, USA
- Thomas Hu
  - Wallace H. Coulter Department of Biomedical Engineering, Georgia Institute of Technology and Emory University, Atlanta, GA, USA
  - School of Electrical and Computer Engineering, Georgia Institute of Technology, Atlanta, GA, USA
- Nicholas Zhang
  - Wallace H. Coulter Department of Biomedical Engineering, Georgia Institute of Technology and Emory University, Atlanta, GA, USA
  - Interdisciplinary Bioengineering Graduate Program, Georgia Institute of Technology, Atlanta, GA, USA
- Athanasios Mantalaris
  - Wallace H. Coulter Department of Biomedical Engineering, Georgia Institute of Technology and Emory University, Atlanta, GA, USA
- Ahmet F. Coskun
  - Wallace H. Coulter Department of Biomedical Engineering, Georgia Institute of Technology and Emory University, Atlanta, GA, USA
  - Interdisciplinary Bioengineering Graduate Program, Georgia Institute of Technology, Atlanta, GA, USA
  - Parker H. Petit Institute for Bioengineering and Bioscience, Georgia Institute of Technology, Atlanta, GA 30332, USA
11
Bensalma F, Mezghani N, Cagnin A, Fuente A, Lenoir L, Hagemeister N. Multimodal data analysis of knee osteoarthritis assessment: factors selection for conservative care decision making. Comput Methods Biomech Biomed Engin 2023; 26:450-459. [PMID: 35472257 DOI: 10.1080/10255842.2022.2066973] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/03/2022]
Abstract
When assessing a patient with knee osteoarthritis (OA), a number of factors are considered to guide the treatment plan, namely demographic, radiographic, clinical, musculoskeletal, and biomechanical factors. The aim of this study is to identify which of these factors are most strongly related to each other, so that the modifiable factors that may influence treatment outcomes can be better prioritized. We used a multimodal canonical correlation analysis to evaluate associations between these factors. The analysis was performed on 415 OA patients who were not candidates for knee arthroplasty, to identify factors that are associated with the patients' clinical conditions.
Affiliation(s)
- F Bensalma
  - Research Center LICEF institute, TELUQ, Montréal, Canada
  - Laboratoire de recherche en imagerie et orthopédie (LIO), Research Centre of the centre hospitalier de l'université de Montréal (CRCHUM), Montréal, Canada
- N Mezghani
  - Research Center LICEF institute, TELUQ, Montréal, Canada
  - Laboratoire de recherche en imagerie et orthopédie (LIO), Research Centre of the centre hospitalier de l'université de Montréal (CRCHUM), Montréal, Canada
- A Cagnin
  - Laboratoire de recherche en imagerie et orthopédie (LIO), Research Centre of the centre hospitalier de l'université de Montréal (CRCHUM), Montréal, Canada
  - LIO, École de technologie supérieure, Montréal, Canada
- N Hagemeister
  - Laboratoire de recherche en imagerie et orthopédie (LIO), Research Centre of the centre hospitalier de l'université de Montréal (CRCHUM), Montréal, Canada
  - LIO, École de technologie supérieure, Montréal, Canada
12
Ochoa S, Hernández-Lemus E. Functional impact of multi-omic interactions in breast cancer subtypes. Front Genet 2023; 13:1078609. [PMID: 36685900 PMCID: PMC9850112 DOI: 10.3389/fgene.2022.1078609] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 10/24/2022] [Accepted: 12/15/2022] [Indexed: 01/07/2023] Open
Abstract
Multi-omic approaches are expected to deliver a broader molecular view of cancer. However, the promised mechanistic explanations have not quite materialized yet. Here, we propose a theoretical and computational analysis framework to semi-automatically produce network models of the regulatory constraints influencing a biological function. In this way, we identified functions significantly enriched in the analyzed omics and described associated features for each of the four breast cancer molecular subtypes. For instance, we identified functions sustaining over-representation of invasion-related processes in the basal subtype and DNA modification processes in the normal tissue. We found limited overlap in the omics-associated functions between subtypes; however, a startling feature intersection within subtype functions also emerged. The examples presented highlight new, potentially regulatory features, with sound biological reasons to expect a connection with the functions. Multi-omic regulatory networks thus constitute reliable models of the way omics are connected, demonstrating a capability for systematic generation of mechanistic hypotheses.
Affiliation(s)
- Soledad Ochoa
  - Computational Genomics Division, National Institute of Genomic Medicine, Mexico City, Mexico
  - Programa de Doctorado en Ciencias Biomédicas, Universidad Nacional Autónoma de México, Mexico City, Mexico
- Enrique Hernández-Lemus
  - Computational Genomics Division, National Institute of Genomic Medicine, Mexico City, Mexico
  - Center for Complexity Sciences, Universidad Nacional Autónoma de México, Mexico City, Mexico
  - *Correspondence: Enrique Hernández-Lemus,
13
Athieniti E, Spyrou GM. A guide to multi-omics data collection and integration for translational medicine. Comput Struct Biotechnol J 2022; 21:134-149. [PMID: 36544480 PMCID: PMC9747357 DOI: 10.1016/j.csbj.2022.11.050] [Citation(s) in RCA: 21] [Impact Index Per Article: 10.5] [Received: 06/23/2022] [Revised: 11/25/2022] [Accepted: 11/25/2022] [Indexed: 12/02/2022] Open
Abstract
The emerging high-throughput technologies have led to the shift in the design of translational medicine projects towards collecting multi-omics patient samples and, consequently, their integrated analysis. However, the complexity of integrating these datasets has triggered new questions regarding the appropriateness of the available computational methods. Currently, there is no clear consensus on the best combination of omics to include and the data integration methodologies required for their analysis. This article aims to guide the design of multi-omics studies in the field of translational medicine regarding the types of omics and the integration method to choose. We review articles that perform the integration of multiple omics measurements from patient samples. We identify five objectives in translational medicine applications: (i) detect disease-associated molecular patterns, (ii) subtype identification, (iii) diagnosis/prognosis, (iv) drug response prediction, and (v) understand regulatory processes. We describe common trends in the selection of omic types combined for different objectives and diseases. To guide the choice of data integration tools, we group them into the scientific objectives they aim to address. We describe the main computational methods adopted to achieve these objectives and present examples of tools. We compare tools based on how they deal with the computational challenges of data integration and comment on how they perform against predefined objective-specific evaluation criteria. Finally, we discuss examples of tools for downstream analysis and further extraction of novel insights from multi-omics datasets.
Affiliation(s)
- Efi Athieniti
  - Department of Bioinformatics, The Cyprus Institute of Neurology and Genetics, 6 Iroon Avenue, 2371 Ayios Dometios, Nicosia, Cyprus
- George M. Spyrou
  - Department of Bioinformatics, The Cyprus Institute of Neurology and Genetics, 6 Iroon Avenue, 2371 Ayios Dometios, Nicosia, Cyprus
14
Wang Y, Tang S, Ma R, Zamit I, Wei Y, Pan Y. Multi-modal intermediate integrative methods in neuropsychiatric disorders: A review. Comput Struct Biotechnol J 2022; 20:6149-6162. [PMID: 36420153 PMCID: PMC9674886 DOI: 10.1016/j.csbj.2022.11.008] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/06/2022] [Revised: 11/04/2022] [Accepted: 11/04/2022] [Indexed: 11/09/2022] Open
Abstract
The etiology of neuropsychiatric disorders involves complex biological processes at different omics layers, such as genomics, transcriptomics, epigenetics, proteomics, and metabolomics. The advent of high-throughput technology, as well as the availability of large open-source datasets, has ushered in a new era in systems biology, necessitating the integration of various types of omics data. The complexity of biological mechanisms, the limitations of integrative strategies, and the heterogeneity of multi-omics data have all presented significant challenges to computational scientists. In comparison to early and late integration, intermediate integration may transform each data type into appropriate intermediate representations using various data transformation techniques, allowing it to capture more complementary information contained in each omics layer and highlight new interactions across omics layers. Here, we review multi-modal intermediate integrative techniques based on component analysis, matrix factorization, similarity network, multiple kernel learning, Bayesian network, artificial neural networks, and graph transformation, as well as their applications in neuropsychiatric domains. We describe advancements in these approaches and compare the strengths and weaknesses of each method examined. We believe that our findings will aid researchers in their understanding of the transformation and integration of multi-omics data in neuropsychiatric disorders.
Affiliation(s)
- Yanlin Wang
  - Center for High Performance Computing, Joint Engineering Research Center for Health Big Data Intelligent Analysis Technology, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen, Guangdong 518055, China
- Shi Tang
  - Li Chiu Kong Family Sleep Assessment Unit, Department of Psychiatry, Faculty of Medicine, The Chinese University of Hong Kong, Hong Kong Special Administrative Region
- Ruimin Ma
  - Center for High Performance Computing, Joint Engineering Research Center for Health Big Data Intelligent Analysis Technology, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen, Guangdong 518055, China
- Ibrahim Zamit
  - Center for High Performance Computing, Joint Engineering Research Center for Health Big Data Intelligent Analysis Technology, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen, Guangdong 518055, China
  - University of Chinese Academy of Sciences, Beijing, China
- Yanjie Wei
  - Center for High Performance Computing, Joint Engineering Research Center for Health Big Data Intelligent Analysis Technology, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen, Guangdong 518055, China
- Yi Pan
  - Center for High Performance Computing, Joint Engineering Research Center for Health Big Data Intelligent Analysis Technology, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen, Guangdong 518055, China
15
Gliozzo J, Mesiti M, Notaro M, Petrini A, Patak A, Puertas-Gallardo A, Paccanaro A, Valentini G, Casiraghi E. Heterogeneous data integration methods for patient similarity networks. Brief Bioinform 2022; 23:6604996. [PMID: 35679533 PMCID: PMC9294435 DOI: 10.1093/bib/bbac207] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 03/07/2021] [Revised: 04/14/2022] [Accepted: 05/04/2022] [Indexed: 12/29/2022] Open
Abstract
Patient similarity networks (PSNs), where patients are represented as nodes and their similarities as weighted edges, are being increasingly used in clinical research. These networks provide an insightful summary of the relationships among patients and can be exploited by inductive or transductive learning algorithms for the prediction of patient outcome, phenotype and disease risk. PSNs can also be easily visualized, thus offering a natural way to inspect complex heterogeneous patient data and providing some level of explainability of the predictions obtained by machine learning algorithms. The advent of high-throughput technologies, enabling us to acquire high-dimensional views of the same patients (e.g. omics data, laboratory data, imaging data), calls for the development of data fusion techniques for PSNs in order to leverage this rich heterogeneous information. In this article, we review existing methods for integrating multiple biomedical data views to construct PSNs, together with the different patient similarity measures that have been proposed. We also review methods that have appeared in the machine learning literature but have not yet been applied to PSNs, thus providing a resource to navigate the vast machine learning literature existing on this topic. In particular, we focus on methods that could be used to integrate very heterogeneous datasets, including multi-omics data as well as data derived from clinical information and medical imaging.
Affiliation(s)
- Jessica Gliozzo
  - AnacletoLab - Computer Science Department, Università degli Studi di Milano, Via Celoria 18, 20135, Milan, Italy
  - European Commission, Joint Research Centre (JRC), Ispra (VA), Italy
  - CINI, Infolife National Laboratory, Roma, Italy
- Marco Mesiti
  - AnacletoLab - Computer Science Department, Università degli Studi di Milano, Via Celoria 18, 20135, Milan, Italy
  - CINI, Infolife National Laboratory, Roma, Italy
- Marco Notaro
  - AnacletoLab - Computer Science Department, Università degli Studi di Milano, Via Celoria 18, 20135, Milan, Italy
  - CINI, Infolife National Laboratory, Roma, Italy
- Alessandro Petrini
  - AnacletoLab - Computer Science Department, Università degli Studi di Milano, Via Celoria 18, 20135, Milan, Italy
  - CINI, Infolife National Laboratory, Roma, Italy
- Alex Patak
  - European Commission, Joint Research Centre (JRC), Ispra (VA), Italy
- Alberto Paccanaro
  - Department of Computer Science, Royal Holloway, University of London, Egham, TW20 0EX UK
  - School of Applied Mathematics (EMAp), Fundação Getúlio Vargas, Rio de Janeiro Brazil
- Giorgio Valentini
  - AnacletoLab - Computer Science Department, Università degli Studi di Milano, Via Celoria 18, 20135, Milan, Italy
  - CINI, Infolife National Laboratory, Roma, Italy
  - DSRC UNIMI, Data Science Research Center, Milano, 20135, Italy
  - ELLIS, European Laboratory for Learning and Intelligent Systems, Berlin, Germany
- Elena Casiraghi
  - AnacletoLab - Computer Science Department, Università degli Studi di Milano, Via Celoria 18, 20135, Milan, Italy
  - CINI, Infolife National Laboratory, Roma, Italy
16
Thomas DB, Harmer AMT, Giovanardi S, Holvast EJ, McGoverin CM, Tenenhaus A. Constructing a multiple‐part morphospace using a multiblock method. Methods Ecol Evol 2021. [DOI: 10.1111/2041-210x.13781] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/30/2022]
Affiliation(s)
- Daniel B. Thomas
  - School of Natural Sciences, Massey University, Auckland, New Zealand
- Emma J. Holvast
  - School of Natural Sciences, Massey University, Auckland, New Zealand
- Cushla M. McGoverin
  - Department of Physics, University of Auckland, Auckland, New Zealand
  - The Dodd‐Walls Centre for Photonic and Quantum Technologies, Auckland, New Zealand
- Arthur Tenenhaus
  - Laboratoire des Signaux et Systèmes, CentraleSupelec, Université Paris‐Saclay, Gif‐sur‐Yvette, France
17
Proof of concept and development of a couple-based machine learning model to stratify infertile patients with idiopathic infertility. Sci Rep 2021; 11:24003. [PMID: 34907216 PMCID: PMC8671584 DOI: 10.1038/s41598-021-03165-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/08/2021] [Accepted: 11/18/2021] [Indexed: 11/08/2022] Open
Abstract
We aimed to develop and evaluate a machine learning model that can stratify infertile/fertile couples on the basis of their bioclinical signature, helping the management of couples with unexplained infertility. Fertile and infertile couples were recruited in the ALIFERT cross-sectional case-control multicentric study between September 2009 and December 2013 (NCT01093378). The study group consisted of 97 infertile couples presenting a primary idiopathic infertility (> 12 months) from 4 French infertility centers compared with 100 fertile couples (with a spontaneously conceived child (< 2 years of age) and with time to pregnancy < 12 months) recruited from the healthy population of the areas around the infertility centers. The study group comprises 2 independent sets: a development set (n = 136 from 3 centers) serving to train the model and a test set (n = 61 from 1 center) used to provide an unbiased validation of the model. Our results have shown that: (i) a couple-modeling approach was more discriminant than models in which men's and women's parameters are considered separately; (ii) the most discriminating variables were anthropometric, or related to the metabolic and oxidative status; (iii) a refined model capable of stratifying fertile vs. infertile couples with an accuracy of 73.8% was proposed after variable selection (from 80 to 13). These influential factors (anthropometric, antioxidative, and metabolic signatures) are all modifiable by the couple's lifestyle. The proposed model has a place in the management of couples with idiopathic infertility, for whom decision-making tools are scarce. Prospective interventional studies are now needed to validate the model's clinical use. Trial registration: NCT01093378 ALIFERT (https://clinicaltrials.gov/ct2/show/NCT01093378?term=ALIFERT&rank=1). Registered: March 25, 2010.
18
Zhou H, Nguyen H, Enriquez A, Morsy L, Curtis M, Piser T, Kenney C, Stephen CD, Gupta AS, Schmahmann JD, Vaziri A. Assessment of gait and balance impairment in people with spinocerebellar ataxia using wearable sensors. Neurol Sci 2021; 43:2589-2599. [PMID: 34664180 DOI: 10.1007/s10072-021-05657-6] [Citation(s) in RCA: 18] [Impact Index Per Article: 6.0] [Received: 05/20/2021] [Accepted: 10/05/2021] [Indexed: 11/30/2022]
Abstract
OBJECTIVE: To explore the use of wearable sensors for objective measurement of motor impairment in spinocerebellar ataxia (SCA) patients during clinical assessments of gait and balance. METHODS: In total, 14 patients with genetically confirmed SCA (mean age 61.6 ± 8.6 years) and 4 healthy controls (mean age 49.0 ± 16.4 years) were recruited through the Massachusetts General Hospital (MGH) Ataxia Center. Participants donned seven inertial sensors while performing two independent trials of gait and balance assessments from the Scale for the Assessment and Rating of Ataxia (SARA) and Brief Ataxia Rating Scale (BARS2). Univariate analysis was used to identify sensor-derived metrics from wearable sensors that discriminate motor function between the SCA and control groups. Multivariate linear regression models were used to estimate the subjective in-person SARA/BARS2 ratings. Spearman correlation coefficients were used to evaluate the performance of the model. RESULTS: Stride length variability, stride duration, cadence, stance phase, pelvis sway, and turn duration were different between SCA and controls (p < 0.05). Similarly, sway and sway velocity of the ankle, hip, and center of mass differentiated SCA and controls (p < 0.05). Using these features, linear regression models showed moderate-to-strong correlation with clinical scores from the in-person rater during SARA assessments of gait (r = 0.73, p = 0.003) and stance (r = 0.90, p < 0.001) and the BARS2 gait assessment (r = 0.74, p = 0.003). CONCLUSION: This study demonstrates that sensor-derived metrics can potentially be used to estimate the level of motor impairment in patients with SCA quickly and objectively. Thus, digital biomarkers from wearable sensors have the potential to be an integral tool for SCA clinical trials and care.
Affiliation(s)
- He Zhou
  - BioSensics LLC, Newton, MA, USA
- Christopher D Stephen
  - Ataxia Center, Department of Neurology, Massachusetts General Hospital, Boston, MA, USA
- Anoopum S Gupta
  - Ataxia Center, Department of Neurology, Massachusetts General Hospital, Boston, MA, USA
- Jeremy D Schmahmann
  - Ataxia Center, Department of Neurology, Massachusetts General Hospital, Boston, MA, USA
19
Misra N, Clavaud C, Guinot F, Bourokba N, Nouveau S, Mezzache S, Palazzi P, Appenzeller BMR, Tenenhaus A, Leung MHY, Lee PKH, Bastien P, Aguilar L, Cavusoglu N. Multi-omics analysis to decipher the molecular link between chronic exposure to pollution and human skin dysfunction. Sci Rep 2021; 11:18302. [PMID: 34526566 PMCID: PMC8443591 DOI: 10.1038/s41598-021-97572-1] [Citation(s) in RCA: 15] [Impact Index Per Article: 5.0] [Received: 03/24/2021] [Accepted: 08/03/2021] [Indexed: 12/24/2022] Open
Abstract
Environmental pollution comprises several factors, including particulate matter (PM2.5, PM10), ozone and ultraviolet (UV) rays, and the first and most exposed tissue is the skin epidermis. It has been established that several skin disorders such as eczema, acne, lentigines and wrinkles are aggravated by exposure to atmospheric pollution. While pollutants can interact with the skin surface, contamination of deep skin by ultrafine particles or polycyclic aromatic hydrocarbons (PAH) might be explained by their presence in blood and hair cortex. Molecular mechanisms leading to skin dysfunction due to pollution exposure have been poorly explored in humans. In addition to various host skin components, the cutaneous microbiome is another target of these environmental aggressors and can actively contribute to visible clinical manifestations such as wrinkles and aging. The present study aimed to investigate the association between pollution exposure, skin microbiota, metabolites and skin clinical signs in women from two cities with different pollution levels. Untargeted metabolomics and targeted proteins were analyzed from D-Squame samples from healthy women (n = 67 per city), aged 25-45 years and living for at least 15 years in the Chinese cities of Baoding (used as a model of polluted area) and Dalian (control area with lower level of pollution). Additional swab samples were collected from the cheeks of the same population, and the microbiome was analysed using bacterial 16S rRNA as well as fungal ITS1 amplicon sequencing and metagenomics analysis. The level of exposure to pollution was assessed individually by the analysis of PAH and their metabolites in hair samples collected from each participant. All participants were assessed for clinical skin parameters (acne, wrinkles, pigmented spots, etc.).
Women from the two cities (polluted and less polluted) showed distinct metabolic profiles and alterations in the skin microbiome. Profiling data from 350 identified metabolites, 143 microbes and 39 PAH served to characterize biochemical events that correlate with pollution exposure. Finally, using multiblock data analysis methods, we obtained a potential molecular map consisting of multi-omics signatures that correlated with the presence of skin pigmentation dysfunction in individuals living in a polluted environment. Overall, these signatures point towards macromolecular alterations by pollution that could manifest as clinical signs of early skin pigmentation and/or other imperfections.
Affiliation(s)
- Namita Misra
  - Research and Innovation, L'Oréal SA, Aulnay Sous Bois, France.
- Cécile Clavaud
  - Research and Innovation, L'Oréal SA, Aulnay Sous Bois, France
- Florent Guinot
  - Research and Innovation, L'Oréal SA, Aulnay Sous Bois, France
- Sakina Mezzache
  - Research and Innovation, L'Oréal SA, Aulnay Sous Bois, France
- Paul Palazzi
  - Human Biomonitoring Research Unit, Luxembourg Institute of Health, Strassen, Luxemburg
- Brice M R Appenzeller
  - Human Biomonitoring Research Unit, Luxembourg Institute of Health, Strassen, Luxemburg
- Arthur Tenenhaus
  - CentraleSupelec Laboratoire des Signaux et Systemes, Université Paris-Saclay, CNRS, Gif-sur-Yvette, France
  - Brain and Spine Institute, Paris, France
- Marcus H Y Leung
  - School of Energy and Environment, City University of Hong Kong, Kowloon, Hong Kong SAR, China
- Patrick K H Lee
  - School of Energy and Environment, City University of Hong Kong, Kowloon, Hong Kong SAR, China
- Luc Aguilar
  - Research and Innovation, L'Oréal SA, Aulnay Sous Bois, France
20
Reel PS, Reel S, Pearson E, Trucco E, Jefferson E. Using machine learning approaches for multi-omics data analysis: A review. Biotechnol Adv 2021; 49:107739. [PMID: 33794304 DOI: 10.1016/j.biotechadv.2021.107739] [Citation(s) in RCA: 265] [Impact Index Per Article: 88.3] [Received: 12/15/2020] [Revised: 03/01/2021] [Accepted: 03/25/2021] [Indexed: 02/06/2023]
Abstract
With the development of modern high-throughput omic measurement platforms, it has become essential for biomedical studies to undertake an integrative (combined) approach to fully utilise these data to gain insights into biological systems. Data from various omics sources such as genetics, proteomics, and metabolomics can be integrated to unravel the intricate working of systems biology using machine learning-based predictive algorithms. Machine learning methods offer novel techniques to integrate and analyse the various omics data enabling the discovery of new biomarkers. These biomarkers have the potential to help in accurate disease prediction, patient stratification and delivery of precision medicine. This review paper explores different integrative machine learning methods which have been used to provide an in-depth understanding of biological systems during normal physiological functioning and in the presence of a disease. It provides insight and recommendations for interdisciplinary professionals who envisage employing machine learning skills in multi-omics studies.
Affiliation(s)
- Parminder S Reel
  - Division of Population Health and Genomics, School of Medicine, University of Dundee, Dundee, United Kingdom
- Smarti Reel
  - Division of Population Health and Genomics, School of Medicine, University of Dundee, Dundee, United Kingdom
- Ewan Pearson
  - Division of Population Health and Genomics, School of Medicine, University of Dundee, Dundee, United Kingdom
- Emanuele Trucco
  - VAMPIRE project, Computing, School of Science and Engineering, University of Dundee, Dundee, United Kingdom
- Emily Jefferson
  - Division of Population Health and Genomics, School of Medicine, University of Dundee, Dundee, United Kingdom
21
Öz G, Harding IH, Krahe J, Reetz K. MR imaging and spectroscopy in degenerative ataxias: toward multimodal, multisite, multistage monitoring of neurodegeneration. Curr Opin Neurol 2021; 33:451-461. [PMID: 32657886 DOI: 10.1097/wco.0000000000000834] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Indexed: 11/25/2022]
Abstract
PURPOSE OF REVIEW Degenerative ataxias are rare and currently untreatable movement disorders, primarily characterized by neurodegeneration in the cerebellum and brainstem. We highlight MRI studies with the most potential for utility in pending ataxia trials and underscore advances in disease characterization and diagnostics in the field. RECENT FINDINGS With availability of advanced MRI acquisition methods and specialized software dedicated to the analysis of MRI of the cerebellum, patterns of cerebellar atrophy in different degenerative ataxias are increasingly well defined. The field further embraced rigorous multimodal investigations to study network-level microstructural and functional brain changes and their neurochemical correlates. MRI and magnetic resonance spectroscopy were shown to be more sensitive to disease progression than clinical scales and to detect abnormalities in premanifest mutation carriers. SUMMARY Magnetic resonance techniques are increasingly well placed for characterizing the expression and progression of degenerative ataxias. The most impactful work has arguably come through multi-institutional studies that monitor relatively large cohorts, multimodal investigations that assess the sensitivity of different measures and their interrelationships, and novel imaging approaches that are targeted to known pathophysiology (e.g., iron and spinal imaging in Friedreich ataxia). These multimodal, multi-institutional studies are paving the way to clinical trial readiness and enhanced understanding of disease in degenerative ataxias.
Affiliation(s)
- Gülin Öz
  - Center for Magnetic Resonance Research, Department of Radiology, University of Minnesota, Minneapolis, Minnesota, United States
- Ian H Harding
  - Department of Neuroscience, Central Clinical School; Monash Biomedical Imaging, Monash University, Melbourne, Australia
- Janna Krahe
  - Department of Neurology; JARA Brain Institute Molecular Neuroscience and Neuroimaging, Research Centre Jülich, RWTH Aachen University, Aachen, Germany
- Kathrin Reetz
  - Department of Neurology; JARA Brain Institute Molecular Neuroscience and Neuroimaging, Research Centre Jülich, RWTH Aachen University, Aachen, Germany
22
Avants BB, Tustison NJ, Stone JR. Similarity-driven multi-view embeddings from high-dimensional biomedical data. NATURE COMPUTATIONAL SCIENCE 2021; 1:143-152. [PMID: 33796865 PMCID: PMC8009088 DOI: 10.1038/s43588-021-00029-8] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 06/10/2020] [Accepted: 01/19/2021] [Indexed: 12/31/2022]
Abstract
Diverse, high-dimensional modalities collected in large cohorts present new opportunities for the formulation and testing of integrative scientific hypotheses. Similarity-driven multi-view linear reconstruction (SiMLR) is an algorithm that exploits inter-modality relationships to transform large scientific datasets into smaller, more well-powered and interpretable low-dimensional spaces. SiMLR contributes an objective function for identifying joint signal, regularization based on sparse matrices representing prior within-modality relationships and an implementation that permits application to joint reduction of large data matrices. We demonstrate that SiMLR outperforms closely related methods on supervised learning problems in simulation data, a multi-omics cancer survival prediction dataset and multiple modality neuroimaging datasets. Taken together, this collection of results shows that SiMLR may be applied to joint signal estimation from disparate modalities and may yield practically useful results in a variety of application domains.
Affiliation(s)
- Brian B Avants
  - Department of Radiology and Medical Imaging, University of Virginia, Charlottesville, VA
- Nicholas J Tustison
  - Department of Radiology and Medical Imaging, University of Virginia, Charlottesville, VA
- James R Stone
  - Department of Radiology and Medical Imaging, University of Virginia, Charlottesville, VA
23
Hsu LL, Culhane AC. Impact of Data Preprocessing on Integrative Matrix Factorization of Single Cell Data. Front Oncol 2020; 10:973. [PMID: 32656082 PMCID: PMC7324639 DOI: 10.3389/fonc.2020.00973] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 02/20/2020] [Accepted: 05/18/2020] [Indexed: 01/04/2023] Open
Abstract
Integrative, single-cell analyses may provide unprecedented insights into cellular and spatial diversity of the tumor microenvironment. The sparsity, noise, and high dimensionality of these data present unique challenges. Whilst approaches for integrating single-cell data are emerging and are far from being standardized, most data integration, cell clustering, cell trajectory, and analysis pipelines employ a dimension reduction step, frequently principal component analysis (PCA), a matrix factorization method that is relatively fast, and can easily scale to large datasets when used with sparse-matrix representations. In this review, we provide a guide to PCA and related methods. We describe the relationship between PCA and singular value decomposition, the difference between PCA of a correlation and covariance matrix, the impact of scaling, log-transforming, and standardization, and how to recognize a horseshoe or arch effect in a PCA. We describe canonical correlation analysis (CCA), a popular matrix factorization approach for the integration of single-cell data from different platforms or studies. We discuss alternatives to CCA and why additional preprocessing or weighting datasets within the joint decomposition should be considered.
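The relationship between PCA and singular value decomposition mentioned in this abstract, and the difference between PCA of a covariance and of a correlation matrix, can be illustrated with a minimal NumPy sketch. This is not code from the cited review: the function name `pca_via_svd`, the simulated matrix `X`, and the column scales are assumptions chosen only for illustration.

```python
import numpy as np

def pca_via_svd(X, scale=False):
    """PCA of a samples-by-features matrix computed through the SVD.

    With scale=False the result matches an eigendecomposition of the
    covariance matrix; with scale=True (columns standardized first) it
    matches an eigendecomposition of the correlation matrix.
    Returns (component variances, loadings as columns).
    """
    Xc = X - X.mean(axis=0)                    # center every feature
    if scale:
        Xc = Xc / Xc.std(axis=0, ddof=1)       # unit sample variance per feature
    _, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    variances = s**2 / (X.shape[0] - 1)        # squared singular values -> variances
    return variances, Vt.T

# Simulated data (an assumption): four features on very different scales.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 4)) * np.array([1.0, 5.0, 0.5, 2.0])

var_cov, _ = pca_via_svd(X)
eig_cov = np.sort(np.linalg.eigvalsh(np.cov(X, rowvar=False)))[::-1]

var_cor, _ = pca_via_svd(X, scale=True)
eig_cor = np.sort(np.linalg.eigvalsh(np.corrcoef(X, rowvar=False)))[::-1]

print(np.allclose(var_cov, eig_cov), np.allclose(var_cor, eig_cor))
```

Without scaling, the leading component is dominated by the feature with the largest variance, which is one reason the choice between covariance-based and correlation-based PCA matters before integrating data sets measured on different scales.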
Affiliation(s)
- Lauren L Hsu
  - Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, United States; Division of Biostatistics and Computational Biology, Department of Data Sciences, Dana-Farber Cancer Institute, Boston, MA, United States
- Aedin C Culhane
  - Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, United States; Division of Biostatistics and Computational Biology, Department of Data Sciences, Dana-Farber Cancer Institute, Boston, MA, United States
24
Tang J, Mou M, Wang Y, Luo Y, Zhu F. MetaFS: Performance assessment of biomarker discovery in metaproteomics. Brief Bioinform 2020; 22:5854399. [PMID: 32510556 DOI: 10.1093/bib/bbaa105] [Citation(s) in RCA: 63] [Impact Index Per Article: 15.8] [Received: 03/24/2020] [Revised: 04/17/2020] [Accepted: 05/05/2020] [Indexed: 12/19/2022] Open
Abstract
Metaproteomics suffers from the issues of dimensionality and sparsity. Data reduction methods can identify the relevant subset of significant differential features and reduce data redundancy. Feature selection (FS) methods are applied to obtain this significant differential subset, and a variety of them have been developed for metaproteomic studies. However, because the performance of an FS method depends heavily on the data characteristics of a given study, a suitable method must be carefully chosen to obtain reproducible differential proteins. Moreover, it is critical to evaluate the performance of each FS method according to comprehensive criteria, because a single criterion is not sufficient to reflect its overall performance. Therefore, we developed an online tool named MetaFS, which provides 13 types of FS methods and conducts a comprehensive evaluation of these FS methods using four widely accepted and independent criteria. Furthermore, the function and reliability of MetaFS were systematically tested and validated via two case studies. In sum, MetaFS could be a useful tool for identifying the best-performing FS method for selecting potential biomarkers in microbiome studies. The online tool is freely available at https://idrblab.org/metafs/.
25
Effect of congenital adrenal hyperplasia treated by glucocorticoids on plasma metabolome: a machine-learning-based analysis. Sci Rep 2020; 10:8859. [PMID: 32483270 PMCID: PMC7264133 DOI: 10.1038/s41598-020-65897-y] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 01/08/2020] [Accepted: 05/11/2020] [Indexed: 12/30/2022] Open
Abstract
Background. Congenital adrenal hyperplasia (CAH) due to 21-hydroxylase deficiency leads to impaired cortisol biosynthesis. Treatment includes glucocorticoid supplementation. We studied the specific metabolomic signatures in CAH patients using two different algorithms. Methods. In a case-control study of CAH patients matched on sex and age with healthy control subjects, two metabolomic analyses were performed: one using MetaboDiff, a validated differential metabolomic analysis tool, and the other using Predomics, a novel machine-learning algorithm. Results. 168 participants were included (84 CAH patients). There was no correlation between plasma cortisol levels during glucocorticoid supplementation and metabolites in CAH patients. Indoleamine 2,3-dioxygenase enzyme activity was correlated with ACTH (rho coefficient = −0.25, p-value = 0.02) in CAH patients but not in control subjects. Overall, 33 metabolites were significantly altered in CAH patients. The main changes came from purine and pyrimidine metabolites, branched-chain amino acids, tricarboxylic acid cycle metabolites and associated pathways (urea, glucose, pentose phosphates). MetaboDiff identified 2 modules that were significantly different between the two groups: aminosugar metabolism and purine metabolism. Predomics found several interpretable models which accurately discriminated the two groups (accuracy of 0.86 and AUROC of 0.9). Conclusion. CAH patients and healthy control subjects exhibit significant differences in plasma metabolomes, which may be explained by glucocorticoid supplementation.
26
Xia Y. Correlation and association analyses in microbiome study integrating multiomics in health and disease. PROGRESS IN MOLECULAR BIOLOGY AND TRANSLATIONAL SCIENCE 2020; 171:309-491. [PMID: 32475527 DOI: 10.1016/bs.pmbts.2020.04.003] [Citation(s) in RCA: 37] [Impact Index Per Article: 9.3] [Indexed: 02/07/2023]
Abstract
Correlation and association analyses are among the most widely used statistical methods in research fields, including microbiome and integrative multiomics studies. Correlation and association have two implications: dependence and co-occurrence. Microbiome data are structured as a phylogenetic tree and have several unique characteristics, including high dimensionality, compositionality, sparsity with excess zeros, and heterogeneity. These unique characteristics cause several statistical issues when analyzing microbiome data and integrating multiomics data, such as large p and small n, dependency, overdispersion, and zero-inflation. In microbiome research, on the one hand, classic correlation and association methods are still applied in real studies and used for the development of new methods; on the other hand, new methods have been developed to target statistical issues arising from unique characteristics of microbiome data. Here, we first provide a comprehensive view of classic and newly developed univariate correlation and association-based methods. We discuss the appropriateness and limitations of using classic methods and demonstrate how the newly developed methods mitigate the issues of microbiome data. Second, we emphasize that concepts of correlation and association analyses have been shifted by introducing network analysis, microbe-metabolite interactions, functional analysis, etc. Third, we introduce multivariate correlation and association-based methods, which are organized by the categories of exploratory, interpretive, and discriminatory analyses and classification methods. Fourth, we focus on the hypothesis testing of univariate and multivariate regression-based association methods, including alpha and beta diversities-based, count-based, and relative abundance (or compositional)-based association analyses. We demonstrate the characteristics and limitations of each approach.
Fifth, we introduce two specific microbiome-based methods: phylogenetic tree-based association analysis and testing for survival outcomes. Sixth, we provide an overall view of longitudinal methods in analysis of microbiome and omics data, which cover standard, static, regression-based time series methods, principal trend analysis, and newly developed univariate overdispersed and zero-inflated as well as multivariate distance/kernel-based longitudinal models. Finally, we comment on current association analysis and future direction of association analysis in microbiome and multiomics studies.
Affiliation(s)
- Yinglin Xia
  - Department of Medicine, University of Illinois at Chicago, Chicago, IL, United States
27
Blum MGB, Valeri L, François O, Cadiou S, Siroux V, Lepeule J, Slama R. Challenges Raised by Mediation Analysis in a High-Dimension Setting. ENVIRONMENTAL HEALTH PERSPECTIVES 2020; 128:55001. [PMID: 32379489 PMCID: PMC7263455 DOI: 10.1289/ehp6240] [Citation(s) in RCA: 17] [Impact Index Per Article: 4.3] [Received: 09/17/2019] [Revised: 04/14/2020] [Accepted: 04/15/2020] [Indexed: 05/19/2023]
Abstract
BACKGROUND Mediation analysis is used in epidemiology to identify pathways through which exposures influence health. The advent of high-throughput (omics) technologies gives opportunities to perform mediation analysis with a high-dimension pool of covariates. OBJECTIVE We aimed to highlight some biostatistical issues of this expanding field of high-dimension mediation. DISCUSSION The mediation techniques used for a single mediator cannot be generalized in a straightforward manner to high-dimension mediation. Causal knowledge on the relation between covariates is required for mediation analysis, and it is expected to be more limited as dimension and system complexity increase. The methods developed in high dimension can be distinguished according to whether mediators are considered separately or as a whole. Methods considering each potential mediator separately do not allow efficient identification of the indirect effects when mutual influences exist among the mediators, which is expected for many biological (e.g., epigenetic) parameters. In this context, methods considering all potential mediators simultaneously, based, for example, on data reduction techniques, are more adapted to the causal inference framework. Their cost is a possible lack of ability to single out the causal mediators. Moreover, the ability of the mediators to predict the outcome can be overestimated, in particular because many machine-learning algorithms are optimized to increase predictive ability rather than their aptitude to make causal inference. Given the lack of overarching validated framework and the generally complex causal structure of high-dimension data, analysis of high-dimension mediation currently requires great caution and effort to incorporate a priori biological knowledge. https://doi.org/10.1289/EHP6240.
Affiliation(s)
- Michaël G B Blum
  - Laboratoire Techniques de l'Imagerie Médicale et de la Complexité (TIMC-IMAG; UMR 5525), French National Centre for Scientific Research (CNRS), University Grenoble Alpes, La Tronche, France
  - OWKIN, Paris, France
- Linda Valeri
  - Department of Biostatistics, Columbia University Mailman School of Public Health, New York, New York, USA
- Olivier François
  - Laboratoire Techniques de l'Imagerie Médicale et de la Complexité (TIMC-IMAG; UMR 5525), French National Centre for Scientific Research (CNRS), University Grenoble Alpes, La Tronche, France
- Solène Cadiou
  - Team of Environmental Epidemiology applied to Reproduction and Respiratory Health, Institute for Advanced Biosciences (IAB) joint research center, Institut national de la santé et de la recherche médicale (Inserm), CNRS, University Grenoble-Alpes, Grenoble, France
- Valérie Siroux
  - Team of Environmental Epidemiology applied to Reproduction and Respiratory Health, Institute for Advanced Biosciences (IAB) joint research center, Institut national de la santé et de la recherche médicale (Inserm), CNRS, University Grenoble-Alpes, Grenoble, France
- Johanna Lepeule
  - Team of Environmental Epidemiology applied to Reproduction and Respiratory Health, Institute for Advanced Biosciences (IAB) joint research center, Institut national de la santé et de la recherche médicale (Inserm), CNRS, University Grenoble-Alpes, Grenoble, France
- Rémy Slama
  - Team of Environmental Epidemiology applied to Reproduction and Respiratory Health, Institute for Advanced Biosciences (IAB) joint research center, Institut national de la santé et de la recherche médicale (Inserm), CNRS, University Grenoble-Alpes, Grenoble, France
28
Abstract
The molecular mechanisms and functions in complex biological systems currently remain elusive. Recent high-throughput techniques, such as next-generation sequencing, have generated a wide variety of multiomics datasets that enable the identification of biological functions and mechanisms via multiple facets. However, integrating these large-scale multiomics data and discovering functional insights are, nevertheless, challenging tasks. To address these challenges, machine learning has been broadly applied to analyze multiomics. This review introduces multiview learning, an emerging machine learning field, and envisions its potentially powerful applications to multiomics. In particular, multiview learning is more effective than previous integrative methods for learning data's heterogeneity and revealing cross-talk patterns. Although it has been applied to various contexts, such as computer vision and speech recognition, multiview learning has not yet been widely applied to biological data, specifically multiomics data. Therefore, this paper firstly reviews recent multiview learning methods and unifies them in a framework called multiview empirical risk minimization (MV-ERM). We further discuss the potential applications of each method to multiomics, including genomics, transcriptomics, and epigenomics, with the aim of discovering the functional and mechanistic interpretations across omics. Secondly, we explore possible applications to different biological systems, including human diseases (e.g., brain disorders and cancers), plants, and single-cell analysis, and discuss both the benefits and caveats of using multiview learning to discover the molecular mechanisms and functions of these systems.
Affiliation(s)
- Nam D. Nguyen
  - Department of Computer Science, Stony Brook University, Stony Brook, New York, United States of America
- Daifeng Wang
  - Department of Biostatistics and Medical Informatics, University of Wisconsin-Madison, Madison, Wisconsin, United States of America
  - Waisman Center, University of Wisconsin-Madison, Madison, Wisconsin, United States of America
29
Metabolic and Organelle Morphology Defects in Mice and Human Patients Define Spinocerebellar Ataxia Type 7 as a Mitochondrial Disease. Cell Rep 2020; 26:1189-1202.e6. [PMID: 30699348 PMCID: PMC6420346 DOI: 10.1016/j.celrep.2019.01.028] [Citation(s) in RCA: 48] [Impact Index Per Article: 12.0] [Received: 04/25/2018] [Revised: 05/14/2018] [Accepted: 01/08/2019] [Indexed: 12/17/2022] Open
Abstract
Spinocerebellar ataxia type 7 (SCA7) is a retinal-cerebellar degenerative disorder caused by CAG-polyglutamine (polyQ) repeat expansions in the ataxin-7 gene. As many SCA7 clinical phenotypes occur in mitochondrial disorders, and magnetic resonance spectroscopy of patients revealed altered energy metabolism, we considered a role for mitochondrial dysfunction. Studies of SCA7 mice uncovered marked impairments in oxygen consumption and respiratory exchange. When we examined cerebellar Purkinje cells in mice, we observed mitochondrial network abnormalities, with enlarged mitochondria upon ultrastructural analysis. We developed stem cell models from patients and created stem cell knockout rescue systems, documenting mitochondrial morphology defects, impaired oxidative metabolism, and reduced expression of nicotinamide adenine dinucleotide (NAD+) production enzymes in SCA7 models. We observed NAD+ reductions in mitochondria of SCA7 patient NPCs using ratiometric fluorescent sensors and documented alterations in tryptophan-kynurenine metabolism in patients. Our results indicate that mitochondrial dysfunction, stemming from decreased NAD+, is a defining feature of SCA7.
30
Boyd A, Boccara F, Meynard JL, Ichou F, Bastard JP, Fellahi S, Samri A, Sauce D, Haddour N, Autran B, Cohen A, Girard PM, Capeau J. Serum Tryptophan-Derived Quinolinate and Indole-3-Acetate Are Associated With Carotid Intima-Media Thickness and its Evolution in HIV-Infected Treated Adults. Open Forum Infect Dis 2019; 6:ofz516. [PMID: 31890722 PMCID: PMC6929253 DOI: 10.1093/ofid/ofz516] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Received: 10/25/2019] [Accepted: 12/05/2019] [Indexed: 11/14/2022] Open
Abstract
Background HIV-infected individuals undergoing effective antiretroviral therapy (ART) present an increased risk of atherosclerotic cardiovascular disease. We identified serum metabolites associated with carotid intima-media thickness (c-IMT) and its evolution. Methods One hundred forty-three hydrophilic serum metabolites were measured by ultraperformance liquid chromatography coupled with high-resolution mass spectrometry in 49 HIV+ ART+, 48 HIV+ ART-naïve and 50 HIV-negative, age-matched, never-smoking male triads. Metabolites differentially altered between groups ("features") were defined as having a Benjamini-Hochberg-adjusted P value <.05 from a t test and >0.25 log2 absolute mean fold change in metabolite levels. c-IMT was measured across 12 sites at inclusion in all individuals and at the carotid artery (cca) after a median of 5.1 years in 32 HIV+ ART+ individuals. The difference in c-IMT (cross-sectional analysis) and slope of cca-IMT regression/progression per year (longitudinal analysis) for each log10 (area) increase in metabolite level were estimated with linear regression. Results Compared with HIV-, metabolite features of HIV+ ART+ were increased N6,N6,N6-trimethyl-L-lysine and decreased ferulate and 5-hydroxy-L-tryptophan, whereas features of HIV+ ART-naïve were increased malate, kynurenine, 2-oxoglutarate, and indole-3-acetate and decreased succinate and 5-hydroxy-L-tryptophan. In HIV+ ART+ individuals, quinolinate and/or indole-3-acetate were positively associated with c-IMT (P < .03), cca-IMT (P < .03), and cca-IMT progression (P < .008). These associations were not observed in HIV+ ART-naïve or HIV-negative individuals. In HIV+ ART+ individuals, the metabolites xanthosine and uridine, from nucleotide metabolism, and γ-butyrobetaine, from lysine/dietary choline degradation, were also positively or negatively associated with c-IMT and/or cca-IMT (all P < .01), but not its evolution.
Conclusions In these highly selected HIV-positive ART-controlled males, 2 novel metabolites derived from tryptophan catabolism, indole-3-acetate and quinolinate, were associated with c-IMT and its progression.
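The feature definition in the Methods above (Benjamini-Hochberg-adjusted P < .05 together with an absolute log2 fold change > 0.25) can be sketched in plain Python. This is an illustrative sketch, not the authors' pipeline: the function names, the input p-values, and the group means are hypothetical, and the fold change is assumed to be a ratio of raw-scale group means.

```python
import math

def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

def select_features(pvalues, mean_a, mean_b, alpha=0.05, min_abs_log2fc=0.25):
    """Indices passing both the adjusted-P and the fold-change threshold."""
    adjusted = benjamini_hochberg(pvalues)
    selected = []
    for i, (q, a, b) in enumerate(zip(adjusted, mean_a, mean_b)):
        log2fc = math.log2(a / b)  # assumes strictly positive raw-scale means
        if q < alpha and abs(log2fc) > min_abs_log2fc:
            selected.append(i)
    return selected

# Hypothetical t-test p-values and group means for four metabolites.
pvals = [0.001, 0.010, 0.012, 0.20]
mean_a = [2.0, 1.1, 3.0, 4.0]
mean_b = [1.0, 1.0, 1.0, 1.0]
selected = select_features(pvals, mean_a, mean_b)
print(selected)
```

Applying both thresholds jointly means that a metabolite with a small adjusted P value but a negligible fold change, or a large fold change without statistical support, is not counted as a feature.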
Affiliation(s)
- Anders Boyd
  - Inserm UMR_S1136, Sorbonne Université, Institut Pierre Louis d'Epidémiologie et de Santé Publique (IPLESP), Paris, France
- Franck Boccara
  - Department of Cardiology, AP-HP, Hôpital Saint-Antoine, Paris, France; Faculty of Medicine, Sorbonne Université, Inserm UMR_S938, ICAN, Paris, France
- Jean-Luc Meynard
  - Department of Infectious Diseases, APHP, Hôpital Saint-Antoine, Paris, France
- Farid Ichou
  - Institute of Cardiometabolism and Nutrition, ICAN, ICANalytics, Paris, France
- Jean-Philippe Bastard
  - Faculty of Medicine, Sorbonne Université, Inserm UMR_S938, ICAN, Paris, France; Department of Biochemistry, APHP, Hôpital Tenon, Paris, France
- Soraya Fellahi
  - Faculty of Medicine, Sorbonne Université, Inserm UMR_S938, ICAN, Paris, France; Department of Biochemistry, APHP, Hôpital Tenon, Paris, France
- Assia Samri
  - Sorbonne Université, INSERM U1135, Centre d'Immunologie et des Maladies Infectieuses, Paris, France
- Delphine Sauce
  - Sorbonne Université, INSERM U1135, Centre d'Immunologie et des Maladies Infectieuses, Paris, France
- Nabila Haddour
  - Department of Cardiology, AP-HP, Hôpital Saint-Antoine, Paris, France
- Brigitte Autran
  - Sorbonne Université, INSERM U1135, Centre d'Immunologie et des Maladies Infectieuses, Paris, France
- Ariel Cohen
  - Department of Cardiology, AP-HP, Hôpital Saint-Antoine, Paris, France
- Pierre-Marie Girard
  - Inserm UMR_S1136, Sorbonne Université, Institut Pierre Louis d'Epidémiologie et de Santé Publique (IPLESP), Paris, France; Department of Infectious Diseases, APHP, Hôpital Saint-Antoine, Paris, France
- Jacqueline Capeau
  - Faculty of Medicine, Sorbonne Université, Inserm UMR_S938, ICAN, Paris, France
31
Xicota L, Ichou F, Lejeune FX, Colsch B, Tenenhaus A, Leroy I, Fontaine G, Lhomme M, Bertin H, Habert MO, Epelbaum S, Dubois B, Mochel F, Potier MC. Multi-omics signature of brain amyloid deposition in asymptomatic individuals at-risk for Alzheimer's disease: The INSIGHT-preAD study. EBioMedicine 2019; 47:518-528. [PMID: 31492558 PMCID: PMC6796577 DOI: 10.1016/j.ebiom.2019.08.051] [Citation(s) in RCA: 30] [Impact Index Per Article: 6.0] [Received: 06/25/2019] [Revised: 08/23/2019] [Accepted: 08/23/2019] [Indexed: 01/21/2023] Open
Abstract
BACKGROUND One of the biggest challenges in Alzheimer's disease (AD) is to identify easily accessible pathways and markers of disease prediction, for prevention and treatment. Here we analysed blood samples from the INveStIGation of AlzHeimer's predicTors (INSIGHT-preAD) cohort of elderly asymptomatic individuals with and without brain amyloid load. METHODS We performed blood RNAseq, and plasma metabolomics and lipidomics using liquid chromatography-mass spectrometry on 48 amyloid-positive and 48 amyloid-negative individuals (SUVr cut-off of 0·7918). The three data sets were analysed separately using differential gene expression based on negative binomial distribution, non-parametric (Wilcoxon) and parametric (correlation-adjusted Student's t) tests. Data integration was conducted using sparse partial least squares-discriminant and principal component analyses. Bootstrap-selected top-ten features from the three data sets were tested for their discriminant power using Receiver Operating Characteristic curve. Longitudinal metabolomic analysis was carried out on a subset of 22 subjects. FINDINGS Univariate analyses identified three medium chain fatty acids, 4-nitrophenol and a set of 64 transcripts enriched for inflammation and fatty acid metabolism differentially quantified in amyloid positive and negative subjects. Importantly, the amounts of the three medium chain fatty acids were correlated over time in a subset of 22 subjects (p < 0·05). Multi-omics integrative analyses showed that metabolites efficiently discriminated between subjects according to their amyloid status while lipids did not and transcripts showed trends. Finally, the ten top metabolites and transcripts represented the most discriminant omics features with 99·4% chance prediction for amyloid positivity.
INTERPRETATION This study suggests a potential blood omics signature for prediction of amyloid positivity in asymptomatic at-risk subjects, allowing for a less invasive, more accessible, and less expensive risk assessment of AD as compared to PET studies or lumbar puncture. FUND: Institut Hospitalo-Universitaire and Institut du Cerveau et de la Moelle Epiniere (IHU-A-ICM), French Ministry of Research, Fondation Alzheimer, Pfizer, and Avid.
Collapse
Affiliation(s)
- Laura Xicota
- ICM Institut du Cerveau et de la Moelle épinière, CNRS UMR7225, INSERM U1127, UPMC, Hôpital de la Pitié-Salpêtrière, 47 Bd de l'Hôpital, Paris, France
| | - Farid Ichou
- ICANalytcis Platforms, Institute of Cardiometabolism and Nutrition ICAN, Paris, France
| | - François-Xavier Lejeune
- ICM Institut du Cerveau et de la Moelle épinière, CNRS UMR7225, INSERM U1127, UPMC, Hôpital de la Pitié-Salpêtrière, 47 Bd de l'Hôpital, Paris, France
| | - Benoit Colsch
- Service de Pharmacologie et Immunoanalyse (SPI), CEA, INRA, Université Paris-Saclay, MetaboHUB, Gif-sur-Yvette, France
| | - Arthur Tenenhaus
- Laboratoire des Signaux et Systèmes, CentraleSupélec, Université Paris-Saclay, Gif sur Yvette, France
| | - Inka Leroy
- ICM Institut du Cerveau et de la Moelle épinière, CNRS UMR7225, INSERM U1127, UPMC, Hôpital de la Pitié-Salpêtrière, 47 Bd de l'Hôpital, Paris, France
| | - Gaëlle Fontaine
- ICM Institut du Cerveau et de la Moelle épinière, CNRS UMR7225, INSERM U1127, UPMC, Hôpital de la Pitié-Salpêtrière, 47 Bd de l'Hôpital, Paris, France
| | - Marie Lhomme
- ICANalytics Platforms, Institute of Cardiometabolism and Nutrition ICAN, Paris, France
| | - Hugo Bertin
- Centre Acquisition et Traitement des Images, Paris, France
| | - Marie-Odile Habert
- Laboratoire d'Imagerie Biomédicale, Nuclear Medicine Department, Sorbonne Université, Hôpital de la Salpêtrière, Paris, France
| | - Stéphane Epelbaum
- ICM Institut du Cerveau et de la Moelle épinière, CNRS UMR7225, INSERM U1127, UPMC, Hôpital de la Pitié-Salpêtrière, 47 Bd de l'Hôpital, Paris, France; Centre des Maladies Cognitives et Comportementales, Sorbonne Université, Hôpital de la Salpêtrière, Paris, France; Inria, Aramis-Project Team, Paris, France
| | - Bruno Dubois
- Centre des Maladies Cognitives et Comportementales, Sorbonne Université, Hôpital de la Salpêtrière, Paris, France
| | - Fanny Mochel
- ICM Institut du Cerveau et de la Moelle épinière, CNRS UMR7225, INSERM U1127, UPMC, Hôpital de la Pitié-Salpêtrière, 47 Bd de l'Hôpital, Paris, France.
| | - Marie-Claude Potier
- ICM Institut du Cerveau et de la Moelle épinière, CNRS UMR7225, INSERM U1127, UPMC, Hôpital de la Pitié-Salpêtrière, 47 Bd de l'Hôpital, Paris, France.
|
32
|
López de Maturana E, Alonso L, Alarcón P, Martín-Antoniano IA, Pineda S, Piorno L, Calle ML, Malats N. Challenges in the Integration of Omics and Non-Omics Data. Genes (Basel) 2019; 10:genes10030238. [PMID: 30897838 PMCID: PMC6471713 DOI: 10.3390/genes10030238] [Citation(s) in RCA: 50] [Impact Index Per Article: 10.0] [Received: 01/02/2019] [Revised: 03/05/2019] [Accepted: 03/14/2019] [Indexed: 11/16/2022]
Abstract
Omics data integration is already a reality. However, few omics-based algorithms show enough predictive ability to be implemented into clinics or public health domains. Clinical/epidemiological data tend to explain most of the variation of health-related traits, and its joint modeling with omics data is crucial to increase the algorithm’s predictive ability. Only a small number of published studies performed a “real” integration of omics and non-omics (OnO) data, mainly to predict cancer outcomes. Challenges in OnO data integration concern the nature and heterogeneity of non-omics data, the possibility of integrating large-scale non-omics data with high-throughput omics data, the relationship between OnO data (i.e., ascertainment bias), the presence of interactions, the fairness of the models, and the presence of subphenotypes. These challenges demand the development and application of new analysis strategies to integrate OnO data. In this contribution we discuss different attempts at OnO data integration in clinical and epidemiological studies. Most of the reviewed papers considered only one type of omics data set, mainly RNA expression data. All selected papers incorporated non-omics data in a low-dimensionality fashion. The integrative strategies used in the identified papers adopted three modeling methods: independent, conditional, and joint modeling. This review presents, discusses, and proposes integrative analytical strategies towards OnO data integration.
Affiliation(s)
- Evangelina López de Maturana
- Genetic and Molecular Epidemiology Group, Spanish National Cancer Research Centre (CNIO), and CIBERONC, Melchor Fernández Almagro 3, 28029 Madrid, Spain.
| | - Lola Alonso
- Genetic and Molecular Epidemiology Group, Spanish National Cancer Research Centre (CNIO), and CIBERONC, Melchor Fernández Almagro 3, 28029 Madrid, Spain.
| | - Pablo Alarcón
- Genetic and Molecular Epidemiology Group, Spanish National Cancer Research Centre (CNIO), and CIBERONC, Melchor Fernández Almagro 3, 28029 Madrid, Spain.
| | - Isabel Adoración Martín-Antoniano
- Genetic and Molecular Epidemiology Group, Spanish National Cancer Research Centre (CNIO), and CIBERONC, Melchor Fernández Almagro 3, 28029 Madrid, Spain.
| | - Silvia Pineda
- Genetic and Molecular Epidemiology Group, Spanish National Cancer Research Centre (CNIO), and CIBERONC, Melchor Fernández Almagro 3, 28029 Madrid, Spain.
| | - Lucas Piorno
- Genetic and Molecular Epidemiology Group, Spanish National Cancer Research Centre (CNIO), and CIBERONC, Melchor Fernández Almagro 3, 28029 Madrid, Spain.
| | - M Luz Calle
- Biosciences Department, University of Vic-Central University of Catalonia, Carrer de la Laura 13, 08570 Vic, Spain.
| | - Núria Malats
- Genetic and Molecular Epidemiology Group, Spanish National Cancer Research Centre (CNIO), and CIBERONC, Melchor Fernández Almagro 3, 28029 Madrid, Spain.
|
33
|
Adanyeguh IM, Perlbarg V, Henry PG, Rinaldi D, Petit E, Valabregue R, Brice A, Durr A, Mochel F. Autosomal dominant cerebellar ataxias: Imaging biomarkers with high effect sizes. NEUROIMAGE-CLINICAL 2018; 19:858-867. [PMID: 29922574 PMCID: PMC6005808 DOI: 10.1016/j.nicl.2018.06.011] [Citation(s) in RCA: 68] [Impact Index Per Article: 11.3] [Received: 03/01/2018] [Revised: 05/19/2018] [Accepted: 06/07/2018] [Indexed: 12/13/2022]
Abstract
Objective: As gene-based therapies may soon arise for patients with spinocerebellar ataxia (SCA), there is a critical need to identify biomarkers of disease progression with effect sizes greater than clinical scores, enabling trials with smaller sample sizes. Methods: We enrolled a unique cohort of patients with SCA1 (n = 15), SCA2 (n = 12), SCA3 (n = 20) and SCA7 (n = 10) and 24 healthy controls of similar age, sex and body mass index. We collected longitudinal clinical and imaging data at baseline and follow-up (mean interval of 24 months). We performed both manual and automated volumetric analyses. Diffusion tensor imaging (DTI) and a novel tractography method, called fixel-based analysis (FBA), were assessed at follow-up. Effect sizes were calculated for clinical scores and imaging parameters. Results: Clinical scores worsened as atrophy increased over time (p < 0.05). However, atrophy of cerebellum and pons showed very large effect sizes (>1.2) compared to clinical scores (<0.8). FBA, applied for the first time to SCA, was sensitive to microstructural cross-sectional differences that were not captured by conventional DTI metrics, especially in the less studied SCA7 group. FBA also showed larger effect sizes than DTI metrics. Conclusion: This study showed that volumetry outperformed clinical scores to measure disease progression in SCA1, SCA2, SCA3 and SCA7. Therefore, we advocate the use of volumetric biomarkers in therapeutic trials of autosomal dominant ataxias. In addition, FBA showed larger effect size than DTI to detect cross-sectional microstructural alterations in patients relative to controls. Biomarkers are needed to test upcoming therapies for spinocerebellar ataxia. As spinocerebellar ataxias are rare, biomarkers with high effect sizes are needed. We identified imaging biomarkers with higher effect sizes than clinical scores.
Key Words
- Apparent fiber density
- CCFS, composite cerebellar functional severity score
- CFE, connectivity-based fixel enhancement
- CSD, constrained spherical deconvolution
- CST, corticospinal tract
- DTI, diffusion tensor imaging
- Diffusion imaging
- FA, fractional anisotropy
- FBA, fixel-based analysis
- FC, fiber cross-section
- FD, fiber density
- FDC, fiber density and cross-section
- FOD, fiber orientation distribution
- FOV, field of view
- Fixel analysis
- GRAPPA, generalized autocalibrating partial parallel acquisition
- Imaging biomarkers
- MPRAGE, magnetization-prepared rapid gradient-echo
- MRI, magnetic resonance imaging
- RD, radial diffusivity
- SARA, scale for the assessment and rating of ataxia
- SCA, spinocerebellar ataxias
- SNR, signal-to-noise ratio
- Spinocerebellar ataxia
- TBSS, tract-based spatial statistics
- TE, echo time
- TR, repetition time
Affiliation(s)
- Isaac M Adanyeguh
- INSERM U 1127, CNRS UMR 7225, Sorbonne Universités, UPMC Univ Paris 06 UMR S 1127, Institut du Cerveau et de la Moelle épinière, ICM, F-75013 Paris, France
| | - Vincent Perlbarg
- INSERM U 1127, CNRS UMR 7225, Sorbonne Universités, UPMC Univ Paris 06 UMR S 1127, Institut du Cerveau et de la Moelle épinière, ICM, F-75013 Paris, France; Bioinformatics and Biostatistics Core Facility, iCONICS, Institut du Cerveau et de la Moelle épinière, ICM, F-75013 Paris, France
| | - Pierre-Gilles Henry
- Center for Magnetic Resonance Research (CMRR), University of Minnesota, Minneapolis, MN, United States
| | - Daisy Rinaldi
- INSERM U 1127, CNRS UMR 7225, Sorbonne Universités, UPMC Univ Paris 06 UMR S 1127, Institut du Cerveau et de la Moelle épinière, ICM, F-75013 Paris, France
| | - Elodie Petit
- INSERM U 1127, CNRS UMR 7225, Sorbonne Universités, UPMC Univ Paris 06 UMR S 1127, Institut du Cerveau et de la Moelle épinière, ICM, F-75013 Paris, France
| | - Romain Valabregue
- INSERM U 1127, CNRS UMR 7225, Sorbonne Universités, UPMC Univ Paris 06 UMR S 1127, Institut du Cerveau et de la Moelle épinière, ICM, F-75013 Paris, France; Center for NeuroImaging Research (CENIR), Institut du Cerveau et de la Moelle épinière, 75013 Paris, France
| | - Alexis Brice
- INSERM U 1127, CNRS UMR 7225, Sorbonne Universités, UPMC Univ Paris 06 UMR S 1127, Institut du Cerveau et de la Moelle épinière, ICM, F-75013 Paris, France
| | - Alexandra Durr
- INSERM U 1127, CNRS UMR 7225, Sorbonne Universités, UPMC Univ Paris 06 UMR S 1127, Institut du Cerveau et de la Moelle épinière, ICM, F-75013 Paris, France; AP-HP, Pitié-Salpêtrière University Hospital, Department of Genetics, Paris, France
| | - Fanny Mochel
- INSERM U 1127, CNRS UMR 7225, Sorbonne Universités, UPMC Univ Paris 06 UMR S 1127, Institut du Cerveau et de la Moelle épinière, ICM, F-75013 Paris, France; AP-HP, Pitié-Salpêtrière University Hospital, Department of Genetics, Paris, France; University Pierre and Marie Curie, Neurometabolic Research Group, Paris, France.
|
34
|
Exploring patterns enriched in a dataset with contrastive principal component analysis. Nat Commun 2018; 9:2134. [PMID: 29849030 PMCID: PMC5976774 DOI: 10.1038/s41467-018-04608-8] [Citation(s) in RCA: 50] [Impact Index Per Article: 8.3] [Received: 12/05/2017] [Accepted: 04/25/2018] [Indexed: 12/11/2022]
Abstract
Visualization and exploration of high-dimensional data is a ubiquitous challenge across disciplines. Widely used techniques such as principal component analysis (PCA) aim to identify dominant trends in one dataset. However, in many settings we have datasets collected under different conditions, e.g., a treatment and a control experiment, and we are interested in visualizing and exploring patterns that are specific to one dataset. This paper proposes a method, contrastive principal component analysis (cPCA), which identifies low-dimensional structures that are enriched in a dataset relative to comparison data. In a wide variety of experiments, we demonstrate that cPCA with a background dataset enables us to visualize dataset-specific patterns missed by PCA and other standard methods. We further provide a geometric interpretation of cPCA and strong mathematical guarantees. An implementation of cPCA is publicly available, and can be used for exploratory data analysis in many applications where PCA is currently used. Dimensionality reduction and visualization methods lack a principled way of comparing multiple datasets. Here, Abid et al. introduce contrastive PCA, which identifies low-dimensional structures enriched in one dataset compared to another and enables visualization of dataset-specific patterns.
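As an illustration of the idea summarized in this abstract, the following is a minimal sketch of a contrastive PCA step. It assumes the common formulation in which directions are the leading eigenvectors of the target covariance minus a scaled background covariance. The function name `cpca_directions` and the parameter `alpha` (contrast strength) are illustrative choices, and this is not the authors' published implementation.

```python
import numpy as np


def cpca_directions(target, background, alpha=1.0, n_components=2):
    """Return contrastive directions as columns (hypothetical helper).

    Assumes the formulation C_target - alpha * C_background; alpha = 0
    reduces to ordinary PCA on the target data.
    """
    target = np.asarray(target, dtype=float)
    background = np.asarray(background, dtype=float)

    # Sample covariance matrices, with features in columns.
    cov_target = np.cov(target, rowvar=False)
    cov_background = np.cov(background, rowvar=False)

    # Variance shared with the background is penalized by alpha.
    contrast = cov_target - alpha * cov_background

    # eigh handles symmetric matrices and returns ascending eigenvalues,
    # so the eigenvectors are reordered from largest to smallest.
    eigvals, eigvecs = np.linalg.eigh(contrast)
    order = np.argsort(eigvals)[::-1][:n_components]
    return eigvecs[:, order]


# Toy data: both sets vary strongly along feature 0,
# but only the target varies along feature 1.
target = [[-10, -3], [10, -3], [-10, 3], [10, 3]]
background = [[-10, 0], [10, 0], [-10, 0], [10, 0]]

pca_dir = cpca_directions(target, background, alpha=0.0, n_components=1)
cpca_dir = cpca_directions(target, background, alpha=1.0, n_components=1)
print("PCA direction (alpha=0):", pca_dir.ravel())
print("cPCA direction (alpha=1):", cpca_dir.ravel())
```

In this toy case the ordinary PCA direction follows feature 0, which dominates both data sets, while the contrastive direction follows feature 1, which varies only in the target. In practice the value of `alpha` would need to be explored rather than fixed.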
|
35
|
Chong J, Xia J. Computational Approaches for Integrative Analysis of the Metabolome and Microbiome. Metabolites 2017; 7:E62. [PMID: 29156542 PMCID: PMC5746742 DOI: 10.3390/metabo7040062] [Citation(s) in RCA: 55] [Impact Index Per Article: 7.9] [Received: 10/26/2017] [Revised: 11/14/2017] [Accepted: 11/16/2017] [Indexed: 12/31/2022]
Abstract
The study of the microbiome, the totality of all microbes inhabiting the host or an environmental niche, has experienced exponential growth over the past few years. The microbiome contributes functional genes and metabolites, and is an important factor for maintaining health. In this context, metabolomics is increasingly applied to complement sequencing-based approaches (marker genes or shotgun metagenomics) to enable resolution of microbiome-conferred functionalities associated with health. However, analyzing the resulting multi-omics data remains a significant challenge in current microbiome studies. In this review, we provide an overview of different computational approaches that have been used in recent years for integrative analysis of metabolome and microbiome data, ranging from statistical correlation analysis to metabolic network-based modeling approaches. Throughout the process, we strive to present a unified conceptual framework for multi-omics integration and interpretation, as well as point out potential future directions.
Affiliation(s)
- Jasmine Chong
- Institute of Parasitology, McGill University, Montreal, QC H3A 0G4, Canada.
| | - Jianguo Xia
- Institute of Parasitology, McGill University, Montreal, QC H3A 0G4, Canada.
- Department of Animal Science, McGill University, Montreal, QC H3A 0G4, Canada.
|
36
|
Ibrahim EC, Guillemot V, Comte M, Tenenhaus A, Zendjidjian XY, Cancel A, Belzeaux R, Sauvanaud F, Blin O, Frouin V, Fakra E. Modeling a linkage between blood transcriptional expression and activity in brain regions to infer the phenotype of schizophrenia patients. NPJ SCHIZOPHRENIA 2017; 3:25. [PMID: 28883405 PMCID: PMC5589880 DOI: 10.1038/s41537-017-0027-3] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Received: 03/10/2017] [Revised: 07/05/2017] [Accepted: 07/21/2017] [Indexed: 11/20/2022]
Abstract
Hundreds of genetic loci contribute to schizophrenia liability. It is also known that impaired cerebral connectivity is directly related to the cognitive and affective disturbances in schizophrenia. How genetic susceptibility and brain neural networks interact to specify a pathological phenotype in schizophrenia remains elusive. Imaging genetics, highlighting brain variations, has proven effective in establishing links between vulnerability loci and associated clinical traits. As previous imaging genetics works in schizophrenia have essentially focused on structural DNA variants, these findings could be blurred by epigenetic mechanisms taking place during gene expression. We explored the meaningful links between genetic data from peripheral blood tissues on one hand, and regional brain reactivity to emotion task assayed by blood oxygen level-dependent functional magnetic resonance imaging on the other hand, in schizophrenia patients and matched healthy volunteers. We applied Sparse Generalized Canonical Correlation Analysis to identify joint signals between two blocks of variables: (i) the transcriptional expression of 33 candidate genes, and (ii) the blood oxygen level-dependent activity in 16 regions of interest. Results suggested that peripheral transcriptional expression is related to brain imaging variations through a sequential pathway, ending with the schizophrenia phenotype. Generalization of such an approach to larger data sets should thus help in outlining the pathways involved in psychiatric illnesses such as schizophrenia.
IMAGING: SEARCHING FOR LINKS TO AID DIAGNOSIS. Researchers explore links between the expression of genes associated with schizophrenia in blood cells and variations in brain activity during emotion processing. El Chérif Ibrahim and Eric Fakra at Aix-Marseille Université, France, and colleagues have developed a method to relate the expression levels of 33 schizophrenia susceptibility genes in blood cells and functional magnetic resonance imaging (fMRI) data obtained as individuals carry out a task that triggers emotional responses. Although they found no significant differences in the expression of genes between the 26 patients with schizophrenia and 26 healthy controls they examined, variations in activity in the superior temporal gyrus were strongly linked to schizophrenia-associated gene expression and presence of disease. Similar analyses of larger data sets will shed further light on the relationship between peripheral molecular changes and disease-related behaviors and ultimately, aid the diagnosis of neuropsychiatric disease.
Affiliation(s)
- El Chérif Ibrahim
- Aix-Marseille Univ, CNRS, CRN2M, Marseille, France.
- Fondation FondaMental, Fondation de Recherche et de Soins en Santé Mentale, Créteil, France.
- Aix-Marseille Univ, CNRS, INT, Inst Neurosci Timone, Marseille, France.
| | - Vincent Guillemot
- INSERM, U 1127, Paris, France
- CNRS, 7225, Paris, France
- Sorbonne Universités, UPMC Univ Paris 06, UMRS_1127, Paris, France
- ICM, Département des maladies du système nerveux and Département de Génétique, Hôpital Pitié-Salpêtrière, Paris, France
| | - Magali Comte
- Aix-Marseille Univ, CNRS, INT, Inst Neurosci Timone, Marseille, France
| | - Arthur Tenenhaus
- Laboratoire des Signaux et Systèmes (L2S, UMR CNRS 8506), CentraleSupélec-CNRS Université Paris-Sud, Gif-sur-Yvette, France
- Bioinformatics/Biostatistics Platform IHU-A-ICM, Brain and Spine Institute, Paris, France
| | - Xavier Yves Zendjidjian
- Pôle Psychiatrie centre, Hôpital de la Conception, Assistance Publique des Hôpitaux de Marseille, Marseille, France
| | - Aida Cancel
- Aix-Marseille Univ, CNRS, INT, Inst Neurosci Timone, Marseille, France
- Service Hospitalo-Universitaire de Psychiatrie Secteur Saint-Etienne, Hôpital Nord, Saint-Etienne, France
| | - Raoul Belzeaux
- Aix-Marseille Univ, CNRS, CRN2M, Marseille, France
- Fondation FondaMental, Fondation de Recherche et de Soins en Santé Mentale, Créteil, France
- McGill Group for Suicide Studies, Douglas Mental Health University Institute, Department of Psychiatry, McGill University, Montreal, Quebec, Canada
| | - Florence Sauvanaud
- Service Hospitalo-Universitaire de Psychiatrie Secteur Saint-Etienne, Hôpital Nord, Saint-Etienne, France
| | - Olivier Blin
- Aix-Marseille Univ, CNRS, INT, Inst Neurosci Timone, Marseille, France
- CIC-UPCET et Pharmacologie Clinique, Hôpital de la Timone, Assistance Publique des Hôpitaux de Marseille, Marseille, France
| | - Eric Fakra
- Aix-Marseille Univ, CNRS, INT, Inst Neurosci Timone, Marseille, France.
- Service Hospitalo-Universitaire de Psychiatrie Secteur Saint-Etienne, Hôpital Nord, Saint-Etienne, France.
|