1. Liu Y, Ren J, Ma S, Wu C. The spike-and-slab quantile LASSO for robust variable selection in cancer genomics studies. Stat Med 2024; 43:4928-4983. [PMID: 39260448 DOI: 10.1002/sim.10196] [Received: 09/05/2023] [Revised: 05/28/2024] [Accepted: 07/31/2024] [Indexed: 09/13/2024]
Abstract
Data irregularity in cancer genomics studies has been widely observed in the form of outliers and heavy-tailed distributions in the complex traits. In the past decade, robust variable selection methods have emerged as powerful alternatives to the nonrobust ones to identify important genes associated with heterogeneous disease traits and build superior predictive models. In this study, to keep the remarkable features of the quantile LASSO and fully Bayesian regularized quantile regression while overcoming their disadvantage in the analysis of high-dimensional genomics data, we propose the spike-and-slab quantile LASSO through a fully Bayesian spike-and-slab formulation under the robust likelihood by adopting the asymmetric Laplace distribution (ALD). The proposed robust method has inherited the prominent properties of selective shrinkage and self-adaptivity to the sparsity pattern from the spike-and-slab LASSO (Ročková and George, J Am Stat Assoc, 2018, 113(521): 431-444). Furthermore, the spike-and-slab quantile LASSO has a computational advantage in locating the posterior modes via soft-thresholding rule guided Expectation-Maximization (EM) steps in the coordinate descent framework, a phenomenon rarely observed for robust regularization with nondifferentiable loss functions. We have conducted comprehensive simulation studies with a variety of heavy-tailed errors in both homogeneous and heterogeneous model settings to demonstrate the superiority of the spike-and-slab quantile LASSO over its competing methods. The advantage of the proposed method has been further demonstrated in case studies of the lung adenocarcinoma (LUAD) and skin cutaneous melanoma (SKCM) data from The Cancer Genome Atlas (TCGA).
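The abstract leans on two standard building blocks: the quantile (check) loss, whose minimization corresponds to maximizing an ALD likelihood up to scale and constants, and the soft-thresholding rule used inside coordinate descent. The sketch below gives only their textbook definitions. It is not the authors' algorithm, and the function names `check_loss` and `soft_threshold` are hypothetical.

```python
def check_loss(residual: float, tau: float) -> float:
    """Quantile (check) loss: rho_tau(u) = u * (tau - 1[u < 0]).

    tau is the quantile level in (0, 1); tau = 0.5 gives half the
    absolute error, i.e. median regression.
    """
    indicator = 1.0 if residual < 0 else 0.0
    return residual * (tau - indicator)


def soft_threshold(z: float, lam: float) -> float:
    """Soft-thresholding operator: S(z, lam) = sign(z) * max(|z| - lam, 0)."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0  # values within [-lam, lam] are shrunk exactly to zero
```

With `tau = 0.9`, an under-prediction (positive residual) costs nine times as much as an over-prediction of the same size, which is what steers the fit toward the 90th percentile.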
Affiliation(s)
- Yuwen Liu
  - Department of Statistics, Kansas State University, Manhattan, Kansas, USA
- Jie Ren
  - Department of Biostatistics and Health Data Sciences, Indiana University School of Medicine, Indianapolis, Indiana, USA
- Shuangge Ma
  - Department of Biostatistics, Yale University, New Haven, Connecticut, USA
- Cen Wu
  - Department of Statistics, Kansas State University, Manhattan, Kansas, USA
2. Madeira D, Madeira C, Calosi P, Vermandele F, Carrier-Belleau C, Barria-Araya A, Daigle R, Findlay HS, Poisot T. Multilayer biological networks to upscale marine research to global change-smart management and sustainable resource use. The Science of the Total Environment 2024; 944:173837. [PMID: 38866145 DOI: 10.1016/j.scitotenv.2024.173837] [Received: 11/03/2023] [Revised: 05/30/2024] [Accepted: 06/05/2024] [Indexed: 06/14/2024]
Abstract
Human activities are having a massive negative impact on biodiversity and ecological processes worldwide. The rate and magnitude of ecological transformations induced by climate change, habitat destruction, overexploitation and pollution are now so substantial that a sixth mass extinction event is currently underway. The biodiversity crisis of the Anthropocene urges scientists to put forward a transformative vision to promote the conservation of biodiversity, and thus indirectly the preservation of ecosystem functions. Here, we identify pressing issues in global change biology research and propose an integrative framework based on multilayer biological networks as a tool to support conservation actions and marine risk assessments in multi-stressor scenarios. Multilayer networks can integrate different levels of environmental and biotic complexity, enabling us to combine information on molecular, physiological and behaviour responses, species interactions and biotic communities. The ultimate aim of this framework is to link human-induced environmental changes to species physiology, fitness, biogeography and ecosystem impacts across vast seascapes and time frames, to help guide solutions to address biodiversity loss and ecological tipping points. Further, we also define our current ability to adopt a widespread use of multilayer networks within ecology, evolution and conservation by providing examples of case-studies. We also assess which approaches are ready to be transferred and which ones require further development before use. We conclude that multilayer biological networks will be crucial to inform (using reliable multi-levels integrative indicators) stakeholders and support their decision-making concerning the sustainable use of resources and marine conservation.
Affiliation(s)
- Diana Madeira
  - Laboratory for Innovation and Sustainability of Marine Biological Resources (ECOMARE), Centre for Environmental and Marine Studies (CESAM), Department of Biology, University of Aveiro, Aveiro, Portugal
- Carolina Madeira
  - Applied Molecular Biosciences Unit, Department of Life Sciences, School of Science and Technology, NOVA University of Lisbon, Caparica, Portugal; i4HB - Institute for Health and Bioeconomy, School of Science and Technology, NOVA University of Lisbon, Caparica, Portugal
- Piero Calosi
  - Laboratory of Marine Ecological and Evolutionary Physiology, Department of Biology, Chemistry and Geography, University of Quebec in Rimouski, 300 Allée des Ursulines, Rimouski, G5L 3A1, Québec, Canada
- Fanny Vermandele
  - Laboratory of Marine Ecological and Evolutionary Physiology, Department of Biology, Chemistry and Geography, University of Quebec in Rimouski, 300 Allée des Ursulines, Rimouski, G5L 3A1, Québec, Canada
- Aura Barria-Araya
  - Laboratory of Marine Ecological and Evolutionary Physiology, Department of Biology, Chemistry and Geography, University of Quebec in Rimouski, 300 Allée des Ursulines, Rimouski, G5L 3A1, Québec, Canada
- Remi Daigle
  - Bedford Institute of Oceanography, Fisheries and Oceans Canada, Dartmouth, Nova Scotia, Canada; Marine Affairs Program, Dalhousie University, Halifax, Nova Scotia, Canada
- Timothée Poisot
  - Department of Biological Sciences, University of Montreal, Montreal, Canada
3. Hernández-Lemus E, Ochoa S. Methods for multi-omic data integration in cancer research. Front Genet 2024; 15:1425456. [PMID: 39364009 PMCID: PMC11446849 DOI: 10.3389/fgene.2024.1425456] [Received: 04/29/2024] [Accepted: 08/28/2024] [Indexed: 10/05/2024] [Open Access]
Abstract
Multi-omics data integration is a term that refers to the process of combining and analyzing data from different omic experimental sources, such as genomics, transcriptomics, methylation assays, and microRNA sequencing, among others. Such data integration approaches have the potential to provide a more comprehensive functional understanding of biological systems and have numerous applications in areas such as disease diagnosis, prognosis and therapy. However, quantitative integration of multi-omic data is a complex task that requires the use of highly specialized methods and approaches. Here, we discuss a number of data integration methods that have been developed with multi-omics data in view, including statistical methods, machine learning approaches, and network-based approaches. We also discuss the challenges and limitations of such methods and provide examples of their applications in the literature. Overall, this review aims to provide an overview of the current state of the field and highlight potential directions for future research.
Affiliation(s)
- Enrique Hernández-Lemus
  - Computational Genomics Division, National Institute of Genomic Medicine, Mexico City, Mexico
  - Center for Complexity Sciences, Universidad Nacional Autónoma de México, Mexico City, Mexico
- Soledad Ochoa
  - Computational Genomics Division, National Institute of Genomic Medicine, Mexico City, Mexico
  - Department of Obstetrics and Gynecology, Cedars-Sinai Medical Center, Los Angeles, CA, United States
4. Fan K, Subedi S, Yang G, Lu X, Ren J, Wu C. Is Seeing Believing? A Practitioner's Perspective on High-Dimensional Statistical Inference in Cancer Genomics Studies. Entropy (Basel, Switzerland) 2024; 26:794. [PMID: 39330127 PMCID: PMC11430850 DOI: 10.3390/e26090794] [Received: 07/24/2024] [Revised: 08/23/2024] [Accepted: 09/06/2024] [Indexed: 09/28/2024]
Abstract
Variable selection methods have been extensively developed for and applied to cancer genomics data to identify important omics features associated with complex disease traits, including cancer outcomes. However, the reliability and reproducibility of the findings are in question if valid inferential procedures are not available to quantify the uncertainty of the findings. In this article, we provide a gentle but systematic review of high-dimensional frequentist and Bayesian inferential tools under sparse models which can yield uncertainty quantification measures, including confidence (or Bayesian credible) intervals, p values and false discovery rates (FDR). Connections in high-dimensional inferences between the two realms have been fully exploited under the "unpenalized loss function + penalty term" formulation for regularization methods and the "likelihood function × shrinkage prior" framework for regularized Bayesian analysis. In particular, we advocate for robust Bayesian variable selection in cancer genomics studies due to its ability to accommodate disease heterogeneity in the form of heavy-tailed errors and structured sparsity while providing valid statistical inference. The numerical results show that robust Bayesian analysis incorporating exact sparsity has yielded not only superior estimation and identification results but also valid Bayesian credible intervals under nominal coverage probabilities compared with alternative methods, especially in the presence of heavy-tailed model errors and outliers.
Affiliation(s)
- Kun Fan
  - Department of Statistics, Kansas State University, Manhattan, KS 66506, USA
- Srijana Subedi
  - Department of Statistics, Kansas State University, Manhattan, KS 66506, USA
- Gongshun Yang
  - Department of Statistics, Kansas State University, Manhattan, KS 66506, USA
- Xi Lu
  - Department of Pharmaceutical Health Outcomes and Policy, College of Pharmacy, University of Houston, Houston, TX 77204, USA
- Jie Ren
  - Department of Biostatistics and Health Data Sciences, Indiana University School of Medicine, Indianapolis, IN 46202, USA
- Cen Wu
  - Department of Statistics, Kansas State University, Manhattan, KS 66506, USA
5. Zhao S, Qi C, Zhao G, Wang Y, Fu G. A model-free and distribution-free multi-omics integration approach for detecting novel lung adenocarcinoma genes. Sci Rep 2024; 14:17996. [PMID: 39097651 PMCID: PMC11297939 DOI: 10.1038/s41598-023-45813-w] [Received: 06/23/2023] [Accepted: 10/24/2023] [Indexed: 08/05/2024] [Open Access]
Abstract
Detection of important genes affecting lung adenocarcinoma (LUAD) is critical to finding effective therapeutic targets for this highly lethal cancer. However, many existing approaches have focused on single outcomes or phenotypic associations, which may not be as thorough as investigating molecular transcript levels within cells. In this article, we apply a novel multivariate rank-distance correlation-based gene selection procedure (MrDcGene) to LUAD multi-omics data downloaded from The Cancer Genome Atlas (TCGA). MrDcGene provides additional opportunities for detecting novel susceptibility genes as it leverages information from multiple platforms, while efficiently handling challenges such as high dimensionality, low signal-to-noise ratio, unknown distributions, and non-linear structures. Notably, the MrDcGene method is able to detect two different scenarios, i.e., strong association strength with a few gene expressions and weak association strength with several gene expressions. After thoroughly exploring the association between gene expression (GE) and multiple other platforms, including reverse phase protein array (RPPA), miRNA, copy number variation (CNV) and DNA methylation (ME), we detect several novel genes that may play an important role in LUAD (ZNF133, CCDC159, YWHAZ, HNRNPR, ITPR2, PTHLH, and WIPI2). In addition, we quantitatively validate several other susceptibility genes that were reported in the literature using different methods and studies. The accuracy of the MrDcGene approach is theoretically assured and empirically demonstrated by the simulation studies.
Affiliation(s)
- Shaofei Zhao
  - Binghamton University, Department of Mathematics and Statistics, Binghamton, NY, 13902, USA
- Caleb Qi
  - Binghamton University, Department of Mathematics and Statistics, Binghamton, NY, 13902, USA
- Geran Zhao
  - Binghamton University, Department of Mathematics and Statistics, Binghamton, NY, 13902, USA
- Yangsheng Wang
  - Binghamton University, Department of Mathematics and Statistics, Binghamton, NY, 13902, USA
- Guifang Fu
  - Binghamton University, Department of Mathematics and Statistics, Binghamton, NY, 13902, USA
6. Segú H, Jalševac F, Lores M, Beltrán-Debón R, Terra X, Pinent M, Ardévol A, Rodríguez-Gallego E, Blay MT. Intestinal Taste Receptor Expression and Its Implications for Health: An Integrative Analysis in Female Rats after Chronic Insect Supplementation. Journal of Agricultural and Food Chemistry 2024; 72:13929-13942. [PMID: 38857423 PMCID: PMC11191688 DOI: 10.1021/acs.jafc.4c02408] [Received: 03/18/2024] [Revised: 05/27/2024] [Accepted: 06/02/2024] [Indexed: 06/12/2024]
Abstract
Taste receptors are found in the gastrointestinal tract, where they are susceptible to dietary modulation, a key point that is crucial for diet-related responses. Insects are sustainable and good-quality protein sources. This study analyzed the impact of insect consumption on the modulation of taste receptor expression across various segments of the rat intestine under healthy or inflammatory conditions. Female Wistar rats were supplemented with Tenebrio molitor (T) or Alphitobius diaperinus (B), alongside a control group (C), over 21 days under healthy or LPS-induced inflammation. The present study reveals, for the first time, that insect consumption modulates taste receptor gene expression, mainly in the ascending colon. This modulation was not found under inflammation. Integrative analysis revealed colonic Tas1r1 as a key discriminator for insect consumption (C = 1.04 ± 0.32, T = 1.78 ± 0.72, B = 1.99 ± 0.82, p-value <0.05 and 0.01, respectively). Additionally, correlation analysis showed the interplay between intestinal taste receptors and metabolic and inflammatory responses. These findings underscore how insect consumption modulates taste receptors, influencing intestinal function and broader physiological mechanisms.
Affiliation(s)
- Helena Segú
  - MoBioFood Research Group, Departament de Bioquímica i Biotecnologia, Universitat Rovira i Virgili, c/Marcel·lí Domingo n°1, 43007 Tarragona, Spain
- Florijan Jalševac
  - MoBioFood Research Group, Departament de Bioquímica i Biotecnologia, Universitat Rovira i Virgili, c/Marcel·lí Domingo n°1, 43007 Tarragona, Spain
- Mònica Lores
  - MoBioFood Research Group, Departament de Bioquímica i Biotecnologia, Universitat Rovira i Virgili, c/Marcel·lí Domingo n°1, 43007 Tarragona, Spain
- Raúl Beltrán-Debón
  - MoBioFood Research Group, Departament de Bioquímica i Biotecnologia, Universitat Rovira i Virgili, c/Marcel·lí Domingo n°1, 43007 Tarragona, Spain
- Ximena Terra
  - MoBioFood Research Group, Departament de Bioquímica i Biotecnologia, Universitat Rovira i Virgili, c/Marcel·lí Domingo n°1, 43007 Tarragona, Spain
- Montserrat Pinent
  - MoBioFood Research Group, Departament de Bioquímica i Biotecnologia, Universitat Rovira i Virgili, c/Marcel·lí Domingo n°1, 43007 Tarragona, Spain
- Anna Ardévol
  - MoBioFood Research Group, Departament de Bioquímica i Biotecnologia, Universitat Rovira i Virgili, c/Marcel·lí Domingo n°1, 43007 Tarragona, Spain
- Esther Rodríguez-Gallego
  - MoBioFood Research Group, Departament de Bioquímica i Biotecnologia, Universitat Rovira i Virgili, c/Marcel·lí Domingo n°1, 43007 Tarragona, Spain
- Maria Teresa Blay
  - MoBioFood Research Group, Departament de Bioquímica i Biotecnologia, Universitat Rovira i Virgili, c/Marcel·lí Domingo n°1, 43007 Tarragona, Spain
7. Taunk K, Jajula S, Bhavsar PP, Choudhari M, Bhanuse S, Tamhankar A, Naiya T, Kalita B, Rapole S. The prowess of metabolomics in cancer research: current trends, challenges and future perspectives. Mol Cell Biochem 2024:10.1007/s11010-024-05041-w. [PMID: 38814423 DOI: 10.1007/s11010-024-05041-w] [Received: 12/21/2023] [Accepted: 05/18/2024] [Indexed: 05/31/2024]
Abstract
Cancer, due to its heterogeneous nature and high prevalence, has tremendous socioeconomic impacts on populations across the world. Therefore, it is crucial to discover effective panels of biomarkers for diagnosing cancer at an early stage. Cancer leads to alterations in cell growth and differentiation at the molecular level, some of which are unique. Comprehending these alterations can aid in a better understanding of the disease pathology and identification of the biomolecules that can serve as effective biomarkers for cancer diagnosis. Metabolites, among other biomolecules of interest, play a key role in the pathophysiology of cancer; their levels are significantly altered during 'reprogramming the energy metabolism', a cellular condition favored in cancer cells and one of the hallmarks of cancer. Metabolomics, an emerging omics technology, has tremendous potential to contribute towards the goal of investigating cancer metabolites or the metabolic alterations during the development of cancer. Diverse metabolites can be screened in a variety of biofluids and tumor tissues sampled from cancer patients against healthy controls to capture the altered metabolism. In this review, we provide an overview of different metabolomics approaches employed in cancer research and the potential of metabolites as biomarkers for cancer diagnosis. In addition, we discuss the challenges associated with metabolomics-driven cancer research and gaze upon the prospects of this emerging field.
Affiliation(s)
- Khushman Taunk
  - Proteomics Lab, National Centre for Cell Science, Ganeshkhind, Pune, Maharashtra, 411007, India
  - Department of Biotechnology, Maulana Abul Kalam Azad University of Technology, West Bengal, NH12 Simhat, Haringhata, Nadia, West Bengal, 741249, India
- Saikiran Jajula
  - Proteomics Lab, National Centre for Cell Science, Ganeshkhind, Pune, Maharashtra, 411007, India
- Praneeta Pradip Bhavsar
  - Proteomics Lab, National Centre for Cell Science, Ganeshkhind, Pune, Maharashtra, 411007, India
- Mahima Choudhari
  - Proteomics Lab, National Centre for Cell Science, Ganeshkhind, Pune, Maharashtra, 411007, India
- Sadanand Bhanuse
  - Proteomics Lab, National Centre for Cell Science, Ganeshkhind, Pune, Maharashtra, 411007, India
- Anup Tamhankar
  - Department of Surgical Oncology, Deenanath Mangeshkar Hospital and Research Centre, Erandawne, Pune, Maharashtra, 411004, India
- Tufan Naiya
  - Department of Biotechnology, Maulana Abul Kalam Azad University of Technology, West Bengal, NH12 Simhat, Haringhata, Nadia, West Bengal, 741249, India
- Bhargab Kalita
  - Proteomics Lab, National Centre for Cell Science, Ganeshkhind, Pune, Maharashtra, 411007, India
  - Amrita School of Nanosciences and Molecular Medicine, Amrita Institute of Medical Sciences and Research Centre, Amrita Vishwa Vidyapeetham, Ponekkara, Kochi, Kerala, 682041, India
- Srikanth Rapole
  - Proteomics Lab, National Centre for Cell Science, Ganeshkhind, Pune, Maharashtra, 411007, India
8. Peng KW, Klotz A, Guven A, Kapadnis U, Ravipaty S, Tolstikov V, Vemulapalli V, Rodrigues LO, Li H, Kellogg MD, Kausar F, Rees L, Sarangarajan R, Schüle B, Langston W, Narain P, Narain NR, Kiebish MA. Identification and validation of N-acetylputrescine in combination with non-canonical clinical features as a Parkinson's disease biomarker panel. Sci Rep 2024; 14:10036. [PMID: 38693432 PMCID: PMC11063140 DOI: 10.1038/s41598-024-60872-3] [Received: 09/12/2022] [Accepted: 04/29/2024] [Indexed: 05/03/2024] [Open Access]
Abstract
Parkinson's disease is a progressive neurodegenerative disorder in which loss of dopaminergic neurons in the substantia nigra results in a clinically heterogeneous group with variable motor and non-motor symptoms with a degree of misdiagnosis. Only 3-25% of sporadic Parkinson's patients present with genetic abnormalities that could represent a risk factor, thus environmental, metabolic, and other unknown causes contribute to the pathogenesis of Parkinson's disease, which highlights the critical need for biomarkers. In the present study, we prospectively collected and analyzed plasma samples from 194 Parkinson's disease patients and 197 age-matched non-diseased controls. N-acetyl putrescine (NAP) in combination with sense of smell (B-SIT), depression/anxiety (HADS), and acting out dreams (RBD1Q) clinical measurements demonstrated combined diagnostic utility. NAP was increased by 28% in Parkinson's disease patients and exhibited an area under the curve (AUC) of 0.72 as well as an odds ratio (OR) of 4.79. The clinical and NAP panel demonstrated an AUC of 0.9 and an OR of 20.4. The assessed diagnostic panel demonstrates combinatorial utility in diagnosing Parkinson's disease, allowing for an integrated interpretation of disease pathophysiology and highlighting the use of multi-tiered panels in neurological disease diagnosis.
Affiliation(s)
- Kuan-Wei Peng
  - BPGbio, 500 Old Connecticut Path, Framingham, MA, 01701, USA
- Allison Klotz
  - BPGbio, 500 Old Connecticut Path, Framingham, MA, 01701, USA
- Arcan Guven
  - BPGbio, 500 Old Connecticut Path, Framingham, MA, 01701, USA
- Unnati Kapadnis
  - BPGbio, 500 Old Connecticut Path, Framingham, MA, 01701, USA
- Shobha Ravipaty
  - BPGbio, 500 Old Connecticut Path, Framingham, MA, 01701, USA
- Hongyan Li
  - BPGbio, 500 Old Connecticut Path, Framingham, MA, 01701, USA
- Mark D Kellogg
  - BPGbio, 500 Old Connecticut Path, Framingham, MA, 01701, USA
  - Department of Pathology, Harvard Medical School, Boston, MA, USA
  - Department of Laboratory Medicine, Boston Children's Hospital, 300 Longwood Avenue, Boston, MA, 02115, USA
- Farah Kausar
  - Department of Neurology, Weill Institute for Neurosciences, University of California San Francisco, San Francisco, CA, 94158, USA
- Linda Rees
  - Neurocrine Biosciences, San Diego, CA, 92130, USA
- Birgitt Schüle
  - Department of Pathology, Stanford School of Medicine, Stanford, CA, 94305, USA
- William Langston
  - Department of Pathology, Stanford School of Medicine, Stanford, CA, 94305, USA
- Paula Narain
  - BPGbio, 500 Old Connecticut Path, Framingham, MA, 01701, USA
- Niven R Narain
  - BPGbio, 500 Old Connecticut Path, Framingham, MA, 01701, USA
9. Buch G, Schulz A, Schmidtmann I, Strauch K, Wild PS. Interpretability of bi-level variable selection methods. Biom J 2024; 66:e2300063. [PMID: 38519877 DOI: 10.1002/bimj.202300063] [Received: 03/03/2023] [Revised: 01/31/2024] [Accepted: 02/07/2024] [Indexed: 03/25/2024]
Abstract
Variable selection is usually performed to increase interpretability, as sparser models are easier to understand than full models. However, a focus on sparsity is not always suitable, for example, when features are related due to contextual similarities or high correlations. Here, it may be more appropriate to identify groups and their predictive members, a task that can be accomplished with bi-level selection procedures. To investigate whether such techniques lead to increased interpretability, group exponential LASSO (GEL), sparse group LASSO (SGL), composite minimax concave penalty (cMCP), and, as a reference method, the least absolute shrinkage and selection operator (LASSO) were used to select predictors in time-to-event, regression, and classification tasks in bootstrap samples from a cohort of 1001 patients. Different groupings based on prior knowledge, correlation structure, and random assignment were compared in terms of selection relevance, group consistency, and collinearity tolerance. The results show that bi-level selection methods are superior to LASSO in all criteria. The cMCP demonstrated superiority in selection relevance, while SGL was convincing in group consistency. An all-round capacity was achieved by GEL: the approach jointly selected correlated and content-related predictors while maintaining high selection relevance. This method seems recommendable when variables are grouped and interpretation is of primary interest.
Affiliation(s)
- Gregor Buch
  - Preventive Cardiology and Preventive Medicine, Department of Cardiology, University Medical Center of the Johannes Gutenberg University Mainz, Mainz, Germany
  - Institute of Medical Biostatistics, Epidemiology and Informatics (IMBEI), University Medical Center of the Johannes Gutenberg University Mainz, Mainz, Germany
  - German Center for Cardiovascular Research (DZHK), Mainz, Germany
- Andreas Schulz
  - Preventive Cardiology and Preventive Medicine, Department of Cardiology, University Medical Center of the Johannes Gutenberg University Mainz, Mainz, Germany
- Irene Schmidtmann
  - Institute of Medical Biostatistics, Epidemiology and Informatics (IMBEI), University Medical Center of the Johannes Gutenberg University Mainz, Mainz, Germany
- Konstantin Strauch
  - Institute of Medical Biostatistics, Epidemiology and Informatics (IMBEI), University Medical Center of the Johannes Gutenberg University Mainz, Mainz, Germany
- Philipp S Wild
  - Preventive Cardiology and Preventive Medicine, Department of Cardiology, University Medical Center of the Johannes Gutenberg University Mainz, Mainz, Germany
  - German Center for Cardiovascular Research (DZHK), Mainz, Germany
  - Clinical Epidemiology and Systems Medicine, Center for Thrombosis and Hemostasis, University Medical Center of the Johannes Gutenberg University Mainz, Mainz, Germany
  - Institute of Molecular Biology (IMB), Mainz, Germany
10. Liu W, Pratte KA, Castaldi PJ, Hersh C, Bowler RP, Banaei-Kashani F, Kechris KJ. A Generalized Higher-order Correlation Analysis Framework for Multi-Omics Network Inference. bioRxiv: The Preprint Server for Biology 2024:2024.01.22.576667. [PMID: 38328226 PMCID: PMC10849540 DOI: 10.1101/2024.01.22.576667] [Indexed: 02/09/2024]
Abstract
Multiple -omics (genomics, proteomics, etc.) profiles are commonly generated to gain insight into a disease or physiological system. Constructing multi-omics networks with respect to the trait(s) of interest provides an opportunity to understand relationships between molecular features but integration is challenging due to multiple data sets with high dimensionality. One approach is to use canonical correlation to integrate one or two omics types and a single trait of interest. However, these types of methods may be limited due to (1) not accounting for higher-order correlations existing among features, (2) computational inefficiency when extending to more than two omics data when using a penalty term-based sparsity method, and (3) lack of flexibility for focusing on specific correlations (e.g., omics-to-phenotype correlation versus omics-to-omics correlations). In this work, we have developed a novel multi-omics network analysis pipeline called Sparse Generalized Tensor Canonical Correlation Analysis Network Inference (SGTCCA-Net) that can effectively overcome these limitations. We also introduce an implementation to improve the summarization of networks for downstream analyses. Simulation and real-data experiments demonstrate the effectiveness of our novel method for inferring omics networks and features of interest.
Affiliation(s)
- Weixuan Liu
  - Department of Biostatistics and Informatics, University of Colorado Anschutz Medical Campus, Aurora, CO, USA
- Peter J. Castaldi
  - Channing Division of Network Medicine, Department of Medicine, Brigham and Women’s Hospital, Boston, United States
- Craig Hersh
  - Channing Division of Network Medicine, Department of Medicine, Brigham and Women’s Hospital, Boston, United States
- Russell P. Bowler
  - Division of Pulmonary Medicine, Department of Medicine, National Jewish Health, Denver, CO, USA
- Farnoush Banaei-Kashani
  - Department of Computer Science and Engineering, College of Engineering, Design and Computing, University of Colorado Denver, Denver, CO, USA
- Katerina J. Kechris
  - Department of Biostatistics and Informatics, University of Colorado Anschutz Medical Campus, Aurora, CO, USA
11. Miao G, Yu L, Yang J, Bennett DA, Zhao J, Wu SS. Learning from vertically distributed data across multiple sites: An efficient privacy-preserving algorithm for Cox proportional hazards model with variable selection. J Biomed Inform 2024; 149:104581. [PMID: 38142903 PMCID: PMC10996392 DOI: 10.1016/j.jbi.2023.104581] [Received: 05/21/2023] [Revised: 08/24/2023] [Accepted: 12/19/2023] [Indexed: 12/26/2023]
Abstract
OBJECTIVE: To develop a lossless distributed algorithm for regularized Cox proportional hazards model with variable selection to support federated learning for vertically distributed data.
METHODS: We propose a novel distributed algorithm for fitting regularized Cox proportional hazards model when data sharing among different data providers is restricted. Based on cyclical coordinate descent, the proposed algorithm computes intermediary statistics by each site and then exchanges them to update the model parameters in other sites without accessing individual patient-level data. We evaluate the performance of the proposed algorithm with (1) a simulation study and (2) a real-world data analysis predicting the risk of Alzheimer's dementia from the Religious Orders Study and Rush Memory and Aging Project (ROSMAP). Moreover, we compared the performance of our method with existing privacy-preserving models.
RESULTS: Our algorithm achieves privacy-preserving variable selection for time-to-event data in the vertically distributed setting, without degradation of accuracy compared with a centralized approach. Simulation demonstrates that our algorithm is highly efficient in analyzing high-dimensional datasets. Real-world data analysis reveals that our distributed Cox model yields higher accuracy in predicting the risk of Alzheimer's dementia than the conventional Cox model built by each data provider without data sharing. Moreover, our algorithm is computationally more efficient compared with existing privacy-preserving Cox models with or without regularization term.
CONCLUSION: The proposed algorithm is lossless, privacy-preserving and highly efficient to fit regularized Cox model for vertically distributed data. It provides a suitable and convenient approach for modeling time-to-event data in a distributed manner.
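To make the idea of cyclical coordinate descent over vertically split features concrete, here is a deliberately simplified sketch. It swaps the Cox partial likelihood for a squared-error LASSO objective, (1/2n)·||y − Xβ||² + λ·||β||₁, and lets sites pass a shared residual vector between them. Both choices are assumptions made for illustration and are not the protocol or the intermediary statistics described in the paper. The names `Site`, `fit`, and `soft_threshold` are hypothetical.

```python
def soft_threshold(z, lam):
    """S(z, lam) = sign(z) * max(|z| - lam, 0)."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


class Site:
    """One data provider holding a private block of feature columns."""

    def __init__(self, columns):
        self.columns = columns            # list of columns, each of length n
        self.beta = [0.0] * len(columns)  # coefficients stay at the site

    def update(self, residual, lam):
        """Run one coordinate-descent pass over this site's features.

        Only the residual vector (y minus the current fit) enters and
        leaves the site; the raw feature columns are never shared.
        """
        n = len(residual)
        for j, x in enumerate(self.columns):
            # Correlation of x_j with the partial residual, which adds
            # back this feature's own current contribution.
            rho = sum(xi * (ri + xi * self.beta[j])
                      for xi, ri in zip(x, residual)) / n
            scale = sum(xi * xi for xi in x) / n
            new_beta = soft_threshold(rho, lam) / scale
            delta = new_beta - self.beta[j]
            residual = [ri - xi * delta for xi, ri in zip(x, residual)]
            self.beta[j] = new_beta
        return residual


def fit(sites, y, lam, n_sweeps=100):
    """Cycle through the sites, passing the residual from one to the next."""
    residual = list(y)  # all coefficients start at zero
    for _ in range(n_sweeps):
        for site in sites:
            residual = site.update(residual, lam)
    return [site.beta for site in sites]
```

Because each coordinate update needs only the current residual and the site's own column, the sequence of updates matches what a single machine holding all columns would compute. That is the sense in which a distributed scheme of this kind can be lossless.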
Affiliation(s)
- Guanhong Miao
  - Department of Epidemiology, College of Public Health & Health Professions and College of Medicine, University of Florida, Gainesville, FL, USA; Center for Genetic Epidemiology and Bioinformatics, University of Florida, Gainesville, FL, USA; Department of Biostatistics, College of Public Health & Health Professions and College of Medicine, University of Florida, Gainesville, FL, USA
- Lei Yu
  - Rush Alzheimer's Disease Center & Department of Neurological Sciences, Rush University Medical Center, Chicago, IL, USA
- Jingyun Yang
  - Rush Alzheimer's Disease Center & Department of Neurological Sciences, Rush University Medical Center, Chicago, IL, USA
- David A Bennett
  - Rush Alzheimer's Disease Center & Department of Neurological Sciences, Rush University Medical Center, Chicago, IL, USA
- Jinying Zhao
  - Department of Epidemiology, College of Public Health & Health Professions and College of Medicine, University of Florida, Gainesville, FL, USA; Center for Genetic Epidemiology and Bioinformatics, University of Florida, Gainesville, FL, USA
- Samuel S Wu
  - Department of Biostatistics, College of Public Health & Health Professions and College of Medicine, University of Florida, Gainesville, FL, USA
12
San Valentin EMD, Do KA, Yeung SCJ, Reyes-Gibby CC. Attempts to Understand Oral Mucositis in Head and Neck Cancer Patients through Omics Studies: A Narrative Review. Int J Mol Sci 2023; 24:16995. [PMID: 38069314 PMCID: PMC10706892 DOI: 10.3390/ijms242316995] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/06/2023] [Revised: 11/27/2023] [Accepted: 11/29/2023] [Indexed: 12/18/2023] Open
Abstract
Oral mucositis (OM) is a common and clinically impactful side effect of cytotoxic cancer treatment, particularly in patients with head and neck squamous cell carcinoma (HNSCC) who undergo radiotherapy with or without concomitant chemotherapy. The etiology and pathogenic mechanisms of OM are complex, multifaceted and elicit both direct and indirect damage to the mucosa. In this narrative review, we describe studies that use various omics methodologies (genomics, transcriptomics, microbiomics and metabolomics) in attempts to elucidate the biological pathways associated with the development or severity of OM. Integrating different omics into multi-omics approaches carries the potential to discover links among host factors (genomics), host responses (transcriptomics, metabolomics), and the local environment (microbiomics).
Affiliation(s)
- Erin Marie D. San Valentin
  - Department of Emergency Medicine, The University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA
  - Department of Interventional Radiology, The University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA
- Kim-Anh Do
  - Department of Biostatistics, The University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA
- Sai-Ching J. Yeung
  - Department of Emergency Medicine, The University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA
- Cielito C. Reyes-Gibby
  - Department of Emergency Medicine, The University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA
  - Department of Biostatistics, The University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA
13
Zhou F, Ren J, Ma S, Wu C. The Bayesian Regularized Quantile Varying Coefficient Model. Comput Stat Data Anal 2023; 187:107808. [PMID: 38746689 PMCID: PMC11090482 DOI: 10.1016/j.csda.2023.107808] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/19/2024]
Abstract
The quantile varying coefficient (VC) model can flexibly capture dynamical patterns of regression coefficients. In addition, due to the quantile check loss function, it is robust against outliers and heavy-tailed distributions of the response variable, and can provide a more comprehensive picture of modeling via exploring the conditional quantiles of the response variable. Although extensive studies have been conducted to examine variable selection for the high-dimensional quantile varying coefficient models, the Bayesian analysis has been rarely developed. The Bayesian regularized quantile varying coefficient model has been proposed to incorporate robustness against data heterogeneity while accommodating the non-linear interactions between the effect modifier and predictors. Selecting important varying coefficients can be achieved through Bayesian variable selection. Incorporating the multivariate spike-and-slab priors further improves performance by inducing exact sparsity. The Gibbs sampler has been derived to conduct efficient posterior inference of the sparse Bayesian quantile VC model through Markov chain Monte Carlo (MCMC). The merit of the proposed model in selection and estimation accuracy over the alternatives has been systematically investigated in simulation under specific quantile levels and multiple heavy-tailed model errors. In the case study, the proposed model leads to identification of biologically sensible markers in a non-linear gene-environment interaction study using the NHS data.
Affiliation(s)
- Fei Zhou
  - Department of Statistics, Kansas State University, Manhattan, KS
- Jie Ren
  - Department of Biostatistics, Indiana University School of Medicine, Indianapolis, IN
- Shuangge Ma
  - Department of Biostatistics, Yale University, New Haven, CT
- Cen Wu
  - Department of Statistics, Kansas State University, Manhattan, KS
14
Wang C. Optimization of sports effect evaluation technology from random forest algorithm and elastic network algorithm. PLoS One 2023; 18:e0292557. [PMID: 37862380 PMCID: PMC10588863 DOI: 10.1371/journal.pone.0292557] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/28/2023] [Accepted: 09/23/2023] [Indexed: 10/22/2023] Open
Abstract
This study leverages advanced data mining and machine learning techniques to delve deeper into the impact of sports activities on physical health and provide a scientific foundation for informed sports selection and health promotion. Guided by the Elastic Net algorithm, a sports performance assessment model is meticulously constructed. In contrast to the conventional Least Absolute Shrinkage and Selection Operator (Lasso) algorithm, this model seeks to elucidate the factors influencing physical health indicators due to sports activities. Additionally, the incorporation of the Random Forest algorithm facilitates a comprehensive evaluation of sports performance across distinct dimensions: wrestling-type sports, soccer-type sports, skill-based sports, and school physical education. Employing the Top-K criterion for evaluation and juxtaposing it with the high-performance Support Vector Machine (SVM) algorithm, the accuracy is scrutinized under three distinct criteria: Top-3, Top-5, and Top-10. The pivotal innovation of this study resides in the amalgamation of the Elastic Net and Random Forest algorithms, permitting a holistic contemplation of the influencing factors of diverse sports activities on physical health indicators. Through this integrated methodology, the research achieves a more precise assessment of the effects of sports activities, unveiling a range of impacts various sports have on physical health. Consequently, a more refined assessment tool for sports performance detection and health development is established. Capitalizing on the Elastic Net algorithm, this research optimizes model construction during the pivotal feature selection phase, effectively capturing the crucial influencing factors associated with different sports activities. 
Concurrently, the integration of the Random Forest algorithm augments the predictive prowess of the model, enabling the sports performance assessment model to comprehensively unveil the extent of impact stemming from various sports activities. This study stands as a noteworthy contribution to the arena of sports performance assessment, offering substantial insights and advancements to both sports health and research methodologies.
Affiliation(s)
- Caixia Wang
  - Department of Primary Education, Jiaozuo Normal College, Jiaozuo, Henan, China
15
Downing T, Angelopoulos N. A primer on correlation-based dimension reduction methods for multi-omics analysis. J R Soc Interface 2023; 20:20230344. [PMID: 37817584 PMCID: PMC10565429 DOI: 10.1098/rsif.2023.0344] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/15/2023] [Accepted: 09/19/2023] [Indexed: 10/12/2023] Open
Abstract
The continuing advances of omic technologies mean that it is now more tangible to measure the numerous features collectively reflecting the molecular properties of a sample. When multiple omic methods are used, statistical and computational approaches can exploit these large, connected profiles. Multi-omics is the integration of different omic data sources from the same biological sample. In this review, we focus on correlation-based dimension reduction approaches for single omic datasets, followed by methods for pairs of omics datasets, before detailing further techniques for three or more omic datasets. We also briefly detail network methods when three or more omic datasets are available and which complement correlation-oriented tools. To aid readers new to this area, these are all linked to relevant R packages that can implement these procedures. Finally, we discuss scenarios of experimental design and present road maps that simplify the selection of appropriate analysis methods. This review will help researchers navigate emerging methods for multi-omics and integrating diverse omic datasets appropriately. This raises the opportunity of implementing population multi-omics with large sample sizes as omics technologies and our understanding improve.
Affiliation(s)
- Tim Downing
  - Pirbright Institute, Pirbright, Surrey, UK
  - Department of Biotechnology, Dublin City University, Dublin, Ireland
16
Joshi AD, Rahnavard A, Kachroo P, Mendez KM, Lawrence W, Julián-Serrano S, Hua X, Fuller H, Sinnott-Armstrong N, Tabung FK, Shutta KH, Raffield LM, Darst BF. An epidemiological introduction to human metabolomic investigations. Trends Endocrinol Metab 2023; 34:505-525. [PMID: 37468430 PMCID: PMC10527234 DOI: 10.1016/j.tem.2023.06.006] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 01/10/2023] [Revised: 06/17/2023] [Accepted: 06/19/2023] [Indexed: 07/21/2023]
Abstract
Metabolomics holds great promise for uncovering insights around biological processes impacting disease in human epidemiological studies. Metabolites can be measured across biological samples, including plasma, serum, saliva, urine, stool, and whole organs and tissues, offering a means to characterize metabolic processes relevant to disease etiology and traits of interest. Metabolomic epidemiology studies face unique challenges, such as identifying metabolites from targeted and untargeted assays, defining standards for quality control, harmonizing results across platforms that often capture different metabolites, and developing statistical methods for high-dimensional and correlated metabolomic data. In this review, we introduce metabolomic epidemiology to the broader scientific community, discuss opportunities and challenges presented by these studies, and highlight emerging innovations that hold promise to uncover new biological insights.
Affiliation(s)
- Amit D Joshi
  - Clinical & Translational Epidemiology Unit, Massachusetts General Hospital, Boston, MA, USA
- Ali Rahnavard
  - Computational Biology Institute, Department of Biostatistics and Bioinformatics, Milken Institute School of Public Health, The George Washington University, Washington, DC, USA
- Priyadarshini Kachroo
  - Channing Division of Network Medicine, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- Kevin M Mendez
  - Channing Division of Network Medicine, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA
- Wayne Lawrence
  - Division of Cancer Epidemiology and Genetics, National Cancer Institute, National Institutes of Health, Bethesda, MD, USA
- Sachelly Julián-Serrano
  - Division of Cancer Epidemiology and Genetics, National Cancer Institute, National Institutes of Health, Bethesda, MD, USA; Department of Public Health, University of Massachusetts Lowell, Lowell, MA, USA
- Xinwei Hua
  - Clinical & Translational Epidemiology Unit, Massachusetts General Hospital, Boston, MA, USA; Department of Cardiology, Peking University Third Hospital, Beijing, China
- Harriett Fuller
  - Public Health Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, WA, USA
- Nasa Sinnott-Armstrong
  - Public Health Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, WA, USA
- Fred K Tabung
  - The Ohio State University College of Medicine and Comprehensive Cancer Center, Columbus, OH, USA
- Katherine H Shutta
  - Channing Division of Network Medicine, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA, USA; Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, USA
- Laura M Raffield
  - Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
- Burcu F Darst
  - Public Health Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, WA, USA
17
Maiorino E, Loscalzo J. Phenomics and Robust Multiomics Data for Cardiovascular Disease Subtyping. Arterioscler Thromb Vasc Biol 2023; 43:1111-1123. [PMID: 37226730 PMCID: PMC10330619 DOI: 10.1161/atvbaha.122.318892] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 12/12/2022] [Accepted: 05/10/2023] [Indexed: 05/26/2023]
Abstract
The complex landscape of cardiovascular diseases encompasses a wide range of related pathologies arising from diverse molecular mechanisms and exhibiting heterogeneous phenotypes. This variety of manifestations poses significant challenges in the development of treatment strategies. The increasing availability of precise phenotypic and multiomics data of cardiovascular disease patient populations has spurred the development of a variety of computational disease subtyping techniques to identify distinct subgroups with unique underlying pathogeneses. In this review, we outline the essential components of computational approaches to select, integrate, and cluster omics and clinical data in the context of cardiovascular disease research. We delve into the challenges faced during different stages of the analysis, including feature selection and extraction, data integration, and clustering algorithms. Next, we highlight representative applications of subtyping pipelines in heart failure and coronary artery disease. Finally, we discuss the current challenges and future directions in the development of robust subtyping approaches that can be implemented in clinical workflows, ultimately contributing to the ongoing evolution of precision medicine in health care.
Affiliation(s)
- Enrico Maiorino
  - Channing Division of Network Medicine, Department of Medicine, Brigham and Women’s Hospital, Harvard Medical School, Boston, Massachusetts
- Joseph Loscalzo
  - Channing Division of Network Medicine, Department of Medicine, Brigham and Women’s Hospital, Harvard Medical School, Boston, Massachusetts
  - Division of Cardiovascular Medicine, Department of Medicine, Brigham and Women’s Hospital, Harvard Medical School, Boston, Massachusetts
18
Barzegar Behrooz A, Latifi-Navid H, da Silva Rosa SC, Swiat M, Wiechec E, Vitorino C, Vitorino R, Jamalpoor Z, Ghavami S. Integrating Multi-Omics Analysis for Enhanced Diagnosis and Treatment of Glioblastoma: A Comprehensive Data-Driven Approach. Cancers (Basel) 2023; 15:3158. [PMID: 37370767 DOI: 10.3390/cancers15123158] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/18/2023] [Revised: 06/06/2023] [Accepted: 06/07/2023] [Indexed: 06/29/2023] Open
Abstract
The most aggressive primary malignant brain tumor in adults is glioblastoma (GBM), which has poor overall survival (OS). There is a high relapse rate among patients with GBM despite maximally safe surgery, radiation therapy, temozolomide (TMZ), and aggressive treatment. Hence, there is an urgent and unmet clinical need for new approaches to managing GBM. The current study identified modules (MYC, EGFR, PIK3CA, SUZ12, and SPRK2) involved in GBM disease through the NeDRex plugin. Furthermore, hub genes were identified in a comprehensive interaction network containing 7560 proteins related to GBM disease and 3860 proteins associated with signaling pathways involved in GBM. By integrating the results of the analyses mentioned above and again performing centrality analysis, eleven key genes involved in GBM disease were identified. ProteomicsDB and Gliovis databases were used for determining the gene expression in normal and tumor brain tissue. The NetworkAnalyst and the mGWAS-Explorer tools identified miRNAs, SNPs, and metabolites associated with these 11 genes. Moreover, a literature review of recent studies revealed other lists of metabolites related to GBM disease. The enrichment analysis of identified genes, miRNAs, and metabolites associated with GBM disease was performed using ExpressAnalyst, miEAA, and MetaboAnalyst tools. Further investigation of metabolite roles in GBM was performed using pathway, joint pathway, and network analyses. The results of this study allowed us to identify 11 genes (UBC, HDAC1, CTNNB1, TRIM28, CSNK2A1, RBBP4, TP53, APP, DAB1, PINK1, and RELN), five miRNAs (hsa-mir-221-3p, hsa-mir-30a-5p, hsa-mir-15a-5p, hsa-mir-130a-3p, and hsa-let-7b-5p), six metabolites (HDL, N6-acetyl-L-lysine, cholesterol, formate, N, N-dimethylglycine/xylose, and X2. piperidinone) and 15 distinct signaling pathways that play an indispensable role in GBM disease development. 
The identified top genes, miRNAs, and metabolite signatures can be targeted to establish early diagnostic methods and plan personalized GBM treatment strategies.
Affiliation(s)
- Amir Barzegar Behrooz
  - Trauma Research Center, Aja University of Medical Sciences, Tehran 14117-18541, Iran
- Hamid Latifi-Navid
  - Department of Molecular Medicine, National Institute of Genetic Engineering and Biotechnology, Tehran 14977-16316, Iran
- Simone C da Silva Rosa
  - Department of Human Anatomy and Cell Science, University of Manitoba College of Medicine, Winnipeg, MB R3E 3P5, Canada
- Maciej Swiat
  - Faculty of Medicine in Zabrze, University of Technology in Katowice, 41-800 Zabrze, Poland
- Emilia Wiechec
  - Division of Cell Biology, Department of Biomedical and Clinical Sciences, Linköping University, 58185 Linköping, Sweden
- Carla Vitorino
  - Coimbra Chemistry Coimbra, Institute of Molecular Sciences-IMS, Department of Chemistry, University of Coimbra, 3000-456 Coimbra, Portugal
  - Faculty of Pharmacy, University of Coimbra, 3000-456 Coimbra, Portugal
- Rui Vitorino
  - Department of Medical Sciences, Institute of Biomedicine iBiMED, University of Aveiro, 3810-193 Aveiro, Portugal
  - UnIC, Department of Surgery and Physiology, Faculty of Medicine, University of Porto, 4200-319 Porto, Portugal
- Zahra Jamalpoor
  - Trauma Research Center, Aja University of Medical Sciences, Tehran 14117-18541, Iran
- Saeid Ghavami
  - Department of Human Anatomy and Cell Science, University of Manitoba College of Medicine, Winnipeg, MB R3E 3P5, Canada
  - Faculty of Medicine in Zabrze, University of Technology in Katowice, 41-800 Zabrze, Poland
  - Biology of Breathing Theme, Children Hospital Research Institute of Manitoba, University of Manitoba, Winnipeg, MB R3T 2N2, Canada
  - Research Institute of Oncology and Hematology, Cancer Care Manitoba-University of Manitoba, Winnipeg, MB R3T 2N2, Canada
19
Li X, Teng T, Yan W, Fan L, Liu X, Clarke G, Zhu D, Jiang Y, Xiang Y, Yu Y, Zhang Y, Yin B, Lu L, Zhou X, Xie P. AKT and MAPK signaling pathways in hippocampus reveals the pathogenesis of depression in four stress-induced models. Transl Psychiatry 2023; 13:200. [PMID: 37308476 DOI: 10.1038/s41398-023-02486-3] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 08/08/2022] [Revised: 05/06/2023] [Accepted: 05/26/2023] [Indexed: 06/14/2023] Open
Abstract
Major depressive disorder (MDD) is a highly heterogeneous psychiatric disorder. The pathogenesis of MDD remains unclear, and it may be associated with exposure to different stressors. Most previous studies have focused on molecular changes in a single stress-induced depression model, which has limited the identification of the pathogenesis of MDD. Depressive-like behaviors were induced in rats by four well-validated stress models: chronic unpredictable mild stress, learned helplessness stress, chronic restraint stress and social defeat stress. We applied proteomics and metabolomics to investigate molecular changes in the hippocampus of those four models and revealed 529 proteins and 98 metabolites. Ingenuity Pathways Analysis (IPA) and Kyoto Encyclopedia of Genes and Genomes (KEGG) analysis identified differentially regulated canonical pathways, and then we presented a schematic model that simulates the AKT and MAPK signaling pathway network and their interactions and revealed the cascade reactions. Further, western blotting confirmed that p-AKT, p-ERK12, GluA1, p-MEK1, p-MEK2, p-P38, Syn1, and TrkB were changed in at least one depression model. Importantly, p-AKT, p-ERK12, p-MEK1 and p-P38 were identified as common alterations in the four depression models. The molecular-level changes caused by different stressors may be dramatically different, and even opposite, among the four depression models. However, the different molecular alterations converge on a common AKT and MAPK molecular pathway. Further studies of these pathways could contribute to a better understanding of the pathogenesis of depression, with the ultimate goal of helping to develop or select more effective treatment strategies for MDD.
Affiliation(s)
- Xuemei Li
  - Department of Psychiatry, The First Affiliated Hospital of Chongqing Medical University, Chongqing, China
  - NHC Key Laboratory of Diagnosis and Treatment on Brain Functional Diseases, The First Affiliated Hospital of Chongqing Medical University, Chongqing, China
- Teng Teng
  - Department of Psychiatry, The First Affiliated Hospital of Chongqing Medical University, Chongqing, China
  - NHC Key Laboratory of Diagnosis and Treatment on Brain Functional Diseases, The First Affiliated Hospital of Chongqing Medical University, Chongqing, China
- Wei Yan
  - Peking University Sixth Hospital, Peking University Institute of Mental Health, NHC Key Laboratory of Mental Health (Peking University), National Clinical Research Center for Mental Disorders (Peking University Sixth Hospital), Beijing, China
- Li Fan
  - NHC Key Laboratory of Diagnosis and Treatment on Brain Functional Diseases, The First Affiliated Hospital of Chongqing Medical University, Chongqing, China
  - Department of Neurology, The First Affiliated Hospital of Chongqing Medical University, Chongqing, China
- Xueer Liu
  - NHC Key Laboratory of Diagnosis and Treatment on Brain Functional Diseases, The First Affiliated Hospital of Chongqing Medical University, Chongqing, China
  - Department of Neurology, The First Affiliated Hospital of Chongqing Medical University, Chongqing, China
- Gerard Clarke
  - Department of Psychiatry and Neurobehavioural Science, University College Cork, Cork, Ireland
  - APC Microbiome Ireland, University College Cork, Cork, Ireland
- Dan Zhu
  - Department of Neurology, The First Affiliated Hospital of Chongqing Medical University, Chongqing, China
- Yuanliang Jiang
  - Department of Psychiatry, The First Affiliated Hospital of Chongqing Medical University, Chongqing, China
  - NHC Key Laboratory of Diagnosis and Treatment on Brain Functional Diseases, The First Affiliated Hospital of Chongqing Medical University, Chongqing, China
- Yajie Xiang
  - NHC Key Laboratory of Diagnosis and Treatment on Brain Functional Diseases, The First Affiliated Hospital of Chongqing Medical University, Chongqing, China
  - Department of Neurology, The First Affiliated Hospital of Chongqing Medical University, Chongqing, China
- Ying Yu
  - NHC Key Laboratory of Diagnosis and Treatment on Brain Functional Diseases, The First Affiliated Hospital of Chongqing Medical University, Chongqing, China
  - Department of Neurology, The First Affiliated Hospital of Chongqing Medical University, Chongqing, China
- Yuqing Zhang
  - NHC Key Laboratory of Diagnosis and Treatment on Brain Functional Diseases, The First Affiliated Hospital of Chongqing Medical University, Chongqing, China
  - Department of Neurology, The Second Affiliated Hospital of Chongqing Medical University, Chongqing, China
- Bangmin Yin
  - Department of Psychiatry, The First Affiliated Hospital of Chongqing Medical University, Chongqing, China
  - NHC Key Laboratory of Diagnosis and Treatment on Brain Functional Diseases, The First Affiliated Hospital of Chongqing Medical University, Chongqing, China
- Lin Lu
  - Peking University Sixth Hospital, Peking University Institute of Mental Health, NHC Key Laboratory of Mental Health (Peking University), National Clinical Research Center for Mental Disorders (Peking University Sixth Hospital), Beijing, China
- Xinyu Zhou
  - Department of Psychiatry, The First Affiliated Hospital of Chongqing Medical University, Chongqing, China
  - NHC Key Laboratory of Diagnosis and Treatment on Brain Functional Diseases, The First Affiliated Hospital of Chongqing Medical University, Chongqing, China
- Peng Xie
  - NHC Key Laboratory of Diagnosis and Treatment on Brain Functional Diseases, The First Affiliated Hospital of Chongqing Medical University, Chongqing, China
  - Department of Neurology, The First Affiliated Hospital of Chongqing Medical University, Chongqing, China
20
Zhou F, Liu Y, Ren J, Wang W, Wu C. Springer: An R package for bi-level variable selection of high-dimensional longitudinal data. Front Genet 2023; 14:1088223. [PMID: 37091810 PMCID: PMC10117642 DOI: 10.3389/fgene.2023.1088223] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/03/2022] [Accepted: 02/28/2023] [Indexed: 04/09/2023] Open
Abstract
In high-dimensional data analysis, bi-level (or sparse group) variable selection can simultaneously conduct penalization on the group level and within groups, and has been developed for continuous, binary, and survival responses in the literature. Zhou et al. (2022) (PMID: 35766061) further extended it to longitudinal responses by proposing a quadratic inference function (QIF)-based penalization method in gene-environment interaction studies. This study introduces "springer," an R package implementing bi-level variable selection within the QIF framework developed in Zhou et al. (2022). In addition, the R package "springer" also implements the generalized estimating equation-based sparse group penalization method. Alternative methods focusing only on the group level or individual level are also provided by the package. In this study, we systematically introduce the longitudinal penalization methods implemented in the "springer" package. We demonstrate the usage of the core and supporting functions, followed by numerical examples and discussions. The R package "springer" is available at https://cran.r-project.org/package=springer.
Affiliation(s)
- Fei Zhou
  - Department of Statistics, Kansas State University, Manhattan, KS, United States
- Yuwen Liu
  - Department of Statistics, Kansas State University, Manhattan, KS, United States
- Jie Ren
  - Department of Biostatistics and Health Data Sciences, Indiana University School of Medicine, Indianapolis, IN, United States
- Weiqun Wang
  - Department of Food, Nutrition, Dietetics and Health, Kansas State University, Manhattan, KS, United States
- Cen Wu
  - Department of Statistics, Kansas State University, Manhattan, KS, United States
21
Donovan SM, Aghaeepour N, Andres A, Azad MB, Becker M, Carlson SE, Järvinen KM, Lin W, Lönnerdal B, Slupsky CM, Steiber AL, Raiten DJ. Evidence for human milk as a biological system and recommendations for study design-a report from "Breastmilk Ecology: Genesis of Infant Nutrition (BEGIN)" Working Group 4. Am J Clin Nutr 2023; 117 Suppl 1:S61-S86. [PMID: 37173061 PMCID: PMC10356565 DOI: 10.1016/j.ajcnut.2022.12.021] [Citation(s) in RCA: 10] [Impact Index Per Article: 10.0] [Received: 06/10/2022] [Revised: 12/06/2022] [Accepted: 12/08/2022] [Indexed: 05/15/2023] Open
Abstract
Human milk contains all of the essential nutrients required by the infant within a complex matrix that enhances the bioavailability of many of those nutrients. In addition, human milk is a source of bioactive components, living cells and microbes that facilitate the transition to life outside the womb. Our ability to fully appreciate the importance of this matrix relies on the recognition of short- and long-term health benefits and, as highlighted in previous sections of this supplement, its ecology (i.e., interactions among the lactating parent and breastfed infant as well as within the context of the human milk matrix itself). Designing and interpreting studies to address this complexity depends on the availability of new tools and technologies that account for such complexity. Past efforts have often compared human milk to infant formula, which has provided some insight into the bioactivity of human milk, as a whole, or of individual milk components supplemented with formula. However, this experimental approach cannot capture the contributions of the individual components to the human milk ecology, the interaction between these components within the human milk matrix, or the significance of the matrix itself to enhance human milk bioactivity on outcomes of interest. This paper presents approaches to explore human milk as a biological system and the functional implications of that system and its components. Specifically, we discuss study design and data collection considerations and how emerging analytical technologies, bioinformatics, and systems biology approaches could be applied to advance our understanding of this critical aspect of human biology.
Affiliation(s)
- Sharon M Donovan
  - Department of Food Science and Human Nutrition, University of Illinois, Urbana-Champaign, IL, USA.
- Nima Aghaeepour
  - Department of Anesthesiology, Pain, and Perioperative Medicine, Department of Pediatrics, and Department of Biomedical Data Sciences, School of Medicine, Stanford University, Stanford, CA, USA
- Aline Andres
  - Arkansas Children's Nutrition Center and Department of Pediatrics, University of Arkansas for Medical Sciences, Little Rock, AR, USA
- Meghan B Azad
  - Manitoba Interdisciplinary Lactation Centre (MILC), Children's Hospital Research Institute of Manitoba, Department of Pediatrics and Child Health and Department of Immunology, University of Manitoba, Winnipeg, Manitoba, Canada
- Martin Becker
  - Department of Anesthesiology, Pain, and Perioperative Medicine, Department of Pediatrics, and Department of Biomedical Data Sciences, School of Medicine, Stanford University, Stanford, CA, USA
- Susan E Carlson
  - Department of Dietetics and Nutrition, University of Kansas Medical Center, Kansas City, KS, USA
- Kirsi M Järvinen
  - Department of Pediatrics, Division of Allergy and Immunology and Center for Food Allergy, University of Rochester Medical Center, New York, NY, USA
- Weili Lin
  - Biomedical Research Imaging Center and Department of Radiology, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
- Bo Lönnerdal
  - Department of Nutrition, University of California, Davis, CA, USA
- Carolyn M Slupsky
  - Department of Nutrition, University of California, Davis, CA, USA; Department of Food Science and Technology, University of California, Davis, CA, USA
- Daniel J Raiten
  - Pediatric Growth and Nutrition Branch, Eunice Kennedy Shriver National Institute of Child Health and Human Development, National Institutes of Health, Bethesda, MD, USA
22
Tavassolifar MJ, Aghdaei HA, Sadatpour O, Maleknia S, Fayazzadeh S, Mohebbi SR, Montazer F, Rabbani A, Zali MR, Izad M, Meyfour A. New insights into extracellular and intracellular redox status in COVID-19 patients. Redox Biol 2023; 59:102563. [PMID: 36493512 PMCID: PMC9715463 DOI: 10.1016/j.redox.2022.102563] [Citation(s) in RCA: 12] [Impact Index Per Article: 12.0] [Received: 05/26/2022] [Revised: 11/12/2022] [Accepted: 11/28/2022] [Indexed: 12/03/2022] Open
Abstract
BACKGROUND The imbalance of redox homeostasis induces hyper-inflammation in viral infections. In this study, we explored the redox system signature in response to SARS-COV-2 infection and examined the status of these extracellular and intracellular signatures in COVID-19 patients. METHOD The multi-level network was constructed using multi-level data of oxidative stress-related biological processes, protein-protein interactions, transcription factors, and co-expression coefficients obtained from GSE164805, which included gene expression profiles of peripheral blood mononuclear cells (PBMCs) from COVID-19 patients and healthy controls. Top genes were designated based on the degree and closeness centralities. The expression of high-ranked genes was evaluated in PBMCs and nasopharyngeal (NP) samples of 30 COVID-19 patients and 30 healthy controls. The intracellular levels of GSH and ROS/O2•− and extracellular oxidative stress markers were assayed in PBMCs and plasma samples by flow cytometry and ELISA. ELISA results were applied to construct a classification model using logistic regression to differentiate COVID-19 patients from healthy controls. RESULTS CAT, NFE2L2, SOD1, SOD2 and CYBB were the 5 top genes in the network analysis. The expression of these genes and intracellular levels of ROS/O2•− were increased in PBMCs of COVID-19 patients while the GSH level decreased. The expression of high-ranked genes was lower in NP samples of COVID-19 patients compared to the control group. The activity of extracellular enzymes CAT and SOD, and the total oxidant status (TOS) level were increased in plasma samples of COVID-19 patients. Also, the 2-marker panel of CAT and TOS and 3-marker panel showed the best performance. CONCLUSION SARS-COV-2 disrupts the redox equilibrium in immune cells and the upper respiratory tract, leading to exacerbated inflammation and increased replication and entrance of SARS-COV-2 into host cells. Furthermore, utilizing markers of oxidative stress as a complementary validation to discriminate COVID-19 patients from healthy controls seems promising.
Affiliation(s)
- Mohammad Javad Tavassolifar
  - Basic and Molecular Epidemiology of Gastrointestinal Disorders Research Center, Research Institute for Gastroenterology and Liver Diseases, Shahid Beheshti University of Medical Sciences, Tehran, Iran
- Hamid Asadzadeh Aghdaei
  - Basic and Molecular Epidemiology of Gastrointestinal Disorders Research Center, Research Institute for Gastroenterology and Liver Diseases, Shahid Beheshti University of Medical Sciences, Tehran, Iran
- Omid Sadatpour
  - Department of Immunology, School of Public Health, Tehran University of Medical Sciences, Tehran, Iran
- Samaneh Maleknia
  - Basic and Molecular Epidemiology of Gastrointestinal Disorders Research Center, Research Institute for Gastroenterology and Liver Diseases, Shahid Beheshti University of Medical Sciences, Tehran, Iran
- Sara Fayazzadeh
  - Bioinformatics and Computational Omics Lab (BioCOOL), Department of Biophysics, Faculty of Biological Sciences, Tarbiat Modares University, Tehran, Iran
- Seyed Reza Mohebbi
  - Gastroenterology and Liver Diseases Research Center, Research Institute for Gastroenterology and Liver Diseases, Shahid Beheshti University of Medical Sciences, Tehran, Iran
- Fatemeh Montazer
  - Department of Pathology, Firoozabadi Hospital, School of Medicine, Iran University of Medical Sciences (IUMS), Tehran, Iran
- Amirhassan Rabbani
  - Department of Transplant & Hepatobiliary Surgery, Taleghani Hospital, Shahid Beheshti University of Medical Sciences, Tehran, Iran
- Mohammad Reza Zali
  - Gastroenterology and Liver Diseases Research Center, Research Institute for Gastroenterology and Liver Diseases, Shahid Beheshti University of Medical Sciences, Tehran, Iran
- Maryam Izad
  - Immunology Department, School of Medicine, Tehran University of Medical Sciences, Tehran, Iran; MS Research Center, Neuroscience Institute, Tehran University of Medical Sciences, Tehran, Iran.
- Anna Meyfour
  - Basic and Molecular Epidemiology of Gastrointestinal Disorders Research Center, Research Institute for Gastroenterology and Liver Diseases, Shahid Beheshti University of Medical Sciences, Tehran, Iran.
23
Strandberg R, Abrahamsson L, Isheden G, Humphreys K. Tumour Growth Models of Breast Cancer for Evaluating Early Detection-A Summary and a Simulation Study. Cancers (Basel) 2023; 15:912. [PMID: 36765870 PMCID: PMC9913080 DOI: 10.3390/cancers15030912] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/15/2022] [Revised: 01/26/2023] [Accepted: 01/29/2023] [Indexed: 02/04/2023] Open
Abstract
With the advent of nationwide mammography screening programmes, a number of natural history models of breast cancers have been developed and used to assess the effects of screening. The first half of this article provides an overview of a class of these models and describes how they can be used to study latent processes of tumour progression from observational data. The second half of the article describes a simulation study which applies a continuous growth model to illustrate how effects of extending the maximum age of the current Swedish screening programme from 74 to 80 can be evaluated. Compared to no screening, the current and extended programmes reduced breast cancer mortality by 18.5% and 21.7%, respectively. The proportion of screen-detected invasive cancers which were overdiagnosed was estimated to be 1.9% in the current programme and 2.9% in the extended programme. With the help of these breast cancer natural history models, we can better understand the latent processes, and better study the effects of breast cancer screening.
Affiliation(s)
- Rickard Strandberg
  - Department of Medical Epidemiology and Biostatistics, Karolinska Institutet, 171 77 Stockholm, Sweden
  - Correspondence: (R.S.); (K.H.)
- Linda Abrahamsson
  - Center for Primary Health Care Research, Lund University, 205 02 Malmö, Sweden
- Keith Humphreys
  - Department of Medical Epidemiology and Biostatistics, Karolinska Institutet, 171 77 Stockholm, Sweden
  - Correspondence: (R.S.); (K.H.)
24
Hsu TC, Lin C. Learning from small medical data-robust semi-supervised cancer prognosis classifier with Bayesian variational autoencoder. Bioinformatics Advances 2023; 3:vbac100. [PMID: 36698767 PMCID: PMC9832968 DOI: 10.1093/bioadv/vbac100] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/24/2022] [Revised: 12/07/2022] [Accepted: 01/08/2023] [Indexed: 01/11/2023]
Abstract
Motivation Cancer is one of the world's leading mortality causes, and its prognosis is hard to predict due to complicated biological interactions among heterogeneous data types. Numerous challenges, such as censorship, high dimensionality and small sample size, prevent researchers from using deep learning models for precise prediction. Results We propose a robust Semi-supervised Cancer prognosis classifier with bAyesian variational autoeNcoder (SCAN) as a structured machine-learning framework for cancer prognosis prediction. SCAN incorporates semi-supervised learning for predicting 5-year disease-specific survival and overall survival in breast and non-small cell lung cancer (NSCLC) patients, respectively. SCAN achieved significantly better AUROC scores than all existing benchmarks (81.73% for breast cancer; 80.46% for NSCLC), including our previously proposed bimodal neural network classifiers (77.71% for breast cancer; 78.67% for NSCLC). Independent validation results showed that SCAN still achieved better AUROC scores (74.74% for breast; 72.80% for NSCLC) than the bimodal neural network classifiers (64.13% for breast; 67.07% for NSCLC). SCAN is general and can potentially be trained on more patient data. This paves the foundation for personalized medicine for early cancer risk screening. Availability and implementation The source codes reproducing the main results are available on GitHub: https://gitfront.io/r/user-4316673/36e8714573f3fbfa0b24690af5d1a9d5ca159cf4/scan/. Supplementary information Supplementary data are available at Bioinformatics Advances online.
Affiliation(s)
- Te-Cheng Hsu
  - Institute of Communications Engineering, National Tsing Hua University, Hsinchu 30013, Taiwan
- Che Lin
  - To whom correspondence should be addressed.
25
Yan H, Zhang S, Ma S. Hierarchy‐assisted gene expression regulatory network analysis. Stat Anal Data Min 2023. [DOI: 10.1002/sam.11609] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/06/2023]
Affiliation(s)
- Han Yan
  - School of Mathematical Sciences, University of Chinese Academy of Sciences, Beijing, China
  - Key Laboratory of Big Data Mining and Knowledge Management, Chinese Academy of Sciences, Beijing, China
  - Department of Biostatistics, Yale School of Public Health, New Haven, Connecticut, USA
- Sanguo Zhang
  - School of Mathematical Sciences, University of Chinese Academy of Sciences, Beijing, China
  - Key Laboratory of Big Data Mining and Knowledge Management, Chinese Academy of Sciences, Beijing, China
  - Pazhou Lab, Guangzhou, China
- Shuangge Ma
  - Department of Biostatistics, Yale School of Public Health, New Haven, Connecticut, USA
26
Xu Y, Wu M, Ma S. Multidimensional molecular measurements-environment interaction analysis for disease outcomes. Biometrics 2022; 78:1542-1554. [PMID: 34213006 PMCID: PMC9366385 DOI: 10.1111/biom.13526] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 10/06/2020] [Revised: 02/27/2021] [Accepted: 06/28/2021] [Indexed: 12/30/2022]
Abstract
Multiple types of molecular (genetic, genomic, epigenetic, etc.) measurements, environmental risk factors, and their interactions have been found to contribute to the outcomes and phenotypes of complex diseases. In each of the previous studies, only the interactions between one type of molecular measurement and environmental risk factors have been analyzed. In recent biomedical studies, multidimensional profiling, in which data from multiple types of molecular measurements are collected from the same subjects, is becoming popular. A myriad of recent studies have shown that collectively analyzing multiple types of molecular measurements is not only biologically sensible but also leads to improved estimation and prediction. In this study, we conduct an M-E interaction analysis, with M standing for multidimensional molecular measurements and E standing for environmental risk factors. This can accommodate multiple types of molecular measurements and sufficiently account for their overlapping as well as independent information. Extensive simulation shows that it outperforms several closely related alternatives. In the analysis of TCGA (The Cancer Genome Atlas) data on lung adenocarcinoma and cutaneous melanoma, we make some stable biological findings and achieve stable prediction.
Affiliation(s)
- Yaqing Xu
  - Department of Biostatistics, Yale School of Public Health, New Haven, Connecticut, USA
- Mengyun Wu
  - School of Statistics and Management, Shanghai University of Finance and Economics, Shanghai, China
- Shuangge Ma
  - Department of Biostatistics, Yale School of Public Health, New Haven, Connecticut, USA
27
Xu Y, Cui X, Zhang L, Zhao T, Wang Y. Metastasis-related gene identification by compound constrained NMF and a semisupervised cluster approach using pancancer multiomics features. Comput Biol Med 2022; 151:106263. [PMID: 36371902 DOI: 10.1016/j.compbiomed.2022.106263] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/29/2022] [Revised: 10/16/2022] [Accepted: 10/30/2022] [Indexed: 11/11/2022]
Abstract
In recent years, with the gradual increase in pancancer-related research, more attention has been given to the field of pancancer metastasis. However, the molecular mechanism of pancancer metastasis is very unclear, and identification methods for pancancer metastasis-related genes are still lacking. In view of this research status, we developed a novel pipeline to identify pancancer metastasis-related genes based on compound constrained nonnegative matrix factorization (CCNMF). To solve the above problems, the following modules were designed. A correntropy operator and feature similarity fusion (FSF) were first adopted to process the multiomics features of genes; thus, the influences caused by irrelevant biomolecular patterns, manifested as non-Gaussian noise, were minimized. CCNMF was then adopted to handle the above features with compound constraints consisting of a gene relation network and a "metastasis-related" gene set, which maximizes the biological interpretability of the metafeatures generated by NMF. Since a negative set of pancancer "metastasis-related" genes could hardly be obtained, semisupervised analyses were performed on gene features acquired by each step in our pipeline to examine our method's effect. 83% of the 236 candidates identified by the above method were associated with the metastasis of one or more cancers, 71.9% candidates were identified immune-related in pancancer in addition to the hallmark genes. Our study provides an effective and interpretable method for identifying metastasis-related as well as immune-related genes, and the method is successfully applied to TCGA pancancer data.
Affiliation(s)
- Yining Xu
  - Faculty of Computing, Harbin Institute of Technology, 92 Xidazhi Street, TIB #20, Harbin, 150000, Hei Long Jiang, China.
- Xinran Cui
  - Faculty of Computing, Harbin Institute of Technology, 92 Xidazhi Street, TIB #20, Harbin, 150000, Hei Long Jiang, China.
- Liyuan Zhang
  - Faculty of Computing, Harbin Institute of Technology, 92 Xidazhi Street, TIB #20, Harbin, 150000, Hei Long Jiang, China.
- Tianyi Zhao
  - School of Medicine and Health, Harbin Institute of Technology, 92 Xidazhi Street, TIB #20, Harbin, 150000, Hei Long Jiang, China.
- Yadong Wang
  - Faculty of Computing, Harbin Institute of Technology, 92 Xidazhi Street, TIB #20, Harbin, 150000, Hei Long Jiang, China.
28
Madani M, Behzadi MM, Nabavi S. The Role of Deep Learning in Advancing Breast Cancer Detection Using Different Imaging Modalities: A Systematic Review. Cancers (Basel) 2022; 14:5334. [PMID: 36358753 PMCID: PMC9655692 DOI: 10.3390/cancers14215334] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Received: 10/01/2022] [Revised: 10/23/2022] [Accepted: 10/25/2022] [Indexed: 12/02/2022] Open
Abstract
Breast cancer is among the most common and fatal diseases for women, and no permanent treatment has been discovered. Thus, early detection is a crucial step to control and cure breast cancer that can save the lives of millions of women. For example, in 2020, more than 65% of breast cancer patients were diagnosed in an early stage of cancer, from which all survived. Although early detection is the most effective approach for cancer treatment, breast cancer screening conducted by radiologists is very expensive and time-consuming. More importantly, conventional methods of analyzing breast cancer images suffer from high false-detection rates. Different breast cancer imaging modalities are used to extract and analyze the key features affecting the diagnosis and treatment of breast cancer. These imaging modalities can be divided into subgroups such as mammograms, ultrasound, magnetic resonance imaging, histopathological images, or any combination of them. Radiologists or pathologists analyze images produced by these methods manually, which leads to an increase in the risk of wrong decisions for cancer detection. Thus, the utilization of new automatic methods to analyze all kinds of breast screening images to assist radiologists to interpret images is required. Recently, artificial intelligence (AI) has been widely utilized to automatically improve the early detection and treatment of different types of cancer, specifically breast cancer, thereby enhancing the survival chance of patients. Advances in AI algorithms, such as deep learning, and the availability of datasets obtained from various imaging modalities have opened an opportunity to surpass the limitations of current breast cancer analysis methods. In this article, we first review breast cancer imaging modalities, and their strengths and limitations. Then, we explore and summarize the most recent studies that employed AI in breast cancer detection using various breast imaging modalities. In addition, we report available datasets on the breast-cancer imaging modalities which are important in developing AI-based algorithms and training deep learning models. In conclusion, this review paper tries to provide a comprehensive resource to help researchers working in breast cancer imaging analysis.
Affiliation(s)
- Mohammad Madani
  - Department of Mechanical Engineering, University of Connecticut, Storrs, CT 06269, USA
  - Department of Computer Science and Engineering, University of Connecticut, Storrs, CT 06269, USA
- Mohammad Mahdi Behzadi
  - Department of Mechanical Engineering, University of Connecticut, Storrs, CT 06269, USA
  - Department of Computer Science and Engineering, University of Connecticut, Storrs, CT 06269, USA
- Sheida Nabavi
  - Department of Computer Science and Engineering, University of Connecticut, Storrs, CT 06269, USA
29
Chen Y, Li H, Sun X. Construction and analysis of sample-specific driver modules for breast cancer. BMC Genomics 2022; 23:717. [PMID: 36266635 PMCID: PMC9583575 DOI: 10.1186/s12864-022-08928-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/29/2022] [Accepted: 10/07/2022] [Indexed: 11/12/2022] Open
Abstract
BACKGROUND It is important to understand the functional impact of somatic mutation and methylation aberration at an individual level to implement precision medicine. Recent studies have demonstrated that the perturbation of gene interaction networks can provide a fundamental link between genotype (or epigenotype) and phenotype. However, it is unclear how individual mutations affect the function of biological networks, especially for individual methylation aberration. To solve this, we provided a sample-specific driver module construction method using the 2-order network theory and hub-gene theory to identify individual perturbation networks driven by mutations or methylation aberrations. RESULTS Our method integrated multi-omics of breast cancer, including genomics, transcriptomics, epigenomics and interactomics, and provided new insight into the synergistic collaboration between methylation and mutation at an individual level. A common driver pattern of breast cancer was identified from a novel perspective of a driver module, which is correlated to the occurrence and development of breast cancer. The constructed driver module reflects the survival prognosis and degree of malignancy among different subtypes of breast cancer. Additionally, subtype-specific driver modules were identified. CONCLUSIONS This study explores the driver module of individual cancer, and contributes to a better understanding of the mechanism of breast cancer driven by the mutations and methylation variations from the point of view of the driver network. This work will help identify new therapeutic combinations of gene mutations and drugs in humans.
Affiliation(s)
- Yuanyuan Chen
  - State Key Laboratory of Bioelectronics, School of Biological Science and Medical Engineering, Southeast University, Nanjing, 210096 P. R. China
  - College of Science, Nanjing Agricultural University, Nanjing, 210095 P. R. China
- Haitao Li
  - State Key Laboratory of Bioelectronics, School of Biological Science and Medical Engineering, Southeast University, Nanjing, 210096 P. R. China
- Xiao Sun
  - State Key Laboratory of Bioelectronics, School of Biological Science and Medical Engineering, Southeast University, Nanjing, 210096 P. R. China
30
Jardillier R, Koca D, Chatelain F, Guyon L. Prognosis of lasso-like penalized Cox models with tumor profiling improves prediction over clinical data alone and benefits from bi-dimensional pre-screening. BMC Cancer 2022; 22:1045. [PMID: 36199072 PMCID: PMC9533541 DOI: 10.1186/s12885-022-10117-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/29/2022] [Accepted: 09/14/2022] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Prediction of patient survival from tumor molecular '-omics' data is a key step toward personalized medicine. Cox models performed on RNA profiling datasets are popular for clinical outcome predictions. But these models are applied in the context of "high dimension", as the number p of covariates (gene expressions) greatly exceeds the number n of patients and e of events. Thus, pre-screening together with penalization methods are widely used for dimensional reduction. METHODS In the present paper, (i) we benchmark the performance of the lasso penalization and three variants (i.e., ridge, elastic net, adaptive elastic net) on 16 cancers from TCGA after pre-screening, (ii) we propose a bi-dimensional pre-screening procedure based on both gene variability and p-values from single variable Cox models to predict survival, and (iii) we compare our results with iterative sure independence screening (ISIS). RESULTS First, we show that integration of mRNA-seq data with clinical data improves predictions over clinical data alone. Second, our bi-dimensional pre-screening procedure improves the C-index and/or the integrated Brier score only moderately, while excluding genes that are irrelevant for prediction. We demonstrate that the different penalization methods reached comparable prediction performances, with slight differences among datasets. Finally, we provide advice in the case of multi-omics data integration. CONCLUSIONS Tumor profiles convey more prognostic information than clinical variables such as stage for many cancer subtypes. Lasso and Ridge penalizations perform similarly to Elastic Net penalizations for Cox models in high dimension. Pre-screening of the top 200 genes in terms of single variable Cox model p-values is a practical way to reduce dimension, which may be particularly useful when integrating multi-omics.
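The two-criterion filter described in this abstract can be illustrated with a minimal sketch. It assumes the single variable Cox p-values have already been computed elsewhere; the function name `bidimensional_prescreen`, the `min_variance` cutoff, and the toy numbers are hypothetical and are not taken from the article, whose exact thresholds are not given here.

```python
from statistics import pvariance


def bidimensional_prescreen(expression, cox_pvalues, min_variance, top_k):
    """Keep genes that are variable enough, then rank them by Cox p-value.

    expression  : dict mapping gene name -> list of expression values
    cox_pvalues : dict mapping gene name -> precomputed univariate Cox p-value
    min_variance: hypothetical variability cutoff (first screening dimension)
    top_k       : number of genes kept by p-value (second screening dimension)
    """
    # Dimension 1: discard genes whose expression barely varies across patients.
    variable = [g for g, values in expression.items()
                if pvariance(values) >= min_variance]
    # Dimension 2: among the remaining genes, keep the smallest p-values.
    ranked = sorted(variable, key=lambda g: cox_pvalues[g])
    return ranked[:top_k]


# Toy example with made-up values.
expression = {
    "A": [1, 1, 1, 1],   # variance 0: removed despite its small p-value
    "B": [0, 2, 0, 2],
    "C": [0, 4, 0, 4],
    "D": [1, 3, 1, 3],
}
cox_pvalues = {"A": 0.001, "B": 0.2, "C": 0.03, "D": 0.01}
selected = bidimensional_prescreen(expression, cox_pvalues,
                                   min_variance=0.5, top_k=2)
```

In this toy run, gene A is dropped by the variability filter even though it has the smallest p-value, which is the point of screening on two dimensions rather than on p-values alone.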
Affiliation(s)
- Rémy Jardillier
  - IRIG, Biosanté U1292, Univ. Grenoble Alpes, Inserm, CEA, Grenoble, France
  - GIPSA-lab, Institute of Engineering University Grenoble Alpes, Univ. Grenoble Alpes, CNRS, Grenoble INP, Grenoble, France
- Dzenis Koca
  - IRIG, Biosanté U1292, Univ. Grenoble Alpes, Inserm, CEA, Grenoble, France
- Florent Chatelain
  - GIPSA-lab, Institute of Engineering University Grenoble Alpes, Univ. Grenoble Alpes, CNRS, Grenoble INP, Grenoble, France
- Laurent Guyon
  - IRIG, Biosanté U1292, Univ. Grenoble Alpes, Inserm, CEA, Grenoble, France
31
The effects of Aronia berry polyphenol supplementation on arterial function and the gut microbiome in middle aged men and women: Results from a randomized controlled trial. Clin Nutr 2022; 41:2549-2561. [DOI: 10.1016/j.clnu.2022.08.024] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/14/2022] [Revised: 08/13/2022] [Accepted: 08/22/2022] [Indexed: 11/20/2022]
32
Speller J, Staerk C, Mayr A. Robust statistical boosting with quantile-based adaptive loss functions. Int J Biostat 2022:ijb-2021-0127. [PMID: 35950232 DOI: 10.1515/ijb-2021-0127] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/01/2021] [Accepted: 06/20/2022] [Indexed: 11/15/2022]
Abstract
We combine robust loss functions with statistical boosting algorithms in an adaptive way to perform variable selection and predictive modelling for potentially high-dimensional biomedical data. To achieve robustness against outliers in the outcome variable (vertical outliers), we consider different composite robust loss functions together with base-learners for linear regression. For composite loss functions, such as the Huber loss and the Bisquare loss, a threshold parameter has to be specified that controls the robustness. In the context of boosting algorithms, we propose an approach that adapts the threshold parameter of composite robust losses in each iteration to the current sizes of residuals, based on a fixed quantile level. We compared the performance of our approach to classical M-regression, boosting with standard loss functions or the lasso regarding prediction accuracy and variable selection in different simulated settings: the adaptive Huber and Bisquare losses led to a better performance when the outcome contained outliers or was affected by specific types of corruption. For non-corrupted data, our approach yielded a similar performance to boosting with the efficient L2 loss or the lasso. Also in the analysis of skewed KRT19 protein expression data based on gene expression measurements from human cancer cell lines (NCI-60 cell line panel), boosting with the new adaptive loss functions performed favourably compared to standard loss functions or competing robust approaches regarding prediction accuracy and resulted in very sparse models.
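The idea of re-setting the Huber threshold in every boosting iteration from a fixed quantile of the current absolute residuals can be sketched as follows. This is an illustrative toy under stated assumptions, not the authors' implementation: it uses a single covariate, a least-squares base-learner, and hypothetical defaults (`tau=0.8`, `steps=500`, `nu=0.1`); the helper names are invented for the example.

```python
from statistics import median


def quantile(values, tau):
    """Empirical quantile with linear interpolation, tau in [0, 1]."""
    xs = sorted(values)
    pos = tau * (len(xs) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(xs) - 1)
    return xs[lo] + (pos - lo) * (xs[hi] - xs[lo])


def clipped_residuals(residuals, delta):
    """Negative gradient of the Huber loss: residuals clipped to [-delta, delta]."""
    return [max(-delta, min(delta, r)) for r in residuals]


def adaptive_huber_boost(x, y, tau=0.8, steps=500, nu=0.1):
    """Boost a one-covariate linear model with a per-iteration Huber threshold."""
    n = len(y)
    x_mean = sum(x) / n
    xc = [xi - x_mean for xi in x]           # centred covariate
    sxx = sum(v * v for v in xc)
    intercept, slope = median(y), 0.0         # robust starting offset
    for _ in range(steps):
        residuals = [yi - intercept - slope * v for yi, v in zip(y, xc)]
        # Adaptive step: threshold = tau-quantile of the current |residuals|.
        delta = quantile([abs(r) for r in residuals], tau)
        u = clipped_residuals(residuals, delta)
        # Least-squares base-learner fitted to the clipped residuals,
        # added to the model with step length nu.
        intercept += nu * sum(u) / n
        slope += nu * sum(ui * v for ui, v in zip(u, xc)) / sxx
    # Undo the centring so the intercept refers to the original x scale.
    return intercept - slope * x_mean, slope


# Toy data: y = 1 + 2x with two vertical outliers.
x = list(range(20))
y = [1 + 2 * xi for xi in x]
y[3] += 100
y[15] += 100
intercept, slope = adaptive_huber_boost(x, y)
```

Because the threshold shrinks along with the residuals, the two contaminated observations are always clipped, so their pull on the fit fades as the remaining points are fitted more closely.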
Affiliation(s)
- Jan Speller
  - Medical Faculty, Institute of Medical Biometrics, Informatics and Epidemiology (IMBIE), University of Bonn, Bonn, Germany
- Christian Staerk
  - Medical Faculty, Institute of Medical Biometrics, Informatics and Epidemiology (IMBIE), University of Bonn, Bonn, Germany
- Andreas Mayr
  - Medical Faculty, Institute of Medical Biometrics, Informatics and Epidemiology (IMBIE), University of Bonn, Bonn, Germany
33
K-Clique Multiomics Framework: A Novel Protocol to Decipher the Role of Gut Microbiota Communities in Nutritional Intervention Trials. Metabolites 2022; 12:736. [PMID: 36005608 PMCID: PMC9412844 DOI: 10.3390/metabo12080736] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 06/27/2022] [Revised: 08/03/2022] [Accepted: 08/04/2022] [Indexed: 11/17/2022] Open
Abstract
The availability of omics data providing information from different layers of complex biological processes that link nutrition to human health would benefit from the development of integrated approaches combining holistically individual omics data, including those associated with the microbiota that impacts the metabolisation and bioavailability of food components. Microbiota must be considered as a set of populations of interconnected consortia, with compensatory capacities to adapt to different nutritional intake. To study the consortium nature of the microbiome, we must rely on specially designed data analysis tools. The purpose of this work is to propose the construction of a general correlation network-based explorative tool, suitable for nutritional clinical trials, by integrating omics data from faecal microbial taxa, stool metabolome (1H NMR spectra) and GC-MS for stool volatilome. The presented approach exploits a descriptive paradigm necessary for a true multiomics integration of data, which is a powerful tool to investigate the complex physiological effects of nutritional interventions.
34
Li L, Wei Y, Shi G, Yang H, Li Z, Fang R, Cao H, Cui Y. Multi-omics data integration for subtype identification of Chinese lower-grade gliomas: a joint similarity network fusion approach. Comput Struct Biotechnol J 2022; 20:3482-3492. [PMID: 35860412 PMCID: PMC9284445 DOI: 10.1016/j.csbj.2022.06.065] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 04/19/2022] [Revised: 06/30/2022] [Accepted: 06/30/2022] [Indexed: 12/28/2022] Open
Abstract
Lower-grade gliomas (LGG), characterized by heterogeneity and invasiveness, originate from the central nervous system. Although studies focusing on molecular subtyping and molecular characteristics have provided novel insights into improving the diagnosis and therapy of LGG, there is an urgent need to identify new molecular subtypes and biomarkers that are promising to improve patient survival outcomes. Here, we proposed a joint similarity network fusion (Joint-SNF) method to integrate different omics data types to construct a fused network using the Joint and Individual Variation Explained (JIVE) technique under the SNF framework. Focusing on the joint network structure, a spectral clustering method was employed to obtain subtypes of patients. Simulation studies show that the proposed Joint-SNF method outperforms the original SNF approach under various simulation scenarios. We further applied the method to a Chinese LGG data set including mRNA expression, DNA methylation and microRNA (miRNA). Three molecular subtypes were identified and showed statistically significant differences in patient survival outcomes. The five-year mortality rates of the three subtypes are 80.8%, 32.1%, and 34.4%, respectively. After adjusting for clinically relevant covariates, the death risk of patients in Cluster 1 was 5.06 times higher than patients in other clusters. The fused network attained by the proposed Joint-SNF method enhances strong similarities, thus greatly improves subtyping performance compared to the original SNF method. The findings in the real application may provide important clues for improving patient survival outcomes and for precision treatment for Chinese LGG patients. An R package to implement the method can be accessed in Github at https://github.com/Sameerer/Joint-SNF.
Affiliation(s)
- Lingmei Li
- Division of Health Statistics, School of Public Health, Shanxi Medical University, Taiyuan, Shanxi 030001, PR China
- Yifang Wei
- Division of Health Statistics, School of Public Health, Shanxi Medical University, Taiyuan, Shanxi 030001, PR China
- Guojing Shi
- Division of Health Statistics, School of Public Health, Shanxi Medical University, Taiyuan, Shanxi 030001, PR China
- Haitao Yang
- Division of Health Statistics, School of Public Health, Hebei Medical University, Shijiazhuang, Hebei 050017, PR China
- Zhi Li
- Department of Hematology, Taiyuan Central Hospital of Shanxi Medical University, Taiyuan, Shanxi 030001, PR China
- Ruiling Fang
- Division of Health Statistics, School of Public Health, Shanxi Medical University, Taiyuan, Shanxi 030001, PR China
- Hongyan Cao
- Division of Health Statistics, School of Public Health, Shanxi Medical University, Taiyuan, Shanxi 030001, PR China
- Shanxi Medical University-Yidu Cloud Institute of Medical Data Science, Taiyuan, Shanxi 030001, PR China
- Corresponding authors at: Division of Health Statistics, School of Public Health, Shanxi Medical University, Taiyuan, Shanxi, PR China.
- Yuehua Cui
- Department of Statistics and Probability, Michigan State University, East Lansing, MI 48824, USA
- Corresponding authors at: Division of Health Statistics, School of Public Health, Shanxi Medical University, Taiyuan, Shanxi, PR China.
|
35
|
Zhou F, Lu X, Ren J, Fan K, Ma S, Wu C. Sparse group variable selection for gene-environment interactions in the longitudinal study. Genet Epidemiol 2022; 46:317-340. [PMID: 35766061 DOI: 10.1002/gepi.22461] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/19/2021] [Revised: 01/31/2022] [Accepted: 03/15/2022] [Indexed: 11/06/2022]
Abstract
Penalized variable selection for high-dimensional longitudinal data has received much attention as it can account for the correlation among repeated measurements while providing additional and essential information for improved identification and prediction performance. Despite the success, in longitudinal studies, the potential of penalization methods is far from fully understood for accommodating structured sparsity. In this article, we develop a sparse group penalization method to conduct the bi-level gene-environment (G × E) interaction study under the repeatedly measured phenotype. Within the quadratic inference function framework, the proposed method can achieve simultaneous identification of main and interaction effects on both the group and individual levels. Simulation studies have shown that the proposed method outperforms major competitors. In the case study of asthma data from the Childhood Asthma Management Program, we conduct a G × E study by using high-dimensional single nucleotide polymorphism data as genetic factors and the longitudinal trait, forced expiratory volume in 1 s, as the phenotype. Our method leads to improved prediction and identification of main and interaction effects with important implications.
Affiliation(s)
- Fei Zhou
- Department of Statistics, Kansas State University, Manhattan, Kansas, 66506, USA
- Xi Lu
- Department of Statistics, Kansas State University, Manhattan, Kansas, 66506, USA
- Jie Ren
- Department of Biostatistics and Health Data Sciences, Indiana University School of Medicine, Indianapolis, Indiana, 46202, USA
- Kun Fan
- Department of Statistics, Kansas State University, Manhattan, Kansas, 66506, USA
- Shuangge Ma
- Department of Biostatistics, Yale University, New Haven, Connecticut, 06520, USA
- Cen Wu
- Department of Statistics, Kansas State University, Manhattan, Kansas, 66506, USA
|
36
|
Hill C, Avila-Palencia I, Maxwell AP, Hunter RF, McKnight AJ. Harnessing the Full Potential of Multi-Omic Analyses to Advance the Study and Treatment of Chronic Kidney Disease. FRONTIERS IN NEPHROLOGY 2022; 2:923068. [PMID: 37674991 PMCID: PMC10479694 DOI: 10.3389/fneph.2022.923068] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/18/2022] [Accepted: 05/30/2022] [Indexed: 09/08/2023]
Abstract
Chronic kidney disease (CKD) was the 12th leading cause of death globally in 2017 with the prevalence of CKD estimated at ~9%. Early detection and intervention for CKD may improve patient outcomes, but standard testing approaches even in developed countries do not facilitate identification of patients at high risk of developing CKD, nor those progressing to end-stage kidney disease (ESKD). Recent advances in CKD research are moving towards a more personalised approach for CKD. Heritability for CKD ranges from 30% to 75%, yet identified genetic risk factors account for only a small proportion of the inherited contribution to CKD. More in depth analysis of genomic sequencing data in large cohorts is revealing new genetic risk factors for common diagnoses of CKD and providing novel diagnoses for rare forms of CKD. Multi-omic approaches are now being harnessed to improve our understanding of CKD and explain some of the so-called 'missing heritability'. The most common omic analyses employed for CKD are genomics, epigenomics, transcriptomics, metabolomics, proteomics and phenomics. While each of these omics have been reviewed individually, considering integrated multi-omic analysis offers considerable scope to improve our understanding and treatment of CKD. This narrative review summarises current understanding of multi-omic research alongside recent experimental and analytical approaches, discusses current challenges and future perspectives, and offers new insights for CKD.
Affiliation(s)
- Amy Jayne McKnight
- Centre for Public Health, Queen’s University Belfast, Belfast, United Kingdom
|
37
|
Gliozzo J, Mesiti M, Notaro M, Petrini A, Patak A, Puertas-Gallardo A, Paccanaro A, Valentini G, Casiraghi E. Heterogeneous data integration methods for patient similarity networks. Brief Bioinform 2022; 23:6604996. [PMID: 35679533 PMCID: PMC9294435 DOI: 10.1093/bib/bbac207] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 03/07/2021] [Revised: 04/14/2022] [Accepted: 05/04/2022] [Indexed: 12/29/2022]
Abstract
Patient similarity networks (PSNs), where patients are represented as nodes and their similarities as weighted edges, are being increasingly used in clinical research. These networks provide an insightful summary of the relationships among patients and can be exploited by inductive or transductive learning algorithms for the prediction of patient outcome, phenotype and disease risk. PSNs can also be easily visualized, thus offering a natural way to inspect complex heterogeneous patient data and providing some level of explainability of the predictions obtained by machine learning algorithms. The advent of high-throughput technologies, enabling us to acquire high-dimensional views of the same patients (e.g. omics data, laboratory data, imaging data), calls for the development of data fusion techniques for PSNs in order to leverage this rich heterogeneous information. In this article, we review existing methods for integrating multiple biomedical data views to construct PSNs, together with the different patient similarity measures that have been proposed. We also review methods that have appeared in the machine learning literature but have not yet been applied to PSNs, thus providing a resource to navigate the vast machine learning literature existing on this topic. In particular, we focus on methods that could be used to integrate very heterogeneous datasets, including multi-omics data as well as data derived from clinical information and medical imaging.
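The construction described in this abstract, patients as nodes and similarities as weighted edges, can be sketched roughly as follows. This is a minimal illustration and not a method from the review: the Gaussian kernel, the `sigma` parameter, the function names and the toy data are all assumptions, and the element-wise average is only a naive stand-in for the data fusion techniques the review surveys.

```python
import math


def similarity_network(profiles, sigma=1.0):
    """Return a symmetric weighted adjacency matrix for a list of patient profiles."""
    n = len(profiles)
    weights = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            sq_dist = sum((a - b) ** 2 for a, b in zip(profiles[i], profiles[j]))
            # Gaussian kernel (an assumed choice): nearby patients get weights near 1.
            weights[i][j] = weights[j][i] = math.exp(-sq_dist / (2 * sigma ** 2))
    return weights


def average_networks(networks):
    """Naive fusion: element-wise mean of same-sized adjacency matrices."""
    n = len(networks[0])
    return [
        [sum(net[i][j] for net in networks) / len(networks) for j in range(n)]
        for i in range(n)
    ]


# Toy data: three patients measured in two hypothetical data views.
expression = [[0.0, 0.0], [0.1, 0.0], [5.0, 5.0]]
methylation = [[1.0], [1.2], [4.0]]

fused = average_networks(
    [similarity_network(expression), similarity_network(methylation)]
)
```

In this toy example patients 0 and 1 are close in both views, so their fused edge weight is much larger than the weight between patient 0 and patient 2.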
Affiliation(s)
- Jessica Gliozzo
- AnacletoLab - Computer Science Department, Universitá degli Studi di Milano, Via Celoria 18, 20135, Milan, Italy.,European Commission, Joint Research Centre (JRC), Ispra (VA), Italy.,CINI, Infolife National Laboratory, Roma, Italy
- Marco Mesiti
- AnacletoLab - Computer Science Department, Universitá degli Studi di Milano, Via Celoria 18, 20135, Milan, Italy.,CINI, Infolife National Laboratory, Roma, Italy
- Marco Notaro
- AnacletoLab - Computer Science Department, Universitá degli Studi di Milano, Via Celoria 18, 20135, Milan, Italy.,CINI, Infolife National Laboratory, Roma, Italy
- Alessandro Petrini
- AnacletoLab - Computer Science Department, Universitá degli Studi di Milano, Via Celoria 18, 20135, Milan, Italy.,CINI, Infolife National Laboratory, Roma, Italy
- Alex Patak
- European Commission, Joint Research Centre (JRC), Ispra (VA), Italy
- Alberto Paccanaro
- Department of Computer Science, Royal Holloway, University of London, Egham, TW20 0EX UK.,School of Applied Mathematics (EMAp), Fundação Getúlio Vargas, Rio de Janeiro Brazil
- Giorgio Valentini
- AnacletoLab - Computer Science Department, Universitá degli Studi di Milano, Via Celoria 18, 20135, Milan, Italy.,CINI, Infolife National Laboratory, Roma, Italy.,DSRC UNIMI, Data Science Research Center, Milano, 20135, Italy.,ELLIS, European Laboratory for Learning and Intelligent Systems, Berlin, Germany
- Elena Casiraghi
- AnacletoLab - Computer Science Department, Universitá degli Studi di Milano, Via Celoria 18, 20135, Milan, Italy.,CINI, Infolife National Laboratory, Roma, Italy
|
38
|
Mokhtari A, Porte B, Belzeaux R, Etain B, Ibrahim EC, Marie-Claire C, Lutz PE, Delahaye-Duriez A. The molecular pathophysiology of mood disorders: From the analysis of single molecular layers to multi-omic integration. Prog Neuropsychopharmacol Biol Psychiatry 2022; 116:110520. [PMID: 35104608 DOI: 10.1016/j.pnpbp.2022.110520] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 10/07/2021] [Revised: 01/22/2022] [Accepted: 01/22/2022] [Indexed: 12/14/2022]
Abstract
Next-generation sequencing now enables the rapid and affordable production of reliable biological data at multiple molecular levels, collectively referred to as "omics". To maximize the potential for discovery, computational biologists have created and adapted integrative multi-omic analytical methods. When applied to diseases with traceable pathophysiology such as cancer, these new algorithms and statistical approaches have enabled the discovery of clinically relevant molecular mechanisms and biomarkers. In contrast, these methods have been much less applied to the field of molecular psychiatry, although diagnostic and prognostic biomarkers are similarly needed. In the present review, we first briefly summarize main findings from two decades of studies that investigated single molecular processes in relation to mood disorders. Then, we conduct a systematic review of multi-omic strategies that have been proposed and used more recently. We also list databases and types of data available to researchers for future work. Finally, we present the newest methodologies that have been employed for multi-omics integration in other medical fields, and discuss their potential for molecular psychiatry studies.
Affiliation(s)
- Amazigh Mokhtari
- NeuroDiderot, Inserm U1141, Université de Paris, F-75019 Paris, France
- Baptiste Porte
- NeuroDiderot, Inserm U1141, Université de Paris, F-75019 Paris, France
- Raoul Belzeaux
- Aix Marseille Université CNRS, Institut de Neurosciences de la Timone, F-13005 Marseille, France; Fondation FondaMental, F-94000 Créteil, France; Assistance Publique Hôpitaux de Marseille, Pôle de psychiatrie, pédopsychiatrie et addictologie, F-13005 Marseille, France
- Bruno Etain
- Assistance Publique des Hôpitaux de Paris, GHU Lariboisière-Saint Louis-Fernand Widal, DMU Neurosciences, Département de psychiatrie et de Médecine Addictologique, F-75010 Paris, France; Université de Paris, INSERM UMR-S 1144, Optimisation thérapeutique en neuropsychopharmacologie, OTeN, F-75006 Paris, France
- El Cherif Ibrahim
- Aix Marseille Université CNRS, Institut de Neurosciences de la Timone, F-13005 Marseille, France
- Cynthia Marie-Claire
- Université de Paris, INSERM UMR-S 1144, Optimisation thérapeutique en neuropsychopharmacologie, OTeN, F-75006 Paris, France
- Pierre-Eric Lutz
- Centre National de la Recherche Scientifique, Université de Strasbourg, Fédération de Médecine Translationnelle de Strasbourg, Institut des Neurosciences Cellulaires et Intégratives UPR3212, F-67000 Strasbourg, France; Douglas Mental Health University Institute, McGill University, QC H4H 1R3 Montréal, Canada.
- Andrée Delahaye-Duriez
- NeuroDiderot, Inserm U1141, Université de Paris, F-75019 Paris, France; Assistance Publique des Hôpitaux de Paris, Unité de médecine génomique, Département BioPhaReS, Hôpital Jean Verdier, Hôpitaux Universitaires de Paris Seine Saint Denis, F-93140 Bondy, France; Université Sorbonne Paris Nord, F-93000 Bobigny, France.
|
39
|
Monroy Kuhn JM, Miok V, Lutter D. Correlation-guided Network Integration (CoNI), an R package for integrating numerical omics data that allows multiform graph representations to study molecular interaction networks. BIOINFORMATICS ADVANCES 2022; 2:vbac042. [PMID: 36699352 PMCID: PMC9710706 DOI: 10.1093/bioadv/vbac042] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/18/2021] [Revised: 05/09/2022] [Accepted: 06/01/2022] [Indexed: 02/01/2023]
Abstract
Summary: Today's immense growth in complex biological data demands effective and flexible tools for integration, analysis and extraction of valuable insights. Here, we present CoNI, a practical R package for the unsupervised integration of numerical omics datasets. Our tool is based on partial correlations to identify putative confounding variables for a set of paired dependent variables. CoNI combines two omics datasets in an integrated, complex hypergraph-like network, represented as a weighted undirected graph, a bipartite graph, or a hypergraph structure. These network representations form a basis for multiple further analyses, such as identifying priority candidates of biological importance or comparing network structures dependent on different conditions. Availability and implementation: The R package CoNI is available on the Comprehensive R Archive Network (https://cran.r-project.org/web/packages/CoNI/) and GitLab (https://gitlab.com/computational-discovery-research/coni). It is distributed under the GNU General Public License (version 3). Supplementary information: Supplementary data are available at Bioinformatics Advances online.
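The idea of using partial correlations to flag a putative confounder for a pair of variables can be illustrated with the standard first-order partial correlation formula. The sketch below is written in plain Python rather than R, it does not use or reproduce the CoNI package's API, and the function names and toy vectors are assumptions made for illustration only.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def partial_correlation(x, y, z):
    """Correlation of x and y after removing the linear effect of z."""
    r_xy, r_xz, r_yz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Toy data: x and y both follow z plus small, mutually uncorrelated offsets.
z = [1, 2, 3, 4, 5, 6, 7, 8]
x = [zi + e for zi, e in zip(z, [1, -1, 1, -1, 1, -1, 1, -1])]
y = [zi + e for zi, e in zip(z, [1, 1, -1, -1, 1, 1, -1, -1])]

r_xy = pearson(x, y)                        # strong marginal correlation
r_xy_given_z = partial_correlation(x, y, z)  # much weaker once z is controlled for
```

A large drop from the marginal correlation to the partial correlation, as in this toy case, is the kind of signal that would mark `z` as a candidate confounder for the pair `(x, y)`.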
Affiliation(s)
- Viktorian Miok
- Computational Discovery Unit, Institute for Diabetes & Obesity, Helmholtz Zentrum München, Neuherberg, Germany
- German Center for Diabetes Research (DZD), Neuherberg, Germany
- Astrocyte-Neuron Networks, Institute for Diabetes & Obesity, Helmholtz Zentrum München, Neuherberg, Germany
|
40
|
Yang B, Yang Y, Su X. Deep structure integrative representation of multi-omics data for cancer subtyping. Bioinformatics 2022; 38:3337-3342. [PMID: 35639657 DOI: 10.1093/bioinformatics/btac345] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 02/11/2022] [Revised: 04/22/2022] [Accepted: 05/17/2022] [Indexed: 01/01/2023]
Abstract
MOTIVATION Cancer is a heterogeneous group of diseases. Cancer subtyping is a crucial step in diagnosis, prognosis and treatment. Since high-throughput sequencing technologies provide an unprecedented opportunity to rapidly collect multi-omics data for the same individuals, there is an urgent need to effectively represent and integrate these multi-omics data to achieve clinically meaningful cancer subtyping. RESULTS We propose a novel deep learning model, called Deep Structure Integrative Representation (DSIR), for cancer subtype identification by integrating the representation and clustering of multi-omics data. DSIR simultaneously captures the global structures in sparse subspace and local structures in manifold subspace from multi-omics data and constructs a consensus similarity matrix by utilizing deep neural networks. Extensive tests are performed in twelve different cancers on three levels of omics data from The Cancer Genome Atlas. The results demonstrate that DSIR achieves better performance than the state-of-the-art integrative methods. AVAILABILITY https://github.com/Polytech-bioinf/Deep-structure-integrative-representation.git. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Bo Yang
- School of Computer Science, Xi'an Polytechnic University, Xi'an, 710048, China.,Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, Toronto, M5S 3E1, ON, Canada
- Yan Yang
- School of Computer Science, Xi'an Polytechnic University, Xi'an, 710048, China
- Xueping Su
- School of Electronics and Information, Xi'an Polytechnic University, Xi'an, 710048, China
|
41
|
Multi-omics approach in tea polyphenol research regarding tea plant growth, development and tea processing: current technologies and perspectives. FOOD SCIENCE AND HUMAN WELLNESS 2022. [DOI: 10.1016/j.fshw.2021.12.010] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 12/17/2022]
|
42
|
Pitaloka DAE, Syamsunarno MRAAA, Abdulah R, Chaidir L. Omics Biomarkers for Monitoring Tuberculosis Treatment: A Mini-Review of Recent Insights and Future Approaches. Infect Drug Resist 2022; 15:2703-2711. [PMID: 35664683 PMCID: PMC9160605 DOI: 10.2147/idr.s366580] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/15/2022] [Accepted: 05/20/2022] [Indexed: 11/23/2022]
Abstract
The poor sensitivity of sputum conversion for monitoring tuberculosis (TB) treatment means that a non-sputum-based biomarker is urgently needed. Monitoring biomarkers in TB treatment is used to decide whether critical thresholds have been reached and helps clinicians determine therapeutic success. In this mini-review, we highlight recent omics studies that contribute to identifying novel biomarkers as surrogate markers for cure and for predicting future reactivation risk following TB treatment. We catalogue published studies to assess the progress made in transcriptomics, proteomics, and metabolomics in pulmonary TB. We also discuss how integrative multi-omics data can deepen understanding and support effective TB treatment, such as by revealing the interrelationships at multiple molecular levels, facilitating the identification of biologically interconnected processes, and accelerating precision medicine in TB treatment. However, proper validation in prospective longitudinal studies with long-term follow-up and outcome assessment must be conducted before the biomarkers are utilized in clinical practice.
Affiliation(s)
- Dian Ayu Eka Pitaloka
- Department of Pharmacology and Clinical Pharmacy, Faculty of Pharmacy, Universitas Padjadjaran, Sumedang, West Java, 45363, Indonesia
- Center for Translational Biomarker Research, Universitas Padjadjaran, Bandung, West Java, 40132, Indonesia
- Mas Rizky Anggun A A Syamsunarno
- Center for Translational Biomarker Research, Universitas Padjadjaran, Bandung, West Java, 40132, Indonesia
- Department of Biomedical Sciences, Faculty of Medicine, Universitas Padjadjaran, Sumedang, West Java, 45363, Indonesia
- Rizky Abdulah
- Department of Pharmacology and Clinical Pharmacy, Faculty of Pharmacy, Universitas Padjadjaran, Sumedang, West Java, 45363, Indonesia
- Center of Excellence in Higher Education for Pharmaceutical Care Innovation, Universitas Padjadjaran, Sumedang, West Java, 45363, Indonesia
- Lidya Chaidir
- Center for Translational Biomarker Research, Universitas Padjadjaran, Bandung, West Java, 40132, Indonesia
- Department of Biomedical Sciences, Faculty of Medicine, Universitas Padjadjaran, Sumedang, West Java, 45363, Indonesia
- Correspondence: Lidya Chaidir, Department of Biomedical Sciences, Faculty of Medicine, Universitas Padjadjaran, Sumedang, West Java, 45363, Indonesia, Tel +62-22-84288812, Email
|
43
|
Hong S, Park JH, Cho W, Choe H, Cheon JH. Secure tumor classification by shallow neural network using homomorphic encryption. BMC Genomics 2022; 23:284. [PMID: 35395714 PMCID: PMC8994372 DOI: 10.1186/s12864-022-08469-w] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 06/03/2021] [Accepted: 03/04/2022] [Indexed: 12/24/2022]
Abstract
BACKGROUND Disclosure of patients' genetic information in the process of applying machine learning techniques for tumor classification compromises the privacy of personal information. Homomorphic Encryption (HE), which supports operations on encrypted data, can be used as one of the tools to perform such computation without information leakage, but directly applying general machine learning algorithms is challenging because of the limited set of operations supported by HE. In particular, non-polynomial activation functions, including softmax functions, are difficult to implement with HE and require a suitable approximation method to minimize the loss of accuracy. In the secure genome analysis competition iDASH 2020, one competition task was to develop a multi-label tumor classification method that predicts the class of samples based on genetic information using HE. METHODS We develop a secure multi-label tumor classification method using HE to ensure privacy during all the computations of the model inference process. Our solution is based on a 1-layer neural network with the softmax activation function model and uses the approximate HE scheme. We present an approximation method that enables softmax activation in the model using HE and a technique for efficiently encoding data to reduce computational costs. In addition, we propose a HE-friendly data filtering method to reduce the size of large-scale genetic data. RESULTS We analyze a dataset from The Cancer Genome Atlas (TCGA), which consists of 3,622 samples from 11 types of cancer, with genetic features from 25,128 genes. Our preprocessing method reduces the number of genes to 4,096 or fewer and achieves a microAUC value of 0.9882 (85% accuracy) with a 1-layer shallow neural network. Using our model, we successfully compute the tumor classification inference steps on the encrypted test data in 3.75 minutes. As a result of exceptionally high microAUC values, our solution was awarded co-first place in iDASH 2020 Track 1: "Secure multi-label Tumor classification using Homomorphic Encryption". CONCLUSIONS Our solution is the first result of implementing a neural network model with softmax activation using HE. Also, the HE optimization methods presented in this work enable machine learning implementation using HE or other challenging HE applications.
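To see why softmax needs an approximation under HE, it helps to rewrite it using only additions and multiplications, the operations such schemes typically support. The sketch below is an illustration under stated assumptions, not the authors' approximation method, which the abstract does not specify: it works on plaintext floats rather than ciphertexts, replaces `exp` with a Taylor polynomial and replaces division with a Newton iteration for the reciprocal. All function names, the polynomial degree, the initial guess and the iteration count are hypothetical choices suited to small inputs.

```python
import math


def poly_exp(x, degree=7):
    """Taylor polynomial of exp(x) around 0, evaluated with Horner's rule.

    Dividing by the constant k is the same as multiplying by the plaintext
    constant 1/k, so only additions and multiplications are needed.
    """
    result = 1.0
    for k in range(degree, 0, -1):
        result = 1.0 + x * result / k
    return result


def newton_inverse(a, initial_guess=0.1, iterations=8):
    """Approximate 1/a via y <- y * (2 - a*y); needs 0 < a < 2 / initial_guess."""
    y = initial_guess
    for _ in range(iterations):
        y = y * (2.0 - a * y)
    return y


def approx_softmax(logits):
    """Softmax built from polynomial exp and a division-free reciprocal."""
    exps = [poly_exp(v) for v in logits]
    inv_total = newton_inverse(sum(exps))
    return [e * inv_total for e in exps]


def exact_softmax(logits):
    """Reference softmax using math.exp and ordinary division."""
    exps = [math.exp(v) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


logits = [0.5, -0.2, 0.1]  # small toy inputs, where the Taylor polynomial is accurate
approx = approx_softmax(logits)
exact = exact_softmax(logits)
max_error = max(abs(a - b) for a, b in zip(approx, exact))
```

The accuracy of this kind of approximation depends on keeping the inputs in a bounded range, since both the Taylor polynomial and the Newton iteration degrade or diverge outside it.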
Affiliation(s)
- Seungwan Hong
- Department of Mathematical Sciences, Seoul National University, 1, Gwanak-ro, Gwanak-gu, Seoul, Republic of Korea.
- Jai Hyun Park
- Department of Mathematical Sciences, Seoul National University, 1, Gwanak-ro, Gwanak-gu, Seoul, Republic of Korea
- Wonhee Cho
- Department of Mathematical Sciences, Seoul National University, 1, Gwanak-ro, Gwanak-gu, Seoul, Republic of Korea
- Hyeongmin Choe
- Department of Mathematical Sciences, Seoul National University, 1, Gwanak-ro, Gwanak-gu, Seoul, Republic of Korea
- Jung Hee Cheon
- Department of Mathematical Sciences, Seoul National University, 1, Gwanak-ro, Gwanak-gu, Seoul, Republic of Korea.,Cryptolab Inc., 1, Gwanak-ro, Gwanak-gu, Seoul, Republic of Korea
|
44
|
Genetic and Molecular Characterization Revealed the Prognosis Efficiency of Histone Acetylation in Pan-Digestive Cancers. JOURNAL OF ONCOLOGY 2022; 2022:3938652. [PMID: 35422864 PMCID: PMC9005301 DOI: 10.1155/2022/3938652] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/21/2021] [Revised: 03/02/2022] [Accepted: 03/14/2022] [Indexed: 11/18/2022]
Abstract
The imbalance between acetylation and deacetylation of histone proteins, important for epigenetic modifications, is closely associated with various diseases, including cancer. However, knowledge regarding the modification of histones across the different types of digestive cancers is still lacking. The purpose of this research was to analyze the role of histone acetylation and deacetylation in pan-digestive cancers. We systematically characterized the molecular alterations and clinical relevance of 13 histone acetyltransferase (HAT) and 18 histone deacetylase (HDAC) genes in five types of digestive cancers, including esophageal carcinoma, gastric cancer, hepatocellular carcinoma, pancreatic cancer, and colorectal cancer. Recurrent mutations and copy number variation (CNV) were extensively found in acetylation-associated genes across pan-digestive cancers. HDAC9 and KAT6A showed widespread copy number amplification across the five digestive cancers, while ESCO2, EP300, and HDAC10 had prevalent copy number deletions. Accordingly, we found that HAT and HDAC genes correlated with multiple cancer hallmark-related pathways, especially the histone modification-related pathway and the PRC2 complex pathway. Furthermore, the expression pattern of HAT and HDAC genes stratified patients with clinical benefit in hepatocellular carcinoma and pancreatic cancer. These results indicated that acetylation acts as a key molecular regulator of pan-digestive tumor progression.
|
45
|
Zhou F, Ren J, Liu Y, Li X, Wang W, Wu C. Interep: An R Package for High-Dimensional Interaction Analysis of the Repeated Measurement Data. Genes (Basel) 2022; 13:544. [PMID: 35328097 PMCID: PMC8950762 DOI: 10.3390/genes13030544] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 02/09/2022] [Revised: 03/12/2022] [Accepted: 03/13/2022] [Indexed: 02/05/2023]
Abstract
We introduce interep, an R package for interaction analysis of repeated measurement data with high-dimensional main and interaction effects. In G × E interaction studies, the forms of environmental factors play a critical role in determining how structured sparsity should be imposed in the high-dimensional scenario to identify important effects. Zhou et al. (2019) (PMID: 31816972) proposed a longitudinal penalization method to select main and interaction effects corresponding to the individual and group structure, respectively, which requires a mixture of individual and group level penalties. The R package interep implements generalized estimating equation (GEE)-based penalization methods with this sparsity assumption. Moreover, alternative methods have also been implemented in the package. These alternative methods select effects only on an individual level and ignore the group-level interaction structure. In this software article, we first introduce the statistical methodology corresponding to the penalized GEE methods implemented in the package. Next, we present the usage of the core and supporting functions, followed by a simulation example with annotated R code. The R package interep is available from the Comprehensive R Archive Network (CRAN).
Affiliation(s)
- Fei Zhou
- Department of Statistics, Kansas State University, Manhattan, KS 66506, USA
- Jie Ren
- Department of Biostatistics and Health Data Sciences, Indiana University School of Medicine, Indianapolis, IN 46202, USA
- Yuwen Liu
- Department of Statistics, Kansas State University, Manhattan, KS 66506, USA
- Xiaoxi Li
- Department of Statistics, Kansas State University, Manhattan, KS 66506, USA
- Weiqun Wang
- Department of Food, Nutrition, Dietetics and Health, Kansas State University, Manhattan, KS 66506, USA
- Cen Wu
- Department of Statistics, Kansas State University, Manhattan, KS 66506, USA
|
46
|
Seyres D, Cabassi A, Lambourne JJ, Burden F, Farrow S, McKinney H, Batista J, Kempster C, Pietzner M, Slingsby O, Cao TH, Quinn PA, Stefanucci L, Sims MC, Rehnstrom K, Adams CL, Frary A, Ergüener B, Kreuzhuber R, Mocciaro G, D'Amore S, Koulman A, Grassi L, Griffin JL, Ng LL, Park A, Savage DB, Langenberg C, Bock C, Downes K, Wareham NJ, Allison M, Vacca M, Kirk PDW, Frontini M. Transcriptional, epigenetic and metabolic signatures in cardiometabolic syndrome defined by extreme phenotypes. Clin Epigenetics 2022; 14:39. [PMID: 35279219 PMCID: PMC8917653 DOI: 10.1186/s13148-022-01257-z] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 11/16/2021] [Accepted: 02/25/2022] [Indexed: 02/06/2023]
Abstract
BACKGROUND This work is aimed at improving the understanding of cardiometabolic syndrome pathophysiology and its relationship with thrombosis by generating a multi-omic disease signature. METHODS/RESULTS We combined classic plasma biochemistry and plasma biomarkers with the transcriptional and epigenetic characterisation of cell types involved in thrombosis, obtained from two extreme phenotype groups (morbidly obese and lipodystrophy) and lean individuals to identify the molecular mechanisms at play, highlighting patterns of abnormal activation in innate immune phagocytic cells. Our analyses showed that extreme phenotype groups could be distinguished from lean individuals, and from each other, across all data layers. The characterisation of the same obese group, 6 months after bariatric surgery, revealed the loss of the abnormal activation of innate immune cells previously observed. However, rather than reverting to the gene expression landscape of lean individuals, this occurred via the establishment of novel gene expression landscapes. NETosis and its control mechanisms emerge amongst the pathways that show an improvement after surgical intervention. CONCLUSIONS We showed that the morbidly obese and lipodystrophy groups, despite some differences, shared a common cardiometabolic syndrome signature. We also showed that this could be used to discriminate, amongst the normal population, those individuals with a higher likelihood of presenting with the disease, even when not displaying the classic features.
Affiliation(s)
- Denis Seyres
  - National Institute for Health Research BioResource, Cambridge University Hospitals, Cambridge Biomedical Campus, Cambridge, UK
  - Department of Haematology, University of Cambridge, Cambridge Biomedical Campus, Cambridge, UK
  - NHS Blood and Transplant, Cambridge Biomedical Campus, Cambridge, UK
- Alessandra Cabassi
  - MRC Biostatistics Unit, University of Cambridge, Cambridge Biomedical Campus, Cambridge, UK
- John J Lambourne
  - National Institute for Health Research BioResource, Cambridge University Hospitals, Cambridge Biomedical Campus, Cambridge, UK
  - Department of Haematology, University of Cambridge, Cambridge Biomedical Campus, Cambridge, UK
  - NHS Blood and Transplant, Cambridge Biomedical Campus, Cambridge, UK
- Frances Burden
  - National Institute for Health Research BioResource, Cambridge University Hospitals, Cambridge Biomedical Campus, Cambridge, UK
  - Department of Haematology, University of Cambridge, Cambridge Biomedical Campus, Cambridge, UK
  - NHS Blood and Transplant, Cambridge Biomedical Campus, Cambridge, UK
- Samantha Farrow
  - National Institute for Health Research BioResource, Cambridge University Hospitals, Cambridge Biomedical Campus, Cambridge, UK
  - Department of Haematology, University of Cambridge, Cambridge Biomedical Campus, Cambridge, UK
  - NHS Blood and Transplant, Cambridge Biomedical Campus, Cambridge, UK
- Harriet McKinney
  - Department of Haematology, University of Cambridge, Cambridge Biomedical Campus, Cambridge, UK
- Joana Batista
  - Department of Haematology, University of Cambridge, Cambridge Biomedical Campus, Cambridge, UK
- Carly Kempster
  - Department of Haematology, University of Cambridge, Cambridge Biomedical Campus, Cambridge, UK
- Maik Pietzner
  - MRC Epidemiology Unit, University of Cambridge, Cambridge, UK
- Oliver Slingsby
  - Department of Cardiovascular Sciences, Glenfield Hospital, University of Leicester, Leicester, UK
  - National Institute for Health Research Leicester Biomedical Research Centre, Glenfield Hospital, Leicester, UK
- Thong Huy Cao
  - Department of Cardiovascular Sciences, Glenfield Hospital, University of Leicester, Leicester, UK
  - National Institute for Health Research Leicester Biomedical Research Centre, Glenfield Hospital, Leicester, UK
- Paulene A Quinn
  - Department of Cardiovascular Sciences, Glenfield Hospital, University of Leicester, Leicester, UK
  - National Institute for Health Research Leicester Biomedical Research Centre, Glenfield Hospital, Leicester, UK
- Luca Stefanucci
  - Department of Haematology, University of Cambridge, Cambridge Biomedical Campus, Cambridge, UK
  - NHS Blood and Transplant, Cambridge Biomedical Campus, Cambridge, UK
  - British Heart Foundation Centre of Excellence, Cambridge Biomedical Campus, Cambridge, UK
- Matthew C Sims
  - Department of Haematology, University of Cambridge, Cambridge Biomedical Campus, Cambridge, UK
  - NHS Blood and Transplant, Cambridge Biomedical Campus, Cambridge, UK
  - Oxford Haemophilia and Thrombosis Centre, Oxford University Hospitals NHS Foundation Trust, NIHR Oxford Biomedical Research Centre, Oxford, UK
- Karola Rehnstrom
  - Department of Haematology, University of Cambridge, Cambridge Biomedical Campus, Cambridge, UK
- Claire L Adams
  - Metabolic Research Laboratories, Wellcome Trust-Medical Research Council Institute of Metabolic Science, University of Cambridge, Cambridge, CB2 0QQ, UK
- Amy Frary
  - Department of Haematology, University of Cambridge, Cambridge Biomedical Campus, Cambridge, UK
- Bekir Ergüener
  - CeMM Research Center for Molecular Medicine of the Austrian Academy of Sciences, Vienna, Austria
- Roman Kreuzhuber
  - Department of Haematology, University of Cambridge, Cambridge Biomedical Campus, Cambridge, UK
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton, UK
- Gabriele Mocciaro
  - Department of Biochemistry and the Cambridge Systems Biology Centre, University of Cambridge, The Sanger Building, 80 Tennis Court Road, Cambridge, CB2 1GA, UK
- Simona D'Amore
  - Addenbrooke's Hospital, NIHR Cambridge Biomedical Research Centre, Cambridge University Hospitals NHS Foundation Trust, Cambridge, UK
  - Department of Medicine, Aldo Moro University of Bari, Piazza Giulio Cesare 11, 70124, Bari, Italy
  - National Cancer Research Center, IRCCS Istituto Tumori 'Giovanni Paolo II', Viale Orazio Flacco, 65, 70124, Bari, Italy
- Albert Koulman
  - MRC Epidemiology Unit, University of Cambridge, Cambridge, UK
  - MRC Elsie Widdowson Laboratory, Cambridge, UK
  - National Institute for Health Research Biomedical Research Centres Core Nutritional Biomarker Laboratory, Addenbrooke's Hospital, University of Cambridge, Cambridge, UK
  - National Institute for Health Research Biomedical Research Centres Core Metabolomics and Lipidomics Laboratory, Addenbrooke's Hospital, University of Cambridge, Cambridge, UK
- Luigi Grassi
  - National Institute for Health Research BioResource, Cambridge University Hospitals, Cambridge Biomedical Campus, Cambridge, UK
  - Department of Haematology, University of Cambridge, Cambridge Biomedical Campus, Cambridge, UK
  - NHS Blood and Transplant, Cambridge Biomedical Campus, Cambridge, UK
- Julian L Griffin
  - Department of Biochemistry and the Cambridge Systems Biology Centre, University of Cambridge, The Sanger Building, 80 Tennis Court Road, Cambridge, CB2 1GA, UK
- Leong Loke Ng
  - Department of Cardiovascular Sciences, Glenfield Hospital, University of Leicester, Leicester, UK
  - National Institute for Health Research Leicester Biomedical Research Centre, Glenfield Hospital, Leicester, UK
- Adrian Park
  - Addenbrooke's Hospital, NIHR Cambridge Biomedical Research Centre, Cambridge University Hospitals NHS Foundation Trust, Cambridge, UK
- David B Savage
  - Metabolic Research Laboratories, Wellcome Trust-Medical Research Council Institute of Metabolic Science, University of Cambridge, Cambridge, CB2 0QQ, UK
- Christoph Bock
  - CeMM Research Center for Molecular Medicine of the Austrian Academy of Sciences, Vienna, Austria
  - Ludwig Boltzmann Institute for Rare and Undiagnosed Diseases, Vienna, Austria
  - Department of Laboratory Medicine, Medical University of Vienna, Vienna, Austria
- Kate Downes
  - National Institute for Health Research BioResource, Cambridge University Hospitals, Cambridge Biomedical Campus, Cambridge, UK
  - Department of Haematology, University of Cambridge, Cambridge Biomedical Campus, Cambridge, UK
  - East Midlands and East of England Genomic Laboratory Hub, Cambridge University Hospitals NHS Foundation Trust, Cambridge, UK
- Michael Allison
  - Addenbrooke's Hospital, NIHR Cambridge Biomedical Research Centre, Cambridge University Hospitals NHS Foundation Trust, Cambridge, UK
- Michele Vacca
  - Metabolic Research Laboratories, Wellcome Trust-Medical Research Council Institute of Metabolic Science, University of Cambridge, Cambridge, CB2 0QQ, UK
  - Department of Biochemistry and the Cambridge Systems Biology Centre, University of Cambridge, The Sanger Building, 80 Tennis Court Road, Cambridge, CB2 1GA, UK
- Paul D W Kirk
  - MRC Biostatistics Unit, University of Cambridge, Cambridge Biomedical Campus, Cambridge, UK
  - Cambridge Institute of Therapeutic Immunology and Infectious Disease (CITIID), Jeffrey Cheah Biomedical Centre, University of Cambridge, Cambridge Biomedical Campus, Puddicombe Way, Cambridge, CB2 0AW, UK
- Mattia Frontini
  - Department of Haematology, University of Cambridge, Cambridge Biomedical Campus, Cambridge, UK
  - NHS Blood and Transplant, Cambridge Biomedical Campus, Cambridge, UK
  - British Heart Foundation Centre of Excellence, Cambridge Biomedical Campus, Cambridge, UK
  - Institute of Biomedical & Clinical Science, College of Medicine and Health, University of Exeter Medical School, RILD Building, Barrack Road, Exeter, EX2 5DW, UK
|
47
|
Kang M, Ko E, Mersha TB. A roadmap for multi-omics data integration using deep learning. Brief Bioinform 2022; 23:bbab454. [PMID: 34791014 PMCID: PMC8769688 DOI: 10.1093/bib/bbab454] [Citation(s) in RCA: 81] [Impact Index Per Article: 40.5] [Received: 07/27/2021] [Revised: 09/30/2021] [Accepted: 10/05/2021] [Indexed: 12/18/2022] Open
Abstract
High-throughput next-generation sequencing now makes it possible to generate a vast amount of multi-omics data for various applications. These data have revolutionized biomedical research by providing a more comprehensive understanding of the biological systems and molecular mechanisms of disease development. Recently, deep learning (DL) algorithms have become one of the most promising methods in multi-omics data analysis, due to their predictive performance and capability of capturing nonlinear and hierarchical features. While integrating and translating multi-omics data into useful functional insights remain the biggest bottleneck, there is a clear trend towards incorporating multi-omics analysis in biomedical research to help explain the complex relationships between molecular layers. Multi-omics data have a role to improve prevention, early detection and prediction; monitor progression; interpret patterns and endotyping; and design personalized treatments. In this review, we outline a roadmap of multi-omics integration using DL and offer a practical perspective into the advantages, challenges and barriers to the implementation of DL in multi-omics data.
Affiliation(s)
- Mingon Kang
  - Department of Computer Science at the University of Nevada, Las Vegas, NV, USA
- Euiseong Ko
  - Department of Computer Science at the University of Nevada, Las Vegas, NV, USA
- Tesfaye B Mersha
  - Department of Pediatrics, Cincinnati Children’s Hospital Medical Center, University of Cincinnati, Cincinnati, OH, USA
|
48
|
Dumas T, Courant F, Almunia C, Boccard J, Rosain D, Duporté G, Armengaud J, Fenet H, Gomez E. An integrated metabolomics and proteogenomics approach reveals molecular alterations following carbamazepine exposure in the male mussel Mytilus galloprovincialis. Chemosphere 2022; 286:131793. [PMID: 34364230 DOI: 10.1016/j.chemosphere.2021.131793] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 04/02/2021] [Revised: 07/05/2021] [Accepted: 08/02/2021] [Indexed: 06/13/2023]
Abstract
Carbamazepine is one of the most abundant pharmaceutical active compounds detected in aquatic systems. Based on laboratory exposures, carbamazepine has been proven to adversely affect aquatic organisms. However, the underlying molecular events remain poorly understood. This study aims to investigate the molecular mechanisms potentially associated with toxicological effects of carbamazepine on the mussel Mytilus galloprovincialis exposed for 3 days at realistic concentrations encountered in coastal environments (80 ng/L and 8 μg/L). An integrated metabolomics and proteogenomics approach, including data fusion strategy, was applied to gain more insight in molecular events and cellular processes triggered by carbamazepine exposure. Consistent metabolic and protein signatures revealed a metabolic rewiring and cellular stress at both concentrations (e.g. intensification of protein synthesis, transport and catabolism processes, disruption of lipid and amino acid metabolisms). These highlighted molecular signatures point to the induction of autophagy, closely related with carbamazepine mechanism of action, as well as a destabilization of the lysosomal membranes and an enzymatic overactivity of the peroxisomes. Induction of programmed cell death was highlighted by the modulation of apoptotic cognate proteins. The proposed integrative omics data analysis was shown to be highly relevant to identify the modulations of the two molecular levels, i.e. metabolites and proteins. Multi-omics approach is able to explain the resulting complex biological system, and document stronger toxicological pieces of evidence on pharmaceutical active compounds at environmental concentrations in sentinel organisms.
Affiliation(s)
- Thibaut Dumas
  - HydroSciences Montpellier, IRD, CNRS, University of Montpellier, Montpellier, France
- Frédérique Courant
  - HydroSciences Montpellier, IRD, CNRS, University of Montpellier, Montpellier, France
- Christine Almunia
  - Université Paris-Saclay, CEA, INRAE, Département Médicaments et Technologies pour la Santé (DMTS), SPI, 30200, Bagnols-sur-Cèze, France
- Julien Boccard
  - School of Pharmaceutical Sciences, University of Geneva, Geneva, 1211, Switzerland; Institute of Pharmaceutical Sciences of Western Switzerland, University of Geneva, Geneva, 1211, Switzerland
- David Rosain
  - HydroSciences Montpellier, IRD, CNRS, University of Montpellier, Montpellier, France
- Geoffroy Duporté
  - HydroSciences Montpellier, IRD, CNRS, University of Montpellier, Montpellier, France
- Jean Armengaud
  - Université Paris-Saclay, CEA, INRAE, Département Médicaments et Technologies pour la Santé (DMTS), SPI, 30200, Bagnols-sur-Cèze, France
- Hélène Fenet
  - HydroSciences Montpellier, IRD, CNRS, University of Montpellier, Montpellier, France
- Elena Gomez
  - HydroSciences Montpellier, IRD, CNRS, University of Montpellier, Montpellier, France
|
49
|
Reiss JD, Peterson LS, Nesamoney SN, Chang AL, Pasca AM, Marić I, Shaw GM, Gaudilliere B, Wong RJ, Sylvester KG, Bonifacio SL, Aghaeepour N, Gibbs RS, Stevenson DK. Perinatal infection, inflammation, preterm birth, and brain injury: A review with proposals for future investigations. Exp Neurol 2022; 351:113988. [DOI: 10.1016/j.expneurol.2022.113988] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 06/30/2021] [Revised: 01/06/2022] [Accepted: 01/13/2022] [Indexed: 11/26/2022]
|
50
|
Amaral DT, Romeiro-Brito M, Bonatelli IAS. Exploring Phylogenetic Relationships and Divergence Times of Bioluminescent Species Using Genomic and Transcriptomic Data. Methods Mol Biol 2022; 2525:409-423. [PMID: 35836087 DOI: 10.1007/978-1-0716-2473-9_32] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/15/2023]
Abstract
Next-generation sequencing (NGS) has dominated the scene of genomics and evolutionary biology as a great amount of genomic data have been accumulated for a diverse set of species. At the same time, phylogenetic approaches and programs are in development to allow better use of such large-size datasets. Phylogenomics appears as a promising field to accommodate and explore all the information of NGS data in phylogenetic methods, being an important approach to investigate the evolution of bioluminescence in different organisms. To guarantee accurate results in phylogenomic studies, it is mandatory to correctly identify orthologous genes in phylogenetic reconstruction. Here, we show a simplified step-by-step framework to perform phylogenetic analysis along with divergence time estimation, beginning with an orthologous search. As empirical data, we exemplify transcriptome sequences of six species of the Elateroidea superfamily (Coleoptera). We introduce several bioinformatics tools for handling genomic data, especially those available in the software OrthoFinder, IQTREE, BEAST2, and TreePL.
Affiliation(s)
- Danilo T Amaral
  - Departamento de Biologia, Centro de Ciências Humanas e Biológicas, Universidade Federal de São Carlos (UFSCar), Sorocaba, Brazil
  - Programa de Pós Graduação em Biologia Comparada, Faculdade de Filosofia, Ciências e Letras de Ribeirão Preto, Universidade de São Paulo (USP), Ribeirão Preto, Brazil
- Monique Romeiro-Brito
  - Departamento de Biologia, Centro de Ciências Humanas e Biológicas, Universidade Federal de São Carlos (UFSCar), Sorocaba, Brazil
- Isabel A S Bonatelli
  - Departamento de Ecologia e Biologia Evolutiva, Universidade Federal de São Paulo (UNIFESP), Diadema, São Paulo, Brazil
|