2
Pan D, Song X, Pan J. Joint analysis of multivariate failure time data with latent variables. Stat Methods Med Res 2022; 31:1292-1312. [DOI: 10.1177/09622802221089028] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/15/2022]
Abstract
We propose a joint modeling approach to investigate the observed and latent risk factors of the multivariate failure times of interest. The proposed model comprises two parts. The first part is a distribution-free confirmatory factor analysis model that characterizes the latent factors by correlated multiple observed variables. The second part is a multivariate additive hazards model that assesses the observed and latent risk factors of the failure times. A hybrid procedure that combines the borrow-strength estimation approach and the asymptotically distribution-free generalized least square method is developed to estimate the model parameters. The asymptotic properties of the proposed estimators are derived. Simulation studies demonstrate that the proposed method performs well for practical settings. An application to a study concerning the risk factors of multiple diabetic complications is provided.
Affiliation(s)
- Deng Pan
  - School of Mathematics and Statistics, Huazhong University of Science and Technology, Wuhan, China
- Xinyuan Song
  - Department of Statistics, The Chinese University of Hong Kong, Hong Kong, China
- Junhao Pan
  - Department of Psychology, Sun Yat-sen University, Guangzhou, China
3
Zhang XJ, Zhou L, Lu WJ, Du WX, Mi XY, Li Z, Li XY, Wang ZW, Wang Y, Duan M, Gui JF. Comparative transcriptomic analysis reveals an association of gibel carp fatty liver with ferroptosis pathway. BMC Genomics 2021; 22:328. [PMID: 33952209 PMCID: PMC8101161 DOI: 10.1186/s12864-021-07621-2] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 12/24/2020] [Accepted: 04/14/2021] [Indexed: 12/17/2022] [Open Access]
Abstract
Background: Fatty liver has become a major problem that causes huge economic losses in many aquaculture modes. It is a common physiological or pathological phenomenon in aquaculture, but its causes and underlying mechanism remain enigmatic. Methods: Three liver samples each from the control group of allogynogenetic gibel carp with normal liver and from the overfeeding group with fatty liver were collected randomly for detailed comparison of histological structure, lipid accumulation, transcriptomic profile, latent pathway identification analysis (LPIA), marker gene expression, and hepatocyte mitochondria analyses. Results: Compared with normal liver, larger hepatocytes and more lipid accumulation were observed in fatty liver. Transcriptomic analysis between fatty liver and normal liver showed a totally different transcriptional trajectory. GO term and KEGG pathway analyses revealed several enriched pathways in fatty liver, such as lipid biosynthesis, degradation accumulation, peroxidation, or metabolism and redox balance activities. LPIA identified an activated ferroptosis pathway in the fatty liver. qPCR analysis confirmed that gpx4, a negative regulator of ferroptosis, was significantly downregulated, while three positively regulated marker genes (acsl4, tfr1 and gcl) were upregulated in fatty liver. Moreover, the hepatocytes of fatty liver had more condensed mitochondria, and some of their outer membranes were almost ruptured. Conclusions: We reveal an association between ferroptosis and fish fatty liver for the first time, suggesting that ferroptosis might be activated in fatty liver. The current study therefore provides a clue for future studies on fish fatty liver problems. Supplementary Information: The online version contains supplementary material available at 10.1186/s12864-021-07621-2.
Affiliation(s)
- Xiao-Juan Zhang
  - College of Fisheries, Huazhong Agricultural University, Wuhan, 430070, China; State Key Laboratory of Freshwater Ecology and Biotechnology, Institute of Hydrobiology, Innovation Academy for Seed Design, Chinese Academy of Sciences, Wuhan, 430072, Hubei, China; University of Chinese Academy of Sciences, Beijing, 100049, China
- Li Zhou
  - State Key Laboratory of Freshwater Ecology and Biotechnology, Institute of Hydrobiology, Innovation Academy for Seed Design, Chinese Academy of Sciences, Wuhan, 430072, Hubei, China; University of Chinese Academy of Sciences, Beijing, 100049, China
- Wei-Jia Lu
  - State Key Laboratory of Freshwater Ecology and Biotechnology, Institute of Hydrobiology, Innovation Academy for Seed Design, Chinese Academy of Sciences, Wuhan, 430072, Hubei, China; University of Chinese Academy of Sciences, Beijing, 100049, China
- Wen-Xuan Du
  - State Key Laboratory of Freshwater Ecology and Biotechnology, Institute of Hydrobiology, Innovation Academy for Seed Design, Chinese Academy of Sciences, Wuhan, 430072, Hubei, China; University of Chinese Academy of Sciences, Beijing, 100049, China
- Xiang-Yuan Mi
  - State Key Laboratory of Freshwater Ecology and Biotechnology, Institute of Hydrobiology, Innovation Academy for Seed Design, Chinese Academy of Sciences, Wuhan, 430072, Hubei, China; University of Chinese Academy of Sciences, Beijing, 100049, China
- Zhi Li
  - State Key Laboratory of Freshwater Ecology and Biotechnology, Institute of Hydrobiology, Innovation Academy for Seed Design, Chinese Academy of Sciences, Wuhan, 430072, Hubei, China; University of Chinese Academy of Sciences, Beijing, 100049, China
- Xi-Yin Li
  - State Key Laboratory of Freshwater Ecology and Biotechnology, Institute of Hydrobiology, Innovation Academy for Seed Design, Chinese Academy of Sciences, Wuhan, 430072, Hubei, China; University of Chinese Academy of Sciences, Beijing, 100049, China
- Zhong-Wei Wang
  - State Key Laboratory of Freshwater Ecology and Biotechnology, Institute of Hydrobiology, Innovation Academy for Seed Design, Chinese Academy of Sciences, Wuhan, 430072, Hubei, China; University of Chinese Academy of Sciences, Beijing, 100049, China
- Yang Wang
  - State Key Laboratory of Freshwater Ecology and Biotechnology, Institute of Hydrobiology, Innovation Academy for Seed Design, Chinese Academy of Sciences, Wuhan, 430072, Hubei, China; University of Chinese Academy of Sciences, Beijing, 100049, China
- Ming Duan
  - State Key Laboratory of Freshwater Ecology and Biotechnology, Institute of Hydrobiology, Innovation Academy for Seed Design, Chinese Academy of Sciences, Wuhan, 430072, Hubei, China; University of Chinese Academy of Sciences, Beijing, 100049, China
- Jian-Fang Gui
  - College of Fisheries, Huazhong Agricultural University, Wuhan, 430070, China; State Key Laboratory of Freshwater Ecology and Biotechnology, Institute of Hydrobiology, Innovation Academy for Seed Design, Chinese Academy of Sciences, Wuhan, 430072, Hubei, China; University of Chinese Academy of Sciences, Beijing, 100049, China
4
Jendoubi T, Ebbels TMD. Integrative analysis of time course metabolic data and biomarker discovery. BMC Bioinformatics 2020; 21:11. [PMID: 31918658 PMCID: PMC6953149 DOI: 10.1186/s12859-019-3333-0] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 09/29/2019] [Accepted: 12/19/2019] [Indexed: 02/06/2023] [Open Access]
Abstract
Background: Metabolomics time-course experiments provide the opportunity to understand the changes to an organism by observing the evolution of metabolic profiles in response to internal or external stimuli. Along with other omic longitudinal profiling technologies, these techniques have great potential to uncover complex relations between variations across diverse omic variables and provide unique insights into the underlying biology of the system. However, many statistical methods currently used to analyse short time-series omic data i) are prone to overfitting, ii) do not fully take into account the experimental design, iii) do not make full use of the multivariate information intrinsic to the data, or iv) are unable to uncover multiple associations between different omic data. The model we propose is an attempt to i) overcome overfitting by using a weakly informative Bayesian model, ii) capture experimental design conditions through a mixed-effects model, iii) model interdependencies between variables by augmenting the mixed-effects model with a conditional auto-regressive (CAR) component and iv) identify potential associations between heterogeneous omic variables by using a horseshoe prior. Results: We assess the performance of our model on synthetic and real datasets and show that it can outperform comparable models for metabolomic longitudinal data analysis. In addition, our proposed method provides the analyst with new insights into the data, as it is able to identify metabolic biomarkers related to treatment, infer pathways perturbed as a result of treatment and find significant associations with additional omic variables. We also show through simulation that our model is fairly robust against inaccuracies in metabolite assignments. On real data, we demonstrate that the number of profiled metabolites slightly affects the predictive ability of the model. Conclusions: Our single-model approach to longitudinal analysis of metabolomics data supports integrative analysis and biomarker discovery simultaneously. In addition, it aids interpretation by allowing analysis at the pathway level. An accompanying R package for the model has been developed using the probabilistic programming language Stan. The package offers user-friendly functions for simulating data, fitting the model, assessing model fit and postprocessing the results. The main aim of the R package is to offer metabolomics scientists freely accessible resources for integrative longitudinal analysis, and applied researchers easy-to-use visualization functions for interpreting results.
Affiliation(s)
- Takoua Jendoubi
  - Epidemiology and Biostatistics, School of Public Health, Imperial College London, Norfolk Place, London, W2 1PG, UK; Statistics Section, Department of Mathematics, Imperial College London, South Kensington Campus, London, SW7 2AZ, UK
- Timothy M D Ebbels
  - Department of Surgery and Cancer, Imperial College London, South Kensington Campus, London, SW7 2AZ, UK
5
Hao J, Kim Y, Kim TK, Kang M. PASNet: pathway-associated sparse deep neural network for prognosis prediction from high-throughput data. BMC Bioinformatics 2018; 19:510. [PMID: 30558539 PMCID: PMC6296065 DOI: 10.1186/s12859-018-2500-z] [Citation(s) in RCA: 44] [Impact Index Per Article: 7.3] [Received: 06/29/2018] [Accepted: 11/16/2018] [Indexed: 12/13/2022] [Open Access]
Abstract
Background: Predicting prognosis in patients from large-scale genomic data is a fundamentally challenging problem in genomic medicine. However, the prognosis remains poor in many diseases. The poor prognosis may be caused by the high complexity of biological systems, in which multiple biological components and their hierarchical relationships are involved. Moreover, it is challenging to develop robust computational solutions with high-dimension, low-sample-size data. Results: In this study, we propose a Pathway-Associated Sparse Deep Neural Network (PASNet) that not only predicts patients’ prognoses but also describes complex biological processes regarding biological pathways for prognosis. PASNet models a multilayered, hierarchical biological system of genes and pathways to predict clinical outcomes by leveraging deep learning. The sparse solution of PASNet provides the model interpretability that most conventional fully-connected neural networks lack. We applied PASNet to long-term survival prediction in Glioblastoma multiforme (GBM), a primary brain cancer that shows poor prognostic performance. The predictive performance of PASNet was evaluated with multiple cross-validation experiments. PASNet showed a higher Area Under the Curve (AUC) and F1-score than previous long-term survival prediction classifiers, and the significance of PASNet’s performance was assessed by the Wilcoxon signed-rank test. Furthermore, the biological pathways found by PASNet have been reported as significant pathways in GBM in previous biology and medicine research. Conclusions: PASNet can describe the different biological systems of clinical outcomes for prognostic prediction as well as predict prognosis more accurately than the current state-of-the-art methods. To the best of our knowledge, PASNet is the first pathway-based deep neural network that captures hierarchical representations of genes and pathways and their nonlinear effects. Additionally, PASNet is promising because of its flexible model representation and interpretability, embodying the strengths of deep learning. The open-source code of PASNet is available at https://github.com/DataX-JieHao/PASNet.
Affiliation(s)
- Jie Hao
  - Kennesaw State University, Kennesaw, USA
- Tae-Kyung Kim
  - University of Texas Southwestern Medical Center, Dallas, USA; Department of Life Sciences, Pohang Institute of Science and Technology (POSTECH), Dallas, USA
- Mingon Kang
  - Kennesaw State University, Kennesaw, USA; Kennesaw State University, Marietta, USA
6
Franks AM, Markowetz F, Airoldi EM. REFINING CELLULAR PATHWAY MODELS USING AN ENSEMBLE OF HETEROGENEOUS DATA SOURCES. Ann Appl Stat 2018; 12:1361-1384. [PMID: 36506698 PMCID: PMC9733905 DOI: 10.1214/16-aoas915] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/15/2022]
Abstract
Improving current models and hypotheses of cellular pathways is one of the major challenges of systems biology and functional genomics. There is a need for methods to build on established expert knowledge and reconcile it with results of new high-throughput studies. Moreover, the available sources of data are heterogeneous, and the data need to be integrated in different ways depending on which part of the pathway they are most informative for. In this paper, we introduce a compartment specific strategy to integrate edge, node and path data for refining a given network hypothesis. To carry out inference, we use a local-move Gibbs sampler for updating the pathway hypothesis from a compendium of heterogeneous data sources, and a new network regression idea for integrating protein attributes. We demonstrate the utility of this approach in a case study of the pheromone response MAPK pathway in the yeast S. cerevisiae.
Affiliation(s)
- Alexander M Franks
  - Department of Statistics and Applied Probability, University of California, Santa Barbara, South Hall, Santa Barbara, California 93106, USA
- Florian Markowetz
  - Cancer Research UK, Cambridge Institute, Li Ka Shing Centre, University of Cambridge, Robinson Way, Cambridge, CB2 0RE, United Kingdom
- Edoardo M Airoldi
  - Fox School of Business, Department of Statistical Science, Temple University, Center for Data Science, 1810 Liacouras Walk, Philadelphia, Pennsylvania 19122, USA