51. Vinga S. Structured sparsity regularization for analyzing high-dimensional omics data. Brief Bioinform 2020; 22:77-87. [PMID: 32597465 DOI: 10.1093/bib/bbaa122] [Citation(s) in RCA: 16] [Impact Index Per Article: 4.0] [Received: 12/15/2019] [Revised: 05/15/2020] [Accepted: 05/18/2020] [Indexed: 12/18/2022]
Abstract
The development of new molecular and cell technologies is having a significant impact on the quantity of data generated nowadays. The growth of omics databases is creating a considerable potential for knowledge discovery and, concomitantly, is bringing new challenges to statistical learning and computational biology for health applications. Indeed, the high dimensionality of these data may hamper the use of traditional regression methods and parameter estimation algorithms due to the intrinsic non-identifiability of the inherent optimization problem. Regularized optimization has been rising as a promising and useful strategy to solve these ill-posed problems by imposing additional constraints in the solution parameter space. In particular, the field of statistical learning with sparsity has been significantly contributing to building accurate models that also bring interpretability to biological observations and phenomena. Beyond the now-classic elastic net, one of the best-known methods that combine lasso with ridge penalizations, we briefly overview recent literature on structured regularizers and penalty functions that have been applied in biomedical data to build parsimonious models in a variety of underlying contexts, from survival to generalized linear models. These methods include functions of $\ell _k$-norms and network-based penalties that take into account the inherent relationships between the features. The successful application to omics data illustrates the potential of sparse structured regularization for identifying disease's molecular signatures and for creating high-performance clinical decision support systems towards more personalized healthcare. Supplementary information: Supplementary data are available at Briefings in Bioinformatics online.
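For orientation, the two penalty families named in the abstract can be written out in their commonly used textbook forms. This is a sketch under assumptions, not the notation of the review itself: $L(\beta)$ stands for any loss (for example a negative log-likelihood), $\lambda$, $\alpha$, $\lambda_1$, $\lambda_2$ are tuning parameters, and $\mathcal{L}$ is assumed to be the unnormalized Laplacian of a feature graph with edge set $E$ and weights $w_{ij}$.

```latex
% Elastic net: convex combination of lasso (l1) and ridge (squared l2) penalties
\hat{\beta} = \arg\min_{\beta} \; L(\beta)
  + \lambda \left[ \alpha \lVert \beta \rVert_1
  + \frac{1-\alpha}{2} \lVert \beta \rVert_2^2 \right],
  \qquad \lambda \ge 0,\; 0 \le \alpha \le 1

% Network-based penalty: sparsity plus smoothness over connected features
P_{\mathrm{net}}(\beta) = \lambda_1 \lVert \beta \rVert_1
  + \lambda_2 \, \beta^{\top} \mathcal{L} \beta,
  \qquad
  \beta^{\top} \mathcal{L} \beta = \sum_{(i,j) \in E} w_{ij} \, (\beta_i - \beta_j)^2
```

Setting $\alpha = 1$ recovers the lasso and $\alpha = 0$ recovers ridge regression; the Laplacian term shrinks the coefficients of connected features toward each other.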
Affiliation(s)
- Susana Vinga
- INESC-ID, Instituto Superior Técnico, Universidade de Lisboa, Lisboa, Portugal
52. Cieslak MC, Castelfranco AM, Roncalli V, Lenz PH, Hartline DK. t-Distributed Stochastic Neighbor Embedding (t-SNE): A tool for eco-physiological transcriptomic analysis. Mar Genomics 2020; 51:100723. [DOI: 10.1016/j.margen.2019.100723] [Citation(s) in RCA: 56] [Impact Index Per Article: 14.0] [Received: 09/05/2019] [Revised: 10/20/2019] [Accepted: 11/01/2019] [Indexed: 01/19/2023]
53. BEAVR: a browser-based tool for the exploration and visualization of RNA-seq data. BMC Bioinformatics 2020; 21:221. [PMID: 32471392 PMCID: PMC7260831 DOI: 10.1186/s12859-020-03549-8] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 01/28/2020] [Accepted: 05/18/2020] [Indexed: 01/09/2023]
Abstract
BACKGROUND: The use of RNA-sequencing (RNA-seq) in molecular biology research and clinical settings has increased significantly over the past decade. Despite its widespread adoption, there is a lack of simple and interactive tools to analyze and explore RNA-seq data. Many established tools require programming or Unix/Bash knowledge to analyze and visualize results. This requirement presents a significant barrier for many researchers to efficiently analyze and present RNA-seq data. RESULTS: Here we present BEAVR, a Browser-based tool for the Exploration And Visualization of RNA-seq data. BEAVR is an easy-to-use tool that facilitates interactive analysis and exploration of RNA-seq data. BEAVR is developed in R and uses DESeq2 as its engine for differential gene expression (DGE) analysis, but assumes users have no prior knowledge of R or DESeq2. BEAVR allows researchers to easily obtain a table of differentially expressed genes with statistical testing and then visualize the results in a series of graphs, plots and heatmaps. Users are able to customize many parameters for statistical testing, dealing with variance, clustering methods and pathway analysis to generate high-quality figures. CONCLUSION: BEAVR simplifies analysis for novice users but also streamlines the RNA-seq analysis process for experts by automating several steps. BEAVR and its documentation can be found on GitHub at https://github.com/developerpiru/BEAVR. BEAVR is available as a Docker container at https://hub.docker.com/r/pirunthan/beavr.
54. Mukherjee P, Cintra M, Huang C, Zhou M, Zhu S, Colevas AD, Fischbein N, Gevaert O. CT-based Radiomic Signatures for Predicting Histopathologic Features in Head and Neck Squamous Cell Carcinoma. Radiol Imaging Cancer 2020; 2:e190039. [PMID: 32550599 DOI: 10.1148/rycan.2020190039] [Citation(s) in RCA: 30] [Impact Index Per Article: 7.5] [Received: 06/13/2019] [Revised: 01/08/2020] [Accepted: 01/22/2020] [Indexed: 12/15/2022]
Abstract
Purpose: To determine the performance of CT-based radiomic features for noninvasive prediction of histopathologic features of tumor grade, extracapsular spread, perineural invasion, lymphovascular invasion, and human papillomavirus status in head and neck squamous cell carcinoma (HNSCC). Materials and Methods: In this retrospective study, which was approved by the local institutional ethics committee, CT images and clinical data from patients with pathologically proven HNSCC from The Cancer Genome Atlas (n = 113) and an institutional test cohort (n = 71) were analyzed. A machine learning model was trained with 2131 extracted radiomic features to predict tumor histopathologic characteristics. In the model, principal component analysis was used for dimensionality reduction, and regularized regression was used for classification. Results: The trained radiomic model demonstrated moderate capability of predicting HNSCC features. In the training cohort and the test cohort, the model achieved a mean area under the receiver operating characteristic curve (AUC) of 0.75 (95% confidence interval [CI]: 0.68, 0.81) and 0.66 (95% CI: 0.45, 0.84), respectively, for tumor grade; a mean AUC of 0.64 (95% CI: 0.55, 0.62) and 0.70 (95% CI: 0.47, 0.89), respectively, for perineural invasion; a mean AUC of 0.69 (95% CI: 0.56, 0.81) and 0.65 (95% CI: 0.38, 0.87), respectively, for lymphovascular invasion; a mean AUC of 0.77 (95% CI: 0.65, 0.88) and 0.67 (95% CI: 0.15, 0.80), respectively, for extracapsular spread; and a mean AUC of 0.71 (95% CI: 0.29, 1.0) and 0.80 (95% CI: 0.65, 0.92), respectively, for human papillomavirus status. Conclusion: Radiomic CT models have the potential to predict characteristics typically identified on pathologic assessment of HNSCC. Supplemental material is available for this article. © RSNA, 2020.
Affiliation(s)
- Pritam Mukherjee
- Murilo Cintra
- Chao Huang
- Mu Zhou
- Shankuan Zhu
- A Dimitrios Colevas
- Nancy Fischbein
- Olivier Gevaert
- Department of Medicine, Stanford Center for Biomedical Informatics Research (BMIR), Stanford, Calif (P.M., M.C., C.H., M.Z., O.G.); Department of Radiology, Ribeirão Preto Medical School, University of São Paulo, São Paulo, Brazil (M.C.); Department of Nutrition and Food Hygiene, Chronic Disease Research Institute, School of Public Health, School of Medicine, Zhejiang University, Zhejiang, China (C.H., S.Z.); Division of Oncology, Department of Medicine (A.D.C.), Department of Radiology (N.F.), and Department of Biomedical Data Science (O.G.), Stanford University, 1265 Welch Rd, Stanford, CA 94305-5479
55. Bromig L, Kremling A, Marin-Sanguino A. Understanding biochemical design principles with ensembles of canonical non-linear models. PLoS One 2020; 15:e0230599. [PMID: 32353072 PMCID: PMC7192416 DOI: 10.1371/journal.pone.0230599] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 02/25/2020] [Accepted: 03/03/2020] [Indexed: 12/22/2022]
Abstract
Systems biology applies concepts from engineering in order to understand biological networks. If such an understanding were complete, biologists would be able to design ad hoc biochemical components tailored for different purposes, which is the goal of synthetic biology. Needless to say, we are far from creating biological subsystems as intricate and precise as those found in nature, but mathematical models and high-throughput techniques have brought us a long way in this direction. One of the difficulties that still needs to be overcome is finding the right values for model parameters and dealing with uncertainty, which is proving to be an extremely difficult task. In this work, we take advantage of ensemble modeling techniques, where a large number of models with different parameter values are formulated and then tested according to some performance criteria. By finding features shared by successful models, the role of different components and the synergies between them can be better understood. We will address some of the difficulties often faced by ensemble modeling approaches, such as the need to sample a space whose size grows exponentially with the number of parameters, and establishing useful selection criteria. Some methods will be shown to reduce the predictions from many models into a set of understandable “design principles” that can guide us to improve or manufacture a biochemical network. Our proposed framework formulates models within standard formalisms in order to integrate information from different sources and minimize the dimension of the parameter space. Additionally, the mathematical properties of the formalism enable a partition of the parameter space into independent subspaces. Each of these subspaces can be paired with a set of criteria that depend exclusively on it, thus allowing a separate sampling/screening in spaces of lower dimension. By applying tests in a strict order, where computationally cheaper tests are applied first to each subspace and computationally expensive tests are applied to the remaining subset thereafter, the use of resources is optimized and a larger number of models can be examined. This can be compared to a complex database query where the order of the requests can make a huge difference in the processing time. The method will be illustrated by analyzing a classical model of a metabolic pathway with end-product inhibition. Even for such a simple model, the method provides novel insight.
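The cheap-tests-first ordering described in the abstract can be illustrated with a minimal Python sketch. Everything in it is hypothetical and chosen only for illustration: the parameter names `inhibition` and `rate`, the two predicates, and the cost figures are assumptions, not the criteria or models used by the authors, and the partition of the parameter space into independent subspaces is not modeled here.

```python
import random


def screen(models, tests):
    """Filter models by applying tests in order of increasing cost.

    `tests` is a list of (name, cost, predicate) tuples. Each test only
    sees the models that survived every cheaper test before it.
    Returns the survivors and how many models each test had to evaluate.
    """
    survivors = list(models)
    evaluations = {}
    for name, _cost, predicate in sorted(tests, key=lambda t: t[1]):
        evaluations[name] = len(survivors)
        survivors = [m for m in survivors if predicate(m)]
    return survivors, evaluations


def cheap_test(model):
    # Hypothetical inexpensive criterion: a simple bound on one parameter.
    return 0.5 <= model["inhibition"] <= 2.0


def expensive_test(model):
    # Stand-in for a costly check (e.g. a simulation); here just arithmetic.
    return model["inhibition"] * model["rate"] < 1.5


rng = random.Random(0)  # fixed seed so the sketch is reproducible
ensemble = [
    {"inhibition": rng.uniform(0, 4), "rate": rng.uniform(0, 2)}
    for _ in range(1000)
]
# Tests are listed out of order on purpose; screen() sorts them by cost.
tests = [("expensive", 100.0, expensive_test), ("cheap", 1.0, cheap_test)]
survivors, evaluations = screen(ensemble, tests)
```

The set of survivors is the same whichever order the tests run in; only the number of expensive evaluations changes, which is the point of the database-query analogy in the abstract.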
Affiliation(s)
- Lukas Bromig
- Specialty Division for Systems Biotechnology, Technische Universität München, Garching, Germany
- Andreas Kremling
- Specialty Division for Systems Biotechnology, Technische Universität München, Garching, Germany
- Alberto Marin-Sanguino
- Specialty Division for Systems Biotechnology, Technische Universität München, Garching, Germany
56. Smets T, Waelkens E, De Moor B. Prioritization of m/z-Values in Mass Spectrometry Imaging Profiles Obtained Using Uniform Manifold Approximation and Projection for Dimensionality Reduction. Anal Chem 2020; 92:5240-5248. [DOI: 10.1021/acs.analchem.9b05764] [Citation(s) in RCA: 21] [Impact Index Per Article: 5.3] [Indexed: 01/31/2023]
Affiliation(s)
- Tina Smets
- STADIUS Center for Dynamical Systems, Signal Processing, and Data Analytics, Department of Electrical Engineering (ESAT), KU Leuven, 3001 Leuven, Belgium
- Etienne Waelkens
- Department of Cellular and Molecular Medicine, KU Leuven, 3001 Leuven, Belgium
- Bart De Moor
- STADIUS Center for Dynamical Systems, Signal Processing, and Data Analytics, Department of Electrical Engineering (ESAT), KU Leuven, 3001 Leuven, Belgium
57. Shetta O, Niranjan M. Robust subspace methods for outlier detection in genomic data circumvents the curse of dimensionality. Royal Society Open Science 2020; 7:190714. [PMID: 32257299 PMCID: PMC7062061 DOI: 10.1098/rsos.190714] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 04/16/2019] [Accepted: 12/12/2019] [Indexed: 06/11/2023]
Abstract
The application of machine learning to inference problems in biology is dominated by supervised learning problems of regression and classification, and unsupervised learning problems of clustering and variants of low-dimensional projections for visualization. A class of problems that has not gained much attention is detecting outliers in datasets, which arise from causes such as gross experimental, reporting or labelling errors. Outliers could also be small parts of a dataset that are functionally distinct from the majority of a population. Outlier data are often identified by considering the probability density of normal data and comparing data likelihoods against some threshold. This classical approach suffers from the curse of dimensionality, which is a serious problem for omics data, which are often very high-dimensional. We develop an outlier detection method based on structured low-rank approximation methods. The objective function includes a regularizer based on neighbourhood information captured in the graph Laplacian. Results on publicly available genomic data show that our method robustly detects outliers whereas a density-based method fails even at moderate dimensions. Moreover, we show that our method has better clustering and visualization performance on the recovered low-dimensional projection when compared with popular dimensionality reduction techniques.
Affiliation(s)
- Omar Shetta
- Author for correspondence: Omar Shetta
58. Liu X, Xu Y, Wang R, Liu S, Wang J, Luo Y, Leung KS, Cheng L. A network-based algorithm for the identification of moonlighting noncoding RNAs and its application in sepsis. Brief Bioinform 2020; 22:581-588. [PMID: 32003790 DOI: 10.1093/bib/bbz154] [Citation(s) in RCA: 17] [Impact Index Per Article: 4.3] [Received: 08/19/2019] [Revised: 10/26/2019] [Accepted: 11/01/2019] [Indexed: 12/26/2022]
Abstract
Moonlighting proteins provide more options for cells to execute multiple functions without increasing the genome and transcriptome complexity. Although there have long been calls for computational methods for the prediction of moonlighting proteins, no method has been designed for determining moonlighting long noncoding ribonucleic acids (RNAs) (mlncRNAs). Previously, we developed an algorithm, MoonFinder, for the identification of mlncRNAs at the genome level based on the functional annotation and interactome data of lncRNAs and proteins. Here, we update MoonFinder to MoonFinder v2.0 by providing an extensive framework for the detection of protein modules and the establishment of RNA-module associations in human. A novel measure, the moonlighting coefficient, was also proposed to assess the confidence of an ncRNA acting in a moonlighting manner. Moreover, we explored the expression characteristics of mlncRNAs in sepsis, in which we found that mlncRNAs tend to be upregulated and differentially expressed. Interestingly, the mlncRNAs are mutually exclusive in terms of coexpression when compared to the other lncRNAs. Overall, MoonFinder v2.0 is dedicated to the prediction of human mlncRNAs and thus bears great promise to serve as a valuable R package for worldwide research communities (https://cran.r-project.org/web/packages/MoonFinder/index.html). Also, our analyses provide the first attempt to characterize mlncRNA expression and coexpression properties in adult sepsis patients, which will facilitate the understanding of the interaction and expression patterns of mlncRNAs.
Affiliation(s)
- Xueyan Liu
- Critical Care Medicine at Shenzhen People's Hospital
- Ran Wang
- Computer Science at The Chinese University of Hong Kong
- Kwong-Sak Leung
- Computer Science at The Chinese University of Hong Kong, Hong Kong, China
- Lixin Cheng
- Bioinformatics at Shenzhen People's Hospital, China
59. Tancheva LP, Lazarova MI, Alexandrova AV, Dragomanova ST, Nicoletti F, Tzvetanova ER, Hodzhev YK, Kalfin RE, Miteva SA, Mazzon E, Tzvetkov NT, Atanasov AG. Neuroprotective Mechanisms of Three Natural Antioxidants on a Rat Model of Parkinson's Disease: A Comparative Study. Antioxidants (Basel) 2020; 9:antiox9010049. [PMID: 31935828 PMCID: PMC7022962 DOI: 10.3390/antiox9010049] [Citation(s) in RCA: 19] [Impact Index Per Article: 4.8] [Received: 11/10/2019] [Revised: 12/28/2019] [Accepted: 12/30/2019] [Indexed: 02/07/2023]
Abstract
We compared the neuroprotective action of three natural bio-antioxidants (AOs): ellagic acid (EA), α-lipoic acid (LA), and myrtenal (Myrt) in an experimental model of Parkinson’s disease (PD) that was induced in male Wistar rats through an intrastriatal injection of 6-hydroxydopamine (6-OHDA). The animals were divided into five groups: the sham-operated (SO) control group; the striatal 6-OHDA-lesioned control group; and three groups of 6-OHDA-lesioned rats pre-treated for five days with EA, LA, and Myrt (50 mg/kg; intraperitoneally, i.p.), respectively. In the 2nd and 3rd weeks post-lesion, the animals were subjected to several behavioral tests: apomorphine-induced rotation; rotarod; and the passive avoidance test. Biochemical evaluation included assessment of the main oxidative stress parameters as well as dopamine (DA) levels in brain homogenates. The results showed that all three test compounds improved learning and memory performance as well as neuromuscular coordination. Biochemical assays showed that all three compounds substantially decreased lipid peroxidation (LPO) levels, and restored catalase (CAT) activity and DA levels that were impaired by the challenge with 6-OHDA. Based on these results, we can conclude that the studied AOs demonstrate properties that are consistent with significant antiparkinsonian effects. The most powerful neuroprotective effect was observed with Myrt, and this work represents the first demonstration of its antiparkinsonian impact.
Affiliation(s)
- Lyubka P. Tancheva
- Department of Behavior Neurobiology, Institute of Neurobiology, Bulgarian Academy of Sciences, Sofia 1113, Bulgaria (S.T.D., S.A.M.)
- Correspondence: L.P.T., A.G.A.; Tel.: +359-2979-2175 (L.P.T.); +48-227-367-022 (A.G.A.)
- Maria I. Lazarova
- Department of Synaptic Signaling and Communications, Institute of Neurobiology, Bulgarian Academy of Sciences, Sofia 1113, Bulgaria (M.I.L., R.E.K.)
- Albena V. Alexandrova
- Department Biological Effects of Natural and Synthetic Substances, Institute of Neurobiology, Bulgarian Academy of Sciences, Sofia 1113, Bulgaria (A.V.A., E.R.T.)
- Stela T. Dragomanova
- Department of Behavior Neurobiology, Institute of Neurobiology, Bulgarian Academy of Sciences, Sofia 1113, Bulgaria (S.T.D., S.A.M.)
- Department of Pharmacology, Toxicology and Pharmacotherapy, Faculty of Pharmacy, Medical University, Varna 9002, Bulgaria
- Ferdinando Nicoletti
- Department of Biomedical and Biotechnological Sciences, University of Catania, Via S. Sofia 89, 95123 Catania, Italy
- Elina R. Tzvetanova
- Department Biological Effects of Natural and Synthetic Substances, Institute of Neurobiology, Bulgarian Academy of Sciences, Sofia 1113, Bulgaria (A.V.A., E.R.T.)
- Yordan K. Hodzhev
- Department of Sensory Neurobiology, Institute of Neurobiology, Bulgarian Academy of Sciences, Sofia 1113, Bulgaria
- Reni E. Kalfin
- Department of Synaptic Signaling and Communications, Institute of Neurobiology, Bulgarian Academy of Sciences, Sofia 1113, Bulgaria (M.I.L., R.E.K.)
- Simona A. Miteva
- Department of Behavior Neurobiology, Institute of Neurobiology, Bulgarian Academy of Sciences, Sofia 1113, Bulgaria (S.T.D., S.A.M.)
- Emanuela Mazzon
- IRCCS Centro Neurolesi “Bonino-Pulejo”, Via Provinciale Palermo, Contrada Casazza, 98124 Messina, Italy
- Nikolay T. Tzvetkov
- Department of Biochemical Pharmacology and Drug Design, Institute of Molecular Biology, Bulgarian Academy of Sciences, Sofia 1113, Bulgaria
- Atanas G. Atanasov
- Department of Synaptic Signaling and Communications, Institute of Neurobiology, Bulgarian Academy of Sciences, Sofia 1113, Bulgaria (M.I.L., R.E.K.)
- Department of Molecular Biology, Institute of Genetics and Animal Breeding of the Polish Academy of Sciences, Jastrzebiec, 05-552 Magdalenka, Poland
- Department of Pharmacognosy, University of Vienna, 1090 Vienna, Austria
- Ludwig Boltzmann Institute for Digital Health and Patient Safety, Medical University of Vienna, Spitalgasse 23, 1090 Vienna, Austria
- Correspondence: L.P.T., A.G.A.; Tel.: +359-2979-2175 (L.P.T.); +48-227-367-022 (A.G.A.)
60. Patil AR, Chang J, Leung MY, Kim S. Analyzing high dimensional correlated data using feature ranking and classifiers. Computational and Mathematical Biophysics 2019. [DOI: 10.1515/cmb-2019-0008] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Indexed: 01/23/2023]
Abstract
The Illumina Infinium HumanMethylation27 (Illumina 27K) BeadChip assay is a relatively recent high-throughput technology that allows over 27,000 CpGs to be assayed. Illumina 27K methylation data are less commonly used in bioinformatics than gene expression data. This creates a critical need to find the optimal feature ranking (FR) method for handling such high-dimensional data. The optimal FR method for a given classifier is not well known, and choosing the best-performing FR method becomes more challenging in a high-dimensional data setting. Therefore, identifying the statistical methods that boost inference is of crucial importance in this context. This paper describes in detail the performance of FR methods such as Fisher score, information gain, chi-square, and minimum redundancy and maximum relevance on different classification methods such as AdaBoost, Random Forest, Naive Bayes, and Support Vector Machines. Through a simulation study and real data applications, we show that the Fisher score as an FR method, when applied to all the classifiers, achieved the best prediction accuracy with a significantly small number of ranked features.
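As a rough illustration of the Fisher score ranking mentioned in the abstract, the sketch below uses one common textbook definition: for each feature, the class-size-weighted between-class scatter of the class means divided by the class-size-weighted within-class variance. The paper may use a different variant, and the function names and toy data are hypothetical.

```python
from collections import defaultdict


def fisher_scores(X, y):
    """Return one Fisher score per feature (column) of X.

    X is a list of equal-length rows of floats, y the matching class labels.
    score_j = sum_k n_k (mean_jk - mean_j)^2 / sum_k n_k var_jk
    """
    groups = defaultdict(list)
    for row, label in zip(X, y):
        groups[label].append(row)

    scores = []
    for j in range(len(X[0])):
        column = [row[j] for row in X]
        overall_mean = sum(column) / len(column)
        between = 0.0
        within = 0.0
        for rows in groups.values():
            values = [r[j] for r in rows]
            n_k = len(values)
            mean_k = sum(values) / n_k
            var_k = sum((v - mean_k) ** 2 for v in values) / n_k
            between += n_k * (mean_k - overall_mean) ** 2
            within += n_k * var_k
        if within > 0:
            scores.append(between / within)
        else:
            # Zero within-class variance: perfectly separating or constant.
            scores.append(float("inf") if between > 0 else 0.0)
    return scores


def rank_features(X, y):
    """Feature indices ordered from highest to lowest Fisher score."""
    scores = fisher_scores(X, y)
    return sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)


# Toy data (made up): feature 0 separates the classes, feature 1 barely does.
X = [[0.1, 0.5], [0.2, 0.4], [0.9, 0.6], [0.8, 0.5]]
y = [0, 0, 1, 1]
scores = fisher_scores(X, y)
ranking = rank_features(X, y)
```

In a workflow like the one the abstract describes, the top-ranked indices would then be used to subset the features before training a classifier.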
Affiliation(s)
- Abhijeet R Patil
- Computational Science, The University of Texas at El Paso, El Paso, TX 79968, USA
- Jongwha Chang
- Department of Pharmacy Practice, The University of Texas at El Paso, El Paso, TX 79968, USA
- Ming-Ying Leung
- Bioinformatics and Computational Science, University of Texas at El Paso, El Paso, TX 79968, USA
- Sangjin Kim
- Department of Mathematical Sciences, University of Texas at El Paso, El Paso, TX 79968, USA
61. Yang TH. Transcription factor regulatory modules provide the molecular mechanisms for functional redundancy observed among transcription factors in yeast. BMC Bioinformatics 2019; 20:630. [PMID: 31881824 PMCID: PMC6933673 DOI: 10.1186/s12859-019-3212-8] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Indexed: 01/15/2023]
Abstract
BACKGROUND: Current technologies for understanding the transcriptional reprogramming in cells include transcription factor (TF) chromatin immunoprecipitation (ChIP) experiments and TF knockout experiments. The ChIP experiments show the binding targets of the TFs against which the antibody is directed, while the knockout techniques find the regulatory gene targets of the knocked-out TFs. However, it was shown that these two complementary results contain few common targets. Researchers have used the concept of TF functional redundancy to explain the low overlap between these two techniques. But the detailed molecular mechanisms behind TF functional redundancy remain unknown. Without knowing the possible molecular mechanisms, it is hard for biologists to fully unravel the cause of TF functional redundancy. RESULTS: To mine out the molecular mechanisms, a novel algorithm to extract TF regulatory modules that help explain the observed TF functional redundancy effect was devised and proposed in this research. The method first searched for candidate TF sets from the TF binding data. Then, based on these candidate sets, the method utilized the modified Steiner Tree construction algorithm to construct the possible TF regulatory modules from protein-protein interaction data and finally filtered out the noise-induced results by using confidence tests. The mined-out regulatory modules were shown to correlate to the concept of functional redundancy and provided testable hypotheses of the molecular mechanisms behind functional redundancy. The biological significance of the mined-out results was demonstrated in three different biological aspects: ontology enrichment, protein interaction prevalence and expression coherence. About 23.5% of the mined-out TF regulatory modules were literature-verified. Finally, the biological applicability of the proposed method was shown in one detailed example of a verified TF regulatory module for pheromone response and filamentous growth in yeast. CONCLUSION: In this research, a novel method that mined out the potential TF regulatory modules which elucidate the functional redundancy observed among TFs is proposed. The extracted TF regulatory modules not only correlate the molecular mechanisms to the observed functional redundancy among TFs, but also show biological significance in inferring TF functional binding target genes. The results provide testable hypotheses for biologists to further design subsequent research and experiments.
Affiliation(s)
- Tzu-Hsien Yang
- Department of Information Management, National University of Kaohsiung, 700, Kaohsiung University Rd, Kaohsiung, 81148, Taiwan.
62. Wang W, Zhang K, Zhang H, Li M, Zhao Y, Wang B, Xin W, Yang W, Zhang J, Yue S, Yang X. Underlying Genes Involved in Atherosclerotic Macrophages: Insights from Microarray Data Mining. Med Sci Monit 2019; 25:9949-9962. [PMID: 31875420 PMCID: PMC6944040 DOI: 10.12659/msm.917068] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Indexed: 12/25/2022]
Abstract
Background: In an atherosclerotic artery wall, monocyte-derived macrophages are the principal mediators that respond to pathogens and inflammation. The present study aimed to investigate potential changes in gene expression between normal tissue-resident macrophages and atherosclerotic macrophages in the human body. Material/Methods: The expression profile data of GSE7074, acquired from the Gene Expression Omnibus (GEO) database and including the transcriptome of 4 types of macrophages, were downloaded. Differentially expressed genes (DEGs) were identified using R software, then we performed functional enrichment, protein-protein interaction (PPI) network construction, key node and module analysis, and prediction of microRNAs (miRNAs)/transcription factors (TFs) targeting genes. Results: After data processing, 236 DEGs were identified, including 21 upregulated genes and 215 downregulated genes. The DEG set was enriched in 22 significant Gene Ontology (GO) terms and 25 Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways, and the PPI network constructed with these DEGs comprised 6 key nodes with degrees ≥8. Key nodes in the PPI network that were simultaneously involved in the prime modules, including rhodopsin (RHO), coagulation factor V (F5), and bestrophin-1 (BEST1), are promising for the prediction of atherosclerotic plaque formation. Furthermore, in the miRNA/TF-target network, hsa-miR-3177-5p might be involved in the pathogenesis of atherosclerosis via regulating BEST1, and the transcription factor early growth response-1 (EGR1) was found to be a potential promoter in atherogenesis. Conclusions: The identified key hub genes, predicted miRNAs/TFs, and underlying molecular mechanisms may be involved in atherogenesis, thus potentially contributing to the treatment and diagnosis of patients with atherosclerotic disease.
Affiliation(s)
- Weihan Wang
- Department of Neurosurgery, Tianjin Medical University General Hospital, Tianjin, China (mainland).,Tianjin Neurological Institute, Key Laboratory of Post-Trauma Neuro-Repair and Regeneration in Central Nervous System, Ministry of Education, Tianjin Key Laboratory of Injuries, Variations and Regeneration of Nervous System, Tianjin, China (mainland)
| | - Kai Zhang
- Department of Gynecology and Obstetrics, Tianjin Medical University General Hospital, Tianjin, China (mainland)
| | - Hao Zhang
- Department of Neurosurgery, Tianjin Medical University General Hospital, Tianjin, China (mainland).,Tianjin Neurological Institute, Key Laboratory of Post-Trauma Neuro-Repair and Regeneration in Central Nervous System, Ministry of Education, Tianjin Key Laboratory of Injuries, Variations and Regeneration of Nervous System, Tianjin, China (mainland)
| | - Mengqi Li
- Department of Neurosurgery, Tianjin Medical University General Hospital, Tianjin, China (mainland).,Tianjin Neurological Institute, Key Laboratory of Post-Trauma Neuro-Repair and Regeneration in Central Nervous System, Ministry of Education, Tianjin Key Laboratory of Injuries, Variations and Regeneration of Nervous System, Tianjin, China (mainland)
| | - Yan Zhao
- Department of Neurosurgery, Tianjin Medical University General Hospital, Tianjin, China (mainland)
| | - Bangyue Wang
- Department of Neurosurgery, Tianjin Medical University General Hospital, Tianjin, China (mainland)
| | - Wenqiang Xin
- Department of Neurosurgery, Tianjin Medical University General Hospital, Tianjin, China (mainland).,Tianjin Neurological Institute, Key Laboratory of Post-Trauma Neuro-Repair and Regeneration in Central Nervous System, Ministry of Education, Tianjin Key Laboratory of Injuries, Variations and Regeneration of Nervous System, Tianjin, China (mainland)
| | - Weidong Yang
- Department of Neurosurgery, Tianjin Medical University General Hospital, Tianjin, China (mainland).,Tianjin Neurological Institute, Key Laboratory of Post-Trauma Neuro-Repair and Regeneration in Central Nervous System, Ministry of Education, Tianjin Key Laboratory of Injuries, Variations and Regeneration of Nervous System, Tianjin, China (mainland)
| | - Jianning Zhang
- Department of Neurosurgery, Tianjin Medical University General Hospital, Tianjin, China (mainland).,Tianjin Neurological Institute, Key Laboratory of Post-Trauma Neuro-Repair and Regeneration in Central Nervous System, Ministry of Education, Tianjin Key Laboratory of Injuries, Variations and Regeneration of Nervous System, Tianjin, China (mainland)
| | - Shuyuan Yue
- Department of Neurosurgery, Tianjin Medical University General Hospital, Tianjin, China (mainland).,Tianjin Neurological Institute, Key Laboratory of Post-Trauma Neuro-Repair and Regeneration in Central Nervous System, Ministry of Education, Tianjin Key Laboratory of Injuries, Variations and Regeneration of Nervous System, Tianjin, China (mainland)
| | - Xinyu Yang
- Department of Neurosurgery, Tianjin Medical University General Hospital, Tianjin, China (mainland).,Tianjin Neurological Institute, Key Laboratory of Post-Trauma Neuro-Repair and Regeneration in Central Nervous System, Ministry of Education, Tianjin Key Laboratory of Injuries, Variations and Regeneration of Nervous System, Tianjin, China (mainland)
|
63
|
Sun S, Wang C, Ding H, Zou Q. Machine learning and its applications in plant molecular studies. Brief Funct Genomics 2019; 19:40-48. [DOI: 10.1093/bfgp/elz036] [Citation(s) in RCA: 27] [Impact Index Per Article: 5.4] [Received: 06/12/2019] [Revised: 09/06/2019] [Accepted: 09/15/2019] [Indexed: 01/16/2023] Open
Abstract
The advent of high-throughput genomic technologies has resulted in the accumulation of massive amounts of genomic information. However, biologists are challenged with how to effectively analyze these data. Machine learning can provide tools for better and more efficient data analysis. Unfortunately, because many plant biologists are unfamiliar with machine learning, its application in plant molecular studies has been restricted to a few species and a limited set of algorithms. Thus, in this study, we provide the basic steps for developing machine learning frameworks and present a comprehensive overview of machine learning algorithms and various evaluation metrics. Furthermore, we introduce sources of important curated plant genomic data and R packages to enable plant biologists to easily and quickly apply appropriate machine learning algorithms in their research. Finally, we discuss current applications of machine learning algorithms for identifying various genes related to resistance to biotic and abiotic stress. Broad application of machine learning and the accumulation of plant sequencing data will advance plant molecular studies.
Affiliation(s)
- Shanwen Sun
- University of Bayreuth in Germany. He is now a postdoctoral fellow at the Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China
| | - Chunyu Wang
- Harbin Institute of Technology in China. He is an associate professor in the School of Computer Science and Technology, Harbin Institute of Technology
| | - Hui Ding
- Inner Mongolia University in China. She is an associate professor in the Center for Informational Biology, University of Electronic Science and Technology of China
| | - Quan Zou
- Harbin Institute of Technology in China. He is a professor in the Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China
|
64
|
Grau M, Lenz G, Lenz P. Dissection of gene expression datasets into clinically relevant interaction signatures via high-dimensional correlation maximization. Nat Commun 2019; 10:5417. [PMID: 31780653 PMCID: PMC6883077 DOI: 10.1038/s41467-019-12713-5] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 07/23/2018] [Accepted: 09/20/2019] [Indexed: 12/12/2022] Open
Abstract
Gene expression is controlled by many simultaneous interactions, frequently measured collectively in biology and medicine by high-throughput technologies. It is a highly challenging task to infer from these data the generating effects and cooperating genes. Here, we present an unsupervised hypothesis-generating learning concept termed signal dissection by correlation maximization (SDCM) that dissects large high-dimensional datasets into signatures. Each signature captures a particular signal pattern that was consistently observed for multiple genes and samples, likely caused by the same underlying interaction. A key difference to other methods is our flexible nonlinear signal superposition model, combined with a precise regression technique. Analyzing gene expression of diffuse large B-cell lymphoma, our method discovers previously unidentified signatures that reveal significant differences in patient survival. These signatures are more predictive than those from various methods used for comparison and robustly validate across technological platforms. This implies highly specific extraction of clinically relevant gene interactions. Identification of clinically relevant gene expression signatures for cancer stratification remains challenging. Here, the authors introduce a flexible nonlinear signal superposition model that enables dissection of large gene expression data sets into signatures and extraction of gene interactions.
Affiliation(s)
- Michael Grau
- Department of Medicine A, Albert-Schweitzer Campus 1, University Hospital Münster, 48149, Münster, Germany.,Cluster of Excellence EXC 1003, Cells in Motion, University of Münster, 48149, Münster, Germany
| | - Georg Lenz
- Department of Medicine A, Albert-Schweitzer Campus 1, University Hospital Münster, 48149, Münster, Germany.,Cluster of Excellence EXC 1003, Cells in Motion, University of Münster, 48149, Münster, Germany
| | - Peter Lenz
- Department of Physics, Renthof 5, University of Marburg, 35032, Marburg, Germany.,LOEWE Center for Synthetic Microbiology, 35032, Marburg, Germany
|
65
|
Boac BM, Abbasi F, Ismail-Khan R, Xiong Y, Siddique A, Park H, Han M, Saeed-Vafa D, Soliman H, Henry B, Pena MJ, McClung EC, Robertson SE, Todd SL, Lopez A, Sun W, Apuri S, Lancaster JM, Berglund AE, Magliocco AM, Marchion DC. Expression of the BAD pathway is a marker of triple-negative status and poor outcome. Sci Rep 2019; 9:17496. [PMID: 31767884 PMCID: PMC6877530 DOI: 10.1038/s41598-019-53695-0] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.4] [Received: 05/24/2018] [Accepted: 10/28/2019] [Indexed: 02/01/2023] Open
Abstract
Triple-negative breast cancer (TNBC) has few therapeutic targets, making nonspecific chemotherapy the main treatment. Therapies enhancing cancer cell sensitivity to cytotoxic agents could significantly improve patient outcomes. A BCL2-associated agonist of cell death (BAD) pathway gene expression signature (BPGES) was derived using principal component analysis (PCA) and evaluated for associations with the TNBC phenotype and clinical outcomes. Immunohistochemistry was used to determine the relative expression levels of phospho-BAD isoforms in tumour samples. Cell survival assays evaluated the effects of BAD pathway inhibition on chemo-sensitivity. BPGES score was associated with TNBC status and overall survival (OS) in breast cancer samples of the Moffitt Total Cancer Care dataset and The Cancer Genome Atlas (TCGA). TNBC tumours were enriched for the expression of phospho-BAD isoforms. Further, the BPGES was associated with TNBC status in breast cancer cell lines of the Cancer Cell Line Encyclopedia (CCLE). Targeted inhibition of kinases known to phosphorylate BAD protein resulted in increased sensitivity to platinum agents in TNBC cell lines compared to non-TNBC cell lines. The BAD pathway is associated with triple-negative status and OS. TNBC tumours were enriched for the expression of phosphorylated BAD protein compared to non-TNBC tumours. These findings suggest that the BAD pathway is an important determinant of TNBC clinical outcomes.
Affiliation(s)
- Bernadette M Boac
- Department of Anatomic Pathology, H. Lee Moffitt Cancer Center and Research Institute, Tampa, FL, 33612, USA
- Chemical Biology and Molecular Medicine, H. Lee Moffitt Cancer Center and Research Institute, Tampa, FL, 33612, USA
| | - Forough Abbasi
- Cedars-Sinai Medical Center, Los Angeles, CA, 90048, USA
| | - Roohi Ismail-Khan
- Department of Oncologic Sciences, H. Lee Moffitt Cancer Center and Research Institute, Tampa, FL, 33612, USA
- Department of Women's Oncology, H. Lee Moffitt Cancer Center and Research Institute, Tampa, FL, 33612, USA
| | - Yin Xiong
- Department of Anatomic Pathology, H. Lee Moffitt Cancer Center and Research Institute, Tampa, FL, 33612, USA
- Chemical Biology and Molecular Medicine, H. Lee Moffitt Cancer Center and Research Institute, Tampa, FL, 33612, USA
| | - Atif Siddique
- Department of Anatomic Pathology, H. Lee Moffitt Cancer Center and Research Institute, Tampa, FL, 33612, USA
| | - Hannah Park
- Department of Anatomic Pathology, H. Lee Moffitt Cancer Center and Research Institute, Tampa, FL, 33612, USA
- Chemical Biology and Molecular Medicine, H. Lee Moffitt Cancer Center and Research Institute, Tampa, FL, 33612, USA
| | - Mingda Han
- Department of Anatomic Pathology, H. Lee Moffitt Cancer Center and Research Institute, Tampa, FL, 33612, USA
- Chemical Biology and Molecular Medicine, H. Lee Moffitt Cancer Center and Research Institute, Tampa, FL, 33612, USA
| | - Daryoush Saeed-Vafa
- Department of Anatomic Pathology, H. Lee Moffitt Cancer Center and Research Institute, Tampa, FL, 33612, USA
| | - Hatem Soliman
- Department of Oncologic Sciences, H. Lee Moffitt Cancer Center and Research Institute, Tampa, FL, 33612, USA
- Department of Women's Oncology, H. Lee Moffitt Cancer Center and Research Institute, Tampa, FL, 33612, USA
| | - Brendon Henry
- Department of Anatomic Pathology, H. Lee Moffitt Cancer Center and Research Institute, Tampa, FL, 33612, USA
| | - M Juliana Pena
- Department of Anatomic Pathology, H. Lee Moffitt Cancer Center and Research Institute, Tampa, FL, 33612, USA
| | - E Clair McClung
- University of Arizona Cancer Center, Obstetrics and Gynecology, Tucson, AZ, 85724, USA
| | | | - Sarah L Todd
- Department of Women's Oncology, H. Lee Moffitt Cancer Center and Research Institute, Tampa, FL, 33612, USA
| | - Alex Lopez
- Department of Anatomic Pathology, H. Lee Moffitt Cancer Center and Research Institute, Tampa, FL, 33612, USA
| | - Weihong Sun
- Department of Women's Oncology, H. Lee Moffitt Cancer Center and Research Institute, Tampa, FL, 33612, USA
| | - Susmitha Apuri
- Department of Women's Oncology, H. Lee Moffitt Cancer Center and Research Institute, Tampa, FL, 33612, USA
| | | | - Anders E Berglund
- Department of Bioinformatics and Biostatistics, H. Lee Moffitt Cancer Center and Research Institute, Tampa, FL, 33612, USA
| | | | - Douglas C Marchion
- Department of Anatomic Pathology, H. Lee Moffitt Cancer Center and Research Institute, Tampa, FL, 33612, USA.
- Chemical Biology and Molecular Medicine, H. Lee Moffitt Cancer Center and Research Institute, Tampa, FL, 33612, USA.
|
66
|
Yang M, Tao B, Chen C, Jia W, Sun S, Zhang T, Wang X. Machine Learning Models Based on Molecular Fingerprints and an Extreme Gradient Boosting Method Lead to the Discovery of JAK2 Inhibitors. J Chem Inf Model 2019; 59:5002-5012. [DOI: 10.1021/acs.jcim.9b00798] [Citation(s) in RCA: 24] [Impact Index Per Article: 4.8] [Indexed: 01/04/2023]
Affiliation(s)
- Minjian Yang
- State Key Laboratory of Bioactive Substances and Functions of Natural Medicines, Institute of Materia Medica, Peking Union Medical College and Chinese Academy of Medical Sciences, Beijing 100050, P.R. China
- Joint Laboratory of Artificial Intelligence of the Institute of Materia Medica and Yuan Qi Zhi Yao, Beijing 100050, P.R. China
| | - Bingzhong Tao
- Joint Laboratory of Artificial Intelligence of the Institute of Materia Medica and Yuan Qi Zhi Yao, Beijing 100050, P.R. China
| | - Chengjuan Chen
- State Key Laboratory of Bioactive Substances and Functions of Natural Medicines, Institute of Materia Medica, Peking Union Medical College and Chinese Academy of Medical Sciences, Beijing 100050, P.R. China
| | - Wenqiang Jia
- State Key Laboratory of Bioactive Substances and Functions of Natural Medicines, Institute of Materia Medica, Peking Union Medical College and Chinese Academy of Medical Sciences, Beijing 100050, P.R. China
| | - Shaolei Sun
- Joint Laboratory of Artificial Intelligence of the Institute of Materia Medica and Yuan Qi Zhi Yao, Beijing 100050, P.R. China
| | - Tiantai Zhang
- State Key Laboratory of Bioactive Substances and Functions of Natural Medicines, Institute of Materia Medica, Peking Union Medical College and Chinese Academy of Medical Sciences, Beijing 100050, P.R. China
| | - Xiaojian Wang
- State Key Laboratory of Bioactive Substances and Functions of Natural Medicines, Institute of Materia Medica, Peking Union Medical College and Chinese Academy of Medical Sciences, Beijing 100050, P.R. China
- Joint Laboratory of Artificial Intelligence of the Institute of Materia Medica and Yuan Qi Zhi Yao, Beijing 100050, P.R. China
|
67
|
Min W, Liu J, Zhang S. Edge-group sparse PCA for network-guided high dimensional data analysis. Bioinformatics 2019; 34:3479-3487. [PMID: 29726900 DOI: 10.1093/bioinformatics/bty362] [Citation(s) in RCA: 40] [Impact Index Per Article: 8.0] [Received: 06/03/2017] [Accepted: 05/02/2018] [Indexed: 12/14/2022] Open
Abstract
Motivation: Principal component analysis (PCA) has been widely used to deal with high-dimensional gene expression data. In this study, we proposed an Edge-group Sparse PCA (ESPCA) model by incorporating the group structure from a prior gene network into the PCA framework for dimension reduction and feature interpretation. ESPCA enforces sparsity of principal component (PC) loadings by considering the connectivity of gene variables in the prior network. We developed an alternating iterative algorithm to solve ESPCA. The key to this algorithm is solving a new k-edge sparse projection problem, for which a greedy strategy has been adapted. Here we adopted ESPCA for analyzing multiple gene expression matrices simultaneously. By incorporating prior knowledge, our method can overcome the drawbacks of sparse PCA and capture some gene modules with better biological interpretations. Results: We evaluated the performance of ESPCA using a set of artificial datasets and two real biological datasets (including TCGA pan-cancer expression data and ENCODE expression data), and compared its performance with PCA and sparse PCA. The results showed that ESPCA could identify more biologically relevant genes, improve their biological interpretations and reveal distinct sample characteristics. Availability and implementation: An R package of ESPCA is available at http://page.amss.ac.cn/shihua.zhang/. Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Wenwen Min
- School of Computer Science, Wuhan University, Wuhan, China
| | - Juan Liu
- School of Computer Science, Wuhan University, Wuhan, China
| | - Shihua Zhang
- NCMIS, CEMS, RCSDS, Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing, China.,School of Mathematics Sciences, University of Chinese Academy of Sciences, Beijing, China.,Center for Excellence in Animal Evolution and Genetics, Chinese Academy of Sciences, Kunming, China
|
68
|
Bhadra S, Blomberg P, Castillo S, Rousu J. Principal metabolic flux mode analysis. Bioinformatics 2019; 34:2409-2417. [PMID: 29420676 PMCID: PMC6041797 DOI: 10.1093/bioinformatics/bty049] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Received: 07/07/2017] [Accepted: 02/06/2018] [Indexed: 01/01/2023] Open
Abstract
Motivation: In the analysis of metabolism, two distinct and complementary approaches are frequently used: principal component analysis (PCA) and stoichiometric flux analysis. PCA is able to capture the main modes of variability in a set of experiments and does not make many prior assumptions about the data, but does not inherently take into account the flux mode structure of metabolism. Stoichiometric flux analysis methods, such as Flux Balance Analysis (FBA) and Elementary Mode Analysis, on the other hand, are able to capture the metabolic flux modes; however, they are primarily designed for the analysis of single samples at a time, and are not best suited for exploratory analysis on large sets of samples. Results: We propose a new methodology for the analysis of metabolism, called Principal Metabolic Flux Mode Analysis (PMFA), which marries the PCA and stoichiometric flux analysis approaches in an elegant regularized optimization framework. In short, the method incorporates a variance maximization objective from PCA coupled with a stoichiometric regularizer, which penalizes projections that are far from any flux modes of the network. For interpretability, we also introduce a sparse variant of PMFA that favours flux modes that contain a small number of reactions. Our experiments demonstrate the versatility and capabilities of our methodology. The proposed method can be applied to genome-scale metabolic networks in an efficient way, as PMFA does not enumerate elementary modes. In addition, the method is more robust on out-of-steady steady-state experimental data than competing flux mode analysis approaches. Availability and implementation: Matlab software for PMFA and SPMFA and the dataset used for experiments are available at https://github.com/aalto-ics-kepaco/PMFA. Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Sahely Bhadra
- Helsinki Institute for Information Technology HIIT, Department of Computer Science, Aalto University, Espoo, Finland.,Computer Science and Engineering, Indian Institute of Technology, Palakkad, India
| | - Peter Blomberg
- VTT Technical Research Centre of Finland Ltd, Espoo, Finland
| | - Sandra Castillo
- VTT Technical Research Centre of Finland Ltd, Espoo, Finland
| | - Juho Rousu
- Helsinki Institute for Information Technology HIIT, Department of Computer Science, Aalto University, Espoo, Finland
|
69
|
Singh A, Goel N. Integrative Analysis of Multi-Genomic Data for Kidney Renal Cell Carcinoma. Interdiscip Sci 2019; 12:12-23. [PMID: 31392539 DOI: 10.1007/s12539-019-00345-8] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 01/20/2019] [Revised: 07/21/2019] [Accepted: 07/24/2019] [Indexed: 12/20/2022]
Abstract
Accounting for nine out of ten kidney cancers, kidney renal cell carcinoma (KIRC) is by far the most common type of kidney cancer. In view of limited and ineffective available therapies, understanding the genetic basis of the disease becomes important for better diagnosis and treatment. Existing studies are based on a single type of genomic data. These studies do not consider interactions between genomic data types and their underlying biological relationships in the disease. However, the current availability of multiple genomic data types and the possibility of combining them have facilitated a better understanding of the cancer's characterization. High dimensionality and the existence of complex interactions (within and between genomic data types) are the two main challenges that integrative methods face in analyzing cancer effectively. In this paper, we propose a method to build an integrative model based on a Bayesian model averaging procedure for improved prediction of clinical outcome in cancer survival. The proposed method initially uses dimensionality reduction techniques to generate low-dimensional latent features for the predictive models and then incorporates interactions between them. It defines the latent features using principal components and their sparse version. It compares the predictive performance of models based on these two latent features on real data. These models also validate several ccRCC-specific cancer biomarkers previously reported in the literature. Applied to the kidney renal cell carcinoma (KIRC) dataset of The Cancer Genome Atlas (TCGA), the method achieves better prediction with the sparse principal components model when latent feature interactions are included than when they are not.
Affiliation(s)
- Ashwinder Singh
- University Institute of Engineering and Technology, Panjab University, Chandigarh, 160014, India
| | - Neelam Goel
- University Institute of Engineering and Technology, Panjab University, Chandigarh, 160014, India.
|
70
|
Marini F, Binder H. pcaExplorer: an R/Bioconductor package for interacting with RNA-seq principal components. BMC Bioinformatics 2019; 20:331. [PMID: 31195976 PMCID: PMC6567655 DOI: 10.1186/s12859-019-2879-1] [Citation(s) in RCA: 135] [Impact Index Per Article: 27.0] [Received: 11/23/2018] [Accepted: 05/07/2019] [Indexed: 12/11/2022] Open
Abstract
BACKGROUND Principal component analysis (PCA) is frequently used in genomics applications for quality assessment and exploratory analysis in high-dimensional data, such as RNA sequencing (RNA-seq) gene expression assays. Despite the availability of many software packages developed for this purpose, an interactive and comprehensive interface for performing these operations is lacking. RESULTS We developed the pcaExplorer software package to enhance commonly performed analysis steps with an interactive and user-friendly application, which provides state saving as well as the automated creation of reproducible reports. pcaExplorer is implemented in R using the Shiny framework and exploits data structures from the open-source Bioconductor project. Users can easily generate a wide variety of publication-ready graphs, while assessing the expression data in the different modules available, including a general overview, dimension reduction on samples and genes, as well as functional interpretation of the principal components. CONCLUSION pcaExplorer is distributed as an R package in the Bioconductor project ( http://bioconductor.org/packages/pcaExplorer/ ), and is designed to assist a broad range of researchers in the critical step of interactive data exploration.
Affiliation(s)
- Federico Marini
- Institute of Medical Biostatistics, Epidemiology and Informatics (IMBEI), University Medical Center of the Johannes Gutenberg University Mainz, Obere Zahlbacher Str. 69, Mainz, 55131 Germany
- Center for Thrombosis and Hemostasis (CTH), University Medical Center of the Johannes Gutenberg University Mainz, Langenbeckstr. 1, Mainz, 55131 Germany
| | - Harald Binder
- Institute of Medical Biometry and Statistics, Faculty of Medicine and Medical Center - University of Freiburg, Stefan-Meier-Str. 26, Freiburg, 79104 Germany
|
71
|
Chiarella P, Capone P, Carbonari D, Sisto R. A Predictive Model Assessing Genetic Susceptibility Risk at Workplace. Int J Environ Res Public Health 2019; 16:2012. [PMID: 31195756 PMCID: PMC6603935 DOI: 10.3390/ijerph16112012] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 04/30/2019] [Revised: 05/30/2019] [Accepted: 06/03/2019] [Indexed: 01/08/2023]
Abstract
(1) Background: The study of susceptibility biomarkers in the immigrant workforce integrated into the social tissue of European host countries is always a challenge, due to high individual heterogeneity and the admixing of different ethnicities in the same workplace. These workers having distinct cultural backgrounds, beliefs, diets, and habits, as well as a poor knowledge of the foreign language, may feel reluctant to donate their biological specimens for the biomonitoring research studies. (2) Methods: A model predicting ethnicity-specific susceptibility based on principal component analysis has been conceived, using the genotype frequency of the investigated populations available in publicly accessible databases. (3) Results: Correlations among ethnicities and between ethnic and polymorphic genes have been found, and low/high-risk profiles have been identified as valuable susceptibility biomarkers. (4) Conclusions: In the absence of workers’ consent or access to blood genotyping, ethnicity represents a good indicator of the subject’s genotype. This model, associating ethnicity-specific genotype frequency with the susceptibility biomarkers involved in the metabolism of toxicants, may replace genotyping, ensuring the necessary safety and health conditions of workers assigned to hazardous jobs.
Affiliation(s)
- Pieranna Chiarella
- Department of Occupational and Environmental Medicine, Epidemiology and Hygiene, INAIL Research, Via Fontana Candida 1, 00078 Monteporzio Catone, Rome, Italy.
| | - Pasquale Capone
- Department of Occupational and Environmental Medicine, Epidemiology and Hygiene, INAIL Research, Via Fontana Candida 1, 00078 Monteporzio Catone, Rome, Italy.
| | - Damiano Carbonari
- Department of Occupational and Environmental Medicine, Epidemiology and Hygiene, INAIL Research, Via Fontana Candida 1, 00078 Monteporzio Catone, Rome, Italy.
| | - Renata Sisto
- Department of Occupational and Environmental Medicine, Epidemiology and Hygiene, INAIL Research, Via Fontana Candida 1, 00078 Monteporzio Catone, Rome, Italy.
|
72
|
A hypergraph-based method for large-scale dynamic correlation study at the transcriptomic scale. BMC Genomics 2019; 20:397. [PMID: 31117943 PMCID: PMC6530038 DOI: 10.1186/s12864-019-5787-x] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 10/30/2018] [Accepted: 05/09/2019] [Indexed: 12/22/2022] Open
Abstract
Background: The biological regulatory system is highly dynamic. Correlations between functionally related genes change over different biological conditions, which are often unobserved in the data. At the gene level, the dynamic correlations result in three-way gene interactions involving a pair of genes that change correlation, and a third gene that reflects the underlying cellular conditions. This type of ternary relation can be quantified by the Liquid Association statistic. Studying these three-way interactions at the gene triplet level has revealed important regulatory mechanisms in the biological system. Currently, due to the extremely large number of possible combinations of triplets within a high-throughput gene expression dataset, no method is available to examine the ternary relationship at the biological system level and formally address the false discovery issue. Results: Here we propose a new method, Hypergraph for Dynamic Correlation (HDC), to construct module-level three-way interaction networks. The method is able to present integrative uniform hypergraphs to reflect the global dynamic correlation pattern in the biological system, providing guidance to downstream gene triplet-level analyses. To validate the method's ability, we conducted two real data experiments using a melanoma RNA-seq dataset from The Cancer Genome Atlas (TCGA) and a yeast cell cycle dataset. The resulting hypergraphs are clearly biologically plausible, and suggest novel relations relevant to the biological conditions in the data. Conclusions: We believe the new approach provides a valuable alternative method to analyze omics data that can extract higher order structures. The software is at https://github.com/yunchuankong/HypergraphDynamicCorrelation. Electronic supplementary material: The online version of this article (10.1186/s12864-019-5787-x) contains supplementary material, which is available to authorized users.
|
73
|
Adams RH, Schield DR, Castoe TA. Recent Advances in the Inference of Gene Flow from Population Genomic Data. ACTA ACUST UNITED AC 2019. [DOI: 10.1007/s40610-019-00120-0] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 12/21/2022]
|
74
|
Smets T, Verbeeck N, Claesen M, Asperger A, Griffioen G, Tousseyn T, Waelput W, Waelkens E, De Moor B. Evaluation of Distance Metrics and Spatial Autocorrelation in Uniform Manifold Approximation and Projection Applied to Mass Spectrometry Imaging Data. Anal Chem 2019; 91:5706-5714. [DOI: 10.1021/acs.analchem.8b05827] [Citation(s) in RCA: 48] [Impact Index Per Article: 9.6] [Indexed: 01/30/2023]
Affiliation(s)
- Tina Smets
  - STADIUS Center for Dynamical Systems, Signal Processing, and Data Analytics, Department of Electrical Engineering (ESAT), KU Leuven, 3001 Leuven, Belgium
- Nico Verbeeck
  - STADIUS Center for Dynamical Systems, Signal Processing, and Data Analytics, Department of Electrical Engineering (ESAT), KU Leuven, 3001 Leuven, Belgium
  - Aspect Analytics NV, C-mine 12, 3600 Genk, Belgium
- Marc Claesen
  - STADIUS Center for Dynamical Systems, Signal Processing, and Data Analytics, Department of Electrical Engineering (ESAT), KU Leuven, 3001 Leuven, Belgium
  - Aspect Analytics NV, C-mine 12, 3600 Genk, Belgium
- Arndt Asperger
  - Bruker Daltonik GmbH, Fahrenheitstrasse 4, 28359 Bremen, Germany
- Thomas Tousseyn
  - Department of Pathology, University Hospitals KU Leuven, 3001 Leuven, Belgium
- Wim Waelput
  - Department of Pathology, UZ-Brussel, 1000 Brussels, Belgium
- Etienne Waelkens
  - Department of Cellular and Molecular Medicine, KU Leuven, 3000 Leuven, Belgium
- Bart De Moor
  - STADIUS Center for Dynamical Systems, Signal Processing, and Data Analytics, Department of Electrical Engineering (ESAT), KU Leuven, 3001 Leuven, Belgium
|
75
|
|
76
|
Cai C, Guo P, Zhou Y, Zhou J, Wang Q, Zhang F, Fang J, Cheng F. Deep Learning-Based Prediction of Drug-Induced Cardiotoxicity. J Chem Inf Model 2019; 59:1073-1084. [PMID: 30715873 DOI: 10.1021/acs.jcim.8b00769] [Citation(s) in RCA: 89] [Impact Index Per Article: 17.8] [Indexed: 01/03/2023]
Abstract
Blockade of the human ether-à-go-go-related gene (hERG) channel by small molecules induces the prolongation of the QT interval, which leads to fatal cardiotoxicity and accounts for the withdrawal or severe restrictions on the use of many approved drugs. In this study, we develop a deep learning approach, termed deephERG, for prediction of small-molecule hERG blockers in drug discovery and postmarketing surveillance. In total, we assemble 7,889 compounds with well-defined experimental data on the hERG and with diverse chemical structures. We find that deephERG models built by a multitask deep neural network (DNN) algorithm outperform those built by single-task DNN, naïve Bayes (NB), support vector machine (SVM), random forest (RF), and graph convolutional neural network (GCNN). Specifically, the area under the receiver operating characteristic curve (AUC) value for the best model of deephERG is 0.967 on the validation set. Furthermore, based on 1,824 U.S. Food and Drug Administration (FDA) approved drugs, 29.6% of drugs are computationally identified to have potential hERG inhibitory activities by deephERG, highlighting the importance of hERG risk assessment in early drug discovery. Finally, we showcase several novel predicted hERG blockers among approved antineoplastic agents, which are validated by clinical case reports, experimental evidence, and the literature. In summary, this study presents a powerful deep learning-based tool for risk assessment of hERG-mediated cardiotoxicities in drug discovery and postmarketing surveillance.
Affiliation(s)
- Chuipu Cai
  - Institute of Clinical Pharmacology, Guangzhou University of Chinese Medicine, Guangzhou 510405, China
  - School of Basic Medical Sciences, Guangzhou University of Chinese Medicine, Guangzhou 510405, China
- Pengfei Guo
  - Institute of Clinical Pharmacology, Guangzhou University of Chinese Medicine, Guangzhou 510405, China
- Yadi Zhou
  - Department of Chemistry and Biochemistry, Ohio University, Athens, Ohio 45701, United States
- Jingwei Zhou
  - Institute of Clinical Pharmacology, Guangzhou University of Chinese Medicine, Guangzhou 510405, China
- Qi Wang
  - Institute of Clinical Pharmacology, Guangzhou University of Chinese Medicine, Guangzhou 510405, China
- Fengxue Zhang
  - School of Basic Medical Sciences, Guangzhou University of Chinese Medicine, Guangzhou 510405, China
- Jiansong Fang
  - Institute of Clinical Pharmacology, Guangzhou University of Chinese Medicine, Guangzhou 510405, China
- Feixiong Cheng
  - Genomic Medicine Institute, Lerner Research Institute, Cleveland Clinic, Cleveland, Ohio 44106, United States
  - Department of Molecular Medicine, Cleveland Clinic Lerner College of Medicine, Case Western Reserve University, 9500 Euclid Avenue, Cleveland, Ohio 44195, United States
  - Case Comprehensive Cancer Center, Case Western Reserve University School of Medicine, Cleveland, Ohio 44106, United States
|
77
|
van der Kooi ALLF, Clemens E, Broer L, Zolk O, Byrne J, Campbell H, van den Berg M, Berger C, Calaminus G, Dirksen U, Winther JF, Fosså SD, Grabow D, Haupt R, Kaiser M, Kepak T, Kremer L, Kruseova J, Modan-Moses D, Ranft A, Spix C, Kaatsch P, Laven JSE, van Dulmen-den Broeder E, Uitterlinden AG, van den Heuvel-Eibrink MM. Genetic variation in gonadal impairment in female survivors of childhood cancer: a PanCareLIFE study protocol. BMC Cancer 2018; 18:930. [PMID: 30257669 PMCID: PMC6158859 DOI: 10.1186/s12885-018-4834-3] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.0] [Received: 03/19/2018] [Accepted: 09/18/2018] [Indexed: 12/13/2022] Open
Abstract
Background Improved risk stratification, more effective therapy and better supportive care have resulted in survival rates after childhood cancer of around 80% in developed countries. Treatment, however, can be harsh, and three in every four childhood cancer survivors (CCS) develop at least one late effect, such as gonadal impairment. Gonadal impairment can cause involuntary childlessness, with serious consequences for the well-being of CCS. In addition, early menopause increases the risk of comorbidities such as cardiovascular disease and osteoporosis. Inter-individual variability in susceptibility to therapy-related gonadal impairment suggests a role for genetic variation. Currently, only one candidate gene study has investigated genetic determinants in relation to gonadal impairment in female CCS; it yielded one single nucleotide polymorphism (SNP) that was previously linked with the predicted age at menopause in the general population of women, now associated with gonadal impairment in CCS. Additionally, one genome-wide association study (GWAS) evaluated an association with premature menopause, but no GWAS has been performed using endocrine measurements for gonadal impairment as the primary outcome in CCS. Methods As part of the PanCareLIFE study, the genetic variability of chemotherapy-induced gonadal impairment among CCS will be addressed. Gonadal impairment will be determined by anti-Müllerian hormone (AMH) levels or alternatively by fertility and reproductive medical history retrieved by questionnaire. Clinical and genetic data from 837 non-brain or non-bilateral gonadal irradiated long-term CCS will result in the largest clinical European cohort assembled for this late-effect study to date. A candidate gene study will examine SNPs that have already been associated with age at natural menopause and DNA maintenance in the general population. In addition, a GWAS will be performed to identify novel allelic variants. The results will be validated in an independent CCS cohort. Discussion This international collaboration aims to enhance knowledge of genetic variation which may be included in risk prediction models for gonadal impairment in CCS.
Affiliation(s)
- Anne-Lotte L F van der Kooi
  - Division of Reproductive Endocrinology and Infertility, Department of Obstetrics and Gynecology, Erasmus MC - Sophia Children's Hospital, Rotterdam, The Netherlands
  - Department of Pediatric Hematology and Oncology, Erasmus MC - Sophia Children's Hospital, Rotterdam, The Netherlands
  - Princess Máxima Center for Pediatric Oncology, Lundlaan 6, 3584 EA Utrecht, The Netherlands
- Eva Clemens
  - Department of Pediatric Hematology and Oncology, Erasmus MC - Sophia Children's Hospital, Rotterdam, The Netherlands
  - Princess Máxima Center for Pediatric Oncology, Lundlaan 6, 3584 EA Utrecht, The Netherlands
- Linda Broer
  - Department of Internal Medicine, Erasmus MC, Rotterdam, The Netherlands
- Oliver Zolk
  - Institute of Pharmacology of Natural Products and Clinical Pharmacology, University Hospital Ulm, Ulm, Germany
- Marleen van den Berg
  - Department of Pediatric Hematology and Oncology, VU Medical Center, Amsterdam, The Netherlands
- Claire Berger
  - Department of Paediatric Oncology, University Hospital, St-Etienne, France
  - Epidemiology of Childhood and Adolescent Cancers, CRESS, INSERM, UMR 1153, Paris Descartes University, Villejuif, France
- Gabriele Calaminus
  - Department of Paediatric Haematology and Oncology, University Children's Hospital Bonn, University of Bonn Medical School, Bonn, Germany
- Uta Dirksen
  - Pediatrics III, West German Cancer Centre, University Hospital Essen, Essen, Germany
  - German Cancer Research Centre, DKTK, sites Bonn and Essen, Germany
- Jeanette Falck Winther
  - Danish Cancer Society Research Center, Copenhagen, Denmark
  - Department of Clinical Medicine, Faculty of Health, Aarhus University, Aarhus, Denmark
- Sophie D Fosså
  - Department of Oncology, Oslo University Hospital, Oslo, Norway
- Desiree Grabow
  - German Childhood Cancer Registry, Institute of Medical Biostatistics, Epidemiology and Informatics, University Medical Center, Mainz, Germany
- Riccardo Haupt
  - Epidemiology and Biostatistics Unit, Istituto Giannina Gaslini, Genoa, Italy
- Melanie Kaiser
  - German Childhood Cancer Registry, Institute of Medical Biostatistics, Epidemiology and Informatics, University Medical Center, Mainz, Germany
- Tomas Kepak
  - Czech Republic & International Clinical Research Center (FNUSA-ICRC), University Hospital Brno, Brno, Czech Republic
- Leontien Kremer
  - Princess Máxima Center for Pediatric Oncology, Lundlaan 6, 3584 EA Utrecht, The Netherlands
  - Department of Pediatrics, Academic Medical Center, Emma Children's Hospital, Amsterdam, The Netherlands
- Dalit Modan-Moses
  - Chaim Sheba Medical Center, The Edmond and Lily Safra Children's Hospital, Tel Hashomer, Israel
  - Sackler Faculty of Medicine, Tel-Aviv University, Tel-Aviv, Israel
- Andreas Ranft
  - Pediatrics III, West German Cancer Centre, University Hospital Essen, Essen, Germany
  - German Cancer Research Centre, DKTK, sites Bonn and Essen, Germany
- Claudia Spix
  - German Childhood Cancer Registry, Institute of Medical Biostatistics, Epidemiology and Informatics, University Medical Center, Mainz, Germany
- Peter Kaatsch
  - German Childhood Cancer Registry, Institute of Medical Biostatistics, Epidemiology and Informatics, University Medical Center, Mainz, Germany
- Joop S E Laven
  - Division of Reproductive Endocrinology and Infertility, Department of Obstetrics and Gynecology, Erasmus MC - Sophia Children's Hospital, Rotterdam, The Netherlands
|
78
|
Wang WH, Xie TY, Xie GL, Ren ZL, Li JM. An Integrated Approach for Identifying Molecular Subtypes in Human Colon Cancer Using Gene Expression Data. Genes (Basel) 2018; 9:E397. [PMID: 30072645 PMCID: PMC6115727 DOI: 10.3390/genes9080397] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Received: 05/22/2018] [Revised: 07/18/2018] [Accepted: 07/27/2018] [Indexed: 02/08/2023] Open
Abstract
Identifying molecular subtypes of colorectal cancer (CRC) may allow for more rational, patient-specific treatment. Various studies have identified molecular subtypes for CRC using gene expression data, but they are inconsistent and further research is necessary. From a methodological point of view, a progressive approach is needed to identify molecular subtypes in human colon cancer using gene expression data. We propose an approach to identify the molecular subtypes of colon cancer that integrates denoising by the Bayesian robust principal component analysis (BRPCA) algorithm, hierarchical clustering by the directed bubble hierarchical tree (DBHT) algorithm, and feature gene selection by an improved differential evolution based feature selection method (DEFSW) algorithm. In this approach, a clustering is considered reasonable only if the normal samples are completely and exclusively grouped into one class, and the feature selection accounts for imbalances of samples among subtypes. With this approach, we identified the molecular subtypes of colon cancer on the mRNA gene expression dataset of 153 colon cancer samples and 19 normal control samples of the Cancer Genome Atlas (TCGA) project. The colon cancer was clustered into 7 subtypes with 44 feature genes. Our approach could identify finer subtypes of colon cancer with fewer feature genes than the other two recent studies and exhibits a generic methodology that might be applied to identify the subtypes of other cancers.
Affiliation(s)
- Wen-Hui Wang
  - State Key Laboratory of Organ Failure Research, Division of Nephrology, Southern Medical University, Guangzhou 510515, China
  - Department of Bioinformatics, School of Basic Medical Sciences, Southern Medical University, Guangzhou 510515, China
  - Network Information Center, The Sixth Affiliated Hospital of Sun Yat-Sen University, Guangzhou 510655, China
- Ting-Yan Xie
  - State Key Laboratory of Organ Failure Research, Division of Nephrology, Southern Medical University, Guangzhou 510515, China
  - Department of Bioinformatics, School of Basic Medical Sciences, Southern Medical University, Guangzhou 510515, China
- Guang-Lei Xie
  - State Key Laboratory of Organ Failure Research, Division of Nephrology, Southern Medical University, Guangzhou 510515, China
  - Department of Bioinformatics, School of Basic Medical Sciences, Southern Medical University, Guangzhou 510515, China
- Zhong-Lu Ren
  - Center for Systems Medical Genetics, Department of Obstetrics & Gynecology Nanfang Hospital, Southern Medical University, Guangzhou 510515, China
  - Laboratory of Systems Neuroscience, Institute of Mental Health Southern Medical University, Southern Medical University, Guangzhou 510515, China
- Jin-Ming Li
  - State Key Laboratory of Organ Failure Research, Division of Nephrology, Southern Medical University, Guangzhou 510515, China
  - Department of Bioinformatics, School of Basic Medical Sciences, Southern Medical University, Guangzhou 510515, China
|
79
|
Csala A, Voorbraak FPJM, Zwinderman AH, Hof MH. Sparse redundancy analysis of high-dimensional genetic and genomic data. Bioinformatics 2018; 33:3228-3234. [PMID: 28605402 DOI: 10.1093/bioinformatics/btx374] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.7] [Received: 02/07/2017] [Accepted: 06/07/2017] [Indexed: 01/06/2023] Open
Abstract
Motivation Recent technological developments have enabled genetic and genomic integrated data analysis approaches, where multiple omics datasets from various biological levels are combined and used to describe (disease) phenotypic variations. The main goal is to explain and ultimately predict phenotypic variations by understanding their genetic basis and the interaction of the associated genetic factors. Therefore, understanding the underlying genetic mechanisms of phenotypic variations is an ever-increasing research interest in biomedical sciences. In many situations, we have a set of variables that can be considered to be the outcome variables and a set that can be considered to be explanatory variables. Redundancy analysis (RDA) is an analytic method to deal with this type of directionality. Unfortunately, current implementations of RDA cannot deal optimally with the high dimensionality of omics data (p≫n). The existing theoretical framework, based on Ridge penalization, is suboptimal, since it includes all variables in the analysis. As a solution, we propose to use Elastic Net penalization in an iterative RDA framework to obtain a sparse solution. Results We proposed sparse redundancy analysis (sRDA) for high dimensional omics data analysis. We conducted simulation studies with our software implementation of sRDA to assess the reliability of sRDA. Both the analysis of simulated data and the analysis of 485,512 methylation markers and 18,424 gene-expression values measured in a set of 55 patients with Marfan syndrome show that sRDA is able to deal with the usual high dimensionality of omics data. Availability and implementation http://uva.csala.me/rda. Contact a.csala@amc.uva.nl. Supplementary information Supplementary data are available at Bioinformatics online.
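To make the contrast between Ridge and Elastic Net penalization in the abstract above concrete, the following is a minimal sketch under one common parameterization (penalty strength `lam`, mixing weight `alpha`, in the style of glmnet-like formulations). The function names are illustrative assumptions and do not correspond to the sRDA software's API; the update rule shown is the standard single-coefficient solution for a standardized predictor, not the authors' iterative RDA algorithm.

```python
import numpy as np


def elastic_net_penalty(beta, lam, alpha):
    """lam * (alpha * ||beta||_1 + 0.5 * (1 - alpha) * ||beta||_2^2).

    alpha = 1 gives the lasso penalty, alpha = 0 gives the ridge penalty.
    """
    beta = np.asarray(beta, dtype=float)
    l1 = np.abs(beta).sum()
    l2_squared = (beta ** 2).sum()
    return float(lam * (alpha * l1 + 0.5 * (1.0 - alpha) * l2_squared))


def soft_threshold(z, t):
    """Shrink z toward zero by t and set values with |z| <= t to zero."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)


def elastic_net_update(z, lam, alpha):
    """Penalized coefficient for an unpenalized estimate z.

    Assumes a standardized predictor: the l1 part zeroes small
    coefficients (sparsity), the l2 part shrinks the survivors.
    """
    return soft_threshold(z, lam * alpha) / (1.0 + lam * (1.0 - alpha))
```

With `alpha = 0` the update only rescales every coefficient and never sets one exactly to zero, which is why a pure Ridge penalty keeps all variables in the model; any `alpha > 0` lets small coefficients drop out.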
Affiliation(s)
- Attila Csala
  - Department of Clinical Epidemiology, Biostatistics and Bioinformatics
- Frans P J M Voorbraak
  - Department of Medical Informatics, Academic Medical Center, Amsterdam 1105 AZ, The Netherlands
- Michel H Hof
  - Department of Clinical Epidemiology, Biostatistics and Bioinformatics
|
80
|
|
81
|
Vilor-Tejedor N, Alemany S, Cáceres A, Bustamante M, Pujol J, Sunyer J, González JR. Strategies for integrated analysis in imaging genetics studies. Neurosci Biobehav Rev 2018; 93:57-70. [PMID: 29944960 DOI: 10.1016/j.neubiorev.2018.06.013] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Received: 11/29/2017] [Revised: 04/30/2018] [Accepted: 06/15/2018] [Indexed: 02/06/2023]
Abstract
Imaging Genetics (IG) integrates neuroimaging and genomic data from the same individual, deepening our knowledge of the biological mechanisms behind neurodevelopmental domains and neurological disorders. Although the literature on IG has grown exponentially over the past years, the majority of studies have mainly analyzed associations between candidate brain regions and individual genetic variants. However, this strategy is not designed to deal with the complexity of neurobiological mechanisms underlying behavioral and neurodevelopmental domains. Moreover, larger sample sizes and increased multidimensionality of this type of data represent a challenge for standardizing modeling procedures in IG research. This review provides a systematic update of the methods and strategies currently used in IG studies, and serves as an analytical framework for researchers working in this field. To complement the functionalities of the Neuroconductor framework, we also describe existing R packages that implement these methodologies. In addition, we present an overview of how these methodological approaches are applied in integrating neuroimaging and genetic data.
Affiliation(s)
- Natàlia Vilor-Tejedor
  - Barcelona Research Institute for Global Health (ISGlobal), Barcelona, Spain
  - Universitat Pompeu Fabra (UPF), Barcelona, Spain
  - CIBER Epidemiología y Salud Pública (CIBERESP), Barcelona, Spain
  - Center for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Barcelona, Spain
  - Barcelona Beta Brain Research Center (BBRC) - Pasqual Maragall Foundation, Barcelona, Spain
- Silvia Alemany
  - Barcelona Research Institute for Global Health (ISGlobal), Barcelona, Spain
  - Universitat Pompeu Fabra (UPF), Barcelona, Spain
  - CIBER Epidemiología y Salud Pública (CIBERESP), Barcelona, Spain
- Alejandro Cáceres
  - Barcelona Research Institute for Global Health (ISGlobal), Barcelona, Spain
  - Universitat Pompeu Fabra (UPF), Barcelona, Spain
  - CIBER Epidemiología y Salud Pública (CIBERESP), Barcelona, Spain
- Mariona Bustamante
  - Barcelona Research Institute for Global Health (ISGlobal), Barcelona, Spain
  - Universitat Pompeu Fabra (UPF), Barcelona, Spain
  - CIBER Epidemiología y Salud Pública (CIBERESP), Barcelona, Spain
  - Center for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, Barcelona, Spain
- Jesús Pujol
  - MRI Research Unit, Hospital del Mar, Centro de Investigación Biomédica en Red de Salud Mental, CIBERSAM G21, Barcelona, Spain
- Jordi Sunyer
  - Barcelona Research Institute for Global Health (ISGlobal), Barcelona, Spain
  - Universitat Pompeu Fabra (UPF), Barcelona, Spain
  - CIBER Epidemiología y Salud Pública (CIBERESP), Barcelona, Spain
  - IMIM (Hospital del Mar Medical Research Institute), Barcelona, Spain
- Juan R González
  - Barcelona Research Institute for Global Health (ISGlobal), Barcelona, Spain
  - Universitat Pompeu Fabra (UPF), Barcelona, Spain
  - CIBER Epidemiología y Salud Pública (CIBERESP), Barcelona, Spain
|
82
|
Hishida A, Nakatochi M, Akiyama M, Kamatani Y, Nishiyama T, Ito H, Oze I, Nishida Y, Hara M, Takashima N, Turin TC, Watanabe M, Suzuki S, Ibusuki R, Shimoshikiryo I, Nakamura Y, Mikami H, Ikezaki H, Furusyo N, Kuriki K, Endoh K, Koyama T, Matsui D, Uemura H, Arisawa K, Sasakabe T, Okada R, Kawai S, Naito M, Momozawa Y, Kubo M, Wakai K. Genome-Wide Association Study of Renal Function Traits: Results from the Japan Multi-Institutional Collaborative Cohort Study. Am J Nephrol 2018; 47:304-316. [PMID: 29779033 DOI: 10.1159/000488946] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Received: 11/06/2017] [Accepted: 03/29/2018] [Indexed: 11/19/2022]
Abstract
BACKGROUND Chronic kidney disease (CKD) is a rapidly growing, worldwide public health problem. Recent advances in genome-wide association studies (GWAS) revealed several genetic loci associated with renal function traits worldwide. METHODS We investigated the association of genetic factors with the levels of serum creatinine (SCr) and the estimated glomerular filtration rate (eGFR) in Japanese population-based cohorts, analyzing GWAS imputed data from 11,221 subjects and 12,617,569 variants, and replicated the findings in 148,829 hospital-based Japanese subjects. RESULTS In the discovery phase, 28 variants within 4 loci (chromosome [chr] 2 with 8 variants including rs3770636 in the LDL receptor related protein 2 gene locus, chr 5 with 2 variants including rs270184, chr 17 with 15 variants including rs3785837 in the BCAS3 gene locus, and chr 18 with 3 variants including rs74183647 in the nuclear factor of activated T-cells 1 gene locus) reached the suggestive level of $p < 1 \times 10^{-6}$ in association with eGFR and SCr, and 2 variants on chr 4 (including rs78351985 in the microsomal triglyceride transfer protein gene locus) fulfilled the suggestive level in association with the risk of CKD. In the replication phase, 25 variants within 3 loci (chr 2 with 7 variants, chr 17 with 15 variants and chr 18 with 3 variants) in association with eGFR and SCr, and 2 variants on chr 4 associated with the risk of CKD became nominally statistically significant after Bonferroni correction, among which 15 variants on chr 17 and 3 variants on chr 18 reached genome-wide significance of $p < 5 \times 10^{-8}$ in the combined study meta-analysis. The associations of the loci on chr 2 and 18 with eGFR and SCr as well as that on chr 4 with CKD risk have not been previously reported in the Japanese and East Asian populations. CONCLUSION Although the present GWAS of renal function traits included the largest sample of Japanese participants to date, we did not identify novel loci for renal traits. However, we identified the novel associations of the genetic loci on chr 2, 4, and 18 with renal function traits in the Japanese population, suggesting these are transethnic loci. Further investigations of these associations are expected to further validate our findings for the potential establishment of personalized prevention of renal disease in the Japanese and East Asian populations.
MESH Headings
- Adult
- Aged
- Asian People/genetics
- Chromosomes, Human, Pair 18/genetics
- Chromosomes, Human, Pair 2/genetics
- Chromosomes, Human, Pair 4/genetics
- Cohort Studies
- Creatinine/blood
- Female
- Genetic Loci
- Genetic Predisposition to Disease
- Genome-Wide Association Study
- Glomerular Filtration Rate
- Humans
- Japan/epidemiology
- Kidney/physiopathology
- Male
- Middle Aged
- Polymorphism, Single Nucleotide
- Prevalence
- Renal Insufficiency, Chronic/blood
- Renal Insufficiency, Chronic/epidemiology
- Renal Insufficiency, Chronic/genetics
Affiliation(s)
- Asahi Hishida
  - Department of Preventive Medicine, Nagoya University Graduate School of Medicine, Nagoya, Japan
- Masahiro Nakatochi
  - Center for Advanced Medicine and Clinical Research, Nagoya University Hospital, Nagoya, Japan
- Masato Akiyama
  - RIKEN Center for Integrative Medical Sciences, Yokohama, Japan
- Yoichiro Kamatani
  - RIKEN Center for Integrative Medical Sciences, Yokohama, Japan
  - Center for Genomic Medicine, Graduate School of Medicine, Kyoto University, Kyoto, Japan
- Takeshi Nishiyama
  - Department of Public Health, Aichi Medical University School of Medicine, Nagakute, Japan
- Hidemi Ito
  - Division of Molecular and Clinical Epidemiology, Aichi Cancer Center Research Institute, Nagoya, Japan
- Isao Oze
  - Division of Molecular and Clinical Epidemiology, Aichi Cancer Center Research Institute, Nagoya, Japan
- Yuichiro Nishida
  - Department of Preventive Medicine, Faculty of Medicine, Saga University, Saga, Japan
- Megumi Hara
  - Department of Preventive Medicine, Faculty of Medicine, Saga University, Saga, Japan
- Naoyuki Takashima
  - Department of Health Science, Shiga University of Medical Science, Otsu, Japan
- Tanvir Chowdhury Turin
  - Department of Family Medicine, Cumming School of Medicine, University of Calgary, Calgary, Alberta, Canada
- Miki Watanabe
  - Department of Public Health, Nagoya City University Graduate School of Medical Sciences, Nagoya, Japan
- Sadao Suzuki
  - Department of Public Health, Nagoya City University Graduate School of Medical Sciences, Nagoya, Japan
- Rie Ibusuki
  - Department of International Island and Community Medicine, Kagoshima University Graduate School of Medical and Dental Sciences, Kagoshima, Japan
- Ippei Shimoshikiryo
  - Department of International Island and Community Medicine, Kagoshima University Graduate School of Medical and Dental Sciences, Kagoshima, Japan
- Yohko Nakamura
  - Cancer Prevention Center, Chiba Cancer Center Research Institute, Chiba, Japan
- Haruo Mikami
  - Cancer Prevention Center, Chiba Cancer Center Research Institute, Chiba, Japan
- Hiroaki Ikezaki
  - Department of Geriatric Medicine, Graduate School of Medical Sciences, Kyushu University, Fukuoka, Japan
- Norihiro Furusyo
  - Department of Geriatric Medicine, Graduate School of Medical Sciences, Kyushu University, Fukuoka, Japan
- Kiyonori Kuriki
  - Laboratory of Public Health, School of Food and Nutritional Sciences, University of Shizuoka, Shizuoka, Japan
- Kaori Endoh
  - Laboratory of Public Health, School of Food and Nutritional Sciences, University of Shizuoka, Shizuoka, Japan
- Teruhide Koyama
  - Department of Epidemiology for Community Health and Medicine, Kyoto Prefectural University of Medicine, Kyoto, Japan
- Daisuke Matsui
  - Department of Epidemiology for Community Health and Medicine, Kyoto Prefectural University of Medicine, Kyoto, Japan
- Hirokazu Uemura
  - Department of Preventive Medicine, Institute of Health Biosciences, University of Tokushima Graduate School, Tokushima, Japan
- Kokichi Arisawa
  - Department of Preventive Medicine, Institute of Health Biosciences, University of Tokushima Graduate School, Tokushima, Japan
- Tae Sasakabe
  - Department of Preventive Medicine, Nagoya University Graduate School of Medicine, Nagoya, Japan
- Rieko Okada
  - Department of Preventive Medicine, Nagoya University Graduate School of Medicine, Nagoya, Japan
- Sayo Kawai
  - Department of Preventive Medicine, Nagoya University Graduate School of Medicine, Nagoya, Japan
- Mariko Naito
  - Department of Preventive Medicine, Nagoya University Graduate School of Medicine, Nagoya, Japan
- Michiaki Kubo
  - RIKEN Center for Integrative Medical Sciences, Yokohama, Japan
- Kenji Wakai
  - Department of Preventive Medicine, Nagoya University Graduate School of Medicine, Nagoya, Japan
|
83
|
Tran TB, Bergen PJ, Creek DJ, Velkov T, Li J. Synergistic Killing of Polymyxin B in Combination With the Antineoplastic Drug Mitotane Against Polymyxin-Susceptible and -Resistant Acinetobacter baumannii: A Metabolomic Study. Front Pharmacol 2018; 9:359. [PMID: 29713282 PMCID: PMC5911485 DOI: 10.3389/fphar.2018.00359] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.8] [Received: 01/11/2018] [Accepted: 03/27/2018] [Indexed: 12/16/2022] Open
Abstract
Polymyxins are currently used as the last-resort antibiotics against multidrug-resistant Acinetobacter baumannii. As resistance to polymyxins emerges in A. baumannii with monotherapy, combination therapy is often the only remaining treatment option. A novel approach is to employ the combination of polymyxin B with non-antibiotic drugs. In the present study, we employed metabolomics to investigate the synergistic mechanism of polymyxin B in combination with the antineoplastic drug mitotane against polymyxin-susceptible and -resistant A. baumannii. The metabolomes of four A. baumannii strains were analyzed following treatment with polymyxin B, mitotane and the combination. Polymyxin B monotherapy induced significant perturbation in glycerophospholipid (GPL) metabolism and histidine degradation pathways in polymyxin-susceptible strains, and minimal perturbation in polymyxin-resistant strains. Mitotane monotherapy induced minimal perturbation in the polymyxin-susceptible strains, but caused significant perturbation in GPL metabolism, pentose phosphate pathway and histidine degradation in the LPS-deficient polymyxin-resistant strain (FADDI-AB065). The polymyxin B – mitotane combination induced significant perturbation in all strains except the lipid A modified polymyxin-resistant FADDI-AB225 strain. For the polymyxin-susceptible strains, the combination therapy significantly perturbed GPL metabolism, pentose phosphate pathway, citric acid cycle, pyrimidine ribonucleotide biogenesis, guanine ribonucleotide biogenesis, and histidine degradation. Against FADDI-AB065, the combination significantly perturbed GPL metabolism, pentose phosphate pathway, citric acid cycle, and pyrimidine ribonucleotide biogenesis. Overall, these novel findings demonstrate that the disruption of the citric acid cycle and inhibition of nucleotide biogenesis are the key metabolic features associated with synergistic bacterial killing by the combination against polymyxin-susceptible and -resistant A. baumannii.
Collapse
Affiliation(s)
- Thien B Tran
- Monash Biomedicine Discovery Institute, Department of Microbiology, School of Biomedical Sciences, Faculty of Medicine, Nursing and Health Sciences, Monash University, Melbourne, VIC, Australia.,Drug Delivery, Disposition and Dynamics, Monash Institute of Pharmaceutical Sciences, Monash University, Melbourne, VIC, Australia
| | - Phillip J Bergen
- Centre for Medicine Use and Safety, Monash Institute of Pharmaceutical Sciences, Monash University, Melbourne, VIC, Australia
| | - Darren J Creek
- Drug Delivery, Disposition and Dynamics, Monash Institute of Pharmaceutical Sciences, Monash University, Melbourne, VIC, Australia
| | - Tony Velkov
- Drug Delivery, Disposition and Dynamics, Monash Institute of Pharmaceutical Sciences, Monash University, Melbourne, VIC, Australia.,Department of Pharmacology and Therapeutics, School of Biomedical Sciences, Faculty of Medicine, Dentistry and Health Sciences, The University of Melbourne, Parkville, VIC, Australia
| | - Jian Li
- Monash Biomedicine Discovery Institute, Department of Microbiology, School of Biomedical Sciences, Faculty of Medicine, Nursing and Health Sciences, Monash University, Melbourne, VIC, Australia
| |
|
84
|
Lim SD, Yim WC, Liu D, Hu R, Yang X, Cushman JC. A Vitis vinifera basic helix-loop-helix transcription factor enhances plant cell size, vegetative biomass and reproductive yield. PLANT BIOTECHNOLOGY JOURNAL 2018; 16:1595-1615. [PMID: 29520945 PMCID: PMC6096725 DOI: 10.1111/pbi.12898] [Citation(s) in RCA: 27] [Impact Index Per Article: 4.5] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 10/13/2017] [Accepted: 01/22/2018] [Indexed: 05/03/2023]
Abstract
Strategies for improving plant size are critical targets for plant biotechnology to increase vegetative biomass or reproductive yield. To improve biomass production, a codon-optimized helix-loop-helix transcription factor (VvCEB1opt) from wine grape was overexpressed in Arabidopsis thaliana, resulting in significantly increased leaf number, leaf and rosette area, fresh weight and dry weight. Cell size, but typically not cell number, was increased in all tissues, resulting in increased vegetative biomass and reproductive organ size, number and seed yield. Ionomic analysis of leaves revealed that the VvCEB1opt-overexpressing plants had significantly elevated K, S and Mo contents relative to control lines. Increased K content likely drives increased osmotic potential within cells, leading to greater cellular growth and expansion. To understand the mechanistic basis of VvCEB1opt action, one transgenic line was genotyped using RNA-Seq mRNA expression profiling, which revealed a novel transcriptional reprogramming network with significant changes in mRNA abundance for genes with functions in delayed flowering, pathogen-defence responses, iron homeostasis, vesicle-mediated cell wall formation and auxin-mediated signalling and responses. Direct testing of VvCEB1opt-overexpressing plants showed that they had significantly elevated auxin content and a significantly increased number of lateral leaf primordia within meristems relative to controls, confirming that cell expansion and organ number proliferation were likely an auxin-mediated process. VvCEB1opt overexpression in Nicotiana sylvestris also showed larger cells, organ size and biomass, demonstrating the potential applicability of this innovative strategy for improving plant biomass and reproductive yield in crops.
Affiliation(s)
- Sung Don Lim
- Department of Biochemistry and Molecular Biology, University of Nevada, Reno, Reno, NV, USA
| | - Won Choel Yim
- Department of Biochemistry and Molecular Biology, University of Nevada, Reno, Reno, NV, USA
| | - Degao Liu
- Biosciences Division, Oak Ridge National Laboratory, Oak Ridge, TN, USA
| | - Rongbin Hu
- Biosciences Division, Oak Ridge National Laboratory, Oak Ridge, TN, USA
| | - Xiaohan Yang
- Biosciences Division, Oak Ridge National Laboratory, Oak Ridge, TN, USA
| | - John C. Cushman
- Department of Biochemistry and Molecular Biology, University of Nevada, Reno, Reno, NV, USA
| |
|
85
|
Cheng L, Leung KS. Quantification of non-coding RNA target localization diversity and its application in cancers. J Mol Cell Biol 2018; 10:130-138. [DOI: 10.1093/jmcb/mjy006] [Citation(s) in RCA: 31] [Impact Index Per Article: 5.2] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/08/2017] [Accepted: 01/24/2018] [Indexed: 12/13/2022] Open
Affiliation(s)
- Lixin Cheng
- Department of Computer Science and Engineering, The Chinese University of Hong Kong, Shatin, New Territories, Hong Kong SAR, China
| | - Kwong-Sak Leung
- Department of Computer Science and Engineering, The Chinese University of Hong Kong, Shatin, New Territories, Hong Kong SAR, China
| |
|
86
|
Bhadra S, Rousu J. Analysis of Fluxomic Experiments with Principal Metabolic Flux Mode Analysis. Methods Mol Biol 2018; 1807:141-161. [PMID: 30030809 DOI: 10.1007/978-1-4939-8561-6_11] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 06/08/2023]
Abstract
In the analysis of metabolism, two distinct and complementary approaches are frequently used: principal component analysis (PCA) and stoichiometric flux analysis. PCA is able to capture the main modes of variability in a set of experiments and does not make many prior assumptions about the data, but does not inherently take into account the flux mode structure of metabolism. Stoichiometric flux analysis methods, such as Flux Balance Analysis (FBA) and Elementary Mode Analysis, on the other hand, are able to capture the metabolic flux modes; however, they are primarily designed for the analysis of single samples at a time, and assume the stoichiometric steady state of the metabolic network. We will discuss a new methodology for the analysis of metabolism, called Principal Metabolic Flux Mode Analysis (PMFA), which marries the PCA and stoichiometric flux analysis approaches in an elegant regularized optimization framework. In short, the method incorporates a variance maximization objective from PCA coupled with a stoichiometric regularizer, which penalizes projections that are far from any flux modes of the network. For interpretability, we also discuss a sparse variant of PMFA that favors flux modes that contain a small number of reactions. PMFA has several benefits: (1) it can be applied to large metabolic networks in an efficient way, as PMFA does not enumerate elementary modes, (2) the method is more robust to steady-state violations than competing approaches, and (3) it can compactly capture the variation in the data by a few factors. This chapter will describe the detailed steps for applying the method to experimental data from fluxomic and gene expression measurements.
Affiliation(s)
| | - Juho Rousu
- Helsinki Institute for Information Technology HIIT, Department of Computer Science, Aalto University, Espoo, Finland
| |
|
87
|
Liu M, Fan X, Fang K, Zhang Q, Ma S. Integrative sparse principal component analysis of gene expression data. Genet Epidemiol 2017; 41:844-865. [PMID: 29114920 DOI: 10.1002/gepi.22089] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/05/2017] [Revised: 10/03/2017] [Accepted: 10/04/2017] [Indexed: 12/16/2022]
Abstract
In the analysis of gene expression data, dimension reduction techniques have been extensively adopted. The most popular one is perhaps the PCA (principal component analysis). To generate more reliable and more interpretable results, the SPCA (sparse PCA) technique has been developed. With the "small sample size, high dimensionality" characteristic of gene expression data, the analysis results generated from a single dataset are often unsatisfactory. Under contexts other than dimension reduction, integrative analysis techniques, which jointly analyze the raw data of multiple independent datasets, have been developed and shown to outperform "classic" meta-analysis and other multidatasets techniques and single-dataset analysis. In this study, we conduct integrative analysis by developing the iSPCA (integrative SPCA) method. iSPCA achieves the selection and estimation of sparse loadings using a group penalty. To take advantage of the similarity across datasets and generate more accurate results, we further impose contrasted penalties. Different penalties are proposed to accommodate different data conditions. Extensive simulations show that iSPCA outperforms the alternatives under a wide spectrum of settings. The analysis of breast cancer and pancreatic cancer data further shows iSPCA's satisfactory performance.
Affiliation(s)
- Mengque Liu
- Department of Statistics, School of Economics, Xiamen University, Xiamen, China
| | - Xinyan Fan
- Department of Statistics, School of Economics, Xiamen University, Xiamen, China
| | - Kuangnan Fang
- Department of Statistics, School of Economics, Xiamen University, Xiamen, China
| | - Qingzhao Zhang
- Department of Statistics, School of Economics, Xiamen University, Xiamen, China.,Wang Yanan Institute of Economics Studies, Xiamen University, Xiamen, China
| | - Shuangge Ma
- Department of Statistics, School of Economics, Xiamen University, Xiamen, China.,Wang Yanan Institute of Economics Studies, Xiamen University, Xiamen, China.,Department of Biostatistics, Yale University, New Haven, Connecticut, United States of America
| |
|
88
|
Morandi GD, Wiseman SB, Guan M, Zhang XW, Martin JW, Giesy JP. Elucidating mechanisms of toxic action of dissolved organic chemicals in oil sands process-affected water (OSPW). CHEMOSPHERE 2017; 186:893-900. [PMID: 28830063 DOI: 10.1016/j.chemosphere.2017.08.025] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 05/24/2017] [Revised: 07/13/2017] [Accepted: 08/06/2017] [Indexed: 06/07/2023]
Abstract
Oil sands process-affected water (OSPW) is generated during extraction of bitumen in the surface-mining oil sands industry in Alberta, Canada, and is acutely and chronically toxic to aquatic organisms. It is known that dissolved organic compounds in OSPW are responsible for most toxic effects, but knowledge of the specific mechanism(s) of toxicity is limited. Using bioassay-based effects-directed analysis, the dissolved organic fraction of OSPW has previously been fractionated, ultimately producing refined samples of dissolved organic chemicals in OSPW, each with distinct chemical profiles. Using the Escherichia coli K-12 strain MG1655 gene reporter live cell array, the present study investigated relationships between toxic potencies of each fraction, expression of genes and characterization of chemicals in each of five acutely toxic extracts and one non-toxic extract of OSPW derived by use of effects-directed analysis. Effects on expression of genes related to response to oxidative stress, protein stress and DNA damage were indicative of exposure to acutely toxic extracts of OSPW. Additionally, six genes were uniquely responsive to acutely toxic extracts of OSPW. Evidence presented supports a role for sulphur- and nitrogen-containing chemical classes in the toxicity of extracts of OSPW.
Affiliation(s)
- Garrett D Morandi
- Toxicology Centre, University of Saskatchewan, Saskatoon, SK S7N 5B3, Canada
| | - Steve B Wiseman
- Toxicology Centre, University of Saskatchewan, Saskatoon, SK S7N 5B3, Canada; Department of Biological Sciences and Water Institute for Sustainable Environments (WISE), University of Lethbridge, Lethbridge, AB T1K 3M4, Canada
| | - Miao Guan
- State Key Laboratory of Pollution Control and Resource Reuse, School of the Environment, Nanjing University, Nanjing, Jiangsu 210023, China
| | - Xiaowei W Zhang
- State Key Laboratory of Pollution Control and Resource Reuse, School of the Environment, Nanjing University, Nanjing, Jiangsu 210023, China.
| | - Jonathan W Martin
- Division of Analytical and Environmental Toxicology, University of Alberta, Edmonton, AB T6G 2G3, Canada; Department of Environmental Sciences and Analytical Chemistry, Stockholm University, Stockholm, 114 18, Sweden
| | - John P Giesy
- Toxicology Centre, University of Saskatchewan, Saskatoon, SK S7N 5B3, Canada; State Key Laboratory of Pollution Control and Resource Reuse, School of the Environment, Nanjing University, Nanjing, Jiangsu 210023, China; Department of Veterinary Biomedical Sciences, University of Saskatchewan, Saskatoon, SK S7N 5B3, Canada; Zoology Department, Center for Integrative Toxicology, Michigan State University, East Lansing, MI 48824, USA; School of Biological Sciences, University of Hong Kong, 999077, Hong Kong Special Administrative Region.
| |
|
89
|
Zeng N, Jiang H, Fan Q, Wang T, Rong W, Li G, Li R, Xu D, Guo T, Wang F, Zeng L, Huang M, Zheng J, Lu F, Chen W, Hu Q, Huang Z, Wang Q. Aberrant expression of miR-451a contributes to 1,2-dichloroethane-induced hepatic glycerol gluconeogenesis disorder by inhibiting glycerol kinase expression in NIH Swiss mice. J Appl Toxicol 2017; 38:292-303. [DOI: 10.1002/jat.3526] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/03/2017] [Revised: 08/14/2017] [Accepted: 08/18/2017] [Indexed: 11/08/2022]
Affiliation(s)
- Ni Zeng
- Faculty of Preventive Medicine, A Key Laboratory of Guangzhou Environmental Pollution and Risk Assessment, School of Public Health; Sun Yat-sen University; Guangzhou 510080 China
| | - Hongmei Jiang
- Faculty of Preventive Medicine, A Key Laboratory of Guangzhou Environmental Pollution and Risk Assessment, School of Public Health; Sun Yat-sen University; Guangzhou 510080 China
| | - Qiming Fan
- Faculty of Preventive Medicine, A Key Laboratory of Guangzhou Environmental Pollution and Risk Assessment, School of Public Health; Sun Yat-sen University; Guangzhou 510080 China
| | - Ting Wang
- Faculty of Preventive Medicine, A Key Laboratory of Guangzhou Environmental Pollution and Risk Assessment, School of Public Health; Sun Yat-sen University; Guangzhou 510080 China
| | - Weifeng Rong
- Guangdong Provincial Key Laboratory of Occupational Disease Prevention and Treatment, Department of Toxicology; Guangdong Province Hospital for Occupational Disease Prevention and Treatment; Guangzhou 510300 China
| | - Guoliang Li
- Guangdong Provincial Key Laboratory of Occupational Disease Prevention and Treatment, Department of Toxicology; Guangdong Province Hospital for Occupational Disease Prevention and Treatment; Guangzhou 510300 China
| | - Ruobi Li
- Faculty of Preventive Medicine, A Key Laboratory of Guangzhou Environmental Pollution and Risk Assessment, School of Public Health; Sun Yat-sen University; Guangzhou 510080 China
| | - Dandan Xu
- Faculty of Preventive Medicine, A Key Laboratory of Guangzhou Environmental Pollution and Risk Assessment, School of Public Health; Sun Yat-sen University; Guangzhou 510080 China
| | - Tao Guo
- Faculty of Preventive Medicine, A Key Laboratory of Guangzhou Environmental Pollution and Risk Assessment, School of Public Health; Sun Yat-sen University; Guangzhou 510080 China
| | - Fei Wang
- Faculty of Preventive Medicine, A Key Laboratory of Guangzhou Environmental Pollution and Risk Assessment, School of Public Health; Sun Yat-sen University; Guangzhou 510080 China
| | - Lihai Zeng
- Guangdong Provincial Key Laboratory of Occupational Disease Prevention and Treatment, Department of Toxicology; Guangdong Province Hospital for Occupational Disease Prevention and Treatment; Guangzhou 510300 China
| | - Manqi Huang
- Guangdong Provincial Key Laboratory of Occupational Disease Prevention and Treatment, Department of Toxicology; Guangdong Province Hospital for Occupational Disease Prevention and Treatment; Guangzhou 510300 China
| | - Jiewei Zheng
- Guangdong Provincial Key Laboratory of Occupational Disease Prevention and Treatment, Department of Toxicology; Guangdong Province Hospital for Occupational Disease Prevention and Treatment; Guangzhou 510300 China
| | - Fengrong Lu
- Guangdong Provincial Key Laboratory of Occupational Disease Prevention and Treatment, Department of Toxicology; Guangdong Province Hospital for Occupational Disease Prevention and Treatment; Guangzhou 510300 China
| | - Wen Chen
- Faculty of Preventive Medicine, A Key Laboratory of Guangzhou Environmental Pollution and Risk Assessment, School of Public Health; Sun Yat-sen University; Guangzhou 510080 China
| | - Qiansheng Hu
- Faculty of Preventive Medicine, A Key Laboratory of Guangzhou Environmental Pollution and Risk Assessment, School of Public Health; Sun Yat-sen University; Guangzhou 510080 China
| | - Zhenlie Huang
- Guangdong Provincial Key Laboratory of Occupational Disease Prevention and Treatment, Department of Toxicology; Guangdong Province Hospital for Occupational Disease Prevention and Treatment; Guangzhou 510300 China
| | - Qing Wang
- Faculty of Preventive Medicine, A Key Laboratory of Guangzhou Environmental Pollution and Risk Assessment, School of Public Health; Sun Yat-sen University; Guangzhou 510080 China
| |
|
90
|
Cai C, Wu Q, Luo Y, Ma H, Shen J, Zhang Y, Yang L, Chen Y, Wen Z, Wang Q. In silico prediction of ROCK II inhibitors by different classification approaches. Mol Divers 2017; 21:791-807. [DOI: 10.1007/s11030-017-9772-5] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/24/2016] [Accepted: 07/19/2017] [Indexed: 11/25/2022]
|
91
|
Hao N, Xie X, Zhou Z, Li J, Kang L, Wu H, Guo P, Dang C, Zhang H. Nomogram predicted risk of peripherally inserted central catheter related thrombosis. Sci Rep 2017; 7:6344. [PMID: 28740162 PMCID: PMC5524883 DOI: 10.1038/s41598-017-06609-x] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/05/2017] [Accepted: 06/14/2017] [Indexed: 12/24/2022] Open
Abstract
The use of peripherally inserted central catheters (PICCs) is increasing rapidly worldwide. A number of patient-related, clinical-related and device-related characteristics might be risk factors for PICC-related thrombosis. We retrospectively reviewed a database of 320 consecutive patients who underwent PICC insertion between December 2014 and December 2015 at the First Affiliated Hospital of Xi’an Jiaotong University to explore the potential associations between risk factors and PICC-associated thrombosis. A novel nomogram for predicting risk was developed based on the data. The nomogram prediction model included ten risk factors that were derived from different relevant estimates. The nomogram prediction model showed good discriminatory power (Harrell’s C-index, 0.709) and a high degree of similarity to actual thrombosis occurring after calibration. Furthermore, principal component analysis was performed to identify the factors that most influence PICC-related thrombosis. Our novel nomogram thrombosis risk prediction model was accurate in predicting PICC-related thrombosis. Karnofsky performance scores, D-dimer and blood platelet levels and previous chemotherapy were principal components. Our findings might help clinicians predict thrombosis risk in individual patients, select proper therapeutic strategies and optimize the timing of anticoagulation therapy.
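The discrimination measure quoted in this abstract, Harrell's C-index, can be illustrated with a minimal sketch. The function name `harrell_c_index` and the toy inputs are hypothetical and are not taken from the cited study; pairs with tied observed times are simply skipped here, which is a simplification of the full definition.

```python
from itertools import combinations


def harrell_c_index(times, events, risk_scores):
    """Concordance index for right-censored data (illustrative sketch).

    times: observed follow-up times
    events: 1 if the event was observed, 0 if censored
    risk_scores: model output; a higher score means higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Order the pair so that `a` has the shorter observed time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        # A pair is comparable only if the times differ and the earlier
        # time corresponds to an observed event (not a censoring).
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0   # earlier event had the higher predicted risk
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5   # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, so the reported 0.709 would indicate moderate discrimination under this reading.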
Affiliation(s)
- Nan Hao
- Department of Surgical Oncology, The First Affiliated Hospital of Xi'an Jiaotong University, 227W Yanta Road, Xi'an, 710061, Shaanxi, China
| | - Xin Xie
- Department of Surgical Oncology, The First Affiliated Hospital of Xi'an Jiaotong University, 227W Yanta Road, Xi'an, 710061, Shaanxi, China
| | - Zhangjian Zhou
- Department of Surgical Oncology, The First Affiliated Hospital of Xi'an Jiaotong University, 227W Yanta Road, Xi'an, 710061, Shaanxi, China
| | - Jieqiong Li
- Department of Nurse, The First Affiliated Hospital of Xi'an Jiaotong University, 227W Yanta Road, Xi'an, 710061, Shaanxi, China
| | - Li Kang
- Department of Thoracic Surgery Ward 2, The First Affiliated Hospital of Xi'an Jiaotong University, 227W Yanta Road, Xi'an, 710061, Shaanxi, China
| | - Huili Wu
- Department of Oncology, The First Affiliated Hospital of Xi'an Jiaotong University, 227W Yanta Road, Xi'an, 710061, Shaanxi, China
| | - Pingli Guo
- Department of Breast Surgery, The First Affiliated Hospital of Xi'an Jiaotong University, 227W Yanta Road, Xi'an, 710061, Shaanxi, China
| | - Chengxue Dang
- Department of Surgical Oncology, The First Affiliated Hospital of Xi'an Jiaotong University, 227W Yanta Road, Xi'an, 710061, Shaanxi, China.
| | - Hao Zhang
- Department of Surgical Oncology, The First Affiliated Hospital of Xi'an Jiaotong University, 227W Yanta Road, Xi'an, 710061, Shaanxi, China.
| |
|
92
|
Imamura R, Murata N, Shimanouchi T, Yamashita K, Fukuzawa M, Noda M. A Label-Free Fluorescent Array Sensor Utilizing Liposome Encapsulating Calcein for Discriminating Target Proteins by Principal Component Analysis. SENSORS (BASEL, SWITZERLAND) 2017; 17:E1630. [PMID: 28714873 PMCID: PMC5539792 DOI: 10.3390/s17071630] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.1] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 05/16/2017] [Revised: 07/05/2017] [Accepted: 07/13/2017] [Indexed: 01/10/2023]
Abstract
A new fluorescent arrayed biosensor has been developed to discriminate species and concentrations of target proteins by using plural different phospholipid liposome species encapsulating fluorescent molecules, utilizing differences in permeation of the fluorescent molecules through the membrane to modulate liposome-target protein interactions. This approach proposes a basically new label-free fluorescent sensor, compared with the common technique of developed fluorescent array sensors with labeling. We have confirmed a high output intensity of fluorescence emission related to characteristics of the fluorescent molecules dependent on their concentrations when they leak from inside the liposomes through the perturbed lipid membrane. After taking an array image of the fluorescence emission from the sensor using a CMOS imager, the output intensities of the fluorescence were analyzed by a principal component analysis (PCA) statistical method. It is found from PCA plots that different protein species with several concentrations were successfully discriminated by using the different lipid membranes with high cumulative contribution ratio. We also confirmed that the accuracy of the discrimination by the array sensor with a single shot is higher than that of a single sensor with multiple shots.
Affiliation(s)
- Ryota Imamura
- Graduate School of Science and Technology, Kyoto Institute of Technology, Matsugasaki, Sakyo-ku, Kyoto 606-8585, Japan.
| | - Naoki Murata
- Graduate School of Science and Technology, Kyoto Institute of Technology, Matsugasaki, Sakyo-ku, Kyoto 606-8585, Japan.
| | - Toshinori Shimanouchi
- Graduate School of Environmental and Life Science, Okayama University, 1-1-1 Tsushima-naka, Kita-ku, Okayama 700-8530, Japan.
| | - Kaoru Yamashita
- Graduate School of Science and Technology, Kyoto Institute of Technology, Matsugasaki, Sakyo-ku, Kyoto 606-8585, Japan.
| | - Masayuki Fukuzawa
- Graduate School of Science and Technology, Kyoto Institute of Technology, Matsugasaki, Sakyo-ku, Kyoto 606-8585, Japan.
| | - Minoru Noda
- Graduate School of Science and Technology, Kyoto Institute of Technology, Matsugasaki, Sakyo-ku, Kyoto 606-8585, Japan.
| |
|
93
|
Multiple Ligand-Bound States of a Phosphohexomutase Revealed by Principal Component Analysis of NMR Peak Shifts. Sci Rep 2017; 7:5343. [PMID: 28706231 PMCID: PMC5509744 DOI: 10.1038/s41598-017-05557-w] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/11/2017] [Accepted: 05/31/2017] [Indexed: 11/08/2022] Open
Abstract
Enzymes sample multiple conformations during their catalytic cycles. Chemical shifts from Nuclear Magnetic Resonance (NMR) are hypersensitive to conformational changes and ensembles in solution. Phosphomannomutase/phosphoglucomutase (PMM/PGM) is a ubiquitous four-domain enzyme that catalyzes phosphoryl transfer across phosphohexose substrates. We compared states the enzyme visits during its catalytic cycle. Collective responses of Pseudomonas PMM/PGM to phosphosugar substrates and inhibitor were assessed using NMR-detected titrations. Affinities were estimated from binding isotherms obtained by principal component analysis (PCA). Relationships among phosphosugar-enzyme associations emerge from PCA comparisons of the titrations. COordiNated Chemical Shifts bEhavior (CONCISE) analysis provides novel discrimination of three ligand-bound states of PMM/PGM harboring a mutation that suppresses activity. Enzyme phosphorylation and phosphosugar binding appear to drive the open dephosphorylated enzyme to the free phosphorylated state, and on toward ligand-closed states. Domain 4 appears central to collective responses to substrate and inhibitor binding. Hydrogen exchange reveals that binding of a substrate analogue enhances folding stability of the domains to a uniform level, establishing a globally unified structure. CONCISE and PCA of NMR spectra have discovered novel states of a well-studied enzyme and appear ready to discriminate other enzyme and ligand binding states.
|
94
|
Huisman SM, van Lew B, Mahfouz A, Pezzotti N, Höllt T, Michielsen L, Vilanova A, Reinders MJ, Lelieveldt BP. BrainScope: interactive visual exploration of the spatial and temporal human brain transcriptome. Nucleic Acids Res 2017; 45:e83. [PMID: 28132031 PMCID: PMC5449549 DOI: 10.1093/nar/gkx046] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/11/2016] [Revised: 12/22/2016] [Accepted: 01/17/2017] [Indexed: 01/26/2023] Open
Abstract
Spatial and temporal brain transcriptomics has recently emerged as an invaluable data source for molecular neuroscience. The complexity of such data poses considerable challenges for analysis and visualization. We present BrainScope: a web portal for fast, interactive visual exploration of the Allen Atlases of the adult and developing human brain transcriptome. Through a novel methodology to explore high-dimensional data (dual t-SNE), BrainScope enables the linked, all-in-one visualization of genes and samples across the whole brain and genome, and across developmental stages. We show that densities in t-SNE scatter plots of the spatial samples coincide with anatomical regions, and that densities in t-SNE scatter plots of the genes represent gene co-expression modules that are significantly enriched for biological functions. We also show that the topography of the gene t-SNE maps reflects brain region-specific gene functions, enabling hypothesis- and data-driven research. We demonstrate the discovery potential of BrainScope through three examples: (i) analysis of cell type-specific gene sets, (ii) analysis of a set of stable gene co-expression modules across the adult human donors and (iii) analysis of the evolution of co-expression of oligodendrocyte-specific genes over developmental stages. BrainScope is publicly accessible at www.brainscope.nl.
Affiliation(s)
- Sjoerd M.H. Huisman
- Delft Bioinformatics Lab, Delft University of Technology, 2628 CD Delft, The Netherlands
- Division of Image Processing, Dept of Radiology, Leiden University Medical Center, 2300 RC Leiden, The Netherlands
| | - Baldur van Lew
- Division of Image Processing, Dept of Radiology, Leiden University Medical Center, 2300 RC Leiden, The Netherlands
- Computer Graphics and Visualisation, Delft University of Technology, 2628 CD Delft, The Netherlands
| | - Ahmed Mahfouz
- Delft Bioinformatics Lab, Delft University of Technology, 2628 CD Delft, The Netherlands
- Division of Image Processing, Dept of Radiology, Leiden University Medical Center, 2300 RC Leiden, The Netherlands
| | - Nicola Pezzotti
- Division of Image Processing, Dept of Radiology, Leiden University Medical Center, 2300 RC Leiden, The Netherlands
- Computer Graphics and Visualisation, Delft University of Technology, 2628 CD Delft, The Netherlands
| | - Thomas Höllt
- Computer Graphics and Visualisation, Delft University of Technology, 2628 CD Delft, The Netherlands
| | - Lieke Michielsen
- Delft Bioinformatics Lab, Delft University of Technology, 2628 CD Delft, The Netherlands
| | - Anna Vilanova
- Computer Graphics and Visualisation, Delft University of Technology, 2628 CD Delft, The Netherlands
| | - Marcel J.T. Reinders
- Delft Bioinformatics Lab, Delft University of Technology, 2628 CD Delft, The Netherlands
| | - Boudewijn P.F. Lelieveldt
- Delft Bioinformatics Lab, Delft University of Technology, 2628 CD Delft, The Netherlands
- Division of Image Processing, Dept of Radiology, Leiden University Medical Center, 2300 RC Leiden, The Netherlands
| |
|
95
|
Tang H, Wang S, Xiao G, Schiller J, Papadimitrakopoulou V, Minna J, Wistuba II, Xie Y. Comprehensive evaluation of published gene expression prognostic signatures for biomarker-based lung cancer clinical studies. Ann Oncol 2017; 28:733-740. [PMID: 28200038 DOI: 10.1093/annonc/mdw683] [Citation(s) in RCA: 42] [Impact Index Per Article: 6.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/04/2006] [Accepted: 12/08/2016] [Indexed: 02/05/2023] Open
Abstract
Background: A more accurate prognosis for non-small-cell lung cancer (NSCLC) patients could aid in the identification of patients at high risk for recurrence. Many NSCLC mRNA expression signatures claiming to be prognostic have been reported in the literature. The goal of this study was to identify the most promising mRNA prognostic signatures in NSCLC for further prospective clinical validation. Experimental design: We carried out a systematic review and meta-analysis of published mRNA prognostic signatures for resected NSCLC. The prognostic performance of each signature was evaluated via a meta-analysis of 1927 early stage NSCLC patients collected from 15 studies using three evaluation metrics (hazard ratios, concordance scores, and time-dependent receiver-operating characteristic curves). The performance of each signature was then evaluated against 100 random signatures. The prognostic power independent of clinical risk factors was assessed by multivariate Cox models. Results: Through a literature search, we identified 42 lung cancer prognostic signatures derived from genome-wide expression profiling analysis. Based on meta-analysis, 25 signatures were prognostic for survival after adjusting for clinical risk factors and 18 signatures performed significantly better than random signatures. When histology types were analyzed separately, 17 signatures and 8 signatures were prognostic for adenocarcinoma and squamous cell lung cancer, respectively. Despite little overlap among published gene signatures, the top-performing signatures are highly concordant in predicted patient outcomes. Conclusions: Based on this large-scale meta-analysis, we identified a set of mRNA expression prognostic signatures appropriate for further validation in prospective clinical studies.
Affiliation(s)
- H Tang
- Department of Breast Oncology, Sun Yat-sen University Cancer Center, State Key Laboratory of Oncology in South China, Collaborative Innovation Center for Cancer Medicine, Guangzhou, Guangdong, People's Republic of China
| | - S Wang
- Department of Medical Oncology, Guangdong Provincial Hospital of Chinese Medicine, Guangzhou, Guangdong, P.R. China
| | - G Xiao
- Department of Thoracic Surgery and Oncology, the Second Department of Thoracic Surgery, Cancer Center, the First Affiliated Hospital of Xi'an Jiaotong University, Xi'an, Shaanxi Province, China
| | - J Schiller
- Inova Schar Cancer Institute, Falls Church, VA, USA
| | - V Papadimitrakopoulou
- Department of Thoracic/Head & Neck Medical Oncology, The University of Texas, MD Anderson Cancer Center, 1515 Holcombe Blvd. Unit 0085, Houston, TX, USA
- J Minna
- Department of Internal Medicine, University of Texas Southwestern Medical Center, Dallas, USA.,Department of Pharmacology, University of Texas Southwestern Medical Center, Dallas, USA.,Hamon Center for Therapeutic Oncology, University of Texas Southwestern Medical Center, Dallas, USA
- I I Wistuba
- Department of Thoracic/Head & Neck Medical Oncology, The University of Texas, MD Anderson Cancer Center, 1515 Holcombe Blvd. Unit 0085, Houston, TX, USA.,Department of Translational Molecular Pathology, MD Anderson Cancer Center, University of Texas, Houston, USA
- Y Xie
- Department of Oncology, First Affiliated Hospital, Soochow University, Suzhou, China.,Departments of Head and Neck and Mammary Gland Oncology and Medical Oncology, Cancer Center and State Key Laboratory of Biotherapy, Laboratory of Molecular Diagnosis of Cancer, West China Hospital, Sichuan University, Chengdu, Sichuan, China
|
96
|
Fischer D, Honkatukia M, Tuiskula-Haavisto M, Nordhausen K, Cavero D, Preisinger R, Vilkki J. Subgroup detection in genotype data using invariant coordinate selection. BMC Bioinformatics 2017; 18:173. [PMID: 28302061 PMCID: PMC5356247 DOI: 10.1186/s12859-017-1589-9] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Received: 08/09/2016] [Accepted: 03/09/2017] [Indexed: 12/01/2022] Open
Abstract
BACKGROUND The current gold standard in dimension reduction methods for high-throughput genotype data is Principal Component Analysis (PCA). PCA is so dominant that other methods are usually missing from the analyst's toolbox and hence are only rarely applied. RESULTS We present a modern dimension reduction method called 'Invariant Coordinate Selection' (ICS) and its application to high-throughput genotype data. The more commonly known Independent Component Analysis (ICA) is, in this framework, just a special case of ICS. We use ICS on both a simulated and a real dataset. First, we demonstrate some deficiencies of PCA and show how ICS is capable of recovering the correct subgroups within the simulated data. Second, we apply the ICS method to a chicken dataset and detect two subgroups there as well. These subgroups are then further investigated with respect to their genotype to provide further evidence of the biological relevance of the detected subgroup division. Further, we compare the performance of ICS with that of five other popular dimension reduction methods. CONCLUSION The ICS method was able to detect subgroups in data where PCA fails to detect anything. Hence, we promote the application of ICS to high-throughput genotype data in addition to the established PCA. Especially in statistical programming environments such as R, its application does not add any computational burden to the analysis pipeline.
Affiliation(s)
- Daniel Fischer
- Natural Resources Institute Finland (LUKE), Myllytie 1, Jokioinen, Finland
- Mervi Honkatukia
- Natural Resources Institute Finland (LUKE), Myllytie 1, Jokioinen, Finland
- Klaus Nordhausen
- Department of Mathematics and Statistics, University of Turku, Turku, Finland
- University of Tampere, School of Health Sciences, Medisiinarinkatu 3, Tampere, 33014 Finland
- David Cavero
- Lohmann Tierzucht GmbH, Am Seedeich 9-11, Cuxhaven, 27454 Germany
- Johanna Vilkki
- Natural Resources Institute Finland (LUKE), Myllytie 1, Jokioinen, Finland
|
97
|
Fang J, Wang L, Wu T, Yang C, Gao L, Cai H, Liu J, Fang S, Chen Y, Tan W, Wang Q. Network pharmacology-based study on the mechanism of action for herbal medicines in Alzheimer treatment. Journal of Ethnopharmacology 2017; 196:281-292. [PMID: 27888133 DOI: 10.1016/j.jep.2016.11.034] [Citation(s) in RCA: 76] [Impact Index Per Article: 10.9] [Received: 04/08/2016] [Revised: 11/16/2016] [Accepted: 11/17/2016] [Indexed: 06/06/2023]
Abstract
ETHNOPHARMACOLOGICAL RELEVANCE Alzheimer's disease (AD), the most common type of dementia, has placed a heavy economic burden on healthcare systems around the world. However, there is still a lack of effective treatments for AD patients. Herbal medicines, characterized by multiple herbs, ingredients and targets, have accumulated a great deal of valuable experience in treating AD, although the exact molecular mechanisms are still unclear. MATERIALS AND METHODS In this investigation, we proposed a network pharmacology-based method that combined large-scale text mining, drug-likeness filtering, target prediction and network analysis to decipher the mechanisms of action for the most widely studied medicinal herbs in AD treatment. RESULTS The text mining of PubMed resulted in 10 herbs exhibiting significant correlations with AD. Subsequently, after drug-likeness filtering, 1016 compounds remained for the 10 herbs, followed by structure clustering to summarize the chemical scaffolds of the herb ingredients. Based on target prediction results from our in-house protocol named AlzhCPI, compound-target (C-T) and target-pathway (T-P) networks were constructed to decipher the mechanism of action for anti-AD herbs. CONCLUSIONS Overall, this approach provided a novel strategy to explore the mechanisms of herbal medicine from a holistic perspective.
Affiliation(s)
- Jiansong Fang
- Institute of Clinical Pharmacology, Guangzhou University of Chinese Medicine, Guangzhou 510405, China; Department of Encephalopathy, The Second Affiliated Hospital of Guangzhou University of Chinese Medicine, Guangzhou 510120, China
- Ling Wang
- Guangdong Provincial Key Laboratory of Fermentation and Enzyme Engineering, Pre-Incubator for Innovative Drugs & Medicine, School of Bioscience and Bioengineering, South China University of Technology, Guangzhou 510006, China
- Tian Wu
- Institute of Clinical Pharmacology, Guangzhou University of Chinese Medicine, Guangzhou 510405, China; Department of Encephalopathy, The Second Affiliated Hospital of Guangzhou University of Chinese Medicine, Guangzhou 510120, China
- Cong Yang
- Institute of Clinical Pharmacology, Guangzhou University of Chinese Medicine, Guangzhou 510405, China; Department of Encephalopathy, The Second Affiliated Hospital of Guangzhou University of Chinese Medicine, Guangzhou 510120, China
- Li Gao
- Modern Research Center for Traditional Chinese Medicine, Shanxi University, Taiyuan 030006, China
- Haobin Cai
- Institute of Clinical Pharmacology, Guangzhou University of Chinese Medicine, Guangzhou 510405, China; Department of Encephalopathy, The Second Affiliated Hospital of Guangzhou University of Chinese Medicine, Guangzhou 510120, China
- Junhui Liu
- Guangxi University of Chinese Medicine, Nanning 530001, China
- Shuhuan Fang
- Institute of Clinical Pharmacology, Guangzhou University of Chinese Medicine, Guangzhou 510405, China; Department of Encephalopathy, The Second Affiliated Hospital of Guangzhou University of Chinese Medicine, Guangzhou 510120, China
- Yunbo Chen
- Institute of Clinical Pharmacology, Guangzhou University of Chinese Medicine, Guangzhou 510405, China; Department of Encephalopathy, The Second Affiliated Hospital of Guangzhou University of Chinese Medicine, Guangzhou 510120, China
- Wen Tan
- Institute of Biomedical and Pharmaceutical Sciences, Guangdong University of Technology, Guangzhou 510006, China.
- Qi Wang
- Institute of Clinical Pharmacology, Guangzhou University of Chinese Medicine, Guangzhou 510405, China; Department of Encephalopathy, The Second Affiliated Hospital of Guangzhou University of Chinese Medicine, Guangzhou 510120, China.
|
98
|
Tokumoto Y, Uefuji H, Yamamoto N, Kajiura H, Takeno S, Suzuki N, Nakazawa Y. Gene coexpression network for trans-1,4-polyisoprene biosynthesis involving mevalonate and methylerythritol phosphate pathways in Eucommia ulmoides Oliver. Plant Biotechnology (Tokyo, Japan) 2017; 34:165-172. [PMID: 31275023 PMCID: PMC6565995 DOI: 10.5511/plantbiotechnology.17.0619a] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 12/22/2016] [Accepted: 06/19/2017] [Indexed: 05/15/2023]
Abstract
Eucommia ulmoides, a deciduous dioecious plant species, accumulates trans-1,4-polyisoprene (TPI) in tissues such as the pericarp and leaf. Probable TPI synthase (trans-isoprenyl diphosphate synthase (TIDS)) genes were identified from expressed sequence tags of this species; however, the metabolic pathway of TPI biosynthesis, including the role of TIDSs, is unknown. To understand the mechanism of TPI biosynthesis at the transcriptional level, comprehensive gene expression data from various organs were generated and TPI biosynthesis-related genes were extracted by principal component analysis (PCA). The metabolic pathway was assessed by comparing the coexpression network of TPI genes with the isoprenoid gene coexpression network of model plants. Using PCA, we extracted 27 genes assumed to be involved in polyisoprene biosynthesis, including TIDS genes, genes encoding enzymes of the mevalonate (MVA) pathway and the 2-C-methyl-D-erythritol 4-phosphate (MEP) pathway, and genes related to rubber synthesis. The coexpression network revealed that 22 of the 27 TPI biosynthesis genes are coordinately expressed. The network was clustered into two modules, and this was also observed in model plants. The first module mainly comprised MEP pathway genes and the TIDS1 gene, and the second module mainly comprised MVA pathway genes and the TIDS5 gene. These results indicate that TPI is likely biosynthesized by both the MEP and MVA pathways and that TIDS gene expression is differentially controlled by these pathways.
Affiliation(s)
- Yuji Tokumoto
- Hitz (Bio) Research Alliance Laboratory, Graduate School of Engineering, Osaka University, Suita, Osaka 565-0871, Japan
- Hirotaka Uefuji
- Hitz (Bio) Research Alliance Laboratory, Graduate School of Engineering, Osaka University, Suita, Osaka 565-0871, Japan
- Naoki Yamamoto
- Hitz (Bio) Research Alliance Laboratory, Graduate School of Engineering, Osaka University, Suita, Osaka 565-0871, Japan
- Hiroyuki Kajiura
- Hitz (Bio) Research Alliance Laboratory, Graduate School of Engineering, Osaka University, Suita, Osaka 565-0871, Japan
- Shinya Takeno
- Hitz (Bio) Research Alliance Laboratory, Graduate School of Engineering, Osaka University, Suita, Osaka 565-0871, Japan
- Nobuaki Suzuki
- Hitz (Bio) Research Alliance Laboratory, Graduate School of Engineering, Osaka University, Suita, Osaka 565-0871, Japan
- Yoshihisa Nakazawa
- Hitz (Bio) Research Alliance Laboratory, Graduate School of Engineering, Osaka University, Suita, Osaka 565-0871, Japan
- E-mail: Tel & Fax: +81-6-6879-4165
|
99
|
Dillon K, Calhoun V, Wang YP. A robust sparse-modeling framework for estimating schizophrenia biomarkers from fMRI. J Neurosci Methods 2016; 276:46-55. [PMID: 27867012 DOI: 10.1016/j.jneumeth.2016.11.005] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.8] [Received: 09/19/2016] [Revised: 11/09/2016] [Accepted: 11/10/2016] [Indexed: 11/19/2022]
Abstract
BACKGROUND Our goal is to identify the brain regions most relevant to mental illness using neuroimaging. State-of-the-art machine learning methods commonly suffer from poor repeatability in this application, particularly when samples are drawn from large and heterogeneous populations. NEW METHOD We revisit both dimensionality reduction and sparse modeling, and recast them in a common optimization-based framework. This allows us to combine the benefits of both types of methods in an approach that we call unambiguous components. We use this to estimate the image component with a constrained variability, which is best correlated with the unknown disease mechanism. RESULTS We apply the method to the estimation of neuroimaging biomarkers for schizophrenia, using task fMRI data from a large multi-site study. The proposed approach yields an improvement in both robustness of the estimate and classification accuracy. COMPARISON WITH EXISTING METHODS We find that unambiguous components incorporate roughly two thirds of the same brain regions as the sparsity-based methods LASSO and elastic net, while roughly one third of the selected regions differ. Further, unambiguous components achieve superior classification accuracy in differentiating cases from controls. CONCLUSIONS Unambiguous components provide a robust way to estimate important regions of imaging data.
Affiliation(s)
- Keith Dillon
- Department of Biomedical Engineering, Tulane University, New Orleans, LA, USA; Department of Global Biostatistics and Data Science, Tulane University, New Orleans, LA, USA.
- Vince Calhoun
- The Mind Research Network & LBERI, Albuquerque, NM, USA; Department of Electrical Engineering, University of New Mexico, New Mexico, USA
- Yu-Ping Wang
- Department of Biomedical Engineering, Tulane University, New Orleans, LA, USA; Department of Global Biostatistics and Data Science, Tulane University, New Orleans, LA, USA
|
100
|
The Inflammatory Bowel Disease-Disability Index: validation of the Portuguese version according to the COSMIN checklist. Eur J Gastroenterol Hepatol 2016; 28:1151-60. [PMID: 27472270 DOI: 10.1097/meg.0000000000000701] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.3] [Indexed: 01/14/2023]
Abstract
BACKGROUND AND AIM Recently, the Inflammatory Bowel Disease-Disability Index (IBD-DI) was developed to assess disability in inflammatory bowel disease (IBD). Our aim was to validate the Portuguese version of IBD-DI according to the COnsensus-based Standards for the selection of the health Measurement INstruments (COSMIN) recommendations. MATERIALS AND METHODS After translation into Portuguese, the IBD-DI was administered by two interviewers to IBD patients at baseline and after 4 weeks and 4 months. We evaluated reliability (internal consistency, test-retest, and inter-rater reliability and measurement error), construct validity, responsiveness, and interpretability. RESULTS At baseline, 129 patients (73=Crohn's disease; 56=ulcerative colitis) completed the IBD-DI. After 4 weeks and 4 months, 118 and 89 patients repeated the questionnaire, respectively. Factor analysis confirmed the unidimensionality of the scale and reduced the final version to 14 items. The Cronbach's α was 0.88. The intraclass correlation coefficients were 0.87 and 0.99 for test-retest (baseline and 4 weeks) and inter-rater reliability, respectively. The smallest detectable change was 18.64 at the individual level and 1.87 at the group level. IBD-DI scores correlated negatively with the total, physical, and mental scores of Short Form-36 items. The change score of IBD-DI between baseline and 4 months correlated negatively with the clinical evolution of patients. The minimal important change was 16.96. IBD-DI scores ranged from 0 to 78.6, with a mean of 21.8±18.1. Female sex, professional inactivity, and clinical disease activity were associated with higher IBD-DI scores. CONCLUSION The Portuguese version of IBD-DI obtained is a reliable, valid, responsive, and interpretable (at the group level) tool to assess disability in Portuguese IBD patients.
|