1
Mandiracioglu B, Ozden F, Kaynar G, Yilmaz MA, Alkan C, Cicek AE. ECOLE: Learning to call copy number variants on whole exome sequencing data. Nat Commun 2024; 15:132. [PMID: 38167256 PMCID: PMC10762021 DOI: 10.1038/s41467-023-44116-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/07/2023] [Accepted: 11/30/2023] [Indexed: 01/05/2024] Open
Abstract
Copy number variants (CNVs) have been shown to contribute to the etiology of several genetic disorders. Accurate detection of CNVs on whole exome sequencing (WES) data has been a long sought-after goal for use in clinics. This was not possible despite recent improvements in performance because algorithms mostly suffer from low precision and even lower recall on expert-curated gold standard call sets. Here, we present a deep learning-based somatic and germline CNV caller for WES data, named ECOLE. Based on a variant of the transformer architecture, the model learns to call CNVs per exon, using high-confidence calls made on matched WGS samples. We further train and fine-tune the model with a small set of expert calls via transfer learning. We show that ECOLE achieves high performance on human expert-labelled data for the first time, with 68.7% precision and 49.6% recall. This corresponds to precision and recall improvements of 18.7% and 30.8% over the next best-performing methods, respectively. We also show that the same fine-tuning strategy using tumor samples enables ECOLE to detect RT-qPCR-validated variations in bladder cancer samples without the need for a control sample. ECOLE is available at https://github.com/ciceklab/ECOLE.
Affiliation(s)
- Berk Mandiracioglu
  - Department of Computer and Communication Sciences, EPFL, Lausanne, Switzerland
- Furkan Ozden
  - Department of Computer Science, Oxford University, Oxford, UK
- Gun Kaynar
  - Department of Computer Engineering, Bilkent University, Ankara, Turkey
- Can Alkan
  - Department of Computer Engineering, Bilkent University, Ankara, Turkey
- A Ercument Cicek
  - Department of Computer Engineering, Bilkent University, Ankara, Turkey
  - Computational Biology Department, Carnegie Mellon University, Pittsburgh, PA, US
2
Kaynar G, Cakmakci D, Bund C, Todeschi J, Namer IJ, Cicek AE. PiDeeL: metabolic pathway-informed deep learning model for survival analysis and pathological classification of gliomas. Bioinformatics 2023; 39:btad684. [PMID: 37952175 PMCID: PMC10663986 DOI: 10.1093/bioinformatics/btad684] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/20/2023] [Revised: 10/19/2023] [Accepted: 11/10/2023] [Indexed: 11/14/2023] Open
Abstract
MOTIVATION Online assessment of tumor characteristics during surgery is important and has the potential to establish an intra-operative surgeon feedback mechanism. With the availability of such feedback, surgeons could decide to be more liberal or conservative regarding the resection of the tumor. While there are methods to perform metabolomics-based tumor pathology prediction, their model complexity and predictive performance are limited by the small dataset sizes. Furthermore, the information conveyed by the feedback provided on the tumor tissue could be improved both in terms of content and accuracy. RESULTS In this study, we propose a metabolic pathway-informed deep learning model (PiDeeL) to perform survival analysis and pathology assessment based on metabolite concentrations. We show that incorporating pathway information into the model architecture substantially reduces parameter complexity and achieves better survival analysis and pathological classification performance. With these design decisions, we show that PiDeeL improves on the tumor pathology prediction performance of the state-of-the-art in terms of the Area Under the ROC Curve by 3.38% and the Area Under the Precision-Recall Curve by 4.06%. Similarly, with respect to the time-dependent concordance index (c-index), PiDeeL achieves better survival analysis performance (improvement of 4.3%) when compared to the state-of-the-art. Moreover, we show that importance analyses performed on input metabolite features as well as pathway-specific neurons of PiDeeL provide insights into tumor metabolism. We foresee that the use of this model in the surgery room will help surgeons adjust the surgery plan on the fly and will result in better prognosis estimates tailored to surgical procedures. AVAILABILITY AND IMPLEMENTATION The code is released at https://github.com/ciceklab/PiDeeL. The data used in this study are released at https://zenodo.org/record/7228791.
Affiliation(s)
- Gun Kaynar
  - Computer Engineering Department, Bilkent University, 06800 Ankara, Turkey
- Doruk Cakmakci
  - School of Computer Science, McGill University, Montreal, QC, H3A 0E9, Canada
- Caroline Bund
  - MNMS Platform, University Hospitals of Strasbourg, Strasbourg 67098, France
  - ICube, University of Strasbourg, CNRS UMR 7357, Strasbourg 67000, France
  - Department of Nuclear Medicine and Molecular Imaging, ICANS, Strasbourg 67000, France
- Julien Todeschi
  - Department of Neurosurgery, University Hospitals of Strasbourg, Strasbourg, 67091, France
- Izzie Jacques Namer
  - MNMS Platform, University Hospitals of Strasbourg, Strasbourg 67098, France
  - ICube, University of Strasbourg, CNRS UMR 7357, Strasbourg 67000, France
  - Department of Nuclear Medicine and Molecular Imaging, ICANS, Strasbourg 67000, France
- A Ercument Cicek
  - Computer Engineering Department, Bilkent University, 06800 Ankara, Turkey
  - Computational Biology Department, Carnegie Mellon University, Pittsburgh, PA 15213, United States
3
Uner OC, Kuru HI, Cinbis RG, Tastan O, Cicek AE. DeepSide: A Deep Learning Approach for Drug Side Effect Prediction. IEEE/ACM Trans Comput Biol Bioinform 2023; 20:330-339. [PMID: 34995191 DOI: 10.1109/tcbb.2022.3141103] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/04/2023]
Abstract
Drug failures due to unforeseen adverse effects at clinical trials pose health risks for the participants and lead to substantial financial losses. Side effect prediction algorithms have the potential to guide the drug design process. The LINCS L1000 dataset provides a vast resource of cell line gene expression data perturbed by different drugs and creates a knowledge base for context-specific features. The state-of-the-art approach that aims at using context-specific information relies on only the high-quality experiments in LINCS L1000 and discards a large portion of the experiments. In this study, our goal is to boost the prediction performance by utilizing this data to its full extent. We experiment with 5 deep learning architectures. We find that a multi-modal architecture produces the best predictive performance among multi-layer perceptron-based architectures when drug chemical structure (CS) and the full set of drug-perturbed gene expression profiles (GEX) are used as modalities. Overall, we observe that the CS is more informative than the GEX. A convolutional neural network-based model that uses only the SMILES string representation of the drugs achieves the best results and provides 13.0% macro-AUC and 3.1% micro-AUC improvements over the state-of-the-art. We also show that the model is able to predict side effect-drug pairs that are reported in the literature but were missing in the ground truth side effect dataset. DeepSide is available at http://github.com/OnurUner/DeepSide.
4
Karaşan O, Şen A, Tiryaki B, Cicek AE. A unifying network modeling approach for codon optimization. Bioinformatics 2022; 38:3935-3941. [PMID: 35762943 DOI: 10.1093/bioinformatics/btac428] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/08/2021] [Revised: 05/01/2022] [Accepted: 06/27/2022] [Indexed: 12/24/2022] Open
Abstract
MOTIVATION Synthesizing genes to be expressed in other organisms is an essential tool in biotechnology. While the many-to-one mapping from codons to amino acids makes the genetic code degenerate, codon usage in a particular organism is not random either. This bias in codon use may have a remarkable effect on the level of gene expression. A number of measures have been developed to quantify a given codon sequence's strength to express a gene in a host organism. Codon optimization aims to find a codon sequence that will optimize one or more of these measures. Efficient computational approaches are needed since the possible number of codon sequences grows exponentially as the number of amino acids increases. RESULTS We develop a unifying modeling approach for codon optimization. With our mathematical formulations based on graph/network representations of amino acid sequences, any combination of measures can be optimized in the same framework by finding a path satisfying additional limitations in an acyclic layered network. We tested our approach on bi-objectives commonly used in the literature, namely, Codon Pair Bias versus Codon Adaptation Index and Relative Codon Pair Bias versus Relative Codon Bias. However, our framework is general enough to handle any number of objectives concurrently with certain restrictions or preferences on the use of specific nucleotide sequences. We implemented our models using Python's Gurobi interface and showed the efficacy of our approach even for the largest proteins available. We also provided experimentation showing that highly expressed genes have objective values close to the optimized values in the bi-objective codon design problem. AVAILABILITY AND IMPLEMENTATION http://alpersen.bilkent.edu.tr/NetworkCodon.zip. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
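The layered-network idea in the abstract above can be illustrated with a small sketch. It is not the authors' formulation: the paper optimizes several measures at once with Gurobi and supports side constraints, whereas this example only finds the single best path by dynamic programming. The function name `best_codon_path` and all scores are hypothetical, chosen for illustration.

```python
def best_codon_path(layers, node_score, edge_score):
    """Highest-scoring path through a layered acyclic network.

    layers[i] lists the synonymous codons for amino acid i; a path picks
    one codon per layer. Scores are illustrative, not real codon measures.
    """
    # best[c] = (score, path) of the best partial path ending at codon c
    best = {c: (node_score(c), [c]) for c in layers[0]}
    for layer in layers[1:]:
        nxt = {}
        for c in layer:
            prev, (score, path) = max(
                best.items(),
                key=lambda kv: kv[1][0] + edge_score(kv[0], c),
            )
            nxt[c] = (score + edge_score(prev, c) + node_score(c), path + [c])
        best = nxt
    return max(best.values(), key=lambda v: v[0])


# Toy peptide Met-Lys-Phe with its synonymous codons.
layers = [["ATG"], ["AAA", "AAG"], ["TTT", "TTC"]]
# Hypothetical per-codon and codon-pair scores (higher is better).
codon_scores = {"ATG": 0, "AAA": -3, "AAG": -1, "TTT": -4, "TTC": -2}
pair_scores = {("AAG", "TTC"): -5, ("AAA", "TTC"): 2}

score, path = best_codon_path(
    layers,
    node_score=lambda c: codon_scores[c],
    edge_score=lambda a, b: pair_scores.get((a, b), 0),
)
print(path, score)
```

Because every path visits exactly one node per layer, the work grows linearly with sequence length even though the number of candidate codon sequences grows exponentially.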
Affiliation(s)
- Oya Karaşan
  - Department of Industrial Engineering, Bilkent University, Ankara 06800, Turkey
- Alper Şen
  - Department of Industrial Engineering, Bilkent University, Ankara 06800, Turkey
- Banu Tiryaki
  - Department of Industrial Engineering, Bilkent University, Ankara 06800, Turkey
- A Ercument Cicek
  - Department of Computer Engineering, Bilkent University, Ankara 06800, Turkey
5
Beyreli I, Karakahya O, Cicek AE. DeepND: Deep multitask learning of gene risk for comorbid neurodevelopmental disorders. Patterns 2022; 3:100524. [PMID: 35845835 PMCID: PMC9278518 DOI: 10.1016/j.patter.2022.100524] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 02/11/2022] [Revised: 03/11/2022] [Accepted: 05/09/2022] [Indexed: 01/24/2023]
Abstract
Autism spectrum disorder and intellectual disability are comorbid neurodevelopmental disorders with complex genetic architectures. Despite large-scale sequencing studies, only a fraction of the risk genes has been identified for both. We present a network-based gene risk prioritization algorithm, DeepND, that performs cross-disorder analysis to improve prediction by exploiting the comorbidity of autism spectrum disorder (ASD) and intellectual disability (ID) via multitask learning. Our model leverages information from human brain gene co-expression networks using graph convolutional networks, learning which spatiotemporal neurodevelopmental windows are important for disorder etiologies and improving the state-of-the-art prediction in single- and cross-disorder settings. DeepND identifies the prefrontal and motor-somatosensory cortex (PFC-MFC) brain region and periods from early- to mid-fetal and from early childhood to young adulthood as the highest neurodevelopmental risk windows for ASD and ID. We investigate ASD- and ID-associated copy-number variation (CNV) regions and report our findings for several susceptibility gene candidates. DeepND can be generalized to analyze any combination of comorbid disorders.
- DeepND can co-analyze comorbid neurodevelopmental disorders to discover risk genes
- The approach employs multitask learning to learn shared and disorder-specific weights
- DeepND uses graph convolution to process gene interactions in multiple networks
- The model includes a mixture-of-experts model to detect informative networks
While risk-gene-discovery algorithms have complemented exome/genome-sequencing studies of neurodevelopmental disorders, they are not capable of co-analyzing multiple comorbid conditions like autism and intellectual disability. A common approach is analyzing disorders one by one and comparing the outcomes. With this approach, the method does not utilize cross-disorder interactions and is bound by limited evidence per disorder. We address this gap with a technique, Deep Neurodevelopmental Disorders (DeepND), that uses multitask learning to co-analyze data from multiple disorders to learn shared and disorder-specific patterns. DeepND includes graph convolutional neural networks that process gene-interaction information from multiple networks. DeepND also learns which networks are important for disorder etiologies. Based on this, we propose an interpretable risk-gene-discovery algorithm for neuropsychiatric disorders.
Affiliation(s)
- Ilayda Beyreli
  - Department of Computer Engineering, Bilkent University, Ankara 06810, Turkey
- Oguzhan Karakahya
  - Department of Computer Engineering, Bilkent University, Ankara 06810, Turkey
- A. Ercument Cicek
  - Department of Computer Engineering, Bilkent University, Ankara 06810, Turkey
  - Computational Biology Department, Carnegie Mellon University, Pittsburgh, PA 15213, USA
  - Corresponding author
6
Abstract
Drug combination therapies have been a viable strategy for the treatment of complex diseases such as cancer due to increased efficacy and reduced side effects. However, experimentally validating all possible combinations for synergistic interaction, even with high-throughput screens, is intractable due to the vast combinatorial search space. Computational techniques can reduce the number of combinations to be evaluated experimentally by prioritizing promising candidates. We present MatchMaker, which predicts drug synergy scores using drug chemical structure information and gene expression profiles of cell lines in a deep learning framework. For the first time, our model utilizes the largest known drug combination dataset to date, DrugComb. We compare the performance of MatchMaker with the state-of-the-art models and observe up to ∼15% correlation and ∼33% mean squared error (MSE) improvements over the next best method. We investigate the cell types and drug pairs that are relatively harder to predict and present novel candidate pairs. MatchMaker is built and available at https://github.com/tastanlab/matchmaker.
7
Cakmakci D, Kaynar G, Bund C, Piotto M, Proust F, Namer IJ, Cicek AE. Targeted metabolomics analyses for brain tumor margin assessment during surgery. Bioinformatics 2022; 38:3238-3244. [PMID: 35512389 DOI: 10.1093/bioinformatics/btac309] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/13/2021] [Revised: 04/13/2022] [Accepted: 05/02/2022] [Indexed: 01/17/2023]
Abstract
MOTIVATION Identification and removal of micro-scale residual tumor tissue during brain tumor surgery are key for survival in glioma patients. For this goal, High-Resolution Magic Angle Spinning Nuclear Magnetic Resonance (HRMAS NMR) spectroscopy-based assessment of tumor margins during surgery has been an effective method. However, the time required for metabolite quantification and the need for human experts such as a pathologist to be present during surgery are major bottlenecks of this technique. While machine learning techniques that analyze the NMR spectrum in an untargeted manner (i.e. using the full raw signal) have been shown to effectively automate this feedback mechanism, the high-dimensional and noisy structure of the NMR signal limits the attained performance. RESULTS In this study, we show that identifying informative regions in the HRMAS NMR spectrum and using them for tumor margin assessment improves the prediction power. We use the spectra normalized with the ERETIC (electronic reference to access in vivo concentrations) method, which uses an external reference signal to calibrate the HRMAS NMR spectrum. We train models to predict quantities of metabolites from annotated regions of this spectrum. Using these predictions for tumor margin assessment provides performance improvements of up to 4.6% in the Area Under the ROC Curve (AUC-ROC) and 2.8% in the Area Under the Precision-Recall Curve (AUC-PR). We validate the importance of various tumor biomarkers and identify a novel region between 7.97 ppm and 8.09 ppm as a new candidate for a glioma biomarker. AVAILABILITY AND IMPLEMENTATION The code is released at https://github.com/ciceklab/targeted_brain_tumor_margin_assessment. The data underlying this article are available in Zenodo, at https://doi.org/10.5281/zenodo.5781769. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Doruk Cakmakci
  - School of Computer Science, McGill University, Montreal, QC H3A 0E9, Canada
- Gun Kaynar
  - School of Computer Science, McGill University, Montreal, QC H3A 0E9, Canada
- Caroline Bund
  - MNMS Platform, University Hospitals of Strasbourg, Strasbourg 67098, France
  - ICube, University of Strasbourg/CNRS UMR 7357, Strasbourg 67000, France
  - Department of Nuclear Medicine and Molecular Imaging, ICANS, Strasbourg 67000, France
- Francois Proust
  - Department of Neurosurgery, University Hospitals of Strasbourg, Strasbourg 67091, France
- Izzie Jacques Namer
  - MNMS Platform, University Hospitals of Strasbourg, Strasbourg 67098, France
  - ICube, University of Strasbourg/CNRS UMR 7357, Strasbourg 67000, France
  - Department of Nuclear Medicine and Molecular Imaging, ICANS, Strasbourg 67000, France
- A Ercument Cicek
  - Computer Engineering Department, Bilkent University, Ankara 06800, Turkey
  - Computational Biology Department, Carnegie Mellon University, Pittsburgh, PA 15213, USA
8
Kuru HI, Cicek AE, Tastan O. From cell lines to cancer patients: personalized drug synergy prediction. Bioinformatics 2024; 40:btae134. [PMID: 38718189 DOI: 10.1093/bioinformatics/btae134] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/19/2023] [Revised: 12/18/2023] [Indexed: 05/12/2024]
Abstract
MOTIVATION Combination drug therapies are effective treatments for cancer. However, the genetic heterogeneity of the patients and exponentially large space of drug pairings pose significant challenges for finding the right combination for a specific patient. Current in silico prediction methods can be instrumental in reducing the vast number of candidate drug combinations. However, existing powerful methods are trained with cancer cell line gene expression data, which limits their applicability in clinical settings. While synergy measurements on cell line models are available at large scale, patient-derived samples are too few to train a complex model. On the other hand, patient-specific single-drug response data are relatively more available. RESULTS In this work, we propose a deep learning framework, Personalized Deep Synergy Predictor (PDSP), that enables us to use the patient-specific single drug response data for customizing patient drug synergy predictions. PDSP is first trained to learn synergy scores of drug pairs and their single drug responses for a given cell line using drug structures and large scale cell line gene expression data. Then, the model is fine-tuned for patients with their patient gene expression data and associated single drug response measured on the patient ex vivo samples. In this study, we evaluate PDSP on data from three leukemia patients and observe that it improves the prediction accuracy by 27% compared to models trained on cancer cell line data. AVAILABILITY AND IMPLEMENTATION PDSP is available at https://github.com/hikuru/PDSP.
Affiliation(s)
- Halil Ibrahim Kuru
  - Department of Computer Engineering, Bilkent University, Ankara 06800, Turkey
- A Ercument Cicek
  - Department of Computer Engineering, Bilkent University, Ankara 06800, Turkey
  - Computational Biology Department, Carnegie Mellon University, Pittsburgh 15213, United States
- Oznur Tastan
  - Faculty of Engineering and Natural Sciences, Sabanci University, Istanbul 34956, Turkey
9
Thibodeau A, Eroglu A, McGinnis CS, Lawlor N, Nehar-Belaid D, Kursawe R, Marches R, Conrad DN, Kuchel GA, Gartner ZJ, Banchereau J, Stitzel ML, Cicek AE, Ucar D. AMULET: a novel read count-based method for effective multiplet detection from single nucleus ATAC-seq data. Genome Biol 2021; 22:252. [PMID: 34465366 PMCID: PMC8408950 DOI: 10.1186/s13059-021-02469-x] [Citation(s) in RCA: 25] [Impact Index Per Article: 8.3] [Received: 03/09/2021] [Accepted: 08/17/2021] [Indexed: 12/13/2022] Open
Abstract
Detecting multiplets in single nucleus (sn)ATAC-seq data is challenging due to data sparsity and limited dynamic range. AMULET (ATAC-seq MULtiplet Estimation Tool) enumerates regions with greater than two uniquely aligned reads across the genome to effectively detect multiplets. We evaluate the method by generating snATAC-seq data in the human blood and pancreatic islet samples. AMULET has high precision, estimated via donor-based multiplexing, and high recall, estimated via simulated multiplets, compared to alternatives and identifies multiplets most effectively when a certain read depth of 25K median valid reads per nucleus is achieved.
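The counting step described in the abstract above, finding regions covered by more than two reads from one nucleus, can be sketched with a sweep over read start and end positions. This is an illustrative sketch only: the function name, the `(chrom, start, end)` half-open input format, and the toy reads are assumptions, and it does not show how the tool turns such counts into a multiplet call.

```python
from collections import defaultdict


def regions_with_excess_overlap(reads, max_expected=2):
    """Return (chrom, start, end) regions where more than `max_expected`
    reads overlap. `reads` holds half-open (chrom, start, end) intervals
    that are assumed to come from a single nucleus barcode.
    """
    events = defaultdict(list)
    for chrom, start, end in reads:
        events[chrom].append((start, 1))   # read opens
        events[chrom].append((end, -1))    # read closes

    regions = []
    for chrom in sorted(events):
        depth = 0
        region_start = None
        # Tuple sorting puts -1 before +1 at the same position, so reads
        # that merely touch end-to-start are not counted as overlapping.
        for pos, delta in sorted(events[chrom]):
            depth += delta
            if depth > max_expected and region_start is None:
                region_start = pos
            elif depth <= max_expected and region_start is not None:
                regions.append((chrom, region_start, pos))
                region_start = None
    return regions


# Toy reads for one barcode: three reads pile up between 80 and 100.
toy_reads = [
    ("chr1", 0, 100),
    ("chr1", 50, 150),
    ("chr1", 80, 120),
    ("chr1", 200, 300),
]
excess = regions_with_excess_overlap(toy_reads)
print(excess)
```

The threshold of two reflects the wording of the abstract; a barcode with many such regions would be a candidate multiplet under whatever decision rule is applied downstream.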
Affiliation(s)
- Asa Thibodeau
  - The Jackson Laboratory for Genomic Medicine, Farmington, CT, 06032, USA
- Alper Eroglu
  - The Jackson Laboratory for Genomic Medicine, Farmington, CT, 06032, USA
- Christopher S McGinnis
  - Department of Pharmaceutical Chemistry, University of California, San Francisco, San Francisco, CA, 94158, USA
- Nathan Lawlor
  - The Jackson Laboratory for Genomic Medicine, Farmington, CT, 06032, USA
- Romy Kursawe
  - The Jackson Laboratory for Genomic Medicine, Farmington, CT, 06032, USA
- Radu Marches
  - The Jackson Laboratory for Genomic Medicine, Farmington, CT, 06032, USA
- Daniel N Conrad
  - Department of Pharmaceutical Chemistry, University of California, San Francisco, San Francisco, CA, 94158, USA
- George A Kuchel
  - University of Connecticut Center on Aging, UConn Health Center, Farmington, CT, 06030, USA
- Zev J Gartner
  - Department of Pharmaceutical Chemistry, University of California, San Francisco, San Francisco, CA, 94158, USA
  - Chan-Zuckerberg Biohub, San Francisco, CA, 94158, USA
  - NSF Center for Cellular Construction, San Francisco, CA, 94158, USA
- Michael L Stitzel
  - The Jackson Laboratory for Genomic Medicine, Farmington, CT, 06032, USA
  - Department of Genetics and Genome Sciences, University of Connecticut Health Center, Farmington, CT, 06030, USA
  - Institute for Systems Genomics, University of Connecticut Health Center, Farmington, CT, 06030, USA
- A Ercument Cicek
  - Computer Engineering Department, Bilkent University, 06800, Ankara, Turkey
  - Computational Biology Department, Carnegie Mellon University, Pittsburgh, PA, 15213, USA
- Duygu Ucar
  - The Jackson Laboratory for Genomic Medicine, Farmington, CT, 06032, USA
  - Department of Genetics and Genome Sciences, University of Connecticut Health Center, Farmington, CT, 06030, USA
  - Institute for Systems Genomics, University of Connecticut Health Center, Farmington, CT, 06030, USA
10
Ozden F, Siper MC, Acarsoy N, Elmas T, Marty B, Qi X, Cicek AE. DORMAN: Database of Reconstructed MetAbolic Networks. IEEE/ACM Trans Comput Biol Bioinform 2021; 18:1474-1480. [PMID: 31581093 DOI: 10.1109/tcbb.2019.2944905] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/10/2023]
Abstract
Genome-scale reconstructed metabolic networks have provided an organism-specific understanding of cellular processes and their relations to phenotype. As they are deemed essential to study metabolism, the number of organisms with reconstructed metabolic networks continues to increase. This lasting research interest has led to the development of online systems/repositories that store existing reconstructions and enable new model generation, integration, and constraint-based analyses. While features that support model reconstruction are widely available, current systems lack the means to help users who are interested in analyzing the topology of the reconstructed networks. Here, we present the Database of Reconstructed Metabolic Networks - DORMAN. DORMAN is a centralized online database that stores SBML-based reconstructed metabolic networks published in the literature, and provides web-based computational tools for visualizing and analyzing the model topology. Novel features of DORMAN are (i) an interactive visualization interface that allows rendering of the complete network as well as editing and exporting the model, (ii) hierarchical navigation that provides efficient access to connected entities in the model, (iii) a built-in query interface that allows posing topological queries, and finally, (iv) a model comparison tool that enables comparing models with different nomenclatures, using approximate string matching. DORMAN is online and freely accessible at http://ciceklab.cs.bilkent.edu.tr/dorman.
11
Yilmaz S, Tastan O, Cicek AE. SPADIS: An Algorithm for Selecting Predictive and Diverse SNPs in GWAS. IEEE/ACM Trans Comput Biol Bioinform 2021; 18:1208-1216. [PMID: 31443041 DOI: 10.1109/tcbb.2019.2935437] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Indexed: 06/10/2023]
Abstract
Phenotypic heritability of complex traits and diseases is seldom explained by individual genetic variants identified in genome-wide association studies (GWAS). Many methods have been developed to select a subset of variant loci, which are associated with or predictive of the phenotype. Selecting connected SNPs on SNP-SNP networks has been proven successful in finding biologically interpretable and predictive SNPs. However, we argue that the connectedness constraint favors selecting redundant features that affect similar biological processes and therefore does not necessarily yield better predictive performance. In this paper, we propose a novel method called SPADIS that favors the selection of remotely located SNPs in order to account for their complementary effects in explaining a phenotype. SPADIS selects a diverse set of loci on a SNP-SNP network. This is achieved by maximizing a submodular set function with a greedy algorithm that ensures a constant factor approximation to the optimal solution. We compare SPADIS to the state-of-the-art method SConES, on a dataset of Arabidopsis thaliana with continuous flowering time phenotypes. SPADIS has better average phenotype prediction performance in 15 out of 17 phenotypes when the same number of SNPs are selected and provides consistent improvements across multiple networks and settings on average. Moreover, it identifies more candidate genes and runs faster.
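The greedy strategy mentioned in the abstract above can be sketched generically. The objective below is a toy coverage function, not the SPADIS set function, and the SNP names and covered items are hypothetical. For a monotone submodular objective under a size limit, this kind of greedy selection is known to reach at least a (1 - 1/e) fraction of the optimum, which is one form a constant factor guarantee can take.

```python
def greedy_maximize(candidates, objective, k):
    """Pick up to k items, each time adding the one with the largest
    marginal gain in `objective` (a function of a list of items)."""
    selected = []
    remaining = list(candidates)
    for _ in range(min(k, len(remaining))):
        base = objective(selected)
        best = max(remaining, key=lambda c: objective(selected + [c]) - base)
        selected.append(best)
        remaining.remove(best)
    return selected


# Hypothetical data: each SNP "covers" a set of abstract items.
covers = {
    "snp_a": {1, 2, 3},
    "snp_b": {3, 4},
    "snp_c": {1, 2},
    "snp_d": {5, 6, 7, 8},
}


def coverage(selection):
    """Toy submodular objective: number of distinct items covered."""
    covered = set()
    for snp in selection:
        covered |= covers[snp]
    return len(covered)


chosen = greedy_maximize(covers, coverage, k=2)
print(chosen, coverage(chosen))
```

Coverage rewards items that add something new, so redundant picks gain little; that diminishing-returns behavior is the property the greedy guarantee relies on.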
12
Werling DM, Pochareddy S, Choi J, An JY, Sheppard B, Peng M, Li Z, Dastmalchi C, Santpere G, Sousa AMM, Tebbenkamp ATN, Kaur N, Gulden FO, Breen MS, Liang L, Gilson MC, Zhao X, Dong S, Klei L, Cicek AE, Buxbaum JD, Adle-Biassette H, Thomas JL, Aldinger KA, O'Day DR, Glass IA, Zaitlen NA, Talkowski ME, Roeder K, State MW, Devlin B, Sanders SJ, Sestan N. Whole-Genome and RNA Sequencing Reveal Variation and Transcriptomic Coordination in the Developing Human Prefrontal Cortex. Cell Rep 2020; 31:107489. [PMID: 32268104 PMCID: PMC7295160 DOI: 10.1016/j.celrep.2020.03.053] [Citation(s) in RCA: 65] [Impact Index Per Article: 21.7] [Received: 03/21/2019] [Revised: 11/06/2019] [Accepted: 03/16/2020] [Indexed: 02/08/2023] Open
Abstract
Gene expression levels vary across developmental stage, cell type, and region in the brain. Genomic variants also contribute to the variation in expression, and some neuropsychiatric disorder loci may exert their effects through this mechanism. To investigate these relationships, we present BrainVar, a unique resource of paired whole-genome and bulk tissue RNA sequencing from the dorsolateral prefrontal cortex of 176 individuals across prenatal and postnatal development. Here we identify common variants that alter gene expression (expression quantitative trait loci [eQTLs]) constantly across development or predominantly during prenatal or postnatal stages. Both "constant" and "temporal-predominant" eQTLs are enriched for loci associated with neuropsychiatric traits and disorders and colocalize with specific variants. Expression levels of more than 12,000 genes rise or fall in a concerted late-fetal transition, with the transitional genes enriched for cell-type-specific genes and neuropsychiatric risk loci, underscoring the importance of cataloging developmental trajectories in understanding cortical physiology and pathology.
Affiliation(s)
- Donna M Werling
  - Department of Psychiatry, UCSF Weill Institute for Neurosciences, University of California, San Francisco, San Francisco, CA 94158, USA
  - Laboratory of Genetics, University of Wisconsin-Madison, Madison, WI 53706, USA
- Sirisha Pochareddy
  - Department of Neuroscience and Kavli Institute for Neuroscience, Yale School of Medicine, New Haven, CT 06510, USA
- Jinmyung Choi
  - Department of Neuroscience and Kavli Institute for Neuroscience, Yale School of Medicine, New Haven, CT 06510, USA
- Joon-Yong An
  - Department of Psychiatry, UCSF Weill Institute for Neurosciences, University of California, San Francisco, San Francisco, CA 94158, USA
  - Department of Integrated Biomedical and Life Science, Korea University, Seoul 02841, Republic of Korea
  - School of Biosystem and Biomedical Science, College of Health Science, Korea University, Seoul 02841, Republic of Korea
- Brooke Sheppard
  - Department of Psychiatry, UCSF Weill Institute for Neurosciences, University of California, San Francisco, San Francisco, CA 94158, USA
- Minshi Peng
  - Department of Statistics and Data Science, Carnegie Mellon University, Pittsburgh, PA 15213, USA
- Zhen Li
  - Department of Neuroscience and Kavli Institute for Neuroscience, Yale School of Medicine, New Haven, CT 06510, USA
  - Department of Neurosciences, University of California, San Diego, San Diego, CA 92093, USA
- Claudia Dastmalchi
  - Department of Psychiatry, UCSF Weill Institute for Neurosciences, University of California, San Francisco, San Francisco, CA 94158, USA
- Gabriel Santpere
  - Department of Neuroscience and Kavli Institute for Neuroscience, Yale School of Medicine, New Haven, CT 06510, USA
  - Neurogenomics Group, Research Programme on Biomedical Informatics, Hospital del Mar Medical Research Institute, Department of Experimental and Health Sciences, Universitat Pompeu Fabra, 08003 Barcelona, Catalonia, Spain
- André M M Sousa
  - Department of Neuroscience and Kavli Institute for Neuroscience, Yale School of Medicine, New Haven, CT 06510, USA
- Andrew T N Tebbenkamp
  - Department of Neuroscience and Kavli Institute for Neuroscience, Yale School of Medicine, New Haven, CT 06510, USA
- Navjot Kaur
  - Department of Neuroscience and Kavli Institute for Neuroscience, Yale School of Medicine, New Haven, CT 06510, USA
- Forrest O Gulden
  - Department of Neuroscience and Kavli Institute for Neuroscience, Yale School of Medicine, New Haven, CT 06510, USA
- Michael S Breen
  - Seaver Autism Center for Research and Treatment, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
  - Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
  - Mindich Child Health and Development Institute, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
  - Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
- Lindsay Liang
  - Department of Psychiatry, UCSF Weill Institute for Neurosciences, University of California, San Francisco, San Francisco, CA 94158, USA
- Michael C Gilson
  - Department of Psychiatry, UCSF Weill Institute for Neurosciences, University of California, San Francisco, San Francisco, CA 94158, USA
| | - Xuefang Zhao
- Center for Genomic Medicine and Department of Neurology, Massachusetts General Hospital, Boston, MA 02114, USA; Department of Neurology, Harvard Medical School, Boston, MA 02115, USA; Program in Medical and Population Genetics and Stanley Center for Psychiatric Research, Broad Institute, Cambridge, MA 02142, USA
| | - Shan Dong
- Department of Psychiatry, UCSF Weill Institute for Neurosciences, University of California, San Francisco, San Francisco, CA 94158, USA
| | - Lambertus Klei
- Department of Psychiatry, University of Pittsburgh School of Medicine, Pittsburgh, PA 15213, USA
| | - A Ercument Cicek
- Department of Computer Engineering, Bilkent University, Ankara 06800, Turkey; Computational Biology Department, Carnegie Mellon University, Pittsburgh, PA 15213, USA
| | - Joseph D Buxbaum
- Seaver Autism Center for Research and Treatment, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA; Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA; Mindich Child Health and Development Institute, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA; Friedman Brain Institute, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA
| | - Homa Adle-Biassette
- Department of Pathology, Lariboisière Hospital, APHP, Biobank BB-0033-00064, and Université de Paris, 75006 Paris, France
| | - Jean-Leon Thomas
- Department of Neurology, Yale University School of Medicine, New Haven, CT 06511, USA; UMRS1127, Sorbonne Université, Institut du Cerveau et de la Moelle Épinière, 75013 Paris, France
| | - Kimberly A Aldinger
- Center for Integrative Brain Research, Seattle Children's Research Institute, Seattle, WA 98101, USA; Brotman Baty Institute for Precision Medicine, Seattle, WA 98195, USA
| | - Diana R O'Day
- Department of Pediatrics, University of Washington, Seattle, WA 98105, USA
| | - Ian A Glass
- Department of Pediatrics, University of Washington, Seattle, WA 98105, USA
| | - Noah A Zaitlen
- Department of Medicine, University of California, San Francisco, San Francisco, CA 94158, USA
| | - Michael E Talkowski
- Center for Genomic Medicine and Department of Neurology, Massachusetts General Hospital, Boston, MA 02114, USA; Department of Neurology, Harvard Medical School, Boston, MA 02115, USA; Program in Medical and Population Genetics and Stanley Center for Psychiatric Research, Broad Institute, Cambridge, MA 02142, USA
| | - Kathryn Roeder
- Department of Statistics and Data Science, Carnegie Mellon University, Pittsburgh, PA 15213, USA; Computational Biology Department, Carnegie Mellon University, Pittsburgh, PA 15213, USA
| | - Matthew W State
- Department of Psychiatry, UCSF Weill Institute for Neurosciences, University of California, San Francisco, San Francisco, CA 94158, USA; Institute for Human Genetics, University of California, San Francisco, San Francisco, CA 94158, USA
| | - Bernie Devlin
- Department of Psychiatry, University of Pittsburgh School of Medicine, Pittsburgh, PA 15213, USA
| | - Stephan J Sanders
- Department of Psychiatry, UCSF Weill Institute for Neurosciences, University of California, San Francisco, San Francisco, CA 94158, USA; Institute for Human Genetics, University of California, San Francisco, San Francisco, CA 94158, USA.
| | - Nenad Sestan
- Department of Neuroscience and Kavli Institute for Neuroscience, Yale School of Medicine, New Haven, CT 06510, USA; Department of Psychiatry, Yale University School of Medicine, New Haven, CT 06520, USA; Department of Genetics, Yale University School of Medicine, New Haven, CT 06520, USA; Department of Comparative Medicine, Program in Integrative Cell Signaling and Neurobiology of Metabolism, Yale School of Medicine, New Haven, CT 06510, USA; Program in Cellular Neuroscience, Neurodegeneration, and Repair and Yale Child Study Center, Yale School of Medicine, New Haven, CT 06510, USA.
|
13
|
Abstract
Sharing genome data in a privacy-preserving way is a major bottleneck to the scientific progress promised by the big data era in genomics. A community-driven protocol, the genomic data-sharing beacon protocol, has been widely adopted for sharing genomic data. The system aims to provide a secure, easy-to-implement, and standardized interface for data sharing by allowing only yes/no queries on the presence of specific alleles in the dataset. However, the beacon protocol was recently shown to be vulnerable to membership inference attacks. In this paper, we show that privacy threats against genomic data-sharing beacons are not limited to membership inference. We identify and analyze a novel vulnerability of genomic data-sharing beacons: genome reconstruction. We show that it is possible to successfully reconstruct a substantial part of the genome of a victim when the attacker knows the victim has been added to the beacon in a recent update. In particular, we show how an attacker can use the inherent correlations in the genome and clustering techniques to run such an attack in an efficient and accurate way. We also show that even if multiple individuals are added to the beacon during the same update, it is possible to identify the victim's genome with high confidence using traits that are easily accessible to the attacker (e.g., eye color or hair type). Moreover, we show how a genome reconstructed using a beacon that is not associated with a sensitive phenotype can be used for membership inference attacks against beacons with sensitive phenotypes (e.g., HIV+). The outcome of this work will guide beacon operators on when and how to update the content of the beacon and help them (along with the beacon participants) make informed decisions.
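The yes/no query interface and the update-based leakage described in this abstract can be illustrated with a small sketch. This is only a toy model under stated assumptions: the beacon is represented as a set of (chromosome, position, allele) tuples, and the names `Beacon`, `snapshot`, and `newly_present` are hypothetical. The attack in the paper additionally exploits genomic correlations and clustering, which this sketch does not attempt to reproduce.

```python
from typing import Dict, Iterable, List, Set, Tuple

Allele = Tuple[str, int, str]  # (chromosome, position, alternate allele)


class Beacon:
    """Toy beacon: answers only yes/no on whether any participant carries an allele."""

    def __init__(self, genomes: Iterable[Set[Allele]]) -> None:
        self._alleles: Set[Allele] = set()
        for genome in genomes:
            self._alleles |= genome

    def add(self, genome: Set[Allele]) -> None:
        """Simulate a beacon update that adds one participant."""
        self._alleles |= genome

    def query(self, allele: Allele) -> bool:
        return allele in self._alleles


def snapshot(beacon: Beacon, candidates: List[Allele]) -> Dict[Allele, bool]:
    """Record the beacon's yes/no answers for a list of candidate alleles."""
    return {a: beacon.query(a) for a in candidates}


def newly_present(before: Dict[Allele, bool], after: Dict[Allele, bool]) -> Set[Allele]:
    """Alleles answered 'no' before the update and 'yes' after it."""
    return {a for a, yes in after.items() if yes and not before.get(a, False)}


# Made-up data: one existing participant, then the victim is added.
candidates = [("1", 100, "A"), ("1", 200, "T"), ("2", 300, "G")]
beacon = Beacon([{("1", 100, "A")}])
before = snapshot(beacon, candidates)
beacon.add({("1", 100, "A"), ("2", 300, "G")})  # hypothetical victim genome
after = snapshot(beacon, candidates)
leaked = newly_present(before, after)
```

In this toy run only the allele at ("2", 300, "G") flips from "no" to "yes". The allele the victim shares with an existing participant is not exposed by the difference alone, which is consistent with the abstract's claim of reconstructing a substantial part, rather than all, of the genome.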
|
14
|
Ayoz K, Aysen M, Ayday E, Cicek AE. The effect of kinship in re-identification attacks against genomic data sharing beacons. Bioinformatics 2020; 36:i903-i910. [PMID: 33381836 PMCID: PMC7773481 DOI: 10.1093/bioinformatics/btaa821] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Accepted: 09/08/2020] [Indexed: 12/31/2022] Open
Abstract
MOTIVATION The big data era in genomics promises a breakthrough in medicine, but the need to share data in a private manner limits the pace of the field. The widely accepted 'genomic data sharing beacon' protocol provides a standardized and secure interface for querying genomic datasets. The data are only shared if the desired information (e.g. a certain variant) exists in the dataset. Various studies showed that beacons are vulnerable to re-identification (or membership inference) attacks. As beacons are generally associated with sensitive phenotype information, re-identification creates a significant risk for the participants. Unfortunately, proposed countermeasures against such attacks have failed to be effective, as they do not consider the utility of the beacon protocol. RESULTS In this study, for the first time, we analyze the mitigation effect of kinship relationships among beacon participants against re-identification attacks. We argue that having multiple family members in a beacon can garble the information for attacks, since a substantial number of variants are shared among kin-related people. Using family genomes from HapMap and synthetically generated datasets, we show that having one of the parents of a victim in the beacon causes (i) a significant decrease in the power of attacks and (ii) a substantial increase in the number of queries needed to confirm an individual's beacon membership. We also show how the protection effect attenuates when more distant relatives, such as grandparents, are included alongside the victim. Furthermore, we quantify the utility loss due to adding relatives and show that it is smaller than that of flipping-based techniques.
Affiliation(s)
- Kerem Ayoz
- Computer Engineering Department, Bilkent University, Ankara 06800, Turkey
| | - Miray Aysen
- Computer Engineering Department, Bilkent University, Ankara 06800, Turkey
| | - Erman Ayday
- Computer Engineering Department, Bilkent University, Ankara 06800, Turkey; Computer and Data Sciences Department, Case Western Reserve University, Cleveland, OH 44106, USA. To whom correspondence should be addressed.
| | - A Ercument Cicek
- Computer Engineering Department, Bilkent University, Ankara 06800, Turkey; Computational Biology Department, Carnegie Mellon University, Pittsburgh, PA 15213, USA. To whom correspondence should be addressed.
|
15
|
Firtina C, Kim JS, Alser M, Senol Cali D, Cicek AE, Alkan C, Mutlu O. Apollo: a sequencing-technology-independent, scalable and accurate assembly polishing algorithm. Bioinformatics 2020; 36:3669-3679. [PMID: 32167530 DOI: 10.1093/bioinformatics/btaa179] [Citation(s) in RCA: 19] [Impact Index Per Article: 4.8] [Received: 02/13/2019] [Revised: 12/16/2019] [Accepted: 03/11/2020] [Indexed: 11/13/2022] Open
Abstract
MOTIVATION Third-generation sequencing technologies can sequence long reads that contain as many as 2 million base pairs. These long reads are used to construct an assembly (i.e. the subject's genome), which is further used in downstream genome analysis. Unfortunately, third-generation sequencing technologies have high sequencing error rates and a large proportion of base pairs in these long reads is incorrectly identified. These errors propagate to the assembly and affect the accuracy of genome analysis. Assembly polishing algorithms minimize such error propagation by polishing or fixing errors in the assembly using information from alignments between reads and the assembly (i.e. read-to-assembly alignment information). However, current assembly polishing algorithms can only polish an assembly using reads from a certain sequencing technology, or can only polish a small assembly. Such technology dependency and assembly-size dependency require researchers to (i) run multiple polishing algorithms and (ii) use small chunks of a large genome in order to use all available readsets and polish large genomes, respectively. RESULTS We introduce Apollo, a universal assembly polishing algorithm that scales well to polish an assembly of any size (i.e. both large and small genomes) using reads from all sequencing technologies (i.e. second- and third-generation). Our goal is to provide a single algorithm that uses read sets from all available sequencing technologies to improve the accuracy of assembly polishing and that can polish large genomes. Apollo (i) models an assembly as a profile hidden Markov model (pHMM), (ii) uses read-to-assembly alignment to train the pHMM with the Forward-Backward algorithm and (iii) decodes the trained model with the Viterbi algorithm to produce a polished assembly. Our experiments with real readsets demonstrate that Apollo is the only algorithm that (i) uses reads from any sequencing technology within a single run and (ii) scales well to polish large assemblies without splitting the assembly into multiple parts. AVAILABILITY AND IMPLEMENTATION Source code is available at https://github.com/CMU-SAFARI/Apollo. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
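Step (iii) of the abstract, decoding a trained model with the Viterbi algorithm, can be illustrated on a small generic hidden Markov model. The sketch below is not Apollo's profile HMM: the two states, the three observation symbols, and all probabilities are made-up values chosen only to show how Viterbi recovers the most likely state path.

```python
import math
from typing import Dict, List, Sequence


def _log(p: float) -> float:
    """Logarithm that maps zero probability to negative infinity."""
    return math.log(p) if p > 0 else float("-inf")


def viterbi(
    obs: Sequence[str],
    states: Sequence[str],
    start_p: Dict[str, float],
    trans_p: Dict[str, Dict[str, float]],
    emit_p: Dict[str, Dict[str, float]],
) -> List[str]:
    """Return the most likely hidden-state path for `obs` (log-space Viterbi)."""
    # best[t][s]: log-probability of the best path ending in state s at time t
    best = [{s: _log(start_p[s]) + _log(emit_p[s][obs[0]]) for s in states}]
    back: List[Dict[str, str]] = [{}]
    for t in range(1, len(obs)):
        best.append({})
        back.append({})
        for s in states:
            prev = max(states, key=lambda p: best[t - 1][p] + _log(trans_p[p][s]))
            best[t][s] = best[t - 1][prev] + _log(trans_p[prev][s]) + _log(emit_p[s][obs[t]])
            back[t][s] = prev
    # Trace back from the best final state.
    path = [max(states, key=lambda s: best[-1][s])]
    for t in range(len(obs) - 1, 0, -1):
        path.append(back[t][path[-1]])
    path.reverse()
    return path


# Hypothetical toy model: is an assembly position "correct" or "erroneous",
# given how well the aligned reads agree with it?
states = ["correct", "erroneous"]
start_p = {"correct": 0.6, "erroneous": 0.4}
trans_p = {
    "correct": {"correct": 0.7, "erroneous": 0.3},
    "erroneous": {"correct": 0.4, "erroneous": 0.6},
}
emit_p = {
    "correct": {"agree": 0.5, "partial": 0.4, "disagree": 0.1},
    "erroneous": {"agree": 0.1, "partial": 0.3, "disagree": 0.6},
}
path = viterbi(["agree", "partial", "disagree"], states, start_p, trans_p, emit_p)
```

Working in log-space avoids numerical underflow on long sequences, which matters for anything genome-sized; the toy input here is short enough that plain probabilities would also work.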
Affiliation(s)
- Can Firtina
- Department of Computer Science, ETH Zurich, Zurich 8092, Switzerland
| | - Jeremie S Kim
- Department of Computer Science, ETH Zurich, Zurich 8092, Switzerland; Department of Electrical and Computer Engineering, Carnegie Mellon University, Pittsburgh, PA 15213, USA
| | - Mohammed Alser
- Department of Computer Science, ETH Zurich, Zurich 8092, Switzerland
| | - Damla Senol Cali
- Department of Electrical and Computer Engineering, Carnegie Mellon University, Pittsburgh, PA 15213, USA
| | - A Ercument Cicek
- Department of Computer Engineering, Bilkent University, Ankara 06800, Turkey
| | - Can Alkan
- Department of Computer Engineering, Bilkent University, Ankara 06800, Turkey
| | - Onur Mutlu
- Department of Computer Science, ETH Zurich, Zurich 8092, Switzerland; Department of Electrical and Computer Engineering, Carnegie Mellon University, Pittsburgh, PA 15213, USA; Department of Computer Engineering, Bilkent University, Ankara 06800, Turkey
|
16
|
Caylak G, Tastan O, Cicek AE. A Tool for Detecting Complementary Single Nucleotide Polymorphism Pairs in Genome-Wide Association Studies for Epistasis Testing. J Comput Biol 2020; 28:378-380. [PMID: 33325775 DOI: 10.1089/cmb.2020.0430] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/13/2022] Open
Abstract
Detecting interacting loci pairs has been instrumental in understanding disease etiology when single-locus associations do not fully account for the underlying heritability. However, the number of loci to test is prohibitively large. Epistasis test prioritization algorithms rank likely epistatic single nucleotide polymorphism (SNP) pairs to limit the number of statistical tests. Potpourri detects epistatic SNP pairs by diversifying the selected SNPs' genomic regions and investigating their co-occurrence patterns over the case cohort. It can also take as input, and further prioritize, SNPs in regulatory or coding regions. The program identifies and returns a list of prioritized SNP pairs for epistasis testing. This article describes how to use the program and the details of the input and output data.
Affiliation(s)
- Gizem Caylak
- Computer Engineering Department, Bilkent University, Ankara, Turkey
| | - Oznur Tastan
- Faculty of Engineering and Natural Sciences, Sabanci University, Istanbul, Turkey
| | - A Ercument Cicek
- Computer Engineering Department, Bilkent University, Ankara, Turkey; Computational Biology Department, Carnegie Mellon University, Pittsburgh, Pennsylvania, USA
|
17
|
Abstract
Genome-wide association studies (GWAS) explain a fraction of the underlying heritability of genetic diseases. Investigating epistatic interactions between two or more loci helps to close this gap. Unfortunately, the sheer number of loci combinations to process and hypotheses to test makes the analysis prohibitive both computationally and statistically. Epistasis test prioritization algorithms rank likely epistatic single nucleotide polymorphism (SNP) pairs to limit the number of tests. However, they still suffer from very low precision. It was shown in the literature that selecting SNPs that are individually correlated with the phenotype and also diverse with respect to genomic location leads to better phenotype prediction due to genetic complementation. Here, we propose that an algorithm that pairs SNPs from such diverse regions and ranks them can improve prediction power. We propose an epistasis test prioritization algorithm that optimizes a submodular set function to select a diverse and complementary set of genomic regions that span the underlying genome. The SNP pairs from these regions are then further ranked w.r.t. their co-coverage of the case cohort. We compare our algorithm with the state of the art on three GWAS and show that we (1) substantially improve precision (from 0.003 to 0.652) while maintaining the significance of selected pairs, (2) decrease the number of tests by 25-fold, and (3) decrease the runtime by 4-fold. We also show that promoting SNPs from regulatory/coding regions improves the performance (up to 0.8). Potpourri is available at http://ciceklab.cs.bilkent.edu.tr/potpourri.
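The idea of optimizing a submodular set function to pick diverse, complementary regions can be sketched with the standard greedy algorithm. The abstract does not state the exact function that Potpourri optimizes, so the sketch below substitutes plain set coverage (how many case samples the chosen regions cover), which is a textbook monotone submodular function. The function name `greedy_max_coverage` and the region and sample identifiers are hypothetical.

```python
from typing import Dict, List, Set


def greedy_max_coverage(regions: Dict[str, Set[str]], k: int) -> List[str]:
    """Greedily pick up to k regions, each time taking the largest marginal gain.

    `regions` maps a region name to the set of case samples it covers.
    Coverage is a stand-in objective, not the function used in the paper.
    """
    chosen: List[str] = []
    covered: Set[str] = set()
    for _ in range(k):
        best_name, best_gain = None, 0
        for name in sorted(regions):  # sorted for deterministic tie-breaking
            if name in chosen:
                continue
            gain = len(regions[name] - covered)  # marginal gain of adding this region
            if gain > best_gain:
                best_name, best_gain = name, gain
        if best_name is None:  # no remaining region adds new coverage
            break
        chosen.append(best_name)
        covered |= regions[best_name]
    return chosen


# Made-up example: r2 overlaps r1 heavily, so the complementary r3 is preferred.
regions = {
    "r1": {"case1", "case2", "case3"},
    "r2": {"case2", "case3"},
    "r3": {"case4", "case5"},
    "r4": {"case1"},
}
selected = greedy_max_coverage(regions, 2)
```

For monotone submodular objectives under a cardinality constraint, this greedy rule is known to achieve at least a (1 - 1/e) fraction of the optimum, which is why it is a common default for diversity-promoting selection.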
Affiliation(s)
- Gizem Caylak
- Computer Engineering Department, Bilkent University, Ankara, Turkey
| | - Oznur Tastan
- Faculty of Engineering and Natural Sciences, Sabanci University, Istanbul, Turkey
| | - A Ercument Cicek
- Computer Engineering Department, Bilkent University, Ankara, Turkey
- Computational Biology Department, Carnegie Mellon University, Pittsburgh, Pennsylvania, USA
|
18
|
Cakmakci D, Karakaslar EO, Ruhland E, Chenard MP, Proust F, Piotto M, Namer IJ, Cicek AE. Machine learning assisted intraoperative assessment of brain tumor margins using HRMAS NMR spectroscopy. PLoS Comput Biol 2020; 16:e1008184. [PMID: 33175838 PMCID: PMC7682900 DOI: 10.1371/journal.pcbi.1008184] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Received: 04/27/2020] [Revised: 11/23/2020] [Accepted: 07/22/2020] [Indexed: 11/19/2022] Open
Abstract
Complete resection of the tumor is important for survival in glioma patients. Even if gross total resection is achieved, left-over micro-scale tissue in the excision cavity risks recurrence. The High Resolution Magic Angle Spinning Nuclear Magnetic Resonance (HRMAS NMR) technique can distinguish healthy and malignant tissue efficiently using peak intensities of biomarker metabolites. The method is fast, sensitive and can work with small and unprocessed samples, which makes it a good fit for real-time analysis during surgery. However, only a targeted analysis for the existence of known tumor biomarkers can be made, and this requires a technician with a chemistry background and a pathologist with knowledge of tumor metabolism to be present during surgery. Here, we show that we can accurately perform this analysis in real time and can analyze the full spectrum in an untargeted fashion using machine learning. We work on a new and large HRMAS NMR dataset of glioma and control samples (n = 565), which are also labeled with a quantitative pathology analysis. Our results show that a random forest-based approach can distinguish samples with tumor cells and controls accurately and effectively, with a median AUC of 85.6% and AUPR of 93.4%. We also show that we can further distinguish benign and malignant samples with a median AUC of 87.1% and AUPR of 96.1%. We analyze the feature (peak) importance for classification to interpret the results of the classifier. We validate that known malignancy biomarkers such as creatine and 2-hydroxyglutarate play an important role in distinguishing tumor and normal cells and suggest new biomarker regions. The code is released at http://github.com/ciceklab/HRMAS_NC.
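The AUC figures quoted in this abstract summarize how well classifier scores rank tumor samples above controls. As a rough illustration of the metric only (not of the random forest model or the HRMAS NMR data), the sketch below computes ROC AUC from labels and scores using its pairwise-ranking definition. The function name `roc_auc` and the five labelled scores are made up.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """ROC AUC as the probability that a positive outscores a negative.

    Ties between a positive and a negative count as half a win.
    Quadratic in the number of samples, which is fine for a small illustration.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Made-up scores: 1 = sample with tumor cells, 0 = control.
labels = [1, 1, 1, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.1]
auc = roc_auc(labels, scores)  # 5 of the 6 positive/negative pairs are ranked correctly
```

A median AUC, as reported in the abstract, would be obtained by repeating such a computation over several train/test splits and taking the median of the per-split values; how the splits were drawn is not stated in the abstract.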
Affiliation(s)
- Doruk Cakmakci
- Computer Engineering Department, Bilkent University, Ankara, Turkey
| | | | - Elisa Ruhland
- MNMS Platform, University Hospitals of Strasbourg, Strasbourg, France
| | | | - Francois Proust
- Department of Neurosurgery, University Hospitals of Strasbourg, Strasbourg, France
| | | | - Izzie Jacques Namer
- MNMS Platform, University Hospitals of Strasbourg, Strasbourg, France
- ICube, University of Strasbourg / CNRS UMR 7357, Strasbourg, France
- Department of Nuclear Medicine and Molecular Imaging, ICANS, Strasbourg, France
| | - A. Ercument Cicek
- Computer Engineering Department, Bilkent University, Ankara, Turkey
- Computational Biology Department, Carnegie Mellon University, Pittsburgh, Pennsylvania
|
19
|
Norman U, Cicek AE. ST-Steiner: a spatio-temporal gene discovery algorithm. Bioinformatics 2020; 35:3433-3440. [PMID: 30759247 DOI: 10.1093/bioinformatics/btz110] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 06/07/2018] [Revised: 01/16/2019] [Accepted: 02/12/2019] [Indexed: 12/12/2022] Open
Abstract
MOTIVATION Whole exome sequencing (WES) studies for autism spectrum disorder (ASD) could identify only around six dozen risk genes to date because the genetic architecture of the disorder is highly complex. To speed up the gene discovery process, a few network-based ASD gene discovery algorithms were proposed. Although these methods use static gene interaction networks, functional clustering of genes is bound to evolve during neurodevelopment, and disruptions are likely to have a cascading effect on future associations. Thus, approaches that disregard the dynamic nature of neurodevelopment are limited. RESULTS Here, we present a spatio-temporal gene discovery algorithm, which leverages information from evolving gene co-expression networks of neurodevelopment. The algorithm solves a prize-collecting Steiner forest-based problem on co-expression networks, adapted to model neurodevelopment and transfer information from precursor neurodevelopmental windows. The decisions made by the algorithm can be traced back, adding interpretability to the results. We apply the algorithm on ASD WES data of 3871 samples and identify risk clusters using BrainSpan co-expression networks of early- and mid-fetal periods. On an independent dataset, we show that incorporation of the temporal dimension increases the predictive power: predicted clusters are hit more and show higher enrichment in ASD-related functions compared with the state of the art. AVAILABILITY AND IMPLEMENTATION The code is available at http://ciceklab.cs.bilkent.edu.tr/st-steiner. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
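The trade-off behind a prize-collecting Steiner problem can be made concrete by evaluating the objective of the generic, textbook formulation: pay the cost of every selected edge plus the prize of every node left out. The sketch below only scores candidate solutions on a made-up three-gene network. It does not search for the optimum, and it does not include the temporal adaptations that ST-Steiner adds, which the abstract does not specify. The function name `pcst_cost` and all prizes and costs are hypothetical.

```python
from typing import Dict, FrozenSet, Iterable, Set


def pcst_cost(
    prizes: Dict[str, float],
    edge_costs: Dict[FrozenSet[str], float],
    chosen_edges: Iterable[FrozenSet[str]],
    isolated_nodes: Iterable[str] = (),
) -> float:
    """Generic prize-collecting Steiner objective for one candidate solution.

    cost = sum of costs of chosen edges + sum of prizes of nodes NOT included.
    Nodes are included if they touch a chosen edge or are listed as isolated.
    """
    chosen = list(chosen_edges)
    included: Set[str] = set(isolated_nodes)
    for edge in chosen:
        included |= edge
    edge_total = sum(edge_costs[e] for e in chosen)
    missed_prizes = sum(p for node, p in prizes.items() if node not in included)
    return edge_total + missed_prizes


def edge(u: str, v: str) -> FrozenSet[str]:
    """Undirected edge key."""
    return frozenset((u, v))


# Made-up network: prizes reflect prior evidence for each gene,
# edge costs are low for strongly co-expressed gene pairs.
prizes = {"A": 5.0, "B": 1.0, "C": 4.0}
edge_costs = {edge("A", "B"): 2.0, edge("B", "C"): 2.0, edge("A", "C"): 6.0}

via_b = pcst_cost(prizes, edge_costs, [edge("A", "B"), edge("B", "C")])
direct = pcst_cost(prizes, edge_costs, [edge("A", "C")])
only_a = pcst_cost(prizes, edge_costs, [], isolated_nodes=["A"])
nothing = pcst_cost(prizes, edge_costs, [])
```

In this toy case the cheapest candidate connects the two high-prize genes through the low-prize gene B, which illustrates how such formulations can pull in genes with little direct evidence when they link well-supported ones.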
Affiliation(s)
- Utku Norman
- Computer Engineering Department, Bilkent University, Ankara, Turkey
| | - A Ercument Cicek
- Computer Engineering Department, Bilkent University, Ankara, Turkey; Computational Biology Department, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA, USA
|
20
|
Karakaslar EO, Coskun B, Outilaft H, Namer IJ, Cicek AE. Predicting Carbon Spectrum in Heteronuclear Single Quantum Coherence Spectroscopy for Online Feedback During Surgery. IEEE/ACM Trans Comput Biol Bioinform 2020; 17:719-725. [PMID: 31180895 DOI: 10.1109/tcbb.2019.2920646] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 06/09/2023]
Abstract
1H High-Resolution Magic Angle Spinning (HRMAS) Nuclear Magnetic Resonance (NMR) is a reliable technology used for detecting metabolites in solid tissues. Its fast response time makes it possible to guide surgeons in real time in detecting tumor cells that are left over in the excision cavity. However, severe overlap of spectral resonances in the 1D signal often renders distinguishing metabolites impossible. In that case, Heteronuclear Single Quantum Coherence Spectroscopy (HSQC) NMR is applied, which can distinguish metabolites by generating 2D spectra (1H-13C). Unfortunately, this analysis requires much more time and prohibits real-time analysis. Thus, obtaining the 2D spectrum quickly has major implications in medicine. In this study, we show that using multiple multivariate regression and statistical total correlation spectroscopy, we can learn the relation between the 1H and 13C dimensions. Learning is possible with small sample sizes, and without the need to perform the HSQC analysis, we can predict the 13C dimension by performing only the 1H HRMAS NMR experiment. We show on a rat model of central nervous system tissues (80 samples, 5 tissues) that our methods achieve 0.971 and 0.957 mean R² values, respectively. Our tests on 15 human brain tumor samples show that we can predict 104 groups of 39 metabolites with 97 percent accuracy. Finally, we show that we can predict the presence of a drug-resistant tumor biomarker (creatine) despite an obstructed signal in the 1H dimension. In practice, this information can provide valuable feedback to the surgeon to further resect the cavity to avoid potential recurrence.
|
21
|
Satterstrom FK, Kosmicki JA, Wang J, Breen MS, De Rubeis S, An JY, Peng M, Collins R, Grove J, Klei L, Stevens C, Reichert J, Mulhern MS, Artomov M, Gerges S, Sheppard B, Xu X, Bhaduri A, Norman U, Brand H, Schwartz G, Nguyen R, Guerrero EE, Dias C, Betancur C, Cook EH, Gallagher L, Gill M, Sutcliffe JS, Thurm A, Zwick ME, Børglum AD, State MW, Cicek AE, Talkowski ME, Cutler DJ, Devlin B, Sanders SJ, Roeder K, Daly MJ, Buxbaum JD. Large-Scale Exome Sequencing Study Implicates Both Developmental and Functional Changes in the Neurobiology of Autism. Cell 2020; 180:568-584.e23. [PMID: 31981491 PMCID: PMC7250485 DOI: 10.1016/j.cell.2019.12.036] [Citation(s) in RCA: 1077] [Impact Index Per Article: 269.3] [Received: 03/20/2019] [Revised: 07/08/2019] [Accepted: 12/24/2019] [Indexed: 12/15/2022]
Abstract
We present the largest exome sequencing study of autism spectrum disorder (ASD) to date (n = 35,584 total samples, 11,986 with ASD). Using an enhanced analytical framework to integrate de novo and case-control rare variation, we identify 102 risk genes at a false discovery rate of 0.1 or less. Of these genes, 49 show higher frequencies of disruptive de novo variants in individuals ascertained to have severe neurodevelopmental delay, whereas 53 show higher frequencies in individuals ascertained to have ASD; comparing ASD cases with mutations in these groups reveals phenotypic differences. Expressed early in brain development, most risk genes have roles in regulation of gene expression or neuronal communication (i.e., mutations effect neurodevelopmental and neurophysiological changes), and 13 fall within loci recurrently hit by copy number variants. In cells from the human cortex, expression of risk genes is enriched in excitatory and inhibitory neuronal lineages, consistent with multiple paths to an excitatory-inhibitory imbalance underlying ASD.
Affiliation(s)
- F Kyle Satterstrom
- Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA, USA; Analytic and Translational Genetics Unit, Department of Medicine, Massachusetts General Hospital, Boston, MA, USA; Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Jack A Kosmicki
- Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA, USA; Analytic and Translational Genetics Unit, Department of Medicine, Massachusetts General Hospital, Boston, MA, USA; Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA; Harvard Medical School, Boston, MA, USA; Center for Genomic Medicine, Department of Medicine, Massachusetts General Hospital, Boston, MA, USA
| | - Jiebiao Wang
- Department of Statistics, Carnegie Mellon University, Pittsburgh, PA, USA
| | - Michael S Breen
- Seaver Autism Center for Research and Treatment, Icahn School of Medicine at Mount Sinai, New York, NY, USA; Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, NY, USA; The Mindich Child Health and Development Institute, Icahn School of Medicine at Mount Sinai, New York, NY, USA
| | - Silvia De Rubeis
- Seaver Autism Center for Research and Treatment, Icahn School of Medicine at Mount Sinai, New York, NY, USA; Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, NY, USA; The Mindich Child Health and Development Institute, Icahn School of Medicine at Mount Sinai, New York, NY, USA
| | - Joon-Yong An
- Department of Psychiatry, UCSF Weill Institute for Neurosciences, University of California, San Francisco, San Francisco, CA, USA; School of Biosystem and Biomedical Science, College of Health Science, Korea University, Seoul, Republic of Korea
| | - Minshi Peng
- Department of Statistics, Carnegie Mellon University, Pittsburgh, PA, USA
| | - Ryan Collins
- Center for Genomic Medicine, Department of Medicine, Massachusetts General Hospital, Boston, MA, USA; Program in Bioinformatics and Integrative Genomics, Harvard Medical School, Boston, MA, USA
| | - Jakob Grove
- The Lundbeck Foundation Initiative for Integrative Psychiatric Research, iPSYCH, Aarhus, Denmark; Center for Genomics and Personalized Medicine, Aarhus, Denmark; Department of Biomedicine - Human Genetics, Aarhus University, Aarhus, Denmark
| | - Lambertus Klei
- Department of Psychiatry, University of Pittsburgh School of Medicine, Pittsburgh, PA, USA
| | - Christine Stevens
- Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA, USA; Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA; Harvard Medical School, Boston, MA, USA; Center for Genomic Medicine, Department of Medicine, Massachusetts General Hospital, Boston, MA, USA
| | - Jennifer Reichert
- Seaver Autism Center for Research and Treatment, Icahn School of Medicine at Mount Sinai, New York, NY, USA; Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, NY, USA
| | - Maureen S Mulhern
- Seaver Autism Center for Research and Treatment, Icahn School of Medicine at Mount Sinai, New York, NY, USA; Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, NY, USA
| | - Mykyta Artomov
- Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA, USA; Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA; Harvard Medical School, Boston, MA, USA; Center for Genomic Medicine, Department of Medicine, Massachusetts General Hospital, Boston, MA, USA
| | - Sherif Gerges
- Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA, USA; Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA; Harvard Medical School, Boston, MA, USA; Center for Genomic Medicine, Department of Medicine, Massachusetts General Hospital, Boston, MA, USA
| | - Brooke Sheppard
- Department of Psychiatry, UCSF Weill Institute for Neurosciences, University of California, San Francisco, San Francisco, CA, USA
| | - Xinyi Xu
- Seaver Autism Center for Research and Treatment, Icahn School of Medicine at Mount Sinai, New York, NY, USA; Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, NY, USA
| | - Aparna Bhaduri
- Department of Neurology, University of California, San Francisco, San Francisco, CA, USA; The Eli and Edythe Broad Center of Regeneration Medicine and Stem Cell Research, University of California, San Francisco, San Francisco, CA, USA
| | - Utku Norman
- Computer Engineering Department, Bilkent University, Ankara, Turkey
| | - Harrison Brand
- Center for Genomic Medicine, Department of Medicine, Massachusetts General Hospital, Boston, MA, USA
| | - Grace Schwartz
- Department of Psychiatry, UCSF Weill Institute for Neurosciences, University of California, San Francisco, San Francisco, CA, USA
| | - Rachel Nguyen
- Center for Autism Research and Translation, University of California, Irvine, Irvine, CA, USA
| | - Elizabeth E Guerrero
- MIND (Medical Investigation of Neurodevelopmental Disorders) Institute, University of California, Davis, Davis, CA, USA
| | - Caroline Dias
- Division of Genetics, Boston Children's Hospital, Boston, MA, USA; Division of Developmental Medicine, Boston Children's Hospital, Boston, MA, USA
| | - Catalina Betancur
- Sorbonne Université, INSERM, CNRS, Neuroscience Paris Seine, Institut de Biologie Paris Seine, Paris, France
| | - Edwin H Cook
- Institute for Juvenile Research, Department of Psychiatry, University of Illinois at Chicago, Chicago, IL, USA
| | - Louise Gallagher
- Department of Psychiatry, School of Medicine, Trinity College Dublin, Dublin, Ireland
| | - Michael Gill
- Department of Psychiatry, School of Medicine, Trinity College Dublin, Dublin, Ireland
| | - James S Sutcliffe
- Vanderbilt Genetics Institute, Vanderbilt University School of Medicine, Nashville, TN, USA; Department of Molecular Physiology and Biophysics and Psychiatry, Vanderbilt University School of Medicine, Nashville, TN, USA
| | - Audrey Thurm
- National Institute of Mental Health, NIH, Bethesda, MD, USA
| | - Michael E Zwick
- Department of Human Genetics, Emory University School of Medicine, Atlanta, GA, USA
| | - Anders D Børglum
- The Lundbeck Foundation Initiative for Integrative Psychiatric Research, iPSYCH, Aarhus, Denmark; Center for Genomics and Personalized Medicine, Aarhus, Denmark; Department of Biomedicine - Human Genetics, Aarhus University, Aarhus, Denmark; Bioinformatics Research Centre, Aarhus University, Aarhus, Denmark
| | - Matthew W State
- Department of Psychiatry, UCSF Weill Institute for Neurosciences, University of California, San Francisco, San Francisco, CA, USA
| | - A Ercument Cicek
- Department of Statistics, Carnegie Mellon University, Pittsburgh, PA, USA; Computer Engineering Department, Bilkent University, Ankara, Turkey
| | - Michael E Talkowski
- Center for Genomic Medicine, Department of Medicine, Massachusetts General Hospital, Boston, MA, USA
| | - David J Cutler
- Department of Human Genetics, Emory University School of Medicine, Atlanta, GA, USA
| | - Bernie Devlin
- Department of Psychiatry, University of Pittsburgh School of Medicine, Pittsburgh, PA, USA
| | - Stephan J Sanders
- Department of Psychiatry, UCSF Weill Institute for Neurosciences, University of California, San Francisco, San Francisco, CA, USA.
| | - Kathryn Roeder
- Department of Statistics, Carnegie Mellon University, Pittsburgh, PA, USA; Computational Biology Department, Carnegie Mellon University, Pittsburgh, PA, USA.
| | - Mark J Daly
- Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA, USA; Analytic and Translational Genetics Unit, Department of Medicine, Massachusetts General Hospital, Boston, MA, USA; Program in Medical and Population Genetics, Broad Institute of MIT and Harvard, Cambridge, MA, USA; Harvard Medical School, Boston, MA, USA; Center for Genomic Medicine, Department of Medicine, Massachusetts General Hospital, Boston, MA, USA; Institute for Molecular Medicine Finland (FIMM), University of Helsinki, Helsinki, Finland.
| | - Joseph D Buxbaum
- Seaver Autism Center for Research and Treatment, Icahn School of Medicine at Mount Sinai, New York, NY, USA; Department of Psychiatry, Icahn School of Medicine at Mount Sinai, New York, NY, USA; The Mindich Child Health and Development Institute, Icahn School of Medicine at Mount Sinai, New York, NY, USA; Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, USA; Friedman Brain Institute, Icahn School of Medicine at Mount Sinai, New York, NY, USA; Department of Neuroscience, Icahn School of Medicine at Mount Sinai, New York, NY, USA.
| |
|
22
|
Imperiale A, Poncet G, Addeo P, Ruhland E, Roche C, Battini S, Cicek AE, Chenard MP, Hervieu V, Goichot B, Bachellier P, Walter T, Namer IJ. Metabolomics of Small Intestine Neuroendocrine Tumors and Related Hepatic Metastases. Metabolites 2019; 9:metabo9120300. [PMID: 31835679 PMCID: PMC6950539 DOI: 10.3390/metabo9120300] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Received: 11/18/2019] [Revised: 12/01/2019] [Accepted: 12/10/2019] [Indexed: 12/15/2022] Open
Abstract
This study aimed to assess the metabolomic fingerprint of small intestine neuroendocrine tumors (SI-NETs) and related hepatic metastases, and to investigate the influence of the hepatic environment on the SI-NET metabolome. Ninety-four tissue samples, including 46 SI-NETs, 18 hepatic NET metastases and 30 normal SI and liver samples, were analyzed using 1H high-resolution magic angle spinning (HRMAS) nuclear magnetic resonance (NMR) spectroscopy. Twenty-seven metabolites were identified and quantified. Differences between primary NETs vs. normal SI and primary NETs vs. hepatic metastases were assessed. Network analysis was performed according to several clinical and pathological features. Succinate, glutathione, taurine, myoinositol and glycerophosphocholine characterized NETs. Normal SI specimens showed higher levels of alanine, creatine, ethanolamine and aspartate. PLS-DA revealed a continuum-like distribution among normal SI, G1-SI-NETs and G2-SI-NETs. The G2-SI-NET distribution was closer and clearly separated from normal SI tissue. Lower concentrations of glucose, serine and glycine, and increased levels of choline-containing compounds, taurine, lactate and alanine, were found in SI-NETs with more aggressive tumors. Higher abundance of acetate, succinate, choline, phosphocholine, taurine, lactate and aspartate discriminated liver metastases from normal hepatic parenchyma. Higher levels of alanine, ethanolamine, glycerophosphocholine and glucose were found in hepatic metastases than in primary SI-NETs. The present work gives for the first time a snapshot of the metabolomic characteristics of SI-NETs, suggesting the existence of a complex metabolic reality, possibly characteristic of different tumor evolution.
Affiliation(s)
- Alessio Imperiale
- Biophysics and Nuclear Medicine, University Hospitals of Strasbourg, 67098 Strasbourg, France; (E.R.); (I.J.N.)
- Faculty of Medicine, University of Strasbourg, FMTS, 67000 Strasbourg, France; (M.P.C.); (B.G.); (P.B.)
- MNMS Platform, University Hospitals of Strasbourg, 67098 Strasbourg, France;
- Molecular Imaging—Institut Pluridisciplinaire Hubert Curien (IPHC), UMR 7178 – CNRS/Unistra, 67098 Strasbourg, France
- Correspondence: ; Tel.: +33-3-88-12-75-52; Fax: +33-3-88-12-81-21
| | - Gilles Poncet
- Digestive and Oncologic Surgery, Edouard-Herriot University Hospital, Claude-Bernard Lyon 1 University, 69622 Lyon, France;
| | - Pietro Addeo
- Hepato-Pancreato-Biliary Surgery and Liver transplantation, University Hospitals of Strasbourg, University of Strasbourg, 67098 Strasbourg, France;
| | - Elisa Ruhland
- Biophysics and Nuclear Medicine, University Hospitals of Strasbourg, 67098 Strasbourg, France; (E.R.); (I.J.N.)
- MNMS Platform, University Hospitals of Strasbourg, 67098 Strasbourg, France;
| | - Colette Roche
- INSERM U1052/CNRS UMR5286/University of Lyon, Cancer Research Center of Lyon, 69622 Lyon, France; (C.R.); (V.H.)
| | - Stephanie Battini
- MNMS Platform, University Hospitals of Strasbourg, 67098 Strasbourg, France;
| | - A. Ercument Cicek
- Computer Engineering Department, Bilkent University, Ankara 06800, Turkey;
| | - Marie Pierrette Chenard
- Faculty of Medicine, University of Strasbourg, FMTS, 67000 Strasbourg, France; (M.P.C.); (B.G.); (P.B.)
- Pathology, University Hospitals of Strasbourg, Strasbourg University, 67098 Strasbourg, France
| | - Valérie Hervieu
- INSERM U1052/CNRS UMR5286/University of Lyon, Cancer Research Center of Lyon, 69622 Lyon, France; (C.R.); (V.H.)
- Tissu-Tumorothèque Est (CRB-HCL, Hospices Civils de Lyon Biobank, BB-0033-00046), 69622 Lyon, France
| | - Bernard Goichot
- Faculty of Medicine, University of Strasbourg, FMTS, 67000 Strasbourg, France; (M.P.C.); (B.G.); (P.B.)
- Internal Medicine, Diabetes and Metabolic Disorders, University Hospitals of Strasbourg, Strasbourg University, 67098 Strasbourg, France
| | - Philippe Bachellier
- Faculty of Medicine, University of Strasbourg, FMTS, 67000 Strasbourg, France; (M.P.C.); (B.G.); (P.B.)
- Hepato-Pancreato-Biliary Surgery and Liver transplantation, University Hospitals of Strasbourg, University of Strasbourg, 67098 Strasbourg, France;
| | - Thomas Walter
- Medical Oncology, Edouard Herriot Hospital, Hospices Civils de Lyon, 69622 Lyon, France;
- University of Lyon, Université Lyon 1, 69622 Lyon, France
| | - Izzie Jacques Namer
- Biophysics and Nuclear Medicine, University Hospitals of Strasbourg, 67098 Strasbourg, France; (E.R.); (I.J.N.)
- Faculty of Medicine, University of Strasbourg, FMTS, 67000 Strasbourg, France; (M.P.C.); (B.G.); (P.B.)
- MNMS Platform, University Hospitals of Strasbourg, 67098 Strasbourg, France;
| |
|
23
|
Faitot F, Ruhland E, Oncioiu C, Besch C, Addeo P, Cicek AE, Bachellier P, Namer IJ. Metabolomic profiling highlights the metabolic bases of acute-on-chronic and post-hepatectomy liver failure. HPB (Oxford) 2019; 21:1354-1361. [PMID: 30914156 DOI: 10.1016/j.hpb.2019.02.008] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 10/01/2018] [Revised: 02/08/2019] [Accepted: 02/15/2019] [Indexed: 02/06/2023]
Abstract
BACKGROUND Posthepatectomy liver failure (PHLF) is the main limitation to extending liver resection, but its pathophysiology is not yet fully understood. The aim of the study was to describe the metabolic adaptations that occur with PHLF. METHODS A retrospective study of 82 patients was performed, using nuclear magnetic resonance metabolomics to identify and quantify intra-hepatic metabolites. The metabolite levels were compared using the metabolic network analysis method ADEMA between fatal PHLF (FLF) and non-fatal PHLF, and according to PHLF and acute-on-chronic liver failure (ACLF) grading. RESULTS Metabolomic profiles were significantly different between patients presenting FLF and non-FLF, and between grade 3 ACLF and < grade 3 ACLF. In the patients undergoing hepatectomy, valine, alanine and glycerophosphocholine were identified as powerful biomarkers to predict FLF (AUROC 0.806, 0.802 and 0.856, respectively). Network analysis showed an activation of aerobic glycolysis with glutaminolysis, as observed in highly proliferating systems. Conversely, ACLF3 showed deprivation of glucose and lactate compared to lower ACLF grades. CONCLUSION Clinical and biological severity of ACLF and PHLF correlate with specific metabolic adaptations. Metabolomics can predict fatal liver failure after hepatectomy and underline significant differences in the metabolic patterns of ACLF and PHLF.
Affiliation(s)
- Francois Faitot
- Hepatobiliopancreatic Surgery and Transplantation Department, Hopital de Hautepierre, Hopitaux Universitaires de Strasbourg, France; Laboratoire ICube, UMR7357, University of Strasbourg, France.
| | - Elisa Ruhland
- Biophysics and Nuclear Medicine Department, Hopital de Hautepierre, Hopitaux Universitaires de Strasbourg, France
| | - Constantin Oncioiu
- Hepatobiliopancreatic Surgery and Transplantation Department, Hopital de Hautepierre, Hopitaux Universitaires de Strasbourg, France
| | - Camille Besch
- Hepatobiliopancreatic Surgery and Transplantation Department, Hopital de Hautepierre, Hopitaux Universitaires de Strasbourg, France
| | - Pietro Addeo
- Hepatobiliopancreatic Surgery and Transplantation Department, Hopital de Hautepierre, Hopitaux Universitaires de Strasbourg, France; Laboratoire ICube, UMR7357, University of Strasbourg, France
| | - A Ercument Cicek
- Lane Center of Computational Biology, School of Computer Science, Carnegie Mellon University, Pittsburgh, USA; Computer Engineering Department, Bilkent University, Ankara, Turkey
| | - Philippe Bachellier
- Hepatobiliopancreatic Surgery and Transplantation Department, Hopital de Hautepierre, Hopitaux Universitaires de Strasbourg, France
| | - Izzie-Jacques Namer
- Laboratoire ICube, UMR7357, University of Strasbourg, France; Biophysics and Nuclear Medicine Department, Hopital de Hautepierre, Hopitaux Universitaires de Strasbourg, France
| |
|
24
|
Firtina C, Bar-Joseph Z, Alkan C, Cicek AE. Hercules: a profile HMM-based hybrid error correction algorithm for long reads. Nucleic Acids Res 2019; 46:e125. [PMID: 30124947 PMCID: PMC6265270 DOI: 10.1093/nar/gky724] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Received: 01/24/2018] [Accepted: 08/07/2018] [Indexed: 01/15/2023] Open
Abstract
Choosing whether to use second or third generation sequencing platforms can lead to trade-offs between accuracy and read length. Several types of studies require long and accurate reads. In such cases researchers often combine both technologies and the erroneous long reads are corrected using the short reads. Current approaches rely on various graph or alignment based techniques and do not take the error profile of the underlying technology into account. Efficient machine learning algorithms that address these shortcomings have the potential to achieve more accurate integration of these two technologies. We propose Hercules, the first machine learning-based long read error correction algorithm. Hercules models every long read as a profile Hidden Markov Model with respect to the underlying platform’s error profile. The algorithm learns a posterior transition/emission probability distribution for each long read to correct errors in these reads. We show on two DNA-seq BAC clones (CH17-157L1 and CH17-227A2) that Hercules-corrected reads have the highest mapping rate among all competing algorithms and have the highest accuracy when the breadth of coverage is high. On a large human CHM1 cell line WGS data set, Hercules is one of the few scalable algorithms; and among those, it achieves the highest accuracy.
Affiliation(s)
- Can Firtina
- Department of Computer Engineering, Bilkent University, Ankara 06800, Turkey
| | - Ziv Bar-Joseph
- Computational Biology Department, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA 15213, USA
| | - Can Alkan
- Department of Computer Engineering, Bilkent University, Ankara 06800, Turkey
| | - A Ercument Cicek
- Department of Computer Engineering, Bilkent University, Ankara 06800, Turkey; Computational Biology Department, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA 15213, USA
| |
|
25
|
Bund C, Guergova-Kuras M, Cicek AE, Moussallieh FM, Dali-Youcef N, Piotto M, Schneider P, Heller R, Entz-Werle N, Lhermitte B, Chenard MP, Schott R, Proust F, Noël G, Namer IJ. An integrated genomic and metabolomic approach for defining survival time in adult oligodendrogliomas patients. Metabolomics 2019; 15:69. [PMID: 31037432 DOI: 10.1007/s11306-019-1522-5] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 10/15/2018] [Accepted: 04/01/2019] [Indexed: 01/13/2023]
Abstract
INTRODUCTION The identification of frequent acquired mutations shows that patients with oligodendrogliomas have divergent biology with differing prognoses regardless of histological classification. A better understanding of molecular features as well as their metabolic pathways is essential. OBJECTIVES The aim of this study was to examine the relationship between the tumor metabolome, six genomic aberrations (isocitrate dehydrogenase 1 [IDH1] mutation, 1p/19q codeletion, tumor protein p53 [TP53] mutation, O6-methylguanine-DNA methyltransferase [MGMT] promoter methylation, epidermal growth factor receptor [EGFR] amplification, phosphatase and tensin homolog [PTEN] methylation), and the patients' survival time. METHODS We applied 1H high-resolution magic-angle spinning (HRMAS) nuclear magnetic resonance (NMR) spectroscopy to 72 resected oligodendrogliomas. RESULTS The presence of IDH1 mutation, TP53 mutation, 1p/19q codeletion, and MGMT promoter methylation reduced the relative risk of death, whereas PTEN methylation and EGFR amplification were associated with poor prognosis. Increased concentrations of 2-hydroxyglutarate (2HG), N-acetyl-aspartate (NAA) and myo-inositol, and an increased glycerophosphocholine/phosphocholine (GPC/PC) ratio, were good prognostic factors. Increased concentrations of serine, glycine, glutamate and alanine were associated with an increased relative risk of death. CONCLUSION HRMAS NMR spectroscopy provides accurate information on the metabolomics of oligodendrogliomas, making it possible to find new biomarkers indicative of survival. It enables rapid characterization of intact tissue and could be used as an intraoperative method.
Affiliation(s)
- Caroline Bund
- Service de Biophysique et Médecine Nucléaire, Hôpital de Hautepierre, Hôpitaux Universitaires de Strasbourg, 1, Avenue Molière, 67098, Strasbourg Cedex 09, France.
- ICube, Université de Strasbourg/CNRS, UMR 7357, Strasbourg, France.
| | | | - A Ercument Cicek
- Lane Center of Computational Biology, School of Computer Science, Carnegie Mellon University, Pittsburgh, USA
- Computer Engineering Department, Bilkent University, Ankara, Turkey
| | - François-Marie Moussallieh
- Service de Biophysique et Médecine Nucléaire, Hôpital de Hautepierre, Hôpitaux Universitaires de Strasbourg, 1, Avenue Molière, 67098, Strasbourg Cedex 09, France
| | - Nassim Dali-Youcef
- IGBMC (Institut de Génétique et de Biologie Moléculaire et Cellulaire)/CNRS UMR 7104/INSERM U964, Université de Strasbourg, Strasbourg, France
- Laboratoire de Biochimie et Biologie Moléculaire, Nouvel Hôpital Civil, Hôpitaux Universitaires de Strasbourg, Strasbourg, France
| | | | | | - Rémy Heller
- Laboratoire de Microbiologie et Biologie Moléculaire, Hôpitaux Civils de Colmar, Colmar, France
| | - Natacha Entz-Werle
- Service de Pédiatrie Onco-hématologie, Hôpital de Hautepierre, Hôpitaux Universitaires de Strasbourg, Strasbourg, France
| | - Benoît Lhermitte
- Service d'Anatomie Pathologique, Hôpital de Hautepierre, Hôpitaux Universitaires de Strasbourg, Strasbourg, France
| | - Marie-Pierre Chenard
- Service d'Anatomie Pathologique, Hôpital de Hautepierre, Hôpitaux Universitaires de Strasbourg, Strasbourg, France
| | - Roland Schott
- Departement d'Oncologie Médicale, Centre Paul Strauss, Strasbourg, France
| | - François Proust
- Service de Neurochirurgie, Hôpital de Hautepierre, Hôpitaux Universitaires de Strasbourg, Strasbourg, France
| | - Georges Noël
- Departement de Radiothérapie, Centre Paul Strauss, Strasbourg, France
| | - Izzie Jacques Namer
- Service de Biophysique et Médecine Nucléaire, Hôpital de Hautepierre, Hôpitaux Universitaires de Strasbourg, 1, Avenue Molière, 67098, Strasbourg Cedex 09, France
- ICube, Université de Strasbourg/CNRS, UMR 7357, Strasbourg, France
- FMTS (Fédération de Médecine Translationnelle de Strasbourg), Faculté de Médecine, Strasbourg, France
| |
|
26
|
von Thenen N, Ayday E, Cicek AE. Re-identification of individuals in genomic data-sharing beacons via allele inference. Bioinformatics 2018; 35:365-371. [DOI: 10.1093/bioinformatics/bty643] [Citation(s) in RCA: 23] [Impact Index Per Article: 3.8] [Received: 02/14/2018] [Accepted: 07/18/2018] [Indexed: 11/14/2022] Open
Affiliation(s)
- Nora von Thenen
- Computer Engineering Department, Bilkent University, Ankara, Turkey
| | - Erman Ayday
- Computer Engineering Department, Bilkent University, Ankara, Turkey
- Department of Electrical Engineering and Computer Science, Case Western Reserve University, Cleveland, OH, USA
| | - A Ercument Cicek
- Computer Engineering Department, Bilkent University, Ankara, Turkey
- Computational Biology Department, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA, USA
| |
|
27
|
Liu Y, Liang Y, Cicek AE, Li Z, Li J, Muhle RA, Krenzer M, Mei Y, Wang Y, Knoblauch N, Morrison J, Zhao S, Jiang Y, Geller E, Ionita-Laza I, Wu J, Xia K, Noonan JP, Sun ZS, He X. A Statistical Framework for Mapping Risk Genes from De Novo Mutations in Whole-Genome-Sequencing Studies. Am J Hum Genet 2018; 102:1031-1047. [PMID: 29754769 PMCID: PMC5992125 DOI: 10.1016/j.ajhg.2018.03.023] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.0] [Received: 08/23/2017] [Accepted: 03/22/2018] [Indexed: 10/16/2022] Open
Abstract
Analysis of de novo mutations (DNMs) from sequencing data of nuclear families has identified risk genes for many complex diseases, including multiple neurodevelopmental and psychiatric disorders. Most of these efforts have focused on mutations in protein-coding sequences. Evidence from genome-wide association studies (GWASs) strongly suggests that variants important to human diseases often lie in non-coding regions. Extending DNM-based approaches to non-coding sequences is challenging, however, because the functional significance of non-coding mutations is difficult to predict. We propose a statistical framework for analyzing DNMs from whole-genome sequencing (WGS) data. This method, TADA-Annotations (TADA-A), is a major advance of the TADA method we developed earlier for DNM analysis in coding regions. TADA-A is able to incorporate many functional annotations such as conservation and enhancer marks, to learn from data which annotations are informative of pathogenic mutations, and to combine both coding and non-coding mutations at the gene level to detect risk genes. It also supports meta-analysis of multiple DNM studies, while adjusting for study-specific technical effects. We applied TADA-A to WGS data of ∼300 autism-affected family trios across five studies and discovered several autism risk genes. The software is freely available for all research uses.
Affiliation(s)
- Yuwen Liu
- Department of Human Genetics, The University of Chicago, Chicago, IL 60637, USA
| | - Yanyu Liang
- Computational Biology Department, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA 15213, USA
| | - A Ercument Cicek
- Computational Biology Department, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA 15213, USA; Computer Engineering Department, Bilkent University, Ankara 06800, Turkey
| | - Zhongshan Li
- Institute of Genomic Medicine, Wenzhou Medical University, Wenzhou, Zhejiang 325000, China
| | - Jinchen Li
- Center for Medical Genetics, School of Life Sciences, Central South University, Changsha, Hunan 410078, China; National Clinical Research Center for Geriatric Disorders, Xiangya Hospital, Central South University, Changsha, Hunan 410078, China
| | | | - Martina Krenzer
- Kavli Institute for Neuroscience, Yale School of Medicine, New Haven, CT 06520, USA
| | - Yue Mei
- Beijing Institutes of Life Science, Chinese Academy of Sciences, Beijing 100000, China
| | - Yan Wang
- Beijing Institutes of Life Science, Chinese Academy of Sciences, Beijing 100000, China
| | - Nicholas Knoblauch
- Committee on Genetics, Genomics and Systems Biology, The University of Chicago, Chicago, IL 60637, USA
| | - Jean Morrison
- Department of Human Genetics, The University of Chicago, Chicago, IL 60637, USA
| | - Siming Zhao
- Department of Human Genetics, The University of Chicago, Chicago, IL 60637, USA
| | - Yi Jiang
- Institute of Genomic Medicine, Wenzhou Medical University, Wenzhou, Zhejiang 325000, China; Center for Medical Genetics, School of Life Sciences, Central South University, Changsha, Hunan 410078, China
| | - Evan Geller
- Department of Genetics, Yale School of Medicine, New Haven, CT 06520, USA; Kavli Institute for Neuroscience, Yale School of Medicine, New Haven, CT 06520, USA
| | | | - Jinyu Wu
- Beijing Institutes of Life Science, Chinese Academy of Sciences, Beijing 100000, China; Institute of Genomic Medicine, Wenzhou Medical University, Wenzhou, Zhejiang 325000, China
| | - Kun Xia
- Center for Medical Genetics, School of Life Sciences, Central South University, Changsha, Hunan 410078, China
| | - James P Noonan
- Department of Genetics, Yale School of Medicine, New Haven, CT 06520, USA; Kavli Institute for Neuroscience, Yale School of Medicine, New Haven, CT 06520, USA
| | - Zhong Sheng Sun
- Beijing Institutes of Life Science, Chinese Academy of Sciences, Beijing 100000, China; Institute of Genomic Medicine, Wenzhou Medical University, Wenzhou, Zhejiang 325000, China.
| | - Xin He
- Department of Human Genetics, The University of Chicago, Chicago, IL 60637, USA.
| |
|
28
|
Nguyen A, Moussallieh FM, Mackay A, Cicek AE, Coca A, Chenard MP, Weingertner N, Lhermitte B, Letouzé E, Guérin E, Pencreach E, Jannier S, Guenot D, Namer IJ, Jones C, Entz-Werlé N. Characterization of the transcriptional and metabolic responses of pediatric high grade gliomas to mTOR-HIF-1α axis inhibition. Oncotarget 2017; 8:71597-71617. [PMID: 29069732 PMCID: PMC5641075 DOI: 10.18632/oncotarget.16500] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Received: 06/14/2016] [Accepted: 01/16/2017] [Indexed: 12/12/2022] Open
Abstract
Pediatric high-grade gliomas (pHGGs), including supratentorial and diffuse intrinsic pontine gliomas, are known to have a dismal prognosis. Even with increased knowledge of the molecular biology driving this brain tumor entity, no treatment is able to cure these patients. We therefore focused on a translational pathway able to increase cell resistance to treatment and to metabolically reprogram tumor cells, which then adapt easily to a hypoxic microenvironment. To establish the crucial role of the hypoxic pathways in pHGGs, we first assessed their protein and transcriptomic deregulation in a pediatric cohort of pHGGs and in pHGG cell lines cultured in both normoxic and hypoxic conditions. Second, based on the concept of a bi-therapy targeting mTORC1 (rapamycin) and HIF-1α (irinotecan) in pHGGs, we hypothesized that the balance of expression among RAS/ERK, PI3K/AKT and HIF-1α/HIF-2α/MYC proteins or genes may modulate the cell response to this double targeting. Finally, we identified three protein, genomic and metabolomic profiles of response to rapamycin combined with irinotecan. The pattern of cells highly sensitive to mTOR/HIF-1α targeting was linked to MYC/ERK/HIF-1α over-expression, and cell resistance was linked to a major hyper-expression of HIF-2α.
Affiliation(s)
- Aurélia Nguyen
- Laboratory EA 3430, Progression Tumorale et Micro-Environnement, Approches Translationnelles et Epidémiologie, University of Strasbourg, Strasbourg, France
| | | | - Alan Mackay
- Institute of Cancer Research, Sutton, Surrey, United Kingdom
| | - A Ercument Cicek
- Computational Biology Department, Carnegie Mellon University, Pittsburgh, PA, USA; Computer Engineering Department, Bilkent University, Cankaya, Ankara, Turkey
| | - Andres Coca
- Department of Neurosurgery, University Hospital of Strasbourg, Strasbourg, France
| | - Marie Pierre Chenard
- Department of Pathology, University Hospital of Strasbourg, Strasbourg, France; Centre de Ressources Biologiques, University Hospital of Strasbourg, Strasbourg, France
| | - Noelle Weingertner
- Department of Pathology, University Hospital of Strasbourg, Strasbourg, France
| | - Benoit Lhermitte
- Department of Pathology, University Hospital of Strasbourg, Strasbourg, France
| | - Eric Letouzé
- Programme Cartes d'Identité des Tumeurs, Ligue Nationale Contre Le Cancer, Paris, France
| | - Eric Guérin
- Laboratory EA 3430, Progression Tumorale et Micro-Environnement, Approches Translationnelles et Epidémiologie, University of Strasbourg, Strasbourg, France
| | - Erwan Pencreach
- Laboratory EA 3430, Progression Tumorale et Micro-Environnement, Approches Translationnelles et Epidémiologie, University of Strasbourg, Strasbourg, France
| | - Sarah Jannier
- Laboratory EA 3430, Progression Tumorale et Micro-Environnement, Approches Translationnelles et Epidémiologie, University of Strasbourg, Strasbourg, France; Department of Pediatric Onco-hematology, University Hospital of Strasbourg, Strasbourg, France
| | - Dominique Guenot
- Laboratory EA 3430, Progression Tumorale et Micro-Environnement, Approches Translationnelles et Epidémiologie, University of Strasbourg, Strasbourg, France
| | - Izzie Jacques Namer
- Department of Nuclear Medicine, University Hospital of Strasbourg, Strasbourg, France
| | - Chris Jones
- Institute of Cancer Research, Sutton, Surrey, United Kingdom
| | - Natacha Entz-Werlé
- Laboratory EA 3430, Progression Tumorale et Micro-Environnement, Approches Translationnelles et Epidémiologie, University of Strasbourg, Strasbourg, France; Department of Pediatric Onco-hematology, University Hospital of Strasbourg, Strasbourg, France
| |
|
29
|
Battini S, Faitot F, Imperiale A, Cicek AE, Heimburger C, Averous G, Bachellier P, Namer IJ. Metabolomics approaches in pancreatic adenocarcinoma: tumor metabolism profiling predicts clinical outcome of patients. BMC Med 2017; 15:56. [PMID: 28298227 PMCID: PMC5353864 DOI: 10.1186/s12916-017-0810-z] [Citation(s) in RCA: 38] [Impact Index Per Article: 5.4] [Received: 10/28/2016] [Accepted: 02/07/2017] [Indexed: 12/16/2022] Open
Abstract
BACKGROUND Pancreatic adenocarcinomas (PAs) have very poor prognoses even when surgery is possible. Currently, there are no tissular biomarkers to predict long-term survival in patients with PA. The aims of this study were to (1) describe the metabolome of pancreatic parenchyma (PP) and PA, (2) determine the impact of neoadjuvant chemotherapy on PP and PA, and (3) find tissue metabolic biomarkers associated with long-term survivors, using metabolomics analysis. METHODS 1H high-resolution magic angle spinning (HRMAS) nuclear magnetic resonance (NMR) spectroscopy using intact tissues was applied to analyze metabolites in PP tissue samples (n = 17) and intact tumor samples (n = 106), obtained from 106 patients undergoing surgical resection for PA. RESULTS An orthogonal partial least square-discriminant analysis (OPLS-DA) showed a clear distinction between PP and PA. Higher concentrations of myo-inositol and glycerol were shown in PP, whereas higher levels of glucose, ascorbate, ethanolamine, lactate, and taurine were revealed in PA. Among those metabolites, one of them was particularly obvious in the distinction between long-term and short-term survivors. A high ethanolamine level was associated with worse survival. The impact of neoadjuvant chemotherapy was higher on PA than on PP. CONCLUSIONS This study shows that HRMAS NMR spectroscopy using intact tissue provides important and solid information in the characterization of PA. Metabolomics profiling can also predict long-term survival: the assessment of ethanolamine concentration can be clinically relevant as a single metabolic biomarker. This information can be obtained in 20 min, during surgery, to distinguish long-term from short-term survival.
Affiliation(s)
- S Battini
  - ICube, UMR 7357 University of Strasbourg/CNRS, Strasbourg, France
- F Faitot
  - ICube, UMR 7357 University of Strasbourg/CNRS, Strasbourg, France
  - Department of Visceral Surgery and Transplantation, Hautepierre Hospital, University Hospitals of Strasbourg, Strasbourg, France
  - FMTS, Faculty of Medicine, Strasbourg, France
- A Imperiale
  - ICube, UMR 7357 University of Strasbourg/CNRS, Strasbourg, France
  - FMTS, Faculty of Medicine, Strasbourg, France
  - Department of Biophysics and Nuclear Medicine, Hautepierre Hospital, University Hospitals of Strasbourg, 1, Avenue Molière, Strasbourg, Cedex, 67098, France
- A E Cicek
  - Computational Biology Department, Carnegie Mellon University, Pittsburgh, PA, USA
  - Computer Engineering Department, Bilkent University, Ankara, Turkey
- C Heimburger
  - ICube, UMR 7357 University of Strasbourg/CNRS, Strasbourg, France
  - FMTS, Faculty of Medicine, Strasbourg, France
  - Department of Biophysics and Nuclear Medicine, Hautepierre Hospital, University Hospitals of Strasbourg, 1, Avenue Molière, Strasbourg, Cedex, 67098, France
- G Averous
  - Department of Pathology, Hautepierre Hospital, University Hospitals of Strasbourg, Strasbourg, France
- P Bachellier
  - Department of Visceral Surgery and Transplantation, Hautepierre Hospital, University Hospitals of Strasbourg, Strasbourg, France
- I J Namer
  - ICube, UMR 7357 University of Strasbourg/CNRS, Strasbourg, France
  - FMTS, Faculty of Medicine, Strasbourg, France
  - Department of Biophysics and Nuclear Medicine, Hautepierre Hospital, University Hospitals of Strasbourg, 1, Avenue Molière, Strasbourg, Cedex, 67098, France
30
Battini S, Imperiale A, Taïeb D, Elbayed K, Cicek AE, Sebag F, Brunaud L, Namer IJ. High-resolution magic angle spinning ¹H nuclear magnetic resonance spectroscopy metabolomics of hyperfunctioning parathyroid glands. Surgery 2016; 160:384-94. [PMID: 27106795 DOI: 10.1016/j.surg.2016.03.002] [Received: 02/02/2016] [Revised: 03/07/2016] [Accepted: 03/07/2016]
Abstract
BACKGROUND Primary hyperparathyroidism (PHPT) may be related to a single gland disease or multiglandular disease, which requires specific treatments. At present, an operation is the only curative treatment for PHPT. Currently, there are no biomarkers available to identify these 2 entities (single vs. multiple gland disease). The aims of the present study were to compare (1) the tissue metabolomics profiles between PHPT and renal hyperparathyroidism (secondary and tertiary) and (2) single gland disease with multiglandular disease in PHPT using metabolomics analysis. METHODS The method used was ¹H high-resolution magic angle spinning nuclear magnetic resonance spectroscopy. Forty-three samples from 32 patients suffering from hyperparathyroidism were included in this study. RESULTS Significant differences in the metabolomics profile were assessed according to PHPT and renal hyperparathyroidism. A bicomponent orthogonal partial least square-discriminant analysis showed a clear distinction between PHPT and renal hyperparathyroidism (R²Y = 0.85, Q² = 0.63). Interestingly, the model also distinguished single gland disease from multiglandular disease (R²Y = 0.96, Q² = 0.55). A network analysis was also performed using the Algorithm to Determine Expected Metabolite Level Alterations Using Mutual Information (ADEMA). Single gland disease was accurately predicted by ADEMA and was associated with higher levels of phosphorylcholine, choline, glycerophosphocholine, fumarate, succinate, lactate, glucose, glutamine, and ascorbate compared with multiglandular disease. CONCLUSION This study shows for the first time that ¹H high-resolution magic angle spinning nuclear magnetic resonance spectroscopy is a reliable and fast technique to distinguish single gland disease from multiglandular disease in patients with PHPT. The potential use of this method as an intraoperative tool requires specific further studies.
Affiliation(s)
- Alessio Imperiale
  - ICube, UMR 7357 University of Strasbourg/CNRS, Strasbourg, France; Department of Biophysics and Nuclear Medicine, Hautepierre Hospital, University Hospitals of Strasbourg, Strasbourg, France; FMTS, Faculty of Medicine, Strasbourg, France
- David Taïeb
  - La Timone University Hospital, European Center for Research in Medical Imaging, Aix-Marseille University, Marseille, France
- Karim Elbayed
  - ICube, UMR 7357 University of Strasbourg/CNRS, Strasbourg, France
- A Ercument Cicek
  - Lane Center for Computational Biology, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA; Computer Engineering Department, Bilkent University, Ankara, Turkey
- Frédéric Sebag
  - Department of Endocrine Surgery, Aix-Marseille University, Marseille, France
- Laurent Brunaud
  - Department of Digestive, Hepato-Biliary and Endocrine Surgery, Brabois University Hospital, Nancy, France
- Izzie-Jacques Namer
  - ICube, UMR 7357 University of Strasbourg/CNRS, Strasbourg, France; Department of Biophysics and Nuclear Medicine, Hautepierre Hospital, University Hospitals of Strasbourg, Strasbourg, France; FMTS, Faculty of Medicine, Strasbourg, France
31
Abstract
Methods for the analysis of chromatin immunoprecipitation sequencing (ChIP-seq) data start by aligning the short reads to a reference genome. While often successful, they are not appropriate for cases where a reference genome is not available. Here we develop methods for de novo analysis of ChIP-seq data. Our methods combine de novo assembly with statistical tests enabling motif discovery without the use of a reference genome. We validate the performance of our method using human and mouse data. Analysis of fly data indicates that our method outperforms alignment based methods that utilize closely related species.
Affiliation(s)
- Xin He
  - Department of Human Genetics, The University of Chicago, 920 E. 58th Street, CLSC, Chicago, IL, 60637, USA.
- A Ercument Cicek
  - Computational Biology Department, Carnegie Mellon University, 5000 Forbes Ave, Pittsburgh, PA, 15213, USA; Department of Computer Engineering, Bilkent University, Ankara, 06800, Turkey.
- Yuhao Wang
  - Computer Science and Artificial Intelligence Laboratory, 32 Vassar Street, MIT, Cambridge, MA, 02139, USA.
- Marcel H Schulz
  - Multimodal Computing and Interaction, Saarland University & Max Planck Institute for Informatics, Saarbrücken, 66123, Saarland, Germany.
- Hai-Son Le
  - Computational Biology Department, Carnegie Mellon University, 5000 Forbes Ave, Pittsburgh, PA, 15213, USA. hple+@cs.cmu.edu
- Ziv Bar-Joseph
  - Computational Biology Department, Carnegie Mellon University, 5000 Forbes Ave, Pittsburgh, PA, 15213, USA.
32
Cicek AE, Roeder K, Ozsoyoglu G. MIRA: mutual information-based reporter algorithm for metabolic networks. Bioinformatics 2015; 31:1160. [PMID: 25762654 PMCID: PMC4382909 DOI: 10.1093/bioinformatics/btv081]
33
Abstract
MOTIVATION Discovering the transcriptional regulatory architecture of the metabolism has been an important topic to understand the implications of transcriptional fluctuations on metabolism. The reporter algorithm (RA) was proposed to determine the hot spots in metabolic networks, around which transcriptional regulation is focused owing to a disease or a genetic perturbation. Using a z-score-based scoring scheme, RA calculates the average statistical change in the expression levels of genes that are neighbors to a target metabolite in the metabolic network. The RA approach has been used in numerous studies to analyze cellular responses to the downstream genetic changes. In this article, we propose a mutual information-based multivariate reporter algorithm (MIRA) with the goal of eliminating the following problems in detecting reporter metabolites: (i) conventional statistical methods suffer from small sample sizes, (ii) as z-score ranges from minus to plus infinity, calculating average scores can lead to canceling out opposite effects and (iii) analyzing genes one by one, then aggregating results can lead to information loss. MIRA is a multivariate and combinatorial algorithm that calculates the aggregate transcriptional response around a metabolite using mutual information. We show that MIRA's results are biologically sound, empirically significant and more reliable than RA. RESULTS We apply MIRA to gene expression analysis of six knockout strains of Escherichia coli and show that MIRA captures the underlying metabolic dynamics of the switch from aerobic to anaerobic respiration. We also apply MIRA to an Autism Spectrum Disorder gene expression dataset. Results indicate that MIRA reports metabolites that highly overlap with recently found metabolic biomarkers in the autism literature. Overall, MIRA is a promising algorithm for detecting metabolic drug targets and understanding the relation between gene expression and metabolic activity. 
AVAILABILITY AND IMPLEMENTATION The code is implemented in C# language using .NET framework. Project is available upon request.
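Problem (ii) in the abstract above, where averaging signed z-scores cancels opposite effects, can be illustrated with a toy calculation. The following is a minimal sketch, not the RA or MIRA implementation: the plain mean is a simplified stand-in for RA-style scoring, and the function names and z-score values are hypothetical.

```python
from statistics import mean


def average_z(z_scores):
    """Plain mean of signed per-gene z-scores (simplified stand-in for RA-style scoring)."""
    return mean(z_scores)


def average_abs_z(z_scores):
    """Mean of absolute z-scores, shown only to expose the magnitude hidden by cancellation."""
    return mean(abs(z) for z in z_scores)


# Hypothetical z-scores for four genes neighboring one metabolite:
# two strongly up-regulated, two strongly down-regulated.
neighbor_z = [3.0, -3.0, 2.5, -2.5]

signed_score = average_z(neighbor_z)        # opposite effects cancel out
unsigned_score = average_abs_z(neighbor_z)  # the large changes remain visible

print(signed_score, unsigned_score)
```

In this toy case the signed average suggests no regulation around the metabolite even though every neighboring gene changes strongly, which is the kind of information loss the abstract attributes to averaging z-scores.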
Affiliation(s)
- A Ercument Cicek
  - Lane Center for Computational Biology, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA, USA 15213 and Department of Electrical Engineering and Computer Science, School of Engineering, Case Western Reserve University, Cleveland, OH, USA 44106
- Kathryn Roeder
  - Lane Center for Computational Biology, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA, USA 15213 and Department of Electrical Engineering and Computer Science, School of Engineering, Case Western Reserve University, Cleveland, OH, USA 44106
- Gultekin Ozsoyoglu
  - Lane Center for Computational Biology, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA, USA 15213 and Department of Electrical Engineering and Computer Science, School of Engineering, Case Western Reserve University, Cleveland, OH, USA 44106
34
Coskun SA, Cicek AE, Lai N, Dash RK, Ozsoyoglu ZM, Ozsoyoglu G. An online model composition tool for system biology models. BMC Syst Biol 2013; 7:88. [PMID: 24006914 PMCID: PMC3846440 DOI: 10.1186/1752-0509-7-88] [Received: 01/22/2013] [Accepted: 08/21/2013]
Abstract
Background There are multiple representation formats for Systems Biology computational models, and the Systems Biology Markup Language (SBML) is one of the most widely used. SBML is used to capture, store, and distribute computational models by Systems Biology data sources (e.g., the BioModels Database) and researchers. Therefore, there is a need for all-in-one web-based solutions that support advanced SBML functionalities such as uploading, editing, composing, visualizing, simulating, querying, and browsing computational models. Results We present the design and implementation of the Model Composition Tool (Interface) within the PathCase-SB (PathCase Systems Biology) web portal. The tool helps users compose systems biology models to facilitate the complex process of merging systems biology models. We also present three tools that support the model composition tool, namely, (1) Model Simulation Interface that generates a visual plot of the simulation according to user’s input, (2) iModel Tool as a platform for users to upload their own models to compose, and (3) SimCom Tool that provides a side by side comparison of models being composed in the same pathway. Finally, we provide a web site that hosts BioModels Database models and a separate web site that hosts SBML Test Suite models. Conclusions The model composition tool (and the other three tools) can be used with little or no knowledge of the SBML document structure. For this reason, students or anyone who wants to learn about systems biology will benefit from the described functionalities. SBML Test Suite models will be a nice starting point for beginners. For more advanced purposes, users will also be able to access and employ models of the BioModels Database.
Affiliation(s)
- Sarp A Coskun
- Electrical Engineering and Computer Science Department, Case Western Reserve University, Cleveland, OH, USA.
35
Cicek AE, Bederman I, Henderson L, Drumm ML, Ozsoyoglu G. ADEMA: an algorithm to determine expected metabolite level alterations using mutual information. PLoS Comput Biol 2013; 9:e1002859. [PMID: 23341761 PMCID: PMC3547803 DOI: 10.1371/journal.pcbi.1002859] [Received: 04/17/2012] [Accepted: 10/23/2012]
Abstract
Metabolomics is a relatively new “omics” platform, which analyzes a discrete set of metabolites detected in bio-fluids or tissue samples of organisms. It has been used in a diverse array of studies to detect biomarkers and to determine activity rates for pathways based on changes due to disease or drugs. Recent improvements in analytical methodology and large sample throughput allow for creation of large datasets of metabolites that reflect changes in metabolic dynamics due to disease or a perturbation in the metabolic network. However, current methods of comprehensive analyses of large metabolic datasets (metabolomics) are limited, unlike other “omics” approaches where complex techniques for analyzing coexpression/coregulation of multiple variables are applied. This paper discusses the shortcomings of current metabolomics data analysis techniques, and proposes a new multivariate technique (ADEMA) based on mutual information to identify expected metabolite level changes with respect to a specific condition. We show that ADEMA better predicts De Novo Lipogenesis pathway metabolite level changes in samples with Cystic Fibrosis (CF) than prediction based on the significance of individual metabolite level changes. We also applied ADEMA's classification scheme on three different cohorts of CF and wildtype mice. ADEMA was able to predict whether an unknown mouse has a CF or a wildtype genotype with 1.0, 0.84, and 0.9 accuracy for each respective dataset. ADEMA results had up to 31% higher accuracy as compared to other classification algorithms. In conclusion, ADEMA advances the state-of-the-art in metabolomics analysis, by providing accurate and interpretable classification results. Metabolomics is an experimental approach that analyzes differences in metabolite levels detected in experimental samples. It has been used in the literature to understand the changes in metabolism with respect to diseases or drugs. 
Unlike transcriptomics or proteomics, which analyze gene and protein expression levels respectively, the techniques that consider co-regulation of multiple metabolites are quite limited. In this paper, we propose a novel technique, called ADEMA, which computes the expected level changes for each metabolite with respect to a given condition. ADEMA considers multiple metabolites at the same time and is mutual information (MI)-based. We show that ADEMA predicts metabolite level changes for young mice with Cystic Fibrosis (CF) better than significance testing that considers one metabolite at a time. Using three different datasets that contain CF and wild-type (WT) mice, we show that ADEMA can classify an individual as being CF or WT based on the metabolic profiles (with 1.0, 0.84, and 0.9 accuracy, respectively). Compared to other well-known classification algorithms, ADEMA's accuracy is higher by up to 31%.
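The abstract contrasts a multivariate, mutual information-based view with testing one metabolite at a time. The following is a minimal sketch of that contrast, not the ADEMA algorithm itself: it uses a plain plug-in estimate of mutual information on discretized levels, and the function name, the 0/1 level coding, and the toy samples are all hypothetical.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Plug-in estimate of I(X; Y) in bits from two equal-length sequences of discrete values."""
    n = len(xs)
    count_x = Counter(xs)
    count_y = Counter(ys)
    count_xy = Counter(zip(xs, ys))
    return sum(
        (c / n) * log2((c / n) / ((count_x[x] / n) * (count_y[y] / n)))
        for (x, y), c in count_xy.items()
    )


# Hypothetical discretized levels (0 = low, 1 = high) of two metabolites in four samples,
# with a hypothetical genotype label per sample.
metabolite_a = [0, 0, 1, 1]
metabolite_b = [0, 1, 0, 1]
genotype = ["WT", "CF", "CF", "WT"]

mi_a = mutual_information(metabolite_a, genotype)  # one metabolite alone
mi_b = mutual_information(metabolite_b, genotype)  # the other metabolite alone
mi_joint = mutual_information(list(zip(metabolite_a, metabolite_b)), genotype)  # both together

print(mi_a, mi_b, mi_joint)
```

In this constructed example neither metabolite carries information about the genotype on its own, while the pair considered jointly determines it completely. Real data would need a discretization step and far more samples than this toy set.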
Affiliation(s)
- A Ercument Cicek
- Department of Electrical Engineering and Computer Science, Case Western Reserve University, Cleveland, Ohio, USA.
36
Abstract
Steady state metabolic network dynamics analysis (SMDA) is a recently proposed computational metabolomics tool that (i) captures a metabolic network and its rules via a metabolic network database, (ii) mimics the reasoning of a biochemist, given a set of metabolic observations, and (iii) locates efficiently all possible metabolic activation/inactivation (flux) alternatives. However, a number of factors may cause the SMDA algorithm to eliminate feasible flux scenarios. These factors include (i) inherent error margins in observations (measurements), (ii) lack of knowledge to classify measurements as normal versus abnormal, and (iii) choosing a highly constrained metabolic subnetwork to query against. In this work, we first present and formalize these obstacles. Then, we propose techniques to eliminate them and present an experimental evaluation of our proposed techniques.
Affiliation(s)
- A Ercument Cicek
- Department of Electrical Engineering and Computer Science, Case Western Reserve University, 10900 Euclid Ave., Cleveland, OH 44106, USA.
37
Coskun SA, Qi X, Cakmak A, Cheng E, Cicek AE, Yang L, Jadeja R, Dash RK, Lai N, Ozsoyoglu G, Ozsoyoglu ZM. PathCase-SB: integrating data sources and providing tools for systems biology research. BMC Syst Biol 2012; 6:67. [PMID: 22697505 PMCID: PMC3410775 DOI: 10.1186/1752-0509-6-67] [Received: 03/29/2011] [Accepted: 06/14/2012]
Abstract
BACKGROUND Integration of metabolic pathways resources and metabolic network models, and deploying new tools on the integrated platform can help perform more effective and more efficient systems biology research on understanding the regulation of metabolic networks. Therefore, the tasks of (a) integrating under a single database environment regulatory metabolic networks and existing models, and (b) building tools to help with modeling and analysis are desirable and intellectually challenging computational tasks. RESULTS PathCase Systems Biology (PathCase-SB) is built and released. This paper describes PathCase-SB user interfaces developed to date. The current PathCase-SB system provides a database-enabled framework and web-based computational tools towards facilitating the development of kinetic models for biological systems. PathCase-SB aims to integrate systems biology models data and metabolic network data of selected biological data sources on the web (currently, BioModels Database and KEGG, respectively), and to provide more powerful and/or new capabilities via the new web-based integrative framework. CONCLUSIONS Each of the current four PathCase-SB interfaces, namely, Browser, Visualization, Querying, and Simulation interfaces, has expanded and new capabilities as compared with the original data sources. PathCase-SB is already available on the web and being used by researchers across the globe.
Affiliation(s)
- Sarp A Coskun
  - Electrical Engineering and Computer Science Department, Case Western Reserve University, Cleveland, USA
- Xinjian Qi
  - Electrical Engineering and Computer Science Department, Case Western Reserve University, Cleveland, USA
- Ali Cakmak
  - Electrical Engineering and Computer Science Department, Case Western Reserve University, Cleveland, USA
- En Cheng
  - Electrical Engineering and Computer Science Department, Case Western Reserve University, Cleveland, USA
- A Ercument Cicek
  - Electrical Engineering and Computer Science Department, Case Western Reserve University, Cleveland, USA
- Lei Yang
  - Electrical Engineering and Computer Science Department, Case Western Reserve University, Cleveland, USA
- Rishiraj Jadeja
  - Electrical Engineering and Computer Science Department, Case Western Reserve University, Cleveland, USA
- Ranjan K Dash
  - Department of Physiology, Medical College of Wisconsin, Milwaukee, USA
- Nicola Lai
  - Department of Biomedical Engineering, Case Western Reserve University, Cleveland, USA
  - Department of Pediatrics, Case Western Reserve University, Cleveland, USA
- Gultekin Ozsoyoglu
  - Electrical Engineering and Computer Science Department, Case Western Reserve University, Cleveland, USA
- Zehra Meral Ozsoyoglu
  - Electrical Engineering and Computer Science Department, Case Western Reserve University, Cleveland, USA
38
Cakmak A, Qi X, Coskun SA, Das M, Cheng E, Cicek AE, Lai N, Ozsoyoglu G, Ozsoyoglu ZM. PathCase-SB architecture and database design. BMC Syst Biol 2011; 5:188. [PMID: 22070889 PMCID: PMC3229461 DOI: 10.1186/1752-0509-5-188] [Received: 03/29/2011] [Accepted: 11/09/2011]
Abstract
Background Integration of metabolic pathways resources and regulatory metabolic network models, and deploying new tools on the integrated platform can help perform more effective and more efficient systems biology research on understanding the regulation in metabolic networks. Therefore, the tasks of (a) integrating under a single database environment regulatory metabolic networks and existing models, and (b) building tools to help with modeling and analysis are desirable and intellectually challenging computational tasks. Description PathCase Systems Biology (PathCase-SB) is built and released. The PathCase-SB database provides data and API for multiple user interfaces and software tools. The current PathCase-SB system provides a database-enabled framework and web-based computational tools towards facilitating the development of kinetic models for biological systems. PathCase-SB aims to integrate data of selected biological data sources on the web (currently, BioModels database and KEGG), and to provide more powerful and/or new capabilities via the new web-based integrative framework. This paper describes architecture and database design issues encountered in PathCase-SB's design and implementation, and presents the current design of PathCase-SB's architecture and database. Conclusions PathCase-SB architecture and database provide a highly extensible and scalable environment with easy and fast (real-time) access to the data in the database. PathCase-SB itself is already being used by researchers across the world.
Affiliation(s)
- Ali Cakmak
- Electrical Engineering and Computer Science Department, Case Western Reserve University, Cleveland, OH 44106, USA