1
Sahoo K, Sundararajan V. Methods in DNA methylation array dataset analysis: A review. Comput Struct Biotechnol J 2024; 23:2304-2325. [PMID: 38845821 PMCID: PMC11153885 DOI: 10.1016/j.csbj.2024.05.015] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/18/2023] [Revised: 04/25/2024] [Accepted: 05/08/2024] [Indexed: 06/09/2024] Open
Abstract
Understanding the intricate relationships between gene expression levels and epigenetic modifications in a genome is crucial to comprehending the pathogenic mechanisms of many diseases. With the advancement of DNA methylome profiling techniques, identifying differentially methylated regions and genes (DMRs/DMGs) has become central to biomarker discovery, offering new insights into the etiology of illnesses. This review surveys the current state of computational tools and algorithms for analyzing microarray-based DNA methylation profiling datasets, focusing on the key concepts underlying the extraction of diagnostic and prognostic CpG sites. It covers the methodological frameworks, algorithms, and pipelines employed by various authors, serving as a roadmap to the challenges and changing trends in methodologies for analyzing array-based DNA methylation profiling datasets derived from diseased genomes. Additionally, it highlights the importance of integrating gene expression and methylation datasets for accurate biomarker identification, explores prognostic prediction models, and discusses molecular subtyping for disease classification. The review also emphasizes the contributions of machine learning, neural networks, and data mining to diagnostic workflow development, improving accuracy, precision, and robustness.
Affiliation(s)
- Vino Sundararajan
- Correspondence to: Department of Bio Sciences, School of Bio Sciences and Technology, Vellore Institute of Technology, Vellore 632 014, Tamil Nadu, India.
2
Dai L, Johnson-Buck A, Laird PW, Tewari M, Walter NG. Ultrasensitive amplification-free quantification of a methyl CpG-rich cancer biomarker by single-molecule kinetic fingerprinting. bioRxiv 2024:2024.04.06.587997. [PMID: 38645159 PMCID: PMC11030368 DOI: 10.1101/2024.04.06.587997] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/23/2024]
Abstract
The most well-studied epigenetic marker in humans is the 5-methyl modification of cytosine in DNA, which has great potential as a disease biomarker in liquid biopsies of cell-free DNA. Currently, quantification of DNA methylation relies heavily on bisulfite conversion followed by PCR amplification and NGS or microarray analysis. PCR is subject to potential bias in differential amplification of bisulfite-converted methylated versus unmethylated sequences. Here, we combine bisulfite conversion with single-molecule kinetic fingerprinting to develop an amplification-free assay for DNA methylation at the branched-chain amino acid transaminase 1 (BCAT1) promoter. Our assay selectively responds to methylated sequences with a limit of detection below 1 fM and a specificity of 99.9999%. Evaluating complex genomic DNA matrices, we reliably distinguish 2-5% DNA methylation at the BCAT1 promoter in whole blood DNA from completely unmethylated whole-genome amplified DNA. Taken together, these results demonstrate the feasibility and sensitivity of our amplification-free, single-molecule quantification approach to improve the early detection of methylated cancer DNA biomarkers.
Affiliation(s)
- Liuhan Dai
- Single Molecule Analysis Group, Department of Chemistry, University of Michigan, Ann Arbor, MI 48109, USA
- Center for RNA Biomedicine, University of Michigan, Ann Arbor, MI 48109, USA
- Alexander Johnson-Buck
- Single Molecule Analysis Group, Department of Chemistry, University of Michigan, Ann Arbor, MI 48109, USA
- Peter W. Laird
- Department of Epigenetics, Van Andel Institute, Grand Rapids, MI, 49503, USA
- Muneesh Tewari
- Center for RNA Biomedicine, University of Michigan, Ann Arbor, MI 48109, USA
- Department of Internal Medicine, Division of Hematology/Oncology, University of Michigan, Ann Arbor, MI 48109, USA
- Nils G. Walter
- Single Molecule Analysis Group, Department of Chemistry, University of Michigan, Ann Arbor, MI 48109, USA
- Center for RNA Biomedicine, University of Michigan, Ann Arbor, MI 48109, USA
3
Liu S, Huang J, Zhou J, Chen S, Zheng W, Liu C, Lin Q, Zhang P, Wu D, He S, Ye J, Liu S, Zhou K, Li B, Qu L, Yang J. NAP-seq reveals multiple classes of structured noncoding RNAs with regulatory functions. Nat Commun 2024; 15:2425. [PMID: 38499544 PMCID: PMC10948791 DOI: 10.1038/s41467-024-46596-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/11/2022] [Accepted: 03/04/2024] [Indexed: 03/20/2024] Open
Abstract
Up to 80% of the human genome produces "dark matter" RNAs, most of which are noncapped RNAs (napRNAs) that frequently act as noncoding RNAs (ncRNAs) to modulate gene expression. Here, by developing a method, NAP-seq, to globally profile the full-length sequences of napRNAs with various terminal modifications at single-nucleotide resolution, we reveal diverse classes of structured ncRNAs. We discover stably expressed linear intron RNAs (sliRNAs), a class of snoRNA-intron RNAs (snotrons), a class of RNAs embedded in miRNA spacers (misRNAs) and thousands of previously uncharacterized structured napRNAs in humans and mice. These napRNAs undergo dynamic changes in response to various stimuli and differentiation stages. Importantly, we show that a structured napRNA regulates myoblast differentiation and a napRNA DINAP interacts with dyskerin pseudouridine synthase 1 (DKC1) to promote cell proliferation by maintaining DKC1 protein stability. Our approach establishes a paradigm for discovering various classes of ncRNAs with regulatory functions.
Affiliation(s)
- Shurong Liu
- MOE Key Laboratory of Gene Function and Regulation, State Key Laboratory of Biocontrol, School of Life Sciences, Sun Yat-sen University, Guangzhou, 510275, Guangdong, China
- Junhong Huang
- MOE Key Laboratory of Gene Function and Regulation, State Key Laboratory of Biocontrol, School of Life Sciences, Sun Yat-sen University, Guangzhou, 510275, Guangdong, China
- The Fifth Affiliated Hospital, Sun Yat-sen University, Zhuhai, 519082, Guangdong, China
- Jie Zhou
- MOE Key Laboratory of Gene Function and Regulation, State Key Laboratory of Biocontrol, School of Life Sciences, Sun Yat-sen University, Guangzhou, 510275, Guangdong, China
- Siyan Chen
- MOE Key Laboratory of Gene Function and Regulation, State Key Laboratory of Biocontrol, School of Life Sciences, Sun Yat-sen University, Guangzhou, 510275, Guangdong, China
- The Fifth Affiliated Hospital, Sun Yat-sen University, Zhuhai, 519082, Guangdong, China
- Wujian Zheng
- MOE Key Laboratory of Gene Function and Regulation, State Key Laboratory of Biocontrol, School of Life Sciences, Sun Yat-sen University, Guangzhou, 510275, Guangdong, China
- Chang Liu
- MOE Key Laboratory of Gene Function and Regulation, State Key Laboratory of Biocontrol, School of Life Sciences, Sun Yat-sen University, Guangzhou, 510275, Guangdong, China
- Qiao Lin
- MOE Key Laboratory of Gene Function and Regulation, State Key Laboratory of Biocontrol, School of Life Sciences, Sun Yat-sen University, Guangzhou, 510275, Guangdong, China
- Ping Zhang
- MOE Key Laboratory of Gene Function and Regulation, State Key Laboratory of Biocontrol, School of Life Sciences, Sun Yat-sen University, Guangzhou, 510275, Guangdong, China
- Di Wu
- MOE Key Laboratory of Gene Function and Regulation, State Key Laboratory of Biocontrol, School of Life Sciences, Sun Yat-sen University, Guangzhou, 510275, Guangdong, China
- The Fifth Affiliated Hospital, Sun Yat-sen University, Zhuhai, 519082, Guangdong, China
- Simeng He
- The Fifth Affiliated Hospital, Sun Yat-sen University, Zhuhai, 519082, Guangdong, China
- Jiayi Ye
- MOE Key Laboratory of Gene Function and Regulation, State Key Laboratory of Biocontrol, School of Life Sciences, Sun Yat-sen University, Guangzhou, 510275, Guangdong, China
- Shun Liu
- Department of Chemistry, The University of Chicago, Chicago, IL, 60637, USA
- Keren Zhou
- Department of Systems Biology, Beckman Research Institute of City of Hope, Monrovia, CA, 91016, USA
- Bin Li
- MOE Key Laboratory of Gene Function and Regulation, State Key Laboratory of Biocontrol, School of Life Sciences, Sun Yat-sen University, Guangzhou, 510275, Guangdong, China.
- Lianghu Qu
- MOE Key Laboratory of Gene Function and Regulation, State Key Laboratory of Biocontrol, School of Life Sciences, Sun Yat-sen University, Guangzhou, 510275, Guangdong, China.
- Jianhua Yang
- MOE Key Laboratory of Gene Function and Regulation, State Key Laboratory of Biocontrol, School of Life Sciences, Sun Yat-sen University, Guangzhou, 510275, Guangdong, China.
- The Fifth Affiliated Hospital, Sun Yat-sen University, Zhuhai, 519082, Guangdong, China.
4
Halim-Fikri H, Syed-Hassan SNRK, Wan-Juhari WK, Assyuhada MGSN, Hernaningsih Y, Yusoff NM, Merican AF, Zilfalil BA. Central resources of variant discovery and annotation and its role in precision medicine. Asian Biomed 2022; 16:285-298. [PMID: 37551357 PMCID: PMC10392146 DOI: 10.2478/abm-2022-0032] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/09/2023]
Abstract
Rapid technological advancement in high-throughput genomics, microarray, and deep sequencing technologies has made more complex precision medicine research possible, drawing on large amounts of heterogeneous health-related data from patients, including genomic variants. Genomic variants can be identified and annotated against the reference human genome, either within the sequence as a whole or within a putative functional genomic element. The American College of Medical Genetics and Genomics (ACMG) and the Association for Molecular Pathology (AMP) jointly created standards and guidelines for evaluating evidence, with the aim of improving consistency and transparency in clinical variant interpretation. Many national and international public databases that classify and annotate genomic variation have facilitated efforts toward precision medicine. The present study highlights several resources for identifying and sharing data on clinically important genetic variants.
Affiliation(s)
- Hashim Halim-Fikri
- Malaysian Node of the Human Variome Project, School of Medical Sciences, Universiti Sains Malaysia, Kelantan 16150, Malaysia
- Wan-Khairunnisa Wan-Juhari
- Malaysian Node of the Human Variome Project, School of Medical Sciences, Universiti Sains Malaysia, Kelantan 16150, Malaysia
- Human Genome Centre, School of Medical Sciences, Universiti Sains Malaysia, Kelantan 16150, Malaysia
- Mat Ghani Siti Nor Assyuhada
- Malaysian Node of the Human Variome Project, School of Medical Sciences, Universiti Sains Malaysia, Kelantan 16150, Malaysia
- Yetti Hernaningsih
- Department of Clinical Pathology, Faculty of Medicine Universitas Airlangga, Dr. Soetomo Academic General Hospital, Surabaya, Indonesia
- Narazah Mohd Yusoff
- Department of Clinical Pathology, Faculty of Medicine Universitas Airlangga, Dr. Soetomo Academic General Hospital, Surabaya, Indonesia
- Clinical Diagnostic Laboratory, Advanced Medical and Dental Institute, Universiti Sains Malaysia, Penang 13200, Malaysia
- Amir Feisal Merican
- Institute of Biological Sciences, Faculty of Science, University of Malaya, Kuala Lumpur 50603, Malaysia
- Center of Research for Computational Sciences and Informatics in Biology, Bio Industry, Environment, Agriculture and Healthcare (CRYSTAL), University of Malaya, Kuala Lumpur 50603, Malaysia
- Bin Alwi Zilfalil
- Malaysian Node of the Human Variome Project, School of Medical Sciences, Universiti Sains Malaysia, Kelantan 16150, Malaysia
- Human Genome Centre, School of Medical Sciences, Universiti Sains Malaysia, Kelantan 16150, Malaysia
5
Medvedev KE, Savelyeva AV, Chen KS, Bagrodia A, Jia L, Grishin NV. Integrated Molecular Analysis Reveals 2 Distinct Subtypes of Pure Seminoma of the Testis. Cancer Inform 2022; 21:11769351221132634. [PMID: 36330202 PMCID: PMC9623390 DOI: 10.1177/11769351221132634] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/24/2022] [Accepted: 09/24/2022] [Indexed: 11/07/2022] Open
Abstract
Objective: Testicular germ cell tumors (TGCT) are the most common solid malignancy in adolescent and young men, with a rising incidence over the past 20 years. Overall, TGCTs are second in terms of the average life years lost per person dying of cancer, and clinical therapeutics without adverse long-term side effects are lacking. Platinum-based regimens for TGCTs have heterogeneous outcomes even within the same histotype, which frequently leads to under- and over-treatment. Understanding the molecular differences that lead to diverse outcomes in TGCT patients may improve current treatment approaches. Seminoma is the most common subtype of TGCTs, which can either be pure or present in combination with other histotypes. Methods: We conducted a computational study of 64 pure seminoma samples from The Cancer Genome Atlas, applied a consensus clustering approach to their transcriptomic data, and revealed 2 clinically relevant seminoma subtypes: seminoma subtype 1 and 2. Results: Our analysis identified significant differences between the subtypes in pluripotency stage, activity of double-stranded DNA break repair mechanisms, rates of loss of heterozygosity, and expression of lncRNA responsible for cisplatin resistance. Seminoma subtype 1 is characterized by a higher pluripotency state, while subtype 2 showed attributes of reprogramming into non-seminomatous TGCT. The seminoma subtypes we identified may provide a molecular underpinning for variable responses to chemotherapy and radiation. Conclusion: Translating our findings into clinical care may help improve risk stratification of seminoma, decrease overtreatment rates, and increase long-term quality of life for TGCT survivors.
Affiliation(s)
- Kirill E Medvedev
- Department of Biophysics, University of Texas Southwestern Medical Center, Dallas, TX, USA
- Kirill E Medvedev, Department of Biophysics, University of Texas Southwestern Medical Center, 5323 Harry Hines Blvd, Dallas, TX 75390, USA.
- Anna V Savelyeva
- Department of Urology, University of Texas Southwestern Medical Center, Dallas, TX, USA
- Kenneth S Chen
- Department of Pediatrics, University of Texas Southwestern Medical Center, Dallas, TX, USA
- Children’s Medical Center Research Institute, University of Texas Southwestern Medical Center, Dallas, TX, USA
- Aditya Bagrodia
- Department of Urology, University of Texas Southwestern Medical Center, Dallas, TX, USA
- Department of Urology, University of California San Diego Health, La Jolla, CA, USA
- Liwei Jia
- Department of Pathology, University of Texas Southwestern Medical Center, Dallas, TX, USA
- Nick V Grishin
- Department of Biophysics, University of Texas Southwestern Medical Center, Dallas, TX, USA
- Department of Biochemistry, University of Texas Southwestern Medical Center, Dallas, TX, USA
6
Garcia C, Furtado de Almeida AA, Costa M, Britto D, Correa F, Mangabeira P, Silva L, Silva J, Royaert S, Marelli JP. Single-base resolution methylomes of somatic embryogenesis in Theobroma cacao L. reveal epigenome modifications associated with somatic embryo abnormalities. Sci Rep 2022; 12:15097. [PMID: 36064870 PMCID: PMC9445004 DOI: 10.1038/s41598-022-18035-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/12/2021] [Accepted: 08/04/2022] [Indexed: 11/09/2022] Open
Abstract
Propagation of Theobroma cacao by somatic embryogenesis still faces unresolved problems: many morphologically abnormal somatic embryos that do not germinate into plants are frequently observed, hampering plant production on a commercial scale. For the first time, the methylome landscape of T. cacao somatic embryogenesis was examined using whole-genome bisulfite sequencing, with the aim of understanding the epigenetic basis of somatic embryo abnormalities. We identified 873 differentially methylated genes (DMGs) in the CpG context among zygotic embryos, normal somatic embryos, and abnormal somatic embryos, with important roles in development, programmed cell death, oxidative stress, and hypoxia induction, which can help to explain the morphological abnormalities of somatic embryos. We also identified the role of ethylene and its precursor 1-aminocyclopropane-1-carboxylate in several biological processes, such as hypoxia induction, cell differentiation, and cell polarity, that could be associated with the development of abnormal somatic embryos. The biological processes and the hypothesized involvement of ethylene and its precursor in somatic embryo abnormalities in cacao are discussed.
Affiliation(s)
- Marcio Costa
- Department of Biological Sciences, State University of Santa Cruz, Ilhéus, Brazil
- Fabio Correa
- Department of Statistics, Rhodes University, Makhanda, South Africa
- Pedro Mangabeira
- Department of Biological Sciences, State University of Santa Cruz, Ilhéus, Brazil
- Jose Silva
- Department of Biological Sciences, State University of Santa Cruz, Ilhéus, Brazil
7
Kim H, Momen-Heravi F, Chen S, Hoffmann P, Kebschull M, Papapanou PN. Differential DNA methylation and mRNA transcription in gingival tissues in periodontal health and disease. J Clin Periodontol 2021; 48:1152-1164. [PMID: 34101221 DOI: 10.1111/jcpe.13504] [Citation(s) in RCA: 20] [Impact Index Per Article: 6.7] [Received: 02/07/2021] [Revised: 04/13/2021] [Accepted: 05/14/2021] [Indexed: 12/25/2022]
Abstract
AIM We investigated differential DNA methylation in gingival tissues in periodontal health, gingivitis, and periodontitis, and its association with differential mRNA expression. MATERIALS AND METHODS Gingival tissues were harvested from individuals and sites with clinically healthy and intact periodontium, gingivitis, and periodontitis. Samples were processed for differential DNA methylation and mRNA expression using the Illumina EPIC (850 K) and the Illumina HiSeq 2000 platforms, respectively. Across the three phenotypes, we identified differentially methylated CpG sites and regions, differentially expressed genes (DEGs), and genes with concomitant differential promoter methylation and expression. The findings were validated against our earlier datasets generated with Affymetrix HG-U133 Plus 2.0 microarrays and Illumina (450 K) methylation arrays. RESULTS We observed 43,631 differentially methylated positions (DMPs) between periodontitis and health, and 536 DMPs between gingivitis and health (FDR < 0.05). On the mRNA level, statistically significant DEGs were observed only between periodontitis and health (n = 126). Twelve DEGs between periodontitis and health (DCC, KCNA3, KCNA2, RIMS2, HOXB7, PNOC, IRX1, JSRP1, TBX1, OPCML, CECR1, SCN4B) were also differentially methylated between the two phenotypes. Spearman correlations between methylation and expression in the EPIC/mRNAseq dataset were largely replicated in the 450 K/Affymetrix datasets. CONCLUSIONS Concomitant study of DNA methylation and gene expression patterns may identify genes whose expression is epigenetically regulated in periodontitis.
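The DMP counts in this abstract are reported at FDR < 0.05. The abstract does not say which FDR procedure was used, so the following is only a minimal sketch assuming the widely used Benjamini-Hochberg adjustment; the function name and the p-values are hypothetical and purely illustrative.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical per-CpG p-values (not taken from the study).
p = [0.001, 0.008, 0.039, 0.041, 0.20]
q = benjamini_hochberg(p)
# Positions whose adjusted p-value passes the FDR < 0.05 threshold.
significant = [i for i, value in enumerate(q) if value < 0.05]
```

In this toy input only the first two positions pass the threshold, although four raw p-values are below 0.05, which is the effect the adjustment is meant to have.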
Affiliation(s)
- Hyunjin Kim
- Biomedical Informatics Shared Resource, Herbert Irving Comprehensive Cancer Center and Department of Systems Biology, Columbia University, New York, New York, USA
- Department of Immunology, St. Jude Children's Research Hospital, Memphis, Tennessee, USA
- Fatemeh Momen-Heravi
- Division of Periodontics, Section of Oral, Diagnostic and Rehabilitation Sciences, College of Dental Medicine, Columbia University, New York, New York, USA
- Steven Chen
- Division of Periodontics, Section of Oral, Diagnostic and Rehabilitation Sciences, College of Dental Medicine, Columbia University, New York, New York, USA
- Per Hoffmann
- Institute of Human Genetics, University of Bonn, Bonn, Germany
- Moritz Kebschull
- Division of Periodontics, Section of Oral, Diagnostic and Rehabilitation Sciences, College of Dental Medicine, Columbia University, New York, New York, USA
- School of Dentistry, Institute of Clinical Sciences, University of Birmingham, Birmingham, UK
- Panos N Papapanou
- Division of Periodontics, Section of Oral, Diagnostic and Rehabilitation Sciences, College of Dental Medicine, Columbia University, New York, New York, USA
8
Mano T, Murata K, Kon K, Shimizu C, Ono H, Shi S, Yamada RG, Miyamichi K, Susaki EA, Touhara K, Ueda HR. CUBIC-Cloud provides an integrative computational framework toward community-driven whole-mouse-brain mapping. Cell Reports Methods 2021; 1:100038. [PMID: 35475238 PMCID: PMC9017177 DOI: 10.1016/j.crmeth.2021.100038] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 12/21/2020] [Revised: 03/17/2021] [Accepted: 05/20/2021] [Indexed: 01/18/2023]
Abstract
Recent advancements in tissue clearing technologies have offered unparalleled opportunities for researchers to explore the whole mouse brain at cellular resolution. With the expansion of this experimental technique, however, a scalable and easy-to-use computational tool is in demand to effectively analyze and integrate whole-brain mapping datasets. To that end, here we present CUBIC-Cloud, a cloud-based framework to quantify, visualize, and integrate mouse brain data. CUBIC-Cloud is a fully automated system where users can upload their whole-brain data, run analyses, and publish the results. We demonstrate the generality of CUBIC-Cloud by a variety of applications. First, we investigated the brain-wide distribution of five cell types. Second, we quantified Aβ plaque deposition in Alzheimer's disease model mouse brains. Third, we reconstructed a neuronal activity profile under LPS-induced inflammation by c-Fos immunostaining. Last, we show brain-wide connectivity mapping by pseudotyped rabies virus. Together, CUBIC-Cloud provides an integrative platform to advance scalable and collaborative whole-brain mapping.
Affiliation(s)
- Tomoyuki Mano
- Department of Information Physics and Computing, Graduate School of Information Science and Technology, The University of Tokyo, Bunkyo-ku, Tokyo 113-0033, Japan
- Laboratory for Synthetic Biology, RIKEN Center for Biosystems Dynamics Research, Suita, Osaka 565-5241, Japan
- Ken Murata
- Department of Applied Biological Chemistry, Graduate School of Agricultural and Life Sciences, The University of Tokyo, Bunkyo-ku, Tokyo 113-8657, Japan
- Kazuhiro Kon
- Department of Systems Pharmacology, Graduate School of Medicine, The University of Tokyo, Bunkyo-ku, Tokyo 113-0033, Japan
- Chika Shimizu
- Laboratory for Synthetic Biology, RIKEN Center for Biosystems Dynamics Research, Suita, Osaka 565-5241, Japan
- Hiroaki Ono
- Department of Systems Pharmacology, Graduate School of Medicine, The University of Tokyo, Bunkyo-ku, Tokyo 113-0033, Japan
- Shoi Shi
- Laboratory for Synthetic Biology, RIKEN Center for Biosystems Dynamics Research, Suita, Osaka 565-5241, Japan
- Department of Systems Pharmacology, Graduate School of Medicine, The University of Tokyo, Bunkyo-ku, Tokyo 113-0033, Japan
- Rikuhiro G. Yamada
- Laboratory for Synthetic Biology, RIKEN Center for Biosystems Dynamics Research, Suita, Osaka 565-5241, Japan
- Kazunari Miyamichi
- Department of Applied Biological Chemistry, Graduate School of Agricultural and Life Sciences, The University of Tokyo, Bunkyo-ku, Tokyo 113-8657, Japan
- Etsuo A. Susaki
- Laboratory for Synthetic Biology, RIKEN Center for Biosystems Dynamics Research, Suita, Osaka 565-5241, Japan
- Department of Systems Pharmacology, Graduate School of Medicine, The University of Tokyo, Bunkyo-ku, Tokyo 113-0033, Japan
- Kazushige Touhara
- Department of Applied Biological Chemistry, Graduate School of Agricultural and Life Sciences, The University of Tokyo, Bunkyo-ku, Tokyo 113-8657, Japan
- International Research Center for Neurointelligence (WPI-IRCN), UTIAS, The University of Tokyo, Bunkyo-ku, Tokyo 113-0033, Japan
- Hiroki R. Ueda
- Department of Information Physics and Computing, Graduate School of Information Science and Technology, The University of Tokyo, Bunkyo-ku, Tokyo 113-0033, Japan
- Laboratory for Synthetic Biology, RIKEN Center for Biosystems Dynamics Research, Suita, Osaka 565-5241, Japan
- Department of Systems Pharmacology, Graduate School of Medicine, The University of Tokyo, Bunkyo-ku, Tokyo 113-0033, Japan
9
Rosikiewicz W, Sikora J, Skrzypczak T, Kubiak MR, Makałowska I. Promoter switching in response to changing environment and elevated expression of protein-coding genes overlapping at their 5' ends. Sci Rep 2021; 11:8984. [PMID: 33903630 PMCID: PMC8076222 DOI: 10.1038/s41598-021-87970-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/11/2020] [Accepted: 04/07/2021] [Indexed: 11/09/2022] Open
Abstract
Despite the number of studies focused on sense-antisense transcription, the key question of whether such organization evolved as a regulator of gene expression or is only a byproduct of other regulatory processes has not been resolved to date. In this study, protein-coding sense-antisense gene pairs were analyzed, with a particular focus on pairs overlapping at their 5' ends. Analyses were performed in 73 human transcription start site (TSS) libraries. Our results showed that the overlap between genes is not a stable feature and depends on which TSSs are utilized in a given cell type. An analysis of gene expression did not confirm that overlap between genes causes downregulation of their expression. This observation contradicts earlier findings. In addition, we showed that the switch from one promoter to another, leading to gene overlap, may occur in response to a changing environment of a cell or tissue. We also demonstrated that gene overlap is observed more often in transfected and cancerous cells than in normal tissues. Moreover, utilization of overlapping promoters depends on the particular state of a cell and, at least in some groups of genes, is not merely coincidental.
Affiliation(s)
- Wojciech Rosikiewicz
- Center for Applied Bioinformatics, St. Jude Children's Research Hospital, Memphis, TN, USA
- Jarosław Sikora
- Institute of Human Biology and Evolution, Faculty of Biology, Adam Mickiewicz University, Poznań, Poland
- Tomasz Skrzypczak
- Institute of Human Biology and Evolution, Faculty of Biology, Adam Mickiewicz University, Poznań, Poland
- Center for Advanced Technology, Adam Mickiewicz University, Poznań, Poland
- Magdalena R Kubiak
- Institute of Human Biology and Evolution, Faculty of Biology, Adam Mickiewicz University, Poznań, Poland
- Izabela Makałowska
- Institute of Human Biology and Evolution, Faculty of Biology, Adam Mickiewicz University, Poznań, Poland.
10
Krishnan V, Utiramerur S, Ng Z, Datta S, Snyder MP, Ashley EA. Benchmarking workflows to assess performance and suitability of germline variant calling pipelines in clinical diagnostic assays. BMC Bioinformatics 2021; 22:85. [PMID: 33627090 PMCID: PMC7903625 DOI: 10.1186/s12859-020-03934-3] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 10/21/2019] [Accepted: 12/15/2020] [Indexed: 01/22/2023] Open
Abstract
BACKGROUND Benchmarking the performance of complex analytical pipelines is an essential part of developing Lab Developed Tests (LDT). Reference samples and benchmark calls published by the Genome in a Bottle (GIAB) consortium have enabled the evaluation of analytical methods. The performance of such methods is not uniform across the different genomic regions of interest and variant types. Several benchmarking methods such as hap.py, vcfeval, and vcflib are available to assess the analytical performance characteristics of variant calling algorithms. However, assessing the performance characteristics of an overall LDT assay still requires stringing together several such methods and relies on experienced bioinformaticians to interpret the results. In addition, these methods depend on the hardware, operating system, and other software libraries, making it impossible to reliably repeat the analytical assessment when any of the underlying dependencies change in the assay. Here we present a scalable and reproducible cloud-based benchmarking workflow that is independent of the laboratory, the technician executing the workflow, and the underlying compute hardware, and that can be used to rapidly and continually assess the performance of LDT assays across their regions of interest and reportable range, using a broad set of benchmarking samples. RESULTS The benchmarking workflow was used to evaluate the performance characteristics of secondary analysis pipelines commonly used by Clinical Genomics laboratories in their LDT assays, such as the GATK HaplotypeCaller v3.7 and the SpeedSeq workflow based on FreeBayes v0.9.10. Five reference sample truth sets generated by the Genome in a Bottle (GIAB) consortium, six samples from the Personal Genome Project (PGP), and several samples with validated clinically relevant variants from the Centers for Disease Control were used in this work. The performance characteristics were evaluated and compared for multiple reportable ranges, such as whole exome and the clinical exome. CONCLUSIONS We have implemented a benchmarking workflow for clinical diagnostic laboratories that generates metrics such as specificity, precision, and sensitivity for germline SNPs and InDels within a reportable range using whole exome or genome sequencing data. Combining these benchmarking results with validation using known variants of clinical significance in publicly available cell lines, we were able to establish the performance of variant calling pipelines in a clinical setting.
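The metrics named in this abstract (specificity, precision, sensitivity) follow from standard confusion-matrix definitions. The sketch below is not the authors' workflow; it only illustrates those definitions, and the function name and counts are hypothetical. How true negatives are counted for variant calls (for example, reference positions inside the reportable range) is an assumption the caller would have to settle.

```python
def variant_calling_metrics(tp, fp, fn, tn):
    """Compute sensitivity, precision and specificity from confusion counts."""

    def ratio(numerator, denominator):
        # Return None instead of raising when a denominator is zero.
        return numerator / denominator if denominator else None

    return {
        "sensitivity": ratio(tp, tp + fn),  # recall: share of true variants found
        "precision": ratio(tp, tp + fp),    # share of calls that are true variants
        "specificity": ratio(tn, tn + fp),  # share of non-variant sites left uncalled
    }


# Hypothetical counts for one reportable range (illustrative numbers only).
example = variant_calling_metrics(tp=990, fp=10, fn=10, tn=99000)
```

Returning `None` for an undefined ratio keeps an empty stratum (for instance, a region with no truth-set variants) from aborting a batch run.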
Affiliation(s)
- Vandhana Krishnan
- Department of Genetics, School of Medicine, Stanford University, Stanford, CA, USA
- Stanford Center for Genomics and Personalized Medicine, Stanford University, Palo Alto, CA, USA
- Sowmithri Utiramerur
- Stanford Center for Genomics and Personalized Medicine, Stanford University, Palo Alto, CA, USA
- Clinical Genomics Program, Stanford Health Care, Stanford, CA, USA
- Roche Diagnostics Solutions, Research and Early Development, Pleasanton, CA, USA
- Zena Ng
- Clinical Genomics Program, Stanford Health Care, Stanford, CA, USA
- Somalee Datta
- Stanford Center for Genomics and Personalized Medicine, Stanford University, Palo Alto, CA, USA
- School of Medicine, Research IT - Technology and Digital Solutions, Stanford University, Redwood City, CA, USA
- Michael P Snyder
- Department of Genetics, School of Medicine, Stanford University, Stanford, CA, USA
- Stanford Center for Genomics and Personalized Medicine, Stanford University, Palo Alto, CA, USA
- Euan A Ashley
- Department of Genetics, School of Medicine, Stanford University, Stanford, CA, USA
- Department of Cardiovascular Medicine, Stanford University, Stanford, CA, USA
- Department of Biomedical Data Science, Stanford University, Stanford, CA, USA
| |
|
11
|
Emmer BT, Sherman EJ, Lascuna PJ, Graham SE, Willer CJ, Ginsburg D. Genome-scale CRISPR screening for modifiers of cellular LDL uptake. PLoS Genet 2021; 17:e1009285. [PMID: 33513160 PMCID: PMC7875399 DOI: 10.1371/journal.pgen.1009285] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Received: 08/17/2020] [Revised: 02/10/2021] [Accepted: 11/18/2020] [Indexed: 12/12/2022] Open
Abstract
Hypercholesterolemia is a causal and modifiable risk factor for atherosclerotic cardiovascular disease. A critical pathway regulating cholesterol homeostasis involves the receptor-mediated endocytosis of low-density lipoproteins into hepatocytes, mediated by the LDL receptor. We applied genome-scale CRISPR screening to query the genetic determinants of cellular LDL uptake in HuH7 cells cultured under either lipoprotein-rich or lipoprotein-starved conditions. Candidate LDL uptake regulators were validated through the synthesis and secondary screening of a customized library of gRNA at greater depth of coverage. This secondary screen yielded significantly improved performance relative to the primary genome-wide screen, with better discrimination of internal positive controls, no identification of negative controls, and improved concordance between screen hits at both the gene and gRNA level. We then applied our customized gRNA library to orthogonal screens that tested for the specificity of each candidate regulator for LDL versus transferrin endocytosis, the presence or absence of genetic epistasis with LDLR deletion, the impact of each perturbation on LDLR expression and trafficking, and the generalizability of LDL uptake modifiers across multiple cell types. These findings identified several previously unrecognized genes with putative roles in LDL uptake and suggest mechanisms for their functional interaction with LDLR.
Affiliation(s)
- Brian T. Emmer
- Department of Internal Medicine, University of Michigan, Ann Arbor, Michigan, United States of America
- Life Sciences Institute, University of Michigan, Ann Arbor, Michigan, United States of America
| | - Emily J. Sherman
- Life Sciences Institute, University of Michigan, Ann Arbor, Michigan, United States of America
- Chemical Biology Program, University of Michigan, Ann Arbor, Michigan, United States of America
| | - Paul J. Lascuna
- Life Sciences Institute, University of Michigan, Ann Arbor, Michigan, United States of America
| | - Sarah E. Graham
- Department of Internal Medicine, University of Michigan, Ann Arbor, Michigan, United States of America
| | - Cristen J. Willer
- Department of Internal Medicine, University of Michigan, Ann Arbor, Michigan, United States of America
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, Michigan, United States of America
- Department of Human Genetics, University of Michigan, Ann Arbor, Michigan, United States of America
| | - David Ginsburg
- Department of Internal Medicine, University of Michigan, Ann Arbor, Michigan, United States of America
- Life Sciences Institute, University of Michigan, Ann Arbor, Michigan, United States of America
- Department of Human Genetics, University of Michigan, Ann Arbor, Michigan, United States of America
- Department of Pediatrics and Communicable Diseases, University of Michigan, Ann Arbor, Michigan, United States of America
- Howard Hughes Medical Institute, University of Michigan, Ann Arbor, Michigan, United States of America
| |
|
12
|
Ray M, Sable MN, Sarkar S, Hallur V. Essential interpretations of bioinformatics in COVID-19 pandemic. Meta Gene 2020; 27:100844. [PMID: 33349792 PMCID: PMC7744275 DOI: 10.1016/j.mgene.2020.100844] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 10/18/2020] [Revised: 12/02/2020] [Accepted: 12/14/2020] [Indexed: 02/06/2023] Open
Abstract
The emerging pathogen SARS-CoV-2 has caused the global COVID-19 pandemic. Its novel genetic makeup has created hurdles for biological research, and as a result potential drug and vaccine candidates have not yet been discovered by the scientific community. Meanwhile, bioinformatics has marked milestones in viral research over the last few decades. Bioinformatics tools and techniques have successfully been used to interpret the genomic architecture of this virus. Major in silico approaches, including next-generation sequencing, genome-wide association studies, and computer-aided drug design, have been applied effectively in COVID-19 research and have yielded novel information on SARS-CoV-2. In silico studies have not only sequenced the SARS-CoV-2 genome but also analyzed sequencing errors, evolutionary relationships, genetic variations, and putative drug candidates against SARS-CoV-2 viral genes within a very short time. These results will be valuable for further research on the COVID-19 pandemic and essential for vaccine development against SARS-CoV-2, which will protect public health.
Affiliation(s)
- Manisha Ray
- Department of Pathology & Lab Medicine, All India Institute of Medical Sciences, Bhubaneswar, Odisha 751019, India
| | - Mukund Namdev Sable
- Department of ENT, All India Institute of Medical Sciences, Bhubaneswar, Odisha 751019, India
| | - Saurav Sarkar
- Department of Microbiology, All India Institute of Medical Sciences, Bhubaneswar, Odisha 751019, India
| | - Vinaykumar Hallur
- Department of Microbiology, All India Institute of Medical Sciences, Bhubaneswar, Odisha 751019, India
| |
|
13
|
Wu HC, Cohn BA, Cirillo PM, Santella RM, Terry MB. DDT exposure during pregnancy and DNA methylation alterations in female offspring in the Child Health and Development Study. Reprod Toxicol 2020; 92:138-147. [PMID: 30822522 PMCID: PMC6710160 DOI: 10.1016/j.reprotox.2019.02.010] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Received: 09/17/2018] [Revised: 02/07/2019] [Accepted: 02/25/2019] [Indexed: 12/14/2022]
Abstract
Studies measuring dichlorodiphenyltrichloroethane (DDT) exposure during key windows of susceptibility including the intrauterine period suggest that DDT exposure is associated with breast cancer risk. We hypothesized that prenatal DDT exposure is associated with DNA methylation. Using prospective data from 316 daughters in the Child Health and Development Study, we examined the association between prenatal exposure to DDTs and DNA methylation in blood collected in midlife (mean age: 49 years). To identify differentially methylated regions (DMRs) associated with markers of DDTs (p,p'-DDT and the primary metabolite of p,p'-DDT, p,p'-DDE, and o,p'-DDT, the primary constituents of technical DDT), we measured methylation in 30 genes important to breast cancer. We observed DDT DMRs in three genes, CCDC85A, CYP1A1 and ZFPM2, each of which has been previously implicated in pubertal development and breast cancer susceptibility. These findings suggest prenatal DDT exposure may have life-long consequences through alterations in genes relevant to breast cancer.
Affiliation(s)
- Hui-Chen Wu
- Herbert Irving Comprehensive Cancer Center, Columbia University Medical Center, New York, NY
- Department of Environmental Health Sciences, Mailman School of Public Health of Columbia University, New York, NY
| | - Barbara A. Cohn
- Child Health and Development Studies, Public Health Institute, Berkeley, California
| | - Piera M. Cirillo
- Child Health and Development Studies, Public Health Institute, Berkeley, California
| | - Regina M. Santella
- Herbert Irving Comprehensive Cancer Center, Columbia University Medical Center, New York, NY
- Department of Environmental Health Sciences, Mailman School of Public Health of Columbia University, New York, NY
| | - Mary Beth Terry
- Herbert Irving Comprehensive Cancer Center, Columbia University Medical Center, New York, NY
- Department of Environmental Health Sciences, Mailman School of Public Health of Columbia University, New York, NY
- Imprints Center, Columbia University Medical Center, New York, NY
- Department of Epidemiology, Mailman School of Public Health of Columbia University, New York, NY
| |
|
14
|
Zhu Z, Wang Y, Zhou X, Yang L, Meng G, Zhang Z. SWAV: a web-based visualization browser for sliding window analysis. Sci Rep 2020; 10:149. [PMID: 31924845 PMCID: PMC6954255 DOI: 10.1038/s41598-019-57038-x] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 03/14/2019] [Accepted: 12/19/2019] [Indexed: 01/23/2023] Open
Abstract
Sliding window analysis has been extensively applied in evolutionary biology. With the development of the high-throughput DNA sequencing of organisms at the population level, an application that is dedicated to visualizing population genetic test statistics at the genomic level is needed. We have developed the sliding window analysis viewer (SWAV), which is a web-based program that can be used to integrate, view and browse test statistics and perform genome annotation. In addition to browsing, SWAV can mark, generate and customize statistical images and search by sequence alignment, position or gene name. These features facilitate the effectiveness of sliding window analysis. As an example application, yeast and silkworm resequencing data are analyzed with SWAV. The SWAV package, user manual and usage demo are available at http://swav.popgenetics.net.
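For readers unfamiliar with the technique this abstract refers to, a sliding window analysis can be sketched as averaging a per-site statistic over overlapping genomic windows. The sketch below is a generic illustration, not SWAV's implementation; the function name `sliding_window_means`, the half-open window convention, and the example values are assumptions.

```python
def sliding_window_means(sites, seq_length, window, step):
    """Average a per-site statistic over overlapping windows.

    sites: list of (position, value) pairs (hypothetical per-site statistic).
    Windows are half-open intervals [start, start + window), advanced by `step`.
    Returns a list of (start, end, mean), with mean None for empty windows.
    """
    results = []
    for start in range(0, max(seq_length - window, 0) + 1, step):
        end = start + window
        in_window = [value for pos, value in sites if start <= pos < end]
        mean = sum(in_window) / len(in_window) if in_window else None
        results.append((start, end, mean))
    return results


# Illustrative data: four sites on a 40-bp sequence, 20-bp windows, 10-bp step.
sites = [(5, 0.2), (15, 0.4), (25, 0.6), (35, 0.8)]
windows = sliding_window_means(sites, seq_length=40, window=20, step=10)
```

With a step smaller than the window, neighbouring windows share sites, which smooths the plotted statistic at the cost of correlated values between adjacent windows.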
Affiliation(s)
- Zhenglin Zhu
- School of Life Sciences, Chongqing University, No. 55 Daxuecheng South Rd., Shapingba, Chongqing, 401331, China.
| | - Yawang Wang
- School of Life Sciences, Chongqing University, No. 55 Daxuecheng South Rd., Shapingba, Chongqing, 401331, China
- Khoury College of Computer Sciences, Northeastern University, Seattle, 98109, WA, USA
| | - Xichuan Zhou
- The School of Microelectronics and Communication Engineering, Chongqing University, Chongqing, 400044, China
| | - Liuqing Yang
- Department of Medical Ultrasonics, Chongqing Occupational Disease Prevention Hospital, Chongqing, 400060, China
| | - Geng Meng
- College of Veterinary Medicine, China Agricultural University, Beijing, 100094, China.
| | - Ze Zhang
- School of Life Sciences, Chongqing University, No. 55 Daxuecheng South Rd., Shapingba, Chongqing, 401331, China
| |
|
15
|
Shulman ED, Elkon R. Cell-type-specific analysis of alternative polyadenylation using single-cell transcriptomics data. Nucleic Acids Res 2019; 47:10027-10039. [PMID: 31501864 PMCID: PMC6821429 DOI: 10.1093/nar/gkz781] [Citation(s) in RCA: 54] [Impact Index Per Article: 10.8] [Received: 05/05/2019] [Revised: 08/27/2019] [Accepted: 09/01/2019] [Indexed: 12/22/2022] Open
Abstract
Alternative polyadenylation (APA) is emerging as an important layer of gene regulation because the majority of mammalian protein-coding genes contain multiple polyadenylation (pA) sites in their 3′ UTR. By alteration of 3′ UTR length, APA can considerably affect post-transcriptional gene regulation. Yet, our understanding of APA remains rudimentary. Novel single-cell RNA sequencing (scRNA-seq) techniques allow molecular characterization of different cell types to an unprecedented degree. Notably, the most popular scRNA-seq protocols specifically sequence the 3′ end of transcripts. Building on this property, we implemented a method for analysing patterns of APA regulation from such data. Analyzing multiple datasets from diverse tissues, we identified widespread modulation of APA in different cell types resulting in global 3′ UTR shortening/lengthening and enhanced cleavage at intronic pA sites. Our results provide a proof-of-concept demonstration that the huge volume of scRNA-seq data that accumulates in the public domain offers a unique resource for the exploration of APA based on a very broad collection of cell types and biological conditions.
Affiliation(s)
- Eldad David Shulman
- Department of Human Molecular Genetics and Biochemistry, Sackler School of Medicine, Tel Aviv University, Tel Aviv, Israel
| | - Ran Elkon
- Department of Human Molecular Genetics and Biochemistry, Sackler School of Medicine, Tel Aviv University, Tel Aviv, Israel
| |
|
16
|
Formica C, Malas T, Balog J, Verburg L, 't Hoen PAC, Peters DJM. Characterisation of transcription factor profiles in polycystic kidney disease (PKD): identification and validation of STAT3 and RUNX1 in the injury/repair response and PKD progression. J Mol Med (Berl) 2019; 97:1643-1656. [PMID: 31773180 PMCID: PMC6920240 DOI: 10.1007/s00109-019-01852-3] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Received: 05/16/2019] [Revised: 11/01/2019] [Accepted: 11/07/2019] [Indexed: 01/12/2023]
Abstract
Autosomal dominant polycystic kidney disease (ADPKD) is the most common genetic renal disease, caused in the majority of the cases by a mutation in either the PKD1 or the PKD2 gene. ADPKD is characterised by a progressive increase in the number and size of cysts, together with fibrosis and distortion of the renal architecture, over the years. This is accompanied by alterations in a complex network of signalling pathways. However, the underlying molecular mechanisms are not well characterised. Previously, we defined the PKD Signature, a set of genes typically dysregulated in PKD across different disease models from a meta-analysis of expression profiles. Given the importance of transcription factors (TFs) in modulating disease, we focused in this paper on characterising TFs from the PKD Signature. Our results revealed that out of the 1515 genes in the PKD Signature, 92 were TFs with altered expression in PKD, and 32 of those were also implicated in tissue injury/repair mechanisms. Validating the dysregulation of these TFs by qPCR in independent PKD and injury models largely confirmed these findings. STAT3 and RUNX1 displayed the strongest activation in cystic kidneys, as demonstrated by chromatin immunoprecipitation (ChIP) followed by qPCR. Using immunohistochemistry, we showed a dramatic increase of expression after renal injury in mice and cystic renal tissue of mice and humans. Our results suggest a role for STAT3 and RUNX1 and their downstream targets in the aetiology of ADPKD and indicate that the meta-analysis approach is a viable strategy for new target discovery in PKD.
Key messages
- We identified a list of transcription factors (TFs) commonly dysregulated in ADPKD.
- Out of the 92 TFs identified in the PKD Signature, 35% are also involved in injury/repair processes.
- STAT3 and RUNX1 are the most significantly dysregulated TFs after injury and during PKD progression.
- STAT3 and RUNX1 activity is increased in cystic compared to non-cystic mouse kidneys.
- Increased expression of STAT3 and RUNX1 is observed in the nuclei of renal epithelial cells, also in human ADPKD samples.
Electronic supplementary material: The online version of this article (10.1007/s00109-019-01852-3) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Chiara Formica
- Department of Human Genetics, Leiden University Medical Center, Einthovenweg 20, 2333, ZC, Leiden, The Netherlands
| | - Tareq Malas
- Department of Human Genetics, Leiden University Medical Center, Einthovenweg 20, 2333, ZC, Leiden, The Netherlands
| | - Judit Balog
- Department of Human Genetics, Leiden University Medical Center, Einthovenweg 20, 2333, ZC, Leiden, The Netherlands
| | - Lotte Verburg
- Department of Pathology, Leiden University Medical Center, Albinusdreef 2, 2333, ZA, Leiden, The Netherlands
| | - Peter A C 't Hoen
- Department of Human Genetics, Leiden University Medical Center, Einthovenweg 20, 2333, ZC, Leiden, The Netherlands
- Centre for Molecular and Biomolecular Informatics, Radboud Institute for Molecular Life Sciences, Radboud University Medical Center Nijmegen, Geert Grooteplein Zuid 26/28, 6525, GA, Nijmegen, The Netherlands
| | - Dorien J M Peters
- Department of Human Genetics, Leiden University Medical Center, Einthovenweg 20, 2333, ZC, Leiden, The Netherlands.
| |
|
17
|
Vijayabaskar MS, Goode DK, Obier N, Lichtinger M, Emmett AML, Abidin FNZ, Shar N, Hannah R, Assi SA, Lie-A-Ling M, Gottgens B, Lacaud G, Kouskoff V, Bonifer C, Westhead DR. Identification of gene specific cis-regulatory elements during differentiation of mouse embryonic stem cells: An integrative approach using high-throughput datasets. PLoS Comput Biol 2019; 15:e1007337. [PMID: 31682597 PMCID: PMC6855567 DOI: 10.1371/journal.pcbi.1007337] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Received: 12/11/2017] [Revised: 11/14/2019] [Accepted: 08/15/2019] [Indexed: 01/22/2023] Open
Abstract
Gene expression governs cell fate, and is regulated via a complex interplay of transcription factors and molecules that change chromatin structure. Advances in sequencing-based assays have enabled investigation of these processes genome-wide, leading to large datasets that combine information on the dynamics of gene expression, transcription factor binding and chromatin structure as cells differentiate. While numerous studies focus on the effects of these features on broader gene regulation, less work has been done on the mechanisms of gene-specific transcriptional control. In this study, we have focussed on the latter by integrating gene expression data for the in vitro differentiation of murine ES cells to macrophages and cardiomyocytes, with dynamic data on chromatin structure, epigenetics and transcription factor binding. Combining a novel strategy to identify communities of related control elements with a penalized regression approach, we developed individual models to identify the potential control elements predictive of the expression of each gene. Our models were compared to an existing method and evaluated using the existing literature and new experimental data from embryonic stem cell differentiation reporter assays. Our method is able to identify transcriptional control elements in a gene specific manner that reflect known regulatory relationships and to generate useful hypotheses for further testing.
Affiliation(s)
- M. S. Vijayabaskar
- School of Molecular and Cellular Biology, Faculty of Biological Sciences, University of Leeds, Leeds, United Kingdom
| | - Debbie K. Goode
- Wellcome Trust & MRC Cambridge Stem Cell Institute and Cambridge Institute for Medical Research, University of Cambridge, Cambridge, United Kingdom
| | - Nadine Obier
- Institute for Cancer and Genomic Sciences, College of Medical and Dental Sciences, University of Birmingham. Birmingham, United Kingdom
| | - Monika Lichtinger
- Institute for Cancer and Genomic Sciences, College of Medical and Dental Sciences, University of Birmingham. Birmingham, United Kingdom
| | - Amber M. L. Emmett
- School of Molecular and Cellular Biology, Faculty of Biological Sciences, University of Leeds, Leeds, United Kingdom
| | - Fatin N. Zainul Abidin
- School of Molecular and Cellular Biology, Faculty of Biological Sciences, University of Leeds, Leeds, United Kingdom
| | - Nisar Shar
- School of Molecular and Cellular Biology, Faculty of Biological Sciences, University of Leeds, Leeds, United Kingdom
| | - Rebecca Hannah
- Wellcome Trust & MRC Cambridge Stem Cell Institute and Cambridge Institute for Medical Research, University of Cambridge, Cambridge, United Kingdom
| | - Salam A. Assi
- Institute for Cancer and Genomic Sciences, College of Medical and Dental Sciences, University of Birmingham. Birmingham, United Kingdom
| | - Michael Lie-A-Ling
- CRUK Manchester Institute, University of Manchester, Manchester, United Kingdom
| | - Berthold Gottgens
- Wellcome Trust & MRC Cambridge Stem Cell Institute and Cambridge Institute for Medical Research, University of Cambridge, Cambridge, United Kingdom
| | - Georges Lacaud
- CRUK Manchester Institute, University of Manchester, Manchester, United Kingdom
| | - Valerie Kouskoff
- Division of Developmental Biology and Medicine, The University of Manchester, Manchester, United Kingdom
| | - Constanze Bonifer
- Institute for Cancer and Genomic Sciences, College of Medical and Dental Sciences, University of Birmingham. Birmingham, United Kingdom
| | - David R. Westhead
- School of Molecular and Cellular Biology, Faculty of Biological Sciences, University of Leeds, Leeds, United Kingdom
| |
|
18
|
Li P, Shi R, Zhang QC. icSHAPE-pipe: A comprehensive toolkit for icSHAPE data analysis and evaluation. Methods 2019; 178:96-103. [PMID: 31606387 DOI: 10.1016/j.ymeth.2019.09.020] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.4] [Received: 01/23/2019] [Revised: 09/04/2019] [Accepted: 09/30/2019] [Indexed: 01/25/2023] Open
Abstract
RNA molecules have the intrinsic ability to fold into complex structures that are important in regulating many biological processes, including transcription, translation, processing and degradation. However, our knowledge of RNA structures remains very limited. Previously, we developed icSHAPE, a high-throughput method to probe single-stranded RNA nucleotides in cells. To recover the structural profile of an RNA or a transcript by icSHAPE, it is essential to accurately calculate an icSHAPE reactivity as the RNA structure score for each base. Here, we present icSHAPE-pipe, a comprehensive toolkit for the analysis of RNA structure sequencing data obtained from icSHAPE experiments. Compared to the original icSHAPE data processing protocol, icSHAPE-pipe calculates RNA structural information with higher accuracy and achieves higher coverage of the transcriptome. In addition, icSHAPE-pipe can perform quality control, and generate reports on sequencing data and the statistics of results. In sum, icSHAPE-pipe provides a convenient workflow for researchers to analyze RNA structural data from icSHAPE sequencing experiments.
Affiliation(s)
- Pan Li
- MOE Key Laboratory of Bioinformatics, Beijing Advanced Innovation Center for Structural Biology, Center for Synthetic and Systems Biology, Tsinghua-Peking Center for Life Sciences, School of Life Sciences, Tsinghua University, Beijing 100084, China.
| | - Ruoyao Shi
- BioKnow Health Informatics Lab, College of Life Science, Jilin University, Changchun 130012, China.
| | - Qiangfeng Cliff Zhang
- MOE Key Laboratory of Bioinformatics, Beijing Advanced Innovation Center for Structural Biology, Center for Synthetic and Systems Biology, Tsinghua-Peking Center for Life Sciences, School of Life Sciences, Tsinghua University, Beijing 100084, China.
| |
|
19
|
Stansfield JC, Tran D, Nguyen T, Dozmorov MG. R Tutorial: Detection of Differentially Interacting Chromatin Regions From Multiple Hi-C Datasets. CURRENT PROTOCOLS IN BIOINFORMATICS 2019; 66:e76. [PMID: 31125519 PMCID: PMC6588411 DOI: 10.1002/cpbi.76] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Indexed: 12/25/2022]
Abstract
The three-dimensional (3D) interactions of chromatin regulate cell-type-specific gene expression, recombination, X-chromosome inactivation, and many other genomic processes. High-throughput chromatin conformation capture (Hi-C) technologies capture the structure of the chromatin on a global scale by measuring all-vs.-all interactions and can provide new insights into genomic regulation. The workflow presented here describes how to analyze and interpret a comparative Hi-C experiment. We describe the process of obtaining Hi-C data from public repositories and give suggestions for pre-processing pipelines for users who intend to analyze their own raw data. We then describe the data normalization and comparative analysis process. We present three protocols describing the use of the multiHiCcompare, diffHic, and FIND R packages, respectively, to perform a comparative analysis of Hi-C experiments. Finally, visualization of the results and downstream interpretation of the differentially interacting regions are discussed. The bulk of this tutorial uses the R programming environment, and the processes described can be performed with most operating systems and a single computer. © 2019 by John Wiley & Sons, Inc.
Affiliation(s)
- John C. Stansfield
- Dept. of Biostatistics, Virginia Commonwealth University, Richmond, VA, 23298, USA
| | - Duc Tran
- Dept. of Computer Science & Engineering, University of Nevada, Reno, NV, 89557, USA
| | - Tin Nguyen
- Dept. of Computer Science & Engineering, University of Nevada, Reno, NV, 89557, USA
| | - Mikhail G. Dozmorov
- Dept. of Biostatistics, Virginia Commonwealth University, Richmond, VA, 23298, USA
| |
|
20
|
Memon D, Bi J, Miller CJ. In silico prediction of housekeeping long intergenic non-coding RNAs reveals HKlincR1 as an essential player in lung cancer cell survival. Sci Rep 2019; 9:7372. [PMID: 31089191 PMCID: PMC6517443 DOI: 10.1038/s41598-019-43758-7] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 01/28/2019] [Accepted: 04/29/2019] [Indexed: 12/27/2022] Open
Abstract
Prioritising long intergenic noncoding RNAs (lincRNAs) for functional characterisation is a significant challenge. Here we applied computational approaches to discover lincRNAs expected to play a critical housekeeping (HK) role within the cell. Using the Illumina Human BodyMap RNA sequencing dataset as a starting point, we first identified lincRNAs ubiquitously expressed across a panel of human tissues. This list was then further refined by reference to conservation score, secondary structure and promoter DNA methylation status. Finally, we used tumour expression and copy number data to identify lincRNAs rarely downregulated or deleted in multiple tumour types. The resulting list of candidate essential lincRNAs was then subjected to co-expression analyses using independent data from ENCODE and The Cancer Genome Atlas (TCGA). This identified a substantial subset with a predicted role in DNA replication and cell cycle regulation. One of these, HKlincR1, was selected for further characterisation. Depletion of HKlincR1 affected cell growth in multiple lung cancer cell lines, and led to disruption of genes involved in cell growth and viability. In addition, HKlincR1 expression was correlated with overall survival in lung adenocarcinoma patients. Our in silico studies therefore reveal a set of housekeeping noncoding RNAs of interest both in terms of their role in normal homeostasis, and their relevance in tumour growth and maintenance.
Affiliation(s)
- Danish Memon
- RNA Biology Group, CRUK Manchester Institute, The University of Manchester, Alderley Park, Manchester, SK10 4TG, UK
- European Bioinformatics Institute (EMBL-EBI)/Cancer Research UK Cambridge Institute, The University of Cambridge, Cambridge, UK
| | - Jing Bi
- RNA Biology Group, CRUK Manchester Institute, The University of Manchester, Alderley Park, Manchester, SK10 4TG, UK
| | - Crispin J Miller
- RNA Biology Group, CRUK Manchester Institute, The University of Manchester, Alderley Park, Manchester, SK10 4TG, UK.
| |
|
21
|
Michaelovsky E, Carmel M, Frisch A, Salmon-Divon M, Pasmanik-Chor M, Weizman A, Gothelf D. Risk gene-set and pathways in 22q11.2 deletion-related schizophrenia: a genealogical molecular approach. Transl Psychiatry 2019; 9:15. [PMID: 30710087 PMCID: PMC6358611 DOI: 10.1038/s41398-018-0354-9] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Received: 05/03/2018] [Revised: 12/05/2018] [Accepted: 12/10/2018] [Indexed: 11/15/2022] Open
Abstract
The 22q11.2 deletion is a strong, but insufficient, "first hit" genetic risk factor for schizophrenia (SZ). We attempted to identify "second hits" from the entire genome in a unique multiplex 22q11.2 deletion syndrome (DS) family. Bioinformatic analysis of whole-exome sequencing and comparative-genomic hybridization array identified de novo and inherited, rare and damaging variants, including copy number variations, outside the 22q11.2 region. A specific 22q11.2-haplotype was associated with psychosis. The interaction of the identified "second hits" with the 22q11.2 haploinsufficiency may affect neurodevelopmental processes, including neuron projection, cytoskeleton activity, and histone modification in 22q11.2DS-related psychosis. A larger load of variants, involved in neurodevelopment, in combination with additional molecular events that affect sensory perception, olfactory transduction and G-protein-coupled receptor signaling may account for the development of 22q11.2DS-related SZ. Comprehensive analysis of multiplex families is a promising approach to the elucidation of the molecular pathophysiology of 22q11.2DS-related SZ with potential relevance to treatment.
Affiliation(s)
- Elena Michaelovsky
- Sackler Faculty of Medicine, Tel Aviv University, Tel Aviv, Israel.
- Felsenstein Medical Research Center, Petah Tikva, Israel.
| | - Miri Carmel
- Sackler Faculty of Medicine, Tel Aviv University, Tel Aviv, Israel
- Felsenstein Medical Research Center, Petah Tikva, Israel
| | - Amos Frisch
- Sackler Faculty of Medicine, Tel Aviv University, Tel Aviv, Israel
- Felsenstein Medical Research Center, Petah Tikva, Israel
| | | | - Metsada Pasmanik-Chor
- Bioinformatics Unit, G.S. Wise Faculty of Life Science, Tel Aviv University, Tel Aviv, Israel
| | - Abraham Weizman
- Sackler Faculty of Medicine, Tel Aviv University, Tel Aviv, Israel
- Felsenstein Medical Research Center, Petah Tikva, Israel
- Geha Mental Health Center, Petah Tikva, Israel
- Sagol School of Neuroscience, Tel Aviv University, Tel Aviv, Israel
| | - Doron Gothelf
- Sackler Faculty of Medicine, Tel Aviv University, Tel Aviv, Israel
- Sagol School of Neuroscience, Tel Aviv University, Tel Aviv, Israel
- The Behavioral Neurogenetics Center, Sheba Medical Center, Tel Hashomer, Israel
| |
|
22
|
de la Rosa JV, Ramón-Vázquez A, Tabraue C, Castrillo A. Analysis of LXR Nuclear Receptor Cistrome Through ChIP-Seq Data Bioinformatics. Methods Mol Biol 2019; 1951:99-109. [PMID: 30825147 DOI: 10.1007/978-1-4939-9130-3_8] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Indexed: 06/09/2023]
Abstract
Liver X receptors are members of the nuclear receptor superfamily of transcription factors. The LXR genes (NR1H2 and NR1H3) encode two different proteins referred to as LXRα and LXRβ. Each LXR presents diverse tissue distribution but similar target DNA-binding elements and ligands. Both LXRs act as relevant transcriptional regulators of cholesterol metabolism in many tissues. Additionally, LXRs participate in innate immunity and inflammation. Therefore, in order to understand the molecular requirements that operate in LXR-dependent transcription, it is important to decipher LXR genomic binding properties. We have recently performed genome-wide binding analysis of LXR proteins. In this method paper, we describe a detailed computational protocol primarily based on the HOMER software package for the analysis of ChIP-seq data.
Affiliation(s)
- Juan Vladimir de la Rosa
- Instituto de Investigaciones Biomédicas "Alberto Sols", Consejo Superior de Investigaciones Científicas (CSIC), Centro Mixto CSIC-Universidad Autónoma de Madrid, Madrid, Spain
- Unidad de Biomedicina IIBM-ULPGC (Unidad Asociada al CSIC), Universidad de Las Palmas de Gran Canaria, Las Palmas, Spain
- Grupo de Investigación Medio Ambiente y Salud (GIMAS), Instituto Universitario de Investigaciones Biomédicas y Sanitarias (IUIBS) de la ULPGC, Las Palmas, Spain
| | - Ana Ramón-Vázquez
- Instituto de Investigaciones Biomédicas "Alberto Sols", Consejo Superior de Investigaciones Científicas (CSIC), Centro Mixto CSIC-Universidad Autónoma de Madrid, Madrid, Spain
- Unidad de Biomedicina IIBM-ULPGC (Unidad Asociada al CSIC), Universidad de Las Palmas de Gran Canaria, Las Palmas, Spain
- Grupo de Investigación Medio Ambiente y Salud (GIMAS), Instituto Universitario de Investigaciones Biomédicas y Sanitarias (IUIBS) de la ULPGC, Las Palmas, Spain
| | - Carlos Tabraue
- Instituto de Investigaciones Biomédicas "Alberto Sols", Consejo Superior de Investigaciones Científicas (CSIC), Centro Mixto CSIC-Universidad Autónoma de Madrid, Madrid, Spain
- Unidad de Biomedicina IIBM-ULPGC (Unidad Asociada al CSIC), Universidad de Las Palmas de Gran Canaria, Las Palmas, Spain
- Grupo de Investigación Medio Ambiente y Salud (GIMAS), Instituto Universitario de Investigaciones Biomédicas y Sanitarias (IUIBS) de la ULPGC, Las Palmas, Spain
| | - Antonio Castrillo
- Instituto de Investigaciones Biomédicas "Alberto Sols", Consejo Superior de Investigaciones Científicas (CSIC), Centro Mixto CSIC-Universidad Autónoma de Madrid, Madrid, Spain.
- Unidad de Biomedicina IIBM-ULPGC (Unidad Asociada al CSIC), Universidad de Las Palmas de Gran Canaria, Las Palmas, Spain.
- Grupo de Investigación Medio Ambiente y Salud (GIMAS), Instituto Universitario de Investigaciones Biomédicas y Sanitarias (IUIBS) de la ULPGC, Las Palmas, Spain.
| |
|
23
|
Guerrero Flórez M, Guerrero Gómez OA, Mena Huertas J, Yépez Chamorro MC. Mapping of microRNAs related to cervical cancer in Latin American human genomic variants. F1000Res 2018; 6:946. [PMID: 37766816 PMCID: PMC10521080 DOI: 10.12688/f1000research.10138.2] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Accepted: 09/05/2018] [Indexed: 09/29/2023] Open
Abstract
Background: MicroRNAs are related to human cancers, including cervical cancer (CC) caused by HPV. In 2018, approximately 56,075 cases and 28,252 deaths from this cancer were registered in Latin America and the Caribbean according to GLOBOCAN reports. The main molecular mechanism of HPV in CC is related to integration of viral DNA into the host genome. However, the different variants in the human genome can result in different integration mechanisms, specifically involving microRNAs (miRNAs). Methods: The miRNAs associated with CC were obtained from literature, the miRNA sequences and four human genome variants (HGV) from Latin American populations were obtained from miRBase and 1000 Genomes Browser, respectively. HPV integration sites near cell cycle regulatory genes were identified. miRNAs were mapped on HGV. miRSNPs were identified in the miRNA sequences located at HPV integration sites on the Latin American HGV. Results: Two hundred seventy-two miRNAs associated with CC were identified in 139 reports from different geographic locations. By mapping with Blast-Like Alignment Tool (BLAT), 2028 binding sites were identified from these miRNAs on the human genome (version GRCh38/hg38); 42 miRNAs were located on unique integration sites; and miR-5095, miR-548c-5p and miR-548d-5p were involved with multiple genes related to the cell cycle. Thirty-seven miRNAs were mapped on the Latin American HGV (PUR, MXL, CLM and PEL), but only miR-11-3p, miR-31-3p, miR-107, miR-133a-3p, miR-133a-5p, miR-133b, miR-215-5p, miR-491-3p, miR-548d-5p and miR-944 were conserved. Conclusions: Ten miRNAs were conserved in the four HGV. In the remaining 27 miRNAs, substitutions, deletions or insertions were observed. These variation patterns can imply differentiated mechanisms towards each genomic variant in human populations because of specific genomic patterns and geographic features. These findings may help in determining susceptibility for CC development. Further identification of cellular genes and signalling pathways involved in CC progression could lead to new therapeutic strategies based on miRNAs.
Affiliation(s)
- Milena Guerrero Flórez
- Department of Biology, University of Nariño, Pasto, Nariño, Colombia
- Department of Biology, Center for Health Studies at the University of Nariño (CESUN), University of Nariño, Pasto, Nariño, Colombia
| | - Olivia Alexandra Guerrero Gómez
- Department of Biology, University of Nariño, Pasto, Nariño, Colombia
- Department of Biology, Center for Health Studies at the University of Nariño (CESUN), University of Nariño, Pasto, Nariño, Colombia
| | - Jaqueline Mena Huertas
- Department of Biology, University of Nariño, Pasto, Nariño, Colombia
- Department of Biology, Center for Health Studies at the University of Nariño (CESUN), University of Nariño, Pasto, Nariño, Colombia
| | - María Clara Yépez Chamorro
- Department of Biology, Center for Health Studies at the University of Nariño (CESUN), University of Nariño, Pasto, Nariño, Colombia
| |
|
24
|
Wu HC, Do C, Andrulis IL, John EM, Daly MB, Buys SS, Chung WK, Knight JA, Bradbury AR, Keegan THM, Schwartz L, Krupska I, Miller RL, Santella RM, Tycko B, Terry MB. Breast cancer family history and allele-specific DNA methylation in the legacy girls study. Epigenetics 2018; 13:240-250. [PMID: 29436922 DOI: 10.1080/15592294.2018.1435243] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Indexed: 02/02/2023] Open
Abstract
Family history, a well-established risk factor for breast cancer, can have both genetic and environmental contributions. Shared environment in families as well as epigenetic changes that also may be influenced by shared genetics and environment may also explain familial clustering of cancers. Epigenetic regulation, such as DNA methylation, can change the activity of a DNA segment without a change in the sequence; environmental exposures experienced across the life course can induce such changes. However, genetic-epigenetic interactions, detected as methylation quantitative trait loci (mQTLs; a.k.a. meQTLs) and haplotype-dependent allele-specific methylation (hap-ASM), can also contribute to inter-individual differences in DNA methylation patterns. To identify differentially methylated regions (DMRs) associated with breast cancer susceptibility, we examined differences in white blood cell DNA methylation in 29 candidate genes in 426 girls (ages 6-13 years) from the LEGACY Girls Study, 239 with and 187 without a breast cancer family history (BCFH). We measured methylation by targeted massively parallel bisulfite sequencing (bis-seq) and observed BCFH DMRs in two genes: ESR1 (Δ4.9%, P = 0.003) and SEC16B (Δ3.6%, P = 0.026), each of which has been previously implicated in breast cancer susceptibility and pubertal development. These DMRs showed high inter-individual variability in methylation, suggesting the presence of mQTLs/hap-ASM. Using single nucleotide polymorphism data in the bis-seq amplicon, we found strong hap-ASM in SEC16B (with allele-specific differences ranging from 42% to 74%). These findings suggest that differential methylation in genes relevant to breast cancer susceptibility may be present early in life, and that inherited genetic factors underlie some of these epigenetic differences.
Affiliation(s)
- Hui-Chen Wu
- Herbert Irving Comprehensive Cancer Center, Columbia University Medical Center, New York, NY
- Department of Environmental Health Sciences, Mailman School of Public Health of Columbia University, New York, NY
| | - Catherine Do
- John Theurer Cancer Center, Hackensack University Medical Center, Hackensack, NJ
| | - Irene L Andrulis
- Lunenfeld-Tanenbaum Research Institute, Sinai Health System, Toronto, Ontario
- Departments of Molecular Genetics and Laboratory Medicine and Pathobiology, University of Toronto, Toronto, Ontario, Canada
| | - Esther M John
- Cancer Prevention Institute of California, Fremont, CA
- Department of Health Research & Policy (Epidemiology), and Stanford Cancer Institute, Stanford University School of Medicine, Stanford, CA
| | - Mary B Daly
- Department of Clinical Genetics, Fox Chase Cancer Center, Philadelphia, PA
| | - Saundra S Buys
- Department of Medicine and Huntsman Cancer Institute, University of Utah Health Sciences Center, UT
| | - Wendy K Chung
- Departments of Pediatrics; Department of Medicine, Columbia University College of Physicians and Surgeons, New York, NY
| | - Julia A Knight
- Lunenfeld-Tanenbaum Research Institute, Sinai Health System, Toronto, Ontario
- Dalla Lana School of Public Health, University of Toronto, Toronto
| | - Angela R Bradbury
- Departments of Medicine, Division of Hematology/Oncology, Department of Medical Ethics and Health Policy, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA
| | - Theresa H M Keegan
- Center for Oncology Hematology Outcomes Research and Training (COHORT)
- Division of Hematology and Oncology, University of California Davis School of Medicine, Sacramento, CA
| | - Lisa Schwartz
- Department of Pediatrics, Division of Oncology, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA
- The Children's Hospital of Philadelphia, Philadelphia, PA
| | - Izabela Krupska
- Herbert Irving Comprehensive Cancer Center, Columbia University Medical Center, New York, NY
| | - Rachel L Miller
- Herbert Irving Comprehensive Cancer Center, Columbia University Medical Center, New York, NY
- Departments of Pediatrics; Department of Medicine, Columbia University College of Physicians and Surgeons, New York, NY
| | - Regina M Santella
- Herbert Irving Comprehensive Cancer Center, Columbia University Medical Center, New York, NY
- Department of Environmental Health Sciences, Mailman School of Public Health of Columbia University, New York, NY
| | - Benjamin Tycko
- John Theurer Cancer Center, Hackensack University Medical Center, Hackensack, NJ
- Lombardi Comprehensive Cancer Center, Georgetown University, Washington, DC
| | - Mary Beth Terry
- Herbert Irving Comprehensive Cancer Center, Columbia University Medical Center, New York, NY
- Department of Environmental Health Sciences, Mailman School of Public Health of Columbia University, New York, NY
- Imprints Center, Columbia University Medical Center, New York, NY
- Department of Epidemiology, Mailman School of Public Health of Columbia University, New York, NY
| |
|
25
|
Sonar K, Kabra R, Singh S. TryTransDB: A web-based resource for transport proteins in Trypanosomatidae. Sci Rep 2018; 8:4368. [PMID: 29531295 PMCID: PMC5847535 DOI: 10.1038/s41598-018-22706-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/17/2017] [Accepted: 02/28/2018] [Indexed: 11/17/2022] Open
Abstract
TryTransDB is a web-based resource that stores transport protein data which can be retrieved using a standalone BLAST tool. We have attempted to create an integrated database that can be a one-stop shop for researchers working with transport proteins of the Trypanosomatidae family. TryTransDB (Trypanosomatidae Transport Protein Database) is a comprehensive web-based resource that can run a BLAST search against most of the transport protein sequences (protein and nucleotide) from Trypanosomatidae family organisms. This web resource further allows users to compute a phylogenetic tree by performing multiple sequence alignment (MSA) using the embedded CLUSTALW suite. Also, cross-linking to other databases helps in gathering more information for a given transport protein in a single website.
Affiliation(s)
- Krushna Sonar
- National Centre for Cell Science, NCCS Complex, Ganeshkhind, SP Pune University Campus, Pune, 411007, India
| | - Ritika Kabra
- National Centre for Cell Science, NCCS Complex, Ganeshkhind, SP Pune University Campus, Pune, 411007, India
| | - Shailza Singh
- National Centre for Cell Science, NCCS Complex, Ganeshkhind, SP Pune University Campus, Pune, 411007, India.
| |
|
26
|
Cai B, Li B, Kiga N, Thusberg J, Bergquist T, Chen YC, Niknafs N, Carter H, Tokheim C, Beleva-Guthrie V, Douville C, Bhattacharya R, Yeo HTG, Fan J, Sengupta S, Kim D, Cline M, Turner T, Diekhans M, Zaucha J, Pal LR, Cao C, Yu CH, Yin Y, Carraro M, Giollo M, Ferrari C, Leonardi E, Tosatto SC, Bobe J, Ball M, Hoskins RA, Repo S, Church G, Brenner SE, Moult J, Gough J, Stanke M, Karchin R, Mooney SD. Matching phenotypes to whole genomes: Lessons learned from four iterations of the personal genome project community challenges. Hum Mutat 2017; 38:1266-1276. [PMID: 28544481 PMCID: PMC5645203 DOI: 10.1002/humu.23265] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.6] [Received: 09/13/2016] [Revised: 03/24/2017] [Accepted: 05/17/2017] [Indexed: 01/08/2023]
Abstract
The advent of next-generation sequencing has dramatically decreased the cost for whole-genome sequencing and increased the viability for its application in research and clinical care. The Personal Genome Project (PGP) provides unrestricted access to genomes of individuals and their associated phenotypes. This resource enabled the Critical Assessment of Genome Interpretation (CAGI) to create a community challenge to assess the bioinformatics community's ability to predict traits from whole genomes. In the CAGI PGP challenge, researchers were asked to predict whether an individual had a particular trait or profile based on their whole genome. Several approaches were used to assess submissions, including ROC AUC (area under receiver operating characteristic curve), probability rankings, the number of correct predictions, and statistical significance simulations. Overall, we found that prediction of individual traits is difficult, relying on a strong knowledge of trait frequency within the general population, whereas matching genomes to trait profiles relies heavily upon a small number of common traits including ancestry, blood type, and eye color. When a rare genetic disorder is present, profiles can be matched when one or more pathogenic variants are identified. Prediction accuracy has improved substantially over the last 6 years due to improved methodology and a better understanding of features.
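As an illustration of the ROC AUC metric named in the abstract above, the following is a minimal sketch in Python. It uses the pairwise (Mann-Whitney) formulation of the area under the ROC curve; the function name `roc_auc` and the toy labels and scores are hypothetical and are not taken from the CAGI PGP challenge, whose actual scoring code is not described here.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via the pairwise (Mann-Whitney) identity.

    labels: iterable of 0/1 ground-truth values (1 = individual has the trait).
    scores: iterable of predicted scores, higher meaning "more likely to have it".
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    # For every (positive, negative) pair, credit 1 if the positive is scored
    # higher, 0.5 on a tie, and 0 otherwise; AUC is the mean credit.
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: two individuals with the trait, three without.
labels = [1, 0, 1, 0, 0]
scores = [0.9, 0.2, 0.6, 0.7, 0.1]
auc = roc_auc(labels, scores)
print(round(auc, 4))
```

An AUC of 0.5 corresponds to random ranking and 1.0 to a perfect ranking of trait carriers above non-carriers.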
Affiliation(s)
- Binghuang Cai
- Department of Biomedical Informatics & Medical Education, University of Washington School of Medicine, Seattle, Washington
| | - Biao Li
- The Buck Institute for Research on Aging, Novato, California
| | - Nikki Kiga
- Department of Biomedical Informatics & Medical Education, University of Washington School of Medicine, Seattle, Washington
| | - Janita Thusberg
- The Buck Institute for Research on Aging, Novato, California
| | - Timothy Bergquist
- Department of Biomedical Informatics & Medical Education, University of Washington School of Medicine, Seattle, Washington
| | - Yun-Ching Chen
- Department of Biomedical Engineering and Institute for Computational Medicine, Johns Hopkins University, Baltimore, Maryland
| | - Noushin Niknafs
- Department of Biomedical Engineering and Institute for Computational Medicine, Johns Hopkins University, Baltimore, Maryland
| | - Hannah Carter
- Department of Medicine, Division of Medical Genetics, Institute for Genomic Medicine and Moores Cancer Center, University of California San Diego, La Jolla, California
| | - Collin Tokheim
- Department of Biomedical Engineering and Institute for Computational Medicine, Johns Hopkins University, Baltimore, Maryland
| | - Violeta Beleva-Guthrie
- Department of Biomedical Engineering and Institute for Computational Medicine, Johns Hopkins University, Baltimore, Maryland
| | - Christopher Douville
- Department of Biomedical Engineering and Institute for Computational Medicine, Johns Hopkins University, Baltimore, Maryland
| | - Rohit Bhattacharya
- Department of Computer Science, Institute for Computational Medicine, Johns Hopkins University, Baltimore, Maryland
| | - Hui Ting Grace Yeo
- Department of Biomedical Engineering and Institute for Computational Medicine, Johns Hopkins University, Baltimore, Maryland
| | - Jean Fan
- Department of Biomedical Engineering and Institute for Computational Medicine, Johns Hopkins University, Baltimore, Maryland
| | - Sohini Sengupta
- Department of Biomedical Engineering and Institute for Computational Medicine, Johns Hopkins University, Baltimore, Maryland
| | - Dewey Kim
- Department of Biomedical Engineering and Institute for Computational Medicine, Johns Hopkins University, Baltimore, Maryland
| | - Melissa Cline
- Department of Biomolecular Engineering, University of California, Santa Cruz, California
| | - Tychele Turner
- McKusick-Nathans Institute of Genetic Medicine, Johns Hopkins University, Baltimore, Maryland
| | - Mark Diekhans
- Department of Biomolecular Engineering, University of California, Santa Cruz, California
| | - Jan Zaucha
- Department of Computer Science, University of Bristol, Bristol, UK
- Bristol Centre for Complexity Sciences, University of Bristol, Bristol, UK
| | - Lipika R. Pal
- Institute for Bioscience and Biotechnology Research, University of Maryland, Rockville, Maryland
| | - Chen Cao
- Institute for Bioscience and Biotechnology Research, University of Maryland, Rockville, Maryland
- Computational Biology, Bioinformatics and Genomics, Biological Sciences Graduate Program, University of Maryland, College Park, Maryland
| | - Chen-Hsin Yu
- Institute for Bioscience and Biotechnology Research, University of Maryland, Rockville, Maryland
- Computational Biology, Bioinformatics and Genomics, Biological Sciences Graduate Program, University of Maryland, College Park, Maryland
| | - Yizhou Yin
- Institute for Bioscience and Biotechnology Research, University of Maryland, Rockville, Maryland
- Computational Biology, Bioinformatics and Genomics, Biological Sciences Graduate Program, University of Maryland, College Park, Maryland
| | - Marco Carraro
- Department of Biomedical Sciences, University of Padova, Padova, Italy
| | - Manuel Giollo
- Department of Biomedical Sciences, University of Padova, Padova, Italy
- Department of Information Engineering, University of Padova, Padova, Italy
| | - Carlo Ferrari
- Department of Information Engineering, University of Padova, Padova, Italy
| | - Emanuela Leonardi
- Department of Woman and Child Health, University of Padova, Padova, Italy
| | - Silvio C.E. Tosatto
- Department of Biomedical Sciences, University of Padova, Padova, Italy
- CNR Neuroscience Institute, Padova, Italy
| | - Jason Bobe
- PersonalGenomes.org, Boston, Massachusetts
| | | | - Roger A. Hoskins
- Department of Plant and Microbial Biology, University of California, Berkeley, California
| | | | | | - Steven E. Brenner
- Department of Plant and Microbial Biology, University of California, Berkeley, California
| | - John Moult
- Institute for Bioscience and Biotechnology Research, University of Maryland, Rockville, Maryland
- Department of Cell Biology and Molecular Genetics, University of Maryland, College Park, Maryland
| | - Julian Gough
- Bristol Centre for Complexity Sciences, University of Bristol, Bristol, UK
| | - Mario Stanke
- Institute of Mathematics and Computer Science, University of Greifswald, Greifswald, Germany
| | - Rachel Karchin
- Department of Biomedical Engineering and Institute for Computational Medicine, Johns Hopkins University, Baltimore, Maryland
- Department of Oncology, The Johns Hopkins Medical Institutions, Baltimore, Maryland
| | - Sean D. Mooney
- Department of Biomedical Informatics & Medical Education, University of Washington School of Medicine, Seattle, Washington
| |
|
27
|
Read TD, Petit RA, Joseph SJ, Alam MT, Weil MR, Ahmad M, Bhimani R, Vuong JS, Haase CP, Webb DH, Tan M, Dove ADM. Draft sequencing and assembly of the genome of the world's largest fish, the whale shark: Rhincodon typus Smith 1828. BMC Genomics 2017; 18:532. [PMID: 28709399 PMCID: PMC5513125 DOI: 10.1186/s12864-017-3926-9] [Citation(s) in RCA: 50] [Impact Index Per Article: 7.1] [Received: 01/10/2017] [Accepted: 07/06/2017] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND The whale shark (Rhincodon typus) has by far the largest body size of any elasmobranch (shark or ray) species. Therefore, it is also the largest extant species of the paraphyletic assemblage commonly referred to as fishes. As both a phenotypic extreme and a member of the group Chondrichthyes - the sister group to the remaining gnathostomes, which includes all tetrapods and therefore also humans - its genome is of substantial comparative interest. Whale sharks are also listed as an endangered species on the International Union for Conservation of Nature's Red List of threatened species and are of growing popularity as both a target of ecotourism and as a charismatic conservation ambassador for the pelagic ecosystem. A genome map for this species would aid in defining effective conservation units and understanding global population structure. RESULTS We characterised the nuclear genome of the whale shark using next generation sequencing (454, Illumina) and de novo assembly and annotation methods, based on material collected from the Georgia Aquarium. The data set consisted of 878,654,233 reads, which yielded a draft assembly of 1,213,200 contigs and 997,976 scaffolds. The estimated genome size was 3.44Gb. As expected, the proteome of the whale shark was most closely related to the only other complete genome of a cartilaginous fish, the holocephalan elephant shark. The whale shark contained a novel Toll-like-receptor (TLR) protein with sequence similarity to both the TLR4 and TLR13 proteins of mammals and TLR21 of teleosts. The data are publicly available on GenBank, FigShare, and from the NCBI Short Read Archive under accession number SRP044374. 
CONCLUSIONS This represents the first shotgun elasmobranch genome and will aid studies of molecular systematics, biogeography, genetic differentiation, and conservation genetics in this and other shark species, as well as providing comparative data for studies of evolutionary biology and immunology across the jawed vertebrate lineages.
Affiliation(s)
- Timothy D Read
- Department of Medicine, Division of Infectious Diseases, Emory University School of Medicine, 1760 Haygood Drive, Atlanta, GA, 30322, USA
- Department of Human Genetics, Emory University School of Medicine, 1760 Haygood Drive, Atlanta, GA, 30322, USA
| | - Robert A Petit
- Department of Medicine, Division of Infectious Diseases, Emory University School of Medicine, 1760 Haygood Drive, Atlanta, GA, 30322, USA
- Department of Human Genetics, Emory University School of Medicine, 1760 Haygood Drive, Atlanta, GA, 30322, USA
| | - Sandeep J Joseph
- Department of Medicine, Division of Infectious Diseases, Emory University School of Medicine, 1760 Haygood Drive, Atlanta, GA, 30322, USA
- Department of Human Genetics, Emory University School of Medicine, 1760 Haygood Drive, Atlanta, GA, 30322, USA
| | - Md Tauqeer Alam
- Department of Medicine, Division of Infectious Diseases, Emory University School of Medicine, 1760 Haygood Drive, Atlanta, GA, 30322, USA
- Department of Human Genetics, Emory University School of Medicine, 1760 Haygood Drive, Atlanta, GA, 30322, USA
| | - M Ryan Weil
- Department of Medicine, Division of Infectious Diseases, Emory University School of Medicine, 1760 Haygood Drive, Atlanta, GA, 30322, USA
- Department of Human Genetics, Emory University School of Medicine, 1760 Haygood Drive, Atlanta, GA, 30322, USA
| | - Maida Ahmad
- Department of Medicine, Division of Infectious Diseases, Emory University School of Medicine, 1760 Haygood Drive, Atlanta, GA, 30322, USA
- Department of Human Genetics, Emory University School of Medicine, 1760 Haygood Drive, Atlanta, GA, 30322, USA
| | - Ravila Bhimani
- Department of Medicine, Division of Infectious Diseases, Emory University School of Medicine, 1760 Haygood Drive, Atlanta, GA, 30322, USA
- Department of Human Genetics, Emory University School of Medicine, 1760 Haygood Drive, Atlanta, GA, 30322, USA
| | - Jocelyn S Vuong
- Department of Medicine, Division of Infectious Diseases, Emory University School of Medicine, 1760 Haygood Drive, Atlanta, GA, 30322, USA
- Department of Human Genetics, Emory University School of Medicine, 1760 Haygood Drive, Atlanta, GA, 30322, USA
| | - Chad P Haase
- Department of Medicine, Division of Infectious Diseases, Emory University School of Medicine, 1760 Haygood Drive, Atlanta, GA, 30322, USA
- Department of Human Genetics, Emory University School of Medicine, 1760 Haygood Drive, Atlanta, GA, 30322, USA
| | - D Harry Webb
- Georgia Aquarium, 225 Baker Street, Atlanta, GA, 30313, USA
| | - Milton Tan
- Department of Medicine, Division of Infectious Diseases, Emory University School of Medicine, 1760 Haygood Drive, Atlanta, GA, 30322, USA
- Department of Human Genetics, Emory University School of Medicine, 1760 Haygood Drive, Atlanta, GA, 30322, USA
| | | |
|
28
|
Koshy R, Ranawat A, Scaria V. al mena: a comprehensive resource of human genetic variants integrating genomes and exomes from Arab, Middle Eastern and North African populations. J Hum Genet 2017. [PMID: 28638141 DOI: 10.1038/jhg.2017.67] [Citation(s) in RCA: 21] [Impact Index Per Article: 3.0] [Indexed: 12/19/2022]
Abstract
The Middle East and North Africa (MENA) encompass unique populations with a rich history and characteristic ethnic, linguistic and genetic diversity. The genetic diversity of the MENA region has been largely unknown. The recent availability of whole-exome and whole-genome sequences from the region has made it possible to collect population-specific allele frequencies. The integration of data sets from this region would provide insights into the landscape of genetic variants in this region. We systematically integrated genetic variants from multiple data sets available from this region to create a compendium of over 26 million genetic variations. The variants were systematically annotated, and their allele frequencies in the data sets were computed and made available through a web interface that enables quick queries. As a proof of principle for application of the compendium for genetic epidemiology, we analyzed the allele frequencies for variants in the transglutaminase 1 (TGM1) gene, associated with autosomal recessive lamellar ichthyosis. Our analysis revealed that the carrier frequency of selected variants differed widely, with significant interethnic differences. To the best of our knowledge, al mena is the first and most comprehensive repertoire of genetic variations from the Arab, Middle Eastern and North African region. We hope al mena will accelerate Precision Medicine in the region.
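The abstract above relates allele frequencies to carrier frequencies for an autosomal recessive condition but does not say how the conversion was done. A common textbook approach, assumed here purely for illustration, is the Hardy-Weinberg relation in which heterozygous carriers occur at frequency 2q(1 - q) for a variant allele frequency q. The function name and the example value are hypothetical and are not taken from the al mena data.

```python
def carrier_frequency(allele_freq):
    """Expected heterozygous carrier frequency under Hardy-Weinberg equilibrium.

    allele_freq: frequency q of the variant allele in the population (0..1).
    Returns 2 * q * (1 - q). This is an assumed model, not necessarily the
    calculation used by the authors of the resource described above.
    """
    if not 0.0 <= allele_freq <= 1.0:
        raise ValueError("allele frequency must lie between 0 and 1")
    q = allele_freq
    return 2.0 * q * (1.0 - q)


# Hypothetical variant observed at an allele frequency of 1%.
example = carrier_frequency(0.01)
print(f"{example:.4f}")
```

Under this assumption a variant at 1% allele frequency implies that roughly 2% of the population are carriers, which is why modest interethnic differences in allele frequency translate into visible differences in carrier frequency.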
Affiliation(s)
- Remya Koshy
- GN Ramachandran Knowledge Center for Genome Informatics, CSIR-Institute of Genomics and Integrative Biology, Delhi, India
| | - Anop Ranawat
- GN Ramachandran Knowledge Center for Genome Informatics, CSIR-Institute of Genomics and Integrative Biology, Delhi, India
| | - Vinod Scaria
- GN Ramachandran Knowledge Center for Genome Informatics, CSIR-Institute of Genomics and Integrative Biology, Delhi, India
- The Academy of Scientific and Innovative Research (AcSIR), CSIR-Institute of Genomics and Integrative Biology, Delhi, India
| |
|
29
|
Sherry-Lynes MM, Sengupta S, Kulkarni S, Cochran BH. Regulation of the JMJD3 (KDM6B) histone demethylase in glioblastoma stem cells by STAT3. PLoS One 2017; 12:e0174775. [PMID: 28384648 PMCID: PMC5383422 DOI: 10.1371/journal.pone.0174775] [Citation(s) in RCA: 37] [Impact Index Per Article: 5.3] [Received: 02/16/2017] [Accepted: 03/15/2017] [Indexed: 01/10/2023] Open
Abstract
The growth factor and cytokine regulated transcription factor STAT3 is required for the self-renewal of several stem cell types including tumor stem cells from glioblastoma. Here we show that STAT3 inhibition leads to the upregulation of the histone H3K27me2/3 demethylase Jmjd3 (KDM6B), which can reverse polycomb complex-mediated repression of tissue specific genes. STAT3 binds to the Jmjd3 promoter, suggesting that Jmjd3 is a direct target of STAT3. Overexpression of Jmjd3 slows glioblastoma stem cell growth and neurosphere formation, whereas knockdown of Jmjd3 rescues the STAT3 inhibitor-induced neurosphere formation defect. Consistent with this observation, STAT3 inhibition leads to histone H3K27 demethylation of neural differentiation genes, such as Myt1, FGF21, and GDF15. These results demonstrate that the regulation of Jmjd3 by STAT3 maintains repression of differentiation specific genes and is therefore important for the maintenance of self-renewal of normal neural and glioblastoma stem cells.
Affiliation(s)
- Maureen M. Sherry-Lynes
- Graduate Program in Cell and Molecular Physiology, Sackler School of Graduate Biomedical Sciences and Dept. of Developmental, Molecular, and Chemical Biology, Tufts University School of Medicine, Boston, MA, United States of America
| | - Sejuti Sengupta
- Graduate Program in Cell and Molecular Physiology, Sackler School of Graduate Biomedical Sciences and Dept. of Developmental, Molecular, and Chemical Biology, Tufts University School of Medicine, Boston, MA, United States of America
| | - Shreya Kulkarni
- Graduate Program in Cell and Molecular Physiology, Sackler School of Graduate Biomedical Sciences and Dept. of Developmental, Molecular, and Chemical Biology, Tufts University School of Medicine, Boston, MA, United States of America
| | - Brent H. Cochran
- Graduate Program in Cell and Molecular Physiology, Sackler School of Graduate Biomedical Sciences and Dept. of Developmental, Molecular, and Chemical Biology, Tufts University School of Medicine, Boston, MA, United States of America
| |
Collapse
|
30
|
Dreos R, Ambrosini G, Groux R, Cavin Périer R, Bucher P. The eukaryotic promoter database in its 30th year: focus on non-vertebrate organisms. Nucleic Acids Res 2016; 45:D51-D55. [PMID: 27899657 PMCID: PMC5210552 DOI: 10.1093/nar/gkw1069] [Citation(s) in RCA: 174] [Impact Index Per Article: 21.8] [Received: 09/15/2016] [Revised: 10/21/2016] [Accepted: 10/24/2016] [Indexed: 01/21/2023] Open
Abstract
We present an update of the Eukaryotic Promoter Database EPD (http://epd.vital-it.ch), more specifically on the EPDnew division, which contains comprehensive organism-specific transcription start site (TSS) collections automatically derived from next generation sequencing (NGS) data. Thanks to the abundant release of new high-throughput transcript mapping data (CAGE, TSS-seq, GRO-cap) the database could be extended to plant and fungal species. We further report on the expansion of the mass genome annotation (MGA) repository containing promoter-relevant chromatin profiling data and on improvements for the EPD entry viewers. Finally, we present a new data access tool, ChIP-Extract, which enables computational biologists to extract diverse types of promoter-associated data in numerical table formats that are readily imported into statistical analysis platforms such as R.
Affiliation(s)
- René Dreos
- Swiss Institute of Bioinformatics (SIB), CH-1015 Lausanne, Switzerland
- Giovanna Ambrosini
- Swiss Institute of Bioinformatics (SIB), CH-1015 Lausanne, Switzerland; Swiss Institute for Experimental Cancer Research (ISREC), School of Life Sciences, Swiss Federal Institute of Technology (EPFL), CH-1015 Lausanne, Switzerland
- Romain Groux
- Swiss Institute of Bioinformatics (SIB), CH-1015 Lausanne, Switzerland
- Philipp Bucher
- Swiss Institute of Bioinformatics (SIB), CH-1015 Lausanne, Switzerland; Swiss Institute for Experimental Cancer Research (ISREC), School of Life Sciences, Swiss Federal Institute of Technology (EPFL), CH-1015 Lausanne, Switzerland
31
Katsila T, Konstantinou E, Lavda I, Malakis H, Papantoni I, Skondra L, Patrinos GP. Pharmacometabolomics-aided Pharmacogenomics in Autoimmune Disease. EBioMedicine 2016; 5:40-5. [PMID: 27077110 PMCID: PMC4816847 DOI: 10.1016/j.ebiom.2016.02.001] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.1] [Received: 10/30/2015] [Revised: 01/30/2016] [Accepted: 02/01/2016] [Indexed: 12/11/2022] Open
Abstract
Inter-individual variability has been a major hurdle to optimizing disease management. Precision medicine holds promise for improving health and healthcare via tailor-made therapeutic strategies. Herein, we outline the paradigm of "pharmacometabolomics-aided pharmacogenomics" in autoimmune diseases. We envisage merging pharmacometabolomic and pharmacogenomic data (to address the interplay of genomic and environmental influences) with information technologies to facilitate data analysis as well as sense- and decision-making on the basis of synergy between artificial and human intelligence. Humans can detect patterns that computer algorithms may fail to detect, whereas data-intensive and cognitively complex settings and processes limit human ability. We propose that better-informed, rapid and cost-effective omics studies need the implementation of holistic and multidisciplinary approaches.
Affiliation(s)
- Theodora Katsila
- University of Patras, School of Health Sciences, Department of Pharmacy, University Campus, Rion, Patras, Greece
32
Liu TT, Arango-Argoty G, Li Z, Lin Y, Kim SW, Dueck A, Ozsolak F, Monaghan AP, Meister G, DeFranco DB, John B. Noncoding RNAs that associate with YB-1 alter proliferation in prostate cancer cells. RNA (New York, N.Y.) 2015; 21:1159-72. [PMID: 25904138 PMCID: PMC4436668 DOI: 10.1261/rna.045559.114] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.7] [Received: 03/26/2014] [Accepted: 03/03/2015] [Indexed: 06/04/2023]
Abstract
The highly conserved, multifunctional YB-1 is a powerful breast cancer prognostic indicator. We report on a pervasive role for YB-1 in which it associates with thousands of nonpolyadenylated short RNAs (shyRNAs) that are further processed into small RNAs (smyRNAs). Many of these RNAs have previously been identified as functional noncoding RNAs (http://www.johnlab.org/YB1). We identified a novel, abundant, 3'-modified short RNA antisense to Dicer1 (Shad1) that colocalizes with YB-1 to P-bodies and stress granules. The expression of Shad1 was shown to correlate with that of YB-1, and its inhibition leads to an increase in cell proliferation. Additionally, Shad1 influences the expression of additional prognostic markers of cancer progression such as DLX2 and IGFBP2. We propose that the examination of these noncoding RNAs could lead to better understanding of prostate cancer progression.
Affiliation(s)
- Teresa T Liu
- Department of Computational and Systems Biology, University of Pittsburgh School of Medicine, Pittsburgh, Pennsylvania 15260, USA
- Gustavo Arango-Argoty
- Department of Computational and Systems Biology, University of Pittsburgh School of Medicine, Pittsburgh, Pennsylvania 15260, USA
- Zhihua Li
- Department of Computational and Systems Biology, University of Pittsburgh School of Medicine, Pittsburgh, Pennsylvania 15260, USA
- Yuefeng Lin
- Department of Computational and Systems Biology, University of Pittsburgh School of Medicine, Pittsburgh, Pennsylvania 15260, USA
- Sang Woo Kim
- Department of Computational and Systems Biology, University of Pittsburgh School of Medicine, Pittsburgh, Pennsylvania 15260, USA
- Anne Dueck
- University of Regensburg, Biochemistry I, 93053 Regensburg, Bavaria, Germany
- Fatih Ozsolak
- Helicos BioSciences Corporation, Cambridge, Massachusetts 02139, USA
- A Paula Monaghan
- Department of Neurobiology, University of Pittsburgh School of Medicine, Pittsburgh, Pennsylvania 15260, USA
- Gunter Meister
- University of Regensburg, Biochemistry I, 93053 Regensburg, Bavaria, Germany
- Donald B DeFranco
- Department of Pharmacology and Chemical Biology, University of Pittsburgh School of Medicine, Pittsburgh, Pennsylvania 15260, USA
- Bino John
- Department of Computational and Systems Biology, University of Pittsburgh School of Medicine, Pittsburgh, Pennsylvania 15260, USA
33
Luisi P, Alvarez-Ponce D, Pybus M, Fares MA, Bertranpetit J, Laayouni H. Recent positive selection has acted on genes encoding proteins with more interactions within the whole human interactome. Genome Biol Evol 2015; 7:1141-54. [PMID: 25840415 PMCID: PMC4419801 DOI: 10.1093/gbe/evv055] [Citation(s) in RCA: 50] [Impact Index Per Article: 5.6] [Indexed: 12/12/2022] Open
Abstract
Genes vary in their likelihood to undergo adaptive evolution. The genomic factors that determine adaptability, however, remain poorly understood. Genes function in the context of molecular networks, with some occupying more important positions than others and thus being likely to be under stronger selective pressures. However, how positive selection distributes across the different parts of molecular networks is still not fully understood. Here, we inferred positive selection using comparative genomics and population genetics approaches through the comparison of 10 mammalian and 270 human genomes, respectively. In agreement with previous results, we found that genes with lower network centralities are more likely to evolve under positive selection (as inferred from divergence data). Surprisingly, polymorphism data yield results in the opposite direction than divergence data: Genes with higher centralities are more likely to have been targeted by recent positive selection during recent human evolution. Our results indicate that the relationship between centrality and the impact of adaptive evolution highly depends on the mode of positive selection and/or the evolutionary time-scale.
Affiliation(s)
- Pierre Luisi
- Institute of Evolutionary Biology, Universitat Pompeu Fabra-CSIC, CEXS-UPF-PRBB, Barcelona, Catalonia, Spain
- David Alvarez-Ponce
- Integrative Systems Biology Group, Instituto de Biología Molecular y Celular de Plantas, Consejo Superior de Investigaciones Científicas (CSIC)-Universidad Politécnica de Valencia (UPV), Spain; Biology Department, University of Nevada, Reno; Institute of Evolutionary Biology, Universitat Pompeu Fabra-CSIC, CEXS-UPF-PRBB, Barcelona, Catalonia, Spain
- Marc Pybus
- Institute of Evolutionary Biology, Universitat Pompeu Fabra-CSIC, CEXS-UPF-PRBB, Barcelona, Catalonia, Spain
- Mario A Fares
- Integrative Systems Biology Group, Instituto de Biología Molecular y Celular de Plantas, Consejo Superior de Investigaciones Científicas (CSIC)-Universidad Politécnica de Valencia (UPV), Spain; Smurfit Institute of Genetics, University of Dublin, Trinity College, Ireland
- Jaume Bertranpetit
- Institute of Evolutionary Biology, Universitat Pompeu Fabra-CSIC, CEXS-UPF-PRBB, Barcelona, Catalonia, Spain
- Hafid Laayouni
- Institute of Evolutionary Biology, Universitat Pompeu Fabra-CSIC, CEXS-UPF-PRBB, Barcelona, Catalonia, Spain; Departament de Genètica i de Microbiologia, Grup de Biologia Evolutiva (GBE), Universitat Autònoma de Barcelona, Bellaterra, Spain
34
Hinske LC, Galante PAF, Limbeck E, Möhnle P, Parmigiani RB, Ohno-Machado L, Camargo AA, Kreth S. Alternative polyadenylation allows differential negative feedback of human miRNA miR-579 on its host gene ZFR. PLoS One 2015; 10:e0121507. [PMID: 25799583 PMCID: PMC4370670 DOI: 10.1371/journal.pone.0121507] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.4] [Received: 09/26/2014] [Accepted: 01/31/2015] [Indexed: 02/02/2023] Open
Abstract
About half of the known miRNA genes are located within protein-coding host genes, and are thus subject to co-transcription. Accumulating data indicate that this coupling may be an intrinsic mechanism to directly regulate the host gene's expression, constituting a negative feedback loop. Inevitably, the cell requires a yet largely unknown repertoire of methods to regulate this control mechanism. We propose APA as one possible mechanism by which negative feedback of intronic miRNA on their host genes might be regulated. Using in-silico analyses, we found that host genes that contain seed matching sites for their intronic miRNAs yield longer 3'UTRs with more polyadenylation sites. Additionally, the distribution of polyadenylation signals differed significantly between these host genes and host genes of miRNAs that do not contain potential miRNA binding sites. We then transferred these in-silico results to a biological example and investigated the relationship between ZFR and its intronic miRNA miR-579 in a U87 cell line model. We found that ZFR is targeted by its intronic miRNA miR-579 and that alternative polyadenylation allows differential targeting. We additionally used bioinformatics analyses and RNA-Seq to evaluate a potential cross-talk between intronic miRNAs and alternative polyadenylation. CPSF2, a gene previously associated with alternative polyadenylation signal recognition, might be linked to intronic miRNA negative feedback by altering polyadenylation signal utilization.
Affiliation(s)
- Ludwig Christian Hinske
- Research Group Molecular Medicine, Department of Anaesthesiology, Clinic of the University of Munich, Munich, Germany
- Elisabeth Limbeck
- Molecular Oncology Center, Sírio Libanês Hospital, São Paulo, Brazil
- Patrick Möhnle
- Research Group Molecular Medicine, Department of Anaesthesiology, Clinic of the University of Munich, Munich, Germany
- Lucila Ohno-Machado
- Division of Biomedical Informatics, University of California San Diego, La Jolla, California, United States of America
- Simone Kreth
- Research Group Molecular Medicine, Department of Anaesthesiology, Clinic of the University of Munich, Munich, Germany
35
Wang R, Perez-Riverol Y, Hermjakob H, Vizcaíno JA. Open source libraries and frameworks for biological data visualisation: a guide for developers. Proteomics 2015; 15:1356-74. [PMID: 25475079 PMCID: PMC4409855 DOI: 10.1002/pmic.201400377] [Citation(s) in RCA: 34] [Impact Index Per Article: 3.8] [Received: 08/07/2014] [Revised: 10/21/2014] [Accepted: 11/26/2014] [Indexed: 12/21/2022]
Abstract
Recent advances in high-throughput experimental techniques have led to an exponential increase in both the size and the complexity of the data sets commonly studied in biology. Data visualisation is increasingly used as the key to unlock this data, going from hypothesis generation to model evaluation and tool implementation. It is becoming more and more the heart of bioinformatics workflows, enabling scientists to reason and communicate more effectively. In parallel, there has been a corresponding trend towards the development of related software, which has triggered the maturation of different visualisation libraries and frameworks. For bioinformaticians, scientific programmers and software developers, the main challenge is to pick out the most fitting one(s) to create clear, meaningful and integrated data visualisation for their particular use cases. In this review, we introduce a collection of open source or free to use libraries and frameworks for creating data visualisation, covering the generation of a wide variety of charts and graphs. We will focus on software written in Java, JavaScript or Python. We truly believe this software offers the potential to turn tedious data into exciting visual stories.
Affiliation(s)
- Rui Wang
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge, UK
36
Wang B, Ekblom R, Bunikis I, Siitari H, Höglund J. Whole genome sequencing of the black grouse (Tetrao tetrix): reference guided assembly suggests faster-Z and MHC evolution. BMC Genomics 2014; 15:180. [PMID: 24602261 PMCID: PMC4022176 DOI: 10.1186/1471-2164-15-180] [Citation(s) in RCA: 35] [Impact Index Per Article: 3.5] [Received: 06/21/2013] [Accepted: 02/26/2014] [Indexed: 01/19/2023] Open
Abstract
BACKGROUND The different regions of a genome do not evolve at the same rate. For example, comparative genomic studies have suggested that the sex chromosomes and the regions harbouring the immune defence genes in the Major Histocompatibility Complex (MHC) may evolve faster than other genomic regions. The advent of the next generation sequencing technologies has made it possible to study which genomic regions are evolutionarily liable to change and which are static, as well as enabling an increasing number of genome studies of non-model species. However, de novo sequencing of the whole genome of an organism remains non-trivial. In this study, we present the draft genome of the black grouse, which was developed using a reference-guided assembly strategy. RESULTS We generated 133 Gbp of sequence data from one black grouse individual by the SOLiD platform and used a combination of de novo assembly and chicken reference genome mapping to assemble the reads into 4572 scaffolds with a total length of 1022 Mb. The draft genome well covers the main chicken chromosomes 1 to 28 and Z, which have a total length of 1001 Mb. The draft genome is fragmented, but has a good coverage of the homologous chicken genes. In particular, 33.0% of the coding regions of the homologous genes have more than 90% of their sequences covered. In addition, we identified ~1 M SNPs from the genome and identified 106 genomic regions which had a high nucleotide divergence between black grouse and chicken or between black grouse and turkey. CONCLUSIONS Our results support the hypothesis that the chromosome X (Z) evolves faster than the autosomes and our data are consistent with the MHC regions being more liable to change than the genome average. Our study demonstrates how a moderate sequencing effort can be combined with existing genome references to generate a draft genome for a non-model species.
Affiliation(s)
- Biao Wang
- Department of Ecology and Genetics, Evolutionary Biology Centre, Uppsala University, Norbyvägen 18D, SE-75236 Uppsala, Sweden
- Robert Ekblom
- Department of Ecology and Genetics, Evolutionary Biology Centre, Uppsala University, Norbyvägen 18D, SE-75236 Uppsala, Sweden
- Ignas Bunikis
- Department of Immunology, Genetics and Pathology, Rudbeck Laboratory, Uppsala University, Dag Hammarskjölds väg 20, SE-75237 Uppsala, Sweden
- Heli Siitari
- Department of Biological and Environmental Science, University of Jyväskylä, P. O. Box 35, FI-40014 Jyväskylä, Finland
- Jacob Höglund
- Department of Ecology and Genetics, Evolutionary Biology Centre, Uppsala University, Norbyvägen 18D, SE-75236 Uppsala, Sweden
37
Zhu JJ, Arzt J, Puckette MC, Smoliga GR, Pacheco JM, Rodriguez LL. Mechanisms of foot-and-mouth disease virus tropism inferred from differential tissue gene expression. PLoS One 2013; 8:e64119. [PMID: 23724025 PMCID: PMC3665847 DOI: 10.1371/journal.pone.0064119] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.8] [Received: 12/20/2012] [Accepted: 04/11/2013] [Indexed: 11/18/2022] Open
Abstract
Foot-and-mouth disease virus (FMDV) targets specific tissues for primary infection, secondary high-titer replication (e.g. foot and mouth where it causes typical vesicular lesions) and long-term persistence at some primary replication sites. Although integrin αVβ6 receptors have been identified as the primary FMDV receptors in animals, their tissue distribution alone fails to explain these highly selective tropism-driven events. Thus, other molecular mechanisms must play roles in determining this tissue specificity. We hypothesized that differences in certain biological activities due to differential gene expression determine FMDV tropism and applied whole genome gene expression profiling to identify genes differentially expressed between FMDV-targeted and non-targeted tissues in terms of supporting primary infection, secondary replication including vesicular lesions, and persistence. Using statistical and bioinformatic tools to analyze the differential gene expression, we identified mechanisms that could explain FMDV tissue tropism based on its association with differential expression of integrin αVβ6 heterodimeric receptor (FMDV receptor), fibronectin (ligand of the receptor), IL-1 cytokines, death receptors and the ligands, and multiple genes in the biological pathways involved in extracellular matrix turnover and interferon signaling found in this study. Our results together with reported findings indicate that differences in (1) FMDV receptor availability and accessibility, (2) type I interferon-inducible immune response, and (3) ability to clear virus infected cells via death receptor signaling play roles in determining FMDV tissue tropism, and that the additional increase of high extracellular matrix turnover induced by FMDV infection, likely via triggering the signaling of highly expressed IL-1 cytokines, plays a key role in the pathogenesis of vesicular lesions.
Affiliation(s)
- James J. Zhu
- Foreign Animal Disease Research Unit, Agricultural Research Unit, United States Department of Agriculture, Plum Island Animal Disease Research Center, Orient Point, New York, United States of America
- Jonathan Arzt
- Foreign Animal Disease Research Unit, Agricultural Research Unit, United States Department of Agriculture, Plum Island Animal Disease Research Center, Orient Point, New York, United States of America
- Michael C. Puckette
- Foreign Animal Disease Research Unit, Agricultural Research Unit, United States Department of Agriculture, Plum Island Animal Disease Research Center, Orient Point, New York, United States of America
- George R. Smoliga
- Foreign Animal Disease Research Unit, Agricultural Research Unit, United States Department of Agriculture, Plum Island Animal Disease Research Center, Orient Point, New York, United States of America
- Juan M. Pacheco
- Foreign Animal Disease Research Unit, Agricultural Research Unit, United States Department of Agriculture, Plum Island Animal Disease Research Center, Orient Point, New York, United States of America
- Luis L. Rodriguez
- Foreign Animal Disease Research Unit, Agricultural Research Unit, United States Department of Agriculture, Plum Island Animal Disease Research Center, Orient Point, New York, United States of America
38
Rapid birth-and-death evolution of the xenobiotic metabolizing NAT gene family in vertebrates with evidence of adaptive selection. BMC Evol Biol 2013; 13:62. [PMID: 23497148 PMCID: PMC3601968 DOI: 10.1186/1471-2148-13-62] [Citation(s) in RCA: 30] [Impact Index Per Article: 2.7] [Received: 11/30/2012] [Accepted: 02/27/2013] [Indexed: 11/10/2022] Open
Abstract
Background The arylamine N-acetyltransferases (NATs) are a unique family of enzymes widely distributed in nature that play a crucial role in the detoxification of aromatic amine xenobiotics. Considering the temporal changes in the levels and toxicity of environmentally available chemicals, the metabolic function of NATs is likely to be under adaptive evolution to broaden or change substrate specificity over time, making NATs a promising subject for evolutionary analyses. In this study, we trace the molecular evolutionary history of the NAT gene family during the last ~450 million years of vertebrate evolution and define the likely role of gene duplication, gene conversion and positive selection in the evolutionary dynamics of this family. Results A phylogenetic analysis of 77 NAT sequences from 38 vertebrate species retrieved from public genomic databases shows that NATs are phylogenetically unstable genes, characterized by frequent gene duplications and losses even among closely related species, and that concerted evolution only played a minor role in the patterns of sequence divergence. Local signals of positive selection are detected in several lineages, probably reflecting response to changes in xenobiotic exposure. We then put a special emphasis on the study of the last ~85 million years of primate NAT evolution by determining the NAT homologous sequences in 13 additional primate species. Our phylogenetic analysis supports the view that the three human NAT genes emerged from a first duplication event in the common ancestor of Simiiformes, yielding NAT1 and an ancestral NAT gene which in turn, duplicated in the common ancestor of Catarrhini, giving rise to NAT2 and the NATP pseudogene. Our analysis suggests a main role of purifying selection in NAT1 protein evolution, whereas NAT2 was predicted to mostly evolve under positive selection to change its amino acid sequence over time. 
These findings are consistent with a differential role of the two human isoenzymes and support the involvement of NAT1 in endogenous metabolic pathways. Conclusions This study provides unequivocal evidence that the NAT gene family has evolved under a dynamic process of birth-and-death evolution in vertebrates, consistent with previous observations made in fungi.
39
Gerasimova A, Chavez L, Li B, Seumois G, Greenbaum J, Rao A, Vijayanand P, Peters B. Predicting cell types and genetic variations contributing to disease by combining GWAS and epigenetic data. PLoS One 2013; 8:e54359. [PMID: 23382893 PMCID: PMC3559682 DOI: 10.1371/journal.pone.0054359] [Citation(s) in RCA: 30] [Impact Index Per Article: 2.7] [Received: 07/17/2012] [Accepted: 12/11/2012] [Indexed: 12/22/2022] Open
Abstract
Genome-wide association studies (GWASs) identify single nucleotide polymorphisms (SNPs) that are enriched in individuals suffering from a given disease. Most disease-associated SNPs fall into non-coding regions, so that it is not straightforward to infer phenotype or function; moreover, many SNPs are in tight genetic linkage, so that a SNP identified as associated with a particular disease may not itself be causal, but rather signify the presence of a linked SNP that is functionally relevant to disease pathogenesis. Here, we present an analysis method that takes advantage of the recent rapid accumulation of epigenomics data to address these problems for some SNPs. Using asthma as a prototypic example, we show that non-coding disease-associated SNPs are enriched in genomic regions that function as regulators of transcription, such as enhancers and promoters. Identifying enhancers based on the presence of histone modification marks such as H3K4me1 in different cell types, we show that the location of enhancers is highly cell-type specific. We use these findings to predict which SNPs are likely to be directly contributing to disease based on their presence in regulatory regions, and in which cell types their effect is expected to be detectable. Moreover, we can also predict which cell types contribute to a disease based on overlap of the disease-associated SNPs with the locations of enhancers present in a given cell type. Finally, we suggest that it will be possible to re-analyze GWAS studies with much higher power by limiting the SNPs considered to those in coding or regulatory regions of cell types relevant to a given disease.
Affiliation(s)
- Anna Gerasimova
- La Jolla Institute for Allergy and Immunology, La Jolla, California, United States of America.
40
Jung I, Kim D. LinkNMF: identification of histone modification modules in the human genome using nonnegative matrix factorization. Gene 2012; 518:215-21. [PMID: 23266811 DOI: 10.1016/j.gene.2012.11.027] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 09/17/2012] [Accepted: 11/27/2012] [Indexed: 01/13/2023]
Abstract
Histone modifications are ubiquitous processes involved in various cellular mechanisms. Systemic analysis of multiple chromatin modifications has been used to characterize various chromatin states associated with functional DNA elements, gene expression, and specific biological functions. However, identification of modular modification patterns is still required to understand the functional associations between histone modification patterns and specific chromatin/DNA binding factors. To recognize modular modification patterns, we developed a novel algorithm that combines nonnegative matrix factorization (NMF) and a clique-detection algorithm. We applied the algorithm, called LinkNMF, to generate a comprehensive modification map in human CD4+ T cell promoter regions. Initially, we identified 11 modules not recognized by conventional approaches. The modules were grouped into two major classes: gene activation and repression. We found that genes targeted by each module were enriched with distinguishable biological functions, suggesting that each modular pattern plays a unique functional role. To explain the formation of modular patterns, we investigated the module-specific binding patterns of chromatin regulators. Application of LinkNMF to histone modification maps of diverse cells and developmental stages will be helpful for understanding how histone modifications regulate gene expression. The algorithm is available on our website at biodb.kaist.ac.kr/LinkNMF.
Affiliation(s)
- Inkyung Jung
- Department of Bio and Brain Engineering, KAIST, Daejeon, South Korea
41
Evani US, Challis D, Yu J, Jackson AR, Paithankar S, Bainbridge MN, Jakkamsetti A, Pham P, Coarfa C, Milosavljevic A, Yu F. Atlas2 Cloud: a framework for personal genome analysis in the cloud. BMC Genomics 2012; 13 Suppl 6:S19. [PMID: 23134663 PMCID: PMC3481437 DOI: 10.1186/1471-2164-13-s6-s19] [Citation(s) in RCA: 36] [Impact Index Per Article: 3.0] [Indexed: 11/17/2022] Open
Abstract
Background Until recently, sequencing has primarily been carried out in large genome centers which have invested heavily in developing the computational infrastructure that enables genomic sequence analysis. The recent advancements in next generation sequencing (NGS) have led to a wide dissemination of sequencing technologies and data, to highly diverse research groups. It is expected that clinical sequencing will become part of diagnostic routines shortly. However, limited accessibility to computational infrastructure and high quality bioinformatic tools, and the demand for personnel skilled in data analysis and interpretation, remain serious bottlenecks. To this end, the cloud computing and Software-as-a-Service (SaaS) technologies can help address these issues. Results We successfully enabled the Atlas2 Cloud pipeline for personal genome analysis on two different cloud service platforms: a community cloud via the Genboree Workbench, and a commercial cloud via the Amazon Web Services using Software-as-a-Service model. We report a case study of personal genome analysis using our Atlas2 Genboree pipeline. We also outline a detailed cost structure for running Atlas2 Amazon on whole exome capture data, providing cost projections in terms of storage, compute and I/O when running Atlas2 Amazon on a large data set. Conclusions We find that providing a web interface and an optimized pipeline clearly facilitates usage of cloud computing for personal genome analysis, but for it to be routinely used for large scale projects there needs to be a paradigm shift in the way we develop tools, in standard operating procedures, and in funding mechanisms.
Affiliation(s)
- Uday S Evani
- The Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX 77030, USA
42
Mohammad F, Flight RM, Harrison BJ, Petruska JC, Rouchka EC. AbsIDconvert: an absolute approach for converting genetic identifiers at different granularities. BMC Bioinformatics 2012; 13:229. [PMID: 22967011 PMCID: PMC3554462 DOI: 10.1186/1471-2105-13-229] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.6] [Received: 02/15/2012] [Accepted: 08/09/2012] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND High-throughput molecular biology techniques yield vast amounts of data, often by detecting small portions of ribonucleotides corresponding to specific identifiers. Existing bioinformatic methodologies categorize and compare these elements using inferred descriptive annotation given this sequence information irrespective of the fact that it may not be representative of the identifier as a whole. RESULTS All annotations, no matter the granularity, can be aligned to genomic sequences and therefore annotated by genomic intervals. We have developed AbsIDconvert, a methodology for converting between genomic identifiers by first mapping them onto a common universal coordinate system using an interval tree which is subsequently queried for overlapping identifiers. AbsIDconvert has many potential uses, including gene identifier conversion, identification of features within a genomic region, and cross-species comparisons. The utility is demonstrated in three case studies: 1) comparative genomic study mapping Plasmodium gene sequences to corresponding human and mosquito transcriptional regions; 2) cross-species study of Incyte clone sequences; and 3) analysis of human Ensembl transcripts mapped by Affymetrix® and Agilent microarray probes. AbsIDconvert currently supports ID conversion of 53 species for a given list of input identifiers, genomic sequence, or genome intervals. CONCLUSION AbsIDconvert provides an efficient and reliable mechanism for conversion between identifier domains of interest. The flexibility of this tool allows for custom definition identifier domains contingent upon the availability and determination of a genomic mapping interval. As the genomes and the sequences for genetic elements are further refined, this tool will become increasingly useful and accurate. AbsIDconvert is freely available as a web application or downloadable as a virtual machine at: http://bioinformatics.louisville.edu/abid/.
Affiliation(s)
- Fahim Mohammad
- Department of Computer Engineering and Computer Science, University of Louisville, Louisville, KY 40292, USA
|
43
|
Bánfai B, Jia H, Khatun J, Wood E, Risk B, Gundling WE, Kundaje A, Gunawardena HP, Yu Y, Xie L, Krajewski K, Strahl BD, Chen X, Bickel P, Giddings MC, Brown JB, Lipovich L. Long noncoding RNAs are rarely translated in two human cell lines. Genome Res 2012; 22:1646-57. [PMID: 22955977 PMCID: PMC3431482 DOI: 10.1101/gr.134767.111] [Citation(s) in RCA: 301] [Impact Index Per Article: 25.1] [Received: 11/11/2011] [Accepted: 05/03/2012] [Indexed: 12/12/2022]
Abstract
Data from the Encyclopedia of DNA Elements (ENCODE) project show over 9640 human genome loci classified as long noncoding RNAs (lncRNAs), yet only ~100 have been deeply characterized to determine their role in the cell. To measure the protein-coding output from these RNAs, we jointly analyzed two recent data sets produced in the ENCODE project: tandem mass spectrometry (MS/MS) data mapping expressed peptides to their encoding genomic loci, and RNA-seq data generated by ENCODE in long polyA+ and polyA- fractions in the cell lines K562 and GM12878. We used the machine-learning algorithm RuleFit3 to regress the peptide data against RNA expression data. The most important covariate for predicting translation was, surprisingly, the Cytosol polyA- fraction in both cell lines. LncRNAs are ~13-fold less likely to produce detectable peptides than similar mRNAs, indicating that ~92% of GENCODE v7 lncRNAs are not translated in these two ENCODE cell lines. Intersecting 9640 lncRNA loci with 79,333 peptides yielded 85 unique peptides matching 69 lncRNAs. Most cases were due to a coding transcript misannotated as lncRNA. Two exceptions were an unprocessed pseudogene and a bona fide lncRNA gene, both with open reading frames (ORFs) compromised by upstream stop codons. All potentially translatable lncRNA ORFs had only a single peptide match, indicating low protein abundance and/or false-positive peptide matches. We conclude that with very few exceptions, ribosomes are able to distinguish coding from noncoding transcripts and, hence, that ectopic translation and cryptic mRNAs are rare in the human lncRNAome.
Affiliation(s)
- Balázs Bánfai
- Department of Statistics, University of California, Berkeley, California 94720, USA
- Hui Jia
- Center for Molecular Medicine and Genetics, School of Medicine, Wayne State University, Detroit, Michigan 48201, USA
- Jainab Khatun
- Biomolecular Research Center, Boise State University, Boise, Idaho 83725, USA
- Emily Wood
- Center for Molecular Medicine and Genetics, School of Medicine, Wayne State University, Detroit, Michigan 48201, USA
- Brian Risk
- Biomolecular Research Center, Boise State University, Boise, Idaho 83725, USA
- William E. Gundling
- Center for Molecular Medicine and Genetics, School of Medicine, Wayne State University, Detroit, Michigan 48201, USA
- Anshul Kundaje
- Department of Computer Science, Stanford University, Palo Alto, California 94305, USA
- Harsha P. Gunawardena
- University of North Carolina School of Medicine, Chapel Hill, North Carolina 29425, USA
- Yanbao Yu
- University of North Carolina School of Medicine, Chapel Hill, North Carolina 29425, USA
- Ling Xie
- University of North Carolina School of Medicine, Chapel Hill, North Carolina 29425, USA
- Krzysztof Krajewski
- University of North Carolina School of Medicine, Chapel Hill, North Carolina 29425, USA
- Brian D. Strahl
- University of North Carolina School of Medicine, Chapel Hill, North Carolina 29425, USA
- Xian Chen
- University of North Carolina School of Medicine, Chapel Hill, North Carolina 29425, USA
- Peter Bickel
- Department of Statistics, University of California, Berkeley, California 94720, USA
- Morgan C. Giddings
- College of Arts and Sciences, Boise State University, Boise, Idaho 83725, USA
- James B. Brown
- Department of Statistics, University of California, Berkeley, California 94720, USA
- Leonard Lipovich
- Center for Molecular Medicine and Genetics, School of Medicine, Wayne State University, Detroit, Michigan 48201, USA
|
44
|
Li Z, Ender C, Meister G, Moore PS, Chang Y, John B. Extensive terminal and asymmetric processing of small RNAs from rRNAs, snoRNAs, snRNAs, and tRNAs. Nucleic Acids Res 2012; 40:6787-99. [PMID: 22492706 PMCID: PMC3413118 DOI: 10.1093/nar/gks307] [Citation(s) in RCA: 238] [Impact Index Per Article: 19.8] [Received: 09/02/2011] [Accepted: 03/22/2012] [Indexed: 01/09/2023] Open
Abstract
Deep sequencing studies frequently identify small RNA fragments of abundant RNAs. These fragments are thought to represent degradation products of their precursors. Using sequencing, computational analysis, and sensitive northern blot assays, we show that constitutively expressed non-coding RNAs such as tRNAs, snoRNAs, rRNAs and snRNAs preferentially produce small 5' and 3' end fragments. Similar to that of microRNA processing, these terminal fragments are generated in an asymmetric manner that predominantly favors either the 5' or 3' end. Terminal-specific and asymmetric processing of these small RNAs occurs in both mouse and human cells. In addition to the known processing of some 3' terminal tRNA-derived fragments (tRFs) by the RNase III endonuclease Dicer, we show that several RNase family members can produce tRFs, including Angiogenin that cleaves the TψC loop to generate 3' tRFs. The 3' terminal tRFs but not the 5' tRFs are highly complementary to human endogenous retroviral sequences in the genome. Despite their independence from Dicer processing, these tRFs associate with Ago2 and are capable of down regulating target genes by transcript cleavage in vitro. We suggest that endogenous 3' tRFs have a role in regulating the unwarranted expression of endogenous viruses through the RNA interference pathway.
MESH Headings
- Animals
- Argonaute Proteins/metabolism
- Endogenous Retroviruses/genetics
- Humans
- Mice
- Proteins/physiology
- RNA Cleavage
- RNA Processing, Post-Transcriptional
- RNA, Messenger/metabolism
- RNA, Ribosomal/chemistry
- RNA, Ribosomal/metabolism
- RNA, Small Nuclear/chemistry
- RNA, Small Nuclear/metabolism
- RNA, Small Nucleolar/chemistry
- RNA, Small Nucleolar/metabolism
- RNA, Small Untranslated/chemistry
- RNA, Small Untranslated/metabolism
- RNA, Transfer/chemistry
- RNA, Transfer/metabolism
- RNA-Binding Proteins
- Ribonuclease III/physiology
- Ribonuclease, Pancreatic/metabolism
- Ribonucleases/metabolism
Affiliation(s)
- Zhihua Li
- School of Medicine, University of Pittsburgh, Pittsburgh, PA 15260, Cancer Virology Program, Hillman Cancer Center, 5117 Centre Avenue, Pittsburgh, PA 15213, USA and Max Planck Institute of Biochemistry, Am Klopferspitz 18, 82152 Martinsried, Germany
- Christine Ender
- School of Medicine, University of Pittsburgh, Pittsburgh, PA 15260, Cancer Virology Program, Hillman Cancer Center, 5117 Centre Avenue, Pittsburgh, PA 15213, USA and Max Planck Institute of Biochemistry, Am Klopferspitz 18, 82152 Martinsried, Germany
- Gunter Meister
- School of Medicine, University of Pittsburgh, Pittsburgh, PA 15260, Cancer Virology Program, Hillman Cancer Center, 5117 Centre Avenue, Pittsburgh, PA 15213, USA and Max Planck Institute of Biochemistry, Am Klopferspitz 18, 82152 Martinsried, Germany
- Patrick S. Moore
- School of Medicine, University of Pittsburgh, Pittsburgh, PA 15260, Cancer Virology Program, Hillman Cancer Center, 5117 Centre Avenue, Pittsburgh, PA 15213, USA and Max Planck Institute of Biochemistry, Am Klopferspitz 18, 82152 Martinsried, Germany
- Yuan Chang
- School of Medicine, University of Pittsburgh, Pittsburgh, PA 15260, Cancer Virology Program, Hillman Cancer Center, 5117 Centre Avenue, Pittsburgh, PA 15213, USA and Max Planck Institute of Biochemistry, Am Klopferspitz 18, 82152 Martinsried, Germany
- Bino John
- School of Medicine, University of Pittsburgh, Pittsburgh, PA 15260, Cancer Virology Program, Hillman Cancer Center, 5117 Centre Avenue, Pittsburgh, PA 15213, USA and Max Planck Institute of Biochemistry, Am Klopferspitz 18, 82152 Martinsried, Germany
|
45
|
Jung I, Kim SK, Kim M, Han YM, Kim YS, Kim D, Lee D. H2B monoubiquitylation is a 5'-enriched active transcription mark and correlates with exon-intron structure in human cells. Genome Res 2012; 22:1026-35. [PMID: 22421545 PMCID: PMC3371706 DOI: 10.1101/gr.120634.111] [Citation(s) in RCA: 51] [Impact Index Per Article: 4.3] [Received: 01/09/2011] [Accepted: 03/07/2012] [Indexed: 11/25/2022]
Abstract
H2B monoubiquitylation (H2Bub1), which is required for multiple methylations of both H3K4 and H3K79, has been implicated in gene expression in numerous organisms ranging from yeast to human. However, the molecular crosstalk between H2Bub1 and other modifications, especially the methylations of H3K4 and H3K79, remains unclear in vertebrates. To better understand the functional role of H2Bub1, we measured genome-wide histone modification patterns in human cells. Our results suggest that H2Bub1 has dual roles, one that is H3 methylation dependent, and another that is H3 methylation independent. First, H2Bub1 is a 5'-enriched active transcription mark and co-occupies with H3K79 methylations in actively transcribed regions. Second, this study shows for the first time that H2Bub1 plays a histone H3 methylations-independent role in chromatin architecture. Furthermore, the results of this work indicate that H2Bub1 is largely positioned at the exon-intron boundaries of highly expressed exons, and it demonstrates increased occupancy in skipped exons compared with flanking exons in the human and mouse genomes. Our findings collectively suggest that a potentiating mechanism links H2Bub1 to both H3K79 methylations in actively transcribed regions and the exon-intron structure of highly expressed exons via the regulation of nucleosome dynamics during transcription elongation.
Affiliation(s)
- Inkyung Jung
- Department of Bio and Brain Engineering, Korea Advanced Institute of Science and Technology, Yuseong-gu, Daejeon 305-701, Korea
- Seung-Kyoon Kim
- Department of Biological Sciences, Korea Advanced Institute of Science and Technology, Yuseong-gu, Daejeon 305-701, Korea
- Mirang Kim
- Medical Genomics Research Center, Korea Research Institute of Bioscience and Biotechnology, Daejeon 305-806, South Korea
- Yong-Mahn Han
- Department of Biological Sciences, Korea Advanced Institute of Science and Technology, Yuseong-gu, Daejeon 305-701, Korea
- Yong Sung Kim
- Medical Genomics Research Center, Korea Research Institute of Bioscience and Biotechnology, Daejeon 305-806, South Korea
- Dongsup Kim
- Department of Bio and Brain Engineering, Korea Advanced Institute of Science and Technology, Yuseong-gu, Daejeon 305-701, Korea
- Daeyoup Lee
- Department of Biological Sciences, Korea Advanced Institute of Science and Technology, Yuseong-gu, Daejeon 305-701, Korea
|
46
|
Baxevanis AD. Searching Online Mendelian Inheritance in Man (OMIM) for information on genetic loci involved in human disease. Current Protocols in Human Genetics 2012; Chapter 9:9.13.1-9.13.10. [PMID: 22470145 DOI: 10.1002/0471142905.hg0913s73] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.4] [Indexed: 05/31/2023]
Abstract
Online Mendelian Inheritance in Man (OMIM) is a comprehensive compendium of information on human genes and genetic disorders, with a particular emphasis on the interplay between observed phenotypes and underlying genotypes. This unit focuses on the basic methodology for formulating OMIM searches and illustrates the types of information that can be retrieved from OMIM, including descriptions of clinical manifestations resulting from genetic abnormalities. This unit also provides information on additional relevant medical and molecular biology databases. A basic knowledge of OMIM should be part of the armamentarium of physicians and scientists with an interest in research on the clinical aspects of genetic disorders.
|
47
|
Bhagwat M, Young L, Robison RR. Using BLAT to find sequence similarity in closely related genomes. Current Protocols in Bioinformatics 2012; Chapter 10:10.8.1-10.8.24. [PMID: 22389010 PMCID: PMC4101998 DOI: 10.1002/0471250953.bi1008s37] [Citation(s) in RCA: 28] [Impact Index Per Article: 2.3] [Indexed: 11/10/2022]
Abstract
The BLAST-Like Alignment Tool (BLAT) is used to find genomic sequences that match a protein or DNA sequence submitted by the user. BLAT is typically used for searching similar sequences within the same or closely related species. It was developed to align millions of expressed sequence tags and mouse whole-genome random reads to the human genome at a higher speed. It is freely available either on the Web or as a downloadable stand-alone program. BLAT search results provide a link for visualization in the University of California, Santa Cruz (UCSC) Genome Browser, where associated biological information may be obtained. Three example protocols are given: using an mRNA sequence to identify the exon-intron locations and associated gene in the genomic sequence of the same species, using a protein sequence to identify the coding regions in a genomic sequence and to search for gene family members in the same species, and using a protein sequence to find homologs in another species.
Affiliation(s)
- Medha Bhagwat
- National Institutes of Health Library, National Institutes of Health, Bethesda, Maryland
- Lynn Young
- National Institutes of Health Library, National Institutes of Health, Bethesda, Maryland
- Rex R Robison
- National Institutes of Health Library, National Institutes of Health, Bethesda, Maryland
|
48
|
Stefansson OA, Jonasson JG, Olafsdottir K, Bjarnason H, Th Johannsson O, Bodvarsdottir SK, Valgeirsdottir S, Eyfjord JE. Genomic and phenotypic analysis of BRCA2 mutated breast cancers reveals co-occurring changes linked to progression. Breast Cancer Res 2011; 13:R95. [PMID: 21958427 PMCID: PMC3262207 DOI: 10.1186/bcr3020] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.5] [Received: 01/25/2011] [Revised: 07/12/2011] [Accepted: 09/29/2011] [Indexed: 01/09/2023] Open
Abstract
BACKGROUND Inherited mutations in the BRCA2 gene greatly increase the risk of developing breast cancer. Consistent with an important role for BRCA2 in error-free DNA repair, complex genomic changes are frequently observed in tumors derived from BRCA2 mutation carriers. Here, we explore the impact of DNA copy-number changes in BRCA2 tumors with respect to phenotype and clinical staging of the disease. METHODS Breast tumors (n = 33) derived from BRCA2 999del5 mutation carriers were examined in terms of copy-number changes with high-resolution aCGH (array comparative genomic hybridization) containing 385 thousand probes (about one for each 7 kbp) and expression of phenotypic markers on TMAs (tissue microarrays). The data were examined with respect to clinical parameters including TNM staging, histologic grade, S phase, and ploidy. RESULTS Tumors from BRCA2 carriers of luminal and basal/triple-negative phenotypes (TNPs) differ with respect to patterns of DNA copy-number changes. The basal/TNP subtype was characterized by lack of pRb (RB1) coupled with high/intense expression of p16 (CDKN2A) gene products. We found increased proportions of Ki-67-positive cells to be significantly associated with loss of the wild-type (wt) BRCA2 allele in luminal types, whereas BRCA2wt loss was less frequent in BRCA2 tumors displaying basal/TNP phenotypes. Furthermore, we show that deletions at 13q13.1, involving the BRCA2wt allele, represent a part of a larger network of co-occurring genetic changes, including deletions at 6q22.32-q22.33, 11q14.2-q24.1, and gains at 17q24.1. Importantly, copy-number changes at these BRCA2-linked networking regions coincide with those associated with advanced progression, involving the capacity to metastasize to the nodes or more-distant sites at diagnosis. CONCLUSIONS The results presented here demonstrate divergent paths of tumor evolution in BRCA2 carriers and that deletion of the wild-type BRCA2 allele, together with co-occurring changes at 6q, 11q, and 17q, are important events in progression toward advanced disease.
Affiliation(s)
- Olafur A Stefansson
- Cancer Research Laboratory, Faculty of Medicine, University of Iceland, Vatnsmyrarvegur 16, Reykjavik, Iceland
|
49
|
Mfuna-Endam L, Zhang Y, Desrosiers MY. Genetics of rhinosinusitis. Curr Allergy Asthma Rep 2011; 11:236-46. [PMID: 21499907 DOI: 10.1007/s11882-011-0189-4] [Citation(s) in RCA: 46] [Impact Index Per Article: 3.5] [Indexed: 12/15/2022]
Abstract
Suggestion for a potential genetic basis to chronic rhinosinusitis (CRS) is afforded by degree of inheritability suggested from family and twin studies, existence of CRS in simple mendelian diseases, and development of sinusitis as part of the phenotype of certain gene "knockout" murine models. Genetic association studies are expected to identify novel genes associated with CRS and suggest novel mechanisms implicated in disease development. Although these studies are subject to methodologic difficulties, associations of CRS and polymorphisms in more than 30 genes have been published, with single nucleotide polymorphisms in 3 (IL1A, TNFA, AOAH) replicated. While the individual risk conferred by these single nucleotide polymorphisms remains modest, taken as a group, they suggest an important implication of pathways of innate immune recognition and in regulation of downstream signaling in the development of CRS. In a demonstration of these techniques' potential to identify new targets for research, the authors present a functional investigation of LAMB1, the top-rated gene from a pooling-based genome-wide association study of CRS. Upregulation of gene expression in LAMB1 and associated laminin genes in primary epithelial cells from CRS patients implicates the extracellular matrix in development of CRS and offers a new avenue for further study.
Affiliation(s)
- Leandra Mfuna-Endam
- Department of Otolaryngology-Head and Neck Surgery, Centre de Recherche du CHUM (CRCHUM), Hôpital Hôtel-Dieu, Université de Montréal, QC, Canada
|
50
|
Abstract
This unit includes a basic protocol with an introduction to the Map Viewer, describing how to perform a simple text-based search of genome annotations to view the genomic context of a gene, navigate along a chromosome, zoom in and out, and change the displayed maps to hide and show information. It also describes some of NCBI's sequence-analysis tools, which are provided as links from the Map Viewer. The alternate protocols describe different ways to query the genome sequence, and also illustrate additional features of the Map Viewer. Alternate Protocol 1 shows how to perform and interpret the results of a BLAST search against the human genome. Alternate Protocol 2 demonstrates how to retrieve a list of all genes between two STS markers. Finally, Alternate Protocol 3 shows how to find all annotated members of a gene family.
|