1
Chi WY, Hu Y, Huang HC, Kuo HH, Lin SH, Kuo CTJ, Tao J, Fan D, Huang YM, Wu AA, Hung CF, Wu TC. Molecular targets and strategies in the development of nucleic acid cancer vaccines: from shared to personalized antigens. J Biomed Sci 2024; 31:94. [PMID: 39379923 PMCID: PMC11463125 DOI: 10.1186/s12929-024-01082-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/19/2024] [Accepted: 09/01/2024] [Indexed: 10/10/2024] Open
Abstract
Recent breakthroughs in cancer immunotherapies have emphasized the importance of harnessing the immune system for treating cancer. Vaccines, which have traditionally been used to promote protective immunity against pathogens, are now being explored as a method to target cancer neoantigens. Over the past few years, extensive preclinical research and more than a hundred clinical trials have been dedicated to investigating various approaches to neoantigen discovery and vaccine formulations, encouraging development of personalized medicine. Nucleic acids (DNA and mRNA) have become a particularly promising platform for the development of these cancer immunotherapies. This shift towards nucleic acid-based personalized vaccines has been facilitated by advancements in molecular techniques for identifying neoantigens, antigen prediction methodologies, and the development of new vaccine platforms. Generating these personalized vaccines involves a comprehensive pipeline that includes sequencing of patient tumor samples, data analysis for antigen prediction, and tailored vaccine manufacturing. In this review, we will discuss the various shared and personalized antigens used for cancer vaccine development and introduce strategies for identifying neoantigens through the characterization of gene mutation, transcription, translation and post-translational modifications associated with oncogenesis. In addition, we will focus on the most up-to-date nucleic acid vaccine platforms, discuss the limitations of cancer vaccines as well as provide potential solutions, and raise key clinical and technical considerations in vaccine development.
Affiliation(s)
- Wei-Yu Chi
  - Physiology, Biophysics and Systems Biology Graduate Program, Weill Cornell Medicine, New York, NY, USA
- Yingying Hu
  - Tri-Institutional PhD Program in Chemical Biology, Memorial Sloan Kettering Cancer Center, New York, NY, USA
- Hsin-Che Huang
  - Tri-Institutional PhD Program in Chemical Biology, Memorial Sloan Kettering Cancer Center, New York, NY, USA
- Hui-Hsuan Kuo
  - Pharmacology PhD Program, Weill Cornell Medicine, New York, NY, USA
- Shu-Hong Lin
  - Department of Epidemiology, The University of Texas MD Anderson Cancer Center, Houston, TX, USA
  - The University of Texas Graduate School of Biomedical Sciences at Houston and MD Anderson Cancer Center, Houston, TX, USA
- Chun-Tien Jimmy Kuo
  - Division of Pharmaceutics and Pharmacology, College of Pharmacy, The Ohio State University, Columbus, OH, USA
- Julia Tao
  - Department of Pathology, Johns Hopkins School of Medicine, 1550 Orleans St, CRB II Room 309, Baltimore, MD, 21287, USA
- Darrell Fan
  - Department of Pathology, Johns Hopkins School of Medicine, 1550 Orleans St, CRB II Room 309, Baltimore, MD, 21287, USA
- Yi-Min Huang
  - Department of Pathology, Johns Hopkins School of Medicine, 1550 Orleans St, CRB II Room 309, Baltimore, MD, 21287, USA
- Annie A Wu
  - Department of Pathology, Johns Hopkins School of Medicine, 1550 Orleans St, CRB II Room 309, Baltimore, MD, 21287, USA
- Chien-Fu Hung
  - Department of Pathology, Johns Hopkins School of Medicine, 1550 Orleans St, CRB II Room 309, Baltimore, MD, 21287, USA
  - Department of Oncology, Johns Hopkins School of Medicine, Baltimore, MD, USA
  - Department of Obstetrics and Gynecology, Johns Hopkins School of Medicine, Baltimore, MD, USA
- T-C Wu
  - Department of Pathology, Johns Hopkins School of Medicine, 1550 Orleans St, CRB II Room 309, Baltimore, MD, 21287, USA
  - Department of Oncology, Johns Hopkins School of Medicine, Baltimore, MD, USA
  - Department of Obstetrics and Gynecology, Johns Hopkins School of Medicine, Baltimore, MD, USA
  - Department of Molecular Microbiology and Immunology, Bloomberg School of Public Health, Johns Hopkins School of Medicine, Baltimore, MD, USA
2
Premanand A, Shanmuga Priya M, Reena Rajkumari B. Genetic variants in androgenetic alopecia: insights from scalp RNA sequencing data. Arch Dermatol Res 2024; 316:590. [PMID: 39215850 DOI: 10.1007/s00403-024-03351-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/25/2023] [Revised: 08/03/2024] [Accepted: 08/20/2024] [Indexed: 09/04/2024]
Affiliation(s)
- A Premanand
  - Department of Integrative Biology, School of Bio Sciences and Technology, Vellore Institute of Technology, Vellore, Tamil Nadu, 632014, India
- M Shanmuga Priya
  - Department of Integrative Biology, School of Bio Sciences and Technology, Vellore Institute of Technology, Vellore, Tamil Nadu, 632014, India
- B Reena Rajkumari
  - Department of Integrative Biology, School of Bio Sciences and Technology, Vellore Institute of Technology, Vellore, Tamil Nadu, 632014, India
3
Vigorito E, Barton A, Pitzalis C, Lewis MJ, Wallace C. BBmix: a Bayesian beta-binomial mixture model for accurate genotyping from RNA-sequencing. Bioinformatics 2023; 39:btad393. [PMID: 37338536 PMCID: PMC10318392 DOI: 10.1093/bioinformatics/btad393] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 11/14/2022] [Revised: 05/15/2023] [Accepted: 06/19/2023] [Indexed: 06/21/2023] Open
Abstract
MOTIVATION: While many pipelines have been developed for calling genotypes using RNA-sequencing (RNA-Seq) data, they all have adapted DNA genotype callers that do not model biases specific to RNA-Seq such as allele-specific expression (ASE). RESULTS: Here, we present BBmix, a Bayesian beta-binomial mixture model that first learns the expected distribution of read counts for each genotype, and then deploys those learned parameters to call genotypes probabilistically. We benchmarked our model on a wide variety of datasets and showed that our method generally performed better than competitors, mainly due to an increase of up to 1.4% in the accuracy of heterozygous calls, which may have a big impact in reducing the false positive rate in applications sensitive to genotyping error such as ASE. Moreover, BBmix can be easily incorporated into standard pipelines for calling genotypes. We further show that parameters are generally transferable within datasets, such that a single learning run of less than 1 h is sufficient to call genotypes in a large number of samples. AVAILABILITY AND IMPLEMENTATION: We implemented BBmix as an R package that is available for free under a GPL-2 licence at https://gitlab.com/evigorito/bbmix and https://cran.r-project.org/package=bbmix with accompanying pipeline at https://gitlab.com/evigorito/bbmix_pipeline.
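The two-stage idea described in this abstract (learn a read-count distribution per genotype, then call genotypes probabilistically) can be illustrated with a minimal Python sketch. This is not the BBmix implementation, which is an R package; the component parameters in `COMPONENTS`, the uniform `PRIOR`, and all function names are hypothetical placeholders chosen for illustration, whereas BBmix learns its parameters from the data.

```python
import math


def log_beta(a, b):
    """Natural log of the Beta function B(a, b)."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)


def betabinom_logpmf(k, n, a, b):
    """Log-probability of k alternative-allele reads out of n under a beta-binomial(a, b)."""
    log_choose = math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
    return log_choose + log_beta(k + a, n - k + b) - log_beta(a, b)


# Hypothetical (alpha, beta) pairs, one per genotype; a real model would learn these.
COMPONENTS = {
    "0/0": (1.0, 100.0),   # almost no alternative reads expected
    "0/1": (10.0, 10.0),   # roughly balanced, with extra dispersion
    "1/1": (100.0, 1.0),   # almost all reads alternative
}
# Hypothetical uniform prior over genotypes.
PRIOR = {"0/0": 1.0 / 3.0, "0/1": 1.0 / 3.0, "1/1": 1.0 / 3.0}


def genotype_posteriors(alt_reads, total_reads):
    """Posterior probability of each genotype given the observed read counts."""
    log_joint = {
        g: math.log(PRIOR[g]) + betabinom_logpmf(alt_reads, total_reads, a, b)
        for g, (a, b) in COMPONENTS.items()
    }
    # Log-sum-exp normalisation for numerical stability.
    m = max(log_joint.values())
    log_norm = m + math.log(sum(math.exp(v - m) for v in log_joint.values()))
    return {g: math.exp(v - log_norm) for g, v in log_joint.items()}


def call_genotype(alt_reads, total_reads):
    """Return the genotype with the highest posterior probability."""
    post = genotype_posteriors(alt_reads, total_reads)
    return max(post, key=post.get)
```

For example, `call_genotype(10, 20)` evaluates the three components on a site with 10 alternative reads out of 20 and returns the most probable label. The beta-binomial is used instead of a plain binomial because it allows the allelic ratio to vary around its expected value, which is one way to accommodate effects such as ASE.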
Affiliation(s)
- Elena Vigorito
  - MRC Biostatistics Unit, University of Cambridge, Cambridge CB2 0SR, United Kingdom
- Anne Barton
  - Division of Musculoskeletal and Dermatological Sciences, University of Manchester, Manchester M13 9PL, United Kingdom
- Costantino Pitzalis
  - Centre for Experimental Medicine and Rheumatology, William Harvey Research Institute, Barts and The London School of Medicine and Dentistry, Queen Mary University of London, London EC1M 6BQ, United Kingdom
- Myles J Lewis
  - Centre for Experimental Medicine and Rheumatology, William Harvey Research Institute, Barts and The London School of Medicine and Dentistry, Queen Mary University of London, London EC1M 6BQ, United Kingdom
- Chris Wallace
  - MRC Biostatistics Unit, University of Cambridge, Cambridge CB2 0SR, United Kingdom
  - Cambridge Institute of Therapeutic Immunology & Infectious Disease (CITIID), Jeffrey Cheah Biomedical Centre, Cambridge Biomedical Campus, University of Cambridge, Cambridge CB2 0AW, United Kingdom
4
Nagi SC, Oruni A, Weetman D, Donnelly MJ. RNA-Seq-Pop: Exploiting the sequence in RNA sequencing - A Snakemake workflow reveals patterns of insecticide resistance in the malaria vector Anopheles gambiae. Mol Ecol Resour 2023; 23:946-961. [PMID: 36695302 PMCID: PMC10568660 DOI: 10.1111/1755-0998.13759] [Citation(s) in RCA: 6] [Impact Index Per Article: 6.0] [Received: 08/03/2022] [Revised: 11/12/2022] [Accepted: 01/06/2023] [Indexed: 01/26/2023]
Abstract
We provide a reproducible and scalable Snakemake workflow, called RNA-Seq-Pop, which provides end-to-end analysis of RNA sequencing data sets. The workflow allows the user to perform quality control, perform differential expression analyses and call genomic variants. Additional options include the calculation of allele frequencies of variants of interest, summaries of genetic variation and population structure, and genome-wide selection scans, together with clear visualizations. RNA-Seq-Pop is applicable to any organism, and we demonstrate the utility of the workflow by investigating pyrethroid resistance in selected strains of the major malaria mosquito, Anopheles gambiae. The workflow provides additional modules specifically for An. gambiae, including estimating recent ancestry and determining the karyotype of common chromosomal inversions. The Busia laboratory colony used for selections was collected in Busia, Uganda, in November 2018. We performed a comparative analysis of three groups: a parental G24 Busia strain; its deltamethrin-selected G28 offspring; and the susceptible reference strain Kisumu. Measures of genetic diversity reveal patterns consistent with that of laboratory colonization and selection, with the parental Busia strain exhibiting the highest nucleotide diversity, followed by the selected Busia offspring, and finally, Kisumu. Differential expression and variant analyses reveal that the selected Busia colony exhibits a number of distinct mechanisms of pyrethroid resistance, including the Vgsc-995S target-site mutation, upregulation of SAP genes, P450s and a cluster of carboxylesterases. During deltamethrin selections, the 2La chromosomal inversion rose in frequency (from 33% to 86%), supporting a previous link with pyrethroid resistance. RNA-Seq-Pop is hosted at: github.com/sanjaynagi/rna-seq-pop. We anticipate that the workflow will provide a useful tool to facilitate reproducible, transcriptomic studies in An. gambiae and other taxa.
Affiliation(s)
- Sanjay C. Nagi
  - Department of Vector Biology, Liverpool School of Tropical Medicine, Liverpool, UK
- David Weetman
  - Department of Vector Biology, Liverpool School of Tropical Medicine, Liverpool, UK
- Martin J. Donnelly
  - Department of Vector Biology, Liverpool School of Tropical Medicine, Liverpool, UK
5
Huang P, Hameed R, Abbas M, Balooch S, Alharthi B, Du Y, Abbas A, Younas A, Du D. Integrated omic techniques and their genomic features for invasive weeds. Funct Integr Genomics 2023; 23:44. [PMID: 36680630 DOI: 10.1007/s10142-023-00971-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/08/2022] [Revised: 01/01/2023] [Accepted: 01/11/2023] [Indexed: 01/22/2023]
Abstract
Many emerging invasive weeds adapt rapidly to different stressful environments compared with their native counterparts. Rapid adaptation and dispersal habits have helped invasive populations maintain strong within-population diversity compared with their native counterparts. Advances in molecular marker techniques may lead to an in-depth understanding of the genetic diversity of invasive weeds. The use of molecular techniques is growing rapidly, and they are considered powerful tools for genomic studies of invasive weeds. Here, we review the different multi-omics approaches used in invasive weed studies to understand the functional, structural, and genomic changes in these species under different environmental fluctuations, and in particular to assess the accessibility of the advanced sequencing techniques used by researchers in genome sequencing projects. In this review-based study, we also examine the importance and efficiency of different molecular techniques in identifying and characterizing different genes, associated markers, proteins, metabolites, and key metabolic pathways in invasive and native weeds. Use of these techniques could help weed scientists to further reduce the knowledge gaps in understanding invasive weed traits. Although these techniques can provide robust insights into molecular functioning, employing a single omics platform can rarely elucidate the gene-level regulation and the associated real-time expression of weedy traits, due to the complex and overlapping nature of biological interactions. We conclude that different multi-omic techniques will provide long-term benefits in launching new genome projects to enhance the understanding of the invasion process of invasive weeds.
Affiliation(s)
- Ping Huang
  - Institute of Environment and Ecology, School of Environment and Safety Engineering, Jiangsu University, Zhenjiang, 212013, People's Republic of China
- Rashida Hameed
  - Institute of Environment and Ecology, School of Environment and Safety Engineering, Jiangsu University, Zhenjiang, 212013, People's Republic of China
- Manzer Abbas
  - School of Agriculture, Forestry and Food Engineering, Yibin University, Yibin, 644000, Sichuan Province, People's Republic of China
- Sidra Balooch
  - Institute of Botany, Bahauddin Zakariya University, Multan, Punjab, Pakistan
- Badr Alharthi
  - Department of Biology, University College of Al Khurmah, Taif University, PO. Box 11099, Taif, 21944, Saudi Arabia
- Yizhou Du
  - Faculty of Engineering, School of Computer Science, University of Sydney, Sydney, New South Wales, Australia
- Adeel Abbas
  - Institute of Environment and Ecology, School of Environment and Safety Engineering, Jiangsu University, Zhenjiang, 212013, People's Republic of China
- Afifa Younas
  - Department of Botany, Lahore College for Women University, Lahore, Pakistan
- Daolin Du
  - Institute of Environment and Ecology, School of Environment and Safety Engineering, Jiangsu University, Zhenjiang, 212013, People's Republic of China
6
Long Q, Yuan Y, Li M. RNA-SSNV: A Reliable Somatic Single Nucleotide Variant Identification Framework for Bulk RNA-Seq Data. Front Genet 2022; 13:865313. [PMID: 35846154 PMCID: PMC9279659 DOI: 10.3389/fgene.2022.865313] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/29/2022] [Accepted: 05/17/2022] [Indexed: 11/13/2022] Open
Abstract
The use of expressed somatic mutations may have a unique advantage in identifying active cancer driver mutations. However, accurately calling mutations from RNA-seq data is difficult due to confounding factors such as RNA-editing, reverse transcription, and gap alignment. In the present study, we proposed a framework (named RNA-SSNV, https://github.com/pmglab/RNA-SSNV) to call somatic single nucleotide variants (SSNV) from tumor bulk RNA-seq data. Based on a comprehensive multi-filtering strategy and a machine-learning classification model trained with comprehensively curated features, RNA-SSNV achieved the best precision–recall rate (0.880–0.884) in a testing dataset and robustly retained 0.94 AUC for the precision–recall curve in three validation adult-based TCGA (The Cancer Genome Atlas) datasets. We further showed that the somatic mutations called by RNA-SSNV tended to have a higher functional impact and therapeutic power in known driver genes. Furthermore, VAF (variant allele fraction) analysis revealed that subclones harboring expressed mutations had an evolutionary selection advantage and that RNA had higher detection power to rescue DNA-omitted mutations. In sum, RNA-SSNV will be a useful approach to accurately call expressed somatic mutations for a more insightful analysis of cancer driver genes and carcinogenic mechanisms.
Affiliation(s)
- Qihan Long
  - Zhongshan School of Medicine, Sun Yat-Sen University, Guangzhou, China
  - Center for Precision Medicine, Sun Yat-Sen University, Guangzhou, China
  - Center for Disease Genome Research, Sun Yat-Sen University, Guangzhou, China
- Yangyang Yuan
  - Zhongshan School of Medicine, Sun Yat-Sen University, Guangzhou, China
  - Center for Precision Medicine, Sun Yat-Sen University, Guangzhou, China
  - Center for Disease Genome Research, Sun Yat-Sen University, Guangzhou, China
- Miaoxin Li
  - Zhongshan School of Medicine, Sun Yat-Sen University, Guangzhou, China
  - Center for Precision Medicine, Sun Yat-Sen University, Guangzhou, China
  - Center for Disease Genome Research, Sun Yat-Sen University, Guangzhou, China
  - Guangdong Provincial Key Laboratory of Biomedical Imaging and Guangdong Provincial Engineering Research Center of Molecular Imaging, The Fifth Affiliated Hospital, Sun Yat-sen University, Zhuhai, China
  - Key Laboratory of Tropical Disease Control (SYSU), Ministry of Education, Guangzhou, China
  - *Correspondence: Miaoxin Li,
7
Ma T, Li H, Zhang X. Discovering single-cell eQTLs from scRNA-seq data only. Gene 2022; 829:146520. [PMID: 35452708 DOI: 10.1016/j.gene.2022.146520] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 07/21/2021] [Revised: 01/12/2022] [Accepted: 04/15/2022] [Indexed: 12/14/2022]
Abstract
eQTL studies are essential for understanding genomic regulation. The effects of genetic variations on gene regulation are cell-type-specific and cellular-context-related, so studying eQTLs at a single-cell level is crucial. The ideal solution is to use both mutation and expression data from the same cells. However, the current technology of such paired data in single cells is still immature. We present a new method, eQTLsingle, to discover eQTLs only with single-cell RNA-seq (scRNA-seq) data, without genomic data. It detects mutations from scRNA-seq data and models gene expression of different genotypes with the zero-inflated negative binomial (ZINB) model to find associations between genotypes and phenotypes at the single-cell level. On a glioblastoma and gliomasphere scRNA-seq dataset, eQTLsingle discovered hundreds of cell-type-specific tumor-related eQTLs, most of which cannot be found in bulk eQTL studies. Detailed analyses on examples of the discovered eQTLs revealed important underlying regulatory mechanisms. eQTLsingle is a uniquely powerful tool for utilizing the vast scRNA-seq resources for single-cell eQTL studies, and it is available for free academic use at https://github.com/horsedayday/eQTLsingle.
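As an illustration of the zero-inflated negative binomial (ZINB) model named in this abstract, the sketch below writes out the ZINB log-probability and a log-likelihood over per-cell counts in Python. It is not the eQTLsingle code; the parameterisation (mean `mu`, size `r`, zero-inflation probability `pi`) and the function names are assumptions made for illustration, and parameter fitting and the association test itself are not shown.

```python
import math


def nb_logpmf(k, mu, r):
    """Log-probability of count k under a negative binomial with mean mu > 0 and size r > 0."""
    return (
        math.lgamma(k + r) - math.lgamma(r) - math.lgamma(k + 1)
        + r * math.log(r / (r + mu))
        + k * math.log(mu / (r + mu))
    )


def zinb_logpmf(k, mu, r, pi):
    """Log-probability of count k under a ZINB with zero-inflation probability 0 <= pi < 1."""
    if k == 0:
        # A zero arises either from the extra point mass or from the negative binomial itself.
        return math.log(pi + (1.0 - pi) * math.exp(nb_logpmf(0, mu, r)))
    return math.log(1.0 - pi) + nb_logpmf(k, mu, r)


def zinb_loglik(counts, mu, r, pi):
    """Total log-likelihood of a list of per-cell expression counts."""
    return sum(zinb_logpmf(k, mu, r, pi) for k in counts)
```

One way to use such a likelihood is to evaluate the expression counts of cells carrying each genotype under separate and shared parameters and compare the resulting fits; whether eQTLsingle proceeds in exactly this way is not stated in the abstract.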
Affiliation(s)
- Tianxing Ma
  - MOE Key Laboratory of Bioinformatics, Bioinformatics Division, BNRIST and Department of Automation, Tsinghua University, Beijing 100084, China
- Haochen Li
  - School of Medicine, Center for Synthetic and Systems Biology, Tsinghua University, Beijing 100084, China
- Xuegong Zhang
  - MOE Key Laboratory of Bioinformatics, Bioinformatics Division, BNRIST and Department of Automation, Tsinghua University, Beijing 100084, China
  - School of Medicine, Center for Synthetic and Systems Biology, Tsinghua University, Beijing 100084, China
8
Whole genome re-sequencing reveals the genetic diversity and evolutionary patterns of Eucommia ulmoides. Mol Genet Genomics 2022; 297:485-494. [PMID: 35146538 DOI: 10.1007/s00438-022-01864-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/08/2021] [Accepted: 01/23/2022] [Indexed: 10/19/2022]
Abstract
Eucommia ulmoides (E. ulmoides) is a deciduous perennial tree belonging to the order Garryales, and is known as a "living fossil" plant, along with ginkgo (Ginkgo biloba), metasequoia (Metasequoia glyptostroboides) and dove tree (Davidia involucrata Baill). However, the genetic diversity and population structure of E. ulmoides are still unclear. In this study, we re-sequenced the genomes of 12 E. ulmoides accessions from different major climatic and geographic regions in China to elucidate the genetic diversity, population structure and evolutionary pattern. By integrating phylogenetic analysis, principal component analysis and population structure analysis based on a number of high-quality SNPs, the 12 E. ulmoides accessions were clustered into four different groups. This result is consistent with their geographical location except for the samples from Shanghai and Hunan province. E. ulmoides accessions from Hunan province exhibited a closer genetic relationship with accessions from Shanghai than with those from other regions, which is also supported by the result of population structure analyses. Genetic diversity analysis further revealed that E. ulmoides samples from Shanghai and Hunan province had higher genetic diversity than those from other regions in this study. In addition, we treated the E. ulmoides materials from Shanghai and Hunan province as group A, and the materials from other places as group B, and then analyzed the evolutionary pattern of E. ulmoides. The result showed significant differentiation (Fst = 0.1545) between group A and group B. Some candidate highly divergent genome regions were identified in group A by selective sweep analyses, and the function analysis of candidate genes in these regions showed that biological regulation processes could be correlated with the Eu-rubber biosynthesis. Notably, nine genes were identified from selective sweep regions. They were involved in the Eu-rubber biosynthesis and expressed in rubber-containing tissues. The genetic diversity research and evolution model of E. ulmoides were preliminarily explored in this study, which laid the foundation for the protection of germplasm resources and the development and utilization of multipurpose germplasm resources in the future.
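The differentiation statistic reported above (Fst between group A and group B) can be illustrated with a small Python sketch. The abstract does not say which estimator or software was used, so this shows only the textbook heterozygosity-based definition for biallelic sites with two equally weighted groups; the function names and the example frequencies are hypothetical.

```python
def expected_het(p):
    """Expected heterozygosity 2p(1 - p) at a biallelic site with allele frequency p."""
    return 2.0 * p * (1.0 - p)


def fst_two_groups(p_a, p_b):
    """Fst = (H_T - H_S) / H_T at one site, for two equally weighted groups."""
    p_bar = (p_a + p_b) / 2.0
    h_t = expected_het(p_bar)  # heterozygosity of the pooled population
    if h_t == 0.0:
        # Site is monomorphic across both groups; reported as 0.0 in this sketch.
        return 0.0
    h_s = (expected_het(p_a) + expected_het(p_b)) / 2.0  # mean within-group heterozygosity
    return (h_t - h_s) / h_t


def fst_multi_site(freq_pairs):
    """Ratio-of-sums Fst over several sites, given (p_a, p_b) pairs."""
    num = 0.0
    den = 0.0
    for p_a, p_b in freq_pairs:
        h_t = expected_het((p_a + p_b) / 2.0)
        h_s = (expected_het(p_a) + expected_het(p_b)) / 2.0
        num += h_t - h_s
        den += h_t
    return num / den if den > 0.0 else 0.0
```

Identical allele frequencies in the two groups give a value of 0, and fixed differences give 1, so intermediate values indicate partial differentiation. Estimators applied to real re-sequencing data usually also correct for sample size, which this sketch omits.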
9
Gundogdu P, Loucera C, Alamo-Alvarez I, Dopazo J, Nepomuceno I. Integrating pathway knowledge with deep neural networks to reduce the dimensionality in single-cell RNA-seq data. BioData Min 2022; 15:1. [PMID: 34980200 PMCID: PMC8722116 DOI: 10.1186/s13040-021-00285-4] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 08/30/2021] [Accepted: 12/04/2021] [Indexed: 11/13/2022] Open
Abstract
Background: Single-cell RNA sequencing (scRNA-seq) data provide valuable insights into cellular heterogeneity, which is significantly improving the current knowledge on biology and human disease. One of the main applications of scRNA-seq data analysis is the identification of new cell types and cell states. Deep neural networks (DNNs) are among the best methods to address this problem. However, this performance comes at the cost of a lack of interpretability in the results. In this work we propose an intelligible pathway-driven neural network to correctly solve cell-type related problems at single-cell resolution while providing a biologically meaningful representation of the data. Results: In this study, we explored deep neural networks constrained by several types of prior biological information, e.g. signaling pathway information, as a way to reduce the dimensionality of the scRNA-seq data. We have tested the proposed biologically-based architectures on thousands of cells of human and mouse origin across a collection of public datasets in order to check the performance of the model. Specifically, we tested the architecture across different validation scenarios that try to mimic how unknown cell types are clustered by the DNN and how it correctly annotates cell types by querying a database in a retrieval problem. Moreover, our approach proved to be comparable to other less interpretable DNN approaches constrained by using protein-protein interactions gene regulation data. Finally, we show how the latent structure learned by the network could be used to visualize and to interpret the composition of human single cell datasets. Conclusions: Here we demonstrate how the integration of pathways, which convey fundamental information on functional relationships between genes, with DNNs, which provide an excellent classification framework, results in an excellent alternative to learn a biologically meaningful representation of scRNA-seq data. In addition, the introduction of prior biological knowledge in the DNN reduces the size of the network architecture. Comparative results demonstrate a superior performance of this approach with respect to other similar approaches. As an additional advantage, the use of pathways within the DNN structure enables easy interpretability of the results by connecting features to cell functionalities by means of the pathway nodes, as demonstrated with an example with human melanoma tumor cells. Supplementary Information: The online version contains supplementary material available at 10.1186/s13040-021-00285-4.
Affiliation(s)
- Pelin Gundogdu
  - Clinical Bioinformatics Area. Fundación Progreso y Salud (FPS). CDCA, Hospital Virgen del Rocio, 41013, Sevilla, Spain
- Carlos Loucera
  - Clinical Bioinformatics Area. Fundación Progreso y Salud (FPS). CDCA, Hospital Virgen del Rocio, 41013, Sevilla, Spain
  - Computational Systems Medicine, Institute of Biomedicine of Seville (IBIS), Hospital Virgen del Rocio, 41013, Sevilla, Spain
- Inmaculada Alamo-Alvarez
  - Clinical Bioinformatics Area. Fundación Progreso y Salud (FPS). CDCA, Hospital Virgen del Rocio, 41013, Sevilla, Spain
  - Computational Systems Medicine, Institute of Biomedicine of Seville (IBIS), Hospital Virgen del Rocio, 41013, Sevilla, Spain
- Joaquin Dopazo
  - Clinical Bioinformatics Area. Fundación Progreso y Salud (FPS). CDCA, Hospital Virgen del Rocio, 41013, Sevilla, Spain
  - Computational Systems Medicine, Institute of Biomedicine of Seville (IBIS), Hospital Virgen del Rocio, 41013, Sevilla, Spain
  - Bioinformatics in Rare Diseases (BiER), Centro de Investigación Biomédica en Red de Enfermedades Raras (CIBERER), FPS, Hospital Virgen del Rocío, 41013, Sevilla, Spain
  - FPS/ELIXIR-es, Hospital Virgen del Rocío, 42013, Sevilla, Spain
- Isabel Nepomuceno
  - Department of Computer Languages and Systems, Universidad de Sevilla, Sevilla, Spain
10
He Y, Huang L, Tang Y, Yang Z, Han Z. Genome-wide Identification and Analysis of Splicing QTLs in Multiple Sclerosis by RNA-Seq Data. Front Genet 2021; 12:769804. [PMID: 34868258 PMCID: PMC8633104 DOI: 10.3389/fgene.2021.769804] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 09/02/2021] [Accepted: 10/18/2021] [Indexed: 12/21/2022] Open
Abstract
Multiple sclerosis (MS) is an autoimmune disease characterized by inflammatory demyelinating lesions in the central nervous system. Recently, the dysregulation of alternative splicing (AS) in the brain has been found to significantly influence the progression of MS. Moreover, previous studies demonstrate that many MS-related variants in the genome act as important regulatory factors of AS events and contribute to the pathogenesis of MS. However, so far, no genome-wide research on the effect of genomic variants on AS events in MS has been reported. Here, we first implemented a strategy to obtain genomic variant genotypes and AS isoform average percentage spliced-in values from RNA-seq data of 142 individuals (51 MS patients and 91 controls). Then, combining the two sets of data, we performed a cis-splicing quantitative trait loci (sQTLs) analysis to identify the cis-acting loci and the affected differential AS events in MS and further explored the characteristics of these cis-sQTLs. Finally, the weighted gene coexpression network and gene set enrichment analyses were used to investigate the gene interaction pattern and functions of the affected AS events in MS. In total, we identified 5835 variants affecting 672 differential AS events. The cis-sQTLs tend to be distributed in proximity to the gene transcription initiation site, and the intronic variants among them are more capable of regulating AS events. The retained intron AS events are more susceptible to the influence of genome variants, and their functions are involved in protein kinase and phosphorylation modification. In summary, these findings provide an insight into the mechanism of MS.
Affiliation(s)
- Zhijie Han
  - Department of Bioinformatics, School of Basic Medicine, Chongqing Medical University, Chongqing, China
11
Subramaniam N, Nair R, Marsden PA. Epigenetic Regulation of the Vascular Endothelium by Angiogenic LncRNAs. Front Genet 2021; 12:668313. [PMID: 34512715 PMCID: PMC8427604 DOI: 10.3389/fgene.2021.668313] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 02/16/2021] [Accepted: 05/17/2021] [Indexed: 12/15/2022] Open
Abstract
The functional properties of the vascular endothelium are diverse and heterogeneous between vascular beds. This is especially evident when new blood vessels develop from a pre-existing closed cardiovascular system, a process termed angiogenesis. Endothelial cells are key drivers of angiogenesis as they undergo a highly choreographed cascade of events that has both exogenous (e.g., hypoxia and VEGF) and endogenous regulatory inputs. Not surprisingly, angiogenesis is critical in health and disease. Diverse therapeutics target proteins involved in coordinating angiogenesis with varying degrees of efficacy. It is of great interest that recent work on non-coding RNAs, especially long non-coding RNAs (lncRNAs), indicates that they are also important regulators of the gene expression paradigms that underpin this cellular cascade. The protean effects of lncRNAs are dependent, in part, on their subcellular localization. For instance, lncRNAs enriched in the nucleus can act as epigenetic modifiers of gene expression in the vascular endothelium. Of great interest to genetic disease, they are undergoing rapid evolution and show extensive inter- and intra-species heterogeneity. In this review, we describe endothelial-enriched lncRNAs that have robust effects in angiogenesis.
Affiliation(s)
- Noeline Subramaniam
  - Marsden Lab, Institute of Medical Sciences, University of Toronto, Toronto, ON, Canada
  - Marsden Lab, Keenan Research Centre in the Li Ka Shing Knowledge Institute, St. Michael’s Hospital, Toronto, ON, Canada
- Ranju Nair
  - Marsden Lab, Keenan Research Centre in the Li Ka Shing Knowledge Institute, St. Michael’s Hospital, Toronto, ON, Canada
  - Marsden Lab, Department of Laboratory Medicine and Pathobiology, University of Toronto, Toronto, ON, Canada
- Philip A. Marsden
  - Marsden Lab, Institute of Medical Sciences, University of Toronto, Toronto, ON, Canada
  - Marsden Lab, Keenan Research Centre in the Li Ka Shing Knowledge Institute, St. Michael’s Hospital, Toronto, ON, Canada
  - Marsden Lab, Department of Laboratory Medicine and Pathobiology, University of Toronto, Toronto, ON, Canada
  - Department of Medicine, University of Toronto, Toronto, ON, Canada
|
12
|
Jehl F, Degalez F, Bernard M, Lecerf F, Lagoutte L, Désert C, Coulée M, Bouchez O, Leroux S, Abasht B, Tixier-Boichard M, Bed'hom B, Burlot T, Gourichon D, Bardou P, Acloque H, Foissac S, Djebali S, Giuffra E, Zerjal T, Pitel F, Klopp C, Lagarrigue S. RNA-Seq Data for Reliable SNP Detection and Genotype Calling: Interest for Coding Variant Characterization and Cis-Regulation Analysis by Allele-Specific Expression in Livestock Species. Front Genet 2021; 12:655707. [PMID: 34262593 PMCID: PMC8273700 DOI: 10.3389/fgene.2021.655707] [Citation(s) in RCA: 21] [Impact Index Per Article: 7.0] [Received: 01/19/2021] [Accepted: 06/01/2021] [Indexed: 12/19/2022] Open
Abstract
In addition to their common use in studying gene expression, RNA-seq data accumulated over the last 10 years are a yet-unexploited resource of SNPs in numerous individuals from different populations. SNP detection by RNA-seq is particularly interesting for livestock species since whole genome sequencing is expensive and exome sequencing tools are unavailable. These SNPs detected in expressed regions can be used to characterize variants affecting protein functions, and to study cis-regulated genes by analyzing allele-specific expression (ASE) in the tissue of interest. However, gene expression can be highly variable, and filters for SNP detection using the popular GATK toolkit are not yet standardized, making SNP detection and genotype calling by RNA-seq a challenging endeavor. We compared SNP calling results using GATK suggested filters, on two chicken populations for which both RNA-seq and DNA-seq data were available for the same samples of the same tissue. We showed, in expressed regions, an RNA-seq precision of 91% (SNPs detected by RNA-seq and shared by DNA-seq) and we characterized the remaining 9% of SNPs. We then studied the genotype (GT) obtained by RNA-seq and the impact of two factors (GT call-rate and read number per GT) on the concordance of GT with DNA-seq; we proposed thresholds for them leading to a 95% concordance. Applying these thresholds to 767 multi-tissue RNA-seq samples from 382 birds of 11 chicken populations, we found 9.5 M SNPs in total, including ∼550,000 SNPs per tissue and population with a reliable GT (call rate ≥ 50%), of which ∼340,000 had a MAF ≥ 10%. We showed that such RNA-seq data from one tissue can be used to (i) detect SNPs with a strong predicted impact on proteins, despite their scarcity in each population (16,307 SIFT deleterious missenses and 590 stop-gained), (ii) study, on a large scale, cis-regulation of gene expression, with ∼81% of protein-coding and 68% of long non-coding genes (TPM ≥ 1) that can be analyzed for ASE, and with ∼29% of them that were cis-regulated, and (iii) analyze population genetics using such SNPs located in expressed regions. This work shows that RNA-seq data can be used with good confidence to detect SNPs and associated GT within various populations and to use them for different analyses, as in GTEx studies.
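The two population-level filters quoted in this abstract (genotype call rate ≥ 50% and MAF ≥ 10%) can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names, the 0/1/2 genotype coding with `None` for a missing call, and the example data are all assumptions made for illustration, and the per-genotype read-number threshold mentioned in the abstract is not modelled because its value is not given here.

```python
from typing import Optional, Sequence

# Assumed coding: 0, 1 or 2 copies of the alternate allele; None = genotype not called.
Genotype = Optional[int]


def call_rate(genotypes: Sequence[Genotype]) -> float:
    """Fraction of samples with a called (non-missing) genotype."""
    if not genotypes:
        return 0.0
    called = [g for g in genotypes if g is not None]
    return len(called) / len(genotypes)


def minor_allele_frequency(genotypes: Sequence[Genotype]) -> float:
    """Minor allele frequency computed over called genotypes only."""
    called = [g for g in genotypes if g is not None]
    if not called:
        return 0.0
    alt_freq = sum(called) / (2 * len(called))
    return min(alt_freq, 1.0 - alt_freq)


def passes_filters(genotypes: Sequence[Genotype],
                   min_call_rate: float = 0.5,
                   min_maf: float = 0.10) -> bool:
    """Apply the call-rate and MAF thresholds quoted in the abstract."""
    return (call_rate(genotypes) >= min_call_rate
            and minor_allele_frequency(genotypes) >= min_maf)


# Hypothetical SNP genotyped in eight birds, three of them uncalled.
example = [0, 1, None, 2, None, 1, 0, None]
print(call_rate(example), minor_allele_frequency(example), passes_filters(example))
```

Computing the MAF over called genotypes only is one possible convention; a real analysis might instead require a minimum number of called samples before estimating allele frequencies.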
Affiliation(s)
- Frédéric Jehl
  - INRAE, INSTITUT AGRO, PEGASE UMR 1348, Saint-Gilles, France
- Fabien Degalez
  - INRAE, INSTITUT AGRO, PEGASE UMR 1348, Saint-Gilles, France
- Maria Bernard
  - INRAE, SIGENAE, Genotoul Bioinfo MIAT, Castanet-Tolosan, France
  - INRAE, AgroParisTech, Université Paris-Saclay, GABI UMR 1313, Jouy-en-Josas, France
- Colette Désert
  - INRAE, INSTITUT AGRO, PEGASE UMR 1348, Saint-Gilles, France
- Manon Coulée
  - INRAE, INSTITUT AGRO, PEGASE UMR 1348, Saint-Gilles, France
- Olivier Bouchez
  - INRAE, US 1426, GeT-PlaGe, Genotoul, Castanet-Tolosan, France
- Sophie Leroux
  - INRAE, INPT, ENVT, Université de Toulouse, GenPhySE UMR 1388, Castanet-Tolosan, France
- Behnam Abasht
  - Department of Animal and Food Sciences, University of Delaware, Newark, DE, United States
- Bertrand Bed'hom
  - INRAE, AgroParisTech, Université Paris-Saclay, GABI UMR 1313, Jouy-en-Josas, France
- Philippe Bardou
  - INRAE, SIGENAE, Genotoul Bioinfo MIAT, Castanet-Tolosan, France
- Hervé Acloque
  - INRAE, AgroParisTech, Université Paris-Saclay, GABI UMR 1313, Jouy-en-Josas, France
- Sylvain Foissac
  - INRAE, INPT, ENVT, Université de Toulouse, GenPhySE UMR 1388, Castanet-Tolosan, France
- Sarah Djebali
  - INRAE, INPT, ENVT, Université de Toulouse, GenPhySE UMR 1388, Castanet-Tolosan, France
- Elisabetta Giuffra
  - INRAE, AgroParisTech, Université Paris-Saclay, GABI UMR 1313, Jouy-en-Josas, France
- Tatiana Zerjal
  - INRAE, AgroParisTech, Université Paris-Saclay, GABI UMR 1313, Jouy-en-Josas, France
- Frédérique Pitel
  - INRAE, INPT, ENVT, Université de Toulouse, GenPhySE UMR 1388, Castanet-Tolosan, France
|
13
|
Da Broi MG, Plaça JR, Silva WAD, Ferriani RA, Navarro PA. Screening of Variants in the Transcript Profile of Eutopic Endometrium from Infertile Women with Endometriosis during the Implantation Window. REVISTA BRASILEIRA DE GINECOLOGIA E OBSTETRÍCIA 2021; 43:457-466. [PMID: 34318471 PMCID: PMC10411168 DOI: 10.1055/s-0041-1730287] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/14/2020] [Accepted: 02/12/2021] [Indexed: 10/20/2022] Open
Abstract
OBJECTIVE Abnormalities in the eutopic endometrium of women with endometriosis may be related to disease-associated infertility. Although previous RNA-sequencing analysis did not show differential expression in endometrial transcripts of endometriosis patients, other molecular alterations could impact protein synthesis and endometrial receptivity. Our aim was to screen for functional mutations in the transcripts of eutopic endometria of infertile women with endometriosis and controls during the implantation window. METHODS Data from RNA-sequencing of endometrial biopsies collected during the implantation window from 17 patients (6 infertile women with endometriosis, 6 infertile controls, 5 fertile controls) were analyzed for variant discovery and identification of functional mutations. A targeted study of the alterations found was performed to place the data in the context of the disease. RESULTS None of the variants identified was common to other samples within the same group, and no mutation was repeated among patients with endometriosis, infertile controls, and fertile controls. In the endometriosis group, nine predicted deleterious mutations were identified, but only one had previously been associated with a clinical condition, one with no endometrial impact. When crossing the mutated genes with the descriptors endometriosis and/or endometrium, the gene CMKLR1 was associated either with the inflammatory response in endometriosis or with endometrial processes for pregnancy establishment. CONCLUSION Although no pattern of mutation was found, we note the small sample size and the limitations of variant analysis based on RNA-sequencing data. Considering the screening purpose of the study and the importance of the CMKLR1 gene in endometrial modulation, it could be a candidate gene for further, adequately powered studies evaluating mutations in eutopic endometria from endometriosis patients.
Affiliation(s)
- Michele Gomes Da Broi
  - Department of Gynecology and Obstetrics, Universidade de São Paulo, Ribeirão Preto, SP, Brazil
- Jessica Rodrigues Plaça
  - Department of Gynecology and Obstetrics, Universidade de São Paulo, Ribeirão Preto, SP, Brazil
- Wilson Araújo da Silva
  - Department of Gynecology and Obstetrics, Universidade de São Paulo, Ribeirão Preto, SP, Brazil
- Rui Alberto Ferriani
  - Department of Gynecology and Obstetrics, Universidade de São Paulo, Ribeirão Preto, SP, Brazil
- Paula Andrea Navarro
  - Department of Gynecology and Obstetrics, Universidade de São Paulo, Ribeirão Preto, SP, Brazil
|
14
|
Youssefian L, Saeidian AH, Palizban F, Bagherieh A, Abdollahimajd F, Sotoudeh S, Mozafari N, Farahani RA, Mahmoudi H, Babashah S, Zabihi M, Zeinali S, Fortina P, Salas-Alanis JC, South AP, Vahidnezhad H, Uitto J. Whole-Transcriptome Analysis by RNA Sequencing for Genetic Diagnosis of Mendelian Skin Disorders in the Context of Consanguinity. Clin Chem 2021; 67:876-888. [PMID: 33969388 DOI: 10.1093/clinchem/hvab042] [Citation(s) in RCA: 15] [Impact Index Per Article: 5.0] [Received: 09/15/2020] [Accepted: 02/11/2021] [Indexed: 02/07/2023]
Abstract
BACKGROUND Among the approximately 8000 Mendelian disorders, >1000 have cutaneous manifestations. In many of these conditions, the underlying mutated genes have been identified by DNA-based techniques which, however, can overlook certain types of mutations, such as exonic-synonymous and deep-intronic sequence variants. Whole-transcriptome sequencing by RNA sequencing (RNA-seq) can identify such mutations and provide information about their consequences. METHODS We analyzed the whole transcriptome of 40 families with different types of Mendelian skin disorders with extensive genetic heterogeneity. The RNA-seq data were examined for variant detection and prioritization, pathogenicity confirmation, RNA expression profiling, and genome-wide homozygosity mapping in the case of consanguineous families. Among the families examined, RNA-seq was able to provide information complementary to DNA-based analyses for exonic and intronic sequence variants with aberrant splicing. In addition, we tested the possibility of using RNA-seq as the first-tier strategy for unbiased genome-wide mutation screening without information from DNA analysis. RESULTS We found pathogenic mutations in 35 families (88%) with RNA-seq in combination with other next-generation sequencing methods, and we successfully prioritized variants and found the culprit genes. In addition, as a novel concept, we propose a pipeline that increases the yield of variant calling from RNA-seq by concurrent use of genome and transcriptome references in parallel. CONCLUSIONS Our results suggest that "clinical RNA-seq" could serve as a primary approach for mutation detection in inherited diseases, particularly in consanguineous families, provided that tissues and cells expressing the relevant genes are available for analysis.
Affiliation(s)
- Leila Youssefian
  - Jefferson Institute of Molecular Medicine, Thomas Jefferson University, Philadelphia, PA, USA
  - Department of Dermatology and Cutaneous Biology, Sidney Kimmel Medical College, Thomas Jefferson University, Philadelphia, PA, USA
  - Genetics, Genomics and Cancer Biology PhD Program, Thomas Jefferson University, Philadelphia, PA, USA
- Amir Hossein Saeidian
  - Jefferson Institute of Molecular Medicine, Thomas Jefferson University, Philadelphia, PA, USA
  - Department of Dermatology and Cutaneous Biology, Sidney Kimmel Medical College, Thomas Jefferson University, Philadelphia, PA, USA
  - Genetics, Genomics and Cancer Biology PhD Program, Thomas Jefferson University, Philadelphia, PA, USA
- Fahimeh Palizban
  - Laboratory of Complex Biological Systems and Bioinformatics, Institute of Biochemistry and Biophysics, University of Tehran, Tehran, Iran
- Atefeh Bagherieh
  - Department of Molecular Genetics, Faculty of Biological Sciences, Tarbiat Modares University, Tehran, Iran
- Soheila Sotoudeh
  - Department of Dermatology, Children's Medical Center, Center of Excellence, Tehran University of Medical Sciences, Tehran, Iran
- Nikoo Mozafari
  - Skin Research Center, Shahid Beheshti University of Medical Sciences, Tehran, Iran
- Rahele A Farahani
  - Division of Nephrology and Hypertension, Mayo Clinic, Rochester, Minnesota, United States of America
- Hamidreza Mahmoudi
  - Department of Dermatology, Razi Hospital, Tehran University of Medical Sciences, Tehran, Iran
- Sadegh Babashah
  - Department of Molecular Genetics, Faculty of Biological Sciences, Tarbiat Modares University, Tehran, Iran
- Paolo Fortina
  - Cancer Genomics and Bioinformatics, Department of Cancer Biology, Sidney Kimmel Cancer Center, Thomas Jefferson University, Philadelphia, PA, USA
  - Department of Translation and Precision Medicine, Sapienza University, Rome, Italy
- Andrew P South
  - Department of Dermatology and Cutaneous Biology, Sidney Kimmel Medical College, Thomas Jefferson University, Philadelphia, PA, USA
- Hassan Vahidnezhad
  - Jefferson Institute of Molecular Medicine, Thomas Jefferson University, Philadelphia, PA, USA
  - Department of Dermatology and Cutaneous Biology, Sidney Kimmel Medical College, Thomas Jefferson University, Philadelphia, PA, USA
- Jouni Uitto
  - Jefferson Institute of Molecular Medicine, Thomas Jefferson University, Philadelphia, PA, USA
  - Department of Dermatology and Cutaneous Biology, Sidney Kimmel Medical College, Thomas Jefferson University, Philadelphia, PA, USA
|
15
|
Integration of SNP Disease Association, eQTL, and Enrichment Analyses to Identify Risk SNPs and Susceptibility Genes in Chronic Obstructive Pulmonary Disease. BIOMED RESEARCH INTERNATIONAL 2020; 2020:3854196. [PMID: 33457407 PMCID: PMC7785362 DOI: 10.1155/2020/3854196] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 07/27/2020] [Revised: 12/09/2020] [Accepted: 12/15/2020] [Indexed: 12/14/2022]
Abstract
Chronic obstructive pulmonary disease (COPD) is a complex disease caused by the disturbance of genetic and environmental factors. Single-nucleotide polymorphisms (SNPs) play a vital role in the genetic dissection of complex diseases. In-depth analysis of SNP-related information could recognize disease-associated biomarkers and further uncover the genetic mechanism of complex diseases. Risk-related variants might act on the disease by affecting gene expression and gene function. Through integrating SNP disease association study and expression quantitative trait loci (eQTL) analysis, as well as functional enrichment of containing known causal genes, four risk SNPs and four corresponding susceptibility genes were identified utilizing next-generation sequencing (NGS) data of COPD. Of the four risk SNPs, one could be found in the SNPedia database that stored disease-related SNPs and has been linked to a disease in the literature. Four genes showed significant differences from the perspective of normal/disease or variant/nonvariant samples, as well as the high performance of sample classification. It is speculated that the four susceptibility genes could be used as biomarkers of COPD. Furthermore, three of our susceptibility genes have been confirmed in the literature to be associated with COPD. Among them, two genes had an impact on the significance of expression correlation of known causal genes they interact with, respectively. Overall, this research may present novel insights into the diagnosis and pathogenesis of COPD and susceptibility gene identification of other complex diseases.
|
16
|
Quaglieri A, Flensburg C, Speed TP, Majewski IJ. Finding a suitable library size to call variants in RNA-Seq. BMC Bioinformatics 2020; 21:553. [PMID: 33261552 PMCID: PMC7708150 DOI: 10.1186/s12859-020-03860-4] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 06/19/2020] [Accepted: 11/03/2020] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND RNA sequencing allows the study of both gene expression changes and transcribed mutations, providing a highly effective way to gain insight into cancer biology. When planning the sequencing of a large cohort of samples, library size is a fundamental factor affecting both the overall cost and the quality of the results. Here we specifically address how overall library size influences the detection of somatic mutations in RNA-seq data in two acute myeloid leukaemia datasets. RESULTS We simulated shallower sequencing depths by downsampling 45 acute myeloid leukaemia samples (100 bp PE) that are part of the Leucegene project, which were originally sequenced at high depth. We compared the sensitivity of six methods of recovering validated mutations on the same samples. The methods compared are a combination of three popular callers (MuTect, VarScan, and VarDict) and two filtering strategies. We observed an incremental loss in sensitivity when simulating libraries of 80M, 50M, 40M, 30M and 20M fragments, with the largest loss detected with fewer than 30M fragments (below 90%, average loss of 7%). The sensitivity in recovering insertions and deletions varied markedly between callers, with VarDict showing the highest sensitivity (60%). Single nucleotide variant sensitivity is relatively consistent across methods, apart from MuTect, whose default filters need adjustment when using RNA-Seq. We also analysed 136 RNA-Seq samples from the TCGA-LAML cohort (50 bp PE) and assessed the change in sensitivity between the initial libraries (average 59M fragments) and after downsampling to 40M fragments. When considering single nucleotide variants in recurrently mutated myeloid genes we found a comparable performance, with a 6% average loss in sensitivity using 40M fragments. CONCLUSIONS Between 30M and 40M 100 bp PE reads are needed to recover 90-95% of the initial variants on recurrently mutated myeloid genes. To extend this result to another cancer type, an exploration of the characteristics of its mutations and gene expression patterns is suggested.
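The downsampling experiment summarized above rests on two simple quantities: the fraction of fragments to keep in order to simulate a smaller library, and the sensitivity with which validated mutations are recovered afterwards. The Python sketch below illustrates both under stated assumptions; the function names, the random per-fragment sampling scheme, and the toy variant labels are hypothetical and are not taken from the study, which worked on real alignments with MuTect, VarScan, and VarDict.

```python
import random
from typing import Iterable, List, Set


def downsample_fraction(original_fragments: int, target_fragments: int) -> float:
    """Fraction of fragments to keep to simulate a library of the target size."""
    if original_fragments <= 0:
        raise ValueError("original_fragments must be positive")
    return min(1.0, target_fragments / original_fragments)


def downsample(fragment_ids: Iterable[str], fraction: float, seed: int = 0) -> List[str]:
    """Keep each fragment independently with probability `fraction`."""
    rng = random.Random(seed)  # fixed seed so the simulated library is reproducible
    return [f for f in fragment_ids if rng.random() < fraction]


def sensitivity(called: Set[str], validated: Set[str]) -> float:
    """Share of validated mutations that were recovered by the caller."""
    if not validated:
        raise ValueError("validated set must not be empty")
    return len(called & validated) / len(validated)


# Going from an average 59M-fragment library to 40M fragments, as in the TCGA-LAML comparison.
fraction = downsample_fraction(59_000_000, 40_000_000)
print(round(fraction, 3))

# Toy example with hypothetical variant identifiers.
print(sensitivity({"var_a", "var_b", "var_x"}, {"var_a", "var_b", "var_c", "var_d"}))
```

Sampling each fragment independently gives a library whose size is only approximately the target; drawing an exact number of fragments without replacement would be an equally reasonable choice.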
Affiliation(s)
- Anna Quaglieri
  - Walter and Eliza Hall Institute of Medical Research, 1G Royal Parade, Parkville, 3052, Australia
  - Faculty of Medicine, Dentistry and Health Sciences, The University of Melbourne, Grattan St, Melbourne, 3010, Australia
- Christoffer Flensburg
  - Walter and Eliza Hall Institute of Medical Research, 1G Royal Parade, Parkville, 3052, Australia
- Terence P Speed
  - Walter and Eliza Hall Institute of Medical Research, 1G Royal Parade, Parkville, 3052, Australia
  - Faculty of Medicine, Dentistry and Health Sciences, The University of Melbourne, Grattan St, Melbourne, 3010, Australia
  - Department of Mathematics and Statistics, The University of Melbourne, 813 Swanston Street, Melbourne, 3010, Australia
- Ian J Majewski
  - Walter and Eliza Hall Institute of Medical Research, 1G Royal Parade, Parkville, 3052, Australia
  - Faculty of Medicine, Dentistry and Health Sciences, The University of Melbourne, Grattan St, Melbourne, 3010, Australia
|
17
|
Lam S, Zeidan J, Miglior F, Suárez-Vega A, Gómez-Redondo I, Fonseca PAS, Guan LL, Waters S, Cánovas A. Development and comparison of RNA-sequencing pipelines for more accurate SNP identification: practical example of functional SNP detection associated with feed efficiency in Nellore beef cattle. BMC Genomics 2020; 21:703. [PMID: 33032519 PMCID: PMC7545862 DOI: 10.1186/s12864-020-07107-7] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 01/21/2020] [Accepted: 09/28/2020] [Indexed: 12/14/2022] Open
Abstract
Background Optimization of an RNA-Sequencing (RNA-Seq) pipeline is critical to maximize power and accuracy to identify genetic variants, including SNPs, which may serve as genetic markers to select for feed efficiency, leading to economic benefits for beef production. This study used RNA-Seq data (GEO Accession ID: PRJEB7696 and PRJEB15314) from muscle and liver tissue, respectively, from 12 Nellore beef steers selected from 585 steers with residual feed intake measures (RFI; n = 6 low-RFI, n = 6 high-RFI). Three RNA-Seq pipelines were compared including multi-sample calling from i) non-merged samples; ii) merged samples by RFI group, iii) merged samples by RFI and tissue group. The RNA-Seq reads were aligned against the UMD3.1 bovine reference genome (release 94) assembly using STAR aligner. Variants were called using BCFtools and variant effect prediction (VeP) and functional annotation (ToppGene) analyses were performed. Results On average, total reads detected for Approach i) non-merged samples for liver and muscle, were 18,362,086.3 and 35,645,898.7, respectively. For Approach ii), merging samples by RFI group, total reads detected for each merged group was 162,030,705, and for Approach iii), merging samples by RFI group and tissues, was 324,061,410, revealing the highest read depth for Approach iii). Additionally, Approach iii) merging samples by RFI group and tissues, revealed the highest read depth per variant coverage (572.59 ± 3993.11) and encompassed the majority of localized positional genes detected by each approach. This suggests Approach iii) had optimized detection power, read depth, and accuracy of SNP calling, therefore increasing confidence of variant detection and reducing false positive detection. Approach iii) was then used to detect unique SNPs fixed within low- (12,145) and high-RFI (14,663) groups. Functional annotation of SNPs revealed positional candidate genes, for each RFI group (2886 for low-RFI, 3075 for high-RFI), which were significantly (P < 0.05) associated with immune and metabolic pathways. Conclusion The most optimized RNA-Seq pipeline allowed for more accurate identification of SNPs, associated positional candidate genes, and significantly associated metabolic pathways in muscle and liver tissues, providing insight on the underlying genetic architecture of feed efficiency in beef cattle.
Affiliation(s)
- S Lam
  - Centre for Genetic Improvement of Livestock, Department of Animal Biosciences, University of Guelph, 50 Stone Road E, Guelph, Ontario, N1G2W1, Canada
- J Zeidan
  - Centre for Genetic Improvement of Livestock, Department of Animal Biosciences, University of Guelph, 50 Stone Road E, Guelph, Ontario, N1G2W1, Canada
- F Miglior
  - Centre for Genetic Improvement of Livestock, Department of Animal Biosciences, University of Guelph, 50 Stone Road E, Guelph, Ontario, N1G2W1, Canada
- A Suárez-Vega
  - Centre for Genetic Improvement of Livestock, Department of Animal Biosciences, University of Guelph, 50 Stone Road E, Guelph, Ontario, N1G2W1, Canada
- I Gómez-Redondo
  - Centre for Genetic Improvement of Livestock, Department of Animal Biosciences, University of Guelph, 50 Stone Road E, Guelph, Ontario, N1G2W1, Canada
  - Spanish National Institute for Agriculture and Food Research and Technology, Carretera de La Coruña, 28040, Madrid, Spain
- P A S Fonseca
  - Centre for Genetic Improvement of Livestock, Department of Animal Biosciences, University of Guelph, 50 Stone Road E, Guelph, Ontario, N1G2W1, Canada
- L L Guan
  - Department of Agriculture, Food & Nutritional Science, University of Alberta, Edmonton, Alberta, T6H 2P5, Canada
- S Waters
  - Teagasc, Animal & Grassland Research and Innovation Centre, Grange, Dunsany, Co. Meath, C15 PW93, Ireland
- A Cánovas
  - Centre for Genetic Improvement of Livestock, Department of Animal Biosciences, University of Guelph, 50 Stone Road E, Guelph, Ontario, N1G2W1, Canada
|
18
|
Sun YM, Chen YQ. Principles and innovative technologies for decrypting noncoding RNAs: from discovery and functional prediction to clinical application. J Hematol Oncol 2020; 13:109. [PMID: 32778133 PMCID: PMC7416809 DOI: 10.1186/s13045-020-00945-8] [Citation(s) in RCA: 58] [Impact Index Per Article: 14.5] [Received: 06/01/2020] [Accepted: 07/27/2020] [Indexed: 12/20/2022] Open
Abstract
Noncoding RNAs (ncRNAs) are a large segment of the transcriptome that do not have apparent protein-coding roles, but they have been verified to play important roles in diverse biological processes, including disease pathogenesis. With the development of innovative technologies, an increasing number of novel ncRNAs have been uncovered; information about their prominent tissue-specific expression patterns, various interaction networks, and subcellular locations will undoubtedly enhance our understanding of their potential functions. Here, we summarize the principles and innovative methods for the identification of novel ncRNAs that have potential functional roles in cancer biology. Moreover, this review also provides alternative ncRNA databases based on high-throughput sequencing or experimental validation, and it briefly describes the current strategy for the clinical translation of cancer-associated ncRNAs to be used in diagnosis.
Affiliation(s)
- Yu-Meng Sun
  - MOE Key Laboratory of Gene Function and Regulation, State Key Laboratory for Biocontrol, School of Life Sciences, Sun Yat-sen University, Guangzhou, 510275 People’s Republic of China
- Yue-Qin Chen
  - MOE Key Laboratory of Gene Function and Regulation, State Key Laboratory for Biocontrol, School of Life Sciences, Sun Yat-sen University, Guangzhou, 510275 People’s Republic of China
|
19
|
Serin Harmanci A, Harmanci AO, Zhou X. CaSpER identifies and visualizes CNV events by integrative analysis of single-cell or bulk RNA-sequencing data. Nat Commun 2020; 11:89. [PMID: 31900397 PMCID: PMC6941987 DOI: 10.1038/s41467-019-13779-x] [Citation(s) in RCA: 86] [Impact Index Per Article: 21.5] [Received: 10/10/2018] [Accepted: 11/25/2019] [Indexed: 12/15/2022] Open
Abstract
RNA sequencing experiments generate large amounts of information about expression levels of genes. Although they are mainly used for quantifying expression levels, they contain much more biologically important information such as copy number variants (CNVs). Here, we present CaSpER, a signal processing approach for identification, visualization, and integrative analysis of focal and large-scale CNV events in multiscale resolution using either bulk or single-cell RNA sequencing data. CaSpER integrates the multiscale smoothing of expression signal and allelic shift signals for CNV calling. The allelic shift signal measures loss of heterozygosity (LOH), which is valuable for CNV identification. CaSpER employs an efficient methodology for the generation of a genome-wide B-allele frequency (BAF) signal profile from the reads and utilizes it for correction of CNV calls. CaSpER increases the utility of RNA-sequencing datasets and complements other tools for complete characterization and visualization of the genomic and transcriptomic landscape of single cell and bulk RNA sequencing data.
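As a rough illustration of the B-allele frequency (BAF) and allelic shift signals mentioned in this abstract, the Python sketch below computes a per-site BAF from read counts and its deviation from the 0.5 expected at a balanced heterozygous site. This is a textbook-style simplification and not CaSpER's implementation: the function names and read counts are hypothetical, and the multiscale smoothing and CNV calling steps described by the authors are not reproduced.

```python
from typing import Optional


def b_allele_frequency(ref_reads: int, alt_reads: int) -> Optional[float]:
    """Fraction of reads supporting the alternate (B) allele; None if the site is uncovered."""
    total = ref_reads + alt_reads
    if total == 0:
        return None
    return alt_reads / total


def allelic_shift(baf: float) -> float:
    """Deviation of the BAF from 0.5, the value expected at a balanced heterozygous site."""
    return abs(baf - 0.5)


# Hypothetical heterozygous sites: one balanced, one showing strong allelic imbalance.
balanced = b_allele_frequency(ref_reads=10, alt_reads=10)
shifted = b_allele_frequency(ref_reads=18, alt_reads=2)
print(balanced, allelic_shift(balanced))
print(shifted, allelic_shift(shifted))
```

In RNA-seq data, allelic imbalance at a single site can also reflect allele-specific expression or low coverage, which is one reason a per-site value like this would normally be aggregated or smoothed over many neighbouring sites before being interpreted as LOH.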
Affiliation(s)
- Akdes Serin Harmanci
  - Center for Computational Systems Medicine, School of Biomedical Informatics, University of Texas Health Science Center at Houston, Houston, TX, 77030, USA
- Arif O Harmanci
  - Center for Precision Health, School of Biomedical Informatics, University of Texas Health Science Center at Houston, Houston, TX, 77030, USA
- Xiaobo Zhou
  - Center for Computational Systems Medicine, School of Biomedical Informatics, University of Texas Health Science Center at Houston, Houston, TX, 77030, USA
  - Department of Integrative Biology and Pharmacology, McGovern Medical School at The University of Texas Health Science Center at Houston, Houston, TX, 77030, USA
  - School of Dentistry, University of Texas Health Science Center at Houston, Houston, TX, 77054, USA
|
20
|
Dharshini SAP, Taguchi YH, Gromiha MM. Identifying suitable tools for variant detection and differential gene expression using RNA-seq data. Genomics 2019; 112:2166-2172. [PMID: 31862361 DOI: 10.1016/j.ygeno.2019.12.011] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 10/07/2019] [Revised: 11/25/2019] [Accepted: 12/16/2019] [Indexed: 12/27/2022]
Abstract
Neurodegenerative diseases are the most prevalent brain disorders around the globe, and the affected populations are rapidly increasing. Recently, these diseases have been addressed using the data obtained from RNA-sequencing technology to reveal the changes in gene/transcript expression, effect of variants, and pathways involved in disease mechanisms. However, the observations mainly depend on the aligners/tools, and the performance of existing RNA-seq tools on the hg38 genome assembly has not yet been documented. In this study, we performed a systematic analysis of various spliced aligners, transcript assembly tools, and variant calling tools based on both genomic assemblies (hg19/hg38) using hippocampus brain tissue. This helps to identify the best possible combination of tools for hg38 annotation. In order to evaluate the identified variants from various pipelines, we compared them with expression Quantitative Trait Loci (eQTL) and Genome-Wide Association Study (GWAS) variants. In addition, the identified differentially expressed genes (DG) were compared with microarray studies. From our analysis of variant calling, the combination of GATK (Genome Analysis Tool-kit) and STAR (Spliced Transcripts Alignment to a Reference) protocol yields a larger number of GWAS/eQTL variants compared to SAMtools (Sequence Alignment Map). We also identified a higher number of non-coding variants in hg38 compared to hg19 due to enhanced annotation. In the case of various DG pipelines, we found that the Salmon-based hg38 transcriptomic quantification yields a higher number of reported DG compared to other genome-based quantification methods. This study revealed that a higher number of reads map to multiple locations of the genome with hg38 compared to hg19, and these spurious multi-mapped reads may affect the gene quantification techniques. We suggest that it is necessary to develop efficient algorithms that can handle the multi-mapped reads and improve the performance of genome-based alignment quantification.
Affiliation(s)
- S Akila Parvathy Dharshini
  - Department of Biotechnology, Bhupat and Jyoti Mehta School of Biosciences, Indian Institute of Technology Madras, Chennai 600036, Tamilnadu, India
- Y-H Taguchi
  - Department of Physics, Chuo University, Kasuga, Bunkyo-ku, Tokyo 112-8551, Japan
- M Michael Gromiha
  - Department of Biotechnology, Bhupat and Jyoti Mehta School of Biosciences, Indian Institute of Technology Madras, Chennai 600036, Tamilnadu, India
  - Advanced Computational Drug Discovery Unit, Tokyo Tech World Research Hub Initiative (WRHI), Institute of Innovative Research, Tokyo Institute of Technology, 4259 Nagatsuta-cho, Midori-ku, Yokohama, Japan
|
21
|
Dharshini SAP, Taguchi YH, Gromiha MM. Investigating the energy crisis in Alzheimer disease using transcriptome study. Sci Rep 2019; 9:18509. [PMID: 31811163 PMCID: PMC6898285 DOI: 10.1038/s41598-019-54782-y] [Citation(s) in RCA: 20] [Impact Index Per Article: 4.0] [Received: 04/01/2019] [Accepted: 11/09/2019] [Indexed: 01/01/2023] Open
Abstract
Alzheimer disease (AD) is a devastating neurological disorder that originates in the hippocampus and spreads to cortical regions. Hippocampal neurons require more energy to preserve their firing pattern. In AD, aberrant energy metabolism is the critical factor in neurodegeneration. However, the reason for the energy crisis in hippocampal neurons is still unresolved. Transcriptome analysis helps us understand the underlying mechanism of the energy crisis. In this study, we identified variants and differential gene/transcript expression profiles from hippocampus RNA-seq data. We predicted the effect of variants on transcription factor (TF) binding using in silico tools. Further, hippocampus-specific co-expression and functional interaction networks were constructed to decipher the relationships between TFs and differentially expressed genes (DG). The identified variants predominantly influence TF binding, which subsequently regulates the DG. From the results, we hypothesize that the loss of vascular integrity is the fundamental cause of the energy crisis, which leads to neurodegeneration.
Affiliation(s)
- S Akila Parvathy Dharshini
- Department of Biotechnology, Bhupat and Jyoti Mehta School of Biosciences, Indian Institute of Technology Madras, Chennai, 600036, Tamilnadu, India
| | - Y-H Taguchi
- Department of Physics, Chuo University, Kasuga, Bunkyo-ku, Tokyo, 112-8551, Japan
| | - M Michael Gromiha
- Department of Biotechnology, Bhupat and Jyoti Mehta School of Biosciences, Indian Institute of Technology Madras, Chennai, 600036, Tamilnadu, India.
| |
|
22
|
Liu F, Zhang Y, Zhang L, Li Z, Fang Q, Gao R, Zhang Z. Systematic comparative analysis of single-nucleotide variant detection methods from single-cell RNA sequencing data. Genome Biol 2019; 20:242. [PMID: 31744515 PMCID: PMC6862814 DOI: 10.1186/s13059-019-1863-4] [Citation(s) in RCA: 57] [Impact Index Per Article: 11.4] [Received: 07/01/2019] [Accepted: 10/23/2019] [Indexed: 12/30/2022] Open
Abstract
BACKGROUND: Systematic interrogation of single-nucleotide variants (SNVs) is one of the most promising approaches to delineate cellular heterogeneity and phylogenetic relationships at the single-cell level. While SNV detection from abundant single-cell RNA sequencing (scRNA-seq) data is applicable and cost-effective in identifying expressed variants, inferring sub-clones, and deciphering genotype-phenotype linkages, there is a lack of computational methods specifically developed for SNV calling in scRNA-seq. Although variant callers for bulk RNA-seq have been sporadically used in scRNA-seq, the performance of the different tools has not been assessed. RESULTS: Here, we perform a systematic comparison of seven tools, including SAMtools, the GATK pipeline, CTAT, FreeBayes, MuTect2, Strelka2, and VarScan2, using both simulation and scRNA-seq datasets, and identify multiple elements influencing their performance. While the specificities are generally high, with sensitivities exceeding 90% for most tools when calling homozygous SNVs in high-confidence coding regions with sufficient read depths, such sensitivities decrease dramatically when calling SNVs with low read depths, low variant allele frequencies, or in specific genomic contexts. SAMtools shows the highest sensitivity in most cases, especially with low supporting reads, despite relatively low specificity in introns or high-identity regions. Strelka2 shows consistently good performance when sufficient supporting reads are provided, while FreeBayes shows good performance in cases of high variant allele frequencies. CONCLUSIONS: We recommend SAMtools, Strelka2, FreeBayes, or CTAT, depending on the specific conditions of usage. Our study provides the first benchmarking to evaluate the performance of different SNV detection tools for scRNA-seq data.
Affiliation(s)
- Fenglin Liu
- School of Life Sciences and BIOPIC, Peking University, Beijing, China
| | - Yuanyuan Zhang
- School of Life Sciences and BIOPIC, Peking University, Beijing, China
| | - Lei Zhang
- Beijing Advanced Innovation Centre for Genomics, Peking-Tsinghua Centre for Life Sciences, Peking University, Beijing, China
| | - Ziyi Li
- School of Life Sciences and BIOPIC, Peking University, Beijing, China
| | - Qiao Fang
- Beijing Advanced Innovation Centre for Genomics, Peking-Tsinghua Centre for Life Sciences, Peking University, Beijing, China
| | - Ranran Gao
- School of Life Sciences and BIOPIC, Peking University, Beijing, China
| | - Zemin Zhang
- School of Life Sciences and BIOPIC, Peking University, Beijing, China
- Beijing Advanced Innovation Centre for Genomics, Peking-Tsinghua Centre for Life Sciences, Peking University, Beijing, China
| |
|
23
|
Pharmacogenes (PGx-genes): Current understanding and future directions. Gene 2019; 718:144050. [DOI: 10.1016/j.gene.2019.144050] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Received: 03/12/2019] [Revised: 08/13/2019] [Accepted: 08/14/2019] [Indexed: 12/14/2022]
|
24
|
Adetunji MO, Lamont SJ, Abasht B, Schmidt CJ. Variant analysis pipeline for accurate detection of genomic variants from transcriptome sequencing data. PLoS One 2019; 14:e0216838. [PMID: 31545812 PMCID: PMC6756534 DOI: 10.1371/journal.pone.0216838] [Citation(s) in RCA: 17] [Impact Index Per Article: 3.4] [Received: 05/20/2019] [Accepted: 09/10/2019] [Indexed: 12/27/2022] Open
Abstract
The wealth of information deliverable from transcriptome sequencing (RNA-seq) is significant; however, variant detection from these data remains a challenge due to the complexity of the transcriptome. Given the ability of RNA-seq to reveal active regions of the genome, detection of RNA-seq SNPs can prove valuable in understanding the phenotypic diversity between populations. Thus, we present a novel computational workflow named VAP (Variant Analysis Pipeline) that takes advantage of multiple splice-aware RNA-seq aligners to call SNPs in non-human models using RNA-seq data only. We applied VAP to RNA-seq from a highly inbred chicken line and achieved high accuracy when compared with the matching whole genome sequencing (WGS) data. Over 65% of WGS coding variants were identified from RNA-seq. Further, our results revealed SNPs resulting from post-transcriptional modifications, such as RNA editing, which may represent potentially functional variation that would otherwise have been missed in genomic data. Even with the limitation of detecting variants in expressed regions only, our method proves to be a reliable alternative for SNP identification using RNA-seq data. The source code and user manuals are available at https://modupeore.github.io/VAP/.
Affiliation(s)
- Modupeore O. Adetunji
- Department of Animal and Food Sciences, University of Delaware, Newark, Delaware, United States of America
| | - Susan J. Lamont
- Department of Animal Science, Iowa State University, Ames, Iowa, United States of America
| | - Behnam Abasht
- Department of Animal and Food Sciences, University of Delaware, Newark, Delaware, United States of America
| | - Carl J. Schmidt
- Department of Animal and Food Sciences, University of Delaware, Newark, Delaware, United States of America
| |
|
25
|
Islam R, Lai C. A Brief Overview of lncRNAs in Endothelial Dysfunction-Associated Diseases: From Discovery to Characterization. EPIGENOMES 2019; 3:epigenomes3030020. [PMID: 34968230 PMCID: PMC8594677 DOI: 10.3390/epigenomes3030020] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/12/2019] [Revised: 09/06/2019] [Accepted: 09/07/2019] [Indexed: 11/16/2022] Open
Abstract
Long non-coding RNAs (lncRNAs) are a novel class of regulatory RNA molecules that are involved in many biological processes and in disease development. Several unique features of lncRNAs have been identified, such as tissue- and/or cell-specific expression patterns, which suggest that they could be potential candidates for therapeutic and diagnostic applications. More recently, the scope of lncRNA studies has been extended to endothelial biology research. Many lncRNAs were found to be critically involved in the regulation of endothelial function and its associated disease progression. An improved understanding of endothelial biology can thus facilitate the discovery of novel biomarkers and therapeutic targets for endothelial dysfunction-associated diseases, such as abnormal angiogenesis, hypertension, diabetes, and atherosclerosis. Nevertheless, the underlying mechanisms of lncRNAs remain undefined in previously published studies. Therefore, in this review, we aimed to discuss the current methodologies for discovering and investigating the functions of lncRNAs and, in particular, to address the functions of selected lncRNAs in endothelial dysfunction-associated diseases.
Affiliation(s)
- Rashidul Islam
- Department of Health Technology and Informatics, Hong Kong Polytechnic University, Hong Kong, China
| | - Christopher Lai
- Health and Social Sciences Cluster, Singapore Institute of Technology, Singapore 138683, Singapore
- Correspondence: Tel.: +65-6592-1045
| |
|
26
|
Grant AD, Vail P, Padi M, Witkiewicz AK, Knudsen ES. Interrogating Mutant Allele Expression via Customized Reference Genomes to Define Influential Cancer Mutations. Sci Rep 2019; 9:12766. [PMID: 31484939 PMCID: PMC6726654 DOI: 10.1038/s41598-019-48967-8] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 01/10/2019] [Accepted: 08/12/2019] [Indexed: 11/16/2022] Open
Abstract
Genetic alterations are essential for cancer initiation and progression. However, differentiating mutations that drive the tumor phenotype from mutations that do not affect tumor fitness remains a fundamental challenge in cancer biology. To better understand the impact of a given mutation within cancer, RNA-sequencing data were used to categorize mutations based on their allelic expression. For this purpose, we developed the MAXX (Mutation Allelic Expression Extractor) software, which is highly effective at delineating the allelic expression of both single nucleotide variants and small insertions and deletions. Results from MAXX demonstrated that mutations can be separated into three groups based on expression of the mutant allele, lack of expression from both alleles, or expression of only the wild-type allele. By taking into consideration the allelic expression patterns of genes that are mutated in pancreatic ductal adenocarcinoma (PDAC), it was possible to increase the sensitivity of widely used driver mutation detection methods, as well as identify subtypes that have prognostic significance and are associated with sensitivity to select classes of therapeutic agents in cell culture. Thus, differentiating mutations based on their mutant allele expression via MAXX represents a means to parse somatic variants in tumor genomes, helping to elucidate a gene’s respective role in cancer.
Affiliation(s)
- Adam D Grant
- University of Arizona Cancer Center, Tucson, AZ, 85719, USA
| | - Paris Vail
- University of Arizona Cancer Center, Tucson, AZ, 85719, USA
| | - Megha Padi
- Department of Molecular and Cellular Biology, University of Arizona, Tucson, AZ, 85719, USA
| | | | - Erik S Knudsen
- Department of Molecular and Cellular Biology, Roswell Park Cancer Center, Buffalo, NY, 14263, USA.
| |
|
27
|
Lee C, Kang EY, Gandal MJ, Eskin E, Geschwind DH. Profiling allele-specific gene expression in brains from individuals with autism spectrum disorder reveals preferential minor allele usage. Nat Neurosci 2019; 22:1521-1532. [PMID: 31455884 PMCID: PMC6750256 DOI: 10.1038/s41593-019-0461-9] [Citation(s) in RCA: 26] [Impact Index Per Article: 5.2] [Received: 12/12/2017] [Accepted: 07/09/2019] [Indexed: 12/21/2022]
Abstract
One fundamental but understudied mechanism of gene regulation in disease is allele-specific expression (ASE), the preferential expression of one allele. We leveraged RNA-sequencing data from human brain to assess ASE in autism spectrum disorder (ASD). When ASE is observed in ASD, the allele with lower population frequency (minor allele) is preferentially more highly expressed than the major allele, opposite to the canonical pattern. Importantly, genes showing ASE in ASD are enriched in those downregulated in ASD postmortem brains and in genes harboring de novo mutations in ASD. Two regions, 14q32 and 15q11, containing all known orphan C/D box small nucleolar RNAs (snoRNAs), are particularly enriched in shifts to higher minor allele expression. We demonstrate that this allele shifting enhances snoRNA-targeted splicing changes in ASD-related target genes in idiopathic ASD and 15q11-q13 duplication syndrome. Together, these results implicate allelic imbalance and dysregulation of orphan C/D box snoRNAs in ASD pathogenesis.
Affiliation(s)
- Changhoon Lee
- Program in Neurogenetics, Department of Neurology, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA, USA
- Department of Neuroscience, Peter O'Donnell Jr. Brain Institute, University of Texas Southwestern Medical Center, Dallas, TX, USA
| | - Eun Yong Kang
- Department of Computer Science, Henry Samueli School of Engineering, University of California, Los Angeles, Los Angeles, CA, USA
| | - Michael J Gandal
- Program in Neurogenetics, Department of Neurology, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA, USA
- Center for Neurobehavioral Genetics, Semel Institute, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA, USA
| | - Eleazar Eskin
- Department of Computer Science, Henry Samueli School of Engineering, University of California, Los Angeles, Los Angeles, CA, USA
- Department of Human Genetics, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA, USA
- Department of Computational Medicine, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA, USA
| | - Daniel H Geschwind
- Program in Neurogenetics, Department of Neurology, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA, USA.
- Center for Neurobehavioral Genetics, Semel Institute, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA, USA.
- Department of Human Genetics, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA, USA.
- Center for Autism Research and Treatment, Semel Institute, David Geffen School of Medicine, University of California, Los Angeles, Los Angeles, CA, USA.
| |
|
28
|
Batcha AMN, Bamopoulos SA, Kerbs P, Kumar A, Jurinovic V, Rothenberg-Thurley M, Ksienzyk B, Philippou-Massier J, Krebs S, Blum H, Schneider S, Konstandin N, Bohlander SK, Heckman C, Kontro M, Hiddemann W, Spiekermann K, Braess J, Metzeler KH, Greif PA, Mansmann U, Herold T. Allelic Imbalance of Recurrently Mutated Genes in Acute Myeloid Leukaemia. Sci Rep 2019; 9:11796. [PMID: 31409822 PMCID: PMC6692371 DOI: 10.1038/s41598-019-48167-4] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Received: 04/03/2019] [Accepted: 07/29/2019] [Indexed: 12/24/2022] Open
Abstract
The patho-mechanism of somatic driver mutations in cancer usually involves transcription, but the proportion of mutations and wild-type alleles transcribed from DNA to RNA is largely unknown. We systematically compared the variant allele frequencies of recurrently mutated genes in DNA and RNA sequencing data of 246 acute myeloid leukaemia (AML) patients. We observed that 95% of all detected variants were transcribed while the rest were not detectable in RNA sequencing with a minimum read-depth cut-off (10x). Our analysis focusing on 11 genes harbouring recurring mutations demonstrated allelic imbalance (AI) in most patients. GATA2, RUNX1, TET2, SRSF2, IDH2, PTPN11, WT1, NPM1 and CEBPA showed significant AIs. While the effect size was small in general, GATA2 exhibited the largest allelic imbalance. By pooling heterogeneous data from three independent AML cohorts with paired DNA and RNA sequencing (N = 253), we could validate the preferential transcription of GATA2-mutated alleles. Differential expression analysis of the genes with significant AI showed no significant differential gene and isoform expression for the mutated genes, between mutated and wild-type patients. In conclusion, our analyses identified AI in nine out of eleven recurrently mutated genes. AI might be a common phenomenon in AML which potentially contributes to leukaemogenesis.
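The comparison of variant allele frequencies (VAFs) between DNA and RNA described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' method: the exact binomial test, the function names and the significance threshold `alpha` are assumptions chosen for illustration; only the 10x minimum read depth comes from the abstract.

```python
from math import comb

MIN_DEPTH = 10  # minimum RNA read depth, as stated in the abstract (10x)


def vaf(alt_reads, total_reads):
    """Variant allele frequency: fraction of reads carrying the variant."""
    return alt_reads / total_reads


def binom_two_sided_p(k, n, p):
    """Exact two-sided binomial p-value (sum of outcomes no likelier than k).

    Uses exact integer binomial coefficients, so it is only suitable for
    modest read depths (a few hundred reads).
    """
    pmf = [comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(n + 1)]
    cutoff = pmf[k] * (1 + 1e-9)  # tolerance for floating-point ties
    return min(1.0, sum(x for x in pmf if x <= cutoff))


def allelic_imbalance(dna_alt, dna_total, rna_alt, rna_total, alpha=0.05):
    """Test whether the RNA VAF deviates from the DNA VAF (hypothetical helper).

    Returns None when the RNA depth is below MIN_DEPTH.
    """
    if rna_total < MIN_DEPTH:
        return None
    expected = vaf(dna_alt, dna_total)
    p_value = binom_two_sided_p(rna_alt, rna_total, expected)
    return {
        "dna_vaf": expected,
        "rna_vaf": vaf(rna_alt, rna_total),
        "p_value": p_value,
        "imbalanced": p_value < alpha,
    }


if __name__ == "__main__":
    # Heterozygous in DNA (VAF 0.5) but almost only mutant reads in RNA.
    print(allelic_imbalance(dna_alt=50, dna_total=100, rna_alt=38, rna_total=40))
```

A per-variant test like this ignores mapping bias and overdispersion, which a real analysis would need to address.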
Affiliation(s)
- Aarif M N Batcha
- Institute of Medical Data Processing, Biometrics and Epidemiology (IBE), Faculty of Medicine, LMU Munich, Munich, Germany; Data Integration for Future Medicine (DiFuture, www.difuture.de), LMU Munich, Munich, Germany.
| | - Stefanos A Bamopoulos
- Laboratory for Leukemia Diagnostics, Department of Medicine III, University Hospital, LMU Munich, Munich, Germany
| | - Paul Kerbs
- Laboratory for Leukemia Diagnostics, Department of Medicine III, University Hospital, LMU Munich, Munich, Germany
| | - Ashwini Kumar
- Institute for Molecular Medicine Finland (FIMM), University of Helsinki, Helsinki, Finland
| | - Vindi Jurinovic
- Institute of Medical Data Processing, Biometrics and Epidemiology (IBE), Faculty of Medicine, LMU Munich, Munich, Germany; Laboratory for Leukemia Diagnostics, Department of Medicine III, University Hospital, LMU Munich, Munich, Germany
| | - Maja Rothenberg-Thurley
- Laboratory for Leukemia Diagnostics, Department of Medicine III, University Hospital, LMU Munich, Munich, Germany
| | - Bianka Ksienzyk
- Laboratory for Leukemia Diagnostics, Department of Medicine III, University Hospital, LMU Munich, Munich, Germany
| | - Julia Philippou-Massier
- Laboratory for Functional Genome Analysis (LAFUGA), Gene Center, University of Munich, Munich, Germany
| | - Stefan Krebs
- Laboratory for Functional Genome Analysis (LAFUGA), Gene Center, University of Munich, Munich, Germany
| | - Helmut Blum
- Laboratory for Functional Genome Analysis (LAFUGA), Gene Center, University of Munich, Munich, Germany
| | - Stephanie Schneider
- Laboratory for Leukemia Diagnostics, Department of Medicine III, University Hospital, LMU Munich, Munich, Germany; Institute of Human Genetics, University Hospital, LMU Munich, Munich, Germany
| | - Nikola Konstandin
- Laboratory for Leukemia Diagnostics, Department of Medicine III, University Hospital, LMU Munich, Munich, Germany
| | - Stefan K Bohlander
- Leukaemia and Blood Cancer Research Unit, Department of Molecular Medicine and Pathology, University of Auckland, Auckland, New Zealand
| | - Caroline Heckman
- Institute for Molecular Medicine Finland (FIMM), University of Helsinki, Helsinki, Finland
| | - Mika Kontro
- Department of Haematology, Helsinki University Hospital Comprehensive Cancer Center, Helsinki, Finland
| | - Wolfgang Hiddemann
- Laboratory for Leukemia Diagnostics, Department of Medicine III, University Hospital, LMU Munich, Munich, Germany; German Cancer Consortium (DKTK), Partner Site Munich, Munich, Germany; German Cancer Research Center (DKFZ), Heidelberg, Germany
| | - Karsten Spiekermann
- Laboratory for Leukemia Diagnostics, Department of Medicine III, University Hospital, LMU Munich, Munich, Germany; German Cancer Consortium (DKTK), Partner Site Munich, Munich, Germany; German Cancer Research Center (DKFZ), Heidelberg, Germany
| | - Jan Braess
- Department of Oncology and Hematology, Hospital Barmherzige Brüder, Regensburg, Germany
| | - Klaus H Metzeler
- Laboratory for Leukemia Diagnostics, Department of Medicine III, University Hospital, LMU Munich, Munich, Germany; German Cancer Consortium (DKTK), Partner Site Munich, Munich, Germany; German Cancer Research Center (DKFZ), Heidelberg, Germany
| | - Philipp A Greif
- Laboratory for Leukemia Diagnostics, Department of Medicine III, University Hospital, LMU Munich, Munich, Germany; German Cancer Consortium (DKTK), Partner Site Munich, Munich, Germany; German Cancer Research Center (DKFZ), Heidelberg, Germany
| | - Ulrich Mansmann
- Institute of Medical Data Processing, Biometrics and Epidemiology (IBE), Faculty of Medicine, LMU Munich, Munich, Germany; Data Integration for Future Medicine (DiFuture, www.difuture.de), LMU Munich, Munich, Germany; German Cancer Consortium (DKTK), Partner Site Munich, Munich, Germany; German Cancer Research Center (DKFZ), Heidelberg, Germany
| | - Tobias Herold
- Laboratory for Leukemia Diagnostics, Department of Medicine III, University Hospital, LMU Munich, Munich, Germany; German Cancer Consortium (DKTK), Partner Site Munich, Munich, Germany; German Cancer Research Center (DKFZ), Heidelberg, Germany; Research Unit Apoptosis in Hematopoietic Stem Cells, Helmholtz Zentrum München, German Research Center for Environmental Health (HMGU), Munich, Germany.
| |
|
29
|
Sanchez de Groot N, Armaos A, Graña-Montes R, Alriquet M, Calloni G, Vabulas RM, Tartaglia GG. RNA structure drives interaction with proteins. Nat Commun 2019; 10:3246. [PMID: 31324771 PMCID: PMC6642211 DOI: 10.1038/s41467-019-10923-5] [Citation(s) in RCA: 98] [Impact Index Per Article: 19.6] [Received: 08/13/2018] [Accepted: 06/10/2019] [Indexed: 12/12/2022] Open
Abstract
The combination of high-throughput sequencing and in vivo crosslinking approaches leads to the progressive uncovering of the complex interdependence between cellular transcriptome and proteome. Yet, the molecular determinants governing interactions in protein-RNA networks are not well understood. Here we investigated the relationship between the structure of an RNA and its ability to interact with proteins. Analysing in silico, in vitro and in vivo experiments, we find that the amount of double-stranded regions in an RNA correlates with the number of protein contacts. This relationship, which we call structure-driven protein interactivity, allows classification of RNA types, plays a role in gene regulation and could have implications for the formation of phase-separated ribonucleoprotein assemblies. We validate our hypothesis by showing that a highly structured RNA can rearrange the composition of a protein aggregate. We report that the tendency of proteins to phase-separate is reduced by interactions with specific RNAs.
Affiliation(s)
- Natalia Sanchez de Groot
- Centre for Genomic Regulation (CRG), The Barcelona Institute for Science and Technology, Dr. Aiguader 88, 08003, Barcelona, Spain
| | - Alexandros Armaos
- Centre for Genomic Regulation (CRG), The Barcelona Institute for Science and Technology, Dr. Aiguader 88, 08003, Barcelona, Spain
| | - Ricardo Graña-Montes
- Centre for Genomic Regulation (CRG), The Barcelona Institute for Science and Technology, Dr. Aiguader 88, 08003, Barcelona, Spain; Department of Biochemistry, University of Zürich, Winterthurerstrasse 190, 8057, Zürich, Switzerland
| | - Marion Alriquet
- Buchmann Institute for Molecular Life Sciences, Goethe University Frankfurt, 60438, Frankfurt am Main, Germany; Institute of Biophysical Chemistry, Goethe University Frankfurt, 60438, Frankfurt am Main, Germany
| | - Giulia Calloni
- Buchmann Institute for Molecular Life Sciences, Goethe University Frankfurt, 60438, Frankfurt am Main, Germany; Institute of Biophysical Chemistry, Goethe University Frankfurt, 60438, Frankfurt am Main, Germany
| | - R Martin Vabulas
- Buchmann Institute for Molecular Life Sciences, Goethe University Frankfurt, 60438, Frankfurt am Main, Germany; Institute of Biophysical Chemistry, Goethe University Frankfurt, 60438, Frankfurt am Main, Germany.
| | - Gian Gaetano Tartaglia
- Centre for Genomic Regulation (CRG), The Barcelona Institute for Science and Technology, Dr. Aiguader 88, 08003, Barcelona, Spain; ICREA 23 Passeig Lluis Companys 08010 and Universitat Pompeu Fabra (UPF), 08003, Barcelona, Spain; Department of Biology 'Charles Darwin', Sapienza University of Rome, P.le A. Moro 5, Rome, 00185, Italy; Department of Neuroscience and Brain Technologies, Istituto Italiano di Tecnologia, Via Morego 30, 16163, Genoa, Italy.
| |
|
30
|
Brouard JS, Schenkel F, Marete A, Bissonnette N. The GATK joint genotyping workflow is appropriate for calling variants in RNA-seq experiments. J Anim Sci Biotechnol 2019; 10:44. [PMID: 31249686 PMCID: PMC6587293 DOI: 10.1186/s40104-019-0359-0] [Citation(s) in RCA: 68] [Impact Index Per Article: 13.6] [Received: 12/26/2018] [Accepted: 04/28/2019] [Indexed: 12/30/2022] Open
Abstract
The Genome Analysis Toolkit (GATK) is a popular set of programs for discovering and genotyping variants from next-generation sequencing data. The current GATK recommendation for RNA sequencing (RNA-seq) is to perform variant calling from individual samples, with the drawback that only variable positions are reported. Versions 3.0 and above of GATK offer the possibility of calling DNA variants on cohorts of samples using the HaplotypeCaller algorithm in Genomic Variant Call Format (GVCF) mode. Using this approach, variants are called individually on each sample, generating one GVCF file per sample that lists genotype likelihoods and their genome annotations. In a second step, variants are called from the GVCF files through a joint genotyping analysis. This strategy is more flexible and reduces computational challenges in comparison to the traditional joint discovery workflow. Using a GVCF workflow for mining SNPs in RNA-seq data provides substantial advantages, including reporting homozygous genotypes for the reference allele as well as missing data. Taking advantage of RNA-seq data derived from primary macrophages isolated from 50 cows, the GATK joint genotyping method for calling variants on RNA-seq data was validated by comparing this approach to a so-called “per-sample” method. In addition, pair-wise comparisons of the two methods were performed to evaluate their respective sensitivity, precision and accuracy using DNA genotypes from a companion study including the same 50 cows genotyped using either genotyping-by-sequencing or the Bovine SNP50 Beadchip (imputed to the Bovine high density). Results indicate that both approaches are very close in their capacity to detect reference variants and that the joint genotyping method is more sensitive than the per-sample method. Given that the joint genotyping method is more flexible and technically easier, we recommend this approach for variant calling in RNA-seq experiments.
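The two-step workflow described above (one GVCF per sample, then joint genotyping) can be sketched as a small Python helper that only assembles the command lines. This is an illustration under stated assumptions, not the authors' pipeline: it uses GATK4-style invocations (`gatk HaplotypeCaller -ERC GVCF`, `gatk CombineGVCFs`, `gatk GenotypeGVCFs`), whereas the abstract refers to GATK 3.0 and above, whose command syntax differs; the file names and the function name are hypothetical, and RNA-seq preprocessing steps are omitted.

```python
def gvcf_workflow_commands(reference, bams, out_vcf="cohort.vcf.gz"):
    """Build command lines for per-sample GVCF calling plus joint genotyping.

    reference: path to the reference FASTA (hypothetical file name).
    bams: mapping of sample name -> preprocessed BAM path.
    Returns a list of argument lists; nothing is executed here.
    """
    commands = []
    gvcfs = []

    # Step 1: call each sample separately in GVCF mode.
    for sample, bam in sorted(bams.items()):
        gvcf = f"{sample}.g.vcf.gz"
        gvcfs.append(gvcf)
        commands.append(
            ["gatk", "HaplotypeCaller", "-R", reference, "-I", bam,
             "-O", gvcf, "-ERC", "GVCF"]
        )

    # Step 2a: merge the per-sample GVCFs into one cohort GVCF.
    combined = "cohort.g.vcf.gz"
    combine = ["gatk", "CombineGVCFs", "-R", reference]
    for gvcf in gvcfs:
        combine += ["-V", gvcf]
    combine += ["-O", combined]
    commands.append(combine)

    # Step 2b: joint genotyping across all samples.
    commands.append(
        ["gatk", "GenotypeGVCFs", "-R", reference, "-V", combined, "-O", out_vcf]
    )
    return commands


if __name__ == "__main__":
    cmds = gvcf_workflow_commands(
        "reference.fa", {"cow01": "cow01.bam", "cow02": "cow02.bam"}
    )
    for cmd in cmds:
        print(" ".join(cmd))
```

Each returned list could be passed to `subprocess.run` once the paths and GATK version have been checked against the local installation.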
Affiliation(s)
- Jean-Simon Brouard
- Sherbrooke Research and Development Centre, Agriculture and Agri-Food Canada, Sherbrooke, QC J1M 0C8 Canada
| | - Flavio Schenkel
- Center of Genetic Improvement of Livestock, University of Guelph, Guelph, ON N1G 2W1 Canada
| | - Andrew Marete
- Sherbrooke Research and Development Centre, Agriculture and Agri-Food Canada, Sherbrooke, QC J1M 0C8 Canada
| | - Nathalie Bissonnette
- Sherbrooke Research and Development Centre, Agriculture and Agri-Food Canada, Sherbrooke, QC J1M 0C8 Canada
| |
|
31
|
Mutational landscape of the transcriptome offers putative targets for immunotherapy of myeloproliferative neoplasms. Blood 2019; 134:199-210. [PMID: 31064751 DOI: 10.1182/blood.2019000519] [Citation(s) in RCA: 42] [Impact Index Per Article: 8.4] [Received: 10/29/2018] [Accepted: 04/19/2019] [Indexed: 12/11/2022] Open
Abstract
Ph-negative myeloproliferative neoplasms (MPNs) are hematological cancers that can be subdivided into entities with distinct clinical features. Somatic mutations in JAK2, CALR, and MPL have been described as drivers of the disease, together with a variable landscape of nondriver mutations. Despite detailed knowledge of disease mechanisms, targeted therapies effective enough to eliminate MPN cells are still missing. In this study of 113 MPN patients, we aimed to comprehensively characterize the mutational landscape of the granulocyte transcriptome using RNA sequencing data and subsequently examine the applicability of immunotherapeutic strategies for MPN patients. Following implementation of customized workflows and data filtering, we identified a total of 13 (12/13 novel) gene fusions, 231 nonsynonymous single nucleotide variants, and 21 insertions and deletions in 106 of 113 patients. We found a high frequency of SF3B1-mutated primary myelofibrosis patients (14%) with distinct 3' splicing patterns, many of these with a protein-altering potential. Finally, from all mutations detected, we generated a virtual peptide library and used NetMHC to predict 149 unique neoantigens in 62% of MPN patients. Peptides from CALR and MPL mutations provide a rich source of neoantigens as a result of their unique ability to bind many common MHC class I molecules. Finally, we propose that mutations derived from splicing defects present in SF3B1-mutated patients may offer an unexplored neoantigen repertoire in MPNs. We validated 35 predicted peptides to be strong MHC class I binders through direct binding of predicted peptides to MHC proteins in vitro. Our results may serve as a resource for personalized vaccine or adoptive cell-based therapy development.
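The "virtual peptide library" step mentioned in this abstract can be illustrated with a short Python sketch that enumerates every peptide window spanning a substituted residue. The peptide lengths (8 to 11 residues), the 0-based position convention and the function name are assumptions made for illustration; the abstract does not state them, and the subsequent NetMHC binding prediction is not reproduced here.

```python
def mutant_peptides(protein, position, mutant_residue, lengths=(8, 9, 10, 11)):
    """Enumerate peptides that contain a single substituted residue.

    protein: wild-type amino acid sequence.
    position: 0-based index of the substituted residue (assumed convention).
    mutant_residue: one-letter code of the new residue.
    lengths: peptide lengths to generate (assumed, not taken from the abstract).
    """
    mutated = protein[:position] + mutant_residue + protein[position + 1:]
    peptides = set()
    for k in lengths:
        # Window starts for which the mutated residue lies inside the peptide
        # and the peptide does not run past either end of the protein.
        first = max(0, position - k + 1)
        last = min(position, len(mutated) - k)
        for start in range(first, last + 1):
            peptides.add(mutated[start:start + k])
    return sorted(peptides)


if __name__ == "__main__":
    # Toy sequence; substitute the residue at index 10 (M -> W).
    for peptide in mutant_peptides("ACDEFGHIKLMNPQRSTVWY", 10, "W", lengths=(9,)):
        print(peptide)
```

Insertions, deletions and fusion junctions would need their own window logic, since they change the downstream sequence rather than a single residue.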
|
32
|
Toups MA, Rodrigues N, Perrin N, Kirkpatrick M. A reciprocal translocation radically reshapes sex-linked inheritance in the common frog. Mol Ecol 2019; 28:1877-1889. [PMID: 30576024 DOI: 10.1111/mec.14990] [Citation(s) in RCA: 23] [Impact Index Per Article: 4.6] [Received: 07/19/2018] [Revised: 12/04/2018] [Accepted: 12/04/2018] [Indexed: 12/22/2022]
Abstract
X and Y chromosomes can diverge when rearrangements block recombination between them. Here we present the first genomic view of a reciprocal translocation that causes two physically unconnected pairs of chromosomes to be coinherited as sex chromosomes. In a population of the common frog (Rana temporaria), both pairs of X and Y chromosomes show extensive sequence differentiation, but not degeneration of the Y chromosomes. A new method based on gene trees shows both chromosomes are sex-linked. Furthermore, the gene trees from the two Y chromosomes have identical topologies, showing they have been coinherited since the reciprocal translocation occurred. Reciprocal translocations can thus reshape sex linkage on a much greater scale compared with inversions, the type of rearrangement that is much better known in sex chromosome evolution, and they can greatly amplify the power of sexually antagonistic selection to drive genomic rearrangement. Two more populations show evidence of other rearrangements, suggesting that this species has unprecedented structural polymorphism in its sex chromosomes.
Affiliation(s)
- Melissa A Toups
- Department of Integrative Biology, University of Texas, Austin, Texas; Department of Ecology and Evolution, University of Lausanne, Lausanne, Switzerland
| | - Nicolas Rodrigues
- Department of Ecology and Evolution, University of Lausanne, Lausanne, Switzerland
| | - Nicolas Perrin
- Department of Ecology and Evolution, University of Lausanne, Lausanne, Switzerland
| | - Mark Kirkpatrick
- Department of Integrative Biology, University of Texas, Austin, Texas
| |
|
33
|
Miao Z, Alvarez M, Pajukanta P, Ko A. ASElux: an ultra-fast and accurate allelic reads counter. Bioinformatics 2019; 34:1313-1320. [PMID: 29186329 DOI: 10.1093/bioinformatics/btx762] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Received: 08/17/2017] [Accepted: 11/22/2017] [Indexed: 11/12/2022] Open
Abstract
Motivation: Mapping bias causes preferential alignment to the reference allele, forming a major obstacle in allele-specific expression (ASE) analysis. The existing methods, such as simulation and SNP-aware alignment, are either inaccurate or relatively slow. To count allelic reads quickly and accurately for ASE analysis, we developed a novel approach, ASElux, which utilizes personal SNP information and counts allelic reads directly from unmapped RNA-sequence (RNA-seq) data. ASElux significantly reduces runtime by disregarding reads outside single nucleotide polymorphisms (SNPs) during the alignment. Results: When compared to other tools on simulated and experimental data, ASElux achieves higher accuracy on ASE estimation than non-SNP-aware aligners and requires a much shorter time than the benchmark SNP-aware aligner, GSNAP, with just a slight loss in performance. ASElux can process 40 million read-pairs from an RNA-seq sample and count allelic reads within 10 min, which is comparable to directly counting the allelic reads from alignments based on other tools. Furthermore, processing an RNA-seq sample using ASElux in conjunction with a general aligner, such as STAR, is more accurate and still ∼4× faster than STAR + WASP, and ∼33× faster than the lead SNP-aware aligner, GSNAP, making ASElux ideal for ASE analysis of large-scale transcriptomic studies. We applied ASElux to 273 lung RNA-seq samples from GTEx and identified a splice-QTL rs11078928 in lung which explains the mechanism underlying an asthma GWAS SNP rs11078927. Thus, our analysis demonstrated ASE as a highly powerful complementary tool to cis-expression quantitative trait locus (eQTL) analysis. Availability and implementation: The software can be downloaded from https://github.com/abl0719/ASElux. Contact: zmiao@ucla.edu or a5ko@ucla.edu. Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Zong Miao
- Department of Human Genetics, David Geffen School of Medicine at UCLA, Los Angeles, CA 90024, USA; Bioinformatics Interdepartmental Program, UCLA, Los Angeles, CA 90024, USA
| | - Marcus Alvarez
- Department of Human Genetics, David Geffen School of Medicine at UCLA, Los Angeles, CA 90024, USA
| | - Päivi Pajukanta
- Department of Human Genetics, David Geffen School of Medicine at UCLA, Los Angeles, CA 90024, USA; Bioinformatics Interdepartmental Program, UCLA, Los Angeles, CA 90024, USA; Molecular Biology Institute, UCLA, Los Angeles, CA 90024, USA
| | - Arthur Ko
- Department of Human Genetics, David Geffen School of Medicine at UCLA, Los Angeles, CA 90024, USA; Molecular Biology Institute, UCLA, Los Angeles, CA 90024, USA
| |
|
34
|
Han Z, Xue W, Tao L, Lou Y, Qiu Y, Zhu F. Genome-wide identification and analysis of the eQTL lncRNAs in multiple sclerosis based on RNA-seq data. Brief Bioinform 2019; 21:1023-1037. [PMID: 31323688 DOI: 10.1093/bib/bbz036] [Citation(s) in RCA: 21] [Impact Index Per Article: 4.2] [Received: 11/15/2018] [Revised: 03/05/2019] [Accepted: 03/06/2019] [Indexed: 12/29/2022] Open
Abstract
The pathogenesis of multiple sclerosis (MS) is significantly regulated by long noncoding RNAs (lncRNAs), the expression of which is substantially influenced by a number of MS-associated risk single nucleotide polymorphisms (SNPs). It is thus hypothesized that the dysregulation of lncRNA induced by genomic variants may be one of the key molecular mechanisms for the pathology of MS. However, due to the lack of sufficient data on lncRNA expression and SNP genotypes of the same MS patients, such molecular mechanisms underlying the pathology of MS remain elusive. In this study, a bioinformatics strategy was applied to obtain lncRNA expression and SNP genotype data simultaneously from 142 samples (51 MS patients and 91 controls) based on RNA-seq data, and an expression quantitative trait loci (eQTL) analysis was conducted. In total, 2383 differentially expressed lncRNAs were identified as specifically expressing in brain-related tissues, and 517 of them were affected by SNPs. Then, the functional characterization, secondary structure changes and tissue and disease specificity of the cis-eQTL SNPs and lncRNA were assessed. The cis-eQTL SNPs were substantially and specifically enriched in neurological disease and intergenic region, and the secondary structure was altered in 17.6% of all lncRNAs in MS. Finally, the weighted gene coexpression network and gene set enrichment analyses were used to investigate how the influence of SNPs on lncRNAs contributed to the pathogenesis of MS. As a result, the regulation of lncRNAs by SNPs was found to mainly influence the antigen processing/presentation and mitogen-activated protein kinases (MAPK) signaling pathway in MS. These results revealed the effectiveness of the strategy proposed in this study and give insight into the mechanism (SNP-mediated modulation of lncRNAs) underlying the pathology of MS.
Affiliation(s)
- Zhijie Han
- Zhejiang Provincial Key Laboratory for Drug Clinical Research and Evaluation, The First Affiliated Hospital, Zhejiang University, Hangzhou, Zhejiang, China
- College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, Zhejiang, China
- School of Pharmaceutical Sciences and Collaborative Innovation Center for Brain Science, Chongqing University, Chongqing, China
| | - Weiwei Xue
- School of Pharmaceutical Sciences and Collaborative Innovation Center for Brain Science, Chongqing University, Chongqing, China
| | - Lin Tao
- Key Laboratory of Elemene Class Anti-cancer Chinese Medicine of Zhejiang Province, School of Medicine, Hangzhou Normal University, Hangzhou, China
| | - Yan Lou
- Zhejiang Provincial Key Laboratory for Drug Clinical Research and Evaluation, The First Affiliated Hospital, Zhejiang University, Hangzhou, Zhejiang, China
| | - Yunqing Qiu
- Zhejiang Provincial Key Laboratory for Drug Clinical Research and Evaluation, The First Affiliated Hospital, Zhejiang University, Hangzhou, Zhejiang, China
| | - Feng Zhu
- Zhejiang Provincial Key Laboratory for Drug Clinical Research and Evaluation, The First Affiliated Hospital, Zhejiang University, Hangzhou, Zhejiang, China
- College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, Zhejiang, China
- School of Pharmaceutical Sciences and Collaborative Innovation Center for Brain Science, Chongqing University, Chongqing, China
| |
|
35
|
Abstract
Background: Single nucleotide polymorphisms (SNPs) have been applied as important molecular markers in genetics and breeding studies. The rapid advance of next generation sequencing (NGS) provides a high-throughput means of SNP discovery. However, SNP development is limited by the availability of reliable SNP discovery methods. In particular, the optimum assembler and SNP caller for accurate SNP prediction from next generation sequencing data are not known. Results: Herein we performed SNP prediction based on RNA-seq data of peach and mandarin peel tissue under a comprehensive comparison of two paired-end read lengths (125 bp and 150 bp), five assemblers (Trinity, IDBA, oases, SOAPdenovo, Trans-abyss) and two SNP callers (GATK and GBS). The predicted SNPs were compared with the authentic SNPs identified via PCR amplification followed by gene cloning and sequencing procedures. A total of 40 and 240 authentic SNPs were present in five anthocyanin biosynthesis related genes in peach and in nine carotenogenic genes in mandarin, respectively. Putative SNPs predicted from the same RNA-seq data with different strategies led to quite divergent results. The rate of false positive SNPs was significantly lower when the paired-end read length was 150 bp compared with 125 bp. Trinity was superior to the other four assemblers and GATK was substantially superior to GBS due to a low rate of missing authentic SNPs. The combination of assembler Trinity, SNP caller GATK, and the paired-end read length 150 bp had the best performance in SNP discovery, with 100% accuracy in both the peach and mandarin cases. This strategy was applied to the characterization of SNPs in peach and mandarin transcriptomes.
Conclusions: Through comparison of authentic SNPs obtained by a PCR cloning strategy and putative SNPs predicted from different combinations of five assemblers, two SNP callers, and two paired-end read lengths, we provided a reliable and efficient strategy, Trinity-GATK with 150 bp paired-end read length, for SNP discovery from RNA-seq data. This strategy discovered SNPs at 100% accuracy in the peach and mandarin cases and might be applicable to a wide range of plants and other organisms. Electronic supplementary material: The online version of this article (10.1186/s12864-019-5533-4) contains supplementary material, which is available to authorized users.
|
36
|
Guo Y, Yu H, Samuels DC, Yue W, Ness S, Zhao YY. Single-nucleotide variants in human RNA: RNA editing and beyond. Brief Funct Genomics 2019; 18:30-39. [PMID: 30312373 PMCID: PMC7962770 DOI: 10.1093/bfgp/ely032] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.0] [Received: 08/21/2018] [Revised: 08/21/2018] [Accepted: 09/06/2018] [Indexed: 12/12/2022] Open
Abstract
Through analysis of paired high-throughput DNA-Seq and RNA-Seq data, researchers quickly recognized that RNA-Seq can be used for more than just gene expression quantification. The alternative applications of RNA-Seq data are abundant, and we are particularly interested in its usefulness for detecting single-nucleotide variants, which arise from RNA editing, genomic variants and other RNA modifications. A stunning discovery made from RNA-Seq analyses is the unexpectedly high prevalence of RNA-editing events, many of which cannot be explained by known RNA-editing mechanisms. Over the past 6-7 years, substantial efforts have been made to maximize the potential of RNA-Seq data. In this review we describe the controversial history of mining RNA-editing events from RNA-Seq data and the corresponding development of methodologies to identify, predict, assess the quality of and catalog RNA-editing events as well as genomic variants.
Affiliation(s)
- Yan Guo
- Department of Internal Medicine, University of New Mexico Comprehensive Cancer Center, Albuquerque, NM, USA
| | - Hui Yu
- Department of Internal Medicine, University of New Mexico Comprehensive Cancer Center, Albuquerque, NM, USA
| | - David C Samuels
- Vanderbilt Genetics Institute, Department of Molecular Physiology and Biophysics, Vanderbilt University Medical School, Nashville, TN, USA
| | - Wei Yue
- Department of Internal Medicine, University of New Mexico Comprehensive Cancer Center, Albuquerque, NM, USA
| | - Scott Ness
- Department of Internal Medicine, University of New Mexico Comprehensive Cancer Center, Albuquerque, NM, USA
| | - Ying-yong Zhao
- Key Laboratory of Resource Biology and Biotechnology in Western China, School of Life Sciences, Northwest University, Xi’an, Shaanxi, China
| |
|
37
|
Tessier L, Côté O, Bienzle D. Sequence variant analysis of RNA sequences in severe equine asthma. PeerJ 2018; 6:e5759. [PMID: 30324028 PMCID: PMC6186407 DOI: 10.7717/peerj.5759] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Received: 12/23/2017] [Accepted: 09/15/2018] [Indexed: 12/13/2022] Open
Abstract
Background: Severe equine asthma is a chronic inflammatory disease of the lung in horses similar to low-Th2 late-onset asthma in humans. This study aimed to determine the utility of RNA-Seq to call gene sequence variants, and to identify sequence variants of potential relevance to the pathogenesis of asthma. Methods: RNA-Seq data were generated from endobronchial biopsies collected from six asthmatic and seven non-asthmatic horses before and after challenge (26 samples total). Sequences were aligned to the equine genome with Spliced Transcripts Alignment to Reference software. Read preparation for sequence variant calling was performed with Picard tools and Genome Analysis Toolkit (GATK). Sequence variants were called and filtered using GATK and Ensembl Variant Effect Predictor (VEP) tools, and two RNA-Seq predicted sequence variants were investigated with both PCR and Sanger sequencing. Supplementary analysis of novel sequence variant selection with VEP was based on a score of <0.01 predicted with Sorting Intolerant from Tolerant software, missense nature, location within the protein coding sequence and presence in all asthmatic individuals. For select variants, effect on protein function was assessed with Polymorphism Phenotyping 2 and screening for non-acceptable polymorphism 2 software. Sequences were aligned and 3D protein structures predicted with Geneious software. Difference in allele frequency between the groups was assessed using a Pearson’s Chi-squared test with Yates’ continuity correction, and difference in genotype frequency was calculated using the Fisher’s exact test for count data. Results: RNA-Seq variant calling and filtering correctly identified substitution variants in PACRG and RTTN. Sanger sequencing confirmed that the PACRG substitution was appropriately identified in all 26 samples while the RTTN substitution was identified correctly in 24 of 26 samples. These variants of uncertain significance had substitutions that were predicted to result in loss of function and to be non-neutral. Amino acid substitutions projected no change of hydrophobicity and isoelectric point in PACRG, and a change in both for RTTN. For PACRG, no difference in allele frequency between the two groups was detected but a higher proportion of asthmatic horses had the altered RTTN allele compared to non-asthmatic animals. Discussion: RNA-Seq was sensitive and specific for calling gene sequence variants in this disease model. Even moderate coverage (<10–20 counts per million) yielded correct identification in 92% of samples, suggesting RNA-Seq may be suitable to detect sequence variants in low coverage samples. The impact of amino acid alterations in PACRG and RTTN proteins, and possible association of the sequence variants with asthma, is of uncertain significance, but their role in ciliary function may be of future interest.
Affiliation(s)
- Laurence Tessier
- Department of Pathobiology, University of Guelph, Guelph, ON, Canada; BenchSci, Toronto, ON, Canada
| | - Olivier Côté
- Department of Pathobiology, University of Guelph, Guelph, ON, Canada; BioAssay Works, Ijamsville, MD, USA
| | - Dorothee Bienzle
- Department of Pathobiology, University of Guelph, Guelph, ON, Canada
| |
|
38
|
Adetunji MO, Lamont SJ, Schmidt CJ. TransAtlasDB: an integrated database connecting expression data, metadata and variants. Database (Oxford) 2018; 2018:4904553. [PMID: 29688361 PMCID: PMC5824778 DOI: 10.1093/database/bay014] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 07/11/2017] [Accepted: 01/19/2018] [Indexed: 12/21/2022]
Abstract
High-throughput transcriptome sequencing (RNAseq) is the universally applied method for target-free transcript identification and gene expression quantification, generating huge amounts of data. The constraint of accessing such data and interpreting results can be a major impediment in postulating suitable hypotheses; thus, an innovative storage solution that addresses limitations such as hard disk storage requirements, efficiency and reproducibility is paramount. By offering a uniform data storage and retrieval mechanism, various data can be compared and easily investigated. We present a sophisticated system, TransAtlasDB, which incorporates a hybrid architecture of both relational and NoSQL databases for fast and efficient data storage, processing and querying of large datasets from transcript expression analysis with corresponding metadata, as well as gene-associated variants (such as SNPs) and their predicted gene effects. TransAtlasDB provides a data model for accurate storage of the large amount of data derived from RNAseq analysis, as well as methods of interacting with the database, either via command-line data management workflows, written in Perl, with useful functionalities that simplify the storage and manipulation of the massive amounts of data generated from RNAseq analysis, or through the web interface. The database application is currently modeled to handle analysis data from agricultural species, and will be expanded to include more species groups. Overall, TransAtlasDB aims to serve as an accessible repository for the large, complex results data files derived from RNAseq gene expression profiling and variant analysis. Database URL: https://modupeore.github.io/TransAtlasDB/
Affiliation(s)
- Modupeore O Adetunji
- Department of Animal and Food Sciences, University of Delaware, Newark, DE 19716, USA
| | - Susan J Lamont
- Department of Animal Science, Iowa State University, Ames, IA 50011-3150, USA
| | - Carl J Schmidt
- Department of Animal and Food Sciences, University of Delaware, Newark, DE 19716, USA
| |
|
39
|
Akila Parvathy Dharshini S, Taguchi YH, Michael Gromiha M. Exploring the selective vulnerability in Alzheimer disease using tissue specific variant analysis. Genomics 2018; 111:936-949. [PMID: 29879491 DOI: 10.1016/j.ygeno.2018.05.024] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.0] [Received: 03/23/2018] [Revised: 05/03/2018] [Accepted: 05/30/2018] [Indexed: 02/08/2023]
Abstract
The selective vulnerability of distinct regions of the brain is a critical factor in neurodegenerative disorders. In Alzheimer's disease (AD), neurons in the hippocampus, situated in the medial temporal lobe, are severely damaged. Identifying tissue-specific variants is essential for understanding the selective vulnerability in AD. In the current work, we aligned mRNA-seq data to the HG19/HG38 genome assemblies and identified specific variations present in the temporal, frontal and other lobes in AD using sequence alignment map tools. We compared the results with genome-wide association and gene expression quantitative trait loci studies of various neurological disorders. We also distinguished variants from epitranscriptomic modifications through the RNA-modification database and evaluated the variant effect in the coding/UTR regions. In addition, we developed genetic and functional interaction networks to understand the relationship between predicted vulnerable variations and differentially expressed genes. We found that genes involved in gliogenesis and intermediate filament organization are altered in the temporal lobe. Oxidative phosphorylation and calcium ion homeostasis are modified in the frontal lobe, and protein degradation and apoptotic signaling are altered in other lobes. From this study, we propose that disruption of glial cell structural integrity, defective gliogenesis, and failure in glia-neuron communication are the primary factors for selective vulnerability.
Affiliation(s)
- S Akila Parvathy Dharshini
- Department of Biotechnology, Bhupat and Jyoti Mehta School of Biosciences, Indian Institute of Technology Madras, Chennai 600036, Tamilnadu, India
| | - Y-H Taguchi
- Department of Physics, Chuo University, Kasuga, Bunkyo-ku, Tokyo 112-8551, Japan
| | - M Michael Gromiha
- Department of Biotechnology, Bhupat and Jyoti Mehta School of Biosciences, Indian Institute of Technology Madras, Chennai 600036, Tamilnadu, India; Advanced Computational Drug Discovery Unit (ACDD), Institute of Innovative Research, Tokyo Institute of Technology, 4259 Nagatsutacho, Midori-ku, Yokohama, Kanagawa 226-8501, Japan.
| |
|
40
|
Wolff A, Bayerlová M, Gaedcke J, Kube D, Beißbarth T. A comparative study of RNA-Seq and microarray data analysis on the two examples of rectal-cancer patients and Burkitt Lymphoma cells. PLoS One 2018; 13:e0197162. [PMID: 29768462 PMCID: PMC5955523 DOI: 10.1371/journal.pone.0197162] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.2] [Received: 06/26/2017] [Accepted: 04/27/2018] [Indexed: 12/17/2022] Open
Abstract
Background: Pipeline comparisons for gene expression data are highly valuable for applied real data analyses, as they enable the selection of suitable analysis strategies for the dataset at hand. Such pipelines for RNA-Seq data should include mapping of reads, counting and differential gene expression analysis, or preprocessing, normalization and differential gene expression in the case of microarray analysis, in order to give a global insight into pipeline performance. Methods: Four commonly used RNA-Seq pipelines (STAR/HTSeq-Count/edgeR, STAR/RSEM/edgeR, Sailfish/edgeR, TopHat2/Cufflinks/CuffDiff) were investigated on multiple levels (alignment and counting) and cross-compared with the microarray counterpart on the level of gene expression and gene ontology enrichment. For these comparisons we generated two matched microarray and RNA-Seq datasets: Burkitt Lymphoma cell line data and rectal cancer patient data. Results: The overall mapping rate of STAR was 98.98% for the cell line dataset and 98.49% for the patient dataset. TopHat2's overall mapping rate was 97.02% and 96.73%, respectively, while Sailfish had an overall mapping rate of only 84.81% and 54.44%. The correlation of gene expression in microarray and RNA-Seq data was moderately worse for the patient dataset (ρ = 0.67–0.69) than for the cell line dataset (ρ = 0.87–0.88). An exception was Cufflinks, whose correlation results were substantially lower (ρ = 0.21–0.29 and 0.34–0.53). For both datasets we identified very low numbers of differentially expressed genes using the microarray platform. For RNA-Seq we checked the agreement of differentially expressed genes identified in the different pipelines and of GO-term enrichment results. Conclusion: The combination of STAR aligner with HTSeq-Count, followed by STAR aligner with RSEM and Sailfish, generated differentially expressed genes best suited for the dataset at hand and in agreement with most of the other transcriptomics pipelines.
Affiliation(s)
- Alexander Wolff
- Dept. of Medical Statistics, University Medical Center Göttingen, Göttingen, Germany
| | - Michaela Bayerlová
- Dept. of Medical Statistics, University Medical Center Göttingen, Göttingen, Germany
| | - Jochen Gaedcke
- Dept. of General-, Visceral- and Pediatric Surgery, University Medical Center Göttingen, Göttingen, Germany
| | - Dieter Kube
- Dept. of Hematology and Oncology, University Medical Center Göttingen, Göttingen, Germany
| | - Tim Beißbarth
- Dept. of Medical Statistics, University Medical Center Göttingen, Göttingen, Germany
| |
|
41
|
Piórkowska K, Żukowski K, Ropka-Molik K, Tyra M. Detection of genetic variants between different Polish Landrace and Puławska pigs by means of RNA-seq analysis. Anim Genet 2018; 49:215-225. [PMID: 29635698 DOI: 10.1111/age.12654] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.0] [Accepted: 02/07/2018] [Indexed: 02/06/2023]
Abstract
Variant calling analysis based on RNA sequencing data provides information about gene variants. RNA-seq is cheaper and faster than DNA sequencing. However, it requires individual hard filters during data processing due to post-transcriptional modifications such as splicing and RNA editing. In the present study, RNA-seq transcriptome data on two Polish pig breeds (Puławska, PUL, n = 8, and Polish Landrace, PL, n = 8) were included. The pig breeds are significantly different with regard to meat qualities such as texture, water exudation, growth traits and fat content in carcasses. A total of 2451 significant mutations were identified by chi-square tests, and functional analysis was carried out using Panther, KEGG and Kobas. Interesting missense gene variants and mutations located in regulatory regions were found in a few genes related to fatty acid metabolism and lipid storage such as ACSL5, ALDH3A2, FADS1, SCD, PLA2G12A and ATGL. A validation of mutational influences on pig traits was performed for ALDH3A2, ATGL, PLA2G12A and MYOM1 variants using association analysis including 215 pigs of the PL and PUL breeds. The ALDH3A2 ENSSSCT00000019636.2:c.470T>C polymorphism was found to affect the weight of the ham and loin eye area. In turn, an ENSSSCT00000004091.2:c.2836G>A MYOM1 mutation, which could be implicated in myofibrillar network organisation, had an effect on meatiness and loin texture parameters. The study aimed to estimate the usefulness of RNA-seq results for a purpose other than differentially expressed gene analysis. The analysis performed indicated interesting gene variants that could be used in the future as markers during selection.
Affiliation(s)
- K Piórkowska
- Department of Animal Molecular Biology, National Research Institute of Animal Production, 32-083, Balice, Poland
| | - K Żukowski
- Department of Cattle Breeding, National Research Institute of Animal Production, 32-083, Balice, Poland
| | - K Ropka-Molik
- Department of Animal Molecular Biology, National Research Institute of Animal Production, 32-083, Balice, Poland
| | - M Tyra
- Department of Pig Breeding, National Research Institute of Animal Production, 32-083, Balice, Poland
| |
|
42
|
Malmberg MM, Pembleton LW, Baillie RC, Drayton MC, Sudheesh S, Kaur S, Shinozuka H, Verma P, Spangenberg GC, Daetwyler HD, Forster JW, Cogan NO. Genotyping-by-sequencing through transcriptomics: implementation in a range of crop species with varying reproductive habits and ploidy levels. Plant Biotechnol J 2018; 16:877-889. [PMID: 28913899 PMCID: PMC5866951 DOI: 10.1111/pbi.12835] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Received: 05/29/2017] [Revised: 08/03/2017] [Accepted: 09/08/2017] [Indexed: 05/09/2023]
Abstract
The application of genomics in crops has the ability to significantly improve genetic gain for agriculture. Many marker-dense tools have been developed, but few have seen broad adoption in plant genomics due to issues of significant variations of genome size, levels of ploidy, single nucleotide polymorphism (SNP) frequency and reproductive habit. When combined with limited breeding activities, small research communities and scant sequence resources, the suitability of popular systems is often suboptimal and routinely fails to effectively balance cost-effectiveness and sample throughput. Genotyping-by-sequencing (GBS) encompasses a range of protocols including resequencing of the transcriptome. This study describes a skim GBS-transcriptomics (GBS-t) approach developed to be broadly applicable, cost-effective and high-throughput while still assaying a significant number of SNP loci. A range of crop species with differing levels of ploidy and degree of inbreeding/outbreeding were chosen, including perennial ryegrass, a diploid outbreeding forage grass; phalaris, a putative segmental allotetraploid outbreeding forage grass; lentil, a diploid inbreeding grain legume; and canola, an allotetraploid partially outbreeding oilseed. GBS-t was validated as a simple and largely automated, cost-effective method which generates sufficient SNPs (from 89 738 to 231 977) with acceptable levels of missing data and even genome coverage from c. 3 million sequence reads per sample. GBS-t is therefore a broadly applicable system suitable for many crops, offering advantages over other systems. The correct choice of subsequent sequence analysis software is important, and the bioinformatics process should be iterative and tailored to the specific challenges posed by ploidy variation and extent of heterozygosity.
Affiliation(s)
- M. Michelle Malmberg
- Agriculture Victoria, AgriBio, Centre for AgriBioscience, 5 Ring Road, Bundoora, Victoria 3083, Australia
- School of Applied Systems Biology, La Trobe University, Bundoora, Victoria 3086, Australia
| | - Luke W. Pembleton
- Agriculture Victoria, AgriBio, Centre for AgriBioscience, 5 Ring Road, Bundoora, Victoria 3083, Australia
| | - Rebecca C. Baillie
- Agriculture Victoria, AgriBio, Centre for AgriBioscience, 5 Ring Road, Bundoora, Victoria 3083, Australia
| | - Michelle C. Drayton
- Agriculture Victoria, AgriBio, Centre for AgriBioscience, 5 Ring Road, Bundoora, Victoria 3083, Australia
| | - Shimna Sudheesh
- Agriculture Victoria, AgriBio, Centre for AgriBioscience, 5 Ring Road, Bundoora, Victoria 3083, Australia
| | - Sukhjiwan Kaur
- Agriculture Victoria, AgriBio, Centre for AgriBioscience, 5 Ring Road, Bundoora, Victoria 3083, Australia
| | - Hiroshi Shinozuka
- Agriculture Victoria, AgriBio, Centre for AgriBioscience, 5 Ring Road, Bundoora, Victoria 3083, Australia
| | - Preeti Verma
- Agriculture Victoria, AgriBio, Centre for AgriBioscience, 5 Ring Road, Bundoora, Victoria 3083, Australia
| | - German C. Spangenberg
- Agriculture Victoria, AgriBio, Centre for AgriBioscience, 5 Ring Road, Bundoora, Victoria 3083, Australia
- School of Applied Systems Biology, La Trobe University, Bundoora, Victoria 3086, Australia
| | - Hans D. Daetwyler
- Agriculture Victoria, AgriBio, Centre for AgriBioscience, 5 Ring Road, Bundoora, Victoria 3083, Australia
- School of Applied Systems Biology, La Trobe University, Bundoora, Victoria 3086, Australia
| | - John W. Forster
- Agriculture Victoria, AgriBio, Centre for AgriBioscience, 5 Ring Road, Bundoora, Victoria 3083, Australia
- School of Applied Systems Biology, La Trobe University, Bundoora, Victoria 3086, Australia
| | - Noel O.I. Cogan
- Agriculture Victoria, AgriBio, Centre for AgriBioscience, 5 Ring Road, Bundoora, Victoria 3083, Australia
- School of Applied Systems Biology, La Trobe University, Bundoora, Victoria 3086, Australia
| |
|
43
|
Ghorbani A, Izadpanah K, Dietzgen RG. Gene expression and population polymorphism of maize Iranian mosaic virus in Zea mays, and intracellular localization and interactions of viral N, P, and M proteins in Nicotiana benthamiana. Virus Genes 2018; 54:290-296. [PMID: 29450759 DOI: 10.1007/s11262-018-1540-9] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.3] [Received: 12/08/2017] [Accepted: 02/06/2018] [Indexed: 10/18/2022]
Abstract
Maize Iranian mosaic virus (MIMV; Mononegavirales, Rhabdoviridae, Nucleorhabdovirus) infects maize and several other poaceous plants. MIMV encodes six proteins, i.e., nucleocapsid protein (N), polymerase cofactor phosphoprotein (P), putative movement protein (P3), matrix protein (M), glycoprotein (G), and large RNA-dependent RNA polymerase (L). In the present study, MIMV gene expression and genetic polymorphism of an MIMV population in maize were determined. N, P, P3, and M protein genes were more highly expressed than the 5' terminal G and L genes. Twelve single nucleotide polymorphisms were identified across the genome within an MIMV population in maize from RNA-Seq read data pooled from three infected plants, indicating genomic variations of potential importance to evolution of the virus. MIMV N, P, and M proteins that are known to be involved in rhabdovirus replication and transcription were characterized as to their intracellular localization and interactions. N protein accumulated exclusively in the nucleus and interacted with itself and with P protein. P protein accumulated in both the nucleus and cell periphery and interacted with itself, N and M proteins in the nucleus. M protein was localized in the cell periphery and on endomembranes, and interacted with P protein in the nucleus. MIMV proteins show a distinctive combination of intracellular localizations and interactions.
Affiliation(s)
- Abozar Ghorbani
  - College of Agriculture, Plant Virology Research Center, Shiraz University, Shiraz, Iran
  - Queensland Alliance for Agriculture and Food Innovation, The University of Queensland, St. Lucia, QLD, Australia
- Ralf G Dietzgen
  - Queensland Alliance for Agriculture and Food Innovation, The University of Queensland, St. Lucia, QLD, Australia

44
Wang X, Chen Q, Wu Y, Lemmon ZH, Xu G, Huang C, Liang Y, Xu D, Li D, Doebley JF, Tian F. Genome-wide Analysis of Transcriptional Variability in a Large Maize-Teosinte Population. Mol Plant 2018; 11:443-459. [PMID: 29275164 DOI: 10.1016/j.molp.2017.12.011] [Citation(s) in RCA: 57] [Impact Index Per Article: 9.5] [Received: 05/27/2017] [Revised: 10/21/2017] [Accepted: 12/11/2017] [Indexed: 05/18/2023]
Abstract
Gene expression regulation plays an important role in controlling plant phenotypes and adaptation. Here, we report a comprehensive assessment of gene expression variation through the transcriptome analyses of a large maize-teosinte experimental population. Genome-wide mapping identified 25 660 expression quantitative trait loci (eQTL) for 17 311 genes, capturing an unprecedented range of expression variation. We found that local eQTL were more frequently mapped to adjacent genes, displaying a mode of expression piggybacking, which consequently created co-regulated gene clusters. Genes within the co-regulated gene clusters tend to have relevant functions and shared chromatin modifications. Distant eQTL formed 125 significant distant eQTL hotspots with their targets significantly enriched in specific functional categories. By integrating different sources of information, we identified putative trans-regulators for a variety of metabolic pathways. We demonstrated that the bHLH transcription factor R1 and hexokinase HEX9 might act as crucial regulators for flavonoid biosynthesis and glycolysis, respectively. Moreover, we showed that domestication or improvement has significantly affected global gene expression, with many genes targeted by selection. Of particular interest, the Bx genes for benzoxazinoid biosynthesis may have undergone coordinated cis-regulatory divergence between maize and teosinte, and a transposon insertion that inactivates Bx12 was under strong selection as maize spread into temperate environments with a distinct herbivore community.
Affiliation(s)
- Xufeng Wang
  - National Maize Improvement Center of China, Beijing Key Laboratory of Crop Genetic Improvement, Laboratory of Crop Heterosis and Utilization, Joint International Research Laboratory of Crop Molecular Breeding, China Agricultural University, Beijing 100193, China
- Qiuyue Chen
  - National Maize Improvement Center of China, Beijing Key Laboratory of Crop Genetic Improvement, Laboratory of Crop Heterosis and Utilization, Joint International Research Laboratory of Crop Molecular Breeding, China Agricultural University, Beijing 100193, China
- Yaoyao Wu
  - National Maize Improvement Center of China, Beijing Key Laboratory of Crop Genetic Improvement, Laboratory of Crop Heterosis and Utilization, Joint International Research Laboratory of Crop Molecular Breeding, China Agricultural University, Beijing 100193, China
- Zachary H Lemmon
  - Department of Genetics, University of Wisconsin, Madison, WI 53706, USA
- Guanghui Xu
  - National Maize Improvement Center of China, Beijing Key Laboratory of Crop Genetic Improvement, Laboratory of Crop Heterosis and Utilization, Joint International Research Laboratory of Crop Molecular Breeding, China Agricultural University, Beijing 100193, China
- Cheng Huang
  - National Maize Improvement Center of China, Beijing Key Laboratory of Crop Genetic Improvement, Laboratory of Crop Heterosis and Utilization, Joint International Research Laboratory of Crop Molecular Breeding, China Agricultural University, Beijing 100193, China
- Yameng Liang
  - National Maize Improvement Center of China, Beijing Key Laboratory of Crop Genetic Improvement, Laboratory of Crop Heterosis and Utilization, Joint International Research Laboratory of Crop Molecular Breeding, China Agricultural University, Beijing 100193, China
- Dingyi Xu
  - National Maize Improvement Center of China, Beijing Key Laboratory of Crop Genetic Improvement, Laboratory of Crop Heterosis and Utilization, Joint International Research Laboratory of Crop Molecular Breeding, China Agricultural University, Beijing 100193, China
- Dan Li
  - National Maize Improvement Center of China, Beijing Key Laboratory of Crop Genetic Improvement, Laboratory of Crop Heterosis and Utilization, Joint International Research Laboratory of Crop Molecular Breeding, China Agricultural University, Beijing 100193, China
- John F Doebley
  - Department of Genetics, University of Wisconsin, Madison, WI 53706, USA
- Feng Tian
  - National Maize Improvement Center of China, Beijing Key Laboratory of Crop Genetic Improvement, Laboratory of Crop Heterosis and Utilization, Joint International Research Laboratory of Crop Molecular Breeding, China Agricultural University, Beijing 100193, China

45
Abstract
It is estimated that more than 90% of the mammalian genome is transcribed as non-coding RNAs. Recent evidence has established that these non-coding transcripts are not junk or mere transcriptional noise, but serve important biological purposes. One rapidly expanding field concerns the regulatory long non-coding RNAs (lncRNAs), whose molecular functions and mechanisms of action have been a major challenge to determine. The emergence of high-throughput technologies and developments in various conventional approaches have led to the expansion of the lncRNA world. The combination of multidisciplinary approaches has proven essential to unravel the complexity of their regulatory networks and has helped establish the importance of their existence. Here, we review the current methodologies available for discovering and investigating the functions of lncRNAs and focus on the powerful technological advances available to specifically address their functional importance.

46
van Son M, Tremoen NH, Gaustad AH, Myromslien FD, Våge DI, Stenseth EB, Zeremichael TT, Grindflek E. RNA sequencing reveals candidate genes and polymorphisms related to sperm DNA integrity in testis tissue from boars. BMC Vet Res 2017; 13:362. [PMID: 29183316 PMCID: PMC5706377 DOI: 10.1186/s12917-017-1279-x] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.7] [Received: 02/14/2017] [Accepted: 11/16/2017] [Indexed: 11/17/2022] Open
Abstract
Background: Sperm DNA is protected against fragmentation by a high degree of chromatin packaging. It has been demonstrated that proper chromatin packaging is important for boar fertility outcome. However, little is known about the molecular mechanisms underlying differences in sperm DNA fragmentation. Knowledge of sequence variation influencing this sperm parameter could be beneficial in selecting the best artificial insemination (AI) boars for commercial production. The aim of this study was to identify genes differentially expressed in testis tissue of Norwegian Landrace and Duroc boars, with high and low sperm DNA fragmentation index (DFI), using transcriptome sequencing. Results: Altogether, 308 and 374 genes were found to display significant differences in expression level between high and low DFI in Landrace and Duroc boars, respectively. Of these genes, 71 were differentially expressed in both breeds. Gene ontology analysis revealed that significant terms in common for the two breeds included extracellular matrix, extracellular region and calcium ion binding. Moreover, different metabolic processes were enriched in Landrace and Duroc, whereas immune response terms were common in Landrace only. Variant detection identified putative polymorphisms in some of the differentially expressed genes. Validation showed that predicted high impact variants in RAMP2, GIMAP6 and three uncharacterized genes are particularly interesting for sperm DNA fragmentation in boars. Conclusions: We identified differentially expressed genes between groups of boars with high and low sperm DFI, and functional annotation of these genes points towards important biochemical pathways. Moreover, variant detection identified putative polymorphisms in the differentially expressed genes. Our results provide valuable insights into the molecular network underlying DFI in pigs.
Affiliation(s)
- Nina Hårdnes Tremoen
  - Department of Natural Sciences and Technology, Inland Norway University of Applied Sciences, 2318, Hamar, Norway
  - Centre for Integrative Genetics (CIGENE), Department of Animal and Aquacultural Sciences, Faculty of Biosciences, Norwegian University of Life Sciences, 1432, Ås, Norway
- Ann Helen Gaustad
  - Topigs Norsvin, 2317, Hamar, Norway
  - Department of Natural Sciences and Technology, Inland Norway University of Applied Sciences, 2318, Hamar, Norway
- Frøydis Deinboll Myromslien
  - Department of Natural Sciences and Technology, Inland Norway University of Applied Sciences, 2318, Hamar, Norway
- Dag Inge Våge
  - Centre for Integrative Genetics (CIGENE), Department of Animal and Aquacultural Sciences, Faculty of Biosciences, Norwegian University of Life Sciences, 1432, Ås, Norway
- Else-Berit Stenseth
  - Department of Natural Sciences and Technology, Inland Norway University of Applied Sciences, 2318, Hamar, Norway

47
Li Q, Wang X, Liu X, Liao Q, Sun J, He X, Yang T, Yin J, Jia J, Li X, Colotte M, Bonnet J. Long-Term Room Temperature Storage of Dry Ribonucleic Acid for Use in RNA-Seq Analysis. Biopreserv Biobank 2017; 15:502-511. [PMID: 29022740 DOI: 10.1089/bio.2017.0024] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Indexed: 01/17/2023] Open
Abstract
RNA is an essential biological material for research in genomics and translational medicine. As such, its storage for biobanking is an important field of study. Traditionally, long-term storage in the cold (generally freezers or liquid nitrogen) is used to maintain high-quality (in terms of quantity and integrity) RNA. Room temperature (RT) preservation provides an alternative to the cold, which is plagued by serious problems (mainly cost and safety), for RNA long-term storage. In this study, we evaluated the performance of several RT storage procedures, including the RNAshell® from Imagene, where the RNA is dried and kept protected from the atmosphere, and the vacuum drying of RNA with additives such as the Imagene stabilization solution and a home-made trehalose solution. This evaluation was performed through accelerated (equivalent to 10 years for RNAshell) aging and real-time studies (4 years). To check RNA quality and integrity, we used RNA integrity number values and RNA-seq. Our study shows that isolation from atmosphere offers a superior protective effect for RNA storage compared with vacuum drying alone, and demonstrates that RNAshell permits satisfactory RNA quality for long-term RT storage. Thus, the RNA quality could meet the demand of downstream applications such as RNA-seq.
Affiliation(s)
- Qiyuan Li
  - China National GeneBank-Shenzhen, BGI-Shenzhen, Shenzhen, China
- Xian Wang
  - China National GeneBank-Shenzhen, BGI-Shenzhen, Shenzhen, China
- Xiaopan Liu
  - China National GeneBank-Shenzhen, BGI-Shenzhen, Shenzhen, China
- Qiuyan Liao
  - China National GeneBank-Shenzhen, BGI-Shenzhen, Shenzhen, China
- Jianbo Sun
  - China National GeneBank-Shenzhen, BGI-Shenzhen, Shenzhen, China
- Xuheng He
  - China National GeneBank-Shenzhen, BGI-Shenzhen, Shenzhen, China
- Ting Yang
  - China National GeneBank-Shenzhen, BGI-Shenzhen, Shenzhen, China
- Jiefang Yin
  - China National GeneBank-Shenzhen, BGI-Shenzhen, Shenzhen, China
- Jia Jia
  - China National GeneBank-Shenzhen, BGI-Shenzhen, Shenzhen, China
- Xue Li
  - China National GeneBank-Shenzhen, BGI-Shenzhen, Shenzhen, China
- Marthe Colotte
  - Imagene, Production Platform, Rue Henri Desbruères, Evry, France
- Jacques Bonnet
  - Institut Bergonié, Université de Bordeaux, Bordeaux, France
  - Imagene, R&D Department, Université de Bordeaux, ENSTBB, Bordeaux, France

48
Guo Y, Zhao S, Sheng Q, Samuels DC, Shyr Y. The discrepancy among single nucleotide variants detected by DNA and RNA high throughput sequencing data. BMC Genomics 2017; 18:690. [PMID: 28984205 PMCID: PMC5629567 DOI: 10.1186/s12864-017-4022-x] [Citation(s) in RCA: 23] [Impact Index Per Article: 3.3] [Indexed: 12/30/2022] Open
Abstract
BACKGROUND: High throughput sequencing technology enables both the human genome and transcriptome to be screened at single nucleotide resolution. Tools have been developed to infer single nucleotide variants (SNVs) from both DNA and RNA sequencing data. To evaluate how much difference can be expected between DNA and RNA sequencing data, and among tissue sources, we designed a study to examine the single nucleotide differences among five sources of high throughput sequencing data generated from the same individual, including exome sequencing from blood, tumor and adjacent normal tissue, and RNAseq from tumor and adjacent normal tissue. RESULTS: Through careful quality control and analysis of the SNVs, we found little difference between DNA-DNA pairs (1%-2%). However, between DNA-RNA pairs, SNV differences ranged from 10% to 20%. CONCLUSIONS: Only a small portion of these differences can be explained by RNA editing. Instead, the majority of the DNA-RNA differences should be attributed to technical errors from sequencing and post-processing of RNAseq data. Our analysis results suggest that SNV detection using RNAseq is subject to high false positive rates.
Affiliation(s)
- Yan Guo
  - Department of Biomedical Informatics, Vanderbilt University, 2220 Pierce Ave, 571 PRB, Nashville, TN, 37027, USA
- Shilin Zhao
  - Department of Biomedical Informatics, Vanderbilt University, 2220 Pierce Ave, 571 PRB, Nashville, TN, 37027, USA
- Quanhu Sheng
  - Department of Biomedical Informatics, Vanderbilt University, 2220 Pierce Ave, 571 PRB, Nashville, TN, 37027, USA
- David C Samuels
  - Vanderbilt Genetics Institute, Department of Molecular Physiology and Biophysics, Vanderbilt University Medical School, Nashville, TN, USA
- Yu Shyr
  - Department of Biostatistics, Vanderbilt University, 2220 Pierce Ave, 571 PRB, Nashville, TN, 37027, USA

49
Audoux J, Salson M, Grosset CF, Beaumeunier S, Holder JM, Commes T, Philippe N. SimBA: A methodology and tools for evaluating the performance of RNA-Seq bioinformatic pipelines. BMC Bioinformatics 2017; 18:428. [PMID: 28969586 PMCID: PMC5623974 DOI: 10.1186/s12859-017-1831-5] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Received: 01/20/2017] [Accepted: 09/08/2017] [Indexed: 11/10/2022] Open
Abstract
Background: The evolution of next-generation sequencing (NGS) technologies has led to increased focus on RNA-Seq. Many bioinformatic tools have been developed for RNA-Seq analysis, each with unique performance characteristics and configuration parameters. Users face an increasingly complex task in understanding which bioinformatic tools are best for their specific needs and how they should be configured. In order to provide some answers to these questions, we investigate the performance of leading bioinformatic tools designed for RNA-Seq analysis and propose a methodology for systematic evaluation and comparison of performance to help users make well-informed choices. Results: To evaluate RNA-Seq pipelines, we developed a suite of two benchmarking tools. SimCT generates simulated datasets that get as close as possible to specific real biological conditions, accompanied by the list of genomic incidents and mutations that have been inserted. BenchCT then compares the output of any bioinformatics pipeline that has been run against a SimCT dataset with the simulated genomic and transcriptional variations it contains, to give an accurate performance evaluation in addressing a specific biological question. We used these tools to simulate a real-world genomic medicine question involving the comparison of healthy and cancerous cells. Results revealed that performance in addressing a particular biological context varied significantly depending on the choice of tools and settings used. We also found that by combining the output of certain pipelines, substantial performance improvements could be achieved. Conclusion: Our research emphasizes the importance of selecting and configuring bioinformatic tools for the specific biological question being investigated to obtain optimal results. Pipeline designers, developers and users should include benchmarking in the context of their biological question as part of their design and quality control process.
Our SimBA suite of benchmarking tools provides a reliable basis for comparing the performance of RNA-Seq bioinformatics pipelines in addressing a specific biological question. We would like to see the creation of a reference corpus of data-sets that would allow accurate comparison between benchmarks performed by different groups and the publication of more benchmarks based on this public corpus. SimBA software and data-set are available at http://cractools.gforge.inria.fr/softwares/simba/.
Affiliation(s)
- Jérôme Audoux
  - SeqOne, IRMB, CHRU de Montpellier -Hopital St Eloi, 80 avenue Augustin Fliche, Montpellier, 34295, France
  - Institute of Computational Biology, Montpellier, 860, Rue Saint-Priest, Montpellier Cedex 5, 34095, France
- Mikaël Salson
  - University Lille, CNRS, Centrale Lille, Inria, UMR 9189 - CRIStAL - Centre de Recherche en Informatique Signal et Automatique de Lille, Lille, F-59000, France
- Sacha Beaumeunier
  - SeqOne, IRMB, CHRU de Montpellier -Hopital St Eloi, 80 avenue Augustin Fliche, Montpellier, 34295, France
  - Institute of Computational Biology, Montpellier, 860, Rue Saint-Priest, Montpellier Cedex 5, 34095, France
- Jean-Marc Holder
  - SeqOne, IRMB, CHRU de Montpellier -Hopital St Eloi, 80 avenue Augustin Fliche, Montpellier, 34295, France
  - Institute of Computational Biology, Montpellier, 860, Rue Saint-Priest, Montpellier Cedex 5, 34095, France
- Thérèse Commes
  - SeqOne, IRMB, CHRU de Montpellier -Hopital St Eloi, 80 avenue Augustin Fliche, Montpellier, 34295, France
  - Institute of Computational Biology, Montpellier, 860, Rue Saint-Priest, Montpellier Cedex 5, 34095, France
- Nicolas Philippe
  - SeqOne, IRMB, CHRU de Montpellier -Hopital St Eloi, 80 avenue Augustin Fliche, Montpellier, 34295, France
  - Institute of Computational Biology, Montpellier, 860, Rue Saint-Priest, Montpellier Cedex 5, 34095, France

50
Zhang Y, Li D, Han R, Wang Y, Li G, Liu X, Tian Y, Kang X, Li Z. Transcriptome analysis of the pectoral muscles of local chickens and commercial broilers using Ribo-Zero ribonucleic acid sequencing. PLoS One 2017; 12:e0184115. [PMID: 28863190 PMCID: PMC5581173 DOI: 10.1371/journal.pone.0184115] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.6] [Received: 05/12/2017] [Accepted: 08/20/2017] [Indexed: 12/02/2022] Open
Abstract
BACKGROUND: The molecular mechanisms underlying meat quality and muscle growth are not clear. The meat quality and growth rates of local chickens and commercial broilers are very different. The Ribo-Zero RNA-Seq technology is an effective means of analyzing transcriptomes to clarify molecular mechanisms. The aim of this study was to provide a reference for studies of the differences in the meat quality and growth of different breeds of chickens. RESULTS: Ribo-Zero RNA-Seq technology was used to analyze the pectoral muscle transcriptomes of Gushi chickens and AA broilers. Compared with AA broilers, 1649 genes with annotated information were significantly differentially expressed (736 upregulated and 913 downregulated) in Gushi chickens with Q≤0.05 (Q is the P-value corrected for multiple hypothesis testing) at a fold change ≥2 or ≤0.5. In addition, 2540 novel significantly differentially expressed (SDE) genes (1405 upregulated and 1135 downregulated) were discovered. The results showed that the main signal transduction pathways that differed between Gushi chickens and AA broilers were related to amino acid metabolism. Amino acids are important for protein synthesis, and they regulate key metabolic pathways to improve the growth, development and reproduction of organisms. CONCLUSION: This study showed that differentially expressed genes in the pectoral tissues of Gushi chickens and AA broilers were related to fat metabolism, which affects meat. Additionally, a large number of novel genes were found that may be involved in fat metabolism and thus may affect the formation of meat, which requires further study. The results of this study provide a reference for further studies of the molecular mechanisms of meat formation.
Affiliation(s)
- Yanhua Zhang
  - College of Animal Science and Veterinary Medicine, Henan Agricultural University, Zhengzhou, China
- Donghua Li
  - College of Animal Science and Veterinary Medicine, Henan Agricultural University, Zhengzhou, China
- Ruili Han
  - College of Animal Science and Veterinary Medicine, Henan Agricultural University, Zhengzhou, China
  - Henan Innovative Engineering Research Center of Poultry Germplasm Resource, Zhengzhou, China
- Yanbin Wang
  - College of Animal Science and Veterinary Medicine, Henan Agricultural University, Zhengzhou, China
  - Henan Innovative Engineering Research Center of Poultry Germplasm Resource, Zhengzhou, China
- Guoxi Li
  - College of Animal Science and Veterinary Medicine, Henan Agricultural University, Zhengzhou, China
  - Henan Innovative Engineering Research Center of Poultry Germplasm Resource, Zhengzhou, China
- Xiaojun Liu
  - College of Animal Science and Veterinary Medicine, Henan Agricultural University, Zhengzhou, China
  - Henan Innovative Engineering Research Center of Poultry Germplasm Resource, Zhengzhou, China
- Yadong Tian
  - College of Animal Science and Veterinary Medicine, Henan Agricultural University, Zhengzhou, China
  - Henan Innovative Engineering Research Center of Poultry Germplasm Resource, Zhengzhou, China
- Xiangtao Kang
  - College of Animal Science and Veterinary Medicine, Henan Agricultural University, Zhengzhou, China
  - Henan Innovative Engineering Research Center of Poultry Germplasm Resource, Zhengzhou, China
- Zhuanjian Li
  - College of Animal Science and Veterinary Medicine, Henan Agricultural University, Zhengzhou, China
  - Henan Innovative Engineering Research Center of Poultry Germplasm Resource, Zhengzhou, China