1
Gurgul A, Szmatoła T, Ocłoń E, Jasielczuk I, Semik-Gurgul E, Finno CJ, Petersen JL, Bellone R, Hales EN, Ząbek T, Arent Z, Kotula-Balak M, Bugno-Poniewierska M. Another lesson from unmapped reads: in-depth analysis of RNA-Seq reads from various horse tissues. J Appl Genet 2022; 63:571-581. [PMID: 35670911 DOI: 10.1007/s13353-022-00705-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/13/2022] [Revised: 04/27/2022] [Accepted: 05/31/2022] [Indexed: 11/25/2022]
Abstract
In recent years, a vast amount of sequencing data has been generated and large improvements have been made to reference genome sequences. Despite these advances, significant portions of reads still do not map to reference genomes and these reads have been considered as junk or artificial sequences. Recent studies have shown that these reads can be useful, e.g., for refining reference genomes or detecting contaminating microorganisms present in the analyzed biological samples. A special case of this is RNA sequencing (RNA-Seq) reads that come from tissue transcriptomes. Unmapped reads from RNA-Seq have received much less attention than those from whole-genome sequencing. In particular, in the horse, an analysis of unmapped RNA reads has not been performed yet. Thus, in this study, we analyzed the unmapped reads originating from the RNA-Seq performed through the Functional Annotation of Animal Genomes (FAANG) project in the horse, using eight different tissues from two mares. We demonstrated that unmapped reads from RNA-Seq could be easily assembled into transcripts relating to many important genes present in the sequences of other mammals. Large portions of these transcripts did not have coding potential and, thus, can be considered as non-coding RNA. Moreover, reads that were not mapped to the reference genome but aligned to the entries in NCBI database of horse proteins were enriched for biological processes that largely correspond to the functions of organ from which RNA was isolated and thus are presumably true transcripts of genes associated with cell metabolism in those tissues. In addition, a portion of reads aligned to the common pathogenic or neutral microbiota, of which the most common was Brucella spp. These data suggest that unmapped reads can be an important target for in-depth analysis that may substantially enrich results of initial RNA-Seq experiments for various tissues and organs.
Affiliation(s)
- Artur Gurgul
  - Center for Experimental and Innovative Medicine, University of Agriculture in Krakow, Rędzina 1c, 30-248, Kraków, Poland
- Tomasz Szmatoła
  - Center for Experimental and Innovative Medicine, University of Agriculture in Krakow, Rędzina 1c, 30-248, Kraków, Poland
- Ewa Ocłoń
  - Center for Experimental and Innovative Medicine, University of Agriculture in Krakow, Rędzina 1c, 30-248, Kraków, Poland
- Igor Jasielczuk
  - Center for Experimental and Innovative Medicine, University of Agriculture in Krakow, Rędzina 1c, 30-248, Kraków, Poland
- Ewelina Semik-Gurgul
  - Department of Animal Molecular Biology, National Research Institute of Animal Production, Krakowska 1, 32-083, Balice, Poland
- Carrie J Finno
  - Department of Population Health and Reproduction, University of California Davis School of Veterinary Medicine, Davis, CA, USA
- Jessica L Petersen
  - Department of Animal Science, University of Nebraska Lincoln, Lincoln, NB, USA
- Rebecca Bellone
  - Department of Population Health and Reproduction, University of California Davis School of Veterinary Medicine, Davis, CA, USA
  - Veterinary Genetics Laboratory, University of California Davis School of Veterinary Medicine, Davis, CA, USA
- Erin N Hales
  - Department of Population Health and Reproduction, University of California Davis School of Veterinary Medicine, Davis, CA, USA
- Tomasz Ząbek
  - Department of Animal Molecular Biology, National Research Institute of Animal Production, Krakowska 1, 32-083, Balice, Poland
- Zbigniew Arent
  - Center for Experimental and Innovative Medicine, University of Agriculture in Krakow, Rędzina 1c, 30-248, Kraków, Poland
- Małgorzata Kotula-Balak
  - University Centre of Veterinary Medicine, University of Agriculture in Krakow, Mickiewicza 24/28, 30-059, Krakow, Poland
- Monika Bugno-Poniewierska
  - Department of Animal Reproduction, Anatomy and Genomics, University of Agriculture in Kraków, al. Mickiewicza 24/28, 30-059, Kraków, Poland
2
Wang Y, Xue H, Aglave M, Lainé A, Gallopin M, Gautheret D. The contribution of uncharted RNA sequences to tumor identity in lung adenocarcinoma. NAR Cancer 2022; 4:zcac001. [PMID: 35118386 PMCID: PMC8807116 DOI: 10.1093/narcan/zcac001] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/31/2021] [Revised: 11/18/2021] [Accepted: 01/10/2022] [Indexed: 11/12/2022]
Abstract
The identity of cancer cells is defined by the interplay between genetic, epigenetic, transcriptional and post-transcriptional variation. Much of this variation is present in RNA-seq data and can be captured at once using reference-free k-mer analysis. An important issue with k-mer analysis, however, is the difficulty of distinguishing signal from noise. Here, we use two independent lung adenocarcinoma datasets to identify all reproducible events at the k-mer level, in a tumor versus normal setting. We find reproducible events in many different locations (introns, intergenic regions, repeats) and forms (spliced, polyadenylated, chimeric, etc.). We systematically analyze events that are ignored in conventional transcriptomics and assess their value as biomarkers and for tumor classification, survival prediction, neoantigen prediction and correlation with the immune microenvironment. We find that unannotated lincRNAs, novel splice variants, endogenous HERV, LINE-1 and Alu repeats and bacterial RNAs each contribute to different, important aspects of tumor identity. We argue that differential RNA-seq analysis of tumor/normal sample collections would benefit from this type of k-mer analysis to cast a wider net on important cancer-related events. The code is available at https://github.com/Transipedia/dekupl-lung-cancer-inter-cohort.
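As a rough illustration of what reference-free k-mer analysis can look like, the sketch below counts k-mers in two tiny read sets and lists those seen repeatedly in the "tumor" reads and never in the "normal" reads. The reads, the value k=5, the `min_count` threshold and both function names are hypothetical and chosen for illustration only; this is not the authors' pipeline, which is in the repository linked above.

```python
from collections import Counter

def count_kmers(reads, k):
    """Count every length-k substring across a list of reads."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts

def condition_specific_kmers(case_reads, control_reads, k, min_count=2):
    """Return k-mers seen at least min_count times in case reads and never in control reads."""
    case = count_kmers(case_reads, k)
    control = count_kmers(control_reads, k)
    return sorted(kmer for kmer, n in case.items()
                  if n >= min_count and kmer not in control)

# Hypothetical toy reads; real inputs would be FASTQ files with millions of reads.
tumor_reads = ["ACGTACGTTT", "CCACGTTTGA"]
normal_reads = ["ACGTACGAAA", "CCACGAAAGA"]
print(condition_specific_kmers(tumor_reads, normal_reads, k=5))
```

Requiring a minimum count is one simple way to separate signal from noise; a real analysis would compare counts statistically across many samples and, as the abstract describes, across independent cohorts.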
Affiliation(s)
- Yunfeng Wang
  - Institute for Integrative Biology of the Cell (I2BC), Université Paris-Saclay, CNRS, CEA, 1 avenue de la Terrasse, 91190, Gif-sur-Yvette, France
  - Annoroad Gene Technology Co., Ltd, 100176 Beijing, China
- Haoliang Xue
  - Institute for Integrative Biology of the Cell (I2BC), Université Paris-Saclay, CNRS, CEA, 1 avenue de la Terrasse, 91190, Gif-sur-Yvette, France
- Marine Aglave
  - Institute for Integrative Biology of the Cell (I2BC), Université Paris-Saclay, CNRS, CEA, 1 avenue de la Terrasse, 91190, Gif-sur-Yvette, France
  - Gustave Roussy, 114 rue Edouard Vaillant, 94800, Villejuif, France
- Antoine Lainé
  - Institute for Integrative Biology of the Cell (I2BC), Université Paris-Saclay, CNRS, CEA, 1 avenue de la Terrasse, 91190, Gif-sur-Yvette, France
- Mélina Gallopin
  - Institute for Integrative Biology of the Cell (I2BC), Université Paris-Saclay, CNRS, CEA, 1 avenue de la Terrasse, 91190, Gif-sur-Yvette, France
- Daniel Gautheret
  - Institute for Integrative Biology of the Cell (I2BC), Université Paris-Saclay, CNRS, CEA, 1 avenue de la Terrasse, 91190, Gif-sur-Yvette, France
  - Gustave Roussy, 114 rue Edouard Vaillant, 94800, Villejuif, France
3
Chakravorty S, Afzali B, Kazemian M. EBV-associated diseases: Current therapeutics and emerging technologies. Front Immunol 2022; 13:1059133. [PMID: 36389670 PMCID: PMC9647127 DOI: 10.3389/fimmu.2022.1059133] [Citation(s) in RCA: 15] [Impact Index Per Article: 7.5] [Received: 09/30/2022] [Accepted: 10/14/2022] [Indexed: 11/13/2022]
Abstract
EBV is a prevalent virus, infecting >90% of the world's population. This is an oncogenic virus that causes ~200,000 cancer-related deaths annually. It is, in addition, a significant contributor to the burden of autoimmune diseases. Thus, EBV represents a significant public health burden. Upon infection, EBV remains dormant in host cells for long periods of time. However, the presence or episodic reactivation of the virus increases the risk of transforming healthy cells to malignant cells that routinely escape host immune surveillance or of producing pathogenic autoantibodies. Cancers caused by EBV display distinct molecular behaviors compared to those of the same tissue type that are not caused by EBV, presenting opportunities for targeted treatments. Despite some encouraging results from exploration of vaccines, antiviral agents and immune- and cell-based treatments, the efficacy and safety of most therapeutics remain unclear. Here, we provide an up-to-date review focusing on underlying immune and environmental mechanisms, current therapeutics and vaccines, animal models and emerging technologies to study EBV-associated diseases that may help provide insights for the development of novel effective treatments.
Affiliation(s)
- Srishti Chakravorty
  - Department of Biochemistry, Purdue University, West Lafayette, IN, United States
- Behdad Afzali
  - Immunoregulation Section, Kidney Diseases Branch, National Institute of Diabetes and Digestive and Kidney Diseases (NIDDK), National Institutes of Health (NIH), Bethesda, MD, United States
- Majid Kazemian
  - Department of Biochemistry, Purdue University, West Lafayette, IN, United States
  - Department of Computer Science, Purdue University, West Lafayette, IN, United States
4
Li D, Huang Q, Huang L, Wen J, Luo J, Li Q, Peng Y, Zhang Y. Baiting out a full length sequence from unmapped RNA-seq data. BMC Genomics 2021; 22:857. [PMID: 34837950 PMCID: PMC8626966 DOI: 10.1186/s12864-021-08146-4] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 05/02/2021] [Accepted: 11/03/2021] [Indexed: 11/10/2022]
Abstract
BACKGROUND As a powerful tool, RNA-Seq has been widely used in various studies. Unmapped RNA-seq reads have usually been considered useless and have been discarded or ignored. RESULTS We develop a strategy for mining full-length sequences from unmapped reads by combining the design of specific reverse transcription primers with high-throughput sequencing. In this study, we salvage 36 unmapped reads from standard RNA-Seq data and randomly select one 149 bp read as a model. Specific reverse transcription primers are designed to amplify both of its ends, followed by next-generation sequencing. We then design a statistical model based on a power-law distribution to estimate its integrity and significance, and validate the result by Sanger sequencing. The result shows that the full-length sequence is 1556 bp, with insertion mutations in a microsatellite structure. CONCLUSION We believe this method would be a useful strategy for extracting sequence information from unmapped RNA-seq data. Further, it is an alternative way to obtain the full-length sequence of an unknown cDNA.
Affiliation(s)
- Dongwei Li
  - Animal Functional Genomics Group, Shenzhen Branch, Guangdong Laboratory for Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, 518120 China
  - Guangdong Provincial Key Laboratory of Protein Function and Regulation in Agricultural Organisms, College of Life Sciences, South China Agricultural University, Guangzhou, Guangdong 510642 China
- Qitong Huang
  - Animal Functional Genomics Group, Shenzhen Branch, Guangdong Laboratory for Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, 518120 China
  - Animal Breeding and Genomic, Wageningen University & Research, Wageningen, 6708PB, Netherlands
- Lei Huang
  - Animal Functional Genomics Group, Shenzhen Branch, Guangdong Laboratory for Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, 518120 China
- Jikai Wen
  - Guangdong Provincial Key Laboratory of Protein Function and Regulation in Agricultural Organisms, College of Life Sciences, South China Agricultural University, Guangzhou, Guangdong 510642 China
- Jing Luo
  - Animal Functional Genomics Group, Shenzhen Branch, Guangdong Laboratory for Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, 518120 China
- Qing Li
  - Animal Functional Genomics Group, Shenzhen Branch, Guangdong Laboratory for Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, 518120 China
- Yanling Peng
  - Animal Functional Genomics Group, Shenzhen Branch, Guangdong Laboratory for Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, 518120 China
- Yubo Zhang
  - Animal Functional Genomics Group, Shenzhen Branch, Guangdong Laboratory for Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, 518120 China
5
Chen S, Ren C, Zhai J, Yu J, Zhao X, Li Z, Zhang T, Ma W, Han Z, Ma C. CAFU: a Galaxy framework for exploring unmapped RNA-Seq data. Brief Bioinform 2021; 21:676-686. [PMID: 30815667 PMCID: PMC7299299 DOI: 10.1093/bib/bbz018] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 10/15/2018] [Revised: 01/23/2019] [Accepted: 01/27/2019] [Indexed: 12/13/2022]
Abstract
A widely used approach in transcriptome analysis is the alignment of short reads to a reference genome. However, owing to the deficiencies of specially designed analytical systems, short reads unmapped to the genome sequence are usually ignored, resulting in the loss of significant biological information and insights. To fill this gap, we present Comprehensive Assembly and Functional annotation of Unmapped RNA-Seq data (CAFU), a Galaxy-based framework that can facilitate the large-scale analysis of unmapped RNA sequencing (RNA-Seq) reads from single- and mixed-species samples. By taking advantage of machine learning techniques, CAFU addresses the issue of accurately identifying the species origin of transcripts assembled using unmapped reads from mixed-species samples. CAFU also represents an innovation in that it provides a comprehensive collection of functions required for transcript confidence evaluation, coding potential calculation, sequence and expression characterization and function annotation. These functions and their dependencies have been integrated into a Galaxy framework that provides access to CAFU via a user-friendly interface, dramatically simplifying complex exploration tasks involving unmapped RNA-Seq reads. CAFU has been validated with RNA-Seq data sets from wheat and Zea mays (maize) samples. CAFU is freely available via GitHub: https://github.com/cma2015/CAFU.
Affiliation(s)
- Siyuan Chen
  - State Key Laboratory of Crop Stress Biology for Arid Areas, Center of Bioinformatics, College of Life Sciences, Northwest Agriculture and Forestry University
- Chengzhi Ren
  - State Key Laboratory of Crop Stress Biology for Arid Areas, Center of Bioinformatics, College of Life Sciences, Northwest Agriculture and Forestry University
- Jingjing Zhai
  - State Key Laboratory of Crop Stress Biology for Arid Areas, Center of Bioinformatics, College of Life Sciences, Northwest Agriculture and Forestry University
- Jiantao Yu
  - College of Information Engineering, Northwest Agriculture and Forestry University
- Xuyang Zhao
  - College of Information Engineering, Northwest Agriculture and Forestry University
- Zelong Li
  - State Key Laboratory of Crop Stress Biology for Arid Areas, Center of Bioinformatics, College of Life Sciences, Northwest Agriculture and Forestry University
- Ting Zhang
  - State Key Laboratory of Crop Stress Biology for Arid Areas, Center of Bioinformatics, College of Life Sciences, Northwest Agriculture and Forestry University
- Wenlong Ma
  - State Key Laboratory of Crop Stress Biology for Arid Areas, Center of Bioinformatics, College of Life Sciences, Northwest Agriculture and Forestry University
- Zhaoxue Han
  - State Key Laboratory of Crop Stress Biology for Arid Areas, Center of Bioinformatics, College of Life Sciences, Northwest Agriculture and Forestry University
- Chuang Ma
  - State Key Laboratory of Crop Stress Biology for Arid Areas, Center of Bioinformatics, College of Life Sciences, Northwest Agriculture and Forestry University
6
Host-Virus Chimeric Events in SARS-CoV-2-Infected Cells Are Infrequent and Artifactual. J Virol 2021; 95:e0029421. [PMID: 33980601 DOI: 10.1128/jvi.00294-21] [Citation(s) in RCA: 21] [Impact Index Per Article: 7.0] [Indexed: 12/14/2022]
Abstract
The pathogenic mechanisms underlying severe SARS-CoV-2 (severe acute respiratory syndrome coronavirus 2) infection remain largely unelucidated. High-throughput sequencing technologies that capture genome and transcriptome information are key approaches to gain detailed mechanistic insights from infected cells. These techniques readily detect both pathogen- and host-derived sequences, providing a means of studying host-pathogen interactions. Recent studies have reported the presence of host-virus chimeric (HVC) RNA in transcriptome sequencing (RNA-seq) data from SARS-CoV-2-infected cells and interpreted these findings as evidence of viral integration in the human genome as a potential pathogenic mechanism. Since SARS-CoV-2 is a positive-sense RNA virus that replicates in the cytoplasm, it does not have a nuclear phase in its life cycle. Thus, it is biologically unlikely to be in a location where splicing events could result in genome integration. Therefore, we investigated the biological authenticity of HVC events. In contrast to true biological events like mRNA splicing and genome rearrangement events, which generate reproducible chimeric sequencing fragments across different biological isolates, we found that HVC events across >100 RNA-seq libraries from patients with coronavirus disease 2019 (COVID-19) and infected cell lines were highly irreproducible. RNA-seq library preparation is inherently error prone due to random template switching during reverse transcription of RNA to cDNA. By counting chimeric events observed when constructing an RNA-seq library from human RNA and spiked-in RNA from an unrelated species, such as the fruit fly, we estimated that ∼1% of RNA-seq reads are artifactually chimeric. In SARS-CoV-2 RNA-seq, we found that the frequency of HVC events was, in fact, not greater than this background "noise." Finally, we developed a novel experimental approach to enrich SARS-CoV-2 sequences from bulk RNA of infected cells. 
This method enriched viral sequences but did not enrich HVC events, suggesting that the majority of HVC events are, in all likelihood, artifacts of library construction. In conclusion, our findings indicate that HVC events observed in RNA-sequencing libraries from SARS-CoV-2-infected cells are extremely rare and are likely artifacts arising from random template switching of reverse transcriptase and/or sequence alignment errors. Therefore, the observed HVC events do not support SARS-CoV-2 fusion to cellular genes and/or integration into human genomes. IMPORTANCE The pathogenic mechanisms underlying SARS-CoV-2, the virus responsible for COVID-19, are not fully understood. In particular, relatively little is known about the reasons some individuals develop life-threatening or persistent COVID-19. Recent studies identified host-virus chimeric (HVC) reads in RNA-sequencing data from SARS-CoV-2-infected cells and suggested that HVC events support potential "human genome invasion" and "integration" by SARS-CoV-2. This suggestion has fueled concerns about the long-term effects of current mRNA vaccines that incorporate elements of the viral genome. SARS-CoV-2 is a positive-sense, single-stranded RNA virus that does not encode a reverse transcriptase and does not include a nuclear phase in its life cycle, so some doubts have rightfully been expressed regarding the authenticity of HVCs and the role played by endogenous retrotransposons in this phenomenon. Thus, it is important to independently authenticate these HVC events. Here, we provide several lines of evidence suggesting that the observed HVC events are likely artifactual.
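The comparison against a background chimera rate can be sketched with simple arithmetic. The example below is only an illustration under stated assumptions: the read counts are hypothetical, the 1% default is the approximate background figure quoted in the abstract, and the one-sided z-test with a normal approximation is a swapped-in technique, not necessarily the statistical procedure the authors used.

```python
import math

def chimeric_rate(chimeric_reads, total_reads):
    """Fraction of reads flagged as host-virus chimeric."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return chimeric_reads / total_reads

def exceeds_background(chimeric_reads, total_reads, background=0.01, z_cutoff=1.645):
    """One-sided z-test (normal approximation): is the observed rate above background?"""
    rate = chimeric_rate(chimeric_reads, total_reads)
    standard_error = math.sqrt(background * (1 - background) / total_reads)
    return (rate - background) / standard_error > z_cutoff

# Hypothetical library: 1,000,000 reads, 50 of them host-virus chimeric.
print(chimeric_rate(50, 1_000_000))
print(exceeds_background(50, 1_000_000))
```

Under these assumed counts the observed rate sits far below the background rate, which is the pattern the abstract reports for host-virus chimeric events.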
7
Host-virus chimeric events in SARS-CoV2 infected cells are infrequent and artifactual. bioRxiv 2021. [PMID: 33619483 PMCID: PMC7899447 DOI: 10.1101/2021.02.17.431704] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 01/16/2023]
Abstract
Pathogenic mechanisms underlying severe SARS-CoV2 infection remain largely unelucidated. High throughput sequencing technologies that capture genome and transcriptome information are key approaches to gain detailed mechanistic insights from infected cells. These techniques readily detect both pathogen and host-derived sequences, providing a means of studying host-pathogen interactions. Recent studies have reported the presence of host-virus chimeric (HVC) RNA in RNA-seq data from SARS-CoV2 infected cells and interpreted these findings as evidence of viral integration in the human genome as a potential pathogenic mechanism. Since SARS-CoV2 is a positive sense RNA virus that replicates in the cytoplasm it does not have a nuclear phase in its life cycle, it is biologically unlikely to be in a location where splicing events could result in genome integration. Here, we investigated the biological authenticity of HVC events. In contrast to true biological events such as mRNA splicing and genome rearrangement events, which generate reproducible chimeric sequencing fragments across different biological isolates, we found that HVC events across >100 RNA-seq libraries from patients with COVID-19 and infected cell lines, were highly irreproducible. RNA-seq library preparation is inherently error-prone due to random template switching during reverse transcription of RNA to cDNA. By counting chimeric events observed when constructing an RNA-seq library from human RNA and spike-in RNA from an unrelated species, such as fruit-fly, we estimated that ~1% of RNA-seq reads are artifactually chimeric. In SARS-CoV2 RNA-seq we found that the frequency of HVC events was, in fact, not greater than this background “noise”. Finally, we developed a novel experimental approach to enrich SARS-CoV2 sequences from bulk RNA of infected cells. 
This method enriched viral sequences but did not enrich for HVC events, suggesting that the majority of HVC events are, in all likelihood, artifacts of library construction. In conclusion, our findings indicate that HVC events observed in RNA-sequencing libraries from SARS-CoV2 infected cells are extremely rare and are likely artifacts arising from either random template switching of reverse-transcriptase and/or sequence alignment errors. Therefore, the observed HVC events do not support SARS-CoV2 fusion to cellular genes and/or integration into human genomes.
8
A genomic-clinicopathologic nomogram for predicting overall survival of hepatocellular carcinoma. BMC Cancer 2020; 20:1176. [PMID: 33261584 PMCID: PMC7709450 DOI: 10.1186/s12885-020-07688-2] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 06/22/2020] [Accepted: 11/25/2020] [Indexed: 12/13/2022]
Abstract
Background Hepatocellular carcinoma (HCC) is a common digestive tumor with great heterogeneity and variable overall survival (OS) time, which makes selecting the optimal treatment difficult. Here we aim to establish a nomogram to predict OS in HCC patients. Methods The International Cancer Genome Consortium (ICGC) database was searched for the target information in our study. Lasso regression and univariate and multivariate Cox analyses were applied during the analysis, and a nomogram integrating the model score and clinical characteristics was drawn. Results Six mRNAs were screened out by Lasso regression to build a model for predicting the OS of HCC patients. This model was shown to be an independent prognostic model for OS in HCC patients. The area under the ROC curve (AUC) of this model was 0.803. The TCGA database validated the significant value of this 6-mRNA model. Finally, a nomogram including the 6-mRNA risk score, gender, age, tumor stage and prior malignancy was set up to predict OS in HCC patients. Conclusions We established an independent prognostic model for predicting OS at 1–3 years in HCC patients, which is available to all populations. We also developed a nomogram on the basis of this model, which could be of great help in selecting precise, individualized treatment measures.
9
Qiu Z, Chen S, Qi Y, Liu C, Zhai J, Xie S, Ma C. Exploring transcriptional switches from pairwise, temporal and population RNA-Seq data using deepTS. Brief Bioinform 2020; 22:5877690. [PMID: 32728687 DOI: 10.1093/bib/bbaa137] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 03/02/2020] [Revised: 05/25/2020] [Accepted: 06/05/2020] [Indexed: 12/11/2022]
Abstract
Transcriptional switch (TS) is a widely observed phenomenon caused by changes in the relative expression of transcripts from the same gene, in spatial, temporal or other dimensions. TS has been associated with human diseases, plant development and stress responses. Its investigation is often hampered by a lack of suitable tools allowing comprehensive and flexible TS analysis for high-throughput RNA sequencing (RNA-Seq) data. Here, we present deepTS, a user-friendly web-based implementation that enables a fully interactive, multifunctional identification, visualization and analysis of TS events for large-scale RNA-Seq datasets from pairwise, temporal and population experiments. deepTS offers rich functionality to streamline RNA-Seq-based TS analysis for both model and non-model organisms and for those with or without reference transcriptome. The presented case studies highlight the capabilities of deepTS and demonstrate its potential for the transcriptome-wide TS analysis of pairwise, temporal and population RNA-Seq data. We believe deepTS will help research groups, regardless of their informatics expertise, perform accessible, reproducible and collaborative TS analyses of large-scale RNA-Seq data.
Affiliation(s)
- Chuang Ma
  - Bioinformatics Laboratory at Northwest A&F University
10
Nia AM, Khanipov K, Barnette BL, Ullrich RL, Golovko G, Emmett MR. Comparative RNA-Seq transcriptome analyses reveal dynamic time-dependent effects of 56Fe, 16O, and 28Si irradiation on the induction of murine hepatocellular carcinoma. BMC Genomics 2020; 21:453. [PMID: 32611366 PMCID: PMC7329445 DOI: 10.1186/s12864-020-06869-4] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 03/09/2020] [Accepted: 06/24/2020] [Indexed: 01/04/2023]
Abstract
Background One of the health risks posed to astronauts during deep space flights is exposure to high charge, high-energy (HZE) ions (Z > 13), which can lead to the induction of hepatocellular carcinoma (HCC). However, little is known on the molecular mechanisms of HZE irradiation-induced HCC. Results We performed comparative RNA-Seq transcriptomic analyses to assess the carcinogenic effects of 600 MeV/n 56Fe (0.2 Gy), 1 GeV/n 16O (0.2 Gy), and 350 MeV/n 28Si (0.2 Gy) ions in a mouse model for irradiation-induced HCC. C3H/HeNCrl mice were subjected to total body irradiation to simulate space environment HZE-irradiation, and liver tissues were extracted at five different time points post-irradiation to investigate the time-dependent carcinogenic response at the transcriptomic level. Our data demonstrated a clear difference in the biological effects of these HZE ions, particularly immunological, such as Acute Phase Response Signaling, B Cell Receptor Signaling, IL-8 Signaling, and ROS Production in Macrophages. Also seen in this study were novel unannotated transcripts that were significantly affected by HZE. To investigate the biological functions of these novel transcripts, we used a machine learning technique known as self-organizing maps (SOMs) to characterize the transcriptome expression profiles of 60 samples (45 HZE-irradiated, 15 non-irradiated control) from liver tissues. A handful of localized modules in the maps emerged as groups of co-regulated and co-expressed transcripts. The functional context of these modules was discovered using overrepresentation analysis. We found that these spots typically contained enriched populations of transcripts related to specific immunological molecular processes (e.g., Acute Phase Response Signaling, B Cell Receptor Signaling, IL-3 Signaling), and RNA Transcription/Expression. Conclusions A large number of transcripts were found differentially expressed post-HZE irradiation. 
These results provide valuable information for uncovering the differences in molecular mechanisms underlying HZE specific induced HCC carcinogenesis. Additionally, a handful of novel differentially expressed unannotated transcripts were discovered for each HZE ion. Taken together, these findings may provide a better understanding of biological mechanisms underlying risks for HCC after HZE irradiation and may also have important implications for the discovery of potential countermeasures against and identification of biomarkers for HZE-induced HCC.
Collapse
Affiliation(s)
- Anna M Nia
- Biochemistry and Molecular Biology, University of Texas Medical Branch, 301 University Blvd, Galveston, TX, 77550, USA
| | - Kamil Khanipov
- Pharmacology and Toxicology, University of Texas Medical Branch, 301 University Blvd, Galveston, TX, 77550, USA
| | - Brooke L Barnette
- Biochemistry and Molecular Biology, University of Texas Medical Branch, 301 University Blvd, Galveston, TX, 77550, USA
| | - Robert L Ullrich
- The Radiation Effects Research Foundation (RERF), Hiroshima, Japan
| | - George Golovko
- Pharmacology and Toxicology, University of Texas Medical Branch, 301 University Blvd, Galveston, TX, 77550, USA
| | - Mark R Emmett
- Biochemistry and Molecular Biology, University of Texas Medical Branch, 301 University Blvd, Galveston, TX, 77550, USA; Pharmacology and Toxicology, University of Texas Medical Branch, 301 University Blvd, Galveston, TX, 77550, USA
|
11
|
Jaiswal S, Jadhav PV, Jasrotia RS, Kale PB, Kad SK, Moharil MP, Dudhare MS, Kheni J, Deshmukh AG, Mane SS, Nandanwar RS, Penna S, Manjaya JG, Iquebal MA, Tomar RS, Kawar PG, Rai A, Kumar D. Transcriptomic signature reveals mechanism of flower bud distortion in witches'-broom disease of soybean (Glycine max). BMC Plant Biology 2019; 19:26. [PMID: 30646861 PMCID: PMC6332543 DOI: 10.1186/s12870-018-1601-1] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 06/04/2018] [Accepted: 12/12/2018] [Indexed: 05/10/2023]
Abstract
BACKGROUND Soybean (Glycine max L. Merrill) is a major source of edible oil and protein for humans and animals, in addition to its various industrial uses, including biofuels. Phytoplasma-induced floral bud distortion syndrome (FBD), also known as witches' broom syndrome (WBS), has been one of the major biotic stresses adversely affecting its productivity. A transcriptomic approach can be used to uncover the key morpho-physiological pathways through which this disease manifests. RESULTS We report a transcriptomic study of FBD in soybean using Illumina HiSeq NGS data, revealing 17,454 differentially expressed genes, 5561 transcription factors, 139 pathways and 176,029 putative genic-region markers (simple sequence repeats, single nucleotide polymorphisms and insertions/deletions). The roles of PmbA, Zn-dependent protease, the SAP family and the auxin-responsive system are described, revealing a mechanism of flower bud distortion that involves abnormalities in pollen and stigma development. Ten randomly selected genes were validated by qPCR. Our findings describe the basic mechanism of FBD, from the host plant sensing phytoplasma infection and triggering molecular signalling, through mobilization of carbohydrate and protein, phyllody and abnormal pollen development, to improved insect colonization of host plants, which spreads the disease. The study reveals how phytoplasma hijacks the metabolic machinery of soybean to manifest FBD. CONCLUSIONS This is the first report of a transcriptomic signature of FBD or WBS in soybean, revealing morphological and metabolic changes that attract insects and thereby spread the disease. All the putative genic-region markers may be used as a genomic resource for variety improvement and for the development of new agrochemicals for disease control to enhance soybean productivity.
Affiliation(s)
- Sarika Jaiswal
- Centre for Agricultural Bioinformatics, ICAR-Indian Agricultural Statistics Research Institute, Library Avenue, PUSA, New Delhi, 110012 India
| | - Pravin V. Jadhav
- Post Graduate Institute, Dr. Panjabrao Deshmukh Krishi Vidyapeeth, Akola, Maharashtra, 444104 India
| | - Rahul Singh Jasrotia
- Centre for Agricultural Bioinformatics, ICAR-Indian Agricultural Statistics Research Institute, Library Avenue, PUSA, New Delhi, 110012 India
| | - Prashant B. Kale
- National Research Centre on Plant Biotechnology, LBS Centre, PUSA Campus, New Delhi, 110012 India
| | - Snehal K. Kad
- Post Graduate Institute, Dr. Panjabrao Deshmukh Krishi Vidyapeeth, Akola, Maharashtra, 444104 India
| | - Mangesh P. Moharil
- Post Graduate Institute, Dr. Panjabrao Deshmukh Krishi Vidyapeeth, Akola, Maharashtra, 444104 India
| | - Mahendra S. Dudhare
- Post Graduate Institute, Dr. Panjabrao Deshmukh Krishi Vidyapeeth, Akola, Maharashtra, 444104 India
| | - Jashminkumar Kheni
- Department of Biotechnology, Junagadh Agricultural University, Junagadh, Gujarat India
| | - Amit G. Deshmukh
- Post Graduate Institute, Dr. Panjabrao Deshmukh Krishi Vidyapeeth, Akola, Maharashtra, 444104 India
| | - Shyamsundar S. Mane
- Post Graduate Institute, Dr. Panjabrao Deshmukh Krishi Vidyapeeth, Akola, Maharashtra, 444104 India
| | - Ravindra S. Nandanwar
- Post Graduate Institute, Dr. Panjabrao Deshmukh Krishi Vidyapeeth, Akola, Maharashtra, 444104 India
| | - Suprasanna Penna
- Nuclear Agriculture and Biotechnology Division, Homi Bhabha National Institute, Bhabha Atomic Research Centre (BARC), Trombay, Mumbai, 400 085 India
| | - Joy G. Manjaya
- Nuclear Agriculture and Biotechnology Division, Homi Bhabha National Institute, Bhabha Atomic Research Centre (BARC), Trombay, Mumbai, 400 085 India
| | - Mir Asif Iquebal
- Centre for Agricultural Bioinformatics, ICAR-Indian Agricultural Statistics Research Institute, Library Avenue, PUSA, New Delhi, 110012 India
| | - Rukam Singh Tomar
- Department of Biotechnology, Junagadh Agricultural University, Junagadh, Gujarat India
| | - Prashant G. Kawar
- ICAR- Directorate of Floricultural Research, College of Agriculture, Pune, Maharashtra, 411 005, India
| | - Anil Rai
- Centre for Agricultural Bioinformatics, ICAR-Indian Agricultural Statistics Research Institute, Library Avenue, PUSA, New Delhi, 110012 India
| | - Dinesh Kumar
- Centre for Agricultural Bioinformatics, ICAR-Indian Agricultural Statistics Research Institute, Library Avenue, PUSA, New Delhi, 110012 India
|
12
|
Iquebal MA, Soren KR, Gangwar P, Shanmugavadivel PS, Aravind K, Singla D, Jaiswal S, Jasrotia RS, Chaturvedi SK, Singh NP, Varshney RK, Rai A, Kumar D. Discovery of Putative Herbicide Resistance Genes and Its Regulatory Network in Chickpea Using Transcriptome Sequencing. Frontiers in Plant Science 2017; 8:958. [PMID: 28638398 PMCID: PMC5461349 DOI: 10.3389/fpls.2017.00958] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.7] [Received: 12/23/2016] [Accepted: 05/22/2017] [Indexed: 05/06/2023]
Abstract
Background: Chickpea (Cicer arietinum L.) contributes 75% of total pulse production. Being cheaper than animal protein, it is important to the dietary requirements of developing countries. Weeds not only compete with chickpea, resulting in drastic yield reduction, but also harbor fungi, bacterial diseases and insect pests. The chemical approach of discovering new herbicides is constrained by limited lead molecule options, statutory regulations and environmental clearance. In the genetic approach, transgenic herbicide-tolerant crops have given successful results but have raised serious concerns over ecological safety; thus, a non-transgenic approach such as marker-assisted selection is desirable. Since large variability in herbicide tolerance already exists among chickpea varieties, the genes conferring herbicide tolerance can be introgressed in variety improvement programmes. Transcriptome studies can discover such key genes associated with herbicide tolerance in chickpea. Results: This is the first transcriptomic study of chickpea, or of any legume crop, using two herbicide-susceptible and herbicide-tolerant genotypes exposed to imidazoline (Imazethapyr). Approximately 90 million paired-end reads generated from four samples were processed and assembled into 30,803 contigs using reference-based assembly. We report 6,310 differentially expressed genes (DEGs), of which 3,037 were regulated by 980 miRNAs, 1,528 transcription factors associated with 897 DEGs, 47 hub proteins, 3,540 putative Simple Sequence Repeat-Functional Domain Markers (SSR-FDM), 13,778 putative genic Single Nucleotide Polymorphism (SNP) markers and 1,174 indels. Twenty randomly selected DEGs were validated using qPCR. Pathway analysis suggested that the xenobiotic degradation-related gene glutathione S-transferase (GST) was up-regulated only in the presence of herbicide. Down-regulation of DNA replication genes and up-regulation of abscisic acid pathway genes were observed. The study further reveals the role of cytochrome P450, xyloglucan endotransglucosylase/hydrolase, glutamate dehydrogenase, methyl crotonoyl carboxylase and thaumatin-like genes in herbicide resistance. Conclusion: The reported DEGs can be used as a genomic resource for the future discovery of candidate genes associated with herbicide tolerance. The reported markers can be used in future association studies to develop marker-assisted selection (MAS) for refinement. These findings can be of immense use to chickpea variety development programmes in improving the productivity of chickpea germplasm.
Affiliation(s)
- Mir A. Iquebal
- Centre for Agricultural Bioinformatics, Indian Agricultural Statistics Research Institute (ICAR), New Delhi, India
| | - Khela R. Soren
- Division of Plant Biotechnology, Indian Institute of Pulses Research (ICAR), Kanpur, India
| | - Priyanka Gangwar
- Division of Plant Biotechnology, Indian Institute of Pulses Research (ICAR), Kanpur, India
| | - P. S. Shanmugavadivel
- Division of Plant Biotechnology, Indian Institute of Pulses Research (ICAR), Kanpur, India
| | - K. Aravind
- Division of Plant Biotechnology, Indian Institute of Pulses Research (ICAR), Kanpur, India
| | - Deepak Singla
- Centre for Agricultural Bioinformatics, Indian Agricultural Statistics Research Institute (ICAR), New Delhi, India
| | - Sarika Jaiswal
- Centre for Agricultural Bioinformatics, Indian Agricultural Statistics Research Institute (ICAR), New Delhi, India
| | - Rahul S. Jasrotia
- Centre for Agricultural Bioinformatics, Indian Agricultural Statistics Research Institute (ICAR), New Delhi, India
| | - Sushil K. Chaturvedi
- Division of Plant Biotechnology, Indian Institute of Pulses Research (ICAR), Kanpur, India
| | - Narendra P. Singh
- Division of Plant Biotechnology, Indian Institute of Pulses Research (ICAR), Kanpur, India
| | - Rajeev K. Varshney
- Genetic Gains, International Crops Research Institute for the Semi-Arid Tropics, Patancheru, India
| | - Anil Rai
- Centre for Agricultural Bioinformatics, Indian Agricultural Statistics Research Institute (ICAR), New Delhi, India
| | - Dinesh Kumar
- Centre for Agricultural Bioinformatics, Indian Agricultural Statistics Research Institute (ICAR), New Delhi, India
|
13
|
Ocklenburg S, Schmitz J, Moinfar Z, Moser D, Klose R, Lor S, Kunz G, Tegenthoff M, Faustmann P, Francks C, Epplen JT, Kumsta R, Güntürkün O. Epigenetic regulation of lateralized fetal spinal gene expression underlies hemispheric asymmetries. eLife 2017; 6. [PMID: 28145864 PMCID: PMC5295814 DOI: 10.7554/elife.22784] [Citation(s) in RCA: 65] [Impact Index Per Article: 9.3] [Received: 10/29/2016] [Accepted: 01/31/2017] [Indexed: 12/11/2022]
Abstract
Lateralization is a fundamental principle of nervous system organization but its molecular determinants are mostly unknown. In humans, asymmetric gene expression in the fetal cortex has been suggested as the molecular basis of handedness. However, human fetuses already show considerable asymmetries in arm movements before the motor cortex is functionally linked to the spinal cord, making it more likely that spinal gene expression asymmetries form the molecular basis of handedness. We analyzed genome-wide mRNA expression and DNA methylation in cervical and anterior thoracal spinal cord segments of five human fetuses and show development-dependent gene expression asymmetries. These gene expression asymmetries were epigenetically regulated by miRNA expression asymmetries in the TGF-β signaling pathway and lateralized methylation of CpG islands. Our findings suggest that molecular mechanisms for epigenetic regulation within the spinal cord constitute the starting point for handedness, implying a fundamental shift in our understanding of the ontogenesis of hemispheric asymmetries in humans. DOI:http://dx.doi.org/10.7554/eLife.22784.001
Affiliation(s)
- Sebastian Ocklenburg
- Institute of Cognitive Neuroscience, Department Biopsychology, Ruhr University Bochum, Bochum, Germany
| | - Judith Schmitz
- Institute of Cognitive Neuroscience, Department Biopsychology, Ruhr University Bochum, Bochum, Germany
| | - Zahra Moinfar
- Department of Neuroanatomy and Molecular Brain Research, Ruhr University Bochum, Bochum, Germany
| | - Dirk Moser
- Department of Genetic Psychology, Ruhr University Bochum, Bochum, Germany
| | - Rena Klose
- Institute of Cognitive Neuroscience, Department Biopsychology, Ruhr University Bochum, Bochum, Germany
| | - Stephanie Lor
- Institute of Cognitive Neuroscience, Department Biopsychology, Ruhr University Bochum, Bochum, Germany
| | - Georg Kunz
- Department of Obstetrics and Gynecology, St. Johannes Hospital, Dortmund, Germany
| | - Martin Tegenthoff
- Department of Neurology, University Hospital Bergmannsheil, Bochum, Germany
| | - Pedro Faustmann
- Department of Neuroanatomy and Molecular Brain Research, Ruhr University Bochum, Bochum, Germany
| | - Clyde Francks
- Language and Genetics Department, Max Planck Institute for Psycholinguistics, Nijmegen, Netherlands; Donders Institute for Brain, Cognition and Behaviour, Radboud University, Nijmegen, Netherlands
| | - Jörg T Epplen
- Department of Human Genetics, Ruhr University Bochum, Bochum, Germany
| | - Robert Kumsta
- Department of Genetic Psychology, Ruhr University Bochum, Bochum, Germany
| | - Onur Güntürkün
- Institute of Cognitive Neuroscience, Department Biopsychology, Ruhr University Bochum, Bochum, Germany; Stellenbosch Institute for Advanced Study (STIAS), Wallenberg Research Centre at Stellenbosch University, Stellenbosch, South Africa
|