1
Zhu Z, Chen Y, Qin X, Liu S, Wang J, Ren H. Multidimensional landscape of non-alcoholic fatty liver disease-related disease spectrum uncovered by big omics data: Profiling evidence and new perspectives. SMART MEDICINE 2023; 2:e20220029. [PMID: 39188279 PMCID: PMC11236021 DOI: 10.1002/smmd.20220029] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 11/09/2022] [Accepted: 02/22/2023] [Indexed: 08/28/2024]
Abstract
Characterized by hepatic lipid accumulation, non-alcoholic fatty liver disease (NAFLD) is a multifactorial metabolic disorder that can progress to non-alcoholic steatohepatitis (NASH), cirrhosis, and hepatocellular carcinoma (HCC). Benefiting from recent advances in omics technologies such as high-throughput sequencing, voluminous profiling data on HCC have integrated molecular science into clinical medicine and given clinicians rational guidance for treatment. In this review, we summarize the majority of publicly available omics data on the NAFLD-related disease spectrum and offer new insights to inspire next-generation therapeutics against this increasingly prevalent disease spectrum in the post-genomic era.
Affiliation(s)
- Zhengyi Zhu
  - Department of Hepatobiliary Surgery, Affiliated Drum Tower Hospital, Medical School, Nanjing University, Nanjing, China
- Yuyan Chen
  - Department of Hepatobiliary Surgery, Affiliated Drum Tower Hospital, Medical School, Nanjing University, Nanjing, China
- Xueqian Qin
  - Department of Hepatobiliary Surgery, Affiliated Drum Tower Hospital, Medical School, Nanjing University, Nanjing, China
- Shujun Liu
  - Department of Hepatobiliary Surgery, Affiliated Drum Tower Hospital, Medical School, Nanjing University, Nanjing, China
- Jinglin Wang
  - Department of Hepatobiliary Surgery, Affiliated Drum Tower Hospital, Medical School, Nanjing University, Nanjing, China
- Haozhen Ren
  - Department of Hepatobiliary Surgery, Affiliated Drum Tower Hospital, Medical School, Nanjing University, Nanjing, China
2
Juan H, Huang H. Quantitative analysis of high‐throughput biological data. WIRES COMPUTATIONAL MOLECULAR SCIENCE 2023. [DOI: 10.1002/wcms.1658] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/04/2023]
Affiliation(s)
- Hsueh‐Fen Juan
  - Department of Life Science, Institute of Biomedical Electronics and Bioinformatics, and Center for Systems Biology, National Taiwan University, Taipei, Taiwan
  - Taiwan AI Labs, Taipei, Taiwan
- Hsuan‐Cheng Huang
  - Institute of Biomedical Informatics, National Yang Ming Chiao Tung University, Taipei, Taiwan
3
Ko G, Kim PG, Cho Y, Jeong S, Kim JY, Kim KH, Lee HY, Han J, Yu N, Ham S, Jang I, Kang B, Shin S, Kim L, Lee SW, Nam D, Kim JF, Kim N, Kim SY, Lee S, Roh TY, Lee B. Bioinformatics services for analyzing massive genomic datasets. Genomics Inform 2020; 18:e8. [PMID: 32224841 PMCID: PMC7120352 DOI: 10.5808/gi.2020.18.1.e8] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 01/09/2020] [Accepted: 03/11/2020] [Indexed: 11/25/2022] Open
Abstract
The explosive growth of next-generation sequencing data has resulted in ultra-large-scale datasets and ensuing computational problems. In Korea, the amount of genomic data has been increasing rapidly in recent years. Leveraging these big data requires researchers to use large-scale computational resources and analysis pipelines. A promising solution for addressing this computational challenge is cloud computing, where CPUs, memory, storage, and programs are accessible in the form of virtual machines. Here, we present a cloud computing-based system, Bio-Express, that provides user-friendly, cost-effective analysis of massive genomic datasets. Bio-Express is loaded with predefined multi-omics data analysis pipelines, which are divided into genome, transcriptome, epigenome, and metagenome pipelines. Users can employ predefined pipelines or create a new pipeline for analyzing their own omics data. We also developed several web-based services for facilitating downstream analysis of genome data. The Bio-Express web service is freely available at https://www.bioexpress.re.kr/.
Affiliation(s)
- Gunhwan Ko
  - Korea Bioinformation Center (KOBIC), KRIBB, Daejeon 34141, Korea
- Pan-Gyu Kim
  - Korea Bioinformation Center (KOBIC), KRIBB, Daejeon 34141, Korea
- Youngbum Cho
  - Genome Editing Research Center, KRIBB, Daejeon 34141, Korea
- Seongmun Jeong
  - Genome Editing Research Center, KRIBB, Daejeon 34141, Korea
- Jae-Yoon Kim
  - Genome Editing Research Center, KRIBB, Daejeon 34141, Korea
- Ho-Yeon Lee
  - Genome Editing Research Center, KRIBB, Daejeon 34141, Korea
- Jiyeon Han
  - Department of BioInformation Science, Ewha Womans University, Seoul 03760, Korea
- Namhee Yu
  - Department of BioInformation Science, Ewha Womans University, Seoul 03760, Korea
- Seokjin Ham
  - Department of Life Sciences and Division of Integrative Biosciences & Biotechnology, Pohang University of Science & Technology (POSTECH), Pohang 37673, Korea
- Insoon Jang
  - Department of Life Sciences and Division of Integrative Biosciences & Biotechnology, Pohang University of Science & Technology (POSTECH), Pohang 37673, Korea
- Byunghee Kang
  - Department of Life Sciences and Division of Integrative Biosciences & Biotechnology, Pohang University of Science & Technology (POSTECH), Pohang 37673, Korea
- Sunguk Shin
  - Department of Systems Biology, Division of Life Sciences, and Institute for Life Science and Biotechnology, Yonsei University, Seoul 03722, Korea
- Lian Kim
  - Bioposh Inc., Daejeon 34016, Korea
- Dougu Nam
  - School of Life Sciences, Ulsan National Institute of Science and Technology, Ulsan 44919, Korea
- Jihyun F Kim
  - Department of Systems Biology, Division of Life Sciences, and Institute for Life Science and Biotechnology, Yonsei University, Seoul 03722, Korea
  - Strategic Initiative for Microbiomes in Agriculture and Food, Yonsei University, Seoul 03722, Korea
- Namshin Kim
  - Genome Editing Research Center, KRIBB, Daejeon 34141, Korea
- Seon-Young Kim
  - Genome Structure Research Center, KRIBB, Daejeon 34141, Korea
- Sanghyuk Lee
  - Department of BioInformation Science, Ewha Womans University, Seoul 03760, Korea
- Tae-Young Roh
  - Department of Life Sciences and Division of Integrative Biosciences & Biotechnology, Pohang University of Science & Technology (POSTECH), Pohang 37673, Korea
  - SysGenLab Inc., Pohang 37613, Korea
- Byungwook Lee
  - Korea Bioinformation Center (KOBIC), KRIBB, Daejeon 34141, Korea
4
Chowdhury HA, Bhattacharyya DK, Kalita JK. Differential Expression Analysis of RNA-seq Reads: Overview, Taxonomy, and Tools. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2020; 17:566-586. [PMID: 30281477 DOI: 10.1109/tcbb.2018.2873010] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Indexed: 06/08/2023]
Abstract
Analysis of RNA-sequence (RNA-seq) data is widely used in transcriptomic studies and it has many applications. We review RNA-seq data analysis from RNA-seq reads to the results of differential expression analysis. In addition, we perform a descriptive comparison of tools used in each step of RNA-seq data analysis along with a discussion of important characteristics of these tools. A taxonomy of tools is also provided. A discussion of issues in quality control and visualization of RNA-seq data is also included along with useful tools. Finally, we provide some guidelines for the RNA-seq data analyst, along with research issues and challenges which should be addressed.
5
Singh A, Singh PK, Sharma AK, Singh NK, Sonah H, Deshmukh R, Sharma TR. Understanding the Role of the WRKY Gene Family under Stress Conditions in Pigeonpea (Cajanus cajan L.). PLANTS 2019; 8:plants8070214. [PMID: 31295921 PMCID: PMC6681228 DOI: 10.3390/plants8070214] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.0] [Received: 06/01/2019] [Revised: 06/27/2019] [Accepted: 06/29/2019] [Indexed: 12/26/2022]
Abstract
Pigeonpea (Cajanus cajan L.), a protein-rich legume, is a major food component of the daily diet for residents in semi-arid tropical regions of the world. Pigeonpea is also known for its high level of tolerance against biotic and abiotic stresses. In this regard, understanding the genes involved in stress tolerance has great importance. In the present study, identification and characterization of WRKY, a large transcription factor gene family involved in numerous biological processes such as seed germination, metabolism, plant growth, and biotic and abiotic stress responses, was performed in pigeonpea. A total of 94 WRKY genes identified in the pigeonpea genome were extensively characterized for gene structures, localizations, phylogenetic distribution, conserved motif organizations, and functional annotation. Phylogenetic analysis revealed three major groups (I, II, and III) of pigeonpea WRKY genes. Subsequently, expression profiling of 94 CcWRKY genes across different tissues, such as root, nodule, stem, petiole, petal, sepal, shoot apical meristem (SAM), mature pod, and mature seed, retrieved from the available RNA-seq data identified tissue-specific WRKY genes with preferential expression in the vegetative and reproductive stages. Gene co-expression networks identified four WRKY genes at the center of maximum interaction, which may play a key role in overall WRKY regulation. Furthermore, quantitative real-time polymerase chain reaction (qRT-PCR) expression analysis of WRKY genes in root and leaf tissue samples from plants under drought and salinity stress identified differentially expressed WRKY genes. The study will be helpful for understanding the evolution, regulation, and distribution of the WRKY gene family, and for further exploration toward the development of stress-tolerant cultivars in pigeonpea and other legume crops.
Affiliation(s)
- Akshay Singh
  - National Agri-Food Biotechnology Institute, Mohali, Punjab 140306, India
  - Dr. A. P. J. Abdul Kalam Technical University, Lucknow, Uttar Pradesh 226031, India
- Ajay Kumar Sharma
  - Meerut Institute of Engineering and Technology, Meerut, Uttar Pradesh 250005, India
- Humira Sonah
  - National Agri-Food Biotechnology Institute, Mohali, Punjab 140306, India
- Rupesh Deshmukh
  - National Agri-Food Biotechnology Institute, Mohali, Punjab 140306, India
- Tilak Raj Sharma
  - National Agri-Food Biotechnology Institute, Mohali, Punjab 140306, India
6
Su S, Hou Z, Wang L, Liu D, Hu J, Xu J, Tao J. Further confirmation of second- and third-generation Eimeria necatrix merozoite DEGs using suppression subtractive hybridization. Parasitol Res 2019; 118:1159-1169. [PMID: 30747293 DOI: 10.1007/s00436-019-06242-9] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 08/18/2018] [Accepted: 01/31/2019] [Indexed: 11/28/2022]
Abstract
In our previous study, we obtained a large number of differentially expressed genes (DEGs) between second-generation merozoites (MZ-2) and third-generation merozoites (MZ-3) of Eimeria necatrix using RNA sequencing (RNA-seq). Here, we report two subtractive cDNA libraries for MZ2 (forward library) and MZ3 (reverse library) that were constructed using suppression subtractive hybridization (SSH). PCR amplification revealed that the MZ2 and MZ3 libraries contained approximately 96.7% and 95% recombinant clones, respectively, and the length of the inserted fragments ranged from 0.5 to 1.5 kb. A total of 106 and 111 unique sequences were obtained from the MZ2 and MZ3 libraries, respectively, and were assembled into 13 specific consensus sequences (contigs or genes) (5 from MZ2 and 8 from MZ3). The qRT-PCR results revealed that 11 out of 13 genes were differentially expressed between MZ-2 and MZ-3. Of the 13 genes, 11 were found in both the SSH and our RNA-seq data and displayed a similar expression trend in the two datasets, and the remaining 2 genes have been reported in neither the E. necatrix genome nor our RNA-seq data. Among the 11 genes, the expression trends of 8 genes were highly consistent between SSH and our RNA-seq data. These DEGs may provide specialized functions related to the life-cycle transitions of Eimeria species.
Affiliation(s)
- Shijie Su
  - College of Veterinary Medicine, Yangzhou University, Yangzhou, 225009, People's Republic of China
  - Jiangsu Co-innovation Center for Prevention and Control of Important Animal Infectious Diseases and Zoonoses, Yangzhou University, Yangzhou, 225009, People's Republic of China
  - Jiangsu Key Laboratory of Zoonosis, Yangzhou University, Yangzhou, 225009, People's Republic of China
  - Joint International Research Laboratory of Agriculture and Agri-Product Safety, the Ministry of Education of China, Yangzhou University, Yangzhou, 225009, People's Republic of China
- Zhaofeng Hou
  - College of Veterinary Medicine, Yangzhou University, Yangzhou, 225009, People's Republic of China
  - Jiangsu Co-innovation Center for Prevention and Control of Important Animal Infectious Diseases and Zoonoses, Yangzhou University, Yangzhou, 225009, People's Republic of China
  - Jiangsu Key Laboratory of Zoonosis, Yangzhou University, Yangzhou, 225009, People's Republic of China
  - Joint International Research Laboratory of Agriculture and Agri-Product Safety, the Ministry of Education of China, Yangzhou University, Yangzhou, 225009, People's Republic of China
- Lele Wang
  - College of Veterinary Medicine, Yangzhou University, Yangzhou, 225009, People's Republic of China
  - Jiangsu Co-innovation Center for Prevention and Control of Important Animal Infectious Diseases and Zoonoses, Yangzhou University, Yangzhou, 225009, People's Republic of China
  - Jiangsu Key Laboratory of Zoonosis, Yangzhou University, Yangzhou, 225009, People's Republic of China
  - Joint International Research Laboratory of Agriculture and Agri-Product Safety, the Ministry of Education of China, Yangzhou University, Yangzhou, 225009, People's Republic of China
- Dandan Liu
  - College of Veterinary Medicine, Yangzhou University, Yangzhou, 225009, People's Republic of China
  - Jiangsu Co-innovation Center for Prevention and Control of Important Animal Infectious Diseases and Zoonoses, Yangzhou University, Yangzhou, 225009, People's Republic of China
  - Jiangsu Key Laboratory of Zoonosis, Yangzhou University, Yangzhou, 225009, People's Republic of China
  - Joint International Research Laboratory of Agriculture and Agri-Product Safety, the Ministry of Education of China, Yangzhou University, Yangzhou, 225009, People's Republic of China
- Junjie Hu
  - Biology Department, Yunnan University, Kunming, 650500, People's Republic of China
- Jinjun Xu
  - College of Veterinary Medicine, Yangzhou University, Yangzhou, 225009, People's Republic of China
  - Jiangsu Co-innovation Center for Prevention and Control of Important Animal Infectious Diseases and Zoonoses, Yangzhou University, Yangzhou, 225009, People's Republic of China
  - Jiangsu Key Laboratory of Zoonosis, Yangzhou University, Yangzhou, 225009, People's Republic of China
  - Joint International Research Laboratory of Agriculture and Agri-Product Safety, the Ministry of Education of China, Yangzhou University, Yangzhou, 225009, People's Republic of China
- Jianping Tao
  - College of Veterinary Medicine, Yangzhou University, Yangzhou, 225009, People's Republic of China
  - Jiangsu Co-innovation Center for Prevention and Control of Important Animal Infectious Diseases and Zoonoses, Yangzhou University, Yangzhou, 225009, People's Republic of China
  - Jiangsu Key Laboratory of Zoonosis, Yangzhou University, Yangzhou, 225009, People's Republic of China
  - Joint International Research Laboratory of Agriculture and Agri-Product Safety, the Ministry of Education of China, Yangzhou University, Yangzhou, 225009, People's Republic of China
7
Soneson C, Love MI, Patro R, Hussain S, Malhotra D, Robinson MD. A junction coverage compatibility score to quantify the reliability of transcript abundance estimates and annotation catalogs. Life Sci Alliance 2019; 2:2/1/e201800175. [PMID: 30655364 PMCID: PMC6337739 DOI: 10.26508/lsa.201800175] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.8] [Received: 08/24/2018] [Revised: 01/07/2019] [Accepted: 01/08/2019] [Indexed: 02/01/2023] Open
Abstract
Comparison of observed exon–exon junction counts to those predicted from estimated transcript abundances can identify genes with misannotated or misquantified transcripts. Most methods for statistical analysis of RNA-seq data take a matrix of abundance estimates for some type of genomic features as their input, and consequently the quality of any obtained results is directly dependent on the quality of these abundances. Here, we present the junction coverage compatibility score, which provides a way to evaluate the reliability of transcript-level abundance estimates and the accuracy of transcript annotation catalogs. It works by comparing the observed number of reads spanning each annotated splice junction in a genomic region to the predicted number of junction-spanning reads, inferred from the estimated transcript abundances and the genomic coordinates of the corresponding annotated transcripts. We show that although most genes show good agreement between the observed and predicted junction coverages, there is a small set of genes that do not. Genes with poor agreement are found regardless of the method used to estimate transcript abundances, and the corresponding transcript abundances should be treated with care in any downstream analyses.
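The comparison of observed and predicted junction coverage described in this abstract can be illustrated with a minimal Python sketch. The function name `junction_discrepancy`, the input dictionaries, and the scaling step are assumptions made for illustration; the published score may differ in details such as how multi-mapping reads are weighted, so this is a simplified stand-in rather than the authors' definition.

```python
def junction_discrepancy(observed, predicted):
    """Simplified per-gene disagreement between observed and predicted junction counts.

    observed:  dict mapping junction id -> observed junction-spanning read count
    predicted: dict mapping junction id -> count predicted from transcript abundances
    Returns None when no junction reads were observed, otherwise a non-negative float
    (0.0 means the predicted coverage profile matches the observed one exactly).
    """
    total_obs = sum(observed.values())
    if total_obs == 0:
        return None  # nothing to compare against
    total_pred = sum(predicted.get(j, 0.0) for j in observed)
    # Rescale predictions so both profiles have the same total (illustrative assumption).
    scale = total_obs / total_pred if total_pred > 0 else 0.0
    diff = sum(abs(predicted.get(j, 0.0) * scale - obs) for j, obs in observed.items())
    return diff / total_obs


# Hypothetical counts for one gene with three annotated junctions.
obs = {"j1": 100, "j2": 50, "j3": 50}
good_fit = junction_discrepancy(obs, {"j1": 50, "j2": 25, "j3": 25})
poor_fit = junction_discrepancy(obs, {"j1": 100, "j2": 100, "j3": 0})
```

Under this simplified definition, a gene whose predicted junction profile is proportional to the observed one scores zero, and larger values flag genes whose abundance estimates deserve extra caution.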
Affiliation(s)
- Charlotte Soneson
  - Institute of Molecular Life Sciences, University of Zurich, Zurich, Switzerland
  - SIB Swiss Institute of Bioinformatics, University of Zurich, Zurich, Switzerland
- Michael I Love
  - Department of Biostatistics, University of North Carolina-Chapel Hill, Chapel Hill, NC, USA
  - Department of Genetics, University of North Carolina-Chapel Hill, Chapel Hill, NC, USA
- Rob Patro
  - Department of Computer Science, Stony Brook University, NY, USA
- Shobbir Hussain
  - Department of Biology and Biochemistry, University of Bath, Bath, UK
- Dheeraj Malhotra
  - F. Hoffmann-La Roche Ltd, Pharma Research and Early Development, Neuroscience, Ophthalmology and Rare Diseases, Roche Innovation Center Basel, Basel, Switzerland
- Mark D Robinson
  - Institute of Molecular Life Sciences, University of Zurich, Zurich, Switzerland
  - SIB Swiss Institute of Bioinformatics, University of Zurich, Zurich, Switzerland
8
Event Analysis: Using Transcript Events To Improve Estimates of Abundance in RNA-seq Data. G3-GENES GENOMES GENETICS 2018; 8:2923-2940. [PMID: 30021829 PMCID: PMC6118309 DOI: 10.1534/g3.118.200373] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Indexed: 02/07/2023]
Abstract
Alternative splicing leverages genomic content by allowing the synthesis of multiple transcripts and, by implication, protein isoforms, from a single gene. However, estimating the abundance of transcripts produced in a given tissue from short sequencing reads is difficult and can result in both the construction of transcripts that do not exist and the failure to identify true transcripts. An alternative approach is to catalog the events that make up isoforms (splice junctions and exons). We present here the Event Analysis (EA) approach, where we project transcripts onto the genome and identify overlapping/unique regions and junctions. In addition, all possible logical junctions are assembled into a catalog. Transcripts are filtered before quantitation based on simple measures: the proportion of the events detected, and the coverage. We find that mapping to a junction catalog is more efficient at detecting novel junctions than mapping in a splice-aware manner. We identify 99.8% of true transcripts, while iReckon identifies 82% of the true transcripts and creates more transcripts not included in the simulation than were initially used in the simulation. Using PacBio Iso-seq data from a mouse neural progenitor cell model, EA detects 60% of the novel junctions that are combinations of existing exons, while only 43% are detected by STAR. EA further detects ∼5,000 annotated junctions missed by STAR. Filtering transcripts based on the proportion of the transcript detected and the number of reads on average supporting that transcript captures 95% of the PacBio transcriptome. Filtering the reference transcriptome before quantitation results in a more stable estimate of isoform abundance, with improved correlation between replicates. This was particularly evident when EA was applied to an RNA-seq study of type 1 diabetes (T1D), where the coefficient of variation among subjects (n = 81) in the transcript abundance estimates was substantially reduced compared to the estimation using the full reference. EA focuses on individual transcriptional events. These events can be quantitated and analyzed directly or used to identify the probable set of expressed transcripts. Simple rules based on detected events and coverage used in filtering result in a dramatic improvement in isoform estimation without the use of ancillary data (e.g., ChIP, long reads) that may not be available for many studies.
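The filtering rule mentioned in this abstract (keep a transcript only if enough of its events are detected and its coverage is sufficient) can be sketched in a few lines of Python. The function name `filter_transcripts`, the data layout, and both threshold values are hypothetical choices for illustration; the abstract does not state the cutoffs the authors used.

```python
from statistics import mean


def filter_transcripts(transcript_events, event_coverage,
                       min_prop=0.75, min_mean_cov=5.0):
    """Keep transcripts whose events are mostly detected and adequately covered.

    transcript_events: dict mapping transcript id -> list of event ids
                       (exonic regions and junctions making up the transcript)
    event_coverage:    dict mapping event id -> average read coverage
    min_prop, min_mean_cov: hypothetical thresholds, not values from the paper
    """
    kept = []
    for tx, events in transcript_events.items():
        covs = [event_coverage.get(e, 0.0) for e in events]
        if not covs:
            continue  # a transcript with no catalogued events cannot be assessed
        prop_detected = sum(c > 0 for c in covs) / len(covs)
        if prop_detected >= min_prop and mean(covs) >= min_mean_cov:
            kept.append(tx)
    return kept


# Hypothetical two-isoform gene: txB's private exon and junction have no reads.
transcript_events = {
    "txA": ["e1", "e2", "j12"],
    "txB": ["e1", "e3", "j13"],
}
event_coverage = {"e1": 30, "e2": 20, "j12": 10, "e3": 0, "j13": 0}
kept = filter_transcripts(transcript_events, event_coverage)
```

Quantifying against only the retained transcripts is the step the abstract credits with more stable isoform estimates; the sketch covers the filter alone, not the quantitation.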
9
Esin A, Bergendahl LT, Savolainen V, Marsh JA, Warnecke T. The genetic basis and evolution of red blood cell sickling in deer. Nat Ecol Evol 2018; 2:367-376. [PMID: 29255300 PMCID: PMC5777626 DOI: 10.1038/s41559-017-0420-3] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Received: 06/26/2017] [Accepted: 11/20/2017] [Indexed: 11/09/2022]
Abstract
Crescent-shaped red blood cells, the hallmark of sickle-cell disease, present a striking departure from the biconcave disc shape normally found in mammals. Characterized by increased mechanical fragility, sickled cells promote haemolytic anaemia and vaso-occlusions and contribute directly to disease in humans. Remarkably, a similar sickle-shaped morphology has been observed in erythrocytes from several deer species, without obvious pathological consequences. The genetic basis of erythrocyte sickling in deer, however, remains unknown. Here, we determine the sequences of human β-globin orthologues in 15 deer species and use protein structural modelling to identify a sickling mechanism distinct from the human disease, coordinated by a derived valine (E22V) that is unique to sickling deer. Evidence for long-term maintenance of a trans-species sickling/non-sickling polymorphism suggests that sickling in deer is adaptive. Our results have implications for understanding the ecological regimes and molecular architectures that have promoted convergent evolution of sickling erythrocytes across vertebrates.
Affiliation(s)
- Alexander Esin
  - Molecular Systems Group, Medical Research Council London Institute of Medical Sciences, Du Cane Road, London, United Kingdom
  - Institute of Clinical Sciences, Faculty of Medicine, Imperial College London, Du Cane Road, London, United Kingdom
- L Therese Bergendahl
  - Medical Research Council Human Genetics Unit, Institute of Genetics and Molecular Medicine, University of Edinburgh, Western General Hospital, Edinburgh, United Kingdom
- Vincent Savolainen
  - Department of Life Sciences, Silwood Park Campus, Imperial College London, Ascot, United Kingdom
  - University of Johannesburg, Auckland Park, Johannesburg, South Africa
- Joseph A Marsh
  - Medical Research Council Human Genetics Unit, Institute of Genetics and Molecular Medicine, University of Edinburgh, Western General Hospital, Edinburgh, United Kingdom
- Tobias Warnecke
  - Molecular Systems Group, Medical Research Council London Institute of Medical Sciences, Du Cane Road, London, United Kingdom
  - Institute of Clinical Sciences, Faculty of Medicine, Imperial College London, Du Cane Road, London, United Kingdom
10
Bray NL, Pimentel H, Melsted P, Pachter L. Near-optimal probabilistic RNA-seq quantification. Nat Biotechnol 2016. [PMID: 27043002 DOI: 10.1038/nbt.3519] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/09/2022]
Abstract
We present kallisto, an RNA-seq quantification program that is two orders of magnitude faster than previous approaches and achieves similar accuracy. Kallisto pseudoaligns reads to a reference, producing a list of transcripts that are compatible with each read while avoiding alignment of individual bases. We use kallisto to analyze 30 million unaligned paired-end RNA-seq reads in <10 min on a standard laptop computer. This removes a major computational bottleneck in RNA-seq analysis.
Affiliation(s)
- Nicolas L Bray
  - Innovative Genomics Initiative, University of California, Berkeley, California, USA
- Harold Pimentel
  - Department of Computer Science, University of California, Berkeley, California, USA
- Páll Melsted
  - Faculty of Industrial Engineering, Mechanical Engineering and Computer Science, University of Iceland, Reykjavik, Iceland
- Lior Pachter
  - Department of Computer Science, University of California, Berkeley, California, USA
  - Department of Mathematics, University of California, Berkeley, California, USA
  - Department of Molecular & Cell Biology, University of California, Berkeley, California, USA
11
Bray NL, Pimentel H, Melsted P, Pachter L. Near-optimal probabilistic RNA-seq quantification. Nat Biotechnol 2016; 34:525-7. [PMID: 27043002 DOI: 10.1038/nbt.3519] [Citation(s) in RCA: 5266] [Impact Index Per Article: 658.3] [Received: 10/15/2015] [Accepted: 02/25/2016] [Indexed: 12/18/2022]
Abstract
We present kallisto, an RNA-seq quantification program that is two orders of magnitude faster than previous approaches and achieves similar accuracy. Kallisto pseudoaligns reads to a reference, producing a list of transcripts that are compatible with each read while avoiding alignment of individual bases. We use kallisto to analyze 30 million unaligned paired-end RNA-seq reads in <10 min on a standard laptop computer. This removes a major computational bottleneck in RNA-seq analysis.
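The idea of pseudoalignment, finding the transcripts compatible with a read without aligning individual bases, can be illustrated with a toy Python sketch. This is not kallisto's implementation: it uses a plain k-mer dictionary as a stand-in index, and the function names, the tiny sequences, and the choice to skip unindexed k-mers are all assumptions made for illustration.

```python
from collections import defaultdict


def build_kmer_index(transcripts, k):
    """Map each k-mer to the set of transcript names that contain it."""
    index = defaultdict(set)
    for name, seq in transcripts.items():
        for i in range(len(seq) - k + 1):
            index[seq[i:i + k]].add(name)
    return index


def pseudoalign(read, index, k):
    """Return the set of transcripts compatible with all indexed k-mers of the read."""
    compatible = None
    for i in range(len(read) - k + 1):
        hits = index.get(read[i:i + k])
        if not hits:
            continue  # toy choice: ignore k-mers absent from the index
        # Intersect compatibility sets across the read's k-mers.
        compatible = set(hits) if compatible is None else compatible & hits
    return compatible or set()


# Two hypothetical transcripts sharing their first eight bases.
transcripts = {
    "tx1": "ACGTACGTTAGC",
    "tx2": "ACGTACGTCCGA",
}
index = build_kmer_index(transcripts, k=5)
shared = pseudoalign("ACGTACG", index, k=5)   # read from the shared region
unique = pseudoalign("ACGTTAGC", index, k=5)  # read from the tx1-specific region
```

A read from the shared region is compatible with both transcripts, while a read overlapping the divergent region is compatible with one only. Turning such compatibility sets into abundance estimates requires a further statistical step that this sketch does not cover.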
Affiliation(s)
- Nicolas L Bray
  - Innovative Genomics Initiative, University of California, Berkeley, California, USA
- Harold Pimentel
  - Department of Computer Science, University of California, Berkeley, California, USA
- Páll Melsted
  - Faculty of Industrial Engineering, Mechanical Engineering and Computer Science, University of Iceland, Reykjavik, Iceland
- Lior Pachter
  - Department of Computer Science, University of California, Berkeley, California, USA
  - Department of Mathematics, University of California, Berkeley, California, USA
  - Department of Molecular & Cell Biology, University of California, Berkeley, California, USA
12
A comparison of genetically matched cell lines reveals the equivalence of human iPSCs and ESCs. Nat Biotechnol 2015; 33:1173-81. [PMID: 26501951 PMCID: PMC4847940 DOI: 10.1038/nbt.3388] [Citation(s) in RCA: 199] [Impact Index Per Article: 22.1] [Received: 07/14/2015] [Accepted: 09/24/2015] [Indexed: 12/18/2022]
Abstract
The equivalence of human induced pluripotent stem cells (hiPSCs) and human embryonic stem cells (hESCs) remains controversial. Here we use genetically matched hESC and hiPSC lines to assess the contribution of cellular origin (hESC vs. hiPSC), the Sendai virus (SeV) reprogramming method and genetic background to transcriptional and DNA methylation patterns while controlling for cell line clonality and sex. We find that transcriptional and epigenetic variation originating from genetic background dominates over variation due to cellular origin or SeV infection. Moreover, the 49 differentially expressed genes we detect between genetically matched hESCs and hiPSCs neither predict functional outcome nor distinguish an independently derived, larger set of unmatched hESC and hiPSC lines. We conclude that hESCs and hiPSCs are molecularly and functionally equivalent and cannot be distinguished by a consistent gene expression signature. Our data further imply that genetic background variation is a major confounding factor for transcriptional and epigenetic comparisons of pluripotent cell lines, explaining some of the previously observed differences between genetically unmatched hESCs and hiPSCs.