1
Du Y, Sun F. MetaCC allows scalable and integrative analyses of both long-read and short-read metagenomic Hi-C data. Nat Commun 2023; 14:6231. [PMID: 37802989 PMCID: PMC10558524 DOI: 10.1038/s41467-023-41209-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/16/2023] [Accepted: 08/25/2023] [Indexed: 10/08/2023]
Abstract
Metagenomic Hi-C (metaHi-C) can identify contig-to-contig relationships with respect to their proximity within the same physical cell. Shotgun libraries in metaHi-C experiments can be constructed by next-generation sequencing (short-read metaHi-C) or more recent third-generation sequencing (long-read metaHi-C). However, all existing metaHi-C analysis methods were developed and benchmarked on short-read metaHi-C datasets, and there is much room for improvement in terms of more scalable and stable analyses, especially for long-read metaHi-C data. Here we report MetaCC, an efficient and integrative framework for analyzing both short-read and long-read metaHi-C datasets. MetaCC outperforms existing methods on normalization and binning. In particular, the MetaCC normalization module, named NormCC, is more than 3000 times faster than the current state-of-the-art method HiCzin on a complex wastewater dataset. When applied to a sheep gut long-read metaHi-C dataset, the MetaCC binning module can retrieve 709 high-quality genomes with the largest species diversity from a single sample, including an expansion of five uncultured members from the order Erysipelotrichales, and is the only binner that can recover the genome of the important species Bacteroides vulgatus. Further plasmid analyses reveal that MetaCC binning is able to capture multi-copy plasmids.
Affiliation(s)
- Yuxuan Du
- Department of Quantitative and Computational Biology, University of Southern California, Los Angeles, CA, USA
- Fengzhu Sun
- Department of Quantitative and Computational Biology, University of Southern California, Los Angeles, CA, USA
2
Xu C, Li W, Li T, Yuan J, Pang X, Liu T, Liang B, Cheng L, Sun X, Dong S. Iron metabolism-related genes reveal predictive value of acute coronary syndrome. Front Pharmacol 2022; 13:1040845. [PMID: 36330096 PMCID: PMC9622999 DOI: 10.3389/fphar.2022.1040845] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 09/09/2022] [Accepted: 10/04/2022] [Indexed: 11/25/2022]
Abstract
Iron deficiency, a common nutritional disorder and inflammation-related condition that affects up to one-third of people worldwide, has detrimental effects in patients with acute coronary syndrome (ACS). However, the specific role of iron metabolism in ACS progression is unclear. In this study, we construct a molecular signature of ACS based on iron metabolism-related genes (IMRGs) and identify novel iron metabolism gene markers for early-stage ACS. The IMRGs were mainly collected from the Molecular Signatures Database (MSigDB) and two relevant studies. Two blood transcriptome datasets, GSE61144 and GSE60993, were used for constructing the prediction model of ACS. After differential analysis, 22 IMRGs were differentially expressed and defined as DEIGs in the training set. Then, the 22 DEIGs were trained by the Elastic Net to build the prediction model. Five genes, PADI4, HLA-DQA1, LCN2, CD7, and VNN1, were determined using multiple Elastic Net calculations and retained to obtain the optimal performance. Finally, the resulting model, the iron metabolism-related gene signature (imSig), was assessed on the validation set GSE60993 using a series of evaluation measurements. Compared with other machine learning methods, the performance of imSig using Elastic Net was superior in the validation set. Elastic Net consistently scored higher than Lasso and logistic regression in the validation set in terms of ROC, PRC, sensitivity, and specificity. The prediction model based on iron metabolism-related genes may assist in early diagnosis of ACS.
Affiliation(s)
- Cong Xu
- Shenzhen People’s Hospital, First Affiliated Hospital of Southern University of Science and Technology, Second Clinical Medicine College of Jinan University, Shenzhen, China
- Wanyang Li
- School of Mathematics, South China University of Technology, Guangzhou, China
- Tangzhiming Li
- Shenzhen People’s Hospital, First Affiliated Hospital of Southern University of Science and Technology, Second Clinical Medicine College of Jinan University, Shenzhen, China
- Jie Yuan
- Shenzhen People’s Hospital, First Affiliated Hospital of Southern University of Science and Technology, Second Clinical Medicine College of Jinan University, Shenzhen, China
- Xinli Pang
- Shenzhen People’s Hospital, First Affiliated Hospital of Southern University of Science and Technology, Second Clinical Medicine College of Jinan University, Shenzhen, China
- Tao Liu
- International Digital Economy Academy, Shenzhen, China
- Benhui Liang
- Department of Cardiology, Xiangya Hospital, Central South University, Changsha, China
- Lixin Cheng
- Shenzhen People’s Hospital, First Affiliated Hospital of Southern University of Science and Technology, Second Clinical Medicine College of Jinan University, Shenzhen, China
- *Correspondence: Lixin Cheng; Xin Sun; Shaohong Dong
- Xin Sun
- Shenzhen People’s Hospital, First Affiliated Hospital of Southern University of Science and Technology, Second Clinical Medicine College of Jinan University, Shenzhen, China
- Shaohong Dong
- Shenzhen People’s Hospital, First Affiliated Hospital of Southern University of Science and Technology, Second Clinical Medicine College of Jinan University, Shenzhen, China
3
Wang R, Zheng X, Wang J, Wan S, Song F, Wong MH, Leung KS, Cheng L. Improving bulk RNA-seq classification by transferring gene signature from single cells in acute myeloid leukemia. Brief Bioinform 2022; 23:6523149. [PMID: 35136933 DOI: 10.1093/bib/bbac002] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 09/25/2021] [Revised: 12/22/2021] [Accepted: 01/04/2022] [Indexed: 12/13/2022]
Abstract
Advances in single-cell RNA sequencing (scRNA-seq) technologies enable the characterization of transcriptomic profiles at the cellular level and show great promise in bulk sample analysis, thereby offering opportunities to transfer gene signatures from scRNA-seq to bulk data. However, the gene expression signatures identified from single cells are typically inapplicable to bulk RNA-seq data due to the profiling differences between sequencing technologies. Here, we propose single-cell pair-wise gene expression (scPAGE), a novel method to develop single-cell gene pair signatures (scGPSs) that benefit bulk RNA-seq classification by transferring knowledge across platforms. PAGE was adopted to tackle the challenge of profiling differences. We applied the method to acute myeloid leukemia (AML) and identified the scGPS from mouse scRNA-seq that allowed discriminating between AML and control cells. The scGPS was validated in bulk RNA-seq datasets and demonstrated better performance (average area under the curve [AUC] = 0.96) than the conventional gene expression strategies (average AUC $\le$ 0.88), suggesting its potential in disclosing the molecular mechanism of AML. The scGPS also outperformed its bulk counterpart, which highlighted the benefit of gene signature transfer. Furthermore, we confirmed the utility of scPAGE in sepsis as an example of other disease scenarios. scPAGE leveraged the advantages of single-cell profiles to enhance the analysis of bulk samples, revealing the great potential of transferring knowledge from single-cell to bulk transcriptome studies.
Affiliation(s)
- Ran Wang
- Shenzhen People's Hospital, First Affiliated Hospital of Southern University of Science and Technology, Second Clinical Medicine College of Jinan University, Shenzhen 518020, China; Department of Computer Science and Engineering, The Chinese University of Hong Kong, Shatin, New Territories, Hong Kong
- Xubin Zheng
- Shenzhen People's Hospital, First Affiliated Hospital of Southern University of Science and Technology, Second Clinical Medicine College of Jinan University, Shenzhen 518020, China; Department of Computer Science and Engineering, The Chinese University of Hong Kong, Shatin, New Territories, Hong Kong
- Jun Wang
- Shenzhen People's Hospital, First Affiliated Hospital of Southern University of Science and Technology, Second Clinical Medicine College of Jinan University, Shenzhen 518020, China
- Shibiao Wan
- Center for Applied Bioinformatics, St. Jude Children's Research Hospital, Memphis, Tennessee 38105, USA
- Fangda Song
- School of Data Science, The Chinese University of Hong Kong, Shenzhen 518000, China
- Man Hon Wong
- Department of Computer Science and Engineering, The Chinese University of Hong Kong, Shatin, New Territories, Hong Kong
- Kwong Sak Leung
- Department of Computer Science and Engineering, The Chinese University of Hong Kong, Shatin, New Territories, Hong Kong
- Lixin Cheng
- Shenzhen People's Hospital, First Affiliated Hospital of Southern University of Science and Technology, Second Clinical Medicine College of Jinan University, Shenzhen 518020, China