1
Yang B, Li J, Li X, Liu S. Gene regulatory network inference based on novel ensemble method. Brief Funct Genomics 2024; 23:866-878. [PMID: 39324652 DOI: 10.1093/bfgp/elae036] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/11/2024] [Revised: 08/09/2024] [Accepted: 09/06/2024] [Indexed: 09/27/2024]
Abstract
Gene regulatory networks (GRNs) contribute toward understanding the function of genes and the development of cancer or the impact of key genes on diseases. Hence, this study proposes an ensemble method based on 13 basic classification methods and a flexible neural tree (FNT) to improve GRN identification accuracy. The primary classification methods contain ridge classification, stochastic gradient descent, Gaussian process classification, Bernoulli Naive Bayes, adaptive boosting, gradient boosting decision tree, hist gradient boosting classification, eXtreme gradient boosting (XGBoost), multilayer perceptron, light gradient boosting machine, random forest, support vector machine, and k-nearest neighbor algorithm, which are regarded as the input variable set of FNT model. Additionally, a hybrid evolutionary algorithm based on a gene programming variant and particle swarm optimization is developed to search for the optimal FNT model. Experiments on three simulation datasets and three real single-cell RNA-seq datasets demonstrate that the proposed ensemble feature outperforms 13 supervised algorithms, seven unsupervised algorithms (ARACNE, CLR, GENIE3, MRNET, PCACMI, GENECI, and EPCACMI) and four single cell-specific methods (SCODE, BiRGRN, LEAP, and BiGBoost) based on the area under the receiver operating characteristic curve, area under the precision-recall curve, and F1 metrics.
Affiliation(s)
- Bin Yang
  - School of Information Science and Engineering, Zaozhuang University, No. 1 Beian Road, Zaozhuang 277160, China
- Jing Li
  - School of Information Science and Engineering, Zaozhuang University, No. 1 Beian Road, Zaozhuang 277160, China
- Xiang Li
  - Information Department, Qingdao Eighth People's Hospital, No. 84 Fengshan Road, Qingdao 266121, China
- Sanrong Liu
  - School of Information Science and Engineering, Zaozhuang University, No. 1 Beian Road, Zaozhuang 277160, China
2
Shao M, Chen K, Zhang S, Tian M, Shen Y, Cao C, Gu N. Multiome-wide Association Studies: Novel Approaches for Understanding Diseases. GENOMICS, PROTEOMICS & BIOINFORMATICS 2024; 22:qzae077. [PMID: 39471467 PMCID: PMC11630051 DOI: 10.1093/gpbjnl/qzae077] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/16/2023] [Revised: 10/06/2024] [Accepted: 10/23/2024] [Indexed: 11/01/2024]
Abstract
The rapid development of multiome (transcriptome, proteome, cistrome, imaging, and regulome)-wide association study methods has opened new avenues for biologists to understand the susceptibility genes underlying complex diseases. Thorough comparisons of these methods are essential for selecting the most appropriate tool for a given research objective. This review provides a detailed categorization and summary of the statistical models, use cases, and advantages of recent multiome-wide association studies. In addition, to illustrate gene-disease association studies based on transcriptome-wide association study (TWAS), we collected 478 disease entries across 22 categories from 235 manually reviewed publications. Our analysis reveals that mental disorders are the most frequently studied diseases by TWAS, indicating its potential to deepen our understanding of the genetic architecture of complex diseases. In summary, this review underscores the importance of multiome-wide association studies in elucidating complex diseases and highlights the significance of selecting the appropriate method for each study.
Affiliation(s)
- Mengting Shao
  - Key Laboratory for Bio-Electromagnetic Environment and Advanced Medical Theranostics, School of Biomedical Engineering and Informatics, Nanjing Medical University, Nanjing 211166, China
- Kaiyang Chen
  - Key Laboratory for Bio-Electromagnetic Environment and Advanced Medical Theranostics, School of Biomedical Engineering and Informatics, Nanjing Medical University, Nanjing 211166, China
- Shuting Zhang
  - Key Laboratory for Bio-Electromagnetic Environment and Advanced Medical Theranostics, School of Biomedical Engineering and Informatics, Nanjing Medical University, Nanjing 211166, China
- Min Tian
  - Key Laboratory for Bio-Electromagnetic Environment and Advanced Medical Theranostics, School of Biomedical Engineering and Informatics, Nanjing Medical University, Nanjing 211166, China
- Yan Shen
  - Key Laboratory for Bio-Electromagnetic Environment and Advanced Medical Theranostics, School of Biomedical Engineering and Informatics, Nanjing Medical University, Nanjing 211166, China
- Chen Cao
  - Key Laboratory for Bio-Electromagnetic Environment and Advanced Medical Theranostics, School of Biomedical Engineering and Informatics, Nanjing Medical University, Nanjing 211166, China
- Ning Gu
  - Key Laboratory for Bio-Electromagnetic Environment and Advanced Medical Theranostics, School of Biomedical Engineering and Informatics, Nanjing Medical University, Nanjing 211166, China
  - Nanjing Key Laboratory for Cardiovascular Information and Health Engineering Medicine, Institute of Clinical Medicine, Nanjing Drum Tower Hospital, Medical School, Nanjing University, Nanjing 210093, China
3
Wang G, Zhang H, Shao M, Tian M, Feng H, Li Q, Cao C. Optimal variable identification for accurate detection of causal expression Quantitative Trait Loci with applications in heart-related diseases. Comput Struct Biotechnol J 2024; 23:2478-2486. [PMID: 38952424 PMCID: PMC11215961 DOI: 10.1016/j.csbj.2024.05.050] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/01/2024] [Revised: 05/31/2024] [Accepted: 05/31/2024] [Indexed: 07/03/2024]
Abstract
Gene expression plays a pivotal role in various diseases, contributing significantly to their mechanisms. Most GWAS risk loci are in non-coding regions, potentially affecting disease risk by altering gene expression in specific tissues. This expression is notably tissue-specific, with genetic variants substantially influencing it. However, accurately detecting the expression Quantitative Trait Loci (eQTL) is challenging due to limited heritability in gene expression, extensive linkage disequilibrium (LD), and multiple causal variants. The single variant association approach in eQTL analysis is limited by its susceptibility to capture the combined effects of multiple variants, and a bias towards common variants, underscoring the need for a more robust method to accurately identify causal eQTL variants. To address this, we developed an algorithm, CausalEQTL, which integrates L0+L1 penalized regression with an ensemble approach to localize eQTL, thereby enhancing prediction performance precisely. Our results demonstrate that CausalEQTL outperforms traditional models, including LASSO, Elastic Net, Ridge, in terms of power and overall performance. Furthermore, analysis of heart tissue data from the GTEx project revealed that eQTL sites identified by our algorithm provide deeper insights into heart-related tissue eQTL detection. This advancement in eQTL mapping promises to improve our understanding of the genetic basis of tissue-specific gene expression and its implications in disease. The source code and identified causal eQTLs for CausalEQTL are available on GitHub: https://github.com/zhc-moushang/CausalEQTL.
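For orientation, a generic L0+L1 penalized least-squares objective can be written as below. This is only the textbook form of such a penalty and is not taken from the CausalEQTL paper, whose exact formulation and ensemble step may differ. The notation is assumed for illustration: y is the expression vector for one gene, X is the n-by-p genotype matrix, and λ0, λ1 are tuning parameters.

```latex
\hat{\beta}
  = \arg\min_{\beta \in \mathbb{R}^{p}}
    \frac{1}{2n} \lVert y - X\beta \rVert_2^2
    + \lambda_0 \lVert \beta \rVert_0
    + \lambda_1 \lVert \beta \rVert_1
```

Here the L0 term counts nonzero coefficients, which pushes toward a small set of candidate causal variants, while the L1 term shrinks the coefficients that remain.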
Affiliation(s)
- Guishen Wang
  - College of Computer Science and Engineering, Changchun University of Technology, Changchun 130012, China
- Hangchen Zhang
  - College of Computer Science and Engineering, Changchun University of Technology, Changchun 130012, China
- Mengting Shao
  - Key Laboratory for Bio-Electromagnetic Environment and Advanced Medical Theranostics, School of Biomedical Engineering and Informatics, Nanjing Medical University, Nanjing 211166, China
- Min Tian
  - Key Laboratory for Bio-Electromagnetic Environment and Advanced Medical Theranostics, School of Biomedical Engineering and Informatics, Nanjing Medical University, Nanjing 211166, China
- Hui Feng
  - College of Computer Science and Engineering, Changchun University of Technology, Changchun 130012, China
- Qiaoling Li
  - Department of Cardiology, Affiliated Drum Tower Hospital, Medical School of Nanjing University, Nanjing 210008, China
- Chen Cao
  - Key Laboratory for Bio-Electromagnetic Environment and Advanced Medical Theranostics, School of Biomedical Engineering and Informatics, Nanjing Medical University, Nanjing 211166, China
4
Cao C, Shao M, Wang J, Li Z, Chen H, You T, Li MJ, Ding Y, Zou Q. webTWAS 2.0: update platform for identifying complex disease susceptibility genes through transcriptome-wide association study. Nucleic Acids Res 2024:gkae1022. [PMID: 39526380 DOI: 10.1093/nar/gkae1022] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/15/2024] [Revised: 10/14/2024] [Accepted: 10/17/2024] [Indexed: 11/16/2024]
Abstract
Transcriptome-wide association study (TWAS) has successfully identified numerous complex disease susceptibility genes in the post-genome-wide association study (GWAS) era. Over the past 3 years, the focus of TWAS algorithms has shifted from merely identifying associations to understanding how single nucleotide polymorphisms (SNPs) regulate gene expression, with a growing emphasis on incorporating fine-mapping techniques. Additionally, the rapid increase in GWAS summary statistics, driven largely by the UK Biobank and other consortia, has made it essential to update our webTWAS resource. To address these challenges and meet the growing needs of researchers, we developed webTWAS 2.0, an updated platform for identifying susceptibility genes for human complex diseases using TWAS. Additionally, webTWAS 2.0 provides an online TWAS analysis tool that simplifies conducting TWAS analyses. The updated resource includes 7247 GWAS summary statistics covering 1588 complex human diseases from 192 publications. It also incorporates multiple TWAS methods, such as sTF-TWAS, 3'aTWAS and GIFT, along with an updated interactive visualization tool that allows users to easily explore significant associations across different methods. Other upgrades include a personalized online analysis tool for user-submitted GWAS data and a refined search function that makes it easier to identify relevant associations and meet diverse user needs more efficiently. webTWAS 2.0 is freely accessible at http://www.webtwas.net.
Affiliation(s)
- Chen Cao
  - Key Laboratory for Bio-Electromagnetic Environment and Advanced Medical Theranostics, School of Biomedical Engineering and Informatics, Nanjing Medical University, 101 Longmian Ave, Nanjing, Jiangsu 211166, China
- Mengting Shao
  - Key Laboratory for Bio-Electromagnetic Environment and Advanced Medical Theranostics, School of Biomedical Engineering and Informatics, Nanjing Medical University, 101 Longmian Ave, Nanjing, Jiangsu 211166, China
- Jianhua Wang
  - Department of Systems Pharmacology and Translational Therapeutics, University of Pennsylvania-Perelman School of Medicine, 421 Curie Blvd, Philadelphia, PA 19104, USA
- Zhenghui Li
  - Key Laboratory for Bio-Electromagnetic Environment and Advanced Medical Theranostics, School of Biomedical Engineering and Informatics, Nanjing Medical University, 101 Longmian Ave, Nanjing, Jiangsu 211166, China
- Haoran Chen
  - Key Laboratory for Bio-Electromagnetic Environment and Advanced Medical Theranostics, School of Biomedical Engineering and Informatics, Nanjing Medical University, 101 Longmian Ave, Nanjing, Jiangsu 211166, China
- Tianyi You
  - Department of Pharmacology, School of Basic Medical Sciences, Tianjin Medical University, 22 Qixiangtai Road, Tianjin 300203, China
- Mulin Jun Li
  - Department of Pharmacology, School of Basic Medical Sciences, Tianjin Medical University, 22 Qixiangtai Road, Tianjin 300203, China
- Yijie Ding
  - Yangtze Delta Region Institute (Quzhou), University of Electronic Science and Technology of China, 1 Chengdian Road, Quzhou, Zhejiang 324003, China
- Quan Zou
  - Yangtze Delta Region Institute (Quzhou), University of Electronic Science and Technology of China, 1 Chengdian Road, Quzhou, Zhejiang 324003, China
5
Wang N, Ye Z, Ma T. TIPS: a novel pathway-guided joint model for transcriptome-wide association studies. Brief Bioinform 2024; 25:bbae587. [PMID: 39550224 PMCID: PMC11568880 DOI: 10.1093/bib/bbae587] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/02/2024] [Revised: 10/03/2024] [Accepted: 10/30/2024] [Indexed: 11/18/2024]
Abstract
In the past two decades, genome-wide association studies (GWAS) have pinpointed numerous SNPs linked to human diseases and traits, yet many of these SNPs are in non-coding regions and hard to interpret. Transcriptome-wide association studies (TWAS) integrate GWAS and expression reference panels to identify the associations at gene level with tissue specificity, potentially improving the interpretability. However, the list of individual genes identified from univariate TWAS contains little unifying biological theme, leaving the underlying mechanisms largely elusive. In this paper, we propose a novel multivariate TWAS method that Incorporates Pathway or gene Set information, namely TIPS, to identify genes and pathways most associated with complex polygenic traits. We jointly modeled the imputation and association steps in TWAS, incorporated a sparse group lasso penalty in the model to induce selection at both gene and pathway levels and developed an expectation-maximization algorithm to estimate the parameters for the penalized likelihood. We applied our method to three different complex traits: systolic and diastolic blood pressure, as well as a brain aging biomarker white matter brain age gap in UK Biobank and identified critical biologically relevant pathways and genes associated with these traits. These pathways cannot be detected by traditional univariate TWAS + pathway enrichment analysis approach, showing the power of our model. We also conducted comprehensive simulations with varying heritability levels and genetic architectures and showed our method outperformed other established TWAS methods in feature selection, statistical power, and prediction. The R package that implements TIPS is available at https://github.com/nwang123/TIPS.
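The sparse group lasso penalty mentioned in this abstract can be illustrated with a minimal sketch. It uses the standard textbook form of the penalty (an L1 term plus a size-weighted group L2 term); the function name, the `lam`/`alpha` parameterization, and the toy pathway grouping are assumptions for illustration and are not taken from the TIPS R package.

```python
import math
from typing import Dict, List, Sequence


def sparse_group_lasso_penalty(
    beta: Sequence[float],
    groups: Dict[str, List[int]],
    lam: float,
    alpha: float,
) -> float:
    """Standard sparse group lasso penalty (hypothetical helper).

    penalty = lam * (alpha * ||beta||_1
                     + (1 - alpha) * sum_g sqrt(p_g) * ||beta_g||_2)

    beta   : gene-level effect sizes
    groups : maps a pathway name to the indices of its genes in beta
    lam    : overall penalty strength
    alpha  : mixing weight between gene-level (L1) and pathway-level terms
    """
    # Gene-level sparsity: sum of absolute effects.
    l1_term = sum(abs(b) for b in beta)

    # Pathway-level sparsity: L2 norm per group, weighted by sqrt(group size).
    group_term = 0.0
    for indices in groups.values():
        group_norm = math.sqrt(sum(beta[i] ** 2 for i in indices))
        group_term += math.sqrt(len(indices)) * group_norm

    return lam * (alpha * l1_term + (1.0 - alpha) * group_term)


# Toy example: four genes in two made-up pathways; pathway_B is entirely zero.
toy_beta = [3.0, 4.0, 0.0, 0.0]
toy_groups = {"pathway_A": [0, 1], "pathway_B": [2, 3]}
toy_penalty = sparse_group_lasso_penalty(toy_beta, toy_groups, lam=1.0, alpha=0.5)
```

In this toy case only `pathway_A` contributes to the group term, which is how the penalty can drop a whole pathway while still selecting individual genes inside the pathways it keeps.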
Affiliation(s)
- Neng Wang
  - Department of Mathematics, University of Maryland, College Park, MD 20742, United States
  - Department of Epidemiology and Biostatistics, University of Maryland, College Park, MD 20742, United States
- Zhenyao Ye
  - Department of Epidemiology and Public Health, University of Maryland, Baltimore, MD 21201, United States
- Tianzhou Ma
  - Department of Epidemiology and Biostatistics, University of Maryland, College Park, MD 20742, United States
6
K Lodi M, Chernikov A, Ghosh P. COFFEE: consensus single cell-type specific inference for gene regulatory networks. Brief Bioinform 2024; 25:bbae457. [PMID: 39311699 PMCID: PMC11418232 DOI: 10.1093/bib/bbae457] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/03/2024] [Revised: 07/22/2024] [Accepted: 09/02/2024] [Indexed: 09/26/2024]
Abstract
The inference of gene regulatory networks (GRNs) is crucial to understanding the regulatory mechanisms that govern biological processes. GRNs may be represented as edges in a graph, and hence they have been inferred computationally for scRNA-seq data. A wisdom of crowds approach to integrate edges from several GRNs to create one composite GRN has demonstrated improved performance when compared with individual algorithm implementations on bulk RNA-seq and microarray data. In an effort to extend this approach to scRNA-seq data, we present COFFEE (COnsensus single cell-type speciFic inFerence for gEnE regulatory networks), a Borda voting-based consensus algorithm that integrates information from 10 established GRN inference methods. We conclude that COFFEE has improved performance across synthetic, curated, and experimental datasets when compared with baseline methods. Additionally, we show that a modified version of COFFEE can be leveraged to improve performance on newer cell-type specific GRN inference methods. Overall, our results demonstrate that consensus-based methods with pertinent modifications continue to be valuable for GRN inference at the single cell level. While COFFEE is benchmarked on 10 algorithms, it is a flexible strategy that can incorporate any set of GRN inference algorithms according to user preference. A Python implementation of COFFEE may be found on GitHub: https://github.com/lodimk2/coffee.
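The idea of Borda voting over ranked edge lists can be sketched as follows. This is a plain Borda count written for illustration, not the COFFEE implementation; the function name, the scoring convention (top rank earns as many points as there are distinct edges, unlisted edges earn zero), and the toy edge lists are all assumptions.

```python
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple

Edge = Tuple[str, str]  # (regulator, target)


def borda_consensus(rankings: Sequence[Sequence[Edge]]) -> List[Tuple[Edge, int]]:
    """Combine ranked edge lists with a plain Borda count (hypothetical helper).

    Each ranking lists edges from most to least confident. With n distinct
    edges overall, the edge at position k (0-based) in a ranking earns n - k
    points; an edge missing from a ranking earns nothing from that method.
    """
    all_edges = {edge for ranking in rankings for edge in ranking}
    n_edges = len(all_edges)

    scores: Dict[Edge, int] = defaultdict(int)
    for ranking in rankings:
        for position, edge in enumerate(ranking):
            scores[edge] += n_edges - position

    # Highest total first; ties broken alphabetically for reproducibility.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Toy rankings from three made-up inference methods.
method_a = [("TF1", "G1"), ("TF1", "G2"), ("TF2", "G1")]
method_b = [("TF1", "G2"), ("TF1", "G1")]
method_c = [("TF1", "G2"), ("TF2", "G1"), ("TF1", "G1")]

consensus = borda_consensus([method_a, method_b, method_c])
for edge, score in consensus:
    print(edge, score)
```

A consensus network would then be obtained by keeping the top-scoring edges; where to place that cutoff is a separate choice that this sketch leaves open.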
Affiliation(s)
- Musaddiq K Lodi
  - Integrative Life Sciences, Virginia Commonwealth University, 1000 W Cary St, Richmond, VA 23284, United States
- Anna Chernikov
  - Center for Biological Data Science, Virginia Commonwealth University, 1015 Floyd Ave, Richmond, VA 23284, United States
- Preetam Ghosh
  - Department of Computer Science, Virginia Commonwealth University, 401 W Main St, Richmond, VA 23284, United States
7
Chen M, Zou Q, Qi R, Ding Y. PseU-KeMRF: A Novel Method for Identifying RNA Pseudouridine Sites. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2024; 21:1423-1435. [PMID: 38625768 DOI: 10.1109/tcbb.2024.3389094] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/18/2024]
Abstract
Pseudouridine is a type of abundant RNA modification that is seen in many different animals and is crucial for a variety of biological functions. Accurately identifying pseudouridine sites within the RNA sequence is vital for the subsequent study of various biological mechanisms of pseudouridine. However, the use of traditional experimental methods faces certain challenges. The development of fast and convenient computational methods is necessary to accurately identify pseudouridine sites from RNA sequence information. To address this, we introduce a novel pseudouridine site prediction model called PseU-KeMRF, which can identify pseudouridine sites in three species, H. sapiens, S. cerevisiae, and M. musculus. Through comprehensive analysis, we selected four RNA coding schemes, including binary feature, position-specific trinucleotide propensity based on single strand (PSTNPss), nucleotide chemical property (NCP) and pseudo k-tuple composition (PseKNC). Then the support vector machine-recursive feature elimination (SVM-RFE) method was used for feature selection and the feature subset was optimized. Finally, the best feature subsets are input into the kernel based on multinomial random forests (KeMRF) classifier for cross-validation and independent testing. As a new classification method, compared with the traditional random forest, KeMRF not only improves the node splitting process of decision tree construction based on multinomial distribution, but also combines the easy to interpret kernel method for prediction, which makes the classification performance better. Our results indicate superior predictive performance of PseU-KeMRF over other existing models, which can prove that PseU-KeMRF is a highly competitive predictive model that can successfully identify pseudouridine sites in RNA sequences.
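Of the four coding schemes named in this abstract, the binary feature is the simplest to illustrate. The sketch below assumes the common one-hot convention of four bits per nucleotide; the A, C, G, U bit order, the function name, and the example sequence are illustrative assumptions rather than details confirmed by the PseU-KeMRF paper.

```python
from typing import Dict, List, Tuple

# Assumed bit order: A, C, G, U. Other orderings are equally valid one-hot codes.
ONE_HOT: Dict[str, Tuple[int, int, int, int]] = {
    "A": (1, 0, 0, 0),
    "C": (0, 1, 0, 0),
    "G": (0, 0, 1, 0),
    "U": (0, 0, 0, 1),
}


def binary_encode(sequence: str) -> List[int]:
    """Flatten an RNA sequence into a 4 * len(sequence) binary feature vector."""
    features: List[int] = []
    for nucleotide in sequence.upper():
        if nucleotide not in ONE_HOT:
            raise ValueError(f"unsupported nucleotide: {nucleotide!r}")
        features.extend(ONE_HOT[nucleotide])
    return features


# Toy example: a three-nucleotide window yields a 12-dimensional vector.
example_vector = binary_encode("AUG")
```

Vectors like this one would typically be concatenated with the other encodings before a feature selection step such as SVM-RFE.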
8
Wang G, Feng H, Cao C. BiRNN-DDI: A Drug-Drug Interaction Event Type Prediction Model Based on Bidirectional Recurrent Neural Network and Graph2Seq Representation. J Comput Biol 2024. [PMID: 39049806 DOI: 10.1089/cmb.2024.0476] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/27/2024]
Abstract
Research on drug-drug interaction (DDI) prediction, particularly in identifying DDI event types, is crucial for understanding adverse drug reactions and drug combinations. This work introduces a Bidirectional Recurrent Neural Network model for DDI event type prediction (BiRNN-DDI), which simultaneously considers structural relationships and contextual information. Our BiRNN-DDI model constructs drug feature graphs to mine structural relationships. For contextual information, it transforms drug graphs into sequences and employs a two-channel structure, integrating BiRNN, to obtain contextual representations of drug-drug pairs. The model's effectiveness is demonstrated through comparisons with state-of-the-art models on two DDI event-type benchmarks. Extensive experimental results reveal that BiRNN-DDI surpasses other models in accuracy, AUPR, AUC, F1 score, Precision, and Recall metrics on both small and large datasets. Additionally, our model exhibits a lower parameter space, indicating more efficient learning of drug feature representations and prediction of potential DDI event types.
Affiliation(s)
- GuiShen Wang
  - School of Computer Science and Engineering, Changchun University of Technology, Changchun, China
- Hui Feng
  - School of Computer Science and Engineering, Changchun University of Technology, Changchun, China
- Chen Cao
  - Key Laboratory for Bio-Electromagnetic Environment and Advanced Medical Theranostics, School of Biomedical Engineering and Informatics, Nanjing Medical University, Nanjing, China
9
Aguiar TFM, Rivas MP, de Andrade Silva EM, Pires SF, Dangoni GD, Macedo TC, Defelicibus A, Barros BDDF, Novak E, Cristofani LM, Odone V, Cypriano M, de Toledo SRC, da Cunha IW, da Costa CML, Carraro DM, Tojal I, de Oliveira Mendes TA, Krepischi ACV. First Transcriptome Analysis of Hepatoblastoma in Brazil: Unraveling the Pivotal Role of Noncoding RNAs and Metabolic Pathways. Biochem Genet 2024:10.1007/s10528-024-10764-y. [PMID: 38649558 DOI: 10.1007/s10528-024-10764-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/27/2023] [Accepted: 02/27/2024] [Indexed: 04/25/2024]
Abstract
Hepatoblastoma stands as the most prevalent liver cancer in the pediatric population. Characterized by a low mutational burden, chromosomal and epigenetic alterations are key drivers of its tumorigenesis. Transcriptome analysis is a powerful tool for unraveling the molecular intricacies of hepatoblastoma, shedding light on the effects of genetic and epigenetic changes on gene expression. In this study conducted in Brazilian patients, an in-depth whole transcriptome analysis was performed on 14 primary hepatoblastomas, compared to control liver tissues. The analysis unveiled 1,492 differentially expressed genes (1,031 upregulated and 461 downregulated), including 920 protein-coding genes (62%). Upregulated biological processes were linked to cell differentiation, signaling, morphogenesis, and development, involving known hepatoblastoma-associated genes (DLK1, MEG3, HDAC2, TET1, HMGA2, DKK1, DKK4), alongside novel findings (GYNG4, CDH3, and TNFRSF19). Downregulated processes predominantly centered around oxidation and metabolism, affecting amines, nicotinamides, and lipids, featuring novel discoveries like the repression of SYT7, TTC36, THRSP, CCND1, GCK and CAMK2B. Two genes, which displayed a concordant pattern of DNA methylation alteration in their promoter regions and dysregulation in the transcriptome, were further validated by RT-qPCR: the upregulated TNFRSF19, a key gene in embryonic development, and the repressed THRSP, connected to lipid metabolism. Furthermore, based on protein-protein interaction analysis, we identified genes holding central positions in the network, such as HDAC2, CCND1, GCK, and CAMK2B, among others, that emerged as prime candidates warranting functional validation in future studies. Notably, a significant dysregulation of non-coding RNAs (ncRNAs), predominantly upregulated transcripts, was observed, with 42% of the top 50 highly expressed genes being ncRNAs. An integrative miRNA-mRNA analysis revealed crucial biological processes associated with metabolism, oxidation reactions of lipids and carbohydrates, and methylation-dependent chromatin silencing. In particular, four upregulated miRNAs (miR-186, miR-214, miR-377, and miR-494) played a pivotal role in the network, potentially targeting multiple protein-coding transcripts, including CCND1 and CAMK2B. In summary, our transcriptome analysis highlighted disrupted embryonic development as well as metabolic pathways, particularly those involving lipids, emphasizing the emerging role of ncRNAs as epigenetic regulators in hepatoblastomas. These findings provide insights into the complexity of the hepatoblastoma transcriptome and identify potential targets for future therapeutic interventions.
Affiliation(s)
- Talita Ferreira Marques Aguiar
  - Department of Genetics and Evolutionary Biology, Institute of Biosciences, Human Genome and Stem-Cell Research Center, University of São Paulo, São Paulo, Brazil
  - Columbia University Irving Medical Center, New York, NY, USA
- Maria Prates Rivas
  - Department of Genetics and Evolutionary Biology, Institute of Biosciences, Human Genome and Stem-Cell Research Center, University of São Paulo, São Paulo, Brazil
- Edson Mario de Andrade Silva
  - Department of Biochemistry and Molecular Biology, Federal University of Viçosa, Minas Gerais, Brazil
  - Horticultural Sciences Department, University of Florida, Gainesville, USA
- Sara Ferreira Pires
  - Department of Genetics and Evolutionary Biology, Institute of Biosciences, Human Genome and Stem-Cell Research Center, University of São Paulo, São Paulo, Brazil
- Gustavo Dib Dangoni
  - Department of Genetics and Evolutionary Biology, Institute of Biosciences, Human Genome and Stem-Cell Research Center, University of São Paulo, São Paulo, Brazil
- Taiany Curdulino Macedo
  - Department of Genetics and Evolutionary Biology, Institute of Biosciences, Human Genome and Stem-Cell Research Center, University of São Paulo, São Paulo, Brazil
- Estela Novak
  - Pediatric Cancer Institute (ITACI) at the Pediatric Department, São Paulo University Medical School, São Paulo, Brazil
- Lilian Maria Cristofani
  - Pediatric Cancer Institute (ITACI) at the Pediatric Department, São Paulo University Medical School, São Paulo, Brazil
- Vicente Odone
  - Pediatric Cancer Institute (ITACI) at the Pediatric Department, São Paulo University Medical School, São Paulo, Brazil
- Monica Cypriano
  - Department of Pediatrics, Adolescent and Child With Cancer Support Group (GRAACC), Federal University of São Paulo, São Paulo, Brazil
- Silvia Regina Caminada de Toledo
  - Department of Pediatrics, Adolescent and Child With Cancer Support Group (GRAACC), Federal University of São Paulo, São Paulo, Brazil
- Dirce Maria Carraro
  - International Center for Research, A. C. Camargo Cancer Center, São Paulo, Brazil
- Israel Tojal
  - International Center for Research, A. C. Camargo Cancer Center, São Paulo, Brazil
- Ana Cristina Victorino Krepischi
  - Department of Genetics and Evolutionary Biology, Institute of Biosciences, Human Genome and Stem-Cell Research Center, University of São Paulo, São Paulo, Brazil
10
Li Q, Bian J, Qian Y, Kossinna P, Gau C, Gordon PMK, Zhou X, Guo X, Yan J, Wu J, Long Q. An expression-directed linear mixed model discovering low-effect genetic variants. Genetics 2024; 226:iyae018. [PMID: 38314848 PMCID: PMC11630775 DOI: 10.1093/genetics/iyae018] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/29/2023] [Revised: 11/29/2023] [Accepted: 01/05/2024] [Indexed: 02/07/2024]
Abstract
Detecting genetic variants with low-effect sizes using a moderate sample size is difficult, hindering downstream efforts to learn pathology and estimate heritability. In this work, by utilizing informative weights learned from training genetically predicted gene expression models, we formed an alternative approach to estimate the polygenic term in a linear mixed model. Our linear mixed model estimates the genetic background by incorporating their relevance to gene expression. Our protocol, expression-directed linear mixed model, enables the discovery of subtle signals of low-effect variants using moderate sample size. By applying expression-directed linear mixed model to cohorts of around 5,000 individuals with either binary (WTCCC) or quantitative (NFBC1966) traits, we demonstrated its power gain at the low-effect end of the genetic etiology spectrum. In aggregate, the additional low-effect variants detected by expression-directed linear mixed model substantially improved estimation of missing heritability. Expression-directed linear mixed model moves precision medicine forward by accurately detecting the contribution of low-effect genetic variants to human diseases.
Affiliation(s)
- Qing Li
  - Department of Biochemistry & Molecular Biology, University of Calgary, Calgary T2N 1N4, Canada
- Jiayi Bian
  - Department of Mathematics and Statistics, University of Calgary, Calgary T2N 1N4, Canada
- Yanzhao Qian
  - Department of Mathematics and Statistics, University of Calgary, Calgary T2N 1N4, Canada
- Pathum Kossinna
  - Department of Biochemistry & Molecular Biology, University of Calgary, Calgary T2N 1N4, Canada
- Cooper Gau
  - Department of Mathematics and Statistics, University of Calgary, Calgary T2N 1N4, Canada
- Paul M K Gordon
  - Alberta Children's Hospital Research Institute, University of Calgary, Calgary T2N 1N4, Canada
- Xiang Zhou
  - School of Public Health, University of Michigan, Ann Arbor 48109, USA
- Xingyi Guo
  - Department of Medicine & Biomedical Informatics, Vanderbilt University Medical Center, Nashville 37203, USA
- Jun Yan
  - Physiology and Pharmacology, University of Calgary, Calgary T2N 1N4, Canada
  - Hotchkiss Brain Institute, University of Calgary, Calgary T2N 1N4, Canada
- Jingjing Wu
  - Department of Mathematics and Statistics, University of Calgary, Calgary T2N 1N4, Canada
- Quan Long
  - Department of Biochemistry & Molecular Biology, University of Calgary, Calgary T2N 1N4, Canada
  - Department of Mathematics and Statistics, University of Calgary, Calgary T2N 1N4, Canada
  - Alberta Children's Hospital Research Institute, University of Calgary, Calgary T2N 1N4, Canada
  - Hotchkiss Brain Institute, University of Calgary, Calgary T2N 1N4, Canada
  - Department of Medical Genetics, University of Calgary, Calgary T2N 1N4, Canada
11
He J, Li Q, Zhang Q. rvTWAS: identifying gene-trait association using sequences by utilizing transcriptome-directed feature selection. Genetics 2024; 226:iyad204. [PMID: 38001381 DOI: 10.1093/genetics/iyad204] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/20/2023] [Revised: 11/14/2023] [Accepted: 11/16/2023] [Indexed: 11/26/2023]
Abstract
Toward identifying the genetic basis of complex traits, transcriptome-wide association studies (TWAS) have been successful in integrating transcriptome data. However, TWAS is applicable only to common variants, excluding rare variants in exome or whole-genome sequences. This is partly because of the inherent limitation of TWAS protocols that rely on predicting gene expression. Our previous research has revealed an insight into TWAS: its 2 steps, building and applying the expression prediction models, are essentially genetic feature selection and aggregation, which do not have to involve prediction. Based on this insight, the inability of rare variants to predict expression traits is no longer an obstacle. Herein, we developed "rare variant TWAS," or rvTWAS, which first uses a Bayesian model to conduct expression-directed feature selection and then uses a kernel machine to carry out feature aggregation, forming a model that leverages expression data for association mapping including rare variants. We demonstrated the performance of rvTWAS through thorough simulations and real data analysis in 3 psychiatric disorders, namely schizophrenia, bipolar disorder, and autism spectrum disorder. We confirmed that rvTWAS outperforms existing TWAS protocols and revealed additional genes underlying psychiatric disorders. In particular, we formed a hypothetical mechanism in which zinc finger genes impact all 3 disorders through transcriptional regulation. rvTWAS will open a door for sequence-based association mapping that integrates gene expression.
Affiliation(s)
- Jingni He
- Department of Biochemistry and Molecular Biology, University of Calgary, Calgary T2N 1N4, Canada
| | - Qing Li
- Department of Biochemistry and Molecular Biology, University of Calgary, Calgary T2N 1N4, Canada
| | - Qingrun Zhang
- Department of Biochemistry and Molecular Biology, University of Calgary, Calgary T2N 1N4, Canada
- Department of Mathematics and Statistics, University of Calgary, Calgary T2N 1N4, Canada
- Alberta Children's Hospital Research Institute, University of Calgary, Calgary T2N 1N4, Canada
- Arnie Charbonneau Cancer Institute, University of Calgary, Calgary T2N 1N4, Canada
|
12
|
Jiang J, Pei H, Li J, Li M, Zou Q, Lv Z. FEOpti-ACVP: identification of novel anti-coronavirus peptide sequences based on feature engineering and optimization. Brief Bioinform 2024; 25:bbae037. [PMID: 38366802 PMCID: PMC10939380 DOI: 10.1093/bib/bbae037] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/08/2023] [Revised: 12/27/2023] [Accepted: 01/17/2024] [Indexed: 02/18/2024] Open
Abstract
Anti-coronavirus peptides (ACVPs) represent a relatively novel approach to inhibiting the adsorption and fusion of the virus with human cells. Several peptide-based inhibitors have shown promise as potential therapeutic drug candidates. However, identifying such peptides in laboratory experiments is both costly and time consuming. Therefore, there is growing interest in using computational methods to predict ACVPs. Here, we describe a model for the prediction of ACVPs that is based on the combination of feature engineering (FE) optimization and deep representation learning. FEOpti-ACVP was pre-trained using two feature extraction frameworks. In the next step, several machine learning approaches were tested to construct the final algorithm. The final version of FEOpti-ACVP outperformed existing methods used for ACVP prediction, and it has the potential to become a valuable tool in ACVP drug design. A user-friendly webserver of FEOpti-ACVP can be accessed at http://servers.aibiochem.net/soft/FEOpti-ACVP/.
Affiliation(s)
- Jici Jiang
- College of Biomedical Engineering, Sichuan University, Chengdu 610065, China
| | - Hongdi Pei
- College of Biomedical Engineering, Sichuan University, Chengdu 610065, China
| | - Jiayu Li
- College of Life Science, Sichuan University, Chengdu 610065, China
| | - Mingxin Li
- College of Biomedical Engineering, Sichuan University, Chengdu 610065, China
| | - Quan Zou
- Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu 610054, China
- Yangtze Delta Region Institute (Quzhou), University of Electronic Science and Technology of China, Quzhou 324000, China
| | - Zhibin Lv
- College of Biomedical Engineering, Sichuan University, Chengdu 610065, China
|
13
|
Visonà G, Bouzigon E, Demenais F, Schweikert G. Network propagation for GWAS analysis: a practical guide to leveraging molecular networks for disease gene discovery. Brief Bioinform 2024; 25:bbae014. [PMID: 38340090 PMCID: PMC10858647 DOI: 10.1093/bib/bbae014] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/16/2023] [Revised: 12/28/2023] [Accepted: 01/08/2024] [Indexed: 02/12/2024] Open
Abstract
MOTIVATION: Genome-wide association studies (GWAS) have enabled large-scale analysis of the role of genetic variants in human disease. Despite impressive methodological advances, subsequent clinical interpretation and application remain challenging when GWAS suffer from a lack of statistical power. In recent years, however, the use of information diffusion algorithms with molecular networks has led to fruitful insights into disease genes. RESULTS: We present an overview of the design choices and pitfalls that prove crucial in the application of network propagation methods to GWAS summary statistics. We highlight general trends from the literature and present benchmark experiments that expand on these insights, using three diseases and five molecular networks as a case study. We verify that, if the GWAS summary statistics are of sufficient quality, gene-level scores based on GWAS P-values offer advantages over a set of 'seed' disease genes that is not weighted by the associated P-values. Beyond that, the size and the density of the networks prove to be important factors for consideration. Finally, we explore several ensemble methods and show that combining multiple networks may improve the network propagation approach.
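As a rough illustration of the contrast drawn in this abstract between P-value-weighted gene scores and an unweighted seed set, the following minimal sketch runs a random walk with restart on a toy network. It is not the authors' benchmark code: the four-gene chain, the gene names, the P-values, the `-log10(P)` weighting, and the restart probability of 0.5 are all assumptions chosen for illustration.

```python
import math


def gene_scores_from_pvalues(pvalues):
    """Turn gene-level P-values in (0, 1] into non-negative weights via -log10(P)."""
    return {gene: -math.log10(p) for gene, p in pvalues.items()}


def propagate(adjacency, seed_scores, restart=0.5, tol=1e-10, max_iter=10_000):
    """Random walk with restart on an undirected network.

    adjacency maps each node to the set of its neighbours; every neighbour
    must itself be a key. seed_scores maps nodes to non-negative weights.
    Returns the (approximately) stationary score of each node.
    """
    nodes = sorted(adjacency)
    total = sum(seed_scores.get(n, 0.0) for n in nodes)
    if total <= 0:
        raise ValueError("at least one node needs a positive seed score")
    start = {n: seed_scores.get(n, 0.0) / total for n in nodes}
    current = dict(start)
    for _ in range(max_iter):
        # Restart step: a fraction of the walk jumps back to the seed distribution.
        nxt = {n: restart * start[n] for n in nodes}
        # Diffusion step: the rest is split evenly among each node's neighbours.
        for n in nodes:
            mass = (1.0 - restart) * current[n]
            neighbours = adjacency[n]
            if not neighbours:
                nxt[n] += mass  # isolated node keeps its own mass
                continue
            share = mass / len(neighbours)
            for m in neighbours:
                nxt[m] += share
        delta = max(abs(nxt[n] - current[n]) for n in nodes)
        current = nxt
        if delta < tol:
            break
    return current


# Hypothetical toy network (a chain A - B - C - D) and hypothetical P-values.
network = {"A": {"B"}, "B": {"A", "C"}, "C": {"B", "D"}, "D": {"C"}}
pvalues = {"A": 1e-8, "B": 0.5, "C": 0.5, "D": 0.5}

weighted = propagate(network, gene_scores_from_pvalues(pvalues))
unweighted = propagate(network, {"A": 1.0})  # seed set containing only gene A

for gene in sorted(network):
    print(gene, round(weighted[gene], 4), round(unweighted[gene], 4))
```

In the weighted run every gene contributes to the starting distribution in proportion to its evidence, whereas the seed-set run discards everything below the chosen threshold. Which works better in practice depends, as the abstract notes, on the quality of the summary statistics.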
Affiliation(s)
- Giovanni Visonà
- Empirical Inference, Max-Planck Institute for Intelligent Systems, Tübingen 72076, Germany
|
14
|
He J, Antonyan L, Zhu H, Ardila K, Li Q, Enoma D, Zhang W, Liu A, Chekouo T, Cao B, MacDonald ME, Arnold PD, Long Q. A statistical method for image-mediated association studies discovers genes and pathways associated with four brain disorders. Am J Hum Genet 2024; 111:48-69. [PMID: 38118447 PMCID: PMC10806749 DOI: 10.1016/j.ajhg.2023.11.006] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/03/2023] [Revised: 11/04/2023] [Accepted: 11/16/2023] [Indexed: 12/22/2023] Open
Abstract
Brain imaging and genomics are critical tools enabling characterization of the genetic basis of brain disorders. However, imaging large cohorts is expensive and may be unavailable for legacy datasets used for genome-wide association studies (GWASs). Using an integrated feature selection/aggregation model, we developed an image-mediated association study (IMAS), which utilizes borrowed imaging/genomics data to conduct association mapping in legacy GWAS cohorts. By leveraging the UK Biobank image-derived phenotypes (IDPs), the IMAS discovered genetic bases underlying four neuropsychiatric disorders and verified them by analyzing annotations, pathways, and expression quantitative trait loci (eQTLs). A cerebellar-mediated mechanism was identified to be common to the four disorders. Simulations show that, if the goal is identifying genetic risk, our IMAS is more powerful than a hypothetical protocol in which the imaging results were available in the GWAS dataset. This implies the feasibility of reanalyzing legacy GWAS datasets without conducting additional imaging, yielding cost savings for integrated analysis of genetics and imaging.
Affiliation(s)
- Jingni He
- Department of Biochemistry and Molecular Biology, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada
| | - Lilit Antonyan
- Department of Medical Genetics, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada; The Mathison Centre for Mental Health Research & Education, Hotchkiss Brain Institute, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada
| | - Harold Zhu
- Department of Biological Sciences, Faculty of Science, University of Calgary, Calgary, AB, Canada
| | - Karen Ardila
- Department of Biomedical Engineering, Schulich School of Engineering, University of Calgary, Calgary, AB, Canada
| | - Qing Li
- Department of Biochemistry and Molecular Biology, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada
| | - David Enoma
- Department of Biochemistry and Molecular Biology, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada
| | | | - Andy Liu
- Sir Winston Churchill High School, Calgary, AB, Canada; College of Letters and Science, University of California, Los Angeles, Los Angeles, CA, USA
| | - Thierry Chekouo
- Department of Mathematics and Statistics, Faculty of Science, University of Calgary, Calgary, AB, Canada; Division of Biostatistics, School of Public Health, University of Minnesota, Minneapolis, MN, USA
| | - Bo Cao
- Department of Psychiatry, Faculty of Medicine & Dentistry, University of Alberta, Edmonton, AB, Canada
| | - M Ethan MacDonald
- The Mathison Centre for Mental Health Research & Education, Hotchkiss Brain Institute, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada; Department of Biomedical Engineering, Schulich School of Engineering, University of Calgary, Calgary, AB, Canada; Department of Electrical and Software Engineering, Schulich School of Engineering, University of Calgary, Calgary, AB, Canada; Department of Radiology, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada; Alberta Children's Hospital Research Institute, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada
| | - Paul D Arnold
- Department of Medical Genetics, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada; The Mathison Centre for Mental Health Research & Education, Hotchkiss Brain Institute, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada; Department of Psychiatry, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada; Alberta Children's Hospital Research Institute, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada.
| | - Quan Long
- Department of Biochemistry and Molecular Biology, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada; Department of Medical Genetics, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada; The Mathison Centre for Mental Health Research & Education, Hotchkiss Brain Institute, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada; Alberta Children's Hospital Research Institute, Cumming School of Medicine, University of Calgary, Calgary, AB, Canada; Department of Mathematics and Statistics, Faculty of Science, University of Calgary, Calgary, AB, Canada.
|
15
|
Shi M, Tanikawa C, Munter HM, Akiyama M, Koyama S, Tomizuka K, Matsuda K, Lathrop GM, Terao C, Koido M, Kamatani Y. Genotype imputation accuracy and the quality metrics of the minor ancestry in multi-ancestry reference panels. Brief Bioinform 2023; 25:bbad509. [PMID: 38221906 PMCID: PMC10788679 DOI: 10.1093/bib/bbad509] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/14/2023] [Revised: 11/20/2023] [Accepted: 12/13/2023] [Indexed: 01/16/2024] Open
Abstract
Large-scale imputation reference panels are currently available and have contributed to efficient genome-wide association studies through genotype imputation. However, whether large multi-ancestry or small population-specific reference panels are the optimal choice for under-represented populations continues to be debated. We imputed genotypes of East Asian (180k Japanese) subjects using the Trans-Omics for Precision Medicine reference panel and found that the standard imputation quality metric (Rsq) overestimated dosage r² (squared correlation between imputed dosage and true genotype), particularly in marginal-quality bins. Variance component analysis of Rsq revealed that increased imputed-genotype certainty (dosages closer to 0, 1 or 2) caused the upward bias, indicating some systemic bias in the imputation. Through systematic simulations using different template switching rates (θ value) in the hidden Markov model, we revealed that a lower θ value increased the imputed-genotype certainty and Rsq; however, dosage r² was insensitive to the θ value, thereby causing a deviation. In simulated reference panels with different sizes and ancestral diversities, the θ value estimates from Minimac decreased with the size of a single ancestry and increased with the ancestral diversity. Thus, Rsq could deviate from dosage r² for a subpopulation in the multi-ancestry panel, and the deviation reflects different imputed-dosage distributions. Finally, despite the impact of the θ value, distant ancestries in the reference panel contributed only a few additional variants passing a predefined Rsq threshold. We conclude that the θ value substantially impacts the imputed dosage and the imputation quality metric value.
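The distinction between the estimated quality metric and the true dosage correlation can be made concrete with a toy calculation. The sketch below is not the authors' analysis. It assumes the variance-ratio form commonly described for a Minimac-style Rsq (observed variance of imputed haplotype dosages divided by the variance expected under perfect imputation, p(1 - p)), works at the haplotype level for simplicity, and uses hypothetical dosages.

```python
def estimated_rsq(hap_dosages):
    """Variance-ratio quality estimate (assumed Minimac-style form).

    hap_dosages holds imputed alternate-allele probabilities, one per haplotype.
    """
    n = len(hap_dosages)
    p_hat = sum(hap_dosages) / n
    if p_hat <= 0.0 or p_hat >= 1.0:
        return 0.0  # monomorphic site: the ratio is undefined, report 0
    observed_var = sum((d - p_hat) ** 2 for d in hap_dosages) / n
    return observed_var / (p_hat * (1.0 - p_hat))


def squared_correlation(x, y):
    """Squared Pearson correlation between two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy * sxy / (sxx * syy)


# Hypothetical data: true alleles on four haplotypes and two sets of imputed
# dosages. "sharp" is "soft" pushed linearly away from 0.5, i.e. more certain.
truth = [0, 1, 0, 1]
soft = [0.3, 0.7, 0.4, 0.6]
sharp = [0.5 + 2.0 * (d - 0.5) for d in soft]

for name, dosages in (("soft", soft), ("sharp", sharp)):
    print(name, estimated_rsq(dosages), squared_correlation(dosages, truth))
```

Because a correlation is unchanged by a linear rescaling of the dosages, the squared correlation with the truth is the same for both sets, while the variance-ratio estimate rises as the dosages move toward 0 and 1. This is only a simplified picture of how greater imputed-genotype certainty can inflate Rsq without improving accuracy.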
Affiliation(s)
- Mingyang Shi
- Laboratory of Complex Trait Genomics, Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, The University of Tokyo, Tokyo, Japan
| | - Chizu Tanikawa
- Laboratory of Clinical Genome Sequencing, Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, The University of Tokyo, Tokyo, Japan
| | - Hans Markus Munter
- Victor Phillip Dahdaleh Institute of Genomic Medicine, McGill University, Montreal, Québec, Canada
| | - Masato Akiyama
- Department of Ocular Pathology and Imaging Science, Graduate School of Medical Sciences, Kyushu University, Fukuoka, Japan
| | - Satoshi Koyama
- Cardiovascular Disease Initiative, The Broad Institute of MIT and Harvard, Cambridge, Massachusetts, USA
| | - Kohei Tomizuka
- Laboratory for Statistical and Translational Genetics, RIKEN Center for Integrative Medical Sciences, Yokohama, Japan
| | - Koichi Matsuda
- Laboratory of Clinical Genome Sequencing, Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, The University of Tokyo, Tokyo, Japan
| | - Gregory Mark Lathrop
- Victor Phillip Dahdaleh Institute of Genomic Medicine, McGill University, Montreal, Québec, Canada
| | - Chikashi Terao
- Laboratory for Statistical and Translational Genetics, RIKEN Center for Integrative Medical Sciences, Yokohama, Japan
| | - Masaru Koido
- Laboratory of Complex Trait Genomics, Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, The University of Tokyo, Tokyo, Japan
- Laboratory for Statistical and Translational Genetics, RIKEN Center for Integrative Medical Sciences, Yokohama, Japan
| | - Yoichiro Kamatani
- Laboratory of Complex Trait Genomics, Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, The University of Tokyo, Tokyo, Japan
- Laboratory for Statistical and Translational Genetics, RIKEN Center for Integrative Medical Sciences, Yokohama, Japan
|
16
|
Hu H, Zhao H, Zhong T, Dong X, Wang L, Han P, Li Z. Adaptive deep propagation graph neural network for predicting miRNA-disease associations. Brief Funct Genomics 2023; 22:453-462. [PMID: 37078739 DOI: 10.1093/bfgp/elad010] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 11/25/2022] [Revised: 02/13/2023] [Accepted: 03/09/2023] [Indexed: 04/21/2023] Open
Abstract
BACKGROUND: A large number of experiments show that the abnormal expression of miRNA is closely related to the occurrence, diagnosis and treatment of diseases. Identifying associations between miRNAs and diseases is important for clinical applications in complex human diseases. However, traditional biological experimental methods and calculation-based methods have many limitations, which has led to the development of more efficient and accurate deep learning methods for predicting miRNA-disease associations. RESULTS: In this paper, we propose a novel model based on an adaptive deep propagation graph neural network to predict miRNA-disease associations (ADPMDA). We first construct the miRNA-disease heterogeneous graph based on known miRNA-disease pairs, miRNA integrated similarity information, miRNA sequence information and disease similarity information. Then, we project the features of miRNAs and diseases into a low-dimensional space. After that, an attention mechanism is used to aggregate the local features of central nodes. In particular, an adaptive deep propagation graph neural network is employed to learn the node embeddings, which can adaptively adjust the local and global information of nodes. Finally, a multi-layer perceptron is used to score miRNA-disease pairs. CONCLUSION: Experiments on the Human MicroRNA Disease Database (HMDD) v3.0 dataset show that ADPMDA achieves a mean AUC of 94.75% under 5-fold cross-validation. We further conduct case studies on esophageal neoplasms, lung neoplasms and lymphoma to confirm the effectiveness of our proposed model; 49, 49 and 47 of the top 50 predicted miRNAs associated with these diseases are confirmed, respectively. These results demonstrate the effectiveness and superiority of our model in predicting miRNA-disease associations.
Affiliation(s)
- Hua Hu
- College of Information Science and Engineering, Zaozhuang University, Zaozhuang 277122, China
| | - Huan Zhao
- School of Computer Science and Technology, China University of Mining and Technology, Xuzhou 221008, China
| | - Tangbo Zhong
- School of Computer Science and Technology, China University of Mining and Technology, Xuzhou 221008, China
| | - Xishang Dong
- College of Information Science and Engineering, Zaozhuang University, Zaozhuang 277122, China
| | - Lei Wang
- College of Information Science and Engineering, Zaozhuang University, Zaozhuang 277122, China
- Big Data and Intelligent Computing Research Center, Guangxi Academy of Science, Nanning 541006, China
| | - Pengyong Han
- Central Lab, Changzhi Medical College, Changzhi 046012, China
| | - Zhengwei Li
- College of Information Science and Engineering, Zaozhuang University, Zaozhuang 277122, China
- Big Data and Intelligent Computing Research Center, Guangxi Academy of Science, Nanning 541006, China
- KUNPAND Communications (Kunshan) Co., Ltd., Suzhou 215300, China
|
17
|
Li J, Ma S, Pei H, Jiang J, Zou Q, Lv Z. Review of T cell proliferation regulatory factors in treatment and prognostic prediction for solid tumors. Heliyon 2023; 9:e21329. [PMID: 37954355 PMCID: PMC10637962 DOI: 10.1016/j.heliyon.2023.e21329] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/14/2023] [Revised: 10/15/2023] [Accepted: 10/19/2023] [Indexed: 11/14/2023] Open
Abstract
T cell proliferation regulators (Tcprs), which are positive regulators that promote T cell function, have made great contributions to the development of therapies to improve T cell function. Chimeric antigen receptor (CAR)-T cell therapy, a type of adoptive cell transfer therapy that targets tumor cells and enhances immune lethality, has led to significant progress in the treatment of hematologic tumors. However, the applications of CAR-T in solid tumor treatment remain limited. Therefore, in this review, we focus on the development of Tcprs for solid tumor therapy and prognostic prediction. We summarize potential strategies for targeting different Tcprs to enhance T cell proliferation and activation and to inhibit cancer progression, thereby improving the antitumor activity and persistence of CAR-T. In summary, we propose means of enhancing CAR-T cells by expressing different Tcprs, which may lead to the development of a new generation of cell therapies.
Affiliation(s)
- Jiayu Li
- Student Innovation Competition Team, College of Biomedical Engineering, Sichuan University, Chengdu 610065, China
- College of Life Science, Sichuan University, Chengdu 610065, China
| | - Shuhan Ma
- Student Innovation Competition Team, College of Biomedical Engineering, Sichuan University, Chengdu 610065, China
| | - Hongdi Pei
- Student Innovation Competition Team, College of Biomedical Engineering, Sichuan University, Chengdu 610065, China
| | - Jici Jiang
- Student Innovation Competition Team, College of Biomedical Engineering, Sichuan University, Chengdu 610065, China
| | - Quan Zou
- Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu 610054, China
- Yangtze Delta Region Institute (Quzhou), University of Electronic Science and Technology of China, Quzhou 324000, China
| | - Zhibin Lv
- Student Innovation Competition Team, College of Biomedical Engineering, Sichuan University, Chengdu 610065, China
|
18
|
Shao M, Zhang Z, Sun H, He J, Wang J, Zhang Q, Cao C. Editorial: Statistical methods for genome-wide association studies (GWAS) and transcriptome-wide association studies (TWAS) and their applications. Front Genet 2023; 14:1287673. [PMID: 37766879 PMCID: PMC10520498 DOI: 10.3389/fgene.2023.1287673] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/02/2023] [Accepted: 09/05/2023] [Indexed: 09/29/2023] Open
Affiliation(s)
- Mengting Shao
- School of Biomedical Engineering and Informatics, Nanjing Medical University, Nanjing, China
| | - Zilong Zhang
- School of Computer Science and Technology, Hainan University, Haikou, China
| | - Huiyan Sun
- School of Artificial Intelligence, Jilin University, Changchun, China
| | - Jingni He
- Department of Biochemistry and Molecular Biology, University of Calgary, Calgary, AB, Canada
| | - Juexin Wang
- Department of Biohealth Informatics, Indiana University Purdue University Indianapolis, Indianapolis, IN, United States
| | - Qingrun Zhang
- Department of Mathematics and Statistics, University of Calgary, Calgary, AB, Canada
| | - Chen Cao
- School of Biomedical Engineering and Informatics, Nanjing Medical University, Nanjing, China
|
19
|
Fan R, Ding Y, Zou Q, Yuan L. Multi-view local hyperplane nearest neighbor model based on independence criterion for identifying vesicular transport proteins. Int J Biol Macromol 2023; 247:125774. [PMID: 37437677 DOI: 10.1016/j.ijbiomac.2023.125774] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/11/2023] [Revised: 06/30/2023] [Accepted: 07/07/2023] [Indexed: 07/14/2023]
Abstract
Vesicular transport proteins participate in various biological processes and play a significant role in the movement of substances within cells. These proteins are associated with numerous human diseases, making their identification particularly important. In this study, we developed a novel strategy for accurately identifying vesicular transport proteins. We developed a novel multi-view classifier called the graph-regularized k-local hyperplane distance nearest neighbor model (HSIC-GHKNN), which combines the Hilbert-Schmidt independence criterion (HSIC)-based multi-view learning method with a local hyperplane distance nearest-neighbor classifier. We first extracted protein evolution information using two feature extraction methods, pseudo-position-specific scoring matrix (PsePSSM) and AATP, and addressed dataset imbalance using the Edited Nearest Neighbors (ENN) algorithm. Subsequently, we employed a local hyperplane distance nearest-neighbor classifier for identification within each view and added an HSIC term to maintain independence between views. We then assessed the performance of our identification strategy and analyzed the PsePSSM and AATP feature sets to determine the factors influencing the classification results. The experimental results demonstrate that the accuracy and Matthews correlation coefficient of our strategy on the independent test set are 85.8% and 0.548, respectively. Our approach outperformed existing methods on most evaluation metrics. In addition, the proposed multi-view classification model can easily be applied to similar identification tasks.
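The abstract reports both accuracy and the Matthews correlation coefficient (MCC). On imbalanced data the two can diverge considerably, which is why they are often reported together. The following minimal sketch computes both from a binary confusion matrix. The counts are hypothetical and are not taken from the paper.

```python
import math


def accuracy(tp, tn, fp, fn):
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)


def matthews_corrcoef(tp, tn, fp, fn):
    """Matthews correlation coefficient for a binary confusion matrix."""
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        return 0.0  # conventional value when any marginal total is zero
    return (tp * tn - fp * fn) / denominator


# Hypothetical imbalanced test set: 50 positives and 150 negatives.
counts = {"tp": 30, "tn": 140, "fp": 10, "fn": 20}

print("accuracy:", accuracy(**counts))
print("MCC:", round(matthews_corrcoef(**counts), 3))
```

Accuracy rewards the many correctly classified negatives, whereas MCC uses all four cells of the confusion matrix, so it stays moderate when a sizeable share of the minority class is missed.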
Affiliation(s)
- Rui Fan
- Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu 610054, China; Yangtze Delta Region Institute (Quzhou), University of Electronic Science and Technology of China, Quzhou, Zhejiang 324000, China
| | - Yijie Ding
- Yangtze Delta Region Institute (Quzhou), University of Electronic Science and Technology of China, Quzhou, Zhejiang 324000, China.
| | - Quan Zou
- Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu 610054, China; Yangtze Delta Region Institute (Quzhou), University of Electronic Science and Technology of China, Quzhou, Zhejiang 324000, China.
| | - Lei Yuan
- Department of Hepatobiliary Surgery, Quzhou People's Hospital, Quzhou, Zhejiang 324000, China.
|