1
Yu Z, Li G, Xu W. Rapid detection of liver metastasis risk in colorectal cancer patients through blood test indicators. Front Oncol 2024; 14:1460136. [PMID: 39324006 PMCID: PMC11422013 DOI: 10.3389/fonc.2024.1460136] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/05/2024] [Accepted: 08/20/2024] [Indexed: 09/27/2024] Open
Abstract
Introduction Colorectal cancer (CRC) is one of the most common malignancies, with liver metastasis being its most common form of metastasis. The diagnosis of colorectal cancer liver metastasis (CRCLM) mainly relies on imaging and puncture biopsy, but there is no simple and quick method for the early diagnosis of CRCLM. Methods This study aims to develop a method for rapidly detecting the risk of liver metastasis in CRC patients through blood test indicators based on machine learning (ML) techniques, thereby improving treatment outcomes. To achieve this, blood test indicators from 246 CRC patients and 256 CRCLM patients were collected and analyzed, including routine blood tests, liver function tests, electrolyte tests, renal function tests, glucose determination, cardiac enzyme profiles, blood lipids, and tumor markers. Six commonly used ML models were used for CRC and CRCLM classification and optimized with a feature selection strategy. Results The AdaBoost algorithm achieved the highest accuracy among the six models (89.3%), which improved to 91.1% after the feature selection strategy was applied, leaving 20 key markers. Conclusions The results demonstrate that the combination of machine learning techniques with blood markers is feasible and effective for the rapid diagnosis of CRCLM, significantly improving diagnostic accuracy and patient prognosis.
Affiliation(s)
- Zhou Yu
  - Affiliated Jinhua Hospital, Zhejiang University School of Medicine, Jinhua, China
- Gang Li
  - College of Mathematical Medicine, Zhejiang Normal University, Jinhua, China
- Wanxiu Xu
  - Xingzhi College, Zhejiang Normal University, Jinhua, China
2
Ma Q, Shen Y, Guo W, Feng K, Huang T, Cai Y. Machine Learning Reveals Impacts of Smoking on Gene Profiles of Different Cell Types in Lung. Life (Basel) 2024; 14:502. [PMID: 38672772 PMCID: PMC11051039 DOI: 10.3390/life14040502] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/08/2024] [Revised: 04/03/2024] [Accepted: 04/10/2024] [Indexed: 04/28/2024] Open
Abstract
Smoking significantly elevates the risk of lung diseases such as chronic obstructive pulmonary disease (COPD) and lung cancer. This risk is attributed to the harmful chemicals in tobacco smoke that damage lung tissue and impair lung function. Current research on the impact of smoking on gene expression in specific lung cells is limited. This study addresses this gap by analyzing gene expression profiles at the single-cell level from 43,539 lung endothelial cells, 234,349 lung epithelial cells, 189,843 lung immune cells, and 16,031 lung stromal cells using advanced machine learning techniques. The data, categorized by different lung cell types, were classified into three smoking states: active smoker, former smoker, and never smoker. Each cell sample encompassed 28,024 feature genes. Employing an incremental feature selection method within a computational framework, several specific genes have been identified as potential markers of smoking status in different lung cell types. These include B2M, EEF1A1, and TPT1 in lung endothelial cells; FTL and MT-ATP8 in lung epithelial cells; HLA-B and HLA-C in lung immune cells; and HSP90B1 and LCN2 in lung stromal cells. Additionally, this study developed quantitative rules for representing the gene expression patterns related to smoking. This research highlights the potential of machine learning in oncology, enhancing our molecular understanding of smoking's harm and laying the groundwork for future mechanism-based studies.
Affiliation(s)
- Qinglan Ma
  - School of Life Sciences, Shanghai University, Shanghai 200444, China
- Yulong Shen
  - Department of Radiotherapy, Strategic Support Force Medical Center, Beijing 100101, China
- Wei Guo
  - Key Laboratory of Stem Cell Biology, Shanghai Jiao Tong University School of Medicine (SJTUSM) & Shanghai Institutes for Biological Sciences (SIBS), Chinese Academy of Sciences (CAS), Shanghai 200030, China
- Kaiyan Feng
  - Department of Computer Science, Guangdong AIB Polytechnic College, Guangzhou 510507, China
- Tao Huang
  - Bio-Med Big Data Center, CAS Key Laboratory of Computational Biology, Shanghai Institute of Nutrition and Health, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai 200031, China
  - CAS Key Laboratory of Tissue Microenvironment and Tumor, Shanghai Institute of Nutrition and Health, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai 200031, China
- Yudong Cai
  - School of Life Sciences, Shanghai University, Shanghai 200444, China
3
Joo MS, Pyo KH, Chung JM, Cho BC. Artificial intelligence-based non-small cell lung cancer transcriptome RNA-sequence analysis technology selection guide. Front Bioeng Biotechnol 2023; 11:1081950. [PMID: 36873350 PMCID: PMC9975749 DOI: 10.3389/fbioe.2023.1081950] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/27/2022] [Accepted: 01/24/2023] [Indexed: 02/17/2023] Open
Abstract
The incidence and mortality rates of lung cancer are high worldwide, and non-small cell lung cancer (NSCLC) accounts for more than 85% of lung cancer cases. Recent NSCLC research has focused on analyzing patient prognosis after surgery and identifying mechanisms in connection with clinical cohort and ribonucleic acid (RNA) sequencing data, including single-cell ribonucleic acid (scRNA) sequencing data. This paper investigates statistical techniques and artificial intelligence (AI) based NSCLC transcriptome data analysis methods, divided into target and analysis technology groups. The methodologies for transcriptome data were schematically categorized so that researchers can easily match analysis methods to their goals. The most widely known and frequently pursued transcriptome analysis goals are to find essential biomarkers, classify carcinomas, and cluster NSCLC subtypes. Transcriptome analysis methods are divided into three major categories: statistical analysis, machine learning, and deep learning. Specific models and ensemble techniques typically used in NSCLC analysis are summarized in this paper, with the intent to lay a foundation for advanced research by converging and linking the various analysis methods available.
Affiliation(s)
- Min Soo Joo
  - School of Electrical and Electronic Engineering, College of Engineering, Yonsei University, Seoul, Republic of Korea
- Kyoung-Ho Pyo
  - Department of Oncology, Severance Hospital, College of Medicine, Yonsei University, Seoul, Republic of Korea
  - Severance Biomedical Science Institute, Yonsei University College of Medicine, Seoul, Republic of Korea
  - Yonsei New Il Han Institute for Integrative Lung Cancer Research, Yonsei University College of Medicine, Seoul, Republic of Korea
  - Division of Medical Oncology, Department of Internal Medicine and Yonsei Cancer Center, Severance Hospital, Yonsei University College of Medicine, Seoul, Republic of Korea
- Jong-Moon Chung
  - School of Electrical and Electronic Engineering, College of Engineering, Yonsei University, Seoul, Republic of Korea
  - Department of Emergency Medicine, College of Medicine, Yonsei University, Seoul, Republic of Korea
- Byoung Chul Cho
  - Division of Medical Oncology, Department of Internal Medicine and Yonsei Cancer Center, Severance Hospital, Yonsei University College of Medicine, Seoul, Republic of Korea
4
Li H, Huang F, Liao H, Li Z, Feng K, Huang T, Cai YD. Identification of COVID-19-Specific Immune Markers Using a Machine Learning Method. Front Mol Biosci 2022; 9:952626. [PMID: 35928229 PMCID: PMC9344575 DOI: 10.3389/fmolb.2022.952626] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Received: 05/25/2022] [Accepted: 06/21/2022] [Indexed: 01/08/2023] Open
Abstract
Notably, severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) has a tight relationship with the immune system. Human resistance to COVID-19 infection comprises two stages. The first stage is immune defense, while the second stage is extensive inflammation. This process is further divided into innate and adaptive immunity during the immune defense phase. These two stages involve various immune cells, including CD4+ T cells, CD8+ T cells, monocytes, dendritic cells, B cells, and natural killer cells. Various immune cells are involved and make up the complex and unique immune system response to COVID-19, providing characteristics that set it apart from other respiratory infectious diseases. In the present study, we identified cell markers for differentiating COVID-19 from common inflammatory responses, non-COVID-19 severe respiratory diseases, and healthy populations based on single-cell profiling of the gene expression of six immune cell types by using Boruta and mRMR feature selection methods. Some features such as IFI44L in B cells, S100A8 in monocytes, and NCR2 in natural killer cells are involved in the innate immune response of COVID-19. Other features such as ZFP36L2 in CD4+ T cells can regulate the inflammatory process of COVID-19. Subsequently, the IFS method was used to determine the best feature subsets and classifiers in the six immune cell types for two classification algorithms. Furthermore, we established the quantitative rules used to distinguish the disease status. The results of this study can provide theoretical support for a more in-depth investigation of COVID-19 pathogenesis and intervention strategies.
Affiliation(s)
- Hao Li
  - College of Biological and Food Engineering, Jilin Engineering Normal University, Changchun, China
- Feiming Huang
  - School of Life Sciences, Shanghai University, Shanghai, China
- Huiping Liao
  - Ophthalmology and Optometry Medical School, Shandong University of Traditional Chinese Medicine, Jinan, China
- Zhandong Li
  - College of Biological and Food Engineering, Jilin Engineering Normal University, Changchun, China
- Kaiyan Feng
  - Department of Computer Science, Guangdong AIB Polytechnic College, Guangzhou, China
- Tao Huang
  - Bio-Med Big Data Center, CAS Key Laboratory of Computational Biology, Shanghai Institute of Nutrition and Health, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai, China
  - CAS Key Laboratory of Tissue Microenvironment and Tumor, Shanghai Institute of Nutrition and Health, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai, China
- Yu-Dong Cai
  - School of Life Sciences, Shanghai University, Shanghai, China

*Correspondence: Tao Huang; Yu-Dong Cai
5
Li Z, Mei Z, Ding S, Chen L, Li H, Feng K, Huang T, Cai YD. Identifying Methylation Signatures and Rules for COVID-19 With Machine Learning Methods. Front Mol Biosci 2022; 9:908080. [PMID: 35620480 PMCID: PMC9127386 DOI: 10.3389/fmolb.2022.908080] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 03/30/2022] [Accepted: 04/27/2022] [Indexed: 11/13/2022] Open
Abstract
The occurrence of coronavirus disease 2019 (COVID-19) has become a serious challenge to global public health. Definitive and effective treatments for COVID-19 are still lacking, and targeted antiviral drugs are not available. In addition, viruses can regulate host innate immunity and antiviral processes through the epigenome to promote viral self-replication and disease progression. In this study, we first analyzed the methylation dataset of COVID-19 using the Monte Carlo feature selection method to obtain a feature list. This feature list was subjected to the incremental feature selection method combined with a decision tree algorithm to extract key biomarkers and to build effective classification models and classification rules that can distinguish patients with COVID-19 from those without. The essential features EPSTI1, NACAP1, SHROOM3, C19ORF35, and MX1 play important roles in the infection and immune response to the novel coronavirus. The six significant rules extracted from the optimal classifier quantitatively explained the expression pattern of COVID-19. These findings indicate that our method can distinguish COVID-19 at the methylation level and provide guidance for the diagnosis and treatment of COVID-19.
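The incremental feature selection step mentioned in this abstract can be illustrated with a minimal sketch. It assumes a feature ranking is already available (in the study it comes from Monte Carlo feature selection; here it is simply given), and it swaps in a nearest-centroid classifier with leave-one-out evaluation in place of the decision tree the authors used. The function names and the toy data are hypothetical and are not taken from the paper.

```python
from statistics import mean

def nearest_centroid_predict(train_X, train_y, x):
    """Predict the label whose class centroid is closest to x (stand-in classifier)."""
    centroids = {}
    for label in set(train_y):
        rows = [r for r, lab in zip(train_X, train_y) if lab == label]
        centroids[label] = [mean(col) for col in zip(*rows)]

    def sq_dist(c):
        return sum((a - b) ** 2 for a, b in zip(x, c))

    return min(sorted(centroids), key=lambda label: sq_dist(centroids[label]))

def loo_accuracy(X, y, feature_idx):
    """Leave-one-out accuracy using only the given feature columns."""
    correct = 0
    for i in range(len(X)):
        train_X = [[row[j] for j in feature_idx] for k, row in enumerate(X) if k != i]
        train_y = [lab for k, lab in enumerate(y) if k != i]
        x = [X[i][j] for j in feature_idx]
        correct += nearest_centroid_predict(train_X, train_y, x) == y[i]
    return correct / len(X)

def incremental_feature_selection(X, y, ranked_features):
    """Score the top-1, top-2, ... prefixes of the ranking and keep the best one."""
    best_acc, best_subset = -1.0, []
    for k in range(1, len(ranked_features) + 1):
        subset = ranked_features[:k]
        acc = loo_accuracy(X, y, subset)
        if acc > best_acc:  # strict '>' keeps the smallest subset on ties
            best_acc, best_subset = acc, subset
    return best_subset, best_acc

# Hypothetical toy data: column 0 separates the classes, columns 1 and 2 are noise.
X = [
    [0.1, 5.0, 1.0],
    [0.2, 1.0, 9.0],
    [0.0, 9.0, 4.0],
    [0.9, 2.0, 8.0],
    [1.0, 8.0, 2.0],
    [0.8, 4.0, 5.0],
]
y = [0, 0, 0, 1, 1, 1]
ranked = [0, 2, 1]  # assumed ranking, most important feature first

best_subset, best_acc = incremental_feature_selection(X, y, ranked)
print(best_subset, best_acc)
```

On the toy data the first-ranked feature alone already classifies every held-out sample correctly, so the search keeps the one-feature subset. With real data the accuracy curve over prefix sizes is usually inspected before fixing the subset.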
Affiliation(s)
- Zhandong Li
  - College of Biological and Food Engineering, Jilin Engineering Normal University, Changchun, China
- Zi Mei
  - Shanghai Institute of Nutrition and Health, Chinese Academy of Sciences, Shanghai, China
- Shijian Ding
  - School of Life Sciences, Shanghai University, Shanghai, China
- Lei Chen
  - College of Information Engineering, Shanghai Maritime University, Shanghai, China
- Hao Li
  - College of Biological and Food Engineering, Jilin Engineering Normal University, Changchun, China
- Kaiyan Feng
  - Department of Computer Science, Guangdong AIB Polytechnic College, Guangzhou, China
- Tao Huang
  - Bio-Med Big Data Center, CAS Key Laboratory of Computational Biology, Shanghai Institute of Nutrition and Health, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai, China
  - CAS Key Laboratory of Tissue Microenvironment and Tumor, Shanghai Institute of Nutrition and Health, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai, China
- Yu-Dong Cai
  - School of Life Sciences, Shanghai University, Shanghai, China

*Correspondence: Tao Huang; Yu-Dong Cai
6
Chen W, Lv X, Zhang W, Hu T, Cao X, Ren Z, Getachew T, Mwacharo JM, Haile A, Sun W. Insights Into Long Non-Coding RNA and mRNA Expression in the Jejunum of Lambs Challenged With Escherichia coli F17. Front Vet Sci 2022; 9:819917. [PMID: 35498757 PMCID: PMC9039264 DOI: 10.3389/fvets.2022.819917] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 11/22/2021] [Accepted: 03/11/2022] [Indexed: 11/13/2022] Open
Abstract
It has long been recognized that enterotoxigenic Escherichia coli (ETEC) is the major pathogen responsible for vomiting and diarrhea. E. coli F17, a main subtype of ETEC, is characterized by high morbidity and mortality in young livestock. However, the transcriptomic basis underlying E. coli F17 infection is not fully understood. In the present study, RNA sequencing was conducted to explore the expression profiles of mRNAs and long non-coding RNAs (lncRNAs) in the jejunum of lambs identified as resistant or sensitive to E. coli F17 in a challenge experiment. A total of 772 differentially expressed (DE) mRNAs and 190 DE lncRNAs were detected between the E. coli F17-resistant and E. coli F17-sensitive lambs (e.g., TFF2, LOC105606142, OLFM4, LYPD8, REG4, APOA4, TCONS_00223467, and TCONS_00241897). Then, a two-step machine learning approach (RX) combining Random Forest and Extreme Gradient Boosting was applied, which identified 16 mRNAs and 17 lncRNAs as potential biomarkers, among which PPP2R3A and TCONS_00182693 were prioritized as key biomarkers involved in E. coli F17 infection. Furthermore, functional enrichment analysis showed that the peroxisome proliferator-activated receptor (PPAR) pathway was significantly enriched in response to E. coli F17 infection. Our findings will help to improve the knowledge of the mechanisms underlying E. coli F17 infection and may provide novel targets for future treatment of E. coli F17 infection.
Affiliation(s)
- Weihao Chen
  - College of Animal Science and Technology, Yangzhou University, Yangzhou, China
- Xiaoyang Lv
  - College of Animal Science and Technology, Yangzhou University, Yangzhou, China
- Weibo Zhang
  - College of Animal Science and Technology, Yangzhou University, Yangzhou, China
- Tingyan Hu
  - College of Animal Science and Technology, Yangzhou University, Yangzhou, China
- Xiukai Cao
  - College of Animal Science and Technology, Yangzhou University, Yangzhou, China
- Ziming Ren
  - College of Animal Science and Technology, Yangzhou University, Yangzhou, China
- Tesfaye Getachew
  - International Centre for Agricultural Research in the Dry Areas, Addis Ababa, Ethiopia
- Joram M. Mwacharo
  - International Centre for Agricultural Research in the Dry Areas, Addis Ababa, Ethiopia
- Aynalem Haile
  - International Centre for Agricultural Research in the Dry Areas, Addis Ababa, Ethiopia
- Wei Sun
  - College of Animal Science and Technology, Yangzhou University, Yangzhou, China
  - Joint International Research Laboratory of Agriculture and Agri-Product Safety of Ministry of Education of China, Yangzhou University, Yangzhou, China

*Correspondence: Wei Sun