1
Xiao T, Kong S, Zhang Z, Hua D, Liu F. A review of big data technology and its application in cancer care. Comput Biol Med 2024; 176:108577. [PMID: 38739981 DOI: 10.1016/j.compbiomed.2024.108577] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/19/2023] [Revised: 05/07/2024] [Accepted: 05/07/2024] [Indexed: 05/16/2024]
Abstract
The development of modern medical devices and information technology has led to rapid growth in the volume of health-related data. The concept of medical big data has emerged globally, alongside significant advances in cancer care that rely on data-driven approaches. However, outstanding issues such as fragmented data governance, low-quality data specifications, and data lock-in still make sharing challenging. Big data technology provides solutions for managing massive heterogeneous data, and combining it with artificial intelligence (AI) techniques such as machine learning (ML) and deep learning (DL) helps to better mine the intrinsic connections within the data. This paper surveys and organizes recent articles on big data technology and its applications in cancer, dividing them into three types to outline their primary content and summarize their critical role in assisting cancer care. It then examines the latest research directions for big data technology in cancer and evaluates the current state of development of each type of application. Finally, current challenges and opportunities are discussed, and recommendations are made for the further integration of big data technology into the medical industry.
Affiliation(s)
- Tianyun Xiao
  - Hebei Key Laboratory of Data Science and Application, North China University of Science and Technology, Tangshan, Hebei, 063210, China; The Key Laboratory of Engineering Computing in Tangshan City, North China University of Science and Technology, Tangshan, Hebei, 063210, China; College of Science, North China University of Science and Technology, Tangshan, Hebei, 063210, China
- Shanshan Kong
  - College of Science, North China University of Science and Technology, Tangshan, Hebei, 063210, China
- Zichen Zhang
  - Hebei Key Laboratory of Data Science and Application, North China University of Science and Technology, Tangshan, Hebei, 063210, China; The Key Laboratory of Engineering Computing in Tangshan City, North China University of Science and Technology, Tangshan, Hebei, 063210, China; College of Science, North China University of Science and Technology, Tangshan, Hebei, 063210, China
- Dianbo Hua
  - Beijing Sitairui Cancer Data Analysis Joint Laboratory, Beijing, 101149, China
- Fengchun Liu
  - Hebei Key Laboratory of Data Science and Application, North China University of Science and Technology, Tangshan, Hebei, 063210, China; The Key Laboratory of Engineering Computing in Tangshan City, North China University of Science and Technology, Tangshan, Hebei, 063210, China; College of Science, North China University of Science and Technology, Tangshan, Hebei, 063210, China; Hebei Engineering Research Center for the Intelligentization of Iron Ore Optimization and Ironmaking Raw Materials Preparation Processes, North China University of Science and Technology, Tangshan, Hebei, China; Tangshan Intelligent Industry and Image Processing Technology Innovation Center, North China University of Science and Technology, Tangshan, Hebei, China
2
Thomas A, Douglas E, Reis-Filho JS, Gurcan MN, Wen HY. Metaplastic Breast Cancer: Current Understanding and Future Directions. Clin Breast Cancer 2023; 23:775-783. [PMID: 37179225 PMCID: PMC10584986 DOI: 10.1016/j.clbc.2023.04.004] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 03/02/2023] [Accepted: 04/16/2023] [Indexed: 05/15/2023]
Abstract
Metaplastic breast cancers (MBC) encompass a group of highly heterogeneous tumors which share the ability to differentiate into squamous, mesenchymal or neuroectodermal components. While often termed rare breast tumors, given the relatively high prevalence of breast cancer, they are seen with some frequency. Depending upon the definition applied, MBC represents 0.2% to 1% of breast cancers diagnosed in the United States. Less is known about the epidemiology of MBC globally, though a growing number of reports are providing information on this. These tumors are often more advanced at presentation relative to breast cancer broadly. While more indolent subtypes exist, the majority of MBC subtypes are associated with inferior survival. MBC is most commonly of triple-negative phenotype. In less common hormone receptor-positive MBCs, hormone receptor status appears not to be prognostic. In contrast, relatively rare HER2-positive MBCs are associated with superior outcomes. Multiple potentially targetable molecular features are overrepresented in MBC, including DNA repair deficiency signatures and alterations in the PI3K/AKT/mTOR and WNT pathways. Data on the prevalence of targets for novel antibody-drug conjugates are also emerging. While chemotherapy appears to be less active in MBC than in other breast cancer subtypes, efficacy is seen in some MBCs. Disease-specific trials, as well as reports of exceptional responses, may provide clues for novel approaches to this often hard-to-treat breast cancer. Strategies which harness newer research tools, such as large data and artificial intelligence, hold the promise of overcoming historic barriers to the study of uncommon tumors and could markedly advance disease-specific understanding in MBC.
Affiliation(s)
- Alexandra Thomas
  - Department of Internal Medicine, Wake Forest University School of Medicine, Winston-Salem, NC
- Emily Douglas
  - Department of Internal Medicine, Wake Forest University School of Medicine, Winston-Salem, NC
- Jorge S Reis-Filho
  - Department of Pathology and Laboratory Medicine, Memorial Sloan Kettering Cancer Center, New York, NY
- Metin N Gurcan
  - Department of Internal Medicine, Wake Forest University School of Medicine, Winston-Salem, NC
- Hannah Y Wen
  - Department of Pathology and Laboratory Medicine, Memorial Sloan Kettering Cancer Center, New York, NY
3
Thomas A, Reis-Filho JS, Geyer CE, Wen HY. Rare subtypes of triple negative breast cancer: Current understanding and future directions. NPJ Breast Cancer 2023; 9:55. [PMID: 37353557 DOI: 10.1038/s41523-023-00554-x] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 11/12/2022] [Accepted: 05/19/2023] [Indexed: 06/25/2023] [Open Access]
Abstract
Rare subtypes of triple-negative breast cancers (TNBC) are a heterogeneous group of tumors, comprising 5-10% of all TNBCs. Despite accounting for an absolute number of cases in aggregate approaching that of other less common, but well studied solid tumors, rare subtypes of triple-negative disease remain understudied. Low prevalence, diagnostic challenges and overlapping diagnoses have hindered consistent categorization of these breast cancers. Here we review epidemiology, histology and clinical and molecular characteristics of metaplastic, triple-negative lobular, apocrine, adenoid cystic, secretory and high-grade neuroendocrine TNBCs. Medullary pattern invasive ductal carcinoma no special type, which until recently was considered a distinct subtype, is also discussed. With this background, we review how biological principles often applied to the study of TNBC no special type could improve our understanding of rare TNBCs. These could include the utilization of targeted molecular approaches or disease-agnostic tools such as tumor mutational burden or germline mutation-directed treatments. Burgeoning data also suggest that pathologic response to neoadjuvant therapy and circulating tumor DNA have value in understanding rare subtypes of TNBC. Finally, we discuss a framework for advancing disease-specific knowledge in this space. While the conduct of randomized trials in rare TNBC subtypes has been challenging, re-envisioning trial design and technologic tools may offer new opportunities. These include embedding rare TNBC subtypes in umbrella studies of rare tumors, retrospective review of contemporary trials, prospective identification of patients with rare TNBC subtypes enrolling in clinical trials and querying big data for outcomes of patients with rare breast tumors.
Affiliation(s)
- Alexandra Thomas
  - Department of Internal Medicine, Atrium Health Wake Forest Baptist Cancer Center, Winston-Salem, NC, USA
- Jorge S Reis-Filho
  - Department of Pathology and Laboratory Medicine, Memorial Sloan Kettering Cancer Center, New York, NY, USA
- Charles E Geyer
  - Department of Medicine, University of Pittsburgh UPMC Hillman Cancer Center, Pittsburgh, PA, USA
- Hannah Y Wen
  - Department of Pathology and Laboratory Medicine, Memorial Sloan Kettering Cancer Center, New York, NY, USA
4
Cortina CS, Cobb AN, Kong AL. Invited Commentary: Current and Future Opportunities in Mitigating Breast Cancer Disparity. J Am Coll Surg 2023; 236:1239-1241. [PMID: 37058342 DOI: 10.1097/xcs.0000000000000664] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/29/2023]
5
Evaluation of Feature Selection Techniques for Breast Cancer Risk Prediction. International Journal of Environmental Research and Public Health 2021; 18(20):10670. [PMID: 34682416 PMCID: PMC8535206 DOI: 10.3390/ijerph182010670] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 05/25/2021] [Revised: 09/25/2021] [Accepted: 09/27/2021] [Indexed: 12/24/2022]
Abstract
Breast cancer is a major public health problem. This study evaluates several feature ranking techniques together with machine learning classifiers to identify relevant factors regarding the probability of developing breast cancer and to improve the performance of risk prediction models for breast cancer in a healthy population. The dataset, with 919 cases and 946 controls, comes from the MCC-Spain study and includes only environmental and genetic features. Our aim is to analyze which factors in the cancer risk prediction model are the most important for breast cancer prediction. Likewise, quantifying the stability of feature selection methods becomes essential before trying to gain insight into the data. This paper assesses several feature selection algorithms in terms of performance for a set of predictive models. Furthermore, their robustness is quantified to analyze both the similarity between the feature selection rankings and their own stability. The ranking provided by the SVM-RFE approach leads to the best performance in terms of the area under the ROC curve (AUC) metric. The top 47 ranked features obtained with this approach, fed to the Logistic Regression classifier, achieve an AUC of 0.616. This is an improvement of 5.8% in comparison with the full feature set. Furthermore, the SVM-RFE ranking technique turned out to be highly stable (as did Random Forest), whereas Relief and the wrapper approaches are quite unstable. This study demonstrates that the stability and performance of the model should be studied together: Random Forest and SVM-RFE turned out to be the most stable algorithms, but in terms of model performance SVM-RFE outperforms Random Forest.
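The two quantities this abstract weighs against each other, AUC and ranking stability, can be illustrated with a minimal Python sketch. It is not the study's code. The AUC is computed with the standard rank-based (Mann-Whitney) formulation. The stability measure, mean pairwise Jaccard similarity of the top-k features, is an assumption chosen for illustration, since the abstract does not say which stability index was used. The function names `auc` and `topk_stability` and all data are hypothetical.

```python
from itertools import combinations


def auc(labels, scores):
    """Area under the ROC curve via the Mann-Whitney statistic.

    labels: 1 for cases, 0 for controls; scores: predicted risk.
    Each (case, control) pair counts 1 if the case scores higher,
    0.5 on a tie, and 0 otherwise.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one case and one control")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def topk_stability(rankings, k):
    """Mean pairwise Jaccard similarity of the top-k feature sets.

    rankings: feature rankings (best first), e.g. one per resample.
    This index is an illustrative assumption, not necessarily the
    measure used in the study. Returns a value in [0, 1].
    """
    if len(rankings) < 2:
        raise ValueError("need at least two rankings to compare")
    tops = [set(r[:k]) for r in rankings]
    pairs = list(combinations(tops, 2))
    return sum(len(a & b) / len(a | b) for a, b in pairs) / len(pairs)


# Toy data, purely illustrative.
toy_auc = auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1])
toy_stability = topk_stability(
    [["a", "b", "c", "d"], ["a", "c", "b", "d"], ["b", "a", "d", "c"]],
    k=2,
)
print(toy_auc, toy_stability)
```

Reporting both numbers for each feature selection method makes the trade-off in the abstract concrete: a method can rank features consistently across resamples and still yield a lower AUC than another equally stable method.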