1
Yan Z, Luke BT, Tsang SX, Xing R, Pan Y, Liu Y, Wang J, Geng T, Li J, Lu Y. Identification of gene signatures used to recognize biological characteristics of gastric cancer upon gene expression data. Biomark Insights 2014; 9:67-76. [PMID: 25210421 PMCID: PMC4149392 DOI: 10.4137/bmi.s13059] [Received: 08/21/2013] [Revised: 03/11/2014] [Accepted: 03/12/2014]
Abstract
High-throughput gene expression microarrays can be examined by machine-learning algorithms to identify gene signatures that recognize the biological characteristics of specific human diseases, including cancer, with high sensitivity and specificity. A previous study compared 20 gastric cancer (GC) samples against 20 normal tissue (NT) samples and identified 1,519 differentially expressed genes (DEGs). In this study, the Classification Information Index (CII), Information Gain Index (IGI), and RELIEF algorithms are used to mine the previously reported gene expression profiling data. In all, 29 of these genes are identified by all three algorithms and are treated as GC candidate biomarkers. Three of these biomarkers, COL1A2, ATP4B, and HADHSC, are selected and further examined using quantitative real-time polymerase chain reaction (qRT-PCR) and immunohistochemistry (IHC) staining in two independent sets of GC and normal adjacent tissue (NAT) samples. Our study shows that COL1A2 and HADHSC are the two best biomarkers from the microarray data, distinguishing all GC samples from the NT samples, whereas ATP4B is diagnostically significant in lab tests because of its wider range of fold-changes in expression. Herein, a data-mining model applicable to small sample sizes is presented and discussed. Our results suggest that this mining model may be useful in small sample-size studies to identify putative biomarkers and potential biological features of GC.
Affiliation(s)
- Zhi Yan
  - Laboratory of Molecular Oncology, Key Laboratory of Carcinogenesis and Translational Research (Ministry of Education), Peking University Cancer Hospital and Institute, Beijing, People’s Republic of China
- Brian T Luke
  - Advanced Biomedical Computing Center, Frederick National Laboratory for Cancer Research, Frederick, MD, USA
- Rui Xing
  - Laboratory of Molecular Oncology, Key Laboratory of Carcinogenesis and Translational Research (Ministry of Education), Peking University Cancer Hospital and Institute, Beijing, People’s Republic of China
- Yuanming Pan
  - Laboratory of Molecular Oncology, Key Laboratory of Carcinogenesis and Translational Research (Ministry of Education), Peking University Cancer Hospital and Institute, Beijing, People’s Republic of China
- Yixuan Liu
  - Laboratory of Molecular Oncology, Key Laboratory of Carcinogenesis and Translational Research (Ministry of Education), Peking University Cancer Hospital and Institute, Beijing, People’s Republic of China
- Jinlian Wang
  - Georgetown University Lombardi Comprehensive Cancer Center, Washington, DC, USA
- Tao Geng
  - College of Electronic Information and Control Engineering, Beijing University of Technology, Beijing, People’s Republic of China
- Jiangeng Li
  - College of Electronic Information and Control Engineering, Beijing University of Technology, Beijing, People’s Republic of China
- Youyong Lu
  - Laboratory of Molecular Oncology, Key Laboratory of Carcinogenesis and Translational Research (Ministry of Education), Peking University Cancer Hospital and Institute, Beijing, People’s Republic of China
2
Kutchukian PS, Vasilyeva NY, Xu J, Lindvall MK, Dillon MP, Glick M, Coley JD, Brooijmans N. Inside the mind of a medicinal chemist: the role of human bias in compound prioritization during drug discovery. PLoS One 2012. [PMID: 23185259 PMCID: PMC3504051 DOI: 10.1371/journal.pone.0048476]
Abstract
Medicinal chemists' "intuition" is critical for success in modern drug discovery. Early in the discovery process, chemists select a subset of compounds for further research, often from many viable candidates. These decisions determine the success of a discovery campaign, and ultimately what kind of drugs are developed and marketed to the public. Surprisingly little is known about the cognitive aspects of chemists' decision-making when they prioritize compounds. We investigate 1) how and to what extent chemists simplify the problem of identifying promising compounds, 2) whether chemists agree with each other about the criteria used for such decisions, and 3) how accurately chemists report the criteria they use for these decisions. Chemists were surveyed and asked to select chemical fragments that they would be willing to develop into a lead compound from a set of ~4,000 available fragments. Based on each chemist's selections, computational classifiers were built to model each chemist's selection strategy. Results suggest that chemists greatly simplified the problem, typically using only 1-2 of many possible parameters when making their selections. Although chemists tended to use the same parameters to select compounds, differing value preferences for these parameters led to an overall lack of consensus in compound selections. Moreover, what little agreement there was among the chemists was largely in what fragments were undesirable. Furthermore, chemists were often unaware of the parameters (such as compound size) which were statistically significant in their selections, and overestimated the number of parameters they employed. A critical evaluation of the problem space faced by medicinal chemists and cognitive models of categorization were especially useful in understanding the low consensus between chemists.
Affiliation(s)
- Peter S. Kutchukian
  - Center for Proteomic Chemistry, Novartis Institutes for BioMedical Research, Cambridge, Massachusetts, United States of America
- Nadya Y. Vasilyeva
  - Department of Psychology, Northeastern University, Boston, Massachusetts, United States of America
- Jordan Xu
  - Global Discovery Chemistry, Novartis Institutes for BioMedical Research, Emeryville, California, United States of America
- Mika K. Lindvall
  - Global Discovery Chemistry, Novartis Institutes for BioMedical Research, Emeryville, California, United States of America
- Michael P. Dillon
  - Global Discovery Chemistry, Novartis Institutes for BioMedical Research, Emeryville, California, United States of America
- Meir Glick
  - Center for Proteomic Chemistry, Novartis Institutes for BioMedical Research, Cambridge, Massachusetts, United States of America
- John D. Coley
  - Department of Psychology, Northeastern University, Boston, Massachusetts, United States of America
  - * E-mail: (JDC); (NB)
- Natasja Brooijmans
  - Blueprint Medicines, Cambridge, Massachusetts, United States of America