1. Rosset S, Perlich C, Świrszcz G, Melville P, Liu Y. Medical data mining: insights from winning two competitions. Data Min Knowl Discov 2009. [DOI: 10.1007/s10618-009-0158-x] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.9] [Indexed: 10/20/2022]
2. Perlich C, Melville P, Liu Y, Świrszcz G, Lawrence R, Rosset S. Breast cancer identification. SIGKDD Explor Newsl 2008. [DOI: 10.1145/1540276.1540289] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.4] [Indexed: 10/20/2022]
Abstract
We describe the ideas and methodologies that we developed in addressing the KDD Cup 2008 on early breast cancer detection, and discuss how they contributed to our success. The most important components of our solution were 1) the identification of predictive information in the patient identifier, 2) a linear SVM on the 117 provided features, and 3) a heuristic post-processing approach to optimize the evaluation criteria.
Affiliation(s)
- Prem Melville, IBM T.J. Watson Research Center, Yorktown Heights, NY
- Yan Liu, IBM T.J. Watson Research Center, Yorktown Heights, NY