Du Y, Liang Y, Yun D. Data mining for seeking an accurate quantitative relationship between molecular structure and GC retention indices of alkenes by projection pursuit. Journal of Chemical Information and Computer Sciences 2002; 42:1283-92. [PMID: 12444724 DOI: 10.1021/ci020285u]
[Citation(s) in RCA: 34] [Impact Index Per Article: 1.5] [Indexed: 11/29/2022]
Abstract
This paper presents primary data mining on alkenes, seeking an accurate quantitative relationship between molecular structure and gas chromatographic retention indices. Results obtained from projection pursuit show that the alkenes investigated fall into an interesting classification. On this basis, a new variable, named the class distance variable of alkenes, is proposed; it essentially describes information about branching, the position of the double bonds, the number of double bonds, and so on. With the help of the new variable, both the fitting and the prediction accuracy of the regression model can be dramatically improved. The results show that projection pursuit, a technique developed in statistics, is a promising tool for seeking an accurate quantitative structure-retention relationship (QSRR).
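The general idea of projection pursuit, searching for a low-dimensional projection in which hidden group structure becomes visible, can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration: the abstract does not state the projection index, the molecular descriptors, or the definition of the class distance variable. The sketch uses synthetic two-dimensional data, minimum kurtosis as the projection index (low kurtosis suggests a bimodal, two-class projection), and a plain grid search over directions. The names `pursue_projection` and `make_two_class_data` are hypothetical.

```python
import math
import random


def kurtosis(values):
    """Fourth standardized moment; low values suggest a bimodal sample."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / n
    return sum((v - mean) ** 4 for v in values) / (n * var ** 2)


def project(points, angle):
    """Project 2-D points onto the unit vector at the given angle (radians)."""
    c, s = math.cos(angle), math.sin(angle)
    return [x * c + y * s for x, y in points]


def pursue_projection(points, steps=180):
    """Grid-search directions in [0, pi) and return the minimum-kurtosis angle.

    Assumption: kurtosis is used as the projection index purely for
    illustration; the paper's index is not given in the abstract.
    """
    angles = [math.pi * k / steps for k in range(steps)]
    return min(angles, key=lambda a: kurtosis(project(points, a)))


def make_two_class_data(n_per_class=200, seed=0):
    """Synthetic stand-in data: two classes separated along x, noisy along y."""
    rng = random.Random(seed)
    points = []
    for center in (-3.0, 3.0):
        for _ in range(n_per_class):
            points.append((center + rng.gauss(0, 1), rng.gauss(0, 5)))
    return points


points = make_two_class_data()
best_angle = pursue_projection(points)
scores = project(points, best_angle)  # 1-D scores along the found direction
```

In this toy setting the largest variance lies along the noisy y-axis, so a variance-based projection would miss the classes, while the kurtosis-guided search should settle near the x-axis, where the two groups separate. A class-derived quantity computed from such scores could then be added as an extra regressor, which is the role the abstract attributes to its class distance variable.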