Abstract
We introduce the Interactive Decision Committee method for classification when high-dimensional feature variables are grouped into feature categories. The proposed method uses the interactive relationships among feature categories to build base classifiers which are combined using decision committees. A two-stage 5-fold cross-validation technique is utilized to decide the total number of base classifiers to be combined. The proposed procedure is useful for classifying biochemicals on the basis of toxicity activity, where the feature space consists of chemical descriptors and the responses are binary indicators of toxicity activity. Each descriptor belongs to at least one descriptor category. The support vector machine algorithm is utilized as a classifier inducer. Forward selection is used to select the best combinations of the base classifiers given the number of base classifiers. We applied the proposed method to two chemical toxicity data sets. For these data sets, the proposed method outperforms other decision committee methods including AdaBoost, bagging, random forests, the univariate decision committee, and a single large, unaggregated classifier.
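The forward-selection step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the base classifiers (for example, SVMs fit on feature categories or their interactions) have already produced binary 0/1 predictions on a validation set, that the committee combines them by simple majority vote, and that validation accuracy is the selection criterion. The function names committee_vote, accuracy, and forward_select are hypothetical, and the committee size k is taken as given here rather than chosen by the two-stage cross-validation.

```python
from typing import List, Sequence


def committee_vote(preds: Sequence[Sequence[int]]) -> List[int]:
    """Majority vote over 0/1 predictions; ties go to class 1 (an assumption)."""
    n_samples = len(preds[0])
    n_members = len(preds)
    return [
        1 if 2 * sum(p[i] for p in preds) >= n_members else 0
        for i in range(n_samples)
    ]


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of samples where the prediction matches the label."""
    return sum(int(a == b) for a, b in zip(y_true, y_pred)) / len(y_true)


def forward_select(
    base_preds: Sequence[Sequence[int]], y: Sequence[int], k: int
) -> List[int]:
    """Greedily pick up to k base classifiers, at each step adding the one
    that gives the highest committee accuracy on the validation labels y."""
    chosen: List[int] = []
    remaining = list(range(len(base_preds)))
    for _ in range(min(k, len(base_preds))):
        best = max(
            remaining,
            key=lambda j: accuracy(
                y, committee_vote([base_preds[i] for i in chosen + [j]])
            ),
        )
        chosen.append(best)
        remaining.remove(best)
    return chosen
```

Calling forward_select(base_preds, y, k) returns the indices of the selected base classifiers in the order they were added. In the setting of the abstract, this step would be repeated for each candidate k, with cross-validation deciding which k to keep.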
Disciplines
Statistical Models
Suggested Citation
Kang, Chaeryon; Zhu, Hao; Wright, Fred A.; Zou, Fei; and Kosorok, Michael R., "The Interactive Decision Committee for Chemical Toxicity Analysis" (December 2010). The University of North Carolina at Chapel Hill Department of Biostatistics Technical Report Series. Working Paper 18.
http://biostats.bepress.com/uncbiostat/art18