Feature Selection and Clustering for HCI

by Richard O. Duda
Department of Electrical Engineering
San Jose State University

These notes provide background on feature selection and clustering for the new NSF-sponsored course entitled Human Computer Interface Design.

We often want to recognize patterns in the signals that we get from input sensors, and other notes for this course describe some statistically based procedures for pattern classification. The standard feature-vector model for classification assumes that, one way or another, the designer has already identified the features on which the classification will be based. The classifier then uses all of these features to assign a feature vector to a class.

Because the useful features depend so heavily on the problem at hand, there is no general theory for designing an effective feature set. However, there are some useful procedures for improving the performance one can obtain with a given set of features:

Feature selection. If the number of features is too large, one can speed up classification, and often improve its accuracy, by using only a small subset of the most important features.

Clustering. If the problem possesses natural subcategories, one can improve accuracy by finding the clusters and classifying in two stages -- subcategory classification followed by final classification.
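
The two procedures above can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from these notes: the function names are hypothetical, features are ranked with a simple two-class ratio of between-class separation to within-class variance, and subcategories are found with plain k-means using a deterministic farthest-point start. Other selection criteria and clustering methods would serve equally well.

```python
from statistics import mean, pvariance

def fisher_score(values, labels):
    """Two-class separation of one feature: (m0 - m1)^2 / (v0 + v1)."""
    a = [v for v, y in zip(values, labels) if y == 0]
    b = [v for v, y in zip(values, labels) if y == 1]
    return (mean(a) - mean(b)) ** 2 / (pvariance(a) + pvariance(b) + 1e-12)

def select_features(X, labels, k):
    """Indices of the k features with the highest score."""
    scores = [fisher_score(col, labels) for col in zip(*X)]
    return sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)[:k]

def sqdist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def nearest(p, centers):
    """Index of the center closest to p."""
    return min(range(len(centers)), key=lambda i: sqdist(p, centers[i]))

def kmeans(points, k, iters=10):
    """Plain k-means with a deterministic farthest-point initialization."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(sqdist(p, c) for c in centers)))
    for _ in range(iters):
        groups = [[] for _ in centers]
        for p in points:
            groups[nearest(p, centers)].append(p)
        # Keep the old center if a cluster happens to be empty.
        centers = [tuple(mean(axis) for axis in zip(*g)) if g else c
                   for g, c in zip(groups, centers)]
    return centers

def fit_two_stage(samples_by_class, clusters_per_class):
    """Cluster each class separately and record which class owns each center."""
    model = []
    for label, points in samples_by_class.items():
        for center in kmeans(points, clusters_per_class[label]):
            model.append((center, label))
    return model

def classify(model, x):
    """Stage 1: find the nearest subcategory. Stage 2: report its class."""
    centers = [c for c, _ in model]
    return model[nearest(x, centers)][1]

# Feature selection: feature 0 separates the classes, feature 1 does not.
X = [(0.0, 5.0), (0.2, 1.0), (0.1, 3.0), (1.0, 4.0), (1.2, 2.0), (1.1, 3.0)]
y = [0, 0, 0, 1, 1, 1]
print(select_features(X, y, 1))

# Clustering: class "A" has two subcategories lying on either side of class "B".
train = {
    "A": [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (10.0, 0.0), (10.0, 1.0), (11.0, 0.0)],
    "B": [(5.0, 0.0), (5.0, 1.0), (6.0, 0.0)],
}
model = fit_two_stage(train, {"A": 2, "B": 1})
print(classify(model, (9.0, 0.5)), classify(model, (5.0, 0.5)))
```

In this toy data the overall mean of class "A" falls almost on top of class "B", so a classifier that represents each class by a single mean would confuse the two. Splitting "A" into its two subcategories first is what makes the final classification reliable.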

Note: These topics are usually included in books on pattern recognition. Standard texts on this topic include Devijver and Kittler, Duda and Hart, and Fukunaga. Ripley is an excellent recent book with a strong statistical orientation. For a fuzzy-set approach to clustering, see Bezdek.

Last revised: 10/6/97


* Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted with or without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and full citation on the first page. To copy otherwise, to republish, to post on services, or to redistribute to lists, requires specific permission and/or a fee.
