Rank Ordering with Accuracy Selection (ROWAS) for Hyperspectral Band Selection.

Peter Groves

M.S. dissertation, Department of Computer Science, University of Illinois at Urbana-Champaign, 2003

Peter Bajcsy, Advisor

Hyperspectral band selection is a key factor in creating practical, accurate predictive models for remote sensing applications. A proper subset of bands can contain the same information as the complete set of bands, with less noise. This can lead to both an increase in accuracy and a decrease in computational complexity. The problem then becomes: how does one determine which bands to use? We first discuss the implications of sampling theory, the No Free Lunch Theorem, sensor noise, and information redundancy for feature subset selection.

We then present a generic methodology that directly follows from these implications to select the optimal subset of bands and prediction model together. This method is called Rank Ordering with Accuracy Selection (ROWAS), and works as follows. The bands are ranked by several computationally efficient measures of information content and redundancy. Then, increasing numbers of top ranked bands are evaluated with different prediction methods using the cross-validated accuracy as a metric. The ensuing analysis provides an optimal set of bands, along with the best prediction model. This methodology satisfies all of the design constraints, and provides a good tradeoff between exploration of the feature space and computation time.
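The rank-then-evaluate loop described above can be illustrated with a minimal sketch. It is not the dissertation's implementation: the variance-based ranking, the two toy classifiers, the leave-one-out scheme, and all function names are assumptions chosen only to keep the example self-contained.

```python
from statistics import pvariance

def rank_by_variance(X):
    """Hypothetical ranking measure: band indices by decreasing variance."""
    n_bands = len(X[0])
    scores = [pvariance([row[b] for row in X]) for b in range(n_bands)]
    return sorted(range(n_bands), key=lambda b: scores[b], reverse=True)

def nearest_neighbor_predict(train_X, train_y, x):
    """1-NN: label of the closest training sample (squared Euclidean distance)."""
    best = min(range(len(train_X)),
               key=lambda i: sum((a - b) ** 2 for a, b in zip(train_X[i], x)))
    return train_y[best]

def nearest_centroid_predict(train_X, train_y, x):
    """Label of the class whose mean spectrum is closest to x."""
    sums, counts = {}, {}
    for row, label in zip(train_X, train_y):
        acc = sums.setdefault(label, [0.0] * len(row))
        for d, v in enumerate(row):
            acc[d] += v
        counts[label] = counts.get(label, 0) + 1
    def dist(label):
        return sum((s / counts[label] - v) ** 2 for s, v in zip(sums[label], x))
    return min(sums, key=dist)

def loo_accuracy(X, y, bands, predict):
    """Leave-one-out cross-validated accuracy using only the given bands."""
    correct = 0
    for i in range(len(X)):
        train_X = [[row[b] for b in bands] for j, row in enumerate(X) if j != i]
        train_y = [label for j, label in enumerate(y) if j != i]
        x = [X[i][b] for b in bands]
        correct += predict(train_X, train_y, x) == y[i]
    return correct / len(X)

def rowas(X, y, rankers, models):
    """Try every (ranking, top-k bands, model) combination; keep the most accurate."""
    best = (-1.0, None, None, None)
    for rank_name, ranker in rankers.items():
        order = ranker(X)
        for k in range(1, len(order) + 1):
            bands = order[:k]
            for model_name, predict in models.items():
                acc = loo_accuracy(X, y, bands, predict)
                if acc > best[0]:  # strict '>' keeps the smallest band set on ties
                    best = (acc, bands, rank_name, model_name)
    return best

# Toy data: band 0 separates the classes, band 1 is low-amplitude noise.
X = [[0.0, 0.3], [0.1, 0.1], [0.2, 0.2], [5.0, 0.1], [5.1, 0.3], [5.2, 0.2]]
y = ["a", "a", "a", "b", "b", "b"]
result = rowas(X, y,
               {"variance": rank_by_variance},
               {"1-NN": nearest_neighbor_predict,
                "centroid": nearest_centroid_predict})
print(result)
```

Because only nested prefixes of each ranking are evaluated, the number of model fits grows linearly with the number of bands per ranking rather than exponentially, which is the tradeoff between exploration and computation time that the paragraph refers to.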

We apply this generic methodology to the domain of hyperspectral band selection by developing ranking methods that assume the data is a sampled continuous spectrum. Experimental results for both a numeric prediction and a classification task are presented. These experiments are the prediction of electrical soil conductivity in a pre-growing season farm field and the classification of grass types based on hyperspectral, airborne imagery. For both problems, ROWAS achieves a high level of accuracy that appears to be near the optimal accuracy possible for the problem. In the case of the grass-type classification, this is confirmed using a McNemar test for statistical significance.
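For readers unfamiliar with the McNemar test, the following is a minimal sketch of how it can compare two classifiers evaluated on the same samples. The abstract does not state which models were compared or which variant of the test was used, so the function name, the uncorrected chi-square form, and the toy predictions below are all assumptions.

```python
import math

def mcnemar(y_true, pred_a, pred_b):
    """McNemar chi-square statistic (no continuity correction) and p-value.

    b counts samples only classifier A gets right,
    c counts samples only classifier B gets right.
    """
    b = sum(1 for t, pa, pb in zip(y_true, pred_a, pred_b) if pa == t and pb != t)
    c = sum(1 for t, pa, pb in zip(y_true, pred_a, pred_b) if pa != t and pb == t)
    if b + c == 0:
        return 0.0, 1.0  # the classifiers never disagree in correctness
    stat = (b - c) ** 2 / (b + c)
    # Upper-tail probability of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value

# Toy example: A is always right, B is wrong on 8 of 10 samples.
y_true = [1] * 10
pred_a = [1] * 10
pred_b = [0] * 8 + [1] * 2
stat, p_value = mcnemar(y_true, pred_a, pred_b)
print(stat, p_value)
```

A small p-value suggests the two classifiers differ in error rate; a large one is consistent with the two being statistically indistinguishable on the tested samples.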