Class SegGeoPts

The class SegGeoPts provides a tool for aggregating geographical regions, under an imposed contiguity constraint, based on attributes associated with each region. The input data for the tool should consist of (a) boundary points of regions, (b) attributes associated with each region and (c) neighbor indices of each region, which are used to impose contiguity of aggregations. This type of data can be loaded from ESRI Shape files (.shp, .shx and .dbf files) in conjunction with a neighbor file (.nbr file).

Description:
The objective of the tool is to aggregate regions into spatially contiguous region aggregations. This processing operation is also called segmentation.
Two regions are aggregated only if two conditions are satisfied: their attributes are similar, and the two regions are neighbors. The algorithm aggregates regions in a bottom-up manner, from the most similar regions to the least similar regions. The dissimilarity of attributes is measured as the Euclidean distance between the two attribute vectors, so a smaller distance means more similar regions. A newly formed aggregation of regions is assigned a new attribute vector computed as a weighted average of the original attribute vectors.
The algorithm stops when (a) a user-specified number of desired aggregations has been reached or (b) the maximum allowed dissimilarity has been reached.
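The description above can be illustrated with a minimal Python sketch. It is not the SegGeoPts implementation: the function name `segment`, its parameters, and the data layout (a list of attribute vectors plus a dictionary of neighbor sets) are assumptions made for illustration. The sketch merges the closest neighboring pair at every step and does not reproduce the MinSigma/DeltaSigma stepping described in Step 3.

```python
import math

def segment(features, neighbors, weights=None, n_target=1, max_sigma=math.inf):
    """Greedy bottom-up merging of neighboring regions (illustrative only).

    features  : list of attribute vectors, one per region
    neighbors : dict mapping a region index to an iterable of neighbor indices
    weights   : optional per-region weights (assumed 1.0 when omitted)
    n_target  : stop once this many aggregations remain
    max_sigma : stop once the closest neighboring pair is farther apart than this
    Returns one integer label per original region.
    """
    n = len(features)
    weights = list(weights) if weights is not None else [1.0] * n
    vec = {i: [float(x) for x in features[i]] for i in range(n)}
    wt = {i: float(weights[i]) for i in range(n)}
    members = {i: [i] for i in range(n)}

    # Build a symmetric adjacency map so contiguity can be checked both ways.
    adj = {i: set() for i in range(n)}
    for i, nbrs in neighbors.items():
        for j in nbrs:
            if i != j:
                adj[i].add(j)
                adj[j].add(i)

    while len(vec) > n_target:
        # Find the most similar pair among neighboring aggregations only.
        best = None
        for a in adj:
            for b in adj[a]:
                if a < b:
                    d = math.dist(vec[a], vec[b])
                    if best is None or d < best[0]:
                        best = (d, a, b)
        if best is None or best[0] > max_sigma:
            break  # nothing contiguous left, or dissimilarity limit reached
        _, a, b = best

        # Merge b into a: weighted average of the two attribute vectors.
        total = wt[a] + wt[b]
        vec[a] = [(wt[a] * x + wt[b] * y) / total for x, y in zip(vec[a], vec[b])]
        wt[a] = total
        members[a] += members.pop(b)
        del vec[b], wt[b]

        # Neighbors of b become neighbors of the merged aggregation a.
        for c in adj.pop(b):
            adj[c].discard(b)
            if c != a:
                adj[c].add(a)
                adj[a].add(c)

    labels = [0] * n
    for label, (_, regions) in enumerate(sorted(members.items())):
        for r in regions:
            labels[r] = label
    return labels
```

For example, four regions arranged in a chain with attribute values 0, 1, 10 and 11 collapse into two aggregations when `n_target=2`. Two regions with identical attributes that are not neighbors are never merged directly.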

Setup: It is assumed that a shape file (.shp and .shx files) with all boundary information has been loaded in the main window. When the SegGeoPts dialog is invoked from the main menu of the main window, the attribute file (.dbf) and neighbor file (.nbr) will be loaded assuming that the root name of the shape file, attribute file and neighbor file are identical. All loaded data are represented by three internal data objects, ShapeObject (region boundaries), Table (region attributes) and BndNeighbors (neighborhood relationships or a list of neighbors). The region boundaries are visualized in the main window and the region attributes can be visualized from the SegGeoPts dialog by clicking on the "ShowDbf" button (see Dialog Figure).
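The shared-root-name convention can be sketched as follows. The helper name `companion_files` and the example path are hypothetical. The sketch only derives the file names that would have to exist next to the shape file and does not read any of them.

```python
from pathlib import PurePosixPath

def companion_files(shp_path):
    """Derive the .shp, .shx, .dbf and .nbr paths that share one root name."""
    root = PurePosixPath(shp_path)
    return {ext: str(root.with_suffix("." + ext)) for ext in ("shp", "shx", "dbf", "nbr")}

# Hypothetical file name, used only to show the naming convention.
print(companion_files("data/illinois.shp"))
```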

An example of region boundaries representing the county boundaries in Illinois.

An example of region attributes in tabular form, with one row of attributes per region boundary, e.g., attributes of counties in Illinois.

The Dialog interface to perform parameter setup, segmentation/aggregation, visualization and output of the computed results.

Step 1: The first step is to select attributes (features) and weights for aggregations. One can view all features by clicking the button "ShowFeat" and select any combination of features by clicking on "SelectFeat" and entering feature indices. For example, to select attributes/features 1, 2, 3, 4, 6, 7, 8 and 10, one would type 1-4, 6-8, 10 into the pop-up dialog, or type each index separated by a space or comma. After a set of features has been selected, the TextArea labeled "Results" in the dialog shows the list of selected features with their ranges of values and the minimum and maximum value differences between all neighboring region pairs.
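The index notation accepted by the pop-up dialog can be illustrated with a small parser. This is a sketch of the notation only, not the dialog's own code: the function name `parse_indices` is hypothetical, and returning the indices sorted and without duplicates is an assumption.

```python
import re

def parse_indices(spec):
    """Expand a selection such as '1-4, 6-8, 10' into a sorted list of indices."""
    # Normalize 'a - b' to 'a-b' so ranges survive splitting on spaces.
    spec = re.sub(r"\s*-\s*", "-", spec.strip())
    indices = set()
    for token in re.split(r"[,\s]+", spec):
        if not token:
            continue
        if "-" in token:
            low, high = token.split("-", 1)
            indices.update(range(int(low), int(high) + 1))
        else:
            indices.add(int(token))
    return sorted(indices)

print(parse_indices("1-4, 6-8, 10"))
```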

Step 2: A merger is performed when feature vectors are similar, but the new feature vector assigned to the merged region is computed with weighting coefficients that reflect the frequency or size of each region being merged. Weights are chosen the same way as the features, by clicking on the "SelectWeights" button. If no weights are selected then they are all assumed to be equal to one.
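A worked example may help to show the effect of the weights. The helper below is a sketch under two assumptions that the text does not state explicitly: the merged vector is the weight-proportional mean of the two vectors, and the merged region carries the sum of the two weights into later mergers. The name `merge_features` and the numbers are illustrative.

```python
def merge_features(vec_a, w_a, vec_b, w_b):
    """Weighted average of two feature vectors; returns (merged_vector, merged_weight)."""
    total = w_a + w_b
    merged = [(w_a * x + w_b * y) / total for x, y in zip(vec_a, vec_b)]
    return merged, total

# A region with weight 1 and value 10 merged with a region of weight 3 and value 20.
print(merge_features([10.0], 1, [20.0], 3))
```

With all weights equal to one, the same function reduces to a plain average of the two vectors.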

Step 3: The third step is to choose the exit and result output criteria.
Three values, labeled MinSigma, MaxSigma and DeltaSigma, determine the range of similarity values to be explored, [MinSigma, MaxSigma], and the minimum increment used to go from MinSigma to MaxSigma. If the value of MaxSigma is reached during aggregation then the algorithm will exit. If neither of the check-boxes (N2FindS, N2FindLayers) is checked then this is the only exit criterion, unless all regions have been merged into a single aggregation before this similarity threshold is reached.

If the check-box "N2FindS" is checked then the value in the edit box labeled N2FindS becomes an exit criterion in addition to the MaxSigma criterion. If only "N2FindS" is checked and the algorithm has met the "N2FindS" exit criterion then the output will be one set of labels in which the number of aggregated regions is smaller than or equal to the desired number N2FindS. In this case, the similarity value increment at each step within [MinSigma, MaxSigma] is equal to the maximum of DeltaSigma and the minimum value difference between all aggregated neighboring regions.
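The increment rule for the N2FindS-only case can be written as a one-line computation. This sketch reflects one reading of the sentence above. The function name `sigma_increment` and the representation of neighbor differences as a plain list of distances are assumptions.

```python
def sigma_increment(delta_sigma, neighbor_distances):
    """Step size: the larger of DeltaSigma and the smallest neighbor distance."""
    if not neighbor_distances:
        return delta_sigma  # no neighboring pairs left; fall back to DeltaSigma
    return max(delta_sigma, min(neighbor_distances))

# The closest neighboring pair is 2.0 apart, so the step exceeds DeltaSigma = 0.5.
print(sigma_increment(0.5, [2.0, 3.5]))
```

Under this reading, DeltaSigma acts as a lower bound on the step, and the step grows when even the closest neighboring aggregations are farther apart than DeltaSigma.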

If the check-box "N2FindLayers" is checked then the value in the edit box labeled N2FindLayers defines how many intermediate results should be reported at the end of aggregation. In this case, the similarity value increment at each step is equal to DeltaSigma and the intermediate results (layers) are reported at evenly distributed values of similarity within the interval [MinSigma, MaxSigma]. If both N2FindS and N2FindLayers are checked and the N2FindS criterion is met before MaxSigma is reached then some of the reported layers will have no labels.
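One possible way to place the reported layers is sketched below. The text only says that layers are evenly distributed within [MinSigma, MaxSigma]. Including both endpoints, and the function name `layer_thresholds`, are assumptions, and the tool may place the layers differently.

```python
def layer_thresholds(min_sigma, max_sigma, n_layers):
    """Evenly spaced similarity values at which layers would be reported."""
    if n_layers < 1:
        raise ValueError("n_layers must be at least 1")
    if n_layers == 1:
        return [max_sigma]
    step = (max_sigma - min_sigma) / (n_layers - 1)
    return [min_sigma + k * step for k in range(n_layers)]

print(layer_thresholds(0, 20, 5))
```

If the N2FindS criterion stops the aggregation early, the layers whose thresholds lie beyond the last similarity value reached would be the ones left without labels.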

Run and Display Results: The algorithm is executed by clicking the "Segment" button. The history of aggregations can be viewed by clicking "ShowResults" and viewing the TextArea labeled as "Results". The results can be displayed by clicking the button "ShowGeoResults". This type of visualization presents the labels of aggregations together with the geographical locations of regions.

An example of color coded visualization of region aggregations invoked by clicking on "ShowGeoResults".

If the check-box "N2FindLayers" was checked then there are multiple layers of labels to display. In that case, clicking the "ShowResults" button opens a frame with color-coded labels that shows the evolution of labels as a function of similarity value. The vertical axis represents the layer index, with the similarity value increasing downward. The horizontal axis represents the labels associated with each row in the DBF table. An example is shown below.

Left: The TextArea in the Dialog shows information about the displayed layers.
Right: A visualization of label aggregations as a function of attribute similarity. The number of columns corresponds to the number of regions (rows) in a DBF file. The number of rows is equal to the number of layers.

Another way to view the multi-layer results is to click the "ShowGeoResults" button. This action opens a new frame with the image of region labels displayed according to their geographic locations. Each layer can be viewed separately by right-clicking on the image area, invoking the "BandSelect" Dialog and selecting the index of the layer. It is also possible to view the results as a movie by sweeping through the layers continuously. This type of visualization is enabled by the "PlayBands" Dialog, which is also invoked by right-clicking on the image area.

A visualization of the first three aggregation/segmentation layers, corresponding to maximum dissimilarities of aggregated regions equal to 5 (top), 10 (middle) and 20 (bottom).

Output of Results: The aggregation results can be saved into a DBF file by clicking the "Insert2DBF" button and then the "SaveDBF" button. It is also possible to save the results shown by "ShowGeoResults" with the button "SaveGeoResults".
To view the loaded attributes together with the inserted labels, click "ShowDBF". This opens a new table viewer with all DBF file column attributes and their values.