Spatial Pattern to Learn (Sp2Learn) is a framework for accurately estimating geospatial models from sparse
field measurements using image processing and machine learning. The goal is to improve our understanding of the underlying physical
phenomena and increase the accuracy of geospatial models. Sp2Learn lets users explore the accuracy improvements obtained by combining
several image de-noising techniques with a decision tree machine learning technique, and by using multiple remote sensing and
terrestrial raster measurements.
For example, we provide test data to illustrate how to incorporate and mine slope, soil type, and proximity to water bodies for
predicting groundwater recharge and discharge (R/D) rate models (groundwater models).
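As a minimal sketch of what an image de-noising step can look like, the snippet below applies a 3x3 median filter to a small raster. The text does not name the de-noising techniques Sp2Learn offers, so the median filter, the function name `median_filter_3x3`, and the sample grid are all illustrative assumptions rather than Sp2Learn's implementation.

```python
from statistics import median

def median_filter_3x3(grid):
    """Replace each cell with the median of its 3x3 neighborhood.

    Border cells use only the neighbors that exist (no padding).
    """
    rows, cols = len(grid), len(grid[0])
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            window = [
                grid[rr][cc]
                for rr in range(max(0, r - 1), min(rows, r + 2))
                for cc in range(max(0, c - 1), min(cols, c + 2))
            ]
            out[r][c] = median(window)
    return out

# Hypothetical 4x4 raster of R/D rates with one noisy spike (99.0).
noisy = [
    [1.0, 1.0, 2.0, 2.0],
    [1.0, 99.0, 2.0, 2.0],
    [1.0, 1.0, 2.0, 2.0],
    [1.0, 1.0, 2.0, 2.0],
]
smoothed = median_filter_3x3(noisy)
```

The spike at row 1, column 1 is replaced by the median of its neighbors, while the step between the left and right halves of the raster is largely preserved. Comparing model accuracy with and without such a filter is the kind of exploration described above.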
Sp2Learn can be viewed as an encapsulated workflow for:
- loading multiple raster files (images),
- integrating and mosaicking all raster data sets to form a stack with
consistent spatial resolution as well as geographic projection,
- loading other files (boundaries, points or images) to create a mask for pixel selection purposes,
- integrating the existing stack of raster images with other masking information,
- selecting boundaries or image regions of interest and extracting variables from the stack of images,
- performing data-driven modeling of selected input and output variables,
- analyzing the data-driven model to assign a relevance coefficient to each input variable, and
- mapping the data-driven model at the pixel level to the spatial domain.
All of the aforementioned steps are supported by visualizations (color, gray-scale, or pseudo-color) of input,
intermediate, and output data sets, as well as of the data models.
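The workflow steps listed above can be illustrated with a small, self-contained sketch. It is not Sp2Learn code: the raster values, the names `stack`, `rd_rate`, `mask`, `fit`, and `predict`, and the tiny regression tree (with relevance measured as the squared-error reduction credited to each variable) are all assumptions chosen for illustration. The rasters are assumed to be already co-registered, so loading, mosaicking, and re-projection are skipped.

```python
# Hypothetical co-registered 3x4 rasters (same resolution and projection assumed).
stack = {
    "slope":      [[1, 2, 8, 9], [1, 3, 8, 9], [2, 2, 7, 9]],
    "soil_type":  [[0, 0, 1, 1], [0, 1, 1, 0], [1, 0, 0, 1]],
    "water_dist": [[5, 5, 5, 5], [4, 4, 4, 4], [3, 3, 3, 3]],
}
# Sparse field measurements of the R/D rate; None marks unmeasured pixels.
rd_rate = [[0.9, None, 0.2, 0.1], [None, 0.8, None, 0.1], [0.9, None, 0.2, None]]
# Mask selecting the region of interest (1 = inside, 0 = outside).
mask = [[1, 1, 1, 1], [1, 1, 1, 1], [1, 1, 1, 0]]

names = list(stack)

def features(r, c):
    """Extract the input variables of one pixel from the raster stack."""
    return [stack[n][r][c] for n in names]

def sse(ys):
    """Sum of squared errors around the mean."""
    mean = sum(ys) / len(ys)
    return sum((y - mean) ** 2 for y in ys)

def fit(samples, relevance, depth=0, max_depth=2):
    """Fit a small regression tree; credit each split's error reduction to its variable."""
    ys = [y for _, y in samples]
    best = None
    for j in range(len(names)):
        # Candidate thresholds: every distinct value except the largest.
        for t in sorted({x[j] for x, _ in samples})[:-1]:
            left = [s for s in samples if s[0][j] <= t]
            right = [s for s in samples if s[0][j] > t]
            gain = sse(ys) - sse([y for _, y in left]) - sse([y for _, y in right])
            if best is None or gain > best[0]:
                best = (gain, j, t, left, right)
    if depth >= max_depth or best is None or best[0] <= 1e-12:
        return sum(ys) / len(ys)  # leaf: mean of the measured values
    gain, j, t, left, right = best
    relevance[names[j]] += gain
    return (j, t,
            fit(left, relevance, depth + 1, max_depth),
            fit(right, relevance, depth + 1, max_depth))

def predict(tree, x):
    """Walk the tree until a leaf (a plain number) is reached."""
    while isinstance(tree, tuple):
        j, t, left, right = tree
        tree = left if x[j] <= t else right
    return tree

n_rows, n_cols = len(mask), len(mask[0])

# Select masked pixels that have a field measurement as training samples.
samples = [(features(r, c), rd_rate[r][c])
           for r in range(n_rows) for c in range(n_cols)
           if mask[r][c] and rd_rate[r][c] is not None]

# Data-driven modeling and relevance analysis.
relevance = {n: 0.0 for n in names}
tree = fit(samples, relevance)

# Map the model back to the spatial domain, pixel by pixel.
rd_map = [[predict(tree, features(r, c)) if mask[r][c] else None
           for c in range(n_cols)] for r in range(n_rows)]
```

In this toy data set the R/D rate depends almost entirely on slope, so `relevance` ends up concentrated on `"slope"`, and `rd_map` holds a predicted rate for every pixel inside the mask, including those without a field measurement.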
To bring together so much functionality, the architecture of Sp2Learn leverages
several technologies. The majority of the Sp2Learn code is based on Image to Learn (Im2Learn),
developed at NCSA, with additional calls to:
- the HDF5 library, developed by NCSA, and
- the MODIS Reprojection Tool (MRT), developed by NASA, to perform geographic re-projection.