Advanced Information Systems for Archival Appraisals of Contemporary Documents

Presenter: Kenton McHenry, NCSA
Authors: W. McFadden, K. McHenry, R. Kooper, M. Ondrejcek, A. Yahja and P. Bajcsy

2008 Microsoft e-Science Workshop
Indianapolis, IN, December 7 - 12, 2008
Poster presentation

This work addresses the problem of designing a scalable framework for archival appraisals of contemporary PDF documents. Our motivation is to provide an e-Science solution that (a) fuses independent research methodologies, each focusing on a specific information type, into one comprehensive analytical framework, (b) optimizes tradeoffs between computational requirements and preservation costs, and (c) bridges small-scale and large-scale computational studies. The e-Science solution presented here consists of (1) a methodology for comprehensive comparisons of contemporary documents containing text, images and vector graphics, (2) a framework for including 3D and 3D+time data sets in the appraisal analyses, (3) interfaces supporting exploratory archival appraisal analyses with small-scale data sets, and (4) infrastructure supporting the transition from small-scale to large-scale computations using commodity and high-performance computing resources.

The novelty of our work lies in the design of methodologies, mathematical frameworks and prototypes for comprehensive and scalable document appraisals that include text, images, vector graphics, and high-dimensional data.