Next-generation data processing systems must deal with very high
data ingest rates and massive volumes of data. Such conditions are
typically encountered in the Intelligence Community (IC) where
analysts must search through huge volumes of data in order to gather
evidence to support or refute their hypotheses. Their effort is
made all the more difficult given that the data appears as
unstructured text written in multiple languages using
different character encodings. Human analysts have not
been able to keep pace with reading the data, and a large amount of
data is discarded even though it might contain key information. The
goal of our project is to assess the feasibility of incrementally
replacing humans with automation in key areas of information
processing. These areas include document ingest, content
categorization, language translation, and context- and time-based
information retrieval.
Mathematical transformation algorithms, when implemented in rapidly
reconfigurable hardware, offer the potential to continuously
(re)process and (re)interpret extremely high volumes of
multi-lingual, unstructured text data. These technologies can
automatically elicit the semantics of streaming input data, organize
the data by concept (regardless of language), and associate related
concepts in order to parameterize models. To test that hypothesis,
we are building an experimentation testbed that enables the rapid
implementation of semantic processing algorithms in hardware. The
testbed is built on a high-performance infrastructure that includes a
hardware-accelerated content processing platform; mass storage to
hold training data, test data, and experiment scenarios; and tools
for analysis and visualization of the data.
In our first use of the testbed, we implemented three transformation
algorithms on Field-programmable Port Extender (FPX) modules,
developed at Washington University in Saint Louis, to perform
semantic processing on document streams.
This paper describes our approach to building the experimental
hardware platform components, discusses the major features of the
circuit designs, gives an overview of our first experiment, and offers a
detailed description of the results, which are promising.