Publications on Content Processing, Reconfigurable Network Group (In Reverse Chronological Order)
(Also available in BibTeX format)
Abstract: We have implemented a new network information processing system using reconfigurable hardware that scans volumes of data in real-time. One of the key functions of the system is to extract semantic information. Before we can determine the meaning of text, we must identify its language. In a previous project, we implemented an N-gram based language identifier that can process up to 1 Gbps throughput. However, a large percentage of computer network traffic, such as email and web data, consists of markup information such as tags and protocol-specific options. This additional data interferes with the language identification process, causing decreased accuracy. Thus, we developed a hardware architecture for configurable application level processing. Our Application Level Processing System (ALPS) is a custom processor that is automatically generated using the syntactic structure of the content. The resulting circuit is mapped onto a reconfigurable device to efficiently extract only the relevant data for the language identifier. To illustrate the effectiveness of the architecture, we have implemented a system that can process electronic mail. Our experiments show that ALPS can improve the accuracy of the hardware language identifier by up to a factor of 200 as compared to a system that does not decode the application-level protocol data.
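The idea of passing only the relevant text to the language identifier can be illustrated in software. The following is a minimal sketch, not the ALPS circuit: it assumes a well-formed e-mail message, uses Python's standard `email` module to skip headers, and strips tags with a simple regular expression. The function name `extract_body_text` is hypothetical.

```python
import email
import re

# Crude tag stripper; an assumption for illustration, not a full HTML parser.
TAG_RE = re.compile(r"<[^>]*>")


def extract_body_text(raw_message: str) -> str:
    """Return only the textual content of an e-mail, without headers or tags."""
    msg = email.message_from_string(raw_message)
    chunks = []
    for part in msg.walk():
        # Skip multipart containers and non-text attachments.
        if part.get_content_maintype() != "text":
            continue
        payload = part.get_payload(decode=True)
        if payload is None:
            continue
        charset = part.get_content_charset() or "utf-8"
        text = payload.decode(charset, errors="replace")
        chunks.append(TAG_RE.sub(" ", text))
    # Collapse runs of whitespace left behind by removed markup.
    return " ".join(" ".join(chunks).split())
```

Only the returned string would then be handed to a language identifier, so header fields and tag names no longer contribute N-grams.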
Abstract: This paper presents a reconfigurable architecture for high-speed content-based routing. Our architecture goes beyond simple pattern matching by implementing a parsing engine that defines the semantics of patterns that are parsed within the data stream. Defining the semantics of patterns allows for more accurate processing and routing of packets using any fields that appear within the payload of the packet. The architecture consists of several components, including a pattern matcher, a parsing structure, and a routing module. Both the pattern matcher and parsing structure are automatically generated using an application-specific compiler that is described in this paper. The compiler accepts a grammar specification as input and outputs a data parser in VHDL. The routing module receives control signals from both the pattern matcher and the parsing structure that aid in the routing of packets. We illustrate how a content-based router can be implemented with our technique using an XML parser as an example. The XML parser presented was designed, implemented, and tested in a Xilinx Virtex XCV2000E FPGA on the FPX platform. It is capable of processing 32 bits of data per clock cycle and runs at 100 MHz. This allows the system to process and route XML messages at 3.2 Gbps.
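The difference between plain pattern matching and routing on the parsed context of a pattern can be sketched in software. This is a toy illustration under stated assumptions, not the generated VHDL parser: it handles only simple, attribute-free tags, and the names `route`, `field`, and `table` are hypothetical.

```python
import re

# Matches an opening/closing tag without attributes, or a run of text.
TOKEN_RE = re.compile(r"<(/?)(\w+)>|([^<]+)")


def route(message, field, table, default="drop"):
    """Pick an output using the text found inside the element named `field`."""
    stack = []  # currently open elements, innermost last
    for match in TOKEN_RE.finditer(message):
        closing, tag, text = match.groups()
        if tag is None:
            # Text token: it only counts when it appears inside `field`.
            if stack and stack[-1] == field and text.strip() in table:
                return table[text.strip()]
        elif closing:
            if stack and stack[-1] == tag:
                stack.pop()
        else:
            stack.append(tag)
    return default
```

With a table such as `{"buy": "port1", "sell": "port2"}` and `field="methodName"`, a message whose parameter happens to contain the word `sell` is still routed by its method name, which a context-free keyword search could not guarantee.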
Abstract: In this paper, we present a reconfigurable hardware architecture for detecting the semantics of streaming data on 1+ Gbps networks. The design leverages the characteristics of context-free grammars (CFGs), which allow computers to understand the semantics of data. Although our parser is not a true CFG parser, we use the linguistic structure defined in the grammars to explore a new way of parsing data using Field Programmable Gate Array (FPGA) hardware. Our system consists of pattern matchers and a syntax detector. The pattern matchers are automatically generated using the grammar token list, while the syntax detector is generated based on the aspects of the grammar that define the order of all possible token sequences. Since all the rules are mapped onto the hardware as parallel processing engines, the meaning of each token can be determined by monitoring where it is being processed. Our highly parallel and fine-grained pipelined engines can operate at a frequency above 500 MHz. Our initial implementation is an XML content-based router for XML remote procedure calls (RPC). The implementation can process the data at 1.57 Gbps on a Xilinx VirtexE FPGA and 4.26 Gbps on the Virtex 4 FPGA.
Abstract: There is a need within the intelligence communities to analyze massive streams of multilingual unstructured data. Mathematical transformation algorithms have proven effective at interpreting multilingual, unstructured data, but high computational requirements of such algorithms prevent their widespread use. The rate of computation can be vastly increased with Field Programmable Gate Array (FPGA) hardware.
To experiment with this approach, we developed a system with FPGAs that ingests content over a network at high data rates. The system extracts basewords, counts words, scores documents, and discovers concepts on data that are carried in TCP/IP network flows as packets over a Gigabit Ethernet link or in cells transported over an OC48 link. These algorithms, as implemented in FPGA hardware, introduce certain constraints on the complexity and richness of the semantic processing algorithms.
To understand the implications of these constraints and to benchmark the performance of the system, we have performed a series of experiments processing multilingual documents. In these experiments, we compare techniques to generate basewords for our semantic concepts, score documents, and discover concepts across a variety of processing operational scenarios.
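The word-counting and document-scoring steps can be pictured with a small software analogue. This is a minimal sketch under assumptions: the baseword list and concept weights are invented for illustration, whitespace tokenization stands in for the real word extraction, and the score is a plain dot product, which may differ from the scoring used in the experiments.

```python
from collections import Counter


def count_basewords(text, basewords):
    """Count occurrences of each baseword; the result follows baseword order."""
    counts = Counter(word for word in text.lower().split() if word in basewords)
    return [counts[word] for word in basewords]


def score_document(counts, concept_weights):
    """Score a document against one concept as a dot product of the two vectors."""
    return sum(c * w for c, w in zip(counts, concept_weights))
```

For example, with basewords `["network", "packet", "language"]`, the text `"Packet network packet filter"` yields the count vector `[1, 2, 0]`.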
Abstract: High-speed packet content inspection and filtering devices rely on a fast multi-pattern matching algorithm which is used to detect predefined keywords or signatures in the packets. Multi-pattern matching is known to require intensive memory accesses and is often a performance bottleneck. Hence specialized hardware-accelerated algorithms are being developed for line-speed packet processing. While several pattern matching algorithms have already been developed for such applications, we find that most of them suffer from scalability issues. To support a large number of patterns, the throughput is compromised or vice versa.
We present a hardware-implementable pattern matching algorithm for content filtering applications, which is scalable in terms of speed, the number of patterns and the pattern length. We modify the classic Aho-Corasick algorithm to consider multiple characters at a time for higher throughput. Furthermore, we suppress a large fraction of memory accesses by using Bloom filters implemented with a small amount of on-chip memory. The resulting algorithm can support matching of several thousands of patterns at more than 10 Gbps with the help of less than 50 KBytes of embedded memory and a few megabytes of external SRAM. We demonstrate the merit of our algorithm through theoretical analysis and simulations performed on Snort's string set.
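For reference, the classic Aho-Corasick algorithm that this work starts from can be sketched as follows. This is the textbook one-character-at-a-time version only; it does not include the multi-character modification or the Bloom filters described in the abstract, and the function names are hypothetical.

```python
from collections import deque


def build_automaton(patterns):
    """Build goto, failure, and output tables for a set of patterns."""
    goto, fail, out = [{}], [0], [set()]
    for pattern in patterns:
        state = 0
        for ch in pattern:
            if ch not in goto[state]:
                goto.append({})
                fail.append(0)
                out.append(set())
                goto[state][ch] = len(goto) - 1
            state = goto[state][ch]
        out[state].add(pattern)
    # Breadth-first pass: failure links point to the longest proper suffix state.
    queue = deque(goto[0].values())
    while queue:
        s = queue.popleft()
        for ch, t in goto[s].items():
            queue.append(t)
            f = fail[s]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[t] = goto[f].get(ch, 0)
            out[t] |= out[fail[t]]
    return goto, fail, out


def search(text, automaton):
    """Return (start_index, pattern) for every match in a single pass."""
    goto, fail, out = automaton
    state, hits = 0, []
    for i, ch in enumerate(text):
        while state and ch not in goto[state]:
            state = fail[state]
        state = goto[state].get(ch, 0)
        for pattern in out[state]:
            hits.append((i - len(pattern) + 1, pattern))
    return hits
```

Each input character triggers at least one table lookup, which is the memory-access cost that motivates the changes described above.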
Abstract: A hardware-accelerated algorithm has been designed to automatically identify the primary languages used in documents transferred over the Internet. The algorithm has been implemented in hardware on the Field-programmable Port Extender (FPX) platform. This system, referred to as the Hardware-Accelerated Identification of Languages (HAIL) project, identifies the primary languages used in content transferred over Transmission Control Protocol (TCP) / Internet Protocol (IP) networks that operate at rates exceeding 2.4 Gigabits/second. We demonstrate that this hardware-accelerated circuit, operating on a Xilinx XCV2000E-8 FPGA, far outperforms software algorithms running on modern personal computers while maintaining extremely high levels of accuracy.
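As a rough software picture of language identification by N-grams, consider the sketch below. It is an assumption-laden illustration rather than the HAIL circuit: it uses character trigrams, tiny made-up training samples, and a simple count of matching N-grams per language. The names `ngrams`, `train`, and `identify` are hypothetical.

```python
from collections import Counter


def ngrams(text, n=3):
    """Return all character n-grams of the whitespace-normalized, lowercased text."""
    text = " ".join(text.lower().split())
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def train(samples, n=3):
    """Map each language to the set of n-grams seen in its sample text."""
    return {lang: set(ngrams(text, n)) for lang, text in samples.items()}


def identify(text, profiles, n=3):
    """Return the language whose profile contains the most n-grams of `text`."""
    scores = Counter()
    for gram in ngrams(text, n):
        for lang, profile in profiles.items():
            if gram in profile:
                scores[lang] += 1
    return scores.most_common(1)[0][0] if scores else None
```

Because each N-gram lookup is independent of the others, this style of scoring lends itself to parallel table lookups in hardware.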
Abstract: Next-generation data processing systems must deal with very high data ingest rates and massive volumes of data. Such conditions are typically encountered in the Intelligence Community (IC) where analysts must search through huge volumes of data in order to gather evidence to support or refute their hypotheses. Their effort is made all the more difficult given that the data appears as unstructured text that is written in multiple languages using characters that have different encodings. Human analysts have not been able to keep pace with reading the data, and a large amount of data is discarded even though it might contain key information. The goal of our project is to assess the feasibility of incrementally replacing humans with automation in key areas of information processing. These areas include document ingest, content categorization, language translation, and context- and temporally-based information retrieval.
Mathematical transformation algorithms, when implemented in rapidly reconfigurable hardware, offer the potential to continuously (re)process and (re)interpret extremely high volumes of multi-lingual, unstructured text data. These technologies can automatically elicit the semantics of streaming input data, organize the data by concept (regardless of language), and associate related concepts in order to parameterize models. To test that hypothesis, we are building an experimentation testbed that enables the rapid implementation of semantic processing algorithms in hardware. The system provides a high-performance infrastructure that includes a hardware-accelerated content processing platform; mass storage to hold training data, test data, and experiment scenarios; and tools for analysis and visualization of the data.
In our first use of the testbed, we performed an experiment where we implemented three transformation algorithms using FPX hardware platforms to perform semantic processing on document streams. Our platform uses Field-programmable Port Extender (FPX) modules developed at Washington University in Saint Louis. This paper describes our approach to building the experimental hardware platform components, discusses the major features of the circuit designs, overviews our first experiment, and offers a detailed discussion of the results.
Abstract: Network Intrusion Detection and Prevention Systems (IDPS) use string matching to scan Internet packets for malicious content. Bloom filters offer a mechanism to search for a large number of strings efficiently and concurrently when implemented with Field Programmable Gate Array (FPGA) technology. A string matching circuit has been implemented within the FPX platform using Bloom filters. Using 155 block RAMs on a single Xilinx VirtexE 2000 FPGA, the circuit scans for 35,475 unique signatures.
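The trade-off between memory and accuracy in a Bloom filter follows from the standard false-positive estimate. The notation here is an assumption introduced for illustration and does not come from the abstract: $m$ is the number of bits in the filter, $n$ the number of stored signatures, and $k$ the number of hash functions.

```latex
f = \left(1 - \left(1 - \frac{1}{m}\right)^{kn}\right)^{k}
  \approx \left(1 - e^{-kn/m}\right)^{k},
\qquad
k_{\mathrm{opt}} = \frac{m}{n}\ln 2
\;\Rightarrow\;
f_{\min} = \left(\frac{1}{2}\right)^{k_{\mathrm{opt}}}
```

A filter never reports a false negative, so a larger $m/n$ ratio only reduces the number of candidate matches that need a second, exact check.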
Abstract: Because conventional software-based packet inspection algorithms have not kept pace with high-speed networks, interest has turned to using hardware to process network data quickly. String scanning with Bloom filters can scan entire packet payloads for predefined signatures at multi-Gigabit-per-second line speeds.
Abstract: A new architecture performs content scanning of TCP flows in high-speed networks. Combining a TCP processing engine, a per-flow state store, and a content-scanning engine, this architecture permits complete payload inspections on 8 million TCP flows at 2.5 Gbps.
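The role of the per-flow state store can be shown with a small sketch: a signature that straddles two packets of the same flow is only found if some scanning state survives between packets. This is a simplified illustration under assumptions, not the described architecture: it assumes segments arrive in order, searches for a single fixed signature, and uses a hypothetical `flow_id` key in place of a real TCP connection identifier.

```python
class FlowScanner:
    """Scan TCP payloads per flow, remembering a short tail between packets."""

    def __init__(self, signature: bytes):
        self.signature = signature
        self.tails = {}  # flow_id -> last len(signature) - 1 bytes seen

    def scan(self, flow_id, payload: bytes) -> bool:
        # Prepend the saved tail so matches that span packets are visible.
        data = self.tails.get(flow_id, b"") + payload
        keep = len(self.signature) - 1
        self.tails[flow_id] = data[-keep:] if keep else b""
        return self.signature in data
```

Keeping only `len(signature) - 1` bytes per flow means a match is never reported twice, since the saved tail alone is too short to contain the whole signature.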
Abstract: Today's crucial information networks are vulnerable to fast moving attacks by Internet worms and computer viruses. These attacks have the potential to cripple the Internet and compromise the integrity of the data on the end-user machines. Without new types of protection, the Internet remains susceptible to the assault of increasingly aggressive attacks. A platform has been implemented that actively detects and blocks worms and viruses at multi-Gigabit/second rates. It uses the Field-programmable Port Extender (FPX) to scan for signatures of malicious software (malware) carried in packet payloads. Dynamically reconfigurable Field Programmable Gate Array (FPGA) logic tracks the state of Internet flows and searches for regular expressions and fixed strings that appear in the content of packets. Protection is achieved by the incremental deployment of systems throughout the Internet.
Abstract: An extensible firewall has been implemented that performs packet filtering, content scanning, and per-flow queuing of Internet packets at Gigabit/second rates. The firewall uses layered protocol wrappers to parse the content of Internet data. Packet payloads are scanned for keywords using parallel regular expression matching circuits. Packet headers are compared to rules specified in Ternary Content Addressable Memories (TCAMs). Per-flow queuing is performed to mitigate the effect of Denial of Service attacks. All packet processing operations were implemented with reconfigurable hardware and fit within a single Xilinx Virtex XCV2000E Field Programmable Gate Array (FPGA). The single-chip firewall has been used to filter Internet SPAM and to guard against several types of network intrusion. Additional features were implemented in extensible hardware modules deployed using run-time reconfiguration.
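The way a TCAM compares a header field against rules with "don't care" bits can be sketched in software. This is an illustrative model under assumptions, not the firewall's rule format: each rule is a hypothetical `(value, mask, action)` triple over a 32-bit IPv4 address, rules are checked in priority order, and a real TCAM compares all entries in parallel rather than in a loop.

```python
import ipaddress


def ip(addr: str) -> int:
    """Convert dotted-quad notation to a 32-bit integer."""
    return int(ipaddress.IPv4Address(addr))


def tcam_lookup(key: int, rules, default="default-deny"):
    """Return the action of the first rule whose masked bits equal the key's."""
    for value, mask, action in rules:
        # Bits where mask is 0 are "don't care" and always match.
        if (key & mask) == (value & mask):
            return action
    return default


# Hypothetical rule set: one exact host entry ahead of a broader prefix.
RULES = [
    (ip("10.1.2.3"), ip("255.255.255.255"), "deny"),
    (ip("10.0.0.0"), ip("255.0.0.0"), "allow"),
]
```

Placing the more specific entry first reproduces the priority ordering that a TCAM typically resolves by entry position.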
Abstract: Recent advances in network packet processing focus on payload inspection for applications that include content-based billing, layer-7 switching and Internet security. Most of the applications in this family need to search for predefined signatures in the packet payload. Hence an important building block of these processors is string matching infrastructure. Since conventional software-based algorithms for string matching have not kept pace with high network speeds, specialized high-speed, hardware-based solutions are needed. We describe a technique based on Bloom filters for detecting predefined signatures (a string of bytes) in the packet payload. A Bloom filter is a data structure for representing a set of strings in order to support membership queries. We use hardware Bloom filters to isolate all packets that potentially contain predefined signatures. Another independent process eliminates false positives produced by Bloom filters.
We outline our approach for string matching at line speeds and present the performance analysis. Finally, we report the results for a prototype implementation of this system on the FPX platform. Our analysis shows that with the state-of-the-art FPGAs, a set of 10,000 strings can be scanned in the network data at the line speed of OC48 (2.4 Gbps).
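The two-stage scheme described above, a Bloom filter query followed by an independent exact check, can be sketched in software. This is a minimal illustration under assumptions, not the FPX implementation: all signatures share one length for brevity, hash positions are derived from SHA-256 purely for convenience, and the class and function names are hypothetical.

```python
import hashlib


class BloomFilter:
    """Set-membership filter: no false negatives, occasional false positives."""

    def __init__(self, num_bits: int, num_hashes: int):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits)  # one byte per bit, for clarity

    def _positions(self, item: bytes):
        # Derive num_hashes positions by salting one hash function.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(bytes([i]) + item).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item: bytes) -> None:
        for pos in self._positions(item):
            self.bits[pos] = 1

    def might_contain(self, item: bytes) -> bool:
        return all(self.bits[pos] for pos in self._positions(item))


def scan(payload: bytes, signatures: set, bloom: BloomFilter, length: int):
    """Slide a window over the payload; confirm Bloom hits with an exact lookup."""
    hits = []
    for i in range(len(payload) - length + 1):
        window = payload[i:i + length]
        # The cheap filter runs on every window; the exact check only on hits.
        if bloom.might_contain(window) and window in signatures:
            hits.append((i, window))
    return hits
```

Because most windows fail the filter, the exact lookup, which stands in for the slower false-positive elimination stage, runs only rarely.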
Abstract: A module has been implemented in Field Programmable Gate Array (FPGA) hardware that is able to perform regular expression search-and-replace operations on the content of Internet packets at Gigabit/second rates. All of the packet processing operations are performed using reconfigurable hardware within a single Xilinx Virtex XCV2000E FPGA. A set of layered protocol wrappers is used to parse the headers and payloads of packets for Internet protocol data. A content matching server automatically generates, compiles, synthesizes, and programs the module into the Field-programmable Port Extender (FPX) platform.
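In software terms, a search-and-replace operation on a packet payload looks like the sketch below. It only illustrates the operation itself using Python's standard `re` module on bytes; the function name `rewrite_payload` and the example pattern are hypothetical, and issues a real module must handle, such as payload length changes and checksums, are left out.

```python
import re


def rewrite_payload(payload: bytes, pattern: bytes, replacement: bytes):
    """Replace every match of `pattern`; return (new_payload, match_count)."""
    return re.subn(pattern, replacement, payload)
```

For instance, `rewrite_payload(b"GET /secret.html HTTP/1.0", rb"secret\.\w+", b"index.html")` returns the rewritten request line together with a count of one replacement.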
Abstract: A module has been implemented in Field Programmable Gate Array (FPGA) hardware that scans the content of Internet packets at Gigabit/second rates. All of the packet processing operations are performed using reconfigurable hardware within a single Xilinx Virtex XCV2000E FPGA. A set of layered protocol wrappers is used to parse the headers and payloads of packets for Internet protocol data. A content matching server automatically generates the Finite State Machines (FSMs) to search for regular expressions. The complete system is operated on the Field-programmable Port Extender (FPX) platform.
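A finite state machine that searches a stream for a regular expression can be sketched with a transition table. This is a hand-built example under assumptions, not output of the content matching server: the table encodes the hypothetical expression `ab+c`, and the scanner consumes one character per step, much as a hardware FSM consumes one input symbol per clock cycle.

```python
# Hand-built DFA for the regular expression "ab+c", searched anywhere in the input.
# State 0: nothing matched, 1: seen "a", 2: seen "ab+", 3: accept.
TRANSITIONS = {
    0: {"a": 1},
    1: {"a": 1, "b": 2},
    2: {"a": 1, "b": 2, "c": 3},
}
ACCEPT = 3


def scan(stream):
    """Return the index of the last character of every match of ab+c."""
    state, ends = 0, []
    for i, ch in enumerate(stream):
        # Any character without an entry sends the machine back to state 0.
        state = TRANSITIONS[state].get(ch, 0)
        if state == ACCEPT:
            ends.append(i)
            state = 0  # restart the search after reporting a match
    return ends
```

The scanner keeps only the current state between characters, which is why this style of matcher maps naturally onto registers and lookup logic.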
Abstract: The FPX provides simple and fast mechanisms to process cells or packets at the full line speed of the card [currently 2.4 Gbits/sec]. A sample application, called `Hello World', has been developed that illustrates how easily an application can be implemented on the FPX. This application uses the FPGA hardware to search for a string on a particular flow and selectively replace contents of the payload. The resulting circuit operates at 119 MHz on a Xilinx XCV1000E-FG680-7, and occupies less than 1% of the available gates on the device.