Intrusion Detection and Prevension
(In Reverse Chronological Order)
Abstract: High-speed packet content inspection and filtering devices rely on a fast multi-pattern matching algorithm which is used to detect predefined keywords or signatures in the packets. Multi-pattern matching is known to require intensive memory accesses and is often a performance bottleneck. Hence specialized hardware-accelerated algorithms are required for line-speed packet processing.
We present hardware-implementable pattern matching algorithm for content filtering applications, which is scalable in terms of speed, the number of patterns and the pattern length. Our algorithm is based on a memory efficient multi-hashing data structure called Bloom filter. We use embedded on-chip memory blocks in FPGA/VLSI chips to construct Bloom filters which can suppress a large fraction of memory accesses and speed up string matching. Based on this concept, we first present a simple algorithm which can scan for several thousand short (up to 16 bytes) patterns at multi-gigabit per second speeds with a moderately small amount of embedded memory and a few mega bytes of external memory. Furthermore, we modify this algorithm to be able to handle arbitrarily large strings at the cost of a little more on-chip memory. We demonstrate the merit of our algorithm through theoretical analysis and simulations performed on Snort's string set.
Abstract: The performance pressures on implementing effective network security monitoring are growing fiercely due to rising traffic rates, the need to perform much more sophisticated forms of analysis, the requirement for inline processing, and the collapse of Moores law for sequential processing. Given these growing pressures, we argue that it is time to fundamentally rethink the nature of using hardware to support network security analysis. Clearly, to do so we must leverage massively parallel computing elements, as only these can provide the necessary performance. The key, however, is to devise an abstraction of parallel processing that will allow us to expose the parallelism latent in semantically rich, stateful analysis algorithms; and that we can then further compile to hardware platforms with different capabilities.
Abstract: High-speed packet content inspection and filtering devices rely on a fast multi-pattern matching algorithm which is used to detect predefined keywords or signatures in the packets. Multi-pattern matching is known to require intensive memory accesses and is often a performance bottleneck. Hence specialized hardware-accelerated algorithms are being developed for line-speed packet processing. While several pattern matching algorithms have already been developed for such applications, we find that most of them suffer from scalability issues. To support a large number of patterns, the throughput is compromised or vice versa.
We present a hardware-implementable pattern matching algorithm for content filtering applications, which is scalable in terms of speed, the number of patterns and the pattern length. We modify the classic Aho-Corasick algorithm to consider multiple characters at a time for higher throughput. Furthermore, we suppress a large fraction of memory accesses by using Bloom filters implemented with a small amount of on-chip memory. The resulting algorithm can support matching of several thousands of patterns at more than 10 Gbps with the help of a less than 50 KBytes of embedded memory and a few megabytes of external SRAM. We demonstrate the merit of our algorithm through theoretical analysis and simulations performed on Snort's string set.
Abstract: Intrusion rule processing in reconfigurable hardware enables intrusion detection and prevention services to run at multi Gigabit/second rates. High-level intrusion rules mapped directly into hardware separate malicious content from benign content in network traffic. Hardware parallelism allows intrusion systems to scale to support fast network links, such as OC-192 and 10 Gbps Ethernet. In this paper, a Snort Intrusion Filter for TCP (SIFT) is presented that operates as a preprocessor to prevent benign traffic from being inspected by an intrusion monitor running Snort. Snort is a popular open-source rule-processing intrusion system. SIFT selectively forwards IP packets that contain questionable headers or defined signatures to a PC where complete rule processing is performed. SIFT alleviates the need for most network traffic from being inspected by software. Statistics, like how many packets match rules, are used to optimize rule processing systems. SIFT has been implemented and tested in FPGA hardware and used to process Internet traffic from a campus Internet backbone with live data.
Abstract: High-performance rule processing systems are needed by network administrators in order to protect Internet systems from attack. Researchers have been working to implement components of intrusion detection systems (IDS), such as the highly popular Snort system, in reconfigurable hardware. While considerable progress has been made in the areas of string matching and header processing, complete systems have not yet been demonstrated that effectively combine all of the functionality necessary to perform rule processing for network systems. In this paper, a framework for implementing a rule processing system in reconfigurable hardware is presented. The framework integrates the functionality to scan data flows for regular expressions, fixed strings, and header values. It also allows modules to be added to perform extended functionality to support all features found in Snort rules. Reconfigurability and flexibility are key components of the framework that enable it to adapt to protect Internet systems from threats including malicious worms, computer viruses, and network intruders. To prove the framework viable, a system has been built that scans all bytes of Transmission Control Protocol/Internet Protocol (TCP/IP) traffic entering and leaving a network's gateway at multi-gigabit rates. Using Xilinx FPGA hardware on the Field programmable Port eXtender (FPX) platform, the framework can process 32,768 complex rules at data rates of 2.5 Gbps. Systems to handle data at 10 Gbps rates can be built today using the same framework in the latest reconfigurable hardware devices such as the Virtex 4.
Abstract: Next-generation data processing systems must deal with very high data ingest rates and massive volumes of data. Such conditions are typically encountered in the Intelligence Community (IC) where analysts must search through huge volumes of data in order to gather evidence to support or refute their hypotheses. Their effort is made all the more difficult given that the data appears as unstructured text that is written in multiple languages using characters that have different encodings. Human Analysts have not been able to keep pace with reading the data and a large amount of data is discarded even though it might contain key information. The goal of our project is to assess the feasibility of incrementally replacing humans with automation in key areas of information processing. These areas include document ingest, content categorization, language translation, and context-and-temporally- based information retrieval.
Mathematical transformation algorithms, when implemented in rapidly reconfigurable hardware, offer the potential to continuously (re)process and (re)interpret extremely high volumes of multi-lingual, unstructured text data. These technologies can automatically elicit the semantics of streaming input data, organize the data by concept (regardless of language), and associate related concepts in order to parameterize models. To test that hypothesis, we are building an experimentation testbed that enables the rapid implementation of semantic processing algorithms in hardware. The system includes a high-performance infrastructure that includes hardwarea accelerated content processing platform; mass storage to hold training data, test data, and experiment scenarios; and tools for analysis and visualization of the data.
In our first use of the testbed, we performed an experiment where we implemented three transformation algorithms using FPX hardware platforms to perform semantic processing on document streams. Our platform uses Field-programmable Port Extender (FPX) modules developed at Washington University in Saint Louis. This paper describes our approach to building the experimental hardware platform components, discusses the major features of the circuit designs, overviews our first experiment, and offers a detailed of the results, which are processing.
Abstract: FPGA technology has become widely used for real-time network intrusion detection. In this paper, a novel packet classification architecture called BV-TCAM is presented, which is implemented for an FPGA-based Network Intrusion Detection System (NIDS). The classifier can report multiple matches at gigabit per second network link rates. The BV-TCAM architecture combines the Ternary Content Addressable Memory (TCAM) and the Bit Vector (BV) algorithm to effectively compress the data representations and boost throughput. A tree-bitmap implementation of the BV algorithm is used for source and destination port lookup while a TCAM performs the lookup of the other header fields, which can be represented as a pre/x or exact value. The architecture eliminates the requirement for prefix expansion of port ranges. With the aid of a small embedded TCAM, packet classification can be implemented in a relatively small part of the available logic of an FPGA. The design is prototyped and evaluated in a Xilinx FPGA XCV2000E on the FPX platform. Even with the most difficult set of rules and packet inputs, the circuit is fast enough to sustain OC48 tra1c throughput. Using larger and faster FPGAs, the system can work at speeds greater than OC192.
Abstract: The proliferation of computer viruses and Internet worms has had a major impact on the Internet Community. Cleanup and control of malicious software (malware) has become a key problem for network administrators. Effective techniques are now needed to protect networks against outbreaks of malware. Wire-speed firewalls have been widely deployed to limit the flow of traffic from untrusted domains. But these devices weakness resides in a limited ability to protect networks from infected machines on otherwise trusted networks. Progressive network administrators have been using an Intrusion Prevention System (IPS) to actively block the flow of malicious traffic. New types of active and extensible network systems that use both microprocessors and reconfigurable logic can perform wire-speed services in order to protect networks against computer virus and Internet worm propagation. This paper discusses a scalable system that makes use of automated worm detection and intrusion prevention to stop the spread of computer viruses and Internet worms using extensible hardware components distributed throughout a network. The contribution of this work is to present how to manage and configure large numbers of distributed and extensible IPSs.
Abstract: Field Programmable Gate Arrays (FPGAs) can be used in Intrusion Prevention Systems (IPS) to inspect application data contained within network flows. An IPS operating on high-speed network traffic can be used to stop the propagation of Internet worms and to protect networks from Denial of Services (DoS) attacks. When used in the backbone of a core network, the device will be exposed to millions of active flows simultaneously. In order to protect the data in each connection, network devices will need to track the state of every flow. This must be done at multi-gigabit line rates without introducing significant delays. This paper describes a high performance TCP processing system called TCP-Processor which supports flow processing in high-speed networks utilizing multiple devices. This circuit provides stateful flow tracking, TCP stream reassembly, context storage, and flow manipulation services for applications which process TCP data streams. A simple client interface eases the complexities associated with processing TCP data streams. In addition, a set of encoding and decoding circuits has been developed which efficiently transports this interface between multiple FPGA devices. The circuit has been implemented in FPGA hardware and tested using live Internet traffic.
Abstract: Recent well publicized attacks have made it clear that worms constitute a threat to Internet security. Systems that secure networks against malicious code are expected to be a part of critical Internet infrastructure in the future. Intrusion Detection and Prevention Systems (IDPS) currently have limited use because they can filter only known worms. In this paper, we present the design and implementation of a system that automatically detects new worms in real-time by monitoring traffic on a network. The system uses Field Programmable Gate Arrays (FPGAs) to scan packets for patterns of similar content. Given that a new worm hits the network and the rate of infection is high, the system is automatically able to detect an outbreak. Frequently occuring strings in packet payloads are instantly reported as likely worm signatures.
Abstract: Network Intrusion Detection and Prevention Systems (IDPS) use string matching to scan Internet packets for malicious content. Bloom filters offer a mechanism to search for a large number of strings efficiently and concurrently when implemented with Field Programmable Gate Array (FPGA) technology. A string matching circuit has been implemented within the FPX platform using Bloom filters. Using 155 block RAMs on a single Xilinx VirtexE 2000 FPGA, the circuit scans for 35,475 unique signatures.
Abstract: Because conventional software-based packet inspection algorithms have not kept pace with high-speed networks, interest has turned to using hardware to process network data quickly. String scanning with Bloom filters can scan entire packet payloads for predifined signatures at multi-Gigabit-per-second line speeds.
Abstract Today's crucial information networks are vulnerable to fast moving attacks by Internet worms and computer viruses. These attacks have the potential to cripple the Internet and compromise the integrity of the data on the end-user machines. Without new types of protection, the Internet remains susceptible to the assault of increasingly aggressive attacks. A platform has been implemented that actively detects and blocks worms and viruses at multi-Gigabit/second rates. It uses the Field-programmable Port Extender (FPX) to scan for signatures of malicious software (malware) carried in packet payloads. Dynamically reconfigurable Field Programmable Gate Array (FPGA) logic tracks the state of Internet flows and searches for regular expressions and fixedstrings that appear in the content of packets. Protection is achieved by the incremental deployment of systems throughout the Internet.
Abstract An extensible .rewall has been implemented that performs packet .ltering, content scanning, and per-.ow queuing of Internet packets at Gigabit/second rates. The .rewall uses layered protocol wrappers to parse the content of Internet data. Packet payloads are scanned for keywords using parallel regular expression matching circuits. Packet headers are compared to rules speci.ed in Ternary Content Addressable Memories (TCAMs). Per-.ow queuing is performed to mitigate the effect of Denial of Service attacks. All packet processing operations were implemented with recon.gurable hardware and .t within a single Xilinx Virtex XCV2000E Field Programmable Gate Array (FPGA). The singlechip .rewall has been used to .lter Internet SPAM and to guard against several types of network intrusion. Additional features were implemented in extensible hardware modules deployed using run-time recon.guration.
Abstract Recent advances in network packet processing focus on payload inspection for applications that include contentbased billing, layer-7 switching and Internet security. Most of the applications in this family need to search for predefined signatures in the packet payload. Hence an important building block of these processors is string matching infrastructure. Since conventional software-based algorithms for string matching have not kept pace with high network speeds, specialized high-speed, hardware-based solutions are needed. We describe a technique based on Bloom filters for detecting predefined signatures (a string of bytes) in the packet payload. A Bloom filter is a data structure for representing a set of strings in order to support membership queries. We use hardware Bloom filters to isolate all packets that potentially contain predefined signatures. Another independent process eliminates false positives produced by Bloom filters.
We outline our approach for string matching at line speeds and present the performance analysis. Finally, we report the results for a prototype implementation of this system on the FPX platform. Our analysis shows that with the state-of-the-art FPGAs, a set of 10,000 strings can be scanned in the network data at the line speed of OC48 (2.4 Gbps).
Abstract Hardware assisted intrusion detection systems and content scanning engines are needed to process data at multigigabit line rates. These systems, when placed within the core of the Internet, are subject to millions of simultaneous flows, with each flow potentially containing data of interest. Existing IDS systems are not capable of processing millions of flows at gigabit-per-second data rates. This paper describes an architecture which is capable of performing complete, stateful, payload inspections on 8 million TCP flows at 2.5 gigabits-per-second. To accomplish this task, a hardware circuit is used to combine a TCP protocol processing engine, a per flow state store, and a content scanning engine.
Abstract A module has been implemented in Field Programmable Gate Array (FPGA) hardware that is able to perform regular expression search-and-replace operations on the content of Internet packets at Gigabit/ second rates. All of the packet processing operations are performed using reconfigurable hardware within a single Xilinx Virtex XCV2000E FPGA. A set of layered protocol wrappers is used to parse the headers and payloads of packets for Internet protocol data. A content matching server automatically generates, compiles, synthesizes, and programs the module into the Field-programmable Port Extender (FPX) platform.
Abstract A module has been implemented in Field Programmable Gate Array (FPGA) hardware that scans the content of Internet packets at Gigabit/second rates. All of the packet processing operations are performed using recon/gurable hardware within a single Xilinx Virtex XCV2000E FPGA. A set of layered protocol wrappers is used to parse the headers and payloads of packets for Internet protocol data. A content match- ing server automatically generates the Finite State Machines (FSMs) to search for regular expressions. The complete system is operated on the Field-programmable Port Extender (FPX) platform.
Abstract Network routing platforms and Internet firewalls of the next decade will be radically different than the platforms of today. They will contain modular components that can be dynamically reconfigured over the Internet. But, unlike the active networks that are in the research labs today, these new platforms will not suffer from the performance penalty of processing packets in software.
These platforms will implement routing, packet filtering, and queuing functions in reprogrammable hardware. The hardware of the system will evolve over time as packet pro-cessing algorithms and protocols progress. The granularity of the system will be configurable down to the level of the logic gates. These logic gates, and the interconnections be-tween them, will be reconfigurable over the Internet. These routers will enable new services to be rapidly deployed over the Internet and operate at the full rate of the an Internet backbone link.
Through the development of the the Field Programmable Port Extender (FPX), a platform has been built that demon-strates how networking modules can be used for rapid prototype and deployment of networking hardware. The platform includes high-speed network interfaces, multiple banks of memory, and Field Programmable Gate Array (FPGA) logic. Applications have been developed for the FPX that include Internet packet routing, data queuing, and application-level data modification. The FPX is currently used as a component in an evolvable router.