Evaluation of Packet Classification Algorithms
Packet classification enables network routers to provide advanced network services, e.g. network security, QoS routing, and resource reservation. There is an increasing interest in both industry and academia in algorithms and systems for efficient packet classification. On the one hand, network security and QoS have become urgent driving factors requiring large-scale packet classification; on the other hand, increasing network traffic poses greater challenges than ever for the application of large-scale packet classification. For these reasons, packet classification is still an open and challenging problem demanding continuing investigation.
Though many packet classification algorithms and architectures have been proposed and research is ongoing, researchers and technology adopters find it difficult to choose an appropriate algorithm for their application and to evaluate new algorithms objectively. Before exploring new possibilities, it is imperative to understand existing algorithms under uniform test conditions and a common set of benchmark criteria. Unfortunately, existing algorithm evaluations are hardly persuasive, for the following reasons:
1.1 Incommensurable evaluation results:
First, evaluations by different authors are not based on the same filter sets. Researchers have limited access to real-world filter sets and sometimes have to use randomly generated filter sets for evaluations. However, the performance of many packet classification algorithms is very sensitive to the structure of the filter sets. Second, the evaluations do not share a common implementation model. Some algorithms assume a software-based implementation and others assume a hardware-based implementation. Different assumptions about implementation architectures and platforms can lead to very different performance evaluation results. Third, there is an absence of evaluation tools, benchmarks, and publicly accepted measurements. Different people have their own understanding of the evaluation criteria. Some criteria are unrealistic and make it hard to determine the actual performance one might expect in practice. This makes it difficult to understand and compare the evaluation results.
1.2 Irreproducible algorithm implementations and evaluation results:
Researchers rarely provide enough details to allow readers to exactly reproduce the work reported in their research papers. Either some key points of the algorithm description are missing or the test conditions, such as parameter settings and the filter sets used, are undisclosed. This situation causes confusion and creates an unnecessary hurdle for others trying to understand the algorithms and advance the research.
1.3 Incomplete evaluation and unconvincing results:
Packet classification algorithms often involve some tradeoffs, heuristics, and optimizations. Tunable parameters may have subtle effects on algorithm performance. It is important to isolate them and evaluate their behavior carefully in order to clarify their impact on the algorithm. However, some evaluations fail to identify the performance impact of individual parameters. Moreover, some researchers boast about some aspects of their algorithms while underplaying their drawbacks. Researchers sometimes make claims without sufficient proof. Some assumptions and prerequisites are impractical or invalid. The ambiguities in algorithm descriptions and evaluations are too confusing to allow readers to make valid judgments.
1.4 Inadequate insights:
Some research papers focus on algorithm details and lack the high-level insights that reveal the principles underlying the algorithms. Proposed algorithms become more and more complex without convincing benefits. A deeper understanding of the problem is needed to enable more effective algorithm design efforts.
Having identified these problems in algorithm evaluations for packet classification, we aim to establish a standard procedure for algorithm description and evaluation. In particular, we propose to provide the research community with an objective and "advocacy-free" evaluation of a suite of packet classification algorithms. Our effort will help to better understand some representative algorithms, promote a standard for algorithm evaluation, ease the learning curve for researchers, and encourage contributions from the research community to improve it.
Ideally, the evaluation should cover
the criteria of throughput, storage, incremental update support, preprocessing
time, scalability to the size of filter sets, adaptability to the structure of
filter sets, implementation cost, and power dissipation. All the evaluation
results should be normalized in a directly comparable way. In different
applications, some criteria may be more important than others, but
the evaluation should provide information without bias and let readers make
their own judgments. While asymptotic analysis of timing and storage complexity
is a useful metric, the evaluation should not be limited to it. Because packet
classification algorithms are mostly based on heuristics, different filter sets
with different structures and sizes tend to give very different results. The
performance of the algorithm on real filter sets is the decisive factor in any
realistic evaluation.
A summary of our approach follows:
2.1 Documentation of method:
First, we provide a complete description of the key data structures and all the tunable parameters. Second, we provide a detailed description of the algorithm preprocessing and lookup process along with step-by-step illustrations using an example. Third, we provide the source code for an actual implementation. Our implementation assumes a simple hardware-based model or a network processor-based model, which includes multiple on-chip lookup engines or threads, a memory interface, and a commodity off-chip memory. All data are retrieved from the off-chip memory. The lookup for one packet is conducted by a sequence of dependent memory accesses. The memory bandwidth is shared by multiple independent lookup engines or threads. The on-chip resource usage is small relative to the filter set size, so we ignore its cost in our evaluations. We also try to categorize the algorithms based on their high-level ideas and provide insights to help improve the algorithm performance or design better algorithms.
2.2 Documentation of filter set:
The open-source ClassBench is used to generate synthetic filter sets with different scales and structures. We provide the parameters used for filter set generation. We also generate a packet header trace using ClassBench for each filter set for implementation verification and algorithm evaluation. The size of a trace is about 10 times that of the corresponding filter set. We provide the original filter sets that are used as seeds for the synthetic filter sets. The statistics files extracted from the original filter sets can be downloaded from the ClassBench website.
2.3 Metrics for evaluation:
For objective and meaningful algorithm evaluation, we measure the space efficiency of an algorithm using the average number of bytes consumed per filter. We measure the throughput of an algorithm using the memory bandwidth consumption: the number of bytes per memory access and the number of dependent memory accesses per packet lookup. The memory bandwidth consumption is evaluated in both the worst case and the average case. We will use three figures like those shown below to present the results. The overall data structure size is the product of the memory consumption per filter and the number of filters. The overall throughput can be calculated by dividing the total memory bandwidth by the memory bandwidth consumed per packet lookup.
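The two derived quantities described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `LookupCost`, `total_structure_bytes`, and `lookups_per_second` are hypothetical, every memory access in a lookup is assumed to fetch the same number of bytes, and the numbers in the usage lines are made up for the example rather than measured results.

```python
from dataclasses import dataclass


@dataclass
class LookupCost:
    """Memory cost of one packet lookup (hypothetical container)."""
    bytes_per_access: int    # bytes fetched by each memory access
    dependent_accesses: int  # dependent memory accesses per lookup

    def bytes_per_lookup(self) -> int:
        # Assumption: every access fetches the same number of bytes.
        return self.bytes_per_access * self.dependent_accesses


def total_structure_bytes(bytes_per_filter: float, num_filters: int) -> float:
    """Overall data structure size = bytes per filter x number of filters."""
    return bytes_per_filter * num_filters


def lookups_per_second(bandwidth_bytes_per_s: float, cost: LookupCost) -> float:
    """Overall throughput = total bandwidth / bandwidth used per lookup."""
    return bandwidth_bytes_per_s / cost.bytes_per_lookup()


# Illustrative inputs only (not measurements from this evaluation).
size = total_structure_bytes(bytes_per_filter=40, num_filters=9603)
rate = lookups_per_second(1e9, LookupCost(bytes_per_access=32,
                                          dependent_accesses=10))
print(size, rate)
```

Calling `lookups_per_second` once with the worst-case cost and once with the average-case cost gives the two throughput figures that correspond to the two bandwidth measurements.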

2.4 Sensitivity study:
We quantitatively determine how each individual parameter influences the overall performance in the algorithm evaluation. For each tunable parameter, we produce figures like the one shown below. Each figure uses a filter set of a different scale. The sensitivity study will clarify issues often left unresolved in the original papers. It will also help users to determine the optimal design parameters for a given filter set.

Note that our implementations are only for the purpose of simulation and evaluation; thus, the source code is not optimized for software execution and the implementations do not directly map to hardware or a network processor. In addition, we do not consider preprocessing cost, incremental update cost, or power dissipation. These factors are left for future studies.
We only consider 5-tuple filters: {source IP address, destination IP address, source port, destination port, protocol}. The filter format is "@{source IP address prefix in dot-decimal notation}/{prefix length} {destination IP address prefix in dot-decimal notation}/{prefix length} {low source port} : {high source port} {low destination port} : {high destination port} {protocol value in hexadecimal}/{protocol mask in hexadecimal}". The header trace format is "{source IP address in decimal} {destination IP address in decimal} {source port value in decimal} {destination port value in decimal} {protocol in decimal}"
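As a minimal sketch of how the two formats above might be read, the following Python parses one filter line and one header line and tests whether the header matches the filter. The `Filter` type and the function names are hypothetical, fields are assumed to be separated by whitespace, anything after the fifth header field is ignored, and the sample lines in the usage are invented for illustration.

```python
import ipaddress
from typing import NamedTuple, Tuple

Range = Tuple[int, int]  # inclusive (low, high)


class Filter(NamedTuple):
    """Hypothetical in-memory form of one 5-tuple filter."""
    src: Range
    dst: Range
    sport: Range
    dport: Range
    proto: Tuple[int, int]  # (value, mask)


def prefix_to_range(text: str) -> Range:
    """Convert 'a.b.c.d/len' into the inclusive address range it covers."""
    addr, length = text.split("/")
    mask = (0xFFFFFFFF << (32 - int(length))) & 0xFFFFFFFF
    low = int(ipaddress.IPv4Address(addr)) & mask
    return low, low | (~mask & 0xFFFFFFFF)


def parse_filter(line: str) -> Filter:
    """Parse '@src/len dst/len lo : hi lo : hi proto/mask'."""
    tok = line.strip().lstrip("@").split()
    if len(tok) != 9 or tok[3] != ":" or tok[6] != ":":
        raise ValueError("unexpected filter line: %r" % line)
    value, mask = (int(part, 16) for part in tok[8].split("/"))
    return Filter(prefix_to_range(tok[0]), prefix_to_range(tok[1]),
                  (int(tok[2]), int(tok[4])), (int(tok[5]), int(tok[7])),
                  (value, mask))


def parse_header(line: str) -> Tuple[int, ...]:
    """Parse the five decimal header fields; extra fields are ignored."""
    fields = tuple(int(x) for x in line.split()[:5])
    if len(fields) != 5:
        raise ValueError("unexpected header line: %r" % line)
    return fields


def matches(f: Filter, header: Tuple[int, ...]) -> bool:
    """True if every header field falls inside the filter's field."""
    sip, dip, sport, dport, proto = header
    value, mask = f.proto
    return (f.src[0] <= sip <= f.src[1] and f.dst[0] <= dip <= f.dst[1]
            and f.sport[0] <= sport <= f.sport[1]
            and f.dport[0] <= dport <= f.dport[1]
            and (proto & mask) == (value & mask))


# Invented sample lines: 192.168.1.1 -> 10.1.2.3, TCP, destination port 80.
rule = parse_filter("@192.168.0.0/16\t10.0.0.0/8\t0 : 65535\t80 : 80\t0x06/0xFF")
print(matches(rule, parse_header("3232235777 167838211 12345 80 6")))
```

Storing the address prefixes as ranges keeps the match test uniform across the four range-like fields; only the protocol field needs the value/mask comparison.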
The seed filter sets are extracted from real filter sets. Their characteristics can be found in the ClassBench Technical Report. The synthetic filter sets and the packet traces are both generated by ClassBench. The TCP flag field is removed. The number of filters or packet headers is shown in parentheses.
| | Seed Filter | Synthetic Filter | | | |
| Filter Set | ACL1 (752) | ACL1_100 (98) | ACL1_1K (916) | ACL1_5K (4415) | ACL1_10K (9603) |
| Packet Trace | ACL1_trace (8140) | ACL1_100_trace (1000) | ACL1_1K_trace (9380) | ACL1_5K_trace (45600) | ACL1_10K_trace (97000) |
| Filter Set | FW1 (269) | FW1_100 (92) | FW1_1K (791) | FW1_5K (4653) | FW1_10K (9311) |
| Packet Trace | FW1_trace (2830) | FW1_100_trace (920) | FW1_1K_trace (8050) | FW1_5K_trace (46700) | FW1_10K_trace (93250) |
| Filter Set | IPC1 (1550) | IPC1_100 (99) | IPC1_1K (938) | IPC1_5K (4460) | IPC1_10K (9037) |
| Packet Trace | IPC1_trace (17020) | IPC1_100_trace (990) | IPC1_1K_trace (9380) | IPC1_5K_trace (44790) | IPC1_10K_trace (90640) |
Note that each filter set corresponds one-to-one with its packet trace. Use them together for implementation verification. We also use the packet header trace to evaluate the average performance of the algorithms.
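One way to use a filter set and its trace together is to compare an implementation against a plain linear search and collect the access counts along the way. The Python below is a sketch under several assumptions that do not come from this page: rules are listed in priority order so the first match wins, every field (including the protocol) is simplified to an inclusive range, and the candidate classifier is a hypothetical callable that returns the matched rule index together with its number of dependent memory accesses.

```python
from typing import Callable, Optional, Sequence, Tuple

Header = Tuple[int, int, int, int, int]
# Simplification: one inclusive (low, high) range per header field.
Rule = Tuple[Tuple[int, int], ...]
Candidate = Callable[[Header], Tuple[Optional[int], int]]


def linear_search(rules: Sequence[Rule], header: Header) -> Optional[int]:
    """Reference classifier: index of the first matching rule, or None."""
    for index, rule in enumerate(rules):
        if all(low <= value <= high
               for (low, high), value in zip(rule, header)):
            return index
    return None


def verify(rules: Sequence[Rule], trace: Sequence[Header],
           candidate: Candidate) -> Tuple[float, int]:
    """Check the candidate against linear search on every header.

    Returns (average, worst-case) dependent memory accesses per lookup.
    """
    if not trace:
        raise ValueError("empty trace")
    accesses = []
    for header in trace:
        got, count = candidate(header)
        expected = linear_search(rules, header)
        if got != expected:
            raise AssertionError(
                "header %r: got %r, expected %r" % (header, got, expected))
        accesses.append(count)
    return sum(accesses) / len(accesses), max(accesses)
```

The average comes from the trace, while the maximum observed here is only the worst case over that trace; a true worst case depends on the data structure itself.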
| Type | Algorithm | Source |
| Decision-tree Based | HiCuts | P. Gupta et al., "Packet Classification using Hierarchical Intelligent Cuttings", IEEE Symposium on High Performance Interconnects (HotI), 1999 |
| Decision-tree Based | Modular Packet Classification | T. Y. C. Woo, "A Modular Approach to Packet Classification", IEEE INFOCOM, 2000 |
| Decision-tree Based | HyperCuts | S. Singh et al., "Packet Classification using Multidimensional Cutting", ACM SIGCOMM, 2003 |
| Decomposition Based | RFC | P. Gupta et al., "Packet Classification on Multiple Fields", ACM SIGCOMM, 1999 |
| Decomposition Based | BV | T. V. Lakshman et al., "High-Speed Policy-based Packet Forwarding Using Efficient Multi-dimensional Range Matching", ACM SIGCOMM, 1998 |
| Hash Based | Tuple Space Search | V. Srinivasan et al., "Packet Classification Using Tuple Space Search", ACM SIGCOMM, 1999 |
5.1 Packet Classification Repository from UCSD, where you can find some algorithm implementations
5.2 ClassBench from WashU, a software tool for generating synthetic filter sets
Last updated: 13 February 2007 09:22:24 PM