Francis Zane, Girija Narlikar, Anindya Basu
Proceedings of IEEE Infocom 2003
Summary and critique by David E. Taylor
This paper addresses the problem of reducing power consumption of Ternary Content Addressable Memory (TCAMs) when used for route lookup. TCAMs represent the brute force hardware solution to the route lookup problem; exorbitant power consumption is the byproduct. Nonetheless, many vendors employ the devices despite their additional contribution to per-port costs. For those who favor the devices over efficient implementations of clever algorithms employing more efficient SRAM, this work holds value. The paper does not give a reliable reference for the amount of power consumed by algorithmic implementations (in ASIC or FPGA); therefore, it is difficult to assess the relative strength of the results of the paper. (Their reference is from a TCAM vendor!!!)
The primary contribution of this paper is an architecture and supporting algorithms for using the sub-table capabilities of modern TCAMs in order to reduce overall power consumption. By performing an initial lookup via hashing or trie-based search, the sub-table of the TCAM containing the prefix is located. This reduces the number of active regions on the chip and reduces power consumption. The authors operate under the (reasonable) first-order assumption that total power consumption of the TCAM is roughly linear with the number of active entries. The authors provide theoretical bounds and experimental results for the partitioning algorithms.
The authors operate under the general assumption that the number of TCAM partitions will be greater than the number of buckets (or bins) of prefixes produced by the partitioning algorithm. This is a questionable assumption, as current devices support at most eight partitions. The authors argue that TCAM vendors will respond to market pressure for more power efficient devices and offer devices with more partitions. (This also assumes that the influential people in the market will read this paper and get excited about the results.) Regardless of market dynamics, the authors leave the possibly fruitful (and more practical) design space of more prefix buckets and fewer TCAM partitions unexplored.
The discussion of theoretical bounds on bucket sizes for the partitioning technique using hashing is terse. The authors fail to give the reader any intuition into their derivation of the bounds. In general, the authors’ choice of notation is more a hindrance than a help in understanding the derivation of bounds. It was noted that the worst-case power bound for the hashing architecture was more than 0.5 for 100k routes (about the size of today’s BGP tables) and 8 buckets (k=3, the maximum number of buckets if the number of buckets is to be greater than the number of TCAM partitions with today’s TCAMs). This is a rather paltry reduction in power consumption. Although the experimental results show considerably better performance, the results are irrelevant as hardware designers must use the worst-case bound in order to construct a power budget.
The authors introduce a “three step” heuristic approach to selecting bits (choosing a hash function) for the partitioning technique using hashing. The first “step” simply uses the least-significant bits in the range examined by the hash function. If the maximum bucket size is too large, the second step uses a greedy algorithm where the greedy choice selects the bit position that minimizes the size of the largest bucket. (The authors provide no formal analysis of the properties and/or performance of this algorithm.) If the maximum bucket size is still too large, the third step uses a brute force examination of all possible bit selections to choose the bit selection that minimizes the maximum bucket size.
While the discussion of trie-based partitioning algorithms is satisfactory, I find the lack of discussion of alternative options (to the carving technique) for this first-stage lookup a major omission. Given the restricted depth of the trie and the plethora of trie-based lookup algorithms and implementations in the literature, the technique could be greatly enhanced by at least briefly investigating the possibilities. Would a properly dimensioned multi-bit trie for the first stage be fast enough and reduce power consumption even more (by not using an index TCAM)? It would be nice to know. We also could envision a scheme that builds out an initial trie until the bucket size falls below a threshold (a technique used in several packet classification schemes). This would relax the constraint on only considering 16-24 bit prefixes and make the distribution of bucket sizes relatively uniform. Indexing into these buckets is trivial, as the index TCAM can support any length prefixes.
Following partitioning of the route table, the buckets must undergo “layout” on the actual TCAM device. Here the authors employ a sequential ordering of buckets regardless of TCAM partition boundaries; hence, the number of TCAM partitions to be searched is ceiling(bucket-size / partition-size) + 1. As we saw in our “back of the envelope” calculation of performance for the subtree-split technique, this can cause “real” performance (using actual route tables) to be worse than the calculated worst-case bound by a factor of more than 2. A major deficiency of the postorder-split algorithm is the worst-case behavior regarding the size of the index TCAM and the number of cover prefixes required. Using realistic assumptions, it can be shown that the number of entries in the index TCAM can approach the size the database (not to mention the size of the TCAM required for bucket entries and covering prefixes).
Finally, it is unclear if the technique is robust under heavy update load (approximately one update per ms). While the authors do address the issue of update floods, they sort of “punt” on the issue for the trie-based partitioning schemes. In essence, they say that the hashing technique works better in this case. Of course if we use hashing, we don’t even cut the power consumption by a factor of 2 (as mentioned at the outset).
That is all.