Designing Packet Buffers for Router Line = Cards
S. Iyer, R. R. Kompella, N. McKeown
Computer Systems Laboratory, Stanford University,
Stanford University HPNG Technical Report, Mar. = 2002.
(Submitted to IEEE/ACM Transactions on Networking, Nov. = 2002)
Summary:
For internet routers, the speed and the size of the packet buffer grow linearly with = the line-rate R. With the small capacity, fast speed, power-consuming and = more expensive SRAM and large capacity, slow speed, power-saving and cheaper = DRAM, we need satisfy both the speed and size requirements. The problem that = the paper tried to solve is: packet buffer should be capable to deliver a = new packet from a queue immediately when it receive the request or in a = relaxed case, within a fixed bounded time. A hybrid scheme is already in use by = some commercial routers: cache the tail/header of the queues in SRAM and = store the body of the queues in DRAM; the packet data are stored/loaded to/form = the DRAM by memory management. But there is no literature analysis to this = technique.
The paper presents a memory management algorithm called Most Deficit Queue First (MDQF-MMA) which gives zero pipeline delay using an SRAM of size = O(Q*b*lnQ) (Q is the number of queues in the packet buffer and b is the size of block = written to or read from the DRAM on each transaction). The idea of this = algorithm is that the MMA always tries to load the data from DRAM to SRAM for those = queues that has largest number of unsatisfied read request since those queues = are more possible to become underrun first.
If the router can tolerate some fixed delay, the SRAM’s size can be decreased = further. By maintain a look-ahead buffer; MMA can get the knowledge of how many = pending requests for each queue thus it can refresh the SRAM for the most = critical queue. This algorithm is called the Most Deficit Queue First with Pipeline = Delay (MDQFP-MMA). If the pipeline delay approaches to Q*b time slots the size = of the SRAM could be O(Q*b).
Both of the above algorithms base on the static SRAM space allocation. If the memory = can be dynamically allocated among queues, no doubt more memory space can be = saved. Based on the dynamically shared SRAM buffer, Earliest Critical Queue = First MMA (ECQF-MMA) was designed. Except the shared buffer, it is actually a = variant of the second algorithm. It was shown that the head SRAM buffer with size of = Q*(b-1) is good enough to service all the request patterns.
The authors argue that the general architecture can be used to build packet buffers = faster than any that are commercially available today.
Critique:= p>
1. The paper lack of = necessary performance comparison with related works. We want to know how good this = work comparing with other architectures and memory management algorithms; how = urgent we need there is not any delay or only fixed pipeline delay when = requesting data from a queue. If packet delay is not a critical issue, maybe some = other cheaper technologies is more reasonable.
2. Given the bound of the = SRAM size O(Q*b*lnQ), if we want to support IP per flow queuing (there could be millions of = queues), this technique is still impractical to current SRAM technology though it = might be preferred for small number of supported queues.
3. There is only a little = discussion about the implementation issues. We want to know the implementation = complexity of the algorithm. It is a key to turn this technique to real application = if it only requires moderate effort and resource to implement.
Haoyu Song