Paper: New Directions in Traffic Measurement and Accounting
Authors: Cristian Estan and George Varghese (both of UCSD)
Presenter: Michela Becchi
Discussion Leaders: Andrew Levine and Jeff Mitchell
Editor: Patrick Crowley

The first point raised was whether the paper's central assumption is valid: that tracking only large flows gives network operators enough useful information. It was generally agreed that the assumption holds under some conditions. Some argued that as a link's capacity increases, the memory required and the number of packets to be examined would also increase, causing the algorithms to fail. Others countered that as a link's capacity increases, there is no reason the amount of traffic that qualifies a flow as "large" could not be raised as well. It was also argued that examining how the algorithms handle smaller flows is not a valid way to fault them, since they are designed for large flows.

The veracity of the results was questioned by some reviewers and lauded by others, which made for an interesting contrast. The tradeoffs between unidentified flows and average error, as shown in Tables 5-7, were a cause of concern to some. For instance, as Jing pointed out, Table 5 shows an increase of 9000% in the unidentified flow rate between Sampled NetFlow and Sample and Hold. Whether this is an actual cause for concern was not well established.
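For readers who want the mechanics behind this exchange, the following is a minimal sketch of Sample and Hold as we understand it from the paper, together with a "large flow" threshold defined as a fraction of link capacity, which is the basis of the counterpoint that the threshold can grow with the link. The function names, parameters, and the choice to credit the sampled packet's full size to a new entry are illustrative assumptions, not the authors' implementation.

```python
import random

def large_flow_threshold_bytes(link_capacity_bps, interval_s, fraction):
    """Bytes a flow must send in one interval to count as "large".

    Assumption: "large" is defined as a fixed fraction of what the link
    can carry in one measurement interval, so the threshold scales with
    link capacity.
    """
    return fraction * (link_capacity_bps / 8.0) * interval_s

def sample_and_hold(packets, byte_sampling_prob, rng=None):
    """Sketch of Sample and Hold over an iterable of (flow_id, size_bytes).

    A packet from a flow with no entry creates one with probability
    1 - (1 - p)^size, i.e. as if each byte were sampled with probability p.
    Once a flow has an entry, every later packet of that flow is counted.
    """
    rng = rng or random.Random(0)  # fixed seed only for repeatability
    flow_memory = {}
    for flow_id, size in packets:
        if flow_id in flow_memory:
            # "Hold": count every subsequent packet of a tracked flow.
            flow_memory[flow_id] += size
        else:
            p_packet = 1.0 - (1.0 - byte_sampling_prob) ** size
            if rng.random() < p_packet:
                # Assumption: the sampled packet's bytes seed the counter.
                flow_memory[flow_id] = size
    return flow_memory

# Hypothetical trace: one heavy flow and two light ones.
trace = [("A", 1500)] * 1000 + [("B", 40)] * 5 + [("C", 40)] * 3
estimates = sample_and_hold(trace, byte_sampling_prob=1e-4)
```

Because bytes sent before a flow is first sampled are never counted, each estimate is at most the flow's true size; small flows are often missed entirely, which is the intended behavior rather than a defect.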

Another point raised was whether comparing the paper's algorithms to NetFlow is an apples-to-apples comparison, since the algorithms are designed with accountability in mind, whereas NetFlow was designed with raw statistics in mind.

Memory issues were another sticking point. Concerns were raised over the tradeoffs between SRAM and SDRAM (such as whether, in a real deployment, SDRAM would be too slow or SRAM too expensive), as well as the tradeoffs between memory size and measurement interval. In addition, the question was floated of whether the traces being unidirectional has any impact on memory requirements. It was decided that the paper is unclear about the sense in which the traces are "unidirectional" (i.e., whether the traffic actually flows one way, or was simply captured only at the inputs or only at the outputs of a router), and that the answer to this question depends on which is meant.

Finally, the effects of the measurement interval size and the rollover method on memory and accounting accuracy were discussed. Most of the relevant information seems to be in the associated tech report.

Some issues that were brought up by students but did not make it into the discussion were: alternative accounting methods (such as optimal prediction using linear minimum mean square error and "Active Measurement" [The Measurement Manifesto, also by Varghese and Estan]), tradeoffs of hash key sizes, and the behavior of the algorithms under special-case traffic conditions (such as extreme burstiness).

Overall, most students felt that this was a middle-third paper. In contrast, the professors and a minority of students felt that it was a top-third paper. Notably, nobody felt that it was a bottom-third paper.