Reviewer: Amy Freestone
Date: 9-22-2005
How would you rate this paper, relative to others we have read? top 25%, but not top 10%
How would you rate your knowledge of the topic of this paper? novice
What problem or issue does the paper address? Why is it important?
The paper addresses the efficient searching of documents in peer-to-peer systems and develops a decentralized, non-flooding method for doing so. This issue is important because, as more documents become available on P2P systems, searching them quickly and accurately becomes more difficult.
What are the main contributions of the paper and why are they important?
How significant are these contributions relative to previous work?
According to the authors of the paper, no previous P2P system is organized around the documents' semantics. As the amount of digital data increases, so does the need to search that data quickly and efficiently, making this a fairly sizable contribution. The paper indicates that pSearch scales fairly well as corpus size increases, which will be of growing importance as the amount of available data grows.
The other contributions also seem to be of nontrivial significance: they address failings in previous systems, improving them and making them more usable and more scalable.
Give detailed comments justifying your view of the paper.
The issues that came to mind as I read this paper, and that seemed to need further exploration before the results could be accepted, were addressed later in the paper, giving me greater confidence in its contributions. One of my main initial concerns was the possibility of matches on dimensions that are not partitioned, but this was later addressed by the rolling-index scheme.
The statement that "in theory, pLSI can achieve the same precision as LSI" is never clarified further. This makes it difficult to evaluate with certainty whether pLSI truly achieves the same precision as LSI.
The paper contains a number of grammatical errors, which were distracting because they often partially obscured the meaning of the sentences in which they appeared.