We just posted a pre-print of the paper “Uses and Challenges for Network Datasets”, to appear at IEEE CATCH in March. The pre-print is at <http://www.isi.edu/~johnh/PAPERS/Heidemann09a.html>.
The abstract summarizes the paper:
Network datasets are necessary for many types of network research. While there has been significant discussion about specific datasets, there has been less about the overall state of network data collection. The goal of this paper is to explore the research questions facing the Internet today, the datasets needed to answer those questions, and the challenges to using those datasets. We suggest several practices that have proven important in use of current data sets, and open challenges to improve use of network data.
More specifically, the paper tries to answer the question Jody Westby put to PREDICT PIs: “why take data, what is it good for?” While it is a simple question, it’s not easy to answer (at least, my attempt to dash off a quick answer in e-mail failed). The paper is an attempt at a more thoughtful answer.
The paper tries to summarize and point to a lot of ongoing work, but I know that our coverage was insufficient. We welcome feedback about what we’re missing.
John Heidemann and Christos Papadopoulos. Uses and Challenges for Network Datasets. In