I’m honored to have been invited to give the keynote talk “Sharing Network Data: Bright Gray Days Ahead” at the Passive and Active Measurement Conference 2014 in Marina del Rey.
From the talk’s abstract:
Sharing data is what we expect as a community. From the IMC best paper award requiring a public dataset to NSF data management plans, we know that data is crucial to reproducible science. Yet privacy concerns today make data acquisition difficult and sharing harder still. AOL and Netflix have released anonymized datasets that leaked customer information, at least for a few customers and with some effort. With the EU suggesting that IP addresses are personally identifiable information, are we doomed to IP-address free “Internet” datasets?
In this talk I will explore the issues in data sharing, suggesting that we need to move beyond black and white definitions of private and public datasets, to embrace the gray shades of data sharing in our future. Gray need not be gloomy. I will discuss some new ideas in sharing that suggest that, if we move beyond “anonymous ftp” as our definition, the future may be gray but bright.
This talk did not generate new datasets, but it grows out of our experiences distributing data through several research projects (such as LANDER and LACREND, both part of the DHS PREDICT program) mentioned in the talk with data available http://www.isi.edu/ant/traces/. This talk represents my on opinions, not those of these projects or their sponsors.