Community Labeling and Sharing of Security and Networking Test datasets (CLASSNET)

Project Description

Community Labeling and Sharing of Security and Networking Test datasets (CLASSNET) will support network and security research by providing new, labeled, rich and diverse datasets to the research community. First, the project will develop a framework for collaborative, community-driven enrichment and labeling of data, enabling use of our datasets for machine learning in networking and security. Second, the CLASSNET project will make data available to researchers through multiple methods, ensuring privacy of data while enabling flexible data computation. Finally, the project will generate diverse continuous (constantly, automatically updated) and curated (selected by humans) datasets for research use.

The CLASSNET project will innovate in three dimensions: data labeling, data distribution, and data sources. For data labeling, CLASSNET will provide a collaborative framework for low-friction sharing of annotations among researchers. The framework will incentivize labeling with feedback mechanisms and user credits, and support bulk, automatic, algorithmic labeling. For data distribution, CLASSNET will support multiple ways of data access, ranging from downloading anonymized data to processing data in the cloud, on provider machines, or via the code-to-data approach. Finally, CLASSNET data sources will provide new, diverse, continuous, and curated datasets that are useful for network and security research, including traffic packets and flows, network telescope data, DNS data, and Internet topology data.

The immediate impact of this project will include new types of labeled, curated and continuous datasets that enable new security, networking, and ML research and education, impacting a large community.

The broader impact of this data will be to foster research and education that will make the Internet safer, more stable, and more secure, and will increase the community’s knowledge about the Internet. With the Internet’s importance for tele-work, tele-medicine, remote learning, e-commerce and e-government, these improvements will have a broad societal impact. In addition, CLASSNET datasets will support data-driven exercises for graduate and undergraduate education, and new PhD research. The CLASSNET project’s innovations in multiple pathways to data access, combined with the automated and incentivized enrichment framework, will improve the state of the art for responsible data sharing in related disciplines of information technology.

Data from CLASSNET will be made available to researchers at no cost, and used to support education and research.

Support

CLASSNET is supported by NSF/CISE as NSF CRI-8115780.

CLASSNET is a joint effort of USC/ISI and Merit Network, Inc.

People

  • Wes Hardaker, co-PI on this project, researcher (USC/ISI)
  • John Heidemann, co-PI on this project, project leader and professor (USC/ISI)
  • Michalis Kallitsis, co-PI on this project, research scientist (Merit Network, Inc.)
  • Jelena Mirkovic, PI on this project, project leader and assistant professor (USC/ISI)

Publications

For related publications, please see the ANT publications web page.

Software

See also the ANT distribution web page.

Datasets

We make all datasets available through our dataset page.