AICyberLake: Live Evaluations of Real-World
Security Data Lake from National Cyberinfrastructure

Safeguarding U.S. supercomputing infrastructure, data, and AI workloads.

Illustration derived from the Art of HPC exhibit in ACM/IEEE Supercomputing '24.

Overview

The AICyberLake project curates a security data lake by collecting cyberattack data from the DeltaAI system at NCSA and its peer supercomputing centers. The data includes Zeek network cryptographic metadata, graphics processing unit (GPU) interconnect vulnerabilities, and ground-truth incident reports. The resulting data lake provides a real-time, anonymized stream of attack attempts that vetted research teams can use to evaluate their agentic AI-based detection models against unseen adversaries.

An interest form is available at https://go.illinois.edu/aicyberlake-interest-form. Please fill out the form or contact us if you have questions about accessing or using our resources. Instructions for accessing the data lake will follow soon.


Welcome to the official website for the "Live Evaluations of Real-World Security Data Lake from National Cyberinfrastructure" project, also known as AICyberLake. This project is funded by the Cybersecurity Innovation for Cyberinfrastructure (CICI) program of the National Science Foundation (NSF) and supported by the National Center for Supercomputing Applications (NCSA) at the University of Illinois Urbana-Champaign (UIUC).

Artificial Intelligence (AI)-driven cyberattack detection is essential for safeguarding the U.S. supercomputing infrastructure. AI research relies on this national supercomputing infrastructure, yet this critical resource is vulnerable to cyberattacks. Securing it requires an extensive understanding of historical security incidents, which provides a longitudinal perspective on trends, seasonality, and the evolution of cyberattacks. Without this historical context and insight into emerging AI workloads, the research community is left to react to, rather than preempt, emerging threats such as AI-driven malware, quantum-resistant vulnerabilities, and machine learning model supply chain backdoor attacks, leaving scientific breakthroughs vulnerable.


Mailing list

https://lists.illinois.edu/lists/info/aicyberlake

Objectives

The AICyberLake project aims to:

The AICyberLake team will work with research teams to analyze attacks targeting U.S. supercomputing infrastructure. The team will also provide an API (Application Programming Interface) that informs the broader community by contributing attack metadata to policymakers such as the National Institute of Standards and Technology (NIST).

People

Phuong Cao

Role: Principal Investigator (PI)

Affiliation: National Center for Supercomputing Applications, University of Illinois at Urbana-Champaign

Email: pcao3@illinois.edu

Ravishankar Iyer

Role: Co-Principal Investigator (Co-PI)

Affiliation: Coordinated Science Laboratory, University of Illinois at Urbana-Champaign

Publications

A list of publications resulting from this project will be posted here as they become available. Please check back for updates.

Related Publications

Related publications that use resiliency data (broadly, security and reliability data) from NCSA and its partners are listed below as samples.

Authors | Title | Year | Full Citation
Cui, Shengkun, Archit Patke, Ziheng Chen, Aditya Ranjan, Hung Nguyen, Phuong Cao, Brett Bode et al. | Characterizing Modern GPU Resilience and Impact in HPC Systems: A Case Study of A100 GPUs | 2025 | Cui, Shengkun, Archit Patke, Ziheng Chen, Aditya Ranjan, Hung Nguyen, Phuong Cao, Brett Bode et al. "Characterizing Modern GPU Resilience and Impact in HPC Systems: A Case Study of A100 GPUs." In 2025 55th Annual IEEE/IFIP International Conference on Dependable Systems and Networks Workshops (DSN-W), pp. 1-6. IEEE, 2025.
Sowa, Jakub, Bach Hoang, Advaith Yeluru, Steven Qie, Anita Nikolich, Ravishankar Iyer, and Phuong Cao | Post-quantum cryptography (PQC) network instrument: Measuring PQC adoption rates and identifying migration pathways | 2024 | Sowa, Jakub, Bach Hoang, Advaith Yeluru, Steven Qie, Anita Nikolich, Ravishankar Iyer, and Phuong Cao. "Post-quantum cryptography (PQC) network instrument: Measuring PQC adoption rates and identifying migration pathways." In 2024 IEEE International Conference on Quantum Computing and Engineering (QCE), vol. 1, pp. 1835-1846. IEEE, 2024.
Yang, Limin, Zhi Chen, Chenkai Wang, Zhenning Zhang, Sushruth Booma, Phuong Cao, Constantin Adam et al. | True attacks, attack attempts, or benign triggers? An empirical measurement of network alerts in a security operations center | 2024 | Yang, Limin, Zhi Chen, Chenkai Wang, Zhenning Zhang, Sushruth Booma, Phuong Cao, Constantin Adam et al. "True attacks, attack attempts, or benign triggers? An empirical measurement of network alerts in a security operations center." In 33rd USENIX Security Symposium (USENIX Security 24), pp. 1525-1542. 2024.
Tay, Vanessa, Xinran Li, Daisuke Mashima, Bennet Ng, Phuong Cao, Zbigniew Kalbarczyk, and Ravishankar K. Iyer | Taxonomy of fingerprinting techniques for evaluation of smart grid honeypot realism | 2023 | Tay, Vanessa, Xinran Li, Daisuke Mashima, Bennet Ng, Phuong Cao, Zbigniew Kalbarczyk, and Ravishankar K. Iyer. "Taxonomy of fingerprinting techniques for evaluation of smart grid honeypot realism." In 2023 IEEE International Conference on Communications, Control, and Computing Technologies for Smart Grids (SmartGridComm), pp. 1-7. IEEE, 2023.
Chung, Keywhan, Phuong Cao, Zbigniew T. Kalbarczyk, and Ravishankar K. Iyer | StealthML: data-driven malware for stealthy data exfiltration | 2023 | Chung, Keywhan, Phuong Cao, Zbigniew T. Kalbarczyk, and Ravishankar K. Iyer. "StealthML: data-driven malware for stealthy data exfiltration." In 2023 IEEE International Conference on Cyber Security and Resilience (CSR), pp. 16-21. IEEE, 2023.
Basney, Jim, Phuong Cao, and Terry Fleury | Investigating root causes of authentication failures using a SAML and OIDC observatory | 2020 | Basney, Jim, Phuong Cao, and Terry Fleury. "Investigating root causes of authentication failures using a SAML and OIDC observatory." In 2020 IEEE 6th International Conference on Dependability in Sensor, Cloud and Big Data Systems and Application (DependSys), pp. 119-126. IEEE, 2020.
Cao, Phuong M., Yuming Wu, Subho S. Banerjee, Justin Azoff, Alex Withers, Zbigniew T. Kalbarczyk, and Ravishankar K. Iyer | CAUDIT: Continuous auditing of SSH servers to mitigate Brute-Force attacks | 2019 | Cao, Phuong M., Yuming Wu, Subho S. Banerjee, Justin Azoff, Alex Withers, Zbigniew T. Kalbarczyk, and Ravishankar K. Iyer. "CAUDIT: Continuous auditing of SSH servers to mitigate Brute-Force attacks." In 16th USENIX Symposium on Networked Systems Design and Implementation (NSDI 19), pp. 667-682. 2019.

Data and Resources

We are committed to open science and will make our data and resources publicly available where appropriate and feasible. The security data lake curated by this project will provide a real-time, anonymized stream of attack attempts to vetted research teams.
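As an illustration of what an anonymized stream could involve, the following minimal sketch pseudonymizes the address fields of a single Zeek connection record with a keyed hash. It is not the project's actual pipeline: the anonymization method, the key handling, the `anon-` prefix, and the helper names `pseudonymize` and `anonymize_record` are all assumptions made for this example. Only the field names (`id.orig_h`, `id.resp_h`, and so on) follow Zeek's standard `conn.log` schema, and the sample record is made up using documentation-reserved IP ranges.

```python
import hashlib
import hmac
import json

# Hypothetical secret key for this sketch; a real deployment would load it
# from a secure store and rotate it according to its own policy.
SECRET_KEY = b"example-key-not-for-production"

# Zeek conn.log fields that carry IP addresses.
ADDRESS_FIELDS = ("id.orig_h", "id.resp_h")


def pseudonymize(value: str, key: bytes = SECRET_KEY) -> str:
    """Map a value to a stable keyed pseudonym (truncated HMAC-SHA256)."""
    digest = hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()
    return "anon-" + digest[:16]


def anonymize_record(record: dict, key: bytes = SECRET_KEY) -> dict:
    """Return a copy of a Zeek record with its address fields pseudonymized."""
    cleaned = dict(record)
    for field in ADDRESS_FIELDS:
        if field in cleaned:
            cleaned[field] = pseudonymize(str(cleaned[field]), key)
    return cleaned


# One made-up conn.log entry in Zeek's JSON log format.
sample_line = json.dumps({
    "ts": 1700000000.0,
    "uid": "CExample1",
    "id.orig_h": "192.0.2.10",
    "id.orig_p": 51234,
    "id.resp_h": "198.51.100.7",
    "id.resp_p": 22,
    "proto": "tcp",
    "service": "ssh",
})

anonymized = anonymize_record(json.loads(sample_line))
print(json.dumps(anonymized, indent=2))
```

Because the hash is keyed, the same address always maps to the same pseudonym under one key, so a detection model can still correlate repeated activity from one source without seeing the real address. Non-address fields such as ports and protocol pass through unchanged in this sketch; whether they would need further treatment depends on the data-sharing policy.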

An API will be developed to contribute attack metadata to policymakers like NIST.

An interest form is available at https://go.illinois.edu/aicyberlake-interest-form. Please fill out the form or contact us if you have questions about accessing or using our resources. Instructions for accessing the data lake will follow soon.

Computing Resources

While AICyberLake provides security data, accelerated computing resources (GPUs) can be requested through other programs such as the following:
Program/Resource | Description | Link
NCSA Jupyter | Provides no-cost NVIDIA A100 GPUs for Illinois researchers | https://jupyter.ncsa.illinois.edu
FABRIC testbed | Infrastructure to explore impactful new ideas that are impossible or impractical with the current Internet | https://portal.fabric-testbed.net/
Chameleon Cloud | Accelerated computing resources | https://www.chameleoncloud.org/
National Research Platform | Accelerated computing resources | https://nrp.ai/
NAIRR Pilot | Accelerated computing resources | https://nairrpilot.org/
NSF ACCESS | Accelerated computing resources | https://access-ci.org/
DOE INCITE | Accelerated computing resources | https://doeleadershipcomputing.org/
NERSC | Accelerated computing resources | https://www.nersc.gov/

News and Events

Stay updated on the latest news and events related to the AICyberLake project:

Contact Us

Mailing list

https://lists.illinois.edu/lists/info/aicyberlake

Interest form

https://go.illinois.edu/aicyberlake-interest-form

For general inquiries about the project, please contact:

Principal Investigator: Phuong Cao - pcao3@illinois.edu

Recipient Sponsored Research Office:
University of Illinois at Urbana-Champaign
506 S WRIGHT ST
URBANA, IL US 61801-3620
Phone: (217) 333-2187

Partners

We are partnering with the FABRIC testbed, SDSC, and NIST.

This material is based upon work supported by the National Science Foundation under Grant No. 2530738.

Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.

Additional support for this project is provided by the National Center for Supercomputing Applications (NCSA) at the University of Illinois Urbana-Champaign.