Safeguarding U.S. supercomputing infrastructure, data, and AI workloads.
Illustration derived from the Art of HPC exhibit at ACM/IEEE Supercomputing '24.
The AICyberLake project curates a security data lake by sourcing cyberattacks from the DeltaAI system at NCSA and its peer supercomputing centers. The data includes Zeek network cryptographic metadata, graphics processing unit (GPU) interconnect vulnerabilities, and ground truth incident reports. The resulting data lake provides a real-time, anonymized stream of attack attempts to vetted research teams for evaluating their agentic AI-based detection models against unseen adversaries.
An interest form is available at https://go.illinois.edu/aicyberlake-interest-form. Please fill out the form or contact us if you have questions about accessing or using our resources. Instructions for accessing the data lake will follow soon.
Welcome to the official website for the "Live Evaluations of Real-World Security Data Lake from National Cyberinfrastructure" project, also known as AICyberLake. This project is funded by the National Science Foundation (NSF)'s Cybersecurity Innovation for Cyberinfrastructure (CICI) program and supported by the National Center for Supercomputing Applications (NCSA) at the University of Illinois Urbana-Champaign (UIUC).
Artificial Intelligence (AI)-driven cyberattack detection is essential for safeguarding the U.S. supercomputing infrastructure. AI research relies on this national infrastructure, yet this critical resource is vulnerable to cyberattacks. Securing it requires an extensive understanding of historical security incidents, which provides a longitudinal perspective on trends, seasonality, and the evolution of cyberattacks. Without this historical context and insight into emerging AI workloads, the research community is left to react to, rather than preempt, future threats such as AI-driven malware, quantum-resistant vulnerabilities, and machine learning model supply chain backdoor attacks, leaving scientific breakthroughs vulnerable.
The AICyberLake project aims to:
The AICyberLake team will work with research teams to analyze attacks targeting U.S. supercomputing infrastructure. The team will also provide an Application Programming Interface (API) that informs the broader community by contributing attack metadata to policymakers such as the National Institute of Standards and Technology (NIST).
Role: Principal Investigator (PI)
Affiliation: National Center for Supercomputing Applications, University of Illinois at Urbana-Champaign
Email: pcao3@illinois.edu
Role: Co-Principal Investigator (Co-PI)
Affiliation: Coordinated Science Laboratory, University of Illinois at Urbana-Champaign
A list of publications resulting from this project will be posted here as they become available. Please check back for updates.
A sample of related publications that use resiliency data (broadly, security and reliability data) from NCSA and its partners is listed below.
Authors | Title | Year | Full Citation |
---|---|---|---|
Cui, Shengkun, Archit Patke, Ziheng Chen, Aditya Ranjan, Hung Nguyen, Phuong Cao, Brett Bode et al. | Characterizing Modern GPU Resilience and Impact in HPC Systems: A Case Study of A100 GPUs. | 2025 | Cui, Shengkun, Archit Patke, Ziheng Chen, Aditya Ranjan, Hung Nguyen, Phuong Cao, Brett Bode et al. "Characterizing Modern GPU Resilience and Impact in HPC Systems: A Case Study of A100 GPUs." In 2025 55th Annual IEEE/IFIP International Conference on Dependable Systems and Networks Workshops (DSN-W), pp. 1-6. IEEE, 2025. |
Sowa, Jakub, Bach Hoang, Advaith Yeluru, Steven Qie, Anita Nikolich, Ravishankar Iyer, and Phuong Cao. | Post-quantum cryptography (PQC) network instrument: Measuring PQC adoption rates and identifying migration pathways. | 2024 | Sowa, Jakub, Bach Hoang, Advaith Yeluru, Steven Qie, Anita Nikolich, Ravishankar Iyer, and Phuong Cao. "Post-quantum cryptography (PQC) network instrument: Measuring PQC adoption rates and identifying migration pathways." In 2024 IEEE International Conference on Quantum Computing and Engineering (QCE), vol. 1, pp. 1835-1846. IEEE, 2024. |
Yang, Limin, Zhi Chen, Chenkai Wang, Zhenning Zhang, Sushruth Booma, Phuong Cao, Constantin Adam et al. | True attacks, attack attempts, or benign triggers? An empirical measurement of network alerts in a security operations center. | 2024 | Yang, Limin, Zhi Chen, Chenkai Wang, Zhenning Zhang, Sushruth Booma, Phuong Cao, Constantin Adam et al. "True attacks, attack attempts, or benign triggers? An empirical measurement of network alerts in a security operations center." In 33rd USENIX Security Symposium (USENIX Security 24), pp. 1525-1542. 2024. |
Tay, Vanessa, Xinran Li, Daisuke Mashima, Bennet Ng, Phuong Cao, Zbigniew Kalbarczyk, and Ravishankar K. Iyer. | Taxonomy of fingerprinting techniques for evaluation of smart grid honeypot realism. | 2023 | Tay, Vanessa, Xinran Li, Daisuke Mashima, Bennet Ng, Phuong Cao, Zbigniew Kalbarczyk, and Ravishankar K. Iyer. "Taxonomy of fingerprinting techniques for evaluation of smart grid honeypot realism." In 2023 IEEE International Conference on Communications, Control, and Computing Technologies for Smart Grids (SmartGridComm), pp. 1-7. IEEE, 2023. |
Chung, Keywhan, Phuong Cao, Zbigniew T. Kalbarczyk, and Ravishankar K. Iyer. | StealthML: data-driven malware for stealthy data exfiltration. | 2023 | Chung, Keywhan, Phuong Cao, Zbigniew T. Kalbarczyk, and Ravishankar K. Iyer. "StealthML: data-driven malware for stealthy data exfiltration." In 2023 IEEE International Conference on Cyber Security and Resilience (CSR), pp. 16-21. IEEE, 2023. |
Basney, Jim, Phuong Cao, and Terry Fleury. | Investigating root causes of authentication failures using a SAML and OIDC observatory. | 2020 | Basney, Jim, Phuong Cao, and Terry Fleury. "Investigating root causes of authentication failures using a SAML and OIDC observatory." In 2020 IEEE 6th International Conference on Dependability in Sensor, Cloud and Big Data Systems and Application (DependSys), pp. 119-126. IEEE, 2020. |
Cao, Phuong M., Yuming Wu, Subho S. Banerjee, Justin Azoff, Alex Withers, Zbigniew T. Kalbarczyk, and Ravishankar K. Iyer. | CAUDIT: Continuous auditing of SSH servers to mitigate Brute-Force attacks. | 2019 | Cao, Phuong M., Yuming Wu, Subho S. Banerjee, Justin Azoff, Alex Withers, Zbigniew T. Kalbarczyk, and Ravishankar K. Iyer. "CAUDIT: Continuous auditing of SSH servers to mitigate Brute-Force attacks." In 16th USENIX Symposium on Networked Systems Design and Implementation (NSDI 19), pp. 667-682. 2019. |
We are committed to open science and will make our data and resources publicly available where appropriate and feasible. The security data lake curated by this project will provide a real-time, anonymized stream of attack attempts to vetted research teams.
An API will be developed to contribute attack metadata to policymakers like NIST.
Program/Resource | Description | Link |
---|---|---|
NCSA Jupyter | Provides no-cost NVIDIA A100 GPUs for Illinois researchers | https://jupyter.ncsa.illinois.edu |
FABRIC testbed | Infrastructure to explore impactful new ideas that are impossible or impractical with the current Internet | https://portal.fabric-testbed.net/ |
Chameleon Cloud | Accelerated computing resources | https://www.chameleoncloud.org/ |
National Research Platform | Accelerated computing resources | https://nrp.ai/ |
NAIRR Pilot | Accelerated computing resources | https://nairrpilot.org/ |
NSF ACCESS | Accelerated computing resources | https://access-ci.org/ |
DOE INCITE | Accelerated computing resources | https://doeleadershipcomputing.org/ |
NERSC | Accelerated computing resources | https://www.nersc.gov/ |
Stay updated on the latest news and events related to the AICyberLake project:
For general inquiries about the project, please contact:
Principal Investigator: Phuong Cao - pcao3@illinois.edu
Recipient Sponsored Research Office:
University of Illinois at Urbana-Champaign
506 S WRIGHT ST
URBANA, IL US 61801-3620
Phone: (217) 333-2187
We are partnering with the FABRIC testbed, SDSC, and NIST.
This material is based upon work supported by the National Science Foundation under Grant No. 2530738.
Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.
Additional support for this project is provided by the National Center for Supercomputing Applications (NCSA) at the University of Illinois Urbana-Champaign.