By Peng Liu, Tao Liu, Nanqing Luo, Zitong Shang, Haizhou Wang, Zhilong Wang, Lan Zhang, and Qingtian Zou, from the Penn State Cybersecurity Lab.
@book{aisecurity,
  title = {AI for Cybersecurity: A Handbook of Use Cases},
  author = {Liu, Peng and Liu, Tao and Luo, Nanqing and Shang, Zitong and Wang, Haizhou and Wang, Zhilong and Zhang, Lan and Zou, Qingtian},
  year = {2022},
  url = {https://www.amazon.com/gp/product/B09T3123RB/},
  note = {Kindle edition},
  publisher = {Amazon}
}
Deep learning (DL) and reinforcement learning (RL) are increasingly applied to cybersecurity challenges. However, existing survey papers are of limited use to engineers, security analysts, and students taking an "AI for Cybersecurity" course. We therefore provide them with a handbook of use cases (e.g., using AI to conduct reverse engineering tasks, using AI to detect malware). In this book, we advocate "applying DL in a systematic way", observe that "ML pipeline standardization" is valuable in the cybersecurity industry, and believe that the industry could benefit greatly from it when applying DL and RL to solve various cybersecurity challenges. Engineers, security analysts, and students can use this handbook for a hands-on introduction to applying DL and RL to a particular cybersecurity problem. Using the code snippets and dataset links provided in the handbook, readers can practice "learning by doing" at the use case level.
- Introduction
  - Why a Handbook?
  - The Use Cases Intend to Solve Various Cybersecurity Challenges through A Unified DL Pipeline
  - How to Properly Use This Handbook?
  - Organization of Rest of The Book
- AI Conducts Two Reverse Engineering Tasks
  - The Security Problem
  - Related Work
  - DL Pipeline
  - Model Architecture
  - Model Training Issues
  - Model Performance
  - Deployed Model
  - Source Code and Dataset
  - Remaining Issues
- AI Detects Android Malware
  - The Security Problem
  - Android Malware Example
  - Machine Learning Pipeline for the Use Case
  - Feature Engineering
  - Training Data
  - Machine Learning
  - Model Deployment
  - System Evolution
  - Code, Data, and Other Issues
- AI Detects Abnormal Events in Sequential Data
  - The Security Problem
  - Dataset
  - Data Processing
  - Model Architecture
  - Hyperparameter Tuning
  - Model Deployment
  - Evaluation
  - Code, Data, and Other Issues
- AI Detects DNS Cache Poisoning Attack
  - The Security Problem
  - Raw Data Generation and Collection
  - Labeling DNS Sessions
  - Feature Extraction and Data Sample Representation
  - Data Set Construction
  - Model Architecture
  - Parameter Tuning
  - Evaluation results
  - Model Deployment
  - Remaining Issue
  - Code and Data Resources
- AI Detects PC Malware
  - The Security Problem
  - Raw Data
  - Data Processing
  - Model Training
  - Model Deployment
  - Remaining Issues
  - Code and Data Resources
- AI Detects Code Similarity
  - The Security Problem
  - Raw Data
  - Data Processing
  - Model
  - Code, Data and Other Issues
- AI Conducts Malware Clustering
  - The Security Problem
  - Machine Learning Pipeline
  - Example Data
  - Feature Extraction
  - Scalable Clustering
  - Clusters Deployment
- Concluding Remarks
If you find an error, please report it using GitHub Issues. We appreciate any issues you raise, and we will keep an updated PDF version here.
A PDF of the manuscript is posted by the Penn State Cybersecurity Lab. Users may download a copy (Download) for personal use.
All of the code related to the book is stored in AIforCybersecurity.