The work involved setting up and configuring Security Onion as a Network Security Monitoring (NSM) solution, developing a machine learning model for alert classification, integrating that model with Security Onion to filter out false positives and reduce alert fatigue, and creating a graphical user interface (GUI) to streamline use of the solution.
This project was conducted as part of an internship at the National Agency of Cybersecurity (NACS), or Agence Nationale de la Cybersécurité (ANCS) in French, a Tunisian agency specializing in safeguarding digital infrastructures. The primary objective of this project is to enhance the efficiency of Intrusion Detection Systems (IDS) by reducing the number of false positives generated, allowing security analysts to focus on real threats.
The repository is organized as follows:
- App/: Contains the main application code, including the scripts for network monitoring and machine learning model integration with Security Onion.
- IPython Notebook/: Includes Jupyter notebooks for machine learning model development and alert prioritization.
- report_images/: Stores images used for documentation purposes.
- README.md: Documentation of the project and instructions for setup.
The virtualized architecture was established using Oracle VirtualBox. Three machines were set up:
- Attacking Machine (Kali Linux): This machine was used to simulate attacks.
- Victim Machine (Windows 10): This machine contained vulnerabilities to simulate attack scenarios.
- Security Onion Machine (Ubuntu Server 20.04): Hosted Security Onion for monitoring and intrusion detection.
- Network Configuration: All three machines were connected to the same NAT Network, establishing a controlled testing environment.
This phase of the project centers on developing a machine learning classification model, trained on the UNSW-NB15 dataset, that classifies network traffic as malicious or benign so that alerts can be separated into genuine threats and false positives.
The UNSW-NB15 Dataset is a publicly available dataset widely used in cybersecurity to develop and test intrusion detection systems (IDS) and intrusion prevention systems (IPS). It was developed by the Australian Centre for Cyber Security (ACCS) at the University of New South Wales in Australia.
The figure below summarizes the steps of the ML model development:
The implementation for this step is documented in a Jupyter notebook, offering a step-by-step explanation of the process. Please refer to Classification model.ipynb for access to this detailed guide.
The alert prioritization process involved four essential steps arranged in a pipeline:
- Elasticsearch Data Extraction
- PCAP Files Fetching
- Feature Extraction
- Prediction
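The four steps above form a linear pipeline in which each stage consumes the output of the previous one. The sketch below illustrates that shape only; the stage functions, the `alert_id` key, and the `pcaps/` path are hypothetical placeholders, not the code found in `App/`.

```python
from typing import Callable, Dict, List

Alert = Dict[str, object]
Stage = Callable[[List[Alert]], List[Alert]]


def run_pipeline(alerts: List[Alert], stages: List[Stage]) -> List[Alert]:
    """Feed the output of each stage into the next one, in order."""
    for stage in stages:
        alerts = stage(alerts)
    return alerts


# Hypothetical stand-ins for the real stages; each one only annotates the alert.
def fetch_pcaps(alerts: List[Alert]) -> List[Alert]:
    return [{**a, "pcap_path": f"pcaps/{a['alert_id']}.pcap"} for a in alerts]


def extract_features(alerts: List[Alert]) -> List[Alert]:
    return [{**a, "features": {}} for a in alerts]


def predict(alerts: List[Alert]) -> List[Alert]:
    return [{**a, "label": "unclassified"} for a in alerts]


if __name__ == "__main__":
    # The alerts would normally come from the Elasticsearch extraction step.
    extracted = [{"alert_id": "a1"}]
    print(run_pipeline(extracted, [fetch_pcaps, extract_features, predict]))
```

Keeping every stage behind the same signature makes it possible to run a single step in isolation or to chain all of them.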
The first step involves extracting Suricata alerts from Elasticsearch, identifying the associated connection information (flow info) for each alert, and saving the results in a CSV file.
The implementation for this step is documented in a Jupyter notebook, offering a step-by-step explanation of the process. Please refer to Classification model.ipynb for access to this detailed guide.
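As a rough illustration of the extraction step, the sketch below builds an Elasticsearch query body for alerts in a time window and flattens the returned hits into CSV rows. The field names (`event.dataset`, `source.ip`, `rule.name`, and so on) and the value `alert` are assumptions based on ECS-style naming and may differ from the fields in a given Security Onion deployment. The query body would be passed to whatever Elasticsearch client is in use; no client call is shown here.

```python
import csv
import io
from typing import Dict, Iterable, TextIO

# CSV column -> dotted path inside the hit's _source (assumed ECS-style names).
COLUMNS = {
    "timestamp": "@timestamp",
    "src_ip": "source.ip",
    "src_port": "source.port",
    "dst_ip": "destination.ip",
    "dst_port": "destination.port",
    "proto": "network.transport",
    "signature": "rule.name",
}


def build_alert_query(start: str, end: str, size: int = 500) -> Dict:
    """Query DSL body selecting alert documents within [start, end]."""
    return {
        "size": size,
        "query": {
            "bool": {
                "filter": [
                    {"term": {"event.dataset": "alert"}},  # assumed field and value
                    {"range": {"@timestamp": {"gte": start, "lte": end}}},
                ]
            }
        },
    }


def get_path(doc: Dict, dotted: str, default: object = "") -> object:
    """Walk a nested dict along a dotted path, returning default if absent."""
    current: object = doc
    for key in dotted.split("."):
        if not isinstance(current, dict) or key not in current:
            return default
        current = current[key]
    return current


def hits_to_csv(hits: Iterable[Dict], out: TextIO) -> None:
    """Write one CSV row of flow information per Elasticsearch hit."""
    writer = csv.DictWriter(out, fieldnames=list(COLUMNS))
    writer.writeheader()
    for hit in hits:
        source = hit.get("_source", {})
        writer.writerow({col: get_path(source, path) for col, path in COLUMNS.items()})


if __name__ == "__main__":
    sample_hit = {"_source": {"@timestamp": "2023-07-01T10:00:00Z",
                              "source": {"ip": "10.0.2.4", "port": 4444},
                              "destination": {"ip": "10.0.2.15", "port": 445},
                              "network": {"transport": "tcp"},
                              "rule": {"name": "ET TEST"}}}
    buffer = io.StringIO()
    hits_to_csv([sample_hit], buffer)
    print(buffer.getvalue())
```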
This step focuses on fetching Packet Capture (PCAP) files essential for acquiring the complete network flow associated with each alert. The process involves SSH connectivity with the Security Onion machine.
The implementation for this step is documented in a Jupyter notebook, providing a comprehensive, step-by-step explanation of the process. You can access the detailed guide in Remote PCAP Request and Retrieval (so-standalone).ipynb for the standalone node and in Remote PCAP Retrieval and Filtering (so-import).ipynb for the import node, each tailored to their respective implementations.
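One plausible way to isolate the flow behind an alert is to run `tcpdump` over SSH on the Security Onion machine with a BPF filter built from the alert's connection information. The sketch below only constructs the command lines; it does not open a connection. The user name, host, and file paths are hypothetical, and the notebooks may retrieve PCAPs through a different mechanism.

```python
import shlex
from typing import List


def flow_bpf_filter(src_ip: str, src_port: int, dst_ip: str, dst_port: int,
                    proto: str = "tcp") -> str:
    """BPF expression matching both directions of a single flow."""
    return (f"{proto} and host {src_ip} and host {dst_ip} "
            f"and port {src_port} and port {dst_port}")


def tcpdump_command(pcap_in: str, pcap_out: str, bpf: str) -> List[str]:
    """Read an existing capture (-r) and write only matching packets (-w)."""
    return ["tcpdump", "-r", pcap_in, "-w", pcap_out, bpf]


def ssh_command(user: str, host: str, remote_argv: List[str]) -> List[str]:
    """Wrap a remote command for execution over SSH, quoting each argument."""
    return ["ssh", f"{user}@{host}", shlex.join(remote_argv)]


if __name__ == "__main__":
    bpf = flow_bpf_filter("10.0.2.4", 4444, "10.0.2.15", 445)
    # Hypothetical paths and credentials, for illustration only.
    remote = tcpdump_command("/tmp/in.pcap", "/tmp/alert_flow.pcap", bpf)
    print(ssh_command("analyst", "10.0.2.20", remote))
```

The resulting list can be handed to `subprocess.run`, after which the filtered file can be copied back with `scp` or an SFTP client.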
In this step, the features expected by the model are computed for each alert, following the feature definitions of the UNSW-NB15 dataset. They are extracted from each alert's PCAP file and saved in a CSV file.
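To make the idea concrete, the sketch below computes a handful of UNSW-NB15-style flow features (`dur`, `spkts`, `dpkts`, `sbytes`, `dbytes`, `sttl`, `dttl`) from packets that have already been parsed out of a PCAP file. The `Packet` record and the `flow_features` helper are hypothetical, PCAP parsing itself is left out, and the values only approximate the dataset's definitions, since the original features were generated with dedicated tools.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Packet:
    ts: float      # capture timestamp in seconds
    src_ip: str    # sender of this packet
    length: int    # packet length in bytes
    ttl: int       # IP time-to-live


def flow_features(packets: List[Packet], client_ip: str) -> Dict[str, float]:
    """Aggregate per-packet records into per-flow features.

    Packets sent by client_ip count as source-to-destination;
    everything else counts as destination-to-source.
    """
    if not packets:
        raise ValueError("flow has no packets")
    forward = [p for p in packets if p.src_ip == client_ip]
    reverse = [p for p in packets if p.src_ip != client_ip]
    times = [p.ts for p in packets]
    return {
        "dur": max(times) - min(times),
        "spkts": len(forward),
        "dpkts": len(reverse),
        "sbytes": sum(p.length for p in forward),
        "dbytes": sum(p.length for p in reverse),
        "sttl": forward[-1].ttl if forward else 0,
        "dttl": reverse[-1].ttl if reverse else 0,
    }


if __name__ == "__main__":
    flow = [
        Packet(0.0, "10.0.2.4", 60, 64),
        Packet(0.5, "10.0.2.15", 60, 128),
        Packet(1.5, "10.0.2.4", 100, 64),
    ]
    print(flow_features(flow, client_ip="10.0.2.4"))
```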
The final step involves predicting whether the alerts correspond to true attacks or false alarms, using the previously trained classification model.
The implementation for this step is documented in a Jupyter notebook, offering a step-by-step explanation of the process. Please refer to Predictive Analysis.ipynb for access to this detailed guide.
A graphical interface was developed using CustomTkinter to facilitate interaction with the system.
The diagram below offers a visual representation of the various interaction scenarios, providing a comprehensive understanding of the functionalities and use cases.
- Elasticsearch Alerts Interface: Allows analysts to retrieve Suricata alerts and related flow information.
- Security Onion Machine Interface: Enables packet capture retrieval and feature extraction.
- Prediction Panel: Predicts whether an alert is a genuine threat or a false positive.
- Automation Panel: Automates the alert retrieval and prediction process.
This project successfully addressed the challenge of reducing false positives in intrusion detection systems. By integrating machine learning models into the IDS workflow, the project reduced unnecessary alerts and improved the overall efficiency of cybersecurity operations. This work represents a significant step toward enhancing the resilience of digital defenses in a world where cybersecurity threats continue to evolve.