5G Core GTP-U Attack Dataset

Description

Introduction

The 5G Core GTP-U Attack Dataset (5G-CGAD) is a dataset developed to address the critical scarcity of publicly available data for security research in 5G networks. While 5G enables unprecedented capabilities such as ultra-reliable low latency communications (URLLC), enhanced mobile broadband (eMBB), and massive machine-type communications (mMTC), it also introduces new vulnerabilities due to its cloud-native, virtualized, and software-driven architecture.

This dataset focuses on the GPRS Tunneling Protocol for the User Plane (GTP-U), which operates on the N3 interface between the gNodeB and the User Plane Function (UPF). As a protocol designed under the assumption of a trusted backhaul, GTP-U lacks authentication and integrity protections, making it a high-value target for adversaries. Attacks exploiting GTP-U can result in denial-of-service, session hijacking, traffic redirection, and privacy violations.

The dataset provides both benign traffic (representing real-world applications such as streaming, browsing, and file downloads) and malicious traffic (five classes of simulated attacks). Data is available in multiple formats (PCAP and CSV with over 80 extracted features), ensuring adaptability for diverse research purposes.

Testbed Design

The dataset was generated using a realistic cloud-native 5G testbed:

Infrastructure: Kubernetes cluster with two Lenovo ThinkStation P3 servers (Ubuntu 22.04).
Core Network: Open-source Open5GS for AMF, SMF, UPF, and other core functions.
Radio Access Network (RAN): UERANSIM to emulate 30 UEs and 3 gNodeBs.
Traffic Capture: Sidecar container inside the UPF pod for real-time packet sniffing.
Monitoring: Prometheus and Grafana for system observability.
Analysis: Independent ML-based anomaly detection to validate realism.

This setup ensured scalability, reproducibility, and real-time traffic monitoring.

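To make the GTP-U discussion above more concrete, the following is a minimal sketch of reading the 8-byte mandatory GTP-U header (flags, message type, length, TEID) from a UDP payload. It is not part of the dataset tooling; the names GtpuHeader and parse_gtpu_header and the sample bytes are hypothetical, chosen only for illustration.

```python
import struct
from typing import NamedTuple

GTPU_PORT = 2152  # registered UDP port for GTP-U
G_PDU = 0xFF      # message type that carries an encapsulated user packet


class GtpuHeader(NamedTuple):
    version: int
    protocol_type: int
    has_optional_fields: bool  # True if any of the E, S or PN flags is set
    message_type: int
    length: int                # bytes following the 8-byte mandatory header
    teid: int                  # Tunnel Endpoint Identifier


def parse_gtpu_header(data: bytes) -> GtpuHeader:
    """Parse the mandatory part of a GTP-U header (hypothetical helper)."""
    if len(data) < 8:
        raise ValueError("GTP-U header needs at least 8 bytes")
    # Network byte order: 1-byte flags, 1-byte type, 2-byte length, 4-byte TEID.
    flags, message_type, length, teid = struct.unpack("!BBHI", data[:8])
    return GtpuHeader(
        version=flags >> 5,
        protocol_type=(flags >> 4) & 0x1,
        has_optional_fields=bool(flags & 0x07),
        message_type=message_type,
        length=length,
        teid=teid,
    )


# Hand-built sample for illustration only (not taken from the dataset):
# version 1, PT 1, no optional fields, G-PDU, 4 payload bytes, TEID 0xABCD.
sample = struct.pack("!BBHI", 0x30, G_PDU, 4, 0x0000ABCD) + b"\x45\x00\x00\x04"
header = parse_gtpu_header(sample)
print(header)
```

Nothing in these eight bytes authenticates the sender or protects the payload, which is why a party that can reach the N3 interface and supply a valid TEID can have its packets accepted as tunnel traffic.
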
Benign Traffic Generation

To approximate real-world 5G usage, three benign traffic categories were simulated:

Video Streaming: 20 UEs using iperf3, with randomized throughput (5–50 Mbps) and session durations.
Web Browsing: 10 UEs using curl to fetch content from 12 randomized websites.
File Downloading: Additional browsing users download files of varying sizes.

This diversity captures the variability of session lengths, packet sizes, protocols, and throughput, which is essential for distinguishing attacks from legitimate usage.

Attack Scenarios

The dataset includes five attack categories, each carefully simulated and labeled:

GTP Encapsulation Attack: Nested GTP-U packets injected to exploit UPF vulnerabilities.
Malformed GTP Attack: Variants with invalid headers, corrupted checksums, oversized fields, and unsupported message types.
DDoS Attack (ICMP/UDP Floods): Attacks from compromised UEs targeting the UPF.
Intra-UPF UE DoS Attack: Malicious UE floods another UE within the same UPF, using SYN floods, UDP floods, ICMP floods, HTTP floods, and fragmentation-based amplification.
GTP-U TEID Brute-Force Attack: Adversary guesses Tunnel Endpoint Identifiers (TEIDs) to discover active sessions or disrupt connectivity.

These attacks were repeated multiple times over several days to ensure diversity and statistical richness.

Data Processing Pipeline

Captured data underwent a structured pipeline:

Packet Capture (PCAP): Full traffic including GTP-U headers.
GTP-U Header Removal (via STRIPE tool): Preserving only relevant fields when appropriate.
Flow Generation (via CICFlowMeter): Conversion into flow-based CSV with 84 statistical features.
Labeling: Mapping to one of six classes: BENIGN, GTP-ENCAPSULATION, GTP-MALFORMED, DDOS, INTRA-UPF-DOS, GTP-BRUTEFORCE.
Feature Selection: Redundancy reduction using Pearson correlation and ANOVA.
Normalization: Features standardized (zero mean, unit variance).
Class Balancing (SMOTE): To mitigate skew (e.g., large number of Intra-UPF DoS flows vs. fewer TEID brute-force flows).
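
The last three pipeline steps can be illustrated with a small standard-library sketch. It is not the authors' code: the feature names, the 0.95 correlation threshold, and the function names are assumptions, the ANOVA step is omitted, and smote_like is a simplified stand-in for SMOTE (interpolation between a minority sample and one of its nearest neighbours) rather than a library implementation.

```python
import math
import random
import statistics


def pearson(x: list[float], y: list[float]) -> float:
    """Pearson correlation of two equal-length, non-constant columns."""
    mean_x, mean_y = statistics.fmean(x), statistics.fmean(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


def drop_correlated(columns: dict[str, list[float]], threshold: float = 0.95) -> list[str]:
    """Keep a feature only if |r| against every already-kept feature is below threshold."""
    kept: list[str] = []
    for name, values in columns.items():
        if all(abs(pearson(values, columns[k])) < threshold for k in kept):
            kept.append(name)
    return kept


def standardize(values: list[float]) -> list[float]:
    """Rescale one column to zero mean and unit (population) variance."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    if std == 0:
        return [0.0 for _ in values]  # constant column: nothing to scale
    return [(v - mean) / std for v in values]


def smote_like(minority: list[list[float]], n_new: int, k: int = 2, seed: int = 0) -> list[list[float]]:
    """Create n_new synthetic rows by interpolating towards one of the k nearest neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        neighbours = sorted(
            (row for row in minority if row is not base),
            key=lambda row: math.dist(base, row),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment from base to other
        synthetic.append([b + gap * (o - b) for b, o in zip(base, other)])
    return synthetic


# Hypothetical toy columns; fwd_bytes is exactly 2 x fwd_packets, so it is redundant.
columns = {
    "flow_duration": [1.0, 2.0, 3.0, 4.0, 5.0],
    "fwd_packets": [2.0, 1.0, 4.0, 3.0, 6.0],
    "fwd_bytes": [4.0, 2.0, 8.0, 6.0, 12.0],
}
kept = drop_correlated(columns)
scaled = standardize(columns["flow_duration"])
new_rows = smote_like([[0.0, 0.0], [1.0, 1.0], [2.0, 0.0]], n_new=4)
print(kept, scaled, new_rows)
```

In practice this kind of processing is usually done with data-frame and machine-learning libraries; the pure-Python version is used here only so the logic of each step is visible. Oversampling is normally applied to the training split only, so that synthetic rows do not leak into evaluation data.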

Publication year

2025

Type of data

Creators

Suranga Prasad Wengappuli Arachchige - Publisher, Creator

Tharaka Hewa - Contributor

Yushan Siriwardhana - Contributor

Projects

Other information

Fields of science

Electrical, automation and telecommunications engineering, electronics

Language

English

Open access

Open

License

Creative Commons Attribution 4.0 International (CC BY 4.0)

Keywords

5G Core Network, 5G Security, DDoS Attacks, Intrusion Detection Dataset, GTP Encapsulation Attack, GTP-U (GPRS Tunneling Protocol – User Plane), Machine Learning for Network Security, Malformed Packets, Network Intrusion Detection, TEID Brute Force

Subject headings

Temporal coverage


Related to this research data