The rapid evolution of modern network protocols, such as QUIC and Multipath TCP (MPTCP), has introduced new complexities and challenges in network traffic analysis, anomaly detection, and cybersecurity. High-quality, diverse, and realistic network traffic is needed in particular for training and evaluating machine learning-based network monitoring systems. However, collecting and sharing real network traffic data is often constrained by privacy concerns, security risks, and the sensitivity of proprietary information. This thesis will explore a novel approach to synthetic network traffic generation using generative models such as Generative Adversarial Networks (GANs) [1] or Diffusion Models [2], focusing on the accurate modeling of modern and niche network behaviors, including QUIC client-side and server-side connection migrations [3] and MPTCP traffic fingerprints [4].
This thesis will involve the design, implementation, and evaluation of a synthetic network traffic generation framework. The student will be expected to (i) conduct a comprehensive literature review on synthetic network traffic generation and modern network protocols, (ii) design a model capable of learning and reproducing advanced network protocol behaviors, (iii) develop and train the model on modern network traffic datasets, and (iv) evaluate the quality of the generated synthetic traffic in comparison to real-world data, using established metrics.
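As an illustration of step (iv), the following minimal sketch compares the packet-size distribution of a real trace with that of a synthetic trace using the Jensen-Shannon divergence, one commonly used distributional metric. It is not the evaluation method prescribed for this thesis. The helper names, the bin width, and the sample packet sizes are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def histogram(values, bin_width, max_value):
    """Return a normalized histogram over fixed-width bins covering [0, max_value]."""
    n_bins = max_value // bin_width + 1
    # Values above max_value are clamped into the last bin.
    counts = Counter(min(v // bin_width, n_bins - 1) for v in values)
    total = len(values)
    return [counts.get(i, 0) / total for i in range(n_bins)]


def js_divergence(p, q):
    """Jensen-Shannon divergence (log base 2), bounded between 0 and 1."""
    m = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(x, y):
        # Terms with zero probability in x contribute nothing.
        return sum(a * math.log2(a / b) for a, b in zip(x, y) if a > 0)

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


# Hypothetical packet sizes in bytes (illustrative values, not measured data).
real_sizes = [60, 60, 1200, 1350, 1350, 1500]
synthetic_sizes = [60, 90, 1200, 1350, 1400, 1500]

p = histogram(real_sizes, bin_width=100, max_value=1500)
q = histogram(synthetic_sizes, bin_width=100, max_value=1500)
score = js_divergence(p, q)
print(f"JS divergence between real and synthetic packet sizes: {score:.3f}")
```

A score of 0 means the two binned distributions are identical, and 1 means they do not overlap at all. A full evaluation would cover further fields (for example inter-arrival times or flow durations) and protocol-specific behavior, which a single marginal distribution does not capture.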
The student should have experience with machine learning, network protocols, Python, and deep learning frameworks such as PyTorch or TensorFlow.
References
[1] Y. Yin, Z. Lin, M. Jin, G. Fanti, and V. Sekar, "Practical GAN-Based Synthetic IP Header Trace Generation Using NetShare," in Proc. ACM SIGCOMM Conf., Amsterdam, Netherlands, 2022, pp. 458–472.
[2] A. Kotelnikov, D. Baranchuk, I. Rubachev, and A. Babenko, "TabDDPM: Modelling Tabular Data with Diffusion Models," in Proc. Int. Conf. Machine Learning (ICML), PMLR, Jul. 2023, pp. 17564–17579.
[3] J. Iyengar and M. Thomson, "QUIC: A UDP-Based Multiplexed and Secure Transport," RFC 9000, May 2021. [Online]. Available: https://datatracker.ietf.org/doc/rfc9000/
[4] A. Ford, C. Raiciu, M. J. Handley, O. Bonaventure, and C. Paasch, "TCP Extensions for Multipath Operation with Multiple Addresses," RFC 8684, Mar. 2020. [Online]. Available: https://datatracker.ietf.org/doc/html/rfc8684
[5] M. Wolf, J. Tritscher, D. Landes, A. Hotho, and D. Schlör, "Benchmarking of Synthetic Network Data: Reviewing Challenges and Approaches," Computers & Security, 2024, Art. no. 103993.
Supervisors: Weijie Niu, Thomas Grübl