Machine Learning in Security Vulnerability Research

Selján, Gábor (2025) Machine Learning in Security Vulnerability Research. PhD thesis, Budapesti Corvinus Egyetem, Közgazdasági és Gazdaságinformatikai Doktori Iskola. DOI https://doi.org/10.14267/phd.2025042

PDF (dissertation), 2MB
PDF (draft in English), 223kB

Abstract

This dissertation explores the application of machine learning (ML) in cybersecurity, with a specific focus on offensive security. To examine existing work on the roles of artificial intelligence (AI) and ML in vulnerability discovery, I conducted a targeted review of the literature, which indicated that fuzz testing, a method in which an automated tool evaluates software for security flaws using random input data, is a promising area for further exploration. To improve the effectiveness of fuzzers, I propose an ML-based seed selection method built on the idea that known bugs provide valuable insights for guiding fuzzers toward the security-sensitive parts of the target program. To examine this idea, I constructed a neural network for binary classification that uses knowledge derived from past fuzzing results to improve seed selection for subsequent test campaigns. I conducted a case study to investigate the practical application of this approach in real-world scenarios. The Windows graphics component represents a valuable attack surface because its core functionality is deeply integrated into the operating system, so any flaw in it could have significant security implications. Metafiles serve as an ideal input format for fuzz testing because their compact size enables efficient mutation and rapid execution of test cases. I evaluated the proposed approach by running experiments on a legacy version of the Windows operating system using various seed selection strategies. Comparative analysis indicated that seeds selected according to the model's predictions effectively directed the fuzzer toward code paths associated with known security issues. The main logical units of the dissertation are organized into six chapters that collectively establish and validate a comprehensive research framework.
Chapter 1 establishes the context by introducing the key research questions and describing the methodology used to conduct the literature review and to develop multiple software artifacts that provide practical solutions to real-world problems. Chapter 2 presents an overview of the various applications of AI in cybersecurity, traditional fuzz testing methods, and relevant ML concepts. Chapters 3 and 4 transition from theory to practice by exploring related work, proposing a conceptual framework for an alternative seed selection method, and detailing the implementation of an experimental prototype. Finally, Chapters 5 and 6 provide an empirical assessment of the experiments and conclude by summarizing the research findings, discussing the vulnerabilities discovered, and offering a brief outlook on future research opportunities, while reflecting on the limitations and implications of the study.
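
As a rough illustration of the seed selection idea summarized in the abstract, the sketch below trains a tiny binary classifier on past fuzzing outcomes and ranks candidate seeds by predicted score. It is not the dissertation's implementation. The two-element feature vectors, the labels, the file names, and the function names `train`, `score` and `select_seeds` are all hypothetical, and a single logistic unit stands in for the neural network, whose architecture the abstract does not describe.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train(features, labels, epochs=500, lr=0.5, seed=0):
    """Fit one logistic unit with stochastic gradient descent on log loss."""
    rng = random.Random(seed)
    w = [rng.uniform(-0.01, 0.01) for _ in features[0]]
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(features, labels):
            p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
            err = p - y  # gradient of the log loss w.r.t. the pre-activation
            w = [wi - lr * err * xi for wi, xi in zip(w, x)]
            b -= lr * err
    return w, b


def score(model, x):
    """Predicted probability that a seed leads toward bug-related code."""
    w, b = model
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)


def select_seeds(model, candidates, k):
    """Return the names of the k candidates with the highest predicted score."""
    ranked = sorted(candidates, key=lambda item: score(model, item[1]), reverse=True)
    return [name for name, _ in ranked[:k]]


# Hypothetical history: one feature vector per previously fuzzed seed, labelled 1
# if the campaign started from that seed reached code tied to a known bug.
history_x = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.8]]
history_y = [1, 1, 0, 0]
model = train(history_x, history_y)

# Hypothetical candidate seeds for the next campaign.
candidates = [
    ("seed_a.emf", [0.95, 0.05]),
    ("seed_b.emf", [0.05, 0.90]),
    ("seed_c.emf", [0.70, 0.20]),
]
print(select_seeds(model, candidates, 2))
```

Ranking by predicted probability, rather than thresholding it, lets the size of the seed set be chosen independently of how well the classifier is calibrated.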

Item Type: Thesis (PhD thesis)
Supervisor: Racskó Péter
Subjects: Computer science
ID Code: 1441
Date: 13 October 2025
DOI: https://doi.org/10.14267/phd.2025042
Deposited On: 23 Jun 2025 10:41
Last Modified: 18 Dec 2025 11:19
