Knowledge Graphs for Software Security Assessments and Cyber Threat Intelligence

Construction, evaluation and reasoning using knowledge graphs for software security assessments.

Keywords: Software Security, Cyber Threat Intelligence, Artificial Intelligence, Natural Language Processing

Description: Software vulnerabilities are weaknesses in software systems that can trigger unintended actions. The exploitation of security vulnerabilities in software can affect large groups of people and lead to substantial financial damage. Several automated software vulnerability assessment techniques build on data sources that collect, rank, and abstract knowledge about concrete vulnerabilities found in existing systems. Many of these data sources store their data in formats that make it difficult for machines to interpret, combine, and reuse the knowledge automatically. As a result, to discover and reason about implicit connections among weaknesses, security experts have to manually extract the vulnerability knowledge, analyze the vulnerability descriptions, and link them to related issues.

Knowledge graphs are knowledge bases that can be programmatically constructed from heterogeneous data sources using systematic open information extraction methods. A security knowledge graph is a knowledge graph that combines information about known source code vulnerabilities with knowledge about common attack tactics, techniques, and procedures. To create such graphs automatically, we need to develop data-driven techniques that combine information from relevant sources, such as the National Vulnerability Database (NVD), Common Vulnerabilities and Exposures (CVE), Common Weakness Enumeration (CWE), Common Vulnerability Scoring System (CVSS), Open Web Application Security Project (OWASP), and MITRE ATT&CK. Various machine learning, natural language processing, software repository mining, logical inference, and correlation analysis techniques can be used to process and analyze the different entities in the vulnerability data. The information derived from these data sources can be added as new knowledge to the graph and linked to the vulnerability knowledge graph ontology. Automated reasoning over the security knowledge graph can help increase the level of automation in identifying and classifying actively exploited vulnerabilities, mitigating threats, and uncovering implicit relationships among previously unrelated pieces of knowledge.
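
As a rough illustration of what reasoning over such a graph could look like, the following minimal Python sketch stores facts as (subject, predicate, object) triples and infers which attack techniques may be relevant to a vulnerability by following weakness hierarchies. Everything in it is an assumption made for illustration: the TripleGraph class, the predicate names (has_weakness, child_of, exploits), and the placeholder identifiers are hypothetical and do not come from NVD, CWE, or MITRE ATT&CK, nor from any particular ontology.

```python
class TripleGraph:
    """A toy in-memory triple store (hypothetical, for illustration only)."""

    def __init__(self):
        self.triples = set()

    def add(self, subject, predicate, obj):
        self.triples.add((subject, predicate, obj))

    def objects(self, subject, predicate):
        """All o such that (subject, predicate, o) is in the graph."""
        return {o for (s, p, o) in self.triples if s == subject and p == predicate}

    def subjects(self, predicate, obj):
        """All s such that (s, predicate, obj) is in the graph."""
        return {s for (s, p, o) in self.triples if p == predicate and o == obj}

    def closure(self, start, predicate):
        """Nodes reachable from start via one or more predicate edges."""
        seen, stack = set(), [start]
        while stack:
            node = stack.pop()
            for nxt in self.objects(node, predicate):
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        return seen


def related_techniques(graph, vulnerability):
    """Infer techniques linked to a vulnerability via its weaknesses
    and all ancestors of those weaknesses (an implicit relationship)."""
    weaknesses = set()
    for weakness in graph.objects(vulnerability, "has_weakness"):
        weaknesses.add(weakness)
        weaknesses |= graph.closure(weakness, "child_of")
    techniques = set()
    for weakness in weaknesses:
        techniques |= graph.subjects("exploits", weakness)
    return techniques


def build_example_graph():
    """Placeholder facts; the identifiers are made up, not real entries."""
    g = TripleGraph()
    g.add("CVE-EXAMPLE-1", "has_weakness", "CWE-EXAMPLE-CHILD")
    g.add("CWE-EXAMPLE-CHILD", "child_of", "CWE-EXAMPLE-PARENT")
    g.add("TECHNIQUE-EXAMPLE-1", "exploits", "CWE-EXAMPLE-PARENT")
    g.add("TECHNIQUE-EXAMPLE-2", "exploits", "CWE-EXAMPLE-OTHER")
    return g


if __name__ == "__main__":
    print(related_techniques(build_example_graph(), "CVE-EXAMPLE-1"))
```

In this sketch, the link between the placeholder vulnerability and TECHNIQUE-EXAMPLE-1 is never stated directly; it only emerges by walking the child_of hierarchy, which is the kind of implicit connection the description refers to. A real project would more likely build on an established graph store and ontology rather than a hand-written set of tuples.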

Goal

The goal of this project is to investigate how to construct security knowledge graphs and evaluate their use for vulnerability assessment and threat mitigation by reasoning over the constructed graph. Moreover, it aims to evaluate techniques for enriching existing knowledge graphs with new knowledge using additional fact extraction and inference techniques.
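
One very simple form of enrichment by fact extraction can be sketched as follows: scan free-text descriptions for identifier mentions and add any that are not yet in the graph as new triples. This is only a sketch under stated assumptions. The "mentions" predicate, the function names, and the sample text are hypothetical, the identifiers in the sample are made up, and a regular expression stands in for the learned extraction models the project would actually investigate.

```python
import re

# CVE identifiers have the form CVE-<4-digit year>-<4 or more digits>;
# CWE identifiers have the form CWE-<number>.
CVE_PATTERN = re.compile(r"\bCVE-\d{4}-\d{4,}\b")
CWE_PATTERN = re.compile(r"\bCWE-\d+\b")


def extract_mentions(source_id, text):
    """Return (source_id, "mentions", identifier) triples found in text."""
    triples = set()
    for pattern in (CVE_PATTERN, CWE_PATTERN):
        for identifier in pattern.findall(text):
            if identifier != source_id:  # skip self-references
                triples.add((source_id, "mentions", identifier))
    return triples


def enrich(graph, source_id, text):
    """Add newly extracted triples to graph (a set); return only the new ones."""
    new_triples = extract_mentions(source_id, text) - graph
    graph.update(new_triples)
    return new_triples


if __name__ == "__main__":
    # Made-up identifiers and description, for illustration only.
    graph = {("CVE-1900-0001", "mentions", "CWE-99999")}
    description = (
        "Made-up note: CVE-1900-0001 resembles CVE-1900-0002 "
        "and is an instance of CWE-99999."
    )
    print(enrich(graph, "CVE-1900-0001", description))
```

Returning only the triples that were not already present makes it easy to measure how much new knowledge an extraction step contributes, which is one possible way to compare enrichment techniques.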

Learning outcome

Application of data science in a software engineering context
Proficiency in implementing and evaluating data-driven software engineering techniques and prototypes
Appreciation of the state of the art in machine learning on text and source code
Experience with working in an exciting and active research environment
Excellent opportunities to publish your research results in the form of a scientific publication

Qualifications

Interested in cybersecurity / software security / application security
Interested in machine learning, in particular machine learning and natural language processing applied to text, source code, logs, and commits
Programming in Python; preferably also experience with writing papers in LaTeX

Supervisors

Anders Mølmen Høst
Leon Moonen
