Do you watch the news, and did you recently see a report about a devastating vulnerability disclosure? Do you follow security blogs to be prepared for upcoming patch waves? Are your software and devices up to date?
In our group, we are committed to improving software security and work on different aspects such as project management, engineering, release planning, and maintenance. In this project, we want to shed light on the update mechanisms used by vulnerability database providers and on the evolution of their records. However, we currently lack the required information about how disclosures spread between the different platforms and about their history.
A few problems you’ll be confronted with:
Several online vulnerability databases already exist [1] [2] [3], each containing only a fraction of all available vulnerability disclosure reports. Some databases provide ready-to-download data feeds, others provide web APIs, and some providers publish their reports only as HTML pages. None of the providers offers historical data in a ready-to-use form; manual effort is always required.
There is existing work in which researchers analysed historical data and built machine learning models on top of it [4]. However, it used only one database and basic features, e.g., the time span between reports and the vulnerability score.
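Because no provider ships historical data, one possible way to reconstruct the evolution of records is to store periodic snapshots and compare them. The following is a minimal sketch of that idea, not a prescribed method. It assumes each snapshot has already been parsed into a dictionary that maps a record ID to its fields; the function name `diff_snapshots`, the IDs, and the `score` field are hypothetical and chosen only for illustration.

```python
from typing import Dict, List, NamedTuple


class SnapshotDiff(NamedTuple):
    added: List[str]    # IDs present only in the newer snapshot
    removed: List[str]  # IDs present only in the older snapshot
    changed: List[str]  # IDs present in both, but with different fields


def diff_snapshots(old: Dict[str, dict], new: Dict[str, dict]) -> SnapshotDiff:
    """Compare two snapshots that map a record ID to its fields."""
    added = sorted(new.keys() - old.keys())
    removed = sorted(old.keys() - new.keys())
    changed = sorted(k for k in old.keys() & new.keys() if old[k] != new[k])
    return SnapshotDiff(added, removed, changed)


# Hypothetical records; the IDs and fields are made up for illustration.
monday = {"VULN-1": {"score": 5.0}, "VULN-2": {"score": 7.5}}
tuesday = {"VULN-1": {"score": 9.8}, "VULN-3": {"score": 4.3}}

result = diff_snapshots(monday, tuesday)
print(result)
```

Storing only such differences per day would keep the history compact, at the cost of having to replay them to recover a full snapshot.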
In this project, we want to explore the security reports and their spread across major online databases and, if possible, predict trends regarding future changes.
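One simple way to quantify the spread of a single disclosure is the delay between its first and its last appearance across databases. The sketch below assumes the first publication date per database is already known; the database labels `db_a`, `db_b`, `db_c`, the dates, and both function names are hypothetical placeholders, not real data.

```python
from datetime import date
from typing import Dict, List


def publication_order(first_seen: Dict[str, date]) -> List[str]:
    """Return database labels ordered by when they first listed the disclosure."""
    return sorted(first_seen, key=lambda name: first_seen[name])


def spread_lag_days(first_seen: Dict[str, date]) -> int:
    """Days between the earliest and the latest first publication."""
    if not first_seen:
        raise ValueError("need at least one database entry")
    return (max(first_seen.values()) - min(first_seen.values())).days


# Hypothetical first-publication dates of one disclosure in three databases.
sightings = {
    "db_b": date(2020, 1, 13),
    "db_a": date(2020, 1, 10),
    "db_c": date(2020, 2, 1),
}

order = publication_order(sightings)
lag = spread_lag_days(sightings)
print(order, lag)
```

Computed over many disclosures, such lags could serve as one input feature for trend prediction, alongside the features used in prior work [4].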
Your task will consist of:
[1] National Vulnerability Database
[2] CVE Mitre
[3] KB Cert
[4] Zhang, Caragea, and Ou, An empirical study on using the National Vulnerability Database to predict software vulnerabilities, 2011 (PDF)