dc.contributor.author | Wilkins, Zachary | |
dc.date.accessioned | 2020-12-08T17:28:20Z | |
dc.date.available | 2020-12-08T17:28:20Z | |
dc.date.issued | 2020-12-08T17:28:20Z | |
dc.identifier.uri | http://hdl.handle.net/10222/80075 | |
dc.description.abstract | Malicious software is a persistent threat across our digital platforms.
With unending malware growth and increasingly high-profile attacks,
organizations across the world are ramping up their cyber defence
capabilities.
Cluster analysis is one such tool for understanding the threats faced.
By organizing seemingly disconnected samples according to their behaviours,
attack patterns can be discerned and defended against. But given the volume
of malware, an automated approach is necessary to scale.
In this thesis, I design and implement a system called COUGAR which uses
a multi-objective genetic algorithm to automatically optimize clustering
algorithms. The clustering algorithms are applied to low-dimensional
embeddings derived from high-dimensional malware behavioural data.
The system employs function imports extracted from malicious binaries,
but is flexible enough to accommodate many other features derived from
static or dynamic malware analysis. After the optimization process completes,
the system generates signatures for each cluster which prioritize usability
and comprehensible signature components.
The experiments indicate that any of the chosen clustering algorithms can
produce at least satisfactory results, with density-based approaches
generating especially successful clusters, achieving an F-Score of 0.79
and V-Measure of 0.88. The resulting signatures are very representative of
their respective clusters, with the vast majority achieving representation
scores of at least 90%. | en_US |
dc.language.iso | en | en_US |
dc.subject | Cyber security | en_US |
dc.subject | Machine learning | en_US |
dc.subject | Malware | en_US |
dc.subject | Clustering | en_US |
dc.subject | Cyber attack | en_US |
dc.subject | Evolution | en_US |
dc.title | COUGAR: A System for Clustering Unknown Malware Using Genetic Algorithm Routines | en_US |
dc.date.defence | 2020-11-12 | |
dc.contributor.department | Faculty of Computer Science | en_US |
dc.contributor.degree | Master of Computer Science | en_US |
dc.contributor.external-examiner | n/a | en_US |
dc.contributor.graduate-coordinator | Michael McAllister | en_US |
dc.contributor.thesis-reader | Malcolm I. Heywood | en_US |
dc.contributor.thesis-reader | Tami Meredith | en_US |
dc.contributor.thesis-supervisor | Nur Zincir-Heywood | en_US |
dc.contributor.thesis-supervisor | Frédéric Massicotte | en_US |
dc.contributor.ethics-approval | Not Applicable | en_US |
dc.contributor.manuscripts | Not Applicable | en_US |
dc.contributor.copyright-release | Not Applicable | en_US |