misosvec/explainable-ai-malware-classification

Interpretable algorithms in artificial intelligence for malicious code classification

Goal

Take a closer look at AI algorithms that focus on interpretability or explainability, especially those that can be applied to malware classification. Explore different feature extraction methods that help reduce data dimensionality, improve accuracy, and lower the false positive rate (FPR). Based on your findings, suggest a combination of AI models and feature extraction techniques that could boost detection accuracy while keeping the FPR low. Run experiments on the EMBER dataset and compare your results with those reported in the references.
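
The two metrics named in this goal, accuracy and false positive rate, both come from the counts of a binary confusion matrix. Below is a minimal sketch in plain Python, assuming labels are encoded as 1 = malicious and 0 = benign. The function name `accuracy_and_fpr` and the toy labels are hypothetical and are not taken from this repository.

```python
def accuracy_and_fpr(y_true, y_pred):
    """Return (accuracy, false positive rate) for binary labels.

    Assumption: 1 = malicious, 0 = benign.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            tp += 1  # malicious, correctly flagged
        elif truth == 0 and pred == 1:
            fp += 1  # benign, wrongly flagged
        elif truth == 0 and pred == 0:
            tn += 1  # benign, correctly passed
        else:
            fn += 1  # malicious, missed

    total = tp + fp + tn + fn
    negatives = fp + tn
    accuracy = (tp + tn) / total if total else 0.0
    # FPR = FP / (FP + TN): the share of benign samples flagged as malicious.
    fpr = fp / negatives if negatives else 0.0
    return accuracy, fpr


# Toy example: 4 benign and 4 malicious samples.
y_true = [0, 0, 0, 0, 1, 1, 1, 1]
y_pred = [0, 0, 0, 1, 1, 1, 1, 0]
acc, fpr = accuracy_and_fpr(y_true, y_pred)
print(f"accuracy={acc:.2f}, FPR={fpr:.2f}")
```

Reporting FPR alongside accuracy matters because accuracy alone can hide a high rate of benign files being flagged when the classes are imbalanced.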

Progress

| Task | Date | Status |
| --- | --- | --- |
| Read papers on existing work to understand what we will be working with | ~30.10.2024 | |
| Research various decision tree types and libraries in Python | ~07.03.2025 | |
| Train a model on a small reduced dataset and measure performance | ~14.03.2025 | |
| Try to load the largest dataset locally, or explore cloud solutions | ~10.04.2025 | |
| Train a model on the largest dataset, reproduce the model from the referenced paper, and compare results | ~19.04.2025 | |
| Design an algorithm for section anonymization | ~26.04.2025 | |
| Train a model on the anonymized dataset | ~27.04.2025 | |
| Set up a solution for experiment tracking | ~28.04.2025 | |
| Create a skeleton of the text part and write a few pages | ~10.05.2025 | |
| Fixed an issue in the anonymization algorithm and retrained models | ~12.06.2025 | |
| Became familiar with PyTorch and used it to train a neural network | ~13.06.2025 | |
| Tried multiple TREPAN libraries and conducted experiments | ~15.06.2025 | |
| Experimented with post-hoc explainability methods, specifically SHAP and LIME | ~17.06.2025 | |
| Wrote several pages of the thesis covering theoretical background | ~12.09.2025 | |
| Experimented with two other TREPAN libraries | ~14.09.2025 | |
| Trained a surrogate decision tree model on the predictions of the neural network | ~02.10.2025 | |
| Creating graphs and visualizations of results | ~18.10.2025 | |
| Wrote the "Experiments" chapter, which describes the results | ~11.11.2025 | |
| Tried to analyse misclassified examples; t-SNE dimensionality reduction | ~02.12.2025 | |
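
The surrogate-model step in the log above (a decision tree trained on a neural network's predictions) can be sketched as follows. This is a minimal illustration, not the repository's code and not a TREPAN implementation. It assumes synthetic data from scikit-learn instead of EMBER, and it swaps in scikit-learn's `MLPClassifier` for the PyTorch network to keep the example self-contained. All variable names and hyperparameters are hypothetical.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for a malware feature matrix (assumption, not EMBER).
X, y = make_classification(
    n_samples=2000, n_features=20, n_informative=8, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# 1. Train the black-box model on the true labels.
black_box = MLPClassifier(hidden_layer_sizes=(32,), max_iter=500, random_state=0)
black_box.fit(X_train, y_train)

# 2. Train a shallow tree on the black box's predictions, not the true labels.
teacher_train = black_box.predict(X_train)
surrogate = DecisionTreeClassifier(max_depth=4, random_state=0)
surrogate.fit(X_train, teacher_train)

# 3. Fidelity: how often the surrogate agrees with the black box on held-out data.
teacher_test = black_box.predict(X_test)
fidelity = float((surrogate.predict(X_test) == teacher_test).mean())

# Accuracy of the surrogate against the true labels, for comparison.
surrogate_accuracy = surrogate.score(X_test, y_test)

print(f"fidelity={fidelity:.3f}, surrogate accuracy={surrogate_accuracy:.3f}")
```

Fidelity and accuracy answer different questions: fidelity measures how faithfully the tree imitates the network, while accuracy measures how well the tree itself classifies. The depth limit trades fidelity for a tree that is small enough to read.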
