PyTorch implementation of DisKT
🌹Many thanks to Ringotc for pointing out the data leak issue in our code and providing a fix.
Place the assist09, algebra05, algebra06, statics, ednet, prob, comp, linux, database, spanish, and slepemapy source files in the dataset directory, then preprocess each dataset with the corresponding command below. Where a command shows a bracketed list, run it once per listed name:
python preprocess_data.py --data_name assistments09
python preprocess_data.py --data_name [algebra05, bridge_algebra06]
python preprocess_data.py --data_name statics
python preprocess_data.py --data_name ednet
python preprocess_data.py --data_name [prob, sampled_comp, linux, database]
python preprocess_data.py --data_name spanish
python preprocess_data.py --data_name sampled_slepemapy

The statistics of the 11 datasets after processing are as follows:
| Datasets | #students | #questions | #concepts | #concepts* | #interactions |
|---|---|---|---|---|---|
| assist09 | 3,644 | 17,727 | 123 | 150 | 281,890 |
| algebra05 | 571 | 173,113 | 112 | 271 | 607,014 |
| algebra06 | 1,138 | 129,263 | 493 | 550 | 1,817,450 |
| statics | 333 | 1,223 | N/A | N/A | 189,297 |
| ednet | 5,000 | 12,117 | 189 | 1,769 | 676,276 |
| prob | 512 | 1,054 | 247 | 247 | 42,869 |
| comp | 5,000 | 7,460 | 445 | 445 | 668,927 |
| linux | 4,375 | 2,672 | 281 | 281 | 365,027 |
| database | 5,488 | 3,388 | 291 | 291 | 990,468 |
| spanish | 182 | 409 | 221 | 221 | 578,726 |
| slepemapy | 5,000 | 2,723 | 1,391 | 1,391 | 625,523 |
Table 1: Statistics of the 11 datasets. "#concepts*" denotes the total number of concepts after converting multiple concepts into a new concept.
The dataset processed with PTADisc can be found at the link.
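The preprocessing commands listed above the statistics table can also be run in a single pass. The following is a minimal sketch, assuming `preprocess_data.py` sits in the current working directory and accepts exactly the `--data_name` values shown in those commands. The helper names `build_commands` and `run_all` are hypothetical and not part of the repository.

```python
import subprocess
import sys

# Dataset names copied from the preprocessing commands in this README.
DATA_NAMES = [
    "assistments09",
    "algebra05",
    "bridge_algebra06",
    "statics",
    "ednet",
    "prob",
    "sampled_comp",
    "linux",
    "database",
    "spanish",
    "sampled_slepemapy",
]


def build_commands(data_names=DATA_NAMES, python=sys.executable):
    """Return one preprocess_data.py invocation per dataset name."""
    return [[python, "preprocess_data.py", "--data_name", name] for name in data_names]


def run_all(dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for cmd in build_commands():
        print(" ".join(cmd))
        if not dry_run:
            # check=True stops at the first dataset that fails to preprocess.
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    run_all(dry_run=True)  # switch to dry_run=False to actually preprocess
```

The dry-run default only prints the commands, which makes it easy to confirm the dataset names before any files are touched.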
Clone this repository and create a conda environment:
conda create -n diskt python=3.10.13
conda activate diskt
pip install -r requirements.txt

Note that Mamba requires a different CUDA version; please follow the installation instructions in its GitHub repository strictly. Downloading the correct CUDA packages is crucial.
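Since the Mamba dependency is sensitive to the CUDA version, it can help to check which CUDA build the installed PyTorch reports before training. This is only a sketch: the function name `cuda_report` is hypothetical, and the check says nothing about which version Mamba itself requires, which must come from the Mamba installation instructions.

```python
def cuda_report():
    """Collect basic PyTorch/CUDA facts without failing if torch is absent."""
    try:
        import torch
    except ImportError:
        # PyTorch is not installed in this environment.
        return {"torch": None}
    return {
        "torch": torch.__version__,
        "cuda_build": torch.version.cuda,  # None for CPU-only builds
        "cuda_available": torch.cuda.is_available(),
        "gpu_count": torch.cuda.device_count(),
    }


if __name__ == "__main__":
    for key, value in cuda_report().items():
        print(f"{key}: {value}")
```

Running this inside the activated `diskt` environment shows whether the CUDA build PyTorch was compiled against matches what the Mamba instructions ask for.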
Our experiments were conducted on two NVIDIA RTX 3090 24GB GPUs. You can run a model directly with the following command, choosing one model name and one dataset name from the bracketed lists:
CUDA_VISIBLE_DEVICES=0 python main.py --model_name [diskt, dkt, dkvmn, skvmn, deep_irt, gkt, sakt, akt, atkt, cl4kt, corekt, dtransformer, simplekt, folibikt, sparsekt, mikt] --data_name [assist09, algebra05, algebra06, statics, ednet, prob, sampled_comp, linux, database, spanish, sampled_slepemapy]

If you find our work valuable, we would appreciate your citation:
@inproceedings{zhou2025disentangled,
title={Disentangled Knowledge Tracing for Alleviating Cognitive Bias},
author={Zhou, Yiyun and Lv, Zheqi and Zhang, Shengyu and Chen, Jingyuan},
booktitle={Proceedings of the ACM on Web Conference 2025},
pages={2633--2645},
year={2025}
}