[Documentation] [Web Interface] [Online Demo]
[NeurIPS'25 Benchmark Paper] [Benchmark Scripts] [IJCAI'24 Survey Paper] [Paper Collection]
🙋 Please let us know if you find a mistake or have any suggestions!
🌟 If you find this resource helpful, please consider starring this repository and citing our research:
@inproceedings{gong2025gcnc,
title={{GC}4{NC}: A Benchmark Framework for Graph Condensation on Node Classification with New Insights},
author={Shengbo Gong and Juntong Ni and Noveen Sachdeva and Carl Yang and Wei Jin},
booktitle={The Thirty-ninth Annual Conference on Neural Information Processing Systems Datasets and Benchmarks Track},
year={2025}
}
@inproceedings{hashemi2024comprehensive,
title={A comprehensive survey on graph reduction: sparsification, coarsening, and condensation},
author={Hashemi, Mohammad and Gong, Shengbo and Ni, Juntong and Fan, Wenqi and Prakash, B Aditya and Jin, Wei},
booktitle={Proceedings of the Thirty-Third International Joint Conference on Artificial Intelligence},
pages={8058--8066},
year={2024}
}
GraphSlim is a PyTorch library for graph reduction. It takes a graph in PyG format as input and outputs a reduced graph that preserves the properties or performance of the original graph.
- Covers representative methods from all 3 graph reduction strategies: Sparsification, Coarsening and Condensation.
- Different reduction strategies can be easily combined in one run.
- Provides unified evaluation tools, including Grid Search and NAS.
- Supports evasion and poisoning attacks on the input graph via DeepRobust.
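As a rough illustration of what a reduced graph looks like, the sketch below keeps a random subset of nodes and the edges among them, in the spirit of a random coreset baseline. It is not GraphSlim's implementation: the function name `random_coreset`, the edge-list representation, and applying the rate to all nodes (rather than only the training set) are assumptions made for this example. The reduction rate follows the definition given in the command-line options: (number of nodes in small graph)/(number of nodes in original graph).

```python
import random


def random_coreset(num_nodes, edges, reduction_rate, seed=1):
    """Keep a random subset of nodes and the edges among them.

    Hypothetical helper for illustration only.
    reduction_rate = (nodes in small graph) / (nodes in original graph).
    """
    if not 0 < reduction_rate <= 1:
        raise ValueError("reduction_rate must be in (0, 1]")
    k = max(1, round(num_nodes * reduction_rate))
    rng = random.Random(seed)
    kept = sorted(rng.sample(range(num_nodes), k))
    # Relabel kept nodes to 0..k-1 so the reduced graph is self-contained.
    new_id = {old: new for new, old in enumerate(kept)}
    reduced_edges = [
        (new_id[u], new_id[v])
        for u, v in edges
        if u in new_id and v in new_id
    ]
    return kept, reduced_edges


# A 6-node ring, reduced to half of its nodes.
edges = [(i, (i + 1) % 6) for i in range(6)]
kept, reduced_edges = random_coreset(6, edges, reduction_rate=0.5)
print(len(kept), reduced_edges)
```

Coarsening and condensation methods go further than this: instead of selecting existing nodes, they merge or synthesize nodes, so the reduced graph need not be a subgraph of the original.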
[11/9/25] Version 1.2 released! We categorized the parameters, so they are now much clearer. The environment can now be set up easily with uv.
Check the previous versions of torch.
We tested this repo with torch 1.13.1, torch 2.1.2, and torch 2.6.0 with CUDA 12.8.
To ensure a consistent and reproducible environment, we use uv to lock our dependencies. If you have uv installed, you can reproduce our exact environment by running the following command:
uv pip sync
This will install all the packages listed in uv.lock with their exact versions.
# choose one version from https://data.pyg.org/whl/ based on your environment
pip install torch_scatter torch_sparse -f https://data.pyg.org/whl/torch-${TORCH}+${CUDA}.html
pip install torch_geometric
pip install -r graphslim/requirements.txt
pip install graphslim
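The ${TORCH} and ${CUDA} placeholders in the wheel URL above depend on the local PyTorch build. The sketch below shows one way to fill them in. The helper names `cuda_tag` and `pyg_wheel_index` are hypothetical and not part of GraphSlim or PyG, and the resulting URL should still be checked against the listing at https://data.pyg.org/whl/, since not every version combination is guaranteed to be published.

```python
from typing import Optional


def cuda_tag(cuda_version: Optional[str]) -> str:
    """Turn a CUDA version string such as "12.1" into a tag such as "cu121".

    None is treated as a CPU-only build.
    """
    if cuda_version is None:
        return "cpu"
    return "cu" + cuda_version.replace(".", "")


def pyg_wheel_index(torch_version: str, tag: str) -> str:
    """Fill the ${TORCH} and ${CUDA} placeholders of the wheel index URL."""
    # Drop any local suffix such as "+cu121" from the torch version.
    base_version = torch_version.split("+")[0]
    return f"https://data.pyg.org/whl/torch-{base_version}+{tag}.html"


# With PyTorch installed, the two inputs can be read from
# torch.__version__ and torch.version.cuda; literals are used here
# so that the example runs without PyTorch.
print(pyg_wheel_index("2.1.2+cu121", cuda_tag("12.1")))
# → https://data.pyg.org/whl/torch-2.1.2+cu121.html
```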
cd examples
python train_coreset.py
python train_coarsen.py
python train_gcond.py
See more examples in Benchmark Scripts.
cd graphslim
python train_all.py -xxx xx
Run python configs.py --help to get all command-line options.
Options:
-D, --dataset TEXT [default: cora]
-G, --gpu_id INTEGER gpu id start from 0, -1 means cpu [default:
0]
--setting [trans|ind] transductive or inductive setting
--split TEXT only support public split now, do not change
it [default: fixed]
--run_reduction INTEGER repeat times of reduction [default: 3]
--run_eval INTEGER repeat times of final evaluations [default:
10]
--run_inter_eval INTEGER repeat times of intermediate evaluations
[default: 5]
--eval_interval INTEGER [default: 100]
-H, --hidden INTEGER [default: 256]
--eval_epochs, --ee INTEGER [default: 300]
--eval_model, --em [GCN|GAT|SGC|APPNP|Cheby|GraphSage|GAT|SGFormer]
[default: GCN]
--condense_model [GCN|GAT|SGC|APPNP|Cheby|GraphSage|GAT]
[default: SGC]
-E, --epochs INTEGER number of reduction epochs [default: 1000]
--lr FLOAT [default: 0.01]
--weight_decay, --wd FLOAT [default: 0.0]
--pre_norm BOOLEAN pre-row-normalize features, forced true for
arxiv, flickr and reddit [default: True]
--outer_loop INTEGER [default: 10]
--inner_loop INTEGER [default: 1]
-R, --reduction_rate FLOAT -1 means use representative reduction rate;
reduction rate of training set, defined as
(number of nodes in small graph)/(number of
nodes in original graph) [default: -1.0]
-S, --seed INTEGER Random seed [default: 1]
--nlayers INTEGER number of GNN layers of condensed model
[default: 2]
-V, --verbose
--init [variation_neighborhoods|variation_edges|variation_cliques|heavy_edge|algebraic_JC|affinity_GS|kron|vng|clustering|averaging|cent_d|cent_p|kcenter|herding|random]
features initialization methods
-M, --method [variation_neighborhoods|variation_edges|variation_cliques|heavy_edge|algebraic_JC|affinity_GS|kron|vng|clustering|averaging|gcond|doscond|gcondx|doscondx|sfgc|msgc|disco|sgdd|gcsntk|geom|cent_d|cent_p|kcenter|herding|random]
[default: kcenter]
--activation [sigmoid|tanh|relu|linear|softplus|leakyrelu|relu6|elu]
activation function when do NAS [default:
relu]
-A, --attack [random_adj|metattack|random_feat]
corruption method
-P, --ptb_r FLOAT perturbation rate for corruptions [default:
0.25]
--aggpreprocess use aggregation for coreset methods
--dis_metric TEXT distance metric for all condensation
methods,ours means metric used in GCond
paper [default: ours]
--lr_adj FLOAT [default: 0.0001]
--lr_feat FLOAT [default: 0.0001]
--threshold INTEGER sparsificaiton threshold before evaluation
[default: 0]
--dropout FLOAT [default: 0.0]
--ntrans INTEGER number of transformations in SGC and APPNP
[default: 1]
--with_bn
--no_buff skip the buffer generation and use existing
in geom,sfgc
--batch_adj INTEGER batch size for msgc [default: 1]
--alpha FLOAT for appnp [default: 0.1]
--mx_size INTEGER for gcsntk methods, avoid SVD error
[default: 100]
--save_path, --sp TEXT save path for synthetic graph [default:
../checkpoints]
-W, --eval_whole if run on whole graph
--help Show this message and exit.
from graphslim.dataset import *
from graphslim.evaluation import *
from graphslim.condensation import GCond
from graphslim.config import cli
args = cli(standalone_mode=False)
# customize args here
args.reduction_rate = 0.5
args.device = 'cuda:0'
# add more args.<main_args/dataset_args> here
graph = get_dataset('cora', args=args)
# To reproduce the benchmark, use our args and graph class
# To use your own args and graph format, make sure the args and graph class have the required attributes
# create an agent of one reduction algorithm
# add more args.<agent_args> here
agent = GCond(setting='trans', data=graph, args=args)
# reduce the graph
reduced_graph = agent.reduce(graph, verbose=True)
# create an evaluator
# add more args.<evaluator_args> here
evaluator = Evaluator(args)
# evaluate the reduced graph on a GNN model
res_mean, res_std = evaluator.evaluate(reduced_graph, model_type='GCN')
- To implement a new reduction algorithm, create a new class in sparsification, coarsening, or condensation and inherit from the Base class.
- To implement a new dataset, create a new class in dataset/loader.py and inherit from the TransAndInd class.
- To implement a new evaluation metric, create a new function in evaluation/eval_agent.py.
- To implement a new GNN model, create a new class in models and inherit from the Base class.
- To customize sparsification before evaluation, modify the sparsify function in evaluation/utils.py.
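For the last item, a threshold-based sparsifier could look roughly like the sketch below. This is a hypothetical stand-in, not the actual sparsify function in evaluation/utils.py: the name `threshold_sparsify`, the dense nested-list adjacency, and the "keep weights at or above the threshold" rule are all assumptions chosen to keep the example self-contained.

```python
def threshold_sparsify(adj, threshold):
    """Return a copy of a dense weighted adjacency with small entries zeroed.

    Hypothetical helper: entries strictly below `threshold` are dropped,
    entries at or above it are kept unchanged.
    """
    return [[w if w >= threshold else 0.0 for w in row] for row in adj]


# A small symmetric weighted adjacency, as a condensation method might learn.
adj = [
    [0.0, 0.9, 0.1],
    [0.9, 0.0, 0.4],
    [0.1, 0.4, 0.0],
]
sparse_adj = threshold_sparsify(adj, threshold=0.5)
num_edges = sum(w > 0 for row in sparse_adj for w in row)
print(sparse_adj, num_edges)
```

Applying such a step before evaluation trades a little fidelity for a much sparser synthetic graph, which is the role the --threshold option plays in the command-line interface.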
Our web application is deployed online using Streamlit. It can also be launched locally with:
cd interface
python -m streamlit run vis_graphslim.py
Install the dependencies in interface/requirements.txt before launching the interface.
- [-] Add sparsification algorithms like Spanner
- Add more recent condensation methods
- Support more datasets
- Present full results in a website
- GEOM and SFGC are not fully implemented in the current version due to disk space limits. We currently set the number of experts to 20. If you have over 100GB of disk space, you can set the number of experts to 200 to reproduce the results in the paper.
Some of the algorithms are based on the paper authors' implementations and other packages.
