FlashTile is a CUDA Tile IR compiler that is compatible with NVIDIA's tileiras, targeting SM70 through SM121 NVIDIA GPUs.
- Download the latest binary for your platform from GitHub Releases.
- Rename and add to PATH:
Linux
mv flashtile tileiras
chmod +x tileiras
export PATH="/path/to/tileiras:$PATH"
Windows
Rename-Item flashtile.exe tileiras.exe
$env:PATH = "C:\path\to\tileiras;" + $env:PATH

Validated with 74 test files (54 from cutile-python, 20 from TileGym, excluding benchmarks). For efficiency, only 1 representative case is sampled per test file.

FlashTile is still in early development. Performance benchmarks have not yet been conducted.
FlashTile provides multiple ways to inspect intermediate representations at each compilation stage.
Step 1: Export the bytecode file from cutile-python by setting the environment variable:
export CUDA_TILE_DUMP_BYTECODE=/path/to/cutile
This causes cutile-python to save .cutile files to the working directory when compiling kernels.
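After running a cutile-python program with the variable set, it can help to confirm which bytecode files were actually written. The following is a minimal sketch, not part of FlashTile: `list_cutile_files` is a hypothetical helper name, and the directory you pass to it is an assumption (it defaults to the current directory).

```shell
#!/usr/bin/env bash
# Hypothetical helper: list the .cutile files found directly inside a
# directory, sorted by name. The directory argument defaults to ".".
list_cutile_files() {
    local dir="${1:-.}"
    # -maxdepth 1 keeps the search out of subdirectories.
    find "$dir" -maxdepth 1 -type f -name '*.cutile' | sort
}
```

Calling `list_cutile_files` with no argument searches the working directory; pass a path to search elsewhere.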
Step 2: Use --dump-* flags to inspect each stage of the compilation pipeline:
flashtile --dump-ir kernel.cutile # Parsed CUDA Tile IR
flashtile --dump-mlir kernel.cutile # MLIR textual format (cutile-compatible)
flashtile --dump-tir kernel.cutile # Lowered TVM TIR
flashtile --dump-tvmscript kernel.cutile # TVMScript (TVM IR printer)
flashtile --dump-cuda --gpu-name=sm_90 kernel.cutile # Generated CUDA source
flashtile --dump-ptx --gpu-name=sm_90 kernel.cutile # Generated PTX assembly
Linux: see docker/ for Dockerfiles and build script:
bash docker/build.sh
Windows: see scripts/build-windows-static.ps1:
.\scripts\build-windows-static.ps1
- NVIDIA driver version 580+ is required, due to the host launch mechanism used by cutile-python.
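The driver requirement can be checked from a shell before running kernels. This is a rough sketch rather than anything shipped with FlashTile: the function names are hypothetical, and it assumes "580+" means a driver whose major version number is at least 580. It relies on `nvidia-smi --query-gpu=driver_version --format=csv,noheader` to read the installed version.

```shell
#!/usr/bin/env bash
# Minimum driver major version stated in this README.
MIN_DRIVER_MAJOR=580

# Hypothetical helper: succeeds if a version string such as "580.65.06"
# has a major component >= MIN_DRIVER_MAJOR. Returns 2 on malformed input.
driver_version_ok() {
    local version="$1"
    local major="${version%%.*}"   # text before the first dot
    case "$major" in
        ''|*[!0-9]*) return 2 ;;   # empty or non-numeric
    esac
    [ "$major" -ge "$MIN_DRIVER_MAJOR" ]
}

# Hypothetical helper: read the first GPU's driver version and check it.
check_installed_driver() {
    if ! command -v nvidia-smi >/dev/null 2>&1; then
        echo "nvidia-smi not found; cannot determine driver version" >&2
        return 2
    fi
    local version
    version="$(nvidia-smi --query-gpu=driver_version --format=csv,noheader | head -n 1)"
    if driver_version_ok "$version"; then
        echo "driver $version: meets the ${MIN_DRIVER_MAJOR}+ requirement"
    else
        echo "driver '$version': does not meet the ${MIN_DRIVER_MAJOR}+ requirement" >&2
        return 1
    fi
}
```

Run `check_installed_driver` on the target machine; `driver_version_ok` can also be called directly with a version string.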
Join our Discord community for discussion, support, and collaboration!
QQ group: 1056444998
WeChat ID: serdes21