serdes21/flashtile

FlashTile

FlashTile is a CUDA Tile IR compiler that is compatible with NVIDIA's tileiras, targeting SM70 through SM121 NVIDIA GPUs.

Quick Start

  1. Download the latest binary for your platform from GitHub Releases.

  2. Rename the binary to tileiras and add its directory to PATH:

Linux

mv flashtile tileiras
chmod +x tileiras
# /path/to/tileiras is the directory that contains the renamed binary
export PATH="/path/to/tileiras:$PATH"

Windows

Rename-Item flashtile.exe tileiras.exe
# C:\path\to\tileiras is the folder that contains tileiras.exe
$env:PATH = "C:\path\to\tileiras;" + $env:PATH
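On Linux, it can help to confirm that the renamed binary is actually the one the shell resolves before running cutile-python. The following is a minimal sketch; the helper name check_tileiras is hypothetical and not part of FlashTile.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not shipped with FlashTile): report which tileiras
# executable the current PATH resolves to, or fail if there is none.
check_tileiras() {
    local found
    if found="$(command -v tileiras)"; then
        echo "tileiras resolves to: $found"
    else
        echo "tileiras not found on PATH" >&2
        return 1
    fi
}
```

Calling check_tileiras after the export line above should print the path of the renamed FlashTile binary. If it prints a different location, another tileiras earlier in PATH is taking precedence.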

Correctness Testing

Validated against 74 test files (54 from cutile-python and 20 from TileGym, excluding benchmarks). For efficiency, only one representative case is sampled from each test file. See Test Results.

Performance Testing

FlashTile is still in early development. Performance benchmarks have not yet been conducted.

Debugging

FlashTile provides multiple ways to inspect intermediate representations at each compilation stage.

Step 1: Export the bytecode file from cutile-python by setting the environment variable:

export CUDA_TILE_DUMP_BYTECODE=/path/to/cutile

This causes cutile-python to save .cutile files to that directory when it compiles kernels.
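Once kernels have been compiled, the dumped files need to be located before they can be passed to FlashTile. The sketch below assumes the dumps carry the .cutile extension mentioned above; the helper name list_cutile_files is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print every .cutile file under a directory,
# one per line, in sorted order. Defaults to the current directory.
list_cutile_files() {
    find "${1:-.}" -type f -name '*.cutile' | sort
}
```

For example, list_cutile_files /path/to/cutile lists the bytecode files that can be inspected in the next step.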

Step 2: Use --dump-* flags to inspect each stage of the compilation pipeline:

flashtile --dump-ir        kernel.cutile   # Parsed CUDA Tile IR
flashtile --dump-mlir      kernel.cutile   # MLIR textual format (cutile-compatible)
flashtile --dump-tir       kernel.cutile   # Lowered TVM TIR
flashtile --dump-tvmscript kernel.cutile   # TVMScript (TVM IR printer)
flashtile --dump-cuda --gpu-name=sm_90 kernel.cutile   # Generated CUDA source
flashtile --dump-ptx  --gpu-name=sm_90 kernel.cutile   # Generated PTX assembly
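To compare stages side by side, the six commands above can be wrapped in a loop that writes each dump to its own file. This is a sketch under two assumptions that the README does not confirm: each --dump-* flag prints to standard output, and --gpu-name is only needed for the CUDA and PTX stages. The function name, the FLASHTILE variable, and the output file naming are all hypothetical.

```shell
#!/usr/bin/env bash
# Compiler binary; override FLASHTILE if the binary was renamed to tileiras.
FLASHTILE="${FLASHTILE:-flashtile}"

# dump_all_stages KERNEL [GPU] [OUTDIR]
# Assumption: every --dump-* flag writes its result to stdout.
dump_all_stages() {
    local kernel="$1" gpu="${2:-sm_90}" outdir="${3:-dumps}"
    local stage base
    base="$(basename "$kernel" .cutile)"
    mkdir -p "$outdir" || return 1
    for stage in ir mlir tir tvmscript cuda ptx; do
        local args=("--dump-$stage")
        case "$stage" in
            # Only the code generation stages take a target architecture here.
            cuda|ptx) args+=("--gpu-name=$gpu") ;;
        esac
        # One output file per stage, e.g. dumps/kernel.ptx.txt
        if ! "$FLASHTILE" "${args[@]}" "$kernel" > "$outdir/$base.$stage.txt"; then
            echo "stage '$stage' failed for $kernel" >&2
            return 1
        fi
    done
}
```

A call such as dump_all_stages kernel.cutile sm_90 dumps would leave six text files in dumps/, one per stage, which makes it easier to diff the output of two FlashTile builds.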

Building from Source

Linux — see docker/ for Dockerfiles and build script:

bash docker/build.sh

Windows — see scripts/build-windows-static.ps1:

.\scripts\build-windows-static.ps1

Notes

  • NVIDIA driver version 580 or later is required because of the host launch mechanism used by cutile-python.
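The driver requirement can be checked from a script before running any kernels. The sketch below compares only the major version number against 580; the helper name driver_meets_minimum is hypothetical, and the version string is assumed to have the usual dotted form such as 580.65.06.

```shell
#!/usr/bin/env bash
# driver_meets_minimum VERSION [MIN_MAJOR]
# Succeeds when the major component of VERSION is at least MIN_MAJOR (default 580).
driver_meets_minimum() {
    local version="$1" min_major="${2:-580}"
    local major="${version%%.*}"          # text before the first dot
    [[ "$major" =~ ^[0-9]+$ ]] && (( 10#$major >= min_major ))
}
```

On a machine with the NVIDIA driver installed, the version string can be read with nvidia-smi --query-gpu=driver_version --format=csv,noheader and passed to the helper, for example driver_meets_minimum "$(nvidia-smi --query-gpu=driver_version --format=csv,noheader | head -n1)".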

Join the Discussion

Join our Discord community for discussion, support, and collaboration.

Join our Discord

QQ group: 1056444998
WeChat ID: serdes21
