Flash-attention2 CUTLASS

Based on https://github.com/66RING/tiny-flash-attention.git
Blog: https://zhuanlan.zhihu.com/p/708867810

Build

    git submodule update --init --recursive
    pip install .

Benchmark

    python test.py

Benchmark on H20: