This repository collects notebooks and implementations of topics and concepts related to inference acceleration. The work is grouped into folders, each with its own README explaining what it covers. I plan to add more as I continue to learn!

Next Steps:
- Custom GPU kernels for Torch-TensorRT via Triton/Python or CUDA/C++