Make a cloud computer for cloud gaming, video production, etc.
Updated Oct 11, 2025 - PowerShell
Data exploration tools for GPU computing benchmarks - Wikipedia & HDF5 sensor datasets
GPU-accelerated neural network inference using custom CUDA kernels. Achieves 97.82% accuracy on MNIST.
GPU-accelerated Fast Fourier Transform implementation using CUDA. Demonstrates Cooley-Tukey radix-2 algorithm with shared memory optimization, achieving 534x speedup over CPU and 1.74x over NumPy for large transforms. Developed on Tesla P100.
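For readers unfamiliar with the algorithm named above, here is a minimal CPU-side sketch of a recursive Cooley-Tukey radix-2 FFT in pure Python. It is not the repository's CUDA code and does not model the shared memory optimization; the function names `fft_radix2` and `dft_naive` are hypothetical and exist only for this illustration.

```python
import cmath


def fft_radix2(x):
    """Recursive Cooley-Tukey radix-2 FFT (illustrative sketch).

    The input length must be a power of two.
    """
    n = len(x)
    if n < 1 or n & (n - 1):
        raise ValueError("length must be a power of two")
    if n == 1:
        return [complex(x[0])]
    # Split into even- and odd-indexed halves and transform each.
    even = fft_radix2(x[0::2])
    odd = fft_radix2(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        # Twiddle factor applied to the odd half (butterfly step).
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def dft_naive(x):
    """O(n^2) reference DFT used only to check the FFT."""
    n = len(x)
    return [
        sum(x[j] * cmath.exp(-2j * cmath.pi * j * k / n) for j in range(n))
        for k in range(n)
    ]
```

A GPU version would typically replace the recursion with an iterative, in-place formulation so that each butterfly stage can be spread across threads.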
High-performance GPU SpMV kernels in Python/Numba CUDA. Achieves 96.6 GB/s (53% of cuSPARSE) with 5,432x speedup over CPU. Includes optimization analysis.
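As a point of reference for what an SpMV kernel computes, the sketch below multiplies a sparse matrix by a dense vector in pure Python. The description above does not say which storage format the repository uses; CSR (compressed sparse row) is assumed here because it is a common choice, and the name `spmv_csr` is hypothetical.

```python
def spmv_csr(values, col_idx, row_ptr, x):
    """Compute y = A @ x for a matrix A stored in CSR form (assumed format).

    values  : nonzero entries, row by row
    col_idx : column index of each nonzero
    row_ptr : row_ptr[i]..row_ptr[i+1] delimits row i's nonzeros
    """
    num_rows = len(row_ptr) - 1
    y = [0.0] * num_rows
    for row in range(num_rows):
        acc = 0.0
        for j in range(row_ptr[row], row_ptr[row + 1]):
            acc += values[j] * x[col_idx[j]]
        y[row] = acc
    return y


# A = [[10,  0,  0],
#      [ 0, 20, 30],
#      [40,  0, 50]]
values = [10, 20, 30, 40, 50]
col_idx = [0, 1, 2, 0, 2]
row_ptr = [0, 1, 3, 5]
print(spmv_csr(values, col_idx, row_ptr, [1, 2, 3]))  # [10.0, 130.0, 190.0]
```

In a simple GPU mapping, the outer loop over rows would become one thread per row. SpMV is usually limited by memory bandwidth rather than arithmetic, which is why the figure above is quoted in GB/s.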
High-performance GPU matrix multiplication achieving 6,436 GFLOPS (69% of peak) on Tesla P100 through progressive CUDA optimization.
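To make the throughput figure concrete, the following sketch shows how a GFLOPS number for matrix multiplication is commonly derived and what peak the "69%" implies. It assumes the usual convention of 2·m·n·k floating-point operations for an (m×k)·(k×n) product; the description does not state the matrix size, precision, or timing method, so the helper names and any sizes passed to them are hypothetical.

```python
def matmul_gflops(m, n, k, seconds):
    """GFLOPS for an (m x k) @ (k x n) product, counting 2*m*n*k operations."""
    return 2.0 * m * n * k / seconds / 1e9


def implied_peak(achieved_gflops, fraction_of_peak):
    """Peak throughput implied by an achieved rate and its share of peak."""
    return achieved_gflops / fraction_of_peak


# The listing quotes 6,436 GFLOPS at 69% of peak.
print(round(implied_peak(6436, 0.69)))  # 9328
```

The implied peak of roughly 9.3 TFLOPS follows purely from the two numbers quoted above.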