Fork of the LM Evaluation Harness suite for running the benchmarks in the paper "PLDR-LLMs Learn A Generalizable Tensor Operator That Can Replace Its Own Deep Neural Net At Inference"
Updated Feb 25, 2025 - Python
Evaluate models and compare their scores