Small and Efficient Mathematical Reasoning LLMs
[ACL 2023] Solving Math Word Problems via Cooperative Reasoning induced Language Models (LLMs + MCTS + Self-Improvement)
Reproducible Language Agent Research
DICE: Detecting In-distribution Data Contamination with LLM's Internal State
Short-CoT distilled GSM8K dataset generated with OpenAI gpt-oss-120b.
GSM8K-Consistency is a benchmark database for analyzing the consistency of Arithmetic Reasoning on GSM8K.
An evaluation of prompting techniques (Zero-Shot CoT, Few-Shot, Self-Consistency) on the Mistral-7B model for mathematical reasoning. This project systematically benchmarks 7 distinct methods on the GSM8K dataset.
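A minimal sketch of what such an evaluation loop can look like, assuming the Hugging Face datasets/transformers stack and a Mistral-7B-Instruct checkpoint; the prompt template, subset size, and answer extraction below are illustrative, not the linked project's actual code.

```python
# Illustrative zero-shot CoT evaluation on GSM8K (rough sketch, not the repo's code).
import re
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL = "mistralai/Mistral-7B-Instruct-v0.2"  # assumed checkpoint
tok = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForCausalLM.from_pretrained(MODEL, device_map="auto")

ds = load_dataset("gsm8k", "main", split="test")

def last_number(text):
    # GSM8K answers are numeric; take the last number the model produces.
    nums = re.findall(r"-?\d+(?:\.\d+)?", text.replace(",", ""))
    return nums[-1] if nums else None

correct, n = 0, 100
for ex in ds.select(range(n)):  # small subset for a quick check
    prompt = f"Q: {ex['question']}\nA: Let's think step by step."
    inputs = tok(prompt, return_tensors="pt").to(model.device)
    out = model.generate(**inputs, max_new_tokens=256, do_sample=False)
    completion = tok.decode(out[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)
    gold = ex["answer"].split("####")[-1].strip().replace(",", "")
    correct += last_number(completion) == gold
print(f"accuracy on subset: {correct / n:.2%}")
```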
Nano R1 is a reasoning model trained with reinforcement learning, focused on decision-making and adaptability in dynamic environments. Developed in Python and hosted on Hugging Face.
In this project, we fine-tune GPT-OSS-20B on OpenAI's GSM8K dataset.
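For orientation, a supervised fine-tuning run along these lines could be sketched with TRL's SFTTrainer; the model id, hyperparameters, and formatting below are placeholders (argument names also differ across TRL versions), not the repository's actual setup.

```python
# Illustrative SFT sketch on GSM8K; memory concerns (LoRA, quantization, etc.) are omitted.
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

ds = load_dataset("openai/gsm8k", "main", split="train")

def to_text(ex):
    # Join the question and its worked solution into one training string.
    return {"text": f"Question: {ex['question']}\nAnswer: {ex['answer']}"}

ds = ds.map(to_text)

trainer = SFTTrainer(
    model="openai/gpt-oss-20b",  # placeholder model id
    train_dataset=ds,
    args=SFTConfig(
        output_dir="gsm8k-sft",
        dataset_text_field="text",
        max_seq_length=1024,
        per_device_train_batch_size=1,
    ),
)
trainer.train()
```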
A comprehensive benchmarking framework for RLVR/RLHF libraries on the GSM8K mathematical reasoning dataset.
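For context, RLVR setups on GSM8K typically score a rollout by comparing the model's final number against the reference answer after the "####" marker; a minimal, assumption-laden verifier of that kind might look like the function below (the name and signature are illustrative, not taken from the benchmarked libraries).

```python
import re

def gsm8k_reward(completion: str, reference: str) -> float:
    """Return 1.0 if the last number in the completion matches the gold answer, else 0.0."""
    gold = reference.split("####")[-1].strip().replace(",", "")
    nums = re.findall(r"-?\d+(?:\.\d+)?", completion.replace(",", ""))
    return 1.0 if nums and nums[-1] == gold else 0.0
```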
Developing an autonomous prompt-selection system for Large Language Models (LLMs) that improves performance across tasks by balancing generality and specificity. The project automates the creation and selection of diverse, high-quality prompts, reducing manual intervention and maximizing LLM utility across applications.
Analysis of CoT and standard prompting techniques on standard datasets (GSM8K, AQuA, SVAMP) using the PaLM (text-bison-001), Llama-2-7B-Chat-GPTQ, and Llama-2-13B-Chat-GPTQ language models.
Transforming weak prompts into reasoning machines using Textual Gradients and AdalFlow. Runs on Colab.
Dataset management and caching for AI research benchmarks
AlphaZero-style RL training for LLMs using MCTS on mathematical reasoning tasks (GSM8K). Student model explores reasoning paths guided by teacher ensembles and reward signals.
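MCTS over reasoning paths of this kind usually relies on a UCT-style selection rule to trade off exploring new steps against revisiting high-reward ones; below is a minimal sketch, with the node fields and exploration constant chosen purely for illustration rather than taken from the repository.

```python
# Illustrative UCT selection step for MCTS over reasoning steps (not the repo's code).
import math
from dataclasses import dataclass, field

@dataclass
class Node:
    text: str                       # partial reasoning path so far
    visits: int = 0
    value: float = 0.0              # accumulated reward from rollouts
    children: list = field(default_factory=list)

def uct_select(node: Node, c: float = 1.4) -> Node:
    """Pick the child maximizing mean value plus an exploration bonus."""
    return max(
        node.children,
        key=lambda ch: (ch.value / max(ch.visits, 1))
        + c * math.sqrt(math.log(node.visits + 1) / max(ch.visits, 1)),
    )
```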
Dataset management library for ML experiments—loaders for SciFact, FEVER, GSM8K, HumanEval, MMLU, TruthfulQA, HellaSwag; git-like versioning with lineage tracking; transformation pipelines; quality validation with schema checks and duplicate detection; GenStage streaming for large datasets. Built for reproducible AI research.