A comprehensive toolkit for optimizing code generation prompts using 8 different prompt engineering techniques.
This project implements and benchmarks various prompt optimization methods for improving LLM-based code generation. All methods are designed to work with Ollama for local, private, and free LLM inference.
- Features
- Methods Implemented
- Installation
- Quick Start
- Usage Examples
- Benchmarking
- Project Structure
- API Reference
- Contributing
- License
- 8 Prompt Engineering Methods - From genetic algorithms to reinforcement learning
- Modular Architecture - Easy to extend with new methods
- Built-in Benchmarking - Compare all methods side by side
- Code Quality Evaluation - Syntax, functionality, readability metrics
- CLI Interface - Easy command-line usage
- Fitness Tracking - Monitor optimization progress
- Local LLM Support - Works with Ollama for privacy
| Method | Description | Type |
|---|---|---|
| AutoPrompt-GA | Genetic algorithm-based prompt evolution | Evolutionary |
| OPRO | Optimization by PROmpting (DeepMind) | Meta-prompting |
| Chain-of-Thought | Step-by-step reasoning chains | Reasoning |
| Tree-of-Thought | Tree-based exploration of reasoning paths | Search |
| Few-Shot | Example-based learning | In-context Learning |
| Prompt Tuning | Soft prompt optimization | Tuning |
| Prefix Tuning | Prefix-based prompt optimization | Tuning |
| Prompt-OIRL | Q-Learning based optimization | Reinforcement Learning |
AutoPrompt-GA (Genetic Algorithm)
Uses evolutionary principles to optimize prompts:
- Selection: Tournament selection of best prompts
- Crossover: Combines successful prompt elements
- Mutation: Random modifications for exploration
- Elitism: Preserves top performers
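As an illustration only (not the repository's implementation in `autoprompt_ga.py`), a genetic-algorithm loop over prompt strings might look roughly like this; `evaluate_prompt`, `crossover`, and `mutate` are hypothetical helpers standing in for the real operators:

```python
import random

def run_ga(seed_prompts, evaluate_prompt, crossover, mutate,
           generations=5, population_size=10, elite=2):
    """Toy GA loop over prompt strings: evaluate, keep elites, breed the rest."""
    population = list(seed_prompts)
    for _ in range(generations):
        scores = {p: evaluate_prompt(p) for p in population}
        ranked = sorted(population, key=scores.get, reverse=True)
        next_gen = ranked[:elite]  # elitism: carry the best prompts over unchanged
        while len(next_gen) < population_size:
            # Tournament selection: the better of two random candidates becomes a parent.
            parent_a = max(random.sample(ranked, 2), key=scores.get)
            parent_b = max(random.sample(ranked, 2), key=scores.get)
            child = crossover(parent_a, parent_b)  # combine successful prompt elements
            if random.random() < 0.3:
                child = mutate(child)              # random modification for exploration
            next_gen.append(child)
        population = next_gen
    final_scores = {p: evaluate_prompt(p) for p in population}
    return max(population, key=final_scores.get)
```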
OPRO (Optimization by PROmpting)
Meta-prompting approach where the LLM itself suggests prompt improvements:
- Analyzes current prompt performance
- Generates improvement suggestions
- Iteratively refines based on feedback
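A rough sketch of one such iteration, reusing the `client.generate()` call shown in the API Reference below; the meta-prompt wording and the `evaluate_prompt` helper are illustrative, not what `opro.py` actually uses:

```python
def opro_step(client, evaluate_prompt, history):
    """One OPRO-style iteration: show past prompts and scores, ask the LLM for a better one."""
    trajectory = "\n".join(f"prompt: {p!r}  score: {s:.3f}" for p, s in history)
    meta_prompt = (
        "Below are code-generation prompts with their evaluation scores.\n"
        f"{trajectory}\n"
        "Write a new prompt that should achieve a higher score. Return only the prompt."
    )
    candidate = client.generate(meta_prompt, max_tokens=150, temperature=0.7).strip()
    history.append((candidate, evaluate_prompt(candidate)))
    return max(history, key=lambda pair: pair[1])  # best (prompt, score) so far
```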
Chain-of-Thought (CoT)
Encourages step-by-step reasoning:
- Problem analysis → Algorithm design → Implementation → Verification
- Reduces errors through structured thinking
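For example, a CoT-style wrapper around a task might look like this (illustrative wording, not the exact template in `chain_of_thought.py`):

```python
COT_TEMPLATE = """You are solving a programming task. Reason step by step:
1. Analyze the problem, its inputs, and its outputs.
2. Design the algorithm.
3. Implement it in Python.
4. Verify the code against the examples before giving your final answer.

Task: {task}
"""

prompt = COT_TEMPLATE.format(task="Write a function to calculate factorial")
```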
Tree-of-Thought (ToT)
Explores multiple reasoning paths simultaneously:
- Branching factor for exploring alternatives
- Beam search for pruning unpromising paths
- Optimal path selection based on evaluation
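A toy sketch of the search (not the actual `tree_of_thought.py` code); `expand` and `score` are hypothetical helpers that generate and evaluate prompt variants:

```python
def tot_search(root_prompt, expand, score, branching_factor=3, max_depth=3, beam_width=2):
    """Toy beam search over prompt variants: expand each survivor, keep only the best few."""
    beam = [root_prompt]
    for _ in range(max_depth):
        candidates = []
        for node in beam:
            # Branching: generate several alternative refinements of this prompt.
            candidates.extend(expand(node, branching_factor))
        # Pruning: keep only the most promising paths (beam search).
        beam = sorted(candidates, key=score, reverse=True)[:beam_width]
    return max(beam, key=score)  # optimal path selection based on evaluation
```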
Few-Shot Learning
Provides examples of good code:
- Demonstrates expected output format
- Shows coding style and patterns
- Context-based learning
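For instance, a few-shot prompt could be assembled like this (example tasks chosen arbitrarily, not taken from `few_shot.py`):

```python
FEW_SHOT_EXAMPLES = """# Task: reverse a string
def reverse_string(s):
    return s[::-1]

# Task: check whether a number is even
def is_even(n):
    return n % 2 == 0
"""

def build_few_shot_prompt(task):
    # Prepend worked examples so the model imitates their style and output format.
    return f"{FEW_SHOT_EXAMPLES}\n# Task: {task}\n"
```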
Prompt Tuning
Optimizes soft prompt tokens:
- Explores combinations of descriptive adjectives
- Finds optimal prompt formulations
- Task-specific tuning
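A minimal sketch of that idea, searching over adjective combinations with a hypothetical `evaluate_prompt` scorer (the real `prompt_tuning.py` may differ):

```python
from itertools import combinations

ADJECTIVES = ["concise", "efficient", "well-documented", "idiomatic", "robust"]

def tune_prompt(task, evaluate_prompt, k=2):
    """Try every pair of descriptive adjectives and keep the best-scoring formulation."""
    candidates = [
        f"Write {' and '.join(pair)} Python code. {task}"
        for pair in combinations(ADJECTIVES, k)
    ]
    return max(candidates, key=evaluate_prompt)
```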
Prefix Tuning
Optimizes prompt prefixes:
- Tests different expert personas
- Finds effective prompt structures
- Role-based optimization
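The same pattern applied to persona prefixes, again with a hypothetical `evaluate_prompt` scorer rather than the actual `prefix_tuning.py` logic:

```python
PREFIXES = [
    "You are a senior Python developer who writes production-quality code.",
    "You are a competitive programmer focused on correctness and efficiency.",
    "You are a code reviewer who values readability and documentation.",
]

def tune_prefix(task, evaluate_prompt):
    """Score each expert persona as a prefix and return the most effective prompt."""
    candidates = [f"{prefix}\n{task}" for prefix in PREFIXES]
    return max(candidates, key=evaluate_prompt)
```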
Prompt-OIRL (Reinforcement Learning)
Q-Learning based optimization:
- State space: Code quality levels
- Action space: Prompt modifications
- Reward: Score improvement
- Exploration vs exploitation balance
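A compact sketch of the underlying Q-learning update (illustrative only; the states, actions, and hyperparameters in `prompt_OIRL.py` may differ), where `apply_action` is a hypothetical helper that rewrites the prompt and returns the new quality level plus the score improvement:

```python
import random
from collections import defaultdict

ACTIONS = ["add_examples", "add_constraints", "ask_for_tests", "simplify_wording"]
q_table = defaultdict(float)  # maps (quality_level, action) -> expected improvement

def q_learning_step(state, apply_action, epsilon=0.2, alpha=0.5, gamma=0.9):
    """One Q-learning update: pick a prompt modification, observe the score improvement."""
    # Epsilon-greedy: explore a random action or exploit the best-known one.
    if random.random() < epsilon:
        action = random.choice(ACTIONS)
    else:
        action = max(ACTIONS, key=lambda a: q_table[(state, a)])
    next_state, reward = apply_action(state, action)  # reward = score improvement
    best_next = max(q_table[(next_state, a)] for a in ACTIONS)
    q_table[(state, action)] += alpha * (reward + gamma * best_next - q_table[(state, action)])
    return next_state
```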
Prerequisites:
- Python 3.8+
- Ollama installed and running
```bash
# Clone the repository
git clone https://github.com/kadiryonak/PromptEngineering_Methods.git
cd PromptEngineering_Methods

# Install dependencies
pip install -r requirements.txt

# Pull a code-focused model (recommended)
ollama pull codellama:7b

# Start Ollama server (if not running)
ollama serve
```

```bash
cd prompt_engineering_methods

# List all available methods
python main.py --list-methods

# Run all methods with 3 generations each
python main.py --method all --generations 3

# Run a specific method
python main.py --method autoprompt --generations 5

# Save results to JSON
python main.py --method all --output results.json
```
```python
from ollama_client import OllamaClient
from prompt_engineering import AutoPromptCodeGA, ChainOfThoughtCode

# Initialize client
client = OllamaClient(model="codellama:7b")

# Sample dataset
dataset = [
    {
        "task": "Write a function to calculate factorial",
        "test_cases": [
            {"function": "factorial", "input": [5], "expected": 120}
        ]
    }
]

# Run AutoPrompt-GA
optimizer = AutoPromptCodeGA(client, dataset)
best_prompt, score = optimizer.optimize(generations=5)

print(f"Best prompt: {best_prompt}")
print(f"Score: {score:.4f}")
```

```python
from prompt_engineering.tree_of_thought import TreeOfThoughtCode

# Initialize with custom parameters
tot = TreeOfThoughtCode(
    client,
    dataset,
    branching_factor=3,  # Explore 3 alternatives at each node
    max_depth=3          # Tree depth
)

best_prompt, score = tot.optimize(generations=4)
```

```python
from code_evaluator import CodeQualityEvaluator

evaluator = CodeQualityEvaluator()

code = """
def factorial(n):
    if n <= 1:
        return 1
    return n * factorial(n - 1)
"""

# Check syntax
is_valid, message = evaluator.syntax_check(code)
print(f"Syntax valid: {is_valid}")

# Get quality metrics
metrics = evaluator.code_quality_metrics(code)
print(f"Readability: {metrics['readability']:.2f}")
print(f"Best practices: {metrics['best_practices']:.2f}")
```

Run comprehensive benchmarks comparing all methods:
```bash
# Run benchmark
python benchmark.py --generations 3

# Save results
python benchmark.py --output results.json --report report.md
```

| Rank | Method | Best Score | Avg Score | Conv. Speed | Time (s) |
|---|---|---|---|---|---|
| 1 | AutoPrompt-GA | 0.7850 | 0.6234 | 3 | 45.2 |
| 2 | Tree-of-Thought | 0.7623 | 0.6012 | 4 | 52.1 |
| 3 | OPRO | 0.7412 | 0.5891 | 5 | 38.7 |
| 4 | Chain-of-Thought | 0.7234 | 0.5678 | 2 | 28.3 |
| 5 | Few-Shot | 0.7012 | 0.5543 | 1 | 22.1 |
```
PromptEngineering_Methods/
├── prompt_engineering_methods/
│   ├── main.py                  # CLI entry point
│   ├── benchmark.py             # Benchmarking system
│   ├── ollama_client.py         # Ollama API client
│   ├── code_evaluator.py        # Code quality evaluation
│   ├── base_optimizer.py        # Base optimizer class
│   ├── metrics.py               # Evaluation metrics
│   └── prompt_engineering/
│       ├── __init__.py
│       ├── autoprompt_ga.py     # Genetic Algorithm
│       ├── opro.py              # OPRO method
│       ├── chain_of_thought.py  # Chain-of-Thought
│       ├── tree_of_thought.py   # Tree-of-Thought
│       ├── few_shot.py          # Few-Shot Learning
│       ├── prompt_tuning.py     # Prompt Tuning
│       ├── prefix_tuning.py     # Prefix Tuning
│       └── prompt_OIRL.py       # RL-based optimization
├── requirements.txt
├── .gitignore
└── README.md
```
```python
from ollama_client import OllamaClient

client = OllamaClient(
    base_url="http://localhost:11434",
    model="codellama:7b"
)

# Generate code
prompt = "Write a Python function to calculate factorial"
response = client.generate(prompt, max_tokens=300, temperature=0.1)

# Check availability
is_running = client.is_available()
```

All methods inherit from this base class:

```python
from typing import Tuple

from base_optimizer import BaseCodePromptOptimizer

class YourMethod(BaseCodePromptOptimizer):
    def __init__(self, ollama_client, dataset, name):
        super().__init__(ollama_client, dataset, name)

    def optimize(self, generations=5) -> Tuple[str, float]:
        # Your optimization logic
        return best_prompt, best_score
```

```python
from code_evaluator import CodeQualityEvaluator

evaluator = CodeQualityEvaluator()

# Extract code from LLM response (text = raw model output)
code = evaluator.extract_code_block(text)

# Syntax validation
is_valid, message = evaluator.syntax_check(code)

# Functional testing (test_cases structured like the dataset entries above)
score = evaluator.functional_test(code, test_cases)

# Quality metrics
metrics = evaluator.code_quality_metrics(code)
# Returns: readability, complexity, documentation, best_practices
```

Contributions are welcome! Here's how you can help:
- Fork the repository
- Create a feature branch (`git checkout -b feature/new-method`)
- Implement your changes
- Test thoroughly
- Submit a pull request
To add a new optimization method:
- Create a new file in `prompt_engineering/`
- Inherit from `BaseCodePromptOptimizer`
- Implement the `optimize()` method
- Add it to the `__init__.py` exports
- Register it in the `METHODS` dict in `main.py`
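A minimal, hypothetical sketch of such a method and its registration; the class body and the exact shape of the `METHODS` dict in `main.py` may differ:

```python
# prompt_engineering/my_method.py
from typing import Tuple

from base_optimizer import BaseCodePromptOptimizer

class MyMethod(BaseCodePromptOptimizer):
    def optimize(self, generations=5) -> Tuple[str, float]:
        best_prompt, best_score = "Write clean, tested Python code.", 0.0
        # ... your optimization loop: generate candidates, evaluate, keep the best ...
        return best_prompt, best_score

# main.py (illustrative registration)
# METHODS["mymethod"] = MyMethod
```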
This project is licensed under the MIT License - see the LICENSE file for details.
- Ollama for local LLM inference
- DeepMind OPRO Paper for optimization methodology
- Tree of Thoughts Paper for reasoning framework
Made with ❤️ by kadiryonak