Llama2-Self-Aligned-Backtranslation: Reproducing "Self-Alignment with Instruction Backtranslation" (ACL 2023)
This repository provides a complete implementation of the paper Self-Alignment with Instruction Backtranslation, using Llama2-7B as the base model. The project covers backward model training, self-augmentation, quality curation with LLMs, and instruction fine-tuning, all made memory-efficient with LoRA.
- Goal: Train a model to predict instructions (x) from responses (y).
- Dataset: OpenAssistant-Guanaco training set (seed data).
- Techniques: LoRA fine-tuning with 4-bit quantization (memory-efficient).
- Model: llama2-7b-backward-model
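The core idea of the backward model is to invert the usual instruction-tuning direction: the response becomes the input and the instruction becomes the training target. A minimal sketch of that data construction is below; the prompt template and field names are illustrative assumptions, not the repo's exact format.

```python
# Sketch: turn a seed (instruction, response) pair into a backward-model
# training example, so the model learns p(instruction | response).
# NOTE: the template and dict keys here are hypothetical.

def make_backward_example(instruction: str, response: str) -> dict:
    """Format one seed pair with the response as input and the instruction as target."""
    prompt = (
        "Below is a response. Write the instruction it answers.\n\n"
        f"### Response:\n{response}\n\n### Instruction:\n"
    )
    return {"prompt": prompt, "completion": instruction}

example = make_backward_example(
    "Explain what LoRA is.",
    "LoRA adds small trainable low-rank matrices to frozen weights.",
)
```

Training on `prompt` → `completion` pairs like this is what lets the backward model later generate candidate instructions for unlabeled completions.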
- Process:
- Randomly sample 150 single-turn completions from the LIMA dataset.
- Generate instructions for these completions using the backward model.
- Filter out multi-turn dialogues (i.e., conversations with more than two turns), keeping only single-turn pairs.
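The single-turn filter in the last step can be sketched as a simple turn-count check. The `conversations` field name and list-of-turns layout are assumptions about how the LIMA records are stored.

```python
# Hypothetical sketch of the multi-turn filter: keep only records whose
# dialogue is a single user message plus a single reply (<= 2 turns).

def is_single_turn(record: dict) -> bool:
    """True if the record holds at most one user turn and one assistant turn."""
    return len(record.get("conversations", [])) <= 2

samples = [
    {"conversations": ["How do I sort a list in Python?", "Use sorted(my_list)."]},
    {"conversations": ["Hi", "Hello!", "Tell me more.", "Sure, here is more detail."]},
]
kept = [s for s in samples if is_single_turn(s)]
```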
- Used meta/llama-7b-chat-hf to score instruction-response pairs (1-5 scale).
- Selected high-quality examples (score ≥4) and discarded low-quality ones (score ≤2).
- Curated Dataset: backtranslated-lima-cleaned
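The curation step amounts to extracting a 1-5 rating from the scoring model's free-text reply and keeping pairs rated at or above the threshold. A hedged sketch follows; the "Score: N" output format the regex expects is an assumption about the scoring prompt, not the repo's exact convention.

```python
import re
from typing import Optional

def parse_score(model_output: str) -> Optional[int]:
    """Pull a 1-5 rating out of the scoring model's reply (assumes 'Score: N' style)."""
    match = re.search(r"[Ss]core\s*[:=]?\s*([1-5])", model_output)
    return int(match.group(1)) if match else None

def curate(scored_pairs, keep_threshold: int = 4):
    """Keep only instruction-response pairs rated at or above the threshold."""
    kept = []
    for pair, output in scored_pairs:
        score = parse_score(output)
        if score is not None and score >= keep_threshold:
            kept.append(pair)
    return kept

demo = [
    (("What is LoRA?", "A low-rank fine-tuning method."), "Score: 5"),
    (("asdf", "???"), "Score: 2"),
]
```

With these demo pairs, `curate(demo)` keeps only the pair scored 5, mirroring the ≥4 / ≤2 split described above.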
- Fine-tuned Llama2-7B on the curated dataset with LoRA, achieving better instruction-following capabilities.
- Final Model: llama2-instruction-aligned
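The memory-efficient fine-tuning setup described above can be sketched with `peft` and `transformers` config objects. The hyperparameter values below are illustrative guesses, not the repo's exact settings.

```python
import torch
from peft import LoraConfig
from transformers import BitsAndBytesConfig

# Illustrative (not the repo's exact) memory-efficient fine-tuning configs.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                    # quantize frozen base weights to 4-bit
    bnb_4bit_quant_type="nf4",            # NF4 quantization scheme
    bnb_4bit_compute_dtype=torch.bfloat16,
)

lora_config = LoraConfig(
    r=16,                                 # low-rank adapter dimension (assumed)
    lora_alpha=32,
    target_modules=["q_proj", "v_proj"],  # Llama 2 attention projections
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
)
```

Both configs would then be passed to `AutoModelForCausalLM.from_pretrained(..., quantization_config=bnb_config)` and `peft.get_peft_model(model, lora_config)` respectively.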
Install dependencies via:

```shell
pip install -r requirements.txt
```