Refactor KTO coordinated with DPO [a/N]: Remove encoder-decoder support #4792

albertvillanova · 2026-01-08T15:24:09Z

Refactor KTO coordinated with DPO [a/N]: Remove encoder-decoder support.

This cleanup significantly simplifies the KTO trainer and makes subsequent refactoring much easier.

Part of:

KTO refactoring #4786

Coordinated with:

[WIP] Refactor DPO #3906

Key Changes:

KTOConfig

- Removed the is_encoder_decoder parameter and its documentation
- Removed the max_completion_length parameter and its documentation (it was specific to encoder-decoder models)

KTOTrainer

Initialization:
- Added clear error message when user tries to use encoder-decoder model
- Removed self.is_encoder_decoder attribute initialization
- Removed self.max_completion_length attribute setup
- Hardcoded is_encoder_decoder=False in DPODataCollatorWithPadding call
Data Processing:
- Simplified _process_tokens() function - removed entire encoder-decoder branch (~90 lines)
- Kept only causal LM tokenization logic
Model Forward Pass:
- Simplified get_batch_logps(): removed is_encoder_decoder parameter
- Always shift labels/logits by one position (causal LM only)
- Updated all 4 calling sites to remove the parameter
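
To illustrate the "always shift" behavior described in the Model Forward Pass items, here is a minimal, dependency-free sketch. It is not the trainer's code: the real get_batch_logps() operates on batched tensors, while this version handles one sequence as plain lists. The names sequence_logp and log_softmax are hypothetical, and the -100 padding label is an assumption based on the common Hugging Face convention.

```python
import math

LABEL_PAD_TOKEN_ID = -100  # assumed padding label (common Hugging Face convention)


def log_softmax(row):
    """Numerically stable log-softmax over one list of logits."""
    m = max(row)
    log_z = m + math.log(sum(math.exp(x - m) for x in row))
    return [x - log_z for x in row]


def sequence_logp(logits, labels, label_pad_token_id=LABEL_PAD_TOKEN_ID):
    """Sum the log-probabilities of the label tokens for one sequence.

    logits: list of per-position lists, shape (seq_len, vocab_size)
    labels: list of token ids, shape (seq_len,)
    """
    if len(logits) != len(labels):
        raise ValueError("logits and labels must have the same sequence length")
    # Causal LM shift: the logits at position t score the token at t + 1,
    # so drop the last logits row and the first label.
    shifted_logits = logits[:-1]
    shifted_labels = labels[1:]
    total = 0.0
    for row, label in zip(shifted_logits, shifted_labels):
        if label == label_pad_token_id:
            continue  # masked positions (prompt or padding) do not contribute
        total += log_softmax(row)[label]
    return total
```

With no is_encoder_decoder flag, there is a single code path: the shift is unconditional, which is what lets every call site drop the parameter.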
Reference Model Computation:
- Simplified compute_reference_log_probs() - removed encoder-decoder branches
- Simplified _compute_kl_logps() - removed encoder-decoder conditional
- Simplified forward() - removed encoder-decoder model_kwargs
- Simplified _compute_loss_liger() - removed encoder-decoder branches for hidden states
Error Handling:
- Users attempting to use encoder-decoder models will now receive a clear error
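
A rough sketch of what such an early check could look like, for readers following along. This is an assumption-laden illustration, not the code in this PR: the function name ensure_not_encoder_decoder and the error message wording are hypothetical, and the models are replaced by simple stand-in objects that only carry the config.is_encoder_decoder flag.

```python
from types import SimpleNamespace


def ensure_not_encoder_decoder(model):
    """Raise early if the model's config marks it as encoder-decoder."""
    config = getattr(model, "config", None)
    # A missing config or missing flag is treated as "not encoder-decoder".
    if getattr(config, "is_encoder_decoder", False):
        raise ValueError(
            "Encoder-decoder models are not supported; "
            "please use a causal language model instead."
        )


# Stand-ins for real models: only the config flag matters for this check.
causal_lm = SimpleNamespace(config=SimpleNamespace(is_encoder_decoder=False))
seq2seq = SimpleNamespace(config=SimpleNamespace(is_encoder_decoder=True))

ensure_not_encoder_decoder(causal_lm)  # passes silently
try:
    ensure_not_encoder_decoder(seq2seq)
except ValueError as err:
    print(err)  # the hypothetical message defined above
```

Failing at initialization, rather than deep inside tokenization or the forward pass, means the user sees the unsupported-architecture message before any data processing starts.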

Tests

- Updated the tests
- Removed commented-out encoder-decoder tests

HuggingFaceDocBuilderDev · 2026-01-08T15:29:01Z

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.


qgallouedec · 2026-01-08T16:37:19Z

Yes, as @albertvillanova says, I advocated for this change in #3906. It is an important change, so I would be interested to hear the opinions of @lewtun, @edbeeching, and @kashif, and to ensure we are aligned on this decision for both KTO and DPO.

qgallouedec · 2026-01-08T17:06:50Z

trl/experimental/kto/kto_trainer.py

                pad_token_id=processing_class.pad_token_id,
                label_pad_token_id=args.label_pad_token_id,
-                is_encoder_decoder=self.is_encoder_decoder,
+                is_encoder_decoder=False,


you could probably remove this

Done here and in other places.


edbeeching · 2026-01-09T08:09:17Z

> Yes, as @albertvillanova says, I advocated for this change in #3906. It is an important change, so I would be interested to hear the opinions of @lewtun, @edbeeching, and @kashif, and to ensure we are aligned on this decision for both KTO and DPO.

Yes, I agree it is best to streamline the repo and cut features that (presumably) are not widely used.

qgallouedec

a few suggestions, otherwise lgtm!

qgallouedec · 2026-01-09T15:21:19Z

trl/experimental/kto/kto_trainer.py

                "label_pad_token_id": self.label_pad_token_id,
                "max_prompt_length": self.max_prompt_length,
-                "max_completion_length": self.max_completion_length,
+                "max_completion_length": None,


Suggested change

"max_completion_length": None,

qgallouedec · 2026-01-09T15:22:58Z

tests/experimental/test_kto_trainer.py

            "truncation_mode": trainer.truncation_mode,
            "label_pad_token_id": trainer.label_pad_token_id,
            "max_prompt_length": trainer.max_prompt_length,
+            "max_completion_length": None,


Suggested change

"max_completion_length": None,

and remove all kwargs["max_completion_length"] in _process_tokens

EDIT: ignore the above, none remain

albertvillanova added 4 commits January 8, 2026 16:17

Remove is_encoder_decoder from KTOConfig

02ba4c2

Remove max_completion_length from KTOConfig

d4a5e96

Remove encoder-decoder from KTOTrainer

c526524

Fix style

252d527

albertvillanova mentioned this pull request Jan 8, 2026

KTO refactoring #4786

Open

6 tasks

albertvillanova added 3 commits January 8, 2026 16:33

Fix test

8c18b8f

Remove commented encoder-decoder tests

fb91f6e

Merge remote-tracking branch 'upstream/main' into refactor-kto-coordi…

6862b6b

…nated-with-dpo-a

qgallouedec requested review from edbeeching, kashif and lewtun January 8, 2026 16:57

qgallouedec reviewed Jan 8, 2026


albertvillanova added 2 commits January 8, 2026 19:16

Remove unused is_encoder_decoder kwarg

2beec4d

Merge remote-tracking branch 'upstream/main' into refactor-kto-coordi…

22379be

…nated-with-dpo-a

qgallouedec approved these changes Jan 9, 2026
