
Conversation

@albertvillanova (Member) commented Jan 8, 2026

Refactor KTO coordinated with DPO [a/N]: Remove encoder-decoder support.

This cleanup significantly simplifies the KTO trainer and makes subsequent refactoring much easier.

Part of:

Coordinated with:

Key Changes:

  1. KTOConfig

  • Removed the is_encoder_decoder parameter and its documentation
  • Removed the max_completion_length parameter (it was specific to encoder-decoder models) and its documentation

  2. KTOTrainer

    Initialization:

    • Added a clear error message when a user tries to use an encoder-decoder model
    • Removed the self.is_encoder_decoder attribute initialization
    • Removed the self.max_completion_length attribute setup
    • Hardcoded is_encoder_decoder=False in the DPODataCollatorWithPadding call

    Data Processing:

    • Simplified the _process_tokens() function: removed the entire encoder-decoder branch (~90 lines)
    • Kept only the causal LM tokenization logic

    Model Forward Pass:

    • Simplified get_batch_logps(): removed the is_encoder_decoder parameter
    • Always shift labels/logits by one position (causal LM only)
    • Updated all 4 call sites to remove the parameter

    Reference Model Computation:

    • Simplified compute_reference_log_probs(): removed the encoder-decoder branches
    • Simplified _compute_kl_logps(): removed the encoder-decoder conditional
    • Simplified forward(): removed the encoder-decoder model_kwargs
    • Simplified _compute_loss_liger(): removed the encoder-decoder branches for hidden states

  3. Error Handling

  • Users attempting to use encoder-decoder models now receive a clear error

  4. Tests

  • Updated the tests
  • Removed the commented-out encoder-decoder tests
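For readers unfamiliar with the change described under Model Forward Pass, the causal-LM-only log-prob computation with the one-position shift can be sketched roughly as follows. This is a minimal illustration with a hypothetical signature, not the exact TRL code; label_pad_token_id marks prompt and padding positions to ignore:

```python
import torch

def get_batch_logps(logits, labels, label_pad_token_id=-100):
    # Causal LM: the token at position t is predicted from positions < t,
    # so labels and logits are always shifted by one position. (With
    # encoder-decoder support removed, this shift is now unconditional.)
    labels = labels[:, 1:].clone()
    logits = logits[:, :-1, :]
    loss_mask = labels != label_pad_token_id
    # Give padded positions a dummy index so gather stays in range;
    # the mask zeroes out their contribution below.
    labels[labels == label_pad_token_id] = 0
    per_token_logps = torch.gather(
        logits.log_softmax(-1), dim=2, index=labels.unsqueeze(2)
    ).squeeze(2)
    # Sum per-token log-probs over the unmasked (completion) tokens.
    return (per_token_logps * loss_mask).sum(-1)
```

With only causal LMs supported, this unconditional shift replaces the former branch that skipped shifting for encoder-decoder models.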

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@albertvillanova albertvillanova mentioned this pull request Jan 8, 2026
@qgallouedec (Member)

Yes, as @albertvillanova says, I advocated for this change in #3906. It is an important change, so I would be interested to hear the opinions of @lewtun, @edbeeching, and @kashif, and ensure we are aligned on this decision for both KTO and DPO.

  pad_token_id=processing_class.pad_token_id,
  label_pad_token_id=args.label_pad_token_id,
- is_encoder_decoder=self.is_encoder_decoder,
+ is_encoder_decoder=False,
Member:
you could probably remove this

Member Author:
Done here and in other places.

@edbeeching (Collaborator)

> Yes, as @albertvillanova says, I advocated for this change in #3906. It is an important change, so I would be interested to hear the opinions of @lewtun, @edbeeching, and @kashif, and ensure we are aligned on this decision for both KTO and DPO.

Yes, I agree it is best to streamline the repo and cut features which (presumably) are not widely used.

@qgallouedec (Member) left a comment:

a few suggestions, otherwise lgtm!

  "label_pad_token_id": self.label_pad_token_id,
  "max_prompt_length": self.max_prompt_length,
- "max_completion_length": self.max_completion_length,
+ "max_completion_length": None,
Member:

Suggested change (remove this line):
- "max_completion_length": None,

"truncation_mode": trainer.truncation_mode,
"label_pad_token_id": trainer.label_pad_token_id,
"max_prompt_length": trainer.max_prompt_length,
"max_completion_length": None,
Member:

Suggested change (remove this line):
- "max_completion_length": None,

@qgallouedec (Member) commented Jan 9, 2026:

and remove all kwargs["max_completion_length"] in _process_tokens

EDIT: ignore the above, none remain

