This project explores two distinct fine-tuning methodologies, Weighted Full Fine-tuning and LoRA (Low-Rank Adaptation), applied to the specialized Mental-BERT-Base-Uncased model. The goal is to compare the performance, training efficiency, and parameter footprint of the two methods on a downstream mental health text classification task, the domain implied by the base model.
The foundation of this work is the Mental-BERT-Base-Uncased model, a BERT-based model specifically pre-trained on mental health-related texts.
- Hugging Face Link: mental/mental-bert-base-uncased
This specialized model is leveraged to ensure better domain-specific understanding and performance compared to general-purpose language models.
This is the standard approach, in which every parameter of the pre-trained model is updated on the new dataset. Weighted Full Fine-tuning additionally incorporates sample (or class) weights into the loss function to counter class imbalance in the training data, ensuring the model pays adequate attention to underrepresented classes.
- Approach: All parameters in the pre-trained backbone are updated.
- Key Feature: Use of weights in the loss function to mitigate class imbalance.
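As a minimal sketch of how class weighting enters the loss (the label counts and the inverse-frequency weighting scheme below are illustrative assumptions, not the project's actual data or configuration), weights can be passed directly to PyTorch's `CrossEntropyLoss`:

```python
import torch
import torch.nn as nn

# Illustrative, imbalanced label distribution (NOT the project's data)
labels = torch.tensor([0] * 70 + [1] * 20 + [2] * 10)

# Inverse-frequency class weights, scaled so the average
# per-sample weight over this dataset equals 1
counts = torch.bincount(labels).float()
weights = counts.sum() / (len(counts) * counts)

# The weighted criterion up-weights errors on minority classes
loss_fn = nn.CrossEntropyLoss(weight=weights)

logits = torch.randn(8, 3)           # dummy model outputs
targets = torch.randint(0, 3, (8,))  # dummy gold labels
loss = loss_fn(logits, targets)
```

With the Hugging Face `Trainer`, the same idea is usually implemented by subclassing it and overriding `compute_loss` to apply a weighted criterion like this one.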
LoRA is a parameter-efficient fine-tuning (PEFT) technique. It freezes the pre-trained model weights and injects trainable rank-decomposition matrices (A and B) into the transformer layers, drastically reducing the number of trainable parameters while largely retaining the performance of full fine-tuning.
- Approach: Only the newly added low-rank matrices are updated; the original model weights are frozen.
- Key Feature: Significantly reduced memory usage and faster training due to fewer trainable parameters.
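The rank-decomposition idea can be sketched in a few lines of PyTorch (the dimensions, rank, and scaling factor below are illustrative choices, not the project's exact configuration):

```python
import torch

d, r, alpha = 768, 8, 16      # hidden size, LoRA rank, scaling (illustrative)
W = torch.randn(d, d)         # frozen pre-trained weight, never updated

A = torch.randn(r, d) * 0.01  # trainable down-projection, small random init
B = torch.zeros(d, r)         # trainable up-projection, zero init

x = torch.randn(d)
# LoRA forward pass: frozen path plus scaled low-rank update
h = W @ x + (alpha / r) * (B @ (A @ x))

# Only A and B are trained: 2 * r * d parameters per adapted matrix
trainable = A.numel() + B.numel()
```

Because B is initialized to zero, the adapted layer starts out identical to the frozen one, so training begins from the pre-trained model's behavior. In practice this is configured through the `peft` library's `LoraConfig` and `get_peft_model` rather than written by hand.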
The following figures illustrate the performance metrics (e.g., accuracy, F1-score, loss) achieved by both fine-tuning methods on the test set.
The analysis of the results demonstrates the efficiency and effectiveness of the LoRA approach.
LoRA achieved results comparable to full fine-tuning while training only 4.67% of the parameters. This validates LoRA as a highly efficient method for adapting large, specialized language models like Mental-BERT to new tasks without the computational burden of updating the entire model.
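Trainable-parameter percentages such as the 4.67% reported above can be verified with a small helper (the function name is ours, and the toy two-layer model is purely for demonstration, not Mental-BERT):

```python
import torch.nn as nn

def trainable_fraction(model: nn.Module) -> float:
    """Return the percentage of parameters with requires_grad=True."""
    total = sum(p.numel() for p in model.parameters())
    trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
    return 100.0 * trainable / total

# Toy demo: two identical layers, one frozen -> 50% trainable
model = nn.Sequential(nn.Linear(10, 10), nn.Linear(10, 10))
for p in model[0].parameters():
    p.requires_grad = False
```

Applied to the PEFT-wrapped model, the same computation is what `peft`'s `print_trainable_parameters()` reports.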

