amanmoon/fine-tuning-BERT

Mental-BERT Fine-tuning Comparison: LoRA vs. Weighted Full Fine-tuning

This project compares two fine-tuning methodologies, Weighted Full Fine-tuning and LoRA (Low-Rank Adaptation), applied to the specialized Mental-BERT-Base-Uncased model. The goal is to evaluate the performance, training efficiency, and parameter footprint of each method on a downstream mental health text classification task, the domain implied by the base model.


Base Model

The foundation of this work is the Mental-BERT-Base-Uncased model, a BERT-based model specifically pre-trained on mental health-related texts.

This specialized model is used because its domain-specific pre-training yields better understanding of mental health language, and better downstream performance, than general-purpose language models.


Fine-tuning Methods

1. Weighted Full Fine-tuning

This is the standard approach where the entire pre-trained model is trained on the new dataset. Weighted Full Fine-tuning specifically incorporates sample weights (or class weights) into the loss function to handle potential class imbalance in the training data, ensuring the model pays adequate attention to underrepresented classes.

  • Approach: All parameters in the pre-trained backbone are updated.
  • Key Feature: Use of weights in the loss function to mitigate class imbalance.
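As a concrete sketch of the weighting step, class weights are commonly derived from inverse label frequency. The label counts below are hypothetical (the repository does not publish its class distribution), and the resulting weights would typically be passed to the loss function, e.g. via the `weight` argument of `torch.nn.CrossEntropyLoss`:

```python
from collections import Counter

# Hypothetical imbalanced label distribution: 900 examples of class 0,
# 100 examples of class 1 (illustrative numbers only).
labels = [0] * 900 + [1] * 100

counts = Counter(labels)
n_classes = len(counts)
total = len(labels)

# Inverse-frequency weighting, normalised so the weights average to 1:
# underrepresented classes receive proportionally larger weights,
# so the model is penalised more for errors on them.
class_weights = {c: total / (n_classes * counts[c]) for c in sorted(counts)}
# class 1 ends up with 9x the weight of class 0
```

In PyTorch these values would be wrapped in a tensor and handed to the loss, e.g. `torch.nn.CrossEntropyLoss(weight=torch.tensor([0.56, 5.0]))`, which is the standard mechanism for class-weighted cross-entropy.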

2. LoRA (Low-Rank Adaptation) Fine-tuning

LoRA is a parameter-efficient fine-tuning (PEFT) technique. It freezes the pre-trained model weights and injects trainable rank decomposition matrices (A and B) into the transformer layers. This drastically reduces the number of trainable parameters while retaining the performance of the full model.

  • Approach: Only the newly added low-rank matrices are updated; the original model weights are frozen.
  • Key Feature: Significantly reduced memory usage and faster training due to fewer trainable parameters.
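A back-of-the-envelope count shows why LoRA's trainable fraction is so small. The hidden size below matches BERT-base; the rank r = 8 is an assumed example value, not a setting taken from this repository:

```python
# Trainable-parameter count for LoRA applied to one d x d attention
# projection matrix. d is BERT-base's hidden size; r is an assumed rank.
d = 768   # BERT-base hidden size
r = 8     # assumed LoRA rank (hypothetical)

full_params = d * d        # parameters updated when fully fine-tuning this matrix
lora_params = 2 * d * r    # the low-rank factors A (r x d) and B (d x r)

fraction = lora_params / full_params  # roughly 0.02, i.e. ~2% of the matrix
```

The per-matrix fraction depends on the chosen rank and target modules, and a model-wide figure (such as the 4.67% reported below) also counts the classification head and any other unfrozen parameters, so the two numbers need not match. In practice this setup is configured through a library such as HuggingFace PEFT, whose `LoraConfig` exposes the rank and target modules.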

Results and Comparison

The following figures illustrate the performance metrics (e.g., accuracy, F1-score, loss) achieved by both fine-tuning methods on the test set.

Weighted Full Fine-tuning Results

Results from Weighted Full Fine-tuning

LoRA Fine-tuning Results

Results from LoRA Fine-tuning


Conclusion

The analysis of the results demonstrates the efficiency and effectiveness of the LoRA approach.

LoRA achieved results comparable to full fine-tuning while training only 4.67% of the parameters. This validates LoRA as a highly efficient method for adapting large, specialized language models like Mental-BERT to new tasks without the computational cost of updating the entire model.
