Urdu Text Summarization using m-BART

Welcome to the Urdu Text Summarization repository powered by m-BART! ✨

This project uses m-BART, a multilingual variant of the BART model, to generate concise and coherent summaries of Urdu text. It ships a fine-tuned m-BART model and a Flask-based web application that serves as a simple GUI.

About Model

m-BART (Multilingual BART) is an extension of the BART model pretrained on a large-scale multilingual dataset. BART is built for sequence-to-sequence tasks, and m-BART extends this for multilingual applications.
This model has been fine-tuned on ~67,000 Urdu news articles and is optimized specifically for Urdu summarization.

🚀 Getting Started

Model Checkpoints

You can manually download the model checkpoints and place them in a folder named ckpt inside the cloned repo.

If the checkpoints are missing or placed in the wrong location, the model and tokenizer are downloaded automatically from the Hugging Face Hub and saved locally in the ckpt directory.
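
The fallback behaviour described above might look roughly like the sketch below. It is an illustration only, not the repository's actual loader: the function names, the check for config.json, and the placeholder Hub ID are all assumptions made for the example.

```python
from pathlib import Path

# Hypothetical placeholder; substitute the real Hugging Face Hub repository ID.
HUB_MODEL_ID = "your-username/your-urdu-mbart"


def resolve_model_source(ckpt_dir="ckpt", hub_id=HUB_MODEL_ID):
    """Return the local checkpoint dir if it looks usable, else the Hub ID."""
    ckpt = Path(ckpt_dir)
    # Assumption: a usable checkpoint folder contains at least config.json.
    if ckpt.is_dir() and (ckpt / "config.json").is_file():
        return str(ckpt)
    return hub_id


def load_model_and_tokenizer(ckpt_dir="ckpt", hub_id=HUB_MODEL_ID):
    """Load from ckpt/ when present; otherwise download and save into ckpt/."""
    # Imported lazily so resolve_model_source works without transformers.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    source = resolve_model_source(ckpt_dir, hub_id)
    tokenizer = AutoTokenizer.from_pretrained(source)
    model = AutoModelForSeq2SeqLM.from_pretrained(source)
    if source == hub_id:
        # Keep a local copy so the next start-up skips the download.
        tokenizer.save_pretrained(ckpt_dir)
        model.save_pretrained(ckpt_dir)
    return model, tokenizer
```

Keeping the path check separate from the loading step makes it easy to see which source will be used before any large download starts.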

Docker Image

You can run this project using Docker for quick setup and deployment.

  1. Build the Docker image:

    docker build -t urdu-summarizer .
  2. Run the container. If you have checkpoints locally, you can optionally mount your local ckpt/ directory to avoid downloading the model files:

    docker run -v $(pwd)/ckpt:/app/ckpt -p 5000:5000 urdu-summarizer
  3. Visit http://localhost:5000 in your browser.

Env Setup

pip

pip install -r requirements.txt

conda

conda env create -f environment.yaml
conda activate urdu-summarizer

Running Web App

  1. Start the app:

    python app.py
  2. Open your browser and go to http://localhost:5000.
  3. Paste your Urdu text to get instant summaries.

Notebooks

The notebooks directory contains three Jupyter notebooks. These notebooks can be used to:

  • Load and run inference on the model.
  • Fine-tune m-BART on your own Urdu dataset.
  • Push the trained model to the Hugging Face Model Hub.
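
For orientation, the inference step could be sketched as below, assuming the model and tokenizer are standard Hugging Face Transformers objects. The function name `summarize`, the token limits, and the beam count are illustrative assumptions, not the settings used in the notebooks. Language-code handling for m-BART is omitted because it depends on how the checkpoint was fine-tuned.

```python
def summarize(text, model, tokenizer, max_input_tokens=512, max_summary_tokens=128):
    """Generate one summary string for one Urdu document."""
    # Tokenize and truncate to the assumed input budget.
    inputs = tokenizer(
        text,
        return_tensors="pt",
        truncation=True,
        max_length=max_input_tokens,
    )
    # Beam search with illustrative defaults (assumptions, not the repo's values).
    summary_ids = model.generate(
        **inputs,
        num_beams=4,
        max_new_tokens=max_summary_tokens,
    )
    # Decode the first (and only) generated sequence back to text.
    return tokenizer.batch_decode(summary_ids, skip_special_tokens=True)[0].strip()


# Example usage, assuming checkpoints are available in ckpt/:
#   from transformers import AutoModelForSeq2SeqLM, AutoTokenizer
#   tokenizer = AutoTokenizer.from_pretrained("ckpt")
#   model = AutoModelForSeq2SeqLM.from_pretrained("ckpt")
#   print(summarize(urdu_article_text, model, tokenizer))
```

Passing the model and tokenizer in as arguments keeps the function independent of where the weights came from, so the same code works with a local ckpt folder or a Hub download.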

App GUI Preview

App Screenshot

📬 Contact

For any questions or suggestions, please reach out at [email protected].
