Welcome to the Urdu Text Summarization repository powered by m-BART! ✨
This project is based on a multilingual variant of the BART model, designed to generate concise and coherent summaries of Urdu text. It uses a fine-tuned m-BART model and offers a Flask-based web application as a simple GUI for interaction.
m-BART (Multilingual BART) is an extension of the BART model pretrained on a large-scale multilingual dataset. BART is built for sequence-to-sequence tasks, and m-BART extends this for multilingual applications.
This model has been trained on ~67,000 Urdu news articles and is optimized specifically for Urdu summarization tasks.
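As a rough sketch of what inference with the fine-tuned checkpoint looks like (the generation parameters and the loading classes below are illustrative assumptions, not the repo's exact code):

```python
def summarize(text, tokenizer, model, max_summary_len=128):
    """Generate a summary for one Urdu document with an mBART-style seq2seq model."""
    # Tokenize the source article, truncating long inputs to the encoder limit.
    inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=1024)
    # Beam search tends to give more coherent abstractive summaries than greedy decoding.
    summary_ids = model.generate(
        **inputs,
        num_beams=4,
        max_length=max_summary_len,
        early_stopping=True,
    )
    return tokenizer.decode(summary_ids[0], skip_special_tokens=True)

# Typical loading (requires the transformers package; commented out to keep
# this sketch dependency-free -- the ckpt path matches the repo's convention):
# from transformers import MBartForConditionalGeneration, AutoTokenizer
# tokenizer = AutoTokenizer.from_pretrained("ckpt")
# model = MBartForConditionalGeneration.from_pretrained("ckpt")
```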
You can manually download the model checkpoints and place them in a folder named `ckpt` inside the cloned repo. If the checkpoints are missing or placed incorrectly, the model and tokenizer are downloaded automatically from the Hugging Face Hub and saved locally in the `ckpt` directory.
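The fallback can be sketched like this (a minimal illustration: the Hub id `facebook/mbart-large-cc25` is a placeholder, and the actual app may check for different files):

```python
import os

def resolve_checkpoint(ckpt_dir="ckpt", hub_id="facebook/mbart-large-cc25"):
    """Return the local ckpt directory if it holds model files, else the Hub id.

    Passing the returned value to from_pretrained() either loads the local
    checkpoint or triggers a download from the Hugging Face Hub, after which
    the app can persist it locally with save_pretrained(ckpt_dir).
    """
    # Treat the directory as a valid checkpoint only if a config file exists.
    if os.path.isfile(os.path.join(ckpt_dir, "config.json")):
        return ckpt_dir
    return hub_id
```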
You can run this project using Docker for quick setup and deployment.
Build the Docker image:
```
docker build -t urdu-summarizer .
```

Run the container. If you have checkpoints locally, you can optionally mount your local `ckpt/` directory to avoid downloading the model files:

```
docker run -v $(pwd)/ckpt:/app/ckpt -p 5000:5000 urdu-summarizer
```

Visit http://localhost:5000 in your browser.
Using pip:

```
pip install -r requirements.txt
```

Using conda:

```
conda env create -f environment.yaml
conda activate urdu-summarizer
```

- Start the app:

  ```
  python app.py
  ```

- Open your browser and go to http://localhost:5000.
- Paste your Urdu text to get instant summaries.
The notebooks directory contains three Jupyter notebooks. These notebooks can be used to:
- Load and run inference on the model.
- Fine-tune m-BART on your own Urdu dataset.
- Push the trained model to the Hugging Face Model Hub.
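The upload step in the third notebook presumably relies on transformers' built-in `push_to_hub` helper; a minimal sketch (the repo id here is a placeholder, not the project's real Hub repository):

```python
def push_checkpoint(model, tokenizer, repo_id="your-username/urdu-mbart-summarizer"):
    """Upload a fine-tuned model and its tokenizer to the Hugging Face Hub.

    Requires authentication beforehand, e.g. via `huggingface-cli login`.
    """
    model.push_to_hub(repo_id)      # creates the Hub repo on first push
    tokenizer.push_to_hub(repo_id)  # tokenizer files go to the same repo
    return repo_id
```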
For any questions or suggestions, please reach out at [email protected].
