WebPage Summarizer

Summary

This project takes a URL and returns a Markdown summary using either OpenAI API (gpt-4o-mini) or a local model (llama3.2) through Ollama.

Components

Environment Check: Verifies that the appropriate virtual environment (e.g. llms, venv) is active.
API Key Handling:
- Loads .env file from the project root.
- Checks that OPENAI_API_KEY exists and is correctly formatted.
Website Class: Downloads and parses the webpage using requests and BeautifulSoup. Strips out unnecessary tags (script, style, img, input) and extracts the main text.
Model Interfaces:
- OpenAI: Uses the openai Python SDK and gpt-4o-mini.
- Ollama: Sends a POST request to the local server at http://localhost:11434/api/chat.
Prompt Handling: Combines system and user prompts for summarization.
Display: Uses IPython.display.Markdown to show the output in a notebook.

Features

Accepts any public URL (e.g. articles, blog posts)
Supports both cloud-based and local language models
Extracts clean content from HTML
Returns summary in Markdown format
Displays results inside Jupyter Notebooks

Requirements

Python 3.8 or higher
Jupyter Lab or Notebook
Active virtual environment (llms, venv, etc.)
.env file containing:

OPENAI_API_KEY=sk-proj-...

Setup

Clone the repository:

git clone https://github.com/yourusername/webpage-summarizer.git
cd webpage-summarizer

Create and activate a virtual environment:

python -m venv venv
source venv/bin/activate
conda activate llms

Install dependencies:

pip install -r requirements.txt

Add your OpenAI API key:
Create a .env file in the parent directory of your notebook:

conda env create -f environment.yml
OPENAI_API_KEY=sk-proj-xxxxxxxxxxxxxxxxxxxxxxxxxxxx

Start Ollama (for local models):

ollama run llama3.2

Run Jupyter Lab:

jupyter lab

Example

display_summary("https://en.wikipedia.org/wiki/OpenAI")

This will generate a Markdown summary of the provided URL.

Switching Between OpenAI and Ollama

Using OpenAI

response = openai.chat.completions.create(
    model="gpt-4o",
    messages=[
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": USER_PROMPT}
    ]
)

Using Ollama

response = requests.post(
    "http://localhost:11434/api/chat",
    json={
        "model": "llama3",
        "messages": [
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": USER_PROMPT}
        ]
    }
)

Name		Name	Last commit message	Last commit date
Latest commit History 5 Commits
01_WebPageSummarizer.ipynb		01_WebPageSummarizer.ipynb
README.md		README.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

WebPage Summarizer

Summary

Components

Features

Requirements

Setup

Example

Switching Between OpenAI and Ollama

Using OpenAI

Using Ollama

About

Uh oh!

Releases

Packages

Languages

MinaZarifi2023/01WebPageSummarizer

Folders and files

Latest commit

History

Repository files navigation

WebPage Summarizer

Summary

Components

Features

Requirements

Setup

Example

Switching Between OpenAI and Ollama

Using OpenAI

Using Ollama

About

Topics

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages