04arvind/SiteScope

SiteScope

SiteScope is an AI-powered web scraping API built with Node.js and Express. It extracts website brand names and descriptions using Playwright, optionally refines text using Gemini AI, and stores the results in Supabase. The API is easy to use and can be tested via Postman.


Web Analyzer API

This project was developed as a Backend Development Internship assignment.

Tech Stack:

  • Node.js + Express (backend)
  • Supabase (PostgreSQL database)
  • Playwright (web scraping)
  • Gemini API (AI text enhancement)
  • Postman (API testing)

Features

  • Website Analysis API: Accepts a URL and returns the site's brand name and description, optionally refining the description with Gemini AI.
  • CRUD Operations: Store, retrieve, update, and delete website data.
  • Validation & Error Handling: Ensures input is valid and handles errors gracefully.
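As an illustration of the validation feature, the check on an analyze request could look roughly like the sketch below. This is an assumption, not the project's actual code: `validateAnalyzeBody` is a hypothetical helper, and the rules it enforces (an http/https `url`, an optional boolean `useAI`) are inferred from the request body documented under Analyze Website. It relies only on Node's built-in `URL` class.

```javascript
// Hypothetical helper: validates a request body for POST /api/analyze.
// Returns { ok: true, value } or { ok: false, error } instead of throwing.
function validateAnalyzeBody(body) {
  if (body === null || typeof body !== 'object') {
    return { ok: false, error: 'Request body must be a JSON object' };
  }
  if (typeof body.url !== 'string' || body.url.trim() === '') {
    return { ok: false, error: '"url" is required and must be a string' };
  }
  let parsed;
  try {
    // The URL constructor throws a TypeError on malformed input.
    parsed = new URL(body.url.trim());
  } catch {
    return { ok: false, error: '"url" is not a valid URL' };
  }
  if (parsed.protocol !== 'http:' && parsed.protocol !== 'https:') {
    return { ok: false, error: '"url" must use http or https' };
  }
  if (body.useAI !== undefined && typeof body.useAI !== 'boolean') {
    return { ok: false, error: '"useAI" must be a boolean when provided' };
  }
  // Assumption: AI enhancement is off unless explicitly requested.
  return { ok: true, value: { url: parsed.href, useAI: body.useAI === true } };
}
```

Returning a result object rather than throwing lets a route handler map failures to a 400 response in one place; whether the project does this is not stated in the README.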

Installation

  1. Clone the repository:
       git clone <repo-url>
       cd SiteScope
  2. Install dependencies:
       npm install
  3. Set up environment variables. Create a .env file in the root directory:
       SUPABASE_URL=<supabase-url>
       SUPABASE_KEY=<supabase-key>
       GEMINI_API_KEY=<gemini-api-key>
       PORT=
  4. Run the server:
       npm start

The server runs at http://localhost:3000 (adjust the port in the URL if you set a different PORT).
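A startup check for these variables might be sketched as follows. Everything here is hypothetical: `loadConfig` is not a function from the repository, treating all three keys as required is an assumption (the AI step is optional, so the project may well tolerate a missing `GEMINI_API_KEY`), and the fallback port of 3000 is inferred from the URL above. The sketch also assumes the `.env` values have already been loaded into `process.env` by whatever mechanism the project uses.

```javascript
// Hypothetical config loader for the variables listed in the .env example.
// Accepts an env object so it can be exercised without touching process.env.
function loadConfig(env = process.env) {
  // Assumption: all three keys are required at startup.
  const required = ['SUPABASE_URL', 'SUPABASE_KEY', 'GEMINI_API_KEY'];
  const missing = required.filter((name) => !env[name]);
  if (missing.length > 0) {
    throw new Error(`Missing environment variables: ${missing.join(', ')}`);
  }
  // Assumption: an empty or absent PORT falls back to 3000.
  const port = env.PORT ? Number(env.PORT) : 3000;
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new Error(`Invalid PORT: ${env.PORT}`);
  }
  return {
    supabaseUrl: env.SUPABASE_URL,
    supabaseKey: env.SUPABASE_KEY,
    geminiApiKey: env.GEMINI_API_KEY,
    port,
  };
}
```

Failing fast on a missing key surfaces configuration mistakes at startup instead of on the first request that reaches Supabase or Gemini.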


API Endpoints

1. Analyze Website

POST /api/analyze

Body:

{
  "url": "https://example.com",
  "useAI": true
}

Response:

{
  "brandName": "Example Brand",
  "description": "Enhanced description from Gemini AI"
}
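For callers who prefer a script to Postman, the request above could be wrapped in a small client like the one below. This is a sketch under assumptions: `analyzeWebsite` is a hypothetical name, the default base URL is the local address from the Installation section, and the error handling assumes the API signals failure through a non-2xx status (the README does not document error responses). The `fetchImpl` parameter defaults to the global `fetch` available in Node 18+.

```javascript
// Hypothetical client for POST /api/analyze.
// baseUrl and fetchImpl are injectable so the function can target any
// deployment and be exercised without a network connection.
async function analyzeWebsite(url, options = {}) {
  const {
    useAI = false,
    baseUrl = 'http://localhost:3000',
    fetchImpl = globalThis.fetch,
  } = options;

  const response = await fetchImpl(`${baseUrl}/api/analyze`, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ url, useAI }),
  });

  // Assumption: failures are reported with a non-2xx HTTP status.
  if (!response.ok) {
    throw new Error(`Analyze request failed with status ${response.status}`);
  }

  // Per the documented response: { brandName, description }.
  return response.json();
}
```

With the server running locally, a call such as `analyzeWebsite('https://example.com', { useAI: true })` would resolve to an object of the shape shown in the response above.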

2. CRUD Endpoints

  • GET /api/websites - Get all stored websites
  • GET /api/websites/:id - Get a specific website
  • POST /api/websites - Add a new website
  • PUT /api/websites/:id - Update website data
  • DELETE /api/websites/:id - Delete a website
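The five routes above can be captured in a small lookup table, which is handy when scripting requests instead of clicking through an API client. The sketch below is illustrative only: the action names (`list`, `get`, `create`, `update`, `remove`) and the `buildWebsiteRequest` helper are hypothetical, and the README does not specify which fields a website record contains, so the request body is passed through untouched.

```javascript
// Route table copied from the endpoint list; the action names are hypothetical.
const WEBSITE_ROUTES = {
  list: { method: 'GET', path: '/api/websites' },
  get: { method: 'GET', path: '/api/websites/:id' },
  create: { method: 'POST', path: '/api/websites' },
  update: { method: 'PUT', path: '/api/websites/:id' },
  remove: { method: 'DELETE', path: '/api/websites/:id' },
};

// Builds { method, path, body? } for an action, filling in :id when required.
function buildWebsiteRequest(action, { id, data } = {}) {
  if (!Object.prototype.hasOwnProperty.call(WEBSITE_ROUTES, action)) {
    throw new Error(`Unknown action: ${action}`);
  }
  const route = WEBSITE_ROUTES[action];
  const needsId = route.path.includes(':id');
  if (needsId && (id === undefined || id === null || id === '')) {
    throw new Error(`Action "${action}" requires an id`);
  }
  const path = needsId
    ? route.path.replace(':id', encodeURIComponent(String(id)))
    : route.path;
  const request = { method: route.method, path };
  // Only POST and PUT carry a JSON body.
  if (route.method === 'POST' || route.method === 'PUT') {
    request.body = JSON.stringify(data ?? {});
  }
  return request;
}
```

The returned object can be combined with a base URL and handed to any HTTP client; keeping the table separate from the transport makes it easy to check against the endpoint list.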

Testing

  • Test all API endpoints using Postman or any API client.
  • Ensure Supabase connection and Gemini API key are correctly configured.

Deployment

  • Deploy on any Node.js-compatible platform (e.g., Vercel, Render, Railway).
  • Ensure environment variables are set in the hosting environment.
