SiteScope is an AI-powered web scraping API built with Node.js and Express. It extracts website brand names and descriptions using Playwright, optionally refines text using Gemini AI, and stores the results in Supabase. The API is easy to use and can be tested via Postman.
This project was developed as a Backend Development Internship assignment.
Tech Stack:
- Node.js + Express (backend)
- Supabase (PostgreSQL database)
- Playwright (web scraping)
- Gemini API (AI text enhancement)
- Postman (API testing)
Features:
- Website Analysis API: Accepts a URL and returns the brand name and description. Can optionally enhance text using AI.
- CRUD Operations: Store, retrieve, update, and delete website data.
- Validation & Error Handling: Ensures input is valid and handles errors gracefully.
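As an illustration of the validation feature, the sketch below checks a request body before any scraping starts. The field names `url` and `useAI` come from the request example in this README. The function name `validateAnalyzeBody`, the error messages, and the rule that only http and https URLs are accepted are assumptions made for this example, not the project's actual code.

```javascript
// Hypothetical helper: validates the JSON body sent to POST /api/analyze.
// Returns { ok, errors } instead of throwing so a route handler can decide
// how to respond.
function validateAnalyzeBody(body) {
  const errors = [];

  if (body === null || typeof body !== "object" || Array.isArray(body)) {
    return { ok: false, errors: ["Request body must be a JSON object."] };
  }

  if (typeof body.url !== "string" || body.url.trim() === "") {
    errors.push("`url` is required and must be a non-empty string.");
  } else {
    try {
      // The built-in URL class throws on strings it cannot parse.
      const parsed = new URL(body.url);
      if (parsed.protocol !== "http:" && parsed.protocol !== "https:") {
        errors.push("`url` must use http or https.");
      }
    } catch {
      errors.push("`url` is not a valid URL.");
    }
  }

  // `useAI` is optional, but must be a boolean when present.
  if (body.useAI !== undefined && typeof body.useAI !== "boolean") {
    errors.push("`useAI` must be a boolean when provided.");
  }

  return { ok: errors.length === 0, errors };
}
```

A route handler could call this first and return a 400 response with the `errors` array when `ok` is false.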
Setup:
- Clone the repository
  git clone <repo-url>
  cd SiteScope
- Install dependencies
  npm install
- Set up environment variables
  Create a .env file in the root directory:
  SUPABASE_URL=<supabase-url>
  SUPABASE_KEY=<supabase-key>
  GEMINI_API_KEY=<gemini-api-key>
  PORT=
- Run the server
  npm start
  Server runs at http://localhost:3000
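A startup check can catch a missing or mistyped environment variable before the first request arrives. The sketch below reads the variable names listed in the .env example. The function name `loadConfig`, the choice to treat all three keys as required, and the fallback to port 3000 (taken from the "Server runs at http://localhost:3000" note) are assumptions for illustration. How the project loads the .env file into `process.env` is not stated here, so that step is left out.

```javascript
// Variable names come from the .env example; everything else is a sketch.
const REQUIRED_VARS = ["SUPABASE_URL", "SUPABASE_KEY", "GEMINI_API_KEY"];

// Hypothetical helper: validates environment variables and returns a config object.
// `env` defaults to process.env but can be replaced with a plain object in tests.
function loadConfig(env = process.env) {
  const missing = REQUIRED_VARS.filter((name) => !env[name]);
  if (missing.length > 0) {
    throw new Error(`Missing environment variables: ${missing.join(", ")}`);
  }

  // Assumption: fall back to 3000 when PORT is unset or empty.
  const port = Number.parseInt(env.PORT || "3000", 10);
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new Error(`PORT must be an integer between 1 and 65535, got "${env.PORT}"`);
  }

  return {
    supabaseUrl: env.SUPABASE_URL,
    supabaseKey: env.SUPABASE_KEY,
    geminiApiKey: env.GEMINI_API_KEY,
    port,
  };
}
```

Calling `loadConfig()` once at startup and passing the result to the rest of the application keeps direct `process.env` reads in a single place.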
API Endpoints:
POST /api/analyze
Body:
{
  "url": "https://example.com",
  "useAI": true
}
Response:
{
  "brandName": "Example Brand",
  "description": "Enhanced description from Gemini AI"
}
- GET /api/websites - Get all stored websites
- GET /api/websites/:id - Get a specific website
- POST /api/websites - Add a new website
- PUT /api/websites/:id - Update website data
- DELETE /api/websites/:id - Delete a website
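The endpoints above can also be called from a script instead of Postman. The sketch below is a minimal client for POST /api/analyze, assuming Node.js 18 or later (for the built-in `fetch`) and a server running at http://localhost:3000. The helper names `buildAnalyzeRequest` and `analyzeSite` are hypothetical and not part of the project.

```javascript
// Assumption: the server from this README is running locally on port 3000.
const BASE_URL = "http://localhost:3000";

// Builds the endpoint and fetch options for POST /api/analyze.
// Kept separate from the network call so it can be inspected without a server.
function buildAnalyzeRequest(url, useAI = false, baseUrl = BASE_URL) {
  return {
    endpoint: `${baseUrl}/api/analyze`,
    options: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ url, useAI }),
    },
  };
}

// Sends the request and returns the parsed JSON response.
// The documented response shape is { brandName, description }.
async function analyzeSite(url, useAI = false, baseUrl = BASE_URL) {
  const { endpoint, options } = buildAnalyzeRequest(url, useAI, baseUrl);
  const response = await fetch(endpoint, options);
  if (!response.ok) {
    throw new Error(`Analyze request failed with status ${response.status}`);
  }
  return response.json();
}
```

With the server running, `analyzeSite("https://example.com", true).then(console.log)` should print an object with `brandName` and `description` fields. The same pattern extends to the /api/websites routes by changing the method and path.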
Testing:
- Test all API endpoints using Postman or any API client.
- Ensure the Supabase connection and Gemini API key are correctly configured.
Deployment:
- Deploy on any Node.js-compatible platform (e.g., Vercel, Render, Railway).
- Ensure environment variables are set in the hosting environment.