A simple Express.js API that scrapes data from public websites using Playwright. Includes endpoints for quotes, books, fake users, and products.

GowthamSelvaraj123/webscrapper

🌐 Web Scraping API with Playwright + Node.js

This project is a demo web scraping API built with Playwright and Express.js. It scrapes data from publicly available websites that are safe and legal to practice on.


🔥 Features

  • ✅ Quotes scraper from quotes.toscrape.com
  • ✅ Book title & price scraper from books.toscrape.com
  • ✅ HTML test scraping from httpbin.org
  • ✅ Product scraper from scrapeme.live/shop/
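
As an illustration of the first feature, a quotes scraper could be sketched roughly as follows. This is not the repository's actual code: the function names `extractQuotes` and `scrapeQuotes` are hypothetical, and the `.quote`, `.text`, and `.author` selectors are assumptions about the markup of quotes.toscrape.com that should be checked against the live page.

```javascript
// Runs inside the browser page: turns matched quote elements into plain objects.
// The '.text' and '.author' selectors are assumptions about the site's markup.
function extractQuotes(elements) {
  return elements.map((el) => ({
    text: el.querySelector('.text').textContent.trim(),
    author: el.querySelector('.author').textContent.trim(),
  }));
}

// Hypothetical scraper: opens the page, collects the quotes, always closes the browser.
async function scrapeQuotes(url = 'https://quotes.toscrape.com/') {
  // Loaded here so that importing this module does not require a browser up front.
  const { chromium } = require('playwright');
  const browser = await chromium.launch();
  try {
    const page = await browser.newPage();
    await page.goto(url);
    // evaluateAll passes every element matching '.quote' to extractQuotes.
    return await page.locator('.quote').evaluateAll(extractQuotes);
  } finally {
    await browser.close();
  }
}
```

Keeping the extraction step in its own function means it can be checked against stub elements without launching a browser, and the `finally` block keeps a failed navigation from leaving a browser process behind.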

🧠 REST API Structure (Educational Purpose)

This project also demonstrates how to organize and expose RESTful API routes in a backend using Express.js. It follows a basic REST pattern:

  • GET method for fetching scraped data
  • Separation of concerns using routes and controllers
  • Clean modular design suitable for scaling into larger applications

🗂 Routes

| Route | Description |
| --- | --- |
| /quote | Fetch quotes and authors |
| /books | Fetch list of books with price |
| /httpbin | Fetch title from basic static HTML |
| /product | Fetch product list from fake shop |

Each route is handled by a separate controller in the controller/ folder.
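
The route-to-controller split described above might be wired up along these lines. This is a minimal sketch under stated assumptions, not the project's real layout: `makeController`, `registerRoutes`, and the `{ success, data }` response shape are hypothetical, and `app` is any object with an Express-style `get(path, handler)` method.

```javascript
// Controller factory (hypothetical): wraps a scraper function in an Express-style handler.
// The controller knows how to produce a response, not which URL it is mounted on.
function makeController(scrape) {
  return async function controller(req, res) {
    try {
      const data = await scrape();
      // Response shape is an assumption made for this sketch.
      res.status(200).json({ success: true, data });
    } catch (err) {
      res.status(500).json({ success: false, error: err.message });
    }
  };
}

// Route registration (hypothetical): maps each GET path to its controller.
function registerRoutes(app, controllers) {
  for (const [path, handler] of Object.entries(controllers)) {
    app.get(path, handler);
  }
  return app;
}

// Usage with Express (assumes the `express` package is installed and that
// scraper functions such as scrapeQuotes exist elsewhere):
//   const app = registerRoutes(require('express')(), {
//     '/quote': makeController(scrapeQuotes),
//     '/books': makeController(scrapeBooks),
//   });
//   app.listen(3000); // port chosen only for illustration
```

Because the scraper is passed in as an argument, each controller can be exercised with a stub function instead of a real browser session.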


🚀 How to Run

1. Clone the repo

git clone https://github.com/your-username/playwright-scraping-api.git
cd playwright-scraping-api
