This project is a web scraper designed to extract book data from a specified website. The scraper uses Selenium to navigate the website, locate book details, and save the extracted data into a JSON file.
- Extracts book details such as title, thumbnail, author, published date, description, page count, genre, and language.
- Handles cookies and navigation through multiple pages.
- Saves the extracted data incrementally to avoid data loss in case of interruptions.
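The incremental save could look roughly like the sketch below. It is not taken from this repository: the helper name `append_book` is hypothetical, and it assumes the output file holds a single JSON list of book objects. Each call rewrites the file through a temporary file, so an interruption mid-write should leave the previous version intact.

```python
import json
import os
import tempfile


def append_book(path, book):
    """Append one book record to the JSON list stored at `path`.

    Hypothetical helper: assumes the file, if present, contains a JSON list.
    Returns the number of records stored after the append.
    """
    books = []
    if os.path.exists(path):
        with open(path, "r", encoding="utf-8") as f:
            books = json.load(f)
    books.append(book)

    # Write to a temporary file in the same directory, then swap it into
    # place so a crash during the write does not corrupt the existing file.
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as f:
        json.dump(books, f, ensure_ascii=False, indent=2)
    os.replace(tmp_path, path)
    return len(books)
```

Calling `append_book("book_data.json", {...})` after each scraped book keeps the file current, at the cost of rewriting the whole list every time. That cost is usually acceptable for small datasets.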
- Python 3.x
- Selenium
- Chrome WebDriver
- Clone the repository:
git clone https://github.com/michaelvbend/book-scraper.git
cd webscraper
- Install Python packages:
pip install -r requirements.txt
- Run the scraper:
python main.py
- Output: The scraped data will be saved in book_data.json
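Once the scraper has finished, the output can be inspected with a few lines of Python. This is a minimal sketch that assumes `book_data.json` contains a list of objects and that the keys are named `title` and `author`. The real key names may differ, so check the file first. The function names `load_books` and `titles_by_author` are illustrative only.

```python
import json
from collections import defaultdict


def load_books(path="book_data.json"):
    """Load the scraped records; assumes the file holds a JSON list."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def titles_by_author(books):
    """Group titles by author.

    The "author" and "title" keys are assumed names; records without an
    author are collected under "Unknown".
    """
    grouped = defaultdict(list)
    for book in books:
        grouped[book.get("author", "Unknown")].append(book.get("title", ""))
    return dict(grouped)
```

For example, `titles_by_author(load_books())` returns a dictionary that maps each author to the list of titles scraped for them.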