Tooling for scraping and providing publicly available data from FCSE services. The data is provided using a REST API or webhooks. Requires Node.js >= 20.
The scrapers are implemented as classes (called strategies) which contain several selectors and methods for fetching the data from each container (post, announcement, etc). Adding a new service requires creating a new strategy and linking it. See the example strategy for more info.
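As a rough illustration of the strategy idea described above, here is a minimal sketch. Every name in it (`ContainerLike`, `ScraperStrategy`, `ExampleAnnouncementsStrategy`, the selectors, and the `strategies` lookup table) is hypothetical and not taken from the project's code; the example strategy in the repository remains the authoritative reference.

```typescript
// Hypothetical minimal view of a DOM-like element.
// The real project may use a different type for containers.
interface ContainerLike {
  querySelector(selector: string): { textContent: string | null } | null;
}

// Hypothetical shape of a strategy: selectors plus methods that
// read individual fields from one container (post, announcement, etc).
interface ScraperStrategy {
  // Selector matching each container on the scraped page.
  postsSelector: string;
  // Returns an identifier for the container, or null if it is missing.
  getId(container: ContainerLike): string | null;
  // Returns the title of the container, or null if it is missing.
  getTitle(container: ContainerLike): string | null;
}

// Strategy for an imaginary announcements page (all selectors are made up).
class ExampleAnnouncementsStrategy implements ScraperStrategy {
  postsSelector = "div.announcement";
  private idSelector = "a.permalink";
  private titleSelector = "h2.title";

  getId(container: ContainerLike): string | null {
    return container.querySelector(this.idSelector)?.textContent?.trim() ?? null;
  }

  getTitle(container: ContainerLike): string | null {
    return container.querySelector(this.titleSelector)?.textContent?.trim() ?? null;
  }
}

// "Linking" is sketched here as a plain lookup table from scraper name
// to strategy instance; the project may wire strategies up differently.
const strategies: Record<string, ScraperStrategy> = {
  example: new ExampleAnnouncementsStrategy(),
};
```

Under these assumptions, supporting a new service would mean writing one more class with its own selectors and adding it to the lookup table.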
To run the scraper:
- Clone the repository:
  git clone https://github.com/finki-hub/finki-scraper.git
- Prepare configuration by copying config/config.sample.json to config/config.json
- Install dependencies:
  npm i
- Run the scraper:
  npm run start
It's also available as a Docker image:
docker run -d \
--name finki-scraper \
--restart unless-stopped \
-v ./cache:/app/cache \
-v ./config:/app/config \
-v ./logs:/app/logs \
ghcr.io/finki-hub/finki-scraper:latest
Or Docker Compose: docker compose up -d
You can select which scrapers to run declaratively (in the configuration with the enabled flag) or imperatively: npm run start scraper_1 scraper_2 ... scraper_n
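The two selection modes can be sketched as follows. This is an illustration only: the `ScraperConfig` shape, the `selectScrapers` function, and the rule that names given on the command line take precedence over the `enabled` flags are all assumptions, not the project's confirmed behavior.

```typescript
// Hypothetical per-scraper configuration entry; only the flag used here is modeled.
interface ScraperConfig {
  enabled: boolean;
}

// Returns the names of the scrapers to run.
// Assumption: names passed on the command line win over the "enabled" flags.
function selectScrapers(
  scrapers: Record<string, ScraperConfig>,
  cliNames: string[],
): string[] {
  if (cliNames.length > 0) {
    // Imperative mode: reject names that are not present in the configuration.
    const unknown = cliNames.filter(
      (name) => !Object.prototype.hasOwnProperty.call(scrapers, name),
    );
    if (unknown.length > 0) {
      throw new Error(`Unknown scrapers: ${unknown.join(", ")}`);
    }
    return cliNames;
  }

  // Declarative mode: run every scraper whose "enabled" flag is true.
  return Object.entries(scrapers)
    .filter(([, config]) => config.enabled)
    .map(([name]) => name);
}
```

In a Node.js entry point, the second argument would typically come from `process.argv.slice(2)`, so `npm run start scraper_1` would pass `["scraper_1"]`.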
To build and run the project from source:
- Clone the repository:
  git clone https://github.com/finki-hub/finki-scraper.git
- Install dependencies:
  npm i
- Prepare configuration:
  cp config/config.sample.json config/config.json
- Build the project:
  npm run build
- Run it:
  npm run start
There is an example configuration file available at config/config.sample.json. Copy it to config/config.json and edit it to your liking.
This project is licensed under the terms of the MIT license.