site_checker monitors website availability over the network, produces metrics about it, and passes these events through a Kafka instance into a PostgreSQL database.
site_checker is divided into two components:
- producer: collect website metrics and publish results to Kafka.
- consumer: consume metrics from Kafka topics and save metrics into a PostgreSQL database.
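As a rough illustration of the producer side, the sketch below times one HTTP request and turns the outcome into a JSON-encoded event of the kind that could be used as a Kafka message value. The field names, the `fetch` parameter and the JSON encoding are assumptions made for this example, not the project's actual schema, and publishing to Kafka is left out.

```python
import json
import time
import urllib.error
import urllib.request


def check_site(url, fetch=urllib.request.urlopen, timeout=10.0):
    """Request `url` once and return a metric dict (field names are illustrative)."""
    started = time.monotonic()
    try:
        with fetch(url, timeout=timeout) as response:
            status = response.status
    except urllib.error.HTTPError as exc:
        status = exc.code  # the server answered, but with an error status
    except (urllib.error.URLError, OSError):
        status = None  # no HTTP answer at all (DNS failure, timeout, refused)
    elapsed = time.monotonic() - started
    return {
        "url": url,
        "status_code": status,
        "response_time_seconds": round(elapsed, 3),
        "available": status is not None and status < 400,
        "checked_at": time.time(),
    }


def encode_event(metric):
    """Serialize a metric to UTF-8 JSON bytes."""
    return json.dumps(metric, sort_keys=True).encode("utf-8")
```

Passing `fetch` as a parameter keeps the check testable without network access; in normal use the default `urllib.request.urlopen` is called.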
This section guides you through the steps to reproduce the demo presented above locally.
This tutorial expects you to already have a Kafka and a Postgres instance running. For more details, please check How to set up managed Apache Kafka and How to deploy an open source database.
One last request before you start 🙏: on the Aiven Kafka dashboard, go to the Overview tab, find the Advanced configuration section, and enable the kafka.auto_create_topics_enable parameter. This allows you to produce messages to Kafka without creating a topic beforehand.
- Check out this repository:

      $ git clone [email protected]:aiven-recruitment/site_checker.git
      $ cd site_checker

- Set up credentials:

      $ vim example/example.consumer.env # replace <CHANGEME> entries
      $ vim example/example.producer.env # replace <CHANGEME> entries
      # Copy the Kafka and Postgres SSL certificates to example/ folder
      $ cp ~/Downloads/ca.pem example/
      $ cp ~/Downloads/service.* example/

- Start docker-compose:

      $ make run

- Check application logs:

      $ make logs

- Check the data saved into Postgres:

      $ psql -U <USER> -h <HOSTNAME> -p <PORT> defaultdb
      defaultdb=> select * from site_checker.apache;

- Clean up:

      $ make clean

site_checker can be configured in two ways:
- config.ini: can check multiple websites. For more details, please check the [example/example.config.ini](example/example.config.ini) file.
- docker .env or CLI parameters: checks a single website. For more details, please check the example/example.producer.env file.

Notes on each approach:
- The config.ini configuration approach creates one thread per website. If you run site_checker on a single host to check multiple websites, performance is limited by the host's resources: too many websites or threads cause excessive context switching, which degrades performance.
- The CLI parameters configuration approach, also used in the Demo, creates a single Python process. To monitor more than one website with this approach, you can build a container using the Dockerfile definition as a starting point and launch it in your container orchestration system, e.g. Kubernetes, AWS ECS or Mesos.
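The one-thread-per-website model can be sketched as follows. This is a minimal illustration and not the project's code: the INI layout (one section per website with a `url` key) and the function names `load_sites` and `run_checks` are hypothetical; the real layout is the one in example/example.config.ini.

```python
import configparser
import threading

# Hypothetical layout: one section per website, each with a `url` key.
EXAMPLE_INI = """
[apache]
url = https://www.apache.org

[python]
url = https://www.python.org
"""


def load_sites(ini_text):
    """Return {section_name: url} parsed from INI text."""
    parser = configparser.ConfigParser()
    parser.read_string(ini_text)
    return {name: parser[name]["url"] for name in parser.sections()}


def run_checks(sites, check, rounds=1):
    """Start one thread per website; each thread calls check(url) `rounds` times."""
    results = {}
    lock = threading.Lock()

    def worker(name, url):
        for _ in range(rounds):
            outcome = check(url)
            with lock:  # protect the shared dict across threads
                results.setdefault(name, []).append(outcome)

    threads = [
        threading.Thread(target=worker, args=(name, url), name=f"checker-{name}")
        for name, url in sites.items()
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results
```

Because the number of threads grows with the number of sections, a long website list on one host runs into the context-switching cost mentioned above. Capping concurrency with a bounded pool such as `concurrent.futures.ThreadPoolExecutor` would be one possible alternative design.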
The site_checker consumer automatically creates one table per topic. For example, the topic apache creates the table apache under the schema site_checker in the Postgres instance.
The Demo and Getting Started use admin credentials to keep the steps simple. However, this approach is not suitable for production workloads.
For production, it is recommended to create an application user whose access is limited to the site_checker schema.
Example:

    CREATE USER site_checker WITH PASSWORD '<STRONGPASSWORD>';
    GRANT USAGE ON SCHEMA site_checker TO site_checker;
    GRANT CREATE ON SCHEMA site_checker TO site_checker;
    GRANT INSERT ON ALL TABLES IN SCHEMA site_checker TO site_checker;

For more detail, please check the CONTRIBUTING.md guide.
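To illustrate the one-table-per-topic behaviour, and why the application user needs CREATE on the site_checker schema, here is a minimal sketch of how a topic name could be mapped to a schema-qualified table and a matching DDL statement. The column list, the identifier rule and the function names are assumptions for this example; the consumer's real table definition may differ.

```python
import re

SCHEMA = "site_checker"
# Conservative rule for this sketch: lowercase letters, digits, underscores.
_IDENTIFIER = re.compile(r"[a-z_][a-z0-9_]*")


def table_for_topic(topic):
    """Map a Kafka topic name to a schema-qualified table name."""
    if not _IDENTIFIER.fullmatch(topic):
        raise ValueError(f"topic {topic!r} is not a safe SQL identifier")
    return f"{SCHEMA}.{topic}"


def create_table_sql(topic):
    """Build a CREATE TABLE statement; the columns are hypothetical."""
    return (
        f"CREATE TABLE IF NOT EXISTS {table_for_topic(topic)} ("
        "id BIGSERIAL PRIMARY KEY, "
        "url TEXT NOT NULL, "
        "status_code INTEGER, "
        "response_time_seconds DOUBLE PRECISION, "
        "checked_at TIMESTAMPTZ NOT NULL)"
    )
```

Validating the topic name before interpolating it matters because identifiers cannot be passed as ordinary query parameters, so an unchecked topic name would end up verbatim in the SQL text.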
