A lightweight, cross-platform chaos engineering framework for testing service resilience through controlled failure injection.
Built in Rust for performance and safety. Test how your services handle real-world failures like network issues, resource exhaustion, and process crashes.
- Cross-Platform: Windows, macOS, Linux with platform-native chaos injection
- High Performance: Async Rust, ~15 MB memory, <1% CPU overhead
- 7 Chaos Types: Network latency, packet loss, TCP resets, CPU starvation, memory pressure, disk I/O, process kills
- YAML Configuration: Declarative test scenarios with multi-phase support
- Web Dashboard: Dark-themed UI for test management and monitoring
- Load Testing: Stress test HTTP, WebSocket, TCP, gRPC, HLS, RTMP endpoints
- Multiple Outputs: CLI, JSON, Markdown, Prometheus metrics
- Safe by Design: Input validation, no shell injection, clear privilege separation
- Rust 1.70+ - Install Rust
- Windows: No additional requirements
- Linux: `iproute2`, `iptables` (usually pre-installed)
- macOS: Built-in tools, requires sudo for network chaos
```bash
git clone https://github.com/NLlemain/chaos-engineering-rs
cd chaos-engineering-rs
cargo build --release
```

```bash
# Start test service
./target/release/axum_http_service
# Run chaos test (new terminal)
./target/release/chaos run scenarios/quick_test.yaml --verbose
```

```bash
./target/release/chaos serve --port 8080
# Open http://127.0.0.1:8080
```

Modern dark-themed web interface for chaos engineering:
- Dashboard: Real-time test status, system overview
- Scenarios: Browse and run YAML test scenarios
- Load Testing: Stress test any HTTP/WebSocket/TCP endpoint
- Targets: Save and manage your test endpoints
- Results: View test history with detailed metrics
Go to the Load Test page and configure:
| Field | Description |
|---|---|
| Target Type | HTTP, WebSocket, TCP, gRPC, HLS, RTMP |
| URL | Your endpoint (e.g., http://localhost:3000/api) |
| Concurrent Users | Parallel connections |
| Requests/Second | Target throughput |
| Duration | Test length in seconds |
| Ramp-up | Gradual load increase time |
Supported Protocols:
- HTTP/HTTPS - REST APIs, web apps
- WebSocket - Real-time feeds, chat
- TCP - Raw socket connections
- gRPC - gRPC services
- HLS - HTTP Live Streaming
- RTMP - Video streaming servers
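To make the Requests/Second and Ramp-up fields concrete, here is a minimal sketch of how a load generator could scale its request rate during the ramp-up window. It assumes a linear ramp, which this README does not confirm; `target_rate` and its parameters are hypothetical names, not part of the project's API.

```rust
use std::time::Duration;

/// Hypothetical helper: requests per second to aim for at `elapsed` time
/// into the test, assuming the load rises linearly during `ramp_up`.
pub fn target_rate(elapsed: Duration, ramp_up: Duration, target_rps: f64) -> f64 {
    // No ramp configured, or the ramp is over: run at the full target rate.
    if ramp_up.is_zero() || elapsed >= ramp_up {
        return target_rps;
    }
    // Fraction of the ramp completed so far, in the range [0, 1).
    let progress = elapsed.as_secs_f64() / ramp_up.as_secs_f64();
    target_rps * progress
}
```

With a target of 200 requests/second and a 10-second ramp-up, this sketch aims for half the target rate at the 5-second mark and the full rate from 10 seconds onward.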
| Injector | Description | Platform |
|---|---|---|
| `network_latency` | Adds delay to packets (mean + jitter) | All |
| `packet_loss` | Randomly drops packets | All |
| `tcp_reset` | Terminates TCP connections | All |
| `cpu_starvation` | Saturates CPU at specified intensity | All |
| `memory_pressure` | Allocates memory to target % | All |
| `disk_slow` | I/O latency injection | All |
| `process_kill` | Terminates/restarts processes | All |
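The injector names in the table are the strings a scenario uses in its `type` field. As an illustration only, the sketch below maps those seven names onto a Rust enum so that an unknown name is rejected up front. `InjectorKind` is a hypothetical type for this example and is not taken from the project's source.

```rust
use std::str::FromStr;

/// Hypothetical enum covering the seven injector names listed above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum InjectorKind {
    NetworkLatency,
    PacketLoss,
    TcpReset,
    CpuStarvation,
    MemoryPressure,
    DiskSlow,
    ProcessKill,
}

impl FromStr for InjectorKind {
    type Err = String;

    /// Accepts exactly the names from the table and rejects everything else.
    fn from_str(name: &str) -> Result<Self, Self::Err> {
        match name {
            "network_latency" => Ok(Self::NetworkLatency),
            "packet_loss" => Ok(Self::PacketLoss),
            "tcp_reset" => Ok(Self::TcpReset),
            "cpu_starvation" => Ok(Self::CpuStarvation),
            "memory_pressure" => Ok(Self::MemoryPressure),
            "disk_slow" => Ok(Self::DiskSlow),
            "process_kill" => Ok(Self::ProcessKill),
            other => Err(format!("unknown injector type: {other}")),
        }
    }
}
```

Calling `"packet_loss".parse::<InjectorKind>()` yields the matching variant, while a misspelled name produces an error before any chaos is injected.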
```yaml
name: "HTTP Service Resilience Test"
targets:
  - name: "web_api"
    type: "process"
    process_name: "axum_http_service"
phases:
  - name: "baseline"
    duration: "30s"
  - name: "network_stress"
    duration: "60s"
    injections:
      - type: "network_latency"
        target: "web_api"
        delay: "100ms"
        jitter: "20ms"
  - name: "resource_stress"
    duration: "60s"
    parallel: true
    injections:
      - type: "cpu_starvation"
        intensity: 0.7
      - type: "memory_pressure"
        target_usage: 0.8
  - name: "recovery"
    duration: "30s"
```

```bash
# List injectors
./target/release/chaos list
# Validate scenario
./target/release/chaos validate scenarios/my_test.yaml
# Run test
./target/release/chaos run scenarios/my_test.yaml --verbose
# Run with reports
./target/release/chaos run scenarios/stress_test.yaml \
  --output-json results.json \
  --output-markdown report.md
# Start web dashboard
./target/release/chaos serve --port 8080
```

Three example targets included:
```bash
# HTTP service (port 3000)
./target/release/axum_http_service
# TCP echo server (port 9001)
./target/release/tcp_echo_server
# WebSocket feed (port 9002)
./target/release/websocket_feed
```

```
chaos-engineering-rs/
├── chaos_cli/          CLI and commands
├── chaos_core/         Injection engine
├── chaos_scenarios/    YAML parser, orchestration
├── chaos_targets/      Target discovery, test services
├── chaos_metrics/      Metrics collection, export
├── chaos_web/          Web dashboard
└── scenarios/          Pre-built test scenarios
```
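Scenario files express phase lengths and delays as strings such as `"30s"` and `"100ms"`, so whatever parses a scenario has to turn them into durations. The following is a minimal sketch of that step using only the standard library. `parse_duration` is a hypothetical helper, and it deliberately accepts only the two units that appear in this README (`s` and `ms`); the real parser may support more.

```rust
use std::time::Duration;

/// Hypothetical helper: parse strings like "30s" or "100ms".
/// Only whole numbers with an `s` or `ms` suffix are accepted.
pub fn parse_duration(text: &str) -> Result<Duration, String> {
    let text = text.trim();
    // Find where the leading digits end and the unit begins.
    let split = text
        .find(|c: char| !c.is_ascii_digit())
        .ok_or_else(|| format!("missing unit in {text:?}"))?;
    let (digits, unit) = text.split_at(split);
    let value: u64 = digits
        .parse()
        .map_err(|_| format!("missing or invalid number in {text:?}"))?;
    match unit {
        "ms" => Ok(Duration::from_millis(value)),
        "s" => Ok(Duration::from_secs(value)),
        other => Err(format!("unsupported unit {other:?} in {text:?}")),
    }
}
```

Rejecting bare numbers and unknown units at parse time means a typo in a scenario surfaces as an error message instead of a silently wrong phase length.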
| Feature | Linux | macOS | Windows |
|---|---|---|---|
| CPU/Memory/Disk Chaos | Yes | Yes | Yes |
| Process Control | Yes | Yes | Yes |
| Network Chaos | tc/netem | dnctl | app-level |
| Web Dashboard | Yes | Yes | Yes |
| Load Testing | Yes | Yes | Yes |
| Metric | Value |
|---|---|
| Binary Size | ~6 MB |
| Build Time | ~30 seconds |
| Memory | ~15 MB |
| CPU Overhead | <1% |
| Startup | <100ms |
- Input Validation: All configs validated before execution
- No Shell Injection: Uses safe Rust `Command` API
- Privilege Separation: Clear user/root boundaries
- Audit Logging: All actions logged with timestamps
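To illustrate the first two points, the sketch below validates an interface name and then builds a `tc`/netem invocation with `std::process::Command`, passing each argument separately so no shell ever interprets the input. This is an assumption about how such a call could be built, not the project's actual code; `netem_delay_command` is a hypothetical function, and the command is only constructed here, not executed.

```rust
use std::process::Command;

/// Hypothetical helper: build (but do not run) a `tc` command that adds a
/// netem delay with jitter to `interface`.
pub fn netem_delay_command(
    interface: &str,
    delay_ms: u32,
    jitter_ms: u32,
) -> Result<Command, String> {
    // Validate before use: Linux interface names are at most 15 bytes.
    // The allowed character set here is a conservative assumption.
    let valid = !interface.is_empty()
        && interface.len() <= 15
        && interface
            .chars()
            .all(|c| c.is_ascii_alphanumeric() || matches!(c, '_' | '-' | '.'));
    if !valid {
        return Err(format!("invalid interface name: {interface:?}"));
    }

    // Each argument is passed as its own element, so characters such as
    // `;` or `$` would never be interpreted by a shell.
    let mut cmd = Command::new("tc");
    cmd.args(["qdisc", "add", "dev", interface, "root", "netem", "delay"])
        .arg(format!("{delay_ms}ms"))
        .arg(format!("{jitter_ms}ms"));
    Ok(cmd)
}
```

A caller would invoke `.status()` or `.output()` on the returned `Command`, which on Linux requires the elevated privileges described in the table that follows.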
| Operation | Linux/macOS | Windows |
|---|---|---|
| Network chaos | `sudo` | User |
| CPU/Memory/Disk | User | User |
| Process kill (own) | User | User |
| Process kill (other) | `sudo` | Admin |
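The table can be read as a lookup from operation and platform to the privilege level needed. The sketch below encodes exactly those rows; all type and function names are hypothetical and exist only for this example.

```rust
/// Hypothetical types mirroring the rows and columns of the table above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Platform {
    /// Linux or macOS.
    Unix,
    Windows,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Operation {
    NetworkChaos,
    ResourceChaos, // CPU, memory, or disk
    KillOwnProcess,
    KillOtherProcess,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Privilege {
    User,
    /// `sudo` on Linux/macOS, Admin on Windows.
    Elevated,
}

/// Privilege level required for `op` on `platform`, per the table.
pub fn required_privilege(op: Operation, platform: Platform) -> Privilege {
    match (op, platform) {
        (Operation::NetworkChaos, Platform::Unix) => Privilege::Elevated,
        (Operation::KillOtherProcess, _) => Privilege::Elevated,
        _ => Privilege::User,
    }
}
```

A check like this lets a tool fail early with a clear message when a scenario needs elevation that the current user lacks.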
- QUICKSTART.md - 5-minute setup guide
- SECURITY.md - Security considerations
- CHANGES.md - Changelog
- LICENSE-MIT - License
- Fork the repo
- Create feature branch: `git checkout -b feature/amazing`
- Make changes with tests
- Format: `cargo fmt --all`
- Lint: `cargo clippy --all`
- Submit PR
MIT License - See LICENSE-MIT
- Issues: GitHub Issues
- LinkedIn: Ninian Lemain
- Email: [email protected]
Remember: The goal isn't to break things - it's to learn how systems fail so you can build them better.
"Everything fails all the time." - Werner Vogels