Skip to content

A lightweight, cross-platform chaos engineering framework built in Rust for testing service resilience through controlled failure injection. Supports network latency, packet loss, CPU/memory pressure, and more on Windows, macOS, and Linux.

License

Notifications You must be signed in to change notification settings

NLlemain/chaos-engineering-rs

Repository files navigation

πŸ¦€ Chaos Engineering Framework

License: MIT Rust Platform

A lightweight, cross-platform chaos engineering framework for testing service resilience through controlled failure injection.

Built in Rust for performance and safety. Test how your services handle real-world failures like network issues, resource exhaustion, and process crashes.

✨ Features

  • 🌐 Cross-Platform: Windows, macOS, Linux with platform-native chaos injection
  • ⚑ High Performance: Async Rust, ~15MB memory, <1% CPU overhead
  • 🎯 7 Chaos Types: Network latency, packet loss, TCP resets, CPU starvation, memory pressure, disk I/O, process kills
  • πŸ“‹ YAML Configuration: Declarative test scenarios with multi-phase support
  • πŸ–₯️ Web Dashboard: Dark-themed UI for test management and monitoring
  • πŸ”₯ Load Testing: Stress test HTTP, WebSocket, TCP, gRPC, HLS, RTMP endpoints
  • πŸ“Š Multiple Outputs: CLI, JSON, Markdown, Prometheus metrics
  • πŸ›‘οΈ Safe by Design: Input validation, no shell injection, clear privilege separation

πŸš€ Quick Start

Prerequisites

  • Rust 1.70+ - Install Rust
  • Windows: No additional requirements
  • Linux: iproute2, iptables (usually pre-installed)
  • macOS: Built-in tools, requires sudo for network chaos

Installation

git clone https://github.com/NLlemain/chaos-engineering-rs
cd chaos-engineering-rs
cargo build --release

Run Your First Test

# Start test service
./target/release/axum_http_service

# Run chaos test (new terminal)
./target/release/chaos run scenarios/quick_test.yaml --verbose

Launch Web Dashboard

./target/release/chaos serve --port 8080
# Open http://127.0.0.1:8080

πŸ–₯️ Web Dashboard

Modern dark-themed web interface for chaos engineering:

  • Dashboard: Real-time test status, system overview
  • Scenarios: Browse and run YAML test scenarios
  • Load Testing: Stress test any HTTP/WebSocket/TCP endpoint
  • Targets: Save and manage your test endpoints
  • Results: View test history with detailed metrics

Load Testing Your Apps

Go to Load Test page and configure:

Field Description
Target Type HTTP, WebSocket, TCP, gRPC, HLS, RTMP
URL Your endpoint (e.g., http://localhost:3000/api)
Concurrent Users Parallel connections
Requests/Second Target throughput
Duration Test length in seconds
Ramp-up Gradual load increase time

Supported Protocols:

  • HTTP/HTTPS - REST APIs, web apps
  • WebSocket - Real-time feeds, chat
  • TCP - Raw socket connections
  • gRPC - gRPC services
  • HLS - HTTP Live Streaming
  • RTMP - Video streaming servers

πŸ“¦ Chaos Injectors

Injector Description Platform
network_latency Adds delay to packets (mean + jitter) All
packet_loss Randomly drops packets All
tcp_reset Terminates TCP connections All
cpu_starvation Saturates CPU at specified intensity All
memory_pressure Allocates memory to target % All
disk_slow I/O latency injection All
process_kill Terminates/restarts processes All

πŸ“ Test Scenarios

name: "HTTP Service Resilience Test"
targets:
  - name: "web_api"
    type: "process"
    process_name: "axum_http_service"

phases:
  - name: "baseline"
    duration: "30s"
    
  - name: "network_stress"
    duration: "60s"
    injections:
      - type: "network_latency"
        target: "web_api"
        delay: "100ms"
        jitter: "20ms"
  
  - name: "resource_stress"
    duration: "60s"
    parallel: true
    injections:
      - type: "cpu_starvation"
        intensity: 0.7
      - type: "memory_pressure"
        target_usage: 0.8
        
  - name: "recovery"
    duration: "30s"

CLI Commands

# List injectors
./target/release/chaos list

# Validate scenario
./target/release/chaos validate scenarios/my_test.yaml

# Run test
./target/release/chaos run scenarios/my_test.yaml --verbose

# Run with reports
./target/release/chaos run scenarios/stress_test.yaml \
  --output-json results.json \
  --output-markdown report.md

# Start web dashboard
./target/release/chaos serve --port 8080

πŸ§ͺ Test Services

Three example targets included:

# HTTP service (port 3000)
./target/release/axum_http_service

# TCP echo server (port 9001)
./target/release/tcp_echo_server

# WebSocket feed (port 9002)
./target/release/websocket_feed

πŸ—οΈ Architecture

chaos-engineering-rs/
β”œβ”€β”€ chaos_cli/         CLI and commands
β”œβ”€β”€ chaos_core/        Injection engine
β”œβ”€β”€ chaos_scenarios/   YAML parser, orchestration
β”œβ”€β”€ chaos_targets/     Target discovery, test services
β”œβ”€β”€ chaos_metrics/     Metrics collection, export
β”œβ”€β”€ chaos_web/         Web dashboard
└── scenarios/         Pre-built test scenarios

πŸ–₯️ Platform Support

Feature Linux macOS Windows
CPU/Memory/Disk Chaos βœ… βœ… βœ…
Process Control βœ… βœ… βœ…
Network Chaos βœ… tc/netem βœ… dnctl βœ… app-level
Web Dashboard βœ… βœ… βœ…
Load Testing βœ… βœ… βœ…

⚑ Performance

Metric Value
Binary Size ~6 MB
Build Time ~30 seconds
Memory ~15 MB
CPU Overhead <1%
Startup <100ms

πŸ›‘οΈ Safety

  • Input Validation: All configs validated before execution
  • No Shell Injection: Uses safe Rust Command API
  • Privilege Separation: Clear user/root boundaries
  • Audit Logging: All actions logged with timestamps

Privilege Requirements

Operation Linux/macOS Windows
Network chaos sudo User
CPU/Memory/Disk User User
Process kill (own) User User
Process kill (other) sudo Admin

πŸ“– Documentation

🀝 Contributing

  1. Fork the repo
  2. Create feature branch: git checkout -b feature/amazing
  3. Make changes with tests
  4. Format: cargo fmt --all
  5. Lint: cargo clippy --all
  6. Submit PR

πŸ“œ License

MIT License - See LICENSE-MIT

πŸ’¬ Contact


Remember: The goal isn't to break things - it's to learn how systems fail so you can build them better.

"Everything fails all the time." - Werner Vogels

About

A lightweight, cross-platform chaos engineering framework built in Rust for testing service resilience through controlled failure injection. Supports network latency, packet loss, CPU/memory pressure, and more on Windows, macOS, and Linux.

Topics

Resources

License

Security policy

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Contributors 2

  •  
  •