mk-rav 🎥📊

Measurement Kit for Retrieving Attributes from Videos

mk-rav is a video analysis toolkit with a GUI for extracting visual, audio, and behavioral metrics from videos. It is aimed at content creators, researchers, and video processing professionals.

Required Software

  • Python 3.8+
  • FFmpeg - For video/audio processing
  • yt-dlp - For YouTube video downloading

GPU Acceleration (Optional but Recommended)

  • NVIDIA GPU: CUDA-compatible graphics card for accelerated processing
  • Apple Silicon: Metal Performance Shaders (MPS) support on M1/M2 Macs

Installation

1. Clone the Repository

git clone https://github.com/aivaslab/mk-rav.git
cd mk-rav

2. Install Python Dependencies

pip install -r requirements.txt

Note: requirements.txt pins dependency versions for compatibility with Python 3.8+. It covers the PyTorch ecosystem, YOLOv8, DeepRhythm, and the audio processing libraries.

3. Install FFmpeg

Windows:

macOS:

brew install ffmpeg

Linux:

sudo apt update
sudo apt install ffmpeg

4. Install yt-dlp

pip install yt-dlp
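
After steps 3 and 4 it can be useful to confirm that both command-line tools are reachable before launching the app. The snippet below is a minimal sketch, not part of mk-rav: the helper name `find_missing_tools` is hypothetical, and it assumes the executables are named `ffmpeg` and `yt-dlp` on your PATH.

```python
import shutil


def find_missing_tools(tools=("ffmpeg", "yt-dlp")):
    """Return the names of tools that cannot be located on PATH.

    Hypothetical helper for illustration; mk-rav may check its
    dependencies differently (or not at all).
    """
    return [name for name in tools if shutil.which(name) is None]


if __name__ == "__main__":
    missing = find_missing_tools()
    if missing:
        print("Missing tools:", ", ".join(missing))
    else:
        print("FFmpeg and yt-dlp are both available.")
```

If a tool is reported missing, re-run the matching installation step and open a new terminal so the updated PATH is picked up.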

5. GPU Setup (Optional)

For NVIDIA GPU (CUDA):

# Install CUDA-enabled PyTorch
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118

For CPU-only (if no GPU):

# Install CPU-only PyTorch
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cpu

Usage

Starting the Application

python app.py

Quick Start Guide

  1. Select Input Source: Choose between local file or YouTube URL
  2. Choose Analysis Metrics: Select from available visual and audio metrics
  3. Configure Settings:
    • Analysis mode (Frame by Frame or Scene by Scene)
    • Smoothing window for data
    • Time range for analysis
  4. Run Analysis: Click "Calculate Metrics" to start processing
  5. View Results: Navigate through multi-page plots and export data
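
The "smoothing window" setting in step 3 is not specified further in this README. One common interpretation is a centered moving average over the per-frame values, sketched below. The function name `moving_average` and the edge handling (the window shrinks near the start and end) are assumptions made for illustration, not the toolkit's confirmed behavior.

```python
def moving_average(values, window):
    """Smooth a sequence with a centered moving average.

    Assumption: `window` is an odd number of samples. Near the edges the
    window is truncated, so the output has the same length as the input.
    """
    if window < 1:
        raise ValueError("window must be at least 1")
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        chunk = values[lo:hi]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed


# Example: smooth five per-frame values with a 3-sample window.
print(moving_average([1, 2, 3, 4, 5], 3))
```

A larger window suppresses frame-to-frame noise at the cost of blurring short events such as cuts.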

Available Metrics

Visual Metrics

  • Luminance Per Frame: Frame-by-frame brightness analysis using ITU-R BT.601 conversion
  • Contrast Per Frame: Statistical contrast measurement using standard deviation
  • LAB Color Contrast: Color contrast analysis in CIELAB color space
  • HSV Color Contrast: Color contrast analysis in HSV color space
  • Local Color Contrast: Advanced local contrast analysis with configurable parameters
  • Pixel-wise Difference: Frame-to-frame pixel difference analysis
  • Scene Change Rate: Automatic scene detection and transition analysis
  • Number of Scenes Per Minute: Scene density analysis over time
  • Object Count Per Frame: Real-time object detection and counting using YOLOv8
  • Character Presence: Face and person detection with confidence scoring
  • Character Speed: Motion velocity analysis for detected characters
  • Optical Flow: Advanced motion vector analysis using Lucas-Kanade method
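
To make the first two metrics concrete, here is a minimal pure-Python sketch of BT.601 luminance and standard-deviation contrast for a single frame. It assumes a frame is a list of rows of 8-bit `(r, g, b)` tuples; the function names are hypothetical, and the real toolkit most likely operates on array types rather than nested lists.

```python
import statistics

# ITU-R BT.601 luma weights for R, G, B.
BT601_WEIGHTS = (0.299, 0.587, 0.114)


def pixel_lumas(frame):
    """Flatten a frame (rows of (r, g, b) tuples) into per-pixel luma values."""
    wr, wg, wb = BT601_WEIGHTS
    return [wr * r + wg * g + wb * b for row in frame for (r, g, b) in row]


def frame_luminance(frame):
    """Mean BT.601 luma of the frame, on the same 0-255 scale as the input."""
    return statistics.fmean(pixel_lumas(frame))


def frame_contrast(frame):
    """Contrast as the population standard deviation of per-pixel luma."""
    return statistics.pstdev(pixel_lumas(frame))


# A 1x2 frame: one black pixel, one white pixel.
frame = [[(0, 0, 0), (255, 255, 255)]]
print(frame_luminance(frame), frame_contrast(frame))
```

A uniform frame has zero contrast under this definition, while a frame split evenly between black and white has the maximum.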

Audio Metrics

  • Volume LUFS: Loudness analysis in LUFS (Loudness Units relative to Full Scale), following the ITU-R BS.1770-4 standard
  • Enhanced Tempo Estimation: Advanced BPM detection using DeepRhythm with Librosa fallback
  • Zero Crossing Rate: Audio texture and transient analysis
  • Energy Distribution: Frequency band energy analysis
  • Speech Detection: Advanced speech/music classification with pyannote.audio
  • Loudness Shifts: Dynamic range and loudness variation analysis
  • Syllable Duration Histogram: Linguistic timing pattern analysis
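
As an illustration of one of the simpler audio metrics, the sketch below computes a zero crossing rate over a list of samples: the fraction of adjacent sample pairs whose signs differ. This is an assumed, textbook-style definition with a hypothetical function name; the toolkit may compute it per analysis frame or through an audio library instead.

```python
def zero_crossing_rate(samples):
    """Fraction of adjacent sample pairs that change sign.

    Zero is treated as non-negative, so a move from 0 to a positive
    value does not count as a crossing.
    """
    if len(samples) < 2:
        return 0.0
    crossings = sum(
        1 for a, b in zip(samples, samples[1:]) if (a >= 0) != (b >= 0)
    )
    return crossings / (len(samples) - 1)


# A signal that alternates sign on every sample crosses zero at every step.
print(zero_crossing_rate([1, -1, 1, -1]))
```

Noisy or percussive audio tends to produce higher values than sustained tonal audio.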

Table Metrics

  • Count Colors Per Frame: Comprehensive color distribution analysis with statistics
  • Extract Main Characters: Character identification and tracking analysis
  • Characters Area Motion: Spatial movement analysis for detected characters
  • Characters Heatmap: Spatial distribution visualization and analysis

Configuration

Analysis Modes

  • Frame by Frame (FbF): Analyze every frame for detailed metrics
  • Scene by Scene (SbS): Analyze per scene for overview metrics
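
This README does not say how Scene by Scene values are derived. One plausible reading is that per-frame values are averaged within each detected scene, as in the sketch below. The function name `aggregate_by_scene` and the `scene_starts` input (sorted frame indices at which each scene begins) are hypothetical.

```python
def aggregate_by_scene(frame_values, scene_starts):
    """Average per-frame values within each scene.

    Assumptions: `scene_starts` is sorted and its first entry is 0, and
    each scene runs until the next start (or the end of the video).
    """
    bounds = list(scene_starts) + [len(frame_values)]
    means = []
    for start, end in zip(bounds, bounds[1:]):
        chunk = frame_values[start:end]
        if chunk:  # skip empty scenes rather than divide by zero
            means.append(sum(chunk) / len(chunk))
    return means


# Six frames split into three scenes starting at frames 0, 2, and 5.
print(aggregate_by_scene([1, 3, 5, 7, 9, 11], [0, 2, 5]))
```

Under this reading, SbS mode yields one value per scene, which is why it suits overviews while FbF suits fine-grained inspection.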

GPU Acceleration

The application automatically detects and uses available GPU acceleration:

  • NVIDIA CUDA GPUs
  • Apple Metal Performance Shaders (MPS)
  • OpenCV CUDA support
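
For readers who want to check which backend PyTorch can see on their machine, the following is a minimal sketch of such a detection routine. It is not taken from mk-rav: the function name `pick_device` is hypothetical, the CUDA-before-MPS preference order is an assumption, and OpenCV CUDA support is not covered. It falls back to `"cpu"` when PyTorch is not installed.

```python
def pick_device():
    """Return 'cuda', 'mps', or 'cpu' based on what PyTorch reports."""
    try:
        import torch
    except ImportError:
        # PyTorch is not installed, so only the CPU path is possible.
        return "cpu"
    if torch.cuda.is_available():
        return "cuda"
    # torch.backends.mps is absent in older PyTorch releases, hence getattr.
    mps = getattr(torch.backends, "mps", None)
    if mps is not None and mps.is_available():
        return "mps"
    return "cpu"


if __name__ == "__main__":
    print("Selected device:", pick_device())
```

If this prints `cpu` on a machine with an NVIDIA GPU, the installed PyTorch build is probably CPU-only; see the GPU Setup step in the installation section.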

Whisper Models

Choose a model size to trade transcription quality against speed:

  • tiny: Fastest, lowest quality
  • base: Balanced (default)
  • small: Better quality
  • medium: High quality
  • large: Best quality, slowest

Output Formats

  • Plots: Multi-page visualization with navigation controls
  • Data Export: CSV/JSON format for further analysis
  • Individual Plot Saving: PNG/PDF/SVG export options
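
The exact layout of the exported files is not documented here. As a rough sketch of what CSV and JSON export of per-frame results can look like, the example below serializes a list of row dictionaries with the standard library. The column names `frame` and `luminance` and both function names are hypothetical.

```python
import csv
import io
import json


def to_csv(rows, fieldnames):
    """Serialize a list of dicts to CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def to_json(rows):
    """Serialize the same rows to indented JSON text."""
    return json.dumps(rows, indent=2)


# Hypothetical per-frame results.
rows = [
    {"frame": 0, "luminance": 101.5},
    {"frame": 1, "luminance": 99.0},
]
print(to_csv(rows, ["frame", "luminance"]))
print(to_json(rows))
```

CSV is convenient for spreadsheets and pandas, while JSON preserves nesting if a metric produces structured values.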

Performance Optimization

  • Single-pass Processing: Optimized visual metrics calculation
  • Memory Management: Efficient resource cleanup
  • GPU Acceleration: Automatic hardware acceleration when available
  • Batch Processing: Multiple metrics calculated simultaneously
