gkoulis/paper-with-code-SIMPAT-2025

Real-time Data Stream Imputation for Enhancing Fault Tolerance on the Edge

Focusing on Environmental Data

Dimitris Gkoulis, Anargyros Tsadimas, George Kousiouris, Cleopatra Bardaki, Mara Nikolaidou

This repository contains the complete source code, datasets, and evaluation results for our research on real-time data stream imputation in edge-based environmental monitoring systems. The work supports the paper titled "Real-time Data Stream Imputation for Enhancing Fault Tolerance on the Edge Focusing on Environmental Data", which explores lightweight forecasting models for maintaining data continuity during sensor outages.

Inside the repository, you’ll find:

  • The Python implementation
  • The simulation framework
  • Real-world environmental datasets from a weather station in Athens, Greece
  • All scripts and configuration files used for experiments, plots, and result analysis
  • Paper data (spreadsheets and figures)

Ensuring continuous, high-quality data in environmental monitoring systems is essential for applications such as climate modeling, urban planning, and disaster response. However, real-time data streams from edge-based IoT sensors are frequently affected by transmission errors, sensor faults, and network disruptions, leading to missing or incomplete observations. This paper investigates the application of lightweight, real-time imputation methods to enhance fault tolerance in edge computing environments. An imputation engine is developed and evaluated using five forecasting models—Naive, Seasonal Naive, Simple Exponential Smoothing, Holt’s Linear Trend, and Holt-Winters Exponential Smoothing—selected for their computational efficiency and suitability for edge deployment. To assess performance, a simulation framework is introduced that replicates sensor failure scenarios and allows controlled testing on real-world environmental data collected from a weather station in Athens, Greece. Imputation accuracy is evaluated using Mean Absolute Error (MAE), 95th percentile error, and maximum error, with results benchmarked against sensor tolerance thresholds. Findings show that Holt-Winters consistently provides the highest accuracy across diverse environmental variables and forecast horizons, while simpler models offer limited utility in short-term recovery contexts. The study demonstrates the feasibility of real-time imputation on low-power edge devices and provides actionable insights for deploying fault-tolerant environmental monitoring systems in resource-constrained settings.
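To make the models and metrics named above concrete, here is a minimal sketch in plain Python. It is not the repository's implementation: the function names, the smoothing parameters (`alpha`, `beta`), the initialization of level and trend, and the nearest-rank definition of the 95th percentile are all assumptions chosen for illustration. Holt-Winters is omitted to keep the sketch short.

```python
import math
from typing import Dict, List, Sequence


def naive(history: Sequence[float], horizon: int) -> List[float]:
    """Naive: repeat the last observed value for every missing step."""
    return [history[-1]] * horizon


def seasonal_naive(history: Sequence[float], horizon: int, season_length: int) -> List[float]:
    """Seasonal Naive: repeat the values of the last full season."""
    last_season = list(history[-season_length:])
    return [last_season[h % season_length] for h in range(horizon)]


def simple_exp_smoothing(history: Sequence[float], horizon: int, alpha: float = 0.5) -> List[float]:
    """Simple Exponential Smoothing: flat forecast at the smoothed level.

    Assumption: the level is initialized to the first observation.
    """
    level = history[0]
    for y in history[1:]:
        level = alpha * y + (1 - alpha) * level
    return [level] * horizon


def holt_linear(history: Sequence[float], horizon: int,
                alpha: float = 0.5, beta: float = 0.3) -> List[float]:
    """Holt's Linear Trend: smoothed level plus smoothed trend.

    Assumption: level starts at the first value, trend at the first difference.
    """
    if len(history) < 2:
        raise ValueError("holt_linear needs at least two observations")
    level = history[0]
    trend = history[1] - history[0]
    for y in history[1:]:
        prev_level = level
        level = alpha * y + (1 - alpha) * (prev_level + trend)
        trend = beta * (level - prev_level) + (1 - beta) * trend
    return [level + (h + 1) * trend for h in range(horizon)]


def error_summary(actual: Sequence[float], imputed: Sequence[float]) -> Dict[str, float]:
    """MAE, 95th percentile error (nearest-rank, an assumption), and maximum error."""
    errors = sorted(abs(a - p) for a, p in zip(actual, imputed))
    rank = max(1, math.ceil(0.95 * len(errors)))  # nearest-rank percentile index (1-based)
    return {
        "mae": sum(errors) / len(errors),
        "p95": errors[rank - 1],
        "max": errors[-1],
    }
```

In a setup like the one the abstract describes, a gap of `horizon` missing readings would be filled with one model's output, and `error_summary` would then compare the imputed values against the held-back sensor readings.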


DOI
https://doi.org/10.1016/j.simpat.2025.103178

BibTeX

@article{gkoulis2025exploring,
  title={Exploring the performance of real-time data imputation to enhance fault tolerance on the edge: A study on environmental data},
  author={Gkoulis, Dimitris and Tsadimas, Anargyros and Kousiouris, George and Bardaki, Cleopatra and Nikolaidou, Mara},
  journal={Simulation Modelling Practice and Theory},
  pages={103178},
  year={2025},
  publisher={Elsevier},
  doi={10.1016/j.simpat.2025.103178}
}

To install and use the simulator, first create a virtual environment (Python 3.11 required):

python -m venv venv

Activate the virtual environment (Linux/macOS shell):

source venv/bin/activate

Install dependencies:

pip install -r requirements.txt

To run pre-processing:

python -m rtds_imputation_sim.main pre_process

To run exploratory analysis for a single feature:

python -m rtds_imputation_sim.main explore_single Temperature

To run exploratory analysis for all features:

python -m rtds_imputation_sim.main explore_all

To run the simulation:

python -m rtds_imputation_sim.main run_simulation

To visualize the simulation results:

python -m rtds_imputation_sim.main visualize_simulation
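
The commands above can also be chained from a small helper script. The following is a sketch only. It assumes the steps are meant to run in the order they are listed here and that it is launched from the repository root with the virtual environment active. The names `SUBCOMMANDS`, `build_commands`, and `run_pipeline` are hypothetical and not part of the repository.

```python
import subprocess
import sys
from typing import List

# Subcommands as listed in this README. The order is an assumption.
SUBCOMMANDS = ["pre_process", "explore_all", "run_simulation", "visualize_simulation"]


def build_commands(python: str = sys.executable) -> List[List[str]]:
    """Build one `python -m rtds_imputation_sim.main <subcommand>` call per step."""
    return [[python, "-m", "rtds_imputation_sim.main", sub] for sub in SUBCOMMANDS]


def run_pipeline(dry_run: bool = True) -> None:
    """Print each command and, unless dry_run is set, execute it, stopping on the first failure."""
    for cmd in build_commands():
        print(" ".join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)
```

Calling `run_pipeline()` only prints the commands; `run_pipeline(dry_run=False)` executes them.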