r.sim: Parallelize dx/dy derivatives computation using OpenMP by Abhi-d-gr8 · Pull Request #7094 · OSGeo/grass

Abhi-d-gr8 · 2026-02-16T02:29:02Z

Overview

This PR parallelizes the dx/dy slope derivatives computation in simlib/derivatives.c (Horn 3×3 method), addressing #7039.

The main simulation already supports multiple cores, but the dx/dy computation was still serial. This change enables that step to use available CPU cores as well.

What changed

Added an OpenMP parallel for on the outer row loop.
Used default(none) with explicit shared(...) and private(...) scoping.
Preserved original math and behavior.
Serial builds (without OpenMP) remain unchanged.

Performance Evaluation

To isolate the impact of this change, timing was performed inside derivatives() only, excluding raster I/O and the main simulation loop.

This avoids masking the effect, since total r.sim.water runtime is dominated by the simulation phase.

Test Setup

Grid size: 5000 × 5000 (25 million cells)
Interleaved runs: 1-thread vs 8-thread
First run discarded to avoid warm-up bias
Measured using omp_get_wtime() around derivatives() only

Results (Apple M3 Air)

OMP_NUM_THREADS=1: ~0.19 s (average)
OMP_NUM_THREADS=8: ~0.06 s (average)
Speedup: ~3.2×

This confirms that the dx/dy computation scales across cores as intended.

As expected, the overall r.sim.water runtime improvement is smaller since derivatives() is only one stage of the workflow.

Build / OpenMP Notes

To observe multi-threaded behavior in derivatives():

GRASS must be configured with OpenMP support:
```
./configure --with-openmp
```
The compiler must support OpenMP (e.g., libomp on macOS).

The source includes guarded OpenMP headers:

#if defined(_OPENMP)
#include <omp.h>
#endif

Control the number of threads via:
```
OMP_NUM_THREADS=<N>
```

If OpenMP is not available, the code compiles and runs serially without any behavior or numerical changes.

Reproducible Benchmark Command

For reference, this is the exact command used for the interleaved benchmark:

g.gisenv set=DEBUG=1
g.region n=5000 s=0 e=5000 w=0 res=1

echo "=== Interleaved 1-vs-8 thread runs (5000x5000 = 25M cells) ==="

for nprocs in 1 8 1 8 1 8; do
  r.sim.water elevation=elev rain_value=50 man_value=0.05 \
      depth=depth_bench discharge=disch_bench \
      nwalkers=5000 niterations=0 random_seed=1 nprocs=$nprocs \
      2>&1 | grep -E "OpenMP threads|derivatives time"
  echo "---"
done

Timing instrumentation (added for evaluation)

To isolate the performance of derivatives() (not masked by the main simulation),I temporarily added the following OpenMP-guarded timing block:

diff --git a/raster/r.sim/simlib/derivatives.c b/raster/r.sim/simlib/derivatives.c
--- a/raster/r.sim/simlib/derivatives.c
+++ b/raster/r.sim/simlib/derivatives.c
@@ -23,6 +23,11 @@
 H = (geometry->stepx / geometry->conv) * 8.0;
 V = (geometry->stepy / geometry->conv) * 8.0;

+#if defined(_OPENMP)
+    G_debug(1, "OpenMP threads (derivatives): %d", omp_get_max_threads());
+    double t0 = omp_get_wtime();
+#endif

 #if defined(_OPENMP)
 #pragma omp parallel for default(none) schedule(static) \
@@ -101,4 +109,8 @@
     }

+#if defined(_OPENMP)
+    G_debug(1, "dx/dy derivatives time: %.4f s", omp_get_wtime() - t0);
+#endif
 }

It used for benchmarking and does not affect numerical behavior.

wenzeslaus · 2026-02-16T06:23:02Z

Can you please add the diff into your description to add the time measuring code?

Abhi-d-gr8 · 2026-02-16T11:30:54Z

Hi @wenzeslaus !
I have added the diff in the description (at the end), to calculate the time measuring code of the derivatives.c

wenzeslaus · 2026-02-16T16:46:38Z

Awesome! I'm getting good results locally. Let's finalize it! Some valgrind-like scrutinizing would be good. There is some discussion in issues or PRs about how to do that with threads in our context in relation to r.mapcalc parallel code.

for nprocs in 1 8 1 8 1 8 1 4 1 4 1 4 1 36 1 36 1 36 1 36; do   r.sim.water elevation=elevation rain_value=50 man_value=0.05       depth=depth_bench discharge=disch_bench       nwalkers=5000 niterations=0 random_seed=1 nprocs=$nprocs --o       2>&1 | grep -E "threads";   echo "---"; done

diff --git a/raster/r.sim/simlib/derivatives.c b/raster/r.sim/simlib/derivatives.c
index 1519055783..9df42503b7 100644
--- a/raster/r.sim/simlib/derivatives.c
+++ b/raster/r.sim/simlib/derivatives.c
@@ -22,6 +22,11 @@ void derivatives(const Geometry *geometry, float **elevation, double **dx,
     H = (geometry->stepx / geometry->conv) * 8.0;
     V = (geometry->stepy / geometry->conv) * 8.0;
 
+#if defined(_OPENMP)
+    //G_debug(1, "OpenMP threads (derivatives): %d", omp_get_max_threads());
+    double t0 = omp_get_wtime();
+#endif
+
 #if defined(_OPENMP)
 #pragma omp parallel for default(none) schedule(static) \
     shared(geometry, elevation, dx, dy, H, V)           \
@@ -101,4 +106,9 @@ void derivatives(const Geometry *geometry, float **elevation, double **dx,
             dy[row][col] = ((c7 + c8 + c8 + c9) - (c1 + c2 + c2 + c3)) / V;
         }
     }
+
+#if defined(_OPENMP)
+    G_debug(1, "threads=%d, time=%.4f", omp_get_max_threads(), omp_get_wtime() - t0);
+#endif
+
 }

# generated

import pandas as pd
import matplotlib.pyplot as plt

# Raw data
data = [
    (1, 0.0537), (8, 0.0164),
    (1, 0.0540), (8, 0.0119),
    (1, 0.0546), (8, 0.0113),
    (1, 0.0557), (4, 0.0208),
    (1, 0.0538), (4, 0.0263),
    (1, 0.0534), (4, 0.0202),
    (1, 0.0557), (36, 0.0126),
    (1, 0.0525), (36, 0.0061),
    (1, 0.0536), (36, 0.0092),
    (1, 0.0553), (36, 0.0104),
]

df = pd.DataFrame(data, columns=["threads", "time"])

# Aggregate statistics: mean and full range
stats = (
    df.groupby("threads")["time"]
    .agg(mean="mean", min="min", max="max")
    .reset_index()
)

# Compute asymmetric error bars (min/max range)
yerr = [
    stats["mean"] - stats["min"],
    stats["max"] - stats["mean"],
]

# Plot
plt.figure()
plt.errorbar(
    stats["threads"],
    stats["mean"],
    yerr=yerr,
    fmt="o-"
)
plt.xlabel("Number of threads")
plt.ylabel("Time (s)")
plt.title("Threads vs Time with Full Data Range (min–max)")
plt.grid(True)

plt.show()

wenzeslaus · 2026-02-16T16:47:28Z

raster/r.sim/simlib/derivatives.c

@@ -1,4 +1,6 @@
 #include "simlib.h"
+#include <grass/gis.h>


Needed? (It is for debug, but here without it...)

Yes it is not needed , I was using different approaches so had included it earlier but yes its not required.

wenzeslaus · 2026-02-16T17:53:00Z

Another thing, do the current tests actually cover the two cases for dx-dy and for enabled/disabled parallelism?

feat: Parallelize derivatives with OpenMP (OSGeo#7039)

b69ecff

github-actions bot added raster Related to raster data processing C Related code is in C module labels Feb 16, 2026

Abhi-d-gr8 mentioned this pull request Feb 16, 2026

[Feat] r.sim: Parallelize dx, dy computation #7039

Open

wenzeslaus reviewed Feb 16, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

r.sim: Parallelize dx/dy derivatives computation using OpenMP#7094

r.sim: Parallelize dx/dy derivatives computation using OpenMP#7094
Abhi-d-gr8 wants to merge 1 commit intoOSGeo:mainfrom
Abhi-d-gr8:dxdy-multicore

Abhi-d-gr8 commented Feb 16, 2026 •

edited

Loading

Uh oh!

wenzeslaus commented Feb 16, 2026

Uh oh!

Abhi-d-gr8 commented Feb 16, 2026

Uh oh!

wenzeslaus commented Feb 16, 2026

Uh oh!

wenzeslaus Feb 16, 2026

Uh oh!

Abhi-d-gr8 Feb 16, 2026

Uh oh!

wenzeslaus commented Feb 16, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Uh oh!

Conversation

Abhi-d-gr8 commented Feb 16, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Overview

What changed

Performance Evaluation

Test Setup

Results (Apple M3 Air)

Build / OpenMP Notes

Reproducible Benchmark Command

Timing instrumentation (added for evaluation)

Uh oh!

wenzeslaus commented Feb 16, 2026

Uh oh!

Abhi-d-gr8 commented Feb 16, 2026

Uh oh!

wenzeslaus commented Feb 16, 2026

Uh oh!

wenzeslaus Feb 16, 2026

Choose a reason for hiding this comment

Uh oh!

Abhi-d-gr8 Feb 16, 2026

Choose a reason for hiding this comment

Uh oh!

wenzeslaus commented Feb 16, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Abhi-d-gr8 commented Feb 16, 2026 •

edited

Loading