
Conversation

@Paul1365972
Contributor

This PR introduces several optimizations and refactoring improvements to the redpiler backend:

Changes:

  • Optimize flush performance: track changed nodes in a separate vector to avoid unnecessarily iterating through the entire nodes array during flush operations
  • Extract the tick scheduler and the events queue into a dedicated ExecutionContext (which also contains the new changed-nodes vec)
  • Pin compiler options at the start of compilation. This prevents io_only mode from being switched mid-execution; we never used or really supported this anyway, so it should be fine
  • I also noticed that the Node layout was a bit odd: we were wasting 18 bytes on padding. By reducing the updates inline vec from 10 to 9 entries we save 16 bytes (the alternative being to add 3 entries for free); a way to check this is sketched right after this list.
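As a rough illustration of the last point, struct sizes can be checked with std::mem::size_of; the struct below is a made-up stand-in (the field names and the u64 item type are assumptions), not the real Node layout:

```rust
// Illustrative only: a stand-in struct showing how the inline capacity of a
// SmallVec field affects the overall struct size (requires the `smallvec` crate).
use smallvec::SmallVec;
use std::mem::size_of;

struct NodeWith10 {
    updates: SmallVec<[u64; 10]>,
    output_power: u8,
    pending_tick: bool,
}

struct NodeWith9 {
    updates: SmallVec<[u64; 9]>,
    output_power: u8,
    pending_tick: bool,
}

fn main() {
    // Print both sizes to see how much of the difference is eaten by padding.
    println!("10 inline entries: {} bytes", size_of::<NodeWith10>());
    println!(" 9 inline entries: {} bytes", size_of::<NodeWith9>());
}
```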

The flush optimization reduces overhead by only processing nodes that have actually changed, which should improve performance especially for large redstone circuits.
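To make the changed-nodes tracking concrete, here is a minimal sketch; the field names, the shape of ExecutionContext, and the sync callback are assumptions for illustration and do not match the actual redpiler structs:

```rust
// Minimal sketch of the changed-nodes tracking described above.
// All names and fields here are illustrative, not the real redpiler code.
#[derive(Clone)]
struct Node {
    powered: bool,
    changed: bool, // set when the node has an unsynced state change
}

#[derive(Default)]
struct ExecutionContext {
    // Indices of nodes whose state changed since the last flush.
    changed_nodes: Vec<usize>,
}

struct Backend {
    nodes: Vec<Node>,
    context: ExecutionContext,
}

impl Backend {
    fn set_powered(&mut self, idx: usize, powered: bool) {
        let node = &mut self.nodes[idx];
        if node.powered != powered {
            node.powered = powered;
            // Record each node at most once per flush interval.
            if !node.changed {
                node.changed = true;
                self.context.changed_nodes.push(idx);
            }
        }
    }

    fn flush(&mut self, mut sync: impl FnMut(usize, bool)) {
        // Only visit the nodes that actually changed instead of scanning
        // the whole `nodes` array.
        for idx in self.context.changed_nodes.drain(..) {
            let node = &mut self.nodes[idx];
            node.changed = false;
            sync(idx, node.powered);
        }
    }
}

fn main() {
    let mut backend = Backend {
        nodes: vec![Node { powered: false, changed: false }; 3],
        context: ExecutionContext::default(),
    };
    backend.set_powered(1, true);
    backend.flush(|idx, powered| println!("node {idx} -> powered = {powered}"));
}
```

The key point is that the flush cost now scales with the number of nodes that changed per interval rather than with the total node count.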

We also still have the bug where the reset function does not properly sync every component's block state.
To further optimize it, I also decided to skip syncing the comparators' block entities, as we weren't doing that properly at the moment anyway.
This should probably be addressed in a different PR.

@BramOtte
Contributor

BramOtte commented Oct 6, 2025

From a few quick tests on my laptop, this seems to have performance comparable to my implementation (https://github.com/BramOtte/MCHPRS/tree/changed-vec),
but it is implemented a bit more cleanly and also tidies up some other parts.

I ran iris mandelbrot at 50'000 ticks per flush and got these results:

master run 1: DONE in 49.558927159s processing 135000000 ticks, effective rtps: 2724029.912247278
master run 2: DONE in 49.795871091s processing 135000000 ticks, effective rtps: 2711068.1476641465
paul   run 1: DONE in 45.279627796s processing 135000000 ticks, effective rtps: 2981473.2711192006
paul   run 2: DONE in 45.881503924s processing 135000000 ticks, effective rtps: 2942362.1384255304
bram   run 1: DONE in 45.920522083s processing 135000000 ticks, effective rtps: 2939862.0459059994
bram   run 2: DONE in 45.068884875s processing 135000000 ticks, effective rtps: 2995414.694071682
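(For reference, the effective rtps figure is just the tick count divided by the wall time, e.g. 135000000 / 49.559 s ≈ 2724030 for the first master run.)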

@Paul1365972
Contributor Author

Thanks for testing. I still haven't set up my benchmark environment again, so I just prayed my changes would work as well as yours did. Someone should also probably verify what impact the inline vec size has, although I feel like this was done already (it should probably be its own PR, if I'm being honest).

@Paul1365972
Contributor Author

I decided it's a good idea to separate this and will open a separate PR for the node struct change, although it will cause a merge conflict as far as I can see. Just ping me and I will rebase either one.

@Paul1365972
Contributor Author

Paul1365972 commented Oct 22, 2025

Some ideas for changes:

  1. Rename ExecutionContext to something more meaningful
  2. We currently run self.redpiler.flush(&mut self.world) directly after self.tickn(batch_size as u64) (Reference). Would it make sense (and not introduce bugs/regressions) to flush only just before we send the data to the player, according to the world send rate schedule? This could potentially be a big improvement in certain cases (a rough sketch is below, after the edit).

EDIT: One reason idea 2 is questionable is that we can imagine a future backend that runs ticks asynchronously (for example a multi-threaded or FPGA-based one). The change would then remove our "sync point" where we ensure the ticks actually ran, and it would mess with our ability to measure how long it took to execute those ticks. Although there must be a better way to handle this.
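To make idea 2 concrete, here is a rough sketch of what deferring the flush to the world-send schedule might look like; the last_send/send_interval bookkeeping and the method names are invented for illustration and do not match the actual plot tick loop:

```rust
use std::time::{Duration, Instant};

// Rough sketch of idea 2: flush only when a world send is due instead of
// after every tick batch. All names here are illustrative.
struct TickLoop {
    last_send: Instant,
    send_interval: Duration,
}

impl TickLoop {
    fn run_batch(&mut self, batch_size: u64) {
        self.tickn(batch_size);
        // Instead of flushing unconditionally after every batch, only flush
        // when the next world send to players is due.
        if self.last_send.elapsed() >= self.send_interval {
            self.flush_to_world();
            self.send_to_players();
            self.last_send = Instant::now();
        }
    }

    fn tickn(&mut self, _n: u64) { /* run redpiler ticks */ }
    fn flush_to_world(&mut self) { /* redpiler.flush(&mut world) */ }
    fn send_to_players(&mut self) { /* send block/chunk updates */ }
}
```

This only illustrates the control-flow change; it does not address the sync-point concern from the edit above.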
