Skip to content

alexcastrodev/ar-postgresql-jsonb-gin

Repository files navigation

ActiveRecord Postgress JSON Perfomance measure

Summary

This summary presents the performance benchmarks for finding entries by UUID under three different scenarios

Update

After discussing the changes, I modified the implementation to produce a smaller object, now using UUIDs as keys.

Test Scenario Warming Up Rate (i/100ms) Average Rate (i/s) Variation (%) Total Executions Time (s)
Perform 100 entries 478 4.325k 19.1 21.032k 5.113760
Perform 50,000 entries 20 189.442 14.8 920.000 5.035987

Perform 100 entries

test-nodes-1  | Running Ruby application...
test-nodes-1  | ruby 3.3.1 (2024-04-23 revision c56cd86388) [aarch64-linux]
test-nodes-1  | Warming up --------------------------------------
test-nodes-1  |         find_by_uuid   478.000 i/100ms
test-nodes-1  | Calculating -------------------------------------
test-nodes-1  |         find_by_uuid      4.325k (±19.1%) i/s -     21.032k in   5.113760s

Perform 50.000 entries

test-nodes-1  | Running Ruby application...
test-nodes-1  | ruby 3.3.1 (2024-04-23 revision c56cd86388) [aarch64-linux]
test-nodes-1  | Warming up --------------------------------------
test-nodes-1  |         find_by_uuid    20.000 i/100ms
test-nodes-1  | Calculating -------------------------------------
test-nodes-1  |         find_by_uuid    189.442 (±14.8%) i/s -    920.000 in   5.035987s

For my use case, the better one was:

Operation Iterations/100ms Calculations/sec (i/s) Total in 5s
find_by_uuid (Task) 349 4.344k (±10.5%) 21.638k
find_by_uuid (50k) 11 121.315 (±8.2%) 605.000

The results are shown for two different dataset sizes: 100 entries and 50,000 entries.

Scenario 100 Entries (i/s) 50,000 Entries (i/s)
Without Index 1.310k 16.300
Index in Separate Column 5.002k 277.447
Index in Existing Column 2.290k 9.525
Index in Existing Column second try 4.105k 95.707
Index in Existing Column third try 4.114k 98.061
Using Hash 4.416k 125.192
Hash without inferring 1.881k 9.253
Using B-tree 4.214k 99.476

Current situation (Without index)

Perform 100 entries

test-1  | Running Ruby application...
test-1  | ruby 3.3.1 (2024-04-23 revision c56cd86388) [aarch64-linux]
test-1  | Warming up --------------------------------------
test-1  |             find_by_uuid   125.000 i/100ms
test-1  | Calculating -------------------------------------
test-1  |             find_by_uuid      1.310k (±28.4%) i/s -      6.000k in   5.037614s
test-1  | Run options: --seed 9793

Perform 50.000 entries

test-1  | Running Ruby application...
test-1  | ruby 3.3.1 (2024-04-23 revision c56cd86388) [aarch64-linux]
test-1  | Warming up --------------------------------------
test-1  |             find_by_uuid     1.000 i/100ms
test-1  | Calculating -------------------------------------
test-1  |             find_by_uuid     16.300 (±12.3%) i/s -     81.000 in   5.056892s

Alternative 1: Creating uuids with index in a separate column

Perform 100 entries

test-1  | Running Ruby application...
test-1  | ruby 3.3.1 (2024-04-23 revision c56cd86388) [aarch64-linux]
test-1  | Warming up --------------------------------------
test-1  |             find_by_uuid   509.000 i/100ms
test-1  | Calculating -------------------------------------
test-1  |             find_by_uuid      5.002k (± 6.8%) i/s -     24.941k in   5.011577s

Perform 50.000 entries

test-1  | Running Ruby application...
test-1  | ruby 3.3.1 (2024-04-23 revision c56cd86388) [aarch64-linux]
test-1  | Warming up --------------------------------------
test-1  |             find_by_uuid    24.000 i/100ms
test-1  | Calculating -------------------------------------
test-1  |             find_by_uuid    277.447 (± 5.4%) i/s -      1.392k in   5.034088s

improve it a little bit more, after start creating manual testing i got this:

CREATE INDEX idx_users_on_thirdparty_infos_identities_uuid ON users USING gin (thirdparty_infos);

And to be sure that this will work

EXPLAIN ANALYZE
SELECT *
FROM users
WHERE thirdparty_infos @> '{"identities": [{"uuid": "656ff884-99e8-4624-95d4-50d3952d2c38"}]}';

and the result is:

Step Operation Details Cost Range Actual Time (ms) Rows Loops
1 Bitmap Heap Scan on users Cost: 40.04..59.13, Rows: 5, Width: 160 40.04..59.13 0.504..0.508 1 1
Recheck Condition (thirdparty_infos @> '{"identities": [{"uuid": "fb4813db-686b-418b-84d5-9b6ec45e77b5"}]}'::jsonb)
Heap Blocks exact=1
2 Bitmap Index Scan on index_users_on_thirdparty_infos Cost: 0.00..40.04, Rows: 5, Width: 0 0.00..40.04 0.468..0.469 1 1
Index Condition (thirdparty_infos @> '{"identities": [{"uuid": "fb4813db-686b-418b-84d5-9b6ec45e77b5"}]}'::jsonb)
Planning Time 0.638
Execution Time 0.856

Perform 100 entries

test-1  | Running Ruby application...
test-1  | ruby 3.3.1 (2024-04-23 revision c56cd86388) [aarch64-linux]
test-1  | Warming up --------------------------------------
test-1  |         find_by_uuid   408.000 i/100ms
test-1  | Calculating -------------------------------------
test-1  |         find_by_uuid      4.105k (±11.0%) i/s -     20.400k in   5.051942s

Perform 50.000 entries

test-1  | Running Ruby application...
test-1  | ruby 3.3.1 (2024-04-23 revision c56cd86388) [aarch64-linux]
test-1  | Warming up --------------------------------------
test-1  |         find_by_uuid     9.000 i/100ms
test-1  | Calculating -------------------------------------
test-1  |         find_by_uuid     95.707 (±10.4%) i/s -    477.000 in   5.058234s

In the same line, I will attempt to infer an array of strings before saving it to the User Model. Additionally, I will try to locate the necessary values within the array rather than mapping identities.

Perform 100 entries

test-1  | Running Ruby application...
test-1  | ruby 3.3.1 (2024-04-23 revision c56cd86388) [aarch64-linux]
test-1  | Warming up --------------------------------------
test-1  |         find_by_uuid   478.000 i/100ms
test-1  | Calculating -------------------------------------
test-1  |         find_by_uuid      4.114k (±18.0%) i/s -     18.642k in   5.117524s

Perform 50.000 entries

test-1  | Running Ruby application...
test-1  | ruby 3.3.1 (2024-04-23 revision c56cd86388) [aarch64-linux]
test-1  | Warming up --------------------------------------
test-1  |         find_by_uuid     8.000 i/100ms
test-1  | Calculating -------------------------------------
test-1  |         find_by_uuid     98.061 (±26.5%) i/s -    408.000 in   5.083929s

Alternative 2: Adding index in a existent column

Perform 100 entries

test-1  | Running Ruby application...
test-1  | ruby 3.3.1 (2024-04-23 revision c56cd86388) [aarch64-linux]
test-1  | Warming up --------------------------------------
test-1  |         find_by_uuid   197.000 i/100ms
test-1  | Calculating -------------------------------------
test-1  |         find_by_uuid      2.290k (± 7.6%) i/s -     11.426k in   5.032567s

Perform 50.000 entries

test-1  | Running Ruby application...
test-1  | ruby 3.3.1 (2024-04-23 revision c56cd86388) [aarch64-linux]
test-1  | Warming up --------------------------------------
test-1  |         find_by_uuid     1.000 i/100ms
test-1  | Calculating -------------------------------------
test-1  |         find_by_uuid      9.525 (±10.5%) i/s -     47.000 in   5.012350s

Alternative 2: Use Hash instead GIN

Perform 100 entries

test-1  | Running Ruby application...
test-1  | ruby 3.3.1 (2024-04-23 revision c56cd86388) [aarch64-linux]
test-1  | Warming up --------------------------------------
test-1  |         find_by_uuid   214.000 i/100ms
test-1  | Calculating -------------------------------------
test-1  |         find_by_uuid      1.881k (±31.1%) i/s -      7.918k in   5.010515s

Perform 50.000 entries

test-1  | Running Ruby application...
test-1  | ruby 3.3.1 (2024-04-23 revision c56cd86388) [aarch64-linux]
test-1  | Warming up --------------------------------------
test-1  |         find_by_uuid     1.000 i/100ms
test-1  | Calculating -------------------------------------
test-1  |         find_by_uuid      9.253 (± 0.0%) i/s -     47.000 in   5.094357s

Now, infering an array of strings before saving it to the User Model

### Perform 100 entries
test-1  | Running Ruby application...
test-1  | ruby 3.3.1 (2024-04-23 revision c56cd86388) [aarch64-linux]
test-1  | Warming up --------------------------------------
test-1  |         find_by_uuid   351.000 i/100ms
test-1  | Calculating -------------------------------------
test-1  |         find_by_uuid      4.416k (±11.6%) i/s -     21.762k in   5.037456s

Perform 50.000 entries

test-1  | Running Ruby application...
test-1  | ruby 3.3.1 (2024-04-23 revision c56cd86388) [aarch64-linux]
test-1  | Warming up --------------------------------------
test-1  |         find_by_uuid    12.000 i/100ms
test-1  | Calculating -------------------------------------
test-1  |         find_by_uuid    125.192 (± 5.6%) i/s -    624.000 in   5.002306s

Alternative 2: Use BTREE

Perform 100 entries

test-1  | Running Ruby application...
test-1  | ruby 3.3.1 (2024-04-23 revision c56cd86388) [aarch64-linux]
test-1  | Warming up --------------------------------------
test-1  |         find_by_uuid   362.000 i/100ms
test-1  | Calculating -------------------------------------
test-1  |         find_by_uuid      4.214k (±11.6%) i/s -     20.996k in   5.080634s

Perform 50.000 entries

test-1  | Running Ruby application...
test-1  | ruby 3.3.1 (2024-04-23 revision c56cd86388) [aarch64-linux]
test-1  | Warming up --------------------------------------
test-1  |         find_by_uuid     9.000 i/100ms
test-1  | Calculating -------------------------------------
test-1  |         find_by_uuid     99.476 (± 7.0%) i/s -    504.000 in   5.092611s

How to run

docker compose up test

References

Using JSONB in PostgreSQL®: How to Effectively Store & Index JSON Data in PostgreSQL

Faster queries with index on JSONB columns in Postgres

About

performance benchmarks

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published