Replace dict with thin wrapper around hashtable #3366

Open
zuiderkwast wants to merge 5 commits into valkey-io:unstable from zuiderkwast:thin-dict
@zuiderkwast zuiderkwast commented Mar 16, 2026

Replace the dict.c implementation with a header-only wrapper (dict.h)
around the hashtable API. The dict types, iterators and API functions
are now typedefs, macros and inline functions that delegate to hashtable.
This unifies the hashtable implementations in the project and removes
duplicated logic.

Changes to dict:

  • Remove dict.c; dict.h is now the entire implementation
  • dict, dictType and dictIterator are direct aliases for the hashtable
    counterparts.
  • dictEntry is a struct allocated by dict wrapper functions to hold key
    and value. It doesn't have a next pointer anymore.
  • Fix key duplication for dictTypes that had keyDup callback by
    calling sdsdup() at call sites in functions.c
  • Remove unused functions, macros, includes and casts
  • Move some dict defrag logic to defrag.c
  • Remove obsolete dict unit tests (covered by test_hashtable.cpp)
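The wrapper idea in the list above can be illustrated with a toy model. This is only a sketch under stated assumptions: `toyTable`, `toyDict`, `toyDictEntry` and every function below are hypothetical stand-ins, not the actual Valkey hashtable or dict.h API. It shows a dict type that is a plain alias for an element-based table, with a wrapper-allocated key/value entry that carries no next pointer.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Toy stand-in for an element-based hashtable: a small open-addressing
 * table of opaque element pointers. Hypothetical names, not Valkey's API. */
#define TOY_SLOTS 64

typedef struct toyTable {
    void *slots[TOY_SLOTS];
    const char *(*elemGetKey)(const void *elem); /* extracts the key from an element */
} toyTable;

static unsigned toyHash(const char *s) {
    unsigned h = 5381;
    while (*s) h = h * 33 + (unsigned char)*s++;
    return h;
}

/* Returns the slot holding `key`, else the first empty slot on its probe
 * path, else -1 if the table is full. The toy never deletes, so linear
 * probing needs no tombstones. */
static int toyProbe(const toyTable *t, const char *key) {
    unsigned start = toyHash(key) % TOY_SLOTS;
    for (unsigned i = 0; i < TOY_SLOTS; i++) {
        unsigned idx = (start + i) % TOY_SLOTS;
        if (t->slots[idx] == NULL) return (int)idx;
        if (strcmp(t->elemGetKey(t->slots[idx]), key) == 0) return (int)idx;
    }
    return -1;
}

/* The "dict" layer: the dict type is a direct alias, and the wrapper
 * allocates a key/value entry that becomes the table element. */
typedef toyTable toyDict;
typedef struct toyDictEntry {
    const char *key;
    void *val; /* no next pointer: the table handles collisions itself */
} toyDictEntry;

static const char *toyEntryKey(const void *elem) {
    return ((const toyDictEntry *)elem)->key;
}

static inline void toyDictInit(toyDict *d) {
    assert(d != NULL);
    memset(d, 0, sizeof(*d));
    d->elemGetKey = toyEntryKey;
}

/* Returns 1 if the pair was added, 0 if the key exists or the table is full. */
static inline int toyDictAdd(toyDict *d, const char *key, void *val) {
    int idx = toyProbe(d, key);
    if (idx < 0 || d->slots[idx] != NULL) return 0;
    toyDictEntry *e = malloc(sizeof(*e));
    if (e == NULL) return 0;
    e->key = key;
    e->val = val;
    d->slots[idx] = e;
    return 1;
}

/* Returns the stored value, or NULL if the key is absent. */
static inline void *toyDictFetchValue(const toyDict *d, const char *key) {
    int idx = toyProbe(d, key);
    if (idx < 0 || d->slots[idx] == NULL) return NULL;
    return ((toyDictEntry *)d->slots[idx])->val;
}

/* Frees the wrapper-allocated entries (keys and values are not owned). */
static inline void toyDictRelease(toyDict *d) {
    for (int i = 0; i < TOY_SLOTS; i++) {
        free(d->slots[i]);
        d->slots[i] = NULL;
    }
}
```

In this model the table never needs to know that its elements are key/value pairs; only the `elemGetKey` callback does, which is what lets the dict layer stay header-sized.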

Changes to hashtable:

  • Change hashtable keyCompare convention to match dict: non-zero means
    keys are equal, so existing dict compare functions can be reused
  • Add const to hashtableMemUsage parameter
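To make the keyCompare convention concrete, here is a minimal sketch. All names (`keyCompareFn`, `compareDictStyle`, `compareStrcmpStyle`, `adaptedCompare`, `findKey`) are hypothetical and not taken from the Valkey source. It contrasts a dict-style callback (non-zero means equal) with a strcmp-style one (zero means equal) and shows why a callback written for one convention cannot be passed unchanged to code expecting the other.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef int (*keyCompareFn)(const void *a, const void *b);

/* Dict-style convention: returns non-zero when the keys are equal. */
static int compareDictStyle(const void *a, const void *b) {
    return strcmp(a, b) == 0;
}

/* strcmp-style convention: returns zero when the keys are equal. */
static int compareStrcmpStyle(const void *a, const void *b) {
    return strcmp(a, b);
}

/* Adapter that turns a strcmp-style result into the dict-style one. */
static int adaptedCompare(const void *a, const void *b) {
    return compareStrcmpStyle(a, b) == 0;
}

/* Linear lookup that expects the dict-style convention.
 * Returns the index of the first match, or -1 if none. */
static int findKey(const char *const *keys, size_t n, const char *key, keyCompareFn equal) {
    assert(equal != NULL);
    for (size_t i = 0; i < n; i++) {
        if (equal(keys[i], key)) return (int)i;
    }
    return -1;
}
```

Passing `compareStrcmpStyle` to `findKey` reports a match on the first key that differs from the one searched for, which is the kind of silent bug a single shared convention avoids.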

Changes to server implementation:

  • Deduplicate common dict/hashtable callbacks in server.c
  • Change configured hash-seed to only apply to data hashtables. In
    particular, it must not modify the hash seed for dicts already
    initialized during startup for reading configs and similar.

Changes to libvalkey:

  • Let libvalkey use its own dict implementation.

Stop overriding libvalkey's dict with valkey's. Remove the
DICT_INCLUDE_DIR mechanism from libvalkey's build system since
it is no longer needed.

Signed-off-by: Viktor Söderqvist <viktor.soderqvist@est.tech>
Data hashtables (keys, sets, zsets, hashes) now use a configurable seed
separate from the global hashtable seed. This allows the hash-seed config
to control SCAN iteration order without affecting internal hashtables
(commands, ACL, modules, etc.) that are populated before config loading.

The configurable seed defaults to the random seed and is overridden
after config loading if hash-seed is set.

Signed-off-by: Viktor Söderqvist <viktor.soderqvist@est.tech>
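The seed separation described in this commit message could look roughly like the sketch below. It is an illustration only: `globalSeed`, `dataSeed`, `seedsInit` and `applyConfiguredSeed` are hypothetical names, and copying the config string's bytes into the seed is an assumption made for brevity, not how Valkey derives the seed.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SEED_LEN 16

/* Seed for internal tables (commands, ACL, modules, ...) that are
 * populated before the config is loaded. Never changed after init. */
static uint8_t globalSeed[SEED_LEN];

/* Seed for data tables (keys, sets, zsets, hashes). */
static uint8_t dataSeed[SEED_LEN];

/* At startup both seeds are set to the same random bytes, so the
 * data seed defaults to the random seed. */
static void seedsInit(const uint8_t randomBytes[SEED_LEN]) {
    memcpy(globalSeed, randomBytes, SEED_LEN);
    memcpy(dataSeed, randomBytes, SEED_LEN);
}

/* After config loading: override only the data seed when a hash-seed
 * value is configured. Assumption: the seed is the string's bytes,
 * truncated or zero-padded to SEED_LEN. */
static void applyConfiguredSeed(const char *hashSeed) {
    if (hashSeed == NULL) return; /* keep the random default */
    size_t len = strlen(hashSeed);
    if (len > SEED_LEN) len = SEED_LEN;
    memset(dataSeed, 0, SEED_LEN);
    memcpy(dataSeed, hashSeed, len);
}
```

Because `applyConfiguredSeed` never touches `globalSeed`, tables that were filled during startup keep hashing with the seed they were built with.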

codecov bot commented Mar 16, 2026

Codecov Report

❌ Patch coverage is 88.00000% with 27 lines in your changes missing coverage. Please review.
✅ Project coverage is 74.38%. Comparing base (afe6ee1) to head (49e86af).

| Files with missing lines | Patch % | Lines |
|---|---|---|
| src/sentinel.c | 0.00% | 10 Missing ⚠️ |
| src/server.c | 86.27% | 7 Missing ⚠️ |
| src/rdb.c | 0.00% | 5 Missing ⚠️ |
| src/valkey-cli.c | 77.77% | 4 Missing ⚠️ |
| src/valkey-benchmark.c | 0.00% | 1 Missing ⚠️ |

Additional details and impacted files
@@             Coverage Diff              @@
##           unstable    #3366      +/-   ##
============================================
- Coverage     74.46%   74.38%   -0.09%     
============================================
  Files           130      129       -1     
  Lines         72730    72303     -427     
============================================
- Hits          54160    53781     -379     
+ Misses        18570    18522      -48     

| Files with missing lines | Coverage Δ | |
|---|---|---|
| src/cluster_legacy.c | 88.10% <100.00%> (+0.18%) | ⬆️ |
| src/config.c | 77.70% <ø> (ø) | |
| src/defrag.c | 81.12% <100.00%> (-0.81%) | ⬇️ |
| src/dict.h | 100.00% <100.00%> (ø) | |
| src/eval.c | 91.50% <100.00%> (+0.07%) | ⬆️ |
| src/expire.c | 98.12% <ø> (+0.80%) | ⬆️ |
| src/functions.c | 96.64% <100.00%> (+0.07%) | ⬆️ |
| src/fuzzer_command_generator.c | 76.82% <100.00%> (-0.14%) | ⬇️ |
| src/hashtable.c | 93.37% <100.00%> (+0.49%) | ⬆️ |
| src/latency.c | 83.33% <100.00%> (+0.04%) | ⬆️ |

... and 9 more

... and 16 files with indirect coverage changes

@zuiderkwast added the run-extra-tests (Run extra tests on this PR; runs all tests from daily except valgrind and RESP) and run-benchmark labels Mar 16, 2026

@hpatro hpatro left a comment

Pretty pleasant refactoring.

@zuiderkwast zuiderkwast marked this pull request as ready for review March 17, 2026 02:04

@rainsupreme rainsupreme left a comment

The code looks good to me! I'd like to see the benchmark results, otherwise looks good to go! 😁


@dvkashapov dvkashapov left a comment

Awesome work! My only concern is that dictReplace() always allocates a new entry before checking whether the key exists, and frees the new entry if the key already exists. But this function is not used that frequently, so the impact is minimal.
BTW, for the module dict API, should we mention somewhere that it's now another implementation?

@zuiderkwast

> Awesome work! My only concern is that dictReplace() always allocates a new entry before checking whether the key exists, and frees the new entry if the key already exists. But this function is not used that frequently, so the impact is minimal.

Exactly, that was my conclusion too, so it's fine.

> BTW, for the module dict API, should we mention somewhere that it's now another implementation?

It's a rax! Crazy... but it's not affected by this PR.
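
The allocation pattern discussed in this exchange can be sketched as follows. This is not the PR's code: `toyDict`, `toyEntry`, `toyAddOrFind` and `toyDictReplace` are hypothetical names, and a small array stands in for the hashtable. The point is only the order of operations: the entry is allocated first, and freed again when the key turns out to exist.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define TOY_CAP 16

typedef struct toyEntry {
    const char *key;
    void *val;
} toyEntry;

typedef struct toyDict {
    toyEntry *elems[TOY_CAP];
    size_t used;
} toyDict;

/* Stores `e` unless an entry with the same key already exists.
 * Returns 1 if inserted; otherwise returns 0 and sets *existing. */
static int toyAddOrFind(toyDict *d, toyEntry *e, toyEntry **existing) {
    for (size_t i = 0; i < d->used; i++) {
        if (strcmp(d->elems[i]->key, e->key) == 0) {
            *existing = d->elems[i];
            return 0;
        }
    }
    assert(d->used < TOY_CAP);
    d->elems[d->used++] = e;
    return 1;
}

/* Allocate-first replace. Returns 1 if the key was new, 0 if an
 * existing value was overwritten. */
static int toyDictReplace(toyDict *d, const char *key, void *val) {
    /* The entry is allocated before we know whether the key exists. */
    toyEntry *e = malloc(sizeof(*e));
    assert(e != NULL);
    e->key = key;
    e->val = val;

    toyEntry *existing = NULL;
    if (toyAddOrFind(d, e, &existing)) return 1;

    /* Key already present: update in place and discard the new entry. */
    existing->val = val;
    free(e);
    return 0;
}
```

The cost raised in the review is the one malloc/free pair wasted on the overwrite path; a single add-or-find call keeps the code short in exchange.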


@sarthakaggarwal97 sarthakaggarwal97 left a comment

Minor comments! Looks pretty good!

Signed-off-by: Viktor Söderqvist <viktor.soderqvist@est.tech>
@github-actions

Benchmark ran on this commit: 49e86af

Benchmark Comparison: unstable vs 8ca73b3 (averaged) - rps metrics

Run Summary:

  • unstable: 80 total runs, 16 configurations (avg 5.00 runs per config)
  • 8ca73b3: 80 total runs, 16 configurations (avg 5.00 runs per config)

Statistical Notes:

  • CI99%: 99% Confidence Interval - range where the true population mean is likely to fall
  • PI99%: 99% Prediction Interval - range where a single future observation is likely to fall
  • CV: Coefficient of Variation - relative variability (σ/μ × 100%)

Note: Values with (n=X, σ=Y, CV=Z%, CI99%=±W%, PI99%=±V%) indicate averages from X runs with standard deviation Y, coefficient of variation Z%, 99% confidence interval margin of error ±W% of the mean, and 99% prediction interval margin of error ±V% of the mean. CI bounds [A, B] and PI bounds [C, D] show the actual interval ranges.
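
As a worked example of the statistics defined above, the sketch below recomputes CV, CI99% and PI99% from a mean, a standard deviation and a run count. It assumes the intervals are Student's t intervals with n − 1 degrees of freedom; the report does not state this, but the assumption reproduces its numbers. `runStats`, `summarize` and `toySqrt` are hypothetical names.

```c
#include <assert.h>

/* Two-sided 99% critical value of Student's t with 4 degrees of
 * freedom (n = 5 runs). Assumption: the report uses a t interval. */
#define T_CRIT_99_DF4 4.604095

typedef struct runStats {
    double cv_pct; /* coefficient of variation, % of mean */
    double ci_pct; /* 99% confidence interval half-width, % of mean */
    double pi_pct; /* 99% prediction interval half-width, % of mean */
} runStats;

/* Newton's method square root, used so the sketch does not need libm. */
static double toySqrt(double x) {
    assert(x > 0.0);
    double r = x;
    for (int i = 0; i < 60; i++) r = 0.5 * (r + x / r);
    return r;
}

static runStats summarize(double mean, double sd, int n, double t_crit) {
    assert(n > 1 && mean != 0.0);
    runStats s;
    s.cv_pct = sd / mean * 100.0;
    /* CI half-width: t * sd / sqrt(n) */
    s.ci_pct = t_crit * sd / toySqrt((double)n) / mean * 100.0;
    /* PI half-width: t * sd * sqrt(1 + 1/n) */
    s.pi_pct = t_crit * sd * toySqrt(1.0 + 1.0 / (double)n) / mean * 100.0;
    return s;
}
```

Calling `summarize(229532.856, 2288.134, 5, T_CRIT_99_DF4)` with the mean and σ of the first GET row on unstable gives roughly 1.00%, 2.05% and 5.03%, in line with the CV, CI99% and PI99% reported for that row.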

Configuration:

  • architecture: aarch64
  • benchmark_mode: duration
  • clients: 1600
  • cluster_mode: False
  • data_size: 16
  • duration: 180
  • tls: False
  • valkey_benchmark_threads: 90
  • warmup: 30

| Command | Metric | Pipeline | io_threads | unstable | 8ca73b3 | Diff | % Change |
|---|---|---|---|---|---|---|---|
| GET | rps | 1 | 1 | 229532.856 (n=5, σ=2288.134, CV=1.00%, CI99%=±2.053%, PI99%=±5.028%, CI[224821.557, 234244.155], PI[217992.578, 241073.134]) | 229742.702 (n=5, σ=2668.886, CV=1.16%, CI99%=±2.392%, PI99%=±5.859%, CI[224247.430, 235237.974], PI[216282.089, 243203.315]) | 209.846 | +0.091% |
| GET | rps | 1 | 9 | 1500239.250 (n=5, σ=5846.989, CV=0.39%, CI99%=±0.802%, PI99%=±1.966%, CI[1488200.219, 1512278.281], PI[1470749.767, 1529728.733]) | 1490869.976 (n=5, σ=14738.836, CV=0.99%, CI99%=±2.036%, PI99%=±4.986%, CI[1460522.509, 1521217.443], PI[1416534.166, 1565205.786]) | -9369.274 | -0.625% |
| GET | rps | 10 | 1 | 1252881.948 (n=5, σ=6082.645, CV=0.49%, CI99%=±1.000%, PI99%=±2.449%, CI[1240357.698, 1265406.198], PI[1222203.925, 1283559.971]) | 1261843.672 (n=5, σ=4111.061, CV=0.33%, CI99%=±0.671%, PI99%=±1.643%, CI[1253378.940, 1270308.404], PI[1241109.399, 1282577.945]) | 8961.724 | +0.715% |
| GET | rps | 10 | 9 | 2845821.550 (n=5, σ=25909.625, CV=0.91%, CI99%=±1.875%, PI99%=±4.592%, CI[2792473.274, 2899169.826], PI[2715145.495, 2976497.605]) | 2927557.850 (n=5, σ=22465.211, CV=0.77%, CI99%=±1.580%, PI99%=±3.870%, CI[2881301.669, 2973814.031], PI[2814253.810, 3040861.890]) | 81736.300 | +2.872% |
| SET | rps | 1 | 1 | 219963.862 (n=5, σ=1861.550, CV=0.85%, CI99%=±1.743%, PI99%=±4.268%, CI[216130.904, 223796.820], PI[210575.071, 229352.653]) | 221007.348 (n=5, σ=2049.630, CV=0.93%, CI99%=±1.910%, PI99%=±4.677%, CI[216787.131, 225227.565], PI[210669.969, 231344.727]) | 1043.486 | +0.474% |
| SET | rps | 1 | 9 | 1464119.024 (n=5, σ=16397.047, CV=1.12%, CI99%=±2.306%, PI99%=±5.648%, CI[1430357.277, 1497880.771], PI[1381419.971, 1546818.077]) | 1480615.428 (n=5, σ=14716.978, CV=0.99%, CI99%=±2.047%, PI99%=±5.013%, CI[1450312.968, 1510917.888], PI[1406389.863, 1554840.993]) | 16496.404 | +1.127% |
| SET | rps | 10 | 1 | 1043928.138 (n=5, σ=5226.437, CV=0.50%, CI99%=±1.031%, PI99%=±2.525%, CI[1033166.833, 1054689.443], PI[1017568.432, 1070287.844]) | 1056213.750 (n=5, σ=6521.983, CV=0.62%, CI99%=±1.271%, PI99%=±3.114%, CI[1042784.896, 1069642.604], PI[1023319.910, 1089107.590]) | 12285.612 | +1.177% |
| SET | rps | 10 | 9 | 1946280.350 (n=5, σ=19116.738, CV=0.98%, CI99%=±2.022%, PI99%=±4.954%, CI[1906918.722, 1985641.978], PI[1849864.447, 2042696.253]) | 1987682.874 (n=5, σ=16748.335, CV=0.84%, CI99%=±1.735%, PI99%=±4.250%, CI[1953197.821, 2022167.927], PI[1903212.090, 2072153.658]) | 41402.524 | +2.127% |

Configuration:

  • architecture: aarch64
  • benchmark_mode: duration
  • clients: 1600
  • cluster_mode: False
  • data_size: 96
  • duration: 180
  • tls: False
  • valkey_benchmark_threads: 90
  • warmup: 30

| Command | Metric | Pipeline | io_threads | unstable | 8ca73b3 | Diff | % Change |
|---|---|---|---|---|---|---|---|
| GET | rps | 1 | 1 | 222176.852 (n=5, σ=1206.090, CV=0.54%, CI99%=±1.118%, PI99%=±2.738%, CI[219693.496, 224660.208], PI[216093.896, 228259.808]) | 220376.370 (n=5, σ=2155.231, CV=0.98%, CI99%=±2.014%, PI99%=±4.932%, CI[215938.719, 224814.021], PI[209506.389, 231246.351]) | -1800.482 | -0.810% |
| GET | rps | 1 | 9 | 1457394.624 (n=5, σ=11412.368, CV=0.78%, CI99%=±1.612%, PI99%=±3.949%, CI[1433896.400, 1480892.848], PI[1399835.964, 1514953.284]) | 1454072.198 (n=5, σ=12938.939, CV=0.89%, CI99%=±1.832%, PI99%=±4.488%, CI[1427430.743, 1480713.653], PI[1388814.227, 1519330.169]) | -3322.426 | -0.228% |
| GET | rps | 10 | 1 | 1185527.750 (n=5, σ=7371.858, CV=0.62%, CI99%=±1.280%, PI99%=±3.136%, CI[1170348.993, 1200706.507], PI[1148347.541, 1222707.959]) | 1197302.726 (n=5, σ=6737.916, CV=0.56%, CI99%=±1.159%, PI99%=±2.838%, CI[1183429.263, 1211176.189], PI[1163319.821, 1231285.631]) | 11774.976 | +0.993% |
| GET | rps | 10 | 9 | 2230588.250 (n=5, σ=25678.874, CV=1.15%, CI99%=±2.370%, PI99%=±5.806%, CI[2177715.093, 2283461.407], PI[2101075.994, 2360100.506]) | 2288793.500 (n=5, σ=26069.578, CV=1.14%, CI99%=±2.345%, PI99%=±5.745%, CI[2235115.878, 2342471.122], PI[2157310.716, 2420276.284]) | 58205.250 | +2.609% |
| SET | rps | 1 | 1 | 213710.366 (n=5, σ=987.679, CV=0.46%, CI99%=±0.952%, PI99%=±2.331%, CI[211676.721, 215744.011], PI[208728.973, 218691.759]) | 213811.440 (n=5, σ=1010.977, CV=0.47%, CI99%=±0.974%, PI99%=±2.385%, CI[211729.824, 215893.056], PI[208712.542, 218910.338]) | 101.074 | +0.047% |
| SET | rps | 1 | 9 | 1430660.198 (n=5, σ=5161.627, CV=0.36%, CI99%=±0.743%, PI99%=±1.820%, CI[1420032.337, 1441288.059], PI[1404627.361, 1456693.035]) | 1447783.876 (n=5, σ=7657.616, CV=0.53%, CI99%=±1.089%, PI99%=±2.668%, CI[1432016.739, 1463551.013], PI[1409162.435, 1486405.317]) | 17123.678 | +1.197% |
| SET | rps | 10 | 1 | 1034570.686 (n=5, σ=2241.983, CV=0.22%, CI99%=±0.446%, PI99%=±1.093%, CI[1029954.413, 1039186.959], PI[1023263.173, 1045878.199]) | 1051506.962 (n=5, σ=3530.386, CV=0.34%, CI99%=±0.691%, PI99%=±1.693%, CI[1044237.848, 1058776.076], PI[1033701.342, 1069312.582]) | 16936.276 | +1.637% |
| SET | rps | 10 | 9 | 1830460.300 (n=5, σ=23823.894, CV=1.30%, CI99%=±2.680%, PI99%=±6.564%, CI[1781406.572, 1879514.028], PI[1710303.697, 1950616.903]) | 1886475.202 (n=5, σ=30806.601, CV=1.63%, CI99%=±3.362%, PI99%=±8.236%, CI[1823043.984, 1949906.420], PI[1731101.085, 2041849.319]) | 56014.902 | +3.060% |
