Skip to content

fix(PartialEq): resolve TOCTOU race condition and prevent self-comparison deadlock#365

Open
p-player wants to merge 1 commit intoxacrimon:masterfrom
p-player:fix-eq-toctou
Open

fix(PartialEq): resolve TOCTOU race condition and prevent self-comparison deadlock#365
p-player wants to merge 1 commit intoxacrimon:masterfrom
p-player:fix-eq-toctou

Conversation

@p-player
Copy link

Description

This PR addresses the Time-of-Check to Time-of-Use (TOCTOU) race condition in PartialEq and is_empty implementations, as reported in issue #358.

The previous implementation of PartialEq performed a length check followed by an element-wise iteration as separate, non-atomic steps. This allowed concurrent mutators to modify the map between these steps, leading to incorrect equality results (false positives).

Changes

  • Atomic Multi-Shard Locking: Updated DashMap::PartialEq to acquire read guards for all shards in both maps simultaneously before performing any comparison. This guarantees an atomic snapshot of the state.
  • Self-Comparison Safety: Integrated a pointer equality check (std::ptr::eq) at the beginning of PartialEq. This prevents deadlocks that would otherwise occur when attempting to double-lock the same shards during a self-comparison (e.g., map == map).
  • Consistent is_empty: Modified _is_empty to use an improved per-shard length check, significantly reducing the window for inconsistent observations.
  • DashSet Integration: Refactored DashSet to delegate its equality check to the underlying DashMap, inheriting the atomicity fixes.

Testing

Added new regression and concurrency tests in src/lib.rs:

  • test_self_equality: Verifies that comparing a map to itself does not deadlock and returns true.
  • test_eq_race_condition: Spawns a dedicated mutator thread to stress-test PartialEq under high concurrency, ensuring no false positives are returned.

All 35 existing and new tests passed successfully.

Fixes #358

src/lib.rs Outdated

fn _is_empty(&self) -> bool {
self._len() == 0
self.shards.iter().all(|s| s.read().len() == 0)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't seem to improve anything -- you can still get a race because the shards are never locked at the same time.

Comment on lines +1372 to +1376
let hash = {
let mut hasher = other.hasher.build_hasher();
k.hash(&mut hasher);
hasher.finish()
};

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: BuildHasher::hash_one was added in 1.71, dashmap's MSRV is 1.70 and was bumped a year ago, so maybe bumping MSRV to use hash_one would be a good idea just from a code readability PoV

@purplesyringa
Copy link

purplesyringa commented Feb 23, 2026

As @xacrimon said in the issue, I think this fix is somewhat misguided.

This fixes races in some functions, like == and is_empty, but not in others, like len. And it's not quite obvious what's the right way to handle len is: the user might want either an approximation of the length fast (the current behavior of len), or a snapshot value (i.e. approximately what == does, where all shards are locked).

This calls for two functions as opposed to one, or, better yet, a separation of concerns: only make "true" ==, serialize, len, etc. available on ReadOnlyView, which guarantees lack of concurrent modification, and add separate clearly named methods to DashMap that don't guarantee consistency, e.g. approx_len (and that's it, probably).

The new issue here is that DashMap cannot be cheaply converted to ReadOnlyView: you need to own DashMap to do the conversion, which means the easiest way to achieve this is to put DashMap in an RwLock, call write, get &mut DashMap, transmute it to &mut ReadOnlyView (which is unsound without #[repr(transparent)] on ReadOnlyView, but let's ignore that for a second), and then operate on that. Ideally there'd be a way to get &ReadOnlyView without wrapping the map in RwLock, relying on the already existing per-shard mutexes.

I can think of a workaround: add a new type ReadOnlyViewRef that contains &DashMap, derefs to ReadOnlyView, and unlocks all shards on Drop. Add a method like DashMap::lock(&self) that locks all shards and returns a ReadOnlyViewRef.

This needs to be mulled over some more, but maybe it'll give you (or someone else who stumbles upon this) direction.

…ison deadlock

This change acquires read locks on all shards before performing equality checks in DashMap and DashSet, ensuring a consistent snapshot and preventing false positives under concurrent mutation. It also adds a pointer equality check to avoid deadlocks when comparing a map to itself.
@p-player
Copy link
Author

Thank you for the thorough review, @purplesyringa. I've pushed an update addressing the immediate issues:

Changes in this update:

  • ✅ Reverted _is_empty — agreed it doesn't improve anything without simultaneous locking
  • ✅ Fixed ABBA deadlock risk: shard locking order is now determined by pointer address (self as *const Self), so two threads comparing the same maps in opposite order (a.eq(b) vs b.eq(a)) will always acquire locks in the same global order
  • ℹ️ Kept manual hashing via build_hasher() to maintain the current MSRV (1.70). Happy to switch to hash_one if an MSRV bump is approved

On the broader architectural point:

You're right that the implicit locking approach in PartialEq doesn't compose well with other operations like len or serialize. Patching each method individually leads to inconsistent guarantees and more surface area for subtle deadlocks.

Your proposed separation of concerns makes much more sense:

  • Consistent operations (==, serialize, len) belong on ReadOnlyView / a new ReadOnlyViewRef
  • DashMap itself should expose approximate/non-atomic variants like approx_len

A DashMap::lock(&self) -> ReadOnlyViewRef that locks all shards and derefs to ReadOnlyView would elegantly solve the consistency, deadlock, and composability problems in one place.

I'm happy to rework this PR in that direction, or close it in favor of a more comprehensive implementation. Let me know how you'd prefer to proceed.

@oli-obk
Copy link

oli-obk commented Feb 23, 2026

Note that this p-player user is not contributing in good faith in Rust projects as evidenced by their comment in rust-lang/unsafe-code-guidelines#362 (comment)

@purplesyringa
Copy link

Ah, I was afraid that was the case. Thanks for the warning!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Eq, is_empty have false positives on races

3 participants