fix(core): batch-safe hashing for maven and gradle #34446

Merged
FrozenPandaz merged 22 commits into master from fix-batch-hash
Mar 6, 2026

FrozenPandaz (Collaborator) commented Feb 13, 2026

Current Behavior

In batch mode (Maven/Gradle), all task hashes are computed upfront in processScheduledBatch before the batch executor runs any tasks. Tasks with dependentTasksOutputFiles (aka depsOutputs) get hashed using whatever dependency outputs happen to be on disk from a previous run. This leads to:

  • False cache hits: If a dependency's sources changed but its old outputs are still on disk, the dependent task's hash matches a stale cache entry and wrong results are served.
  • False cache misses: On cold runs with no outputs on disk, the hash is computed without dependency output content and never matches any stored cache entry.

Non-batch mode doesn't have this problem because it uses lazy hashing — tasks with depsOutputs are only hashed after their dependencies complete and fresh outputs exist on disk.
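
The false cache hit described above can be reproduced in a toy model. The sketch below is not Nx code: the in-memory `Disk`, the `fnv1a` hash, `hashFromDepsOutputs`, and `compileLib` are all hypothetical stand-ins. It only shows why hashing a depsOutputs task before its dependency runs picks up stale content, while hashing it afterwards does not.

```typescript
// Toy model: the "disk" maps output paths to file contents.
type Disk = Map<string, string>;

// Hypothetical stand-in for a real content hash (32-bit FNV-1a over a string).
function fnv1a(text: string): string {
  let h = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    h ^= text.charCodeAt(i);
    h = Math.imul(h, 0x01000193) >>> 0;
  }
  return h.toString(16);
}

// Hash of a task whose only inputs are a dependency's outputs (depsOutputs).
function hashFromDepsOutputs(disk: Disk, prefix: string): string {
  const entries = Array.from(disk.entries())
    .filter(([path]) => path.startsWith(prefix))
    .sort(([a], [b]) => a.localeCompare(b));
  return fnv1a(JSON.stringify(entries));
}

// Hypothetical "compile" step: the output content is derived from the source.
function compileLib(disk: Disk, source: string): void {
  disk.set('lib/target/classes/Foo.class', `compiled(${source})`);
}

// Previous run: lib was built from v1 and app:compile was cached under this hash.
const disk: Disk = new Map<string, string>();
compileLib(disk, 'v1');
const cachedAppHash = hashFromDepsOutputs(disk, 'lib/target/classes/');

// lib's source changes to v2, but the stale v1 outputs are still on disk.
const upfrontHash = hashFromDepsOutputs(disk, 'lib/target/classes/'); // hashed before lib runs
compileLib(disk, 'v2');
const lazyHash = hashFromDepsOutputs(disk, 'lib/target/classes/'); // hashed after lib runs

const falseCacheHit = upfrontHash === cachedAppHash; // the stale entry would be served
const correctMiss = lazyHash !== cachedAppHash; // fresh outputs give a new hash
```

In this model the upfront hash equals the hash stored by the previous run, so a stale result would be served, while the hash taken after `lib` has rebuilt differs from it.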

Expected Behavior

Batch mode hashes tasks topologically — each task is hashed only after its dependencies have run and their outputs are on disk. This means hashes are always computed against fresh outputs, eliminating both false cache hits and false cache misses.

How it works

applyFromCacheOrRunBatch now has two phases:

  1. Topological cache resolution — Walk the entire batch task graph level by level. At each level, partition root tasks into cache-eligible vs ineligible. A task is ineligible for cache if it has depsOutputs inputs AND any of its dependencies were not cached (their outputs aren't on disk, so the hash would be wrong). Hash and check cache for eligible tasks, then remove all roots from the graph to expose the next level — even when some tasks are cache misses. This ensures the walk continues past cache misses to find deeper cache hits.

  2. Run remaining tasks, then hash — Rebuild a run graph from all non-cached task IDs and run them through the batch executor. After the batch completes, hash all tasks that ran. Since all outputs (including from sibling batch tasks) are now fresh on disk, tasks with depsOutputs get correct hashes on the first pass — no re-hash needed.
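
The level-by-level walk in phase 1 can be sketched as follows. This is a simplified model, not the code in applyFromCacheOrRunBatch: the `BatchTask` shape, the `resolveCache` function, the `Set`-based cache, and `toyHash` are assumptions made for illustration, and a real cache hit would also restore the task's outputs to disk before dependents are hashed.

```typescript
interface BatchTask {
  id: string;
  deps: string[]; // ids of direct dependencies inside the same batch
  hasDepsOutputs: boolean; // true if the inputs include dependency outputs
  cacheable: boolean;
}

interface Resolution {
  cached: string[];
  nonCached: string[];
}

// Phase 1 sketch: walk the batch graph level by level.
function resolveCache(
  tasks: BatchTask[],
  cache: Set<string>,
  hashTask: (task: BatchTask) => string
): Resolution {
  const remaining = new Map<string, BatchTask>();
  for (const t of tasks) remaining.set(t.id, t);
  const cached: string[] = [];
  const nonCached = new Set<string>();

  while (remaining.size > 0) {
    // Roots are tasks with no unprocessed dependencies left in the graph.
    const roots = Array.from(remaining.values()).filter((t) =>
      t.deps.every((d) => !remaining.has(d))
    );
    if (roots.length === 0) throw new Error('cycle in batch task graph');

    for (const task of roots) {
      // Ineligible: depends on outputs of a task that was not served from cache.
      const eligible =
        !task.hasDepsOutputs || task.deps.every((d) => !nonCached.has(d));
      const hit = eligible && task.cacheable && cache.has(hashTask(task));
      if (hit) cached.push(task.id);
      else nonCached.add(task.id);
    }
    // Remove every root, hit or miss, to expose the next level.
    for (const task of roots) remaining.delete(task.id);
  }
  return { cached, nonCached: Array.from(nonCached) };
}

// The A -> B -> C chain used in the example further down.
const chain: BatchTask[] = [
  { id: 'A', deps: [], hasDepsOutputs: false, cacheable: true },
  { id: 'B', deps: ['A'], hasDepsOutputs: true, cacheable: true },
  { id: 'C', deps: ['B'], hasDepsOutputs: false, cacheable: false },
];
const toyHash = (t: BatchTask): string => `H_${t.id}`;

const coldRun = resolveCache(chain, new Set<string>(), toyHash);
const warmRun = resolveCache(chain, new Set<string>(['H_A', 'H_B']), toyHash);
```

With an empty cache all three tasks end up in `nonCached` and are handed to phase 2. With `H_A` and `H_B` in the cache, only `C` is left to run.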

Task history lifecycle fix

The batch streaming callback calls endTasks as tasks finish mid-batch, but tasks haven't been hashed yet at that point (hash is deferred to post-execution). Previously, TaskHistoryLifeCycle and LegacyTaskHistoryLifeCycle eagerly snapshotted task.hash in endTasks, which sent undefined to the native Rust layer causing a "Missing field hash" crash.

Fix: Both lifecycles now store TaskResult references in endTasks and defer building TaskRun objects until endCommand, when task.hash is guaranteed to be set by the post-batch re-hash. The streaming callback and runBatch return value also now use the original task object reference (instead of spread copies) so that the hash mutation from hashBatchTasks flows through to all stored references.
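
A minimal sketch of the deferral, assuming simplified shapes for `Task`, `TaskResult`, and `TaskRun` (the real Nx types carry more fields, and `DeferredHistory` is a hypothetical class name):

```typescript
interface Task {
  id: string;
  hash?: string;
}
interface TaskResult {
  task: Task;
  status: 'success' | 'failure';
}
interface TaskRun {
  hash: string;
  status: string;
}

class DeferredHistory {
  private results: TaskResult[] = [];

  // Called mid-batch: keep references only, do not read task.hash yet.
  endTasks(results: TaskResult[]): void {
    this.results.push(...results);
  }

  // Called once at the end: hashes are expected to be set by now.
  endCommand(): TaskRun[] {
    return this.results.map((r) => {
      const hash = r.task.hash;
      if (hash === undefined) {
        throw new Error(`task ${r.task.id} was never hashed`);
      }
      return { hash, status: r.status };
    });
  }
}

const task: Task = { id: 'app:compile' };
const history = new DeferredHistory();
history.endTasks([{ task, status: 'success' }]); // streaming callback, hash still unset

const spreadCopy: Task = { ...task }; // a copy would not see the later mutation
task.hash = 'H_B'; // post-batch hashing mutates the original object

const runs = history.endCommand();
```

Because `endTasks` holds the original object, `runs[0].hash` picks up the value assigned after the batch, while `spreadCopy.hash` stays `undefined`. That is the reason for passing the original reference instead of a spread copy.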

Example: 3 tasks over 3 runs

Consider a batch with three tasks in a linear chain: A → B → C

  • Task A: lib:compile. Inputs: source files only. No depsOutputs.
  • Task B: app:compile. Depends on A. Inputs: only depsOutputs from Task A (e.g., target/classes/**). No source file inputs.
  • Task C: app:checkstyle. Depends on B. Not cacheable.

Hash notation: H(inputs…) means the hash is a function of those inputs.


Run 1 — Fresh (no outputs on disk, empty cache)

| Step | What happens |
| --- | --- |
| Phase 1, iter 1 | Roots = [A] (B depends on A, C depends on B). Hash A → H_A. Check cache: A = miss. A is added to nonCachedTaskIds. Remove all roots. |
| Phase 1, iter 2 | Roots = [B]. B has depsOutputs and A is non-cached → B is ineligible. Added to nonCachedTaskIds. Remove all roots. |
| Phase 1, iter 3 | Roots = [C]. C has no depsOutputs → eligible. Hash C → H_C. Cache miss. Added to nonCachedTaskIds. Remove all roots. |
| Phase 2 | Rebuild run graph from nonCachedTaskIds = {A, B, C}. Run batch: all 3 tasks execute. A produces target/classes/. Hash all tasks post-execution: A → H_A, B → H(A_outputs) = H_B, C → H_C. |
| Cache | Store A as H_A, store B as H_B. C is not cached. |

Key: B is hashed after the batch, when A's outputs already exist on disk. The hash is correct on the first pass. C was still checked against cache even though A and B were misses.


Run 2 — Warm (nothing changed, cache populated from Run 1)

| Step | What happens |
| --- | --- |
| Phase 1, iter 1 | Roots = [A]. Hash A → H_A. Check cache: A = HIT ✅ → restore target/classes/ to disk. |
| Phase 1, iter 2 | Roots = [B]. B has depsOutputs but A is cached (not in nonCachedTaskIds) → B is eligible. Hash B → H(A_outputs) = H_B (A's outputs just restored!). Check cache: B = HIT ✅ → restore B's outputs. |
| Phase 1, iter 3 | Roots = [C]. Hash C → H_C. Not cacheable → no hit. Added to nonCachedTaskIds. |
| Phase 2 | Rebuild run graph from nonCachedTaskIds = {C}. Run batch with just C. Hash C post-execution. |

Key: Phase 1 restored A's outputs from cache before hashing B. So B's hash matches Run 1's value → cache hit. The topological walk peels the chain one level at a time: A → B → C.


Run 3 — Source changed (A's source modified, stale outputs from Run 2 still on disk)

| Step | What happens |
| --- | --- |
| Phase 1, iter 1 | Roots = [A]. Hash A → H(A_src') = H_A' (new hash!). Check cache: A = miss (H_A' not in cache). A added to nonCachedTaskIds. |
| Phase 1, iter 2 | Roots = [B]. B has depsOutputs and A is non-cached → B is ineligible. Added to nonCachedTaskIds. |
| Phase 1, iter 3 | Roots = [C]. C has no depsOutputs → eligible. Hash C → H_C. Cache miss. Added to nonCachedTaskIds. |
| Phase 2 | Run batch: all 3 tasks execute. A produces new outputs. Hash all tasks post-execution: A → H_A', B → H(A_new_outputs) = H_B', C → H_C. |
| Cache | Store A as H_A', store B as H_B'. C is not cached. |

Key: Because hashing happens after execution, B is always hashed against A's fresh outputs. No stale hash, no re-hash needed. And C is still checked against cache at every level, even when upstream tasks miss.


Summary of hashes across runs

| Task | Run 1 (fresh) | Run 2 (warm) | Run 3 (src changed) |
| --- | --- | --- | --- |
| A | miss → cache H_A | hit H_A | miss → cache H_A' |
| B | miss → post-exec hash & cache H_B | hit H_B | miss → post-exec hash & cache H_B' |
| C | not cacheable → runs | not cacheable → runs | not cacheable → runs |

Maven plugin fixes

Several fixes to the Maven plugin to ensure correct batch behavior:

  • Propagate batch runner exit code failures: Batch runner process exit codes are now correctly propagated so task failures are reported properly.
  • Use glob patterns for gitignored dependent task outputs: depsOutputs patterns like target/classes are now resolved using glob patterns, fixing issues with .gitignored output directories.
  • Fix inputs for maven:test: Test task inputs now correctly include test source files so hash changes when tests are modified.
  • Include test sources in testCompile task hash: The testCompile target now includes src/test/java in its inputs.
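
To illustrate the glob-pattern fix: a bare directory pattern does not match the files inside that directory, while a pattern ending in `**/*` does. The sketch below uses a tiny hand-written glob-to-regex converter (`globToRegExp`, a hypothetical helper that only understands `*` and `**`). It is not the matcher Nx uses, and joining the directory with `**/*` is an assumption about how the two pieces combine.

```typescript
// Minimal glob support, for illustration only: "**/" spans any number of
// directories, "**" matches anything, "*" matches within one path segment.
function globToRegExp(glob: string): RegExp {
  let re = '';
  for (let i = 0; i < glob.length; i++) {
    const c = glob.charAt(i);
    if (c === '*' && glob.charAt(i + 1) === '*') {
      if (glob.charAt(i + 2) === '/') {
        re += '(?:.*/)?';
        i += 2;
      } else {
        re += '.*';
        i += 1;
      }
    } else if (c === '*') {
      re += '[^/]*';
    } else {
      // Escape characters that are special in regular expressions.
      re += c.replace(/[.+?^${}()|[\]\\]/g, '\\$&');
    }
  }
  return new RegExp(`^${re}$`);
}

const outputFile = 'target/classes/Foo.class';
const nestedFile = 'target/classes/com/acme/Foo.class';

// Directory path used as a pattern: matches the directory name, not its files.
const dirOnlyMatches = globToRegExp('target/classes').test(outputFile);

// Directory plus glob: matches files at any depth below the directory.
const globMatchesFlat = globToRegExp('target/classes/**/*').test(outputFile);
const globMatchesNested = globToRegExp('target/classes/**/*').test(nestedFile);
```

Here `dirOnlyMatches` is `false` and both glob checks are `true`, which mirrors the mismatch that the glob-based depsOutputs patterns are meant to avoid.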

Related Issue(s)

Related to #30949

nx-cloud bot (Contributor) commented Feb 13, 2026

View your CI Pipeline Execution ↗ for commit e7a9491

| Command | Status | Duration | Result |
| --- | --- | --- | --- |
| nx affected --targets=lint,test,build,e2e,e2e-c... | ✅ Succeeded | 48m 21s | View ↗ |
| nx run-many -t check-imports check-lock-files c... | ✅ Succeeded | 3m 25s | View ↗ |
| nx-cloud record -- nx-cloud conformance:check | ✅ Succeeded | 8s | View ↗ |
| nx-cloud record -- nx format:check | ✅ Succeeded | 1s | View ↗ |
| nx-cloud record -- nx sync:check | ✅ Succeeded | <1s | View ↗ |

☁️ Nx Cloud last updated this comment at 2026-03-06 21:27:44 UTC

FrozenPandaz force-pushed the fix-batch-hash branch 3 times, most recently from cf36e18 to 71d9c98, on February 13, 2026 16:52
netlify bot commented Feb 13, 2026

Deploy Preview for nx-docs ready!

| Name | Link |
| --- | --- |
| 🔨 Latest commit | e7a9491 |
| 🔍 Latest deploy log | https://app.netlify.com/projects/nx-docs/deploys/69ab3a78cf697a000817071e |
| 😎 Deploy Preview | https://deploy-preview-34446--nx-docs.netlify.app |
netlify bot commented Feb 13, 2026

Deploy Preview for nx-dev ready!

| Name | Link |
| --- | --- |
| 🔨 Latest commit | e7a9491 |
| 🔍 Latest deploy log | https://app.netlify.com/projects/nx-dev/deploys/69ab3a785f498d00084cc964 |
| 😎 Deploy Preview | https://deploy-preview-34446--nx-dev.netlify.app |
FrozenPandaz changed the title from "fix(core): validate batch task hashes against stale dependency outputs" to "feat(core): batch-safe hashing with output fingerprinting" on Feb 13, 2026
nx-cloud[bot]

This comment was marked as outdated.

FrozenPandaz force-pushed the fix-batch-hash branch 2 times, most recently from ef9a35e to 92f349b, on February 20, 2026 16:11
FrozenPandaz force-pushed the fix-batch-hash branch 2 times, most recently from 940564b to b74c720, on February 20, 2026 19:57
FrozenPandaz force-pushed the fix-batch-hash branch 2 times, most recently from ebc28bc to a270277, on February 20, 2026 23:17
FrozenPandaz force-pushed the fix-batch-hash branch 2 times, most recently from c7c71a0 to a0a0b8e, on February 27, 2026 14:54
FrozenPandaz changed the title from "feat(core): batch-safe hashing with output fingerprinting" to "fix(core): batch-safe hashing for maven and gradle" on Mar 4, 2026
FrozenPandaz marked this pull request as ready for review on March 5, 2026 14:17
FrozenPandaz requested review from a team, MaxKless and lourw as code owners on March 5, 2026 14:17

When tasks with dependentTasksOutputFiles are co-batched with their
dependencies, hashes are computed using outputs from a previous run
that may be stale. This adds a validation step that checks whether
dependency outputs on disk match the dependency's current hash, skips
cache reads for untrustworthy hashes, and re-hashes after the batch
completes with fresh outputs.

With output fingerprinting enabled for all cache operations (not just
batch mode), tasks whose outputs are already on disk now correctly
report "existing outputs match the cache" instead of "local cache".

Updated e2e assertions across cache, run, ng-add, and nx-init-angular
tests to expect the new status when outputs haven't been deleted.

Wrap identifyTasksWithStaleDepsOutputs and getInputs in try-catch so
that targets without proper input configuration (e.g. inferred maven
targets) don't crash the entire batch execution.

hashTasks filters out tasks that already have a hash. The previous code
cleared hashes on the result copies (created by runBatch via spread) but
called hashTasks on batch.taskGraph which holds the originals — still
with their hashes. This caused hashTasks to skip them entirely, leaving
the copies with undefined hashes that crashed napi when passed to
cache.put.

Clear hashes on the originals so hashTasks picks them up, then sync the
fresh hashes back to the result copies.

The testCompile target hash was missing src/test/java/**/*.java because
CacheConfig used the wrong parameter name for the compiler plugin's
test source roots. Also removes an unnecessary fallback in
MavenExpressionResolver that masked the issue.

When Maven task inputs reference gitignored paths (like target/classes),
they were being converted to dependent task outputs using the directory
path itself. This caused pattern matching issues because:

1. Outputs like lib:compile produce: lib/target/classes/Foo.class
2. Pattern was: "target/classes" (just the directory)
3. We need: "**/*" (to match files within that directory)

Changes:
- MojoAnalyzer: Use parameter's glob pattern (default "**/*") for gitignored inputs
- NxTargetFactory: Use "nx-build-state.json" pattern for gitignored build state
- create-maven-project: Check if .gitignore exists before reading

This ensures dependent task outputs correctly match files from dependency outputs.

The batch executor was always resolving the promise even when the
batch runner JAR exited with a non-zero code. This meant that test
failures were not being propagated back to the CLI.

Now properly rejects the promise when exitCode !== 0, ensuring that:
- Failed tests cause the nx command to throw an error
- The test output is captured and displayed
- The overall command exits with the correct exit code

When a batch task has stale dependency outputs, only that task was
marked stale. Transitive dependents (e.g. package depends on compile
via noop phases) were incorrectly served from cache and removed from
the batch graph, breaking the dependency chain ordering.

…Config

Gitignored input paths (e.g. target/classes, target/test-classes) are
already auto-detected by MojoAnalyzer and converted to
dependentTasksOutputFiles. The explicit entries were duplicating this.

Hashing now happens topologically so tasks are always hashed against
fresh dependency outputs. This removes the stale detection, transitive
propagation, and post-batch re-hashing machinery which is no longer needed.

Output fingerprinting (daemon-free OutputFingerprints Rust service and
hashTaskOutput napi binding) is not related to batch processing. Revert
to the daemon-only shouldCopyOutputsFromCache path that exists on master.

Batch tasks with depsOutputs inputs (e.g. app:compile depending on
lib:compile outputs) were getting incorrect hashes because all tasks
were hashed upfront before any ran, meaning dependency outputs didn't
exist on disk yet.

Replace the flat hash-all approach with three phases:
1. Topological cache resolution - walk batch graph level by level,
   hash roots, check cache, restore outputs before hashing dependents
2. Run remaining uncached tasks via batch executor
3. Re-hash tasks with depsOutputs after batch completes so
   postRunSteps caches under the correct hash

The DependentTaskOutput class and dependentTaskOutputs field on
MojoConfig were never populated by any configuration, making the
forEach loop in MojoAnalyzer dead code.

- Extract hashBatchTasks() helper to eliminate 3 duplicate hashTask call sites
- Remove unused hashTasks import
- Rename ranInBatch to batchTaskIds for clarity
- Parallelize Phase 3 re-hashing instead of sequential awaits
- Pre-filter tasks before calling getInputs() to avoid unnecessary config parsing

Remove pre-execution hashing (Phase 2) and post-execution re-hashing
(Phase 3). Instead, hash all batch tasks once after execution when
outputs are fresh on disk. Tasks with depsOutputs get correct hashes
on the first pass since sibling outputs already exist.

Task history lifecycles were snapshotting task.hash eagerly in endTasks,
but batch tasks haven't been hashed yet at streaming time. This caused
the Rust native layer to crash with "Missing field hash".

Instead of hashing mid-batch, store TaskResult references in endTasks
and build TaskRun objects lazily in endCommand when task.hash is
guaranteed to be set by the post-batch re-hash. Also use original task
references instead of spread copies so hash mutations flow through.

"maven-compiler-plugin:testCompile" to MojoConfig(
inputParameters = setOf(
Parameter("testCompileSourceRoots", "**/*.java"),
Parameter("compileSourceRoots", "**/*.java"),

This comment was marked as resolved.

@nx-cloud nx-cloud bot left a comment

Important

At least one additional CI pipeline execution has run since the conclusion below was written and it may no longer be applicable.

Nx Cloud has identified a possible root cause for your failed CI:

Our test suite encountered a heap out of memory error during nx:test execution. This failure has a 7.49% historical flakiness rate and appears to be an environmental resource constraint rather than a code regression. We should increase the Node.js heap size for CI (NODE_OPTIONS=--max-old-space-size=4096) or optimize test suite chunking to address this recurring issue.

No code changes were suggested for this issue.

🔂 A CI rerun has been triggered by adding an empty commit to this branch.

Instead of breaking when zero cache hits occur at a level, walk the
entire graph and partition tasks into cache-eligible vs ineligible.
Tasks with depsOutputs whose dependencies were not cached are skipped
(their dep outputs aren't on disk), while all other tasks are still
checked against cache at every level.

FrozenPandaz merged commit 043aaee into master on Mar 6, 2026
24 checks passed
FrozenPandaz deleted the fix-batch-hash branch on March 6, 2026 22:28
FrozenPandaz added a commit that referenced this pull request Mar 11, 2026
## Current Behavior

After #34446, batch tasks with `depsOutputs` inputs had their hashing
deferred until after execution. This meant the streaming `endTasks`
callback fired with `task.hash = undefined`, which Cloud/DTE rejects.

## Expected Behavior

All batch tasks always have a valid hash when `endTasks` is called.
Tasks with `depsOutputs` get a preliminary hash upfront (based on
whatever outputs are on disk), then are re-hashed after execution with
fresh outputs for correct cache storage.

### How it works

1. **Phase 1** now hashes ALL root tasks at each level (not just
cache-eligible ones). Ineligible tasks get a preliminary hash so the
streaming callback always has something valid to send.
2. **Phase 2** runs the batch, then clears and re-hashes all tasks that
ran — outputs are fresh on disk, so depsOutputs tasks get correct final
hashes.
3. The re-hash logic is consolidated into a single block after both code
paths (cache-enabled and cache-skipped).

## Related Issue(s)

Fixes the undefined hash regression from #34446

---------

Co-authored-by: nx-cloud[bot] <71083854+nx-cloud[bot]@users.noreply.github.com>
Co-authored-by: FrozenPandaz <FrozenPandaz@users.noreply.github.com>
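
The follow-up commit above can be pictured with a small sketch: every task gets a preliminary hash before execution so the streaming callback never sees `undefined`, and all tasks that ran are re-hashed afterwards against fresh outputs. The names `runBatchWithPreliminaryHashes`, `hashNow`, `execute`, and `onTaskEnd` are hypothetical, and the sequential loop stands in for the real batch executor.

```typescript
interface Task {
  id: string;
  hash?: string;
}
type Hasher = (task: Task) => string;

function runBatchWithPreliminaryHashes(
  tasks: Task[],
  hashNow: Hasher,
  execute: (task: Task) => void,
  onTaskEnd: (task: Task) => void
): void {
  // Before execution: preliminary hashes from whatever is on disk right now.
  for (const t of tasks) t.hash = hashNow(t);

  // Execution: the streaming callback fires while hashes are still preliminary.
  for (const t of tasks) {
    execute(t);
    onTaskEnd(t);
  }

  // After execution: clear and re-hash against fresh outputs.
  for (const t of tasks) {
    t.hash = undefined;
    t.hash = hashNow(t);
  }
}

// Toy state: the content of lib's output as seen by app:compile's hasher.
let libOutput = 'stale';
const hashNow: Hasher = (t) =>
  t.id === 'app:compile' ? `H(${libOutput})` : `H(${t.id}:src)`;

const streamed: string[] = [];
const batch: Task[] = [{ id: 'lib:compile' }, { id: 'app:compile' }];

runBatchWithPreliminaryHashes(
  batch,
  hashNow,
  (t) => {
    if (t.id === 'lib:compile') libOutput = 'fresh'; // lib rewrites its outputs
  },
  (t) => streamed.push(String(t.hash))
);
```

In this model the callback receives `H(stale)` for `app:compile`, which is a valid but preliminary value, and the task ends with `H(fresh)` as its final hash for cache storage.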
github-actions bot (Contributor) commented

This pull request has already been merged/closed. If you experience issues related to these changes, please open a new issue referencing this pull request.

github-actions bot locked this conversation as resolved and limited it to collaborators on Mar 12, 2026