Make Zig a first-class GitNexus citizen across the CLI and web app#305

Open
zolotukhin wants to merge 1 commit into abhigyanpatwari:main from zolotukhin:zig-first-class-support


@zolotukhin zolotukhin commented Mar 16, 2026

Summary

This PR makes Zig a first-class language in GitNexus across both the CLI and the web app, and it closes the main correctness gaps that showed up during review.

It does four things:

  • adds end-to-end Zig ingestion in the CLI and web paths
  • keeps the CLI and web Zig grammars in sync so they build the same graph
  • exposes detect-changes directly in the CLI
  • fixes Code Inspector so Zig files are readable in practice, with full relative paths and Zig syntax highlighting

Why this matters

Before this patch, Zig support was incomplete in ways that could silently degrade graph quality:

  • the CLI and web ingestion paths could diverge because they were not guaranteed to use the same Zig grammar shape
  • exported Zig symbols using export fn were not treated as exported
  • anonymous tests, opaque {} types, and container fields were not modeled well enough for reliable graph navigation
  • build files and generated artifacts could add avoidable noise to the graph
  • the web Code Inspector did not make Zig repositories pleasant to inspect because path display and syntax highlighting were incomplete

This patch makes Zig repositories much more dependable to analyze, review, and navigate in both shipped entry points.

What changed

Zig ingestion and graph correctness

  • registered Zig as a supported language in both gitnexus and gitnexus-web
  • aligned the web Zig query set with the CLI query set so both paths use the same node labels and call/import patterns
  • replaced the bundled Zig wasm grammar with the wasm compiled from the same @tree-sitter-grammars/tree-sitter-zig@1.1.2 source used by the CLI, and documented that provenance
  • added a parity test that checks:
    • the CLI and web Zig queries stay in sync
    • both parsers produce the same capture summary on the same Zig source
    • both packages use the same Zig wasm bytes
  • taught Zig export detection to recognize both pub and export
  • added support for:
    • anonymous test {} blocks
    • named Zig tests
    • opaque {} declarations
    • container fields as properties
    • struct/enum/union owner lookup for HAS_METHOD and HAS_PROPERTY
  • added minimal Zig type extraction for explicit declaration types and parameter types

Noise reduction and repository hygiene

  • excluded both build.zig and build.zig.zon from indexing
  • kept ignore handling generic rather than baking in repository-specific internal folders
  • added Zig-specific resolver plumbing so .zig files resolve through the standard import path machinery
  • added Zig built-in noise filtering for common allocator and testing helpers

CLI and web product improvements

  • exposed detect-changes directly through the CLI entrypoint
  • fixed the server connection defaults in the web app so the connect flow points at the API instead of the frontend origin
  • updated Code Inspector to:
    • show the full relative file path instead of only the basename
    • apply Zig syntax highlighting reliably
    • use the resolved file node path when node metadata is incomplete
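
The Code Inspector fallback described above can be pictured with a small sketch. The types and function names here (`NodeMeta`, `FileNode`, `inspectorPathLabel`, `highlightLanguage`) are hypothetical and are not taken from CodeReferencesPanel.tsx or code-highlighting.ts; the sketch only assumes that node metadata may lack a path and that a resolved file node carries one.

```typescript
// Hypothetical shapes; the real web app types may differ.
interface NodeMeta {
  filePath?: string;
}
interface FileNode {
  path: string;
}

/** Prefer the full relative path from metadata, else fall back to the resolved file node. */
function inspectorPathLabel(meta: NodeMeta, fileNode: FileNode | undefined): string {
  if (meta.filePath && meta.filePath.length > 0) return meta.filePath;
  return fileNode ? fileNode.path : "(unknown file)";
}

/** Pick a highlighter language from the file extension (only Zig is handled in this sketch). */
function highlightLanguage(path: string): string {
  return /\.zig$/.test(path) ? "zig" : "plaintext";
}
```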

Key implementation notes

  • tree-sitter-queries.ts
    • defines the Zig capture rules used to extract definitions, imports, calls, tests, containers, properties, and opaque types
  • export-detection.ts
    • decides whether a Zig declaration should be treated as exported in the graph
  • ast-helpers.ts
    • derives container ownership for members, including Zig methods declared inside const Foo = struct { ... }
  • type-extractors/zig.ts
    • extracts explicit declaration and parameter types so Zig symbols participate in the type environment
  • parser-loader.ts
    • loads the native Zig parser in the CLI and the matching wasm parser in the web app
  • CodeReferencesPanel.tsx and code-highlighting.ts
    • drive the Code Inspector path label and language-specific syntax highlighting in the web UI

How I tested it

  • npm run build in gitnexus
  • npm run build in gitnexus-web
  • npx vitest run --maxWorkers 1
    • result: 93 test files passed, 1 skipped
    • result: 3616 tests passed, 20 skipped
  • focused Zig/change-area verification:
    • npx vitest run --coverage --maxWorkers 1 test/unit/tree-sitter-queries.test.ts test/unit/zig-web-parity.test.ts test/unit/ingestion-utils.test.ts test/unit/ignore-service.test.ts test/unit/type-env.test.ts test/unit/has-method.test.ts test/unit/parser-loader.test.ts test/unit/repo-manager.test.ts test/unit/tool-cli.test.ts test/unit/cli-index-help.test.ts test/integration/tree-sitter-languages.test.ts test/integration/parsing.test.ts test/integration/filesystem-walker.test.ts test/integration/has-method.test.ts test/integration/query-compilation.test.ts test/integration/cli-e2e.test.ts test/integration/skills-e2e.test.ts
    • result: 17 test files passed
    • result: 894 tests passed, 1 skipped

Focused coverage highlights from that run:

  • gitnexus/src/core/ingestion/tree-sitter-queries.ts: 100% lines
  • gitnexus-web/src/core/ingestion/tree-sitter-queries.ts: 100% lines
  • gitnexus/src/config/ignore-service.ts: 98.18% lines
  • gitnexus/src/core/ingestion/export-detection.ts: 97.27% lines
  • gitnexus/src/core/tree-sitter/parser-loader.ts: 100% lines
  • gitnexus/src/core/ingestion/type-extractors/zig.ts: 95.65% lines
  • gitnexus/src/core/ingestion/filesystem-walker.ts: 86.56% lines

Review issues addressed

  • fixed the CLI/web Zig grammar mismatch by syncing the grammar source and adding a parity test
  • fixed the schema mismatch where web struct captures could diverge from CLI labels
  • fixed export fn handling
  • replaced the Zig type extractor stub with real extraction for explicit declaration and parameter types
  • excluded build.zig as well as build.zig.zon
  • verified the Zig import query shape against the synced grammar path
  • added coverage for anonymous tests, opaque types, container fields, and Zig owner resolution


vercel bot commented Mar 16, 2026

@zolotukhin is attempting to deploy a commit to the NexusCore Team on Vercel.

A member of the Team first needs to authorize it.

zolotukhin force-pushed the zig-first-class-support branch from c2d0fe2 to 8390116 on March 16, 2026 16:19
zolotukhin force-pushed the zig-first-class-support branch 3 times, most recently from cfea68a to c0529c6, on March 19, 2026 04:07
@zander-raycraft
Collaborator

@claude can you review this as a senior Zig system designer? Refer to https://ziglang.org/documentation/master/ for reference, look for nuances and gaps in this repo, and check its viability to merge and build out support for Zig.

@github-actions
Contributor

github-actions bot commented Mar 21, 2026

Claude finished @zander-raycraft's task in 3m 24s


Zig First-Class Support — Senior Zig System Design Review

Tasks

  • Initialize review
  • Read Zig-specific source files (tree-sitter-queries.ts, type-extractors/zig.ts, export-detection.ts)
  • Read tree-sitter query definitions (CLI + Web)
  • Read parser loader and call routing
  • Read filesystem walker and ignore service changes
  • Read tests for coverage gaps
  • Cross-reference against Zig language spec
  • Write detailed review

Summary

The PR introduces solid foundational Zig support — grammar registration, export detection, container parsing, and a fixture file. The architecture mirrors the established pattern used by Rust and Go, and the pub-based visibility check is semantically correct. The test coverage is reasonable. However, there are several Zig-specific language nuances and one critical consistency bug between the CLI and web grammars that need to be addressed before this can be considered fully production-ready.


🔴 Critical: CLI vs Web Grammar Node Name Mismatch

The CLI uses @tree-sitter-grammars/tree-sitter-zig (Node native) while the web uses a bundled .wasm file. These appear to be different grammar versions with incompatible AST node names:

| Construct | CLI (tree-sitter-queries.ts) | Web (tree-sitter-queries.ts) |
|---|---|---|
| Direct calls | call_expression function: (identifier) | function_call function: (identifier) |
| Member calls | call_expression function: (field_expression member:) | function_call function: (field_access field:) |
| Import builtin | builtin_function (builtin_identifier) (#eq? "@import") | builtin_call function: (identifier) (#eq? "@import") |

If the WASM grammar and the npm grammar are out of sync, one of the two ingestion paths will silently produce zero call edges for all Zig files. The provenance of the WASM grammar (tree-sitter-zig.wasm) is not documented: it is unclear which version or commit it was compiled from. This needs a pinned version comment and a CI test that validates both query sets against the same Zig source.
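
One way such a CI check could compare the two ingestion paths is by summarizing captures per capture name and diffing the summaries. The sketch below is a hedged illustration: `Capture`, `summarizeCaptures`, and `diffSummaries` are hypothetical names, and obtaining the captures from the native and WASM parsers is left out.

```typescript
// Hypothetical capture shape; real tree-sitter bindings return richer objects.
interface Capture {
  name: string; // e.g. "call", "import", "definition.struct"
  text: string;
}

/** Count captures per capture name, e.g. { call: 2, import: 1 }. */
function summarizeCaptures(captures: Capture[]): Record<string, number> {
  const summary: Record<string, number> = {};
  for (const c of captures) {
    summary[c.name] = (summary[c.name] ?? 0) + 1;
  }
  return summary;
}

/** Return the capture names whose counts differ between the CLI and web paths. */
function diffSummaries(
  cli: Record<string, number>,
  web: Record<string, number>
): string[] {
  const names = Object.keys({ ...cli, ...web }).sort();
  return names.filter((n) => (cli[n] ?? 0) !== (web[n] ?? 0));
}
```

A grammar mismatch of the kind described above would show up as a non-empty diff, for example a `call` entry present on one side and missing on the other.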


🔴 Critical: Web ZIG_QUERIES Uses Different Schema Label for Structs

gitnexus-web/src/core/ingestion/tree-sitter-queries.ts:498:

(variable_declaration
  (identifier) @name
  (struct_declaration)) @definition.class   ; Web uses 'class'

But gitnexus/src/core/ingestion/tree-sitter-queries.ts:903:

(variable_declaration
  (identifier) @name
  (struct_declaration)) @definition.struct  ; CLI uses 'struct'

A Zig struct Config will be stored as Struct in CLI analysis and Class in web analysis. This breaks cross-context consistency of the knowledge graph schema.
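
To make the consequence concrete, here is a minimal sketch of how a capture name might be turned into a graph label. The mapping table and `labelForCapture` are assumptions for illustration; the actual label derivation in GitNexus may work differently.

```typescript
// Hypothetical mapping; the real label table in GitNexus may differ.
const LABEL_BY_KIND: Record<string, string> = {
  class: "Class",
  struct: "Struct",
  function: "Function",
};

/** Turn a capture such as "definition.struct" into a graph label, or undefined if unknown. */
function labelForCapture(captureName: string): string | undefined {
  const prefix = "definition.";
  if (captureName.indexOf(prefix) !== 0) return undefined;
  return LABEL_BY_KIND[captureName.slice(prefix.length)];
}
```

Under this assumption the same `const Config = struct { ... }` yields `Struct` from the CLI query and `Class` from the web query, so a graph query filtered on one label misses nodes produced by the other path.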


🟡 Major: export Keyword Not Recognized by zigExportChecker

gitnexus/src/core/ingestion/export-detection.ts:209-228:

const ZIG_DECL_TYPES = new Set([
  'function_declaration',
  'variable_declaration',
  'test_declaration',
]);

const zigExportChecker: ExportChecker = (node, _name) => {
  // ... (excerpt) only looks for a 'pub' child token
  if (child?.text === 'pub') return true;
  // ...
};

Zig has two distinct export mechanisms:

  • pub fn foo() — marks a symbol public within Zig's module system
  • export fn foo() — marks a symbol with C ABI, making it externally linkable (the strongest form of "exported")

A declaration such as export fn add(a: i32, b: i32) i32 is the most definitively exported kind of Zig symbol (used for C interop, embedded firmware entry points, etc.), yet this checker returns false for it.
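
A checker that accepts both keywords could look roughly like this. It is a sketch under stated assumptions: `MiniNode` is a stand-in for a syntax node rather than the real tree-sitter API, `isZigExported` is a hypothetical name, and it assumes the grammar exposes `pub` and `export` as direct child tokens of the declaration.

```typescript
// Minimal stand-in for a syntax node; not the real tree-sitter node type.
interface MiniNode {
  type: string;
  text: string;
  children: MiniNode[];
}

// Both Zig export mechanisms: module-level visibility and C ABI linkage.
const ZIG_EXPORT_KEYWORDS: string[] = ["pub", "export"];

/** True when the declaration carries a `pub` or `export` keyword token as a direct child. */
function isZigExported(decl: MiniNode): boolean {
  return decl.children.some(
    (child) => ZIG_EXPORT_KEYWORDS.indexOf(child.text) !== -1
  );
}
```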


🟡 Major: Stub Type Extractor — No Struct Field Tracking

gitnexus/src/core/ingestion/type-extractors/zig.ts:

export const typeConfig: LanguageTypeConfig = {
  declarationNodeTypes: EMPTY_DECLARATIONS,
  extractDeclaration: () => undefined,
  extractParameter: () => undefined,
};

This is a placeholder that skips all type resolution for Zig. While acceptable for a first pass, Zig structs are the primary unit of data composition — without field tracking, the knowledge graph cannot model struct composition, comptime generics, or allocator injection patterns. Zig struct fields have explicit types and are critical for the type graph. This should be called out prominently in the PR as a known limitation with a follow-up issue.
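
For a sense of what minimal extraction of explicit declaration types involves, the sketch below pulls the annotated type out of a single declaration line. It is a text-based stand-in chosen for brevity, not an AST-based extractor like the one a `LanguageTypeConfig` would use; `explicitDeclType` is a hypothetical name, and the pattern does not handle types that themselves contain `=`.

```typescript
/**
 * Extract the explicit type from `const name: T = ...` or `var name: T = ...`.
 * Returns undefined when the type is inferred (no annotation).
 */
function explicitDeclType(
  source: string
): { name: string; type: string } | undefined {
  const m = /^\s*(?:pub\s+)?(?:const|var)\s+([A-Za-z_]\w*)\s*:\s*([^=]+?)\s*=/.exec(source);
  return m ? { name: m[1], type: m[2] } : undefined;
}
```

For example, `var gpa: std.mem.Allocator = undefined;` gives the name `gpa` with type `std.mem.Allocator`, while `const x = 5;` gives no result because the type is inferred.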


🟡 Major: build.zig Indexed as Source Code

gitnexus/src/config/ignore-service.ts correctly excludes build.zig.zon (the package manifest), but build.zig is indexed as regular source code. build.zig is a build script that calls Zig's build system API (b.addExecutable(), b.installArtifact(), b.dependency(), etc.) — these will produce noisy call edges pointing to build infrastructure, not application logic.

Consider adding build.zig to IGNORED_FILES (analogous to how Makefile/Cargo.toml are not indexed), or at minimum document this as known behavior.
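
A basename-based exclusion is one simple way to do this. In the sketch below, `IGNORED_FILES` borrows the name mentioned above but its contents and type are assumptions, and `shouldIgnoreFile` is a hypothetical helper, not the ignore-service API.

```typescript
// IGNORED_FILES is named in the review; its contents here are an assumption.
const IGNORED_FILES: string[] = ["build.zig", "build.zig.zon"];

/** True when the file's basename is on the ignore list, regardless of directory. */
function shouldIgnoreFile(relativePath: string): boolean {
  const parts = relativePath.split(/[\\/]/);
  const base = parts[parts.length - 1];
  return IGNORED_FILES.indexOf(base) !== -1;
}
```

Matching on the basename rather than a substring avoids excluding ordinary sources whose names merely end in `build.zig`, such as `src/rebuild.zig`.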


🟡 Major: Web Import Query Uses Incorrect Grammar Pattern for @import

gitnexus-web/src/core/ingestion/tree-sitter-queries.ts:528-532:

((builtin_call
  function: (identifier) @_fn
  arguments: (arguments
    (string (string_content) @import.source)))
  (#eq? @_fn "@import")) @import

The predicate #eq? @_fn "@import" compares against the identifier text including the @. In most tree-sitter-zig grammars, @import is a built-in identifier and the identifier node's text is either "import" (without @) or "@import". If the grammar represents the identifier without the @ prefix, this predicate will never match and import edges will silently be missing from the web graph. This needs verification against the actual WASM grammar AST.
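
Until the node text is verified, a post-query check that tolerates both spellings is one defensive option. The helper below is hypothetical (`isImportBuiltin` is not part of the PR) and assumes it is only applied to the function node of a builtin call, where a bare `import` cannot be an ordinary identifier.

```typescript
/** Accept the builtin name with or without the leading `@`. */
function isImportBuiltin(identifierText: string): boolean {
  const bare =
    identifierText.charAt(0) === "@" ? identifierText.slice(1) : identifierText;
  return bare === "import";
}
```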


🟢 Minor: Anonymous Tests Not Captured

Zig allows anonymous (unnamed) tests:

test {
    try std.testing.expect(2 + 2 == 4);
}

The current query (test_declaration (string (string_content) @name)) requires a string name. Anonymous tests are valid Zig and frequently used. They could be captured with a fallback pattern or simply documented as out of scope.
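
If a fallback pattern is added, anonymous tests still need some stable name in the graph. The sketch below shows one possible scheme; the `test@path:line` format and `zigTestName` are invented for illustration and are not what the PR implements.

```typescript
/**
 * Use the declared test name when present; otherwise synthesize one from the
 * file path and line so that anonymous tests stay distinguishable.
 */
function zigTestName(
  declaredName: string | undefined,
  filePath: string,
  line: number
): string {
  if (declaredName !== undefined && declaredName.length > 0) return declaredName;
  return "test@" + filePath + ":" + line;
}
```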


🟢 Minor: opaque {} Types Not Captured

Zig's opaque {} type (used extensively for FFI and type-safe pointer wrappers) is not captured:

pub const FileHandle = opaque {};
pub const sqlite3 = opaque {};

These are common Zig patterns, especially in binding layers. The queries cover struct, enum, and union but miss opaque.


🟢 Minor: Method Detection Pattern Diverges Between CLI and Web

CLI (gitnexus/src/core/ingestion/tree-sitter-queries.ts:913-924) anchors method patterns at the container declaration level (e.g., (struct_declaration (function_declaration ...))). This correctly captures methods inside anonymous structs.

Web (gitnexus-web/src/core/ingestion/tree-sitter-queries.ts:508-525) requires the surrounding variable_declaration, meaning methods in returned/anonymous structs are missed in web analysis. This is lower-priority but is a semantic gap.
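
The difference matters for owner resolution: a method inside `const Foo = struct { ... }` has a named owner, while a method inside a returned or anonymous struct does not. The sketch below illustrates that lookup with a stand-in node type. `OwnerNode`, `makeNode`, and `findOwnerName` are hypothetical, and the `enum_declaration` and `union_declaration` node names are assumed by analogy with `struct_declaration`.

```typescript
// Stand-in for a syntax node with parent links; not the real tree-sitter type.
interface OwnerNode {
  type: string;
  text: string;
  parent: OwnerNode | null;
  children: OwnerNode[];
}

// enum/union node names are assumptions modeled on struct_declaration.
const CONTAINER_TYPES: string[] = [
  "struct_declaration",
  "enum_declaration",
  "union_declaration",
];

/** Build a node and wire up its children's parent pointers. */
function makeNode(type: string, text: string, children: OwnerNode[] = []): OwnerNode {
  const node: OwnerNode = { type, text, parent: null, children };
  for (const child of children) child.parent = node;
  return node;
}

/** Walk up to the enclosing container, then read the name from `const Name = struct { ... }`. */
function findOwnerName(member: OwnerNode): string | undefined {
  let node: OwnerNode | null = member.parent;
  while (node !== null && CONTAINER_TYPES.indexOf(node.type) === -1) {
    node = node.parent;
  }
  if (node === null) return undefined; // not inside any container
  const binding = node.parent;
  if (binding === null || binding.type !== "variable_declaration") {
    return undefined; // anonymous container, e.g. a struct returned from a function
  }
  for (const child of binding.children) {
    if (child.type === "identifier") return child.text;
  }
  return undefined;
}
```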


🟢 Minor: No Zig-Specific BUILT_IN_NAMES Noise Filtering

gitnexus/src/core/ingestion/utils.ts has extensive per-language built-in name lists but no Zig-specific section. Common Zig noise in call graphs includes standard library access patterns:

  • init, deinit (allocator lifecycle — universal Zig idiom)
  • alloc, free, dupe, dupeZ (allocator methods)
  • expectEqual, expectError, expect (testing framework)
  • format (string formatting)
  • parseInt, parseFloat (parsing)

Many of these (assert, init) are already in the list from other languages, so the impact is limited, but targeted Zig entries would reduce call-graph noise.
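
A Zig-specific filter could be as small as the sketch below. The names in the list are the ones called out above; `ZIG_BUILT_IN_NAMES` and `dropBuiltInCalls` are hypothetical identifiers and do not reflect how utils.ts structures its existing lists.

```typescript
// Names taken from the list above; the constant and helper names are hypothetical.
const ZIG_BUILT_IN_NAMES: string[] = [
  "init", "deinit",
  "alloc", "free", "dupe", "dupeZ",
  "expect", "expectEqual", "expectError",
  "format",
  "parseInt", "parseFloat",
];

/** Remove callee names that are common standard-library noise in Zig call graphs. */
function dropBuiltInCalls(callees: string[]): string[] {
  return callees.filter((name) => ZIG_BUILT_IN_NAMES.indexOf(name) === -1);
}
```

Filtering by bare name is coarse: a project-defined `init` or `format` is dropped as well, which is the usual trade-off of name-based noise lists.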


🟢 Minor: test_declaration in ZIG_DECL_TYPES is Semantically Dead

gitnexus/src/core/ingestion/export-detection.ts:209-213:

const ZIG_DECL_TYPES = new Set([
  'function_declaration',
  'variable_declaration',
  'test_declaration',   // ← test blocks cannot be `pub` in Zig
]);

In Zig, test blocks are always file-private — pub test is a compile error. So test_declaration will never return true from zigExportChecker. This is harmless but dead code that could confuse maintainers.


✅ What's Done Well

  • pub visibility check is correctly modeled — walking ancestors to find the declaration and scanning children for pub mirrors how Rust visibility works in this codebase.
  • Container-via-const-binding idiom is correctly identified (pub const Config = struct {...}) — this is a fundamental Zig pattern that other language servers often miss.
  • build.zig.zon exclusion is correct — it's a manifest file, not source.
  • Nested repo detection for Zig projects is consistent with existing language support.
  • noRouting call router for Zig is appropriate — Zig doesn't have Ruby-style method calls that double as import/heritage mechanisms.
  • Test fixture covers the essential struct/enum/union/fn/test patterns used for CI validation.
  • Grammar selected (@tree-sitter-grammars/tree-sitter-zig) is the community-maintained reference grammar, which is the right choice.

Verdict

Conditionally mergeable. The CLI-side Zig support is structurally sound and will work for the primary use case (indexing Zig repositories via the CLI). The two critical issues — grammar node name mismatch between CLI and web and struct vs class label divergence — should be resolved before or shortly after merge, as they affect graph correctness in the web ingestion path. The export keyword gap and build.zig indexing are real semantic gaps worth tracking as follow-up issues.

zander-raycraft added the "Language Support" label (Additional lang support for gitnexus) on Mar 21, 2026
zolotukhin force-pushed the zig-first-class-support branch from c0529c6 to 4425f6b on March 22, 2026 01:27
zolotukhin changed the title from "Make Zig a first-class GitNexus citizen: native parsing, CLI change detection, and a fixed Code Inspector" to "Make Zig a first-class GitNexus citizen across the CLI and web app" on Mar 22, 2026
@zolotukhin
Author

Addressed the Claude review issues
