Commit f2d3df4

feat: Phase 5 type resolution — chained calls, pattern matching, class-as-receiver (#315)
* feat: Phase 5 type resolution — chained calls, pattern matching, class-as-receiver, code review fixes

  Phase 5.1: Chained method call resolution (depth-capped at 3)
  - resolveChainedReceiver() resolves a.getUser().save() by walking the chain and looking up intermediate return types from the SymbolTable
  - extractReceiverNode() + extractCallChain() shared in utils.ts
  - receiverCallChain on ExtractedCall for worker path parity
  - MAX_CHAIN_DEPTH=3 enforced in both extraction and resolution

  Phase 5.2: Pattern matching binding extractors
  - PatternBindingExtractor type added to LanguageTypeConfig
  - declarationTypeNodes map tracks original type AST nodes for generic unwrapping
  - Rust: if let Some(x)/Ok(x) unwrapping with extractGenericTypeArgs
  - Java: instanceof pattern variables (Java 16+)
  - C#: is-pattern disambiguation fixture (already working via extractDeclaration)

  Phase 5.5d: Python standalone type annotations (name: str)
  - expression_statement with type child now captured in DECLARATION_NODE_TYPES

  Phase 5.5e: ReceiverKey collision fix for overloaded methods
  - receiverKey preserves @startIndex to prevent same-name method collisions
  - lookupReceiverType does prefix scan with ambiguity refusal

  Class-as-receiver for static method calls (#289)
  - UserService.find_user() now resolves via ctx.resolve() tiered lookup
  - Respects import scoping — no false positives from unrelated packages

  Code review fixes:
  - Extracted CALL_EXPRESSION_TYPES + extractCallChain to utils.ts (eliminated duplication)
  - Converted resolveChainedReceiver from recursion to loop (no exposed depth param)
  - Added depth cap to extractReturnTypeName (defense against nested wrapper types)
  - Replaced lookupFuzzy with ctx.resolve for class-as-receiver (architecturally consistent)

  Closes #289

  Test coverage: 6 new fixtures, 12+ new unit tests, 7 new integration test suites

* fix: Ruby chain calls, Rust Err(x) unwrap, Enum class-as-receiver (#315)

  Address three per-language gaps identified in Phase 5 code review:
  - Ruby: add `method`/`receiver` field fallbacks to extractCallChain (tree-sitter-ruby uses different field names than other grammars)
  - Rust: handle `Err(e)` pattern binding via typeArgs[1] from Result<T,E>
  - Enum: include Enum type in class-as-receiver filter (both paths)

  Integration tests added for all three fixes.

* fix: chain base type resolution parity between serial and worker paths (#315)

  - Worker path: add typeEnv.lookup for chain base receiver after extraction (typed parameters like `fn process(svc: &UserService)` were silently lost)
  - Serial path: add ctx.resolve class-as-receiver fallback for chain base (class-name chains like `UserService.find_user().save()` failed)
  - Fix misleading comment in parse-worker.ts that described unimplemented logic
  - Integration tests: typed-parameter chain, static class-name chain

* fix: Kotlin chain call extraction, createClassNameLookup Enum/Struct (#315)

  - Kotlin: extractCallChain now handles navigation_expression → navigation_suffix AST structure (Kotlin's call_expression has no 'function' field)
  - createClassNameLookup: include Enum and Struct alongside Class for consistent constructor recognition in extractInitializer
  - Integration test: kotlin-chain-call fixture verifying svc.getUser().save()
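The chain walk described under Phase 5.1 can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the code from the commit: `ToySymbolTable`, `MethodDef`, and `resolveChain` are hypothetical names, and the lookup is a plain map keyed by `Type.method` rather than the real SymbolTable and tiered `ctx.resolve()`. Unlike the real resolver, this toy version also gives up immediately when the base type is unknown.

```typescript
// Hypothetical, simplified stand-in for the real symbol table.
type MethodDef = { owner: string; name: string; returnType?: string };

// Same cap the commit message mentions (MAX_CHAIN_DEPTH=3).
const MAX_CHAIN_DEPTH = 3;

class ToySymbolTable {
  private defs = new Map<string, MethodDef>();

  add(def: MethodDef): void {
    this.defs.set(`${def.owner}.${def.name}`, def);
  }

  lookup(owner: string, name: string): MethodDef | undefined {
    return this.defs.get(`${owner}.${name}`);
  }
}

// Walk the intermediate calls in order; each step's return type becomes
// the receiver type for the next step. Any unresolved step aborts the walk.
function resolveChain(
  chain: string[],
  baseType: string | undefined,
  symbols: ToySymbolTable,
): string | undefined {
  if (chain.length > MAX_CHAIN_DEPTH) return undefined;
  let current = baseType;
  for (const name of chain) {
    if (!current) return undefined;
    const def = symbols.lookup(current, name);
    if (!def?.returnType) return undefined;
    current = def.returnType;
  }
  return current;
}

const symbols = new ToySymbolTable();
symbols.add({ owner: 'UserService', name: 'getUser', returnType: 'User' });
symbols.add({ owner: 'User', name: 'save' });

// For `svc.getUser().save()` with `svc: UserService`, the chain is ['getUser'].
const receiverOfSave = resolveChain(['getUser'], 'UserService', symbols);
const unknownBase = resolveChain(['getUser'], undefined, symbols);
```

Here `receiverOfSave` is the type that `save()` should be looked up on, while `unknownBase` shows the conservative failure mode: no guess is made when a step cannot be resolved.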
1 parent 5fa73ba commit f2d3df4

68 files changed, +1743 -34 lines changed

gitnexus/src/core/ingestion/call-processor.ts

Lines changed: 187 additions & 11 deletions
@@ -17,7 +17,11 @@ import {
   countCallArguments,
   inferCallForm,
   extractReceiverName,
+  extractReceiverNode,
   findEnclosingClassId,
+  CALL_EXPRESSION_TYPES,
+  MAX_CHAIN_DEPTH,
+  extractCallChain,
 } from './utils.js';
 import { buildTypeEnv } from './type-env.js';
 import type { ConstructorBinding } from './type-env.js';
@@ -72,7 +76,7 @@ const verifyConstructorBindings = (
     const isClass = tiered?.candidates.some(def => def.type === 'Class') ?? false;

     if (isClass) {
-      verified.set(receiverKey(extractFuncNameFromScope(scope), varName), calleeName);
+      verified.set(receiverKey(scope, varName), calleeName);
     } else {
       let callableDefs = tiered?.candidates.filter(d =>
         d.type === 'Function' || d.type === 'Method'
@@ -105,7 +109,7 @@ const verifyConstructorBindings = (
       if (callableDefs && callableDefs.length === 1 && callableDefs[0].returnType) {
         const typeName = extractReturnTypeName(callableDefs[0].returnType);
         if (typeName) {
-          verified.set(receiverKey(extractFuncNameFromScope(scope), varName), typeName);
+          verified.set(receiverKey(scope, varName), typeName);
         }
       }
     }
@@ -253,8 +257,47 @@ export const processCalls = async (
       if (!receiverTypeName && receiverName && verifiedReceivers.size > 0) {
         const enclosingFunc = findEnclosingFunction(callNode, file.path, ctx);
         const funcName = enclosingFunc ? extractFuncNameFromSourceId(enclosingFunc) : '';
-        receiverTypeName = verifiedReceivers.get(receiverKey(funcName, receiverName))
-          ?? verifiedReceivers.get(receiverKey('', receiverName));
+        receiverTypeName = lookupReceiverType(verifiedReceivers, funcName, receiverName);
+      }
+      // Fall back to class-as-receiver for static method calls (e.g. UserService.find_user()).
+      // When the receiver name is not a variable in TypeEnv but resolves to a Class/Struct/Interface
+      // through the standard tiered resolution, use it directly as the receiver type.
+      if (!receiverTypeName && receiverName && callForm === 'member') {
+        const typeResolved = ctx.resolve(receiverName, file.path);
+        if (typeResolved && typeResolved.candidates.some(
+          d => d.type === 'Class' || d.type === 'Interface' || d.type === 'Struct' || d.type === 'Enum',
+        )) {
+          receiverTypeName = receiverName;
+        }
+      }
+      // Fall back to chained call resolution when the receiver is a call expression
+      // (e.g. svc.getUser().save() — receiver of save() is getUser(), not a simple identifier).
+      if (callForm === 'member' && !receiverTypeName && !receiverName) {
+        const receiverNode = extractReceiverNode(nameNode);
+        if (receiverNode && CALL_EXPRESSION_TYPES.has(receiverNode.type)) {
+          const extracted = extractCallChain(receiverNode);
+          if (extracted) {
+            // Resolve the base receiver type if possible
+            let baseType = extracted.baseReceiverName && typeEnv
+              ? typeEnv.lookup(extracted.baseReceiverName, callNode)
+              : undefined;
+            if (!baseType && extracted.baseReceiverName && verifiedReceivers.size > 0) {
+              const enclosingFunc = findEnclosingFunction(callNode, file.path, ctx);
+              const funcName = enclosingFunc ? extractFuncNameFromSourceId(enclosingFunc) : '';
+              baseType = lookupReceiverType(verifiedReceivers, funcName, extracted.baseReceiverName);
+            }
+            // Class-as-receiver for chain base (e.g. UserService.find_user().save())
+            if (!baseType && extracted.baseReceiverName) {
+              const cr = ctx.resolve(extracted.baseReceiverName, file.path);
+              if (cr?.candidates.some(d =>
+                d.type === 'Class' || d.type === 'Interface' || d.type === 'Struct' || d.type === 'Enum',
+              )) {
+                baseType = extracted.baseReceiverName;
+              }
+            }
+            receiverTypeName = resolveChainedReceiver(extracted.chain, baseType, file.path, ctx);
+          }
+        }
       }

       const resolved = resolveCallTarget({
@@ -352,6 +395,47 @@ const toResolveResult = (
   reason: tier === 'same-file' ? 'same-file' : tier === 'import-scoped' ? 'import-resolved' : 'global',
 });

+/**
+ * Resolve a chain of intermediate method calls to find the receiver type for a
+ * final member call. Called when the receiver of a call is itself a call
+ * expression (e.g. `svc.getUser().save()`).
+ *
+ * @param chainNames Ordered list of method names from outermost to innermost
+ *   intermediate call (e.g. ['getUser'] for `svc.getUser().save()`).
+ * @param baseReceiverTypeName The already-resolved type of the base receiver
+ *   (e.g. 'UserService' for `svc`), or undefined.
+ * @param currentFile The file path for resolution context.
+ * @param ctx The resolution context for symbol lookup.
+ * @returns The type name of the final intermediate call's return type, or undefined
+ *   if resolution fails at any step.
+ */
+function resolveChainedReceiver(
+  chainNames: string[],
+  baseReceiverTypeName: string | undefined,
+  currentFile: string,
+  ctx: ResolutionContext,
+): string | undefined {
+  let currentType = baseReceiverTypeName;
+  for (const name of chainNames) {
+    const resolved = resolveCallTarget(
+      { calledName: name, callForm: 'member', receiverTypeName: currentType },
+      currentFile,
+      ctx,
+    );
+    if (!resolved) return undefined;
+
+    const candidates = ctx.symbols.lookupFuzzy(name);
+    const symDef = candidates.find(c => c.nodeId === resolved.nodeId);
+    if (!symDef?.returnType) return undefined;
+
+    const returnTypeName = extractReturnTypeName(symDef.returnType);
+    if (!returnTypeName) return undefined;
+
+    currentType = returnTypeName;
+  }
+  return currentType;
+}
+
 /**
  * Resolve a function call to its target node ID using priority strategy:
  * A. Narrow candidates by scope tier via ctx.resolve()
@@ -491,7 +575,8 @@ function extractFirstTypeArg(args: string): string {
   return args.trim();
 }

-export const extractReturnTypeName = (raw: string): string | undefined => {
+export const extractReturnTypeName = (raw: string, depth = 0): string | undefined => {
+  if (depth > 10) return undefined;
   let text = raw.trim();
   if (!text) return undefined;

@@ -519,7 +604,7 @@ export const extractReturnTypeName = (raw: string): string | undefined => {
       // so that nested generics like Result<User, Error> are not split at the inner
       // comma. Lifetime parameters (Rust 'a, '_) are skipped.
       const firstArg = extractFirstTypeArg(args);
-      return extractReturnTypeName(firstArg);
+      return extractReturnTypeName(firstArg, depth + 1);
     }
     // Non-wrapper generic: return the base type (e.g., Map<K,V> → Map)
     return PRIMITIVE_TYPES.has(base.toLowerCase()) ? undefined : base;
@@ -548,6 +633,11 @@ export const extractReturnTypeName = (raw: string): string | undefined => {
 // Source IDs use "Label:filepath:funcName" (produced by parse-worker.ts).
 // NUL (\0) is used as a composite-key separator because it cannot appear
 // in source-code identifiers, preventing ambiguous concatenation.
+//
+// receiverKey stores the FULL scope (funcName@startIndex) to prevent
+// collisions between overloaded methods with the same name in different
+// classes (e.g. User.save@100 and Repo.save@200 are distinct keys).
+// Lookup uses a secondary funcName-only index built in lookupReceiverType.

 /** Extract the function name from a scope key ("funcName@startIndex" → "funcName"). */
 const extractFuncNameFromScope = (scope: string): string =>
@@ -559,9 +649,58 @@ const extractFuncNameFromSourceId = (sourceId: string): string => {
   return lastColon >= 0 ? sourceId.slice(lastColon + 1) : '';
 };

-/** Build a scope-aware composite key for receiver type lookup. */
-const receiverKey = (funcName: string, varName: string): string =>
-  `${funcName}\0${varName}`;
+/**
+ * Build a composite key for receiver type storage.
+ * Uses the full scope string (e.g. "save@100") to distinguish overloaded
+ * methods with the same name in different classes.
+ */
+const receiverKey = (scope: string, varName: string): string =>
+  `${scope}\0${varName}`;
+
+/**
+ * Look up a receiver type from a verified receiver map.
+ * The map is keyed by `scope\0varName` (full scope with @startIndex).
+ * Since the lookup side only has `funcName` (no startIndex), we scan for
+ * all entries whose key starts with `funcName@` and has the matching varName.
+ * If exactly one unique type is found, return it. If multiple distinct types
+ * exist (true overload collision), return undefined (refuse to guess).
+ * Falls back to the file-level scope key `\0varName` (empty funcName).
+ */
+const lookupReceiverType = (
+  map: Map<string, string>,
+  funcName: string,
+  varName: string,
+): string | undefined => {
+  // Fast path: file-level scope (empty funcName — used as fallback)
+  const fileLevelKey = receiverKey('', varName);
+
+  const prefix = `${funcName}@`;
+  const suffix = `\0${varName}`;
+  let found: string | undefined;
+  let ambiguous = false;
+
+  for (const [key, value] of map) {
+    if (key === fileLevelKey) continue; // handled separately below
+    if (key.startsWith(prefix) && key.endsWith(suffix)) {
+      // Verify the key is exactly "funcName@<digits>\0varName" with no extra chars.
+      // The part between prefix and suffix should be the startIndex (digits only),
+      // but we accept any non-empty segment to be forward-compatible.
+      const middle = key.slice(prefix.length, key.length - suffix.length);
+      if (middle.length === 0) continue; // malformed key — skip
+      if (found === undefined) {
+        found = value;
+      } else if (found !== value) {
+        ambiguous = true;
+        break;
+      }
+    }
+  }
+
+  if (!ambiguous && found !== undefined) return found;
+
+  // Fallback: file-level scope (bindings outside any function)
+  return map.get(fileLevelKey);
+};

 /**
  * Fast path: resolve pre-extracted call sites from workers.
@@ -609,15 +748,52 @@ export const processCallsFromExtracted = async (

   for (const call of calls) {
     let effectiveCall = call;
+
+    // Step 1: resolve receiver type from constructor bindings
     if (!call.receiverTypeName && call.receiverName && receiverMap) {
       const callFuncName = extractFuncNameFromSourceId(call.sourceId);
-      const resolvedType = receiverMap.get(receiverKey(callFuncName, call.receiverName))
-        ?? receiverMap.get(receiverKey('', call.receiverName)); // fall back to file-level scope
+      const resolvedType = lookupReceiverType(receiverMap, callFuncName, call.receiverName);
       if (resolvedType) {
         effectiveCall = { ...call, receiverTypeName: resolvedType };
       }
     }

+    // Step 1b: class-as-receiver for static method calls (e.g. UserService.find_user())
+    if (!effectiveCall.receiverTypeName && effectiveCall.receiverName && effectiveCall.callForm === 'member') {
+      const typeResolved = ctx.resolve(effectiveCall.receiverName, effectiveCall.filePath);
+      if (typeResolved && typeResolved.candidates.some(
+        d => d.type === 'Class' || d.type === 'Interface' || d.type === 'Struct' || d.type === 'Enum',
+      )) {
+        effectiveCall = { ...effectiveCall, receiverTypeName: effectiveCall.receiverName };
+      }
+    }
+
+    // Step 2: if the call has a receiver call chain (e.g. svc.getUser().save()),
+    // resolve the chain to determine the final receiver type.
+    // This runs whenever receiverCallChain is present — even when Step 1 set a
+    // receiverTypeName, that type is the BASE receiver (e.g. UserService for svc),
+    // and the chain must be walked to produce the FINAL receiver (e.g. User from
+    // getUser() : User).
+    if (effectiveCall.receiverCallChain?.length) {
+      // Step 1 may have resolved the base receiver type (e.g. svc → UserService).
+      // Use it as the starting point for chain resolution.
+      let baseType = effectiveCall.receiverTypeName;
+      // If Step 1 didn't resolve it, try the receiver map directly.
+      if (!baseType && effectiveCall.receiverName && receiverMap) {
+        const callFuncName = extractFuncNameFromSourceId(effectiveCall.sourceId);
+        baseType = lookupReceiverType(receiverMap, callFuncName, effectiveCall.receiverName);
+      }
+      const chainedType = resolveChainedReceiver(
+        effectiveCall.receiverCallChain,
+        baseType,
+        effectiveCall.filePath,
+        ctx,
+      );
+      if (chainedType) {
+        effectiveCall = { ...effectiveCall, receiverTypeName: chainedType };
+      }
+    }
+
     const resolved = resolveCallTarget(effectiveCall, effectiveCall.filePath, ctx);
     if (!resolved) continue;
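To make the prefix scan with ambiguity refusal concrete, here is a small self-contained sketch that follows the same logic as `lookupReceiverType` in the diff above. The function name `lookupReceiverTypeSketch` and all sample keys and type names are made up for illustration; only the key layout (`funcName@startIndex`, a NUL separator, then the variable name) is taken from the diff.

```typescript
// Composite key: full scope ("funcName@startIndex") + NUL + variable name.
const receiverKey = (scope: string, varName: string): string => `${scope}\0${varName}`;

function lookupReceiverTypeSketch(
  map: Map<string, string>,
  funcName: string,
  varName: string,
): string | undefined {
  const prefix = `${funcName}@`;
  const suffix = `\0${varName}`;
  let found: string | undefined;
  let ambiguous = false;

  for (const [key, value] of Array.from(map.entries())) {
    const isScopedMatch =
      key.startsWith(prefix) &&
      key.endsWith(suffix) &&
      key.length > prefix.length + suffix.length; // require a non-empty startIndex segment
    if (!isScopedMatch) continue;
    if (found === undefined) {
      found = value;
    } else if (found !== value) {
      ambiguous = true; // two same-named scopes disagree: refuse to guess
      break;
    }
  }

  if (!ambiguous && found !== undefined) return found;
  // No unique scoped match: fall back to the file-level binding, if any.
  return map.get(receiverKey('', varName));
}

// Hypothetical sample data: two overloads named `save` bind `u` to different types.
const verified = new Map<string, string>([
  [receiverKey('save@100', 'u'), 'User'],
  [receiverKey('save@200', 'u'), 'Repo'],
  [receiverKey('load@50', 'svc'), 'UserService'],
  [receiverKey('', 'cfg'), 'Config'],
]);

const uniqueHit = lookupReceiverTypeSketch(verified, 'load', 'svc');
const collision = lookupReceiverTypeSketch(verified, 'save', 'u');
const fileLevel = lookupReceiverTypeSketch(verified, 'anyFunc', 'cfg');
```

In this sample, `uniqueHit` resolves because only one `load@…` scope binds `svc`, `collision` stays unresolved because the two `save@…` scopes disagree and there is no file-level binding for `u`, and `fileLevel` is served by the empty-scope fallback key.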

gitnexus/src/core/ingestion/type-env.ts

Lines changed: 47 additions & 3 deletions
@@ -265,7 +265,9 @@ const createClassNameLookup = (
       if (localNames.has(name)) return true;
       const cached = memo.get(name);
       if (cached !== undefined) return cached;
-      const result = symbolTable.lookupFuzzy(name).some(def => def.type === 'Class');
+      const result = symbolTable.lookupFuzzy(name).some(def =>
+        def.type === 'Class' || def.type === 'Enum' || def.type === 'Struct',
+      );
       memo.set(name, result);
       return result;
     },
@@ -292,18 +294,37 @@ export const buildTypeEnv = (
   const config = typeConfigs[language];
   const bindings: ConstructorBinding[] = [];
   const pendingAssignments: Array<{ scope: string; lhs: string; rhs: string }> = [];
+  // Maps `scope\0varName` → the type annotation AST node from the original declaration.
+  // Allows pattern extractors to navigate back to the declaration's generic type arguments
+  // (e.g., to extract T from Result<T, E> for `if let Ok(x) = res`).
+  const declarationTypeNodes = new Map<string, SyntaxNode>();

   /**
    * Try to extract a (variableName → typeName) binding from a single AST node.
    *
    * Resolution tiers (first match wins):
    * - Tier 0: explicit type annotations via extractDeclaration / extractForLoopBinding
    * - Tier 1: constructor-call inference via extractInitializer (fallback)
+   *
+   * Side effect: populates declarationTypeNodes for variables that have an explicit
+   * type annotation field on the declaration node. This allows pattern extractors to
+   * retrieve generic type arguments from the original declaration (e.g., extracting T
+   * from Result<T, E> for `if let Ok(x) = res`).
    */
-  const extractTypeBinding = (node: SyntaxNode, scopeEnv: Map<string, string>): void => {
+  const extractTypeBinding = (node: SyntaxNode, scopeEnv: Map<string, string>, scope: string): void => {
     // This guard eliminates 90%+ of calls before any language dispatch.
     if (TYPED_PARAMETER_TYPES.has(node.type)) {
+      const keysBefore = new Set(scopeEnv.keys());
       config.extractParameter(node, scopeEnv);
+      // Capture the type node for newly introduced parameter bindings
+      const typeNode = node.childForFieldName('type');
+      if (typeNode) {
+        for (const varName of scopeEnv.keys()) {
+          if (!keysBefore.has(varName)) {
+            declarationTypeNodes.set(`${scope}\0${varName}`, typeNode);
+          }
+        }
+      }
       return;
     }
     // For-each loop variable bindings (Java/C#/Kotlin): explicit element types in the AST.
@@ -313,7 +334,19 @@ export const buildTypeEnv = (
       return;
     }
     if (config.declarationNodeTypes.has(node.type)) {
+      const keysBefore = new Set(scopeEnv.keys());
       config.extractDeclaration(node, scopeEnv);
+      // Capture the type annotation AST node for newly introduced bindings.
+      // Only declarations with an explicit 'type' field are recorded — constructor
+      // inferences (Tier 1) don't have a type annotation node to preserve.
+      const typeNode = node.childForFieldName('type');
+      if (typeNode) {
+        for (const varName of scopeEnv.keys()) {
+          if (!keysBefore.has(varName)) {
+            declarationTypeNodes.set(`${scope}\0${varName}`, typeNode);
+          }
+        }
+      }
       // Tier 1: constructor-call inference as fallback.
       // Always called when available — each language's extractInitializer
       // internally skips declarators that already have explicit annotations,
@@ -346,7 +379,18 @@ export const buildTypeEnv = (
     if (!env.has(scope)) env.set(scope, new Map());
     const scopeEnv = env.get(scope)!;

-    extractTypeBinding(node, scopeEnv);
+    extractTypeBinding(node, scopeEnv, scope);
+
+    // Pattern binding extraction: handles constructs that introduce NEW typed variables
+    // via pattern matching (e.g. `if let Some(x) = opt`, `x instanceof T t`).
+    // Runs after Tier 0/1 so scopeEnv already contains the source variable's type.
+    // Conservative: extractor returns undefined when source type is unknown.
+    if (config.extractPatternBinding) {
+      const patternBinding = config.extractPatternBinding(node, scopeEnv, declarationTypeNodes, scope);
+      if (patternBinding && !scopeEnv.has(patternBinding.varName)) {
+        scopeEnv.set(patternBinding.varName, patternBinding.typeName);
+      }
+    }

     // Tier 2: collect plain-identifier RHS assignments for post-walk propagation.
     // Delegates to per-language extractPendingAssignment — AST shapes differ widely
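The pattern-binding step above can be illustrated with a toy, string-based sketch of the Rust case (`if let Some(x)`, `Ok(x)`, `Err(e)`). This is an assumption-laden simplification: the real extractors work on tree-sitter AST nodes and `declarationTypeNodes`, whereas the hypothetical `splitTypeArgs` and `bindPattern` below operate on type annotation text. What it mirrors from the commit is the choice of `typeArgs[0]` for `Some`/`Ok`, `typeArgs[1]` for `Err`, and the conservative `undefined` when the source type is unknown.

```typescript
type PatternBinding = { varName: string; typeName: string };

// Split generic arguments at top-level commas only, so that
// "HashMap<String, User>, Error" yields two arguments, not three.
function splitTypeArgs(args: string): string[] {
  const parts: string[] = [];
  let depth = 0;
  let start = 0;
  for (let i = 0; i < args.length; i++) {
    const ch = args[i];
    if (ch === '<') depth++;
    else if (ch === '>') depth--;
    else if (ch === ',' && depth === 0) {
      parts.push(args.slice(start, i).trim());
      start = i + 1;
    }
  }
  parts.push(args.slice(start).trim());
  return parts.filter(p => p.length > 0);
}

// declaredTypes maps a variable name to its declared type text (hypothetical input).
function bindPattern(
  pattern: 'Some' | 'Ok' | 'Err',
  varName: string,
  sourceVar: string,
  declaredTypes: Map<string, string>,
): PatternBinding | undefined {
  const declared = declaredTypes.get(sourceVar);
  if (!declared) return undefined; // conservative: unknown source type
  const open = declared.indexOf('<');
  const close = declared.lastIndexOf('>');
  if (open < 0 || close < open) return undefined;

  const wrapper = declared.slice(0, open).trim();
  const typeArgs = splitTypeArgs(declared.slice(open + 1, close));

  let typeName: string | undefined;
  if (pattern === 'Some' && wrapper === 'Option') typeName = typeArgs[0];
  else if (pattern === 'Ok' && wrapper === 'Result') typeName = typeArgs[0];
  else if (pattern === 'Err' && wrapper === 'Result') typeName = typeArgs[1];

  return typeName ? { varName, typeName } : undefined;
}

const declaredTypes = new Map<string, string>([
  ['res', 'Result<User, AppError>'],
  ['opt', 'Option<User>'],
]);

const okBinding = bindPattern('Ok', 'user', 'res', declaredTypes);
const errBinding = bindPattern('Err', 'e', 'res', declaredTypes);
const someBinding = bindPattern('Some', 'u', 'opt', declaredTypes);
const unknownSource = bindPattern('Ok', 'x', 'missing', declaredTypes);
```

As in the diff, a caller would only record the result when the variable is not already present in the scope environment, so explicit annotations keep priority over pattern-derived types.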

gitnexus/src/core/ingestion/type-extractors/index.ts

Lines changed: 1 addition & 0 deletions
@@ -40,6 +40,7 @@ export type {
   ConstructorBindingScanner,
   ForLoopExtractor,
   PendingAssignmentExtractor,
+  PatternBindingExtractor,
 } from './types.js';
 export {
   TYPED_PARAMETER_TYPES,
