feat(codemode): support AI SDK jsonSchema wrapper + production-harden schema converter #960
mattzcarey merged 22 commits into main
MCP tools returned by getAITools() now have the _zod property on their inputSchema, which is required for codemode type generation. Previously, getAITools() used the AI SDK's jsonSchema() function to wrap MCP tool JSON schemas, which created Schema objects without the _zod property. This change uses Zod v4's fromJSONSchema() function instead, which converts JSON schemas to actual Zod schemas that include the _zod property.
- Add static import of fromJSONSchema from zod
- Replace this.jsonSchema() calls with fromJSONSchema() in getAITools()
- Remove the jsonSchema initialization check (no longer needed)
- Update tests to verify _zod property is present on tool inputSchema
🦋 Changeset detected
Latest commit: 0ee5368
The changes in this PR will be included in the next version bump. This PR includes changesets to release 2 packages.
46ca09f to 9f2be18
- Remove unused json-schema-to-typescript dependency from codemode
- Add comprehensive test for MCP tools with input AND output schemas
- Add comprehensive test for AI SDK tool() with input AND output schemas
- Remove redundant mixed tools test (covered by the two focused tests)
- Both tests verify rich output types, not just 'unknown'

- Fix JSONSchema7 type incompatibility with fromJSONSchema parameter type
- Remove unused JSONSchema7 import
- Simplify test to use ToolDescriptors directly instead of AI SDK tool()
- Fix Text component className prop error in codemode example

- Keep jsonSchema property on MCPClientManager (deprecated but functional)
- ensureJsonSchema() still lazy-loads jsonSchema from AI SDK for compat
- No breaking changes: existing code using manager.jsonSchema still works
Performance Analysis:
| Metric | fromJSONSchema() | jsonSchema() | Ratio |
|---|---|---|---|
| Simple schema (2 props) | 25 µs | 0.25 µs | ~100x slower |
| Complex MCP schema | 151 µs | 0.22 µs | ~685x slower |
| Deeply nested (5 levels) | 89 µs | 0.26 µs | ~342x slower |
| Memory per schema | ~559 KB | ~1 KB | ~559x more |
What this means in practice
For a typical MCP server with 10 tools:
- fromJSONSchema(): ~1.5ms startup + ~5.5MB memory
- jsonSchema(): ~0.002ms startup + ~10KB memory
However, this cost is paid once when getAITools() is called, not per-request. After initialization:
- Validation is fast: ~1 µs per call
- Schemas are cached in the returned tool definitions
Why we need fromJSONSchema()
- jsonSchema() creates a thin wrapper without _zod property
- fromJSONSchema() creates a real Zod schema with _zod property
- The _zod property is required for zod-to-ts which codemode uses for type generation
Recommendation
Keep jsonSchema and build a parser to support it.
…andling
- Add safeZodToTs() wrapper that returns "unknown" for schemas that can't be represented in TypeScript (e.g., transforms) instead of throwing
- Add schema-conversion.test.ts with 165 tests covering all Zod v4 types: primitives, literals, enums, objects, arrays, tuples, unions, intersections, records, maps, sets, modifiers, readonly, coerce, pipe, transform, template literals, lazy/recursive, functions, promises, branded types, effects/refinements, string/number validators, fromJSONSchema, and sanitizeToolName edge cases
- Test coverage exceeds zod-to-ts package (~25 tests vs 165 tests)
These were used to validate fromJSONSchema() vs jsonSchema() performance:
- fromJSONSchema: ~168µs (creates real Zod schema with _zod property)
- jsonSchema: ~0.25µs (thin wrapper, no _zod property)

The ~680x slowdown is acceptable since schema creation happens once at startup, not per-request. fromJSONSchema is required for codemode type generation, which needs the _zod property.
Reverts MCP client to use fast jsonSchema wrapper (~0.25µs) instead of slower fromJSONSchema (~168µs). Type conversion is now handled in the codemode package's generateTypes function.

Changes to codemode/types.ts:
- Add schema detection: isZodSchema, isJsonSchemaWrapper, isRawJsonSchema
- Add normalizeToZodSchema to convert any schema type to Zod
- Extract JSON schema from AI SDK jsonSchema wrapper via symbol properties
- Use fromJSONSchema internally only during type generation (one-time cost)
- Support: Zod schemas, AI SDK jsonSchema wrapper, raw JSON schemas

Performance:
- MCP getAITools(): ~0.25µs per schema (fast jsonSchema wrapper)
- generateTypes(): ~210µs for complex schemas (acceptable startup cost)

Tests added:
- AI SDK jsonSchema wrapper handling
- Raw JSON Schema object handling
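The detection step in the commit above can be sketched without either library. This is a minimal illustration, not the code in codemode/types.ts: it relies on the PR's statement that Zod schemas carry a `_zod` property, and it assumes the wrapper exposes its raw schema on a plain `jsonSchema` property. The names `looksLikeZodSchema`, `looksLikeJsonSchemaWrapper`, and `classifySchema` are hypothetical, and the symbol-based storage mentioned in the commit is not covered.

```typescript
// Hypothetical classification result for this sketch.
type SchemaKind = "zod" | "json-schema-wrapper" | "unsupported";

function isObjectLike(value: unknown): value is Record<string, unknown> {
  return typeof value === "object" && value !== null;
}

// Per the PR, Zod v4 schemas carry a `_zod` property.
function looksLikeZodSchema(value: unknown): boolean {
  return isObjectLike(value) && "_zod" in value;
}

// Assumption: the wrapper exposes the raw schema as a `jsonSchema` object property.
function looksLikeJsonSchemaWrapper(value: unknown): boolean {
  return isObjectLike(value) && isObjectLike(value["jsonSchema"]);
}

// Check Zod first, then the wrapper; anything else is left for the caller to degrade.
function classifySchema(value: unknown): SchemaKind {
  if (looksLikeZodSchema(value)) return "zod";
  if (looksLikeJsonSchemaWrapper(value)) return "json-schema-wrapper";
  return "unsupported";
}
```

A caller could route on the returned kind and fall back to `unknown` for the `"unsupported"` case.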
Replace fromJSONSchema-based conversion with direct JSON Schema to TypeScript string conversion. This avoids creating Zod validators when we only need type information.
- Add jsonSchemaToTypeString() for direct conversion
- Remove fromJSONSchema import (no longer needed)
- Remove raw JSON Schema handling (only Zod + jsonSchema wrapper)
- Support: objects, arrays, enums, unions, intersections, etc.

Performance:
- jsonSchema wrapper → TypeScript: ~6µs (was ~210µs, 35x faster)
- Zod schema → TypeScript: ~32µs (uses zod-to-ts)
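To illustrate what "direct conversion" means here, the following is a small sketch of turning a JSON Schema object straight into a TypeScript type string with no validator in between. It is not the PR's `jsonSchemaToTypeString()`: the `MiniSchema` interface and `miniSchemaToTs` function are hypothetical, and only a small subset of keywords is handled.

```typescript
// Hypothetical minimal subset of JSON Schema used for this sketch.
interface MiniSchema {
  type?: string | string[];
  properties?: Record<string, MiniSchema>;
  required?: string[];
  items?: MiniSchema;
  enum?: unknown[];
}

function miniSchemaToTs(schema: MiniSchema): string {
  if (schema.enum) {
    // An empty enum admits no value, so it maps to `never`.
    if (schema.enum.length === 0) return "never";
    return schema.enum.map((value) => JSON.stringify(value)).join(" | ");
  }
  if (Array.isArray(schema.type)) {
    // Type arrays such as ["string", "null"] become a union.
    return schema.type
      .map((single) => miniSchemaToTs({ ...schema, type: single }))
      .join(" | ");
  }
  switch (schema.type) {
    case "string":
      return "string";
    case "number":
    case "integer":
      return "number";
    case "boolean":
      return "boolean";
    case "null":
      return "null";
    case "array": {
      if (!schema.items) return "unknown[]";
      const inner = miniSchemaToTs(schema.items);
      // Parenthesize unions so `[]` applies to the whole item type.
      return inner.includes(" | ") ? `(${inner})[]` : `${inner}[]`;
    }
    case "object": {
      const required = new Set(schema.required ?? []);
      const props = Object.entries(schema.properties ?? {}).map(
        ([key, prop]) =>
          `${JSON.stringify(key)}${required.has(key) ? "" : "?"}: ${miniSchemaToTs(prop)};`
      );
      return props.length > 0 ? `{ ${props.join(" ")} }` : "Record<string, unknown>";
    }
    default:
      return "unknown";
  }
}
```

Because the output is only a string, nothing is compiled or validated at this stage, which is consistent with the lower cost reported for this path.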
Remove 144 tests that were testing zod-to-ts library behavior. Keep 21 focused tests for our own code: - sanitizeToolName (10 tests) - generateTypes with jsonSchema wrapper (9 tests) - generateTypes with Zod schema (2 tests - integration only)
Add depth/circular reference guards, $ref resolution, string escaping (control chars, U+2028/U+2029, JSDoc), tuple/nullable support, and per-tool error isolation so one malformed schema never crashes the pipeline.
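The depth and circular reference guards can be pictured with a small recursive walker. This is a sketch under stated assumptions, not the PR's implementation: `GuardContext`, `NestedSchema`, and `guardedTypeString` are hypothetical names, only nested arrays are handled, and removing a schema from the set on the way out (so only the current path counts as a cycle) is a design choice made for this example.

```typescript
// Hypothetical context threaded through every recursive call.
interface GuardContext {
  depth: number;
  maxDepth: number;
  seen: Set<unknown>;
}

// Hypothetical schema shape: just enough to build nesting and cycles.
interface NestedSchema {
  type?: string;
  items?: NestedSchema;
}

function guardedTypeString(schema: NestedSchema, ctx: GuardContext): string {
  // Depth guard: stop descending once the limit is reached.
  if (ctx.depth >= ctx.maxDepth) return "unknown";
  // Cycle guard: the same object is already being converted further up the stack.
  if (ctx.seen.has(schema)) return "unknown";

  ctx.seen.add(schema);
  try {
    if (schema.type === "array" && schema.items) {
      // The spread copies the counters but shares the same `seen` set.
      const inner = guardedTypeString(schema.items, { ...ctx, depth: ctx.depth + 1 });
      return `${inner}[]`;
    }
    return schema.type === "string" ? "string" : "unknown";
  } finally {
    ctx.seen.delete(schema);
  }
}

// Convenience entry point; 20 mirrors the depth limit stated in the PR description.
function convertWithGuards(schema: NestedSchema, maxDepth = 20): string {
  return guardedTypeString(schema, { depth: 0, maxDepth, seen: new Set() });
}
```

With this shape a self-referencing schema terminates with `unknown` at the point of re-entry instead of overflowing the stack.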
- Apply nullable after resolving $ref so nullable: true on ref properties correctly produces `| null`
- Empty enum (enum: []) now produces `never` instead of invalid TS
- Add improvements.md documenting known limitations found during stress-testing with 51 real-world MCP schemas
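The ordering fix in the first bullet (resolve the reference first, then apply `nullable`) can be sketched as follows. This is an illustration only: `RefSchema`, `resolveLocalRef`, and `refAwareType` are hypothetical names, only internal JSON Pointers are resolved, and cycle guards are omitted for brevity, so a self-referencing schema would recurse forever in this sketch.

```typescript
// Hypothetical schema shape for this sketch.
interface RefSchema {
  $ref?: string;
  type?: string;
  nullable?: boolean;
  $defs?: Record<string, RefSchema>;
}

// Resolve an internal JSON Pointer such as "#/$defs/Name"; anything else is unresolved.
function resolveLocalRef(root: unknown, ref: string): unknown {
  if (ref === "#") return root;
  if (!ref.startsWith("#/")) return undefined; // external URLs are not followed
  let current: unknown = root;
  for (const raw of ref.slice(2).split("/")) {
    // JSON Pointer unescaping: "~1" is "/", then "~0" is "~".
    const key = raw.replace(/~1/g, "/").replace(/~0/g, "~");
    if (typeof current !== "object" || current === null) return undefined;
    if (!Object.prototype.hasOwnProperty.call(current, key)) return undefined;
    current = (current as Record<string, unknown>)[key];
  }
  return current;
}

function refAwareType(schema: RefSchema, root: RefSchema): string {
  let base = "unknown";
  if (schema.$ref !== undefined) {
    const target = resolveLocalRef(root, schema.$ref);
    if (typeof target === "object" && target !== null) {
      base = refAwareType(target as RefSchema, root);
    }
  } else if (schema.type === "string" || schema.type === "number" || schema.type === "boolean") {
    base = schema.type;
  }
  // Nullable is applied last, so it also covers the resolved $ref branch.
  return schema.nullable === true ? `${base} | null` : base;
}
```

Applying `nullable` at the single exit point is what keeps `{ $ref, nullable: true }` from losing its `| null`.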
- Remove improvements.md (findings go in PR comment instead)
- Emit @format JSDoc annotation when a property has a format keyword (e.g. email, date-time, uuid) so the info is preserved for the LLM without changing the TS type
Stress Testing Findings
Ran …
Bugs found and fixed
Known limitations (cannot fix in converter)
Out of scope (by design, degrade to
- Return `{}` for `additionalProperties: false` with no properties
- Use JSON.stringify for object/array enum and const values
- Multi-line JSDoc when both description and @format are present
- Add null guard in extractDescriptions for propSchema
- Normalize newlines in descriptions to prevent broken JSDoc
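The JSDoc-related items in this list (multi-line output when a description and a format are both present, newline normalization) can be sketched together. `buildJsDocComment` is a hypothetical helper, not the PR's code, and rewriting `*/` as `*\/` is an assumption about how a comment terminator might be neutralized.

```typescript
// Hypothetical helper: build a JSDoc comment from an optional description and format.
function buildJsDocComment(description?: string, format?: string): string {
  const clean = (text: string): string =>
    text
      .replace(/\r?\n/g, " ") // newlines would break a single-line comment
      .replace(/\*\//g, "*\\/"); // assumption: neutralize the comment terminator

  const lines: string[] = [];
  if (description) lines.push(clean(description));
  if (format) lines.push(`@format ${clean(format)}`);

  if (lines.length === 0) return "";
  if (lines.length === 1) return `/** ${lines[0]} */`;
  // Description and @format together: emit a multi-line block.
  return ["/**", ...lines.map((line) => ` * ${line}`), " */"].join("\n");
}
```

For example, a property described as "User email" with format `email` yields a three-line block with the description first and the `@format email` tag second.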
Treat boolean property schemas as concrete TS types: true -> unknown, false -> never, and respect required/optional markers when emitting properties. Add tests covering boolean property schemas, type arrays (e.g. ["string","null"]), integer->number mapping, bare arrays, empty enum -> never, and additionalProperties behavior (both true and typed). Also add a clarifying comment about index signature compatibility when emitting additionalProperties.
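A minimal sketch of the boolean-schema mapping described in this commit, assuming a hypothetical `emitProperty` helper and a deliberately tiny `PropSchema` type (the real converter handles far more than `string`):

```typescript
// Hypothetical property schema: either a boolean schema or a tiny object schema.
type PropSchema = boolean | { type?: string };

function emitProperty(key: string, schema: PropSchema, required: boolean): string {
  let tsType: string;
  if (schema === true) {
    tsType = "unknown"; // `true` accepts any value
  } else if (schema === false) {
    tsType = "never"; // `false` accepts no value
  } else {
    tsType = schema.type === "string" ? "string" : "unknown";
  }
  // Optional properties get a `?` marker; required ones do not.
  return `${key}${required ? "" : "?"}: ${tsType};`;
}
```

The point of the fix is that a boolean schema still produces a property line, rather than being skipped.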
Added a few more tests, and addressed this gap: boolean property schemas were silently dropped.

Approving, land if my above thing makes sense to you (look at my commit).
Summary
This PR makes codemode's `generateTypes()` work with AI SDK `jsonSchema()` wrappers (from MCP tools) and hardens the entire JSON Schema → TypeScript conversion pipeline so it never crashes regardless of input.

Before: `generateTypes()` only worked with Zod schemas. MCP tools using `jsonSchema()` wrappers silently produced `unknown` types. No protection against recursive schemas, malformed input, or special characters.

After: Full support for both Zod and `jsonSchema()` wrappers with direct JSON Schema → TypeScript conversion (~6µs vs ~32µs for Zod). The converter handles `$ref`, circular schemas, deeply nested schemas, tuple types, nullable types, and special characters without crashing.

What changed
packages/codemode/src/types.ts (+568 lines)

Schema detection & extraction:
- `isJsonSchemaWrapper()` / `extractJsonSchema()`: detect and unwrap AI SDK `jsonSchema()` wrappers (both direct property and symbol-based storage)
- `isZodSchema()`: detect Zod schemas via `_zod` property
- `safeSchemaToTs()`: routes to `zodToTs` or direct conversion based on schema type

Direct JSON Schema → TypeScript converter (`jsonSchemaToTypeString`):
- Supports `string`, `number`, `integer`, `boolean`, `null`, `object`, `array`, `enum`, `const`, `anyOf`, `oneOf`, `allOf`, type arrays (`["string", "null"]`)
- `ConversionContext` threading through all recursive calls (root schema, depth counter, visited set, max depth)

Safety & correctness:
- `$ref` resolution: `resolveRef()` handles internal JSON Pointers (`#/definitions/...`, `#/$defs/...`, `#`) and JSON Pointer unescaping (`~0`/`~1`). External URLs degrade to `unknown`
- Depth guard: returns `unknown` at depth 20
- Circular reference guard: `Set<unknown>` tracks visited schema objects, returns `unknown` on cycles
- Tuples: `prefixItems` (JSON Schema 2020-12) and `items` as array (draft-07) → `[T1, T2, ...]`
- `nullable: true`: `applyNullable()` applied across all branches including `$ref`
- `additionalProperties: false`: empty object with no additional properties returns `{}` instead of `Record<string, unknown>`
- Empty enum: `enum: []` produces `never` instead of invalid TypeScript
- Object/array enum and const values: `JSON.stringify()` instead of `String()` (which produced `[object Object]`)
- String escaping: `escapeStringLiteral()` for `\n`, `\r`, `\t`, control chars, U+2028/U+2029 in enum/const values; `quoteProp()` for property names; `escapeJsDoc()` for `*/` in JSDoc
- Newline normalization: `\r?\n` → spaces in descriptions before JSDoc emission
- `@format` as JSDoc hint: JSON Schema `format` keyword emitted as `@format` tag; multi-line JSDoc when combined with description
- Per-tool error isolation: `try-catch` in `generateTypes()` loop; one bad tool emits `unknown` types without crashing the pipeline
- `inputSchema`: missing schema produces empty param descriptions instead of throwing
- `extractDescriptions`: guards `typeof null === "object"` edge case

packages/agents/src/mcp/client.ts (+25/-7)
- `getAITools()`: replaced `.map()` with for-loop + `try-catch`; bad tools logged via `console.warn` and skipped
- `tool.inputSchema`: missing schema falls back to `{ type: "object" }`

packages/codemode/src/tests/schema-conversion.test.ts (new, 42 tests)
- `$ref` resolution (`$defs`, `definitions`, unresolvable, external URL, nested chains)
- `*/` in property and tool descriptions
- Tuples (`items` array, 2020-12 `prefixItems`)
- `nullable: true`
- `additionalProperties: false` (`{}` vs `Record<string, unknown>`)
- `JSON.stringify` for enum and const values
- `@format`

packages/codemode/src/tests/types.test.ts (new, 24 tests)

Covers: Zod tools, jsonSchema tools, mixed tools, tool name sanitization, description/JSDoc, edge cases (hyphenated names, reserved words, digit-leading names), malformed input (null/undefined/string inputSchema), error isolation (throwing tool doesn't break others).
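The per-tool error isolation described above (a loop with `try-catch` so one bad tool degrades to `unknown`) can be sketched as follows. `ToolLike` and `generateTypesIsolated` are hypothetical names, and the converter is passed in as a parameter so the sketch stays independent of the real conversion code.

```typescript
// Hypothetical minimal tool shape for this sketch.
interface ToolLike {
  name: string;
  inputSchema: unknown;
}

function generateTypesIsolated(
  tools: ToolLike[],
  convert: (schema: unknown) => string
): string[] {
  const lines: string[] = [];
  for (const tool of tools) {
    let inputType: string;
    try {
      inputType = convert(tool.inputSchema);
    } catch {
      // A failing tool degrades to `unknown` instead of aborting the whole loop.
      inputType = "unknown";
    }
    lines.push(`type ${tool.name}Input = ${inputType};`);
  }
  return lines;
}
```

A plain loop makes the failure boundary explicit: each iteration either produces a converted type or falls back, and the remaining tools are still processed.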
Stress testing
Tested against 51 schemas from real-world MCP servers and adversarial inputs:
Real MCP servers: Cloudflare Workers, DynamoDB, Docker, Filesystem, Playwright, Kubernetes, Home Assistant, Obsidian, Stagehand
Adversarial inputs: Self-referencing `$ref`, 25-level nesting, circular A→B→A refs, `__proto__` property names, JSDoc-breaking `*/` descriptions, empty enums, external URL refs, boolean schemas, null/undefined/string inputSchema

All 51 pass without crashes.
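One of the adversarial cases, `__proto__` as a property name, behaves differently depending on how the schema object is created. The snippet below is a small standalone illustration of that JavaScript behavior, with made-up property values; it does not use any code from the PR.

```typescript
// In an object literal, a `__proto__` key sets the prototype
// instead of creating an own property.
const fromLiteral = { "__proto__": { injected: true }, path: { type: "string" } };

// With JSON.parse, `__proto__` becomes an ordinary own property.
const fromWire: Record<string, unknown> = JSON.parse(
  '{"__proto__": {"injected": true}, "path": {"type": "string"}}'
);

const literalKeys = Object.keys(fromLiteral); // ["path"]
const wireKeys = Object.keys(fromWire); // ["__proto__", "path"]
```

This is why a hand-written test fixture can lose the property while a schema parsed from JSON keeps it.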
Performance
- `getAITools()`
- `generateTypes()` with jsonSchema
- `generateTypes()` with Zod

Known limitations
- `__proto__` as a property name is silently dropped by the JS engine (object literals only; `JSON.parse()` from MCP wire protocol is unaffected)
- External `$ref` URLs, `not`, `if/then/else`, `patternProperties` are out of scope and degrade to `unknown`
- `format` keyword is emitted as a `@format` JSDoc hint, not as a refined TypeScript type

Test plan
- `glm-4.7-flash` only)