Support image inputs for vision chat models by threepointone · Pull Request #448 · cloudflare/ai

threepointone · 2026-03-19T12:50:57Z

Summary

Fix image input handling in workers-ai-provider and @cloudflare/tanstack-ai for vision-capable chat models (Llama 4 Scout, Kimi K2.5, etc.)
Handle all LanguageModelV3DataContent variants (Uint8Array, base64 string, data URL) — previously only Uint8Array was handled, silently dropping base64 and data URL inputs
Send images as OpenAI-compatible image_url content parts inline in messages, which works with both the binding and REST API paths
Add Vision sections to both READMEs with usage examples
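
For illustration, the OpenAI-compatible message shape described above might look roughly like the sketch below. The type and function names (ContentPart, UserMessage, buildUserMessage) are hypothetical and are not taken from the PR; only the image_url part layout follows the OpenAI-compatible convention the summary refers to.

```typescript
// Hypothetical names for illustration; the package's real types
// (WorkersAIContentPart, WorkersAIUserMessage) may differ in detail.
type TextPart = { type: "text"; text: string };
type ImageUrlPart = { type: "image_url"; image_url: { url: string } };
type ContentPart = TextPart | ImageUrlPart;

type UserMessage = {
  role: "user";
  // A plain string for text-only prompts, an array when images are attached.
  content: string | ContentPart[];
};

// Build a user message. With no images the content stays a plain string;
// otherwise the text and each image are emitted as inline content parts.
function buildUserMessage(text: string, imageDataUrls: string[]): UserMessage {
  if (imageDataUrls.length === 0) {
    return { role: "user", content: text };
  }
  const parts: ContentPart[] = [{ type: "text", text }];
  for (const url of imageDataUrls) {
    parts.push({ type: "image_url", image_url: { url } });
  }
  return { role: "user", content: parts };
}
```

Because the image travels inside the messages array rather than in a separate field, the same payload can in principle be handed to either the binding or the REST path unchanged.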

What changed

workers-ai-provider

convert-to-workersai-chat-messages.ts: Added toUint8Array (normalises all data content types) and uint8ArrayToBase64 (chunked encoder). File parts are now converted to image_url content parts in the messages array.
workersai-chat-prompt.ts: Added WorkersAIContentPart type, widened WorkersAIUserMessage.content to string | WorkersAIContentPart[]
workersai-chat-language-model.ts: Simplified buildRunInputs — both REST and binding paths pass content arrays through directly
Added 17 unit tests for image handling, e2e vision tests for Llama 4 Scout + Kimi K2.5 (REST) and uform-gen2 (binding)
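
The normalisation and chunked encoding steps could be sketched as follows. This is not the PR's code: the function bodies, the DataContent alias, the toDataUrl helper, and the chunk size are assumptions, and the sketch further assumes that any data URL passed in is base64-encoded.

```typescript
// Assumed stand-in for LanguageModelV3DataContent: raw bytes, a base64
// string, or a data URL string.
type DataContent = Uint8Array | string;

// Decode a base64 string into bytes using the global atob.
function base64ToUint8Array(base64: string): Uint8Array {
  const binary = atob(base64);
  const bytes = new Uint8Array(binary.length);
  for (let i = 0; i < binary.length; i++) {
    bytes[i] = binary.charCodeAt(i);
  }
  return bytes;
}

// Normalise every supported input variant to a Uint8Array.
function toUint8Array(data: DataContent): Uint8Array {
  if (data instanceof Uint8Array) return data;
  if (data.indexOf("data:") === 0) {
    // Assumption: the payload after the first comma is base64.
    const comma = data.indexOf(",");
    if (comma === -1) throw new Error("Malformed data URL");
    return base64ToUint8Array(data.slice(comma + 1));
  }
  return base64ToUint8Array(data);
}

// Encode bytes as base64 in chunks, so that a large image never turns into
// one enormous argument list for String.fromCharCode.
function uint8ArrayToBase64(bytes: Uint8Array, chunkSize = 0x8000): string {
  let binary = "";
  for (let i = 0; i < bytes.length; i += chunkSize) {
    const chunk = bytes.subarray(i, i + chunkSize);
    binary += String.fromCharCode.apply(null, chunk as unknown as number[]);
  }
  return btoa(binary);
}

// Hypothetical helper: produce the data: URL carried by an image_url part.
function toDataUrl(data: DataContent, mediaType: string): string {
  return `data:${mediaType};base64,${uint8ArrayToBase64(toUint8Array(data))}`;
}
```

Chunking matters because passing every byte of a multi-megabyte image to String.fromCharCode in a single call can exceed the engine's argument limit.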

@cloudflare/tanstack-ai

Updated normalizeMessagesForBinding docs to reflect that content arrays pass through (binding accepts them at runtime)
Updated tests to expect content arrays in binding path

Test plan

workers-ai-provider: 210 unit tests pass, tsc --noEmit clean
@cloudflare/tanstack-ai: 219 unit tests pass, tsc --noEmit clean
E2E: Llama 4 Scout and Kimi K2.5 correctly describe test images via REST API
E2E: Confirmed content arrays work through the binding for both Llama 4 Scout and Kimi K2.5

Made with Cursor

Convert various image sources into OpenAI-compatible image_url parts and send them inline in chat messages so vision-capable models work via both binding and REST paths.

Key changes

convertToWorkersAIChatMessages: accept LanguageModelV3DataContent (Uint8Array, base64, data URL), normalize to bytes, and emit content arrays with image_url data: URLs; removed the separate images array.
workers-ai-provider: allow messages.content to be either string or content-part arrays, normalize binding messages but pass content arrays through at runtime.
workersai-chat-language-model / create-fetcher: stop extracting a separate image payload and instead include content arrays in inputs; cast inputs for binding runtime use.
Tests and e2e fixtures: added/updated tests for base64, data URLs, multiple images, REST & binding vision flows; updated mock binding worker to handle vision route.
Docs: added Vision (Image Inputs) usage examples to READMEs.

This enables sending images (Uint8Array, base64, or data URLs) inline as image_url parts so models like Llama 4 Scout and Kimi K2.5 can perform vision tasks.

changeset-bot · 2026-03-19T12:51:01Z

🦋 Changeset detected

Latest commit: 054ccb8

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 2 packages

Name	Type
workers-ai-provider	Patch
@cloudflare/tanstack-ai	Patch

pkg-pr-new · 2026-03-19T12:52:00Z

npx https://pkg.pr.new/cloudflare/ai/ai-gateway-provider@448

npx https://pkg.pr.new/cloudflare/ai/@cloudflare/tanstack-ai@448

npx https://pkg.pr.new/cloudflare/ai/workers-ai-provider@448

commit: 054ccb8

threepointone merged commit 5f603ff into main Mar 19, 2026
3 checks passed

threepointone deleted the vision-inputs branch March 19, 2026 12:54

github-actions bot mentioned this pull request Mar 19, 2026

Version Packages #449

Merged

threepointone merged 1 commit into main from vision-inputs
