Been diving deep into the Vercel AI SDK v5 canary releases lately, and wow, there are some significant shifts from v4 that are worth talking about. If you're building chat interfaces or anything conversational with AI, this is stuff you'll want to know. The team over at Vercel, along with community feedback, has clearly put a lot of thought into addressing some of the trickier parts of building rich, interactive AI experiences.
This post is going to be a bit of a deep dive. We'll look at how v5 changes the game for representing messages, how client-side state is managed, and how message delivery is becoming more flexible. Think of this as a walk-through from a fellow dev who's been in the trenches with the canary.
🖖🏿 A Note on Process & Curation: While I didn't personally write every word, this piece is a product of my dedicated curation. It's a new concept in content creation, where I've guided powerful AI tools (like Gemini 2.5 Pro for synthesis and a git diff of main vs. the v5 canary branch, informed by extensive research including OpenAI's Deep Research, with 10M+ tokens spent) to explore and articulate complex ideas. This method, inclusive of my fact-checking and refinement, aims to deliver depth and accuracy efficiently. I encourage you to see this as a potent blend of human oversight and AI capability. I use these tools for my own LLM chats on Thinkbuddy, and I do some of the polishing and publishing there too.
Let's get into it.
1. Intro: Why Message-Part Thinking Replaces `content: string`
- *TL;DR: Vercel AI SDK v5 fundamentally changes how chat messages are structured by moving away from a single `content: string` to an array of typed `parts`, enabling richer, multi-modal, and more maintainable conversational UIs.*
Why this matters?
If you've worked with Vercel AI SDK v4 (or many other chat systems, for that matter), you'll know that a chat message on the client often boiled down to a core `message.content: string`. This was fine for simple text, but as AI interactions got more complex, that single string became a real bottleneck. Trying to represent things like inline tool UIs (think a loading spinner for a function call, then its results), file previews, structured data cards from an AI, distinct reasoning steps from the model, or rich citations for RAG – it all had to be shoehorned into that one string, or managed through somewhat ad-hoc `annotations` or `toolInvocations` arrays alongside the main content.
As you probably know, this meant developers often resorted to custom parsing logic, Markdown conventions that you'd have to interpret on the client, or complex client-side logic to rebuild these richer UI states from disparate pieces of information. It wasn't always clean, and it definitely wasn't geared towards the "Generative UI" paradigm where the AI can more directly influence the structure of the UI itself.
How it’s solved in v5? (Step-by-step, Code, Diagrams)
AI SDK v5 (specifically the canary versions we're seeing, so expect breakage and changes as it iterates!) throws this old model out the window with the introduction of what I'm calling "Message-Part Thinking." The headline change: the top-level `content: string` is gone from the core `UIMessage` object (the new v5 data structure for representing chat messages on the client and for persistence). Instead, `UIMessage` now has a `parts: Array<UIMessagePart>` field.
This is a paradigm shift. The core philosophy, as gleaned from the Vercel AI team's discussions and the SDK's structure, is that UI messages ≠ model messages. What's optimal for UI display is often richer and more structured than what an LLM needs. The new `UIMessage.parts` array allows for precisely this richness. Each element in the `parts` array is a typed object representing a distinct piece of content – text, a tool interaction, a file, a citation, even reasoning steps.
This change directly enables:
- Richer, multi-part messages: Think a message that starts with text, shows a tool working, displays the tool's result, and then concludes with more text, all as distinct, renderable parts of a single conceptual message.
- True "Generative UI": The AI can stream a sequence of these typed parts, allowing the client to have dedicated rendering logic for each, leading to more dynamic and interactive interfaces.
- Maintainable Code: Handling complex UIs becomes cleaner because you're dealing with structured data, not parsing strings or juggling parallel arrays.
Take-aways / Migration Checklist Bullets
- v5 moves from a single `content: string` to `UIMessage.parts: Array<UIMessagePart>`.
- This fundamentally changes how you'll render messages.
- The goal is to support richer, more structured conversational experiences.
- This post will explore this new `UIMessage` anatomy in detail, then introduce two other foundational v5 concepts: `ChatStore` (principles for client-side state management) and `ChatTransport` (a concept for decoupling message delivery), which leverage this new message thinking.
2. `UIMessage` Anatomy
- *TL;DR: AI SDK v5 introduces `UIMessage<METADATA>`, a new client-side message structure where all content resides in an array of typed `UIMessagePart`s, complemented by stable IDs and typed message-level metadata, eliminating the top-level `content: string`.*
Why this matters?
In V4, the `Message` object was simpler, often revolving around `content: string`. While `toolInvocations` and `annotations` provided some extensibility, managing complex, ordered sequences of mixed content types (like text, then a tool UI, then a file preview) within a single assistant turn could be cumbersome. Persisting and rehydrating this rich state reliably was also a challenge. v5's `UIMessage` aims to provide a robust, type-safe, and extensible "blueprint" for these modern chat messages.
How it’s solved in v5? (Step-by-step, Code, Diagrams)
Let's break down the new `UIMessage<METADATA>` interface. This is the foundational data structure you'll be working with for messages in `useChat` and for persistence. (Heads up: This is based on `packages/ai/src/ui/ui-messages.ts` in the canary branch, so specifics might evolve.)
// From packages/ai/src/ui/ui-messages.ts (conceptual, may vary slightly in exact canary)
export interface UIMessage<METADATA = unknown> {
readonly id: string; // Unique identifier for the message
readonly role: 'system' | 'user' | 'assistant'; // Originator of the message
readonly parts: Array<UIMessagePart>; // Array of content parts, ALL content is here
metadata?: METADATA; // Optional, typed, application-specific metadata
// createdAt?: Date; // Typically added by UI hooks like useChat or server-side logic
}
2.1 Message-level fields (`id`, `role`, `metadata`, `createdAt`)
- `id: string`:
  - Purpose: This is a unique identifier for the message.
  - v5 Significance: A key improvement in v5 is that this `id` is intended to be stable throughout the message's lifecycle, even during streaming. If you recall, in V4, message IDs could sometimes change during streaming, causing UI re-keying headaches in React. The client-side stream processing utilities in v5 (like `processUIMessageStream`, which consumes the v5 UI Message Stream) are designed to manage this stability.
Message ID: msg_123 -------------------- Part 1 (stream) -> | Part 2 (stream) -> | msg_123 (stable) Part 3 (stream) -> | --------------------
[FIGURE 1: Diagram showing a message ID remaining constant as parts stream in]
- `role: 'system' | 'user' | 'assistant'`:
  - Purpose: Standard roles indicating the originator of the message. This is consistent with most chat systems.
- `metadata?: METADATA`:
  - Purpose: This is a new and powerful addition in v5. It's a generic, typed field for associating custom, application-specific structured information directly with a UI message.
  - v5 Significance: In V4, you might have used untyped `annotations` or added ad-hoc properties to messages. `metadata` provides a more robust and type-safe mechanism. `useChat` in v5 even supports a `messageMetadataSchema` option (e.g., a Zod schema) to validate this metadata when it's streamed from the server.
  - Examples: You could store processing times, confidence scores from the AI, database references, UI display hints, or anything your application needs to track alongside a message. For instance:

        // Example metadata type
        type MyMessageMetadata = {
          processingTimeMs?: number;
          userFeedback?: 'positive' | 'negative';
          referenceSourceId?: string;
        };
        // A UIMessage with this metadata
        // const message: UIMessage<MyMessageMetadata> = { /* ... */ };

  - Take-away: Leverage `metadata` for structured, typed custom data per message.
- `createdAt?: Date`:
  - Purpose: Timestamp indicating when the message was created.
  - v5 Significance: While not explicitly in the core `UIMessage` interface definition shown above (which focuses on what's essential for the stream/parts), a `createdAt: Date` field is typically added by UI hooks like `useChat` when a message is created on the client (e.g., when a user submits a message) or by the server when an AI message is generated. It's crucial for ordering messages in the UI and for persistence strategies. The v5 approach handles this more directly than V4's `sendExtraMessageFields`, which is now removed.
- Absence of Top-Level `content: string`:
  - It's worth reiterating: the single `content: string` field at the top level of a message object is gone from `UIMessage`. All displayable and structural content now resides in the `parts` array. This is arguably the most significant structural change from V4's typical `Message` and requires a shift in how you think about and render messages.
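Putting the message-level fields together, here's a sketch of what a complete assistant `UIMessage` might look like. This is illustrative only: the field values are made up, and the exact import path for the canary types is an assumption.

// Illustrative UIMessage object based on the interface shown above.
// The 'ai' import path for UIMessage is an assumption for the canary packages.
import type { UIMessage } from 'ai';

type MyMessageMetadata = { processingTimeMs?: number };

const assistantMessage: UIMessage<MyMessageMetadata> = {
  id: 'msg_123',                        // stable across streaming updates
  role: 'assistant',
  metadata: { processingTimeMs: 842 },  // typed, app-specific extras
  parts: [
    { type: 'text', text: 'Here is the weather you asked for.' },
  ],
};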
2.2 The six built-in `UIMessagePart` types
All content of a `UIMessage` now lives in its `parts` array. `UIMessagePart` (a new v5 term, often found in `packages/ai/src/ui/ui-messages.ts`) is a discriminated union, meaning each part object has a `type` property that dictates its structure and purpose. This allows for type-safe processing and rendering.
// From packages/ai/src/ui/ui-messages.ts
export type UIMessagePart =
| TextUIPart
| ReasoningUIPart
| ToolInvocationUIPart // Generic T is implicitly 'any' if not specified by consumer
| SourceUIPart
| FileUIPart
| StepStartUIPart;
// Note: ErrorUIPart and DataUIPart were in earlier conceptual drafts but
// might be handled via 'error' stream parts or UIMessage.metadata in current Canary.
// Stream-level errors are handled by an 'error' UIMessageStreamPart.
Let's go through each built-in part type:
- `TextUIPart`:
  - Interface: `{ readonly type: 'text'; readonly text: string; }`
  - Purpose: Represents a segment of plain textual content.
  - Use Case: Standard user messages, simple AI text responses, or segments of text interleaved with other part types in a more complex AI response.
  - Example JSON: `{ "type": "text", "text": "Hello! How can I help you today?" }`
- `ReasoningUIPart`:
  - Interface: `{ readonly type: 'reasoning'; readonly text: string; readonly providerMetadata?: Record<string, any>; }`
  - Purpose: Displays the AI's thought process, chain-of-thought explanations, or other "behind-the-scenes" reasoning.
  - Key Fields:
    - `text`: The actual reasoning content.
    - `providerMetadata`: Optional. Can hold provider-specific information related to that reasoning step (e.g., specific metrics or tags from the LLM). Its content is often provider-dependent and might be undefined.
  - Use Case: You might render this in a collapsible "Show Reasoning" section in your UI to provide transparency into the AI's process.
  - Example JSON: `{ "type": "reasoning", "text": "The user asked for a weather forecast. I should call the 'getWeather' tool with the location 'San Francisco'.", "providerMetadata": { "step_id": "reason_step_1" } }`
- `ToolInvocationUIPart`:
  - Interface: `{ readonly type: 'tool-invocation'; readonly toolInvocation: ToolInvocation; }`
  - Purpose: This is crucial for representing tool or function call interactions. It encapsulates the entire lifecycle of a tool usage event.
  - Key Field:
    - `toolInvocation`: This nested object (defined in `packages/ai/src/util/tool-invocation.ts`) is itself a discriminated union representing different states of the tool call. This allows UIs to render rich, stateful feedback. The states are:
      - `'partial-call'`: The model is streaming the arguments for a tool call.
        - Structure: `{ state: 'partial-call'; toolCallId: string; toolName: string; argsTextDelta: string; }`
        - `argsTextDelta` contains the latest chunk of the (potentially stringified JSON) arguments.
        - Example JSON: `{ "type": "tool-invocation", "toolInvocation": { "state": "partial-call", "toolCallId": "tool_abc123", "toolName": "searchWeb", "argsTextDelta": "{\"query\":\"latest AI news" } }`
      - `'call'`: The model has finished specifying the tool name and its complete arguments.
        - Structure: `{ state: 'call'; toolCallId: string; toolName: string; args: JSONValue; }`
        - `args` contains the complete, parsed arguments (as a `JSONValue`).
        - Example JSON: `{ "type": "tool-invocation", "toolInvocation": { "state": "call", "toolCallId": "tool_abc123", "toolName": "searchWeb", "args": { "query": "latest AI news", "count": 3 } } }`
      - `'result'`: The result from the executed tool is available.
        - Structure: `{ state: 'result'; toolCallId: string; toolName: string; args: JSONValue; result: JSONValue; }`
        - `result` contains the outcome of the tool execution (as a `JSONValue`).
        - Example JSON: `{ "type": "tool-invocation", "toolInvocation": { "state": "result", "toolCallId": "tool_abc123", "toolName": "searchWeb", "args": { "query": "latest AI news", "count": 3 }, "result": [ { "title": "AI SDK v5 Announced", "url": "..." }, { "title": "New LLM Achieves SOTA", "url": "..." } ] } }`
      - `'error'`: An error occurred during the tool invocation or its execution.
        - Structure: `{ state: 'error'; toolCallId: string; toolName: string; args: JSONValue; errorMessage: string; }`
        - `errorMessage` contains the error details.
        - Example JSON: `{ "type": "tool-invocation", "toolInvocation": { "state": "error", "toolCallId": "tool_abc123", "toolName": "searchWeb", "args": { "query": "latest AI news", "count": 3 }, "errorMessage": "API limit exceeded for searchWeb tool." } }`
* *Use Case:* Rendering dynamic tool UIs – showing a loading state during `'partial-call'` or `'call'`, then displaying formatted `args`, and finally the `result` or an `errorMessage`.
```markdown
ToolInvocationUIPart (tool_abc123)
-----------------------------------
1. 'partial-call' (streaming args...)
|
v
2. 'call' (args complete)
|
v (tool execution)
|
3. 'result' (success) OR 'error' (failure)
```
*`[FIGURE 2: Sequence diagram showing ToolInvocationUIPart states changing]`*
- `SourceUIPart`:
  - Interface: `{ readonly type: 'source'; readonly source: LanguageModelV2Source; }`
  - Purpose: Represents a cited source or reference for information provided in the message.
  - Key Field:
    - `source`: An object of type `LanguageModelV2Source` (a v5 term from `@ai-sdk/provider`), typically looking like `{ sourceType: 'url'; id: string; url: string; title?: string; providerMetadata?: SharedV2ProviderMetadata; }`.
  - Use Case: Essential for Retrieval Augmented Generation (RAG) systems. The UI can render these as clickable links, footnotes, or rich preview cards for documents or web pages.
  - Example JSON: `{ "type": "source", "source": { "sourceType": "url", "id": "doc_xyz789", "url": "https://example.com/research-paper.pdf", "title": "Groundbreaking AI Research Paper" } }`
- `FileUIPart`:
  - Interface: `{ readonly type: 'file'; readonly mediaType: string; readonly filename?: string; readonly url: string; }`
  - Purpose: Represents a file associated with the message, like an image, document, or other media.
  - Key Fields:
    - `mediaType`: An IANA standard media type (e.g., "image/png", "application/pdf").
    - `filename`: Optional, for display purposes.
    - `url`: This is the crucial field. It can be:
      - A remote HTTP(S) URL (e.g., `https://cdn.example.com/image.jpg`).
      - A Data URL (e.g., `data:image/png;base64,iVBORw0KGgo...`) for embedded content, often used for user-uploaded files before they're persisted to cloud storage.
  - Use Case: Displaying user-uploaded files (e.g., an image the user wants the AI to analyze) or files generated by the AI (e.g., a chart, a piece of code). Your UI would use the `mediaType` and `url` to render an appropriate preview (e.g., an `<img>` tag, a PDF viewer link, or a generic download link).
  - Example JSON (remote URL): `{ "type": "file", "mediaType": "image/jpeg", "filename": "cat_photo.jpg", "url": "https://example-files.com/cats/cat_photo.jpg" }`
* *Example JSON (Data URL):*
```json
{
"type": "file",
"mediaType": "image/png",
"url": "data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAAUA..." // (shortened for brevity)
}
```
- `StepStartUIPart`:
  - Interface: `{ readonly type: 'step-start'; }` (it may gain an `experimental_attachments` field, but its core role is as a marker)
  - Purpose: A simple, contentless marker part. It indicates the beginning of a new logical step or phase within a multi-step AI generation process.
  - Use Case: Often used in conjunction with tool calls or complex reasoning sequences to help UIs visually delineate these stages. For example, you might render a horizontal rule or a step label (e.g., "Step 1: Searching...") when this part appears.
  - My research notes mention an `experimental_attachments` field on `StepStartUIPart`, possibly shaped like `{ name: string; contentType: string; url: string; }[]`, suggesting files provided by the user at the start of a step. This is an area to watch as the canary evolves.
  - Example JSON: `{ "type": "step-start" }`
Extensibility:
This `parts` system is inherently extensible. While these are the built-in types, the architecture opens the door for custom part types in the future, allowing developers to represent even more specialized content within messages.
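To make the "type-safe processing" point concrete, here's a minimal, framework-agnostic sketch of handling the `UIMessagePart` union with an exhaustive `switch`. It assumes the part shapes described above and an `'ai'` import path for the types, both of which may shift between canary releases.

```typescript
// Minimal sketch: exhaustive handling of the UIMessagePart discriminated union.
// Assumes the six built-in part shapes described above; import path is an assumption.
import type { UIMessage, UIMessagePart } from 'ai';

function describePart(part: UIMessagePart): string {
  switch (part.type) {
    case 'text':
      return part.text;
    case 'reasoning':
      return `(reasoning) ${part.text}`;
    case 'tool-invocation':
      return `[tool ${part.toolInvocation.toolName}: ${part.toolInvocation.state}]`;
    case 'source':
      return `[source: ${part.source.title ?? part.source.url}]`;
    case 'file':
      return `[file ${part.filename ?? part.url} (${part.mediaType})]`;
    case 'step-start':
      return '--- new step ---';
    default: {
      // Exhaustiveness check: if a new part type is added, TypeScript flags this line.
      const _exhaustive: never = part;
      return `[unknown part: ${(_exhaustive as { type: string }).type}]`;
    }
  }
}

// Usage: flatten a message into a plain-text transcript line.
function describeMessage(message: UIMessage): string {
  return `${message.role}: ${message.parts.map(describePart).join(' ')}`;
}
```

The same switch-on-`part.type` pattern drives the JSX rendering example in Section 6.1; the `never` check in the default branch is what keeps that rendering logic honest as new part types appear.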
Take-aways / Migration Checklist Bullets
- `UIMessage` is the new client-side message format in v5.
- `UIMessage.parts` is an array of typed objects (`UIMessagePart`) and holds ALL message content.
- `message.content` (top-level string) is GONE. Update all rendering logic.
- Familiarize yourself with the six built-in part types: `text`, `reasoning`, `tool-invocation`, `source`, `file`, `step-start`.
- `UIMessage.metadata` allows for typed, application-specific data per message.
- `UIMessage.id` is stable through streaming in v5.
- `createdAt` is typically added by hooks/server for ordering.
3. From Client to LLM and Back
- *TL;DR: AI SDK v5 establishes a clear data transformation pipeline where client-side `UIMessage` arrays are converted to `ModelMessage` arrays for LLM interaction via `convertToModelMessages()`, and LLM output streams are transformed into v5 UI Message Streams (SSE of `UIMessageStreamPart`s) via `toUIMessageStreamResponse()` for client consumption.*
Why this matters?
Understanding how messages flow and transform between the client, your server, and the LLM is crucial. In V4, this could sometimes feel a bit implicit. v5 makes this pipeline more explicit and robust, driven by the "UI Messages ≠ Model Messages" philosophy. The rich `UIMessage` structure is great for the UI, but LLMs expect a more constrained format. Similarly, the raw output from an LLM needs to be carefully translated into the structured UI stream the client now expects.
How it’s solved in v5? (Step-by-step, Code, Diagrams)
Let's trace the journey of a message.
+--------+ +------------------------+ +--------------+ +-----+ +------------+ +-------------------------+ +-------------------------+ +--------+
| Client | --> | Server | --> | convertTo | --> | LLM | --> | LLM Stream | --> | toUIMessageStreamResponse | --> | UI Message Stream | --> | Client |
| (UIMsg)| | (receives UIMsg Array) | | ModelMessages| | (V2)| | (raw parts)| | | | (UIMessageStreamPart[]) | | (renders)|
+--------+ +------------------------+ +--------------+ +-----+ +------------+ +-------------------------+ +-------------------------+ +--------+
[FIGURE 3: High-level flow diagram: Client (UIMessage[]) -> Server -> convertToModelMessages -> ModelMessage[] -> LLM (V2 Interface) -> LLM Stream -> toUIMessageStreamResponse -> UI Message Stream (UIMessageStreamPart[]) -> Client]
3.1 `convertToModelMessages()` flow
This server-side utility is your bridge from the client's world to the LLM's world.
- Purpose: To convert an array of `UIMessage[]` (received from the client, rich with `parts` and `metadata`) into an array of `ModelMessage[]` (a v5 term for the format expected by V2 LLM interfaces on the server). `ModelMessage` itself also has its `content` as an array of typed parts (e.g., `LanguageModelV2TextPart`, `LanguageModelV2FilePart`).
- Location: This is a server-side utility, typically found in `packages/ai/src/ui/convert-to-model-messages.ts`.
- Inputs: An array of `UIMessage<METADATA>[]`.
- Outputs: An object containing `modelMessages: ModelMessage[]`.
- Core Logic:
  - It iterates through each `UIMessage` and its `parts` array.
  - `TextUIPart`s are mapped to `LanguageModelV2TextPart`s within the `ModelMessage.content`.
  - `FileUIPart`s are mapped to `LanguageModelV2FilePart`s. The function needs to handle whether the `url` is a Data URL (in which case it might extract base64 data) or a remote URL. It also considers the `model.supportedUrls` property from the `LanguageModelV2` instance; if the model can fetch from the URL directly, the SDK might just pass the URL.
  - `ToolInvocationUIPart`s are critical:
    - If an assistant's `UIMessage` has a `ToolInvocationUIPart` with `toolInvocation.state === 'call'`, this is converted into a `LanguageModelV2ToolCallPart` within an assistant `ModelMessage`'s `content`.
    - If a `UIMessage` (often with `role: 'tool'`) represents a tool result (e.g., from a `ToolInvocationUIPart` with `state === 'result'` or `'error'`), this is converted into one or more `LanguageModelV2ToolResultPart`s, which are then wrapped in a `ModelMessage` with `role: 'tool'`.
  - What's Excluded: Generally, UI-specific parts like `ReasoningUIPart` and `StepStartUIPart` are excluded from `ModelMessage`s because they are for UI presentation, not LLM prompting. Similarly, `UIMessage.id`, `UIMessage.createdAt`, and `UIMessage.metadata` are typically stripped as they are not part of the standard LLM prompt structure.
- Provider Adaptation: The `ModelMessage[]` array produced by `convertToModelMessages` is a standardized format. This array is then passed to the specific V2 provider adapter (e.g., `@ai-sdk/openai`, `@ai-sdk/anthropic`). The provider adapter performs the final transformation of this `ModelMessage[]` into the exact JSON payload or prompt string required by that particular provider's API (e.g., mapping to OpenAI's `messages` array with `tool_calls` objects, or Anthropic's specific content block structure).
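In code, the handoff looks roughly like this. This is a hedged sketch that assumes the `{ modelMessages }` return shape described above and the import layout used in the Section 6 walk-through; both may shift between canary releases.

```typescript
// Rough server-side sketch of the UIMessage -> ModelMessage handoff.
// Assumes convertToModelMessages returns { modelMessages } as described above.
import { convertToModelMessages, streamText, type UIMessage } from 'ai';
import { openai } from '@ai-sdk/openai';

export async function runTurn(uiMessages: UIMessage[]) {
  // UI-specific parts (reasoning, step-start) and metadata are stripped here.
  const { modelMessages } = convertToModelMessages(uiMessages);

  // The provider adapter turns these standardized ModelMessages into OpenAI's wire format.
  const result = await streamText({
    model: openai('gpt-4o-mini'),
    messages: modelMessages,
  });

  return result;
}
```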
3.2 `toUIMessageStreamResponse()` flow
Once your server has called the LLM (e.g., using `streamText()` with a V2 model instance) and has a result stream, this server-side helper method takes over to communicate back to the v5 client.
- Purpose: To take the raw output stream from a V2 core function like `streamText()` (which yields V2 stream parts like text deltas, tool call info, file info from the model, etc.) and transform it into the v5 UI Message Stream. This stream is an SSE (Server-Sent Events) stream composed of `UIMessageStreamPart` objects.
- Input: Typically, the `result` object from a V2 core function like `streamText()`. For example:

      // Server-side
      import { streamText } from 'ai';
      import { openai } from '@ai-sdk/openai';
      // ...
      const result = await streamText({ /* ... V2 options ... */ });
      // Now use result.toUIMessageStreamResponse()

- Output: An HTTP `Response` object, ready to be sent to the client. This response will have the correct SSE headers:
  - `Content-Type: text/event-stream`
  - `x-vercel-ai-ui-message-stream: v1` (This header signals to the client that it's a v5 UI Message Stream.)
- Transformation: Internally, `toUIMessageStreamResponse()` processes the V2 stream parts coming from the LLM provider (text deltas, tool call information, file data from the model, source information, etc.) and emits corresponding `UIMessageStreamPart` events over SSE. We'll dive into the specifics of `UIMessageStreamPart` types in a future post, but examples include `'text'` (for text deltas), `'tool-call'` (for tool call info), `'file'` (for file data), `'metadata'`, `'error'`, and `'finish'`.
  - `onFinish` Hook: This is a very important callback you can provide as an option to `toUIMessageStreamResponse()`.
    - Signature (conceptual): `onFinish({ messages }: { messages: UIMessage[] })` (The exact signature may vary based on context and SDK evolution, but it aims to provide the final message state.)
    - Purpose: It's invoked on the server after the entire response from the LLM has been processed and all corresponding `UIMessageStreamPart`s have been written to the client-bound stream (or at least queued).
    - Use Case: This is the ideal place for persistence. The `messages` argument aims to provide the final, fully constructed assistant `UIMessage`(s) from the current turn, or potentially the complete updated conversation history if `originalMessages` (the history up to the user's turn) were passed into the context for merging. You'd use this to save the conversation to your database. This is a cleaner approach than some V4 patterns.
    - The "Minimal migration recipe" in the v4 vs. v5 comparison from my research uses this `onFinish` pattern in its server route example.
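Condensed into a single route handler, the response path looks roughly like this. This is a sketch: the `onFinish` payload name (`responseMessages`) follows the Section 6 walk-through and is still conceptual in canary, and `persistConversation` is a hypothetical helper standing in for your database logic.

```typescript
// Condensed sketch of the response path; see Section 6 for the full route.
import { convertToModelMessages, streamText, type UIMessage } from 'ai';
import { openai } from '@ai-sdk/openai';

export async function POST(req: Request) {
  const { messages }: { messages: UIMessage[] } = await req.json();
  const { modelMessages } = convertToModelMessages(messages);

  const result = await streamText({ model: openai('gpt-4o-mini'), messages: modelMessages });

  // Emits SSE UIMessageStreamParts plus the x-vercel-ai-ui-message-stream: v1 header.
  return result.toUIMessageStreamResponse({
    onFinish: async ({ responseMessages }: { responseMessages: UIMessage[] }) => {
      // Ideal persistence point: save the full turn (history + new assistant UIMessages).
      await persistConversation([...messages, ...responseMessages]); // hypothetical helper
    },
  });
}

// Hypothetical persistence helper; replace with your database logic.
async function persistConversation(all: UIMessage[]) {
  console.log(`Persisting ${all.length} messages`);
}
```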
Take-aways / Migration Checklist Bullets
- Remember: UI Messages (`UIMessage`) are for the client/persistence, Model Messages (`ModelMessage`) are for the LLM.
- On the server, use `convertToModelMessages()` to prepare data for V2 LLM calls.
- UI-specific parts (`ReasoningUIPart`, `StepStartUIPart`), `UIMessage.id`, and `UIMessage.metadata` are generally stripped by `convertToModelMessages`.
- Use `result.toUIMessageStreamResponse()` from V2 core functions like `streamText()` to send v5 UI Message Streams back to the client.
- The `onFinish` callback in `toUIMessageStreamResponse()` is key for server-side persistence of the final `UIMessage`(s).
- Ensure your server API sets the `x-vercel-ai-ui-message-stream: v1` header for v5 clients.
4. `ChatStore`: State-Sharing Under the Hood
- *TL;DR: While a directly exposed `ChatStore` class might be more conceptual for typical `useChat` users in v5 Canary, its principles (a centralized, reactive, and shared state manager for chat conversations) are deeply embedded in how `useChat` now handles state, especially when using a shared `id` prop.*
Why this matters?
One of the significant challenges in V4, as many of you probably experienced, was state synchronization. If you had multiple components displaying or interacting with the same chat (e.g., a main chat window and a small chat preview in a sidebar), each `useChat` instance held its own separate copy of the state (`messages`, `input`, etc.). Keeping these in sync often required manual prop drilling or bolting on an external state management library. This could lead to the dreaded "tab A is out of sync with tab B" bugs or just a lot of boilerplate. The vision for `ChatStore` (a v5 concept) is to solve this by providing a single, reliable source of truth.
How it’s solved in v5? (Step-by-step, Code, Diagrams)
Okay, a quick clarification: if you're digging through the v5 Canary diffs right now, you might not find a high-level, directly exportable `ChatStore` class that you instantiate and pass around in most common `useChat` scenarios. Instead, `useChat` itself has been significantly beefed up to embody the principles of a `ChatStore`.
- Purpose & Benefits (The `ChatStore` Vision):
  - Central State Container: The idea is a single place to hold and manage everything about a chat conversation: the array of `UIMessage<METADATA>` objects, the current user input, loading status, errors, etc.
  - Single Source of Truth: No more conflicting states between different UI components.
  - Seamless Multi-Component Sync: Multiple UI components can share and react to the same chat data effortlessly. This is the "no more 'tab A is out of sync with tab B' bugs" promise.
  - Synchronized Writes & Optimistic Updates: All operations that modify the chat (sending a message, receiving AI parts) are managed centrally, allowing for consistent optimistic updates.
4.1 Single-source cache, optimistic writes (via `useChat`)
How does `useChat` in v5 achieve this?
- Caching with a Consistent `id`: When you use `useChat` with a consistent `id` prop (e.g., `useChat({ id: 'current_active_chat' })`), the SDK internally manages a shared state for that specific chat session. This provides an in-memory cache for the conversation (the `messages` array, `input` value, `status`, `error`) for that session's lifetime in the browser. If another component also calls `useChat({ id: 'current_active_chat' })`, it will tap into that same cached state. This is a huge win.
Component A Component B useChat({id: "chat_XYZ"}) useChat({id: "chat_XYZ"}) | | +-----------+-------------------+ | v +-------------------+ | Internal Shared | | State for chat_XYZ| | (messages, input) | +-------------------+
[FIGURE 4: Diagram showing two useChat instances with the same ID pointing to a single internal state]
- Optimistic Updates: This is where the UI feels snappy.
  - When you call `handleSubmit` (from `useChat`), the user's message (as a `UIMessage`) is immediately added to this local, shared state. The UI updates instantly.
  - Then, as `UIMessageStreamPart`s arrive from the server (via the v5 UI Message Stream we discussed), `useChat` (internally using `processUIMessageStream` and its `onUpdate` callback) incrementally updates the assistant's `UIMessage` in that shared state. Each update triggers a reactive re-render, giving that smooth streaming effect.
4.2 Multi-hook synchronisation demo (using `useChat` with a shared `id`)
This is where the `ChatStore` principles really shine through `useChat`.
- The `id` Prop is Key: If you have, say, a main chat window component and a smaller chat preview sidebar component, and both use `useChat` initialized with the same `id` string:

      // MainChatWindow.tsx
      const { messages, input, handleInputChange, handleSubmit } = useChat({ id: "session_123" });
      // ... render UI ...

      // ChatPreviewSidebar.tsx
      const { messages: previewMessages } = useChat({ id: "session_123" }); // Same ID!
      // ... render a summary or last few messages ...
- Automatic Synchronization: If the user sends a message from `MainChatWindow`, `ChatPreviewSidebar` will automatically see that user message appear optimistically, and then see the AI's response stream in, without you needing to pass props down or use the Context API manually for this specific chat state. This is because both hooks are subscribed to the same underlying managed state for `"session_123"`.
- Relation to `createChatStore()`: The v4 vs. v5 comparison table and some migration guide snippets from my research mention `createChatStore()` (e.g., from `@ai-sdk/core` or `@ai-sdk/react`). While you might not call this directly if you're just using `useChat`, this function is likely the underlying factory that `useChat` uses internally to create and manage these shared state instances when an `id` is provided. It's the potential "engine" for this shared state mechanism.
- Conceptual Imperative API (Beyond `useChat`): My research notes also hint that a `ChatStore` might eventually (or conceptually does) expose an imperative API (methods like `addMessage`, `setMessages`, `getState`, `subscribe`). This would be powerful for advanced use cases, non-React environments, or for building custom UI layers on top of the SDK's state logic. For now, with v5 Canary, `useChat` is the primary way to leverage these store-like capabilities in React.
Take-aways / Migration Checklist Bullets
- `ChatStore` principles in v5 aim to centralize client-side chat state.
- `useChat({ id: 'shared_id' })` is the v5 Canary way to get synchronized state across multiple components.
- This provides in-memory caching and enables smooth optimistic updates.
- No more manual state syncing for shared chat views if you use the `id` prop correctly.
- Be aware that while a standalone `ChatStore` class isn't the main focus for `useChat` users now, the underlying logic is there.
5. `ChatTransport`: Decoupling Delivery
- *TL;DR: AI SDK v5 introduces the `ChatTransport` concept, an abstraction layer designed to decouple chat logic from the message delivery mechanism, paving the way for flexible backend integrations (e.g., WebSockets, client-only storage, custom APIs) beyond the default HTTP/SSE.*
Why this matters?
In V4, `useChat` was pretty tightly coupled to making an HTTP POST request (for `submit`) or a GET (for `experimental_resume`) to a server endpoint (usually `/api/chat`), expecting a specific Server-Sent Events stream back. This was great for Next.js apps but less flexible if you wanted to:
- Talk to a backend that used WebSockets.
- Connect to a non-Vercel/Next.js backend (like a Python/FastAPI service with its own API structure).
- Implement a purely client-side chat that talked directly to an LLM provider (with user-provided keys, for demos/prototypes) or used browser `localStorage` for an offline-first experience.
- Basically, if your message delivery wasn't plain HTTP/SSE to `/api/chat`, you were often looking at bypassing `useChat`'s networking or writing significant wrappers.
The `ChatTransport` (a v5 architectural concept) aims to solve this by creating an abstraction for how messages are sent and received.
How it’s solved in v5? (Step-by-step, Code, Diagrams)
Similar to `ChatStore`, if you're looking at the v5 Canary `useChat` options right now, you might not see a direct `transport: MyCustomTransport` prop. The SDK's internal architecture (with its V2 interfaces and standardized v5 UI Message Stream) is now built to support such an abstraction, even if the public API for plugging custom transports into `useChat` isn't fully exposed or is still evolving. `useChat` currently uses an internal helper (often called `callChatApi`) for its default HTTP/SSE transport.
- Purpose & Benefits (The `ChatTransport` Vision):
  - Decouple Core Logic from Delivery: The main idea is to separate what data is being sent/received (`UIMessage` arrays, the v5 UI Message Stream) from how it's physically transmitted.
  - Enable Client-Only Usage: A transport could interact with `localStorage` or call an LLM API directly from the browser.
  - Support Custom Backends/Protocols: Easily integrate with various backend systems or use different protocols like WebSockets, gRPC, etc.
  - Future-Proofing: Makes it easier to adopt new communication technologies.
- Conceptual Interface (What a `ChatTransport` would do): Based on its role, a `ChatTransport` would likely need to define methods like:
  - `submit(messages: UIMessage[], options: ChatRequestOptions): Promise<Response>`:
    - Takes the current `UIMessage` history and any request-specific options.
    - Initiates the AI response.
    - Crucially, it must return a `Promise` that resolves to a standard `Response` object whose body is a `ReadableStream` compliant with the v5 UI Message Streaming Protocol (i.e., it streams `UIMessageStreamPart`s as SSE). This adherence to the v5 stream format is key for compatibility.
  - `resume(chatId: string, options?: AbortSignalOptions): Promise<Response>`:
    - Attempts to resume an interrupted stream for a given `chatId`.
    - Also returns a `Promise<Response>` with a v5 UI Message Stream.
  - `getChat(chatId: string): Promise<UIMessage[] | null>`:
    - Fetches historical `UIMessage[]` for a given chat.
+-------------------+ +-------------------+ | useChat / | --> | ChatTransport | | ChatStore (Logic) | | Interface | +-------------------+ +-------------------+ / | \ / | \ / | \ +------------+ +-----------+ +--------------+ | HTTP | | WebSocket | | LocalStorage | | Transport | | Transport | | Transport | +------------+ +-----------+ +--------------+
[FIGURE 5: Diagram showing useChat/ChatStore interacting with a ChatTransport interface, which has different implementations (HTTP, WebSocket, LocalStorage)]
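For illustration, here's what that contract could look like as a TypeScript interface. To be clear, this is a hedged sketch based purely on the conceptual methods above; the SDK does not (yet) export a `ChatTransport` type, and the option shapes are assumptions.

```typescript
// Hypothetical ChatTransport contract, sketched from the conceptual methods above.
import type { UIMessage } from 'ai';

// Hypothetical option bag; the real shape would come from the SDK if/when this lands.
interface ChatRequestOptions {
  headers?: Record<string, string>;
  body?: Record<string, unknown>;
  signal?: AbortSignal;
}

interface ChatTransport {
  // Must resolve to a Response whose body is a v5 UI Message Stream (SSE of UIMessageStreamParts).
  submit(messages: UIMessage[], options?: ChatRequestOptions): Promise<Response>;

  // Resume an interrupted stream for a chat; the same v5 stream contract applies.
  resume(chatId: string, options?: { signal?: AbortSignal }): Promise<Response>;

  // Load persisted history for a chat, or null if none exists.
  getChat(chatId: string): Promise<UIMessage[] | null>;
}
```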
5.1 Default SSE/REST implementation
Currently, `useChat` in v5 Canary uses an internal helper function (like `callChatApi` from `packages/ai/src/ui/call-chat-api.ts`) to handle its network communication.
- This internal utility effectively is the default HTTP/SSE transport.
- It makes HTTP POST requests (for `submit`) and HTTP GET requests (for `experimental_resume`) to the endpoint specified in `useChat`'s `api` prop (e.g., `/api/chat`).
- It expects this endpoint to return a v5 UI Message Stream (SSE of `UIMessageStreamPart`s).
5.2 Concept sketch – Custom Transports
Even if `useChat` doesn't have a direct `transport` prop in the current canary, let's illustrate how you could think about building a custom one, which highlights the power of this abstraction. The core challenge is always ensuring your transport's `submit`/`resume` methods ultimately provide a `ReadableStream` that yields v5 `UIMessageStreamPart`s. (A rough code sketch of the WebSocket case follows after this list.)
- Conceptual WebSocket Transport:
  - Establish Connection: Your transport would manage a WebSocket connection to your server.
  - `submit()` method:
    - Takes `UIMessage[]` and `options`.
    - Serializes these messages and sends them over the WebSocket to the server.
    - Listens for messages back from the WebSocket server.
    - Crucial Adaptation Step: The server would stream back its response over WebSockets (e.g., sending JSON objects representing text deltas, tool calls, etc.). Your transport's `submit` method needs to adapt these WebSocket messages into a `ReadableStream` that yields v5 `UIMessageStreamPart`s in the correct SSE-like format. This might involve creating a `ReadableStream` and having its `start(controller)` function push formatted `UIMessageStreamPart` strings (e.g., `data: ${JSON.stringify(part)}\n\n`) into the controller as they arrive from the WebSocket. This adaptation is key for `useChat` (or `processUIMessageStream`) to consume it.
  - This is more involved than HTTP/SSE because you're bridging WebSocket's message-based paradigm to SSE's event-stream paradigm.
- Conceptual Client-Only LocalStorageTransport (for offline or demos):
  - `submit()` method:
    - Takes `UIMessage[]`.
    - Simulates an AI response (e.g., echoes the input, or uses a simple rule-based engine).
    - Constructs an assistant `UIMessage` with the simulated response parts.
    - Uses SDK utilities like `createUIMessageStream` and `UIMessageStreamWriter` (v5 server-side utilities that could potentially be adapted or conceptually used on the client for this) to turn this assistant `UIMessage` into a `ReadableStream` of v5 `UIMessageStreamPart`s.
    - Saves the full updated conversation (`UIMessage[]` including the user's message and the simulated assistant message) to `localStorage`.
    - Returns the `Response` containing the stream.
  - `getChat()` method:
    - Reads and parses `UIMessage[]` from `localStorage` for the given `chatId`.
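Here's the WebSocket adaptation step sketched in code. It assumes the server sends JSON frames that already match `UIMessageStreamPart` shapes and that a `'finish'` part signals the end of the turn; both are assumptions about your own protocol, not SDK guarantees.

```typescript
// Rough sketch of bridging WebSocket frames into a v5-compatible SSE ReadableStream.
import type { UIMessage } from 'ai';

function submitOverWebSocket(ws: WebSocket, messages: UIMessage[]): Promise<Response> {
  const encoder = new TextEncoder();

  const stream = new ReadableStream<Uint8Array>({
    start(controller) {
      ws.onmessage = (event) => {
        // Each WebSocket frame carries one (assumed) UIMessageStreamPart as JSON.
        const part = JSON.parse(event.data as string);
        // Re-emit it in SSE framing so processUIMessageStream/useChat can consume it.
        controller.enqueue(encoder.encode(`data: ${JSON.stringify(part)}\n\n`));
        if (part.type === 'finish') {
          controller.close();
        }
      };
      ws.onerror = () => controller.error(new Error('WebSocket transport error'));
      // Kick off the turn by sending the current UIMessage history.
      ws.send(JSON.stringify({ messages }));
    },
    cancel() {
      ws.close();
    },
  });

  return Promise.resolve(
    new Response(stream, {
      headers: {
        'Content-Type': 'text/event-stream',
        'x-vercel-ai-ui-message-stream': 'v1',
      },
    }),
  );
}
```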
Take-aways / Migration Checklist Bullets
- `ChatTransport` is a v5 concept for decoupling message delivery from chat UI logic.
- It enables flexibility for different backends (HTTP, WebSockets, client-only, custom APIs like LangChain).
- Current v5 Canary `useChat` uses an internal default HTTP/SSE transport (`callChatApi`).
- A direct pluggable `transport` prop for `useChat` isn't fully evident in the Canary diffs, but the architecture (V2 interfaces, standard UI Message Stream) is designed to support it.
- If building a custom transport, its `submit`/`resume` methods must return a `Promise<Response>` whose body is a `ReadableStream` of v5 `UIMessageStreamPart`s (SSE format).
6. Putting It Together: End-to-End Code Walk-Through
- *TL;DR: This section provides simplified but complete v5 code examples for setting up `useChat` on the client and the corresponding Next.js API route on the server, demonstrating the core patterns for sending and receiving structured messages.*
Why this matters?
Seeing actual code helps solidify understanding. Let's look at a minimal but functional example of how `useChat` (client-side) and a Next.js API route (server-side) would work together using v5's new message structures and streaming protocols. This reflects the "Minimal migration recipe" from the v4 vs. v5 comparison in my research.
Remember, this is v5 Canary – APIs can and likely will change!
6.1 Hook setup (`useChat`) - Client-Side (React Example)
This component sets up `useChat`, provides a basic form for input, and renders messages by iterating through their `parts`.
// components/MyChatComponent.tsx
'use client'; // Required for useChat
import { useChat, UIMessage } from '@ai-sdk/react'; // Ensure you have the canary version
import { useEffect } from 'react';
import { z } from 'zod'; // Optional: for message metadata schema
// Optional: Define a Zod schema for your UIMessage.metadata
const MyMessageMetadataSchema = z.object({
timestampAccuracy: z.enum(['exact', 'estimated']).optional(),
processingTimeMs: z.number().optional(),
// Add any other custom metadata fields you expect
});
type MyCustomMetadata = z.infer<typeof MyMessageMetadataSchema>;
export default function MyChatComponent({ chatId }: { chatId: string }) {
const {
messages, // Array of UIMessage<MyCustomMetadata>
input,
handleInputChange,
handleSubmit,
isLoading,
error,
reload,
stop,
append, // To programmatically add messages
setMessages, // To set the entire message array
status, // More granular status: 'idle', 'loading', 'error', etc.
experimental_resume, // For stream resumption
} = useChat<MyCustomMetadata>({ // Pass metadata type if using schema
id: chatId, // Important for session identification and potential state sharing
api: '/api/v5/chat', // Your v5 backend endpoint
// initialMessages: [], // Optionally provide initial messages (UIMessage[])
messageMetadataSchema: MyMessageMetadataSchema, // Validate incoming metadata
// Optional client-side callbacks
onFinish: (message) => {
console.log('Assistant message finished streaming:', message);
// Useful for client-side actions after a message completes
},
onError: (err) => {
console.error('Chat error:', err);
// Update UI to show error, trigger logging, etc.
},
// onToolCall: async ({ toolCall }) => { /* Handle client-side tools */ },
});
// Attempt to resume stream on component mount if chatId is present
useEffect(() => {
if (chatId) {
experimental_resume().catch(e => console.warn("Stream resumption failed or no active stream:", e));
}
// eslint-disable-next-line react-hooks/exhaustive-deps
}, [chatId]); // Re-run if chatId changes
// Simple form submission (can be enhanced with file handling etc.)
const localHandleSubmit = (e: React.FormEvent<HTMLFormElement>) => {
handleSubmit(e, {
// v5 options for handleSubmit, e.g., for files:
// files: attachedFiles, // FileList or FileUIPart[]
// You can also pass per-request 'body' options here if your backend expects them
// body: { customPerRequestData: 'someValue' }
});
};
return (
<div>
<div style={{ maxHeight: '400px', overflowY: 'auto', border: '1px solid #ccc', padding: '10px', marginBottom: '10px' }}>
{messages.map((message: UIMessage<MyCustomMetadata>) => (
<div key={message.id} style={{ marginBottom: '10px', padding: '5px', border: '1px solid #eee' }}>
<strong>{message.role === 'user' ? 'You' : 'AI'}:</strong>
{/* Render message parts */}
{message.parts.map((part, index) => {
const partKey = `${message.id}-part-${index}`;
switch (part.type) {
case 'text':
// For Markdown: import Markdown from 'react-markdown'; <Markdown>{part.text}</Markdown>
return <span key={partKey}> {part.text}</span>;
case 'tool-invocation':
// Basic rendering for tool invocation state
return (
<div key={partKey} style={{ marginLeft: '10px', borderLeft: '2px solid blue', paddingLeft: '5px' }}>
<em>Tool: {part.toolInvocation.toolName} ({part.toolInvocation.state})</em>
{part.toolInvocation.state === 'call' && <pre>Args: {JSON.stringify(part.toolInvocation.args)}</pre>}
{part.toolInvocation.state === 'result' && <pre>Result: {JSON.stringify(part.toolInvocation.result)}</pre>}
{part.toolInvocation.state === 'error' && <pre style={{color: 'red'}}>Error: {part.toolInvocation.errorMessage}</pre>}
</div>
);
case 'file':
return <div key={partKey} style={{ marginLeft: '10px', fontStyle: 'italic' }}>File: {part.filename || part.url} ({part.mediaType})</div>;
case 'source':
return <div key={partKey} style={{ marginLeft: '10px', fontSize: '0.8em' }}>Source: <a href={part.source.url} target="_blank" rel="noopener noreferrer">{part.source.title || part.source.url}</a></div>;
case 'reasoning':
return <div key={partKey} style={{ marginLeft: '10px', color: 'purple', fontSize: '0.9em' }}>Reasoning: {part.text}</div>;
case 'step-start':
return <hr key={partKey} style={{ margin: '5px 0', borderColor: '#ddd' }} />;
default:
// This case should ideally not be hit if all part types are handled.
// The type system should ensure `part` is one of the known types.
// However, as a fallback:
const unknownPart = part as any;
return <span key={partKey}> [Unsupported Part: {unknownPart.type}]</span>;
}
})}
{/* Optional: Display message metadata */}
{message.metadata?.processingTimeMs && (
<small style={{ display: 'block', color: 'gray' }}>
(Processed in {message.metadata.processingTimeMs}ms)
</small>
)}
</div>
))}
</div>
```markdown
+----------------------------------------+
| Chat Window |
+----------------------------------------+
| You: Hello! |
| AI: [Text] Hi there! |
| [Tool: getWeather (call)] |
| Args: {"city": "London"} |
| AI: [Tool: getWeather (result)] |
| Result: {"temp": "15C"} |
| [Text] The weather is nice. |
+----------------------------------------+
| [Type your message... ] [Send]|
+----------------------------------------+
```
*`[FIGURE 6: Screenshot of a simple chat UI rendered from these parts]`*
<form onSubmit={localHandleSubmit}>
<input
value={input}
onChange={handleInputChange}
placeholder="Say something..."
disabled={isLoading || status === 'loading'}
style={{ width: '80%', padding: '8px' }}
/>
<button type="submit" disabled={isLoading || status === 'loading'} style={{ padding: '8px' }}>
Send
</button>
</form>
{error && <p style={{ color: 'red' }}>Error: {error.message} <button onClick={() => reload()}>Retry</button></p>}
{isLoading && <p>Loading...</p>}
{status !== 'idle' && status !== 'loading' && <p>Status: {status}</p>}
</div>
);
}
6.2 Server route with stream helpers - Server-Side (Next.js App Router Example)
This API route receives `UIMessage[]`, converts them for the LLM, calls `streamText`, and returns the v5 UI Message Stream.
// app/api/v5/chat/route.ts
import { NextRequest, NextResponse } from 'next/server';
import {
UIMessage,
convertToModelMessages, // v5 utility from 'ai'
// createUIMessageStream, // For manual stream construction, if needed
// UIMessageStreamWriter,
} from 'ai'; // Ensure this is the v5 version from canary
import { streamText } from 'ai'; // V2 core functions live in the core 'ai' package
import type { LanguageModelV2FunctionTool } from '@ai-sdk/provider';
import { openai } from '@ai-sdk/openai'; // V2 OpenAI provider
import { z } from 'zod';
export const runtime = 'edge'; // Recommended for streaming on Vercel
// Example: Define a server-side tool (V2 format)
const getCurrentWeatherTool: LanguageModelV2FunctionTool = {
type: 'function',
function: {
name: 'getCurrentWeather',
description: 'Get the current weather for a city',
parameters: z.object({
city: z.string().describe('The city, e.g., San Francisco'),
unit: z.enum(['celsius', 'fahrenheit']).optional().default('celsius'),
}),
execute: async ({ city, unit }) => {
// In a real app, call a weather API here
await new Promise(resolve => setTimeout(resolve, 300)); // Simulate API call
const temperature = Math.floor(Math.random() * 20) + 10; // Random temp
return { city, temperature, unit, forecast: ['sunny', 'cloudy'][Math.floor(Math.random()*2)] };
},
},
};
// Dummy persistence function (replace with your actual database logic)
async function saveConversation(chatId: string | undefined, messages: UIMessage[]) {
if (!chatId) {
console.warn('Cannot persist: chatId is undefined.');
return;
}
console.log(`[Server Persistence] Saving ${messages.length} messages for chat ${chatId}. Last message role: ${messages[messages.length-1]?.role}`);
// In a real app: await db.collection('chats').doc(chatId).set({ messages });
}
export async function POST(req: NextRequest) {
try {
// 1. Parse request body for messages and chatId
// The v5 client sends UIMessage[] and optionally an 'id' for the chat session.
const body = await req.json();
const { messages: uiMessagesFromClient, id: chatId }: { messages: UIMessage[]; id?: string } = body;
if (!uiMessagesFromClient || !Array.isArray(uiMessagesFromClient)) {
return NextResponse.json({ error: 'Missing or invalid "messages" in request body' }, { status: 400 });
}
// 2. Convert UIMessages (from client) to ModelMessages (for LLM)
// This handles the transformation of UIMessage.parts into ModelMessage.content parts.
const { modelMessages } = convertToModelMessages(uiMessagesFromClient);
// 3. Call the V2 LLM provider (e.g., OpenAI) using streamText
const result = await streamText({
model: openai('gpt-4o-mini'), // Use V2 model instance from @ai-sdk/openai
messages: modelMessages,
tools: { getCurrentWeather: getCurrentWeatherTool }, // Provide V2 tools
toolChoice: 'auto', // Let the model decide if/when to use tools
// system: "You are a helpful assistant.", // System prompt can be here or first message
// Optional: Server-side onFinish for this specific AI turn (logging, etc.)
onFinish: async ({ text, toolCalls, toolResults, finishReason, usage }) => {
console.log(`[Server AI Turn onFinish for Chat ${chatId}] Reason: ${finishReason}`);
if (usage) console.log(`Token usage: P${usage.promptTokens}, C${usage.completionTokens}`);
// This onFinish is about the AI's direct output for THIS turn.
},
});
// 4. Return the response using the v5 UI Message Streaming helper
// This method correctly transforms the stream from the LLM provider (V2 core parts)
// into the SSE-based UI Message Stream (UIMessageStreamPart[]) expected by v5 useChat.
return result.toUIMessageStreamResponse({
// Optional: onFinish for the entire stream response, ideal for persistence.
// This 'onFinish' receives the fully formed assistant UIMessage(s) for the current turn.
onFinish: async ({ responseMessages }: { responseMessages: UIMessage[] }) => {
if (chatId && responseMessages && responseMessages.length > 0) {
// Combine client history with new assistant messages for full context to save
const updatedFullConversation: UIMessage[] = [...uiMessagesFromClient, ...responseMessages];
await saveConversation(chatId, updatedFullConversation);
} else if (responseMessages && responseMessages.length > 0) {
console.warn(`[Server Stream onFinish] Chat ID missing, cannot persist. Assistant produced ${responseMessages.length} messages.`);
}
}
});
} catch (error: unknown) {
console.error('[Chat API Error]', error);
const errorMessage = error instanceof Error ? error.message : 'An unexpected error occurred.';
// For robust error handling, you might want to stream a v5 'error' UIMessageStreamPart
// using createUIMessageStream and writer.writeError(errorMessage).
// For simplicity here, returning a JSON error:
return NextResponse.json({ error: errorMessage }, { status: 500 });
}
}
// Basic GET handler for stream resumption (experimental_resume needs server support)
export async function GET(req: NextRequest) {
const { searchParams } = new URL(req.url);
const chatId = searchParams.get('chatId');
if (!chatId) {
return NextResponse.json({ error: 'Missing "chatId" query parameter' }, { status: 400 });
}
console.log(`[Chat API GET] Received request to resume stream for chat ID: ${chatId}`);
// --- Server-Side Resumption Logic ---
// This is complex. It requires a mechanism to store and retrieve active/recent stream states
// (e.g., using Redis, like 'resumable-stream' package concepts for V4).
// For v5, the resumed stream must also send v5 UIMessageStreamPart(s).
// Placeholder:
console.warn(`Stream resumption for chat ID ${chatId} is not fully implemented in this example.`);
return NextResponse.json({ message: `Resumption for chat ID ${chatId} not fully implemented.` }, { status: 501 });
}
Take-aways / Migration Checklist Bullets
- Client: Use `useChat` from `@ai-sdk/react` (or other framework packages). Pass your v5 `api` endpoint.
- Client: Crucially, update your message rendering logic to iterate `message.parts`.
- Client: Handle `isLoading` and `error` states from `useChat` for better UX.
- Server: Your API route (e.g., Next.js App Router `route.ts`) receives `UIMessage[]`.
- Server: Use `convertToModelMessages()` to prepare data for V2 LLM calls.
- Server: Use `streamText()` (or other V2 core functions) with V2 model instances.
- Server: Return `result.toUIMessageStreamResponse()` to send the v5 UI Message Stream.
- Server: Implement persistence in `toUIMessageStreamResponse`'s `onFinish` callback, saving the full `UIMessage[]` (including all `parts` and `metadata`).
- Server: Remember `runtime = 'edge'` for Vercel Edge Functions for optimal streaming.
7. Migration Pitfalls & How to Dodge Them
- *TL;DR: Migrating from Vercel AI SDK v4 to v5 Canary involves several key breaking changes; the most common pitfalls include not updating message rendering to use `message.parts`, ensuring server endpoints emit the new v5 UI Message Stream, adapting database schemas for the richer `UIMessage` structure, and using V2 model interfaces for all backend LLM calls.*
Why this matters?
Migrating major versions can be tricky, and v5 introduces some fundamental architectural shifts from V4. Being aware of the common pitfalls upfront can save you a lot of debugging time and headaches. This isn't just a version bump with a few new features; it's a rethinking of how chat is handled.
How it’s solved in v5? (Step-by-step, Code, Diagrams)
Here’s a list of common traps and how to navigate them when moving your V4 application to the v5 Canary:
(Remember: v5 is in Canary. Expect the unexpected, pin your versions, and check for updates regularly!)
- Rendering `message.content` instead of `message.parts`:
  - The Pitfall: This is, without a doubt, the #1 issue developers will hit. In V4, you typically rendered `message.content`. In v5, the top-level `message.content: string` is gone from `UIMessage`. All content is in `message.parts: UIMessagePart[]`. If you don't update your rendering logic, your chat messages will appear empty or broken.
  - How to Dodge:
    - You must refactor your UI components that render messages.
    - Iterate over the `message.parts` array.
    - Use a `switch` statement or conditional logic based on `part.type` to render each part appropriately (e.g., `TextUIPart`, `ToolInvocationUIPart`, `FileUIPart`, etc.).
    - Refer to Section 2.2 and Section 6.1 for examples of how to render parts.

          // Incorrect V4-style rendering:
          // messages.map(message => <div key={message.id}>{message.content}</div>) // THIS WILL BREAK IN v5

          // Correct v5-style rendering (conceptual):
          // messages.map(message => (
          //   <div key={message.id}>
          //     {message.parts.map((part, index) => <RenderMessagePart key={index} part={part} />)}
          //   </div>
          // ))
- Server Endpoint Incompatibility (Not Emitting v5 UI Message Stream):
  - The Pitfall: Your V4 backend API route was likely using `result.toDataStreamResponse()` or similar to send a V4-compatible data stream. v5 `useChat` clients expect the new v5 UI Message Streaming Protocol (SSE of `UIMessageStreamPart`s, identified by the `x-vercel-ai-ui-message-stream: v1` header). If your server doesn't send this, the client won't understand the stream.
  - How to Dodge:
    - On your server, after calling a V2 core function like `streamText()`, you must use `result.toUIMessageStreamResponse()` to send the response. This helper handles the correct formatting and headers.
    - Ensure your server is actually using V2 model interfaces and functions, as `toUIMessageStreamResponse()` is part of their result structure.
- Persistence Schema Mismatches:
  - The Pitfall: Your V4 database schema probably stored messages with a simple `content: string` field. The v5 `UIMessage` is much richer, containing the `parts` array (which is structured JSON) and the typed `metadata` field. Trying to save a v5 `UIMessage` into an old V4 schema will fail or lead to data loss.
  - How to Dodge:
    - Update your database schema to accommodate the full `UIMessage` structure. This typically means having a column that can store JSON (e.g., `JSONB` in PostgreSQL) for the `parts` array and another for `metadata`.
    - Your server-side persistence logic (e.g., in the `onFinish` callback of `toUIMessageStreamResponse()`) must save the complete `UIMessage` object.
- **Using V1 Core/Model Interfaces on the Backend:**
  - The Pitfall: If your server-side code that interacts with the LLM is still using V1 core functions or V1 model instances (e.g., `OpenAIChat` from older SDK versions), it won't be compatible with v5's expectations or helpers like `toUIMessageStreamResponse()`.
  - How to Dodge:
    - Ensure all backend LLM calls use V2 model instances (e.g., `openai('gpt-4o-mini')` from `@ai-sdk/openai`) and their new call signatures.
    - Use V2-compatible core functions like `streamText()`, `generateObject()`, etc. (a small usage sketch follows below).
    - This also applies to tool definitions, which must use the `LanguageModelV2FunctionTool` structure.
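  As a small illustration of the V2-style call pattern (assuming, for this sketch, that `generateObject` is importable from the core `ai` package in your canary version and that a V2 provider factory is installed):

  ```typescript
  import { generateObject } from 'ai';     // assumed import location for the canary
  import { openai } from '@ai-sdk/openai'; // V2 provider factory, as in the example above
  import { z } from 'zod';

  // No V1-style class instantiation (e.g., OpenAIChat): you pass a V2 model instance
  // created by the provider factory straight into the core function.
  const { object } = await generateObject({
    model: openai('gpt-4o-mini'),
    schema: z.object({
      title: z.string(),
      tags: z.array(z.string()),
    }),
    prompt: 'Suggest a title and tags for a post about migrating to AI SDK v5.',
  });

  console.log(object.title, object.tags);
  ```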
- **Misunderstanding Removed or Changed `useChat` Options/State:**
  - The Pitfall: Several `useChat` options and returned state fields from V4 are gone or have changed in v5. Using them will either cause errors or unexpected behavior.
  - How to Dodge: Review the v5 `useChat` API carefully (a short sketch of the new patterns follows below). Key changes include:
    - `data` and `setData` (the arbitrary JSON side-channel): Removed. Use `UIMessage.metadata` for message-specific custom data.
    - `sendExtraMessageFields`: Removed. `id` is now stable, `createdAt` is handled by the hook/server, and custom data goes into `metadata`.
    - `onResponse`: Removed. Use more specific callbacks like `onFinish` (for completed messages) or `onError`.
    - `isLoading`: Still there, but also check the more granular `status` string.
    - `handleSubmit` options: The second argument is now an options object, which in v5 includes a `files?: FileList | FileUIPart[]` property.
    - `useAssistant` hook: Removed. Build assistant-like flows using `useChat` with V2 tools and the `UIMessagePart` system.
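  Here's the short sketch referenced above. The option names follow the list; the form wiring, the chat `id`, and the status check are illustrative assumptions on my part and may differ in your canary version.

  ```tsx
  'use client';
  import { useChat } from '@ai-sdk/react';

  export function ChatForm() {
    const { messages, input, handleInputChange, handleSubmit, status } = useChat({ id: 'demo-chat' });

    return (
      <form
        onSubmit={(event) => {
          // v5: attachments travel through handleSubmit's options object.
          const fileInput = event.currentTarget.elements.namedItem('attachments') as HTMLInputElement | null;
          handleSubmit(event, { files: fileInput?.files ?? undefined });
        }}
      >
        <input name="attachments" type="file" multiple />
        <input value={input} onChange={handleInputChange} disabled={status === 'streaming'} />
        <button type="submit">Send</button>
        {/* v5: per-message custom data lives on message.metadata, not the old data/setData channel. */}
        <pre>{JSON.stringify(messages.at(-1)?.metadata ?? null)}</pre>
      </form>
    );
  }
  ```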
- **Forgetting about Canary Instability:**
  - The Pitfall: Treating v5 Canary releases like stable versions. APIs can and will change between canary updates. If you just use `"canary"` in your `package.json`, a new `npm install` or `pnpm install` could pull in breaking changes unexpectedly.
  - How to Dodge:
    - Pin specific canary versions: in your `package.json`, use exact canary versions (e.g., `"@ai-sdk/react": "3.0.0-canary.25"`).
    - Upgrade deliberately after reviewing release notes or changelogs for the canary versions.
    - Expect some churn. This is the nature of alpha/canary software.
Take-aways / Migration Checklist Bullets
- Update message rendering: `message.content` -> `message.parts`. This is #1.
- Server must emit the v5 UI Message Stream (use `toUIMessageStreamResponse()`).
- Database schema needs to store the full `UIMessage` (including `parts` and `metadata`).
- Backend must use V2 model interfaces and functions for all LLM calls.
- Review and adapt to changes in `useChat` options, state, and `handleSubmit`.
- `useAssistant` is gone; refactor to use `useChat` with V2 tools.
- Pin your v5 Canary SDK versions to avoid unexpected breakage.
- Test thoroughly after migration!
Migrating will take some effort, but the v5 architecture offers significant benefits: richer, more maintainable, and more flexible AI chat applications. Good luck!
8. Take-aways & What’s Next
- TL;DR: Vercel AI SDK v5 revolutionizes chat development with structured `UIMessage.parts` for richer UIs, `ChatStore` principles for synchronized client state via `useChat({ id })`, and the `ChatTransport` concept for flexible backends, ultimately future-proofing your conversational AI applications.*
Why this matters?
We've covered a lot of ground on the Vercel AI SDK v5 Canary! It's a significant leap forward from v4, fundamentally changing how we'll build conversational AI. The core idea is to move beyond simple text strings and empower developers to create truly dynamic, structured, and multi-modal chat experiences. This isn't just about new features; it's an architectural evolution aimed at making complex AI interactions more manageable and robust.
How it’s solved in v5? (Recap of Key Concepts)
Let's quickly recap the main pillars of v5 for chat that we've discussed:
- **`UIMessage` with `parts`**: This is the heart of v5 chat.
  - Messages are no longer just a `content: string`. They are `UIMessage` objects containing an array of typed `UIMessagePart`s (like `TextUIPart`, `ToolInvocationUIPart`, `FileUIPart`, `SourceUIPart`, `ReasoningUIPart`, `StepStartUIPart`).
  - This enables richer, multi-part messages, directly supporting "Generative UI" where the AI can stream structured content that your client renders into dynamic UI elements.
  - It also includes typed `metadata` for application-specific data and ensures stable `id`s.
- **`ChatStore` principles (via `useChat` with `id`)**:
  - While a fully exposed, directly instantiable `ChatStore` class might be more conceptual for typical `useChat` users in v5 Canary, its principles are now deeply embedded.
  - Using `useChat({ id: 'shared_chat_id' })` across multiple components ensures they share the same underlying chat state (messages, input, status); a small sketch follows this list.
  - This means synchronized client state, optimistic updates for a snappy UI, and in-memory caching for the session, eliminating many V4 state-synchronization headaches.
- **`ChatTransport` concept**:
  - This architectural idea is about decoupling the core chat logic from the actual mechanism of message delivery.
  - While direct pluggability into `useChat` isn't fully explicit in Canary diffs, the SDK's V2 interfaces and the standardized v5 UI Message Stream are built to support such an abstraction.
  - This paves the way for future flexibility: client-only chat (e.g., using `localStorage` or direct browser-to-LLM calls), custom backend integrations (WebSockets, gRPC), and easier testing.
Core Benefits (Why v5 is a Big Deal)
Drawing from the "Why it matters" points in the `v4_vs_v5_comparison` and the overall direction:
- "Reload = pixel-perfect restore": By persisting the rich
UIMessage
format (with all itsparts
andmetadata
), you can rehydrate your chat UI with full fidelity. What you save is what you see. - Decouples UI from transport: The
ChatTransport
concept aims to free your UI logic from being tied to a specific backend communication method. - Future-proofs for Generative UI: The
UIMessage.parts
system is fundamental for building applications where the AI doesn't just return text but actively generates structured UI components. - Improved Type Safety and Maintainability: Clearer interfaces, typed metadata, and structured message parts lead to more robust and easier-to-maintain code.
- Addresses Key V4 Pain Points: Solves issues around state synchronization, rich content representation, and backend flexibility that were common challenges in V4.
What’s Next in the Series?
This post focused on the new message anatomy (`UIMessage`, `UIMessagePart`), the principles of client-side state management (`ChatStore` via `useChat`), the concept of decoupled delivery (`ChatTransport`), and how these pieces fit together with an end-to-end example and migration tips.
But there's more to v5's chat capabilities! The communication backbone for all this is the v5 UI Message Streaming Protocol.
- Teaser for Post 2: In our next post, we'll dive deep into this protocol itself. We'll explore:
  - Every `UIMessageStreamPart` event type (e.g., `'start'`, `'text'`, `'tool-call'`, `'file'`, `'metadata'`, `'error'`, `'finish'`).
  - How servers (using helpers like `toUIMessageStreamResponse` or `UIMessageStreamWriter`) emit these structured events.
  - How the client side (`useChat` and `processUIMessageStream`) consumes and interprets these stream parts to build and update `UIMessage` objects in real time.
  - Tips for debugging these structured SSE streams.
Understanding this streaming protocol is key to mastering v5 chat and unlocking its full potential for building truly interactive and dynamic AI experiences. See you in the next one!