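Build custom middleware by implementing hooks that run at specific points in the agent execution flow.
Hooks
Middleware provides two styles of hooks to intercept agent execution:
Node-style hooks run sequentially at specific execution points.
Wrap-style hooks run around each model or tool call.
Node-style hooks
Run sequentially at specific execution points. Use them for logging, validation, and state updates.
Choose the hooks your middleware needs. You can pick between node-style and wrap-style hooks.
Node-style hooks run at specific execution points:
beforeAgent: before the agent starts (once per invocation)
beforeModel: before each model call
afterModel: after each model response
afterAgent: after the agent completes (once per invocation)
Wrap-style hooks run around each call and give you control over execution:
wrapModelCall: around each model call
wrapToolCall: around each tool call
Example: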
import { createMiddleware , AIMessage } from "langchain" ;
const createMessageLimitMiddleware = ( maxMessages : number = 50 ) => {
return createMiddleware ( {
name : "MessageLimitMiddleware" ,
beforeModel : {
canJumpTo : [ "end" ] ,
hook : ( state ) => {
if (state . messages . length >= maxMessages) {
return {
messages : [ new AIMessage ( "Conversation limit reached." )] ,
jumpTo : "end" ,
};
}
return ;
}
},
afterModel : ( state ) => {
const lastMessage = state . messages[state . messages . length - 1 ] ;
console . log ( `Model returned: ${ lastMessage . content } ` ) ;
return ;
},
} ) ;
};
Wrap-style hooks
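Intercept execution and control when the handler is called. Use them for retries, caching, and transformation.
You decide whether the handler is called zero times (short-circuit), once (normal flow), or multiple times (retry logic).
Available hooks:
wrapModelCall: around each model call
wrapToolCall: around each tool call
Example: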
import { createMiddleware } from "langchain";

const createRetryMiddleware = (maxRetries: number = 3) => {
  return createMiddleware({
    name: "RetryMiddleware",
    // async + await so that a rejected handler promise is caught by the try block
    wrapModelCall: async (request, handler) => {
      for (let attempt = 0; attempt < maxRetries; attempt++) {
        try {
          return await handler(request);
        } catch (e) {
          if (attempt === maxRetries - 1) {
            throw e;
          }
          console.log(`Retry ${attempt + 1}/${maxRetries} after error: ${e}`);
        }
      }
      throw new Error("Unreachable");
    },
  });
};
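The retry example covers the "multiple calls" case. The "zero calls" (short-circuit) case can be sketched without the library. The `ModelRequest`, `ModelResponse`, and `Handler` types below are simplified stand-ins chosen for illustration, not the library's types, and the handler is synchronous to keep the sketch short.

```typescript
// Simplified stand-ins for the library's request/handler types (assumptions).
type ModelRequest = { prompt: string };
type ModelResponse = { text: string };
type Handler = (request: ModelRequest) => ModelResponse;

// Wraps a handler so that repeated prompts are answered from a cache.
function withCache(handler: Handler): Handler {
  const cache = new Map<string, ModelResponse>();
  return (request) => {
    const cached = cache.get(request.prompt);
    if (cached !== undefined) {
      return cached; // zero handler calls: short-circuit
    }
    const response = handler(request); // one handler call: normal flow
    cache.set(request.prompt, response);
    return response;
  };
}

// Usage: count how often the underlying "model" is actually invoked.
let modelCalls = 0;
const fakeModel: Handler = (request) => {
  modelCalls += 1;
  return { text: `echo: ${request.prompt}` };
};

const cachedModel = withCache(fakeModel);
const first = cachedModel({ prompt: "hi" });
const second = cachedModel({ prompt: "hi" });
```

The second call never reaches `fakeModel`; the wrapper alone decides whether the handler runs.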
State updates
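Both node-style and wrap-style hooks can update agent state. The mechanism differs:
Node-style hooks (beforeAgent, beforeModel, afterModel, afterAgent): return an object directly. The object is applied to agent state using the graph's reducers.
Wrap-style hooks (wrapModelCall, wrapToolCall): for model calls, return a Command directly to inject state updates alongside the model response. For tool calls, return a Command directly. Use these when you need to track or update state based on logic that runs during the model or tool call, such as summarization trigger points, usage metadata, or custom fields computed from the request or response.
Node-style hooks
Return an object from a node-style hook to merge updates into agent state. The object's keys map to state fields.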
import { createMiddleware } from "langchain" ;
import * as z from "zod" ;
const trackingStateSchema = z . object ( {
modelCallCount : z . number () . default ( 0 ) ,
} ) ;
const incrementAfterModel = createMiddleware ( {
name : "incrementAfterModel" ,
stateSchema : trackingStateSchema ,
afterModel : ( state ) => {
return { modelCallCount : state . modelCallCount + 1 };
},
} ) ;
Wrap-style hooks
Return a Command directly from wrapModelCall to inject state updates from the model call layer:
import * as z from "zod" ;
import { createMiddleware } from "langchain" ;
import { Command } from "@langchain/langgraph" ;
const usageTrackingStateSchema = z . object ( {
lastModelCallTokens : z . number () . optional () ,
} ) ;
const trackUsage = createMiddleware ( {
name : "trackUsage" ,
stateSchema : usageTrackingStateSchema ,
wrapModelCall : async ( request , handler ) => {
const response = await handler (request) ;
return new Command ( { update : { lastModelCallTokens : 150 } } ) ;
},
} ) ;
The Command flows through the graph's reducers, so updates are applied correctly: messages are appended to the existing state rather than replacing it.
Composition with multiple middleware
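When multiple middleware layers return responses, the framework passes along the most recently produced AIMessage:
AIMessage flows through the layers: each middleware's handler() receives the AIMessage from the previous layer. When a middleware returns an AIMessage, it becomes the input to the next middleware's handler.
Commands that do not update messages pass through: if a middleware returns a Command whose state update does not touch messages, the framework treats it as a no-op for message flow. The next middleware's handler receives the AIMessage from the layer before the middleware that returned the Command.
Reducer behavior and retry safety: Commands are still applied through reducers (messages are appended, and the outer layer wins on conflict). Retry logic discards Commands produced by earlier calls.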
import * as z from "zod" ;
import { createMiddleware } from "langchain" ;
import { Command , StateSchema , ReducedValue } from "@langchain/langgraph" ;
import { AIMessage , SystemMessage } from "@langchain/core/messages" ;
/** Last-wins reducer: when both middleware write, outer overwrites inner. */
const customMiddlewareStateSchema = new StateSchema ( {
traceLayer : new ReducedValue (
z . string () . optional () ,
{ reducer : ( a , b ) => b },
) ,
} ) ;
const outerMiddleware = createMiddleware ( {
name : "OuterMiddleware" ,
stateSchema : customMiddlewareStateSchema ,
wrapModelCall : async ( _request , handler ) => {
await handler (_request) ;
return new Command ( {
update : {
traceLayer : "outer" ,
messages : [ new SystemMessage ( { content : "[Outer ran]" } )] ,
},
} ) ;
},
} ) ;
const innerMiddleware = createMiddleware ( {
name : "InnerMiddleware" ,
stateSchema : customMiddlewareStateSchema ,
wrapModelCall : async ( _request , handler ) => {
await handler (_request) ;
return new Command ( {
update : {
traceLayer : "inner" ,
messages : [ new SystemMessage ( { content : "[Inner ran]" } )] ,
},
} ) ;
},
} ) ;
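The reducer behavior described above (messages appended, last write wins for `traceLayer`) can be modeled in a few lines of plain TypeScript. This is a simplified illustration, not the framework's implementation: the `applyUpdate` helper and the type names are hypothetical, and the sketch assumes the inner update is applied before the outer one.

```typescript
// Simplified model of reducer-based state merging (not the library's code).
type ChatMessage = { role: string; content: string };
type AgentState = { messages: ChatMessage[]; traceLayer?: string };
type StateUpdate = Partial<AgentState>;

// One reducer per field: messages append, traceLayer is last-write-wins.
const reducers = {
  messages: (current: ChatMessage[], update: ChatMessage[]) => [...current, ...update],
  traceLayer: (_current: string | undefined, update: string | undefined) => update,
};

// Applies one update to the state by running each touched field through its reducer.
function applyUpdate(state: AgentState, update: StateUpdate): AgentState {
  const next: AgentState = { ...state };
  if (update.messages !== undefined) {
    next.messages = reducers.messages(state.messages, update.messages);
  }
  if (update.traceLayer !== undefined) {
    next.traceLayer = reducers.traceLayer(state.traceLayer, update.traceLayer);
  }
  return next;
}

// Assumption: the inner update is applied first, then the outer one.
const innerUpdate: StateUpdate = {
  traceLayer: "inner",
  messages: [{ role: "system", content: "[Inner ran]" }],
};
const outerUpdate: StateUpdate = {
  traceLayer: "outer",
  messages: [{ role: "system", content: "[Outer ran]" }],
};

const initialState: AgentState = { messages: [{ role: "user", content: "Hello" }] };
const finalState = [innerUpdate, outerUpdate].reduce(applyUpdate, initialState);
```

Under these assumptions, `finalState.traceLayer` ends up as `"outer"` while both system messages are kept alongside the original user message.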
Create middleware
In Python, AgentMiddleware subclasses can declare three class attributes that the agent factory reads at compile time:
state_schema: extends agent state with custom fields. See Custom state schema.
tools: registers additional tools that ship with the middleware, such as write_todos on the to-do list middleware.
transformers: registers scope-aware stream transformer factories. See Custom stream transformers.
In TypeScript, createMiddleware accepts three configuration fields that the agent factory reads at compile time.
Example (Python):
from langchain.agents import create_agent
from langchain.agents.middleware import (
    AgentMiddleware,
    AgentState,
    ModelRequest,
    ModelResponse,
)
from langgraph.runtime import Runtime
from typing import Any, Callable


class LoggingMiddleware(AgentMiddleware):
    def before_model(self, state: AgentState, runtime: Runtime) -> dict[str, Any] | None:
        print(f"About to call model with {len(state['messages'])} messages")
        return None

    def after_model(self, state: AgentState, runtime: Runtime) -> dict[str, Any] | None:
        print(f"Model returned: {state['messages'][-1].content}")
        return None

    async def abefore_model(
        self, state: AgentState, runtime: Runtime
    ) -> dict[str, Any] | None:
        # Async version of before_model
        return None

    async def aafter_model(
        self, state: AgentState, runtime: Runtime
    ) -> dict[str, Any] | None:
        # Async version of after_model
        print(f"Model returned: {state['messages'][-1].content}")
        return None


agent = create_agent(
    model="gpt-5.4",
    middleware=[LoggingMiddleware()],
    tools=[...],
)
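When to use classes:
You define both sync and async implementations for the same hook
A single middleware needs multiple hooks
Complex configuration is required, such as configurable thresholds or custom models
The middleware is reused across projects and configured at initialization
Define custom middleware with the createMiddleware function: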
import { createMiddleware } from "langchain" ;
const loggingMiddleware = createMiddleware ( {
name : "LoggingMiddleware" ,
beforeModel : ( state ) => {
console . log ( `About to call model with ${ state . messages . length } messages` ) ;
return ;
},
afterModel : ( state ) => {
const lastMessage = state . messages[state . messages . length - 1 ] ;
console . log ( `Model returned: ${ lastMessage . content } ` ) ;
return ;
},
} ) ;
Custom state schema
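If your middleware needs to track state across hooks, extend the agent state with custom properties. This lets middleware:
Track state across execution: maintain counters, flags, or other values that persist throughout the agent's execution lifecycle
Share data between hooks: pass information from beforeModel to afterModel, or between different middleware instances
Implement cross-cutting concerns: add features such as rate limiting, usage tracking, user context, or audit logging without modifying the core agent logic
Make conditional decisions: use accumulated state to decide whether to continue execution, jump to a different node, or modify behavior dynamically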
import { createMiddleware , createAgent , HumanMessage } from "langchain" ;
import { StateSchema } from "@langchain/langgraph" ;
import * as z from "zod" ;
const CustomState = new StateSchema ( {
modelCallCount : z . number () . default ( 0 ) ,
userId : z . string () . optional () ,
} ) ;
const callCounterMiddleware = createMiddleware ( {
name : "CallCounterMiddleware" ,
stateSchema : CustomState ,
beforeModel : {
canJumpTo : [ "end" ] ,
hook : ( state ) => {
if (state . modelCallCount > 10 ) {
return { jumpTo : "end" };
}
return ;
},
},
afterModel : ( state ) => {
return { modelCallCount : state . modelCallCount + 1 };
},
} ) ;
const agent = createAgent ( {
model : "gpt-5.4" ,
tools : [ ... ] ,
middleware : [callCounterMiddleware] ,
} ) ;
const result = await agent . invoke ( {
messages : [ new HumanMessage ( "Hello" )] ,
modelCallCount : 0 ,
userId : "user-123" ,
} ) ;
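State fields can be public or private. Fields that start with an underscore (_) are treated as private and are not included in the agent's result. Only public fields (those without a leading underscore) are returned.
This is useful for storing internal middleware state that should not be exposed to the caller, such as temporary tracking variables or internal flags: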
import { createAgent, createMiddleware, HumanMessage } from "langchain";
import { StateSchema } from "@langchain/langgraph";
import * as z from "zod";

const PrivateState = new StateSchema({
  // Public field - included in invoke result
  publicCounter: z.number().default(0),
  // Private field - excluded from invoke result
  _internalFlag: z.boolean().default(false),
});

const middleware = createMiddleware({
  name: "ExampleMiddleware",
  stateSchema: PrivateState,
  afterModel: (state) => {
    // Both fields are accessible during execution
    if (state._internalFlag) {
      return { publicCounter: state.publicCounter + 1 };
    }
    return { _internalFlag: true };
  },
});

// Agent setup so that `agent` is defined; the model name follows the other examples
const agent = createAgent({
  model: "gpt-5.4",
  tools: [],
  middleware: [middleware],
});

const result = await agent.invoke({
  messages: [new HumanMessage("Hello")],
  publicCounter: 0,
});

// result only contains publicCounter, not _internalFlag
// With a single model call, the first afterModel run only sets _internalFlag,
// so publicCounter is still 0; it increments from the second model call on.
console.log(result.publicCounter); // 0
console.log(result._internalFlag); // undefined
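The public/private convention can be pictured as a filter over the final state. The sketch below is only a mental model: `stripPrivateFields` is a hypothetical helper written for this illustration and is not how the framework implements the filtering.

```typescript
// Returns a copy of `state` without keys that start with an underscore.
function stripPrivateFields(state: Record<string, unknown>): Record<string, unknown> {
  const visible: Record<string, unknown> = {};
  for (const key of Object.keys(state)) {
    if (key.charAt(0) !== "_") {
      visible[key] = state[key];
    }
  }
  return visible;
}

// Internal state as middleware sees it during execution.
const internalState = { publicCounter: 1, _internalFlag: true };

// What a caller would see under the underscore convention.
const visibleResult = stripPrivateFields(internalState);
```

Here `visibleResult` keeps `publicCounter` and drops `_internalFlag`, which matches the behavior described for invoke results.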
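Custom stream transformers
Middleware-registered transformers require langchain@1.4.3 or later.
Middleware can register stream transformer factories that project events from the live agent stream onto typed extension channels. This is useful for exposing counters, side-channel artifacts, partial outputs, or wire-level redaction without coupling to the framework's built-in projections.
At compile time, middleware-registered factories are merged with whatever the caller passes directly to the agent factory. The final ordering rules keep the built-in ToolCallTransformer first and place caller-supplied entries last.
Pass streamTransformers to createMiddleware as a tuple of factories. Each factory has the shape () => StreamTransformer<any> (zero arguments) and is called once per scope; returning a new transformer on each call keeps every subgraph isolated.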
import { createAgent , createMiddleware } from "langchain" ;
const toolActivityMiddleware = createMiddleware ( {
name : "ToolActivityMiddleware" ,
streamTransformers : [toolActivityTransformer] ,
} ) ;
const agent = createAgent ( {
model : "gpt-5-nano" ,
tools : [ ... ] ,
middleware : [toolActivityMiddleware] ,
} ) ;
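Why a factory is used instead of a shared instance can be shown with a small standalone sketch. The `CountingTransformer` shape below is hypothetical and heavily simplified; it is not the library's StreamTransformer interface, and `toolActivityFactory` is a name invented for this illustration.

```typescript
// Hypothetical, heavily simplified transformer shape (an assumption, not the
// library's StreamTransformer interface).
interface CountingTransformer {
  onEvent(eventName: string): void;
  toolEvents(): number;
}

// Zero-argument factory: every call returns a transformer with its own counter.
const toolActivityFactory = (): CountingTransformer => {
  let count = 0;
  return {
    onEvent(eventName: string): void {
      if (eventName === "tool") {
        count += 1;
      }
    },
    toolEvents: (): number => count,
  };
};

// One transformer per scope, for example the root graph and one subgraph.
const rootScope = toolActivityFactory();
const subgraphScope = toolActivityFactory();

rootScope.onEvent("tool");
rootScope.onEvent("tool");
subgraphScope.onEvent("tool");
```

Because each scope received its own instance, the root counter and the subgraph counter do not interfere with each other.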
For the full ordering rules and a PII redaction example, see Register transformers on middleware.
Custom context
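Middleware can define a custom context schema to access per-invocation metadata. Unlike state, context is read-only and does not persist between invocations. This makes it well suited for:
User information: pass user IDs, roles, or preferences that do not change during execution
Configuration overrides: provide per-invocation settings such as rate limits or feature flags
Tenant/workspace context: include organization-specific data for multi-tenant applications
Request metadata: pass request IDs, API keys, or other metadata the middleware needs
Define the context schema with Zod and access it in middleware hooks through runtime.context. Required fields in the context schema are enforced at the TypeScript level, so they must be provided when calling agent.invoke().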
import { createAgent , createMiddleware , HumanMessage } from "langchain" ;
import * as z from "zod" ;
const contextSchema = z . object ( {
userId : z . string () ,
tenantId : z . string () ,
apiKey : z . string () . optional () ,
} ) ;
const userContextMiddleware = createMiddleware ( {
name : "UserContextMiddleware" ,
contextSchema ,
wrapModelCall : ( request , handler ) => {
// Access context from runtime
const { userId , tenantId } = request . runtime . context ;
// Add user context to system message
const contextText = `User ID: ${ userId } , Tenant: ${ tenantId } ` ;
const newSystemMessage = request . systemMessage . concat (contextText) ;
return handler ( {
... request ,
systemMessage : newSystemMessage ,
} ) ;
},
} ) ;
const agent = createAgent ( {
model : "gpt-5.4" ,
middleware : [userContextMiddleware] ,
tools : [] ,
contextSchema ,
} ) ;
const result = await agent . invoke (
{ messages : [ new HumanMessage ( "Hello" )] },
// Required fields (userId, tenantId) must be provided
{
context : {
userId : "user-123" ,
tenantId : "acme-corp" ,
},
}
) ;
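Required context fields: when you define required fields in contextSchema (fields without .optional() or .default()), TypeScript enforces that they are provided in the agent.invoke() call. This ensures type safety and prevents runtime errors caused by missing required context.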
// This will cause a TypeScript error if userId or tenantId are missing
const result = await agent . invoke (
{ messages : [ new HumanMessage ( "Hello" )] },
{ context : { userId : "user-123" } } // Error: tenantId is required
) ;
Execution order
When you use multiple middleware, it helps to understand how they execute:
const agent = createAgent ( {
model : "gpt-5.4" ,
middleware : [middleware1 , middleware2 , middleware3] ,
tools : [ ... ] ,
} ) ;
The sequence below uses the snake_case hook names; the TypeScript equivalents are beforeAgent, beforeModel, wrapModelCall, afterModel, and afterAgent.
Before hooks run in order:
middleware1.before_agent()
middleware2.before_agent()
middleware3.before_agent()
Agent loop starts
middleware1.before_model()
middleware2.before_model()
middleware3.before_model()
Wrap hooks nest like function calls:
middleware1.wrap_model_call() → middleware2.wrap_model_call() → middleware3.wrap_model_call() → model
After hooks run in reverse order:
middleware3.after_model()
middleware2.after_model()
middleware1.after_model()
Agent loop ends
middleware3.after_agent()
middleware2.after_agent()
middleware1.after_agent()
Key rules:
before_* hooks: first to last
after_* hooks: last to first (reverse)
wrap_* hooks: nested (the first middleware wraps all the others)
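The three ordering rules can be checked with a small standalone simulation. This is a simplified model for illustration, not the framework's scheduler: `SketchMiddleware`, `makeMiddleware`, and `runModelStep` are hypothetical names, and only the model-level hooks are modeled.

```typescript
// Simplified model of hook ordering (an illustration, not the library's scheduler).
type CallModel = () => string;

interface SketchMiddleware {
  name: string;
  beforeModel(log: string[]): void;
  afterModel(log: string[]): void;
  wrapModelCall(log: string[], next: CallModel): string;
}

// Builds a middleware whose hooks only record that they ran.
function makeMiddleware(name: string): SketchMiddleware {
  return {
    name,
    beforeModel: (log: string[]): void => {
      log.push(`${name}.beforeModel`);
    },
    afterModel: (log: string[]): void => {
      log.push(`${name}.afterModel`);
    },
    wrapModelCall: (log: string[], next: CallModel): string => {
      log.push(`${name}.wrap:enter`);
      const response = next();
      log.push(`${name}.wrap:exit`);
      return response;
    },
  };
}

function runModelStep(stack: SketchMiddleware[], log: string[]): string {
  // before hooks: first to last
  for (const mw of stack) {
    mw.beforeModel(log);
  }

  // wrap hooks: folding from the right leaves the first middleware outermost
  const innermost: CallModel = () => {
    log.push("model");
    return "response";
  };
  const wrapped = stack.reduceRight<CallModel>(
    (next, mw) => () => mw.wrapModelCall(log, next),
    innermost,
  );
  const response = wrapped();

  // after hooks: last to first
  for (const mw of [...stack].reverse()) {
    mw.afterModel(log);
  }
  return response;
}

const orderLog: string[] = [];
runModelStep([makeMiddleware("m1"), makeMiddleware("m2")], orderLog);
```

In this model the log reads m1.beforeModel, m2.beforeModel, m1.wrap:enter, m2.wrap:enter, model, m2.wrap:exit, m1.wrap:exit, m2.afterModel, m1.afterModel.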
Agent jumps
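To exit early from middleware, return an object that includes jumpTo (jump_to in Python):
Available jump targets:
'end': jump to the end of agent execution (or to the first after_agent hook)
'tools': jump to the tools node
'model': jump to the model node (or to the first before_model hook)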
import { createAgent , createMiddleware , AIMessage } from "langchain" ;
const agent = createAgent ( {
model : "gpt-5.4" ,
middleware : [
createMiddleware ( {
name : "BlockedContentMiddleware" ,
beforeModel : {
canJumpTo : [ "end" ] ,
hook : ( state ) => {
if (state . messages . at ( - 1 ) ?. content . includes ( "BLOCKED" )) {
return {
messages : [ new AIMessage ( "I cannot respond to that request." )] ,
jumpTo : "end" as const ,
};
}
return ;
},
},
} ) ,
] ,
} ) ;
const result = await agent . invoke ( {
messages : "Hello, world! BLOCKED"
} ) ;
/**
* Expected output:
* I cannot respond to that request.
*/
console . log (result . messages . at ( - 1 ) ?. content) ;
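The effect of jumping to "end" from a beforeModel hook can be modeled without the library. The sketch below is a simplified single-turn loop under stated assumptions: `runTurn`, `BeforeModelResult`, and the `reply` field are hypothetical names for this illustration, and only the "end" target is handled.

```typescript
type JumpTarget = "end" | "tools" | "model";
type BeforeModelResult = { jumpTo: JumpTarget; reply: string } | undefined;

// Runs one simplified turn: the hook may end the turn before the model is called.
function runTurn(
  userInput: string,
  beforeModel: (input: string) => BeforeModelResult,
  callModel: (input: string) => string,
): string {
  const hookResult = beforeModel(userInput);
  if (hookResult !== undefined && hookResult.jumpTo === "end") {
    return hookResult.reply; // the model is never called
  }
  return callModel(userInput);
}

// A fake model that counts how often it is invoked.
let modelInvocations = 0;
const countingModel = (input: string): string => {
  modelInvocations += 1;
  return `model reply to: ${input}`;
};

// Hook that ends the turn early when the input contains a blocked keyword.
const blockOnKeyword = (input: string): BeforeModelResult =>
  input.includes("BLOCKED")
    ? { jumpTo: "end", reply: "I cannot respond to that request." }
    : undefined;

const blockedReply = runTurn("Hello, world! BLOCKED", blockOnKeyword, countingModel);
const normalReply = runTurn("Hello", blockOnKeyword, countingModel);
```

Only the second turn reaches `countingModel`; the blocked turn returns the hook's reply directly.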
Best practices
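Keep middleware focused; each middleware should do one thing well
Handle errors gracefully; do not let middleware errors crash the agent
Use appropriate hook types:
Node-style for sequential logic (logging, validation)
Wrap-style for control flow (retry, fallback, caching)
Document any custom state properties clearly
Unit test middleware in isolation before integrating it
Consider execution order and place critical middleware earlier in the list
Use built-in middleware whenever possible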
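One way to follow the "handle errors gracefully" advice is to guard non-critical hooks so that a failure becomes "no state update" instead of an exception. The sketch below is a minimal standalone illustration; `safeHook`, `NodeHook`, and `StateLike` are hypothetical names, not library APIs, and whether swallowing an error is acceptable depends on the hook.

```typescript
type StateLike = Record<string, unknown>;
type NodeHook = (state: StateLike) => StateLike | undefined;

// Wraps a hook so that a thrown error is recorded and treated as "no update".
function safeHook(hookName: string, hook: NodeHook, errors: string[]): NodeHook {
  return (state) => {
    try {
      return hook(state);
    } catch (error) {
      const detail = error instanceof Error ? error.message : String(error);
      errors.push(`${hookName}: ${detail}`);
      return undefined;
    }
  };
}

// Usage: a hook that always fails, for example because a metrics backend is down.
const collectedErrors: string[] = [];
const flakyHook: NodeHook = () => {
  throw new Error("metrics backend down");
};
const guardedHook = safeHook("metricsHook", flakyHook, collectedErrors);
const hookUpdate = guardedHook({ messages: [] });
```

The guarded hook returns `undefined` and the error is kept in `collectedErrors` for later inspection, so the surrounding run can continue.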
Examples
Dynamic prompt
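Modify the system prompt dynamically at runtime to inject context, user-specific instructions, or other information before each model call. This is one of the most common middleware use cases.
Use the systemMessage field on ModelRequest to read and modify the system prompt. It contains a SystemMessage object, even when the agent was created with a string systemPrompt.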
import { createMiddleware , SystemMessage , createAgent } from "langchain" ;
const addContextMiddleware = createMiddleware ( {
name : "AddContextMiddleware" ,
wrapModelCall : async ( request , handler ) => {
return handler ( {
... request ,
systemMessage : request . systemMessage . concat ( `Additional context.` ) ,
} ) ;
},
} ) ;
const agent = createAgent ( {
model : "google-genai:gemini-3.5-flash" ,
systemPrompt : "You are a helpful assistant." ,
middleware : [addContextMiddleware] ,
} ) ;
Use SystemMessage.concat to preserve cache control metadata or structured content blocks created by other middleware.
Dynamic model selection
import { createMiddleware , initChatModel } from "langchain" ;
const models = {
complex : await initChatModel ( "claude-sonnet-4-6" ) ,
simple : await initChatModel ( "claude-haiku-4-5-20251001" ) ,
};
const dynamicModelMiddleware = createMiddleware ( {
name : "DynamicModelMiddleware" ,
wrapModelCall : ( request , handler ) => {
const modifiedRequest = { ... request };
if (request . messages . length > 10 ) {
modifiedRequest . model = models . complex ;
} else {
modifiedRequest . model = models . simple ;
}
return handler (modifiedRequest) ;
},
} ) ;
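Dynamic tool selection
Select relevant tools at runtime to improve performance and accuracy. This section covers filtering pre-registered tools. To register tools that are discovered at runtime (for example, from MCP servers), see Runtime tool registration.
Benefits:
Shorter prompts: expose only relevant tools to reduce complexity
Better accuracy: the model chooses correctly from fewer options
Permission control: filter tools dynamically based on user access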
import { createAgent , createMiddleware } from "langchain" ;
const toolSelectorMiddleware = createMiddleware ( {
name : "ToolSelector" ,
wrapModelCall : ( request , handler ) => {
// Select a small, relevant subset of tools based on state/context
const relevantTools = selectRelevantTools (request . state , request . runtime) ;
const modifiedRequest = { ... request , tools : relevantTools };
return handler (modifiedRequest) ;
},
} ) ;
const agent = createAgent ( {
model : "gpt-5.4" ,
tools : allTools ,
middleware : [toolSelectorMiddleware] ,
} ) ;
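The example above leaves the selection logic undefined. One possible shape for it, covering the permission-control benefit, is sketched below. Everything here is hypothetical: the `SketchTool` and `SketchContext` types, the role names, and the `selectToolsForRole` helper are invented for illustration and do not match the real request.state or request.runtime objects.

```typescript
// Hypothetical tool and context shapes; the real request.state and
// request.runtime objects carry more than this.
type Role = "viewer" | "admin";
type SketchTool = { name: string; requiredRole: Role };
type SketchContext = { userRole: Role };

// Admins may use every tool; viewers only see tools marked for viewers.
function selectToolsForRole(tools: SketchTool[], context: SketchContext): SketchTool[] {
  return tools.filter(
    (tool) => context.userRole === "admin" || tool.requiredRole === "viewer",
  );
}

const allSketchTools: SketchTool[] = [
  { name: "search_docs", requiredRole: "viewer" },
  { name: "delete_record", requiredRole: "admin" },
];

const viewerTools = selectToolsForRole(allSketchTools, { userRole: "viewer" });
const adminTools = selectToolsForRole(allSketchTools, { userRole: "admin" });
```

A viewer is offered only `search_docs`, while an admin is offered both tools. In middleware, a filter of this kind would produce the list assigned to the request's tools field.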
Tool call monitoring
import { createMiddleware } from "langchain";

const toolMonitoringMiddleware = createMiddleware({
  name: "ToolMonitoringMiddleware",
  wrapToolCall: async (request, handler) => {
    console.log(`Executing tool: ${request.toolCall.name}`);
    console.log(`Arguments: ${JSON.stringify(request.toolCall.args)}`);
    try {
      // await so that success is logged only after the tool finishes
      // and asynchronous failures reach the catch block
      const result = await handler(request);
      console.log("Tool completed successfully");
      return result;
    } catch (e) {
      console.log(`Tool failed: ${e}`);
      throw e;
    }
  },
});
Prompt caching (Anthropic)
When using Anthropic models, cache large system prompts by using structured content blocks with cache control directives (Python examples):
# Decorator-based variant
from langchain.agents.middleware import wrap_model_call, ModelRequest, ModelResponse
from langchain.messages import SystemMessage
from typing import Callable


@wrap_model_call
def add_cached_context(
    request: ModelRequest,
    handler: Callable[[ModelRequest], ModelResponse],
) -> ModelResponse:
    # Always work with content blocks
    new_content = list(request.system_message.content_blocks) + [
        {
            "type": "text",
            "text": "Here is a large document to analyze:\n\n<document>...</document>",
            # content up until this point is cached
            "cache_control": {"type": "ephemeral"},
        }
    ]
    new_system_message = SystemMessage(content=new_content)
    return handler(request.override(system_message=new_system_message))


# Class-based variant
from langchain.agents.middleware import AgentMiddleware, ModelRequest, ModelResponse
from langchain.messages import SystemMessage
from typing import Callable


class CachedContextMiddleware(AgentMiddleware):
    def wrap_model_call(
        self,
        request: ModelRequest,
        handler: Callable[[ModelRequest], ModelResponse],
    ) -> ModelResponse:
        # Always work with content blocks
        new_content = list(request.system_message.content_blocks) + [
            {
                "type": "text",
                "text": "Here is a large document to analyze:\n\n<document>...</document>",
                "cache_control": {"type": "ephemeral"},  # This content will be cached
            }
        ]
        new_system_message = SystemMessage(content=new_content)
        return handler(request.override(system_message=new_system_message))
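Notes:
ModelRequest.system_message is always a SystemMessage object, even when the agent was created with system_prompt="string"
Use SystemMessage.content_blocks to access the content as a list of blocks, whether the original content was a string or a list
When modifying system messages, use content_blocks and append new blocks to preserve the existing structure
You can pass a SystemMessage object directly to the system_prompt parameter of create_agent for advanced use cases such as cache control
In TypeScript middleware, use the systemMessage field on ModelRequest to modify system messages. It contains a SystemMessage object, even when the agent was created with a string systemPrompt.
Example: chaining middleware. Different middleware can take different approaches: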
import { createMiddleware , SystemMessage , createAgent } from "langchain" ;
// Middleware 1: Uses systemMessage with simple concatenation
const myMiddleware = createMiddleware ( {
name : "MyMiddleware" ,
wrapModelCall : async ( request , handler ) => {
return handler ( {
... request ,
systemMessage : request . systemMessage . concat ( `Additional context.` ) ,
} ) ;
},
} ) ;
// Middleware 2: Uses systemMessage with structured content (preserves structure)
const myOtherMiddleware = createMiddleware ( {
name : "MyOtherMiddleware" ,
wrapModelCall : async ( request , handler ) => {
return handler ( {
... request ,
systemMessage : request . systemMessage . concat (
new SystemMessage ( {
content : [
{
type : "text" ,
text : " More additional context. This will be cached." ,
cache_control : { type : "ephemeral" , ttl : "5m" },
},
] ,
} )
) ,
} ) ;
},
} ) ;
const agent = createAgent ( {
model : "google-genai:gemini-3.5-flash" ,
systemPrompt : "You are a helpful assistant." ,
middleware : [myMiddleware , myOtherMiddleware] ,
} ) ;
The resulting system message is:
new SystemMessage ( {
content : [
{ type : "text" , text : "You are a helpful assistant." },
{ type : "text" , text : "Additional context." },
{
type : "text" ,
text : " More additional context. This will be cached." ,
cache_control : { type : "ephemeral" , ttl : "5m" },
},
] ,
} ) ;
Use SystemMessage.concat to preserve cache control metadata or structured content blocks created by other middleware.
Additional resources