Build custom middleware by implementing hooks that run at specific points in the agent execution flow.

Hooks

Middleware provides two styles of hooks for intercepting agent execution:

  • Node-style hooks: run sequentially at specific execution points.
  • Wrap-style hooks: run around each model or tool call.

Node-style hooks

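Node-style hooks run sequentially at specific execution points. Use them for logging, validation, and state updates.

Choose the hooks your middleware needs. You can pick from node-style hooks and wrap-style hooks.

Node-style hooks run at specific execution points:

Hook          When it runs
beforeAgent   Before the agent starts (once per invocation)
beforeModel   Before each model call
afterModel    After each model response
afterAgent    After the agent completes (once per invocation)

Wrap-style hooks run around each call and give you control over execution:

Hook            When it runs
wrapModelCall   Around each model call
wrapToolCall    Around each tool call

Example: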
import { createMiddleware, AIMessage } from "langchain";

const createMessageLimitMiddleware = (maxMessages: number = 50) => {
  return createMiddleware({
    name: "MessageLimitMiddleware",
    beforeModel: {
      canJumpTo: ["end"],
      hook: (state) => {
        // Use >= so the limit still triggers if the count steps past maxMessages
        if (state.messages.length >= maxMessages) {
          return {
            messages: [new AIMessage("Conversation limit reached.")],
            jumpTo: "end",
          };
        }
        return;
      }
    },
    afterModel: (state) => {
      const lastMessage = state.messages[state.messages.length - 1];
      console.log(`Model returned: ${lastMessage.content}`);
      return;
    },
  });
};

Wrap-style hooks

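Wrap-style hooks intercept execution and control when the handler is called. Use them for retries, caching, and transformation.

You decide whether the handler is called zero times (short-circuit), once (normal flow), or multiple times (retry logic).

Available hooks:
  • wrapModelCall: around each model call
  • wrapToolCall: around each tool call
Example: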
import { createMiddleware } from "langchain";

const createRetryMiddleware = (maxRetries: number = 3) => {
  return createMiddleware({
    name: "RetryMiddleware",
    // async + await so a rejected model call is caught by the try/catch below
    wrapModelCall: async (request, handler) => {
      for (let attempt = 0; attempt < maxRetries; attempt++) {
        try {
          return await handler(request);
        } catch (e) {
          if (attempt === maxRetries - 1) {
            throw e;
          }
          console.log(`Retry ${attempt + 1}/${maxRetries} after error: ${e}`);
        }
      }
      throw new Error("Unreachable");
    },
  });
};
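
The retry example covers the case where the handler is called several times. The zero-call (short-circuit) case can be sketched in plain TypeScript as a cache around a handler. This is a simplified model rather than the langchain API: Handler, withCache, and fakeModel are hypothetical names, and the handler is synchronous here for brevity whereas real handlers return promises.

```typescript
// Plain-TypeScript model of a wrap-style hook; not the langchain API.
// Synchronous for brevity; real handlers return promises.
type Handler = (request: string) => string;

// Hypothetical helper: wraps a handler with an in-memory cache keyed by the request text.
function withCache(handler: Handler): Handler {
  const cache: Record<string, string> = {};
  return (request) => {
    if (Object.prototype.hasOwnProperty.call(cache, request)) {
      return cache[request]; // zero handler calls: short-circuit
    }
    const response = handler(request); // one handler call: normal flow
    cache[request] = response;
    return response;
  };
}

let modelCalls = 0;
const fakeModel: Handler = (prompt) => {
  modelCalls += 1;
  return `echo: ${prompt}`;
};

const cachedModel = withCache(fakeModel);
const firstReply = cachedModel("hello");
const secondReply = cachedModel("hello"); // served from the cache
```

The second call never reaches fakeModel, which is the same decision a wrapModelCall hook makes when it returns without invoking its handler.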

State updates

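Both node-style and wrap-style hooks can update agent state. The mechanism differs:
  • Node-style hooks (beforeAgent, beforeModel, afterModel, afterAgent): return a dict directly. The dict is applied to agent state through the graph's reducers.
  • Wrap-style hooks (wrapModelCall, wrapToolCall): for model calls, return a Command directly to inject state updates alongside the model response. For tool calls, return a Command directly. Use these when you need to track or update state based on logic that runs during the model or tool call, such as summarization trigger points, usage metadata, or custom fields computed from the request/response.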

Node-style hooks

Return a dict from a node-style hook to merge updates into agent state. Dict keys map to state fields.
import { createMiddleware } from "langchain";
import * as z from "zod";

const trackingStateSchema = z.object({
  modelCallCount: z.number().default(0),
});

const incrementAfterModel = createMiddleware({
  name: "incrementAfterModel",
  stateSchema: trackingStateSchema,
  afterModel: (state) => {
    return { modelCallCount: state.modelCallCount + 1 };
  },
});

Wrap-style hooks

wrapModelCall returns a Command directly to inject state updates from the model-call layer:
import * as z from "zod";
import { createMiddleware } from "langchain";
import { Command } from "@langchain/langgraph";

const usageTrackingStateSchema = z.object({
  lastModelCallTokens: z.number().optional(),
});

const trackUsage = createMiddleware({
  name: "trackUsage",
  stateSchema: usageTrackingStateSchema,
  wrapModelCall: async (request, handler) => {
    // The response is unused here; the token count below is hard-coded for illustration.
    const response = await handler(request);
    return new Command({ update: { lastModelCallTokens: 150 } });
  },
});
The Command flows through the graph's reducers, so the update is applied correctly: messages are appended to the existing state rather than replacing it.

Composition with multiple middleware

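When multiple middleware layers return responses, the framework passes along the most recently produced AIMessage:
  • AIMessage flows through the layers: each middleware's handler() receives the AIMessage from the previous layer. When a middleware returns an AIMessage, it becomes the input to the next middleware's handler.
  • Commands that do not update messages pass through: if a middleware returns a Command whose state update does not touch messages, the framework treats it as a no-op for message flow. The next middleware's handler receives the AIMessage from the layer before the middleware that returned the Command.
  • Reducer behavior and retry safety: Commands are still applied through reducers (messages append, outer wins on conflict). Retry logic discards commands produced by earlier calls.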
import * as z from "zod";
import { createMiddleware } from "langchain";
import { Command, StateSchema, ReducedValue } from "@langchain/langgraph";
import { AIMessage, SystemMessage } from "@langchain/core/messages";

/** Last-wins reducer: when both middleware write, outer overwrites inner. */
const customMiddlewareStateSchema = new StateSchema({
  traceLayer: new ReducedValue(
    z.string().optional(),
    { reducer: (a, b) => b },
  ),
});

const outerMiddleware = createMiddleware({
  name: "OuterMiddleware",
  stateSchema: customMiddlewareStateSchema,
  wrapModelCall: async (_request, handler) => {
    await handler(_request);
    return new Command({
      update: {
        traceLayer: "outer",
        messages: [new SystemMessage({ content: "[Outer ran]" })],
      },
    });
  },
});

const innerMiddleware = createMiddleware({
  name: "InnerMiddleware",
  stateSchema: customMiddlewareStateSchema,
  wrapModelCall: async (_request, handler) => {
    await handler(_request);
    return new Command({
      update: {
        traceLayer: "inner",
        messages: [new SystemMessage({ content: "[Inner ran]" })],
      },
    });
  },
});
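
To make the reducer behavior concrete, here is a minimal sketch in plain TypeScript of how the two updates returned by InnerMiddleware and OuterMiddleware might combine. It does not use the langgraph API: ModelState, applyUpdate, and the assumption that the inner update is applied before the outer one are illustrative only.

```typescript
// Plain-TypeScript model of reducer-based state updates; not the langgraph implementation.
interface ModelState {
  messages: string[];
  traceLayer?: string;
}
type Update = Partial<ModelState>;

// One reducer per field: messages append, traceLayer is last-wins.
const reducers = {
  messages: (current: string[], incoming: string[]) => current.concat(incoming),
  traceLayer: (_current: string | undefined, incoming: string | undefined) => incoming,
};

function applyUpdate(state: ModelState, update: Update): ModelState {
  const next: ModelState = { ...state };
  if (update.messages !== undefined) {
    next.messages = reducers.messages(state.messages, update.messages);
  }
  if ("traceLayer" in update) {
    next.traceLayer = reducers.traceLayer(state.traceLayer, update.traceLayer);
  }
  return next;
}

// Assumption: the inner middleware's update is applied before the outer one's.
const innerUpdate: Update = { traceLayer: "inner", messages: ["[Inner ran]"] };
const outerUpdate: Update = { traceLayer: "outer", messages: ["[Outer ran]"] };

let finalState: ModelState = { messages: ["Hello"] };
for (const update of [innerUpdate, outerUpdate]) {
  finalState = applyUpdate(finalState, update);
}
```

Under that ordering assumption, both messages are appended while traceLayer ends up as "outer", which matches the "messages append, outer wins on conflict" rule.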

Create middleware

In Python, AgentMiddleware subclasses can declare three class attributes that the agent factory reads at compile time:
  • state_schema: extends the agent state with custom fields. See Custom state schema.
  • tools: registers extra tools that ship with the middleware, such as write_todos on the to-do list middleware.
  • transformers: registers scope-aware stream transformer factories. See Custom stream transformers.
In TypeScript, createMiddleware accepts three configuration fields that the agent factory reads at compile time:
  • stateSchema: extends the agent state with custom fields. See Custom state schema.
  • tools: registers extra tools that ship with the middleware.
  • streamTransformers: registers scope-aware stream transformer factories. See Custom stream transformers.
Example (Python):
from langchain.agents import create_agent
from langchain.agents.middleware import (
    AgentMiddleware,
    AgentState,
    ModelRequest,
    ModelResponse,
)
from langgraph.runtime import Runtime
from typing import Any, Callable

class LoggingMiddleware(AgentMiddleware):
    def before_model(self, state: AgentState, runtime: Runtime) -> dict[str, Any] | None:
        print(f"About to call model with {len(state['messages'])} messages")
        return None

    def after_model(self, state: AgentState, runtime: Runtime) -> dict[str, Any] | None:
        print(f"Model returned: {state['messages'][-1].content}")
        return None

    async def abefore_model(
        self, state: AgentState, runtime: Runtime
    ) -> dict[str, Any] | None:
        # Async version of before_model
        return None

    async def aafter_model(
        self, state: AgentState, runtime: Runtime
    ) -> dict[str, Any] | None:
        # Async version of after_model
        print(f"Model returned: {state['messages'][-1].content}")
        return None


agent = create_agent(
    model="gpt-5.4",
    middleware=[LoggingMiddleware()],
    tools=[...],
)
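When to use classes:
  • You define both sync and async implementations of the same hook
  • A single middleware needs multiple hooks
  • Complex configuration is required, such as configurable thresholds or custom models
  • The middleware is reused across projects and configured at initialization
In TypeScript, define custom middleware with the createMiddleware function: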
import { createMiddleware } from "langchain";

const loggingMiddleware = createMiddleware({
  name: "LoggingMiddleware",
  beforeModel: (state) => {
    console.log(`About to call model with ${state.messages.length} messages`);
    return;
  },
  afterModel: (state) => {
    const lastMessage = state.messages[state.messages.length - 1];
    console.log(`Model returned: ${lastMessage.content}`);
    return;
  },
});

Custom state schema

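If middleware needs to track state across hooks, it can extend the agent state with custom properties. This lets middleware:
  • Track state across execution: maintain counters, flags, or other values that persist for the lifetime of the agent run
  • Share data between hooks: pass information from beforeModel to afterModel, or between different middleware instances
  • Implement cross-cutting concerns: add rate limiting, usage tracking, user context, or audit logging without modifying core agent logic
  • Make conditional decisions: use accumulated state to decide whether to continue, jump to a different node, or change behavior dynamically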
import { createMiddleware, createAgent, HumanMessage } from "langchain";
import { StateSchema } from "@langchain/langgraph";
import * as z from "zod";

const CustomState = new StateSchema({
  modelCallCount: z.number().default(0),
  userId: z.string().optional(),
});

const callCounterMiddleware = createMiddleware({
  name: "CallCounterMiddleware",
  stateSchema: CustomState,
  beforeModel: {
    canJumpTo: ["end"],
    hook: (state) => {
      if (state.modelCallCount > 10) {
        return { jumpTo: "end" };
      }

      return;
    },
  },
  afterModel: (state) => {
    return { modelCallCount: state.modelCallCount + 1 };
  },
});

const agent = createAgent({
  model: "gpt-5.4",
  tools: [...],
  middleware: [callCounterMiddleware],
});

const result = await agent.invoke({
  messages: [new HumanMessage("Hello")],
  modelCallCount: 0,
  userId: "user-123",
});
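State fields can be public or private. Fields that start with an underscore (_) are treated as private and are not included in the agent's result. Only public fields (no leading underscore) are returned.

This is useful for internal middleware state that should not be exposed to the caller, such as temporary tracking variables or internal flags: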
import { createMiddleware, HumanMessage } from "langchain";
import { StateSchema } from "@langchain/langgraph";
import * as z from "zod";

const PrivateState = new StateSchema({
  // Public field - included in invoke result
  publicCounter: z.number().default(0),
  // Private field - excluded from invoke result
  _internalFlag: z.boolean().default(false),
});

const middleware = createMiddleware({
  name: "ExampleMiddleware",
  stateSchema: PrivateState,
  afterModel: (state) => {
    // Both fields are accessible during execution
    if (state._internalFlag) {
      return { publicCounter: state.publicCounter + 1 };
    }
    return { _internalFlag: true };
  },
});

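// Assumes `agent` was created with createAgent({ ..., middleware: [middleware] }).
const result = await agent.invoke({
  messages: [new HumanMessage("Hello")],
  publicCounter: 0
});

// result only contains publicCounter, not _internalFlag
// The first afterModel call only sets _internalFlag, so a single model call leaves publicCounter at 0.
console.log(result.publicCounter); // 0 after one model call
console.log(result._internalFlag); // undefined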
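
The public/private rule can be modelled in a few lines of plain TypeScript. This is only a sketch of the filtering described above, not how langchain implements it; stripPrivateFields is a hypothetical helper.

```typescript
// Hypothetical helper that mirrors the rule above; not part of langchain.
function stripPrivateFields(state: Record<string, unknown>): Record<string, unknown> {
  const visible: Record<string, unknown> = {};
  for (const key of Object.keys(state)) {
    // Keep only public fields, i.e. keys without a leading underscore.
    if (key.charAt(0) !== "_") {
      visible[key] = state[key];
    }
  }
  return visible;
}

const internalState = { messages: ["Hello"], publicCounter: 1, _internalFlag: true };
const returned = stripPrivateFields(internalState);
console.log(Object.keys(returned));
```

Hooks would see all of internalState during execution, while the caller would only receive the fields in returned.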

Custom stream transformers

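Middleware-registered transformers require langchain@1.4.3 or later.

Middleware can register stream transformer factories that project events from the live agent stream onto typed extension channels. This is useful for exposing counters, side-channel artifacts, partial outputs, or wire-level redaction without coupling to the framework's built-in projections.

At compile time, middleware-registered factories are merged with whatever the caller passes directly to the agent factory. The final ordering rules keep the built-in ToolCallTransformer first and place caller-supplied entries last.

streamTransformers is passed to createMiddleware as a tuple of factories. Each factory has the shape () => StreamTransformer<any> (zero arguments) and is called once per scope; returning a new transformer on each call keeps every subgraph isolated.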
import { createAgent, createMiddleware } from "langchain";

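// toolActivityTransformer is assumed to be a zero-argument StreamTransformer factory defined elsewhere.
const toolActivityMiddleware = createMiddleware({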
  name: "ToolActivityMiddleware",
  streamTransformers: [toolActivityTransformer],
});

const agent = createAgent({
  model: "gpt-5-nano",
  tools: [...],
  middleware: [toolActivityMiddleware],
});
For the full ordering rules and a PII redaction example, see Register transformers on middleware.
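
Why a factory rather than a shared instance? The sketch below illustrates the isolation argument with a toy counter. ToyTransformer and toyFactory are hypothetical and deliberately much simpler than the real StreamTransformer interface; the once-per-scope call pattern is taken from the description above.

```typescript
// Hypothetical transformer shape for illustration; the real StreamTransformer interface differs.
interface ToyTransformer {
  onEvent(event: string): void;
  seen(): number;
}

// A zero-argument factory: every call returns a transformer with its own counter.
const toyFactory = (): ToyTransformer => {
  let count = 0;
  return {
    onEvent: () => {
      count += 1;
    },
    seen: () => count,
  };
};

// Assumption: the framework calls the factory once per scope (root graph, each subgraph).
const rootScope = toyFactory();
const subgraphScope = toyFactory();

rootScope.onEvent("tool_call");
rootScope.onEvent("tool_call");
subgraphScope.onEvent("tool_call");
```

Because each scope holds its own closure, events counted in the root scope do not leak into the subgraph's counter.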

Custom context

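Middleware can define a custom context schema to access per-invocation metadata. Unlike state, context is read-only and does not persist between invocations. This makes it well suited to:
  • User information: pass user IDs, roles, or preferences that do not change during execution
  • Configuration overrides: provide per-invocation settings such as rate limits or feature flags
  • Tenant/workspace context: include organization-specific data for multi-tenant applications
  • Request metadata: pass request IDs, API keys, or other metadata the middleware needs
Define the context schema with Zod and access it through runtime.context in middleware hooks. Required fields in the context schema are enforced at the TypeScript level, so they must be provided when calling agent.invoke().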
import { createAgent, createMiddleware, HumanMessage } from "langchain";
import * as z from "zod";

const contextSchema = z.object({
  userId: z.string(),
  tenantId: z.string(),
  apiKey: z.string().optional(),
});

const userContextMiddleware = createMiddleware({
  name: "UserContextMiddleware",
  contextSchema,
  wrapModelCall: (request, handler) => {
    // Access context from runtime
    const { userId, tenantId } = request.runtime.context;

    // Add user context to system message
    const contextText = `User ID: ${userId}, Tenant: ${tenantId}`;
    const newSystemMessage = request.systemMessage.concat(contextText);

    return handler({
      ...request,
      systemMessage: newSystemMessage,
    });
  },
});

const agent = createAgent({
  model: "gpt-5.4",
  middleware: [userContextMiddleware],
  tools: [],
  contextSchema,
});

const result = await agent.invoke(
  { messages: [new HumanMessage("Hello")] },
  // Required fields (userId, tenantId) must be provided
  {
    context: {
      userId: "user-123",
      tenantId: "acme-corp",
    },
  }
);
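Required context fields: when you define required fields in contextSchema (fields without .optional() or .default()), TypeScript requires them to be provided in the agent.invoke() call. This keeps the call type-safe and prevents runtime errors caused by missing required context.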
// This will cause a TypeScript error if userId or tenantId are missing
const result = await agent.invoke(
  { messages: [new HumanMessage("Hello")] },
  { context: { userId: "user-123" } } // Error: tenantId is required
);

Execution order

When using multiple middleware, it helps to understand how they execute:
const agent = createAgent({
  model: "gpt-5.4",
  middleware: [middleware1, middleware2, middleware3],
  tools: [...],
});
The lists below use the snake_case hook names; in TypeScript the corresponding hooks are beforeAgent, beforeModel, wrapModelCall, afterModel, and afterAgent.
Before hooks run in order:
  1. middleware1.before_agent()
  2. middleware2.before_agent()
  3. middleware3.before_agent()
Agent loop starts
  1. middleware1.before_model()
  2. middleware2.before_model()
  3. middleware3.before_model()
Wrap hooks nest like function calls:
  1. middleware1.wrap_model_call() → middleware2.wrap_model_call() → middleware3.wrap_model_call() → model
After hooks run in reverse order:
  1. middleware3.after_model()
  2. middleware2.after_model()
  3. middleware1.after_model()
Agent loop ends
  1. middleware3.after_agent()
  2. middleware2.after_agent()
  3. middleware1.after_agent()
Key rules:
  • before_* hooks: first to last
  • after_* hooks: last to first (reverse)
  • wrap_* hooks: nested (the first middleware wraps all the others)
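
The three ordering rules can be checked with a small simulation. The following is a plain-TypeScript model of one model step, not the langchain scheduler; ToyMiddleware, runModelStep, and the log format are made up for illustration.

```typescript
// Plain-TypeScript model of the ordering rules above; not the langchain scheduler.
type ModelCall = () => string;

interface ToyMiddleware {
  name: string;
}

const callLog: string[] = [];
const middlewareStack: ToyMiddleware[] = [{ name: "m1" }, { name: "m2" }, { name: "m3" }];

function runModelStep(middleware: ToyMiddleware[], model: ModelCall): string {
  // before_model hooks: first to last
  for (const m of middleware) {
    callLog.push(`${m.name}.before_model`);
  }

  // wrap_model_call hooks: build from the inside out so the first middleware is outermost
  let call: ModelCall = model;
  for (const m of middleware.slice().reverse()) {
    const inner = call;
    call = () => {
      callLog.push(`${m.name}.wrap:enter`);
      const result = inner();
      callLog.push(`${m.name}.wrap:exit`);
      return result;
    };
  }
  const output = call();

  // after_model hooks: last to first
  for (const m of middleware.slice().reverse()) {
    callLog.push(`${m.name}.after_model`);
  }
  return output;
}

const stepOutput = runModelStep(middlewareStack, () => {
  callLog.push("model");
  return "ok";
});
console.log(callLog.join("\n"));
```

The log shows m1 entering first and exiting last around the model call, which is what "the first middleware wraps all the others" means in practice.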

Agent jumps

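To exit early from middleware, return a dictionary containing jump_to (jumpTo in TypeScript).

Available jump targets:
  • 'end': jump to the end of agent execution (or the first after_agent hook)
  • 'tools': jump to the tools node
  • 'model': jump to the model node (or the first before_model hook)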
import { createAgent, createMiddleware, AIMessage } from "langchain";

const agent = createAgent({
  model: "gpt-5.4",
  middleware: [
    createMiddleware({
      name: "BlockedContentMiddleware",
      beforeModel: {
        canJumpTo: ["end"],
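        hook: (state) => {
          // content can be a string or an array of content blocks; only scan string content
          const content = state.messages.at(-1)?.content;
          if (typeof content === "string" && content.includes("BLOCKED")) {
            return {
              messages: [new AIMessage("I cannot respond to that request.")],
              jumpTo: "end" as const,
            };
          }
          return;
        },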
      },
    }),
  ],
});

const result = await agent.invoke({
    messages: "Hello, world! BLOCKED"
});

/**
 * Expected output:
 * I cannot respond to that request.
 */
console.log(result.messages.at(-1)?.content);

Best practices

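  1. Keep middleware focused: each middleware should do one thing well
  2. Handle errors gracefully: do not let middleware errors crash the agent
  3. Use appropriate hook types:
    • Node-style for sequential logic (logging, validation)
    • Wrap-style for control flow (retry, fallback, caching)
  4. Document any custom state properties clearly
  5. Unit test middleware independently before integrating it
  6. Consider execution order: put critical middleware first in the list
  7. Use built-in middleware where possible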
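
As one way to read points 2 and 5 together, the sketch below wraps a failing handler so the error becomes a readable result instead of crashing the caller, and it can be exercised without an agent. It is plain TypeScript rather than the langchain API: ToolHandler, withErrorFallback, and flakyTool are hypothetical, and the handler is synchronous for brevity.

```typescript
// Plain-TypeScript model of a wrap-style error fallback; not the langchain API.
type ToolHandler = (toolName: string) => string;

// Hypothetical wrapper: converts a thrown error into a message the caller can read.
function withErrorFallback(handler: ToolHandler): ToolHandler {
  return (toolName) => {
    try {
      return handler(toolName);
    } catch (e) {
      const reason = e instanceof Error ? e.message : String(e);
      return `Tool "${toolName}" failed: ${reason}`;
    }
  };
}

// Stand-in tool that fails for one specific tool name.
const flakyTool: ToolHandler = (toolName) => {
  if (toolName === "search") {
    throw new Error("network unavailable");
  }
  return `${toolName} ok`;
};

const safeTool = withErrorFallback(flakyTool);
const okResult = safeTool("calculator");
const failedResult = safeTool("search");
```

Whether to swallow an error like this or rethrow it depends on the middleware; the monitoring example later on this page rethrows after logging.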

Examples

Dynamic prompt

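Modify the system prompt dynamically at runtime to inject context, user-specific instructions, or other information before each model call. This is one of the most common middleware use cases.

Use the systemMessage field on ModelRequest to read and modify the system prompt. It holds a SystemMessage object even when the agent was created with a string systemPrompt.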
import { createMiddleware, SystemMessage, createAgent } from "langchain";

const addContextMiddleware = createMiddleware({
  name: "AddContextMiddleware",
  wrapModelCall: async (request, handler) => {
    return handler({
      ...request,
      systemMessage: request.systemMessage.concat(`Additional context.`),
    });
  },
});

const agent = createAgent({
  model: "google-genai:gemini-3.5-flash",
  systemPrompt: "You are a helpful assistant.",
  middleware: [addContextMiddleware],
});
Use SystemMessage.concat to preserve cache control metadata and structured content blocks created by other middleware.

Dynamic model selection

import { createMiddleware, initChatModel } from "langchain";

const models = {
  complex: await initChatModel("claude-sonnet-4-6"),
  simple: await initChatModel("claude-haiku-4-5-20251001"),
};

const dynamicModelMiddleware = createMiddleware({
  name: "DynamicModelMiddleware",
  wrapModelCall: (request, handler) => {
    const modifiedRequest = { ...request };
    if (request.messages.length > 10) {
      modifiedRequest.model = models.complex;
    } else {
      modifiedRequest.model = models.simple;
    }
    return handler(modifiedRequest);
  },
});

Dynamically selecting tools

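Select relevant tools at runtime to improve performance and accuracy. This section covers filtering pre-registered tools. For registering tools discovered at runtime (for example, from MCP servers), see Runtime tool registration.

Benefits:
  • Shorter prompts: expose only relevant tools to reduce complexity
  • Higher accuracy: the model chooses correctly from fewer options
  • Permission control: filter tools dynamically based on user access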
import { createAgent, createMiddleware } from "langchain";

const toolSelectorMiddleware = createMiddleware({
  name: "ToolSelector",
  wrapModelCall: (request, handler) => {
    // Select a small, relevant subset of tools based on state/context
    const relevantTools = selectRelevantTools(request.state, request.runtime);
    const modifiedRequest = { ...request, tools: relevantTools };
    return handler(modifiedRequest);
  },
});

const agent = createAgent({
  model: "gpt-5.4",
  tools: allTools,
  middleware: [toolSelectorMiddleware],
});
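
The example above leaves selectRelevantTools undefined. As a rough idea of what a permission-based filter could look like, here is a plain-TypeScript sketch. ToyTool, ToyContext, selectToolsForRole, and the role names are all hypothetical, and the real request.state and request.runtime objects carry more than the single role field assumed here.

```typescript
// Hypothetical shapes for illustration; not the langchain tool or runtime types.
interface ToyTool {
  name: string;
  requiredRole: "viewer" | "admin";
}

interface ToyContext {
  role: "viewer" | "admin";
}

function selectToolsForRole(tools: ToyTool[], context: ToyContext): ToyTool[] {
  // Admins see everything; viewers only see tools that do not require admin.
  return tools.filter(
    (tool) => context.role === "admin" || tool.requiredRole === "viewer",
  );
}

const catalog: ToyTool[] = [
  { name: "search_docs", requiredRole: "viewer" },
  { name: "delete_record", requiredRole: "admin" },
];

const viewerTools = selectToolsForRole(catalog, { role: "viewer" }).map((t) => t.name);
const adminTools = selectToolsForRole(catalog, { role: "admin" }).map((t) => t.name);
```

A middleware would run a filter like this inside wrapModelCall and pass the reduced list to the handler as the request's tools.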

Tool call monitoring

import { createMiddleware } from "langchain";

const toolMonitoringMiddleware = createMiddleware({
  name: "ToolMonitoringMiddleware",
  // async + await so a rejected tool call reaches the catch block below
  wrapToolCall: async (request, handler) => {
    console.log(`Executing tool: ${request.toolCall.name}`);
    console.log(`Arguments: ${JSON.stringify(request.toolCall.args)}`);
    try {
      const result = await handler(request);
      console.log("Tool completed successfully");
      return result;
    } catch (e) {
      console.log(`Tool failed: ${e}`);
      throw e;
    }
  },
});

Prompt caching (Anthropic)

When using Anthropic models, cache large system prompts by using structured content blocks with cache control directives. In Python:
from langchain.agents.middleware import wrap_model_call, ModelRequest, ModelResponse
from langchain.messages import SystemMessage
from typing import Callable


@wrap_model_call
def add_cached_context(
    request: ModelRequest,
    handler: Callable[[ModelRequest], ModelResponse],
) -> ModelResponse:
    # Always work with content blocks
    new_content = list(request.system_message.content_blocks) + [
        {
            "type": "text",
            "text": "Here is a large document to analyze:\n\n<document>...</document>",
            # content up until this point is cached
            "cache_control": {"type": "ephemeral"}
        }
    ]

    new_system_message = SystemMessage(content=new_content)
    return handler(request.override(system_message=new_system_message))
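Notes:
  • ModelRequest.system_message is always a SystemMessage object, even when the agent was created with system_prompt="string"
  • Use SystemMessage.content_blocks to access content as a list of blocks, whether the original content was a string or a list
  • When modifying system messages, use content_blocks and append new blocks to preserve the existing structure
  • You can pass a SystemMessage object directly to the system_prompt parameter of create_agent for advanced use cases such as cache control
In TypeScript, use the systemMessage field of ModelRequest to modify system messages in middleware. It holds a SystemMessage object even when the agent was created with a string systemPrompt.

Example: chaining middleware. Different middleware can take different approaches: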
import { createMiddleware, SystemMessage, createAgent } from "langchain";

// Middleware 1: Uses systemMessage with simple concatenation
const myMiddleware = createMiddleware({
  name: "MyMiddleware",
  wrapModelCall: async (request, handler) => {
    return handler({
      ...request,
      systemMessage: request.systemMessage.concat(`Additional context.`),
    });
  },
});

// Middleware 2: Uses systemMessage with structured content (preserves structure)
const myOtherMiddleware = createMiddleware({
  name: "MyOtherMiddleware",
  wrapModelCall: async (request, handler) => {
    return handler({
      ...request,
      systemMessage: request.systemMessage.concat(
        new SystemMessage({
          content: [
            {
              type: "text",
              text: " More additional context. This will be cached.",
              cache_control: { type: "ephemeral", ttl: "5m" },
            },
          ],
        })
      ),
    });
  },
});

const agent = createAgent({
  model: "google-genai:gemini-3.5-flash",
  systemPrompt: "You are a helpful assistant.",
  middleware: [myMiddleware, myOtherMiddleware],
});
The resulting system message is:
new SystemMessage({
  content: [
    { type: "text", text: "You are a helpful assistant." },
    { type: "text", text: "Additional context." },
    {
      type: "text",
      text: " More additional context. This will be cached.",
      cache_control: { type: "ephemeral", ttl: "5m" },
    },
  ],
});
Use SystemMessage.concat to preserve cache control metadata and structured content blocks created by other middleware.
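
To see why appending blocks preserves structure where string concatenation would not, consider this plain-TypeScript model of the chaining example. It is not SystemMessage.concat itself: TextBlock, toBlocks, and concatContent are hypothetical, and the model assumes that string content is treated as a single text block.

```typescript
// Plain-TypeScript model of block-preserving concatenation; not SystemMessage.concat itself.
interface TextBlock {
  type: "text";
  text: string;
  cache_control?: { type: "ephemeral"; ttl?: string };
}
type Content = string | TextBlock[];

// Normalise a string into a single text block so both forms can be appended uniformly.
function toBlocks(content: Content): TextBlock[] {
  if (typeof content === "string") {
    return [{ type: "text", text: content }];
  }
  return content;
}

// Append blocks instead of joining strings, so per-block metadata survives.
function concatContent(base: Content, extra: Content): TextBlock[] {
  return toBlocks(base).concat(toBlocks(extra));
}

const afterFirstMiddleware = concatContent("You are a helpful assistant.", "Additional context.");
const combined = concatContent(afterFirstMiddleware, [
  {
    type: "text",
    text: " More additional context. This will be cached.",
    cache_control: { type: "ephemeral", ttl: "5m" },
  },
]);
```

The final array has three blocks, and only the last one carries cache_control, mirroring the resulting system message shown above.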

Additional resources