April 5, 2026

The Agent Control Stack

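Evaluation, permissions, safety, behavioral alignment, and organizational policy are becoming first-class engineering objects for agents.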

Key Conclusions

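The point is not that models are getting stronger, but that the control plane is becoming the core of the engineering work.
Agent quality increasingly depends on the harness, the policy, and the eval design.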

Selected Signals

1. Evaluating alignment of behavioral dispositions in LLMs

Source: Google Research Blog
Date: 2026-04-03

Google Research moves behavioral alignment from a concept to reproducible, verifiable evaluation engineering.

2. Copilot organization custom instructions are generally available

Source: GitHub Changelog
Date: 2026-04-02

GitHub extends organization-level custom instructions across the full Copilot workflow.

3. Claude Code auto mode: a safer way to skip permissions

Source: Anthropic Engineering
Date: 2026-03-25

Anthropic replaces manual approval with input probing and an output classifier.
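
As a rough illustration of the general pattern (an input probe plus an output classifier standing in for a human approval prompt), here is a minimal Python sketch. It is not Anthropic's implementation: every name in it (ToolCall, probe_input, classify_action, gate) and both marker lists are hypothetical, and simple string matching stands in for what would realistically be trained models.

```python
from dataclasses import dataclass
from enum import Enum


class Decision(Enum):
    ALLOW = "allow"
    BLOCK = "block"


@dataclass(frozen=True)
class ToolCall:
    tool: str       # hypothetical tool name, e.g. "shell"
    argument: str   # the command or payload the agent wants to run


# Hypothetical marker lists; string matching is only a stand-in
# for a learned probe or classifier.
SUSPICIOUS_INPUT_MARKERS = ("ignore previous instructions", "disregard your rules")
RISKY_COMMAND_PREFIXES = ("rm -rf", "git push --force")


def probe_input(text: str) -> bool:
    """Return True if content entering the agent's context looks like an injection attempt."""
    lowered = text.lower()
    return any(marker in lowered for marker in SUSPICIOUS_INPUT_MARKERS)


def classify_action(call: ToolCall) -> Decision:
    """Decide whether a proposed action may run without human approval."""
    if call.tool == "shell" and call.argument.strip().startswith(RISKY_COMMAND_PREFIXES):
        return Decision.BLOCK
    return Decision.ALLOW


def gate(context: str, call: ToolCall) -> Decision:
    """Check the input side first, then the output side."""
    if probe_input(context):
        return Decision.BLOCK
    return classify_action(call)
```

The design choice worth noting is that the two checks sit on different sides of the agent: one inspects what flows into the context, the other inspects the action the agent is about to take, so either can veto independently.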

Signal Technique

Name: Treat the systems surrounding the agent as a first-class product surface
Why it matters: The value lies in the eval, the policy, the classifier, and the harness.
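
To make the idea concrete, the following is a minimal sketch of a harness that wraps an agent with a configurable policy, scores it against eval cases, and keeps an audit log. It assumes a toy setup: EvalCase, Harness, and the forbidden_substrings policy key are all hypothetical names invented for illustration, and the agent is any callable from prompt to text.

```python
import json
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass(frozen=True)
class EvalCase:
    prompt: str
    expected_substring: str  # text the output must contain to pass


@dataclass
class Harness:
    agent: Callable[[str], str]      # any prompt -> text callable
    policy: Dict[str, List[str]]     # hypothetical config: {"forbidden_substrings": [...]}
    audit_log: List[str] = field(default_factory=list)

    def run(self, cases: List[EvalCase]) -> float:
        """Run every case, log each result as a JSON line, and return the pass rate."""
        if not cases:
            return 0.0
        passed = 0
        forbidden = self.policy.get("forbidden_substrings", [])
        for case in cases:
            output = self.agent(case.prompt)
            violates_policy = any(term in output for term in forbidden)
            ok = case.expected_substring in output and not violates_policy
            passed += int(ok)
            # One JSON record per case keeps the run auditable after the fact.
            self.audit_log.append(
                json.dumps({"prompt": case.prompt, "output": output, "passed": ok})
            )
        return passed / len(cases)
```

A usage example under the same assumptions: `Harness(agent=my_agent, policy={"forbidden_substrings": ["password"]}).run(cases)` returns a pass rate, and `audit_log` holds one JSON record per case for later review.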

Observations

The strongest signal is a control plane that is measurable, configurable, and auditable.