Self-Harness Optimization

Self-Learning AI Harness: Discover AI's deep dive into the Shanghai AI Lab paper

Video: Code Loops and A Self Learning AI Harness · Channel: Discover AI · Duration: 17:42 (1062 s) · Subtitles: auto-generated EN → condensed Chinese notes · Date: 2026-06-17

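Core thesis

Harness engineering has been the AI community's new paradigm since 2026-06-08: Anthropic's Boris said it himself, "stop prompting Claude, build loops that prompt themselves". The video argues that Anthropic is moving the "intelligence" out of the LLM and into the harness (see the Fable 5 phenomenon: the 70% refusal rate is read as the model having been put under a different harness). Shanghai AI Lab proposes Self-Harness Optimization: the LLM optimizes its own harness without any weight changes, using a frozen model plus a mathematical loop that detects failure patterns → proposes → validates → updates.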

MiniMax M2.5: 42% → 53% (+11 points)
QA 3.5: 20% → 36% (+16 points)
Fable 5 refusal: 70%+ (on science tasks)
Boris announcement: 6/8 ("harness engineering")
Harness elements: 4 (prompts / tools / memory / policy)
Loop steps: 4 (evaluate / propose / validate / update)

🔥 Fable 5 complaints (background)

💬 Hugging Face CTO's comment

💡 Discover AI's key conjecture

"Is Fable 5 really a new LLM, or since it's input a vision language model — is this really a new object or has it just a better harness?"

"Anthropic is moving the intelligence out from their LLM system into the harness system."

📊 Fable 5 vs Claude 4.6

Task type | Fable 5 | Claude 4.6
Video game | ✅ good | ✅ good
HTML generation | ✅ good | ✅ good
Science tasks | ❌ 70%+ refusal | ✅ works
Price | 2x | 1x

🏗️ Scaffold vs Harness (the core distinction)

The paper's definition (Shanghai AI Lab)

"We use the term harness to denote the non-parametric scaffolding that governs how a fixed language model is deployed as an agent."

In short: the paper calls "everything that is not the LLM" the harness, scaffold included.

Discover AI's own, finer-grained distinction

Dimension | Scaffold | Harness
Essence | Expertise as workflow | Execution substrate
Key question | "How should this problem be solved?" | "With which tool? How often retry? When stop?"
Contains | Skills + workflows | Tool calling / file I/O / code exec / memory / loop control
Analogy | OSC: knowledge | OSC: runtime layer
Source of expertise | ✅ Yes | ❌ No (operational layer only)

Discover AI's framing

"LLM running inside an iterative loop, governed by a harness, and shaped by a scaffold of skills and workflows."

The 4 harness elements to optimize

1. Prompts

System / execution / verification / failure recovery instruction

2. Tools

Tool calling / file I/O / code execution

3. Memory

Context compaction / past experience

4. Policy

Runtime control / routing

🔁 The 4-step Self-Harness loop

1. Evaluate: collect execution traces · cluster failed records by verifier-grounded failure signatures · rank the clusters by actionability
2. Propose: the frozen LLM (M), running under the current harness, is given the failure patterns, the passing behavior, and the previously attempted edits, and proposes minimal candidate modifications (baby steps)
3. Validate: the same evaluator scores the new proposal · pass → accept · fail → reject
4. Update: if accepted, update the harness · otherwise keep the previous version
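The four steps above can be sketched as a small Python loop. This is a minimal illustration, not the paper's implementation: the `Trace` record, the dict-shaped harness, and the `run_tasks` / `propose_edit` callables are all hypothetical, and "pass" in the validate step is assumed here to mean a strictly higher pass rate under the same evaluator.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical shape: harness element name -> instruction text.
Harness = Dict[str, str]


@dataclass
class Trace:
    task_id: str
    passed: bool
    failure_signature: str = ""  # empty when the task passed


def pass_rate(traces: List[Trace]) -> float:
    return sum(t.passed for t in traces) / len(traces) if traces else 0.0


def self_harness_loop(
    harness: Harness,
    run_tasks: Callable[[Harness], List[Trace]],
    propose_edit: Callable[[Harness, List[Trace]], Harness],
    iterations: int = 3,
) -> Harness:
    traces = run_tasks(harness)  # 1. evaluate the current harness
    for _ in range(iterations):
        failures = [t for t in traces if not t.passed]
        if not failures:
            break
        candidate = propose_edit(harness, failures)  # 2. propose a small edit
        candidate_traces = run_tasks(candidate)  # 3. validate with the same evaluator
        # Assumed acceptance rule: keep the edit only if the pass rate improves.
        if pass_rate(candidate_traces) > pass_rate(traces):
            harness, traces = candidate, candidate_traces  # 4. update
    return harness
```

In a real setup `propose_edit` would call the frozen model with the failure clusters, and `run_tasks` would execute the benchmark; the model weights never appear in the loop, only the harness changes.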

Key design choice: baby steps

"Since it's a self-learning procedure, you know this is like a Kullback-Leibler divergence: you don't want to diverge too much here from your given probability distribution. So, you just make baby steps."

Why not make large edits: keeps the distribution stable · avoids catastrophic forgetting · keeps validation local (small edits are easy to verify)
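One way to enforce baby steps is to reject any candidate that touches too many harness elements or rewrites one too heavily. The sketch below uses text similarity from Python's `difflib` as a crude stand-in for the "don't diverge too far" idea; it is not the paper's criterion, and the function name and both thresholds are assumptions.

```python
from difflib import SequenceMatcher
from typing import Dict


def is_baby_step(
    old: Dict[str, str],
    new: Dict[str, str],
    max_changed: int = 1,         # hypothetical: at most one element per edit
    min_similarity: float = 0.5,  # hypothetical: edited text must stay this close
) -> bool:
    keys = set(old) | set(new)
    changed = [k for k in keys if old.get(k, "") != new.get(k, "")]
    if len(changed) > max_changed:
        return False
    # Note: a brand-new element has similarity 0 to "" and is rejected
    # unless min_similarity is set to 0.
    return all(
        SequenceMatcher(None, old.get(k, ""), new.get(k, "")).ratio() >= min_similarity
        for k in changed
    )
```

Appending one sentence to an existing instruction passes this check, while replacing an instruction wholesale or editing two elements at once does not.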

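📊 Clustering & ranking

Cluster structure (per cluster)

Field | Meaning
Cluster size | Number of failed records
Representative task | A representative task
Shared trace symptoms | Symptoms the traces have in common
Verifier evidence | Evidence from the verifier
Agent mechanisms | Agent behavior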
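The per-cluster fields above could be held in a small record, with clusters then sorted by an estimated actionability score. This is a sketch under assumptions: the field names mirror the table, but the record layout, the `signature` grouping key, and the externally supplied scores are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class FailureCluster:
    signature: str  # verifier-grounded failure signature
    task_ids: List[str] = field(default_factory=list)
    shared_symptoms: List[str] = field(default_factory=list)
    verifier_evidence: str = ""
    agent_mechanism: str = ""
    actionability: float = 0.0  # hypothetical score in [0, 1]; higher = easier fix

    @property
    def size(self) -> int:
        return len(self.task_ids)

    @property
    def representative_task(self) -> str:
        return self.task_ids[0] if self.task_ids else ""


def cluster_failures(records: List[Dict[str, str]]) -> List[FailureCluster]:
    """Group failed records that share the same failure signature."""
    by_sig: Dict[str, FailureCluster] = {}
    for rec in records:
        sig = rec["signature"]
        cluster = by_sig.setdefault(sig, FailureCluster(signature=sig))
        cluster.task_ids.append(rec["task_id"])
    return list(by_sig.values())


def rank_clusters(
    clusters: List[FailureCluster], scores: Dict[str, float]
) -> List[FailureCluster]:
    """Most actionable first; larger clusters break ties."""
    for c in clusters:
        c.actionability = scores.get(c.signature, 0.0)
    return sorted(clusters, key=lambda c: (-c.actionability, -c.size))
```

How the actionability score is estimated is left open here; in the video it is described only as "what's simplest to correct".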

Ranking principle

"Order clusters by estimated actionability — what can we solve immediately, what's simplest to correct, how can we optimize performance immediately? We don't waste 30-min or 1-hour runs to find what's wrong."

A proposal contains

The 10 initial harness elements

# | Element
1 | System prompt
2 | Memory source
3 | Sub-agent
4 | Skills
5 | Bootstrap instruction
6 | Execution instruction
7 | Verification instruction
8 | Failure recovery instruction
9 | Runtime control policy
10 | Routing policy

→ These 10 elements are exactly what self-harness optimization edits.
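To make the edit surface concrete, the ten elements can be modeled as one immutable record where each accepted edit yields a new version. The class name `HarnessConfig`, the plain-string field types, and `apply_edit` are assumptions for illustration; the source only lists the element names.

```python
from dataclasses import dataclass, fields, replace


@dataclass(frozen=True)
class HarnessConfig:
    system_prompt: str = ""
    memory_source: str = ""
    sub_agent: str = ""
    skills: str = ""
    bootstrap_instruction: str = ""
    execution_instruction: str = ""
    verification_instruction: str = ""
    failure_recovery_instruction: str = ""
    runtime_control_policy: str = ""
    routing_policy: str = ""


def apply_edit(harness: HarnessConfig, element: str, text: str) -> HarnessConfig:
    """Return a new harness with exactly one element changed."""
    valid = {f.name for f in fields(HarnessConfig)}
    if element not in valid:
        raise ValueError(f"unknown harness element: {element}")
    return replace(harness, **{element: text})
```

Because the record is frozen, a rejected proposal leaves the previous harness untouched, which matches the "otherwise keep the old version" rule of the update step.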

💡 Failure case example

Scenario: the agent is stuck in a tool loop (after 52 calls)

❌ Old behavior

"Keeps repeating the same failed path, stuck in the loop"

✅ Self-correction (edited automatically by the harness)

"You appear to be stuck in a tool loop. Stop repeating the same failed path. Summarize the evidence already collected. Choose the smallest remaining implementation step and then run one targeted verification."

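💡 Key capability

It edits the harness instructions, not the LLM weights → the change is immediately readable and takes effect immediately

Faster than fine-tuning, more structured than a prompt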
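The stuck-in-tool-loop case shows what an instruction-level fix looks like at runtime. Below is a minimal sketch of a detector that injects the recovery instruction quoted above once the same tool call keeps repeating. The `(tool, args)` call format, the window size, and the repeat threshold are hypothetical; the source only reports that the loop was noticed after 52 calls.

```python
from collections import Counter
from typing import List, Optional, Tuple

# Recovery text taken from the self-correction example in these notes.
RECOVERY_HINT = (
    "You appear to be stuck in a tool loop. Stop repeating the same failed path. "
    "Summarize the evidence already collected. Choose the smallest remaining "
    "implementation step and then run one targeted verification."
)


def loop_hint(
    calls: List[Tuple[str, str]],  # hypothetical format: (tool name, arguments)
    window: int = 10,              # assumed: only look at the most recent calls
    max_repeats: int = 3,          # assumed: identical calls tolerated in the window
) -> Optional[str]:
    """Return the recovery instruction if one call dominates the recent window."""
    recent = calls[-window:]
    if not recent:
        return None
    _, count = Counter(recent).most_common(1)[0]
    return RECOVERY_HINT if count >= max_repeats else None
```

The returned text would be appended to the next prompt by the runtime control policy; nothing about the model itself changes.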

📈 Performance data (Terminal Bench 2)

89 containerized terminal tasks · tool-based execution

MiniMax M2.5: initial pass rate 42% → 53% after self-optimization

QA 3.5: initial pass rate 20% → 36% after self-optimization
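The two gains read differently in absolute and relative terms, which a short calculation makes explicit. The pass rates are the ones reported above; the helper `gain` is only an illustration.

```python
from typing import Tuple


def gain(before_pct: float, after_pct: float) -> Tuple[float, float]:
    """Return (absolute gain in percentage points, relative gain in percent)."""
    absolute = after_pct - before_pct
    relative = absolute * 100 / before_pct
    return absolute, relative


results = {
    "MiniMax M2.5": gain(42, 53),  # +11 points, about +26.2% relative
    "QA 3.5": gain(20, 36),        # +16 points, +80% relative
}

for model, (absolute, relative) in results.items():
    print(f"{model}: +{absolute} points, +{relative:.1f}% relative")
```

The weaker starting point benefits far more in relative terms, which fits the video's caveat that these are the easy wins.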

⚠️ Limitations

"Of course, since the tasks are rather simple and you have a simple harness, a self-evolving harness, these are the easy wins."

"For a higher complexity system, there's nothing particular to it. It is more or less more of the same."

🔗 Links to Patrick's work

🔗 Sam Altman Stanford

Sam hints that Anthropic is "moving intelligence into the harness" (the Fable 5 phenomenon)

Sam's "inference underinvested" → borne out: harness engineering is the advanced tier of inference work

🔗 K1 (Agent Native KG)

K1 = a strong example of "harness design" (17 MCP tools + 3-source retrieval)

Self-harness = one step further: the harness evolves itself

K1 could add a self-harness loop → detect failures automatically → optimize the CLI

🔗 Claude Code / IndyDevDan

IndyDevDan: "Claude Code = LLM + iterative loop + harness"

"Milestone + commit" = the manual version of a harness

Self-harness = the automated version of milestone + commit

🔗 hybrid-llm-router ⭐

The 4 harness elements: prompts / tools / memory / policy → memory + policy are the key to routing

Self-harness optimization can optimize the "policy" → a self-learning router policy

Ranking by "estimated actionability" = analyzing the routing decision log

🔗 OpenSwarm

OpenSwarm = multiple harnesses working together

Each harness can learn on its own → a self-evolving multi-agent system

"OpenSkill" (covered earlier by Discover AI) = a self-evolution skill loop

🗺️ Patrick's practical roadmap

Doable right away (30 minutes)

  1. Add a case study (self-harness optimization) to the hybrid-llm-router skill docs
  2. Find traces from existing OpenSwarm / hybrid-llm-router runs
  3. Cluster the failure patterns by hand (to learn the approach)

Mid-term (1-2 weeks)

  1. Implement the self-harness loop (on top of the hybrid-llm-router framework)
  2. Rank the 4 elements (prompts/tools/memory/policy) by cluster
  3. Baby-step validation

Long-term (1 month+)

  1. Self-evolving multi-agent system (OpenSwarm + self-harness)
  2. Share failure patterns across harnesses (collective learning)
  3. Integrate into the production LiteLLM proxy

💡 Boris's 6/8 announcement (Anthropic)

"Stop prompting Claude and build loops that prompt themselves."

→ Harness engineering = the new paradigm since 2026-06-08

→ Self-harness = the self-improving version of the loop

⏱️ Full timeline (61 subtitle segments)

0:00 Hook: after the long Fable 5 week → Self-Learning Harness
0:08 Previous video: research harness (K1-like)
0:28 🔥 Fable 5 complaints: 70%+ refusal · Hugging Face CTO: "why pay 2x for 70% refusal?"
1:11 "Is Fable 5 new LLM or just better harness?"
1:27 "Anthropic moving intelligence to harness"
2:42 Open Skill reference (self-evolving skills)
3:01 🏗️ Scaffold vs Harness distinction
4:10 "Harness = operational layer, not source of expertise"
4:18 "Scaffold: how to solve / Harness: with which tool"
5:14 "OSC: scaffold = knowledge, harness = runtime"
6:01 🔑 Boris: "stop prompting, build loops that prompt themselves" · 6/8 hot topic
6:38 "LLM inside iterative loop, governed by harness"
6:53 🔁 Self-harness introduced
7:17 "AI bubble boiling in your own soup" (warning)
8:11 Paper's definition: harness = non-parametric scaffolding
8:55 "Is Fable 5 just an optimized harness?"
9:18 4 elements: prompts / tools / memory / policy
9:46 🔁 4-step loop: evaluate → propose → validate → update
10:33 Cluster failed records by failure signatures
11:43 Order by actionability (what is easiest to fix)
13:03 Baby steps (no large changes to the distribution)
14:10 Validation uses the same evaluator
14:33 Terminal Bench 2 · 89 tasks
14:53 Initial harness construction (10 elements)
15:22 Performance plot: pass rate rises with iterations
15:55 💡 Failure case (stuck in tool loop → self-correction)
16:42 📈 Results: MiniMax M2.5 42%→53% · QA 3.5 20%→36%
17:20 Limitation: the wins come only on simple tasks
17:34 Closing