K1: Agent Native Knowledge Orchestration

Forget GraphRAG — A 4B AI does the work NOW

Video: Discover AI · Paper: K1 (Shanghai AI Lab + ECNU + Fudan, 2026-06-11) · Duration: 35:29 (2129 s) · Subtitles: EN auto-generated → Chinese summary · License: MIT · Date: 2026-06-17

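Core thesis

GraphRAG systems commonly build the wrong graph: five problems cause "the scientific structure to be destroyed at the retrieval stage", and AI agents fail as a result. K1 (Agent Native Knowledge Orchestration) proposes: (1) a 5-schema multimodal KG · (2) a dedicated 4B extraction LLM · (3) 3-source retrieval (no vector DB) · (4) 17 MCP tools + a CLI

Key insight: separating a 4B extraction model from a SOTA reasoning model keeps costs down ("less money less expensive").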

Performance gain: +24.5 points (41.8% → 66.3%)
Extraction model: 4B (dedicated, RL-trained)
Schemas: 5 (A / B / C / D / E)
Retrieval sources: 3 (web + KG + traversal)
MCP tools: 17 (plug directly into agents)
Training PDFs: 2.5M (all scientific domains)

❌ The 5 problems with GraphRAG

# | Problem | Impact | Example
1 | Flatten scientific structure | Conditional structure disappears; dependencies are lost | "improves the robustness only when X and Y" becomes "improves robustness"
2 | Ignores multimodal | Table / figure data is lost entirely | Non-text data in table 3 / figure 3 cannot be retrieved
3 | Lacks abstractions | Only metadata is captured; scientific abstractions are missing | Has dataset/method/metric, but no motivation/contribution/limitation
4 | Citations as flat edges | Citation semantics are lost | "A cites B" does not distinguish support/contradict/extend/benchmark/criticize/reuse
5 | Retrieves only chunks | Still plain text retrieval | Cannot get benchmark results / failure claims / equations / lineage
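Problem 1 in the table can be made concrete with a small sketch. The `Triple` and `ConditionalClaim` types below are hypothetical illustrations, not K1's actual data model; they only show what is lost when a conditional claim is flattened into a plain triple.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Triple:
    """A flat (subject, relation, object) edge, as in a triple-based graph."""
    subject: str
    relation: str
    obj: str


@dataclass(frozen=True)
class ConditionalClaim:
    """A claim that keeps the conditions under which it holds (hypothetical)."""
    subject: str
    relation: str
    obj: str
    conditions: tuple[str, ...] = ()

    def to_triple(self) -> Triple:
        # Flattening drops the conditions: this is the information loss
        # described in row 1 of the table.
        return Triple(self.subject, self.relation, self.obj)

    def render(self) -> str:
        text = f"{self.subject} {self.relation} {self.obj}"
        if self.conditions:
            text += " only when " + " and ".join(self.conditions)
        return text


claim = ConditionalClaim("method", "improves", "robustness", conditions=("X", "Y"))
flat = claim.to_triple()

print(claim.render())                                # method improves robustness only when X and Y
print(f"{flat.subject} {flat.relation} {flat.obj}")  # method improves robustness
```

Once only the triple is stored, no retrieval step can recover the "only when X and Y" qualifier, however capable the downstream agent is.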
"Our latest research AI agents fail not because they are stupid or have not enough complex sub-agent and multi-agent structure but because the scientific structure / information / knowledge itself gets destroyed during the retrieval."

🏗️ The 3 components

1. Multimodal Parser

Parses PDFs → structured output for the 5 schemas

  • Text blocks
  • Tables
  • Figures / graphs
  • Equations
  • Citations

2. 4B Extraction LLM

A dedicated expert model (not a chat or reasoning model)

  • RL-trained (GRPO)
  • 3 rewards: format / JSON / task
  • MIT license, downloadable
  • 4B is enough for general tasks

3. Graph Anything CLI

17 MCP tools + Python package + Unified CLI

  • 4 operators
  • 3-source retrieval
  • Domain-specific KG
  • Symbolic graph traversal

📐 The 5 schemas in detail

A
Factual / Metadata
  • Title / ID
  • Publication year
  • Document type
  • Language
  • License
  • Country/region
  • PDF URL
B
Authors
  • Name
  • Email
  • Affiliation
  • Ordering
  • Corresponding
C
Textual Entities
  • Method (proposed/cited)
  • Task
  • Evaluation
  • Dataset
  • Metric
  • Result
  • Table reference
  • Figure reference
D
Implicit / Abstracted
  • Motivation
  • Contribution
  • Limitation
  • Future work
  • Hypothesis
  • Failure mode
  • Novelty
  • Mechanism
E
Citation Relations
  • Citation type (cites/cited by)
  • Related reference
  • Evidence (support/contradict/extend/benchmark/criticize/reuse)
  • Publication type
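One way to picture the five schemas is as typed records. The sketch below is an assumption-laden illustration: the class names, field names, and defaults are hypothetical and cover only a subset of the fields listed above; the paper's real schema definitions may differ.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Optional


class Evidence(Enum):
    """Schema E evidence types, as listed above."""
    SUPPORT = "support"
    CONTRADICT = "contradict"
    EXTEND = "extend"
    BENCHMARK = "benchmark"
    CRITICIZE = "criticize"
    REUSE = "reuse"


@dataclass
class Metadata:  # Schema A: factual / metadata
    title: str
    publication_year: Optional[int] = None
    document_type: str = ""
    language: str = ""


@dataclass
class Author:  # Schema B: authors
    name: str
    affiliation: str = ""
    order: int = 0
    corresponding: bool = False


@dataclass
class TextualEntities:  # Schema C: textual entities
    methods: list[str] = field(default_factory=list)
    tasks: list[str] = field(default_factory=list)
    datasets: list[str] = field(default_factory=list)
    metrics: list[str] = field(default_factory=list)


@dataclass
class Abstractions:  # Schema D: implicit / abstracted
    motivation: str = ""
    contribution: str = ""
    mechanism: str = ""
    limitations: list[str] = field(default_factory=list)


@dataclass
class CitationRelation:  # Schema E: citation relations
    related_reference: str
    evidence: Evidence
    cite_type: str = "cites"  # "cites" or "cited_by"


@dataclass
class PaperRecord:
    """All five schemas for one paper (hypothetical container)."""
    metadata: Metadata
    authors: list[Author] = field(default_factory=list)
    entities: TextualEntities = field(default_factory=TextualEntities)
    abstractions: Abstractions = field(default_factory=Abstractions)
    citations: list[CitationRelation] = field(default_factory=list)

    def references_by_evidence(self, evidence: Evidence) -> list[str]:
        # Typed edges make "which papers does this one extend?" a simple filter.
        return [c.related_reference for c in self.citations if c.evidence is evidence]


# Placeholder values only; nothing here comes from a real paper.
record = PaperRecord(
    metadata=Metadata(title="Placeholder paper"),
    citations=[
        CitationRelation("reference_1", Evidence.EXTEND),
        CitationRelation("reference_2", Evidence.CRITICIZE),
        CitationRelation("reference_3", Evidence.EXTEND),
    ],
)
print(record.references_by_evidence(Evidence.EXTEND))  # ['reference_1', 'reference_3']
```

Keeping the evidence type on each citation is what separates Schema E from the flat "A cites B" edge criticized earlier.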

Socratic PO extraction example

Field | Extracted value
Method | Socratic Policy Optimization
What it improves | RL convergence
Mechanism | Teacher-guided curriculum generation
Baseline | PPO (2017)
Problem | Exploration inefficiency

🔍 3-source retrieval (the key innovation)

1. Web Retrieval: Google Scholar / arXiv / APIs
2. KG Retrieval: domain-specific multimodal
3. Graph Traversal: symbolic reasoning over paths

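The 4 operators (CLI)

Operator | Purpose | Example
Seed resolution | Resolve a query into entities | "Socratic PO" → paper title + authors + method + citations
Citation lineage | Trace the chain of development | "What led to this methodology?" → traverse the graph
Limitation retrieval | Find unsolved problems | "Which limitations remain?"
Comparative retrieval | Compare across the literature | "What beats classical PPO?" → benchmark + delta + failure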
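The operators in the table can be read as walks over typed citation edges. The following is a minimal sketch under stated assumptions: the toy edge list, the paper names, and the functions `citation_lineage` and `incoming` are all hypothetical and are not the K1 CLI or its MCP tools.

```python
from collections import defaultdict, deque

# Hypothetical toy graph of typed citation edges: (source, relation, target).
EDGES = [
    ("paper_C", "extend", "paper_B"),
    ("paper_B", "extend", "paper_A"),
    ("paper_C", "benchmark", "paper_A"),
    ("paper_D", "criticize", "paper_B"),
]


def build_index(edges):
    """Index edges by source and by target for forward and backward lookups."""
    out_edges = defaultdict(list)
    in_edges = defaultdict(list)
    for src, rel, dst in edges:
        out_edges[src].append((rel, dst))
        in_edges[dst].append((rel, src))
    return out_edges, in_edges


def citation_lineage(seed, out_edges, relations=("extend", "reuse")):
    """Breadth-first walk from a seed paper along selected relation types."""
    lineage, seen, queue = [], {seed}, deque([seed])
    while queue:
        node = queue.popleft()
        for rel, dst in out_edges.get(node, []):
            if rel in relations and dst not in seen:
                seen.add(dst)
                lineage.append((node, rel, dst))
                queue.append(dst)
    return lineage


def incoming(target, in_edges, relation):
    """Papers that point at `target` with the given relation type."""
    return [src for rel, src in in_edges.get(target, []) if rel == relation]


out_edges, in_edges = build_index(EDGES)
lineage = citation_lineage("paper_C", out_edges)
critics = incoming("paper_B", in_edges, "criticize")

print(lineage)  # [('paper_C', 'extend', 'paper_B'), ('paper_B', 'extend', 'paper_A')]
print(critics)  # ['paper_D']
```

No embeddings or similarity search are involved: the answer comes from following labeled paths, which only works if the relation types were preserved at extraction time.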
"Graph traversal... is NOT vector building or vector databases or vector searching — this is reasoning over the graph paths and elements here that we need for the long reasoning task."

🎓 Training the 4B model (GRPO)

Why not SFT?

"Think about the nature of the job. What we have to do is extraction and even more structured extraction optimization and not simple reasoning. So what we need is here a massive alignment to our new schema."

The 3 rewards

Reward | Role
Format reward | Output conforms to the schema format
JSON reward | Valid, well-structured JSON
Task reward | NER F1 + Relation Extraction + Long-form structured
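As a rough illustration of how three such rewards could feed GRPO, the sketch below scores a group of sampled outputs and normalizes the totals within the group. The required keys, the equal weighting, and the simplified F1 over (field, value) pairs are assumptions for illustration; the source does not say how the rewards are combined.

```python
import json
import statistics

# Hypothetical required keys; the real schema fields are defined by the paper.
REQUIRED_KEYS = {"method", "baseline", "mechanism"}


def parse_json(output: str):
    """Return (True, value) for valid JSON, otherwise (False, None)."""
    try:
        return True, json.loads(output)
    except json.JSONDecodeError:
        return False, None


def json_reward(output: str) -> float:
    ok, _ = parse_json(output)
    return 1.0 if ok else 0.0


def format_reward(output: str) -> float:
    ok, value = parse_json(output)
    return 1.0 if ok and isinstance(value, dict) and REQUIRED_KEYS.issubset(value) else 0.0


def task_reward(output: str, gold: set) -> float:
    """F1 between predicted and gold (field, value) pairs (a stand-in for NER F1)."""
    ok, value = parse_json(output)
    if not ok or not isinstance(value, dict):
        return 0.0
    predicted = {(k, str(v)) for k, v in value.items()}
    true_positives = len(predicted & gold)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(gold)
    return 2 * precision * recall / (precision + recall)


def total_reward(output: str, gold: set) -> float:
    # Equal weights are an assumption, not taken from the paper.
    return format_reward(output) + json_reward(output) + task_reward(output, gold)


def group_relative_advantages(rewards):
    """GRPO-style advantage: each reward normalized by its group's mean and std."""
    mean = statistics.fmean(rewards)
    std = statistics.pstdev(rewards)
    if std == 0:
        return [0.0 for _ in rewards]
    return [(r - mean) / std for r in rewards]


gold = {
    ("method", "Socratic Policy Optimization"),
    ("baseline", "PPO (2017)"),
    ("mechanism", "Teacher-guided curriculum generation"),
}
samples = [
    json.dumps(dict(gold)),                                # complete and correct
    json.dumps({"method": "Socratic Policy Optimization"}),  # valid JSON, missing fields
    "method: Socratic Policy Optimization",                # not JSON at all
]
rewards = [total_reward(s, gold) for s in samples]
advantages = group_relative_advantages(rewards)
print(rewards)
print(advantages)
```

The complete sample earns all three rewards, the partial one earns the JSON reward plus partial task credit, and the non-JSON one earns nothing, so the advantages push the policy toward schema-conformant output.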

The 3 training tasks

1. Classical NER

"Socratic PO is a method, PPO is the baseline, this is a dataset"

2. Relation Extraction

extend / compare / reuse / adapt / contrast / generalize / modify / criticize

3. Structured NER

method / role / evidence / context / analysis

⚠️ Choosing the model size

GRPO is key

⚙️ Full workflow (offline + online)

Offline phase (one-time build)

2.5M PDFs (6 major scientific domains)
→ Parser (multimodal: text/visual/equation)
→ 4B Extractor (RL-trained · 5 schemas)
→ Scholar KG (multimodal + 5-layer)

Online phase (per query)

User query (CLI / MCP)
→ 3-source retrieval (web + KG + traversal)
→ Reasoning agent (GPT-5 / Gemini 3, SOTA)
→ Answer
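The online phase can be sketched as a small pipeline: collect evidence from the three sources, then hand one prompt to the reasoning model. Everything named here (`RetrievedItem`, `gather_evidence`, the stub retrievers, and the stub reasoner) is hypothetical; a real setup would call the K1 tools and a hosted model instead.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class RetrievedItem:
    source: str   # "web", "kg", or "traversal"
    content: str


Retriever = Callable[[str], list[RetrievedItem]]


def gather_evidence(query: str, retrievers: dict[str, Retriever]) -> list[RetrievedItem]:
    """Query every retrieval source and concatenate the results in order."""
    items: list[RetrievedItem] = []
    for retrieve in retrievers.values():
        items.extend(retrieve(query))
    return items


def build_prompt(query: str, items: list[RetrievedItem]) -> str:
    lines = [f"Question: {query}", "Evidence:"]
    lines += [f"- [{item.source}] {item.content}" for item in items]
    return "\n".join(lines)


def answer(query: str, retrievers: dict[str, Retriever], reasoner: Callable[[str], str]) -> str:
    """Retrieval stays cheap and local; only the final prompt reaches the reasoner."""
    return reasoner(build_prompt(query, gather_evidence(query, retrievers)))


# Stubs standing in for the real web search, KG lookup, and graph traversal.
retrievers = {
    "web": lambda q: [RetrievedItem("web", f"recent search hit for: {q}")],
    "kg": lambda q: [RetrievedItem("kg", "structured record from the knowledge graph")],
    "traversal": lambda q: [RetrievedItem("traversal", "path: paper_C extends paper_B")],
}


def reasoner_stub(prompt: str) -> str:
    # A real reasoner would be a SOTA model; this stub just counts evidence lines.
    count = sum(1 for line in prompt.splitlines() if line.startswith("- ["))
    return f"answer based on {count} evidence items"


result = answer("What beats classical PPO?", retrievers, reasoner_stub)
print(result)  # answer based on 3 evidence items
```

Passing the reasoner in as a callable keeps the retrieval side independent of whichever reasoning model is used.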

✅ Key design: separation of concerns

The 4B model handles extraction · a SOTA model handles reasoning · no giant, expensive LLM is used for every task

"Less money less expensive."

📈 Performance data

3 benchmarks

K1 vs Naked LLM (Knowledge + Research Questions)

Naked GPT-5.2: 41.8%
K1 (Agent Native): 66.3%

📈 Gain: +24.5 points
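The +24.5 figure is a difference in percentage points, not a relative improvement. A quick check of both readings, using only the two scores quoted above:

```python
baseline = 41.8  # naked GPT-5.2, percent
k1 = 66.3        # K1, percent

absolute_gain = k1 - baseline                    # percentage points
relative_gain = absolute_gain / baseline * 100   # percent improvement over baseline

print(f"absolute: +{absolute_gain:.1f} points")  # absolute: +24.5 points
print(f"relative: +{relative_gain:.1f}%")        # relative: +58.6%
```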

On all 3 multihop QA benchmarks, K1 outperforms every baseline (including LightRAG / HippoRAG / GFMRAG / HippoRAG 2)

Note: naked GPT-5.2 performs similarly (about 41.8%) on the knowledge and research questions

🔗 Connections to Patrick's work

🔗 hybrid-llm-routing-framework

Core connection: K1 is a production-grade implementation of hybrid-llm-router:

  • ✅ Cheap local model (4B extraction) + cloud SOTA (GPT-5 reasoning)
  • No vector DB → 3-source retrieval
  • ✅ 17 MCP tools → same direction as Patrick's hybrid approach
  • ✅ MIT license → usable right away
  • ✅ "less money less expensive" = validates the hybrid decision

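🔗 IndyDevDan / local LLMs

The 4B model runs locally → an optimal division of labor between local extraction and cloud reasoning

Graph traversal reasoning fully complements IndyDevDan's "8K context limit"

🔗 OpenSwarm

OpenSwarm = multi-agent collaboration · K1 = single agent with 3-source retrieval

The two approaches can be combined: K1 nodes + OpenSwarm coordination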

🔗 Sam Altman Stanford

Sam's "inference underinvested" → K1 = an example of inference optimization

"Always fresh data" → one of Sam's 3 forks (compute + data shortage)

"less money less expensive" = a concrete path to Sam's "80% democratization"

🔗 DataCamp AI Engineer

DataCamp teaches RAG → K1 is the RAG 2.0 paradigm (no vector DB)

Multimodal + structured extraction is an advanced skill

🗺️ Patrick's hands-on roadmap

Do now (10 minutes)

  1. Look at the GitHub repo (17 MCP tools + Python package)
  2. Understand the 5-schema + 3-source architecture
  3. Add a case study (K1) to the hybrid-llm-router skill docs

Mid-term (1-2 weeks)

  1. Run the 4B extraction model locally (Ollama / LM Studio)
  2. Pick 1 domain and run 100-1000 docs to build a small KG
  3. Connect hybrid-llm-router for routing (local 4B + cloud GPT-5)

Long-term

  1. Integrate K1 nodes into OpenSwarm
  2. Train a custom 4B extraction model (vertical domain)
  3. Use the K1 paper's framework to build OpenSwarm's own knowledge layer

💡 Key insights

The K1 paper came out on 2026-06-11 → 17 MCP tools usable right away → a strong complement to the hybrid-llm-router skill

This is the RAG 2.0 paradigm: from a vector DB to 3-source retrieval (web + KG + traversal)

⏱️ Full timeline (65 subtitle segments)

0:00 Hook: Forget GraphRAG, 4B AI does the work
0:17 Previous 2 videos: hierarchical memory + planning · today = knowledge optimization
1:33 Introduction to the GraphRAG pipeline · chunk → triples → retrieval
2:39 Problem 1: Flatten science into triples · Einstein example + conditional structure disappears
3:07 Problem 2: Ignores multimodal · table 3 / figure 3 lost
3:41 Problem 3: Lacks abstractions · motivation/contribution/limitation missing
4:47 Problem 4: Citations as flat edges · no distinction between support/contradict/extend
5:42 Problem 5: Retrieves only chunks · cannot get benchmark / failure / equations
6:08 Introduction to the K1 paper · Shanghai AI Lab + ECNU + Fudan, 2026-06-11
6:54 🏗️ The 3 components · Parser + 4B LLM + Graph CLI
7:35 2.5M PDFs → Scholar KG
9:25 Socratic PO extraction example · method/motivation/baseline all extracted
10:54 Training = GRPO (not SFT)
11:20 3 rewards: format / JSON / task
11:50 3 extraction tasks: NER / relation / structured
13:30 📐 5 schemas (A/B/C/D/E)
15:30 4B ≠ reasoning model · = dedicated expert
16:15 "less money less expensive"
17:00 Always fresh data → web search integration
17:25 🔍 3-source retrieval · web + KG + traversal
17:50 Graph traversal ≠ vector search
18:33 Graph operator examples · seed / lineage / limitation / comparison
22:15 Full architecture diagram (offline + online)
25:44 3 benchmarks: HotpotQA / 2Wiki / MuSiQue
26:24 📈 41.8% → 66.3% · knowledge + research questions
27:00 Preview of next week's new system