K1: Agent Native Knowledge Orchestration

Forget GraphRAG — A 4B AI does the work NOW

Video: Discover AI
Paper: K1, Shanghai AI Lab + ECNU + Fudan, 2026-06-11
Duration: 35:29 (2129 s)
Subtitles: EN auto-generated → distilled into Chinese
License: MIT
Date: 2026-06-17

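Core thesis

GraphRAG systems routinely build the wrong graph: five problems mean that "the scientific structure gets destroyed during retrieval", and AI agents fail as a result. K1 (Agent Native Knowledge Orchestration) responds with: (1) a multimodal KG built on 5 schemas · (2) a dedicated 4B extraction LLM · (3) 3-source retrieval (no vector DB) · (4) 17 MCP tools + a CLI.

Key insight: separating the 4B extraction model from the SOTA reasoning model = "less money less expensive".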

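Performance gain: +24.5 percentage points (41.8% → 66.3%)
Extraction model: 4B, dedicated, RL-trained
Schemas: 5 (A / B / C / D / E)
Retrieval sources: 3 (web + KG + traversal)
MCP tools: 17, ready to plug into an agent
Training PDFs: 2.5M, across all scientific domains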

❌ The 5 problems with GraphRAG

#	Problem	Impact	Example
1	Flatten scientific structure	Conditional structure disappears; dependencies are lost	"improves the robustness only when X and Y" → becomes "improves robustness"
2	Ignores multimodal	Table / figure data is dropped entirely	Non-text data in table 3 / figure 3 cannot be retrieved
3	Lacks abstractions	Only metadata is captured; scientific abstractions are missing	Has dataset/method/metric, but no motivation/contribution/limitation
4	Citations as flat edges	Citation semantics are lost	"A cites B" → no distinction between support/contradict/extend/benchmark/criticize/reuse
5	Retrieves only chunks	Still plain text retrieval	Cannot fetch benchmark results / failure claims / equations / lineage
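
To make problems 1 and 4 in the table above concrete, here is a minimal Python sketch of what a flat graph drops. The `Claim` and `CitationEdge` classes and their field names are illustrative assumptions, not K1's actual schema; only the six relation labels and the example sentence come from the table.

```python
from dataclasses import dataclass
from enum import Enum


class CitationRelation(Enum):
    # The six citation semantics named in problem 4.
    SUPPORT = "support"
    CONTRADICT = "contradict"
    EXTEND = "extend"
    BENCHMARK = "benchmark"
    CRITICIZE = "criticize"
    REUSE = "reuse"


@dataclass(frozen=True)
class CitationEdge:
    # Hypothetical typed edge: "A cites B" plus how A cites B.
    source: str
    target: str
    relation: CitationRelation


@dataclass(frozen=True)
class Claim:
    # Hypothetical claim record that keeps its conditions (problem 1).
    statement: str
    conditions: tuple = ()

    def flattened(self) -> str:
        # What a flat graph keeps: the statement with its conditions dropped.
        return self.statement

    def full(self) -> str:
        # The claim with its conditions re-attached.
        if not self.conditions:
            return self.statement
        return f"{self.statement} only when {' and '.join(self.conditions)}"


claim = Claim("improves the robustness", conditions=("X", "Y"))
edge = CitationEdge("paper_A", "paper_B", CitationRelation.CONTRADICT)

print(claim.flattened())
print(claim.full())
print(f"{edge.source} -[{edge.relation.value}]-> {edge.target}")
```

A retriever that only sees `claim.flattened()` and an untyped `paper_A → paper_B` edge cannot tell a conditional result from an unconditional one, or a contradiction from an endorsement.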

"Our latest research AI agents fail not because they are stupid or have not enough complex sub-agent and multi-agent structure but because the scientific structure / information / knowledge itself gets destroyed during the retrieval."

🏗️ The 3 components

1. Multimodal Parser

Parses PDFs → structured output across 5 schemas

Text blocks
Tables
Figures / graphs
Equations
Citations

2. 4B Extraction LLM

Dedicated expert model (not a chat / reasoning model)

RL-trained (GRPO)
3 rewards: format / JSON / task
MIT license, downloadable
4B is enough for general tasks

3. Graph Anything CLI

17 MCP tools + Python package + Unified CLI

4 operators
3-source retrieval
Domain-specific KG
Symbolic graph traversal

📐 The 5 schemas in detail

Factual / Metadata

Title / ID
Publication year
Document type
Language
License
Country/region
PDF URL

Authors

Name
Email
Affiliation
Ordering
Corresponding

Textual Entities

Method (proposed/cited)
Task
Evaluation
Dataset
Metric
Result
Table reference
Figure reference

Implicit / Abstracted

Motivation
Contribution
Limitation
Future work
Hypothesis
Failure mode
Novelty
Mechanism

Citation Relations

Citation type (cites/cited by)
Related reference
Evidence (support/contradict/extend/benchmark/criticize/reuse)
Publication type

Socratic PO extraction example

Field	Extracted value
Method	Socratic Policy Optimization
What it improves	RL convergence
Mechanism	Teacher-guided curriculum generation
Baseline	PPO (2017)
Problem	Exploration inefficiency
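
The five schema groups and the Socratic PO values can be pictured as one nested record. This is a minimal sketch that assumes a plain-dict layout: the key names (`factual`, `authors`, `textual_entities`, `implicit`, `citation_relations`) and the assignment of each table field to a group are my assumptions, not the format K1 emits. Groups for which these notes give no value are left empty rather than guessed.

```python
import json

# Hypothetical layout: one top-level key per schema group.
SCHEMA_GROUPS = (
    "factual",
    "authors",
    "textual_entities",
    "implicit",
    "citation_relations",
)

record = {
    "factual": {},             # title, year, license, ... (not given in the notes)
    "authors": [],             # not given in the notes
    "textual_entities": {
        "method": "Socratic Policy Optimization",
        "baseline": "PPO (2017)",
    },
    "implicit": {
        "what_it_improves": "RL convergence",
        "mechanism": "Teacher-guided curriculum generation",
        "problem": "Exploration inefficiency",
    },
    "citation_relations": [],  # not given in the notes
}


def missing_groups(rec: dict) -> list:
    """Return the schema groups absent from a record (empty groups count as present)."""
    return [group for group in SCHEMA_GROUPS if group not in rec]


print(missing_groups(record))
print(json.dumps(record["implicit"], indent=2))
```

A check like `missing_groups` is the kind of structural test a format reward could build on, since it inspects only the shape of the output and never its content.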

🔍 3-source retrieval (the key innovation)

Web Retrieval

Google Scholar / arXiv / APIs

KG Retrieval

Domain-specific multimodal

Graph Traversal

Symbolic reasoning over paths

The 4 operators (CLI)

Operator	Purpose	Example
Seed resolution	Resolve query → entities	"Socratic PO" → paper title + authors + method + citations
Citation lineage	Evolution chain	"What led to this methodology?" → traverse the graph
Limitation retrieval	Unresolved problems	"Which limitations remain?"
Comparative retrieval	Literature comparison	"What beats classical PPO?" → benchmark + delta + failure
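
The citation-lineage operator can be read as a filtered graph walk: start at a seed node and follow only edges of the chosen relation types, with no embeddings involved. The sketch below is an assumption about how such an operator might work, not K1's implementation; the graph contents, node names, and the `citation_lineage` function are all hypothetical.

```python
from collections import deque

# Toy graph (hypothetical data): node -> list of (relation, cited node).
GRAPH = {
    "paper_D": [("extend", "paper_C"), ("benchmark", "paper_X")],
    "paper_C": [("extend", "paper_B")],
    "paper_B": [("extend", "paper_A"), ("criticize", "paper_X")],
    "paper_A": [],
    "paper_X": [],
}


def citation_lineage(graph, seed, follow=("extend",)):
    """Breadth-first walk from `seed`, following only edges whose relation is in `follow`.

    Returns the traversed edges as (source, relation, target) triples.
    """
    seen = {seed}
    queue = deque([seed])
    path = []
    while queue:
        node = queue.popleft()
        for relation, target in graph.get(node, []):
            if relation in follow and target not in seen:
                seen.add(target)
                path.append((node, relation, target))
                queue.append(target)
    return path


for src, rel, dst in citation_lineage(GRAPH, "paper_D"):
    print(f"{src} -[{rel}]-> {dst}")
```

Changing `follow` to `("criticize",)` or `("benchmark",)` turns the same walk into a rough stand-in for limitation or comparative retrieval, which is only possible because the edges carry a relation type.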

"Graph traversal... is NOT vector building or vector databases or vector searching — this is reasoning over the graph paths and elements here that we need for the long reasoning task."

🎓 Training the 4B model (GRPO)

Why not SFT?

"Think about the nature of the job. What we have to do is extraction and even more structured extraction optimization and not simple reasoning. So what we need is here a massive alignment to our new schema."

The 3 rewards

Reward	Role
Format reward	Output conforms to the schema format
JSON reward	Valid, well-structured JSON
Task reward	NER F1 + Relation Extraction + Long-form structured
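
As a rough illustration of how three such rewards could be computed and combined, here is a minimal sketch. Everything specific in it is an assumption: the required keys, the entity layout (`text`, `type`), the weights, and the choice to score only NER F1 in the task term (the relation-extraction and long-form parts are omitted). The notes do not say how K1 weights or defines these terms.

```python
import json

REQUIRED_KEYS = {"entities", "relations"}  # hypothetical schema keys


def json_reward(output: str) -> float:
    """1.0 if the output parses as a JSON object, else 0.0."""
    try:
        return 1.0 if isinstance(json.loads(output), dict) else 0.0
    except json.JSONDecodeError:
        return 0.0


def format_reward(output: str) -> float:
    """1.0 if the parsed object has exactly the required top-level keys."""
    if json_reward(output) == 0.0:
        return 0.0
    return 1.0 if set(json.loads(output)) == REQUIRED_KEYS else 0.0


def f1(predicted: set, gold: set) -> float:
    """Set-level F1 between predicted and gold items."""
    true_positives = len(predicted & gold)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(gold)
    return 2 * precision * recall / (precision + recall)


def task_reward(output: str, gold_entities: set) -> float:
    """NER F1 over (text, type) pairs; malformed entity lists score 0.0."""
    if format_reward(output) == 0.0:
        return 0.0
    try:
        predicted = {(e["text"], e["type"]) for e in json.loads(output)["entities"]}
    except (KeyError, TypeError):
        return 0.0
    return f1(predicted, gold_entities)


def total_reward(output: str, gold_entities: set, weights=(0.2, 0.2, 0.6)) -> float:
    """Weighted sum of the three terms (the weights are illustrative)."""
    w_format, w_json, w_task = weights
    return (
        w_format * format_reward(output)
        + w_json * json_reward(output)
        + w_task * task_reward(output, gold_entities)
    )


output = json.dumps({
    "entities": [
        {"text": "Socratic PO", "type": "method"},
        {"text": "PPO", "type": "dataset"},  # wrong type on purpose
    ],
    "relations": [],
})
gold = {("Socratic PO", "method"), ("PPO", "baseline")}
print(round(total_reward(output, gold), 3))
```

Because the format and JSON terms are cheap to verify and hard to satisfy by accident, they give the policy a dense signal early in training, before the task term becomes informative.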

The 3 training tasks

1. Classical NER

"Socratic PO is a method, PPO is the baseline, this is a dataset"

2. Relation Extraction

extend / compare / reuse / adapt / contrast / generalize / modify / criticize

3. Structured NER

method / role / evidence / context / analysis

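⚠️ Choosing the model size

4B: enough for general tasks (the paper's target)
8B / 27B: may be needed for theoretical physics and mathematics
Not a reasoning model → a dedicated expert

GRPO essentials

Group Relative Policy Optimization (a familiar algorithm)
Low-variance KL + KL penalty to a frozen reference
KL is added at the last token (not folded into the reward)
Normalization factors / hyperparameters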
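
Two of these ingredients are easy to show in a few lines. The sketch below assumes the standard GRPO formulation: advantages are rewards normalized within a sampling group, and the KL term uses the non-negative estimator `exp(d) - d - 1` with `d = log p_ref - log p_policy`. Whether K1 uses exactly this estimator, and where it applies it, is not stated in these notes; the function names and the sample numbers are mine.

```python
import math
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize each reward against the mean and std of its own sampling group."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


def low_variance_kl(logp_policy: float, logp_ref: float) -> float:
    """Per-token KL estimate exp(d) - d - 1, with d = logp_ref - logp_policy.

    The value is never negative and is exactly zero when the two log-probs match.
    """
    d = logp_ref - logp_policy
    return math.exp(d) - d - 1.0


# Four sampled outputs for the same prompt, scored by some reward function.
rewards = [1.0, 0.0, 0.5, 0.5]
advantages = group_relative_advantages(rewards)
print([round(a, 3) for a in advantages])

# The KL penalty grows as the policy drifts from the frozen reference.
print(low_variance_kl(-1.0, -1.0))
print(round(low_variance_kl(-1.0, -2.0), 4))
```

No value network is needed: the group mean acts as the baseline, which is part of why the method is practical for a small 4B model.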

⚙️ End-to-end workflow (offline + online)

Offline phase (one-time build)

① 2.5M PDFs: 6 major scientific domains
② Parser: multimodal (text / visual / equation)
③ 4B Extractor: RL-trained · 5 schemas
④ Scholar KG: multimodal + 5-layer

Online phase (per query)

User query: via CLI / MCP
3-source retrieval: web + KG + traversal
Reasoning agent: GPT-5 / Gemini 3 (SOTA)
Answer

✅ Key design: separation of concerns

The 4B model handles extraction · a SOTA model handles reasoning · no giant, expensive LLM is used for every task
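
A minimal sketch of the online path under this split: evidence is gathered from the three sources, and the expensive reasoning model is called once per query. The extraction model does not appear because, in the workflow these notes describe, its work is done offline when the KG is built. Every function here is a hypothetical stub, not part of K1's CLI or MCP tools.

```python
def answer_query(query, retrievers, reasoner):
    """Collect evidence from every retrieval source, then call the reasoner once."""
    evidence = []
    for name, retrieve in retrievers.items():
        for item in retrieve(query):
            evidence.append(f"[{name}] {item}")
    return reasoner(query, evidence)


# Hypothetical stubs standing in for the three sources and the SOTA model.
def web_stub(query):
    return [f"web hit for {query!r}"]


def kg_stub(query):
    return ["KG node: method", "KG node: limitation"]


def traversal_stub(query):
    return ["path: paper_D -[extend]-> paper_C"]


def reasoner_stub(query, evidence):
    return f"answer to {query!r} from {len(evidence)} evidence items"


retrievers = {"web": web_stub, "kg": kg_stub, "traversal": traversal_stub}
result = answer_query("What beats classical PPO?", retrievers, reasoner_stub)
print(result)
```

Keeping the reasoner behind a single call makes its cost easy to account for, while the retrieval side can be swapped or extended without touching it.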

"Less money less expensive."

📈 Performance numbers

3 benchmarks

HotpotQA (multi-hop QA)
2WikiMultihopQA (2Wiki multi-hop)
MuSiQue

K1 vs. naked LLM (knowledge + research questions)

Naked GPT-5.2: 41.8%
K1 (Agent Native): 66.3%

📈 Gain: +24.5 percentage points

On all 3 multi-hop QA benchmarks, K1 outperforms every baseline (including LightRAG / HippoRAG / GFMRAG / HippoRAG 2).

Note: naked GPT-5.2 performs at a similar level, about 41.8%, on knowledge / research questions.

🔗 Connections to Patrick's work

🔗 hybrid-llm-routing-framework

Core connection: K1 is a production-grade implementation of the hybrid-llm-router idea:

✅ Cheap local (4B extraction) + cloud SOTA (GPT-5 reasoning)
✅ No vector DB → 3-source retrieval
✅ 17 MCP tools → same direction as Patrick's hybrid approach
✅ MIT license → usable right away
✅ "less money less expensive" = validates the hybrid decision

🔗 IndyDevDan / local LLMs

A 4B model can run locally → an optimal division of labor between local models and cloud reasoning

Graph-traversal reasoning fully complements IndyDevDan's "8K context limit"

🔗 OpenSwarm

OpenSwarm = multi-agent collaboration · K1 = single agent with 3-source retrieval

The two approaches can be combined: K1 nodes + OpenSwarm coordination

🔗 Sam Altman Stanford

Sam's "inference underinvested" → K1 = a concrete example of inference optimization

"Always fresh data" → one of Sam's 3 forks (compute + data shortage)

"less money less expensive" = a concrete path to Sam's "80% democratization"

🔗 DataCamp AI Engineer

Learning RAG on DataCamp → K1 is the RAG 2.0 paradigm (no vector DB)

Multimodal + structured extraction are the advanced skills

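🗺️ Patrick's hands-on roadmap

Do now (10 minutes)

Browse the GitHub repo (17 MCP tools + Python package)
Understand the 5-schema + 3-source architecture
Add a K1 case study to the hybrid-llm-router skill docs

Mid-term (1-2 weeks)

Run the 4B extraction model locally (Ollama / LM Studio)
Pick one domain and process 100-1000 docs to build a small KG
Wire it into hybrid-llm-router for routing (local 4B + cloud GPT-5)

Long-term

Integrate K1 nodes into OpenSwarm
Train a custom 4B extraction model (for a vertical domain)
Use the K1 paper's framework to build OpenSwarm's own knowledge layer

💡 Key takeaways

The K1 paper came out on 6/11 → 17 MCP tools are usable right away → a perfect complement to the hybrid-llm-router skill

This is the RAG 2.0 paradigm: from vector DB → 3-source retrieval (web + KG + traversal)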