Case Study / AI Agent

派蒙三千问

Paimon Asks Everything

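A bilingual, evidence-driven agent for Genshin Impact players and release observers. Instead of rushing to generate an answer, it first gauges the player's progress, controls spoilers, and retrieves sources, then organizes traceable evidence into a response.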

TypeScript
Next.js
AI Agent
Retrieval

Paimon Asks Everything promotional banner showing Paimon, a starry blue background, and four product feature cards. Project Artifact: **A spoiler-safe game understanding system**

Overview

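What I wanted to build was not a chat box but an understanding system with boundaries

The story of Genshin Impact is scattered across quests, version events, artifacts, character profiles, wikis, and community discussions. An ordinary Q&A tool can easily give away plot the player has not reached yet, and can just as easily make speculation read like fact. I designed this project as a "Paimon as guide" product: first establish the player's context, then decide what can be said, where to look, whether the evidence is sufficient, and how to fall back to a more cautious answer when it is not.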

Problem

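Players want clear answers; the system first has to learn restraint

The same question carries a different spoiler risk for a new player, a returning player, and an active player.

Chinese and English sources are not always structured the same way, and entity aliases, version boundaries, and community phrasing get tangled together.

What the release side really needs is not raw chat logs but the topics that repeatedly cause breakdowns in understanding.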

Product Strategy

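Four modules turn the concept into a verifiable experience

Spoiler-safe Q&A

Answers follow the player's progress, stop at high-risk story content, and ask for a second confirmation when necessary.

Evidence-constrained answers

Each conclusion is traced back, as far as possible, to sources, citations, and confidence bounds, rather than letting the model win on tone alone.

Snezhnaya relationship map

Characters, organizations, and concepts become explorable nodes, turning a complex world from fragments into relationships.

Release insights

Anonymous aggregate signals identify points of interest, points of misunderstanding, and FAQ opportunities, without storing individual questions.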

Evidence Chain

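The core idea is an "evidence first, generation second" chain

I designed the agent's workflow as an evidence chain: once a user question comes in, it first passes through context assessment and a spoiler gate, then retrieval, grading, and citation, and only then reaches generation. As a result, every answer on the page works like a research card that can be revisited.

01 Player question
02 Spoiler gate
03 Bilingual retrieval
04 Source grading
05 Evidence answer
06 Anonymous insight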

System Design

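From an AI version of Paimon's notes to a release observation window

Design mind map for Paimon Asks Everything, connecting core positioning, reasons for the product, design principles, release value, Snezhnaya adaptation, and core features. This diagram preserves the original product thinking: Paimon is a guide, not a substitute for the game experience; accuracy matters more than smooth wording; release value comes from the places where players actually get stuck.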

Reliability

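I would rather it answer less than pretend to know

Each answer is first checked against its citations; if they do not match, it is not released.
Official material, wikis, and community discussion are treated separately. Community claims can be used as reference, but not as official canon.
When the material is insufficient, it says it is unsure and, if necessary, declines to answer rather than forcing something out to sound smooth.
I keep a fixed set of questions and run them after every change to see whether retrieval, spoiler control, or citations have broken.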

Retrospective

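Next, I will keep improving the balance between "game feel" and "credibility"

The current public demo still has limitations: Vercel cold starts, an off-site trend layer that is not fully integrated, and lightweight rate limiting. Next, I will bring the UI closer to the game's context while continuing to strengthen the local corpus, search stability, and the consistency of Paimon-style expression.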

Overview

I was not building a chat box. I was building a bounded understanding system.

Genshin Impact lore lives across quests, version events, artifacts, character profiles, wikis, and community threads. A normal Q&A interface can easily reveal story a player has not reached yet, or present speculation as fact. I shaped the project around Paimon as a guide: understand the player's context first, decide what can be said, choose where to search, judge whether the evidence is sufficient, then answer with clear boundaries.

Problem

Players want clarity. The system has to learn restraint first.

The same question carries different spoiler risk for new, returning, and active players.

Chinese and English sources do not always align across aliases, version boundaries, and community wording.

Release teams need aggregate confusion signals, not raw personal question logs.
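
The alias problem above can be illustrated with a small lookup table. This is only a sketch under stated assumptions: `ALIAS_TABLE`, `resolveEntity`, and `expandQueryTerms` are hypothetical names, and the two entries are illustrative samples rather than the project's actual corpus.

```typescript
// Hypothetical alias table: each canonical entity id lists the names it may
// appear under in Chinese and English sources. Entries are illustrative only.
type EntityAliases = { id: string; aliases: string[] };

const ALIAS_TABLE: EntityAliases[] = [
  { id: "tartaglia", aliases: ["Tartaglia", "Childe", "达达利亚", "公子"] },
  { id: "fatui", aliases: ["Fatui", "愚人众"] },
];

// Normalize a surface form so lookups ignore case and surrounding whitespace.
function normalize(name: string): string {
  return name.trim().toLowerCase();
}

// Build a reverse index once: normalized alias -> canonical id.
const ALIAS_INDEX = new Map<string, string>();
for (const entity of ALIAS_TABLE) {
  for (const alias of entity.aliases) {
    ALIAS_INDEX.set(normalize(alias), entity.id);
  }
}

// Resolve a mention to its canonical id, or undefined when it is unknown.
function resolveEntity(mention: string): string | undefined {
  return ALIAS_INDEX.get(normalize(mention));
}

// Expand a canonical id into every alias, so retrieval can query both languages.
function expandQueryTerms(id: string): string[] {
  const entity = ALIAS_TABLE.find((e) => e.id === id);
  return entity ? entity.aliases.slice() : [];
}
```

Resolving to a canonical id before retrieval means a question asked in one language can still match sources written in the other.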

Product Strategy

Four modules turn the idea into a testable experience

Spoiler-safe Q&A

Answers adapt to player progress, stop before high-risk narrative reveals, and ask for a second confirmation when needed.
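
One way to picture the gate is as a comparison between how far the player has read and how far a passage reaches. The sketch below assumes a single numeric progress value and an explicit opt-in flag; `PlayerContext`, `Passage`, `gate`, and `partitionPassages` are hypothetical names, and a real gate would likely track more than one dimension of progress.

```typescript
type GateDecision = "allow" | "needs_confirmation";

// Assumed progress model: one ordinal for the last completed story chapter.
interface PlayerContext {
  storyProgress: number;
  confirmedSpoilers: boolean; // true only after the player explicitly opts in
}

interface Passage {
  id: string;
  requiredProgress: number; // chapter the player should have finished first
}

// Content at or behind the player's progress is safe. Anything ahead of it is
// held back until the player confirms a second time.
function gate(passage: Passage, player: PlayerContext): GateDecision {
  if (passage.requiredProgress <= player.storyProgress) return "allow";
  return player.confirmedSpoilers ? "allow" : "needs_confirmation";
}

// Split retrieved passages into those that can be used now and those on hold.
function partitionPassages(
  passages: Passage[],
  player: PlayerContext
): { safe: Passage[]; held: Passage[] } {
  const safe: Passage[] = [];
  const held: Passage[] = [];
  for (const p of passages) {
    (gate(p, player) === "allow" ? safe : held).push(p);
  }
  return { safe, held };
}
```

If `held` is non-empty, the interface can ask for confirmation instead of answering from the held passages.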

Evidence-grounded answers

Claims are tied back to sources, citations, and confidence boundaries instead of fluent guesswork.

Snezhnaya relationship map

Characters, factions, and concepts become explorable nodes instead of scattered notes.
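
A relationship map of this kind can be modeled as a small typed graph. This is a minimal sketch, not the project's data model: `LoreGraph`, the node kinds, and the relation labels are assumptions chosen for illustration.

```typescript
type NodeKind = "character" | "faction" | "concept";

interface LoreNode { id: string; kind: NodeKind; label: string }
interface LoreEdge { from: string; to: string; relation: string }

class LoreGraph {
  private nodes = new Map<string, LoreNode>();
  private edges: LoreEdge[] = [];

  addNode(node: LoreNode): void {
    this.nodes.set(node.id, node);
  }

  // Reject edges that point at nodes the graph does not know about.
  addEdge(edge: LoreEdge): void {
    if (!this.nodes.has(edge.from) || !this.nodes.has(edge.to)) {
      throw new Error(`Unknown node in edge ${edge.from} -> ${edge.to}`);
    }
    this.edges.push(edge);
  }

  // Nodes directly connected to `id`, in either direction, without duplicates.
  neighbors(id: string): LoreNode[] {
    const ids = new Set<string>();
    for (const e of this.edges) {
      if (e.from === id) ids.add(e.to);
      if (e.to === id) ids.add(e.from);
    }
    return Array.from(ids).map((n) => this.nodes.get(n)!);
  }
}

// Illustrative usage with sample nodes.
const graph = new LoreGraph();
graph.addNode({ id: "tartaglia", kind: "character", label: "Tartaglia" });
graph.addNode({ id: "fatui", kind: "faction", label: "Fatui" });
graph.addNode({ id: "gnosis", kind: "concept", label: "Gnosis" });
graph.addEdge({ from: "tartaglia", to: "fatui", relation: "member_of" });
graph.addEdge({ from: "fatui", to: "gnosis", relation: "seeks" });
```

Each node could also carry the progress it requires, so that the map can hide nodes the player has not reached yet.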

Release insights

Anonymous aggregate signals reveal interest, confusion, and FAQ opportunities without storing personal questions.
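
A simple way to keep insights aggregate-only is to make the recording function accept nothing but a coarse topic label and a signal type. The sketch below assumes that design; `InsightAggregator` and the `minCount` reporting threshold are hypothetical, not taken from the project.

```typescript
type Signal = "interest" | "confusion";

class InsightAggregator {
  private counts = new Map<string, number>();

  // Only a topic label and a signal type are accepted. The question text and
  // any user identifier are never passed in, so they cannot be stored.
  record(topic: string, signal: Signal): void {
    const key = `${signal}:${topic}`;
    const current = this.counts.get(key);
    this.counts.set(key, (current === undefined ? 0 : current) + 1);
  }

  // Report topics whose count reaches `minCount`, most frequent first.
  // The threshold (an assumption) keeps one-off questions out of reports.
  report(signal: Signal, minCount = 3): { topic: string; count: number }[] {
    const prefix = `${signal}:`;
    const rows: { topic: string; count: number }[] = [];
    this.counts.forEach((count, key) => {
      if (key.indexOf(prefix) === 0 && count >= minCount) {
        rows.push({ topic: key.slice(prefix.length), count });
      }
    });
    return rows.sort((a, b) => b.count - a.count);
  }
}
```

A topic that keeps showing up under `confusion` is a candidate for an FAQ entry.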

Evidence Chain

The core idea is evidence before generation

I designed the agent workflow as an evidence chain: a question enters, passes the context and spoiler gates, then goes through retrieval, source grading, and citation checks, and only then reaches generation. Each answer behaves like a research card you can audit.

01 Player question
02 Spoiler gate
03 Bilingual retrieval
04 Source grading
05 Evidence answer
06 Anonymous insight
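
The six steps can be sketched as one function that calls each stage in order and refuses to generate without evidence. Everything here is an assumption made for illustration: the `Stages` interface, its method names, and the synchronous signatures (real retrieval and generation would almost certainly be asynchronous).

```typescript
type Tier = "official" | "wiki" | "community";
interface Evidence { sourceId: string; tier: Tier; text: string }

type Outcome =
  | { kind: "answer"; text: string; citations: string[] }
  | { kind: "confirm_spoiler" }
  | { kind: "insufficient_evidence" };

// Hypothetical stage interface; each stage is injected so it can be swapped.
interface Stages {
  spoilerGate(question: string): "pass" | "confirm";
  retrieve(question: string): Evidence[];            // bilingual retrieval
  grade(evidence: Evidence[]): Evidence[];           // drop or rank weak sources
  generate(question: string, evidence: Evidence[]): string;
  logInsight(outcome: Outcome["kind"]): void;        // anonymous, aggregate only
}

function answerQuestion(question: string, stages: Stages): Outcome {
  let outcome: Outcome;
  if (stages.spoilerGate(question) === "confirm") {
    outcome = { kind: "confirm_spoiler" };           // stop before retrieval
  } else {
    const graded = stages.grade(stages.retrieve(question));
    if (graded.length === 0) {
      outcome = { kind: "insufficient_evidence" };   // never generate unsupported
    } else {
      const text = stages.generate(question, graded);
      outcome = { kind: "answer", text, citations: graded.map((e) => e.sourceId) };
    }
  }
  stages.logInsight(outcome.kind);                   // only the outcome type is logged
  return outcome;
}
```

Because generation is the last call on the only path that reaches it, the model never sees a question without graded evidence attached.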

System Design

From an AI-powered Paimon notebook to a release observation window

Reliability

I would rather have it say less than make things up

Every answer has to line up with its citations before I show it.
Official pages, wiki material, and community threads are treated differently. A forum theory should not become canon.
When the evidence is thin, the answer says so or stops there.
I keep a small, fixed set of questions and rerun it after every change to catch broken retrieval, spoiler control, citations, rate limits, and event logging.
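
The first three rules above can be approximated with a citation check and a source-tier check. This sketch assumes a claim carries a quoted span and that "lining up" means the quote literally appears in the cited source; a real check would likely be fuzzier. `Claim`, `isSupported`, and `reviewAnswer` are hypothetical names.

```typescript
type SourceTier = "official" | "wiki" | "community";
interface Source { id: string; tier: SourceTier; text: string }
interface Claim { text: string; quote: string; sourceId: string }

type Review = "publish" | "publish_as_uncertain" | "withhold";

// A claim is supported only if its quote appears verbatim in the cited source.
function isSupported(claim: Claim, sources: Map<string, Source>): boolean {
  const src = sources.get(claim.sourceId);
  return src !== undefined && src.text.indexOf(claim.quote) !== -1;
}

function reviewAnswer(claims: Claim[], sources: Map<string, Source>): Review {
  // No claims, or any claim that fails its citation: do not show the answer.
  if (claims.length === 0 || !claims.every((c) => isSupported(c, sources))) {
    return "withhold";
  }
  // Community threads alone never establish canon, so the answer is hedged.
  const onlyCommunity = claims.every(
    (c) => sources.get(c.sourceId)!.tier === "community"
  );
  return onlyCommunity ? "publish_as_uncertain" : "publish";
}
```

The same function can be run over the fixed question set after each change, so a regression in citations shows up as a changed `Review` value.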

Retrospective

Next I would keep balancing game feel with trustworthiness

The public demo still has Vercel cold-start limits, an unfinished external trend layer, and lightweight rate limiting. Next I would push the interface closer to the game while strengthening the local corpus, search reliability, and Paimon-style voice consistency.