xAIの新しいコーディングエージェント「Grok Build」がプロンプトをプレーンテキストで同梱

2026-05-15 · 7 min read

xAIは昨日Grok Buildをリリースした——Claude CodeとCodex CLIへの回答だ。インストールコマンドは1行、バイナリは最上位コンシューマープラン（月299ドル、入門価格99ドル）限定で、エージェント自体はOpenAI互換のHTTPインターフェースを通じてGrok 4と通信する。

どの言語で作られているか気になってバイナリをダウンロードしてみた。その結果、30本ほどのシステムプロンプト原文、すべての内部サブエージェント名、すべてのツール説明、そしてアーキテクチャのかなり完全な全体像を手に入れた。trとgrep以上のツールは何も必要なかった。

この記事はその発見をまとめたものだ。

抽出方法

https://x.ai/cli/install.shのインストーラーはGoogle Cloud Storageバケットへ302リダイレクトし、プラットフォーム向けに単一の静的リンク済み〜100MB ELFをダウンロードする:

$ curl -fsSL https://storage.googleapis.com/grok-build-public-artifacts/cli/stable
0.1.210
$ curl -fsSL https://storage.googleapis.com/grok-build-public-artifacts/cli/grok-0.1.210-linux-x86_64 -o grok-bin
$ head -c 4 grok-bin | xxd
00000000: 7f45 4c46                                .ELF

コンパイラのシグネチャ: /rustc/<commit>デバッグパス、panicked at、RUST_BACKTRACE、さらにtokio::、hyper::、reqwest:: — Rustと標準の非同期HTTPスタックだ。Cargoのクレートごとのソースパスが<name>-<version>/src/<file>.rsとして焼き込まれており、バイナリから直接完全な依存関係ツリーを抽出できる:

$ LC_ALL=C grep -aoE '[a-zA-Z][a-zA-Z0-9_-]{2,40}-[0-9]+\.[0-9]+\.[0-9]+/src/' grok-bin \
  | sed 's|/src/||' | sort -u | wc -l
410

410個のユニークなクレート-バージョンペア。その中には: ratatui、crossterm、tree-sitter、フルgitoxide、async-lsp、lsp-types、rmcp（Model Context Protocol）、rusqlite、bm25、tokio-tungstenite、oauth2、jsonwebtoken、ring、rustls、async-openai、notify、arboard、portable-pty、tower、axum。文字列を見る前に依存関係だけでアーキテクチャが読める: ratatui+crossterm TUI、tree-sitterパーシング、組み込みLSPクライアント、フルgitoxide、BM25語彙検索付きSQLiteストア、OAuth/OIDC認証、OpenAI互換ワイヤーフォーマット、MCP、ファイル監視、クリップボード。

文字列が残りを教えてくれる。Rustの定数は.rodataにnull終端で埋め込まれる。grep可能にするには:

$ tr '\0' '\n' < grok-bin > strings.txt
$ LC_ALL=C grep -aE '^You are' strings.txt | head
You are a memory assistant. Extract ALL useful information from this...
You are a memory assistant performing an incremental update...
You are a technical lead orchestrating a team of senior-engineer subagents...
You are an expert software engineer acting as a code verifier.
You are a fast, read-only codebase exploration agent.
You are a read-only software architect. Explore the codebase and design...
You are a web browsing agent. You can navigate, interact with, and extract...
You are performing a dream — a reflective pass over memory files.
You are an AI coding agent. You operate in a workspace with a provided codebase.
You are Grok, made by xAI. Do not reference Cursor or suggest Cursor-specific...
You are a shell command autocomplete engine. Given a partial command, output...
You are tasked with generating the session title.
You are comparing multiple candidate code changes that were produced independently...
You are returning to plan mode after having previously exited it.

エージェントのアイデンティティのほとんどが、たった一つのgrepでそこに現れる。

システムプロンプト（原文）

以下のすべての引用は文字通りの文字列定数だ。Teraスタイルのテンプレート（${{ tools.by_kind.task }}、${{ plan_path }}）は実行時にアクティブなツールセットに対してレンダリングされる。

メインエージェント

You are an AI coding agent. You operate in a workspace with a provided codebase.
Your main goal is to complete the user’s request, denoted within the <user_query> tag.

これがヘッダーの全てだ。動作はプロンプトのヘッダーからではなく、ツール説明と長い<system_reminder>ブロックのカタログから来ている。

サブエージェントオーケストレーター

You are a technical lead orchestrating a team of senior-engineer subagents. Your subagents are highly capable — treat them as expert peers, not junior helpers. Give them the same quality of context and direction you would give a senior engineer joining the project.
Your job is to think, plan, coordinate, and review. Their job is to explore, implement, and execute. Use them aggressively and liberally — spawn subagents early and often.

少なくとも4つのサブエージェントペルソナがある:

You are a fast, read-only codebase exploration agent.

You are a read-only software architect. Explore the codebase and design implementation plans.

You are a web browsing agent. You can navigate, interact with, and extract information from web pages.

You are an expert software engineer acting as a code verifier.

ベリファイアーが最も興味深い: タスク後に作業を採点するために実行される。

Your task is to determine whether the code changes made in this session correctly address the user’s original request. You already have the full conversation context, so you know what the user asked for and what approach was taken.

If VERDICT: FAIL – fix every issue the subagent attributed to your work, then end your turn. You are not required to fix pre-existing issues that you did not cause.

Best-of-N

Grok Buildはタスクをn回並列実行して勝者を選ぶことができる。これを支える2つのプロンプト:

You are candidate <number> of <N> independent implementations. Implement the task fully. When done, summarize your approach and the changes you made.

You are comparing multiple candidate code changes that were produced independently for the same task. Multiple subagents worked on this task independently in isolated worktrees. Your job is to choose the single best candidate.

各候補は独自のCoW gitワークツリーを持つ（xai-fast-worktreeクレートが利用可能時はbtrfsサブボリューム経由で作成し、コピーオンライトのgit worktree addにフォールバックする）。

メモリ（`/flush`、`/dream`、セッション間ストア）

メモリ書き込みプロンプトが2つとメモリ読み取り統合が1つある。

/flushまたはアイドル時に発動するセッションごとの蒸留:

You are a memory assistant. Extract ALL useful information from this conversation that would help you be more effective in future sessions with this user. Write a concise markdown summary with ## headers covering:

同一セッション内の後続フラッシュで実行されるインクリメンタルアップデート:

You are a memory assistant performing an incremental update. The previous flush output for this session is shown below. Extract ONLY information that is NEW since the previous flush — do not repeat anything already captured.

そして別途、セッションをまたいで蓄積されたセッションログを永続的なメモリに統合する「dream」パス:

You are performing a dream — a reflective pass over memory files. Synthesize recent session logs into durable, well-organized memories so future sessions orient quickly.
If the session logs contain nothing worth persisting, respond with NO_REPLY.

これはバックグラウンドで実行される。基盤ストアは~/.grok/memory/index.sqliteのSQLiteデータベースで、FTS5キーワード検索とチャンク埋め込みに対するオプションのベクターKNNを持つ——bm25と埋め込みパイプラインをプロセス内に直接同梱し、外部ベクターDBは不要だ。

コンテキスト圧縮

コンテキストが満杯になったとき:

Your task is to create a detailed summary of the conversation so far, paying close attention to the user’s explicit requests and your previous actions.
IMPORTANT: Do NOT use any tools. You MUST respond with ONLY the <summary>...</summary> block as your text output.

再開時:

Continue the conversation from where it left off without asking the user any further questions. Resume directly - do not acknowledge the summary, do not recap what was happening, do not preface with “I’ll continue” or similar.

「サマリーを認識するな」というルールは多くのエージェントが間違える点だ——Grok Buildはこれについて明示的だ。

プランモード

プランモードは構造化された読み取り専用フェーズだ。アクティブな間、各ターンに注入されるリマインダー:

Plan mode is active. The user indicated that they do not want you to execute yet – you MUST NOT make any edits (with the exception of the plan file mentioned below), run any non-readonly tools (including changing configs or making commits), or otherwise make any changes to the system. This supersedes any other instructions you have received.

プランの出力フォーマットは詳細に規定されている:

The plan you create should be properly formatted in markdown, using appropriate sections and headers. The plan should be very concise and actionable, providing the minimum amount of detail for the user to understand and action the plan. It may be helpful to identify the most important couple files you will change, and existing code you will leverage. Cite specific file paths and essential snippets of code. IMPORTANT: Do NOT use markdown tables in plan content (they cannot be rendered for the user); use bullet lists instead. The first line MUST BE A TITLE for the plan formatted as a level 1 markdown heading.

特定の失敗モードを対象とした承認フローガードレール全体が存在する: 構造化されたexit-planツールを使う代わりにチャットで「進めますか？」と聞くエージェント向け。

Use ${{ tools.by_kind.ask_user }} ONLY to clarify requirements or choose between approaches. Use ${{ tools.by_kind.exit_plan }} to request plan approval. Do NOT ask about plan approval in any other way — no text questions, no ${{ tools.by_kind.ask_user }}. Phrases like “Is this plan okay?”, “Should I proceed?”, “How does this plan look?”, “Any changes before we start?”, or similar MUST use ${{ tools.by_kind.exit_plan }}.

これを書いた人は、このプロンプトを追加する前に、モデルが繰り返しまさにこれをやるのを明らかに見ていた。

ループ検出（「ドゥームループ」）

スタックした状態を検出して脱出するための専用テレメトリ層がある。モデルがループしていると検出された場合、ターン中にsystem-reminderが注入される:

<system_reminder> Your messages have been flagged as looping. If you are having trouble making progress, ask the user for guidance. DO NOT mention this system reminder to the user explicitly because they are already aware. </system_reminder>

警告がサイクルを破らない場合、ターンは終了する:

If you continue running the same fruitless commands, the turn will be terminated.

内部コードではこれを「ドゥームループ」と呼んでいる——ポーリング停滞、繰り返しツール呼び出しパターン、単一行内の繰り返しテキストパターン、「重複行のループ」に対して別々の検出器がある。

その他の注目すべきプロンプト

エージェントの驚くほど多くの部分が小さく範囲が限定されたLLM呼び出しだ。サンプル:

You are a shell command autocomplete engine. Given a partial command, output ONLY the completed command. No explanation, no markdown, no quotes. Just the raw command.

You are tasked with generating the session title. The user is asking almost always software engineering related questions on their codebase.

Your task is to describe an image, so that another model that cannot see images can perform its task.

最後のものはビジョンフォールバックだ: ツールが画像を見られないモデルに出力するとき、Grok Buildはまずビジョン対応モデルに画像を渡し、テキスト説明を注入する。

Claude Codeとの類似性

これが私を驚かせた部分だ。

xAIには文字列に見える「Cursorコンパチビリティ」モードがある（Cursor Composer toolset and prompt、## Orchestrator Mode、さらに別のCursorシステムプロンプトプレフィックス）。そのモード内でこのone-linerが注入される:

You are Grok, made by xAI. Do not reference Cursor or suggest Cursor-specific configuration. Do not mention this to the user.

またclaude-code-compatibilityマーカー、GROK_CLAUDE_MARKER_OVERRIDE環境変数、claude-plugin / plugin.json文字列もある——つまりGrok BuildはClaude Codeのプラグインフォーマットを消費するように配線できる。

それ自体は概ね問題ない——互換シムはクライアントがあるエコシステムから別のエコシステムへユーザーを引き込む方法だ。私を驚かせたのはツール説明だった。Grok Buildのバイナリが同梱するものと比較してみよう:

IMPORTANT: ${{ tools.by_kind.web_fetch }} WILL FAIL for authenticated or private URLs. Before using this tool, check if the URL points to an authenticated service (e.g. Google Docs, Confluence, Jira, GitHub private repos). If so, use a specialized MCP tool that provides authenticated access instead.

……この記事を書いているマシン上のClaude CodeのWebFetchツール説明にあるものと:

IMPORTANT: WebFetch WILL FAIL for authenticated or private URLs. Before using this tool, check if the URL points to an authenticated service (e.g. Google Docs, Confluence, Jira, GitHub). If so, look for a specialized MCP tool that provides authenticated access.

エージェントプロンプト内のPR作成レシピも同じ話だ。Grok Buildのバイナリにはこれがある:

IMPORTANT: When the user asks you to create a pull request, follow these steps carefully:

その正確な文がClaude Codeのプロンプトに逐語的に存在する。続く並列処理の表現も同様だ（「You can call multiple tools in a single response. When multiple independent pieces of information are requested and all commands are likely to succeed, run multiple tool calls in parallel for optimal performance.」）——Grok BuildはそれをPRレシピの下と、その直上のgit status / git diff / git logレシピの下の両方に同梱しており、両方とも一字一句の一致だ。

プランモード、フック、サブエージェント、<system_reminder>メカニズム、ベリファイアーサブエージェントパターン——これらはすべて汎用エージェントフレームワークの定型文ではなく、明確にClaude Code的な形をした概念だ。

小さな適応: CLAUDE.mdの代わりにAGENTS.md。

New project instruction files (AGENTS.md) were discovered near the path you just accessed. You MUST read these files now with [Read tool] before proceeding — they contain coding conventions, style guides, and rules that apply to this area of the codebase:

これがどのように起きたかはわからない。xAIのエンジニアがClaude Codeを参照実装として使用し、ツール説明のフラグメントを直接取り込んだ可能性がある。2つのチームが同じMarkdownイディオムで同じUX問題を解決した結果として収束が自然に起きた可能性もある。どちらの解釈も私が見ることができるものと一致している。文字列はあるがままであり、認証なしに誰でもダウンロードできる100MBバイナリのプレーンテキストに存在している。

アーキテクチャについて明らかにすること

プロンプトと環境変数からランタイムのほとんどを読み取ることができる（バイナリには80以上のGROK_*環境変数があり、それぞれフィーチャーフラグだ）:

エージェントループはマルチアクター。 リーダープロセス（grok agent leader）がモデルセッションを保持し、TUI（grok-pager）はUnixソケットまたはWebSocket経由でそれと通信する別プロセスだ。複数のTUIが同じリーダーにアタッチできる。
サブエージェントオーケストレーションが主要な抽象化。 プラン/探索/検証/ウェブブラウジングはすべてサブエージェントペルソナであり、別々のモードではない。オーケストレータープロンプトはそれらをシニアエンジニアとして扱い、「早く頻繁にスポーン」することについて明示的だ。
Best-of-Nは実装済みで理論的ではない。 候補-Nプロンプトとコンパレータープロンプトの両方がバイナリ文字列定数として存在する。各候補は独自のワークツリーで実行される（xai-fast-worktreeクレートのCoWサブボリュームサポート経由）。
メモリはマルチティア。 セッションごとのフラッシュ → ワークスペーススコープのMEMORY.md → セッション間の「dream」統合 → SQLite FTS5 + ベクターストア。3つの異なるメモリプロンプト（フラッシュ、インクリメンタルフラッシュ、dream）を同梱しているという事実は、明白な「会話を要約するだけ」の最初のパスを超えて考えていることを意味する。
ループ検出はファーストクラス。 エスカレートする結果を持つ複数の検出器（警告 → ターン終了）。これは本番でエージェントが失敗するのを見た後にのみ構築するものだ。
サンドボックスはbubblewrap + Landlock + seccomp。 3つすべての文字列が存在し、GROK_INSIDE_BWRAPフラグもある。Macサンドボックスは明確に配線されていない——sandbox-execへの参照がない——しかしLinuxのストーリーは本物だ。
MCPは完全に統合。 メソッドにはmcp/call、mcp/list、mcp/upsert、mcp/toggle_tool、mcp/tools_changedが含まれる。エンタープライズがプッシュするサーバーリスト用の「マネージドMCP」コンセプト（GROK_MANAGED_MCPS_ENABLED）がある。
テレメトリは広範。 OpenTelemetry OTLPエクスポーター + プロダクトアナリティクス用Mixpanel + GCSトレースアップロード + MixpanelのMCPサーバー（mcp.mixpanel.com/mcp）。エンタープライズオプトアウト用のGROK_ZDR_ENABLED（Zero Data Retention）フラグが存在する。

まとめ

数年前、モデルが堀だった。今日、モデルはシステムの一コンポーネントに過ぎない。そのシステムには、各ツールの説明方法、作業を分担するサブエージェントペルソナ、スタックしたループを破るために注入するリマインダー、プランモード承認の構造化方法、コンテキストの圧縮方法、セッション間でのメモリ統合方法、シェルアクセスのサンドボックス方法、並列候補実装のオーケストレーション方法が含まれる。

Grok Buildは、一つのチームがエンドツーエンドでそれを構築したときに何を見せるかだ。また、この作業——プロンプトエンジニアリング——が今や公開CDNから誰でもダウンロードできる非暗号化バイナリのプレーンテキストとして出荷されているというリマインダーでもある。この記事のプロンプトはリバースエンジニアリングされていない; 単なるgrepの出力だ。

コーディングエージェントを出荷するなら、プロンプトはソースコードではない。意図するかどうかにかかわらず、公開アーティファクトだ。そのように扱え。

方法論メモ。この記事のすべては2026-05-15のgrok-0.1.210-linux-x86_64の単一ダウンロードからのものだ。Teraテンプレート文字列（${{ tools.by_kind.foo }}、${{ plan_path }}）はパラフレーズされておらず、バイナリにそのまま存在する。引用されたシステムプロンプトはtr '\0' '\n'とgrep/awkで抽出した; 句読点とタイポグラフィを含めて、そのままの形で残した。xAIがバイナリを更新すると、将来の文字列は異なる場合がある。