xAI का नया कोडिंग एजेंट Grok Build अपने प्रॉम्प्ट सादे पाठ में भेजता है

2026-05-15 · 14 min read

xAI ने कल Grok Build लॉन्च किया — Claude Code और Codex CLI का उनका जवाब। इंस्टॉल कमांड एक लाइन की है, बाइनरी उनके सर्वोच्च उपभोक्ता स्तर ($299/माह, $99 परिचयात्मक) के पीछे है, और एजेंट खुद OpenAI-संगत HTTP इंटरफेस के जरिए Grok 4 से बात करता है।

मैंने बाइनरी डाउनलोड की क्योंकि मैं जानना चाहता था कि यह किस भाषा में बनी है। मैं लगभग तीस शब्दशः सिस्टम प्रॉम्प्ट, हर आंतरिक सब-एजेंट के नाम, हर टूल विवरण, और आर्किटेक्चर की काफी पूरी तस्वीर लेकर बाहर आया। इसके लिए tr और grep से ज्यादा कुछ नहीं चाहिए था।

यह पोस्ट वही है जो मैंने पाया।

निष्कर्षण

https://x.ai/cli/install.sh पर इंस्टॉलर 302-रीडायरेक्ट करके Google Cloud Storage बकेट तक जाता है और आपके प्लेटफॉर्म के लिए एकल स्टैटिक-लिंक्ड ~100MB ELF डाउनलोड करता है:

$ curl -fsSL https://storage.googleapis.com/grok-build-public-artifacts/cli/stable
0.1.210
$ curl -fsSL https://storage.googleapis.com/grok-build-public-artifacts/cli/grok-0.1.210-linux-x86_64 -o grok-bin
$ head -c 4 grok-bin | xxd
00000000: 7f45 4c46                                .ELF

कंपाइलर सिग्नेचर: /rustc/<commit> डीबग पाथ, panicked at, RUST_BACKTRACE, साथ में tokio::, hyper::, reqwest:: — मानक async-HTTP स्टैक के साथ Rust। Cargo के प्रति-crate सोर्स पाथ <name>-<version>/src/<file>.rs के रूप में बेक किए जाते हैं, जो बाइनरी से सीधे पूरे डिपेंडेंसी ट्री को डंप करने की सुविधा देते हैं:

$ LC_ALL=C grep -aoE '[a-zA-Z][a-zA-Z0-9_-]{2,40}-[0-9]+\.[0-9]+\.[0-9]+/src/' grok-bin \
  | sed 's|/src/||' | sort -u | wc -l
410

410 अद्वितीय crate-version जोड़े। उनमें से: ratatui, crossterm, tree-sitter, पूर्ण gitoxide, async-lsp, lsp-types, rmcp (Model Context Protocol), rusqlite, bm25, tokio-tungstenite, oauth2, jsonwebtoken, ring, rustls, async-openai, notify, arboard, portable-pty, tower, axum। आर्किटेक्चर स्ट्रिंग देखने से पहले ही डिपेंडेंसी से पढ़ी जा सकती है: ratatui+crossterm TUI, tree-sitter पार्सिंग, एम्बेडेड LSP क्लाइंट, पूर्ण gitoxide, BM25 लेक्सिकल सर्च के साथ SQLite स्टोर, OAuth/OIDC ऑथ, OpenAI-संगत वायर फॉर्मेट, MCP, फाइल-वॉचिंग, क्लिपबोर्ड।

स्ट्रिंग बाकी बताती हैं। Rust कॉन्स्टेंट .rodata में null-terminated एम्बेड होती हैं। उन्हें grep-अनुकूल बनाने के लिए:

$ tr '\0' '\n' < grok-bin > strings.txt
$ LC_ALL=C grep -aE '^You are' strings.txt | head
You are a memory assistant. Extract ALL useful information from this...
You are a memory assistant performing an incremental update...
You are a technical lead orchestrating a team of senior-engineer subagents...
You are an expert software engineer acting as a code verifier.
You are a fast, read-only codebase exploration agent.
You are a read-only software architect. Explore the codebase and design...
You are a web browsing agent. You can navigate, interact with, and extract...
You are performing a dream — a reflective pass over memory files.
You are an AI coding agent. You operate in a workspace with a provided codebase.
You are Grok, made by xAI. Do not reference Cursor or suggest Cursor-specific...
You are a shell command autocomplete engine. Given a partial command, output...
You are tasked with generating the session title.
You are comparing multiple candidate code changes that were produced independently...
You are returning to plan mode after having previously exited it.

एजेंट की अधिकांश पहचान बस एक grep में वहीं है।

सिस्टम प्रॉम्प्ट (शब्दशः)

नीचे प्रत्येक उद्धरण एक शाब्दिक स्ट्रिंग कॉन्स्टेंट है। Tera-शैली के टेम्पलेट (${{ tools.by_kind.task }}, ${{ plan_path }}) रनटाइम पर सक्रिय टूल सेट के विरुद्ध रेंडर होते हैं।

मुख्य एजेंट

You are an AI coding agent. You operate in a workspace with a provided codebase.
Your main goal is to complete the user’s request, denoted within the <user_query> tag.

यही पूरा शीर्ष है। व्यवहार टूल विवरणों और इंजेक्ट किए गए <system_reminder> ब्लॉकों की लंबी सूची से आता है, प्रॉम्प्ट हेडर से नहीं।

सब-एजेंट ऑर्केस्ट्रेटर

You are a technical lead orchestrating a team of senior-engineer subagents. Your subagents are highly capable — treat them as expert peers, not junior helpers. Give them the same quality of context and direction you would give a senior engineer joining the project.
Your job is to think, plan, coordinate, and review. Their job is to explore, implement, and execute. Use them aggressively and liberally — spawn subagents early and often.

कम से कम चार सब-एजेंट पर्सोना हैं:

You are a fast, read-only codebase exploration agent.

You are a read-only software architect. Explore the codebase and design implementation plans.

You are a web browsing agent. You can navigate, interact with, and extract information from web pages.

You are an expert software engineer acting as a code verifier.

वेरिफायर सबसे दिलचस्प है: यह काम का मूल्यांकन करने के लिए किसी टास्क के बाद चलता है।

Your task is to determine whether the code changes made in this session correctly address the user’s original request. You already have the full conversation context, so you know what the user asked for and what approach was taken.

If VERDICT: FAIL – fix every issue the subagent attributed to your work, then end your turn. You are not required to fix pre-existing issues that you did not cause.

Best-of-N

Grok Build एक टास्क को N बार समानांतर में चला सकता है और विजेता चुन सकता है। दो प्रॉम्प्ट इसे सपोर्ट करते हैं:

You are candidate <number> of <N> independent implementations. Implement the task fully. When done, summarize your approach and the changes you made.

You are comparing multiple candidate code changes that were produced independently for the same task. Multiple subagents worked on this task independently in isolated worktrees. Your job is to choose the single best candidate.

प्रत्येक उम्मीदवार को अपना CoW git worktree मिलता है (xai-fast-worktree crate उपलब्ध होने पर btrfs subvolumes के जरिए इन्हें बनाता है, copy-on-write git worktree add पर फॉलबैक के साथ)।

मेमोरी (`/flush`, `/dream`, और क्रॉस-सेशन स्टोर)

दो मेमोरी-राइट प्रॉम्प्ट और एक मेमोरी-रीड इंटीग्रेशन हैं।

प्रति-सेशन डिस्टिलेशन, /flush या आइडल पर ट्रिगर:

You are a memory assistant. Extract ALL useful information from this conversation that would help you be more effective in future sessions with this user. Write a concise markdown summary with ## headers covering:

इंक्रीमेंटल अपडेट (एक ही सेशन में बाद के flushes पर चलाए जाते हैं):

You are a memory assistant performing an incremental update. The previous flush output for this session is shown below. Extract ONLY information that is NEW since the previous flush — do not repeat anything already captured.

और फिर, अलग से, एक “dream” पास जो सेशनों में जमा हुए सेशन लॉग को टिकाऊ मेमोरी में समेकित करता है:

You are performing a dream — a reflective pass over memory files. Synthesize recent session logs into durable, well-organized memories so future sessions orient quickly.
If the session logs contain nothing worth persisting, respond with NO_REPLY.

यह बैकग्राउंड में चलता है। अंतर्निहित स्टोर ~/.grok/memory/index.sqlite पर एक SQLite डेटाबेस है जिसमें FTS5 कीवर्ड सर्च और chunk एम्बेडिंग पर वैकल्पिक वेक्टर KNN है — वे bm25 और एम्बेडिंग पाइपलाइन सीधे इन-प्रोसेस भेजते हैं, कोई बाहरी वेक्टर DB नहीं।

कम्पैक्शन

जब कंटेक्स्ट भर जाए:

Your task is to create a detailed summary of the conversation so far, paying close attention to the user’s explicit requests and your previous actions.
IMPORTANT: Do NOT use any tools. You MUST respond with ONLY the <summary>...</summary> block as your text output.

और रिज्यूम पर:

Continue the conversation from where it left off without asking the user any further questions. Resume directly - do not acknowledge the summary, do not recap what was happening, do not preface with “I’ll continue” or similar.

“सारांश को स्वीकार मत करो” का नियम वह है जो बहुत से एजेंट गलत करते हैं — Grok Build इसके बारे में स्पष्ट है।

प्लान मोड

प्लान मोड एक संरचित रीड-ओनली फेज है। जब यह सक्रिय हो तो हर टर्न में इंजेक्ट किया जाने वाला रिमाइंडर:

Plan mode is active. The user indicated that they do not want you to execute yet – you MUST NOT make any edits (with the exception of the plan file mentioned below), run any non-readonly tools (including changing configs or making commits), or otherwise make any changes to the system. This supersedes any other instructions you have received.

प्लान आउटपुट फॉर्मेट विस्तार से निर्धारित है:

The plan you create should be properly formatted in markdown, using appropriate sections and headers. The plan should be very concise and actionable, providing the minimum amount of detail for the user to understand and action the plan. It may be helpful to identify the most important couple files you will change, and existing code you will leverage. Cite specific file paths and essential snippets of code. IMPORTANT: Do NOT use markdown tables in plan content (they cannot be rendered for the user); use bullet lists instead. The first line MUST BE A TITLE for the plan formatted as a level 1 markdown heading.

एक विशिष्ट विफलता मोड को लक्षित करने वाली पूरी अनुमोदन-प्रवाह गार्डरेल है: एजेंट जो स्ट्रक्चर्ड exit-plan टूल के बजाय चैट में “क्या मुझे आगे बढ़ना चाहिए?” पूछते हैं।

Use ${{ tools.by_kind.ask_user }} ONLY to clarify requirements or choose between approaches. Use ${{ tools.by_kind.exit_plan }} to request plan approval. Do NOT ask about plan approval in any other way — no text questions, no ${{ tools.by_kind.ask_user }}. Phrases like “Is this plan okay?”, “Should I proceed?”, “How does this plan look?”, “Any changes before we start?”, or similar MUST use ${{ tools.by_kind.exit_plan }}.

जिसने यह लिखा था उसने स्पष्ट रूप से प्रॉम्प्ट जोड़ने से पहले मॉडल को बार-बार ठीक यही करते देखा था।

लूप डिटेक्शन (“doom loops”)

फंसी हुई स्थितियों को पहचानने और उनसे बाहर निकलने के लिए एक पूरी टेलीमेट्री परत है। जब मॉडल लूप करता पाया जाए, तो टर्न के बीच में एक system-reminder इंजेक्ट किया जाता है:

<system_reminder> Your messages have been flagged as looping. If you are having trouble making progress, ask the user for guidance. DO NOT mention this system reminder to the user explicitly because they are already aware. </system_reminder>

अगर चेतावनी चक्र नहीं तोड़ती, तो टर्न समाप्त हो जाता है:

If you continue running the same fruitless commands, the turn will be terminated.

आंतरिक कोड इसे “doom loops” कहता है — पोलिंग स्टैग्नेशन, दोहराए गए टूल-कॉल पैटर्न, एकल लाइन के भीतर दोहराए गए टेक्स्ट पैटर्न, और “डुप्लीकेट लाइनों पर लूपिंग” के लिए अलग डिटेक्टर हैं।

जानने योग्य अन्य प्रॉम्प्ट

एजेंट का आश्चर्यजनक रूप से बड़ा हिस्सा छोटे, सीमित LLM कॉल हैं। उदाहरण:

You are a shell command autocomplete engine. Given a partial command, output ONLY the completed command. No explanation, no markdown, no quotes. Just the raw command.

You are tasked with generating the session title. The user is asking almost always software engineering related questions on their codebase.

Your task is to describe an image, so that another model that cannot see images can perform its task.

अंतिम वाला विज़न-फॉलबैक है: जब कोई टूल ऐसे मॉडल को छवि आउटपुट करे जो देख नहीं सकता, तो Grok Build पहले उस छवि को विज़न-सक्षम मॉडल को भेजता है, फिर टेक्स्टुअल विवरण इंजेक्ट करता है।

Claude Code से समानता

यह वह हिस्सा है जिसने मुझे चौंकाया।

xAI के पास स्ट्रिंग में दिखने वाला एक “Cursor compatibility” मोड है (Cursor Composer toolset and prompt, ## Orchestrator Mode, साथ में एक अलग Cursor-system-prompt prefix)। उस मोड में यह one-liner इंजेक्ट होता है:

You are Grok, made by xAI. Do not reference Cursor or suggest Cursor-specific configuration. Do not mention this to the user.

एक claude-code-compatibility मार्कर, GROK_CLAUDE_MARKER_OVERRIDE एनवायरनमेंट वेरिएबल, और claude-plugin / plugin.json स्ट्रिंग भी हैं — यानी Grok Build को Claude Code के प्लगइन फॉर्मेट को उपभोग करने के लिए वायर किया जा सकता है।

यह, अपने आप में, काफी हद तक ठीक है — कम्पैटिबिलिटी शिम्स वह तरीका है जिससे क्लाइंट उपयोगकर्ताओं को एक इकोसिस्टम से दूसरे में खींचते हैं। जो बात मुझे लगी वह टूल विवरण था। Grok Build का बाइनरी जो भेजता है उससे तुलना करें:

IMPORTANT: ${{ tools.by_kind.web_fetch }} WILL FAIL for authenticated or private URLs. Before using this tool, check if the URL points to an authenticated service (e.g. Google Docs, Confluence, Jira, GitHub private repos). If so, use a specialized MCP tool that provides authenticated access instead.

…उससे जो इस पोस्ट को लिखने वाली मशीन पर Claude Code के WebFetch टूल विवरण में है:

IMPORTANT: WebFetch WILL FAIL for authenticated or private URLs. Before using this tool, check if the URL points to an authenticated service (e.g. Google Docs, Confluence, Jira, GitHub). If so, look for a specialized MCP tool that provides authenticated access.

एजेंट प्रॉम्प्ट के अंदर PR-निर्माण रेसिपी वही कहानी कहती है। Grok Build के बाइनरी में है:

IMPORTANT: When the user asks you to create a pull request, follow these steps carefully:

वही सटीक वाक्य Claude Code के प्रॉम्प्ट में शब्दशः है। उसके बाद आने वाला पैरेललिज्म वाक्यांश भी (“You can call multiple tools in a single response. When multiple independent pieces of information are requested and all commands are likely to succeed, run multiple tool calls in parallel for optimal performance.”) — Grok Build इसे PR रेसिपी और उससे ठीक ऊपर git status / git diff / git log रेसिपी के नीचे भेजता है, दोनों शब्द-दर-शब्द मिलान।

प्लान मोड, हुक्स, सब-एजेंट, <system_reminder> मैकेनिज्म, वेरिफायर-सब-एजेंट पैटर्न — ये सभी विशिष्ट रूप से Claude-Code-आकार के अवधारणाएं हैं, जेनेरिक एजेंट-फ्रेमवर्क बॉयलरप्लेट नहीं।

एक छोटा अनुकूलन: CLAUDE.md के बजाय AGENTS.md।

New project instruction files (AGENTS.md) were discovered near the path you just accessed. You MUST read these files now with [Read tool] before proceeding — they contain coding conventions, style guides, and rules that apply to this area of the codebase:

मुझे नहीं पता यह कैसे हुआ। हो सकता है xAI के एक इंजीनियर ने Claude Code को रेफरेंस इंप्लीमेंटेशन के रूप में उपयोग किया और टूल-विवरण टुकड़ों को सीधे खींचा। हो सकता है कि अभिसरण दो टीमों के एक ही Markdown मुहावरे में समान UX समस्याओं को हल करने का प्राकृतिक परिणाम हो। दोनों व्याख्याएं जो मैं देख सकता हूं उससे संगत हैं। स्ट्रिंग जो हैं वही हैं, और वे एक 100MB बाइनरी में सादे पाठ में बैठी हैं जिसे कोई भी बिना प्रमाणीकरण के डाउनलोड कर सकता है।

यह आर्किटेक्चर के बारे में क्या प्रकट करता है

आप प्रॉम्प्ट और एनवायरनमेंट वेरिएबल से अधिकांश रनटाइम पढ़ सकते हैं (बाइनरी में 80+ GROK_* एनवायरनमेंट वेरिएबल हैं, प्रत्येक एक फीचर फ्लैग):

एजेंट लूप मल्टी-एक्टर है। एक लीडर प्रोसेस (grok agent leader) मॉडल सेशन रखता है; TUI (grok-pager) एक अलग प्रोसेस है जो Unix सॉकेट या WebSocket के जरिए इससे बात करता है। कई TUI एक ही लीडर से जुड़ सकते हैं।
सब-एजेंट ऑर्केस्ट्रेशन मुख्य अमूर्तन है। प्लान/एक्सप्लोर/वेरिफाई/वेब-ब्राउज़ सभी सब-एजेंट पर्सोना हैं, अलग मोड नहीं। ऑर्केस्ट्रेटर प्रॉम्प्ट उन्हें सीनियर इंजीनियरों की तरह व्यवहार करने और “जल्दी और अक्सर स्पॉन करने” के बारे में स्पष्ट है।
Best-of-N लागू है, सैद्धांतिक नहीं। उम्मीदवार-N प्रॉम्प्ट और एक comparator प्रॉम्प्ट दोनों बाइनरी स्ट्रिंग कॉन्स्टेंट के रूप में मौजूद हैं। प्रत्येक उम्मीदवार अपने worktree में चलता है (xai-fast-worktree crate की CoW subvolume सपोर्ट के जरिए)।
मेमोरी मल्टी-टियर है। प्रति-सेशन flush → workspace-scoped MEMORY.md → क्रॉस-सेशन “dream” समेकन → SQLite FTS5 + वेक्टर स्टोर। यह तथ्य कि वे तीन अलग मेमोरी प्रॉम्प्ट (flush, incremental flush, dream) भेजते हैं, मतलब है कि उन्होंने स्पष्ट “बस बातचीत सारांशित करें” पहले पास से आगे सोचा है।
लूप डिटेक्शन फर्स्ट-क्लास है। एस्केलेटिंग परिणामों के साथ कई डिटेक्टर (चेतावनी → टर्न समाप्त करें)। यह वह चीज है जो आप केवल तब बनाते हैं जब आप प्रोडक्शन में एजेंटों को विफल होते देख चुके हों।
सैंडबॉक्स bubblewrap + Landlock + seccomp है। तीनों के लिए स्ट्रिंग मौजूद हैं, साथ में GROK_INSIDE_BWRAP फ्लैग। Mac सैंडबॉक्सिंग स्पष्ट रूप से वायर्ड नहीं है — कोई sandbox-exec संदर्भ नहीं — लेकिन Linux की कहानी असली है।
MCP पूरी तरह integrated है। मेथड में mcp/call, mcp/list, mcp/upsert, mcp/toggle_tool, mcp/tools_changed शामिल हैं। एंटरप्राइज-पुश सर्वर लिस्ट के लिए “managed MCPs” कॉन्सेप्ट (GROK_MANAGED_MCPS_ENABLED) है।
टेलीमेट्री व्यापक है। OpenTelemetry OTLP एक्सपोर्टर + Mixpanel प्रोडक्ट एनालिटिक्स + GCS ट्रेस अपलोड + Mixpanel MCP सर्वर (mcp.mixpanel.com/mcp)। एंटरप्राइज ऑप्ट-आउट के लिए GROK_ZDR_ENABLED (Zero Data Retention) फ्लैग मौजूद है।

निष्कर्ष

कुछ साल पहले, मॉडल ही खाई था। आज मॉडल एक सिस्टम में एक घटक है जिसमें शामिल है: आप हर टूल का वर्णन कैसे करते हैं, किन सब-एजेंट पर्सोना पर आप काम बांटते हैं, फंसे हुए लूप तोड़ने के लिए आप कौन से रिमाइंडर इंजेक्ट करते हैं, आप प्लान-मोड अनुमोदन कैसे संरचित करते हैं, आप कंटेक्स्ट कैसे कम्पैक्ट करते हैं, आप सेशनों में मेमोरी कैसे समेकित करते हैं, आप शेल एक्सेस को सैंडबॉक्स कैसे करते हैं, आप समानांतर उम्मीदवार इम्प्लीमेंटेशन कैसे ऑर्केस्ट्रेट करते हैं।

Grok Build वह है जो तब दिखता है जब एक टीम इसे शुरू से अंत तक बनाती है। यह एक अनुस्मारक भी है कि यह काम — प्रॉम्प्ट इंजीनियरिंग — अब एक सार्वजनिक CDN से कोई भी डाउनलोड कर सकने वाले अनएन्क्रिप्टेड बाइनरी में सादे पाठ के रूप में भेजा जाता है। इस पोस्ट के प्रॉम्प्ट रिवर्स-इंजीनियर नहीं किए गए थे; वे बस grep आउटपुट हैं।

अगर आप एक कोडिंग एजेंट भेजते हैं, तो आपके प्रॉम्प्ट सोर्स कोड नहीं हैं। वे एक सार्वजनिक कलाकृति हैं चाहे आप उन्हें ऐसा बनाने का इरादा रखते हों या नहीं। उनके साथ वैसा ही व्यवहार करें।

पद्धति नोट। इस पोस्ट में सब कुछ 2026-05-15 पर grok-0.1.210-linux-x86_64 के एकल डाउनलोड से है। Tera टेम्पलेट स्ट्रिंग (${{ tools.by_kind.foo }}, ${{ plan_path }}) बाइनरी में शब्दशः हैं, पैराफ्रेज नहीं। उद्धृत सिस्टम प्रॉम्प्ट tr '\0' '\n' के बाद grep/awk से निकाले गए; मैंने उन्हें विराम चिह्न और टाइपोग्राफी सहित बिल्कुल वैसे ही छोड़ा जैसे वे दिखाई देते हैं। अगर xAI बाइनरी अपडेट करता है, तो भविष्य की स्ट्रिंग अलग हो सकती हैं।