A real, end-to-end, no-bullshit guide to xAI's Grok and the platform it lives on. Every model, every surface, every plan, every feature, the algorithm that decides what you see, the API to plug it into your own stack, and — because we may as well say the quiet part out loud — the playbook for actually winning on X.
If you can type a sentence and click a button, you have everything you need. If you can also tolerate a little chaos, you have everything you need to thrive.
Grok is one model family with five front doors. The X app is the most lived-in — Grok sits in the side rail next to your timeline and knows what's happening on the platform right now. grok.com is the focused web client for serious work — long context, file uploads, image generation, voice. The iOS/Android Grok apps bring Voice Mode and the Companion characters with you. The xAI API at api.x.ai is what you wire into your own stack — and it's drop-in OpenAI-compatible, so most existing clients work with a one-line endpoint change. And the Tesla integration puts Grok in every car shipped after the late-2025 software update.
The most useful version of Grok for most people. It lives inside the X app and the X web client, knows the timeline you're scrolling, can summarize a thread, draft a reply, fact-check a viral claim against live posts, or explain "what is everyone losing their minds about right now." Premium and Premium+ unlock it.
The standalone web app: no timeline, no notifications, just you and Grok. Drag in PDFs, code, images, spreadsheets. Pop into DeepSearch for a research mission, Think Mode for hard problems, Imagine for video, Voice for hands-free. This is where the heavy lifting happens.
Free downloads on the App Store and Play Store. Voice Mode runs full-duplex over Bluetooth (yes, with earbuds, while you drive). The Companions (Ani, Rudi, Bad Rudi, and the rest) live here too. The app runs as a second account under the same login as your X handle.
One curl to the /v1/chat/completions endpoint and you have a frontier model in your stack. Set OPENAI_BASE_URL=https://api.x.ai/v1 (and OPENAI_API_KEY to your xAI key) and most existing OpenAI clients work unmodified. Real-time X data access is exposed as a tool.
Every Tesla with the late-2025 software update has Grok wired into the center screen. Voice Mode by default, hands-free, integrated with navigation ("Grok, find me a barbecue place 20 minutes away that's still open"). The car is just another client.
Grok is xAI's frontier model. The thing that makes it different from everything else (the only meaningful moat in this entire industry, honestly) is that it's the only model with real-time, native, deeply integrated access to X. Every other model gets the world through a web search API ten minutes after the fact. Grok gets the world as the world is typing it. That is the entire pitch. Everything else is downstream of that.
Grok queries the live X firehose directly. Ask it "what is happening in Memphis right now" and you don't get yesterday's news — you get the last 12 minutes. "Summarize the replies under this post" is a one-shot operation. No other model has this, and it's not licensable. xAI owns the platform.
Grok is the only frontier model that will actually push back, crack a joke, or call something dumb. Two response modes — Regular for grown-up work, Fun for everything else. The lawyers got the dial down from 12 to 11. It is, by design, less corporate than the alternatives. This is a feature.
Trained on Colossus in Memphis, a 200,000+ H100/H200/B200 cluster whose first 100,000 H100s came online in 122 days, and the largest contiguous training cluster on Earth as of mid-2026. Inference workloads now also run on Starcenter, the orbital compute pod that went live this week. The whole stack is vertical: silicon → datacenter → rocket → orbit.
The stated training objective: maximize truth, minimize sycophancy. You will not get the "what a great question!" preamble. You will get an answer, sometimes with the question itself questioned. If it doesn't know, it says so. If you're wrong, it tells you. This is rare and you will miss it the moment you go back to a sycophantic model.
Each chip points to a section on this page. Solid borders ship today. Dashed borders are early-access, recently announced, or on the rumored roadmap. The orbital ones — yes, plural — are real and as of this week, online.
Grok is a model family, not a single model. The right one for the job depends on whether you care more about speed, depth, image, video, voice, or running it cheap at scale. Versions and exact capabilities change roughly every two months — this list is current to May 2026; check x.ai/models before you commit a production decision to anything specific.
| Model | Context | What it's for | Where it lives | Modality |
|---|---|---|---|---|
| Grok 4 (grok-4-latest) | 256k → 2M* | The frontier flagship. Default for hard reasoning, coding, agentic work. The benchmark numbers people are quoting on X. | x.com · grok.com · API · Tesla | text · image-in |
| Grok 4 Heavy (grok-4-heavy) | 256k | Multi-agent variant: spawns parallel reasoning threads, votes on the best answer. The thing you reach for when "this has to be right." | SuperGrok / Premium+ · API | text · image-in |
| Grok 4 Fast (grok-4-fast) | 2M | Cheap, fast, surprisingly good. The everyday workhorse. What X's in-app "Quick Answer" runs on. | x.com · API | text · image-in |
| Grok 3 (grok-3) | 131k | Previous flagship, still supported. Pick this only if your eval harness is pinned to it. | API | text · image-in |
| Aurora (aurora) | n/a | Image generation. The "draws hands correctly" model. Photorealistic, fast, fewer guardrails than the competition. | x.com · grok.com · Imagine | text → image |
| Imagine (imagine) | n/a | Video. 5–20 second clips with synced audio. Generates from a prompt or extends a still. Native on grok.com and the mobile apps. | grok.com · iOS · Android | text → video+audio |
| Grok Voice (voice) | session | Full-duplex speech-to-speech. Real-time interruptible. Multiple personalities (default, Sexy, Unhinged, Storyteller, Conspiracy, Meditation). | Mobile apps · Tesla · API (beta) | audio ↔ audio |
| Grok Code (grok-code-fast) | 256k | Coding-tuned variant. Optimized for tool use, file edits, and the agent loop. Free in Cursor and other partner IDEs through 2026. | API · Cursor · partners | text |
| Grok 5 (grok-5 · preview) | 2M | Next flagship. Trained on Colossus 2. Closed preview for SuperGrok Heavy subscribers and select API partners as of May 2026. | preview only | text · multimodal |
* 2M context is the announced spec for Grok 4 Fast; the standard Grok 4 ships 256k today with a 2M long-context mode rolling out on the API.
Default to grok-4-latest. Drop to grok-4-fast when latency matters or you're billing per-token. Reach for grok-4-heavy when correctness matters more than money. Use aurora and imagine when you need pixels and frames. The voice model is its own thing: it doesn't take text input through the chat endpoint.
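As a rough illustration, the rule of thumb above can be written as a small routing function. This is only a sketch: the `Needs` fields and the order in which the checks run (correctness before cost, for instance) are assumptions made for the example, and only the model IDs come from the table.

```python
from dataclasses import dataclass


@dataclass
class Needs:
    """Hypothetical description of a request; field names are illustrative."""
    output: str = "text"            # "text", "image", or "video"
    latency_sensitive: bool = False
    cost_sensitive: bool = False
    must_be_right: bool = False


def pick_model(needs: Needs) -> str:
    """Map the rule of thumb onto the model IDs listed in the table."""
    if needs.output == "image":
        return "aurora"
    if needs.output == "video":
        return "imagine"
    # Assumption: correctness outranks cost and latency when both are set.
    if needs.must_be_right:
        return "grok-4-heavy"
    if needs.latency_sensitive or needs.cost_sensitive:
        return "grok-4-fast"
    return "grok-4-latest"


print(pick_model(Needs()))
print(pick_model(Needs(latency_sensitive=True)))
print(pick_model(Needs(output="video")))
```

Voice is left out on purpose, since it does not go through the chat endpoint.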
Three ways to pay. (Four, if you count the always-free tier on X — which a lot of people do.) X Premium bundles Grok into your blue checkmark. SuperGrok is the standalone subscription if you don't post but you want the model. xAI API bills per-token for developers. Mix and match — the API key is independent from the consumer plan.
| Plan | Price | Grok access | X platform | Best for |
|---|---|---|---|---|
| Free | $0 | Limited Grok 4 Fast — a few queries a day. Image gen rate-limited. | Standard posting, no Premium features. | Lurkers, tire-kickers |
| X Premium "Blue" | $8/mo | Grok 4 Fast unlimited, Grok 4 generous daily cap, Aurora image gen. | Blue checkmark, longer posts (10k chars), edit, post anywhere/longer videos, ad split. | Active posters |
| Premium+ | $40/mo | Everything in Premium plus Grok 4 Heavy access, DeepSearch unlimited, Think Mode unlimited, Voice Mode, Companions, larger Aurora batches. | Higher rev share, no ads in your timeline, X Pro web client, priority support. | Creators, power users, people who'd otherwise pay for two products |
| SuperGrok | $30/mo | Grok-only subscription — Heavy, DeepSearch, Voice, Companions, Imagine video. Identical model access to Premium+ minus the platform perks. | No X platform perks. | People who want the model but don't post |
| SuperGrok Heavy | $300/mo | Grok 5 preview access, far higher rate limits, priority queue on Imagine and Voice, longer context. | No X platform perks. | Builders, researchers, founders who use it 100×/day |
| X Pro creator workstation | included in Premium+ | Full Premium+ Grok access from inside X Pro itself. | Multi-column TweetDeck-style interface with Grok integrated as a column. Live ranking, custom lists, scheduling, analytics, mass-DM, advanced search. | Anyone who manages an account seriously |
Sign up at console.x.ai, generate a key, point at https://api.x.ai/v1. Billing is per-million tokens. Most useful prices to know:
| Model | Input / 1M tok | Cached input / 1M | Output / 1M tok | Notes |
|---|---|---|---|---|
| grok-4-latest | $3.00 | $0.75 | $15.00 | Flagship. Long-context tier doubles after 128k. |
| grok-4-heavy | $6.00 | $1.50 | $30.00 | Multi-agent. Bills as one request. |
| grok-4-fast | $0.20 | $0.05 | $0.50 | The price-performance pick. 2M context. |
| grok-code-fast | $0.20 | $0.05 | $1.50 | Coding-tuned. Free in Cursor partnership. |
| aurora | ~$0.04 / image | | | 1024² · pricing tiered by resolution |
| imagine | ~$0.40 / 5-second clip | | | 720p · with audio |
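To make the per-token rates concrete, here is a minimal cost estimator built from the text-model rows of the table. It is a sketch under stated assumptions: `estimate_cost` and its parameters are names invented for this example, the prices are copied from the table and will drift, and the long-context surcharge and any search or tool fees are ignored.

```python
# USD per 1M tokens, copied from the pricing table (text models only).
PRICES = {
    "grok-4-latest":  {"input": 3.00, "cached": 0.75, "output": 15.00},
    "grok-4-heavy":   {"input": 6.00, "cached": 1.50, "output": 30.00},
    "grok-4-fast":    {"input": 0.20, "cached": 0.05, "output": 0.50},
    "grok-code-fast": {"input": 0.20, "cached": 0.05, "output": 1.50},
}


def estimate_cost(model: str, input_tokens: int, output_tokens: int,
                  cached_tokens: int = 0) -> float:
    """Estimate the cost of one request in USD.

    cached_tokens is the portion of input_tokens billed at the cached rate.
    Assumption: no long-context surcharge and no search or tool fees.
    """
    if cached_tokens > input_tokens:
        raise ValueError("cached_tokens cannot exceed input_tokens")
    price = PRICES[model]
    fresh_tokens = input_tokens - cached_tokens
    total = (fresh_tokens * price["input"]
             + cached_tokens * price["cached"]
             + output_tokens * price["output"])
    return total / 1_000_000


# Same request (10k tokens in, 1k tokens out) on the flagship and the fast model.
for model in ("grok-4-latest", "grok-4-fast"):
    print(f"{model}: ${estimate_cost(model, 10_000, 1_000):.4f}")
```

Running the same token counts through two models side by side is a quick way to see how much the flagship premium costs for a given workload.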
- Set return_citations: false when you only need the facts, not the URLs.
- reasoning_effort: "low" vs "high" can be a 5× cost swing on the same prompt. Default to low.
- Set max_tokens. The model will not bill you for unborn tokens, but it will keep going until it hits the natural stop if you don't bound it.

There are maybe seven Grok features worth knowing by name. The rest are table stakes. These are the ones where the demo actually impressed you and you went "oh."
DeepSearch is a multi-step research agent. You ask a question, Grok plans a search strategy, fans out across X posts, web pages, news, and (if you point it there) PDFs you uploaded. It compiles a structured report with inline citations. The output is something a junior analyst would have spent three hours producing. It costs you about ninety seconds. Available in the X app and at grok.com on Premium+, unlimited on SuperGrok and SuperGrok Heavy.
Extended chain-of-thought reasoning, on demand. You toggle it on, Grok takes longer (10s to 2 minutes), and the answer is materially better on anything that resembles a math, logic, or planning problem. Toggle it off for chat. The trade-off is exactly what you'd expect — Think Mode burns more compute and more tokens, so it lives behind the paywall.
Aurora is xAI's native image model. Photorealistic, fast (sub-3-second generations), and trained on a corpus that doesn't pretend humans don't exist. You can generate inline in any Grok chat: just ask. On grok.com there's a dedicated Imagine studio with a prompt enhancer, aspect-ratio picker, batch generation, and the all-important "make four variations of this one" loop. The watermark is bottom-right; Premium+ removes it.
Text-to-video, image-to-video, and audio-baked-in. 5 to 20 second clips at 720p, 1080p coming. You can generate a still in Aurora, then point Imagine at it to bring it to life. The Companions on the mobile app are puppeteered by Imagine in real time — when Ani waves at you, that's the same engine.
Voice Mode is full-duplex, sub-300ms latency, interruptible. Genuinely conversational: you can talk over it, change the subject mid-sentence, ask it to remember to "bring up that idea about the boats again later." Multiple personalities live behind the dropdown. The default is the safe one; the rest get more flavorful (and on X they're labelled with little warning icons; you'll figure out why).
Companions are animated, conversational characters that live in the mobile Grok app. Ani is the anime one. Rudi is the red panda. Bad Rudi is Rudi after a hard day. They have personalities, persistent memory of your conversations, and they exist somewhere on a spectrum between "novelty toy" and "social experiment that should probably have an ethics review board." They are also, undeniably, the most-talked-about UX in AI right now. Requires Premium+.
Across-session, you-control memory. Grok writes "facts about you" into a memory file as conversations progress — your name, your projects, your preferences. You can read, edit, or wipe the memory at grok.com/settings/memory. Memory follows your account across X, grok.com, the mobile app, and the Tesla — if you tell Grok in the car that you don't drink coffee, the next time you ask for breakfast recommendations on grok.com, it remembers.
Project-shaped containers. Drag in files (PDF, code, spreadsheets, images, videos), pin a system prompt, set a default model, and every chat inside the workspace shares context. The Anthropic equivalent is "Projects," the OpenAI equivalent is "Custom GPTs" — Grok's version trades a little less polish for direct hooks into X data (a workspace can be wired to a specific list or query).
The most important surface, because it's where the model meets the network. Open any post, hit the Grok button, and you get a context menu — summarize this thread, explain this for me, fact-check the claim, draft a reply in my voice, generate a quote-tweet hook, translate, find similar posts. The model has the entire post graph as its working memory. This is the single feature that justifies the $40/mo.
If you've used the OpenAI API, you already know how to use this one. xAI made the API surface deliberately OpenAI-compatible: same endpoints, same response shape, same tool-call schema. The only things you change are the base URL and the model name. Then, optionally, you opt into the things that make xAI different: Live Search for X data, the Heavy multi-agent endpoint, and image/video generation.
Export your key: export XAI_API_KEY=<your-key>. If you want existing OpenAI clients to work, also set OPENAI_API_KEY=$XAI_API_KEY and OPENAI_BASE_URL=https://api.x.ai/v1. Then run the curl below.

```bash
curl https://api.x.ai/v1/chat/completions \
  -H "Authorization: Bearer $XAI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "grok-4-latest",
    "messages": [
      {"role": "system", "content": "You are Grok. Be concise."},
      {"role": "user", "content": "What is everyone arguing about on X today?"}
    ],
    "search_parameters": {"mode": "on", "sources": [{"type": "x"}]}
  }'
```
```python
# works with `pip install openai`; no xAI-specific SDK required
import os

from openai import OpenAI

# Point the stock OpenAI client at the xAI endpoint.
client = OpenAI(
    api_key=os.environ["XAI_API_KEY"],
    base_url="https://api.x.ai/v1",
)

resp = client.chat.completions.create(
    model="grok-4-latest",
    messages=[
        {"role": "system", "content": "You are Grok."},
        {"role": "user", "content": "Summarize the replies under @elonmusk's last post."},
    ],
    # search_parameters is not part of the OpenAI schema, so it goes in extra_body.
    extra_body={"search_parameters": {"mode": "on"}},
)
print(resp.choices[0].message.content)
```
Pass search_parameters in the request body and Grok will go pull live data before composing the answer. The interesting part is the sources array — you can scope it to X only, web only, news only, or a mix, and constrain to handles, lists, geographies, or time windows.
| source type | What it pulls | Useful filters |
|---|---|---|
| x | Live X posts, replies, quote-tweets, post metadata. | included_x_handles, x_search_query, from_date, to_date |
| web | The open web, ranked. | included_websites, country |
| news | News outlets and their syndicated wires. | included_websites, country |
| rss | A specific RSS feed you point it at. | links array |
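One way to keep these filters tidy is a small helper that assembles the search_parameters dict before it goes into the request body. The sketch below is illustrative only: `build_search_parameters` is a hypothetical helper, the filter names are taken from the table, and placing from_date and to_date at the top level of search_parameters (rather than inside the x source) is an assumption to check against the API reference, as is the ISO date format.

```python
from typing import Any, Optional


def build_search_parameters(
    x_handles: Optional[list[str]] = None,
    websites: Optional[list[str]] = None,
    country: Optional[str] = None,
    from_date: Optional[str] = None,
    to_date: Optional[str] = None,
) -> dict[str, Any]:
    """Assemble a search_parameters dict from optional filters."""
    sources: list[dict[str, Any]] = []
    if x_handles is not None:
        sources.append({"type": "x", "included_x_handles": x_handles})
    if websites is not None or country is not None:
        web: dict[str, Any] = {"type": "web"}
        if websites is not None:
            web["included_websites"] = websites
        if country is not None:
            web["country"] = country
        sources.append(web)

    params: dict[str, Any] = {"mode": "on"}
    if sources:
        params["sources"] = sources
    # Assumption: date bounds apply to the whole search, not to one source.
    if from_date is not None:
        params["from_date"] = from_date
    if to_date is not None:
        params["to_date"] = to_date
    return params


print(build_search_parameters(x_handles=["xai"], from_date="2026-05-01"))
```

With the OpenAI client, the result would be passed as `extra_body={"search_parameters": build_search_parameters(...)}`.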
xAI publishes xai-sdk (Python and TypeScript) with the higher-level agent primitives: tool registration, structured outputs (JSON Schema), the multi-agent grok-4-heavy client, streaming with mid-stream tool calls, and a live_search() helper that wraps the parameters above. Source on GitHub, MIT-licensed.
xAI's API speaks MCP on both sides. Grok can act as an MCP client and call any tool server you point it at (Linear, Slack, Postgres, your own services). Grok also exposes itself as an MCP server — meaning Claude Code, Codex, Cursor, and other MCP-aware clients can call Grok as a tool, for the live X data alone if nothing else.
Grok is trained on Colossus — the cluster xAI built in a former Electrolux appliance factory on Paul R. Lowry Road in Memphis. The site went from empty floor to 100,000 H100s online in 122 days, a build pace that has never been matched in the industry. It is, as of mid-2026, the single largest contiguous AI training cluster on Earth, full stop.
The thesis behind Colossus is the same thesis behind everything Musk does: vertical integration plus brute-force scheduling beats clever optimization. The competitors leased capacity. xAI built theirs. The 122-day build cycle was the moat, not the model.
Announced May 20, 2026 (yesterday) at a joint SpaceX × xAI presser. Starcenter is the first operational compute pod in low Earth orbit. The pitch is straightforward and, in retrospect, embarrassingly obvious: a solar array in space receives sunlight 24/7/365 at roughly 1.4× the irradiance of Earth's surface, and the cooling cost is zero rubles because the universe is 3 Kelvin and very large. If you can get the silicon up there at a sane cost, the energy economics become literally impossible to compete with.
Ignore the "datacenter in space" headline; the engineering argument is more interesting than the marketing one. Electricity for a terrestrial GPU rack costs about $0.04 per kWh on a great day, $0.12 on a bad one. The capex of the rack itself is amortized over five years, and roughly 30% of its cost-of-ownership is electricity and another 15% is cooling. In orbit, electricity is free past the cost of the solar array; cooling is passive past the cost of the radiator; and SpaceX is the only company that gets to externalize the lift cost across an existing rocket cadence.
The trade is mass. You're paying somewhere between $200 and $800 per kg to LEO (Starship V3 is at the low end of that envelope when it's flown reusably). A rack of Blackwell silicon is roughly 1,500 kg. The break-even against a Memphis equivalent — including launch insurance, redundancy for radiation events, and the on-orbit servicing margin — happens at about year three at current launch prices and falls to about year two if Starship hits its target reuse cadence.
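The figures quoted so far are enough for a back-of-envelope check on the launch bill and on the share of terrestrial cost-of-ownership that orbit is claimed to remove. The sketch below uses only numbers from the text; the constant and function names are invented for the example, and it does not attempt the break-even year, since the rack's capex, insurance, and servicing margins are not given.

```python
RACK_MASS_KG = 1_500             # "roughly 1,500 kg" per rack
LAUNCH_USD_PER_KG = (200, 800)   # low and high end of the quoted envelope
ELECTRICITY_SHARE_PCT = 30       # share of terrestrial cost-of-ownership
COOLING_SHARE_PCT = 15           # share of terrestrial cost-of-ownership


def launch_cost_usd(mass_kg: float, usd_per_kg: float) -> float:
    """Cost to lift a given mass to LEO at a flat per-kilogram price."""
    return mass_kg * usd_per_kg


low, high = (launch_cost_usd(RACK_MASS_KG, price) for price in LAUNCH_USD_PER_KG)
print(f"Launch cost per rack: ${low:,.0f} to ${high:,.0f}")
# prints "Launch cost per rack: $300,000 to $1,200,000"

avoidable_pct = ELECTRICITY_SHARE_PCT + COOLING_SHARE_PCT
print(f"Cost-of-ownership share targeted in orbit: {avoidable_pct}%")
# prints "Cost-of-ownership share targeted in orbit: 45%"
```

So the trade being described is a one-time lift cost of $300,000 to $1.2 million per rack against the roughly 45% of terrestrial cost-of-ownership that goes to power and cooling.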
Easy to miss in the Starcenter announcement: the link from your phone to orbit is Starlink Direct-to-Cell. xAI doesn't have to negotiate a peering agreement with anyone — SpaceX already runs the radio. The phone in your pocket can, today, in over 60 countries, talk directly to a low Earth orbit satellite without a cell tower in between. The pipeline from "Grok, write me a thread" to "tokens stream back to your phone" is, end to end, owned by one corporate stack.
The implication is not subtle. Grok will be the first AI that works everywhere, including the middle of the ocean and the back of a national park. Other models will get there too — they'll just have to license the radio from somebody.