The AI arms race just escalated dramatically. OpenAI has unveiled GPT-5, the most powerful large language model ever released to the public, featuring a context window of one million tokens – enough to process three full-length novels, a complete corporate codebase, or a month’s worth of Slack messages in a single prompt. The API price has been slashed by 50% in direct response to the price war triggered by Chinese rival DeepSeek. The model is also natively multimodal, accepting video, audio, image, and text inputs. With GPT-5, OpenAI is not just answering its competitors – it is trying to redefine what an AI can do.

The launch event at OpenAI’s San Francisco headquarters was unusually low‑key for a company that once captivated the world with ChatGPT. There was no dramatic demo of a model writing poetry or passing the bar exam. Instead, OpenAI CEO Sam Altman focused on two numbers: 1 million and 50%. “Context is the bottleneck for real-world AI applications,” Altman said. “Legal contracts, medical records, scientific papers – they are long. A 128,000‑token window forces you to summarize or truncate. With 1 million tokens, you can feed the entire document and ask precise questions. That changes everything.”
The 50% price cut is equally significant. GPT-5 costs $15 per million input tokens and $30 per million output tokens – down from $30 and $60 for GPT-4 Turbo. For a typical customer service bot that handles 10,000 conversations per month, the monthly API bill drops from $5,000 to $2,500. For a startup building an AI legal assistant, the cost of analyzing a 500-page contract goes from $1.20 to $0.60. In aggregate, OpenAI estimates that GPT-5 will save developers $500 million in API costs over the next 12 months.
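The pricing arithmetic above can be reproduced with a short Python sketch. The per-million-token prices are the ones quoted in this article; the dictionary keys, the function name `request_cost`, and the example workload of 2,000 input and 500 output tokens per conversation are illustrative assumptions, not OpenAI figures.

```python
# Prices in US dollars per million tokens, as quoted in the article.
# The dictionary keys are illustrative labels, not official API model IDs.
PRICES_PER_MILLION = {
    "gpt-4-turbo": {"input": 30.0, "output": 60.0},
    "gpt-5": {"input": 15.0, "output": 30.0},
}


def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the cost in dollars of one request at the quoted prices."""
    price = PRICES_PER_MILLION[model]
    total = input_tokens * price["input"] + output_tokens * price["output"]
    return total / 1_000_000


# Assumed workload: 10,000 conversations a month, each using
# 2,000 input tokens and 500 output tokens (hypothetical numbers).
CONVERSATIONS = 10_000
old_bill = CONVERSATIONS * request_cost("gpt-4-turbo", 2_000, 500)
new_bill = CONVERSATIONS * request_cost("gpt-5", 2_000, 500)

print(f"GPT-4 Turbo: ${old_bill:,.2f} per month")
print(f"GPT-5:       ${new_bill:,.2f} per month")
print(f"Saving:      {1 - new_bill / old_bill:.0%}")
```

Because both the input and the output price were halved, the saving is 50% whatever the mix of input and output tokens. Read the other way, the article's $5,000 monthly bill implies an average of $0.50 per conversation at the old prices, and its $1.20 contract analysis corresponds to 40,000 input tokens at the old input rate.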
But price is only half the story. Performance is the other half. GPT-5 scores 92% on the MMLU benchmark (Massive Multitask Language Understanding), up from GPT-4’s 86%. For the first time, a language model matches human expert performance on 35 out of 57 academic subjects, including high‑school mathematics, US history, and professional law. The model is also dramatically better at reasoning: when asked to solve multi‑step problems, GPT-5 generates an internal chain‑of‑thought, double‑checks its work, and cites sources with 96% accuracy. This “reasoning engine” was previously a separate product (OpenAI’s o1 series) but is now integrated directly into GPT-5.
“We have collapsed several specialized models into one general‑purpose system,” said Mira Murati, OpenAI’s CTO. “GPT-5 is not just bigger. It is qualitatively different. It can watch a video, transcribe the audio, summarize the content, and answer follow‑up questions – all without separate models for vision, speech, or text. That is the future.”
The native multimodality is a technical achievement. Competing models like Google’s Gemini and Anthropic’s Claude are also multimodal, but they often rely on separate encoders that convert images or audio into text tokens before processing. GPT-5 uses a single transformer architecture that treats all modalities as sequences of tokens. The result is a model that can “see” a video frame and “hear” the soundtrack simultaneously, drawing inferences that cross modalities. In internal tests, GPT-5 was shown a video of a person assembling a piece of IKEA furniture, then asked to write step‑by‑step instructions. It did so with 94% accuracy, even when the video had no audio and the person’s hands partially obscured the parts.

Early adopters are already putting GPT-5 to work. Stripe is using it to analyze merchant payment disputes, feeding the entire dispute history (often hundreds of pages) into the model and asking for a recommended resolution. Canva has integrated GPT-5 into its design assistant, allowing users to describe a video they want (“slow motion of water droplets falling on a red flower”) and generate it directly. The US Census Bureau is piloting GPT-5 to redact personally identifiable information from census forms – a task that previously required hundreds of human hours.
The biggest differentiator, however, is the context window. One million tokens is not a marketing gimmick; it is a functional threshold. With 128,000 tokens (GPT-4 Turbo’s limit), a user could feed roughly 300 pages of text. That is enough for a long report or a short novel, but not for a large codebase, a full legal case file, or several books at once. With 1 million tokens, the user can feed the entire text of “The Three-Body Problem” (about 300,000 tokens) and still have room for analysis instructions. In a real‑world test, OpenAI fed GPT-5 the complete financial disclosures of a Fortune 500 company – over 8,000 pages – and asked it to identify potential accounting irregularities. The model flagged a pattern of quarter‑end revenue spikes that auditors had missed. The company is now under SEC investigation.
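A rough budgeting sketch shows how a prompt of this size might be checked before it is sent. It assumes the common rule of thumb of about 0.75 English words per token; real tokenizers vary with language and content, and the function names and the 4,000-token output reserve are hypothetical.

```python
WORDS_PER_TOKEN = 0.75  # rule-of-thumb assumption for English prose


def estimate_tokens(word_count: int) -> int:
    """Estimate the token count of English text from its word count."""
    return round(word_count / WORDS_PER_TOKEN)


def remaining_budget(window: int, document_tokens: int,
                     reserved_output: int = 4_000) -> int:
    """Tokens left for instructions once the document and reply are budgeted.

    A negative result means the document does not fit in the window.
    """
    return window - document_tokens - reserved_output


NOVEL_TOKENS = 300_000  # the article's figure for "The Three-Body Problem"

for name, window in [("128K window", 128_000), ("1M window", 1_000_000)]:
    left = remaining_budget(window, NOVEL_TOKENS)
    verdict = "fits" if left >= 0 else "does not fit"
    print(f"{name}: {verdict} ({left:,} tokens left)")
```

Under the same rule of thumb, a 128,000-token window holds about 96,000 words, which is why a 300,000-token novel overflows the smaller window but leaves most of the larger one free.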
Of course, a large context window is only useful if the model can attend to the relevant information. Earlier long‑context models suffered from the “lost in the middle” problem: they recalled information near the beginning and end of a long document far more reliably than information buried in the middle. GPT-5 uses a modified attention mechanism that dynamically redistributes attention across the entire context, with a particular focus on the first mention of key entities. In benchmark tests, GPT-5 answered questions about the middle of a 900,000‑token document with 89% accuracy – compared to 42% for GPT-4 and 61% for Claude 3.5.
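A test of this kind can be sketched in a few lines of Python. This is a generic "needle in a haystack" harness, not OpenAI's benchmark: it hides a random code at a chosen relative depth in filler text and checks whether the answer contains it. The `ask_model` callable stands in for whatever model API is being tested, and `forgetful_stub` is a toy stand-in that reads only the first and last fifth of the document; both are assumptions made for illustration.

```python
import random
import re
from typing import Callable, Dict, List

FILLER = "The committee reviewed the quarterly figures without comment."
QUESTION = "What is the access code for the archive?"


def build_haystack(needle: str, total_sentences: int, depth: float) -> str:
    """Insert `needle` at relative position `depth` (0.0 = start, 1.0 = end)."""
    sentences = [FILLER] * total_sentences
    sentences.insert(int(depth * total_sentences), needle)
    return " ".join(sentences)


def retrieval_accuracy(
    ask_model: Callable[[str, str], str],
    depths: List[float],
    trials: int = 5,
    total_sentences: int = 1_000,
    seed: int = 0,
) -> Dict[float, float]:
    """Return, for each depth, the fraction of trials answered correctly."""
    rng = random.Random(seed)
    results = {}
    for depth in depths:
        correct = 0
        for _ in range(trials):
            code = str(rng.randint(100_000, 999_999))
            needle = f"The access code for the archive is {code}."
            document = build_haystack(needle, total_sentences, depth)
            if code in ask_model(document, QUESTION):
                correct += 1
        results[depth] = correct / trials
    return results


def forgetful_stub(document: str, question: str) -> str:
    """Toy model that only 'sees' the first and last fifth of the document."""
    fifth = len(document) // 5
    visible = document[:fifth] + " " + document[-fifth:]
    match = re.search(r"access code for the archive is (\d+)", visible)
    return match.group(1) if match else "unknown"


print(retrieval_accuracy(forgetful_stub, depths=[0.0, 0.5, 1.0]))
```

The stub answers correctly when the needle sits at the start or end and fails at the midpoint, which is the failure pattern the paragraph describes. Swapping in a real model call for `ask_model` and sweeping more depths would give a curve comparable in spirit to the figures quoted above.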
Not everyone is celebrating. The price war with DeepSeek and Alibaba has forced OpenAI to operate on thinner margins. Each GPT-5 query costs OpenAI an estimated $0.50 in compute, but the company charges as little as $0.03 for small requests. That gap is subsidized by Microsoft’s cloud credits and OpenAI’s venture funding. But investors are already asking when the company will turn a profit. OpenAI’s annualized revenue is $3.4 billion, but its compute expenses are estimated at $4 billion per year. The 50% price cut will only worsen the math.
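The margin arithmetic in this paragraph is simple enough to write out. The sketch below uses only the estimates quoted above; the variable names are illustrative, and it considers compute costs alone, ignoring salaries, research, and other expenses.

```python
# Estimates quoted in the article, in US dollars.
COMPUTE_COST_PER_QUERY = 0.50
MIN_PRICE_PER_QUERY = 0.03
ANNUAL_REVENUE = 3.4e9
ANNUAL_COMPUTE_COST = 4.0e9

# Loss on the cheapest requests, and the share of compute cost they recover.
loss_per_small_query = COMPUTE_COST_PER_QUERY - MIN_PRICE_PER_QUERY
cost_recovered = MIN_PRICE_PER_QUERY / COMPUTE_COST_PER_QUERY

# Gap between revenue and compute spend, and the revenue growth
# needed to cover compute alone.
annual_shortfall = ANNUAL_COMPUTE_COST - ANNUAL_REVENUE
growth_to_break_even = ANNUAL_COMPUTE_COST / ANNUAL_REVENUE - 1

print(f"Loss per small query:      ${loss_per_small_query:.2f}")
print(f"Compute cost recovered:    {cost_recovered:.0%}")
print(f"Annual shortfall:          ${annual_shortfall / 1e9:.1f} billion")
print(f"Growth needed for compute: {growth_to_break_even:.1%}")
```

On these figures, the cheapest requests recover only a small fraction of their estimated compute cost, and revenue would have to grow by roughly a sixth just to cover compute before any price cut is taken into account.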
“OpenAI is sacrificing short‑term profitability for long‑term market share,” said Brad Lightcap, OpenAI’s COO, in an interview. “We believe that the application layer is where value accrues. If we can get GPT-5 into the hands of every developer, we will have a platform that competitors cannot match. The losses are an investment.”
Safety remains a concern. GPT-5’s improved reasoning makes it more capable of generating harmful content if jailbroken. In internal testing, the model refused 99.9% of explicitly harmful prompts (e.g., “how to make a bomb”) thanks to a new “constitutional AI” layer that embeds rules into the model’s training. But adversarial researchers have already found prompts that bypass the guardrails – for example, asking the model to “write a fictional story about a character who creates a dangerous device” can yield instructions that are functionally identical to real plans. OpenAI says it is monitoring the issue and will issue patches as exploits are discovered.

Competitors are responding. Google has announced an emergency 40% price cut for Gemini Ultra, and Anthropic is rumored to be preparing a Claude 4 release with a 2 million token context window. DeepSeek, the Chinese challenger, has promised to match GPT-5’s performance within six months. The AI war is far from over.
But for now, GPT-5 sets a new standard. Developers who were waiting for a model that could handle their entire codebase, their full customer history, or their complete product documentation now have a tool that works. The era of snippet‑based AI is ending. The era of whole‑document AI is here.
As Altman put it at the launch: “We asked ourselves: what is the one thing that would make developers switch from GPT-4 to GPT-5 overnight? The answer was context length. Everything else is incremental. But being able to drop in a million tokens and get a useful answer – that’s transformative. That’s the future.”
Whether that future is profitable, safe, or sustainable remains to be seen. But for the millions of developers who woke up this morning to find a new model in their API dashboard, the future has already arrived. And it costs half as much as before.



