
OpenAI Slashes ChatGPT Token Prices 90%

💰Read original on 钛媒体

💡OpenAI cuts ChatGPT pricing to $1.25 per 1M input tokens, a 90% savings for scaling AI apps

⚡ 30-Second TL;DR

What Changed

OpenAI cut ChatGPT pricing to $1.25 per 1M input tokens ($10.00 per 1M output tokens)

Why It Matters

This price cut lowers the barrier for developers building AI apps, making high-volume deployments economical that were previously cost-prohibitive. It could accelerate adoption of OpenAI models in production environments.

What To Do Next

Route repeated-context workloads through cached input tokens, which are billed at 10% of the standard input rate, to cut those token costs by 90%.

Who should care: Developers & AI Engineers

🧠 Deep Insight

Web-grounded analysis with 7 cited sources.

🔑 Enhanced Key Takeaways

  • OpenAI's pricing evolution shows a consistent pattern of cost reduction with each model generation: GPT-3.5 Turbo launched at $0.002 per 1K tokens in early 2023, GPT-4 Turbo reached $8–$15 per million output tokens by late 2023, and GPT-4o debuted in mid-2024 at approximately 50% cheaper than GPT-4 Turbo[1]. The current $1.25 per 1M tokens pricing for GPT-5.1 Chat represents continued aggressive cost optimization across OpenAI's product line.
  • As of March 2026, OpenAI offers a tiered pricing structure across multiple model families: GPT-5.4 (most capable) costs $2.50 input/$15.00 output per 1M tokens, while GPT-4.1 mini costs $0.80 input/$3.20 output, and GPT-4.1 nano costs $0.20 input/$0.80 output per 1M tokens[6]. This diversification allows developers to optimize cost-to-capability tradeoffs based on specific use cases.
  • Cached input tokens represent a significant cost optimization mechanism introduced in OpenAI's 2026 pricing structure, with cached input priced at 10% of standard input token rates (e.g., $0.25 cached vs. $2.50 standard for GPT-5.4)[6]. This incentivizes applications with repeated context windows, such as multi-turn conversations or document analysis workflows.
  • OpenAI's token-based pricing model charges separately for input and output tokens, with output tokens consistently priced 4–6× higher than input tokens across model tiers[3][6]. This asymmetry reflects the computational cost differential between processing input and generating output sequences.
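The tiered rates above can be compared with a small cost calculator. This is a minimal sketch using only the per-1M-token figures quoted in this digest; the `PRICING` table and `request_cost` helper are illustrative, not part of any official SDK.

```python
# Estimate per-request cost under the per-1M-token pricing quoted above.
# Cached input is billed at 10% of the standard input rate.

PRICING = {
    # model: (input_usd, cached_input_usd, output_usd) per 1M tokens
    "gpt-5.4":      (2.50, 0.25,  15.00),
    "gpt-5.1-chat": (1.25, 0.125, 10.00),
    "gpt-4.1-mini": (0.80, 0.08,   3.20),
    "gpt-4.1-nano": (0.20, 0.02,   0.80),
}

def request_cost(model, input_tokens, output_tokens, cached_tokens=0):
    """Return USD cost for one request, splitting cached vs. fresh input."""
    inp, cached, out = PRICING[model]
    fresh = input_tokens - cached_tokens
    return (fresh * inp + cached_tokens * cached + output_tokens * out) / 1_000_000

# A 10K-token prompt with 8K tokens cached, 1K output, on GPT-5.1 Chat:
cost = request_cost("gpt-5.1-chat", 10_000, 1_000, cached_tokens=8_000)
print(f"${cost:.6f}")  # → $0.013500
```

Note how output tokens dominate the bill here even at 10% of the input volume, reflecting the 4–6× input/output asymmetry described above.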

🔮 Future Implications

AI analysis grounded in cited sources.

  • Aggressive token pricing compression may accelerate AI adoption in cost-sensitive verticals (e.g., customer support, content moderation, data processing).
  • Sub-$2 per 1M token pricing removes economic barriers for high-volume inference workloads that were previously prohibitive.
  • Cached input token pricing creates incentive structures favoring stateful, context-heavy applications over stateless single-turn interactions.
  • The 90% discount on cached tokens economically rewards architectural patterns that reuse context, shifting developer behavior toward longer-context workflows.
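To see how the cached-token discount rewards context reuse, the sketch below compares the cumulative input cost of a multi-turn conversation that resends its full history each turn, with and without caching. The rates are the GPT-5.1 figures quoted in this digest; the conversation scenario itself is hypothetical.

```python
# Cumulative input cost of a chat that resends its full history each turn:
# stateless (all tokens billed fresh) vs. cached history prefix.
INPUT_RATE = 1.25 / 1_000_000    # GPT-5.1 input, USD per token
CACHED_RATE = 0.125 / 1_000_000  # 90% discount on cached input tokens

def conversation_input_cost(turns, tokens_per_turn, cached=False):
    total = 0.0
    history = 0
    for _ in range(turns):
        if cached:
            # Previously sent history hits the cached rate; only the
            # new turn is billed at the full input rate.
            total += history * CACHED_RATE + tokens_per_turn * INPUT_RATE
        else:
            total += (history + tokens_per_turn) * INPUT_RATE
        history += tokens_per_turn

    return total

no_cache = conversation_input_cost(turns=20, tokens_per_turn=500)
with_cache = conversation_input_cost(turns=20, tokens_per_turn=500, cached=True)
print(f"stateless: ${no_cache:.5f}  cached: ${with_cache:.6f}")
```

Because resent history grows quadratically with turn count, the cached variant's savings compound as conversations get longer, which is exactly the behavioral shift toward stateful, longer-context workflows noted above.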

Timeline

2023-03
GPT-3.5 Turbo API launched at $0.002 per 1K tokens, representing 90% cost reduction from prior GPT-3.5 models
2023-11
GPT-4 Turbo released with pricing at $8–$15 per million output tokens, significantly higher than GPT-3.5 Turbo
2024-05
GPT-4o ('Omni') launched at approximately 50% cheaper than GPT-4 Turbo, demonstrating continued cost optimization trajectory
2025-12
GPT-5 and GPT-5.1 models introduced with pricing at $1.25 per 1M input tokens and $10.00 per 1M output tokens
2026-01
Pricing stabilized across GPT-5.1 variants (Chat, Codex, Codex-Max) at $1.25/$10.00 per 1M tokens through March 2026
2026-03
Cached input token pricing introduced at $0.125 per 1M tokens (90% discount), enabling cost-optimized stateful applications

AI-curated news aggregator. All content rights belong to original publishers.
Original source: 钛媒体