FluxHire.AI

GPT-5.4, GPT-5.4 Mini & Nano: The Complete Guide
OpenAI's Most Powerful Model Family — March 2026

Everything you need to know about GPT-5.4's 1 million token context window, native computer use, verified benchmarks, pricing across all four variants, and what it means for enterprise AI in Australia.

19 March 2026 · 18 min read · Technical Deep-Dive
[Image: GPT-5.4 model family showing Standard, Pro, Mini and Nano variants with benchmark charts and pricing data]

Executive Summary

OpenAI has released GPT-5.4 — its most capable model to date — alongside smaller Mini and Nano variants designed for high-volume, cost-sensitive workloads. Released across two waves in March 2026, this model family introduces native computer use, a 1 million token context window, and a novel tool search mechanism that fundamentally changes how AI agents interact with external systems.

  • GPT-5.4 Standard & Pro launched 5 March 2026 with 1M token context and 128K max output
  • GPT-5.4 Mini & Nano followed on 17 March 2026 — Mini is 2x faster than its predecessor
  • Native computer use — first general-purpose OpenAI model to navigate desktops, browsers, and applications
  • 33% fewer false claims and 18% fewer response errors compared to GPT-5.2
  • Pricing from $0.20/MTok (Nano) to $30.00/MTok (Pro) — a model for every workload
  • GPT-5.2 retiring 5 June 2026 — migration planning should begin now

What Is GPT-5.4? OpenAI's Most Advanced Model Family

GPT-5.4 is the latest generation of OpenAI's flagship generative AI models, succeeding GPT-5.2 (released December 2025) and incorporating the agentic coding capabilities first introduced in GPT-5.3-Codex (February 2026). It represents a convergence of OpenAI's previously separate model lines — the reasoning-focused o-series and the general-purpose GPT series — into a unified architecture that scales from lightweight classification to frontier-grade professional work.

The model family comprises four distinct variants, each targeting a different segment of the cost-performance spectrum. GPT-5.4 Standard and GPT-5.4 Pro handle the most demanding tasks with a 1 million token context window. GPT-5.4 Mini and Nano serve high-volume, latency-sensitive applications at substantially lower cost. Together, they give developers and enterprises the flexibility to route different tasks to the most appropriate model — a pattern that has become standard practice in production AI systems.

What sets GPT-5.4 apart from its predecessors is not merely incremental improvement. Three genuinely new capabilities — native computer use, tool search, and the Thinking variant's visible reasoning — represent architectural shifts in how large language models interact with the world. These are not fine-tuned additions bolted onto an existing model; they are capabilities baked into the model's training from the ground up.

Complete Pricing & Specifications at a Glance

| Specification | GPT-5.4 | GPT-5.4 Pro | GPT-5.4 Mini | GPT-5.4 Nano |
| --- | --- | --- | --- | --- |
| Released | 5 Mar 2026 | 5 Mar 2026 | 17 Mar 2026 | 17 Mar 2026 |
| API Model ID | gpt-5.4-2026-03-05 | gpt-5.4-pro-2026-03-05 | gpt-5.4-mini-2026-03-17 | gpt-5.4-nano-2026-03-17 |
| Context Window | 1,000,000 | 1,000,000 | 400,000 | 400,000 |
| Max Output | 128,000 | 128,000 | — | — |
| Input $/MTok | $2.50 | $30.00 | $0.75 | $0.20 |
| Output $/MTok | $15.00 | $180.00 | $4.50 | $1.25 |
| Computer Use | Yes | Yes | Yes | Limited |
| ChatGPT Access | Plus/Team/Pro | Pro/Enterprise | Free/Go | API only |
Pricing as published on the OpenAI API pricing page. Batch/Flex processing available at half the standard rate. Priority processing at double. Regional data residency adds a 10% uplift. Dash (—) indicates not separately confirmed by OpenAI at time of publication.
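Under these rates, per-request cost can be estimated with a small helper. This is an illustrative sketch, not an SDK call: the model names mirror the table above, and the tier multipliers follow the pricing note (the long-context surcharge on Standard and Pro requests is not modelled here).

```python
# Hypothetical cost helper built from the published per-token rates.
# Rates are USD per million tokens ("MTok"); tier multipliers follow
# the pricing note (batch/flex 0.5x, priority 2x, residency +10%).

RATES = {  # model: (input $/MTok, output $/MTok)
    "gpt-5.4": (2.50, 15.00),
    "gpt-5.4-pro": (30.00, 180.00),
    "gpt-5.4-mini": (0.75, 4.50),
    "gpt-5.4-nano": (0.20, 1.25),
}

TIER_MULTIPLIER = {"batch": 0.5, "standard": 1.0, "priority": 2.0}

def request_cost(model, input_tokens, output_tokens,
                 tier="standard", data_residency=False):
    """Estimated USD cost of a single API request."""
    in_rate, out_rate = RATES[model]
    cost = (input_tokens * in_rate + output_tokens * out_rate) / 1_000_000
    cost *= TIER_MULTIPLIER[tier]
    if data_residency:
        cost *= 1.10  # regional data residency uplift
    return cost
```

For example, a 100K-token prompt with a 5K-token response on Standard works out to roughly $0.33, while the same request on Nano costs about two cents.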

GPT-5.4 Standard & Pro: The Flagship Models

GPT-5.4 Standard is the general-purpose workhorse — the model most developers and ChatGPT users will interact with daily. GPT-5.4 Pro is the maximum-compute variant reserved for the hardest professional tasks, available to Pro and Enterprise plan subscribers in ChatGPT and via the API.

Both share the same architecture and core capabilities, but Pro allocates significantly more compute per request. OpenAI's own internal investment banking modelling benchmark illustrates the gap: GPT-5.4 Pro achieves 87.3% accuracy compared to GPT-5.2's 68.4% — a substantial improvement on tasks requiring sustained multi-step quantitative reasoning.

1 Million Token Context Window

GPT-5.4's context window of 1,000,000 tokens (split as 872K input and 128K output) is the largest OpenAI has ever offered. To put this in perspective, one million tokens is approximately 750,000 words — enough to process an entire codebase, a lengthy legal contract, or hundreds of candidate profiles in a single API call.

This represents a 2.5x increase over GPT-5.2's 400,000 token context. For enterprise applications that require synthesising information across large document collections, this is not merely an incremental upgrade — it enables entirely new workflows that were previously impossible without complex chunking and retrieval strategies.

Worth noting: pricing doubles for requests that exceed 272K tokens of context, so cost-conscious teams should still be deliberate about what they include in their prompts.
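That threshold is easy to express in code. One assumption to flag: the article does not specify whether the 2x multiplier applies to the whole request or only to tokens beyond 272K, so this sketch assumes the whole request.

```python
LONG_CONTEXT_THRESHOLD = 272_000  # tokens; above this, pricing doubles

def effective_rate(base_rate_per_mtok, context_tokens):
    """Apply the long-context multiplier to a per-MTok rate.

    Assumption: the 2x multiplier applies to the entire request once
    context exceeds the threshold, not just the excess tokens.
    """
    if context_tokens > LONG_CONTEXT_THRESHOLD:
        return base_rate_per_mtok * 2
    return base_rate_per_mtok
```

A 500K-token request on Standard is therefore billed at $5.00/MTok input rather than $2.50, which is the main reason cost-conscious teams should still trim their prompts.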

Native Computer Use

GPT-5.4 is the first general-purpose OpenAI model with native computer use capabilities. This means it can navigate desktop environments, control web browsers, operate applications, and execute multi-step workflows — all through the API and Codex.

On the OSWorld benchmark, which measures real-world computer interaction, GPT-5.4 scores 75.0%. For context, the human baseline on OSWorld is 72.4%. This is a significant milestone — the model matches or exceeds average human performance on standardised computer tasks.

For enterprises, native computer use opens the door to genuine process automation: filling forms, navigating legacy systems that lack APIs, extracting data from web interfaces, and orchestrating multi-application workflows. This is particularly valuable in recruitment, where many legacy ATS platforms and job boards still require manual browser interaction.

Tool Search: A New Paradigm for Agent Architectures

One of GPT-5.4's most innovative features is tool search. Rather than sending the full definition of every available tool with each API call — which consumes significant context and increases latency — GPT-5.4 receives only a lightweight list of tool names and a search capability. When the model determines it needs a specific tool, it looks up the tool's full definition and appends it to the conversation at that moment.

This dramatically reduces token usage for tool-heavy agentic workflows. If your application exposes dozens or hundreds of tools (as is common in enterprise integrations), tool search can significantly reduce per-request costs while maintaining the model's ability to use any tool when needed.
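A minimal sketch of the pattern follows, with entirely hypothetical tool names and schemas: the prompt carries only the name list, and a full definition is looked up only when the model asks for it.

```python
# Sketch of the tool-search pattern. The model sees only tool names
# upfront; a full definition is appended on demand. All tool names
# and schemas below are hypothetical examples.

TOOL_REGISTRY = {
    "search_candidates": {
        "name": "search_candidates",
        "description": "Search the candidate database by skill and location.",
        "parameters": {"skills": "list[str]", "location": "str"},
    },
    "parse_resume": {
        "name": "parse_resume",
        "description": "Extract structured fields from a resume document.",
        "parameters": {"document_id": "str"},
    },
}

def initial_prompt_tools():
    """What ships with every request: names only, not full schemas."""
    return sorted(TOOL_REGISTRY)

def lookup_tool(name):
    """Called when the model decides it needs a specific tool."""
    definition = TOOL_REGISTRY.get(name)
    if definition is None:
        raise KeyError(f"unknown tool: {name}")
    return definition
```

With hundreds of registered tools, the upfront payload stays a short list of names rather than hundreds of full JSON schemas, which is where the token savings come from.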

Accuracy and Reliability Improvements

OpenAI reports that individual claims produced by GPT-5.4 are 33% less likely to be false compared to GPT-5.2, and full responses are 18% less likely to contain errors. On the GDPval benchmark — which measures performance against professional human workers across diverse tasks — GPT-5.4 matches or exceeds professionals in 83.0% of comparisons, up from 70.9% for GPT-5.2.

On the GPQA Diamond benchmark (graduate-level science questions), GPT-5.4 scores 92.8%, a marginal but consistent improvement over GPT-5.2's 92.4%. These gains may appear modest in percentage terms, but at the frontier of model performance, each percentage point represents a meaningful reduction in error rates across millions of daily requests.

GPT-5.4 Thinking: Visible Reasoning for Complex Tasks

GPT-5.4 Thinking is a reasoning variant that makes the model's thought process visible to users. In ChatGPT, it provides an upfront plan of its thinking, allowing users to see and adjust the model's approach mid-response. This is the evolution of OpenAI's earlier o-series reasoning models (o1, o3, o3-pro, and o4-mini), which were retired from ChatGPT on 13 February 2026.

Rather than maintaining a separate reasoning model line, OpenAI has folded reasoning capabilities directly into the GPT-5.x architecture as a “Thinking” tier. This simplifies the model selection process for developers — instead of choosing between a chat model and a reasoning model, you now choose a thinking depth within a single model family.

For enterprise applications, visible reasoning is particularly valuable in compliance-sensitive domains. When an AI system makes a recommendation — whether it's shortlisting a candidate, scoring a resume, or flagging a risk — auditors and compliance teams can inspect the model's reasoning chain to verify that decisions are based on appropriate criteria and are not influenced by protected characteristics.

GPT-5.4 Mini: Speed Meets Capability

Released on 17 March 2026, GPT-5.4 Mini is designed for applications where speed and cost matter as much as intelligence. It runs more than 2x faster than GPT-5 Mini (the previous small model) while delivering significantly better performance across coding, reasoning, multimodal understanding, and tool use.

At $0.75 per million input tokens and $4.50 per million output tokens, Mini occupies a sweet spot between Nano's bare-minimum pricing and Standard's frontier capability. Its 400,000 token context window is generous for a small model, and it has native computer use capabilities — making it viable for agent subprocesses that need to interact with web interfaces or desktop applications.

Performance Benchmarks

  • SWE-Bench Pro: 54.4% (vs GPT-5 Mini: 45.7%)
  • tau2-bench (Tool Calling): 93.4% (vs GPT-5 Mini: 74.1%)
  • OSWorld-Verified (Computer Use): 72.1% (approaching the human baseline of 72.4%)
  • Terminal Tasks: 60.0% (new benchmark for Mini-class models)

Ideal Use Cases for GPT-5.4 Mini

  • Real-time coding assistants — fast enough for inline code completion and pair programming
  • Agent subprocesses — capable sub-agents in multi-agent architectures that need to call tools and use computers
  • High-volume API workloads — processing thousands of requests per minute at a fraction of Standard's cost
  • ChatGPT Free tier — now available to all ChatGPT users, including Free and Go plans

GPT-5.4 Nano: The Efficiency Frontier

GPT-5.4 Nano is the smallest and most cost-effective model in the GPT-5.4 family. At $0.20 per million input tokens and $1.25 per million output tokens, it is designed for workloads where speed and cost are the primary considerations — classification, data extraction, ranking, routing, and simple coding tasks.

Unlike Mini, Nano is an API-only model. It is not available through the ChatGPT consumer application. This positions it squarely as a developer tool — the model you reach for when building production pipelines that process millions of items and where every fraction of a cent per request matters.

Cost Comparison: Processing 1 Million Documents

Estimated cost for processing 1 million documents averaging 500 tokens each (500M total input tokens, assuming 100 output tokens per document):

  • GPT-5.4 Nano: $225
  • GPT-5.4 Mini: $825
  • GPT-5.4 Standard: $2,750
  • GPT-5.4 Pro: $33,000

Estimates based on published per-token pricing. Actual costs will vary with prompt structure, caching, and batch processing discounts.
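The four figures above can be reproduced directly from the published per-token rates:

```python
# Reproducing the document-processing estimates: 1 million documents
# at 500 input tokens and 100 output tokens each.
DOCS = 1_000_000
INPUT_MTOK = DOCS * 500 / 1_000_000   # 500 MTok total input
OUTPUT_MTOK = DOCS * 100 / 1_000_000  # 100 MTok total output

RATES = {  # model: (input $/MTok, output $/MTok)
    "nano": (0.20, 1.25),
    "mini": (0.75, 4.50),
    "standard": (2.50, 15.00),
    "pro": (30.00, 180.00),
}

totals = {
    model: INPUT_MTOK * in_rate + OUTPUT_MTOK * out_rate
    for model, (in_rate, out_rate) in RATES.items()
}
# totals: nano 225, mini 825, standard 2,750, pro 33,000
```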

Where Nano Shines

  • Classification and routing — sorting emails, categorising support tickets, routing queries to the appropriate department
  • Data extraction — pulling structured information from unstructured text at massive scale
  • Ranking and scoring — lightweight candidate scoring, content relevance ranking, search result reranking
  • Simple coding sub-agents — code generation for straightforward tasks within larger agent orchestrations

On SWE-Bench Pro, Nano still manages a respectable 52.4% — remarkable for a model at this price point. Its tool calling capability is more limited than Mini's, and its OSWorld-Verified score of 39.0% indicates that computer use tasks are better delegated to Mini or Standard.

The GPT-5 Evolution: From GPT-5 to GPT-5.4

Understanding GPT-5.4 requires context on the rapid evolution of OpenAI's model family since August 2025. Each release has built on the last, and the pace has been relentless — five major releases in seven months.

GPT-5 — 7 August 2025
The foundational release. Unified the o-series reasoning models and GPT chat models into a single system with a smart router. Three tiers: Instant, Thinking, Pro. 400K context. Massive leap over GPT-4: 94.6% on AIME 2025, 74.9% on SWE-Bench, 45% fewer factual errors than GPT-4o.

GPT-5.1 — 12 November 2025
Conversational refinement. Warmer personality, 8 personality options, adaptive reasoning in Instant mode. Codex-Max and Pro variants followed a week later. Retired 11 March 2026.

GPT-5.2 — 11 December 2025
Flagship reasoning model. Knowledge cutoff jumped to August 2025. 400K context, 128K output. Benchmarks: 100% AIME 2025, 93.2% GPQA Diamond, 80% SWE-Bench. Codex variant followed in January 2026. Retiring 5 June 2026.

GPT-5.3-Codex — 5 February 2026
Specialised agentic coding model. Combined Codex + GPT-5 training stacks. 25% faster. First model that helped create itself. GPT-5.3 Instant followed on 3 March 2026 (stability focused, fewer hallucinations).

GPT-5.4 & GPT-5.4 Mini/Nano — March 2026 (Current)
Current frontier. 1M context, native computer use, tool search. Incorporates GPT-5.3-Codex capabilities. 33% fewer false claims. Four variants: Standard, Pro, Mini, Nano.

GPT-5.4 vs the Competition: Claude, Gemini, and Codex

GPT-5.4 enters a fiercely competitive landscape. Anthropic's Claude Opus 4.6 (released February 2026) leads on software engineering benchmarks. Google DeepMind's Gemini 3.1 Pro (also February 2026) offers the best price-to-performance ratio and full multimodal support. Understanding where each model excels is essential for making informed architecture decisions.

| Benchmark | GPT-5.4 | Claude Opus 4.6 | Gemini 3.1 Pro |
| --- | --- | --- | --- |
| SWE-Bench Verified | 57.7% (Pro) | 80.84% | 80.6% |
| GPQA Diamond | 92.8% | 91.3% | 94.3% |
| OSWorld (Computer Use) | 75.0% | — | — |
| GDPval (Professional Work) | 83.0% | — | — |
| Context Window | 1,000,000 | 200K (1M beta) | 1,000,000 |
| Input $/MTok | $2.50 | $5.00 | $2.00 |
| Native Computer Use | Yes | Yes (Claude Code) | Limited |

Dash (—) indicates benchmark not reported by the developer. GPT-5.4 SWE-Bench score is from SWE-Bench Pro (a newer, more challenging variant); Claude and Gemini scores are from SWE-Bench Verified. Direct comparison should be made with caution. Sources: Official developer announcements.

The competitive picture is nuanced. GPT-5.4 leads on computer use (OSWorld) and professional task performance (GDPval). Claude Opus 4.6 dominates on software engineering (SWE-Bench Verified). Gemini 3.1 Pro offers the best value at $2.00/MTok input with full multimodal support. For a detailed comparison of Claude, Gemini, and Codex, see our comprehensive frontier model comparison.

In practice, most enterprise architectures now use multiple models. A common pattern is routing complex reasoning to a frontier model (GPT-5.4 Standard or Claude Opus 4.6), high-volume tasks to a small model (GPT-5.4 Nano or Claude Haiku 4.5), and multimodal work to Gemini. The right question is not “which model is best?” but “which model is best for each task in my pipeline?”
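A routing table for this pattern can be as simple as a dictionary. The task categories and the model chosen for each are illustrative, following the pattern described above, not prescriptive.

```python
# Hypothetical multi-model routing table. Categories and choices are
# examples of the pattern described above, not recommendations.
ROUTES = {
    "complex_reasoning": "gpt-5.4",    # or claude-opus-4.6
    "high_volume": "gpt-5.4-nano",     # or claude-haiku-4.5
    "multimodal": "gemini-3.1-pro",
}

def route(task_category):
    """Pick a model for a task, defaulting to the frontier model."""
    return ROUTES.get(task_category, "gpt-5.4")
```

In production this lookup usually sits behind an evaluation harness, so that when a new model ships, each category can be re-benchmarked and the table updated without touching application code.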

Enterprise and Recruitment Applications

GPT-5.4's combination of massive context, computer use, and improved accuracy makes it particularly well-suited for enterprise recruitment workflows. The ability to process entire candidate portfolios in a single context window eliminates the need for complex chunking strategies, while native computer use enables direct interaction with legacy ATS platforms and job boards.

AI-powered recruitment platforms are already leveraging multi-model architectures to optimise cost and performance. For example, FluxHire.AI uses frontier models for complex candidate analysis and matching, while routing high-volume tasks like resume parsing and initial screening to more cost-effective variants. The GPT-5.4 model family's four-tier pricing structure is tailor-made for this kind of intelligent routing.

Intelligent Matching

GPT-5.4's 1M context allows processing entire job requirement documents alongside hundreds of candidate profiles simultaneously, enabling more nuanced matching than was possible with smaller context windows.

Agentic Workflows

Computer use enables agents that navigate job boards, ATS platforms, and sourcing tools autonomously. Tool search reduces the cost of agents that integrate with dozens of recruitment APIs.

Cost Optimisation

Route resume screening to Nano ($0.20/MTok), shortlisting to Mini ($0.75/MTok), and final candidate analysis to Standard ($2.50/MTok). Match the model to the task for optimal cost-to-quality.
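That three-stage split can be sketched as a simple cost model. The per-candidate token counts below are assumptions for illustration only, not published figures; the rates come from the pricing table earlier in this article.

```python
# Three-stage recruitment routing sketch. Token counts per candidate
# are illustrative assumptions; rates are the published $/MTok figures.
STAGES = {  # stage: (model, input $/MTok, output $/MTok, in_tok, out_tok)
    "screening":    ("gpt-5.4-nano", 0.20, 1.25, 2_000, 100),
    "shortlisting": ("gpt-5.4-mini", 0.75, 4.50, 4_000, 300),
    "analysis":     ("gpt-5.4",      2.50, 15.00, 8_000, 1_000),
}

def stage_cost(stage, candidates):
    """Estimated USD cost of running one stage over a candidate pool."""
    _, in_rate, out_rate, in_tok, out_tok = STAGES[stage]
    per_candidate = (in_tok * in_rate + out_tok * out_rate) / 1_000_000
    return candidates * per_candidate
```

Under these assumptions, screening 1,000 candidates on Nano costs well under a dollar, while full analysis of the same pool on Standard costs tens of dollars, which is the economic argument for routing only the survivors of each stage onward.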

What This Means for Australian Businesses

Australian enterprises evaluating GPT-5.4 should consider several factors specific to the local regulatory and business environment.

Privacy Act 1988 Compliance: Any AI system processing personal information of Australian individuals must comply with the Australian Privacy Principles (APPs), particularly APP 11 (security of personal information) and APP 8 (cross-border disclosure). GPT-5.4's API includes data processing agreements, but organisations should ensure their use cases comply with data minimisation principles and have appropriate consent mechanisms in place.

Data Residency: OpenAI's regional data residency option (available at a 10% pricing uplift) allows organisations to ensure API requests are processed within specific geographic regions. Australian enterprises handling sensitive data should evaluate whether this meets their data sovereignty requirements.

Fair Work Act Considerations: When using GPT-5.4 for recruitment decisions (screening, shortlisting, scoring), organisations must ensure that AI-assisted decisions do not discriminate on the basis of protected attributes under the Fair Work Act 2009. The Thinking variant's visible reasoning provides an audit trail that can help demonstrate compliance.

Cost Advantage for Australian SMEs: The Nano model at $0.20/MTok makes frontier AI accessible to smaller Australian recruitment firms that previously could not justify the cost of API-based AI. A recruitment agency processing 10,000 candidate profiles per month would spend approximately $10–$50 on Nano, compared to $125–$750 on Standard.
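As a sanity check on those ranges, assume (hypothetically) 5,000 input tokens and 500 output tokens per candidate profile:

```python
PROFILES = 10_000             # candidate profiles per month
IN_TOK, OUT_TOK = 5_000, 500  # assumed tokens per profile (illustrative)

def monthly_cost(in_rate_mtok, out_rate_mtok):
    """Estimated USD per month at the assumed volume."""
    per_profile = (IN_TOK * in_rate_mtok + OUT_TOK * out_rate_mtok) / 1_000_000
    return PROFILES * per_profile

nano_cost = monthly_cost(0.20, 1.25)       # about $16, inside the $10-$50 range
standard_cost = monthly_cost(2.50, 15.00)  # about $200, inside the $125-$750 range
```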

Migration Guide: Moving from GPT-5.2 to GPT-5.4

With GPT-5.2 Thinking classified as a Legacy Model and scheduled for permanent retirement on 5 June 2026, organisations currently running GPT-5.2 in production need to plan their migration. Here are the key considerations.

Deprecation Timeline

GPT-5.2 Thinking will be permanently retired on 5 June 2026. After this date, API calls using gpt-5.2-2025-12-11 will fail. Begin testing GPT-5.4 in staging environments now to ensure a smooth transition.

API Model ID Changes

Update your model parameter from gpt-5.2-2025-12-11 to gpt-5.4-2026-03-05 (or gpt-5.4 for the latest alias). For cost-sensitive workloads, consider routing to gpt-5.4-mini-2026-03-17 or gpt-5.4-nano-2026-03-17.
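A minimal sketch of the swap, shown as a plain request payload rather than a live SDK call (field names follow the common chat-completions shape; wrap this in your own client code):

```python
# Minimal model-ID migration sketch. Payloads are plain dicts here;
# the "model"/"messages" field names follow the common chat-completions
# request shape rather than any specific SDK.
OLD_MODEL = "gpt-5.2-2025-12-11"  # retiring 5 June 2026
NEW_MODEL = "gpt-5.4-2026-03-05"  # or the rolling "gpt-5.4" alias

def migrate_payload(payload):
    """Swap the retired model ID for its GPT-5.4 replacement."""
    if payload.get("model") == OLD_MODEL:
        payload = {**payload, "model": NEW_MODEL}
    return payload
```

Running every outbound request through a shim like this during the transition window lets you flip traffic over gradually and roll back by changing one constant.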

Context Window Expansion

GPT-5.4's 1M context window (vs GPT-5.2's 400K) may allow you to simplify your architecture by removing chunking and retrieval layers. Evaluate whether your current RAG pipeline can be simplified or eliminated for use cases that now fit within the expanded context.
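A quick feasibility check before removing a RAG layer, using the rough 4-characters-per-token heuristic (an approximation; real counts should come from a tokenizer):

```python
CONTEXT_LIMIT = 1_000_000  # GPT-5.4 total context window (tokens)
MAX_OUTPUT = 128_000       # reserve headroom for the response

def fits_in_single_call(document_chars, chars_per_token=4):
    """Rough check: can this corpus skip chunking/retrieval entirely?

    Uses the common ~4 characters-per-token heuristic; swap in a real
    tokenizer count before making architecture decisions.
    """
    estimated_tokens = document_chars / chars_per_token
    return estimated_tokens <= CONTEXT_LIMIT - MAX_OUTPUT
```

Corpora that fail this check (or that sit near the 272K surcharge threshold) are still better served by retrieval, so the decision is per use case rather than a blanket simplification.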

New Capabilities

GPT-5.4 introduces computer use and tool search. If your application currently uses browser automation or maintains large tool definitions, evaluate how these native capabilities could replace existing infrastructure.

For organisations using multi-model architectures, the migration is also an opportunity to re-evaluate model routing. Tasks currently running on GPT-5.2 may be better served by GPT-5.4 Mini or Nano, delivering the same or better quality at a lower cost. Our earlier article on FluxHire's GPT-5.2 integration provides context on how multi-agent platforms approach model selection.

ChatGPT for Excel and Google Sheets

Alongside GPT-5.4, OpenAI launched ChatGPT for Excel and Google Sheets in beta. This embeds ChatGPT directly within spreadsheet applications, allowing users to invoke GPT-5.4 capabilities without leaving their workflow.

For recruitment teams, this has immediate practical applications: analysing candidate data, generating interview scorecards, creating salary benchmarking reports, and automating repetitive spreadsheet tasks. The integration runs on GPT-5.4 Mini by default, keeping costs manageable for frequent use.

This is significant for the broader adoption of AI in enterprises. Many HR professionals and recruiters spend a substantial portion of their day in spreadsheets. Meeting them where they already work — rather than requiring them to switch to a separate AI tool — reduces friction and accelerates adoption.

What Comes Next

The pace of OpenAI's releases — five major model versions in seven months — shows no sign of slowing. GPT-5.4 is not an endpoint but a milestone in a rapid evolution. For enterprises, the strategic imperative is clear: build architectures that can adapt to new models quickly, invest in evaluation frameworks that can benchmark new releases against your specific use cases, and maintain model-agnostic abstractions in your codebase.

The AI recruitment space, in particular, is evolving rapidly. Platforms that integrate frontier models for complex decision-making while leveraging cost-effective models for high-volume processing are best positioned to deliver value. The GPT-5.4 family's four-tier structure makes this kind of intelligent routing more accessible than ever. To explore how AI-powered recruitment automation is transforming the industry, see our guide to OpenAI agents in recruitment automation.

Frequently Asked Questions

What is GPT-5.4 and when was it released?

GPT-5.4 is OpenAI's latest frontier AI model, released on 5 March 2026. It features a 1 million token context window, native computer use capabilities, and a new tool search mechanism. It is available via the OpenAI API, ChatGPT Plus, Team, and Pro plans.

What are GPT-5.4 Mini and GPT-5.4 Nano?

GPT-5.4 Mini and Nano are smaller, faster, more cost-effective variants released on 17 March 2026. Mini offers 400K context and is available in ChatGPT Free/Go tiers. Nano is API-only at $0.20 per million input tokens — the cheapest model in the family.

How much does GPT-5.4 cost per million tokens?

Standard: $2.50 input / $15.00 output. Pro: $30.00 / $180.00. Mini: $0.75 / $4.50. Nano: $0.20 / $1.25. Batch/Flex processing is available at half rate. Pricing doubles for requests exceeding 272K context tokens on Standard and Pro.

What is the context window for GPT-5.4?

GPT-5.4 Standard and Pro have a 1,000,000 token context window (872K input + 128K output) — roughly 750,000 words. Mini and Nano have 400,000 token windows. This is the largest context OpenAI has offered.

What is native computer use in GPT-5.4?

GPT-5.4 can navigate desktops, control browsers, and operate applications autonomously through the API and Codex. It scores 75.0% on OSWorld, exceeding the human baseline of 72.4%. This is the first general-purpose OpenAI model with this capability.

How does GPT-5.4 compare to Claude Opus 4.6?

GPT-5.4 leads on computer use (OSWorld 75.0%) and professional tasks (GDPval 83.0%). Claude Opus 4.6 leads on software engineering (SWE-Bench Verified 80.84% vs GPT-5.4's 57.7% on the more challenging SWE-Bench Pro). Different benchmarks measure different things — direct comparison requires caution.

What is GPT-5.4 Thinking mode?

GPT-5.4 Thinking is a reasoning variant that makes the model's thought process visible. Users can see and adjust the model's approach mid-response. It evolved from the o-series reasoning models (o1, o3, o4-mini), which were retired in February 2026.

When will GPT-5.2 be retired?

GPT-5.2 Thinking is now a Legacy Model and will be permanently retired on 5 June 2026. API calls using gpt-5.2-2025-12-11 will stop working after that date. Begin migration planning now.

Is GPT-5.4 Nano available in ChatGPT?

No. GPT-5.4 Nano is API-only and is not available in the ChatGPT consumer application. GPT-5.4 Mini is available in ChatGPT Free and Go tiers. Standard and Pro are available in Plus, Team, and Pro plans.

How can Australian enterprises use GPT-5.4 for recruitment?

Enterprises can leverage GPT-5.4 for candidate analysis, automated screening, intelligent matching, and agentic recruitment workflows. The 1M context window enables processing entire candidate portfolios in a single pass. Platforms like FluxHire.AI integrate frontier models while maintaining Australian Privacy Act 1988 compliance.