Last updated: June 22, 2026

gpt-realtime Pricing Calculator

Estimate the cost of using gpt-realtime-1.5 and gpt-realtime-mini on Azure and OpenAI based on your expected conversation volume, duration, and turn count.

What Is gpt-realtime?

gpt-realtime supports low-latency, “speech in, speech out” conversational interactions. Unlike traditional speech pipelines that chain speech-to-text → LLM → text-to-speech, gpt-realtime processes audio natively — producing faster, more natural voice interactions with a single API call.

You can connect to the Realtime API via WebRTC, WebSocket, or SIP to send audio input and receive audio responses in real time.

The latest version, gpt-realtime-1.5, is available on both OpenAI and Azure OpenAI. A smaller, more affordable variant, gpt-realtime-mini, is also available for cost-sensitive voice applications. Both models are well suited to voice assistants, real-time translation systems, interactive customer support agents, telephony bots, and any application where users expect a natural, spoken conversation with an AI.

Connecting to the Realtime API: WebRTC vs WebSocket vs SIP

The Realtime API supports three connection protocols. In most cases, WebRTC is the recommended choice for real-time audio streaming thanks to its lower latency, built-in media handling, error correction, and peer-to-peer communication.

Protocol   | Best for                            | Latency                | Complexity
WebRTC     | Client-side apps (web, mobile)      | Lowest (~50-100 ms)    | Higher
WebSocket  | Server-to-server, batch processing  | Moderate (~100-300 ms) | Lower
SIP        | Telephony integration               | Varies                 | Highest

SIP (Session Initiation Protocol) lets you route inbound VoIP calls directly into an AI-powered session, making it ideal for telephony integration and contact center automation.

How Realtime API Pricing Works

Both gpt-realtime-1.5 and gpt-realtime-mini are billed per token across two modalities — audio and text — each with separate input, cached input, and output rates:

gpt-realtime-1.5 Pricing

  • Audio tokens: $32 per 1M input tokens, $0.40 per 1M cached input tokens, $64 per 1M output tokens.
  • Text tokens: $4 per 1M input tokens, $0.40 per 1M cached input tokens, $16 per 1M output tokens.

gpt-realtime-mini Pricing

  • Audio tokens: $10 per 1M input tokens, $0.30 per 1M cached input tokens, $20 per 1M output tokens.
  • Text tokens: $0.60 per 1M input tokens, $0.06 per 1M cached input tokens, $2.40 per 1M output tokens.
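As a minimal sketch, the rates listed above can be held in a lookup table and applied per token. The names `RATES` and `token_cost_usd` are illustrative only and are not part of any OpenAI or Azure SDK; the numbers are copied from the lists above and should be checked against current provider pricing.

```python
# USD per 1M tokens, copied from the rate lists above.
# RATES and token_cost_usd are hypothetical names used for illustration.
RATES = {
    "gpt-realtime-1.5": {
        "audio": {"input": 32.00, "cached_input": 0.40, "output": 64.00},
        "text": {"input": 4.00, "cached_input": 0.40, "output": 16.00},
    },
    "gpt-realtime-mini": {
        "audio": {"input": 10.00, "cached_input": 0.30, "output": 20.00},
        "text": {"input": 0.60, "cached_input": 0.06, "output": 2.40},
    },
}


def token_cost_usd(model: str, modality: str, kind: str, tokens: int) -> float:
    """Return the USD cost of `tokens` tokens at the listed per-1M rate."""
    return tokens * RATES[model][modality][kind] / 1_000_000


if __name__ == "__main__":
    # Cost of 1,200 audio output tokens on gpt-realtime-1.5.
    print(token_cost_usd("gpt-realtime-1.5", "audio", "output", 1_200))
```

Keeping the rates in one table makes it easy to swap models or update prices without touching the arithmetic.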

Audio input is tokenized at 10 tokens/second; audio output at 20 tokens/second. A small number of text tokens accompanies each audio response (~3 tokens/second of assistant speech).
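The tokenization rates above translate speech duration into token counts. The sketch below assumes the roughly 3 text tokens per second of assistant speech are counted as text output; that is an interpretation of the sentence above, not something the source states explicitly. The function name `turn_tokens` is hypothetical.

```python
# Tokenization rates quoted above (tokens per second of speech).
AUDIO_IN_TOKENS_PER_SEC = 10   # user speech
AUDIO_OUT_TOKENS_PER_SEC = 20  # assistant speech
TEXT_TOKENS_PER_SEC = 3        # approximate; assumed here to be text output


def turn_tokens(user_seconds: float, assistant_seconds: float) -> dict:
    """Estimate token counts for one turn from speech durations in seconds."""
    return {
        "audio_input": round(user_seconds * AUDIO_IN_TOKENS_PER_SEC),
        "audio_output": round(assistant_seconds * AUDIO_OUT_TOKENS_PER_SEC),
        "text_output": round(assistant_seconds * TEXT_TOKENS_PER_SEC),
    }


if __name__ == "__main__":
    print(turn_tokens(user_seconds=10, assistant_seconds=15))
```

For a turn with 10 seconds of user speech and 15 seconds of assistant speech, this gives 100 audio input tokens, 300 audio output tokens, and about 45 text tokens.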

  • Cached tokens: In multi-turn conversations, input tokens carried over from earlier turns are cached and billed at the much lower cached input rate instead of the full input rate. The more turns a conversation has, the larger the share of its input that is billed at the cached rate.

The total cost of a conversation depends on its duration, number of turns, and the balance between user input and assistant output in each turn.
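One way to combine duration, turn count, and caching into a single estimate is sketched below. It rests on several simplifying assumptions that are not from the source: every turn has the same speech durations, the full conversation history is resent as input on each turn, all of that history is billed at the cached rate, and there is no system prompt, tool call, or history truncation. The `Rates` class and `estimate_conversation_cost` function are hypothetical names; real bills will differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rates:
    """USD per 1M tokens, taken from the rate lists above."""
    audio_in: float
    audio_cached: float
    audio_out: float
    text_cached: float
    text_out: float


GPT_REALTIME_1_5 = Rates(32.00, 0.40, 64.00, 0.40, 16.00)
GPT_REALTIME_MINI = Rates(10.00, 0.30, 20.00, 0.06, 2.40)


def estimate_conversation_cost(rates: Rates, turns: int,
                               user_sec_per_turn: float,
                               assistant_sec_per_turn: float) -> float:
    """Rough USD estimate for one conversation under the stated assumptions."""
    history_audio = 0.0  # audio tokens from earlier turns (assumed fully cached)
    history_text = 0.0   # text tokens from earlier turns (assumed fully cached)
    total_micro_usd = 0.0
    for _ in range(turns):
        new_audio_in = user_sec_per_turn * 10    # 10 tokens/s of user speech
        audio_out = assistant_sec_per_turn * 20  # 20 tokens/s of assistant speech
        text_out = assistant_sec_per_turn * 3    # ~3 text tokens/s, assumed output
        total_micro_usd += (
            history_audio * rates.audio_cached
            + history_text * rates.text_cached
            + new_audio_in * rates.audio_in
            + audio_out * rates.audio_out
            + text_out * rates.text_out
        )
        # Everything said in this turn becomes cached context for the next one.
        history_audio += new_audio_in + audio_out
        history_text += text_out
    return total_micro_usd / 1_000_000


if __name__ == "__main__":
    for name, rates in [("gpt-realtime-1.5", GPT_REALTIME_1_5),
                        ("gpt-realtime-mini", GPT_REALTIME_MINI)]:
        cost = estimate_conversation_cost(rates, turns=6,
                                          user_sec_per_turn=10,
                                          assistant_sec_per_turn=15)
        print(f"{name}: ${cost:.4f} per conversation")
```

Multiplying the per-conversation figure by expected conversation volume gives a monthly estimate. Under these assumptions a single turn of 10 seconds of user speech and 15 seconds of assistant speech on gpt-realtime-1.5 comes to roughly $0.023.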

When to Use the Realtime API

The Realtime API is ideal when your application requires spoken, interactive exchanges with sub-second latency. Common use cases include:

  • Voice assistants: Build conversational agents that listen and respond in natural speech without noticeable delay.
  • Live translation & interpretation: Translate spoken language in near real time for meetings, calls, or customer service.
  • Interactive voice bots: Customer support bots, scheduling assistants, and IVR replacements that feel like talking to a human.
  • Telephony & contact centers: Route inbound VoIP calls via SIP directly into AI-powered sessions for automated phone support.
  • Accessibility tools: Real-time audio descriptions, read-aloud interfaces, and voice-driven navigation for users who prefer spoken interaction.

For text-only workloads, batch processing, or scenarios where request latency is less critical, standard GPT models like GPT-4.1 are more cost-effective.
