Estimate the cost of using the Azure Voice Live API across the Pro, Standard, and Lite tiers based on your expected conversation volume, duration, and audio configuration.
The Azure Voice Live API is a fully managed solution that enables low-latency, high-quality speech-to-speech interactions for voice agents. It integrates speech recognition, generative AI, and text-to-speech into a single, unified interface — eliminating the need to manually orchestrate multiple components.
Developers provide audio input and receive audio output, avatar visuals, and action triggers — all with minimal latency. You don’t need to deploy or manage any generative AI models; the API handles the underlying infrastructure.
Azure Voice Live supports a broad range of generative AI models including GPT-5, GPT-4.1, GPT-4o, Phi, and gpt-realtime variants. The model you choose determines your pricing tier (Pro, Standard, or Lite).
Pricing is tiered by the generative AI model used. You don’t select a tier; you choose a model, and the corresponding pricing applies.
Each tier has separate per-token rates for text, Azure Speech Standard audio, Azure Speech Custom audio, and native audio (speech-to-speech). Cached input tokens from earlier turns in a conversation are charged at significantly reduced rates.
Custom voice training/hosting and avatar costs are billed separately.
GPT real-time models, with the option to use Azure text-to-speech voices (including custom voice) for audio.
| Category | Price / M Tokens | Tokens / Month (M) | Cost / Month |
|---|---|---|---|
| Audio Input, Native Audio (S2S) | $35.20 | 2.34 | $82.37 |
| Audio Input (cached) | $0.44 | 3.96 | $1.74 |
| Audio Output, Native Audio (S2S) | $70.40 | 1.80 | $126.72 |
| Text Input | $4.40 | 0.72 | $3.15 |
| Text Input (cached) | $1.38 | 2.32 | $3.20 |
| Text Output | $17.60 | 0.27 | $4.75 |

Per Conversation: $0.22
Estimated Monthly Total: $221.93
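
The monthly total is the sum, over all rows, of the price per million tokens multiplied by the monthly token volume in millions. A minimal Python sketch of that arithmetic, using the figures from the table above (the names `LINE_ITEMS` and `monthly_total` are illustrative and not part of any Azure SDK):

```python
# Rates (USD per 1M tokens) and monthly volumes (millions of tokens),
# copied from the estimate table above.
LINE_ITEMS = [
    ("Audio Input, Native Audio (S2S)", 35.20, 2.34),
    ("Audio Input (cached)", 0.44, 3.96),
    ("Audio Output, Native Audio (S2S)", 70.40, 1.80),
    ("Text Input", 4.40, 0.72),
    ("Text Input (cached)", 1.38, 2.32),
    ("Text Output", 17.60, 0.27),
]


def monthly_total(items):
    """Sum price-per-million times millions-of-tokens over all line items."""
    return sum(price * tokens_m for _, price, tokens_m in items)


total = monthly_total(LINE_ITEMS)
print(f"Estimated monthly total: ${total:.2f}")
```

With the rounded token volumes shown in the table, this comes to about $221.95, roughly two cents above the displayed $221.93. The gap is presumably because the calculator works from unrounded token counts.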
Estimated costs using your current conversation parameters across all available tier and audio configurations.
| Tier | Voice Option | Per Conversation | Monthly Total | Difference |
|---|---|---|---|---|
| Azure Voice Live Pro | Azure Speech — Custom | $0.15 | $3,095.30 | +1295% |
| Azure Voice Live Standard | Azure Speech — Custom | $0.13 | $3,076.25 | +1286% |
| Azure Voice Live Lite | Azure Speech — Custom | $0.13 | $3,074.59 | +1285% |
| Azure Voice Live Pro | Native Audio (S2S) | $0.22 | $221.93 | Baseline |
| Azure Voice Live Pro | Azure Speech — Standard | $0.09 | $94.80 | -57% |
| Azure Voice Live Standard | Azure Speech — Standard | $0.07 | $74.85 | -66% |
| Azure Voice Live Lite | Azure Speech — Standard | $0.07 | $73.19 | -67% |
| Azure Voice Live Standard | Native Audio (S2S) | $0.07 | $68.60 | -69% |
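
The Difference column appears to be each configuration's monthly total relative to the baseline row, rounded to a whole percent. A short Python sketch under that assumption (the function name `percent_difference` is illustrative):

```python
def percent_difference(total, baseline):
    """Signed difference of `total` relative to `baseline`, as a whole percent."""
    return round((total / baseline - 1) * 100)


BASELINE = 221.93  # Azure Voice Live Pro, Native Audio (S2S)

print(percent_difference(3095.30, BASELINE))  # Pro, Azure Speech Custom
print(percent_difference(68.60, BASELINE))    # Standard, Native Audio (S2S)
```

Applied to the monthly totals in the table, this reproduces the listed values, for example +1295% for the first row and -69% for the last.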
The Azure Voice Live API charges per token across audio and text modalities. Costs accumulate with each turn in a conversation because the full conversation history is sent as input for every response. Prompt caching reduces cost for previously seen tokens.
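
To illustrate why cost grows faster than the number of turns, here is a simplified Python sketch for a single modality. It assumes, hypothetically, that every turn adds the same number of new input and output tokens and that all history from earlier turns is billed at the cached rate. Real conversations mix modalities and will not hit the cache perfectly, so treat this as an idealized model rather than the billing algorithm.

```python
def conversation_cost(turns, new_in, new_out, rate_in, rate_cached, rate_out):
    """Estimate the cost in USD of one conversation (idealized model).

    Assumptions (hypothetical, for illustration only):
      * each turn adds `new_in` input tokens and `new_out` output tokens;
      * the full history of earlier turns is re-sent on every turn and
        billed at the cached rate.
    Rates are in USD per 1M tokens.
    """
    history = 0  # tokens accumulated from earlier turns
    cost = 0.0
    for _ in range(turns):
        cost += history * rate_cached / 1e6  # re-sent history, cached rate
        cost += new_in * rate_in / 1e6       # new input for this turn
        cost += new_out * rate_out / 1e6     # response generated this turn
        history += new_in + new_out          # both become history next turn
    return cost


# Pro-tier text rates from the pricing table: input, cached, output.
five = conversation_cost(5, 100, 50, 4.40, 1.38, 17.60)
ten = conversation_cost(10, 100, 50, 4.40, 1.38, 17.60)
print(f"5 turns: ${five:.6f}, 10 turns: ${ten:.6f}")
```

Doubling the number of turns more than doubles the cost in this model, because the history term grows with each turn. Passing the full input rate as `rate_cached` shows the same conversation without caching.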
Prices per 1M tokens. The tier is determined by the generative AI model used.
Azure Voice Live Pro

| Feature | Input | Cached | Output |
|---|---|---|---|
| Text | $4.40 | $1.38 | $17.60 |
| Audio — Standard | $17.00 | $0.44 | $38.00 |
| Audio — Custom | $40.00 | $0.44 | $55.00 |
| Native Audio | $35.20 | $0.44 | $70.40 |

Azure Voice Live Standard

| Feature | Input | Cached | Output |
|---|---|---|---|
| Text | $0.66 | $0.33 | $2.64 |
| Audio — Standard | $15.00 | $0.33 | $33.00 |
| Audio — Custom | $39.00 | $0.33 | $50.00 |
| Native Audio | $11.00 | $0.33 | $22.00 |

Azure Voice Live Lite

| Feature | Input | Cached | Output |
|---|---|---|---|
| Text | $0.11 | $0.04 | $0.44 |
| Audio — Standard | $15.00 | $0.04 | $33.00 |
| Audio — Custom | $39.00 | $0.04 | $50.00 |
| Native Audio | $4.00 | $0.04 | — |
You don’t select a tier directly. You choose a generative AI model and the corresponding pricing tier applies automatically.
| Tier | Models |
|---|---|
| Azure Voice Live Pro | gpt-realtime, gpt-4o, gpt-4.1, gpt-5, gpt-5-chat |
| Azure Voice Live Standard | gpt-realtime-mini, gpt-4o-mini, gpt-4.1-mini, gpt-5-mini |
| Azure Voice Live Lite | gpt-5-nano, phi4-mm-realtime, phi4-mini |
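
If you want to resolve the tier in your own cost tooling, the mapping above can be expressed as a simple lookup. This is a sketch only: the names `TIER_MODELS`, `MODEL_TO_TIER`, and `tier_for` are hypothetical helpers, not part of the Azure Voice Live API.

```python
# Tier-to-model mapping copied from the table above.
TIER_MODELS = {
    "Azure Voice Live Pro": [
        "gpt-realtime", "gpt-4o", "gpt-4.1", "gpt-5", "gpt-5-chat",
    ],
    "Azure Voice Live Standard": [
        "gpt-realtime-mini", "gpt-4o-mini", "gpt-4.1-mini", "gpt-5-mini",
    ],
    "Azure Voice Live Lite": [
        "gpt-5-nano", "phi4-mm-realtime", "phi4-mini",
    ],
}

# Invert the mapping so a model name resolves directly to its tier.
MODEL_TO_TIER = {
    model: tier for tier, models in TIER_MODELS.items() for model in models
}


def tier_for(model):
    """Return the pricing tier for `model`, or raise ValueError if unknown."""
    try:
        return MODEL_TO_TIER[model]
    except KeyError:
        raise ValueError(f"Unknown model: {model!r}") from None


print(tier_for("gpt-4o-mini"))
```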
This calculator provides rough estimates based on simplified assumptions. Actual costs depend on conversation dynamics, voice activity detection behavior, caching efficiency, and token overhead. Custom voice training/hosting and avatar costs are billed separately.