POST /audio/stream
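const { createWriteStream } = require('fs');
const { pipeline } = require('stream/promises');

// `suonora` is assumed to be an initialized Suonora client.
// CommonJS has no top-level await, so the calls run inside an async function.
async function main() {
  const stream = await suonora.audio.stream({
    input: "Welcome to Suonora streaming!",
    model: "legacy-v2.5",
    voice: "axel",
    style: "cheerful",
    styleDegree: 1.2
  });

  // Pipe to a file or an audio player; pipeline() handles back-pressure and cleanup
  await pipeline(
    stream,
    createWriteStream("output.mp3")
  );
}

main().catch(console.error);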
"Binary MP3 data (not shown in documentation)"

Endpoint

POST /v1/audio/stream

Authentication

Bearer token required

Overview

The streaming endpoint converts text to speech and streams the MP3 audio in real time. It is suited to applications that need low-latency audio playback, such as real-time assistants or live caption-to-speech conversion.

Response Format

audio/mpeg (chunked MP3)

First Byte Latency

< 500ms

Request Parameters

input
string
required

The text to convert to speech. Maximum 5,000 characters.

model
string
required

The synthesis model to use. Currently supported: legacy-v2.5

voice
string
required

The voice ID to use. Get available voices from the voices endpoint.

pitch
string

Adjust the voice pitch. Range: -100% to +100%. Default: +0%

style
string

Emotional speaking style. Options: neutral, cheerful, calm, angry, sad, excited, whispering. Default: calm

styleDegree
number

Intensity of the selected style. Range: 0.5 to 2.0. Default: 1.5

lang
string

BCP-47 language code (e.g., en-US, fr-FR). Default: en-US
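
The documented limits above can be checked on the client before a request is sent. The following is a minimal sketch, not part of any Suonora SDK: the helper name validateStreamRequest is hypothetical, the pitch syntax (a signed percentage string such as "+10%") is inferred from the documented default of +0%, and the character count uses JavaScript string length, which may differ from how the API counts characters.

```javascript
// Hypothetical helper (not part of the Suonora SDK): checks a request body
// against the limits listed in the Request Parameters section.
const STYLES = ["neutral", "cheerful", "calm", "angry", "sad", "excited", "whispering"];

function validateStreamRequest(params) {
  const errors = [];
  const { input, model, voice, pitch, style, styleDegree } = params;

  // Required fields
  if (typeof input !== "string" || input.length === 0) {
    errors.push("input is required");
  } else if (input.length > 5000) {
    // Assumption: the limit is compared against UTF-16 string length
    errors.push("input exceeds 5,000 characters");
  }
  if (model !== "legacy-v2.5") {
    errors.push("model must be legacy-v2.5");
  }
  if (typeof voice !== "string" || voice.length === 0) {
    errors.push("voice is required");
  }

  // Optional fields
  if (pitch !== undefined) {
    // Assumption: pitch is a signed percentage string such as "+10%" or "-25%"
    const match = /^[+-]?(\d+(?:\.\d+)?)%$/.exec(String(pitch));
    if (!match || Number(match[1]) > 100) {
      errors.push("pitch must be between -100% and +100%");
    }
  }
  if (style !== undefined && !STYLES.includes(style)) {
    errors.push("style must be one of: " + STYLES.join(", "));
  }
  if (
    styleDegree !== undefined &&
    (typeof styleDegree !== "number" || styleDegree < 0.5 || styleDegree > 2.0)
  ) {
    errors.push("styleDegree must be a number between 0.5 and 2.0");
  }

  return errors; // an empty array means no problems were found
}
```

Returning a list of messages rather than throwing on the first problem lets a caller report every invalid field at once.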

Examples

curl -X POST https://api.suonora.com/v1/audio/stream \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  --output - \
  -d '{
    "input": "Welcome to Suonora streaming!",
    "model": "legacy-v2.5",
    "voice": "axel",
    "style": "cheerful",
    "styleDegree": 1.2
  }' | ffplay -autoexit -nodisp -

Response

The endpoint streams MP3 audio data with the following headers:

Content-Type: audio/mpeg
Transfer-Encoding: chunked
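
Without the SDK, the response can be consumed directly over HTTP. The sketch below assumes Node.js 18 or later, where fetch is global and the response body can be read with for await. The function names openAudioStream and countBytes are hypothetical, and fetchImpl is an injectable parameter added only so the logic can be exercised without network access. It checks the status and the audio/mpeg content type before handing back the body.

```javascript
// Sketch: request the stream with fetch and verify the response before
// consuming it. Function names are illustrative, not part of any SDK.
async function openAudioStream(body, apiKey, fetchImpl = fetch) {
  const res = await fetchImpl("https://api.suonora.com/v1/audio/stream", {
    method: "POST",
    headers: {
      Authorization: `Bearer ${apiKey}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify(body),
  });

  if (!res.ok) {
    throw new Error(`Stream request failed with HTTP ${res.status}`);
  }

  // The endpoint is documented to return audio/mpeg
  const type = res.headers.get("content-type") || "";
  if (!type.startsWith("audio/mpeg")) {
    throw new Error(`Unexpected content type: ${type}`);
  }

  return res.body; // stream of Uint8Array chunks
}

// Consume chunks as they arrive; here they are only counted.
async function countBytes(stream) {
  let total = 0;
  for await (const chunk of stream) {
    total += chunk.byteLength;
  }
  return total;
}
```

In a real player, the loop body would feed each chunk to a decoder or audio buffer instead of counting bytes.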

Error Responses

Best Practices

  • Connection Management: Use HTTP/2 or keep-alive connections to reduce latency
  • Back-pressure: Process chunks as they arrive to maintain stream health
  • Error Recovery: Implement reconnection logic for network interruptions
  • Browser Support: Use MediaSource API for optimal browser streaming
  • Security: Keep your API key secure and never expose it in client-side code
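
The error-recovery advice above can be sketched as a retry wrapper with exponential backoff. This is an illustrative pattern, not documented Suonora behavior: withRetry and startStream are hypothetical names, the retry count and delays are arbitrary defaults, and since this page does not describe resuming a stream mid-way, each retry is assumed to restart synthesis from the beginning.

```javascript
// Sketch of reconnection logic: retry a failed attempt with exponential
// backoff. `startStream` is a hypothetical caller-supplied async function
// that opens the stream and consumes it (for example, pipes it to a file).
async function withRetry(startStream, { retries = 3, baseDelayMs = 250 } = {}) {
  let lastError;
  for (let attempt = 0; attempt <= retries; attempt++) {
    try {
      return await startStream(attempt);
    } catch (err) {
      lastError = err;
      if (attempt === retries) break; // out of attempts
      // Wait baseDelayMs, then 2x, 4x, ... before the next attempt
      const delay = baseDelayMs * 2 ** attempt;
      await new Promise((resolve) => setTimeout(resolve, delay));
    }
  }
  throw lastError;
}
```

A caller would wrap the whole request-and-consume step, so that a dropped connection during playback triggers a fresh request rather than leaving a truncated file. Retrying only makes sense for transient failures; errors such as an invalid API key should be surfaced immediately.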

Streaming vs Standard Endpoint

Use Streaming When

  • Real-time playback is needed
  • Low latency is critical
  • Processing long texts
  • Building conversational apps

Use Standard When

  • Saving audio to files
  • Offline caching
  • Simple playback
  • Short text snippets

Authorizations

Authorization
string
header
required

Your API key as a Bearer token

Body

application/json

Response

200
audio/mpeg

Successful response

Streaming MP3 audio data