Text to Speech

The Text to Speech API converts text into spoken audio using Capa TTS voices. It supports raw audio downloads for direct playback and a JSON/base64 mode for apps that need to store or transport audio inside JSON.

Important: This endpoint uses Capa Text to Speech and counts toward your AI request limit.

Voice IDs use the Capa model format, for example v1-capa-tts-nabzclan/en-AU-NatashaNeural.

Endpoint

POST https://developer.nabzclan.vip/api/ai/v1/audio/speech

Authentication

Include your Developer API token in the Authorization header:

Authorization: Bearer YOUR_API_TOKEN
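
A minimal Python sketch of building this header without hard-coding the token. The helper name build_headers and the environment variable NABZCLAN_API_TOKEN are hypothetical choices for illustration, not names defined by the API.

```python
import os


def build_headers(token=None):
    """Return request headers carrying the Bearer token.

    NABZCLAN_API_TOKEN is a hypothetical environment variable name.
    """
    token = token or os.environ.get("NABZCLAN_API_TOKEN")
    if not token:
        raise RuntimeError("No API token provided")
    return {
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json",
    }
```

Reading the token from the environment keeps it out of source control; pass it explicitly in tests.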

When to Use It

  • App narration: Read articles, stories, guides, onboarding screens, or generated AI responses aloud.
  • Accessibility: Add spoken output for users who prefer audio or need screen-reader-style content.
  • Customer support: Generate voice replies for chatbots, support flows, IVR menus, and help center content.
  • Education: Create pronunciation examples, language-learning clips, flashcards, and lesson audio.
  • Creator tools: Generate voiceovers for reels, shorts, podcasts, announcements, and product demos.
  • Notifications: Turn alerts, status changes, moderation results, or monitoring messages into audio.

Request Parameters

Required

| Field | Type | Description |
| --- | --- | --- |
| model | string | Voice/model ID to use, such as v1-capa-tts-nabzclan/en-AU-NatashaNeural. |
| input | string | Text to convert into speech. Maximum 50,000 characters. |

Optional

| Field | Type | Description |
| --- | --- | --- |
| response_format | string | Output format: mp3, wav, opus, aac, flac, pcm, or json. Use json to receive base64-encoded audio inside a JSON response. The examples in this document pass it in the query string. |
| voice | string | Voice override for advanced voice routing. Usually not needed for v1-capa-tts-nabzclan/... models. |
| speed | number | Playback speed from 0.25 to 4. Use 1 for normal speed. |
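
The limits above can be checked on the client before a request is sent, which avoids spending a request on a 422. The following is a sketch only: build_request is a hypothetical helper, and it assumes response_format travels in the query string as in the examples in this document.

```python
ALLOWED_FORMATS = {"mp3", "wav", "opus", "aac", "flac", "pcm", "json"}
MAX_INPUT_CHARS = 50_000
MIN_SPEED, MAX_SPEED = 0.25, 4.0


def build_request(model, text, response_format=None, speed=None):
    """Validate the fields and return (query_params, json_body).

    build_request is a hypothetical helper, not part of the API.
    """
    if not model:
        raise ValueError("model is required")
    if not text:
        raise ValueError("input is required")
    if len(text) > MAX_INPUT_CHARS:
        raise ValueError(f"input exceeds {MAX_INPUT_CHARS} characters")

    body = {"model": model, "input": text}
    if speed is not None:
        if not MIN_SPEED <= speed <= MAX_SPEED:
            raise ValueError("speed must be between 0.25 and 4")
        body["speed"] = speed

    # The examples in this document send response_format in the query string.
    params = {}
    if response_format is not None:
        if response_format not in ALLOWED_FORMATS:
            raise ValueError(f"unsupported response_format: {response_format}")
        params["response_format"] = response_format
    return params, body
```

With the requests library, the result can be passed along as requests.post(url, params=params, json=body, headers=headers).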

Response Modes

Raw Audio

By default, the API returns audio bytes. Save the response body to a file, stream it to a media player, or send it directly to a client.

curl -X POST https://developer.nabzclan.vip/api/ai/v1/audio/speech \
  -H "Authorization: Bearer YOUR_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "v1-capa-tts-nabzclan/en-AU-NatashaNeural",
    "input": "Hello, this is a text to speech test."
  }' \
  --output speech.mp3
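
For larger responses, writing the body to disk in chunks avoids holding the whole file in memory. A minimal sketch, where save_audio_chunks is a hypothetical helper that accepts any iterable of byte chunks:

```python
def save_audio_chunks(chunks, path):
    """Write an iterable of byte chunks to path and return the bytes written."""
    total = 0
    with open(path, "wb") as audio_file:
        for chunk in chunks:
            if chunk:  # Skip empty keep-alive chunks.
                audio_file.write(chunk)
                total += len(chunk)
    return total
```

With the requests library, one way to feed it is to send the request with stream=True and pass response.iter_content(chunk_size=8192) as the chunks argument. Whether the server itself delivers the audio incrementally is not stated here; the client-side pattern works either way.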

JSON Base64

Pass response_format=json in the query string to receive a JSON response that carries the generated audio as base64-encoded data instead of raw bytes.

curl -X POST "https://developer.nabzclan.vip/api/ai/v1/audio/speech?response_format=json" \
  -H "Authorization: Bearer YOUR_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "v1-capa-tts-nabzclan/en-AU-NatashaNeural",
    "input": "Hello, this is a text to speech test."
  }'

Use JSON/base64 mode when:

  • You are calling from a serverless function that prefers JSON payloads.
  • You need to store the audio inside a database record or queue message.
  • Your client cannot easily handle binary responses.
  • You want a single API response containing metadata and audio data.
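
The storage case can be sketched as follows. The record layout (the voice and audio_base64 keys) is a hypothetical example, not the API's response schema, and the byte string stands in for real audio. The sketch also shows the cost of base64: every 3 bytes of audio become 4 characters of text.

```python
import base64
import json


def audio_to_record(audio_bytes, voice_id):
    """Wrap raw audio bytes in a JSON-serializable record (hypothetical layout)."""
    return {
        "voice": voice_id,
        "audio_base64": base64.b64encode(audio_bytes).decode("ascii"),
    }


def record_to_audio(record):
    """Recover the original audio bytes from a stored record."""
    return base64.b64decode(record["audio_base64"])


raw = bytes(range(256)) * 3  # 768 bytes standing in for audio data
record = audio_to_record(raw, "v1-capa-tts-nabzclan/en-AU-NatashaNeural")
stored = json.dumps(record)  # Safe to put in a database column or queue message
encoded_len = len(record["audio_base64"])
print(len(raw), encoded_len)  # 768 1024
```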

Voice IDs

Voice IDs should be sent in the model field. The current format is:

v1-capa-tts-nabzclan/{locale}-{VoiceName}Neural

Examples:

| Voice ID | Locale | Style |
| --- | --- | --- |
| v1-capa-tts-nabzclan/en-AU-NatashaNeural | English (Australia) | Female, natural general-purpose voice |
| v1-capa-tts-nabzclan/en-US-GuyNeural | English (United States) | Male, confident general-purpose voice |
| v1-capa-tts-nabzclan/en-US-EmmaMultilingualNeural | English (United States) | Female, multilingual voice |
| v1-capa-tts-nabzclan/es-MX-JorgeNeural | Spanish (Mexico) | Male Spanish voice |
| v1-capa-tts-nabzclan/es-AR-ElenaNeural | Spanish (Argentina) | Female Spanish voice |
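
The ID format can be checked and split into its parts with a regular expression. This sketch assumes the locale is always a two-letter language code followed by a two-letter region code, as in every example above; parse_voice_id is a hypothetical helper.

```python
import re

# Assumes locales look like "en-AU": two lowercase letters, a hyphen, two uppercase letters.
VOICE_ID_RE = re.compile(
    r"^v1-capa-tts-nabzclan/(?P<locale>[a-z]{2}-[A-Z]{2})-(?P<name>[A-Za-z]+)Neural$"
)


def parse_voice_id(voice_id):
    """Return (locale, voice_name) or raise ValueError if the ID is malformed."""
    match = VOICE_ID_RE.match(voice_id)
    if match is None:
        raise ValueError(f"Not a Capa TTS voice ID: {voice_id!r}")
    return match.group("locale"), match.group("name")


print(parse_voice_id("v1-capa-tts-nabzclan/en-AU-NatashaNeural"))
# ('en-AU', 'Natasha')
```

A well-formed ID is not necessarily an available one, so treat this as a format check only.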

Available Voices

This list shows the currently available Capa TTS voices. Send these IDs in the model field using the v1-capa-tts-nabzclan/... format.

Need another voice or locale? Open a ticket at helpdesk.nabzclan.vip with the voice you want, and the Nabzclan team will review it and add it when possible.

Each voice includes a sample generated with this phrase:

Hello, this is a voice test generated with the Capa text to speech model by nabzclan.

English

| Voice ID | Region | Suggested use | Sample |
| --- | --- | --- | --- |
| v1-capa-tts-nabzclan/en-AU-NatashaNeural | Australia | Australian female voice for apps, narration, and support | Play MP3 |
| v1-capa-tts-nabzclan/en-CA-ClaraNeural | Canada | Canadian female voice for product and support audio | Play MP3 |
| v1-capa-tts-nabzclan/en-CA-LiamNeural | Canada | Canadian male voice for narration and announcements | Play MP3 |
| v1-capa-tts-nabzclan/en-HK-YanNeural | Hong Kong | Hong Kong English female voice | Play MP3 |
| v1-capa-tts-nabzclan/en-HK-SamNeural | Hong Kong | Hong Kong English male voice | Play MP3 |
| v1-capa-tts-nabzclan/en-IN-NeerjaExpressiveNeural | India | Expressive Indian English female voice | Play MP3 |
| v1-capa-tts-nabzclan/en-IN-PrabhatNeural | India | Indian English male voice | Play MP3 |
| v1-capa-tts-nabzclan/en-NZ-MitchellNeural | New Zealand | New Zealand male voice | Play MP3 |
| v1-capa-tts-nabzclan/en-NZ-MollyNeural | New Zealand | New Zealand female voice | Play MP3 |
| v1-capa-tts-nabzclan/en-US-MichelleNeural | United States | US female voice for assistants and narration | Play MP3 |
| v1-capa-tts-nabzclan/en-US-EmmaMultilingualNeural | United States | US multilingual female voice for mixed-language content | Play MP3 |
| v1-capa-tts-nabzclan/en-US-GuyNeural | United States | US male voice for explainers and announcements | Play MP3 |
| v1-capa-tts-nabzclan/en-US-ChristopherNeural | United States | US male voice for product and support audio | Play MP3 |

Spanish

| Voice ID | Region | Suggested use | Sample |
| --- | --- | --- | --- |
| v1-capa-tts-nabzclan/es-AR-ElenaNeural | Argentina | Argentine Spanish female voice | Play MP3 |
| v1-capa-tts-nabzclan/es-MX-JorgeNeural | Mexico | Mexican Spanish male voice | Play MP3 |
| v1-capa-tts-nabzclan/es-PR-KarinaNeural | Puerto Rico | Puerto Rican Spanish female voice | Play MP3 |
| v1-capa-tts-nabzclan/es-ES-AlvaroNeural | Spain | Spanish male narration | Play MP3 |
| v1-capa-tts-nabzclan/es-DO-RamonaNeural | Dominican Republic | Dominican Spanish female voice | Play MP3 |
| v1-capa-tts-nabzclan/es-DO-EmilioNeural | Dominican Republic | Dominican Spanish male voice | Play MP3 |

Choosing a Voice

  • For apps and assistants: Use clear, neutral voices such as v1-capa-tts-nabzclan/en-AU-NatashaNeural, v1-capa-tts-nabzclan/en-US-MichelleNeural, or v1-capa-tts-nabzclan/en-US-GuyNeural.
  • For expressive narration: Pick voices with stronger delivery such as v1-capa-tts-nabzclan/en-IN-NeerjaExpressiveNeural or v1-capa-tts-nabzclan/en-US-ChristopherNeural.
  • For multilingual content: Prefer multilingual voices when available, such as v1-capa-tts-nabzclan/en-US-EmmaMultilingualNeural.
  • For regional products: Match the locale to your user base, for example en-AU for Australia, en-CA for Canada, en-HK for Hong Kong, or es-MX for Mexico.
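
The locale-matching advice above can be sketched as a small lookup with a per-language fallback. The mapping covers only a few of the listed voices, and both the per-language defaults and the pick_voice helper are arbitrary illustrative choices, not recommendations from the API.

```python
PREFIX = "v1-capa-tts-nabzclan/"

# One voice per locale, taken from the lists in this document.
VOICES_BY_LOCALE = {
    "en-AU": PREFIX + "en-AU-NatashaNeural",
    "en-CA": PREFIX + "en-CA-ClaraNeural",
    "en-HK": PREFIX + "en-HK-YanNeural",
    "en-US": PREFIX + "en-US-MichelleNeural",
    "es-ES": PREFIX + "es-ES-AlvaroNeural",
    "es-MX": PREFIX + "es-MX-JorgeNeural",
}

# Hypothetical defaults used when the exact region has no entry.
LANGUAGE_DEFAULTS = {"en": "en-US", "es": "es-MX"}

MULTILINGUAL_VOICE = PREFIX + "en-US-EmmaMultilingualNeural"


def pick_voice(user_locale, multilingual=False):
    """Return a voice ID for the locale, falling back to a language default."""
    if multilingual:
        return MULTILINGUAL_VOICE
    if user_locale in VOICES_BY_LOCALE:
        return VOICES_BY_LOCALE[user_locale]
    language = user_locale.split("-")[0]
    fallback = LANGUAGE_DEFAULTS.get(language)
    if fallback is None:
        raise ValueError(f"No voice configured for locale {user_locale!r}")
    return VOICES_BY_LOCALE[fallback]
```

For example, pick_voice("en-AU") returns the Natasha voice, while pick_voice("es-CL") falls back to the Mexican Spanish voice because no Chilean entry exists in the mapping.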

Examples

Generate an MP3 File

curl -X POST https://developer.nabzclan.vip/api/ai/v1/audio/speech \
  -H "Authorization: Bearer YOUR_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "v1-capa-tts-nabzclan/en-AU-NatashaNeural",
    "input": "Welcome to Nabzclan Developer. Your audio is ready."
  }' \
  --output speech.mp3

Generate JSON/Base64 Audio

curl -X POST "https://developer.nabzclan.vip/api/ai/v1/audio/speech?response_format=json" \
  -H "Authorization: Bearer YOUR_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "v1-capa-tts-nabzclan/en-AU-NatashaNeural",
    "input": "Hello, this is a text to speech test."
  }'

Change Audio Format

curl -X POST "https://developer.nabzclan.vip/api/ai/v1/audio/speech?response_format=wav" \
  -H "Authorization: Bearer YOUR_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "v1-capa-tts-nabzclan/en-US-MichelleNeural",
    "input": "This file will be returned as WAV audio."
  }' \
  --output speech.wav

Python: Save MP3

import requests

url = "https://developer.nabzclan.vip/api/ai/v1/audio/speech"
headers = {
    "Authorization": "Bearer YOUR_API_TOKEN",
    "Content-Type": "application/json",
}
payload = {
    "model": "v1-capa-tts-nabzclan/en-AU-NatashaNeural",
    "input": "Hello, this is a text to speech test.",
}

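# Set a timeout so a stalled connection cannot hang the script indefinitely.
response = requests.post(url, headers=headers, json=payload, timeout=60)
response.raise_for_status()  # Raise on 4xx/5xx instead of saving an error body as audio.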

with open("speech.mp3", "wb") as audio_file:
    audio_file.write(response.content)

Python: Decode JSON/Base64

Inspect the response keys, then decode the base64 audio field returned by Capa TTS. The example tries several common field names and reports the actual keys if none of them match.

import base64
import requests

url = "https://developer.nabzclan.vip/api/ai/v1/audio/speech?response_format=json"
headers = {
    "Authorization": "Bearer YOUR_API_TOKEN",
    "Content-Type": "application/json",
}
payload = {
    "model": "v1-capa-tts-nabzclan/en-AU-NatashaNeural",
    "input": "Hello, this is a text to speech test.",
}

# Set a timeout so a stalled connection cannot hang the script indefinitely.
response = requests.post(url, headers=headers, json=payload, timeout=60)
response.raise_for_status()
data = response.json()

# Common names are audio, data, base64, or b64_json.
audio_base64 = data.get("audio") or data.get("data") or data.get("base64") or data.get("b64_json")
if not audio_base64:
    raise ValueError(f"No base64 audio field found. Response keys: {list(data.keys())}")

with open("speech.mp3", "wb") as audio_file:
    audio_file.write(base64.b64decode(audio_base64))

JavaScript: Browser Playback

const response = await fetch("https://developer.nabzclan.vip/api/ai/v1/audio/speech", {
  method: "POST",
  headers: {
    "Authorization": "Bearer YOUR_API_TOKEN",
    "Content-Type": "application/json",
  },
  body: JSON.stringify({
    model: "v1-capa-tts-nabzclan/en-US-EmmaMultilingualNeural",
    input: "This audio plays directly in the browser.",
  }),
});

if (!response.ok) {
  throw new Error(`TTS failed: ${response.status}`);
}

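// Wrap the audio bytes in an object URL so an Audio element can play them.
const audioBlob = await response.blob();
const audioUrl = URL.createObjectURL(audioBlob);
const audio = new Audio(audioUrl);
// Release the object URL once playback finishes.
audio.addEventListener("ended", () => URL.revokeObjectURL(audioUrl), { once: true });
// Browsers may block play() unless it is triggered by a user gesture.
await audio.play();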

Node.js: Save Audio

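// Requires Node.js 18+ for the global fetch, and an ES module (.mjs or "type": "module") for top-level await.
import fs from "node:fs/promises";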

const response = await fetch("https://developer.nabzclan.vip/api/ai/v1/audio/speech", {
  method: "POST",
  headers: {
    "Authorization": "Bearer YOUR_API_TOKEN",
    "Content-Type": "application/json",
  },
  body: JSON.stringify({
    model: "v1-capa-tts-nabzclan/en-US-ChristopherNeural",
    input: "Your generated narration is ready.",
  }),
});

if (!response.ok) {
  throw new Error(await response.text());
}

const audio = Buffer.from(await response.arrayBuffer());
await fs.writeFile("speech.mp3", audio);

Best Practices

  • Keep text clean: strip markup from the input unless your chosen voice supports it.
  • Split very long scripts into smaller chunks for easier retrying and editing.
  • Cache repeated audio so you do not regenerate the same phrase every time.
  • Use JSON/base64 only when needed; raw audio is smaller and better for direct downloads.
  • Match the locale of the voice to the language and region of the text.
  • Test multiple voices for long-form content because pacing and tone can change listener comfort.
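
Two of the practices above, chunking long scripts and caching repeated audio, can be sketched in Python. Both helpers are hypothetical, and the 2,000-character default is an arbitrary illustrative value well under the 50,000-character input limit.

```python
import hashlib
import json
import re


def split_script(text, max_chars=2000):
    """Split text into chunks of at most max_chars, breaking at sentence ends."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current = [], ""
    for sentence in sentences:
        # Hard-split a single oversized sentence so no chunk exceeds the limit.
        while len(sentence) > max_chars:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:max_chars])
            sentence = sentence[max_chars:]
        candidate = f"{current} {sentence}".strip()
        if len(candidate) <= max_chars:
            current = candidate
        else:
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks


def cache_key(model, text, response_format=None, speed=None):
    """Return a stable key so identical requests map to the same cached audio."""
    raw = json.dumps([model, text, response_format, speed])
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()
```

Each chunk can then be sent as its own request and stored under its cache_key, so a failed chunk is retried alone and a repeated phrase is generated once.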

Errors

| Status | Meaning | Fix |
| --- | --- | --- |
| 401 | Missing or invalid API token | Send Authorization: Bearer YOUR_API_TOKEN. |
| 422 | Validation failed | Check model, input, response_format, and speed. |
| 429 | AI rate limit exceeded | Wait for reset or upgrade your plan. |
| 500 | Internal server error | Retry later; contact support if it continues. |
| 502 | Voice generation temporarily unavailable | Retry later; contact support if it continues. |
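
One way to act on these statuses is to retry only the ones whose fix is to wait (429, 500, 502) and fail immediately on 401 and 422, which need a corrected request. The sketch below is an assumption-laden illustration: the helper names, the exponential backoff schedule, and the attempt limit are all hypothetical, and send is any caller-supplied function returning a (status, body) pair.

```python
import time

RETRYABLE_STATUSES = {429, 500, 502}


def should_retry(status_code):
    """Return True for statuses where waiting and retrying may help."""
    return status_code in RETRYABLE_STATUSES


def backoff_delay(attempt, base=1.0, cap=30.0):
    """Seconds to wait after 0-based attempt: base * 2**attempt, capped."""
    return min(cap, base * (2 ** attempt))


def call_with_retry(send, max_attempts=3, sleep=time.sleep):
    """Call send() until it succeeds, hits a non-retryable status, or runs out of attempts."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        status, body = send()
        if status < 400:
            return body
        if not should_retry(status) or attempt == max_attempts - 1:
            raise RuntimeError(f"TTS request failed with status {status}")
        sleep(backoff_delay(attempt))
```

Because failed authenticated requests also count toward the AI request limit, keeping max_attempts small avoids spending quota on retries. The reset interval for 429 is not stated here, so a short backoff may not be long enough for that case.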

Rate Limiting

Text to Speech is an AI endpoint. Every authenticated request, whether it succeeds or fails, is tracked as an AI API request and counts toward your plan's AI request limit.