Text to Speech

The Text to Speech API converts text into spoken audio using Capa TTS voices. It supports raw audio downloads for direct playback and a JSON/base64 mode for apps that need to store or transport audio inside JSON.

Important: This endpoint uses Capa Text to Speech and counts toward your AI request limit.

Voice IDs use the Capa model format, for example v1-capa-tts-nabzclan/en-AU-NatashaNeural.

Endpoint

POST https://developer.nabzclan.vip/api/ai/v1/audio/speech

Authentication

Include your Developer API token in the Authorization header:

Authorization: Bearer YOUR_API_TOKEN

When to Use It

App narration: Read articles, stories, guides, onboarding screens, or generated AI responses aloud.
Accessibility: Add spoken output for users who prefer audio or need screen-reader-style content.
Customer support: Generate voice replies for chatbots, support flows, IVR menus, and help center content.
Education: Create pronunciation examples, language-learning clips, flashcards, and lesson audio.
Creator tools: Generate voiceovers for reels, shorts, podcasts, announcements, and product demos.
Notifications: Turn alerts, status changes, moderation results, or monitoring messages into audio.

Request Parameters

Required

Field	Type	Description
`model`	string	Voice/model ID to use, such as `v1-capa-tts-nabzclan/en-AU-NatashaNeural`.
`input`	string	Text to convert into speech. Maximum 50,000 characters.

Optional

Field	Type	Description
`response_format`	string	Output format. Supports `mp3`, `wav`, `opus`, `aac`, `flac`, `pcm`, or `json`. Use `json` to receive base64-encoded audio wrapped in JSON.
`voice`	string	Optional voice override for advanced voice routing. Usually not needed for `v1-capa-tts-nabzclan/...` models.
`speed`	number	Playback speed from `0.25` to `4`. Use `1` for normal speed.
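
The limits in the tables above can be checked on the client before a request is sent, which avoids spending an AI request on a call that would fail validation. The following is a minimal sketch, not part of the API: the helper name `build_tts_payload` and its constants are hypothetical. It places `response_format` in the request body as the table lists it, whereas the curl examples on this page pass it in the query string.

```python
# Values taken from the parameter tables; the names are illustrative only.
ALLOWED_FORMATS = {"mp3", "wav", "opus", "aac", "flac", "pcm", "json"}
MAX_INPUT_CHARS = 50_000


def build_tts_payload(model, text, response_format=None, speed=None):
    """Build a request body, raising ValueError for values the tables rule out."""
    if not model:
        raise ValueError("model is required")
    if not text:
        raise ValueError("input is required")
    if len(text) > MAX_INPUT_CHARS:
        raise ValueError(
            f"input is {len(text)} characters; the limit is {MAX_INPUT_CHARS}"
        )

    payload = {"model": model, "input": text}

    if response_format is not None:
        if response_format not in ALLOWED_FORMATS:
            raise ValueError(f"unsupported response_format: {response_format!r}")
        payload["response_format"] = response_format

    if speed is not None:
        if not 0.25 <= speed <= 4:
            raise ValueError("speed must be between 0.25 and 4")
        payload["speed"] = speed

    return payload
```

The returned dictionary can be passed as the JSON body of the POST request. Optional fields are left out unless they are set, so the server applies its own defaults.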

Response Modes

Raw Audio

By default, the API returns audio bytes. Save the response body to a file, stream it to a media player, or send it directly to a client.

curl -X POST https://developer.nabzclan.vip/api/ai/v1/audio/speech \
  -H "Authorization: Bearer YOUR_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "v1-capa-tts-nabzclan/en-AU-NatashaNeural",
    "input": "Hello, this is a text to speech test."
  }' \
  --output speech.mp3

JSON Base64

Pass response_format=json in the query string to receive a JSON response that contains the generated audio as base64-encoded data instead of raw bytes.

curl -X POST "https://developer.nabzclan.vip/api/ai/v1/audio/speech?response_format=json" \
  -H "Authorization: Bearer YOUR_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "v1-capa-tts-nabzclan/en-AU-NatashaNeural",
    "input": "Hello, this is a text to speech test."
  }'

Use JSON/base64 mode when:

You are calling from a serverless function that prefers JSON payloads.
You need to store the audio inside a database record or queue message.
Your client cannot easily handle binary responses.
You want a single API response containing metadata and audio data.
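
As an illustration of the storage and queue cases above, the sketch below wraps audio bytes and some metadata in a single JSON string and recovers the bytes later. The record layout (`voice`, `text`, `audio_base64`) is an assumption made for this example and is not the schema of the API's JSON response.

```python
import base64
import json


def audio_to_record(audio_bytes, voice_id, text):
    """Wrap audio bytes and metadata in one JSON string (hypothetical layout)."""
    record = {
        "voice": voice_id,
        "text": text,
        # base64 output is plain ASCII, so it is safe inside a JSON string.
        "audio_base64": base64.b64encode(audio_bytes).decode("ascii"),
    }
    return json.dumps(record)


def record_to_audio(record_json):
    """Recover the original audio bytes from a record made by audio_to_record."""
    record = json.loads(record_json)
    return base64.b64decode(record["audio_base64"])
```

Base64 represents every 3 bytes as 4 characters, so the encoded audio is about a third larger than the raw bytes. That overhead is the main cost of keeping audio inside JSON.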

Voice IDs

Voice IDs should be sent in the model field. The current format is:

v1-capa-tts-nabzclan/{locale}-{VoiceName}Neural

Examples:

Voice ID	Locale	Style
`v1-capa-tts-nabzclan/en-AU-NatashaNeural`	English Australia	Female, natural general-purpose voice
`v1-capa-tts-nabzclan/en-US-GuyNeural`	English United States	Male, confident general-purpose voice
`v1-capa-tts-nabzclan/en-US-EmmaMultilingualNeural`	English United States	Female, multilingual voice
`v1-capa-tts-nabzclan/es-MX-JorgeNeural`	Spanish Mexico	Male Spanish voice
`v1-capa-tts-nabzclan/es-AR-ElenaNeural`	Spanish Argentina	Female Spanish voice
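
A voice ID in this format can be split into its locale and voice name, for example to group voices by region in a settings screen. This is a sketch under one assumption: that every locale looks like the ones listed on this page (two lowercase letters, a hyphen, two uppercase letters). The helper `parse_voice_id` is hypothetical and not provided by the API.

```python
import re

# Matches v1-capa-tts-nabzclan/{locale}-{VoiceName}Neural.
# The locale shape (for example en-AU) is inferred from the listed voices.
VOICE_ID_PATTERN = re.compile(
    r"^v1-capa-tts-nabzclan/(?P<locale>[a-z]{2}-[A-Z]{2})-(?P<name>[A-Za-z]+)Neural$"
)


def parse_voice_id(voice_id):
    """Return (locale, voice_name) or raise ValueError if the ID does not match."""
    match = VOICE_ID_PATTERN.match(voice_id)
    if match is None:
        raise ValueError(f"not a Capa TTS voice ID: {voice_id!r}")
    return match.group("locale"), match.group("name")
```

A match only shows that the string has the expected shape. It does not confirm that the voice is actually available.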

Available Voices

This list shows the currently available Capa TTS voices. Send these IDs in the model field using the v1-capa-tts-nabzclan/... format.

Need another voice or locale? Open a ticket at helpdesk.nabzclan.vip with the voice you want, and the Nabzclan team will review it and add it when possible.

Each voice includes a sample generated with this phrase:

Hello, this is a voice test generated with the Capa text to speech model by nabzclan.

English

Voice ID	Region	Suggested use	Sample
`v1-capa-tts-nabzclan/en-AU-NatashaNeural`	Australia	Australian female voice for apps, narration, and support	Play MP3
`v1-capa-tts-nabzclan/en-CA-ClaraNeural`	Canada	Canadian female voice for product and support audio	Play MP3
`v1-capa-tts-nabzclan/en-CA-LiamNeural`	Canada	Canadian male voice for narration and announcements	Play MP3
`v1-capa-tts-nabzclan/en-HK-YanNeural`	Hong Kong	Hong Kong English female voice	Play MP3
`v1-capa-tts-nabzclan/en-HK-SamNeural`	Hong Kong	Hong Kong English male voice	Play MP3
`v1-capa-tts-nabzclan/en-IN-NeerjaExpressiveNeural`	India	Expressive Indian English female voice	Play MP3
`v1-capa-tts-nabzclan/en-IN-PrabhatNeural`	India	Indian English male voice	Play MP3
`v1-capa-tts-nabzclan/en-NZ-MitchellNeural`	New Zealand	New Zealand male voice	Play MP3
`v1-capa-tts-nabzclan/en-NZ-MollyNeural`	New Zealand	New Zealand female voice	Play MP3
`v1-capa-tts-nabzclan/en-US-MichelleNeural`	United States	US female voice for assistants and narration	Play MP3
`v1-capa-tts-nabzclan/en-US-EmmaMultilingualNeural`	United States	US multilingual female voice for mixed-language content	Play MP3
`v1-capa-tts-nabzclan/en-US-GuyNeural`	United States	US male voice for explainers and announcements	Play MP3
`v1-capa-tts-nabzclan/en-US-ChristopherNeural`	United States	US male voice for product and support audio	Play MP3

Spanish

Voice ID	Region	Suggested use	Sample
`v1-capa-tts-nabzclan/es-AR-ElenaNeural`	Argentina	Argentine Spanish female voice	Play MP3
`v1-capa-tts-nabzclan/es-MX-JorgeNeural`	Mexico	Mexican Spanish male voice	Play MP3
`v1-capa-tts-nabzclan/es-PR-KarinaNeural`	Puerto Rico	Puerto Rican Spanish female voice	Play MP3
`v1-capa-tts-nabzclan/es-ES-AlvaroNeural`	Spain	Spanish male narration	Play MP3
`v1-capa-tts-nabzclan/es-DO-RamonaNeural`	Dominican Republic	Dominican Spanish female voice	Play MP3
`v1-capa-tts-nabzclan/es-DO-EmilioNeural`	Dominican Republic	Dominican Spanish male voice	Play MP3

Choosing a Voice

For apps and assistants: Use clear, neutral voices such as v1-capa-tts-nabzclan/en-AU-NatashaNeural, v1-capa-tts-nabzclan/en-US-MichelleNeural, or v1-capa-tts-nabzclan/en-US-GuyNeural.
For expressive narration: Pick voices with stronger delivery such as v1-capa-tts-nabzclan/en-IN-NeerjaExpressiveNeural or v1-capa-tts-nabzclan/en-US-ChristopherNeural.
For multilingual content: Prefer multilingual voices when available, such as v1-capa-tts-nabzclan/en-US-EmmaMultilingualNeural.
For regional products: Match the locale to your user base, for example en-AU for Australia, en-CA for Canada, en-HK for Hong Kong, or es-MX for Mexico.
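
The locale-matching guidance above could be sketched as a small lookup. The mapping, the per-language fallbacks, and the `pick_voice` helper are illustrative choices for this example, not recommendations from the API. Falling back to the multilingual voice for unlisted languages is an assumption, since this page does not state which languages that voice covers.

```python
# One voice per locale, chosen from the voices listed on this page.
VOICES_BY_LOCALE = {
    "en-AU": "v1-capa-tts-nabzclan/en-AU-NatashaNeural",
    "en-CA": "v1-capa-tts-nabzclan/en-CA-ClaraNeural",
    "en-HK": "v1-capa-tts-nabzclan/en-HK-YanNeural",
    "en-US": "v1-capa-tts-nabzclan/en-US-MichelleNeural",
    "es-ES": "v1-capa-tts-nabzclan/es-ES-AlvaroNeural",
    "es-MX": "v1-capa-tts-nabzclan/es-MX-JorgeNeural",
}

# Arbitrary per-language defaults used when the exact locale is not listed.
LANGUAGE_FALLBACK = {"en": "en-US", "es": "es-MX"}

MULTILINGUAL_VOICE = "v1-capa-tts-nabzclan/en-US-EmmaMultilingualNeural"


def pick_voice(user_locale):
    """Pick a voice ID: exact locale first, then same language, then multilingual."""
    if user_locale in VOICES_BY_LOCALE:
        return VOICES_BY_LOCALE[user_locale]
    language = user_locale.split("-")[0].lower()
    fallback_locale = LANGUAGE_FALLBACK.get(language)
    if fallback_locale is not None:
        return VOICES_BY_LOCALE[fallback_locale]
    return MULTILINGUAL_VOICE
```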

Examples

Generate an MP3 File

curl -X POST https://developer.nabzclan.vip/api/ai/v1/audio/speech \
  -H "Authorization: Bearer YOUR_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "v1-capa-tts-nabzclan/en-AU-NatashaNeural",
    "input": "Welcome to Nabzclan Developer. Your audio is ready."
  }' \
  --output speech.mp3

Generate JSON/Base64 Audio

curl -X POST "https://developer.nabzclan.vip/api/ai/v1/audio/speech?response_format=json" \
  -H "Authorization: Bearer YOUR_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "v1-capa-tts-nabzclan/en-AU-NatashaNeural",
    "input": "Hello, this is a text to speech test."
  }'

Change Audio Format

curl -X POST "https://developer.nabzclan.vip/api/ai/v1/audio/speech?response_format=wav" \
  -H "Authorization: Bearer YOUR_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "v1-capa-tts-nabzclan/en-US-MichelleNeural",
    "input": "This file will be returned as WAV audio."
  }' \
  --output speech.wav

Python: Save MP3

import requests

url = "https://developer.nabzclan.vip/api/ai/v1/audio/speech"
headers = {
    "Authorization": "Bearer YOUR_API_TOKEN",
    "Content-Type": "application/json",
}
payload = {
    "model": "v1-capa-tts-nabzclan/en-AU-NatashaNeural",
    "input": "Hello, this is a text to speech test.",
}

# Set a timeout so a stalled connection cannot hang the script indefinitely.
response = requests.post(url, headers=headers, json=payload, timeout=60)
response.raise_for_status()

with open("speech.mp3", "wb") as audio_file:
    audio_file.write(response.content)

Python: Decode JSON/Base64

Check the response keys, then decode the base64 audio field returned by Capa TTS.

import base64
import requests

url = "https://developer.nabzclan.vip/api/ai/v1/audio/speech?response_format=json"
headers = {
    "Authorization": "Bearer YOUR_API_TOKEN",
    "Content-Type": "application/json",
}
payload = {
    "model": "v1-capa-tts-nabzclan/en-AU-NatashaNeural",
    "input": "Hello, this is a text to speech test.",
}

# Set a timeout so a stalled connection cannot hang the script indefinitely.
response = requests.post(url, headers=headers, json=payload, timeout=60)
response.raise_for_status()
data = response.json()

# Common names are audio, data, base64, or b64_json.
audio_base64 = data.get("audio") or data.get("data") or data.get("base64") or data.get("b64_json")
if not audio_base64:
    raise ValueError(f"No base64 audio field found. Response keys: {list(data.keys())}")

with open("speech.mp3", "wb") as audio_file:
    audio_file.write(base64.b64decode(audio_base64))

JavaScript: Browser Playback

const response = await fetch("https://developer.nabzclan.vip/api/ai/v1/audio/speech", {
  method: "POST",
  headers: {
    "Authorization": "Bearer YOUR_API_TOKEN",
    "Content-Type": "application/json",
  },
  body: JSON.stringify({
    model: "v1-capa-tts-nabzclan/en-US-EmmaMultilingualNeural",
    input: "This audio plays directly in the browser.",
  }),
});

if (!response.ok) {
  throw new Error(`TTS failed: ${response.status}`);
}

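const audioBlob = await response.blob();
const audioUrl = URL.createObjectURL(audioBlob);
const audio = new Audio(audioUrl);
// Release the object URL once playback finishes so the blob can be freed.
audio.addEventListener("ended", () => URL.revokeObjectURL(audioUrl), { once: true });
await audio.play();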

Node.js: Save Audio

import fs from "node:fs/promises";

const response = await fetch("https://developer.nabzclan.vip/api/ai/v1/audio/speech", {
  method: "POST",
  headers: {
    "Authorization": "Bearer YOUR_API_TOKEN",
    "Content-Type": "application/json",
  },
  body: JSON.stringify({
    model: "v1-capa-tts-nabzclan/en-US-ChristopherNeural",
    input: "Your generated narration is ready.",
  }),
});

if (!response.ok) {
  throw new Error(await response.text());
}

const audio = Buffer.from(await response.arrayBuffer());
await fs.writeFile("speech.mp3", audio);

Best Practices

Keep text clean: strip markup from the input unless your chosen voice supports it.
Split very long scripts into smaller chunks for easier retrying and editing.
Cache repeated audio so you do not regenerate the same phrase every time.
Use JSON/base64 only when needed; raw audio is smaller and better for direct downloads.
Match the locale of the voice to the language and region of the text.
Test multiple voices for long-form content because pacing and tone affect listener comfort.
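
Two of the practices above, splitting long scripts and caching repeated audio, can be sketched with the standard library alone. The helper names, the 2,000-character default chunk size, and the cache-key layout are all assumptions made for this example. The only limit this page documents is the 50,000-character maximum for `input`.

```python
import hashlib
import re


def split_script(text, max_chars=2000):
    """Split text into chunks of at most max_chars, breaking after sentences."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current = [], ""
    for sentence in sentences:
        # A single sentence longer than the limit is cut at max_chars.
        while len(sentence) > max_chars:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:max_chars])
            sentence = sentence[max_chars:]
        candidate = f"{current} {sentence}".strip()
        if len(candidate) <= max_chars:
            current = candidate
        else:
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks


def cache_key(model, text, response_format, speed):
    """Derive a stable key from everything that changes the generated audio."""
    parts = [model, response_format, f"{float(speed):g}", text]
    return hashlib.sha256("\x1f".join(parts).encode("utf-8")).hexdigest()
```

Each chunk can be sent as its own request and retried on its own. The key can name a file or cache entry, so an identical phrase with the same voice, format, and speed is generated only once.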

Errors

Status	Meaning	Fix
`401`	Missing or invalid API token	Send `Authorization: Bearer YOUR_API_TOKEN`.
`422`	Validation failed	Check `model`, `input`, `response_format`, and `speed`.
`429`	AI rate limit exceeded	Wait for reset or upgrade your plan.
`500`	Internal server error	Retry later; contact support if it continues.
`502`	Voice generation temporarily unavailable	Retry later; contact support if it continues.
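
The fixes in the table suggest a simple split: `429`, `500`, and `502` are worth retrying after a wait, while `401` and `422` need a corrected request. A minimal sketch of that policy follows, assuming exponential backoff (a choice made here, not something the API prescribes). `call_with_retry` and its `send` argument are hypothetical; `send` stands for any zero-argument function that performs the request and returns a `(status_code, body)` pair.

```python
import time

# Statuses the errors table says to retry later or wait on.
RETRYABLE_STATUSES = {429, 500, 502}


def call_with_retry(send, max_attempts=4, base_delay=1.0, sleep=time.sleep):
    """Call send() until it returns a non-retryable status or attempts run out."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        status, body = send()
        if status not in RETRYABLE_STATUSES or attempt == max_attempts:
            return status, body
        # Wait 1x, 2x, 4x, ... the base delay before the next attempt.
        sleep(base_delay * 2 ** (attempt - 1))
```

Because failed authenticated requests also count toward the AI request limit, keep `max_attempts` small.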

Rate Limiting

Text to Speech is an AI endpoint. Every authenticated request, whether it succeeds or fails, is tracked as an AI API request and counts toward your plan's AI request limit.