Text to Speech
The Text to Speech API converts text into spoken audio using Capa TTS voices. It supports raw audio downloads for direct playback and a JSON/base64 mode for apps that need to store or transport audio inside JSON.
[!IMPORTANT] This endpoint uses Capa Text to Speech and counts toward your AI request limit.
Voice IDs use the Capa model format, for example v1-capa-tts-nabzclan/en-AU-NatashaNeural.
Endpoint
POST https://developer.nabzclan.vip/api/ai/v1/audio/speech
Authentication
Include your Developer API token in the Authorization header:
Authorization: Bearer YOUR_API_TOKEN
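As a rough sketch, the header can be built in Python from a token kept outside the source code. The helper name build_headers and the environment variable NABZCLAN_API_TOKEN are illustrative assumptions, not part of the API.

```python
import os


def build_headers(token: str) -> dict:
    """Return the HTTP headers for an authenticated JSON request."""
    if not token or not token.strip():
        raise ValueError("API token is empty")
    return {
        "Authorization": f"Bearer {token.strip()}",
        "Content-Type": "application/json",
    }


if __name__ == "__main__":
    # NABZCLAN_API_TOKEN is a hypothetical variable name; use whatever
    # your deployment defines.
    token = os.environ.get("NABZCLAN_API_TOKEN", "")
    if token:
        print(sorted(build_headers(token)))
    else:
        print("Set NABZCLAN_API_TOKEN before calling the API.")
```

Reading the token from the environment keeps it out of version control and makes it easy to rotate.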
When to Use It
- App narration: Read articles, stories, guides, onboarding screens, or generated AI responses aloud.
- Accessibility: Add spoken output for users who prefer audio or need screen-reader-style content.
- Customer support: Generate voice replies for chatbots, support flows, IVR menus, and help center content.
- Education: Create pronunciation examples, language-learning clips, flashcards, and lesson audio.
- Creator tools: Generate voiceovers for reels, shorts, podcasts, announcements, and product demos.
- Notifications: Turn alerts, status changes, moderation results, or monitoring messages into audio.
Request Parameters
Required
| Field | Type | Description |
|---|---|---|
| model | string | Voice/model ID to use, such as v1-capa-tts-nabzclan/en-AU-NatashaNeural. |
| input | string | Text to convert into speech. Maximum 50,000 characters. |
Optional
| Field | Type | Description |
|---|---|---|
| response_format | string | Output format. Supports mp3, wav, opus, aac, flac, pcm, or json. Use json for base64 audio JSON. |
| voice | string | Optional voice override for advanced voice routing. Usually not needed for v1-capa-tts-nabzclan/... models. |
| speed | number | Playback speed from 0.25 to 4. Use 1 for normal speed. |
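The limits in the two tables above can be checked on the client before a request is sent, which may help avoid 422 responses. The following is a minimal sketch; the function validate_request is hypothetical, and it assumes the server counts characters the same way Python's len does.

```python
ALLOWED_FORMATS = {"mp3", "wav", "opus", "aac", "flac", "pcm", "json"}
MAX_INPUT_CHARS = 50_000


def validate_request(model, text, response_format=None, speed=None):
    """Return a list of problems; an empty list means the values look valid."""
    problems = []
    if not model:
        problems.append("model is required")
    if not text:
        problems.append("input is required")
    elif len(text) > MAX_INPUT_CHARS:
        problems.append(
            f"input has {len(text)} characters; maximum is {MAX_INPUT_CHARS}"
        )
    if response_format is not None and response_format not in ALLOWED_FORMATS:
        problems.append(f"unsupported response_format: {response_format}")
    if speed is not None and not (0.25 <= speed <= 4):
        problems.append(f"speed {speed} is outside 0.25 to 4")
    return problems
```

For example, validate_request("v1-capa-tts-nabzclan/en-US-GuyNeural", "Hello", speed=5) returns one problem describing the out-of-range speed. The server remains the source of truth; this check only catches obvious mistakes early.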
Response Modes
Raw Audio
By default, the API returns audio bytes. Save the response body to a file, stream it to a media player, or send it directly to a client.
curl -X POST https://developer.nabzclan.vip/api/ai/v1/audio/speech \
  -H "Authorization: Bearer YOUR_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "v1-capa-tts-nabzclan/en-AU-NatashaNeural",
    "input": "Hello, this is a text to speech test."
  }' \
  --output speech.mp3
JSON Base64
Pass response_format=json in the query string to receive a JSON response that carries the generated audio as base64-encoded data instead of raw bytes.
curl -X POST "https://developer.nabzclan.vip/api/ai/v1/audio/speech?response_format=json" \
  -H "Authorization: Bearer YOUR_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "v1-capa-tts-nabzclan/en-AU-NatashaNeural",
    "input": "Hello, this is a text to speech test."
  }'
Use JSON/base64 mode when:
- You are calling from a serverless function that prefers JSON payloads.
- You need to store the audio inside a database record or queue message.
- Your client cannot easily handle binary responses.
- You want a single API response containing metadata and audio data.
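To illustrate the storage and transport cases above, here is a small sketch that wraps audio bytes in a JSON record and recovers them again. The record layout and the key names voice and audio_b64 are hypothetical choices for your own storage, not the API's response schema.

```python
import base64
import json


def to_record(audio_bytes: bytes, voice_id: str) -> str:
    """Serialize audio plus a little metadata as a JSON string."""
    return json.dumps({
        "voice": voice_id,
        "audio_b64": base64.b64encode(audio_bytes).decode("ascii"),
    })


def from_record(record: str) -> bytes:
    """Recover the original audio bytes from a record made by to_record."""
    return base64.b64decode(json.loads(record)["audio_b64"])
```

Base64 encodes every 3 bytes as 4 characters, so the encoded audio is roughly one third larger than the raw bytes. That overhead is the main reason to prefer raw audio when a binary response is an option.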
Voice IDs
Voice IDs should be sent in the model field. The current format is:
v1-capa-tts-nabzclan/{locale}-{VoiceName}Neural
Examples:
| Voice ID | Locale | Style |
|---|---|---|
| v1-capa-tts-nabzclan/en-AU-NatashaNeural | English Australia | Female, natural general-purpose voice |
| v1-capa-tts-nabzclan/en-US-GuyNeural | English United States | Male, confident general-purpose voice |
| v1-capa-tts-nabzclan/en-US-EmmaMultilingualNeural | English United States | Female, multilingual voice |
| v1-capa-tts-nabzclan/es-MX-JorgeNeural | Spanish Mexico | Male Spanish voice |
| v1-capa-tts-nabzclan/es-AR-ElenaNeural | Spanish Argentina | Female Spanish voice |
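The format above can be checked and split into its parts with a regular expression. This is a sketch that assumes locales always look like xx-XX and voice names contain only letters, which holds for the voices listed on this page but is not guaranteed for future ones. The function parse_voice_id is illustrative.

```python
import re

# Pattern derived from the documented format:
# v1-capa-tts-nabzclan/{locale}-{VoiceName}Neural
VOICE_ID_PATTERN = re.compile(
    r"^v1-capa-tts-nabzclan/"
    r"(?P<locale>[a-z]{2}-[A-Z]{2})-"
    r"(?P<name>[A-Za-z]+)Neural$"
)


def parse_voice_id(voice_id: str) -> dict:
    """Split a voice ID into its locale and voice name."""
    match = VOICE_ID_PATTERN.match(voice_id)
    if match is None:
        raise ValueError(f"Not a recognized Capa TTS voice ID: {voice_id!r}")
    return {"locale": match["locale"], "name": match["name"]}
```

For instance, parse_voice_id("v1-capa-tts-nabzclan/es-MX-JorgeNeural") yields the locale es-MX and the name Jorge. A match only means the ID is well formed, not that the voice exists.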
Available Voices
This list shows the currently available Capa TTS voices. Send these IDs in the model field using the v1-capa-tts-nabzclan/... format.
Need another voice or locale? Open a ticket at helpdesk.nabzclan.vip with the voice you want, and the Nabzclan team will review it and add it when possible.
Each voice includes a sample generated with this phrase:
Hello, this is a voice test generated with the Capa text to speech model by nabzclan.
English
| Voice ID | Region | Suggested use | Sample |
|---|---|---|---|
| v1-capa-tts-nabzclan/en-AU-NatashaNeural | Australia | Australian female voice for apps, narration, and support | Play MP3 |
| v1-capa-tts-nabzclan/en-CA-ClaraNeural | Canada | Canadian female voice for product and support audio | Play MP3 |
| v1-capa-tts-nabzclan/en-CA-LiamNeural | Canada | Canadian male voice for narration and announcements | Play MP3 |
| v1-capa-tts-nabzclan/en-HK-YanNeural | Hong Kong | Hong Kong English female voice | Play MP3 |
| v1-capa-tts-nabzclan/en-HK-SamNeural | Hong Kong | Hong Kong English male voice | Play MP3 |
| v1-capa-tts-nabzclan/en-IN-NeerjaExpressiveNeural | India | Expressive Indian English female voice | Play MP3 |
| v1-capa-tts-nabzclan/en-IN-PrabhatNeural | India | Indian English male voice | Play MP3 |
| v1-capa-tts-nabzclan/en-NZ-MitchellNeural | New Zealand | New Zealand male voice | Play MP3 |
| v1-capa-tts-nabzclan/en-NZ-MollyNeural | New Zealand | New Zealand female voice | Play MP3 |
| v1-capa-tts-nabzclan/en-US-MichelleNeural | United States | US female voice for assistants and narration | Play MP3 |
| v1-capa-tts-nabzclan/en-US-EmmaMultilingualNeural | United States | US multilingual female voice for mixed-language content | Play MP3 |
| v1-capa-tts-nabzclan/en-US-GuyNeural | United States | US male voice for explainers and announcements | Play MP3 |
| v1-capa-tts-nabzclan/en-US-ChristopherNeural | United States | US male voice for product and support audio | Play MP3 |
Spanish
| Voice ID | Region | Suggested use | Sample |
|---|---|---|---|
| v1-capa-tts-nabzclan/es-AR-ElenaNeural | Argentina | Argentine Spanish female voice | Play MP3 |
| v1-capa-tts-nabzclan/es-MX-JorgeNeural | Mexico | Mexican Spanish male voice | Play MP3 |
| v1-capa-tts-nabzclan/es-PR-KarinaNeural | Puerto Rico | Puerto Rican Spanish female voice | Play MP3 |
| v1-capa-tts-nabzclan/es-ES-AlvaroNeural | Spain | Spanish male narration | Play MP3 |
| v1-capa-tts-nabzclan/es-DO-RamonaNeural | Dominican Republic | Dominican Spanish female voice | Play MP3 |
| v1-capa-tts-nabzclan/es-DO-EmilioNeural | Dominican Republic | Dominican Spanish male voice | Play MP3 |
Choosing a Voice
- For apps and assistants: Use clear, neutral voices such as v1-capa-tts-nabzclan/en-AU-NatashaNeural, v1-capa-tts-nabzclan/en-US-MichelleNeural, or v1-capa-tts-nabzclan/en-US-GuyNeural.
- For expressive narration: Pick voices with stronger delivery such as v1-capa-tts-nabzclan/en-IN-NeerjaExpressiveNeural or v1-capa-tts-nabzclan/en-US-ChristopherNeural.
- For multilingual content: Prefer multilingual voices when available, such as v1-capa-tts-nabzclan/en-US-EmmaMultilingualNeural.
- For regional products: Match the locale to your user base, for example en-AU for Australia, en-CA for Canada, en-HK for Hong Kong, or es-MX for Mexico.
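Locale matching can be automated with a simple lookup. The sketch below uses a subset of the voice IDs listed on this page; the helper names and the choice of en-US-MichelleNeural as the fallback are assumptions for illustration only.

```python
PREFIX = "v1-capa-tts-nabzclan/"

# A subset of the voices listed on this page.
VOICES = [
    PREFIX + "en-AU-NatashaNeural",
    PREFIX + "en-CA-ClaraNeural",
    PREFIX + "en-CA-LiamNeural",
    PREFIX + "en-HK-YanNeural",
    PREFIX + "en-US-MichelleNeural",
    PREFIX + "en-US-GuyNeural",
    PREFIX + "es-MX-JorgeNeural",
    PREFIX + "es-AR-ElenaNeural",
]

# Hypothetical default used when no voice matches the user's locale.
FALLBACK_VOICE = PREFIX + "en-US-MichelleNeural"


def voices_for_locale(locale: str) -> list:
    """Return every known voice ID whose locale matches exactly."""
    return [v for v in VOICES if v.startswith(f"{PREFIX}{locale}-")]


def pick_voice(locale: str, fallback: str = FALLBACK_VOICE) -> str:
    """Return the first voice for the locale, or the fallback if none exists."""
    matches = voices_for_locale(locale)
    return matches[0] if matches else fallback
```

With this list, pick_voice("en-CA") returns the en-CA-ClaraNeural voice, while an unlisted locale such as fr-FR falls back to the default.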
Examples
Generate an MP3 File
curl -X POST https://developer.nabzclan.vip/api/ai/v1/audio/speech \
  -H "Authorization: Bearer YOUR_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "v1-capa-tts-nabzclan/en-AU-NatashaNeural",
    "input": "Welcome to Nabzclan Developer. Your audio is ready."
  }' \
  --output speech.mp3
Generate JSON/Base64 Audio
curl -X POST "https://developer.nabzclan.vip/api/ai/v1/audio/speech?response_format=json" \
  -H "Authorization: Bearer YOUR_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "v1-capa-tts-nabzclan/en-AU-NatashaNeural",
    "input": "Hello, this is a text to speech test."
  }'
Change Audio Format
curl -X POST "https://developer.nabzclan.vip/api/ai/v1/audio/speech?response_format=wav" \
  -H "Authorization: Bearer YOUR_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "v1-capa-tts-nabzclan/en-US-MichelleNeural",
    "input": "This file will be returned as WAV audio."
  }' \
  --output speech.wav
Python: Save MP3
import requests

url = "https://developer.nabzclan.vip/api/ai/v1/audio/speech"
headers = {
    "Authorization": "Bearer YOUR_API_TOKEN",
    "Content-Type": "application/json",
}
payload = {
    "model": "v1-capa-tts-nabzclan/en-AU-NatashaNeural",
    "input": "Hello, this is a text to speech test.",
}

# Raw mode: the response body is the audio itself.
response = requests.post(url, headers=headers, json=payload, timeout=60)
response.raise_for_status()

# Write the bytes unchanged; "wb" avoids any text-mode translation.
with open("speech.mp3", "wb") as audio_file:
    audio_file.write(response.content)
Python: Decode JSON/Base64
Check the response keys, then decode the base64 audio field returned by Capa TTS.
import base64
import requests

url = "https://developer.nabzclan.vip/api/ai/v1/audio/speech?response_format=json"
headers = {
    "Authorization": "Bearer YOUR_API_TOKEN",
    "Content-Type": "application/json",
}
payload = {
    "model": "v1-capa-tts-nabzclan/en-AU-NatashaNeural",
    "input": "Hello, this is a text to speech test.",
}

response = requests.post(url, headers=headers, json=payload, timeout=60)
response.raise_for_status()
data = response.json()

# Common names are audio, data, base64, or b64_json.
audio_base64 = (
    data.get("audio")
    or data.get("data")
    or data.get("base64")
    or data.get("b64_json")
)
if not audio_base64:
    raise ValueError(f"No base64 audio field found. Response keys: {list(data.keys())}")

# Decode the base64 text back into audio bytes before saving.
with open("speech.mp3", "wb") as audio_file:
    audio_file.write(base64.b64decode(audio_base64))
JavaScript: Browser Playback
const response = await fetch("https://developer.nabzclan.vip/api/ai/v1/audio/speech", {
  method: "POST",
  headers: {
    "Authorization": "Bearer YOUR_API_TOKEN",
    "Content-Type": "application/json",
  },
  body: JSON.stringify({
    model: "v1-capa-tts-nabzclan/en-US-EmmaMultilingualNeural",
    input: "This audio plays directly in the browser.",
  }),
});

if (!response.ok) {
  throw new Error(`TTS failed: ${response.status}`);
}

const audioBlob = await response.blob();
const audioUrl = URL.createObjectURL(audioBlob);
const audio = new Audio(audioUrl);
await audio.play();
Node.js: Save Audio
import fs from "node:fs/promises";

const response = await fetch("https://developer.nabzclan.vip/api/ai/v1/audio/speech", {
  method: "POST",
  headers: {
    "Authorization": "Bearer YOUR_API_TOKEN",
    "Content-Type": "application/json",
  },
  body: JSON.stringify({
    model: "v1-capa-tts-nabzclan/en-US-ChristopherNeural",
    input: "Your generated narration is ready.",
  }),
});

if (!response.ok) {
  throw new Error(await response.text());
}

const audio = Buffer.from(await response.arrayBuffer());
await fs.writeFile("speech.mp3", audio);
Best Practices
- Keep text clean: remove unsupported markup unless your chosen voice supports it.
- Split very long scripts into smaller chunks for easier retrying and editing.
- Cache repeated audio so you do not regenerate the same phrase every time.
- Use JSON/base64 only when needed; raw audio is smaller and better for direct downloads.
- Match the locale of the voice to the language and region of the text.
- Test multiple voices for long-form content because pacing and tone can change listener comfort.
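The chunking and caching advice above might look like the following sketch. The sentence-splitting rule, the 2,000-character default, and the cache key layout are all assumptions chosen for illustration; the API does not prescribe any of them. The mp3 default in cache_key is also an assumption about which format you request.

```python
import hashlib
import re


def split_into_chunks(text: str, max_chars: int = 2000) -> list:
    """Split text at sentence boundaries into chunks of at most max_chars."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current = [], ""
    for sentence in sentences:
        if not sentence:
            continue
        # Hard-split a single sentence that is longer than the limit.
        while len(sentence) > max_chars:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:max_chars])
            sentence = sentence[max_chars:]
        candidate = f"{current} {sentence}".strip()
        if len(candidate) <= max_chars:
            current = candidate
        else:
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks


def cache_key(model: str, text: str, response_format: str = "mp3",
              speed: float = 1.0) -> str:
    """Build a stable key so identical requests can reuse stored audio."""
    payload = "\n".join([model, response_format, repr(float(speed)), text])
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()
```

Each chunk can then be sent as its own request and retried independently, and cache_key can name the stored file or database row so a repeated phrase is only generated once.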
Errors
| Status | Meaning | Fix |
|---|---|---|
| 401 | Missing or invalid API token | Send Authorization: Bearer YOUR_API_TOKEN. |
| 422 | Validation failed | Check model, input, response_format, and speed. |
| 429 | AI rate limit exceeded | Wait for reset or upgrade your plan. |
| 500 | Internal server error | Retry later; contact support if it continues. |
| 502 | Voice generation temporarily unavailable | Retry later; contact support if it continues. |
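One way to apply the "retry later" guidance is exponential backoff for the 429, 500, and 502 statuses, while returning 401 and 422 immediately because retrying will not fix them. This is a sketch under assumptions: the helper call_with_retries, the attempt count, and the delays are illustrative, and the API does not document a recommended schedule.

```python
import time

RETRYABLE_STATUSES = {429, 500, 502}


def call_with_retries(send, max_attempts=4, base_delay=1.0, sleep=time.sleep):
    """Call send() until it returns a non-retryable status or attempts run out.

    send must return an object with a status_code attribute, such as a
    requests.Response. The last response is returned either way, so the
    caller still decides how to handle a final failure.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        response = send()
        if response.status_code not in RETRYABLE_STATUSES or attempt == max_attempts:
            return response
        # Wait 1x, 2x, 4x, ... the base delay between attempts.
        sleep(base_delay * 2 ** (attempt - 1))
```

A caller could pass something like lambda: requests.post(url, headers=headers, json=payload, timeout=60) as send. Keep max_attempts small: failed authenticated requests also count toward the AI request limit, so aggressive retries can use up quota.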
Rate Limiting
Text to Speech is an AI endpoint. Every authenticated request, whether it succeeds or fails, is tracked as an AI API request and counts toward your plan's AI request limits.