Quickstart
runanything.ai serves two OpenAI-compatible endpoints — text-to-speech and transcription — at the base URL https://runanything.ai/v1. If you've used OpenAI's audio APIs, you already know this one: same request shapes, same responses, same SDKs.
1. Get an API key
We're in private beta and issue keys by hand. Email help@runanything.ai with a line about what you're building and your expected volume — we'll usually reply with a key within a day.
2. Generate speech
The endpoint returns mp3 by default; wav, aac, and streaming pcm are one parameter away (see Text to speech).
curl:

curl https://runanything.ai/v1/audio/speech \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "kokoro-82m",
    "input": "Hello! This is my first request.",
    "voice": "af_heart"
  }' \
  --output speech.mp3

Python:

from openai import OpenAI

client = OpenAI(
    base_url="https://runanything.ai/v1",
    api_key="YOUR_API_KEY",
)

speech = client.audio.speech.create(
    model="kokoro-82m",
    voice="af_heart",
    input="Hello! This is my first request.",
)
speech.write_to_file("speech.mp3")

Node.js:

import OpenAI from "openai";
import fs from "node:fs";

const client = new OpenAI({
  baseURL: "https://runanything.ai/v1",
  apiKey: "YOUR_API_KEY",
});

const speech = await client.audio.speech.create({
  model: "kokoro-82m",
  voice: "af_heart",
  input: "Hello! This is my first request.",
});
fs.writeFileSync("speech.mp3", Buffer.from(await speech.arrayBuffer()));

Voices: 28 built-ins like af_heart and bm_george (full list), and OpenAI names like nova work too.
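To make the "one parameter away" remark concrete, here is a minimal sketch that builds the same speech request with only the Python standard library and asks for a different output format. It assumes the parameter is named response_format, as in OpenAI's speech API; this page does not name it, so confirm on the Text to speech page. The helper names build_speech_request and save_speech are hypothetical.

```python
import json
import urllib.request

BASE_URL = "https://runanything.ai/v1"


def build_speech_request(api_key, text, voice="af_heart", response_format="mp3"):
    """Build (but do not send) a POST request for /audio/speech."""
    payload = {
        "model": "kokoro-82m",
        "input": text,
        "voice": voice,
        # Assumption: same parameter name as OpenAI's speech endpoint.
        "response_format": response_format,
    }
    return urllib.request.Request(
        f"{BASE_URL}/audio/speech",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def save_speech(request, path):
    """Send the request and write the returned audio bytes to path."""
    with urllib.request.urlopen(request) as response, open(path, "wb") as out:
        out.write(response.read())
```

Under those assumptions, save_speech(build_speech_request("YOUR_API_KEY", "Hello!", response_format="wav"), "speech.wav") would fetch a wav file instead of the default mp3. Keeping request construction separate from sending makes the payload easy to inspect before any network call.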
3. Transcribe audio
Upload webm, mp4, ogg, wav, or mp3 (up to 4 MB during the beta) and get text back.
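Because the beta limits uploads by type and size, it can help to check a file locally before sending it. The sketch below is one possible pre-flight check, not part of the API: check_upload is a hypothetical helper, the extension test is only a rough proxy for the real container format, and it assumes "4 MB" means 4 × 1024 × 1024 bytes.

```python
import os

# Limits taken from the sentence above; adjust if the beta limits change.
ALLOWED_EXTENSIONS = {".webm", ".mp4", ".ogg", ".wav", ".mp3"}
MAX_UPLOAD_BYTES = 4 * 1024 * 1024  # Assumption: "4 MB" means 4 MiB.


def check_upload(path):
    """Return a list of problems with the file; an empty list means it looks OK."""
    problems = []
    extension = os.path.splitext(path)[1].lower()
    if extension not in ALLOWED_EXTENSIONS:
        problems.append(f"unsupported extension: {extension or '(none)'}")
    size = os.path.getsize(path)
    if size > MAX_UPLOAD_BYTES:
        problems.append(f"file is {size} bytes; limit is {MAX_UPLOAD_BYTES}")
    return problems
```

Calling check_upload("recording.wav") before the request lets a client report an oversized or unsupported file without spending a round trip. The server remains the authority on what it accepts.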
curl:

curl https://runanything.ai/v1/audio/transcriptions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -F file=@recording.wav \
  -F model=distil-whisper-large-v3

Python:

# Reuses the client created in step 2.
with open("recording.wav", "rb") as audio:
    result = client.audio.transcriptions.create(
        model="distil-whisper-large-v3",
        file=audio,
    )
print(result.text)

Node.js:

// Reuses the client and the fs import from step 2.
const result = await client.audio.transcriptions.create({
  model: "distil-whisper-large-v3",
  file: fs.createReadStream("recording.wav"),
});
console.log(result.text);

Next steps
- Text to speech — formats, speed, and streaming raw PCM for real-time playback.
- Speech to text — response formats and language handling.
- Errors & limits — what 4xx/5xx responses look like and current beta limits.