Speech-to-Text

Transcribe audio via Deepgram with confidence scores and word-level timing.

POST /v1/speech/transcribe

Overview

Transcribe audio to text with timestamps

Credits

2 credits per call

Providers

Deepgram

SDK Method

client.stt(...)

Parameters

audio_url (string): URL to the audio file.

audio_base64 (string): Base64-encoded audio.

language (string, default: en): Language code.
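The parameters above can be assembled into a request body along these lines. This is a minimal sketch, not the official SDK: the helper name `build_payload` is hypothetical, and it assumes that exactly one of `audio_url` or `audio_base64` should be sent, which the parameter list does not state explicitly.

```python
import base64


def build_payload(audio_url=None, audio_bytes=None, language="en"):
    """Build a request body from either a URL or raw audio bytes.

    Assumption: exactly one audio source is expected per call.
    """
    if (audio_url is None) == (audio_bytes is None):
        raise ValueError("pass exactly one of audio_url or audio_bytes")

    payload = {"language": language}
    if audio_url is not None:
        payload["audio_url"] = audio_url
    else:
        # The API takes base64 text, so encode the raw bytes first.
        payload["audio_base64"] = base64.b64encode(audio_bytes).decode("ascii")
    return payload
```

For a local file, read it in binary mode and pass the bytes, for example `build_payload(audio_bytes=open("clip.wav", "rb").read())`.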

Example Response

{
  "success": true,
  "data": {
    "transcript": "Yeah. As as much as, it's worth celebrating, the first spacewalk with an all female team, I think many of us are looking forward to it just being normal.",
    "confidence": 0.99,
    "language": "en",
    "words": [
      {
        "word": "yeah",
        "start": 0.08,
        "end": 0.32,
        "confidence": 0.99,
        "punctuated_word": "Yeah."
      },
      {
        "word": "as",
        "start": 0.32,
        "end": 0.8,
        "confidence": 0.99,
        "punctuated_word": "As"
      }
    ],
    "duration_seconds": 25.93
  },
  "metadata": {
    "provider_used": "deepgram",
    "providers_tried": [
      "deepgram"
    ],
    "mode_used": null,
    "response_time_ms": 1607,
    "request_id": "req_de622db0"
  },
  "credits_used": 2
}
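The word-level fields in this response can be consumed roughly as follows. The sketch uses a trimmed copy of the example above (only the two words shown, without the full transcript); the helper names `word_timings` and `uncertain_words` and the 0.9 threshold are illustrative choices, not part of the API.

```python
import json

# Trimmed copy of the example response: only the fields used below.
SAMPLE = json.loads("""
{
  "success": true,
  "data": {
    "confidence": 0.99,
    "language": "en",
    "words": [
      {"word": "yeah", "start": 0.08, "end": 0.32,
       "confidence": 0.99, "punctuated_word": "Yeah."},
      {"word": "as", "start": 0.32, "end": 0.8,
       "confidence": 0.99, "punctuated_word": "As"}
    ],
    "duration_seconds": 25.93
  },
  "credits_used": 2
}
""")


def word_timings(response):
    """Return (punctuated_word, start, end) tuples, times in seconds."""
    if not response.get("success"):
        raise ValueError("transcription was not successful")
    return [(w["punctuated_word"], w["start"], w["end"])
            for w in response["data"]["words"]]


def uncertain_words(response, threshold=0.9):
    """Return the words whose confidence falls below the threshold."""
    return [w["word"] for w in response["data"]["words"]
            if w["confidence"] < threshold]


for text, start, end in word_timings(SAMPLE):
    print(f"{start:6.2f} - {end:6.2f}  {text}")
```

Pairing `start` and `end` with `punctuated_word` is enough to build simple captions, and `uncertain_words` gives a quick list of words worth a manual review.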

Get Started

Use this API through the O-mega platform. Create an API key in your dashboard, then call the endpoint with your key in the Authorization header.
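An authenticated call might look like the sketch below, using only the Python standard library. Several details are assumptions because this page does not specify them: the base URL `https://api.example.com` is a placeholder, the `Bearer` scheme in the Authorization header is a guess at the expected format, the body is assumed to be JSON, and the environment variable name `OMEGA_API_KEY` is hypothetical.

```python
import json
import os
import urllib.request

# Placeholder host: replace with the base URL shown in your dashboard.
BASE_URL = "https://api.example.com"


def make_request(api_key, payload, base_url=BASE_URL):
    """Build (but do not send) a POST request to the transcribe endpoint."""
    return urllib.request.Request(
        url=f"{base_url}/v1/speech/transcribe",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            # Assumption: the key is sent as a Bearer token.
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def transcribe(api_key, payload, base_url=BASE_URL):
    """Send the request and return the decoded JSON response."""
    request = make_request(api_key, payload, base_url)
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


if __name__ == "__main__":
    # Hypothetical variable name; keep the key out of source control.
    key = os.environ.get("OMEGA_API_KEY")
    if key:
        result = transcribe(key, {"audio_url": "https://example.com/clip.wav",
                                  "language": "en"})
        print(result["data"]["transcript"])
```

Keeping request construction separate from sending makes it easy to inspect the headers and body before any credits are spent.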

Try Speech-to-Text

Test Speech-to-Text in the interactive playground. No setup required.
