Generates audio from the input text.

POST /audio/speech
application/json

Body Required

  • model string Required

    One of the available TTS models: tts-1, tts-1-hd or gpt-4o-mini-tts.

  • input string Required

    The text to generate audio for. The maximum length is 4096 characters.

    Maximum length is 4096.

  • instructions string

    Control the voice of your generated audio with additional instructions. Does not work with tts-1 or tts-1-hd.

    Maximum length is 4096.

  • voice string Required

    The voice to use when generating the audio. Supported voices are alloy, ash, ballad, coral, echo, fable, onyx, nova, sage, shimmer, and verse. Previews of the voices are available in the Text to speech guide.

  • response_format string

    The format to return the audio in. Supported formats are mp3, opus, aac, flac, wav, and pcm.

    Values are mp3, opus, aac, flac, wav, or pcm. Default value is mp3.

  • speed number

    The speed of the generated audio. Select a value from 0.25 to 4.0. 1.0 is the default.

    Minimum value is 0.25, maximum value is 4. Default value is 1.
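The constraints listed for the body parameters can be checked on the client before a request is sent. The following is a minimal sketch, not part of any official SDK: the helper name `validate_speech_request` is hypothetical, and it assumes the models, voices, and formats listed above are the complete set of accepted values, which may be stricter than what the API itself enforces.

```python
# Illustrative client-side checks that mirror the documented constraints.
# Assumption: the values listed in the reference are the only accepted ones.

MODELS = {"tts-1", "tts-1-hd", "gpt-4o-mini-tts"}
VOICES = {"alloy", "ash", "ballad", "coral", "echo", "fable",
          "onyx", "nova", "sage", "shimmer", "verse"}
FORMATS = {"mp3", "opus", "aac", "flac", "wav", "pcm"}
MAX_TEXT_LENGTH = 4096


def validate_speech_request(body: dict) -> list[str]:
    """Return a list of problems; an empty list means the body looks valid."""
    problems = []

    # model, input, and voice are marked Required.
    for field in ("model", "input", "voice"):
        if field not in body:
            problems.append(f"missing required field: {field}")

    model = body.get("model")
    if model is not None and model not in MODELS:
        problems.append(f"unknown model: {model}")

    voice = body.get("voice")
    if voice is not None and voice not in VOICES:
        problems.append(f"unknown voice: {voice}")

    text = body.get("input")
    if isinstance(text, str) and len(text) > MAX_TEXT_LENGTH:
        problems.append(f"input exceeds {MAX_TEXT_LENGTH} characters")

    instructions = body.get("instructions")
    if instructions is not None:
        if len(instructions) > MAX_TEXT_LENGTH:
            problems.append(f"instructions exceeds {MAX_TEXT_LENGTH} characters")
        if model in ("tts-1", "tts-1-hd"):
            problems.append(f"instructions is not supported by {model}")

    response_format = body.get("response_format")
    if response_format is not None and response_format not in FORMATS:
        problems.append(f"unknown response_format: {response_format}")

    speed = body.get("speed")
    if speed is not None:
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(speed, bool) or not isinstance(speed, (int, float)):
            problems.append("speed must be a number")
        elif not 0.25 <= speed <= 4.0:
            problems.append("speed must be between 0.25 and 4.0")

    return problems
```

Returning a list rather than raising on the first failure lets a caller report every problem at once.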

Responses

  • 200 application/octet-stream

    OK

    Headers
    • Transfer-Encoding string

      chunked
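Because the body arrives as a chunked binary stream, a client can write it out piece by piece instead of buffering the whole file in memory. The sketch below shows one way to do that; `save_audio_stream` is a hypothetical helper name, and the in-memory `fake_response` stands in for a real HTTP response so the example runs without network access. Any object with a `read(n)` method should work in its place, including the response returned by `urllib.request.urlopen`.

```python
import io
from typing import BinaryIO


def save_audio_stream(response: BinaryIO, destination: BinaryIO,
                      chunk_size: int = 8192) -> int:
    """Copy a streamed body to `destination` in chunks; return bytes written."""
    total = 0
    while True:
        chunk = response.read(chunk_size)
        if not chunk:  # empty bytes signals the end of the stream
            break
        destination.write(chunk)
        total += len(chunk)
    return total


# Stand-in for a real response body (assumption: 20000 bytes of dummy data).
fake_response = io.BytesIO(b"\x00" * 20000)
out = io.BytesIO()
written = save_audio_stream(fake_response, out)
```

In real use, `destination` would typically be a file opened in binary mode, for example `open("speech.mp3", "wb")`.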

POST /audio/speech
curl \
 --request POST 'https://api.openai.com/v1/audio/speech' \
 --header "Authorization: Bearer $ACCESS_TOKEN" \
 --header "Content-Type: application/json" \
 --data '{"model":"gpt-4o-mini-tts","input":"Hello, world.","instructions":"Speak in a calm tone.","voice":"ash","response_format":"mp3","speed":1}' \
 --output speech.mp3
Request examples
{
  "model": "gpt-4o-mini-tts",
  "input": "Hello, world.",
  "instructions": "Speak in a calm tone.",
  "voice": "ash",
  "response_format": "mp3",
  "speed": 1
}
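The same request can be assembled with Python's standard library. This is a minimal sketch under stated assumptions: `build_speech_request` is a hypothetical helper, the URL and header names are taken from the curl example above, the token is read from an `ACCESS_TOKEN` environment variable with a placeholder fallback, and the body values are illustrative. The request is only constructed here; the sending step is left commented out.

```python
import json
import os
import urllib.request

# URL taken from the curl example in this reference.
API_URL = "https://api.openai.com/v1/audio/speech"


def build_speech_request(body: dict, token: str) -> urllib.request.Request:
    """Build (but do not send) a POST request with a JSON-encoded body."""
    data = json.dumps(body).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=data,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


# Illustrative values; only model, input, and voice are required.
body = {
    "model": "gpt-4o-mini-tts",
    "input": "Hello, world.",
    "voice": "ash",
    "response_format": "mp3",
    "speed": 1,
}
token = os.environ.get("ACCESS_TOKEN", "placeholder-token")
request = build_speech_request(body, token)

# To send the request and save the audio, uncomment the lines below:
# with urllib.request.urlopen(request) as response, open("speech.mp3", "wb") as f:
#     f.write(response.read())
```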