GET /speech/asrlive

Performs asynchronous live speech recognition over a WebSocket.

This resource establishes a WebSocket connection with the client and receives audio data over it. It transcribes the audio using state-of-the-art deep neural networks and returns partial results on the same WebSocket. This endpoint is designed for transcribing streamed audio of up to 15 minutes. It sends back a partial result (status=partial) every time it transcribes an endpoint. After the client sends the close signal, it receives an ASRResponseBody with status=done. The token should be passed in the query string as jwt.

Using the config object, you can specify audio settings such as audioEncoding and sampleRateHertz. Different languages will be supported, so you can choose the languageCode. Using asrModel and languageModel in the config, you can select customized models. Refer to the ASRLongRunning API for speech recognition on long audio. Refer to the ASR API for fast recognition of short audio files.
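
The token handling described above can be sketched as follows. This is a minimal illustration, not the service's own client: the `wss://` scheme, the constant `ASRLIVE_URL`, and the helper `build_asrlive_url` are assumptions; only the host, the path, and the `jwt` query parameter come from this page. How the config object and the audio frames are sent over the socket is not specified here, so the sketch stops at building the connection URL.

```python
from urllib.parse import urlencode, urlsplit, urlunsplit

# Host and path are taken from the curl example on this page.
# The wss:// scheme is an assumption; the page only shows an https:// URL.
ASRLIVE_URL = "wss://api.amerandish.com/v1/speech/asrlive"


def build_asrlive_url(token: str, base_url: str = ASRLIVE_URL) -> str:
    """Return the connection URL with the token passed as the `jwt` query parameter."""
    if not token:
        raise ValueError("token must be a non-empty string")
    scheme, netloc, path, _query, _fragment = urlsplit(base_url)
    # urlencode percent-encodes any character that is not safe in a query string.
    return urlunsplit((scheme, netloc, path, urlencode({"jwt": token}), ""))
```

The resulting URL can then be handed to whichever WebSocket client library you use.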

Responses

200

OK.
- transcriptionId string(uuid)
  
  A UUID string identifying a unique pair of audio and recognitionResult. It can be used to retrieve this recognitionResult from the transcription endpoint. An asrLongRunning recognitionResult is available only through the transcription endpoint, using this transcriptionId.
- duration number(double)
  
  File duration in seconds.
- inferenceTime number(double)
  
  Total inference time in seconds.
- status string
  
  Status of the recognition process. USE THE RECOGNITION RESULT ONLY WHEN STATUS IS DONE.
  
  Values are queued, processing, done, or partial. Default value is queued.
- results array[object]
  
  Sequential list of transcription results corresponding to sequential portions of audio. May contain one or more recognition hypotheses (up to the maximum specified in maxAlternatives). These alternatives are ordered in terms of accuracy, with the top (first) alternative being the most probable, as ranked by the recognizer.
  
  Each results item has the following attributes:
  
  transcript string
  
  A UTF-8 encoded string. Transcript text representing the words that the user spoke.
  
  confidence number(double)
  
  The ASR engine's confidence in the generated output, given as an estimate between 0.0 and 1.0. A higher number indicates a greater estimated likelihood that the recognized words are correct. At the transcript level it is the total confidence of the recognition; in the word info object it is the confidence of each word. This field is not guaranteed to be accurate, and users should not rely on it always being provided. The default of 0.0 is a sentinel value indicating that confidence was not set.
  
  Minimum value is 0, maximum value is 1.
  
  words array[object]
  
  Each words item has the following attributes:
  
  startTime number(double)
  
  Time offset relative to the beginning of the audio, corresponding to the start of the spoken word. This is an experimental feature and the accuracy of the time offset can vary. This field is not guaranteed to be accurate, and users should not rely on it always being provided. The default of 0.0 is a sentinel value indicating that the time offset was not set.
  
  endTime number(double)
  
  Time offset relative to the beginning of the audio, corresponding to the end of the spoken word. This is an experimental feature and the accuracy of the time offset can vary. This field is not guaranteed to be accurate, and users should not rely on it always being provided. The default of 0.0 is a sentinel value indicating that the time offset was not set.
  
  word string
  
  The word corresponding to this set of information.
  
  confidence number(double)
  
  The ASR engine's confidence in the generated output, given as an estimate between 0.0 and 1.0. A higher number indicates a greater estimated likelihood that the recognized words are correct. At the transcript level it is the total confidence of the recognition; in the word info object it is the confidence of each word. This field is not guaranteed to be accurate, and users should not rely on it always being provided. The default of 0.0 is a sentinel value indicating that confidence was not set.
  
  Minimum value is 0, maximum value is 1.
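
The attributes listed for the 200 response can be read from a received message roughly as follows. This is a sketch under assumptions: it presumes each WebSocket message is a JSON text frame shaped like the response body described here, and the helper names `confidence_or_none` and `summarize_message` are made up for illustration. It maps the documented 0.0 sentinel to `None` and flags a message as final only when status is done.

```python
import json
from typing import Any, Dict, Optional


def confidence_or_none(value: Optional[float]) -> Optional[float]:
    """Map the documented 0.0 sentinel (or a missing field) to None."""
    if value is None or value == 0.0:
        return None
    return value


def summarize_message(raw: str) -> Dict[str, Any]:
    """Extract status, transcripts, and confidences from one response message."""
    body = json.loads(raw)
    status = body.get("status", "queued")  # documented default value is queued
    results = body.get("results") or []
    return {
        "status": status,
        "is_final": status == "done",  # use the result only when status is done
        "transcripts": [r.get("transcript", "") for r in results],
        "confidences": [confidence_or_none(r.get("confidence")) for r in results],
    }
```

A client loop would typically display messages with status partial as interim text and keep only the message with status done as the final result.
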
400

This response means that the server could not understand the request due to invalid syntax.
- status string Required
  
  HTTP response status code.
- detail string Required
  
  Message explaining the issue.
- title string
  
  Error message title.
- type string
  
  Error type.
401

Authentication is needed to get the requested response. This is similar to 403, but in this case, authentication is possible.
- status string Required
  
  HTTP response status code.
- detail string Required
  
  Message explaining the issue.
- title string
  
  Error message title.
- type string
  
  Error type.
403

The client does not have access rights to the content, so the server refuses to give a proper response.
- status string Required
  
  HTTP response status code.
- detail string Required
  
  Message explaining the issue.
- title string
  
  Error message title.
- type string
  
  Error type.
405

The request method is known by the server but has been disabled and cannot be used.
- status string Required
  
  HTTP response status code.
- detail string Required
  
  Message explaining the issue.
- title string
  
  Error message title.
- type string
  
  Error type.
415

The media format of the requested data is not supported by the server, so the server is rejecting the request.
- status string Required
  
  HTTP response status code.
- detail string Required
  
  Message explaining the issue.
- title string
  
  Error message title.
- type string
  
  Error type.
429

The user has sent too many requests in a given amount of time ("rate limiting").
- status string Required
  
  HTTP response status code.
- detail string Required
  
  Message explaining the issue.
- title string
  
  Error message title.
- type string
  
  Error type.
500

The server has encountered a situation it doesn't know how to handle.
- status string Required
  
  HTTP response status code.
- detail string Required
  
  Message explaining the issue.
- title string
  
  Error message title.
- type string
  
  Error type.
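
The error attribute lists above share one shape: status and detail are required, title and type are optional. A minimal parsing sketch, assuming the error body arrives as JSON text, might look like this. The names `ApiError` and `parse_error` are hypothetical. Note that some response examples on this page use a different shape (code and message); this sketch covers only the documented attribute list and rejects anything else.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass
class ApiError:
    status: str                       # required: HTTP response status code
    detail: str                       # required: message explaining the issue
    title: Optional[str] = None       # optional: error message title
    error_type: Optional[str] = None  # optional: the "type" attribute


def parse_error(raw: str) -> ApiError:
    """Parse an error body, enforcing the two attributes marked Required."""
    body = json.loads(raw)
    missing = [key for key in ("status", "detail") if key not in body]
    if missing:
        raise ValueError("error body is missing required field(s): " + ", ".join(missing))
    return ApiError(
        status=str(body["status"]),
        detail=str(body["detail"]),
        title=body.get("title"),
        error_type=body.get("type"),
    )
```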

GET /speech/asrlive

curl \
 -X GET "https://api.amerandish.com/v1/speech/asrlive?jwt=api_token_value"

Response examples (200)

{
  "transcriptionId": "string",
  "duration": 42.0,
  "inferenceTime": 42.0,
  "status": "queued",
  "results": [
    {
      "transcript": "string",
      "confidence": 42.0,
      "words": [
        {
          "startTime": 42.0,
          "endTime": 42.0,
          "word": "string",
          "confidence": 42.0
        }
      ]
    }
  ]
}

Response examples (400)

{
  "code": 400,
  "message": "Bad Request. Invalid JSON object."
}

Response examples (400)

{
  "status": "string",
  "detail": "string",
  "title": "string",
  "type": "string"
}

Response examples (401)

{
  "code": 401,
  "message": "Unautherized. Invalid Authorization Token."
}

Response examples (401)

{
  "status": "string",
  "detail": "string",
  "title": "string",
  "type": "string"
}

Response examples (403)

{
  "code": 403,
  "message": "Forbidden. Do not have access right to resource."
}

Response examples (403)

{
  "status": "string",
  "detail": "string",
  "title": "string",
  "type": "string"
}

Response examples (405)

{
  "code": 405,
  "message": "Method Not Allowed."
}

Response examples (405)

{
  "status": "string",
  "detail": "string",
  "title": "string",
  "type": "string"
}

Response examples (415)

{
  "code": 415,
  "message": "Unsupported Media Type. Please change requested media type."
}

Response examples (415)

{
  "status": "string",
  "detail": "string",
  "title": "string",
  "type": "string"
}

Response examples (429)

{
  "code": 429,
  "message": "Too Many Requests. Your request is blocked due to exceeding rate limiting."
}

Response examples (429)

{
  "status": "string",
  "detail": "string",
  "title": "string",
  "type": "string"
}

Response examples (500)

{
  "code": 500,
  "message": "Internal Server Error. Please retry later."
}

Response examples (500)

{
  "status": "string",
  "detail": "string",
  "title": "string",
  "type": "string"
}