Create run

POST /threads/{thread_id}/runs

Create a run.

Path parameters

  • thread_id string Required

    The ID of the thread to run.

application/json

Body Required

  • assistant_id string Required

    The ID of the assistant to use to execute this run.

  • model string | null

    Any of:

    The ID of the Model to be used to execute this run. If a value is provided here, it will override the model associated with the assistant. If not, the model associated with the assistant will be used.

    Values are gpt-4o, gpt-4o-2024-05-13, gpt-4-turbo, gpt-4-turbo-2024-04-09, gpt-4-0125-preview, gpt-4-turbo-preview, gpt-4-1106-preview, gpt-4-vision-preview, gpt-4, gpt-4-0314, gpt-4-0613, gpt-4-32k, gpt-4-32k-0314, gpt-4-32k-0613, gpt-3.5-turbo, gpt-3.5-turbo-16k, gpt-3.5-turbo-0613, gpt-3.5-turbo-1106, gpt-3.5-turbo-0125, or gpt-3.5-turbo-16k-0613.

  • instructions string | null

    Overrides the instructions of the assistant. This is useful for modifying the behavior on a per-run basis.

  • additional_instructions string | null

    Appends additional instructions at the end of the instructions for the run. This is useful for modifying the behavior on a per-run basis without overriding other instructions.

  • additional_messages array[object] | null

    Adds additional messages to the thread before creating the run.

    additional_messages attributes:
    • role string Required

      The role of the entity that is creating the message. Allowed values include:

      • user: Indicates the message is sent by an actual user and should be used in most cases to represent user-generated messages.
      • assistant: Indicates the message is generated by the assistant. Use this value to insert messages from the assistant into the conversation.

      Values are user or assistant.

    • content string | array[object] Required

      One of:

      The text contents of the message.

      An array of content parts with a defined type: each part can be of type text, or images can be passed with image_url or image_file. Image types are only supported on Vision-compatible models.

      At least 1 element.

    • attachments array[object] | null

      A list of files attached to the message, and the tools they should be added to.

      attachments attributes:
      • file_id string

        The ID of the file to attach to the message.

      • tools array[object]

        The tools to add this file to.

    • metadata object | null

      Set of 16 key-value pairs that can be attached to an object. This can be useful for storing additional information about the object in a structured format. Keys can be a maximum of 64 characters long and values can be a maximum of 512 characters long.

  • tools array[object] | null

    Override the tools the assistant can use for this run. This is useful for modifying the behavior on a per-run basis.

    Not more than 20 elements.

  • metadata object | null

    Set of 16 key-value pairs that can be attached to an object. This can be useful for storing additional information about the object in a structured format. Keys can be a maximum of 64 characters long and values can be a maximum of 512 characters long.

  • temperature number | null

    What sampling temperature to use, between 0 and 2. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic.

    Minimum value is 0, maximum value is 2. Default value is 1.

  • top_p number | null

    An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered.

    We generally recommend altering this or temperature but not both.

    Minimum value is 0, maximum value is 1. Default value is 1.

  • stream boolean | null

    If true, returns a stream of events that happen during the Run as server-sent events, terminating when the Run enters a terminal state with a data: [DONE] message.

  • max_prompt_tokens integer | null

    The maximum number of prompt tokens that may be used over the course of the run. The run will make a best effort to use only the number of prompt tokens specified, across multiple turns of the run. If the run exceeds the number of prompt tokens specified, the run will end with status incomplete. See incomplete_details for more info.

    Minimum value is 256.

  • max_completion_tokens integer | null

    The maximum number of completion tokens that may be used over the course of the run. The run will make a best effort to use only the number of completion tokens specified, across multiple turns of the run. If the run exceeds the number of completion tokens specified, the run will end with status incomplete. See incomplete_details for more info.

    Minimum value is 256.

  • truncation_strategy object | null

    Controls for how a thread will be truncated prior to the run. Use this to control the initial context window of the run.

    truncation_strategy attributes:
    • type string Required

      The truncation strategy to use for the thread. The default is auto. If set to last_messages, the thread will be truncated to the n most recent messages in the thread. When set to auto, messages in the middle of the thread will be dropped to fit the context length of the model, max_prompt_tokens.

      Values are auto or last_messages.

    • last_messages integer | null

      The number of most recent messages from the thread to include when constructing the context for the run.

      Minimum value is 1.

  • tool_choice string | null | object

    One of:

    none means the model will not call any tools and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools before responding to the user.

    Values are none, auto, or required.

  • response_format string | null | object

    One of:

    auto is the default value.

    Values are none or auto.
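The body parameters above carry several numeric and structural constraints. The following sketch (an illustrative helper, not part of the API or any SDK) shows how a client might validate a run-creation payload against those documented constraints before sending it:

```python
# Illustrative client-side check of the documented body constraints:
# temperature in [0, 2], top_p in [0, 1], token limits >= 256,
# at most 20 tools, and a well-formed truncation_strategy.

def validate_run_payload(payload: dict) -> list[str]:
    """Return a list of constraint violations (empty means valid)."""
    errors = []
    if "assistant_id" not in payload:
        errors.append("assistant_id is required")
    t = payload.get("temperature")
    if t is not None and not (0 <= t <= 2):
        errors.append("temperature must be between 0 and 2")
    p = payload.get("top_p")
    if p is not None and not (0 <= p <= 1):
        errors.append("top_p must be between 0 and 1")
    for field in ("max_prompt_tokens", "max_completion_tokens"):
        v = payload.get(field)
        if v is not None and v < 256:
            errors.append(f"{field} must be at least 256")
    tools = payload.get("tools")
    if tools is not None and len(tools) > 20:
        errors.append("tools may contain at most 20 elements")
    ts = payload.get("truncation_strategy")
    if ts is not None:
        if ts.get("type") not in ("auto", "last_messages"):
            errors.append("truncation_strategy.type must be auto or last_messages")
        lm = ts.get("last_messages")
        if lm is not None and lm < 1:
            errors.append("truncation_strategy.last_messages must be at least 1")
    return errors
```

Running the check before the POST avoids a round trip for requests the server would reject anyway.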

Responses

  • 200 application/json

    OK

    Response attributes:
    • id string Required

      The identifier, which can be referenced in API endpoints.

    • object string Required

      The object type, which is always thread.run.

      Value is thread.run.

    • created_at integer Required

      The Unix timestamp (in seconds) for when the run was created.

    • thread_id string Required

      The ID of the thread that was executed on as a part of this run.

    • assistant_id string Required

      The ID of the assistant used for execution of this run.

    • status string Required

      The status of the run, which can be either queued, in_progress, requires_action, cancelling, cancelled, failed, completed, incomplete, or expired.

      Values are queued, in_progress, requires_action, cancelling, cancelled, failed, completed, incomplete, or expired.

    • required_action object | null Required

      Details on the action required to continue the run. Will be null if no action is required.

      required_action attributes:
      • type string Required

        For now, this is always submit_tool_outputs.

        Value is submit_tool_outputs.

      • submit_tool_outputs object Required

        Details on the tool outputs needed for this run to continue.

        submit_tool_outputs attributes:
        • tool_calls array[object] Required

          Tool call objects

          tool_calls attributes:
          • id string Required

            The ID of the tool call. This ID must be referenced when you submit the tool outputs using the Submit tool outputs to run endpoint.

          • type string Required

            The type of tool call the output is required for. For now, this is always function.

            Value is function.

          • function object Required

            The function definition.

            function attributes:
            • name string Required

              The name of the function.

            • arguments string Required

              The arguments that the model expects you to pass to the function.

    • last_error object | null Required

      The last error associated with this run. Will be null if there are no errors.

      last_error attributes:
      • code string Required

        One of server_error, rate_limit_exceeded, or invalid_prompt.

        Values are server_error, rate_limit_exceeded, or invalid_prompt.

      • message string Required

        A human-readable description of the error.

    • expires_at integer | null Required

      The Unix timestamp (in seconds) for when the run will expire.

    • started_at integer | null Required

      The Unix timestamp (in seconds) for when the run was started.

    • cancelled_at integer | null Required

      The Unix timestamp (in seconds) for when the run was cancelled.

    • failed_at integer | null Required

      The Unix timestamp (in seconds) for when the run failed.

    • completed_at integer | null Required

      The Unix timestamp (in seconds) for when the run was completed.

    • incomplete_details object | null Required

      Details on why the run is incomplete. Will be null if the run is not incomplete.

      incomplete_details attributes:
      • reason string

        The reason why the run is incomplete. This will point to which specific token limit was reached over the course of the run.

        Values are max_completion_tokens or max_prompt_tokens.

    • model string Required

      The model that the assistant used for this run.

    • instructions string Required

      The instructions that the assistant used for this run.

    • tools array[object] Required

      The list of tools that the assistant used for this run.

      Not more than 20 elements. Default value is [] (empty).

    • metadata object | null Required

      Set of 16 key-value pairs that can be attached to an object. This can be useful for storing additional information about the object in a structured format. Keys can be a maximum of 64 characters long and values can be a maximum of 512 characters long.

    • usage object | null Required

      Usage statistics related to the run. This value will be null if the run is not in a terminal state (e.g. while it is in_progress or queued).

      usage attributes:
      • completion_tokens integer Required

        Number of completion tokens used over the course of the run.

      • prompt_tokens integer Required

        Number of prompt tokens used over the course of the run.

      • total_tokens integer Required

        Total number of tokens used (prompt + completion).

    • temperature number | null

      The sampling temperature used for this run. If not set, defaults to 1.

    • top_p number | null

      The nucleus sampling value used for this run. If not set, defaults to 1.

    • max_prompt_tokens integer | null Required

      The maximum number of prompt tokens specified to have been used over the course of the run.

      Minimum value is 256.

    • max_completion_tokens integer | null Required

      The maximum number of completion tokens specified to have been used over the course of the run.

      Minimum value is 256.

    • truncation_strategy object | null Required

      Controls for how a thread will be truncated prior to the run. Use this to control the initial context window of the run.

      truncation_strategy attributes:
      • type string Required

        The truncation strategy to use for the thread. The default is auto. If set to last_messages, the thread will be truncated to the n most recent messages in the thread. When set to auto, messages in the middle of the thread will be dropped to fit the context length of the model, max_prompt_tokens.

        Values are auto or last_messages.

      • last_messages integer | null

        The number of most recent messages from the thread to include when constructing the context for the run.

        Minimum value is 1.

    • tool_choice string | null | object Required

      One of:

      none means the model will not call any tools and instead generates a message. auto means the model can pick between generating a message or calling one or more tools. required means the model must call one or more tools before responding to the user.

      Values are none, auto, or required.

    • response_format string | null | object Required

      One of:

      auto is the default value.

      Values are none or auto.
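When a run returns with status requires_action, the client is expected to read required_action.submit_tool_outputs.tool_calls and answer each call via the Submit tool outputs to run endpoint. A minimal extraction sketch (the helper and its return shape are illustrative, not part of any SDK), following the response schema above:

```python
# Illustrative sketch: collect the pending tool calls from a run response
# so their results can be sent back via the Submit tool outputs endpoint.
# Field names follow the response schema documented above.

def pending_tool_calls(run: dict) -> list[dict]:
    """Extract (id, function name, arguments) for each required tool call."""
    if run.get("status") != "requires_action":
        return []
    action = run.get("required_action") or {}
    if action.get("type") != "submit_tool_outputs":
        return []
    calls = action.get("submit_tool_outputs", {}).get("tool_calls", [])
    return [
        {
            "tool_call_id": c["id"],
            "name": c["function"]["name"],
            "arguments": c["function"]["arguments"],  # a JSON-encoded string
        }
        for c in calls
    ]
```

Note that arguments arrives as a JSON-encoded string, not a parsed object, so callers typically json.loads it before dispatching to their own function.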

POST /threads/{thread_id}/runs
curl \
 -X POST https://api.openai.com/v1/threads/{thread_id}/runs \
 -H "Authorization: Bearer $ACCESS_TOKEN" \
 -H "Content-Type: application/json" \
 -d '{"assistant_id":"string","model":"gpt-4-turbo","instructions":"string","additional_instructions":"string","additional_messages":[{"role":"user","content":"string","attachments":[{"file_id":"string","tools":[{"type":"code_interpreter"}]}],"metadata":{}}],"tools":[{"type":"code_interpreter"}],"metadata":{},"temperature":1,"top_p":1,"stream":true,"max_prompt_tokens":42,"max_completion_tokens":42,"truncation_strategy":{"type":"auto","last_messages":42},"tool_choice":"none","response_format":"none"}'
Request example
{
  "assistant_id": "string",
  "model": "gpt-4-turbo",
  "instructions": "string",
  "additional_instructions": "string",
  "additional_messages": [
    {
      "role": "user",
      "content": "string",
      "attachments": [
        {
          "file_id": "string",
          "tools": [
            {
              "type": "code_interpreter"
            }
          ]
        }
      ],
      "metadata": {}
    }
  ],
  "tools": [
    {
      "type": "code_interpreter"
    }
  ],
  "metadata": {},
  "temperature": 1,
  "top_p": 1,
  "stream": true,
  "max_prompt_tokens": 42,
  "max_completion_tokens": 42,
  "truncation_strategy": {
    "type": "auto",
    "last_messages": 42
  },
  "tool_choice": "none",
  "response_format": "none"
}
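Because the request example sets "stream": true, the response arrives as server-sent events rather than a single JSON body. The sketch below parses that framing over an iterable of text lines (synthetic input here; only the documented framing is assumed: each event carries a data: line, and the stream ends with data: [DONE]):

```python
import json

# Minimal sketch of consuming the SSE stream returned when "stream": true.
# Only the framing documented above is assumed: each event carries a
# "data: ..." line, and the stream terminates with "data: [DONE]".

def parse_sse_data(lines):
    """Yield the parsed JSON payload of each data event until [DONE]."""
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip event-name lines, comments, and blank separators
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            return  # terminal sentinel: stop consuming
        yield json.loads(payload)
```

In practice the lines would come from an HTTP response body read incrementally; the parser itself is the same either way.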
Response examples (200)
{
  "id": "string",
  "object": "thread.run",
  "created_at": 42,
  "thread_id": "string",
  "assistant_id": "string",
  "status": "queued",
  "required_action": {
    "type": "submit_tool_outputs",
    "submit_tool_outputs": {
      "tool_calls": [
        {
          "id": "string",
          "type": "function",
          "function": {
            "name": "string",
            "arguments": "string"
          }
        }
      ]
    }
  },
  "last_error": {
    "code": "server_error",
    "message": "string"
  },
  "expires_at": 42,
  "started_at": 42,
  "cancelled_at": 42,
  "failed_at": 42,
  "completed_at": 42,
  "incomplete_details": {
    "reason": "max_completion_tokens"
  },
  "model": "string",
  "instructions": "string",
  "tools": [
    {
      "type": "code_interpreter"
    }
  ],
  "metadata": {},
  "usage": {
    "completion_tokens": 42,
    "prompt_tokens": 42,
    "total_tokens": 42
  },
  "temperature": 42.0,
  "top_p": 42.0,
  "max_prompt_tokens": 42,
  "max_completion_tokens": 42,
  "truncation_strategy": {
    "type": "auto",
    "last_messages": 42
  },
  "tool_choice": "none",
  "response_format": "none"
}
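In the non-streaming flow, a client polls the run until it reaches a terminal state. The terminal set below is derived from the status values documented above (recall that usage is populated only once the run is terminal); the helper itself is an illustrative sketch:

```python
# Terminal states per the documented status values: queued, in_progress,
# requires_action, and cancelling are transient; the rest are final.
TERMINAL_STATUSES = {"cancelled", "failed", "completed", "incomplete", "expired"}

def is_terminal(status: str) -> bool:
    """True once the run has finished and fields like usage are populated."""
    return status in TERMINAL_STATUSES
```

Note that requires_action is not terminal: the run is paused waiting for tool outputs, and polling alone will not advance it.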