POST /v1/chat/completions
curl -X POST https://api-llm.sunra.ai/v1/chat/completions \
  -H "Authorization: Bearer <SUNRA_KEY>" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "openai/gpt-4o",
    "messages": [
      {
        "role": "system",
        "content": "You are a helpful assistant."
      },
      {
        "role": "user",
        "content": "What is the capital of France?"
      }
    ]
  }'
{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1677652288,
  "model": "openai/gpt-4o",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "The capital of France is Paris.",
        "refusal": null
      },
      "finish_reason": "stop",
      "logprobs": null
    }
  ],
  "system_fingerprint": "fp_44709d6fcb",
  "usage": {
    "prompt_tokens": 25,
    "completion_tokens": 8,
    "total_tokens": 33,
    "prompt_tokens_details": null,
    "completion_tokens_details": null
  }
}
Sends a request for a model response for the given chat conversation. Supports both streaming and non-streaming modes, text, images, audio, video, files, function calling, reasoning, and structured outputs. Compatible with the OpenAI Chat Completions API format.

Authentication

Authorization
string
required
Bearer token. Use your API key as the bearer token in the Authorization header. Format: Bearer <SUNRA_KEY>

Request

This endpoint expects an object.
messages
object[]
required
List of messages for the conversation. Each message has a role and content.
model
string
required
The model to use for the completion. Browse available models at sunra.ai/models.
stream
boolean
default:false
If set to true, partial message deltas will be sent as server-sent events (SSE).
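When stream is true, the response arrives as server-sent events, each a `data: {...}` line carrying a delta, terminated by `data: [DONE]`. The sketch below shows how a client might accumulate those deltas into the full reply. The sample lines stand in for a live stream; the exact chunk field names are an assumption based on the OpenAI-compatible format this endpoint follows.

```python
import json

# Sample SSE lines as they would arrive over the wire (illustrative;
# field names assume OpenAI-compatible streaming chunks).
sample_sse_lines = [
    'data: {"choices": [{"delta": {"role": "assistant"}, "index": 0}]}',
    'data: {"choices": [{"delta": {"content": "Paris"}, "index": 0}]}',
    'data: {"choices": [{"delta": {"content": "."}, "index": 0}]}',
    "data: [DONE]",
]

def collect_stream(lines):
    """Accumulate content deltas from SSE data lines into the full reply."""
    parts = []
    for line in lines:
        if not line.startswith("data: "):
            continue  # ignore comments / keep-alive lines
        payload = line[len("data: "):]
        if payload == "[DONE]":
            break  # stream finished
        chunk = json.loads(payload)
        delta = chunk["choices"][0].get("delta", {})
        if "content" in delta:
            parts.append(delta["content"])
    return "".join(parts)

print(collect_stream(sample_sse_lines))  # -> Paris.
```

In a real client the same loop would run over the lines of the HTTP response body rather than a fixed list.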
max_completion_tokens
number | null
Maximum tokens in completion. Replaces max_tokens as the preferred parameter.
max_tokens
number | null
Maximum tokens in completion. Deprecated — use max_completion_tokens instead. Note: some providers enforce a minimum of 16.
temperature
number | null
default:1
Sampling temperature between 0 and 2. Higher values like 0.8 make output more random, lower values like 0.2 make it more focused and deterministic.
top_p
number | null
default:1
Nucleus sampling parameter (0-1). An alternative to temperature sampling where the model considers the tokens with top_p probability mass.
frequency_penalty
number | null
default:0
Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model’s likelihood to repeat the same line verbatim.
presence_penalty
number | null
default:0
Number between -2.0 and 2.0. Positive values penalize new tokens based on whether they appear in the text so far, increasing the model’s likelihood to talk about new topics.
stop
string | string[]
Up to 4 sequences where the API will stop generating further tokens.
n
integer
default:1
How many chat completion choices to generate for each input message.
logprobs
boolean | null
default:false
Whether to return log probabilities of the output tokens.
top_logprobs
number | null
An integer between 0 and 20 specifying the number of most likely tokens to return at each token position. logprobs must be set to true if this parameter is used.
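A minimal sketch of a request body that asks for token log probabilities. Note the coupling: top_logprobs is only valid when logprobs is true, and must be between 0 and 20.

```python
# Illustrative request body; field names follow the schema described above.
payload = {
    "model": "openai/gpt-4o",
    "messages": [{"role": "user", "content": "What is the capital of France?"}],
    "logprobs": True,   # must be true whenever top_logprobs is set
    "top_logprobs": 5,  # return the 5 most likely tokens per position (0-20)
}

# Client-side sanity check mirroring the constraint above.
assert payload["logprobs"] is True
assert 0 <= payload["top_logprobs"] <= 20
```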
logit_bias
object | null
Token logit bias adjustments. Modify the likelihood of specified tokens appearing in the completion. Maps token IDs to bias values from -100 to 100.
reasoning
object
Configuration options for reasoning models.
response_format
object
An object specifying the format that the model must output.
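A hedged example of requesting structured output. The `{"type": "json_schema", ...}` shape follows the OpenAI Chat Completions format this endpoint is compatible with; the CapitalAnswer schema itself is illustrative.

```python
# Illustrative request body asking the model to emit JSON matching a schema.
payload = {
    "model": "openai/gpt-4o",
    "messages": [{"role": "user", "content": "What is the capital of France?"}],
    "response_format": {
        "type": "json_schema",
        "json_schema": {
            "name": "CapitalAnswer",  # hypothetical schema name
            "schema": {
                "type": "object",
                "properties": {
                    "country": {"type": "string"},
                    "capital": {"type": "string"},
                },
                "required": ["country", "capital"],
            },
        },
    },
}
```

With this set, the assistant message content should parse as JSON conforming to the schema.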
seed
integer | null
If specified, the system will make a best effort to sample deterministically, such that repeated requests with the same seed and parameters should return the same result.
tools
object[]
A list of tools the model may call.
tool_choice
string | object
Controls which (if any) tool is called by the model. none means no tool calls. auto means the model decides. required means the model must call a tool. Can also specify a particular function.
parallel_tool_calls
boolean | null
default:true
Whether to enable parallel function calling during tool use.
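The tool parameters above fit together as sketched below: a request declares the tools, tool_choice controls whether the model may use them, and a tool call comes back with JSON-encoded arguments. The get_weather tool and the sample tool call are illustrative; the shapes follow the OpenAI-compatible format.

```python
import json

# Illustrative function-calling request body.
payload = {
    "model": "openai/gpt-4o",
    "messages": [{"role": "user", "content": "What's the weather in Paris?"}],
    "tools": [
        {
            "type": "function",
            "function": {
                "name": "get_weather",  # hypothetical tool
                "description": "Get current weather for a city",
                "parameters": {
                    "type": "object",
                    "properties": {"city": {"type": "string"}},
                    "required": ["city"],
                },
            },
        }
    ],
    "tool_choice": "auto",        # let the model decide whether to call it
    "parallel_tool_calls": True,  # the default; allow multiple calls at once
}

# When the model decides to call the tool, the choice carries tool_calls
# instead of content; arguments arrive as a JSON string.
sample_tool_call = {
    "id": "call_1",
    "type": "function",
    "function": {"name": "get_weather", "arguments": '{"city": "Paris"}'},
}
args = json.loads(sample_tool_call["function"]["arguments"])
print(args["city"])  # -> Paris
```

Your code would then run the named function with these arguments and send the result back as a `tool` role message.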
modalities
string[]
Output modalities for the response. Supported values: text, image, audio.
metadata
object
Key-value pairs for additional object information. Maximum 16 pairs, 64 character keys, 512 character values.
cache_control
object
Enable automatic prompt caching. When set, the system automatically applies cache breakpoints to the last cacheable block in the request. Currently supported for Anthropic Claude models.
user
string
A unique identifier representing your end-user, which can help monitor and detect abuse.

Response

Successful chat completion response.
id
string
A unique identifier for the chat completion.
object
string
The object type. Always chat.completion.
created
number
The Unix timestamp (in seconds) of when the chat completion was created.
model
string
The model used for the chat completion.
choices
object[]
A list of chat completion choices. Can be more than one if n is greater than 1.
usage
object
Usage statistics for the completion request.
system_fingerprint
string | null
This fingerprint represents the backend configuration that the model runs with. Can be used with the seed parameter to understand when backend changes have been made.
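Putting the response fields together, the snippet below parses the example response shown on this page and pulls out the reply text and token counts, showing where each field lives.

```python
import json

# The example response from this page, as a JSON string.
response_json = '''{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1677652288,
  "model": "openai/gpt-4o",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "The capital of France is Paris.",
        "refusal": null
      },
      "finish_reason": "stop",
      "logprobs": null
    }
  ],
  "system_fingerprint": "fp_44709d6fcb",
  "usage": {
    "prompt_tokens": 25,
    "completion_tokens": 8,
    "total_tokens": 33
  }
}'''

resp = json.loads(response_json)
answer = resp["choices"][0]["message"]["content"]  # the assistant's reply
usage = resp["usage"]

# total_tokens is the sum of prompt and completion tokens.
assert usage["prompt_tokens"] + usage["completion_tokens"] == usage["total_tokens"]
print(answer)  # -> The capital of France is Paris.
```

With `n` greater than 1, iterate over `resp["choices"]` instead of indexing the first element.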