Create a message - Sunra.ai

curl -X POST https://api-llm.sunra.ai/v1/messages \
  -H "Authorization: Bearer <SUNRA_KEY>" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "anthropic/claude-sonnet-4-20250514",
    "max_tokens": 1024,
    "messages": [
      {
        "role": "user",
        "content": "Hello, how are you?"
      }
    ]
  }'

{
  "id": "msg_01XFDUDYJgAACzvnptvVoYEL",
  "type": "message",
  "role": "assistant",
  "container": null,
  "content": [
    {
      "type": "text",
      "text": "Hello! I'm doing well, thank you for asking. How can I help you today?",
      "citations": null
    }
  ],
  "model": "anthropic/claude-sonnet-4-20250514",
  "stop_reason": "end_turn",
  "stop_sequence": null,
  "usage": {
    "input_tokens": 12,
    "output_tokens": 19,
    "cache_creation_input_tokens": null,
    "cache_read_input_tokens": null,
    "cache_creation": null,
    "inference_geo": null,
    "server_tool_use": null,
    "service_tier": null
  }
}

POST /v1/messages

Creates a message using the Anthropic Messages API format. Supports text, images, PDFs, tools, and extended thinking.

Authentication

Authorization

string

required

Bearer token. Pass your API key as the bearer token in the Authorization header. Format: Bearer <SUNRA_KEY>
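The header format above can be sketched in Python using only the standard library. This is a minimal illustration, not an official client: `build_request` is a hypothetical helper name, and the endpoint URL and payload are copied from the curl example. The request is built but not sent.

```python
import json
import urllib.request

API_URL = "https://api-llm.sunra.ai/v1/messages"  # endpoint from the curl example


def build_request(api_key: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) a POST request with Bearer authentication."""
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


payload = {
    "model": "anthropic/claude-sonnet-4-20250514",
    "max_tokens": 1024,
    "messages": [{"role": "user", "content": "Hello, how are you?"}],
}
request = build_request("<SUNRA_KEY>", payload)
# To actually send it: urllib.request.urlopen(request), then json.load() the body.
```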

Request

This endpoint expects an object.

model

string

required

The model that will complete your prompt. Browse available models at sunra.ai/models.

messages

object[] | null

required

Input messages. Each input message must be an object with a role and content. You can specify a single user-role message, or include multiple user and assistant messages for multi-turn conversations.

Show properties

role

string

required

The role of the message author. Supported values: user, assistant.

content

string | object[]

required

The content of the message. Can be a single string or an array of content blocks.

Show content block types

type

string

required

Value: text.

text

string

required

The text content.

citations

object[] | null

Citations for the text block.

cache_control

object

Cache control breakpoint at this content block.

Show properties

type

string

required

Value: ephemeral.

ttl

string

Time-to-live. Values: 5m (default), 1h.

type

string

required

Value: image.

source

object

required

Image source.

Show source variants

base64
url

type

string

required

Value: base64.

media_type

string

required

Media type: image/jpeg, image/png, image/gif, or image/webp.

data

string

required

Base64-encoded image data.

type

string

required

Value: url.

url

string

required

Image URL.

cache_control

object

Cache control breakpoint at this content block.
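As an illustration of the base64 image source, the sketch below builds an image content block and pairs it with a text block in one user message. The helper name `image_block` is hypothetical, and the bytes used are a placeholder rather than a real image.

```python
import base64

ALLOWED_IMAGE_TYPES = {"image/jpeg", "image/png", "image/gif", "image/webp"}


def image_block(image_bytes: bytes, media_type: str) -> dict:
    """Wrap raw image bytes in an image content block with a base64 source."""
    if media_type not in ALLOWED_IMAGE_TYPES:
        raise ValueError(f"unsupported media_type: {media_type!r}")
    return {
        "type": "image",
        "source": {
            "type": "base64",
            "media_type": media_type,
            "data": base64.b64encode(image_bytes).decode("ascii"),
        },
    }


placeholder_bytes = b"\x89PNG\r\n\x1a\n"  # PNG signature only, not a full image
message = {
    "role": "user",
    "content": [
        image_block(placeholder_bytes, "image/png"),
        {"type": "text", "text": "Describe this image."},
    ],
}
```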

type

string

required

Value: document.

source

object

required

Document source.

Show source variants

base64
text
content
url

type

string

required

Value: base64.

media_type

string

required

Value: application/pdf.

data

string

required

Base64-encoded PDF data.

type

string

required

Value: text.

media_type

string

required

Value: text/plain.

data

string

required

Plain text data.

type

string

required

Value: content.

content

string | object[]

required

Content blocks.

type

string

required

Value: url.

url

string

required

PDF URL.

citations

object

Citations configuration.

Show properties

enabled

boolean

Whether citations are enabled for this document.

context

string

Additional context for the document.

title

string

Title of the document.

cache_control

object

Cache control breakpoint at this content block.
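A document block using the plain-text source variant might be assembled as follows. This is a sketch under the field names listed above; `text_document_block` is a hypothetical helper, and the document text is invented for illustration.

```python
def text_document_block(text: str, title: str = None, context: str = None,
                        cite: bool = False) -> dict:
    """Build a document content block from plain text."""
    block = {
        "type": "document",
        "source": {"type": "text", "media_type": "text/plain", "data": text},
    }
    # Optional fields are only included when set.
    if title is not None:
        block["title"] = title
    if context is not None:
        block["context"] = context
    if cite:
        block["citations"] = {"enabled": True}
    return block


doc = text_document_block(
    "The warranty covers parts for two years.",  # illustrative text
    title="Warranty terms",
    cite=True,
)
message = {
    "role": "user",
    "content": [doc, {"type": "text", "text": "How long are parts covered?"}],
}
```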

type

string

required

Value: tool_use.

id

string

required

The ID of the tool use.

name

string

required

The name of the tool.

input

object

required

The input to the tool.

cache_control

object

Cache control breakpoint at this content block.

type

string

required

Value: tool_result.

tool_use_id

string

required

The ID of the tool use this result corresponds to.

content

string | object[]

The result content. Can be a string or an array of content blocks.

is_error

boolean

Whether this is an error result.

cache_control

object

Cache control breakpoint at this content block.
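The relationship between tool_use and tool_result blocks can be sketched as below: each tool_use block in an assistant turn is answered by a tool_result block, carrying the same ID, in the next user turn. The helper `tool_result_message`, the `get_weather` tool, and the example ID are hypothetical; the `id` field name follows the Anthropic Messages format this endpoint is described as using.

```python
def tool_result_message(assistant_content: list, run_tool) -> dict:
    """Answer every tool_use block with a matching tool_result block."""
    results = []
    for block in assistant_content:
        if block.get("type") != "tool_use":
            continue  # text and other blocks need no result
        result = {"type": "tool_result", "tool_use_id": block["id"]}
        try:
            result["content"] = str(run_tool(block["name"], block["input"]))
        except Exception as exc:
            # Report failures back to the model instead of raising.
            result["content"] = str(exc)
            result["is_error"] = True
        results.append(result)
    return {"role": "user", "content": results}


def run_tool(name: str, tool_input: dict) -> str:
    """Toy dispatcher standing in for real tool implementations."""
    if name == "get_weather":
        return f"Sunny in {tool_input['city']}"
    raise KeyError(name)


assistant_content = [
    {"type": "text", "text": "Let me check."},
    {"type": "tool_use", "id": "toolu_example_1", "name": "get_weather",
     "input": {"city": "Paris"}},
]
follow_up = tool_result_message(assistant_content, run_tool)
```

The follow-up message is appended to the conversation after the assistant turn that contained the tool_use blocks.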

type

string

required

Value: thinking.

thinking

string

required

The thinking content.

signature

string

required

The signature of the thinking block.

type

string

required

Value: redacted_thinking.

data

string

required

The redacted thinking data.
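A multi-turn conversation is simply a list of alternating user and assistant messages. The sketch below shows one way to accumulate turns; `add_turn` is a hypothetical helper, and the conversation text is illustrative.

```python
def add_turn(messages: list, role: str, content) -> list:
    """Return a new message list with one more turn appended."""
    if role not in ("user", "assistant"):
        raise ValueError(f"unsupported role: {role!r}")
    return messages + [{"role": role, "content": content}]


history = add_turn([], "user", "Hello, how are you?")
history = add_turn(history, "assistant", "Hello! I'm doing well, thank you for asking.")
# Content may also be an array of content blocks instead of a plain string.
history = add_turn(history, "user", [{"type": "text", "text": "Summarize our chat so far."}])

payload = {
    "model": "anthropic/claude-sonnet-4-20250514",
    "max_tokens": 1024,
    "messages": history,
}
```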

max_tokens

number

The maximum number of tokens to generate before stopping. Note that the model may stop before reaching this maximum. Different models have different maximum values for this parameter.

system

string | object[]

System prompt. A system prompt provides context and instructions to the model. Can be a string or an array of TextBlockParam objects, each containing text, type ("text"), optional cache_control, and optional citations.
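For illustration, a system prompt given as an array of text blocks, with a cache breakpoint on the larger block, might look like this. The prompt text and the `REFERENCE_TEXT` placeholder are invented; only the field names come from this page.

```python
import json

REFERENCE_TEXT = "...long, rarely changing reference material..."  # placeholder

payload = {
    "model": "anthropic/claude-sonnet-4-20250514",
    "max_tokens": 1024,
    "system": [
        {"type": "text", "text": "You are a concise technical assistant."},
        {
            "type": "text",
            "text": REFERENCE_TEXT,
            # Mark this block as a cache breakpoint with a one-hour TTL.
            "cache_control": {"type": "ephemeral", "ttl": "1h"},
        },
    ],
    "messages": [{"role": "user", "content": "Hello, how are you?"}],
}
body = json.dumps(payload)  # JSON string to send as the request body
```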

stream

boolean

default:false

Whether to incrementally stream the response using server-sent events (SSE).
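When stream is true, the response arrives as server-sent events. This page does not list the event types, so the sketch below assumes the stream mirrors Anthropic's event names (`content_block_delta` carrying a `text_delta`); treat those names as an assumption. The parser itself handles only the generic `event:` / `data:` line format, and the sample lines are hand-written.

```python
import json


def iter_sse_events(lines):
    """Yield (event_name, parsed_json_data) pairs from SSE text lines."""
    event, data = None, []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line terminates one event.
            if data:
                yield event or "message", json.loads("\n".join(data))
            event, data = None, []
        elif line.startswith(":"):
            continue  # SSE comment / keep-alive
        elif line.startswith("event:"):
            event = line[len("event:"):].strip()
        elif line.startswith("data:"):
            data.append(line[len("data:"):].strip())


sample_stream = [  # hand-written sample, assuming Anthropic-style events
    "event: content_block_delta\n",
    'data: {"type": "content_block_delta", "index": 0, "delta": {"type": "text_delta", "text": "Hel"}}\n',
    "\n",
    "event: content_block_delta\n",
    'data: {"type": "content_block_delta", "index": 0, "delta": {"type": "text_delta", "text": "lo"}}\n',
    "\n",
    "event: message_stop\n",
    'data: {"type": "message_stop"}\n',
    "\n",
]

text = "".join(
    data["delta"]["text"]
    for name, data in iter_sse_events(sample_stream)
    if name == "content_block_delta" and data["delta"].get("type") == "text_delta"
)
```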

temperature

number

default:1

Amount of randomness injected into the response. Ranges from 0.0 to 1.0. Use temperature closer to 0.0 for analytical/multiple choice tasks, and closer to 1.0 for creative and generative tasks. Note that even with temperature of 0.0, the results will not be fully deterministic.

top_p

number

Use nucleus sampling. Computes the cumulative distribution over all options for each subsequent token in decreasing probability order and cuts it off once it reaches the probability specified by top_p. Recommended for advanced use cases only. You usually only need to use temperature.

top_k

number

Only sample from the top K options for each subsequent token. Used to remove “long tail” low probability responses. Recommended for advanced use cases only.

stop_sequences

string[]

Custom text sequences that will cause the model to stop generating. If the model encounters one of the custom sequences, the response stop_reason value will be "stop_sequence" and the response stop_sequence value will contain the matched stop sequence.

tools

object[]

Definitions of tools that the model may use. Supports custom tools, Anthropic built-in tools, and server tools.

Show tool types

Custom Tool
Bash Tool
Text Editor Tool
Web Search Tool

name

string

required

Name of the tool. This is how the tool will be called by the model.

input_schema

object

required

JSON schema for the tool’s input. This defines the shape of the input that your tool accepts and that the model will produce.

description

string

Description of what this tool does. Tool descriptions should be as detailed as possible.

type

string

Value: custom.

cache_control

object

Cache control breakpoint.

Show properties

type

string

required

Value: ephemeral.

ttl

string

Time-to-live. Values: 5m (default), 1h.
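A custom tool definition might be written as below. The `get_weather` tool, its schema, and the `missing_required` helper are hypothetical; the helper only checks top-level required keys and is not a full JSON Schema validator.

```python
get_weather_tool = {
    "type": "custom",
    "name": "get_weather",
    "description": (
        "Get the current weather for a city. Returns a short plain-text "
        "summary. Use when the user asks about present conditions."
    ),
    "input_schema": {
        "type": "object",
        "properties": {"city": {"type": "string", "description": "City name"}},
        "required": ["city"],
    },
}


def missing_required(tool: dict, tool_input: dict) -> list:
    """List required top-level keys absent from a model-produced input."""
    required = tool["input_schema"].get("required", [])
    return [key for key in required if key not in tool_input]


payload = {
    "model": "anthropic/claude-sonnet-4-20250514",
    "max_tokens": 1024,
    "tools": [get_weather_tool],
    "messages": [{"role": "user", "content": "What is the weather in Paris?"}],
}
```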

type

string

required

Value: bash_20250124.

name

string

required

Value: bash.

cache_control

object

Cache control breakpoint.

type

string

required

Value: text_editor_20250124.

name

string

required

Value: str_replace_editor.

cache_control

object

Cache control breakpoint.

type

string

required

Value: web_search_20250305.

name

string

required

Value: web_search.

allowed_domains

string[] | null

If provided, only these domains will be included in results. Cannot be used alongside blocked_domains.

blocked_domains

string[] | null

If provided, these domains will never appear in results. Cannot be used alongside allowed_domains.

max_uses

number | null

Maximum number of times the tool can be used in the API request.

user_location

object | null

Parameters for the user’s location. Used to provide more relevant search results.

Show properties

type

string

required

Value: approximate.

city

string | null

The city of the user.

country

string | null

The two letter ISO country code.

region

string | null

The region of the user.

timezone

string | null

The IANA timezone of the user.

cache_control

object

Cache control breakpoint.
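Because allowed_domains and blocked_domains cannot be combined, a small builder can reject that combination before the request is sent. This is a sketch; `web_search_tool` is a hypothetical helper and the domain and location values are examples.

```python
def web_search_tool(allowed_domains=None, blocked_domains=None,
                    max_uses=None, user_location=None) -> dict:
    """Build a web search tool definition, enforcing the domain-filter rule."""
    if allowed_domains and blocked_domains:
        raise ValueError("allowed_domains and blocked_domains are mutually exclusive")
    tool = {"type": "web_search_20250305", "name": "web_search"}
    if allowed_domains:
        tool["allowed_domains"] = list(allowed_domains)
    if blocked_domains:
        tool["blocked_domains"] = list(blocked_domains)
    if max_uses is not None:
        tool["max_uses"] = max_uses
    if user_location is not None:
        # user_location always uses type "approximate".
        tool["user_location"] = {"type": "approximate", **user_location}
    return tool


tool = web_search_tool(
    allowed_domains=["example.com"],
    max_uses=3,
    user_location={"city": "Berlin", "country": "DE", "timezone": "Europe/Berlin"},
)
```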

tool_choice

object

How the model should use the provided tools. The model can use a specific tool, any available tool, decide by itself, or not use tools at all.

Show variants

auto
any
none
tool

type

string

required

Value: auto.

disable_parallel_tool_use

boolean

Whether to disable parallel tool use. Defaults to false. If true, the model will output at most one tool use.

type

string

required

Value: any.

disable_parallel_tool_use

boolean

Whether to disable parallel tool use. Defaults to false. If true, the model will output exactly one tool use.

type

string

required

Value: none.

type

string

required

Value: tool.

name

string

required

The name of the tool to use.

disable_parallel_tool_use

boolean

Whether to disable parallel tool use. Defaults to false. If true, the model will output exactly one tool use.
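The four tool_choice variants differ only in which fields they accept, which the following sketch encodes. `tool_choice` is a hypothetical helper; the rule that the none variant takes no disable_parallel_tool_use field is inferred from the variant listing above.

```python
def tool_choice(mode: str, name: str = None,
                disable_parallel_tool_use: bool = False) -> dict:
    """Build a tool_choice object for one of: auto, any, none, tool."""
    if mode not in ("auto", "any", "none", "tool"):
        raise ValueError(f"unknown tool_choice type: {mode!r}")
    choice = {"type": mode}
    if mode == "tool":
        if not name:
            raise ValueError("type 'tool' requires a tool name")
        choice["name"] = name
    elif name is not None:
        raise ValueError("name is only valid with type 'tool'")
    if disable_parallel_tool_use:
        if mode == "none":
            raise ValueError("type 'none' does not accept disable_parallel_tool_use")
        choice["disable_parallel_tool_use"] = True
    return choice


force_weather = tool_choice("tool", name="get_weather")  # hypothetical tool name
one_at_a_time = tool_choice("auto", disable_parallel_tool_use=True)
no_tools = tool_choice("none")
```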

thinking

object

Configuration for enabling Claude’s extended thinking. When enabled, responses include thinking content blocks showing Claude’s thinking process before the final answer. Requires a minimum budget of 1,024 tokens.

Show variants

enabled
disabled
adaptive

type

string

required

Value: enabled.

budget_tokens

integer

required

Determines how many tokens Claude can use for its internal reasoning process. Must be ≥1024 and less than max_tokens.

type

string

required

Value: disabled.

type

string

required

Value: adaptive.
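For the enabled variant, the budget constraints (at least 1024 and less than max_tokens) can be checked client-side before sending. A minimal sketch, with `thinking_config` as a hypothetical helper name:

```python
MIN_THINKING_BUDGET = 1024  # minimum stated on this page


def thinking_config(budget_tokens: int, max_tokens: int) -> dict:
    """Build an enabled thinking config, validating the budget bounds."""
    if budget_tokens < MIN_THINKING_BUDGET:
        raise ValueError(f"budget_tokens must be >= {MIN_THINKING_BUDGET}")
    if budget_tokens >= max_tokens:
        raise ValueError("budget_tokens must be less than max_tokens")
    return {"type": "enabled", "budget_tokens": budget_tokens}


max_tokens = 4096
payload = {
    "model": "anthropic/claude-sonnet-4-20250514",
    "max_tokens": max_tokens,
    "thinking": thinking_config(budget_tokens=2048, max_tokens=max_tokens),
    "messages": [{"role": "user", "content": "Hello, how are you?"}],
}
```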

output_config

object

Configuration for controlling output behavior. Supports the effort parameter and structured output format.

Show properties

effort

string | null

How much effort the model should put into its response. Higher effort levels may result in more thorough analysis but take longer. Valid values: low, medium, high, max.

format

object | null

A schema to specify Claude’s output format in responses (structured outputs).

Show properties

type

string

required

Value: json_schema.

schema

object

required

The JSON schema of the format.
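A structured-output request might be assembled as below. The schema and the `parse_structured` helper are hypothetical, and `example_response` is hand-written to show the parsing step; it is not real model output. The sketch assumes the structured result is returned as JSON inside the response's text blocks.

```python
import json

payload = {
    "model": "anthropic/claude-sonnet-4-20250514",
    "max_tokens": 1024,
    "output_config": {
        "effort": "medium",
        "format": {
            "type": "json_schema",
            "schema": {
                "type": "object",
                "properties": {
                    "name": {"type": "string"},
                    "year": {"type": "integer"},
                },
                "required": ["name", "year"],
                "additionalProperties": False,
            },
        },
    },
    "messages": [{"role": "user", "content": "Name a language and its release year."}],
}


def parse_structured(response: dict) -> dict:
    """Join the text blocks of a response and decode them as JSON."""
    text = "".join(b["text"] for b in response["content"] if b["type"] == "text")
    return json.loads(text)


# Hand-written stand-in for a response body.
example_response = {"content": [{"type": "text", "text": '{"name": "Python", "year": 1991}'}]}
result = parse_structured(example_response)
```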

cache_control

object

Top-level cache control. Automatically applies a cache_control marker to the last cacheable block in the request.

Show properties

type

string

required

Value: ephemeral.

ttl

string

Time-to-live for the cache control breakpoint. Values: 5m (5 minutes, default), 1h (1 hour).

service_tier

string

Determines whether to use priority capacity or standard capacity for this request. Supported values: auto, standard_only.

metadata

object

An object describing metadata about the request.

Show properties

user_id

string

An external identifier for the user who is associated with the request. This should be a uuid, hash value, or other opaque identifier.
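One way to produce an opaque user_id is to hash an internal identifier with a salt, so that no email address or account name leaves your system. The helper name, salt, and example address below are hypothetical.

```python
import hashlib


def opaque_user_id(internal_id: str, salt: str) -> str:
    """Derive a stable, non-reversible identifier from an internal user ID."""
    digest = hashlib.sha256(f"{salt}:{internal_id}".encode("utf-8"))
    return digest.hexdigest()


metadata = {"user_id": opaque_user_id("alice@example.com", salt="example-salt")}
```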

Response

Successful message response.

id

string

Unique message identifier, e.g. msg_01XFDUDYJgAACzvnptvVoYEL.

type

string

Object type. Always message.

role

string

Conversational role of the generated message. Always assistant.

container

object | null

Information about the container used in the request (for the code execution tool).

Show properties

id

string

Identifier for the container used in this request.

expires_at

string

The time at which the container will expire.

content

object[]

Content generated by the model. This is an array of content blocks, each of which has a type that determines its shape.

Show content block types

type

string

Value: text.

text

string

The generated text.

citations

object[] | null

Citations supporting the text block. Can be char_location, page_location, content_block_location, web_search_result_location, or search_result_location.

type

string

Value: tool_use.

id

string

The ID of the tool use block.

name

string

The name of the tool.

input

object

The input to the tool as generated by the model.

type

string

Value: thinking.

thinking

string

The thinking content.

signature

string

The signature of the thinking block.

type

string

Value: redacted_thinking.

data

string

The redacted data.

type

string

Value: server_tool_use.

id

string

The ID of the tool use block.

name

string

The name of the server tool, e.g. web_search, web_fetch, or code_execution.

input

object

The input to the server tool.

type

string

Value: web_search_tool_result.

tool_use_id

string

The ID of the tool use this result corresponds to.

content

object | object[]

Search results or error object.

type

string

Value: web_fetch_tool_result.

tool_use_id

string

The ID of the tool use this result corresponds to.

content

object

Fetched content or error object.

type

string

Value: code_execution_tool_result.

tool_use_id

string

The ID of the tool use this result corresponds to.

content

object

Execution result or error object.
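Since each content block's type determines its shape, client code usually dispatches on type. The sketch below separates text from tool calls and ignores other block types; `split_content` is a hypothetical helper, and the response is trimmed from the example at the top of this page.

```python
def split_content(response: dict):
    """Return (joined_text, tool_calls) from a message response."""
    text_parts, tool_calls = [], []
    for block in response.get("content", []):
        kind = block.get("type")
        if kind == "text":
            text_parts.append(block["text"])
        elif kind == "tool_use":
            tool_calls.append((block["id"], block["name"], block["input"]))
        # thinking, server tool, and result blocks are skipped in this sketch
    return "".join(text_parts), tool_calls


response = {
    "id": "msg_01XFDUDYJgAACzvnptvVoYEL",
    "type": "message",
    "role": "assistant",
    "content": [
        {
            "type": "text",
            "text": "Hello! I'm doing well, thank you for asking. How can I help you today?",
            "citations": None,
        }
    ],
    "stop_reason": "end_turn",
}
text, calls = split_content(response)
```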

model

string

The model that handled the request.

stop_reason

string | null

The reason that the model stopped generating. Possible values:

end_turn: the model reached a natural stopping point
max_tokens: the response reached the requested max_tokens or the model's own maximum
stop_sequence: one of your custom stop sequences was generated
tool_use: the model invoked one or more tools
pause_turn: a long-running turn was paused
refusal: streaming classifiers intervened for potential policy violations

stop_sequence

string | null

Which custom stop sequence was generated, if any.
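A client typically branches on stop_reason to decide what to do next. The mapping below is a sketch; the action labels (`done`, `truncated`, and so on) are invented for illustration and are not part of the API.

```python
def next_action(response: dict) -> str:
    """Map a response's stop_reason to a hypothetical client-side action."""
    reason = response.get("stop_reason")
    if reason == "end_turn":
        return "done"
    if reason == "max_tokens":
        return "truncated"  # consider retrying with a larger max_tokens
    if reason == "stop_sequence":
        return f"stopped_on:{response.get('stop_sequence')}"
    if reason == "tool_use":
        return "run_tools"  # execute tools, then send tool_result blocks
    if reason == "pause_turn":
        return "resume"
    if reason == "refusal":
        return "refused"
    return "unknown"


action = next_action({"stop_reason": "end_turn", "stop_sequence": None})
```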

usage

object

Billing and rate-limit usage.

Show properties

input_tokens

integer

The number of input tokens which were used.

output_tokens

integer

The number of output tokens which were used.

cache_creation_input_tokens

integer | null

The number of input tokens used to create the cache entry.

cache_read_input_tokens

integer | null

The number of input tokens read from the cache.

cache_creation

object | null

Breakdown of cached tokens by TTL.

Show properties

ephemeral_5m_input_tokens

integer

The number of input tokens used to create the 5 minute cache entry.

ephemeral_1h_input_tokens

integer

The number of input tokens used to create the 1 hour cache entry.

inference_geo

string | null

The geographic region where inference was performed for this request.

server_tool_use

object | null

The number of server tool requests.

Show properties

web_search_requests

integer

The number of web search tool requests.

web_fetch_requests

integer

The number of web fetch tool requests.

service_tier

string | null

The service tier the request used. Values: standard, priority, batch.
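Several usage fields are nullable, so totals should treat null as zero. The sketch below assumes, as in Anthropic's API, that input_tokens does not already include cached tokens; verify that assumption before relying on the totals for billing. `token_totals` is a hypothetical helper, and the usage object is taken from the example response.

```python
def token_totals(usage: dict) -> dict:
    """Sum token counts from a usage object, treating null/missing as 0."""
    def count(key: str) -> int:
        return usage.get(key) or 0

    input_total = (
        count("input_tokens")
        + count("cache_creation_input_tokens")
        + count("cache_read_input_tokens")
    )
    output_total = count("output_tokens")
    return {"input": input_total, "output": output_total,
            "total": input_total + output_total}


usage = {
    "input_tokens": 12,
    "output_tokens": 19,
    "cache_creation_input_tokens": None,
    "cache_read_input_tokens": None,
}
totals = token_totals(usage)
```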
