Chat
Create a chat completion
POST /v1/chat/completions
curl -X POST https://api-llm.sunra.ai/v1/chat/completions \
  -H "Authorization: Bearer <SUNRA_KEY>" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "openai/gpt-4o",
    "messages": [
      { "role": "system", "content": "You are a helpful assistant." },
      { "role": "user", "content": "What is the capital of France?" }
    ]
  }'
{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1677652288,
  "model": "openai/gpt-4o",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "The capital of France is Paris.",
        "refusal": null
      },
      "finish_reason": "stop",
      "logprobs": null
    }
  ],
  "system_fingerprint": "fp_44709d6fcb",
  "usage": {
    "prompt_tokens": 25,
    "completion_tokens": 8,
    "total_tokens": 33,
    "prompt_tokens_details": null,
    "completion_tokens_details": null
  }
}
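As a rough illustration of reading a response with this shape, the following Python sketch pulls the assistant text and the token usage out of a trimmed copy of the sample body. The helper name `extract_reply` is hypothetical and not part of any SDK.

```python
import json

# Trimmed copy of the sample response body (same field names and values).
SAMPLE_RESPONSE = """
{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "model": "openai/gpt-4o",
  "choices": [
    {
      "index": 0,
      "message": {"role": "assistant", "content": "The capital of France is Paris.", "refusal": null},
      "finish_reason": "stop"
    }
  ],
  "usage": {"prompt_tokens": 25, "completion_tokens": 8, "total_tokens": 33}
}
"""


def extract_reply(body: str) -> tuple[str, dict]:
    """Return the first choice's message text and the usage block (hypothetical helper)."""
    data = json.loads(body)
    message = data["choices"][0]["message"]
    return message["content"], data["usage"]


reply, usage = extract_reply(SAMPLE_RESPONSE)
print(reply)
# In the sample, total_tokens is the sum of prompt and completion tokens.
print(usage["prompt_tokens"] + usage["completion_tokens"] == usage["total_tokens"])
```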
Sends a chat conversation to a model and returns its response. Supports streaming and non-streaming modes; text, image, audio, video, and file content; function calling; reasoning; and structured outputs. Compatible with the OpenAI Chat Completions API format.
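For readers working outside the shell, here is a minimal Python sketch of the same request using only the standard library. It builds the request object without sending it; `build_request` is a hypothetical helper, and `<SUNRA_KEY>` remains a placeholder for a real key.

```python
import json
import urllib.request

API_URL = "https://api-llm.sunra.ai/v1/chat/completions"


def build_request(api_key: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) a POST request mirroring the curl example."""
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


payload = {
    "model": "openai/gpt-4o",
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "What is the capital of France?"},
    ],
}
request = build_request("<SUNRA_KEY>", payload)

# To actually send it (requires a valid key and network access):
# with urllib.request.urlopen(request) as resp:
#     body = json.load(resp)
```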
frequency_penalty: Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model’s likelihood to repeat the same line verbatim.
presence_penalty: Number between -2.0 and 2.0. Positive values penalize new tokens based on whether they appear in the text so far, increasing the model’s likelihood to talk about new topics.
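The two penalties share the same range, so a client can check both before sending. The sketch below assumes the OpenAI-style field names `frequency_penalty` and `presence_penalty`, which the endpoint's stated compatibility suggests but this page does not spell out; `with_penalties` is a hypothetical helper.

```python
def with_penalties(payload: dict, *, frequency_penalty: float, presence_penalty: float) -> dict:
    """Return a copy of payload with both penalties set, after a range check.

    Field names are assumed from the OpenAI Chat Completions format.
    """
    for name, value in (
        ("frequency_penalty", frequency_penalty),
        ("presence_penalty", presence_penalty),
    ):
        if not -2.0 <= value <= 2.0:
            raise ValueError(f"{name} must be between -2.0 and 2.0, got {value}")
    return {
        **payload,
        "frequency_penalty": frequency_penalty,
        "presence_penalty": presence_penalty,
    }


base = {"model": "openai/gpt-4o", "messages": [{"role": "user", "content": "Hi"}]}
# Discourage verbatim repetition and nudge the model toward new topics.
tuned = with_penalties(base, frequency_penalty=0.5, presence_penalty=0.3)
print(tuned["frequency_penalty"], tuned["presence_penalty"])
```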
top_logprobs: An integer between 0 and 20 specifying the number of most likely tokens to return at each token position. logprobs must be set to true if this parameter is used.
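The dependency on logprobs is easy to miss, so a small builder can enforce it. This sketch assumes the field is named `top_logprobs`, as in the OpenAI format; `with_top_logprobs` is a hypothetical helper.

```python
def with_top_logprobs(payload: dict, top_logprobs: int) -> dict:
    """Return a copy of payload requesting per-token alternatives.

    Sets logprobs to true alongside the count, since the count requires it.
    The field name top_logprobs is assumed from the OpenAI format.
    """
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(top_logprobs, bool) or not isinstance(top_logprobs, int):
        raise TypeError("top_logprobs must be an integer")
    if not 0 <= top_logprobs <= 20:
        raise ValueError(f"top_logprobs must be between 0 and 20, got {top_logprobs}")
    return {**payload, "logprobs": True, "top_logprobs": top_logprobs}


request_body = with_top_logprobs({"model": "openai/gpt-4o"}, 5)
print(request_body["logprobs"], request_body["top_logprobs"])
```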
seed: If specified, the system makes a best effort to sample deterministically, so that repeated requests with the same seed and parameters should return the same result.
tool_choice: Controls which tool, if any, the model calls. none disables tool calls, auto lets the model decide, and required forces the model to call a tool. A specific function can also be named.
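The possible values can be sketched as follows. The three string modes come from the description above; the object shape used to name a specific function follows the OpenAI Chat Completions format and is an assumption here, as are the `tool_choice` field name, the helper `make_tool_choice`, and the example function name `get_weather`.

```python
from typing import Optional, Union

ALLOWED_MODES = ("none", "auto", "required")


def make_tool_choice(mode: str = "auto", function_name: Optional[str] = None) -> Union[str, dict]:
    """Build a tool-choice value (hypothetical helper).

    A function name takes precedence and yields the OpenAI-style object form
    (assumed); otherwise one of the three string modes is returned.
    """
    if function_name is not None:
        return {"type": "function", "function": {"name": function_name}}
    if mode not in ALLOWED_MODES:
        raise ValueError(f"mode must be one of {ALLOWED_MODES}, got {mode!r}")
    return mode


print(make_tool_choice("required"))
# get_weather is a made-up function name for illustration.
forced = make_tool_choice(function_name="get_weather")
print(forced)
```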
Enable automatic prompt caching. When set, the system automatically applies cache breakpoints to the last cacheable block in the request. Currently supported for Anthropic Claude models.
system_fingerprint: This fingerprint represents the backend configuration that the model runs with. It can be used together with the seed parameter to detect when backend changes have been made.
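One way to use the fingerprint is to compare two responses that were requested with the same seed and parameters. The sketch below is illustrative only: `compare_runs` is a hypothetical helper, and the second fingerprint value is made up for the example.

```python
def compare_runs(first: dict, second: dict) -> str:
    """Classify two responses that were requested with the same seed and parameters."""
    same_text = (
        first["choices"][0]["message"]["content"]
        == second["choices"][0]["message"]["content"]
    )
    same_backend = first.get("system_fingerprint") == second.get("system_fingerprint")
    if same_text:
        return "reproduced"
    if not same_backend:
        return "backend changed"
    # Determinism is best effort, so outputs can differ even on one backend.
    return "differs on same backend"


def fake_response(text: str, fingerprint: str) -> dict:
    """Build a minimal response-shaped dict for the demonstration."""
    return {
        "choices": [{"message": {"role": "assistant", "content": text}}],
        "system_fingerprint": fingerprint,
    }


run_a = fake_response("The capital of France is Paris.", "fp_44709d6fcb")
run_b = fake_response("Paris is the capital of France.", "fp_hypothetical")
print(compare_runs(run_a, run_a))
print(compare_runs(run_a, run_b))
```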