Queue

For requests that take longer than a few seconds, typical in AI applications, we’ve developed a queue system. This system offers you fine-grained control to manage traffic surges, cancel requests if needed, and monitor your request’s status in the queue. It also eliminates the need for handling long-running HTTP requests.

Queue Endpoints

You can access all queue features through the following endpoints:

Endpoint	Method	Description
api.sunra.ai/v1/queue/{model-id}	POST	Adds a request to the queue
api.sunra.ai/v1/queue/requests/{request_id}/status	GET	Retrieves the status of a request
api.sunra.ai/v1/queue/requests/{request_id}/status/stream	GET	Streams the status until completion
api.sunra.ai/v1/queue/requests/{request_id}	GET	Fetches the response of a request
api.sunra.ai/v1/queue/requests/{request_id}/cancel	PUT	Cancels a request

For example, to submit a request using curl and add it to the queue:

curl -X POST \
  https://api.sunra.ai/v1/queue/black-forest-labs/flux-1.1-pro/text-to-image \
  -H "Authorization: Key $SUNRA_KEY" \
  -d '{"prompt": "A Studio Ghibli-inspired seaside town with colorful houses, laundry flapping, and cats sleeping on windowsills."}'

Here’s a sample response including the request_id:

{
  "request_id": "pd_vXW7VwPN2MbTwT8bzpWrYU5Y",
  "response_url": "https://api.sunra.ai/v1/queue/requests/pd_vXW7VwPN2MbTwT8bzpWrYU5Y",
  "status_url": "https://api.sunra.ai/v1/queue/requests/pd_vXW7VwPN2MbTwT8bzpWrYU5Y/status",
  "cancel_url": "https://api.sunra.ai/v1/queue/requests/pd_vXW7VwPN2MbTwT8bzpWrYU5Y/cancel"
}

The payload includes the request_id along with ready-made URLs for checking status, canceling, and retrieving the response, so you do not have to construct these URLs yourself.
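As a rough illustration, the same submission could be made from Python with only the standard library. This is a minimal sketch, not an official client: the helper names build_submit_request and submit are hypothetical, and sending a Content-Type: application/json header is an assumption (a common convention for JSON bodies).

```python
import json
import urllib.request

BASE_URL = "https://api.sunra.ai/v1/queue"


def build_submit_request(model_id: str, payload: dict, api_key: str) -> urllib.request.Request:
    """Build the POST request that adds a job to the queue (hypothetical helper)."""
    return urllib.request.Request(
        url=f"{BASE_URL}/{model_id}",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Key {api_key}",
            # Assumption: the API accepts an explicit JSON content type.
            "Content-Type": "application/json",
        },
        method="POST",
    )


def submit(model_id: str, payload: dict, api_key: str) -> dict:
    """Send the request and return the parsed JSON body (request_id plus URLs)."""
    request = build_submit_request(model_id, payload, api_key)
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)
```

Calling submit("black-forest-labs/flux-1.1-pro/text-to-image", {"prompt": "..."}, api_key) should return a dictionary shaped like the sample response above, from which status_url, response_url, and cancel_url can be read directly.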

Request Status

To monitor the progress of your request, call the status endpoint with your unique request ID. This lets you track the status and queue position, and retrieve the response once it’s ready.

Endpoint Usage

curl -X GET https://api.sunra.ai/v1/queue/requests/{request_id}/status

Example Response

When your request is in the queue, you’ll receive a response like this:

{
  "status": "IN_QUEUE",
  "metrics": {},
  "queue_position": 0,
  "response_url": "https://api.sunra.ai/v1/queue/requests/pd_hvTNHJPSZj4KgtzytfTGsySf",
  "status_url": "https://api.sunra.ai/v1/queue/requests/pd_hvTNHJPSZj4KgtzytfTGsySf/status",
  "cancel_url": "https://api.sunra.ai/v1/queue/requests/pd_hvTNHJPSZj4KgtzytfTGsySf/cancel"
}

Possible Statuses

Your request can be in one of three states:

IN_QUEUE: Indicates the request is waiting to be processed.
- queue_position: Shows your place in the queue.
- response_url: URL for retrieving the response once processing completes.
IN_PROGRESS: The request is currently being processed.
- logs: Detailed logs (if enabled) showing processing steps.
- response_url: Where the final response will be available.
COMPLETED: Processing has finished.
- logs: Logs detailing the entire process.
- response_url: Direct link to your completed response.
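The three states above lend themselves to a simple polling loop. The sketch below is one possible shape, under stated assumptions: wait_for_completion is a hypothetical helper, fetch_status is any caller-supplied function that returns the parsed status JSON, and the default interval and timeout are arbitrary. How failures surface in the status payload is not described here, so any unrecognized status is treated as an error.

```python
import time
from typing import Callable


def wait_for_completion(
    fetch_status: Callable[[], dict],
    poll_interval: float = 1.0,
    timeout: float = 300.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Poll until the status payload reports COMPLETED, then return it."""
    deadline = time.monotonic() + timeout
    while True:
        status = fetch_status()
        state = status.get("status")
        if state == "COMPLETED":
            return status
        if state not in ("IN_QUEUE", "IN_PROGRESS"):
            # Assumption: anything outside the documented states is an error.
            raise RuntimeError(f"unexpected status: {state!r}")
        if time.monotonic() >= deadline:
            raise TimeoutError("request did not complete in time")
        sleep(poll_interval)
```

In practice fetch_status would issue a GET against the status_url returned at submit time and parse the JSON body. The sleep parameter is injectable only so the loop can be exercised without real delays.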

Enabling Logs

Logs provide insights into request processing. They are disabled by default but can be enabled with a query parameter:

curl -X GET "https://api.sunra.ai/v1/queue/requests/{request_id}/status?logs=1"

Each log entry includes:

message: Description of the event.
level: Severity (e.g., INFO, ERROR).
source: Origin of the log.
timestamp: Time the log was generated.
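Once logs are enabled, the four fields above are enough to render or filter entries on the client. The following sketch assumes the entries arrive as a list of JSON objects under a logs key in the status payload and that timestamp can be treated as an opaque string; both helper names are hypothetical.

```python
def format_log_entry(entry: dict) -> str:
    """Render one log entry as a single readable line."""
    return "{timestamp} [{level}] {source}: {message}".format(
        timestamp=entry.get("timestamp", "-"),
        level=entry.get("level", "-"),
        source=entry.get("source", "-"),
        message=entry.get("message", ""),
    )


def entries_at_level(status: dict, level: str) -> list:
    """Return the log entries in a status payload that match the given level."""
    # Assumption: logs is a list under the "logs" key; it may be absent or null.
    return [e for e in (status.get("logs") or []) if e.get("level") == level]
```

For instance, entries_at_level(status, "ERROR") would pick out only the error entries from a status payload, which can then be printed with format_log_entry.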

Real-Time Monitoring

For continuous updates, use the streaming endpoint:

curl -X GET https://api.sunra.ai/v1/queue/requests/{request_id}/status/stream

This provides real-time status updates in text/event-stream format until the request is completed.
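A text/event-stream response is a sequence of lines in which data: fields carry the payload and a blank line ends each event. The sketch below parses such a stream in Python. It assumes each event's data is a JSON status object like the polling response, which is not confirmed here, and iter_status_events is a hypothetical helper name.

```python
import json
from typing import Iterable, Iterator


def iter_status_events(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one parsed JSON object per server-sent event."""
    data_parts = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line terminates the current event.
            if data_parts:
                yield json.loads("\n".join(data_parts))
                data_parts = []
        elif line.startswith(":"):
            continue  # comment line, often used as a keep-alive
        elif line.startswith("data:"):
            value = line[5:]
            # The SSE format strips a single leading space after the colon.
            if value.startswith(" "):
                value = value[1:]
            data_parts.append(value)
        # Other fields (event:, id:, retry:) are ignored in this sketch.
```

When reading from urllib, the response object can be iterated line by line and each line decoded as UTF-8 before being passed in. A caller would typically stop consuming once an event reports COMPLETED.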

Webhooks

If you’d rather be notified than poll, pass a webhook query parameter when you submit a request. Sunra will POST the final result to that URL once the request reaches a terminal state, so you don’t need to keep a connection open or schedule polling.

Enabling Webhooks

Append a URL-encoded webhook query parameter to the submit endpoint:

curl -X POST \
  "https://api.sunra.ai/v1/queue/black-forest-labs/flux-1.1-pro/text-to-image?webhook=https%3A%2F%2Fexample.com%2Fsunra-webhook" \
  -H "Authorization: Key $SUNRA_KEY" \
  -d '{"prompt": "A Studio Ghibli-inspired seaside town with colorful houses, laundry flapping, and cats sleeping on windowsills."}'

URL-encode the webhook URL (e.g. https%3A%2F%2F...) and make sure it is reachable from Sunra; HTTPS is strongly recommended. The submit response is unchanged — you still receive a request_id and the same status/cancel/response URLs.
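Hand-encoding the webhook URL is easy to get wrong, so it may be simpler to let a library do it. A minimal sketch using Python's urllib.parse.quote follows; submit_url_with_webhook is a hypothetical helper name.

```python
from urllib.parse import quote


def submit_url_with_webhook(model_id: str, webhook_url: str) -> str:
    """Return the submit URL with a percent-encoded webhook query parameter."""
    # safe="" makes quote() also encode "/" and ":", as in the example above.
    encoded = quote(webhook_url, safe="")
    return f"https://api.sunra.ai/v1/queue/{model_id}?webhook={encoded}"
```

Calling it with the model path and https://example.com/sunra-webhook reproduces the URL used in the curl example above.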

When the Webhook Fires

Sunra calls your webhook only on terminal events:

succeeded — the request finished and the output is included in the payload.
failed — the request errored and the error details are included in the payload.

Intermediate states (IN_QUEUE, IN_PROGRESS) do not trigger a webhook. If you also need progress updates, combine the webhook with the streaming or polling endpoints described above.

Request Format

Sunra sends a POST request to your webhook URL with:

Header	Value
`Content-Type`	`application/json`
`User-Agent`	`Sunra-AI-Webhook/1.0`

Your endpoint must respond with a 2xx status code within 5 seconds. Acknowledge fast and offload any heavy work to a background job — slow responses are treated as failures and trigger retries.
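One way to stay inside the 5-second window is to do nothing in the request handler except parse the body and put it on a queue. The sketch below uses Python's standard http.server for illustration only. The names acknowledge, SunraWebhookHandler, and work_queue are hypothetical, and returning 204 is just one valid 2xx choice.

```python
import json
import queue
from http.server import BaseHTTPRequestHandler

# Events wait here until a background worker picks them up.
work_queue: "queue.Queue[dict]" = queue.Queue()


def acknowledge(body: bytes, enqueue) -> int:
    """Parse the webhook body, hand it off, and return the HTTP status to send."""
    try:
        event = json.loads(body)
    except ValueError:
        return 400  # malformed JSON; a non-2xx reply counts as a failed delivery
    enqueue(event)
    return 204  # any 2xx status acknowledges the delivery


class SunraWebhookHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        status = acknowledge(self.rfile.read(length), work_queue.put)
        self.send_response(status)
        self.end_headers()

    def log_message(self, format, *args):
        pass  # silence the default per-request stderr logging
```

The handler could be served with http.server.HTTPServer(("0.0.0.0", 8000), SunraWebhookHandler).serve_forever(), with a separate thread consuming work_queue. A production deployment would more likely sit behind a proper web framework and a durable job queue.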

Payload

A successful event:

{
  "id": "pd_vXW7VwPN2MbTwT8bzpWrYU5Y",
  "object": "prediction",
  "model": "black-forest-labs/flux-1.1-pro",
  "model_endpoint": "text-to-image",
  "status": "succeeded",
  "input": {
    "prompt": "A Studio Ghibli-inspired seaside town..."
  },
  "output": {
    "images": [
      { "url": "https://..." }
    ]
  },
  "created_at": "2026-04-28T12:00:00.000Z",
  "completed_at": "2026-04-28T12:00:08.123Z"
}

A failed event swaps output for error. The exact contents of error vary by failure type — the example below shows one common form:

{
  "id": "pd_vXW7VwPN2MbTwT8bzpWrYU5Y",
  "object": "prediction",
  "model": "black-forest-labs/flux-1.1-pro",
  "model_endpoint": "text-to-image",
  "status": "failed",
  "input": {
    "prompt": "A Studio Ghibli-inspired seaside town..."
  },
  "error": {
    "code": "EXAMPLE_ERROR_CODE",
    "message": "Human-readable error message"
  },
  "created_at": "2026-04-28T12:00:00.000Z",
  "completed_at": "2026-04-28T12:00:08.123Z"
}

Field	Description
`id`	The `request_id` returned at submit time. Use it to deduplicate retries.
`object`	Always `"prediction"`.
`model`	The model owner and name (e.g. `black-forest-labs/flux-1.1-pro`).
`model_endpoint`	The endpoint slug (e.g. `text-to-image`).
`status`	`"succeeded"` or `"failed"`.
`input`	The original request body you submitted.
`output`	Present on `succeeded`. The same response body returned by the result endpoint.
`error`	Present on `failed`. An object describing the failure; the shape depends on the failure type.
`created_at`	ISO 8601 timestamp of when the request was created.
`completed_at`	ISO 8601 timestamp of when the request reached the terminal state.

Retry Behavior

Webhook delivery is at-least-once. If your endpoint times out (5s) or returns a non-2xx status, Sunra retries with exponential backoff — up to 3 retries with delays starting at 10 seconds and capped at 30 seconds. After the final retry, the failure is logged and no further attempts are made. Because retries can deliver the same event more than once, your handler should be idempotent — use the id field as the dedupe key.
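Deduplicating on id can be as simple as remembering which ids have already been handled. The sketch below keeps them in an in-memory set, which is an assumption made for brevity (a real service would usually use a database or cache so the record survives restarts). WebhookDeduplicator and process_once are hypothetical names.

```python
import threading


class WebhookDeduplicator:
    """Remember which event ids have been seen (in memory only)."""

    def __init__(self):
        self._seen = set()
        self._lock = threading.Lock()

    def first_delivery(self, event_id: str) -> bool:
        """Return True the first time an id is seen, False on any repeat."""
        with self._lock:
            if event_id in self._seen:
                return False
            self._seen.add(event_id)
            return True


def process_once(event: dict, dedupe: WebhookDeduplicator, process) -> bool:
    """Run process(event) only for the first delivery of a given id."""
    if not dedupe.first_delivery(event["id"]):
        return False  # a retry of an event that was already handled
    process(event)
    return True
```

Note that this marks an id as seen before process runs, so process should be something cheap and reliable, such as enqueuing the event for a background worker. Either way the endpoint should still answer 2xx for a duplicate so that retries stop.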

Best Practices

Use HTTPS and validate that traffic is reaching the endpoint you expect.
Respond 2xx immediately and process the payload asynchronously.
Treat delivery as at-least-once and dedupe by id.
Combine webhooks with the polling or streaming endpoints if you need progress updates before the terminal event.

Cancelling Requests

If your request is still queued, you can cancel it with:

curl -X PUT https://api.sunra.ai/v1/queue/requests/{request_id}/cancel
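The equivalent call from Python might look like the sketch below. The curl example above omits an Authorization header; this sketch assumes the same Key header used at submit time is expected, which is not confirmed here. build_cancel_request is a hypothetical helper name.

```python
import urllib.request


def build_cancel_request(request_id: str, api_key: str) -> urllib.request.Request:
    """Build the PUT request that cancels a queued request."""
    return urllib.request.Request(
        url=f"https://api.sunra.ai/v1/queue/requests/{request_id}/cancel",
        # Assumption: cancellation uses the same auth header as submission.
        headers={"Authorization": f"Key {api_key}"},
        method="PUT",
    )
```

The request can be sent with urllib.request.urlopen. Alternatively, the cancel_url returned at submit time can be used as the URL directly instead of rebuilding it from the request ID.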

Retrieving Responses

Once your request is COMPLETED, retrieve the response using:

curl -X GET https://api.sunra.ai/v1/queue/requests/{request_id}

This endpoint also provides logs for review.

Simplified Integration with Sunra Client

The Sunra client automates status tracking, simplifying app development with Sunra functions.

Rate Limits

To ensure fair usage and system stability, our API endpoints are subject to the following rate limits:

Endpoint Type	Rate Limit	Burst Limit
Submit to Queue	10 requests/second	100 requests/minute
All Other Endpoints	100 requests/second	1,800 requests/minute

If you exceed these limits, you will receive a 403 Forbidden response. We recommend implementing a retry mechanism with exponential backoff to handle these cases.
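A retry wrapper with exponential backoff could be sketched as follows. The delays, attempt count, and jitter fraction are illustrative assumptions rather than recommended values, and call_with_backoff is a hypothetical helper. It retries only on the 403 status named above. Because 403 is also commonly returned for authorization problems, the number of attempts is kept small so that a bad key fails quickly.

```python
import random
import time
import urllib.error


def call_with_backoff(
    call,
    max_attempts: int = 5,
    base_delay: float = 1.0,
    max_delay: float = 30.0,
    jitter: float = 0.1,
    retry_statuses=(403,),
    sleep=time.sleep,
):
    """Invoke call(), retrying on the given HTTP statuses with growing delays."""
    for attempt in range(max_attempts):
        try:
            return call()
        except urllib.error.HTTPError as err:
            is_last_attempt = attempt == max_attempts - 1
            if err.code not in retry_statuses or is_last_attempt:
                raise
            # Double the delay on each attempt, up to max_delay.
            delay = min(max_delay, base_delay * 2 ** attempt)
            # Add a little random jitter so clients do not retry in lockstep.
            sleep(delay + random.uniform(0, delay * jitter))
```

Here call is any zero-argument function that performs the HTTP request with urllib and raises urllib.error.HTTPError on an error status, for example a lambda wrapping urllib.request.urlopen.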
