GreenPT Docs

Chat Completion Streaming

Stream chat completions with the GreenPT model for real-time responses.

Endpoint

POST https://api.greenpt.ai/v1/chat/completions

Request body

The request body is a JSON object. All parameters listed here are required for a streaming request.

Parameter  Type     Required  Description
model      string   Yes       ID of the model to use (e.g., "green-l").
messages   array    Yes       Array of message objects with role and content.
stream     boolean  Yes       Must be set to true for streaming responses.

Example request

curl https://api.greenpt.ai/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer sk-your_api_key" \
  -d '{
    "model": "green-l",
    "messages": [
      {
        "role": "user",
        "content": "Hello!"
      }
    ],
    "stream": true
  }'
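
The same request can be prepared from Python. This is a minimal sketch using only the standard library; the helper name build_stream_request and its parameters are illustrative and not part of the GreenPT API. It only constructs the request object and does not send anything.

```python
import json
import urllib.request

# Endpoint as documented on this page.
API_URL = "https://api.greenpt.ai/v1/chat/completions"


def build_stream_request(api_key: str, prompt: str,
                         model: str = "green-l") -> urllib.request.Request:
    """Build (but do not send) a streaming chat completion request.

    Hypothetical helper: the name and signature are assumptions made
    for this example.
    """
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": True,  # must be true for streaming responses
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )
```

Passing the result to urllib.request.urlopen should return a response object that can be iterated line by line as bytes, which is one way to consume the stream described in the next section.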

Streaming response format

The response is a text stream of Server-Sent Events (SSE). Each event is a line that starts with data: followed by a JSON object, and events are separated by blank lines. The stream ends with a final data: [DONE] sentinel.

data: {"id":"cmpl-12345","object":"chat.completion.chunk","created":1699044968,"model":"green-l","choices":[{"index":0,"delta":{"role":"assistant","content":"Hello"},"finish_reason":null}]}

data: {"id":"cmpl-12345","object":"chat.completion.chunk","created":1699044968,"model":"green-l","choices":[{"index":0,"delta":{"content":"! How"},"finish_reason":null}]}

data: {"id":"cmpl-12345","object":"chat.completion.chunk","created":1699044968,"model":"green-l","choices":[{"index":0,"delta":{"content":" can I help you today?"},"finish_reason":null}]}

data: {"id":"cmpl-12345","object":"chat.completion.chunk","created":1699044968,"model":"green-l","choices":[{"index":0,"delta":{},"finish_reason":"stop"}]}

data: [DONE]
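
A stream in this shape can be parsed with a few lines of Python. The sketch below assumes each event's JSON sits on a single data: line, as in the example above, and that lines have already been decoded to text. The function name iter_chunks is hypothetical.

```python
import json
from typing import Iterable, Iterator


def iter_chunks(lines: Iterable[str]) -> Iterator[dict]:
    """Yield parsed JSON chunks from SSE lines until the [DONE] sentinel.

    Assumption: one JSON object per "data:" line. Multi-line SSE events
    are not handled by this sketch.
    """
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators and any other SSE fields
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            return  # end of stream
        yield json.loads(payload)
```

Anything after the sentinel is ignored, so the generator is safe to use on a connection that stays open briefly after the last event.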

Reading the stream

Each chat.completion.chunk carries an incremental delta. Concatenate delta.content across chunks to assemble the full assistant message. The last chunk before data: [DONE] includes a finish_reason (for example "stop" or "length") and no further content.
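
The concatenation step might look like the following sketch. It takes already-parsed chunk dictionaries and only follows the choice at index 0. The function name assemble_message is an assumption made for this example.

```python
from typing import Iterable, Optional, Tuple


def assemble_message(chunks: Iterable[dict]) -> Tuple[str, Optional[str]]:
    """Join delta.content across chunks and capture the finish_reason.

    Hypothetical helper: only the choice with index 0 is followed.
    """
    parts = []
    finish_reason = None
    for chunk in chunks:
        for choice in chunk.get("choices", []):
            if choice.get("index", 0) != 0:
                continue
            # A delta may be empty or carry only a role, so content is optional.
            content = choice.get("delta", {}).get("content")
            if content:
                parts.append(content)
            if choice.get("finish_reason") is not None:
                finish_reason = choice["finish_reason"]
    return "".join(parts), finish_reason
```

Applied to the four chunks in the example stream above, this returns the text "Hello! How can I help you today?" together with the finish reason "stop".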

Sustainability impact

The final chunk (the one that also carries usage) includes an impact object reporting the environmental cost of the inference. The example is wrapped across several lines for readability:

data: {"id":"cmpl-12345","object":"chat.completion.chunk","model":"green-l",
  "choices":[{"index":0,"delta":{},"finish_reason":"stop"}],
  "usage":{"prompt_tokens":10,"completion_tokens":22,"total_tokens":32},
  "impact":{
    "inferenceTime": { "total": 1380, "unit": "ms" },
    "energy":        { "total": 526,  "unit": "Wms" },
    "emissions":     { "total": 47,   "unit": "ugCO2e" },
    "version": "20250922"
  }
}

data: [DONE]

emissions is in micrograms of CO₂ equivalent (µgCO₂e), calculated using 1-hour datacenter-level carbon intensity data from Nodera. The value reflects the actual grid conditions at the time of the request, so the same prompt can produce different figures at different times of day. See Carbon Calculations for the full methodology.
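
Once the final chunk is parsed, the impact figures can be converted to more familiar units. The sketch below is illustrative only: the function name summarize_impact is hypothetical, and it assumes that "Wms" means watt-milliseconds (so 1000 Wms equals 1 joule), which this page does not state explicitly.

```python
from typing import Optional


def summarize_impact(chunk: dict) -> Optional[dict]:
    """Convert the impact object of a final chunk to joules and grams.

    Assumptions: "Wms" is watt-milliseconds and "ugCO2e" is micrograms
    of CO2 equivalent. Returns None for chunks without an impact object.
    """
    impact = chunk.get("impact")
    if not impact:
        return None
    energy = impact["energy"]
    emissions = impact["emissions"]
    # Refuse to convert if the units differ from the documented example.
    if energy["unit"] != "Wms" or emissions["unit"] != "ugCO2e":
        raise ValueError("unexpected impact units")
    return {
        "inference_ms": impact["inferenceTime"]["total"],
        "energy_joules": energy["total"] / 1000,             # Wms -> W*s (J)
        "emissions_grams": emissions["total"] / 1_000_000,   # ug -> g
    }
```

Under those assumptions, the example values above (526 Wms and 47 ugCO2e) correspond to roughly 0.526 joules and 0.000047 grams of CO₂ equivalent.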

Models

See the full list of available models on the Models page.
