Chat Completion

Create a chat completion with a GreenPT model. We recommend gemma4: long-context, multimodal, and hosted on sustainable EU infrastructure. See Models for the full list.

POST /v1/chat/completions

Authorization

bearerAuth

Authorization: Bearer <token>

In: header
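
As a minimal sketch of the bearer scheme above, the following builds (but does not send) an authenticated request using only the Python standard library. The base URL is the placeholder from this page's curl example, and `build_request` and the `GREENPT_API_KEY` environment variable are hypothetical names chosen for illustration.

```python
import json
import os
import urllib.request

# Assumption: placeholder host taken from the curl example on this page.
BASE_URL = "https://example.com"


def build_request(payload: dict, api_key: str) -> urllib.request.Request:
    """Build (but do not send) an authenticated POST to /v1/chat/completions."""
    return urllib.request.Request(
        url=f"{BASE_URL}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            # The API key travels in the Authorization header as a bearer token.
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


if __name__ == "__main__":
    # GREENPT_API_KEY is a hypothetical environment variable name.
    key = os.environ.get("GREENPT_API_KEY", "YOUR_API_KEY")
    req = build_request(
        {"model": "gemma4", "messages": [{"role": "user", "content": "Hello"}]},
        key,
    )
    print(req.get_method(), req.full_url)
```

Passing the returned object to `urllib.request.urlopen` would send it; keeping construction separate from sending makes the headers easy to inspect.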

Request Body

application/json


model* string

ID of the model to use.

Default: "gemma4"

messages* array<>

A list of messages comprising the conversation so far.

stream? boolean

If set, partial message deltas will be sent as server-sent events.

Default: false

temperature? number

Sampling temperature. Higher values produce more random output.

Default: 1

Format: float

Range: 0 <= value <= 2

top_p? number

Nucleus sampling probability mass.

Format: float

Range: 0 <= value <= 1

max_tokens? integer

Maximum number of tokens to generate.

Range: 1 <= value

n? integer

Number of completions to generate for each prompt.

Default: 1

Range: 1 <= value

stop? | array<string>

Stop sequence(s) where generation will halt.

presence_penalty? number

Default: 0

Format: float

Range: -2 <= value <= 2

frequency_penalty? number

Default: 0

Format: float

Range: -2 <= value <= 2

user? string

A unique identifier representing your end-user.

reasoning_effort? string

Controls the model's thinking budget. When omitted, thinking is enabled by default. Pass "none" to disable thinking entirely. The values minimal / low / medium / high are accepted for OpenAI-spec compatibility and all currently map to thinking-enabled on GreenPT-hosted models.

Value in: "none" | "minimal" | "low" | "medium" | "high"
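
The ranges and allowed values listed above can be checked client-side before a request is sent. The sketch below is one way to do that; `build_payload`, `RANGES`, and `REASONING_EFFORTS` are hypothetical helper names, not part of any GreenPT SDK, and the server remains the authority on what it accepts.

```python
from typing import Any, Dict, List

# Allowed values for reasoning_effort, copied from the parameter list.
REASONING_EFFORTS = {"none", "minimal", "low", "medium", "high"}

# (low, high) bounds copied from the parameter list; None means no upper bound.
RANGES = {
    "temperature": (0, 2),
    "top_p": (0, 1),
    "max_tokens": (1, None),
    "n": (1, None),
    "presence_penalty": (-2, 2),
    "frequency_penalty": (-2, 2),
}


def build_payload(
    messages: List[Dict[str, Any]], model: str = "gemma4", **options: Any
) -> Dict[str, Any]:
    """Return a request body, raising ValueError for out-of-range options."""
    for name, (low, high) in RANGES.items():
        value = options.get(name)
        if value is None:
            continue  # option not supplied: nothing to validate
        if value < low or (high is not None and value > high):
            raise ValueError(f"{name}={value} is outside the documented range")

    effort = options.get("reasoning_effort")
    if effort is not None and effort not in REASONING_EFFORTS:
        raise ValueError(f"unknown reasoning_effort: {effort!r}")

    body: Dict[str, Any] = {"model": model, "messages": messages}
    # Options left as None are dropped so the server-side defaults apply.
    body.update({k: v for k, v in options.items() if v is not None})
    return body
```

For example, `build_payload([{"role": "user", "content": "Hi"}], temperature=0.2, reasoning_effort="none")` yields a body with thinking disabled, while `temperature=2.5` raises `ValueError` before any network call is made.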

Response Body

`application/json`

curl -X POST "https://example.com/v1/chat/completions" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gemma4",
    "messages": [
      {
        "role": "user",
        "content": "Hello, how are you?"
      }
    ]
  }'

{
  "id": "string",
  "object": "chat.completion",
  "created": 0,
  "model": "string",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "string",
        "reasoning_content": "string",
        "tool_calls": [
          {}
        ]
      },
      "logprobs": {},
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 0,
    "completion_tokens": 0,
    "total_tokens": 0,
    "inferenceTiming": {
      "inferenceTimeMs": 0
    }
  },
  "impact": {
    "inferenceTime": {
      "total": 0,
      "unit": "ms"
    },
    "energy": {
      "total": 0,
      "unit": "Wms"
    },
    "emissions": {
      "total": 0,
      "unit": "ugCO2e"
    },
    "version": "20250922"
  }
}
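
To illustrate how the success body above might be consumed, here is a small parsing sketch. `Completion` and `parse_completion` are hypothetical names, and treating `reasoning_content` as optional is an assumption: this page does not say whether the field is omitted when thinking is disabled.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass
class Completion:
    content: Optional[str]
    reasoning: Optional[str]
    finish_reason: Optional[str]
    total_tokens: int


def parse_completion(raw: str, position: int = 0) -> Completion:
    """Extract the commonly used fields from a chat completion response body."""
    data = json.loads(raw)
    choice = data["choices"][position]  # position in the choices list
    message = choice["message"]
    return Completion(
        content=message.get("content"),
        # Assumption: reasoning_content may be missing, so read it defensively.
        reasoning=message.get("reasoning_content"),
        finish_reason=choice.get("finish_reason"),
        total_tokens=data.get("usage", {}).get("total_tokens", 0),
    )
```

When `n` is greater than 1, calling `parse_completion(raw, position=i)` for each entry in `choices` retrieves the alternative completions.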

{
  "error": {
    "code": 0,
    "message": "string",
    "type": "string",
    "param": "string"
  },
  "requestId": "string"
}

{
  "error": {
    "code": 0,
    "message": "string",
    "type": "string",
    "param": "string"
  },
  "requestId": "string"
}
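
The error bodies above share one shape, so a client can map any non-200 response (this page lists 401 and 429) onto a single exception type. The following is a sketch under that assumption; `ChatCompletionError` and `raise_for_error` are hypothetical names.

```python
import json


class ChatCompletionError(Exception):
    """Carries the fields of the documented error body."""

    def __init__(self, status: int, body: dict):
        error = body.get("error", {})
        self.status = status
        self.code = error.get("code")
        self.type = error.get("type")
        self.param = error.get("param")
        self.request_id = body.get("requestId")
        super().__init__(
            f"HTTP {status}: {error.get('message')} (requestId={self.request_id})"
        )


def raise_for_error(status: int, raw_body: str) -> None:
    """Return silently on 200; otherwise raise with the parsed error details."""
    if status == 200:
        return
    raise ChatCompletionError(status, json.loads(raw_body))
```

Keeping `requestId` on the exception makes it easy to quote when reporting a failed call.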

Notes

Recommended model: gemma4. Other models (e.g. green-l, green-r) are available; see Models.
Streaming: set stream: true and consume the response as Server-Sent Events. See Chat Completion (Streaming).
Authorization: pass your API key in the Authorization header as Bearer YOUR_API_KEY.
Reasoning effort: reasoning_effort is supported on GreenPT-hosted models. Omit it to keep thinking enabled (the default), or pass "none" to disable thinking entirely. OpenAI-spec values (minimal / low / medium / high) are accepted for compatibility and all currently map to thinking-enabled.
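
For the streaming note above, the sketch below shows how Server-Sent Event lines could be reduced to text deltas. The wire format is not specified on this page: the `data:` prefix is standard SSE, but the `[DONE]` sentinel and the `choices[].delta.content` chunk shape are assumptions borrowed from the OpenAI convention, so check Chat Completion (Streaming) for the actual format. `iter_deltas` is a hypothetical helper name.

```python
import json
from typing import Iterable, Iterator


def iter_deltas(lines: Iterable[str]) -> Iterator[str]:
    """Yield text fragments from SSE lines of an assumed OpenAI-style stream."""
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators, comments, and other SSE fields
        data = line[len("data:"):].strip()
        if data == "[DONE]":  # assumed end-of-stream sentinel
            break
        chunk = json.loads(data)
        for choice in chunk.get("choices", []):
            # Assumed chunk shape: each choice carries a partial "delta".
            text = choice.get("delta", {}).get("content")
            if text:
                yield text
```

Joining the yielded fragments reconstructs the full message, and printing them as they arrive gives incremental output.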

Sustainability impact

Every response includes an impact object with the environmental cost of that specific inference:

"impact": {
  "inferenceTime": { "total": 1380, "unit": "ms" },
  "energy":        { "total": 526,  "unit": "Wms" },
  "emissions":     { "total": 47,   "unit": "ugCO2e" },
  "version": "20250922"
}

emissions is in micrograms of CO₂ equivalent (µgCO₂e). The value is calculated using 1-hour datacenter-level carbon intensity data from Nodera, so it reflects what the electricity grid actually looked like at the moment your request was processed. The same prompt sent at different times of day will produce different emissions values as the share of renewable generation on the grid changes. See Carbon Calculations for the full methodology.
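
The impact values can be converted to more familiar units with simple arithmetic. The sketch below assumes that "Wms" means watt-milliseconds (so dividing by 1000 gives joules); this page does not define the unit explicitly, so treat that as an assumption. `summarize_impact` is a hypothetical helper name.

```python
EXPECTED_UNITS = {"inferenceTime": "ms", "energy": "Wms", "emissions": "ugCO2e"}


def summarize_impact(impact: dict) -> dict:
    """Convert an impact object to seconds, joules, watt-hours, and grams CO2e."""
    for key, unit in EXPECTED_UNITS.items():
        if impact[key]["unit"] != unit:
            # Refuse to convert if the units differ from the documented example.
            raise ValueError(f"unexpected unit for {key}: {impact[key]['unit']!r}")

    # Assumption: Wms is watt-milliseconds, so W*ms / 1000 = W*s = joules.
    joules = impact["energy"]["total"] / 1000
    return {
        "seconds": impact["inferenceTime"]["total"] / 1000,
        "joules": joules,
        "watt_hours": joules / 3600,
        # Micrograms to grams.
        "grams_co2e": impact["emissions"]["total"] / 1_000_000,
    }
```

Under that assumption, the example object above corresponds to 1.38 s of inference, 0.526 J of energy, and 0.000047 g of CO₂ equivalent.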

Response Body

200 application/json

401 application/json

429 application/json
