Every LLM parameter, for every model.
An open, community-maintained catalog of LLM model parameters. Search, filter, and link straight to the knobs you can turn. API-key and subscription variants of the same model are listed separately, because they behave differently.
174 of 174 models
OpenAI 41
OpenAI Chatgpt 4o Latest 3 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
| Length · 1 param | | | | |
| Max tokens `max_tokens` | integer (1…+∞) | 4096 | Maximum number of output tokens the model may generate. | — |
| Sampling · 2 params | | | | |
| Temperature `temperature` | number (0…2 step 0.1) | 1 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. | — |
| Top P `top_p` | number (0…1 step 0.01) | 1 | Controls nucleus sampling by limiting generation to tokens whose cumulative probability reaches this value. | — |
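To make the two sampling descriptions concrete, here is a minimal sketch of how temperature scaling and nucleus (top-p) filtering are commonly implemented. It is an illustration only, not any provider's actual sampler: the function names and the example scores are hypothetical, and the sketch assumes a temperature above 0 (a value of exactly 0 would need separate handling).

```python
import math

def apply_temperature(logits, temperature):
    """Turn raw scores into probabilities, scaled by temperature (must be > 0 here)."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def nucleus_filter(probs, top_p):
    """Keep the most likely tokens until their cumulative probability reaches top_p."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = [], 0.0
    for i in order:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break
    mass = sum(probs[i] for i in kept)
    # Renormalize so the surviving tokens sum to 1 again.
    return {i: probs[i] / mass for i in kept}

logits = [2.0, 1.0, 0.5, -1.0]  # made-up scores for four candidate tokens
sharp = apply_temperature(logits, 0.5)
flat = apply_temperature(logits, 2.0)
print(max(sharp) > max(flat))  # True: lower temperature concentrates probability
print(sorted(nucleus_filter(apply_temperature(logits, 1.0), 0.8)))  # [0, 1]
```

With `top_p` at its default of 1, the filter keeps every token, so the parameter has no effect until it is lowered.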
OpenAI Gpt 3.5 Turbo 3 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
| Length · 1 param | | | | |
| Max tokens `max_tokens` | integer (1…+∞) | 4096 | Maximum number of output tokens the model may generate. | — |
| Sampling · 2 params | | | | |
| Temperature `temperature` | number (0…2 step 0.1) | 1 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. | — |
| Top P `top_p` | number (0…1 step 0.01) | 1 | Controls nucleus sampling by limiting generation to tokens whose cumulative probability reaches this value. | — |
OpenAI Gpt 4 Turbo 3 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
| Length · 1 param | | | | |
| Max tokens `max_tokens` | integer (1…+∞) | 4096 | Maximum number of output tokens the model may generate. | — |
| Sampling · 2 params | | | | |
| Temperature `temperature` | number (0…2 step 0.1) | 1 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. | — |
| Top P `top_p` | number (0…1 step 0.01) | 1 | Controls nucleus sampling by limiting generation to tokens whose cumulative probability reaches this value. | — |
OpenAI Gpt 4 Turbo 2024-04-09 3 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
| Length · 1 param | | | | |
| Max tokens `max_tokens` | integer (1…+∞) | 4096 | Maximum number of output tokens the model may generate. | — |
| Sampling · 2 params | | | | |
| Temperature `temperature` | number (0…2 step 0.1) | 1 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. | — |
| Top P `top_p` | number (0…1 step 0.01) | 1 | Controls nucleus sampling by limiting generation to tokens whose cumulative probability reaches this value. | — |
OpenAI Gpt 4.1 3 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
| Length · 1 param | | | | |
| Max tokens `max_tokens` | integer (1…+∞) | 4096 | Maximum number of output tokens the model may generate. | — |
| Sampling · 2 params | | | | |
| Temperature `temperature` | number (0…2 step 0.1) | 1 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. | — |
| Top P `top_p` | number (0…1 step 0.01) | 1 | Controls nucleus sampling by limiting generation to tokens whose cumulative probability reaches this value. | — |
OpenAI Gpt 4.1 Mini 3 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
| Length · 1 param | | | | |
| Max tokens `max_tokens` | integer (1…+∞) | 4096 | Maximum number of output tokens the model may generate. | — |
| Sampling · 2 params | | | | |
| Temperature `temperature` | number (0…2 step 0.1) | 1 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. | — |
| Top P `top_p` | number (0…1 step 0.01) | 1 | Controls nucleus sampling by limiting generation to tokens whose cumulative probability reaches this value. | — |
OpenAI Gpt 4.1 Nano 3 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
| Length · 1 param | | | | |
| Max tokens `max_tokens` | integer (1…+∞) | 4096 | Maximum number of output tokens the model may generate. | — |
| Sampling · 2 params | | | | |
| Temperature `temperature` | number (0…2 step 0.1) | 1 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. | — |
| Top P `top_p` | number (0…1 step 0.01) | 1 | Controls nucleus sampling by limiting generation to tokens whose cumulative probability reaches this value. | — |
OpenAI GPT-4o 3 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
| Length · 1 param | | | | |
| Max tokens `max_tokens` | integer (1…+∞) | 4096 | Maximum number of output tokens the model may generate. | — |
| Sampling · 2 params | | | | |
| Temperature `temperature` | number (0…2 step 0.1) | 1 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. | — |
| Top P `top_p` | number (0…1 step 0.01) | 1 | Controls nucleus sampling by limiting generation to tokens whose cumulative probability reaches this value. | — |
OpenAI Gpt 4o 2024-11-20 3 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
| Length · 1 param | | | | |
| Max tokens `max_tokens` | integer (1…+∞) | 4096 | Maximum number of output tokens the model may generate. | — |
| Sampling · 2 params | | | | |
| Temperature `temperature` | number (0…2 step 0.1) | 1 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. | — |
| Top P `top_p` | number (0…1 step 0.01) | 1 | Controls nucleus sampling by limiting generation to tokens whose cumulative probability reaches this value. | — |
OpenAI GPT-4o mini 3 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
| Length · 1 param | | | | |
| Max tokens `max_tokens` | integer (1…+∞) | 4096 | Maximum number of output tokens the model may generate. | — |
| Sampling · 2 params | | | | |
| Temperature `temperature` | number (0…2 step 0.1) | 1 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. | — |
| Top P `top_p` | number (0…1 step 0.01) | 1 | Controls nucleus sampling by limiting generation to tokens whose cumulative probability reaches this value. | — |
OpenAI Gpt 5 2 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
| Length · 1 param | | | | |
| Max completion tokens `max_completion_tokens` | integer (16…+∞) | 4096 | Maximum number of output tokens the model may generate. | — |
| Reasoning · 1 param | | | | |
| Reasoning effort `reasoning_effort` | enum (minimal, low, medium, high) | "medium" | Controls how much reasoning the model should perform before producing an answer. | — |
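The type column can be read as a set of checks to run before sending a request. The sketch below validates the two parameters listed for this model against the catalog's range and enum. The helper name `build_reasoning_params` is hypothetical, nothing is sent over the network, and the allowed efforts are passed in because the enum differs from model to model in this catalog.

```python
def build_reasoning_params(max_completion_tokens=4096,
                           reasoning_effort="medium",
                           allowed_efforts=("minimal", "low", "medium", "high")):
    """Check values against the catalog's listed range and enum, then return a dict."""
    if not isinstance(max_completion_tokens, int) or max_completion_tokens < 16:
        raise ValueError("max_completion_tokens must be an integer >= 16")
    if reasoning_effort not in allowed_efforts:
        raise ValueError(f"reasoning_effort must be one of {allowed_efforts}")
    return {
        "max_completion_tokens": max_completion_tokens,
        "reasoning_effort": reasoning_effort,
    }

print(build_reasoning_params(1024, "high"))
# {'max_completion_tokens': 1024, 'reasoning_effort': 'high'}
```

Passing a different `allowed_efforts` tuple adapts the same check to entries whose enum includes values such as `none` or `xhigh`.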
OpenAI Gpt 5 Chat Latest 1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
| Length · 1 param | | | | |
| Max completion tokens `max_completion_tokens` | integer (16…+∞) | 4096 | Maximum number of output tokens the model may generate. | — |
OpenAI Gpt 5 Mini 2 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
| Length · 1 param | | | | |
| Max completion tokens `max_completion_tokens` | integer (16…+∞) | 4096 | Maximum number of output tokens the model may generate. | — |
| Reasoning · 1 param | | | | |
| Reasoning effort `reasoning_effort` | enum (minimal, low, medium, high) | "medium" | Controls how much reasoning the model should perform before producing an answer. | — |
OpenAI Gpt 5 Nano 2 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
| Length · 1 param | | | | |
| Max completion tokens `max_completion_tokens` | integer (16…+∞) | 4096 | Maximum number of output tokens the model may generate. | — |
| Reasoning · 1 param | | | | |
| Reasoning effort `reasoning_effort` | enum (minimal, low, medium, high) | "medium" | Controls how much reasoning the model should perform before producing an answer. | — |
OpenAI Gpt 5.1 2 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
| Length · 1 param | | | | |
| Max completion tokens `max_completion_tokens` | integer (16…+∞) | 4096 | Maximum number of output tokens the model may generate. | — |
| Reasoning · 1 param | | | | |
| Reasoning effort `reasoning_effort` | enum (none, low, medium, high) | "none" | Controls how much reasoning the model should perform before producing an answer. | — |
OpenAI Gpt 5.1 Codex Max Subscription 3 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
| Reasoning · 2 params | | | | |
| Reasoning effort `reasoning.effort` | enum (minimal, low, medium, high, xhigh) | "medium" | Controls how much reasoning the model should perform before producing an answer. | — |
| Reasoning summary `reasoning.summary` | enum (auto, concise, detailed, none) | "auto" | Controls the level of reasoning summary returned with the response. | — |
| Output · 1 param | | | | |
| Verbosity `text.verbosity` | enum (low, medium, high) | "medium" | Controls how concise or detailed the model's final text response should be. | — |
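Subscription variants list their parameters with dotted names. One plausible reading, and it is only an assumption here, is that each dot marks a level of nesting in the request body. The hypothetical helper below expands such flat names into nested dictionaries under that assumption.

```python
def nest_params(flat):
    """Expand dotted parameter names into nested dicts (assumed request shape)."""
    nested = {}
    for path, value in flat.items():
        *parents, leaf = path.split(".")
        node = nested
        for key in parents:
            # Walk down, creating intermediate objects as needed.
            node = node.setdefault(key, {})
        node[leaf] = value
    return nested

flat = {
    "reasoning.effort": "high",
    "reasoning.summary": "auto",
    "text.verbosity": "low",
}
print(nest_params(flat))
# {'reasoning': {'effort': 'high', 'summary': 'auto'}, 'text': {'verbosity': 'low'}}
```

Names without a dot, such as `max_completion_tokens`, pass through unchanged as top-level keys.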
OpenAI Gpt 5.1 Codex Subscription 3 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
| Reasoning · 2 params | | | | |
| Reasoning effort `reasoning.effort` | enum (minimal, low, medium, high) | "medium" | Controls how much reasoning the model should perform before producing an answer. | — |
| Reasoning summary `reasoning.summary` | enum (auto, concise, detailed, none) | "auto" | Controls the level of reasoning summary returned with the response. | — |
| Output · 1 param | | | | |
| Verbosity `text.verbosity` | enum (low, medium, high) | "medium" | Controls how concise or detailed the model's final text response should be. | — |
OpenAI Gpt 5.2 2 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
| Length · 1 param | | | | |
| Max completion tokens `max_completion_tokens` | integer (16…+∞) | 4096 | Maximum number of output tokens the model may generate. | — |
| Reasoning · 1 param | | | | |
| Reasoning effort `reasoning_effort` | enum (none, low, medium, high, xhigh) | "medium" | Controls how much reasoning the model should perform before producing an answer. | — |
OpenAI Gpt 5.2 Codex Subscription 3 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
| Reasoning · 2 params | | | | |
| Reasoning effort `reasoning.effort` | enum (minimal, low, medium, high, xhigh) | "medium" | Controls how much reasoning the model should perform before producing an answer. | — |
| Reasoning summary `reasoning.summary` | enum (auto, concise, detailed, none) | "auto" | Controls the level of reasoning summary returned with the response. | — |
| Output · 1 param | | | | |
| Verbosity `text.verbosity` | enum (low, medium, high) | "medium" | Controls how concise or detailed the model's final text response should be. | — |
OpenAI Gpt 5.2 Subscription 3 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
| Reasoning · 2 params | | | | |
| Reasoning effort `reasoning.effort` | enum (minimal, low, medium, high, xhigh) | "medium" | Controls how much reasoning the model should perform before producing an answer. | — |
| Reasoning summary `reasoning.summary` | enum (auto, concise, detailed, none) | "auto" | Controls the level of reasoning summary returned with the response. | — |
| Output · 1 param | | | | |
| Verbosity `text.verbosity` | enum (low, medium, high) | "medium" | Controls how concise or detailed the model's final text response should be. | — |
OpenAI Gpt 5.3 Codex 2 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
| Length · 1 param | | | | |
| Max completion tokens `max_completion_tokens` | integer (16…+∞) | 4096 | Maximum number of output tokens the model may generate. | — |
| Reasoning · 1 param | | | | |
| Reasoning effort `reasoning_effort` | enum (low, medium, high, xhigh) | "medium" | Controls how much reasoning the model should perform before producing an answer. | — |
OpenAI Gpt 5.3 Codex Spark Subscription 3 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
| Reasoning · 2 params | | | | |
| Reasoning effort `reasoning.effort` | enum (minimal, low, medium, high, xhigh) | "medium" | Controls how much reasoning the model should perform before producing an answer. | — |
| Reasoning summary `reasoning.summary` | enum (auto, concise, detailed, none) | "auto" | Controls the level of reasoning summary returned with the response. | — |
| Output · 1 param | | | | |
| Verbosity `text.verbosity` | enum (low, medium, high) | "medium" | Controls how concise or detailed the model's final text response should be. | — |
OpenAI Gpt 5.3 Codex Subscription 3 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
| Reasoning · 2 params | | | | |
| Reasoning effort `reasoning.effort` | enum (minimal, low, medium, high, xhigh) | "medium" | Controls how much reasoning the model should perform before producing an answer. | — |
| Reasoning summary `reasoning.summary` | enum (auto, concise, detailed, none) | "auto" | Controls the level of reasoning summary returned with the response. | — |
| Output · 1 param | | | | |
| Verbosity `text.verbosity` | enum (low, medium, high) | "medium" | Controls how concise or detailed the model's final text response should be. | — |
OpenAI Gpt 5.4 2 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
| Length · 1 param | | | | |
| Max completion tokens `max_completion_tokens` | integer (16…+∞) | 4096 | Maximum number of output tokens the model may generate. | — |
| Reasoning · 1 param | | | | |
| Reasoning effort `reasoning_effort` | enum (none, low, medium, high, xhigh) | "medium" | Controls how much reasoning the model should perform before producing an answer. | — |
OpenAI Gpt 5.4 Mini 2 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
| Length · 1 param | | | | |
| Max completion tokens `max_completion_tokens` | integer (16…+∞) | 4096 | Maximum number of output tokens the model may generate. | — |
| Reasoning · 1 param | | | | |
| Reasoning effort `reasoning_effort` | enum (none, low, medium, high, xhigh) | "medium" | Controls how much reasoning the model should perform before producing an answer. | — |
OpenAI Gpt 5.4 Mini Subscription 3 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
| Reasoning · 2 params | | | | |
| Reasoning effort `reasoning.effort` | enum (minimal, low, medium, high, xhigh) | "medium" | Controls how much reasoning the model should perform before producing an answer. | — |
| Reasoning summary `reasoning.summary` | enum (auto, concise, detailed, none) | "auto" | Controls the level of reasoning summary returned with the response. | — |
| Output · 1 param | | | | |
| Verbosity `text.verbosity` | enum (low, medium, high) | "medium" | Controls how concise or detailed the model's final text response should be. | — |
OpenAI Gpt 5.4 Nano 2 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
| Length · 1 param | | | | |
| Max completion tokens `max_completion_tokens` | integer (16…+∞) | 4096 | Maximum number of output tokens the model may generate. | — |
| Reasoning · 1 param | | | | |
| Reasoning effort `reasoning_effort` | enum (none, low, medium, high, xhigh) | "medium" | Controls how much reasoning the model should perform before producing an answer. | — |
OpenAI Gpt 5.4 Pro 2 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
| Length · 1 param | | | | |
| Max completion tokens `max_completion_tokens` | integer (16…+∞) | 4096 | Maximum number of output tokens the model may generate. | — |
| Reasoning · 1 param | | | | |
| Reasoning effort `reasoning_effort` | enum (medium, high, xhigh) | "medium" | Controls how much reasoning the model should perform before producing an answer. | — |
OpenAI Gpt 5.4 Pro Subscription 3 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
| Reasoning · 2 params | | | | |
| Reasoning effort `reasoning.effort` | enum (medium, high, xhigh) | "medium" | Controls how much reasoning the model should perform before producing an answer. | — |
| Reasoning summary `reasoning.summary` | enum (auto, concise, detailed, none) | "auto" | Controls the level of reasoning summary returned with the response. | — |
| Output · 1 param | | | | |
| Verbosity `text.verbosity` | enum (low, medium, high) | "medium" | Controls how concise or detailed the model's final text response should be. | — |
OpenAI Gpt 5.4 Subscription 3 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
| Reasoning · 2 params | | | | |
| Reasoning effort `reasoning.effort` | enum (minimal, low, medium, high, xhigh) | "medium" | Controls how much reasoning the model should perform before producing an answer. | — |
| Reasoning summary `reasoning.summary` | enum (auto, concise, detailed, none) | "auto" | Controls the level of reasoning summary returned with the response. | — |
| Output · 1 param | | | | |
| Verbosity `text.verbosity` | enum (low, medium, high) | "medium" | Controls how concise or detailed the model's final text response should be. | — |
OpenAI Gpt 5.5 2 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
| Length · 1 param | | | | |
| Max completion tokens `max_completion_tokens` | integer (16…+∞) | 4096 | Maximum number of output tokens the model may generate. | — |
| Reasoning · 1 param | | | | |
| Reasoning effort `reasoning_effort` | enum (none, low, medium, high, xhigh) | "medium" | Controls how much reasoning the model should perform before producing an answer. | — |
OpenAI Gpt 5.5 Pro 2 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
| Length · 1 param | | | | |
| Max completion tokens `max_completion_tokens` | integer (16…+∞) | 4096 | Maximum number of output tokens the model may generate. | — |
| Reasoning · 1 param | | | | |
| Reasoning effort `reasoning_effort` | enum (medium, high, xhigh) | "medium" | Controls how much reasoning the model should perform before producing an answer. | — |
OpenAI Gpt 5.5 Pro Subscription 3 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
| Reasoning · 2 params | | | | |
| Reasoning effort `reasoning.effort` | enum (medium, high, xhigh) | "medium" | Controls how much reasoning the model should perform before producing an answer. | — |
| Reasoning summary `reasoning.summary` | enum (auto, concise, detailed, none) | "auto" | Controls the level of reasoning summary returned with the response. | — |
| Output · 1 param | | | | |
| Verbosity `text.verbosity` | enum (low, medium, high) | "medium" | Controls how concise or detailed the model's final text response should be. | — |
OpenAI Gpt 5.5 Subscription 3 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
| Reasoning · 2 params | | | | |
| Reasoning effort `reasoning.effort` | enum (minimal, low, medium, high, xhigh) | "medium" | Controls how much reasoning the model should perform before producing an answer. | — |
| Reasoning summary `reasoning.summary` | enum (auto, concise, detailed, none) | "auto" | Controls the level of reasoning summary returned with the response. | — |
| Output · 1 param | | | | |
| Verbosity `text.verbosity` | enum (low, medium, high) | "medium" | Controls how concise or detailed the model's final text response should be. | — |
OpenAI o1 2 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
| Length · 1 param | | | | |
| Max completion tokens `max_completion_tokens` | integer (16…+∞) | 4096 | Maximum number of output tokens the model may generate. | — |
| Reasoning · 1 param | | | | |
| Reasoning effort `reasoning_effort` | enum (low, medium, high, xhigh) | "medium" | Controls how much reasoning the model should perform before producing an answer. | — |
OpenAI o1-mini 2 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
| Length · 1 param | | | | |
| Max tokens `max_tokens` | integer (1…+∞) | 4096 | Maximum number of output tokens the model may generate. | — |
| Reasoning · 1 param | | | | |
| Reasoning effort `reasoning_effort` | enum (minimal, low, medium, high) | "medium" | Controls how much reasoning the model should perform before producing an answer. | — |
OpenAI O1 Preview 2 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
| Length · 1 param | | | | |
| Max tokens `max_tokens` | integer (1…+∞) | 4096 | Maximum number of output tokens the model may generate. | — |
| Reasoning · 1 param | | | | |
| Reasoning effort `reasoning_effort` | enum (minimal, low, medium, high) | "medium" | Controls how much reasoning the model should perform before producing an answer. | — |
OpenAI o3 2 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
| Length · 1 param | | | | |
| Max completion tokens `max_completion_tokens` | integer (16…+∞) | 4096 | Maximum number of output tokens the model may generate. | — |
| Reasoning · 1 param | | | | |
| Reasoning effort `reasoning_effort` | enum (low, medium, high, xhigh) | "medium" | Controls how much reasoning the model should perform before producing an answer. | — |
OpenAI o3-mini 2 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
| Length · 1 param | | | | |
| Max completion tokens `max_completion_tokens` | integer (16…+∞) | 4096 | Maximum number of output tokens the model may generate. | — |
| Reasoning · 1 param | | | | |
| Reasoning effort `reasoning_effort` | enum (low, medium, high, xhigh) | "medium" | Controls how much reasoning the model should perform before producing an answer. | — |
OpenAI O3 Pro 2 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
| Length · 1 param | | | | |
| Max completion tokens `max_completion_tokens` | integer (16…+∞) | 4096 | Maximum number of output tokens the model may generate. | — |
| Reasoning · 1 param | | | | |
| Reasoning effort `reasoning_effort` | enum (low, medium, high, xhigh) | "medium" | Controls how much reasoning the model should perform before producing an answer. | — |
OpenAI o4-mini 2 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
| Length · 1 param | | | | |
| Max completion tokens `max_completion_tokens` | integer (16…+∞) | 4096 | Maximum number of output tokens the model may generate. | — |
| Reasoning · 1 param | | | | |
| Reasoning effort `reasoning_effort` | enum (low, medium, high, xhigh) | "medium" | Controls how much reasoning the model should perform before producing an answer. | — |
Anthropic 36
Anthropic Claude 3.5 Haiku 20241022 4 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
| Length · 1 param | | | | |
| Max tokens `max_tokens` | integer (1…+∞) | 4096 | Maximum number of output tokens the model may generate. | — |
| Sampling · 3 params | | | | |
| Temperature `temperature` | number (0…1 step 0.1) | 1 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. | Not when thinking.type ∈ {"adaptive", "enabled"} |
| Top P `top_p` | number (0…1 step 0.01) | 1 | Controls nucleus sampling by limiting generation to tokens whose cumulative probability reaches this value. | Not when thinking.type ∈ {"adaptive", "enabled"} or temperature ≠ 1 |
| Top K `top_k` | integer (0…+∞) | 0 | Limits token sampling to the top K most likely next tokens. | Not when thinking.type ∈ {"adaptive", "enabled"} |
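As an illustration of the Top K description, the sketch below keeps only the K most likely tokens and renormalizes their probabilities. It is a generic sketch, not Anthropic's sampler: the function name is hypothetical, and treating `k <= 0` as "no filtering" is an assumption made here to give the listed default of 0 a harmless meaning.

```python
def top_k_filter(probs, k):
    """Keep the k most likely tokens and renormalize their probabilities."""
    if k <= 0 or k >= len(probs):
        # Assumption: a non-positive k disables the filter.
        return dict(enumerate(probs))
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    mass = sum(probs[i] for i in order)
    return {i: probs[i] / mass for i in order}

probs = [0.5, 0.3, 0.15, 0.05]  # made-up distribution over four tokens
print(sorted(top_k_filter(probs, 2)))  # [0, 1]
print(len(top_k_filter(probs, 0)))     # 4
```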
Anthropic Claude 3.5 Haiku Latest 4 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
| Length · 1 param | | | | |
| Max tokens `max_tokens` | integer (1…+∞) | 4096 | Maximum number of output tokens the model may generate. | — |
| Sampling · 3 params | | | | |
| Temperature `temperature` | number (0…1 step 0.1) | 1 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. | Not when thinking.type ∈ {"adaptive", "enabled"} |
| Top P `top_p` | number (0…1 step 0.01) | 1 | Controls nucleus sampling by limiting generation to tokens whose cumulative probability reaches this value. | Not when thinking.type ∈ {"adaptive", "enabled"} or temperature ≠ 1 |
| Top K `top_k` | integer (0…+∞) | 0 | Limits token sampling to the top K most likely next tokens. | Not when thinking.type ∈ {"adaptive", "enabled"} |
Anthropic Claude 3.5 Sonnet 20241022 4 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
| Length · 1 param | | | | |
| Max tokens `max_tokens` | integer (1…+∞) | 4096 | Maximum number of output tokens the model may generate. | — |
| Sampling · 3 params | | | | |
| Temperature `temperature` | number (0…1 step 0.1) | 1 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. | Not when thinking.type ∈ {"adaptive", "enabled"} |
| Top P `top_p` | number (0…1 step 0.01) | 1 | Controls nucleus sampling by limiting generation to tokens whose cumulative probability reaches this value. | Not when thinking.type ∈ {"adaptive", "enabled"} or temperature ≠ 1 |
| Top K `top_k` | integer (0…+∞) | 0 | Limits token sampling to the top K most likely next tokens. | Not when thinking.type ∈ {"adaptive", "enabled"} |
Anthropic Claude 3.5 Sonnet Latest 4 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
| Length · 1 param | | | | |
| Max tokens `max_tokens` | integer (1…+∞) | 4096 | Maximum number of output tokens the model may generate. | — |
| Sampling · 3 params | | | | |
| Temperature `temperature` | number (0…1 step 0.1) | 1 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. | Not when thinking.type ∈ {"adaptive", "enabled"} |
| Top P `top_p` | number (0…1 step 0.01) | 1 | Controls nucleus sampling by limiting generation to tokens whose cumulative probability reaches this value. | Not when thinking.type ∈ {"adaptive", "enabled"} or temperature ≠ 1 |
| Top K `top_k` | integer (0…+∞) | 0 | Limits token sampling to the top K most likely next tokens. | Not when thinking.type ∈ {"adaptive", "enabled"} |
Anthropic Claude 3.7 Sonnet 20250219 6 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
| Length · 1 param | | | | |
| Max tokens `max_tokens` | integer (1…+∞) | 4096 | Maximum number of output tokens the model may generate. | — |
| Sampling · 3 params | | | | |
| Temperature `temperature` | number (0…1 step 0.1) | 1 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. | Not when thinking.type ∈ {"adaptive", "enabled"} |
| Top P `top_p` | number (0…1 step 0.01) | 1 | Controls nucleus sampling by limiting generation to tokens whose cumulative probability reaches this value. | Not when thinking.type ∈ {"adaptive", "enabled"} or temperature ≠ 1 |
| Top K `top_k` | integer (0…+∞) | 0 | Limits token sampling to the top K most likely next tokens. | Not when thinking.type ∈ {"adaptive", "enabled"} |
| Reasoning · 2 params | | | | |
| Thinking mode `thinking.type` | enum (disabled, enabled) | "disabled" | Controls the Anthropic thinking mode values supported by this model. | — |
| Budget tokens `thinking.budget_tokens` | integer (1024…+∞) | 4096 | Maximum token budget Anthropic may use for extended thinking before producing the final answer. | Only when thinking.type = "enabled" |
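The Condition column can be turned into a pre-flight check. The sketch below encodes the conditions listed in this table for a flat parameter dictionary keyed by the catalog's names. The function name is hypothetical, and it assumes that "available" means the key may be present in the request; the catalog does not say how a provider reacts when a condition is violated.

```python
def check_conditions(params):
    """Return human-readable violations of the conditions listed in the table."""
    problems = []
    thinking = params.get("thinking.type", "disabled")
    if thinking in ("adaptive", "enabled"):
        # Sampling parameters are listed as unavailable while thinking is on.
        for name in ("temperature", "top_p", "top_k"):
            if name in params:
                problems.append(f"{name} is not available when thinking.type is {thinking!r}")
    if "top_p" in params and params.get("temperature", 1) != 1:
        problems.append("top_p is not available when temperature != 1")
    if "thinking.budget_tokens" in params:
        if thinking != "enabled":
            problems.append('thinking.budget_tokens applies only when thinking.type = "enabled"')
        elif params["thinking.budget_tokens"] < 1024:
            problems.append("thinking.budget_tokens must be >= 1024")
    return problems

print(check_conditions({"thinking.type": "enabled", "thinking.budget_tokens": 2048}))  # []
print(len(check_conditions({"temperature": 0.5, "top_p": 0.9})))  # 1
```

Returning a list rather than raising on the first problem lets a caller report every conflict in a single pass.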
Anthropic Claude 3.7 Sonnet Latest 6 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
| Length · 1 param | | | | |
| Max tokens `max_tokens` | integer (1…+∞) | 4096 | Maximum number of output tokens the model may generate. | — |
| Sampling · 3 params | | | | |
| Temperature `temperature` | number (0…1 step 0.1) | 1 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. | Not when thinking.type ∈ {"adaptive", "enabled"} |
| Top P `top_p` | number (0…1 step 0.01) | 1 | Controls nucleus sampling by limiting generation to tokens whose cumulative probability reaches this value. | Not when thinking.type ∈ {"adaptive", "enabled"} or temperature ≠ 1 |
| Top K `top_k` | integer (0…+∞) | 0 | Limits token sampling to the top K most likely next tokens. | Not when thinking.type ∈ {"adaptive", "enabled"} |
| Reasoning · 2 params | | | | |
| Thinking mode `thinking.type` | enum (disabled, enabled) | "disabled" | Controls the Anthropic thinking mode values supported by this model. | — |
| Budget tokens `thinking.budget_tokens` | integer (1024…+∞) | 4096 | Maximum token budget Anthropic may use for extended thinking before producing the final answer. | Only when thinking.type = "enabled" |
Anthropic Claude 3 Opus 20240229 4 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
| Length · 1 param | | | | |
| Max tokens `max_tokens` | integer (1…+∞) | 4096 | Maximum number of output tokens the model may generate. | — |
| Sampling · 3 params | | | | |
| Temperature `temperature` | number (0…1 step 0.1) | 1 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. | Not when thinking.type ∈ {"adaptive", "enabled"} |
| Top P `top_p` | number (0…1 step 0.01) | 1 | Controls nucleus sampling by limiting generation to tokens whose cumulative probability reaches this value. | Not when thinking.type ∈ {"adaptive", "enabled"} or temperature ≠ 1 |
| Top K `top_k` | integer (0…+∞) | 0 | Limits token sampling to the top K most likely next tokens. | Not when thinking.type ∈ {"adaptive", "enabled"} |
Anthropic Claude 3 Opus Latest 4 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
| Length · 1 param | | | | |
| Max tokens `max_tokens` | integer (1…+∞) | 4096 | Maximum number of output tokens the model may generate. | — |
| Sampling · 3 params | | | | |
| Temperature `temperature` | number (0…1 step 0.1) | 1 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. | Not when thinking.type ∈ {"adaptive", "enabled"} |
| Top P `top_p` | number (0…1 step 0.01) | 1 | Controls nucleus sampling by limiting generation to tokens whose cumulative probability reaches this value. | Not when thinking.type ∈ {"adaptive", "enabled"} or temperature ≠ 1 |
| Top K `top_k` | integer (0…+∞) | 0 | Limits token sampling to the top K most likely next tokens. | Not when thinking.type ∈ {"adaptive", "enabled"} |
Anthropic Claude Haiku 4 6 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
| Length · 1 param | | | | |
| Max tokens `max_tokens` | integer (1…+∞) | 4096 | Maximum number of output tokens the model may generate. | — |
| Sampling · 3 params | | | | |
| Temperature `temperature` | number (0…1 step 0.1) | 1 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. | Not when thinking.type ∈ {"adaptive", "enabled"} |
| Top P `top_p` | number (0…1 step 0.01) | 1 | Controls nucleus sampling by limiting generation to tokens whose cumulative probability reaches this value. | Not when thinking.type ∈ {"adaptive", "enabled"} or temperature ≠ 1 |
| Top K `top_k` | integer (0…+∞) | 0 | Limits token sampling to the top K most likely next tokens. | Not when thinking.type ∈ {"adaptive", "enabled"} |
| Reasoning · 2 params | | | | |
| Thinking mode `thinking.type` | enum (disabled, enabled) | "disabled" | Controls the Anthropic thinking mode values supported by this model. | — |
| Budget tokens `thinking.budget_tokens` | integer (1024…+∞) | 4096 | Maximum token budget Anthropic may use for extended thinking before producing the final answer. | Only when thinking.type = "enabled" |
Anthropic Claude Haiku 4.5 6 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
| Length · 1 param | | | | |
| Max tokens `max_tokens` | integer (1…+∞) | 4096 | Maximum number of output tokens the model may generate. | — |
| Sampling · 3 params | | | | |
| Temperature `temperature` | number (0…1 step 0.1) | 1 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. | Not when thinking.type ∈ {"adaptive", "enabled"} |
| Top P `top_p` | number (0…1 step 0.01) | 1 | Controls nucleus sampling by limiting generation to tokens whose cumulative probability reaches this value. | Not when thinking.type ∈ {"adaptive", "enabled"} or temperature ≠ 1 |
| Top K `top_k` | integer (0…+∞) | 0 | Limits token sampling to the top K most likely next tokens. | Not when thinking.type ∈ {"adaptive", "enabled"} |
| Reasoning · 2 params | | | | |
| Thinking mode `thinking.type` | enum (disabled, enabled) | "disabled" | Controls the Anthropic thinking mode values supported by this model. | — |
| Budget tokens `thinking.budget_tokens` | integer (1024…+∞) | 4096 | Maximum token budget Anthropic may use for extended thinking before producing the final answer. | Only when thinking.type = "enabled" |
Anthropic Claude Haiku 4.5 20251001 6 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
| Length · 1 param | | | | |
| Max tokens `max_tokens` | integer (1…+∞) | 4096 | Maximum number of output tokens the model may generate. | — |
| Sampling · 3 params | | | | |
| Temperature `temperature` | number (0…1 step 0.1) | 1 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. | Not when thinking.type = "enabled" or top_p ≠ null |
| Top P `top_p` | number (0…1 step 0.01) | 1 | Controls nucleus sampling by limiting generation to tokens whose cumulative probability reaches this value. | Not when thinking.type = "enabled" or temperature ≠ null |
| Top K `top_k` | integer (0…+∞) | 0 | Limits token sampling to the top K most likely next tokens. | Not when thinking.type = "enabled" |
| Reasoning · 2 params | | | | |
| Thinking mode `thinking.type` | enum (disabled \| enabled) | "disabled" | Controls the Anthropic thinking mode values supported by this model. | — |
| Budget tokens `thinking.budget_tokens` | integer (1024…+∞) | 4096 | Maximum token budget Anthropic may use for extended thinking before producing the final answer. | Only when thinking.type = "enabled" |
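The Condition column of the table above can be read as a set of request-time checks. The following is a minimal sketch, assuming the request is a plain Python dict in which the dotted names (`thinking.type`, `thinking.budget_tokens`) map to a nested `thinking` object. `check_conditions` is a hypothetical helper written for illustration; it is not part of any SDK, and it encodes only the rules listed for this one model entry.

```python
from typing import Any


def check_conditions(params: dict[str, Any]) -> list[str]:
    """Return the catalog conditions that a request dict would violate.

    Hypothetical helper: the rules mirror the Condition column of the
    Claude Haiku 4.5 20251001 entry and nothing else.
    """
    thinking = params.get("thinking") or {}
    thinking_on = thinking.get("type", "disabled") == "enabled"
    problems: list[str] = []

    # Sampling parameters are listed as unavailable while thinking is enabled.
    for name in ("temperature", "top_p", "top_k"):
        if thinking_on and params.get(name) is not None:
            problems.append(f"{name} is not available when thinking.type = 'enabled'")

    # temperature and top_p each carry a "not when the other is set" condition.
    if params.get("temperature") is not None and params.get("top_p") is not None:
        problems.append("temperature and top_p are mutually exclusive")

    # The budget is listed as applying only when thinking is enabled.
    if not thinking_on and "budget_tokens" in thinking:
        problems.append("thinking.budget_tokens applies only when thinking.type = 'enabled'")

    return problems


if __name__ == "__main__":
    request = {"max_tokens": 1024, "temperature": 0.7, "top_p": 0.9}
    for problem in check_conditions(request):
        print(problem)
```

Returning a list of messages instead of raising on the first violation lets a caller report every conflicting knob at once.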
Anthropic Claude Haiku 4.5 20251001 Subscription 6 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
| Length · 1 param | | | | |
| Max tokens `max_tokens` | integer (1…+∞) | 4096 | Maximum number of output tokens the model may generate. | — |
| Sampling · 3 params | | | | |
| Temperature `temperature` | number (0…1 step 0.1) | 1 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. | Not when thinking.type = "enabled" or top_p ≠ null |
| Top P `top_p` | number (0…1 step 0.01) | 1 | Controls nucleus sampling by limiting generation to tokens whose cumulative probability reaches this value. | Not when thinking.type = "enabled" or temperature ≠ null |
| Top K `top_k` | integer (0…+∞) | 0 | Limits token sampling to the top K most likely next tokens. | Not when thinking.type = "enabled" |
| Reasoning · 2 params | | | | |
| Thinking mode `thinking.type` | enum (disabled \| enabled) | "disabled" | Controls the Anthropic thinking mode values supported by this model. | — |
| Budget tokens `thinking.budget_tokens` | integer (1024…+∞) | 4096 | Maximum token budget Anthropic may use for extended thinking before producing the final answer. | Only when thinking.type = "enabled" |
Anthropic Claude Haiku 4.5 Subscription 6 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
| Length · 1 param | | | | |
| Max tokens `max_tokens` | integer (1…+∞) | 4096 | Maximum number of output tokens the model may generate. | — |
| Sampling · 3 params | | | | |
| Temperature `temperature` | number (0…1 step 0.1) | 1 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. | Not when thinking.type ∈ {"adaptive", "enabled"} |
| Top P `top_p` | number (0…1 step 0.01) | 1 | Controls nucleus sampling by limiting generation to tokens whose cumulative probability reaches this value. | Not when thinking.type ∈ {"adaptive", "enabled"} or temperature ≠ 1 |
| Top K `top_k` | integer (0…+∞) | 0 | Limits token sampling to the top K most likely next tokens. | Not when thinking.type ∈ {"adaptive", "enabled"} |
| Reasoning · 2 params | | | | |
| Thinking mode `thinking.type` | enum (disabled \| enabled) | "disabled" | Controls the Anthropic thinking mode values supported by this model. | — |
| Budget tokens `thinking.budget_tokens` | integer (1024…+∞) | 4096 | Maximum token budget Anthropic may use for extended thinking before producing the final answer. | Only when thinking.type = "enabled" |
Anthropic Claude Haiku 4 Subscription 6 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
| Length · 1 param | | | | |
| Max tokens `max_tokens` | integer (1…+∞) | 4096 | Maximum number of output tokens the model may generate. | — |
| Sampling · 3 params | | | | |
| Temperature `temperature` | number (0…1 step 0.1) | 1 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. | Not when thinking.type ∈ {"adaptive", "enabled"} |
| Top P `top_p` | number (0…1 step 0.01) | 1 | Controls nucleus sampling by limiting generation to tokens whose cumulative probability reaches this value. | Not when thinking.type ∈ {"adaptive", "enabled"} or temperature ≠ 1 |
| Top K `top_k` | integer (0…+∞) | 0 | Limits token sampling to the top K most likely next tokens. | Not when thinking.type ∈ {"adaptive", "enabled"} |
| Reasoning · 2 params | | | | |
| Thinking mode `thinking.type` | enum (disabled \| enabled) | "disabled" | Controls the Anthropic thinking mode values supported by this model. | — |
| Budget tokens `thinking.budget_tokens` | integer (1024…+∞) | 4096 | Maximum token budget Anthropic may use for extended thinking before producing the final answer. | Only when thinking.type = "enabled" |
Anthropic Claude Opus 4.1 20250805 7 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
| Length · 1 param | | | | |
| Max tokens `max_tokens` | integer (1…+∞) | 4096 | Maximum number of output tokens the model may generate. | — |
| Sampling · 3 params | | | | |
| Temperature `temperature` | number (0…1 step 0.1) | 1 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. | Not when thinking.type = "enabled" or top_p ≠ null |
| Top P `top_p` | number (0…1 step 0.01) | 1 | Controls nucleus sampling by limiting generation to tokens whose cumulative probability reaches this value. | Not when thinking.type = "enabled" or temperature ≠ null |
| Top K `top_k` | integer (0…+∞) | 0 | Limits token sampling to the top K most likely next tokens. | Not when thinking.type = "enabled" |
| Reasoning · 3 params | | | | |
| Thinking mode `thinking.type` | enum (disabled \| enabled) | "disabled" | Controls the Anthropic thinking mode values supported by this model. | — |
| Budget tokens `thinking.budget_tokens` | integer (1024…+∞) | 4096 | Maximum token budget Anthropic may use for extended thinking before producing the final answer. | Only when thinking.type = "enabled" |
| Thinking display `thinking.display` | enum (summarized \| omitted) | "summarized" | Controls whether Anthropic returns summarized or omitted thinking content. | Only when thinking.type = "enabled" |
Anthropic Claude Opus 4.1 20250805 Subscription 7 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
| Length · 1 param | | | | |
| Max tokens `max_tokens` | integer (1…+∞) | 4096 | Maximum number of output tokens the model may generate. | — |
| Sampling · 3 params | | | | |
| Temperature `temperature` | number (0…1 step 0.1) | 1 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. | Not when thinking.type = "enabled" or top_p ≠ null |
| Top P `top_p` | number (0…1 step 0.01) | 1 | Controls nucleus sampling by limiting generation to tokens whose cumulative probability reaches this value. | Not when thinking.type = "enabled" or temperature ≠ null |
| Top K `top_k` | integer (0…+∞) | 0 | Limits token sampling to the top K most likely next tokens. | Not when thinking.type = "enabled" |
| Reasoning · 3 params | | | | |
| Thinking mode `thinking.type` | enum (disabled \| enabled) | "disabled" | Controls the Anthropic thinking mode values supported by this model. | — |
| Budget tokens `thinking.budget_tokens` | integer (1024…+∞) | 4096 | Maximum token budget Anthropic may use for extended thinking before producing the final answer. | Only when thinking.type = "enabled" |
| Thinking display `thinking.display` | enum (summarized \| omitted) | "summarized" | Controls whether Anthropic returns summarized or omitted thinking content. | Only when thinking.type = "enabled" |
Anthropic Claude Opus 4 20250514 7 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
| Length · 1 param | | | | |
| Max tokens `max_tokens` | integer (1…+∞) | 4096 | Maximum number of output tokens the model may generate. | — |
| Sampling · 3 params | | | | |
| Temperature `temperature` | number (0…1 step 0.1) | 1 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. | Not when thinking.type = "enabled" |
| Top P `top_p` | number (0…1 step 0.01) | 1 | Controls nucleus sampling by limiting generation to tokens whose cumulative probability reaches this value. | Not when thinking.type = "enabled" |
| Top K `top_k` | integer (0…+∞) | 0 | Limits token sampling to the top K most likely next tokens. | Not when thinking.type = "enabled" |
| Reasoning · 3 params | | | | |
| Thinking mode `thinking.type` | enum (disabled \| enabled) | "disabled" | Controls the Anthropic thinking mode values supported by this model. | — |
| Budget tokens `thinking.budget_tokens` | integer (1024…+∞) | 4096 | Maximum token budget Anthropic may use for extended thinking before producing the final answer. | Only when thinking.type = "enabled" |
| Thinking display `thinking.display` | enum (summarized \| omitted) | "summarized" | Controls whether Anthropic returns summarized or omitted thinking content. | Only when thinking.type = "enabled" |
Anthropic Claude Opus 4 20250514 Subscription 7 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
| Length · 1 param | | | | |
| Max tokens `max_tokens` | integer (1…+∞) | 4096 | Maximum number of output tokens the model may generate. | — |
| Sampling · 3 params | | | | |
| Temperature `temperature` | number (0…1 step 0.1) | 1 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. | Not when thinking.type = "enabled" |
| Top P `top_p` | number (0…1 step 0.01) | 1 | Controls nucleus sampling by limiting generation to tokens whose cumulative probability reaches this value. | Not when thinking.type = "enabled" |
| Top K `top_k` | integer (0…+∞) | 0 | Limits token sampling to the top K most likely next tokens. | Not when thinking.type = "enabled" |
| Reasoning · 3 params | | | | |
| Thinking mode `thinking.type` | enum (disabled \| enabled) | "disabled" | Controls the Anthropic thinking mode values supported by this model. | — |
| Budget tokens `thinking.budget_tokens` | integer (1024…+∞) | 4096 | Maximum token budget Anthropic may use for extended thinking before producing the final answer. | Only when thinking.type = "enabled" |
| Thinking display `thinking.display` | enum (summarized \| omitted) | "summarized" | Controls whether Anthropic returns summarized or omitted thinking content. | Only when thinking.type = "enabled" |
Anthropic Claude Opus 4.5 20251101 8 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
| Length · 1 param | | | | |
| Max tokens `max_tokens` | integer (1…+∞) | 4096 | Maximum number of output tokens the model may generate. | — |
| Sampling · 3 params | | | | |
| Temperature `temperature` | number (0…1 step 0.1) | 1 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. | Not when thinking.type = "enabled" or top_p ≠ null |
| Top P `top_p` | number (0…1 step 0.01) | 1 | Controls nucleus sampling by limiting generation to tokens whose cumulative probability reaches this value. | Not when thinking.type = "enabled" or temperature ≠ null |
| Top K `top_k` | integer (0…+∞) | 0 | Limits token sampling to the top K most likely next tokens. | Not when thinking.type = "enabled" |
| Reasoning · 4 params | | | | |
| Thinking mode `thinking.type` | enum (disabled \| enabled) | "disabled" | Controls the Anthropic thinking mode values supported by this model. | — |
| Budget tokens `thinking.budget_tokens` | integer (1024…+∞) | 4096 | Maximum token budget Anthropic may use for extended thinking before producing the final answer. | Only when thinking.type = "enabled" |
| Thinking display `thinking.display` | enum (summarized \| omitted) | "summarized" | Controls whether Anthropic returns summarized or omitted thinking content. | Only when thinking.type = "enabled" |
| Effort `output_config.effort` | enum (low \| medium \| high) | "high" | Controls Anthropic response thoroughness and token spend. | — |
Anthropic Claude Opus 4.5 20251101 Subscription 8 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
| Length · 1 param | | | | |
| Max tokens `max_tokens` | integer (1…+∞) | 4096 | Maximum number of output tokens the model may generate. | — |
| Sampling · 3 params | | | | |
| Temperature `temperature` | number (0…1 step 0.1) | 1 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. | Not when thinking.type = "enabled" or top_p ≠ null |
| Top P `top_p` | number (0…1 step 0.01) | 1 | Controls nucleus sampling by limiting generation to tokens whose cumulative probability reaches this value. | Not when thinking.type = "enabled" or temperature ≠ null |
| Top K `top_k` | integer (0…+∞) | 0 | Limits token sampling to the top K most likely next tokens. | Not when thinking.type = "enabled" |
| Reasoning · 4 params | | | | |
| Thinking mode `thinking.type` | enum (disabled \| enabled) | "disabled" | Controls the Anthropic thinking mode values supported by this model. | — |
| Budget tokens `thinking.budget_tokens` | integer (1024…+∞) | 4096 | Maximum token budget Anthropic may use for extended thinking before producing the final answer. | Only when thinking.type = "enabled" |
| Thinking display `thinking.display` | enum (summarized \| omitted) | "summarized" | Controls whether Anthropic returns summarized or omitted thinking content. | Only when thinking.type = "enabled" |
| Effort `output_config.effort` | enum (low \| medium \| high) | "high" | Controls Anthropic response thoroughness and token spend. | — |
Anthropic Claude Opus 4.6 8 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
| Length · 1 param | | | | |
| Max tokens `max_tokens` | integer (1…+∞) | 4096 | Maximum number of output tokens the model may generate. | — |
| Sampling · 3 params | | | | |
| Temperature `temperature` | number (0…1 step 0.1) | 1 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. | Not when thinking.type ∈ {"enabled", "adaptive"} or top_p ≠ null |
| Top P `top_p` | number (0…1 step 0.01) | 1 | Controls nucleus sampling by limiting generation to tokens whose cumulative probability reaches this value. | Not when thinking.type ∈ {"enabled", "adaptive"} or temperature ≠ null |
| Top K `top_k` | integer (0…+∞) | 0 | Limits token sampling to the top K most likely next tokens. | Not when thinking.type ∈ {"enabled", "adaptive"} |
| Reasoning · 4 params | | | | |
| Thinking mode `thinking.type` | enum (disabled \| adaptive \| enabled) | "disabled" | Controls the Anthropic thinking mode values supported by this model. | — |
| Budget tokens `thinking.budget_tokens` | integer (1024…+∞) | 4096 | Maximum token budget Anthropic may use for extended thinking before producing the final answer. | Only when thinking.type = "enabled" |
| Thinking display `thinking.display` | enum (summarized \| omitted) | "summarized" | Controls whether Anthropic returns summarized or omitted thinking content. | Only when thinking.type ∈ {"adaptive", "enabled"} |
| Effort `output_config.effort` | enum (low \| medium \| high \| max) | "high" | Controls Anthropic response thoroughness and token spend. | — |
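The reasoning parameters in the table above depend on each other: `thinking.budget_tokens` is listed only for `thinking.type = "enabled"`, and `thinking.display` only for the adaptive and enabled modes. The sketch below assembles a request body that respects those conditions. It assumes the dotted parameter names correspond to nested JSON objects; `build_payload` is a hypothetical helper for illustration, and the enum values are copied from this model entry only.

```python
from typing import Any, Optional

# Enum values as listed for the Claude Opus 4.6 entry.
THINKING_TYPES = {"disabled", "adaptive", "enabled"}
DISPLAY_VALUES = {"summarized", "omitted"}
EFFORT_VALUES = {"low", "medium", "high", "max"}


def build_payload(
    max_tokens: int = 4096,
    thinking_type: str = "disabled",
    budget_tokens: Optional[int] = None,
    display: Optional[str] = None,
    effort: str = "high",
) -> dict[str, Any]:
    """Assemble a request body, rejecting combinations the table rules out."""
    if max_tokens < 1:
        raise ValueError("max_tokens must be at least 1")
    if thinking_type not in THINKING_TYPES:
        raise ValueError(f"unknown thinking.type: {thinking_type!r}")
    if effort not in EFFORT_VALUES:
        raise ValueError(f"unknown output_config.effort: {effort!r}")

    thinking: dict[str, Any] = {"type": thinking_type}

    if budget_tokens is not None:
        # Listed as "Only when thinking.type = enabled", minimum 1024.
        if thinking_type != "enabled":
            raise ValueError("thinking.budget_tokens requires thinking.type = 'enabled'")
        if budget_tokens < 1024:
            raise ValueError("thinking.budget_tokens must be at least 1024")
        thinking["budget_tokens"] = budget_tokens

    if display is not None:
        # Listed as "Only when thinking.type is adaptive or enabled".
        if thinking_type == "disabled":
            raise ValueError("thinking.display requires adaptive or enabled thinking")
        if display not in DISPLAY_VALUES:
            raise ValueError(f"unknown thinking.display: {display!r}")
        thinking["display"] = display

    return {
        "max_tokens": max_tokens,
        "thinking": thinking,
        "output_config": {"effort": effort},
    }


if __name__ == "__main__":
    print(build_payload(thinking_type="adaptive", display="omitted", effort="max"))
```

Sampling parameters are left out on purpose, since the same table lists them as unavailable whenever thinking is adaptive or enabled.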
Anthropic Claude Opus 4.6 Subscription 8 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
| Length · 1 param | | | | |
| Max tokens `max_tokens` | integer (1…+∞) | 4096 | Maximum number of output tokens the model may generate. | — |
| Sampling · 3 params | | | | |
| Temperature `temperature` | number (0…1 step 0.1) | 1 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. | Not when thinking.type ∈ {"enabled", "adaptive"} or top_p ≠ null |
| Top P `top_p` | number (0…1 step 0.01) | 1 | Controls nucleus sampling by limiting generation to tokens whose cumulative probability reaches this value. | Not when thinking.type ∈ {"enabled", "adaptive"} or temperature ≠ null |
| Top K `top_k` | integer (0…+∞) | 0 | Limits token sampling to the top K most likely next tokens. | Not when thinking.type ∈ {"enabled", "adaptive"} |
| Reasoning · 4 params | | | | |
| Thinking mode `thinking.type` | enum (disabled \| adaptive \| enabled) | "disabled" | Controls the Anthropic thinking mode values supported by this model. | — |
| Budget tokens `thinking.budget_tokens` | integer (1024…+∞) | 4096 | Maximum token budget Anthropic may use for extended thinking before producing the final answer. | Only when thinking.type = "enabled" |
| Thinking display `thinking.display` | enum (summarized \| omitted) | "summarized" | Controls whether Anthropic returns summarized or omitted thinking content. | Only when thinking.type ∈ {"adaptive", "enabled"} |
| Effort `output_config.effort` | enum (low \| medium \| high \| max) | "high" | Controls Anthropic response thoroughness and token spend. | — |
Anthropic Claude Opus 4.7 4 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
| Length · 1 param | | | | |
| Max tokens `max_tokens` | integer (1…+∞) | 4096 | Maximum number of output tokens the model may generate. | — |
| Reasoning · 3 params | | | | |
| Thinking mode `thinking.type` | enum (disabled \| adaptive) | "disabled" | Controls the Anthropic thinking mode values supported by this model. | — |
| Thinking display `thinking.display` | enum (summarized \| omitted) | "omitted" | Controls whether Anthropic returns summarized or omitted thinking content. | Only when thinking.type = "adaptive" |
| Effort `output_config.effort` | enum (low \| medium \| high \| xhigh \| max) | "high" | Controls Anthropic response thoroughness and token spend. | — |
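This entry lists no sampling parameters, so a request written for an earlier entry may carry knobs that do not appear here. The sketch below filters a request down to the top-level parameters an entry lists. The `SUPPORTED` mapping is a hypothetical, hand-copied excerpt of two catalog entries, and `strip_unsupported` is an illustrative helper, not a catalog or SDK API.

```python
from typing import Any

# Hypothetical excerpt: top-level parameter names listed per catalog entry.
SUPPORTED: dict[str, set[str]] = {
    "Anthropic Claude Opus 4.6": {
        "max_tokens", "temperature", "top_p", "top_k", "thinking", "output_config",
    },
    "Anthropic Claude Opus 4.7": {"max_tokens", "thinking", "output_config"},
}


def strip_unsupported(model: str, params: dict[str, Any]) -> tuple[dict[str, Any], list[str]]:
    """Split params into those the entry lists and the names that were dropped."""
    allowed = SUPPORTED[model]
    kept = {name: value for name, value in params.items() if name in allowed}
    dropped = sorted(name for name in params if name not in allowed)
    return kept, dropped


if __name__ == "__main__":
    old_request = {"max_tokens": 2048, "temperature": 0.2, "top_k": 40}
    kept, dropped = strip_unsupported("Anthropic Claude Opus 4.7", old_request)
    print(kept)
    print(dropped)
```

Returning the dropped names alongside the filtered request makes the change visible to the caller, which is safer than discarding settings silently.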
Anthropic Claude Opus 4.7 Subscription 4 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
| Length · 1 param | | | | |
| Max tokens `max_tokens` | integer (1…+∞) | 4096 | Maximum number of output tokens the model may generate. | — |
| Reasoning · 3 params | | | | |
| Thinking mode `thinking.type` | enum (disabled \| adaptive) | "disabled" | Controls the Anthropic thinking mode values supported by this model. | — |
| Thinking display `thinking.display` | enum (summarized \| omitted) | "omitted" | Controls whether Anthropic returns summarized or omitted thinking content. | Only when thinking.type = "adaptive" |
| Effort `output_config.effort` | enum (low \| medium \| high \| xhigh \| max) | "high" | Controls Anthropic response thoroughness and token spend. | — |
Anthropic Claude Opus 4.8 4 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
| Length · 1 param | | | | |
| Max tokens `max_tokens` | integer (1…+∞) | 4096 | Maximum number of output tokens the model may generate. | — |
| Reasoning · 3 params | | | | |
| Thinking mode `thinking.type` | enum (disabled \| adaptive) | "disabled" | Controls the Anthropic thinking mode values supported by this model. | — |
| Thinking display `thinking.display` | enum (summarized \| omitted) | "omitted" | Controls whether Anthropic returns summarized or omitted thinking content. | Only when thinking.type = "adaptive" |
| Effort `output_config.effort` | enum (low \| medium \| high \| xhigh \| max) | "high" | Controls Anthropic response thoroughness and token spend. | — |
Anthropic Claude Opus 4.8 Subscription 4 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
| Length · 1 param | | | | |
| Max tokens `max_tokens` | integer (1…+∞) | 4096 | Maximum number of output tokens the model may generate. | — |
| Reasoning · 3 params | | | | |
| Thinking mode `thinking.type` | enum (disabled \| adaptive) | "disabled" | Controls the Anthropic thinking mode values supported by this model. | — |
| Thinking display `thinking.display` | enum (summarized \| omitted) | "omitted" | Controls whether Anthropic returns summarized or omitted thinking content. | Only when thinking.type = "adaptive" |
| Effort `output_config.effort` | enum (low \| medium \| high \| xhigh \| max) | "high" | Controls Anthropic response thoroughness and token spend. | — |
Anthropic Claude Opus 4 Subscription 6 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
| Length · 1 param | | | | |
| Max tokens `max_tokens` | integer (1…+∞) | 4096 | Maximum number of output tokens the model may generate. | — |
| Sampling · 3 params | | | | |
| Temperature `temperature` | number (0…1 step 0.1) | 1 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. | Not when thinking.type ∈ {"adaptive", "enabled"} |
| Top P `top_p` | number (0…1 step 0.01) | 1 | Controls nucleus sampling by limiting generation to tokens whose cumulative probability reaches this value. | Not when thinking.type ∈ {"adaptive", "enabled"} or temperature ≠ 1 |
| Top K `top_k` | integer (0…+∞) | 0 | Limits token sampling to the top K most likely next tokens. | Not when thinking.type ∈ {"adaptive", "enabled"} |
| Reasoning · 2 params | | | | |
| Thinking mode `thinking.type` | enum (disabled \| adaptive \| enabled) | "disabled" | Controls the Anthropic thinking mode values supported by this model. | — |
| Budget tokens `thinking.budget_tokens` | integer (1024…+∞) | 4096 | Maximum token budget Anthropic may use for extended thinking before producing the final answer. | Only when thinking.type = "enabled" |
Anthropic Claude Sonnet 4 20250514 7 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
| Length · 1 param | | | | |
| Max tokens `max_tokens` | integer (1…+∞) | 4096 | Maximum number of output tokens the model may generate. | — |
| Sampling · 3 params | | | | |
| Temperature `temperature` | number (0…1 step 0.1) | 1 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. | Not when thinking.type = "enabled" |
| Top P `top_p` | number (0…1 step 0.01) | 1 | Controls nucleus sampling by limiting generation to tokens whose cumulative probability reaches this value. | Not when thinking.type = "enabled" |
| Top K `top_k` | integer (0…+∞) | 0 | Limits token sampling to the top K most likely next tokens. | Not when thinking.type = "enabled" |
| Reasoning · 3 params | | | | |
| Thinking mode `thinking.type` | enum (disabled \| enabled) | "disabled" | Controls the Anthropic thinking mode values supported by this model. | — |
| Budget tokens `thinking.budget_tokens` | integer (1024…+∞) | 4096 | Maximum token budget Anthropic may use for extended thinking before producing the final answer. | Only when thinking.type = "enabled" |
| Thinking display `thinking.display` | enum (summarized \| omitted) | "summarized" | Controls whether Anthropic returns summarized or omitted thinking content. | Only when thinking.type = "enabled" |
Anthropic Claude Sonnet 4 20250514 Subscription 7 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
| Length · 1 param | | | | |
| Max tokens `max_tokens` | integer (1…+∞) | 4096 | Maximum number of output tokens the model may generate. | — |
| Sampling · 3 params | | | | |
| Temperature `temperature` | number (0…1 step 0.1) | 1 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. | Not when thinking.type = "enabled" |
| Top P `top_p` | number (0…1 step 0.01) | 1 | Controls nucleus sampling by limiting generation to tokens whose cumulative probability reaches this value. | Not when thinking.type = "enabled" |
| Top K `top_k` | integer (0…+∞) | 0 | Limits token sampling to the top K most likely next tokens. | Not when thinking.type = "enabled" |
| Reasoning · 3 params | | | | |
| Thinking mode `thinking.type` | enum (disabled \| enabled) | "disabled" | Controls the Anthropic thinking mode values supported by this model. | — |
| Budget tokens `thinking.budget_tokens` | integer (1024…+∞) | 4096 | Maximum token budget Anthropic may use for extended thinking before producing the final answer. | Only when thinking.type = "enabled" |
| Thinking display `thinking.display` | enum (summarized \| omitted) | "summarized" | Controls whether Anthropic returns summarized or omitted thinking content. | Only when thinking.type = "enabled" |
Anthropic Claude Sonnet 4.5 6 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
| Length · 1 param | | | | |
| Max tokens `max_tokens` | integer (1…+∞) | 4096 | Maximum number of output tokens the model may generate. | — |
| Sampling · 3 params | | | | |
| Temperature `temperature` | number (0…1 step 0.1) | 1 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. | Not when thinking.type ∈ {"adaptive", "enabled"} |
| Top P `top_p` | number (0…1 step 0.01) | 1 | Controls nucleus sampling by limiting generation to tokens whose cumulative probability reaches this value. | Not when thinking.type ∈ {"adaptive", "enabled"} or temperature ≠ 1 |
| Top K `top_k` | integer (0…+∞) | 0 | Limits token sampling to the top K most likely next tokens. | Not when thinking.type ∈ {"adaptive", "enabled"} |
| Reasoning · 2 params | | | | |
| Thinking mode `thinking.type` | enum (disabled \| adaptive \| enabled) | "disabled" | Controls the Anthropic thinking mode values supported by this model. | — |
| Budget tokens `thinking.budget_tokens` | integer (1024…+∞) | 4096 | Maximum token budget Anthropic may use for extended thinking before producing the final answer. | Only when thinking.type = "enabled" |
Anthropic Claude Sonnet 4.5 20250929 6 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
| Length · 1 param | | | | |
| Max tokens `max_tokens` | integer (1…+∞) | 4096 | Maximum number of output tokens the model may generate. | — |
| Sampling · 3 params | | | | |
| Temperature `temperature` | number (0…1 step 0.1) | 1 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. | Not when thinking.type = "enabled" or top_p ≠ null |
| Top P `top_p` | number (0…1 step 0.01) | 1 | Controls nucleus sampling by limiting generation to tokens whose cumulative probability reaches this value. | Not when thinking.type = "enabled" or temperature ≠ null |
| Top K `top_k` | integer (0…+∞) | 0 | Limits token sampling to the top K most likely next tokens. | Not when thinking.type = "enabled" |
| Reasoning · 2 params | | | | |
| Thinking mode `thinking.type` | enum (disabled \| enabled) | "disabled" | Controls the Anthropic thinking mode values supported by this model. | — |
| Budget tokens `thinking.budget_tokens` | integer (1024…+∞) | 4096 | Maximum token budget Anthropic may use for extended thinking before producing the final answer. | Only when thinking.type = "enabled" |
Anthropic Claude Sonnet 4.5 20250929 Subscription 6 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
| Length · 1 param | | | | |
| Max tokens `max_tokens` | integer (1…+∞) | 4096 | Maximum number of output tokens the model may generate. | — |
| Sampling · 3 params | | | | |
| Temperature `temperature` | number (0…1 step 0.1) | 1 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. | Not when thinking.type = "enabled" or top_p ≠ null |
| Top P `top_p` | number (0…1 step 0.01) | 1 | Controls nucleus sampling by limiting generation to tokens whose cumulative probability reaches this value. | Not when thinking.type = "enabled" or temperature ≠ null |
| Top K `top_k` | integer (0…+∞) | 0 | Limits token sampling to the top K most likely next tokens. | Not when thinking.type = "enabled" |
| Reasoning · 2 params | | | | |
| Thinking mode `thinking.type` | enum (disabled \| enabled) | "disabled" | Controls the Anthropic thinking mode values supported by this model. | — |
| Budget tokens `thinking.budget_tokens` | integer (1024…+∞) | 4096 | Maximum token budget Anthropic may use for extended thinking before producing the final answer. | Only when thinking.type = "enabled" |
Anthropic Claude Sonnet 4.5 Subscription 6 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
| Length · 1 param | | | | |
| Max tokens `max_tokens` | integer (1…+∞) | 4096 | Maximum number of output tokens the model may generate. | — |
| Sampling · 3 params | | | | |
| Temperature `temperature` | number (0…1 step 0.1) | 1 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. | Not when thinking.type ∈ {"adaptive", "enabled"} |
| Top P `top_p` | number (0…1 step 0.01) | 1 | Controls nucleus sampling by limiting generation to tokens whose cumulative probability reaches this value. | Not when thinking.type ∈ {"adaptive", "enabled"} or temperature ≠ 1 |
| Top K `top_k` | integer (0…+∞) | 0 | Limits token sampling to the top K most likely next tokens. | Not when thinking.type ∈ {"adaptive", "enabled"} |
| Reasoning · 2 params | | | | |
| Thinking mode `thinking.type` | enum (disabled \| adaptive \| enabled) | "disabled" | Controls the Anthropic thinking mode values supported by this model. | — |
| Budget tokens `thinking.budget_tokens` | integer (1024…+∞) | 4096 | Maximum token budget Anthropic may use for extended thinking before producing the final answer. | Only when thinking.type = "enabled" |
Anthropic Claude Sonnet 4.6 8 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
| Length · 1 param | | | | |
| Max tokens `max_tokens` | integer (1…+∞) | 4096 | Maximum number of output tokens the model may generate. | — |
| Sampling · 3 params | | | | |
| Temperature `temperature` | number (0…1 step 0.1) | 1 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. | Not when thinking.type ∈ {"enabled", "adaptive"} or top_p ≠ null |
| Top P `top_p` | number (0…1 step 0.01) | 1 | Controls nucleus sampling by limiting generation to tokens whose cumulative probability reaches this value. | Not when thinking.type ∈ {"enabled", "adaptive"} or temperature ≠ null |
| Top K `top_k` | integer (0…+∞) | 0 | Limits token sampling to the top K most likely next tokens. | Not when thinking.type ∈ {"enabled", "adaptive"} |
| Reasoning · 4 params | | | | |
| Thinking mode `thinking.type` | enum (disabled \| adaptive \| enabled) | "disabled" | Controls the Anthropic thinking mode values supported by this model. | — |
| Budget tokens `thinking.budget_tokens` | integer (1024…+∞) | 4096 | Maximum token budget Anthropic may use for extended thinking before producing the final answer. | Only when thinking.type = "enabled" |
| Thinking display `thinking.display` | enum (summarized \| omitted) | "summarized" | Controls whether Anthropic returns summarized or omitted thinking content. | Only when thinking.type ∈ {"adaptive", "enabled"} |
| Effort `output_config.effort` | enum (low \| medium \| high \| max) | "high" | Controls Anthropic response thoroughness and token spend. | — |
Anthropic Claude Sonnet 4.6 Subscription 8 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
| Length · 1 param | ||||
|
Max tokens
max_tokens
|
integer (1…+∞) | 4096 | Maximum number of output tokens the model may generate. | — |
| Sampling · 3 params | ||||
|
Temperature
temperature
|
number (0…1 step 0.1) | 1 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. |
Not when thinking.type ∈ {"enabled", "adaptive"} or top_p ≠ null
|
|
Top P
top_p
|
number (0…1 step 0.01) | 1 | Controls nucleus sampling by limiting generation to tokens whose cumulative probability reaches this value. |
Not when thinking.type ∈ {"enabled", "adaptive"} or temperature ≠ null
|
|
Top K
top_k
|
integer (0…+∞) | 0 | Limits token sampling to the top K most likely next tokens. |
Not when thinking.type ∈ {"enabled", "adaptive"}
|
| Reasoning · 4 params | ||||
|
Thinking mode
thinking.type
|
enum (disabled | adaptive | enabled) | "disabled" | Selects which Anthropic thinking mode the model uses. | — |
|
Budget tokens
thinking.budget_tokens
|
integer (1024…+∞) | 4096 | Maximum number of tokens the model may spend on extended thinking before producing the final answer. |
Only when thinking.type = "enabled"
|
|
Thinking display
thinking.display
|
enum (summarized | omitted) | "summarized" | Controls whether thinking content is returned as a summary or omitted from the response. |
Only when thinking.type ∈ {"adaptive", "enabled"}
|
|
Effort
output_config.effort
|
enum (low | medium | high | max) | "high" | Controls how thorough the response is and how many tokens it may spend. | — |
Anthropic Claude Sonnet 4 Subscription 6 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
| Length · 1 param | ||||
|
Max tokens
max_tokens
|
integer (1…+∞) | 4096 | Maximum number of output tokens the model may generate. | — |
| Sampling · 3 params | ||||
|
Temperature
temperature
|
number (0…1 step 0.1) | 1 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. |
Not when thinking.type ∈ {"adaptive", "enabled"}
|
|
Top P
top_p
|
number (0…1 step 0.01) | 1 | Controls nucleus sampling by limiting generation to tokens whose cumulative probability reaches this value. |
Not when thinking.type ∈ {"adaptive", "enabled"} or temperature ≠ 1
|
|
Top K
top_k
|
integer (0…+∞) | 0 | Limits token sampling to the top K most likely next tokens. |
Not when thinking.type ∈ {"adaptive", "enabled"}
|
| Reasoning · 2 params | ||||
|
Thinking mode
thinking.type
|
enum (disabled | adaptive | enabled) | "disabled" | Selects which Anthropic thinking mode the model uses. | — |
|
Budget tokens
thinking.budget_tokens
|
integer (1024…+∞) | 4096 | Maximum number of tokens the model may spend on extended thinking before producing the final answer. |
Only when thinking.type = "enabled"
|
Z.ai 19
Z.ai GLM-4.5 6 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
| Length · 1 param | ||||
|
Max tokens
max_tokens
|
integer (1…+∞) | — | Maximum number of tokens to generate in the response. | — |
| Sampling · 3 params | ||||
|
Temperature
temperature
|
number (0…1 step 0.1) | 0.6 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. |
Not when do_sample = false
|
|
Top P
top_p
|
number (0.01…1 step 0.01) | 0.95 | Controls nucleus sampling by limiting generation to tokens within the selected cumulative probability. |
Not when do_sample = false
|
|
Do sample
do_sample
|
boolean | true | When false, the model uses greedy decoding and ignores temperature and top_p. | — |
| Reasoning · 1 param | ||||
|
Thinking mode
thinking.type
|
enum (enabled | disabled) | "enabled" | Toggles the model's extended reasoning before it produces the final answer. | — |
| Output · 1 param | ||||
|
Response format
response_format.type
|
enum (text | json_object) | "text" | Forces the response into plain text or a JSON object. | — |
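
The do_sample condition above means temperature and top_p have no effect under greedy decoding. As a minimal sketch, a request body for this table could be assembled as follows. `build_glm_params` is a hypothetical helper, not a Z.ai SDK function. The defaults are copied from the GLM-4.5 table, and the nested layout for thinking.type and response_format.type is an assumption based on the dotted names.

```python
from typing import Any, Optional


def build_glm_params(
    max_tokens: Optional[int] = None,
    temperature: float = 0.6,
    top_p: float = 0.95,
    do_sample: bool = True,
    thinking: str = "enabled",
    response_format: str = "text",
) -> dict[str, Any]:
    """Assemble a parameter dict that respects the ranges and conditions listed above."""
    if thinking not in ("enabled", "disabled"):
        raise ValueError("thinking must be 'enabled' or 'disabled'")
    if response_format not in ("text", "json_object"):
        raise ValueError("response_format must be 'text' or 'json_object'")

    params: dict[str, Any] = {
        "do_sample": do_sample,
        # Assumed nesting: thinking.type -> {"thinking": {"type": ...}}
        "thinking": {"type": thinking},
        "response_format": {"type": response_format},
    }

    # max_tokens has no listed default, so it is only sent when given.
    if max_tokens is not None:
        if max_tokens < 1:
            raise ValueError("max_tokens must be at least 1")
        params["max_tokens"] = max_tokens

    # Sampling parameters are ignored under greedy decoding, so omit them.
    if do_sample:
        if not 0 <= temperature <= 1:
            raise ValueError("temperature must be within 0…1")
        if not 0.01 <= top_p <= 1:
            raise ValueError("top_p must be within 0.01…1")
        params["temperature"] = temperature
        params["top_p"] = top_p
    return params


if __name__ == "__main__":
    print(build_glm_params(do_sample=False, temperature=0.2))
```

Omitting the ignored parameters, rather than sending them, keeps the request body consistent with what the model will actually use.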
Z.ai GLM-4.5-Air 6 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
| Length · 1 param | ||||
|
Max tokens
max_tokens
|
integer (1…+∞) | — | Maximum number of tokens to generate in the response. | — |
| Sampling · 3 params | ||||
|
Temperature
temperature
|
number (0…1 step 0.1) | 0.6 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. |
Not when do_sample = false
|
|
Top P
top_p
|
number (0.01…1 step 0.01) | 0.95 | Controls nucleus sampling by limiting generation to tokens within the selected cumulative probability. |
Not when do_sample = false
|
|
Do sample
do_sample
|
boolean | true | When false, the model uses greedy decoding and ignores temperature and top_p. | — |
| Reasoning · 1 param | ||||
|
Thinking mode
thinking.type
|
enum (enabled | disabled) | "enabled" | Toggles the model's extended reasoning before it produces the final answer. | — |
| Output · 1 param | ||||
|
Response format
response_format.type
|
enum (text | json_object) | "text" | Forces the response into plain text or a JSON object. | — |
Z.ai GLM-4.5-Air Subscription 6 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
| Length · 1 param | ||||
|
Max tokens
max_tokens
|
integer (1…+∞) | — | Maximum number of tokens to generate in the response. | — |
| Sampling · 3 params | ||||
|
Temperature
temperature
|
number (0…1 step 0.1) | 0.6 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. |
Not when do_sample = false
|
|
Top P
top_p
|
number (0.01…1 step 0.01) | 0.95 | Controls nucleus sampling by limiting generation to tokens within the selected cumulative probability. |
Not when do_sample = false
|
|
Do sample
do_sample
|
boolean | true | When false, the model uses greedy decoding and ignores temperature and top_p. | — |
| Reasoning · 1 param | ||||
|
Thinking mode
thinking.type
|
enum (enabled | disabled) | "enabled" | Toggles the model's extended reasoning before it produces the final answer. | — |
| Output · 1 param | ||||
|
Response format
response_format.type
|
enum (text | json_object) | "text" | Forces the response into plain text or a JSON object. | — |
Z.ai GLM-4.5-AirX 6 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
| Length · 1 param | ||||
|
Max tokens
max_tokens
|
integer (1…+∞) | — | Maximum number of tokens to generate in the response. | — |
| Sampling · 3 params | ||||
|
Temperature
temperature
|
number (0…1 step 0.1) | 0.6 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. |
Not when do_sample = false
|
|
Top P
top_p
|
number (0.01…1 step 0.01) | 0.95 | Controls nucleus sampling by limiting generation to tokens within the selected cumulative probability. |
Not when do_sample = false
|
|
Do sample
do_sample
|
boolean | true | When false, the model uses greedy decoding and ignores temperature and top_p. | — |
| Reasoning · 1 param | ||||
|
Thinking mode
thinking.type
|
enum (enabled | disabled) | "enabled" | Toggles the model's extended reasoning before it produces the final answer. | — |
| Output · 1 param | ||||
|
Response format
response_format.type
|
enum (text | json_object) | "text" | Forces the response into plain text or a JSON object. | — |
Z.ai GLM-4.5-Flash 6 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
| Length · 1 param | ||||
|
Max tokens
max_tokens
|
integer (1…+∞) | — | Maximum number of tokens to generate in the response. | — |
| Sampling · 3 params | ||||
|
Temperature
temperature
|
number (0…1 step 0.1) | 0.6 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. |
Not when do_sample = false
|
|
Top P
top_p
|
number (0.01…1 step 0.01) | 0.95 | Controls nucleus sampling by limiting generation to tokens within the selected cumulative probability. |
Not when do_sample = false
|
|
Do sample
do_sample
|
boolean | true | When false, the model uses greedy decoding and ignores temperature and top_p. | — |
| Reasoning · 1 param | ||||
|
Thinking mode
thinking.type
|
enum (enabled | disabled) | "enabled" | Toggles the model's extended reasoning before it produces the final answer. | — |
| Output · 1 param | ||||
|
Response format
response_format.type
|
enum (text | json_object) | "text" | Forces the response into plain text or a JSON object. | — |
Z.ai GLM-4.5 Subscription 6 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
| Length · 1 param | ||||
|
Max tokens
max_tokens
|
integer (1…+∞) | — | Maximum number of tokens to generate in the response. | — |
| Sampling · 3 params | ||||
|
Temperature
temperature
|
number (0…1 step 0.1) | 0.6 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. |
Not when do_sample = false
|
|
Top P
top_p
|
number (0.01…1 step 0.01) | 0.95 | Controls nucleus sampling by limiting generation to tokens within the selected cumulative probability. |
Not when do_sample = false
|
|
Do sample
do_sample
|
boolean | true | When false, the model uses greedy decoding and ignores temperature and top_p. | — |
| Reasoning · 1 param | ||||
|
Thinking mode
thinking.type
|
enum (enabled | disabled) | "enabled" | Toggles the model's extended reasoning before it produces the final answer. | — |
| Output · 1 param | ||||
|
Response format
response_format.type
|
enum (text | json_object) | "text" | Forces the response into plain text or a JSON object. | — |
Z.ai GLM-4.5-X 6 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
| Length · 1 param | ||||
|
Max tokens
max_tokens
|
integer (1…+∞) | — | Maximum number of tokens to generate in the response. | — |
| Sampling · 3 params | ||||
|
Temperature
temperature
|
number (0…1 step 0.1) | 0.6 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. |
Not when do_sample = false
|
|
Top P
top_p
|
number (0.01…1 step 0.01) | 0.95 | Controls nucleus sampling by limiting generation to tokens within the selected cumulative probability. |
Not when do_sample = false
|
|
Do sample
do_sample
|
boolean | true | When false, the model uses greedy decoding and ignores temperature and top_p. | — |
| Reasoning · 1 param | ||||
|
Thinking mode
thinking.type
|
enum (enabled | disabled) | "enabled" | Toggles the model's extended reasoning before it produces the final answer. | — |
| Output · 1 param | ||||
|
Response format
response_format.type
|
enum (text | json_object) | "text" | Forces the response into plain text or a JSON object. | — |
Z.ai GLM-4.6 6 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
| Length · 1 param | ||||
|
Max tokens
max_tokens
|
integer (1…+∞) | — | Maximum number of tokens to generate in the response. | — |
| Sampling · 3 params | ||||
|
Temperature
temperature
|
number (0…1 step 0.1) | 1 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. |
Not when do_sample = false
|
|
Top P
top_p
|
number (0.01…1 step 0.01) | 0.95 | Controls nucleus sampling by limiting generation to tokens within the selected cumulative probability. |
Not when do_sample = false
|
|
Do sample
do_sample
|
boolean | true | When false, the model uses greedy decoding and ignores temperature and top_p. | — |
| Reasoning · 1 param | ||||
|
Thinking mode
thinking.type
|
enum (enabled | disabled) | "enabled" | Toggles the model's extended reasoning before it produces the final answer. | — |
| Output · 1 param | ||||
|
Response format
response_format.type
|
enum (text | json_object) | "text" | Forces the response into plain text or a JSON object. | — |
Z.ai GLM-4.6 Subscription 6 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
| Length · 1 param | ||||
|
Max tokens
max_tokens
|
integer (1…+∞) | — | Maximum number of tokens to generate in the response. | — |
| Sampling · 3 params | ||||
|
Temperature
temperature
|
number (0…1 step 0.1) | 1 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. |
Not when do_sample = false
|
|
Top P
top_p
|
number (0.01…1 step 0.01) | 0.95 | Controls nucleus sampling by limiting generation to tokens within the selected cumulative probability. |
Not when do_sample = false
|
|
Do sample
do_sample
|
boolean | true | When false, the model uses greedy decoding and ignores temperature and top_p. | — |
| Reasoning · 1 param | ||||
|
Thinking mode
thinking.type
|
enum (enabled | disabled) | "enabled" | Toggles the model's extended reasoning before it produces the final answer. | — |
| Output · 1 param | ||||
|
Response format
response_format.type
|
enum (text | json_object) | "text" | Forces the response into plain text or a JSON object. | — |
Z.ai GLM-4.7 6 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
| Length · 1 param | ||||
|
Max tokens
max_tokens
|
integer (1…+∞) | — | Maximum number of tokens to generate in the response. | — |
| Sampling · 3 params | ||||
|
Temperature
temperature
|
number (0…1 step 0.1) | 1 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. |
Not when do_sample = false
|
|
Top P
top_p
|
number (0.01…1 step 0.01) | 0.95 | Controls nucleus sampling by limiting generation to tokens within the selected cumulative probability. |
Not when do_sample = false
|
|
Do sample
do_sample
|
boolean | true | When false, the model uses greedy decoding and ignores temperature and top_p. | — |
| Reasoning · 1 param | ||||
|
Thinking mode
thinking.type
|
enum (enabled | disabled) | "enabled" | Toggles the model's extended reasoning before it produces the final answer. | — |
| Output · 1 param | ||||
|
Response format
response_format.type
|
enum (text | json_object) | "text" | Forces the response into plain text or a JSON object. | — |
Z.ai GLM-4.7-Flash 6 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
| Length · 1 param | ||||
|
Max tokens
max_tokens
|
integer (1…+∞) | — | Maximum number of tokens to generate in the response. | — |
| Sampling · 3 params | ||||
|
Temperature
temperature
|
number (0…1 step 0.1) | 1 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. |
Not when do_sample = false
|
|
Top P
top_p
|
number (0.01…1 step 0.01) | 0.95 | Controls nucleus sampling by limiting generation to tokens within the selected cumulative probability. |
Not when do_sample = false
|
|
Do sample
do_sample
|
boolean | true | When false, the model uses greedy decoding and ignores temperature and top_p. | — |
| Reasoning · 1 param | ||||
|
Thinking mode
thinking.type
|
enum (enabled | disabled) | "enabled" | Toggles the model's extended reasoning before it produces the final answer. | — |
| Output · 1 param | ||||
|
Response format
response_format.type
|
enum (text | json_object) | "text" | Forces the response into plain text or a JSON object. | — |
Z.ai GLM-4.7-FlashX 6 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
| Length · 1 param | ||||
|
Max tokens
max_tokens
|
integer (1…+∞) | — | Maximum number of tokens to generate in the response. | — |
| Sampling · 3 params | ||||
|
Temperature
temperature
|
number (0…1 step 0.1) | 1 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. |
Not when do_sample = false
|
|
Top P
top_p
|
number (0.01…1 step 0.01) | 0.95 | Controls nucleus sampling by limiting generation to tokens within the selected cumulative probability. |
Not when do_sample = false
|
|
Do sample
do_sample
|
boolean | true | When false, the model uses greedy decoding and ignores temperature and top_p. | — |
| Reasoning · 1 param | ||||
|
Thinking mode
thinking.type
|
enum (enabled | disabled) | "enabled" | Toggles the model's extended reasoning before it produces the final answer. | — |
| Output · 1 param | ||||
|
Response format
response_format.type
|
enum (text | json_object) | "text" | Forces the response into plain text or a JSON object. | — |
Z.ai GLM-4.7 Subscription 6 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
| Length · 1 param | ||||
|
Max tokens
max_tokens
|
integer (1…+∞) | — | Maximum number of tokens to generate in the response. | — |
| Sampling · 3 params | ||||
|
Temperature
temperature
|
number (0…1 step 0.1) | 1 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. |
Not when do_sample = false
|
|
Top P
top_p
|
number (0.01…1 step 0.01) | 0.95 | Controls nucleus sampling by limiting generation to tokens within the selected cumulative probability. |
Not when do_sample = false
|
|
Do sample
do_sample
|
boolean | true | When false, the model uses greedy decoding and ignores temperature and top_p. | — |
| Reasoning · 1 param | ||||
|
Thinking mode
thinking.type
|
enum (enabled | disabled) | "enabled" | Toggles the model's extended reasoning before it produces the final answer. | — |
| Output · 1 param | ||||
|
Response format
response_format.type
|
enum (text | json_object) | "text" | Forces the response into plain text or a JSON object. | — |
Z.ai GLM-5 6 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
| Length · 1 param | ||||
|
Max tokens
max_tokens
|
integer (1…+∞) | — | Maximum number of tokens to generate in the response. | — |
| Sampling · 3 params | ||||
|
Temperature
temperature
|
number (0…1 step 0.1) | 1 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. |
Not when do_sample = false
|
|
Top P
top_p
|
number (0.01…1 step 0.01) | 0.95 | Controls nucleus sampling by limiting generation to tokens within the selected cumulative probability. |
Not when do_sample = false
|
|
Do sample
do_sample
|
boolean | true | When false, the model uses greedy decoding and ignores temperature and top_p. | — |
| Reasoning · 1 param | ||||
|
Thinking mode
thinking.type
|
enum (enabled | disabled) | "enabled" | Toggles the model's extended reasoning before it produces the final answer. | — |
| Output · 1 param | ||||
|
Response format
response_format.type
|
enum (text | json_object) | "text" | Forces the response into plain text or a JSON object. | — |
Z.ai GLM-5 Subscription 6 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
| Length · 1 param | ||||
|
Max tokens
max_tokens
|
integer (1…+∞) | — | Maximum number of tokens to generate in the response. | — |
| Sampling · 3 params | ||||
|
Temperature
temperature
|
number (0…1 step 0.1) | 1 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. |
Not when do_sample = false
|
|
Top P
top_p
|
number (0.01…1 step 0.01) | 0.95 | Controls nucleus sampling by limiting generation to tokens within the selected cumulative probability. |
Not when do_sample = false
|
|
Do sample
do_sample
|
boolean | true | When false, the model uses greedy decoding and ignores temperature and top_p. | — |
| Reasoning · 1 param | ||||
|
Thinking mode
thinking.type
|
enum (enabled | disabled) | "enabled" | Toggles the model's extended reasoning before it produces the final answer. | — |
| Output · 1 param | ||||
|
Response format
response_format.type
|
enum (text | json_object) | "text" | Forces the response into plain text or a JSON object. | — |
Z.ai GLM-5-Turbo 6 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
| Length · 1 param | ||||
|
Max tokens
max_tokens
|
integer (1…+∞) | — | Maximum number of tokens to generate in the response. | — |
| Sampling · 3 params | ||||
|
Temperature
temperature
|
number (0…1 step 0.1) | 1 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. |
Not when do_sample = false
|
|
Top P
top_p
|
number (0.01…1 step 0.01) | 0.95 | Controls nucleus sampling by limiting generation to tokens within the selected cumulative probability. |
Not when do_sample = false
|
|
Do sample
do_sample
|
boolean | true | When false, the model uses greedy decoding and ignores temperature and top_p. | — |
| Reasoning · 1 param | ||||
|
Thinking mode
thinking.type
|
enum (enabled | disabled) | "enabled" | Toggles the model's extended reasoning before it produces the final answer. | — |
| Output · 1 param | ||||
|
Response format
response_format.type
|
enum (text | json_object) | "text" | Forces the response into plain text or a JSON object. | — |
Z.ai GLM-5-Turbo Subscription 6 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
| Length · 1 param | ||||
|
Max tokens
max_tokens
|
integer (1…+∞) | — | Maximum number of tokens to generate in the response. | — |
| Sampling · 3 params | ||||
|
Temperature
temperature
|
number (0…1 step 0.1) | 1 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. |
Not when do_sample = false
|
|
Top P
top_p
|
number (0.01…1 step 0.01) | 0.95 | Controls nucleus sampling by limiting generation to tokens within the selected cumulative probability. |
Not when do_sample = false
|
|
Do sample
do_sample
|
boolean | true | When false, the model uses greedy decoding and ignores temperature and top_p. | — |
| Reasoning · 1 param | ||||
|
Thinking mode
thinking.type
|
enum (enabled | disabled) | "enabled" | Toggles the model's extended reasoning before it produces the final answer. | — |
| Output · 1 param | ||||
|
Response format
response_format.type
|
enum (text | json_object) | "text" | Forces the response into plain text or a JSON object. | — |
Z.ai GLM-5.1 6 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
| Length · 1 param | ||||
|
Max tokens
max_tokens
|
integer (1…+∞) | — | Maximum number of tokens to generate in the response. | — |
| Sampling · 3 params | ||||
|
Temperature
temperature
|
number (0…1 step 0.1) | 1 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. |
Not when do_sample = false
|
|
Top P
top_p
|
number (0.01…1 step 0.01) | 0.95 | Controls nucleus sampling by limiting generation to tokens within the selected cumulative probability. |
Not when do_sample = false
|
|
Do sample
do_sample
|
boolean | true | When false, the model uses greedy decoding and ignores temperature and top_p. | — |
| Reasoning · 1 param | ||||
|
Thinking mode
thinking.type
|
enum (enabled | disabled) | "enabled" | Toggles the model's extended reasoning before it produces the final answer. | — |
| Output · 1 param | ||||
|
Response format
response_format.type
|
enum (text | json_object) | "text" | Forces the response into plain text or a JSON object. | — |
Z.ai GLM-5.1 Subscription 6 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
| Length · 1 param | ||||
|
Max tokens
max_tokens
|
integer (1…+∞) | — | Maximum number of tokens to generate in the response. | — |
| Sampling · 3 params | ||||
|
Temperature
temperature
|
number (0…1 step 0.1) | 1 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. |
Not when do_sample = false
|
|
Top P
top_p
|
number (0.01…1 step 0.01) | 0.95 | Controls nucleus sampling by limiting generation to tokens within the selected cumulative probability. |
Not when do_sample = false
|
|
Do sample
do_sample
|
boolean | true | When false, the model uses greedy decoding and ignores temperature and top_p. | — |
| Reasoning · 1 param | ||||
|
Thinking mode
thinking.type
|
enum (enabled | disabled) | "enabled" | Toggles the model's extended reasoning before it produces the final answer. | — |
| Output · 1 param | ||||
|
Response format
response_format.type
|
enum (text | json_object) | "text" | Forces the response into plain text or a JSON object. | — |
MiniMax 16
MiniMax MiniMax M2 4 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
| Length · 1 param | ||||
|
Max completion tokens
max_completion_tokens
|
integer (1…+∞) | — | Maximum number of tokens to generate in the completion. | — |
| Sampling · 2 params | ||||
|
Temperature
temperature
|
number (0.01…1 step 0.01) | 1 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. Values must be greater than 0 and at most 1. | — |
|
Top P
top_p
|
number (0.01…1 step 0.01) | 0.95 | Controls nucleus sampling by limiting generation to tokens within the selected cumulative probability. | — |
| Reasoning · 1 param | ||||
|
Split reasoning
reasoning_split
|
boolean | false | Returns the model's reasoning in a separate reasoning_details field instead of inline with the response. | — |
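
The Type column encodes each numeric range in a compact notation such as number (0.01…1 step 0.01) or integer (1…+∞). The sketch below shows one way to check a value against that notation. `fits` is a hypothetical helper written for this catalog's notation only. Treating the step as a hard constraint is an assumption, since it may only describe the granularity of the catalog's controls.

```python
import math
import re

# Matches e.g. "number (0.01…1 step 0.01)" or "integer (1…+∞)".
_RANGE = re.compile(r"^(integer|number) \((\S+?)…(\S+?)(?: step (\S+))?\)$")


def _bound(text: str) -> float:
    """Convert a bound from the Type column to a float, mapping +∞ to infinity."""
    return math.inf if text == "+∞" else float(text)


def fits(type_spec: str, value: float) -> bool:
    """Return True if value lies in the range (and on the step grid) of type_spec."""
    match = _RANGE.match(type_spec)
    if match is None:
        raise ValueError(f"unrecognized type notation: {type_spec!r}")
    kind, low, high, step = match.groups()

    # Integers must be real ints; bool is excluded even though it subclasses int.
    if kind == "integer" and (isinstance(value, bool) or not isinstance(value, int)):
        return False

    lo, hi = _bound(low), _bound(high)
    if not lo <= value <= hi:
        return False

    if step is not None:
        # Count steps from the lower bound and allow for floating-point error.
        steps = (value - lo) / float(step)
        if abs(steps - round(steps)) > 1e-9:
            return False
    return True


if __name__ == "__main__":
    print(fits("number (0.01…1 step 0.01)", 0.07))
```

For the MiniMax temperature row above, `fits("number (0.01…1 step 0.01)", 0)` is false, which matches the description's requirement that the value be greater than 0.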
MiniMax MiniMax M2 Subscription 3 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
| Length · 1 param | ||||
|
Max tokens
max_tokens
|
integer (1…+∞) | — | Maximum number of tokens to generate in the response. | — |
| Sampling · 2 params | ||||
|
Temperature
temperature
|
number (0.01…1 step 0.01) | 1 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. Values must be greater than 0 and at most 1. | — |
|
Top P
top_p
|
number (0.01…1 step 0.01) | 0.95 | Controls nucleus sampling by limiting generation to tokens within the selected cumulative probability. | — |
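
The two MiniMax M2 tables above differ in the name of the length parameter (max_completion_tokens for the API-key variant, max_tokens for the subscription variant) and in whether reasoning_split is listed. A minimal sketch of handling both variants in one place follows. `minimax_params` is a hypothetical helper, not a MiniMax SDK function, and it relies only on the names, ranges, and defaults listed in these tables.

```python
from typing import Any


def minimax_params(
    limit: int,
    subscription: bool,
    temperature: float = 1.0,
    top_p: float = 0.95,
    reasoning_split: bool = False,
) -> dict[str, Any]:
    """Build a parameter dict for either MiniMax M2 variant listed above."""
    if limit < 1:
        raise ValueError("token limit must be at least 1")
    # The description requires a value greater than 0 and at most 1.
    if not 0 < temperature <= 1:
        raise ValueError("temperature must be greater than 0 and at most 1")
    if not 0.01 <= top_p <= 1:
        raise ValueError("top_p must be within 0.01…1")

    params: dict[str, Any] = {"temperature": temperature, "top_p": top_p}
    if subscription:
        # The subscription table lists max_tokens and no reasoning parameter.
        if reasoning_split:
            raise ValueError("reasoning_split is not listed for the subscription variant")
        params["max_tokens"] = limit
    else:
        # The API-key table lists max_completion_tokens and reasoning_split.
        params["max_completion_tokens"] = limit
        params["reasoning_split"] = reasoning_split
    return params


if __name__ == "__main__":
    print(minimax_params(512, subscription=True))
```

Rejecting reasoning_split for the subscription variant, instead of silently dropping it, is a design choice of this sketch. It surfaces the difference between the two variants at the call site.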
MiniMax MiniMax M2.1 4 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
| Length · 1 param | ||||
|
Max completion tokens
max_completion_tokens
|
integer (1…+∞) | — | Maximum number of tokens to generate in the completion. | — |
| Sampling · 2 params | ||||
|
Temperature
temperature
|
number (0.01…1 step 0.01) | 1 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. Values must be greater than 0 and at most 1. | — |
|
Top P
top_p
|
number (0.01…1 step 0.01) | 0.95 | Controls nucleus sampling by limiting generation to tokens within the selected cumulative probability. | — |
| Reasoning · 1 param | ||||
|
Split reasoning
reasoning_split
|
boolean | false | Returns the model's reasoning in a separate reasoning_details field instead of inline with the response. | — |
MiniMax MiniMax M2.1 Highspeed 4 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
| Length · 1 param | ||||
|
Max completion tokens
max_completion_tokens
|
integer (1…+∞) | — | Maximum number of tokens to generate in the completion. | — |
| Sampling · 2 params | ||||
|
Temperature
temperature
|
number (0.01…1 step 0.01) | 1 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. Values must be greater than 0 and at most 1. | — |
|
Top P
top_p
|
number (0.01…1 step 0.01) | 0.95 | Controls nucleus sampling by limiting generation to tokens within the selected cumulative probability. | — |
| Reasoning · 1 param | ||||
|
Split reasoning
reasoning_split
|
boolean | false | Returns the model's reasoning in a separate reasoning_details field instead of inline with the response. | — |
MiniMax MiniMax M2.1 Highspeed Subscription 3 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
| Length · 1 param | ||||
|
Max tokens
max_tokens
|
integer (1…+∞) | — | Maximum number of tokens to generate in the response. | — |
| Sampling · 2 params | ||||
|
Temperature
temperature
|
number (0.01…1 step 0.01) | 1 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. Values must be greater than 0 and at most 1. | — |
|
Top P
top_p
|
number (0.01…1 step 0.01) | 0.95 | Controls nucleus sampling by limiting generation to tokens within the selected cumulative probability. | — |
MiniMax MiniMax M2.1 Subscription 3 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
| Length · 1 param | ||||
|
Max tokens
max_tokens
|
integer (1…+∞) | — | Maximum number of tokens to generate in the response. | — |
| Sampling · 2 params | ||||
|
Temperature
temperature
|
number (0.01…1 step 0.01) | 1 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. Values must be greater than 0 and at most 1. | — |
|
Top P
top_p
|
number (0.01…1 step 0.01) | 0.95 | Controls nucleus sampling by limiting generation to tokens within the selected cumulative probability. | — |
MiniMax MiniMax M2.5 4 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
| Length · 1 param | ||||
|
Max completion tokens
max_completion_tokens
|
integer (1…+∞) | — | Maximum number of tokens to generate in the completion. | — |
| Sampling · 2 params | ||||
|
Temperature
temperature
|
number (0.01…1 step 0.01) | 1 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. Values must be greater than 0 and at most 1. | — |
|
Top P
top_p
|
number (0.01…1 step 0.01) | 0.95 | Controls nucleus sampling by limiting generation to tokens within the selected cumulative probability. | — |
| Reasoning · 1 param | ||||
|
Split reasoning
reasoning_split
|
boolean | false | Returns the model's reasoning in a separate reasoning_details field instead of inline with the response. | — |
MiniMax MiniMax M2.5 Highspeed 4 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
| Length · 1 param | ||||
|
Max completion tokens
max_completion_tokens
|
integer (1…+∞) | — | Maximum number of tokens to generate in the completion. | — |
| Sampling · 2 params | ||||
|
Temperature
temperature
|
number (0.01…1 step 0.01) | 1 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. Values must be greater than 0 and at most 1. | — |
|
Top P
top_p
|
number (0.01…1 step 0.01) | 0.95 | Controls nucleus sampling by limiting generation to tokens within the selected cumulative probability. | — |
| Reasoning · 1 param | ||||
|
Split reasoning
reasoning_split
|
boolean | false | Returns the model's reasoning in a separate reasoning_details field instead of inline with the response. | — |
MiniMax MiniMax M2.5 Highspeed Subscription 3 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
| Length · 1 param | ||||
|
Max tokens
max_tokens
|
integer (1…+∞) | — | Maximum number of tokens to generate in the response. | — |
| Sampling · 2 params | ||||
|
Temperature
temperature
|
number (0.01…1 step 0.01) | 1 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. Values must be greater than 0 and at most 1. | — |
|
Top P
top_p
|
number (0.01…1 step 0.01) | 0.95 | Controls nucleus sampling by limiting generation to tokens within the selected cumulative probability. | — |
MiniMax MiniMax M2.5 Subscription 3 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
| Length · 1 param | ||||
|
Max tokens
max_tokens
|
integer (1…+∞) | — | Maximum number of tokens to generate in the response. | — |
| Sampling · 2 params | ||||
|
Temperature
temperature
|
number (0.01…1 step 0.01) | 1 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. Values must be greater than 0 and at most 1. | — |
|
Top P
top_p
|
number (0.01…1 step 0.01) | 0.95 | Controls nucleus sampling by limiting generation to tokens within the selected cumulative probability. | — |
MiniMax MiniMax M2.7 4 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
| Length · 1 param | | | | |
| Max completion tokens `max_completion_tokens` | integer (1…+∞) | — | Maximum number of tokens to generate in the completion. | — |
| Sampling · 2 params | | | | |
| Temperature `temperature` | number (0.01…1 step 0.01) | 1 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. Values must be greater than 0 and at most 1. | — |
| Top P `top_p` | number (0.01…1 step 0.01) | 0.95 | Controls nucleus sampling by limiting generation to tokens within the selected cumulative probability. | — |
| Reasoning · 1 param | | | | |
| Split reasoning `reasoning_split` | boolean | false | Returns the model's reasoning in a separate reasoning_details field instead of inline with the response. | — |
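
The ranges listed for MiniMax M2.7 can be checked on the client before a request is sent. The following is a minimal sketch under stated assumptions, not MiniMax's own client code: `build_minimax_params` is a hypothetical helper, and the bounds are copied from the Type column above rather than from the provider's API reference.

```python
def build_minimax_params(temperature=1.0, top_p=0.95,
                         max_completion_tokens=None, reasoning_split=False):
    """Return a parameter dict after checking the ranges listed in the table.

    Hypothetical helper: the name and signature are illustrative only.
    """
    # The table says temperature must be greater than 0 and at most 1.
    if not 0 < temperature <= 1:
        raise ValueError("temperature must be greater than 0 and at most 1")
    # top_p is listed as 0.01…1.
    if not 0.01 <= top_p <= 1:
        raise ValueError("top_p must be between 0.01 and 1")

    params = {
        "temperature": temperature,
        "top_p": top_p,
        "reasoning_split": reasoning_split,
    }
    # max_completion_tokens has no listed default, so it is sent only when set.
    if max_completion_tokens is not None:
        if max_completion_tokens < 1:
            raise ValueError("max_completion_tokens must be at least 1")
        params["max_completion_tokens"] = max_completion_tokens
    return params


params = build_minimax_params(temperature=0.7, reasoning_split=True)
```

Rejecting `temperature=0` locally mirrors the "greater than 0" note in the table; whether the service itself rejects or clamps such a value is not stated here.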
MiniMax MiniMax M2.7 Highspeed 4 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
| Length · 1 param | | | | |
| Max completion tokens `max_completion_tokens` | integer (1…+∞) | — | Maximum number of tokens to generate in the completion. | — |
| Sampling · 2 params | | | | |
| Temperature `temperature` | number (0.01…1 step 0.01) | 1 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. Values must be greater than 0 and at most 1. | — |
| Top P `top_p` | number (0.01…1 step 0.01) | 0.95 | Controls nucleus sampling by limiting generation to tokens within the selected cumulative probability. | — |
| Reasoning · 1 param | | | | |
| Split reasoning `reasoning_split` | boolean | false | Returns the model's reasoning in a separate reasoning_details field instead of inline with the response. | — |
MiniMax MiniMax M2.7 Highspeed Subscription 3 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
| Length · 1 param | | | | |
| Max tokens `max_tokens` | integer (1…+∞) | — | Maximum number of tokens to generate in the response. | — |
| Sampling · 2 params | | | | |
| Temperature `temperature` | number (0.01…1 step 0.01) | 1 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. Values must be greater than 0 and at most 1. | — |
| Top P `top_p` | number (0.01…1 step 0.01) | 0.95 | Controls nucleus sampling by limiting generation to tokens within the selected cumulative probability. | — |
MiniMax MiniMax M2.7 Subscription 3 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
| Length · 1 param | | | | |
| Max tokens `max_tokens` | integer (1…+∞) | — | Maximum number of tokens to generate in the response. | — |
| Sampling · 2 params | | | | |
| Temperature `temperature` | number (0.01…1 step 0.01) | 1 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. Values must be greater than 0 and at most 1. | — |
| Top P `top_p` | number (0.01…1 step 0.01) | 0.95 | Controls nucleus sampling by limiting generation to tokens within the selected cumulative probability. | — |
MiniMax MiniMax M3 4 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
| Length · 1 param | | | | |
| Max completion tokens `max_completion_tokens` | integer (1…+∞) | — | Maximum number of tokens to generate in the completion. | — |
| Sampling · 2 params | | | | |
| Temperature `temperature` | number (0.01…1 step 0.01) | 1 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. Values must be greater than 0 and at most 1. | — |
| Top P `top_p` | number (0.01…1 step 0.01) | 0.95 | Controls nucleus sampling by limiting generation to tokens within the selected cumulative probability. | — |
| Reasoning · 1 param | | | | |
| Split reasoning `reasoning_split` | boolean | false | Returns the model's reasoning in a separate reasoning_details field instead of inline with the response. | — |
MiniMax MiniMax M3 Subscription 3 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
| Length · 1 param | | | | |
| Max tokens `max_tokens` | integer (1…+∞) | — | Maximum number of tokens to generate in the response. | — |
| Sampling · 2 params | | | | |
| Temperature `temperature` | number (0.01…1 step 0.01) | 1 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. Values must be greater than 0 and at most 1. | — |
| Top P `top_p` | number (0.01…1 step 0.01) | 0.95 | Controls nucleus sampling by limiting generation to tokens within the selected cumulative probability. | — |
Mistral 13
Mistral Codestral Latest 9 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
| Length · 2 params | | | | |
| Max tokens `max_tokens` | integer (1…+∞) | — | Maximum number of tokens to generate in the completion. | — |
| Stop sequence `stop` | string | — | Stops generation when this string is detected. | — |
| Sampling · 5 params | | | | |
| Temperature `temperature` | number (0…1.5 step 0.1) | — | Controls randomness. Lower values make outputs more focused; higher values make them more varied. | — |
| Top P `top_p` | number (0…1 step 0.01) | 1 | Controls nucleus sampling by limiting generation to tokens within the selected cumulative probability. | — |
| Random seed `random_seed` | integer (0…+∞) | — | Seed used for deterministic sampling when reproducible outputs are desired. | — |
| Presence penalty `presence_penalty` | number (-2…2 step 0.1) | 0 | Penalizes repeated words or phrases to encourage a wider variety of generated content. | — |
| Frequency penalty `frequency_penalty` | number (-2…2 step 0.1) | 0 | Penalizes words based on how often they already appear in the generated text. | — |
| Output · 1 param | | | | |
| Response format `response_format.type` | enum (text \| json_object) | "text" | Controls whether the model returns normal text or JSON mode output. | — |
| Metadata · 1 param | | | | |
| Safe prompt `safe_prompt` | boolean | false | Controls whether Mistral injects its safety prompt before the conversation. | — |
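
Some parameter names in this catalog contain dots, such as `response_format.type`. A plausible reading, and it is an assumption here, is that the dots mark a path into a nested JSON request body. The sketch below expands such names into nested dictionaries; `expand_dotted` is a hypothetical helper, not part of any Mistral SDK.

```python
def expand_dotted(params):
    """Expand {"a.b": 1} into {"a": {"b": 1}} for every dotted key.

    Hypothetical helper; assumes dots in a parameter name denote nesting.
    """
    body = {}
    for path, value in params.items():
        *parents, leaf = path.split(".")
        node = body
        # Walk (and create) the intermediate objects named by the path.
        for key in parents:
            node = node.setdefault(key, {})
        node[leaf] = value
    return body


body = expand_dotted({
    "max_tokens": 256,
    "random_seed": 42,
    "response_format.type": "json_object",
})
```

The same expansion would apply to longer paths elsewhere in the catalog, for example names that start with `generationConfig.`.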
Mistral Devstral 2512 9 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
| Length · 2 params | | | | |
| Max tokens `max_tokens` | integer (1…+∞) | — | Maximum number of tokens to generate in the completion. | — |
| Stop sequence `stop` | string | — | Stops generation when this string is detected. | — |
| Sampling · 5 params | | | | |
| Temperature `temperature` | number (0…1.5 step 0.1) | — | Controls randomness. Lower values make outputs more focused; higher values make them more varied. | — |
| Top P `top_p` | number (0…1 step 0.01) | 1 | Controls nucleus sampling by limiting generation to tokens within the selected cumulative probability. | — |
| Random seed `random_seed` | integer (0…+∞) | — | Seed used for deterministic sampling when reproducible outputs are desired. | — |
| Presence penalty `presence_penalty` | number (-2…2 step 0.1) | 0 | Penalizes repeated words or phrases to encourage a wider variety of generated content. | — |
| Frequency penalty `frequency_penalty` | number (-2…2 step 0.1) | 0 | Penalizes words based on how often they already appear in the generated text. | — |
| Output · 1 param | | | | |
| Response format `response_format.type` | enum (text \| json_object) | "text" | Controls whether the model returns normal text or JSON mode output. | — |
| Metadata · 1 param | | | | |
| Safe prompt `safe_prompt` | boolean | false | Controls whether Mistral injects its safety prompt before the conversation. | — |
Mistral Devstral Latest 9 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
| Length · 2 params | | | | |
| Max tokens `max_tokens` | integer (1…+∞) | — | Maximum number of tokens to generate in the completion. | — |
| Stop sequence `stop` | string | — | Stops generation when this string is detected. | — |
| Sampling · 5 params | | | | |
| Temperature `temperature` | number (0…1.5 step 0.1) | — | Controls randomness. Lower values make outputs more focused; higher values make them more varied. | — |
| Top P `top_p` | number (0…1 step 0.01) | 1 | Controls nucleus sampling by limiting generation to tokens within the selected cumulative probability. | — |
| Random seed `random_seed` | integer (0…+∞) | — | Seed used for deterministic sampling when reproducible outputs are desired. | — |
| Presence penalty `presence_penalty` | number (-2…2 step 0.1) | 0 | Penalizes repeated words or phrases to encourage a wider variety of generated content. | — |
| Frequency penalty `frequency_penalty` | number (-2…2 step 0.1) | 0 | Penalizes words based on how often they already appear in the generated text. | — |
| Output · 1 param | | | | |
| Response format `response_format.type` | enum (text \| json_object) | "text" | Controls whether the model returns normal text or JSON mode output. | — |
| Metadata · 1 param | | | | |
| Safe prompt `safe_prompt` | boolean | false | Controls whether Mistral injects its safety prompt before the conversation. | — |
Mistral Magistral Medium Latest 10 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
| Length · 2 params | | | | |
| Max tokens `max_tokens` | integer (1…+∞) | — | Maximum number of tokens to generate in the completion. | — |
| Stop sequence `stop` | string | — | Stops generation when this string is detected. | — |
| Sampling · 5 params | | | | |
| Temperature `temperature` | number (0…1.5 step 0.1) | — | Controls randomness. Lower values make outputs more focused; higher values make them more varied. | — |
| Top P `top_p` | number (0…1 step 0.01) | 1 | Controls nucleus sampling by limiting generation to tokens within the selected cumulative probability. | — |
| Random seed `random_seed` | integer (0…+∞) | — | Seed used for deterministic sampling when reproducible outputs are desired. | — |
| Presence penalty `presence_penalty` | number (-2…2 step 0.1) | 0 | Penalizes repeated words or phrases to encourage a wider variety of generated content. | — |
| Frequency penalty `frequency_penalty` | number (-2…2 step 0.1) | 0 | Penalizes words based on how often they already appear in the generated text. | — |
| Reasoning · 1 param | | | | |
| Prompt mode `prompt_mode` | enum (reasoning) | — | Enables Mistral's reasoning system prompt; leave unset to disable the default reasoning behavior. | — |
| Output · 1 param | | | | |
| Response format `response_format.type` | enum (text \| json_object) | "text" | Controls whether the model returns normal text or JSON mode output. | — |
| Metadata · 1 param | | | | |
| Safe prompt `safe_prompt` | boolean | false | Controls whether Mistral injects its safety prompt before the conversation. | — |
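
Several rows above show a dash in the Default column (`temperature`, `random_seed`, `prompt_mode`). One reading, which is an assumption rather than something the catalog states, is that no default is recorded for them, so a client can leave them out of the request unless the caller sets a value. A minimal sketch with a hypothetical `drop_unset` helper:

```python
def drop_unset(params):
    """Remove keys whose value is None so they are not sent at all.

    Hypothetical helper. False and 0 are real values and are kept.
    """
    return {key: value for key, value in params.items() if value is not None}


request_params = drop_unset({
    "temperature": None,         # no default listed; left to the provider
    "top_p": 1,
    "random_seed": None,         # no default listed
    "prompt_mode": "reasoning",  # the only value listed for this enum
    "safe_prompt": False,
})
```

Filtering on `is not None` rather than on truthiness matters here, because `False` and `0` are valid settings for `safe_prompt` and the penalty parameters.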
Mistral Magistral Small Latest 10 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
| Length · 2 params | | | | |
| Max tokens `max_tokens` | integer (1…+∞) | — | Maximum number of tokens to generate in the completion. | — |
| Stop sequence `stop` | string | — | Stops generation when this string is detected. | — |
| Sampling · 5 params | | | | |
| Temperature `temperature` | number (0…1.5 step 0.1) | — | Controls randomness. Lower values make outputs more focused; higher values make them more varied. | — |
| Top P `top_p` | number (0…1 step 0.01) | 1 | Controls nucleus sampling by limiting generation to tokens within the selected cumulative probability. | — |
| Random seed `random_seed` | integer (0…+∞) | — | Seed used for deterministic sampling when reproducible outputs are desired. | — |
| Presence penalty `presence_penalty` | number (-2…2 step 0.1) | 0 | Penalizes repeated words or phrases to encourage a wider variety of generated content. | — |
| Frequency penalty `frequency_penalty` | number (-2…2 step 0.1) | 0 | Penalizes words based on how often they already appear in the generated text. | — |
| Reasoning · 1 param | | | | |
| Prompt mode `prompt_mode` | enum (reasoning) | — | Enables Mistral's reasoning system prompt; leave unset to disable the default reasoning behavior. | — |
| Output · 1 param | | | | |
| Response format `response_format.type` | enum (text \| json_object) | "text" | Controls whether the model returns normal text or JSON mode output. | — |
| Metadata · 1 param | | | | |
| Safe prompt `safe_prompt` | boolean | false | Controls whether Mistral injects its safety prompt before the conversation. | — |
Mistral Ministral 14b Latest 9 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
| Length · 2 params | | | | |
| Max tokens `max_tokens` | integer (1…+∞) | — | Maximum number of tokens to generate in the completion. | — |
| Stop sequence `stop` | string | — | Stops generation when this string is detected. | — |
| Sampling · 5 params | | | | |
| Temperature `temperature` | number (0…1.5 step 0.1) | — | Controls randomness. Lower values make outputs more focused; higher values make them more varied. | — |
| Top P `top_p` | number (0…1 step 0.01) | 1 | Controls nucleus sampling by limiting generation to tokens within the selected cumulative probability. | — |
| Random seed `random_seed` | integer (0…+∞) | — | Seed used for deterministic sampling when reproducible outputs are desired. | — |
| Presence penalty `presence_penalty` | number (-2…2 step 0.1) | 0 | Penalizes repeated words or phrases to encourage a wider variety of generated content. | — |
| Frequency penalty `frequency_penalty` | number (-2…2 step 0.1) | 0 | Penalizes words based on how often they already appear in the generated text. | — |
| Output · 1 param | | | | |
| Response format `response_format.type` | enum (text \| json_object) | "text" | Controls whether the model returns normal text or JSON mode output. | — |
| Metadata · 1 param | | | | |
| Safe prompt `safe_prompt` | boolean | false | Controls whether Mistral injects its safety prompt before the conversation. | — |
Mistral Ministral 3b Latest 9 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
| Length · 2 params | | | | |
| Max tokens `max_tokens` | integer (1…+∞) | — | Maximum number of tokens to generate in the completion. | — |
| Stop sequence `stop` | string | — | Stops generation when this string is detected. | — |
| Sampling · 5 params | | | | |
| Temperature `temperature` | number (0…1.5 step 0.1) | — | Controls randomness. Lower values make outputs more focused; higher values make them more varied. | — |
| Top P `top_p` | number (0…1 step 0.01) | 1 | Controls nucleus sampling by limiting generation to tokens within the selected cumulative probability. | — |
| Random seed `random_seed` | integer (0…+∞) | — | Seed used for deterministic sampling when reproducible outputs are desired. | — |
| Presence penalty `presence_penalty` | number (-2…2 step 0.1) | 0 | Penalizes repeated words or phrases to encourage a wider variety of generated content. | — |
| Frequency penalty `frequency_penalty` | number (-2…2 step 0.1) | 0 | Penalizes words based on how often they already appear in the generated text. | — |
| Output · 1 param | | | | |
| Response format `response_format.type` | enum (text \| json_object) | "text" | Controls whether the model returns normal text or JSON mode output. | — |
| Metadata · 1 param | | | | |
| Safe prompt `safe_prompt` | boolean | false | Controls whether Mistral injects its safety prompt before the conversation. | — |
Mistral Ministral 8b Latest 9 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
| Length · 2 params | | | | |
| Max tokens `max_tokens` | integer (1…+∞) | — | Maximum number of tokens to generate in the completion. | — |
| Stop sequence `stop` | string | — | Stops generation when this string is detected. | — |
| Sampling · 5 params | | | | |
| Temperature `temperature` | number (0…1.5 step 0.1) | — | Controls randomness. Lower values make outputs more focused; higher values make them more varied. | — |
| Top P `top_p` | number (0…1 step 0.01) | 1 | Controls nucleus sampling by limiting generation to tokens within the selected cumulative probability. | — |
| Random seed `random_seed` | integer (0…+∞) | — | Seed used for deterministic sampling when reproducible outputs are desired. | — |
| Presence penalty `presence_penalty` | number (-2…2 step 0.1) | 0 | Penalizes repeated words or phrases to encourage a wider variety of generated content. | — |
| Frequency penalty `frequency_penalty` | number (-2…2 step 0.1) | 0 | Penalizes words based on how often they already appear in the generated text. | — |
| Output · 1 param | | | | |
| Response format `response_format.type` | enum (text \| json_object) | "text" | Controls whether the model returns normal text or JSON mode output. | — |
| Metadata · 1 param | | | | |
| Safe prompt `safe_prompt` | boolean | false | Controls whether Mistral injects its safety prompt before the conversation. | — |
Mistral Mistral Large Latest 9 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
| Length · 2 params | | | | |
| Max tokens `max_tokens` | integer (1…+∞) | — | Maximum number of tokens to generate in the completion. | — |
| Stop sequence `stop` | string | — | Stops generation when this string is detected. | — |
| Sampling · 5 params | | | | |
| Temperature `temperature` | number (0…1.5 step 0.1) | — | Controls randomness. Lower values make outputs more focused; higher values make them more varied. | — |
| Top P `top_p` | number (0…1 step 0.01) | 1 | Controls nucleus sampling by limiting generation to tokens within the selected cumulative probability. | — |
| Random seed `random_seed` | integer (0…+∞) | — | Seed used for deterministic sampling when reproducible outputs are desired. | — |
| Presence penalty `presence_penalty` | number (-2…2 step 0.1) | 0 | Penalizes repeated words or phrases to encourage a wider variety of generated content. | — |
| Frequency penalty `frequency_penalty` | number (-2…2 step 0.1) | 0 | Penalizes words based on how often they already appear in the generated text. | — |
| Output · 1 param | | | | |
| Response format `response_format.type` | enum (text \| json_object) | "text" | Controls whether the model returns normal text or JSON mode output. | — |
| Metadata · 1 param | | | | |
| Safe prompt `safe_prompt` | boolean | false | Controls whether Mistral injects its safety prompt before the conversation. | — |
Mistral Mistral Medium 3.5 9 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
| Length · 2 params | | | | |
| Max tokens `max_tokens` | integer (1…+∞) | — | Maximum number of tokens to generate in the completion. | — |
| Stop sequence `stop` | string | — | Stops generation when this string is detected. | — |
| Sampling · 5 params | | | | |
| Temperature `temperature` | number (0…1.5 step 0.1) | — | Controls randomness. Lower values make outputs more focused; higher values make them more varied. | — |
| Top P `top_p` | number (0…1 step 0.01) | 1 | Controls nucleus sampling by limiting generation to tokens within the selected cumulative probability. | — |
| Random seed `random_seed` | integer (0…+∞) | — | Seed used for deterministic sampling when reproducible outputs are desired. | — |
| Presence penalty `presence_penalty` | number (-2…2 step 0.1) | 0 | Penalizes repeated words or phrases to encourage a wider variety of generated content. | — |
| Frequency penalty `frequency_penalty` | number (-2…2 step 0.1) | 0 | Penalizes words based on how often they already appear in the generated text. | — |
| Output · 1 param | | | | |
| Response format `response_format.type` | enum (text \| json_object) | "text" | Controls whether the model returns normal text or JSON mode output. | — |
| Metadata · 1 param | | | | |
| Safe prompt `safe_prompt` | boolean | false | Controls whether Mistral injects its safety prompt before the conversation. | — |
Mistral Mistral Medium Latest 9 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
| Length · 2 params | | | | |
| Max tokens `max_tokens` | integer (1…+∞) | — | Maximum number of tokens to generate in the completion. | — |
| Stop sequence `stop` | string | — | Stops generation when this string is detected. | — |
| Sampling · 5 params | | | | |
| Temperature `temperature` | number (0…1.5 step 0.1) | — | Controls randomness. Lower values make outputs more focused; higher values make them more varied. | — |
| Top P `top_p` | number (0…1 step 0.01) | 1 | Controls nucleus sampling by limiting generation to tokens within the selected cumulative probability. | — |
| Random seed `random_seed` | integer (0…+∞) | — | Seed used for deterministic sampling when reproducible outputs are desired. | — |
| Presence penalty `presence_penalty` | number (-2…2 step 0.1) | 0 | Penalizes repeated words or phrases to encourage a wider variety of generated content. | — |
| Frequency penalty `frequency_penalty` | number (-2…2 step 0.1) | 0 | Penalizes words based on how often they already appear in the generated text. | — |
| Output · 1 param | | | | |
| Response format `response_format.type` | enum (text \| json_object) | "text" | Controls whether the model returns normal text or JSON mode output. | — |
| Metadata · 1 param | | | | |
| Safe prompt `safe_prompt` | boolean | false | Controls whether Mistral injects its safety prompt before the conversation. | — |
Mistral Mistral Small Latest 9 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
| Length · 2 params | | | | |
| Max tokens `max_tokens` | integer (1…+∞) | — | Maximum number of tokens to generate in the completion. | — |
| Stop sequence `stop` | string | — | Stops generation when this string is detected. | — |
| Sampling · 5 params | | | | |
| Temperature `temperature` | number (0…1.5 step 0.1) | — | Controls randomness. Lower values make outputs more focused; higher values make them more varied. | — |
| Top P `top_p` | number (0…1 step 0.01) | 1 | Controls nucleus sampling by limiting generation to tokens within the selected cumulative probability. | — |
| Random seed `random_seed` | integer (0…+∞) | — | Seed used for deterministic sampling when reproducible outputs are desired. | — |
| Presence penalty `presence_penalty` | number (-2…2 step 0.1) | 0 | Penalizes repeated words or phrases to encourage a wider variety of generated content. | — |
| Frequency penalty `frequency_penalty` | number (-2…2 step 0.1) | 0 | Penalizes words based on how often they already appear in the generated text. | — |
| Output · 1 param | | | | |
| Response format `response_format.type` | enum (text \| json_object) | "text" | Controls whether the model returns normal text or JSON mode output. | — |
| Metadata · 1 param | | | | |
| Safe prompt `safe_prompt` | boolean | false | Controls whether Mistral injects its safety prompt before the conversation. | — |
Mistral Open Mistral Nemo 9 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
| Length · 2 params | | | | |
| Max tokens `max_tokens` | integer (1…+∞) | — | Maximum number of tokens to generate in the completion. | — |
| Stop sequence `stop` | string | — | Stops generation when this string is detected. | — |
| Sampling · 5 params | | | | |
| Temperature `temperature` | number (0…1.5 step 0.1) | — | Controls randomness. Lower values make outputs more focused; higher values make them more varied. | — |
| Top P `top_p` | number (0…1 step 0.01) | 1 | Controls nucleus sampling by limiting generation to tokens within the selected cumulative probability. | — |
| Random seed `random_seed` | integer (0…+∞) | — | Seed used for deterministic sampling when reproducible outputs are desired. | — |
| Presence penalty `presence_penalty` | number (-2…2 step 0.1) | 0 | Penalizes repeated words or phrases to encourage a wider variety of generated content. | — |
| Frequency penalty `frequency_penalty` | number (-2…2 step 0.1) | 0 | Penalizes words based on how often they already appear in the generated text. | — |
| Output · 1 param | | | | |
| Response format `response_format.type` | enum (text \| json_object) | "text" | Controls whether the model returns normal text or JSON mode output. | — |
| Metadata · 1 param | | | | |
| Safe prompt `safe_prompt` | boolean | false | Controls whether Mistral injects its safety prompt before the conversation. | — |
Google 11
Google Gemini 2.5 Flash 8 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
| Length · 1 param | | | | |
| Max output tokens `generationConfig.maxOutputTokens` | integer (1…65536) | — | Maximum number of tokens to include in a response candidate. | — |
| Sampling · 4 params | | | | |
| Temperature `generationConfig.temperature` | number (0…2 step 0.1) | 1 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. | — |
| Top P `generationConfig.topP` | number (0…1 step 0.01) | 0.95 | Controls nucleus sampling by limiting generation to tokens within the selected cumulative probability. | — |
| Top K `generationConfig.topK` | integer (0…+∞) | 64 | Limits token sampling to the top K most likely next tokens. | — |
| Seed `generationConfig.seed` | integer | — | Optional seed used for decoding when reproducible sampling is desired. | — |
| Reasoning · 2 params | | | | |
| Thinking budget `generationConfig.thinkingConfig.thinkingBudget` | integer (-1…24576) | -1 | Number of thinking tokens Gemini should use; 0 disables thinking and -1 uses dynamic thinking. | — |
| Include thoughts `generationConfig.thinkingConfig.includeThoughts` | boolean | false | Controls whether Gemini returns available thought summaries in the response parts. | — |
| Output · 1 param | | | | |
| Response MIME type `generationConfig.responseMimeType` | enum (text/plain \| application/json) | "text/plain" | MIME type for generated text candidates. | — |
Google Gemini 2.5 Flash Lite 8 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
| Length · 1 param | | | | |
| Max output tokens `generationConfig.maxOutputTokens` | integer (1…65536) | — | Maximum number of tokens to include in a response candidate. | — |
| Sampling · 4 params | | | | |
| Temperature `generationConfig.temperature` | number (0…2 step 0.1) | 1 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. | — |
| Top P `generationConfig.topP` | number (0…1 step 0.01) | 0.95 | Controls nucleus sampling by limiting generation to tokens within the selected cumulative probability. | — |
| Top K `generationConfig.topK` | integer (0…+∞) | 64 | Limits token sampling to the top K most likely next tokens. | — |
| Seed `generationConfig.seed` | integer | — | Optional seed used for decoding when reproducible sampling is desired. | — |
| Reasoning · 2 params | | | | |
| Thinking budget `generationConfig.thinkingConfig.thinkingBudget` | integer | 0 | Number of thinking tokens Gemini should use; -1 uses dynamic thinking, 0 disables thinking, and fixed budgets start at 512 tokens. | — |
| Include thoughts `generationConfig.thinkingConfig.includeThoughts` | boolean | false | Controls whether Gemini returns available thought summaries in the response parts. | — |
| Output · 1 param | | | | |
| Response MIME type `generationConfig.responseMimeType` | enum (text/plain \| application/json) | "text/plain" | MIME type for generated text candidates. | — |
Google Gemini 2.5 Flash Lite Subscription 8 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
| Length · 1 param | | | | |
| Max output tokens `generationConfig.maxOutputTokens` | integer (1…65536) | — | Maximum number of tokens to include in a response candidate. | — |
| Sampling · 4 params | | | | |
| Temperature `generationConfig.temperature` | number (0…2 step 0.1) | 1 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. | — |
| Top P `generationConfig.topP` | number (0…1 step 0.01) | 0.95 | Controls nucleus sampling by limiting generation to tokens within the selected cumulative probability. | — |
| Top K `generationConfig.topK` | integer (0…+∞) | 64 | Limits token sampling to the top K most likely next tokens. | — |
| Seed `generationConfig.seed` | integer | — | Optional seed used for decoding when reproducible sampling is desired. | — |
| Reasoning · 2 params | | | | |
| Thinking budget `generationConfig.thinkingConfig.thinkingBudget` | integer | 0 | Number of thinking tokens Gemini should use; -1 uses dynamic thinking, 0 disables thinking, and fixed budgets start at 512 tokens. | — |
| Include thoughts `generationConfig.thinkingConfig.includeThoughts` | boolean | false | Controls whether Gemini returns available thought summaries in the response parts. | — |
| Output · 1 param | | | | |
| Response MIME type `generationConfig.responseMimeType` | enum (text/plain \| application/json) | "text/plain" | MIME type for generated text candidates. | — |
Google Gemini 2.5 Flash Subscription 8 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
| Length · 1 param | | | | |
| Max output tokens `generationConfig.maxOutputTokens` | integer (1…65536) | — | Maximum number of tokens to include in a response candidate. | — |
| Sampling · 4 params | | | | |
| Temperature `generationConfig.temperature` | number (0…2 step 0.1) | 1 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. | — |
| Top P `generationConfig.topP` | number (0…1 step 0.01) | 0.95 | Controls nucleus sampling by limiting generation to tokens within the selected cumulative probability. | — |
| Top K `generationConfig.topK` | integer (0…+∞) | 64 | Limits token sampling to the top K most likely next tokens. | — |
| Seed `generationConfig.seed` | integer | — | Optional seed used for decoding when reproducible sampling is desired. | — |
| Reasoning · 2 params | | | | |
| Thinking budget `generationConfig.thinkingConfig.thinkingBudget` | integer (-1…24576) | -1 | Number of thinking tokens Gemini should use; 0 disables thinking and -1 uses dynamic thinking. | — |
| Include thoughts `generationConfig.thinkingConfig.includeThoughts` | boolean | false | Controls whether Gemini returns available thought summaries in the response parts. | — |
| Output · 1 param | | | | |
| Response MIME type `generationConfig.responseMimeType` | enum (text/plain \| application/json) | "text/plain" | MIME type for generated text candidates. | — |
Google Gemini 2.5 Pro 8 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
| Length · 1 param | | | | |
| Max output tokens `generationConfig.maxOutputTokens` | integer (1…65536) | — | Maximum number of tokens to include in a response candidate. | — |
| Sampling · 4 params | | | | |
| Temperature `generationConfig.temperature` | number (0…2 step 0.1) | 1 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. | — |
| Top P `generationConfig.topP` | number (0…1 step 0.01) | 0.95 | Controls nucleus sampling by limiting generation to tokens within the selected cumulative probability. | — |
| Top K `generationConfig.topK` | integer (0…+∞) | 64 | Limits token sampling to the top K most likely next tokens. | — |
| Seed `generationConfig.seed` | integer | — | Optional seed used for decoding when reproducible sampling is desired. | — |
| Reasoning · 2 params | | | | |
| Thinking budget `generationConfig.thinkingConfig.thinkingBudget` | integer (128…32768) | — | Maximum number of thinking tokens Gemini should use before producing the final answer. | — |
| Include thoughts `generationConfig.thinkingConfig.includeThoughts` | boolean | false | Controls whether Gemini returns available thought summaries in the response parts. | — |
| Output · 1 param | | | | |
| Response MIME type `generationConfig.responseMimeType` | enum (text/plain \| application/json) | "text/plain" | MIME type for generated text candidates. | — |
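
The Thinking budget rows for Gemini 2.5 Flash, Flash Lite, and Pro above list different allowed values. The sketch below transcribes those three rules into one check and builds the nested `generationConfig` object. It is illustrative only: the labels `"flash"`, `"flash-lite"`, and `"pro"` and both function names are hypothetical, and the rules come from this catalog's tables rather than from Google's API reference.

```python
def budget_is_valid(model, budget):
    """Check a thinking budget against the rule listed for each model.

    `model` is a hypothetical short label, not a real API model id.
    """
    if model == "flash":
        # Listed as integer (-1…24576): -1 is dynamic, 0 disables thinking.
        return -1 <= budget <= 24576
    if model == "flash-lite":
        # Listed as -1 (dynamic), 0 (disabled), or a fixed budget from 512 up.
        return budget in (-1, 0) or budget >= 512
    if model == "pro":
        # Listed as integer (128…32768).
        return 128 <= budget <= 32768
    raise KeyError(f"unknown model label: {model}")


def thinking_config(model, budget, include_thoughts=False):
    """Build the nested request fragment for the two Reasoning parameters."""
    if not budget_is_valid(model, budget):
        raise ValueError(f"budget {budget} is outside the range listed for {model}")
    return {
        "generationConfig": {
            "thinkingConfig": {
                "thinkingBudget": budget,
                "includeThoughts": include_thoughts,
            }
        }
    }


fragment = thinking_config("flash", -1, include_thoughts=True)
```

Keeping the per-model rule in one function makes the difference visible: under these tables, `0` is accepted for Flash and Flash Lite but not for Pro.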
Google Gemini 2.5 Pro Subscription 8 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
| Length · 1 param | ||||
|
Max output tokens
generationConfig.maxOutputTokens
|
integer (1…65536) | — | Maximum number of tokens to include in a response candidate. | — |
| Sampling · 4 params | ||||
|
Temperature
generationConfig.temperature
|
number (0…2 step 0.1) | 1 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. | — |
|
Top P
generationConfig.topP
|
number (0…1 step 0.01) | 0.95 | Controls nucleus sampling by limiting generation to tokens within the selected cumulative probability. | — |
|
Top K
generationConfig.topK
|
integer (0…+∞) | 64 | Limits token sampling to the top K most likely next tokens. | — |
|
Seed
generationConfig.seed
|
integer | — | Optional seed used for decoding when reproducible sampling is desired. | — |
| Reasoning · 2 params | ||||
|
Thinking budget
generationConfig.thinkingConfig.thinkingBudget
|
integer (128…32768) | — | Maximum number of thinking tokens Gemini should use before producing the final answer. | — |
|
Include thoughts
generationConfig.thinkingConfig.includeThoughts
|
boolean | false | Controls whether Gemini returns available thought summaries in the response parts. | — |
| Output · 1 param | ||||
|
Response MIME type
generationConfig.responseMimeType
|
enum (text/plain | application/json) | "text/plain" | MIME type for generated text candidates. | — |
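The dotted keys in the table above describe a nested JSON body. As a minimal sketch, assuming the ranges and defaults listed for Gemini 2.5 Pro Subscription, the helper below (`build_generation_config` is a hypothetical name, not part of any SDK) checks values against those ranges and nests them. Parameters with no listed default are left out unless set.

```python
from typing import Any, Dict, Optional


def build_generation_config(
    max_output_tokens: Optional[int] = None,
    temperature: float = 1.0,
    top_p: float = 0.95,
    top_k: int = 64,
    seed: Optional[int] = None,
    thinking_budget: Optional[int] = None,
    include_thoughts: bool = False,
    response_mime_type: str = "text/plain",
) -> Dict[str, Any]:
    """Hypothetical helper: check values against the catalog ranges, then nest them."""
    if max_output_tokens is not None and not 1 <= max_output_tokens <= 65536:
        raise ValueError("maxOutputTokens must be in 1..65536")
    if not 0 <= temperature <= 2:
        raise ValueError("temperature must be in 0..2")
    if not 0 <= top_p <= 1:
        raise ValueError("topP must be in 0..1")
    if top_k < 0:
        raise ValueError("topK must be >= 0")
    if thinking_budget is not None and not 128 <= thinking_budget <= 32768:
        raise ValueError("thinkingBudget must be in 128..32768")
    if response_mime_type not in ("text/plain", "application/json"):
        raise ValueError("unsupported responseMimeType")

    thinking_config: Dict[str, Any] = {"includeThoughts": include_thoughts}
    config: Dict[str, Any] = {
        "temperature": temperature,
        "topP": top_p,
        "topK": top_k,
        "thinkingConfig": thinking_config,
        "responseMimeType": response_mime_type,
    }
    # Keys without a catalog default are only sent when the caller sets them.
    if max_output_tokens is not None:
        config["maxOutputTokens"] = max_output_tokens
    if seed is not None:
        config["seed"] = seed
    if thinking_budget is not None:
        thinking_config["thinkingBudget"] = thinking_budget
    return {"generationConfig": config}
```

Calling `build_generation_config(max_output_tokens=1024, thinking_budget=2048)` returns a dictionary that can be serialized with `json.dumps` and merged into a request body.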
Google Gemini 3 Flash Preview Subscription 8 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
| Length · 1 param | ||||
|
Max output tokens
generationConfig.maxOutputTokens
|
integer (1…65536) | — | Maximum number of tokens to include in a response candidate. | — |
| Sampling · 4 params | ||||
|
Temperature
generationConfig.temperature
|
number (0…2 step 0.1) | 1 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. | — |
|
Top P
generationConfig.topP
|
number (0…1 step 0.01) | 0.95 | Controls nucleus sampling by limiting generation to tokens within the selected cumulative probability. | — |
|
Top K
generationConfig.topK
|
integer (0…+∞) | 64 | Limits token sampling to the top K most likely next tokens. | — |
|
Seed
generationConfig.seed
|
integer | — | Optional seed used for decoding when reproducible sampling is desired. | — |
| Reasoning · 2 params | ||||
|
Thinking level
generationConfig.thinkingConfig.thinkingLevel
|
enum (minimal | low | medium | high) | "high" | Controls Gemini 3 Flash reasoning effort. | — |
|
Include thoughts
generationConfig.thinkingConfig.includeThoughts
|
boolean | false | Controls whether Gemini returns available thought summaries in the response parts. | — |
| Output · 1 param | ||||
|
Response MIME type
generationConfig.responseMimeType
|
enum (text/plain | application/json) | "text/plain" | MIME type for generated text candidates. | — |
Google Gemini 3.1 Flash Lite Preview Subscription 8 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
| Length · 1 param | ||||
|
Max output tokens
generationConfig.maxOutputTokens
|
integer (1…65536) | — | Maximum number of tokens to include in a response candidate. | — |
| Sampling · 4 params | ||||
|
Temperature
generationConfig.temperature
|
number (0…2 step 0.1) | 1 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. | — |
|
Top P
generationConfig.topP
|
number (0…1 step 0.01) | 0.95 | Controls nucleus sampling by limiting generation to tokens within the selected cumulative probability. | — |
|
Top K
generationConfig.topK
|
integer (0…+∞) | 64 | Limits token sampling to the top K most likely next tokens. | — |
|
Seed
generationConfig.seed
|
integer | — | Optional seed used for decoding when reproducible sampling is desired. | — |
| Reasoning · 2 params | ||||
|
Thinking level
generationConfig.thinkingConfig.thinkingLevel
|
enum (minimal | low | medium | high) | "high" | Controls Gemini 3.1 Flash-Lite reasoning effort. | — |
|
Include thoughts
generationConfig.thinkingConfig.includeThoughts
|
boolean | false | Controls whether Gemini returns available thought summaries in the response parts. | — |
| Output · 1 param | ||||
|
Response MIME type
generationConfig.responseMimeType
|
enum (text/plain | application/json) | "text/plain" | MIME type for generated text candidates. | — |
Google Gemini 3.1 Flash Lite Subscription 8 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
| Length · 1 param | ||||
|
Max output tokens
generationConfig.maxOutputTokens
|
integer (1…65536) | — | Maximum number of tokens to include in a response candidate. | — |
| Sampling · 4 params | ||||
|
Temperature
generationConfig.temperature
|
number (0…2 step 0.1) | 1 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. | — |
|
Top P
generationConfig.topP
|
number (0…1 step 0.01) | 0.95 | Controls nucleus sampling by limiting generation to tokens within the selected cumulative probability. | — |
|
Top K
generationConfig.topK
|
integer (0…+∞) | 64 | Limits token sampling to the top K most likely next tokens. | — |
|
Seed
generationConfig.seed
|
integer | — | Optional seed used for decoding when reproducible sampling is desired. | — |
| Reasoning · 2 params | ||||
|
Thinking level
generationConfig.thinkingConfig.thinkingLevel
|
enum (minimal | low | medium | high) | "high" | Controls Gemini 3.1 Flash-Lite reasoning effort. | — |
|
Include thoughts
generationConfig.thinkingConfig.includeThoughts
|
boolean | false | Controls whether Gemini returns available thought summaries in the response parts. | — |
| Output · 1 param | ||||
|
Response MIME type
generationConfig.responseMimeType
|
enum (text/plain | application/json) | "text/plain" | MIME type for generated text candidates. | — |
Google Gemini 3.1 Pro Preview Subscription 8 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
| Length · 1 param | ||||
|
Max output tokens
generationConfig.maxOutputTokens
|
integer (1…65536) | — | Maximum number of tokens to include in a response candidate. | — |
| Sampling · 4 params | ||||
|
Temperature
generationConfig.temperature
|
number (0…2 step 0.1) | 1 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. | — |
|
Top P
generationConfig.topP
|
number (0…1 step 0.01) | 0.95 | Controls nucleus sampling by limiting generation to tokens within the selected cumulative probability. | — |
|
Top K
generationConfig.topK
|
integer (0…+∞) | 64 | Limits token sampling to the top K most likely next tokens. | — |
|
Seed
generationConfig.seed
|
integer | — | Optional seed used for decoding when reproducible sampling is desired. | — |
| Reasoning · 2 params | ||||
|
Thinking level
generationConfig.thinkingConfig.thinkingLevel
|
enum (low | high) | "high" | Controls Gemini 3.1 Pro reasoning effort. | — |
|
Include thoughts
generationConfig.thinkingConfig.includeThoughts
|
boolean | false | Controls whether Gemini returns available thought summaries in the response parts. | — |
| Output · 1 param | ||||
|
Response MIME type
generationConfig.responseMimeType
|
enum (text/plain | application/json) | "text/plain" | MIME type for generated text candidates. | — |
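The Gemini 3 family tables above replace the numeric thinking budget with an enumerated `thinkingLevel`, and the allowed values differ by model. A small sketch of that lookup follows; the dictionary keys are labels chosen for this example, not official model IDs, and the allowed sets and defaults are copied from the tables.

```python
from typing import Any, Dict, FrozenSet, Optional, Tuple

FOUR_LEVELS: FrozenSet[str] = frozenset({"minimal", "low", "medium", "high"})

# (allowed values, default) per model, as listed in the catalog tables.
# The keys are illustrative labels for this sketch only.
THINKING_LEVELS: Dict[str, Tuple[FrozenSet[str], str]] = {
    "gemini-3-flash-preview": (FOUR_LEVELS, "high"),
    "gemini-3.1-flash-lite-preview": (FOUR_LEVELS, "high"),
    "gemini-3.1-flash-lite": (FOUR_LEVELS, "high"),
    "gemini-3.1-pro-preview": (frozenset({"low", "high"}), "high"),
}


def thinking_config(
    model_label: str,
    level: Optional[str] = None,
    include_thoughts: bool = False,
) -> Dict[str, Any]:
    """Return the nested thinkingConfig body, falling back to the listed default."""
    allowed, default = THINKING_LEVELS[model_label]
    chosen = default if level is None else level
    if chosen not in allowed:
        raise ValueError(f"{chosen!r} is not listed for {model_label}")
    return {
        "generationConfig": {
            "thinkingConfig": {
                "thinkingLevel": chosen,
                "includeThoughts": include_thoughts,
            }
        }
    }
```

Keeping the allowed set per model in one table makes the Pro restriction (`low` or `high` only) fail early on the client side rather than at request time.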
Google Gemini 3.5 Flash 8 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
| Length · 1 param | ||||
|
Max output tokens
generationConfig.maxOutputTokens
|
integer (1…65536) | — | Maximum number of tokens to include in a response candidate. | — |
| Sampling · 4 params | ||||
|
Temperature
generationConfig.temperature
|
number (0…2 step 0.1) | 1 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. | — |
|
Top P
generationConfig.topP
|
number (0…1 step 0.01) | 0.95 | Controls nucleus sampling by limiting generation to tokens within the selected cumulative probability. | — |
|
Top K
generationConfig.topK
|
integer (0…+∞) | 64 | Limits token sampling to the top K most likely next tokens. | — |
|
Seed
generationConfig.seed
|
integer | — | Optional seed used for decoding when reproducible sampling is desired. | — |
| Reasoning · 2 params | ||||
|
Thinking level
generationConfig.thinkingConfig.thinkingLevel
|
enum (minimal | low | medium | high) | "medium" | Controls Gemini 3.5 Flash reasoning effort. | — |
|
Include thoughts
generationConfig.thinkingConfig.includeThoughts
|
boolean | false | Controls whether Gemini returns available thought summaries in the response parts. | — |
| Output · 1 param | ||||
|
Response MIME type
generationConfig.responseMimeType
|
enum (text/plain | application/json) | "text/plain" | MIME type for generated text candidates. | — |
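The sampling rows above (temperature, Top P, Top K) describe successive filters on the next-token distribution. The toy function below illustrates the general technique on a hand-written set of logits. It is a generic sketch, not Gemini's actual decoder, and it assumes that `top_k = 0` means "no top-k limit" and that `temperature = 0` means "always pick the most likely token".

```python
import math
from typing import Dict, List, Tuple


def filter_distribution(
    logits: Dict[str, float],
    temperature: float = 1.0,
    top_k: int = 0,
    top_p: float = 1.0,
) -> Dict[str, float]:
    """Apply temperature, then top-k, then top-p, and renormalize what is left."""
    if temperature < 0:
        raise ValueError("temperature must be >= 0")
    ranked = sorted(logits.items(), key=lambda kv: kv[1], reverse=True)
    if temperature == 0:
        # Assumption for this sketch: zero temperature is greedy decoding.
        return {ranked[0][0]: 1.0}

    # Softmax over temperature-scaled logits (shifted by the peak for stability).
    peak = ranked[0][1] / temperature
    weights = [(tok, math.exp(logit / temperature - peak)) for tok, logit in ranked]
    total = sum(w for _, w in weights)
    probs: List[Tuple[str, float]] = [(tok, w / total) for tok, w in weights]

    # Top-k: keep only the k most likely tokens (0 disables the limit here).
    if top_k > 0:
        probs = probs[:top_k]

    # Top-p: keep the smallest prefix whose cumulative probability reaches top_p.
    kept: List[Tuple[str, float]] = []
    cumulative = 0.0
    for tok, p in probs:
        kept.append((tok, p))
        cumulative += p
        if cumulative >= top_p:
            break

    norm = sum(p for _, p in kept)
    return {tok: p / norm for tok, p in kept}
```

For example, with logits `{"a": 2.0, "b": 1.0, "c": 0.0}` and `top_p=0.9`, the least likely token `c` is dropped because `a` and `b` together already cover more than 90% of the probability mass.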
Alibaba 8
Alibaba Qwen Flash 5 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
| Length · 1 param | ||||
|
Max tokens
max_tokens
|
integer (1…+∞) | — | Maximum number of output tokens the model may generate. | — |
| Sampling · 3 params | ||||
|
Temperature
temperature
|
number (0…2 step 0.1) | — | Controls randomness. Lower values make outputs more focused; higher values make them more varied. | — |
|
Top P
top_p
|
number (0…1 step 0.01) | — | Controls nucleus sampling by limiting generation to tokens within the selected cumulative probability. | — |
|
Top K
extra_body.top_k
|
integer (1…+∞) | 20 | Limits generation to the selected number of highest-probability tokens. | — |
| Reasoning · 1 param | ||||
|
Enable thinking
extra_body.chat_template_kwargs.enable_thinking
|
boolean | true | Controls Qwen3 thinking mode when using OpenAI-compatible clients that pass provider-specific extra body fields. | — |
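Two of the Qwen parameters above live under `extra_body`, so they are nested rather than top-level fields. A minimal sketch of expanding the dotted keys from the table into that nested shape follows; `nest` is a hypothetical helper written for this example, and the values are arbitrary.

```python
from typing import Any, Dict


def nest(params: Dict[str, Any]) -> Dict[str, Any]:
    """Expand dotted catalog keys such as 'extra_body.top_k' into nested dicts."""
    body: Dict[str, Any] = {}
    for path, value in params.items():
        *parents, leaf = path.split(".")
        node = body
        for key in parents:
            # Create intermediate objects on demand and walk down into them.
            node = node.setdefault(key, {})
        node[leaf] = value
    return body


request_kwargs = nest(
    {
        "max_tokens": 512,
        "temperature": 0.7,
        "extra_body.top_k": 20,
        "extra_body.chat_template_kwargs.enable_thinking": False,
    }
)
```

With an OpenAI-compatible Python client, a dictionary like `request_kwargs` would typically be unpacked as keyword arguments to the chat completion call, with `extra_body` carrying the provider-specific fields. Check the client and endpoint you use before relying on this.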
Alibaba Qwen Plus 5 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
| Length · 1 param | ||||
|
Max tokens
max_tokens
|
integer (1…+∞) | — | Maximum number of output tokens the model may generate. | — |
| Sampling · 3 params | ||||
|
Temperature
temperature
|
number (0…2 step 0.1) | — | Controls randomness. Lower values make outputs more focused; higher values make them more varied. | — |
|
Top P
top_p
|
number (0…1 step 0.01) | — | Controls nucleus sampling by limiting generation to tokens within the selected cumulative probability. | — |
|
Top K
extra_body.top_k
|
integer (1…+∞) | 20 | Limits generation to the selected number of highest-probability tokens. | — |
| Reasoning · 1 param | ||||
|
Enable thinking
extra_body.chat_template_kwargs.enable_thinking
|
boolean | true | Controls Qwen3 thinking mode when using OpenAI-compatible clients that pass provider-specific extra body fields. | — |
Alibaba Qwen3 Coder Flash 4 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
| Length · 1 param | ||||
|
Max tokens
max_tokens
|
integer (1…+∞) | — | Maximum number of output tokens the model may generate. | — |
| Sampling · 3 params | ||||
|
Temperature
temperature
|
number (0…2 step 0.1) | — | Controls randomness. Lower values make outputs more focused; higher values make them more varied. | — |
|
Top P
top_p
|
number (0…1 step 0.01) | — | Controls nucleus sampling by limiting generation to tokens within the selected cumulative probability. | — |
|
Top K
extra_body.top_k
|
integer (1…+∞) | 20 | Limits generation to the selected number of highest-probability tokens. | — |
Alibaba Qwen3 Coder Plus 4 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
| Length · 1 param | ||||
|
Max tokens
max_tokens
|
integer (1…+∞) | — | Maximum number of output tokens the model may generate. | — |
| Sampling · 3 params | ||||
|
Temperature
temperature
|
number (0…2 step 0.1) | — | Controls randomness. Lower values make outputs more focused; higher values make them more varied. | — |
|
Top P
top_p
|
number (0…1 step 0.01) | — | Controls nucleus sampling by limiting generation to tokens within the selected cumulative probability. | — |
|
Top K
extra_body.top_k
|
integer (1…+∞) | 20 | Limits generation to the selected number of highest-probability tokens. | — |
Alibaba Qwen3 Max 5 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
| Length · 1 param | ||||
|
Max tokens
max_tokens
|
integer (1…+∞) | — | Maximum number of output tokens the model may generate. | — |
| Sampling · 3 params | ||||
|
Temperature
temperature
|
number (0…2 step 0.1) | — | Controls randomness. Lower values make outputs more focused; higher values make them more varied. | — |
|
Top P
top_p
|
number (0…1 step 0.01) | — | Controls nucleus sampling by limiting generation to tokens within the selected cumulative probability. | — |
|
Top K
extra_body.top_k
|
integer (1…+∞) | 20 | Limits generation to the selected number of highest-probability tokens. | — |
| Reasoning · 1 param | ||||
|
Enable thinking
extra_body.chat_template_kwargs.enable_thinking
|
boolean | false | Controls Qwen3 thinking mode when using OpenAI-compatible clients that pass provider-specific extra body fields. | — |
Alibaba Qwen3.5 5 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
| Length · 1 param | ||||
|
Max tokens
max_tokens
|
integer (1…+∞) | — | Maximum number of output tokens the model may generate. | — |
| Sampling · 3 params | ||||
|
Temperature
temperature
|
number (0…2 step 0.1) | — | Controls randomness. Lower values make outputs more focused; higher values make them more varied. | — |
|
Top P
top_p
|
number (0…1 step 0.01) | — | Controls nucleus sampling by limiting generation to tokens within the selected cumulative probability. | — |
|
Top K
extra_body.top_k
|
integer (1…+∞) | 20 | Limits generation to the selected number of highest-probability tokens. | — |
| Reasoning · 1 param | ||||
|
Enable thinking
extra_body.chat_template_kwargs.enable_thinking
|
boolean | true | Controls Qwen3 thinking mode when using OpenAI-compatible clients that pass provider-specific extra body fields. | — |
Alibaba Qwen3.5 Flash 5 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
| Length · 1 param | ||||
|
Max tokens
max_tokens
|
integer (1…+∞) | — | Maximum number of output tokens the model may generate. | — |
| Sampling · 3 params | ||||
|
Temperature
temperature
|
number (0…2 step 0.1) | — | Controls randomness. Lower values make outputs more focused; higher values make them more varied. | — |
|
Top P
top_p
|
number (0…1 step 0.01) | — | Controls nucleus sampling by limiting generation to tokens within the selected cumulative probability. | — |
|
Top K
extra_body.top_k
|
integer (1…+∞) | 20 | Limits generation to the selected number of highest-probability tokens. | — |
| Reasoning · 1 param | ||||
|
Enable thinking
extra_body.chat_template_kwargs.enable_thinking
|
boolean | true | Controls Qwen3 thinking mode when using OpenAI-compatible clients that pass provider-specific extra body fields. | — |
Alibaba Qwq Plus 4 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
| Length · 1 param | ||||
|
Max tokens
max_tokens
|
integer (1…+∞) | — | Maximum number of output tokens the model may generate. | — |
| Sampling · 3 params | ||||
|
Temperature
temperature
|
number (0…2 step 0.1) | — | Controls randomness. Lower values make outputs more focused; higher values make them more varied. | — |
|
Top P
top_p
|
number (0…1 step 0.01) | — | Controls nucleus sampling by limiting generation to tokens within the selected cumulative probability. | — |
|
Top K
extra_body.top_k
|
integer (1…+∞) | 20 | Limits generation to the selected number of highest-probability tokens. | — |
Cohere 8
Cohere Command A 03 2025 12 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
| Length · 2 params | ||||
|
Max tokens
max_tokens
|
integer (1…+∞) | — | Maximum number of output tokens the model may generate. | — |
|
Stop sequences
stop_sequences
|
string | — | Stops generation when one of these sequences is detected; up to five are allowed. | — |
| Sampling · 6 params | ||||
|
Temperature
temperature
|
number (0…+∞ step 0.1) | 0.3 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. | — |
|
Top P
p
|
number (0.01…0.99 step 0.01) | 0.75 | Controls nucleus sampling by limiting generation to tokens within the selected cumulative probability. | — |
|
Top K
k
|
integer (0…500) | 0 | Limits sampling to the K most likely tokens; 0 disables top-k sampling. | — |
|
Frequency penalty
frequency_penalty
|
number (0…1 step 0.1) | 0 | Penalizes tokens proportional to how often they have already appeared to reduce repetition. | — |
|
Presence penalty
presence_penalty
|
number (0…1 step 0.1) | 0 | Penalizes tokens that have already appeared to encourage a wider variety of content. | — |
|
Seed
seed
|
integer | — | Seed used for best-effort deterministic sampling when reproducible outputs are desired. | — |
| Tools · 1 param | ||||
|
Tool choice
tool_choice
|
enum (REQUIRED | NONE) | — | Forces the model to either call a tool or skip tool calls for this request. | — |
| Output · 1 param | ||||
|
Response format
response_format.type
|
enum (text | json_object) | "text" | Controls whether the model returns normal text or JSON object output. | — |
| Observability · 1 param | ||||
|
Log probabilities
logprobs
|
boolean | false | Controls whether the response includes log probabilities for the generated tokens. | — |
| Metadata · 1 param | ||||
|
Safety mode
safety_mode
|
enum (CONTEXTUAL | STRICT) | "CONTEXTUAL" | Controls Cohere's built-in safety instructions applied to the generation. | — |
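Cohere's ranges differ from the OpenAI-style ones elsewhere in this catalog (`p` stops at 0.99, `k` at 500, penalties at 1). The sketch below checks a parameter dictionary against the bounds listed above. `check_cohere_params` is a hypothetical helper, and it assumes `stop_sequences` is passed as a list of strings.

```python
from typing import Any, Dict, List, Optional, Tuple

# (min, max) bounds copied from the table; None means unbounded on that side.
COHERE_RANGES: Dict[str, Tuple[Optional[float], Optional[float]]] = {
    "max_tokens": (1, None),
    "temperature": (0, None),
    "p": (0.01, 0.99),
    "k": (0, 500),
    "frequency_penalty": (0, 1),
    "presence_penalty": (0, 1),
}
MAX_STOP_SEQUENCES = 5


def check_cohere_params(params: Dict[str, Any]) -> List[str]:
    """Return a list of human-readable problems; an empty list means all checks passed."""
    problems: List[str] = []
    for name, (low, high) in COHERE_RANGES.items():
        if name not in params:
            continue
        value = params[name]
        too_low = low is not None and value < low
        too_high = high is not None and value > high
        if too_low or too_high:
            problems.append(f"{name}={value} is outside the catalog range")
    # Assumption: stop_sequences is a list of strings.
    stops = params.get("stop_sequences", [])
    if len(stops) > MAX_STOP_SEQUENCES:
        problems.append(f"stop_sequences has {len(stops)} entries; at most 5 are allowed")
    return problems
```

Returning a list of problems instead of raising on the first one lets a caller report every out-of-range value at once.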
Cohere Command A Plus 05 2026 12 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
| Length · 2 params | ||||
|
Max tokens
max_tokens
|
integer (1…+∞) | — | Maximum number of output tokens the model may generate. | — |
|
Stop sequences
stop_sequences
|
string | — | Stops generation when one of these sequences is detected; up to five are allowed. | — |
| Sampling · 6 params | ||||
|
Temperature
temperature
|
number (0…+∞ step 0.1) | 0.3 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. | — |
|
Top P
p
|
number (0.01…0.99 step 0.01) | 0.75 | Controls nucleus sampling by limiting generation to tokens within the selected cumulative probability. | — |
|
Top K
k
|
integer (0…500) | 0 | Limits sampling to the K most likely tokens; 0 disables top-k sampling. | — |
|
Frequency penalty
frequency_penalty
|
number (0…1 step 0.1) | 0 | Penalizes tokens proportional to how often they have already appeared to reduce repetition. | — |
|
Presence penalty
presence_penalty
|
number (0…1 step 0.1) | 0 | Penalizes tokens that have already appeared to encourage a wider variety of content. | — |
|
Seed
seed
|
integer | — | Seed used for best-effort deterministic sampling when reproducible outputs are desired. | — |
| Tools · 1 param | ||||
|
Tool choice
tool_choice
|
enum (REQUIRED | NONE) | — | Forces the model to either call a tool or skip tool calls for this request. | — |
| Output · 1 param | ||||
|
Response format
response_format.type
|
enum (text | json_object) | "text" | Controls whether the model returns normal text or JSON object output. | — |
| Observability · 1 param | ||||
|
Log probabilities
logprobs
|
boolean | false | Controls whether the response includes log probabilities for the generated tokens. | — |
| Metadata · 1 param | ||||
|
Safety mode
safety_mode
|
enum (CONTEXTUAL | STRICT) | "CONTEXTUAL" | Controls Cohere's built-in safety instructions applied to the generation. | — |
Cohere Command A Reasoning 08 2025 14 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
| Length · 2 params | ||||
|
Max tokens
max_tokens
|
integer (1…+∞) | — | Maximum number of output tokens the model may generate. | — |
|
Stop sequences
stop_sequences
|
string | — | Stops generation when one of these sequences is detected; up to five are allowed. | — |
| Sampling · 6 params | ||||
|
Temperature
temperature
|
number (0…+∞ step 0.1) | 0.3 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. | — |
|
Top P
p
|
number (0.01…0.99 step 0.01) | 0.75 | Controls nucleus sampling by limiting generation to tokens within the selected cumulative probability. | — |
|
Top K
k
|
integer (0…500) | 0 | Limits sampling to the K most likely tokens; 0 disables top-k sampling. | — |
|
Frequency penalty
frequency_penalty
|
number (0…1 step 0.1) | 0 | Penalizes tokens proportional to how often they have already appeared to reduce repetition. | — |
|
Presence penalty
presence_penalty
|
number (0…1 step 0.1) | 0 | Penalizes tokens that have already appeared to encourage a wider variety of content. | — |
|
Seed
seed
|
integer | — | Seed used for best-effort deterministic sampling when reproducible outputs are desired. | — |
| Reasoning · 2 params | ||||
|
Thinking mode
thinking.type
|
enum (enabled | disabled) | "disabled" | Controls whether the model reasons step by step before producing its final answer. | — |
|
Thinking token budget
thinking.token_budget
|
integer (1…+∞) | — | Maximum number of tokens the model may spend on reasoning before answering. | Only when thinking.type = "enabled" |
| Tools · 1 param | ||||
|
Tool choice
tool_choice
|
enum (REQUIRED | NONE) | — | Forces the model to either call a tool or skip tool calls for this request. | — |
| Output · 1 param | ||||
|
Response format
response_format.type
|
enum (text | json_object) | "text" | Controls whether the model returns normal text or JSON object output. | — |
| Observability · 1 param | ||||
|
Log probabilities
logprobs
|
boolean | false | Controls whether the response includes log probabilities for the generated tokens. | — |
| Metadata · 1 param | ||||
|
Safety mode
safety_mode
|
enum (CONTEXTUAL | STRICT) | "CONTEXTUAL" | Controls Cohere's built-in safety instructions applied to the generation. | — |
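This is the only model in this part of the catalog with a conditional parameter: `thinking.token_budget` applies only when `thinking.type` is `"enabled"`. A minimal sketch of enforcing that dependency on the client side follows; `build_thinking` is a hypothetical helper name.

```python
from typing import Any, Dict, Optional


def build_thinking(enabled: bool = False, token_budget: Optional[int] = None) -> Dict[str, Any]:
    """Build the nested 'thinking' object, rejecting a budget when thinking is disabled."""
    if token_budget is not None:
        if not enabled:
            raise ValueError('thinking.token_budget only applies when thinking.type = "enabled"')
        if token_budget < 1:
            raise ValueError("thinking.token_budget must be >= 1")
    # The catalog lists "disabled" as the default for thinking.type.
    thinking: Dict[str, Any] = {"type": "enabled" if enabled else "disabled"}
    if token_budget is not None:
        thinking["token_budget"] = token_budget
    return {"thinking": thinking}
```

For example, `build_thinking(enabled=True, token_budget=2000)` yields `{"thinking": {"type": "enabled", "token_budget": 2000}}`, while passing a budget with `enabled=False` raises `ValueError`.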
Cohere Command A Translate 08 2025 12 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
| Length · 2 params | ||||
|
Max tokens
max_tokens
|
integer (1…+∞) | — | Maximum number of output tokens the model may generate. | — |
|
Stop sequences
stop_sequences
|
string | — | Stops generation when one of these sequences is detected; up to five are allowed. | — |
| Sampling · 6 params | ||||
|
Temperature
temperature
|
number (0…+∞ step 0.1) | 0.3 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. | — |
|
Top P
p
|
number (0.01…0.99 step 0.01) | 0.75 | Controls nucleus sampling by limiting generation to tokens within the selected cumulative probability. | — |
|
Top K
k
|
integer (0…500) | 0 | Limits sampling to the K most likely tokens; 0 disables top-k sampling. | — |
|
Frequency penalty
frequency_penalty
|
number (0…1 step 0.1) | 0 | Penalizes tokens proportional to how often they have already appeared to reduce repetition. | — |
|
Presence penalty
presence_penalty
|
number (0…1 step 0.1) | 0 | Penalizes tokens that have already appeared to encourage a wider variety of content. | — |
|
Seed
seed
|
integer | — | Seed used for best-effort deterministic sampling when reproducible outputs are desired. | — |
| Tools · 1 param | ||||
|
Tool choice
tool_choice
|
enum (REQUIRED | NONE) | — | Forces the model to either call a tool or skip tool calls for this request. | — |
| Output · 1 param | ||||
|
Response format
response_format.type
|
enum (text | json_object) | "text" | Controls whether the model returns normal text or JSON object output. | — |
| Observability · 1 param | ||||
|
Log probabilities
logprobs
|
boolean | false | Controls whether the response includes log probabilities for the generated tokens. | — |
| Metadata · 1 param | ||||
|
Safety mode
safety_mode
|
enum (CONTEXTUAL | STRICT) | "CONTEXTUAL" | Controls Cohere's built-in safety instructions applied to the generation. | — |
Cohere Command A Vision 07 2025 12 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
| Length · 2 params | ||||
|
Max tokens
max_tokens
|
integer (1…+∞) | — | Maximum number of output tokens the model may generate. | — |
|
Stop sequences
stop_sequences
|
string | — | Stops generation when one of these sequences is detected; up to five are allowed. | — |
| Sampling · 6 params | ||||
|
Temperature
temperature
|
number (0…+∞ step 0.1) | 0.3 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. | — |
|
Top P
p
|
number (0.01…0.99 step 0.01) | 0.75 | Controls nucleus sampling by limiting generation to tokens within the selected cumulative probability. | — |
|
Top K
k
|
integer (0…500) | 0 | Limits sampling to the K most likely tokens; 0 disables top-k sampling. | — |
|
Frequency penalty
frequency_penalty
|
number (0…1 step 0.1) | 0 | Penalizes tokens proportional to how often they have already appeared to reduce repetition. | — |
|
Presence penalty
presence_penalty
|
number (0…1 step 0.1) | 0 | Penalizes tokens that have already appeared to encourage a wider variety of content. | — |
|
Seed
seed
|
integer | — | Seed used for best-effort deterministic sampling when reproducible outputs are desired. | — |
| Tools · 1 param | ||||
|
Tool choice
tool_choice
|
enum (REQUIRED | NONE) | — | Forces the model to either call a tool or skip tool calls for this request. | — |
| Output · 1 param | ||||
|
Response format
response_format.type
|
enum (text | json_object) | "text" | Controls whether the model returns normal text or JSON object output. | — |
| Observability · 1 param | ||||
|
Log probabilities
logprobs
|
boolean | false | Controls whether the response includes log probabilities for the generated tokens. | — |
| Metadata · 1 param | ||||
|
Safety mode
safety_mode
|
enum (CONTEXTUAL | STRICT) | "CONTEXTUAL" | Controls Cohere's built-in safety instructions applied to the generation. | — |
Cohere Command R 08 2024 11 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
| Length · 2 params | ||||
|
Max tokens
max_tokens
|
integer (1…+∞) | — | Maximum number of output tokens the model may generate. | — |
|
Stop sequences
stop_sequences
|
string | — | Stops generation when one of these sequences is detected; up to five are allowed. | — |
| Sampling · 6 params | ||||
|
Temperature
temperature
|
number (0…+∞ step 0.1) | 0.3 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. | — |
|
Top P
p
|
number (0.01…0.99 step 0.01) | 0.75 | Controls nucleus sampling by limiting generation to tokens within the selected cumulative probability. | — |
|
Top K
k
|
integer (0…500) | 0 | Limits sampling to the K most likely tokens; 0 disables top-k sampling. | — |
|
Frequency penalty
frequency_penalty
|
number (0…1 step 0.1) | 0 | Penalizes tokens proportional to how often they have already appeared to reduce repetition. | — |
|
Presence penalty
presence_penalty
|
number (0…1 step 0.1) | 0 | Penalizes tokens that have already appeared to encourage a wider variety of content. | — |
|
Seed
seed
|
integer | — | Seed used for best-effort deterministic sampling when reproducible outputs are desired. | — |
| Output · 1 param | ||||
|
Response format
response_format.type
|
enum (text | json_object) | "text" | Controls whether the model returns normal text or JSON object output. | — |
| Observability · 1 param | ||||
|
Log probabilities
logprobs
|
boolean | false | Controls whether the response includes log probabilities for the generated tokens. | — |
| Metadata · 1 param | ||||
|
Safety mode
safety_mode
|
enum (CONTEXTUAL | STRICT | OFF) | "CONTEXTUAL" | Controls Cohere's built-in safety instructions applied to the generation. | — |
Cohere Command R Plus 08 2024 11 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
| Length · 2 params | ||||
|
Max tokens
max_tokens
|
integer (1…+∞) | — | Maximum number of output tokens the model may generate. | — |
|
Stop sequences
stop_sequences
|
string | — | Stops generation when one of these sequences is detected; up to five are allowed. | — |
| Sampling · 6 params | ||||
|
Temperature
temperature
|
number (0…+∞ step 0.1) | 0.3 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. | — |
|
Top P
p
|
number (0.01…0.99 step 0.01) | 0.75 | Controls nucleus sampling by limiting generation to tokens within the selected cumulative probability. | — |
|
Top K
k
|
integer (0…500) | 0 | Limits sampling to the K most likely tokens; 0 disables top-k sampling. | — |
|
Frequency penalty
frequency_penalty
|
number (0…1 step 0.1) | 0 | Penalizes tokens proportional to how often they have already appeared to reduce repetition. | — |
|
Presence penalty
presence_penalty
|
number (0…1 step 0.1) | 0 | Penalizes tokens that have already appeared to encourage a wider variety of content. | — |
|
Seed
seed
|
integer | — | Seed used for best-effort deterministic sampling when reproducible outputs are desired. | — |
| Output · 1 param | ||||
|
Response format
response_format.type
|
enum (text | json_object) | "text" | Controls whether the model returns normal text or JSON object output. | — |
| Observability · 1 param | ||||
|
Log probabilities
logprobs
|
boolean | false | Controls whether the response includes log probabilities for the generated tokens. | — |
| Metadata · 1 param | ||||
|
Safety mode
safety_mode
|
enum (CONTEXTUAL | STRICT | OFF) | "CONTEXTUAL" | Controls Cohere's built-in safety instructions applied to the generation. | — |
Cohere Command R7b 12 2024 12 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
| Length · 2 params | ||||
|
Max tokens
max_tokens
|
integer (1…+∞) | — | Maximum number of output tokens the model may generate. | — |
|
Stop sequences
stop_sequences
|
string | — | Stops generation when one of these sequences is detected; up to five are allowed. | — |
| Sampling · 6 params | ||||
|
Temperature
temperature
|
number (0…+∞ step 0.1) | 0.3 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. | — |
|
Top P
p
|
number (0.01…0.99 step 0.01) | 0.75 | Controls nucleus sampling by limiting generation to tokens within the selected cumulative probability. | — |
|
Top K
k
|
integer (0…500) | 0 | Limits sampling to the K most likely tokens; 0 disables top-k sampling. | — |
|
Frequency penalty
frequency_penalty
|
number (0…1 step 0.1) | 0 | Penalizes tokens proportional to how often they have already appeared to reduce repetition. | — |
|
Presence penalty
presence_penalty
|
number (0…1 step 0.1) | 0 | Penalizes tokens that have already appeared to encourage a wider variety of content. | — |
|
Seed
seed
|
integer | — | Seed used for best-effort deterministic sampling when reproducible outputs are desired. | — |
| Tools · 1 param | ||||
|
Tool choice
tool_choice
|
enum (REQUIRED | NONE) | — | Forces the model to either call a tool or skip tool calls for this request. | — |
| Output · 1 param | ||||
|
Response format
response_format.type
|
enum (text | json_object) | "text" | Controls whether the model returns normal text or JSON object output. | — |
| Observability · 1 param | ||||
|
Log probabilities
logprobs
|
boolean | false | Controls whether the response includes log probabilities for the generated tokens. | — |
| Metadata · 1 param | ||||
|
Safety mode
safety_mode
|
enum (CONTEXTUAL | STRICT) | "CONTEXTUAL" | Controls Cohere's built-in safety instructions applied to the generation. | — |
Moonshot AI 5
Moonshot AI Kimi K2.5 3 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
| Length · 1 param | ||||
|
Max completion tokens
max_completion_tokens
|
integer (1…+∞) | — | Maximum number of tokens to generate in the chat completion. | — |
| Reasoning · 1 param | ||||
|
Thinking mode
thinking.type
|
enum (enabled | disabled) | — | Controls whether Kimi reasons step by step before answering, or responds directly when set to disabled. | — |
| Output · 1 param | ||||
|
Response format
response_format.type
|
enum (text | json_object) | "text" | Forces the response into plain text or a JSON object. | — |
Moonshot AI Kimi K2.6 3 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
| Length · 1 param | ||||
|
Max completion tokens
max_completion_tokens
|
integer (1…+∞) | — | Maximum number of tokens to generate in the chat completion. | — |
| Reasoning · 1 param | ||||
|
Thinking mode
thinking.type
|
enum (enabled | disabled) | "enabled" | Controls whether Kimi reasons step by step before answering. Thinking is enabled by default; set disabled to respond directly. | — |
| Output · 1 param | ||||
|
Response format
response_format.type
|
enum (text | json_object) | "text" | Forces the response into plain text or a JSON object. | — |
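The Kimi K2.5 and K2.6 tables list the same three keys but differ in the `thinking.type` default: none is listed for K2.5, and `"enabled"` is listed for K2.6. The sketch below builds a request fragment that only sends `thinking` when the caller chooses a value, so the listed default applies otherwise. `kimi_body` is a hypothetical helper.

```python
from typing import Any, Dict, Optional


def kimi_body(
    max_completion_tokens: Optional[int] = None,
    thinking: Optional[bool] = None,
    json_output: bool = False,
) -> Dict[str, Any]:
    """Build the Kimi-specific part of a chat request from the three catalog keys."""
    body: Dict[str, Any] = {
        "response_format": {"type": "json_object" if json_output else "text"}
    }
    if max_completion_tokens is not None:
        if max_completion_tokens < 1:
            raise ValueError("max_completion_tokens must be >= 1")
        body["max_completion_tokens"] = max_completion_tokens
    # None means "do not send the key", leaving the choice to the model's default.
    if thinking is not None:
        body["thinking"] = {"type": "enabled" if thinking else "disabled"}
    return body
```

Passing `thinking=False` asks for a direct answer, and leaving it as `None` omits the key entirely.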
Moonshot AI Moonshot v1 128K 7 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
| Length · 2 params | ||||
| Max completion tokens `max_completion_tokens` | integer (1…+∞) | — | Maximum number of tokens to generate in the chat completion. | — |
| Number of completions `n` | integer (1…5) | 1 | How many chat completion choices to generate for the request. | — |
| Sampling · 4 params | ||||
| Temperature `temperature` | number (0…1 step 0.1) | 0.3 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. | — |
| Top P `top_p` | number (0…1 step 0.01) | 1 | Controls nucleus sampling by limiting generation to tokens within the selected cumulative probability. | — |
| Presence penalty `presence_penalty` | number (-2…2 step 0.1) | 0 | Penalizes tokens that have already appeared, encouraging the model to talk about new topics. | — |
| Frequency penalty `frequency_penalty` | number (-2…2 step 0.1) | 0 | Penalizes tokens by how often they have appeared, reducing verbatim repetition. | — |
| Output · 1 param | ||||
| Response format `response_format.type` | enum (text \| json_object) | "text" | Forces the response into plain text or a JSON object. | — |
Moonshot AI Moonshot v1 32K 7 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
| Length · 2 params | ||||
| Max completion tokens `max_completion_tokens` | integer (1…+∞) | — | Maximum number of tokens to generate in the chat completion. | — |
| Number of completions `n` | integer (1…5) | 1 | How many chat completion choices to generate for the request. | — |
| Sampling · 4 params | ||||
| Temperature `temperature` | number (0…1 step 0.1) | 0.3 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. | — |
| Top P `top_p` | number (0…1 step 0.01) | 1 | Controls nucleus sampling by limiting generation to tokens within the selected cumulative probability. | — |
| Presence penalty `presence_penalty` | number (-2…2 step 0.1) | 0 | Penalizes tokens that have already appeared, encouraging the model to talk about new topics. | — |
| Frequency penalty `frequency_penalty` | number (-2…2 step 0.1) | 0 | Penalizes tokens by how often they have appeared, reducing verbatim repetition. | — |
| Output · 1 param | ||||
| Response format `response_format.type` | enum (text \| json_object) | "text" | Forces the response into plain text or a JSON object. | — |
Moonshot AI Moonshot v1 8K 7 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
| Length · 2 params | ||||
| Max completion tokens `max_completion_tokens` | integer (1…+∞) | — | Maximum number of tokens to generate in the chat completion. | — |
| Number of completions `n` | integer (1…5) | 1 | How many chat completion choices to generate for the request. | — |
| Sampling · 4 params | ||||
| Temperature `temperature` | number (0…1 step 0.1) | 0.3 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. | — |
| Top P `top_p` | number (0…1 step 0.01) | 1 | Controls nucleus sampling by limiting generation to tokens within the selected cumulative probability. | — |
| Presence penalty `presence_penalty` | number (-2…2 step 0.1) | 0 | Penalizes tokens that have already appeared, encouraging the model to talk about new topics. | — |
| Frequency penalty `frequency_penalty` | number (-2…2 step 0.1) | 0 | Penalizes tokens by how often they have appeared, reducing verbatim repetition. | — |
| Output · 1 param | ||||
| Response format `response_format.type` | enum (text \| json_object) | "text" | Forces the response into plain text or a JSON object. | — |
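
The Moonshot v1 tables list a narrower temperature range (0…1) than several other providers in this catalog, so values carried over from another provider's settings may fall out of range. The following is a small sketch of a client-side range check built only from the ranges shown above; the names `MOONSHOT_V1_RANGES` and `check_moonshot_v1` are hypothetical, and the check does not reflect how Moonshot itself validates requests.

```python
# Ranges copied from the Moonshot v1 tables above: (minimum, maximum).
MOONSHOT_V1_RANGES = {
    "temperature": (0.0, 1.0),
    "top_p": (0.0, 1.0),
    "presence_penalty": (-2.0, 2.0),
    "frequency_penalty": (-2.0, 2.0),
    "n": (1, 5),
}


def check_moonshot_v1(params: dict) -> list[str]:
    """Return one message per parameter that falls outside its listed range."""
    problems = []
    for name, value in params.items():
        if name not in MOONSHOT_V1_RANGES:
            continue  # not a ranged parameter in this sketch
        low, high = MOONSHOT_V1_RANGES[name]
        if not low <= value <= high:
            problems.append(f"{name}={value} is outside {low}..{high}")
    return problems


# A temperature of 1.5 fits a 0..2 range elsewhere in the catalog, but not here.
issues = check_moonshot_v1({"temperature": 1.5, "n": 2, "top_p": 0.9})
```
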
xAI 5
xAI Grok 4.20 0309 Non Reasoning 6 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
| Length · 2 params | ||||
| Max completion tokens `max_completion_tokens` | integer (1…+∞) | — | Upper bound for visible output tokens generated in the chat completion. | — |
| Stop sequence `stop` | string | — | Stops generation when this sequence is produced. xAI accepts up to four stop sequences. | — |
| Sampling · 3 params | ||||
| Temperature `temperature` | number (0…2 step 0.1) | 1 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. | — |
| Top P `top_p` | number (0…1 step 0.01) | 1 | Controls nucleus sampling by limiting generation to tokens within the selected cumulative probability. | — |
| Seed `seed` | integer | — | Optional seed used for decoding when reproducible sampling is desired. | — |
| Output · 1 param | ||||
| Response format `response_format.type` | enum (text \| json_object \| json_schema) | "text" | Controls whether the model returns text, JSON mode output, or structured JSON schema output. | — |
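
The `stop` row above lists the type as a single string while also noting a limit of four sequences. A minimal sketch of normalizing caller input before sending it, assuming the field accepts either one string or a list of strings (an assumption based on common chat-completion APIs, not confirmed by this catalog); `normalize_stop` is a hypothetical helper name.

```python
from typing import Union

MAX_STOP_SEQUENCES = 4  # limit stated in the stop row above


def normalize_stop(stop: Union[str, list[str], None]) -> list[str]:
    """Return stop sequences as a de-duplicated list, enforcing the limit."""
    if stop is None:
        return []
    sequences = [stop] if isinstance(stop, str) else list(stop)
    # Drop empty strings and repeats while keeping the original order.
    unique = [s for s in dict.fromkeys(sequences) if s]
    if len(unique) > MAX_STOP_SEQUENCES:
        raise ValueError(
            f"{len(unique)} stop sequences given, at most {MAX_STOP_SEQUENCES} allowed"
        )
    return unique


stops = normalize_stop(["\n\n", "END", "END"])
```
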
xAI Grok 4.20 0309 Reasoning 5 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
| Length · 1 param | ||||
| Max completion tokens `max_completion_tokens` | integer (1…+∞) | — | Upper bound for visible output tokens generated in the chat completion. | — |
| Sampling · 3 params | ||||
| Temperature `temperature` | number (0…2 step 0.1) | 1 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. | — |
| Top P `top_p` | number (0…1 step 0.01) | 1 | Controls nucleus sampling by limiting generation to tokens within the selected cumulative probability. | — |
| Seed `seed` | integer | — | Optional seed used for decoding when reproducible sampling is desired. | — |
| Output · 1 param | ||||
| Response format `response_format.type` | enum (text \| json_object \| json_schema) | "text" | Controls whether the model returns text, JSON mode output, or structured JSON schema output. | — |
xAI Grok 4.20 Multi Agent 0309 5 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
| Length · 1 param | ||||
| Max output tokens `max_output_tokens` | integer (1…+∞) | — | Upper bound for output tokens generated in the Responses API response. | — |
| Sampling · 2 params | ||||
| Temperature `temperature` | number (0…2 step 0.1) | 0.7 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. | — |
| Top P `top_p` | number (0…1 step 0.01) | 0.95 | Controls nucleus sampling by limiting generation to tokens within the selected cumulative probability. | — |
| Reasoning · 1 param | ||||
| Reasoning effort `reasoning.effort` | enum (low \| medium \| high \| xhigh) | — | Controls whether the Responses API request uses the 4-agent or 16-agent multi-agent setup. | — |
| Output · 1 param | ||||
| Text format `text.format.type` | enum (text \| json_object \| json_schema) | "text" | Controls whether the Responses API returns free-form text, JSON mode output, or structured JSON schema output. | — |
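
Several parameter keys in this catalog are written as dotted paths, such as `reasoning.effort` and `text.format.type` above. Assuming each dot denotes one level of nesting in the JSON request body (an interpretation, not something the catalog states outright), a flat parameter map could be expanded with a sketch like the one below. `expand_dotted` is a hypothetical helper name.

```python
from typing import Any


def expand_dotted(params: dict[str, Any]) -> dict[str, Any]:
    """Turn {"a.b": 1} into {"a": {"b": 1}}; keys without dots are copied."""
    body: dict[str, Any] = {}
    for path, value in params.items():
        *parents, leaf = path.split(".")
        node = body
        for key in parents:
            # Create intermediate objects on demand.
            node = node.setdefault(key, {})
        node[leaf] = value
    return body


body = expand_dotted({
    "max_output_tokens": 1024,
    "reasoning.effort": "high",
    "text.format.type": "json_object",
})
```

The same expansion would apply to other dotted keys in the catalog, for example `thinking.type` or `web_search_options.search_context_size`.
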
xAI Grok 4.3 6 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
| Length · 1 param | ||||
| Max completion tokens `max_completion_tokens` | integer (1…+∞) | — | Upper bound for visible output tokens generated in the chat completion. | — |
| Sampling · 3 params | ||||
| Temperature `temperature` | number (0…2 step 0.1) | 1 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. | — |
| Top P `top_p` | number (0…1 step 0.01) | 1 | Controls nucleus sampling by limiting generation to tokens within the selected cumulative probability. | — |
| Seed `seed` | integer | — | Optional seed used for decoding when reproducible sampling is desired. | — |
| Reasoning · 1 param | ||||
| Reasoning effort `reasoning_effort` | enum (none \| low \| medium \| high) | "low" | Controls how much reasoning Grok performs before responding. Set to none for non-reasoning requests. | — |
| Output · 1 param | ||||
| Response format `response_format.type` | enum (text \| json_object \| json_schema) | "text" | Controls whether the model returns text, JSON mode output, or structured JSON schema output. | — |
xAI Grok Build 0.1 5 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
| Length · 1 param | ||||
| Max completion tokens `max_completion_tokens` | integer (1…+∞) | — | Upper bound for visible output tokens generated in the chat completion. | — |
| Sampling · 3 params | ||||
| Temperature `temperature` | number (0…2 step 0.1) | 1 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. | — |
| Top P `top_p` | number (0…1 step 0.01) | 1 | Controls nucleus sampling by limiting generation to tokens within the selected cumulative probability. | — |
| Seed `seed` | integer | — | Optional seed used for decoding when reproducible sampling is desired. | — |
| Output · 1 param | ||||
| Response format `response_format.type` | enum (text \| json_object \| json_schema) | "text" | Controls whether the model returns text, JSON mode output, or structured JSON schema output. | — |
DeepSeek 4
DeepSeek Deepseek Chat 4 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
| Length · 1 param | ||||
| Max tokens `max_tokens` | integer (1…+∞) | 4096 | Maximum number of output tokens the model may generate. | — |
| Sampling · 2 params | ||||
| Temperature `temperature` | number (0…2 step 0.1) | 1 | Controls randomness. In DeepSeek thinking mode this parameter is accepted for compatibility but has no effect. | Not when thinking.type = "enabled" |
| Top P `top_p` | number (0…1 step 0.01) | 1 | Controls nucleus sampling. In DeepSeek thinking mode this parameter is accepted for compatibility but has no effect. | Not when thinking.type = "enabled" |
| Reasoning · 1 param | ||||
| Thinking mode `thinking.type` | enum (disabled \| enabled) | "disabled" | Controls whether DeepSeek uses thinking mode before producing the final answer. | — |
DeepSeek Deepseek Reasoner 5 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
| Length · 1 param | ||||
| Max tokens `max_tokens` | integer (1…+∞) | 4096 | Maximum number of output tokens the model may generate. | — |
| Sampling · 2 params | ||||
| Temperature `temperature` | number (0…2 step 0.1) | 1 | Controls randomness. In DeepSeek thinking mode this parameter is accepted for compatibility but has no effect. | Not when thinking.type = "enabled" |
| Top P `top_p` | number (0…1 step 0.01) | 1 | Controls nucleus sampling. In DeepSeek thinking mode this parameter is accepted for compatibility but has no effect. | Not when thinking.type = "enabled" |
| Reasoning · 2 params | ||||
| Thinking mode `thinking.type` | enum (enabled \| disabled) | "enabled" | Controls whether DeepSeek uses thinking mode before producing the final answer. | — |
| Reasoning effort `reasoning_effort` | enum (high \| max) | "high" | Controls DeepSeek thinking effort when thinking mode is enabled. | Only when thinking.type = "enabled" |
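
The Condition column above ties `temperature`, `top_p`, and `reasoning_effort` to the value of `thinking.type`. As a rough sketch of applying those conditions on the client side, the helper below removes parameters that the table marks as not applicable. The catalog says `temperature` and `top_p` are still accepted in thinking mode, so dropping them is a choice made here for clarity, not a requirement. `apply_deepseek_conditions` is a hypothetical name, and parameters are kept in the flat, dotted form used by the catalog.

```python
def apply_deepseek_conditions(params: dict, default_thinking: str) -> dict:
    """Drop parameters whose Condition column says they do not apply."""
    thinking = params.get("thinking.type", default_thinking)
    result = dict(params)  # leave the caller's dict untouched
    if thinking == "enabled":
        # "Not when thinking.type = enabled": accepted but has no effect.
        result.pop("temperature", None)
        result.pop("top_p", None)
    else:
        # "Only when thinking.type = enabled".
        result.pop("reasoning_effort", None)
    return result


# Deepseek Reasoner lists "enabled" as the thinking default.
cleaned = apply_deepseek_conditions(
    {"temperature": 0.2, "reasoning_effort": "max", "max_tokens": 2048},
    default_thinking="enabled",
)
```

Passing the per-model default explicitly matters because Deepseek Chat lists `"disabled"` as its default while the other DeepSeek models listed here use `"enabled"`.
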
DeepSeek Deepseek V4 Flash 5 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
| Length · 1 param | ||||
| Max tokens `max_tokens` | integer (1…+∞) | 4096 | Maximum number of output tokens the model may generate. | — |
| Sampling · 2 params | ||||
| Temperature `temperature` | number (0…2 step 0.1) | 1 | Controls randomness. In DeepSeek thinking mode this parameter is accepted for compatibility but has no effect. | Not when thinking.type = "enabled" |
| Top P `top_p` | number (0…1 step 0.01) | 1 | Controls nucleus sampling. In DeepSeek thinking mode this parameter is accepted for compatibility but has no effect. | Not when thinking.type = "enabled" |
| Reasoning · 2 params | ||||
| Thinking mode `thinking.type` | enum (enabled \| disabled) | "enabled" | Controls whether DeepSeek uses thinking mode before producing the final answer. | — |
| Reasoning effort `reasoning_effort` | enum (high \| max) | "high" | Controls DeepSeek thinking effort when thinking mode is enabled. | Only when thinking.type = "enabled" |
DeepSeek Deepseek V4 Pro 5 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
| Length · 1 param | ||||
| Max tokens `max_tokens` | integer (1…+∞) | 4096 | Maximum number of output tokens the model may generate. | — |
| Sampling · 2 params | ||||
| Temperature `temperature` | number (0…2 step 0.1) | 1 | Controls randomness. In DeepSeek thinking mode this parameter is accepted for compatibility but has no effect. | Not when thinking.type = "enabled" |
| Top P `top_p` | number (0…1 step 0.01) | 1 | Controls nucleus sampling. In DeepSeek thinking mode this parameter is accepted for compatibility but has no effect. | Not when thinking.type = "enabled" |
| Reasoning · 2 params | ||||
| Thinking mode `thinking.type` | enum (enabled \| disabled) | "enabled" | Controls whether DeepSeek uses thinking mode before producing the final answer. | — |
| Reasoning effort `reasoning_effort` | enum (high \| max) | "high" | Controls DeepSeek thinking effort when thinking mode is enabled. | Only when thinking.type = "enabled" |
Meta 4
Meta Llama 3.3 70B Instruct 7 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
| Length · 1 param | ||||
| Max completion tokens `max_completion_tokens` | integer (1…+∞) | — | Maximum number of output tokens the model may generate. | — |
| Sampling · 4 params | ||||
| Temperature `temperature` | number | — | Controls randomness. Lower values make outputs more focused; higher values make them more varied. | — |
| Top P `top_p` | number | — | Controls nucleus sampling by limiting generation to tokens within the selected cumulative probability. | — |
| Top K `top_k` | integer | — | Limits generation to the selected number of highest-probability tokens. | — |
| Repetition penalty `repetition_penalty` | number | — | Penalizes tokens that have already appeared to reduce repetition in the output. | — |
| Tools · 1 param | ||||
| Tool choice `tool_choice` | enum (auto \| none \| required) | — | Controls whether the model may call tools, must call one, or skips tool calls. | — |
| Output · 1 param | ||||
| Response format `response_format.type` | enum (text \| json_schema) | "text" | Controls whether the model returns normal text or a schema-constrained JSON object. | — |
Meta Llama 3.3 8B Instruct 7 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
| Length · 1 param | ||||
| Max completion tokens `max_completion_tokens` | integer (1…+∞) | — | Maximum number of output tokens the model may generate. | — |
| Sampling · 4 params | ||||
| Temperature `temperature` | number | — | Controls randomness. Lower values make outputs more focused; higher values make them more varied. | — |
| Top P `top_p` | number | — | Controls nucleus sampling by limiting generation to tokens within the selected cumulative probability. | — |
| Top K `top_k` | integer | — | Limits generation to the selected number of highest-probability tokens. | — |
| Repetition penalty `repetition_penalty` | number | — | Penalizes tokens that have already appeared to reduce repetition in the output. | — |
| Tools · 1 param | ||||
| Tool choice `tool_choice` | enum (auto \| none \| required) | — | Controls whether the model may call tools, must call one, or skips tool calls. | — |
| Output · 1 param | ||||
| Response format `response_format.type` | enum (text \| json_schema) | "text" | Controls whether the model returns normal text or a schema-constrained JSON object. | — |
Meta Llama 4 Maverick 17B 128E Instruct FP8 7 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
| Length · 1 param | ||||
| Max completion tokens `max_completion_tokens` | integer (1…+∞) | — | Maximum number of output tokens the model may generate. | — |
| Sampling · 4 params | ||||
| Temperature `temperature` | number | — | Controls randomness. Lower values make outputs more focused; higher values make them more varied. | — |
| Top P `top_p` | number | — | Controls nucleus sampling by limiting generation to tokens within the selected cumulative probability. | — |
| Top K `top_k` | integer | — | Limits generation to the selected number of highest-probability tokens. | — |
| Repetition penalty `repetition_penalty` | number | — | Penalizes tokens that have already appeared to reduce repetition in the output. | — |
| Tools · 1 param | ||||
| Tool choice `tool_choice` | enum (auto \| none \| required) | — | Controls whether the model may call tools, must call one, or skips tool calls. | — |
| Output · 1 param | ||||
| Response format `response_format.type` | enum (text \| json_schema) | "text" | Controls whether the model returns normal text or a schema-constrained JSON object. | — |
Meta Llama 4 Scout 17B 16E Instruct FP8 7 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
| Length · 1 param | ||||
| Max completion tokens `max_completion_tokens` | integer (1…+∞) | — | Maximum number of output tokens the model may generate. | — |
| Sampling · 4 params | ||||
| Temperature `temperature` | number | — | Controls randomness. Lower values make outputs more focused; higher values make them more varied. | — |
| Top P `top_p` | number | — | Controls nucleus sampling by limiting generation to tokens within the selected cumulative probability. | — |
| Top K `top_k` | integer | — | Limits generation to the selected number of highest-probability tokens. | — |
| Repetition penalty `repetition_penalty` | number | — | Penalizes tokens that have already appeared to reduce repetition in the output. | — |
| Tools · 1 param | ||||
| Tool choice `tool_choice` | enum (auto \| none \| required) | — | Controls whether the model may call tools, must call one, or skips tool calls. | — |
| Output · 1 param | ||||
| Response format `response_format.type` | enum (text \| json_schema) | "text" | Controls whether the model returns normal text or a schema-constrained JSON object. | — |
Perplexity 4
Perplexity Sonar 12 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
| Length · 1 param | ||||
| Max tokens `max_tokens` | integer (1…128000) | — | Maximum number of output tokens the model may generate. | — |
| Sampling · 2 params | ||||
| Temperature `temperature` | number (0…2 step 0.1) | — | Controls randomness. Lower values make outputs more focused; higher values make them more varied. | — |
| Top P `top_p` | number (0…1 step 0.01) | — | Controls nucleus sampling by limiting generation to tokens within the selected cumulative probability. | — |
| Metadata · 9 params | ||||
| Search mode `search_mode` | enum (web \| academic \| sec) | — | Selects the corpus the model searches when grounding its answer. | — |
| Search recency filter `search_recency_filter` | enum (hour \| day \| week \| month \| year) | — | Restricts web search results to a recent time window. | — |
| Search domain filter `search_domain_filter` | string | — | Limits search to, or excludes, specific domains. | — |
| Search after date `search_after_date_filter` | string | — | Restricts search results to content published after this date (MM/DD/YYYY). | — |
| Search before date `search_before_date_filter` | string | — | Restricts search results to content published before this date (MM/DD/YYYY). | — |
| Search context size `web_search_options.search_context_size` | enum (low \| medium \| high) | "low" | Controls how much web search context is retrieved before generating the answer. | — |
| Return images `return_images` | boolean | false | Controls whether the response may include related images from the search. | — |
| Return related questions `return_related_questions` | boolean | false | Controls whether the response includes suggested follow-up questions. | — |
| Disable search `disable_search` | boolean | false | Turns off web search so the model answers from its own knowledge only. | — |
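
The two date filters above are plain strings in MM/DD/YYYY form, which is easy to get wrong when dates come from ISO-formatted sources. A minimal sketch of producing them from `datetime.date` values follows; `sonar_date_filters` is a hypothetical helper, and the ordering check is a client-side sanity check added here, not a documented Perplexity rule.

```python
from datetime import date
from typing import Optional


def sonar_date_filters(
    after: Optional[date] = None,
    before: Optional[date] = None,
) -> dict[str, str]:
    """Format optional date bounds as MM/DD/YYYY strings."""
    if after and before and after >= before:
        raise ValueError("'after' must be earlier than 'before'")
    filters: dict[str, str] = {}
    if after:
        filters["search_after_date_filter"] = after.strftime("%m/%d/%Y")
    if before:
        filters["search_before_date_filter"] = before.strftime("%m/%d/%Y")
    return filters


filters = sonar_date_filters(after=date(2025, 3, 1), before=date(2025, 6, 30))
```
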
Perplexity Sonar Deep Research 12 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
| Length · 1 param | ||||
| Max tokens `max_tokens` | integer (1…128000) | — | Maximum number of output tokens the model may generate. | — |
| Sampling · 2 params | ||||
| Temperature `temperature` | number (0…2 step 0.1) | — | Controls randomness. Lower values make outputs more focused; higher values make them more varied. | — |
| Top P `top_p` | number (0…1 step 0.01) | — | Controls nucleus sampling by limiting generation to tokens within the selected cumulative probability. | — |
| Reasoning · 1 param | ||||
| Reasoning effort `reasoning_effort` | enum (minimal \| low \| medium \| high) | — | Controls how much reasoning and searching the model performs before producing the report. | — |
| Metadata · 8 params | ||||
| Search mode `search_mode` | enum (web \| academic \| sec) | — | Selects the corpus the model searches when grounding its answer. | — |
| Search recency filter `search_recency_filter` | enum (hour \| day \| week \| month \| year) | — | Restricts web search results to a recent time window. | — |
| Search domain filter `search_domain_filter` | string | — | Limits search to, or excludes, specific domains. | — |
| Search after date `search_after_date_filter` | string | — | Restricts search results to content published after this date (MM/DD/YYYY). | — |
| Search before date `search_before_date_filter` | string | — | Restricts search results to content published before this date (MM/DD/YYYY). | — |
| Search context size `web_search_options.search_context_size` | enum (low \| medium \| high) | "low" | Controls how much web search context is retrieved before generating the answer. | — |
| Return images `return_images` | boolean | false | Controls whether the response may include related images from the search. | — |
| Return related questions `return_related_questions` | boolean | false | Controls whether the response includes suggested follow-up questions. | — |
Perplexity Sonar Pro 12 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
| Length · 1 param | ||||
| Max tokens `max_tokens` | integer (1…128000) | — | Maximum number of output tokens the model may generate. | — |
| Sampling · 2 params | ||||
| Temperature `temperature` | number (0…2 step 0.1) | — | Controls randomness. Lower values make outputs more focused; higher values make them more varied. | — |
| Top P `top_p` | number (0…1 step 0.01) | — | Controls nucleus sampling by limiting generation to tokens within the selected cumulative probability. | — |
| Metadata · 9 params | ||||
| Search mode `search_mode` | enum (web \| academic \| sec) | — | Selects the corpus the model searches when grounding its answer. | — |
| Search recency filter `search_recency_filter` | enum (hour \| day \| week \| month \| year) | — | Restricts web search results to a recent time window. | — |
| Search domain filter `search_domain_filter` | string | — | Limits search to, or excludes, specific domains. | — |
| Search after date `search_after_date_filter` | string | — | Restricts search results to content published after this date (MM/DD/YYYY). | — |
| Search before date `search_before_date_filter` | string | — | Restricts search results to content published before this date (MM/DD/YYYY). | — |
| Search context size `web_search_options.search_context_size` | enum (low \| medium \| high) | "low" | Controls how much web search context is retrieved before generating the answer. | — |
| Return images `return_images` | boolean | false | Controls whether the response may include related images from the search. | — |
| Return related questions `return_related_questions` | boolean | false | Controls whether the response includes suggested follow-up questions. | — |
| Disable search `disable_search` | boolean | false | Turns off web search so the model answers from its own knowledge only. | — |
Perplexity Sonar Reasoning Pro 12 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
| Length · 1 param | ||||
| Max tokens `max_tokens` | integer (1…128000) | — | Maximum number of output tokens the model may generate. | — |
| Sampling · 2 params | ||||
| Temperature `temperature` | number (0…2 step 0.1) | — | Controls randomness. Lower values make outputs more focused; higher values make them more varied. | — |
| Top P `top_p` | number (0…1 step 0.01) | — | Controls nucleus sampling by limiting generation to tokens within the selected cumulative probability. | — |
| Metadata · 9 params | ||||
| Search mode `search_mode` | enum (web \| academic \| sec) | — | Selects the corpus the model searches when grounding its answer. | — |
| Search recency filter `search_recency_filter` | enum (hour \| day \| week \| month \| year) | — | Restricts web search results to a recent time window. | — |
| Search domain filter `search_domain_filter` | string | — | Limits search to, or excludes, specific domains. | — |
| Search after date `search_after_date_filter` | string | — | Restricts search results to content published after this date (MM/DD/YYYY). | — |
| Search before date `search_before_date_filter` | string | — | Restricts search results to content published before this date (MM/DD/YYYY). | — |
| Search context size `web_search_options.search_context_size` | enum (low \| medium \| high) | "low" | Controls how much web search context is retrieved before generating the answer. | — |
| Return images `return_images` | boolean | false | Controls whether the response may include related images from the search. | — |
| Return related questions `return_related_questions` | boolean | false | Controls whether the response includes suggested follow-up questions. | — |
| Disable search `disable_search` | boolean | false | Turns off web search so the model answers from its own knowledge only. | — |