Every LLM parameter,
for every model.
An open, community-maintained catalog of LLM model parameters. Browse the UI below, query the API, or install the npm package.
Access type
Providers
Parameters
191 of 191 models
OpenAI
Chatgpt 4o Latest OpenAI 3 params
Length
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Max tokens
max_tokens
|
integer (1…+∞) | 4096 | Maximum number of output tokens the model may generate. | — |
Sampling
2 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Temperature
temperature
|
number (0…2 step 0.1) | 1 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. | — |
|
Top P
top_p
|
number (0…1 step 0.01) | 1 | Controls nucleus sampling by limiting generation to tokens whose cumulative probability reaches this value. | — |
Gpt 3.5 Turbo OpenAI 3 params
Length
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Max tokens
max_tokens
|
integer (1…+∞) | 4096 | Maximum number of output tokens the model may generate. | — |
Sampling
2 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Temperature
temperature
|
number (0…2 step 0.1) | 1 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. | — |
|
Top P
top_p
|
number (0…1 step 0.01) | 1 | Controls nucleus sampling by limiting generation to tokens whose cumulative probability reaches this value. | — |
Gpt 4 Turbo OpenAI 3 params
Length
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Max tokens
max_tokens
|
integer (1…+∞) | 4096 | Maximum number of output tokens the model may generate. | — |
Sampling
2 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Temperature
temperature
|
number (0…2 step 0.1) | 1 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. | — |
|
Top P
top_p
|
number (0…1 step 0.01) | 1 | Controls nucleus sampling by limiting generation to tokens whose cumulative probability reaches this value. | — |
Gpt 4 Turbo 2024-04-09 OpenAI 3 params
Length
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Max tokens
max_tokens
|
integer (1…+∞) | 4096 | Maximum number of output tokens the model may generate. | — |
Sampling
2 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Temperature
temperature
|
number (0…2 step 0.1) | 1 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. | — |
|
Top P
top_p
|
number (0…1 step 0.01) | 1 | Controls nucleus sampling by limiting generation to tokens whose cumulative probability reaches this value. | — |
Gpt 4.1 OpenAI 3 params
Length
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Max tokens
max_tokens
|
integer (1…+∞) | 4096 | Maximum number of output tokens the model may generate. | — |
Sampling
2 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Temperature
temperature
|
number (0…2 step 0.1) | 1 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. | — |
|
Top P
top_p
|
number (0…1 step 0.01) | 1 | Controls nucleus sampling by limiting generation to tokens whose cumulative probability reaches this value. | — |
Gpt 4.1 Mini OpenAI 3 params
Length
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Max tokens
max_tokens
|
integer (1…+∞) | 4096 | Maximum number of output tokens the model may generate. | — |
Sampling
2 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Temperature
temperature
|
number (0…2 step 0.1) | 1 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. | — |
|
Top P
top_p
|
number (0…1 step 0.01) | 1 | Controls nucleus sampling by limiting generation to tokens whose cumulative probability reaches this value. | — |
Gpt 4.1 Nano OpenAI 3 params
Length
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Max tokens
max_tokens
|
integer (1…+∞) | 4096 | Maximum number of output tokens the model may generate. | — |
Sampling
2 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Temperature
temperature
|
number (0…2 step 0.1) | 1 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. | — |
|
Top P
top_p
|
number (0…1 step 0.01) | 1 | Controls nucleus sampling by limiting generation to tokens whose cumulative probability reaches this value. | — |
GPT-4o OpenAI 3 params
Length
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Max tokens
max_tokens
|
integer (1…+∞) | 4096 | Maximum number of output tokens the model may generate. | — |
Sampling
2 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Temperature
temperature
|
number (0…2 step 0.1) | 1 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. | — |
|
Top P
top_p
|
number (0…1 step 0.01) | 1 | Controls nucleus sampling by limiting generation to tokens whose cumulative probability reaches this value. | — |
Gpt 4o 2024-11-20 OpenAI 3 params
Length
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Max tokens
max_tokens
|
integer (1…+∞) | 4096 | Maximum number of output tokens the model may generate. | — |
Sampling
2 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Temperature
temperature
|
number (0…2 step 0.1) | 1 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. | — |
|
Top P
top_p
|
number (0…1 step 0.01) | 1 | Controls nucleus sampling by limiting generation to tokens whose cumulative probability reaches this value. | — |
GPT-4o mini OpenAI 3 params
Length
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Max tokens
max_tokens
|
integer (1…+∞) | 4096 | Maximum number of output tokens the model may generate. | — |
Sampling
2 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Temperature
temperature
|
number (0…2 step 0.1) | 1 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. | — |
|
Top P
top_p
|
number (0…1 step 0.01) | 1 | Controls nucleus sampling by limiting generation to tokens whose cumulative probability reaches this value. | — |
Gpt 5 OpenAI 2 params
Length
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Max completion tokens
max_completion_tokens
|
integer (16…+∞) | 4096 | Maximum number of output tokens the model may generate. | — |
Reasoning
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Reasoning effort
reasoning_effort
|
enum (minimal | low | medium | high) | "medium" | Controls how much reasoning the model should perform before producing an answer. | — |
Gpt 5 Chat Latest OpenAI 1 param
Length
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Max completion tokens
max_completion_tokens
|
integer (16…+∞) | 4096 | Maximum number of output tokens the model may generate. | — |
Gpt 5 Mini OpenAI 2 params
Length
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Max completion tokens
max_completion_tokens
|
integer (16…+∞) | 4096 | Maximum number of output tokens the model may generate. | — |
Reasoning
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Reasoning effort
reasoning_effort
|
enum (minimal | low | medium | high) | "medium" | Controls how much reasoning the model should perform before producing an answer. | — |
Gpt 5 Nano OpenAI 2 params
Length
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Max completion tokens
max_completion_tokens
|
integer (16…+∞) | 4096 | Maximum number of output tokens the model may generate. | — |
Reasoning
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Reasoning effort
reasoning_effort
|
enum (minimal | low | medium | high) | "medium" | Controls how much reasoning the model should perform before producing an answer. | — |
Gpt 5.1 OpenAI 2 params
Length
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Max completion tokens
max_completion_tokens
|
integer (16…+∞) | 4096 | Maximum number of output tokens the model may generate. | — |
Reasoning
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Reasoning effort
reasoning_effort
|
enum (none | low | medium | high) | "none" | Controls how much reasoning the model should perform before producing an answer. | — |
Gpt 5.1 Codex Max OpenAI Subscription 3 params
Reasoning
2 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Reasoning effort
reasoning.effort
|
enum (minimal | low | medium | high | xhigh) | "medium" | Controls how much reasoning the model should perform before producing an answer. | — |
|
Reasoning summary
reasoning.summary
|
enum (auto | concise | detailed | none) | "auto" | Controls the level of reasoning summary returned with the response. | — |
Output
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Verbosity
text.verbosity
|
enum (low | medium | high) | "medium" | Controls how concise or detailed the model's final text response should be. | — |
Gpt 5.1 Codex OpenAI Subscription 3 params
Reasoning
2 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Reasoning effort
reasoning.effort
|
enum (minimal | low | medium | high) | "medium" | Controls how much reasoning the model should perform before producing an answer. | — |
|
Reasoning summary
reasoning.summary
|
enum (auto | concise | detailed | none) | "auto" | Controls the level of reasoning summary returned with the response. | — |
Output
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Verbosity
text.verbosity
|
enum (low | medium | high) | "medium" | Controls how concise or detailed the model's final text response should be. | — |
Gpt 5.2 OpenAI 2 params
Length
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Max completion tokens
max_completion_tokens
|
integer (16…+∞) | 4096 | Maximum number of output tokens the model may generate. | — |
Reasoning
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Reasoning effort
reasoning_effort
|
enum (none | low | medium | high | xhigh) | "medium" | Controls how much reasoning the model should perform before producing an answer. | — |
Gpt 5.2 Codex OpenAI Subscription 3 params
Reasoning
2 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Reasoning effort
reasoning.effort
|
enum (minimal | low | medium | high | xhigh) | "medium" | Controls how much reasoning the model should perform before producing an answer. | — |
|
Reasoning summary
reasoning.summary
|
enum (auto | concise | detailed | none) | "auto" | Controls the level of reasoning summary returned with the response. | — |
Output
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Verbosity
text.verbosity
|
enum (low | medium | high) | "medium" | Controls how concise or detailed the model's final text response should be. | — |
Gpt 5.2 OpenAI Subscription 3 params
Reasoning
2 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Reasoning effort
reasoning.effort
|
enum (minimal | low | medium | high | xhigh) | "medium" | Controls how much reasoning the model should perform before producing an answer. | — |
|
Reasoning summary
reasoning.summary
|
enum (auto | concise | detailed | none) | "auto" | Controls the level of reasoning summary returned with the response. | — |
Output
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Verbosity
text.verbosity
|
enum (low | medium | high) | "medium" | Controls how concise or detailed the model's final text response should be. | — |
Gpt 5.3 Codex OpenAI 2 params
Length
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Max completion tokens
max_completion_tokens
|
integer (16…+∞) | 4096 | Maximum number of output tokens the model may generate. | — |
Reasoning
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Reasoning effort
reasoning_effort
|
enum (low | medium | high | xhigh) | "medium" | Controls how much reasoning the model should perform before producing an answer. | — |
Gpt 5.3 Codex Spark OpenAI Subscription 3 params
Reasoning
2 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Reasoning effort
reasoning.effort
|
enum (minimal | low | medium | high | xhigh) | "medium" | Controls how much reasoning the model should perform before producing an answer. | — |
|
Reasoning summary
reasoning.summary
|
enum (auto | concise | detailed | none) | "auto" | Controls the level of reasoning summary returned with the response. | — |
Output
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Verbosity
text.verbosity
|
enum (low | medium | high) | "medium" | Controls how concise or detailed the model's final text response should be. | — |
Gpt 5.3 Codex OpenAI Subscription 3 params
Reasoning
2 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Reasoning effort
reasoning.effort
|
enum (minimal | low | medium | high | xhigh) | "medium" | Controls how much reasoning the model should perform before producing an answer. | — |
|
Reasoning summary
reasoning.summary
|
enum (auto | concise | detailed | none) | "auto" | Controls the level of reasoning summary returned with the response. | — |
Output
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Verbosity
text.verbosity
|
enum (low | medium | high) | "medium" | Controls how concise or detailed the model's final text response should be. | — |
Gpt 5.4 OpenAI 2 params
Length
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Max completion tokens
max_completion_tokens
|
integer (16…+∞) | 4096 | Maximum number of output tokens the model may generate. | — |
Reasoning
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Reasoning effort
reasoning_effort
|
enum (none | low | medium | high | xhigh) | "medium" | Controls how much reasoning the model should perform before producing an answer. | — |
Gpt 5.4 Mini OpenAI 2 params
Length
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Max completion tokens
max_completion_tokens
|
integer (16…+∞) | 4096 | Maximum number of output tokens the model may generate. | — |
Reasoning
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Reasoning effort
reasoning_effort
|
enum (none | low | medium | high | xhigh) | "medium" | Controls how much reasoning the model should perform before producing an answer. | — |
Gpt 5.4 Mini OpenAI Subscription 3 params
Reasoning
2 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Reasoning effort
reasoning.effort
|
enum (minimal | low | medium | high | xhigh) | "medium" | Controls how much reasoning the model should perform before producing an answer. | — |
|
Reasoning summary
reasoning.summary
|
enum (auto | concise | detailed | none) | "auto" | Controls the level of reasoning summary returned with the response. | — |
Output
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Verbosity
text.verbosity
|
enum (low | medium | high) | "medium" | Controls how concise or detailed the model's final text response should be. | — |
Gpt 5.4 Nano OpenAI 2 params
Length
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Max completion tokens
max_completion_tokens
|
integer (16…+∞) | 4096 | Maximum number of output tokens the model may generate. | — |
Reasoning
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Reasoning effort
reasoning_effort
|
enum (none | low | medium | high | xhigh) | "medium" | Controls how much reasoning the model should perform before producing an answer. | — |
Gpt 5.4 Pro OpenAI 2 params
Length
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Max completion tokens
max_completion_tokens
|
integer (16…+∞) | 4096 | Maximum number of output tokens the model may generate. | — |
Reasoning
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Reasoning effort
reasoning_effort
|
enum (medium | high | xhigh) | "medium" | Controls how much reasoning the model should perform before producing an answer. | — |
Gpt 5.4 Pro OpenAI Subscription 3 params
Reasoning
2 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Reasoning effort
reasoning.effort
|
enum (medium | high | xhigh) | "medium" | Controls how much reasoning the model should perform before producing an answer. | — |
|
Reasoning summary
reasoning.summary
|
enum (auto | concise | detailed | none) | "auto" | Controls the level of reasoning summary returned with the response. | — |
Output
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Verbosity
text.verbosity
|
enum (low | medium | high) | "medium" | Controls how concise or detailed the model's final text response should be. | — |
Gpt 5.4 OpenAI Subscription 3 params
Reasoning
2 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Reasoning effort
reasoning.effort
|
enum (minimal | low | medium | high | xhigh) | "medium" | Controls how much reasoning the model should perform before producing an answer. | — |
|
Reasoning summary
reasoning.summary
|
enum (auto | concise | detailed | none) | "auto" | Controls the level of reasoning summary returned with the response. | — |
Output
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Verbosity
text.verbosity
|
enum (low | medium | high) | "medium" | Controls how concise or detailed the model's final text response should be. | — |
Gpt 5.5 OpenAI 2 params
Length
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Max completion tokens
max_completion_tokens
|
integer (16…+∞) | 4096 | Maximum number of output tokens the model may generate. | — |
Reasoning
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Reasoning effort
reasoning_effort
|
enum (none | low | medium | high | xhigh) | "medium" | Controls how much reasoning the model should perform before producing an answer. | — |
Gpt 5.5 Pro OpenAI 2 params
Length
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Max completion tokens
max_completion_tokens
|
integer (16…+∞) | 4096 | Maximum number of output tokens the model may generate. | — |
Reasoning
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Reasoning effort
reasoning_effort
|
enum (medium | high | xhigh) | "medium" | Controls how much reasoning the model should perform before producing an answer. | — |
Gpt 5.5 Pro OpenAI Subscription 3 params
Reasoning
2 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Reasoning effort
reasoning.effort
|
enum (medium | high | xhigh) | "medium" | Controls how much reasoning the model should perform before producing an answer. | — |
|
Reasoning summary
reasoning.summary
|
enum (auto | concise | detailed | none) | "auto" | Controls the level of reasoning summary returned with the response. | — |
Output
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Verbosity
text.verbosity
|
enum (low | medium | high) | "medium" | Controls how concise or detailed the model's final text response should be. | — |
Gpt 5.5 OpenAI Subscription 3 params
Reasoning
2 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Reasoning effort
reasoning.effort
|
enum (minimal | low | medium | high | xhigh) | "medium" | Controls how much reasoning the model should perform before producing an answer. | — |
|
Reasoning summary
reasoning.summary
|
enum (auto | concise | detailed | none) | "auto" | Controls the level of reasoning summary returned with the response. | — |
Output
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Verbosity
text.verbosity
|
enum (low | medium | high) | "medium" | Controls how concise or detailed the model's final text response should be. | — |
o1 OpenAI 2 params
Length
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Max completion tokens
max_completion_tokens
|
integer (16…+∞) | 4096 | Maximum number of output tokens the model may generate. | — |
Reasoning
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Reasoning effort
reasoning_effort
|
enum (low | medium | high | xhigh) | "medium" | Controls how much reasoning the model should perform before producing an answer. | — |
o1-mini OpenAI 2 params
Length
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Max tokens
max_tokens
|
integer (1…+∞) | 4096 | Maximum number of output tokens the model may generate. | — |
Reasoning
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Reasoning effort
reasoning_effort
|
enum (minimal | low | medium | high) | "medium" | Controls how much reasoning the model should perform before producing an answer. | — |
O1 Preview OpenAI 2 params
Length
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Max tokens
max_tokens
|
integer (1…+∞) | 4096 | Maximum number of output tokens the model may generate. | — |
Reasoning
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Reasoning effort
reasoning_effort
|
enum (minimal | low | medium | high) | "medium" | Controls how much reasoning the model should perform before producing an answer. | — |
o3 OpenAI 2 params
Length
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Max completion tokens
max_completion_tokens
|
integer (16…+∞) | 4096 | Maximum number of output tokens the model may generate. | — |
Reasoning
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Reasoning effort
reasoning_effort
|
enum (low | medium | high | xhigh) | "medium" | Controls how much reasoning the model should perform before producing an answer. | — |
o3-mini OpenAI 2 params
Length
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Max completion tokens
max_completion_tokens
|
integer (16…+∞) | 4096 | Maximum number of output tokens the model may generate. | — |
Reasoning
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Reasoning effort
reasoning_effort
|
enum (low | medium | high | xhigh) | "medium" | Controls how much reasoning the model should perform before producing an answer. | — |
O3 Pro OpenAI 2 params
Length
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Max completion tokens
max_completion_tokens
|
integer (16…+∞) | 4096 | Maximum number of output tokens the model may generate. | — |
Reasoning
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Reasoning effort
reasoning_effort
|
enum (low | medium | high | xhigh) | "medium" | Controls how much reasoning the model should perform before producing an answer. | — |
o4-mini OpenAI 2 params
Length
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Max completion tokens
max_completion_tokens
|
integer (16…+∞) | 4096 | Maximum number of output tokens the model may generate. | — |
Reasoning
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Reasoning effort
reasoning_effort
|
enum (low | medium | high | xhigh) | "medium" | Controls how much reasoning the model should perform before producing an answer. | — |
Anthropic
Claude 3.5 Haiku 20241022 Anthropic 4 params
Length
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Max tokens
max_tokens
|
integer (1…+∞) | 4096 | Maximum number of output tokens the model may generate. | — |
Sampling
3 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Temperature
temperature
|
number (0…1 step 0.1) | 1 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. |
Not when thinking.type ∈ {"adaptive", "enabled"}
|
|
Top P
top_p
|
number (0…1 step 0.01) | 1 | Controls nucleus sampling by limiting generation to tokens whose cumulative probability reaches this value. |
Not when thinking.type ∈ {"adaptive", "enabled"} or temperature ≠ 1
|
|
Top K
top_k
|
integer (0…+∞) | 0 | Limits token sampling to the top K most likely next tokens. |
Not when thinking.type ∈ {"adaptive", "enabled"}
|
Claude 3.5 Haiku Latest Anthropic 4 params
Length
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Max tokens
max_tokens
|
integer (1…+∞) | 4096 | Maximum number of output tokens the model may generate. | — |
Sampling
3 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Temperature
temperature
|
number (0…1 step 0.1) | 1 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. |
Not when thinking.type ∈ {"adaptive", "enabled"}
|
|
Top P
top_p
|
number (0…1 step 0.01) | 1 | Controls nucleus sampling by limiting generation to tokens whose cumulative probability reaches this value. |
Not when thinking.type ∈ {"adaptive", "enabled"} or temperature ≠ 1
|
|
Top K
top_k
|
integer (0…+∞) | 0 | Limits token sampling to the top K most likely next tokens. |
Not when thinking.type ∈ {"adaptive", "enabled"}
|
Claude 3.5 Sonnet 20241022 Anthropic 4 params
Length
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Max tokens
max_tokens
|
integer (1…+∞) | 4096 | Maximum number of output tokens the model may generate. | — |
Sampling
3 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Temperature
temperature
|
number (0…1 step 0.1) | 1 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. |
Not when thinking.type ∈ {"adaptive", "enabled"}
|
|
Top P
top_p
|
number (0…1 step 0.01) | 1 | Controls nucleus sampling by limiting generation to tokens whose cumulative probability reaches this value. |
Not when thinking.type ∈ {"adaptive", "enabled"} or temperature ≠ 1
|
|
Top K
top_k
|
integer (0…+∞) | 0 | Limits token sampling to the top K most likely next tokens. |
Not when thinking.type ∈ {"adaptive", "enabled"}
|
Claude 3.5 Sonnet Latest Anthropic 4 params
Length
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Max tokens
max_tokens
|
integer (1…+∞) | 4096 | Maximum number of output tokens the model may generate. | — |
Sampling
3 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Temperature
temperature
|
number (0…1 step 0.1) | 1 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. |
Not when thinking.type ∈ {"adaptive", "enabled"}
|
|
Top P
top_p
|
number (0…1 step 0.01) | 1 | Controls nucleus sampling by limiting generation to tokens whose cumulative probability reaches this value. |
Not when thinking.type ∈ {"adaptive", "enabled"} or temperature ≠ 1
|
|
Top K
top_k
|
integer (0…+∞) | 0 | Limits token sampling to the top K most likely next tokens. |
Not when thinking.type ∈ {"adaptive", "enabled"}
|
Claude 3.7 Sonnet 20250219 Anthropic 6 params
Length
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Max tokens
max_tokens
|
integer (1…+∞) | 4096 | Maximum number of output tokens the model may generate. | — |
Sampling
3 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Temperature
temperature
|
number (0…1 step 0.1) | 1 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. |
Not when thinking.type ∈ {"adaptive", "enabled"}
|
|
Top P
top_p
|
number (0…1 step 0.01) | 1 | Controls nucleus sampling by limiting generation to tokens whose cumulative probability reaches this value. |
Not when thinking.type ∈ {"adaptive", "enabled"} or temperature ≠ 1
|
|
Top K
top_k
|
integer (0…+∞) | 0 | Limits token sampling to the top K most likely next tokens. |
Not when thinking.type ∈ {"adaptive", "enabled"}
|
Reasoning
2 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Thinking mode
thinking.type
|
enum (disabled | enabled) | "disabled" | Controls the Anthropic thinking mode values supported by this model. | — |
|
Budget tokens
thinking.budget_tokens
|
integer (1024…+∞) | 4096 | Maximum token budget Anthropic may use for extended thinking before producing the final answer. |
Only when thinking.type = "enabled"
|
Claude 3.7 Sonnet Latest Anthropic 6 params
Length
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Max tokens
max_tokens
|
integer (1…+∞) | 4096 | Maximum number of output tokens the model may generate. | — |
Sampling
3 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Temperature
temperature
|
number (0…1 step 0.1) | 1 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. |
Not when thinking.type ∈ {"adaptive", "enabled"}
|
|
Top P
top_p
|
number (0…1 step 0.01) | 1 | Controls nucleus sampling by limiting generation to tokens whose cumulative probability reaches this value. |
Not when thinking.type ∈ {"adaptive", "enabled"} or temperature ≠ 1
|
|
Top K
top_k
|
integer (0…+∞) | 0 | Limits token sampling to the top K most likely next tokens. |
Not when thinking.type ∈ {"adaptive", "enabled"}
|
Reasoning
2 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Thinking mode
thinking.type
|
enum (disabled | enabled) | "disabled" | Controls the Anthropic thinking mode values supported by this model. | — |
|
Budget tokens
thinking.budget_tokens
|
integer (1024…+∞) | 4096 | Maximum token budget Anthropic may use for extended thinking before producing the final answer. |
Only when thinking.type = "enabled"
|
Claude 3 Opus 20240229 Anthropic 4 params
Length
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Max tokens
max_tokens
|
integer (1…+∞) | 4096 | Maximum number of output tokens the model may generate. | — |
Sampling
3 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Temperature
temperature
|
number (0…1 step 0.1) | 1 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. |
Not when thinking.type ∈ {"adaptive", "enabled"}
|
|
Top P
top_p
|
number (0…1 step 0.01) | 1 | Controls nucleus sampling by limiting generation to tokens whose cumulative probability reaches this value. |
Not when thinking.type ∈ {"adaptive", "enabled"} or temperature ≠ 1
|
|
Top K
top_k
|
integer (0…+∞) | 0 | Limits token sampling to the top K most likely next tokens. |
Not when thinking.type ∈ {"adaptive", "enabled"}
|
Claude 3 Opus Latest Anthropic 4 params
Length
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Max tokens
max_tokens
|
integer (1…+∞) | 4096 | Maximum number of output tokens the model may generate. | — |
Sampling
3 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Temperature
temperature
|
number (0…1 step 0.1) | 1 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. |
Not when thinking.type ∈ {"adaptive", "enabled"}
|
|
Top P
top_p
|
number (0…1 step 0.01) | 1 | Controls nucleus sampling by limiting generation to tokens whose cumulative probability reaches this value. |
Not when thinking.type ∈ {"adaptive", "enabled"} or temperature ≠ 1
|
|
Top K
top_k
|
integer (0…+∞) | 0 | Limits token sampling to the top K most likely next tokens. |
Not when thinking.type ∈ {"adaptive", "enabled"}
|
Claude Fable 5 Anthropic 4 params
Length
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Max tokens
max_tokens
|
integer (1…+∞) | 4096 | Maximum number of output tokens the model may generate. | — |
Reasoning
3 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Thinking mode
thinking.type
|
enum (adaptive) | — | Only adaptive thinking is supported; omit the parameter entirely to run without thinking (an explicit disabled value is rejected). | — |
|
Thinking display
thinking.display
|
enum (summarized | omitted) | "omitted" | Controls whether Anthropic returns summarized or omitted thinking content. |
Only when thinking.type = "adaptive"
|
|
Effort
output_config.effort
|
enum (low | medium | high | xhigh | max) | "high" | Controls Anthropic response thoroughness and token spend. | — |
Claude Fable 5 Anthropic Subscription 4 params
Length
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Max tokens
max_tokens
|
integer (1…+∞) | 4096 | Maximum number of output tokens the model may generate. | — |
Reasoning
3 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Thinking mode
thinking.type
|
enum (adaptive) | — | Only adaptive thinking is supported; omit the parameter entirely to run without thinking (an explicit disabled value is rejected). | — |
|
Thinking display
thinking.display
|
enum (summarized | omitted) | "omitted" | Controls whether Anthropic returns summarized or omitted thinking content. |
Only when thinking.type = "adaptive"
|
|
Effort
output_config.effort
|
enum (low | medium | high | xhigh | max) | "high" | Controls Anthropic response thoroughness and token spend. | — |
Claude Haiku 4 Anthropic 6 params
Length
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Max tokens
max_tokens
|
integer (1…+∞) | 4096 | Maximum number of output tokens the model may generate. | — |
Sampling
3 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Temperature
temperature
|
number (0…1 step 0.1) | 1 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. |
Not when thinking.type ∈ {"adaptive", "enabled"}
|
|
Top P
top_p
|
number (0…1 step 0.01) | 1 | Controls nucleus sampling by limiting generation to tokens whose cumulative probability reaches this value. |
Not when thinking.type ∈ {"adaptive", "enabled"} or temperature ≠ 1
|
|
Top K
top_k
|
integer (0…+∞) | 0 | Limits token sampling to the top K most likely next tokens. |
Not when thinking.type ∈ {"adaptive", "enabled"}
|
Reasoning
2 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Thinking mode
thinking.type
|
enum (disabled | enabled) | "disabled" | Controls the Anthropic thinking mode values supported by this model. | — |
|
Budget tokens
thinking.budget_tokens
|
integer (1024…+∞) | 4096 | Maximum token budget Anthropic may use for extended thinking before producing the final answer. |
Only when thinking.type = "enabled"
|
Claude Haiku 4.5 Anthropic 6 params
Length
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Max tokens
max_tokens
|
integer (1…+∞) | 4096 | Maximum number of output tokens the model may generate. | — |
Sampling
3 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Temperature
temperature
|
number (0…1 step 0.1) | 1 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. |
Not when thinking.type ∈ {"adaptive", "enabled"}
|
|
Top P
top_p
|
number (0…1 step 0.01) | 1 | Controls nucleus sampling by limiting generation to tokens whose cumulative probability reaches this value. |
Not when thinking.type ∈ {"adaptive", "enabled"} or temperature ≠ 1
|
|
Top K
top_k
|
integer (0…+∞) | 0 | Limits token sampling to the top K most likely next tokens. |
Not when thinking.type ∈ {"adaptive", "enabled"}
|
Reasoning
2 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Thinking mode
thinking.type
|
enum (disabled | enabled) | "disabled" | Controls the Anthropic thinking mode values supported by this model. | — |
|
Budget tokens
thinking.budget_tokens
|
integer (1024…+∞) | 4096 | Maximum token budget Anthropic may use for extended thinking before producing the final answer. |
Only when thinking.type = "enabled"
|
Claude Haiku 4.5 20251001 Anthropic 6 params
Length
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Max tokens
max_tokens
|
integer (1…+∞) | 4096 | Maximum number of output tokens the model may generate. | — |
Sampling
3 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Temperature
temperature
|
number (0…1 step 0.1) | 1 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. |
Not when thinking.type = "enabled" or top_p ≠ null
|
|
Top P
top_p
|
number (0…1 step 0.01) | 1 | Controls nucleus sampling by limiting generation to tokens whose cumulative probability reaches this value. |
Not when thinking.type = "enabled" or temperature ≠ null
|
|
Top K
top_k
|
integer (0…+∞) | 0 | Limits token sampling to the top K most likely next tokens. |
Not when thinking.type = "enabled"
|
Reasoning
2 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Thinking mode
thinking.type
|
enum (disabled | enabled) | "disabled" | Controls the Anthropic thinking mode values supported by this model. | — |
|
Budget tokens
thinking.budget_tokens
|
integer (1024…+∞) | 4096 | Maximum token budget Anthropic may use for extended thinking before producing the final answer. |
Only when thinking.type = "enabled"
|
Claude Haiku 4.5 20251001 Anthropic Subscription 6 params
Length
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Max tokens
max_tokens
|
integer (1…+∞) | 4096 | Maximum number of output tokens the model may generate. | — |
Sampling
3 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Temperature
temperature
|
number (0…1 step 0.1) | 1 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. |
Not when thinking.type = "enabled" or top_p ≠ null
|
|
Top P
top_p
|
number (0…1 step 0.01) | 1 | Controls nucleus sampling by limiting generation to tokens whose cumulative probability reaches this value. |
Not when thinking.type = "enabled" or temperature ≠ null
|
|
Top K
top_k
|
integer (0…+∞) | 0 | Limits token sampling to the top K most likely next tokens. |
Not when thinking.type = "enabled"
|
Reasoning
2 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Thinking mode
thinking.type
|
enum (disabled | enabled) | "disabled" | Controls the Anthropic thinking mode values supported by this model. | — |
|
Budget tokens
thinking.budget_tokens
|
integer (1024…+∞) | 4096 | Maximum token budget Anthropic may use for extended thinking before producing the final answer. |
Only when thinking.type = "enabled"
|
Claude Haiku 4.5 Anthropic Subscription 6 params
Length
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Max tokens
max_tokens
|
integer (1…+∞) | 4096 | Maximum number of output tokens the model may generate. | — |
Sampling
3 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Temperature
temperature
|
number (0…1 step 0.1) | 1 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. |
Not when thinking.type ∈ {"adaptive", "enabled"}
|
|
Top P
top_p
|
number (0…1 step 0.01) | 1 | Controls nucleus sampling by limiting generation to tokens whose cumulative probability reaches this value. |
Not when thinking.type ∈ {"adaptive", "enabled"} or temperature ≠ 1
|
|
Top K
top_k
|
integer (0…+∞) | 0 | Limits token sampling to the top K most likely next tokens. |
Not when thinking.type ∈ {"adaptive", "enabled"}
|
Reasoning
2 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Thinking mode
thinking.type
|
enum (disabled | enabled) | "disabled" | Controls the Anthropic thinking mode values supported by this model. | — |
|
Budget tokens
thinking.budget_tokens
|
integer (1024…+∞) | 4096 | Maximum token budget Anthropic may use for extended thinking before producing the final answer. |
Only when thinking.type = "enabled"
|
Claude Haiku 4 Anthropic Subscription 6 params
Length
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Max tokens
max_tokens
|
integer (1…+∞) | 4096 | Maximum number of output tokens the model may generate. | — |
Sampling
3 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Temperature
temperature
|
number (0…1 step 0.1) | 1 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. |
Not when thinking.type ∈ {"adaptive", "enabled"}
|
|
Top P
top_p
|
number (0…1 step 0.01) | 1 | Controls nucleus sampling by limiting generation to tokens whose cumulative probability reaches this value. |
Not when thinking.type ∈ {"adaptive", "enabled"} or temperature ≠ 1
|
|
Top K
top_k
|
integer (0…+∞) | 0 | Limits token sampling to the top K most likely next tokens. |
Not when thinking.type ∈ {"adaptive", "enabled"}
|
Reasoning
2 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Thinking mode
thinking.type
|
enum (disabled | enabled) | "disabled" | Controls the Anthropic thinking mode values supported by this model. | — |
|
Budget tokens
thinking.budget_tokens
|
integer (1024…+∞) | 4096 | Maximum token budget Anthropic may use for extended thinking before producing the final answer. |
Only when thinking.type = "enabled"
|
Claude Opus 4.1 20250805 Anthropic 7 params
Length
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Max tokens
max_tokens
|
integer (1…+∞) | 4096 | Maximum number of output tokens the model may generate. | — |
Sampling
3 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Temperature
temperature
|
number (0…1 step 0.1) | 1 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. |
Not when thinking.type = "enabled" or top_p ≠ null
|
|
Top P
top_p
|
number (0…1 step 0.01) | 1 | Controls nucleus sampling by limiting generation to tokens whose cumulative probability reaches this value. |
Not when thinking.type = "enabled" or temperature ≠ null
|
|
Top K
top_k
|
integer (0…+∞) | 0 | Limits token sampling to the top K most likely next tokens. |
Not when thinking.type = "enabled"
|
Reasoning
3 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Thinking mode
thinking.type
|
enum (disabled | enabled) | "disabled" | Controls the Anthropic thinking mode values supported by this model. | — |
|
Budget tokens
thinking.budget_tokens
|
integer (1024…+∞) | 4096 | Maximum token budget Anthropic may use for extended thinking before producing the final answer. |
Only when thinking.type = "enabled"
|
|
Thinking display
thinking.display
|
enum (summarized | omitted) | "summarized" | Controls whether Anthropic returns summarized or omitted thinking content. |
Only when thinking.type = "enabled"
|
Claude Opus 4.1 20250805 Anthropic Subscription 7 params
Length
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Max tokens
max_tokens
|
integer (1…+∞) | 4096 | Maximum number of output tokens the model may generate. | — |
Sampling
3 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Temperature
temperature
|
number (0…1 step 0.1) | 1 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. |
Not when thinking.type = "enabled" or top_p ≠ null
|
|
Top P
top_p
|
number (0…1 step 0.01) | 1 | Controls nucleus sampling by limiting generation to tokens whose cumulative probability reaches this value. |
Not when thinking.type = "enabled" or temperature ≠ null
|
|
Top K
top_k
|
integer (0…+∞) | 0 | Limits token sampling to the top K most likely next tokens. |
Not when thinking.type = "enabled"
|
Reasoning
3 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Thinking mode
thinking.type
|
enum (disabled | enabled) | "disabled" | Controls the Anthropic thinking mode values supported by this model. | — |
|
Budget tokens
thinking.budget_tokens
|
integer (1024…+∞) | 4096 | Maximum token budget Anthropic may use for extended thinking before producing the final answer. |
Only when thinking.type = "enabled"
|
|
Thinking display
thinking.display
|
enum (summarized | omitted) | "summarized" | Controls whether Anthropic returns summarized or omitted thinking content. |
Only when thinking.type = "enabled"
|
Claude Opus 4 20250514 Anthropic 7 params
Length
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Max tokens
max_tokens
|
integer (1…+∞) | 4096 | Maximum number of output tokens the model may generate. | — |
Sampling
3 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Temperature
temperature
|
number (0…1 step 0.1) | 1 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. |
Not when thinking.type = "enabled"
|
|
Top P
top_p
|
number (0…1 step 0.01) | 1 | Controls nucleus sampling by limiting generation to tokens whose cumulative probability reaches this value. |
Not when thinking.type = "enabled"
|
|
Top K
top_k
|
integer (0…+∞) | 0 | Limits token sampling to the top K most likely next tokens. |
Not when thinking.type = "enabled"
|
Reasoning
3 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Thinking mode
thinking.type
|
enum (disabled | enabled) | "disabled" | Controls the Anthropic thinking mode values supported by this model. | — |
|
Budget tokens
thinking.budget_tokens
|
integer (1024…+∞) | 4096 | Maximum token budget Anthropic may use for extended thinking before producing the final answer. |
Only when thinking.type = "enabled"
|
|
Thinking display
thinking.display
|
enum (summarized | omitted) | "summarized" | Controls whether Anthropic returns summarized or omitted thinking content. |
Only when thinking.type = "enabled"
|
Claude Opus 4 20250514 Anthropic Subscription 7 params
Length
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Max tokens
max_tokens
|
integer (1…+∞) | 4096 | Maximum number of output tokens the model may generate. | — |
Sampling
3 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Temperature
temperature
|
number (0…1 step 0.1) | 1 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. |
Not when thinking.type = "enabled"
|
|
Top P
top_p
|
number (0…1 step 0.01) | 1 | Controls nucleus sampling by limiting generation to tokens whose cumulative probability reaches this value. |
Not when thinking.type = "enabled"
|
|
Top K
top_k
|
integer (0…+∞) | 0 | Limits token sampling to the top K most likely next tokens. |
Not when thinking.type = "enabled"
|
Reasoning
3 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Thinking mode
thinking.type
|
enum (disabled | enabled) | "disabled" | Controls the Anthropic thinking mode values supported by this model. | — |
|
Budget tokens
thinking.budget_tokens
|
integer (1024…+∞) | 4096 | Maximum token budget Anthropic may use for extended thinking before producing the final answer. |
Only when thinking.type = "enabled"
|
|
Thinking display
thinking.display
|
enum (summarized | omitted) | "summarized" | Controls whether Anthropic returns summarized or omitted thinking content. |
Only when thinking.type = "enabled"
|
Claude Opus 4.5 20251101 Anthropic 8 params
Length
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Max tokens
max_tokens
|
integer (1…+∞) | 4096 | Maximum number of output tokens the model may generate. | — |
Sampling
3 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Temperature
temperature
|
number (0…1 step 0.1) | 1 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. |
Not when thinking.type = "enabled" or top_p ≠ null
|
|
Top P
top_p
|
number (0…1 step 0.01) | 1 | Controls nucleus sampling by limiting generation to tokens whose cumulative probability reaches this value. |
Not when thinking.type = "enabled" or temperature ≠ null
|
|
Top K
top_k
|
integer (0…+∞) | 0 | Limits token sampling to the top K most likely next tokens. |
Not when thinking.type = "enabled"
|
Reasoning
4 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Thinking mode
thinking.type
|
enum (disabled | enabled) | "disabled" | Controls the Anthropic thinking mode values supported by this model. | — |
|
Budget tokens
thinking.budget_tokens
|
integer (1024…+∞) | 4096 | Maximum token budget Anthropic may use for extended thinking before producing the final answer. |
Only when thinking.type = "enabled"
|
|
Thinking display
thinking.display
|
enum (summarized | omitted) | "summarized" | Controls whether Anthropic returns summarized or omitted thinking content. |
Only when thinking.type = "enabled"
|
|
Effort
output_config.effort
|
enum (low | medium | high) | "high" | Controls Anthropic response thoroughness and token spend. | — |
Claude Opus 4.5 20251101 Anthropic Subscription 8 params
Length
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Max tokens
max_tokens
|
integer (1…+∞) | 4096 | Maximum number of output tokens the model may generate. | — |
Sampling
3 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Temperature
temperature
|
number (0…1 step 0.1) | 1 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. |
Not when thinking.type = "enabled" or top_p ≠ null
|
|
Top P
top_p
|
number (0…1 step 0.01) | 1 | Controls nucleus sampling by limiting generation to tokens whose cumulative probability reaches this value. |
Not when thinking.type = "enabled" or temperature ≠ null
|
|
Top K
top_k
|
integer (0…+∞) | 0 | Limits token sampling to the top K most likely next tokens. |
Not when thinking.type = "enabled"
|
Reasoning
4 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Thinking mode
thinking.type
|
enum (disabled | enabled) | "disabled" | Controls the Anthropic thinking mode values supported by this model. | — |
|
Budget tokens
thinking.budget_tokens
|
integer (1024…+∞) | 4096 | Maximum token budget Anthropic may use for extended thinking before producing the final answer. |
Only when thinking.type = "enabled"
|
|
Thinking display
thinking.display
|
enum (summarized | omitted) | "summarized" | Controls whether Anthropic returns summarized or omitted thinking content. |
Only when thinking.type = "enabled"
|
|
Effort
output_config.effort
|
enum (low | medium | high) | "high" | Controls Anthropic response thoroughness and token spend. | — |
Claude Opus 4.6 Anthropic 8 params
Length
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Max tokens
max_tokens
|
integer (1…+∞) | 4096 | Maximum number of output tokens the model may generate. | — |
Sampling
3 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Temperature
temperature
|
number (0…1 step 0.1) | 1 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. |
Not when thinking.type ∈ {"enabled", "adaptive"} or top_p ≠ null
|
|
Top P
top_p
|
number (0…1 step 0.01) | 1 | Controls nucleus sampling by limiting generation to tokens whose cumulative probability reaches this value. |
Not when thinking.type ∈ {"enabled", "adaptive"} or temperature ≠ null
|
|
Top K
top_k
|
integer (0…+∞) | 0 | Limits token sampling to the top K most likely next tokens. |
Not when thinking.type ∈ {"enabled", "adaptive"}
|
Reasoning
4 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Thinking mode
thinking.type
|
enum (disabled | adaptive | enabled) | "disabled" | Controls the Anthropic thinking mode values supported by this model. | — |
|
Budget tokens
thinking.budget_tokens
|
integer (1024…+∞) | 4096 | Maximum token budget Anthropic may use for extended thinking before producing the final answer. |
Only when thinking.type = "enabled"
|
|
Thinking display
thinking.display
|
enum (summarized | omitted) | "summarized" | Controls whether Anthropic returns summarized or omitted thinking content. |
Only when thinking.type ∈ {"adaptive", "enabled"}
|
|
Effort
output_config.effort
|
enum (low | medium | high | max) | "high" | Controls Anthropic response thoroughness and token spend. | — |
Claude Opus 4.6 Anthropic Subscription 8 params
Length
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Max tokens
max_tokens
|
integer (1…+∞) | 4096 | Maximum number of output tokens the model may generate. | — |
Sampling
3 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Temperature
temperature
|
number (0…1 step 0.1) | 1 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. |
Not when thinking.type ∈ {"enabled", "adaptive"} or top_p ≠ null
|
|
Top P
top_p
|
number (0…1 step 0.01) | 1 | Controls nucleus sampling by limiting generation to tokens whose cumulative probability reaches this value. |
Not when thinking.type ∈ {"enabled", "adaptive"} or temperature ≠ null
|
|
Top K
top_k
|
integer (0…+∞) | 0 | Limits token sampling to the top K most likely next tokens. |
Not when thinking.type ∈ {"enabled", "adaptive"}
|
Reasoning
4 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Thinking mode
thinking.type
|
enum (disabled | adaptive | enabled) | "disabled" | Controls the Anthropic thinking mode values supported by this model. | — |
|
Budget tokens
thinking.budget_tokens
|
integer (1024…+∞) | 4096 | Maximum token budget Anthropic may use for extended thinking before producing the final answer. |
Only when thinking.type = "enabled"
|
|
Thinking display
thinking.display
|
enum (summarized | omitted) | "summarized" | Controls whether Anthropic returns summarized or omitted thinking content. |
Only when thinking.type ∈ {"adaptive", "enabled"}
|
|
Effort
output_config.effort
|
enum (low | medium | high | max) | "high" | Controls Anthropic response thoroughness and token spend. | — |
Claude Opus 4.7 Anthropic 4 params
Length
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Max tokens
max_tokens
|
integer (1…+∞) | 4096 | Maximum number of output tokens the model may generate. | — |
Reasoning
3 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Thinking mode
thinking.type
|
enum (disabled | adaptive) | "disabled" | Controls the Anthropic thinking mode values supported by this model. | — |
|
Thinking display
thinking.display
|
enum (summarized | omitted) | "omitted" | Controls whether Anthropic returns summarized or omitted thinking content. |
Only when thinking.type = "adaptive"
|
|
Effort
output_config.effort
|
enum (low | medium | high | xhigh | max) | "high" | Controls Anthropic response thoroughness and token spend. | — |
Claude Opus 4.7 Anthropic Subscription 4 params
Length
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Max tokens
max_tokens
|
integer (1…+∞) | 4096 | Maximum number of output tokens the model may generate. | — |
Reasoning
3 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Thinking mode
thinking.type
|
enum (disabled | adaptive) | "disabled" | Controls the Anthropic thinking mode values supported by this model. | — |
|
Thinking display
thinking.display
|
enum (summarized | omitted) | "omitted" | Controls whether Anthropic returns summarized or omitted thinking content. |
Only when thinking.type = "adaptive"
|
|
Effort
output_config.effort
|
enum (low | medium | high | xhigh | max) | "high" | Controls Anthropic response thoroughness and token spend. | — |
Claude Opus 4.8 Anthropic 4 params
Length
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Max tokens
max_tokens
|
integer (1…+∞) | 4096 | Maximum number of output tokens the model may generate. | — |
Reasoning
3 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Thinking mode
thinking.type
|
enum (disabled | adaptive) | "disabled" | Controls the Anthropic thinking mode values supported by this model. | — |
|
Thinking display
thinking.display
|
enum (summarized | omitted) | "omitted" | Controls whether Anthropic returns summarized or omitted thinking content. |
Only when thinking.type = "adaptive"
|
|
Effort
output_config.effort
|
enum (low | medium | high | xhigh | max) | "high" | Controls Anthropic response thoroughness and token spend. | — |
Claude Opus 4.8 Anthropic Subscription 4 params
Length
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Max tokens
max_tokens
|
integer (1…+∞) | 4096 | Maximum number of output tokens the model may generate. | — |
Reasoning
3 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Thinking mode
thinking.type
|
enum (disabled | adaptive) | "disabled" | Controls the Anthropic thinking mode values supported by this model. | — |
|
Thinking display
thinking.display
|
enum (summarized | omitted) | "omitted" | Controls whether Anthropic returns summarized or omitted thinking content. |
Only when thinking.type = "adaptive"
|
|
Effort
output_config.effort
|
enum (low | medium | high | xhigh | max) | "high" | Controls Anthropic response thoroughness and token spend. | — |
Claude Opus 4 Anthropic Subscription 6 params
Length
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Max tokens
max_tokens
|
integer (1…+∞) | 4096 | Maximum number of output tokens the model may generate. | — |
Sampling
3 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Temperature
temperature
|
number (0…1 step 0.1) | 1 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. |
Not when thinking.type ∈ {"adaptive", "enabled"}
|
|
Top P
top_p
|
number (0…1 step 0.01) | 1 | Controls nucleus sampling by limiting generation to tokens whose cumulative probability reaches this value. |
Not when thinking.type ∈ {"adaptive", "enabled"} or temperature ≠ 1
|
|
Top K
top_k
|
integer (0…+∞) | 0 | Limits token sampling to the top K most likely next tokens. |
Not when thinking.type ∈ {"adaptive", "enabled"}
|
Reasoning
2 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Thinking mode
thinking.type
|
enum (disabled | adaptive | enabled) | "disabled" | Controls the Anthropic thinking mode values supported by this model. | — |
|
Budget tokens
thinking.budget_tokens
|
integer (1024…+∞) | 4096 | Maximum token budget Anthropic may use for extended thinking before producing the final answer. |
Only when thinking.type = "enabled"
|
Claude Sonnet 4 20250514 Anthropic 7 params
Length
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Max tokens
max_tokens
|
integer (1…+∞) | 4096 | Maximum number of output tokens the model may generate. | — |
Sampling
3 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Temperature
temperature
|
number (0…1 step 0.1) | 1 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. |
Not when thinking.type = "enabled"
|
|
Top P
top_p
|
number (0…1 step 0.01) | 1 | Controls nucleus sampling by limiting generation to tokens whose cumulative probability reaches this value. |
Not when thinking.type = "enabled"
|
|
Top K
top_k
|
integer (0…+∞) | 0 | Limits token sampling to the top K most likely next tokens. |
Not when thinking.type = "enabled"
|
Reasoning
3 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Thinking mode
thinking.type
|
enum (disabled | enabled) | "disabled" | Controls the Anthropic thinking mode values supported by this model. | — |
|
Budget tokens
thinking.budget_tokens
|
integer (1024…+∞) | 4096 | Maximum token budget Anthropic may use for extended thinking before producing the final answer. |
Only when thinking.type = "enabled"
|
|
Thinking display
thinking.display
|
enum (summarized | omitted) | "summarized" | Controls whether Anthropic returns summarized or omitted thinking content. |
Only when thinking.type = "enabled"
|
Claude Sonnet 4 20250514 Anthropic Subscription 7 params
Length
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Max tokens
max_tokens
|
integer (1…+∞) | 4096 | Maximum number of output tokens the model may generate. | — |
Sampling
3 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Temperature
temperature
|
number (0…1 step 0.1) | 1 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. |
Not when thinking.type = "enabled"
|
|
Top P
top_p
|
number (0…1 step 0.01) | 1 | Controls nucleus sampling by limiting generation to tokens whose cumulative probability reaches this value. |
Not when thinking.type = "enabled"
|
|
Top K
top_k
|
integer (0…+∞) | 0 | Limits token sampling to the top K most likely next tokens. |
Not when thinking.type = "enabled"
|
Reasoning
3 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Thinking mode
thinking.type
|
enum (disabled | enabled) | "disabled" | Controls the Anthropic thinking mode values supported by this model. | — |
|
Budget tokens
thinking.budget_tokens
|
integer (1024…+∞) | 4096 | Maximum token budget Anthropic may use for extended thinking before producing the final answer. |
Only when thinking.type = "enabled"
|
|
Thinking display
thinking.display
|
enum (summarized | omitted) | "summarized" | Controls whether Anthropic returns summarized or omitted thinking content. |
Only when thinking.type = "enabled"
|
Claude Sonnet 4.5 Anthropic 6 params
Length
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Max tokens
max_tokens
|
integer (1…+∞) | 4096 | Maximum number of output tokens the model may generate. | — |
Sampling
3 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Temperature
temperature
|
number (0…1 step 0.1) | 1 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. |
Not when thinking.type ∈ {"adaptive", "enabled"}
|
|
Top P
top_p
|
number (0…1 step 0.01) | 1 | Controls nucleus sampling by limiting generation to tokens whose cumulative probability reaches this value. |
Not when thinking.type ∈ {"adaptive", "enabled"} or temperature ≠ 1
|
|
Top K
top_k
|
integer (0…+∞) | 0 | Limits token sampling to the top K most likely next tokens. |
Not when thinking.type ∈ {"adaptive", "enabled"}
|
Reasoning
2 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Thinking mode
thinking.type
|
enum (disabled | adaptive | enabled) | "disabled" | Controls the Anthropic thinking mode values supported by this model. | — |
|
Budget tokens
thinking.budget_tokens
|
integer (1024…+∞) | 4096 | Maximum token budget Anthropic may use for extended thinking before producing the final answer. |
Only when thinking.type = "enabled"
|
Claude Sonnet 4.5 20250929 Anthropic 6 params
Length
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Max tokens
max_tokens
|
integer (1…+∞) | 4096 | Maximum number of output tokens the model may generate. | — |
Sampling
3 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Temperature
temperature
|
number (0…1 step 0.1) | 1 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. |
Not when thinking.type = "enabled" or top_p ≠ null
|
|
Top P
top_p
|
number (0…1 step 0.01) | 1 | Controls nucleus sampling by limiting generation to tokens whose cumulative probability reaches this value. |
Not when thinking.type = "enabled" or temperature ≠ null
|
|
Top K
top_k
|
integer (0…+∞) | 0 | Limits token sampling to the top K most likely next tokens. |
Not when thinking.type = "enabled"
|
Reasoning
2 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Thinking mode
thinking.type
|
enum (disabled | enabled) | "disabled" | Controls the Anthropic thinking mode values supported by this model. | — |
|
Budget tokens
thinking.budget_tokens
|
integer (1024…+∞) | 4096 | Maximum token budget Anthropic may use for extended thinking before producing the final answer. |
Only when thinking.type = "enabled"
|
Claude Sonnet 4.5 20250929 Anthropic Subscription 6 params
Length
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Max tokens
max_tokens
|
integer (1…+∞) | 4096 | Maximum number of output tokens the model may generate. | — |
Sampling
3 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Temperature
temperature
|
number (0…1 step 0.1) | 1 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. |
Not when thinking.type = "enabled" or top_p ≠ null
|
|
Top P
top_p
|
number (0…1 step 0.01) | 1 | Controls nucleus sampling by limiting generation to tokens whose cumulative probability reaches this value. |
Not when thinking.type = "enabled" or temperature ≠ null
|
|
Top K
top_k
|
integer (0…+∞) | 0 | Limits token sampling to the top K most likely next tokens. |
Not when thinking.type = "enabled"
|
Reasoning
2 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Thinking mode
thinking.type
|
enum (disabled | enabled) | "disabled" | Controls the Anthropic thinking mode values supported by this model. | — |
|
Budget tokens
thinking.budget_tokens
|
integer (1024…+∞) | 4096 | Maximum token budget Anthropic may use for extended thinking before producing the final answer. |
Only when thinking.type = "enabled"
|
Claude Sonnet 4.5 Anthropic Subscription 6 params
Length
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Max tokens
max_tokens
|
integer (1…+∞) | 4096 | Maximum number of output tokens the model may generate. | — |
Sampling
3 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Temperature
temperature
|
number (0…1 step 0.1) | 1 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. |
Not when thinking.type ∈ {"adaptive", "enabled"}
|
|
Top P
top_p
|
number (0…1 step 0.01) | 1 | Controls nucleus sampling by limiting generation to tokens whose cumulative probability reaches this value. |
Not when thinking.type ∈ {"adaptive", "enabled"} or temperature ≠ 1
|
|
Top K
top_k
|
integer (0…+∞) | 0 | Limits token sampling to the top K most likely next tokens. |
Not when thinking.type ∈ {"adaptive", "enabled"}
|
Reasoning
2 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Thinking mode
thinking.type
|
enum (disabled | adaptive | enabled) | "disabled" | Controls the Anthropic thinking mode values supported by this model. | — |
|
Budget tokens
thinking.budget_tokens
|
integer (1024…+∞) | 4096 | Maximum token budget Anthropic may use for extended thinking before producing the final answer. |
Only when thinking.type = "enabled"
|
Claude Sonnet 4.6 Anthropic 8 params
Length
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Max tokens
max_tokens
|
integer (1…+∞) | 4096 | Maximum number of output tokens the model may generate. | — |
Sampling
3 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Temperature
temperature
|
number (0…1 step 0.1) | 1 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. |
Not when thinking.type ∈ {"enabled", "adaptive"} or top_p ≠ null
|
|
Top P
top_p
|
number (0…1 step 0.01) | 1 | Controls nucleus sampling by limiting generation to tokens whose cumulative probability reaches this value. |
Not when thinking.type ∈ {"enabled", "adaptive"} or temperature ≠ null
|
|
Top K
top_k
|
integer (0…+∞) | 0 | Limits token sampling to the top K most likely next tokens. |
Not when thinking.type ∈ {"enabled", "adaptive"}
|
Reasoning
4 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Thinking mode
thinking.type
|
enum (disabled | adaptive | enabled) | "disabled" | Controls the Anthropic thinking mode values supported by this model. | — |
|
Budget tokens
thinking.budget_tokens
|
integer (1024…+∞) | 4096 | Maximum token budget Anthropic may use for extended thinking before producing the final answer. |
Only when thinking.type = "enabled"
|
|
Thinking display
thinking.display
|
enum (summarized | omitted) | "summarized" | Controls whether Anthropic returns summarized or omitted thinking content. |
Only when thinking.type ∈ {"adaptive", "enabled"}
|
|
Effort
output_config.effort
|
enum (low | medium | high | max) | "high" | Controls Anthropic response thoroughness and token spend. | — |
Claude Sonnet 4.6 Anthropic Subscription 8 params
Length
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Max tokens
max_tokens
|
integer (1…+∞) | 4096 | Maximum number of output tokens the model may generate. | — |
Sampling
3 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Temperature
temperature
|
number (0…1 step 0.1) | 1 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. |
Not when thinking.type ∈ {"enabled", "adaptive"} or top_p ≠ null
|
|
Top P
top_p
|
number (0…1 step 0.01) | 1 | Controls nucleus sampling by limiting generation to tokens whose cumulative probability reaches this value. |
Not when thinking.type ∈ {"enabled", "adaptive"} or temperature ≠ null
|
|
Top K
top_k
|
integer (0…+∞) | 0 | Limits token sampling to the top K most likely next tokens. |
Not when thinking.type ∈ {"enabled", "adaptive"}
|
Reasoning
4 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Thinking mode
thinking.type
|
enum (disabled | adaptive | enabled) | "disabled" | Controls the Anthropic thinking mode values supported by this model. | — |
|
Budget tokens
thinking.budget_tokens
|
integer (1024…+∞) | 4096 | Maximum token budget Anthropic may use for extended thinking before producing the final answer. |
Only when thinking.type = "enabled"
|
|
Thinking display
thinking.display
|
enum (summarized | omitted) | "summarized" | Controls whether Anthropic returns summarized or omitted thinking content. |
Only when thinking.type ∈ {"adaptive", "enabled"}
|
|
Effort
output_config.effort
|
enum (low | medium | high | max) | "high" | Controls Anthropic response thoroughness and token spend. | — |
Claude Sonnet 4 Anthropic Subscription 6 params
Length
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Max tokens
max_tokens
|
integer (1…+∞) | 4096 | Maximum number of output tokens the model may generate. | — |
Sampling
3 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Temperature
temperature
|
number (0…1 step 0.1) | 1 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. |
Not when thinking.type ∈ {"adaptive", "enabled"}
|
|
Top P
top_p
|
number (0…1 step 0.01) | 1 | Controls nucleus sampling by limiting generation to tokens whose cumulative probability reaches this value. |
Not when thinking.type ∈ {"adaptive", "enabled"} or temperature ≠ 1
|
|
Top K
top_k
|
integer (0…+∞) | 0 | Limits token sampling to the top K most likely next tokens. |
Not when thinking.type ∈ {"adaptive", "enabled"}
|
Reasoning
2 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Thinking mode
thinking.type
|
enum (disabled | adaptive | enabled) | "disabled" | Controls the Anthropic thinking mode values supported by this model. | — |
|
Budget tokens
thinking.budget_tokens
|
integer (1024…+∞) | 4096 | Maximum token budget Anthropic may use for extended thinking before producing the final answer. |
Only when thinking.type = "enabled"
|
Z.ai
GLM-4.5 Z.ai 6 params
Length
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Max tokens
max_tokens
|
integer (1…+∞) | — | Maximum number of tokens to generate in the response. | — |
Sampling
3 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Temperature
temperature
|
number (0…1 step 0.1) | 0.6 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. |
Not when do_sample = false
|
|
Top P
top_p
|
number (0.01…1 step 0.01) | 0.95 | Controls nucleus sampling by limiting generation to tokens within the selected cumulative probability. |
Not when do_sample = false
|
|
Do sample
do_sample
|
boolean | true | When false, the model uses greedy decoding and ignores temperature and top_p. | — |
Reasoning
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Thinking mode
thinking.type
|
enum (enabled | disabled) | "enabled" | Toggles the model's extended reasoning before it produces the final answer. | — |
Output
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Response format
response_format.type
|
enum (text | json_object) | "text" | Forces the response into plain text or a JSON object. | — |
GLM-4.5-Air Z.ai 6 params
Length
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Max tokens
max_tokens
|
integer (1…+∞) | — | Maximum number of tokens to generate in the response. | — |
Sampling
3 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Temperature
temperature
|
number (0…1 step 0.1) | 0.6 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. |
Not when do_sample = false
|
|
Top P
top_p
|
number (0.01…1 step 0.01) | 0.95 | Controls nucleus sampling by limiting generation to tokens within the selected cumulative probability. |
Not when do_sample = false
|
|
Do sample
do_sample
|
boolean | true | When false, the model uses greedy decoding and ignores temperature and top_p. | — |
Reasoning
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Thinking mode
thinking.type
|
enum (enabled | disabled) | "enabled" | Toggles the model's extended reasoning before it produces the final answer. | — |
Output
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Response format
response_format.type
|
enum (text | json_object) | "text" | Forces the response into plain text or a JSON object. | — |
GLM-4.5-Air Z.ai Subscription 6 params
Length
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Max tokens
max_tokens
|
integer (1…+∞) | — | Maximum number of tokens to generate in the response. | — |
Sampling
3 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Temperature
temperature
|
number (0…1 step 0.1) | 0.6 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. |
Not when do_sample = false
|
|
Top P
top_p
|
number (0.01…1 step 0.01) | 0.95 | Controls nucleus sampling by limiting generation to tokens within the selected cumulative probability. |
Not when do_sample = false
|
|
Do sample
do_sample
|
boolean | true | When false, the model uses greedy decoding and ignores temperature and top_p. | — |
Reasoning
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Thinking mode
thinking.type
|
enum (enabled | disabled) | "enabled" | Toggles the model's extended reasoning before it produces the final answer. | — |
Output
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Response format
response_format.type
|
enum (text | json_object) | "text" | Forces the response into plain text or a JSON object. | — |
GLM-4.5-AirX Z.ai 6 params
Length
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Max tokens
max_tokens
|
integer (1…+∞) | — | Maximum number of tokens to generate in the response. | — |
Sampling
3 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Temperature
temperature
|
number (0…1 step 0.1) | 0.6 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. |
Not when do_sample = false
|
|
Top P
top_p
|
number (0.01…1 step 0.01) | 0.95 | Controls nucleus sampling by limiting generation to tokens within the selected cumulative probability. |
Not when do_sample = false
|
|
Do sample
do_sample
|
boolean | true | When false, the model uses greedy decoding and ignores temperature and top_p. | — |
Reasoning
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Thinking mode
thinking.type
|
enum (enabled | disabled) | "enabled" | Toggles the model's extended reasoning before it produces the final answer. | — |
Output
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Response format
response_format.type
|
enum (text | json_object) | "text" | Forces the response into plain text or a JSON object. | — |
GLM-4.5-Flash Z.ai 6 params
Length
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Max tokens
max_tokens
|
integer (1…+∞) | — | Maximum number of tokens to generate in the response. | — |
Sampling
3 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Temperature
temperature
|
number (0…1 step 0.1) | 0.6 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. |
Not when do_sample = false
|
|
Top P
top_p
|
number (0.01…1 step 0.01) | 0.95 | Controls nucleus sampling by limiting generation to tokens within the selected cumulative probability. |
Not when do_sample = false
|
|
Do sample
do_sample
|
boolean | true | When false, the model uses greedy decoding and ignores temperature and top_p. | — |
Reasoning
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Thinking mode
thinking.type
|
enum (enabled | disabled) | "enabled" | Toggles the model's extended reasoning before it produces the final answer. | — |
Output
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Response format
response_format.type
|
enum (text | json_object) | "text" | Forces the response into plain text or a JSON object. | — |
GLM-4.5 Z.ai Subscription 6 params
Length
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Max tokens
max_tokens
|
integer (1…+∞) | — | Maximum number of tokens to generate in the response. | — |
Sampling
3 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Temperature
temperature
|
number (0…1 step 0.1) | 0.6 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. |
Not when do_sample = false
|
|
Top P
top_p
|
number (0.01…1 step 0.01) | 0.95 | Controls nucleus sampling by limiting generation to tokens within the selected cumulative probability. |
Not when do_sample = false
|
|
Do sample
do_sample
|
boolean | true | When false, the model uses greedy decoding and ignores temperature and top_p. | — |
Reasoning
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Thinking mode
thinking.type
|
enum (enabled | disabled) | "enabled" | Toggles the model's extended reasoning before it produces the final answer. | — |
Output
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Response format
response_format.type
|
enum (text | json_object) | "text" | Forces the response into plain text or a JSON object. | — |
GLM-4.5-X Z.ai 6 params
Length
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Max tokens
max_tokens
|
integer (1…+∞) | — | Maximum number of tokens to generate in the response. | — |
Sampling
3 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Temperature
temperature
|
number (0…1 step 0.1) | 0.6 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. |
Not when do_sample = false
|
|
Top P
top_p
|
number (0.01…1 step 0.01) | 0.95 | Controls nucleus sampling by limiting generation to tokens within the selected cumulative probability. |
Not when do_sample = false
|
|
Do sample
do_sample
|
boolean | true | When false, the model uses greedy decoding and ignores temperature and top_p. | — |
Reasoning
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Thinking mode
thinking.type
|
enum (enabled | disabled) | "enabled" | Toggles the model's extended reasoning before it produces the final answer. | — |
Output
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Response format
response_format.type
|
enum (text | json_object) | "text" | Forces the response into plain text or a JSON object. | — |
GLM-4.6 Z.ai 6 params
Length
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Max tokens
max_tokens
|
integer (1…+∞) | — | Maximum number of tokens to generate in the response. | — |
Sampling
3 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Temperature
temperature
|
number (0…1 step 0.1) | 1 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. |
Not when do_sample = false
|
|
Top P
top_p
|
number (0.01…1 step 0.01) | 0.95 | Controls nucleus sampling by limiting generation to tokens within the selected cumulative probability. |
Not when do_sample = false
|
|
Do sample
do_sample
|
boolean | true | When false, the model uses greedy decoding and ignores temperature and top_p. | — |
Reasoning
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Thinking mode
thinking.type
|
enum (enabled | disabled) | "enabled" | Toggles the model's extended reasoning before it produces the final answer. | — |
Output
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Response format
response_format.type
|
enum (text | json_object) | "text" | Forces the response into plain text or a JSON object. | — |
GLM-4.6 Z.ai Subscription 6 params
Length
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Max tokens
max_tokens
|
integer (1…+∞) | — | Maximum number of tokens to generate in the response. | — |
Sampling
3 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Temperature
temperature
|
number (0…1 step 0.1) | 1 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. |
Not when do_sample = false
|
|
Top P
top_p
|
number (0.01…1 step 0.01) | 0.95 | Controls nucleus sampling by limiting generation to tokens within the selected cumulative probability. |
Not when do_sample = false
|
|
Do sample
do_sample
|
boolean | true | When false, the model uses greedy decoding and ignores temperature and top_p. | — |
Reasoning
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Thinking mode
thinking.type
|
enum (enabled | disabled) | "enabled" | Toggles the model's extended reasoning before it produces the final answer. | — |
Output
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Response format
response_format.type
|
enum (text | json_object) | "text" | Forces the response into plain text or a JSON object. | — |
GLM-4.7 Z.ai 6 params
Length
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Max tokens
max_tokens
|
integer (1…+∞) | — | Maximum number of tokens to generate in the response. | — |
Sampling
3 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Temperature
temperature
|
number (0…1 step 0.1) | 1 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. |
Not when do_sample = false
|
|
Top P
top_p
|
number (0.01…1 step 0.01) | 0.95 | Controls nucleus sampling by limiting generation to tokens within the selected cumulative probability. |
Not when do_sample = false
|
|
Do sample
do_sample
|
boolean | true | When false, the model uses greedy decoding and ignores temperature and top_p. | — |
Reasoning
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Thinking mode
thinking.type
|
enum (enabled | disabled) | "enabled" | Toggles the model's extended reasoning before it produces the final answer. | — |
Output
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Response format
response_format.type
|
enum (text | json_object) | "text" | Forces the response into plain text or a JSON object. | — |
GLM-4.7-Flash Z.ai 6 params
Length
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Max tokens
max_tokens
|
integer (1…+∞) | — | Maximum number of tokens to generate in the response. | — |
Sampling
3 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Temperature
temperature
|
number (0…1 step 0.1) | 1 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. |
Not when do_sample = false
|
|
Top P
top_p
|
number (0.01…1 step 0.01) | 0.95 | Controls nucleus sampling by limiting generation to tokens within the selected cumulative probability. |
Not when do_sample = false
|
|
Do sample
do_sample
|
boolean | true | When false, the model uses greedy decoding and ignores temperature and top_p. | — |
Reasoning
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Thinking mode
thinking.type
|
enum (enabled | disabled) | "enabled" | Toggles the model's extended reasoning before it produces the final answer. | — |
Output
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Response format
response_format.type
|
enum (text | json_object) | "text" | Forces the response into plain text or a JSON object. | — |
GLM-4.7-FlashX Z.ai 6 params
Length
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Max tokens
max_tokens
|
integer (1…+∞) | — | Maximum number of tokens to generate in the response. | — |
Sampling
3 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Temperature
temperature
|
number (0…1 step 0.1) | 1 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. |
Not when do_sample = false
|
|
Top P
top_p
|
number (0.01…1 step 0.01) | 0.95 | Controls nucleus sampling by limiting generation to tokens within the selected cumulative probability. |
Not when do_sample = false
|
|
Do sample
do_sample
|
boolean | true | When false, the model uses greedy decoding and ignores temperature and top_p. | — |
Reasoning
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Thinking mode
thinking.type
|
enum (enabled | disabled) | "enabled" | Toggles the model's extended reasoning before it produces the final answer. | — |
Output
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Response format
response_format.type
|
enum (text | json_object) | "text" | Forces the response into plain text or a JSON object. | — |
GLM-4.7 Z.ai Subscription 6 params
Length
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Max tokens
max_tokens
|
integer (1…+∞) | — | Maximum number of tokens to generate in the response. | — |
Sampling
3 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Temperature
temperature
|
number (0…1 step 0.1) | 1 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. |
Not when do_sample = false
|
|
Top P
top_p
|
number (0.01…1 step 0.01) | 0.95 | Controls nucleus sampling by limiting generation to tokens within the selected cumulative probability. |
Not when do_sample = false
|
|
Do sample
do_sample
|
boolean | true | When false, the model uses greedy decoding and ignores temperature and top_p. | — |
Reasoning
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Thinking mode
thinking.type
|
enum (enabled | disabled) | "enabled" | Toggles the model's extended reasoning before it produces the final answer. | — |
Output
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Response format
response_format.type
|
enum (text | json_object) | "text" | Forces the response into plain text or a JSON object. | — |
GLM-5 Z.ai 6 params
Length
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Max tokens
max_tokens
|
integer (1…+∞) | — | Maximum number of tokens to generate in the response. | — |
Sampling
3 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Temperature
temperature
|
number (0…1 step 0.1) | 1 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. |
Not when do_sample = false
|
|
Top P
top_p
|
number (0.01…1 step 0.01) | 0.95 | Controls nucleus sampling by limiting generation to tokens within the selected cumulative probability. |
Not when do_sample = false
|
|
Do sample
do_sample
|
boolean | true | When false, the model uses greedy decoding and ignores temperature and top_p. | — |
Reasoning
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Thinking mode
thinking.type
|
enum (enabled | disabled) | "enabled" | Toggles the model's extended reasoning before it produces the final answer. | — |
Output
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Response format
response_format.type
|
enum (text | json_object) | "text" | Forces the response into plain text or a JSON object. | — |
GLM-5 Z.ai Subscription 6 params
Length
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Max tokens
max_tokens
|
integer (1…+∞) | — | Maximum number of tokens to generate in the response. | — |
Sampling
3 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Temperature
temperature
|
number (0…1 step 0.1) | 1 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. |
Not when do_sample = false
|
|
Top P
top_p
|
number (0.01…1 step 0.01) | 0.95 | Controls nucleus sampling by limiting generation to tokens within the selected cumulative probability. |
Not when do_sample = false
|
|
Do sample
do_sample
|
boolean | true | When false, the model uses greedy decoding and ignores temperature and top_p. | — |
Reasoning
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Thinking mode
thinking.type
|
enum (enabled | disabled) | "enabled" | Toggles the model's extended reasoning before it produces the final answer. | — |
Output
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Response format
response_format.type
|
enum (text | json_object) | "text" | Forces the response into plain text or a JSON object. | — |
GLM-5-Turbo Z.ai 6 params
Length
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Max tokens
max_tokens
|
integer (1…+∞) | — | Maximum number of tokens to generate in the response. | — |
Sampling
3 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Temperature
temperature
|
number (0…1 step 0.1) | 1 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. |
Not when do_sample = false
|
|
Top P
top_p
|
number (0.01…1 step 0.01) | 0.95 | Controls nucleus sampling by limiting generation to tokens within the selected cumulative probability. |
Not when do_sample = false
|
|
Do sample
do_sample
|
boolean | true | When false, the model uses greedy decoding and ignores temperature and top_p. | — |
Reasoning
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Thinking mode
thinking.type
|
enum (enabled | disabled) | "enabled" | Toggles the model's extended reasoning before it produces the final answer. | — |
Output
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Response format
response_format.type
|
enum (text | json_object) | "text" | Forces the response into plain text or a JSON object. | — |
GLM-5-Turbo Z.ai Subscription 6 params
Length
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Max tokens
max_tokens
|
integer (1…+∞) | — | Maximum number of tokens to generate in the response. | — |
Sampling
3 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Temperature
temperature
|
number (0…1 step 0.1) | 1 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. |
Not when do_sample = false
|
|
Top P
top_p
|
number (0.01…1 step 0.01) | 0.95 | Controls nucleus sampling by limiting generation to tokens within the selected cumulative probability. |
Not when do_sample = false
|
|
Do sample
do_sample
|
boolean | true | When false, the model uses greedy decoding and ignores temperature and top_p. | — |
Reasoning
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Thinking mode
thinking.type
|
enum (enabled | disabled) | "enabled" | Toggles the model's extended reasoning before it produces the final answer. | — |
Output
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Response format
response_format.type
|
enum (text | json_object) | "text" | Forces the response into plain text or a JSON object. | — |
GLM-5.1 Z.ai 6 params
Length
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Max tokens
max_tokens
|
integer (1…+∞) | — | Maximum number of tokens to generate in the response. | — |
Sampling
3 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Temperature
temperature
|
number (0…1 step 0.1) | 1 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. |
Not when do_sample = false
|
|
Top P
top_p
|
number (0.01…1 step 0.01) | 0.95 | Controls nucleus sampling by limiting generation to tokens within the selected cumulative probability. |
Not when do_sample = false
|
|
Do sample
do_sample
|
boolean | true | When false, the model uses greedy decoding and ignores temperature and top_p. | — |
Reasoning
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Thinking mode
thinking.type
|
enum (enabled | disabled) | "enabled" | Toggles the model's extended reasoning before it produces the final answer. | — |
Output
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Response format
response_format.type
|
enum (text | json_object) | "text" | Forces the response into plain text or a JSON object. | — |
GLM-5.1 Z.ai Subscription 6 params
Length
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Max tokens
max_tokens
|
integer (1…+∞) | — | Maximum number of tokens to generate in the response. | — |
Sampling
3 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Temperature
temperature
|
number (0…1 step 0.1) | 1 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. |
Not when do_sample = false
|
|
Top P
top_p
|
number (0.01…1 step 0.01) | 0.95 | Controls nucleus sampling by limiting generation to tokens within the selected cumulative probability. |
Not when do_sample = false
|
|
Do sample
do_sample
|
boolean | true | When false, the model uses greedy decoding and ignores temperature and top_p. | — |
Reasoning
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Thinking mode
thinking.type
|
enum (enabled | disabled) | "enabled" | Toggles the model's extended reasoning before it produces the final answer. | — |
Output
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Response format
response_format.type
|
enum (text | json_object) | "text" | Forces the response into plain text or a JSON object. | — |
MiniMax
MiniMax M2 MiniMax 4 params
Length
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Max completion tokens
max_completion_tokens
|
integer (1…+∞) | — | Maximum number of tokens to generate in the completion. | — |
Sampling
2 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Temperature
temperature
|
number (0.01…1 step 0.01) | 1 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. Values must be greater than 0 and at most 1. | — |
|
Top P
top_p
|
number (0.01…1 step 0.01) | 0.95 | Controls nucleus sampling by limiting generation to tokens within the selected cumulative probability. | — |
Reasoning
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Split reasoning
reasoning_split
|
boolean | false | Returns the model's reasoning in a separate reasoning_details field instead of inline with the response. | — |
MiniMax M2 MiniMax Subscription 3 params
Length
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Max tokens
max_tokens
|
integer (1…+∞) | — | Maximum number of tokens to generate in the response. | — |
Sampling
2 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Temperature
temperature
|
number (0.01…1 step 0.01) | 1 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. Values must be greater than 0 and at most 1. | — |
|
Top P
top_p
|
number (0.01…1 step 0.01) | 0.95 | Controls nucleus sampling by limiting generation to tokens within the selected cumulative probability. | — |
MiniMax M2.1 MiniMax 4 params
Length
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Max completion tokens
max_completion_tokens
|
integer (1…+∞) | — | Maximum number of tokens to generate in the completion. | — |
Sampling
2 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Temperature
temperature
|
number (0.01…1 step 0.01) | 1 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. Values must be greater than 0 and at most 1. | — |
|
Top P
top_p
|
number (0.01…1 step 0.01) | 0.95 | Controls nucleus sampling by limiting generation to tokens within the selected cumulative probability. | — |
Reasoning
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Split reasoning
reasoning_split
|
boolean | false | Returns the model's reasoning in a separate reasoning_details field instead of inline with the response. | — |
MiniMax M2.1 Highspeed MiniMax 4 params
Length
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Max completion tokens
max_completion_tokens
|
integer (1…+∞) | — | Maximum number of tokens to generate in the completion. | — |
Sampling
2 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Temperature
temperature
|
number (0.01…1 step 0.01) | 1 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. Values must be greater than 0 and at most 1. | — |
|
Top P
top_p
|
number (0.01…1 step 0.01) | 0.95 | Controls nucleus sampling by limiting generation to tokens within the selected cumulative probability. | — |
Reasoning
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Split reasoning
reasoning_split
|
boolean | false | Returns the model's reasoning in a separate reasoning_details field instead of inline with the response. | — |
MiniMax M2.1 Highspeed MiniMax Subscription 3 params
Length
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Max tokens
max_tokens
|
integer (1…+∞) | — | Maximum number of tokens to generate in the response. | — |
Sampling
2 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Temperature
temperature
|
number (0.01…1 step 0.01) | 1 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. Values must be greater than 0 and at most 1. | — |
|
Top P
top_p
|
number (0.01…1 step 0.01) | 0.95 | Controls nucleus sampling by limiting generation to tokens within the selected cumulative probability. | — |
MiniMax M2.1 MiniMax Subscription 3 params
Length
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Max tokens
max_tokens
|
integer (1…+∞) | — | Maximum number of tokens to generate in the response. | — |
Sampling
2 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Temperature
temperature
|
number (0.01…1 step 0.01) | 1 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. Values must be greater than 0 and at most 1. | — |
|
Top P
top_p
|
number (0.01…1 step 0.01) | 0.95 | Controls nucleus sampling by limiting generation to tokens within the selected cumulative probability. | — |
MiniMax M2.5 MiniMax 4 params
Length
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Max completion tokens
max_completion_tokens
|
integer (1…+∞) | — | Maximum number of tokens to generate in the completion. | — |
Sampling
2 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Temperature
temperature
|
number (0.01…1 step 0.01) | 1 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. Values must be greater than 0 and at most 1. | — |
|
Top P
top_p
|
number (0.01…1 step 0.01) | 0.95 | Controls nucleus sampling by limiting generation to tokens within the selected cumulative probability. | — |
Reasoning
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Split reasoning
reasoning_split
|
boolean | false | Returns the model's reasoning in a separate reasoning_details field instead of inline with the response. | — |
MiniMax M2.5 Highspeed MiniMax 4 params
Length
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Max completion tokens
max_completion_tokens
|
integer (1…+∞) | — | Maximum number of tokens to generate in the completion. | — |
Sampling
2 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Temperature
temperature
|
number (0.01…1 step 0.01) | 1 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. Values must be greater than 0 and at most 1. | — |
|
Top P
top_p
|
number (0.01…1 step 0.01) | 0.95 | Controls nucleus sampling by limiting generation to tokens within the selected cumulative probability. | — |
Reasoning
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Split reasoning
reasoning_split
|
boolean | false | Returns the model's reasoning in a separate reasoning_details field instead of inline with the response. | — |
MiniMax M2.5 Highspeed MiniMax Subscription 3 params
Length
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Max tokens
max_tokens
|
integer (1…+∞) | — | Maximum number of tokens to generate in the response. | — |
Sampling
2 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Temperature
temperature
|
number (0.01…1 step 0.01) | 1 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. Values must be greater than 0 and at most 1. | — |
|
Top P
top_p
|
number (0.01…1 step 0.01) | 0.95 | Controls nucleus sampling by limiting generation to tokens within the selected cumulative probability. | — |
MiniMax M2.5 MiniMax Subscription 3 params
Length
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Max tokens
max_tokens
|
integer (1…+∞) | — | Maximum number of tokens to generate in the response. | — |
Sampling
2 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Temperature
temperature
|
number (0.01…1 step 0.01) | 1 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. Values must be greater than 0 and at most 1. | — |
|
Top P
top_p
|
number (0.01…1 step 0.01) | 0.95 | Controls nucleus sampling by limiting generation to tokens within the selected cumulative probability. | — |
MiniMax M2.7 MiniMax 4 params
Length
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Max completion tokens
max_completion_tokens
|
integer (1…+∞) | — | Maximum number of tokens to generate in the completion. | — |
Sampling
2 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Temperature
temperature
|
number (0.01…1 step 0.01) | 1 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. Values must be greater than 0 and at most 1. | — |
|
Top P
top_p
|
number (0.01…1 step 0.01) | 0.95 | Controls nucleus sampling by limiting generation to tokens within the selected cumulative probability. | — |
Reasoning
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Split reasoning
reasoning_split
|
boolean | false | Returns the model's reasoning in a separate reasoning_details field instead of inline with the response. | — |
MiniMax M2.7 Highspeed MiniMax 4 params
Length
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Max completion tokens
max_completion_tokens
|
integer (1…+∞) | — | Maximum number of tokens to generate in the completion. | — |
Sampling
2 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Temperature
temperature
|
number (0.01…1 step 0.01) | 1 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. Values must be greater than 0 and at most 1. | — |
|
Top P
top_p
|
number (0.01…1 step 0.01) | 0.95 | Controls nucleus sampling by limiting generation to tokens within the selected cumulative probability. | — |
Reasoning
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Split reasoning
reasoning_split
|
boolean | false | Returns the model's reasoning in a separate reasoning_details field instead of inline with the response. | — |
MiniMax M2.7 Highspeed MiniMax Subscription 3 params
Length
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Max tokens
max_tokens
|
integer (1…+∞) | — | Maximum number of tokens to generate in the response. | — |
Sampling
2 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Temperature
temperature
|
number (0.01…1 step 0.01) | 1 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. Values must be greater than 0 and at most 1. | — |
|
Top P
top_p
|
number (0.01…1 step 0.01) | 0.95 | Controls nucleus sampling by limiting generation to tokens within the selected cumulative probability. | — |
MiniMax M2.7 MiniMax Subscription 3 params
Length
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Max tokens
max_tokens
|
integer (1…+∞) | — | Maximum number of tokens to generate in the response. | — |
Sampling
2 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Temperature
temperature
|
number (0.01…1 step 0.01) | 1 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. Values must be greater than 0 and at most 1. | — |
|
Top P
top_p
|
number (0.01…1 step 0.01) | 0.95 | Controls nucleus sampling by limiting generation to tokens within the selected cumulative probability. | — |
Minimax M3 MiniMax 4 params
Length
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Max completion tokens
max_completion_tokens
|
integer (1…+∞) | — | Maximum number of tokens to generate in the completion. | — |
Sampling
2 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Temperature
temperature
|
number (0.01…1 step 0.01) | 1 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. Values must be greater than 0 and at most 1. | — |
|
Top P
top_p
|
number (0.01…1 step 0.01) | 0.95 | Controls nucleus sampling by limiting generation to tokens within the selected cumulative probability. | — |
Reasoning
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Split reasoning
reasoning_split
|
boolean | false | Returns the model's reasoning in a separate reasoning_details field instead of inline with the response. | — |
MiniMax M3 MiniMax Subscription 3 params
Length
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Max tokens
max_tokens
|
integer (1…+∞) | — | Maximum number of tokens to generate in the response. | — |
Sampling
2 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Temperature
temperature
|
number (0.01…1 step 0.01) | 1 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. Values must be greater than 0 and at most 1. | — |
|
Top P
top_p
|
number (0.01…1 step 0.01) | 0.95 | Controls nucleus sampling by limiting generation to tokens within the selected cumulative probability. | — |
Nvidia
Gliner Pii Nvidia 4 params
Sampling
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Threshold
threshold
|
number (0…1) | 0.5 | Confidence threshold for entity detection. Lower values detect more entities but may include false positives. | — |
Metadata
3 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Chunk length
chunk_length
|
integer (1…2048) | 384 | Context window size for processing. Longer texts are automatically split into chunks with overlap for complete coverage. Must be greater than overlap. | — |
|
Overlap
overlap
|
integer (0…512) | 128 | Token overlap between chunks to prevent entity clipping. Must be less than chunk_length. | — |
|
Flat NER
flat_ner
|
boolean | false | When true, prevents overlapping entity spans. When false, may return nested entities such as both a full name and its constituent first name. | — |
Llama 3.1 Nemoguard 8b Topic Control Nvidia 6 params
Length
2 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Max tokens
max_tokens
|
integer (1…+∞) | 1024 | Maximum number of tokens to generate. Generation stops when this limit is reached. | — |
|
Stop
stop
|
string | — | A string or list of strings where the API will stop generating further tokens. The returned text will not contain the stop sequence. | — |
Sampling
4 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Temperature
temperature
|
number (0…2) | 0.5 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. Not recommended to modify both temperature and top_p in the same call. | — |
|
Top P
top_p
|
number (-∞…1) | 1 | Controls nucleus sampling by limiting generation to tokens within the selected cumulative probability. Not recommended to modify both temperature and top_p in the same call. | — |
|
Frequency penalty
frequency_penalty
|
number (-2…2) | 0 | Penalizes new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim. | — |
|
Presence penalty
presence_penalty
|
number (-2…2) | 0 | Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics. | — |
Llama 3.1 Nemotron Nano 8b V1 Nvidia 7 params
Length
2 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Max tokens
max_tokens
|
integer (1…16384) | 4096 | Maximum number of tokens to generate. Generation stops when this limit is reached. | — |
|
Stop
stop
|
string | — | A string or list of strings where the API will stop generating further tokens. The returned text will not contain the stop sequence. | — |
Sampling
5 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Temperature
temperature
|
number (0…1) | 0.6 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. Not recommended to modify both temperature and top_p in the same call. | — |
|
Top P
top_p
|
number (-∞…1) | 0.95 | Controls nucleus sampling by limiting generation to tokens within the selected cumulative probability. Not recommended to modify both temperature and top_p in the same call. | — |
|
Frequency penalty
frequency_penalty
|
number (-2…2) | 0 | Penalizes new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim. | — |
|
Presence penalty
presence_penalty
|
number (-2…2) | 0 | Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics. | — |
|
Seed
seed
|
integer (0…18446744073709552000) | 0 | Best-effort deterministic sampling seed. Changing the seed produces a different response with similar characteristics. Fix the seed to reproduce results. | — |
Llama 3.1 Nemotron Safety Guard 8b V3 Nvidia 1 param
Sampling
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Temperature
temperature
|
number (0…1) | 0 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. | — |
Llama 3.1 Nemotron Ultra 253b V1 Nvidia 7 params
Length
2 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Max tokens
max_tokens
|
integer (1…16384) | 4096 | Maximum number of tokens to generate. Generation stops when this limit is reached. | — |
|
Stop
stop
|
string | — | A string or list of strings where the API will stop generating further tokens. The returned text will not contain the stop sequence. | — |
Sampling
5 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Temperature
temperature
|
number (0…1) | 0.6 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. Not recommended to modify both temperature and top_p in the same call. | — |
|
Top P
top_p
|
number (-∞…1) | 0.95 | Controls nucleus sampling by limiting generation to tokens within the selected cumulative probability. Not recommended to modify both temperature and top_p in the same call. | — |
|
Frequency penalty
frequency_penalty
|
number (-2…2) | 0 | Penalizes new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim. | — |
|
Presence penalty
presence_penalty
|
number (-2…2) | 0 | Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics. | — |
|
Seed
seed
|
integer (0…18446744073709552000) | 0 | Best-effort deterministic sampling seed. Changing the seed produces a different response with similar characteristics. Fix the seed to reproduce results. | — |
Llama 3.3 Nemotron Super 49b V1 Nvidia 7 params
Length
2 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Max tokens
max_tokens
|
integer (1…16384) | 4096 | Maximum number of tokens to generate. Generation stops when this limit is reached. | — |
|
Stop
stop
|
string | — | A string or list of strings where the API will stop generating further tokens. The returned text will not contain the stop sequence. | — |
Sampling
5 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Temperature
temperature
|
number (0…1) | 0.6 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. Not recommended to modify both temperature and top_p in the same call. | — |
|
Top P
top_p
|
number (-∞…1) | 0.95 | Controls nucleus sampling by limiting generation to tokens within the selected cumulative probability. Not recommended to modify both temperature and top_p in the same call. | — |
|
Frequency penalty
frequency_penalty
|
number (-2…2) | 0 | Penalizes new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim. | — |
|
Presence penalty
presence_penalty
|
number (-2…2) | 0 | Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics. | — |
|
Seed
seed
|
integer (0…18446744073709552000) | 0 | Best-effort deterministic sampling seed. Changing the seed produces a different response with similar characteristics. Fix the seed to reproduce results. | — |
Llama 3.3 Nemotron Super 49b V1.5 Nvidia 7 params
Length
2 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Max tokens
max_tokens
|
integer (1…65536) | 65536 | Maximum number of tokens to generate. Generation stops when this limit is reached. | — |
|
Stop
stop
|
string | — | A string or list of strings where the API will stop generating further tokens. The returned text will not contain the stop sequence. | — |
Sampling
5 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Temperature
temperature
|
number (0…1) | 0.6 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. Not recommended to modify both temperature and top_p in the same call. | — |
|
Top P
top_p
|
number (-∞…1) | 0.95 | Controls nucleus sampling by limiting generation to tokens within the selected cumulative probability. Not recommended to modify both temperature and top_p in the same call. | — |
|
Frequency penalty
frequency_penalty
|
number (-2…2) | 0 | Penalizes new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim. | — |
|
Presence penalty
presence_penalty
|
number (-2…2) | 0 | Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics. | — |
|
Seed
seed
|
integer (0…18446744073709552000) | 0 | Best-effort deterministic sampling seed. Changing the seed produces a different response with similar characteristics. Fix the seed to reproduce results. | — |
Nemoguard Jailbreak Detect Nvidia 0 params
No parameters documented yet.
Nemotron 3 Nano 30b A3b Nvidia 5 params
Length
2 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Max tokens
max_tokens
|
integer (1…32768) | 16384 | Maximum number of tokens to generate. Generation stops when this limit is reached. | — |
|
Stop
stop
|
string | — | A string or list of strings where the API will stop generating further tokens. The returned text will not contain the stop sequence. | — |
Sampling
3 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Temperature
temperature
|
number (-∞…1) | 1 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. Not recommended to modify both temperature and top_p in the same call. | — |
|
Top P
top_p
|
number (-∞…1) | 1 | Controls nucleus sampling by limiting generation to tokens within the selected cumulative probability. Not recommended to modify both temperature and top_p in the same call. | — |
|
Seed
seed
|
integer (0…18446744073709552000) | — | Best-effort deterministic sampling seed. Repeated requests with the same seed and parameters should return the same result. | — |
Nemotron 3 Super 120b A12b Nvidia 7 params
Length
2 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Max tokens
max_tokens
|
integer (1…32768) | 16384 | Maximum number of tokens to generate. Generation stops when this limit is reached. | — |
|
Stop
stop
|
string | — | A string or list of strings where the API will stop generating further tokens. The returned text will not contain the stop sequence. | — |
Sampling
3 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Temperature
temperature
|
number (-∞…1) | 1 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. Not recommended to modify both temperature and top_p in the same call. | — |
|
Top P
top_p
|
number (-∞…1) | 0.95 | Controls nucleus sampling by limiting generation to tokens within the selected cumulative probability. Not recommended to modify both temperature and top_p in the same call. | — |
|
Seed
seed
|
integer (0…18446744073709552000) | — | Best-effort deterministic sampling seed. Repeated requests with the same seed and parameters should return the same result. | — |
Reasoning
2 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Reasoning effort
reasoning_effort
|
enum (none | low | high) | "high" | Controls the reasoning mode. 'none' disables reasoning tokens, 'low' enables low-effort reasoning, and 'high' enables full reasoning. | — |
|
Reasoning budget
reasoning_budget
|
integer (-1…32768) | 16384 | Maximum number of tokens the model may use for internal reasoning before being forced to end the reasoning trace. Use -1 to disable budget enforcement. | — |
Nemotron 3 Ultra 550b A55b Nvidia 7 params
Length
2 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Max tokens
max_tokens
|
integer (1…32768) | 16384 | Maximum number of tokens to generate. Generation stops when this limit is reached. | — |
|
Stop
stop
|
string | — | A string or list of strings where the API will stop generating further tokens. The returned text will not contain the stop sequence. | — |
Sampling
3 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Temperature
temperature
|
number (-∞…1) | 1 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. Not recommended to modify both temperature and top_p in the same call. | — |
|
Top P
top_p
|
number (-∞…1) | 0.95 | Controls nucleus sampling by limiting generation to tokens within the selected cumulative probability. Not recommended to modify both temperature and top_p in the same call. | — |
|
Seed
seed
|
integer (0…18446744073709552000) | — | Best-effort deterministic sampling seed. Repeated requests with the same seed and parameters should return the same result. | — |
Reasoning
2 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Reasoning effort
reasoning_effort
|
enum (none | medium | high) | "high" | Controls the reasoning mode. 'none' disables reasoning tokens, 'medium' enables efficient reasoning, and 'high' enables full reasoning. | — |
|
Reasoning budget
reasoning_budget
|
integer (-1…32768) | 16384 | Maximum number of tokens the model may use for internal reasoning before being forced to end the reasoning trace. Use -1 to disable budget enforcement. | — |
Nemotron Content Safety Reasoning 4b Nvidia 5 params
Length
2 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Max tokens
max_tokens
|
integer (1…32768) | 16384 | Maximum number of tokens to generate. Generation stops when this limit is reached. | — |
|
Stop
stop
|
string | — | A string or list of strings where the API will stop generating further tokens. The returned text will not contain the stop sequence. | — |
Sampling
3 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Temperature
temperature
|
number (-∞…1) | 1 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. Not recommended to modify both temperature and top_p in the same call. | — |
|
Top P
top_p
|
number (-∞…1) | 1 | Controls nucleus sampling by limiting generation to tokens within the selected cumulative probability. Not recommended to modify both temperature and top_p in the same call. | — |
|
Seed
seed
|
integer (0…18446744073709552000) | — | Best-effort deterministic sampling seed. Repeated requests with the same seed and parameters should return the same result. | — |
Nemotron Mini 4b Instruct Nvidia 6 params
Length
2 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Max tokens
max_tokens
|
integer (1…4096) | 1024 | Maximum number of tokens to generate. Generation stops when this limit is reached. | — |
|
Stop
stop
|
string | — | A string or list of strings where the API will stop generating further tokens. The returned text will not contain the stop sequence. | — |
Sampling
4 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Temperature
temperature
|
number (0…1) | 0.2 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. Not recommended to modify both temperature and top_p in the same call. | — |
|
Top P
top_p
|
number (-∞…1) | 0.7 | Controls nucleus sampling by limiting generation to tokens within the selected cumulative probability. Not recommended to modify both temperature and top_p in the same call. | — |
|
Frequency penalty
frequency_penalty
|
number (-2…2) | 0 | Penalizes new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim. | — |
|
Presence penalty
presence_penalty
|
number (-2…2) | 0 | Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics. | — |
Riva Translate 4b Instruct V1.1 Nvidia 6 params
Length
2 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Max tokens
max_tokens
|
integer (1…4096) | 512 | Maximum number of tokens to generate. Generation stops when this limit is reached. | — |
|
Stop
stop
|
string | — | A string or list of strings where the API will stop generating further tokens. The returned text will not contain the stop sequence. | — |
Sampling
4 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Temperature
temperature
|
number (0…1) | 0 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. Not recommended to modify both temperature and top_p in the same call. | — |
|
Top P
top_p
|
number (-∞…1) | 0.9 | Controls nucleus sampling by limiting generation to tokens within the selected cumulative probability. Not recommended to modify both temperature and top_p in the same call. | — |
|
Frequency penalty
frequency_penalty
|
number (-2…2) | 0 | Penalizes new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim. | — |
|
Presence penalty
presence_penalty
|
number (-2…2) | 0 | Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics. | — |
Usdcode Llama 3.1 70b Instruct Nvidia 4 params
Length
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Max tokens
max_tokens
|
integer (1…2048) | 1024 | Maximum number of tokens to generate. Generation stops when this limit is reached. | — |
Sampling
2 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Temperature
temperature
|
number (0…1) | 0.1 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. Not recommended to modify both temperature and top_p in the same call. | — |
|
Top P
top_p
|
number (-∞…1) | 1 | Controls nucleus sampling by limiting generation to tokens within the selected cumulative probability. Not recommended to modify both temperature and top_p in the same call. | — |
Metadata
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Expert type
expert_type
|
enum (auto | code | knowledge | helperfunction) | "auto" | The type of expert to use. 'knowledge' answers with USD knowledge, 'code' responds with vanilla OpenUSD code, 'helperfunction' uses high-level helper functions, and 'auto' lets the LLM determine which expert to use. | — |
Mistral
Codestral Latest Mistral 9 params
Length
2 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Max tokens
max_tokens
|
integer (1…+∞) | — | Maximum number of tokens to generate in the completion. | — |
|
Stop sequence
stop
|
string | — | Stops generation when this string is detected. | — |
Sampling
5 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Temperature
temperature
|
number (0…1.5 step 0.1) | — | Controls randomness. Lower values make outputs more focused; higher values make them more varied. | — |
|
Top P
top_p
|
number (0…1 step 0.01) | 1 | Controls nucleus sampling by limiting generation to tokens within the selected cumulative probability. | — |
|
Random seed
random_seed
|
integer (0…+∞) | — | Seed used for deterministic sampling when reproducible outputs are desired. | — |
|
Presence penalty
presence_penalty
|
number (-2…2 step 0.1) | 0 | Penalizes repeated words or phrases to encourage a wider variety of generated content. | — |
|
Frequency penalty
frequency_penalty
|
number (-2…2 step 0.1) | 0 | Penalizes words based on how often they already appear in the generated text. | — |
Output
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Response format
response_format.type
|
enum (text | json_object) | "text" | Controls whether the model returns normal text or JSON mode output. | — |
Metadata
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Safe prompt
safe_prompt
|
boolean | false | Controls whether Mistral injects its safety prompt before the conversation. | — |
Devstral 2512 Mistral 9 params
Length
2 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Max tokens
max_tokens
|
integer (1…+∞) | — | Maximum number of tokens to generate in the completion. | — |
|
Stop sequence
stop
|
string | — | Stops generation when this string is detected. | — |
Sampling
5 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Temperature
temperature
|
number (0…1.5 step 0.1) | — | Controls randomness. Lower values make outputs more focused; higher values make them more varied. | — |
|
Top P
top_p
|
number (0…1 step 0.01) | 1 | Controls nucleus sampling by limiting generation to tokens within the selected cumulative probability. | — |
|
Random seed
random_seed
|
integer (0…+∞) | — | Seed used for deterministic sampling when reproducible outputs are desired. | — |
|
Presence penalty
presence_penalty
|
number (-2…2 step 0.1) | 0 | Penalizes repeated words or phrases to encourage a wider variety of generated content. | — |
|
Frequency penalty
frequency_penalty
|
number (-2…2 step 0.1) | 0 | Penalizes words based on how often they already appear in the generated text. | — |
Output
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Response format
response_format.type
|
enum (text | json_object) | "text" | Controls whether the model returns normal text or JSON mode output. | — |
Metadata
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Safe prompt
safe_prompt
|
boolean | false | Controls whether Mistral injects its safety prompt before the conversation. | — |
Devstral Latest Mistral 9 params
Length
2 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Max tokens
max_tokens
|
integer (1…+∞) | — | Maximum number of tokens to generate in the completion. | — |
|
Stop sequence
stop
|
string | — | Stops generation when this string is detected. | — |
Sampling
5 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Temperature
temperature
|
number (0…1.5 step 0.1) | — | Controls randomness. Lower values make outputs more focused; higher values make them more varied. | — |
|
Top P
top_p
|
number (0…1 step 0.01) | 1 | Controls nucleus sampling by limiting generation to tokens within the selected cumulative probability. | — |
|
Random seed
random_seed
|
integer (0…+∞) | — | Seed used for deterministic sampling when reproducible outputs are desired. | — |
|
Presence penalty
presence_penalty
|
number (-2…2 step 0.1) | 0 | Penalizes repeated words or phrases to encourage a wider variety of generated content. | — |
|
Frequency penalty
frequency_penalty
|
number (-2…2 step 0.1) | 0 | Penalizes words based on how often they already appear in the generated text. | — |
Output
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Response format
response_format.type
|
enum (text | json_object) | "text" | Controls whether the model returns normal text or JSON mode output. | — |
Metadata
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Safe prompt
safe_prompt
|
boolean | false | Controls whether Mistral injects its safety prompt before the conversation. | — |
Magistral Medium Latest Mistral 10 params
Length
2 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Max tokens
max_tokens
|
integer (1…+∞) | — | Maximum number of tokens to generate in the completion. | — |
|
Stop sequence
stop
|
string | — | Stops generation when this string is detected. | — |
Sampling
5 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Temperature
temperature
|
number (0…1.5 step 0.1) | — | Controls randomness. Lower values make outputs more focused; higher values make them more varied. | — |
|
Top P
top_p
|
number (0…1 step 0.01) | 1 | Controls nucleus sampling by limiting generation to tokens within the selected cumulative probability. | — |
|
Random seed
random_seed
|
integer (0…+∞) | — | Seed used for deterministic sampling when reproducible outputs are desired. | — |
|
Presence penalty
presence_penalty
|
number (-2…2 step 0.1) | 0 | Penalizes repeated words or phrases to encourage a wider variety of generated content. | — |
|
Frequency penalty
frequency_penalty
|
number (-2…2 step 0.1) | 0 | Penalizes words based on how often they already appear in the generated text. | — |
Reasoning
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Prompt mode
prompt_mode
|
enum (reasoning) | — | Enables Mistral's reasoning system prompt; leave unset to disable the default reasoning behavior. | — |
Output
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Response format
response_format.type
|
enum (text | json_object) | "text" | Controls whether the model returns normal text or JSON mode output. | — |
Metadata
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Safe prompt
safe_prompt
|
boolean | false | Controls whether Mistral injects its safety prompt before the conversation. | — |
Magistral Small Latest Mistral 10 params
Length
2 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Max tokens
max_tokens
|
integer (1…+∞) | — | Maximum number of tokens to generate in the completion. | — |
|
Stop sequence
stop
|
string | — | Stops generation when this string is detected. | — |
Sampling
5 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Temperature
temperature
|
number (0…1.5 step 0.1) | — | Controls randomness. Lower values make outputs more focused; higher values make them more varied. | — |
|
Top P
top_p
|
number (0…1 step 0.01) | 1 | Controls nucleus sampling by limiting generation to tokens within the selected cumulative probability. | — |
|
Random seed
random_seed
|
integer (0…+∞) | — | Seed used for deterministic sampling when reproducible outputs are desired. | — |
|
Presence penalty
presence_penalty
|
number (-2…2 step 0.1) | 0 | Penalizes repeated words or phrases to encourage a wider variety of generated content. | — |
|
Frequency penalty
frequency_penalty
|
number (-2…2 step 0.1) | 0 | Penalizes words based on how often they already appear in the generated text. | — |
Reasoning
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Prompt mode
prompt_mode
|
enum (reasoning) | — | Enables Mistral's reasoning system prompt; leave unset to disable the default reasoning behavior. | — |
Output
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Response format
response_format.type
|
enum (text | json_object) | "text" | Controls whether the model returns normal text or JSON mode output. | — |
Metadata
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Safe prompt
safe_prompt
|
boolean | false | Controls whether Mistral injects its safety prompt before the conversation. | — |
Ministral 14b Latest Mistral 9 params
Length
2 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Max tokens
max_tokens
|
integer (1…+∞) | — | Maximum number of tokens to generate in the completion. | — |
|
Stop sequence
stop
|
string | — | Stops generation when this string is detected. | — |
Sampling
5 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Temperature
temperature
|
number (0…1.5 step 0.1) | — | Controls randomness. Lower values make outputs more focused; higher values make them more varied. | — |
|
Top P
top_p
|
number (0…1 step 0.01) | 1 | Controls nucleus sampling by limiting generation to tokens within the selected cumulative probability. | — |
|
Random seed
random_seed
|
integer (0…+∞) | — | Seed used for deterministic sampling when reproducible outputs are desired. | — |
|
Presence penalty
presence_penalty
|
number (-2…2 step 0.1) | 0 | Penalizes repeated words or phrases to encourage a wider variety of generated content. | — |
|
Frequency penalty
frequency_penalty
|
number (-2…2 step 0.1) | 0 | Penalizes words based on how often they already appear in the generated text. | — |
Output
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Response format
response_format.type
|
enum (text | json_object) | "text" | Controls whether the model returns normal text or JSON mode output. | — |
Metadata
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Safe prompt
safe_prompt
|
boolean | false | Controls whether Mistral injects its safety prompt before the conversation. | — |
Ministral 3b Latest Mistral 9 params
Length
2 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Max tokens
max_tokens
|
integer (1…+∞) | — | Maximum number of tokens to generate in the completion. | — |
|
Stop sequence
stop
|
string | — | Stops generation when this string is detected. | — |
Sampling
5 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Temperature
temperature
|
number (0…1.5 step 0.1) | — | Controls randomness. Lower values make outputs more focused; higher values make them more varied. | — |
|
Top P
top_p
|
number (0…1 step 0.01) | 1 | Controls nucleus sampling by limiting generation to tokens within the selected cumulative probability. | — |
|
Random seed
random_seed
|
integer (0…+∞) | — | Seed used for deterministic sampling when reproducible outputs are desired. | — |
|
Presence penalty
presence_penalty
|
number (-2…2 step 0.1) | 0 | Penalizes repeated words or phrases to encourage a wider variety of generated content. | — |
|
Frequency penalty
frequency_penalty
|
number (-2…2 step 0.1) | 0 | Penalizes words based on how often they already appear in the generated text. | — |
Output
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Response format
response_format.type
|
enum (text | json_object) | "text" | Controls whether the model returns normal text or JSON mode output. | — |
Metadata
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Safe prompt
safe_prompt
|
boolean | false | Controls whether Mistral injects its safety prompt before the conversation. | — |
Ministral 8b Latest Mistral 9 params
Length
2 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Max tokens
max_tokens
|
integer (1…+∞) | — | Maximum number of tokens to generate in the completion. | — |
|
Stop sequence
stop
|
string | — | Stops generation when this string is detected. | — |
Sampling
5 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Temperature
temperature
|
number (0…1.5 step 0.1) | — | Controls randomness. Lower values make outputs more focused; higher values make them more varied. | — |
|
Top P
top_p
|
number (0…1 step 0.01) | 1 | Controls nucleus sampling by limiting generation to tokens within the selected cumulative probability. | — |
|
Random seed
random_seed
|
integer (0…+∞) | — | Seed used for deterministic sampling when reproducible outputs are desired. | — |
|
Presence penalty
presence_penalty
|
number (-2…2 step 0.1) | 0 | Penalizes repeated words or phrases to encourage a wider variety of generated content. | — |
|
Frequency penalty
frequency_penalty
|
number (-2…2 step 0.1) | 0 | Penalizes words based on how often they already appear in the generated text. | — |
Output
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Response format
response_format.type
|
enum (text | json_object) | "text" | Controls whether the model returns normal text or JSON mode output. | — |
Metadata
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Safe prompt
safe_prompt
|
boolean | false | Controls whether Mistral injects its safety prompt before the conversation. | — |
Mistral Large Latest Mistral 9 params
Length
2 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Max tokens
max_tokens
|
integer (1…+∞) | — | Maximum number of tokens to generate in the completion. | — |
|
Stop sequence
stop
|
string | — | Stops generation when this string is detected. | — |
Sampling
5 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Temperature
temperature
|
number (0…1.5 step 0.1) | — | Controls randomness. Lower values make outputs more focused; higher values make them more varied. | — |
|
Top P
top_p
|
number (0…1 step 0.01) | 1 | Controls nucleus sampling by limiting generation to tokens within the selected cumulative probability. | — |
|
Random seed
random_seed
|
integer (0…+∞) | — | Seed used for deterministic sampling when reproducible outputs are desired. | — |
|
Presence penalty
presence_penalty
|
number (-2…2 step 0.1) | 0 | Penalizes repeated words or phrases to encourage a wider variety of generated content. | — |
|
Frequency penalty
frequency_penalty
|
number (-2…2 step 0.1) | 0 | Penalizes words based on how often they already appear in the generated text. | — |
Output
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Response format
response_format.type
|
enum (text | json_object) | "text" | Controls whether the model returns normal text or JSON mode output. | — |
Metadata
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Safe prompt
safe_prompt
|
boolean | false | Controls whether Mistral injects its safety prompt before the conversation. | — |
Mistral Medium 3.5 Mistral 9 params
Length
2 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Max tokens
max_tokens
|
integer (1…+∞) | — | Maximum number of tokens to generate in the completion. | — |
|
Stop sequence
stop
|
string | — | Stops generation when this string is detected. | — |
Sampling
5 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Temperature
temperature
|
number (0…1.5 step 0.1) | — | Controls randomness. Lower values make outputs more focused; higher values make them more varied. | — |
|
Top P
top_p
|
number (0…1 step 0.01) | 1 | Controls nucleus sampling by limiting generation to tokens within the selected cumulative probability. | — |
|
Random seed
random_seed
|
integer (0…+∞) | — | Seed used for deterministic sampling when reproducible outputs are desired. | — |
|
Presence penalty
presence_penalty
|
number (-2…2 step 0.1) | 0 | Penalizes repeated words or phrases to encourage a wider variety of generated content. | — |
|
Frequency penalty
frequency_penalty
|
number (-2…2 step 0.1) | 0 | Penalizes words based on how often they already appear in the generated text. | — |
Output
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Response format
response_format.type
|
enum (text | json_object) | "text" | Controls whether the model returns normal text or JSON mode output. | — |
Metadata
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Safe prompt
safe_prompt
|
boolean | false | Controls whether Mistral injects its safety prompt before the conversation. | — |
Mistral Medium Latest Mistral 9 params
Length
2 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Max tokens
max_tokens
|
integer (1…+∞) | — | Maximum number of tokens to generate in the completion. | — |
|
Stop sequence
stop
|
string | — | Stops generation when this string is detected. | — |
Sampling
5 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Temperature
temperature
|
number (0…1.5 step 0.1) | — | Controls randomness. Lower values make outputs more focused; higher values make them more varied. | — |
|
Top P
top_p
|
number (0…1 step 0.01) | 1 | Controls nucleus sampling by limiting generation to tokens within the selected cumulative probability. | — |
|
Random seed
random_seed
|
integer (0…+∞) | — | Seed used for deterministic sampling when reproducible outputs are desired. | — |
|
Presence penalty
presence_penalty
|
number (-2…2 step 0.1) | 0 | Penalizes repeated words or phrases to encourage a wider variety of generated content. | — |
|
Frequency penalty
frequency_penalty
|
number (-2…2 step 0.1) | 0 | Penalizes words based on how often they already appear in the generated text. | — |
Output
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Response format
response_format.type
|
enum (text | json_object) | "text" | Controls whether the model returns normal text or JSON mode output. | — |
Metadata
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Safe prompt
safe_prompt
|
boolean | false | Controls whether Mistral injects its safety prompt before the conversation. | — |
Mistral Small Latest Mistral 9 params
Length
2 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Max tokens
max_tokens
|
integer (1…+∞) | — | Maximum number of tokens to generate in the completion. | — |
|
Stop sequence
stop
|
string | — | Stops generation when this string is detected. | — |
Sampling
5 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Temperature
temperature
|
number (0…1.5 step 0.1) | — | Controls randomness. Lower values make outputs more focused; higher values make them more varied. | — |
|
Top P
top_p
|
number (0…1 step 0.01) | 1 | Controls nucleus sampling by limiting generation to tokens within the selected cumulative probability. | — |
|
Random seed
random_seed
|
integer (0…+∞) | — | Seed used for deterministic sampling when reproducible outputs are desired. | — |
|
Presence penalty
presence_penalty
|
number (-2…2 step 0.1) | 0 | Penalizes repeated words or phrases to encourage a wider variety of generated content. | — |
|
Frequency penalty
frequency_penalty
|
number (-2…2 step 0.1) | 0 | Penalizes words based on how often they already appear in the generated text. | — |
Output
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Response format
response_format.type
|
enum (text | json_object) | "text" | Controls whether the model returns normal text or JSON mode output. | — |
Metadata
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Safe prompt
safe_prompt
|
boolean | false | Controls whether Mistral injects its safety prompt before the conversation. | — |
Open Mistral Nemo Mistral 9 params
Length
2 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Max tokens
max_tokens
|
integer (1…+∞) | — | Maximum number of tokens to generate in the completion. | — |
|
Stop sequence
stop
|
string | — | Stops generation when this string is detected. | — |
Sampling
5 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Temperature
temperature
|
number (0…1.5 step 0.1) | — | Controls randomness. Lower values make outputs more focused; higher values make them more varied. | — |
|
Top P
top_p
|
number (0…1 step 0.01) | 1 | Controls nucleus sampling by limiting generation to tokens within the selected cumulative probability. | — |
|
Random seed
random_seed
|
integer (0…+∞) | — | Seed used for deterministic sampling when reproducible outputs are desired. | — |
|
Presence penalty
presence_penalty
|
number (-2…2 step 0.1) | 0 | Penalizes repeated words or phrases to encourage a wider variety of generated content. | — |
|
Frequency penalty
frequency_penalty
|
number (-2…2 step 0.1) | 0 | Penalizes words based on how often they already appear in the generated text. | — |
Output
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Response format
response_format.type
|
enum (text | json_object) | "text" | Controls whether the model returns normal text or JSON mode output. | — |
Metadata
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Safe prompt
safe_prompt
|
boolean | false | Controls whether Mistral injects its safety prompt before the conversation. | — |
Gemini 2.5 Flash Google 8 params
Length
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Max output tokens
generationConfig.maxOutputTokens
|
integer (1…65536) | — | Maximum number of tokens to include in a response candidate. | — |
Sampling
4 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Temperature
generationConfig.temperature
|
number (0…2 step 0.1) | 1 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. | — |
|
Top P
generationConfig.topP
|
number (0…1 step 0.01) | 0.95 | Controls nucleus sampling by limiting generation to tokens within the selected cumulative probability. | — |
|
Top K
generationConfig.topK
|
integer (0…+∞) | 64 | Limits token sampling to the top K most likely next tokens. | — |
|
Seed
generationConfig.seed
|
integer | — | Optional seed used for decoding when reproducible sampling is desired. | — |
Reasoning
2 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Thinking budget
generationConfig.thinkingConfig.thinkingBudget
|
integer (-1…24576) | -1 | Number of thinking tokens Gemini should use; 0 disables thinking and -1 uses dynamic thinking. | — |
|
Include thoughts
generationConfig.thinkingConfig.includeThoughts
|
boolean | false | Controls whether Gemini returns available thought summaries in the response parts. | — |
Output
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Response MIME type
generationConfig.responseMimeType
|
enum (text/plain | application/json) | "text/plain" | MIME type for generated text candidates. | — |
Gemini 2.5 Flash Lite Google 8 params
Length
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Max output tokens
generationConfig.maxOutputTokens
|
integer (1…65536) | — | Maximum number of tokens to include in a response candidate. | — |
Sampling
4 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Temperature
generationConfig.temperature
|
number (0…2 step 0.1) | 1 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. | — |
|
Top P
generationConfig.topP
|
number (0…1 step 0.01) | 0.95 | Controls nucleus sampling by limiting generation to tokens within the selected cumulative probability. | — |
|
Top K
generationConfig.topK
|
integer (0…+∞) | 64 | Limits token sampling to the top K most likely next tokens. | — |
|
Seed
generationConfig.seed
|
integer | — | Optional seed used for decoding when reproducible sampling is desired. | — |
Reasoning
2 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Thinking budget
generationConfig.thinkingConfig.thinkingBudget
|
integer | 0 | Number of thinking tokens Gemini should use; -1 uses dynamic thinking, 0 disables thinking, and fixed budgets start at 512 tokens. | — |
|
Include thoughts
generationConfig.thinkingConfig.includeThoughts
|
boolean | false | Controls whether Gemini returns available thought summaries in the response parts. | — |
Output
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Response MIME type
generationConfig.responseMimeType
|
enum (text/plain | application/json) | "text/plain" | MIME type for generated text candidates. | — |
Gemini 2.5 Flash Lite Google Subscription 8 params
Length
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Max output tokens
generationConfig.maxOutputTokens
|
integer (1…65536) | — | Maximum number of tokens to include in a response candidate. | — |
Sampling
4 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Temperature
generationConfig.temperature
|
number (0…2 step 0.1) | 1 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. | — |
|
Top P
generationConfig.topP
|
number (0…1 step 0.01) | 0.95 | Controls nucleus sampling by limiting generation to tokens within the selected cumulative probability. | — |
|
Top K
generationConfig.topK
|
integer (0…+∞) | 64 | Limits token sampling to the top K most likely next tokens. | — |
|
Seed
generationConfig.seed
|
integer | — | Optional seed used for decoding when reproducible sampling is desired. | — |
Reasoning
2 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Thinking budget
generationConfig.thinkingConfig.thinkingBudget
|
integer | 0 | Number of thinking tokens Gemini should use; -1 uses dynamic thinking, 0 disables thinking, and fixed budgets start at 512 tokens. | — |
|
Include thoughts
generationConfig.thinkingConfig.includeThoughts
|
boolean | false | Controls whether Gemini returns available thought summaries in the response parts. | — |
Output
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Response MIME type
generationConfig.responseMimeType
|
enum (text/plain | application/json) | "text/plain" | MIME type for generated text candidates. | — |
Gemini 2.5 Flash Google Subscription 8 params
Length
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Max output tokens
generationConfig.maxOutputTokens
|
integer (1…65536) | — | Maximum number of tokens to include in a response candidate. | — |
Sampling
4 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Temperature
generationConfig.temperature
|
number (0…2 step 0.1) | 1 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. | — |
|
Top P
generationConfig.topP
|
number (0…1 step 0.01) | 0.95 | Controls nucleus sampling by limiting generation to tokens within the selected cumulative probability. | — |
|
Top K
generationConfig.topK
|
integer (0…+∞) | 64 | Limits token sampling to the top K most likely next tokens. | — |
|
Seed
generationConfig.seed
|
integer | — | Optional seed used for decoding when reproducible sampling is desired. | — |
Reasoning
2 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Thinking budget
generationConfig.thinkingConfig.thinkingBudget
|
integer (-1…24576) | -1 | Number of thinking tokens Gemini should use; 0 disables thinking and -1 uses dynamic thinking. | — |
|
Include thoughts
generationConfig.thinkingConfig.includeThoughts
|
boolean | false | Controls whether Gemini returns available thought summaries in the response parts. | — |
Output
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Response MIME type
generationConfig.responseMimeType
|
enum (text/plain | application/json) | "text/plain" | MIME type for generated text candidates. | — |
Gemini 2.5 Pro Google 8 params
Length
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Max output tokens
generationConfig.maxOutputTokens
|
integer (1…65536) | — | Maximum number of tokens to include in a response candidate. | — |
Sampling
4 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Temperature
generationConfig.temperature
|
number (0…2 step 0.1) | 1 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. | — |
|
Top P
generationConfig.topP
|
number (0…1 step 0.01) | 0.95 | Controls nucleus sampling by limiting generation to tokens within the selected cumulative probability. | — |
|
Top K
generationConfig.topK
|
integer (0…+∞) | 64 | Limits token sampling to the top K most likely next tokens. | — |
|
Seed
generationConfig.seed
|
integer | — | Optional seed used for decoding when reproducible sampling is desired. | — |
Reasoning
2 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Thinking budget
generationConfig.thinkingConfig.thinkingBudget
|
integer (128…32768) | — | Maximum number of thinking tokens Gemini should use before producing the final answer. | — |
|
Include thoughts
generationConfig.thinkingConfig.includeThoughts
|
boolean | false | Controls whether Gemini returns available thought summaries in the response parts. | — |
Output
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Response MIME type
generationConfig.responseMimeType
|
enum (text/plain | application/json) | "text/plain" | MIME type for generated text candidates. | — |
Gemini 2.5 Pro Google Subscription 8 params
Length
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Max output tokens
generationConfig.maxOutputTokens
|
integer (1…65536) | — | Maximum number of tokens to include in a response candidate. | — |
Sampling
4 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Temperature
generationConfig.temperature
|
number (0…2 step 0.1) | 1 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. | — |
|
Top P
generationConfig.topP
|
number (0…1 step 0.01) | 0.95 | Controls nucleus sampling by limiting generation to tokens within the selected cumulative probability. | — |
|
Top K
generationConfig.topK
|
integer (0…+∞) | 64 | Limits token sampling to the top K most likely next tokens. | — |
|
Seed
generationConfig.seed
|
integer | — | Optional seed used for decoding when reproducible sampling is desired. | — |
Reasoning
2 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Thinking budget
generationConfig.thinkingConfig.thinkingBudget
|
integer (128…32768) | — | Maximum number of thinking tokens Gemini should use before producing the final answer. | — |
|
Include thoughts
generationConfig.thinkingConfig.includeThoughts
|
boolean | false | Controls whether Gemini returns available thought summaries in the response parts. | — |
Output
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Response MIME type
generationConfig.responseMimeType
|
enum (text/plain | application/json) | "text/plain" | MIME type for generated text candidates. | — |
Gemini 3 Flash Preview Google Subscription 8 params
Length
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Max output tokens
generationConfig.maxOutputTokens
|
integer (1…65536) | — | Maximum number of tokens to include in a response candidate. | — |
Sampling
4 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Temperature
generationConfig.temperature
|
number (0…2 step 0.1) | 1 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. | — |
|
Top P
generationConfig.topP
|
number (0…1 step 0.01) | 0.95 | Controls nucleus sampling by limiting generation to tokens within the selected cumulative probability. | — |
|
Top K
generationConfig.topK
|
integer (0…+∞) | 64 | Limits token sampling to the top K most likely next tokens. | — |
|
Seed
generationConfig.seed
|
integer | — | Optional seed used for decoding when reproducible sampling is desired. | — |
Reasoning
2 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Thinking level
generationConfig.thinkingConfig.thinkingLevel
|
enum (minimal | low | medium | high) | "high" | Controls Gemini 3 Flash reasoning effort. | — |
|
Include thoughts
generationConfig.thinkingConfig.includeThoughts
|
boolean | false | Controls whether Gemini returns available thought summaries in the response parts. | — |
Output
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Response MIME type
generationConfig.responseMimeType
|
enum (text/plain | application/json) | "text/plain" | MIME type for generated text candidates. | — |
Gemini 3.1 Flash Lite Preview Google Subscription 8 params
Length
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Max output tokens
generationConfig.maxOutputTokens
|
integer (1…65536) | — | Maximum number of tokens to include in a response candidate. | — |
Sampling
4 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Temperature
generationConfig.temperature
|
number (0…2 step 0.1) | 1 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. | — |
|
Top P
generationConfig.topP
|
number (0…1 step 0.01) | 0.95 | Controls nucleus sampling by limiting generation to tokens within the selected cumulative probability. | — |
|
Top K
generationConfig.topK
|
integer (0…+∞) | 64 | Limits token sampling to the top K most likely next tokens. | — |
|
Seed
generationConfig.seed
|
integer | — | Optional seed used for decoding when reproducible sampling is desired. | — |
Reasoning
2 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Thinking level
generationConfig.thinkingConfig.thinkingLevel
|
enum (minimal | low | medium | high) | "high" | Controls Gemini 3.1 Flash-Lite reasoning effort. | — |
|
Include thoughts
generationConfig.thinkingConfig.includeThoughts
|
boolean | false | Controls whether Gemini returns available thought summaries in the response parts. | — |
Output
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Response MIME type
generationConfig.responseMimeType
|
enum (text/plain | application/json) | "text/plain" | MIME type for generated text candidates. | — |
Gemini 3.1 Flash Lite Google Subscription 8 params
Length
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Max output tokens
generationConfig.maxOutputTokens
|
integer (1…65536) | — | Maximum number of tokens to include in a response candidate. | — |
Sampling
4 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Temperature
generationConfig.temperature
|
number (0…2 step 0.1) | 1 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. | — |
|
Top P
generationConfig.topP
|
number (0…1 step 0.01) | 0.95 | Controls nucleus sampling by limiting generation to tokens within the selected cumulative probability. | — |
|
Top K
generationConfig.topK
|
integer (0…+∞) | 64 | Limits token sampling to the top K most likely next tokens. | — |
|
Seed
generationConfig.seed
|
integer | — | Optional seed used for decoding when reproducible sampling is desired. | — |
Reasoning
2 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Thinking level
generationConfig.thinkingConfig.thinkingLevel
|
enum (minimal | low | medium | high) | "high" | Controls Gemini 3.1 Flash-Lite reasoning effort. | — |
|
Include thoughts
generationConfig.thinkingConfig.includeThoughts
|
boolean | false | Controls whether Gemini returns available thought summaries in the response parts. | — |
Output
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Response MIME type
generationConfig.responseMimeType
|
enum (text/plain | application/json) | "text/plain" | MIME type for generated text candidates. | — |
Gemini 3.1 Pro Preview Google Subscription 8 params
Length
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Max output tokens
generationConfig.maxOutputTokens
|
integer (1…65536) | — | Maximum number of tokens to include in a response candidate. | — |
Sampling
4 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Temperature
generationConfig.temperature
|
number (0…2 step 0.1) | 1 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. | — |
|
Top P
generationConfig.topP
|
number (0…1 step 0.01) | 0.95 | Controls nucleus sampling by limiting generation to tokens within the selected cumulative probability. | — |
|
Top K
generationConfig.topK
|
integer (0…+∞) | 64 | Limits token sampling to the top K most likely next tokens. | — |
|
Seed
generationConfig.seed
|
integer | — | Optional seed used for decoding when reproducible sampling is desired. | — |
Reasoning
2 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Thinking level
generationConfig.thinkingConfig.thinkingLevel
|
enum (low | high) | "high" | Controls Gemini 3 Pro reasoning effort. | — |
|
Include thoughts
generationConfig.thinkingConfig.includeThoughts
|
boolean | false | Controls whether Gemini returns available thought summaries in the response parts. | — |
Output
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Response MIME type
generationConfig.responseMimeType
|
enum (text/plain | application/json) | "text/plain" | MIME type for generated text candidates. | — |
Gemini 3.5 Flash Google 8 params
Length
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Max output tokens
generationConfig.maxOutputTokens
|
integer (1…65536) | — | Maximum number of tokens to include in a response candidate. | — |
Sampling
4 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Temperature
generationConfig.temperature
|
number (0…2 step 0.1) | 1 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. | — |
|
Top P
generationConfig.topP
|
number (0…1 step 0.01) | 0.95 | Controls nucleus sampling by limiting generation to tokens within the selected cumulative probability. | — |
|
Top K
generationConfig.topK
|
integer (0…+∞) | 64 | Limits token sampling to the top K most likely next tokens. | — |
|
Seed
generationConfig.seed
|
integer | — | Optional seed used for decoding when reproducible sampling is desired. | — |
Reasoning
2 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Thinking level
generationConfig.thinkingConfig.thinkingLevel
|
enum (minimal | low | medium | high) | "medium" | Controls Gemini 3.5 Flash reasoning effort. | — |
|
Include thoughts
generationConfig.thinkingConfig.includeThoughts
|
boolean | false | Controls whether Gemini returns available thought summaries in the response parts. | — |
Output
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Response MIME type
generationConfig.responseMimeType
|
enum (text/plain | application/json) | "text/plain" | MIME type for generated text candidates. | — |
Alibaba
Qwen Flash Alibaba 5 params
Length
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Max tokens
max_tokens
|
integer (1…+∞) | — | Maximum number of output tokens the model may generate. | — |
Sampling
3 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Temperature
temperature
|
number (0…2 step 0.1) | — | Controls randomness. Lower values make outputs more focused; higher values make them more varied. | — |
|
Top P
top_p
|
number (0…1 step 0.01) | — | Controls nucleus sampling by limiting generation to tokens within the selected cumulative probability. | — |
|
Top K
extra_body.top_k
|
integer (1…+∞) | 20 | Limits generation to the selected number of highest-probability tokens. | — |
Reasoning
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Enable thinking
extra_body.chat_template_kwargs.enable_thinking
|
boolean | true | Controls Qwen3 thinking mode when using OpenAI-compatible clients that pass provider-specific extra body fields. | — |
Qwen Plus Alibaba 5 params
Length
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Max tokens
max_tokens
|
integer (1…+∞) | — | Maximum number of output tokens the model may generate. | — |
Sampling
3 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Temperature
temperature
|
number (0…2 step 0.1) | — | Controls randomness. Lower values make outputs more focused; higher values make them more varied. | — |
|
Top P
top_p
|
number (0…1 step 0.01) | — | Controls nucleus sampling by limiting generation to tokens within the selected cumulative probability. | — |
|
Top K
extra_body.top_k
|
integer (1…+∞) | 20 | Limits generation to the selected number of highest-probability tokens. | — |
Reasoning
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Enable thinking
extra_body.chat_template_kwargs.enable_thinking
|
boolean | true | Controls Qwen3 thinking mode when using OpenAI-compatible clients that pass provider-specific extra body fields. | — |
Qwen3 Coder Flash Alibaba 4 params
Length
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Max tokens
max_tokens
|
integer (1…+∞) | — | Maximum number of output tokens the model may generate. | — |
Sampling
3 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Temperature
temperature
|
number (0…2 step 0.1) | — | Controls randomness. Lower values make outputs more focused; higher values make them more varied. | — |
|
Top P
top_p
|
number (0…1 step 0.01) | — | Controls nucleus sampling by limiting generation to tokens within the selected cumulative probability. | — |
|
Top K
extra_body.top_k
|
integer (1…+∞) | 20 | Limits generation to the selected number of highest-probability tokens. | — |
Qwen3 Coder Plus Alibaba 4 params
Length
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Max tokens
max_tokens
|
integer (1…+∞) | — | Maximum number of output tokens the model may generate. | — |
Sampling
3 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Temperature
temperature
|
number (0…2 step 0.1) | — | Controls randomness. Lower values make outputs more focused; higher values make them more varied. | — |
|
Top P
top_p
|
number (0…1 step 0.01) | — | Controls nucleus sampling by limiting generation to tokens within the selected cumulative probability. | — |
|
Top K
extra_body.top_k
|
integer (1…+∞) | 20 | Limits generation to the selected number of highest-probability tokens. | — |
Qwen3 Max Alibaba 5 params
Length
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Max tokens
max_tokens
|
integer (1…+∞) | — | Maximum number of output tokens the model may generate. | — |
Sampling
3 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Temperature
temperature
|
number (0…2 step 0.1) | — | Controls randomness. Lower values make outputs more focused; higher values make them more varied. | — |
|
Top P
top_p
|
number (0…1 step 0.01) | — | Controls nucleus sampling by limiting generation to tokens within the selected cumulative probability. | — |
|
Top K
extra_body.top_k
|
integer (1…+∞) | 20 | Limits generation to the selected number of highest-probability tokens. | — |
Reasoning
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Enable thinking
extra_body.chat_template_kwargs.enable_thinking
|
boolean | false | Controls Qwen3 thinking mode when using OpenAI-compatible clients that pass provider-specific extra body fields. | — |
Qwen3.5 Alibaba 5 params
Length
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Max tokens
max_tokens
|
integer (1…+∞) | — | Maximum number of output tokens the model may generate. | — |
Sampling
3 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Temperature
temperature
|
number (0…2 step 0.1) | — | Controls randomness. Lower values make outputs more focused; higher values make them more varied. | — |
|
Top P
top_p
|
number (0…1 step 0.01) | — | Controls nucleus sampling by limiting generation to tokens within the selected cumulative probability. | — |
|
Top K
extra_body.top_k
|
integer (1…+∞) | 20 | Limits generation to the selected number of highest-probability tokens. | — |
Reasoning
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Enable thinking
extra_body.chat_template_kwargs.enable_thinking
|
boolean | true | Controls Qwen3 thinking mode when using OpenAI-compatible clients that pass provider-specific extra body fields. | — |
Qwen3.5 Flash Alibaba 5 params
Length
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Max tokens
max_tokens
|
integer (1…+∞) | — | Maximum number of output tokens the model may generate. | — |
Sampling
3 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Temperature
temperature
|
number (0…2 step 0.1) | — | Controls randomness. Lower values make outputs more focused; higher values make them more varied. | — |
|
Top P
top_p
|
number (0…1 step 0.01) | — | Controls nucleus sampling by limiting generation to tokens within the selected cumulative probability. | — |
|
Top K
extra_body.top_k
|
integer (1…+∞) | 20 | Limits generation to the selected number of highest-probability tokens. | — |
Reasoning
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Enable thinking
extra_body.chat_template_kwargs.enable_thinking
|
boolean | true | Controls Qwen3 thinking mode when using OpenAI-compatible clients that pass provider-specific extra body fields. | — |
Qwq Plus Alibaba 4 params
Length
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Max tokens
max_tokens
|
integer (1…+∞) | — | Maximum number of output tokens the model may generate. | — |
Sampling
3 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Temperature
temperature
|
number (0…2 step 0.1) | — | Controls randomness. Lower values make outputs more focused; higher values make them more varied. | — |
|
Top P
top_p
|
number (0…1 step 0.01) | — | Controls nucleus sampling by limiting generation to tokens within the selected cumulative probability. | — |
|
Top K
extra_body.top_k
|
integer (1…+∞) | 20 | Limits generation to the selected number of highest-probability tokens. | — |
Cohere
Command A 03 2025 Cohere 12 params
Length
2 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Max tokens
max_tokens
|
integer (1…+∞) | — | Maximum number of output tokens the model may generate. | — |
|
Stop sequences
stop_sequences
|
string | — | Stops generation when one of these sequences is detected; up to five are allowed. | — |
Sampling
6 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Temperature
temperature
|
number (0…+∞ step 0.1) | 0.3 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. | — |
|
Top P
p
|
number (0.01…0.99 step 0.01) | 0.75 | Controls nucleus sampling by limiting generation to tokens within the selected cumulative probability. | — |
|
Top K
k
|
integer (0…500) | 0 | Limits sampling to the K most likely tokens; 0 disables top-k sampling. | — |
|
Frequency penalty
frequency_penalty
|
number (0…1 step 0.1) | 0 | Penalizes tokens proportional to how often they have already appeared to reduce repetition. | — |
|
Presence penalty
presence_penalty
|
number (0…1 step 0.1) | 0 | Penalizes tokens that have already appeared to encourage a wider variety of content. | — |
|
Seed
seed
|
integer | — | Seed used for best-effort deterministic sampling when reproducible outputs are desired. | — |
Tools
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Tool choice
tool_choice
|
enum (REQUIRED | NONE) | — | Forces the model to either call a tool or skip tool calls for this request. | — |
Output
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Response format
response_format.type
|
enum (text | json_object) | "text" | Controls whether the model returns normal text or JSON object output. | — |
Observability
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Log probabilities
logprobs
|
boolean | false | Controls whether the response includes log probabilities for the generated tokens. | — |
Metadata
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Safety mode
safety_mode
|
enum (CONTEXTUAL | STRICT) | "CONTEXTUAL" | Controls Cohere's built-in safety instructions applied to the generation. | — |
Command A Plus 05 2026 Cohere 12 params
Length
2 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Max tokens
max_tokens
|
integer (1…+∞) | — | Maximum number of output tokens the model may generate. | — |
|
Stop sequences
stop_sequences
|
string | — | Stops generation when one of these sequences is detected; up to five are allowed. | — |
Sampling
6 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Temperature
temperature
|
number (0…+∞ step 0.1) | 0.3 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. | — |
|
Top P
p
|
number (0.01…0.99 step 0.01) | 0.75 | Controls nucleus sampling by limiting generation to tokens within the selected cumulative probability. | — |
|
Top K
k
|
integer (0…500) | 0 | Limits sampling to the K most likely tokens; 0 disables top-k sampling. | — |
|
Frequency penalty
frequency_penalty
|
number (0…1 step 0.1) | 0 | Penalizes tokens proportional to how often they have already appeared to reduce repetition. | — |
|
Presence penalty
presence_penalty
|
number (0…1 step 0.1) | 0 | Penalizes tokens that have already appeared to encourage a wider variety of content. | — |
|
Seed
seed
|
integer | — | Seed used for best-effort deterministic sampling when reproducible outputs are desired. | — |
Tools
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Tool choice
tool_choice
|
enum (REQUIRED | NONE) | — | Forces the model to either call a tool or skip tool calls for this request. | — |
Output
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Response format
response_format.type
|
enum (text | json_object) | "text" | Controls whether the model returns normal text or JSON object output. | — |
Observability
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Log probabilities
logprobs
|
boolean | false | Controls whether the response includes log probabilities for the generated tokens. | — |
Metadata
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Safety mode
safety_mode
|
enum (CONTEXTUAL | STRICT) | "CONTEXTUAL" | Controls Cohere's built-in safety instructions applied to the generation. | — |
Command A Reasoning 08 2025 Cohere 14 params
Length
2 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Max tokens
max_tokens
|
integer (1…+∞) | — | Maximum number of output tokens the model may generate. | — |
|
Stop sequences
stop_sequences
|
string | — | Stops generation when one of these sequences is detected; up to five are allowed. | — |
Sampling
6 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Temperature
temperature
|
number (0…+∞ step 0.1) | 0.3 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. | — |
|
Top P
p
|
number (0.01…0.99 step 0.01) | 0.75 | Controls nucleus sampling by limiting generation to tokens within the selected cumulative probability. | — |
|
Top K
k
|
integer (0…500) | 0 | Limits sampling to the K most likely tokens; 0 disables top-k sampling. | — |
|
Frequency penalty
frequency_penalty
|
number (0…1 step 0.1) | 0 | Penalizes tokens proportional to how often they have already appeared to reduce repetition. | — |
|
Presence penalty
presence_penalty
|
number (0…1 step 0.1) | 0 | Penalizes tokens that have already appeared to encourage a wider variety of content. | — |
|
Seed
seed
|
integer | — | Seed used for best-effort deterministic sampling when reproducible outputs are desired. | — |
Reasoning
2 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Thinking mode
thinking.type
|
enum (enabled | disabled) | "disabled" | Controls whether the model reasons step by step before producing its final answer. | — |
|
Thinking token budget
thinking.token_budget
|
integer (1…+∞) | — | Maximum number of tokens the model may spend on reasoning before answering. |
Only when thinking.type = "enabled"
|
Tools
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Tool choice
tool_choice
|
enum (REQUIRED | NONE) | — | Forces the model to either call a tool or skip tool calls for this request. | — |
Output
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Response format
response_format.type
|
enum (text | json_object) | "text" | Controls whether the model returns normal text or JSON object output. | — |
Observability
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Log probabilities
logprobs
|
boolean | false | Controls whether the response includes log probabilities for the generated tokens. | — |
Metadata
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Safety mode
safety_mode
|
enum (CONTEXTUAL | STRICT) | "CONTEXTUAL" | Controls Cohere's built-in safety instructions applied to the generation. | — |
Command A Translate 08 2025 Cohere 12 params
Length
2 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Max tokens
max_tokens
|
integer (1…+∞) | — | Maximum number of output tokens the model may generate. | — |
|
Stop sequences
stop_sequences
|
string | — | Stops generation when one of these sequences is detected; up to five are allowed. | — |
Sampling
6 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Temperature
temperature
|
number (0…+∞ step 0.1) | 0.3 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. | — |
|
Top P
p
|
number (0.01…0.99 step 0.01) | 0.75 | Controls nucleus sampling by limiting generation to tokens within the selected cumulative probability. | — |
|
Top K
k
|
integer (0…500) | 0 | Limits sampling to the K most likely tokens; 0 disables top-k sampling. | — |
|
Frequency penalty
frequency_penalty
|
number (0…1 step 0.1) | 0 | Penalizes tokens proportional to how often they have already appeared to reduce repetition. | — |
|
Presence penalty
presence_penalty
|
number (0…1 step 0.1) | 0 | Penalizes tokens that have already appeared to encourage a wider variety of content. | — |
|
Seed
seed
|
integer | — | Seed used for best-effort deterministic sampling when reproducible outputs are desired. | — |
Tools
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Tool choice
tool_choice
|
enum (REQUIRED | NONE) | — | Forces the model to either call a tool or skip tool calls for this request. | — |
Output
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Response format
response_format.type
|
enum (text | json_object) | "text" | Controls whether the model returns normal text or JSON object output. | — |
Observability
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Log probabilities
logprobs
|
boolean | false | Controls whether the response includes log probabilities for the generated tokens. | — |
Metadata
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Safety mode
safety_mode
|
enum (CONTEXTUAL | STRICT) | "CONTEXTUAL" | Controls Cohere's built-in safety instructions applied to the generation. | — |
Command A Vision 07 2025 Cohere 12 params
Length
2 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Max tokens
max_tokens
|
integer (1…+∞) | — | Maximum number of output tokens the model may generate. | — |
|
Stop sequences
stop_sequences
|
string | — | Stops generation when one of these sequences is detected; up to five are allowed. | — |
Sampling
6 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Temperature
temperature
|
number (0…+∞ step 0.1) | 0.3 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. | — |
|
Top P
p
|
number (0.01…0.99 step 0.01) | 0.75 | Controls nucleus sampling by limiting generation to tokens within the selected cumulative probability. | — |
|
Top K
k
|
integer (0…500) | 0 | Limits sampling to the K most likely tokens; 0 disables top-k sampling. | — |
|
Frequency penalty
frequency_penalty
|
number (0…1 step 0.1) | 0 | Penalizes tokens proportional to how often they have already appeared to reduce repetition. | — |
|
Presence penalty
presence_penalty
|
number (0…1 step 0.1) | 0 | Penalizes tokens that have already appeared to encourage a wider variety of content. | — |
|
Seed
seed
|
integer | — | Seed used for best-effort deterministic sampling when reproducible outputs are desired. | — |
Tools
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Tool choice
tool_choice
|
enum (REQUIRED | NONE) | — | Forces the model to either call a tool or skip tool calls for this request. | — |
Output
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Response format
response_format.type
|
enum (text | json_object) | "text" | Controls whether the model returns normal text or JSON object output. | — |
Observability
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Log probabilities
logprobs
|
boolean | false | Controls whether the response includes log probabilities for the generated tokens. | — |
Metadata
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Safety mode
safety_mode
|
enum (CONTEXTUAL | STRICT) | "CONTEXTUAL" | Controls Cohere's built-in safety instructions applied to the generation. | — |
Command R 08 2024 Cohere 11 params
Length
2 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Max tokens
max_tokens
|
integer (1…+∞) | — | Maximum number of output tokens the model may generate. | — |
|
Stop sequences
stop_sequences
|
string | — | Stops generation when one of these sequences is detected; up to five are allowed. | — |
Sampling
6 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Temperature
temperature
|
number (0…+∞ step 0.1) | 0.3 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. | — |
|
Top P
p
|
number (0.01…0.99 step 0.01) | 0.75 | Controls nucleus sampling by limiting generation to tokens within the selected cumulative probability. | — |
|
Top K
k
|
integer (0…500) | 0 | Limits sampling to the K most likely tokens; 0 disables top-k sampling. | — |
|
Frequency penalty
frequency_penalty
|
number (0…1 step 0.1) | 0 | Penalizes tokens proportional to how often they have already appeared to reduce repetition. | — |
|
Presence penalty
presence_penalty
|
number (0…1 step 0.1) | 0 | Penalizes tokens that have already appeared to encourage a wider variety of content. | — |
|
Seed
seed
|
integer | — | Seed used for best-effort deterministic sampling when reproducible outputs are desired. | — |
Output
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Response format
response_format.type
|
enum (text | json_object) | "text" | Controls whether the model returns normal text or JSON object output. | — |
Observability
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Log probabilities
logprobs
|
boolean | false | Controls whether the response includes log probabilities for the generated tokens. | — |
Metadata
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Safety mode
safety_mode
|
enum (CONTEXTUAL | STRICT | OFF) | "CONTEXTUAL" | Controls Cohere's built-in safety instructions applied to the generation. | — |
Command R Plus 08 2024 Cohere 11 params
Length
2 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Max tokens
max_tokens
|
integer (1…+∞) | — | Maximum number of output tokens the model may generate. | — |
|
Stop sequences
stop_sequences
|
string | — | Stops generation when one of these sequences is detected; up to five are allowed. | — |
Sampling
6 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Temperature
temperature
|
number (0…+∞ step 0.1) | 0.3 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. | — |
|
Top P
p
|
number (0.01…0.99 step 0.01) | 0.75 | Controls nucleus sampling by limiting generation to tokens within the selected cumulative probability. | — |
|
Top K
k
|
integer (0…500) | 0 | Limits sampling to the K most likely tokens; 0 disables top-k sampling. | — |
|
Frequency penalty
frequency_penalty
|
number (0…1 step 0.1) | 0 | Penalizes tokens proportional to how often they have already appeared to reduce repetition. | — |
|
Presence penalty
presence_penalty
|
number (0…1 step 0.1) | 0 | Penalizes tokens that have already appeared to encourage a wider variety of content. | — |
|
Seed
seed
|
integer | — | Seed used for best-effort deterministic sampling when reproducible outputs are desired. | — |
Output
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Response format
response_format.type
|
enum (text | json_object) | "text" | Controls whether the model returns normal text or JSON object output. | — |
Observability
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Log probabilities
logprobs
|
boolean | false | Controls whether the response includes log probabilities for the generated tokens. | — |
Metadata
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Safety mode
safety_mode
|
enum (CONTEXTUAL | STRICT | OFF) | "CONTEXTUAL" | Controls Cohere's built-in safety instructions applied to the generation. | — |
Command R7b 12 2024 Cohere 12 params
Length
2 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Max tokens
max_tokens
|
integer (1…+∞) | — | Maximum number of output tokens the model may generate. | — |
|
Stop sequences
stop_sequences
|
string | — | Stops generation when one of these sequences is detected; up to five are allowed. | — |
Sampling
6 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Temperature
temperature
|
number (0…+∞ step 0.1) | 0.3 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. | — |
|
Top P
p
|
number (0.01…0.99 step 0.01) | 0.75 | Controls nucleus sampling by limiting generation to tokens within the selected cumulative probability. | — |
|
Top K
k
|
integer (0…500) | 0 | Limits sampling to the K most likely tokens; 0 disables top-k sampling. | — |
|
Frequency penalty
frequency_penalty
|
number (0…1 step 0.1) | 0 | Penalizes tokens proportional to how often they have already appeared to reduce repetition. | — |
|
Presence penalty
presence_penalty
|
number (0…1 step 0.1) | 0 | Penalizes tokens that have already appeared to encourage a wider variety of content. | — |
|
Seed
seed
|
integer | — | Seed used for best-effort deterministic sampling when reproducible outputs are desired. | — |
Tools
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Tool choice
tool_choice
|
enum (REQUIRED | NONE) | — | Forces the model to either call a tool or skip tool calls for this request. | — |
Output
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Response format
response_format.type
|
enum (text | json_object) | "text" | Controls whether the model returns normal text or JSON object output. | — |
Observability
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Log probabilities
logprobs
|
boolean | false | Controls whether the response includes log probabilities for the generated tokens. | — |
Metadata
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Safety mode
safety_mode
|
enum (CONTEXTUAL | STRICT) | "CONTEXTUAL" | Controls Cohere's built-in safety instructions applied to the generation. | — |
Moonshot AI
Kimi K2.5 Moonshot AI 3 params
Length
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Max completion tokens
max_completion_tokens
|
integer (1…+∞) | — | Maximum number of tokens to generate in the chat completion. | — |
Reasoning
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Thinking mode
thinking.type
|
enum (enabled | disabled) | — | Controls whether Kimi reasons step by step before answering, or responds directly when set to disabled. | — |
Output
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Response format
response_format.type
|
enum (text | json_object) | "text" | Forces the response into plain text or a JSON object. | — |
Kimi K2.6 Moonshot AI 3 params
Length
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Max completion tokens
max_completion_tokens
|
integer (1…+∞) | — | Maximum number of tokens to generate in the chat completion. | — |
Reasoning
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Thinking mode
thinking.type
|
enum (enabled | disabled) | "enabled" | Controls whether Kimi reasons step by step before answering. Thinking is enabled by default; set disabled to respond directly. | — |
Output
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Response format
response_format.type
|
enum (text | json_object) | "text" | Forces the response into plain text or a JSON object. | — |
Moonshot v1 128K Moonshot AI 7 params
Length
2 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Max completion tokens
max_completion_tokens
|
integer (1…+∞) | — | Maximum number of tokens to generate in the chat completion. | — |
|
Number of completions
n
|
integer (1…5) | 1 | How many chat completion choices to generate for the request. | — |
Sampling
4 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Temperature
temperature
|
number (0…1 step 0.1) | 0.3 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. | — |
|
Top P
top_p
|
number (0…1 step 0.01) | 1 | Controls nucleus sampling by limiting generation to tokens within the selected cumulative probability. | — |
|
Presence penalty
presence_penalty
|
number (-2…2 step 0.1) | 0 | Penalizes tokens that have already appeared, encouraging the model to talk about new topics. | — |
|
Frequency penalty
frequency_penalty
|
number (-2…2 step 0.1) | 0 | Penalizes tokens by how often they have appeared, reducing verbatim repetition. | — |
Output
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Response format
response_format.type
|
enum (text | json_object) | "text" | Forces the response into plain text or a JSON object. | — |
Moonshot v1 32K Moonshot AI 7 params
Length
2 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Max completion tokens
max_completion_tokens
|
integer (1…+∞) | — | Maximum number of tokens to generate in the chat completion. | — |
|
Number of completions
n
|
integer (1…5) | 1 | How many chat completion choices to generate for the request. | — |
Sampling
4 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Temperature
temperature
|
number (0…1 step 0.1) | 0.3 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. | — |
|
Top P
top_p
|
number (0…1 step 0.01) | 1 | Controls nucleus sampling by limiting generation to tokens within the selected cumulative probability. | — |
|
Presence penalty
presence_penalty
|
number (-2…2 step 0.1) | 0 | Penalizes tokens that have already appeared, encouraging the model to talk about new topics. | — |
|
Frequency penalty
frequency_penalty
|
number (-2…2 step 0.1) | 0 | Penalizes tokens by how often they have appeared, reducing verbatim repetition. | — |
Output
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Response format
response_format.type
|
enum (text | json_object) | "text" | Forces the response into plain text or a JSON object. | — |
Moonshot v1 8K Moonshot AI 7 params
Length
2 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Max completion tokens
max_completion_tokens
|
integer (1…+∞) | — | Maximum number of tokens to generate in the chat completion. | — |
|
Number of completions
n
|
integer (1…5) | 1 | How many chat completion choices to generate for the request. | — |
Sampling
4 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Temperature
temperature
|
number (0…1 step 0.1) | 0.3 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. | — |
|
Top P
top_p
|
number (0…1 step 0.01) | 1 | Controls nucleus sampling by limiting generation to tokens within the selected cumulative probability. | — |
|
Presence penalty
presence_penalty
|
number (-2…2 step 0.1) | 0 | Penalizes tokens that have already appeared, encouraging the model to talk about new topics. | — |
|
Frequency penalty
frequency_penalty
|
number (-2…2 step 0.1) | 0 | Penalizes tokens by how often they have appeared, reducing verbatim repetition. | — |
Output
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Response format
response_format.type
|
enum (text | json_object) | "text" | Forces the response into plain text or a JSON object. | — |
xAI
Grok 4.20 0309 Non Reasoning xAI 6 params
Length
2 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Max completion tokens
max_completion_tokens
|
integer (1…+∞) | — | Upper bound for visible output tokens generated in the chat completion. | — |
|
Stop sequence
stop
|
string | — | Stops generation when this sequence is produced. xAI accepts up to four stop sequences. | — |
Sampling
3 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Temperature
temperature
|
number (0…2 step 0.1) | 1 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. | — |
|
Top P
top_p
|
number (0…1 step 0.01) | 1 | Controls nucleus sampling by limiting generation to tokens within the selected cumulative probability. | — |
|
Seed
seed
|
integer | — | Optional seed used for decoding when reproducible sampling is desired. | — |
Output
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Response format
response_format.type
|
enum (text | json_object | json_schema) | "text" | Controls whether the model returns text, JSON mode output, or structured JSON schema output. | — |
Grok 4.20 0309 Reasoning xAI 5 params
Length
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Max completion tokens
max_completion_tokens
|
integer (1…+∞) | — | Upper bound for visible output tokens generated in the chat completion. | — |
Sampling
3 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Temperature
temperature
|
number (0…2 step 0.1) | 1 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. | — |
|
Top P
top_p
|
number (0…1 step 0.01) | 1 | Controls nucleus sampling by limiting generation to tokens within the selected cumulative probability. | — |
|
Seed
seed
|
integer | — | Optional seed used for decoding when reproducible sampling is desired. | — |
Output
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Response format
response_format.type
|
enum (text | json_object | json_schema) | "text" | Controls whether the model returns text, JSON mode output, or structured JSON schema output. | — |
Grok 4.20 Multi Agent 0309 xAI 5 params
Length
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Max output tokens
max_output_tokens
|
integer (1…+∞) | — | Upper bound for output tokens generated in the Responses API response. | — |
Sampling
2 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Temperature
temperature
|
number (0…2 step 0.1) | 0.7 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. | — |
|
Top P
top_p
|
number (0…1 step 0.01) | 0.95 | Controls nucleus sampling by limiting generation to tokens within the selected cumulative probability. | — |
Reasoning
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Reasoning effort
reasoning.effort
|
enum (low | medium | high | xhigh) | — | Controls whether the Responses API request uses the 4-agent or 16-agent multi-agent setup. | — |
Output
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Text format
text.format.type
|
enum (text | json_object | json_schema) | "text" | Controls whether the Responses API returns free-form text, JSON mode output, or structured JSON schema output. | — |
Grok 4.3 xAI 6 params
Length
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Max completion tokens
max_completion_tokens
|
integer (1…+∞) | — | Upper bound for visible output tokens generated in the chat completion. | — |
Sampling
3 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Temperature
temperature
|
number (0…2 step 0.1) | 1 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. | — |
|
Top P
top_p
|
number (0…1 step 0.01) | 1 | Controls nucleus sampling by limiting generation to tokens within the selected cumulative probability. | — |
|
Seed
seed
|
integer | — | Optional seed used for decoding when reproducible sampling is desired. | — |
Reasoning
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Reasoning effort
reasoning_effort
|
enum (none | low | medium | high) | "low" | Controls how much reasoning Grok performs before responding. Set to none for non-reasoning requests. | — |
Output
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Response format
response_format.type
|
enum (text | json_object | json_schema) | "text" | Controls whether the model returns text, JSON mode output, or structured JSON schema output. | — |
Grok Build 0.1 xAI 5 params
Length
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Max completion tokens
max_completion_tokens
|
integer (1…+∞) | — | Upper bound for visible output tokens generated in the chat completion. | — |
Sampling
3 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Temperature
temperature
|
number (0…2 step 0.1) | 1 | Controls randomness. Lower values make outputs more focused; higher values make them more varied. | — |
|
Top P
top_p
|
number (0…1 step 0.01) | 1 | Controls nucleus sampling by limiting generation to tokens within the selected cumulative probability. | — |
|
Seed
seed
|
integer | — | Optional seed used for decoding when reproducible sampling is desired. | — |
Output
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Response format
response_format.type
|
enum (text | json_object | json_schema) | "text" | Controls whether the model returns text, JSON mode output, or structured JSON schema output. | — |
DeepSeek
Deepseek Chat DeepSeek 4 params
Length
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Max tokens
max_tokens
|
integer (1…+∞) | 4096 | Maximum number of output tokens the model may generate. | — |
Sampling
2 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Temperature
temperature
|
number (0…2 step 0.1) | 1 | Controls randomness. In DeepSeek thinking mode this parameter is accepted for compatibility but has no effect. |
Not when thinking.type = "enabled"
|
|
Top P
top_p
|
number (0…1 step 0.01) | 1 | Controls nucleus sampling. In DeepSeek thinking mode this parameter is accepted for compatibility but has no effect. |
Not when thinking.type = "enabled"
|
Reasoning
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Thinking mode
thinking.type
|
enum (disabled | enabled) | "disabled" | Controls whether DeepSeek uses thinking mode before producing the final answer. | — |
Deepseek Reasoner DeepSeek 5 params
Length
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Max tokens
max_tokens
|
integer (1…+∞) | 4096 | Maximum number of output tokens the model may generate. | — |
Sampling
2 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Temperature
temperature
|
number (0…2 step 0.1) | 1 | Controls randomness. In DeepSeek thinking mode this parameter is accepted for compatibility but has no effect. |
Not when thinking.type = "enabled"
|
|
Top P
top_p
|
number (0…1 step 0.01) | 1 | Controls nucleus sampling. In DeepSeek thinking mode this parameter is accepted for compatibility but has no effect. |
Not when thinking.type = "enabled"
|
Reasoning
2 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Thinking mode
thinking.type
|
enum (enabled | disabled) | "enabled" | Controls whether DeepSeek uses thinking mode before producing the final answer. | — |
|
Reasoning effort
reasoning_effort
|
enum (high | max) | "high" | Controls DeepSeek thinking effort when thinking mode is enabled. |
Only when thinking.type = "enabled"
|
Deepseek V4 Flash DeepSeek 5 params
Length
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Max tokens
max_tokens
|
integer (1…+∞) | 4096 | Maximum number of output tokens the model may generate. | — |
Sampling
2 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Temperature
temperature
|
number (0…2 step 0.1) | 1 | Controls randomness. In DeepSeek thinking mode this parameter is accepted for compatibility but has no effect. |
Not when thinking.type = "enabled"
|
|
Top P
top_p
|
number (0…1 step 0.01) | 1 | Controls nucleus sampling. In DeepSeek thinking mode this parameter is accepted for compatibility but has no effect. |
Not when thinking.type = "enabled"
|
Reasoning
2 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Thinking mode
thinking.type
|
enum (enabled | disabled) | "enabled" | Controls whether DeepSeek uses thinking mode before producing the final answer. | — |
|
Reasoning effort
reasoning_effort
|
enum (high | max) | "high" | Controls DeepSeek thinking effort when thinking mode is enabled. |
Only when thinking.type = "enabled"
|
Deepseek V4 Pro DeepSeek 5 params
Length
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Max tokens
max_tokens
|
integer (1…+∞) | 4096 | Maximum number of output tokens the model may generate. | — |
Sampling
2 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Temperature
temperature
|
number (0…2 step 0.1) | 1 | Controls randomness. In DeepSeek thinking mode this parameter is accepted for compatibility but has no effect. |
Not when thinking.type = "enabled"
|
|
Top P
top_p
|
number (0…1 step 0.01) | 1 | Controls nucleus sampling. In DeepSeek thinking mode this parameter is accepted for compatibility but has no effect. |
Not when thinking.type = "enabled"
|
Reasoning
2 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Thinking mode
thinking.type
|
enum (enabled | disabled) | "enabled" | Controls whether DeepSeek uses thinking mode before producing the final answer. | — |
|
Reasoning effort
reasoning_effort
|
enum (high | max) | "high" | Controls DeepSeek thinking effort when thinking mode is enabled. |
Only when thinking.type = "enabled"
|
Meta
Llama 3.3 70B Instruct Meta 7 params
Length
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Max completion tokens
max_completion_tokens
|
integer (1…+∞) | — | Maximum number of output tokens the model may generate. | — |
Sampling
4 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Temperature
temperature
|
number | — | Controls randomness. Lower values make outputs more focused; higher values make them more varied. | — |
|
Top P
top_p
|
number | — | Controls nucleus sampling by limiting generation to tokens within the selected cumulative probability. | — |
|
Top K
top_k
|
integer | — | Limits generation to the selected number of highest-probability tokens. | — |
|
Repetition penalty
repetition_penalty
|
number | — | Penalizes tokens that have already appeared to reduce repetition in the output. | — |
Tools
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Tool choice
tool_choice
|
enum (auto | none | required) | — | Controls whether the model may call tools, must call one, or skips tool calls. | — |
Output
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Response format
response_format.type
|
enum (text | json_schema) | "text" | Controls whether the model returns normal text or a schema-constrained JSON object. | — |
Llama 3.3 8B Instruct Meta 7 params
Length
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Max completion tokens
max_completion_tokens
|
integer (1…+∞) | — | Maximum number of output tokens the model may generate. | — |
Sampling
4 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Temperature
temperature
|
number | — | Controls randomness. Lower values make outputs more focused; higher values make them more varied. | — |
|
Top P
top_p
|
number | — | Controls nucleus sampling by limiting generation to tokens within the selected cumulative probability. | — |
|
Top K
top_k
|
integer | — | Limits generation to the selected number of highest-probability tokens. | — |
|
Repetition penalty
repetition_penalty
|
number | — | Penalizes tokens that have already appeared to reduce repetition in the output. | — |
Tools
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Tool choice
tool_choice
|
enum (auto | none | required) | — | Controls whether the model may call tools, must call one, or skips tool calls. | — |
Output
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Response format
response_format.type
|
enum (text | json_schema) | "text" | Controls whether the model returns normal text or a schema-constrained JSON object. | — |
Llama 4 Maverick 17B 128E Instruct FP8 Meta 7 params
Length
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Max completion tokens
max_completion_tokens
|
integer (1…+∞) | — | Maximum number of output tokens the model may generate. | — |
Sampling
4 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Temperature
temperature
|
number | — | Controls randomness. Lower values make outputs more focused; higher values make them more varied. | — |
|
Top P
top_p
|
number | — | Controls nucleus sampling by limiting generation to tokens within the selected cumulative probability. | — |
|
Top K
top_k
|
integer | — | Limits generation to the selected number of highest-probability tokens. | — |
|
Repetition penalty
repetition_penalty
|
number | — | Penalizes tokens that have already appeared to reduce repetition in the output. | — |
Tools
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Tool choice
tool_choice
|
enum (auto | none | required) | — | Controls whether the model may call tools, must call one, or skips tool calls. | — |
Output
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Response format
response_format.type
|
enum (text | json_schema) | "text" | Controls whether the model returns normal text or a schema-constrained JSON object. | — |
Llama 4 Scout 17B 16E Instruct FP8 Meta 7 params
Length
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Max completion tokens
max_completion_tokens
|
integer (1…+∞) | — | Maximum number of output tokens the model may generate. | — |
Sampling
4 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Temperature
temperature
|
number | — | Controls randomness. Lower values make outputs more focused; higher values make them more varied. | — |
|
Top P
top_p
|
number | — | Controls nucleus sampling by limiting generation to tokens within the selected cumulative probability. | — |
|
Top K
top_k
|
integer | — | Limits generation to the selected number of highest-probability tokens. | — |
|
Repetition penalty
repetition_penalty
|
number | — | Penalizes tokens that have already appeared to reduce repetition in the output. | — |
Tools
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Tool choice
tool_choice
|
enum (auto | none | required) | — | Controls whether the model may call tools, must call one, or skips tool calls. | — |
Output
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Response format
response_format.type
|
enum (text | json_schema) | "text" | Controls whether the model returns normal text or a schema-constrained JSON object. | — |
Perplexity
Sonar Perplexity 12 params
Length
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Max tokens
max_tokens
|
integer (1…128000) | — | Maximum number of output tokens the model may generate. | — |
Sampling
2 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Temperature
temperature
|
number (0…2 step 0.1) | — | Controls randomness. Lower values make outputs more focused; higher values make them more varied. | — |
|
Top P
top_p
|
number (0…1 step 0.01) | — | Controls nucleus sampling by limiting generation to tokens within the selected cumulative probability. | — |
Metadata
9 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Search mode
search_mode
|
enum (web | academic | sec) | — | Selects the corpus the model searches when grounding its answer. | — |
|
Search recency filter
search_recency_filter
|
enum (hour | day | week | month | year) | — | Restricts web search results to a recent time window. | — |
|
Search domain filter
search_domain_filter
|
string | — | Limits search to, or excludes, specific domains. | — |
|
Search after date
search_after_date_filter
|
string | — | Restricts search results to content published after this date (MM/DD/YYYY). | — |
|
Search before date
search_before_date_filter
|
string | — | Restricts search results to content published before this date (MM/DD/YYYY). | — |
|
Search context size
web_search_options.search_context_size
|
enum (low | medium | high) | "low" | Controls how much web search context is retrieved before generating the answer. | — |
|
Return images
return_images
|
boolean | false | Controls whether the response may include related images from the search. | — |
|
Return related questions
return_related_questions
|
boolean | false | Controls whether the response includes suggested follow-up questions. | — |
|
Disable search
disable_search
|
boolean | false | Turns off web search so the model answers from its own knowledge only. | — |
Sonar Deep Research Perplexity 12 params
Length
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Max tokens
max_tokens
|
integer (1…128000) | — | Maximum number of output tokens the model may generate. | — |
Sampling
2 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Temperature
temperature
|
number (0…2 step 0.1) | — | Controls randomness. Lower values make outputs more focused; higher values make them more varied. | — |
|
Top P
top_p
|
number (0…1 step 0.01) | — | Controls nucleus sampling by limiting generation to tokens within the selected cumulative probability. | — |
Reasoning
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Reasoning effort
reasoning_effort
|
enum (minimal | low | medium | high) | — | Controls how much reasoning and searching the model performs before producing the report. | — |
Metadata
8 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Search mode
search_mode
|
enum (web | academic | sec) | — | Selects the corpus the model searches when grounding its answer. | — |
|
Search recency filter
search_recency_filter
|
enum (hour | day | week | month | year) | — | Restricts web search results to a recent time window. | — |
|
Search domain filter
search_domain_filter
|
string | — | Limits search to, or excludes, specific domains. | — |
|
Search after date
search_after_date_filter
|
string | — | Restricts search results to content published after this date (MM/DD/YYYY). | — |
|
Search before date
search_before_date_filter
|
string | — | Restricts search results to content published before this date (MM/DD/YYYY). | — |
|
Search context size
web_search_options.search_context_size
|
enum (low | medium | high) | "low" | Controls how much web search context is retrieved before generating the answer. | — |
|
Return images
return_images
|
boolean | false | Controls whether the response may include related images from the search. | — |
|
Return related questions
return_related_questions
|
boolean | false | Controls whether the response includes suggested follow-up questions. | — |
Sonar Pro Perplexity 12 params
Length
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Max tokens
max_tokens
|
integer (1…128000) | — | Maximum number of output tokens the model may generate. | — |
Sampling
2 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Temperature
temperature
|
number (0…2 step 0.1) | — | Controls randomness. Lower values make outputs more focused; higher values make them more varied. | — |
|
Top P
top_p
|
number (0…1 step 0.01) | — | Controls nucleus sampling by limiting generation to tokens within the selected cumulative probability. | — |
Metadata
9 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Search mode
search_mode
|
enum (web | academic | sec) | — | Selects the corpus the model searches when grounding its answer. | — |
|
Search recency filter
search_recency_filter
|
enum (hour | day | week | month | year) | — | Restricts web search results to a recent time window. | — |
|
Search domain filter
search_domain_filter
|
string | — | Limits search to, or excludes, specific domains. | — |
|
Search after date
search_after_date_filter
|
string | — | Restricts search results to content published after this date (MM/DD/YYYY). | — |
|
Search before date
search_before_date_filter
|
string | — | Restricts search results to content published before this date (MM/DD/YYYY). | — |
|
Search context size
web_search_options.search_context_size
|
enum (low | medium | high) | "low" | Controls how much web search context is retrieved before generating the answer. | — |
|
Return images
return_images
|
boolean | false | Controls whether the response may include related images from the search. | — |
|
Return related questions
return_related_questions
|
boolean | false | Controls whether the response includes suggested follow-up questions. | — |
|
Disable search
disable_search
|
boolean | false | Turns off web search so the model answers from its own knowledge only. | — |
Sonar Reasoning Pro Perplexity 12 params
Length
1 param
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Max tokens
max_tokens
|
integer (1…128000) | — | Maximum number of output tokens the model may generate. | — |
Sampling
2 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Temperature
temperature
|
number (0…2 step 0.1) | — | Controls randomness. Lower values make outputs more focused; higher values make them more varied. | — |
|
Top P
top_p
|
number (0…1 step 0.01) | — | Controls nucleus sampling by limiting generation to tokens within the selected cumulative probability. | — |
Metadata
9 params
| Parameter | Type | Default | Description | Condition |
|---|---|---|---|---|
|
Search mode
search_mode
|
enum (web | academic | sec) | — | Selects the corpus the model searches when grounding its answer. | — |
|
Search recency filter
search_recency_filter
|
enum (hour | day | week | month | year) | — | Restricts web search results to a recent time window. | — |
|
Search domain filter
search_domain_filter
|
string | — | Limits search to, or excludes, specific domains. | — |
|
Search after date
search_after_date_filter
|
string | — | Restricts search results to content published after this date (MM/DD/YYYY). | — |
|
Search before date
search_before_date_filter
|
string | — | Restricts search results to content published before this date (MM/DD/YYYY). | — |
|
Search context size
web_search_options.search_context_size
|
enum (low | medium | high) | "low" | Controls how much web search context is retrieved before generating the answer. | — |
|
Return images
return_images
|
boolean | false | Controls whether the response may include related images from the search. | — |
|
Return related questions
return_related_questions
|
boolean | false | Controls whether the response includes suggested follow-up questions. | — |
|
Disable search
disable_search
|
boolean | false | Turns off web search so the model answers from its own knowledge only. | — |