LLM parameter glossary

Length

Max tokens max_tokens integer · 52 models

Maximum number of output tokens the model may generate.

Supported by: Anthropic, DeepSeek, Mistral, OpenAI

Max tokens max_completion_tokens integer · 13 models

Upper bound on the number of tokens the model may generate for a completion, counting both visible output tokens and reasoning tokens.

Supported by: OpenAI

Stop sequence stop string · 13 models

Stops generation when this string is detected.

Supported by: Mistral
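The effect of a stop sequence can be pictured as truncating the generated text at the first match. The sketch below is only an illustration of that idea: the function name apply_stop is hypothetical, and it assumes the stop string itself is left out of the returned text, which this glossary does not state.

```python
def apply_stop(text: str, stop: str) -> str:
    """Return text up to, but not including, the first occurrence of stop.

    Hypothetical helper for illustration; excluding the stop string from
    the result is an assumption, not something the glossary specifies.
    """
    if not stop:
        # An empty stop string would match at position 0; treat it as "no stop".
        return text
    index = text.find(stop)
    return text if index == -1 else text[:index]


truncated = apply_stop("Answer: 42\n\nQuestion: next", "\n\n")
```

If the stop string never appears, the text is returned unchanged.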

Max output tokens generationConfig.maxOutputTokens integer · 4 models

Maximum number of tokens to include in a response candidate.

Supported by: Google
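The dotted keys in this glossary, such as generationConfig.maxOutputTokens, read most naturally as paths to nested fields in the request body. A minimal sketch of that reading follows. The helper name set_param and the LENGTH_KEY table are hypothetical; the provider-to-key pairs are taken from the entries above, and treating the dot as nesting is an assumption.

```python
from typing import Any

# Output-length key per provider, as listed in the entries above.
# OpenAI is additionally listed under max_completion_tokens for some models.
LENGTH_KEY = {
    "anthropic": "max_tokens",
    "deepseek": "max_tokens",
    "mistral": "max_tokens",
    "openai": "max_tokens",
    "google": "generationConfig.maxOutputTokens",
}


def set_param(payload: dict[str, Any], dotted_key: str, value: Any) -> dict[str, Any]:
    """Set value at a dotted path, creating intermediate dicts as needed.

    Hypothetical helper; assumes each dot separates one level of nesting.
    """
    *parents, leaf = dotted_key.split(".")
    node = payload
    for name in parents:
        node = node.setdefault(name, {})
    node[leaf] = value
    return payload


google_body = set_param({}, LENGTH_KEY["google"], 256)
mistral_body = set_param({}, LENGTH_KEY["mistral"], 256)
```

Here google_body nests the limit under generationConfig, while mistral_body carries it as a top-level field.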

Sampling

Temperature temperature number · 49 models

Controls randomness. Lower values make outputs more focused; higher values make them more varied.

Supported by: Anthropic, DeepSeek, Mistral, OpenAI
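Temperature is commonly applied by dividing the logits by the temperature before the softmax. The sketch below shows that standard formulation in plain Python; the function name is hypothetical, the logits are made-up numbers, and whether a given provider implements temperature exactly this way is an assumption.

```python
import math


def softmax_with_temperature(logits: list[float], temperature: float) -> list[float]:
    """Convert logits to probabilities after scaling by 1 / temperature.

    Illustrative only; this sketch rejects non-positive temperatures.
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive in this sketch")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


logits = [2.0, 1.0, 0.1]  # made-up example values
focused = softmax_with_temperature(logits, 0.5)
varied = softmax_with_temperature(logits, 1.5)
```

With the lower temperature, more probability mass concentrates on the most likely token; with the higher one, the distribution flattens.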

Top P top_p number · 49 models

Controls nucleus sampling: the model samples only from the smallest set of most likely tokens whose cumulative probability reaches this value.

Supported by: Anthropic, DeepSeek, Mistral, OpenAI
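Nucleus sampling can be sketched as follows: rank the tokens by probability, keep them until the running total reaches top_p, then renormalize. The function name top_p_filter and the example distribution are hypothetical, and real implementations operate on the full vocabulary rather than a small dict.

```python
def top_p_filter(probs: dict[str, float], top_p: float) -> dict[str, float]:
    """Keep the most likely tokens until their cumulative probability
    reaches top_p, then renormalize the survivors. Illustrative sketch."""
    ranked = sorted(probs.items(), key=lambda item: item[1], reverse=True)
    kept = []
    cumulative = 0.0
    for token, p in ranked:
        kept.append((token, p))
        cumulative += p
        if cumulative >= top_p:
            break
    total = sum(p for _, p in kept)
    return {token: p / total for token, p in kept}


# Made-up next-token distribution.
nucleus = top_p_filter({"the": 0.5, "a": 0.3, "an": 0.15, "this": 0.05}, 0.75)
```

In this example only "the" and "a" survive, because together they are the first tokens whose probabilities add up to at least 0.75.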

Top K top_k integer · 22 models

Limits token sampling to the top K most likely next tokens.

Supported by: Anthropic
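Top K filtering is simpler than nucleus sampling: keep a fixed number of candidates regardless of their probabilities. A minimal sketch, with a hypothetical function name and a made-up distribution:

```python
def top_k_filter(probs: dict[str, float], top_k: int) -> dict[str, float]:
    """Keep the top_k most likely tokens and renormalize. Illustrative sketch."""
    if top_k < 1:
        raise ValueError("top_k must be at least 1 in this sketch")
    ranked = sorted(probs.items(), key=lambda item: item[1], reverse=True)[:top_k]
    total = sum(p for _, p in ranked)
    return {token: p / total for token, p in ranked}


# Made-up next-token distribution.
distribution = {"the": 0.5, "a": 0.3, "an": 0.15, "this": 0.05}
top_two = top_k_filter(distribution, 2)
```

A top_k larger than the number of candidates leaves the distribution unchanged apart from renormalization.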

Frequency penalty frequency_penalty number · 13 models

Penalizes tokens in proportion to how often they already appear in the generated text.

Supported by: Mistral

Presence penalty presence_penalty number · 13 models

Penalizes tokens that have already appeared in the generated text, regardless of how often, to encourage more varied output.

Supported by: Mistral
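One common way to combine the two penalties is to subtract them from each token's logit: the frequency penalty scales with the token's count so far, while the presence penalty is a flat amount applied once a token has appeared at all. The sketch below follows that formulation, which is the one described in OpenAI's documentation; whether Mistral applies exactly the same formula is an assumption. The function name and example values are hypothetical.

```python
from collections import Counter


def apply_penalties(
    logits: dict[str, float],
    generated: list[str],
    frequency_penalty: float = 0.0,
    presence_penalty: float = 0.0,
) -> dict[str, float]:
    """Lower the logits of tokens that already occur in generated.

    Assumed formula: logit - count * frequency_penalty
                           - (presence_penalty if count > 0 else 0)
    """
    counts = Counter(generated)
    adjusted = {}
    for token, logit in logits.items():
        count = counts.get(token, 0)
        presence = presence_penalty if count > 0 else 0.0
        adjusted[token] = logit - count * frequency_penalty - presence
    return adjusted


adjusted = apply_penalties(
    {"cat": 2.0, "dog": 2.0, "bird": 2.0},  # made-up logits
    ["cat", "cat", "dog"],                  # tokens generated so far
    frequency_penalty=0.5,
    presence_penalty=0.25,
)
```

In this example "cat" is penalized most because it appeared twice, "dog" less, and "bird" not at all.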

Random seed random_seed integer · 13 models

Seed used for deterministic sampling when reproducible outputs are desired.

Supported by: Mistral
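The idea behind a sampling seed can be shown locally with Python's standard random module: the same seed and the same distribution give the same draws. This is only an analogy; the function name and distribution are hypothetical, and the sketch makes no claim about how strictly any provider honors a seed.

```python
import random


def sample_tokens(probs: dict[str, float], n: int, seed: int) -> list[str]:
    """Draw n tokens from probs using a private, seeded generator."""
    rng = random.Random(seed)  # separate generator, so global state is untouched
    tokens = list(probs)
    weights = [probs[token] for token in tokens]
    return rng.choices(tokens, weights=weights, k=n)


distribution = {"the": 0.5, "a": 0.3, "an": 0.2}  # made-up values
first = sample_tokens(distribution, 5, seed=1234)
second = sample_tokens(distribution, 5, seed=1234)
```

Both calls return the same list because they start from the same seed.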

Seed generationConfig.seed integer · 4 models

Optional seed used for decoding when reproducible sampling is desired.

Supported by: Google

Temperature generationConfig.temperature number · 4 models

Controls randomness. Lower values make outputs more focused; higher values make them more varied.

Supported by: Google

Top K generationConfig.topK integer · 4 models

Limits token sampling to the top K most likely next tokens.

Supported by: Google

Top P generationConfig.topP number · 4 models

Controls nucleus sampling by limiting generation to tokens within the selected cumulative probability.

Supported by: Google

Reasoning

Thinking mode thinking.type enum · 21 models

Selects the thinking mode. The accepted values depend on the model.

Supported by: Anthropic, DeepSeek

Reasoning effort reasoning_effort enum · 17 models

Controls how much reasoning the model should perform before producing an answer.

Supported by: DeepSeek, OpenAI

Budget tokens thinking.budget_tokens integer · 16 models

Maximum number of tokens the model may spend on extended thinking before producing the final answer.

Supported by: Anthropic
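A request body that sets a thinking budget might be assembled as in the sketch below. Several details are assumptions not stated in this glossary: the value "enabled" for thinking.type, the rule that the budget should stay below max_tokens, and the helper name thinking_request. Check the provider documentation for the values a specific model accepts.

```python
def thinking_request(max_tokens: int, budget_tokens: int) -> dict:
    """Build a partial request body with a thinking budget.

    Assumptions: thinking.type accepts "enabled", and budget_tokens
    should be smaller than max_tokens so room remains for the answer.
    """
    if budget_tokens >= max_tokens:
        raise ValueError("budget_tokens should be smaller than max_tokens")
    return {
        "max_tokens": max_tokens,
        "thinking": {"type": "enabled", "budget_tokens": budget_tokens},
    }


body = thinking_request(max_tokens=4096, budget_tokens=2048)
```

The body still needs the model name and messages before it could be sent; those are omitted here.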

Reasoning effort reasoning.effort enum · 9 models

Controls how much reasoning the model should perform before producing an answer.

Supported by: OpenAI

Reasoning summary reasoning.summary enum · 9 models

Controls the level of reasoning summary returned with the response.

Supported by: OpenAI

Thinking display thinking.display enum · 7 models

Controls whether Anthropic returns summarized or omitted thinking content.

Supported by: Anthropic

Include thoughts generationConfig.thinkingConfig.includeThoughts boolean · 4 models

Controls whether Gemini returns available thought summaries in the response parts.

Supported by: Google

Effort output_config.effort enum · 4 models

Controls Anthropic response thoroughness and token spend.

Supported by: Anthropic

Thinking budget generationConfig.thinkingConfig.thinkingBudget integer · 3 models

Number of thinking tokens Gemini should use; -1 uses dynamic thinking, 0 disables thinking, and fixed budgets start at 512 tokens.

Supported by: Google
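The three cases in this description can be written out as a small check. The function name describe_thinking_budget is hypothetical. The sketch treats values from 1 to 511 as invalid because fixed budgets are said to start at 512, and it does not check an upper bound because none is given here.

```python
def describe_thinking_budget(budget: int) -> str:
    """Classify a thinkingBudget value using the rules in the entry above."""
    if budget == -1:
        return "dynamic"
    if budget == 0:
        return "disabled"
    if budget >= 512:
        return "fixed"
    # Covers 1..511 and negative values other than -1.
    raise ValueError(f"unsupported thinkingBudget in this sketch: {budget}")


modes = [describe_thinking_budget(value) for value in (-1, 0, 512, 2048)]
```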

Thinking level generationConfig.thinkingConfig.thinkingLevel enum · 1 model

Controls Gemini 3.5 Flash reasoning effort.

Supported by: Google

Output

Response format response_format.type enum · 13 models

Controls whether the model returns normal text or JSON mode output.

Supported by: Mistral
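When JSON mode is requested, the reply text is meant to be parseable as JSON. The sketch below adds the field to a request body and parses a stand-in reply. The value "json_object" is an assumption about the enum, the helper name with_json_mode is hypothetical, and reply_text is a hard-coded placeholder rather than real model output.

```python
import json


def with_json_mode(body: dict) -> dict:
    """Return a copy of body that asks for JSON mode output.

    Assumes "json_object" is an accepted response_format.type value.
    """
    updated = dict(body)
    updated["response_format"] = {"type": "json_object"}
    return updated


request_body = with_json_mode({"max_tokens": 128})

# Placeholder standing in for the text of a model reply.
reply_text = '{"status": "ok", "items": [1, 2]}'
parsed = json.loads(reply_text)
```

In practice it is worth wrapping json.loads in error handling, since a reply cut off by the output-length limit may not be valid JSON.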

Verbosity text.verbosity enum · 9 models

Controls how concise or detailed the model's final text response should be.

Supported by: OpenAI

Response MIME type generationConfig.responseMimeType enum · 4 models

MIME type for generated text candidates.

Supported by: Google

Metadata

Safe prompt safe_prompt boolean · 13 models

Controls whether Mistral injects its safety prompt before the conversation.

Supported by: Mistral
