Reasoning Tokens
TheAgentic's models include reasoning tokens in every response. These tokens represent the logical steps the model takes to generate its output.
See reasoning tokens in action
To see the reasoning tokens in each response, pass return_reasoning=True in the extra_body parameter of the chat completion request.
# `client` is assumed to be an OpenAI-compatible client configured for TheAgentic's API
response = client.chat.completions.create(
    model="agentic-large",
    messages=[...],
    extra_body={"return_reasoning": True},
)

The response will include the reasoning tokens in the reasoning_content field of the ChatCompletionMessage object.
print(response.choices[0].message.reasoning_content)

When streaming the response, the reasoning tokens will be streamed in the reasoning_content field.
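For streaming, the request itself must be created with stream=True. A minimal sketch, assuming the API follows the standard OpenAI streaming interface (same client and extra_body flag as above):

response = client.chat.completions.create(
    model="agentic-large",
    messages=[...],
    stream=True,
    extra_body={"return_reasoning": True},
)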
for chunk in response:
    delta = chunk.choices[0].delta
    # streaming chunks carry deltas, not full messages; reasoning_content may be absent on some chunks
    print(getattr(delta, "reasoning_content", None) or "", end="")

Reasoning Tokens in Usage
Below is an example of a chat response where the number of reasoning tokens for the request can be seen in the usage section.
ChatCompletion(
    id='179863581b5af5185fa08f87387ab013',
    choices=[...],
    created=1744236734,
    model='agentic-large',
    object='chat.completion',
    service_tier=None,
    system_fingerprint=None,
    usage=CompletionUsage(
        completion_tokens=408,
        prompt_tokens=1772,
        total_tokens=2180,
        completion_tokens_details=CompletionTokensDetails(
            accepted_prediction_tokens=0,
            audio_tokens=0,
            reasoning_tokens=94,
            rejected_prediction_tokens=0
        ),
        prompt_tokens_details=PromptTokensDetails(
            audio_tokens=0,
            cached_tokens=0
        )
    ),
    metadata={}
)

In this example response from the API, the usage field contains the information about token counts.
- prompt_tokens includes both the system prompt sent by the user and the internal system prompt
- completion_tokens_details.reasoning_tokens is the count of reasoning tokens for this request
- completion_tokens includes the total response from the LLM (reasoning tokens + output tokens)
- total_tokens is the sum of prompt_tokens and completion_tokens
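These counts can be read straight off the response object. A minimal sketch using the example values above (field names follow the usage object shown earlier):

usage = response.usage
reasoning = usage.completion_tokens_details.reasoning_tokens  # 94
visible_output = usage.completion_tokens - reasoning          # 408 - 94 = 314 output tokens
total = usage.prompt_tokens + usage.completion_tokens         # 1772 + 408 = 2180 == usage.total_tokens
print(f"reasoning={reasoning}, visible_output={visible_output}, total={total}")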