Vision Support
TheAgentic LLMs support vision capabilities in chat completions. This allows you to send images along with text messages for multimodal AI interactions.
The vision feature enables the model to analyze, understand, and respond to visual content such as photographs, charts, and diagrams.
Code Example
Here's how to use the OpenAI SDK to send an image URL in a chat completion request to the agentic-large model:
from openai import OpenAI

client = OpenAI(
    api_key="your-api-key",
    base_url="https://api.theagentic.ai"
)

response = client.chat.completions.create(
    model="agentic-large",
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "text",
                    "text": "What do you see in this image?"
                },
                {
                    "type": "image_url",
                    "image_url": {
                        "url": "https://example.com/image.jpg"
                    }
                }
            ]
        }
    ],
    max_tokens=300
)

print(response.choices[0].message.content)

Best Practices
- Image Resolution: For optimal performance, use images with reasonable resolution (typically under 2048x2048 pixels)
- File Size: Keep image files under 20MB for faster processing
- Multiple Images: You can include multiple images in a single request by adding multiple image_url objects (see the sketch after this list)
- Context: Provide clear text instructions about what you want the model to do with the image
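As a sketch of the multiple-images case, the request below adds a second image_url object to the same user message and asks the model to compare the two. It assumes the same client setup as the example above; the image URLs and prompt text are placeholders.

# Sketch: two images in one request (assumes the `client` created in the example above)
response = client.chat.completions.create(
    model="agentic-large",
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "text",
                    "text": "Compare these two images and describe the differences."
                },
                {
                    "type": "image_url",
                    "image_url": {"url": "https://example.com/image-1.jpg"}
                },
                {
                    "type": "image_url",
                    "image_url": {"url": "https://example.com/image-2.jpg"}
                }
            ]
        }
    ],
    max_tokens=300
)

print(response.choices[0].message.content)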
Limitations
Vision capabilities are optimized for static images. Video files are not currently supported.