Vision Support
TheAgentic LLMs support vision capabilities in chat completions. This allows you to send images along with text messages for multimodal AI interactions.
The vision feature enables the model to analyze, understand, and respond to visual content such as photographs, charts, and diagrams.
Code Example
Here's how to use the OpenAI SDK to send an image URL in a chat completion request to the agentic-large model:
from openai import OpenAI

client = OpenAI(
    api_key="your-api-key",
    base_url="https://api.theagentic.ai"
)

response = client.chat.completions.create(
    model="agentic-large",
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "text",
                    "text": "What do you see in this image?"
                },
                {
                    "type": "image_url",
                    "image_url": {
                        "url": "https://example.com/image.jpg"
                    }
                }
            ]
        }
    ],
    max_tokens=300
)

print(response.choices[0].message.content)

Best Practices
- Image Resolution: For optimal performance, use images with reasonable resolution (typically under 2048x2048 pixels)
- File Size: Keep image files under 20MB for faster processing
- Multiple Images: You can include multiple images in a single request by adding multiple image_url objects (see the sketch after this list)
- Context: Provide clear text instructions about what you want the model to do with the image
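As a sketch of the multiple-images case, the request below adds a second image_url object to the same user message and asks the model to compare the two. It assumes the same client setup as the example above; the image URLs and prompt text are placeholders.

# Sketch: two images in one request (assumes the `client` created in the example above)
response = client.chat.completions.create(
    model="agentic-large",
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "text",
                    "text": "Compare these two images and describe the differences."
                },
                {
                    "type": "image_url",
                    "image_url": {"url": "https://example.com/image-1.jpg"}
                },
                {
                    "type": "image_url",
                    "image_url": {"url": "https://example.com/image-2.jpg"}
                }
            ]
        }
    ],
    max_tokens=300
)

print(response.choices[0].message.content)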
Limitations
Vision capabilities are optimized for static images. Video files are not currently supported.