AI Models

ai.KMITL provides access to multiple state-of-the-art AI models from different providers. Each model has unique strengths and characteristics. This guide will help you choose the right model for your task.

Context Limit

All hosted models currently run with a 12,000 token context window. If a vendor model advertises a larger window, assume it is capped to 12K tokens inside ai.KMITL until further notice.
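
To get a rough sense of whether a conversation fits under the cap, a character-based estimate can help. The sketch below assumes about 4 characters per token, a common rule of thumb for English text and not a figure published by ai.KMITL; real tokenizers differ by model, and non-English text such as Thai may tokenize quite differently. Reserving part of the window for the reply is also an assumption. The names `estimate_tokens` and `fits_context` are hypothetical helpers, not part of any ai.KMITL API.

```python
import math

CONTEXT_LIMIT_TOKENS = 12_000  # cap stated in this guide
CHARS_PER_TOKEN = 4            # assumption: rough English-text heuristic


def estimate_tokens(text: str) -> int:
    """Rough token estimate from character count (heuristic, not a real tokenizer)."""
    return math.ceil(len(text) / CHARS_PER_TOKEN)


def fits_context(messages: list[str], reserve_for_reply: int = 1_000) -> bool:
    """Return True if the estimated total still leaves room for the model's reply."""
    used = sum(estimate_tokens(m) for m in messages)
    return used + reserve_for_reply <= CONTEXT_LIMIT_TOKENS


if __name__ == "__main__":
    history = ["Summarize this report.", "x" * 20_000]
    print(fits_context(history))
```

If the check fails, shortening the pasted material or starting a fresh conversation is likely the simplest fix.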

Premium Usage

Some models are marked as premium and can consume more than 1 quota unit per message. Use the quota comparison table below to double-check costs before running long conversations.

Available Models

Google Gemini & Gemma

  • Gemini 2.5 Pro — Premium reasoning, vision, PDF, and effort-control support. Best for heavy research and complex drafting.
  • Gemini 2.5 Flash / Flash Lite — Fast, balanced models for day-to-day chats, product copy, and product support flows.
  • Gemini 2.0 Flash — Stable default for most text + vision use cases with low quota cost.
  • Gemini 2.0 Flash Lite — Lightweight fallback when you need maximum throughput.
  • Google Gemma 3 (27B) — Tuned LLM for quick brainstorming with an open-source flavor.

OpenAI GPT & o-series

  • GPT 5.1 / GPT 5 / GPT 5 Codex variants — Flagship premium models focused on reasoning, code generation, and multimodal use.
  • GPT 4.1 family (Standard, Mini, Nano) — Balanced for structured outputs and product UX flows.
  • GPT 4o & 4o Mini — Good trade-off between creativity and cost for interactive chat UIs.
  • GPT OSS 20B / 120B — Open-weight variants hosted through OpenAI endpoints for experimentation.
  • o3 / o3 mini / o3 pro — Reasoning-specialized models with adjustable “effort” controls; choose pro for the most rigorous analysis.
  • o4 mini / o1 — Fast reasoning-first experiences when latency matters.

Anthropic Claude

  • Claude Sonnet 4.5 / 4 / 3.7 / 3.5 — High-quality default for instructions, analysis, and code; Sonnet 4.5 adds an optional reasoning toggle.
  • Claude Haiku 4.5 — Cost-effective, fast responses, handy for support or summarization.
  • Claude Opus 4 — Most capable Claude option for deep reasoning and nuanced planning.

xAI Grok

  • Grok 4 — Premium reasoning + multimodal support for opinionated, creative drafting.
  • Grok 4 Fast — Budget-friendly, reasoning-capable alternative suited for rapid back-and-forth chats.

DeepSeek

  • DeepSeek 3.1 — Efficient reasoning model with function-calling support.
  • DeepSeek R1 — Premium reasoning-first variant for algorithmic planning and math.

Meta Llama

  • Llama 4 Maverick & Scout — Vision-savvy assistants ideal for creative brainstorming or design reviews.
  • Llama 3.1 8B Instant — Lightweight instruct model for commodity text generation.

Image & Multimodal Creation

  • GPT Image 1 — Photorealistic and illustrative assets with multiple canvas sizes.
  • Gemini 2.5 Flash Image — Text-in/text-out with optional embedded generation.
  • Google Nano Banana — Edit images with text.
  • Google Imagen 3 & Imagen 4 — High-quality illustration.

OpenRouter

Access to various models through OpenRouter:

  • DeepSeek
  • Mixtral
  • And more specialized models

How to Choose a Model

By Task Type

| Task | Recommended Model | Why |
| --- | --- | --- |
| Coding help | Claude, GPT-5, GPT Codex | Strong at code understanding |
| Math/Science | GPT-5, Claude Opus | Excellent reasoning |
| Creative writing | Claude Opus, GPT-5 | Creative and articulate |
| Quick questions | Gemini Flash, Groq | Very fast responses |
| Document analysis | Gemini Pro | Huge context window (capped at 12K tokens in ai.KMITL) |
| General chat | Claude | Best all-around |

By Speed

Fastest: Groq, Gemini Flash
Balanced: Claude Sonnet, GPT-5, GPT-5.1
Slower but thorough: Claude Opus, GPT-5, o3

By Context Length

All models are capped at 12K tokens in ai.KMITL. Treat vendor-advertised maximums as future capabilities.

Switching Models

You can switch models at any time:

  1. Click the model selector at the top of the chat
  2. Browse or search for a model
  3. Click to select it
  4. Continue your conversation

Model Memory

When you switch models, the new model receives the entire conversation history. It will have context of everything discussed so far.

Model Capabilities

Text Generation

All models can generate text, but with different styles:

  • Claude: Natural, conversational, detailed
  • GPT: Structured, analytical, clear
  • Gemini: Factual, comprehensive, thorough

Code Understanding

Best models for coding:

  1. Claude Sonnet - Excellent explanations
  2. GPT - Strong debugging
  3. Claude Opus - Complex algorithms

Reasoning

Best models for complex reasoning:

  1. Claude Opus - Deep analysis
  2. GPT - Structured thinking
  3. Gemini Pro - Comprehensive evaluation

Multimodal (Images, Files)

All major models support:

  • ✅ Image analysis
  • ✅ PDF reading
  • ✅ Document understanding
  • ✅ Code in images

Special Features

Claude Extended Thinking

Some Claude models support extended reasoning for complex problems. The model will "think" through the problem step by step.

GPT-4 Vision

GPT-4 models have strong vision capabilities for analyzing images, diagrams, and screenshots.

Gemini Long Context

Gemini Pro is built to handle very long documents. Inside ai.KMITL it is still capped at 12K tokens like every other model, so split entire books or large codebases into sections before analyzing them.
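
Because every model in ai.KMITL is capped at 12K tokens (see Context Limit), a long document has to be fed in pieces. The following is a minimal sketch of one way to split text on paragraph boundaries. It assumes roughly 4 characters per token, which is a heuristic and not an ai.KMITL figure, and a default budget of 8,000 tokens per chunk to leave room for instructions and the reply. The function name `split_document` and both constants are hypothetical.

```python
CONTEXT_LIMIT_TOKENS = 12_000  # cap stated in this guide
CHARS_PER_TOKEN = 4            # assumption: rough heuristic, not a real tokenizer


def split_document(text: str, budget_tokens: int = 8_000) -> list[str]:
    """Split text on blank lines into chunks under an estimated token budget."""
    max_chars = budget_tokens * CHARS_PER_TOKEN
    chunks: list[str] = []
    current = ""
    for paragraph in text.split("\n\n"):
        # Hard-split any single paragraph that is larger than the budget.
        pieces = [
            paragraph[i:i + max_chars]
            for i in range(0, len(paragraph), max_chars)
        ] or [""]
        for piece in pieces:
            candidate = f"{current}\n\n{piece}" if current else piece
            if len(candidate) <= max_chars:
                current = candidate  # still fits: keep growing this chunk
            else:
                chunks.append(current)  # chunk is full: start a new one
                current = piece
    if current:
        chunks.append(current)
    return chunks
```

Each chunk can then be sent as its own message, for example asking for a summary per chunk and combining the summaries at the end.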

Usage Tips

When to Use Each Model

Starting a new topic?

Use Claude Sonnet or GPT-5 (reliable, well-rounded)

Need it fast?

Use Claude Haiku or Gemini Flash (speed optimized)

Complex problem?

Use Claude Opus or GPT-5 (deep reasoning)

Long document?

Use Gemini Pro (PDF support; keep the 12K token cap in mind)

Simple question?

Use Haiku or Gemini Flash Lite (quick and efficient)

Model Comparison Examples

Question: "What is photosynthesis?"

  • Claude Sonnet: Detailed, educational explanation
  • GPT-5: Structured, clear breakdown
  • Gemini Flash: Quick, accurate summary
  • Haiku: Concise, efficient answer

All correct, different styles!

Quota Considerations

Message Counting

Each message you send counts toward your monthly quota of 1,000 messages, and premium models can deduct more than one quota unit per message (see the table below). Choose non-premium models if you're having a long conversation!

Quota Usage Comparison

| Model | Category | Quota Cost | Premium? | Notes |
| --- | --- | --- | --- | --- |
| Gemini 2.5 Pro | Text | 5x | Yes | Full reasoning + vision |
| Gemini 2.0 Flash | Text | 2x | No | Default balanced pick |
| Gemini 2.5 Flash Lite | Text | 1x | No | Throughput friendly |
| Google Gemma 3 (27B) | Text | 1x | No | Open-source tuned |
| GPT 5.1 | Text | 5x | Yes | Flagship reasoning |
| GPT 5.1 Codex Mini | Text | 3x | Yes | Coding-focused |
| GPT 4o | Text | 1x | No | General-purpose multimodal |
| o3 Pro | Text | 20x | Yes | Max-effort reasoning |
| Grok 4 | Text | 10x | Yes | Creative + multimodal |
| Grok 4 Fast | Text | 1x | No | Low-latency chats |
| DeepSeek R1 | Text | 3x | Yes | Math & planning |
| DeepSeek 3.1 | Text | 1x | No | Efficient reasoning |
| Llama 3.1 8B Instant | Text | 1x | No | Lightweight responses |
| GPT Image 1 | Image | 15x | Yes | Photorealistic renders |
| Gemini 2.5 Flash Image | Image | 10x | Yes | Text-in/text-out image gen |
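
To see how the multipliers add up over a conversation, the sketch below totals quota units for a planned mix of models. It assumes each message deducts its multiplier from the 1,000-message monthly allowance, which is an interpretation of the table and not a documented billing formula. The dictionary holds a subset of the table, and the helper names `conversation_cost` and `remaining_quota` are hypothetical.

```python
MONTHLY_QUOTA = 1_000  # messages per month, as stated in this guide

# Quota multipliers copied from the comparison table (subset).
QUOTA_COST = {
    "Gemini 2.5 Pro": 5,
    "Gemini 2.0 Flash": 2,
    "Gemini 2.5 Flash Lite": 1,
    "GPT 5.1": 5,
    "GPT 4o": 1,
    "o3 Pro": 20,
    "DeepSeek R1": 3,
    "GPT Image 1": 15,
}


def conversation_cost(messages_by_model: dict[str, int]) -> int:
    """Total quota units for a plan of {model name: number of messages}."""
    return sum(QUOTA_COST[model] * count for model, count in messages_by_model.items())


def remaining_quota(used_units: int) -> int:
    """Quota units left this month, never below zero (assumed deduction rule)."""
    return max(MONTHLY_QUOTA - used_units, 0)


if __name__ == "__main__":
    all_premium = conversation_cost({"GPT 5.1": 12})
    mixed = conversation_cost({"Gemini 2.5 Flash Lite": 10, "GPT 5.1": 2})
    print(all_premium, mixed, remaining_quota(mixed))
```

Under these assumptions, twelve messages on GPT 5.1 cost 60 units, while ten exploratory messages on Gemini 2.5 Flash Lite followed by two on GPT 5.1 cost 20.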

Quota-Friendly Strategies

  1. Use faster models for simple questions
  2. Use powerful models when you need accuracy
  3. Combine models: Ask Gemini Flash first, then GPT-5 for details
  4. Edit prompts before sending to get it right the first time

Bring Your Own Key (BYOK)

If you have your own API keys:

  • Use any model without limits
  • No monthly message limit
  • Full control over costs
  • See BYOK Guide for setup

Custom Models

If you've added your own API keys, you can also:

  • Use beta models
  • Access newest releases immediately
  • Configure custom parameters

Frequently Asked Questions

Can I use multiple models in one conversation?

Yes! Switch models anytime. The conversation history transfers over.

Which model is best?

For most users: Gemini 2.5 Flash is the best starting point. It's fast, capable, and handles most tasks excellently.

Do different models cost different amounts?

  • All non-premium models count equally (1 message = 1 message).
  • Premium models cost more (see the Quota Usage Comparison table above).
  • With BYOK, actual costs vary by provider.

Why does model X sometimes give better answers than model Y?

Each model has different training, strengths, and characteristics. Try a few to find what works best for your needs.

Can I request new models?

Yes! Contact support with model requests.


Experiment!

Don't be afraid to try different models. You'll quickly learn which ones work best for your specific needs.

Made with ❤️ by KDMC (KMITL Data Management Center)