AI Model Registry | Vibgrate

Comprehensive catalog of AI models, providers, capabilities, and pricing.

Aider

AI pair programming tool for terminal with git integration

Amazon CodeWhisperer

AWS-native AI coding assistant with security scanning

Amazon Nova Lite

Multimodal Nova model for image and video understanding

Amazon Nova Micro

Fastest and most cost-effective Nova model

Amazon Nova Premier

Most capable Nova model for complex reasoning

Amazon Nova Pro

Balanced Nova model for most tasks

Amazon Q Developer

Next-gen AWS coding assistant with broad AWS service integration

Amazon Titan Text Express

Fast and cost-effective model for general tasks

Amazon Titan Text Lite

Lightweight model for cost-sensitive applications

Amazon Titan Text Premier

Most capable Titan model for complex tasks

Athene V2 Chat 72B

Qwen-based model optimized for chat and reasoning

Aya Expanse 32B

Multilingual model supporting 23 languages

Aya Expanse 8B

Efficient multilingual model

Baichuan 2 13B

Chinese-focused large language model

Bark

Open-source text-to-audio model

BGE-M3

Multi-lingual, multi-functionality embedding model

BLOOM

Multilingual open model supporting 46 languages

c4ai-command-r-08-2024

Latest Command R with RAG optimizations

Claude 3 Haiku

Fastest Claude 3 model for instant responses

Claude 3 Opus

Powerful model for complex tasks requiring deep expertise

ai-model, migration, legacy-systems

Claude 3 Sonnet

Balanced Claude 3 model for enterprise tasks

Claude 3.5 Haiku

Fast and affordable model for high-volume tasks

claude-3-5, ai-models, migration

Claude 3.5 Opus

Enhanced Opus model with superior reasoning

Claude 3.5 Sonnet

Most intelligent Claude model, excels at coding and complex reasoning

ai-model, migration, code-conversion

Claude 4 Haiku

Fast and efficient Claude 4 model

Claude 4 Opus

Most capable Claude model with extended thinking

Claude 4 Sonnet

Balanced Claude 4 model with strong coding abilities

Claude 4.5 Haiku

Fastest Claude 4.5 model optimized for quick tasks and high-throughput applications

Claude 4.5 Opus

Anthropic's most capable model with breakthrough reasoning, extended thinking, and exceptional coding abilities

Claude 4.5 Sonnet

High-performance Claude model balancing intelligence and speed, excels at code generation and analysis

Claude 4.6 Haiku

Ultra-fast Claude 4.6 model for real-time applications and high-volume processing

Claude 4.6 Opus

Latest flagship Anthropic model with state-of-the-art reasoning, coding expertise, and agentic capabilities

Claude 4.6 Sonnet

Most advanced Claude Sonnet with exceptional coding and reasoning, ideal balance of capability and efficiency

Claude Computer Use

Claude model specialized for computer control and automation

Claude Haiku 4.5

Fastest Claude model with near-frontier intelligence and extended thinking

Claude Opus 4

Latest flagship Claude model with superior reasoning

Claude Opus 4.6

The most intelligent Claude model for building agents and coding with extended thinking

Claude Opus 4.6 Fast

A faster variant of Claude Opus 4.6 exposed via OpenRouter, aimed at high-throughput production workloads while retaining the Opus-class capability profile.

reasoning, text-generation, tool-use

Claude Opus 4.7

A new Claude Opus-series frontier model version listed on OpenRouter with a 1M-token context window, intended for high-end reasoning and long-context workloads.

reasoning, long-context

Claude Opus 4.7 Fast

A faster variant of Claude Opus 4.7 offered via OpenRouter, targeting lower-latency usage while retaining a large (1M-token) context window.

reasoning, long-context, tool-use

Claude Sonnet 4

Balanced Claude 4 model optimized for coding

Claude Sonnet 4.6

Best combination of speed and intelligence with extended thinking support

Code Llama 13B

Mid-size code-specialized Llama model

Code Llama 3 70B

Meta's latest Code Llama based on Llama 3 architecture

Code Llama 3 8B

Efficient Code Llama 3 for local development

Code Llama 34B

Large code-specialized Llama model

Code Llama 34B Instruct

Efficient instruction-tuned Code Llama for coding tasks

Code Llama 70B

Specialized code model fine-tuned from Llama 2 for programming tasks

code-model, migration, ai-tools

Code Llama 70B Instruct

Instruction-tuned Code Llama for following complex coding instructions

Code Llama 7B

Code-specialized Llama model for development

Code Llama Instruct 34B

Instruction-tuned Code Llama for complex tasks

Code Llama Python 34B

Python-specialized Code Llama model

CodeGeeX 4

Open-source multilingual code generation model with strong performance

CodeGeeX 5

Latest multilingual code generation model with enhanced capabilities

CodeGemma 7B

Code-specialized open model based on Gemma for programming tasks

Codeium

Free AI code completion with broad IDE support

CodeQwen 1.5 7B

Efficient code model based on Qwen 1.5 architecture

Codestral

Specialized code model trained on 80+ programming languages

codestral, mistral-ai, migration

Codestral 25.02

Latest Mistral coding model with enhanced performance

Codestral 2501

Latest Mistral coding model with improved performance and longer context

Codestral Mamba

Mamba-architecture code model for unlimited context

Codey

Google's code-specialized model for enterprise development

Cohere Embed v3

Enterprise-grade embedding model

Command

General-purpose instruction-following model

Command A

Latest flagship model optimized for enterprise tasks

Command Code

Cohere's enterprise code model for development tasks

Command Light

Lightweight model for simple tasks

Command R

RAG-optimized model for enterprise search

Command R+

Most capable Cohere model for complex tasks

Continue

Open-source AI code assistant supporting multiple models

Cursor AI

AI-native code editor with advanced code understanding

DALL-E 3

OpenAI's latest image generation model

DBRX

MoE model optimized for enterprise

DeepSeek Chat

Optimized chat model for conversations

DeepSeek Coder 33B Instruct

Instruction-tuned DeepSeek coding model for following coding instructions

DeepSeek Coder V2

Code-specialized MoE model supporting 300+ languages

ai-migration, code-conversion, data-transformation

DeepSeek Coder V3

Latest DeepSeek coding model with state-of-the-art code understanding

DeepSeek R1

Reasoning model with chain-of-thought capabilities

DeepSeek R1 Coder

DeepSeek's reasoning model specialized for complex coding tasks

DeepSeek R1 Distill Llama 70B

Distilled R1 model based on Llama 70B

DeepSeek R1 Distill Llama 8B

Efficient Llama-based reasoning model

DeepSeek R1 Distill Qwen 1.5B

Ultra-compact reasoning model

DeepSeek R1 Distill Qwen 32B

Distilled R1 model based on Qwen for efficient reasoning

DeepSeek R1 Distill Qwen 7B

Compact distilled reasoning model

DeepSeek Reasoner

API-accessible reasoning model based on R1

DeepSeek V2

Efficient MoE model with strong general capabilities

DeepSeek V3

MoE model with 671B parameters achieving frontier performance

deepseek, ai-model, migration

DeepSeek V4 Flash

DeepSeek’s V4 Flash foundation model listing with a 1M-token context window, optimized for lower-latency long-context tasks.

long-context, reasoning, code-generation

DeepSeek V4 Pro

DeepSeek’s V4 Pro foundation model listing with a 1M-token context window, intended for long-context reasoning and agentic workloads.

long-context, reasoning, code-generation

Devstral

Mistral's agentic coding model for complex development tasks

E5-Mistral-7B-Instruct

Instruction-following embedding model

ElevenLabs Turbo v2.5

Fast text-to-speech model

EXAONE 3.5 32B

Korean-English bilingual model from LG

EXAONE 3.5 7.8B

Efficient Korean-English model

Falcon 180B

Largest open Falcon model

Falcon 3 10B

Latest Falcon 3 model for efficient deployment

Falcon 40B

Mid-size Falcon model

Falcon 7B

Efficient Falcon model

FLUX 1.1 Pro

High-quality image generation model

FLUX.1 [dev]

Open-weight image model for development

Gemini 1.0 Pro

Original Gemini Pro model for general tasks

Gemini 1.5 Flash

Fast and versatile model for diverse tasks at scale

gemini-1-5-flash, google-ai, code-conversion

Gemini 1.5 Pro

Production-ready model with massive context window for complex tasks

gemini-1-5-pro, google-ai, software-migrations

Gemini 2.0 Flash

Next-generation multimodal model with native tool use and agentic capabilities

gemini-2-0, ai-model, software-migration

Gemini 2.0 Flash Thinking

Flash model with explicit reasoning for complex tasks

Gemini 2.0 Pro

Advanced Gemini 2.0 model for complex reasoning tasks

Gemini 2.5 Flash

Fast and efficient Gemini 2.5 model with thinking

Gemini 2.5 Flash-Lite

Fastest and most budget-friendly multimodal model in the Gemini 2.5 family

Gemini 2.5 Pro

Latest Gemini model with enhanced thinking capabilities

Gemini 2.5 Pro Thinking

Google's most advanced reasoning model with extended chain-of-thought capabilities

Gemini 2.5 Ultra

Google's most powerful model for demanding enterprise tasks and complex reasoning

Gemini 3 Flash

Frontier-class performance rivaling larger models at a fraction of the cost

Gemini 3 Pro

Google's state-of-the-art reasoning model with advanced multimodal understanding

Gemini 3.0 Flash

Next-generation fast model with improved efficiency and multimodal capabilities

Gemini 3.0 Pro

Latest Gemini Pro with enhanced reasoning and coding capabilities across all modalities

Gemini 3.1 Flash Image (Preview)

Google's Flash-speed image generation and editing model referenced as "Nano Banana 2" and listed on OpenRouter as a Gemini 3.1 Flash Image preview.

image-generation, image-editing, multimodal

Gemini 3.1 Flash Live

A low-latency, live audio-capable Gemini Flash model designed for more natural, reliable real-time voice interactions across Google products.

audio, real-time, multimodal

Gemini 3.1 Flash TTS

A text-to-speech model focused on next-generation expressive speech, now available across Google products.

text-to-speech, speech-generation

Gemini 3.1 Flash-Lite

Google’s fastest and most cost-efficient Gemini 3 series model, built for intelligence at scale.

reasoning, chat, tool-use

Gemini 3.1 Pro

Advanced intelligence with complex problem-solving, agentic and vibe coding capabilities

Gemini 3.1 Pro Preview

Preview release of Google's Gemini 3.1 Pro model with a very large context window, aimed at advanced general-purpose reasoning and long-context workloads.

reasoning, long-context, tool-use

Gemini 3.1 Pro Preview (Custom Tools)

A Gemini 3.1 Pro preview variant listed on OpenRouter that is explicitly labeled for custom tools, suggesting enhanced tool-use integration with a very large context window.

tool-use, reasoning, long-context

Gemini 3.5

Google’s Gemini 3.5 is a new frontier model series focused on combining strong general intelligence with agentic action/tool use, announced at Google I/O 2026.

reasoning, tool-use, agentic-workflows

Gemini 3.5 Flash

A fast, efficient Gemini 3.5-series model variant listed on OpenRouter, intended for low-latency agentic and general assistant workloads with a very large context window.

reasoning, tool-use, agentic-workflows

Gemini Code

Specialized coding model optimized for software development and code understanding

Gemini Computer Use

Specialized model for UI automation - clicking, typing, and navigating browser tasks

Gemini Deep Research

Agentic model for autonomous multi-step research across hundreds of sources

Gemini Ultra

Most capable Gemini model for complex tasks

Gemma 2 27B

Open-weight model for research and development

Gemma 2 9B

Efficient open-weight model for various tasks

Gemma 4 26B A4B IT

An instruction-tuned Gemma 4 model listed on OpenRouter, positioned as a large open model for general-purpose chat and instruction following with a long context window.

text-generation, instruction-following, reasoning

Gemma 4 31B IT

An instruction-tuned Gemma 4 family model offered via OpenRouter with a very large context window, aimed at general-purpose assistant and agentic workflows.

instruction-following, reasoning, tool-use

Gemma 7B

Original Gemma model for lightweight tasks

GitHub Copilot

AI pair programmer powered by OpenAI with deep GitHub integration

GitHub Copilot Chat

Conversational AI for coding powered by GPT-5

GitHub Copilot Workspace

Agentic AI for complex multi-file development tasks

GLM-4 9B

Efficient bilingual model from GLM family

GPT-3.5 Turbo

Fast and cost-effective model for everyday tasks

GPT-4

Original GPT-4 model with strong reasoning and coding capabilities

GPT-4 Turbo

Enhanced GPT-4 with 128K context and improved performance

GPT-4.1

Optimized GPT-4 variant with improved coding and instruction following

GPT-4.1 mini

Cost-effective version of GPT-4.1 for everyday tasks

GPT-4.1 nano

Smallest and fastest GPT-4.1 variant for quick tasks

GPT-4.5 Preview

Next-generation GPT model with enhanced reasoning and multimodal capabilities

GPT-4o

Multimodal flagship model with vision and audio capabilities, optimized for speed and cost

gpt-4o, openai, code-migration

GPT-4o Mini

Affordable small model for fast, lightweight tasks

ai-model, migration-tasks, code-conversion

GPT-5

Next-generation GPT model (announced for 2025)

GPT-5 Codex

GPT-5 optimized for agentic coding in Codex

GPT-5 Mini

Faster, cost-efficient version of GPT-5 for well-defined tasks

GPT-5 Nano

Fastest, most cost-efficient version of GPT-5

GPT-5 Pro

GPT-5 variant producing smarter and more precise responses

GPT-5.1

Intelligent reasoning model for coding and agentic tasks with configurable reasoning effort

GPT-5.1 Codex

GPT-5.1 optimized for agentic coding in Codex environment

GPT-5.1 Codex Max

GPT-5.1 Codex optimized for long-running coding tasks

GPT-5.1 Codex Mini

Cost-effective smaller version of GPT-5.1 Codex

GPT-5.2

OpenAI's best model for coding and agentic tasks across industries

GPT-5.2 Codex

Most intelligent coding model optimized for long-horizon agentic coding tasks

GPT-5.2 Pro

Most capable GPT-5.2 variant producing smarter and more precise responses

GPT-5.3 Codex

A new Codex-branded GPT-5.3 model intended for code-centric use cases, listed as newly added on OpenRouter with a large context window.

code-generation, reasoning

GPT-5.3 Instant

Conversation-focused GPT-5.3 variant announced by OpenAI for smoother, more useful everyday chat interactions.

chat, reasoning, summarization

GPT-5.4

OpenAI frontier foundation model positioned as more capable and efficient for professional work, with state-of-the-art coding, computer use, and tool search, plus a 1M-token context window.

reasoning, code-generation, tool-use

GPT-5.4 Image 2

An OpenAI multimodal model oriented around image understanding/generation workflows, listed on OpenRouter as a new GPT-5.4 image-capable offering with a large context window.

vision, image-generation, multimodal

GPT-5.4 Pro

Higher-tier GPT-5.4 offering listed by OpenRouter, providing a 1M-token context window for advanced professional and agentic workloads.

reasoning, code-generation, tool-use

GPT-5.4-Cyber

A GPT-5.4-derived model introduced under OpenAI’s Trusted Access for Cyber program, intended for vetted cyber defenders with strengthened safeguards for cybersecurity use cases.

reasoning, cybersecurity

GPT-5.5

OpenAI’s flagship GPT-5.5 model, positioned as faster and more capable for complex tasks like coding, research, and data analysis across tools.

reasoning, code-generation, tool-use

GPT-5.5 Instant

An updated default ChatGPT model focused on smarter, more accurate responses with reduced hallucinations and improved personalization controls.

reasoning, text-generation

gpt-chat-latest

A ChatGPT-aligned OpenAI model alias newly added to OpenRouter with a 400k token context window, intended for general conversational and assistant-style use.

text-generation

GPT-OSS 120B

OpenAI's most powerful open-weight model, fits on H100 GPU

GPT-OSS 20B

Medium-sized open-weight model for low latency

GPT-Rosalind

A frontier reasoning model for life sciences research, positioned to accelerate drug discovery workflows including genomics analysis and protein reasoning.

reasoning, life-sciences

Granite 3 2B

Compact IBM model for edge deployment

Granite 3 8B

IBM's efficient enterprise model

Granite Code 20B

IBM's enterprise-focused code model with strong security awareness

Granite Code 3 34B

IBM's latest enterprise code model with enhanced security awareness

Granite Code 34B

Code-specialized Granite model

Granite Code 8B

Efficient IBM code model for resource-constrained deployments

Grok 3.5

xAI's advanced model with improved reasoning and real-time knowledge integration

Grok 4

Latest iteration of xAI's flagship model with breakthrough performance

Grok 4 Mini

Efficient version of Grok 4 optimized for speed and cost-effectiveness

Grok 4 Voice

Grok 4 with real-time voice conversation capabilities

Grok 4.20 (Beta)

A Grok 4.20 beta model offering a very large (2M token) context window for long-context general-purpose chat and reasoning workloads.

long-context, reasoning, chat

Grok 4.20 Multi-Agent (Beta)

A Grok 4.20 beta variant positioned for multi-agent workflows, with a 2M token context window for coordinating longer multi-step tasks.

long-context, reasoning, agentic

Grok 4.3

A new Grok-series flagship model variant listed on OpenRouter with a 1M-token context window, aimed at high-context general reasoning and assistant use.

reasoning, long-context, chat

Grok 420

xAI's most advanced model with breakthrough capabilities (early access)

Grok 420 Multi-Agent

Grok 420 variant optimized for multi-agent orchestration

Grok Vision

Multimodal Grok model with advanced image and document understanding

Grok-1

Original open-weight Grok model

Grok-1.5

Enhanced Grok with improved reasoning

Grok-2

Latest Grok model with frontier capabilities

Grok-2 mini

Efficient Grok-2 variant for faster inference

Grok-3

Next-generation Grok with enhanced reasoning

Grok-3 mini

Efficient Grok-3 with thinking capabilities

GTE-Qwen2-7B-instruct

High-performance embedding model based on Qwen2

Hermes 3 Llama 3.1 405B

Fine-tuned Llama 3.1 405B for instruction following

Hermes 3 Llama 3.1 70B

Fine-tuned Llama 3.1 70B with enhanced capabilities

Hunyuan-Large

Tencent's large MoE model

Ideogram 2

Image model with excellent text rendering

Imagen 3

Google's latest image generation model

InCoder 6B

Infilling-capable code model for completion and generation

InternLM 2 20B

Bilingual model with strong reasoning

Jamba 1.5 Large

Hybrid SSM-Transformer for long context

Jamba 1.5 Mini

Efficient hybrid model for quick tasks

Jina Embeddings v3

Multi-task embedding model with matryoshka support

Llama 2 13B

Mid-size previous generation Llama model

Llama 2 70B

Largest previous generation Llama model

Llama 2 7B

Previous generation efficient Llama model

Llama 3 70B

Large Llama 3 model for complex tasks

Llama 3 8B

Efficient Llama 3 model for everyday tasks

Llama 3.1 405B

Largest open-weight model with frontier-class capabilities

ai-model, migration, code-translation

Llama 3.1 70B

Extended context Llama 3.1 70B model

Llama 3.1 8B

Extended context Llama 3.1 8B model

Llama 3.1 Nemotron 70B

NVIDIA-optimized Llama 3.1 for enterprise

Llama 3.2 11B Vision

Multimodal Llama with vision capabilities

Llama 3.2 1B

Tiny Llama model for edge and mobile deployment

Llama 3.2 3B

Compact Llama model for efficient deployment

Llama 3.2 90B Vision

Large multimodal Llama with vision

Llama 3.2 Vision (General)

Multimodal Llama with image understanding

Llama 3.3 70B

Open-weight multilingual model matching Llama 3.1 405B performance

ai-model, migration, code-translation

Llama 3.3 70B Nemotron

NVIDIA-optimized Llama 3.3 for enterprise

Llama 3.3 Coder 70B

Meta's latest code-specialized Llama model with enhanced coding capabilities

Llama 4 Coder 405B

Meta's most capable code model based on Llama 4 architecture

Llama 4 Coder 70B

Efficient Llama 4 coding variant for production use

Llama 4 Maverick

Llama 4 variant for complex reasoning and coding

Llama 4 Scout

Llama 4 variant optimized for efficient multi-turn tasks

Lyria 3

Google’s newest music generation model, available in paid preview through the Gemini API and for testing in Google AI Studio.

music-generation, audio

Lyria 3 CLIP (Preview)

A preview Lyria 3 variant listed on OpenRouter, likely intended for clip-based audio/music generation or related multimodal embedding workflows within the Lyria stack.

audio-generation, music-generation

Lyria 3 Pro (Preview)

A preview Lyria 3 variant surfaced on OpenRouter, associated with Google’s Lyria music/audio generation stack for higher-end generation workflows.

audio-generation, music-generation

Magicoder S-DS 6.7B

Efficient code model trained with OSS-Instruct methodology

Marco-o1

Reasoning model inspired by o1 methodology

Megrez 3B

Efficient model designed for edge deployment

Ministral 3B

Smallest Ministral for ultra-efficient tasks

Ministral 8B

Edge-focused model for on-device deployment

Mistral 7B

Efficient base model with sliding window attention

Mistral Embed

Embedding model for semantic search

Mistral Large 2

Flagship model with 128k context and function calling

mistral-large-2, ai-migration, code-conversion

Mistral Large 2411

Latest Mistral Large with system prompt improvements

Mistral Large Code

Mistral's flagship model optimized for enterprise coding tasks

Mistral Medium

Balanced model for diverse tasks

Mistral Medium 3.5

A Mistral AI foundation model newly listed on OpenRouter with a 262k token context window, positioned as a balanced medium-tier model for general purpose generation and reasoning tasks.

text-generation, reasoning

Mistral Nemo

Small but capable model for efficient deployment

mistral-nemo, ai-model, migration-tasks

Mistral Saba

Expert model for Middle Eastern and South Asian languages

Mistral Small

Cost-effective model for simple tasks

Mistral Small 2603

A new Mistral Small series release listed on OpenRouter with a 262k context window, positioned as a general-purpose foundation model for long-context workloads.

long-context, reasoning, text-generation

Mistral Small 3

Latest small model with enhanced capabilities

Mixtral 8x22B

Large MoE model for complex tasks

Mixtral 8x7B

Mixture-of-experts model with efficient inference

Molmo 72B

Multimodal model for vision and language tasks

MPT-30B

Commercial-friendly open model

mxbai-embed-large

High-quality embedding model

Nano Banana 2

An image generation model in the Gemini app that uses personal context and Google Photos to create more personalized images.

image-generation, personalization

Nemotron-4 340B

Largest NVIDIA model for enterprise tasks

Nemotron-4 70B

NVIDIA's flagship model for enterprise

Nomic Embed Text

Open-source text embedding model

NVIDIA Nemotron 3 Super (120B, A12B)

An open model from NVIDIA designed for scalable agentic AI, described as a 120B-parameter model with 12B active parameters and optimized throughput.

reasoning, agentic, long-context

o1

Reasoning model designed to solve hard problems across domains using chain-of-thought

openai, migration, ai-model

o1 Pro

Pro version of o1 with extended compute for harder problems

o1-mini

Fast reasoning model optimized for coding, math, and science

ai-model, migration-tasks, code-refactoring

o1-preview

Preview version of OpenAI's reasoning model

o3

Full o3 reasoning model for frontier problem solving

o3 Deep Research

o3 optimized for multi-step deep research tasks

o3 High

High compute version of o3 for maximum reasoning depth

o3 Pro

o3 with more compute for better, more thorough responses

o3-mini

Next-generation reasoning model with improved efficiency (announced)

o3-mini, openai, migration

o4-mini

Next-generation compact reasoning model

o4-mini Deep Research

Cost-efficient deep research model

OLMo 2 13B

Fully open model with training data available

OLMo 2 7B

Efficient fully open model

OlympicCoder 32B

Competition-grade code model fine-tuned on competitive programming

OpenAI Codex

OpenAI's code model powering GitHub Copilot

OpenAI Privacy Filter

An open-weight OpenAI model for detecting and redacting personally identifiable information (PII) in text, intended as a privacy/safety component in pipelines.

pii-detection, text-redaction

OpenChat 3.5

Open chat model with RLHF training

PaLM 2

Google's previous generation foundation model

Parler TTS Large

Open-source controllable TTS

Phi-1

First Phi model focused on coding

Phi-1.5

Enhanced Phi with improved reasoning

Phi-2

Small but capable model rivaling larger ones

Phi-3-medium

Largest Phi-3 for complex reasoning

Phi-3-mini

Smallest Phi-3 model with strong capabilities

Phi-3-small

Balanced Phi-3 model for diverse tasks

Phi-4

Latest Phi model with state-of-the-art reasoning

Phi-4-mini

Compact Phi-4 for efficient deployment

Phind CodeLlama 34B

Fine-tuned Code Llama optimized for code generation and explanation

Pixtral 12B

Multimodal model with vision capabilities

Pixtral Large

Large multimodal model for complex visual tasks

PolyCoder 16B

Open-source polyglot code model trained on many programming languages

Qwen 1.5 72B

Older Qwen model for compatibility

Qwen 2 72B

Previous generation large Qwen model

Qwen 2.5 14B

Mid-size Qwen 2.5 for balanced tasks

Qwen 2.5 32B

Large Qwen 2.5 for complex tasks

Qwen 2.5 72B

Largest Qwen 2.5 model for complex tasks

Qwen 2.5 7B

Efficient Qwen 2.5 for everyday tasks

Qwen 2.5 Coder 32B

State-of-the-art open code model rivaling GPT-4o on coding tasks

qwen, coder, migration

Qwen 2.5 Coder 7B

Efficient coding model from Qwen 2.5 family

Qwen 3 235B

Latest flagship Qwen model with MoE architecture

Qwen 3 32B

Balanced Qwen 3 model for diverse tasks

Qwen 3 8B

Efficient Qwen 3 model for quick tasks

Qwen 3 Coder 235B

Alibaba's largest and most capable coding model

Qwen Coder 2.5 14B

Balanced code model with strong performance and reasonable resource requirements

Qwen Coder 2.5 32B

Alibaba's specialized coding model with strong code understanding capabilities

Qwen Coder 2.5 7B

Efficient code model for quick tasks and resource-constrained environments

Qwen Coder 3 72B

Alibaba's latest flagship coding model with exceptional performance

Qwen Max

Most capable Qwen via API

Qwen Plus

Balanced Qwen model via API

Qwen Turbo

Fast Qwen model for quick tasks

Qwen3.5 122B A10B

A large Qwen3.5 Mixture-of-Experts-style model variant newly added on OpenRouter, offering a large 262k-token context window.

reasoning, long-context

Qwen3.5 27B

A Qwen3.5 27B foundation model newly added on OpenRouter, providing a 262k-token context window for general assistant workloads.

reasoning, long-context

Qwen3.5 35B A3B

A Qwen3.5 model variant newly listed on OpenRouter with a 262k-token context window, intended as a mid-sized foundation option in the Qwen3.5 family.

reasoning, long-context

Qwen3.5 Flash 02-23

A Qwen3.5 Flash model snapshot (02-23) newly listed on OpenRouter with a 1M-token context window, positioned for fast, long-context inference.

long-context, reasoning

Qwen3.5-397B-A17B

Large-scale Qwen 3.5 model (397B with A17B MoE-style routing indicated by the name) added on OpenRouter, intended for high-end reasoning and generation with a 262K context window.

reasoning, long-context, code-generation

Qwen3.5-9B

A 9B-parameter Qwen3.5 foundation model with a large (262k token) context window, positioned for general chat and reasoning with long-context inputs.

chat, reasoning, long-context

Qwen3.5-Plus-02-15

Alibaba Qwen 3.5 'Plus' model variant as listed on OpenRouter, featuring a 1M-token context window for long-context general-purpose generation and analysis.

reasoning, long-context, code-generation

Qwen3.6 Flash

A speed-optimized Qwen3.6 foundation model for low-latency chat and agent workloads while retaining a very large context window.

long-context, instruction-following, tool-use

Qwen3.6 Max (Preview)

A preview flagship Qwen3.6 foundation model variant aimed at strong general-purpose reasoning and instruction following with a large context window.

reasoning, tool-use, code-generation

Qwen3.6 Plus Preview

Preview release of Alibaba's Qwen 3.6 Plus model as listed on OpenRouter, offering a very large context window for general-purpose text tasks.

text-generation, reasoning

Qwen3.6-Plus

A long-context Qwen model variant listed on OpenRouter, intended for general-purpose instruction following and long-document workloads.

text-generation, instruction-following, long-context

QwQ 32B

Reasoning-focused model from Qwen family

Recraft V3

Professional image generation for design

Refact 1.6B

Ultra-efficient code model for real-time code completion

Reflection 70B

Self-correcting model trained on synthetic data

Replit Code V1.5 3B

Efficient code model trained on Replit's diverse codebase

SantaCoder

Efficient code model trained on Python, Java, and JavaScript

SeamlessM4T v2

Multilingual speech and text translation

Skywork o1 Open 8B

Open reasoning model following o1 methodology

SmolLM2 1.7B

Compact model for on-device deployment

SmolLM2 360M

Tiny model for ultra-constrained environments

Snowflake Arctic

Enterprise-focused MoE model

Snowflake Arctic Embed L

Enterprise embedding model from Snowflake

SOLAR 10.7B

Depth-upscaled model with strong performance

Sourcegraph Cody

AI coding assistant with deep codebase understanding

Stable Code 3B

Lightweight code model optimized for fast inference and local deployment

Stable Diffusion 3.5

Latest text-to-image generation model

StarCoder2 15B

Code-focused model trained on The Stack v2

StarCoder2 3B

Compact code model for edge deployment

StarCoder2 7B

Efficient code model for development

StarCoder3 32B

Next-generation open-source code LLM with improved capabilities

Suno v3.5

AI music generation model

Supermaven

Ultra-fast AI code completion with 1M token context

TabNine Enterprise

Enterprise AI code completion with custom model training

text-embedding-3-large

OpenAI's latest embedding model

text-embedding-3-small

Efficient OpenAI embedding model

Tulu 3 405B

Fine-tuned Llama 3.1 405B for instruction following

Tulu 3 70B

Efficient Tulu model for balanced tasks

Udio v1.5

Music generation with high fidelity

Veo 3.1 Lite

Cost-effective video generation model available in paid preview via the Gemini API and for testing in Google AI Studio.

video-generation

Voyage 3

State-of-the-art embedding model

Voyage Code 3

Code-specialized embedding model

Whisper Large v3

Speech recognition model for transcription

Whisper Large v3 Turbo

Fast speech recognition model

Windsurf Cascade

Agentic AI for autonomous coding with deep codebase understanding

WizardCoder 34B

Code-specialized model with evol-instruct training

WizardCoder 34B

Instruction-following code model with strong complex task performance

WizardLM 2 8x22B

Large MoE wizard model for complex tasks

Yi 1.5 34B Chat

Enhanced Yi chat model with extended context

Yi 34B

Large bilingual model from Yi series

Yi 6B

Efficient Yi model for lighter tasks

Yi Coder 1.5B

Ultra-efficient code model for edge deployment and quick tasks

Yi Coder 9B

Efficient open code model with strong multilingual support

Yi Large

Flagship Yi model via API

Yi Lightning

Fast Yi model for quick responses