AI Model Registry | Vibgrate
Comprehensive catalog of AI models, providers, capabilities, and pricing.
Aider
AI pair programming tool for the terminal with git integration
Amazon CodeWhisperer
AWS-native AI coding assistant with security scanning
Amazon Nova Lite
Multimodal Nova model for image and video understanding
Amazon Nova Micro
Fastest and most cost-effective Nova model
Amazon Nova Premier
Most capable Nova model for complex reasoning
Amazon Nova Pro
Balanced Nova model for most tasks
Amazon Q Developer
Next-gen AWS coding assistant with broad AWS service integration
Amazon Titan Text Express
Fast and cost-effective model for general tasks
Amazon Titan Text Lite
Lightweight model for cost-sensitive applications
Amazon Titan Text Premier
Most capable Titan model for complex tasks
Athene V2 Chat 72B
Qwen-based model optimized for chat and reasoning
Aya Expanse 32B
Multilingual model supporting 23 languages
Aya Expanse 8B
Efficient multilingual model
Baichuan 2 13B
Chinese-focused large language model
Bark
Open-source text-to-audio model
BGE-M3
Multi-lingual, multi-functionality embedding model
BLOOM
Multilingual open model supporting 46 languages
c4ai-command-r-08-2024
Latest Command R with RAG optimizations
Claude 3 Haiku
Fastest Claude 3 model for instant responses
Claude 3 Opus
Powerful model for complex tasks requiring deep expertise
Claude 3 Sonnet
Balanced Claude 3 model for enterprise tasks
Claude 3.5 Haiku
Fast and affordable model for high-volume tasks
Claude 3.5 Opus
Enhanced Opus model with superior reasoning
Claude 3.5 Sonnet
Most intelligent Claude model, excels at coding and complex reasoning
Claude 4 Haiku
Fast and efficient Claude 4 model
Claude 4 Opus
Most capable Claude model with extended thinking
Claude 4 Sonnet
Balanced Claude 4 model with strong coding abilities
Claude 4.5 Haiku
Fastest Claude 4.5 model optimized for quick tasks and high-throughput applications
Claude 4.5 Opus
Anthropic's most capable model with breakthrough reasoning, extended thinking, and exceptional coding abilities
Claude 4.5 Sonnet
High-performance Claude model balancing intelligence and speed, excels at code generation and analysis
Claude 4.6 Haiku
Ultra-fast Claude 4.6 model for real-time applications and high-volume processing
Claude 4.6 Opus
Latest flagship Anthropic model with state-of-the-art reasoning, coding expertise, and agentic capabilities
Claude 4.6 Sonnet
Most advanced Claude Sonnet with exceptional coding and reasoning, ideal balance of capability and efficiency
Claude Computer Use
Claude model specialized for computer control and automation
Claude Haiku 4.5
Fastest Claude model with near-frontier intelligence and extended thinking
Claude Opus 4
Latest flagship Claude model with superior reasoning
Claude Opus 4.6
The most intelligent Claude model for building agents and coding with extended thinking
Claude Opus 4.6 Fast
A faster variant of Claude Opus 4.6 exposed via OpenRouter, aimed at high-throughput production workloads while retaining the Opus-class capability profile.
Claude Opus 4.7
A new Claude Opus-series frontier model version listed on OpenRouter with a 1M-token context window, intended for high-end reasoning and long-context workloads.
Claude Opus 4.7 Fast
A faster variant of Claude Opus 4.7 offered via OpenRouter, targeting lower-latency usage while retaining a large (1M-token) context window.
Claude Sonnet 4
Balanced Claude 4 model optimized for coding
Claude Sonnet 4.6
Best combination of speed and intelligence with extended thinking support
Code Llama 13B
Mid-size code-specialized Llama model
Code Llama 3 70B
Meta's latest Code Llama based on Llama 3 architecture
Code Llama 3 8B
Efficient Code Llama 3 for local development
Code Llama 34B
Large code-specialized Llama model
Code Llama 34B Instruct
Efficient instruction-tuned Code Llama for coding tasks
Code Llama 70B
Specialized code model fine-tuned from Llama 2 for programming tasks
Code Llama 70B Instruct
Instruction-tuned Code Llama for following complex coding instructions
Code Llama 7B
Code-specialized Llama model for development
Code Llama Instruct 34B
Instruction-tuned Code Llama for complex tasks
Code Llama Python 34B
Python-specialized Code Llama model
CodeGeeX 4
Open-source multilingual code generation model with strong performance
CodeGeeX 5
Latest multilingual code generation model with enhanced capabilities
CodeGemma 7B
Code-specialized open model based on Gemma for programming tasks
Codeium
Free AI code completion with broad IDE support
CodeQwen 1.5 7B
Efficient code model based on Qwen 1.5 architecture
Codestral
Specialized code model trained on 80+ programming languages
Codestral 25.02
Latest Mistral coding model with enhanced performance
Codestral 2501
Latest Mistral coding model with improved performance and longer context
Codestral Mamba
Mamba-architecture code model with linear-time inference, designed for very long contexts
Codey
Google's code-specialized model for enterprise development
Cohere Embed v3
Enterprise-grade embedding model
Command
General-purpose instruction-following model
Command A
Latest flagship model optimized for enterprise tasks
Command Code
Cohere's enterprise code model for development tasks
Command Light
Lightweight model for simple tasks
Command R
RAG-optimized model for enterprise search
Command R+
Most capable Cohere model for complex tasks
Continue
Open-source AI code assistant supporting multiple models
Cursor AI
AI-native code editor with advanced code understanding
DALL-E 3
OpenAI's latest image generation model
DBRX
MoE model optimized for enterprise
DeepSeek Chat
Optimized chat model for conversations
DeepSeek Coder 33B Instruct
Instruction-tuned DeepSeek coding model for following coding instructions
DeepSeek Coder V2
Code-specialized MoE model supporting 300+ programming languages
DeepSeek Coder V3
Latest DeepSeek coding model with state-of-the-art code understanding
DeepSeek R1
Reasoning model with chain-of-thought capabilities
DeepSeek R1 Coder
DeepSeek's reasoning model specialized for complex coding tasks
DeepSeek R1 Distill Llama 70B
Distilled R1 model based on Llama 70B
DeepSeek R1 Distill Llama 8B
Efficient Llama-based reasoning model
DeepSeek R1 Distill Qwen 1.5B
Ultra-compact reasoning model
DeepSeek R1 Distill Qwen 32B
Distilled R1 model based on Qwen for efficient reasoning
DeepSeek R1 Distill Qwen 7B
Compact distilled reasoning model
DeepSeek Reasoner
API-accessible reasoning model based on R1
DeepSeek V2
Efficient MoE model with strong general capabilities
DeepSeek V3
MoE model with 671B parameters achieving frontier performance
DeepSeek V4 Flash
DeepSeek’s V4 Flash foundation model listing with a 1M-token context window, optimized for lower-latency long-context tasks.
DeepSeek V4 Pro
DeepSeek’s V4 Pro foundation model listing with a 1M-token context window, intended for long-context reasoning and agentic workloads.
Devstral
Mistral's agentic coding model for complex development tasks
E5-Mistral-7B-Instruct
Instruction-following embedding model
ElevenLabs Turbo v2.5
Fast text-to-speech model
EXAONE 3.5 32B
Korean-English bilingual model from LG
EXAONE 3.5 7.8B
Efficient Korean-English model
Falcon 180B
Largest open Falcon model
Falcon 3 10B
Latest Falcon 3 model for efficient deployment
Falcon 40B
Mid-size Falcon model
Falcon 7B
Efficient Falcon model
FLUX 1.1 Pro
High-quality image generation model
FLUX.1 [dev]
Open-weight image model for development
Gemini 1.0 Pro
Original Gemini Pro model for general tasks
Gemini 1.5 Flash
Fast and versatile model for diverse tasks at scale
Gemini 1.5 Pro
Production-ready model with massive context window for complex tasks
Gemini 2.0 Flash
Next-generation multimodal model with native tool use and agentic capabilities
Gemini 2.0 Flash Thinking
Flash model with explicit reasoning for complex tasks
Gemini 2.0 Pro
Advanced Gemini 2.0 model for complex reasoning tasks
Gemini 2.5 Flash
Fast and efficient Gemini 2.5 model with thinking
Gemini 2.5 Flash-Lite
Fastest and most budget-friendly multimodal model in the Gemini 2.5 family
Gemini 2.5 Pro
Latest Gemini model with enhanced thinking capabilities
Gemini 2.5 Pro Thinking
Google's most advanced reasoning model with extended chain-of-thought capabilities
Gemini 2.5 Ultra
Google's most powerful model for demanding enterprise tasks and complex reasoning
Gemini 3 Flash
Frontier-class performance rivaling larger models at a fraction of the cost
Gemini 3 Pro
Google's state-of-the-art reasoning model with advanced multimodal understanding
Gemini 3.0 Flash
Next-generation fast model with improved efficiency and multimodal capabilities
Gemini 3.0 Pro
Latest Gemini Pro with enhanced reasoning and coding capabilities across all modalities
Gemini 3.1 Flash Image (Preview)
Google's Flash-speed image generation and editing model referenced as "Nano Banana 2" and listed on OpenRouter as a Gemini 3.1 Flash Image preview.
Gemini 3.1 Flash Live
A low-latency, live audio-capable Gemini Flash model designed for more natural, reliable real-time voice interactions across Google products.
Gemini 3.1 Flash TTS
A text-to-speech model focused on next-generation expressive speech, now available across Google products.
Gemini 3.1 Flash-Lite
Google’s fastest and most cost-efficient Gemini 3 series model, built for intelligence at scale.
Gemini 3.1 Pro
Advanced intelligence with complex problem-solving, agentic and vibe coding capabilities
Gemini 3.1 Pro Preview
Preview release of Google's Gemini 3.1 Pro model with a very large context window, aimed at advanced general-purpose reasoning and long-context workloads.
Gemini 3.1 Pro Preview (Custom Tools)
A Gemini 3.1 Pro preview variant listed on OpenRouter that is explicitly labeled for custom tools, suggesting enhanced tool-use integration with a very large context window.
Gemini 3.5
Google’s Gemini 3.5 is a new frontier model series focused on combining strong general intelligence with agentic action/tool use, announced at Google I/O 2026.
Gemini 3.5 Flash
A fast, efficient Gemini 3.5-series model variant listed on OpenRouter, intended for low-latency agentic and general assistant workloads with a very large context window.
Gemini Code
Specialized coding model optimized for software development and code understanding
Gemini Computer Use
Specialized model for UI automation: clicking, typing, and navigating in the browser
Gemini Deep Research
Agentic model for autonomous multi-step research across hundreds of sources
Gemini Ultra
Most capable Gemini model for complex tasks
Gemma 2 27B
Open-weight model for research and development
Gemma 2 9B
Efficient open-weight model for various tasks
Gemma 4 26B A4B IT
An instruction-tuned Gemma 4 model listed on OpenRouter, positioned as a large open model for general-purpose chat and instruction following with a long context window.
Gemma 4 31B IT
An instruction-tuned Gemma 4 family model offered via OpenRouter with a very large context window, aimed at general-purpose assistant and agentic workflows.
Gemma 7B
Original Gemma model for lightweight tasks
GitHub Copilot
AI pair programmer powered by OpenAI with deep GitHub integration
GitHub Copilot Chat
Conversational AI for coding powered by GPT-5
GitHub Copilot Workspace
Agentic AI for complex multi-file development tasks
GLM-4 9B
Efficient bilingual model from GLM family
GPT-3.5 Turbo
Fast and cost-effective model for everyday tasks
GPT-4
Original GPT-4 model with strong reasoning and coding capabilities
GPT-4 Turbo
Enhanced GPT-4 with 128K context and improved performance
GPT-4.1
Optimized GPT-4 variant with improved coding and instruction following
GPT-4.1 mini
Cost-effective version of GPT-4.1 for everyday tasks
GPT-4.1 nano
Smallest and fastest GPT-4.1 variant for quick tasks
GPT-4.5 Preview
Next-generation GPT model with enhanced reasoning and multimodal capabilities
GPT-4o
Multimodal flagship model with vision and audio capabilities, optimized for speed and cost
GPT-4o Mini
Affordable small model for fast, lightweight tasks
GPT-5
Next-generation GPT model (announced for 2025)
GPT-5 Codex
GPT-5 optimized for agentic coding in Codex
GPT-5 Mini
Faster, cost-efficient version of GPT-5 for well-defined tasks
GPT-5 Nano
Fastest, most cost-efficient version of GPT-5
GPT-5 Pro
GPT-5 variant producing smarter and more precise responses
GPT-5.1
Intelligent reasoning model for coding and agentic tasks with configurable reasoning effort
GPT-5.1 Codex
GPT-5.1 optimized for agentic coding in Codex environment
GPT-5.1 Codex Max
GPT-5.1 Codex optimized for long-running coding tasks
GPT-5.1 Codex Mini
Cost-effective smaller version of GPT-5.1 Codex
GPT-5.2
OpenAI's best model for coding and agentic tasks across industries
GPT-5.2 Codex
Most intelligent coding model optimized for long-horizon agentic coding tasks
GPT-5.2 Pro
Most capable GPT-5.2 variant producing smarter and more precise responses
GPT-5.3 Codex
A new Codex-branded GPT-5.3 model intended for code-centric use cases, listed as newly added on OpenRouter with a large context window.
GPT-5.3 Instant
Conversation-focused GPT-5.3 variant announced by OpenAI for smoother, more useful everyday chat interactions.
GPT-5.4
OpenAI frontier foundation model positioned as more capable and efficient for professional work, with state-of-the-art coding, computer use, and tool search, plus a 1M-token context window.
GPT-5.4 Image 2
An OpenAI multimodal model oriented around image understanding/generation workflows, listed on OpenRouter as a new GPT-5.4 image-capable offering with a large context window.
GPT-5.4 Pro
Higher-tier GPT-5.4 offering listed by OpenRouter, providing a 1M-token context window for advanced professional and agentic workloads.
GPT-5.4-Cyber
A GPT-5.4-derived model introduced under OpenAI’s Trusted Access for Cyber program, intended for vetted cyber defenders with strengthened safeguards for cybersecurity use cases.
GPT-5.5
OpenAI’s flagship GPT-5.5 model, positioned as faster and more capable for complex tasks like coding, research, and data analysis across tools.
GPT-5.5 Instant
An updated default ChatGPT model focused on smarter, more accurate responses with reduced hallucinations and improved personalization controls.
gpt-chat-latest
A ChatGPT-aligned OpenAI model alias newly added to OpenRouter with a 400k token context window, intended for general conversational and assistant-style use.
GPT-OSS 120B
OpenAI's most powerful open-weight model, fits on a single H100 GPU
GPT-OSS 20B
Medium-sized open-weight model for low latency
GPT-Rosalind
A frontier reasoning model for life sciences research, positioned to accelerate drug discovery workflows including genomics analysis and protein reasoning.
Granite 3 2B
Compact IBM model for edge deployment
Granite 3 8B
IBM's efficient enterprise model
Granite Code 20B
IBM's enterprise-focused code model with strong security awareness
Granite Code 3 34B
IBM's latest enterprise code model with enhanced security awareness
Granite Code 34B
Code-specialized Granite model
Granite Code 8B
Efficient IBM code model for resource-constrained deployments
Grok 3.5
xAI's advanced model with improved reasoning and real-time knowledge integration
Grok 4
Latest iteration of xAI's flagship model with breakthrough performance
Grok 4 Mini
Efficient version of Grok 4 optimized for speed and cost-effectiveness
Grok 4 Voice
Grok 4 with real-time voice conversation capabilities
Grok 4.20 (Beta)
A Grok 4.20 beta model offering a very large (2M token) context window for long-context general-purpose chat and reasoning workloads.
Grok 4.20 Multi-Agent (Beta)
A Grok 4.20 beta variant positioned for multi-agent workflows, with a 2M token context window for coordinating longer multi-step tasks.
Grok 4.3
A new Grok-series flagship model variant listed on OpenRouter with a 1M-token context window, aimed at high-context general reasoning and assistant use.
Grok 420
xAI's most advanced model with breakthrough capabilities (early access)
Grok 420 Multi-Agent
Grok 420 variant optimized for multi-agent orchestration
Grok Vision
Multimodal Grok model with advanced image and document understanding
Grok-1
Original open-weight Grok model
Grok-1.5
Enhanced Grok with improved reasoning
Grok-2
Latest Grok model with frontier capabilities
Grok-2 mini
Efficient Grok-2 variant for faster inference
Grok-3
Next-generation Grok with enhanced reasoning
Grok-3 mini
Efficient Grok-3 with thinking capabilities
GTE-Qwen2-7B-instruct
High-performance embedding model based on Qwen2
Hermes 3 Llama 3.1 405B
Fine-tuned Llama 3.1 405B for instruction following
Hermes 3 Llama 3.1 70B
Fine-tuned Llama 3.1 70B with enhanced capabilities
Hunyuan-Large
Tencent's large MoE model
Ideogram 2
Image model with excellent text rendering
Imagen 3
Google's latest image generation model
InCoder 6B
Infilling-capable code model for completion and generation
InternLM 2 20B
Bilingual model with strong reasoning
Jamba 1.5 Large
Hybrid SSM-Transformer for long context
Jamba 1.5 Mini
Efficient hybrid model for quick tasks
Jina Embeddings v3
Multi-task embedding model with matryoshka support
Llama 2 13B
Mid-size previous generation Llama model
Llama 2 70B
Largest previous generation Llama model
Llama 2 7B
Previous generation efficient Llama model
Llama 3 70B
Large Llama 3 model for complex tasks
Llama 3 8B
Efficient Llama 3 model for everyday tasks
Llama 3.1 405B
Largest open-weight Llama 3.1 model with frontier-class capabilities
Llama 3.1 70B
Extended context Llama 3.1 70B model
Llama 3.1 8B
Extended context Llama 3.1 8B model
Llama 3.1 Nemotron 70B
NVIDIA-optimized Llama 3.1 for enterprise
Llama 3.2 11B Vision
Multimodal Llama with vision capabilities
Llama 3.2 1B
Tiny Llama model for edge and mobile deployment
Llama 3.2 3B
Compact Llama model for efficient deployment
Llama 3.2 90B Vision
Large multimodal Llama with vision
Llama 3.2 Vision (General)
Multimodal Llama with image understanding
Llama 3.3 70B
Open-weight multilingual model matching Llama 3.1 405B performance
Llama 3.3 70B Nemotron
NVIDIA-optimized Llama 3.3 for enterprise
Llama 3.3 Coder 70B
Meta's latest code-specialized Llama model with enhanced coding capabilities
Llama 4 Coder 405B
Meta's most capable code model based on Llama 4 architecture
Llama 4 Coder 70B
Efficient Llama 4 coding variant for production use
Llama 4 Maverick
Llama 4 variant for complex reasoning and coding
Llama 4 Scout
Llama 4 variant optimized for efficient multi-turn tasks
Lyria 3
Google’s newest music generation model, available in paid preview through the Gemini API and for testing in Google AI Studio.
Lyria 3 CLIP (Preview)
A preview Lyria 3 variant listed on OpenRouter, likely intended for clip-based audio/music generation or related multimodal embedding workflows within the Lyria stack.
Lyria 3 Pro (Preview)
A preview Lyria 3 variant surfaced on OpenRouter, associated with Google’s Lyria music/audio generation stack for higher-end generation workflows.
Magicoder S-DS 6.7B
Efficient code model trained with OSS-Instruct methodology
Marco-o1
Reasoning model inspired by o1 methodology
Megrez 3B
Efficient model designed for edge deployment
Ministral 3B
Smallest Ministral for ultra-efficient tasks
Ministral 8B
Edge-focused model for on-device deployment
Mistral 7B
Efficient base model with sliding window attention
Mistral Embed
Embedding model for semantic search
Mistral Large 2
Flagship model with 128k context and function calling
Mistral Large 2411
Latest Mistral Large with system prompt improvements
Mistral Large Code
Mistral's flagship model optimized for enterprise coding tasks
Mistral Medium
Balanced model for diverse tasks
Mistral Medium 3.5
A Mistral AI foundation model newly listed on OpenRouter with a 262k token context window, positioned as a balanced medium-tier model for general purpose generation and reasoning tasks.
Mistral Nemo
Small but capable model for efficient deployment
Mistral Saba
Expert model for Middle Eastern and South Asian languages
Mistral Small
Cost-effective model for simple tasks
Mistral Small 2603
A new Mistral Small series release listed on OpenRouter with a 262k context window, positioned as a general-purpose foundation model for long-context workloads.
Mistral Small 3
Latest small model with enhanced capabilities
Mixtral 8x22B
Large MoE model for complex tasks
Mixtral 8x7B
Mixture-of-experts model with efficient inference
Molmo 72B
Multimodal model for vision and language tasks
MPT-30B
Commercial-friendly open model
mxbai-embed-large
High-quality embedding model
Nano Banana 2
An image generation model in the Gemini app that uses personal context and Google Photos to create more personalized images.
Nemotron-4 340B
Largest NVIDIA model for enterprise tasks
Nemotron-4 70B
NVIDIA's flagship model for enterprise
Nomic Embed Text
Open-source text embedding model
NVIDIA Nemotron 3 Super (120B, A12B)
An open model from NVIDIA designed for scalable agentic AI, described as a 120B-parameter model with 12B active parameters and optimized throughput.
o1
Reasoning model designed to solve hard problems across domains using chain-of-thought
o1 Pro
Pro version of o1 with extended compute for harder problems
o1-mini
Fast reasoning model optimized for coding, math, and science
o1-preview
Preview version of OpenAI's reasoning model
o3
Full o3 reasoning model for frontier problem solving
o3 Deep Research
o3 optimized for multi-step deep research tasks
o3 High
High compute version of o3 for maximum reasoning depth
o3 Pro
o3 with more compute for better, more thorough responses
o3-mini
Next-generation reasoning model with improved efficiency (announced)
o4-mini
Next-generation compact reasoning model
o4-mini Deep Research
Cost-efficient deep research model
OLMo 2 13B
Fully open model with training data available
OLMo 2 7B
Efficient fully open model
OlympicCoder 32B
Competition-grade code model fine-tuned on competitive programming
OpenAI Codex
OpenAI's code model powering GitHub Copilot
OpenAI Privacy Filter
An open-weight OpenAI model for detecting and redacting personally identifiable information (PII) in text, intended as a privacy/safety component in pipelines.
OpenChat 3.5
Open chat model trained with C-RLFT (conditioned reinforcement learning fine-tuning)
PaLM 2
Google's previous generation foundation model
Parler TTS Large
Open-source controllable TTS
Phi-1
First Phi model focused on coding
Phi-1.5
Enhanced Phi with improved reasoning
Phi-2
Small but capable model rivaling larger ones
Phi-3-medium
Largest Phi-3 for complex reasoning
Phi-3-mini
Smallest Phi-3 model with strong capabilities
Phi-3-small
Balanced Phi-3 model for diverse tasks
Phi-4
Latest Phi model with state-of-the-art reasoning
Phi-4-mini
Compact Phi-4 for efficient deployment
Phind CodeLlama 34B
Fine-tuned Code Llama optimized for code generation and explanation
Pixtral 12B
Multimodal model with vision capabilities
Pixtral Large
Large multimodal model for complex visual tasks
PolyCoder 16B
Open-source polyglot code model trained on many programming languages
Qwen 1.5 72B
Older Qwen model for compatibility
Qwen 2 72B
Previous generation large Qwen model
Qwen 2.5 14B
Mid-size Qwen 2.5 for balanced tasks
Qwen 2.5 32B
Large Qwen 2.5 for complex tasks
Qwen 2.5 72B
Largest Qwen 2.5 model for complex tasks
Qwen 2.5 7B
Efficient Qwen 2.5 for everyday tasks
Qwen 2.5 Coder 32B
State-of-the-art open code model rivaling GPT-4o on coding tasks
Qwen 2.5 Coder 7B
Efficient coding model from Qwen 2.5 family
Qwen 3 235B
Latest flagship Qwen model with MoE architecture
Qwen 3 32B
Balanced Qwen 3 model for diverse tasks
Qwen 3 8B
Efficient Qwen 3 model for quick tasks
Qwen 3 Coder 235B
Alibaba's largest and most capable coding model
Qwen Coder 2.5 14B
Balanced code model with strong performance and reasonable resource requirements
Qwen Coder 2.5 32B
Alibaba's specialized coding model with strong code understanding capabilities
Qwen Coder 2.5 7B
Efficient code model for quick tasks and resource-constrained environments
Qwen Coder 3 72B
Alibaba's latest flagship coding model with exceptional performance
Qwen Max
Most capable Qwen via API
Qwen Plus
Balanced Qwen model via API
Qwen Turbo
Fast Qwen model for quick tasks
Qwen3.5 122B A10B
A large Qwen3.5 Mixture-of-Experts-style variant newly added on OpenRouter, offering a 262k-token context window.
Qwen3.5 27B
A Qwen3.5 27B foundation model newly added on OpenRouter, providing a 262k-token context window for general assistant workloads.
Qwen3.5 35B A3B
A Qwen3.5 model variant newly listed on OpenRouter with a 262k-token context window, intended as a mid-sized foundation option in the Qwen3.5 family.
Qwen3.5 Flash 02-23
A Qwen3.5 Flash model snapshot (02-23) newly listed on OpenRouter with a 1M-token context window, positioned for fast, long-context inference.
Qwen3.5-397B-A17B
Large-scale Qwen3.5 model added on OpenRouter (the name indicates 397B parameters, with the A17B suffix suggesting MoE-style routing), intended for high-end reasoning and generation with a 262k-token context window.
Qwen3.5-9B
A 9B-parameter Qwen3.5 foundation model with a large (262k token) context window, positioned for general chat and reasoning with long-context inputs.
Qwen3.5-Plus-02-15
Alibaba Qwen 3.5 'Plus' model variant as listed on OpenRouter, featuring a 1M-token context window for long-context general-purpose generation and analysis.
Qwen3.6 Flash
A speed-optimized Qwen3.6 foundation model for low-latency chat and agent workloads while retaining a very large context window.
Qwen3.6 Max (Preview)
A preview flagship Qwen3.6 foundation model variant aimed at strong general-purpose reasoning and instruction following with a large context window.
Qwen3.6 Plus Preview
Preview release of Alibaba's Qwen 3.6 Plus model as listed on OpenRouter, offering a very large context window for general-purpose text tasks.
Qwen3.6-Plus
A long-context Qwen model variant listed on OpenRouter, intended for general-purpose instruction following and long-document workloads.
QwQ 32B
Reasoning-focused model from Qwen family
Recraft V3
Professional image generation for design
Refact 1.6B
Ultra-efficient code model for real-time code completion
Reflection 70B
Self-correcting model trained on synthetic data
Replit Code V1.5 3B
Efficient code model trained on Replit's diverse codebase
SantaCoder
Efficient code model trained on Python, Java, and JavaScript
SeamlessM4T v2
Multilingual speech and text translation
Skywork o1 Open 8B
Open reasoning model following o1 methodology
SmolLM2 1.7B
Compact model for on-device deployment
SmolLM2 360M
Tiny model for ultra-constrained environments
Snowflake Arctic
Enterprise-focused MoE model
Snowflake Arctic Embed L
Enterprise embedding model from Snowflake
SOLAR 10.7B
Depth-upscaled model with strong performance
Sourcegraph Cody
AI coding assistant with deep codebase understanding
Stable Code 3B
Lightweight code model optimized for fast inference and local deployment
Stable Diffusion 3.5
Latest text-to-image generation model
StarCoder2 15B
Code-focused model trained on The Stack v2
StarCoder2 3B
Compact code model for edge deployment
StarCoder2 7B
Efficient code model for development
StarCoder3 32B
Next-generation open-source code LLM with improved capabilities
Suno v3.5
AI music generation model
Supermaven
Ultra-fast AI code completion with 1M token context
TabNine Enterprise
Enterprise AI code completion with custom model training
text-embedding-3-large
OpenAI's latest embedding model
text-embedding-3-small
Efficient OpenAI embedding model
Tulu 3 405B
Fine-tuned Llama 3.1 405B for instruction following
Tulu 3 70B
Efficient Tulu model for balanced tasks
Udio v1.5
Music generation with high fidelity
Veo 3.1 Lite
Cost-effective video generation model available in paid preview via the Gemini API and for testing in Google AI Studio.
Voyage 3
State-of-the-art embedding model
Voyage Code 3
Code-specialized embedding model
Whisper Large v3
Speech recognition model for transcription
Whisper Large v3 Turbo
Fast speech recognition model
Windsurf Cascade
Agentic AI for autonomous coding with deep codebase understanding
WizardCoder 34B
Instruction-following code model with Evol-Instruct training and strong performance on complex tasks
WizardLM 2 8x22B
Large MoE model from the WizardLM family for complex tasks
Yi 1.5 34B Chat
Enhanced Yi chat model with extended context
Yi 34B
Large bilingual model from Yi series
Yi 6B
Efficient Yi model for lighter tasks
Yi Coder 1.5B
Ultra-efficient code model for edge deployment and quick tasks
Yi Coder 9B
Efficient open code model with strong multilingual support
Yi Large
Flagship Yi model via API
Yi Lightning
Fast Yi model for quick responses