---
title: Models
image: https://developers.cloudflare.com/dev-products-preview.png
---

> Documentation Index  
> Fetch the complete documentation index at: https://developers.cloudflare.com/ai/llms.txt  
> Use this file to discover all available pages before exploring further.


# Models

Can't find what you're looking for? 

View all models available through AI Gateway, including third-party providers like Anthropic, OpenAI, and more. [Browse supported models for the REST API](https://developers.cloudflare.com/ai-gateway/supported-models/).


We found 136 models

[📌![Moonshot AI logo](https://developers.cloudflare.com/_astro/moonshotai.D9EBG7kx.svg)kimi-k2.6Text Generation • Moonshot AI • HostedKimi K2.6 is a frontier-scale open-source 1T parameter model with a 262.1k context window, multi-turn tool calling, vision inputs, and structured outputs for agentic workloads.Function callingReasoningVision](https://developers.cloudflare.com/ai/models/@cf/moonshotai/kimi-k2.6/)[📌![Zhipu AI logo](https://developers.cloudflare.com/_astro/zai.Dj2vcayE.svg)glm-4.7-flashText Generation • Zhipu AI • HostedGLM-4.7-Flash is a fast and efficient multilingual text generation model with a 131,072 token context window. Optimized for dialogue, instruction-following, and multi-turn tool calling across 100+ languages.Function callingReasoning](https://developers.cloudflare.com/ai/models/@cf/zai-org/glm-4.7-flash/)[📌![OpenAI logo](https://developers.cloudflare.com/_astro/openai.BI8PEEzI.svg)gpt-oss-120bText Generation • OpenAI • HostedOpenAI's open-weight models designed for powerful reasoning, agentic tasks, and versatile developer use cases – gpt-oss-120b is for production, general purpose, high reasoning use-cases.Function callingReasoning](https://developers.cloudflare.com/ai/models/@cf/openai/gpt-oss-120b/)[📌![Meta logo](https://developers.cloudflare.com/_astro/meta.BR4nfp35.svg)llama-4-scout-17b-16e-instructText Generation • Meta • HostedMeta's Llama 4 Scout is a 17 billion parameter model with 16 experts that is natively multimodal. These models leverage a mixture-of-experts architecture to offer industry-leading performance in text and image understanding.BatchFunction callingVision](https://developers.cloudflare.com/ai/models/@cf/meta/llama-4-scout-17b-16e-instruct/)[![Inworld logo](https://developers.cloudflare.com/_astro/inworld.BDwMAXI2.svg)tts-2Text-to-Speech • Inworld • ProxiedInworld's most powerful and expressive text-to-speech model. Builds on TTS 1.5 with rich expressive speech, real-time latency, natural language steering (e.g. 
\[whisper\], \[say excitedly\]), and stronger multilingual support across 15 production languages plus 90+ experimental languages.](https://developers.cloudflare.com/ai/models/inworld/tts-2/)[![Alibaba logo](https://developers.cloudflare.com/_astro/alibaba.C3THgr9s.svg)hh1-t2vText-to-Video • Alibaba • ProxiedAlibaba's HappyHorse 1.0 text-to-video model. Generates videos from a text prompt with configurable resolution, aspect ratio, and duration (3-15s).](https://developers.cloudflare.com/ai/models/alibaba/hh1-t2v/)[![Alibaba logo](https://developers.cloudflare.com/_astro/alibaba.C3THgr9s.svg)hh1-i2vImage-to-Video • Alibaba • ProxiedAlibaba's HappyHorse 1.0 image-to-video model. Animates a reference image with an optional text prompt. Supports 720P and 1080P output with durations from 3 to 15 seconds.](https://developers.cloudflare.com/ai/models/alibaba/hh1-i2v/)[![OpenAI logo](https://developers.cloudflare.com/_astro/openai.BI8PEEzI.svg)gpt-5.4-proText Generation • OpenAI • ProxiedGPT-5.4 Pro uses OpenAI's Responses API with built-in tools, improved reasoning, and stateful context management.](https://developers.cloudflare.com/ai/models/openai/gpt-5.4-pro/)[![OpenAI logo](https://developers.cloudflare.com/_astro/openai.BI8PEEzI.svg)gpt-5.5Text Generation • OpenAI • ProxiedGPT-5.5 is OpenAI's flagship model with strong coding, reasoning, and multimodal capabilities.](https://developers.cloudflare.com/ai/models/openai/gpt-5.5/)[![OpenAI logo](https://developers.cloudflare.com/_astro/openai.BI8PEEzI.svg)gpt-image-2Text-to-Image • OpenAI • ProxiedOpenAI's next-generation image model that creates and edits images from text prompts, with support for multiple quality levels, sizes, and output formats. 
Note: transparent backgrounds are not supported — use openai/gpt-image-1.5 for transparent PNGs.](https://developers.cloudflare.com/ai/models/openai/gpt-image-2/)[![Anthropic logo](https://developers.cloudflare.com/_astro/anthropic.DbRqBIjP.svg)claude-opus-4.7Text Generation • Anthropic • ProxiedClaude Opus 4.7 is Anthropic's most capable generally available model, with a step-change improvement in agentic coding over Claude Opus 4.6\. It uses adaptive thinking to calibrate reasoning per task and supports a one million token context window at standard pricing.](https://developers.cloudflare.com/ai/models/anthropic/claude-opus-4.7/)[![Alibaba logo](https://developers.cloudflare.com/_astro/alibaba.C3THgr9s.svg)qwen3.5-397b-a17bText Generation • Alibaba • ProxiedAlibaba's Qwen 3.5 is a 397B-parameter mixture-of-experts model with 17B active parameters, offering strong reasoning capabilities with efficient inference.](https://developers.cloudflare.com/ai/models/alibaba/qwen3.5-397b-a17b/)[![Alibaba logo](https://developers.cloudflare.com/_astro/alibaba.C3THgr9s.svg)qwen3-maxText Generation • Alibaba • ProxiedAlibaba's Qwen 3 Max is a large language model with strong coding, reasoning, and multilingual capabilities, served via DashScope's OpenAI-compatible endpoint.](https://developers.cloudflare.com/ai/models/alibaba/qwen3-max/)[![PixVerse logo](https://developers.cloudflare.com/_astro/pixverse.DSyGEAYR.svg)v6Text-to-Video • PixVerse • ProxiedPixverse v6 is the latest Pixverse video model with support for up to 15-second videos, customizable duration from 1 to 15 seconds, and audio generation.](https://developers.cloudflare.com/ai/models/pixverse/v6/)[![PixVerse logo](https://developers.cloudflare.com/_astro/pixverse.DSyGEAYR.svg)v5.6Text-to-Video • PixVerse • ProxiedPixverse v5.6 is a video generation model supporting text-to-video and image-to-video with audio generation, customizable aspect ratios, and up to 1080p 
output.](https://developers.cloudflare.com/ai/models/pixverse/v5.6/)[![Vidu logo](https://developers.cloudflare.com/_astro/vidu._WEx0U8r.svg)q3-turboText-to-Video • Vidu • ProxiedVidu Q3 Turbo is a faster version of Vidu Q3 optimized for lower latency video generation while maintaining audio support and up to 16-second clips.](https://developers.cloudflare.com/ai/models/vidu/q3-turbo/)[![Vidu logo](https://developers.cloudflare.com/_astro/vidu._WEx0U8r.svg)q3-proText-to-Video • Vidu • ProxiedVidu Q3 Pro is a high-quality video generation model supporting text-to-video, image-to-video, and start/end-frame-to-video workflows with audio and up to 16-second clips.](https://developers.cloudflare.com/ai/models/vidu/q3-pro/)[![Alibaba logo](https://developers.cloudflare.com/_astro/alibaba.C3THgr9s.svg)wan-2.6-imageText-to-Image • Alibaba • ProxiedAlibaba's Wan 2.6 text-to-image model generating images from text prompts with optional negative prompts and customizable dimensions.](https://developers.cloudflare.com/ai/models/alibaba/wan-2.6-image/)[![RunwayML logo](https://developers.cloudflare.com/_astro/runway.Cq8Cjov4.svg)gen-4.5Text-to-Video • RunwayML • ProxiedRunwayML's video generation model supporting both text-to-video and image-to-video with customizable duration, aspect ratio, and content moderation controls.](https://developers.cloudflare.com/ai/models/runwayml/gen-4.5/)[![MiniMax logo](https://developers.cloudflare.com/_astro/minimax.DPZX-zZI.svg)music-2.6Music Generation • MiniMax • ProxiedMiniMax's music generation model that creates full-length songs with vocals from text prompts and lyrics, or instrumental tracks. 
Supports BPM/key control and auto-generated lyrics.](https://developers.cloudflare.com/ai/models/minimax/music-2.6/)[![OpenAI logo](https://developers.cloudflare.com/_astro/openai.BI8PEEzI.svg)gpt-image-1.5Text-to-Image • OpenAI • ProxiedOpenAI's image generation model that creates and edits images from text prompts, supporting multiple quality levels and output sizes.](https://developers.cloudflare.com/ai/models/openai/gpt-image-1.5/)[![Google logo](https://developers.cloudflare.com/_astro/google.DyXKPTPP.svg)imagen-4Text-to-Image • Google • ProxiedGoogle's latest image generation model producing high-quality, photorealistic images from text prompts with support for multiple aspect ratios.](https://developers.cloudflare.com/ai/models/google/imagen-4/)[![AssemblyAI logo](https://developers.cloudflare.com/_astro/assemblyai.DKrad3Z3.svg)universal-3-proAutomatic Speech Recognition • AssemblyAI • ProxiedAssemblyAI's Universal 3 Pro speech recognition model for high-accuracy transcription.](https://developers.cloudflare.com/ai/models/assemblyai/universal-3-pro/)[![Inworld logo](https://developers.cloudflare.com/_astro/inworld.BDwMAXI2.svg)tts-1.5-miniText-to-Speech • Inworld • ProxiedUltra-fast, cost-efficient text-to-speech with approximately 120ms latency and 15-language support.](https://developers.cloudflare.com/ai/models/inworld/tts-1.5-mini/)[![Inworld logo](https://developers.cloudflare.com/_astro/inworld.BDwMAXI2.svg)tts-1.5-maxText-to-Speech • Inworld • ProxiedHighest-quality text-to-speech with under 200ms latency, emotion control, and 15-language support.](https://developers.cloudflare.com/ai/models/inworld/tts-1.5-max/)[![MiniMax logo](https://developers.cloudflare.com/_astro/minimax.DPZX-zZI.svg)speech-2.8-turboText-to-Speech • MiniMax • ProxiedMiniMax Speech 2.8 Turbo turns text into natural, expressive speech with voice cloning, emotion control, and 40+ language support at faster 
speeds.](https://developers.cloudflare.com/ai/models/minimax/speech-2.8-turbo/)[![MiniMax logo](https://developers.cloudflare.com/_astro/minimax.DPZX-zZI.svg)m2.7Text Generation • MiniMax • ProxiedMiniMax's M2.7 language model with multilingual capabilities.](https://developers.cloudflare.com/ai/models/minimax/m2.7/)[![MiniMax logo](https://developers.cloudflare.com/_astro/minimax.DPZX-zZI.svg)speech-2.8-hdText-to-Speech • MiniMax • ProxiedMiniMax Speech 2.8 HD focuses on studio-grade audio generation with emotion control, multilingual support (40+ languages), and voice cloning.](https://developers.cloudflare.com/ai/models/minimax/speech-2.8-hd/)[![MiniMax logo](https://developers.cloudflare.com/_astro/minimax.DPZX-zZI.svg)hailuo-2.3-fastText-to-Video • MiniMax • ProxiedA lower-latency version of Hailuo 2.3 that preserves core motion quality, visual consistency, and stylization while enabling faster iteration.](https://developers.cloudflare.com/ai/models/minimax/hailuo-2.3-fast/)[![MiniMax logo](https://developers.cloudflare.com/_astro/minimax.DPZX-zZI.svg)hailuo-2.3Text-to-Video • MiniMax • ProxiedA high-fidelity video generation model optimized for realistic human motion, cinematic VFX, expressive characters, and strong prompt and style adherence across text-to-video and image-to-video workflows.](https://developers.cloudflare.com/ai/models/minimax/hailuo-2.3/)[![Recraft logo](https://developers.cloudflare.com/_astro/recraft.BhhnJczi.svg)recraftv4-pro-vectorText-to-Image • Recraft • ProxiedGenerate detailed, production-ready SVG vector graphics from text prompts with fine geometry, scalable to any size for print and design work.](https://developers.cloudflare.com/ai/models/recraft/recraftv4-pro-vector/)[![Recraft logo](https://developers.cloudflare.com/_astro/recraft.BhhnJczi.svg)recraftv4-vectorText-to-Image • Recraft • ProxiedGenerate production-ready SVG vector graphics from text prompts with clean geometry, structured layers, and editable 
paths.](https://developers.cloudflare.com/ai/models/recraft/recraftv4-vector/)[![Recraft logo](https://developers.cloudflare.com/_astro/recraft.BhhnJczi.svg)recraftv4Text-to-Image • Recraft • ProxiedRecraft V4 generates art-directed images with strong composition, accurate text rendering, and design taste built in. Fast and cost-efficient at standard resolution.](https://developers.cloudflare.com/ai/models/recraft/recraftv4/)[![Recraft logo](https://developers.cloudflare.com/_astro/recraft.BhhnJczi.svg)recraftv4-proText-to-Image • Recraft • ProxiedRecraft V4 Pro generates high-resolution, art-directed images at 2048px+ with strong composition, text rendering, and design taste. Built for print and production work.](https://developers.cloudflare.com/ai/models/recraft/recraftv4-pro/)[![Google logo](https://developers.cloudflare.com/_astro/google.DyXKPTPP.svg)gemini-3-flashText Generation • Google • ProxiedGemini 3 Flash is Google's fast multimodal model with frontier intelligence, superior search, and grounding capabilities.](https://developers.cloudflare.com/ai/models/google/gemini-3-flash/)[![Google logo](https://developers.cloudflare.com/_astro/google.DyXKPTPP.svg)gemini-3.1-flash-liteText Generation • Google • ProxiedGoogle's lightest and most cost-efficient Gemini model for high-throughput tasks.](https://developers.cloudflare.com/ai/models/google/gemini-3.1-flash-lite/)[![Google logo](https://developers.cloudflare.com/_astro/google.DyXKPTPP.svg)gemini-3.1-proText Generation • Google • ProxiedGoogle's most intelligent Gemini model with improved reasoning, a medium thinking level, and a 1M token context window.](https://developers.cloudflare.com/ai/models/google/gemini-3.1-pro/)[![OpenAI logo](https://developers.cloudflare.com/_astro/openai.BI8PEEzI.svg)tts-1Text-to-Speech • OpenAI • ProxiedOpenAI's text-to-speech model optimized for real-time use with low latency.](https://developers.cloudflare.com/ai/models/openai/tts-1/)[![OpenAI 
logo](https://developers.cloudflare.com/_astro/openai.BI8PEEzI.svg)tts-1-hdText-to-Speech • OpenAI • ProxiedOpenAI's high-definition text-to-speech model producing higher quality audio output.](https://developers.cloudflare.com/ai/models/openai/tts-1-hd/)[![OpenAI logo](https://developers.cloudflare.com/_astro/openai.BI8PEEzI.svg)gpt-4o-transcribeAutomatic Speech Recognition • OpenAI • ProxiedA speech-to-text model that uses GPT-4o to transcribe audio with improved word error rate and better language recognition compared to original Whisper models.](https://developers.cloudflare.com/ai/models/openai/gpt-4o-transcribe/)[![OpenAI logo](https://developers.cloudflare.com/_astro/openai.BI8PEEzI.svg)o4-miniText Generation • OpenAI • ProxiedOpenAI's fast, lightweight reasoning model optimized for multi-step problem solving at lower cost.](https://developers.cloudflare.com/ai/models/openai/o4-mini/)[![OpenAI logo](https://developers.cloudflare.com/_astro/openai.BI8PEEzI.svg)gpt-4.1Text Generation • OpenAI • ProxiedOpenAI's flagship GPT model for complex tasks with a million-token context window.](https://developers.cloudflare.com/ai/models/openai/gpt-4.1/)[![OpenAI logo](https://developers.cloudflare.com/_astro/openai.BI8PEEzI.svg)gpt-4.1-miniText Generation • OpenAI • ProxiedFast, affordable version of GPT-4.1 with a million-token context window.](https://developers.cloudflare.com/ai/models/openai/gpt-4.1-mini/)[![OpenAI logo](https://developers.cloudflare.com/_astro/openai.BI8PEEzI.svg)gpt-5Text Generation • OpenAI • ProxiedOpenAI's model excelling at coding, writing, and reasoning.](https://developers.cloudflare.com/ai/models/openai/gpt-5/)[![OpenAI logo](https://developers.cloudflare.com/_astro/openai.BI8PEEzI.svg)gpt-5.4-nanoText Generation • OpenAI • ProxiedGPT-5.4 Nano is OpenAI's smallest and fastest model, optimized for edge and low-latency use cases.](https://developers.cloudflare.com/ai/models/openai/gpt-5.4-nano/)[![OpenAI 
logo](https://developers.cloudflare.com/_astro/openai.BI8PEEzI.svg)gpt-5.4-miniText Generation • OpenAI • ProxiedGPT-5.4 Mini is a smaller, faster, and more cost-efficient version of GPT-5.4 for lightweight tasks.](https://developers.cloudflare.com/ai/models/openai/gpt-5.4-mini/)[![Anthropic logo](https://developers.cloudflare.com/_astro/anthropic.DbRqBIjP.svg)claude-haiku-4.5Text Generation • Anthropic • ProxiedClaude Haiku 4.5 delivers similar levels of coding performance at one-third the cost and more than twice the speed of larger models.](https://developers.cloudflare.com/ai/models/anthropic/claude-haiku-4.5/)[![Anthropic logo](https://developers.cloudflare.com/_astro/anthropic.DbRqBIjP.svg)claude-sonnet-4Text Generation • Anthropic • ProxiedClaude Sonnet 4 delivers superior coding and reasoning while responding more precisely to instructions, a significant upgrade over previous versions.](https://developers.cloudflare.com/ai/models/anthropic/claude-sonnet-4/)[![Anthropic logo](https://developers.cloudflare.com/_astro/anthropic.DbRqBIjP.svg)claude-sonnet-4.5Text Generation • Anthropic • ProxiedClaude Sonnet 4.5 is the best coding model to date, with significant improvements across the entire development lifecycle.](https://developers.cloudflare.com/ai/models/anthropic/claude-sonnet-4.5/)[![Anthropic logo](https://developers.cloudflare.com/_astro/anthropic.DbRqBIjP.svg)claude-sonnet-4.6Text Generation • Anthropic • ProxiedClaude Sonnet 4.6 is Anthropic's latest balanced model offering strong coding, reasoning, and agentic capabilities with improved instruction following.](https://developers.cloudflare.com/ai/models/anthropic/claude-sonnet-4.6/)[![Anthropic logo](https://developers.cloudflare.com/_astro/anthropic.DbRqBIjP.svg)claude-opus-4.6Text Generation • Anthropic • ProxiedClaude Opus 4.6 is Anthropic's flagship language model built for complex, multi-step work in coding, financial analysis, and legal reasoning. 
It uses extended thinking to work through complex problems carefully and features a one million token context window.](https://developers.cloudflare.com/ai/models/anthropic/claude-opus-4.6/)[![ByteDance logo](https://developers.cloudflare.com/_astro/bytedance.T1uiROQ6.svg)seedream-5-liteText-to-Image • ByteDance • ProxiedSeedream 5 Lite is a lighter, faster version of the Seedream 5 family with multi-reference and batch generation support.](https://developers.cloudflare.com/ai/models/bytedance/seedream-5-lite/)[![ByteDance logo](https://developers.cloudflare.com/_astro/bytedance.T1uiROQ6.svg)seedream-4.5Text-to-Image • ByteDance • ProxiedSeedream 4.5 builds on 4.0 with multi-reference image support, batch generation, and sequential image generation.](https://developers.cloudflare.com/ai/models/bytedance/seedream-4.5/)[![ByteDance logo](https://developers.cloudflare.com/_astro/bytedance.T1uiROQ6.svg)seedream-4.0Text-to-Image • ByteDance • ProxiedSeedream 4.0 is ByteDance's image creation model that combines text-to-image generation and image editing into a single architecture, offering fast, high-resolution output up to 4K.](https://developers.cloudflare.com/ai/models/bytedance/seedream-4.0/)[![Google logo](https://developers.cloudflare.com/_astro/google.DyXKPTPP.svg)nano-banana-2Text-to-Image • Google • ProxiedGoogle's second-generation image generation model with improved quality and speed.](https://developers.cloudflare.com/ai/models/google/nano-banana-2/)[![Google logo](https://developers.cloudflare.com/_astro/google.DyXKPTPP.svg)nano-banana-proText-to-Image • Google • ProxiedGoogle's higher-quality image generation model with improved detail and prompt adherence.](https://developers.cloudflare.com/ai/models/google/nano-banana-pro/)[![Google logo](https://developers.cloudflare.com/_astro/google.DyXKPTPP.svg)nano-bananaText-to-Image • Google • ProxiedGoogle's fast image generation model producing high-quality images from text 
prompts.](https://developers.cloudflare.com/ai/models/google/nano-banana/)[![Google logo](https://developers.cloudflare.com/_astro/google.DyXKPTPP.svg)veo-3.1-fastText-to-Video • Google • ProxiedA faster version of Veo 3.1 optimized for lower latency while maintaining high-quality video and audio output.](https://developers.cloudflare.com/ai/models/google/veo-3.1-fast/)[![Google logo](https://developers.cloudflare.com/_astro/google.DyXKPTPP.svg)veo-3-fastText-to-Video • Google • ProxiedA faster version of Veo 3 optimized for lower latency video generation with audio support.](https://developers.cloudflare.com/ai/models/google/veo-3-fast/)[![Google logo](https://developers.cloudflare.com/_astro/google.DyXKPTPP.svg)veo-3.1Text-to-Video • Google • ProxiedGoogle's latest video generation model with improved quality, motion, and audio generation.](https://developers.cloudflare.com/ai/models/google/veo-3.1/)[![Google logo](https://developers.cloudflare.com/_astro/google.DyXKPTPP.svg)veo-3Text-to-Video • Google • ProxiedGoogle's video generation model capable of producing high-quality videos with optional audio from text prompts.](https://developers.cloudflare.com/ai/models/google/veo-3/)[![OpenAI logo](https://developers.cloudflare.com/_astro/openai.BI8PEEzI.svg)gpt-5.4Text Generation • OpenAI • ProxiedGPT-5.4 is OpenAI's flagship model with strong coding, reasoning, and multimodal capabilities.](https://developers.cloudflare.com/ai/models/openai/gpt-5.4/)[![Google logo](https://developers.cloudflare.com/_astro/google.DyXKPTPP.svg)gemma-4-26b-a4b-itText Generation • Google • HostedGemma 4 is Google's most intelligent family of open models, built from Gemini 3 research to maximize intelligence-per-parameter.Function callingReasoningVision](https://developers.cloudflare.com/ai/models/@cf/google/gemma-4-26b-a4b-it/)[![NVIDIA logo](https://developers.cloudflare.com/_astro/nvidia.y1O6VlZA.svg)nemotron-3-120b-a12bText Generation • NVIDIA • HostedNVIDIA Nemotron 3 Super is a 
hybrid MoE model with leading accuracy for multi-agent applications and specialized agentic AI systems.Function callingReasoning](https://developers.cloudflare.com/ai/models/@cf/nvidia/nemotron-3-120b-a12b/)[![Moonshot AI logo](https://developers.cloudflare.com/_astro/moonshotai.D9EBG7kx.svg)kimi-k2.5Text Generation • Moonshot AI • HostedKimi K2.5 is a frontier-scale open-source model with a 256k context window, multi-turn tool calling, vision inputs, and structured outputs for agentic workloads.Function callingPlanned deprecationReasoningVision](https://developers.cloudflare.com/ai/models/@cf/moonshotai/kimi-k2.5/)[![Black Forest Labs logo](https://developers.cloudflare.com/_astro/blackforestlabs.Ccs-Y4-D.svg)flux-2-klein-9bText-to-Image • Black Forest Labs • HostedFLUX.2 \[klein\] 9B is an ultra-fast, distilled image model with enhanced quality. It unifies image generation and editing in a single model, delivering state-of-the-art quality enabling interactive workflows, real-time previews, and latency-critical applications.Partner](https://developers.cloudflare.com/ai/models/@cf/black-forest-labs/flux-2-klein-9b/)[![Black Forest Labs logo](https://developers.cloudflare.com/_astro/blackforestlabs.Ccs-Y4-D.svg)flux-2-klein-4bText-to-Image • Black Forest Labs • HostedFLUX.2 \[klein\] is an ultra-fast, distilled image model. 
It unifies image generation and editing in a single model, delivering state-of-the-art quality enabling interactive workflows, real-time previews, and latency-critical applications.Partner](https://developers.cloudflare.com/ai/models/@cf/black-forest-labs/flux-2-klein-4b/)[![Black Forest Labs logo](https://developers.cloudflare.com/_astro/blackforestlabs.Ccs-Y4-D.svg)flux-2-devText-to-Image • Black Forest Labs • HostedFLUX.2 \[dev\] is an image model from Black Forest Labs where you can generate highly realistic and detailed images, with multi-reference support.Partner](https://developers.cloudflare.com/ai/models/@cf/black-forest-labs/flux-2-dev/)[![Deepgram logo](https://developers.cloudflare.com/_astro/deepgram.BYzW8KfF.svg)aura-2-esText-to-Speech • Deepgram • HostedAura-2 is a context-aware text-to-speech (TTS) model that applies natural pacing, expressiveness, and fillers based on the context of the provided text. The quality of your text input directly impacts the naturalness of the audio output.BatchPartnerReal-time](https://developers.cloudflare.com/ai/models/@cf/deepgram/aura-2-es/)[![Deepgram logo](https://developers.cloudflare.com/_astro/deepgram.BYzW8KfF.svg)aura-2-enText-to-Speech • Deepgram • HostedAura-2 is a context-aware text-to-speech (TTS) model that applies natural pacing, expressiveness, and fillers based on the context of the provided text. The quality of your text input directly impacts the naturalness of the audio output.BatchPartnerReal-time](https://developers.cloudflare.com/ai/models/@cf/deepgram/aura-2-en/)[![IBM logo](https://developers.cloudflare.com/_astro/ibm.CNSuznmO.svg)granite-4.0-h-microText Generation • IBM • HostedGranite 4.0 instruct models deliver strong performance across benchmarks, achieving industry-leading results in key agentic tasks like instruction following and function calling. 
These efficiencies make the models well-suited for a wide range of use cases like retrieval-augmented generation (RAG), multi-agent workflows, and edge deployments.Function calling](https://developers.cloudflare.com/ai/models/@cf/ibm-granite/granite-4.0-h-micro/)[![Deepgram logo](https://developers.cloudflare.com/_astro/deepgram.BYzW8KfF.svg)fluxAutomatic Speech Recognition • Deepgram • HostedFlux is the first conversational speech recognition model built specifically for voice agents.PartnerReal-time](https://developers.cloudflare.com/ai/models/@cf/deepgram/flux/)[pplamo-embedding-1bText Embeddings • pfnet • HostedPLaMo-Embedding-1B is a Japanese text embedding model developed by Preferred Networks, Inc. It can convert Japanese text input into numerical vectors and can be used for a wide range of applications, including information retrieval, text classification, and clustering.](https://developers.cloudflare.com/ai/models/@cf/pfnet/plamo-embedding-1b/)[agemma-sea-lion-v4-27b-itText Generation • aisingapore • HostedSEA-LION stands for Southeast Asian Languages In One Network, which is a collection of Large Language Models (LLMs) which have been pretrained and instruct-tuned for the Southeast Asia (SEA) region.](https://developers.cloudflare.com/ai/models/@cf/aisingapore/gemma-sea-lion-v4-27b-it/)[aindictrans2-en-indic-1BTranslation • ai4bharat • HostedIndicTrans2 is the first open-source transformer-based multilingual NMT model that supports high-quality translations across all the 22 scheduled Indic languages](https://developers.cloudflare.com/ai/models/@cf/ai4bharat/indictrans2-en-indic-1B/)[![Google logo](https://developers.cloudflare.com/_astro/google.DyXKPTPP.svg)embeddinggemma-300mText Embeddings • Google • HostedEmbeddingGemma is a 300M parameter, state-of-the-art for its size, open embedding model from Google, built from Gemma 3 (with T5Gemma initialization) and the same research and technology used to create Gemini models. 
EmbeddingGemma produces vector representations of text, making it well-suited for search and retrieval tasks, including classification, clustering, and semantic similarity search. This model was trained with data in 100+ spoken languages.](https://developers.cloudflare.com/ai/models/@cf/google/embeddinggemma-300m/)[![Deepgram logo](https://developers.cloudflare.com/_astro/deepgram.BYzW8KfF.svg)aura-1Text-to-Speech • Deepgram • HostedAura is a context-aware text-to-speech (TTS) model that applies natural pacing, expressiveness, and fillers based on the context of the provided text. The quality of your text input directly impacts the naturalness of the audio output.BatchPartnerReal-time](https://developers.cloudflare.com/ai/models/@cf/deepgram/aura-1/)[![Leonardo logo](https://developers.cloudflare.com/_astro/leonardo.Ch-T5rST.svg)lucid-originText-to-Image • Leonardo • HostedLucid Origin from Leonardo.AI is their most adaptable and prompt-responsive model to date. Whether you're generating images with sharp graphic design, stunning full-HD renders, or highly specific creative direction, it adheres closely to your prompts, renders text with accuracy, and supports a wide array of visual styles and aesthetics – from stylized concept art to crisp product mockups.Partner](https://developers.cloudflare.com/ai/models/@cf/leonardo/lucid-origin/)[![Leonardo logo](https://developers.cloudflare.com/_astro/leonardo.Ch-T5rST.svg)phoenix-1.0Text-to-Image • Leonardo • HostedPhoenix 1.0 is a model by Leonardo.Ai that generates images with exceptional prompt adherence and coherent text.Partner](https://developers.cloudflare.com/ai/models/@cf/leonardo/phoenix-1.0/)[![OpenAI logo](https://developers.cloudflare.com/_astro/openai.BI8PEEzI.svg)gpt-oss-20bText Generation • OpenAI • HostedOpenAI's open-weight models designed for powerful reasoning, agentic tasks, and versatile developer use cases – gpt-oss-20b is for lower latency, and local or specialized use-cases.Function 
callingReasoning](https://developers.cloudflare.com/ai/models/@cf/openai/gpt-oss-20b/)[![Pipecat logo](https://developers.cloudflare.com/_astro/pipecat.B-PNBdef.svg)smart-turn-v2Voice Activity Detection • Pipecat • HostedAn open source, community-driven, native audio turn detection model in 2nd versionBatchReal-time](https://developers.cloudflare.com/ai/models/@cf/pipecat-ai/smart-turn-v2/)[![Qwen logo](https://developers.cloudflare.com/_astro/qwen.CVqFFn5h.svg)qwen3-embedding-0.6bText Embeddings • Qwen • HostedThe Qwen3 Embedding model series is the latest proprietary model of the Qwen family, specifically designed for text embedding and ranking tasks. ](https://developers.cloudflare.com/ai/models/@cf/qwen/qwen3-embedding-0.6b/)[![Deepgram logo](https://developers.cloudflare.com/_astro/deepgram.BYzW8KfF.svg)nova-3Automatic Speech Recognition • Deepgram • HostedTranscribe audio using Deepgram’s speech-to-text modelBatchPartnerReal-time](https://developers.cloudflare.com/ai/models/@cf/deepgram/nova-3/)[![Qwen logo](https://developers.cloudflare.com/_astro/qwen.CVqFFn5h.svg)qwen3-30b-a3b-fp8Text Generation • Qwen • HostedQwen3 is the latest generation of large language models in Qwen series, offering a comprehensive suite of dense and mixture-of-experts (MoE) models. Built upon extensive training, Qwen3 delivers groundbreaking advancements in reasoning, instruction-following, agent capabilities, and multilingual support.BatchFunction callingReasoning](https://developers.cloudflare.com/ai/models/@cf/qwen/qwen3-30b-a3b-fp8/)[![Google logo](https://developers.cloudflare.com/_astro/google.DyXKPTPP.svg)gemma-3-12b-itText Generation • Google • HostedGemma 3 models are well-suited for a variety of text generation and image understanding tasks, including question answering, summarization, and reasoning. 
Gemma 3 models are multimodal, handling text and image input and generating text output, with a large, 128K context window, multilingual support in over 140 languages, and is available in more sizes than previous versions.LoRAPlanned deprecation](https://developers.cloudflare.com/ai/models/@cf/google/gemma-3-12b-it/)[![MistralAI logo](https://developers.cloudflare.com/_astro/mistralai.Bn9UMUMu.svg)mistral-small-3.1-24b-instructText Generation • MistralAI • HostedBuilding upon Mistral Small 3 (2501), Mistral Small 3.1 (2503) adds state-of-the-art vision understanding and enhances long context capabilities up to 128k tokens without compromising text performance. With 24 billion parameters, this model achieves top-tier capabilities in both text and vision tasks.Function calling](https://developers.cloudflare.com/ai/models/@cf/mistralai/mistral-small-3.1-24b-instruct/)[![Qwen logo](https://developers.cloudflare.com/_astro/qwen.CVqFFn5h.svg)qwq-32bText Generation • Qwen • HostedQwQ is the reasoning model of the Qwen series. Compared with conventional instruction-tuned models, QwQ, which is capable of thinking and reasoning, can achieve significantly enhanced performance in downstream tasks, especially hard problems. QwQ-32B is the medium-sized reasoning model, which is capable of achieving competitive performance against state-of-the-art reasoning models, e.g., DeepSeek-R1, o1-mini.LoRAReasoning](https://developers.cloudflare.com/ai/models/@cf/qwen/qwq-32b/)[![Qwen logo](https://developers.cloudflare.com/_astro/qwen.CVqFFn5h.svg)qwen2.5-coder-32b-instructText Generation • Qwen • HostedQwen2.5-Coder is the latest series of Code-Specific Qwen large language models (formerly known as CodeQwen). As of now, Qwen2.5-Coder has covered six mainstream model sizes, 0.5, 1.5, 3, 7, 14, 32 billion parameters, to meet the needs of different developers. 
Qwen2.5-Coder brings the following improvements upon CodeQwen1.5:LoRA](https://developers.cloudflare.com/ai/models/@cf/qwen/qwen2.5-coder-32b-instruct/)[![BAAI logo](https://developers.cloudflare.com/_astro/baai.mOtdbKlV.svg)bge-reranker-baseText Classification • BAAI • HostedUnlike an embedding model, a reranker takes a question and a document as input and directly outputs a similarity score instead of an embedding. You can get a relevance score by passing a query and a passage to the reranker. The score can be mapped to a float value in \[0,1\] by a sigmoid function.](https://developers.cloudflare.com/ai/models/@cf/baai/bge-reranker-base/)[![Meta logo](https://developers.cloudflare.com/_astro/meta.BR4nfp35.svg)llama-guard-3-8bText Generation • Meta • HostedLlama Guard 3 is a Llama-3.1-8B pretrained model, fine-tuned for content safety classification. Similar to previous versions, it can be used to classify content in both LLM inputs (prompt classification) and in LLM responses (response classification). It acts as an LLM – it generates text in its output that indicates whether a given prompt or response is safe or unsafe, and if unsafe, it also lists the content categories violated.LoRA](https://developers.cloudflare.com/ai/models/@cf/meta/llama-guard-3-8b/)[![DeepSeek logo](https://developers.cloudflare.com/_astro/deepseek.nPIT6fwR.svg)deepseek-r1-distill-qwen-32bText Generation • DeepSeek • HostedDeepSeek-R1-Distill-Qwen-32B is a model distilled from DeepSeek-R1 based on Qwen2.5\. 
It outperforms OpenAI-o1-mini across various benchmarks, achieving new state-of-the-art results for dense models.Reasoning](https://developers.cloudflare.com/ai/models/@cf/deepseek-ai/deepseek-r1-distill-qwen-32b/)[![Meta logo](https://developers.cloudflare.com/_astro/meta.BR4nfp35.svg)llama-3.3-70b-instruct-fp8-fastText Generation • Meta • HostedLlama 3.3 70B quantized to fp8 precision, optimized to be faster.BatchFunction calling](https://developers.cloudflare.com/ai/models/@cf/meta/llama-3.3-70b-instruct-fp8-fast/)[![Meta logo](https://developers.cloudflare.com/_astro/meta.BR4nfp35.svg)llama-3.2-1b-instructText Generation • Meta • HostedThe Llama 3.2 instruction-tuned text only models are optimized for multilingual dialogue use cases, including agentic retrieval and summarization tasks.](https://developers.cloudflare.com/ai/models/@cf/meta/llama-3.2-1b-instruct/)[![Meta logo](https://developers.cloudflare.com/_astro/meta.BR4nfp35.svg)llama-3.2-3b-instructText Generation • Meta • HostedThe Llama 3.2 instruction-tuned text only models are optimized for multilingual dialogue use cases, including agentic retrieval and summarization tasks.](https://developers.cloudflare.com/ai/models/@cf/meta/llama-3.2-3b-instruct/)[![Meta logo](https://developers.cloudflare.com/_astro/meta.BR4nfp35.svg)llama-3.2-11b-vision-instructText Generation • Meta • Hosted The Llama 3.2-Vision instruction-tuned models are optimized for visual recognition, image reasoning, captioning, and answering general questions about an image.LoRAVision](https://developers.cloudflare.com/ai/models/@cf/meta/llama-3.2-11b-vision-instruct/)[![Black Forest Labs logo](https://developers.cloudflare.com/_astro/blackforestlabs.Ccs-Y4-D.svg)flux-1-schnellText-to-Image • Black Forest Labs • HostedFLUX.1 \[schnell\] is a 12 billion parameter rectified flow transformer capable of generating images from text descriptions. 
](https://developers.cloudflare.com/ai/models/@cf/black-forest-labs/flux-1-schnell/)[![Meta logo](https://developers.cloudflare.com/_astro/meta.BR4nfp35.svg)llama-3.1-8b-instruct-awqText Generation • Meta • HostedQuantized (int4) generative text model with 8 billion parameters from Meta.Planned deprecation](https://developers.cloudflare.com/ai/models/@cf/meta/llama-3.1-8b-instruct-awq/)[![Meta logo](https://developers.cloudflare.com/_astro/meta.BR4nfp35.svg)llama-3.1-8b-instruct-fp8Text Generation • Meta • HostedLlama 3.1 8B quantized to FP8 precision](https://developers.cloudflare.com/ai/models/@cf/meta/llama-3.1-8b-instruct-fp8/)[![MyShell logo](https://developers.cloudflare.com/_astro/myshell.BpTDMxd2.svg)melottsText-to-Speech • MyShell • HostedMeloTTS is a high-quality multi-lingual text-to-speech library by MyShell.ai.](https://developers.cloudflare.com/ai/models/@cf/myshell-ai/melotts/)[![Meta logo](https://developers.cloudflare.com/_astro/meta.BR4nfp35.svg)llama-3.1-8b-instructText Generation • Meta • HostedThe Meta Llama 3.1 collection of multilingual large language models (LLMs) is a collection of pretrained and instruction tuned generative models. 
The Llama 3.1 instruction tuned text only models are optimized for multilingual dialogue use cases and outperform many of the available open source and closed chat models on common industry benchmarks.Planned deprecation](https://developers.cloudflare.com/ai/models/@cf/meta/llama-3.1-8b-instruct/)[![BAAI logo](https://developers.cloudflare.com/_astro/baai.mOtdbKlV.svg)bge-m3Text Embeddings • BAAI • HostedMulti-Functionality, Multi-Linguality, and Multi-Granularity embeddings model.](https://developers.cloudflare.com/ai/models/@cf/baai/bge-m3/)[![Meta logo](https://developers.cloudflare.com/_astro/meta.BR4nfp35.svg)meta-llama-3-8b-instructText Generation • Meta • HostedGeneration over generation, Meta Llama 3 demonstrates state-of-the-art performance on a wide range of industry benchmarks and offers new capabilities, including improved reasoning. Planned deprecation](https://developers.cloudflare.com/ai/models/@hf/meta-llama/meta-llama-3-8b-instruct/)[![OpenAI logo](https://developers.cloudflare.com/_astro/openai.BI8PEEzI.svg)whisper-large-v3-turboAutomatic Speech Recognition • OpenAI • HostedWhisper is a pre-trained model for automatic speech recognition (ASR) and speech translation. Batch](https://developers.cloudflare.com/ai/models/@cf/openai/whisper-large-v3-turbo/)[![Meta logo](https://developers.cloudflare.com/_astro/meta.BR4nfp35.svg)llama-3-8b-instruct-awqText Generation • Meta • HostedQuantized (int4) generative text model with 8 billion parameters from Meta.Planned deprecation](https://developers.cloudflare.com/ai/models/@cf/meta/llama-3-8b-instruct-awq/)[llava-1.5-7b-hfBetaImage-to-Text • llava-hf • HostedLLaVA is an open-source chatbot trained by fine-tuning LLaMA/Vicuna on GPT-generated multimodal instruction-following data. 
It is an auto-regressive language model, based on the transformer architecture.](https://developers.cloudflare.com/ai/models/@cf/llava-hf/llava-1.5-7b-hf/)[![OpenAI logo](https://developers.cloudflare.com/_astro/openai.BI8PEEzI.svg)whisper-tiny-enBetaAutomatic Speech Recognition • OpenAI • HostedWhisper is a pre-trained model for automatic speech recognition (ASR) and speech translation. Trained on 680k hours of labelled data, Whisper models demonstrate a strong ability to generalize to many datasets and domains without the need for fine-tuning. This is the English-only version of the Whisper Tiny model which was trained on the task of speech recognition.](https://developers.cloudflare.com/ai/models/@cf/openai/whisper-tiny-en/)[![Meta logo](https://developers.cloudflare.com/_astro/meta.BR4nfp35.svg)llama-3-8b-instructText Generation • Meta • HostedGeneration over generation, Meta Llama 3 demonstrates state-of-the-art performance on a wide range of industry benchmarks and offers new capabilities, including improved reasoning.Planned deprecation](https://developers.cloudflare.com/ai/models/@cf/meta/llama-3-8b-instruct/)[![MistralAI logo](https://developers.cloudflare.com/_astro/mistralai.Bn9UMUMu.svg)mistral-7b-instruct-v0.2BetaText Generation • MistralAI • HostedThe Mistral-7B-Instruct-v0.2 Large Language Model (LLM) is an instruct fine-tuned version of the Mistral-7B-v0.2\. Mistral-7B-v0.2 has the following changes compared to Mistral-7B-v0.1: 32k context window (vs 8k context in v0.1), rope-theta = 1e6, and no Sliding-Window Attention.LoRAPlanned deprecation](https://developers.cloudflare.com/ai/models/@hf/mistral/mistral-7b-instruct-v0.2/)[![Google logo](https://developers.cloudflare.com/_astro/google.DyXKPTPP.svg)gemma-7b-it-loraBetaText Generation • Google • Hosted This is a Gemma-7B base model that Cloudflare dedicates for inference with LoRA adapters. 
Gemma is a family of lightweight, state-of-the-art open models from Google, built from the same research and technology used to create the Gemini models.LoRA](https://developers.cloudflare.com/ai/models/@cf/google/gemma-7b-it-lora/)[![Google logo](https://developers.cloudflare.com/_astro/google.DyXKPTPP.svg)gemma-2b-it-loraBetaText Generation • Google • HostedThis is a Gemma-2B base model that Cloudflare dedicates for inference with LoRA adapters. Gemma is a family of lightweight, state-of-the-art open models from Google, built from the same research and technology used to create the Gemini models.LoRA](https://developers.cloudflare.com/ai/models/@cf/google/gemma-2b-it-lora/)[![Meta logo](https://developers.cloudflare.com/_astro/meta.BR4nfp35.svg)llama-2-7b-chat-hf-loraBetaText Generation • Meta • HostedThis is a Llama2 base model that Cloudflare dedicates for inference with LoRA adapters. Llama 2 is a collection of pretrained and fine-tuned generative text models ranging in scale from 7 billion to 70 billion parameters. This is the repository for the 7B fine-tuned model, optimized for dialogue use cases and converted for the Hugging Face Transformers format. LoRA](https://developers.cloudflare.com/ai/models/@cf/meta-llama/llama-2-7b-chat-hf-lora/)[![Google logo](https://developers.cloudflare.com/_astro/google.DyXKPTPP.svg)gemma-7b-itBetaText Generation • Google • HostedGemma is a family of lightweight, state-of-the-art open models from Google, built from the same research and technology used to create the Gemini models. They are text-to-text, decoder-only large language models, available in English, with open weights, pre-trained variants, and instruction-tuned variants.LoRAPlanned deprecation](https://developers.cloudflare.com/ai/models/@hf/google/gemma-7b-it/)[hermes-2-pro-mistral-7bBetaText Generation • nousresearch • HostedHermes 2 Pro on Mistral 7B is the new flagship 7B Hermes! 
Hermes 2 Pro is an upgraded, retrained version of Nous Hermes 2, consisting of an updated and cleaned version of the OpenHermes 2.5 Dataset, as well as a newly introduced Function Calling and JSON Mode dataset developed in-house.Function callingPlanned deprecation](https://developers.cloudflare.com/ai/models/@hf/nousresearch/hermes-2-pro-mistral-7b/)[![MistralAI logo](https://developers.cloudflare.com/_astro/mistralai.Bn9UMUMu.svg)mistral-7b-instruct-v0.2-loraBetaText Generation • MistralAI • HostedThe Mistral-7B-Instruct-v0.2 Large Language Model (LLM) is an instruct fine-tuned version of the Mistral-7B-v0.2.LoRA](https://developers.cloudflare.com/ai/models/@cf/mistral/mistral-7b-instruct-v0.2-lora/)[![Unum logo](https://developers.cloudflare.com/_astro/unum.Cjjoj0_o.svg)uform-gen2-qwen-500mBetaImage-to-Text • Unum • HostedUForm-Gen is a small generative vision-language model primarily designed for Image Captioning and Visual Question Answering. The model was pre-trained on the internal image captioning dataset and fine-tuned on public instructions datasets: SVIT, LVIS, VQAs datasets.Planned deprecation](https://developers.cloudflare.com/ai/models/@cf/unum/uform-gen2-qwen-500m/)[![Meta logo](https://developers.cloudflare.com/_astro/meta.BR4nfp35.svg)bart-large-cnnBetaSummarization • Meta • HostedBART is a transformer encoder-decoder (seq2seq) model with a bidirectional (BERT-like) encoder and an autoregressive (GPT-like) decoder. 
You can use this model for text summarization.Planned deprecation](https://developers.cloudflare.com/ai/models/@cf/facebook/bart-large-cnn/)[![Microsoft logo](https://developers.cloudflare.com/_astro/microsoft.LujcDJ--.svg)phi-2BetaText Generation • Microsoft • HostedPhi-2 is a Transformer-based model with a next-word prediction objective, trained on 1.4T tokens from multiple passes on a mixture of Synthetic and Web datasets for NLP and coding.Planned deprecation](https://developers.cloudflare.com/ai/models/@cf/microsoft/phi-2/)[![Defog logo](https://developers.cloudflare.com/_astro/defog.BeLrxE1p.svg)sqlcoder-7b-2BetaText Generation • Defog • HostedThis model is intended to be used by non-technical users to understand data inside their SQL databases. Planned deprecation](https://developers.cloudflare.com/ai/models/@cf/defog/sqlcoder-7b-2/)[![Meta logo](https://developers.cloudflare.com/_astro/meta.BR4nfp35.svg)detr-resnet-50BetaObject Detection • Meta • HostedDEtection TRansformer (DETR) model trained end-to-end on COCO 2017 object detection (118k annotated images).](https://developers.cloudflare.com/ai/models/@cf/facebook/detr-resnet-50/)[![ByteDance logo](https://developers.cloudflare.com/_astro/bytedance.T1uiROQ6.svg)stable-diffusion-xl-lightningBetaText-to-Image • ByteDance • HostedSDXL-Lightning is a lightning-fast text-to-image generation model. 
It can generate high-quality 1024px images in a few steps.](https://developers.cloudflare.com/ai/models/@cf/bytedance/stable-diffusion-xl-lightning/)[dreamshaper-8-lcmText-to-Image • lykon • HostedStable Diffusion model that has been fine-tuned to be better at photorealism without sacrificing range.](https://developers.cloudflare.com/ai/models/@cf/lykon/dreamshaper-8-lcm/)[![RunwayML logo](https://developers.cloudflare.com/_astro/runway.Cq8Cjov4.svg)stable-diffusion-v1-5-img2imgBetaText-to-Image • RunwayML • HostedStable Diffusion is a latent text-to-image diffusion model capable of generating photo-realistic images. Img2img generates a new image from an input image with Stable Diffusion. ](https://developers.cloudflare.com/ai/models/@cf/runwayml/stable-diffusion-v1-5-img2img/)[![RunwayML logo](https://developers.cloudflare.com/_astro/runway.Cq8Cjov4.svg)stable-diffusion-v1-5-inpaintingBetaText-to-Image • RunwayML • HostedStable Diffusion Inpainting is a latent text-to-image diffusion model capable of generating photo-realistic images given any text input, with the extra capability of inpainting the pictures by using a mask.](https://developers.cloudflare.com/ai/models/@cf/runwayml/stable-diffusion-v1-5-inpainting/)[![Stability.ai logo](https://developers.cloudflare.com/_astro/stabilityai.CmlmNdqR.svg)stable-diffusion-xl-base-1.0BetaText-to-Image • Stability.ai • HostedDiffusion-based text-to-image generative model by Stability AI. 
Generates and modifies images based on text prompts.](https://developers.cloudflare.com/ai/models/@cf/stabilityai/stable-diffusion-xl-base-1.0/)[![BAAI logo](https://developers.cloudflare.com/_astro/baai.mOtdbKlV.svg)bge-large-en-v1.5Text Embeddings • BAAI • HostedBAAI general embedding (Large) model that transforms any given text into a 1024-dimensional vectorBatch](https://developers.cloudflare.com/ai/models/@cf/baai/bge-large-en-v1.5/)[![BAAI logo](https://developers.cloudflare.com/_astro/baai.mOtdbKlV.svg)bge-small-en-v1.5Text Embeddings • BAAI • HostedBAAI general embedding (Small) model that transforms any given text into a 384-dimensional vectorBatch](https://developers.cloudflare.com/ai/models/@cf/baai/bge-small-en-v1.5/)[![Meta logo](https://developers.cloudflare.com/_astro/meta.BR4nfp35.svg)llama-2-7b-chat-fp16Text Generation • Meta • HostedFull precision (fp16) generative text model with 7 billion parameters from MetaPlanned deprecation](https://developers.cloudflare.com/ai/models/@cf/meta/llama-2-7b-chat-fp16/)[![MistralAI logo](https://developers.cloudflare.com/_astro/mistralai.Bn9UMUMu.svg)mistral-7b-instruct-v0.1Text Generation • MistralAI • HostedInstruct fine-tuned version of the Mistral-7b generative text model with 7 billion parametersLoRAPlanned deprecation](https://developers.cloudflare.com/ai/models/@cf/mistral/mistral-7b-instruct-v0.1/)[![BAAI logo](https://developers.cloudflare.com/_astro/baai.mOtdbKlV.svg)bge-base-en-v1.5Text Embeddings • BAAI • HostedBAAI general embedding (Base) model that transforms any given text into a 768-dimensional vectorBatch](https://developers.cloudflare.com/ai/models/@cf/baai/bge-base-en-v1.5/)[![HuggingFace logo](https://developers.cloudflare.com/_astro/huggingface.ngjt5u2J.svg)distilbert-sst-2-int8Text Classification • HuggingFace • HostedDistilled BERT model that was finetuned on SST-2 for sentiment classification](https://developers.cloudflare.com/ai/models/@cf/huggingface/distilbert-sst-2-int8/)[![Meta 
logo](https://developers.cloudflare.com/_astro/meta.BR4nfp35.svg)llama-2-7b-chat-int8Text Generation • Meta • HostedQuantized (int8) generative text model with 7 billion parameters from MetaPlanned deprecation](https://developers.cloudflare.com/ai/models/@cf/meta/llama-2-7b-chat-int8/)[![Meta logo](https://developers.cloudflare.com/_astro/meta.BR4nfp35.svg)m2m100-1.2bTranslation • Meta • HostedMultilingual encoder-decoder (seq-to-seq) model trained for Many-to-Many multilingual translationBatch](https://developers.cloudflare.com/ai/models/@cf/meta/m2m100-1.2b/)[![Microsoft logo](https://developers.cloudflare.com/_astro/microsoft.LujcDJ--.svg)resnet-50Image Classification • Microsoft • Hosted50 layers deep image classification CNN trained on more than 1M images from ImageNet](https://developers.cloudflare.com/ai/models/@cf/microsoft/resnet-50/)[![OpenAI logo](https://developers.cloudflare.com/_astro/openai.BI8PEEzI.svg)whisperAutomatic Speech Recognition • OpenAI • HostedWhisper is a general-purpose speech recognition model. It is trained on a large dataset of diverse audio and is also a multitasking model that can perform multilingual speech recognition, speech translation, and language identification.](https://developers.cloudflare.com/ai/models/@cf/openai/whisper/)[![Meta logo](https://developers.cloudflare.com/_astro/meta.BR4nfp35.svg)llama-3.1-70b-instructText Generation • Meta • HostedThe Meta Llama 3.1 collection of multilingual large language models (LLMs) is a collection of pretrained and instruction tuned generative models. 
The Llama 3.1 instruction tuned text only models are optimized for multilingual dialogue use cases and outperform many of the available open source and closed chat models on common industry benchmarks.Planned deprecation](https://developers.cloudflare.com/ai/models/@cf/meta/llama-3.1-70b-instruct/)[![Meta logo](https://developers.cloudflare.com/_astro/meta.BR4nfp35.svg)llama-3.1-8b-instruct-fastText Generation • Meta • Hosted\[Fast version\] The Meta Llama 3.1 collection of multilingual large language models (LLMs) is a collection of pretrained and instruction tuned generative models. The Llama 3.1 instruction tuned text only models are optimized for multilingual dialogue use cases and outperform many of the available open source and closed chat models on common industry benchmarks.](https://developers.cloudflare.com/ai/models/@cf/meta/llama-3.1-8b-instruct-fast/)
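
The bge-reranker-base entry above notes that a reranker's relevance score can be mapped to a float in \[0,1\] with a sigmoid function. The sketch below illustrates only that mapping. The passage names and raw scores are made-up values for illustration, `sigmoid` is a hypothetical helper, and nothing here reflects the actual response shape of any hosted model.

```python
import math


def sigmoid(logit: float) -> float:
    """Map a raw relevance score (logit) to a value between 0 and 1."""
    # Branch on the sign so math.exp() never receives a large positive
    # argument, which would overflow for strongly negative logits.
    if logit >= 0:
        return 1.0 / (1.0 + math.exp(-logit))
    z = math.exp(logit)
    return z / (1.0 + z)


# Hypothetical raw scores for three passages against one query.
raw_scores = {"passage_a": 4.2, "passage_b": -1.3, "passage_c": 0.0}

# Normalize each score, then rank passages from most to least relevant.
normalized = {name: sigmoid(score) for name, score in raw_scores.items()}
ranked = sorted(normalized, key=normalized.get, reverse=True)
```

Because the sigmoid is monotonic, normalizing does not change the ranking order; it only makes scores easier to compare against a fixed threshold.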

