Run serverless AI models on Cloudflare's global network. 50+ models, built-in observability, and a vector database, all in one platform.
From inference to observability to vector search: everything you need to build production AI applications, running on Cloudflare's global network.
Run 50+ open-source models on serverless GPUs across 330+ cities. No infrastructure to manage: just deploy and scale.
AI Gateway gives you caching, rate limiting, request retries, model fallback, and real-time analytics for every AI call.
Vectorize enables semantic search, recommendations, and RAG at the edge. Built on Cloudflare's network for low latency.
Your data stays on Cloudflare's network. No third-party hops. Built-in Firewall for AI detects prompt injections and unsafe content.
Pay per inference. No idle costs. Auto-scales from zero to global traffic with zero warm-up time.
Workers + AI Gateway + Vectorize + R2 + D1 + KV: every piece of the platform integrates natively. No glue code needed.
Three products that work together to power your AI applications end-to-end.
Run machine learning models on Cloudflare's network, with no servers or GPUs to manage. Invoke models from Workers, Pages, or directly via the API.
Llama 3.1 · Mistral · DeepSeek · Gemma · Whisper · Stable Diffusion · BGE Embeddings
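As a rough illustration of calling a model "directly via the API", the sketch below builds a request for the Workers AI REST endpoint. The helper name `buildWorkersAiRequest`, the account ID, and the token are hypothetical placeholders, and the model ID is only one example; check the current API reference for the exact request body a given model expects.

```typescript
// Minimal sketch: describe an HTTP request to the Workers AI REST endpoint.
// buildWorkersAiRequest is a hypothetical helper, not part of any SDK.
interface WorkersAiRequest {
  url: string;
  init: {
    method: string;
    headers: Record<string, string>;
    body: string;
  };
}

function buildWorkersAiRequest(
  accountId: string, // placeholder: your Cloudflare account ID
  apiToken: string, // placeholder: an API token with Workers AI access
  model: string, // e.g. "@cf/meta/llama-3.1-8b-instruct"
  prompt: string,
): WorkersAiRequest {
  return {
    url: `https://api.cloudflare.com/client/v4/accounts/${accountId}/ai/run/${model}`,
    init: {
      method: "POST",
      headers: {
        Authorization: `Bearer ${apiToken}`,
        "Content-Type": "application/json",
      },
      // Text-generation models accept a simple prompt payload (assumed here).
      body: JSON.stringify({ prompt }),
    },
  };
}
```

The returned `url` and `init` can be passed to `fetch(url, init)`. Inside a Worker, the same call is typically made through an AI binding instead, so no token handling is needed in your code.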
A unified gateway for all your AI API calls. Cache responses, enforce rate limits, retry failures, and monitor usage across any provider.
One endpoint · Any provider · Full observability
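To make "one endpoint, any provider" concrete, here is a small sketch of how a gateway URL might be assembled. It assumes the gateway URL pattern of account ID, gateway ID, then provider path, and an optional per-request cache TTL header. The helper name `buildGatewayRequest` and all IDs are hypothetical; verify the header name against the current AI Gateway documentation.

```typescript
// Minimal sketch: point an existing provider call at an AI Gateway endpoint.
// buildGatewayRequest is a hypothetical helper, not part of any SDK.
interface GatewayRequest {
  url: string;
  headers: Record<string, string>;
}

function buildGatewayRequest(
  accountId: string, // placeholder: your Cloudflare account ID
  gatewayId: string, // placeholder: the name of your gateway
  providerPath: string, // e.g. "openai/chat/completions"
  apiKey: string, // the upstream provider's API key
  cacheTtlSeconds?: number, // optional: ask the gateway to cache the response
): GatewayRequest {
  const base = "https://gateway.ai.cloudflare.com/v1";
  // Tolerate a leading slash so the joined URL has no double slashes.
  const path = providerPath.replace(/^\/+/, "");
  const headers: Record<string, string> = {
    Authorization: `Bearer ${apiKey}`,
    "Content-Type": "application/json",
  };
  if (cacheTtlSeconds !== undefined) {
    // Assumed header name for per-request cache TTL; confirm before relying on it.
    headers["cf-aig-cache-ttl"] = String(cacheTtlSeconds);
  }
  return { url: `${base}/${accountId}/${gatewayId}/${path}`, headers };
}
```

Switching providers then means changing only `providerPath` (for example to `workers-ai/@cf/meta/llama-3.1-8b-instruct`), while the gateway keeps applying the same caching, rate limiting, and logging.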
Build AI applications with semantic memory. Vectorize is Cloudflare's native vector database: no external services, no data leaving the network.
Semantic search · RAG · Recommendations
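For readers new to vector search, the sketch below shows the core idea behind semantic search and RAG retrieval: rank stored embeddings by cosine similarity to a query embedding and keep the top matches. This is an in-memory illustration only, not Vectorize's API or implementation. The names `StoredVector`, `cosineSimilarity`, and `topK` are hypothetical, and a real application would get its embeddings from a model such as BGE.

```typescript
// Minimal in-memory sketch of top-K similarity search (illustrative only).
interface StoredVector {
  id: string;
  values: number[];
}

interface Match {
  id: string;
  score: number;
}

// Cosine similarity: dot product divided by the product of vector lengths.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("vectors must have the same dimension");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  // A zero vector has no direction, so treat its similarity as 0.
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Score every stored vector against the query and return the k best matches.
function topK(index: StoredVector[], query: number[], k: number): Match[] {
  return index
    .map((v) => ({ id: v.id, score: cosineSimilarity(v.values, query) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}
```

A vector database performs this kind of lookup as a managed service over much larger indexes, typically with approximate search rather than the brute-force scan shown here.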
Cloudflare AI Week 2025 brought major updates across the platform. Here's what shipped.
State-of-the-art image generation models now available on Workers AI. Generate and edit images at the edge.
Text-to-speech and speech-to-text models from Deepgram: build voice applications entirely on Cloudflare.
Detect prompt injections, unsafe content, and shadow AI usage. Stop harmful requests before they reach the model.
Cloudflare is a Day 0 launch partner for OpenAI's new open-weight models, available directly on Workers AI.
AI Gateway now supports dynamic model routing: automatically route requests to the best-performing or cheapest model.
Workers AI is Generally Available with enterprise-grade SLAs, higher rate limits, and dedicated support options.
From startups to enterprises: real applications powered by Cloudflare AI.
Deploy a smart chatbot that answers product questions, processes returns, and escalates to humans, all at the edge with sub-100ms response times.
Workers AI + AI Gateway
Replace keyword search with vector embeddings. Let customers search by meaning, not just keywords. 10x improvement in discovery rates.
Vectorize + BGE Embeddings
Use Llama Guard on Workers AI to automatically flag unsafe content. Cache moderation results with AI Gateway to reduce costs by 60%.
Workers AI + AI Gateway
Build an image generation service using Leonardo AI on Workers AI. Generate, transform, and serve images, all on Cloudflare's network.
Workers AI + R2 + Images
Index your docs with Vectorize, embed queries with Workers AI, and answer with Llama. Production-ready AI docs in under 100 lines of code.
Vectorize + Workers AI
Translate content into 50+ languages at the edge. Use AI Gateway to cache frequent translations and monitor translation costs across your org.
Workers AI + AI Gateway
No servers. No GPUs to manage. Just your code and Cloudflare's global network. Start with the free tier, no credit card required.