Moonshot AI Kimi K2.7 Code now available on Workers AI
@cf/moonshotai/kimi-k2.7-code is now available on Workers AI. Kimi K2.7 Code is a code-optimized variant of the Kimi K2 family, built on a Mixture-of-Experts architecture with 1T total parameters and 32B active per token.
K2.7 Code delivers meaningful gains over K2.6 on coding and agentic benchmarks:
- +21.8% on Kimi Code Bench v2
- +11.0% on Program Bench
- +31.5% on MLS Bench Lite
K2.7 Code also uses 30% fewer reasoning tokens than K2.6, which reduces overthinking and lowers inference cost for reasoning-heavy workloads.
Key capabilities include:
- 262.1k token context window for retaining full conversation history, tool definitions, and codebases across long-running agent sessions
- Long-horizon coding with improved instruction following and higher end-to-end coding task success rates
- Vision inputs for processing images alongside text
- Thinking mode with configurable reasoning depth via chat_template_kwargs.thinking
- Multi-turn tool calling for building agents that invoke tools across multiple conversation turns
- Structured outputs with JSON schema support
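As a rough illustration of how the thinking and structured-output options might fit into one request, here is a minimal TypeScript sketch. Only the chat_template_kwargs.thinking path comes from this announcement. The boolean value, the response_format shape, and the buildKimiRequest helper are assumptions made for illustration, so check the model page for the actual schema.

```typescript
// Sketch only: everything except the chat_template_kwargs.thinking path is an
// assumed, OpenAI-style request shape, not a confirmed schema.
type ChatMessage = { role: "system" | "user" | "assistant"; content: string };

interface KimiRequest {
  messages: ChatMessage[];
  // Assumption: a boolean toggles thinking; the model may accept other values.
  chat_template_kwargs: { thinking: boolean };
  // Assumption: structured outputs are requested by passing a JSON schema here.
  response_format?: { type: "json_schema"; json_schema: Record<string, unknown> };
}

// buildKimiRequest is a hypothetical helper, not part of any SDK.
function buildKimiRequest(
  prompt: string,
  thinking: boolean,
  schema?: Record<string, unknown>,
): KimiRequest {
  const request: KimiRequest = {
    messages: [{ role: "user", content: prompt }],
    chat_template_kwargs: { thinking },
  };
  // Attach the schema only when the caller wants structured output.
  if (schema !== undefined) {
    request.response_format = { type: "json_schema", json_schema: schema };
  }
  return request;
}

// Example: ask for a JSON code review with thinking switched off.
const reviewRequest = buildKimiRequest("Review this function: ...", false, {
  type: "object",
  properties: {
    summary: { type: "string" },
    bugs: { type: "array", items: { type: "string" } },
  },
  required: ["summary"],
});
console.log(JSON.stringify(reviewRequest, null, 2));
```

The resulting object is intended to be passed as the inputs argument to whichever endpoint you use.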
If you are migrating from Kimi K2.6, note the following:
- K2.7 Code is optimized for coding tasks with improved benchmark performance and reasoning efficiency
- Cached input token pricing is $0.19 per M tokens (vs $0.16 for K2.6)
- API usage is identical, so no parameter changes are required
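To put the cached input pricing difference in concrete terms, the sketch below compares the two rates for a given token volume. The prices come from the list above. The 50M token volume and the cachedInputCost helper are hypothetical, and other pricing dimensions are not covered.

```typescript
// Prices in USD per million cached input tokens, from the migration notes.
const CACHED_INPUT_PRICE_PER_M = { "k2.6": 0.16, "k2.7-code": 0.19 } as const;

type PricedModel = keyof typeof CACHED_INPUT_PRICE_PER_M;

// Hypothetical helper: cost in USD for a number of cached input tokens.
function cachedInputCost(model: PricedModel, cachedTokens: number): number {
  return (cachedTokens / 1_000_000) * CACHED_INPUT_PRICE_PER_M[model];
}

// Hypothetical volume: 50M cached input tokens.
const cachedTokens = 50_000_000;
const delta =
  cachedInputCost("k2.7-code", cachedTokens) - cachedInputCost("k2.6", cachedTokens);
console.log(`Extra cached-input cost: $${delta.toFixed(2)}`);
```

At that volume the difference is 50 × $0.03, or about $1.50.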
Use Kimi K2.7 Code through the Workers AI binding (env.AI.run()), the REST API at /ai/run, or the OpenAI-compatible endpoint at /v1/chat/completions. You can also use AI Gateway with any of these endpoints.
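As a minimal sketch of the first two options, the TypeScript below sends a single prompt through the binding and builds the REST URL for the same model. The AiBinding interface is a simplified stand-in for the real binding type, and askKimi and restRunUrl are hypothetical helper names. The messages input shape and the response type are assumptions here, so refer to the model page for the exact schema.

```typescript
const MODEL = "@cf/moonshotai/kimi-k2.7-code";

// Simplified stand-in for the Workers AI binding type (assumption: only the
// run() method is modeled, and the response is left as unknown).
interface AiBinding {
  run(model: string, inputs: Record<string, unknown>): Promise<unknown>;
}

// Hypothetical helper: sends one user prompt through the binding.
function askKimi(ai: AiBinding, prompt: string): Promise<unknown> {
  return ai.run(MODEL, { messages: [{ role: "user", content: prompt }] });
}

// Hypothetical helper: builds the REST API URL for the /ai/run endpoint.
// accountId is your Cloudflare account ID.
function restRunUrl(accountId: string): string {
  return `https://api.cloudflare.com/client/v4/accounts/${accountId}/ai/run/${MODEL}`;
}
```

Inside a Worker's fetch handler you would call askKimi(env.AI, "..."), where AI is the binding name configured for the Worker. REST calls to the URL from restRunUrl also need an Authorization header carrying an API token.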
For more information, refer to the Kimi K2.7 Code model page and pricing.