Morph Models
Fast general models for agent loops
Run the primary agent loop on fast, OpenAI-compatible coding models served on Morph's custom kernels. One API for chat, code generation, and reasoning.
Frontier coding models, served on custom kernels
Output speed
Codegen-specific optimizations and custom GPU kernels. Up to 200 tok/s on Qwen 3.5 397B.
One OpenAI-compatible API
Point your existing OpenAI client at https://api.morphllm.com/v1. Switch models by changing the model string.

import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "https://api.morphllm.com/v1",
  apiKey: process.env.MORPH_API_KEY,
});

const res = await client.chat.completions.create({
  model: "morph-qwen35-397b",
  messages: [{ role: "user", content: "Refactor this function..." }],
});

The lineup
Open-weight frontier models with long context, served and billed per token. No per-seat fees.

// Available general models
morph-qwen35-397b     // 397B MoE, 262k context
morph-minimax27-230b  // 230B MoE, agentic workflows
morph-dsv4flash       // 393k context, fast
morph-qwen36-27b      // dense, low latency

Built for production agent workloads
Inference optimized for coding agents
Every agent will write code. We bet the stack on it.
So we tune every layer for that one workload: custom GPU kernels, speculative decoding shaped around code, and serving built for the agent loop instead of general chat. Not general infrastructure with code bolted on.
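
For readers unsure what "the agent loop" means in practice, here is a minimal sketch of one possible shape: call the model, act on its reply, feed the result back, repeat. It is an illustration under stated assumptions, not Morph's serving code or an official client. The names runAgentLoop, Complete, Act, ChatClient, and viaClient are hypothetical, and the "DONE" stop signal is an invented convention. Only the model strings come from the lineup.

```typescript
// Minimal agent-loop sketch. All names here are hypothetical, chosen for illustration.

type Message = { role: "system" | "user" | "assistant"; content: string };

// Transport: given a model name and the history so far, return the reply text.
type Complete = (model: string, messages: Message[]) => Promise<string>;

// Step handler: act on the model's reply (run a tool, apply an edit, ...) and
// return an observation to feed back, or null when the task is finished.
type Act = (reply: string) => string | null;

async function runAgentLoop(
  complete: Complete,
  act: Act,
  task: string,
  model: string = "morph-qwen35-397b", // swap models by changing this one string
  maxTurns: number = 8,                // hard cap so a confused model cannot loop forever
): Promise<Message[]> {
  const history: Message[] = [{ role: "user", content: task }];
  for (let turn = 0; turn < maxTurns; turn++) {
    const reply = await complete(model, history);
    history.push({ role: "assistant", content: reply });
    const observation = act(reply);
    if (observation === null) break;
    history.push({ role: "user", content: observation });
  }
  return history;
}

// Structural type covering only the part of an OpenAI-style client used below.
// Assumption: the real client may need a cast to satisfy it.
type ChatClient = {
  chat: {
    completions: {
      create(args: { model: string; messages: Message[] }): Promise<{
        choices: { message: { content: string | null } }[];
      }>;
    };
  };
};

// Adapt a chat-completions client into the Complete transport.
const viaClient = (client: ChatClient): Complete => async (model, messages) => {
  const res = await client.chat.completions.create({ model, messages });
  return res.choices[0]?.message.content ?? "";
};
```

A loop like this makes one model call per turn, so per-call latency and output speed add up across the whole task. Passing the model name as a parameter means a smaller model such as morph-qwen36-27b can be tried by changing only that argument.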
Get $10 in free credits when you sign up today.
Get API Key
No credit card required. Pay only for what you use after that.