Economical compute for open-weight AI

Open-weight AI endpoints, optimized for cost.

Deploy model-powered features with an OpenAI-compatible API, regional routing, and usage-based pricing. Echo keeps open-weight inference practical for production teams.

Create endpoint Try chat

/v1

API shape

Global

Routing

Usage

Billing

OpenAI-compatible

const client = new OpenAI({
  baseURL: "https://api.echo.example/v1",
  apiKey: process.env.ECHO_API_KEY,
});

Regional control

Choose deployment regions for latency and data locality.

Gateway compatible

Use existing OpenAI SDKs, agents, and chat clients without custom adapters.

Cost-aware inference

Run open-weight models with infrastructure designed around usage efficiency.