Hello! How can I help?

Agent

/chat

/image

/audio

/embeddings

ChatGPT

Claude

Spend tracking

Accurately charge users and agents for their usage.

Budget and rate limits

Set budgets on users and agents.

OpenAI format

Call all major LLM providers in the OpenAI format.

LLM fallbacks

Clients are not impacted even during an outage.

Why teams use the Models API

Platform teams adopt a single, unified API to standardize access, observe spend, and avoid bespoke integrations for every new model. The Models API brings that pattern to monday.com: your apps call familiar endpoints (/chat/completions, /embeddings, /images/generations, /audio/speech), while monday handles routing to upstream providers and reconciles usage against your workspace.

Another reason is compliance, legal, and governance. Instead of creating new business relationships, security reviews, and procurement cycles with each LLM vendor, you access models through your existing monday subscription and the vendor relationship your company has already approved with monday.com (IT, legal, procurement, and security). You get broad model coverage without multiplying third party agreements and onboarding steps.

Pricing and metering

The Models API itself has no separate subscription fee. Usage draws down monday AI tokens, which are intended to approximate the cost of the underlying model to the best of our ability. Because underlying model costs can change over time, token consumption may vary and may reflect a small additional cost depending on the model used.

Simpler alternative: `run_prompt` (GraphQL)

If you only need single-turn text completions, the run_prompt mutation is a simplified, GraphQL-native entry point to the same AI gateway. You send a prompt and get the generated text back in the same GraphQL request as the rest of your integration — no separate base URL or OpenAI-compatible client required.

Because it wraps the same gateway, the same access requirements and monday AI token consumption apply. For multi-turn conversations, streaming, tools / function calling, embeddings, images, or audio, use the Models API instead. The mutation is available in API versions 2026-10 and later — see the AI reference.