You already call the Anthropic and OpenAI APIs directly. This is the map of what changes when you go through Microsoft's platform instead — and where the real differences hide.
The single most important thing to fix in your head first: "Azure AI services" is not a competitor to Anthropic or OpenAI. Anthropic and OpenAI make models. Microsoft's offering is a platform you reach those models through — plus governance, tooling, and billing wrapped around them. They sit at different layers of the stack.
If you last looked a year or two ago, the names have shifted. The umbrella product is now Microsoft Foundry. The old names map like this [1]:
| You may know it as… | It's now called… |
|---|---|
| Azure AI Studio / Azure AI Foundry | Microsoft Foundry |
| Azure AI Services (Vision, Speech, Language…) | Foundry Tools |
| Azure OpenAI Service | Folded into Foundry (the OpenAI models live in the Foundry Models catalog) |
| Hub + Azure OpenAI + AI Services resources | A single Foundry resource, organised into projects |
So when someone says "we're evaluating Azure AI," in 2026 they almost always mean Microsoft Foundry. We'll use that name from here.
Foundry is a platform-as-a-service for building AI applications: production infrastructure plus interfaces, so you focus on the app rather than plumbing [1]. Its pieces:
Layered on top: built-in tracing, monitoring, evaluations, a 1,400+ tool catalog, agent memory, and publishing to Microsoft 365 / Teams [1]. That whole second list is the part you'd otherwise build yourself on top of a raw API.
Before reading on, finish this sentence out loud: "Anthropic/OpenAI sit at the ___ layer; Microsoft Foundry sits at the ___ layer." If you can't yet, re-read the lead paragraph — this distinction is the spine of everything below.
Here's what makes "Azure vs. Anthropic/OpenAI" a slightly wrong question. As of late 2025/2026, Claude models run on Microsoft Foundry, which makes Claude the first frontier model available on all three big clouds (AWS, Azure, GCP) [3]. So the choice usually isn't "Azure or Claude." It's "Claude (and GPT-5, and Grok, and Llama…) direct, or the same models via Foundry's wrapper."
Concretely, calling Claude through Foundry looks almost like calling it direct (same messages API shape), but the endpoint, auth, and billing are Azure's [2]:
| Dimension | Claude direct | Claude via Foundry |
|---|---|---|
| Endpoint | api.anthropic.com/v1/* | {resource}.services.ai.azure.com/anthropic/v1/* |
| Auth | Anthropic API key (x-api-key) | Azure API key or Microsoft Entra ID (RBAC tokens) |
| Billing | Anthropic invoice | Your Azure subscription / Microsoft Marketplace (at Anthropic's standard API prices) |
| SDK | anthropic | AnthropicFoundry client / @anthropic-ai/foundry-sdk |
| Feature coverage | Full API, standard rate-limit headers, Batches API | No Message Batches / Models / Admin / Compliance APIs; rate-limit headers absent, so use Azure Monitor instead |
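The endpoint and auth rows can be made concrete with a small sketch. It only assembles the URL and headers for each path and sends nothing over the network. The `api-key` header name for the Azure key, the `Bearer` form for an Entra ID token, and the `anthropic-version` value are assumptions based on common Azure and Anthropic conventions, and the resource name `my-resource` is hypothetical. Check the current Foundry documentation before relying on any of them.

```python
from dataclasses import dataclass
from typing import Optional

# Assumed version header value; confirm against the current API docs.
ANTHROPIC_VERSION = "2023-06-01"


@dataclass(frozen=True)
class MessagesRequest:
    """URL and headers for a Messages API call (nothing is sent)."""
    url: str
    headers: dict


def direct_request(api_key: str) -> MessagesRequest:
    """Claude direct: Anthropic endpoint, Anthropic key in x-api-key."""
    return MessagesRequest(
        url="https://api.anthropic.com/v1/messages",
        headers={
            "x-api-key": api_key,
            "anthropic-version": ANTHROPIC_VERSION,
            "content-type": "application/json",
        },
    )


def foundry_request(
    resource: str,
    *,
    api_key: Optional[str] = None,
    entra_token: Optional[str] = None,
) -> MessagesRequest:
    """Claude via Foundry: Azure endpoint, Azure key or Entra ID token."""
    if (api_key is None) == (entra_token is None):
        raise ValueError("pass exactly one of api_key or entra_token")
    headers = {
        "anthropic-version": ANTHROPIC_VERSION,
        "content-type": "application/json",
    }
    if api_key is not None:
        headers["api-key"] = api_key  # assumed header name for Azure keys
    else:
        headers["authorization"] = f"Bearer {entra_token}"  # assumed Entra form
    return MessagesRequest(
        url=f"https://{resource}.services.ai.azure.com/anthropic/v1/messages",
        headers=headers,
    )


if __name__ == "__main__":
    print(direct_request("ANTHROPIC_KEY").url)
    print(foundry_request("my-resource", api_key="AZURE_KEY").url)
```

The request body would keep the same messages shape on both paths; only the host, the credential, and who sends the invoice change.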
In this preview integration, Claude on Foundry still runs on Anthropic's own infrastructure, not Azure's regional compute [2]. Foundry is a commercial / billing integration here. So the usual "my data never leaves my Azure region" promise does not automatically hold for Claude the way it does for, say, Azure-hosted GPT models: EU-region native inference for Claude was still listed as coming. If data residency is the whole reason someone wants Azure, you must check it per model, not assume it across the catalog.
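One way to keep the per-model check honest is to record residency status explicitly and treat anything unrecorded as unverified. The sketch below is purely illustrative: the model labels and the `RESIDENCY_NOTES` table are hypothetical placeholders that reflect only the two cases named above, not a real catalog lookup.

```python
from enum import Enum


class Residency(Enum):
    IN_REGION = "inference stays in the designated Azure region"
    OUTSIDE_AZURE = "inference runs on the provider's own infrastructure"
    UNVERIFIED = "not checked yet"


# Hypothetical notes a team might keep; labels are illustrative, not catalog IDs.
RESIDENCY_NOTES = {
    "claude-via-foundry-preview": Residency.OUTSIDE_AZURE,
    "azure-hosted-gpt": Residency.IN_REGION,
}


def residency_status(model: str) -> Residency:
    """Look up a model; anything not recorded counts as unverified."""
    return RESIDENCY_NOTES.get(model, Residency.UNVERIFIED)


def allowed_under_strict_residency(model: str) -> bool:
    """Only models confirmed in-region pass a strict residency rule."""
    return residency_status(model) is Residency.IN_REGION


if __name__ == "__main__":
    for name in ("claude-via-foundry-preview", "azure-hosted-gpt", "some-other-model"):
        print(name, allowed_under_strict_residency(name))
```

The design choice that matters is the default: a model missing from the table fails the check instead of inheriting the platform's general promise.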
You told me all four matter. Here's the orientation pass on each — where Foundry genuinely differs from going direct.
The classic reason orgs pick a cloud over a direct API. With Foundry, your prompts and completions are not used to train Microsoft's or OpenAI's models, not shared with OpenAI, and stay private to your tenant [4]. Data at rest stays in your customer-designated geography; it's covered by Azure's contractual commitments, GDPR (Microsoft as processor), and a HIPAA BAA [4]. On top of that you get Entra ID + RBAC, private networking (VNet), and Azure Policy.
vs. direct: Anthropic/OpenAI offer their own enterprise terms and zero-retention options too — but Foundry folds AI into the same identity, networking, and compliance fabric an enterprise already runs on Azure. That "one control plane" story is the pull. Caveat from above applies per model.
Direct = one vendor's models. Foundry = a 1,900+ model catalog spanning many providers (GPT-5, Claude, Grok, Mistral, Llama, DeepSeek, Phi…) behind one endpoint, one SDK, one bill [1]. You can even route across models with a single API, or call models by name without pre-deploying [1].
vs. direct: Going direct means integrating (and contracting with) each provider separately. Foundry's value is provider-portability: swap GPT-5 for Claude with a config change, not a new vendor relationship.
This is the biggest "you'd otherwise build it yourself" gap. Foundry ships a managed Agent Service (multi-agent orchestration), a 1,400+ tool catalog, agent memory, Foundry IQ for grounded RAG, and built-in evaluations + observability [1].
vs. direct: With raw APIs you assemble RAG, tool-calling glue, memory, eval harnesses, and tracing from libraries yourself. Foundry trades flexibility for a batteries-included, governed path. The trade is control vs. time-to-production.
Direct APIs are pay-per-token. Foundry offers the same pay-as-you-go model and adds Provisioned Throughput Units (PTUs): you reserve a block of capacity billed at $/PTU/hour regardless of tokens used, with Azure Reservations discounting sustained usage [5, 6].
vs. direct: PTUs buy predictable cost + guaranteed throughput/latency for steady, high-volume workloads, which per-token pricing can't offer. Pay-as-you-go suits spiky/variable load. (Note: Claude-via-Foundry currently bills at Anthropic's standard prices through the Marketplace [2].)
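The PTU trade-off comes down to break-even arithmetic: a reservation costs the same every hour, so it wins once hourly token volume is high enough. The sketch below assumes a single blended per-token price and ignores the throughput ceiling of each PTU; every price in it is a made-up placeholder, not a real Azure or Anthropic rate.

```python
def payg_cost_per_hour(tokens_per_hour: float, price_per_million: float) -> float:
    """Pay-as-you-go: cost scales linearly with tokens used."""
    return tokens_per_hour / 1_000_000 * price_per_million


def ptu_cost_per_hour(ptus: int, price_per_ptu_hour: float) -> float:
    """Provisioned: fixed hourly cost, independent of tokens used."""
    return ptus * price_per_ptu_hour


def break_even_tokens_per_hour(
    ptus: int, price_per_ptu_hour: float, price_per_million: float
) -> float:
    """Hourly token volume at which both options cost the same."""
    return ptu_cost_per_hour(ptus, price_per_ptu_hour) / price_per_million * 1_000_000


if __name__ == "__main__":
    # Hypothetical numbers, chosen only to make the arithmetic easy to follow.
    ptus, ptu_price, token_price = 10, 1.00, 2.00
    threshold = break_even_tokens_per_hour(ptus, ptu_price, token_price)
    print(f"Reservation costs ${ptu_cost_per_hour(ptus, ptu_price):.2f}/hour")
    print(f"Break-even at {threshold:,.0f} tokens/hour")
```

Below the break-even volume the reservation pays for idle capacity; above it, per-token billing costs more, provided the reserved PTUs can actually serve that load.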
| Axis | Direct Anthropic / OpenAI | Via Microsoft Foundry |
|---|---|---|
| What it is | A model provider | A platform layer over many providers |
| Models | That vendor's models only | 1,900+ across vendors, one endpoint |
| Identity | Provider API key | Entra ID + Azure RBAC (or key) |
| Compliance | Provider's own terms | Azure fabric: tenant isolation, geo, BAA, VNet |
| Agents / RAG / evals | You build it | Managed (Agent Service, IQ, observability) |
| Pricing options | Per-token | Per-token + PTU reservations |
| Billing | Provider invoice | Your Azure bill / Marketplace |
| Best when… | Max flexibility, one model, fast start, no Azure footprint | Already on Azure, need governance, multi-model, steady scale |
No code, just judgement.
1. A colleague says: "We should use Azure AI instead of Claude." What's the most useful correction?
2. A health-tech firm wants Claude but insists "no data may leave our designated region, ever." Going through Foundry, what must you verify first?
3. A team runs a steady 24/7 workload at high, predictable volume and wants cost certainty and latency guarantees. Which Foundry lever fits?
This was the orientation pass across all four axes. The natural next explainer is a deep-dive on whichever axis your real decision hinges on — most likely enterprise & compliance (the residency/per-model nuance has the most teeth) or the commercial model (PTU economics vs. per-token, with real numbers). Tell me which and I'll build it.