AI Gateway User Stories Draft

Below is a hypothetical architecture for an egress gateway that supports network policies at three scopes: Gateway (global), Backend (per FQDN), and Route (per HTTPRoute or GRPCRoute).

Proposed High-Level Implementation

The proposal reuses existing Gateway API primitives (Gateway, HTTPRoute, GRPCRoute) and introduces a Backend resource derived from this proposal for representing external destinations and cross-cluster endpoints.

The diff for my proposal is here.

Two networking modes are available: Endpoint and Parent. In Endpoint mode the gateway connects directly to requested resources. In Parent mode it treats another gateway as its sole upstream.
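
As a rough illustration of the difference between the two modes, the sketch below picks the upstream for a request. The names `Mode`, `GatewayConfig`, `parent_address`, and `resolve_upstream` are hypothetical and only serve this example; they are not part of the proposal.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Mode(Enum):
    ENDPOINT = "Endpoint"
    PARENT = "Parent"


@dataclass(frozen=True)
class GatewayConfig:
    mode: Mode
    # Hypothetical field: address of the upstream gateway, used only in Parent mode.
    parent_address: Optional[str] = None


def resolve_upstream(config: GatewayConfig, requested_host: str) -> str:
    """Return the host the gateway should open a connection to."""
    if config.mode is Mode.PARENT:
        if not config.parent_address:
            raise ValueError("Parent mode requires parent_address")
        # Parent mode: every request goes to the single parent gateway.
        return config.parent_address
    # Endpoint mode: connect directly to the requested destination.
    return requested_host
```

In Parent mode the requested host would presumably still travel with the request (for example in the Host header) so the parent can route it; that detail is left out here.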

[Figures: Endpoint mode diagram; Parent mode diagram]

Policy Pattern summary

| Policy Layer | Example CRDs / Patterns | Typical Use |
| --- | --- | --- |
| Gateway | GuardRailsPolicy, RegionAllowListPolicy, ObservabilityPolicy | Org-wide posture: geo restrictions, auditing, deny-lists |
| Backend | EgressPolicy, CredentialInjector, BackendTLSPolicy, DNSPolicy, QoSController | Per-destination credentials, rate/QoS, DNS, mTLS |
| Route | PayloadProcessor, ExternalAuth | Per-request parsing, redaction, or access filtering |
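
One way to picture how the three layers combine is a simple merge where the more specific scope wins. This is an assumption for illustration: the draft only states that backend settings override global resolver defaults, and the setting names below are made up.

```python
from typing import Any, Dict


def effective_policy(
    gateway: Dict[str, Any],
    backend: Dict[str, Any],
    route: Dict[str, Any],
) -> Dict[str, Any]:
    """Merge settings so that Route overrides Backend, which overrides Gateway.

    The precedence order is an assumption for this sketch, not a rule from the proposal.
    """
    merged: Dict[str, Any] = {}
    for layer in (gateway, backend, route):  # least specific first
        merged.update(layer)
    return merged


# Hypothetical setting names, for illustration only.
policy = effective_policy(
    gateway={"dns_resolver": "10.0.0.10", "audit": True},
    backend={"dns_resolver": "10.1.0.53"},
    route={"redact_payload": True},
)
```

A real implementation would likely need per-field merge rules (for example, a global deny-list that lower scopes cannot relax), which a flat `update` does not capture.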

[Figure: Policy scopes]

User Stories in this Model

These user stories were taken directly from the AI Gateway Working Group’s egress-gateway proposal.

1. Access to external services

As a gateway admin I need to provide workloads within my cluster access to services outside of my cluster, in particular cloud and otherwise hosted services.

| Field | Value |
| --- | --- |
| Routing mode | Endpoint |
| Route filters | Optional ExternalAuth (allow/deny) |
| Backend policies | CredentialInjector, BackendTLSPolicy, optional RateLimitPolicy |
| Gateway attachments | Optional GuardRailsPolicy |
| Example policy objects | EgressPolicy, BackendTLSPolicy, GuardRailsPolicy |
| Notes | Baseline: HTTPRoute → Backend (FQDN) with credentials and TLS validation. |
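
To make the baseline concrete, here is a sketch that builds the two manifests as plain dictionaries. The HTTPRoute fields (`parentRefs`, `rules`, `backendRefs` with `group`/`kind`/`name`/`port`) follow the Gateway API, but the Backend API group, version, and spec fields (`fqdn`, `port`) are assumptions, since the draft does not define the Backend schema.

```python
from typing import Any, Dict

# Hypothetical API group for the proposed Backend resource.
BACKEND_GROUP = "egress.example.dev"


def backend(name: str, fqdn: str, port: int = 443) -> Dict[str, Any]:
    """Hypothetical Backend manifest for an external FQDN (spec fields are assumed)."""
    return {
        "apiVersion": f"{BACKEND_GROUP}/v1alpha1",
        "kind": "Backend",
        "metadata": {"name": name},
        "spec": {"fqdn": fqdn, "port": port},
    }


def http_route(name: str, gateway: str, backend_manifest: Dict[str, Any]) -> Dict[str, Any]:
    """HTTPRoute attached to the egress gateway that forwards to the Backend."""
    return {
        "apiVersion": "gateway.networking.k8s.io/v1",
        "kind": "HTTPRoute",
        "metadata": {"name": name},
        "spec": {
            "parentRefs": [{"name": gateway}],
            "rules": [{
                "backendRefs": [{
                    "group": BACKEND_GROUP,
                    "kind": "Backend",
                    "name": backend_manifest["metadata"]["name"],
                    "port": backend_manifest["spec"]["port"],
                }],
            }],
        },
    }


external = backend("external-api", "api.example.com")
route = http_route("external-api-egress", "egress-gateway", external)
```

Policies such as BackendTLSPolicy or CredentialInjector would then attach to the Backend rather than to the route.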

2. Central token management

As a gateway admin I need to manage access tokens for 3rd party AI services so workloads can perform inference without managing secrets directly.

| Field | Value |
| --- | --- |
| Routing mode | Endpoint |
| Route filters | |
| Backend policies | CredentialInjector (rotating API keys, STS/OIDC), RateLimitPolicy |
| Gateway attachments | |
| Example policy objects | EgressPolicy, CredentialInjector |
| Notes | Secrets stay centralized; credentials injected dynamically per backend. |
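
A minimal sketch of per-backend credential injection might look like the following. The function name, the `secret_for` lookup, the Bearer scheme, and the choice to drop any workload-supplied `Authorization` header are all assumptions for illustration.

```python
from typing import Callable, Dict, Mapping


def inject_credentials(
    headers: Mapping[str, str],
    backend_name: str,
    secret_for: Callable[[str], str],
) -> Dict[str, str]:
    """Return outbound headers carrying the gateway-managed credential.

    `secret_for` stands in for whatever central store holds the current
    (possibly rotated) token for a backend; it is a hypothetical hook.
    """
    # Drop any Authorization header sent by the workload (assumed behavior),
    # so only the centrally managed credential leaves the cluster.
    outbound = {k: v for k, v in headers.items() if k.lower() != "authorization"}
    outbound["Authorization"] = f"Bearer {secret_for(backend_name)}"
    return outbound
```

Because the secret is looked up on every call, a rotation in the central store takes effect on the next request without any change to the workload.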

3. Cloud fail-over

As a gateway admin providing token management for 3rd party AI cloud services, I need fail-over between providers when the primary fails.

| Field | Value |
| --- | --- |
| Routing mode | Endpoint |
| Route filters | Optional ExternalAuth or health filter |
| Backend policies | Multiple Backend objects with RateLimitPolicy, priority/weight for fail-over |
| Gateway attachments | Optional backoff defaults (GuardRailsPolicy) |
| Example policy objects | EgressPolicy, QoSController, BackendTLSPolicy |
| Notes | Fail-over handled at backend level; avoids duplicate inference. |
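
The fail-over behavior could be sketched as trying backends in priority order and moving on only when a backend was never reached. Everything here is hypothetical: the `Backend` dataclass, the "lower priority value first" convention, and the `BackendUnavailable` exception that models a failure where the request was not processed.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Optional


class BackendUnavailable(Exception):
    """The backend could not be reached, so the request was never processed."""


@dataclass(frozen=True)
class Backend:
    name: str
    priority: int  # lower value is tried first (assumed convention)


def send_with_failover(
    backends: Iterable[Backend],
    send: Callable[[Backend], str],
) -> str:
    """Try each backend in priority order, failing over only on BackendUnavailable.

    Any other exception propagates instead of triggering a retry, because the
    provider may already have run the inference and a retry could duplicate it.
    """
    last_error: Optional[Exception] = None
    for candidate in sorted(backends, key=lambda b: b.priority):
        try:
            return send(candidate)
        except BackendUnavailable as err:
            last_error = err
    raise BackendUnavailable("all backends failed") from last_error
```

Restricting fail-over to "never reached" errors is one way to read the note about avoiding duplicate inference; a real gateway would need to decide which status codes and timeouts count as safe to retry.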

4. Verify external service identity

As a gateway admin providing egress routing to external services, I need to verify the identity of the remote service and enforce authentication.

| Field | Value |
| --- | --- |
| Routing mode | Endpoint or Parent |
| Route filters | |
| Backend policies | BackendTLSPolicy (CA/hostname validation), optional mTLS |
| Gateway attachments | Optional global TLS settings |
| Example policy objects | BackendTLSPolicy, EgressPolicy |
| Notes | Ensures outbound TLS verification and optional mTLS client auth. |

5. Verify client to external service

As a gateway admin providing egress routing to external services, I need to verify the client identity when connecting to the external service.

| Field | Value |
| --- | --- |
| Routing mode | Endpoint or Parent |
| Route filters | |
| Backend policies | mTLS via BackendTLSPolicy.clientCertificateRef |
| Gateway attachments | Optional PKI defaults |
| Example policy objects | BackendTLSPolicy, EgressPolicy |
| Notes | Gateway acts as TLS client; authenticates with per-backend certificate. |
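
What stories 4 and 5 ask of the gateway's TLS client can be approximated with Python's standard `ssl` module. This is only an analogy for the data plane's behavior, not how any particular gateway implements it; the function name and its parameters are made up for the example.

```python
import ssl
from typing import Optional


def outbound_tls_context(
    ca_file: Optional[str] = None,
    client_cert: Optional[str] = None,
    client_key: Optional[str] = None,
) -> ssl.SSLContext:
    """Build a client-side TLS context for one backend.

    ca_file:     custom CA bundle for this destination (system defaults if None)
    client_cert: per-backend client certificate for mTLS (optional)
    client_key:  private key for client_cert (optional)
    """
    # SERVER_AUTH contexts require a valid server certificate and check the hostname.
    ctx = ssl.create_default_context(ssl.Purpose.SERVER_AUTH, cafile=ca_file)
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    if client_cert:
        # Present a client certificate so the remote service can verify the gateway.
        ctx.load_cert_chain(certfile=client_cert, keyfile=client_key)
    return ctx
```

The server-verification half corresponds to story 4, and the `load_cert_chain` call corresponds to the per-backend client certificate in story 5.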

6. Manage custom CAs and CRLs

As a gateway admin, I need to manage certificate authorities for egress connections, including pinning, intermediates, and CRLs.

| Field | Value |
| --- | --- |
| Routing mode | Endpoint or Parent |
| Route filters | |
| Backend policies | BackendTLSPolicy (custom CA bundle, SPKI pins, CRL/OCSP config) |
| Gateway attachments | Optional global CA defaults |
| Example policy objects | BackendTLSPolicy, GuardRailsPolicy |
| Notes | Per-destination trust; centralized CA revocation or pinning. |
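
For the SPKI pins mentioned above, one common encoding is the base64 of the SHA-256 hash of the DER-encoded SubjectPublicKeyInfo, as used by RFC 7469. The sketch below assumes that encoding and assumes the caller has already extracted the SPKI bytes from the peer certificate; the function names are hypothetical.

```python
import base64
import hashlib
import hmac
from typing import Iterable


def spki_pin(spki_der: bytes) -> str:
    """Base64 of SHA-256 over the DER-encoded SubjectPublicKeyInfo."""
    return base64.b64encode(hashlib.sha256(spki_der).digest()).decode("ascii")


def pin_matches(spki_der: bytes, allowed_pins: Iterable[str]) -> bool:
    """True if the presented key matches any configured pin for this backend."""
    presented = spki_pin(spki_der)
    # compare_digest avoids leaking how many leading characters matched.
    return any(hmac.compare_digest(presented, pin) for pin in allowed_pins)
```

Configuring more than one pin per backend leaves room for a backup key, so a key rotation on the provider side does not cut off egress.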

7. Controlled DNS resolution

As a gateway admin providing egress routing to external services, I need to control DNS resolution for these sources and enable reverse DNS checks.

| Field | Value |
| --- | --- |
| Routing mode | Endpoint or Parent |
| Route filters | |
| Backend policies | DNSPolicy (resolver, TTL, reverse-DNS enforcement) |
| Gateway attachments | Optional global resolver config |
| Example policy objects | EgressPolicy, DNSPolicy |
| Notes | Backend overrides global resolver defaults; secured DNS chain. |
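
Reverse-DNS enforcement could take the form of a forward-confirmed reverse DNS check: look up the PTR name for the resolved address, check it against the expected domain, then confirm that name resolves back to the same address. This is one possible interpretation of the DNSPolicy behavior, and the lookups are injected as functions so the sketch runs without a network.

```python
from typing import Callable, List


def reverse_dns_ok(
    ip: str,
    expected_domain: str,
    reverse_lookup: Callable[[str], str],
    forward_lookup: Callable[[str], List[str]],
) -> bool:
    """Forward-confirmed reverse DNS check for a resolved backend address."""
    name = reverse_lookup(ip).rstrip(".").lower()
    domain = expected_domain.rstrip(".").lower()
    # The PTR name must be the expected domain or a subdomain of it.
    if name != domain and not name.endswith("." + domain):
        return False
    # The PTR name must resolve back to the same address.
    return ip in forward_lookup(name)
```

In a live setting the two hooks could wrap `socket.gethostbyaddr` and `socket.getaddrinfo`, pointed at whichever resolver the policy selects.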

8. Dedicated inference cluster

As a cluster admin I need to provide inference to workloads, but through a dedicated cluster for separation.

| Field | Value |
| --- | --- |
| Routing mode | Parent |
| Route filters | Optional ExternalAuth, PayloadProcessor |
| Backend policies | Backend = parent gateway; CredentialInjector, BackendTLSPolicy |
| Gateway attachments | GuardRailsPolicy at parent gateway |
| Example policy objects | EgressPolicy, BackendTLSPolicy, GuardRailsPolicy |
| Notes | Local retries only; parent gateway manages pool routing and guardrails. |

9. Cloud inference access

As a cluster admin I need to provide inference access via cloud services (e.g., Vertex, Bedrock) instead of running models locally.

| Field | Value |
| --- | --- |
| Routing mode | Endpoint |
| Route filters | Optional PayloadProcessor |
| Backend policies | CredentialInjector, BackendTLSPolicy |
| Gateway attachments | Optional RegionAllowList |
| Example policy objects | EgressPolicy, BackendTLSPolicy, GuardRailsPolicy |
| Notes | Managed API egress with centralized credential and TLS handling. |

10. Specialized provider features

As a developer building an inference-enabled app, I need access to AI cloud providers offering unique capabilities.

| Field | Value |
| --- | --- |
| Routing mode | Endpoint |
| Route filters | Optional PayloadProcessor, ExternalAuth |
| Backend policies | Provider-specific headers via CredentialInjector, optional QoS |
| Gateway attachments | Optional ModelDenyList or RegionAllowList |
| Example policy objects | EgressPolicy, CredentialInjector, GuardRailsPolicy |
| Notes | Backend policies abstract provider details and headers. |

11. Local-to-cloud fail-over

As a developer of an inference-enabled app, I need fail-over from local models to 3rd party providers if local workloads fail.

| Field | Value |
| --- | --- |
| Routing mode | Endpoint |
| Route filters | Optional PayloadProcessor |
| Backend policies | Multiple Backend targets (local + remote); Rate/QoS per destination |
| Gateway attachments | Optional GuardRailsPolicy or retry configuration |
| Example policy objects | EgressPolicy, QoSController, BackendTLSPolicy |
| Notes | Weighted fail-over with clear retry behavior to avoid duplication. |

12. Outbound attribution

As a platform operator I need to attribute outbound traffic per namespace or workload to enforce rate or utilization limits.

| Field | Value |
| --- | --- |
| Routing mode | Endpoint or Parent |
| Route filters | |
| Backend policies | |
| Gateway attachments | ObservabilityPolicy (metrics, audit) |
| Example policy objects | ObservabilityPolicy, GuardRailsPolicy |
| Notes | Emit metrics tagged by {gateway, route, backend, ns, sa}. Enables billing and enforcement. |
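
The label set in the notes above maps naturally onto a counter keyed by those five fields. The sketch below uses an in-memory `Counter` to show how per-namespace totals fall out of the labels; the class and method names are hypothetical, and a real gateway would export these through its metrics pipeline instead.

```python
from collections import Counter
from typing import NamedTuple


class Labels(NamedTuple):
    gateway: str
    route: str
    backend: str
    ns: str  # namespace of the calling workload
    sa: str  # service account of the calling workload


class EgressMetrics:
    """In-memory request counter keyed by the full label set."""

    def __init__(self) -> None:
        self._requests: Counter = Counter()

    def record(self, labels: Labels, count: int = 1) -> None:
        self._requests[labels] += count

    def total_for_namespace(self, ns: str) -> int:
        """Sum requests across all gateways, routes, and backends for one namespace."""
        return sum(n for labels, n in self._requests.items() if labels.ns == ns)

    def over_limit(self, ns: str, limit: int) -> bool:
        """True once a namespace has exceeded its (hypothetical) request budget."""
        return self.total_for_namespace(ns) > limit
```

Keeping `ns` and `sa` on every sample is what lets the same data drive both billing reports and enforcement decisions.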

13. Regional compliance

As a compliance engineer I need to ensure outbound traffic to third-party AI resources obeys regulatory restrictions like region locks.

| Field | Value |
| --- | --- |
| Routing mode | Endpoint or Parent |
| Route filters | |
| Backend policies | Optional metadata tag region: eu-west |
| Gateway attachments | RegionAllowListPolicy |
| Example policy objects | GuardRailsPolicy, RegionAllowListPolicy |
| Notes | Gateway rejects connection if destination not in approved region set. |
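
The admission check itself is small. The sketch below assumes the backend's region comes from the optional `region` metadata tag and that a backend without a tag is rejected (fail closed); the draft does not say how untagged backends should be treated, so that choice is an assumption.

```python
from typing import FrozenSet, Optional


def region_allowed(backend_region: Optional[str], approved: FrozenSet[str]) -> bool:
    """Return True only if the backend's region tag is in the approved set.

    Backends with no region tag are rejected (assumed fail-closed behavior).
    """
    return backend_region is not None and backend_region in approved


APPROVED = frozenset({"eu-west", "eu-central"})  # example values only
```

Failing closed means a missing tag shows up as a rejected connection rather than as silent egress to an unknown region.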