LLM Client Configuration for Kubernetes

Configure AI coding tools and applications to use agentgateway running in Kubernetes.

Overview

When agentgateway is deployed in Kubernetes, clients connect to the Gateway’s ingress or service endpoint. This guide shows how to get your gateway URL and configure popular AI tools to use it.

Get your gateway URL

Before configuring clients, you need to determine your agentgateway endpoint URL.

Option 1: Load Balancer (Cloud deployments)

If you deployed agentgateway with a LoadBalancer service, get the external IP:

export INGRESS_GW_ADDRESS=$(kubectl get svc -n agentgateway agentgateway-proxy \
  -o jsonpath='{.status.loadBalancer.ingress[0].ip}')
echo "Gateway URL: http://$INGRESS_GW_ADDRESS"

For cloud providers that assign a hostname instead of an IP address:

export INGRESS_GW_ADDRESS=$(kubectl get svc -n agentgateway agentgateway-proxy \
  -o jsonpath='{.status.loadBalancer.ingress[0].hostname}')
echo "Gateway URL: http://$INGRESS_GW_ADDRESS"
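If you script this lookup, one approach is to read the Service as JSON and accept whichever field the provider filled in. The following is a minimal sketch, not part of agentgateway: the helper name `extract_lb_address` is hypothetical, and the sample object only imitates the shape of `kubectl get svc -n agentgateway agentgateway-proxy -o json` output.

```python
import json
from typing import Optional


def extract_lb_address(service: dict) -> Optional[str]:
    """Return the first load balancer IP or hostname of a Service, or None."""
    ingress = service.get("status", {}).get("loadBalancer", {}).get("ingress") or []
    for entry in ingress:
        # Providers populate either "ip" or "hostname", so check both.
        address = entry.get("ip") or entry.get("hostname")
        if address:
            return address
    return None  # No external address assigned yet.


# Made-up sample with the same structure as the kubectl JSON output.
sample = json.loads(
    '{"status": {"loadBalancer": {"ingress": [{"hostname": "lb.example.com"}]}}}'
)
print("Gateway URL: http://" + str(extract_lb_address(sample)))
```

A return value of `None` usually means the LoadBalancer is still being provisioned.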

Option 2: Port-forward (Local testing)

For local development or testing:

kubectl port-forward -n agentgateway svc/agentgateway-proxy 8080:80

Gateway URL: http://localhost:8080

Option 3: Ingress (Production)

If using an Ingress controller with a custom domain:

kubectl get ingress -n agentgateway agentgateway-ingress -o jsonpath='{.spec.rules[0].host}'

Gateway URL: https://gateway.example.com (or your configured domain)

Configure clients

Once you have your gateway URL, configure your AI clients. The base URL for OpenAI-compatible endpoints is:

Format: <GATEWAY_URL>/<ROUTE_PATH>

Where <ROUTE_PATH> is the path you configured in your HTTPRoute resource (e.g., /openai, /anthropic, /ollama).
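A stray or missing slash between the two parts is an easy mistake to make. As an illustration, the hypothetical helper below (not an agentgateway API) joins a gateway URL and a route path into a base URL with exactly one slash between them.

```python
def build_base_url(gateway_url: str, route_path: str) -> str:
    """Join <GATEWAY_URL> and <ROUTE_PATH> with exactly one slash between them."""
    return gateway_url.rstrip("/") + "/" + route_path.strip("/")


print(build_base_url("http://localhost:8080/", "/openai"))
```

The result can be passed as the base URL in any of the client configurations in this guide.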

Cursor

Open Cursor Settings → Models
Add a custom model:
- API Base URL: http://$INGRESS_GW_ADDRESS/openai (substitute your gateway address and route path)
- API Key: Your gateway API key if authentication is configured; otherwise any placeholder value
- Model Name: A model from your AIBackend (e.g., gpt-4o-mini)

Or via settings JSON:

{
  "cursor.models": [
    {
      "name": "k8s-gateway",
      "apiBase": "http://your-gateway-ip/openai",
      "apiKey": "anything",
      "model": "gpt-4o-mini"
    }
  ]
}

VS Code Continue

Edit ~/.continue/config.json:

{
  "models": [
    {
      "title": "Kubernetes Gateway",
      "provider": "openai",
      "model": "gpt-4o-mini",
      "apiBase": "http://your-gateway-ip/openai",
      "apiKey": "anything"
    }
  ]
}

OpenAI SDK (Python)

from openai import OpenAI

client = OpenAI(
    base_url="http://your-gateway-ip/openai",  # Your Gateway URL + route path
    api_key="anything",  # Or your gateway API key
)

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Hello from Kubernetes!"}]
)

print(response.choices[0].message.content)

OpenAI SDK (Node.js)

import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "http://your-gateway-ip/openai",  // Your Gateway URL + route path
  apiKey: "anything",  // Or your gateway API key
});

const response = await client.chat.completions.create({
  model: "gpt-4o-mini",
  messages: [{ role: "user", content: "Hello from Kubernetes!" }]
});

console.log(response.choices[0].message.content);

curl

curl "http://your-gateway-ip/openai" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o-mini",
    "messages": [
      {"role": "user", "content": "Hello from Kubernetes!"}
    ]
  }' | jq

Authentication

If you configured authentication policies on your Gateway, clients must send credentials with each request.

API Key authentication

Include the API key in requests:

# curl
curl "http://your-gateway-ip/openai" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -d '{"model": "gpt-4o-mini", "messages": [{"role": "user", "content": "Hello from Kubernetes!"}]}'

# Python SDK
from openai import OpenAI

client = OpenAI(
    base_url="http://your-gateway-ip/openai",
    api_key="YOUR_API_KEY",  # Sent as "Authorization: Bearer YOUR_API_KEY"
)

JWT authentication

If using JWT tokens, include them in the Authorization header:

curl "http://your-gateway-ip/openai" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_JWT_TOKEN" \
  -d '{"model": "gpt-4o-mini", "messages": [{"role": "user", "content": "Hello from Kubernetes!"}]}'
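API keys and JWTs travel in the same Authorization header, so a client without an SDK can treat them identically. The sketch below builds the equivalent request using only the Python standard library. The function name `build_chat_request` is hypothetical, and the URL and token are placeholders. The request is constructed but not sent.

```python
import json
import urllib.request


def build_chat_request(url: str, token: str, prompt: str,
                       model: str = "gpt-4o-mini") -> urllib.request.Request:
    """Build a POST request that carries an API key or JWT as a Bearer token."""
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


# Placeholder URL and token; substitute your gateway URL and credential.
req = build_chat_request("http://localhost:8080/openai", "YOUR_API_KEY",
                         "Hello from Kubernetes!")
print(req.get_method(), req.full_url)
```

To send the request, pass `req` to `urllib.request.urlopen` and decode the JSON response body.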

TLS/HTTPS

For production deployments with TLS:

Configure an Ingress with TLS:

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: agentgateway-ingress
  namespace: agentgateway
  annotations:
    cert-manager.io/cluster-issuer: letsencrypt-prod
spec:
  tls:
  - hosts:
    - gateway.example.com
    secretName: agentgateway-tls
  rules:
  - host: gateway.example.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: agentgateway-proxy
            port:
              number: 80

Update client URLs to use HTTPS:

from openai import OpenAI

client = OpenAI(
    base_url="https://gateway.example.com/openai",
    api_key="YOUR_API_KEY"
)

Network considerations

Same cluster

If your client application runs in the same Kubernetes cluster, use the internal Service DNS name:

from openai import OpenAI

client = OpenAI(
    base_url="http://agentgateway-proxy.agentgateway.svc.cluster.local/openai",
    api_key="anything"
)
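The internal name follows the standard Kubernetes pattern of service name, namespace, then `svc.cluster.local`. As a sketch, the hypothetical helper below assembles that URL. It assumes the default `cluster.local` cluster domain and plain HTTP, both of which may differ in your cluster.

```python
def service_dns_url(service: str, namespace: str, route_path: str,
                    port: int = 80) -> str:
    """Build an in-cluster URL of the form <service>.<namespace>.svc.cluster.local."""
    # Assumes the default cluster domain; some clusters use a custom one.
    host = f"{service}.{namespace}.svc.cluster.local"
    # Port 80 is the HTTP default, so it can be left out of the URL.
    authority = host if port == 80 else f"{host}:{port}"
    return f"http://{authority}/{route_path.strip('/')}"


print(service_dns_url("agentgateway-proxy", "agentgateway", "/openai"))
```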

Network policies

If using Network Policies, ensure they allow traffic from client pods/machines to the agentgateway namespace:

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-gateway-traffic
  namespace: agentgateway
spec:
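  podSelector:
    matchLabels:
      app: agentgateway  # Adjust to match the labels on your gateway proxy pods
  ingress:
  - from:
    - namespaceSelector: {}  # Allow from all namespaces
    ports:
    - protocol: TCP
      port: 80  # Pod (container) port, which can differ from the Service port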

Troubleshooting

Cannot connect to gateway

Solutions:

Verify the gateway service is running:

kubectl get svc -n agentgateway agentgateway-proxy

Check that the LoadBalancer has an external IP assigned:

kubectl get svc -n agentgateway agentgateway-proxy -w

Test connectivity from your local machine:

curl http://$INGRESS_GW_ADDRESS/openai -v

404 Not Found

Cause: Route path doesn’t match HTTPRoute configuration.

Solution: Verify your HTTPRoute paths:

kubectl get httproute -n agentgateway -o yaml | grep -A 5 "path:"

Ensure the client URL uses the route path configured in the HTTPRoute (e.g., /openai, not /v1/chat/completions).
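To reason about why a URL returns 404, it can help to reproduce the matching rule locally. The sketch below assumes your HTTPRoute uses the Gateway API `PathPrefix` match type, which compares whole path segments, so `/openai` matches `/openai/chat/completions` but not `/openaix`. The function name is hypothetical, and the handling of repeated slashes is simplified.

```python
def path_prefix_matches(prefix: str, request_path: str) -> bool:
    """Approximate Gateway API PathPrefix matching on whole path segments."""
    # Split into non-empty segments so trailing slashes are ignored.
    prefix_parts = [part for part in prefix.split("/") if part]
    path_parts = [part for part in request_path.split("/") if part]
    # The request matches when its leading segments equal the prefix segments.
    return path_parts[:len(prefix_parts)] == prefix_parts


for candidate in ("/openai/chat/completions", "/openaix", "/v1/chat/completions"):
    print(candidate, path_prefix_matches("/openai", candidate))
```

If the route uses an `Exact` match instead, only the identical path is accepted.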

Connection timeout

Possible causes:

Gateway pods not ready
Network policies blocking traffic
Cloud provider firewall rules
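A quick TCP probe can help separate these causes, because a refused connection fails immediately while a blocked one hangs until the timeout. The following is a minimal sketch using only the Python standard library. The helper name `can_connect` is hypothetical, and the host and port shown are placeholders.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS resolution failures.
        return False


# Placeholder target; substitute your gateway host and port.
print(can_connect("localhost", 8080))
```

Run it from the same machine or pod as the client, since network policies and firewall rules depend on where the traffic originates.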