- Provider-Level Prompt Caching (HIGH VALUE)
This is built into providers such as Anthropic and OpenAI: input tokens are cached at the provider level, which significantly reduces cost on repeated prompt prefixes.
Anthropic Prompt Caching:
- Caches system prompts, messages, and tool definitions
- 90% cost reduction on cached input tokens
- Requires a minimum of 1024 tokens
- Enabled via providerOptions: { anthropic: { cacheControl: { type: 'ephemeral' } } } (verification sketch below)
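If we adopt this, we should also confirm the cache is actually being hit. The sketch below is a minimal example, assuming the @ai-sdk/anthropic provider used directly (rather than our gateway adapter), an illustrative model id, and that the Anthropic provider metadata exposes cacheCreationInputTokens / cacheReadInputTokens; all of these should be checked against the installed SDK version.

```typescript
// Sketch: confirm prompt caching is actually taking effect.
// Assumptions: direct @ai-sdk/anthropic provider (not our gateway adapter),
// illustrative model id, and cacheCreationInputTokens / cacheReadInputTokens
// being present on the provider metadata; verify against the SDK version in use.
import { generateText } from 'ai';
import { anthropic } from '@ai-sdk/anthropic';

const systemPrompt = '...a long, stable system prompt (>1024 tokens)...';

const result = await generateText({
  model: anthropic('claude-sonnet-4'),
  messages: [
    {
      role: 'system',
      content: systemPrompt,
      // Cache breakpoint on the stable system message
      providerOptions: { anthropic: { cacheControl: { type: 'ephemeral' } } },
    },
    { role: 'user', content: 'Draft three post ideas for the spring campaign.' },
  ],
});

// First call should report cache-creation tokens; later calls with the same
// prefix should report cache-read tokens instead.
console.log(result.providerMetadata?.anthropic);
```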
OpenAI Prompt Caching:
- Similar concept, but applied automatically to repeated prompt prefixes of roughly 1024+ tokens; no cache-control markers are needed, so the main lever is prompt ordering (sketch below)
- Added in AI SDK 4.0
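Because caching here is prefix-based, the only thing the adapter controls is keeping the stable parts of the prompt first. A minimal sketch follows, assuming the @ai-sdk/openai provider and that cached-token counts surface as cachedPromptTokens in the provider metadata (field name to be verified).

```typescript
// Sketch: OpenAI caches repeated prompt prefixes automatically; keep the
// stable system prompt first and the variable request last to maximize hits.
// The cachedPromptTokens field name is an assumption to verify against the SDK docs.
import { generateText } from 'ai';
import { openai } from '@ai-sdk/openai';

const stableSystemPrompt = '...brand voice, guardrails, formatting rules (>1024 tokens)...';
const userRequest = 'Write a caption for the new product launch.';

const result = await generateText({
  model: openai('gpt-4o'),
  system: stableSystemPrompt, // stable prefix: cached automatically after the first request
  prompt: userRequest,        // variable suffix: not cached
});

console.log(result.providerMetadata?.openai); // e.g. { cachedPromptTokens: ... }
```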
Google/Gemini Context Caching:
- Caches large context windows
- Well suited to repeated analysis of the same documents (sketch below)
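A rough sketch of how explicit Gemini context caching could look is below. It assumes the cache is created with GoogleAICacheManager from @google/generative-ai/server and referenced through a cachedContent setting on the @ai-sdk/google model; both are assumptions to verify against the current provider docs before we commit to them.

```typescript
// Sketch: cache a large document once, then reuse it across requests.
// Assumptions: GoogleAICacheManager (@google/generative-ai/server) and the
// cachedContent model setting in @ai-sdk/google; verify both before adopting.
import { GoogleAICacheManager } from '@google/generative-ai/server';
import { google } from '@ai-sdk/google';
import { generateText } from 'ai';

const largeDocumentText = '...full campaign brief or analytics export...';

const cacheManager = new GoogleAICacheManager(process.env.GOOGLE_GENERATIVE_AI_API_KEY!);
const { name: cachedContent } = await cacheManager.create({
  model: 'models/gemini-1.5-flash-001',
  contents: [{ role: 'user', parts: [{ text: largeDocumentText }] }],
  ttlSeconds: 60 * 5, // keep the cached context around for five minutes
});

const result = await generateText({
  model: google('models/gemini-1.5-flash-001', { cachedContent }),
  prompt: 'Summarize the key risks called out in the cached brief.',
});
```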
- Response Caching (We Already Have This)
Our current LRU cache in gateway.ts stores entire responses: if the exact same request comes in again, we return the cached response without hitting the API at all. That only helps on exact repeats, whereas provider prompt caching discounts the shared prefix of otherwise different requests.
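For reference, a hypothetical sketch of that exact-match layer (the real gateway.ts implementation may differ): the key is a hash of the whole request, so anything less than an identical request misses.

```typescript
// Hypothetical sketch of an exact-match response cache keyed by a hash of the
// request; the real gateway.ts implementation may differ.
import { createHash } from 'node:crypto';
import { LRUCache } from 'lru-cache';

const responseCache = new LRUCache<string, string>({ max: 500 });

function cacheKey(request: { model: string; system?: string; prompt: string }): string {
  return createHash('sha256').update(JSON.stringify(request)).digest('hex');
}

async function cachedGenerate(
  request: { model: string; system?: string; prompt: string },
  generate: () => Promise<string>,
): Promise<string> {
  const key = cacheKey(request);
  const hit = responseCache.get(key);
  if (hit !== undefined) return hit; // exact same request: skip the API entirely

  const response = await generate();
  responseCache.set(key, response);
  return response;
}
```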
Recommendation
Yes, we should add Provider Prompt Caching. It is complementary to our existing response cache and provides significant cost savings for TendSocial's use cases:

| Use Case | Benefit |
| --- | --- |
| Content generation with the same system prompt | 90% savings on system prompt tokens |
| Multi-agent workflows with repeated tool definitions | 90% savings on tool schema tokens |
| Campaign variations with the same brand context | 90% savings on context tokens |

What to Add to v4.2

```typescript
// In the adapter, enable prompt caching for Anthropic
const result = await generateText({
  model: gateway('anthropic/claude-sonnet-4'),
  system: systemPrompt,
  messages,
  providerOptions: {
    anthropic: {
      cacheControl: { type: 'ephemeral' } // Cache the system prompt
    }
  }
});
```
```typescript
// For messages with cache breakpoints
messages: [
  {
    role: 'user',
    content: largeContext,
    providerOptions: {
      anthropic: { cacheControl: { type: 'ephemeral' } }
    }
  }
]
```

Configuration Options to Add

| Setting | Description | Default |
| --- | --- | --- |
| enablePromptCaching | Enable provider-level prompt caching | true |
| cacheSystemPrompts | Auto-cache system prompts > 1024 tokens | true |
| cacheToolDefinitions | Cache tool schemas (when supported) | true |
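As a sketch of how these settings could be expressed in the adapter config (names mirror the table above; the final shape depends on our existing config schema):

```typescript
// Sketch only: setting names mirror the table above, defaults are all true,
// and the exact placement in our adapter config is still to be decided.
interface PromptCachingConfig {
  /** Enable provider-level prompt caching. */
  enablePromptCaching: boolean;
  /** Auto-cache system prompts longer than 1024 tokens. */
  cacheSystemPrompts: boolean;
  /** Cache tool schemas where the provider supports it. */
  cacheToolDefinitions: boolean;
}

const defaultPromptCachingConfig: PromptCachingConfig = {
  enablePromptCaching: true,
  cacheSystemPrompts: true,
  cacheToolDefinitions: true,
};
```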