<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Prompt-Caching on Yushi's Blog</title><link>https://blog.yushi91.com/blog/prompt-caching/</link><description>Recent content in Prompt-Caching on Yushi's Blog</description><generator>Hugo</generator><language>en-US</language><copyright>Copyright © 2025, Yushi Cui.</copyright><lastBuildDate>Thu, 23 Apr 2026 10:00:00 +1200</lastBuildDate><atom:link href="https://blog.yushi91.com/blog/prompt-caching/index.xml" rel="self" type="application/rss+xml"/><item><title>Prompt Caching Changed How I Think About AI Agent Costs</title><link>https://blog.yushi91.com/blog/prompt-caching-cost-savings/</link><pubDate>Thu, 23 Apr 2026 10:00:00 +1200</pubDate><guid>https://blog.yushi91.com/blog/prompt-caching-cost-savings/</guid><description>&lt;p>&lt;img src="https://blog.yushi91.com/prompt-caching.webp" alt="Prompt caching" loading="lazy">
&lt;/p>
&lt;h2 id="what-i-thought-i-knew">What I thought I knew&lt;/h2>
&lt;p>Before this week, my mental model of AI pricing was simple. You pay per token. Input tokens cost some amount, output tokens cost more, and you add them up. The system prompt counts as input. Tool definitions count as input. Every conversation starts from zero, and you pay for every token you send.&lt;/p>
&lt;p>Predictable. Also wrong, or at least not the whole picture.&lt;/p>
&lt;p>Then I read the Anthropic docs properly for once and ran into a feature called prompt caching. It changed how I think about designing agent workflows.&lt;/p></description></item></channel></rss>