Understanding Tokens and How They Affect AI Costs
Author: Marc Logemann
When you're using AI APIs (like those powering ChatGPT, Claude, or Gemini), pricing is often based on tokens—not on words, seconds, or API calls. This can be confusing at first, so here’s a simple breakdown:
What Is a Token?
A token is a chunk of text. It could be:
- A word (e.g., "cat")
- Part of a word (e.g., "un" in "understand")
- Or punctuation (e.g., "." or "!")
In English, 1 token is roughly 4 characters or ¾ of a word on average. So:
- 100 tokens ≈ 75 words
- 1,000 tokens ≈ 750 words
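If you want the exact count instead of the rule of thumb, you can tokenize text locally with OpenAI's open-source tiktoken library. Here is a minimal sketch, assuming the cl100k_base encoding used by GPT-4-era models (other models may use a different encoding):

```python
# pip install tiktoken
import tiktoken

# cl100k_base is the encoding used by GPT-4-era models.
enc = tiktoken.get_encoding("cl100k_base")

text = "Tokens are chunks of text, not words."
tokens = enc.encode(text)

print(f"{len(tokens)} tokens for {len(text)} characters")
# The ~4 characters per token figure is only an average;
# actual counts depend on the text and the tokenizer.
```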
Be aware that most LLM providers price input tokens differently from output tokens; output tokens are typically the more expensive of the two.
How Are Tokens Used?
When you send input to an AI model, it gets broken down into tokens. The model processes those tokens and generates output tokens in response.
You pay for both:
- Input tokens (your prompt)
- Output tokens (the model’s response)
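Most providers report both counts back to you in the API response, so you don't have to estimate. A minimal sketch using the official openai Python client; the usage fields are part of the Chat Completions response, while the model name and prompt are just placeholders:

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="gpt-4-turbo",  # placeholder; use whatever model you are budgeting for
    messages=[{"role": "user", "content": "Explain tokens in one sentence."}],
)

# The usage object breaks the request down into the two billable parts.
usage = response.usage
print(f"input tokens:  {usage.prompt_tokens}")
print(f"output tokens: {usage.completion_tokens}")
print(f"total tokens:  {usage.total_tokens}")
```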
A Real Example of Input and Output Tokens
In the following screenshot, we created a request and asked ChatGPT to also show the tokens it used.

Then we asked ChatGPT to calculate the cost of those tokens. It based the calculation on the GPT-4 Turbo model, which is what the official ChatGPT website uses.

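You can reproduce that kind of calculation yourself. A minimal sketch with hypothetical token counts, using GPT-4 Turbo's list prices at the time of writing ($0.01 per 1K input tokens, $0.03 per 1K output tokens; always check the provider's pricing page for current rates):

```python
# Hypothetical token counts standing in for the screenshot example.
input_tokens = 42
output_tokens = 128

# GPT-4 Turbo list prices at the time of writing (USD per 1K tokens).
PRICE_IN_PER_1K = 0.01
PRICE_OUT_PER_1K = 0.03

cost = (input_tokens / 1000) * PRICE_IN_PER_1K \
     + (output_tokens / 1000) * PRICE_OUT_PER_1K
print(f"Request cost: ${cost:.6f}")  # -> Request cost: $0.004260
```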
Why Does This Matter for Budgeting?
If you're building AI-based software that interacts with a model frequently—like a chatbot, summarizer, or recommendation system—token usage can add up fast.
Example:
- A single user query might use 200 input tokens
- The model responds with 300 output tokens
- That’s 500 tokens total per interaction
Now imagine thousands of users each day. Multiply by the price per 1,000 tokens (e.g., $0.002 for some models, or much more for larger ones), and you can see how costs scale.
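Plugging the numbers from the example above into a quick sketch makes the scaling concrete; the figure of 10,000 daily users is an assumption for illustration:

```python
tokens_per_interaction = 200 + 300   # input + output, as in the example above
price_per_1k_tokens = 0.002          # USD; varies widely by model

daily_users = 10_000                 # assumed figure for illustration
interactions_per_user = 1

daily_tokens = daily_users * interactions_per_user * tokens_per_interaction
daily_cost = (daily_tokens / 1000) * price_per_1k_tokens

print(f"{daily_tokens:,} tokens/day -> ${daily_cost:.2f}/day, "
      f"~${daily_cost * 30:.2f}/month")
# 5,000,000 tokens/day -> $10.00/day, ~$300.00/month
```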
Key Takeaways for Budgeting
- Token usage drives variable cost — it's like bandwidth for AI.
- Bigger models are more expensive per token — and may use more tokens per response.
- Fine-tuning, embeddings, and context windows (how much info the model remembers) also affect token usage.
To keep costs under control, you'll need to (see the sketch after this list):
- Track usage per user/session
- Cap token limits where possible
- Optimize prompts to reduce input/output length without degrading performance
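Here is a minimal sketch of the first two points: tracking per-session usage and enforcing a token budget. The budget value and the in-memory session store are assumptions; max_tokens is the standard Chat Completions parameter for capping output length:

```python
from collections import defaultdict
from openai import OpenAI

client = OpenAI()
SESSION_TOKEN_BUDGET = 20_000        # assumed per-session budget
usage_by_session = defaultdict(int)  # in production, persist this

def ask(session_id: str, prompt: str) -> str:
    if usage_by_session[session_id] >= SESSION_TOKEN_BUDGET:
        raise RuntimeError("Session token budget exhausted")

    response = client.chat.completions.create(
        model="gpt-4-turbo",         # placeholder model name
        messages=[{"role": "user", "content": prompt}],
        max_tokens=300,              # hard cap on output tokens
    )
    # Count both billable directions against the session budget.
    usage_by_session[session_id] += response.usage.total_tokens
    return response.choices[0].message.content
```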
