Token Cost & Context Window
DevoxxGenie provides tools to help you manage token usage, control costs, and optimize the context window size when working with cloud LLM providers. This page explains how to configure and use these features.
Understanding Tokens and Costs
What are Tokens?
Tokens are the basic units that LLMs process. In simple terms:
- A token is roughly 4 characters or 3/4 of a word in English
- Code typically uses more tokens than natural language
- Different languages may have different tokenization patterns
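The 4-characters-per-token rule of thumb above can be turned into a quick estimator. This is a minimal sketch, not the tokenizer DevoxxGenie or any provider actually uses; the function name `estimate_tokens` and the constant `CHARS_PER_TOKEN` are hypothetical.

```python
import math

# Rule of thumb from the text: roughly 4 characters per token in English.
# This is an approximation, not a real tokenizer.
CHARS_PER_TOKEN = 4


def estimate_tokens(text: str) -> int:
    """Return a rough token estimate based on character count alone."""
    return math.ceil(len(text) / CHARS_PER_TOKEN)
```

Because code and non-English text often tokenize less efficiently, treat the result as a lower-bound guess rather than an exact count.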
Why Token Management Matters
- Cost Control: Cloud LLM providers charge based on token usage
- Context Window: LLMs have limits on how many tokens they can process at once
- Performance: Optimizing token usage can lead to better, more focused responses
Token Cost Configuration
Access token cost settings through:
- Settings → Tools → DevoxxGenie → Token Cost & Context Window
Configuring Provider Costs
For each LLM provider, you can configure:
- Input Cost: The cost per 1,000 tokens for input (prompts)
- Output Cost: The cost per 1,000 tokens for output (responses)
- Context Window: The maximum number of tokens the model can process
Default values are pre-configured, but you can adjust them if the provider changes their pricing or if you have special rates.
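As a rough illustration of how per-1,000-token rates translate into a request cost, the arithmetic can be sketched as follows. The function name `estimate_cost` and the rates in the usage note are hypothetical and do not reflect any provider's real pricing.

```python
def estimate_cost(
    input_tokens: int,
    output_tokens: int,
    input_cost_per_1k: float,
    output_cost_per_1k: float,
) -> float:
    """Estimate the cost of one exchange from per-1,000-token rates."""
    # Input and output are priced separately, so compute each part on its own.
    input_cost = input_tokens / 1000 * input_cost_per_1k
    output_cost = output_tokens / 1000 * output_cost_per_1k
    return input_cost + output_cost
```

For example, with made-up rates of 0.01 per 1,000 input tokens and 0.03 per 1,000 output tokens, a request with 2,000 input tokens and 500 output tokens would come to about 0.02 + 0.015 = 0.035.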
Provider-Specific Settings
Different providers have different pricing models:
- OpenAI: Different rates for different models (GPT-3.5, GPT-4, etc.)
- Anthropic: Pricing varies by Claude model
- Google: Different rates for different Gemini models
- Others: Each provider has their own pricing structure
Context Window Management
The context window is the amount of text (in tokens) that the LLM can "see" at once.
Window Size by Provider
Each LLM has a maximum context window size:
| Provider | Model | Context Window |
|---|---|---|
| OpenAI | GPT-4o | 128K tokens |
| OpenAI | GPT-3.5 Turbo | 16K tokens |
| Anthropic | Claude 3.5 Sonnet | 200K tokens |
| Google | Gemini 1.5 Pro | 1M tokens |
| Ollama | Llama 3 | Varies by version |
Optimizing Context Window Usage
DevoxxGenie provides several tools to help you optimize context window usage:
- Token Usage Bar: Visual indicator of how much of the context window is being used
- Token Calculator: Calculate tokens before sending prompts
- Project Scanner Settings: Control how much code is included in prompts
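To make the idea behind a usage indicator concrete, the sketch below renders the share of the context window consumed as a text bar. It only illustrates the calculation; it is not how the DevoxxGenie Token Usage Bar is implemented, and `usage_bar` is a hypothetical name.

```python
def usage_bar(tokens_used: int, context_window: int, width: int = 20) -> str:
    """Render context window usage as a text bar with a percentage."""
    if context_window <= 0:
        raise ValueError("context_window must be positive")
    ratio = tokens_used / context_window
    # Clamp the drawn portion to the bar width; the percentage may exceed 100.
    filled = round(max(0.0, min(ratio, 1.0)) * width)
    percent = round(ratio * 100)
    return "[" + "#" * filled + "-" * (width - filled) + f"] {percent}%"
```

A value above 100% signals that the prompt would not fit in the selected model's window.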
Token Calculation Features
Token Calculator
To calculate tokens for a directory or file:
- Right-click on a directory in the project view
- Select "Calc Tokens for Directory"
- View the token count, estimated cost, and available models
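Conceptually, a directory-level count walks the file tree and sums a per-file estimate. The following is a minimal sketch under stated assumptions: it uses the 4-characters-per-token approximation rather than a real tokenizer, and the function name and default extension list are hypothetical, not the plugin's actual behavior.

```python
import math
import os


def count_directory_tokens(root: str, extensions=(".java", ".kt", ".md")) -> int:
    """Sum rough token estimates for matching files under root."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            # Skip files whose extension is not in the (assumed) allow list.
            if not name.endswith(extensions):
                continue
            path = os.path.join(dirpath, name)
            with open(path, encoding="utf-8", errors="ignore") as handle:
                # Approximate: about 4 characters per token.
                total += math.ceil(len(handle.read()) / 4)
    return total
```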
Calculating Tokens for Current Prompt
The DevoxxGenie interface displays:
- Current Token Count: The number of tokens in your prompt
- Estimated Cost: The approximate cost of the request
- Available Models: Which models can handle the token count
Token Usage in Responses
After receiving a response, DevoxxGenie shows:
- Input Tokens: How many tokens were in your prompt
- Output Tokens: How many tokens were in the LLM's response
- Total Cost: The estimated cost of the exchange
Smart Model Selection
DevoxxGenie helps you choose the right model for your task:
- Models in the dropdown are filtered by context window size
- Models that can't handle your prompt are disabled
- Cost information is shown for each model
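The filtering described above amounts to comparing the prompt size against each model's context window. Here is a minimal sketch using rounded window sizes taken from the table on this page; the `reserved_for_output` parameter is an assumption added for illustration, and none of this reflects the plugin's internal code.

```python
# Rounded context window sizes from the table on this page (in tokens).
CONTEXT_WINDOWS = {
    "GPT-4o": 128_000,
    "GPT-3.5 Turbo": 16_000,
    "Claude 3.5 Sonnet": 200_000,
    "Gemini 1.5 Pro": 1_000_000,
}


def models_that_fit(prompt_tokens, reserved_for_output=0, windows=CONTEXT_WINDOWS):
    """Return the models whose window can hold the prompt plus reserved output."""
    needed = prompt_tokens + reserved_for_output
    return [model for model, window in windows.items() if window >= needed]
```

For a 150,000-token prompt, only the two models with windows of 200K tokens or more remain in the result.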
Cost-Saving Strategies
To minimize token usage and costs:
- Be Specific: Craft focused prompts that are direct about what you need
- Limit Context: Only include relevant code in your prompts
- Use RAG: Instead of including entire files, use RAG to retrieve only relevant sections
- Clean Code: Remove unnecessary comments and unused imports before including code
- Use Local Models: For non-critical tasks, consider using local models, which incur no per-token charges
Project Scanning and Token Management
When scanning your project for context:
- Selective Inclusion: Choose specific directories or files to include
- Exclusion Patterns: Configure patterns to exclude (e.g., test files, generated code)
- JavaDoc Removal: Option to strip JavaDocs to reduce token count
- Size Limits: Set maximum file sizes to include
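The exclusion, size-limit, and JavaDoc options above can be approximated in a few lines. This is a simplified sketch, not the scanner's real logic: the function names and example patterns are hypothetical, and the regular expression is naive (for instance, it would also strip `/** ... */` sequences that appear inside string literals).

```python
import re
from fnmatch import fnmatchcase

# Matches /** ... */ blocks, including multi-line ones (non-greedy).
JAVADOC = re.compile(r"/\*\*.*?\*/", re.DOTALL)


def strip_javadoc(source: str) -> str:
    """Remove JavaDoc blocks while leaving ordinary /* ... */ comments alone."""
    return JAVADOC.sub("", source)


def should_include(path: str, size_bytes: int, exclude_patterns, max_size_bytes: int) -> bool:
    """Decide whether a file passes the size limit and exclusion patterns."""
    if size_bytes > max_size_bytes:
        return False
    # Exclude the file if any glob-style pattern matches its path.
    return not any(fnmatchcase(path, pattern) for pattern in exclude_patterns)
```

With patterns such as `*Test.java` and `*/generated/*`, test classes and generated sources are skipped while regular source files under the size limit are kept.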
"Add Project" Feature
The "Add Project to Context" feature takes token limits into account:
- Analyzes project size in tokens
- Warns if the project exceeds the model's context window
- Suggests appropriate models for your project size
- Gives token count and cost estimates
Best Practices
- Start Small: Begin with smaller contexts and add more as needed
- Monitor Costs: Regularly check token usage to avoid unexpected bills
- Try Different Models: Smaller models are often adequate and much cheaper
- Balance Quality and Cost: Higher-tier models cost more but may provide better results
- Set Budgets: Consider setting monthly budgets for API usage
Troubleshooting
- Token Count Discrepancies: Providers count tokens with their own tokenizers, so small differences from DevoxxGenie estimates are expected; if the gap is large, adjust the calculation settings
- Context Window Errors: If you receive errors about exceeding context length, reduce the amount of included code or switch to a model with a larger context window
- High Costs: If costs are higher than expected, review your usage patterns and consider using more efficient prompts or local models
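One simple way to reduce the amount of included code after a context-length error is to add files smallest-first until a token budget is used up. The sketch below illustrates that greedy approach; it is an assumption for illustration, not a DevoxxGenie feature, and in practice ordering files by relevance is usually more useful than ordering by size.

```python
def select_files_within_budget(file_tokens: dict, budget: int) -> list:
    """Pick files smallest-first until the next one would exceed the budget."""
    selected = []
    remaining = budget
    # Sort by token count, then by path so ties are resolved deterministically.
    for path, tokens in sorted(file_tokens.items(), key=lambda item: (item[1], item[0])):
        if tokens > remaining:
            break  # every later file is at least as large, so stop here
        selected.append(path)
        remaining -= tokens
    return selected
```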