Why I Built This
Last month, our OpenAI bill came to $3,127 for a single week.
We were bleeding money on AI API calls. We had no visibility into spending, no caching, and we were using GPT-4 for everything—even simple queries that could run on GPT-3.5 (which is 60x cheaper).
After a weekend of frustrated coding, I built the AI API Cost Optimizer—a Python tool that:
- ✅ Intelligently caches responses to avoid duplicate calls
- ✅ Routes queries to the cheapest appropriate model
- ✅ Tracks spending in real-time with alerts
- ✅ Works with any AI provider (OpenAI, Anthropic, Google, Cohere, Mistral)
Result: 70% cost reduction ($8,660/month saved = $103,920/year)
Today, I'm open-sourcing it. If you're paying for AI APIs, this tool can save you serious money.
What It Does
1. Smart Caching (40-60% Savings)
Stores API responses in SQLite. When you make the same query twice, it returns the cached result instantly at $0 cost.
Example:
First call: "What is Python?" → API call → $0.02
Second call: "What is Python?" → Cache hit → $0.00 ✅
With a 52% cache hit rate, just over half of your API calls are free.
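If you're curious what exact-match caching boils down to, here's a minimal standalone sketch of the pattern: hash the prompt plus model into a key, check SQLite before calling the API, and store the answer afterwards. The table name, key scheme, and function names below are illustrative, not the optimizer's actual internals.
import hashlib
import sqlite3

conn = sqlite3.connect("cache.db")
conn.execute(
    "CREATE TABLE IF NOT EXISTS response_cache "
    "(key TEXT PRIMARY KEY, response TEXT, cost REAL)"
)

def cache_key(prompt: str, model: str) -> str:
    # Exact-match key: same prompt + same model -> same row
    return hashlib.sha256(f"{model}:{prompt}".encode()).hexdigest()

def get_cached(prompt: str, model: str):
    row = conn.execute(
        "SELECT response FROM response_cache WHERE key = ?",
        (cache_key(prompt, model),),
    ).fetchone()
    return row[0] if row else None  # None means a miss: make (and pay for) the real call

def store(prompt: str, model: str, response: str, cost: float) -> None:
    conn.execute(
        "INSERT OR REPLACE INTO response_cache VALUES (?, ?, ?)",
        (cache_key(prompt, model), response, cost),
    )
    conn.commit()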
2. Intelligent Model Routing (20-30% Savings)
Automatically suggests cheaper models for simple queries.
Example:
- Query: "What is machine learning?"
- Your choice: GPT-4 ($0.06 per 1K tokens)
- Optimizer suggests: GPT-3.5-Turbo ($0.001 per 1K tokens)
- Savings: 98% 💰
For simple FAQs, definitions, and explanations, you don't need expensive models.
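To give a feel for how routing like this can work, here's a rough sketch of a keyword-and-length heuristic. The thresholds, marker phrases, and prices are illustrative assumptions, not the optimizer's exact rules.
# Illustrative per-1K-token prices; check your provider's current price sheet
PRICES = {"gpt-4": 0.06, "gpt-3.5-turbo": 0.001}

SIMPLE_MARKERS = ("what is", "define", "explain", "summarize")

def suggest_model(prompt: str, requested: str) -> dict:
    # Short, definitional prompts rarely need the most expensive model
    is_simple = len(prompt.split()) < 30 and prompt.lower().startswith(SIMPLE_MARKERS)
    suggested = "gpt-3.5-turbo" if is_simple else requested
    savings = 0.0
    if suggested != requested:
        savings = round((1 - PRICES[suggested] / PRICES[requested]) * 100, 1)
    return {"suggested": suggested, "savings": savings}

print(suggest_model("What is machine learning?", "gpt-4"))
# {'suggested': 'gpt-3.5-turbo', 'savings': 98.3}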
3. Real-Time Cost Monitoring
Tracks every API call with:
- Cost per call
- Cache hit rates
- Spending by model
- Hourly/daily/monthly totals
- Alerts when thresholds are exceeded
Dashboard shows:
Last 24 hours:
- Total cost: $45.32
- Total calls: 1,245
- Cache hit rate: 52%
- Top model: gpt-4-turbo ($32.15)
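The per-call numbers behind these totals are simple arithmetic: token counts times the model's per-1K-token rates. A quick sketch (the rates shown are placeholders; the tracker uses its own price table):
# Placeholder per-1K-token rates, not official pricing
RATES = {"gpt-4-turbo": {"input": 0.01, "output": 0.03}}

def call_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    r = RATES[model]
    return (input_tokens / 1000) * r["input"] + (output_tokens / 1000) * r["output"]

# 1,200 prompt tokens + 350 completion tokens on gpt-4-turbo
print(f"${call_cost('gpt-4-turbo', 1200, 350):.4f}")  # $0.0225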
4. Beautiful Web Dashboard
A modern, animated dashboard featuring:
- Real-time cost tracking
- Interactive charts (Chart.js)
- Cache performance metrics
- Model distribution graphs
- Responsive design (mobile-friendly)
Installation & Setup
Quick Start (2 minutes)
# Clone the repo
git clone https://github.com/dinesh-k-elumalai/ai-cost-optimizer.git
cd ai-cost-optimizer
# Install dependencies
pip install -r requirements.txt
# Run the quick start demo
python quick_start.py
# Start the web dashboard
python app.py
# Open http://localhost:5000
That's it! The optimizer is running.
Integrate with Your Code
Option 1: Drop-in wrapper (easiest)
from ai_cost_optimizer import AIAPIOptimizer
from openai import OpenAI

client = OpenAI(api_key="your-key")
optimizer = AIAPIOptimizer()

def optimized_call(prompt, model="gpt-4"):
    # Check cache first
    cached = optimizer.cache.get(prompt, model)
    if cached:
        return cached

    # Make API call
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}]
    )

    # Track and cache
    answer = response.choices[0].message.content
    optimizer.process_request(
        prompt, model,
        response.usage.prompt_tokens,
        response.usage.completion_tokens
    )
    optimizer.cache.set(prompt, model, answer, 0.02)
    return answer

# Use it like normal!
answer = optimized_call("Explain async/await")
Option 2: Use the SDK
from ai_cost_optimizer.sdk import CostOptimizerClient

optimizer = CostOptimizerClient()

# Track any API call
optimizer.track_call(
    prompt="Your prompt",
    model="gpt-4-turbo",
    input_tokens=100,
    output_tokens=200
)

# Get suggestions
suggestion = optimizer.suggest_model("What is Python?", "gpt-4")
print(f"Use {suggestion['suggested']} to save {suggestion['savings']}%")
Option 3: Monitoring only
Just track your existing calls without changing code:
# After your API call
optimizer.process_request(prompt, model, input_tokens, output_tokens)
# Check stats anytime
stats = optimizer.tracker.get_stats(24) # Last 24 hours
print(f"Total cost: ${stats['total_cost']:.2f}")
Real Results
Here's what happened after we deployed it:
Before AI Cost Optimizer
- 💸 Monthly cost: $12,340
- 📊 Cache hit rate: 0%
- ⏱️ Avg response time: 2.1 seconds
- 🤷 Visibility: None
After AI Cost Optimizer
- 💰 Monthly cost: $3,680 (70% reduction)
- ✅ Cache hit rate: 52% (just over half of calls are free)
- ⚡ Avg response time: 1.4 seconds (down 33%)
- 📈 Visibility: Complete dashboard
Annual Savings
$8,660/month × 12 = $103,920/year saved 🎉
That's a junior developer's salary saved just by optimizing API calls!
Why This Tool is Different
🆓 Open Source & Free
- MIT License
- No vendor lock-in
- Community-driven
- Fork and customize
🚀 Production-Ready
- Used by 50+ startups in production
- Battle-tested code
- SQLite for simplicity (PostgreSQL for scale)
- Proper error handling
🎨 Beautiful UI
- Modern glassmorphism design
- Smooth animations
- Real-time updates
- Fully responsive
🔌 Universal Compatibility
Works with:
- OpenAI (GPT-4, GPT-3.5)
- Anthropic (Claude Opus, Sonnet, Haiku)
- Google (Gemini Pro, Flash)
- Cohere
- Mistral
- Any AI provider with token-based pricing
📊 Actionable Insights
- Which models cost the most
- Which queries can use cheaper models
- Cache effectiveness
- Hourly/daily spending trends
- Cost per task type
Features
Core Features
✅ Smart response caching with SQLite
✅ Intelligent model routing
✅ Real-time cost tracking
✅ Web dashboard with charts
✅ Cost alerts and thresholds
✅ Multi-provider support
✅ Cache TTL management
✅ Query complexity classification
Developer Experience
✅ Zero-code monitoring (just track calls)
✅ Drop-in integration (wrap existing calls)
✅ SDK for easy integration
✅ Complete API documentation
✅ Example integrations (FastAPI, Django, Flask)
✅ Docker support (coming soon)
Analytics
✅ Cost by model
✅ Cost by task type
✅ Cache hit rate tracking
✅ Hourly/daily/monthly breakdowns
✅ Token usage statistics
✅ Model performance comparison
Use Cases
1. Startups with AI Features
Problem: Unpredictable AI bills eating into runway
Solution: 40-70% cost reduction = more months of runway
2. SaaS with AI Chatbots
Problem: High support costs with AI assistants
Solution: Cache FAQ responses, save 60% on support queries
3. Development Teams
Problem: No visibility into AI spending
Solution: Real-time tracking, alerts before overspending
4. AI Agencies
Problem: Client projects with variable AI costs
Solution: Track per-project costs, optimize spending
5. Content Platforms
Problem: Expensive content generation at scale
Solution: Cache similar requests, use cheaper models
Getting Started
1. Install
git clone https://github.com/dinesh-k-elumalai/ai-cost-optimizer.git
cd ai-cost-optimizer
pip install -r requirements.txt
2. Quick Test
python quick_start.py
This runs a demo showing:
- ✅ Cache working (second call is free)
- ✅ Model suggestions (save 90%+ on simple queries)
- ✅ Cost tracking (see all spending)
3. Start Dashboard
python app.py
# Open http://localhost:5000
View real-time:
- 📊 Cost charts
- 💾 Cache performance
- 💡 Optimization recommendations
- 📈 Spending trends
4. Integrate
Choose your integration method:
- Monitoring only - Just track calls
- Drop-in wrapper - Wrap API calls for caching
- Full integration - Use SDK for everything
See Integration Guide for details.
Configuration
Customize for your needs:
from ai_cost_optimizer import AIAPIOptimizer
optimizer = AIAPIOptimizer()
# Set alert thresholds
optimizer.tracker.alert_thresholds = {
    'hourly': 50.0,     # $50/hour
    'daily': 500.0,     # $500/day
    'monthly': 10000.0  # $10k/month
}
# Customize cache TTL
optimizer.cache.set(prompt, model, response, cost, ttl_hours=168) # 7 days
# Add custom model costs
from ai_cost_optimizer import MODEL_COSTS
MODEL_COSTS["your-custom-model"] = {
    "input": 5.00,
    "output": 15.00
}
Roadmap
What's coming next:
- [ ] Semantic caching - Cache similar queries, not just exact matches (a rough sketch of the idea appears after this list)
- [ ] A/B testing - Compare model performance automatically
- [ ] Slack/Email alerts - Get notified of cost spikes
- [ ] Docker container - One-command deployment
- [ ] Hosted version - No setup required (coming Q2 2026)
- [ ] Multi-user support - Team dashboards
- [ ] Cost forecasting - Predict future spending
- [ ] Browser extension - Monitor OpenAI Playground usage
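For the semantic-caching item above, the rough idea is: embed each prompt, and reuse a cached answer when a new prompt lands close enough to an old one. Here's a toy sketch of that idea; the `embed` placeholder and the 0.9 threshold are stand-ins, not a committed design (a real version would use a proper embedding model and a vector index).
import numpy as np

def embed(text: str) -> np.ndarray:
    # Toy placeholder embedding so the sketch runs; a real version would
    # call an embedding model instead
    vec = np.zeros(64)
    for word in text.lower().split():
        vec[hash(word) % 64] += 1.0
    return vec

CACHE = []  # list of (embedding, cached response) pairs

def semantic_get(prompt: str, threshold: float = 0.9):
    query = embed(prompt)
    for vec, response in CACHE:
        sim = float(np.dot(query, vec) / (np.linalg.norm(query) * np.linalg.norm(vec) + 1e-9))
        if sim >= threshold:
            return response  # similar enough to a past prompt: reuse its answer
    return None  # miss: make the real API call, then semantic_put() the result

def semantic_put(prompt: str, response: str) -> None:
    CACHE.append((embed(prompt), response))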
Want a feature? Open an issue or contribute!
Contributing
This tool exists because developers shared their pain points. Your contributions make it better for everyone!
Ways to Contribute
- Share your savings - Tweet your results with #AIOptimizer
- Report bugs - Found an issue? Open a GitHub issue
- Add features - PRs welcome! See CONTRIBUTING.md
- Improve docs - Better examples, translations, tutorials
- Star the repo ⭐ - Helps others discover it
Areas We Need Help
- 🐛 Bug fixes and testing
- 🌐 Support for more AI providers (Replicate, HuggingFace, etc.)
- 📚 Documentation improvements
- 🎨 Dashboard enhancements
- 🧪 More test coverage
- 🌍 Translations
Community & Support
Get Help
Share Your Results
Save money? Share it!
Tweet format:
Just saved $X/month on AI API costs using @dinesh-k-elumalai's
AI Cost Optimizer! 🚀
70% cost reduction with smart caching and model routing.
Open source and free: [GitHub link]
#AIOptimizer #OpenSource #DevTools
Tech Stack
Built with:
- Python 3.8+ - Core optimizer
- SQLite - Caching and cost tracking
- Flask - Web dashboard
- Chart.js - Data visualization
- FontAwesome - Icons
- Modern CSS - Glassmorphism design
FAQ
Q: Does this work with my AI provider?
A: Yes! Supports OpenAI, Anthropic, Google, Cohere, Mistral, and any provider with token-based pricing.
Q: How much will I save?
A: Typically 40-70%. Actual savings depend on your usage patterns; the more duplicate queries you have, the more you'll save.
Q: Is this production-ready?
A: Yes! Used by 50+ startups in production. SQLite works well for small to medium loads; switch to PostgreSQL for high traffic.
Q: Can I use without code changes?
A: Yes! Monitoring mode tracks calls without any code changes. Add caching later when ready.
Q: How does caching work with dynamic content?
A: Cache TTL is configurable (default 7 days). For dynamic content, use shorter TTL or disable caching for specific queries.
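For example, reusing the `cache.set` call from the Configuration section above, a fast-changing answer can be cached for an hour instead of a week (variable names here follow that earlier snippet):
# Short-lived entry for content that goes stale quickly
optimizer.cache.set(prompt, model, response, cost, ttl_hours=1)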
Q: Does this replace my AI provider?
A: No! It's a wrapper that optimizes your existing AI API calls. You still use OpenAI, Anthropic, etc.
Q: What about privacy/security?
A: Everything runs locally. No data sent to third parties. Cache is stored in your SQLite database.
Try It Now
Quick Start
git clone https://github.com/dinesh-k-elumalai/ai-cost-optimizer.git
cd ai-cost-optimizer
pip install -r requirements.txt
python quick_start.py
Links
- 🌟 GitHub: github.com/dinesh-k-elumalai/ai-cost-optimizer
- 🐦 Follow me: @dinesh-k-elumalai on X/Twitter
- 📖 Docs: Full Documentation
- 💬 Discuss: GitHub Discussions
Final Thoughts
AI APIs are amazing but expensive. After getting burned by a $3K/week bill, I built this tool to:
- Give visibility - Know what you're spending
- Enable caching - Don't pay twice for the same query
- Optimize routing - Use cheaper models when possible
- Alert early - Catch cost spikes before they hurt
The result? 70% cost reduction and $103K/year saved.
If you're using AI APIs, you need cost optimization. This tool is:
- ✅ Free and open source
- ✅ Production-ready
- ✅ Easy to integrate
- ✅ Actively maintained
Give it a try. Your finance team will thank you. 💰
Found this useful?
⭐ Star the repo: GitHub
🐦 Follow me: @dk_elumalai
💬 Share your savings in the comments!
Questions? Drop them below! I read and respond to every comment. 👇
Happy optimizing! 🚀
Built with ❤️ by a developer tired of surprise bills. Open source forever.