LLM Providers¶
SQLatte supports three LLM providers. Configure one in config.yaml:
Anthropic Claude (Recommended)¶
Best for: SQL generation, complex reasoning, highest quality
llm:
  provider: "anthropic"
  anthropic:
    api_key: "sk-ant-xxxxx"
    model: "claude-sonnet-4-20250514"  # Recommended
    max_tokens: 4096
    temperature: 0.0  # Deterministic (best for SQL)
    timeout: 60
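The layout above suggests that `provider` names which sibling block under `llm` is read. A minimal sketch of a sanity check on that assumption, run against an already-parsed config dictionary. `check_llm_config` and `KNOWN_PROVIDERS` are hypothetical names for illustration, not part of SQLatte:

```python
# Hypothetical helper: SQLatte's own config loading may differ.
KNOWN_PROVIDERS = {"anthropic", "gemini", "vertex"}


def check_llm_config(config: dict) -> list[str]:
    """Return a list of problems found in the `llm` section (empty if none)."""
    llm = config.get("llm") or {}
    provider = llm.get("provider")
    if provider not in KNOWN_PROVIDERS:
        return [f"unknown provider: {provider!r}"]

    # Assumption: the settings block is keyed by the provider name.
    settings = llm.get(provider)
    if not isinstance(settings, dict):
        return [f"missing '{provider}' block under 'llm'"]

    problems = []
    max_tokens = settings.get("max_tokens", 4096)
    if not isinstance(max_tokens, int) or max_tokens <= 0:
        problems.append("max_tokens must be a positive integer")
    return problems
```

An empty list means the selected provider has a matching block. Anything else is worth fixing before starting the server.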
Models¶
| Model | Best For | Speed | Cost |
|---|---|---|---|
| claude-sonnet-4-20250514 | Balanced (recommended) | Fast | $$ |
| claude-opus-4-20250514 | Highest quality | Slow | $$$ |
| claude-haiku-4-20250316 | Simple queries | Fastest | $ |
Getting API Key¶
- Visit console.anthropic.com
- Sign up / Log in
- Go to API Keys
- Click Create Key
- Copy the key (sk-ant-xxxxx)
- Add it to config.yaml
Pricing: Pay-as-you-go, ~$3 per 1M input tokens (Sonnet)
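To turn the per-token figure into a budget number, multiply the input token count by the rate. A rough sketch, using the ~$3 per 1M input tokens quoted above as the default. Output tokens are billed separately and are not included. `estimate_input_cost` is a hypothetical helper:

```python
def estimate_input_cost(input_tokens: int, usd_per_million: float = 3.0) -> float:
    """Approximate input-token cost in USD at a flat per-million rate."""
    return input_tokens / 1_000_000 * usd_per_million


# Example: 1,000 requests that each send a 2,000-token prompt (schema + question)
monthly_tokens = 1_000 * 2_000
print(f"${estimate_input_cost(monthly_tokens):.2f}")
```

Large schemas dominate the prompt size, so trimming the tables sent to the model is usually the most direct way to lower this number.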
Google Gemini¶
Best for: Free tier, fast responses, Google ecosystem
llm:
  provider: "gemini"
  gemini:
    api_key: "your-gemini-api-key"
    model: "gemini-2.0-flash-exp"  # Latest model
    max_tokens: 4096
    temperature: 0.0
Getting API Key¶
- Visit makersuite.google.com/app/apikey
- Sign in with Google account
- Click Create API Key
- Copy key
- Add to config.yaml
Free Tier: 60 requests/minute, 1,500 requests/day
Pricing: Free tier available, then pay-as-you-go
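A limit of 60 requests/minute averages out to one request per second. If a script sends queries in a loop, spacing the calls keeps it under that limit. A minimal client-side sketch; `MinIntervalThrottle` is a hypothetical helper and is not something SQLatte provides:

```python
import time


class MinIntervalThrottle:
    """Space calls so that at most `requests_per_minute` are sent per minute."""

    def __init__(self, requests_per_minute: float, clock=time.monotonic, sleep=time.sleep):
        self.interval = 60.0 / requests_per_minute
        self._clock = clock
        self._sleep = sleep
        self._next_allowed = 0.0

    def wait(self) -> float:
        """Block until the next call is allowed; return the seconds waited."""
        now = self._clock()
        delay = max(0.0, self._next_allowed - now)
        if delay > 0:
            self._sleep(delay)
        # Reserve the next slot one interval after this call's slot.
        self._next_allowed = max(now, self._next_allowed) + self.interval
        return delay


throttle = MinIntervalThrottle(requests_per_minute=60)
# Call throttle.wait() immediately before each request to the LLM.
```

This only handles the per-minute limit. The daily cap of 1,500 requests still needs separate tracking.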
Google Vertex AI¶
Best for: Enterprise GCP deployments, production scale
llm:
  provider: "vertex"
  vertex:
    project_id: "my-gcp-project"
    location: "us-central1"  # or europe-west1, asia-northeast1
    model: "gemini-2.0-flash-exp"
    credentials_path: "/path/to/service-account.json"
    max_tokens: 4096
    temperature: 0.0
Service Account Setup¶
- Create Service Account
- Grant Permissions:
  - Vertex AI User
  - Service Account Token Creator
- Download JSON Key
- Enable Vertex AI API
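Before pointing `credentials_path` at the downloaded key, it can help to confirm the file is a readable service account key. A minimal sketch that checks the fields a Google service account JSON key normally carries. `check_service_account_key` is a hypothetical helper, and it does not verify permissions or contact GCP:

```python
import json
from pathlib import Path

REQUIRED_FIELDS = ("type", "project_id", "private_key", "client_email")


def check_service_account_key(path: str) -> list[str]:
    """Return a list of problems with the key file (empty if it looks usable)."""
    key_file = Path(path)
    if not key_file.is_file():
        return [f"file not found: {path}"]
    try:
        data = json.loads(key_file.read_text(encoding="utf-8"))
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc.msg}"]
    if not isinstance(data, dict):
        return ["top-level JSON value is not an object"]

    problems = [f"missing field: {name}" for name in REQUIRED_FIELDS if name not in data]
    if "type" in data and data["type"] != "service_account":
        problems.append(f"unexpected type: {data['type']!r}")
    return problems
```

The `project_id` inside the key should normally match the `project_id` set in config.yaml.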
Pricing: Enterprise pricing, similar to Gemini API
Comparison¶
| Feature | Anthropic Claude | Google Gemini | Vertex AI |
|---|---|---|---|
| Quality (SQL) | ⭐⭐⭐⭐⭐ | ⭐⭐⭐⭐ | ⭐⭐⭐⭐ |
| Speed | Fast | Very Fast | Fast |
| Cost | $$ | $ (free tier) | $$ |
| Setup | Easy | Easiest | Complex |
| Free Tier | No | Yes | No |
| Best For | Production | Testing/Free | Enterprise GCP |
Model Parameters¶
Temperature¶
temperature: 0.0 # Deterministic (recommended for SQL)
temperature: 0.5 # Balanced
temperature: 1.0 # Creative (not recommended for SQL)
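Recommendation: Use 0.0 for SQL generation so that the same question produces consistent queries. Output at 0.0 is close to deterministic, but providers do not guarantee identical responses on every call.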
Max Tokens¶
Timeout¶
Testing LLM Connection¶
# Via Admin Panel
http://localhost:8000/admin
# → Providers tab → Test LLM button
# Via API
curl http://localhost:8000/health
# Check: "llm": "available"
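The same health check can be scripted, for example in a deployment smoke test. A minimal sketch, assuming `/health` returns a JSON object containing the `"llm": "available"` field shown above. The rest of the response shape is unknown, and both function names are hypothetical:

```python
import json
from urllib.request import urlopen


def llm_available(health: dict) -> bool:
    """True when the health payload reports the LLM as available."""
    return health.get("llm") == "available"


def fetch_health(base_url: str = "http://localhost:8000", timeout: float = 5.0) -> dict:
    """GET {base_url}/health and parse the JSON body."""
    with urlopen(f"{base_url}/health", timeout=timeout) as response:
        return json.load(response)


# Usage (requires a running SQLatte instance):
#   if not llm_available(fetch_health()):
#       raise SystemExit("LLM provider is not reachable")
```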
Switching Providers¶
Change LLM provider in real-time:
Option 1: Config File
Option 2: Admin Panel
http://localhost:8000/admin
→ Providers tab
→ Select LLM provider
→ Save
→ Hot reload (no restart needed!)
Cost Optimization¶
Use Cheaper Models for Simple Queries¶
Adjust Token Limits¶
Cache Prompts (Future)¶
Troubleshooting¶
Invalid API Key¶
# Error: 401 Unauthorized
# - Verify API key is correct
# - Check key hasn't expired
# - Ensure no extra spaces in config.yaml
Rate Limit Exceeded¶
# Error: 429 Too Many Requests
# - Reduce request frequency
# - Upgrade to paid tier
# - Use different model
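For scripted callers, "reduce request frequency" usually means retrying with exponential backoff. A minimal sketch of that pattern. `RateLimited` and `call_with_backoff` are hypothetical names, and the caller is assumed to raise `RateLimited` when the provider answers with 429:

```python
import time


class RateLimited(Exception):
    """Raised by the caller's send function on a 429 response."""


def call_with_backoff(send, max_attempts: int = 5, base_delay: float = 1.0, sleep=time.sleep):
    """Call send(); on RateLimited, wait base_delay * 2**attempt and retry."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        try:
            return send()
        except RateLimited:
            if attempt == max_attempts - 1:
                raise  # Out of retries: surface the error to the caller.
            sleep(base_delay * 2 ** attempt)
```

With the defaults, the waits between attempts are 1, 2, 4, and 8 seconds. Adding a small random jitter to each wait is a common refinement when many clients retry at once.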
Model Not Found¶
# Error: 404 Model not found
# - Check model name spelling
# - Verify model is available in your region
# - Use latest model names
Timeout¶
# Error: Request timeout
# - Increase timeout in config
# - Simplify query/schema
# - Use faster model (Haiku, Gemini Flash)
Best Practices¶
For Production¶
- ✅ Use Claude Sonnet 4 for best SQL quality
- ✅ Set temperature: 0.0 for deterministic results
- ✅ Monitor API costs with usage limits
- ✅ Use environment variables for API keys
For Development¶
- ✅ Use Gemini free tier for testing
- ✅ Test with sample queries before production
- ✅ Validate generated SQL before execution
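One cheap validation step is to reject anything that is not a single read-only statement before it reaches the database. A conservative sketch using keyword checks. It is a first filter, not a SQL parser, and it may reject valid queries (for example a semicolon inside a string literal). `is_single_select` is a hypothetical helper:

```python
import re

_READ_START = re.compile(r"(select|with)\b", re.IGNORECASE)
_WRITE_KEYWORDS = re.compile(
    r"\b(insert|update|delete|drop|alter|create|truncate|grant|merge)\b",
    re.IGNORECASE,
)


def is_single_select(sql: str) -> bool:
    """True if sql looks like one SELECT (or WITH ... SELECT) statement."""
    statement = sql.strip().rstrip(";").strip()
    if not statement or ";" in statement:
        return False  # Empty input or more than one statement.
    if _WRITE_KEYWORDS.search(statement):
        return False  # Any write/DDL keyword is treated as unsafe.
    return _READ_START.match(statement) is not None
```

Running generated queries under a database user with read-only privileges remains the stronger safeguard; a check like this only catches obvious problems early.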
For Enterprise¶
- ✅ Use Vertex AI for GCP integration
- ✅ Set up service account with minimal permissions
- ✅ Enable audit logging
- ✅ Use VPC for network isolation
Environment Variables¶
Store API keys securely:
# config.yaml
llm:
  provider: "anthropic"
  anthropic:
    api_key: "${ANTHROPIC_API_KEY}"  # From environment
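The `${ANTHROPIC_API_KEY}` placeholder is filled from the process environment when the config is loaded. How SQLatte performs the substitution is not shown here, so the following is only a sketch of the general technique, with a hypothetical `expand_env` helper:

```python
import os
import re

_PLACEHOLDER = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def expand_env(value: str, env=None) -> str:
    """Replace each ${NAME} in value with env[NAME]; fail loudly if unset."""
    env = os.environ if env is None else env

    def replace(match: re.Match) -> str:
        name = match.group(1)
        if name not in env:
            raise KeyError(f"environment variable {name} is not set")
        return env[name]

    return _PLACEHOLDER.sub(replace, value)
```

Raising on a missing variable, rather than substituting an empty string, surfaces a forgotten export at startup instead of as a 401 on the first query.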