---
title: Error Handling
description: Error codes and troubleshooting
---

# Error Handling & Troubleshooting

Comprehensive guide to handling errors, implementing retries, and troubleshooting common issues with the HiveOps API.
## HTTP Status Codes

| Code | Meaning | Description | Action |
|---|---|---|---|
| 200 | OK | Request successful | Continue normally |
| 400 | Bad Request | Invalid request format or parameters | Fix request, don't retry |
| 401 | Unauthorized | Invalid or missing API key | Check API key |
| 403 | Forbidden | Insufficient balance or blocked account | Add funds or contact support |
| 429 | Too Many Requests | Rate limit exceeded | Implement exponential backoff |
| 500 | Internal Server Error | Server-side error | Retry with exponential backoff |
| 503 | Service Unavailable | Temporary overload or maintenance | Retry after delay |
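The Action column boils down to a retryable/fatal split, which a client can capture in a tiny helper. This is an illustrative sketch; the names `RETRYABLE` and `should_retry` are ours, not part of any SDK:

```python
# Sketch: classify an HTTP status code into a retry decision.
# The sets and function name are illustrative, not part of the HiveOps API.

RETRYABLE = {429, 500, 503}   # back off and retry
FATAL = {400, 401, 403}       # fix the request or account first; retrying won't help

def should_retry(status: int) -> bool:
    """Return True if the request may succeed on a later attempt."""
    return status in RETRYABLE

print(should_retry(429))  # True: rate limited, retry with backoff
print(should_retry(400))  # False: fix the request instead
```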
## Error Response Format

All errors return JSON in this format:

```json
{
  "error": {
    "message": "Human-readable error description",
    "type": "error_type",
    "code": "error_code"
  }
}
```
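Because every error body follows this shape, it can be parsed uniformly before deciding how to react. A minimal sketch; the helper name `parse_error` is ours:

```python
import json

def parse_error(body: str) -> tuple:
    """Extract (message, type, code) from a HiveOps-style error body."""
    err = json.loads(body)["error"]
    return err["message"], err["type"], err["code"]

body = '{"error": {"message": "Invalid API key provided", "type": "invalid_request_error", "code": "invalid_api_key"}}'
message, err_type, code = parse_error(body)
print(code)  # invalid_api_key
```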
### Example Errors

**401 Unauthorized:**

```json
{
  "error": {
    "message": "Invalid API key provided",
    "type": "invalid_request_error",
    "code": "invalid_api_key"
  }
}
```

**429 Rate Limit:**

```json
{
  "error": {
    "message": "Rate limit exceeded. Please try again later.",
    "type": "rate_limit_error",
    "code": "rate_limit_exceeded"
  }
}
```

**403 Insufficient Balance:**

```json
{
  "error": {
    "message": "Insufficient account balance. Please add funds to continue.",
    "type": "insufficient_quota",
    "code": "insufficient_balance"
  }
}
```
## Error Types

### 1. Invalid Request Errors (400)

**Common Causes:**

- Missing required parameters
- Invalid parameter values
- Malformed JSON
- Model name not recognized

**Example:**

```python
from openai import OpenAI

client = OpenAI(
    api_key="sk-YOUR-API-KEY",
    base_url="https://ai.hiveops.io"
)

try:
    response = client.chat.completions.create(
        model="invalid-model-name",  # Wrong model
        messages=[{"role": "user", "content": "Hi"}]
    )
except Exception as e:
    print(f"Error: {e}")
    # Error: Invalid model 'invalid-model-name' specified
```

**Solution:**

- Validate request parameters before sending
- Use correct model names (see the API Reference)
- Check request format against the documentation
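The first point can be made concrete with a pre-flight check that rejects bad requests before they cost a round trip. A sketch under stated assumptions: `KNOWN_MODELS` is a hand-maintained illustrative list (consult the API Reference for the authoritative set), and `validate_request` is our name:

```python
# Sketch: validate a request locally before sending it.
# KNOWN_MODELS is illustrative; check the API Reference for the real list.
KNOWN_MODELS = {
    "llama3:8b-instruct-q8_0",
    "llama-3-70b-instruct",
    "mistral-7b-instruct-v0.3",
    "gemma-2-9b-it",
}

def validate_request(model: str, messages: list) -> None:
    """Raise ValueError for requests that would fail with a 400."""
    if model not in KNOWN_MODELS:
        raise ValueError(f"Unknown model: {model!r}")
    if not messages:
        raise ValueError("messages must not be empty")
    for m in messages:
        if "role" not in m or "content" not in m:
            raise ValueError(f"Malformed message: {m!r}")

validate_request("llama3:8b-instruct-q8_0", [{"role": "user", "content": "Hi"}])
print("request looks valid")
```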
### 2. Authentication Errors (401)

**Common Causes:**

- Missing `Authorization` header
- Invalid API key
- Expired API key (keys expire after 90 days)
- Revoked API key

**Example:**

```python
from openai import AuthenticationError

try:
    response = client.chat.completions.create(
        model="llama3:8b-instruct-q8_0",
        messages=[{"role": "user", "content": "Hi"}]
    )
except AuthenticationError as e:
    print(f"Authentication failed: {e}")
```

**Solutions:**

- **Verify API Key:**

  ```python
  import os

  api_key = os.getenv("HIVEOPS_API_KEY")
  if not api_key or not api_key.startswith("sk-"):
      raise ValueError("Invalid or missing API key")
  ```

- **Check Key Expiration:**
  - API keys expire after 90 days
  - Generate a new key in Dashboard → API Keys

- **Ensure Correct Header:**

  ```bash
  curl https://ai.hiveops.io/models \
    -H "Authorization: Bearer sk-YOUR-API-KEY"  # Must include "Bearer "
  ```
### 3. Insufficient Balance (403)

**Cause:** Your account balance is $0 or negative.

**Error Message:**

```
Insufficient account balance. Please add funds to continue.
```

**Solutions:**

- **Check Balance:**
  - Go to the Dashboard
  - View the current balance in the top right

- **Add Funds:**
  - Click "Add Funds" in the dashboard
  - Minimum top-up: $10
  - Maximum: $1,000 per transaction

- **Handle in Code:**

  ```python
  from openai import APIError

  try:
      response = client.chat.completions.create(...)
  except APIError as e:
      if "insufficient_balance" in str(e).lower():
          print("⚠️ Balance too low! Add funds at https://hiveops.io/developer/billing")
          # Notify an admin, pause processing, etc.
      raise
  ```
### 4. Rate Limit Errors (429)

**Limits:**

- 60 requests per minute
- 150,000 tokens per minute

**Error Response Headers:**

```
X-RateLimit-Limit-Requests: 60
X-RateLimit-Remaining-Requests: 0
X-RateLimit-Reset-Requests: 2026-03-20T12:30:00Z
```
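Since the reset header carries a timestamp, a client that can read response headers can wait exactly until the window resets instead of guessing. A sketch; the helper name `seconds_until_reset` is ours, and it falls back to a short default wait when the header is absent:

```python
from datetime import datetime, timezone
from typing import Optional

def seconds_until_reset(headers: dict, now: Optional[datetime] = None) -> float:
    """Wait time implied by X-RateLimit-Reset-Requests (0 if already past)."""
    reset = headers.get("X-RateLimit-Reset-Requests")
    if reset is None:
        return 1.0  # no header: fall back to a short default wait
    reset_at = datetime.fromisoformat(reset.replace("Z", "+00:00"))
    now = now or datetime.now(timezone.utc)
    return max((reset_at - now).total_seconds(), 0.0)

headers = {"X-RateLimit-Reset-Requests": "2026-03-20T12:30:00Z"}
wait = seconds_until_reset(
    headers, now=datetime(2026, 3, 20, 12, 29, 30, tzinfo=timezone.utc)
)
print(wait)  # 30.0
```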
**Solution: Implement Exponential Backoff**

**Python:**

```python
import time
import random

from openai import OpenAI, RateLimitError

client = OpenAI(
    api_key="sk-YOUR-API-KEY",
    base_url="https://ai.hiveops.io"
)

def call_with_retry(max_retries=5):
    for attempt in range(max_retries):
        try:
            response = client.chat.completions.create(
                model="llama3:8b-instruct-q8_0",
                messages=[{"role": "user", "content": "Hello"}]
            )
            return response
        except RateLimitError:
            if attempt == max_retries - 1:
                raise  # Give up after max retries
            # Exponential backoff: 2^attempt seconds + jitter
            wait_time = (2 ** attempt) + random.uniform(0, 1)
            print(f"Rate limited. Retrying in {wait_time:.2f}s...")
            time.sleep(wait_time)
        except Exception as e:
            print(f"Unexpected error: {e}")
            raise

response = call_with_retry()
print(response.choices[0].message.content)
```
**JavaScript:**

```javascript
import OpenAI from "openai";

const client = new OpenAI({
  apiKey: "sk-YOUR-API-KEY",
  baseURL: "https://ai.hiveops.io",
});

async function callWithRetry(maxRetries = 5) {
  for (let attempt = 0; attempt < maxRetries; attempt++) {
    try {
      const response = await client.chat.completions.create({
        model: "llama3:8b-instruct-q8_0",
        messages: [{ role: "user", content: "Hello" }],
      });
      return response;
    } catch (error) {
      if (error.status === 429 && attempt < maxRetries - 1) {
        // Exponential backoff with jitter
        const waitTime = Math.pow(2, attempt) * 1000 + Math.random() * 1000;
        console.log(`Rate limited. Retrying in ${waitTime / 1000}s...`);
        await new Promise((resolve) => setTimeout(resolve, waitTime));
      } else {
        throw error;
      }
    }
  }
}

const response = await callWithRetry();
console.log(response.choices[0].message.content);
```
### 5. Server Errors (500, 503)

**Causes:**

- Temporary server overload
- Model inference timeout
- Internal system errors

**Solution: Retry with Backoff**

```python
import time

from openai import OpenAI, APIError

client = OpenAI(
    api_key="sk-YOUR-API-KEY",
    base_url="https://ai.hiveops.io"
)

def call_with_server_retry(max_retries=3):
    for attempt in range(max_retries):
        try:
            response = client.chat.completions.create(
                model="llama3:8b-instruct-q8_0",
                messages=[{"role": "user", "content": "Hello"}],
                timeout=30  # 30-second timeout
            )
            return response
        except APIError as e:
            if e.status_code in (500, 503) and attempt < max_retries - 1:
                wait_time = 2 ** attempt  # 1s, 2s, 4s
                print(f"Server error. Retrying in {wait_time}s...")
                time.sleep(wait_time)
            else:
                raise

response = call_with_server_retry()
```
## Complete Error Handling Template

### Python Production Template

```python
import logging
import random
import time

from openai import OpenAI, APIError, RateLimitError, AuthenticationError

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)

client = OpenAI(
    api_key="sk-YOUR-API-KEY",
    base_url="https://ai.hiveops.io"
)

def robust_api_call(
    messages,
    model="llama3:8b-instruct-q8_0",
    max_retries=5,
    timeout=30
):
    """Call the HiveOps API with comprehensive error handling."""
    for attempt in range(max_retries):
        try:
            response = client.chat.completions.create(
                model=model,
                messages=messages,
                timeout=timeout
            )
            logger.info(f"Success after {attempt + 1} attempt(s)")
            return response
        except AuthenticationError as e:
            # Don't retry auth errors
            logger.error(f"Authentication failed: {e}")
            raise
        except RateLimitError:
            if attempt == max_retries - 1:
                logger.error("Rate limit exceeded after max retries")
                raise
            wait_time = (2 ** attempt) + random.uniform(0, 1)
            logger.warning(f"Rate limited. Retrying in {wait_time:.2f}s...")
            time.sleep(wait_time)
        except APIError as e:
            # Check for specific error types
            if "insufficient_balance" in str(e).lower():
                logger.error("Insufficient balance. Cannot retry.")
                raise
            if e.status_code in (500, 503) and attempt < max_retries - 1:
                wait_time = 2 ** attempt
                logger.warning(f"Server error {e.status_code}. Retrying in {wait_time}s...")
                time.sleep(wait_time)
            else:
                logger.error(f"API error: {e}")
                raise
        except Exception as e:
            logger.error(f"Unexpected error: {e}")
            raise
    raise Exception("Max retries exceeded")

# Usage
try:
    response = robust_api_call(
        messages=[{"role": "user", "content": "Hello!"}],
        model="llama3:8b-instruct-q8_0"
    )
    print(response.choices[0].message.content)
except Exception as e:
    print(f"Failed after all retries: {e}")
    # Handle gracefully (log, alert, fall back, etc.)
```
### TypeScript Production Template

```typescript
import OpenAI from "openai";
import type { ChatCompletionMessageParam } from "openai/resources/chat";

const client = new OpenAI({
  apiKey: process.env.HIVEOPS_API_KEY!,
  baseURL: "https://ai.hiveops.io",
});

interface RetryOptions {
  maxRetries?: number;
  timeout?: number;
  onRetry?: (attempt: number, error: any) => void;
}

async function robustApiCall(
  messages: ChatCompletionMessageParam[],
  model: string = "llama3:8b-instruct-q8_0",
  options: RetryOptions = {},
) {
  const { maxRetries = 5, timeout = 30000, onRetry } = options;
  for (let attempt = 0; attempt < maxRetries; attempt++) {
    try {
      const response = await client.chat.completions.create(
        { model, messages },
        { timeout }, // timeout is a per-request option, not part of the body
      );
      console.log(`Success after ${attempt + 1} attempt(s)`);
      return response;
    } catch (error: any) {
      // Authentication errors - don't retry
      if (error.status === 401) {
        console.error("Authentication failed");
        throw error;
      }
      // Insufficient balance - don't retry
      if (error.message?.includes("insufficient_balance")) {
        console.error("Insufficient balance");
        throw error;
      }
      // Rate limit - retry with backoff
      if (error.status === 429 && attempt < maxRetries - 1) {
        const waitTime = Math.pow(2, attempt) * 1000 + Math.random() * 1000;
        console.warn(`Rate limited. Retrying in ${waitTime / 1000}s...`);
        onRetry?.(attempt, error);
        await new Promise((resolve) => setTimeout(resolve, waitTime));
        continue;
      }
      // Server errors - retry
      if ([500, 503].includes(error.status) && attempt < maxRetries - 1) {
        const waitTime = Math.pow(2, attempt) * 1000;
        console.warn(
          `Server error ${error.status}. Retrying in ${waitTime / 1000}s...`,
        );
        onRetry?.(attempt, error);
        await new Promise((resolve) => setTimeout(resolve, waitTime));
        continue;
      }
      // Give up
      console.error("API call failed:", error);
      throw error;
    }
  }
  throw new Error("Max retries exceeded");
}

// Usage
try {
  const response = await robustApiCall(
    [{ role: "user", content: "Hello!" }],
    "llama3:8b-instruct-q8_0",
    {
      maxRetries: 5,
      onRetry: (attempt, error) => {
        console.log(`Retry ${attempt + 1}: ${error.message}`);
      },
    },
  );
  console.log(response.choices[0].message.content);
} catch (error) {
  console.error("Failed after all retries:", error);
  // Handle gracefully
}
```
## Common Issues & Solutions

### Issue 1: "Model not found"

**Error:**

```
Invalid model 'gpt-3.5-turbo' specified
```

**Cause:** Using OpenAI model names instead of HiveOps models.

**Solution:** Use HiveOps model names:

| OpenAI Model | HiveOps Equivalent |
|---|---|
| gpt-3.5-turbo | llama3:8b-instruct-q8_0 |
| gpt-4 | llama-3-70b-instruct |

```python
# Wrong
model="gpt-3.5-turbo"

# Correct
model="llama3:8b-instruct-q8_0"
```
### Issue 2: Requests Still Going to OpenAI

**Symptom:** Requests work, but you're being charged by OpenAI, not HiveOps.

**Cause:** `base_url` is not set.

**Solution:**

```python
# Make sure base_url is set
client = OpenAI(
    api_key="sk-YOUR-HIVEOPS-KEY",  # HiveOps key, not OpenAI
    base_url="https://ai.hiveops.io"  # Must include this!
)
```
### Issue 3: Slow Response Times

**Symptoms:**

- Requests timing out
- High latency (>5 seconds)

**Causes & Solutions:**

- **Large `max_tokens`:**

  ```python
  # Slow (generates up to 4000 tokens)
  response = client.chat.completions.create(
      model="llama3:8b-instruct-q8_0",
      messages=[...],
      max_tokens=4000
  )

  # Faster (limit to what you need)
  response = client.chat.completions.create(
      model="llama3:8b-instruct-q8_0",
      messages=[...],
      max_tokens=500  # Only generate what you need
  )
  ```

- **Long conversation history:**

  ```python
  # Slow (processing 50+ messages)
  messages = conversation_history  # 50 messages

  # Faster (use a sliding window)
  messages = [
      {"role": "system", "content": system_prompt},
      *conversation_history[-10:]  # Last 10 messages only
  ]
  ```

- **Choose a faster model:**

  | Model | Avg Latency |
  |---|---|
  | mistral-7b-instruct-v0.3 | ~200ms |
  | llama3:8b-instruct-q8_0 | ~300ms |
  | gemma-2-9b-it | ~350ms |
  | llama-3-70b-instruct | ~800ms |
### Issue 4: Streaming Not Working

**Problem:** Streaming doesn't output incrementally.

**Cause:** Not handling the stream correctly.

**Solution (Python):**

```python
# Correct streaming implementation
stream = client.chat.completions.create(
    model="llama3:8b-instruct-q8_0",
    messages=[{"role": "user", "content": "Count to 10"}],
    stream=True
)

for chunk in stream:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)  # flush=True is key
```

**Solution (JavaScript):**

```javascript
const stream = await client.chat.completions.create({
  model: "llama3:8b-instruct-q8_0",
  messages: [{ role: "user", content: "Count to 10" }],
  stream: true,
});

for await (const chunk of stream) {
  const content = chunk.choices[0]?.delta?.content || "";
  process.stdout.write(content); // Write incrementally
}
```
### Issue 5: High Costs

**Problem:** Bills are higher than expected.

**Solutions:**

- **Monitor token usage:**

  ```python
  response = client.chat.completions.create(...)

  print(f"Prompt tokens: {response.usage.prompt_tokens}")
  print(f"Completion tokens: {response.usage.completion_tokens}")
  print(f"Total: {response.usage.total_tokens}")

  # Calculate cost (Llama 3 8B)
  input_cost = (response.usage.prompt_tokens / 1_000_000) * 0.01
  output_cost = (response.usage.completion_tokens / 1_000_000) * 0.02
  print(f"Cost: ${input_cost + output_cost:.6f}")
  ```

- **Set budget alerts** (coming soon)

- **Use cheaper models for simple tasks:**
| Task Complexity | Model | Cost Multiplier |
|---|---|---|
| Simple | mistral-7b-instruct-v0.3 | 1x (baseline) |
| Medium | llama3:8b-instruct-q8_0 | 10x |
| Complex | llama-3-70b-instruct | 100x |
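One way to apply the table above in code is a small lookup keyed by task complexity. An illustrative sketch; the mapping, fallback choice, and `pick_model` name are ours, not HiveOps features:

```python
# Sketch: pick a model tier by task complexity, mirroring the table above.
MODEL_BY_COMPLEXITY = {
    "simple": "mistral-7b-instruct-v0.3",
    "medium": "llama3:8b-instruct-q8_0",
    "complex": "llama-3-70b-instruct",
}

def pick_model(complexity: str) -> str:
    """Return the model for a complexity tier, defaulting to the medium tier."""
    return MODEL_BY_COMPLEXITY.get(complexity, MODEL_BY_COMPLEXITY["medium"])

print(pick_model("simple"))   # mistral-7b-instruct-v0.3
print(pick_model("unknown"))  # llama3:8b-instruct-q8_0 (safe default)
```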
## Debugging Checklist

When things aren't working:

- Is `base_url` set to `https://ai.hiveops.io`?
- Is the API key correct, and does it start with `sk-`?
- Is the model name correct? (Use `llama3:8b-instruct-q8_0`, not `gpt-3.5-turbo`)
- Is the account balance positive?
- Are you within rate limits? (60 req/min)
- Is the request format valid? (Check JSON syntax)
- Did you implement retry logic for 429/500/503 errors?
- Are you using the latest SDK version?
  - Python: `pip install --upgrade openai`
  - JavaScript: `npm install openai@latest`
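The first two checklist items can be verified locally before sending any request. A sketch, assuming the key lives in the `HIVEOPS_API_KEY` environment variable; `check_config` is an illustrative helper, not part of the SDK:

```python
import os
from typing import List, Optional

def check_config(api_key: Optional[str], base_url: str) -> List[str]:
    """Return a list of configuration problems (empty means the basics look OK)."""
    problems = []
    if not api_key:
        problems.append("HIVEOPS_API_KEY is not set")
    elif not api_key.startswith("sk-"):
        problems.append("API key does not start with 'sk-'")
    if base_url.rstrip("/") != "https://ai.hiveops.io":
        problems.append(f"unexpected base_url: {base_url!r}")
    return problems

for problem in check_config(os.getenv("HIVEOPS_API_KEY"), "https://ai.hiveops.io"):
    print("✗", problem)
```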
## Testing Error Handling

### Test Rate Limiting

```python
import time

from openai import RateLimitError

# Send 65 requests quickly to trigger the rate limit
for i in range(65):
    try:
        response = client.chat.completions.create(
            model="llama3:8b-instruct-q8_0",
            messages=[{"role": "user", "content": f"Message {i}"}]
        )
        print(f"Request {i}: Success")
    except RateLimitError:
        print(f"Request {i}: Rate limited!")
        break
    time.sleep(0.1)
```
Test Balance Check
# Check balance before expensive operation
def check_balance_first():
try:
# Make a cheap request to verify account is active
client.chat.completions.create(
model="mistral-7b-instruct-v0.3",
messages=[{"role": "user", "content": "test"}],
max_tokens=1
)
return True
except APIError as e:
if "insufficient_balance" in str(e).lower():
print("⚠️ Balance too low!")
return False
raise
if check_balance_first():
# Proceed with main operation
response = client.chat.completions.create(...)
## Support & Resources

### When to Contact Support

Contact [email protected] if you encounter:

- Persistent 500 errors (not resolved by retries)
- A suspected compromised API key
- Billing discrepancies
- A locked or suspended account
- Feature requests

Do NOT contact support for:

- How to use the API (read the documentation first)
- Model quality issues (models are third-party)
- Rate limit increases (limits are standard for all users)

### Helpful Resources

- 📚 API Reference - Complete endpoint documentation
- 🚀 Quickstart Guide - Get started in 5 minutes
- 💬 Discord Community - Ask questions, get help
- 📧 Email Support: [email protected]
- 🐛 Report Bugs: [email protected]
Last Updated: March 20, 2024