Nerif Batch API (OpenAI-Compatible)
Nerif provides an OpenAI-compatible Batch API that allows you to process large volumes of requests asynchronously. This implementation follows the same interface as OpenAI's Batch API.
Overview
The Batch API enables you to:
- Send multiple requests in a single batch file
- Process requests asynchronously at 50% lower cost
- Handle large-scale operations with a 24-hour completion window
- Track batch status and retrieve results
Installation
The Batch API is included in the Nerif package:
pip install nerif
Quick Start
1. Create a Batch Input File
First, create a JSONL file with your requests:
from nerif.batch import BatchFile
# Define your requests
requests = [
    {
        "custom_id": "request-1",
        "method": "POST",
        "url": "/v1/chat/completions",
        "body": {
            "model": "gpt-4o",
            "messages": [
                {"role": "user", "content": "What is the capital of France?"}
            ],
            "max_tokens": 100
        }
    },
    {
        "custom_id": "request-2",
        "method": "POST",
        "url": "/v1/embeddings",
        "body": {
            "model": "text-embedding-3-small",
            "input": "Hello world"
        }
    }
]
# Create the batch file
batch_file = BatchFile()
input_file = batch_file.create_batch_file(requests)
print(f"Created input file: {input_file['id']}")
2. Create a Batch Job
from nerif.batch import Batch
# Create a batch job
batch = Batch.create(
    input_file_id=input_file['id'],
    endpoint="/v1/chat/completions",
    completion_window="24h",
    metadata={
        "description": "My batch job"
    }
)
print(f"Created batch: {batch['id']}")
print(f"Status: {batch['status']}")
3. Monitor Batch Progress
import time
# Check batch status
while True:
    batch_status = Batch.retrieve(batch['id'])
    print(f"Status: {batch_status['status']}")
    if batch_status['status'] in ['completed', 'failed', 'cancelled', 'expired']:
        break
    time.sleep(10)  # Check every 10 seconds

# Get results
if batch_status['status'] == 'completed':
    print(f"Output file: {batch_status['output_file_id']}")
    print(f"Completed: {batch_status['request_counts']['completed']}")
    print(f"Failed: {batch_status['request_counts']['failed']}")
API Reference
BatchFile Class
Handles JSONL file operations for batch requests.
# Create a batch file
file_info = batch_file.create_batch_file(requests, file_id=None)
# Read a batch file
requests = batch_file.read_batch_file(file_id)
# Create output/error files
output_file = batch_file.create_output_file(batch_id, results)
error_file = batch_file.create_error_file(batch_id, errors)
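A quick round trip is a handy sanity check before submitting a batch. A minimal sketch, assuming read_batch_file returns the parsed request dicts in order:
# Hypothetical sanity check: write requests out, then read them back
file_info = batch_file.create_batch_file(requests)
loaded = batch_file.read_batch_file(file_info['id'])
assert len(loaded) == len(requests)
assert loaded[0]['custom_id'] == requests[0]['custom_id']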
Batch Class
Main interface for batch operations, matching OpenAI's API exactly.
Create a Batch
batch = Batch.create(
    input_file_id="file-abc123",
    endpoint="/v1/chat/completions",
    completion_window="24h",
    metadata={"key": "value"}  # Optional, max 16 pairs
)
Retrieve a Batch
batch = Batch.retrieve("batch_abc123")
Cancel a Batch
cancelled_batch = Batch.cancel("batch_abc123")
List Batches
batches = Batch.list(
    after="batch_abc123",  # Optional pagination cursor
    limit=20  # 1-100
)
Request Format
Each request in the JSONL file must follow this format:
{
  "custom_id": "unique-request-id",
  "method": "POST",
  "url": "/v1/chat/completions",
  "body": {
    "model": "gpt-4o",
    "messages": [
      {"role": "user", "content": "Your message here"}
    ],
    "max_tokens": 100
  }
}
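If you prefer to build the JSONL file by hand instead of using BatchFile, write one JSON object per line:
import json

# JSONL: one self-contained JSON object per line, no enclosing array
with open("batch_input.jsonl", "w") as f:
    for request in requests:
        f.write(json.dumps(request) + "\n")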
Supported Endpoints
- /v1/chat/completions - Chat completions
- /v1/embeddings - Text embeddings
- /v1/completions - Text completions (legacy)
Batch Object
A batch object contains:
{
  "id": "batch_abc123",
  "object": "batch",
  "endpoint": "/v1/chat/completions",
  "errors": null,
  "input_file_id": "file-abc123",
  "completion_window": "24h",
  "status": "completed",
  "output_file_id": "file-def456",
  "error_file_id": null,
  "created_at": 1714508499,
  "in_progress_at": 1714508500,
  "expires_at": 1714594899,
  "completed_at": 1714508510,
  "request_counts": {
    "total": 100,
    "completed": 98,
    "failed": 2
  },
  "metadata": {
    "description": "Nightly job"
  }
}
Batch Status Values
- validating - Validating input file
- failed - Input validation failed
- in_progress - Processing requests
- finalizing - Generating result files
- completed - Batch completed successfully
- expired - Batch expired (24h window)
- cancelling - Batch is being cancelled
- cancelled - Batch was cancelled
Output Format
Results are saved in JSONL format:
{
  "id": "response-1",
  "custom_id": "request-1",
  "response": {
    "status_code": 200,
    "request_id": "req_123",
    "body": {
      "id": "chatcmpl-123",
      "object": "chat.completion",
      "created": 1714508505,
      "model": "gpt-4o",
      "choices": [{
        "index": 0,
        "message": {
          "role": "assistant",
          "content": "Paris is the capital of France."
        },
        "finish_reason": "stop"
      }],
      "usage": {
        "prompt_tokens": 10,
        "completion_tokens": 8,
        "total_tokens": 18
      }
    }
  },
  "error": null
}
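Output records are matched to inputs by custom_id, so it is safest to index them rather than rely on line order. A sketch using the standard json module on a local copy of the output file:
import json

results_by_id = {}
with open("batch_output.jsonl") as f:  # Hypothetical local copy of the output file
    for line in f:
        record = json.loads(line)
        results_by_id[record['custom_id']] = record

# Pull the assistant's reply for a specific request
answer = results_by_id['request-1']['response']['body']['choices'][0]['message']['content']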
Error Handling
Failed requests are saved in a separate error file:
{
  "custom_id": "request-2",
  "response": {
    "status_code": 500,
    "request_id": "req_124",
    "body": null
  },
  "error": {
    "message": "Internal server error",
    "type": "server_error",
    "code": "internal_error"
  }
}
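A common pattern is to collect the failed custom_ids from the error file and resubmit only those requests. A sketch, assuming the original requests list is still in hand and the error file has been saved locally:
import json

failed_ids = set()
with open("batch_errors.jsonl") as f:  # Hypothetical local copy of the error file
    for line in f:
        failed_ids.add(json.loads(line)['custom_id'])

# Resubmit only the requests that failed
retry_requests = [r for r in requests if r['custom_id'] in failed_ids]
if retry_requests:
    retry_file = batch_file.create_batch_file(retry_requests)
    retry_batch = Batch.create(
        input_file_id=retry_file['id'],
        endpoint="/v1/chat/completions",
        completion_window="24h"
    )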
Best Practices
- Batch Size: While there's no hard limit, keep batches reasonable (e.g., 1000-5000 requests)
- Completion Window: All batches have a 24-hour completion window
- Metadata: Use metadata to track batch purpose, source, etc. (max 16 key-value pairs)
- Error Handling: Always check both output and error files
- Polling: For production, poll less frequently (e.g., every few minutes) or use webhooks; a backoff sketch follows this list
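A polling sketch with exponential backoff, using the terminal statuses listed above and capping the wait at five minutes:
import time
from nerif.batch import Batch

TERMINAL_STATUSES = {'completed', 'failed', 'cancelled', 'expired'}

def wait_for_batch(batch_id, initial_delay=10, max_delay=300):
    """Poll a batch until it reaches a terminal state, backing off between checks."""
    delay = initial_delay
    while True:
        batch = Batch.retrieve(batch_id)
        if batch['status'] in TERMINAL_STATUSES:
            return batch
        time.sleep(delay)
        delay = min(delay * 2, max_delay)  # Double the wait, up to the cap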
Example: Processing Multiple Files
from nerif.batch import Batch, BatchFile

# Process multiple text files
texts = ["file1.txt", "file2.txt", "file3.txt"]
requests = []

for i, filename in enumerate(texts):
    with open(filename, 'r') as f:
        content = f.read()

    requests.append({
        "custom_id": f"summarize-{i}",
        "method": "POST",
        "url": "/v1/chat/completions",
        "body": {
            "model": "gpt-4o",
            "messages": [
                {"role": "system", "content": "Summarize the following text"},
                {"role": "user", "content": content}
            ],
            "max_tokens": 200
        }
    })

# Create and process batch
batch_file = BatchFile()
input_file = batch_file.create_batch_file(requests)
batch = Batch.create(
    input_file_id=input_file['id'],
    endpoint="/v1/chat/completions",
    completion_window="24h"
)

print(f"Processing {len(requests)} files in batch {batch['id']}")
Limitations
- Batches must complete within 24 hours
- Only POST requests are supported
- Limited to specific endpoints (chat, embeddings, completions)
- Results are processed asynchronously (not real-time)
Storage
By default, batch files are stored in ~/.nerif/batches/. You can customize this:
batch_file = BatchFile(file_path="/custom/path/to/batches")