
4 posts tagged with "proxy"


Maintain Visual Consistency with Reusable Video Characters

Sameer Kankute
SWE @ LiteLLM (LLM Translation)
Krrish Dholakia
CEO, LiteLLM
Ishaan Jaff
CTO, LiteLLM

Build branded video sequences with consistent characters using the LiteLLM Video Characters API. Create once, reuse everywhere.

Proxy Setup

model_list:
  - model_name: sora-2
    litellm_params:
      model: openai/sora-2
      api_key: os.environ/OPENAI_API_KEY
  - model_name: sora-2-pro
    litellm_params:
      model: openai/sora-2-pro
      api_key: os.environ/OPENAI_API_KEY

litellm --config /path/to/config.yaml

SDK & Proxy Usage

Create a Character

Upload a short video clip to create a reusable character asset.

from litellm import avideo_create_character

# Create character from video
character = await avideo_create_character(
    name="Mossy",                       # Character name (use in prompts)
    video=open("character.mp4", "rb"),  # Short 2-4 second clip
    custom_llm_provider="openai",
    model="sora-2"
)

print(f"Character ID: {character.id}")
# Output: Character ID: char_abc123def456

Proxy endpoint:

curl -X POST "http://localhost:4000/v1/videos/characters" \
  -H "Authorization: Bearer sk-litellm-key" \
  -F "video=@character.mp4" \
  -F "name=Mossy"

Generate Video with Character

Include the character ID in the characters array and mention the character name in your prompt.

from litellm import avideo

# Create video using the character
video = await avideo(
    model="sora-2",
    prompt="A cinematic tracking shot of Mossy weaving through a lantern-lit market at dusk, looking around curiously.",
    characters=[{"id": "char_abc123def456"}],
    seconds="8",
    size="1280x720"
)

print(f"Video ID: {video.id}")

Proxy endpoint:

curl -X POST "http://localhost:4000/v1/videos" \
  -H "Authorization: Bearer sk-litellm-key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "sora-2",
    "prompt": "Mossy dances through a meadow of glowing flowers.",
    "characters": [{"id": "char_abc123def456"}],
    "seconds": "8",
    "size": "1280x720"
  }'

Retrieve Character Info

Get metadata and status for an uploaded character.

from litellm import avideo_get_character

character = await avideo_get_character(
    character_id="char_abc123def456",
    custom_llm_provider="openai"
)

print(f"Character: {character.name}, Created: {character.created_at}")

Edit Video with Character

Preserve the character while making targeted edits to the scene.

from litellm import avideo_edit

# Edit existing video with character
edited = await avideo_edit(
    video_id="video_xyz123",
    prompt="Shift the lighting to warm golden hour, add wind effects to the character's fur.",
    custom_llm_provider="openai"
)

Extend Video with Character

Continue a completed video with the same character for longer sequences.

from litellm import avideo_extension

# Extend video by up to 20 seconds
extended = await avideo_extension(
    video_id="video_xyz123",
    prompt="Mossy walks towards the camera, waves, and smiles warmly.",
    seconds="8",
    custom_llm_provider="openai"
)

Best Practices

Character Uploads

  • Optimal duration: 2-4 seconds for best consistency
  • Aspect ratio: Match the target video resolution (16:9, 9:16, or 1:1)
  • Resolution: 720p to 1080p
  • Isolation: Show the character clearly against a distinct background
  • Movement: Include natural motion (walk, turn, gesture) to establish the character
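The guidelines above can be sanity-checked client-side before uploading. This is an illustrative helper, not part of the LiteLLM SDK; the metadata dict is assumed to come from your own probe of the file (e.g. via ffprobe):

```python
def validate_character_clip(meta):
    """Check a clip's metadata against the upload guidelines above.

    Returns a list of problems; an empty list means the clip meets
    the guidelines. `meta` keys are this sketch's own convention.
    """
    problems = []
    if not 2 <= meta["duration_s"] <= 4:
        problems.append("duration should be 2-4 seconds")
    if not 720 <= meta["height"] <= 1080:
        problems.append("resolution should be 720p-1080p")
    if meta["aspect_ratio"] not in ("16:9", "9:16", "1:1"):
        problems.append("aspect ratio should match the target video")
    return problems

print(validate_character_clip(
    {"duration_s": 3, "height": 1080, "aspect_ratio": "16:9"}
))
```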

Prompting

❌ Avoid:

"Create a video with a character that looks like Mossy"

✅ Do this:

"Mossy the moss-covered teapot mascot weaves through a lantern-lit market at dusk, 
looking curious and friendly."

Always mention the character name verbatim in your prompt. The character ID alone won't reliably preserve the asset.

Character Limits

  • Per video: Up to 2 characters in a single generation
  • Extensions: The characters parameter is not supported in extensions (see FAQ)
  • Consistency: Best results when character occupies 30-60% of frame

Full Example Workflow

Create a branded video series with a consistent mascot character.

Python workflow example
import asyncio
from litellm import (
    avideo_create_character,
    avideo,
    avideo_extension,
    avideo_edit,
    avideo_status,  # status-retrieval helper, used in the polling loop below
)

async def create_branded_series():
    # Step 1: Create character asset
    print("Creating character...")
    character = await avideo_create_character(
        name="Luna",
        video=open("luna_intro.mp4", "rb"),
        custom_llm_provider="openai"
    )
    char_id = character.id
    print(f"✓ Character created: {char_id}")

    # Step 2: Generate first scene
    print("Generating opening scene...")
    scene1 = await avideo(
        model="sora-2",
        prompt="Luna the magical fox dancing through a cosmic forest, stars trailing her movement.",
        characters=[{"id": char_id}],
        seconds="8",
        size="1280x720"
    )
    print(f"✓ Scene 1: {scene1.id} ({scene1.status})")

    # Wait for completion (in production, use webhooks)
    while scene1.status in ("queued", "in_progress"):
        await asyncio.sleep(5)
        scene1 = await avideo_status(video_id=scene1.id)  # refresh the video's status

    # Step 3: Extend with same character
    print("Extending scene...")
    scene2 = await avideo_extension(
        video_id=scene1.id,
        prompt="Luna leaps into the air, transforms into stardust, and reforms on a moonlit cliff.",
        seconds="6"
    )
    print(f"✓ Scene 2 (extension): {scene2.id}")

    # Step 4: Edit for color grading
    print("Applying color grade...")
    scene1_graded = await avideo_edit(
        video_id=scene1.id,
        prompt="Shift palette to cool blues and purples with enhanced glow effects around Luna."
    )
    print(f"✓ Scene 1 graded: {scene1_graded.id}")

    print("\n✓ Branded series created with consistent Luna character!")

asyncio.run(create_branded_series())

FAQ

Q: Can I use real people as characters?
A: Character uploads depicting human likeness are blocked by default. Contact OpenAI sales for eligibility.

Q: What happens if the character aspect ratio doesn't match the video size?
A: The character may appear stretched or distorted. Upload character videos matching your target aspect ratio.

Q: Can I use the same character in extensions?
A: Extensions don't currently support character preservation. Use image references or re-upload the character from the extended video.

Q: How long do characters persist?
A: Characters are stored with your account indefinitely. You can retrieve and reuse them anytime with GET /v1/videos/characters/{character_id}.

Q: Can I combine characters with image references?
A: Yes! Use input_reference (to set the opening frame) alongside characters (for reusable assets) in the same generation.

Q: What's the difference between characters and input_reference?
A: input_reference conditions a single generation's opening frame. characters are reusable assets you reference across multiple generations for consistency.
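As a sketch of how the two combine, the request body for a single generation carries both fields. The helper function and file name below are hypothetical; the IDs are placeholders:

```python
# Build one generation request that uses both mechanisms: a reusable
# character asset plus an input_reference opening frame. Field names
# follow the examples above.
def build_video_request(prompt, character_id, reference=None):
    request = {
        "model": "sora-2",
        "prompt": prompt,
        "characters": [{"id": character_id}],  # reusable across generations
        "seconds": "8",
        "size": "1280x720",
    }
    if reference is not None:
        # conditions only this generation's opening frame
        request["input_reference"] = reference
    return request

payload = build_video_request(
    "Mossy waves from a hilltop at sunrise.",
    "char_abc123def456",
    reference="opening_frame.jpg",
)
```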

Q: How many characters can I upload?
A: No limit. Upload as many character assets as you need for your project.

Q: Can I update or delete a character?
A: Character assets are immutable. To modify an asset, upload a new character and reference the new ID in future generations.

Troubleshooting

Character looks distorted in output

  • Verify the character video matches the target resolution
  • Re-upload with matching aspect ratio (16:9, 9:16, or 1:1)

Character doesn't appear in generated video

  • Check that you included the character ID in the characters array
  • Verify the character name is mentioned in the prompt (verbatim)
  • Ensure character occupies meaningful screen space in your prompt description

Character upload fails

  • Maximum file size is typically 100MB
  • Use MP4 format, 2-4 seconds, 720p-1080p resolution
  • Ensure character is clearly visible and isolated

Next Steps

Realtime WebRTC HTTP Endpoints

Sameer Kankute
SWE @ LiteLLM (LLM Translation)
Krrish Dholakia
CEO, LiteLLM
Ishaan Jaff
CTO, LiteLLM

Connect to the Realtime API via WebRTC from browser/mobile clients. LiteLLM handles auth and key management.

How it works

WebRTC flow: Browser, LiteLLM Proxy, and OpenAI/Azure

Flow of generating ephemeral token

Ephemeral token flow: Browser requests token, LiteLLM gets real token from OpenAI, returns encrypted token

Proxy Setup

model_list:
  - model_name: gpt-4o-realtime
    litellm_params:
      model: openai/gpt-4o-realtime-preview-2024-12-17
      api_key: os.environ/OPENAI_API_KEY
    model_info:
      mode: realtime

Azure: use model: azure/gpt-4o-realtime-preview, api_key, api_base.

litellm --config /path/to/config.yaml
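The Azure note above expands to a config entry like the following. This is a sketch: the deployment name, env var, and resource URL are placeholders for your own values:

```yaml
model_list:
  - model_name: gpt-4o-realtime
    litellm_params:
      model: azure/gpt-4o-realtime-preview
      api_key: os.environ/AZURE_API_KEY
      api_base: https://my-resource.openai.azure.com
    model_info:
      mode: realtime
```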

Try it live

Interactive tester: Browser → LiteLLM → OpenAI · WebRTC

Client Usage

1. Get token - POST /v1/realtime/client_secrets with LiteLLM API key and { model }.

2. WebRTC handshake - Create RTCPeerConnection, add mic track, create data channel oai-events, send SDP offer to POST /v1/realtime/calls with Authorization: Bearer <encrypted_token> and Content-Type: application/sdp.

3. Events - Use the data channel for session.update and other events.

Full code example
// 1. Token
const r = await fetch("http://proxy:4000/v1/realtime/client_secrets", {
  method: "POST",
  headers: { "Authorization": "Bearer sk-litellm-key", "Content-Type": "application/json" },
  body: JSON.stringify({ model: "gpt-4o-realtime" }),
});
const { client_secret } = await r.json();
const token = client_secret.value;

// 2. WebRTC
const pc = new RTCPeerConnection();
const audio = document.createElement("audio");
audio.autoplay = true;
pc.ontrack = (e) => (audio.srcObject = e.streams[0]);
const ms = await navigator.mediaDevices.getUserMedia({ audio: true });
pc.addTrack(ms.getTracks()[0]);
const dc = pc.createDataChannel("oai-events");
const offer = await pc.createOffer();
await pc.setLocalDescription(offer);

const sdpRes = await fetch("http://proxy:4000/v1/realtime/calls", {
  method: "POST",
  headers: { "Authorization": `Bearer ${token}`, "Content-Type": "application/sdp" },
  body: offer.sdp,
});
await pc.setRemoteDescription({ type: "answer", sdp: await sdpRes.text() });

// 3. Events
dc.send(JSON.stringify({ type: "session.update", session: { instructions: "..." } }));

FAQ

Q: What do I do if I get a 401 Token expired error?
A: Tokens are short-lived. Get a fresh token right before creating the WebRTC offer.
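That refresh policy can be sketched in a few lines, assuming expires_at is the unix timestamp returned alongside client_secret (the helper name is hypothetical):

```python
import time

def token_is_fresh(expires_at, margin_s=10, now=None):
    """True if the ephemeral token is still safely usable.

    Refresh (re-call /v1/realtime/client_secrets) whenever this returns
    False, then immediately create the WebRTC offer.
    """
    now = time.time() if now is None else now
    return expires_at - now > margin_s
```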

Q: Which key should I use for /v1/realtime/calls?
A: Use the encrypted token from client_secrets, not your raw API key.

Q: Should I pass the model parameter when making the call?
A: No, the encrypted token already encodes all routing information including model.

Q: How do I resolve Azure api-version errors?
A: Set the correct api_version in litellm_params (or via the AZURE_API_VERSION environment variable), along with the right api_base and deployment values.

Q: What if I get no audio?
A: Make sure you grant microphone permission, ensure pc.ontrack assigns the audio element with autoplay enabled, check your network/firewall for WebRTC traffic, and inspect the browser console for ICE or SDP errors.

Incident Report: Encrypted Content Failures in Multi-Region Responses API Load Balancing

Sameer Kankute
SWE @ LiteLLM (LLM Translation)
Krrish Dholakia
CEO, LiteLLM
Ishaan Jaff
CTO, LiteLLM

Date: Feb 24, 2026
Duration: Ongoing (until fix deployed)
Severity: High (for users load balancing Responses API across different API keys)
Status: Resolved

Summary

When load balancing OpenAI's Responses API across deployments with different API keys (e.g., different Azure regions or OpenAI organizations), follow-up requests containing encrypted content items (like rs_... reasoning items) would fail with:

{
  "error": {
    "message": "The encrypted content for item rs_0d09d6e56879e76500699d6feee41c8197bd268aae76141f87 could not be verified. Reason: Encrypted content organization_id did not match the target organization.",
    "type": "invalid_request_error",
    "code": "invalid_encrypted_content"
  }
}

Encrypted content items are cryptographically tied to the organization of the API key that created them. When the router load balanced a follow-up request to a deployment with a different API key, decryption failed.

  • Responses API calls with encrypted content: Complete failure when routed to the wrong deployment
  • Initial requests: Unaffected; only follow-up requests containing encrypted items failed
  • Other API endpoints: No impact; chat completions, embeddings, etc. functioned normally
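The failure mode can be sketched in a few lines. This is an illustration, not LiteLLM internals; deployment names and org IDs are made up:

```python
# An encrypted item records the organization of the API key that
# created it, so a follow-up routed to a deployment under a different
# organization cannot verify it.
deployments = [
    {"name": "azure-eastus", "org": "org-A"},
    {"name": "azure-westus", "org": "org-B"},
]

def verify_encrypted_item(item_org, deployment):
    if item_org != deployment["org"]:
        raise ValueError("invalid_encrypted_content: organization_id mismatch")
    return True

# The initial request lands on org-A; the rs_... item is tied to org-A.
item_org = deployments[0]["org"]

# Naive round-robin sends the follow-up to org-B and fails:
try:
    verify_encrypted_item(item_org, deployments[1])
except ValueError as err:
    print(err)

# Pinning follow-ups to the creating deployment succeeds:
assert verify_encrypted_item(item_org, deployments[0])
```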

Incident Report: Wildcard Blocking New Models After Cost Map Reload

Sameer Kankute
SWE @ LiteLLM (LLM Translation)
Krrish Dholakia
CEO, LiteLLM
Ishaan Jaff
CTO, LiteLLM

Date: Feb 23, 2026
Duration: ~3 hours
Severity: High (for users with provider wildcard access rules)
Status: Resolved

Summary

When a new Anthropic model (e.g. claude-sonnet-4-6) was added to the LiteLLM model cost map and a cost map reload was triggered, requests to the new model were rejected with:

key not allowed to access model. This key can only access models=['anthropic/*']. Tried to access claude-sonnet-4-6.

The reload updated litellm.model_cost correctly but never re-ran add_known_models(), so litellm.anthropic_models (the in-memory set used by the wildcard resolver) remained stale. The new model was invisible to the anthropic/* wildcard even though the cost map knew about it.

  • LLM calls: All requests to newly-added Anthropic models were blocked with a 401.
  • Existing models: Unaffected; only models missing from the stale provider set were impacted.
  • Other providers: The same bug class existed for any provider wildcard (e.g. openai/*, gemini/*).
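The bug class can be sketched as follows. This is illustrative only, not LiteLLM internals:

```python
# The wildcard check reads from a provider model set that is built
# separately from the cost map, so a reload that updates only the cost
# map leaves newly added models invisible to "anthropic/*".
model_cost = {"claude-3-5-sonnet": {"litellm_provider": "anthropic"}}
anthropic_models = set(model_cost)  # built once at startup

def wildcard_allows(model):
    # "anthropic/*" resolves against the in-memory provider set
    return model in anthropic_models

# A cost map reload adds the new model...
model_cost["claude-sonnet-4-6"] = {"litellm_provider": "anthropic"}

# ...but without re-running add_known_models(), the set stays stale:
assert not wildcard_allows("claude-sonnet-4-6")

# The fix: rebuild the provider set on every reload. Conceptually:
anthropic_models = set(model_cost)
assert wildcard_allows("claude-sonnet-4-6")
```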