
4 posts tagged with "openai"


Maintain Visual Consistency with Reusable Video Characters

Sameer Kankute
SWE @ LiteLLM (LLM Translation)
Krrish Dholakia
CEO, LiteLLM
Ishaan Jaff
CTO, LiteLLM

Build branded video sequences with consistent characters using the LiteLLM Video Characters API. Create once, reuse everywhere.

Proxy Setup

model_list:
  - model_name: sora-2
    litellm_params:
      model: openai/sora-2
      api_key: os.environ/OPENAI_API_KEY
  - model_name: sora-2-pro
    litellm_params:
      model: openai/sora-2-pro
      api_key: os.environ/OPENAI_API_KEY

litellm --config /path/to/config.yaml

SDK & Proxy Usage

Create a Character

Upload a short video clip to create a reusable character asset.

from litellm import avideo_create_character

# Create character from video
character = await avideo_create_character(
    name="Mossy",                       # Character name (use in prompts)
    video=open("character.mp4", "rb"),  # Short 2-4 second clip
    custom_llm_provider="openai",
    model="sora-2"
)

print(f"Character ID: {character.id}")
# Output: Character ID: char_abc123def456

Proxy endpoint:

curl -X POST "http://localhost:4000/v1/videos/characters" \
  -H "Authorization: Bearer sk-litellm-key" \
  -F "video=@character.mp4" \
  -F "name=Mossy"

Generate Video with Character

Include the character ID in the characters array and mention the character name in your prompt.

from litellm import avideo

# Create video using the character
video = await avideo(
    model="sora-2",
    prompt="A cinematic tracking shot of Mossy weaving through a lantern-lit market at dusk, looking around curiously.",
    characters=[{"id": "char_abc123def456"}],
    seconds="8",
    size="1280x720"
)

print(f"Video ID: {video.id}")

Proxy endpoint:

curl -X POST "http://localhost:4000/v1/videos" \
  -H "Authorization: Bearer sk-litellm-key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "sora-2",
    "prompt": "Mossy dances through a meadow of glowing flowers.",
    "characters": [{"id": "char_abc123def456"}],
    "seconds": "8",
    "size": "1280x720"
  }'

Retrieve Character Info

Get metadata and status for an uploaded character.

from litellm import avideo_get_character

character = await avideo_get_character(
    character_id="char_abc123def456",
    custom_llm_provider="openai"
)

print(f"Character: {character.name}, Created: {character.created_at}")

Edit Video with Character

Preserve the character while making targeted edits to the scene.

from litellm import avideo_edit

# Edit existing video with character
edited = await avideo_edit(
    video_id="video_xyz123",
    prompt="Shift the lighting to warm golden hour, add wind effects to character's fur.",
    custom_llm_provider="openai"
)

Extend Video with Character

Continue a completed video with the same character for longer sequences.

from litellm import avideo_extension

# Extend video by up to 20 seconds
extended = await avideo_extension(
    video_id="video_xyz123",
    prompt="Mossy walks towards the camera, waves, and smiles warmly.",
    seconds="8",
    custom_llm_provider="openai"
)

Best Practices

Character Uploads

  • Optimal duration: 2-4 seconds for best consistency
  • Aspect ratio: Match the target video resolution (16:9, 9:16, or 1:1)
  • Resolution: 720p to 1080p
  • Isolation: Show the character clearly against a distinct background
  • Movement: Include natural motion (walk, turn, gesture) to establish the character
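The upload guidelines above can be pre-checked before calling the API. The sketch below assumes you already know the clip's metadata (e.g. from ffprobe); the thresholds simply mirror the bullet list and are illustrative, not enforced by LiteLLM itself.

```python
def check_character_clip(duration_s: float, width: int, height: int) -> list:
    """Return warnings for a candidate character clip, per the guidelines above."""
    warnings = []
    if not 2.0 <= duration_s <= 4.0:
        warnings.append(f"duration {duration_s}s is outside the recommended 2-4s window")
    if not 720 <= min(width, height) <= 1080:
        warnings.append(f"resolution {width}x{height} is outside the 720p-1080p range")
    ratio = width / height
    # Accept 16:9, 9:16, or 1:1 within a small tolerance
    if not any(abs(ratio - r) < 0.01 for r in (16 / 9, 9 / 16, 1.0)):
        warnings.append(f"aspect ratio {ratio:.2f} is not 16:9, 9:16, or 1:1")
    return warnings
```

An empty list means the clip meets all three guidelines; otherwise each warning names the guideline it misses.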

Prompting

❌ Avoid:

"Create a video with a character that looks like Mossy"

✅ Do this:

"Mossy the moss-covered teapot mascot weaves through a lantern-lit market at dusk, 
looking curious and friendly."

Always mention the character name verbatim in your prompt. The character ID alone won't reliably preserve the asset.
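Since the name must appear verbatim, a request-building helper can enforce this before submission. This helper is an illustrative sketch, not part of the LiteLLM SDK.

```python
def build_character_request(prompt: str, character_id: str, character_name: str) -> dict:
    """Build video request kwargs, refusing prompts that omit the character name."""
    if character_name not in prompt:
        raise ValueError(
            f"prompt must mention {character_name!r} verbatim for reliable preservation"
        )
    return {"prompt": prompt, "characters": [{"id": character_id}]}
```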

Character Limits

  • Per video: Up to 2 characters in a single generation
  • Extensions: characters are not supported in extensions (see the FAQ below for workarounds)
  • Consistency: Best results when character occupies 30-60% of frame
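The per-video cap is easy to check client-side before sending a generation request; a minimal guard, mirroring the limit above:

```python
MAX_CHARACTERS_PER_VIDEO = 2  # per-generation limit noted above

def validate_characters(characters: list) -> None:
    """Raise if the request exceeds the per-video character limit."""
    if len(characters) > MAX_CHARACTERS_PER_VIDEO:
        raise ValueError(
            f"{len(characters)} characters requested; "
            f"maximum is {MAX_CHARACTERS_PER_VIDEO} per video"
        )
```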

Full Example Workflow

Create a branded video series with a consistent mascot character.

Python workflow example
import asyncio
from litellm import (
    avideo_create_character,
    avideo,
    avideo_extension,
    avideo_edit,
    avideo_retrieve,  # video status retrieval (helper name assumed)
)

async def create_branded_series():
    # Step 1: Create character asset
    print("Creating character...")
    character = await avideo_create_character(
        name="Luna",
        video=open("luna_intro.mp4", "rb"),
        custom_llm_provider="openai"
    )
    char_id = character.id
    print(f"✓ Character created: {char_id}")

    # Step 2: Generate first scene
    print("Generating opening scene...")
    scene1 = await avideo(
        model="sora-2",
        prompt="Luna the magical fox dancing through a cosmic forest, stars trailing her movement.",
        characters=[{"id": char_id}],
        seconds="8",
        size="1280x720"
    )
    print(f"✓ Scene 1: {scene1.id} ({scene1.status})")

    # Wait for completion (in production, use webhooks)
    while scene1.status in ("queued", "in_progress"):
        await asyncio.sleep(5)
        scene1 = await avideo_retrieve(video_id=scene1.id)  # refresh video status

    # Step 3: Extend with same character
    print("Extending scene...")
    scene2 = await avideo_extension(
        video_id=scene1.id,
        prompt="Luna leaps into the air, transforms into stardust, and reforms on a moonlit cliff.",
        seconds="6",
        custom_llm_provider="openai"
    )
    print(f"✓ Scene 2 (extension): {scene2.id}")

    # Step 4: Edit for color grading
    print("Applying color grade...")
    scene1_graded = await avideo_edit(
        video_id=scene1.id,
        prompt="Shift palette to cool blues and purples with enhanced glow effects around Luna.",
        custom_llm_provider="openai"
    )
    print(f"✓ Scene 1 graded: {scene1_graded.id}")

    print("\n✓ Branded series created with consistent Luna character!")

asyncio.run(create_branded_series())

FAQ

Q: Can I use real people as characters?
A: Character uploads depicting human likeness are blocked by default. Contact OpenAI sales for eligibility.

Q: What happens if the character aspect ratio doesn't match the video size?
A: The character may appear stretched or distorted. Upload character videos matching your target aspect ratio.

Q: Can I use the same character in extensions?
A: Extensions don't currently support character preservation. Use image references or re-upload the character from the extended video.

Q: How long do characters persist?
A: Characters are stored with your account indefinitely. You can retrieve and reuse them anytime with GET /v1/videos/characters/{character_id}.

Q: Can I combine characters with image references?
A: Yes! Use input_reference (to set the opening frame) alongside characters (for reusable assets) in the same generation.

Q: What's the difference between characters and input_reference?
A: input_reference conditions a single generation's opening frame. characters are reusable assets you reference across multiple generations for consistency.
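The two mechanisms combine into a single request body. In this sketch the field names follow the examples earlier in this post; opening_frame is an opaque stand-in for whatever value input_reference accepts.

```python
def build_generation_payload(prompt: str, character_ids: list, opening_frame=None) -> dict:
    """Combine reusable characters with an optional opening-frame reference."""
    payload = {
        "model": "sora-2",
        "prompt": prompt,
        "characters": [{"id": cid} for cid in character_ids],  # reusable assets
    }
    if opening_frame is not None:
        payload["input_reference"] = opening_frame  # conditions this generation's first frame
    return payload
```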

Q: How many characters can I upload?
A: No limit. Upload as many character assets as you need for your project.

Q: Can I update or delete a character?
A: Character assets are immutable. To modify an asset, upload a new character and reference the new ID in future generations.

Troubleshooting

Character looks distorted in output

  • Verify the character video matches the target resolution
  • Re-upload with matching aspect ratio (16:9, 9:16, or 1:1)

Character doesn't appear in generated video

  • Check that you included the character ID in the characters array
  • Verify the character name is mentioned in the prompt (verbatim)
  • Ensure character occupies meaningful screen space in your prompt description

Character upload fails

  • Maximum file size is typically 100MB
  • Use MP4 format, 2-4 seconds, 720p-1080p resolution
  • Ensure character is clearly visible and isolated

Next Steps

Realtime WebRTC HTTP Endpoints

Sameer Kankute
SWE @ LiteLLM (LLM Translation)
Krrish Dholakia
CEO, LiteLLM
Ishaan Jaff
CTO, LiteLLM

Connect to the Realtime API via WebRTC from browser/mobile clients. LiteLLM handles auth and key management.

How it works

WebRTC flow: Browser, LiteLLM Proxy, and OpenAI/Azure

Flow of generating ephemeral token

Ephemeral token flow: Browser requests token, LiteLLM gets real token from OpenAI, returns encrypted token

Proxy Setup

model_list:
  - model_name: gpt-4o-realtime
    litellm_params:
      model: openai/gpt-4o-realtime-preview-2024-12-17
      api_key: os.environ/OPENAI_API_KEY
    model_info:
      mode: realtime

For Azure, use model: azure/gpt-4o-realtime-preview and set api_key and api_base in litellm_params.

litellm --config /path/to/config.yaml

Try it live

Interactive tester: Browser → LiteLLM → OpenAI · WebRTC

Client Usage

1. Get token - POST /v1/realtime/client_secrets with LiteLLM API key and { model }.

2. WebRTC handshake - Create RTCPeerConnection, add mic track, create data channel oai-events, send SDP offer to POST /v1/realtime/calls with Authorization: Bearer <encrypted_token> and Content-Type: application/sdp.

3. Events - Use the data channel for session.update and other events.

Full code example
// 1. Token
const r = await fetch("http://proxy:4000/v1/realtime/client_secrets", {
  method: "POST",
  headers: { "Authorization": "Bearer sk-litellm-key", "Content-Type": "application/json" },
  body: JSON.stringify({ model: "gpt-4o-realtime" }),
});
const { client_secret } = await r.json();
const token = client_secret.value;

// 2. WebRTC
const pc = new RTCPeerConnection();
const audio = document.createElement("audio");
audio.autoplay = true;
pc.ontrack = (e) => (audio.srcObject = e.streams[0]);
const ms = await navigator.mediaDevices.getUserMedia({ audio: true });
pc.addTrack(ms.getTracks()[0]);
const dc = pc.createDataChannel("oai-events");
const offer = await pc.createOffer();
await pc.setLocalDescription(offer);

const sdpRes = await fetch("http://proxy:4000/v1/realtime/calls", {
  method: "POST",
  headers: { "Authorization": `Bearer ${token}`, "Content-Type": "application/sdp" },
  body: offer.sdp,
});
await pc.setRemoteDescription({ type: "answer", sdp: await sdpRes.text() });

// 3. Events
dc.send(JSON.stringify({ type: "session.update", session: { instructions: "..." } }));

FAQ

Q: What do I do if I get a 401 Token expired error?
A: Tokens are short-lived. Get a fresh token right before creating the WebRTC offer.
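A thin caching wrapper makes "fetch right before the offer" automatic. Here fetch_token is a stand-in for your POST to /v1/realtime/client_secrets, and the 60-second TTL is an illustrative assumption; use the expiry your proxy actually returns.

```python
import time

class EphemeralTokenCache:
    """Cache a short-lived token, refreshing shortly before it expires."""

    def __init__(self, fetch_token, ttl_seconds=60, refresh_margin=10):
        self._fetch = fetch_token      # callable returning a fresh token string
        self._ttl = ttl_seconds        # assumed token lifetime
        self._margin = refresh_margin  # refresh this many seconds early
        self._token = None
        self._expires_at = 0.0

    def get(self):
        if self._token is None or time.time() >= self._expires_at - self._margin:
            self._token = self._fetch()
            self._expires_at = time.time() + self._ttl
        return self._token
```

Call cache.get() immediately before creating the WebRTC offer; within the TTL window the cached token is reused, and a near-expired one is transparently replaced.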

Q: Which key should I use for /v1/realtime/calls?
A: Use the encrypted token from client_secrets, not your raw API key.

Q: Should I pass the model parameter when making the call?
A: No, the encrypted token already encodes all routing information including model.

Q: How do I resolve Azure api-version errors?
A: Set the correct api_version in litellm_params (or via the AZURE_API_VERSION environment variable), along with the right api_base and deployment values.

Q: What if I get no audio?
A: Make sure you grant microphone permission, ensure pc.ontrack assigns the audio element with autoplay enabled, check your network/firewall for WebRTC traffic, and inspect the browser console for ICE or SDP errors.

Day 0 Support: GPT-5.4

Sameer Kankute
SWE @ LiteLLM (LLM Translation)
Krrish Dholakia
CEO, LiteLLM
Ishaan Jaff
CTO, LiteLLM

LiteLLM now fully supports GPT-5.4!

Docker Image

docker pull ghcr.io/berriai/litellm:v1.81.14-stable.gpt-5.4_patch

Usage

1. Setup config.yaml

model_list:
  - model_name: gpt-5.4
    litellm_params:
      model: openai/gpt-5.4
      api_key: os.environ/OPENAI_API_KEY

2. Start the proxy

docker run -d \
  -p 4000:4000 \
  -e OPENAI_API_KEY=$OPENAI_API_KEY \
  -v $(pwd)/config.yaml:/app/config.yaml \
  ghcr.io/berriai/litellm:v1.81.14-stable.gpt-5.4_patch \
  --config /app/config.yaml

3. Test it

curl -X POST "http://0.0.0.0:4000/chat/completions" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $LITELLM_KEY" \
  -d '{
    "model": "gpt-5.4",
    "messages": [
      {"role": "user", "content": "Write a Python function to check if a number is prime."}
    ]
  }'

Notes

  • Restart your container to pick up cost tracking for this model.
  • Use the /responses endpoint for the best model performance.
  • GPT-5.4 supports reasoning, function calling, vision, and tool use; see the OpenAI provider docs for advanced usage.

Day 0 Support: GPT-5.3-Codex

Sameer Kankute
SWE @ LiteLLM (LLM Translation)
Krrish Dholakia
CEO, LiteLLM
Ishaan Jaff
CTO, LiteLLM

LiteLLM now supports GPT-5.3-Codex on Day 0, including support for the new assistant phase metadata on Responses API output items.

Why phase matters for GPT-5.3-Codex

phase appears on assistant output items and helps distinguish preamble/commentary turns from final closeout responses.

Reference: Phase parameter docs

Supported values:

  • null
  • "commentary"
  • "final_answer"

Important:

  • Persist assistant output items with phase exactly as returned.
  • Send those assistant items back on the next turn.
  • Do not add phase to user messages.
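The three rules above can be enforced with a small sanitizer run over the conversation history before each turn: assistant items pass through verbatim (phase included), while any phase that leaks onto a user message is dropped. This helper is an illustrative sketch, not part of the LiteLLM SDK.

```python
def sanitize_history(items: list) -> list:
    """Keep phase on assistant output items; strip it from user messages."""
    cleaned = []
    for item in items:
        if item.get("role") == "user" and "phase" in item:
            # Never send phase on user turns
            item = {k: v for k, v in item.items() if k != "phase"}
        cleaned.append(item)  # assistant items pass through verbatim, phase included
    return cleaned
```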

Docker Image

docker pull ghcr.io/berriai/litellm:v1.81.12-stable.gpt-5.3

Usage

1. Setup config.yaml

model_list:
  - model_name: gpt-5.3-codex
    litellm_params:
      model: openai/gpt-5.3-codex
      api_key: os.environ/OPENAI_API_KEY

2. Start the proxy

docker run -d \
  -p 4000:4000 \
  -e OPENAI_API_KEY=$OPENAI_API_KEY \
  -v $(pwd)/config.yaml:/app/config.yaml \
  ghcr.io/berriai/litellm:v1.81.12-stable.gpt-5.3 \
  --config /app/config.yaml

3. Test it

curl -X POST "http://0.0.0.0:4000/v1/responses" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $LITELLM_KEY" \
  -d '{
    "model": "gpt-5.3-codex",
    "input": "Write a Python script that checks if a number is prime."
  }'

Python Example: Persist phase with OpenAI Client + LiteLLM Base URL

from openai import OpenAI

client = OpenAI(
    base_url="http://0.0.0.0:4000/v1",  # LiteLLM Proxy
    api_key="your-litellm-api-key",
)

items = []  # Persist this per conversation/thread


def _item_get(item, key, default=None):
    if isinstance(item, dict):
        return item.get(key, default)
    return getattr(item, key, default)


def run_turn(user_text: str):
    global items

    # User message: no phase field
    items.append(
        {
            "type": "message",
            "role": "user",
            "content": [{"type": "input_text", "text": user_text}],
        }
    )

    resp = client.responses.create(
        model="gpt-5.3-codex",
        input=items,
    )

    # Persist assistant output items verbatim, including phase
    for out_item in (resp.output or []):
        items.append(out_item)

    # Optional: inspect latest phase for UI/telemetry routing
    # (resp.output holds output items directly, so check phase itself
    # rather than a streaming event type)
    latest_phase = None
    for out_item in reversed(resp.output or []):
        if _item_get(out_item, "phase") is not None:
            latest_phase = _item_get(out_item, "phase")
            break

    return resp, latest_phase

Notes

  • Use /v1/responses for GPT Codex models.
  • Preserve full assistant output history for best multi-turn behavior.
  • If phase metadata is dropped during history reconstruction, output quality can degrade on long-running tasks.