
4 posts tagged with "openai"


Maintain Visual Consistency with Reusable Video Characters

Sameer Kankute
SWE @ LiteLLM (LLM Translation)
Krrish Dholakia
CEO, LiteLLM
Ishaan Jaff
CTO, LiteLLM

Build branded video sequences with consistent characters using the LiteLLM Video Characters API. Create once, reuse everywhere.

Proxy Setup

model_list:
  - model_name: sora-2
    litellm_params:
      model: openai/sora-2
      api_key: os.environ/OPENAI_API_KEY
  - model_name: sora-2-pro
    litellm_params:
      model: openai/sora-2-pro
      api_key: os.environ/OPENAI_API_KEY

litellm --config /path/to/config.yaml

SDK & Proxy Usage

Create a Character

Upload a short video clip to create a reusable character asset.

from litellm import avideo_create_character

# Create character from video
character = await avideo_create_character(
    name="Mossy",                       # Character name (use in prompts)
    video=open("character.mp4", "rb"),  # Short 2-4 second clip
    custom_llm_provider="openai",
    model="sora-2"
)

print(f"Character ID: {character.id}")
# Output: Character ID: char_abc123def456

Proxy endpoint:

curl -X POST "http://localhost:4000/v1/videos/characters" \
  -H "Authorization: Bearer sk-litellm-key" \
  -F "video=@character.mp4" \
  -F "name=Mossy"

Generate Video with Character

Include the character ID in the characters array and mention the character name in your prompt.

from litellm import avideo

# Create video using the character
video = await avideo(
    model="sora-2",
    prompt="A cinematic tracking shot of Mossy weaving through a lantern-lit market at dusk, looking around curiously.",
    characters=[{"id": "char_abc123def456"}],
    seconds="8",
    size="1280x720"
)

print(f"Video ID: {video.id}")

Proxy endpoint:

curl -X POST "http://localhost:4000/v1/videos" \
  -H "Authorization: Bearer sk-litellm-key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "sora-2",
    "prompt": "Mossy dances through a meadow of glowing flowers.",
    "characters": [{"id": "char_abc123def456"}],
    "seconds": "8",
    "size": "1280x720"
  }'

Retrieve Character Info

Get metadata and status for an uploaded character.

from litellm import avideo_get_character

character = await avideo_get_character(
    character_id="char_abc123def456",
    custom_llm_provider="openai"
)

print(f"Character: {character.name}, Created: {character.created_at}")

Edit Video with Character

Preserve the character while making targeted edits to the scene.

from litellm import avideo_edit

# Edit existing video with character
edited = await avideo_edit(
    video_id="video_xyz123",
    prompt="Shift the lighting to warm golden hour, add wind effects to character's fur.",
    custom_llm_provider="openai"
)

Extend Video with Character

Continue a completed video with the same character for longer sequences.

from litellm import avideo_extension

# Extend video by up to 20 seconds
extended = await avideo_extension(
    video_id="video_xyz123",
    prompt="Mossy walks towards the camera, waves, and smiles warmly.",
    seconds="8",
    custom_llm_provider="openai"
)

Best Practices

Character Uploads

  • Optimal duration: 2-4 seconds for best consistency
  • Aspect ratio: Match the target video resolution (16:9, 9:16, or 1:1)
  • Resolution: 720p to 1080p
  • Isolation: Show the character clearly against a distinct background
  • Movement: Include natural motion (walk, turn, gesture) to establish the character
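The upload guidelines above can be pre-checked before calling the API. The sketch below assumes you already know the clip's metadata (e.g. from ffprobe); the thresholds simply mirror the bullet list and are illustrative, not enforced by LiteLLM itself.

```python
def check_character_clip(duration_s: float, width: int, height: int) -> list:
    """Return warnings for a candidate character clip, per the guidelines above."""
    warnings = []
    if not 2.0 <= duration_s <= 4.0:
        warnings.append(f"duration {duration_s}s is outside the recommended 2-4s window")
    if not 720 <= min(width, height) <= 1080:
        warnings.append(f"resolution {width}x{height} is outside the 720p-1080p range")
    ratio = width / height
    # Accept 16:9, 9:16, or 1:1 within a small tolerance
    if not any(abs(ratio - r) < 0.01 for r in (16 / 9, 9 / 16, 1.0)):
        warnings.append(f"aspect ratio {ratio:.2f} is not 16:9, 9:16, or 1:1")
    return warnings
```

An empty list means the clip meets all three guidelines; otherwise each warning names the guideline it misses.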

Prompting

❌ Avoid:

"Create a video with a character that looks like Mossy"

✅ Do this:

"Mossy the moss-covered teapot mascot weaves through a lantern-lit market at dusk, 
looking curious and friendly."

Always mention the character name verbatim in your prompt. The character ID alone won't reliably preserve the asset.
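Since the name must appear verbatim, a request-building helper can enforce this before submission. This helper is an illustrative sketch, not part of the LiteLLM SDK.

```python
def build_character_request(prompt: str, character_id: str, character_name: str) -> dict:
    """Build video request kwargs, refusing prompts that omit the character name."""
    if character_name not in prompt:
        raise ValueError(
            f"prompt must mention {character_name!r} verbatim for reliable preservation"
        )
    return {"prompt": prompt, "characters": [{"id": character_id}]}
```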

Character Limits

  • Per video: Up to 2 characters in a single generation
  • Extensions: characters are not supported in extensions (see the FAQ below for workarounds)
  • Consistency: Best results when character occupies 30-60% of frame
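The per-video cap is easy to check client-side before sending a generation request; a minimal guard, mirroring the limit above:

```python
MAX_CHARACTERS_PER_VIDEO = 2  # per-generation limit noted above

def validate_characters(characters: list) -> None:
    """Raise if the request exceeds the per-video character limit."""
    if len(characters) > MAX_CHARACTERS_PER_VIDEO:
        raise ValueError(
            f"{len(characters)} characters requested; "
            f"maximum is {MAX_CHARACTERS_PER_VIDEO} per video"
        )
```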

Full Example Workflow

Create a branded video series with a consistent mascot character.

Python workflow example
import asyncio
from litellm import (
    avideo_create_character,
    avideo,
    avideo_extension,
    avideo_edit,
    avideo_retrieve,  # video status retrieval (helper name assumed)
)

async def create_branded_series():
    # Step 1: Create character asset
    print("Creating character...")
    character = await avideo_create_character(
        name="Luna",
        video=open("luna_intro.mp4", "rb"),
        custom_llm_provider="openai"
    )
    char_id = character.id
    print(f"✓ Character created: {char_id}")

    # Step 2: Generate first scene
    print("Generating opening scene...")
    scene1 = await avideo(
        model="sora-2",
        prompt="Luna the magical fox dancing through a cosmic forest, stars trailing her movement.",
        characters=[{"id": char_id}],
        seconds="8",
        size="1280x720"
    )
    print(f"✓ Scene 1: {scene1.id} ({scene1.status})")

    # Wait for completion (in production, use webhooks)
    while scene1.status in ("queued", "in_progress"):
        await asyncio.sleep(5)
        scene1 = await avideo_retrieve(video_id=scene1.id)  # refresh video status

    # Step 3: Extend with same character
    print("Extending scene...")
    scene2 = await avideo_extension(
        video_id=scene1.id,
        prompt="Luna leaps into the air, transforms into stardust, and reforms on a moonlit cliff.",
        seconds="6",
        custom_llm_provider="openai"
    )
    print(f"✓ Scene 2 (extension): {scene2.id}")

    # Step 4: Edit for color grading
    print("Applying color grade...")
    scene1_graded = await avideo_edit(
        video_id=scene1.id,
        prompt="Shift palette to cool blues and purples with enhanced glow effects around Luna.",
        custom_llm_provider="openai"
    )
    print(f"✓ Scene 1 graded: {scene1_graded.id}")

    print("\n✓ Branded series created with consistent Luna character!")

asyncio.run(create_branded_series())

FAQ

Q: Can I use real people as characters?
A: Character uploads depicting human likeness are blocked by default. Contact OpenAI sales for eligibility.

Q: What happens if the character aspect ratio doesn't match the video size?
A: The character may appear stretched or distorted. Upload character videos matching your target aspect ratio.

Q: Can I use the same character in extensions?
A: Extensions don't currently support character preservation. Use image references or re-upload the character from the extended video.

Q: How long do characters persist?
A: Characters are stored with your account indefinitely. You can retrieve and reuse them anytime with GET /v1/videos/characters/{character_id}.

Q: Can I combine characters with image references?
A: Yes! Use input_reference (to set the opening frame) alongside characters (for reusable assets) in the same generation.

Q: What's the difference between characters and input_reference?
A: input_reference conditions a single generation's opening frame. characters are reusable assets you reference across multiple generations for consistency.
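The two mechanisms combine into a single request body. In this sketch the field names follow the examples earlier in this post; opening_frame is an opaque stand-in for whatever value input_reference accepts.

```python
def build_generation_payload(prompt: str, character_ids: list, opening_frame=None) -> dict:
    """Combine reusable characters with an optional opening-frame reference."""
    payload = {
        "model": "sora-2",
        "prompt": prompt,
        "characters": [{"id": cid} for cid in character_ids],  # reusable assets
    }
    if opening_frame is not None:
        payload["input_reference"] = opening_frame  # conditions this generation's first frame
    return payload
```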

Q: How many characters can I upload?
A: No limit. Upload as many character assets as you need for your project.

Q: Can I update or delete a character?
A: Character assets are immutable. To modify an asset, upload a new character and reference the new ID in future generations.

Troubleshooting

Character looks distorted in output

  • Verify the character video matches the target resolution
  • Re-upload with matching aspect ratio (16:9, 9:16, or 1:1)

Character doesn't appear in generated video

  • Check that you included the character ID in the characters array
  • Verify the character name is mentioned in the prompt (verbatim)
  • Ensure character occupies meaningful screen space in your prompt description

Character upload fails

  • Maximum file size is typically 100MB
  • Use MP4 format, 2-4 seconds, 720p-1080p resolution
  • Ensure character is clearly visible and isolated

Next Steps

Realtime WebRTC HTTP Endpoints

Sameer Kankute
SWE @ LiteLLM (LLM Translation)
Krrish Dholakia
CEO, LiteLLM
Ishaan Jaff
CTO, LiteLLM

Connect to the Realtime API via WebRTC from browser/mobile clients. LiteLLM handles auth and key management.

How it works

WebRTC flow: Browser, LiteLLM Proxy, and OpenAI/Azure

Flow of generating ephemeral token

Ephemeral token flow: Browser requests token, LiteLLM gets real token from OpenAI, returns encrypted token

Proxy Setup

model_list:
  - model_name: gpt-4o-realtime
    litellm_params:
      model: openai/gpt-4o-realtime-preview-2024-12-17
      api_key: os.environ/OPENAI_API_KEY
    model_info:
      mode: realtime

For Azure, use model: azure/gpt-4o-realtime-preview and set api_key and api_base in litellm_params.

litellm --config /path/to/config.yaml

Try it live

Interactive tester: Browser → LiteLLM → OpenAI · WebRTC

Client Usage

1. Get token - POST /v1/realtime/client_secrets with LiteLLM API key and { model }.

2. WebRTC handshake - Create RTCPeerConnection, add mic track, create data channel oai-events, send SDP offer to POST /v1/realtime/calls with Authorization: Bearer <encrypted_token> and Content-Type: application/sdp.

3. Events - Use the data channel for session.update and other events.

Full code example
// 1. Token
const r = await fetch("http://proxy:4000/v1/realtime/client_secrets", {
  method: "POST",
  headers: { "Authorization": "Bearer sk-litellm-key", "Content-Type": "application/json" },
  body: JSON.stringify({ model: "gpt-4o-realtime" }),
});
const { client_secret } = await r.json();
const token = client_secret.value;

// 2. WebRTC
const pc = new RTCPeerConnection();
const audio = document.createElement("audio");
audio.autoplay = true;
pc.ontrack = (e) => (audio.srcObject = e.streams[0]);
const ms = await navigator.mediaDevices.getUserMedia({ audio: true });
pc.addTrack(ms.getTracks()[0]);
const dc = pc.createDataChannel("oai-events");
const offer = await pc.createOffer();
await pc.setLocalDescription(offer);

const sdpRes = await fetch("http://proxy:4000/v1/realtime/calls", {
  method: "POST",
  headers: { "Authorization": `Bearer ${token}`, "Content-Type": "application/sdp" },
  body: offer.sdp,
});
await pc.setRemoteDescription({ type: "answer", sdp: await sdpRes.text() });

// 3. Events
dc.send(JSON.stringify({ type: "session.update", session: { instructions: "..." } }));

FAQ

Q: What do I do if I get a 401 Token expired error?
A: Tokens are short-lived. Get a fresh token right before creating the WebRTC offer.
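A thin caching wrapper makes "fetch right before the offer" automatic. Here fetch_token is a stand-in for your POST to /v1/realtime/client_secrets, and the 60-second TTL is an illustrative assumption; use the expiry your proxy actually returns.

```python
import time

class EphemeralTokenCache:
    """Cache a short-lived token, refreshing shortly before it expires."""

    def __init__(self, fetch_token, ttl_seconds=60, refresh_margin=10):
        self._fetch = fetch_token      # callable returning a fresh token string
        self._ttl = ttl_seconds        # assumed token lifetime
        self._margin = refresh_margin  # refresh this many seconds early
        self._token = None
        self._expires_at = 0.0

    def get(self):
        if self._token is None or time.time() >= self._expires_at - self._margin:
            self._token = self._fetch()
            self._expires_at = time.time() + self._ttl
        return self._token
```

Call cache.get() immediately before creating the WebRTC offer; within the TTL window the cached token is reused, and a near-expired one is transparently replaced.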

Q: Which key should I use for /v1/realtime/calls?
A: Use the encrypted token from client_secrets, not your raw API key.

Q: Should I pass the model parameter when making the call?
A: No, the encrypted token already encodes all routing information including model.

Q: How do I resolve Azure api-version errors?
A: Set the correct api_version in litellm_params (or via the AZURE_API_VERSION environment variable), along with the right api_base and deployment values.

Q: What if I get no audio?
A: Make sure you grant microphone permission, ensure pc.ontrack assigns the audio element with autoplay enabled, check your network/firewall for WebRTC traffic, and inspect the browser console for ICE or SDP errors.

Day 0 Support: GPT-5.4

Sameer Kankute
SWE @ LiteLLM (LLM Translation)
Krrish Dholakia
CEO, LiteLLM
Ishaan Jaff
CTO, LiteLLM

LiteLLM now fully supports GPT-5.4!

Docker Image

docker pull ghcr.io/berriai/litellm:v1.81.14-stable.gpt-5.4_patch

Usage

1. Setup config.yaml

model_list:
  - model_name: gpt-5.4
    litellm_params:
      model: openai/gpt-5.4
      api_key: os.environ/OPENAI_API_KEY

2. Start the proxy

docker run -d \
  -p 4000:4000 \
  -e OPENAI_API_KEY=$OPENAI_API_KEY \
  -v $(pwd)/config.yaml:/app/config.yaml \
  ghcr.io/berriai/litellm:v1.81.14-stable.gpt-5.4_patch \
  --config /app/config.yaml

3. Test it

curl -X POST "http://0.0.0.0:4000/chat/completions" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $LITELLM_KEY" \
  -d '{
    "model": "gpt-5.4",
    "messages": [
      {"role": "user", "content": "Write a Python function to check if a number is prime."}
    ]
  }'

Notes

  • Restart your container to pick up cost tracking for this model.
  • Use the /responses endpoint for the best model performance.
  • GPT-5.4 supports reasoning, function calling, vision, and tool use; see the OpenAI provider docs for advanced usage.

Day 0 Support: GPT-5.3-Codex

Sameer Kankute
SWE @ LiteLLM (LLM Translation)
Krrish Dholakia
CEO, LiteLLM
Ishaan Jaff
CTO, LiteLLM

LiteLLM now supports GPT-5.3-Codex on Day 0, including support for the new assistant phase metadata on Responses API output items.

Why phase matters for GPT-5.3-Codex

phase appears on assistant output items and helps distinguish preamble/commentary turns from final closeout responses.

Reference: Phase parameter docs

Supported values:

  • null
  • "commentary"
  • "final_answer"

Important:

  • Persist assistant output items with phase exactly as returned.
  • Send those assistant items back on the next turn.
  • Do not add phase to user messages.
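The three rules above can be enforced with a small sanitizer run over the conversation history before each turn: assistant items pass through verbatim (phase included), while any phase that leaks onto a user message is dropped. This helper is an illustrative sketch, not part of the LiteLLM SDK.

```python
def sanitize_history(items: list) -> list:
    """Keep phase on assistant output items; strip it from user messages."""
    cleaned = []
    for item in items:
        if item.get("role") == "user" and "phase" in item:
            # Never send phase on user turns
            item = {k: v for k, v in item.items() if k != "phase"}
        cleaned.append(item)  # assistant items pass through verbatim, phase included
    return cleaned
```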

Docker Image

docker pull ghcr.io/berriai/litellm:v1.81.12-stable.gpt-5.3

Usage

1. Setup config.yaml

model_list:
  - model_name: gpt-5.3-codex
    litellm_params:
      model: openai/gpt-5.3-codex
      api_key: os.environ/OPENAI_API_KEY

2. Start the proxy

docker run -d \
  -p 4000:4000 \
  -e OPENAI_API_KEY=$OPENAI_API_KEY \
  -v $(pwd)/config.yaml:/app/config.yaml \
  ghcr.io/berriai/litellm:v1.81.12-stable.gpt-5.3 \
  --config /app/config.yaml

3. Test it

curl -X POST "http://0.0.0.0:4000/v1/responses" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $LITELLM_KEY" \
  -d '{
    "model": "gpt-5.3-codex",
    "input": "Write a Python script that checks if a number is prime."
  }'

Python Example: Persist phase with OpenAI Client + LiteLLM Base URL

from openai import OpenAI

client = OpenAI(
    base_url="http://0.0.0.0:4000/v1",  # LiteLLM Proxy
    api_key="your-litellm-api-key",
)

items = []  # Persist this per conversation/thread


def _item_get(item, key, default=None):
    if isinstance(item, dict):
        return item.get(key, default)
    return getattr(item, key, default)


def run_turn(user_text: str):
    global items

    # User message: no phase field
    items.append(
        {
            "type": "message",
            "role": "user",
            "content": [{"type": "input_text", "text": user_text}],
        }
    )

    resp = client.responses.create(
        model="gpt-5.3-codex",
        input=items,
    )

    # Persist assistant output items verbatim, including phase
    for out_item in (resp.output or []):
        items.append(out_item)

    # Optional: inspect latest phase for UI/telemetry routing
    # (resp.output holds output items directly, so check phase itself
    # rather than a streaming event type)
    latest_phase = None
    for out_item in reversed(resp.output or []):
        if _item_get(out_item, "phase") is not None:
            latest_phase = _item_get(out_item, "phase")
            break

    return resp, latest_phase

Notes

  • Use /v1/responses for GPT Codex models.
  • Preserve full assistant output history for best multi-turn behavior.
  • If phase metadata is dropped during history reconstruction, output quality can degrade on long-running tasks.