WebSocket API
The WebSocket API provides real-time bidirectional communication between your application and the RAVATAR platform. It is used for sending messages and receiving avatar responses in both Chat Mode and Live Mode.
Connection
Establish a WebSocket connection for real-time messaging.
WebSocket URL:
wss://chat.rvtr.ai/ws/chat?token={jwt_token}
Example:
wss://chat.rvtr.ai/ws/chat?token=eyJhbGciOiJIUzI1NiIs...
Connection Flow:
- Obtain a JWT token via POST /jwt (see Authentication)
- Connect to the WebSocket URL with the token as a query parameter
- On successful connection, the client is ready to send and receive messages
- On connection error, implement reconnection logic (see Reconnection Logic below)
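As a minimal sketch of the second step, the URL can be assembled with the standard library so that the token is always safely encoded. The function name `build_ws_url` and the constant `WS_BASE_URL` are illustrative and not part of the RAVATAR API; opening the socket itself is left to whichever WebSocket client library you use.

```python
from urllib.parse import urlencode

# Base endpoint taken from the documentation above.
WS_BASE_URL = "wss://chat.rvtr.ai/ws/chat"


def build_ws_url(jwt_token: str, base_url: str = WS_BASE_URL) -> str:
    """Return the WebSocket URL with the JWT passed as the `token` query parameter."""
    if not jwt_token:
        raise ValueError("jwt_token must be a non-empty string")
    # urlencode percent-escapes any character that is unsafe in a query string.
    return f"{base_url}?{urlencode({'token': jwt_token})}"


if __name__ == "__main__":
    # Placeholder token for illustration only.
    print(build_ws_url("header.payload.signature"))
```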
Outgoing Messages
Text Message
{
"avatar_id": "avatar-id-123",
"user_id": "user-123",
"chat_type": "text",
"language": "en",
"request": "Hello!",
"requestType": "text",
"source": "my-source-app-name",
"isLive": false,
"session": "session-id"
}
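A small helper can keep the text message fields consistent across call sites. The sketch below mirrors the JSON sample above; the helper name `build_text_message` and its default arguments are assumptions made for illustration, and whether any field is optional is not stated in this documentation.

```python
import json


def build_text_message(avatar_id: str, user_id: str, text: str, session: str,
                       language: str = "en",
                       source: str = "my-source-app-name") -> dict:
    """Build a Chat Mode text message with the fields shown in the sample above."""
    return {
        "avatar_id": avatar_id,
        "user_id": user_id,
        "chat_type": "text",
        "language": language,
        "request": text,
        "requestType": "text",
        "source": source,
        "isLive": False,
        "session": session,
    }


if __name__ == "__main__":
    message = build_text_message("avatar-id-123", "user-123", "Hello!", "session-id")
    # Serialize to the JSON string that would be sent over the WebSocket.
    print(json.dumps(message))
```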
Audio Message (Standard)
{
"avatar_id": "avatar-id-123",
"user_id": "user-123",
"chat_type": "voice",
"language": "en",
"requestType": "audio",
"source": "my-source-app-name",
"isLive": false,
"isPushToTalk": true,
"file_base64_data": "base64-encoded-wav-data",
"session": "session-id"
}
Audio Message (PCM for Live Mode)
Use this message format when sending raw PCM audio (recommended for Gemini Native Audio models) during Live Mode.
{
"avatar_id": "avatar-id-123",
"user_id": "user-123",
"chat_type": "voice",
"language": "en",
"requestType": "audio",
"source": "my-source-app-name",
"isLive": true,
"LicenseId": "license-abc-123",
"audio_base64_pcm": "base64-encoded-pcm-data",
"session": "session-id"
}
Live Mode can also send blob/standard audio (e.g., WAV/WebM). In that case, use the standard audio format above, but set isLive: true and include LicenseId (send file_base64_data instead of audio_base64_pcm).
Live Mode audio message (blob/standard):
{
"avatar_id": "avatar-id-123",
"user_id": "user-123",
"chat_type": "voice",
"language": "en",
"requestType": "audio",
"source": "my-source-app-name",
"isLive": true,
"LicenseId": "license-abc-123",
"isPushToTalk": true,
"file_base64_data": "base64-encoded-wav-data",
"session": "session-id"
}
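The three audio variants above differ only in a few fields, which the following sketch makes explicit. The helper name and its parameters (`pcm`, `license_id`, `push_to_talk`) are hypothetical. Treating PCM as Live Mode only, and always sending `isPushToTalk` with blob audio, are readings of the samples above rather than confirmed rules.

```python
import base64
from typing import Optional


def build_audio_message(avatar_id: str, user_id: str, session: str,
                        audio_bytes: bytes, *, pcm: bool = False,
                        license_id: Optional[str] = None,
                        push_to_talk: bool = True, language: str = "en",
                        source: str = "my-source-app-name") -> dict:
    """Build an audio message; passing license_id selects Live Mode."""
    is_live = license_id is not None
    if pcm and not is_live:
        # Assumption: the samples only show PCM audio together with a LicenseId.
        raise ValueError("PCM audio is shown for Live Mode only; pass license_id")
    encoded = base64.b64encode(audio_bytes).decode("ascii")
    message = {
        "avatar_id": avatar_id,
        "user_id": user_id,
        "chat_type": "voice",
        "language": language,
        "requestType": "audio",
        "source": source,
        "isLive": is_live,
        "session": session,
    }
    if is_live:
        message["LicenseId"] = license_id
    if pcm:
        message["audio_base64_pcm"] = encoded
    else:
        # Blob/standard audio (e.g. WAV/WebM) uses file_base64_data.
        message["isPushToTalk"] = push_to_talk
        message["file_base64_data"] = encoded
    return message
```

Calling it with only `audio_bytes` yields the standard Chat Mode shape, adding `license_id` yields the Live Mode blob shape, and adding `pcm=True` as well yields the Live Mode PCM shape.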
Incoming Messages
Chat Response
{
"direction": "incoming",
"user_id": "user-123",
"request": "Hello!",
"avatar_id": "avatar-id-123",
"answer": "Hello! How can I help you today?",
"avatar_name": "Assistant",
"date": 1737820800000,
"chat_type": "text",
"language_code": "en",
"fileUrl": "https://example.com/response.mp4"
}
| Field | Type | Description |
|---|---|---|
| direction | string | "incoming" for avatar responses |
| answer | string | Avatar's text response |
| fileUrl | string | URL to audio/video response (optional) |
System Message
{
"type": "system",
"action": "connected"
}
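Because chat responses carry no `type` field in the samples above while system and event messages do, a client needs some way to tell them apart. The following is one possible sketch; the function names and the "unknown" fallback are assumptions, not documented behavior.

```python
import json
from typing import Optional, Tuple


def classify_incoming(raw: str) -> Tuple[str, dict]:
    """Parse a raw WebSocket frame and label it by the shapes shown above."""
    message = json.loads(raw)
    msg_type = message.get("type")
    if msg_type in ("system", "event"):
        return msg_type, message
    # Chat responses are recognized by direction plus the presence of an answer.
    if message.get("direction") == "incoming" and "answer" in message:
        return "chat_response", message
    return "unknown", message


def response_media_url(message: dict) -> Optional[str]:
    """Return the optional audio/video URL of a chat response, or None."""
    return message.get("fileUrl") or None


if __name__ == "__main__":
    kind, msg = classify_incoming('{"type": "system", "action": "connected"}')
    print(kind, msg["action"])
```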
Event Messages
Outgoing Events
Start Recording Event:
{
"avatar_id": "avatar-id-123",
"user_id": "user-123",
"type": "event",
"direction": "outgoing",
"action": "startRecording",
"isLive": true,
"LicenseId": "license-abc-123",
"session": "session-id"
}
Stop Speech Event (interrupt avatar):
{
"avatar_id": "avatar-id-123",
"user_id": "user-123",
"type": "event",
"direction": "outgoing",
"action": "stopSpeech",
"isLive": true,
"LicenseId": "license-abc-123",
"session": "session-id"
}
Change Avatar Event:
{
"avatar_id": "new-avatar-id",
"user_id": "user-123",
"type": "event",
"direction": "outgoing",
"action": "changeAvatarById",
"isLive": true,
"LicenseId": "license-abc-123",
"session": "session-id"
}
Outgoing Event Actions:
| Action | Description |
|---|---|
| startRecording | Notify that user started recording audio |
| stopSpeech | Interrupt avatar's current speech |
| changeAvatarById | Switch to a different avatar |
| NewPixelStreamingClientConnected | Notify that streaming client connected |
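Since all outgoing events share one envelope and differ only in `action`, a single builder with a whitelist can catch typos before anything is sent. This is a sketch under assumptions: the helper name is hypothetical, and it always sets `isLive: true` with a `LicenseId` because every sample above does, which may not be required in all cases.

```python
# Action names copied from the table above.
OUTGOING_EVENT_ACTIONS = frozenset({
    "startRecording",
    "stopSpeech",
    "changeAvatarById",
    "NewPixelStreamingClientConnected",
})


def build_outgoing_event(action: str, avatar_id: str, user_id: str,
                         session: str, license_id: str) -> dict:
    """Build an outgoing event message in the shape of the samples above.

    For "changeAvatarById", avatar_id is the ID of the avatar to switch to.
    """
    if action not in OUTGOING_EVENT_ACTIONS:
        raise ValueError(f"Unsupported outgoing event action: {action!r}")
    return {
        "avatar_id": avatar_id,
        "user_id": user_id,
        "type": "event",
        "direction": "outgoing",
        "action": action,
        "isLive": True,
        "LicenseId": license_id,
        "session": session,
    }
```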
Incoming Events
Speech Started:
{
"type": "event",
"action": "startSpeech"
}
Speech Ended:
{
"type": "event",
"action": "endSpeech"
}
Live Session Disconnected:
{
"type": "event",
"action": "liveDisconnect"
}
Incoming Event Actions:
| Action | Description | Recommended Handling |
|---|---|---|
| startSpeech | Avatar started speaking | Disable user input, show speaking indicator |
| endSpeech | Avatar stopped speaking | Enable user input, hide speaking indicator |
| liveDisconnect | Live session ended | Show "Session expired" message, prompt to reconnect |
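The recommended handling in the table above can be modeled as a pure function from the current UI state and an incoming event to the next UI state. The `UiState` fields and the notice text are illustrative assumptions. Disabling input after `liveDisconnect` is a design choice made here, not something the table prescribes.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class UiState:
    input_enabled: bool = True
    avatar_speaking: bool = False
    notice: Optional[str] = None


def apply_incoming_event(state: UiState, message: dict) -> UiState:
    """Return the next UI state for an incoming event; ignore anything else."""
    if message.get("type") != "event":
        return state
    action = message.get("action")
    if action == "startSpeech":
        # Avatar is talking: block input and show the speaking indicator.
        return replace(state, input_enabled=False, avatar_speaking=True)
    if action == "endSpeech":
        return replace(state, input_enabled=True, avatar_speaking=False)
    if action == "liveDisconnect":
        # Assumption: input stays disabled until the user reconnects.
        return replace(state, input_enabled=False, avatar_speaking=False,
                       notice="Session expired. Please reconnect.")
    return state
```

Keeping the state immutable makes each transition easy to unit test without a live connection.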
Reconnection Logic
Implement reconnection logic to handle connection failures:
Reconnection Strategy:
- On connection error, wait 500ms
- Attempt reconnection (up to 3 attempts)
- On 2nd attempt, refresh JWT token first
- After max attempts, fall back to error state or HTTP API
Pseudocode:
CONSTANTS:
  MAX_ATTEMPTS = 3
  RECONNECT_DELAY = 500ms
VARIABLES:
  attempt = 0
FUNCTION onConnectionError():
  IF attempt < MAX_ATTEMPTS THEN
    attempt = attempt + 1
    WAIT RECONNECT_DELAY
    IF attempt == 2 THEN
      // Refresh token on second attempt
      token = POST /jwt WITH original credentials
      UPDATE token query parameter in WebSocket URL
    END IF
    CONNECT to WebSocket
  ELSE
    SHOW error state "Connection failed"
  END IF
END FUNCTION
FUNCTION onConnectionOpen():
  // Reset the counter so a later failure gets the full retry budget
  attempt = 0
END FUNCTION
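The same strategy can be sketched in Python as a blocking loop. The `connect` and `refresh_token` callables are hypothetical stand-ins for your WebSocket client and your POST /jwt call. Catching `OSError` is an assumption, since the exception type raised on a failed connection depends on the client library.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def connect_with_retry(connect: Callable[[str], T],
                       refresh_token: Callable[[], str],
                       token: str,
                       max_attempts: int = 3,
                       delay_s: float = 0.5,
                       sleep: Callable[[float], None] = time.sleep) -> T:
    """Retry a failed connection: wait, refresh the token on attempt 2, reconnect."""
    last_error: Optional[BaseException] = None
    for attempt in range(1, max_attempts + 1):
        sleep(delay_s)
        if attempt == 2:
            # A stale token is a plausible cause of repeated failures.
            token = refresh_token()
        try:
            return connect(token)
        except OSError as exc:  # ConnectionError is a subclass of OSError
            last_error = exc
    # Caller decides whether to show an error state or fall back to the HTTP API.
    raise ConnectionError(
        f"Connection failed after {max_attempts} attempts") from last_error
```

Injecting `sleep` keeps the function testable without real delays. Because each call starts its own loop, the attempt counter resets naturally after a successful connection.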
Message Types and Actions Reference
Outgoing Message Types
| Value | Description |
|---|---|
| "text" | Text message |
| "audio" | Audio message |
Outgoing Event Actions
| Value | Description |
|---|---|
| "startRecording" | User started recording |
| "stopSpeech" | Stop avatar speech |
| "changeAvatarById" | Switch avatar |
| "NewPixelStreamingClientConnected" | Streaming client connected |
Incoming Message Types
| Value | Description |
|---|---|
| "system" | System message |
| "event" | Event message |
Incoming Event Actions
| Value | Description |
|---|---|
| "startSpeech" | Avatar started speaking |
| "endSpeech" | Avatar stopped speaking |
| "liveDisconnect" | Live session disconnected |
Session Lifecycle
The following diagram shows the full session lifecycle from initialization to close:
+----------------------------------------------------------+
| 1. INITIALIZE: Generate unique user_id (UUID) |
+----------------------------+-----------------------------+
|
v
+----------------------------------------------------------+
| 2. AUTHENTICATE: POST /jwt -> JWT token |
+----------------------------+-----------------------------+
|
v
+----------------------------------------------------------+
| 3. CONFIGURE: GET /connection -> avatars, languages |
+----------------------------+-----------------------------+
|
v
+----------------------------------------------------------+
| 4. CONNECT: WebSocket wss://chat.rvtr.ai/ws/chat?token= |
+----------------------------+-----------------------------+
|
v
+----------------------------------------------------------+
| 5. CHOOSE MODE |
+----------------------------+-----------------------------+
| CHAT MODE | LIVE MODE |
| - isLive: false | - isLive: true |
| - Blob/standard audio | - PCM or blob audio |
| - No LicenseId | - POST /startLiveSession |
| | - Store LicenseId |
+-------------+--------------+--------------+--------------+
+--------------+--------------+
|
v
+----------------------------------------------------------+
| 6. MESSAGING: Send/receive via WebSocket |
+----------------------------+-----------------------------+
|
v
+----------------------------------------------------------+
| 7. CLOSE: Live -> POST /endLiveSession, close WebSocket |
+----------------------------------------------------------+
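As a rough checklist, the diagram can be expressed as an ordered list of steps that differs between the two modes only by the Live session calls. The function names are hypothetical and the step strings are descriptive labels, not API calls; request bodies for the HTTP endpoints are documented elsewhere and are not assumed here.

```python
import uuid
from typing import List


def new_user_id() -> str:
    """Step 1: generate a unique user_id as a UUID string."""
    return str(uuid.uuid4())


def lifecycle_steps(live: bool) -> List[str]:
    """Return the ordered session steps from the diagram for the chosen mode."""
    steps = [
        "generate user_id (UUID)",
        "POST /jwt",
        "GET /connection",
        "connect WebSocket",
    ]
    if live:
        # Live Mode needs a LicenseId before any message is sent.
        steps.append("POST /startLiveSession (store LicenseId)")
    steps.append("send/receive messages")
    if live:
        steps.append("POST /endLiveSession")
    steps.append("close WebSocket")
    return steps


if __name__ == "__main__":
    for number, step in enumerate(lifecycle_steps(live=True), start=1):
        print(number, step)
```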