WebSocket API

The WebSocket API provides real-time bidirectional communication between your application and the RAVATAR platform. It is used for sending messages and receiving avatar responses in both Chat Mode and Live Mode.

Connection

Establish a WebSocket connection for real-time messaging.

WebSocket URL:

wss://chat.rvtr.ai/ws/chat?token={jwt_token}

Example:

wss://chat.rvtr.ai/ws/chat?token=eyJhbGciOiJIUzI1NiIs...

Connection Flow:

  1. Obtain JWT token via POST /jwt (see Authentication)
  2. Connect to WebSocket URL with token as query parameter
  3. On successful connection, the client is ready to send and receive messages
  4. On connection error, implement reconnection logic (see Reconnection Logic below)
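Step 2 of the flow above can be sketched as follows. This is a minimal illustration, not an official client: the helper name build_ws_url is hypothetical, and percent-encoding the token is an assumption (a standard JWT contains only URL-safe characters, so encoding leaves it unchanged).

```python
from urllib.parse import urlencode

# WebSocket endpoint as documented above.
WS_BASE_URL = "wss://chat.rvtr.ai/ws/chat"


def build_ws_url(jwt_token: str, base_url: str = WS_BASE_URL) -> str:
    """Return the WebSocket URL with the JWT attached as the `token` query parameter."""
    if not jwt_token:
        raise ValueError("jwt_token must be a non-empty string")
    # urlencode percent-escapes characters that are unsafe in a query string.
    return f"{base_url}?{urlencode({'token': jwt_token})}"
```

The resulting string can be passed to any WebSocket client library; which library to use is left to the integrator.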

Outgoing Messages

Text Message

{
  "avatar_id": "avatar-id-123",
  "user_id": "user-123",
  "chat_type": "text",
  "language": "en",
  "request": "Hello!",
  "requestType": "text",
  "source": "my-source-app-name",
  "isLive": false,
  "session": "session-id"
}
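A text message like the one above could be assembled with a small helper. The function name build_text_message and its default arguments are illustrative assumptions; only the field names and values come from the example.

```python
import json


def build_text_message(avatar_id: str, user_id: str, session: str, text: str,
                       language: str = "en",
                       source: str = "my-source-app-name") -> dict:
    """Build a Chat Mode text message with the fields shown in the example above."""
    return {
        "avatar_id": avatar_id,
        "user_id": user_id,
        "chat_type": "text",
        "language": language,
        "request": text,
        "requestType": "text",
        "source": source,
        "isLive": False,  # Chat Mode; serialized as JSON false
        "session": session,
    }


# Serialize to the JSON string that would be sent over the WebSocket.
payload = json.dumps(build_text_message("avatar-id-123", "user-123", "session-id", "Hello!"))
```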

Audio Message (Standard)

{
  "avatar_id": "avatar-id-123",
  "user_id": "user-123",
  "chat_type": "voice",
  "language": "en",
  "requestType": "audio",
  "source": "my-source-app-name",
  "isLive": false,
  "isPushToTalk": true,
  "file_base64_data": "base64-encoded-wav-data",
  "session": "session-id"
}

Audio Message (PCM for Live Mode)

Use this message format when sending raw PCM audio (recommended for Gemini Native Audio models) during Live Mode.

{
  "avatar_id": "avatar-id-123",
  "user_id": "user-123",
  "chat_type": "voice",
  "language": "en",
  "requestType": "audio",
  "source": "my-source-app-name",
  "isLive": true,
  "LicenseId": "license-abc-123",
  "audio_base64_pcm": "base64-encoded-pcm-data",
  "session": "session-id"
}

Note: Live Mode can also send blob/standard audio (e.g., WAV/WebM). In that case, use the standard audio format above, but set isLive: true and include LicenseId (send file_base64_data instead of audio_base64_pcm).

Live Mode audio message (blob/standard):

{
  "avatar_id": "avatar-id-123",
  "user_id": "user-123",
  "chat_type": "voice",
  "language": "en",
  "requestType": "audio",
  "source": "my-source-app-name",
  "isLive": true,
  "LicenseId": "license-abc-123",
  "isPushToTalk": true,
  "file_base64_data": "base64-encoded-wav-data",
  "session": "session-id"
}
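The three audio variants above differ only in isLive, LicenseId, and the name of the audio field, so a single helper can cover them. The sketch below is an assumption-laden illustration: the function name and parameters are hypothetical, standard (not URL-safe) base64 is assumed, and Live Mode is inferred from the presence of a LicenseId.

```python
import base64
from typing import Optional


def build_audio_message(avatar_id: str, user_id: str, session: str, audio: bytes, *,
                        pcm: bool = False, license_id: Optional[str] = None,
                        language: str = "en", source: str = "my-source-app-name",
                        push_to_talk: bool = True) -> dict:
    """Build an audio message; passing license_id switches the message to Live Mode."""
    is_live = license_id is not None
    if pcm and not is_live:
        # The PCM format is documented for Live Mode only.
        raise ValueError("PCM audio requires a Live Mode LicenseId")
    message = {
        "avatar_id": avatar_id,
        "user_id": user_id,
        "chat_type": "voice",
        "language": language,
        "requestType": "audio",
        "source": source,
        "isLive": is_live,
    }
    if is_live:
        message["LicenseId"] = license_id
    encoded = base64.b64encode(audio).decode("ascii")
    if pcm:
        message["audio_base64_pcm"] = encoded
    else:
        message["isPushToTalk"] = push_to_talk
        message["file_base64_data"] = encoded
    message["session"] = session
    return message
```

Calling it without license_id yields the standard Chat Mode message; adding license_id yields the Live Mode blob message, and pcm=True additionally swaps in the audio_base64_pcm field.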

Incoming Messages

Chat Response

{
  "direction": "incoming",
  "user_id": "user-123",
  "request": "Hello!",
  "avatar_id": "avatar-id-123",
  "answer": "Hello! How can I help you today?",
  "avatar_name": "Assistant",
  "date": 1737820800000,
  "chat_type": "text",
  "language_code": "en",
  "fileUrl": "https://example.com/response.mp4"
}

Field      Type    Description
direction  string  "incoming" for avatar responses
answer     string  Avatar's text response
fileUrl    string  URL to audio/video response (optional)
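A client might parse a chat response along these lines. The class and function names are hypothetical, and treating date as Unix epoch milliseconds is an assumption based on the 13-digit value in the example; the source does not state the unit.

```python
import json
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass
class ChatResponse:
    answer: str
    avatar_id: str
    sent_at: Optional[datetime]      # None when the message carries no "date"
    file_url: Optional[str] = None   # fileUrl is optional


def parse_chat_response(raw: str) -> Optional[ChatResponse]:
    """Return a ChatResponse, or None if the message is not an avatar response."""
    data = json.loads(raw)
    if data.get("direction") != "incoming":
        return None
    sent_at = None
    if "date" in data:
        # Assumption: "date" is milliseconds since the Unix epoch.
        sent_at = datetime.fromtimestamp(data["date"] / 1000, tz=timezone.utc)
    return ChatResponse(
        answer=data.get("answer", ""),
        avatar_id=data.get("avatar_id", ""),
        sent_at=sent_at,
        file_url=data.get("fileUrl"),
    )
```

Messages without direction set to "incoming" (such as system and event messages) return None, so the caller can route them to other handlers.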

System Message

{
  "type": "system",
  "action": "connected"
}

Event Messages

Outgoing Events

Start Recording Event:

{
  "avatar_id": "avatar-id-123",
  "user_id": "user-123",
  "type": "event",
  "direction": "outgoing",
  "action": "startRecording",
  "isLive": true,
  "LicenseId": "license-abc-123",
  "session": "session-id"
}

Stop Speech Event (interrupt avatar):

{
  "avatar_id": "avatar-id-123",
  "user_id": "user-123",
  "type": "event",
  "direction": "outgoing",
  "action": "stopSpeech",
  "isLive": true,
  "LicenseId": "license-abc-123",
  "session": "session-id"
}

Change Avatar Event:

{
  "avatar_id": "new-avatar-id",
  "user_id": "user-123",
  "type": "event",
  "direction": "outgoing",
  "action": "changeAvatarById",
  "isLive": true,
  "LicenseId": "license-abc-123",
  "session": "session-id"
}

Outgoing Event Actions:

Action                            Description
startRecording                    Notify that user started recording audio
stopSpeech                        Interrupt avatar's current speech
changeAvatarById                  Switch to a different avatar
NewPixelStreamingClientConnected  Notify that streaming client connected
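Because the documented outgoing events share one shape and differ only in action (and, for changeAvatarById, in which avatar_id is sent), they can be produced by one helper. The name build_event is hypothetical. No example payload is shown for NewPixelStreamingClientConnected, so giving it the same shape as the other events is an assumption.

```python
OUTGOING_ACTIONS = {
    "startRecording",
    "stopSpeech",
    "changeAvatarById",
    "NewPixelStreamingClientConnected",  # payload shape assumed, not documented
}


def build_event(action: str, avatar_id: str, user_id: str,
                session: str, license_id: str) -> dict:
    """Build an outgoing event message matching the examples above."""
    if action not in OUTGOING_ACTIONS:
        raise ValueError(f"unknown outgoing action: {action!r}")
    return {
        # For changeAvatarById, avatar_id is the avatar to switch to.
        "avatar_id": avatar_id,
        "user_id": user_id,
        "type": "event",
        "direction": "outgoing",
        "action": action,
        "isLive": True,  # every documented example sets isLive: true
        "LicenseId": license_id,
        "session": session,
    }
```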

Incoming Events

Speech Started:

{
  "type": "event",
  "action": "startSpeech"
}

Speech Ended:

{
  "type": "event",
  "action": "endSpeech"
}

Live Session Disconnected:

{
  "type": "event",
  "action": "liveDisconnect"
}

Incoming Event Actions:

Action          Description              Recommended Handling
startSpeech     Avatar started speaking  Disable user input, show speaking indicator
endSpeech       Avatar stopped speaking  Enable user input, hide speaking indicator
liveDisconnect  Live session ended       Show "Session expired" message, prompt to reconnect
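The recommended handling above could be modeled as a small state update. The UiState class and its fields are hypothetical stand-ins for whatever UI layer the application uses; clearing the speaking indicator on liveDisconnect is an assumption.

```python
import json
from dataclasses import dataclass


@dataclass
class UiState:
    input_enabled: bool = True
    speaking: bool = False
    notice: str = ""


def apply_event(state: UiState, raw: str) -> UiState:
    """Update the UI state in place for an incoming event; ignore other messages."""
    data = json.loads(raw)
    if data.get("type") != "event":
        return state
    action = data.get("action")
    if action == "startSpeech":
        state.input_enabled = False
        state.speaking = True
    elif action == "endSpeech":
        state.input_enabled = True
        state.speaking = False
    elif action == "liveDisconnect":
        state.speaking = False  # assumption: a disconnected avatar is no longer speaking
        state.notice = "Session expired"  # the UI should also prompt to reconnect
    return state
```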

Reconnection Logic

Implement reconnection logic to handle connection failures:

Reconnection Strategy:

  1. On connection error, wait 500ms
  2. Attempt reconnection (up to 3 attempts)
  3. On 2nd attempt, refresh JWT token first
  4. After max attempts, fall back to error state or HTTP API

Pseudocode:

CONSTANTS:
  MAX_ATTEMPTS = 3
  RECONNECT_DELAY = 500ms

VARIABLES:
  attempt = 0

FUNCTION onConnectionError():
  IF attempt < MAX_ATTEMPTS THEN
    attempt = attempt + 1

    WAIT RECONNECT_DELAY

    IF attempt == 2 THEN
      // Refresh token on second attempt
      token = POST /jwt WITH original credentials
      UPDATE token query parameter in WebSocket URL
    END IF

    CONNECT to WebSocket
  ELSE
    SHOW error state "Connection failed"
  END IF
END FUNCTION
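The same strategy can be written as a loop in Python. This is a sketch under stated assumptions: connect and refresh_token are caller-supplied callables (hypothetical names) wrapping the real WebSocket client and the POST /jwt call, and connect is assumed to raise ConnectionError on failure. The sleep parameter exists only so the delay can be replaced in tests.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")

MAX_ATTEMPTS = 3
RECONNECT_DELAY_SECONDS = 0.5  # 500ms, as in the strategy above


def reconnect(connect: Callable[[str], T],
              refresh_token: Callable[[], str],
              token: str,
              sleep: Callable[[float], None] = time.sleep) -> Optional[T]:
    """Try to reconnect up to MAX_ATTEMPTS times; return the connection or None."""
    for attempt in range(1, MAX_ATTEMPTS + 1):
        sleep(RECONNECT_DELAY_SECONDS)
        if attempt == 2:
            # Refresh the JWT before the second attempt; later attempts reuse it.
            token = refresh_token()
        try:
            return connect(token)
        except ConnectionError:
            continue  # this attempt failed; try again if attempts remain
    return None  # caller should show "Connection failed" or fall back to the HTTP API
```

Returning None instead of raising leaves the choice between an error state and the HTTP fallback to the caller.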

Message Types and Actions Reference

Outgoing Message Types

Value    Description
"text"   Text message
"audio"  Audio message

Outgoing Event Actions

Value                               Description
"startRecording"                    User started recording
"stopSpeech"                        Stop avatar speech
"changeAvatarById"                  Switch avatar
"NewPixelStreamingClientConnected"  Streaming client connected

Incoming Message Types

Value     Description
"system"  System message
"event"   Event message

Incoming Event Actions

Value             Description
"startSpeech"     Avatar started speaking
"endSpeech"       Avatar stopped speaking
"liveDisconnect"  Live session disconnected

Session Lifecycle

The following diagram shows the full session lifecycle from initialization to close:

+----------------------------------------------------------+
| 1. INITIALIZE: Generate unique user_id (UUID)            |
+----------------------------+-----------------------------+
                             |
                             v
+----------------------------------------------------------+
| 2. AUTHENTICATE: POST /jwt -> JWT token                  |
+----------------------------+-----------------------------+
                             |
                             v
+----------------------------------------------------------+
| 3. CONFIGURE: GET /connection -> avatars, languages      |
+----------------------------+-----------------------------+
                             |
                             v
+----------------------------------------------------------+
| 4. CONNECT: WebSocket wss://chat.rvtr.ai/ws/chat?token=  |
+----------------------------+-----------------------------+
                             |
                             v
+----------------------------------------------------------+
| 5. CHOOSE MODE                                           |
+----------------------------+-----------------------------+
| CHAT MODE                  | LIVE MODE                   |
| - isLive: false            | - isLive: true              |
| - Blob/standard audio      | - PCM or blob audio         |
| - No LicenseId             | - POST /startLiveSession    |
|                            | - Store LicenseId           |
+-------------+--------------+--------------+--------------+
              +--------------+--------------+
                             |
                             v
+----------------------------------------------------------+
| 6. MESSAGING: Send/receive via WebSocket                 |
+----------------------------+-----------------------------+
                             |
                             v
+----------------------------------------------------------+
| 7. CLOSE: Live -> POST /endLiveSession, close WebSocket  |
+----------------------------------------------------------+