Introduction

TTP Agent SDK is a JavaScript library for building AI-powered voice and chat interactions in your web applications.

What We Offer

🎙️💬

Voice & Chat Widget

Drop-in widget with voice & text for any website

🎤

Key Features

🔒

Simple Authentication

Direct connection using agentId and appId with domain whitelist access control

🎨

Fully Customizable

Colors, branding, languages, RTL support, and custom agent settings

📱

Mobile Optimized

Works seamlessly on desktop, tablet, and mobile devices

🌍

Multi-language

Built-in support for multiple languages and custom translations

⚡

Real-time Streaming

WebSocket-based audio streaming with low latency

🔧

Easy Integration

Simple CDN setup or NPM package with comprehensive API

Installation

NPM

npm install ttp-agent-sdk

CDN

<script src="https://unpkg.com/ttp-agent-sdk/dist/agent-widget.js"></script>

Import

// ES6 Import
import { VoiceSDK } from 'ttp-agent-sdk';

// CommonJS
const { VoiceSDK } = require('ttp-agent-sdk');

// Browser Global
const sdk = new window.TTPAgentSDK.VoiceSDK(config);

Quick Start

Get up and running in 5 minutes with this simple example.

1. Initialize the SDK

Create a VoiceSDK instance with your agent ID and app ID:

import { VoiceSDK } from 'ttp-agent-sdk';

const voiceSDK = new VoiceSDK({
  agentId: 'agent_123',     // The AI agent to connect to
  appId: 'your_app_id',     // Your application ID
  
  // Optional: Configure audio formats (v2 protocol)
  outputContainer: 'raw',      // 'raw' or 'wav'
  outputEncoding: 'pcm',       // 'pcm', 'pcmu', 'pcma'
  outputSampleRate: 24000,     // Typical server/TTS output (default)
  protocolVersion: 2           // Use v2 protocol for format negotiation
});

// Listen to events
voiceSDK.on('connected', () => {
  console.log('✅ Connected to agent');
});

voiceSDK.on('formatNegotiated', (format) => {
  console.log('✅ Format negotiated:', format);
  // Format contains: container, encoding, sampleRate, channels, bitDepth
});

voiceSDK.on('message', (msg) => {
  if (msg.t === 'agent_response') {
    console.log('Agent:', msg.agent_response);
  }
});

2. Connect & Start Recording

Connect to the agent and start capturing audio. If your agent uses a server-driven disclaimer, listen for disclaimersRequired, call sendDisclaimerAck(true) after the user accepts, then call startRecording().

// Connect
await voiceSDK.connect();

// Start recording
await voiceSDK.startRecording();

// Stop recording
await voiceSDK.stopRecording();

Authentication

The SDK connects directly using agentId and appId. No server-side authentication step is needed. Access control is managed via domain whitelist in your agent's admin panel.

How It Works

💡 Simple Setup: Just provide your agentId and appId in the SDK configuration. The SDK connects directly to the TTP backend via WebSocket.

const voiceSDK = new VoiceSDK({
  agentId: 'agent_123',     // Your AI agent ID
  appId: 'your_app_id'      // Your application ID
});

Domain Whitelist

To control which websites can use your agent, configure a domain whitelist in your agent's admin panel. Only requests originating from whitelisted domains will be accepted.

💡 Tip: During development, you can add localhost to the domain whitelist. Remove it before deploying to production.

Configuration Parameters

Parameter	Type	Required	Description
`agentId`	string	Yes	The AI agent identifier
`appId`	string	Yes	Your application identifier

Agent Settings Override

NEW FEATURE

Dynamically customize agent behavior, voice, and personality on a per-session basis.

💡 Access Control: Agent settings override is available when the agent has domain whitelist configured in the admin panel. No additional authentication step is required.

How It Works

Configure: Set up a domain whitelist for your agent in the admin panel
Initialize: Pass agentSettingsOverride in the SDK configuration
Connect: The SDK sends overrides in the hello message
TTP Backend: Validates the domain and applies your overrides

Example

const voiceSDK = new VoiceSDK({
  agentId: 'agent_123',     // The AI agent to connect to
  appId: 'your_app_id',     // Your application ID
  
  // Override agent settings
  agentSettingsOverride: {
    // Core settings
    prompt: "You are a friendly Spanish-speaking travel assistant",
    language: "es",
    temperature: 0.9,
    maxTokens: 200,
    
    // Voice settings
    voiceSpeed: 1.2,
    voiceId: "nova",  // Use voiceId (not selectedVoice)
    
    // Behavior
    firstMessage: "¡Hola! ¿Cómo puedo ayudarte hoy?",
    disableInterruptions: false,
    autoDetectLanguage: true,
    
    // Tools (optional)
    toolIds: [123, 456, 789],              // Custom tool IDs
    internalToolIds: ['calendar', 'email'] // Internal tool IDs
  }
});

Available Override Settings

15 out of 16 settings can be overridden. Only model selection is not supported (requires infrastructure changes).

📝 Core Settings

prompt - System prompt/instructions
temperature - LLM temperature (0-2)
maxTokens - Maximum response tokens
model - ⚠️ NOT SUPPORTED
language - Response language code

🔊 Voice Settings

voiceId - Specific voice ID
voiceSpeed - Speed multiplier (0.5-2)

⚙️ Behavior

firstMessage - Initial greeting
disableInterruptions - Allow/prevent barge-in
autoDetectLanguage - Auto language detection
candidateLanguages - List of languages for auto-detection
maxCallDuration - Max session duration (seconds)

🛠️ Advanced

toolIds - Array of custom tool IDs
internalToolIds - Array of internal tool IDs
timezone - User timezone

⚠️ Validation: All overrides are validated and sanitized on the server. Invalid values will be rejected or clamped to safe ranges.
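Because the server may reject or clamp out-of-range values, it can help to clamp them on the client first so that what you send matches what the agent uses. The sketch below is a hypothetical helper (clampOverrides is not part of the SDK) that covers only the two ranges documented above: temperature (0-2) and voiceSpeed (0.5-2). The server remains the source of truth and may apply different rules.

```javascript
// Hypothetical helper, not part of the SDK. It only knows the two
// numeric ranges documented in the settings list above.
const DOCUMENTED_RANGES = {
  temperature: [0, 2],
  voiceSpeed: [0.5, 2],
};

function clampOverrides(overrides) {
  const result = { ...overrides };
  for (const [key, [min, max]] of Object.entries(DOCUMENTED_RANGES)) {
    if (typeof result[key] === 'number') {
      result[key] = Math.min(max, Math.max(min, result[key]));
    }
  }
  // Model selection is documented as not supported, so it is dropped here.
  delete result.model;
  return result;
}

// temperature becomes 2, voiceSpeed becomes 0.5, language is untouched.
const safeOverrides = clampOverrides({ temperature: 3, voiceSpeed: 0.2, language: 'es' });
console.log(safeOverrides);
```

The result can then be passed as agentSettingsOverride in the SDK configuration.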

Variables in Hello Request

Overview

Variables allow you to pass dynamic values to your agent that will be used to replace placeholders in the system prompt and first message. Variables sent in the hello request take precedence over default variables stored in the agent configuration.

Hello Message Format

SDK v2 Format (Recommended)

When using VoiceSDK v2, variables are passed in the SDK constructor:

import { VoiceSDK_v2 } from 'ttp-agent-sdk';

const voiceSDK = new VoiceSDK_v2({
  agentId: 'agent_5a2b984c1',
  appId: 'app_Bc01EqMQt2Euehl4qqZSi6l3FJP42Q9vJ0pC',
  
  // Variables (optional)
  variables: {
    USER_NAME: 'John',
    ACCOUNT_TYPE: 'premium',
    LANGUAGE: 'en-US'
  },
  
  // Audio format configuration
  sampleRate: 16000,
  channels: 1,
  bitDepth: 16,
  outputContainer: 'raw',
  outputEncoding: 'pcm',
  outputSampleRate: 24000,
  outputChannels: 1,
  outputBitDepth: 16,
  outputFrameDurationMs: 600
});

await voiceSDK.connect();

Raw WebSocket Format

If connecting via raw WebSocket (without SDK), send variables in the hello message:

{
  "t": "hello",
  "v": 2,
  "variables": {
    "USER_NAME": "John",
    "ACCOUNT_TYPE": "premium",
    "LANGUAGE": "en-US"
  },
  "inputFormat": {
    "encoding": "pcm",
    "sampleRate": 16000,
    "channels": 1,
    "bitDepth": 16
  },
  "requestedOutputFormat": {
    "encoding": "pcm",
    "sampleRate": 24000,
    "channels": 1,
    "bitDepth": 16,
    "container": "raw"
  },
  "outputFrameDurationMs": 600
}

Variable Format

Variables are sent as a JSON object where:

Keys: Variable names (case-sensitive, e.g., USER_NAME)
Values: String values that will replace {{VARIABLE_NAME}} in the prompt

Example

{
  "variables": {
    "USER_NAME": "John",
    "ACCOUNT_TYPE": "premium",
    "LANGUAGE": "en-US",
    "COMPANY": "Acme Corp"
  }
}

Variable Replacement Priority

Variables are replaced in the following priority order:

Hello Variables (highest priority) - Variables sent in the hello request
Default Variables - Variables stored in agent configuration (Redis)
Leave as-is - If no value found, {{VARIABLE_NAME}} remains unchanged

Example Priority

Agent Configuration (Redis):

{
  "USER_NAME": "David",
  "ACCOUNT_TYPE": "premium"
}

Hello Request:

{
  "variables": {
    "USER_NAME": "John"
  }
}

Result:

{{USER_NAME}} → "John" (from hello - takes precedence)
{{ACCOUNT_TYPE}} → "premium" (from defaults - hello doesn't override)
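The priority rules above can be reproduced locally to preview how a prompt will look before you connect. This is an illustrative re-implementation, not the backend code: renderTemplate is a hypothetical name, and it assumes placeholder names consist of letters, digits, and underscores.

```javascript
// Illustrative only: mirrors the documented priority
// (hello variables > default variables > leave placeholder as-is).
function renderTemplate(template, helloVariables, defaultVariables) {
  // Later spreads win, so hello variables override defaults of the same name.
  const merged = { ...(defaultVariables || {}), ...(helloVariables || {}) };
  return template.replace(/\{\{(\w+)\}\}/g, (placeholder, name) =>
    Object.prototype.hasOwnProperty.call(merged, name)
      ? String(merged[name])
      : placeholder // no value found: keep {{NAME}} unchanged
  );
}

const prompt = 'Hello {{USER_NAME}}, your plan is {{ACCOUNT_TYPE}}. Agent: {{AGENT_NAME}}.';
const rendered = renderTemplate(
  prompt,
  { USER_NAME: 'John' },                           // hello request
  { USER_NAME: 'David', ACCOUNT_TYPE: 'premium' }  // agent configuration defaults
);
console.log(rendered);
// Hello John, your plan is premium. Agent: {{AGENT_NAME}}.
```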

Usage Examples

Example 1: JavaScript/TypeScript with SDK v2

import { VoiceSDK_v2 } from 'ttp-agent-sdk';

const voiceSDK = new VoiceSDK_v2({
  agentId: 'agent_5a2b984c1',
  appId: 'app_Bc01EqMQt2Euehl4qqZSi6l3FJP42Q9vJ0pC',
  
  variables: {
    USER_NAME: 'John Doe',
    ACCOUNT_TYPE: 'premium',
    LANGUAGE: 'en-US'
  },
  
  // ... audio format config
});

await voiceSDK.connect();

Example 2: Raw WebSocket (JavaScript)

const ws = new WebSocket('wss://speech.talktopc.com/ws/conv?agentId=agent_5a2b984c1&appId=app_Bc01EqMQt2Euehl4qqZSi6l3FJP42Q9vJ0pC');

ws.onopen = () => {
  const helloMessage = {
    t: 'hello',
    v: 2,
    variables: {
      USER_NAME: 'John',
      ACCOUNT_TYPE: 'premium',
      LANGUAGE: 'en-US'
    },
    inputFormat: {
      encoding: 'pcm',
      sampleRate: 16000,
      channels: 1,
      bitDepth: 16
    },
    requestedOutputFormat: {
      encoding: 'pcm',
      sampleRate: 24000,
      channels: 1,
      bitDepth: 16,
      container: 'raw'
    },
    outputFrameDurationMs: 600
  };
  
  ws.send(JSON.stringify(helloMessage));
};

Example 3: Python WebSocket

import websocket
import json

def on_open(ws):
    hello_message = {
        "t": "hello",
        "v": 2,
        "variables": {
            "USER_NAME": "John",
            "ACCOUNT_TYPE": "premium",
            "LANGUAGE": "en-US"
        },
        "inputFormat": {
            "encoding": "pcm",
            "sampleRate": 16000,
            "channels": 1,
            "bitDepth": 16
        },
        "requestedOutputFormat": {
            "encoding": "pcm",
            "sampleRate": 24000,
            "channels": 1,
            "bitDepth": 16,
            "container": "raw"
        },
        "outputFrameDurationMs": 600
    }
    
    ws.send(json.dumps(hello_message))

ws = websocket.WebSocketApp(
    "wss://speech.talktopc.com/ws/conv?agentId=agent_5a2b984c1&appId=app_Bc01EqMQt2Euehl4qqZSi6l3FJP42Q9vJ0pC",
    on_open=on_open
)
ws.run_forever()

Example 4: wscat

# Using wscat
wscat -c "wss://speech.talktopc.com/ws/conv?agentId=agent_5a2b984c1&appId=app_Bc01EqMQt2Euehl4qqZSi6l3FJP42Q9vJ0pC"

# Then send:
{"t":"hello","v":2,"variables":{"USER_NAME":"John","ACCOUNT_TYPE":"premium"},"inputFormat":{"encoding":"pcm","sampleRate":16000,"channels":1,"bitDepth":16},"requestedOutputFormat":{"encoding":"pcm","sampleRate":24000,"channels":1,"bitDepth":16,"container":"raw"},"outputFrameDurationMs":600}

Agent Prompt Setup

To use variables in your agent, include placeholders in the system prompt or first message:

System Prompt Example

Your name is {{AGENT_NAME}}.
You are helping {{USER_NAME}} who has a {{ACCOUNT_TYPE}} account.
Speak in {{LANGUAGE}}.

First Message Example

Hello {{USER_NAME}}! Welcome to our {{ACCOUNT_TYPE}} service.

Backend Processing

When the hello request is received with variables:

Variables are extracted from the hello message
Default variables are loaded from agent configuration (Redis)
Variables are merged (hello variables take precedence)
Prompt is processed - {{VARIABLE_NAME}} placeholders are replaced
First message is processed - Variables are replaced here too
Metadata is added - Variables metadata section is appended to prompt

Server Logs

After sending hello with variables, check server logs for:

📝 Processing variables from hello message: [USER_NAME, ACCOUNT_TYPE, LANGUAGE]
✅ Variables processed and prompt updated
📝 FINAL PROCESSED PROMPT (agentId: ...):
[Prompt with variables replaced]
✅ Processed variables: 3 hello variables, 2 default variables, metadata: added

Variable Naming Conventions

Use UPPERCASE with underscores: USER_NAME, ACCOUNT_TYPE
Variable names are case-sensitive
Avoid special characters except underscores
Recommended format: {{VARIABLE_NAME}} in prompts
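A small check can flag names that drift from this convention before they reach the agent. The helper name and the exact pattern below are assumptions made for illustration: the pattern requires a leading uppercase letter and allows digits, which is stricter than anything the document states. It checks style only and says nothing about what the server accepts.

```javascript
// Assumed pattern for the recommended style: uppercase letters, digits, underscores.
const RECOMMENDED_NAME = /^[A-Z][A-Z0-9_]*$/;

// Returns the variable names that do not follow the recommended style.
function findUnconventionalNames(variables) {
  return Object.keys(variables || {}).filter((name) => !RECOMMENDED_NAME.test(name));
}

const flagged = findUnconventionalNames({
  USER_NAME: 'John',
  userName: 'John',
  'ACCOUNT-TYPE': 'premium',
});
console.log(flagged); // userName and ACCOUNT-TYPE are flagged
```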

Common Use Cases

1. User Personalization

{
  "variables": {
    "USER_NAME": "John Doe",
    "USER_EMAIL": "john@example.com"
  }
}

2. Account Context

{
  "variables": {
    "ACCOUNT_TYPE": "premium",
    "SUBSCRIPTION_STATUS": "active"
  }
}

3. Language/Localization

{
  "variables": {
    "LANGUAGE": "en-US",
    "CURRENCY": "USD"
  }
}

4. Session Context

{
  "variables": {
    "SESSION_ID": "abc123",
    "PAGE_URL": "https://example.com/products"
  }
}

Error Handling

Missing Variables

If a variable is referenced in the prompt but not provided:

Has default value: Uses default from agent configuration
No default value: Placeholder remains unchanged ({{VARIABLE_NAME}})

Invalid Variable Format

Variables must be a JSON object
Values should be strings (will be converted to string if needed)
Empty object {} or null is valid (will use defaults only)
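If your application collects variables from mixed sources, you may prefer to normalize them yourself instead of relying on server-side conversion. The following is a minimal sketch under stated assumptions: normalizeVariables is a hypothetical helper, and skipping null or undefined values (so the defaults apply) is a choice made here, not documented server behavior.

```javascript
// Hypothetical client-side normalizer for the "variables" object.
function normalizeVariables(variables) {
  // null or a missing value is valid: only defaults will be used.
  if (variables === null || variables === undefined) return {};
  if (typeof variables !== 'object' || Array.isArray(variables)) {
    throw new TypeError('variables must be a plain object');
  }
  const normalized = {};
  for (const [name, value] of Object.entries(variables)) {
    // Assumption: skip empty entries so the agent default can apply.
    if (value === null || value === undefined) continue;
    normalized[name] = String(value);
  }
  return normalized;
}

console.log(normalizeVariables({ VISIT_COUNT: 3, IS_PREMIUM: true, NOTE: null }));
// { VISIT_COUNT: '3', IS_PREMIUM: 'true' }
```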

Best Practices

Set defaults in agent configuration for all variables
Override with hello variables only when you have dynamic values
Use descriptive names that clearly indicate the variable's purpose
Document variables in your agent's description or notes
Test variables by checking server logs for "FINAL PROCESSED PROMPT"

API Reference

Hello Message Structure

interface HelloMessage {
  t: "hello";                    // Message type
  v?: number;                    // Protocol version (2 for v2)
  variables?: {                   // Optional variables object
    [key: string]: string;        // Variable name -> value mapping
  };
  inputFormat?: AudioFormat;     // Input audio format
  requestedOutputFormat?: AudioFormat; // Output audio format
  outputFrameDurationMs?: number; // Frame duration for streaming
}

interface AudioFormat {
  encoding: "pcm" | "pcmu" | "pcma";
  sampleRate: number;
  channels: number;
  bitDepth: number;
  container?: "raw" | "wav";      // For output format only
}
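For raw WebSocket clients, a small builder keeps the hello frame consistent with the structure above. This is a sketch: buildHelloMessage is a hypothetical helper, and omitting optional fields when they are not provided is a choice made here rather than a documented requirement.

```javascript
// Hypothetical builder for the hello frame described by HelloMessage.
function buildHelloMessage({ variables, inputFormat, requestedOutputFormat, outputFrameDurationMs } = {}) {
  const hello = { t: 'hello', v: 2 };
  if (variables && Object.keys(variables).length > 0) hello.variables = variables;
  if (inputFormat) hello.inputFormat = inputFormat;
  if (requestedOutputFormat) hello.requestedOutputFormat = requestedOutputFormat;
  if (typeof outputFrameDurationMs === 'number') hello.outputFrameDurationMs = outputFrameDurationMs;
  return hello;
}

const helloFrame = JSON.stringify(
  buildHelloMessage({
    variables: { USER_NAME: 'John' },
    inputFormat: { encoding: 'pcm', sampleRate: 16000, channels: 1, bitDepth: 16 },
    requestedOutputFormat: { encoding: 'pcm', sampleRate: 24000, channels: 1, bitDepth: 16, container: 'raw' },
    outputFrameDurationMs: 600,
  })
);
console.log(helloFrame);
```

The resulting string is what you would pass to ws.send() once the socket is open.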

Troubleshooting

Variables Not Being Replaced

Check variable names match exactly (case-sensitive)
Verify variables are sent in hello message (check logs)
Check server logs for "FINAL PROCESSED PROMPT" to see actual replacement

Variables Not in Hello Message

SDK v2: Confirm that your SDK version accepts the variables option in the constructor
Raw WebSocket: Ensure variables field is included in JSON
Check WebSocket message is sent after connection opens

Default Variables Not Used

Verify variables are stored in Redis (check agent configuration)
Check extractVariablesFromAgentConfig is working
Look for "Default variables not found in state" warnings in logs

Events & Callbacks

The SDK emits events for all important state changes and interactions.

Event Categories

Connection Events

voiceSDK.on('connected', () => {
  console.log('✅ Connected to agent');
});

voiceSDK.on('disconnected', (event) => {
  console.log('❌ Disconnected:', event.reason);
  console.log('Close code:', event.code);
});

voiceSDK.on('error', (error) => {
  console.error('Error:', error);
});

Recording Events

voiceSDK.on('recordingStarted', () => {
  console.log('🎤 Recording started');
});

voiceSDK.on('recordingStopped', () => {
  console.log('⏹️ Recording stopped');
});

Message Events

voiceSDK.on('message', (msg) => {
  switch (msg.t) { // "t" carries the message type, as in the Quick Start example
    case 'agent_response':
      console.log('Agent:', msg.agent_response);
      break;
    case 'transcription':
      console.log('You said:', msg.text);
      break;
    // ... other message types
  }
});

Audio Events

voiceSDK.on('playbackStarted', () => {
  console.log('🔊 Audio playback started');
});

voiceSDK.on('playbackStopped', () => {
  console.log('🔇 Audio playback stopped');
});

voiceSDK.on('audioData', (audioData) => {
  // Raw audio data (Uint8Array)
});
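If you handle audioData yourself, for example to draw a level meter, the bytes usually need to be turned into samples first. The sketch below assumes the negotiated format is raw 16-bit PCM and that samples are little-endian; the byte order is not stated in this document, so verify it against your formatNegotiated payload. pcm16ToFloat32 is a hypothetical helper name.

```javascript
// Converts raw 16-bit PCM bytes to floats in the range [-1, 1).
// Assumption: little-endian sample order.
function pcm16ToFloat32(bytes) {
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
  const sampleCount = Math.floor(bytes.byteLength / 2);
  const samples = new Float32Array(sampleCount);
  for (let i = 0; i < sampleCount; i++) {
    samples[i] = view.getInt16(i * 2, true) / 32768;
  }
  return samples;
}

// Three samples: silence, maximum positive, maximum negative.
const demoSamples = pcm16ToFloat32(new Uint8Array([0x00, 0x00, 0xff, 0x7f, 0x00, 0x80]));
console.log(demoSamples.length); // 3
```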

Pause Events

// Call paused (server acknowledged)
voiceSDK.on('callPaused', (data) => {
  console.log('⏸️ Call paused, timeout:', data.timeoutSeconds, 'seconds');
});

// Call resumed (server acknowledged, STT ready)
voiceSDK.on('callResumed', () => {
  console.log('▶️ Call resumed');
});

// Pause timeout (call auto-ended because pause lasted too long)
voiceSDK.on('pauseTimeout', () => {
  console.log('⏱️ Pause timeout — call ended');
});

Special Events

// Barge-in (user interrupts agent)
voiceSDK.on('bargeIn', (message) => {
  console.log('User interrupted the agent');
});

// Format negotiation (v2 protocol only)
voiceSDK.on('formatNegotiated', (format) => {
  console.log('Format negotiated:', format);
  // format: { container, encoding, sampleRate, channels, bitDepth }
});

// Greeting audio
voiceSDK.on('greetingStarted', () => {
  console.log('Playing greeting message');
});

// Domain whitelist error
voiceSDK.on('domainError', (error) => {
  console.error('Domain not whitelisted:', error.reason);
});

// Server-driven disclaimer (voice v2) — see #server-driven-disclaimer
voiceSDK.on('disclaimersRequired', (payload) => {
  // Show your UI, then call voiceSDK.sendDisclaimerAck(true|false)
});
voiceSDK.on('disclaimerRejected', ({ code, message }) => {
  // DISCLAIMER_DECLINED, DISCLAIMER_TIMEOUT, DISCLAIMER_HASH_MISMATCH, etc.
});

Protocol v2 - Format Negotiation

SDK v2 introduces format negotiation, which lets you specify the audio format you want to receive from the server.

Server-driven disclaimer (voice)

Some deployments must show exact legal or policy copy from the server before speech recognition and the agent greeting run. The conversation server can require an explicit acknowledgement step after hello_ack. This applies to VoiceSDK v2 (protocol version 2) and the Voice & Chat Widget voice path.

When the gate is active

The agent has a non-empty disclaimers list in backend storage (Redis field disclaimers as a JSON array of plain-text strings). An empty array [] means no disclaimer gate.
The session is not a resumed voice call (resume skips the gate).

While the gate is open, the server:

Does not open STT or play the greeting.
Rejects start_continuous_mode with {"ok":false,"t":"error","code":"DISCLAIMER_PENDING",...}.
Drops uplink binary audio (microphone data) until the gate clears.
Starts a server-side timer; if the user never acknowledges, the session is closed with DISCLAIMER_TIMEOUT (duration is configured on the server, typically on the order of minutes).

iOS Safari: microphone uplink

Voice capture uses an AudioWorklet. On iPhone and iPad, WebKit often runs the capture AudioContext at the device hardware rate (commonly 44.1 kHz or 48 kHz) even when a lower rate was requested, while the server expects PCM at the input rate negotiated in hello_ack (typically 16 kHz). The SDK resamples uplink PCM to that negotiated rate before sending binary frames. The recorder connects the worklet through a zero-gain node to the destination so WebKit reliably pulls the processor (graphs that dead-end at the worklet alone may not run on some builds). On mobile, after getUserMedia succeeds, the SDK may prime the shared recorder context while user activation is still fresh, before the WebSocket handshake and hello_ack finish. For embedded widgets, use allow="microphone" on the iframe when the host page is cross-origin.
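To make the resampling step concrete, here is a naive linear-interpolation resampler. It is not the SDK's implementation, which this document does not describe; resampleLinear is a hypothetical name, and the sketch omits the low-pass filtering a production downsampler would normally apply to avoid aliasing.

```javascript
// Naive linear-interpolation resampler (illustration only, no anti-alias filter).
function resampleLinear(input, fromRate, toRate) {
  if (fromRate === toRate) return Float32Array.from(input);
  const outputLength = Math.floor((input.length * toRate) / fromRate);
  const output = new Float32Array(outputLength);
  const step = fromRate / toRate; // input samples advanced per output sample
  for (let i = 0; i < outputLength; i++) {
    const position = i * step;
    const left = Math.floor(position);
    const right = Math.min(left + 1, input.length - 1);
    const fraction = position - left;
    output[i] = input[left] * (1 - fraction) + input[right] * fraction;
  }
  return output;
}

// 48 kHz capture reduced to the 16 kHz rate the server typically expects.
const captured = Float32Array.from({ length: 48 }, (_, i) => i);
const uplink = resampleLinear(captured, 48000, 16000);
console.log(uplink.length); // 16
```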

Embed sites: CSP and “Unable to load a worklet's module”

Strict Content-Security-Policy on the parent page can block audioWorklet.addModule() when the processor URL points at another host (for example cdn.talktopc.com), which surfaces as AbortError: Unable to load a worklet's module and voice never reaches the server. The SDK tries that URL first, then automatically retries using the capture worklet source bundled inside the widget script and a blob: URL—this works on many shops without whitelisting our CDN. If both attempts fail, relax CSP (often worker-src / script-src must allow blob:, and/or your CDN origin for the processor), or host audio-processor.js on your own domain and set voice.audioProcessorPath to that same-origin URL.

`hello_ack` fields (gate active)

When disclaimers apply, the server includes:

Field	Type	Description
`disclaimersRequired`	boolean	Must be `true` when the gate is active
`disclaimerTexts`	string[]	Plain-text lines to show the user (no HTML; escape/sanitize in your UI)
`disclaimersHash`	string	SHA-256 (hex) over the canonical text list; echoed in `disclaimer_ack` for verification
`disclaimerTimeoutMs`	number	Hint for UI (countdown copy); server enforces the real timeout independently

Client message: `disclaimer_ack`

After the user accepts or declines, send:

{
  "t": "disclaimer_ack",
  "accepted": true,
  "disclaimersHash": "sha256-from-hello_ack",
  "conversationId": "optional-matches-hello_ack"
}

For decline, set "accepted": false. Duplicate acks for the same session are ignored. The SDK method sendDisclaimerAck(accepted) builds this frame using the hash and conversation id from hello_ack.
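For raw WebSocket clients that do not use sendDisclaimerAck, the frame can be assembled from the hello_ack payload. This is a sketch under one assumption: that hello_ack exposes the conversation id under the key conversationId, which the example frame above suggests but the field table does not list. buildDisclaimerAck is a hypothetical helper.

```javascript
// Builds the disclaimer_ack frame from a parsed hello_ack message.
// Assumption: hello_ack carries "conversationId" under that key.
function buildDisclaimerAck(helloAck, accepted) {
  const frame = {
    t: 'disclaimer_ack',
    accepted: Boolean(accepted),
    disclaimersHash: helloAck.disclaimersHash, // echoed back for verification
  };
  if (helloAck.conversationId) frame.conversationId = helloAck.conversationId;
  return frame;
}

const ackFrame = buildDisclaimerAck(
  { disclaimersRequired: true, disclaimersHash: 'abc123', conversationId: 'conv_1' },
  true
);
console.log(JSON.stringify(ackFrame));
```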

Server error frames (`t: "error"`)

`code`	Meaning
`DISCLAIMER_PENDING`	Client tried to start continuous mode or stream audio before sending a successful ack; session stays open—send `disclaimer_ack` then retry
`DISCLAIMER_DECLINED`	User declined; server ends the conversation
`DISCLAIMER_TIMEOUT`	No acknowledgement in time; connection closed
`DISCLAIMER_HASH_MISMATCH`	Ack hash did not match server expectation; connection closed
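The table above separates one recoverable code from three terminal ones, and client code can encode that split directly. The helper name and the returned labels below are hypothetical; only the error codes come from the table.

```javascript
// Codes after which the server ends the conversation or closes the connection.
const TERMINAL_DISCLAIMER_CODES = new Set([
  'DISCLAIMER_DECLINED',
  'DISCLAIMER_TIMEOUT',
  'DISCLAIMER_HASH_MISMATCH',
]);

// Maps a server frame to an action label (labels are illustrative).
function classifyDisclaimerError(frame) {
  if (!frame || frame.t !== 'error') return 'not-an-error';
  if (frame.code === 'DISCLAIMER_PENDING') return 'retry-after-ack'; // session stays open
  if (TERMINAL_DISCLAIMER_CODES.has(frame.code)) return 'session-ended';
  return 'other-error';
}

console.log(classifyDisclaimerError({ t: 'error', code: 'DISCLAIMER_PENDING' })); // retry-after-ack
console.log(classifyDisclaimerError({ t: 'error', code: 'DISCLAIMER_TIMEOUT' })); // session-ended
```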

Using VoiceSDK v2 (custom apps)

Use protocol v2 (default in current SDK): protocolVersion: 2.
After connect(), wait for hello_ack. The SDK sets voiceSDK.disclaimersPending === true when the gate is active.
Listen for disclaimersRequired. The payload includes texts, disclaimersHash, disclaimerTimeoutMs, and conversationId for your UI.
Show your own modal or screen with the given texts. Do not call startRecording() until the user has accepted and you have called sendDisclaimerAck(true) (calling startRecording() while disclaimersPending is still true throws with error.code === 'DISCLAIMER_PENDING').
On accept: voiceSDK.sendDisclaimerAck(true). The server then opens STT, plays the greeting, and accepts start_continuous_mode.
On decline: voiceSDK.sendDisclaimerAck(false). The server closes the session; the SDK emits disclaimerRejected and the raw message event for the error frame.
Handle disclaimerRejected for terminal server outcomes (DISCLAIMER_DECLINED, DISCLAIMER_TIMEOUT, DISCLAIMER_HASH_MISMATCH). Handle error for DISCLAIMER_PENDING (ordering bug or race—fix by ack first).

const voiceSDK = new VoiceSDK({ agentId, appId, protocolVersion: 2 });

voiceSDK.on('disclaimersRequired', (payload) => {
  showMyModal({
    texts: payload.texts,
    onAccept: () => voiceSDK.sendDisclaimerAck(true),
    onDecline: () => voiceSDK.sendDisclaimerAck(false)
  });
});

voiceSDK.on('disclaimerRejected', ({ code, message }) => {
  console.warn('Disclaimer flow ended:', code, message);
});

voiceSDK.on('error', (err) => {
  if (err.code === 'DISCLAIMER_PENDING') {
    console.warn('Start recording only after sendDisclaimerAck(true)');
  }
});

await voiceSDK.connect();
// Only after ack (or if disclaimersRequired never fired):
await voiceSDK.startRecording();

SDK state you may read: disclaimersPending, disclaimersHash, lastDisclaimerPayload (set from hello_ack). After sendDisclaimerAck runs, the SDK clears disclaimersPending locally when the WebSocket send succeeds.

Voice & Chat Widget (built-in behavior)

No extra widget options are required. When the server sends the disclaimer gate:

VoiceInterface waits after the WebSocket is up and hello_ack is processed.
It opens a built-in modal (Notice / Accept / Decline) with disclaimerTexts from the server.
Accept: on desktop, the widget sends sendDisclaimerAck(true) immediately, then requests the microphone and startListening. On mobile, it waits until the user has granted microphone access (and the post-grant audio delay) before sending sendDisclaimerAck(true), so the server does not stream the greeting over the system permission sheet or get cut off when capture starts.
Decline (No thanks) calls sendDisclaimerAck(false), waits briefly so the ack reaches the server (which ends the conversation and closes the socket), then disconnects the client, invokes onConversationEnd on the wrapper SDK since recording never started, resets UI, and returns to landing (or idle voice in voice-only mode).

To match your site language, use the widget’s existing language / translation hooks for general UI; disclaimer body text always comes from the server (compliance copy).

Resume & text chat

Resume: Resumed voice sessions omit the disclaimer gate.

Text chat: Server-driven disclaimer is implemented for voice in this release. The text WebSocket hello path does not yet mirror these fields; extend the backend and TextChatSDK if you need the same gate for text-only sessions.

🎯 Key Benefits:

Format Control: Request specific audio formats (container, encoding, sample rate, bit depth)
Automatic Conversion: SDK automatically converts audio if backend sends different format
Quality Optimization: Choose optimal formats for your use case (e.g., 48kHz for high quality, 8kHz for bandwidth savings)
Protocol Support: Uses v2 protocol with format negotiation

Supported Formats

Input Formats (What SDK Sends)

Property	Supported Values
`encoding`	'pcm', 'pcmu' (μ-law), 'pcma' (A-law)
`sampleRate`	8000, 16000, 22050, 24000, 44100, 48000 Hz
`bitDepth`	8, 16, 24 bits
`channels`	1 (mono only)

Output Formats (What SDK Receives)

Property	Supported Values
`container`	'raw' (no header), 'wav' (with WAV header)
`encoding`	'pcm', 'pcmu' (μ-law), 'pcma' (A-law)
`sampleRate`	8000, 16000, 22050, 24000, 44100, 48000 Hz
`bitDepth`	8, 16, 24 bits
`channels`	1 (mono only)
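The supported values in these tables can be checked before the SDK is constructed, which surfaces typos earlier than a failed negotiation would. validateAudioFormat is a hypothetical helper; it treats a missing property as "use the default" and knows only the values listed above.

```javascript
// Supported values copied from the tables above.
const SUPPORTED_VALUES = {
  container: ['raw', 'wav'],
  encoding: ['pcm', 'pcmu', 'pcma'],
  sampleRate: [8000, 16000, 22050, 24000, 44100, 48000],
  bitDepth: [8, 16, 24],
  channels: [1],
};

// Returns a list of human-readable problems; an empty list means the format looks valid.
function validateAudioFormat(format) {
  const problems = [];
  for (const [property, allowed] of Object.entries(SUPPORTED_VALUES)) {
    const value = format[property];
    if (value === undefined) continue; // not set: left to the default
    if (!allowed.includes(value)) {
      problems.push(`${property}: ${value} is not one of ${allowed.join(', ')}`);
    }
  }
  return problems;
}

console.log(validateAudioFormat({ encoding: 'pcm', sampleRate: 32000, channels: 2 }));
```

In this example the helper reports two problems: the unsupported sample rate and the stereo channel count.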

Format Negotiation Flow

1. SDK Initialization

Configure requested output format

↓

2. Connect & Send Hello

SDK sends format request in hello message

↓

3. Server Response

Server sends hello_ack with negotiated format

↓

4. Format Negotiated Event

SDK emits 'formatNegotiated' event

↓

5. Automatic Conversion

If formats differ, SDK converts automatically

Example: High-Quality Audio

const voiceSDK = new VoiceSDK({
  agentId: 'agent_123',
  appId: 'your_app_id',
  
  // Request high-quality audio
  outputContainer: 'raw',        // Raw PCM for lower latency
  outputEncoding: 'pcm',         // Uncompressed PCM
  outputSampleRate: 48000,       // 48kHz for high quality
  outputBitDepth: 16,            // 16-bit depth
  outputChannels: 1,             // Mono
  outputFrameDurationMs: 600,    // 600ms frames
  
  protocolVersion: 2             // Enable format negotiation
});

voiceSDK.on('formatNegotiated', (format) => {
  console.log('Negotiated format:', format);
  // If backend sends different format, SDK will convert automatically
});

Example: Bandwidth-Optimized Audio

const voiceSDK = new VoiceSDK({
  agentId: 'agent_123',
  appId: 'your_app_id',
  
  // Request compressed, low-bandwidth audio
  outputContainer: 'raw',
  outputEncoding: 'pcmu',        // μ-law (G.711) companding, typically used at 8 kHz
  outputSampleRate: 8000,        // 8kHz for bandwidth savings
  outputBitDepth: 16,
  outputChannels: 1,
  
  protocolVersion: 2
});
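To compare the two examples, the raw audio payload rate can be estimated from the format alone. The arithmetic assumes that μ-law and A-law use 8 bits per sample, as G.711 defines, regardless of the configured bit depth, and it ignores WebSocket framing and any WAV header. The function names are hypothetical.

```javascript
// Estimated payload bytes per second for a given audio format.
function bytesPerSecond({ encoding, sampleRate, bitDepth = 16, channels = 1 }) {
  // G.711 μ-law and A-law store each sample in 8 bits.
  const bitsPerSample = encoding === 'pcm' ? bitDepth : 8;
  return (sampleRate * bitsPerSample * channels) / 8;
}

// Estimated payload bytes in one frame of the given duration.
function bytesPerFrame(format, frameDurationMs) {
  return Math.round((bytesPerSecond(format) * frameDurationMs) / 1000);
}

const highQuality = { encoding: 'pcm', sampleRate: 48000, bitDepth: 16, channels: 1 };
const lowBandwidth = { encoding: 'pcmu', sampleRate: 8000, channels: 1 };

console.log(bytesPerSecond(highQuality));     // 96000
console.log(bytesPerSecond(lowBandwidth));    // 8000
console.log(bytesPerFrame(highQuality, 600)); // 57600
```

Under these assumptions the bandwidth-optimized configuration carries one twelfth of the payload of the high-quality one.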

Format Conversion

If the backend sends audio in a different format than requested, the SDK automatically converts it:

Container: WAV ↔ Raw PCM extraction/wrapping
Encoding: PCM ↔ PCMU/PCMA encoding/decoding
Sample Rate: Automatic resampling using Web Audio API
Bit Depth: 8-bit ↔ 16-bit ↔ 24-bit conversion
Channels: Mono/stereo conversion (if needed)
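To see which of these conversions apply in a given session, you can compare the format you requested with the one reported by formatNegotiated. The helper below is a hypothetical sketch that lists the differing properties; it performs no conversion itself.

```javascript
// Lists the format properties that differ between the requested and negotiated formats.
function listNeededConversions(requested, negotiated) {
  const properties = ['container', 'encoding', 'sampleRate', 'bitDepth', 'channels'];
  return properties.filter(
    (property) =>
      requested[property] !== undefined &&
      negotiated[property] !== undefined &&
      requested[property] !== negotiated[property]
  );
}

const requestedFormat = { container: 'raw', encoding: 'pcm', sampleRate: 48000, bitDepth: 16, channels: 1 };
const negotiatedFormat = { container: 'wav', encoding: 'pcm', sampleRate: 24000, bitDepth: 16, channels: 1 };

console.log(listNeededConversions(requestedFormat, negotiatedFormat));
// container and sampleRate differ, so those two conversions would apply
```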

💡 Best Practices:

Use protocolVersion: 2 for new projects
Request formats that match your use case (quality vs. bandwidth)
48kHz is recommended for best quality (matches most browser defaults)
Raw PCM is lower latency than WAV (no header overhead)
Listen to formatNegotiated event to verify format

Voice & Chat Widget

Pre-built, customizable widget with voice and text chat - perfect for adding AI conversation to any website.

Agent display name

Set the name on the voice idle hero (and the letter inside the avatar when no image URL is set) with the root property agentName only.
Use header.pillTitle for a short CTA on the desktop floating pill and the mobile FAB (e.g. “Talk to me” / “דברו איתי”); when empty, both use header.title. Override the mobile FAB line only with header.mobileLabel. Separate from agentName.
voice.agentName is not supported — it is removed from config when the widget merges settings.
Legacy root headerTitle is ignored. Use agentName for the hero name and header.title for the general assistant title line (pill / mobile landing fallback).
In unified mode, the text chat screen has a top bar with Voice or text (widget translation key backToModeChoice) to return to the voice/text choice inside the panel.

💬

Voice & Text Chat

Beautiful interface with voice recording, text chat, and message history

🎨

Fully Customizable

Colors, position, size, RTL support, and custom branding

📱

Mobile Optimized

Responsive design that works perfectly on all devices

🌍

Multi-language

Built-in support for multiple languages with custom translations

Installation

<!-- Add the SDK script to your page -->
<script src="https://unpkg.com/ttp-agent-sdk/dist/agent-widget.js"></script>

<!-- Initialize the widget -->
<script>
  const widget = new TTPAgentSDK.TTPChatWidget({
    agentId: 'agent_123',
    appId: 'your_app_id'
  });
</script>

Basic Configuration

const widget = new TTPAgentSDK.TTPChatWidget({
  // Required
  agentId: 'agent_123',       // Your AI agent ID
  appId: 'your_app_id',       // Your application ID

  // Optional — root only (voice idle hero name & avatar initial when no image)
  agentName: 'Alex',
  
  // Optional - Agent Settings Override (available when domain whitelist is configured)
  agentSettingsOverride: {
    prompt: "You are a helpful customer service assistant.",
    temperature: 0.8,
    voiceId: "F2",
    voiceSpeed: 1.2,
    firstMessage: "Hello! How can I help you today?",
    disableInterruptions: false,
    maxCallDuration: 600
  },
  
  // Optional - Appearance
  primaryColor: '#7C3AED',    // Widget theme color
  position: {                 // Or shorthand: 'bottom-right', 'bottom-left'
    vertical: 'bottom',
    horizontal: 'right',
    offset: { x: 20, y: 20 },
    draggable: false,         // true = user can drag launcher + panel
    draggablePersist: true    // remember drag position in localStorage
  },
  language: 'en',             // 'en', 'es', 'fr', 'de', 'he', etc.
  direction: 'ltr',           // 'ltr' or 'rtl' for right-to-left languages
  
  // Optional - Variables
  variables: {
    userName: 'John Doe',
    page: 'homepage',
    customData: 'value'
  }
});

Server-driven disclaimer (voice)

If your agent has non-empty disclaimers configured on the server, the widget shows a Notice modal with the server-provided text before microphone streaming and the greeting. The user must tap Accept or Decline; you do not need extra embed code. For protocol fields, SDK hooks, and custom implementations, see Server-driven disclaimer (voice).

Access Control

💡 Domain Whitelist: For production applications, configure a domain whitelist in your agent's admin panel to control which websites can connect to your agent.

The widget connects directly using agentId and appId. No backend authentication step is needed:

const widget = new TTPAgentSDK.TTPChatWidget({
  agentId: 'agent_123',
  appId: 'your_app_id',
  variables: {
    userName: 'John Doe',
    page: 'homepage'
  }
});

📋 Important:

The widget only needs agentId and appId to connect
Access control is managed via domain whitelist in the admin panel
WebSocket URLs use the production TalkToPC endpoint by default (same as voice). Pass websocketUrl only if you need a non-default backend.

Advanced Customization

Icon Customization

const widget = new TTPAgentSDK.TTPChatWidget({
  agentId: 'agent_123',
  appId: 'your_app_id',
  
  icon: {
    type: 'custom',                              // 'microphone', 'emoji', or 'custom'
    // Omit customImage (or use '') for default animated waveform on the desktop pill
    customImage: 'https://your-site.com/logo.png', // Optional: pill icon image URL
    size: 60,                                    // Icon size in pixels
    backgroundColor: '#FFFFFF',                  // Background color
    borderRadius: '50%'                         // Border radius (50% for circle)
  }
});

Chat Window Customization

const widget = new TTPAgentSDK.TTPChatWidget({
  agentId: 'agent_123',
  appId: 'your_app_id',
  
  chatWindow: {
    width: 400,                    // Width in pixels
    height: 600,                   // Height in pixels
    title: 'Chat with us!',        // Custom title
    subtitle: 'We reply instantly', // Custom subtitle
    placeholder: 'Type here...',   // Input placeholder
    borderRadius: 12               // Window border radius
  }
});

Branding

const widget = new TTPAgentSDK.TTPChatWidget({
  agentId: 'agent_123',
  appId: 'your_app_id',
  
  branding: {
    companyName: 'Your Company',
    logo: 'https://your-site.com/logo.png',
    showPoweredBy: false           // Hide "Powered by TTP" footer
  }
});

Agent Settings Override

💡 Access Control: Agent settings override is available when the agent has a domain whitelist configured in the admin panel.

Dynamically customize agent behavior, voice, and personality on a per-session basis:

const widget = new TTPAgentSDK.TTPChatWidget({
  agentId: 'agent_123',
  appId: 'your_app_id',
  
  // Override agent settings dynamically
  agentSettingsOverride: {
    // Core settings
    prompt: "You are a friendly customer service assistant",
    temperature: 0.8,
    maxTokens: 200,
    
    // Voice settings
    voiceId: "F2",
    voiceSpeed: 1.2,
    
    // Behavior
    firstMessage: "Hello! How can I help you today?",
    disableInterruptions: false,
    maxCallDuration: 600,
    
    // Language
    language: "en",
    autoDetectLanguage: false
  }
});

See the Agent Settings Override section for complete documentation of all available override settings.

RTL (Right-to-Left) Support

// For Hebrew, Arabic, etc.
const widget = new TTPAgentSDK.TTPChatWidget({
  agentId: 'agent_123',
  appId: 'your_app_id',
  direction: 'rtl',
  language: 'he',                  // Hebrew
  position: 'bottom-left'          // Better for RTL
});

Widget Methods

`widget.open()`

Programmatically open the chat window.

// Open chat from your own button
document.getElementById('myButton').onclick = () => {
  widget.open();
};

`widget.close()`

Close the chat window.

widget.close();

`widget.toggle()`

Toggle chat window open/closed.

widget.toggle();

`widget.minimize()`

Collapse the chat panel back to the round launcher pill. This is the same visual state as if the user had clicked the launcher while the panel was open. The call is idempotent (a no-op when already minimized) and does not end an active voice call: the WebSocket and conversation state are preserved, and the user sees only the bubble until they re-open the panel. It is also triggered automatically by a backend-pushed { t: 'minimize_widget' } control message, so partner integrations can choreograph the chat panel from the server side (e.g. minimize the widget when opening a native trolley drawer).

widget.minimize();

`widget.maximize()`

Expand the chat panel from the round launcher pill. This is the same visual state as if the user had clicked the launcher while the panel was closed. The call is idempotent (a no-op when already open) and, when auto-connect is configured, runs the same auto-connect side effect a real click would. It is also triggered automatically by a backend-pushed { t: 'maximize_widget' } control message.

widget.maximize();
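
The widget handles the backend-pushed control messages itself. If you relay similar messages over your own channel, the mapping onto minimize() and maximize() could be sketched as below. The handleControlMessage helper and the way a message reaches it are assumptions for illustration; only the t values and the two widget methods come from this page.

```javascript
// Hypothetical helper: maps a control message to a widget method.
// Only the 't' values and minimize()/maximize() come from the docs above.
function handleControlMessage(widget, message) {
  if (!message || typeof message.t !== 'string') return false;
  switch (message.t) {
    case 'minimize_widget':
      widget.minimize(); // idempotent: no-op when already minimized
      return true;
    case 'maximize_widget':
      widget.maximize(); // idempotent: no-op when already open
      return true;
    default:
      return false; // unknown messages are left for other handlers
  }
}
```

The boolean return value lets a caller fall through to other handlers when the message is not a widget control message.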

`widget.destroy()`

Remove the widget from the page.

widget.destroy();

`widget.updateConfig(config)`

Update widget configuration dynamically.

widget.updateConfig({
  primaryColor: '#FF5733',
  language: 'es',
  agentName: 'Jordan'  // root only; voice.agentName in this object is ignored
});

Widget Event Callbacks

Pass these as top-level config properties when constructing TTPChatWidget:

const widget = new TTPAgentSDK.TTPChatWidget({
  agentId: 'agent_123',
  appId: 'your_app_id',

  onConversationStart: () => {
    console.log('Voice conversation started');
  },

  onConversationEnd: () => {
    console.log('Voice conversation ended');
  },

  onBargeIn: () => {
    console.log('User interrupted the agent');
  },

  onAudioStartPlaying: () => {
    console.log('Agent audio started');
  },

  onAudioStoppedPlaying: () => {
    console.log('Agent audio stopped');
  },

  onSubtitleDisplay: (subtitle) => {
    console.log('Subtitle:', subtitle);
  },

  onVoiceCallButtonClick: () => {
    // Return false to prevent starting the call
    return true;
  }
});
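
As a rough sketch, the callbacks above can be generated by a small factory that records events and gates the call button. The createVoiceCallbacks name, the track function, and the canStartCall predicate are hypothetical; only the callback names and the "return false to prevent the call" rule come from this page.

```javascript
// Hypothetical helper: builds widget callbacks that record events and
// gate the call button. `track` and `canStartCall` are caller-supplied.
function createVoiceCallbacks(track, canStartCall) {
  return {
    onConversationStart: () => track('conversation_start'),
    onConversationEnd: () => track('conversation_end'),
    onBargeIn: () => track('barge_in'),
    onSubtitleDisplay: (subtitle) => track('subtitle', subtitle),
    // Returning false prevents the call from starting
    onVoiceCallButtonClick: () => {
      const allowed = Boolean(canStartCall());
      track(allowed ? 'call_allowed' : 'call_blocked');
      return allowed;
    }
  };
}
```

The result can be spread into the widget config next to agentId and appId, for example `...createVoiceCallbacks(console.log, () => userHasConsented)`, where userHasConsented is your own flag.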

For lower-level VoiceSDK events (onConnected, onMessage, etc.), use widget.voiceInterface.sdk or instantiate VoiceSDK directly. See Events & Callbacks.

Complete Example

<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="UTF-8">
  <title>My Website with AI Chat</title>
</head>
<body>
  <h1>Welcome to my website!</h1>
  
  <!-- Load the SDK -->
  <script src="https://unpkg.com/ttp-agent-sdk/dist/agent-widget.js"></script>
  
  <!-- Initialize widget -->
  <script>
    const widget = new TTPAgentSDK.TTPChatWidget({
      agentId: 'agent_123',
      appId: 'your_app_id',
      
      // Customize appearance
      primaryColor: '#7C3AED',
      position: 'bottom-right',
      language: 'en',
      
      // Custom branding
      chatWindow: {
        title: 'Chat with us!',
        subtitle: 'We typically reply instantly'
      },
      
      // Pass context variables
      variables: {
        userName: 'Visitor',
        page: window.location.pathname,
        referrer: document.referrer
      },
      
      // Event handlers
      onReady: () => {
        console.log('Chat widget ready!');
      },
      
      onMessage: (message) => {
        // Track messages in analytics
        console.log('Message:', message);
      }
    });
    
    // Optional: Open chat programmatically
    // widget.open();
  </script>
</body>
</html>

Configuration Reference

🎨 Extensive Customization

The Voice & Chat Widget can be customized in almost every aspect - colors, text, icons, sizes, behaviors, and more!

Agent naming: use root agentName for the voice idle hero and avatar letter fallback. Do not use voice.agentName (removed on merge). Do not use legacy root headerTitle (ignored). See Agent display name above.
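
Because the ignored naming keys fail silently, a pre-flight check on your own config object can catch them early. The sketch below is not part of the SDK; findIgnoredNamingKeys is a hypothetical helper that only encodes the two rules stated above.

```javascript
// Hypothetical pre-flight check: reports naming keys the widget ignores.
function findIgnoredNamingKeys(config) {
  const cfg = config && typeof config === 'object' ? config : {};
  const voice = cfg.voice && typeof cfg.voice === 'object' ? cfg.voice : {};
  const warnings = [];
  if ('agentName' in voice) {
    warnings.push('voice.agentName is stripped on merge; use root agentName');
  }
  if ('headerTitle' in cfg) {
    warnings.push('headerTitle is ignored; use agentName or header.title');
  }
  return warnings;
}
```

Running it before constructing the widget, and logging the returned strings, makes a misplaced name visible during development.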

See the live customization demo to experiment with all options interactively, including quick themes (Default, Light, Sunset, Hebrew, S-Law).

Required Configuration

Property	Type	Description
`agentId`	string	Your AI agent identifier
`appId`	string	Your application identifier

General Configuration

Property	Type	Default	Description
`primaryColor`	string	'#7C3AED'	Main theme color (hex)
`direction`	string	`'ltr'` (or `'rtl'` when `language` is `he` / `ar`)	`'ltr'` or `'rtl'`. If omitted, Hebrew and Arabic default to `'rtl'`. Sets the shadow host `dir` so the desktop pill launcher matches the panel (fixes extra padding beside the logo on RTL sites). In `'rtl'`, transcript, bubbles, and mobile bar use RTL punctuation order.
`language`	string	'en'	Language code (en, es, fr, de, he, ar, etc.)
`agentName`	string	—	Root only. Name on the voice idle hero and first-letter avatar fallback when no header image is set. If omitted or empty, the widget defaults to `Sasha`. Color: `voice.agentNameColor`. Not the desktop pill text — use `header.pillTitle` or `header.title` for that.
`headerTitle`	string	—	Deprecated / ignored. Former optional root field; it is no longer read. Use `agentName` for the voice hero name and `header.title` for the shared assistant title line.
`variables`	object	{}	Custom variables to pass to agent
`websocketUrl`	string	`wss://speech.talktopc.com/ws/conv` (built-in)	Optional override for voice and text. Omit unless you point at a custom backend; text still uses `/chat/text` on the same host as this base.
`agentSettingsOverride`	object	null	Override agent settings dynamically. See Agent Settings Override for details.
`customStyles`	string	''	Custom CSS to inject
`useShadowDOM`	boolean	true	Enable Shadow DOM for CSS isolation. Set to `false` for Shopify compatibility. See Shadow DOM Configuration below.
`mobileVoiceUI`	boolean	auto	Root-level override. Force mobile-style voice call UI (bottom bar + overlay) or desktop hero. Takes precedence over `behavior.mobileVoiceUI` when both are set. Omit for auto: native iOS/Android, or viewport ≤768px with touch/coarse pointer.
`inputFormat`	object	—	Optional v2 input audio format passed to the voice WebSocket hello message: `{ encoding, sampleRate, channels, bitDepth }`. Same fields as top-level VoiceSDK config (`sampleRate`, etc.). See Protocol v2.
`visualAssistant`	object	null	Enable browser-side visual assistant tools (page read, highlight, scroll, navigate, form fill, click, screenshot). Also accepted under `agentSettingsOverride.visualAssistant`. See Visual Assistant below.
`whatsapp`	object	null	Optional WhatsApp handoff on the voice idle hero: `{ number: '972501234567', text: 'optional pre-filled message' }`. Non-digits are stripped from `number`. Omit to hide the WhatsApp button.
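
The direction default in the table above can be sketched as a small function. This illustrates the documented rule and is not the widget's own code; matching region suffixes such as he-IL on the base language code is an assumption.

```javascript
// Sketch of the documented default: an explicit direction wins,
// otherwise Hebrew and Arabic default to 'rtl', everything else to 'ltr'.
function resolveDirection(config) {
  const { direction, language = 'en' } = config || {};
  if (direction === 'ltr' || direction === 'rtl') return direction;
  // Assumption: region suffixes such as 'he-IL' are matched on the base code
  const base = String(language).toLowerCase().split('-')[0];
  return base === 'he' || base === 'ar' ? 'rtl' : 'ltr';
}
```

In practice this means `{ language: 'he' }` alone is enough for an RTL layout, and `direction` only needs to be set to override that default.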

Shadow DOM Configuration (`useShadowDOM`)

What is Shadow DOM?

Shadow DOM is a web standard that provides CSS isolation by creating a separate DOM tree that doesn't inherit styles from the parent page. This prevents theme CSS from interfering with the widget's appearance.

When to Use Shadow DOM:

✅ WordPress: Use Shadow DOM (useShadowDOM: true or omit - defaults to true)
✅ Most platforms: Shadow DOM works well on most websites and platforms
❌ Shopify: Disable Shadow DOM (useShadowDOM: false) due to rendering issues

Why We Need This Option:

While Shadow DOM provides excellent CSS isolation, some platforms (notably Shopify) have rendering issues where Shadow DOM elements render with 0x0 dimensions, making the widget invisible. By setting useShadowDOM: false, the widget uses regular DOM with targeted CSS resets instead, ensuring visibility while still protecting against most theme conflicts.

Platform-Specific Recommendations:

Platform	Recommended Setting	Reason
WordPress	`useShadowDOM: true` (default)	Shadow DOM works perfectly and provides better CSS isolation
Shopify	`useShadowDOM: false`	Shadow DOM elements render with 0x0 dimensions, making widget invisible
Wix	`useShadowDOM: true` (default)	Shadow DOM works well on Wix
Custom Websites	`useShadowDOM: true` (default)	Use Shadow DOM unless you experience rendering issues

Example Usage:

// Use one of the following per page; the variable names differ only so the snippet parses as a whole.

// WordPress (default - Shadow DOM enabled)
const wordpressWidget = new TTPAgentSDK.TTPChatWidget({
  agentId: 'your-agent-id',
  appId: 'your-app-id'
  // useShadowDOM defaults to true, so the widget is isolated from theme CSS
});

// Shopify (disable Shadow DOM)
const shopifyWidget = new TTPAgentSDK.TTPChatWidget({
  agentId: 'your-agent-id',
  appId: 'your-app-id',
  useShadowDOM: false  // Required for Shopify compatibility
});

// If the widget is invisible on another platform, try disabling Shadow DOM
const fallbackWidget = new TTPAgentSDK.TTPChatWidget({
  agentId: 'your-agent-id',
  appId: 'your-app-id',
  useShadowDOM: false  // Fallback if Shadow DOM causes rendering issues
});

How It Works:

With Shadow DOM (useShadowDOM: true): Widget is rendered inside a Shadow DOM tree, completely isolated from page CSS. Styles are injected into the shadow root.
Without Shadow DOM (useShadowDOM: false): Widget is rendered in regular DOM. Styles are injected into the document <head> with high-specificity selectors to prevent theme CSS conflicts. Targeted CSS resets protect against common theme issues while preserving widget functionality.
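
If one embed snippet is shared across platforms, the setting can be chosen at runtime. The sketch below assumes that Shopify storefronts expose a global Shopify object; shouldUseShadowDOM is a hypothetical helper, not an SDK function, so verify the detection on your own store before relying on it.

```javascript
// Hypothetical helper: choose useShadowDOM from the host page.
// Assumption: Shopify storefronts expose a global `Shopify` object.
function shouldUseShadowDOM(globalObject) {
  const onShopify = Boolean(globalObject && globalObject.Shopify);
  return !onShopify; // false on Shopify, true (the default) elsewhere
}
```

In the browser, the result would be passed as `useShadowDOM: shouldUseShadowDOM(window)` in the widget config.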

⚠️ Important Notes:

When useShadowDOM: false, the widget uses targeted CSS resets instead of aggressive resets to preserve internal widget styles
If you experience layout issues with useShadowDOM: false, check if your theme's CSS is overriding widget styles - you may need to add more specific CSS rules
Unified mode: With useShadowDOM: false, the widget must show only one of voice or text at a time inside the panel. A previous reset rule forced display:flex on multiple roots (higher specificity than the hide rules), which made them stack vertically on Shopify; this is fixed in current builds.
Voice call UI: Call duration updates and the red recording dot use DOM queries scoped to #ttp-widget-container (not document), and extra CSS guards the dot and pulse when themes override spans — fixes static timers and missing dots on Shopify.
Orb waveform: Bars are injected under #ttp-widget-container. With useShadowDOM: false, the same CSS ttp-wave keyframes and per-bar delays are used as in Shadow DOM; high-specificity rules protect size and animation without fixing opacity (which would break the keyframe fade).
The widget automatically handles CSS injection differently based on this setting - no additional configuration needed

Positioning

Property	Type	Default	Description
`position`	string | object	'bottom-right'	String: 'bottom-right', 'bottom-left'. Object: { vertical, horizontal, offset }
`position.vertical`	string	'bottom'	'top' or 'bottom'
`position.horizontal`	string	'right'	'left' or 'right'
`position.offset`	object	{ x: 20, y: 20 }	Offset from edges (pixels)
`position.draggable`	boolean	false	When `true`, visitors can drag the launcher pill / mobile FAB and the open chat panel around the viewport. They move together as one unit (never split). Drag handles: the pill/FAB and panel header bars (voice idle header, active-call top bar, text chat top bar). Header buttons (close, etc.) still work normally.
`position.draggablePersist`	boolean	true	When `true` (default), the dragged position is saved in `localStorage` per origin+path and restored on the next visit. Set `false` to reset to the configured corner on every load. The widget unit and the desktop minimized voice strip (`flavor.callView: 'minimized'`) remember positions independently.
`positionOffset`	object	—	Legacy. Used only when `position` is a string (e.g. `'bottom-right'`) instead of an object. Prefer `position.offset`.
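
The string and object forms of position can be reduced to one shape, which also shows where the legacy positionOffset applies. This is a sketch under the defaults listed in the table; normalizePosition is a hypothetical helper, and splitting the string on the hyphen is an assumption about how the two documented string values map to vertical and horizontal.

```javascript
// Sketch: normalize `position` (string or object) into one object shape.
// Documented defaults: bottom/right, offset { x: 20, y: 20 }.
function normalizePosition(position, positionOffset) {
  const defaults = { x: 20, y: 20 };
  const pos = position == null ? 'bottom-right' : position;
  if (typeof pos === 'string') {
    const [vertical, horizontal] = pos.split('-');
    // Legacy positionOffset applies only when position is a string
    return { vertical, horizontal, offset: { ...defaults, ...positionOffset } };
  }
  return {
    vertical: pos.vertical || 'bottom',
    horizontal: pos.horizontal || 'right',
    offset: { ...defaults, ...pos.offset }
  };
}
```

Note that with the object form the second argument is dropped entirely, which mirrors the "Legacy" note in the table.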

When position.draggable is enabled, the desktop minimized voice strip (flavor.callView: 'minimized') is also draggable by its body (controls and text input still work). Positions are clamped to stay on-screen and re-clamped on window resize.
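
Clamping of this kind is commonly implemented as shown below. This is a generic sketch and not the widget's actual code; clampToViewport and its margin parameter are hypothetical, and all values are in pixels.

```javascript
// Sketch of on-screen clamping (not the widget's actual code).
// All sizes are in pixels; `margin` is a hypothetical safety gap.
function clampToViewport(pos, size, viewport, margin = 0) {
  const maxX = Math.max(margin, viewport.width - size.width - margin);
  const maxY = Math.max(margin, viewport.height - size.height - margin);
  return {
    x: Math.min(Math.max(pos.x, margin), maxX),
    y: Math.min(Math.max(pos.y, margin), maxY)
  };
}
```

Re-running the same function with the new viewport size after a resize gives the re-clamping behavior described above.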

Icon & Button

Property	Type	Default	Description
`icon.type`	string	'custom'	'microphone', 'custom', 'emoji', 'text'
`icon.customImage`	string	—	Optional HTTPS image URL for the desktop floating pill icon. When omitted or empty, the pill uses the same animated waveform as the mobile launcher and pre-chat landing (white bars on a frosted circle).
`icon.size`	string	'medium'	'small', 'medium', 'large', 'xl'
`icon.backgroundColor`	string	'#FFFFFF'	Icon background color
`button.size`	string	'medium'	'small', 'medium', 'large'
`button.shape`	string	'circle'	'circle', 'rounded', 'square'
`button.backgroundColor`	string	primaryColor	Button background color
`button.hoverColor`	string	'#7C3AED'	Button hover color
`button.shadow`	boolean	true	Enable button shadow

Panel & Header

Property	Type	Default	Description
`panel.width`	number	350	Panel width (pixels)
`panel.height`	number	500	Panel height (pixels)
`panel.borderRadius`	number	12	Border radius (pixels)
`panel.backgroundColor`	string	'#FFFFFF'	Panel background color
`header.title`	string	'Chat Assistant'	Default title when more specific labels are omitted: fallback for the desktop and mobile launchers if `header.pillTitle` is not set (unless `header.mobileLabel` is set), and for the mobile pre-chat landing name when root `agentName` is not set. Does not set the voice idle hero name — use root `agentName` for that.
`header.pillTitle`	string	''	Optional main line on the desktop floating pill and the mobile pill launcher (same fallback chain as desktop: when empty, uses `header.title`). Set `header.mobileLabel` if the FAB needs different copy than the desktop pill. Does not change the voice idle hero name — use root `agentName` for that.
`header.showTitle`	boolean	true	Show/hide header title
`header.backgroundColor`	string	'#7C3AED'	Header background color
`header.textColor`	string	'#FFFFFF'	Header text color
`header.mobileLabel`	string	—	Optional: overrides the mobile pill launcher’s main line only. When omitted, the FAB uses the same text as the desktop pill (`header.pillTitle` or `header.title`).
`header.showCloseButton`	boolean	true	Show or hide the panel close button in the header.
`header.onlineIndicatorText`	string	Auto	Online status label on the desktop pill, mobile FAB, and headers. When omitted, uses the translated “Online” string for the widget language.
`header.onlineIndicatorColor`	string	header.textColor	Text color for the online indicator label.
`header.onlineIndicatorDotColor`	string	'#10b981'	Color of the online status dot next to the indicator text.
`footer.show`	boolean	true	Show or hide the "Powered by" footer.
`footer.brand`	string	'talktopc'	Brand shown in the "Powered by" footer. `'talktopc'` links to talktopc.com; `'speacart'` links to speacart.com. Any other value defaults to TalkToPC.
`footer.backgroundColor`	string	'#f9fafb'	Footer background color.
`footer.textColor`	string	'#9ca3af'	Footer text color.
`footer.hoverColor`	string	'#7C3AED'	Footer link hover color.

Messages & Chat

On desktop, when the text chat view is open, the panel uses a fixed height (up to min(520px, 100vh − 100px)), so new messages scroll inside the transcript instead of stretching the card.

By default (text.useVoiceTheme is not false), the text UI shares the voice theme:

voice.heroGradient1 / heroGradient2 set the surface (same as the voice idle hero)
voice.primaryBtnGradient* and startCallButtonColor set send/focus and user bubbles (#primary40 tint)
avatarGradient1/2 set the assistant avatar
Inputs are translucent, transcript-style
The “Powered by” footer uses the same link accent as on the hero

Set text.useVoiceTheme: false for a light layout driven by solid panel.backgroundColor and messages.* colors.

Property	Type	Default	Description
`messages.userBackgroundColor`	string	'#E5E7EB'	User message background
`messages.agentBackgroundColor`	string	'#F3F4F6'	Agent message background
`messages.systemBackgroundColor`	string	'#DCFCE7'	System message background
`messages.errorBackgroundColor`	string	'#FEE2E2'	Error message background
`messages.textColor`	string	'#1F2937'	Message text color (fallback when role-specific colors are omitted)
`messages.userTextColor`	string	messages.textColor	User message text color
`messages.agentTextColor`	string	messages.textColor	Agent message text color
`messages.userAvatarIcon`	string	'👤'	Emoji/icon shown on user message avatars
`messages.agentAvatarIcon`	string	'🤖'	Emoji/icon shown on agent message avatars
`messages.fontSize`	string	'16px'	Message font size
`messages.borderRadius`	number	8	Message bubble radius
`text.useVoiceTheme`	boolean	`true`	When `true`, text chat matches the voice idle hero gradient, primary gradients, avatar gradients, and dark translucent bubbles/inputs (see voice theme). When `false`, chrome follows solid `panel.backgroundColor` (hex) and `messages.userBackgroundColor` / `agentBackgroundColor`.
`text.sendButtonColor`	string	voice accent	Send button fill; defaults to the first resolvable hex from `voice.primaryBtnGradient1`, `voice.primaryBtnGradient2`, or `voice.startCallButtonColor` (including the default start-call indigo when those are unset), then `#7C3AED`.
`text.sendButtonHoverColor`	string	voice accent 2 / shaded	Hover state; uses `voice.primaryBtnGradient2` when it resolves to a different hex than the send color, otherwise a slightly darker shade of the accent.
`text.sendButtonActiveColor`	string	same as hover	Active press state; same resolution as hover.
`text.sendButtonText`	string	'➤'	Send button text/icon
`text.sendButtonTextColor`	string	'#FFFFFF'	Send button text color
`text.sendButtonFontSize`	string	'20px'	Send button font size
`text.sendButtonFontWeight`	string	'500'	Send button font weight
`text.inputPlaceholder`	string	'Type your message...'	Input placeholder text
`text.inputBorderColor`	string	'#E5E7EB'	Input border color
`text.inputFocusColor`	string	voice accent	Input focus border and ring; same default chain as `text.sendButtonColor`.
`text.inputBackgroundColor`	string	'#FFFFFF'	Input background color
`text.inputTextColor`	string	'#1F2937'	Input text color
`text.inputFontSize`	string	'16px'	Input font size
`text.inputBorderRadius`	number	20	Input border radius (pixels)
`text.inputPadding`	string	'8px 16px'	Input padding
`text.sendButtonHint.text`	string	''	Optional hint text below or near the send button
`text.sendButtonHint.color`	string	'#6B7280'	Send button hint text color
`text.sendButtonHint.fontSize`	string	'14px'	Send button hint font size

Voice Configuration

The agent’s spoken/idle display name is configured with root agentName only. Any voice.agentName value in your JSON is stripped during merge and has no effect (widget.updateConfig({ voice: { agentName: '…' } }) is ignored for naming).

Property	Type	Default	Description
`voice.micButtonColor`	string	primaryColor	Microphone button color (inside panel)
`voice.micButtonActiveColor`	string	'#EF4444'	Microphone button color when active
`voice.micButtonHint.text`	string	'Click the button to start...'	Hint text below mic button
`voice.micButtonHint.color`	string	'#6B7280'	Hint text color
`voice.avatarBackgroundColor`	string	'#667eea'	Voice avatar background
`voice.avatarActiveBackgroundColor`	string	'#667eea'	Avatar background when active
`voice.statusTitleColor`	string	'#1e293b'	Status title text color
`voice.statusSubtitleColor`	string	'#64748b'	Status subtitle text color
`voice.startCallTitle`	string	null	Custom text for "Click to Start Call" title (bypasses translations)
`voice.startCallSubtitle`	string	null	Custom text for "Real-time voice conversation" subtitle (bypasses translations)
`voice.startCallButtonText`	string	null	Custom text for "Start Call" button (bypasses translations)
`voice.startCallButtonColor`	string	'#667eea'	Fills the start-call button gradient when `primaryBtnGradient1`/`2` are omitted. The “TalkToPC” footer link uses the same solid accent as the start button (`primaryBtnGradient1`, then `startCallButtonColor`, then defaults).
`voice.startCallButtonTextColor`	string	'#FFFFFF'	Start call button text color
`voice.endCallButtonColor`	string	'#ef4444'	End call button color
`voice.transcriptBackgroundColor`	string	'#FFFFFF'	Transcript background
`voice.transcriptTextColor`	string	'#1e293b'	Transcript text color
`voice.transcriptLabelColor`	string	'#94a3b8'	Transcript label color
`voice.userTranscriptPrefix`	string | null	null	Prefix before live user speech (STT) in the collapsed transcript strip and mobile bar (e.g. `"You: "`). `null` uses the translation key `userTranscriptPrefix` for the widget language; an empty string removes the prefix.
`voice.controlButtonColor`	string	'#FFFFFF'	Control button color
`voice.controlButtonSecondaryColor`	string	'#64748b'	Secondary control button color
`voice.language`	string	'en'	Voice language (overrides global)
`voice.statusDotColor`	string	'#10b981'	Status dot color on the voice idle screen
`voice.statusText`	string | null	null	Custom status line on the voice idle screen. `null` uses the translated default.
`voice.outputContainer`	string	'raw'	Output audio container for v2 format negotiation: `'raw'` or `'wav'`
`voice.outputEncoding`	string	'pcm'	Output encoding: `'pcm'`, `'pcmu'`, or `'pcma'`
`voice.outputSampleRate`	number	24000	Requested output sample rate (Hz): 8000, 16000, 22050, 24000, 44100, or 48000
`voice.outputChannels`	number	1	Output channels (mono only supported)
`voice.outputBitDepth`	number	16	Output bit depth: 8, 16, or 24
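
The accepted values for the voice.output* keys can be checked before the widget is constructed. The sketch below encodes only the options listed in the table; validateVoiceOutput and OUTPUT_OPTIONS are hypothetical names, not SDK exports.

```javascript
// Sketch: check voice.output* values against the documented options.
const OUTPUT_OPTIONS = {
  outputContainer: ['raw', 'wav'],
  outputEncoding: ['pcm', 'pcmu', 'pcma'],
  outputSampleRate: [8000, 16000, 22050, 24000, 44100, 48000],
  outputChannels: [1], // mono only
  outputBitDepth: [8, 16, 24]
};

function validateVoiceOutput(voice = {}) {
  const errors = [];
  for (const [key, allowed] of Object.entries(OUTPUT_OPTIONS)) {
    // Unset keys fall back to the widget defaults, so only check set ones
    if (voice[key] !== undefined && !allowed.includes(voice[key])) {
      errors.push(`voice.${key} must be one of: ${allowed.join(', ')}`);
    }
  }
  return errors;
}
```

An empty array means every value that was set is one of the documented options.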

Voice Theming & Pill Launcher

🎨 Full Voice UI Theming

Customize every visual aspect of the voice interface — hero section, buttons, active call screen, pill launcher, and more. Use these properties to create branded themes or match your website's design.

The desktop pill has a fixed width (158px) on viewports ≥769px with a taller vertical layout (padding, 36px icon circle, 13px/11px title/status). Long titles ellipsize. RTL uses logical padding so the logo sits evenly on the start edge.

Pill Launcher

Property	Type	Default	Description
`voice.pillGradient`	string	''	CSS gradient for the mobile floating pill, the mobile pre-chat landing sheet (`.ttp-mobile-landing`), and the mobile in-call bottom bar plus the expanded conversation header—same token everywhere. When unset, the widget uses the same default three-stop purple as the landing sheet (`linear-gradient(135deg, #581c87, #312e81, #1e1b4b)`). Example: `linear-gradient(135deg, #7c3aed, #6d28d9)`.
`voice.pillTextColor`	string	'#ffffff'	Text color on the pill launcher
`voice.pillDotColor`	string	'#4ade80'	Online-status dot color on the pill

Hero / Idle Screen

Property	Type	Default	Description
`agentName`	string	—	Root only (duplicate of General Configuration). Idle hero name and avatar initial; default `Sasha` when unset/empty. Mobile pre-chat landing: `agentName` if set, else `header.title`.
`voice.avatarGradient1`	string	'#6d56f5'	Avatar gradient start color
`voice.avatarGradient2`	string	'#a78bfa'	Avatar gradient end color
`voice.headerAvatarImageUrl`	string	''	Optional https (or http) URL for the idle voice header circle (desktop and mobile pre-call). When missing or invalid, the UI shows the first letter of the agent name inside the circle (e.g. "S" for Sasha). Same key may be set at the top level as `headerAvatarImageUrl`. Legacy: `voice.avatarImageUrl` (and snake_case `header_avatar_image_url` / `avatar_image_url` on `voice` or root) are also accepted. Invalid or non-http(s) URLs are ignored.
`voice.onlineDotColor`	string	'#22c55e'	Online dot next to avatar
`voice.heroGradient1`	string	'#2a2550'	Hero gradient start: desktop voice panel (idle and in-call), widget footer, and `.voice-interface.active` use `linear-gradient(160deg, heroGradient1, heroGradient2)`. Mobile in-call chrome uses `voice.pillGradient` instead (see Pill Launcher).
`voice.heroGradient2`	string	'#1a1a2e'	Hero gradient end; pairs with `heroGradient1` for desktop/active surfaces above—not for the mobile minimized bar (use `pillGradient`).
`voice.agentNameColor`	string	'#f0eff8'	Agent name text color
`voice.agentRoleColor`	string	'rgba(255,255,255,0.35)'	Agent role text color
`voice.agentRole`	string	'AI Voice Assistant'	Agent role label
`voice.headlineColor`	string	'#ffffff'	Hero headline text color
`voice.headline`	string	'Hi there 👋'	Hero headline text
`voice.sublineColor`	string	'rgba(255,255,255,0.45)'	Hero subline text color
`voice.subline`	string	'Ask me anything...'	Hero subline text (supports HTML)

Primary & Secondary Buttons

Property	Type	Default	Description
`voice.primaryBtnGradient1`	string	'#6d56f5'	"Start Voice Call" button gradient start; footer “TalkToPC” link uses the same resolved accent (after `startCallButtonColor` fallback when gradients are omitted). Mobile user bubbles and send accent use this palette with `primaryBtnGradient2` / `sendButtonColor`.
`voice.primaryBtnGradient2`	string	'#9d8df8'	"Start Voice Call" button gradient end; pairs with `sendButtonColor` for mobile message/send styling.
`voice.startCallButtonTextColor`	string	'#FFFFFF'	Primary button text color
`voice.startCallButtonText`	string	'Start Voice Call'	Primary button label
`voice.sendMessageText`	string	'Send a Message'	Secondary button label
`voice.secondaryBtnBg`	string	'rgba(255,255,255,0.05)'	Secondary button background
`voice.secondaryBtnBorder`	string	'rgba(255,255,255,0.09)'	Secondary button border color
`voice.secondaryBtnTextColor`	string	'rgba(255,255,255,0.6)'	Secondary button text color

Active Call View

Property	Type	Default	Description
`voice.waveformBarColor`	string	'#7C3AED'	Waveform bar color during call
`voice.speakerButtonColor`	string	'#FFFFFF'	Speaker button color

Quick Theme Example

Apply a complete light theme:

const widget = new TTPAgentSDK.TTPChatWidget({
  agentId: 'your-agent-id',
  appId: 'your-app-id',
  panel: {
    backgroundColor: '#ffffff',
    border: '1px solid rgba(0,0,0,0.06)'
  },
  voice: {
    pillGradient: 'linear-gradient(135deg, #7c3aed, #6d28d9)',
    pillTextColor: '#ffffff',
    heroGradient1: '#ede9fe',
    heroGradient2: '#f5f3ff',
    agentNameColor: '#1e1b4b',
    headlineColor: '#1e1b4b',
    sublineColor: '#6b7280',
    primaryBtnGradient1: '#7c3aed',
    primaryBtnGradient2: '#a78bfa',
    secondaryBtnBg: '#f5f3ff',
    secondaryBtnBorder: 'rgba(124,58,237,0.15)',
    secondaryBtnTextColor: '#6d28d9'
  }
});

RTL Hebrew theme example:

const widget = new TTPAgentSDK.TTPChatWidget({
  agentId: 'your-agent-id',
  appId: 'your-app-id',
  direction: 'rtl',
  header: { title: 'עוזרת חכמה', onlineIndicatorText: 'מחוברת' },
  panel: { backgroundColor: '#0f172a' },
  voice: {
    pillGradient: 'linear-gradient(135deg, #1e3a5f, #1e40af, #0f172a)',
    avatarGradient1: '#3b82f6',
    avatarGradient2: '#1d4ed8',
    heroGradient1: '#1a2744',
    heroGradient2: '#0f172a',
    primaryBtnGradient1: '#3b82f6',
    primaryBtnGradient2: '#1d4ed8',
    startCallButtonText: 'התחל שיחה קולית',
    sendMessageText: 'שלח הודעה',
    agentRole: 'עוזרת קולית חכמה',
    headline: 'היי, מה שלומך? 👋',
    subline: 'שאל/י אותי הכל — אני עונה מיידית בקול או בטקסט.'
  }
});

Behavior

Property	Type	Default	Description
`behavior.mode`	string	'unified'	'unified' (both), 'voice-only', 'text-only'
`behavior.autoOpen`	boolean	false	Auto-open widget on page load
`behavior.startOpen`	boolean	false	Start with widget open
`behavior.hidden`	boolean	false	Hide the widget completely
`behavior.mobileVoiceUI`	boolean	auto	Force mobile-style voice call UI (bottom bar + overlay) or desktop hero. Omit for auto: native iOS/Android, or viewport ≤768px with touch/coarse pointer (helps in-app browsers with desktop User-Agent). Root-level `mobileVoiceUI` overrides this if both are set.
`behavior.autoConnect`	boolean	false	Auto-connect on widget open
`behavior.showWelcomeMessage`	boolean	true	Show welcome message
`behavior.welcomeMessage`	string	'Hello! How can I help...'	Welcome message text
`behavior.enableVoiceMode`	boolean	true	Enable voice mode option (in unified mode)

Animation, Prompt Bubble & Tooltips

Property	Type	Default	Description
`animation.enableHover`	boolean	true	Enable hover animations
`animation.enablePulse`	boolean	true	Enable pulse animations
`animation.enableSlide`	boolean	true	Enable slide animations
`animation.duration`	number	0.3	Animation duration (seconds)
`promptAnimation.enabled`	boolean	false	Show an animated “Try me!” prompt bubble next to the launcher pill. Must be explicitly set to `true`.
`promptAnimation.text`	string	'Try me!'	Prompt bubble label text
`promptAnimation.backgroundColor`	string	purple gradient	Prompt bubble background (CSS color or gradient)
`promptAnimation.textColor`	string	'#ffffff'	Prompt bubble text color
`promptAnimation.animationType`	string	'bounce'	`'bounce'`, `'pulse'`, `'float'`, or `'none'`
`promptAnimation.showShimmer`	boolean	true	Shimmer effect on the prompt bubble
`promptAnimation.showPulseRings`	boolean	true	Pulse rings around the launcher while the prompt is visible
`promptAnimation.hideAfterClick`	boolean	true	Hide the prompt after the user opens the widget
`promptAnimation.hideAfterSeconds`	number | null	null	Auto-hide after N seconds. `null` = never auto-hide.
`promptAnimation.position`	string	'top'	Prompt placement relative to the launcher: `'top'`, `'left'`, or `'right'`
`tooltips.newChat`	string	Auto	New chat button tooltip
`tooltips.back`	string	Auto	Back button tooltip
`tooltips.close`	string	Auto	Close button tooltip
`tooltips.mute`	string	Auto	Mute button tooltip
`tooltips.speaker`	string	Auto	Speaker button tooltip
`tooltips.endCall`	string	Auto	End call button tooltip

Mobile pre-chat overlay & legacy landing keys (Unified Mode)

On desktop, unified mode opens the voice idle hero inside the panel; there is no in-panel “Voice / Text” mode card screen. Back navigation and call end return to that hero.

On mobile, the full-screen pre-chat sheet (ttpMobileLanding) still offers call vs text, and the options below apply there. Mobile behavior:

During an active call, the bottom minimized bar stays visible first. Tapping the transcript row opens a conversation sheet (type while voice is active). Close the sheet to return to the bar; mute, speaker, and end call remain on the bar when the sheet is closed.
Back (header or text bar) closes the panel and opens the Call vs Chat landing overlay directly, the same sheet as tapping the FAB.
If the user declines the server-driven disclaimer during voice setup, the widget also returns to that Call vs Chat overlay (and disconnects) instead of leaving the in-panel “Start call” card.
After the user accepts the disclaimer, the minimized voice bar (waveform / mic / end call) stays the default view; the expandable conversation sheet with the text field does not open automatically.

Other landing.* keys (e.g. mode card colors, in-panel title) remain in config merges for backward compatibility but are not rendered on desktop.

Property	Type	Default	Description
`landing.voiceCardTitle`	string	null	Fallback label for the mobile “call” action when `landing.callButtonText` is not set
`landing.textCardTitle`	string	null	Fallback label for the mobile “chat” action when `landing.chatButtonText` is not set
`landing.callButtonText`	string	null	Mobile overlay primary button text (falls back to `voiceCardTitle` / translation)
`landing.chatButtonText`	string	null	Mobile overlay secondary button text (falls back to `textCardTitle` / translation)
`landing.statusText`	string	null	Mobile overlay status line; when unset, uses online translation plus `landing.subtitle` (e.g. “Ready to help”)
`landing.subtitle`	string	—	Shown after the online dot in the mobile overlay status when `landing.statusText` is not set
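The fallback order in the table above can be sketched as a small helper. This is an illustration only, not the SDK's internal code: `resolveLandingLabels`, the `translations` object, and its keys (`call`, `chat`, `online`) are hypothetical names, and joining the online text and subtitle with a space is an assumption.

```javascript
// Illustrative only: mirrors the documented fallback order for the mobile overlay.
// `translations` and its keys are assumed names, not SDK identifiers.
function resolveLandingLabels(landing = {}, translations = {}) {
  // Button text wins, then the legacy card title, then the translation
  const callLabel = landing.callButtonText ?? landing.voiceCardTitle ?? translations.call;
  const chatLabel = landing.chatButtonText ?? landing.textCardTitle ?? translations.chat;

  // statusText wins; otherwise combine the online translation with the subtitle
  const statusLabel = landing.statusText ??
    [translations.online, landing.subtitle].filter(Boolean).join(' ');

  return { callLabel, chatLabel, statusLabel };
}

const labels = resolveLandingLabels(
  { voiceCardTitle: 'Call us', chatButtonText: 'Message us', subtitle: 'Ready to help' },
  { call: 'Call', chat: 'Chat', online: 'Online' }
);
console.log(labels);
```

Here the call label comes from the legacy `voiceCardTitle` because `callButtonText` is unset, while the chat label uses `chatButtonText` directly.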

Advanced Configuration

Property	Type	Default	Description
`websocketUrl`	string	Optional	Custom WebSocket base URL (defaults to wss://speech.talktopc.com/ws/conv). If not provided, URL is constructed from agentId/appId.
`demo`	boolean	true	Enable demo mode
`panel.backdropFilter`	string	null	CSS backdrop filter (e.g., 'blur(10px)')
`panel.border`	string	'1px solid rgba(0,0,0,0.1)'	Panel border style
`button.shadowColor`	string	'rgba(0,0,0,0.15)'	Button shadow color
`icon.emoji`	string	'🎤'	Emoji when icon.type = 'emoji'
`icon.text`	string	'AI'	Text when icon.type = 'text'
`accessibility.ariaLabel`	string	'Chat Assistant'	ARIA label for the widget
`accessibility.ariaDescription`	string	'Click to open chat assistant'	ARIA description
`accessibility.keyboardNavigation`	boolean	true	Enable keyboard navigation
`onConversationStart`	function	—	Called when a voice conversation starts (after connect + recording begins).
`onConversationEnd`	function	—	Called when a voice conversation ends (including disclaimer decline before recording).
`onBargeIn`	function	—	Called when the user interrupts the agent (barge-in).
`onAudioStartPlaying`	function	—	Called when agent audio playback starts.
`onAudioStoppedPlaying`	function	—	Called when agent audio playback stops.
`onSubtitleDisplay`	function	—	Called when a subtitle/transcript line is displayed during a call.
`onVoiceCallButtonClick`	function	—	Called when the user taps “Start Voice Call”. Return `false` to cancel the call start.
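As a rough sketch of how the voice lifecycle callbacks might be wired, the helper below builds the callback portion of a widget config. `buildVoiceCallbacks` and `hasMicConsent` are hypothetical names supplied by your page, and treating any return value other than `false` as "proceed" is an assumption.

```javascript
// Sketch: build the callback portion of a widget config.
// `hasMicConsent` is a hypothetical function supplied by your page.
function buildVoiceCallbacks(hasMicConsent, log = console.log) {
  return {
    // Returning false cancels the call start, as documented in the table above
    onVoiceCallButtonClick: () => {
      if (!hasMicConsent()) {
        log('Call blocked: consent missing');
        return false;
      }
      return true; // assumed: any value other than false lets the call proceed
    },
    onConversationStart: () => log('conversation started'),
    onConversationEnd: () => log('conversation ended'),
    onBargeIn: () => log('user interrupted the agent'),
  };
}

const callbacks = buildVoiceCallbacks(() => false);
console.log(callbacks.onVoiceCallButtonClick()); // logs the block message, then false
```

The returned object can be spread into the widget configuration next to `agentId` and `appId`.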

Visual Assistant (`visualAssistant`)

When enabled, the SDK registers browser-side tools the agent can call to read the page, highlight elements, scroll, navigate, fill forms, click elements, and capture screenshots. Tools are always registered on the client; the backend agent config decides which are available.

visualAssistant: {
  enabled: true,           // Required to activate visual assistant
  allowHighlight: true,    // highlight_element
  allowScroll: true,       // scroll_to_element
  allowNavigate: true,     // navigate_to
  allowFillForm: true,     // fill_form
  allowClick: true         // click_element
}

May be set at the widget root or inside agentSettingsOverride.visualAssistant. See examples/test-client-tools.html for a live demo.

✅ Configuration Verified:

All configuration options listed above have been verified against the source code and are fully supported. The widget has 100+ customization options across the following 15 categories:

General (incl. visualAssistant, whatsapp, inputFormat)
Positioning (incl. draggable, draggablePersist)
Icon & Button
Panel & Header (incl. online indicator, footer colors)
Messages & Text Chat
Voice Interface & output format
Voice Theming & Pill Launcher
Mobile landing / legacy landing
Prompt bubble (promptAnimation)
Tooltips & Animations
Behavior (incl. mobileVoiceUI)
Accessibility
Event callbacks (voice lifecycle)
Advanced & legacy keys
Visual Assistant

Experiment with all options interactively in the live demo.

📖 Tip: The configuration is a plain object, so you can compose it from shared presets with the spread operator (...); the properties you pass are merged with the widget defaults.
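A minimal sketch of composing a configuration with object spread follows. The values are placeholders taken from the tables above. Note that object spread itself is shallow, so nested groups such as `panel` are spread explicitly here; how the widget merges the final object with its own defaults is not shown.

```javascript
// Shared settings reused across pages (values are placeholders)
const baseConfig = {
  agentId: 'agent_123',
  appId: 'your_app_id',
  panel: { border: '1px solid rgba(0,0,0,0.1)', backdropFilter: null },
  accessibility: { ariaLabel: 'Chat Assistant', keyboardNavigation: true },
};

// Page-specific overrides. Object spread is shallow, so nested groups
// are spread explicitly to keep the keys that are not being overridden.
const supportPageConfig = {
  ...baseConfig,
  panel: { ...baseConfig.panel, backdropFilter: 'blur(10px)' },
  accessibility: { ...baseConfig.accessibility, ariaLabel: 'Support Assistant' },
};

console.log(supportPageConfig.panel);
```

Writing `panel: { backdropFilter: 'blur(10px)' }` without the inner spread would drop `panel.border` from the composed object.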

🎮 Live Demo: Try the fully functional widget with customization options at test-text-chat.html

Use Cases

💼 Customer Support

Add 24/7 AI-powered support to your website

🛒 E-commerce

Help customers find products and answer questions

📚 Documentation

Provide interactive help for your docs

🎓 Education

Create AI tutors and learning assistants

Vanilla JavaScript Guide

Use the SDK in any JavaScript application without frameworks.

Complete Example

import { VoiceSDK } from 'ttp-agent-sdk';

class VoiceAssistant {
  constructor() {
    this.sdk = null;
    this.isConnected = false;
    this.isRecording = false;
  }

  async initialize(agentId, overrides = {}) {
    this.sdk = new VoiceSDK({
      agentId: agentId,
      appId: 'your_app_id',
      agentSettingsOverride: overrides
    });

    // Setup event listeners
    this.setupEventListeners();

    // Connect
    await this.sdk.connect();
  }

  setupEventListeners() {
    this.sdk.on('connected', () => {
      this.isConnected = true;
      this.updateUI('connected');
    });

    this.sdk.on('disconnected', () => {
      this.isConnected = false;
      this.isRecording = false;
      this.updateUI('disconnected');
    });

    this.sdk.on('recordingStarted', () => {
      this.isRecording = true;
      this.updateUI('recording');
    });

    this.sdk.on('recordingStopped', () => {
      this.isRecording = false;
      this.updateUI('connected');
    });

    this.sdk.on('message', (msg) => {
      if (msg.type === 'agent_response') {
        this.displayMessage('agent', msg.agent_response);
      }
    });

    this.sdk.on('error', (error) => {
      this.handleError(error);
    });
  }

  async toggleRecording() {
    if (!this.isConnected) return;

    if (this.isRecording) {
      await this.sdk.stopRecording();
    } else {
      await this.sdk.startRecording();
    }
  }

  disconnect() {
    if (this.sdk) {
      this.sdk.disconnect();
      this.sdk = null;
    }
  }

  updateUI(state) {
    // Update your UI based on state
  }

  displayMessage(role, text) {
    // Display message in your UI
  }

  handleError(error) {
    // Handle errors
  }
}

// Usage
const assistant = new VoiceAssistant();

await assistant.initialize('agent_123', {
  language: 'es',
  temperature: 0.9
});

React Integration

Use the SDK in React applications with hooks and components.

Using Hooks

import React, { useState, useEffect, useRef } from 'react';
import { VoiceSDK } from 'ttp-agent-sdk';

function VoiceChat() {
  const [status, setStatus] = useState('disconnected');
  const [isRecording, setIsRecording] = useState(false);
  const [messages, setMessages] = useState([]);
  const sdkRef = useRef(null);

  // Initialize SDK
  useEffect(() => {
    async function initSDK() {
      const sdk = new VoiceSDK({
        agentId: 'agent_123',
        appId: 'your_app_id',
        agentSettingsOverride: {
          language: 'es',
          temperature: 0.9
        }
      });

      // Event listeners
      sdk.on('connected', () => setStatus('connected'));
      sdk.on('disconnected', () => {
        setStatus('disconnected');
        setIsRecording(false);
      });
      sdk.on('recordingStarted', () => setIsRecording(true));
      sdk.on('recordingStopped', () => setIsRecording(false));
      sdk.on('message', (msg) => {
        if (msg.type === 'agent_response') {
          setMessages(prev => [
            ...prev,
            { role: 'agent', text: msg.agent_response }
          ]);
        }
      });

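      // Store the instance before connecting so the cleanup below can
      // disconnect it even if the component unmounts while connect() is pending
      sdkRef.current = sdk;
      await sdk.connect();
    }

    initSDK().catch((error) => console.error('Voice SDK init failed:', error));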

    // Cleanup
    return () => {
      if (sdkRef.current) {
        sdkRef.current.disconnect();
      }
    };
  }, []);

  const toggleRecording = async () => {
    if (sdkRef.current) {
      await sdkRef.current.toggleRecording();
    }
  };

  return (
    <div>
      <div>Status: {status}</div>
      <button onClick={toggleRecording} disabled={status !== 'connected'}>
        {isRecording ? 'Stop' : 'Start'} Recording
      </button>
      <div>
        {messages.map((msg, i) => (
          <div key={i}>{msg.role}: {msg.text}</div>
        ))}
      </div>
    </div>
  );
}

export default VoiceChat;

VoiceButton Component

Pre-built React component for quick integration.

Installation

import { VoiceButton } from 'ttp-agent-sdk/react';

Basic Usage

import React from 'react';
import { VoiceButton } from 'ttp-agent-sdk/react';

function App() {
  return (
    <VoiceButton
      agentId="agent_123"
      appId="your_app_id"
      agentSettingsOverride={{
        language: 'es',
        temperature: 0.9
      }}
      onConnected={() => console.log('Connected')}
      onMessage={(msg) => console.log('Message:', msg)}
      onError={(error) => console.error('Error:', error)}
    />
  );
}

Props

Prop	Type	Required	Description
`agentId`	string	Yes	The AI agent identifier
`appId`	string	Yes	Your application ID
`websocketUrl`	string	No	Optional custom WebSocket base URL (defaults to wss://speech.talktopc.com/ws/conv)
`agentSettingsOverride`	object	No	Custom agent settings
`voice`	string	No	Voice preset (default: 'default')
`language`	string	No	Language code (default: 'en')
`autoReconnect`	boolean	No	Auto-reconnect on disconnect (default: true)
`className`	string	No	Custom CSS class for the button
`style`	object	No	Inline styles for the button
`children`	React.Node	No	Custom button content (replaces default icon)

Event Callbacks

Callback	Parameters	Description
`onConnected`	-	Called when successfully connected
`onDisconnected`	-	Called when disconnected
`onRecordingStarted`	-	Called when recording starts
`onRecordingStopped`	-	Called when recording stops
`onPlaybackStarted`	-	Called when audio playback starts
`onPlaybackStopped`	-	Called when audio playback stops
`onMessage`	`message`	Called for all WebSocket messages
`onError`	`error`	Called on errors
`onBargeIn`	`message`	Called when user interrupts agent
`onStopPlaying`	`message`	Called when server requests to stop audio

Fly-to-Cart Animation

When users tap "Add to Cart" in the e-commerce widget, the product card flies down into the cart icon with a tornado funnel effect. The animation uses html2canvas to capture the card as an image, which avoids stacking and overflow clipping in iframes or in containers with overflow:hidden. It is built into TTPEcommerceWidget; for custom React product widgets, use the useFlyToCart hook.

React Hook (useFlyToCart)

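import React, { useRef } from 'react';
import { useFlyToCart } from 'ttp-agent-sdk';

function ProductCard({ product }) {
  const cartIconRef = useRef(null); // attach to your cart icon element
  const cardRef = useRef(null);     // attach to the product card element
  const { triggerFly, isAnimating } = useFlyToCart(cartIconRef);

  const handleAddToCart = () => {
    triggerFly(cardRef.current, product, () => {
      addToCartAPI(product); // your own cart call, not part of the SDK
    });
  };

  return (
    <div ref={cardRef}>
      <button onClick={handleAddToCart}>Add to Cart</button>
      <span ref={cartIconRef}>🛒</span>
    </div>
  );
}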

Vanilla JS (FlyToCart)

import { FlyToCart } from 'ttp-agent-sdk';

const flyToCart = new FlyToCart({
  getCartTarget: () => document.querySelector('.cart-icon'),
  onCartBump: () => { /* optional */ },
});

flyToCart.triggerFly(cardElement, product, () => {
  addToCartAPI(product);
});

E-commerce: Adding to cart (partner API & front-end cart)

Cart logic is owned by your integration (partner APIs, mock stores, or a Shopify storefront). The widget renders products, sends user intent over the WebSocket, and updates the cart summary when it receives the right messages. The SDK does not keep its own authoritative cart array: totals and counts should reflect what the backend (or the browser cart, then the backend) considers true.

Who does the "add"?

Model	Where the line item is added	How the widget learns the new totals
Server-side partner cart (mock store, custom API, headless stack)	Conversation backend calls the partner API after `product_selected` or after the model runs the internal tool `add_to_cart`.	Backend sends `t: "cart_updated"` with `cartTotal`, `cartItemCount`, and optional `currency`. For a visible "added" toast, include `action: "added"` and a `product` object (`id`, `name`, `price`).
Shopify Online Store (widget embedded on the store theme)	The browser must call Shopify's Ajax Cart API (`POST /cart/add.js`) so the add uses the visitor's store session cookie. Server-only Storefront API carts are a different session and will not match the theme cart.	The conversation backend sends `t: "add_to_store_cart"` with `variantId` and `quantity`. The ecommerce flavor runs the Ajax calls, refreshes the cart bar from `GET /cart.js`, and sends `t: "cart_add_result"` back so the backend can continue the turn. The backend may still send `cart_updated` afterward if you want one canonical sync message.
Custom client tool	Your backend issues a `client_tool_call` with a tool name you defined (for example `addToCart`). Your page handles it via `registerToolHandler` on the widget or `AgentSDK`.	The SDK replies with `client_tool_result` (or `client_tool_error`). The cart bar still expects `cart_updated` unless your handler updates UI itself.

Flow 1 — User taps Add / Update on a product card

1. The widget stops playback (so the agent does not talk over the action) and sends t: "product_selected" on the voice WebSocket. The payload includes productId, productName, price, quantity (absolute units or weight), and sellBy (quantity or weight).
2. Your backend decides what happens next:
   - Update the partner cart via API, then push cart_updated to the session, or
   - For Shopify on-domain, send add_to_store_cart and let the widget perform /cart/add.js, then consume cart_add_result on the server, or
   - Hybrid: tool or internal step on the server plus a follow-up cart_updated.
3. EcommerceManager.handleCartUpdated updates the bottom cart bar when cartTotal and cartItemCount are present. Optional action: "added" drives the short confirmation UX.
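The server-side partner cart path of Flow 1 can be sketched as follows. This is a hedged illustration: `handleProductSelected` and the in-memory `cart` Map are hypothetical stand-ins for a real partner API call, and counting line items (rather than summed quantities) for `cartItemCount` is an assumption.

```javascript
// Backend-side sketch (illustrative): turn a product_selected message into
// a cart_updated message. `cart` is a hypothetical in-memory stand-in for
// the partner cart; a real integration would call the partner API instead.
function handleProductSelected(cart, msg) {
  // quantity is absolute, so it replaces any previous quantity for the product
  cart.set(msg.productId, { name: msg.productName, price: msg.price, quantity: msg.quantity });

  let cartTotal = 0;
  for (const line of cart.values()) {
    cartTotal += line.price * line.quantity;
  }

  return {
    t: 'cart_updated',
    cartTotal: Number(cartTotal.toFixed(2)),
    cartItemCount: cart.size, // assumed: number of distinct line items
    currency: 'USD',          // optional field; placeholder value
    action: 'added',          // drives the short confirmation UX
    product: { id: msg.productId, name: msg.productName, price: msg.price },
  };
}

const demoCart = new Map();
const demoUpdate = handleProductSelected(demoCart, {
  t: 'product_selected', productId: 'p1', productName: 'Apples',
  price: 2.5, quantity: 2, sellBy: 'quantity',
});
console.log(demoUpdate.cartTotal, demoUpdate.cartItemCount);
```

Because `quantity` is absolute, a second `product_selected` for the same product overwrites the line instead of adding to it.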

Flow 2 — User asks in voice (or the agent adds without a card click)

For TTP's Java conversation backend, e-commerce uses internal tools such as search_products, add_to_cart, and get_cart (names match what the LLM sees; they are not the same as optional SDK client_tool_call names unless you wire them).

1. The model calls add_to_cart with productId and quantity.
2. The server runs the partner integration:
   - Partner API cart: The service calls the partner, receives success and line items, then typically sends cart_updated to the widget with totals and item count so the UI matches the server.
   - Shopify theme cart: The service sends add_to_store_cart to the widget instead of mutating cart only on the server; the widget performs the Ajax add and returns cart_add_result. Reading the cart may use get_store_cart from the server, which the widget answers with cart_state_result after GET /cart.js.
3. The model receives a text summary of the tool result and can speak to the user.
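The Shopify theme cart path can be sketched roughly as below. The ecommerce flavor already does this for you; the sketch only illustrates the sequence. `handleAddToStoreCart` is a hypothetical name, `fetchFn` stands in for `window.fetch`, `send` stands in for the widget's WebSocket send, and the `cart_add_result` field names are assumptions. The request body and the `item_count`, `total_price`, and `currency` fields follow Shopify's Ajax Cart API.

```javascript
// Sketch of the Shopify path with injected dependencies so it can run anywhere.
// The cart_add_result field names used here are assumptions.
async function handleAddToStoreCart(msg, fetchFn, send) {
  try {
    const addRes = await fetchFn('/cart/add.js', {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({ items: [{ id: msg.variantId, quantity: msg.quantity }] }),
    });
    if (!addRes.ok) throw new Error(`add failed with status ${addRes.status}`);

    // Refresh totals from the theme cart
    const cartRes = await fetchFn('/cart.js');
    const cart = await cartRes.json();

    send({
      t: 'cart_add_result',
      success: true,
      cartItemCount: cart.item_count,
      cartTotal: cart.total_price / 100, // cart.js reports prices in minor units
      currency: cart.currency,
      // Echo verbalAck only when the originating message requested it
      ...(msg.verbalAck ? { verbalAck: true } : {}),
    });
  } catch (err) {
    send({ t: 'cart_add_result', success: false, error: String(err.message) });
  }
}
```

Running the add in the browser matters because the request then carries the visitor's store session cookie, which is what ties it to the theme cart.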

Message cheat sheet (widget ↔ backend)

Direction	`t` / type	Role
Widget → backend	`product_selected`	User confirmed quantity and tapped Add/Update on a card.
Backend → widget	`show_products` / `show_items`	Render product cards (search or browse results).
Backend → widget	`show_recipe`	After the `send_recipe` tool (sync-to-speech): opens a recipe card modal and auto-downloads a self-contained `.html` file. Payload: `{ t: "show_recipe", recipe: { title, ingredients[], steps[], servings?, notes? } }`. Partner logo (optional): resolved from `flavor.partnerId` → CDN `/partners/{partnerId}.svg` (PNG fallback); hidden if missing. User can Print or Save again from the modal. Dismissed on barge-in / overlay click / Escape.
Backend → widget	`cart_updated`	Sync `cartTotal`, `cartItemCount`, optional `currency`, optional `action` + `product` for toast.
Backend → widget	`add_to_store_cart`	Shopify: widget runs Ajax add for `variantId` / `quantity`. Optional `verbalAck: true` when the add was triggered by the user tapping Add (not an agent tool); echoed on `cart_add_result` so the server can prompt a brief spoken acknowledgment.
Widget → backend	`cart_add_result`	Outcome of Ajax add (success, counts, totals, currency). Includes `verbalAck` when the originating `add_to_store_cart` or hook `add_to_site_cart` requested it (UI add).
Backend → widget	`get_store_cart`	Shopify: widget fetches `/cart.js` and replies with cart contents.
Widget → backend	`cart_state_result`	Async cart snapshot after `get_store_cart` (implementation-specific).
Backend → widget	`client_tool_call`	Optional: run custom logic in the page; reply with `client_tool_result`.

Custom client tools: Register handlers with registerToolHandler on the widget or AgentSDK so the toolName in each client_tool_call matches what your backend defines.

Widget Flavors

The TTP SDK supports domain-specific "flavors" that customize the widget experience for different verticals. Each flavor provides specialized UI components, backend tools, and message handling.

Available Flavors

Flavor Type	Partner IDs	Backend Tools	Key Features
`ecommerce`	`mock-store`, `shopify`	`search_products`, `add_to_cart`, `get_cart`	Product cards, cart bar, fly-to-cart animation
`hotels`	`mock-hotel`	`search_rooms`, `select_room`, `add_extra`, `get_booking`, `show_media`	Room cards, booking bar, gallery
`pharma`	`mock-pharm`	`search_medications`, `add_to_prescription`, `get_prescription`	Medication cards (with Rx/OTC badges), prescription summary bar
`restaurants`	`mock-restaurant`	`search_menu`, `add_to_order`, `get_order`, `show_media`	Menu item cards (with allergen/dietary tags), order summary bar, gallery
`tours`	`mock-tour`	`search_tours`, `book_tour`, `get_tour_booking`, `show_media`	Tour item cards, booking summary bar, gallery

E-commerce WebSocket messages: The widget listens for t: "show_items" and t: "show_products" (same handler; backend tools often send show_products with products, title, layout). Use TTPEcommerceWidget (or TTPChatWidget with flavor.type: 'ecommerce') so these handlers are registered. For end-to-end add-to-cart flows (partner API vs Shopify Ajax vs client tools), see E-commerce cart flows.

Desktop voice strip (flavor.callView: 'minimized'): When flavor.callView is 'minimized', a fixed bottom voice surface is used on viewports wider than 768px. A flavor type is optional: a flavor object carrying only callView: 'minimized' is enough (e.g. flavor: { callView: 'minimized' }, with no type or partnerId). Viewports 768px and below skip the strip and use the mobile minimized bar flow instead.

The strip is a rounded floating dock with layered glass styling (rim light, ambient shadow, soft indigo outer glow), an inset transcript “well,” squircle control tiles, and a blurred LIVE capsule. It is centered with a capped width (up to about 720px with horizontal inset) and bottom spacing plus env(safe-area-inset-bottom). It uses the same tokens as the floating pill (voice.pillGradient, pillTextColor, pillDotColor, endCallButtonColor). Mute, pause, speaker, keyboard (text inject), and end-call align with the in-widget voice UI.

While the strip is in use, the in-panel desktop “active call” UI stays hidden; the floating panel collapses if it was open, and the launcher pill is hidden for the duration of the call. The “Powered by” footer (footer.show, footer.brand) is rendered inside the strip (below the call controls) for the duration of the call, not as a separate floating pill at the widget corner.

A single transcript line shows either the assistant (streaming) or the user (speech-to-text interim/final), never both at once; user text uses a warm highlight color and no You: prefix. When the widget language is Hebrew or Arabic (base he / ar) or direction is 'rtl', chrome uses RTL. Transcript lines use explicit dir="rtl" when the string contains Hebrew or Arabic letters (even if the widget language is English), so neutral punctuation follows the sentence; Latin-only lines use dir="ltr". Mic/pause/speaker/end layout is unchanged.
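The transcript direction rule and the viewport threshold described above can be approximated with the sketch below. Both function names are hypothetical, and the regular expression is one possible way to detect Hebrew or Arabic letters, not necessarily the check the widget performs.

```javascript
// Sketch of the documented direction rule for transcript lines:
// any Hebrew or Arabic letter makes the line RTL, otherwise LTR.
// The lookahead restricts the match to letters (not marks or punctuation).
const RTL_LETTER = /(?=\p{L})[\p{Script=Hebrew}\p{Script=Arabic}]/u;

function transcriptDir(text) {
  return RTL_LETTER.test(text) ? 'rtl' : 'ltr';
}

// Sketch of the surface choice when flavor.callView is 'minimized':
// the desktop strip needs a viewport wider than 768px.
function minimizedCallSurface(viewportWidth) {
  return viewportWidth > 768 ? 'desktop-strip' : 'mobile-bar';
}

console.log(transcriptDir('Hello, world.'));   // ltr
console.log(minimizedCallSurface(1280));       // desktop-strip
```

A mixed line such as an English order number followed by Arabic text is treated as RTL under this rule, which matches the "even if the widget language is English" note.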

Configuration

Set the flavor when creating the widget:

const widget = new TTPChatWidget({
  agentId: 'agent_...',
  appId: 'app_...',
  flavor: {
    type: 'pharma',        // 'ecommerce' | 'hotels' | 'pharma' | 'restaurants' | 'tours'
    partnerId: 'mock-pharm' // partner-specific data source
  }
});
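Before creating the widget, a page could sanity-check the flavor object against the documented combinations. The sketch below is a hypothetical pre-flight helper (`checkFlavor` is not an SDK function); the lookup table is copied from the Available Flavors table, and real deployments may accept partner IDs that are not listed there.

```javascript
// Flavor types and partner IDs copied from the Available Flavors table.
const FLAVOR_PARTNERS = {
  ecommerce: ['mock-store', 'shopify'],
  hotels: ['mock-hotel'],
  pharma: ['mock-pharm'],
  restaurants: ['mock-restaurant'],
  tours: ['mock-tour'],
};

// Hypothetical pre-flight check; the SDK may validate differently or not at all.
function checkFlavor(flavor) {
  const known = Object.prototype.hasOwnProperty.call(FLAVOR_PARTNERS, flavor.type);
  if (!known) {
    return { ok: false, reason: `unknown flavor type: ${flavor.type}` };
  }
  if (!FLAVOR_PARTNERS[flavor.type].includes(flavor.partnerId)) {
    return { ok: false, reason: `partner ${flavor.partnerId} is not listed for ${flavor.type}` };
  }
  return { ok: true };
}

console.log(checkFlavor({ type: 'pharma', partnerId: 'mock-pharm' }));
console.log(checkFlavor({ type: 'pharma', partnerId: 'shopify' }));
```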

Pharma Flavor

The pharmacy flavor provides a medication search and prescription management experience.

// Pharma configuration
flavor: {
  type: 'pharma',
  partnerId: 'mock-pharm'
}

// Backend tools injected automatically:
// - search_medications: Search the medication catalog
// - add_to_prescription: Add a medication to the prescription
// - get_prescription: View current prescription contents

// Frontend message types:
// - show_items: Displays medication cards
// - prescription_updated: Updates the prescription summary bar

The pharma flavor does not include gallery support (show_media).

Restaurants Flavor

The restaurants flavor provides a menu browsing, ordering, and photo gallery experience.

// Restaurant configuration
flavor: {
  type: 'restaurants',
  partnerId: 'mock-restaurant'
}

// Backend tools injected automatically:
// - search_menu: Search the restaurant menu
// - add_to_order: Add a menu item to the order
// - get_order: View current order contents
// - show_media: Display restaurant photo gallery

// Frontend message types:
// - show_items: Displays menu item cards
// - order_updated: Updates the order summary bar
// - show_media / dismiss_media: Gallery with dish photos, ambiance, etc.

Menu item cards display allergen warnings and dietary tags automatically.

Tours Flavor

The tours flavor provides a tour browsing, booking, and photo gallery experience.

// Tour configuration
flavor: {
  type: 'tours',
  partnerId: 'mock-tour'
}

// Backend tools injected automatically:
// - search_tours: Search available tours and activities by query
// - book_tour: Book a tour or activity
// - get_tour_booking: View current booking contents
// - show_media: Display tour photo gallery

// Frontend message types:
// - show_items: Displays tour item cards
// - cart_updated: Updates the booking summary bar
// - show_media / dismiss_media: Gallery with tour photos, highlights, etc.

Tour cards display activity tags and pricing. Reuses the same UI infrastructure as the restaurants flavor.

Client-Script Tools

Client-script tools let you attach backend-authored JavaScript to a client tool (tool_type: 'client') that runs in the visitor's browser when the LLM calls the tool — no host-page registerToolHandler() code required. The script is configured in the dashboard (tool form → "Agent scripts"), stored with the tool, and delivered to the widget automatically at session start. Available since SDK v2.45.3.

When to use which: use registerToolHandler() when the host page owns the logic and ships its own JS. Use a client-script tool when the agent owner wants browser-side behavior (DOM reads, page API calls, UI nudges) configurable from the dashboard without touching the embedding site.

How it works

Session init — the backend pushes a partner_bundle message with the reserved partner id __client_tools__, carrying every scripted tool attached to the agent (plus auto-run library scripts). The bundle is pushed on both the voice and text-chat channels, works with or without a widget flavor, and is re-pushed on reconnect (a new bundle replaces the previous one).
Compile on load — each entry compiles into a strict-mode async function immediately when the bundle lands. A syntax error poisons only that entry (logged at load time); the rest of the bundle still works.
Invocation — when the LLM calls the tool, the SDK runs the compiled script and returns its result to the backend.
Auto-run — library scripts flagged auto_run execute exactly once when the bundle arrives (e.g. to set up page listeners).

Authoring styles

Two styles compile — the SDK detects the shape and picks the right wrapping:

// 1. Raw statement body (typical in the dashboard UI; `ctx` is in scope)
alert("hi");
return { ok: true, clicked: true };

// 2. Function expression
async (ctx) => {
  const res = await ctx.fetch('/api/something');
  return { ok: true, data: await res.json() };
}

Both run inside an async function, so await is always allowed. A script that returns nothing resolves to { ok: true }.
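To make the two authoring styles concrete, here is one possible way to compile both shapes into an async function. This is an illustrative sketch, not the SDK's actual detection logic: `compileScript` is a hypothetical name, and trying the source as an expression first is just one heuristic.

```javascript
// Illustrative only: one way to support both authoring styles.
const AsyncFunction = Object.getPrototypeOf(async function () {}).constructor;

function compileScript(source) {
  let asExpression = null;
  try {
    // Parses only when the source is a single expression, e.g. `async (ctx) => {...}`
    asExpression = new AsyncFunction('ctx', `"use strict"; return (${source}\n);`);
  } catch (err) {
    // Not an expression; treat the source as a raw statement body instead
  }
  // Throws SyntaxError here, at compile time, if the body does not parse either
  const asBody = asExpression ? null : new AsyncFunction('ctx', `"use strict";\n${source}`);

  return async function run(ctx) {
    let value = asExpression ? await asExpression(ctx) : await asBody(ctx);
    // Function-expression style: the expression evaluated to a function, so call it
    if (typeof value === 'function') value = await value(ctx);
    // A script that returns nothing resolves to { ok: true }
    return value === undefined ? { ok: true } : value;
  };
}

const runRaw = compileScript('return { ok: true, n: ctx.args.n + 1 };');
runRaw({ args: { n: 1 } }).then((result) => console.log(result));
```

Compiling eagerly, as in this sketch, is what lets a syntax error surface when the script is loaded rather than at the first tool call.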

The `ctx` object

Property	Description
`ctx.args`	Parameters from the LLM tool call.
`ctx.host`	`window.location.hostname`.
`ctx.fetch`	Bound `window.fetch`.
`ctx.log(...)`	`console.log` prefixed with `[adapter:<tool>]`.
`ctx.runAdapter(action, args)`	Invoke another script in the same bundle (max nesting depth 16).
`ctx.emit(msg)`	Send an unsolicited JSON message back to the backend over the active channel.

Client-tool scripts run with a generic context — the ecommerce-only helpers (ctx.platform, ctx.normalizeCart, ctx.refreshUI) are not available; those exist only in ecommerce partner-adapter bundles.

Pre / main / post steps

A scripted tool may carry up to three steps, executed in order: pre → main → post.

Pre is best-effort — if it throws, the error is logged and main still runs.
Main produces the tool result. A thrown error becomes an ok: false result.
Post runs only when main succeeded; its ctx.args._mainResult holds main's return value.
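The step semantics above can be sketched as a small runner. `runScriptedTool` is a hypothetical name; treating "main succeeded" as "main did not throw", and logging (rather than propagating) a post failure, are assumptions.

```javascript
// Sketch of the documented step order. Each step is an async function taking ctx.
async function runScriptedTool(steps, ctx, log = console.error) {
  if (steps.pre) {
    try {
      await steps.pre(ctx);
    } catch (err) {
      log('pre step failed, continuing:', err.message); // pre is best-effort
    }
  }

  let mainResult;
  try {
    mainResult = (await steps.main(ctx)) ?? { ok: true };
  } catch (err) {
    return { ok: false, error: String(err.message) }; // post is skipped
  }

  if (steps.post) {
    // post sees main's return value under ctx.args._mainResult
    const postCtx = { ...ctx, args: { ...ctx.args, _mainResult: mainResult } };
    try {
      await steps.post(postCtx);
    } catch (err) {
      log('post step failed:', err.message); // assumed: does not change the result
    }
  }
  return mainResult; // main produces the tool result
}
```

A failing pre step therefore never blocks the tool, while a failing main step short-circuits before post.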

Sync vs async results

Configured per tool in the dashboard ("Wait for result"):

Mode	Behavior
Wait ON (sync)	The backend awaits the script result up to `timeoutMs` (1000–15000 ms, default 8000) and feeds it to the LLM as the tool result.
Wait OFF (async)	The LLM immediately receives the configured pending message; the real result is injected later: `SPOKEN_EVENT` (spoken immediately), `SPOKEN_DEFERRED` (next natural turn, default), or `SILENT_CONTEXT` (silent context update).

Limits

Limit	Value
Per-step code size	32 KB
Result size	512 KB serialized (oversize → `ok: false, error: "RESULT_TOO_LARGE"`)
Sync timeout	1000–15000 ms (default 8000)
Pending message	500 chars
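The limits in the table can be expressed as two small checks. This sketch makes several assumptions: "512 KB" is read as 512 * 1024 bytes of UTF-8 JSON, out-of-range timeouts are clamped rather than rejected, and both helper names are hypothetical.

```javascript
// Assumes "512 KB" means 512 * 1024 bytes of UTF-8 encoded JSON
const MAX_RESULT_BYTES = 512 * 1024;

// Keep timeoutMs inside the documented 1000-15000 ms range (default 8000)
function clampTimeoutMs(timeoutMs) {
  if (timeoutMs === undefined) return 8000;
  return Math.min(15000, Math.max(1000, timeoutMs));
}

// Replace an oversize result with the documented error shape
function enforceResultSize(result) {
  const bytes = new TextEncoder().encode(JSON.stringify(result)).length;
  return bytes > MAX_RESULT_BYTES
    ? { ok: false, error: 'RESULT_TOO_LARGE' }
    : result;
}

console.log(clampTimeoutMs(20000)); // 15000
console.log(enforceResultSize({ ok: true, items: [1, 2, 3] }));
```

Trimming large payloads (for example, returning only the fields the agent needs) inside the script avoids hitting the size limit in the first place.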

Wire protocol (reference)

Script invocations arrive in two wire forms, both handled by the SDK on both channels:

// Bundle (session init / reconnect)
{ "t": "partner_bundle", "partner_id": "__client_tools__",
  "adapters": { "my_tool": { "code_js": "...", "pre_code_js": "...", "post_code_js": "..." } },
  "autoRun":  { "my_script": { "code_js": "..." } } }

// Form A — dedicated envelope (async path / chain steps)
→ { "t": "run_partner_script", "requestId": "...", "partnerId": "__client_tools__",
    "action": "my_tool", "args": { ... } }
← { "t": "run_partner_script_result", "requestId": "...", "ok": true, "result": { ... } }

// Form B — client_tool_call piggyback (backend sync-await path)
→ { "t": "client_tool_call", "toolCallId": "...", "toolName": "run_partner_script",
    "parameters": { "action": "my_tool", "args": { ... }, "partner_id": "__client_tools__" } }
← { "t": "client_tool_result", "toolCallId": "...", "result": { ... } }

Routing rule: partner_id === '__client_tools__' → the flavor-independent ClientScriptManager; any other partner id → the ecommerce flavor's partner-adapter path. The two bundles coexist.

Troubleshooting

Symptom	Cause / fix
`compile failed ... SyntaxError` at bundle load	Script doesn't parse. On SDK < 2.45.1, raw statement bodies (e.g. `alert("stop");`) failed even when valid — check the SDK version banner in the console and hard-refresh if stale.
`NO_HANDLER` for `run_partner_script`	SDK < 2.45.3 didn't route the `client_tool_call` piggyback form without an ecommerce flavor. Fixed in 2.45.3.
`adapter_not_in_bundle`	Tool not attached to the agent, or the bundle hasn't arrived yet. Check the session-start `partner_bundle` log.
Sync result is a timeout	Script exceeded `timeoutMs`. Raise it (max 15 s) or switch to async delivery.

For the full internals reference see CLIENT_SCRIPT_TOOLS_GUIDE.md in the repository root.

Chain Tools

A chain tool (tool_type: 'chain') runs a graph of steps — server tools, client tools, agent switches, and library scripts — as a single LLM tool call. Chains are edited visually in the dashboard (Chain Editor: nodes are steps, edges are data-flow dependencies) and executed by the backend; the widget participates only when a step targets the browser.

Execution model

Steps with no dependencies run first; steps at the same level run in parallel; levels run sequentially.
A step's input_from lists the upstream steps whose outputs feed it. Dependent steps receive the original LLM arguments shallow-merged with each upstream result (a non-object result lands under the upstream step's id).
Fail-fast: the first failing step aborts the chain with { "success": false, "error": ..., "step": ... }.
The LLM receives the terminal step's result (multiple terminals merge keyed by step id).
Inside a chain, every step is awaited — a referenced client tool's own "wait for result = off" setting is ignored.
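The level ordering and input merging described above can be sketched as follows. Chains are executed by the backend, so this is only an illustration of the rules, not the backend's code: both function names are hypothetical, steps are modeled as `{ id, input_from }` objects, and treating arrays as non-object results is an assumption.

```javascript
// Sketch: group chain steps into levels. Steps in one level have all of
// their dependencies satisfied by earlier levels and could run in parallel.
function computeLevels(steps) {
  const remaining = new Map(steps.map((s) => [s.id, s]));
  const done = new Set();
  const levels = [];
  while (remaining.size > 0) {
    const ready = [...remaining.values()].filter((s) =>
      (s.input_from || []).every((dep) => done.has(dep)));
    if (ready.length === 0) throw new Error('cycle or unknown dependency in chain');
    levels.push(ready.map((s) => s.id));
    for (const s of ready) {
      remaining.delete(s.id);
      done.add(s.id);
    }
  }
  return levels;
}

// Input for a dependent step: LLM args shallow-merged with each upstream result.
function buildStepInput(llmArgs, step, results) {
  const input = { ...llmArgs };
  for (const dep of step.input_from || []) {
    const out = results[dep];
    if (out !== null && typeof out === 'object' && !Array.isArray(out)) {
      Object.assign(input, out);
    } else {
      input[dep] = out; // non-object result lands under the upstream step's id
    }
  }
  return input;
}

const chainSteps = [
  { id: 'lookup' }, { id: 'geo' },
  { id: 'quote', input_from: ['lookup', 'geo'] },
  { id: 'notify', input_from: ['quote'] },
];
console.log(computeLevels(chainSteps));
```

With these example steps, `lookup` and `geo` share the first level, followed by `quote`, then `notify`.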

Step types and the widget

Step type	Runs where	Widget involvement
`server_tool`	Backend (webhook)	None.
`switch_agent`	Backend	None.
`client_tool` (plain)	Browser	Arrives as a normal `client_tool_call` → your `registerToolHandler()` handler.
`client_tool` (scripted)	Browser	Dispatched as a `__client_tools__` bundle action keyed by tool name.
`client_script` (library)	Browser	Dispatched as a `__client_tools__` bundle action keyed by `script:{scriptId}`.

Scripts and scripted client tools referenced by a chain are resolved into the session bundle automatically — they do not need to be attached to the agent.

Constraints

1–30 steps per chain; no cycles; chains cannot reference other chain tools (no recursion).
Deleting a tool or library script that a chain references is blocked in the dashboard (the delete dialog lists the referencing chains).

VoiceSDK Class

Core class for voice interaction functionality. Server-driven disclaimer gating (compliance copy before STT/greeting) is described in detail under Server-driven disclaimer (voice) and applies when using protocol v2.

Constructor

new VoiceSDK(config)

Configuration Object

Property	Type	Required	Description
`agentId`	string	Yes	The AI agent identifier to connect to
`appId`	string	Yes	Your application identifier
`websocketUrl`	string	No	Optional custom WebSocket base URL (defaults to wss://speech.talktopc.com/ws/conv)
`agentSettingsOverride`	object	No	Custom agent configuration
`voice`	string	No	Voice preset name (default: 'default')
`language`	string	No	Language code (default: 'en')
`sampleRate`	number	No	Input audio sample rate: 8000, 16000, 22050, 24000, 44100, or 48000 Hz (default: 16000)
`channels`	number	No	Input audio channels (default: 1, mono only)
`bitDepth`	number	No	Input audio bit depth: 8, 16, or 24 bits (default: 16)
`outputContainer`	string	No	Output container format: 'raw' or 'wav' (default: 'raw')
`outputEncoding`	string	No	Output audio encoding: 'pcm', 'pcmu' (μ-law), or 'pcma' (A-law) (default: 'pcm')
`outputSampleRate`	number	No	Output audio sample rate: 8000, 16000, 22050, 24000, 44100, or 48000 Hz (default: 24000)
`outputChannels`	number	No	Output audio channels (default: 1, mono only)
`outputBitDepth`	number	No	Output audio bit depth: 8, 16, or 24 bits (default: 16)
`outputFrameDurationMs`	number	No	Frame duration for raw PCM streaming in milliseconds (default: 600)
`protocolVersion`	number	No	Protocol version: 1 (legacy) or 2 (format negotiation) (default: 2)
`autoReconnect`	boolean	No	Auto-reconnect on disconnect (default: true)

Audio Format Configuration (v2 Protocol)

With protocol version 2 (protocolVersion: 2), the SDK negotiates audio formats with the backend. You can specify both input and output audio formats:

📋 Format Support:

Input Encodings: PCM, PCMU (μ-law), PCMA (A-law)
Input Sample Rates: 8000, 16000, 22050, 24000, 44100, 48000 Hz
Input Bit Depths: 8, 16, 24 bits
Output Containers: 'raw' (no header) or 'wav' (with header)
Output Encodings: PCM, PCMU (μ-law), PCMA (A-law)
Output Sample Rates: 8000, 16000, 22050, 24000, 44100, 48000 Hz
Output Bit Depths: 8, 16, 24 bits

Example: Custom Audio Format

const voiceSDK = new VoiceSDK({
  agentId: 'agent_123',
  appId: 'your_app_id',
  
  // Input format (what we send to server)
  sampleRate: 16000,
  channels: 1,
  bitDepth: 16,
  
  // Output format (what we want from server)
  outputContainer: 'raw',        // 'raw' or 'wav'
  outputEncoding: 'pcm',          // 'pcm', 'pcmu', 'pcma'
  outputSampleRate: 24000,       // Default; typical TTS/server output
  outputChannels: 1,
  outputBitDepth: 16,
  outputFrameDurationMs: 600,    // Frame duration for streaming
  
  // Protocol version
  protocolVersion: 2              // Use v2 protocol for format negotiation
});

// Listen for format negotiation
voiceSDK.on('formatNegotiated', (format) => {
  console.log('Format negotiated:', format);
  // format contains: { container, encoding, sampleRate, channels, bitDepth }
});
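The frame duration determines how many bytes each raw audio frame carries. As a worked sketch, assume unpadded frames where bytes = sample rate × bytes per sample × channels × duration, and that μ-law and A-law store one byte per sample (as in G.711). `frameBytes` is a hypothetical helper, not an SDK function.

```javascript
// Hypothetical helper: estimates the size of one raw audio frame in bytes.
function frameBytes({ encoding, sampleRate, bitDepth, channels, frameDurationMs }) {
  // G.711 mu-law / A-law use one byte per sample; linear PCM uses bitDepth / 8.
  const bytesPerSample = encoding === 'pcm' ? bitDepth / 8 : 1;
  return Math.round((sampleRate * bytesPerSample * channels * frameDurationMs) / 1000);
}

// Default output format from the example: 24 kHz, 16-bit, mono, 600 ms frames.
console.log(frameBytes({
  encoding: 'pcm', sampleRate: 24000, bitDepth: 16, channels: 1, frameDurationMs: 600,
})); // 28800

// Telephony-style format: mu-law, 8 kHz, mono, 20 ms frames.
console.log(frameBytes({
  encoding: 'pcmu', sampleRate: 8000, bitDepth: 16, channels: 1, frameDurationMs: 20,
})); // 160
```

Shorter frames lower latency at the cost of more messages per second; the actual frame sizes delivered by the server may differ from this estimate.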

Methods

`connect()`

Connect to the voice agent.

Returns: Promise<boolean>

await voiceSDK.connect();

`disconnect()`

Disconnect from the voice agent.

Returns: void

voiceSDK.disconnect();

`startRecording()`

Start capturing and streaming audio.

Disclaimer (v2): If the server requires a disclaimer and disclaimersPending is still true, this throws an Error with error.code === 'DISCLAIMER_PENDING'. Call sendDisclaimerAck(true) first. See Server-driven disclaimer.

Returns: Promise<boolean>

await voiceSDK.startRecording();

`sendDisclaimerAck(accepted)` (v2, voice)

Sends disclaimer_ack after the user accepts or declines server-shown disclaimer text. Uses disclaimersHash and conversationId from the last hello_ack.

Parameters: accepted (boolean) — true to continue the call, false to decline.

Returns: void. On a successful send, the SDK logs the acknowledgement and clears disclaimersPending.

voiceSDK.sendDisclaimerAck(true);
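The two calls above can be combined so that a pending disclaimer is handled in one place. This is a sketch under assumptions: `startRecordingWithDisclaimer` and `askUser` are hypothetical names, `askUser` stands for whatever UI your app uses to collect the decision, and the immediate retry relies on `disclaimersPending` being cleared as soon as the acknowledgement is sent.

```javascript
// Hypothetical wrapper around the documented startRecording() and
// sendDisclaimerAck() calls. `askUser` is an app-supplied async function
// that resolves to true (accept) or false (decline).
async function startRecordingWithDisclaimer(sdk, askUser) {
  try {
    return await sdk.startRecording();
  } catch (error) {
    // Anything other than a pending disclaimer is not handled here.
    if (error.code !== 'DISCLAIMER_PENDING') throw error;

    const accepted = await askUser();
    sdk.sendDisclaimerAck(accepted);

    // Retry only when the user accepted; a declined call never records.
    return accepted ? sdk.startRecording() : false;
  }
}
```

A caller would pass the SDK instance and a prompt function, for example `startRecordingWithDisclaimer(voiceSDK, showDisclaimerDialog)`, where `showDisclaimerDialog` is your own code.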

`stopRecording()`

Stop capturing audio.

Returns: Promise<boolean>

await voiceSDK.stopRecording();

`toggleRecording()`

Toggle recording state (start/stop).

Returns: Promise<boolean>

await voiceSDK.toggleRecording();

`pauseCall()`

Pause the active call. Stops sending audio, clears the playback queue, and notifies the server to close STT/TTS connections. The WebSocket stays open and conversation history is preserved. If the call is not resumed within a configurable timeout (default 5 minutes), it ends automatically.

Returns: void

voiceSDK.pauseCall();
// or via AgentSDK:
agentSDK.pauseCall();

`resumeCall()`

Resume a paused call. Restarts audio recording and notifies the server to re-create STT/TTS connections. The conversation continues from where it left off with full history.

Returns: void

voiceSDK.resumeCall();
// or via AgentSDK:
agentSDK.resumeCall();

`isPaused` (property)

Boolean indicating whether the call is currently paused.

Type: boolean

if (voiceSDK.isPaused) {
  console.log('Call is paused');
}
// or via AgentSDK:
if (agentSDK.isPaused) {
  console.log('Call is paused');
}
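A pause button usually needs a toggle and a countdown to the automatic end of the call. The sketch below uses only the documented `pauseCall()`, `resumeCall()`, and `isPaused` members; `togglePause` and `msUntilAutoEnd` are hypothetical helpers, and the countdown is for display only because the timeout itself is enforced elsewhere.

```javascript
const DEFAULT_PAUSE_TIMEOUT_MS = 5 * 60 * 1000; // documented default: 5 minutes

// Hypothetical helper: pauses an active call or resumes a paused one.
function togglePause(sdk) {
  if (sdk.isPaused) {
    sdk.resumeCall();
    return 'resumed';
  }
  sdk.pauseCall();
  return 'paused';
}

// Hypothetical helper: milliseconds left before a paused call would end,
// given when the pause started. Never returns a negative number.
function msUntilAutoEnd(pausedAtMs, nowMs, timeoutMs = DEFAULT_PAUSE_TIMEOUT_MS) {
  return Math.max(0, timeoutMs - (nowMs - pausedAtMs));
}

console.log(msUntilAutoEnd(0, 60000)); // 240000 (four minutes left)
```

If you configure a different pause timeout, pass the same value as `timeoutMs` so the display stays in step.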

`getStatus()`

Get current connection and recording status.

Returns: Object

const status = voiceSDK.getStatus();
// Returns: {
//   version: '2.0.0',
//   isConnected: boolean,
//   isRecording: boolean,
//   isPlaying: boolean,
//   outputFormat: object,      // Negotiated output format (v2)
//   audioPlayer: object,       // AudioPlayer status
//   audioRecorder: object      // AudioRecorder status
// }
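The status object can be reduced to a single label for a UI indicator. This is a minimal sketch: `describeStatus` is a hypothetical helper, and the priority order (playback before recording) is an assumption rather than documented SDK behavior.

```javascript
// Hypothetical helper: maps the documented getStatus() flags to one label.
function describeStatus(status) {
  if (!status.isConnected) return 'disconnected';
  if (status.isPlaying) return 'agent speaking';
  if (status.isRecording) return 'listening';
  return 'idle';
}

console.log(describeStatus({ isConnected: true, isRecording: true, isPlaying: false })); // listening
```

In an app, the argument would be the result of `voiceSDK.getStatus()`.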

`validateInputFormat(format)`

v2 only: Validate input audio format configuration.

Parameters:

format (object) - Format object with encoding, sampleRate, bitDepth, channels

Returns: string|null - Error message if invalid, null if valid

const error = voiceSDK.validateInputFormat({
  encoding: 'pcm',
  sampleRate: 16000,
  bitDepth: 16,
  channels: 1
});
if (error) {
  console.error('Invalid format:', error);
}

`validateOutputFormat(format)`

v2 only: Validate output audio format configuration.

Parameters:

format (object) - Format object with container, encoding, sampleRate, bitDepth, channels

Returns: string|null - Error message if invalid, null if valid

const error = voiceSDK.validateOutputFormat({
  container: 'raw',
  encoding: 'pcm',
  sampleRate: 24000,
  bitDepth: 16,
  channels: 1
});
if (error) {
  console.error('Invalid format:', error);
}

`updateConfig(newConfig)`

Update SDK configuration dynamically.

Parameters:

newConfig (object) - Partial configuration object to merge with existing config

Returns: void

voiceSDK.updateConfig({
  outputSampleRate: 48000,
  outputEncoding: 'pcmu'
});

`reconnect()`

Manually reconnect to the agent.

Returns: Promise<boolean>

await voiceSDK.reconnect();
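When `autoReconnect` is disabled, manual reconnection is often wrapped in a retry loop. The following sketch retries the documented `reconnect()` with exponential backoff. `reconnectWithBackoff`, its option names, and the default delays are assumptions chosen for illustration.

```javascript
const defaultSleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Hypothetical helper: calls reconnect() until it resolves to true or the
// attempts run out. `sleep` is injectable so delays can be replaced in tests.
async function reconnectWithBackoff(sdk, options = {}) {
  const { attempts = 5, baseDelayMs = 500, sleep = defaultSleep } = options;

  for (let attempt = 0; attempt < attempts; attempt++) {
    try {
      // reconnect() is documented as returning Promise<boolean>.
      if (await sdk.reconnect()) return true;
    } catch (error) {
      // A rejection counts as a failed attempt; keep trying.
    }
    // Wait 500 ms, 1000 ms, 2000 ms, ... but not after the final attempt.
    if (attempt < attempts - 1) await sleep(baseDelayMs * 2 ** attempt);
  }
  return false;
}
```

A caller can then decide what to show the user when the function resolves to `false`.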

`stopAudioPlayback()`

Immediately stop audio playback (for barge-in).

Returns: void

voiceSDK.stopAudioPlayback();

`on(event, callback)`

Register an event listener.

Parameters:

event (string) - Event name
callback (function) - Event handler

voiceSDK.on('connected', () => {
  console.log('Connected!');
});

`destroy()`

Cleanup all resources and disconnect.

Returns: void

voiceSDK.destroy();

Events Reference

Event	Parameters	Description
`connected`	-	Emitted when successfully connected
`disconnected`	`event`	Emitted when disconnected (includes reason)
`error`	`error`	Emitted on errors
`recordingStarted`	-	Emitted when recording starts
`recordingStopped`	-	Emitted when recording stops
`message`	`message`	Emitted for all WebSocket messages
`playbackStarted`	-	Emitted when audio playback starts
`playbackStopped`	-	Emitted when audio playback stops
`playbackError`	`error`	Emitted on audio playback errors
`bargeIn`	`message`	Emitted when user interrupts agent
`stopPlaying`	`message`	Emitted when server requests to stop audio
`formatNegotiated`	`format`	v2 only: Emitted when audio format is negotiated with server. Format object contains: container, encoding, sampleRate, channels, bitDepth
`greetingStarted`	-	Emitted when greeting audio starts
`domainError`	`error`	Emitted when domain is not whitelisted
`disclaimersRequired`	`payload`	v2 voice: Server requires disclaimer acknowledgement. Payload: `texts`, `disclaimersHash`, `disclaimerTimeoutMs`, `conversationId`. Call `sendDisclaimerAck` after user decision.
`disclaimerRejected`	`{ code, message }`	v2 voice: Terminal disclaimer failure from server (`DISCLAIMER_DECLINED`, `DISCLAIMER_TIMEOUT`, `DISCLAIMER_HASH_MISMATCH`). Not used for `DISCLAIMER_PENDING` (that uses `error`).
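Several of the events above can be folded into one state object that the UI reads. This is a sketch only: `trackCallState` is a hypothetical helper that relies solely on the documented `on(event, callback)` method, and resetting the recording and playback flags on disconnect is an assumption.

```javascript
// Hypothetical helper: keeps a plain state object in sync with SDK events.
function trackCallState(sdk) {
  const state = { connected: false, recording: false, playing: false, lastError: null };

  sdk.on('connected', () => { state.connected = true; });
  sdk.on('disconnected', () => {
    // Assumption: nothing is recording or playing once the connection is gone.
    state.connected = false;
    state.recording = false;
    state.playing = false;
  });
  sdk.on('recordingStarted', () => { state.recording = true; });
  sdk.on('recordingStopped', () => { state.recording = false; });
  sdk.on('playbackStarted', () => { state.playing = true; });
  sdk.on('playbackStopped', () => { state.playing = false; });
  sdk.on('error', (error) => { state.lastError = error; });

  return state;
}
```

Call it once after constructing the SDK, for example `const state = trackCallState(voiceSDK);`, and read `state` whenever the UI renders.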

Configuration Options

Agent Settings Override

Complete reference for all overridable settings:

Core Settings

Setting	Type	Range/Values	Description
`prompt`	string	Any text	System prompt/instructions for the agent
`temperature`	number	0.0 - 2.0	LLM creativity level
`maxTokens`	number	1 - 4096	Maximum tokens per response
`model`	string	Model names	⚠️ NOT SUPPORTED - LLM model selection requires infrastructure changes
`language`	string	ISO codes	Response language (e.g., 'en', 'es', 'fr')

Voice Settings

Setting	Type	Range/Values	Description
`voiceId`	string	Voice IDs	Specific voice identifier
`voiceSpeed`	number	0.5 - 2.0	Voice speed multiplier

Behavior Settings

Setting	Type	Range/Values	Description
`firstMessage`	string	Any text	Initial greeting message
`disableInterruptions`	boolean	true/false	Prevent user from interrupting agent
`autoDetectLanguage`	boolean	true/false	Automatically detect user's language
`candidateLanguages`	array	Language codes	List of candidate languages for auto-detection (e.g., ['en', 'es', 'fr'])
`maxCallDuration`	number	Seconds	Maximum session duration

Advanced Settings

Setting	Type	Range/Values	Description
`toolIds`	array	Array of numbers	Array of custom tool IDs to enable for this agent (e.g., [123, 456, 789])
`internalToolIds`	array	Array of strings	Array of internal tool IDs to enable for this agent (e.g., ['calendar', 'weather', 'email'])
`timezone`	string	TZ names	User timezone (e.g., 'America/New_York')
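The numeric ranges in the tables above can be enforced before an override object is sent. This is a minimal sketch: `validateOverrides` and `NUMERIC_RANGES` are hypothetical names, and the server remains the final authority on what it accepts.

```javascript
// Ranges copied from the Core and Voice settings tables.
const NUMERIC_RANGES = {
  temperature: [0.0, 2.0],
  maxTokens: [1, 4096],
  voiceSpeed: [0.5, 2.0],
};

// Hypothetical helper: returns a list of problems found in an override object.
function validateOverrides(overrides) {
  const errors = [];

  for (const [key, [min, max]] of Object.entries(NUMERIC_RANGES)) {
    const value = overrides[key];
    if (value === undefined) continue; // unset settings keep the agent's defaults
    if (typeof value !== 'number' || Number.isNaN(value) || value < min || value > max) {
      errors.push(`${key} must be a number between ${min} and ${max}`);
    }
  }

  // The table marks `model` as not supported.
  if ('model' in overrides) errors.push('model is not supported as an override');

  if (overrides.candidateLanguages !== undefined && !Array.isArray(overrides.candidateLanguages)) {
    errors.push('candidateLanguages must be an array of language codes');
  }

  return errors;
}

console.log(validateOverrides({ temperature: 0.7, voiceSpeed: 1.2 }));
console.log(validateOverrides({ temperature: 3, model: 'some-model' }));
```

The first call returns an empty list; the second returns one entry for `temperature` and one for `model`.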

Text-to-Speech API

PUBLIC REST API

Generate high-quality voice audio from text using our public REST API endpoint.

🔒 Authentication: This endpoint requires API key authentication. Never expose your API key in frontend code - always call from your backend.

Endpoint

POST https://backend.talktopc.com/api/public/agents/tts/generate

Authentication

Include your API key in the Authorization header:

Authorization: Bearer YOUR_API_KEY

Request Parameters

Parameter	Type	Required	Description
`text`	string	Yes	The text to convert to speech
`voiceId`	string	No	Voice identifier (default: agent's configured voice)
`voiceSpeed`	number	No	Voice speed multiplier: 0.5 - 2.0 (default: 1.0)
`language`	string	No	Language code (e.g., 'en', 'es', 'fr')
`agentId`	string	No	Agent ID to use voice settings from

Example Requests

curl -X POST https://backend.talktopc.com/api/public/agents/tts/generate \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -d '{
    "text": "Hello! Welcome to our service.",
    "voiceId": "nova",
    "voiceSpeed": 1.2,
    "language": "en"
  }' \
  --output speech.mp3

import { writeFile } from 'node:fs/promises';

// Server-side example (Node.js 18+): the API key stays out of the browser
const response = await fetch('https://backend.talktopc.com/api/public/agents/tts/generate', {
  method: 'POST',
  headers: {
    'Content-Type': 'application/json',
    'Authorization': `Bearer ${process.env.TTP_API_KEY}`
  },
  body: JSON.stringify({
    text: 'Hello! Welcome to our service.',
    voiceId: 'nova',
    voiceSpeed: 1.2,
    language: 'en'
  })
});

if (!response.ok) {
  throw new Error(`TTS request failed: HTTP ${response.status}`);
}

// The response body is the MP3 audio itself
const audioBuffer = Buffer.from(await response.arrayBuffer());

// Save the audio (or forward it to your client)
await writeFile('speech.mp3', audioBuffer);

import requests
import os

url = "https://backend.talktopc.com/api/public/agents/tts/generate"
headers = {
    "Content-Type": "application/json",
    "Authorization": f"Bearer {os.environ['TTP_API_KEY']}"
}
data = {
    "text": "Hello! Welcome to our service.",
    "voiceId": "nova",
    "voiceSpeed": 1.2,
    "language": "en"
}

response = requests.post(url, json=data, headers=headers)

if response.status_code == 200:
    # Save audio to file
    with open("speech.mp3", "wb") as f:
        f.write(response.content)
    print("Audio saved to speech.mp3")
else:
    print(f"Error: {response.status_code}")

<?php
$url = "https://backend.talktopc.com/api/public/agents/tts/generate";
$apiKey = getenv('TTP_API_KEY');

$data = [
    'text' => 'Hello! Welcome to our service.',
    'voiceId' => 'nova',
    'voiceSpeed' => 1.2,
    'language' => 'en'
];

$ch = curl_init($url);
curl_setopt($ch, CURLOPT_POST, 1);
curl_setopt($ch, CURLOPT_POSTFIELDS, json_encode($data));
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_HTTPHEADER, [
    'Content-Type: application/json',
    'Authorization: Bearer ' . $apiKey
]);

$response = curl_exec($ch);
$httpCode = curl_getinfo($ch, CURLINFO_HTTP_CODE);
curl_close($ch);

if ($httpCode === 200) {
    file_put_contents('speech.mp3', $response);
    echo "Audio saved to speech.mp3";
} else {
    echo "Error: HTTP $httpCode";
}
?>

import java.net.http.*;
import java.net.URI;
import java.nio.file.*;

public class TTSExample {
    public static void main(String[] args) throws Exception {
        String url = "https://backend.talktopc.com/api/public/agents/tts/generate";
        String apiKey = System.getenv("TTP_API_KEY");
        
        String json = """
            {
                "text": "Hello! Welcome to our service.",
                "voiceId": "nova",
                "voiceSpeed": 1.2,
                "language": "en"
            }
            """;
        
        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder()
            .uri(URI.create(url))
            .header("Content-Type", "application/json")
            .header("Authorization", "Bearer " + apiKey)
            .POST(HttpRequest.BodyPublishers.ofString(json))
            .build();
        
        HttpResponse<byte[]> response = client.send(
            request, 
            HttpResponse.BodyHandlers.ofByteArray()
        );
        
        if (response.statusCode() == 200) {
            Files.write(Paths.get("speech.mp3"), response.body());
            System.out.println("Audio saved to speech.mp3");
        } else {
            System.out.println("Error: " + response.statusCode());
        }
    }
}

Response

The endpoint returns audio data directly with the following headers:

Header	Value	Description
`Content-Type`	audio/mpeg	Audio format (MP3)
`Content-Length`	number	Size of audio file in bytes

Voice Speed Examples

Speed	Effect	Use Case
`0.5`	50% slower (half speed)	Educational content, accessibility
`0.75`	25% slower	Clear pronunciation, language learning
`1.0`	Normal speed (default)	Standard conversation
`1.2`	20% faster	Quick updates, notifications
`1.5`	50% faster	Rapid information delivery
`2.0`	2x speed (double speed)	Maximum speed, time-saving
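The effect of each speed on clip length can be estimated with simple arithmetic. The sketch assumes `voiceSpeed` acts as a plain playback-rate multiplier, as the "2x speed (double speed)" row suggests, so duration ≈ normal duration ÷ speed. `estimateDurationSeconds` is a hypothetical helper and real output may vary.

```javascript
// Hypothetical helper: estimates clip length at a given voice speed.
function estimateDurationSeconds(normalSeconds, voiceSpeed) {
  if (voiceSpeed < 0.5 || voiceSpeed > 2.0) {
    throw new RangeError('voiceSpeed must be between 0.5 and 2.0');
  }
  return normalSeconds / voiceSpeed;
}

// A clip that lasts 30 seconds at normal speed:
console.log(estimateDurationSeconds(30, 0.5)); // 60
console.log(estimateDurationSeconds(30, 1.5)); // 20
console.log(estimateDurationSeconds(30, 2.0)); // 15
```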

Backend Implementation Example

// Your backend endpoint (Express; assumes app.use(express.json()) is registered)
app.post('/api/generate-speech', async (req, res) => {
  const { text, voiceSpeed = 1.0 } = req.body;
  
  // Call TTP TTS API
  const ttpResponse = await fetch('https://backend.talktopc.com/api/public/agents/tts/generate', {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      'Authorization': `Bearer ${process.env.TTP_API_KEY}`  // 🔒 Secret!
    },
    body: JSON.stringify({
      text: text,
      voiceSpeed: voiceSpeed,
      voiceId: 'nova',
      language: 'en'
    })
  });
  
  if (!ttpResponse.ok) {
    return res.status(ttpResponse.status).json({ 
      error: 'TTS generation failed' 
    });
  }
  
  // Forward audio to client
  const audioBuffer = await ttpResponse.arrayBuffer();
  res.set('Content-Type', 'audio/mpeg');
  res.send(Buffer.from(audioBuffer));
});

Error Responses

Status Code	Description
`400`	Bad Request - Invalid parameters
`401`	Unauthorized - Invalid or missing API key
`429`	Too Many Requests - Rate limit exceeded
`500`	Internal Server Error - TTS generation failed
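A backend can translate these status codes into a message and a retry decision. This is a sketch: `explainTtsError` and `TTS_ERRORS` are hypothetical names, the reasons come from the table above, and the `retry` flags reflect general HTTP practice rather than documented guidance for this endpoint.

```javascript
// Reasons copied from the Error Responses table; retry flags are assumptions.
const TTS_ERRORS = {
  400: { reason: 'Invalid parameters', retry: false },
  401: { reason: 'Invalid or missing API key', retry: false },
  429: { reason: 'Rate limit exceeded', retry: true },
  500: { reason: 'TTS generation failed', retry: true },
};

// Hypothetical helper: describes a non-200 status from the TTS endpoint.
function explainTtsError(status) {
  return TTS_ERRORS[status] ?? { reason: `Unexpected HTTP status ${status}`, retry: false };
}

console.log(explainTtsError(429)); // { reason: 'Rate limit exceeded', retry: true }
```

Retrying a 400 or 401 would repeat the same failure, which is why only 429 and 500 are marked retryable here.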

⚠️ Security Best Practices:

Never expose your API key in frontend JavaScript
Always call this endpoint from your backend server
Implement rate limiting on your backend
Validate and sanitize text input to prevent abuse
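The last two practices can be sketched in a few lines. Both helpers are hypothetical and illustrative: the 1000-character limit is an arbitrary example, and the in-memory fixed-window limiter suits a single process only (a shared store would be needed across several servers).

```javascript
// Hypothetical helper: returns an error message, or null when the text is usable.
function validateText(text, maxLength = 1000) {
  if (typeof text !== 'string') return 'text must be a string';
  const trimmed = text.trim();
  if (trimmed.length === 0) return 'text must not be empty';
  if (trimmed.length > maxLength) return `text must be at most ${maxLength} characters`;
  return null;
}

// Hypothetical helper: fixed-window rate limiter keyed by client ID.
// `now` is injectable so the clock can be controlled in tests.
function createRateLimiter({ limit, windowMs, now = Date.now }) {
  const windows = new Map(); // clientId -> { start, count }

  return function allow(clientId) {
    const t = now();
    const current = windows.get(clientId);

    // Start a new window on first use or once the previous one has expired.
    if (!current || t - current.start >= windowMs) {
      windows.set(clientId, { start: t, count: 1 });
      return true;
    }
    if (current.count < limit) {
      current.count += 1;
      return true;
    }
    return false; // over the limit for this window
  };
}
```

In a backend route, a failed `validateText` would map to a 400 response and a rejected `allow(clientId)` to a 429, matching the status codes listed earlier.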

Use Cases

📢 Announcements

Generate audio announcements for notifications

📚 Content Creation

Convert articles or books to audio format

♿ Accessibility

Provide audio alternatives for text content

🎓 E-Learning

Create voice-overs for educational materials

Java SDK

Server-side Java SDK for text-to-speech conversion. Perfect for backend applications, phone systems, and server-to-server integrations.

🎯 Use Cases:

Backend TTS: Generate speech on your server without exposing API keys
Phone Systems: Integrate with Twilio, Telnyx, or custom VoIP systems
Server-to-Server: Automated voice generation for notifications, alerts, or content
Audio Format Control: Request specific formats (PCMU, PCMA, PCM) for phone systems

Installation

Maven

<repositories>
    <repository>
        <id>github</id>
        <url>https://maven.pkg.github.com/TTP-GO/java-sdk</url>
    </repository>
</repositories>

<dependencies>
    <dependency>
        <groupId>com.talktopc</groupId>
        <artifactId>ttp-agent-sdk-java</artifactId>
        <version>1.0.5</version>
    </dependency>
</dependencies>

⚠️ GitHub Packages Authentication:

You'll need to authenticate with GitHub Packages. Add credentials to your ~/.m2/settings.xml:

<settings>
    <servers>
        <server>
            <id>github</id>
            <username>YOUR_GITHUB_USERNAME</username>
            <password>YOUR_GITHUB_TOKEN</password>
        </server>
    </servers>
</settings>

Gradle

repositories {
    maven {
        url = uri("https://maven.pkg.github.com/TTP-GO/java-sdk")
        credentials {
            username = project.findProperty("gpr.user") ?: System.getenv("USERNAME")
            password = project.findProperty("gpr.key") ?: System.getenv("TOKEN")
        }
    }
}

dependencies {
    implementation 'com.talktopc:ttp-agent-sdk-java:1.0.5'
}

Quick Start

1. Initialize SDK

import com.talktopc.sdk.VoiceSDK;

// Get API key from environment variable
String apiKey = System.getenv("TALKTOPC_API_KEY");

VoiceSDK sdk = VoiceSDK.builder()
    .apiKey(apiKey)
    .baseUrl("https://speech.talktopc.com")  // Optional
    .build();

2. Simple TTS (Blocking)

// Generate complete audio file
byte[] audio = sdk.textToSpeech("Hello world", "mamre");

// Save to file
Files.write(Paths.get("output.wav"), audio);

// Or send to phone system
phoneSystem.playAudio(audio);

3. Streaming TTS (Real-time)

// Stream audio chunks as they're generated
sdk.textToSpeechStream(
    "Hello world, this is a longer text that will be streamed",
    "mamre",
    audioChunk -> {
        // Receive chunks in real-time
        phoneSystem.playAudio(audioChunk);
    }
);

API Reference

VoiceSDK

Main SDK entry point for text-to-speech operations.

Builder Methods

Method	Type	Description
`apiKey(String)`	String	Your TalkToPC API key (required)
`baseUrl(String)`	String	API base URL (default: https://speech.talktopc.com)
`connectTimeout(int)`	int	Connection timeout in milliseconds (default: 30000)
`readTimeout(int)`	int	Read timeout in milliseconds (default: 60000)

Methods

Method	Description
`textToSpeech(String text, String voiceId)`	Simple TTS (blocking) - returns complete audio as byte array
`textToSpeech(String text, String voiceId, double speed)`	TTS with speed control (0.1 - 3.0)
`textToSpeech(TTSRequest request)`	TTS with full configuration (format, speed, etc.)
`synthesize(TTSRequest request)`	Get full response with metadata (sample rate, duration, credits)
`textToSpeechStream(String text, String voiceId, Consumer<byte[]> chunkHandler)`	Streaming TTS - chunks delivered to handler as they're generated
`textToSpeechStream(TTSRequest request, Consumer<byte[]> chunkHandler, Consumer<StreamMetadata> onComplete, Consumer<Throwable> onError)`	Streaming TTS with completion and error callbacks

TTSRequest Builder

Configure TTS requests with audio format options.

Basic Configuration

TTSRequest request = TTSRequest.builder()
    .text("Hello world")              // Required
    .voiceId("mamre")                // Required
    .speed(1.0)                      // Optional (0.1 - 3.0)
    .build();

Audio Format Configuration

TTSRequest request = TTSRequest.builder()
    .text("Hello world")
    .voiceId("mamre")
    .outputContainer("raw")          // "raw" or "wav"
    .outputEncoding("pcm")           // "pcm", "pcmu", "pcma"
    .outputSampleRate(24000)         // Hz (8000, 16000, 22050, 24000, 44100, 48000)
    .outputBitDepth(16)              // bits (8, 16, 24)
    .outputChannels(1)               // 1 (mono) or 2 (stereo)
    .outputFrameDurationMs(600)     // ms per frame (for streaming)
    .build();

Preset Methods

Method	Format	Use Case
`phoneSystem()`	PCMU @ 8kHz, 20ms frames	Phone systems (Twilio, Telnyx, VoIP)
`highQuality()`	WAV @ 44.1kHz	High-quality audio files
`standardQuality()`	PCM @ 22.05kHz	Standard quality audio

TTSResponse

Response object containing audio and metadata.

Method	Return Type	Description
`getAudio()`	byte[]	Audio data
`getSampleRate()`	int	Sample rate in Hz
`getDurationMs()`	long	Playback duration in milliseconds
`getAudioSizeBytes()`	long	Audio size in bytes
`getCreditsUsed()`	double	Credits consumed
`getConversationId()`	String	Unique conversation ID

Examples

Basic Usage

VoiceSDK sdk = VoiceSDK.builder()
    .apiKey(System.getenv("TALKTOPC_API_KEY"))
    .build();

// Simple TTS
byte[] audio = sdk.textToSpeech("Welcome to TalkToPC", "mamre");
System.out.println("Generated " + audio.length + " bytes of audio");

// Save to file
Files.write(Paths.get("output.wav"), audio);

With Speed Control

// Faster speech (1.5x speed)
byte[] fastAudio = sdk.textToSpeech("Quick message", "mamre", 1.5);

// Slower speech (0.8x speed)
byte[] slowAudio = sdk.textToSpeech("Slow and clear", "mamre", 0.8);

Streaming with Metadata

sdk.textToSpeechStream(
    TTSRequest.builder()
        .text("Streaming example with full configuration")
        .voiceId("mamre")
        .speed(1.0)
        .build(),
    audioChunk -> {
        // Handle each audio chunk
        System.out.println("Received chunk: " + audioChunk.length + " bytes");
        phoneSystem.playAudio(audioChunk);
    },
    metadata -> {
        // Handle completion
        System.out.println("Stream completed:");
        System.out.println("  Total chunks: " + metadata.getTotalChunks());
        System.out.println("  Total bytes: " + metadata.getTotalBytes());
        System.out.println("  Duration: " + metadata.getDurationMs() + " ms");
        System.out.println("  Credits: " + metadata.getCreditsUsed());
    },
    error -> {
        // Handle errors
        System.err.println("Stream error: " + error.getMessage());
    }
);

Phone System Integration

Perfect for Twilio, Telnyx, or custom VoIP systems.

Standard Phone System (PCMU @ 8kHz)

// Using the convenient phoneSystem() preset
TTSRequest request = TTSRequest.builder()
    .text("Hello, thank you for calling. How can I help you today?")
    .voiceId("en-US-female")
    .phoneSystem()  // ✅ PCMU @ 8kHz, 20ms frames
    .build();

sdk.textToSpeechStream(
    request,
    audioChunk -> {
        // audioChunk is PCMU @ 8kHz, 20ms frames (160 bytes)
        // Ready to send directly to phone connection
        phoneConnection.sendAudio(audioChunk);
    }
);

Twilio Integration

TTSRequest request = TTSRequest.builder()
    .text("Your appointment is confirmed for tomorrow at 3 PM")
    .voiceId("en-US-male")
    .outputContainer("raw")
    .outputEncoding("pcmu")      // μ-law for Twilio
    .outputSampleRate(8000)       // 8kHz
    .outputBitDepth(16)
    .outputChannels(1)            // Mono
    .outputFrameDurationMs(20)    // 20ms frames
    .build();

sdk.textToSpeechStream(
    request,
    audioChunk -> {
        // Send to Twilio Media Stream
        twilioStream.sendMedia(audioChunk);
    }
);

Custom Audio Format

TTSRequest request = TTSRequest.builder()
    .text("Custom format example")
    .voiceId("mamre")
    .outputContainer("raw")
    .outputEncoding("pcm")
    .outputSampleRate(16000)      // 16kHz
    .outputBitDepth(16)           // 16-bit
    .outputChannels(1)            // Mono
    .outputFrameDurationMs(100)   // 100ms frames
    .build();

byte[] audio = sdk.textToSpeech(request);
// Expected: 16kHz PCM, 16-bit, mono

High Quality Audio

TTSRequest request = TTSRequest.builder()
    .text("This is a high quality recording")
    .voiceId("mamre")
    .highQuality()  // WAV @ 44.1kHz
    .build();

byte[] audio = sdk.textToSpeech(request);
Files.write(Paths.get("high_quality.wav"), audio);

Error Handling

import com.talktopc.sdk.exception.TtsException;

try {
    byte[] audio = sdk.textToSpeech("Test", "mamre");
} catch (TtsException e) {
    System.err.println("TTS Error [" + e.getStatusCode() + "]: " + e.getErrorMessage());
    
    switch (e.getStatusCode()) {
        case 401:
            System.err.println("→ Invalid API key");
            break;
        case 402:
            System.err.println("→ Insufficient credits");
            break;
        case 400:
            System.err.println("→ Invalid parameters");
            break;
        default:
            System.err.println("→ Other error");
    }
}

Full Configuration Example

import com.talktopc.sdk.models.TTSRequest;
import com.talktopc.sdk.models.TTSResponse;

// Build request with all options
TTSRequest request = TTSRequest.builder()
    .text("Full configuration example")
    .voiceId("mamre")
    .speed(1.2)
    .outputContainer("wav")
    .outputEncoding("pcm")
    .outputSampleRate(24000)
    .outputBitDepth(16)
    .outputChannels(1)
    .build();

// Get response with metadata
TTSResponse response = sdk.synthesize(request);

System.out.println("Audio: " + response.getAudioSizeBytes() + " bytes");
System.out.println("Sample rate: " + response.getSampleRate() + " Hz");
System.out.println("Duration: " + response.getDurationMs() + " ms");
System.out.println("Credits: " + response.getCreditsUsed());

// Save audio
Files.write(Paths.get("output.wav"), response.getAudio());

Supported Audio Formats

Format	Encoding	Sample Rates	Use Case
PCM	`pcm`	8000, 16000, 22050, 24000, 44100, 48000 Hz	General purpose, high quality
PCMU (μ-law)	`pcmu`	8000 Hz	Phone systems (Twilio, Telnyx, VoIP)
PCMA (A-law)	`pcma`	8000 Hz	Phone systems (European standards)

Requirements

Java 11 or higher
Valid TalkToPC API key
No external dependencies - Uses Java 11+ HttpClient

💡 Key Differences from Frontend SDK:

Backend-only: Designed for server-side use, not browser
Format Pass-through: Can request PCMU/PCMA and forward directly to phone systems
No Audio Playback: Returns raw audio bytes - you handle playback/forwarding
REST API: Uses REST endpoints instead of WebSocket

Resources

GitHub Repository: TTP-GO/java-sdk
Maven Package: com.talktopc:ttp-agent-sdk-java:1.0.5
Documentation: See README.md in the repository
Examples: Check src/main/java/com/talktopc/sdk/examples/
